{"id":19,"date":"2022-10-27T21:02:21","date_gmt":"2022-10-27T21:02:21","guid":{"rendered":"https:\/\/blogs.oregonstate.edu\/awanf\/?p=19"},"modified":"2022-10-27T21:04:59","modified_gmt":"2022-10-27T21:04:59","slug":"automating-workflows","status":"publish","type":"post","link":"https:\/\/blogs.oregonstate.edu\/awanf\/2022\/10\/27\/automating-workflows\/","title":{"rendered":"Automating Workflows"},"content":{"rendered":"\n<p>As part of my team&#8217;s development environment, I set up a CI pipeline for our repo. CI or Continuous Integration is a devops practice where developers often merge their code to trunk of the repository.  Since commits are made often, it&#8217;s important to determine whether or not new features or code pass the application&#8217;s integration and unit tests.  There are tons of options when it comes to setting up a CI pipeline but the way I chose was Github Actions since our repository is on Github.  Let&#8217;s take a look at my workflow and go through it step by step.<\/p>\n\n\n\n<p>Github Actions uses YAML format to define a series of steps to take when performing actions with your codebase.  First though, we have to tell github actions when to run this workflow by using <code>on<\/code>. <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>name: CI Workflow\nrun-name: Lint and Test\n# Whenever a developer opens up a pull request against the main branch of our repository, this workflow will run.\non: \n  pull_request:\n    branches:\n      - \"main\"\n<\/code><\/pre>\n\n\n\n<p>Next, are the parts where we define what to actually do.  In our case, we have a number of microservices and our front end as part of our monorepo.  Since our microservices are meant to be standalone applications, I decided to write our workflow to first check if any files in a microservice changed before running the unit tests.  
However, let&#8217;s first install the necessary tools to lint the project and make sure it conforms to our code style guidelines and PEP 8 standards.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>jobs:\n  #----------------------------------------------\n  #          install and run linters\n  #----------------------------------------------\n  python_linting:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions\/checkout@v3\n      - name: Setup Python\n        uses: actions\/setup-python@v4\n        with:\n          python-version: \"3.10\"\n\n      - name: Load pip cache if it exists\n        uses: actions\/cache@v3\n        with:\n          path: ~\/.cache\/pip\n          key: ${{ runner.os }}-pip\n          restore-keys: ${{ runner.os }}-pip\n\n      - name: Install Python code quality tools\n        run: python -m pip install isort black flake8 yamllint\n      - name: Run isort, black, &amp; flake8\n        working-directory: .\/services\n        run: |\n          isort . --profile=black --filter-files --line-length=100\n          black . --line-length=100 --check\n          flake8 . --max-line-length=100 --extend-ignore=E203\n      - name: Run yamllint\n        run: yamllint . -d relaxed --no-warnings\n<\/code><\/pre>\n\n\n\n<p>Next up, we detect which of our microservices (or the frontend) had file changes.  That way we can spin up test workers to install the dependencies for each of those applications and test them independently.  Say our etl-service, prediction-api, and frontend changed in this pull request.  
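<\/p>\n\n\n\n<p>In that scenario, the two <code>dorny\/paths-filter<\/code> steps defined below would produce outputs roughly like this (the values shown are illustrative):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>steps.services_filter.outputs.changes  =&gt; &#091;\"etl-service\",\"prediction-api\"]\nsteps.frontend_filter.outputs.frontend =&gt; \"true\"\n<\/code><\/pre>\n\n\n\n<p>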
This part of the workflow will output a list of the changed microservices\/frontend, something like <code>[etl-service, prediction-api, frontend]<\/code>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      #----------------------------------------------\n      #       Detect Changes to Microservices and Frontend\n      # \n      #----------------------------------------------\n  detect_changes_job:\n    runs-on: ubuntu-latest\n    permissions:\n      pull-requests: read\n    outputs:\n      services_changed_arr: ${{steps.services_filter.outputs.changes}} # arr of libs (services) changed\n      frontend_changed_bool: ${{ steps.frontend_filter.outputs.frontend }}\n\n    steps:\n      - name: Detect which (if any) services changed\n        uses: dorny\/paths-filter@v2\n        id: services_filter\n        with:\n          filters: |\n            etl-service:\n              - 'services\/etl-service\/**'\n            neural-network:\n              - 'services\/neural-network\/**'\n            prediction-api:\n              - 'services\/prediction-api\/**'\n            utilities:\n              - 'services\/utilities\/**'\n\n      - name: Detect if any frontend changes\n        uses: dorny\/paths-filter@v2\n        id: frontend_filter\n        with:\n          filters: |\n            frontend:\n              - 'frontend\/**'<\/code><\/pre>\n\n\n\n<p>Finally, we spin up a <code>matrix<\/code> of jobs, install the dependencies for each part of the project, then run our tests.  In GitHub Actions, a matrix is essentially a for loop: we loop through our list of changed services and perform the same actions on each. The frontend is a little different to test since it&#8217;s written in JavaScript, so we define a separate job for it compared to our Python microservices.  
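<\/p>\n\n\n\n<p>Stripped down to just the looping machinery, the matrix pattern looks like this (a sketch; the real job below adds checkout, caching, and the actual test commands):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>run_ci_on_service:\n  needs: detect_changes_job\n  strategy:\n    matrix:\n      services_to_test: ${{ fromJSON(needs.detect_changes_job.outputs.services_changed_arr) }}\n  runs-on: ubuntu-latest\n  steps:\n    - run: echo \"Running CI for ${{ matrix.services_to_test }}\"\n<\/code><\/pre>\n\n\n\n<p>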
What&#8217;s nice about a matrix approach is that our workers can run concurrently and share cached files, which gets our production code out faster!<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>      #----------------------------------------------\n      #       Run matrix of jobs based on services changed\n      # \n      #----------------------------------------------\n  run_ci_on_service:\n    needs: detect_changes_job # detect_changes_job result\n    if: needs.detect_changes_job.outputs.services_changed_arr != '&#091;]' # &#091;etl-service, neural-network, ...]\n    strategy:\n      matrix:\n        services_to_test: ${{ fromJSON(needs.detect_changes_job.outputs.services_changed_arr) }} # parse\n    runs-on: ubuntu-latest\n    steps:\n      - name: Check out repository\n        uses: actions\/checkout@v3\n\n      - name: Set up Python 3.10\n        id: setup-python\n        uses: actions\/setup-python@v4\n        with:\n          python-version: \"3.10\"\n\n      - name: Load cached venv\n        id: cached-poetry-dependencies\n        uses: actions\/cache@v3\n        env:\n          LOCKFILE_LOCATION: \"**\/${{matrix.services_to_test}}\/poetry.lock\" # workaround for using a variable in the hashFiles function below\n        with:\n          path: .\/services\/${{matrix.services_to_test}}\/.venv\n          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles(env.LOCKFILE_LOCATION) }}\n\n      - name: Load cached .local\n        uses: actions\/cache@v3\n        with:\n          path: ~\/.local\n          key: localdir-${{ runner.os }}-${{ hashFiles('.github\/workflows\/main.yaml') }}\n\n      - name: Install\/configure Poetry\n        uses: snok\/install-poetry@v1.3.3\n        with:\n          virtualenvs-create: true\n          virtualenvs-in-project: true\n\n      - name: Install project dependencies # if cache doesn't exist\n        working-directory: .\/services\/${{matrix.services_to_test}}\n        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'\n        run: poetry install\n\n      - name: Set modified dirname envvar # pytest-cov needs the module name, not the folder name\n        id: modify-dirname\n        run: |\n          export modified_dirname=$(echo ${{ matrix.services_to_test }} | sed 's\/-\/_\/g')\n          echo \"dirname=$modified_dirname\" &gt;&gt; $GITHUB_ENV\n\n      - name: Install sndfile dep for utilities Librosa\n        if: matrix.services_to_test == 'utilities'\n        run: |\n          sudo apt-get install libsndfile1-dev\n\n        # todo: add in coverage report and thresholds\n      - name: Run tests and generate report\n        working-directory: .\/services\/${{matrix.services_to_test}}\n        run: |\n          poetry run pytest --cov=${{env.dirname}} .\/tests\n\n      #----------------------------------------------\n      #       Run CI on frontend\n      # \n      #----------------------------------------------\n\n  run_ci_on_frontend:\n    needs: detect_changes_job # detect_changes_job result\n    if: needs.detect_changes_job.outputs.frontend_changed_bool == 'true'\n    runs-on: ubuntu-latest\n    steps:\n      - name: Checkout repo\n        uses: actions\/checkout@v3\n\n      - name: Setup Node and cache npm dependencies\n        uses: actions\/setup-node@v3\n        with:\n          node-version: 16\n          cache: \"npm\"\n          cache-dependency-path: frontend\/package-lock.json\n\n      - name: Load cached node_modules\n        id: cached-node-modules\n        uses: actions\/cache@v3\n        with:\n          path: frontend\/node_modules\n          key: node_modules-${{ hashFiles('frontend\/package-lock.json') }}\n\n      - name: Install node_modules # if cache doesn't exist\n        working-directory: frontend\n        if: steps.cached-node-modules.outputs.cache-hit != 'true'\n        run: npm install\n\n      - name: Run npm lint script\n        working-directory: frontend\n        run: npm run lint<\/code><\/pre>\n\n\n\n<p>It was a fun process to set up, and it has paid dividends (not literally; I&#8217;m not getting paid for this) for my team&#8217;s development practices.  DevOps is awesome!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As part of my team&#8217;s development environment, I set up a CI pipeline for our repo. CI, or Continuous Integration, is a DevOps practice in which developers frequently merge their code into the trunk of the repository. Since commits are made often, it&#8217;s important to determine whether new features or code pass the application&#8217;s integration [&hellip;]<\/p>\n","protected":false},"author":12668,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-19","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/posts\/19","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/users\/12668"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/comments?post=19"}],"version-history":[{"count":3,"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/posts\/19\/revisions"}],"predecessor-version":[{"id":22,"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/posts\/19\/revisions\/22"}],"wp:attachment":[{"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/media?parent=19"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/categories?post=19"},{"taxonomy":"post_tag","embeddable":t
rue,"href":"https:\/\/blogs.oregonstate.edu\/awanf\/wp-json\/wp\/v2\/tags?post=19"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}