As part of my team’s development environment, I set up a CI pipeline for our repo. CI, or Continuous Integration, is a DevOps practice in which developers frequently merge their code into the trunk of the repository. Since commits land often, it’s important to determine whether new code passes the application’s integration and unit tests. There are plenty of options for setting up a CI pipeline, but I chose GitHub Actions since our repository is hosted on GitHub. Let’s take a look at my workflow and go through it step by step.
GitHub Actions uses YAML to define a series of steps to run against your codebase. First, though, we have to tell GitHub Actions when to run this workflow by using the `on` key.
```yaml
name: CI Workflow
run-name: Lint and Test
# Whenever a developer opens a pull request against the main
# branch of our repository, this workflow will run.
on:
  pull_request:
    branches:
      - "main"
```
Next are the parts where we define what to actually do. In our case, our monorepo contains a number of microservices plus our frontend. Since the microservices are meant to be standalone applications, I wrote the workflow to first check whether any files in a microservice changed before running its unit tests. First, however, let’s install the tools needed to lint the project and make sure it conforms to our code style guidelines and PEP 8 standards.
```yaml
jobs:
  #----------------------------------------------
  # install and run linters
  #----------------------------------------------
  python_linting:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Load pip cache if it exists
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip
          restore-keys: ${{ runner.os }}-pip
      - name: Install python code quality tools
        run: python -m pip install isort black flake8 yamllint
      - name: Run isort, black, & flake8
        working-directory: ./services
        run: |
          isort . --profile=black --filter-files --line-length=100
          black . --line-length=100 --check
          flake8 . --max-line-length=100 --extend-ignore=E203
      - name: Run yamllint
        run: yamllint . -d relaxed --no-warnings
```
Next up, we detect which of our microservices (or the frontend) had file changes, so we can spin up test workers to install each changed application’s dependencies and test it independently. Say our etl-service, prediction-api, and frontend changed in this pull request: this part of the workflow will output a list of the changed services, something like `[etl-service, prediction-api, frontend]`.
```yaml
  #----------------------------------------------
  # Detect changes to microservices and frontend
  #----------------------------------------------
  detect_changes_job:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: read
    outputs:
      services_changed_arr: ${{ steps.services_filter.outputs.changes }} # array of services changed
      frontend_changed_bool: ${{ steps.frontend_filter.outputs.frontend }}
    steps:
      - name: Detect which, if any, services changed
        uses: dorny/paths-filter@v2
        id: services_filter
        with:
          filters: |
            etl-service:
              - 'services/etl-service/**'
            neural-network:
              - 'services/neural-network/**'
            prediction-api:
              - 'services/prediction-api/**'
            utilities:
              - 'services/utilities/**'
      - name: Detect any frontend changes
        uses: dorny/paths-filter@v2
        id: frontend_filter
        with:
          filters: |
            frontend:
              - 'frontend/**'
```
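Conceptually, all the filter step does is prefix-match the pull request’s changed file paths against each service’s directory. Here is a rough Python sketch of that logic (illustrative only: this is not paths-filter’s actual implementation, and the file paths are made up):

```python
# Illustrative sketch of prefix-based change detection.
# (Not the real dorny/paths-filter implementation; paths are hypothetical.)
FILTERS = {
    "etl-service": "services/etl-service/",
    "neural-network": "services/neural-network/",
    "prediction-api": "services/prediction-api/",
    "utilities": "services/utilities/",
}

def detect_changes(changed_files):
    """Return the list of services whose directories contain changed files."""
    return [
        name
        for name, prefix in FILTERS.items()
        if any(path.startswith(prefix) for path in changed_files)
    ]

changed = detect_changes([
    "services/etl-service/pipeline.py",
    "services/prediction-api/app.py",
    "frontend/src/App.js",
])
print(changed)  # ['etl-service', 'prediction-api']
```

Note that the frontend gets its own filter step in the workflow because it produces a boolean output rather than an entry in the services list.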
Finally, we spin up a matrix of jobs, install the dependencies for each part of the project, and then run our tests. In GitHub Actions, a matrix is basically a for loop: we simply loop through the list of changed services and perform the same actions on each. The frontend is tested a little differently since it’s written in JavaScript, so we define a separate job for it. What’s nice about the matrix approach is that the workers can run concurrently and share cached files, which gets our production code out faster!
```yaml
  #----------------------------------------------
  # Run matrix of jobs based on services changed
  #----------------------------------------------
  run_ci_on_service:
    needs: detect_changes_job
    if: needs.detect_changes_job.outputs.services_changed_arr != '[]' # [etl-service, neural-network, ...]
    strategy:
      matrix:
        services_to_test: ${{ fromJSON(needs.detect_changes_job.outputs.services_changed_arr) }} # parse JSON array
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository
        uses: actions/checkout@v3
      - name: Set up python 3.10
        id: setup-python
        uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - name: Load cached venv
        id: cached-poetry-dependencies
        uses: actions/cache@v3
        env:
          # workaround for using a variable in the hashFiles function below
          LOCKFILE_LOCATION: "**/${{ matrix.services_to_test }}/poetry.lock"
        with:
          path: ./services/${{ matrix.services_to_test }}/.venv
          key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles(env.LOCKFILE_LOCATION) }}
      - name: Load cached .local
        uses: actions/cache@v3
        with:
          path: ~/.local
          key: localdir-${{ runner.os }}-${{ hashFiles('.github/workflows/main.yaml') }}
      - name: Install/configure Poetry
        uses: snok/install-poetry@v1.3.3
        with:
          virtualenvs-create: true
          virtualenvs-in-project: true
      - name: Install project dependencies # if cache doesn't exist
        working-directory: ./services/${{ matrix.services_to_test }}
        if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
        run: poetry install
      - name: Set modified dirname envvar # pytest-cov needs the module name, not the folder name
        id: modify-dirname
        run: |
          modified_dirname=$(echo ${{ matrix.services_to_test }} | sed 's/-/_/g')
          echo "dirname=$modified_dirname" >> $GITHUB_ENV
      - name: Install sndfile dep for utilities Librosa
        if: matrix.services_to_test == 'utilities'
        run: |
          sudo apt-get install -y libsndfile1-dev
      # todo: add in coverage report and thresholds
      - name: Run tests and generate report
        working-directory: ./services/${{ matrix.services_to_test }}
        run: |
          poetry run pytest --cov=${{ env.dirname }} ./tests
```
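The matrix expansion itself happens on GitHub’s side, but the logic is easy to picture: parse the JSON list of changed services, then fan out one worker per entry. Here is a small Python simulation of that loop, including the hyphen-to-underscore rename that pytest-cov needs (the service names are just examples):

```python
import json

# Simulates the matrix fan-out: one worker per changed service.
# (GitHub Actions does this server-side; this is only an illustration.)
services_changed_arr = '["etl-service", "prediction-api"]'  # output of detect_changes_job

for service in json.loads(services_changed_arr):  # what fromJSON(...) does
    # pytest --cov wants an importable module name, so swap hyphens for underscores
    module_name = service.replace("-", "_")
    print(f"worker for {service}: poetry run pytest --cov={module_name} ./tests")
```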
```yaml
  #----------------------------------------------
  # Run CI on frontend
  #----------------------------------------------
  run_ci_on_frontend:
    needs: detect_changes_job
    if: needs.detect_changes_job.outputs.frontend_changed_bool == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v3
      - name: Setup Node and cache npm dependencies
        uses: actions/setup-node@v3
        with:
          node-version: 16
          cache: "npm"
          cache-dependency-path: frontend/package-lock.json
      - name: Load cached node_modules
        id: cached-node-modules
        uses: actions/cache@v3
        with:
          path: frontend/node_modules
          key: node_modules-${{ hashFiles('frontend/package-lock.json') }}
      - name: Install node_modules # if cache doesn't exist
        working-directory: frontend
        if: steps.cached-node-modules.outputs.cache-hit != 'true'
        run: npm install
      - name: Run npm lint script
        working-directory: frontend
        run: npm run lint
```
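Both jobs lean on lockfile-keyed caches: when the lockfile’s hash changes, the cache key changes, the lookup misses, and a fresh install runs. A quick sketch of that idea, using SHA-256 as a stand-in for Actions’ `hashFiles` (the lockfile contents below are fabricated):

```python
import hashlib

def fake_hash_files(content: bytes) -> str:
    # Stand-in for GitHub's hashFiles(): any stable digest of the lockfile works
    return hashlib.sha256(content).hexdigest()

old_lock = b'{"lockfileVersion": 2, "dependencies": {"react": "17.0.2"}}'
new_lock = b'{"lockfileVersion": 2, "dependencies": {"react": "18.2.0"}}'

old_key = f"node_modules-{fake_hash_files(old_lock)}"
new_key = f"node_modules-{fake_hash_files(new_lock)}"

# A dependency bump changes the key, so the stale cache is skipped
print(old_key != new_key)  # True
```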
It was a fun process to set up and has paid dividends (not literally; I’m not getting paid for this) for my team’s development practices. DevOps is awesome!