Monday, November 29, 2021

Setting up super fast Cypress tests on GitHub Actions

If you've been keeping track of The Array release posts, you know that we prioritize shipping things fast and often. Just as important to us is being sure that we don't break things unnecessarily for our users as we add new features and speedups.

What we have found works really well is nothing terribly novel by itself: a solid foundation of unit tests, end-to-end tests (integration tests), and CI/CD for automation and gatekeeping to keep master clean.

Unit & Integration tests

In our Django codebase you'll find a good number of Django tests that help keep us honest as we hack away at the backend of PostHog, which keeps track of all the 1's and 0's that our customers depend on for making product decisions. These are our frontline defenders that let us know something might be up before we even get to the point of creating a PR.

For this we do lean heavily on the standard Django test runner.

If you are interested in learning more about testing with Django, check out Django's great docs on testing.

These tests only get you so far though. You know that the backend is going to behave well after you land the changes that you've made, but what if you accidentally changed something that breaks the frontend in weird and unexpected ways?

Enter Cypress

According to Cypress's GitHub repo, it offers "fast, easy and reliable testing for anything that runs in a browser." What does that mean, exactly?

It lets you programmatically interact with your application by querying the DOM and running actions against any selected elements. You can see this in a few of our Cypress test definitions.

Tracking the elements

To keep our tested elements clear, manageable, and reusable across refactors, we take advantage of the element attributes that HTML and React specifically recognize. Cypress has an amazing built-in inspector in its test runner that allows you to identify elements that you would like to add tests to.

While the tool works great, we found that heavily nested components and classes would occasionally produce selectors that were inflexible.

With the data-attr tag, we just need to keep track of the tag when updating/changing the components we're using without needing to rely on the inspector to find the precise selector for the test!

<LineGraph
    data-attr="trend-line-graph"
    {...props}
/>

Example of our integration test for our Funnel user experience:

describe('Funnels', () => {
    // boilerplate to make sure we are on the funnel page of the app
    beforeEach(() => {
        cy.get('[data-attr=menu-item-funnels]').click()
    })
    // test to make sure that the funnel page actually loaded
    it('Funnels loaded', () => {
        cy.get('h1').should('contain', 'Funnels')
    })
    // test that when we select a funnel we can edit that funnel
    it('Click on a funnel', () => {
        cy.get('[data-attr=funnel-link-0]').click()
        cy.get('[data-attr=funnel-editor]').should('exist')
    })
    // test that we can create a new funnel when we click the 'create funnel' button
    it('Go to new funnel screen', () => {
        cy.get('[data-attr=create-funnel]').click()
        cy.get('[data-attr=funnel-editor]').should('exist')
    })
    // test that we can create a new funnel end to end
    it('Add 1 action to funnel', () => {
        cy.get('[data-attr=create-funnel]').click()
        cy.get('[data-attr=funnel-editor]').should('exist')
        cy.get('[data-attr=edit-funnel-input]').type('Test funnel')
        cy.get('[data-attr=add-action-event-button]').click()
        cy.get('[data-attr=trend-element-subject-0]').click()
        cy.contains('Pageviews').click()
        cy.get('[data-attr=save-funnel-button]').click()
        cy.get('[data-attr=funnel-viz]').should('exist')
    })
})

I personally love this syntax. It feels super readable to me and reminds me a bit of the best parts of jQuery.

GitHub Actions

So that's all well and cool, but what about making sure that in a fit of intense focus and momentum we don't inadvertently push a breaking change to master? We need someone or something to act as a gatekeeper and keep us from shooting ourselves in the foot. We need CI.

We could use Travis, or Jenkins, or CircleCI… but as you may have noticed, we keep almost everything about PostHog in GitHub: our product roadmap, our issues, even this blog. So it made sense to keep our CI in GitHub too, if we could. We decided to give GitHub Actions a try. So far, we have loved it.

GitHub Actions are basically workflows you can trigger from events that occur on your GitHub repo. We trigger ours on the creation of a pull request. We also require that all of our actions return 👍 before you can merge your PR into master. Thus, we keep master clean.
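To give a feel for the shape of this (a hedged sketch with a made-up file name, not our exact workflow), a workflow that runs on every pull request starts something like this:

# .github/workflows/ci.yml (hypothetical name)
name: CI

# run the jobs below whenever a pull request is opened or updated
on: pull_request

jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      # ... unit tests, Cypress, caching steps, etc. go here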

To make sure that things are only improving with our modifications, we first re-run our Django unit and integration tests, just to be sure that in our customers' final environment things will still behave as expected. We need to be certain there was nothing unique about your dev environment that could have fooled the tests into a false sense of awesome. You can check out how we set up our Django GitHub Actions here.
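The heart of that job is simply a step that invokes the standard Django test runner; a minimal sketch (assuming a stock manage.py setup) looks like:

- name: Run Django tests
  # manage.py discovers and runs the unit & integration tests
  run: python manage.py test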

The second round of poking we do with our app is hitting it with the Cypress tests we discussed earlier. These boot up our app and click through workflows just as a user would, asserting along the way that things look and behave as we would expect. You can check out how we've set up our Cypress action here.
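In rough terms, the job uses the official Cypress GitHub action to build and serve the app before running the specs. A hedged sketch of its shape (the build and start commands here are assumptions, not our exact config):

- name: Cypress run
  uses: cypress-io/github-action@v1
  with:
    build: yarn build   # assumption: build the frontend first
    start: yarn start   # assumption: serve the app so the tests have something to click on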

Caching

We ran up against an issue though. Installing Python dependencies, installing JavaScript dependencies, building our frontend app, booting up a Chromium browser… this all takes a lot of time. We are impatient. We want instant gratification, at least when it comes to our code. Most of this stuff doesn't even change between commits on a PR anyway. Why spend valuable time and resources re-pulling and rebuilding things? That's where we ended up using one of the best features of GitHub Actions: the cache step.

Using the cache step, we can cache the results of pulling Python or JavaScript dependencies. This saves a big chunk of time, as you'll know if you have ever watched yarn sort out the deps for a large frontend project. Check it out:

How we manage caching the cache for pip:

- uses: actions/cache@v1
  name: Cache pip dependencies
  id: pip-cache
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      ${{ runner.os }}-pip-
- name: Install python dependencies
  run: |
    python -m pip install --upgrade pip
    python -m pip install $(grep -ivE "psycopg2" requirements.txt) --no-cache-dir --compile
    python -m pip install psycopg2-binary --no-cache-dir --compile

Note that there is no if block to determine whether to use the cache when we pip install the dependencies. This is because pip is smart enough to use the rehydrated cache if it sees it; if it doesn't, it will just go out to the internet and grab what it needs.

Yarn is a bit more involved, only because we grab the location of the cache directory first and use that output as an input to the caching step.

Yarn dependency caching

- name: Get yarn cache directory path
  id: yarn-dep-cache-dir-path
  run: echo "::set-output name=dir::$(yarn cache dir)"
- uses: actions/cache@v1
  name: Setup Yarn dep cache
  id: yarn-dep-cache
  with:
    path: ${{ steps.yarn-dep-cache-dir-path.outputs.dir }}
    key: ${{ runner.os }}-yarn-dep-${{ hashFiles('**/yarn.lock') }}
    restore-keys: |
      ${{ runner.os }}-yarn-dep-
- name: Yarn install deps
  run: |
    yarn install --frozen-lockfile
  if: steps.yarn-dep-cache.outputs.cache-hit != 'true'

That last line with the if block tells GitHub not to run yarn install if the cache exists. This saves us a ton of time when nothing has changed.

On top of that, let's say you are making changes to only the API. There's no reason why you should be rebuilding the frontend each time the tests run. So we go ahead and cache that between runs as well.

Frontend app build cache

- uses: actions/cache@v1
  name: Setup Yarn build cache
  id: yarn-build-cache
  with:
    path: frontend/dist
    key: ${{ runner.os }}-yarn-build-${{ hashFiles('frontend/src/') }}
    restore-keys: |
      ${{ runner.os }}-yarn-build-
- name: Yarn build
  run: |
    yarn build
  if: steps.yarn-build-cache.outputs.cache-hit != 'true'

This time we do check whether the cache exists, so we can skip building the frontend altogether since it's been rehydrated from the last run. Nifty!

Throw more computers at it!

One of the best things about Cypress is that you can grow with it. It would be a real pain to invest all of this time into building out tests just to have your test suite take 60 minutes to run. Luckily, both GitHub Actions and Cypress have a solution for that!

Run it in parallel!

strategy:
  matrix:
    # run 4 copies of the current job in parallel
    containers: [1, 2, 3, 4]

Then we configure the Cypress step to coordinate with Cypress's SaaS:

- name: Cypress run
  uses: cypress-io/github-action@v1
  with:
    config-file: cypress.json
    record: true
    parallel: true
    group: 'PostHog Frontend'
  env:
    # pass the Dashboard record key as an environment variable
    CYPRESS_RECORD_KEY: ${{ secrets.CYPRESS_RECORD_KEY }}
    # Recommended: passing the GitHub token lets this action correctly
    # determine the unique run id necessary to re-run the checks
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Depending on how many tests you have and how frequently you run your suite, this might cost you some money via an upgraded account on Cypress.io, but their free tier is pretty generous and they do have OSS plans that are free.

All of this has cut the time it takes for GitHub to stamp our pull requests from >10 minutes to ~5 minutes, and that's with our relatively small set of tests.

As we grow the functionality within PostHog, all of this will only become more important, so that we don't end up with a 30-minute end-to-end test blocking you from landing that really killer new feature. Sweet.

👀 at errors

The final bit here is: what happens if the tests fail?

If this is all happening in a browser up in the cloud, how do we capture what went wrong? We need that information to figure out how to fix it. Luckily, again, Cypress and GitHub Actions have a solution: artifacts.

Artifacts allow us to take the screenshots that Cypress captures when things go wrong, zip them up, and make them available on the dashboard for the actions that are being run.

Capturing Cypress screenshots

- name: Archive test screenshots
  uses: actions/upload-artifact@v1
  with:
    name: screenshots
    path: cypress/screenshots
  if: ${{ failure() }}

As you can tell by the if block here, we only upload the artifacts if there is a problem. That's because we already know what the app will look like when things go right… hopefully 😜

Roadmap

There is one thing that we don't capture in our current test suite: Performance!

We have customers who upload hundreds of telemetry events a second. If we introduce a regression that dings performance, it could cause an outage for them where they lose data, which is arguably worse than a regression on the frontend.

Our plan here is to use GitHub Actions to stand up an instance of our infrastructure, hammer it with synthetic event telemetry, and compare that against a baseline from prior performance tests. If the test runtime changes materially, we will block the pull request from being merged, to guard master from a potentially breaking change. Stay tuned for a post on automated performance testing.
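Nothing is built yet, but the shape of such a job might look like the following sketch, where every script name is hypothetical:

# hypothetical performance gate; none of these scripts exist yet
- name: Boot PostHog stack
  run: docker-compose up -d
- name: Fire synthetic event telemetry
  run: python bin/perf/send_synthetic_events.py --events 100000
- name: Compare runtime against stored baseline
  run: python bin/perf/compare_to_baseline.py  # fails the job on a material regression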

The pitch™

Hey! You made it this far. If you can see yourself working on challenging problems at a fast-paced startup with a really rad group of people, you're in luck! We are looking for people like you!

Sunday, November 21, 2021

Azure DevOps agent with Docker Compose

Using Docker commands in a pipeline definition is nice, but it has some drawbacks. First of all, this approach suffers in speed of execution, because the containers must start each time you run a build (and should be stopped at the end of the build). It is indeed true that if the Docker image is already present on the agent machine the startup time is not so high, but some images, like MS SQL, are not immediately operative, so you need to wait for them to be ready for every build. The alternative is to leave them running even after the build has finished, but this could lead to resource exhaustion.

Another problem is the dependency on the Docker engine. If I include Docker commands in the build definition, I can build only on a machine that has Docker installed. If most of my projects use MongoDB, MS SQL and Redis, I can simply install all three on my build machine, perhaps using a fast SSD as storage. In that scenario I expect to use my physical instances, not wait for Docker to spin up new containers.

Including Docker commands in the pipeline definition is nice, but it ties the pipeline to Docker and can carry a penalty in execution speed.

What I'd like to do is leverage Docker to spin up an agent and all needed dependencies once, then use that agent with a standard build that does not require Docker. This gives me the flexibility of setting up a build machine with everything preinstalled, or of simply using Docker to spin up an agent that can build my code in seconds. Removing the Docker dependency from the pipeline definition gives users the most flexibility.

For my first experiment I also want to use Docker on Windows Server 2019 to leverage Windows Containers.

First of all, you can read the nice MSDN article about how to create a Windows Docker image that downloads, installs, and runs an agent inside a Windows Server machine with Docker for Windows. This allows you to spin up a new Docker agent based on a Windows image in minutes (just the time to download and configure the agent).

Thanks to Windows Containers, running a Windows-based Azure DevOps agent is a simple docker run command.

Now I need that agent to be able to use MongoDB and MS SQL to run integration tests. Clearly I could install both database engines on my host machine and let the Docker agent use them, but since my agent is already in Docker I'd like the dependencies to run in Docker too; so… welcome, Docker Compose.

Thanks to Docker Compose, I can define a YAML file with a list of images that are part of a single scenario, so I specified an agent image followed by a SQL Server image and a MongoDB image. The beauty of Docker Compose is the ability to refer to other containers by name. Let's do an example: here is my complete Docker Compose YAML file.

version: '2.4'

services:
  agent:
    image: dockeragent:latest
    environment:
      - AZP_TOKEN=efk5g3j344xfizar12duju65r34llyw4n7707r17h1$36o6pxsa4q
      - AZP_AGENT_NAME=mydockeragent
      - AZP_URL=https://dev.azure.com/gianmariaricci
      - NSTORE_MONGODB=mongodb://mongo/nstoretest
      - NSTORE_MSSQL=Server=mssql;user id=sa;password=sqlPw3$secure

  mssql:
    image: mssqlon2019:latest
    environment:
      - sa_password=sqlPw3$secure
      - ACCEPT_EULA=Y

    ports:
      - "1433:1433"

  mongo:
    platform: linux
    image: mongo
    ports:
      - "27017:27017"

To simplify everything, all of my integration tests that need a connection string to MS SQL or MongoDB grab the connection string from an environment variable. This is convenient because each developer can use the database instances of their choice, but it also makes it super easy to configure a Docker agent by specifying database connection strings, as seen in Figure 1. I can specify the connection strings to use for testing in environment variables, and I can simply use the other Docker service names directly in the connection strings.

Figure 1: Environment variables used to specify connection strings.

As you can see (1), the connection strings refer to other containers by name; nothing could be easier.

The real advantage of using Docker Compose is the ability to include the Docker Compose file (as well as the Dockerfiles for any custom images you may need) inside your source code. With this approach you can specify build pipelines using Azure DevOps YAML builds, along with the configuration of the agent and all of its dependencies. Since you can configure as many agents as you want for Azure DevOps (you actually pay for the number of concurrently executing pipelines), thanks to Docker Compose you can set up an agent suitable for your project in less than a minute. But this is optional; if you do not like Docker Compose, you can simply set up an agent manually, just as you did before.

Including a Docker Compose file in your source code allows consumers of the code to start a compatible agent with a single command.
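To make that concrete, here is a hedged sketch of what the matching azure-pipelines.yml could look like (the pool name and test command are assumptions, not taken from the original project):

# hypothetical azure-pipelines.yml: no Docker commands, just a normal build
trigger:
  - master

pool:
  name: Default  # assumption: the pool where the Compose-based agent registered

steps:
  - script: dotnet test
    displayName: 'Run integration tests'  # connection strings come from the agent environment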

Thanks to Docker Compose, you pay the price of pulling images only once, and you also pay only once the time needed for an image to become operative (like MS SQL or other databases that need a little while before they can satisfy requests). After everything is up and running, your agent is operative and can immediately run your standard builds: no Docker references inside your YAML build file, no time wasted waiting for images to become operational.

Thanks to an experimental feature of Windows Server 2019, I was able to specify a docker-compose file that contains not only Windows images, but also Linux images. The only problem I had is that I did not find a Windows 2019 image for SQL Server. I started getting errors using the standard MS SQL images (built for Windows 2016), so I downloaded the official Dockerfile, changed the reference image, and recreated the image; everything worked like a charm.

Figure 2: A small modification to the SQL Server Docker Windows image file, targeting Windows Server 2019.

Since it is used only for testing, I'm pretty confident that it should work, and indeed my build runs just fine.

Figure 3: My build result, run on the Docker agent.

My agent created with Docker Compose is exactly like every other agent; from the point of view of Azure DevOps there is nothing different about it, but I started it with a single docker-compose command.

Figure 4: The agent running in Docker is treated as a standard agent.

That's all: with a little effort I'm able to include in my source code both the YAML build definition and the YAML Docker Compose file that specifies the agent with all the prerequisites to run the build. This is especially useful for open source projects, where you want to fork a project and then activate CI with no effort.