Friday, November 19, 2021

Caching node modules and the Cypress installation in an Azure DevOps pipeline

 


I have a back-end-for-front-end application which I scaffolded using Visual Studio. The backend is an ASP.NET Core web API, and the front end is Angular. In the Angular application, I have Cypress end-to-end tests that I want to run as part of a build pipeline. Accomplishing this requires a few things. For instance, I need to install the Cypress binaries on the build agent. I also need to start the Angular app so that the tests have something to run against. Installing the Cypress binaries is a lengthy process, and with the other steps, the entire build can take a long time to finish. After some trial and error, I finally got the build to run the Cypress tests and cache the binaries. Keep in mind that caching only makes sense if the time it takes to save and restore the cached data is considerably less than the time it takes to download and install the data directly.

Below is the current pipeline config that works for me:

# Starter pipeline
# Start with a minimal pipeline that you can customize to build and deploy your code.
# Add steps that build, run tests, deploy, and more:
# https://aka.ms/yaml
trigger:
  - master
pool:
  vmImage: "ubuntu-latest"
variables:
  buildConfiguration: "Release"
  spaDir: "AzurePipelineRestoreAndSaveCacheEx/ClientApp"
  cyCacheDir: "/home/vsts/.cache/Cypress"
  npmCacheDir: "/home/vsts/.npm"
steps:
  - task: NodeTool@0
    inputs:
      versionSpec: "12.19"
    displayName: "Install Node.js"
  - task: Cache@2
    inputs:
      key: 'npm_v1 | "$(Agent.OS)" | $(spaDir)/package-lock.json'
      path: $(npmCacheDir)
      cacheHitVar: NPM_CACHE_RESTORED
    displayName: "Cache ~/.npm directory"
  - task: Cache@2
    inputs:
      key: 'cy_v1 | "$(Agent.OS)" | $(spaDir)/package-lock.json'
      path: $(cyCacheDir)
      cacheHitVar: CYPRESS_CACHE_RESTORED
    displayName: "Cache Cypress binary"
  - script: |
      CYPRESS_INSTALL_BINARY=0 npm ci
    displayName: Install dependencies (skip Cypress install)
    workingDirectory: "$(spaDir)"
    condition: eq(variables.CYPRESS_CACHE_RESTORED, 'true')
  - script: |
      npm ci
    displayName: Install dependencies (include Cypress install)
    workingDirectory: "$(spaDir)"
    condition: eq(variables.CYPRESS_CACHE_RESTORED, 'false')
  - task: CmdLine@2
    inputs:
      script: "npm test"
      workingDirectory: "$(spaDir)"
    displayName: "Run cypress test"
  - task: PublishTestResults@2
    inputs:
      testResultsFormat: "JUnit"
      testResultsFiles: "**/test-output-*.xml"
    displayName: "Publish test results"
  - task: DotNetCoreCLI@2
    displayName: "Publish web app"
    inputs:
      command: "publish"
      projects: "**/*.csproj"
      publishWebProjects: true
      zipAfterPublish: true
      arguments: "--output $(build.artifactstagingdirectory)"
  - task: PublishBuildArtifacts@1
    inputs:
      pathToPublish: $(Build.ArtifactStagingDirectory)

First, a note about the build agent. As shown in the config above, I’m using an Ubuntu-based build agent. At first, I used a Windows-based agent; however, the build took a long time to finish. I got a significant reduction in build time just by switching to the Ubuntu agent.

pool:
  vmImage: "ubuntu-latest"

The next thing I want to point out is the Cache task, which I use to cache both the Cypress and .npm directories. For instance, the task below caches the global .npm directory:

- task: Cache@2
  inputs:
    key: 'npm_v1 | "$(Agent.OS)" | $(spaDir)/package-lock.json'
    path: $(npmCacheDir)
    cacheHitVar: NPM_CACHE_RESTORED
  displayName: "Cache ~/.npm directory"

In the above config, I specify a key that consists of the string ‘npm_v1’, the name of the operating system, and a hash of the contents of the package-lock.json file. When the task runs, it uses this key to look up data in the cache. On a hit, the task restores the data to the folder specified in the path parameter. On a miss, the task saves the contents of the path directory into the cache at the end of the build. Because the key includes the hash of package-lock.json, subsequent lookups result in a cache hit as long as the content of the file and the operating system remain the same. You may wonder about the constant string ‘npm_v1’. I use it so that I can change the key whenever I want to forgo the existing cache.

“Clearing a cache is currently not supported. However you can add a string literal (such as version2) to your existing cache key to change the key in a way that avoids any hits on existing caches.”

Pipeline caching
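
As a minimal sketch of that advice, forgoing the existing npm cache only requires changing the literal prefix of the key. The value npm_v2 below is an arbitrary, hypothetical replacement for npm_v1; any change to the string has the same effect.

```yaml
- task: Cache@2
  inputs:
    # 'npm_v2' is a hypothetical new literal; entries saved under 'npm_v1' no longer match this key
    key: 'npm_v2 | "$(Agent.OS)" | $(spaDir)/package-lock.json'
    path: $(npmCacheDir)
    cacheHitVar: NPM_CACHE_RESTORED
  displayName: "Cache ~/.npm directory"
```

The first run after the change is a cache miss, so the post-job step saves a fresh entry under the new key.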

From my understanding, the Cypress installation puts files both under node_modules/.bin and in a global system cache. On Linux, the Cypress binary is stored under ~/.cache/Cypress by default. This directory is configurable by setting the environment variable CYPRESS_CACHE_FOLDER. For more info, check out the Cypress binary cache documentation listed in the references.
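
Instead of hard-coding the agent's home directory, the cache location could be set explicitly through CYPRESS_CACHE_FOLDER. The sketch below assumes that a non-secret pipeline variable is exposed to script steps as an environment variable of the same name, and that $(Pipeline.Workspace)/.cache/Cypress is an acceptable location; the folder name is my choice for illustration, not something Cypress requires.

```yaml
variables:
  # Assumption: any writable directory works; Cypress reads this variable at install time and at run time
  CYPRESS_CACHE_FOLDER: $(Pipeline.Workspace)/.cache/Cypress

steps:
  - task: Cache@2
    inputs:
      key: 'cy_v1 | "$(Agent.OS)" | $(spaDir)/package-lock.json'
      # Cache the same folder that Cypress installs into
      path: $(CYPRESS_CACHE_FOLDER)
      cacheHitVar: CYPRESS_CACHE_RESTORED
    displayName: "Cache Cypress binary"
```

The variable has to be visible both to the step that runs npm ci and to the step that runs the tests; otherwise Cypress looks for its binary in the default location.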

Depending on whether the Cypress binary exists in the cache, the build executes one of the next two tasks. Both tasks run npm ci to install dependencies. However, in case of a cache hit, one task skips the Cypress binary download by setting CYPRESS_INSTALL_BINARY=0. When the Cache task for the Cypress binary runs, it stores the result in the variable CYPRESS_CACHE_RESTORED. I therefore use a condition on this variable to determine whether the binary was restored from the cache; if it was, I skip the Cypress installation, since the Cache task has already restored the files to the correct location.
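
The full-install step could also be guarded with ne(...) rather than eq(..., 'false'), which is the form the Azure pipeline caching documentation uses in its own examples. This is a sketch of that variant, not what my pipeline currently runs; wrapping it in and(succeeded(), ...) is my addition so the step still respects earlier failures.

```yaml
- script: |
    npm ci
  displayName: Install dependencies (include Cypress install)
  workingDirectory: "$(spaDir)"
  # Runs whenever the cache was not restored exactly, including when the variable is unset
  condition: and(succeeded(), ne(variables.CYPRESS_CACHE_RESTORED, 'true'))
```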

You may wonder why I always run npm ci without checking the npm cache. This is because npm looks at its shared cache directory, which contains a cached version of all downloaded node modules. On the Linux agent, that directory is /home/vsts/.npm, which I cache in the preceding task. Therefore, even though I always run npm ci, I still get the benefit of caching because npm first checks whether a package already exists in the shared cache directory to avoid making unnecessary network calls. In a CI environment, npm ci is preferred over npm install since the former does a clean install by removing the node_modules directory first. This also explains why I need to cache both the Cypress cache directory and the shared npm directory: the Cypress installation puts files under node_modules/.bin, but I do not cache the node_modules directory, only the shared npm directory.

“There are different ways to enable caching in a Node.js project, but the recommended way is to cache npm’s shared cache directory. This directory is managed by npm and contains a cached version of all downloaded modules. During install, npm checks this directory first (by default) for modules which can reduce or eliminate network calls to the public npm registry or to a private registry.”

Pipeline caching
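
A sketch of that recommended setup, adapted to this project's paths, might look like the following. It assumes npm honors the npm_config_cache environment variable for its cache directory and that pipeline variables reach the script step as environment variables. The restoreKeys fallback is not part of my pipeline; it is shown only as an option.

```yaml
variables:
  # npm reads npm_config_cache from the environment as its shared cache directory
  npm_config_cache: $(Pipeline.Workspace)/.npm

steps:
  - task: Cache@2
    inputs:
      key: 'npm_v1 | "$(Agent.OS)" | $(spaDir)/package-lock.json'
      # Optional fallback: reuse the latest cache for this OS when package-lock.json has changed
      restoreKeys: |
        npm_v1 | "$(Agent.OS)"
      path: $(npm_config_cache)
    displayName: "Cache npm"
  - script: npm ci
    workingDirectory: "$(spaDir)"
    displayName: "Install dependencies"
```

With a fallback key, a changed package-lock.json still restores most packages from the older cache, and npm downloads only what is missing.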

Each Cache task has a post-job step that runs at the end of the build to check and update the cache if necessary. For instance, in case of a cache miss, the post-job step stores the files under the path directory into the cache.

[Screenshot: Cache task – no data exists in the cache]
[Screenshot: Cache post-job – saves data into the cache for the next run]

With all the node modules and Cypress binary files in place, the next task starts the local server and runs the Cypress tests.

- task: CmdLine@2
  inputs:
    script: "npm test"
    workingDirectory: "$(spaDir)"
  displayName: "Run cypress test"

In the above snippet, the script “npm test” refers to the test script in package.json:

"scripts": {
  "ng": "ng",
  "start": "ng serve",
  "build": "ng build",
  "cy": "npx cypress run",
  "build:ssr": "ng run AzurePipelineRestoreAndSaveCacheEx:server:dev",
  "test": "start-server-and-test start http-get://localhost:4200 cy",
  "lint": "ng lint",
  "e2e": "ng e2e"
},

Notice in the above config that I use the start-server-and-test module to start and serve the Angular app on localhost and then run the Cypress tests once http://localhost:4200 responds. The cy script runs npx cypress run, and the start script runs ng serve.

Once the tests have finished, Cypress outputs the test results in JUnit format. For this to work, I need the following config in the cypress.json file:

{
  "chromeWebSecurity": false,
  "baseUrl": "http://localhost:4200",
  "reporter": "junit",
  "reporterOptions": {
    "mochaFile": "tests/test-output-[hash].xml",
    "toConsole": true,
    "attachments": true
  }
}
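
Because a step runs by default only when the previous steps succeeded, a failing Cypress run would skip the publish step and the report would never reach the Tests tab. The sketch below shows one possible adjustment to the publish task from my pipeline. The condition and the mergeTestResults input are my additions for illustration, not part of the original config.

```yaml
- task: PublishTestResults@2
  inputs:
    testResultsFormat: "JUnit"
    testResultsFiles: "**/test-output-*.xml"
    # [hash] in mochaFile yields one XML file per spec; merge them into a single test run
    mergeTestResults: true
  displayName: "Publish test results"
  # Publish even when the Cypress step failed, but not when the build was canceled
  condition: succeededOrFailed()
```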

Once the tests for the front-end app have run, the last two tasks build and publish the ASP.NET Core application. Part of this includes building the Angular app again, this time in production mode, as per the config in the .csproj file:

<Target Name="PublishRunWebpack" AfterTargets="ComputeFilesToPublish">
  <!-- As part of publishing, ensure the JS resources are freshly built in production mode -->
  <Exec WorkingDirectory="$(SpaRoot)" Command="npm run build -- --prod" />
  <Exec WorkingDirectory="$(SpaRoot)" Command="npm run build:ssr -- --prod" Condition=" '$(BuildServerSideRenderer)' == 'true' " />

As this is a sample project, I don’t have many node packages. The time it takes to save and restore the cache is almost the same as the time it takes to download and install the packages directly. However, if your project is large and has a lot of packages, you will probably benefit from caching.

You can check out the sample project at this link: https://dev.azure.com/taithienbo/_git/PipelineCachingEx.

References

Install Cypress binaries

Azure devops pipeline caching

Cypress setting up CI

Cypress binary cache

Cypress CI boot-your-server

Start-server-and-test module
