TLDR: when restoring the NPM cache on your continuous integration service, use a key with the exact lock file hash; do not use lax partial restore keys.
The NPM caching on CI
Imagine we have a Node project that we test on a continuous integration server. Take the bahmutov/snowball-npm-cache-example repository, for example. It has only a single production dependency, my favorite debug module; I use it to log all the things the right way.
$ npm i -S debug
Let's say we want to test our project on a continuous integration service, like GitHub Actions. We need to check out the source code and install dependencies before we can run any tests. Here is our initial CI workflow file (copied almost verbatim from Example CI configs).
name: ci
We are using the official actions/cache action, and the above syntax comes straight from the Caching dependencies to speed up workflows documentation page. Let me explain it.
After the code is checked out, the actions/cache step takes over. It uses the name cache-node-modules that we have picked. While not necessary right now, it will come in handy later.
The action will cache the folder ~/.npm. This is the folder where NPM caches downloaded NPM modules. We can see this folder's name locally by asking NPM:
$ npm config get cache
The action first tries to restore the ~/.npm folder on CI by looking up caches stored for this project. Every cache has a key, which is the name of the cache. This is where the key and the restore-keys come into play. First, the action computes them. The key is the most precise cache name; it uses the OS name, the cache-name string, and the hash of the lock file.
key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/package-lock.json') }}
Every time we install a dependency locally, the package-lock.json file is updated. Thus the hash of this file, and with it the key, changes every time the package lock file changes.
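To make the key format concrete, here is a minimal sketch that builds a key of the same shape locally. The `cache_key` helper is hypothetical, and `sha256sum` (assumed to be available, as on most Linux machines) only stands in for `hashFiles`, so the hash will not equal the one GitHub computes. The point is only that a different lock file yields a different key.

```shell
#!/usr/bin/env bash
# cache_key is a hypothetical helper, not part of actions/cache.
# It joins the OS name, the cache name, and a SHA-256 hash of the lock file.
# Assumption: sha256sum approximates hashFiles(); the real digest differs.
cache_key() {
  local os="$1" cache_name="$2" lock_file="$3"
  local hash
  hash=$(sha256sum "$lock_file" | cut -d ' ' -f 1)
  printf '%s-build-%s-%s\n' "$os" "$cache_name" "$hash"
}

# Usage: a throwaway lock file in a temporary folder.
demo_dir=$(mktemp -d)
printf '{"lockfileVersion": 1}\n' > "$demo_dir/package-lock.json"
cache_key Linux cache-node-modules "$demo_dir/package-lock.json"
```

Changing even one byte of the lock file produces a different suffix, which is why an exact key lookup misses after every dependency change.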
The restore-keys are fallbacks. If a cache with the exact key is not found, actions/cache tries to find a cache whose name starts with one of the restore keys, trying them in the order listed. Maybe there is already a cache that starts with ${{ runner.os }}-build-${{ env.cache-name }}-? This would be the case if there was a previously saved cache with a different package-lock.json file. If not found, then actions/cache tries to find a saved cache that starts with ${{ runner.os }}-build-, so even the specified cache-name does not matter. If that fails too, it tries to restore a cache with the ${{ runner.os }}- prefix. At this point you might be asking yourself "wait, it will restore some random cache and go from there?"
Yes. The restore keys are very lax, and a pretty random cache can be restored, giving you some random ~/.npm folder before npm ci runs. That is not a problem. Yet. Let's see the GitHub Action messages the first time our project runs.
There were no previously saved caches (this was the very first CI run for this repository). Thus there was nothing to restore, and the ~/.npm folder was empty when npm ci ran. Two dependencies were downloaded and stored in the ~/.npm folder: the debug NPM dependency and its single transitive dependency ms.
$ npm ls
After installation, the cache is saved under the precise key Linux-build-cache-node-modules-e9940409f0500326b7e54199eda4e7eefb0b839256d569cdb4979c7fff132c2c, which includes the OS name, the cache name we specified in the YML file, and the hash of the package-lock.json file, using key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/package-lock.json') }}.
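The lookup order described above (exact key first, then each restore key as a prefix) can be imitated with a few lines of shell. This is only a sketch: `find_cache` is a hypothetical function, the keys ending in `aaa` and `bbb` are made up, and it leaves out details of the real service. It assumes that when several caches match a prefix, the most recently saved one is picked.

```shell
#!/usr/bin/env bash
# find_cache KEY [RESTORE_KEY...]
# Hypothetical model of the lookup order, not the real actions/cache code.
# Saved cache keys are read from stdin, oldest first, one per line.
find_cache() {
  local key="$1"; shift
  local saved=() line candidate prefix i
  while IFS= read -r line; do
    [ -n "$line" ] && saved+=("$line")
  done

  # 1. Look for an exact match on the full key.
  for candidate in "${saved[@]}"; do
    if [ "$candidate" = "$key" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done

  # 2. Try each restore key as a prefix, in the order given.
  #    Assumption: the newest cache that matches a prefix wins.
  for prefix in "$@"; do
    for (( i = ${#saved[@]} - 1; i >= 0; i-- )); do
      if [[ "${saved[i]}" == "$prefix"* ]]; then
        printf '%s\n' "${saved[i]}"
        return 0
      fi
    done
  done
  return 1  # cache miss
}

# Only one older cache exists, saved under another cache name and hash.
printf '%s\n' 'Linux-build-cache-node-modules-aaa' |
  find_cache 'Linux-build-other-name-bbb' \
    'Linux-build-other-name-' 'Linux-build-' 'Linux-'
# prints: Linux-build-cache-node-modules-aaa
```

In the usage example neither the exact key nor the first restore key matches, yet the old cache is still returned through the `Linux-build-` prefix. That is how lax the fallbacks are.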
The cache size
Before we continue, let's NOT change any dependencies, and just print the size of the cache folder after the restore. We can run the command du -d 0 -h ~/.npm on CI to print the cache folder size in human-readable format. We will do it after restoring the cache and again after the NPM install.
- name: Cache node_modules
I tried listing all cached NPM modules using the npm-cache-list module, but failed to see any real results; it never printed debug and ms in its output. Seems npm cache is really missing a list command.
The cached folder is 84 Kilobytes.
Tip: You can see the zipped cache folder size in the Cache node_modules messages when it downloads the found cache too:
▶ Run actions/cache@v2
The snowball
Now let's change the dependencies in our project. For example, we can replace the debug module with morgan for some reason.
$ npm uninstall debug
By changing the dependencies, we have modified both the package.json and package-lock.json files:
$ git status
What happens when we push the code? Well, the CI has the following cache right now:
Linux-build-cache-node-modules-e9940409f0500326b7e54199eda4e7eefb0b839256d569cdb4979c7fff132c2c
We will push the updated package-lock.json file with a different hash. Thus that cache will NOT match the exact key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/package-lock.json') }} when restoring the cache. But it will match the prefix restore keys, because actions/cache will find that cache when looking for the ${{ runner.os }}-build-${{ env.cache-name }}- prefix. This is what the CI output shows:
The Cache node_modules step shows that it has found the previous cache, restored it, ran the npm ci command, and then saved the new ~/.npm folder under the new full key, which includes the new lock file's hash. The new cache folder has a size of 384 kilobytes, by the way, and it includes both the morgan and debug modules (and their dependencies)!
Print more information
An even better view of what is going on is available if you enable debugging output from the actions/cache steps by setting a secret in the GitHub repo:
ACTIONS_STEP_DEBUG: true
Let's remove all NPM dependencies from our project and run the CI again:
$ npm uninstall morgan
We now have no dependencies at all in our project:
{
Let's see how the cache behaves. Here is the partial output from the action run - it is pretty verbose!
Again, it shows that the previous cache was restored because it matched the lax restore key using the prefix ${{ runner.os }}-build-${{ env.cache-name }}-. We never compact the cache, thus even removing the project's dependencies has no effect on the ~/.npm folder. We just roll over the previous cache under the new hash key. Every new dependency only grows the cache.
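The roll-over effect can be shown with a toy model. The sketch below is not real npm or actions/cache behavior: `simulate_runs` is a hypothetical function, and package names stand in for the files kept in ~/.npm. Each argument lists what one CI run needs to install. In "lax" mode the previous cache is always restored first. In "exact" mode every run starts empty, assuming each run has a changed lock file and no restore keys.

```shell
#!/usr/bin/env bash
# simulate_runs MODE DEPS...
# Toy model only: each DEPS argument is a space-separated package list
# for one CI run, and the "cache" is just a list of package names.
simulate_runs() {
  local mode="$1"; shift
  local cache="" deps pkg
  for deps in "$@"; do
    # Exact key with a changed lock file: cache miss, start from nothing.
    if [ "$mode" = "exact" ]; then
      cache=""
    fi
    for pkg in $deps; do
      case " $cache " in
        *" $pkg "*) ;;                         # already in the cache
        *) cache="${cache:+$cache }$pkg" ;;    # downloaded and kept
      esac
    done
    printf 'deps=[%s] cache=[%s]\n' "$deps" "$cache"
  done
}

# Three runs: debug + ms, then morgan only, then no dependencies at all.
simulate_runs lax   "debug ms" "morgan" ""
simulate_runs exact "debug ms" "morgan" ""
```

In the lax run the last line still reports `cache=[debug ms morgan]` even though the project has no dependencies left, while the exact run ends with an empty cache.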
Prevent the cache snowball
You might think a few kilobytes of extra stuff carried over in the cache folder is no big deal. But remember - that cache keeps growing and growing, since you never delete anything there. After a while it can reach a magnificent size, just like the one I have locally:
$ du -d 0 -h ~/.npm
This is a realistic problem for larger repositories, especially with Automated dependency updates configured - the NPM cache will keep all those versions around forever, snowballing the cache size to hundreds of megabytes and even gigabytes.
So what can we do to "reset" the cache and stop it from growing? Well, you might think that changing the cache name would do the trick. For example, you could change the cache-name environment variable in the GitHub Actions workflow by adding -v2 to it.
- name: Cache node_modules
Let's see how it works out.
Wow, we again got an old cache restored because it matched another restore key, ${{ runner.os }}-build-.
So the only solution I found for preventing ever-increasing caches is ... 🥁 ... using NO restore keys. When doing this, we should also change the cache name to drop the previous cache (otherwise the exact key could still match an already saved cache, since the hash of the package lock file is unchanged).
- name: Cache node_modules
Tip: the disk usage utility du we have used exits with code 1 if the folder is not found. Since we can now end up with NO ~/.npm folder, we should make the command more robust by using du -d 0 -h ~/.npm || true.
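Here is a small sketch of that tip as a reusable function. The `print_cache_size` name is hypothetical; the du command and the `|| true` guard are the ones from the tip above. A CI step normally stops at the first command that exits with a non-zero code, and the guard prevents that when the folder is missing.

```shell
#!/usr/bin/env bash
# print_cache_size is a hypothetical wrapper around the du command.
# It prints the folder size and always succeeds, even when the folder
# does not exist (du still reports the missing folder on stderr).
print_cache_size() {
  local dir="${1:-$HOME/.npm}"
  du -d 0 -h "$dir" || true
}

# Usage: works whether or not the NPM cache folder is there.
print_cache_size "$HOME/.npm"
```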
The CI runs and has nothing to cache, since there is no ~/.npm folder if there are no project dependencies to download, cache, and install.
Let's verify the cache is acting as expected. Let's install the morgan dependency first.
$ npm i -S morgan
The CI runs and the cache is 288K, because it only has the morgan dependency (and not morgan + debug like before).
Now let's remove morgan and go back to just having our debug dependency.
$ npm uninstall morgan
Since we use the exact hash, the previous cache was ignored and we have recreated the cache from scratch, having only the minimal 84K ~/.npm folder. From now on, every CI job that does not modify package.json will restore this minimal, up-to-date cache folder.
The best solution
Remembering the cache key format is tiresome. You just want to install NPM dependencies and cache the ~/.npm folder, right? So the simplest way is to use the bahmutov/npm-install action I wrote. It does precisely what this blog post describes: it uses the exact key to restore and save the NPM cache, and it runs npm ci or yarn for you.
The CI file is now
name: ci
The CI runs and saves the new cache (since the action sets the exact key to be yarn|npm-${platformAndArch}-${lockHash}, which is slightly different from what we have used before in this example).
Let's push another commit to verify the action works.
$ git commit --allow-empty -m "trigger ci"
The CI runs, gets a cache hit, and quickly installs using JUST the right dependencies.
You might say that discarding the entire cache on every package lock file change is extreme. I say no: you will have a clean reinstall whenever the lock file changes, but how often do you modify the dependencies? I believe that you probably have a lot more commits that change the source files, but leave the dependencies intact. The commits that touch the dependencies will run longer, but all other commits will benefit from a smaller cache restore.
Using bahmutov/npm-install abstracts all this away; I use it myself all the time. It is truly a simple solution.
More info
Blog post Do Not Let Cypress Cache Snowball on CI talks about Cypress-specific caching.
Read Cleaning Up Space on Development Machine post.
In the next blog post I will show how Cypress binaries snowball the same way during CI builds and how to solve it.