Snapshot testing the hard way

Framework-agnostic snapshot testing for Mocha, Jest, Vue, etc.

I really liked snapshot testing as implemented in Jest. Once you have a complex object, compare it with its previous value using expect(value).toMatchSnapshot(). If there is no previous snapshot, the value is saved under the test name. If there is a snapshot file already, the assertion compares the given value with the saved one and throws a well-formatted exception if the two values differ.

Jest snapshot (image source: https://facebook.github.io/jest/img/blog/snapshot.png)

Other frameworks have implemented snapshot matching too, for example Ava v0.18.0. Yet I always found snapshot testing too closely tied to the Jest framework itself. Issue #2497 discusses the work that went into adding snapshots to Ava (see commit ee65b6d) and how snapshot testing could be better separated from the Jest framework.

Goal

I wanted to design a snapshot assertion that is separate from any particular testing framework. I love Mocha and would love to bring snapshot assertions to my unit, API and DOM tests. I would also like to use the same library with other test runners, like Jest, Ava and QUnit. A stretch goal is to make snapshot testing work inside end-to-end tests in the Cypress tool.

I wanted the simplest API possible. A single function that takes just a value. No test names, no arguments, nothing but a value. Everything else should be figured out automatically.

Snap-shot

This is what I wanted (and ultimately built in snap-shot).

spec.js
const snapshot = require('snap-shot')
it('adds two numbers', () => {
  snapshot(2 + 3)
})

Notice that snap-shot is a third-party module, independent of Mocha, Jasmine or any other BDD testing framework. Let us run this test using Mocha.

$ npm install -D mocha snap-shot
$ cat package.json
{
  "name": "snapshot-testing",
  "version": "1.0.0",
  "scripts": {
    "test": "mocha spec.js"
  },
  "dependencies": {
    "mocha": "3.2.0",
    "snap-shot": "2.7.1"
  }
}
$ npm test -s
  ✓ adds two numbers
  1 passing (23ms)

There is a new file in our folder.

$ cat __snapshots__/spec.js.snap-shot
exports['adds two numbers 1'] = 5

Notice how snap-shot has discovered the right test name "adds two numbers" and saved the sum. If we have more snapshots in the same test, they will be saved in the same file, each with a different index.

spec.js
const snapshot = require('snap-shot')
it('adds two numbers', () => {
  snapshot(2 + 3)
  snapshot(2 + 3 + 10)
})

$ npm test -s
$ cat __snapshots__/spec.js.snap-shot
exports['adds two numbers 1'] = 5
exports['adds two numbers 2'] = 15

If the value changes, the snapshot difference will be clearly shown using variable-diff.

spec.js
const snapshot = require('snap-shot')
it('adds two numbers', () => {
  snapshot(2 + 2)
  snapshot(2 + 3 + 10)
})

$ npm test -s
snapshot difference
5 => 4
  1) adds two numbers
  0 passing (27ms)
  1 failing

Beautiful. snap-shot supports multiple values per test (as shown above), dynamic test names and asynchronous tests. It even works with transpiled code, React + JSX and Vue.js libraries.

Snapshot testing was plugged into our Node server API tests and collapsed the test boilerplate code into almost nothing. Combined with a Ramda pipeline that cleans the returned data and makes it invariant to dynamic values, it became a simple, zero-maintenance way to use real-world data in our unit tests.

The rest of this blog post is dedicated to the way snap-shot is implemented. I will finish with recipes for cleaning up data and making it suitable for snapshot testing.

Who called me?

When a test function calls snapshot(), passing the value, we need to figure out the name of the calling test. Without a test runner context to read it from, this turns out to be hard :)

function snapshot() {
  // hmm, how do we get to the "works" string?
}
it('works', () => {
  snapshot()
})

We need to walk up the stack and find the caller callback function (in the case above it is an anonymous arrow function expression). Hopefully the call information in the stack has a meaningful line number that we can save for the next step. I looked at how to grab accurate stack locations in a previous blog post, Accurate call sites. In short, you can grab the call sites from the V8 API or from an exception:

try {
  throw new Error('on purpose')
} catch (e) {
  console.log('caller', e.stack.split('\n')[2])
  // do more parsing if necessary
}
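
The "more parsing" step could look roughly like the sketch below. It assumes the usual V8 text format for stack lines ("at name (file:line:column)" or "at file:line:column"); parseStackLine and callerLocation are hypothetical helper names, and unusual frames (eval, native code) are simply reported as null.

```javascript
// Parses one V8 stack line such as
//   "    at Context.it (/home/me/spec.js:5:3)"
//   "    at /home/me/spec.js:5:3"
function parseStackLine (stackLine) {
  const match = /\(?([^()]+):(\d+):(\d+)\)?\s*$/.exec(stackLine)
  if (!match) {
    return null
  }
  return {
    file: match[1].replace(/^\s*at\s+/, ''),
    line: Number(match[2]),
    column: Number(match[3])
  }
}

// Returns the location from which this function was called.
function callerLocation () {
  // stack[0] is the error message, stack[1] is callerLocation itself,
  // stack[2] is the frame that called it
  const stack = new Error('on purpose').stack.split('\n')
  return parseStackLine(stack[2] || '')
}
```

The returned file and line are exactly the two values the next step needs.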

From the stack we will get spec filename and line number, for example spec.js, line 5. We will use this information to find the actual test that called snapshot.

Finding the spec

Given the filename and line number, let us find the it(<name>, callback) call where the callback function covers the given line number. In order to do this, we will parse the source of the file into an Abstract Syntax Tree (AST). Then we will visit each node in the tree to find a call expression node where the name of the function called is "it" and the callback function (the second argument) is the caller callback function we just found.

Hiding the details and using falafel, the code that finds the spec whose callback calls snapshot looks like this:

const falafel = require('falafel')
/*
  source:
    it('works', () => {
      snapshot()
    })
  "source", "line" and "isTestFunction" are defined elsewhere;
  "line" is the line number taken from the stack
*/
const options = {
  locations: true,
  sourceType: 'module'
}
let specName
falafel(source, options, node => {
  // nested IFs make debugging easier
  if (node.type === 'CallExpression') {
    if (isTestFunction(node.callee)) {
      if (node.loc.start.line < line &&
        node.loc.end.line > line) {
        specName = node.arguments[0].value
        // specName will be "works"
      }
    }
  }
})

Note the "options.locations" - we need to keep track of source line numbers for each node in the AST.
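
The line-range check deserves a closer look. The sketch below uses hand-written records in place of real AST nodes (the callee, name, start and end fields are a simplified, hypothetical shape, not what falafel produces). It picks the innermost test call that covers the line and uses inclusive comparisons, because a strict comparison like the one above would miss a test written on a single line.

```javascript
// Stand-ins for the call expressions found in this source:
//   1: describe('math', () => {
//   2:   it('adds', () => {
//   3:     snapshot(2 + 3)
//   4:   })
//   5:   it('inline', () => snapshot(1))
//   6: })
const calls = [
  { callee: 'describe', name: 'math', start: 1, end: 6 },
  { callee: 'it', name: 'adds', start: 2, end: 4 },
  { callee: 'it', name: 'inline', start: 5, end: 5 }
]

const TEST_FUNCTIONS = ['it', 'test', 'specify']

function findSpecName (calls, line) {
  const candidates = calls
    .filter(c => TEST_FUNCTIONS.includes(c.callee))
    .filter(c => c.start <= line && line <= c.end)
    // the narrowest range is the innermost test
    .sort((a, b) => (a.end - a.start) - (b.end - b.start))
  return candidates.length ? candidates[0].name : undefined
}

console.log(findSpecName(calls, 3)) // "adds"
console.log(findSpecName(calls, 5)) // "inline"
```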

The above code becomes a little bit hairy if the source file is transpiled by a loader, for example if it has JSX and cannot be parsed by falafel directly. snap-shot detects this and transforms the source file using babel-core. We just have to preserve the source line numbers, which the retainLines option does:

function transpile (filename) {
  const babel = require('babel-core')
  const {transformFileSync} = babel
  // trick to keep the line numbers same as original code
  const opts = {
    sourceMaps: false,
    retainLines: true
  }
  const {code} = transformFileSync(filename, opts)
  return code
}

The name of the rose

Things become even trickier when the test function is called not with a literal string, but with a variable. Often we generate the tests for each item in an array like this:

const names = ['test A', 'test B', 'test C']
names.forEach(name => {
  it(name, () => {
    snapshot(name + ' works')
  })
})

The static inspection of the call expression it(name, ...) does not provide a unique name to use. In this situation snap-shot does the following. It takes the source string of the test callback () => { snapshot(name + ' works') } and computes the SHA-256 hash. This hash is used to save the snapshot values instead of the name. In the above case the snapshot file will have something like

exports['7464af... 1'] = "test A works"
exports['7464af... 2'] = "test B works"
exports['7464af... 3'] = "test C works"

Notice that it is equivalent to a single test with 3 snapshot calls like

it('7464af...', () => {
  snapshot('test A works')
  snapshot('test B works')
  snapshot('test C works')
})

Even better is to give snap-shot something to work with. Instead of an arrow function, give it a named function as a callback. snap-shot will use that name to save the snapshot, instead of the SHA-256 hash.

const tests = ['test A', 'test B', 'test C']
tests.forEach(name => {
  it(name, function testSomething () {
    snapshot(name + ' works')
  })
})
/*
  snapshot will use "testSomething" name
  exports['testSomething 1'] = "test A works"
  exports['testSomething 2'] = "test B works"
  exports['testSomething 3'] = "test C works"
*/
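
The fallback order (function name first, hash of the source second) can be sketched with Node's built-in crypto module. This is an assumption-laden illustration: specKey is a hypothetical helper, and it reads the name and source from a live function object, whereas snap-shot works from the parsed source file.

```javascript
const crypto = require('crypto')

function sha256 (text) {
  return crypto.createHash('sha256').update(text).digest('hex')
}

// Prefer the callback's own name; fall back to a hash of its source text.
function specKey (callback) {
  return callback.name || sha256(callback.toString())
}

const named = specKey(function testSomething () {})
const hashed = specKey(() => {})

console.log(named) // "testSomething"
console.log(hashed.length) // 64 hex characters
```

One consequence of hashing the source text is that any edit to the callback body produces a new key, so the old snapshots are no longer found. A named function avoids that.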

I believe this is enough for the purpose of bookkeeping. The above approach only breaks for test frameworks that randomize the test order (like rocha or Jest when running the latest modified tests first).

Promises are either a pain or easy

Node.js has a problem. Asynchronous call stacks from promises are non-existent. Thus the best we can do is to make sure the outside function around the snapshot(value) call can be found. This requires a function just for the purpose of calling snapshot:

// does not work
it('promise to snapshot (does nothing!)', () => {
  // straight into snapshot comparison does not work
  return Promise.resolve(20)
    .then(snapshot)
})
it('function around snapshot call', () => {
  // works fine
  return Promise.resolve(20)
    .then(data => snapshot(data))
})

I hate explicit wrapper calls like data => snapshot(data) and prefer point-free code. Luckily we have an easy solution: pass the entire promise chain to snapshot and let it grab the resolved value.

it('snapshot can wrap promise', () => {
  return snapshot(Promise.resolve('promise resolved'))
})

We use the last solution in our asynchronous tests.
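
Why does wrapping the promise work? The snapshot function is still called synchronously from the test callback, so the call site is on the stack at that moment, and only the comparison is delayed. A rough sketch of that idea follows; snapshotValue, isThenable and the compare callback are hypothetical names, with compare standing in for the save-or-compare step.

```javascript
function isThenable (x) {
  return Boolean(x) && typeof x.then === 'function'
}

// `compare` stands in for the synchronous save-or-compare step.
function snapshotValue (value, compare) {
  // a real implementation would capture the caller location here,
  // while the test callback is still on the stack
  if (isThenable(value)) {
    // resolve first, then compare; the test must return this promise
    return value.then(resolved => snapshotValue(resolved, compare))
  }
  compare(value)
  return value
}

const seen = []
const record = value => seen.push(value)
snapshotValue(20, record) // plain value: compared immediately
const pending = snapshotValue(Promise.resolve('promise resolved'), record)
// pending is a promise; the test runner waits for it when it is returned
```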

Snapshot maintenance

Even carefully stored data snapshots become obsolete with time and will need to be updated. The simplest way to update all stored values is to run snap-shot with the environment variable UPDATE=1. Combined with Mocha's grep feature, it is also very simple to update just a particular test.

$ UPDATE=1 mocha -g "test name pattern" spec.js

In the future I plan to add snapshot pruning and other nice to have features.

Snapshot testing recipes

In this section I would like to give examples of snapshot testing an API using snap-shot. A lot of the examples rely on massaging the data using the Ramda library. I recommend watching this video playlist to learn about Ramda functions that are super useful for this type of data processing.

Saving the test name

If the tests are generated from data, as described above, we can still "remember" the real test name. I do this as the last step in the data transformation:

const tests = ['/a', '/b', '/c']
tests.forEach(name => {
  it(name, function testServer () {
    // return the promise so the test runner waits for the snapshot
    return snapshot(
      fetch(baseUrl + name)
        .then(r => r.json())
        .then(value => ({name, value}))
    )
  })
})
/*
  snapshot file will show the actual url / test name
  exports['testServer 1'] = {
    name: '/a',
    value: ...
  }
  exports['testServer 2'] = {
    name: '/b',
    value: ...
  }
  exports['testServer 3'] = {
    name: '/c',
    value: ...
  }
*/

The above code is simple enough to write without Ramda.

See the snapshot

As I refactored the unit tests to use snapshots, I hit my stride. First, I would take a test that checked many properties at once. It could use individual checks (better error messages) or an object equality assertion (shorter).

// parsing commit messages in
// https://github.com/bahmutov/simple-commit-message
// separate assertions
it('handles "break" type', () => {
  const message = 'break(log): new log format'
  const parsed = parse(message)
  snapshot(parsed)
  la(parsed.firstLine === message, 'first line', parsed)
  la(parsed.type === 'major', 'type', parsed)
  la(parsed.scope === 'log', 'scope', parsed)
  la(parsed.subject === 'new log format', 'subject', parsed)
})
// deep object equality test
it('handles "major" type', () => {
  const message = 'major(log): new log format'
  const parsed = parse(message)
  const expected = {
    firstLine: message,
    type: 'major',
    scope: 'log',
    subject: 'new log format'
  }
  la(R.equals(parsed, expected),
    'different', parsed, 'from', expected)
})

The above tests are very verbose. Yet they make it simple to see what the expected parsed value is. When we switch this code to snap-shot we want the same experience: we want to see the saved value before committing it to the repository. The best way to do this is to run the desired spec by itself with the environment variable SHOW=1 set.

it.only('handles "break" type', () => {
  snapshot(parse('break(log): new log format'))
})

saving snapshot "handles "break" type 1" for file src/valid-message-spec.js
{ firstLine: 'break(log): new log format',
  type: 'major',
  scope: 'log',
  subject: 'new log format' }
  ✓ handles "break" type (45ms)

Everything looks good, we do not need to revert the snapshot file and the test looks much cleaner. Similarly we can transform the second test into a snapshot test, see the saved result and commit the changes.

Handle undefined values

snap-shot does NOT save "undefined" value as a snapshot because it does not make a good expected value. What if a server sometimes returns "undefined" and that is ok?

Also, what if the actual value we want to store is a nested property inside a returned result? We have to handle the following responses

{
  people: {
    names: [...]
  }
}
// or
undefined

Luckily we can access a nested path, or provide a default value if any value along the path is "undefined", using the Ramda.pathOr function.

it('names', () => {
  return snapshot(
    fetchJson(url)
      .then(R.pathOr('N/A', ['people', 'names']))
  )
})

If the server returns nothing, the snapshot will be exports['names 1'] = 'N/A'. If the server returns a valid names list, the snapshot will be exports['names 1'] = [...].
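
To make the behavior concrete without installing Ramda, here is a simplified plain JavaScript stand-in for pathOr. It is a hand-written approximation for illustration only (it takes all three arguments at once and ignores currying), not Ramda's implementation.

```javascript
// Simplified stand-in for R.pathOr(defaultValue, path, obj)
function pathOr (defaultValue, path, obj) {
  let current = obj
  for (const key of path) {
    if (current === null || current === undefined) {
      return defaultValue
    }
    current = current[key]
  }
  return current === null || current === undefined ? defaultValue : current
}

const path = ['people', 'names']
console.log(pathOr('N/A', path, { people: { names: ['Joe', 'Mary'] } })) // [ 'Joe', 'Mary' ]
console.log(pathOr('N/A', path, undefined)) // N/A
console.log(pathOr('N/A', path, { people: {} })) // N/A
```

Either way the snapshot receives a defined value, so there is always something to save and compare.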

Saving invariant snapshots

Imagine we fetch a list of people, which returns something like

{
  "people": [
    {
      "name": "Joe",
      "age": 21
    },
    {
      "name": "Mary",
      "age": 20
    },
    {
      "name": "Adam",
      "age": 25
    }
  ]
}

The list of people might change, so we do not want to save the list directly. Instead we want to save transformed data that is invariant to the actual numbers and names. The snapshot will allow us to test if the server returns a valid list of people, without knowing what the actual values should be.

Each person in the list should have a name, which should be a non-empty string. Each person should have an age, which should be a positive number. If the server returns a list with a person record that does not pass these predicates, then something is wrong.

Let us transform the list into an "invariant" snapshot (we assume the returned list will always have the same number of items).

const R = require('ramda')
// returns a boolean, so the snapshot stores true / false and not the name itself
const isNonEmptyString = s => typeof s === 'string' && s.length > 0
const person = {
  name: isNonEmptyString,
  age: R.lte(1) // age should be >= 1
}
it('returns people', () => {
  return snapshot(
    fetchJson('/people')
      .then(R.prop('people')) // get "people" from returned object
      .then(R.project(['name', 'age'])) // pick only "name" and "age" from each
      .then(R.map(R.evolve(person)))
  )
})

The R.map(R.evolve(person)) call goes through the returned list, transforming each property listed in the const person = {...} object. Each listed property is passed through its function and the result is placed into the returned object. For a single object it produces

R.evolve(person)({
  "name": "Joe",
  "age": 21
})
//> {name: true, age: true}

The list we store in the snapshot thus ensures that all returned objects have a valid name and age.

snapshot.js
exports['returns people 1'] = [
  {name: true, age: true},
  {name: true, age: true},
  {name: true, age: true}
]
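
It helps to see what a bad record does to that list. The sketch below uses a simplified, hand-written stand-in for R.evolve (flat objects only, no currying) and an invented record with an empty name. Neither comes from the real server or from Ramda's source.

```javascript
// Simplified stand-in for R.evolve: apply each transformation to the matching property.
function evolve (transformations, obj) {
  const result = Object.assign({}, obj)
  Object.keys(transformations).forEach(key => {
    if (key in obj) {
      result[key] = transformations[key](obj[key])
    }
  })
  return result
}

const person = {
  name: s => typeof s === 'string' && s.length > 0,
  age: n => typeof n === 'number' && n >= 1
}

const people = [
  { name: 'Joe', age: 21 },
  { name: '', age: 20 } // hypothetical bad record
]

const invariant = people.map(p => evolve(person, p))
console.log(invariant)
// [ { name: true, age: true }, { name: false, age: true } ]
```

The second item no longer equals the saved {name: true, age: true}, so the snapshot comparison fails and points at the exact record and property that broke.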

Assemble the pipeline separately

The above code is a little hard to read, because each processing step happens as a promise callback.

return snapshot(
  fetchJson('/people')
    .then(R.prop('people')) // get "people" from returned object
    .then(R.project(['name', 'age'])) // pick only "name" and "age" from each
    .then(R.map(R.evolve(person)))
)

Luckily, we can assemble the single function from each step separately and use it as a single .then() callback. Since each step happens from top to bottom, I prefer to use Ramda.pipe function.

const toInvariant = R.pipe(
  R.prop('people'),
  R.project(['name', 'age']),
  R.map(R.evolve(person))
)
return snapshot(
  fetchJson('/people')
    .then(toInvariant)
)

Short and clear, I hope. I even recommend unit testing the above toInvariant function to make sure it behaves as expected. You can use comment-value to understand the behavior of each composed function along the pipeline.

Conclusions

This is my third tool and blog post in a row that uses Abstract Syntax Trees to achieve something cool. The other ones were

Overall I must say getting it right was not an easy task and it might not prove useful after all. We will wait and see. The way the testing code is written in the real world might prevent snap-shot from finding the correct spec function. If that happens it will fail to save and load the proper snapshot value. Yet it will succeed if the testing code follows my favorite principle:

Testing code should be simple

The production code should be simple and elegant, but that is hard to achieve. The testing code on the other hand MUST be as simple as possible. Set up data, call an action, validate the result. That should be it, otherwise the tests become a drag and impossible to maintain. When tests are simple, snap-shot should work no matter what testing framework is used.