Even old things can be improved. Consider a wood axe. Simple, right? Humans have been chopping trees for thousands of years, yet there was still room for improvement. One inventor has improved the axe to use angular momentum to chop wood more easily and quickly. Notice the very unusual shape of the blade: it is meant to cut and then rotate to push the two sides of the split apart.
What does this teach us? There is room to improve everything, and JavaScript test runners especially have a lot of room to grow! Consider a simple runner like QUnit or Mocha. Each runner does two things: it collects all of the user's test functions, then executes them. Yet there is so much more we can do to make test writing and running more pleasant and productive.
I will take a look at five test runners and the new features they bring: Ava, Jest, Rocha, Focha and Locha. Ava and Jest are well known (and I love them both), while the latter three are my own wrappers around my go-to test runner Mocha. Each test runner has something interesting to offer, and I hope that through cross-pollination of ideas the testing experience in JavaScript will keep improving.
Ava
Ava came onto the scene suddenly and with a splash. It introduced ES6 code transpiling by default, allowing everyone to unit test modern JavaScript code. It also introduced a nice feature that was unavailable in other test runners (as far as I know).
Each spec file runs in its own Ava test runner instance.
This means that whatever one spec file does cannot affect the other spec files. There is no shared memory or module cache: each test runner instance is spawned as a separate process, isolating the tests in one file from the tests in another.
The isolation helps with finding hidden dependencies among tests, and it also allows the tests to run in parallel (which really means "faster"). And it almost happened ;) But the requirement to transpile everything in each subprocess during the Node 0.12 and 4 days meant the parallel speed advantages were kind of moot for small "trial" projects.
Luckily, today with Node 6 and 8 the transpiling is almost never necessary, and Ava test runs are super fast.
Jest
Snapshot testing
The Jest test runner has introduced a bunch of features that I love. In particular, I am awed by its snapshot testing feature. No longer do I have to write many assertions to compare the entire result with the expected value; I don't even have to compute the expected value.
Instead I just need to say that the computation should match the snapshot.
For larger things like DOM component rendering, creating the expected value by hand is almost impossible!
import React from 'react';
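// A minimal sketch of a typical Jest snapshot test; the react-test-renderer
// usage and the Link component below are illustrative assumptions, not part
// of the original example.
import renderer from 'react-test-renderer';
import Link from './Link';

it('renders a link correctly', () => {
  const tree = renderer
    .create(<Link page="https://example.com">Example</Link>)
    .toJSON();
  // compare the rendered tree against the stored snapshot
  expect(tree).toMatchSnapshot();
});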
The snapshot assertion expect(tree).toMatchSnapshot() will try to load the previous value from a snapshot file. If Jest cannot find the snapshot file, that means the test has never run before. Jest will save whatever the computed tree object is, and you should commit the snapshot file into the code repository, just like a regular test fixture file. It is a plain JavaScript file after all.
// snapshot file
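// A sketch of what the saved snapshot file might contain for the test above;
// Jest stores the serialized tree under the test's name.
exports[`renders a link correctly 1`] = `
<a
  href="https://example.com"
>
  Example
</a>
`;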
Next time it runs locally or on CI, if the tree has been rendered differently, Jest can show you a beautiful error message.
I loved snapshot testing so much that I really wanted it inside my Mocha tests. While a test runner like Ava simply grabbed the Jest snapshot module (see what I mean about test ecosystem tools borrowing ideas from each other?), I had a strong case of "Not Invented Here" syndrome. So I wrote my own snap-shot library that can work with pretty much any test framework as a zero-configuration add-on.
const snapshot = require('snap-shot')
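// A minimal usage sketch; the add function is just for illustration
const add = (a, b) => a + b

it('adds two numbers', () => {
  // the first run saves the value, later runs compare against it
  snapshot(add(2, 3))
})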
snap-shot works without integrating with the test runner, and thus it had to overcome a major problem: when a test calls snapshot(value), how do you know the test file and the test name so you can look up the previously saved snapshot? snap-shot works by inspecting the stack trace when it is called, finding the spec file, and then inspecting its AST to find the it(name, cb) statement. You can find the details in this blog post and in these slides.
This works 90% of the time, but it has problems finding the right test in heavily transpiled JavaScript code or in other languages like CoffeeScript and TypeScript. I spent some time trying to solve this problem, but then decided to limit myself to BDD frameworks (like Jest, Mocha, etc.). These test runners make a couple of standard methods available to the test code, like beforeEach and afterEach:
beforeEach(() => {
  console.log('runs before each test')
})
Result
runs before each test
By relying on global functions like beforeEach, I could write a snapshot utility that works in any language, because it finds its "owner" test at runtime rather than by static source inspection. So I made snap-shot-it. It registers a beforeEach callback to grab the current test about to be executed. If the test calls snapshot, then snap-shot-it can find the test's name, spec file, etc. without any hunting.
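Conceptually the trick looks something like this - a minimal sketch of the idea (not the actual snap-shot-it source), relying on Mocha exposing the current test on the hook's context:

let currentTest

// grab the test that is about to run; Mocha sets this.currentTest inside hooks
beforeEach(function () {
  currentTest = this.currentTest
})

function snapshot (value) {
  // no stack traces or AST parsing needed: the "owner" test is already known
  const specFile = currentTest.file
  const testName = currentTest.fullTitle()
  console.log('comparing snapshot for "%s" from %s', testName, specFile)
  // ...load or save the stored value and compare it with `value`
}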
Beautiful, but why spend so much time writing a utility that already exists? Because I want a simple one-page module that does not rely on a particular framework. I also want to learn and experiment, and snap-shot and snap-shot-it have produced another cool collection of tools. By factoring out saving, loading and comparing snapshot values into snap-shot-core, I have been able to implement additional features.
Have data you want to snapshot, but the actual values change while only the shape of the data stays the same? Example: the top selling item returned by the API - the name and SKU numbers change, but the object must always have a name and a SKU. No problem - schema-shot to the rescue. Have a list that keeps growing, so the snapshot should only check a subset? No problem - subset-shot has you covered. Have a function that produces a lot of data and want to use that as a snapshot? A perfect opportunity to use a data-driven snapshot:
// checks if n is prime
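// A sketch of how such a test might look; the isPrime implementation and the
// exact snapshot(fn, ...inputs) call are my assumptions about the API
const isPrime = n => {
  if (n < 2) return false
  for (let k = 2; k * k <= n; k += 1) {
    if (n % k === 0) return false
  }
  return true
}

it('tests prime', () => {
  // snap-shot records the function's output for every given input
  snapshot(isPrime, 1, 2, 3, 4, 5, 6, 7, 8, 9)
})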
This produces a snapshot that contains:
exports['tests prime 1'] = {
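  // roughly: the tested function's name plus one recorded input / output
  // pair per argument (the exact field names here are my guess)
  name: 'isPrime',
  behavior: [
    { given: 1, expect: false },
    { given: 2, expect: true }
    // ...and so on for the remaining inputs
  ]
}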
In summary, snapshot testing is really useful, and now there is a variety of snapshot choices; I described the alternatives.
Code coverage for faster testing
In addition, Jest has another cool feature. It collects code coverage by default (yay, zero config!) and thus is able to track which test files cover which source files. When a test or a source file changes, Jest can rerun the affected test files, which means a super fast feedback loop.
Jest can run tests for changed files by using collected code coverage
Full proud disclosure: I wrote untested five years ago (January 2013). untested orders your unit tests by code coverage so you can test faster; it even supports browser-based tests through lasso. It is kind of cool to see ideas that were prototyped long ago now used in production to test millions of files.
Update: as this Twitter thread notes, I was mistaken in thinking Jest uses code coverage to track test dependencies. Instead, Jest uses file-to-file dependencies. If the test file "a-spec.js" loads "a.js", then when the file "a.js" changes, the test file "a-spec.js" will rerun all of its tests. On the other hand, a test runner like Wallaby.js actually does track code coverage for each test and can accurately rerun only the affected individual tests.
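You can see this file-based behavior from the command line: Jest ships with a flag that only runs the test files related to files changed in the current repository (it assumes the project is under Git or Mercurial).

jest --onlyChanged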
Rocha
Each unit test should be independent of the other unit tests. Easy to say, right? But it is so easy for one test to leave behind changed global state that affects the result of another test. In this file, one of the tests changes the value foo, making the third test pass:
describe('example', function () {
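  // a sketch of such a spec: one test leaks `foo = 42` and a later test
  // silently depends on it (the exact test bodies are my reconstruction)
  let foo

  it('first test', function () {
    // does not touch foo
  })

  it('second test', function () {
    foo = 42
  })

  it('third test', function () {
    // passes only because the second test has already run
    if (foo !== 42) {
      throw new Error('expected foo to be 42')
    }
  })
})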
Yet if we run the third test by itself, it stops working, because nothing sets foo = 42 before it runs. Such flaky tests are hard to debug, because isolating the test literally breaks it or removes the source of the flake.
This is why I wrote rocha - a "random" Mocha test runner. Before running the tests, Rocha randomly changes the order of the unit tests, hopefully breaking the "happy test order" and flushing out inter-test dependencies. The spec file above shows the difference between running it with Mocha and with Rocha.
Running tests using Mocha
> mocha spec/tricky-spec.js
Rocha shuffles your tests to flush out inter-test dependencies
Running tests using Rocha
> rocha spec/tricky-spec.js
Perfect, we caught the flaky test. Maybe not right away, maybe after a few runs, since each run uses a different reshuffle. But what happens when we try to investigate the problem - will it disappear because the tests get shuffled again? No. When tests fail, Rocha saves the failing test order, and on the next run it uses the same order again. A developer can rerun the "bad" test order until the problem is discovered and fixed.
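Conceptually the core of Rocha is simple; here is a minimal sketch of the idea (not its actual source): shuffle the collected tests with Fisher-Yates, and persist the order whenever a run fails so it can be replayed.

// shuffle the collected tests in place (Fisher-Yates)
function shuffle (tests) {
  for (let i = tests.length - 1; i > 0; i -= 1) {
    const j = Math.floor(Math.random() * (i + 1))
    ;[tests[i], tests[j]] = [tests[j], tests[i]]
  }
  return tests
}

// if the run fails, save the titles in the order they ran so the next run
// can repeat exactly the same sequence
const saveOrder = tests => {
  require('fs').writeFileSync('failed-order.json', JSON.stringify(tests.map(t => t.title)))
}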
Focha
Imagine you have 100 tests. If each test runs for 10 seconds, that's 1,000 seconds, which is almost 17 minutes. That's a long time to wait to find out if all tests are passing. What usually happens is:
- a few tests break on CI
- you push a fix
and now you wait for CI to finish running through all 100 tests just to find out if test #66, which was failing before, starts passing again. Wouldn't it be more useful to run the previously failing tests first?
Focha runs tests that previously failed first so you find out if you have fixed them sooner
Similarly to rocha, Focha is a wrapper around Mocha, one that concentrates on collecting failing tests (the "F" in "Focha"). When all tests finish, Focha saves the failing tests (if any have failed) in a JSON file or sends them to a REST API endpoint.
Next time Focha runs, it loads and runs just the failing tests. Thus you find out whether test #66 has been fixed in 10 seconds rather than in 17 minutes. If the previously failing tests now pass, you can call focha --all to run the full suite again.
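One possible way to wire this into package.json scripts (the script names and spec file glob here are just an example, not something Focha requires):

{
  "scripts": {
    "test": "focha spec/*.js",
    "test-all": "focha --all spec/*.js"
  }
}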
Useful!
Locha
Finally, there is Locha - the "L"oud Mocha. Imagine a test that exercises a complex piece of code. That code probably has a lot of logging statements. I love using the debug module, so I can enable log messages through an environment variable:
DEBUG=my-module npm test
Being able to easily turn on verbose logging leads to a dilemma: do you enable all logging in CI by default, just in case a test fails? That's not good - each test can generate anywhere from 10 to 1,000 log messages! In our testing at Cypress, the CI test log output was overwhelming the CircleCI and TravisCI UI, and we could only download the raw text file if we wanted to see it! But if we disabled the log messages and a test failed, we had absolutely no idea what went wrong, which is also not good.
Locha gives you a happy compromise. It runs all your tests with minimal default logging, but if any test fails, it reruns only the failing tests, with extra environment variables set. Take a look at this test file.
const debug = require('debug')('failing')
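// a sketch of the rest of such a spec (the test names and messages are my
// assumptions): two tests, "A" and "B", where "B" fails; the debug() calls
// stay silent unless DEBUG=failing is set
describe('demo', () => {
  it('A', () => {
    console.log('test A')
    debug('verbose details from test A')
  })

  it('B', () => {
    console.log('test B')
    debug('verbose details from test B')
    throw new Error('test B fails')
  })
})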
It has a lot of debug statements, but they will only output messages to the console if we run the tests with the DEBUG=failing npm test command. By default, the tests are pretty quiet. One of the tests is failing. Here is the output from the Locha test runner:
$ npm run demo
Do you see the two test runs? The first one executed two unit tests, and only the console.log statements were visible. During the first run, test "B" failed, and Locha executed just this test in the second round. During this round, Locha added the environment variables we passed via the CLI flag --env DEBUG:failing to the mix. Thus the second round is pretty "loud" and allows us to debug the failure, or at least get an idea why it happens.
Locha keeps passing tests' output to a minimum and makes failing tests very verbose
Final thoughts
Making a useful testing tool is tricky, but there is definitely room for improvement. The entire testing and quality assurance process in JavaScript is still a chore and a hindrance. We must do better and get more useful information from our tests, faster, to avoid introducing bugs into the code.