I love code coverage and use tools like istanbul to get test coverage information in a lot
of my projects. Recently I started using nyc, which makes it easy to collect coverage information
for JavaScript run by any external program, such as the code covered by the mocha src/*-spec.js
command. To get code coverage we just need to prepend nyc to the command and run
nyc mocha src/*-spec.js instead.
I have even written a code coverage proxy
that can be used to instrument live website code. Website
instrumentation shows which parts of the code have been covered by the users' actions. Sometimes
it is fun to sit back and watch which parts of the web application's code have been executed.
Yet, just like nyc, the proxy sends code coverage information as one large object.
If we send coverage information more often,
we incur a large performance penalty - comparing and diffing large objects can be costly.
It would be cool to flip the coverage object around and replace the "pull" mechanism, which periodically
compares the object with its previous copy. It would be better to use the "push" approach,
where the coverage object emits events when new statements have been covered.
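To make the cost of the "pull" approach concrete, here is a minimal sketch of what periodic diffing could look like. The helper names `snapshot` and `diffCoverage` are hypothetical and are not part of nyc or istanbul; the coverage shape is reduced to the statement counters only.

```javascript
// Hypothetical "pull" helpers: copy the whole coverage object, then later
// walk every counter again to find the ones that changed.
function snapshot (coverage) {
  return JSON.parse(JSON.stringify(coverage))
}

// Returns a list of { filename, statement } for counters that increased.
function diffCoverage (previous, current) {
  const changes = []
  Object.keys(current).forEach(function (filename) {
    const before = (previous[filename] || { s: {} }).s
    const after = current[filename].s
    Object.keys(after).forEach(function (id) {
      if (after[id] > (before[id] || 0)) {
        changes.push({ filename: filename, statement: id })
      }
    })
  })
  return changes
}

// Simplified coverage object with two statements in one file.
const coverage = { 'foo.js': { s: { 1: 0, 2: 0 } } }
const previous = snapshot(coverage)
coverage['foo.js'].s['2'] += 1
const changes = diffCoverage(previous, coverage)
console.log(changes)
```

Both the copy and the comparison touch every counter in every file, even when only one statement has changed, which is why running this often gets expensive.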
Without Object.observe this seems impossible to achieve - yet there is a neat trick we can
play with standard ES5 objects to achieve this.
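The trick relies on ES5 accessor properties defined with `Object.defineProperty`. The sketch below is only an illustration of the idea; the helper name `observeProperty` is an assumption made for this example.

```javascript
// Run onSet every time target[name] is assigned, using only ES5 accessors.
function observeProperty (target, name, onSet) {
  let value = target[name]
  Object.defineProperty(target, name, {
    enumerable: true,
    configurable: true,
    get: function () { return value },
    set: function (newValue) {
      value = newValue
      onSet(newValue)
    }
  })
}

const counters = { hits: 0 }
const seen = []
observeProperty(counters, 'hits', function (v) { seen.push(v) })

// Ordinary increments now go through the setter.
counters.hits++
counters.hits += 1
console.log(counters.hits, seen)
```

The code that increments the counter does not change at all; it still performs a plain property update, which is what makes the approach usable with instrumented code we do not control.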
I did not want to write and maintain another code coverage tool, so I picked nyc as the
starting point. It allows preloading custom modules before running, using the -r <module name>
syntax.
Thus whatever I wrote needed to work at the preloading step, before any code coverage has been
collected. It turns out this is exactly the right place to run the code that prepares everything!
Code instrumentation
Typical code coverage in JavaScript works like this:
- when the source is loaded, it is instrumented, for example inside a Node require hook using code transformation. For example, istanbul hook will instrument every file with name matching pattern
var istanbul = require('istanbul')
- the instrumented code has a global object, called __coverage__, with an object entry for each individual source file
- each statement, function and branch has a counter inside the object entry
- extra statements in the instrumented source update the coverage counters, like __coverage__[__filename].s['42']++ for example, to tell that statement '42' has been executed one more time.
// original file
// instrumented file
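As a rough illustration of the steps above, here is a hand-instrumented function. This is a sketch only: real instrumenters generate this code automatically and use their own layout and naming, and the file path and the `addInstrumented` name are hypothetical.

```javascript
// Original source, for comparison:
//   function add (a, b) { return a + b }

const filename = '/project/src/add.js' // hypothetical path
global.__coverage__ = global.__coverage__ || {}
const cov = global.__coverage__[filename] = {
  s: { 1: 0, 2: 0 }, // statement counters
  f: { 1: 0 }        // function counters
}

cov.s['1']++ // statement 1: the function declaration was reached
function addInstrumented (a, b) {
  cov.f['1']++ // function 1 was called
  cov.s['2']++ // statement 2: the return statement ran
  return a + b
}

addInstrumented(2, 3)
addInstrumented(1, 1)
console.log(cov.s)
```

Every counter update is a plain property assignment on a plain object, which is the hook the rest of this post relies on.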
After the instrumented file runs and the process exits, the outside process generates JSON or
HTML reports from the __coverage__ object.
It is all about being first
Let us say nyc loads our module first, before any user code runs. That means there is no
global.__coverage__ object yet. Thus we can create a placeholder property on the global
object; when nyc actually sets the property, our "setter" function will run, allowing us
to do additional processing. For example, we will repeat the same trick and set up
placeholder properties for each input source file, so that we are notified when its coverage entry is added.
I will use the name liverage for our code, standing for "live code coverage" (and not "live rage"
as some might guess). You can find the finished project at
bahmutov/liverage
// liverage module
To the outside world, even to the nyc module that runs after this code, setting the
global.__coverage__ variable seems to work just like before. Yet the first time the property
is set (the global object itself is only set once), we get to run our own "setter" function.
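A minimal sketch of such a preloaded module could look like the following. It is not the actual liverage source; the variable names are made up for the example, and the last two statements stand in for what the coverage tool would do later.

```javascript
// Runs first, before global.__coverage__ exists.
let realCoverage
let timesSet = 0

Object.defineProperty(global, '__coverage__', {
  configurable: true,
  enumerable: true,
  get: function () { return realCoverage },
  set: function (value) {
    // The instrumenter has just created the coverage object;
    // this is our chance to do additional processing on it.
    realCoverage = value
    timesSet += 1
  }
})

// Later: instrumented code initializes the object the usual way
// (stand-in for the coverage tool, not real nyc code).
global.__coverage__ = global.__coverage__ || {}
global.__coverage__['/project/src/foo.js'] = { s: { 1: 0 } }

console.log(timesSet, Object.keys(global.__coverage__))
```

Note that adding the file entry does not trigger the setter again: only assignments to `global.__coverage__` itself do, which is why the same trick has to be repeated one level down.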
$ nyc -r liverage mocha *-spec.js
Be ready for any file
The coverage object will have information for many files; each file adds an entry whenever it gets loaded
{
You can control which files to cover and which ones to exclude using
nyc options. We need to add placeholder properties
for all file entries to play the same trick as we have played before with the coverage object
itself. The problem is, we knew the expected property name __coverage__ before, but in this
case we do not know what property names will be added.
Inside the liverage code we can either parse the command line and the options from package.json
and try to match the nyc logic, or we can brute force the problem. When running the tool,
we are probably only interested in the JavaScript source files in the current folder, excluding
the node_modules folder. Thus I grab all source files and add placeholder entries to the coverage
object right away.
function findSourceFiles () {
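The brute-force approach could be sketched as below. This is an illustrative version under stated assumptions, not the project's own implementation: `findSourceFiles` takes a folder argument here, `addPlaceholders` and its callback are hypothetical, and the file paths in the demo are invented.

```javascript
const fs = require('fs')
const path = require('path')

// Recursively collect paths of .js files under folder, skipping node_modules.
function findSourceFiles (folder) {
  let found = []
  fs.readdirSync(folder).forEach(function (name) {
    if (name === 'node_modules') { return }
    const full = path.join(folder, name)
    if (fs.statSync(full).isDirectory()) {
      found = found.concat(findSourceFiles(full))
    } else if (path.extname(name) === '.js') {
      found.push(full)
    }
  })
  return found
}

// Define a placeholder accessor for every file we expect to see.
function addPlaceholders (coverage, filenames, onFileCoverage) {
  filenames.forEach(function (filename) {
    let fileCoverage
    Object.defineProperty(coverage, filename, {
      configurable: true,
      enumerable: true,
      get: function () { return fileCoverage },
      set: function (value) {
        fileCoverage = value
        onFileCoverage(filename, value)
      }
    })
  })
}

// Demo with a fixed list; in practice the list would come from
// something like findSourceFiles(process.cwd()).
const coverage = {}
const loaded = []
addPlaceholders(coverage, ['/project/src/foo.js', '/project/src/bar.js'],
  function (filename) { loaded.push(filename) })

coverage['/project/src/foo.js'] = { s: { 1: 0 } } // only foo.js gets loaded
console.log(loaded, coverage['/project/src/bar.js'])
```

In the demo, the entry for bar.js exists as a key but its value stays undefined, because nothing ever assigned it.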
At the end of the run, the coverage object will still have empty values for files
that were not part of the run. For example, the "/Users/home/me/project/src/bar.js" entry was a
placeholder that never got set by nyc
{
This is not a problem for nyc - it is robust enough to skip the empty entries when computing
the final coverage reports.
Be ready to go live
Our code knows when the global coverage object is created, and when each file coverage object is added to it.
Finally, we are at the last step, where we play the same "placeholder" trick for the third time.
For each file loaded, nyc will set the coverage information object. Whenever it is set,
our "setter" method runs, thus we have synchronous access to the file coverage object after
it has been created but before the execution begins. The coverage object has entries for
each statement - this is the most important coverage information, and the part we want to convert to an event
emitter. The inside of the instrumented file looks something like this
var fileCoverage = {
When this file coverage object is set, we need to replace the individual primitives inside the s
object with "smarter setter" functions that will notify our code that a particular statement
has a new value. This is simple to do; in this case we do not even have to anticipate the future
property names, since the full object is already there for us.
The property conversion has been factored out into a self-contained function that replaces
the existing properties inside the fileCoverage.s object with "set" functions.
function liveStatementCoverage (cb, filename, fileCoverage) {
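A function with that signature could be sketched as follows. The body is a reconstruction of the idea rather than the published code, and the arguments passed to `cb` (filename, statement id, new count) are an assumption made for this example.

```javascript
// Replace each counter in fileCoverage.s with a getter/setter pair that
// reports every update through cb(filename, statementId, newCount).
function liveStatementCoverage (cb, filename, fileCoverage) {
  const s = fileCoverage.s
  Object.keys(s).forEach(function (id) {
    let count = s[id]
    Object.defineProperty(s, id, {
      configurable: true,
      enumerable: true, // keep the counters visible to JSON.stringify
      get: function () { return count },
      set: function (value) {
        count = value
        cb(filename, id, value)
      }
    })
  })
}

const fileCoverage = { s: { 1: 0, 2: 0, 3: 0 } }
const events = []
liveStatementCoverage(function (filename, id, count) {
  events.push(filename + ':' + id + '=' + count)
}, 'foo.js', fileCoverage)

// Instrumented code would perform increments like these.
fileCoverage.s['1']++
fileCoverage.s['3']++
fileCoverage.s['3']++
console.log(events)
console.log(JSON.stringify(fileCoverage.s))
```

Keeping the accessors enumerable means the object still serializes with its current counts, so a report generator reading the counters at the end of the run should see the same values as before.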
From our "liverage" module we simply call the above function whenever we get a new file coverage object, passing our event emitter as a callback.
// liverage
Every covered statement will be broadcast to the connected clients, thus every client can observe the server's code being covered in real time.
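The broadcasting step can be approximated with Node's built-in EventEmitter. In this sketch an in-process listener stands in for a connected client; the event name 'statement', the payload shape, and the `onStatementCovered` helper are all assumptions, and a real setup would forward the event over a network connection instead.

```javascript
const EventEmitter = require('events')

const emitter = new EventEmitter()

// Callback handed to the statement-level setters (hypothetical signature).
function onStatementCovered (filename, id, count) {
  emitter.emit('statement', { filename: filename, id: id, count: count })
}

// A listener standing in for a connected client.
const received = []
emitter.on('statement', function (event) { received.push(event) })

// Minimal setter wiring for a single counter, to show the flow end to end.
const s = {}
let count = 0
Object.defineProperty(s, '1', {
  enumerable: true,
  get: function () { return count },
  set: function (value) {
    count = value
    onStatementCovered('foo.js', '1', value)
  }
})

s['1']++ // what instrumented code does when statement 1 runs
console.log(received)
```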
Above: an example client implemented using CycleJs, code in bahmutov/liverage-client
Conclusion
Even without Object.observe we can get real-time object value updates if we can predict the future property names. In this case we had to forecast the property names three times in order to preemptively define properties with our "setter" function. Yet in the end we got a useful "push" system for observing code coverage events.