May 4 2016

Turning code coverage into live stream

A neat trick for making object property updates into a live event stream.

I love code coverage and use tools like istanbul to collect test coverage information in a lot of my projects. Recently I started using nyc, which makes it easy to collect coverage information for JavaScript run by any external command. For example, to measure the code covered by the mocha src/*-spec.js command, we just prepend nyc and run nyc mocha src/*-spec.js instead.

I have even written a code coverage proxy that can be used to instrument live website code. Website instrumentation shows which parts of the code have been covered by the users' actions. Sometimes it is fun to sit back and watch which parts of the web application's code have been executed. Yet, just like nyc, the proxy sends code coverage information as one large object. If we send coverage information more often, we incur a large performance penalty, because comparing and diffing large objects can be costly.
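
To make the cost of the "pull" approach concrete, here is a minimal sketch of what periodic diffing could look like. This is not the proxy's actual code; the snapshot and diffStatements helpers and the file path are made up for illustration. Note that every poll has to copy and walk every counter in every file, even when nothing changed.

```javascript
// Hypothetical "pull" approach: keep a copy of the coverage object
// and compare every statement counter against it on each poll.
function snapshot (coverage) {
  // a JSON round trip is enough to deep copy plain counters
  return JSON.parse(JSON.stringify(coverage))
}

function diffStatements (previous, current) {
  const changes = []
  Object.keys(current).forEach((filename) => {
    const before = (previous[filename] || { s: {} }).s
    const after = current[filename].s
    Object.keys(after).forEach((id) => {
      if (after[id] !== before[id]) {
        changes.push({ filename: filename, s: id, counter: after[id] })
      }
    })
  })
  return changes
}

// made-up coverage data for a single file
const coverage = { '/project/foo.js': { s: { '1': 1, '2': 0 } } }
const previous = snapshot(coverage)
coverage['/project/foo.js'].s['2']++ // statement 2 runs once
const changes = diffStatements(previous, coverage)
console.log(changes)
```

Only one counter changed, yet the diff had to visit all of them, and a real project has thousands.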

It would be cool to flip the coverage object and remove the "pull" mechanism that periodically compares the object with its previous copy. It would be better to use the "push" approach, where the coverage object emits events when new statements have been covered. Without Object.observe this seems impossible to achieve, yet there is a neat trick we can play with standard ES5 objects to achieve this.

I did not want to write and maintain another code coverage tool, thus I picked nyc as the starting point. It allows preloading custom modules before the command runs, using the -r <module name> syntax. Thus whatever I wrote needed to work at the preloading step, before any code coverage has been collected. It turns out this is exactly the right place to set everything up!

Code instrumentation

Typical code coverage in JavaScript works like this:

when the source is loaded, it is instrumented, for example inside a Node require hook that transforms the code. For example, the istanbul hook will instrument every file whose name matches a pattern

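var istanbul = require('istanbul')
var hook = istanbul.hook
var Instrumenter = istanbul.Instrumenter
var instrumenter = new Instrumenter()
// transformer: receives (code, filename), returns the instrumented source
var cover = instrumenter.instrumentSync.bind(instrumenter)
// assumed matcher: instrument every loaded file that ends with .js
var isJavaScriptFilename = function (filename) { return /\.js$/.test(filename) }
hook.hookRequire(isJavaScriptFilename, cover)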

the instrumented code has a global object, called __coverage__, with an object entry for each individual source file
each statement, function and branch has a counter inside the object entry
extra statements in the instrumented source update the coverage counters; for example, __coverage__[__filename].s['42']++ records that statement '42' has been executed one more time.

// original file
function add(a, b) {
  return a + b
}

// instrumented file
global.__coverage__ = {}
global.__coverage__[__filename] = {
  s: {
    '1': 1, // function declaration - executed once right away
    '2': 0 // inside the function "return a + b"
  },
  statementMap: {...}, // maps statements into lines and columns
  f: {
    '1': 0 // single function, has not been executed yet
  }
}
function add(a, b) {
  global.__coverage__[__filename].s['2']++
  return a + b
}

After the instrumented file runs and the process exits, the outside process generates JSON or HTML reports from the __coverage__ object.
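
To show how little is needed to turn the counters into a report, here is a minimal sketch that summarizes the statement counters of a single file entry. It is not how istanbul or nyc compute their reports; the statementCoverage function name and the sample data are assumptions for illustration only.

```javascript
// Hypothetical helper: summarize statement counters for one file entry.
function statementCoverage (fileCoverage) {
  const counters = Object.keys(fileCoverage.s).map((id) => fileCoverage.s[id])
  const covered = counters.filter((n) => n > 0).length
  return {
    total: counters.length,
    covered: covered,
    // a file without statements is treated as fully covered
    percent: counters.length ? (100 * covered) / counters.length : 100
  }
}

// same shape as the instrumented "add" example: the declaration ran,
// the function body did not
const sample = { s: { '1': 1, '2': 0 }, f: { '1': 0 } }
const summary = statementCoverage(sample)
console.log(summary) // { total: 2, covered: 1, percent: 50 }
```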

It is all about being first

Let us say nyc loads our module first, before any user code runs. That means there is no global.__coverage__ object yet. Thus we can create a placeholder property on the global object; when the instrumented code actually sets the property, our "setter" function will run, allowing us to do additional processing. For example, we will repeat the same trick and set up placeholder properties for each input source file, so that we are notified when its coverage entry is added.

I will use the name liverage for our code, standing for "live code coverage" (and not "live rage" as some might guess). You can find the finished project at bahmutov/liverage

// liverage module
var cover
Object.defineProperty(global, '__coverage__', {
  configurable: true,
  enumerable: true,
  get: () => {
    return cover
  },
  set: (value) => {
    console.log('setting new coverage object')
    cover = value
  }
})

To the outside world, even to the nyc module that runs after this code, setting the global.__coverage__ variable seems to work just like before. Yet the first time the property is set (and the global coverage object is only set once), we get to run our own "setter" function.
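
Here is a small self-contained check of the claim that the setter fires only once. It is a sketch under two assumptions: fakeGlobal stands in for Node's global object so nothing real is touched, and loadInstrumentedFile imitates an instrumented file that creates the coverage object only when it is missing. Both names are made up for this example.

```javascript
// `fakeGlobal` is a stand-in for Node's `global`
const fakeGlobal = {}
let cover
let setterCalls = 0
Object.defineProperty(fakeGlobal, '__coverage__', {
  configurable: true,
  enumerable: true,
  get: () => cover,
  set: (value) => {
    setterCalls += 1
    cover = value
  }
})

// hypothetical stand-in for the preamble of an instrumented file
function loadInstrumentedFile (filename) {
  if (!fakeGlobal.__coverage__) {
    fakeGlobal.__coverage__ = {} // only the first file gets here
  }
  // adding a file entry mutates the object, it does not replace it
  fakeGlobal.__coverage__[filename] = { s: { '1': 0 } }
}

loadInstrumentedFile('/project/foo.js')
loadInstrumentedFile('/project/bar.js')
console.log(setterCalls) // 1
console.log(Object.keys(fakeGlobal.__coverage__))
```

Reads through the getter return the very same object, so the instrumented code cannot tell the difference.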

$ nyc -r liverage mocha *-spec.js
setting new coverage object
(mocha output)

Be ready for any file

The coverage object will have information for many files; each file adds its own entry when it gets loaded:

{
  "/Users/home/me/project/foo.js": { ... },
  "/Users/home/me/project/src/bar.js": { ... },
  ...
}

You can definitely control which files to cover, and which ones to exclude using nyc options. We need to add placeholder properties for all file entries to play the same trick as we have played before with the coverage object itself. The problem is, we knew the expected property name __coverage__ before, but in this case we do not know what property names will be added.

Inside the liverage code we can either parse the command line and the options from package.json and try to match the nyc logic, or we can brute force the problem. When running the tool, we are probably only interested in the JavaScript source files in the current folder, excluding the node_modules folder. Thus I grab all source files and add placeholder entries to the coverage object right away.

const path = require('path')
const glob = require('glob')

function findSourceFiles () {
  const toFull = (name) => path.resolve(name)
  return glob.sync('{src,examples}/**/*.js')
    .concat(glob.sync('*.js'))
    .map(toFull)
}
const jsFiles = findSourceFiles()
Object.defineProperty(global, '__coverage__', {
  configurable: true,
  enumerable: true,
  get: () => {
    return cover
  },
  set: (value) => {
    console.log('setting new coverage object')
    // prepare for every source file ;)
    jsFiles.forEach((filename) => {
      var fileCoverage
      Object.defineProperty(value, filename, {
        configurable: true,
        enumerable: true,
        get: () => fileCoverage,
        set: (coverage) => {
          fileCoverage = coverage
        }
      })
    })
    cover = value
  }
})

At the end of the run, the coverage object will still have empty values for files that were not part of the run. For example, the "/Users/home/me/project/src/bar.js" entry was a placeholder that never got set by nyc:

{
  "/Users/home/me/project/foo.js": { ... },
  "/Users/home/me/project/src/bar.js": undefined,
  ...
}

This is not a problem for nyc - it is robust enough to skip the empty entries when computing the final coverage reports.
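
If you process the coverage object yourself, the unset placeholders are easy to handle. The sketch below is not nyc's code; the file names are made up. It shows two things: filtering out the entries whose getter still returns undefined, and the fact that JSON.stringify omits properties with undefined values, so placeholders never reach a JSON dump.

```javascript
// placeholder entries for two hypothetical source files
const coverage = {}
const expectedFiles = ['/project/foo.js', '/project/src/bar.js']
expectedFiles.forEach((filename) => {
  let fileCoverage
  Object.defineProperty(coverage, filename, {
    configurable: true,
    enumerable: true,
    get: () => fileCoverage,
    set: (value) => { fileCoverage = value }
  })
})

// only foo.js is actually loaded during the run
coverage['/project/foo.js'] = { s: { '1': 1 } }

// both keys are enumerable, but only one has a value
const loadedFiles = Object.keys(coverage)
  .filter((name) => coverage[name] !== undefined)
// JSON.stringify drops properties whose value is undefined
const json = JSON.stringify(coverage)
console.log(loadedFiles)
console.log(json)
```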

Be ready to go live

Our code knows when the global coverage object is created, and each file coverage object is added to it.

Finally, we are at the last step, where we play the same "placeholder" trick for the third time. For each file loaded, nyc will set the coverage information object. Whenever it is set, our "setter" method runs, thus we have synchronous access to the file coverage object after it has been created but before the file's code starts executing. The coverage object has entries for each statement; this is the most important coverage information, and the part we want to convert into an event emitter. The inside of the instrumented file looks something like this

var fileCoverage = {
  s: { // statement coverage
    '1': 0,
    '2': 1,
    ...
  },
  statementMap: { // statement to source map
    ...
  },
  f: { // function coverage
    '1': 1
  },
  b: { // if / else branch coverage
    '1': 0
  }
}
global.__coverage__[__filename] = fileCoverage

When this file coverage object is set, we need to replace the individual primitives inside the s object with "smarter setter" functions that will notify our code that a particular statement has a new value. This is simple to do; in this case we do not even have to anticipate the future property names, since the full object is already there for us.

The property conversion has been factored out into a self-contained function that replaces the existing properties inside fileCoverage.s object with "set" functions.

function liveStatementCoverage (cb, filename, fileCoverage) {
  Object.keys(fileCoverage.s).forEach((statementIndex) => {
    var counter = fileCoverage.s[statementIndex]
    Object.defineProperty(fileCoverage.s, statementIndex, {
      enumerable: true,
      get: () => counter,
      set: (x) => {
        counter = x
        cb({
          filename: filename,
          s: statementIndex,
          counter: x
        })
      }
    })
  })
  return fileCoverage
}
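
The reason this works with instrumented code is that an increment like s['2']++ is a read followed by a write, so it goes through the getter and then the setter. Here is a tiny standalone sketch of the same idea; the watchCounters helper is a made-up generic version used only for this demonstration.

```javascript
// Hypothetical generic helper: replace every counter in an object with
// an accessor property that reports each new value.
function watchCounters (counters, onChange) {
  Object.keys(counters).forEach((id) => {
    let value = counters[id]
    Object.defineProperty(counters, id, {
      enumerable: true,
      get: () => value,
      set: (x) => {
        value = x
        onChange(id, x)
      }
    })
  })
  return counters
}

const events = []
const s = watchCounters({ '1': 1, '2': 0 }, (id, counter) => {
  events.push({ s: id, counter: counter })
})

// exactly what the instrumented code does on every executed statement
s['2']++
s['2']++
console.log(events)
console.log(s['2']) // 2
```

Each increment produces one event, and untouched counters produce none.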

From our "liverage" module we simply call the above function whenever we get a new file coverage object, passing our event emitter as a callback.

// liverage
const startServer = require('./ws-coverage')
const server = startServer() // our WebSocket server
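// this runs inside the global.__coverage__ setter shown earlier,
// where `value` is the new coverage object
jsFiles.forEach((filename) => {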
  var fileCoverage
  Object.defineProperty(value, filename, {
    configurable: true,
    enumerable: true,
    get: () => fileCoverage,
    set: (coverage) => {
      fileCoverage = liveStatementCoverage(server.broadcast, filename, coverage)
    }
  })
})

Every covered statement will be broadcast to the connected clients, thus every client can observe the server's code being covered in real time.
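
On the receiving side, a client only needs to fold the incoming events into its own picture of the coverage. The sketch below is not the liverage-client code. It assumes each message has already been parsed into an object with the same filename, s and counter fields that the callback above produces, and the createCoverageView name is made up.

```javascript
// Hypothetical client-side store: keeps the latest counter per statement.
function createCoverageView () {
  const files = {}
  return {
    // apply a single coverage event
    update: (event) => {
      const statements = files[event.filename] || (files[event.filename] = {})
      statements[event.s] = event.counter
    },
    // ids of the statements seen executing at least once
    coveredStatements: (filename) => {
      const statements = files[filename] || {}
      return Object.keys(statements).filter((id) => statements[id] > 0)
    }
  }
}

const view = createCoverageView()
view.update({ filename: '/project/foo.js', s: '2', counter: 1 })
view.update({ filename: '/project/foo.js', s: '5', counter: 1 })
view.update({ filename: '/project/foo.js', s: '2', counter: 2 })
const covered = view.coveredStatements('/project/foo.js')
console.log(covered) // [ '2', '5' ]
```

Because every event carries the full counter value, the client never has to diff anything.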

[image: liverage]

Above: an example client implemented using CycleJs, code in bahmutov/liverage-client

Conclusion

Even without Object.observe we can get real time object value updates if we can predict the future property names. In this case we had to forecast the property names three times in order to preemptively define a property with our "setter" function. Yet in the end we got a useful "push" system for observing code coverage events.

Better world by better software

Gleb Bahmutov PhD