Jan 28 2017

Accurate values in comments

How to update the code comments during program execution.

Values in comments

Take any JavaScript code example. Usually, the inputs are given and the expected output is shown as a comment. For example, Ramda's R.compose function is shown below. The library gives two examples for compose. I find that the more examples are given, the faster I understand the code.

var classyGreeting = (firstName, lastName) =>
  "The name's " + lastName + ", " + firstName + " " + lastName
var yellGreeting = R.compose(R.toUpper, classyGreeting);
yellGreeting('James', 'Bond'); //=> "THE NAME'S BOND, JAMES BOND"

R.compose(Math.abs, R.add(1), R.multiply(2))(-4) //=> 7

Similarly, Lodash docs follow the same convention. The code examples in blog posts, tutorial slides and even books use the same "code, result as a comment" format.

Yet, there is a problem. More complex examples, especially the ones that evolve over time, require extra effort from the author to ensure that the value comments alongside the code remain accurate. There is nothing worse for the reader than to see an out-of-date comment!

R.compose(Math.abs, R.add(1), R.multiply(2))(-4) //=> 3
// oops, value "3" was produced when we had `R.add(5)`!

Inspiration

Recently, Chrome DevTools implemented live value previews. When the debugger is paused, the values of the variables that are already computed are shown right next to the code. Here is how the JavaScript values preview works (source https://developers.google.com/web/updates/2015/07/preview-javascript-values-inline-while-debugging)

Live Preview

This is an extremely nice feature. If only we could mark values in comments as "live" during Node.js execution and ask the runtime to update them. Wait a minute!

Code and comment instrumentation

Recording executed statements is the main feature of code coverage tools like istanbul. It is implemented via on-the-fly code instrumentation: when a JavaScript file is loaded by the Node.js module system, a user-supplied callback function is called. This function can transform the loaded source before it is evaluated. Thus we can do all sorts of interesting things. For example, using my node-hook module we can print a message on file load.

var hook = require('node-hook');
function logLoadedFilename(source, filename) {
  return 'console.log("' + filename + '");\n' + source;
}
hook.hook('.js', logLoadedFilename);
// load the actual file
require('./dummy');
// prints full dummy.js filename, runs dummy.js
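For readers curious what such a load hook involves, here is a minimal sketch built on Node's own require.extensions mechanism (long deprecated but still available, exposed as Module._extensions). It illustrates the general technique under that assumption and is not a copy of node-hook's internals; hookJs and loadedFiles are names made up for this illustration.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const Module = require('module')

// Hypothetical helper: run `transform(source, filename)` on every .js file
// before Node compiles it. Returns a function that restores the old loader.
function hookJs (transform) {
  const original = Module._extensions['.js']
  Module._extensions['.js'] = function (module, filename) {
    const source = fs.readFileSync(filename, 'utf8')
    module._compile(transform(source, filename), filename)
  }
  return () => { Module._extensions['.js'] = original }
}

// Demo: write a throwaway module to a temp folder, then load it through the hook
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'hook-demo-'))
const dummy = path.join(dir, 'dummy.js')
fs.writeFileSync(dummy, 'module.exports = 42\n')

globalThis.loadedFiles = []
const unhook = hookJs((source, filename) =>
  // JSON.stringify escapes quotes and backslashes in the filename
  'globalThis.loadedFiles.push(' + JSON.stringify(filename) + ');\n' + source
)
const dummyExports = require(dummy)
unhook()

console.log(globalThis.loadedFiles.length, dummyExports)
```

The injected statement runs before the module's own code, and the module's exports are unchanged. Building the injected string with JSON.stringify also avoids trouble with filenames that contain quotes or backslashes.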

We do not have to include the above wrapper code in our application. Instead, I prefer using Node's "preload" module feature (the -r flag). Place the wrapper code into a separate module and load it before the first JavaScript file.

printer.js

var hook = require('node-hook');
function logLoadedFilename(source, filename) {
  return 'console.log("' + filename + '");\n' + source;
}
hook.hook('.js', logLoadedFilename);

$ node -r ./printer.js dummy.js
/home/user/dummy.js

Thus we can create a wrapper for Node that is simple to use and can modify any loaded JavaScript file to include any desired additional code. For example we could find all variables in the comments and insert additional statements into the loaded source to save the values of those variables in a big data structure. When the program finishes its run, we need to save this data structure and / or update the original file with new values.

comment-value

This is how comment-value was born. Its first goal is to update the variable values in specially formatted line comments that contain just the variable name followed by a colon, like this: // name:.

In the simple example below we added 3 extra line comments (// a:, // b: and // sum:), recording the argument variables a and b and the variable sum. The comments are empty; we do not even bother writing the expected values manually.

example.js

function add(a, b) {
  // a:
  // b:
  return a + b
}
const sum = add(10, 2)
// sum:

Install the tool comment-value and run it on the file example.js

$ npm i -g comment-value
$ values example.js
$ cat example.js
function add(a, b) {
  // a: 10
  // b: 2
  return a + b
}
const sum = add(10, 2)
// sum: 12

Now, imagine that we have decided to use the simpler 2 + 3 to explain the above addition. Just change the call to add(2, 3) and rerun values example.js.

$ values example.js
$ cat example.js
function add(a, b) {
  // a: 2
  // b: 3
  return a + b
}
const sum = add(2, 3)
// sum: 5

All values have been recomputed and the comments have been updated. No need to do this manually, and the reader can rest assured - the example is correct and up to date.

The implementation is pretty simple. We look at each source line, finding every variable name that matches the format // name:. Then we insert an object at the beginning of the source file to hold the values, and a statement after each comment line to record the value. The above example.js code would look something like this when instrumented

example.js

const values = []
function add(a, b) {
  // a:
  values[0] = a
  // b:
  values[1] = b
  return a + b
}
const sum = add(10, 2)
// sum:
values[2] = sum
// saveUpdated puts the values back
// into example.js and saves it to disk
process.on('exit', saveUpdated)

Perfect.
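The line-by-line transformation described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the tool's actual code: instrumentLines, updateComments and __values are hypothetical names, and the instrumented source is run with new Function instead of being loaded through a hook.

```javascript
// Matches a line comment holding only a variable name and a colon,
// for example "  // sum:" or "// sum: 12"
const COMMENT = /^(\s*)\/\/\s*([A-Za-z_$][\w$]*):.*$/

// Adds a recording statement after every matching comment line
function instrumentLines (source) {
  let index = 0
  const out = ['const __values = [];']
  for (const line of source.split('\n')) {
    out.push(line)
    const match = COMMENT.exec(line)
    if (match) {
      out.push(`${match[1]}__values[${index}] = ${match[2]};`)
      index += 1
    }
  }
  return out.join('\n')
}

// Writes the recorded values back into the matching comment lines
function updateComments (source, values) {
  let index = 0
  return source.split('\n').map(line => {
    const match = COMMENT.exec(line)
    if (!match) return line
    const value = JSON.stringify(values[index])
    index += 1
    return `${match[1]}// ${match[2]}: ${value}`
  }).join('\n')
}

const example = [
  'function add(a, b) {',
  '  // a:',
  '  // b:',
  '  return a + b',
  '}',
  'const sum = add(10, 2)',
  '// sum:'
].join('\n')

// Run the instrumented source in a function body and hand back the recorded values
const recorded = new Function(instrumentLines(example) + '\nreturn __values')()
const updated = updateComments(example, recorded)
console.log(updated)
```

In this sketch a comment inside code that runs several times keeps only the last recorded value, and a comment that is never reached is filled with undefined.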

Taking it to the next level

While the comment-value tool is already useful, we can do better. Not only do we want to show the result variable, we also want to easily show the intermediate expression values. For example, the Ramda compose example at the beginning of this blog post was written in an implicit style, without intermediate variables.

compose-example.js

var R = require('ramda')
R.compose(Math.abs, R.add(1), R.multiply(2))(-4) //=> 7

There is no result variable even! Can we somehow update the value in the line comment //=> 7? Yes, but using a more complex transformation.

We cannot simply look at each line in isolation, finding a single variable name and inserting a quick assignment statement. Instead we need to understand the structure of the code to the left of a "magical" line comment that starts with the //=> string. Luckily, the raw JavaScript source can be parsed into an Abstract Syntax Tree (AST) using off-the-shelf tools. I am using falafel, which takes a source string and a callback function that visits every node in the tree.

In this tiny example, we discover a CallExpression node and a "magic" comment next to each other.

index.js

add(2, 3) //=> ?
---
add(2, 3): CallExpression, "add" is callee,
                           2 and 3 are arguments
add: Identifier
2  : Literal
3  : Literal

Each node in the AST has its location information and the source code. For example the above CallExpression node would be processed like this:

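const fs = require('fs')
const falafel = require('falafel')
const source = fs.readFileSync('./index.js', 'utf8')
// locations: true asks the parser to attach node.loc
const output = falafel(source, { locations: true }, visitor)
function visitor (node) {
  // node.type is "CallExpression"
  // node.source() returns "add(2, 3)"
  // node.loc.end is {line: 1, column: 9}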
  const c = findSpecialCommentToTheRight(node.loc.end)
  // c is {index: 0} pointing to "values" array
  const wrapped = wrap(node)
  node.update(wrapped)
}

If we call node.update() with new source code, it will replace the code for this particular node. With falafel, there is no need to generate a complex replacement AST node when wrapping a source fragment, we could just return a new string.
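The snippet above leaves findSpecialCommentToTheRight undefined. A hypothetical version could look like the sketch below. It assumes the source has already been split into lines and that the end location uses a 1-based line and a 0-based column, as ESTree-style parsers report them; the two-argument signature and the returned shape are inventions for this illustration, not the tool's actual code.

```javascript
// Hypothetical helper. `lines` is the source split on newlines and `end` is a
// node's end location with a 1-based line and a 0-based column.
function findSpecialCommentToTheRight (lines, end) {
  const rest = lines[end.line - 1].slice(end.column)
  // allow a trailing comma or semicolon, then //=> or //> or // >
  const match = /^[\s,;]*\/\/\s?=?>\s*(.*)$/.exec(rest)
  return match ? { line: end.line, text: match[1] } : undefined
}

const lines = ['add(2, 3) //=> ?', 'add(2, 3) // just a note']
const found = findSpecialCommentToTheRight(lines, { line: 1, column: 9 })
const missing = findSpecialCommentToTheRight(lines, { line: 2, column: 9 })
console.log(found, missing)
```

Only the first line carries a "magic" comment, so the second lookup returns undefined and that node would be left alone.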

We could do anything inside the wrapping logic, but we have to be careful not to break the surrounding code. For a CallExpression node we want to actually execute the function, record its result and then return it to the outside code. For example, here is the input and the instrumented code.

index.js

const sum = add(2, 3) //=> ?

instrumented.js

const values = []
const sum = (function () {
  const result = add(2, 3)
  values[0] = result
  return result
}())

You can see the actual instrumented code by passing the -i option to the values program. It will look a lot more complex in order to handle some edge cases.
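To make the wrapping step concrete, here is a small runnable sketch of a wrap function that works on source strings. It is a simplified assumption of how such wrapping could be done, not the tool's real output: wrap takes the expression source and a slot index (hypothetical parameters), and __values is a made-up name for the recording array.

```javascript
// Hypothetical helper: wrap an expression's source so that its value is
// stored in __values[index] and then passed through unchanged
function wrap (expressionSource, index) {
  return `(function () {
    const result = ${expressionSource}
    __values[${index}] = result
    return result
  }())`
}

// Build the instrumented program as a string and run it
const instrumented = `
const __values = []
const add = (a, b) => a + b
const sum = ${wrap('add(2, 3)', 0)}
return { sum, values: __values }
`
const outcome = new Function(instrumented)()
console.log(outcome)
```

A plain function wrapper like this changes the meaning of this and arguments inside the wrapped expression, which is one kind of edge case a real implementation has to handle.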

In action

Let us take the compose example again and see how useful comments could be. Rewrite the example to split the composed functions to one per line for clarity.

compose-example.js

const R = require('ramda')
R.compose(
  Math.abs,
  R.add(1),
  R.multiply(2)
)(-4)

In a real example, we would create the composed function separately and then print the result, right?

compose-example.js

const R = require('ramda')
const fn = R.compose(
  Math.abs,
  R.add(1),
  R.multiply(2)
)
console.log(fn(-4))

Let us insert "magic" comments - and we can insert them inside the composition! We can use different strings to mark the comments; I prefer the short //> (or // > to be compatible with the standard js linter).

compose-example.js

const R = require('ramda')
const fn = R.compose(
  Math.abs,     //>
  R.add(1),     //>
  R.multiply(2) //>
)
console.log(fn(-4))

Run the values tool to compute the values: values compose-example.js

compose-example.js

const R = require('ramda')
const fn = R.compose(
  Math.abs,     //> 7
  R.add(1),     //> -7
  R.multiply(2) //> -8
)
console.log(fn(-4))
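One way to picture how the comments inside the composition get their values is to wrap each function so that it records whatever it returns. The sketch below is an assumption about the approach, not the tool's confirmed implementation. To run without Ramda it uses small stand-ins for R.compose, R.add and R.multiply, and record and values are hypothetical names.

```javascript
const values = []

// Hypothetical helper: returns a function that behaves like `fn`
// but also stores each result in values[index]
const record = (index, fn) => (...args) => {
  const result = fn(...args)
  values[index] = result
  return result
}

// Stand-ins for R.compose, R.add and R.multiply so the sketch runs without Ramda
const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x)
const add = a => b => a + b
const multiply = a => b => a * b

const fn = compose(
  record(0, Math.abs),
  record(1, add(1)),
  record(2, multiply(2))
)
const result = fn(-4)
console.log(values, result)
```

The recorded values follow the order of the comments in the file (top to bottom), while the functions themselves execute bottom to top.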

Great! We can even wrap the call inside the console.log (going the extra mile for a common use case). Just put //> after console.log(fn(-4)) to extract the value of the first argument.

compose-example.js

const R = require('ramda')
const fn = R.compose(
  Math.abs,     //> 7
  R.add(1),     //> -7
  R.multiply(2) //> -8
)
console.log(fn(-4)) //> 7

Finally, we can enable live update and let the users explore how the intermediate values change as we keep changing the parameters. Just run the tool in watch mode and keep editing the file. See this in action in the clip below

Future work

The current comment-value tool solves my problems, but I hope to extend it with several features.

testing - the tool could run in "comparison" mode. If a value comment is empty, then a new value will be filled. If there is a value there already the tool will compare the computed and the current value. If they are different it will raise an error. This can be used to test code and intermediate values given some specific inputs.
online mode - testing lots of code examples in my presentation slides. Maybe if I target GitHub gists ...
type signatures - we could record the run time type signatures of intermediate expressions, instead of values. This would explain the code and allow its refactoring
data coverage during unit tests - we could collect all different data items for a given variable during unit tests. This would be helpful to find out if there are missing tests. For example, the following testing code achieves 100% code coverage during unit tests.

function isEmail(email) {
  return /^[\w\.]+@\w+\.\w+$/.test(email)
}
it('allows valid email', () => {
  console.assert(isEmail('[email protected]'))
})
it('allows email with dots', () => {
  console.assert(isEmail('[email protected]'))
})

Yet this is a perfect example of full statement coverage that guarantees neither code robustness nor correctness. But what if we could collect all values of the input argument email during unit tests?

function isEmail(email) {
  // email:
  return /^[\w\.]+@\w+\.\w+$/.test(email)
}
// all unit tests

We would get a list back, probably as a JSON file. In our case, the variable email would be all emails we have passed to isEmail, no matter how they arrived, maybe even from other tests!

data-coverage.json

{
  "email": ["[email protected]", "[email protected]"]
}

This will quickly give you an idea of more email "types" that you should test. For example, there were no emails with other characters, like dashes! We really would quickly notice (or could even automate) missing test data classes and edge cases.

QA Engineer walks into a bar. Orders a beer. Orders 0 beers. Orders 999999999 beers. Orders a lizard. Orders -1 beers. Orders a sfdeljknesv.
— Bill Sempf (@sempf) September 23, 2014

code documentation - we should stop using @example inside JavaDoc block comments. They are hard to format, a pain to write and a chore to maintain. Instead we could have little executable snippets, with the "comment-value" tool executing them and updating the expected values.

Related: xplain generates documentation examples from unit tests, which makes sure the code examples are accurate and in sync with the code, but it approaches this problem from the opposite direction.
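The "comparison" mode from the list above could work roughly like the following sketch. It is a guess at a possible design, not an existing feature: compareOrFill is a hypothetical name, and for simplicity the computed values are looked up by variable name, which assumes each name appears in only one comment.

```javascript
// Captures the "// name:" prefix, the variable name and any existing value
const COMMENT = /^(\s*\/\/\s*([A-Za-z_$][\w$]*):)\s*(.*)$/

// Hypothetical comparison mode: fill in empty comments, fail on stale ones
function compareOrFill (source, computed) {
  return source.split('\n').map((line, i) => {
    const match = COMMENT.exec(line)
    if (!match) return line
    const [, prefix, name, existing] = match
    const current = existing.trim()
    const fresh = JSON.stringify(computed[name])
    if (current === '') return `${prefix} ${fresh}`
    if (current !== fresh) {
      throw new Error(`line ${i + 1}: ${name} is ${fresh} but the comment says ${current}`)
    }
    return line
  }).join('\n')
}

const filled = compareOrFill('const sum = add(2, 3)\n// sum:', { sum: 5 })
console.log(filled)

let staleError
try {
  compareOrFill('const sum = add(2, 3)\n// sum: 12', { sum: 5 })
} catch (err) {
  staleError = err
}
console.log(staleError.message)
```

An empty comment is filled in as before, while a comment whose value no longer matches the computed one stops the run, which turns the commented examples into lightweight tests.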

Better world by better software

Gleb Bahmutov PhD
