Journey from procedural to reactive JavaScript with stops

Same simple problem solved in different programming styles.

Update: I have presented this journey at several conferences. Here are the slides.

You can find the companion source code for each solution in the repo bahmutov/javascript-journey. Just clone it and run any particular file to experiment.

Problem: given an array of numbers, multiply each number by a constant and print the result.

var numbers = [3, 1, 7];
var constant = 2;

Let us take a stab at this simple problem using different programming techniques. We will start with a typical C-like procedural solution and journey through modern object-oriented JavaScript, followed by a more functional programming approach. We will finish with chained event emitters and reactive programming solutions. Finally, I will list additional resources for each technique.

procedural

This is the way many people program the solution using the procedural style. You can see the same style in C, C++, C#, Java, etc.

var k = 0;
for (k = 0; k < numbers.length; k += 1) {
  console.log(numbers[k] * constant);
}
// 6 2 14

This code is fast and simple to read. Unfortunately, it is not very reusable and extremely brittle. Many times I have had to debug off-by-one errors in array iteration.
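As an illustration (my own, not from the original article), a single wrong comparison operator silently corrupts the output:

```javascript
// A typical off-by-one bug: `<=` runs the loop one step too far
var numbers = [3, 1, 7];
var results = [];
for (var k = 0; k <= numbers.length; k += 1) { // should be `<`
  results.push(numbers[k] * 2);
}
console.log(results); // [6, 2, 14, NaN]
// the extra iteration reads numbers[3], which is undefined,
// and undefined * 2 is NaN
```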

Let us start with code reuse. JavaScript makes it easy to create new functions, so we can separate the loop iteration from the processing:

function mul(a, b) {
  return a * b;
}
function processNumber(n, constant) {
  console.log(mul(n, constant));
}
for (k = 0; k < numbers.length; k += 1) {
  processNumber(numbers[k], constant);
}
// 6 2 14

Each function has a clear purpose, can be quickly understood, and is easy to test. We still had to pass arguments (like constant) around and combine printing and multiplication manually (by writing the processNumber function). The manual iteration using a for loop can still be a source of errors.

object-oriented

The current version of JavaScript has Array methods that iterate over each element, making the code much cleaner. Let us reuse the functions we wrote earlier and just replace the for loop with an object method:

numbers.forEach(function (n) {
  processNumber(n, constant);
});
// 6 2 14

Notice that the original problem clearly stated: multiply, then print each number. I believe the code should naturally express the algorithm, and right now the processNumber function hides it. Fortunately, Array has several iteration methods. We can combine the map iterator with forEach to better express our steps. In particular, by using map we are creating a copy of the original array, making sure we do not modify the original numbers.

function mul(a, b) {
  return a * b;
}
function print(n) {
  console.log(n);
}
numbers.map(function (n) {
  return mul(n, constant);
}).forEach(print);
// 6 2 14

Notice how we used the print function: we just pass it to the forEach method, removing an extra wrapper function entirely. Fewer functions and fewer arguments leave fewer places for bugs to creep in.

I consider the functions mul and print pure. They have no side effects and their result depends only on the inputs. (I do not count the console.log statement as a side effect here.) Any program composed from pure functions will be easier to understand and simpler to test and debug. Luckily, JavaScript is both an object-oriented and a functional language.
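The claim above that map returns a copy instead of mutating its input is easy to verify with a quick illustrative check (my own, not part of the original code):

```javascript
var numbers = [3, 1, 7];
var doubled = numbers.map(function (n) { return n * 2; });
console.log(numbers); // [3, 1, 7] -- original untouched
console.log(doubled); // [6, 2, 14]
```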

functional

JavaScript functions can be passed around, borrowed, and even bound and called on completely foreign objects! Let us rewrite the same program in functional style, removing the use of the Array.map and Array.forEach methods and using a third-party library's iterators instead. In addition, we can use another trick common in functional languages: partial application. Notice that we always call mul with the constant argument. Why not create a new function with one of the multiplication arguments prefilled?

var _ = require('lodash');
var byConstant = _.partial(mul, constant);
_(numbers)
  .map(byConstant)
  .forEach(print);
// 6 2 14

Each item in the array will first go through the function byConstant, which is just mul(constant, ...), and then will be printed. A pure function like mul that gets its inputs only from arguments (and not from global state) is simple to write and test. A convenient partial application method makes using pure functions a snap. Additional advice: when writing a pure function, plan for its reuse via partial application and place the arguments least likely to change first.
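For readers curious what _.partial does under the hood, here is a simplified sketch of my own (it ignores lodash extras such as argument placeholders):

```javascript
function mul(a, b) {
  return a * b;
}
// simplified partial: remember some leading arguments now,
// forward the remaining arguments later
function partial(fn) {
  var prefilled = Array.prototype.slice.call(arguments, 1);
  return function () {
    var rest = Array.prototype.slice.call(arguments);
    return fn.apply(null, prefilled.concat(rest));
  };
}
var byConstant = partial(mul, 2);
console.log(byConstant(3)); // 6
```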

Avoiding the Array iterators also allows us to iterate over any collection, for example over the values in an object:

_({ John: 3, Jim: 10 })
.map(byConstant)
.forEach(print);
// 6 20

I consider this code functional, even though it wraps around numbers and then calls methods. You could write purely functional code in this situation:

_.forEach(_.map(numbers, byConstant), print);

To keep the code simple to understand, we need partial application again. The Ramda library performs partial application by default for each of its functions, an approach called currying:

var R = require('ramda');
var mapByConstant = R.map(byConstant);
var printEach = R.forEach(print);
var algorithm = R.compose(printEach, mapByConstant);
algorithm(numbers);
// 6 2 14
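To demystify currying, here is a toy two-argument version (my own sketch; Ramda's real implementation handles any arity plus placeholders):

```javascript
// curry2: called with both arguments, compute right away;
// called with one, return a function waiting for the second
function curry2(fn) {
  return function (a, b) {
    if (arguments.length >= 2) {
      return fn(a, b);
    }
    return function (b2) {
      return fn(a, b2);
    };
  };
}
var mul = curry2(function (a, b) { return a * b; });
console.log(mul(2, 3)); // 6
console.log(mul(2)(3)); // 6
```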

If you write code that makes heavy use of tiny composable functions, I will gladly hire you. But we can go a couple of steps further!

lazy evaluation

When we use the numbers.map(...) method or _(numbers).map we create a copy of the original array. What if the original array is huge and we only plan to print the first few processed numbers? This is where lazy iterators come to the rescue!

var lazy = require('lazy.js');
lazy(numbers)
  .take(2)
  .map(byConstant)
  .each(print);
// 6 2

Correction: a better showcase of lazy evaluation would place take(2) after map:

var lazy = require('lazy.js');
lazy(numbers)
  .map(byConstant)
  .take(2)
  .each(print);

Notice the take(2) method in the pipeline. It limits processing to the first two items from the original array, saving time and memory. Of course, this is only beneficial if you really plan on using a subset of the entire original array.
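To see why a lazy pipeline can skip building intermediate arrays, here is a toy lazy sequence of my own (illustrative only; lazy.js is far more complete). Each combinator wraps an iteration function, so nothing runs until each() is called:

```javascript
function lazySeq(forEach) {
  return {
    map: function (fn) {
      return lazySeq(function (cb) {
        forEach(function (x) { cb(fn(x)); });
      });
    },
    take: function (n) {
      return lazySeq(function (cb) {
        var taken = 0;
        forEach(function (x) {
          // note: a real library would also stop the source early
          if (taken < n) {
            taken += 1;
            cb(x);
          }
        });
      });
    },
    each: function (cb) {
      forEach(cb); // only here does anything actually run
    }
  };
}
function fromArray(list) {
  return lazySeq(function (cb) {
    for (var k = 0; k < list.length; k += 1) {
      cb(list[k]);
    }
  });
}
var printed = [];
fromArray([3, 1, 7])
  .map(function (n) { return n * 2; })
  .take(2)
  .each(function (n) { printed.push(n); });
console.log(printed); // [6, 2]
```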

lazy async evaluation

Once we have lazy evaluation that kicks in for each requested element, we can add asynchronous steps. For example, we can sleep for 1 second between items:

lazy(numbers)
  .async(1000) // 1000ms = 1 sec
  .map(byConstant)
  .each(print);
// sleeps 1 second between printing each number

JavaScript is ideally suited to asynchronous lazy evaluation. Every iteration of the sequence just puts the processing on the event loop, letting other tasks happen. Once we start thinking about non-immediate calls, we can handle much more interesting problems.
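The event loop mechanics behind .async(1000) can be sketched with plain setTimeout (a hypothetical helper of my own, not lazy.js code):

```javascript
// process one item per timer tick so other tasks can run in between
function eachAsync(list, delayMs, cb) {
  var k = 0;
  function step() {
    if (k < list.length) {
      cb(list[k]);
      k += 1;
      setTimeout(step, delayMs);
    }
  }
  setTimeout(step, delayMs);
}
eachAsync([3, 1, 7], 1000, function (n) {
  console.log(n * 2);
});
// prints 6, 2, 14 with roughly 1 second between numbers
```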

Going async

Let us change the original problem and think about how we would program the same algorithm if the numbers were multiplied by some asynchronous function instead of the nice and fast mul. For example, we might have to send the numbers to an external server to be multiplied. This is where we can use promises.

promises

A promise is a special object returned by an asynchronous function. You can attach callbacks to a promise that will be called when the actual value arrives. Promises make async processing dead simple, at least in my experience. Let us make a promise-returning function asyncMul that resolves with the result a * b after a 1 second delay. The actual multiply-and-print algorithm becomes a little longer, because we have to massage all arguments into promise-returning functions:

var Q = require('q');
function asyncMul(a, b) {
  return Q(a * b).delay(1000);
}
var byConstantAsync = _.partial(asyncMul, constant);
var promiseToMulNumber = function (n) {
  return byConstantAsync(n);
};

Q.all(numbers.map(promiseToMulNumber))
  .then(print)
  .done();
// sleeps 1 second
// then prints [6, 2, 14]
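For comparison, in environments with a built-in Promise the same parallel algorithm can be sketched without Q (my own adaptation, using a literal constant to stay self-contained):

```javascript
function asyncMul(a, b) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(a * b); }, 1000);
  });
}
var numbers = [3, 1, 7];
Promise.all(numbers.map(function (n) {
  return asyncMul(2, n); // constant = 2
})).then(function (results) {
  console.log(results); // [6, 2, 14] after about 1 second
});
```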

Notice that this algorithm is a little different. Instead of printing the first number, then sleeping for 1 second, then printing the second number, and so on, we kick off all the async functions in parallel. This is what I have found from experience: parallel promises are much easier to program than a sequence. For example, if we know that our array only has 3 items, we can manually set up a sequence of steps:

byConstantAsync(numbers[0])
  .then(print)
  .then(_.partial(byConstantAsync, numbers[1]))
  .then(print)
  .then(_.partial(byConstantAsync, numbers[2]))
  .then(print)
  .done();
// sleeps 1 second
// 6
// sleeps 1 second
// 2
// sleeps 1 second
// 14

Yeah, I feel sick too. But the shorthand for making this sequence for ALL array items is equally bad:

var mulAndPrint = function (n) {
  return function () {
    return byConstantAsync(n).then(print);
  };
};
numbers.map(mulAndPrint)
  .reduce(Q.when, Q())
  .done();
// sleeps 1 second
// 6
// ...

Again, we had to rely on JavaScript's ability to return functions as results to massage an async multiplication into a function that returns a promise to multiply a given value by a constant and then print the result.

In general I find promises to be the best solution to callback hell and the pyramid of doom when you deal with a few async steps. If you have lots of steps to sequence, like when iterating over a large array, you might want to use something else.

event emitter

Node is great. It is fast and versatile, and even includes an event emitter class in its system libraries. Here is the solution to the problem that uses an event emitter.

var events = require('events');
var numberEmitter = new events.EventEmitter();
numberEmitter.on('number', _.compose(print, byConstant));
lazy(numbers)
  .async(1000)
  .each(_.bind(numberEmitter.emit, numberEmitter, 'number'));
// prints 6, 2, 14 with 1 second intervals

Notice that the event name number is arbitrary. We could have called the event next or step. We only use the event to connect each generated number with multiplication and printing callback.

Let us move the boilerplate code into a separate function that returns a source object. Then we can attach all the steps to the source object.

function source(list) {
  var eventEmitter = new events.EventEmitter();
  lazy(list)
    .async(1000)
    .each(_.bind(eventEmitter.emit, eventEmitter, 'step'));
  return {
    on: function (cb) {
      eventEmitter.on('step', cb);
    }
  };
}
source(numbers)
  .on(_.compose(print, byConstant));

Excellent, but can we decouple the _.compose(print, byConstant) callback into something similar to Array.map and Array.forEach?

connect event emitter and promises

You can quickly convert a single emitted event into a promise-returning function; see Promisify event emitter.

chained event emitters

Let us introduce what might seem like unnecessary complexity. The complexity will be hidden from the user in the library code.

For each event, the source will generate a number. Then we want to multiply this number by a constant, and then generate another event.

// source library
var stepEmitter = {
  // apply given callback to the value received,
  // then emit the result through a NEW step emitter
  map: function (cb) {
    var emitter = new events.EventEmitter();
    this.on('step', function (value) {
      var mappedValue = cb(value);
      emitter.emit('step', mappedValue);
    });
    return _.extend(emitter, stepEmitter);
  },
  // apply given callback to the value received,
  // then emit the value again through a NEW step emitter
  forEach: function (cb) {
    var emitter = new events.EventEmitter();
    this.on('step', function (value) {
      cb(value);
      emitter.emit('step', value);
    });
    return _.extend(emitter, stepEmitter);
  }
};
// returns stepEmitter = event emitter + stepEmitter methods
function source(list) {
  var eventEmitter = new events.EventEmitter();
  lazy(list)
    .async(1000)
    .each(_.bind(eventEmitter.emit, eventEmitter, 'step'));

  return _.extend(eventEmitter, stepEmitter);
}
// use source library
source(numbers) // 1
  .map(byConstant) // 2
  .forEach(print); // 3

Here is the gist of this code: source(numbers) returns an event emitter that emits a number every second (line // 1). .map(byConstant) returns another event emitter that receives the previous number, runs it through byConstant and then emits the result (line // 2). .forEach(print) returns yet another event emitter that calls print on whatever it receives and emits the value unchanged (line // 3).

We now get a versatile and powerful graph of processing blocks instead of a single event source and a single callback. Whenever an item is received, a step emitter is free to process it however it wants. The processing step can call other asynchronous functions, filter items, or join them together. For example, let us extend stepEmitter with a buffer method that waits until it has received N items and then sends all the values at once to the next step.

var stepEmitter = {
  // same map and forEach methods as above
  // returns a new step emitter that accumulates N items,
  // then emits the entire array of N items as a single argument
  buffer: function (n) {
    var received = [];
    var emitter = new events.EventEmitter();
    this.on('step', function (value) {
      received.push(value);
      if (received.length === n) {
        emitter.emit('step', received);
        received = [];
      }
    });
    return _.extend(emitter, stepEmitter);
  }
};
// same source() function as above
source(numbers)
  .map(byConstant)
  .buffer(3)
  .forEach(print);
// sleeps 3 seconds then prints [6, 2, 14]

We could buffer items, split arrays into individual items, and join multiple event streams into one! By doing this we are moving from event emitter pattern to reactive programming.

reactive programming

If you view the data flow in your program as multiple streams of asynchronous events, you are programming in reactive style. These streams could be anything: mouse clicks, keyboard characters, Ajax calls, messages from web workers. Instead of managing the complexity of putting these events together yourself (just like we were trying to do in the previous section), you can use an off-the-shelf reactive library that provides lots of utility methods for managing the event flow. Here is a tiny preview of a reactive pipeline using RxJS.

The main idea behind RxJS is the concept of an Observable. It is any source of asynchronous events, equivalent to source(numbers) in the event emitter example. RxJS comes with lots of built-in Observables: UI events, callbacks, even timer events. Notice that in our examples we mixed generating each number with timer intervals via the lazy(numbers).async(1000) call. The reactive solution separates the numbers from the time intervals:

var Rx = require('rx');
var timeEvents = Rx.Observable
  .interval(1000)
  .timeInterval(); // 1
var numberEvents = Rx.Observable
  .fromArray(numbers); // 2
Rx.Observable
  .zip(timeEvents, numberEvents, function pickValue(t, n) { return n; }) // 3
  .map(byConstant)
  .subscribe(print);
// prints 6, 2, 14 with 1 second intervals

In this reactive example we split those two concepts into two streams. The first stream generates a timestamp event every second (line // 1). The second stream produces the next number from the given array at each turn (line // 2). We zip these two streams (line // 3), creating an output stream that emits a single number every second. The callback pickValue picks the number value out of the pair of streams. Each number is then transformed using map and finally printed in the subscribe step.

Reactive libraries like RxJS and bacon.js offer rich APIs for controlling the data flow and connecting multiple streams.

Conclusions

We started with a very straightforward procedural loop iterating over a list of numbers and gradually moved to more abstract and powerful programming paradigms. Of course, each problem might require its own tool, but I find the functional and reactive approaches both elegant and less error-prone than the other techniques. At a bare minimum, using the built-in Array iterators will save you from a common source of errors. It will also give you a nice jumping-off point into functional programming using excellent libraries such as lodash and Ramda.

What about generators? I am skipping generators because they are not available in EcmaScript 5 and will only arrive in the next version. For now, I will just mention that most of the libraries shown here already support generator functions as a source of asynchronous events.

To learn more about each style, read the following:

Lazy evaluation and reactive programming are two topics that could definitely use more documentation.