Please read Kliesli Compositions in JavaScript by Luis Atencio first. This blog post is just a refactoring of that code for clarity. While Luis does an admirable job showing the composition, the final code is not as clear as it could be. We can even say that the code, by putting type unwrapping and wrapping inside each composed function, strays away from the functional principles of simplicity and composability. See the last part for a comparison, but I hope you read my derivation before jumping to the end.
The example
Imagine you have a text file
1 | $ cat text.txt |
Imagine you want to count words in this single-line text file. Do not handle any errors while coding it. Instead, concentrate on building the program from small, easy-to-understand functions. You might write a function to read the file, then another function to decode it, then another function to split the text into words. The final function could count the words and return a number, which the program can print. In code this is simple:
1 | const fs = require('fs') |
When we run this program with an existing, non-empty file, things are great
1 | $ cat text.txt |
Yet, this program is extremely brittle. It does not handle any errors that
might happen in the real world. The file might not exist or it might be empty.
This program is also less than performant because it reads the file in a
blocking manner using fs.readFileSync instead of using a callback. But if we
used a callback, we could not easily perform the composition check(read(path)).
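The naive program described in this section can be sketched roughly as follows. The function names read, check, decode and count come from the text; their bodies, the word-splitting rule and the sample file are my assumptions, not the original listing.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// read :: String -> Buffer (blocking read, no error handling)
const read = filename => fs.readFileSync(filename)
// check :: Buffer -> Buffer | null (null signals an empty file)
const check = buffer => (buffer.length > 0 ? buffer : null)
// decode :: String -> Buffer -> String
const decode = encoding => buffer => buffer.toString(encoding)
// count :: String -> Number (assumed rule: words are separated by whitespace)
const count = text => text.split(/\s+/).filter(Boolean).length

// processFile :: String -> Number, written as nested calls
const processFile = filename => count(decode('utf8')(check(read(filename))))

// Demo on a throwaway file with hypothetical sample text
const sample = path.join(os.tmpdir(), 'word-count-sample.txt')
fs.writeFileSync(sample, 'The quick brown fox\n')
console.log(processFile(sample)) // 4
```

With a missing or empty file this sketch throws, which is exactly the brittleness discussed above.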
Composition of functions
Let us start with an observation that the function processFile is really
a series of nested function calls. The only variable in the function is path
and that goes inside the "deepest" nested call read(path). We could
draw what happens to the data in processFile as a data processing pipeline
1 | const processFile = path => |
Note that the flow of data goes from the most nested function read, which
executes first, all the way to count, which executes last. Each function expects
a single argument and returns a single result. Well, except for decode, but that
one is split into a function returning a function, so calling it with an
encoding creates the single-argument function we actually want on the fly:
1 | const decodeUtf8 = decode('utf8') |
Function read takes a string argument, and function count returns a number.
Thus the composed function processFile expects a string and returns a number.
We wrote processFile but really it is just a series of function calls,
each grabbing the result of the previous step and calling the next function
in the list. People have been writing functions to do this for us for a long
time. Ramda for example has R.compose
1 | const R = require('ramda') |
Note that we have constructed processFile without even defining the
variable path (this is called pointfree style).
I personally prefer the R.pipe function that reverses the
order of composed functions. I think it reads more naturally top to bottom.
In addition, we can write the type of the result at each step; types usually
start with an upper case letter.
1 | const processFile = R.pipe( |
By taking the input to the first step read (that is a string) and the output of
the last step count (that is of type number) we can write the input and
output types for the composed function processFile. We could write it like
this:
1 | // processFile :: String -> Number |
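To make the composition idea concrete without installing Ramda, here is a small sketch with hand-written compose and pipe helpers. They are simplified stand-ins for R.compose and R.pipe (unary functions only), and the three toy steps are hypothetical; they only mirror the String -> ... -> Number shape of processFile.

```javascript
// Simplified stand-ins for R.compose and R.pipe (unary functions only)
const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x)
const pipe = (...fns) => x => fns.reduce((acc, fn) => fn(acc), x)

// Toy steps, each taking one argument and returning one result
// trim :: String -> String
const trim = s => s.trim()
// words :: String -> [String]
const words = s => s.split(/\s+/)
// length :: [a] -> Number
const length = xs => xs.length

// Both are String -> Number and pointfree: no argument is ever named
const countWordsRightToLeft = compose(length, words, trim)
const countWordsTopToBottom = pipe(trim, words, length)

console.log(countWordsRightToLeft('  hello functional world  ')) // 3
console.log(countWordsTopToBottom('  hello functional world  ')) // 3
```

The only difference between the two helpers is the direction in which the list of functions is walked.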
Ok we can create the program function out of small building blocks on the fly, but the program really cannot handle any errors! It will crash and burn badly in a very common case - if the input file does not exist.
Task
Let us kill two birds with one stone. Let us make the program asynchronous
by reading a file using a callback, and let us handle any errors reading the
file. Instead of passing "plain" values from one function to another, read
will return the contents of the file, but stored inside a "Task"
object. I will use data.task for this. Almost verbatim from the
data.task example
1 | // read :: String -> Task(Error, Buffer) |
Calling read with a string returns a "Task". That "Task" will later
hold either an Error or a Buffer. Due to its asynchronous nature, we
can no longer run check or decode or even print the file contents right
away. Instead we need to attach the actions we want to execute as callbacks
to the Task object.
1 | // read :: String -> Task(Error, Buffer) |
This is kind of interesting: Task.map expects a function that knows nothing
about working with a Task. Instead the callback function passed to Task.map
receives the value from inside the Task (whenever it becomes available).
The "plain" value returned by the callback function decode('utf8') is
automatically placed back into a Task object.
The result of Task.map(...) is another Task. We can keep "mapping" over
the previous result.
1 | // read :: String -> Task(Error, Buffer) |
Here is a cool thing about a Task. As you might have guessed, each Task.map
method call returns another Task instance, but nothing is executed until
Task.fork is called.
We can demonstrate this by interspersing log statements with mapping calls
1 | const t1 = read('./text.txt') |
And we can demonstrate that no Tasks are executed by commenting out the .fork()
call and adding a log statement to the read function.
1 | // read :: String -> Task(Error, Buffer) |
Delayed execution is the main benefit of Tasks - you can keep adding more
computations until you are happy with the chain, and only then call Task.fork
to actually run it. Promises on the other hand are eager - as soon as you
have created a Promise it starts running.
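The lazy behavior can be reproduced with a few lines of plain JavaScript. The Task below is a minimal hypothetical stand-in with only map and fork, not the data.task implementation; it is here just to show that the computation runs when fork is called, while a Promise executor runs as soon as the Promise is constructed.

```javascript
// Minimal stand-in for a lazy Task (NOT the data.task implementation)
const Task = computation => ({
  // fork runs the stored computation with error and success callbacks
  fork: computation,
  // map returns a new Task; f runs only when that new Task is forked
  map: f => Task((reject, resolve) => computation(reject, x => resolve(f(x))))
})

const log = []
const task = Task((reject, resolve) => {
  log.push('reading')
  resolve('hello world')
})
  .map(s => s.toUpperCase())
  .map(s => s.length)

log.push('task built') // nothing has run yet
task.fork(
  err => log.push('error ' + err),
  n => log.push('result ' + n)
)
console.log(log) // [ 'task built', 'reading', 'result 11' ]

// A Promise, by contrast, runs its executor immediately
const order = []
new Promise(resolve => {
  order.push('promise started')
  resolve()
})
order.push('after constructor')
console.log(order) // [ 'promise started', 'after constructor' ]
```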
Composing Task with plain functions
Now our very first function read returns a Task, but we lost the
ability to compose functions using R.compose or R.pipe. How can we compose
a Task? It is no longer a "plain" value we can pass to the next function in
the chain. Luckily, Ramda has a composition function just for this case.
If every function in the chain expects a "plain" input but returns a Task,
all functions can be composed again using a library utility
R.composeK or R.pipeK (the K stands for
Kleisli composition, but I am not linking a reference url because the Wikipedia
article will scare you away for good). Think of this as a composition for
functions that all return the same wrapped type, like Task.
To compose, each individual function in the chain must return a result of
type Task. The simplest way to wrap a "plain" value in a Task is to call
the Task.of(x) factory function.
1 | // decode :: String -> Buffer -> Task(String) |
If we decide to draw the pipeline processFile, I would visualize it as
a series of pipe segments. Each segment that we write expects a "plain" value
but outputs a Task object. Yet except for read, no other function we wrote
deals with or needs an actual Task object! No other function is asynchronous,
so hard-coding Task.of(result) inside each function is short-sighted. It
makes a function harder to read and harder to test. We only returned a Task
from each so we could use these functions with R.pipeK.
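For readers who want to see what this kind of composition does under the hood, here is a sketch in plain JavaScript. Task and pipeK are simplified hypothetical stand-ins for data.task and R.pipeK, and read is a fake that never touches the file system, so treat it as an illustration of the mechanics rather than the library code.

```javascript
// Minimal stand-in for Task with of/chain (NOT the data.task implementation)
const Task = computation => ({
  fork: computation,
  // chain runs this Task, feeds the value to f, then runs the Task f returns
  chain: f => Task((reject, resolve) =>
    computation(reject, x => f(x).fork(reject, resolve)))
})
Task.of = x => Task((reject, resolve) => resolve(x))
Task.rejected = e => Task((reject, resolve) => reject(e))

// Left-to-right Kleisli composition; simplified stand-in for R.pipeK
const pipeK = (first, ...rest) => x =>
  rest.reduce((task, f) => task.chain(f), first(x))

// Every step takes a plain value and returns a Task
// read :: String -> Task(Error, Buffer) (fake, for illustration only)
const read = name =>
  name === 'missing.txt'
    ? Task.rejected(new Error('no such file: ' + name))
    : Task.of(Buffer.from('one two three'))
// decode :: String -> Buffer -> Task(String)
const decode = encoding => buffer => Task.of(buffer.toString(encoding))
// count :: String -> Task(Number)
const count = text => Task.of(text.split(/\s+/).length)

// processFile :: String -> Task(Error, Number)
const processFile = pipeK(read, decode('utf8'), count)

processFile('text.txt').fork(console.error, console.log) // 3
processFile('missing.txt').fork(
  err => console.log('failed:', err.message), // failed: no such file: missing.txt
  console.log
)
```

Note how decode and count never do anything asynchronous; the Task.of calls are there only to satisfy the composition.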
I prefer adapting a function to each particular case, rather than changing
it (see my favorite adaptors). Thus I will change the check, decode('utf8')
and count functions back to their original "simple" form, and will convert
the result into a Task on the fly.
1 | // decode :: String -> Buffer -> String |
Individual functions decode, check and count are simple again, but
our pipeline is a little heavy. We notice that x => Task.of(check(x)), for
example, is functional composition itself!
1 | const processFile = R.pipeK( |
Notice the repeated R.compose(Task.of, _) syntax. We could partially apply
the first argument here to shorten it.
1 | const asTask = R.partial(R.compose, [Task.of]) |
We could avoid even using R.partial if R.compose were curried, but it
is hard to curry a function with an unknown number of arguments. Luckily, Ramda
includes R.o which is a curried compose of exactly two functions! That is the
function that makes our code tiny in this case.
1 | const asTask = R.o(Task.of) |
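The adapter idea can be tried out in isolation with the sketch below. Task, pipeK and o are hand-written hypothetical stand-ins for data.task, R.pipeK and R.o, and read is again a fake that returns a fixed Buffer; only the asTask line mirrors the code above.

```javascript
// Minimal stand-ins (NOT the data.task or Ramda implementations)
const Task = computation => ({
  fork: computation,
  chain: f => Task((reject, resolve) =>
    computation(reject, x => f(x).fork(reject, resolve)))
})
Task.of = x => Task((reject, resolve) => resolve(x))
const pipeK = (first, ...rest) => x =>
  rest.reduce((task, f) => task.chain(f), first(x))
// o: manually curried compose of exactly two functions
const o = f => g => x => f(g(x))

// asTask adapts any plain function so that its result is wrapped in a Task
const asTask = o(Task.of)

// Plain, easily testable functions that know nothing about Task
// decode :: String -> Buffer -> String
const decode = encoding => buffer => buffer.toString(encoding)
// count :: String -> Number
const count = text => text.split(/\s+/).length
// read :: String -> Task(Error, Buffer) (fake, for illustration only)
const read = name => Task.of(Buffer.from('one two three four'))

// Only the pipeline definition mentions Task
const processFile = pipeK(read, asTask(decode('utf8')), asTask(count))

processFile('text.txt').fork(console.error, console.log) // 4
```

The plain functions can now be unit tested with ordinary values, for example count('a b') returns 2 without any Task in sight.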
Perfect. We are composing functions, some of which return an actual Task, but
most do not, and we are adapting the return value on the fly so the pipeline
holds. The actual "flow" is still, aside from the read function, just
"plain" values along the "happy" path.
1 | --------------------------------------------------------- |
What happens if there is a file read error? In that case, the control flow
will skip the "happy path pipe", and will go directly to the error callback
function in the .fork(onError, onSuccess) execution.
1 | --------------------------------------------------------- |
I like thinking of Task / Promises / Either making little railway tracks, and the data moving along the tracks like box cars. Sometimes due to an error, the box car jumps to an error track where it will keep rolling until someone handles the lost car. Watch Functional programming design patterns by Scott Wlaschin for a good talk using this analogy.
Notice that check, decode and count do not take advantage of the full
Task pipeline, unlike read. Also, they probably should not - they are
synchronous functions and their problem is a different one. Take check
for example: it returns a null value to indicate an empty file. But what if
the null value was a legitimate one? Would it return -1 or some magic
constant to indicate a problem? Or would it throw an Error? And how could we
compose these functions safely in that case?
Either pipeline
Let us take a look at the check function. Again, just like read, it should not
return a "plain" special value to indicate a problem. Instead it should return
an object that, just like Task, allows mapping over the inner value.
This wrapper is called Either. It is commonly used to replace multiple
if-else branches, and there are many libraries that implement it.
I will use data.either.
1 | const Either = require('data.either') |
Great, what about decode and count? They too could just return Either,
and we could compose all 3 functions into a single pipeline using R.pipeK,
only this time the result would be an Either(...) object.
1 | // decode : String -> Buffer -> Either(String) |
The way to get the value out of an Either is to NOT ignore possible errors,
for example by providing a default value.
1 | console.log( |
An Either is especially useful here, because a function like decode might
receive an invalid encoding; Either allows us to avoid crashing.
1 | // decode : String -> Buffer -> Either(String) |
We could do the same trick and NOT hard code the returned Either type in count
1 | // count :: String -> Number |
In a sense we have constructed a pipeline where each function (well, except
count, but that only gets called with a valid string input) is safe.
1 | ----------------------------------------- |
From check and from decode, if there is an error, the control will "jump"
to the error track.
1 | ----------------------------------------- |
Do you like my ASCII drawing skills?!
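The Either pipeline from this section can be sketched without any dependencies. Right, Left, pipeK and asEither below are minimal hypothetical stand-ins for data.either and the Ramda helpers, the error messages are made up, and using Buffer.isEncoding to validate the encoding is my assumption about how decode could guard itself.

```javascript
// Minimal Either stand-in (NOT the data.either implementation)
const Right = x => ({
  map: f => Right(f(x)),
  chain: f => f(x),
  getOrElse: _ => x
})
const Left = e => ({
  map: _ => Left(e),
  chain: _ => Left(e), // once on the error track, stay there
  getOrElse: fallback => fallback
})

// Left-to-right Kleisli composition, here over Either instead of Task
const pipeK = (first, ...rest) => x =>
  rest.reduce((either, f) => either.chain(f), first(x))
// asEither adapts a plain function so that its result is wrapped in a Right
const asEither = f => x => Right(f(x))

// check :: Buffer -> Either(String, Buffer)
const check = buffer =>
  buffer.length > 0 ? Right(buffer) : Left('empty file')
// decode :: String -> Buffer -> Either(String, String)
const decode = encoding => buffer =>
  Buffer.isEncoding(encoding)
    ? Right(buffer.toString(encoding))
    : Left('invalid encoding ' + encoding)
// count :: String -> Number (plain, adapted externally)
const count = text => text.split(/\s+/).length

// processBuffer :: String -> Buffer -> Either(String, Number)
const processBuffer = encoding =>
  pipeK(check, decode(encoding), asEither(count))

console.log(processBuffer('utf8')(Buffer.from('a b c')).getOrElse(0)) // 3
console.log(processBuffer('utf8')(Buffer.alloc(0)).getOrElse(0)) // 0
console.log(processBuffer('klingon')(Buffer.from('a b c')).getOrElse(0)) // 0
```

In the last two calls count is never invoked, because the Left short-circuits the rest of the chain.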
Combining pipes
Finally, let us connect the two pipelines we have composed. We cannot combine Task and Either segments of the pipeline sequentially unfortunately.
1 | ------------=====----------------- |
To see why, think about when you are going to call Task.fork - is it before
creating the first Either? No, that cannot be right, we want the final object
to be a Task, so we can call Task.fork on it when we are ready to use it.
The opposite direction does work - we could convert an Either into a Task, see
the Natural Transformation video for example.
Ok, back to our code.
We have to stick the Either pipeline inside the Task pipeline.
The read function will start the outer Task pipe. The result of read
will be passed to the inner Either pipe check -> decode -> count.
The output of the outer Task pipe, which Task.fork will pass to the callback
function, will be the Either object returned by the Either pipe. As a diagram
it would look like this
1 | --------------------------------------------------------- |
The Either pipeline is the same as before (decode does not check the encoding
here).
1 | // check : Buffer -> Either(Buffer) |
Now we need to combine read and processBuffer as Task-returning
functions, and we can use the same asTask approach, because processBuffer
is a "regular" function we can adapt.
1 | // read : String -> Task(Error, Buffer) |
How do we get the result from the output of processFile, which is
Task(Either(Number))? In two steps: first calling .fork, then .getOrElse
1 | processFile('./text.txt') |
The double-pipe insulation handles a missing file, access errors and an empty
file, and could be easily extended to cover invalid encodings and other errors.
All we need is to return an Either - we are already making this the
function's return type in preparation for R.pipeK anyway.
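Putting both pipes together, an end-to-end sketch might look like the following. Task, Right and Left are minimal hypothetical stand-ins for data.task and data.either, and I use Task.map where the post uses R.pipeK with asTask; for a plain function the two give the same result. The sample file and its contents are made up.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Minimal stand-ins (NOT the data.task / data.either implementations)
const Task = computation => ({
  fork: computation,
  map: f => Task((reject, resolve) => computation(reject, x => resolve(f(x))))
})
const Right = x => ({ map: f => Right(f(x)), getOrElse: _ => x })
const Left = e => ({ map: _ => Left(e), getOrElse: fallback => fallback })

// read :: String -> Task(Error, Buffer), asynchronous via a callback
const read = filename => Task((reject, resolve) =>
  fs.readFile(filename, (err, buffer) => (err ? reject(err) : resolve(buffer))))

// Inner Either pipe: check -> decode -> count
// processBuffer :: Buffer -> Either(String, Number)
const processBuffer = buffer =>
  (buffer.length > 0 ? Right(buffer) : Left('empty file'))
    .map(b => b.toString('utf8'))
    .map(text => text.split(/\s+/).filter(Boolean).length)

// Outer Task pipe
// processFile :: String -> Task(Error, Either(String, Number))
const processFile = filename => read(filename).map(processBuffer)

// Two steps to get the number out: .fork first, then .getOrElse
const report = filename => processFile(filename).fork(
  err => console.log(filename, 'failed:', err.message),
  either => console.log(filename, 'words:', either.getOrElse(0))
)

const sample = path.join(os.tmpdir(), 'task-either-sample.txt')
fs.writeFileSync(sample, 'The quick brown fox\n')
report(sample) // success callback, the Either holds the word count
report(path.join(os.tmpdir(), 'no-such-file-here.txt')) // error callback
```

A read failure goes to the first fork callback, while an empty file arrives in the second callback as a Left and falls back to the default passed to getOrElse.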
Comparison to the original blog post example
To finish this blog post I want to go back to Luis Atencio and his example.
Because Luis only uses a single composition to make the Task pipe, it looks
deceptively simple (I will switch compose to pipe for easier comparison)
1 | // processFile :: String -> Task(Error, Either) |
Yet, and this is a big one - this forces each function check, decode
and count to receive an Either as an argument, and return a Task result.
This makes the code inside the count function, for example,
unwrap the argument to get to the plain string value.
1 | // count :: String -> Task(_, String) |
Note the type signature - it is incorrect, the function count receives
an Either(String), not a String. That is why it has to do text.map to get
the actual string t. Similarly there is no reason (other than forcing the
function to be compatible with a Task pipeline) for count to return a Task!
Let us compare the count that fits into a Task pipe with our version of
the same function that goes into the Either pipe, but adapts the return value
type externally using the asEither composition. I will keep the same variable
names for an honest comparison.
1 | // count :: Either(String) -> Task(_, String) |
The first version has 65 characters, the second has 38. We saved about 40% of
the code (and a lot of complexity) by keeping count focused on what it should
actually do - split a string into an array and return its length. Leave
the marshaling of arguments and returns to others.
Final thoughts
Thanks to Luis Atencio for great examples. Go buy his book Functional Programming in JavaScript - it is excellent and practical.