It is easy to accidentally overwrite data in your program, especially when the program starts handling multiple requests concurrently. Node is great at async, event-driven programming using a single event loop, but even in Node there can be unexpected results.
Here is a shared variable example that is common when using a singleton middleware responding to multiple requests.
step 1 - sync
Let us start with simple initial code that works as expected: it logs a field from a request object.
function server(req) {
This is a simple synchronous sequence of steps.
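The listing above is cut off, so here is a minimal sketch of what such a synchronous server might look like. It is a reconstruction under assumptions, not the original code: the optional sink parameter and the sample request names are hypothetical, added only so the output can be captured.

```javascript
// Hypothetical reconstruction of the synchronous version.
// The sink parameter is an assumption; it defaults to console.log.
function server(req, sink = console.log) {
  // Log a single field of the request, right away.
  sink(req.name);
}

server({ name: 'foo' });
server({ name: 'bar' });
server({ name: 'baz' });
// Prints foo, bar, baz in the order the requests arrive.
```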
step 2 - async log
Instead of writing the request name, let us queue it up for logging later. This might be necessary to handle the request as quickly as possible, postponing logging until the server is less busy.
function server(req) {
I added a "finished requests" message to show the sequence of events.
There are no problems caused by the changed sequence of events. In particular, the req.name property has the expected value, because each execution of the callback function sees its own variable req, captured in the closure, pointing at a different argument object passed to the server call (lines // 2, // 3, // 4). These objects are allocated on the heap. I described the difference between stack and heap in this blog post.
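To make the closure behaviour concrete, here is a sketch of how the deferred logging might be written. The listing in this step is truncated, so this is an assumed reconstruction: the sink parameter and the 0 ms delay are hypothetical, and the // 2, // 3, // 4 markers are placed on the server calls to match the description.

```javascript
// Hypothetical reconstruction of the async logging version.
// The sink parameter and the 0 ms delay are assumptions.
function server(req, sink = console.log) {
  // Defer the log call; the callback closes over this call's own req.
  setTimeout(function () {
    sink(req.name);
  }, 0);
}

server({ name: 'foo' }); // 2
server({ name: 'bar' }); // 3
server({ name: 'baz' }); // 4
console.log('finished requests');
// Output order: finished requests, foo, bar, baz
```

The message appears first because the three callbacks only run after the current synchronous code has finished.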
step 3 - async logger singleton
Let us now move the logging feature into a separate object. It makes great sense to use the singleton pattern, since all messages ultimately go into the same console object (it could be any message sink).
var logger = {
We used a single property data (line // 1) to temporarily hold the message between the queue and flush calls. Thus the last value written, baz, was printed 3 times, even though we scheduled the first 2 flush calls before it was written. This is an example where the object-oriented approach, with objects containing inner state, causes problems, while functional programming with values passed around (as in step 2) avoids them.
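Since the listing for this step is cut off, the following is only a sketch of a singleton logger that shows the same overwrite problem. The queue, flush, and data names come from the text; the sink property, the 0 ms delay, and the sample requests are assumptions.

```javascript
// Hypothetical reconstruction of the singleton logger.
// The sink property and the 0 ms delay are assumptions.
var logger = {
  data: undefined, // 1: one shared slot for every pending message
  sink: console.log,
  queue: function (message) {
    this.data = message; // overwrites any message not yet flushed
  },
  flush: function () {
    this.sink(this.data);
  }
};

function server(req) {
  logger.queue(req.name);
  setTimeout(function () {
    logger.flush();
  }, 0);
}

server({ name: 'foo' });
server({ name: 'bar' });
server({ name: 'baz' });
// Prints baz three times: every flush runs after the last queue call.
```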
step 4 - local storage
Obviously this is a very contrived example, but what if we wanted to make it more realistic and schedule the printing to happen after the server handles the request? Or pass information specific to a particular execution stack between the middleware layers? We have 5 choices:
- pass all values around as arguments. This is safe, but becomes very verbose.
- store values as properties in the request object. It is passed around anyway, so we could just use it. I am against this method because I like keeping the request pristine.
- implement a queue data structure inside the logger.
- keep extra info in a singleton hashtable with some request property used to generate the hash. The first middleware adds the hashtable record, the last middleware deletes it.
- use a robust implementation of the previous approach: continuation-local-storage.
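As a rough illustration of the third and fourth choices, here is a sketch of a logger with an internal queue and of a singleton table keyed by a request property. Every name in it (queueLogger, perRequest, firstMiddleware, lastMiddleware, req.id) is hypothetical, and it assumes each request carries a unique id.

```javascript
// Hypothetical sketch; all names here are illustrative assumptions.

// Choice 3: the logger keeps a queue instead of a single slot.
var queueLogger = {
  pending: [],
  sink: console.log,
  queue: function (message) {
    this.pending.push(message);
  },
  flush: function () {
    // Each flush prints the oldest message that is still waiting.
    if (this.pending.length > 0) {
      this.sink(this.pending.shift());
    }
  }
};

// Choice 4: a singleton table keyed by a request property (req.id is assumed).
var perRequest = new Map();

function firstMiddleware(req) {
  perRequest.set(req.id, { name: req.name });
}

function lastMiddleware(req, sink = console.log) {
  var record = perRequest.get(req.id);
  sink(record.name);
  perRequest.delete(req.id); // do not leak records for finished requests
}

function server(req) {
  queueLogger.queue(req.name);
  firstMiddleware(req);
  setTimeout(function () {
    queueLogger.flush();
    lastMiddleware(req);
  }, 0);
}

server({ id: 1, name: 'foo' });
server({ id: 2, name: 'bar' });
server({ id: 3, name: 'baz' });
// Each name is printed twice, in request order: foo foo bar bar baz baz
```

The queue works here only because flush calls run in the same order as queue calls; the keyed table does not depend on ordering, but it needs a reliable key and a guaranteed cleanup step.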