Modular development using Nodejs

Split large projects into small modules.

Nodejs + NPM solve the giant project problem

Giant projects

Every book on good software practices talks about splitting large projects into components and modules. This principle is often at odds with the situation in large software projects, where a giant source tree includes all the source for a particular solution. Often it spans languages, technologies, deployment environments, etc. The downsides of a single source tree are easy to see, and they make good red flags:

  • Extremely hard to understand the scope of the project from looking at the source code
  • Bringing new developers to the project is difficult
    • Lots of parts to understand before any productive work begins
    • Tight coupling between logically separate entities (server / db / client)
    • Installation instructions for tools run into pages
  • Testing is a nightmare
  • Delivery dates are unclear, and parts of the code cannot be delivered separately
  • Work on new features is abandoned because the risk of breaking some existing integration is high

Why are things so bad?

I truly believe that large projects grow to unmanageable sizes because of primitive tools and a lack of practical workflows in most languages. Even if I want to split a C++ application into multiple libraries, I have to pay an upfront cost: creating new library projects, making sure I build the libraries with the correct compiler flags, and linking them into the final application.

Programmers are rational actors (in the short term). While a project is under active development, the upfront cost of splitting it is high enough that it is easier to keep adding source to the already large tree. Every individual decision to keep expanding the single project makes sense; it is the overall result that is a monster.

Modular JavaScript using Nodejs

Nodejs is distributed with an excellent node package manager (npm). Joyent (the company behind node) also runs a public package registry, which is growing extremely fast: more than 40K packages listed and more than 100M package downloads per month as of October 2013. Private registries are an option if needed.

Other languages also have package registries, Maven being a well-known example. Still, nothing comes close to the fast and simple package system npm provides. A new package can be created in 10 seconds using the npm init wizard, published to the public registry in less than 5 seconds using npm publish, and used from other packages using npm install <name>.
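To make that workflow concrete, here is a minimal sketch of what a tiny package looks like on disk and how other code consumes it. The package name `tiny-greeter`, its `greet` function, and the temporary directory are all hypothetical. The sketch writes the two files that `npm init` plus an editor would normally produce, then loads the package by directory path instead of installing it from a registry.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway directory that stands in for a freshly initialized package.
const pkgDir = fs.mkdtempSync(path.join(os.tmpdir(), 'tiny-greeter-'));

// package.json: the manifest that `npm init` generates. "name" and "version"
// identify the package; "main" names the entry point file.
const manifest = {
  name: 'tiny-greeter', // hypothetical package name
  version: '0.1.0',
  main: 'index.js'
};
fs.writeFileSync(
  path.join(pkgDir, 'package.json'),
  JSON.stringify(manifest, null, 2)
);

// index.js: the module's public API is whatever it assigns to module.exports.
fs.writeFileSync(
  path.join(pkgDir, 'index.js'),
  "module.exports = function greet(name) { return 'Hello, ' + name + '!'; };\n"
);

// A consumer would normally write require('tiny-greeter') after `npm install`;
// requiring the directory path resolves through the same "main" field.
const greet = require(pkgDir);
const greeting = greet('npm');
console.log(greeting);
```

The point of the sketch is how little ceremony is involved: one manifest, one file, and a single `require` call on the consuming side.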

An interesting feature of the npm dependency system is the ability to point to any git repository, including a specific tag. Thus one can easily avoid public or private registries and work entirely by pointing dependencies at source repositories. The front-end dependency manager bower uses the same approach. I often do this for modules developed inside the company:

// package.json
"dependencies": {
    "async": "0.2.9",
    "cool-feature": "git+http://local-company-git/cool/feature.git#0.1.0"
}
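The part after `#` in a git dependency names the tag, branch, or commit to check out. Below is a small sketch of how such a value can be told apart from a plain registry version. `parseDependency` is a hypothetical helper written for illustration, not part of npm, and it assumes the `git://` and `git+<transport>://` URL prefixes that npm documents for git dependencies.

```javascript
// Hypothetical helper: classify a package.json dependency value as either a
// registry version or a git URL with an optional tag/branch/commit after '#'.
function parseDependency(spec) {
  const isGit = /^git(\+(ssh|http|https|file))?:\/\//.test(spec);
  if (!isGit) {
    return { type: 'registry', version: spec };
  }
  const hashAt = spec.indexOf('#');
  if (hashAt === -1) {
    // No explicit ref given in the URL.
    return { type: 'git', url: spec, ref: null };
  }
  return {
    type: 'git',
    url: spec.slice(0, hashAt),
    ref: spec.slice(hashAt + 1)
  };
}

// Illustrative dependency map: one registry version, one git URL with a tag.
const dependencies = {
  'async': '0.2.9',
  'cool-feature': 'git+http://local-company-git/cool/feature.git#0.1.0'
};

for (const name of Object.keys(dependencies)) {
  console.log(name, parseDependency(dependencies[name]));
}
```

Pinning the ref to a tag such as `0.1.0` gives a git dependency much the same repeatability as an exact registry version.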

I think npm got the dependency mechanism just right. It is extremely easy to create, share and reuse components, your own or written by others. This is very clearly seen on the registry's front page. Look at the three most depended-upon projects (projects listed as dependencies by other projects) and the number of downloads for each:

  • underscore
  • async
  • request

[Download-count badges for each package; real-time stats badges provided by NodeICO]

The built-in modular development support makes it extremely natural to split a large project into smaller units:

  • creating and connecting components is fast and easy
  • each component can be iterated separately without breaking the large project
  • understanding small components is quick

Back-end or front-end?

I develop the majority of my code for server-side nodejs. If I want to run the code in the browser, I can use browserify to convert my JavaScript into a bundle. Even for pure front-end work, I often use npm and bower together: npm imports the build tools, such as grunt, while bower imports prebuilt front-end libraries. I can even keep the meta information in sync between package.json and bower.json using grunt-npm2bower-sync.
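Keeping the two manifests in sync amounts to copying a handful of shared fields from one JSON object to the other. The sketch below illustrates the idea only: `syncManifest`, the field list, and the sample manifests are assumptions made for this example, not the actual behavior or options of grunt-npm2bower-sync.

```javascript
// Assumed list of metadata fields that both manifests share.
const SHARED_FIELDS = ['name', 'version', 'description', 'license'];

// Hypothetical helper: copy shared metadata from a package.json object into a
// copy of a bower.json object, leaving bower-specific fields untouched.
function syncManifest(npmManifest, bowerManifest) {
  const result = Object.assign({}, bowerManifest);
  for (const field of SHARED_FIELDS) {
    if (npmManifest[field] !== undefined) {
      result[field] = npmManifest[field];
    }
  }
  return result;
}

// Sample manifests (made up for the example).
const packageJson = {
  name: 'cool-feature',
  version: '0.2.0',
  description: 'Demo',
  devDependencies: { grunt: '~0.4.1' } // build tools stay on the npm side
};
const bowerJson = {
  name: 'cool-feature',
  version: '0.1.0',
  dependencies: { jquery: '~2.0.3' } // front-end libraries stay on the bower side
};

const synced = syncManifest(packageJson, bowerJson);
console.log(JSON.stringify(synced, null, 2));
```

Only the shared metadata moves; each manifest keeps its own dependency list, which matches the division of labor described above.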