I hate large projects.
A large project has one or more of the following characteristics:
- More than 20 source files
- More than a single programming language (I consider HTML / CSS / JavaScript / CoffeeScript close enough to count as one)
- When describing what it does you need to use AND a lot.
- None of the code in the project can be reused elsewhere
- Single source repository for all the code
- Getting from source to built / tested / running application takes more than 30 minutes.
- Unit tests are unreliable - sometimes they fail for no reason
Examples of large projects:
- Database plus data access plus website
- API server plus analytics implementation
- Three different websites built from the same physical source repository
A large project has:
- Incomplete documentation, because it is hard to document a database schema and a REST API the same way, so you skip the documentation.
- Messy API that has never been properly cleaned up, since the data keeps changing. Why bother tidying up one part when the rest is a mess?
- Painful deployment, because it is hard to automate all the different parts. Examples: "how do I deploy the static documentation if only the documentation has changed?", "how do I publish this module if the commits concern only it?"
- Home-grown build tools and scripts inside the repository that are the "special sauce" for building this particular project. How they work and whether they are reliable is anyone's guess.
- Misaligned interests: most users want a "simple" and fast installation, while a large project is optimized for people who install and bootstrap it once and then keep it around for a long time.
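To make the first deployment question concrete, here is a minimal sketch of the kind of script a single large repository ends up needing. The folder layout, the `PARTS` map and the `partsToDeploy` function are all hypothetical, made up for illustration; a real project would have to write and maintain its own version.

```javascript
// Hypothetical layout: each deployable part owns one top-level folder.
const PARTS = {
  docs: 'docs/',
  api: 'packages/api/',
  website: 'packages/website/'
};

// Returns the names of the parts touched by a list of changed file paths,
// for example the output of `git diff --name-only`.
function partsToDeploy(changedFiles, parts = PARTS) {
  const touched = new Set();
  for (const file of changedFiles) {
    const owner = Object.keys(parts).find(name => file.startsWith(parts[name]));
    // A file outside every known folder could affect anything.
    touched.add(owner || '*');
  }
  return touched.has('*') ? Object.keys(parts) : [...touched];
}

console.log(partsToDeploy(['docs/index.md', 'docs/api.md']));
// logs [ 'docs' ]
```

Note the fallback: any change outside the known folders forces a full deployment. When each part lives in its own repository, this script is not needed at all.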
As a human being working on a large project I:
- Have trouble keeping track of all the moving parts
- Cannot quickly bring a new developer on board: they need to catch up on so many things before making any meaningful contributions.
- Often sigh because parts are broken or not working as expected.
Most importantly, I can never catch a break from the project, since there is nothing else I can switch to. Everything is one project, so I can only "celebrate" releases, and the next day I am back to the same code, same battles. Mentally, I am working on the same code every day.
Advice: break up
Start breaking large projects into separate parts aggressively. It is possible; it is only a question of the right source control and tools. For example, for my JavaScript or CoffeeScript projects I use npm (the Node package manager) and Bower; both can link dependencies directly against git repositories. Other tools can be set up to link projects in a similar way.
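As an illustration of linking against git repositories, npm accepts git URLs and the GitHub `user/repo` shorthand as dependency versions in package.json, optionally followed by `#` and a tag, branch or commit. The sketch below uses made-up repository names, and the `isGitDependency` helper is only a rough check for the common spellings, not how npm itself parses them.

```javascript
// What the "dependencies" section of a package.json could contain.
// The organization and repository names are hypothetical.
const dependencies = {
  'data-access': 'git+https://github.com/example-org/data-access.git#v1.2.0',
  'analytics': 'example-org/analytics#v0.4.1',
  'lodash': '^4.17.0'
};

// Rough check: does this dependency come from a git repository
// rather than from the registry? Covers only the common spellings.
function isGitDependency(spec) {
  return /^(git\+|git:\/\/|github:)/.test(spec) ||
    /^[\w.-]+\/[\w.-]+(#.+)?$/.test(spec);
}

const linked = Object.keys(dependencies)
  .filter(name => isGitDependency(dependencies[name]));

console.log(linked);
// logs [ 'data-access', 'analytics' ]
```

Pinning each link to a tag keeps the separate projects independent: a subproject can move ahead without breaking the applications that depend on an older version.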
Once a subproject is separate, you can quickly define its precise goals and features, test it properly, document its external features, and set up a quick continuous build. Even better, once a feature lives in its own project, you can forget about it for a while, only coming back when something new is needed. It is complete for a while, and that's a great feeling!
In practice
I have lots of open source projects, and as you can see, they are all in pretty good shape. I move among them at will, implementing a new feature here and there, always happy when a small project becomes slightly better, but I am not bogged down in a giant application. Here is my component workflow and advice specific to Node.js. I hope it helps you split a large project into pieces that you can actually finish!
Tools
Tools for working with monorepos and for working with individual packages
- Lerna - a tool for managing JavaScript projects with multiple packages. babel, create-react-app, pug, jest, and many other projects use Lerna.
- semantic-release - fully automated package publishing. Keep committing and new versions will "magically" appear whenever there are public changes
- next-update makes upgrading dependencies a breeze.
- next-update-travis runs low-intensity dependency upgrades from Travis CI
- greenkeeper.io quickly tests available dependency updates via pull requests (which need to be reviewed manually)
- greenkeeper-keeper is a service to automatically merge Greenkeeper pull requests.
- renovate keeps dependencies up to date automatically by testing and merging pull requests (an alternative to greenkeeper.io + greenkeeper-keeper)
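The "magic" behind automated publishing is less mysterious than it sounds. Here is a simplified sketch of the idea, assuming commits follow the `type(scope): subject` message convention. It is not the semantic-release implementation, and `releaseType` and `nextVersion` are hypothetical names used only for this example.

```javascript
// Simplified sketch, NOT the semantic-release implementation.
// Decide the release type from a list of commit messages.
function releaseType(commitMessages) {
  let type = null;
  for (const message of commitMessages) {
    // A "BREAKING CHANGE:" line anywhere in the message wins outright.
    if (/^BREAKING CHANGE:/m.test(message)) return 'major';
    if (/^feat(\(.+\))?:/.test(message)) type = 'minor';
    else if (/^fix(\(.+\))?:/.test(message) && type === null) type = 'patch';
  }
  return type; // null when there are no public changes
}

// Apply the release type to a "major.minor.patch" version string.
function nextVersion(current, type) {
  const [major, minor, patch] = current.split('.').map(Number);
  if (type === 'major') return `${major + 1}.0.0`;
  if (type === 'minor') return `${major}.${minor + 1}.0`;
  if (type === 'patch') return `${major}.${minor}.${patch + 1}`;
  return current; // only docs, chores or tests: nothing to publish
}

const commits = ['fix: handle empty input', 'feat(api): add search', 'docs: update readme'];
console.log(nextVersion('1.2.3', releaseType(commits)));
// logs 1.3.0
```

This works well for a small package with a single public API. In a large project with many parts, deciding which commit affects which published piece is much harder.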
Examples of monorepos
In addition to the list in Lerna:
- Cycle.js
- Angular - check out the build scripts!
- Lots of private company repositories I have experienced
- The Problem with Shared Code by Jeff Whelpley.
- Michael Church has written a very interesting blog post about the culture of large projects: Java Shop Politics.
Arguments for large monorepos
- On Monolithic Repositories
- Advantages of monolithic version control: simplified organization, simplified dependencies, tooling (over the entire monorepo maybe, not for individual projects in my opinion), and cross-project changes