One of the fancy first principles in scaling a system is to utilize a virtual machine to its fullest before scaling horizontally. This is the essential pull and tug of designing a good fundamental cloud architecture for the running application.
Here's a quick demonstration of scaling up vertically until the system is full and then adding new VMs horizontally to the system.
If we take a look at the AWS Instance List menu we see a large list of VMs with more than 1 core. If we choose to use m1.large instance sizes for network related reasons - AWS hits constraints on bandwidth long before it hits CPU or RAM limitations - it has 2vCPUs/4ECPUs. On nmon it looks like the VM has 4 available (shared) processors to run code.
In python, ruby, or php standard off-the shelf tools will run an application multi-process to consume all available processors and utilize the VM's full capacity. From a cloud architecture standpoint, this is a best practice before multiplying horizontally. But node.js wants to run single-processor, single-threaded, event-driven in a corner. Node's strength is its ability to seamlessly handle events straight off the OS but it wants to run in its small corner in a teeny tiny ball of fear of other processors.
This is one of those times when choosing the sexy new hyper bleeding edge tool off the shelf is a bad idea. Dammit. Mrfl.
Starting in node.js 0.8.x, node.js introduced the cluster module. It uses a classic master->worker fork() architecture (belying node's pure-C core) where the master communicates jobs to the workers through file descriptors -- not the best way to handle messaging among workers but serviceable. As documented in this Strongloop article, in the original implementation the master served to the workers as first come first serve which naturally favored a small number of workers over the rest of the herd leading to imbalances. node.js cluster 0.10.x utilizes ROUND_ROBIN worker balancing algorithms to force work to spread evenly over the processors -- an improvement.
The cluster module is marked as 1 - Experimental. Use in production at your own risk.
There's tons of code online demonstrating using basic clustering with node.js and express but anyone who has ever written a basic server using fork() and join will get it instantly. As usual with forked() workers, avoid using shared information held in memory at all costs. Coordinating critical sections with processes instead of threads is, to put it mildly, living on the edge.
This will work. But... a few words of consideration...
- Experimentation and load testing will find the tolerance for node.js processes/processor. It depends on how much the process waits on round trips from the database. It will be 1, 2 or 4 but each application is different. Meaning with 4 ECPUs it may be possible to run up to 16 node.js workers effectively depending on memory constraints.
- Because of shifting VM sizes, making the CPU/worker ratio configurable in config scripts and controlled by chef/ansible is a much smarter move than baking it into the master loop in the startup code of the application.
- Highly experimental modules? Logging and insanely defensive coding.
Node.js feels like the ultimate tool designed for the brave new cloud world but sometimes it wakes up and reminds everyone it is not yet a 1.0 release.
Also: squarespace hates PNGs.