Real-Time With Node.js and MongoDB: Building a Live Dashboard That Doesn't Melt at Scale
In early 2015 we had a business intelligence dashboard that refreshed every 5 seconds via AJAX polling. Each poll hit a PHP backend that queried MySQL. At 200 concurrent users, the dashboard generated 2,400 database queries per minute - most of which returned identical data.
The fix wasn't clever caching. It was reconsidering the model entirely: instead of clients asking "is there new data?", the server would tell clients when new data arrives. That's WebSockets.
Why Node.js for Real-Time
The traditional PHP/Apache model: one thread per request. A WebSocket connection is persistent - it stays open for the entire session. A 500-user dashboard means 500 simultaneous open connections, and thread-per-connection doesn't scale to that.
Node.js uses a single-threaded event loop. It handles thousands of concurrent connections not by spawning threads but by registering callbacks and waiting for events:
// The mental model of the Node.js event loop (pseudocode)
while (true) {
  var event = eventQueue.shift(); // FIFO: take the oldest pending event
  if (event) {
    event.callback(); // Execute the callback; it must return quickly
  }
  // No blocking - if a callback does I/O, it registers another
  // callback and the loop moves on
}
The constraint: callbacks must not block the event loop. CPU-heavy synchronous operations (video encoding, large JSON parsing, cryptographic operations) will freeze all connections. Node.js is excellent at I/O-bound work (waiting for DB, waiting for network); it's poor at CPU-bound work.
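The effect of blocking is easy to demonstrate in a few lines (timings here are illustrative, not from the original project):

```javascript
// Sketch: a synchronous loop starves the event loop. A timer that is
// due "now" cannot fire until the blocking work finishes.
var fired = false;
setTimeout(function () { fired = true; }, 0);

// Simulate CPU-bound work: block the event loop for ~100ms
var start = Date.now();
while (Date.now() - start < 100) {}

// Still inside the same turn of the event loop - the timer callback
// has not had a chance to run, even though it was due immediately.
console.log('timer fired yet?', fired); // false
```

On a real server, every connected client experiences that 100ms freeze at once, which is why CPU-heavy work belongs in a separate process.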
The Stack
- Node.js 0.12 (the current stable release at the time; the LTS program didn't begin until Node 4 later in 2015)
- Socket.io 1.3 - WebSocket abstraction with fallback to long-polling for IE9
- MongoDB 3.0 - document store with the new WiredTiger storage engine
- Redis - pub/sub for multi-instance coordination
Socket.io: Rooms and Namespaces
Socket.io's room concept let us efficiently broadcast to subsets of clients:
// server.js
var io = require('socket.io')(httpServer);

io.on('connection', function(socket) {
  console.log('client connected:', socket.id);

  // Client tells us which dashboard section they're viewing
  socket.on('subscribe', function(section) {
    // Leave previous rooms, join the new one
    socket.leaveAll();
    socket.join('dashboard:' + section);

    // Send current state immediately on subscribe
    getDashboardData(section, function(err, data) {
      if (err) return console.error(err);
      socket.emit('dashboard:update', data);
    });
  });

  socket.on('disconnect', function() {
    console.log('client disconnected:', socket.id);
    // Socket.io automatically removes the socket from all rooms
  });
});

// When new data arrives, broadcast to everyone in the relevant room
function broadcastUpdate(section, data) {
  io.to('dashboard:' + section).emit('dashboard:update', data);
}
On the client:
// client.js
var socket = io();

socket.on('connect', function() {
  socket.emit('subscribe', 'sales'); // Subscribe to the sales section
});

socket.on('dashboard:update', function(data) {
  updateCharts(data); // Re-render charts with new data
});

socket.on('disconnect', function() {
  showReconnectingIndicator();
});
Socket.io handles reconnection automatically. The disconnect + connect cycle happens transparently; the client just re-subscribes in the connect handler.
MongoDB: Tailing the Oplog
MongoDB's replication mechanism writes every write operation to the oplog - a special capped collection in the local database. We could tail this collection to react to database changes in real-time.
This predates MongoDB Change Streams (added in 3.6). In 2015, the approach was a tailable cursor on the oplog:
var MongoClient = require('mongodb').MongoClient;

// Oplog tailing requires the mongod to run as a replica set member -
// oplog.rs only exists in the 'local' database of a replica set.
MongoClient.connect('mongodb://localhost:27017/local', function(err, db) {
  if (err) throw err;
  var oplogCollection = db.collection('oplog.rs');

  // Get the current oplog position
  oplogCollection.find({}, { ts: 1 })
    .sort({ $natural: -1 })
    .limit(1)
    .toArray(function(err, docs) {
      if (err) throw err;
      var lastTimestamp = docs[0].ts;

      // Tailable cursor - stays open and returns new docs as they arrive
      var cursor = oplogCollection.find({
        ts: { $gt: lastTimestamp },
        ns: 'aunimeda.orders' // Watch 'orders' collection in 'aunimeda' DB
      }, {
        tailable: true,
        awaitdata: true,
        numberOfRetries: -1, // Retry forever
        tailableRetryInterval: 200
      });

      cursor.each(function(err, doc) {
        if (err) return console.error(err);
        if (!doc) return; // No new docs yet

        // doc.op: 'i' = insert, 'u' = update, 'd' = delete
        if (doc.op === 'i') {
          handleNewOrder(doc.o);
        } else if (doc.op === 'u') {
          handleOrderUpdate(doc.o2._id, doc.o.$set);
        }
      });
    });
});
Every new order insert immediately triggered handleNewOrder, which called broadcastUpdate('orders', ...), which pushed to all clients in the dashboard:orders room. Zero polling.
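The glue between the oplog handler and the broadcast is only a few lines. A sketch with the dependencies injected as parameters - the function names and re-query behavior are assumptions for illustration, not the original code:

```javascript
// Hypothetical glue: an oplog insert triggers a fresh query and a broadcast.
// getDashboardData and broadcastUpdate are passed in so the handler is testable.
function makeOrderHandler(getDashboardData, broadcastUpdate) {
  return function handleNewOrder(orderDoc) {
    // Re-query rather than trusting the oplog doc alone - one insert can
    // change aggregates (totals, counts) beyond the single document.
    getDashboardData('orders', function (err, data) {
      if (err) return console.error('dashboard query failed:', err);
      broadcastUpdate('orders', data);
    });
  };
}
```

Injecting the dependencies keeps the handler trivially unit-testable with stubs, which mattered once the process ran for days between deploys.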
Multi-Instance Coordination with Redis Pub/Sub
A single Node.js process can't use more than one CPU core, so we ran multiple instances with PM2:
// ecosystem.config.js (PM2)
module.exports = {
  apps: [{
    name: 'dashboard',
    script: 'server.js',
    instances: 4,         // One per CPU core
    exec_mode: 'cluster'  // Node.js cluster module under the hood
  }]
};
The problem: with 4 processes, a WebSocket connection from client A goes to process 1. A broadcast from process 2 won't reach client A.
Redis pub/sub solved this:
var redis = require('redis');
var redisSub = redis.createClient();
var redisPub = redis.createClient();

// Every process subscribes to the Redis channel
redisSub.subscribe('dashboard:broadcast');

redisSub.on('message', function(channel, message) {
  var payload = JSON.parse(message);
  // Emit to this process's locally connected Socket.io clients
  io.to(payload.room).emit('dashboard:update', payload.data);
});

// When data changes, any process publishes to Redis.
// Redis delivers to all processes; each emits to its local clients.
function broadcastUpdate(section, data) {
  redisPub.publish('dashboard:broadcast', JSON.stringify({
    room: 'dashboard:' + section,
    data: data
  }));
}
Socket.io 1.x had a built-in Redis adapter (socket.io-redis) that handled exactly this, but understanding the underlying pub/sub pattern was valuable.
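With the adapter, the wiring collapses to a couple of lines - this sketch follows the socket.io-redis 1.x-era README (host and port are illustrative):

```javascript
// Server wiring with the official Redis adapter. Once installed, every
// io.to(room).emit(...) fans out through Redis to all processes.
var io = require('socket.io')(httpServer);
io.adapter(require('socket.io-redis')({ host: 'localhost', port: 6379 }));
```

The adapter replaces the hand-rolled publish/subscribe above; the rest of the application code is unchanged.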
The Result
| Metric | Before (AJAX polling) | After (WebSockets) |
|---|---|---|
| DB queries/minute at 200 users | 2,400 | ~4 (only on actual data changes) |
| Dashboard update latency | 0–5 seconds | <100ms |
| Server memory at 200 users | 480MB (200 PHP-FPM workers) | 95MB (4 Node.js processes) |
| CPU at 200 users (idle data) | 40% (constant polling) | 2% |
The latency drop from "up to 5 seconds" to "under 100ms" changed how the product felt. Users stopped second-guessing whether the data was fresh.
What We Got Wrong
Memory leaks in long-running processes. PHP restarts after every request - memory leaks are irrelevant there. Node.js runs for days. We had a subtle leak in our MongoDB cursor handling that caused memory to grow ~2MB/hour. We caught it after three days, when the process hit 6GB and was OOM-killed.
Tools: process.memoryUsage() logged every minute, and heap snapshots in Chrome DevTools via node-inspector (the built-in node --inspect flag didn't arrive until Node 6.3, in 2016). The leak was a closure inside the oplog tailing function that held a reference to a growing array.
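The periodic log was nothing more elaborate than this - the one-minute interval matches what we ran, while the output format is an assumption:

```javascript
// Log resident set size and heap usage once a minute.
function logMemory() {
  var m = process.memoryUsage();
  console.log('rss=' + (m.rss / 1048576).toFixed(1) + 'MB',
              'heapUsed=' + (m.heapUsed / 1048576).toFixed(1) + 'MB');
  return m;
}

var timer = setInterval(logMemory, 60 * 1000);
timer.unref(); // don't let the logger keep a dying process alive
logMemory();   // log once at startup too
```

A steadily climbing heapUsed line in the logs is what turns "the process died again" into "we have a leak of roughly N MB/hour".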
Unhandled promise rejections. In 2015, an unhandled rejection disappeared silently - not even a warning, let alone a crash. We had several "why did the broadcast stop?" incidents traced to promise chains without .catch(). Node.js 15 (2020) finally made unhandled rejections crash the process - the right call.
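The retrofit was mechanical: every promise chain ends in a .catch that logs. A minimal sketch of the pattern (the helper name is hypothetical):

```javascript
// Run a promise-returning task; log failures instead of letting the
// rejection disappear. Resolves with null on failure so callers continue.
function runLogged(task) {
  return task().catch(function (err) {
    console.error('task failed:', err.message);
    return null;
  });
}
```

Swallowing after logging is a deliberate trade-off for a dashboard: a dropped broadcast is recoverable, a crashed process drops every connected client.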
The Mental Model Shift
The most valuable thing from this project wasn't the technology - it was the event-driven programming model. Before Node.js, we thought in terms of threads blocking on I/O. After, we thought in callbacks and event queues.
This mental model transfers: browser event handlers, React's useEffect, Go channels, Rust async/await - different answers to the same problem of concurrent I/O without thread-per-connection overhead, at the cost of harder-to-follow control flow. The answer to callback hell (deeply nested callbacks) came in 2017 with async/await in Node.js 7.6, but the event loop underneath is unchanged.
In 2024, real-time architectures have more options: Server-Sent Events for one-way push, WebRTC for peer-to-peer, MongoDB Change Streams replacing oplog tailing, and managed services (Pusher, Ably) for teams that don't want to run their own Socket.io infrastructure. The underlying pattern - event-driven, non-blocking - remains the foundation.