MongoDB 1.6: Scaling Out with Sharding and Replica Sets
It's 2010, and your MySQL "sharding-by-hand" approach is failing. You're tired of writing application-level logic to decide which database node holds which user. 10gen has just released MongoDB 1.6, and it brings two massive features to the table: production-grade Sharding and Replica Sets.
Replica Sets: The End of Master-Slave
Forget the old master-slave replication. Replica Sets introduce automated failover. You have a set of nodes, and they hold an election to decide who is the Primary. If the Primary goes down, the secondaries automatically elect a new one.
// Initiating a replica set in the mongo shell
rs.initiate({
_id: "myConfigSet",
members: [
{ _id: 0, host: "cfg1.example.net:27019" },
{ _id: 1, host: "cfg2.example.net:27019" },
{ _id: 2, host: "cfg3.example.net:27019" }
]
})
Sharding: Horizontal Scaling
Sharding allows you to distribute a single collection across multiple machines. The mongos router handles the complexity, so your application thinks it's talking to a single database.
To shard a collection, you first need to enable sharding for the database and then pick a Shard Key.
// Connect to the mongos instance
sh.enableSharding("myapp")
// Shard the 'orders' collection based on the 'user_id'
// Using a hashed shard key or a range-based one
sh.shardCollection("myapp.orders", { "user_id": 1 })
Choosing the Right Shard Key
This is where most people trip up in 2010. If you pick a monotonically increasing value (like an ObjectId or a timestamp) as your shard key, all your writes will hit the exact same shard—the "hot shard" problem. For high-write volume, you want a key with high cardinality and even distribution, like a UUID or a hash of a user ID.
Monitoring with mongostat
In 1.6, keeping an eye on your shards is vital. Use the mongostat tool to watch inserts, queries, and page faults in real-time. If you see high "locked %", it's time to add more shards. MongoDB handles the "rebalancing" of data chunks in the background, moving data from over-utilized shards to under-utilized ones without taking your app offline.