Consistent Hashing: A Deep Dive into Distributed Load Balancing
In the world of distributed systems, horizontal scaling is the holy grail. But as we add more servers to handle growing traffic, how do we efficiently distribute requests?
Scaling is not just about adding resources; it's about adding resources without disrupting the system.
This article explores Consistent Hashing, the algorithm that powers DynamoDB, Cassandra, and many CDN architectures.
The Problem with Modulo Hashing
The naive approach to load balancing is basic modular arithmetic:

$$\text{server} = \text{hash}(\text{key}) \bmod N$$

Where $N$ is the number of servers. This works perfectly until $N$ changes. If a server crashes or we add a new one, $N$ changes, and almost all keys are remapped.
Mathematically, when scaling from $N$ to $N+1$ servers, the probability of a key moving is:

$$P(\text{remap}) = \frac{N}{N+1}$$

Going from 9 to 10 servers remaps roughly 90% of all keys.
This causes a cache stampede, potentially bringing down your entire database layer.
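To see the scale of the problem, here is a small illustrative simulation (the key names and counts are made up for the demo): hash 10,000 synthetic keys with crc32 and count how many land on a different server when the cluster grows from 4 to 5 nodes.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// movedFraction returns the share of n synthetic keys whose modulo
// assignment changes when the cluster grows from 4 to 5 servers.
func movedFraction(n int) float64 {
	moved := 0
	for i := 0; i < n; i++ {
		h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("key-%d", i)))
		if h%4 != h%5 {
			moved++
		}
	}
	return float64(moved) / float64(n)
}

func main() {
	// Roughly N/(N+1) = 80% of keys should land on a new server.
	fmt.Printf("%.1f%% of keys remapped\n", 100*movedFraction(10000))
}
```

With a uniform hash, a key keeps its server only when its residues mod 4 and mod 5 agree, which happens for just 4 of every 20 residues, matching the $N/(N+1)$ formula.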
Enter Consistent Hashing
Consistent hashing solves this by mapping both keys and servers to a unit circle, or the "hash ring".
- Hash servers to integers in the range $[0, 2^{32})$.
- Hash keys to the same range.
- Map a key to the first server found moving clockwise on the ring.
The Algorithm
Here is a robust implementation of a Consistent Hash Ring in Go. We use crc32 for hashing and sort.Search for efficient $O(\log n)$ lookups.
```go
package consistenthash

import (
	"hash/crc32"
	"sort"
	"strconv"
)

type Hash func(data []byte) uint32

type Map struct {
	hash     Hash
	replicas int            // Virtual nodes per real node
	keys     []int          // Sorted hash ring
	hashMap  map[int]string // Map hash -> server name
}

func New(replicas int, fn Hash) *Map {
	m := &Map{
		replicas: replicas,
		hash:     fn,
		hashMap:  make(map[int]string),
	}
	if m.hash == nil {
		m.hash = crc32.ChecksumIEEE
	}
	return m
}

// Add places servers on the ring, inserting `replicas` virtual nodes
// for each, then re-sorts the ring for binary search.
func (m *Map) Add(keys ...string) {
	for _, key := range keys {
		for i := 0; i < m.replicas; i++ {
			hash := int(m.hash([]byte(strconv.Itoa(i) + key)))
			m.keys = append(m.keys, hash)
			m.hashMap[hash] = key
		}
	}
	sort.Ints(m.keys)
}
```
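The listing above stops at `Add`; the lookup half, where `sort.Search` actually comes in, is not shown. Here is a self-contained sketch of the clockwise lookup (written as a standalone program with its own minimal ring type so it runs on its own; the `ring` and `get` names are ours, not from the listing):

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// ring mirrors the Map type above: sorted virtual-node hashes
// plus a hash -> server table.
type ring struct {
	keys    []int
	hashMap map[int]string
}

func newRing(replicas int, servers ...string) *ring {
	r := &ring{hashMap: make(map[int]string)}
	for _, s := range servers {
		for i := 0; i < replicas; i++ {
			h := int(crc32.ChecksumIEEE([]byte(strconv.Itoa(i) + s)))
			r.keys = append(r.keys, h)
			r.hashMap[h] = s
		}
	}
	sort.Ints(r.keys)
	return r
}

// get returns the first server clockwise from hash(key): the first
// virtual node with hash >= hash(key), wrapping past the top of the ring.
func (r *ring) get(key string) string {
	if len(r.keys) == 0 {
		return ""
	}
	h := int(crc32.ChecksumIEEE([]byte(key)))
	idx := sort.Search(len(r.keys), func(i int) bool { return r.keys[i] >= h })
	if idx == len(r.keys) {
		idx = 0 // wrap around the ring
	}
	return r.hashMap[r.keys[idx]]
}

func main() {
	r := newRing(50, "server-a", "server-b", "server-c")
	// Deterministic for a given key; which server wins depends on crc32.
	fmt.Println(r.get("user:1234"))
}
```

The wrap-around branch is the detail that trips people up: when a key hashes past the highest virtual node, the "next server clockwise" is the first entry of the sorted slice.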
Virtual Nodes
To ensure uniform distribution even with a small number of servers, we introduce Virtual Nodes. Each physical server is hashed multiple times ($k$ replicas) onto the ring.

Let $L$ be the random variable representing the load on a server. With $k$ virtual nodes per server, the standard deviation of the load decreases significantly, falling off roughly as:

$$\sigma(L) \propto \frac{1}{\sqrt{k}}$$

This creates a smoother distribution of keys, preventing "hot spots" where one server handles a disproportionate share of traffic.
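A quick experiment makes the smoothing visible (the server names, key counts, and replica counts here are illustrative choices, not from the article): distribute 10,000 keys across 5 servers with 1 virtual node each versus 100 each, and compare the standard deviation of per-server load.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"math"
	"sort"
	"strconv"
)

// loadStddev builds a ring with the given replica count, assigns nKeys
// synthetic keys to servers by clockwise lookup, and returns the
// standard deviation of the per-server key counts.
func loadStddev(replicas, servers, nKeys int) float64 {
	var ringKeys []int
	owner := make(map[int]string)
	for s := 0; s < servers; s++ {
		name := "server-" + strconv.Itoa(s)
		for i := 0; i < replicas; i++ {
			h := int(crc32.ChecksumIEEE([]byte(strconv.Itoa(i) + name)))
			ringKeys = append(ringKeys, h)
			owner[h] = name
		}
	}
	sort.Ints(ringKeys)

	counts := make(map[string]int)
	for i := 0; i < nKeys; i++ {
		h := int(crc32.ChecksumIEEE([]byte("key-" + strconv.Itoa(i))))
		idx := sort.Search(len(ringKeys), func(j int) bool { return ringKeys[j] >= h })
		if idx == len(ringKeys) {
			idx = 0
		}
		counts[owner[ringKeys[idx]]]++
	}

	mean := float64(nKeys) / float64(servers)
	var ss float64
	for s := 0; s < servers; s++ {
		d := float64(counts["server-"+strconv.Itoa(s)]) - mean
		ss += d * d
	}
	return math.Sqrt(ss / float64(servers))
}

func main() {
	fmt.Printf("replicas=1:   load stddev %.0f\n", loadStddev(1, 5, 10000))
	fmt.Printf("replicas=100: load stddev %.0f\n", loadStddev(100, 5, 10000))
}
```

With only one point per server, the arc lengths between servers are wildly uneven; at 100 virtual nodes each, every server owns many small arcs that average out.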
Complexity Analysis
Let's break down the performance characteristics:
| Operation | Complexity | Description |
|---|---|---|
| Add Node | $O(V \log V)$ | Append the node's virtual nodes and re-sort the ring |
| Remove Node | $O(V)$ | Remove the node's virtual nodes from the ring |
| Lookup | $O(\log V)$ | Binary search to find the next server clockwise |

Where $V$ is the total number of virtual nodes on the ring.
Conclusion
Consistent Hashing is a fundamental building block of modern distributed systems. By minimizing key reorganization during scaling events, it allows systems to grow elastically and resiliently.
Whether you are designing a distributed cache, a sharded database, or a peer-to-peer network, understanding this algorithm is essential for any backend engineer.
Related Articles
Building a Real-Time Chat System: WebSockets, Message Queues, and Presence at Scale
A production-grade exploration of real-time communication architecture using Go, WebSockets for persistent connections, RabbitMQ for message fanout, and distributed presence tracking.
TAO: How Facebook Built a Graph Database for 3 Billion People
The engineering story behind TAO — Facebook's distributed graph store handling 400+ billion edges, 1 billion queries/second, and sub-millisecond latency. From the MySQL scaling crisis to a custom planet-scale system.
System Design: Building Facebook from Scratch
A comprehensive engineering blueprint for building Facebook at scale — covering social graph, news feed, messaging, media storage, ads, and the infrastructure serving 3 billion users.