The Rehashing Problem
Imagine you have 4 cache servers. You map keys using hash(key) % 4.
key1-> Server 1key2-> Server 2
What if Server 1 dies?
Now you have 3 servers. The formula becomes hash(key) % 3.
key1moves to Server X?key2moves to Server Y?
Nearly 100% of keys are moved. The cache is cold. Every request hits the database. Your database melts under the sudden load — a phenomenon known as cache stampede.
The Solution: The Ring
Instead of mapping to a server index (0, 1, 2), we map both keys and servers to a position on a Circle (0° to 360°), also known as the hash ring.
Interactive: Consistent Hashing Ring
Total Nodes: 3
How it works: Users (inner circle) are mapped to the first Node (outer circle) found moving clockwise.
Effect: Adding a node only re-maps keys that fall into its immediate range. Keys labeled User don't all move at once.
How it works
- Map Servers: Hash the IP of Server A -> 45°. Server B -> 200°. Server C -> 310°.
- Map Keys: Hash UserID -> 100°.
- Assign: Walk clockwise from UserID (100°) until you hit a Server (Server B at 200°). That server owns the key.
Adding/Removing Nodes
-
If we add Server D (150°):
- Only keys between Server A (45°) and Server D (150°) move to D.
- Keys after 150° stay on Server B.
- Minimal Movement: Only keys move on average, where K is total keys and N is total nodes.
-
If Server B dies (200°):
- Only keys that were on Server B (between 150° and 200°) move to the next server clockwise (Server C at 310°).
- All other servers are unaffected.
Virtual Nodes (VNodes)
The Hotspot Problem
With only a few physical nodes on the ring, the distribution is often uneven. If Server A handles 10° to 200° (190° range), and Server B handles 200° to 10° (170° range), the load is skewed.
The Solution: Multiple Points Per Server
Don't put Server A on the ring once. Put it 100-200 times at different positions:
Server A_1 → 23°,Server A_2 → 87°,Server A_3 → 156°...Server A_100 → 342°
This creates a uniform distribution of keys even with few physical servers.
VNode Advantages
- Load Balancing: Keys are evenly distributed across the ring regardless of how many physical servers exist.
- Heterogeneous Hardware: A powerful server can have 300 virtual nodes while a smaller one has 100, proportional to their capacity.
- Faster Recovery: When a node fails, its load is spread across many other nodes instead of just one successor.
import hashlib
class ConsistentHash:
def __init__(self, num_vnodes=150):
self.num_vnodes = num_vnodes
self.ring = {} # position -> server_name
self.sorted_keys = []
def _hash(self, key: str) -> int:
return int(hashlib.md5(key.encode()).hexdigest(), 16)
def add_node(self, server: str):
"""Add a server with N virtual nodes to the ring."""
for i in range(self.num_vnodes):
vnode_key = f"{server}:vnode{i}"
position = self._hash(vnode_key)
self.ring[position] = server
self.sorted_keys.append(position)
self.sorted_keys.sort()
def remove_node(self, server: str):
"""Remove server and all its virtual nodes."""
for i in range(self.num_vnodes):
vnode_key = f"{server}:vnode{i}"
position = self._hash(vnode_key)
del self.ring[position]
self.sorted_keys.remove(position)
def get_node(self, key: str) -> str:
"""Walk clockwise to find the responsible server."""
if not self.ring:
return None
h = self._hash(key)
# Binary search for the first position >= h
for pos in self.sorted_keys:
if pos >= h:
return self.ring[pos]
# Wrap around to first node
return self.ring[self.sorted_keys[0]]
# Usage
ch = ConsistentHash(num_vnodes=150)
ch.add_node("Server-A")
ch.add_node("Server-B")
ch.add_node("Server-C")
print(ch.get_node("user:1001")) # -> Server-B
print(ch.get_node("user:2345")) # -> Server-A
Real-World Usage
Cassandra & DynamoDB
Both partition data across nodes using consistent hashing. Each row's partition key is hashed to determine which node stores it. When you add a new node to a Cassandra cluster, only a fraction of data moves — the rest stays put.
Amazon DynamoDB
DynamoDB uses consistent hashing internally to distribute items across storage partitions. As tables grow, DynamoDB automatically splits partitions and redistributes using the ring.
Discord
Discord routes voice channels to voice servers using consistent hashing. When a server goes down, only the channels on that server need to be migrated — other channels are unaffected.
Memcached (Client-Side Sharding)
Memcached clients (like libmemcached) use consistent hashing on the client side to decide which Memcached server to read/write a key to. This avoids a centralized router.
Akamai CDN
Akamai, one of the world's largest CDNs, was co-founded by researchers who developed consistent hashing. They use it to route web requests to the nearest edge server.
Comparison
| Feature | Modulo Hashing (% N) | Consistent Hashing (Ring) |
|---|---|---|
| Add Node | Remaps ~100% of keys | Remaps ~ keys |
| Remove Node | Remaps ~100% of keys | Remaps ~ keys |
| Hotspots | High (if bad hash) | Low (with Virtual Nodes) |
| Complexity | O(1) lookup | O(log N) lookup (Binary Search on Ring) |
| Implementation | Trivial | Moderate (need sorted ring + binary search) |
Interview Tips 💡
- Start with the problem: "Modulo hashing breaks when servers change because all keys get remapped."
- Explain the ring: "Both servers and keys are hashed onto a circle. Walk clockwise to find the owner."
- Always mention VNodes: "Virtual nodes solve the uneven distribution problem and enable heterogeneous hardware."
- Quantify the benefit: "Adding 1 node to N nodes only moves 1/N of all keys, compared to nearly 100% with modulo."
- Name real systems: Cassandra, DynamoDB, Discord, Memcached, Akamai.
Related Concepts
- Database Sharding — Consistent hashing is the preferred sharding strategy
- Load Balancing — Balancing requests across servers
- CAP Theorem — Trade-offs in distributed data stores
- Unique ID Generation — Hash-based approaches for distributed IDs
About ScaleWiki
ScaleWiki is an interactive educational platform dedicated to demystifying distributed systems, software architecture, and system design. Our mission is to provide high-quality, technically accurate resources for software engineers preparing for interviews or solving complex scaling challenges in production.
Read more about our Editorial Guidelines & Authorship.
Educational Disclaimer: The architectural patterns and system designs discussed in this article are based on common industry practices, technical whitepapers, and public engineering blogs. Actual implementations in enterprise environments may vary significantly based on specific product requirements, legacy constraints, and evolving technologies.
Related Articles
CRDTs (Real-time Collaboration)
Conflict-free Replicated Data Types enable distributed systems to achieve eventual consistency without coordination, powering Google Docs, Figma, and collaborative editing through mathematically proven merge algorithms.
ACID vs BASE: Consistency Models
The two philosophies of database transaction handling: Strict guarantees (ACID) versus flexible availability (BASE). Deep dive into isolation levels, transaction anomalies, and hybrid approaches.
Database Indexing
Deep dive into database indexing internals. How B-Trees work, Clustered vs Non-Clustered indexes, Composite Index best practices, and covering indexes.