Back to All Concepts
DatabasesDistributed SystemsAlgorithmsHashingAdvanced

Consistent Hashing

How to add/remove servers without moving every single key. The Ring, Virtual Nodes, and real-world usage in Cassandra, DynamoDB, and Discord.

The Rehashing Problem

Imagine you have 4 cache servers. You map keys using hash(key) % 4.

  • key1 -> Server 1
  • key2 -> Server 2

What if Server 1 dies? Now you have 3 servers. The formula becomes hash(key) % 3.

  • key1 moves to Server X?
  • key2 moves to Server Y?

Nearly 100% of keys are moved. The cache is cold. Every request hits the database. Your database melts under the sudden load — a phenomenon known as cache stampede.

The Solution: The Ring

Instead of mapping to a server index (0, 1, 2), we map both keys and servers to a position on a Circle (0° to 360°), also known as the hash ring.

Interactive: Consistent Hashing Ring

Total Nodes: 3

How it works: Users (inner circle) are mapped to the first Node (outer circle) found moving clockwise.

Effect: Adding a node only re-maps keys that fall into its immediate range. Keys labeled User don't all move at once.

How it works

  1. Map Servers: Hash the IP of Server A -> 45°. Server B -> 200°. Server C -> 310°.
  2. Map Keys: Hash UserID -> 100°.
  3. Assign: Walk clockwise from UserID (100°) until you hit a Server (Server B at 200°). That server owns the key.

Adding/Removing Nodes

  • If we add Server D (150°):

    • Only keys between Server A (45°) and Server D (150°) move to D.
    • Keys after 150° stay on Server B.
    • Minimal Movement: Only K/NK/N keys move on average, where K is total keys and N is total nodes.
  • If Server B dies (200°):

    • Only keys that were on Server B (between 150° and 200°) move to the next server clockwise (Server C at 310°).
    • All other servers are unaffected.

Virtual Nodes (VNodes)

The Hotspot Problem

With only a few physical nodes on the ring, the distribution is often uneven. If Server A handles 10° to 200° (190° range), and Server B handles 200° to 10° (170° range), the load is skewed.

The Solution: Multiple Points Per Server

Don't put Server A on the ring once. Put it 100-200 times at different positions:

  • Server A_1 → 23°, Server A_2 → 87°, Server A_3 → 156° ... Server A_100 → 342°

This creates a uniform distribution of keys even with few physical servers.

VNode Advantages

  1. Load Balancing: Keys are evenly distributed across the ring regardless of how many physical servers exist.
  2. Heterogeneous Hardware: A powerful server can have 300 virtual nodes while a smaller one has 100, proportional to their capacity.
  3. Faster Recovery: When a node fails, its load is spread across many other nodes instead of just one successor.
python
import hashlib

class ConsistentHash:
    def __init__(self, num_vnodes=150):
        self.num_vnodes = num_vnodes
        self.ring = {}  # position -> server_name
        self.sorted_keys = []

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, server: str):
        """Add a server with N virtual nodes to the ring."""
        for i in range(self.num_vnodes):
            vnode_key = f"{server}:vnode{i}"
            position = self._hash(vnode_key)
            self.ring[position] = server
            self.sorted_keys.append(position)
        self.sorted_keys.sort()

    def remove_node(self, server: str):
        """Remove server and all its virtual nodes."""
        for i in range(self.num_vnodes):
            vnode_key = f"{server}:vnode{i}"
            position = self._hash(vnode_key)
            del self.ring[position]
            self.sorted_keys.remove(position)

    def get_node(self, key: str) -> str:
        """Walk clockwise to find the responsible server."""
        if not self.ring:
            return None
        h = self._hash(key)
        # Binary search for the first position >= h
        for pos in self.sorted_keys:
            if pos >= h:
                return self.ring[pos]
        # Wrap around to first node
        return self.ring[self.sorted_keys[0]]

# Usage
ch = ConsistentHash(num_vnodes=150)
ch.add_node("Server-A")
ch.add_node("Server-B")
ch.add_node("Server-C")

print(ch.get_node("user:1001"))  # -> Server-B
print(ch.get_node("user:2345"))  # -> Server-A
Click to expand code...

Real-World Usage

Cassandra & DynamoDB

Both partition data across nodes using consistent hashing. Each row's partition key is hashed to determine which node stores it. When you add a new node to a Cassandra cluster, only a fraction of data moves — the rest stays put.

Amazon DynamoDB

DynamoDB uses consistent hashing internally to distribute items across storage partitions. As tables grow, DynamoDB automatically splits partitions and redistributes using the ring.

Discord

Discord routes voice channels to voice servers using consistent hashing. When a server goes down, only the channels on that server need to be migrated — other channels are unaffected.

Memcached (Client-Side Sharding)

Memcached clients (like libmemcached) use consistent hashing on the client side to decide which Memcached server to read/write a key to. This avoids a centralized router.

Akamai CDN

Akamai, one of the world's largest CDNs, was co-founded by researchers who developed consistent hashing. They use it to route web requests to the nearest edge server.

Comparison

FeatureModulo Hashing (% N)Consistent Hashing (Ring)
Add NodeRemaps ~100% of keysRemaps ~1/N1/N keys
Remove NodeRemaps ~100% of keysRemaps ~1/N1/N keys
HotspotsHigh (if bad hash)Low (with Virtual Nodes)
ComplexityO(1) lookupO(log N) lookup (Binary Search on Ring)
ImplementationTrivialModerate (need sorted ring + binary search)

Interview Tips 💡

  1. Start with the problem: "Modulo hashing breaks when servers change because all keys get remapped."
  2. Explain the ring: "Both servers and keys are hashed onto a circle. Walk clockwise to find the owner."
  3. Always mention VNodes: "Virtual nodes solve the uneven distribution problem and enable heterogeneous hardware."
  4. Quantify the benefit: "Adding 1 node to N nodes only moves 1/N of all keys, compared to nearly 100% with modulo."
  5. Name real systems: Cassandra, DynamoDB, Discord, Memcached, Akamai.

Related Concepts

About ScaleWiki

ScaleWiki is an interactive educational platform dedicated to demystifying distributed systems, software architecture, and system design. Our mission is to provide high-quality, technically accurate resources for software engineers preparing for interviews or solving complex scaling challenges in production.

Read more about our Editorial Guidelines & Authorship.

Educational Disclaimer: The architectural patterns and system designs discussed in this article are based on common industry practices, technical whitepapers, and public engineering blogs. Actual implementations in enterprise environments may vary significantly based on specific product requirements, legacy constraints, and evolving technologies.

Related Articles