The Impossible Triangle
In 2000, computer scientist Eric Brewer conjectured—and in 2002 Seth Gilbert and Nancy Lynch formally proved—that a distributed data store can provide at most two of the following three guarantees simultaneously:
- Consistency (C): Every read receives the most recent write or an error. All nodes see the same data at the same time. If you write
balance = 500to Node A, an immediate read from Node B must return500(or fail), never the stale value. - Availability (A): Every request (read or write) receives a non-error response, without guaranteeing it contains the most recent write. The system is always responsive—no timeouts, no "Service Unavailable."
- Partition Tolerance (P): The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes. Network cables get cut, data centers lose connectivity, packets disappear.
Interactive: Pick Two
Interactive: CAP Theorem
Pick any two properties. You cannot have all three in a distributed system.
The "Real" Choice: Why Partition Tolerance Is Mandatory
In a distributed system, Partition Tolerance (P) is not optional. Networks fail. Cables get cut. Data centers lose power. Cloud regions go dark. You cannot avoid partitions—they are a physical reality of distributed computing.
This means the CAP theorem reduces to a binary choice:
CP (Consistency + Partition Tolerance)
If the network breaks, we stop accepting writes (or reads) to ensure data doesn't diverge. The system becomes unavailable for the duration of the partition, but when it recovers, data is guaranteed to be consistent.
Behavior during partition: Nodes that cannot communicate with the majority refuse to serve requests rather than risk returning stale data.
Real-world examples:
- MongoDB (with majority write concern): Rejects writes if the primary cannot reach a majority of replicas.
- HBase / Google Bigtable: Strong consistency via a single leader per region.
- Zookeeper / etcd: Used for configuration and coordination, where consistency is critical.
Best for: Banking systems, inventory management, leader election, distributed locks—anywhere wrong data is worse than no data.
AP (Availability + Partition Tolerance)
If the network breaks, we keep accepting reads and writes, even if nodes disagree. Data inconsistencies are resolved later through conflict resolution strategies.
Behavior during partition: Every node continues serving requests independently. When the partition heals, the system reconciles diverged data.
Real-world examples:
- Cassandra: Always writable. Uses "last write wins" or custom merge functions for conflict resolution.
- DynamoDB: Prioritizes availability; offers optional strongly consistent reads.
- CouchDB: Stores conflicting revisions and lets the application resolve them.
Best for: Social media feeds, shopping carts, DNS, real-time analytics—anywhere some data is better than no data.
CA (Consistency + Availability)?
This is only possible if you have no network partitions, which effectively means a Single Node system (like PostgreSQL or MySQL on one server). The moment you distribute data across multiple nodes, partitions become possible, and CA is no longer achievable.
Key Insight: CA systems are not truly "distributed." They are monolithic databases with single-machine guarantees.
The Spectrum of Consistency
In practice, systems don't fall neatly into CP or AP. Most modern databases offer tunable consistency, letting you choose per-query or per-operation:
# Cassandra: Tune consistency per query
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
cluster = Cluster(['node1', 'node2', 'node3'])
session = cluster.connect('keyspace')
# Strong consistency (CP behavior)
session.default_consistency_level = ConsistencyLevel.QUORUM
# High availability (AP behavior)
session.default_consistency_level = ConsistencyLevel.ONE
# Write to all replicas (strongest CP)
session.default_consistency_level = ConsistencyLevel.ALL
The Quorum Formula
For a replication factor N, if you write to W nodes and read from R nodes:
- Strong Consistency: Guaranteed when
W + R > N - Example: N=3, W=2, R=2 →
2 + 2 = 4 > 3✅ Consistent - Example: N=3, W=1, R=1 →
1 + 1 = 2 < 3❌ May read stale data
PACELC Theorem: The Extension
The CAP theorem only describes behavior during partitions. But what about when the network is healthy? The PACELC theorem (proposed by Daniel Abadi in 2012) extends CAP:
Partition → choose Availability or Consistency. Else (no partition) → choose Latency or Consistency.
This captures a crucial real-world trade-off: even without partitions, replicating data to multiple nodes introduces latency. Should we wait for all replicas to confirm (consistent but slow) or respond immediately (fast but potentially stale)?
| System | P: A or C? | E: L or C? | Notes |
|---|---|---|---|
| DynamoDB | A | L | Fast reads, eventual consistency by default |
| Cassandra | A | L | Tunable, but optimized for speed |
| MongoDB | C | C | Strong consistency, higher latency |
| Spanner | C | C | Google's globally consistent DB (uses GPS clocks!) |
Real-World Case Study: Amazon's Shopping Cart
Amazon's Dynamo paper (2007) is the canonical example of choosing AP. During the 2004 holiday season, Amazon discovered that even a 100ms increase in latency reduced sales by 1%. They designed Dynamo to:
- Always accept writes — putting an item in a cart should never fail.
- Resolve conflicts later — if two carts diverge, merge them (union of items). It's better to have a duplicate item in your cart than to lose one.
This philosophy directly influenced DynamoDB, Cassandra, and Riak.
Common Misconceptions
"You pick two and ignore one"
Not exactly. You can't avoid P in distributed systems, so you're really choosing between C and A during partitions. When there's no partition, you can have both C and A.
"CAP applies to the entire system"
Different parts of your system can make different trade-offs. Your user authentication might be CP (never let two users log in with the same session), while your news feed is AP (showing slightly stale posts is fine).
"Eventual Consistency means inconsistency"
Eventual consistency means the system will converge to a consistent state, given enough time without new writes. It doesn't mean data is permanently wrong—just temporarily stale.
Interview Tips 💡
When discussing CAP in system design interviews:
- Don't just state CAP — explain why P is mandatory and frame it as a C vs. A choice.
- Give concrete examples: "For a banking ledger, I'd choose CP because showing a wrong balance is worse than briefly being unavailable. For a social media timeline, I'd choose AP because seeing a slightly stale feed is acceptable."
- Mention PACELC to show depth: "Even without partitions, there's a latency vs. consistency trade-off. Google Spanner solves this with TrueTime (atomic clocks + GPS)."
- Discuss tunable consistency: "Cassandra lets me set QUORUM for payment operations (CP) and ONE for analytics reads (AP) in the same cluster."
Related Concepts
- Database Sharding — Horizontal partitioning introduces CAP trade-offs
- Database Replication — Replicas must choose C vs. A
- Raft Consensus — A CP consensus algorithm
- Consistent Hashing — Distributes data across nodes
- Distributed Transactions — Two-phase commit vs. eventual consistency
About ScaleWiki
ScaleWiki is an interactive educational platform dedicated to demystifying distributed systems, software architecture, and system design. Our mission is to provide high-quality, technically accurate resources for software engineers preparing for interviews or solving complex scaling challenges in production.
Read more about our Editorial Guidelines & Authorship.
Educational Disclaimer: The architectural patterns and system designs discussed in this article are based on common industry practices, technical whitepapers, and public engineering blogs. Actual implementations in enterprise environments may vary significantly based on specific product requirements, legacy constraints, and evolving technologies.
Related Articles
System Design: Instagram News Feed
Designing a scalable social feed. Fan-out on Write vs Fan-out on Read, and solving the Justin Bieber problem.
System Design: Ticketmaster (Booking System)
How to handle massive concurrency (e.g., Taylor Swift Eras Tour) without double-booking seats. Optimistic Locking, Redis TTL, and Active Queues.
System Design: URL Shortener (Bit.ly)
Designing a high-read, heavy-scale service like Bit.ly. Deep dive into ID generation (Base62 vs UUID) and Redirection mechanics.