Why Cache?

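In many distributed systems, the database is the bottleneck. A query that has to go to disk is slow (often 10-100 ms). Caching solves this by keeping frequently accessed data in memory (RAM). Even with a network round trip to a cache server in the same data center (roughly 1-5 ms), that is much faster, and the memory access itself is orders of magnitude faster than disk.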

The Latency Hierarchy:

L1 Cache: ~1 ns
RAM (Memory): ~100 ns
SSD (Disk): ~100 µs (1,000x slower than RAM)
Network Call: ~50 ms, e.g., across regions (500,000x slower than RAM)
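
The ratios in the list above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the 90% hit ratio in the usage note is an assumed figure, not a measurement.

```python
# Approximate latencies from the list above, in nanoseconds.
LATENCY_NS = {
    "L1 cache": 1,
    "RAM": 100,
    "SSD": 100_000,              # ~100 µs
    "Network call": 50_000_000,  # ~50 ms
}


def slowdown_vs_ram(name: str) -> float:
    """How many times slower than a RAM access the named operation is."""
    return LATENCY_NS[name] / LATENCY_NS["RAM"]


def effective_latency_ns(hit_ratio: float, hit_ns: float, miss_ns: float) -> float:
    """Average read latency when a fraction hit_ratio of reads is served from the cache."""
    return hit_ratio * hit_ns + (1 - hit_ratio) * miss_ns
```

For example, slowdown_vs_ram("SSD") gives 1000.0, and with an assumed 90% hit ratio, RAM hits and SSD misses average out to roughly 10 µs per read, about a tenth of the uncached cost.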

Request path: Application → Cache (Redis) → Database

The Core Concepts

When implementing a cache, you must make three major architectural decisions.

1. Where does the cache live?

Client Side: Browser cache, iOS local storage. Saves network calls entirely.
CDN: Content Delivery Network. Geographic caching for static assets (images, CSS).
Server Side: Redis/Memcached, sitting in front of your database.
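
For the client-side and CDN layers, the server usually signals what may be cached through the HTTP Cache-Control response header. The sketch below is a hypothetical helper, not part of any framework: the kind labels and the max-age values are assumptions chosen for illustration.

```python
def cache_headers(kind: str) -> dict:
    """Return a Cache-Control header for a response of the given (hypothetical) kind."""
    if kind == "static_asset":
        # "public": browsers and shared caches such as CDNs may store it for a day.
        return {"Cache-Control": "public, max-age=86400"}
    if kind == "user_page":
        # "private": only the user's own browser may store it, and only briefly.
        return {"Cache-Control": "private, max-age=60"}
    # Anything else (for example, sensitive data) is not cached at all.
    return {"Cache-Control": "no-store"}
```

Server-side caches such as Redis are not controlled by these headers; the application decides what goes into them.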

2. How do we write to it? (Strategies)

Do you write to the cache first? Or the database? Where is the source of truth?

Should you use Cache-Aside (safest)?
Or Write-Back (fastest)?
Read the Deep Dive on Caching Strategies →
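
To make the two strategies concrete, here is a minimal sketch that uses plain dictionaries as stand-ins for the database and for Redis. The class and method names are invented for this example; a real system would call a database driver and a cache client instead.

```python
class CacheAsideStore:
    """Cache-aside: the application manages the cache; the database is the source of truth."""

    def __init__(self) -> None:
        self.db = {}     # stand-in for the database
        self.cache = {}  # stand-in for Redis/Memcached

    def read(self, key):
        if key in self.cache:        # cache hit
            return self.cache[key]
        value = self.db.get(key)     # cache miss: fall back to the database
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value

    def write(self, key, value) -> None:
        self.db[key] = value         # write to the source of truth first
        self.cache.pop(key, None)    # then invalidate the cached copy


class WriteBackStore:
    """Write-back: writes land in the cache and reach the database later."""

    def __init__(self) -> None:
        self.db = {}
        self.cache = {}
        self.dirty = set()           # keys written to the cache but not yet persisted

    def write(self, key, value) -> None:
        self.cache[key] = value      # fast: no database round trip on the write path
        self.dirty.add(key)

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        return self.db.get(key)

    def flush(self) -> None:
        """Persist all dirty keys; anything unflushed is lost if the cache dies first."""
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

The trade-off shows up in the write methods: cache-aside pays for a database write every time but never holds data the database lacks, while write-back answers immediately and accepts a window in which the cache holds the only copy.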

3. What do we delete? (Eviction)

When your 16GB Redis instance is full, what gets thrown out?

Do you delete the Least Recently Used stuff (LRU)?
Or the Least Frequently Used stuff (LFU)?
Read the Deep Dive on Eviction Policies (LRU/LFU) →
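
As one illustration of eviction, the sketch below implements a small LRU cache on top of Python's collections.OrderedDict. The LRUCache class is written for this example and is not how Redis implements eviction internally.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # least recently used first, most recent last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was touched more recently. In Redis, the eviction behavior is chosen with the maxmemory-policy setting (for example allkeys-lru or allkeys-lfu).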

Common Pitfalls

The "Stale Data" Problem

Caching introduces a new problem: Consistency. If you update a User in the DB, but the Cache still shows the old version, the user sees wrong data.

"There are only two hard things in Computer Science: cache invalidation and naming things." (attributed to Phil Karlton)
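
Two common ways to bound staleness are a time-to-live (TTL) on every entry and explicit invalidation whenever the database is updated. The sketch below combines both; TTLCache and update_user are hypothetical names, and the injectable clock exists only to make the behavior easy to test.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}  # key -> (value, expiry timestamp)

    def set(self, key, value) -> None:
        self._items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:  # too old: treat it as a miss
            del self._items[key]
            return None
        return value

    def invalidate(self, key) -> None:
        self._items.pop(key, None)


def update_user(db: dict, cache: TTLCache, user_id: str, profile: dict) -> None:
    """Update the source of truth, then drop the cached copy so the next read refetches."""
    db[user_id] = profile
    cache.invalidate(user_id)
```

The TTL caps how long a missed invalidation can serve old data, while the explicit invalidate call handles the common case immediately.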

Cache Stampede (Dog-piling)

If a popular key (e.g., "homepage_news") expires, 10,000 users might hit the database simultaneously to regenerate it.

Solution: Locking (Mutex) or probabilistic early expiration.
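
The locking approach can be sketched for a single process with Python's threading module. SingleFlightCache is a hypothetical name, and a multi-server deployment would need a distributed lock instead, which this example does not cover. Probabilistic early expiration is not shown here.

```python
import threading


class SingleFlightCache:
    """On a miss, only one thread per key runs the loader; the others wait and reuse its result."""

    def __init__(self) -> None:
        self._values = {}
        self._locks = {}
        self._locks_guard = threading.Lock()  # protects the per-key lock table

    def _lock_for(self, key) -> threading.Lock:
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_load(self, key, loader):
        if key in self._values:        # fast path: no locking on a hit
            return self._values[key]
        with self._lock_for(key):
            if key in self._values:    # re-check: another thread may have loaded it meanwhile
                return self._values[key]
            value = loader()           # only one thread per key reaches the database
            self._values[key] = value
            return value
```

If 10,000 requests miss on "homepage_news" at once, one of them regenerates the value and the rest block briefly on the per-key lock, then read the freshly cached result.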

Popular Caching Technologies

Technology	Type	Best For
Redis	In-memory key-value store	Sessions, real-time features, pub/sub
Memcached	Simple in-memory cache	High-throughput simple caching
CDN (CloudFlare, Akamai)	Edge caching	Static assets, global distribution
Varnish	HTTP accelerator	Web page caching
Browser Cache	Client-side	Static assets, reducing server load

Real-World Examples

1. Netflix

Netflix caches movie metadata, user preferences, and thumbnails at multiple layers: CDN edge, regional data centers, and in-memory on servers.

2. Twitter Timeline

Twitter's timeline is pre-computed and cached using Redis. When you tweet, the tweet is fanned out to your followers' cached timelines, so reading a timeline is a single cache lookup.
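
The fan-out-on-write idea can be sketched with in-memory structures. Everything below is hypothetical and greatly simplified: the names, the tiny timeline limit, and the use of Python collections in place of Redis are assumptions for illustration, not Twitter's actual implementation.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 3  # deliberately tiny so the trimming is easy to see

followers = defaultdict(set)  # author -> set of follower ids
# user -> cached tweet ids, newest first; the oldest entries fall off the end
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))


def follow(follower: str, author: str) -> None:
    followers[author].add(follower)


def post_tweet(author: str, tweet_id: int) -> None:
    """Fan-out on write: push the new tweet into every follower's cached timeline."""
    for follower in followers[author]:
        timelines[follower].appendleft(tweet_id)


def read_timeline(user: str) -> list:
    """Reads are cheap: the timeline was already assembled at write time."""
    return list(timelines[user])
```

The cost moves from reads to writes: an author with many followers triggers many cache updates per tweet, which is the usual trade-off to raise when discussing this design.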

3. E-commerce Product Pages

Amazon caches product details, reviews, and pricing. Inventory is read-through to ensure accuracy.

Interview Tips 💡

When discussing caching in system design interviews:

Identify read vs. write ratio: Caching benefits read-heavy systems most.
Estimate cache size: How much data will you cache? Can it fit in RAM?
Discuss invalidation strategy: How will you keep the cache fresh?
Consider cache stampede: What happens when many requests hit an expired key simultaneously? (Solution: lock and single-fetch, or staggered TTLs)
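
Two of these tips reduce to quick arithmetic. The sketch below estimates cache size and staggers TTLs with random jitter; the function names, the 1.5x overhead factor, and the 10% spread are assumptions for illustration, not recommended values.

```python
import random


def estimate_cache_bytes(num_items: int, avg_item_bytes: int, overhead_factor: float = 1.5) -> int:
    """Rough memory estimate: payload size times an assumed per-entry overhead factor."""
    return int(num_items * avg_item_bytes * overhead_factor)


def jittered_ttl(base_seconds: float, spread: float = 0.1, rng=random) -> float:
    """Return a TTL within +/- spread of the base so keys set together do not expire together."""
    return base_seconds * (1 + rng.uniform(-spread, spread))
```

For instance, 10 million items of about 1 KB each with the assumed 1.5x overhead come to 15,000,000,000 bytes (about 15 GB), which is already close to the limit of a 16 GB instance. With the default spread, jittered_ttl(300) lands somewhere between 270 and 330 seconds.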

Related Concepts

Redis Internals — Deep dive into how Redis works
CDN — Edge caching for global content delivery
Database Replication — Another strategy for read scaling
Rate Limiting — Often used alongside caching
Consistent Hashing — Used to distribute cache keys across nodes
