Tags: Performance, Optimization, Database, Beginner

Caching Overview

High-speed data storage that reduces latency. Often the single most effective way to scale read-heavy systems.

Why Cache?

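In many distributed systems, the database is the bottleneck. A query that has to go to disk can take 10-100 ms. Caching avoids that cost by keeping frequently accessed data in memory (RAM). Memory access itself is orders of magnitude faster than disk, and even with a network round trip to a cache server, a lookup typically returns in roughly 1-5 ms.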

The Latency Hierarchy:

  1. L1 Cache: ~1 ns
  2. RAM (Memory): ~100 ns
  3. SSD (Disk): ~100 µs (1,000x slower than RAM)
  4. Network Call: ~50 ms (500,000x slower than RAM)
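
To see how these numbers combine, the average read latency of a cached system can be estimated from the hit ratio. The sketch below is illustrative only: the function name and the 1 ms / 50 ms figures are assumptions chosen for the example, and it assumes a miss pays for the cache lookup plus the database read.

```python
def expected_read_latency_ms(hit_ratio: float, cache_ms: float, db_ms: float) -> float:
    """Average read latency when reads try the cache first.

    A hit costs one cache lookup; a miss costs the cache lookup
    plus the database read.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * cache_ms + miss_ratio * (cache_ms + db_ms)


# Assumed figures: 1 ms per cache lookup, 50 ms per database read.
for ratio in (0.0, 0.5, 0.9, 0.99):
    latency = expected_read_latency_ms(ratio, cache_ms=1.0, db_ms=50.0)
    print(f"hit ratio {ratio:.0%}: {latency:.1f} ms")
```

Under these assumptions, moving from a 90% to a 99% hit ratio cuts the average from about 6 ms to about 1.5 ms, which is why the hit ratio usually matters more than the raw speed of the cache.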

Request path: Application → Cache (Redis) → Database

The Core Concepts

When implementing a cache, you must make three major architectural decisions.

1. Where does the cache live?

  • Client Side: Browser cache, iOS local storage. Saves network calls entirely.
  • CDN: Content Delivery Network. Geographic caching for static assets (images, CSS).
  • Server Side: Redis/Memcached, sitting in front of your database.

2. How do we write to it? (Strategies)

Do you write to the cache first? Or the database? Where is the source of truth?
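
Two common answers to these questions are cache-aside and write-through. The sketch below is a minimal illustration, not a production implementation: the two dictionaries stand in for a real database and a real cache server, and all function names are hypothetical.

```python
from typing import Dict, Optional

database: Dict[str, str] = {}  # stand-in for the source of truth
cache: Dict[str, str] = {}     # stand-in for Redis/Memcached


def read_cache_aside(key: str) -> Optional[str]:
    """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
    if key in cache:
        return cache[key]
    value = database.get(key)
    if value is not None:
        cache[key] = value
    return value


def write_cache_aside(key: str, value: str) -> None:
    """Cache-aside write: update the database, then drop the cached copy."""
    database[key] = value
    cache.pop(key, None)


def write_through(key: str, value: str) -> None:
    """Write-through: update the database and the cache together."""
    database[key] = value
    cache[key] = value
```

In both variants the database remains the source of truth. Cache-aside pays one extra miss after each write, while write-through keeps the cache warm at the cost of writing to two places.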

3. What do we delete? (Eviction)

When your 16GB Redis instance is full, what gets thrown out?
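
One widely used answer is least-recently-used (LRU) eviction: when the cache is full, discard the entry that has gone longest without being read or written. The class below is a small in-process sketch of that policy, with a hypothetical `LRUCache` name and an item-count capacity instead of a byte limit. It is not how Redis implements eviction; Redis selects its policy through the `maxmemory-policy` setting.

```python
from collections import OrderedDict
from typing import Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, object]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[object]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: object) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")     # "a" is now the most recently used key
lru.put("c", 3)  # over capacity, so "b" is evicted
```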

Common Pitfalls

The "Stale Data" Problem

Caching introduces a new problem: consistency. If you update a user record in the database but the cache still holds the old version, readers see stale data until the entry is invalidated or expires.
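
Two common ways to bound staleness are a time-to-live (TTL) on every entry and explicit invalidation when the underlying record changes. The following is a minimal in-process sketch under those assumptions; `TTLCache` and its methods are hypothetical names, and the injectable clock exists only to make the behaviour easy to demonstrate.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._items[key] = (self._clock() + self.ttl_seconds, value)

    def invalidate(self, key: str) -> None:
        """Call this right after updating the record in the database."""
        self._items.pop(key, None)
```

The TTL caps how long a stale value can survive if an invalidation is missed, while explicit invalidation removes the stale value as soon as the write path knows about the change.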

"There are only two hard things in Computer Science: cache invalidation and naming things."

Cache Stampede (Dog-piling)

If a popular key (e.g., "homepage_news") expires, 10,000 users might hit the database simultaneously to regenerate it.

  • Solution: Locking (Mutex) or probabilistic early expiration.

Popular Caching Technologies

| Technology | Type | Best For |
|---|---|---|
| Redis | In-memory key-value store | Sessions, real-time features, pub/sub |
| Memcached | Simple in-memory cache | High-throughput simple caching |
| CDN (CloudFlare, Akamai) | Edge caching | Static assets, global distribution |
| Varnish | HTTP accelerator | Web page caching |
| Browser Cache | Client-side | Static assets, reducing server load |

Real-World Examples

1. Netflix

Netflix caches movie metadata, user preferences, and thumbnails at multiple layers: CDN edge, regional data centers, and in-memory on servers.

2. Twitter Timeline

Twitter's timeline is pre-computed and cached using Redis. When you tweet, it's fanned out to followers' cached timelines.

3. E-commerce Product Pages

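Amazon caches product details, reviews, and pricing. Fast-changing data such as inventory is read through to the source of truth rather than served from a long-lived cache entry, so that availability stays accurate.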

Interview Tips 💡

When discussing caching in system design interviews:

  1. Identify read vs. write ratio: Caching benefits read-heavy systems most.
  2. Estimate cache size: How much data will you cache? Can it fit in RAM?
  3. Discuss invalidation strategy: How will you keep cache fresh?
  4. Consider cache stampede: What happens when many requests hit an expired key simultaneously? (Solution: lock and single-fetch, or staggered TTLs)
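
Tips 2 and 4 both lend themselves to quick back-of-the-envelope code. The sketch below is a rough illustration: the function names, the 1.5x overhead factor for keys and bookkeeping, and the 10% jitter are assumptions for the example, not measured values.

```python
import random


def estimate_cache_bytes(item_count: int, avg_item_bytes: int,
                         overhead_factor: float = 1.5) -> int:
    """Rough memory estimate: raw payload size times an assumed overhead factor."""
    return int(item_count * avg_item_bytes * overhead_factor)


def jittered_ttl(base_seconds: float, jitter_fraction: float = 0.1) -> float:
    """Staggered TTL: the base TTL plus or minus a random fraction.

    Spreading expiry times keeps keys written together from all
    expiring at the same instant.
    """
    spread = base_seconds * jitter_fraction
    return base_seconds + random.uniform(-spread, spread)


# Example: 10 million items of about 1 KiB each.
needed = estimate_cache_bytes(item_count=10_000_000, avg_item_bytes=1024)
print(f"estimated cache size: {needed / 2**30:.1f} GiB")
print(f"TTL for this key: {jittered_ttl(300):.0f} s")
```

Under these assumptions the estimate comes to roughly 14.3 GiB, which is the kind of number that tells you whether a single cache node is enough or the data has to be sharded.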

About ScaleWiki

ScaleWiki is an interactive educational platform dedicated to demystifying distributed systems, software architecture, and system design. Our mission is to provide high-quality, technically accurate resources for software engineers preparing for interviews or solving complex scaling challenges in production.


Educational Disclaimer: The architectural patterns and system designs discussed in this article are based on common industry practices, technical whitepapers, and public engineering blogs. Actual implementations in enterprise environments may vary significantly based on specific product requirements, legacy constraints, and evolving technologies.
