Week 4 Preview: Caching — Beyond "Just Add Redis"
System Design Mastery Series
What's Coming in Week 4
Last week, you mastered messaging and async processing. You learned how to reliably move data between services, handle failures gracefully, and maintain audit trails.
This week, we tackle the most deceptively simple topic in distributed systems: caching.
THE CACHING PARADOX
Everyone knows: "Just add Redis, it'll be fast!"
Reality:
┌─────────────────────────────────────────────────────────────────┐
│ │
│ "There are only two hard things in Computer Science: │
│ cache invalidation and naming things." │
│ │
│ — Phil Karlton │
│ │
└─────────────────────────────────────────────────────────────────┘
What makes caching hard:
- When do you invalidate? Too early = wasted cache. Too late = stale data.
- What do you cache? Everything = memory explosion. Nothing = no benefit.
- How do you handle cache miss storms? 1 miss = 1 DB hit. 10K misses = DB crash.
- How do you stay consistent? User updates profile, sees old data. Confused.
Week 4 teaches you to cache strategically, not reflexively.
Weekly Goal
Cache strategically, not reflexively. Understand invalidation, consistency, and thundering herds.
By the end of this week, you'll be able to:
- Choose the right caching pattern for any scenario
- Design invalidation strategies that actually work
- Prevent thundering herd without complex locking
- Build multi-tier caching architectures
- Know when NOT to cache (sometimes the answer is don't!)
The Week at a Glance
WEEK 4: CACHING — BEYOND "JUST ADD REDIS"
┌────────────────────────────────────────────────────────────────────────┐
│ │
│ Day 1: Caching Patterns │
│ ════════════════════════ │
│ Cache-aside vs Read-through vs Write-through vs Write-behind │
│ When to use each? What are the trade-offs? │
│ System: E-commerce product pages │
│ │
│ Day 2: Invalidation Strategies │
│ ═══════════════════════════════ │
│ TTL-based vs Event-driven vs Versioned keys │
│ The hardest problem in CS — how to actually solve it │
│ System: Product catalog with inventory + pricing │
│ │
│ Day 3: Thundering Herd │
│ ══════════════════════ │
│ What causes it? Locking, probabilistic expiration, coalescing │
│ Protecting your database from cache miss storms │
│ System: Homepage with 50K requests/second │
│ │
│ Day 4: Feed Caching │
│ ═══════════════════ │
│ Cache-per-user vs shared cache │
│ Push-on-write vs Pull-on-read — the celebrity problem │
│ System: Social media feed for 10M users │
│ │
│ Day 5: Multi-Tier Caching │
│ ═════════════════════════ │
│ CDN → API Gateway → App → Database │
│ What belongs at each layer? How to invalidate across tiers? │
│ System: API serving mobile + web, auth + anonymous │
│ │
│ Capstone: Complete Caching Architecture │
│ ═══════════════════════════════════════ │
│ Design caching for a high-traffic e-commerce platform │
│ Apply all 5 days to build a production-ready solution │
│ │
└────────────────────────────────────────────────────────────────────────┘
Key Concepts You'll Master
1. Caching Patterns (Day 1)
THE FOUR CACHING PATTERNS
┌────────────────────────────────────────────────────────────────────────┐
│ CACHE-ASIDE (Lazy Loading) │
│ │
│ Application ──┬──▶ Cache ──▶ Hit? Return │
│ │ │
│              └──▶ Miss? ──▶ Database ──▶ Write to Cache ──▶ Return  │
│ │
│ Pros: Only caches what's needed, cache failure doesn't break app │
│ Cons: First request always slow, potential inconsistency │
│ Use when: Read-heavy, can tolerate stale data │
└────────────────────────────────────────────────────────────────────────┘
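Here is a minimal cache-aside sketch in Python. It assumes a local Redis reached through the redis-py client, a hypothetical fetch_product_from_db helper, and a 5-minute TTL; none of these are prescribed by the pattern itself.

import json
import redis  # redis-py client

redis_client = redis.Redis()
PRODUCT_TTL = 300  # seconds

def get_product(product_id):
    key = f"product:{product_id}"
    cached = redis_client.get(key)
    if cached is not None:                        # hit: serve straight from cache
        return json.loads(cached)
    product = fetch_product_from_db(product_id)   # miss: read the source of truth
    redis_client.set(key, json.dumps(product), ex=PRODUCT_TTL)
    return product

Because the application owns the lookup, a Redis outage can degrade to plain database reads instead of breaking the request path.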
┌────────────────────────────────────────────────────────────────────────┐
│ READ-THROUGH │
│ │
│ Application ──▶ Cache ──▶ Miss? Cache fetches from DB │
│ └──▶ Hit? Return │
│ │
│ Pros: Simpler app logic, cache handles loading │
│ Cons: Cache becomes a dependency │
│ Use when: Want to simplify application code │
└────────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ WRITE-THROUGH │
│ │
│ Application ──▶ Cache ──▶ Write to DB synchronously │
│ └──▶ Return after both complete │
│ │
│ Pros: Cache always consistent with DB │
│ Cons: Write latency includes cache + DB │
│ Use when: Consistency is critical │
└────────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ WRITE-BEHIND (Write-Back) │
│ │
│ Application ──▶ Cache ──▶ Return immediately │
│ └──▶ Async write to DB later │
│ │
│ Pros: Lowest write latency, can batch writes │
│ Cons: Data loss risk if cache fails before DB write │
│ Use when: Performance critical, can accept some data loss risk │
└────────────────────────────────────────────────────────────────────────┘
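For contrast, here is a sketch of the write path in an application-managed approximation of write-through: the database and the cache are updated together, and the call returns only after both succeed. db.save is a placeholder for your persistence layer; the client and TTL are the same assumptions as in the cache-aside sketch.

import json
import redis

redis_client = redis.Redis()
PRODUCT_TTL = 300

def update_product(product_id, product):
    db.save(product_id, product)                  # write the source of truth first
    redis_client.set(f"product:{product_id}",     # then refresh the cache entry
                     json.dumps(product), ex=PRODUCT_TTL)
    return product                                # return only after both writes complete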
2. Invalidation Strategies (Day 2)
THE INVALIDATION PROBLEM
You cache product price = $99
Database updates to $79 (sale!)
Customer sees $99 in cache
Buys product, expects $99
Actually charged $79
Customer confused (but happy?)
Wait, flip it:
Cache shows $79 (old sale price)
Database now $99 (sale ended)
Customer expects $79
Actually charged $99
Customer angry, support ticket, refund
INVALIDATION STRATEGIES:
1. TIME-TO-LIVE (TTL)
cache.set("product:123", data, ttl=300) # 5 minutes
✓ Simple to implement
✓ Self-healing (bad data expires)
✗ Stale for up to TTL duration
✗ All items expire at the same rate (inventory needs a much shorter TTL than descriptions)
2. EVENT-DRIVEN
When product updated → Publish event → Invalidate cache
✓ Near real-time consistency
✓ Only invalidate what changed
✗ Requires event infrastructure
✗ What if event is lost?
3. VERSIONED KEYS
cache.set(f"product:123:v{version}", data)
✓ Old and new can coexist
✓ Atomic switchover
✗ More complex key management
✗ Storage overhead
4. HYBRID (Reality)
Event-driven + short TTL as safety net
Best of both worlds
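As a rough sketch of the hybrid strategy: every write gets a short safety-net TTL, and an event handler deletes the exact key when a change arrives. The event payload shape and helper names are assumptions.

import json, redis

redis_client = redis.Redis()
SAFETY_TTL = 600  # staleness is bounded to 10 minutes even if an invalidation event is lost

def cache_product(product):
    redis_client.set(f"product:{product['id']}", json.dumps(product), ex=SAFETY_TTL)

def on_product_updated(event):
    # event-driven path: invalidate exactly the key that changed, near real time
    redis_client.delete(f"product:{event['product_id']}")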
3. Thundering Herd (Day 3)
THE THUNDERING HERD PROBLEM
Normal operation:
Cache has homepage data
50,000 requests/second served from cache
Database: relaxing, 0 load
Cache expires:
All 50,000 requests simultaneously hit database
Database: "I'm in danger"
Response time: 10ms → 10,000ms → timeout
Users: "Site is down!"
This is the THUNDERING HERD.
SOLUTIONS:
1. MUTEX/LOCKING
First request: Acquires lock, fetches from DB, updates cache
Other requests: Wait for lock, then read from cache
Problem: All requests waiting → high latency (see the lock sketch after this list)
2. PROBABILISTIC EARLY EXPIRATION
Don't expire at exact time
Random chance of refresh before expiration
    time_left = expires_at - current_time                 # seconds until the entry expires
    if time_left <= 0 or random() < (1.0 / time_left):    # chance grows as expiry approaches
        refresh_cache()
Some requests refresh early, preventing a stampede at the expiry moment
3. REQUEST COALESCING
Multiple requests for same key → single DB query
All waiters get the same result
import asyncio

pending_requests = {}  # key -> in-flight asyncio.Task

async def get_with_coalesce(key):
    if key in pending_requests:                # a fetch for this key is already running
        return await pending_requests[key]
    task = asyncio.create_task(fetch_from_db(key))
    pending_requests[key] = task
    try:
        return await task
    finally:
        del pending_requests[key]              # the next miss triggers a fresh fetch
4. BACKGROUND REFRESH
Never let cache expire
Background job refreshes before TTL
Cache always warm
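Here is the locking idea from solution 1 as a sketch: Redis SET with NX and EX acts as a best-effort mutex with a timeout, so only one request rebuilds the page while the rest briefly retry the cache. Key names, TTLs, and build_homepage_from_db are illustrative assumptions.

import json, time, redis

redis_client = redis.Redis()

def get_homepage():
    cached = redis_client.get("homepage")
    if cached is not None:
        return json.loads(cached)
    # only one request wins the lock; it auto-expires after 10s if the holder dies
    if redis_client.set("lock:homepage", "1", nx=True, ex=10):
        data = build_homepage_from_db()
        redis_client.set("homepage", json.dumps(data), ex=60)
        redis_client.delete("lock:homepage")
        return data
    time.sleep(0.05)          # losers wait briefly, then retry the cache
    return get_homepage()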
4. Feed Caching (Day 4)
THE FEED PROBLEM
10 million users
Each has personalized feed
User posts → appears in all followers' feeds
APPROACH 1: PULL ON READ (Compute on request)
User opens app:
1. Get user's following list
2. Fetch recent posts from each followed user
3. Merge, sort, return
✓ Storage efficient
✓ Always fresh
✗ Slow for users following many accounts
✗ High compute at read time
APPROACH 2: PUSH ON WRITE (Pre-compute feeds)
User posts:
1. Get user's followers list
2. Push post to each follower's feed cache
✓ Fast reads (just fetch cached feed)
✓ Predictable read latency
✗ Write amplification (1 post → 1M cache writes for celebrities)
✗ Storage heavy
THE CELEBRITY PROBLEM:
Kim Kardashian posts
→ 300 million followers
→ Push to 300 million feed caches?
→ That's 300 million writes for 1 post!
SOLUTION: HYBRID
Regular users (< 10K followers): Push on write
Celebrities (> 10K followers): Pull on read
When building feed:
1. Fetch pre-computed feed (pushed posts)
2. Merge with celebrity posts (pulled on demand)
3. Return combined feed
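A sketch of that hybrid read path, assuming a per-user feed list pushed into Redis on write, plus hypothetical helpers get_following, follower_count, and recent_posts_by for the pulled celebrity posts:

import json, redis

redis_client = redis.Redis()
CELEBRITY_THRESHOLD = 10_000

def get_feed(user_id, limit=50):
    # 1. pre-computed part: posts pushed on write by regular accounts
    pushed = [json.loads(p) for p in redis_client.lrange(f"feed:{user_id}", 0, limit - 1)]
    # 2. pulled part: recent posts from followed celebrity accounts
    celebs = [f for f in get_following(user_id) if follower_count(f) > CELEBRITY_THRESHOLD]
    pulled = [post for c in celebs for post in recent_posts_by(c)]
    # 3. merge, newest first
    return sorted(pushed + pulled, key=lambda p: p["created_at"], reverse=True)[:limit]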
5. Multi-Tier Caching (Day 5)
CACHING LAYERS
┌─────────────────────────────────────────┐
│ │
User ──────────▶ │ CDN (Edge) │
│ TTL: 60s, Static assets, Public API │
│ │
└────────────────┬────────────────────────┘
│ Cache miss
▼
┌─────────────────────────────────────────┐
│ │
│ API Gateway │
│ Rate limiting, Auth cache │
│ │
└────────────────┬────────────────────────┘
│
▼
┌─────────────────────────────────────────┐
│ │
│ Application Cache (Redis) │
│ TTL: 5-60min, Business objects │
│ │
└────────────────┬────────────────────────┘
│ Cache miss
▼
┌─────────────────────────────────────────┐
│ │
│ Database Query Cache │
│ TTL: Seconds, Query results │
│ │
└────────────────┬────────────────────────┘
│ Cache miss
▼
┌─────────────────────────────────────────┐
│ │
│ Database │
│ The source of truth │
│ │
└─────────────────────────────────────────┘
WHAT GOES WHERE?
CDN:
✓ Static assets (images, CSS, JS)
✓ Public API responses (product listings)
✗ Personalized content
✗ Authenticated endpoints (be careful!)
API Gateway:
✓ Auth token validation
✓ Rate limit counters
✗ Business data
Application (Redis):
✓ Session data
✓ User profiles
✓ Product details
✓ Computed aggregates
Database:
✓ Query result cache
✓ Buffer pool
✗ Rely on this alone
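Much of "what goes where" at the CDN tier comes down to response headers. A sketch with hypothetical endpoints: public catalog responses get a short shared max-age, while personalized responses are marked private so shared caches never store them.

def catalog_cache_headers():
    # public data: the CDN may cache for 60s and briefly serve stale while revalidating
    return {"Cache-Control": "public, max-age=60, stale-while-revalidate=30"}

def profile_cache_headers():
    # personalized data: only the user's own client may cache it, and it must revalidate
    return {"Cache-Control": "private, max-age=0"}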
Systems You'll Design
System 1: Product Catalog Cache (Days 1-2)
E-COMMERCE PRODUCT CACHING
Challenge:
- 1 million products
- 100,000 requests/second
- Prices change daily
- Inventory changes every second
- Users must see accurate stock
Key decisions:
- Different TTLs for different data?
- Event-driven for inventory?
- How to handle flash sales?
System 2: Homepage Cache (Day 3)
HIGH-TRAFFIC HOMEPAGE
Challenge:
- 50,000 requests/second
- Personalized for some users
- Changes hourly (featured products)
- Cache miss = database overwhelmed
Key decisions:
- How to prevent thundering herd?
- Cache per user or shared?
- Background refresh strategy?
System 3: Social Feed Cache (Day 4)
PERSONALIZED FEED SYSTEM
Challenge:
- 10 million users
- Each follows 0-1000 accounts
- Some accounts have millions of followers
- Feed must feel "real-time"
Key decisions:
- Push vs pull vs hybrid?
- How to handle celebrities?
- Cache per user or compute on read?
System 4: Multi-Tier API Cache (Day 5)
COMPLETE API CACHING STRATEGY
Challenge:
- Mobile + Web clients
- Authenticated + Anonymous users
- CDN in front
- Multiple data types with different freshness needs
Key decisions:
- What to cache at CDN vs App?
- How to handle authenticated caching?
- Cache invalidation across all tiers?
How Week 3 Connects to Week 4
BUILDING ON WEEK 3 CONCEPTS
Week 3: Messaging                Week 4: Caching
───────────────────────────      ─────────────────────────────
Transactional Outbox       ────▶ Event-driven invalidation
(Guaranteed delivery)            (Cache updates via events)

Backpressure               ────▶ Thundering herd protection
(Handle overload)                (Prevent cache miss storms)

Dead Letter Queue          ────▶ Cache failure handling
(Handle failures)                (What if Redis goes down?)

Kafka Consumer Groups      ────▶ Cache invalidation consumers
(Multiple subscribers)           (Services listening for changes)
EXAMPLE: Product Update Flow
1. Admin updates product price (Week 2: Idempotency)
2. Update written to DB + Outbox (Week 3: Transactional Outbox)
3. Event published to Kafka (Week 3: Queue vs Stream)
4. Cache Invalidation Consumer receives event (Week 3: Consumer Groups)
5. Cache entry invalidated or updated (Week 4: Invalidation)
6. CDN cache purged via API (Week 4: Multi-tier)
7. Next request gets fresh data (Week 4: Cache-aside)
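A sketch of steps 4 and 5 as a small consumer. This assumes a kafka-python style client and an event payload containing product_id; the topic name, consumer group, and payload shape are assumptions, not Week 3's exact setup.

import json, redis
from kafka import KafkaConsumer  # kafka-python

redis_client = redis.Redis()
consumer = KafkaConsumer("product-updates",
                         group_id="cache-invalidation",
                         value_deserializer=lambda v: json.loads(v))

for message in consumer:
    product_id = message.value["product_id"]
    redis_client.delete(f"product:{product_id}")  # step 5: drop the stale entry
    # step 6 (CDN purge) would call the CDN provider's purge API here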
Prepare for Week 4
Pre-Reading (Optional but Helpful)
- Redis Documentation — Commands: GET, SET, SETEX, DEL, KEYS (https://redis.io/commands/)
- CDN Basics — How edge caching works (Cloudflare or AWS CloudFront documentation)
- Phil Karlton's Quote — Why cache invalidation is hard (https://martinfowler.com/bliki/TwoHardThings.html)
Questions to Ponder
Before starting Week 4, think about:
- In your current systems, what's cached?
  - How do you know when to invalidate?
  - Have you ever served stale data to users?
- What would happen if your cache went down?
  - Would your database survive the load?
  - How quickly could you recover?
- How do you handle cache warming?
  - After a deploy, is the cache cold?
  - How long until performance is normal?
Week 4 Success Metrics
By the end of Week 4, you should be able to:
- Explain cache-aside, read-through, write-through, and write-behind patterns
- Design invalidation strategies for different consistency requirements
- Prevent thundering herd with at least 3 different techniques
- Choose between push-on-write and pull-on-read for feed systems
- Design multi-tier caching with clear responsibilities per tier
- Know when NOT to cache (and explain why)
- Estimate cache hit ratios and memory requirements
- Handle cache failures gracefully
A Taste of What's Coming
Here's a problem you'll solve in Day 3:
THE BLACK FRIDAY PROBLEM
It's 12:00:00 AM on Black Friday.
Your homepage cache TTL is 60 seconds.
You have 100,000 users waiting to refresh.
At 12:00:00, the cache expires.
At 12:00:01, 100,000 requests hit your database.
At 12:00:02, your database is on fire.
How do you prevent this?
Day 3 will teach you:
1. Probabilistic early expiration
2. Request coalescing
3. Background refresh
4. Mutex with timeout
5. Cache warming strategies
You'll implement each one and know when to use which.
Get Ready
Week 4 is where performance engineering meets reliability engineering. Caching seems simple until you've been paged at 3 AM because a cache stampede took down your database.
After this week, you'll never think "just add Redis" again. You'll think:
- What's the invalidation strategy?
- What's the thundering herd protection?
- What's the failure mode?
- What's the consistency guarantee?
- Should we even cache this?
See you in Week 4.
Coming up: Day 1 will start with caching patterns. We'll design a product catalog cache for an e-commerce platform, handling the trade-offs between freshness and performance.