Himanshu Kukreja

Week 4 Preview: Caching — Beyond "Just Add Redis"

System Design Mastery Series


What's Coming in Week 4

Last week, you mastered messaging and async processing. You learned how to reliably move data between services, handle failures gracefully, and maintain audit trails.

This week, we tackle the most deceptively simple topic in distributed systems: caching.

THE CACHING PARADOX

Everyone knows:  "Just add Redis, it'll be fast!"

Reality:
  ┌─────────────────────────────────────────────────────────────────┐
  │                                                                 │
  │  "There are only two hard things in Computer Science:           │
  │   cache invalidation and naming things."                        │
  │                                                                 │
  │                                    — Phil Karlton               │
  │                                                                 │
  └─────────────────────────────────────────────────────────────────┘

What makes caching hard:
  - When do you invalidate? Too early = wasted cache. Too late = stale data.
  - What do you cache? Everything = memory explosion. Nothing = no benefit.
  - How do you handle cache miss storms? 1 miss = 1 DB hit. 10K misses = DB crash.
  - How do you stay consistent? User updates profile, sees old data. Confused.
  
Week 4 teaches you to cache strategically, not reflexively.

Weekly Goal

Cache strategically, not reflexively. Understand invalidation, consistency, and thundering herds.

By the end of this week, you'll be able to:

  • Choose the right caching pattern for any scenario
  • Design invalidation strategies that actually work
  • Prevent thundering herd without complex locking
  • Build multi-tier caching architectures
  • Know when NOT to cache (sometimes the answer is don't!)

The Week at a Glance

WEEK 4: CACHING — BEYOND "JUST ADD REDIS"

┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│  Day 1: Caching Patterns                                               │
│  ════════════════════════                                              │
│  Cache-aside vs Read-through vs Write-through vs Write-behind          │
│  When to use each? What are the trade-offs?                            │
│  System: E-commerce product pages                                      │
│                                                                        │
│  Day 2: Invalidation Strategies                                        │
│  ═══════════════════════════════                                       │
│  TTL-based vs Event-driven vs Versioned keys                           │
│  The hardest problem in CS — how to actually solve it                  │
│  System: Product catalog with inventory + pricing                      │
│                                                                        │
│  Day 3: Thundering Herd                                                │
│  ══════════════════════                                                │
│  What causes it? Locking, probabilistic expiration, coalescing         │
│  Protecting your database from cache miss storms                       │
│  System: Homepage with 50K requests/second                             │
│                                                                        │
│  Day 4: Feed Caching                                                   │
│  ═══════════════════                                                   │
│  Cache-per-user vs shared cache                                        │
│  Push-on-write vs Pull-on-read — the celebrity problem                 │
│  System: Social media feed for 10M users                               │
│                                                                        │
│  Day 5: Multi-Tier Caching                                             │
│  ═════════════════════════                                             │
│  CDN → API Gateway → App → Database                                    │
│  What belongs at each layer? How to invalidate across tiers?           │
│  System: API serving mobile + web, auth + anonymous                    │
│                                                                        │
│  Capstone: Complete Caching Architecture                               │
│  ═══════════════════════════════════════                               │
│  Design caching for a high-traffic e-commerce platform                 │
│  Apply all 5 days to build a production-ready solution                 │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

Key Concepts You'll Master

1. Caching Patterns (Day 1)

THE FOUR CACHING PATTERNS

┌────────────────────────────────────────────────────────────────────────┐
│  CACHE-ASIDE (Lazy Loading)                                            │
│                                                                        │
│  Application ──┬──▶ Cache ──▶ Hit? Return                              │
│                │                                                       │
│                └──▶ Miss? ──▶ Database ──▶ Write to Cache ──▶ Return   │
│                                                                        │
│  Pros: Only caches what's needed, cache failure doesn't break app      │
│  Cons: First request always slow, potential inconsistency              │
│  Use when: Read-heavy, can tolerate stale data                         │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│  READ-THROUGH                                                          │
│                                                                        │
│  Application ──▶ Cache ──▶ Miss? Cache fetches from DB                 │
│                       └──▶ Hit? Return                                 │
│                                                                        │
│  Pros: Simpler app logic, cache handles loading                        │
│  Cons: Cache becomes a dependency                                      │
│  Use when: Want to simplify application code                           │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│  WRITE-THROUGH                                                         │
│                                                                        │
│  Application ──▶ Cache ──▶ Write to DB synchronously                   │
│                       └──▶ Return after both complete                  │
│                                                                        │
│  Pros: Cache always consistent with DB                                 │
│  Cons: Write latency includes cache + DB                               │
│  Use when: Consistency is critical                                     │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│  WRITE-BEHIND (Write-Back)                                             │
│                                                                        │
│  Application ──▶ Cache ──▶ Return immediately                          │
│                       └──▶ Async write to DB later                     │
│                                                                        │
│  Pros: Lowest write latency, can batch writes                          │
│  Cons: Data loss risk if cache fails before DB write                   │
│  Use when: Performance critical, can accept some data loss risk        │
└────────────────────────────────────────────────────────────────────────┘
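
Cache-aside is the pattern you'll reach for most often, so here it is in code. A minimal sketch, assuming a redis-py style client; `load_product_from_db` is a hypothetical loader used for illustration:

```python
import json

# Minimal cache-aside sketch. `cache` is assumed to be a redis-py style
# client; `load_product_from_db` is a hypothetical DB loader.
def get_product(cache, product_id, load_product_from_db, ttl=300):
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:          # hit: serve straight from cache
        return json.loads(cached)

    product = load_product_from_db(product_id)   # miss: go to the database
    cache.set(key, json.dumps(product), ex=ttl)  # populate for later readers
    return product
```

Note the two properties from the box above: only requested products ever enter the cache, and if `cache.get` were wrapped in a try/except, a cache outage would degrade to DB reads rather than break the app.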

2. Invalidation Strategies (Day 2)

THE INVALIDATION PROBLEM

You cache product price = $99
Database updates to $79 (sale!)
Customer sees $99 in cache
Buys product, expects $99
Actually charged $79
Customer confused (but happy?)

Wait, flip it:
Cache shows $79 (old sale price)
Database now $99 (sale ended)
Customer expects $79
Actually charged $99
Customer angry, support ticket, refund

INVALIDATION STRATEGIES:

1. TIME-TO-LIVE (TTL)
   cache.set("product:123", data, ttl=300)  # 5 minutes
   
   ✓ Simple to implement
   ✓ Self-healing (bad data expires)
   ✗ Stale for up to TTL duration
   ✗ All items expire at same rate (inventory vs descriptions)

2. EVENT-DRIVEN
   When product updated → Publish event → Invalidate cache
   
   ✓ Near real-time consistency
   ✓ Only invalidate what changed
   ✗ Requires event infrastructure
   ✗ What if event is lost?

3. VERSIONED KEYS
   cache.set(f"product:123:v{version}", data)
   
   ✓ Old and new can coexist
   ✓ Atomic switchover
   ✗ More complex key management
   ✗ Storage overhead

4. HYBRID (Reality)
   Event-driven + short TTL as safety net
   Best of both worlds
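
Strategy 3 (versioned keys) deserves a concrete sketch, because the "atomic switchover" is the subtle part: readers follow a small version pointer, writers publish under a new key and then flip the pointer. This is a sketch assuming a redis-py style client; the key layout is illustrative, not a standard:

```python
import json

# Versioned-key sketch: readers resolve a version pointer, writers publish
# the new value first and flip the pointer second, so readers see either
# the old version or the new one — never a half-written entry.
def read_product(cache, product_id):
    version = cache.get(f"product:{product_id}:version") or "0"
    data = cache.get(f"product:{product_id}:v{version}")
    return json.loads(data) if data else None

def write_product(cache, product_id, data):
    old = int(cache.get(f"product:{product_id}:version") or "0")
    new = old + 1
    cache.set(f"product:{product_id}:v{new}", json.dumps(data))
    cache.set(f"product:{product_id}:version", str(new))       # the switchover
    cache.expire(f"product:{product_id}:v{old}", 60)           # old copy ages out
```

The storage overhead noted above is visible here: old and new versions coexist until the `expire` fires.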

3. Thundering Herd (Day 3)

THE THUNDERING HERD PROBLEM

Normal operation:
  Cache has homepage data
  50,000 requests/second served from cache
  Database: relaxing, 0 load

Cache expires:
  All 50,000 requests simultaneously hit database
  Database: "I'm in danger" 
  Response time: 10ms → 10,000ms → timeout
  Users: "Site is down!"

This is the THUNDERING HERD.

SOLUTIONS:

1. MUTEX/LOCKING
   First request: Acquires lock, fetches from DB, updates cache
   Other requests: Wait for lock, then read from cache
   
   Problem: All requests waiting → high latency

2. PROBABILISTIC EARLY EXPIRATION
   Don't expire at exact time
   Random chance of refresh before expiration
   
   time_left = expiry_time - current_time
   if time_left > 0 and random() < (1.0 / time_left):  # odds rise as expiry nears
       refresh_cache()
   
   Some requests refresh early, preventing stampede

3. REQUEST COALESCING
   Multiple requests for same key → single DB query
   All waiters get the same result
   
   pending_requests = {}  # key -> in-flight asyncio.Task

   async def get_with_coalesce(key):
       if key in pending_requests:
           # Piggyback on the fetch already in flight for this key
           return await pending_requests[key]

       # Wrap in a Task so many callers can safely await the same fetch
       task = asyncio.ensure_future(fetch_from_db(key))
       pending_requests[key] = task
       try:
           return await task
       finally:
           del pending_requests[key]

4. BACKGROUND REFRESH
   Never let cache expire
   Background job refreshes before TTL
   Cache always warm
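
Solution 1 (mutex/locking) is the one most often asked about in interviews, so here is a sketch of it with a timeout, so a crashed lock holder can't wedge everyone. It assumes a redis-py style client (Redis `SET NX EX` as the lock); `rebuild_homepage` is a hypothetical loader:

```python
import json
import time

# Mutex sketch: exactly one request wins the lock and rebuilds the entry;
# the rest poll the cache briefly instead of stampeding the database.
# `cache` is assumed redis-py style; `rebuild_homepage` is hypothetical.
def get_homepage(cache, rebuild_homepage, ttl=60, lock_ttl=5):
    cached = cache.get("homepage")
    if cached is not None:
        return json.loads(cached)

    # SET NX EX: the lock auto-expires, so a crashed holder can't deadlock us.
    if cache.set("homepage:lock", "1", nx=True, ex=lock_ttl):
        try:
            data = rebuild_homepage()
            cache.set("homepage", json.dumps(data), ex=ttl)
            return data
        finally:
            cache.delete("homepage:lock")

    # Lost the race: wait briefly for the winner to fill the cache.
    for _ in range(50):
        time.sleep(0.1)
        cached = cache.get("homepage")
        if cached is not None:
            return json.loads(cached)
    return rebuild_homepage()  # gave up waiting; fall through to the DB
```

The polling loop is the cost called out above: every loser pays latency while the winner rebuilds. That's exactly the trade-off request coalescing (solution 3) removes.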

4. Feed Caching (Day 4)

THE FEED PROBLEM

10 million users
Each has personalized feed
User posts → appears in all followers' feeds

APPROACH 1: PULL ON READ (Compute on request)

User opens app:
  1. Get user's following list
  2. Fetch recent posts from each followed user
  3. Merge, sort, return

✓ Storage efficient
✓ Always fresh
✗ Slow for users following many accounts
✗ High compute at read time

APPROACH 2: PUSH ON WRITE (Pre-compute feeds)

User posts:
  1. Get user's followers list
  2. Push post to each follower's feed cache

✓ Fast reads (just fetch cached feed)
✓ Predictable read latency
✗ Write amplification (1 post → 1M cache writes for celebrities)
✗ Storage heavy

THE CELEBRITY PROBLEM:

Kim Kardashian posts
  → 300 million followers
  → Push to 300 million feed caches?
  → That's 300 million writes for 1 post!

SOLUTION: HYBRID

Regular users (< 10K followers): Push on write
Celebrities (≥ 10K followers): Pull on read

When building feed:
  1. Fetch pre-computed feed (pushed posts)
  2. Merge with celebrity posts (pulled on demand)
  3. Return combined feed
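
The three merge steps above can be sketched in a few lines. All names here are illustrative; posts are modeled as `(timestamp, post_id)` tuples, newest first:

```python
import heapq

# Hybrid feed sketch: the pre-computed (pushed) feed holds posts from
# regular accounts only; celebrity posts are pulled on demand and merged in.
def build_feed(pushed_feed, followed_celebrities, fetch_recent_posts, limit=50):
    # pushed_feed: this user's pre-computed feed cache (push-on-write path)
    # fetch_recent_posts(account): pull-on-read path for high-follower accounts
    pulled = []
    for celeb in followed_celebrities:
        pulled.extend(fetch_recent_posts(celeb))

    # Merge both sources by timestamp, newest first.
    merged = heapq.merge(
        sorted(pushed_feed, reverse=True),
        sorted(pulled, reverse=True),
        reverse=True,
    )
    return list(merged)[:limit]
```

The write-amplification math still holds: a regular user's post fans out to at most ~10K feed caches, while a celebrity post costs one read-time fetch per viewing follower instead of 300 million writes.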

5. Multi-Tier Caching (Day 5)

CACHING LAYERS

                   ┌─────────────────────────────────────────┐
                   │                                         │
  User ──────────▶ │  CDN (Edge)                             │
                   │  TTL: 60s, Static assets, Public API    │
                   │                                         │
                   └────────────────┬────────────────────────┘
                                    │ Cache miss
                                    ▼
                   ┌─────────────────────────────────────────┐
                   │                                         │
                   │  API Gateway                            │
                   │  Rate limiting, Auth cache              │
                   │                                         │
                   └────────────────┬────────────────────────┘
                                    │
                                    ▼
                   ┌─────────────────────────────────────────┐
                   │                                         │
                   │  Application Cache (Redis)              │
                   │  TTL: 5-60min, Business objects         │
                   │                                         │
                   └────────────────┬────────────────────────┘
                                    │ Cache miss
                                    ▼
                   ┌─────────────────────────────────────────┐
                   │                                         │
                   │  Database Query Cache                   │
                   │  TTL: Seconds, Query results            │
                   │                                         │
                   └────────────────┬────────────────────────┘
                                    │ Cache miss
                                    ▼
                   ┌─────────────────────────────────────────┐
                   │                                         │
                   │  Database                               │
                   │  The source of truth                    │
                   │                                         │
                   └─────────────────────────────────────────┘


WHAT GOES WHERE?

CDN:
  ✓ Static assets (images, CSS, JS)
  ✓ Public API responses (product listings)
  ✗ Personalized content
  ✗ Authenticated endpoints (be careful!)

API Gateway:
  ✓ Auth token validation
  ✓ Rate limit counters
  ✗ Business data

Application (Redis):
  ✓ Session data
  ✓ User profiles
  ✓ Product details
  ✓ Computed aggregates

Database:
  ✓ Query result cache
  ✓ Buffer pool
  ✗ Rely on this alone
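
Invalidating across these tiers on a write looks roughly like this. A sketch only: `redis_client` is assumed redis-py style, and `purge_cdn_path` stands in for a real CDN purge API call (Cloudflare and CloudFront both expose one) — it's a hypothetical name here:

```python
# Cross-tier invalidation sketch: when a product changes, purge every tier
# that may hold it, innermost first, so a CDN refill sees fresh data.
def invalidate_product(redis_client, purge_cdn_path, product_id):
    # 1. Application tier: drop the Redis entry so the next read repopulates.
    redis_client.delete(f"product:{product_id}")

    # 2. Edge tier: purge the public API path so the CDN re-fetches on miss.
    purge_cdn_path(f"/api/products/{product_id}")
```

Order matters: purging the CDN first could let an edge node re-fetch through a still-stale application cache and re-pin the old value for another TTL.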

Systems You'll Design

System 1: Product Catalog Cache (Days 1-2)

E-COMMERCE PRODUCT CACHING

Challenge:
  - 1 million products
  - 100,000 requests/second
  - Prices change daily
  - Inventory changes every second
  - Users must see accurate stock

Key decisions:
  - Different TTLs for different data?
  - Event-driven for inventory?
  - How to handle flash sales?

System 2: Homepage Cache (Day 3)

HIGH-TRAFFIC HOMEPAGE

Challenge:
  - 50,000 requests/second
  - Personalized for some users
  - Changes hourly (featured products)
  - Cache miss = database overwhelmed

Key decisions:
  - How to prevent thundering herd?
  - Cache per user or shared?
  - Background refresh strategy?

System 3: Social Feed Cache (Day 4)

PERSONALIZED FEED SYSTEM

Challenge:
  - 10 million users
  - Each follows 0-1000 accounts
  - Some accounts have millions of followers
  - Feed must feel "real-time"

Key decisions:
  - Push vs pull vs hybrid?
  - How to handle celebrities?
  - Cache per user or compute on read?

System 4: Multi-Tier API Cache (Day 5)

COMPLETE API CACHING STRATEGY

Challenge:
  - Mobile + Web clients
  - Authenticated + Anonymous users
  - CDN in front
  - Multiple data types with different freshness needs

Key decisions:
  - What to cache at CDN vs App?
  - How to handle authenticated caching?
  - Cache invalidation across all tiers?

How Week 3 Connects to Week 4

BUILDING ON WEEK 3 CONCEPTS

Week 3: Messaging              Week 4: Caching
───────────────────────────    ─────────────────────────────

Transactional Outbox     ────▶  Event-driven invalidation
(Guaranteed delivery)           (Cache updates via events)

Backpressure             ────▶  Thundering herd protection
(Handle overload)               (Prevent cache miss storms)

Dead Letter Queue        ────▶  Cache failure handling
(Handle failures)               (What if Redis goes down?)

Kafka Consumer Groups    ────▶  Cache invalidation consumers
(Multiple subscribers)          (Services listening for changes)


EXAMPLE: Product Update Flow

1. Admin updates product price (Week 2: Idempotency)
2. Update written to DB + Outbox (Week 3: Transactional Outbox)
3. Event published to Kafka (Week 3: Queue vs Stream)
4. Cache Invalidation Consumer receives event (Week 3: Consumer Groups)
5. Cache entry invalidated or updated (Week 4: Invalidation)
6. CDN cache purged via API (Week 4: Multi-tier)
7. Next request gets fresh data (Week 4: Cache-aside)

Prepare for Week 4

Pre-Reading (Optional but Helpful)

  1. Redis Documentation — Commands: GET, SET, SETEX, DEL, KEYS https://redis.io/commands/

  2. CDN Basics — How edge caching works Cloudflare or AWS CloudFront documentation

  3. Phil Karlton's Quote — Why cache invalidation is hard https://martinfowler.com/bliki/TwoHardThings.html

Questions to Ponder

Before starting Week 4, think about:

  1. In your current systems, what's cached?

    • How do you know when to invalidate?
    • Have you ever served stale data to users?
  2. What would happen if your cache went down?

    • Would your database survive the load?
    • How quickly could you recover?
  3. How do you handle cache warming?

    • After a deploy, is the cache cold?
    • How long until performance is normal?

Week 4 Success Metrics

By the end of Week 4, you should be able to:

  • Explain cache-aside, read-through, write-through, and write-behind patterns
  • Design invalidation strategies for different consistency requirements
  • Prevent thundering herd with at least 3 different techniques
  • Choose between push-on-write and pull-on-read for feed systems
  • Design multi-tier caching with clear responsibilities per tier
  • Know when NOT to cache (and explain why)
  • Estimate cache hit ratios and memory requirements
  • Handle cache failures gracefully

A Taste of What's Coming

Here's a problem you'll solve in Day 3:

THE BLACK FRIDAY PROBLEM

It's 12:00:00 AM on Black Friday.
Your homepage cache TTL is 60 seconds.
You have 100,000 users waiting to refresh.

At 12:00:00, the cache expires.
At 12:00:01, 100,000 requests hit your database.
At 12:00:02, your database is on fire.

How do you prevent this?

Day 3 will teach you:
  1. Probabilistic early expiration
  2. Request coalescing
  3. Background refresh
  4. Mutex with timeout
  5. Cache warming strategies

You'll implement each one and know when to use which.

Get Ready

Week 4 is where performance engineering meets reliability engineering. Caching seems simple until you've been paged at 3 AM because a cache stampede took down your database.

After this week, you'll never think "just add Redis" again. You'll think:

  • What's the invalidation strategy?
  • What's the thundering herd protection?
  • What's the failure mode?
  • What's the consistency guarantee?
  • Should we even cache this?

See you in Week 4.



Coming up: Day 1 will start with caching patterns. We'll design a product catalog cache for an e-commerce platform, handling the trade-offs between freshness and performance.