Himanshu Kukreja

Week 4 — Day 5: Multi-Tier Caching

System Design Mastery Series


Preface

This week, you've learned caching patterns, invalidation strategies, thundering herd prevention, and feed caching. Each focused on a single cache layer.

But production systems don't have just one cache. They have many:

THE REAL WORLD: MULTIPLE CACHE LAYERS

User request: "Show me product #12345"

Layer 1: Browser Cache
  └─ Hit? Return cached page (0ms)
  └─ Miss? Continue...

Layer 2: CDN (Edge)
  └─ Hit? Return from edge server (20ms)
  └─ Miss? Continue...

Layer 3: API Gateway
  └─ Hit? Return cached response (5ms)
  └─ Miss? Continue...

Layer 4: Application Cache (Redis)
  └─ Hit? Return from Redis (2ms)
  └─ Miss? Continue...

Layer 5: Database Cache (Query Cache)
  └─ Hit? Return from buffer pool (10ms)
  └─ Miss? Query disk (50ms)

Each layer serves a purpose.
Each layer has different characteristics.
Each layer needs different invalidation strategies.

Today: How to design caching across ALL these layers.

This is where caching gets complex — and where the real performance wins happen.
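The layered lookup above can be sketched in a few lines. This is a minimal illustration, not how any particular browser, CDN, or gateway works: the `Tier` class, the `tiered_get` function, and the in-memory dicts standing in for each layer are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Tuple


class Tier:
    """One cache layer, modeled as a named in-memory dict (illustrative only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}


def tiered_get(
    key: str,
    tiers: List[Tier],
    load_from_source: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (value, name of the layer that answered)."""
    for depth, tier in enumerate(tiers):
        if key in tier.store:
            value = tier.store[key]
            # Backfill the outer layers that missed so the next request stops earlier.
            for outer in tiers[:depth]:
                outer.store[key] = value
            return value, tier.name
    # Every layer missed: load from the source of truth and fill all layers.
    value = load_from_source(key)
    for tier in tiers:
        tier.store[key] = value
    return value, "source"


tiers = [Tier("browser"), Tier("cdn"), Tier("redis")]
first = tiered_get("product:12345", tiers, lambda k: f"row for {k}")
second = tiered_get("product:12345", tiers, lambda k: f"row for {k}")
```

The first call falls through every layer and reaches the source; the second is answered by the outermost layer. Real layers differ in who controls them, which is why the rest of this lesson treats each one separately.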


Part I: Foundations

Chapter 1: The Caching Hierarchy

1.1 Understanding the Layers

CACHE LAYER HIERARCHY

                    Latency    Hit Rate   Capacity   Scope
                    ───────    ────────   ────────   ─────
┌──────────────┐
│   Browser    │    ~0ms      High       Small      Per user
│   Cache      │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│     CDN      │    10-50ms   High       Large      Global
│   (Edge)     │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ API Gateway  │    1-5ms     Medium     Medium     Regional
│   Cache      │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ Application  │    1-2ms     High       Large      Application
│ Cache (Redis)│
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Database    │    5-50ms    Medium     Medium     Database
│Query Cache   │
└──────┬───────┘
       │
       ▼
┌──────────────┐
│  Database    │    10-100ms  N/A        Large      Database
│   (Disk)     │
└──────────────┘

Moving down the hierarchy (general trends, not strict rules):
  - Total latency seen by the user increases (each miss adds another hop)
  - Data freshness increases
  - Capacity increases
  - Scope broadens
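To see how hit rates and latencies combine, here is a small worked calculation. It assumes a request pays the lookup latency of every layer it consults and stops at the first hit. The hit rates below are made-up illustrative numbers, and `expected_latency_ms` is a name introduced for this sketch.

```python
from typing import List, Tuple


def expected_latency_ms(layers: List[Tuple[str, float, float]]) -> float:
    """Average latency for layers given as (name, lookup latency in ms, hit rate).

    The last layer should have hit rate 1.0 (the source of truth always answers).
    """
    total = 0.0
    reach_probability = 1.0  # chance that a request gets as far as this layer
    for _name, latency_ms, hit_rate in layers:
        total += reach_probability * latency_ms
        reach_probability *= 1.0 - hit_rate
    return total


# Hypothetical numbers: 80% CDN hit rate, 90% Redis hit rate.
layers = [("cdn", 20.0, 0.80), ("redis", 2.0, 0.90), ("database", 50.0, 1.0)]
average = expected_latency_ms(layers)  # 20 + 0.2 * 2 + 0.2 * 0.1 * 50, roughly 21.4 ms
```

Under these assumptions only 2% of requests reach the database, so its 50ms contributes about 1ms to the average.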

1.2 What Belongs at Each Layer

CONTENT PLACEMENT BY LAYER

BROWSER CACHE
├── Static assets (JS, CSS, images)
├── User-specific preferences
├── Recently viewed items
└── API responses with Cache-Control headers

CDN (EDGE)
├── Static assets
├── Public API responses (product listings)
├── Marketing pages
├── Media files (images, videos)
└── NOT: Personalized content, authenticated responses

API GATEWAY
├── Rate limit counters
├── Authentication tokens (validation cache)
├── Common API responses
├── Request deduplication
└── NOT: User-specific data

APPLICATION (REDIS)
├── Session data
├── User profiles
├── Product details
├── Computed aggregates (counts, stats)
├── Feature flags
└── Everything that's read frequently

DATABASE
├── Query result cache
├── Buffer pool (automatic)
├── Materialized views
└── Prepared statement cache

1.3 Cache Characteristics by Layer

Layer         TTL Range         Invalidation              Consistency      Best For
───────────   ───────────────   ───────────────────────   ──────────────   ──────────────────────────
Browser       Minutes-Days      Headers, versioned URLs   Eventual         Static assets, preferences
CDN           Seconds-Hours     Purge API, TTL            Eventual         Public content, media
API Gateway   Seconds-Minutes   TTL, events               Eventual         Auth, rate limits
Application   Seconds-Hours     Events, TTL               Near real-time   Business data
Database      Automatic         Query invalidation        Strong           Query results

1.4 Key Terminology

Term                     Definition
──────────────────────   ─────────────────────────────────────────
Edge                     CDN servers geographically close to users
Origin                   Your actual servers (behind CDN)
Cache-Control            HTTP header controlling browser/CDN caching
Vary                     HTTP header for cache key variation
Purge                    Explicitly removing content from CDN
Stale-while-revalidate   Serve stale, refresh async
Cache key                Unique identifier for cached content
Hit ratio                Percentage of requests served from cache

Chapter 2: Browser Caching

2.1 HTTP Cache Headers

HTTP CACHE-CONTROL DIRECTIVES

Response headers that control browser caching:

Cache-Control: max-age=31536000, immutable
  └─ Cache for 1 year, never revalidate

Cache-Control: no-cache
  └─ Cache, but revalidate before using

Cache-Control: no-store
  └─ Don't cache at all (sensitive data)

Cache-Control: private, max-age=3600
  └─ Only browser can cache (not CDN), 1 hour

Cache-Control: public, max-age=86400
  └─ Anyone can cache (browser, CDN), 24 hours

Cache-Control: max-age=0, must-revalidate
  └─ Always check with server before using

Cache-Control: stale-while-revalidate=60
  └─ Can serve stale for 60s while refreshing
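A cache has to parse these directives before acting on them. The sketch below shows one simple way to do that, along with the check a shared cache such as a CDN might apply. `parse_cache_control` and `shared_cache_may_store` are hypothetical helpers, and the parser does not handle quoted values that contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, separator, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if separator else None
    return directives


def shared_cache_may_store(header: str) -> bool:
    """A shared cache (CDN, gateway) must skip 'private' and 'no-store' responses."""
    directives = parse_cache_control(header)
    return "no-store" not in directives and "private" not in directives


public_asset = parse_cache_control("public, max-age=86400")
profile_ok_for_cdn = shared_cache_may_store("private, max-age=3600")
```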

2.2 Versioned Assets (Cache Busting)

CACHE BUSTING STRATEGIES

Problem:
  User has cached old JavaScript
  You deploy new JavaScript
  User still sees old version!

Solution 1: Query string versioning
  /app.js?v=1.2.3
  └─ Change version, URL changes, cache misses

Solution 2: Filename hashing (recommended)
  /app.abc123.js
  └─ Hash of content in filename
  └─ Content changes = filename changes
  └─ Old files can stay cached (no conflict)

Solution 3: Path versioning
  /v2/app.js
  └─ New version = new path
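Filename hashing (Solution 2) is usually done by a build tool, but the idea fits in a few lines. This sketch assumes a short SHA-256 prefix is embedded before the file extension. `hashed_name` and the 8-character digest length are illustrative choices, not a standard.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Return the filename with a content digest inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old_name = hashed_name("app.js", b"console.log('v1');")
new_name = hashed_name("app.js", b"console.log('v2');")
```

Because the name changes whenever the content changes, the file can be served with a one-year `max-age` and `immutable` without risking stale JavaScript.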

Implementation:

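# Cache headers in FastAPI
# (fetch_product, fetch_profile, fetch_payment_methods and get_current_user
#  are application helpers assumed to be defined elsewhere)

from fastapi import Depends, FastAPI, Response
from fastapi.responses import FileResponse
import hashlib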

app = FastAPI()


# Static assets: Long cache with content hash
@app.get("/static/{filename}")
async def get_static(filename: str):
    file_path = f"static/{filename}"
    
    # Generate ETag from content
    with open(file_path, 'rb') as f:
        content_hash = hashlib.md5(f.read()).hexdigest()
    
    return FileResponse(
        file_path,
        headers={
            "Cache-Control": "public, max-age=31536000, immutable",
            "ETag": f'"{content_hash}"'
        }
    )


# API responses: Short cache with revalidation
@app.get("/api/products/{product_id}")
async def get_product(product_id: str, response: Response):
    product = await fetch_product(product_id)
    
    # Cache for 5 minutes, allow stale for 1 minute while revalidating
    response.headers["Cache-Control"] = "public, max-age=300, stale-while-revalidate=60"
    response.headers["ETag"] = f'"{product["version"]}"'
    
    return product


# Personalized content: Private cache only
@app.get("/api/me/profile")
async def get_my_profile(response: Response, user = Depends(get_current_user)):
    profile = await fetch_profile(user.id)
    
    # Only browser can cache, not CDN
    response.headers["Cache-Control"] = "private, max-age=60"
    
    return profile


# Sensitive data: No caching
@app.get("/api/me/payment-methods")
async def get_payment_methods(response: Response):
    # Never cache payment info
    response.headers["Cache-Control"] = "no-store"
    
    return await fetch_payment_methods()

2.3 Conditional Requests (Revalidation)

CONDITIONAL REQUEST FLOW

First request:
  Client → GET /api/product/123
  Server → 200 OK
           ETag: "abc123"
           Cache-Control: max-age=300

After 5 minutes (cache expired):
  Client → GET /api/product/123
           If-None-Match: "abc123"
  
  If unchanged:
    Server → 304 Not Modified (no body!)
    Client uses cached version
  
  If changed:
    Server → 200 OK
             ETag: "def456"
             (full response body)

Benefit: Saves bandwidth when content unchanged
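The server side of this flow reduces to one comparison. The framework-free sketch below derives the ETag from a hash of the body, which is an assumption made for the example (the FastAPI code above uses a version field instead). `make_etag` and `respond` are hypothetical names.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a short content hash."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    """If-None-Match uses weak comparison, so drop any W/ prefix."""
    return tag[2:] if tag.startswith("W/") else tag


def respond(
    body: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET with optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=300"}
    if if_none_match is not None:
        candidates = [_strip_weak(t.strip()) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # client copy is still valid: no body
    return 200, headers, body


status1, headers1, _ = respond(b'{"price": 99}')
status2, _, body2 = respond(b'{"price": 99}', if_none_match=headers1["ETag"])
status3, _, body3 = respond(b'{"price": 79}', if_none_match=headers1["ETag"])
```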

Chapter 3: CDN Caching

3.1 How CDNs Work

CDN ARCHITECTURE

                         ┌─────────────────────────────┐
                         │         Your Origin         │
                         │         (servers)           │
                         └─────────────┬───────────────┘
                                       │
                    ┌──────────────────┼──────────────────┐
                    │                  │                  │
                    ▼                  ▼                  ▼
            ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
            │  CDN Edge     │  │  CDN Edge     │  │  CDN Edge     │
            │  US-East      │  │  EU-West      │  │  Asia-Pacific │
            └───────┬───────┘  └───────┬───────┘  └───────┬───────┘
                    │                  │                  │
            ┌───────┴───────┐  ┌───────┴───────┐  ┌───────┴───────┐
            │               │  │               │  │               │
            ▼               ▼  ▼               ▼  ▼               ▼
         [Users]         [Users]           [Users]           [Users]
         New York        London            Tokyo             Sydney


Request flow:
  1. User in Tokyo requests image
  2. DNS routes to nearest edge (Asia-Pacific)
  3. Edge checks cache:
     - HIT: Return immediately (20ms)
     - MISS: Fetch from origin, cache, return (200ms first time)
  4. Next Tokyo user gets cached version (20ms)
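The request flow at a single edge can be modeled as a TTL cache in front of an origin fetch. This is a toy model: `EdgeCache` is a hypothetical class, the origin is a plain function, and the clock is injectable so the example does not depend on real time.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one CDN edge: serve from cache until the TTL expires."""

    def __init__(
        self,
        fetch_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch_origin = fetch_origin
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}  # url -> (expires_at, body)

    def get(self, url: str) -> Tuple[str, bytes]:
        now = self.clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return "HIT", entry[1]
        body = self.fetch_origin(url)  # slow path: go back to the origin
        self._entries[url] = (now + self.ttl_seconds, body)
        return "MISS", body


now = [0.0]          # fake clock, advanced manually
origin_calls = []    # records every trip to the origin


def fetch(url: str) -> bytes:
    origin_calls.append(url)
    return b"image-bytes"


edge = EdgeCache(fetch, ttl_seconds=60, clock=lambda: now[0])
first = edge.get("/media/cat.jpg")   # first Tokyo user
second = edge.get("/media/cat.jpg")  # next Tokyo user
now[0] = 61.0
third = edge.get("/media/cat.jpg")   # after the TTL has passed
```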

3.2 CDN Cache Configuration

# CDN configuration example (CloudFront-style)

CDN_CACHE_BEHAVIORS = {
    # Static assets: Aggressive caching
    "/static/*": {
        "ttl_default": 86400 * 365,  # 1 year
        "ttl_min": 86400,
        "ttl_max": 86400 * 365,
        "compress": True,
        "forward_headers": [],  # Don't vary on headers
        "forward_query_strings": False,
        "cache_methods": ["GET", "HEAD"],
    },
    
    # API - Public endpoints: Moderate caching
    "/api/products/*": {
        "ttl_default": 60,  # 1 minute
        "ttl_min": 0,
        "ttl_max": 3600,
        "compress": True,
        "forward_headers": ["Accept", "Accept-Language"],
        "forward_query_strings": True,  # Different params = different cache
        "cache_methods": ["GET"],
    },
    
    # API - Authenticated: No CDN caching
    "/api/me/*": {
        "ttl_default": 0,
        "ttl_min": 0,
        "ttl_max": 0,
        "forward_headers": ["Authorization", "Cookie"],
        "cache_methods": [],  # Don't cache
    },
    
    # Media: Long cache with purge capability
    "/media/*": {
        "ttl_default": 86400 * 7,  # 1 week
        "ttl_min": 3600,
        "ttl_max": 86400 * 30,
        "compress": False,  # Already compressed (images, video)
        "forward_headers": [],
        "forward_query_strings": False,
    },
}
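A configuration like this needs a rule for picking the behavior that applies to a given path. The sketch below assumes that the first matching glob pattern wins and that unmatched paths fall back to a default behavior. `resolve_behavior`, the trimmed `BEHAVIORS` table, and `DEFAULT_BEHAVIOR` are illustrative; check your CDN's documentation for its actual precedence rules.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict

# Trimmed-down behaviors; dicts preserve insertion order, so order is precedence.
BEHAVIORS: Dict[str, Dict[str, Any]] = {
    "/static/*": {"ttl_default": 86400 * 365},
    "/api/me/*": {"ttl_default": 0},
    "/api/products/*": {"ttl_default": 60},
}
DEFAULT_BEHAVIOR: Dict[str, Any] = {"ttl_default": 0}  # assumed fallback: no caching


def resolve_behavior(
    path: str,
    behaviors: Dict[str, Dict[str, Any]] = BEHAVIORS,
    default: Dict[str, Any] = DEFAULT_BEHAVIOR,
) -> Dict[str, Any]:
    """Return the behavior of the first pattern that matches the path."""
    for pattern, behavior in behaviors.items():
        if fnmatchcase(path, pattern):
            return behavior
    return default


static_ttl = resolve_behavior("/static/app.abc123.js")["ttl_default"]
product_ttl = resolve_behavior("/api/products/12345")["ttl_default"]
```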

3.3 CDN Cache Keys and Vary

CACHE KEY COMPONENTS

Default cache key:
  URL + Query String
  
  /api/products?category=electronics&sort=price
  └─ Cached separately from:
  /api/products?category=electronics&sort=name


Vary header adds dimensions:

  Vary: Accept-Language
  └─ /api/products (Accept-Language: en) ≠ /api/products (Accept-Language: es)

  Vary: Accept-Encoding  
  └─ Gzipped and non-gzipped cached separately

  Vary: Cookie
  └─ DANGEROUS! Every user gets different cache = no caching benefit


EXAMPLE: Language-aware caching

Response:
  Cache-Control: public, max-age=3600
  Vary: Accept-Language
  Content-Language: en

CDN caches separately for:
  - Accept-Language: en
  - Accept-Language: es
  - Accept-Language: fr
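One way to picture a cache key is as a hash over the path, the query string, and the request headers named in `Vary`. The sketch below sorts query parameters before hashing, which is a normalization added for this example rather than default CDN behavior. `cache_key` is a hypothetical helper.

```python
import hashlib
from typing import Dict, Iterable
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str, request_headers: Dict[str, str], vary: Iterable[str] = ()) -> str:
    """Hash path + normalized query + the request headers listed in Vary."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied = [
        f"{name.lower()}={lowered.get(name.lower(), '')}"
        for name in sorted(vary, key=str.lower)
    ]
    raw = "|".join([parts.path, query, *varied])
    return hashlib.sha256(raw.encode()).hexdigest()


by_price = cache_key("/api/products?category=electronics&sort=price", {})
by_name = cache_key("/api/products?category=electronics&sort=name", {})
english = cache_key("/api/products", {"Accept-Language": "en"}, vary=["Accept-Language"])
spanish = cache_key("/api/products", {"Accept-Language": "es"}, vary=["Accept-Language"])
```

Every header added to `Vary` multiplies the number of cached variants, which is why `Vary: Cookie` effectively disables shared caching.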

3.4 CDN Invalidation (Purging)

# CDN Purge Implementation

import httpx
from typing import List


class CDNPurgeService:
    """
    Service to invalidate CDN cache.
    
    Different CDN providers have different APIs.
    This is a generic implementation.
    """
    
    def __init__(self, cdn_api_url: str, api_key: str):
        self.cdn_api_url = cdn_api_url
        self.api_key = api_key
        self.client = httpx.AsyncClient()
    
    async def purge_url(self, url: str) -> bool:
        """Purge a specific URL from CDN."""
        response = await self.client.post(
            f"{self.cdn_api_url}/purge",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"urls": [url]}
        )
        return response.status_code == 200
    
    async def purge_pattern(self, pattern: str) -> bool:
        """Purge URLs matching pattern (e.g., /products/*)."""
        response = await self.client.post(
            f"{self.cdn_api_url}/purge",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"pattern": pattern}
        )
        return response.status_code == 200
    
    async def purge_tag(self, tag: str) -> bool:
        """Purge all URLs with a specific cache tag."""
        response = await self.client.post(
            f"{self.cdn_api_url}/purge",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"tag": tag}
        )
        return response.status_code == 200


# Integration with product updates
class ProductService:
    def __init__(self, db, cache, cdn_purge: CDNPurgeService):
        self.db = db
        self.cache = cache
        self.cdn = cdn_purge
    
    async def update_product(self, product_id: str, data: dict) -> dict:
        # Update database
        product = await self.db.update_product(product_id, data)
        
        # Invalidate application cache
        await self.cache.delete(f"product:{product_id}")
        
        # Purge CDN
        await self.cdn.purge_url(f"/api/products/{product_id}")
        await self.cdn.purge_pattern(f"/api/products?*")  # List pages
        
        return product

Chapter 4: API Gateway Caching

4.1 What to Cache at the Gateway

API GATEWAY CACHE USE CASES

1. AUTHENTICATION TOKEN VALIDATION
   - Validate JWT/session token
   - Cache validation result (short TTL)
   - Avoid hitting auth service every request

2. RATE LIMIT COUNTERS
   - Track requests per API key
   - Must be fast (every request checks)
   - Shared across gateway instances

3. RESPONSE CACHING
   - Cache entire API responses
   - Based on URL + headers
   - Short TTL for dynamic content

4. REQUEST DEDUPLICATION
   - Multiple identical requests in flight
   - Collapse into single backend request
   - Similar to thundering herd protection
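Request deduplication (use case 4) can be sketched with plain `asyncio`: the first caller for a key starts the backend call, and later callers await the same task. `SingleFlight` is a hypothetical class name, and this in-process version only collapses requests within one gateway instance.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class SingleFlight:
    """Collapse concurrent calls with the same key into one backend call."""

    def __init__(self) -> None:
        self._in_flight: Dict[str, "asyncio.Task[Any]"] = {}

    async def do(self, key: str, fn: Callable[[], Awaitable[Any]]) -> Any:
        task = self._in_flight.get(key)
        if task is None:
            # First caller: start the backend call and register it.
            task = asyncio.ensure_future(fn())
            self._in_flight[key] = task
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # shield() keeps one cancelled caller from cancelling the shared call.
        return await asyncio.shield(task)


async def demo() -> tuple:
    calls = 0

    async def backend() -> dict:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulate a slow upstream service
        return {"id": "12345"}

    flight = SingleFlight()
    results = await asyncio.gather(
        *[flight.do("GET /api/products/12345", backend) for _ in range(5)]
    )
    return calls, results


calls, results = asyncio.run(demo())
```

Five concurrent requests produce a single backend call. Once it completes, the key is removed, so the next request triggers a fresh call.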

4.2 Gateway Response Caching

# API Gateway Caching Implementation

from fastapi import FastAPI, Request, Response
from fastapi.responses import JSONResponse
from typing import Optional
import hashlib
import json


class GatewayCache:
    """
    API Gateway response cache.
    
    Caches responses based on:
    - HTTP method
    - URL path
    - Query parameters
    - Selected headers (Accept, Accept-Language)
    """
    
    def __init__(self, redis_client, default_ttl: int = 60):
        self.redis = redis_client
        self.default_ttl = default_ttl
        
        # Paths that should be cached
        self.cacheable_paths = {
            "/api/products": 60,
            "/api/categories": 300,
            "/api/config": 3600,
        }
        
        # Headers that affect cache key
        self.vary_headers = ["Accept", "Accept-Language", "Accept-Encoding"]
    
    def _build_cache_key(self, request: Request) -> str:
        """Build cache key from request."""
        parts = [
            request.method,
            str(request.url.path),
            str(sorted(request.query_params.items())),
        ]
        
        # Add varied headers
        for header in self.vary_headers:
            value = request.headers.get(header, "")
            parts.append(f"{header}:{value}")
        
        key_string = "|".join(parts)
        return f"gateway_cache:{hashlib.sha256(key_string.encode()).hexdigest()}"
    
    def _is_cacheable(self, request: Request) -> bool:
        """Check if request is cacheable."""
        if request.method not in ("GET", "HEAD"):
            return False
        
        # Check if path matches cacheable patterns
        path = request.url.path
        for pattern in self.cacheable_paths:
            if path.startswith(pattern):
                return True
        
        return False
    
    def _get_ttl(self, request: Request) -> int:
        """Get TTL for request."""
        path = request.url.path
        for pattern, ttl in self.cacheable_paths.items():
            if path.startswith(pattern):
                return ttl
        return self.default_ttl
    
    async def get_cached_response(self, request: Request) -> Optional[Response]:
        """Try to get cached response."""
        if not self._is_cacheable(request):
            return None
        
        cache_key = self._build_cache_key(request)
        cached = await self.redis.get(cache_key)
        
        if cached:
            data = json.loads(cached)
            return JSONResponse(
                content=data["body"],
                status_code=data["status_code"],
                headers={
                    **data["headers"],
                    "X-Cache": "HIT",
                    # Expose the first 16 hex chars of the digest, not the "gateway_cache:" prefix
                    "X-Cache-Key": cache_key.rsplit(":", 1)[-1][:16]
                }
            )
        
        return None
    
    async def cache_response(
        self,
        request: Request,
        response_body: dict,
        status_code: int,
        headers: dict
    ):
        """Cache a response."""
        if not self._is_cacheable(request):
            return
        
        if status_code != 200:
            return
        
        cache_key = self._build_cache_key(request)
        ttl = self._get_ttl(request)
        
        data = {
            "body": response_body,
            "status_code": status_code,
            "headers": {k: v for k, v in headers.items() if k.lower() not in ("content-length", "transfer-encoding")}
        }
        
        await self.redis.setex(cache_key, ttl, json.dumps(data))


# Middleware for automatic caching
class CacheMiddleware:
    def __init__(self, app, cache: GatewayCache):
        self.app = app
        self.cache = cache
    
    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        
        request = Request(scope, receive)
        
        # Try cache
        cached_response = await self.cache.get_cached_response(request)
        if cached_response:
            await cached_response(scope, receive, send)
            return
        
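        # Not cached - forward to app.
        # This sketch only serves hits: storing the response via
        # cache.cache_response() would require wrapping `send` to capture the body.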
        await self.app(scope, receive, send)

4.3 Authentication Cache

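# Auth Token Validation Cache

import hashlib
import json
import logging
from typing import Optional

logger = logging.getLogger(__name__)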

class AuthCache:
    """
    Cache authentication token validation results.
    
    Reduces load on auth service by caching valid tokens.
    Short TTL ensures revoked tokens stop working quickly.
    """
    
    def __init__(self, redis_client, auth_service, ttl: int = 60):
        self.redis = redis_client
        self.auth_service = auth_service
        self.ttl = ttl
    
    async def validate_token(self, token: str) -> Optional[dict]:
        """
        Validate token with caching.
        
        Returns user info if valid, None if invalid.
        """
        cache_key = f"auth:{self._hash_token(token)}"
        
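        # Check cache (redis-py returns bytes unless decode_responses=True)
        cached = await self.redis.get(cache_key)
        if cached:
            if isinstance(cached, bytes):
                cached = cached.decode()
            if cached == "INVALID":
                return None
            return json.loads(cached)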
        
        # Validate with auth service
        try:
            user = await self.auth_service.validate(token)
            
            if user:
                # Cache valid result
                await self.redis.setex(cache_key, self.ttl, json.dumps(user))
                return user
            else:
                # Cache invalid result (shorter TTL)
                await self.redis.setex(cache_key, 10, "INVALID")
                return None
                
        except Exception as e:
            # Auth service error - don't cache the failure; re-raise so the
            # caller decides how to respond (the request is not let through)
            logger.error(f"Auth service error: {e}")
            raise
    
    def _hash_token(self, token: str) -> str:
        """Hash token for cache key (security)."""
        return hashlib.sha256(token.encode()).hexdigest()
    
    async def invalidate_token(self, token: str):
        """Invalidate a cached token (on logout/revoke)."""
        cache_key = f"auth:{self._hash_token(token)}"
        await self.redis.delete(cache_key)
    
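    async def invalidate_user_tokens(self, user_id: str):
        """Invalidate all tokens for a user."""
        # validate_token() stores keys as auth:{token_hash}, so this pattern only
        # matches if tokens are additionally indexed under auth:user:{user_id}:*.
        # Without such an index, rely on TTL expiry.
        pattern = f"auth:user:{user_id}:*"
        # SCAN iterates incrementally; KEYS would block Redis on a large keyspace
        keys = [key async for key in self.redis.scan_iter(match=pattern)]
        if keys:
            await self.redis.delete(*keys)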

Chapter 5: Application Cache (Redis)

5.1 Application Cache Patterns

We covered this extensively in Days 1-4. Quick recap:

APPLICATION CACHE PATTERNS

CACHE-ASIDE (Most common)
  Read: Check cache → Miss → Load DB → Store cache → Return
  Write: Update DB → Invalidate cache

COMPUTED VALUES
  Store pre-computed aggregates
  - User's follower count
  - Product's average rating
  - Dashboard statistics

SESSION DATA
  User sessions with sliding expiration
  Shopping carts
  Temporary wizard state

FEATURE FLAGS
  Configuration that rarely changes
  A/B test assignments
  Gradual rollout percentages

RATE LIMITING
  Request counts per window
  User quotas
  API usage tracking
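As a reminder of the cache-aside read and write paths, here is a minimal in-memory sketch. Plain dicts stand in for both Redis and the database, and `CacheAsideRepository` is a hypothetical name. A real implementation would add TTLs and the stampede protection covered earlier this week.

```python
from typing import Dict, Optional


class CacheAsideRepository:
    """Cache-aside over two dicts: one plays the database, one plays Redis."""

    def __init__(self, db: Dict[str, dict]) -> None:
        self.db = db
        self.cache: Dict[str, dict] = {}
        self.db_reads = 0

    def get(self, key: str) -> Optional[dict]:
        # Read: check cache -> miss -> load DB -> store in cache -> return
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def update(self, key: str, value: dict) -> None:
        # Write: update DB, then invalidate (not update) the cached copy
        self.db[key] = value
        self.cache.pop(key, None)


repo = CacheAsideRepository({"product:12345": {"price": 99}})
repo.get("product:12345")            # miss: loads from the database
repo.get("product:12345")            # hit: no database read
repo.update("product:12345", {"price": 79})
after_update = repo.get("product:12345")  # miss again, sees the new price
```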

5.2 Layered Cache Keys

# Structured cache keys for multi-tier systems

class CacheKeyBuilder:
    """
    Build structured cache keys for multi-tier caching.
    
    Key format: {prefix}:{version}:{entity}:{id}:{variant}
    
    Example: app:v2:product:12345:en
    """
    
    def __init__(self, prefix: str = "app", version: str = "v1"):
        self.prefix = prefix
        self.version = version
    
    def product(self, product_id: str, locale: str = None) -> str:
        key = f"{self.prefix}:{self.version}:product:{product_id}"
        if locale:
            key += f":{locale}"
        return key
    
    def product_list(self, category: str, page: int, locale: str = None) -> str:
        key = f"{self.prefix}:{self.version}:products:{category}:page{page}"
        if locale:
            key += f":{locale}"
        return key
    
    def user_profile(self, user_id: str) -> str:
        return f"{self.prefix}:{self.version}:user:{user_id}:profile"
    
    def user_feed(self, user_id: str) -> str:
        return f"{self.prefix}:{self.version}:feed:{user_id}"
    
    def session(self, session_id: str) -> str:
        return f"{self.prefix}:session:{session_id}"  # No version - survives deploys
    
    def rate_limit(self, key: str, window: str) -> str:
        return f"{self.prefix}:ratelimit:{key}:{window}"
    
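    def invalidation_pattern(self, entity: str, id: str = "*") -> str:
        """Pattern for bulk invalidation of variant keys.

        Matches suffixed keys such as app:v1:product:12345:en, but not the
        bare key app:v1:product:12345; delete that one explicitly.
        """
        return f"{self.prefix}:{self.version}:{entity}:{id}:*"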


# Usage
keys = CacheKeyBuilder(prefix="myapp", version="v3")

# After schema change, bump version to v4
# All v3 keys become orphaned and expire naturally (provided they were written with a TTL)
# No explicit invalidation needed!
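The effect of a version bump can be shown with a plain dict standing in for Redis. `versioned_key` is a hypothetical stand-alone helper that follows the same key format as the builder above.

```python
from typing import Dict


def versioned_key(version: str, entity: str, entity_id: str, prefix: str = "myapp") -> str:
    """Build a key of the form prefix:version:entity:id."""
    return f"{prefix}:{version}:{entity}:{entity_id}"


# Entry written by the old deployment (v3).
cache: Dict[str, dict] = {versioned_key("v3", "product", "12345"): {"price_cents": 9900}}

old_lookup = cache.get(versioned_key("v3", "product", "12345"))  # still readable by v3 code
new_lookup = cache.get(versioned_key("v4", "product", "12345"))  # v4 code never sees it
```

The new deployment starts with a cold cache for every versioned key, so expect a short spike in database load right after the bump.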

Chapter 6: Multi-Tier Invalidation

6.1 The Invalidation Challenge

MULTI-TIER INVALIDATION PROBLEM

Product price changes from $99 to $79:

Layer 1: Browser
  - Users have old price cached
  - Can't push invalidation to browsers!
  - Must wait for Cache-Control max-age

Layer 2: CDN
  - Multiple edge servers have old price
  - Must purge from all edges
  - Purge takes time to propagate

Layer 3: API Gateway
  - Response cache has old price
  - Must invalidate gateway cache

Layer 4: Application (Redis)
  - Product cache has old price
  - Must delete/update Redis key

Layer 5: Database
  - Query cache might have old price
  - Usually auto-invalidated on write


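Propagation order matters!
If CDN still has old price, users get stale data
even if application cache is fresh.
Invalidate from the innermost layer outward (Redis → gateway → CDN),
so that an outer layer refilling after its purge reads fresh data.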
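A rough way to reason about the cost of skipping explicit invalidation is to add up TTLs. The sketch assumes each layer starts its own TTL clock when it fills from the layer below (ignoring the HTTP `Age` header, which can shorten this for HTTP caches) and that a purged layer contributes nothing. The Redis TTL of 300s is a made-up value, and `worst_case_staleness` is a hypothetical helper.

```python
from typing import Dict, Iterable


def worst_case_staleness(ttl_seconds: Dict[str, int], purged: Iterable[str] = ()) -> int:
    """Upper bound on how long a user can see the old value, in seconds."""
    purged_layers = set(purged)
    return sum(ttl for layer, ttl in ttl_seconds.items() if layer not in purged_layers)


# Browser max-age=300, CDN and gateway TTL 60s, Redis TTL 300s (assumed).
ttls = {"browser": 300, "cdn": 60, "gateway": 60, "redis": 300}

ttl_only = worst_case_staleness(ttls)                                         # 720
with_purges = worst_case_staleness(ttls, purged=["cdn", "gateway", "redis"])  # 300
```

Even with every server-side layer purged, the browser's `max-age` remains. That makes it the number to keep small for data like prices.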

6.2 Invalidation Strategies by Layer

# Multi-Tier Invalidation Service

from dataclasses import dataclass
from typing import List, Optional
import asyncio


@dataclass
class InvalidationResult:
    """Result of multi-tier invalidation."""
    success: bool
    browser_invalidated: bool  # Can only suggest via headers
    cdn_invalidated: bool
    gateway_invalidated: bool
    app_cache_invalidated: bool
    errors: List[str]


class MultiTierInvalidationService:
    """
    Coordinates cache invalidation across all tiers.
    
    Order of invalidation:
    1. Application cache (Redis) - fastest
    2. API Gateway cache
    3. CDN - slowest to propagate
    4. Browser - can only influence future requests
    """
    
    def __init__(
        self,
        redis_client,
        gateway_cache,
        cdn_service,
        event_publisher
    ):
        self.redis = redis_client
        self.gateway = gateway_cache
        self.cdn = cdn_service
        self.events = event_publisher
    
    async def invalidate_product(
        self,
        product_id: str,
        eager_cdn_purge: bool = True
    ) -> InvalidationResult:
        """
        Invalidate a product across all cache tiers.
        """
        errors = []
        
        # 1. Application cache (immediate)
        app_success = await self._invalidate_app_cache(product_id, errors)
        
        # 2. API Gateway cache
        gateway_success = await self._invalidate_gateway_cache(product_id, errors)
        
        # 3. CDN
        cdn_success = True
        if eager_cdn_purge:
            cdn_success = await self._invalidate_cdn(product_id, errors)
        else:
            # Queue CDN purge for background processing.
            # cdn_success stays True here: the purge is only queued, not confirmed.
            await self.events.publish("cdn-purge-queue", {
                "type": "product",
                "id": product_id
            })
        
        return InvalidationResult(
            success=app_success and gateway_success and cdn_success,
            browser_invalidated=False,  # Can't force browser
            cdn_invalidated=cdn_success,
            gateway_invalidated=gateway_success,
            app_cache_invalidated=app_success,
            errors=errors
        )
    
    async def _invalidate_app_cache(self, product_id: str, errors: List[str]) -> bool:
        """Invalidate Redis cache."""
        try:
            keys_to_delete = [
                f"product:{product_id}",
                f"product_page:{product_id}",
                f"product:{product_id}:*",  # Pattern for variants
            ]
            
            pipe = self.redis.pipeline()
            for key in keys_to_delete:
                if '*' in key:
                    # Pattern delete - scan and delete
                    cursor = 0
                    while True:
                        cursor, keys = await self.redis.scan(cursor, match=key, count=100)
                        if keys:
                            pipe.delete(*keys)
                        if cursor == 0:
                            break
                else:
                    pipe.delete(key)
            
            await pipe.execute()
            return True
            
        except Exception as e:
            errors.append(f"Redis invalidation failed: {e}")
            return False
    
    async def _invalidate_gateway_cache(self, product_id: str, errors: List[str]) -> bool:
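        """Invalidate API Gateway cache.

        Assumes the gateway cache exposes invalidate_pattern(). The GatewayCache
        shown earlier hashes its keys, so it would need a path-to-key index
        to support pattern invalidation.
        """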
        try:
            patterns = [
                f"/api/products/{product_id}",
                f"/api/products?*",  # List endpoints
            ]
            
            for pattern in patterns:
                await self.gateway.invalidate_pattern(pattern)
            
            return True
            
        except Exception as e:
            errors.append(f"Gateway invalidation failed: {e}")
            return False
    
    async def _invalidate_cdn(self, product_id: str, errors: List[str]) -> bool:
        """Invalidate CDN cache."""
        try:
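            urls = [
                f"/api/products/{product_id}",
                f"/products/{product_id}",  # Product page
            ]
            # Wildcard paths go through purge_pattern rather than purge_url
            pattern = f"/api/products/{product_id}/*"
            targets = urls + [pattern]
            
            results = await asyncio.gather(
                *[self.cdn.purge_url(url) for url in urls],
                self.cdn.purge_pattern(pattern),
                return_exceptions=True
            )
            
            # purge_* return False on a non-200 response, so check for that
            # as well as for raised exceptions
            for target, result in zip(targets, results):
                if isinstance(result, Exception):
                    errors.append(f"CDN purge failed for {target}: {result}")
                elif not result:
                    errors.append(f"CDN purge rejected for {target}")
            
            return all(result is True for result in results)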
            
        except Exception as e:
            errors.append(f"CDN invalidation failed: {e}")
            return False
    
    async def invalidate_bulk(
        self,
        entity_type: str,
        entity_ids: List[str]
    ) -> InvalidationResult:
        """
        Bulk invalidation for multiple entities.
        
        More efficient than individual invalidations.
        """
        errors = []
        
        # 1. Application cache - bulk delete
        try:
            pipe = self.redis.pipeline()
            for entity_id in entity_ids:
                pipe.delete(f"{entity_type}:{entity_id}")
            await pipe.execute()
            app_success = True
        except Exception as e:
            errors.append(f"Bulk Redis invalidation failed: {e}")
            app_success = False
        
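        # 2. Gateway - invalidate pattern
        # (naive pluralization: "category" yields /api/categorys/*; map entity
        #  types to their real route prefixes in production code)
        try:
            await self.gateway.invalidate_pattern(f"/api/{entity_type}s/*")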
            gateway_success = True
        except Exception as e:
            errors.append(f"Gateway invalidation failed: {e}")
            gateway_success = False
        
        # 3. CDN - use cache tags if available
        # (purges everything tagged with the entity type, not only entity_ids)
        try:
            await self.cdn.purge_tag(entity_type)
            cdn_success = True
        except Exception as e:
            errors.append(f"CDN invalidation failed: {e}")
            cdn_success = False
        
        return InvalidationResult(
            success=app_success and gateway_success and cdn_success,
            browser_invalidated=False,
            cdn_invalidated=cdn_success,
            gateway_invalidated=gateway_success,
            app_cache_invalidated=app_success,
            errors=errors
        )

6.3 Event-Driven Multi-Tier Invalidation

# Event-Driven Invalidation System

import json
import logging

logger = logging.getLogger(__name__)

class InvalidationEventHandler:
    """
    Handles invalidation events for all cache tiers.
    
    Listens to data change events and coordinates
    cache invalidation across tiers.
    """
    
    def __init__(self, invalidation_service: MultiTierInvalidationService):
        self.invalidation = invalidation_service
    
    async def handle_event(self, event: dict):
        """Route event to appropriate handler."""
        event_type = event.get("type")
        
        handlers = {
            "product.updated": self._handle_product_updated,
            # No dedicated delete handler is defined; a deletion needs the same full invalidation
            "product.deleted": self._handle_product_updated,
            "product.price_changed": self._handle_price_change,
            "category.updated": self._handle_category_updated,
            "inventory.updated": self._handle_inventory_updated,
            "user.updated": self._handle_user_updated,
            "global.invalidate": self._handle_global_invalidate,
        }
        
        handler = handlers.get(event_type)
        if handler:
            await handler(event)
        else:
            logger.warning(f"Unknown invalidation event type: {event_type}")
    
    async def _handle_product_updated(self, event: dict):
        """Handle product update - full invalidation."""
        product_id = event["product_id"]
        
        result = await self.invalidation.invalidate_product(
            product_id,
            eager_cdn_purge=True
        )
        
        if not result.success:
            logger.error(f"Product invalidation failed: {result.errors}")
    
    async def _handle_price_change(self, event: dict):
        """Handle price change - priority invalidation."""
        product_id = event["product_id"]
        
        # Price changes are high priority - invalidate everywhere immediately
        result = await self.invalidation.invalidate_product(
            product_id,
            eager_cdn_purge=True
        )
        if not result.success:
            logger.error(f"Price invalidation failed: {result.errors}")
        
        # Also invalidate any deals/promotions pages
        await self.invalidation.invalidate_bulk(
            "deals_page",
            ["current", "featured"]
        )
    
    async def _handle_inventory_updated(self, event: dict):
        """Handle inventory update - partial invalidation."""
        product_id = event["product_id"]
        
        # Inventory is usually cached with short TTL
        # Just invalidate application cache, CDN will expire quickly
        await self.invalidation._invalidate_app_cache(product_id, [])
    
    async def _handle_category_updated(self, event: dict):
        """Handle category update - affects multiple products."""
        category_id = event["category_id"]
        
        # Invalidate category cache
        await self.invalidation.invalidate_bulk(
            "category",
            [category_id]
        )
        
        # Invalidate product listings in this category
        await self.invalidation.gateway.invalidate_pattern(
            f"/api/products?category={category_id}*"
        )
        await self.invalidation.cdn.purge_pattern(
            f"/api/products?category={category_id}*"
        )
    
    async def _handle_user_updated(self, event: dict):
        """Handle user update - private cache only."""
        user_id = event["user_id"]
        
        # User data is private - only in app cache
        await self.invalidation.redis.delete(f"user:{user_id}")
        await self.invalidation.redis.delete(f"user:{user_id}:profile")
    
    async def _handle_global_invalidate(self, event: dict):
        """Handle global invalidation - nuclear option."""
        entity_type = event.get("entity_type", "*")
        
        logger.warning(f"Global invalidation triggered for {entity_type}")
        
        # Bump cache version (makes all keys invalid)
        await self.invalidation.redis.incr("cache_version")
        
        # Purge CDN entirely for this entity type
        await self.invalidation.cdn.purge_pattern(f"/api/{entity_type}*")


# Consumer that processes invalidation events
async def run_invalidation_consumer(kafka_consumer, handler: InvalidationEventHandler):
    """Run the invalidation event consumer."""
    async for message in kafka_consumer:
        try:
            event = json.loads(message.value.decode())
            await handler.handle_event(event)
            await kafka_consumer.commit()
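        except Exception as e:
            # The offset is not committed here, but the next successful commit
            # moves past this message, so a failed invalidation is dropped
            # unless it is retried or routed to a dead-letter topic.
            logger.error(f"Failed to process invalidation event: {e}")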
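
The global-invalidation handler above bumps a `cache_version` counter in Redis. That only invalidates anything if every read and write builds its key from the current counter value. A minimal sketch of that idea follows; `FakeRedis` and `versioned_key` are hypothetical names, and the in-memory client stands in for a real async Redis client so the example runs on its own.

```python
import asyncio


class FakeRedis:
    """In-memory stand-in for an async Redis client (illustration only)."""

    def __init__(self):
        self.store = {}

    async def get(self, key):
        return self.store.get(key)

    async def set(self, key, value):
        self.store[key] = value

    async def incr(self, key):
        self.store[key] = int(self.store.get(key, 0)) + 1
        return self.store[key]


async def versioned_key(redis, entity: str, entity_id: str) -> str:
    """Prefix the key with the current global cache version."""
    version = await redis.get("cache_version") or 0
    return f"v{version}:{entity}:{entity_id}"


async def demo():
    redis = FakeRedis()

    # Write under the current version (v0)
    key_before = await versioned_key(redis, "product", "123")
    await redis.set(key_before, {"name": "Widget"})

    # Global invalidation: bump the version, as the handler does
    await redis.incr("cache_version")

    # Readers now build a different key, so the old entry is never hit
    key_after = await versioned_key(redis, "product", "123")
    return key_before, key_after, await redis.get(key_after)


key_before, key_after, value_after = asyncio.run(demo())
```

The version lookup costs one extra round trip per key, so a real service would probably hold the version in process memory for a few seconds. Entries written under the old version are only reclaimed if they carry a TTL.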

Chapter 7: Cache Strategies by Request Type

7.1 Anonymous vs Authenticated Requests

CACHING STRATEGY BY AUTH STATUS

ANONYMOUS (Not logged in):
├── Browser: Cache static assets, marketing pages
├── CDN: Cache everything (product pages, listings)
├── Gateway: Cache API responses
├── App: Cache shared data (products, categories)
│
│  Key: All anonymous users see same content
│  Can cache aggressively at all layers

AUTHENTICATED (Logged in):
├── Browser: Cache static assets only
├── CDN: CAREFUL! Don't cache personalized content
├── Gateway: Cache auth validation, rate limits
├── App: Cache user-specific data
│
│  Key: Each user sees different content
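│  CDN caching requires careful Vary headers

# Auth-Aware Caching

from starlette.requests import Request  # assumption: Request here is Starlette's ASGI request class

class AuthAwareCacheMiddleware: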
    """
    Middleware that applies different caching strategies
    based on authentication status.
    """
    
    def __init__(self, app, cache_service):
        self.app = app
        self.cache = cache_service
    
    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        
        request = Request(scope, receive)
        is_authenticated = self._is_authenticated(request)
        
        # Set caching strategy based on auth
        if is_authenticated:
            await self._handle_authenticated(request, scope, receive, send)
        else:
            await self._handle_anonymous(request, scope, receive, send)
    
    def _is_authenticated(self, request: Request) -> bool:
        """Check if request is authenticated."""
        return (
            "Authorization" in request.headers or
            "session_id" in request.cookies
        )
    
    async def _handle_anonymous(self, request, scope, receive, send):
        """Handle anonymous request - aggressive caching."""
        # Try CDN-style caching
        cached = await self.cache.get_cached_response(request)
        if cached:
            await cached(scope, receive, send)
            return
        
        # Forward to app with caching headers
        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                # Replace cache headers for anonymous responses. Headers stay
                # a list of pairs so repeated headers (e.g. Set-Cookie) survive.
                headers = [
                    (name, value)
                    for name, value in message.get("headers", [])
                    if name.lower() not in (b"cache-control", b"vary")
                ]
                headers.append((b"cache-control", b"public, max-age=60"))
                headers.append((b"vary", b"Accept-Encoding"))
                message["headers"] = headers
            await send(message)
        
        await self.app(scope, receive, send_wrapper)
    
    async def _handle_authenticated(self, request, scope, receive, send):
        """Handle authenticated request - private caching."""
        
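        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                # Private cache only - no CDN. Keep headers as a list of
                # pairs so repeated headers (e.g. Set-Cookie) survive.
                headers = [
                    (name, value)
                    for name, value in message.get("headers", [])
                    if name.lower() not in (b"cache-control", b"vary")
                ]
                headers.append((b"cache-control", b"private, max-age=0, must-revalidate"))
                headers.append((b"vary", b"Authorization, Cookie"))
                message["headers"] = headers
            await send(message)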
        
        await self.app(scope, receive, send_wrapper)

7.2 Mobile vs Web Clients

CACHING STRATEGY BY CLIENT TYPE

WEB BROWSER:
├── Has robust caching support
├── Service Worker for offline
├── IndexedDB for structured data
├── Cache API for fine-grained control
│
│  Strategy: Use browser capabilities fully
│  Cache-Control with stale-while-revalidate

MOBILE APP (Native):
├── App-level cache (SQLite, Realm)
├── Network layer cache (OkHttp, URLCache)
├── May have offline requirements
│
│  Strategy: Server provides cache hints
│  App decides local caching strategy
│  ETag/Last-Modified for efficient sync

MOBILE WEB:
├── Limited storage
├── Unreliable network
├── High latency sensitivity
│
│  Strategy: Minimize payload
│  Aggressive server-side caching
│  Short TTLs with efficient revalidation
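
The ETag-based sync mentioned for native and mobile web clients can be sketched as below. This is an illustration under assumptions, not the series' implementation: `make_etag` and `respond` are hypothetical helpers, the ETag is derived from a hash of the JSON body, and `If-None-Match` is treated as a single value (real requests may carry a list or `*`).

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


def make_etag(payload: dict) -> str:
    """Derive an ETag from the serialized payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(payload: dict, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); 304 when the client copy is current."""
    etag = make_etag(payload)
    # no-cache: the client may store the response but must revalidate it
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        return 304, headers, b""  # empty body: client reuses its copy
    return 200, headers, json.dumps(payload, sort_keys=True).encode()


product = {"id": "12345", "name": "Widget", "price": 999}

# First request: no validator yet, full body is sent
status1, headers1, body1 = respond(product, None)

# Revalidation with the stored ETag: nothing changed, so no body
status2, _, body2 = respond(product, headers1["ETag"])

# After a price change the old ETag no longer matches
product["price"] = 899
status3, headers3, _ = respond(product, headers1["ETag"])
```

On an unreliable mobile network the 304 path is the win: the round trip still happens, but the payload does not.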

7.3 API Response Caching Decisions

# Response Caching Decision Tree

class ResponseCachePolicy:
    """
    Determines caching policy for API responses.
    """
    
    def get_cache_headers(
        self,
        request: Request,
        response_data: dict,
        endpoint: str
    ) -> dict:
        """Determine appropriate cache headers."""
        
        # Start with defaults
        headers = {
            "Cache-Control": "no-cache",
            "Vary": "Accept, Accept-Encoding"
        }
        
        # Check authentication
        if self._is_authenticated(request):
            # Authenticated: Private cache only
            headers["Cache-Control"] = "private, max-age=0"
            headers["Vary"] += ", Authorization"
            return headers
        
        # Anonymous: Apply endpoint-specific rules
        policy = self._get_endpoint_policy(endpoint)
        
        if policy["cacheable"]:
            directives = [
                "public",
                f"max-age={policy['max_age']}",
            ]
            
            if policy.get("stale_while_revalidate"):
                directives.append(
                    f"stale-while-revalidate={policy['stale_while_revalidate']}"
                )
            
            if policy.get("immutable"):
                directives.append("immutable")
            
            headers["Cache-Control"] = ", ".join(directives)
            
            # Add ETag for conditional requests
            if "version" in response_data:
                headers["ETag"] = f'"{response_data["version"]}"'
        
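        return headers
    
    def _is_authenticated(self, request: Request) -> bool:
        """Same check as the middleware: auth header or session cookie."""
        return (
            "Authorization" in request.headers or
            "session_id" in request.cookies
        )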
    
    def _get_endpoint_policy(self, endpoint: str) -> dict:
        """Get caching policy for endpoint."""
        policies = {
            "/api/products": {
                "cacheable": True,
                "max_age": 60,
                "stale_while_revalidate": 30,
            },
            "/api/products/{id}": {
                "cacheable": True,
                "max_age": 300,
                "stale_while_revalidate": 60,
            },
            "/api/categories": {
                "cacheable": True,
                "max_age": 3600,
                "stale_while_revalidate": 300,
            },
            "/api/search": {
                "cacheable": True,
                "max_age": 30,  # Short for search results
            },
            "/api/cart": {
                "cacheable": False,  # User-specific
            },
            "/api/checkout": {
                "cacheable": False,  # Sensitive
            },
        }
        
        # Match endpoint to policy
        for pattern, policy in policies.items():
            if self._matches_pattern(endpoint, pattern):
                return policy
        
        return {"cacheable": False}
    
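    def _matches_pattern(self, endpoint: str, pattern: str) -> bool:
        """Simple pattern matching for endpoints."""
        # Compare on the path only, ignoring any query string or trailing slash
        path = endpoint.split("?", 1)[0].rstrip("/")
        
        if "{" not in pattern:
            # Exact match: a prefix match would let "/api/products" shadow
            # the more specific "/api/products/{id}" policy
            return path == pattern
        
        # Handle patterns like /api/products/{id}
        pattern_parts = pattern.split("/")
        endpoint_parts = path.split("/")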
        
        if len(pattern_parts) != len(endpoint_parts):
            return False
        
        for p, e in zip(pattern_parts, endpoint_parts):
            if p.startswith("{") and p.endswith("}"):
                continue  # Wildcard match
            if p != e:
                return False
        
        return True

Part II: Production Implementation

Chapter 8: Complete Multi-Tier Cache System

# Complete Multi-Tier Cache System

import asyncio
import hashlib
import json
import logging
from dataclasses import dataclass
from typing import Dict, List, Optional, Any
from datetime import datetime
from enum import Enum

logger = logging.getLogger(__name__)


# =============================================================================
# Configuration
# =============================================================================

@dataclass
class MultiTierCacheConfig:
    """Configuration for multi-tier cache system."""
    
    # Application cache (Redis)
    app_cache_default_ttl: int = 300
    app_cache_max_ttl: int = 86400
    
    # Gateway cache
    gateway_cache_enabled: bool = True
    gateway_cache_ttl: int = 60
    
    # CDN
    cdn_enabled: bool = True
    cdn_default_ttl: int = 60
    cdn_purge_on_update: bool = True
    
    # Browser
    browser_cache_default: int = 0
    browser_cache_static: int = 31536000
    
    # Invalidation
    async_invalidation: bool = True
    invalidation_queue: str = "cache-invalidation"


class CacheLayer(Enum):
    BROWSER = "browser"
    CDN = "cdn"
    GATEWAY = "gateway"
    APPLICATION = "application"
    DATABASE = "database"


# =============================================================================
# Cache Layer Interfaces
# =============================================================================

class ICacheLayer:
    """Interface for cache layers."""
    
    async def get(self, key: str) -> Optional[Any]:
        raise NotImplementedError
    
    async def set(self, key: str, value: Any, ttl: int = None) -> bool:
        raise NotImplementedError
    
    async def delete(self, key: str) -> bool:
        raise NotImplementedError
    
    async def delete_pattern(self, pattern: str) -> int:
        raise NotImplementedError


class ApplicationCache(ICacheLayer):
    """Redis-based application cache."""
    
    def __init__(self, redis_client, config: MultiTierCacheConfig):
        self.redis = redis_client
        self.config = config
    
    async def get(self, key: str) -> Optional[Any]:
        try:
            value = await self.redis.get(key)
            return json.loads(value) if value else None
        except Exception as e:
            logger.warning(f"App cache get error: {e}")
            return None
    
    async def set(self, key: str, value: Any, ttl: int = None) -> bool:
        ttl = min(ttl or self.config.app_cache_default_ttl, self.config.app_cache_max_ttl)
        try:
            await self.redis.setex(key, ttl, json.dumps(value, default=str))
            return True
        except Exception as e:
            logger.warning(f"App cache set error: {e}")
            return False
    
    async def delete(self, key: str) -> bool:
        try:
            await self.redis.delete(key)
            return True
        except Exception as e:
            logger.warning(f"App cache delete error: {e}")
            return False
    
    async def delete_pattern(self, pattern: str) -> int:
        deleted = 0
        try:
            cursor = 0
            while True:
                cursor, keys = await self.redis.scan(cursor, match=pattern, count=100)
                if keys:
                    await self.redis.delete(*keys)
                    deleted += len(keys)
                if cursor == 0:
                    break
        except Exception as e:
            logger.warning(f"App cache pattern delete error: {e}")
        return deleted


class CDNCache(ICacheLayer):
    """CDN cache layer (via API)."""
    
    def __init__(self, cdn_client, config: MultiTierCacheConfig):
        self.cdn = cdn_client
        self.config = config
    
    async def get(self, key: str) -> Optional[Any]:
        # CDN is transparent - we don't read from it directly
        return None
    
    async def set(self, key: str, value: Any, ttl: int = None) -> bool:
        # CDN caching is controlled by headers, not direct sets
        return True
    
    async def delete(self, key: str) -> bool:
        if not self.config.cdn_enabled:
            return True
        try:
            return await self.cdn.purge_url(key)
        except Exception as e:
            logger.warning(f"CDN purge error: {e}")
            return False
    
    async def delete_pattern(self, pattern: str) -> int:
        if not self.config.cdn_enabled:
            return 0
        try:
            result = await self.cdn.purge_pattern(pattern)
            return 1 if result else 0
        except Exception as e:
            logger.warning(f"CDN pattern purge error: {e}")
            return 0


# =============================================================================
# Multi-Tier Cache Manager
# =============================================================================

class MultiTierCacheManager:
    """
    Manages caching across all tiers.
    
    Provides unified interface for:
    - Reading from multiple tiers
    - Writing to appropriate tiers
    - Invalidating across all tiers
    """
    
    def __init__(
        self,
        app_cache: ApplicationCache,
        cdn_cache: CDNCache,
        gateway_cache: Optional[ICacheLayer],
        event_publisher,
        config: MultiTierCacheConfig
    ):
        self.app_cache = app_cache
        self.cdn_cache = cdn_cache
        self.gateway_cache = gateway_cache
        self.events = event_publisher
        self.config = config
        
        # Layer order for reads (fastest first)
        self.read_layers = [self.app_cache]
        
        # Layer order for invalidation (closest to database first)
        self.invalidation_layers = [
            self.app_cache,
            self.gateway_cache,
            self.cdn_cache,
        ]
    
    async def get(
        self,
        key: str,
        fetch_func: callable = None,
        ttl: int = None
    ) -> Optional[Any]:
        """
        Get value from cache hierarchy.
        
        Checks each layer in order, returns first hit.
        On miss, optionally fetches and caches.
        """
        # Try each read layer
        for layer in self.read_layers:
            if layer:
                value = await layer.get(key)
                if value is not None:
                    return value
        
        # All layers missed
        if fetch_func is None:
            return None
        
        # Fetch from source
        value = await fetch_func()
        
        if value is not None:
            # Cache in application layer
            await self.app_cache.set(key, value, ttl)
        
        return value
    
    async def set(
        self,
        key: str,
        value: Any,
        ttl: int = None,
        layers: List[CacheLayer] = None
    ) -> bool:
        """
        Set value in specified cache layers.
        
        By default, only sets in application layer.
        CDN caching is typically done via headers.
        """
        layers = layers or [CacheLayer.APPLICATION]
        success = True
        
        for layer in layers:
            if layer == CacheLayer.APPLICATION:
                success = success and await self.app_cache.set(key, value, ttl)
        
        return success
    
    async def invalidate(
        self,
        keys: List[str],
        patterns: List[str] = None,
        cdn_urls: List[str] = None,
        sync: bool = None
    ) -> bool:
        """
        Invalidate cache across all tiers.
        
        Args:
            keys: Specific keys to invalidate
            patterns: Patterns to match for bulk invalidation
            cdn_urls: CDN URLs/patterns to purge
            sync: Force synchronous invalidation
        """
        sync = sync if sync is not None else not self.config.async_invalidation
        
        if sync:
            return await self._invalidate_sync(keys, patterns, cdn_urls)
        else:
            return await self._invalidate_async(keys, patterns, cdn_urls)
    
    async def _invalidate_sync(
        self,
        keys: List[str],
        patterns: List[str] = None,
        cdn_urls: List[str] = None
    ) -> bool:
        """Synchronous invalidation across all tiers."""
        success = True
        
        # Application cache
        for key in keys:
            # Await first: "success and await ..." would short-circuit and
            # skip every remaining delete after the first failure
            success = await self.app_cache.delete(key) and success
        
        for pattern in (patterns or []):
            await self.app_cache.delete_pattern(pattern)
        
        # Gateway cache
        if self.gateway_cache:
            for key in keys:
                await self.gateway_cache.delete(key)
            for pattern in (patterns or []):
                await self.gateway_cache.delete_pattern(pattern)
        
        # CDN
        for url in (cdn_urls or []):
            success = await self.cdn_cache.delete(url) and success
        
        return success
    
    async def _invalidate_async(
        self,
        keys: List[str],
        patterns: List[str] = None,
        cdn_urls: List[str] = None
    ) -> bool:
        """Queue invalidation for async processing."""
        event = {
            "type": "cache.invalidate",
            "keys": keys,
            "patterns": patterns or [],
            "cdn_urls": cdn_urls or [],
            "timestamp": datetime.utcnow().isoformat()
        }
        
        # Invalidate app cache immediately (fast)
        for key in keys:
            await self.app_cache.delete(key)
        
        # Queue the rest for async processing
        await self.events.publish(self.config.invalidation_queue, event)
        
        return True
    
    def get_cache_headers(
        self,
        ttl: int,
        private: bool = False,
        vary: List[str] = None,
        etag: str = None
    ) -> Dict[str, str]:
        """
        Generate HTTP cache headers.
        
        Call this when building responses to control
        browser and CDN caching.
        """
        directives = []
        
        if private:
            directives.append("private")
        else:
            directives.append("public")
        
        directives.append(f"max-age={ttl}")
        
        if ttl > 0:
            # Add stale-while-revalidate for better UX
            swr = min(ttl // 2, 60)
            directives.append(f"stale-while-revalidate={swr}")
        
        headers = {
            "Cache-Control": ", ".join(directives)
        }
        
        if vary:
            headers["Vary"] = ", ".join(vary)
        
        if etag:
            headers["ETag"] = f'"{etag}"'
        
        return headers


# =============================================================================
# Repository with Multi-Tier Caching
# =============================================================================

class CachedProductRepository:
    """
    Product repository with multi-tier caching.
    
    Demonstrates integration of cache manager with business logic.
    """
    
    def __init__(
        self,
        db_client,
        cache_manager: MultiTierCacheManager
    ):
        self.db = db_client
        self.cache = cache_manager
    
    async def get_product(self, product_id: str) -> Optional[dict]:
        """Get product with multi-tier caching."""
        cache_key = f"product:{product_id}"
        
        return await self.cache.get(
            key=cache_key,
            fetch_func=lambda: self._fetch_product(product_id),
            ttl=300
        )
    
    async def _fetch_product(self, product_id: str) -> Optional[dict]:
        """Fetch product from database."""
        row = await self.db.fetch_one(
            "SELECT * FROM products WHERE id = $1",
            product_id
        )
        return dict(row) if row else None
    
    async def update_product(
        self,
        product_id: str,
        data: dict
    ) -> Optional[dict]:
        """Update product with cache invalidation."""
        
        # Update database
        product = await self.db.fetch_one(
            """
            UPDATE products 
            SET name = $2, price = $3, updated_at = NOW()
            WHERE id = $1
            RETURNING *
            """,
            product_id, data.get("name"), data.get("price")
        )
        
        if not product:
            return None
        
        # Invalidate all cache tiers
        await self.cache.invalidate(
            keys=[
                f"product:{product_id}",
                f"product_page:{product_id}",
            ],
            patterns=[
                f"products:*",  # Product lists
            ],
            cdn_urls=[
                f"/api/products/{product_id}",
                f"/api/products?*",
                f"/products/{product_id}",
            ]
        )
        
        return dict(product)
    
    async def get_products_list(
        self,
        category: str = None,
        page: int = 1,
        limit: int = 20
    ) -> dict:
        """Get product list with caching."""
        # Include limit: pages of different sizes must not share an entry
        cache_key = f"products:list:{category or 'all'}:page{page}:limit{limit}"
        
        return await self.cache.get(
            key=cache_key,
            fetch_func=lambda: self._fetch_products_list(category, page, limit),
            ttl=60  # Short TTL for lists
        )
    
    async def _fetch_products_list(
        self,
        category: str,
        page: int,
        limit: int
    ) -> dict:
        """Fetch product list from database."""
        offset = (page - 1) * limit
        
        if category:
            rows = await self.db.fetch(
                """
                SELECT * FROM products 
                WHERE category = $1 
                ORDER BY created_at DESC 
                LIMIT $2 OFFSET $3
                """,
                category, limit, offset
            )
        else:
            rows = await self.db.fetch(
                """
                SELECT * FROM products 
                ORDER BY created_at DESC 
                LIMIT $1 OFFSET $2
                """,
                limit, offset
            )
        
        return {
            "products": [dict(row) for row in rows],
            "page": page,
            "limit": limit
        }
    
    def get_response_headers(
        self,
        product: dict,
        authenticated: bool
    ) -> dict:
        """Get cache headers for product response."""
        return self.cache.get_cache_headers(
            ttl=60 if not authenticated else 0,
            private=authenticated,
            vary=["Accept-Encoding"] + (["Authorization"] if authenticated else []),
            etag=str(product.get("updated_at", ""))
        )
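
`_invalidate_async` deletes only the application-cache keys inline and publishes a `cache.invalidate` event for everything else. The consumer of that event is not shown, so here is one possible shape for it. `RecordingLayer` and `process_invalidation_event` are hypothetical names; the layer stand-in exposes the same `delete` / `delete_pattern` methods as `ICacheLayer` and records calls instead of doing I/O.

```python
import asyncio
from typing import List, Optional


class RecordingLayer:
    """Hypothetical cache layer that records calls instead of doing I/O."""

    def __init__(self, name: str, fail_on: Optional[str] = None):
        self.name = name
        self.fail_on = fail_on
        self.calls: List[str] = []

    async def delete(self, key: str) -> bool:
        self.calls.append(f"delete:{key}")
        return key != self.fail_on

    async def delete_pattern(self, pattern: str) -> int:
        self.calls.append(f"pattern:{pattern}")
        return 1


async def process_invalidation_event(event: dict, app, gateway, cdn) -> bool:
    """Apply one queued event, walking tiers from the database outward."""
    ok = True

    # Application cache: keys were already deleted inline, patterns were not
    for pattern in event.get("patterns", []):
        await app.delete_pattern(pattern)

    # Gateway cache (optional tier)
    if gateway is not None:
        for key in event.get("keys", []):
            # Evaluate the delete first so one failure never skips the rest
            ok = await gateway.delete(key) and ok
        for pattern in event.get("patterns", []):
            await gateway.delete_pattern(pattern)

    # CDN last, so it cannot refill from a stale inner tier
    for url in event.get("cdn_urls", []):
        ok = await cdn.delete(url) and ok

    return ok


event = {
    "type": "cache.invalidate",
    "keys": ["product:42"],
    "patterns": ["products:*"],
    "cdn_urls": ["/api/products/42", "/products/42"],
}
app = RecordingLayer("app")
gateway = RecordingLayer("gateway")
cdn = RecordingLayer("cdn", fail_on="/api/products/42")  # simulate one failed purge
result = asyncio.run(process_invalidation_event(event, app, gateway, cdn))
```

Returning `False` on a partial failure lets the caller decide whether to retry the whole event. Because deletes and purges are idempotent, replaying an event is safe.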

Part III: Real-World Application

Chapter 9: Case Studies

9.1 Case Study: Netflix — Multi-Region Multi-Tier

NETFLIX CACHING ARCHITECTURE

        ┌─────────────────────────────────────────────────────────┐
        │                      CLIENT                             │
        │  ┌────────────────────────────────────────────────────┐ │
        │  │ Browser/App Cache                                  │ │
        │  │ - Video player state                               │ │
        │  │ - Recently watched                                 │ │
        │  │ - UI preferences                                   │ │
        │  └────────────────────────────────────────────────────┘ │
        └───────────────────────────┬─────────────────────────────┘
                                    │
        ┌───────────────────────────▼─────────────────────────────┐
        │                      CDN (Open Connect)                 │
        │  ┌────────────────────────────────────────────────────┐ │
        │  │ - Video content (actual files)                     │ │
        │  │ - ISP-embedded servers                             │ │
        │  │ - 95%+ of bandwidth served from edge               │ │
        │  └────────────────────────────────────────────────────┘ │
        └───────────────────────────┬─────────────────────────────┘
                                    │
        ┌───────────────────────────▼─────────────────────────────┐
        │                    AWS REGION                           │
        │  ┌────────────────────────────────────────────────────┐ │
        │  │ EVCache (Memcached-based)                          │ │
        │  │ - Session data                                     │ │
        │  │ - User profiles                                    │ │
        │  │ - Recommendations                                  │ │
        │  │ - Zone-aware replication                           │ │
        │  └────────────────────────────────────────────────────┘ │
        │  ┌────────────────────────────────────────────────────┐ │
        │  │ Cassandra                                          │ │
        │  │ - Persistent storage                               │ │
        │  │ - Cross-region replication                         │ │
        │  └────────────────────────────────────────────────────┘ │
        └─────────────────────────────────────────────────────────┘

Key decisions:
  1. VIDEO on CDN - 95% of bandwidth
  2. METADATA in EVCache - User profiles, preferences
  3. CASSANDRA for durability - Cross-region
  4. Zone-aware - Survive AZ failures

9.2 Case Study: Amazon — E-commerce Multi-Tier

AMAZON CACHING ARCHITECTURE

Product Page Request Flow:

1. CLOUDFRONT (CDN)
   ├── Static assets (JS, CSS, images)
   ├── Product images
   └── Some API responses (product details)

2. APPLICATION LOAD BALANCER
   ├── SSL termination
   └── Request routing

3. MICROSERVICES
   ├── Product Service
   │   └── ElastiCache (Redis)
   │       ├── Product details
   │       └── Price (short TTL)
   │
   ├── Inventory Service
   │   └── DynamoDB DAX (DynamoDB Accelerator)
   │       └── Real-time inventory
   │
   ├── Recommendations Service
   │   └── ElastiCache (Redis)
   │       └── Pre-computed recommendations
   │
   └── Reviews Service
       └── ElastiCache (Memcached)
           └── Aggregated reviews

4. DATABASES
   ├── DynamoDB - Product catalog
   ├── Aurora - Orders, users
   └── Neptune - Product graph

Key insight:
  Each microservice has its OWN cache
  Different cache tech for different needs
  Redis for complex data, Memcached for simple

9.3 Summary: Industry Patterns

Company      CDN Usage       App Cache            Key Innovation
───────      ─────────       ─────────            ──────────────
Netflix      Video content   EVCache              Zone-aware caching
Amazon       Static + API    ElastiCache + DAX    Per-service caching
Cloudflare   Everything      Workers KV           Edge compute
Facebook     Static          TAO + Memcache       Graph-aware cache
Twitter      Media           Redis                Timeline caching

Chapter 10: Common Mistakes

10.1 Mistake 1: Caching Authenticated Responses at CDN

❌ WRONG: CDN caches user-specific data

# API returns user's profile
@app.get("/api/profile")
async def get_profile(user = Depends(get_current_user)):
    profile = await fetch_profile(user.id)
    return profile  # No Cache-Control header!

# CDN caches this response
# Next user gets PREVIOUS user's profile!


✅ CORRECT: Mark authenticated responses as private

@app.get("/api/profile")
async def get_profile(response: Response, user = Depends(get_current_user)):
    profile = await fetch_profile(user.id)
    
    # Don't cache at CDN
    response.headers["Cache-Control"] = "private, no-store"
    response.headers["Vary"] = "Authorization"
    
    return profile

10.2 Mistake 2: Forgetting to Vary on Important Headers

❌ WRONG: Cache without proper Vary headers

# API returns different content based on Accept-Language
@app.get("/api/products")
async def get_products(request: Request):
    lang = request.headers.get("Accept-Language", "en")
    products = await fetch_products(lang)
    return products  # Missing Vary header!

# CDN caches English response
# French user gets English!


✅ CORRECT: Vary on content-affecting headers

@app.get("/api/products")
async def get_products(request: Request, response: Response):
    lang = request.headers.get("Accept-Language", "en")
    products = await fetch_products(lang)
    
    response.headers["Vary"] = "Accept-Language, Accept-Encoding"
    response.headers["Cache-Control"] = "public, max-age=60"
    
    return products
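
Why the Vary header fixes this is easier to see with a toy model of a shared cache. The sketch below is a simplification, not how any particular CDN is implemented: `TinySharedCache` is a hypothetical class that builds its cache key from the URL plus the request headers the origin listed in Vary.

```python
from typing import Dict, Optional, Tuple


class TinySharedCache:
    """Toy shared cache keyed by URL plus the request headers named in Vary."""

    def __init__(self):
        self.vary_for_url: Dict[str, Tuple[str, ...]] = {}
        self.entries: Dict[tuple, str] = {}

    def _key(self, url: str, request_headers: Dict[str, str]) -> tuple:
        vary = self.vary_for_url.get(url, ())
        return (url,) + tuple(request_headers.get(name, "") for name in vary)

    def lookup(self, url: str, request_headers: Dict[str, str]) -> Optional[str]:
        return self.entries.get(self._key(url, request_headers))

    def store(self, url: str, request_headers: Dict[str, str],
              body: str, vary_header: str = "") -> None:
        # Remember which request headers the origin said the response depends on
        names = tuple(n.strip() for n in vary_header.split(",") if n.strip())
        self.vary_for_url[url] = names
        self.entries[self._key(url, request_headers)] = body


english = {"Accept-Language": "en"}
french = {"Accept-Language": "fr"}

# Without Vary: the French request hits the English entry
no_vary = TinySharedCache()
no_vary.store("/api/products", english, "English products")
served_without_vary = no_vary.lookup("/api/products", french)

# With Vary: the French request misses and goes to the origin
with_vary = TinySharedCache()
with_vary.store("/api/products", english, "English products",
                vary_header="Accept-Language")
served_with_vary = with_vary.lookup("/api/products", french)
```

The flip side is fragmentation: raw `Accept-Language` values differ widely between browsers, so each distinct value gets its own entry. Normalizing the header to a small set of supported languages before it reaches the cache keeps the hit rate up.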

10.3 Mistake 3: Invalidating in Wrong Order

❌ WRONG: Invalidate CDN before app cache

async def update_product(product_id: str, data: dict):
    # Update database
    await db.update(product_id, data)
    
    # Invalidate CDN first
    await cdn.purge(f"/api/products/{product_id}")
    
    # Then app cache
    await redis.delete(f"product:{product_id}")
    
    # PROBLEM: CDN fetches from origin between purge and redis delete
    # Origin still has old data in Redis!
    # CDN re-caches stale data!


✅ CORRECT: Invalidate closest to database first

async def update_product(product_id: str, data: dict):
    # Update database
    await db.update(product_id, data)
    
    # First: App cache (closest to DB)
    await redis.delete(f"product:{product_id}")
    
    # Second: Gateway cache
    await gateway.invalidate(f"/api/products/{product_id}")
    
    # Last: CDN (farthest from DB)
    await cdn.purge(f"/api/products/{product_id}")
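
The race is easy to replay deterministically. The simulation below is a sketch with plain dictionaries standing in for the database, Redis, and the CDN (all names are illustrative); it forces one user request to land between the two invalidation steps and reports what later users are served.

```python
def run_update(order):
    """Replay one update with a CDN request landing between the two invalidation steps."""
    db = {"product:1": "old"}
    redis = {"product:1": "old"}          # app cache, warm with the old value
    cdn = {"/api/products/1": "old"}

    def origin():
        # Cache-aside read at the origin: Redis first, then the database
        if "product:1" not in redis:
            redis["product:1"] = db["product:1"]
        return redis["product:1"]

    def cdn_request():
        # On a miss the CDN refills itself from the origin
        if "/api/products/1" not in cdn:
            cdn["/api/products/1"] = origin()
        return cdn["/api/products/1"]

    steps = {
        "cdn": lambda: cdn.pop("/api/products/1", None),
        "redis": lambda: redis.pop("product:1", None),
    }

    db["product:1"] = "new"   # the write
    steps[order[0]]()         # first invalidation
    cdn_request()             # a user request arrives mid-invalidation
    steps[order[1]]()         # second invalidation
    return cdn_request()      # what later users are served


wrong_order_result = run_update(["cdn", "redis"])   # "old": CDN re-cached stale data
right_order_result = run_update(["redis", "cdn"])   # "new"
```

The correct order narrows the window but does not close every race. A read that loaded the old row just before the write committed can still repopulate Redis afterwards, which is why TTLs stay in place as a backstop.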

10.4 Mistake 4: Not Versioning Cache Keys

❌ WRONG: Unversioned cache keys

# Deploy v1
await redis.set("product:123", product_v1_schema)

# Deploy v2 (schema changed!)
data = await redis.get("product:123")
# Crash! Old schema doesn't have new fields


✅ CORRECT: Version in cache key

CACHE_VERSION = "v3"

def cache_key(entity: str, id: str) -> str:
    return f"{CACHE_VERSION}:{entity}:{id}"

# Deploy v3 (write with a TTL so superseded keys age out)
await redis.set(cache_key("product", "123"), product_v3_schema, ex=300)

# On v4 deploy, just change CACHE_VERSION
# Old keys are never read again and expire via their TTL, no crash

10.5 Mistake Checklist

  • Authenticated responses marked private — No CDN caching of user data
  • Vary headers set correctly — For language, encoding, auth
  • Invalidation order correct — App cache → Gateway → CDN
  • Cache keys versioned — Survives schema changes
  • CDN purge is async — Don't block on CDN API
  • Static assets have long TTL — With content hash in filename
  • Sensitive data not cached — no-store for PII, payment
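
The "content hash in filename" item is what makes a one-year TTL safe: when the file changes, its URL changes, so no purge is ever needed. A minimal sketch, assuming a build step that renames assets; `hashed_asset_name` and `static_asset_headers` are hypothetical helpers.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_asset_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content digest before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def static_asset_headers() -> dict:
    """Headers that are safe only because the URL changes with the content."""
    return {"Cache-Control": "public, max-age=31536000, immutable"}


# Same path, different content: the two builds get different URLs
name_v1 = hashed_asset_name("static/js/app.js", b"console.log('v1');")
name_v2 = hashed_asset_name("static/js/app.js", b"console.log('v2');")

# Same content always maps to the same name, so unchanged files stay cached
name_v1_again = hashed_asset_name("static/js/app.js", b"console.log('v1');")
```

The HTML that references these files must not get the same long TTL, otherwise browsers keep requesting the old hashed names.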

Part IV: Interview Preparation

Chapter 11: Interview Tips

11.1 Key Phrases

INTRODUCING MULTI-TIER:

"For a high-traffic system, I'd design caching at multiple 
tiers: CDN for static content and public APIs, application 
cache in Redis for business data, and proper Cache-Control 
headers for browser caching. Each tier serves a different 
purpose and has different invalidation needs."


EXPLAINING CDN:

"The CDN handles static assets with long TTLs—CSS, JS, 
images. For API responses, I'd cache public endpoints 
like product listings at the CDN with short TTLs, but 
mark authenticated endpoints as private to prevent 
one user's data from being served to another."


ON INVALIDATION:

"Invalidation order matters. When data changes, I invalidate 
closest to the database first—Redis, then gateway cache, 
then CDN. This prevents the CDN from re-caching stale data 
if it fetches from an origin that still has cached old data."


ON CONSISTENCY:

"Multi-tier caching means accepting eventual consistency. 
Browser cache might be stale for minutes, CDN for seconds. 
For critical data like inventory or pricing, I'd use 
shorter TTLs and event-driven invalidation to minimize 
the staleness window."

11.2 Common Questions

Question: "How do you handle cache invalidation at the CDN?"
Good answer: "CDN providers have purge APIs. When data changes, I'd publish an invalidation event, and a consumer calls the CDN purge API. For bulk updates, I'd use cache tags: tag all product pages with 'products', then purge by tag."

Question: "How do you prevent caching user-specific data at CDN?"
Good answer: "Set Cache-Control: private and Vary: Authorization. The private directive tells shared caches such as the CDN not to store the response. Vary is a second line of defense: if a response is cached anyway, different auth tokens get different cache entries."

Question: "What about browser caching for SPAs?"
Good answer: "I'd use content-hashed filenames for JS/CSS with immutable cache headers (1 year TTL). The HTML entry point gets no-cache with ETag, so the browser always checks but gets 304 Not Modified if nothing changed."

Question: "How do you debug cache issues?"
Good answer: "Check X-Cache headers from CDN (HIT/MISS). Log cache hit ratios at each tier. Use request IDs to trace a request through all layers. Monitor cache eviction rates and TTL distributions."

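The no-cache plus ETag answer can be sketched as a small handler. This is a simplified illustration, not a full HTTP implementation: `make_etag` and `respond` are hypothetical names, and a real If-None-Match header may also carry several validators or weak ones (`W/"..."`), which this sketch ignores.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (quoted, as HTTP requires)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> tuple:
    """Return (status, headers, body) for a no-cache + ETag resource."""
    etag = make_etag(body)
    headers = {"Cache-Control": "no-cache", "ETag": etag}
    if if_none_match == etag:
        # The browser's copy is still current: send headers only.
        return 304, headers, b""
    return 200, headers, body
```

The first request returns 200 with the ETag. When the browser revalidates with that ETag and the body has not changed, it gets an empty 304 response; after a deploy changes the body, the ETag no longer matches and the full 200 is sent again.
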
Chapter 12: Mock Interview

Scenario: Design Caching for an E-commerce Platform

Interviewer: "We're building an e-commerce platform. Walk me through how you'd design the caching architecture."

You: "I'd approach this with multiple cache tiers, each serving a specific purpose. Let me walk through from user to database.

Browser caching: For static assets—CSS, JavaScript, images—I'd use content-hashed filenames with immutable cache headers. app.abc123.js gets cached for a year. When we deploy new code, the filename changes, so users get fresh assets without explicit invalidation.

For the HTML entry point of a single-page app, I'd set Cache-Control: no-cache with ETag. The browser always checks, but gets a 304 Not Modified if nothing changed, which saves bandwidth."
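
Content-hashed filenames are normally produced by the build tool, but the idea fits in a few lines. A minimal sketch, assuming SHA-256 and an 8-character digest; `hashed_name` is a hypothetical helper.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js"""
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

The same bytes always map to the same name, so unchanged assets stay cached across deploys, while any change to the content yields a new URL that no cache has seen before.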

Interviewer: "What about the CDN layer?"

You: "The CDN is critical for performance. I'd cache:

  1. All static assets with long TTLs, served from edge locations globally.

  2. Public API responses like product listings and category pages with short TTLs—maybe 60 seconds. These are the same for all anonymous users, so CDN caching is highly effective.

  3. Product images with long TTLs. When a product image changes, we'd use a new URL rather than purging.

I would NOT cache authenticated endpoints at the CDN. Any request with an Authorization header gets Cache-Control: private, ensuring user-specific data stays private.

For Vary headers, I'd include Accept-Encoding (for compression) and Accept-Language if we serve localized content."

Interviewer: "How do you handle product price changes?"

You: "Price changes need fast propagation. Here's the flow:

  1. Database update: Price changes from $99 to $79.

  2. Application cache invalidation: Immediately delete product:123 from Redis. This is fast, synchronous.

  3. CDN purge: Call the CDN API to purge /api/products/123 and any product listing pages. This is async—I'd queue it to not block the price update.

  4. Browser cache: Can't force invalidation, but with 60-second TTL on product pages, staleness is bounded.

For flash sales where thousands of prices change at once, I'd use CDN cache tags. All product responses include a Cache-Tag: products header. To invalidate all products, I purge by tag: one API call instead of thousands."
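
A toy model shows what a tag purge does on the cache side. Real CDNs maintain this index themselves, and the exact header name and purge API differ by provider; `TaggedCache` and its methods are hypothetical.

```python
class TaggedCache:
    """Toy cache that can drop every entry carrying a given tag."""

    def __init__(self) -> None:
        self._entries = {}      # key -> cached value
        self._keys_by_tag = {}  # tag -> set of keys carrying that tag

    def put(self, key: str, value: str, tags: list) -> None:
        self._entries[key] = value
        for tag in tags:
            self._keys_by_tag.setdefault(tag, set()).add(key)

    def get(self, key: str):
        return self._entries.get(key)

    def purge_tag(self, tag: str) -> int:
        """Remove all entries with this tag; return how many were dropped."""
        dropped = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if self._entries.pop(key, None) is not None:
                dropped += 1
        return dropped
```

One call to `purge_tag("products")` removes every product response regardless of how many URLs carry the tag, which is the property that makes flash-sale invalidation a single request.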

Interviewer: "What about the application cache layer?"

You: "Redis serves as the application cache. I'd cache:

  • Product details: Full product objects with 5-minute TTL.
  • Inventory counts: Very short TTL (30 seconds) or event-driven invalidation since inventory changes frequently.
  • User sessions: With sliding expiration.
  • Computed values: Like product ratings, category counts.

I'd use the cache-aside pattern: check the cache, go to the database on a miss, and populate the cache on the way back. For inventory, I might use read-through with a very short TTL since accuracy matters."
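
Cache-aside with a TTL can be sketched without a Redis server. `TTLCache` below is a hypothetical in-process stand-in for Redis GET/SET with expiry, and `get_or_load` is an assumed helper name; the injectable clock exists only so expiry can be exercised without sleeping.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Minimal in-process stand-in for Redis GET/SET with expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: str, ttl: float) -> None:
        self._data[key] = (self._clock() + ttl, value)


def get_or_load(cache: TTLCache, key: str, ttl: float,
                load: Callable[[], str]) -> str:
    """Cache-aside: check the cache, fall back to the loader, then populate."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load()  # the database query in a real system
    cache.set(key, value, ttl)
    return value
```

A product lookup would pass `ttl=300` for the 5-minute product TTL and `ttl=30` for inventory; the loader runs only on a miss or after expiry.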

Interviewer: "How do you ensure consistency across these tiers?"

You: "Perfect consistency across all tiers is impractical—it would eliminate the performance benefits. Instead, I'd design for bounded staleness:

  • CDN: 60-second max staleness for product data
  • Redis: 5-minute max, but event-driven invalidation makes it usually seconds
  • Browser: 60-second max for API data, longer for static assets

For critical paths like checkout, I'd bypass caching entirely and read from the database, so the price a customer confirms at checkout is the price they are charged.

The invalidation order is crucial: Redis first, then CDN. If I purge CDN before Redis, the CDN might fetch from origin, get the old cached value from Redis, and re-cache stale data."

Interviewer: "How would you monitor this system?"

You: "Key metrics at each tier:

CDN: Hit ratio (should be >95% for static assets), origin fetch latency, purge success rate.

Redis: Hit ratio, memory usage, eviction rate, latency percentiles.

Database: Query rate (should be low if caching is effective), slow queries.

I'd add X-Cache headers to responses indicating HIT/MISS at each layer, making debugging easier. And I'd alert on hit ratio drops—often indicates invalidation bugs or traffic pattern changes."
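
The hit-ratio alerting described above can be sketched as a small counter. `HitRatioTracker` is a hypothetical helper, and the thresholds in the usage note are illustrative; a real deployment would export these counts to its metrics system instead of keeping them in process.

```python
from collections import Counter


class HitRatioTracker:
    """Count hits and misses per tier and flag tiers below a threshold."""

    def __init__(self) -> None:
        self._hits = Counter()
        self._total = Counter()

    def record(self, tier: str, hit: bool) -> None:
        self._total[tier] += 1
        if hit:
            self._hits[tier] += 1

    def ratio(self, tier: str) -> float:
        total = self._total[tier]
        return self._hits[tier] / total if total else 0.0

    def below(self, thresholds: dict) -> list:
        """Return tiers that saw traffic and whose hit ratio is under its minimum."""
        return [tier for tier, minimum in thresholds.items()
                if self._total[tier] and self.ratio(tier) < minimum]
```

Calling `below({"cdn": 0.95, "redis": 0.8})` after recording each request's HIT/MISS returns the tiers that should raise an alert; tiers with no traffic yet are skipped rather than reported as 0%.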


Summary

DAY 5 KEY TAKEAWAYS

THE CACHE HIERARCHY:
• Browser → CDN → Gateway → Application → Database
• Each layer has different characteristics
• Moving down (toward the database): higher end-to-end latency, fresher data

WHAT BELONGS WHERE:

Browser:
  • Static assets (with content hashing)
  • User preferences
  • Short-lived API responses

CDN:
  • Static assets
  • Public API responses
  • Media files
  • NOT: Authenticated/personalized content

API Gateway:
  • Auth token validation
  • Rate limit counters
  • Request deduplication

Application (Redis):
  • Business data
  • Sessions
  • Computed values
  • Feature flags

INVALIDATION ORDER:
  App Cache → Gateway → CDN → (Browser via TTL)
  Closest to database first!

KEY PRINCIPLES:
  • Private for authenticated responses
  • Vary on content-affecting headers
  • Version cache keys for schema changes
  • Async CDN purge (don't block updates)
  • Bounded staleness, not perfect consistency

CACHE HEADERS:
  • Cache-Control: public/private, max-age, stale-while-revalidate
  • Vary: Accept-Encoding, Accept-Language, Authorization
  • ETag: For conditional requests

📚 Further Reading

Books

  • "High Performance Browser Networking" by Ilya Grigorik
  • "Web Scalability for Startup Engineers" by Artur Ejsmont

End of Day 5: Multi-Tier Caching

This completes Week 4: Caching — Beyond "Just Add Redis"

You've learned:

  • Day 1: The four caching patterns
  • Day 2: Invalidation strategies
  • Day 3: Thundering herd prevention
  • Day 4: Feed caching (push vs pull)
  • Day 5: Multi-tier caching architecture

Next Week: Week 5 — Consistency and Coordination. We'll explore distributed transactions, consensus algorithms, and how to maintain consistency across services.