Week 4 — Day 5: Multi-Tier Caching
System Design Mastery Series
Preface
This week, you've learned caching patterns, invalidation strategies, thundering herd prevention, and feed caching. Each focused on a single cache layer.
But production systems don't have just one cache. They have many:
THE REAL WORLD: MULTIPLE CACHE LAYERS
User request: "Show me product #12345"
Layer 1: Browser Cache
└─ Hit? Return cached page (0ms)
└─ Miss? Continue...
Layer 2: CDN (Edge)
└─ Hit? Return from edge server (20ms)
└─ Miss? Continue...
Layer 3: API Gateway
└─ Hit? Return cached response (5ms)
└─ Miss? Continue...
Layer 4: Application Cache (Redis)
└─ Hit? Return from Redis (2ms)
└─ Miss? Continue...
Layer 5: Database Cache (Buffer Pool / Query Cache)
└─ Hit? Return from memory (10ms)
└─ Miss? Query disk (50ms)
Each layer serves a purpose.
Each layer has different characteristics.
Each layer needs different invalidation strategies.
Today: How to design caching across ALL these layers.
This is where caching gets complex — and where the real performance wins happen.
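The fall-through behavior in the walkthrough above can be sketched in a few lines. This is a toy model under stated assumptions: the `TieredCache` class, the plain dicts standing in for browser, CDN, and Redis, and the `load` function are all hypothetical, and real tiers live in different processes and machines rather than in one object.

```python
from typing import Callable, Dict, List


class TieredCache:
    """Checks cache tiers in order and backfills the faster tiers on a hit."""

    def __init__(self, tiers: List[Dict[str, str]], load_from_source: Callable[[str], str]):
        self.tiers = tiers                      # index 0 is the tier closest to the user
        self.load_from_source = load_from_source

    def get(self, key: str) -> str:
        for depth, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                self._backfill(key, value, depth)
                return value
        # Every tier missed: read the source of truth and fill all tiers
        value = self.load_from_source(key)
        self._backfill(key, value, len(self.tiers))
        return value

    def _backfill(self, key: str, value: str, hit_depth: int) -> None:
        # Copy the value into every tier in front of the one that answered
        for tier in self.tiers[:hit_depth]:
            tier[key] = value


browser: Dict[str, str] = {}
cdn: Dict[str, str] = {}
redis_like: Dict[str, str] = {"product:12345": "Widget"}
db_calls: List[str] = []


def load(key: str) -> str:
    db_calls.append(key)                        # record each trip to the "database"
    return f"row-for-{key}"


cache = TieredCache([browser, cdn, redis_like], load)
first = cache.get("product:12345")              # answered by the third tier, then backfilled
second = cache.get("product:99999")             # misses everywhere, loaded from the source
```

After the first call, the two outer dicts hold a copy of the product, which is exactly why invalidation later has to touch every tier.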
Part I: Foundations
Chapter 1: The Caching Hierarchy
1.1 Understanding the Layers
CACHE LAYER HIERARCHY
Requests flow from top to bottom; each miss falls through to the next layer:
| Layer | Latency | Hit Rate | Capacity | Scope |
|---|---|---|---|---|
| Browser Cache | ~0ms | High | Small | Per user |
| CDN (Edge) | 10-50ms | High | Large | Global |
| API Gateway Cache | 1-5ms | Medium | Medium | Regional |
| Application Cache (Redis) | 1-2ms | High | Large | Application |
| Database Query Cache | 5-50ms | Medium | Medium | Database |
| Database (Disk) | 10-100ms | N/A | Large | Database |
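Moving down the hierarchy:
- Cumulative latency increases (each miss adds another hop; the per-layer figures are measured at that layer, not from the user)
- Data freshness increases
- Capacity generally increases
- Scope shifts from one user's private copy to data shared by every user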
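To make the "each miss adds another hop" point concrete, here is a rough expected-latency calculation. It is a simplified model: the function name, the layer list, and especially the hit ratios are made-up illustration values, and it assumes every lookup costs its full latency whether it hits or misses.

```python
from typing import List, Tuple


def expected_latency_ms(layers: List[Tuple[str, float, float]]) -> float:
    """layers: (name, lookup latency in ms, hit ratio in [0, 1]), ordered from
    the user toward the database. The last layer should have hit ratio 1.0."""
    expected = 0.0
    reach_probability = 1.0          # chance that a request gets as far as this layer
    for _name, latency_ms, hit_ratio in layers:
        expected += reach_probability * latency_ms
        reach_probability *= 1.0 - hit_ratio
    return expected


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow
layers = [
    ("cdn", 20.0, 0.5),
    ("redis", 2.0, 0.5),
    ("database", 50.0, 1.0),
]
result = expected_latency_ms(layers)   # 20 + 0.5 * 2 + 0.25 * 50
```

With these numbers only a quarter of requests reach the database, so its 50ms contributes 12.5ms to the average.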
1.2 What Belongs at Each Layer
CONTENT PLACEMENT BY LAYER
BROWSER CACHE
├── Static assets (JS, CSS, images)
├── User-specific preferences
├── Recently viewed items
└── API responses with Cache-Control headers
CDN (EDGE)
├── Static assets
├── Public API responses (product listings)
├── Marketing pages
├── Media files (images, videos)
└── NOT: Personalized content, authenticated responses
API GATEWAY
├── Rate limit counters
├── Authentication tokens (validation cache)
├── Common API responses
├── Request deduplication
└── NOT: User-specific data
APPLICATION (REDIS)
├── Session data
├── User profiles
├── Product details
├── Computed aggregates (counts, stats)
├── Feature flags
└── Everything that's read frequently
DATABASE
├── Query result cache
├── Buffer pool (automatic)
├── Materialized views
└── Prepared statement cache
1.3 Cache Characteristics by Layer
| Layer | TTL Range | Invalidation | Consistency | Best For |
|---|---|---|---|---|
| Browser | Minutes-Days | Headers, versioned URLs | Eventual | Static assets, preferences |
| CDN | Seconds-Hours | Purge API, TTL | Eventual | Public content, media |
| API Gateway | Seconds-Minutes | TTL, events | Eventual | Auth, rate limits |
| Application | Seconds-Hours | Events, TTL | Near real-time | Business data |
| Database | Automatic | Query invalidation | Strong | Query results |
1.4 Key Terminology
| Term | Definition |
|---|---|
| Edge | CDN servers geographically close to users |
| Origin | Your actual servers (behind CDN) |
| Cache-Control | HTTP header controlling browser/CDN caching |
| Vary | HTTP header for cache key variation |
| Purge | Explicitly removing content from CDN |
| Stale-while-revalidate | Serve stale, refresh async |
| Cache key | Unique identifier for cached content |
| Hit ratio | Percentage of requests served from cache |
Chapter 2: Browser Caching
2.1 HTTP Cache Headers
HTTP CACHE-CONTROL DIRECTIVES
Response headers that control browser caching:
Cache-Control: max-age=31536000, immutable
└─ Cache for 1 year, never revalidate
Cache-Control: no-cache
└─ Cache, but revalidate before using
Cache-Control: no-store
└─ Don't cache at all (sensitive data)
Cache-Control: private, max-age=3600
└─ Only browser can cache (not CDN), 1 hour
Cache-Control: public, max-age=86400
└─ Anyone can cache (browser, CDN), 24 hours
Cache-Control: max-age=0, must-revalidate
└─ Always check with server before using
Cache-Control: stale-while-revalidate=60
└─ Can serve stale for 60s while refreshing
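A small parser makes these directives easier to reason about. The sketch below is an assumption-laden simplification: the function names are hypothetical, and it ignores `s-maxage`, `Expires`, and heuristic freshness, which real caches also consider.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def shared_cache_may_store(header: str) -> bool:
    """True if a shared cache (CDN, gateway) is allowed to store the response."""
    directives = parse_cache_control(header)
    return "no-store" not in directives and "private" not in directives


def freshness_seconds(header: str) -> int:
    """Seconds the response may be reused without contacting the server."""
    directives = parse_cache_control(header)
    if "no-store" in directives or "no-cache" in directives:
        return 0
    return int(directives.get("max-age") or 0)


parsed = parse_cache_control("public, max-age=86400")
```

Note that `no-cache` and `no-store` both yield zero reusable seconds here, but for different reasons: the first stores and revalidates, the second never stores.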
2.2 Versioned Assets (Cache Busting)
CACHE BUSTING STRATEGIES
Problem:
User has cached old JavaScript
You deploy new JavaScript
User still sees old version!
Solution 1: Query string versioning
/app.js?v=1.2.3
└─ Change version, URL changes, cache misses
Solution 2: Filename hashing (recommended)
/app.abc123.js
└─ Hash of content in filename
└─ Content changes = filename changes
└─ Old files can stay cached (no conflict)
Solution 3: Path versioning
/v2/app.js
└─ New version = new path
Implementation:
# Cache headers in FastAPI
import hashlib
from fastapi import Depends, FastAPI, Response
from fastapi.responses import FileResponse

app = FastAPI()

# fetch_product, fetch_profile, fetch_payment_methods, and get_current_user
# are application helpers assumed to be defined elsewhere.

# Static assets: Long cache with content hash
@app.get("/static/{filename}")
async def get_static(filename: str):
    file_path = f"static/{filename}"
    # Generate ETag from content. This hashes the file on every request;
    # in production, compute the hash once at build time.
    with open(file_path, 'rb') as f:
        content_hash = hashlib.md5(f.read()).hexdigest()
    return FileResponse(
        file_path,
        headers={
            "Cache-Control": "public, max-age=31536000, immutable",
            "ETag": f'"{content_hash}"'
        }
    )

# API responses: Short cache with revalidation
@app.get("/api/products/{product_id}")
async def get_product(product_id: str, response: Response):
    product = await fetch_product(product_id)
    # Cache for 5 minutes, allow stale for 1 minute while revalidating
    response.headers["Cache-Control"] = "public, max-age=300, stale-while-revalidate=60"
    response.headers["ETag"] = f'"{product["version"]}"'
    return product

# Personalized content: Private cache only
@app.get("/api/me/profile")
async def get_my_profile(response: Response, user = Depends(get_current_user)):
    profile = await fetch_profile(user.id)
    # Only browser can cache, not CDN
    response.headers["Cache-Control"] = "private, max-age=60"
    return profile

# Sensitive data: No caching
@app.get("/api/me/payment-methods")
async def get_payment_methods(response: Response):
    # Never cache payment info
    response.headers["Cache-Control"] = "no-store"
    return await fetch_payment_methods()
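The filename-hashing strategy from section 2.2 is normally done by a build tool, but the core idea fits in one function. The helper name `hashed_filename` and the 8-character digest length are assumptions made for this illustration.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(filename: str, content: bytes, digest_length: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:digest_length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old_name = hashed_filename("app.js", b"console.log('v1');")
new_name = hashed_filename("app.js", b"console.log('v2');")
same_again = hashed_filename("app.js", b"console.log('v1');")
```

Because the name depends only on the bytes, an unchanged file keeps its URL across deploys and stays cached, while any change produces a URL no cache has seen.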
2.3 Conditional Requests (Revalidation)
CONDITIONAL REQUEST FLOW
First request:
Client → GET /api/product/123
Server → 200 OK
ETag: "abc123"
Cache-Control: max-age=300
After 5 minutes (cache expired):
Client → GET /api/product/123
If-None-Match: "abc123"
If unchanged:
Server → 304 Not Modified (no body!)
Client uses cached version
If changed:
Server → 200 OK
ETag: "def456"
(full response body)
Benefit: Saves bandwidth when content unchanged
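The server side of this flow can be sketched as a pure function. This is a minimal illustration, not a full implementation: `conditional_get` is a hypothetical helper, and it compares ETags as exact strings, whereas a complete server would also handle weak validators (the `W/` prefix) and `If-Modified-Since`.

```python
from typing import Dict, Optional, Tuple


def conditional_get(
    request_headers: Dict[str, str],
    current_etag: str,
    body: bytes,
) -> Tuple[int, Dict[str, str], Optional[bytes]]:
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    response_headers = {"ETag": current_etag, "Cache-Control": "max-age=300"}
    # If-None-Match may list several ETags separated by commas
    client_etags = [tag.strip() for tag in request_headers.get("If-None-Match", "").split(",")]
    if current_etag in client_etags or "*" in client_etags:
        return 304, response_headers, None      # unchanged: send no body
    return 200, response_headers, body          # changed or first request: full body


unchanged = conditional_get({"If-None-Match": '"abc123"'}, '"abc123"', b'{"price": 99}')
changed = conditional_get({"If-None-Match": '"abc123"'}, '"def456"', b'{"price": 79}')
first_visit = conditional_get({}, '"abc123"', b'{"price": 99}')
```

The 304 response still carries the `ETag` and `Cache-Control` headers, so the client can restart its freshness timer on the copy it already holds.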
Chapter 3: CDN Caching
3.1 How CDNs Work
CDN ARCHITECTURE
┌─────────────────────────────┐
│ Your Origin │
│ (servers) │
└─────────────┬───────────────┘
│
┌──────────────────┼──────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ CDN Edge │ │ CDN Edge │ │ CDN Edge │
│ US-East │ │ EU-West │ │ Asia-Pacific │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
│ │ │
┌───────┴───────┐ ┌───────┴───────┐ ┌───────┴───────┐
│ │ │ │ │ │
▼ ▼ ▼ ▼ ▼ ▼
[Users] [Users] [Users] [Users]
New York London Tokyo Sydney
Request flow:
1. User in Tokyo requests image
2. DNS routes to nearest edge (Asia-Pacific)
3. Edge checks cache:
- HIT: Return immediately (20ms)
- MISS: Fetch from origin, cache, return (200ms first time)
4. Next Tokyo user gets cached version (20ms)
3.2 CDN Cache Configuration
# CDN configuration example (CloudFront-style)
CDN_CACHE_BEHAVIORS = {
# Static assets: Aggressive caching
"/static/*": {
"ttl_default": 86400 * 365, # 1 year
"ttl_min": 86400,
"ttl_max": 86400 * 365,
"compress": True,
"forward_headers": [], # Don't vary on headers
"forward_query_strings": False,
"cache_methods": ["GET", "HEAD"],
},
# API - Public endpoints: Moderate caching
"/api/products/*": {
"ttl_default": 60, # 1 minute
"ttl_min": 0,
"ttl_max": 3600,
"compress": True,
"forward_headers": ["Accept", "Accept-Language"],
"forward_query_strings": True, # Different params = different cache
"cache_methods": ["GET"],
},
# API - Authenticated: No CDN caching
"/api/me/*": {
"ttl_default": 0,
"ttl_min": 0,
"ttl_max": 0,
"forward_headers": ["Authorization", "Cookie"],
"cache_methods": [], # Don't cache
},
# Media: Long cache with purge capability
"/media/*": {
"ttl_default": 86400 * 7, # 1 week
"ttl_min": 3600,
"ttl_max": 86400 * 30,
"compress": False, # Already compressed (images, video)
"forward_headers": [],
"forward_query_strings": False,
},
}
3.3 CDN Cache Keys and Vary
CACHE KEY COMPONENTS
Default cache key:
URL + Query String
/api/products?category=electronics&sort=price
└─ Cached separately from:
/api/products?category=electronics&sort=name
Vary header adds dimensions:
Vary: Accept-Language
└─ /api/products (Accept-Language: en) ≠ /api/products (Accept-Language: es)
Vary: Accept-Encoding
└─ Gzipped and non-gzipped cached separately
Vary: Cookie
└─ DANGEROUS! Every user gets different cache = no caching benefit
EXAMPLE: Language-aware caching
Response:
Cache-Control: public, max-age=3600
Vary: Accept-Language
Content-Language: en
CDN caches separately for:
- Accept-Language: en
- Accept-Language: es
- Accept-Language: fr
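One way to picture how `Vary` extends the cache key is the sketch below. The function `build_cache_key` is hypothetical, and sorting the query parameters is a normalization choice made here for illustration; many CDNs treat differently ordered query strings as different keys unless configured otherwise.

```python
import hashlib
from typing import Dict, List
from urllib.parse import parse_qsl, urlencode, urlsplit


def build_cache_key(url: str, request_headers: Dict[str, str], vary: List[str]) -> str:
    """Cache key = path + normalized query + the request headers named in Vary."""
    parts = urlsplit(url)
    # Sort query parameters so ?a=1&b=2 and ?b=2&a=1 share one cache entry
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Header names are case-insensitive, so compare them lowercased
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied = [
        f"{name.lower()}={lowered.get(name.lower(), '')}"
        for name in sorted(vary, key=str.lower)
    ]
    raw = "|".join([parts.path, query, *varied])
    return hashlib.sha256(raw.encode()).hexdigest()


url = "/api/products?category=electronics&sort=price"
key_en = build_cache_key(url, {"Accept-Language": "en"}, ["Accept-Language"])
key_es = build_cache_key(url, {"Accept-Language": "es"}, ["Accept-Language"])
key_reordered = build_cache_key(
    "/api/products?sort=price&category=electronics",
    {"accept-language": "en"},
    ["Accept-Language"],
)
```

Adding `Cookie` to the `vary` list would make nearly every key unique, which is the failure mode the warning above describes.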
3.4 CDN Invalidation (Purging)
# CDN Purge Implementation
import httpx
from typing import List
class CDNPurgeService:
"""
Service to invalidate CDN cache.
Different CDN providers have different APIs.
This is a generic implementation.
"""
def __init__(self, cdn_api_url: str, api_key: str):
self.cdn_api_url = cdn_api_url
self.api_key = api_key
self.client = httpx.AsyncClient()
async def purge_url(self, url: str) -> bool:
"""Purge a specific URL from CDN."""
response = await self.client.post(
f"{self.cdn_api_url}/purge",
headers={"Authorization": f"Bearer {self.api_key}"},
json={"urls": [url]}
)
return response.status_code == 200
async def purge_pattern(self, pattern: str) -> bool:
"""Purge URLs matching pattern (e.g., /products/*)."""
response = await self.client.post(
f"{self.cdn_api_url}/purge",
headers={"Authorization": f"Bearer {self.api_key}"},
json={"pattern": pattern}
)
return response.status_code == 200
async def purge_tag(self, tag: str) -> bool:
"""Purge all URLs with a specific cache tag."""
response = await self.client.post(
f"{self.cdn_api_url}/purge",
headers={"Authorization": f"Bearer {self.api_key}"},
json={"tag": tag}
)
return response.status_code == 200
# Integration with product updates
class ProductService:
def __init__(self, db, cache, cdn_purge: CDNPurgeService):
self.db = db
self.cache = cache
self.cdn = cdn_purge
async def update_product(self, product_id: str, data: dict) -> dict:
# Update database
product = await self.db.update_product(product_id, data)
# Invalidate application cache
await self.cache.delete(f"product:{product_id}")
# Purge CDN
await self.cdn.purge_url(f"/api/products/{product_id}")
await self.cdn.purge_pattern(f"/api/products?*") # List pages
return product
Chapter 4: API Gateway Caching
4.1 What to Cache at the Gateway
API GATEWAY CACHE USE CASES
1. AUTHENTICATION TOKEN VALIDATION
- Validate JWT/session token
- Cache validation result (short TTL)
- Avoid hitting auth service every request
2. RATE LIMIT COUNTERS
- Track requests per API key
- Must be fast (every request checks)
- Shared across gateway instances
3. RESPONSE CACHING
- Cache entire API responses
- Based on URL + headers
- Short TTL for dynamic content
4. REQUEST DEDUPLICATION
- Multiple identical requests in flight
- Collapse into single backend request
- Similar to thundering herd protection
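Request deduplication (use case 4) is the least obvious of the four, so here is a minimal single-flight sketch with `asyncio`. The `RequestCoalescer` class and the demo loader are hypothetical, it only deduplicates within one process, and it leaves out cancellation handling and timeouts.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Tuple


class RequestCoalescer:
    """Collapses concurrent identical requests into one backend call."""

    def __init__(self) -> None:
        self._in_flight: Dict[str, "asyncio.Future[Any]"] = {}

    async def fetch(self, key: str, loader: Callable[[], Awaitable[Any]]) -> Any:
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the backend request
            task = asyncio.ensure_future(loader())
            self._in_flight[key] = task
            # Forget the task once it finishes so later requests fetch fresh data
            task.add_done_callback(lambda _done: self._in_flight.pop(key, None))
        # Every caller, first or not, waits on the same task
        return await task


async def demo() -> Tuple[int, List[dict]]:
    calls = 0

    async def load_product() -> dict:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)               # stand-in for a slow backend
        return {"id": "12345"}

    coalescer = RequestCoalescer()
    results = await asyncio.gather(
        *[coalescer.fetch("GET /api/products/12345", load_product) for _ in range(5)]
    )
    return calls, list(results)


backend_calls, responses = asyncio.run(demo())
```

Five concurrent callers share one backend call here; across several gateway instances the same idea needs a shared lock or a cache in front, as in the thundering herd material.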
4.2 Gateway Response Caching
# API Gateway Caching Implementation
import hashlib
import json
from typing import Optional
from fastapi import FastAPI, Request, Response
from fastapi.responses import JSONResponse
class GatewayCache:
"""
API Gateway response cache.
Caches responses based on:
- HTTP method
- URL path
- Query parameters
- Selected headers (Accept, Accept-Language)
"""
def __init__(self, redis_client, default_ttl: int = 60):
self.redis = redis_client
self.default_ttl = default_ttl
# Paths that should be cached
self.cacheable_paths = {
"/api/products": 60,
"/api/categories": 300,
"/api/config": 3600,
}
# Headers that affect cache key
self.vary_headers = ["Accept", "Accept-Language", "Accept-Encoding"]
def _build_cache_key(self, request: Request) -> str:
"""Build cache key from request."""
parts = [
request.method,
str(request.url.path),
str(sorted(request.query_params.items())),
]
# Add varied headers
for header in self.vary_headers:
value = request.headers.get(header, "")
parts.append(f"{header}:{value}")
key_string = "|".join(parts)
return f"gateway_cache:{hashlib.sha256(key_string.encode()).hexdigest()}"
def _is_cacheable(self, request: Request) -> bool:
"""Check if request is cacheable."""
if request.method not in ("GET", "HEAD"):
return False
# Check if path matches cacheable patterns
path = request.url.path
for pattern in self.cacheable_paths:
if path.startswith(pattern):
return True
return False
def _get_ttl(self, request: Request) -> int:
"""Get TTL for request."""
path = request.url.path
for pattern, ttl in self.cacheable_paths.items():
if path.startswith(pattern):
return ttl
return self.default_ttl
async def get_cached_response(self, request: Request) -> Optional[Response]:
"""Try to get cached response."""
if not self._is_cacheable(request):
return None
cache_key = self._build_cache_key(request)
cached = await self.redis.get(cache_key)
if cached:
data = json.loads(cached)
return JSONResponse(
content=data["body"],
status_code=data["status_code"],
headers={
**data["headers"],
"X-Cache": "HIT",
"X-Cache-Key": cache_key[:16]
}
)
return None
async def cache_response(
self,
request: Request,
response_body: dict,
status_code: int,
headers: dict
):
"""Cache a response."""
if not self._is_cacheable(request):
return
if status_code != 200:
return
cache_key = self._build_cache_key(request)
ttl = self._get_ttl(request)
data = {
"body": response_body,
"status_code": status_code,
"headers": {k: v for k, v in headers.items() if k.lower() not in ("content-length", "transfer-encoding")}
}
await self.redis.setex(cache_key, ttl, json.dumps(data))
# Middleware for automatic caching
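class CacheMiddleware:
    def __init__(self, app, cache: GatewayCache):
        self.app = app
        self.cache = cache

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        request = Request(scope, receive)
        # Try cache
        cached_response = await self.cache.get_cached_response(request)
        if cached_response:
            await cached_response(scope, receive, send)
            return
        # Not cached - forward to app, capturing the response as it streams out
        captured = {"status": 200, "headers": {}, "body": b""}

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                captured["status"] = message["status"]
                captured["headers"] = {k.decode("latin-1"): v.decode("latin-1") for k, v in message.get("headers", [])}
            elif message["type"] == "http.response.body":
                captured["body"] += message.get("body", b"")
            await send(message)

        await self.app(scope, receive, send_wrapper)
        # Store the response so the next identical request is a hit
        # (cache_response skips non-cacheable paths and non-200 statuses)
        try:
            body = json.loads(captured["body"])
        except ValueError:
            return  # Not JSON: this cache only stores JSON bodies
        await self.cache.cache_response(request, body, captured["status"], captured["headers"])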
4.3 Authentication Cache
# Auth Token Validation Cache
import hashlib
import json
import logging
from typing import Optional

logger = logging.getLogger(__name__)

class AuthCache:
    """
    Cache authentication token validation results.
    Reduces load on auth service by caching valid tokens.
    Short TTL ensures revoked tokens stop working quickly.
    """
    def __init__(self, redis_client, auth_service, ttl: int = 60):
        self.redis = redis_client
        self.auth_service = auth_service
        self.ttl = ttl

    async def validate_token(self, token: str) -> Optional[dict]:
        """
        Validate token with caching.
        Returns user info if valid, None if invalid.
        """
        cache_key = f"auth:{self._hash_token(token)}"
        # Check cache
        cached = await self.redis.get(cache_key)
        if cached:
            # Redis returns bytes unless the client uses decode_responses=True
            if cached in ("INVALID", b"INVALID"):
                return None
            return json.loads(cached)
        # Validate with auth service
        try:
            user = await self.auth_service.validate(token)
            if user:
                # Cache valid result
                await self.redis.setex(cache_key, self.ttl, json.dumps(user))
                return user
            else:
                # Cache invalid result (shorter TTL)
                await self.redis.setex(cache_key, 10, "INVALID")
                return None
        except Exception as e:
            # Auth service error - don't cache; re-raise so the caller fails closed
            logger.error(f"Auth service error: {e}")
            raise

    def _hash_token(self, token: str) -> str:
        """Hash token for cache key (security)."""
        return hashlib.sha256(token.encode()).hexdigest()

    async def invalidate_token(self, token: str):
        """Invalidate a cached token (on logout/revoke)."""
        cache_key = f"auth:{self._hash_token(token)}"
        await self.redis.delete(cache_key)

    async def invalidate_user_tokens(self, user_id: str):
        """Invalidate all tokens for a user."""
        # This requires tracking tokens by user: validate_token stores keys as
        # auth:{token_hash}, which this pattern never matches. It only works if
        # entries are also written under auth:user:{user_id}:{token_hash}.
        # Otherwise, wait for TTL expiry.
        pattern = f"auth:user:{user_id}:*"
        # SCAN iterates incrementally; KEYS would block Redis on a large keyspace
        keys = [key async for key in self.redis.scan_iter(match=pattern)]
        if keys:
            await self.redis.delete(*keys)
Chapter 5: Application Cache (Redis)
5.1 Application Cache Patterns
We covered this extensively in Days 1-4. Quick recap:
APPLICATION CACHE PATTERNS
CACHE-ASIDE (Most common)
Read: Check cache → Miss → Load DB → Store cache → Return
Write: Update DB → Invalidate cache
COMPUTED VALUES
Store pre-computed aggregates
- User's follower count
- Product's average rating
- Dashboard statistics
SESSION DATA
User sessions with sliding expiration
Shopping carts
Temporary wizard state
FEATURE FLAGS
Configuration that rarely changes
A/B test assignments
Gradual rollout percentages
RATE LIMITING
Request counts per window
User quotas
API usage tracking
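As a refresher on the cache-aside read and write paths listed above, here is an in-memory sketch. The `CacheAside` class, the dict standing in for the database, and the injectable `clock` parameter are all assumptions for illustration; a real deployment would use Redis with `SETEX` and `DEL`.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class CacheAside:
    def __init__(
        self,
        load_from_db: Callable[[str], Optional[str]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}   # key -> (expires_at, value)

    def read(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                                # hit
        value = self._load_from_db(key)                    # miss: load from the database
        if value is not None:
            self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def write(self, key: str, value: str, save_to_db: Callable[[str, str], None]) -> None:
        save_to_db(key, value)                             # update the database first
        self._entries.pop(key, None)                       # then invalidate the cached copy


db: Dict[str, str] = {"product:1": "$99"}
db_reads: List[str] = []


def load(key: str) -> Optional[str]:
    db_reads.append(key)
    return db.get(key)


cache = CacheAside(load, ttl_seconds=300)
before_first = cache.read("product:1")     # miss, loads from db
before_second = cache.read("product:1")    # hit, no db read
cache.write("product:1", "$79", db.__setitem__)
after_write = cache.read("product:1")      # miss again because the write invalidated it
```

The write path deletes rather than updates the cached entry, so the next reader repopulates it from the database.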
5.2 Layered Cache Keys
# Structured cache keys for multi-tier systems
from typing import Optional

class CacheKeyBuilder:
    """
    Build structured cache keys for multi-tier caching.
    Key format: {prefix}:{version}:{entity}:{id}:{variant}
    Example: app:v2:product:12345:en
    """
    def __init__(self, prefix: str = "app", version: str = "v1"):
        self.prefix = prefix
        self.version = version

    def product(self, product_id: str, locale: Optional[str] = None) -> str:
        key = f"{self.prefix}:{self.version}:product:{product_id}"
        if locale:
            key += f":{locale}"
        return key

    def product_list(self, category: str, page: int, locale: Optional[str] = None) -> str:
        key = f"{self.prefix}:{self.version}:products:{category}:page{page}"
        if locale:
            key += f":{locale}"
        return key

    def user_profile(self, user_id: str) -> str:
        return f"{self.prefix}:{self.version}:user:{user_id}:profile"

    def user_feed(self, user_id: str) -> str:
        return f"{self.prefix}:{self.version}:feed:{user_id}"

    def session(self, session_id: str) -> str:
        return f"{self.prefix}:session:{session_id}"  # No version - survives deploys

    def rate_limit(self, key: str, window: str) -> str:
        return f"{self.prefix}:ratelimit:{key}:{window}"

    def invalidation_pattern(self, entity: str, id: str = "*") -> str:
        """
        Pattern for bulk invalidation of variant keys (e.g. per-locale copies).
        Matches app:v1:product:12345:en but not the bare app:v1:product:12345,
        so delete the base key explicitly as well.
        """
        return f"{self.prefix}:{self.version}:{entity}:{id}:*"

# Usage
keys = CacheKeyBuilder(prefix="myapp", version="v3")
# After a schema change, bump version to v4.
# v3 keys are never read again and expire on their own,
# provided every key was written with a TTL.
# No explicit invalidation needed!
Chapter 6: Multi-Tier Invalidation
6.1 The Invalidation Challenge
MULTI-TIER INVALIDATION PROBLEM
Product price changes from $99 to $79:
Layer 1: Browser
- Users have old price cached
- Can't push invalidation to browsers!
- Must wait for Cache-Control max-age
Layer 2: CDN
- Multiple edge servers have old price
- Must purge from all edges
- Purge takes time to propagate
Layer 3: API Gateway
- Response cache has old price
- Must invalidate gateway cache
Layer 4: Application (Redis)
- Product cache has old price
- Must delete/update Redis key
Layer 5: Database
- Query cache might have old price
- Usually auto-invalidated on write
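Propagation order matters!
If CDN still has old price, users get stale data
even if application cache is fresh.
Invalidate from the origin outward (Redis, then gateway, then CDN):
if the CDN is purged while an inner cache still holds the old price,
the next request re-caches the stale value at the edge.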
6.2 Invalidation Strategies by Layer
# Multi-Tier Invalidation Service
from dataclasses import dataclass
from typing import List, Optional
import asyncio
@dataclass
class InvalidationResult:
"""Result of multi-tier invalidation."""
success: bool
browser_invalidated: bool # Can only suggest via headers
cdn_invalidated: bool
gateway_invalidated: bool
app_cache_invalidated: bool
errors: List[str]
class MultiTierInvalidationService:
"""
Coordinates cache invalidation across all tiers.
Order of invalidation:
1. Application cache (Redis) - fastest
2. API Gateway cache
3. CDN - slowest to propagate
4. Browser - can only influence future requests
"""
def __init__(
self,
redis_client,
gateway_cache,
cdn_service,
event_publisher
):
self.redis = redis_client
self.gateway = gateway_cache
self.cdn = cdn_service
self.events = event_publisher
    async def invalidate_product(
        self,
        product_id: str,
        eager_cdn_purge: bool = True
    ) -> InvalidationResult:
        """
        Invalidate a product across all cache tiers.
        """
        errors = []
        # 1. Application cache (immediate)
        app_success = await self._invalidate_app_cache(product_id, errors)
        # 2. API Gateway cache
        gateway_success = await self._invalidate_gateway_cache(product_id, errors)
        # 3. CDN
        if eager_cdn_purge:
            cdn_success = await self._invalidate_cdn(product_id, errors)
            cdn_invalidated = cdn_success
        else:
            # Queue CDN purge for background processing
            await self.events.publish("cdn-purge-queue", {
                "type": "product",
                "id": product_id
            })
            cdn_success = True        # Queued without error
            cdn_invalidated = False   # Edges stay stale until the worker runs
        return InvalidationResult(
            success=app_success and gateway_success and cdn_success,
            browser_invalidated=False,  # Can't force browser
            cdn_invalidated=cdn_invalidated,
            gateway_invalidated=gateway_success,
            app_cache_invalidated=app_success,
            errors=errors
        )
async def _invalidate_app_cache(self, product_id: str, errors: List[str]) -> bool:
"""Invalidate Redis cache."""
try:
keys_to_delete = [
f"product:{product_id}",
f"product_page:{product_id}",
f"product:{product_id}:*", # Pattern for variants
]
pipe = self.redis.pipeline()
for key in keys_to_delete:
if '*' in key:
# Pattern delete - scan and delete
cursor = 0
while True:
cursor, keys = await self.redis.scan(cursor, match=key, count=100)
if keys:
pipe.delete(*keys)
if cursor == 0:
break
else:
pipe.delete(key)
await pipe.execute()
return True
except Exception as e:
errors.append(f"Redis invalidation failed: {e}")
return False
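    async def _invalidate_gateway_cache(self, product_id: str, errors: List[str]) -> bool:
        """
        Invalidate API Gateway cache.
        Assumes the gateway cache exposes invalidate_pattern(). Keys that are
        hashes of the request (as in GatewayCache, section 4.2) cannot be matched
        by URL pattern, so the gateway needs a path-to-keys index or tagged
        entries for this to work.
        """
        try:
            patterns = [
                f"/api/products/{product_id}",
                "/api/products?*",  # List endpoints
            ]
            for pattern in patterns:
                await self.gateway.invalidate_pattern(pattern)
            return True
        except Exception as e:
            errors.append(f"Gateway invalidation failed: {e}")
            return False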
    async def _invalidate_cdn(self, product_id: str, errors: List[str]) -> bool:
        """Invalidate CDN cache."""
        try:
            urls = [
                f"/api/products/{product_id}",
                f"/products/{product_id}",  # Product page
            ]
            purges = [self.cdn.purge_url(url) for url in urls]
            # Wildcards go through the pattern endpoint, not the exact-URL one
            sub_resources = f"/api/products/{product_id}/*"
            purges.append(self.cdn.purge_pattern(sub_resources))
            targets = urls + [sub_resources]
            results = await asyncio.gather(*purges, return_exceptions=True)
            ok = True
            for target, result in zip(targets, results):
                if isinstance(result, Exception):
                    errors.append(f"CDN purge failed for {target}: {result}")
                    ok = False
                elif not result:
                    # purge_* returns False when the CDN rejects the request
                    errors.append(f"CDN purge rejected for {target}")
                    ok = False
            return ok
        except Exception as e:
            errors.append(f"CDN invalidation failed: {e}")
            return False
async def invalidate_bulk(
self,
entity_type: str,
entity_ids: List[str]
) -> InvalidationResult:
"""
Bulk invalidation for multiple entities.
More efficient than individual invalidations.
"""
errors = []
# 1. Application cache - bulk delete
try:
pipe = self.redis.pipeline()
for entity_id in entity_ids:
pipe.delete(f"{entity_type}:{entity_id}")
await pipe.execute()
app_success = True
except Exception as e:
errors.append(f"Bulk Redis invalidation failed: {e}")
app_success = False
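        # 2. Gateway - invalidate pattern
        # Note: the "s" suffix is naive pluralization ("category" yields
        # /api/categorys/*); map entity types to real URL prefixes in production
        try:
            await self.gateway.invalidate_pattern(f"/api/{entity_type}s/*")
            gateway_success = True
        except Exception as e:
            errors.append(f"Gateway invalidation failed: {e}")
            gateway_success = False
        # 3. CDN - use cache tags if available
        # Note: purging by entity-type tag drops every cached entity of that
        # type, not only the ones in entity_ids
        try:
            await self.cdn.purge_tag(entity_type)
            cdn_success = True
        except Exception as e:
            errors.append(f"CDN invalidation failed: {e}")
            cdn_success = False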
return InvalidationResult(
success=app_success and gateway_success and cdn_success,
browser_invalidated=False,
cdn_invalidated=cdn_success,
gateway_invalidated=gateway_success,
app_cache_invalidated=app_success,
errors=errors
)
6.3 Event-Driven Multi-Tier Invalidation
# Event-Driven Invalidation System
import json
import logging

logger = logging.getLogger(__name__)
class InvalidationEventHandler:
"""
Handles invalidation events for all cache tiers.
Listens to data change events and coordinates
cache invalidation across tiers.
"""
def __init__(self, invalidation_service: MultiTierInvalidationService):
self.invalidation = invalidation_service
    async def handle_event(self, event: dict):
        """Route event to appropriate handler."""
        event_type = event.get("type")
        handlers = {
            "product.updated": self._handle_product_updated,
            # Deletion needs the same full invalidation as an update
            "product.deleted": self._handle_product_updated,
            "product.price_changed": self._handle_price_change,
            "category.updated": self._handle_category_updated,
            "inventory.updated": self._handle_inventory_updated,
            "user.updated": self._handle_user_updated,
            "global.invalidate": self._handle_global_invalidate,
        }
        handler = handlers.get(event_type)
        if handler:
            await handler(event)
        else:
            logger.warning(f"Unknown invalidation event type: {event_type}")
async def _handle_product_updated(self, event: dict):
"""Handle product update - full invalidation."""
product_id = event["product_id"]
result = await self.invalidation.invalidate_product(
product_id,
eager_cdn_purge=True
)
if not result.success:
logger.error(f"Product invalidation failed: {result.errors}")
    async def _handle_price_change(self, event: dict):
        """Handle price change - priority invalidation."""
        product_id = event["product_id"]
        # Price changes are high priority - invalidate everywhere immediately
        result = await self.invalidation.invalidate_product(
            product_id,
            eager_cdn_purge=True
        )
        if not result.success:
            logger.error(f"Price invalidation failed: {result.errors}")
        # Also invalidate any deals/promotions pages
        await self.invalidation.invalidate_bulk(
            "deals_page",
            ["current", "featured"]
        )

    async def _handle_inventory_updated(self, event: dict):
        """Handle inventory update - partial invalidation."""
        product_id = event["product_id"]
        # Inventory is usually cached with short TTL
        # Just invalidate application cache, CDN will expire quickly
        errors = []
        if not await self.invalidation._invalidate_app_cache(product_id, errors):
            logger.error(f"Inventory invalidation failed: {errors}")
async def _handle_category_updated(self, event: dict):
"""Handle category update - affects multiple products."""
category_id = event["category_id"]
# Invalidate category cache
await self.invalidation.invalidate_bulk(
"category",
[category_id]
)
# Invalidate product listings in this category
await self.invalidation.gateway.invalidate_pattern(
f"/api/products?category={category_id}*"
)
await self.invalidation.cdn.purge_pattern(
f"/api/products?category={category_id}*"
)
async def _handle_user_updated(self, event: dict):
"""Handle user update - private cache only."""
user_id = event["user_id"]
# User data is private - only in app cache
await self.invalidation.redis.delete(f"user:{user_id}")
await self.invalidation.redis.delete(f"user:{user_id}:profile")
    async def _handle_global_invalidate(self, event: dict):
        """Handle global invalidation - nuclear option."""
        entity_type = event.get("entity_type", "*")
        logger.warning(f"Global invalidation triggered for {entity_type}")
        # Bump the shared version counter. This only invalidates keys whose
        # names include the current cache_version value; keys built with a
        # static version string (as in CacheKeyBuilder) are unaffected
        await self.invalidation.redis.incr("cache_version")
        # Purge CDN entirely for this entity type
        await self.invalidation.cdn.purge_pattern(f"/api/{entity_type}*")
# Consumer that processes invalidation events
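async def run_invalidation_consumer(kafka_consumer, handler: InvalidationEventHandler):
    """Run the invalidation event consumer."""
    async for message in kafka_consumer:
        try:
            event = json.loads(message.value.decode())
            await handler.handle_event(event)
            await kafka_consumer.commit()
        except Exception as e:
            # The failed offset is not committed here, but the next successful
            # commit moves past it, so the event is effectively skipped.
            # Route failures to a retry or dead-letter topic if that matters.
            logger.error(f"Failed to process invalidation event: {e}")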
Chapter 7: Cache Strategies by Request Type
7.1 Anonymous vs Authenticated Requests
CACHING STRATEGY BY AUTH STATUS
ANONYMOUS (Not logged in):
├── Browser: Cache static assets, marketing pages
├── CDN: Cache everything (product pages, listings)
├── Gateway: Cache API responses
├── App: Cache shared data (products, categories)
│
│ Key: All anonymous users see same content
│ Can cache aggressively at all layers
AUTHENTICATED (Logged in):
├── Browser: Cache static assets only
├── CDN: CAREFUL! Don't cache personalized content
├── Gateway: Cache auth validation, rate limits
├── App: Cache user-specific data
│
│ Key: Each user sees different content
│ CDN caching requires careful Vary headers
# Auth-Aware Caching
from fastapi import Request
class AuthAwareCacheMiddleware:
"""
Middleware that applies different caching strategies
based on authentication status.
"""
def __init__(self, app, cache_service):
self.app = app
self.cache = cache_service
async def __call__(self, scope, receive, send):
if scope["type"] != "http":
await self.app(scope, receive, send)
return
request = Request(scope, receive)
is_authenticated = self._is_authenticated(request)
# Set caching strategy based on auth
if is_authenticated:
await self._handle_authenticated(request, scope, receive, send)
else:
await self._handle_anonymous(request, scope, receive, send)
def _is_authenticated(self, request: Request) -> bool:
"""Check if request is authenticated."""
return (
"Authorization" in request.headers or
"session_id" in request.cookies
)
    async def _handle_anonymous(self, request, scope, receive, send):
        """Handle anonymous request - aggressive caching."""
        # Try CDN-style caching
        cached = await self.cache.get_cached_response(request)
        if cached:
            await cached(scope, receive, send)
            return

        # Forward to app with caching headers
        async def send_wrapper(message):
            # Only mark safe methods as publicly cacheable
            if message["type"] == "http.response.start" and request.method in ("GET", "HEAD"):
                # Keep headers as a list of pairs: converting to a dict would
                # drop repeated headers such as Set-Cookie
                headers = [
                    (k, v) for k, v in message.get("headers", [])
                    if k.lower() not in (b"cache-control", b"vary")
                ]
                headers.append((b"cache-control", b"public, max-age=60"))
                headers.append((b"vary", b"Accept-Encoding"))
                message["headers"] = headers
            await send(message)

        await self.app(scope, receive, send_wrapper)

    async def _handle_authenticated(self, request, scope, receive, send):
        """Handle authenticated request - private caching."""
        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                headers = [
                    (k, v) for k, v in message.get("headers", [])
                    if k.lower() not in (b"cache-control", b"vary")
                ]
                # Private cache only - no CDN
                headers.append((b"cache-control", b"private, max-age=0, must-revalidate"))
                headers.append((b"vary", b"Authorization, Cookie"))
                message["headers"] = headers
            await send(message)

        await self.app(scope, receive, send_wrapper)
7.2 Mobile vs Web Clients
CACHING STRATEGY BY CLIENT TYPE
WEB BROWSER:
├── Has robust caching support
├── Service Worker for offline
├── IndexedDB for structured data
├── Cache API for fine-grained control
│
│ Strategy: Use browser capabilities fully
│ Cache-Control with stale-while-revalidate
MOBILE APP (Native):
├── App-level cache (SQLite, Realm)
├── Network layer cache (OkHttp, URLCache)
├── May have offline requirements
│
│ Strategy: Server provides cache hints
│ App decides local caching strategy
│ ETag/Last-Modified for efficient sync
MOBILE WEB:
├── Limited storage
├── Unreliable network
├── High latency sensitivity
│
│ Strategy: Minimize payload
│ Aggressive server-side caching
│ Short TTLs with efficient revalidation
7.3 API Response Caching Decisions
# Response Caching Decision Tree
from fastapi import Request
class ResponseCachePolicy:
"""
Determines caching policy for API responses.
"""
def get_cache_headers(
self,
request: Request,
response_data: dict,
endpoint: str
) -> dict:
"""Determine appropriate cache headers."""
# Start with defaults
headers = {
"Cache-Control": "no-cache",
"Vary": "Accept, Accept-Encoding"
}
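        # Check authentication (same signals as AuthAwareCacheMiddleware;
        # this class defines no _is_authenticated helper of its own)
        if "authorization" in request.headers or "session_id" in request.cookies:
            # Authenticated: Private cache only
            headers["Cache-Control"] = "private, max-age=0"
            headers["Vary"] += ", Authorization"
            return headers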
# Anonymous: Apply endpoint-specific rules
policy = self._get_endpoint_policy(endpoint)
if policy["cacheable"]:
directives = [
"public",
f"max-age={policy['max_age']}",
]
if policy.get("stale_while_revalidate"):
directives.append(
f"stale-while-revalidate={policy['stale_while_revalidate']}"
)
if policy.get("immutable"):
directives.append("immutable")
headers["Cache-Control"] = ", ".join(directives)
# Add ETag for conditional requests
if "version" in response_data:
headers["ETag"] = f'"{response_data["version"]}"'
return headers
def _get_endpoint_policy(self, endpoint: str) -> dict:
"""Get caching policy for endpoint."""
policies = {
"/api/products": {
"cacheable": True,
"max_age": 60,
"stale_while_revalidate": 30,
},
"/api/products/{id}": {
"cacheable": True,
"max_age": 300,
"stale_while_revalidate": 60,
},
"/api/categories": {
"cacheable": True,
"max_age": 3600,
"stale_while_revalidate": 300,
},
"/api/search": {
"cacheable": True,
"max_age": 30, # Short for search results
},
"/api/cart": {
"cacheable": False, # User-specific
},
"/api/checkout": {
"cacheable": False, # Sensitive
},
}
# Match endpoint to policy
for pattern, policy in policies.items():
if self._matches_pattern(endpoint, pattern):
return policy
return {"cacheable": False}
def _matches_pattern(self, endpoint: str, pattern: str) -> bool:
"""Simple pattern matching for endpoints."""
if "{" not in pattern:
return endpoint.startswith(pattern)
# Handle patterns like /api/products/{id}
pattern_parts = pattern.split("/")
endpoint_parts = endpoint.split("/")
if len(pattern_parts) != len(endpoint_parts):
return False
for p, e in zip(pattern_parts, endpoint_parts):
if p.startswith("{") and p.endswith("}"):
continue # Wildcard match
if p != e:
return False
return True
Part II: Production Implementation
Chapter 8: Complete Multi-Tier Cache System
# Complete Multi-Tier Cache System
import asyncio
import hashlib
import json
import logging
from dataclasses import dataclass
from typing import Dict, List, Optional, Any
from datetime import datetime
from enum import Enum

logger = logging.getLogger(__name__)

# =============================================================================
# Configuration
# =============================================================================

@dataclass
class MultiTierCacheConfig:
    """Configuration for multi-tier cache system."""
    # Application cache (Redis)
    app_cache_default_ttl: int = 300
    app_cache_max_ttl: int = 86400
    # Gateway cache
    gateway_cache_enabled: bool = True
    gateway_cache_ttl: int = 60
    # CDN
    cdn_enabled: bool = True
    cdn_default_ttl: int = 60
    cdn_purge_on_update: bool = True
    # Browser
    browser_cache_default: int = 0
    browser_cache_static: int = 31536000
    # Invalidation
    async_invalidation: bool = True
    invalidation_queue: str = "cache-invalidation"


class CacheLayer(Enum):
    BROWSER = "browser"
    CDN = "cdn"
    GATEWAY = "gateway"
    APPLICATION = "application"
    DATABASE = "database"
# =============================================================================
# Cache Layer Interfaces
# =============================================================================
class ICacheLayer:
    """Interface for cache layers."""

    async def get(self, key: str) -> Optional[Any]:
        raise NotImplementedError

    async def set(self, key: str, value: Any, ttl: Optional[int] = None) -> bool:
        raise NotImplementedError

    async def delete(self, key: str) -> bool:
        raise NotImplementedError

    async def delete_pattern(self, pattern: str) -> int:
        raise NotImplementedError


class ApplicationCache(ICacheLayer):
    """Redis-based application cache."""

    def __init__(self, redis_client, config: MultiTierCacheConfig):
        self.redis = redis_client
        self.config = config

    async def get(self, key: str) -> Optional[Any]:
        try:
            value = await self.redis.get(key)
            return json.loads(value) if value else None
        except Exception as e:
            # Treat cache errors as a miss so the caller falls back to the source
            logger.warning(f"App cache get error: {e}")
            return None

    async def set(self, key: str, value: Any, ttl: Optional[int] = None) -> bool:
        # Apply the default TTL and cap it at the configured maximum
        ttl = min(ttl or self.config.app_cache_default_ttl, self.config.app_cache_max_ttl)
        try:
            await self.redis.setex(key, ttl, json.dumps(value, default=str))
            return True
        except Exception as e:
            logger.warning(f"App cache set error: {e}")
            return False

    async def delete(self, key: str) -> bool:
        try:
            await self.redis.delete(key)
            return True
        except Exception as e:
            logger.warning(f"App cache delete error: {e}")
            return False

    async def delete_pattern(self, pattern: str) -> int:
        deleted = 0
        try:
            # SCAN iterates incrementally and does not block Redis like KEYS
            cursor = 0
            while True:
                cursor, keys = await self.redis.scan(cursor, match=pattern, count=100)
                if keys:
                    await self.redis.delete(*keys)
                    deleted += len(keys)
                if cursor == 0:
                    break
        except Exception as e:
            logger.warning(f"App cache pattern delete error: {e}")
        return deleted


class CDNCache(ICacheLayer):
    """CDN cache layer (via API)."""

    def __init__(self, cdn_client, config: MultiTierCacheConfig):
        self.cdn = cdn_client
        self.config = config

    async def get(self, key: str) -> Optional[Any]:
        # CDN is transparent - we don't read from it directly
        return None

    async def set(self, key: str, value: Any, ttl: Optional[int] = None) -> bool:
        # CDN caching is controlled by headers, not direct sets
        return True

    async def delete(self, key: str) -> bool:
        if not self.config.cdn_enabled:
            return True
        try:
            return await self.cdn.purge_url(key)
        except Exception as e:
            logger.warning(f"CDN purge error: {e}")
            return False

    async def delete_pattern(self, pattern: str) -> int:
        if not self.config.cdn_enabled:
            return 0
        try:
            result = await self.cdn.purge_pattern(pattern)
            return 1 if result else 0
        except Exception as e:
            logger.warning(f"CDN pattern purge error: {e}")
            return 0
# =============================================================================
# Multi-Tier Cache Manager
# =============================================================================
class MultiTierCacheManager:
    """
    Manages caching across all tiers.
    Provides unified interface for:
    - Reading from multiple tiers
    - Writing to appropriate tiers
    - Invalidating across all tiers
    """

    def __init__(
        self,
        app_cache: ApplicationCache,
        cdn_cache: CDNCache,
        gateway_cache: Optional[ICacheLayer],
        event_publisher,
        config: MultiTierCacheConfig
    ):
        self.app_cache = app_cache
        self.cdn_cache = cdn_cache
        self.gateway_cache = gateway_cache
        self.events = event_publisher
        self.config = config
        # Layer order for reads (fastest first)
        self.read_layers = [self.app_cache]
        # Layer order for invalidation (closest to database first);
        # the gateway cache is optional, so skip it when absent
        self.invalidation_layers = [
            layer
            for layer in (self.app_cache, self.gateway_cache, self.cdn_cache)
            if layer is not None
        ]

    async def get(
        self,
        key: str,
        fetch_func: callable = None,  # zero-argument async callable
        ttl: Optional[int] = None
    ) -> Optional[Any]:
        """
        Get value from cache hierarchy.
        Checks each layer in order, returns first hit.
        On miss, optionally fetches and caches.
        """
        # Try each read layer
        for layer in self.read_layers:
            if layer:
                value = await layer.get(key)
                if value is not None:
                    return value
        # All layers missed
        if fetch_func is None:
            return None
        # Fetch from source
        value = await fetch_func()
        if value is not None:
            # Cache in application layer
            await self.app_cache.set(key, value, ttl)
        return value

    async def set(
        self,
        key: str,
        value: Any,
        ttl: Optional[int] = None,
        layers: Optional[List[CacheLayer]] = None
    ) -> bool:
        """
        Set value in specified cache layers.
        By default, only sets in application layer.
        CDN caching is typically done via headers.
        """
        layers = layers or [CacheLayer.APPLICATION]
        success = True
        for layer in layers:
            if layer == CacheLayer.APPLICATION:
                stored = await self.app_cache.set(key, value, ttl)
                success = success and stored
        return success

    async def invalidate(
        self,
        keys: List[str],
        patterns: Optional[List[str]] = None,
        cdn_urls: Optional[List[str]] = None,
        sync: Optional[bool] = None
    ) -> bool:
        """
        Invalidate cache across all tiers.
        Args:
            keys: Specific keys to invalidate
            patterns: Patterns to match for bulk invalidation
            cdn_urls: CDN URLs/patterns to purge
            sync: Force synchronous invalidation
        """
        sync = sync if sync is not None else not self.config.async_invalidation
        if sync:
            return await self._invalidate_sync(keys, patterns, cdn_urls)
        else:
            return await self._invalidate_async(keys, patterns, cdn_urls)

    async def _invalidate_sync(
        self,
        keys: List[str],
        patterns: Optional[List[str]] = None,
        cdn_urls: Optional[List[str]] = None
    ) -> bool:
        """Synchronous invalidation across all tiers."""
        success = True
        # Application cache. Await every delete before combining results:
        # "success and await ..." would skip later deletes after one failure.
        for key in keys:
            deleted = await self.app_cache.delete(key)
            success = success and deleted
        for pattern in (patterns or []):
            await self.app_cache.delete_pattern(pattern)
        # Gateway cache
        if self.gateway_cache:
            for key in keys:
                await self.gateway_cache.delete(key)
            for pattern in (patterns or []):
                await self.gateway_cache.delete_pattern(pattern)
        # CDN
        for url in (cdn_urls or []):
            purged = await self.cdn_cache.delete(url)
            success = success and purged
        return success

    async def _invalidate_async(
        self,
        keys: List[str],
        patterns: Optional[List[str]] = None,
        cdn_urls: Optional[List[str]] = None
    ) -> bool:
        """Queue invalidation for async processing."""
        event = {
            "type": "cache.invalidate",
            "keys": keys,
            "patterns": patterns or [],
            "cdn_urls": cdn_urls or [],
            "timestamp": datetime.utcnow().isoformat()
        }
        # Invalidate app cache immediately (fast)
        for key in keys:
            await self.app_cache.delete(key)
        # Queue the rest (patterns, gateway, CDN) for async processing
        await self.events.publish(self.config.invalidation_queue, event)
        return True

    def get_cache_headers(
        self,
        ttl: int,
        private: bool = False,
        vary: Optional[List[str]] = None,
        etag: Optional[str] = None
    ) -> Dict[str, str]:
        """
        Generate HTTP cache headers.
        Call this when building responses to control
        browser and CDN caching.
        """
        directives = []
        if private:
            directives.append("private")
        else:
            directives.append("public")
        directives.append(f"max-age={ttl}")
        if ttl > 0:
            # Add stale-while-revalidate for better UX
            swr = min(ttl // 2, 60)
            directives.append(f"stale-while-revalidate={swr}")
        headers = {
            "Cache-Control": ", ".join(directives)
        }
        if vary:
            headers["Vary"] = ", ".join(vary)
        if etag:
            headers["ETag"] = f'"{etag}"'
        return headers
# =============================================================================
# Repository with Multi-Tier Caching
# =============================================================================
class CachedProductRepository:
    """
    Product repository with multi-tier caching.
    Demonstrates integration of cache manager with business logic.
    """

    def __init__(
        self,
        db_client,
        cache_manager: MultiTierCacheManager
    ):
        self.db = db_client
        self.cache = cache_manager

    async def get_product(self, product_id: str) -> Optional[dict]:
        """Get product with multi-tier caching."""
        cache_key = f"product:{product_id}"
        return await self.cache.get(
            key=cache_key,
            fetch_func=lambda: self._fetch_product(product_id),
            ttl=300
        )

    async def _fetch_product(self, product_id: str) -> Optional[dict]:
        """Fetch product from database."""
        row = await self.db.fetch_one(
            "SELECT * FROM products WHERE id = $1",
            product_id
        )
        return dict(row) if row else None

    async def update_product(
        self,
        product_id: str,
        data: dict
    ) -> Optional[dict]:
        """Update product with cache invalidation."""
        # Update database
        product = await self.db.fetch_one(
            """
            UPDATE products
            SET name = $2, price = $3, updated_at = NOW()
            WHERE id = $1
            RETURNING *
            """,
            product_id, data.get("name"), data.get("price")
        )
        if not product:
            return None
        # Invalidate all cache tiers
        await self.cache.invalidate(
            keys=[
                f"product:{product_id}",
                f"product_page:{product_id}",
            ],
            patterns=[
                "products:*",  # Product lists
            ],
            cdn_urls=[
                f"/api/products/{product_id}",
                "/api/products?*",
                f"/products/{product_id}",
            ]
        )
        return dict(product)

    async def get_products_list(
        self,
        category: Optional[str] = None,
        page: int = 1,
        limit: int = 20
    ) -> dict:
        """Get product list with caching."""
        # Include limit in the key so different page sizes don't collide
        cache_key = f"products:list:{category or 'all'}:page{page}:limit{limit}"
        return await self.cache.get(
            key=cache_key,
            fetch_func=lambda: self._fetch_products_list(category, page, limit),
            ttl=60  # Short TTL for lists
        )

    async def _fetch_products_list(
        self,
        category: Optional[str],
        page: int,
        limit: int
    ) -> dict:
        """Fetch product list from database."""
        offset = (page - 1) * limit
        if category:
            rows = await self.db.fetch(
                """
                SELECT * FROM products
                WHERE category = $1
                ORDER BY created_at DESC
                LIMIT $2 OFFSET $3
                """,
                category, limit, offset
            )
        else:
            rows = await self.db.fetch(
                """
                SELECT * FROM products
                ORDER BY created_at DESC
                LIMIT $1 OFFSET $2
                """,
                limit, offset
            )
        return {
            "products": [dict(row) for row in rows],
            "page": page,
            "limit": limit
        }

    def get_response_headers(
        self,
        product: dict,
        authenticated: bool
    ) -> dict:
        """Get cache headers for product response."""
        return self.cache.get_cache_headers(
            ttl=60 if not authenticated else 0,
            private=authenticated,
            vary=["Accept-Encoding"] + (["Authorization"] if authenticated else []),
            etag=str(product.get("updated_at", ""))
        )
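The manager publishes `cache.invalidate` events to a queue, but the consumer side is not shown. The following is a minimal sketch of such a worker, assuming events have the shape built in `_invalidate_async`. `InMemoryLayer`, `purge_with_retry`, `handle_invalidation_event`, and the retry parameters are all hypothetical; a real consumer would wrap the gateway and CDN clients and read from the actual message queue.

```python
import asyncio
from typing import Any, Dict, List


class InMemoryLayer:
    """Stand-in for a gateway or CDN client; records what was purged."""

    def __init__(self, fail_first: int = 0):
        self.purged: List[str] = []
        self._failures_left = fail_first

    async def delete(self, key: str) -> bool:
        if self._failures_left > 0:
            self._failures_left -= 1
            return False  # simulate a transient purge failure
        self.purged.append(key)
        return True


async def purge_with_retry(layer, key: str, attempts: int = 3,
                           base_delay: float = 0.01) -> bool:
    """Retry a purge with exponential backoff; False if every attempt fails."""
    for attempt in range(attempts):
        if await layer.delete(key):
            return True
        if attempt < attempts - 1:
            await asyncio.sleep(base_delay * (2 ** attempt))
    return False


async def handle_invalidation_event(event: Dict[str, Any], gateway, cdn) -> List[str]:
    """Process one event, gateway before CDN. Returns targets that could not be purged."""
    failed: List[str] = []
    for key in event.get("keys", []):
        if not await purge_with_retry(gateway, key):
            failed.append(f"gateway:{key}")
    for url in event.get("cdn_urls", []):
        if not await purge_with_retry(cdn, url):
            failed.append(f"cdn:{url}")
    return failed
```

Returning the failed targets lets the caller decide whether to re-queue the event or send it to a dead-letter queue. Until a failed purge is retried, staleness at that tier is bounded only by its TTL.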
Part III: Real-World Application
Chapter 9: Case Studies
9.1 Case Study: Netflix — Multi-Region Multi-Tier
NETFLIX CACHING ARCHITECTURE
┌─────────────────────────────────────────────────────────┐
│ CLIENT │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Browser/App Cache │ │
│ │ - Video player state │ │
│ │ - Recently watched │ │
│ │ - UI preferences │ │
│ └────────────────────────────────────────────────────┘ │
└───────────────────────────┬─────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────┐
│ CDN (Open Connect) │
│ ┌────────────────────────────────────────────────────┐ │
│ │ - Video content (actual files) │ │
│ │ - ISP-embedded servers │ │
│ │ - 95%+ of bandwidth served from edge │ │
│ └────────────────────────────────────────────────────┘ │
└───────────────────────────┬─────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────┐
│ AWS REGION │
│ ┌────────────────────────────────────────────────────┐ │
│ │ EVCache (Memcached-based) │ │
│ │ - Session data │ │
│ │ - User profiles │ │
│ │ - Recommendations │ │
│ │ - Zone-aware replication │ │
│ └────────────────────────────────────────────────────┘ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Cassandra │ │
│ │ - Persistent storage │ │
│ │ - Cross-region replication │ │
│ └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
Key decisions:
1. VIDEO on CDN - 95% of bandwidth
2. METADATA in EVCache - User profiles, preferences
3. CASSANDRA for durability - Cross-region
4. Zone-aware - Survive AZ failures
9.2 Case Study: Amazon — E-commerce Multi-Tier
AMAZON CACHING ARCHITECTURE
Product Page Request Flow:
1. CLOUDFRONT (CDN)
├── Static assets (JS, CSS, images)
├── Product images
└── Some API responses (product details)
2. APPLICATION LOAD BALANCER
├── SSL termination
└── Request routing
3. MICROSERVICES
├── Product Service
│ └── ElastiCache (Redis)
│ ├── Product details
│ └── Price (short TTL)
│
├── Inventory Service
│ └── DynamoDB DAX (DynamoDB Accelerator)
│ └── Real-time inventory
│
├── Recommendations Service
│ └── ElastiCache (Redis)
│ └── Pre-computed recommendations
│
└── Reviews Service
└── ElastiCache (Memcached)
└── Aggregated reviews
4. DATABASES
├── DynamoDB - Product catalog
├── Aurora - Orders, users
└── Neptune - Product graph
Key insight:
Each microservice has its OWN cache
Different cache tech for different needs
Redis for complex data, Memcached for simple
9.3 Summary: Industry Patterns
| Company | CDN Usage | App Cache | Key Innovation |
|---|---|---|---|
| Netflix | Video content | EVCache | Zone-aware caching |
| Amazon | Static + API | ElastiCache + DAX | Per-service caching |
| Cloudflare | Everything | Workers KV | Edge compute |
| Facebook | Static | TAO + Memcache | Graph-aware cache |
| Twitter | Media | Redis | Timeline caching |
Chapter 10: Common Mistakes
10.1 Mistake 1: Caching Authenticated Responses at CDN
❌ WRONG: CDN caches user-specific data
# API returns user's profile
@app.get("/api/profile")
async def get_profile(user = Depends(get_current_user)):
    profile = await fetch_profile(user.id)
    return profile  # No Cache-Control header!
# CDN caches this response
# Next user gets PREVIOUS user's profile!
✅ CORRECT: Mark authenticated responses as private
@app.get("/api/profile")
async def get_profile(response: Response, user = Depends(get_current_user)):
    profile = await fetch_profile(user.id)
    # Don't cache at CDN
    response.headers["Cache-Control"] = "private, no-store"
    response.headers["Vary"] = "Authorization"
    return profile
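A mistake like this is easy to reintroduce, so it can help to check it automatically. The sketch below is a conservative guard that an integration test could run against each endpoint. `parse_cache_control` and `is_safe_for_shared_cache` are hypothetical helper names, and the rule applied (any request carrying `Authorization` or `Cookie` must get `private` or `no-store` back) is deliberately stricter than what HTTP caching rules require.

```python
from typing import Mapping


def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip()] = arg.strip().strip('"') or None
    return directives


def is_safe_for_shared_cache(request_headers: Mapping[str, str],
                             response_headers: Mapping[str, str]) -> bool:
    """Return False when an authenticated response could be stored by a shared cache."""
    authenticated = any(
        name.lower() in ("authorization", "cookie") for name in request_headers
    )
    if not authenticated:
        return True
    # Header names are case-insensitive, so search rather than index directly
    cc_value = next(
        (v for k, v in response_headers.items() if k.lower() == "cache-control"), ""
    )
    directives = parse_cache_control(cc_value)
    return "private" in directives or "no-store" in directives
```

Applied to the two handlers above, the first version fails the check (no `Cache-Control` at all) and the second passes.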
10.2 Mistake 2: Forgetting to Vary on Important Headers
❌ WRONG: Cache without proper Vary headers
# API returns different content based on Accept-Language
@app.get("/api/products")
async def get_products(request: Request):
    lang = request.headers.get("Accept-Language", "en")
    products = await fetch_products(lang)
    return products  # Missing Vary header!
# CDN caches English response
# French user gets English!
✅ CORRECT: Vary on content-affecting headers
@app.get("/api/products")
async def get_products(request: Request, response: Response):
    lang = request.headers.get("Accept-Language", "en")
    products = await fetch_products(lang)
    response.headers["Vary"] = "Accept-Language, Accept-Encoding"
    response.headers["Cache-Control"] = "public, max-age=60"
    return products
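To see why the missing `Vary` header causes the wrong language to be served, it helps to look at how a cache might derive its lookup key. This is a simplified sketch, not any particular CDN's algorithm; `vary_cache_key` is a hypothetical function name.

```python
import hashlib
from typing import Mapping


def vary_cache_key(method: str, url: str,
                   request_headers: Mapping[str, str], vary: str) -> str:
    """Build a cache key from the URL plus the request headers named in Vary."""
    normalized = {k.lower(): v.strip() for k, v in request_headers.items()}
    parts = [method.upper(), url]
    # Only headers listed in Vary contribute to the key
    for name in sorted(h.strip().lower() for h in vary.split(",") if h.strip()):
        parts.append(f"{name}={normalized.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

With an empty `Vary`, English and French requests for `/api/products` produce the same key and share one cached entry. With `Vary: Accept-Language` they get separate entries. Raw `Accept-Language` values differ widely between browsers, so in practice the value is often normalized to a small set of supported languages first to avoid fragmenting the cache.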
10.3 Mistake 3: Invalidating in Wrong Order
❌ WRONG: Invalidate CDN before app cache
async def update_product(product_id: str, data: dict):
    # Update database
    await db.update(product_id, data)
    # Invalidate CDN first
    await cdn.purge(f"/api/products/{product_id}")
    # Then app cache
    await redis.delete(f"product:{product_id}")
    # PROBLEM: CDN fetches from origin between purge and redis delete
    # Origin still has old data in Redis!
    # CDN re-caches stale data!
✅ CORRECT: Invalidate closest to database first
async def update_product(product_id: str, data: dict):
    # Update database
    await db.update(product_id, data)
    # First: App cache (closest to DB)
    await redis.delete(f"product:{product_id}")
    # Second: Gateway cache
    await gateway.invalidate(f"/api/products/{product_id}")
    # Last: CDN (farthest from DB)
    await cdn.purge(f"/api/products/{product_id}")
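The race is easier to see in a deterministic toy model. The sketch below replaces the database, Redis, and the CDN with plain dictionaries and injects one client request into the gap between the two invalidation steps. `Stack`, `cdn_first`, and `app_cache_first` are hypothetical names used only for this illustration.

```python
class Stack:
    """Toy model: database -> Redis (app cache) -> CDN, each a plain dict."""

    def __init__(self):
        self.db = {"product:1": "old"}
        self.redis = {"product:1": "old"}
        self.cdn = {"product:1": "old"}

    def origin_read(self, key: str) -> str:
        """Origin serves from Redis when possible (cache-aside)."""
        if key not in self.redis:
            self.redis[key] = self.db[key]
        return self.redis[key]

    def client_read(self, key: str) -> str:
        """A client request goes through the CDN, which fills from origin on a miss."""
        if key not in self.cdn:
            self.cdn[key] = self.origin_read(key)
        return self.cdn[key]


def cdn_first() -> str:
    s = Stack()
    s.db["product:1"] = "new"
    s.cdn.pop("product:1")        # purge CDN first (wrong order)
    s.client_read("product:1")    # request arrives in the gap; CDN refills from stale Redis
    s.redis.pop("product:1")      # app cache deleted too late
    return s.client_read("product:1")


def app_cache_first() -> str:
    s = Stack()
    s.db["product:1"] = "new"
    s.redis.pop("product:1")      # app cache first (closest to DB)
    s.client_read("product:1")    # request in the gap is still served by the CDN
    s.cdn.pop("product:1")        # purge CDN last
    return s.client_read("product:1")
```

In `cdn_first` the CDN ends up holding the old value again until its TTL expires. In `app_cache_first` the gap request may still see the old value once, but nothing stale is re-cached, and the next request after the purge sees the new data.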
10.4 Mistake 4: Not Versioning Cache Keys
❌ WRONG: Unversioned cache keys
# Deploy v1
await redis.set("product:123", product_v1_schema)
# Deploy v2 (schema changed!)
data = await redis.get("product:123")
# Crash! Old schema doesn't have new fields
✅ CORRECT: Version in cache key
CACHE_VERSION = "v3"
def cache_key(entity: str, entity_id: str) -> str:
    return f"{CACHE_VERSION}:{entity}:{entity_id}"
# Deploy v3
await redis.set(cache_key("product", "123"), product_v3_schema)
# On v4 deploy, just change CACHE_VERSION
# Old keys expire naturally, no crash
10.5 Mistake Checklist
- Authenticated responses marked private — No CDN caching of user data
- Vary headers set correctly — For language, encoding, auth
- Invalidation order correct — App cache → Gateway → CDN
- Cache keys versioned — Survives schema changes
- CDN purge is async — Don't block on CDN API
- Static assets have long TTL — With content hash in filename
- Sensitive data not cached — no-store for PII, payment
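The checklist item on static assets ("long TTL, with content hash in filename") can be sketched as a small build-time helper. This is an illustration under assumptions, not a specific bundler's behavior: `hashed_asset_name` and `static_asset_headers` are hypothetical names, and the 8-character SHA-256 prefix is an arbitrary choice.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_asset_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def static_asset_headers() -> dict:
    """Headers for hashed assets: cache for one year and never revalidate."""
    return {"Cache-Control": "public, max-age=31536000, immutable"}
```

Because the name changes whenever the content changes, a deploy never has to purge these files. Only the HTML that references them needs a short TTL or revalidation.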
Part IV: Interview Preparation
Chapter 11: Interview Tips
11.1 Key Phrases
INTRODUCING MULTI-TIER:
"For a high-traffic system, I'd design caching at multiple
tiers: CDN for static content and public APIs, application
cache in Redis for business data, and proper Cache-Control
headers for browser caching. Each tier serves a different
purpose and has different invalidation needs."
EXPLAINING CDN:
"The CDN handles static assets with long TTLs—CSS, JS,
images. For API responses, I'd cache public endpoints
like product listings at the CDN with short TTLs, but
mark authenticated endpoints as private to prevent
one user's data from being served to another."
ON INVALIDATION:
"Invalidation order matters. When data changes, I invalidate
closest to the database first—Redis, then gateway cache,
then CDN. This prevents the CDN from re-caching stale data
if it fetches from an origin that still has cached old data."
ON CONSISTENCY:
"Multi-tier caching means accepting eventual consistency.
Browser cache might be stale for minutes, CDN for seconds.
For critical data like inventory or pricing, I'd use
shorter TTLs and event-driven invalidation to minimize
the staleness window."
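When discussing the staleness window, it helps to put a number on the worst case. If invalidation fails entirely, each tier can fill from the tier behind it just before that tier's entry expires, so the maximum age a user can see is roughly the sum of the TTLs (plus any stale-while-revalidate allowance). A minimal sketch of that arithmetic, with a hypothetical function name and illustrative TTL values:

```python
from typing import Dict, Optional


def worst_case_staleness(ttl_seconds: Dict[str, int],
                         swr_seconds: Optional[Dict[str, int]] = None) -> int:
    """Upper bound (seconds) on the age of data reaching the user with no invalidation."""
    swr_seconds = swr_seconds or {}
    return sum(ttl + swr_seconds.get(tier, 0) for tier, ttl in ttl_seconds.items())


# Illustrative TTLs: browser 60 s, CDN 60 s, Redis 300 s
ttls = {"browser": 60, "cdn": 60, "redis": 300}
total = worst_case_staleness(ttls)                        # 60 + 60 + 300 = 420
with_swr = worst_case_staleness(ttls, {"browser": 30})    # 420 + 30 = 450
```

So "60-second TTL at the CDN" does not mean data is at most 60 seconds old. Event-driven invalidation is what brings the typical case down to seconds, and the summed TTLs are the fallback bound when it fails.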
11.2 Common Questions
| Question | Good Answer |
|---|---|
| "How do you handle cache invalidation at the CDN?" | "CDN providers have purge APIs. When data changes, I'd publish an invalidation event, and a consumer calls the CDN purge API. For bulk updates, I'd use cache tags—tag all product pages with 'products', then purge by tag." |
| "How do you prevent caching user-specific data at CDN?" | "Set Cache-Control: private and Vary: Authorization. The private directive tells CDN not to cache. Vary ensures different auth tokens get different cache entries if somehow cached." |
| "What about browser caching for SPAs?" | "I'd use content-hashed filenames for JS/CSS with immutable cache headers (1 year TTL). The HTML entry point gets no-cache with ETag—browser always checks, but gets 304 Not Modified if unchanged." |
| "How do you debug cache issues?" | "Check X-Cache headers from CDN (HIT/MISS). Log cache hit ratios at each tier. Use request IDs to trace a request through all layers. Monitor cache eviction rates and TTL distributions." |
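The cache-tag answer can be backed with a small model of how purge-by-tag works. This is an in-memory sketch, not a CDN API: `TaggedCache` is a hypothetical class, and real providers expose tag purging through their own headers and endpoints.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """In-memory model of a cache that supports purge-by-tag."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        """Cache a response and index it under each of its tags."""
        self._entries[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._entries.get(url)

    def purge_tag(self, tag: str) -> int:
        """Drop every entry carrying the tag; returns how many were removed."""
        urls = self._urls_by_tag.pop(tag, set())
        removed = 0
        for url in urls:
            # An entry may already be gone if another tag purged it earlier
            if self._entries.pop(url, None) is not None:
                removed += 1
        return removed
```

Tagging every product response with `products` turns a bulk price change into a single `purge_tag("products")` call, while a per-product tag such as `product-1` still allows targeted purges.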
Chapter 12: Mock Interview
Scenario: Design Caching for an E-commerce Platform
Interviewer: "We're building an e-commerce platform. Walk me through how you'd design the caching architecture."
You: "I'd approach this with multiple cache tiers, each serving a specific purpose. Let me walk through from user to database.
Browser caching: For static assets—CSS, JavaScript, images—I'd use content-hashed filenames with immutable cache headers. app.abc123.js gets cached for a year. When we deploy new code, the filename changes, so users get fresh assets without explicit invalidation.
For HTML pages in a single-page app, I'd set Cache-Control: no-cache with ETag. The browser always checks, but gets a 304 Not Modified if nothing changed—saves bandwidth.
Interviewer: "What about the CDN layer?"
You: "The CDN is critical for performance. I'd cache:
- All static assets with long TTLs, served from edge locations globally.
- Public API responses like product listings and category pages with short TTLs, maybe 60 seconds. These are the same for all anonymous users, so CDN caching is highly effective.
- Product images with long TTLs. When a product image changes, we'd use a new URL rather than purging.
I would NOT cache authenticated endpoints at the CDN. Any request with an Authorization header gets Cache-Control: private, ensuring user-specific data stays private.
For Vary headers, I'd include Accept-Encoding (for compression) and Accept-Language if we serve localized content.
Interviewer: "How do you handle product price changes?"
You: "Price changes need fast propagation. Here's the flow:
1. Database update: Price changes from $99 to $79.
2. Application cache invalidation: Immediately delete product:123 from Redis. This is fast and synchronous.
3. CDN purge: Call the CDN API to purge /api/products/123 and any product listing pages. This is async; I'd queue it so it doesn't block the price update.
4. Browser cache: Can't force invalidation, but with a 60-second TTL on product pages, staleness is bounded.
For flash sales where thousands of prices change at once, I'd use CDN cache tags. All product responses include a Cache-Tag: products header. To invalidate all products, I purge by tag—one API call instead of thousands.
Interviewer: "What about the application cache layer?"
You: "Redis serves as the application cache. I'd cache:
- Product details: Full product objects with 5-minute TTL.
- Inventory counts: Very short TTL (30 seconds) or event-driven invalidation since inventory changes frequently.
- User sessions: With sliding expiration.
- Computed values: Like product ratings, category counts.
I'd use the cache-aside pattern—check cache, miss goes to database, populate cache on return. For inventory, I might use read-through with very short TTL since accuracy matters.
Interviewer: "How do you ensure consistency across these tiers?"
You: "Perfect consistency across all tiers is impractical—it would eliminate the performance benefits. Instead, I'd design for bounded staleness:
- CDN: 60-second max staleness for product data
- Redis: 5-minute max, but event-driven invalidation makes it usually seconds
- Browser: 60-second max for API data, longer for static assets
For critical paths like checkout, I'd bypass caching entirely and read from the database. A customer should never be charged a different price than what they saw.
The invalidation order is crucial: Redis first, then CDN. If I purge CDN before Redis, the CDN might fetch from origin, get the old cached value from Redis, and re-cache stale data."
Interviewer: "How would you monitor this system?"
You: "Key metrics at each tier:
CDN: Hit ratio (should be >95% for static assets), origin fetch latency, purge success rate.
Redis: Hit ratio, memory usage, eviction rate, latency percentiles.
Database: Query rate (should be low if caching is effective), slow queries.
I'd add X-Cache headers to responses indicating HIT/MISS at each layer, making debugging easier. And I'd alert on hit ratio drops—often indicates invalidation bugs or traffic pattern changes."
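The per-tier hit ratio tracking described in this answer can be sketched as follows. `TierStats` is a hypothetical class; in production these counters would normally live in a metrics system rather than in process memory.

```python
from collections import Counter
from typing import List


class TierStats:
    """Counts cache hits and misses per tier and reports hit ratios."""

    def __init__(self) -> None:
        self.hits: Counter = Counter()
        self.misses: Counter = Counter()

    def record(self, tier: str, hit: bool) -> None:
        if hit:
            self.hits[tier] += 1
        else:
            self.misses[tier] += 1

    def hit_ratio(self, tier: str) -> float:
        total = self.hits[tier] + self.misses[tier]
        return self.hits[tier] / total if total else 0.0

    def tiers_below(self, threshold: float) -> List[str]:
        """Tiers with recorded traffic whose hit ratio is under the alert threshold."""
        tiers = set(self.hits) | set(self.misses)
        return sorted(t for t in tiers if self.hit_ratio(t) < threshold)
```

An alerting job could call `tiers_below(0.9)` on each reporting interval and page when a tier that is normally healthy appears in the result.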
Summary
DAY 5 KEY TAKEAWAYS
THE CACHE HIERARCHY:
• Browser → CDN → Gateway → Application → Database
• Each layer has different characteristics
• Moving down: higher latency, better freshness
WHAT BELONGS WHERE:
Browser:
• Static assets (with content hashing)
• User preferences
• Short-lived API responses
CDN:
• Static assets
• Public API responses
• Media files
• NOT: Authenticated/personalized content
API Gateway:
• Auth token validation
• Rate limit counters
• Request deduplication
Application (Redis):
• Business data
• Sessions
• Computed values
• Feature flags
INVALIDATION ORDER:
App Cache → Gateway → CDN → (Browser via TTL)
Closest to database first!
KEY PRINCIPLES:
• Private for authenticated responses
• Vary on content-affecting headers
• Version cache keys for schema changes
• Async CDN purge (don't block updates)
• Bounded staleness, not perfect consistency
CACHE HEADERS:
• Cache-Control: public/private, max-age, stale-while-revalidate
• Vary: Accept-Encoding, Accept-Language, Authorization
• ETag: For conditional requests
📚 Further Reading
Official Documentation
- HTTP Caching (MDN): https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching
- CloudFront Developer Guide: https://docs.aws.amazon.com/AmazonCloudFront/
- Fastly Caching Guides: https://developer.fastly.com/learning/concepts/cache-freshness/
Engineering Blogs
- Netflix Open Connect: https://openconnect.netflix.com/
- Cloudflare Workers: https://blog.cloudflare.com/
- Varnish Cache: https://varnish-cache.org/docs/
Books
- "High Performance Browser Networking" by Ilya Grigorik
- "Web Scalability for Startup Engineers" by Artur Ejsmont
End of Day 5: Multi-Tier Caching
This completes Week 4: Caching — Beyond "Just Add Redis"
You've learned:
- Day 1: The four caching patterns
- Day 2: Invalidation strategies
- Day 3: Thundering herd prevention
- Day 4: Feed caching (push vs pull)
- Day 5: Multi-tier caching architecture
Next Week: Week 5 — Consistency and Coordination. We'll explore distributed transactions, consensus algorithms, and how to maintain consistency across services.