Foundation

Week 0 — Part 2: Infrastructure Building Blocks

The Complete Guide to System Components

Introduction

Every system you design will use some combination of infrastructure building blocks. Understanding what each component does, how it works internally, when to use it, what trade-offs it carries, and how it fits with other components is essential for system design interviews and real-world architecture.

This document provides deep coverage of each component — not just definitions, but the practical knowledge you need to make informed design decisions.

┌─────────────────────────────────────────────────────────────────────────┐
│                   THE SYSTEM DESIGN BUILDING BLOCKS                     │
│                                                                         │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                      EDGE LAYER                                 │   │
│   │   CDN │ WAF │ DNS │ Load Balancer │ API Gateway                 │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                │                                        │
│                                ▼                                        │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                   APPLICATION LAYER                             │   │
│   │   Web Servers │ Application Servers │ Microservices             │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                │                                        │
│                                ▼                                        │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                       DATA LAYER                                │   │
│   │   Databases │ Caches │ Object Storage │ Search Engines          │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                │                                        │
│                                ▼                                        │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │                   MESSAGING LAYER                               │   │
│   │   Message Queues │ Event Streams │ Pub/Sub                      │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Chapter 1: Edge Layer Components

The edge layer is the first thing users interact with. It sits between users and your application servers, handling traffic before it reaches your backend. A well-designed edge layer dramatically improves performance, security, and reliability.

1.1 Content Delivery Network (CDN)

What Is a CDN?

A Content Delivery Network is a geographically distributed network of proxy servers, called edge servers or Points of Presence (PoPs). The goal is to serve content from locations physically closer to users, which shortens the network round trip and so reduces latency.

When a user requests content, the request goes to the nearest CDN edge location instead of traveling thousands of miles to your origin server.

How CDN Works — Step by Step

┌─────────────────────────────────────────────────────────────────────────┐
│                         CDN REQUEST FLOW                                │
│                                                                         │
│   SCENARIO: User in Tokyo requests image from US-based origin           │
│                                                                         │
│   ═══════════════════════════════════════════════════════════════════   │
│   WITHOUT CDN:                                                          │
│   ═══════════════════════════════════════════════════════════════════   │
│                                                                         │
│   ┌──────────────┐                                    ┌──────────────┐  │
│   │     User     │────────── 200ms RTT ─────────────▶ │    Origin    │  │
│   │   (Tokyo)    │                                    │   (US-East)  │  │
│   │              │◀────────────────────────────────── │              │  │
│   └──────────────┘                                    └──────────────┘  │
│                                                                         │
│   Every single request = 200ms+ latency                                 │
│   Origin server handles ALL traffic worldwide                           │
│   High bandwidth costs for origin                                       │
│                                                                         │
│   ═══════════════════════════════════════════════════════════════════   │
│   WITH CDN - FIRST REQUEST (Cache Miss):                                │
│   ═══════════════════════════════════════════════════════════════════   │
│                                                                         │
│   ┌──────────────┐    ┌──────────────┐             ┌──────────────┐     │
│   │     User     │─1─▶│  CDN Edge    │──────3─────▶│    Origin    │     │
│   │   (Tokyo)    │    │   (Tokyo)    │             │   (US-East)  │     │
│   │              │    │              │◀─────4──────│              │     │
│   │              │◀─5─│  Caches &    │             └──────────────┘     │
│   └──────────────┘    │  Returns     │                                  │
│                       └──────────────┘                                  │
│                                                                         │
│   1. User requests image.jpg from nearest CDN edge (Tokyo)              │
│   2. CDN checks local cache → MISS (not cached yet)                     │
│   3. CDN fetches from origin server (200ms to US)                       │
│   4. Origin returns image to CDN edge                                   │
│   5. CDN caches the image AND returns to user                           │
│                                                                         │
│   First request total time: ~220ms (still slow, but now cached!)        │
│                                                                         │
│   ═══════════════════════════════════════════════════════════════════   │
│   WITH CDN - SUBSEQUENT REQUESTS (Cache Hit):                           │
│   ═══════════════════════════════════════════════════════════════════   │
│                                                                         │
│   ┌──────────────┐    ┌──────────────┐                                  │
│   │     User     │─1─▶│  CDN Edge    │  ← No origin request needed!     │
│   │   (Tokyo)    │    │   (Tokyo)    │                                  │
│   │              │◀─2─│  From Cache  │                                  │
│   └──────────────┘    └──────────────┘                                  │
│                                                                         │
│   1. User requests image.jpg                                            │
│   2. CDN returns from local cache immediately                           │
│                                                                         │
│   Subsequent requests: ~20ms (10x faster!)                              │
│   Origin load: Reduced by 90%+ for popular content                      │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
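The miss-then-hit flow above can be sketched as a toy in-memory edge cache. This is a minimal illustration, not a real CDN API: the `EdgeCache` class and the two latency constants are assumptions, with the numbers taken from the diagram.

```python
from dataclasses import dataclass, field

# Illustrative latencies from the diagram above (milliseconds).
EDGE_RTT_MS = 20      # user <-> nearby edge
ORIGIN_RTT_MS = 200   # edge <-> distant origin


@dataclass
class EdgeCache:
    """Toy model of a single CDN edge location (hypothetical)."""
    origin: dict                                # path -> body, stands in for the origin server
    store: dict = field(default_factory=dict)   # the edge's local cache
    origin_fetches: int = 0

    def get(self, path: str):
        """Return (body, simulated latency in ms, 'HIT' or 'MISS')."""
        if path in self.store:                  # cache hit: answer from the edge
            return self.store[path], EDGE_RTT_MS, "HIT"
        body = self.origin[path]                # steps 3-4: fetch from origin
        self.origin_fetches += 1
        self.store[path] = body                 # step 5: cache it, then return it
        return body, EDGE_RTT_MS + ORIGIN_RTT_MS, "MISS"


edge = EdgeCache(origin={"/image.jpg": b"...jpeg bytes..."})
first = edge.get("/image.jpg")
second = edge.get("/image.jpg")
print(first[1], first[2])    # 220 MISS
print(second[1], second[2])  # 20 HIT
```

Only the first request pays the origin round trip; every later request for the same path is served from the edge, which is also why origin load drops for popular content.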

CDN Points of Presence (PoPs)

Major CDN providers have hundreds of edge locations worldwide (the figures below are approximate snapshots and change over time):

CLOUDFLARE NETWORK (Example):
├── North America: 80+ cities (LA, NYC, Chicago, Dallas, Seattle...)
├── Europe: 60+ cities (London, Frankfurt, Paris, Amsterdam...)
├── Asia Pacific: 50+ cities (Tokyo, Singapore, Sydney, Mumbai...)
├── South America: 20+ cities (São Paulo, Buenos Aires...)
├── Africa: 15+ cities (Johannesburg, Cape Town, Lagos...)
└── Middle East: 10+ cities (Dubai, Tel Aviv...)

Total: 275+ cities across 100+ countries

AWS CLOUDFRONT:
├── 400+ Points of Presence
├── 13 Regional Edge Caches
└── Integrated with AWS services

The closer the PoP is to the user, the lower the latency.
A user in Mumbai connecting to a Mumbai PoP: ~5-10ms
The same user connecting to a US origin: ~150-200ms

What to Cache vs What NOT to Cache

What you cache, and for how long, is the core of a CDN strategy:

┌─────────────────────────────────────────────────────────────────────────┐
│                         CDN CACHING DECISIONS                           │
│                                                                         │
│  ✅ ALWAYS CACHE (Static Content)                                       │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  IMAGES                                                                 │
│  ├── Product photos, thumbnails, hero images                            │
│  ├── User avatars (public ones)                                         │
│  ├── Icons, logos, favicons                                             │
│  ├── Formats: jpg, png, gif, webp, svg, ico                             │
│  └── TTL: 1 year (use versioned URLs for updates)                       │
│                                                                         │
│  CSS & JAVASCRIPT                                                       │
│  ├── Stylesheets (app.css)                                              │
│  ├── JavaScript bundles (app.js, vendor.js)                             │ 
│  ├── Use content hash in filename: app.a1b2c3d4.js                      │
│  └── TTL: 1 year (hash changes = new URL = cache miss = fresh file)     │
│                                                                         │
│  FONTS                                                                  │
│  ├── Web fonts: woff, woff2, ttf, eot                                   │
│  └── TTL: 1 year (fonts rarely change)                                  │
│                                                                         │
│  VIDEO & AUDIO                                                          │
│  ├── Video files: mp4, webm                                             │
│  ├── Audio files: mp3, wav                                              │
│  ├── HLS/DASH segments for streaming                                    │
│  └── TTL: 1 year or based on content type                               │
│                                                                         │
│  DOCUMENTS                                                              │
│  ├── PDFs, downloadable files                                           │
│  ├── Software installers                                                │
│  └── TTL: Based on update frequency                                     │
│                                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  ⚠️ CACHE WITH CARE (Semi-Dynamic Content)                              │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  API RESPONSES (only if SAME for all users)                             │
│  ├── ✓ Product catalog (same for everyone)                              │
│  ├── ✓ Store locations                                                  │
│  ├── ✓ Public configuration                                             │
│  ├── ✗ User profile (different per user!)                               │
│  └── TTL: 1-5 minutes (short, to stay fresh)                            │
│                                                                         │
│  SEARCH RESULTS                                                         │
│  ├── Only if you expect high cache hit rate                             │
│  ├── Popular searches might benefit                                     │
│  └── TTL: 1-5 minutes                                                   │
│                                                                         │
│  USER-AGNOSTIC PAGES                                                    │
│  ├── Homepage (logged-out version only)                                 │
│  ├── Category/listing pages                                             │
│  ├── Marketing/landing pages                                            │
│  └── TTL: 5-60 minutes                                                  │
│                                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  ❌ NEVER CACHE                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  USER-SPECIFIC DATA                                                     │
│  ├── User profiles, account pages                                       │
│  ├── Shopping carts, wishlists                                          │
│  ├── Order history                                                      │
│  ├── Personalized recommendations                                       │
│  └── Dashboard data                                                     │
│                                                                         │
│  AUTHENTICATED API RESPONSES                                            │
│  ├── Anything requiring login                                           │
│  ├── User-specific search results                                       │
│  └── Private data                                                       │
│                                                                         │
│  WRITE OPERATIONS                                                       │
│  ├── POST requests                                                      │
│  ├── PUT/PATCH requests                                                 │
│  └── DELETE requests                                                    │
│                                                                         │
│  REAL-TIME DATA                                                         │
│  ├── Stock prices, cryptocurrency rates                                 │
│  ├── Live sports scores                                                 │
│  ├── Chat messages                                                      │
│  └── Live analytics dashboards                                          │
│                                                                         │
│  WEBHOOKS & CALLBACKS                                                   │
│  ├── Payment confirmations                                              │
│  ├── Third-party notifications                                          │
│  └── OAuth callbacks                                                    │
│                                                                         │
│  SENSITIVE DATA                                                         │
│  ├── Financial information                                              │
│  ├── Health records (HIPAA)                                             │
│  └── PII (Personally Identifiable Information)                          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
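The decision table above can be approximated as a small classifier. This is a sketch under stated assumptions: the function name `cdn_policy`, the `SHARED_PREFIXES` routes, and the three policy labels are hypothetical, and a real configuration would also account for query strings, cookies, and response headers.

```python
from pathlib import PurePosixPath

# File extensions the table lists under "always cache".
STATIC_EXTENSIONS = {
    "jpg", "png", "gif", "webp", "svg", "ico", "css", "js",
    "woff", "woff2", "ttf", "eot", "mp4", "webm", "mp3", "wav", "pdf",
}

# Hypothetical routes whose responses are identical for every user.
SHARED_PREFIXES = ("/api/products", "/api/stores", "/api/config")


def cdn_policy(method: str, path: str, requires_login: bool) -> str:
    """Return 'long', 'short', or 'never' for a request."""
    if method.upper() not in ("GET", "HEAD"):
        return "never"                      # write operations are never cached
    if requires_login:
        return "never"                      # user-specific or private data
    extension = PurePosixPath(path).suffix.lstrip(".").lower()
    if extension in STATIC_EXTENSIONS:
        return "long"                       # static asset: long TTL, versioned URL
    if path.startswith(SHARED_PREFIXES):
        return "short"                      # same for all users: short TTL
    return "never"                          # when in doubt, do not cache


print(cdn_policy("GET", "/static/app.a1b2c3d4.js", False))  # long
print(cdn_policy("GET", "/api/products/42", False))         # short
print(cdn_policy("GET", "/api/profile", True))              # never
print(cdn_policy("POST", "/api/products", False))           # never
```

Defaulting to "never" is deliberate: accidentally caching a private response is far more damaging than missing a caching opportunity.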

Cache-Control Headers Deep Dive

HTTP headers control how CDNs (and browsers) cache content:

┌─────────────────────────────────────────────────────────────────────────┐
│                    CACHE-CONTROL HEADERS                                │
│                                                                         │
│  LONG-TERM CACHING (Static assets with content hash)                    │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  Cache-Control: public, max-age=31536000, immutable                     │
│                                                                         │
│  • public: Can be cached by CDN and browser                             │
│  • max-age=31536000: Cache for 1 year (365 * 24 * 60 * 60)              │
│  • immutable: Don't even check for updates (save revalidation)          │
│                                                                         │
│  Use for: app.a1b2c3.js, style.x7y8z9.css, image.abc123.png             │
│  (Files with content hash in filename - changing content = new file)    │
│                                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  SHORT-TERM CACHING (Semi-dynamic content)                              │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  Cache-Control: public, max-age=300, must-revalidate                    │
│                                                                         │
│  • max-age=300: Cache for 5 minutes                                     │
│  • must-revalidate: After TTL, MUST check with origin before serving    │
│                                                                         │
│  Use for: API responses, product listings, semi-dynamic pages           │
│                                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  REVALIDATION CACHING (Always fresh, but allow caching)                 │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  Cache-Control: public, no-cache                                        │
│  ETag: "abc123"                                                         │
│                                                                         │
│  • no-cache: DOESN'T mean "don't cache"!                                │
│  • It means: Cache it, but revalidate with origin before using          │
│  • ETag allows conditional requests (If-None-Match: "abc123")           │
│  • If unchanged, origin returns 304 Not Modified (no body = fast)       │
│                                                                         │
│  Use for: Frequently updated content where staleness is unacceptable    │
│                                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  NO CACHING (Private/sensitive content)                                 │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  Cache-Control: private, no-store                                       │
│                                                                         │
│  • private: Only browser can cache (not CDN)                            │
│  • no-store: Don't cache at all, not even temporarily                   │
│                                                                         │
│  Use for: User-specific data, authenticated responses, sensitive data   │
│                                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  DIFFERENT TTL FOR CDN vs BROWSER                                       │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  Cache-Control: public, max-age=3600, s-maxage=86400                    │
│                                                                         │
│  • max-age=3600: Browser caches for 1 hour                              │
│  • s-maxage=86400: CDN (shared cache) caches for 24 hours               │
│                                                                         │
│  Why? You can purge a CDN cache on demand, but not users' browsers.     │
│  So the CDN can hold content longer while browsers recheck sooner.      │
│                                                                         │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  STALE-WHILE-REVALIDATE (Performance optimization)                      │
│  ─────────────────────────────────────────────────────────────────────  │
│                                                                         │
│  Cache-Control: public, max-age=300, stale-while-revalidate=86400       │
│                                                                         │
│  • For 5 minutes: Serve from cache                                      │
│  • For the next 24 hours: Serve stale content IMMEDIATELY,              │
│    but fetch fresh content in background                                │
│  • After that (24h 5min total): Must wait for fresh content             │
│                                                                         │
│  Benefit: User never waits for revalidation!                            │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
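To make the directive semantics above concrete, the following sketch classifies a cached response by its age. It is a simplified model, not a full HTTP caching implementation: the function names and state labels are assumptions, and it ignores `Expires`, `Age`, `Vary`, and heuristic freshness.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control value into {directive: int or True}."""
    directives = {}
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = int(value) if value.isdigit() else True
    return directives


def cache_state(header: str, age_s: int, shared: bool = True) -> str:
    """Classify a cached copy that is `age_s` seconds old.

    shared=True models a CDN (shared cache); shared=False models a browser.
    """
    d = parse_cache_control(header)
    if "no-store" in d or (shared and "private" in d):
        return "do-not-store"
    if "no-cache" in d:
        return "revalidate"                 # stored, but checked with origin on every use
    ttl = d.get("max-age", 0)
    if shared and "s-maxage" in d:
        ttl = d["s-maxage"]                 # s-maxage overrides max-age for shared caches
    if age_s < ttl:
        return "fresh"
    if age_s < ttl + d.get("stale-while-revalidate", 0):
        return "serve-stale-and-refresh"    # answer now, refetch in the background
    return "expired"                        # must contact origin before answering


swr = "public, max-age=300, stale-while-revalidate=86400"
print(cache_state(swr, 100))     # fresh
print(cache_state(swr, 3600))    # serve-stale-and-refresh
print(cache_state(swr, 90000))   # expired

split = "public, max-age=3600, s-maxage=86400"
print(cache_state(split, 7200, shared=True))    # fresh (CDN uses s-maxage)
print(cache_state(split, 7200, shared=False))   # expired (browser uses max-age)
```

Note that the stale-while-revalidate window starts when `max-age` runs out, so the copy above is usable for 300 + 86400 seconds in total.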

Cache Invalidation Strategies

┌─────────────────────────────────────────────────────────────────────────┐
│                    CACHE INVALIDATION STRATEGIES                        │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  1. VERSIONED URLs (Recommended for static assets)                      │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Instead of:  /static/app.js                                            │
│  Use:         /static/app.a1b2c3d4.js  (content hash in filename)       │
│                                                                         │
│  How it works:                                                          │
│  1. Build tool (Webpack, Vite) generates hash from file content         │
│  2. When code changes, hash changes, URL changes                        │
│  3. New URL = automatic cache miss = users get fresh file               │
│  4. Old version stays cached (harmless, eventually expires)             │
│                                                                         │
│  ✓ Instant updates for users                                            │
│  ✓ Zero cache purge latency                                             │
│  ✓ Safe rollbacks (old URLs still work)                                 │
│  ✓ No CDN API calls needed                                              │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  2. PURGE API (For non-versioned content)                               │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  AWS CloudFront:                                                        │
│  aws cloudfront create-invalidation \                                   │
│    --distribution-id EDFDVBD632 \                                       │
│    --paths "/images/*" "/api/products/*"                                │
│                                                                         │
│  Cloudflare:                                                            │
│  curl -X POST \                                                         │
│    "https://api.cloudflare.com/client/v4/zones/{zone}/purge_cache" \    │
│    -H "Authorization: Bearer {token}" \                                 │
│    -H "Content-Type: application/json" \                                │
│    -d '{"files":["https://example.com/image.jpg"]}'                     │
│                                                                         │
│  Propagation time:                                                      │
│  • CloudFront: 5-15 minutes (must propagate to all 400+ PoPs)           │
│  • Fastly: <150ms (instant purge is their specialty)                    │
│  • Cloudflare: ~30 seconds                                              │
│                                                                         │
│  ✓ Works for any URL pattern                                            │
│  ✗ Propagation delay                                                    │
│  ✗ May cost money (CloudFront: 1,000 free paths/mo, then $0.005 each)   │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  3. TTL-BASED EXPIRY (Simple but has staleness window)                  │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Cache-Control: max-age=300  (5 minutes)                                │
│                                                                         │
│  • Content cached for 5 minutes                                         │
│  • After TTL, CDN checks with origin (If-Modified-Since)                │
│  • Origin returns 304 (unchanged) or 200 (new content)                  │
│                                                                         │
│  Trade-off:                                                             │
│  • Short TTL (60s) = More origin requests, at most 60s stale            │
│  • Long TTL (1hr) = Fewer origin requests, up to 1hr stale              │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  4. CACHE TAGS / SURROGATE KEYS (Advanced)                              │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Supported by: Fastly, Varnish, some CDNs                               │
│                                                                         │
│  Response header:                                                       │
│  Surrogate-Key: product-123 category-shoes brand-nike                   │
│                                                                         │
│  When product 123 updates:                                              │
│  PURGE /tags/product-123   (illustrative; purge API varies by CDN)      │
│  → Invalidates ALL URLs tagged with "product-123"                       │
│  → Product page, category pages, search results, recommendations...     │
│                                                                         │
│  ✓ Granular invalidation without knowing all URLs                       │
│  ✓ One purge affects all related content                                │
│  ✗ Requires CDN support                                                 │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  BEST PRACTICE: Combine strategies                                      │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Static assets (JS, CSS, images): Versioned URLs + long TTL             │
│  API responses: Short TTL + Purge on update                             │
│  HTML pages: stale-while-revalidate + Purge on publish                  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
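Two of the strategies above lend themselves to a short sketch: content-hashed filenames and tag-based purging. Both pieces are illustrative assumptions. `versioned_name` mimics what a build tool does but is not the algorithm of any particular bundler, and `TaggedCache` is a toy stand-in for a CDN that supports surrogate keys.

```python
import hashlib
from collections import defaultdict
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(filename)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


class TaggedCache:
    """Toy cache that remembers which URLs carry which surrogate keys."""

    def __init__(self):
        self.entries = {}                     # url -> body
        self.urls_by_tag = defaultdict(set)   # tag -> set of urls

    def put(self, url: str, body: bytes, surrogate_key: str = "") -> None:
        """Store a response; `surrogate_key` is a space-separated tag list."""
        self.entries[url] = body
        for tag in surrogate_key.split():
            self.urls_by_tag[tag].add(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every cached URL carrying `tag`; return how many were removed."""
        removed = 0
        for url in self.urls_by_tag.pop(tag, set()):
            if url in self.entries:
                del self.entries[url]
                removed += 1
        return removed


# Changing the content changes the URL, so no purge is needed.
print(versioned_name("/static/app.js", b"console.log('v1')"))
print(versioned_name("/static/app.js", b"console.log('v2')"))

# One tag purge removes every page that mentions product 123.
cache = TaggedCache()
cache.put("/products/123", b"page", "product-123 category-shoes")
cache.put("/category/shoes", b"list", "category-shoes product-123")
cache.put("/products/456", b"page", "product-456 category-shoes")
purged = cache.purge_tag("product-123")
print(purged, sorted(cache.entries))
```

The tag index is what lets the application purge by business entity without tracking every URL on which that entity appears.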

CDN Provider Comparison

┌─────────────────────────────────────────────────────────────────────────┐
│                      CDN PROVIDER COMPARISON                            │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  CLOUDFRONT (AWS)                                                       │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Network: 400+ edge locations, 13 regional edge caches                  │
│                                                                         │
│  Strengths:                                                             │
│  ✓ Deep AWS integration (S3, ALB, API Gateway, Lambda@Edge)             │
│  ✓ Lambda@Edge for compute at edge (auth, redirects, A/B testing)       │
│  ✓ Origin Shield (extra caching layer to protect origin)                │
│  ✓ Real-time logs to Kinesis                                            │
│  ✓ Field-level encryption                                               │
│  ✓ WebSocket support                                                    │
│                                                                         │
│  Weaknesses:                                                            │
│  ✗ Slow invalidation (5-15 minutes to propagate)                        │ 
│  ✗ Complex pricing model                                                │
│  ✗ Configuration can be verbose                                         │
│                                                                         │
│  Best for: AWS-native applications, serverless architectures            │
│  Pricing: ~$0.085/GB (varies by region), $0.0075/10K requests           │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  CLOUDFLARE                                                             │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Network: 275+ cities worldwide                                         │
│                                                                         │
│  Strengths:                                                             │
│  ✓ Generous FREE tier (yes, really free)                                │
│  ✓ Built-in DDoS protection (industry-leading)                          │
│  ✓ Workers (JavaScript/WASM at edge - very powerful)                    │
│  ✓ Workers KV (distributed key-value at edge)                           │
│  ✓ Fast cache purge (~30 seconds global)                                │
│  ✓ Easy setup (just change DNS nameservers)                             │
│  ✓ Very fast public DNS resolver (1.1.1.1)                              │
│  ✓ Argo Smart Routing (optimized paths)                                 │
│                                                                         │
│  Weaknesses:                                                            │
│  ✗ Less control over caching behavior than some competitors             │
│  ✗ Advanced features require paid plans                                 │
│  ✗ Must use Cloudflare DNS (can be a dealbreaker for some)              │
│                                                                         │
│  Best for: Startups, security-focused apps, edge computing              │
│  Pricing: Free tier, Pro $20/mo, Business $200/mo, Enterprise custom    │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  FASTLY                                                                 │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Network: 80+ PoPs (fewer but very high capacity)                       │
│                                                                         │
│  Strengths:                                                             │
│  ✓ INSTANT PURGE (<150ms global) - their killer feature                 │
│  ✓ VCL (Varnish Configuration Language) for custom caching logic        │
│  ✓ Real-time analytics and logging                                      │
│  ✓ Compute@Edge (WebAssembly at edge)                                   │
│  ✓ Image optimization built-in                                          │
│  ✓ Surrogate keys for granular invalidation                             │
│                                                                         │
│  Weaknesses:                                                            │
│  ✗ Fewer PoPs than competitors (80 vs 275+)                             │
│  ✗ Steeper learning curve (VCL is powerful but complex)                 │
│  ✗ No free tier                                                         │
│  ✗ Higher cost for small sites                                          │
│                                                                         │
│  Best for: Media streaming, real-time content, advanced caching needs   │
│  Pricing: $0.12/GB + $0.0075/10K requests (minimum ~$50/mo)             │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  AKAMAI                                                                 │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Network: 4,000+ locations (largest network by far)                     │
│                                                                         │
│  Strengths:                                                             │
│  ✓ Largest, most distributed network in the world                       │
│  ✓ Enterprise-grade SLAs (99.99%+)                                      │
│  ✓ Advanced security (WAF, Bot Management, DDoS)                        │
│  ✓ Media delivery specialization (used by major streaming services)     │
│  ✓ API acceleration                                                     │
│                                                                         │
│  Weaknesses:                                                            │
│  ✗ Expensive (enterprise pricing)                                       │
│  ✗ Complex configuration and management                                 │
│  ✗ Sales-driven (hard to get started without talking to sales)          │
│                                                                         │
│  Best for: Large enterprises, media companies, global reach needs       │
│  Pricing: Enterprise contracts (typically $$$$$)                        │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  QUICK SELECTION GUIDE                                                  │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  AWS shop / serverless         → CloudFront                             │
│  Startup / budget-conscious    → Cloudflare (free tier)                 │
│  Need instant cache purge      → Fastly                                 │
│  Edge computing / Workers      → Cloudflare                             │
│  Enterprise / media streaming  → Akamai or Fastly                       │
│  Simple setup                  → Cloudflare                             │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
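
To make the per-GB and per-request prices in the comparison concrete, here is a rough cost sketch. It uses only the list prices quoted above (CloudFront at ~$0.085/GB, Fastly at $0.12/GB, both at $0.0075 per 10K requests). The workload (5 TB and 100 million requests per month) and the function name are made up for illustration, and real bills vary with region, volume tiers, and free allowances.

```python
def monthly_cdn_cost(gb_transferred: float, requests: int,
                     price_per_gb: float, price_per_10k_requests: float) -> float:
    """Estimate a monthly bill from flat list prices (no tiers, no free allowance)."""
    transfer_cost = gb_transferred * price_per_gb
    request_cost = (requests / 10_000) * price_per_10k_requests
    return round(transfer_cost + request_cost, 2)


# Hypothetical workload: 5 TB of egress and 100 million requests per month.
GB_PER_MONTH = 5_000
REQUESTS_PER_MONTH = 100_000_000

cloudfront = monthly_cdn_cost(GB_PER_MONTH, REQUESTS_PER_MONTH, 0.085, 0.0075)
fastly = monthly_cdn_cost(GB_PER_MONTH, REQUESTS_PER_MONTH, 0.12, 0.0075)

print(f"CloudFront: ${cloudfront:,.2f}")  # 425 transfer + 75 requests
print(f"Fastly:     ${fastly:,.2f}")      # 600 transfer + 75 requests
```

At this volume the bandwidth charge dominates the request charge, which is why the per-GB rate is usually the first number to compare.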

CDN Interview Tips

When discussing a CDN in a system design interview:

WHAT TO MENTION:

1. WHAT YOU'RE CACHING
   "Static assets like images, CSS, JS, and fonts are served through the CDN."
   "For API responses that are identical for all users, such as a product
    catalog, we can also cache at the CDN."

2. CACHE STRATEGY
   "We use content-hashed filenames (app.a1b2c3.js) which allows us to set
    an aggressive 1-year TTL. When code changes, the hash changes, so users
    automatically get the new version."

3. CACHE HIT RATIO
   "We target a 95%+ cache hit ratio for static assets. This reduces origin
    load significantly."

4. GEOGRAPHIC DISTRIBUTION
   "With users in Asia, Europe, and the Americas, the CDN serves static
    content from a nearby edge location, typically in under 50ms, instead of
    200ms+ from our US origin."

5. CACHE INVALIDATION
   "For product data that changes, we use a short TTL (5 min) and purge the
    cache on update when a change must be visible immediately."
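
The content-hashed filename idea from point 2 can be sketched in a few lines. This is an illustration only: the function name, the choice of SHA-256, and the 8-character digest length are assumptions, and in practice a bundler usually generates these names for you.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(name: str, content: bytes, digest_len: int = 8) -> str:
    """Return a name like 'app.<hash>.js', where <hash> depends only on the content."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(name)
    return f"{path.stem}.{digest}{path.suffix}"


v1 = hashed_filename("app.js", b"console.log('v1');")
v2 = hashed_filename("app.js", b"console.log('v2');")
v1_again = hashed_filename("app.js", b"console.log('v1');")

print(v1)              # same bytes always produce the same name
print(v2)              # changed bytes produce a new name, so a new cache key
print(v1 == v1_again)  # True
```

Because the URL changes whenever the bytes change, the old URL never needs to be purged. That is what makes the 1-year TTL safe.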

SAMPLE INTERVIEW ANSWER:

"For static content, we'll use CloudFront as our CDN. All JavaScript and CSS files
use content-based hashing in filenames, like app.a1b2c3.js, which lets us
set a 1-year cache TTL. When we deploy new code, the hash changes, so users
automatically get fresh files.

For images, we serve them from S3 through CloudFront with the same long TTL.
This gives us a 95%+ cache hit ratio and reduces our origin bandwidth by about
90%.

For semi-dynamic content like product listings that are the same for all
users, we cache at the CDN with a 5-minute TTL. When a product is updated,
we purge that specific path.

With this setup, users globally get sub-50ms response times for static
content, and our origin servers only handle cache misses and authenticated
requests."
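
The three caching tiers in the sample answer map naturally onto Cache-Control response headers. The sketch below is one possible mapping, not a prescribed configuration: the path convention (/api/products), the hash pattern, and the exact header values are assumptions chosen for illustration.

```python
import re

ONE_YEAR = 365 * 24 * 60 * 60   # 31,536,000 seconds
FIVE_MINUTES = 5 * 60

# Assumed convention: a hashed asset looks like name.<6+ hex chars>.ext
HASHED_ASSET = re.compile(r"\.[0-9a-f]{6,}\.(?:js|css|png|jpg|svg|woff2)$")


def cache_control_for(path: str, authenticated: bool = False) -> str:
    """Pick a Cache-Control value for a response, following the sample answer."""
    if authenticated:
        # Per-user responses must never be stored by a shared cache.
        return "private, no-store"
    if HASHED_ASSET.search(path):
        # The URL changes with the content, so it can be cached for a year.
        return f"public, max-age={ONE_YEAR}, immutable"
    if path.startswith("/api/products"):
        # Same for all users: the CDN keeps it 5 minutes, browsers revalidate.
        return f"public, max-age=0, s-maxage={FIVE_MINUTES}"
    # Everything else: cacheable, but revalidate with the origin before reuse.
    return "no-cache"


print(cache_control_for("/static/app.a1b2c3.js"))
print(cache_control_for("/api/products?page=2"))
print(cache_control_for("/api/cart", authenticated=True))
```

Using s-maxage for the semi-dynamic tier keeps the 5-minute TTL at the CDN only, so a purge takes effect without waiting for browser caches to expire.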

📚 Further Reading: CDN

Cloudflare Learning Center: https://www.cloudflare.com/learning/cdn/what-is-a-cdn/
AWS CloudFront Developer Guide: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/
Fastly Documentation: https://docs.fastly.com/
Web.dev - HTTP Caching: https://web.dev/http-cache/
High Scalability - CDN Architecture: http://highscalability.com/blog/2011/2/28/a-practical-guide-to-building-a-content-delivery-network-c.html

1.2 Load Balancer

What Is a Load Balancer?

A load balancer distributes incoming network traffic across multiple servers. It's one of the most fundamental components for building scalable systems.

Why do we need it?

Scalability: One server can only handle so much. Distribute load across many.
Availability: If one server dies, others continue serving traffic.
Performance: Route to the least-loaded or fastest-responding server.

How Load Balancing Works

┌────────────────────────────────────────────────────────────────────────┐
│                      LOAD BALANCER ARCHITECTURE                        │
│                                                                        │
│                         Incoming Traffic                               │
│                        (50,000 req/sec)                                │
│                               │                                        │
│                               ▼                                        │
│  ┌───────────────────────────────────────────────────────────────────┐ │
│  │                        LOAD BALANCER                              │ │
│  │                                                                   │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐             │ │
│  │  │   Listener   │  │   Routing    │  │    Health    │             │ │
│  │  │  (Port 443)  │  │    Rules     │  │    Checks    │             │ │
│  │  └──────────────┘  └──────────────┘  └──────────────┘             │ │
│  │                                                                   │ │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐             │ │
│  │  │     SSL      │  │   Session    │  │   Algorithm  │             │ │
│  │  │ Termination  │  │  Persistence │  │  Selection   │             │ │
│  │  └──────────────┘  └──────────────┘  └──────────────┘             │ │
│  └───────────────────────────────────────────────────────────────────┘ │
│                               │                                        │
│            ┌──────────────────┼──────────────────┐                     │
│            │                  │                  │                     │
│            ▼                  ▼                  ▼                     │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐               │
│   │   Server 1   │   │   Server 2   │   │   Server 3   │               │
│   │              │   │              │   │              │               │
│   │ ~17K req/s   │   │ ~17K req/s   │   │ ~17K req/s   │               │
│   │    ✓ OK      │   │    ✓ OK      │   │    ✗ DOWN    │               │
│   └──────────────┘   └──────────────┘   └──────────────┘               │
│            ▲                  ▲                  ▲                     │
│            │                  │                  │                     │
│            └──────────────────┴──────────────────┘                     │
│                               │                                        │
│                    Health Check Probes                                 │
│                   (Every 10-30 seconds)                                │
│                                                                        │
│   Server 3 is marked unhealthy → Traffic redistributed to 1 & 2        │
│   Now: Servers 1 and 2 each handle ~25K req/s                          │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
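
The redistribution shown in the diagram can be sketched as a toy balancer that only hands out servers currently marked healthy. The class and method names are hypothetical. A real load balancer would call something like mark_down after several failed health check probes rather than on demand.

```python
from collections import Counter


class RoundRobinBalancer:
    """Round robin over whichever servers are currently marked healthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._counter = 0

    def mark_down(self, server):
        # Stand-in for "health check probes failed for this server".
        self.healthy.discard(server)

    def pick(self):
        pool = [s for s in self.servers if s in self.healthy]
        if not pool:
            raise RuntimeError("no healthy servers available")
        server = pool[self._counter % len(pool)]
        self._counter += 1
        return server


lb = RoundRobinBalancer(["server-1", "server-2", "server-3"])
before = Counter(lb.pick() for _ in range(6))   # all three healthy

lb.mark_down("server-3")
after = Counter(lb.pick() for _ in range(6))    # server-3 removed from rotation

TOTAL_RPS = 50_000
print(dict(before), f"-> ~{TOTAL_RPS / 3:,.0f} req/s each")
print(dict(after), f"-> ~{TOTAL_RPS / 2:,.0f} req/s each")
```

The arithmetic also shows why capacity planning matters: losing one of three servers raises the load on each survivor by 50%, so each server needs that much headroom.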

Load Balancing Algorithms — Detailed

┌─────────────────────────────────────────────────────────────────────────┐
│                   LOAD BALANCING ALGORITHMS                             │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  1. ROUND ROBIN                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  How it works:                                                          │
│  Request 1 → Server A                                                   │
│  Request 2 → Server B                                                   │
│  Request 3 → Server C                                                   │
│  Request 4 → Server A  (cycle repeats)                                  │
│  Request 5 → Server B                                                   │
│  ...                                                                    │
│                                                                         │
│  Implementation: next = servers[counter % len(servers)]; counter++      │
│                                                                         │
│  ✓ Simple to implement and understand                                   │
│  ✓ Even distribution over time                                          │
│  ✓ Minimal state (a single counter)                                     │
│  ✓ Works well with homogeneous servers                                  │
│                                                                         │
│  ✗ Ignores server capacity (all treated equal)                          │
│  ✗ Ignores current load                                                 │
│  ✗ Ignores request cost (slow requests can pile up on one server)      │
│                                                                         │
│  Best for: Identical servers, stateless requests, similar workloads     │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  2. WEIGHTED ROUND ROBIN                                                │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Configuration:                                                         │
│  Server A: weight = 5 (more powerful, 8 CPU cores)                      │
│  Server B: weight = 3 (medium, 4 CPU cores)                             │
│  Server C: weight = 2 (smaller, 2 CPU cores)                            │
│                                                                         │
│  Distribution (out of 10 requests):                                     │
│  Server A: 5 requests (50%)                                             │
│  Server B: 3 requests (30%)                                             │
│  Server C: 2 requests (20%)                                             │
│                                                                         │
│  ✓ Accounts for different server capacities                             │
│  ✓ Useful during migrations (old=1, new=5)                              │
│  ✓ Canary deployments (stable=99, canary=1)                             │
│                                                                         │
│  ✗ Weights are static, don't adapt to actual load                       │
│  ✗ Requires manual tuning                                               │
│                                                                         │
│  Best for: Heterogeneous servers, gradual rollouts, canary deploys      │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  3. LEAST CONNECTIONS                                                   │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Current state:                                                         │
│  Server A: 150 active connections                                       │
│  Server B: 75 active connections   ← Next request goes here             │
│  Server C: 120 active connections                                       │
│                                                                         │
│  How it works:                                                          │
│  1. LB tracks active connections per server                             │
│  2. New request → server with fewest connections                        │
│  3. Connection count incremented when request starts                    │
│  4. Connection count decremented when response completes                │
│                                                                         │
│  ✓ Automatically adapts to actual load                                  │
│  ✓ Handles varying request durations well                               │
│  ✓ Self-balancing                                                       │
│                                                                         │
│  ✗ Requires connection tracking (slight overhead)                       │
│  ✗ Doesn't account for connection "weight" (some heavier than others)   │
│                                                                         │
│  Best for: Long-running connections, WebSockets, variable workloads     │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  4. LEAST RESPONSE TIME                                                 │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Current metrics:                                                       │
│  Server A: avg response = 45ms, 50 connections                          │
│  Server B: avg response = 28ms, 60 connections  ← Fastest!              │
│  Server C: avg response = 52ms, 40 connections                          │
│                                                                         │
│  Calculation: Route to server with lowest (response_time * connections) │
│                                                                         │
│  ✓ Optimizes for actual user-perceived latency                          │
│  ✓ Automatically adapts to server performance                           │
│  ✓ Accounts for both load and speed                                     │
│                                                                         │
│  ✗ More complex to implement                                            │
│  ✗ Can be "noisy" with variable latencies                               │
│  ✗ May starve consistently slow servers                                 │
│                                                                         │
│  Best for: Latency-sensitive applications, heterogeneous performance    │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  5. IP HASH (Source IP Affinity)                                        │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  How it works:                                                          │
│  server_index = hash(client_ip) % number_of_servers                     │
│                                                                         │
│  Client 192.168.1.1 → hash(ip) = 12345 → 12345 % 3 = 0 → Server A       │
│  Client 192.168.1.2 → hash(ip) = 67890 → 67890 % 3 = 0 → Server A       │
│  Client 192.168.1.3 → hash(ip) = 11111 → 11111 % 3 = 2 → Server C       │
│                                                                         │
│  Same client always goes to same server (until server pool changes)     │
│                                                                         │
│  ✓ Sticky sessions without cookies                                      │
│  ✓ Good for local server-side caching                                   │
│  ✓ Stateless LB (no session table needed)                               │
│                                                                         │
│  ✗ Can cause uneven distribution                                        │
│  ✗ Adding/removing server reshuffles many clients                       │
│  ✗ Mobile users may change IP (WiFi → cellular)                         │
│  ✗ Users behind NAT share IP                                            │
│                                                                         │
│  Best for: Caching layers, when cookies aren't available                │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  6. CONSISTENT HASHING (Advanced)                                       │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  How it works:                                                          │
│  • Servers placed on a virtual ring (hash space 0 to 2^32 - 1)          │
│  • Request key hashed to point on ring                                  │
│  • Route to first server clockwise from that point                      │
│                                                                         │
│       Server A                                                          │
│          *                                                              │
│        /   \                                                            │
│       /     \                                                           │
│      *       *                                                          │
│   Server C   Server B                                                   │
│                                                                         │
│  Key benefit: Adding/removing a server remaps only ~1/N of the keys.    │
│  (Unlike modulo-based IP hash, where most clients get reshuffled)       │
│                                                                         │
│  Best for: Distributed caches, databases, when servers change often     │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  ALGORITHM SELECTION GUIDE                                              │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Scenario                          │ Recommended Algorithm              │
│  ──────────────────────────────────┼──────────────────────────────────  │
│  Simple stateless API              │ Round Robin                        │
│  Mixed server sizes                │ Weighted Round Robin               │
│  Long-running connections          │ Least Connections                  │
│  Latency-sensitive app             │ Least Response Time                │
│  Need session stickiness           │ IP Hash or Cookie-based            │
│  Distributed cache                 │ Consistent Hashing                 │
│  Canary deployment                 │ Weighted Round Robin               │
│                                                                         │
│  Default choice: Least Connections (adapts well to most scenarios)      │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
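
Two ideas from the box are worth seeing in code: load-aware selection (least connections and least response time, using the same numbers as above) and the claim that consistent hashing moves far fewer keys than modulo hashing when the pool changes. This is a minimal sketch under stated assumptions: the function and class names, the use of SHA-256 as the ring hash, and 100 virtual nodes per server are illustrative choices, not how any particular load balancer implements it.

```python
import bisect
import hashlib


def pick_least_connections(active):
    """active maps server name -> number of open connections."""
    return min(active, key=active.get)


def pick_least_response_time(stats):
    """stats maps server name -> (avg_response_ms, active_connections)."""
    return min(stats, key=lambda s: stats[s][0] * stats[s][1])


def ring_hash(key):
    """Map a string to a point in the 32-bit ring space [0, 2**32 - 1]."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, servers, vnodes=100):
        # Each server gets several virtual points to even out its share.
        self._ring = sorted(
            (ring_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key):
        # First point clockwise from the key's hash, wrapping past the end.
        idx = bisect.bisect_right(self._points, ring_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(old_lookup, new_lookup, keys):
    """Share of keys whose server changes between two lookup functions."""
    return sum(old_lookup(k) != new_lookup(k) for k in keys) / len(keys)


print(pick_least_connections({"A": 150, "B": 75, "C": 120}))                    # B
print(pick_least_response_time({"A": (45, 50), "B": (28, 60), "C": (52, 40)}))  # B

# Grow the pool from 3 to 4 servers and count how many clients get remapped.
keys = [f"client-{i}" for i in range(10_000)]
ring3 = HashRing(["A", "B", "C"])
ring4 = HashRing(["A", "B", "C", "D"])
ring_moved = moved_fraction(ring3.lookup, ring4.lookup, keys)
modulo_moved = moved_fraction(lambda k: ring_hash(k) % 3,
                              lambda k: ring_hash(k) % 4, keys)
print(f"ring: {ring_moved:.0%} moved, modulo: {modulo_moved:.0%} moved")
```

With the ring, only the keys taken over by the new server move, which should come out near one quarter here. With modulo hashing, a key stays put only when its hash gives the same remainder for 3 and 4, so roughly three quarters of the clients change servers.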

Layer 4 vs Layer 7 Load Balancing

┌─────────────────────────────────────────────────────────────────────────┐
│                  LAYER 4 vs LAYER 7 LOAD BALANCING                      │
│                                                                         │
│  OSI MODEL REFERENCE:                                                   │
│  Layer 4 = Transport Layer (TCP/UDP)                                    │
│  Layer 7 = Application Layer (HTTP/HTTPS)                               │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│                           LAYER 4 (L4)                                  │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  WHAT IT SEES:                                                          │
│  ┌────────────────────────────────────────────────────────┐             │
│  │  TCP/IP Packet Headers:                                │             │
│  │  ├── Source IP: 192.168.1.100                          │             │
│  │  ├── Destination IP: 10.0.0.1                          │             │
│  │  ├── Source Port: 52431                                │             │
│  │  ├── Destination Port: 443                             │             │
│  │  └── Protocol: TCP                                     │             │
│  │                                                        │             │
│  │  [Payload is encrypted/opaque - LB cannot read it]     │             │
│  └────────────────────────────────────────────────────────┘             │
│                                                                         │
│  WHAT IT CAN DO:                                                        │
│  ✓ Route based on IP address and port                                   │
│  ✓ Very fast (no payload inspection, just forward)                      │
│  ✓ Protocol agnostic (TCP, UDP, any port)                               │
│  ✓ Preserve client IP (DSR/Direct Server Return mode)                   │
│  ✓ Handle millions of connections                                       │
│                                                                         │
│  WHAT IT CANNOT DO:                                                     │
│  ✗ Route based on URL path (/api vs /static)                            │
│  ✗ Read HTTP headers or cookies                                         │
│  ✗ Modify request/response content                                      │
│  ✗ Smart routing based on content                                       │
│  ✗ SSL termination (usually passed through)                             │
│                                                                         │
│  USE CASES:                                                             │
│  • Database connections (MySQL, PostgreSQL)                             │
│  • Redis, Memcached                                                     │
│  • Gaming servers (UDP)                                                 │
│  • Any non-HTTP TCP/UDP service                                         │
│  • When you need maximum performance                                    │
│                                                                         │
│  AWS: Network Load Balancer (NLB)                                       │
│  GCP: TCP/UDP Load Balancer                                             │
│                                                                         │
│  Performance: ~1M+ connections/sec, <100μs added latency                │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│                           LAYER 7 (L7)                                  │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  WHAT IT SEES (after SSL termination):                                  │
│  ┌────────────────────────────────────────────────────────┐             │
│  │  Full HTTP Request:                                    │             │
│  │  ├── Method: GET                                       │             │
│  │  ├── Path: /api/v2/users/123                           │             │
│  │  ├── Host: api.example.com                             │             │
│  │  ├── Headers:                                          │             │
│  │  │   ├── Authorization: Bearer eyJhbGc...              │             │
│  │  │   ├── Content-Type: application/json                │             │
│  │  │   ├── X-Request-ID: abc123                          │             │
│  │  │   └── X-Feature-Flag: new-checkout                  │             │
│  │  ├── Cookies:                                          │             │
│  │  │   └── session_id=xyz789                             │             │
│  │  └── Body: {"name": "John"}                            │             │
│  └────────────────────────────────────────────────────────┘             │
│                                                                         │
│  WHAT IT CAN DO:                                                        │
│  ✓ Route based on URL path (/api/* → API servers)                       │
│  ✓ Route based on hostname (api.* vs www.*)                             │
│  ✓ Route based on HTTP headers                                          │
│  ✓ Route based on cookies (session affinity)                            │
│  ✓ Route based on query parameters                                      │
│  ✓ SSL/TLS termination                                                  │
│  ✓ Add/modify/remove headers                                            │
│  ✓ URL rewriting (/old-path → /new-path)                                │
│  ✓ Response caching                                                     │
│  ✓ Compression (gzip)                                                   │
│  ✓ Rate limiting                                                        │
│  ✓ Authentication                                                       │
│  ✓ WebSocket support                                                    │
│                                                                         │
│  EXAMPLE L7 ROUTING RULES:                                              │
│  ┌────────────────────────────────────────────────────────┐             │
│  │  Rule 1: IF path starts with /api/v1/*                 │             │
│  │          → Route to: api-v1-target-group               │             │
│  │                                                        │             │
│  │  Rule 2: IF path starts with /api/v2/*                 │             │
│  │          → Route to: api-v2-target-group               │             │
│  │                                                        │             │
│  │  Rule 3: IF path starts with /static/*                 │             │
│  │          → Route to: static-servers                    │             │
│  │                                                        │             │
│  │  Rule 4: IF header X-Canary equals "true"              │             │
│  │          → Route to: canary-target-group               │             │
│  │                                                        │             │
│  │  Rule 5: IF host equals admin.example.com              │             │
│  │          → Route to: admin-servers                     │             │
│  │                                                        │             │
│  │  Default: → Route to: web-servers                      │             │
│  └────────────────────────────────────────────────────────┘             │
│                                                                         │
│  USE CASES:                                                             │
│  • Web applications                                                     │
│  • REST APIs                                                            │
│  • Microservices routing                                                │
│  • A/B testing (route 10% to new version)                               │
│  • Canary deployments                                                   │
│  • Multi-tenant applications (route by subdomain)                       │
│  • Blue-green deployments                                               │
│                                                                         │
│  AWS: Application Load Balancer (ALB)                                   │
│  GCP: HTTP(S) Load Balancer                                             │
│                                                                         │
│  Performance: ~100K connections/sec, 1-5ms added latency                │
│                                                                         │
│  ═══════════════════════════════════════════════════════════════════    │
│  COMPARISON SUMMARY                                                     │
│  ═══════════════════════════════════════════════════════════════════    │
│                                                                         │
│  Aspect              │ Layer 4 (NLB)      │ Layer 7 (ALB)              │
│  ────────────────────┼────────────────────┼────────────────────────────│
│  Operates at         │ TCP/UDP level      │ HTTP/HTTPS level           │
│  Sees                │ IP + Port          │ Full HTTP request          │
│  Speed               │ Very fast          │ Fast                       │
│  Latency added       │ <100μs             │ 1-5ms                      │
│  SSL termination     │ Optional (TLS)     │ Yes                        │
│  Content routing     │ No                 │ Yes                        │
│  Header manipulation │ No                 │ Yes                        │
│  WebSocket           │ Yes (pass-through) │ Yes (native)               │
│  Health checks       │ TCP/HTTP           │ HTTP with path             │
│  Cost (AWS)          │ Lower              │ Higher                     │
│                                                                        │
│  WHEN TO USE WHICH:                                                    │
│  • HTTP/HTTPS traffic → L7 (ALB)                                       │
│  • Database connections → L4 (NLB)                                     │
│  • Need content-based routing → L7                                     │
│  • Need maximum throughput → L4                                        │
│  • Non-HTTP protocols → L4                                             │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
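
To make "content-based routing" concrete, here is a minimal Python sketch of the decision a Layer 7 balancer makes per request. The rule table, the `api-servers` and `admin-servers` pool names, and the `route` function are hypothetical illustrations; only the `web-servers` default comes from the diagram above. Real balancers such as ALB express this as listener rule configuration rather than code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One L7 routing rule; a field left as None matches anything."""
    pool: str
    host: Optional[str] = None
    path_prefix: Optional[str] = None


# Hypothetical rule table, evaluated top to bottom (first match wins).
RULES = [
    Rule(pool="api-servers", path_prefix="/api/"),
    Rule(pool="admin-servers", host="admin.example.com"),
]
DEFAULT_POOL = "web-servers"  # the default target shown in the diagram


def route(host: str, path: str) -> str:
    """Pick a target pool from HTTP-level fields that an L4 balancer never sees."""
    for rule in RULES:
        if rule.host is not None and rule.host != host:
            continue
        if rule.path_prefix is not None and not path.startswith(rule.path_prefix):
            continue
        return rule.pool
    return DEFAULT_POOL
```

A Layer 4 balancer cannot implement `route` at all, because it only sees IP addresses and ports, never the Host header or URL path.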

Health Checks Deep Dive

┌────────────────────────────────────────────────────────────────────────┐
│                         HEALTH CHECKS                                  │
│                                                                        │
│  PURPOSE: Detect unhealthy servers and stop routing traffic to them    │
│                                                                        │
│  ══════════════════════════════════════════════════════════════════    │
│  CONFIGURATION EXAMPLE (AWS ALB)                                       │
│  ══════════════════════════════════════════════════════════════════    │
│                                                                        │
│  Protocol: HTTP                                                        │
│  Port: 8080                                                            │
│  Path: /health                                                         │
│  Interval: 30 seconds                                                  │
│  Timeout: 5 seconds                                                    │
│  Healthy threshold: 2 consecutive successes                            │
│  Unhealthy threshold: 3 consecutive failures                           │
│  Success codes: 200-299                                                │
│                                                                        │
│  ══════════════════════════════════════════════════════════════════    │
│  HEALTH CHECK TIMELINE EXAMPLE                                         │
│  ══════════════════════════════════════════════════════════════════    │
│                                                                        │
│  Time   LB Sends        Server Response    Server Status               │
│  ─────  ──────────────  ─────────────────  ─────────────────────────   │
│  0s     GET /health     200 OK             Starting (1/2)              │
│  30s    GET /health     200 OK             ✓ HEALTHY (2/2) ← Active!   │
│  60s    GET /health     200 OK             ✓ Healthy                   │
│  90s    GET /health     500 Error          ⚠ Warning (1/3 failures)    │
│  120s   GET /health     Connection timeout ⚠ Warning (2/3 failures)    │
│  150s   GET /health     500 Error          ✗ UNHEALTHY ← Removed!      │
│  180s   GET /health     200 OK             Recovering (1/2)            │
│  210s   GET /health     200 OK             ✓ HEALTHY ← Back in pool!   │
│                                                                        │
│  ══════════════════════════════════════════════════════════════════    │
│  TYPES OF HEALTH CHECKS                                                │
│  ══════════════════════════════════════════════════════════════════    │
│                                                                        │
│  1. SHALLOW / LIVENESS CHECK                                           │
│  ────────────────────────────────────────────────────────────────────  │
│  Just verifies the process is running and can respond                  │
│                                                                        │
│  GET /health                                                           │
│  Response:                                                             │
│  {                                                                     │
│    "status": "ok"                                                      │
│  }                                                                     │
│                                                                        │
│  ✓ Fast, low overhead                                                  │
│  ✓ Always succeeds if process is up                                    │
│  ✓ Good for load balancer health checks                                │
│  ✗ Doesn't verify dependencies work                                    │
│                                                                        │
│  2. DEEP / READINESS CHECK                                             │
│  ────────────────────────────────────────────────────────────────────  │
│  Verifies the app AND all its dependencies are healthy                 │
│                                                                        │
│  GET /health/ready                                                     │
│  Response (healthy):                                                   │
│  {                                                                     │
│    "status": "healthy",                                                │
│    "version": "1.2.3",                                                 │
│    "uptime_seconds": 86400,                                            │
│    "checks": {                                                         │
│      "database": {                                                     │
│        "status": "up",                                                 │
│        "latency_ms": 5                                                 │
│      },                                                                │
│      "redis": {                                                        │
│        "status": "up",                                                 │
│        "latency_ms": 1                                                 │
│      },                                                                │
│      "external_api": {                                                 │
│        "status": "up",                                                 │
│        "latency_ms": 45                                                │
│      }                                                                 │
│    }                                                                   │
│  }                                                                     │
│                                                                        │
│  Response (unhealthy - HTTP 503):                                      │
│  {                                                                     │
│    "status": "unhealthy",                                              │
│    "checks": {                                                         │
│      "database": {                                                     │
│        "status": "down",                                               │
│        "error": "Connection refused"                                   │
│      },                                                                │
│      "redis": { "status": "up" },                                      │
│      "external_api": { "status": "up" }                                │
│    }                                                                   │
│  }                                                                     │
│                                                                        │
│  ✓ Comprehensive status                                                │
│  ✓ Catches dependency failures                                         │
│  ✗ Slower (queries dependencies)                                       │
│  ✗ Risk: If DB is down, ALL servers marked unhealthy!                  │
│                                                                        │
│  ══════════════════════════════════════════════════════════════════    │
│  BEST PRACTICE                                                         │
│  ══════════════════════════════════════════════════════════════════    │
│                                                                        │
│  Use SHALLOW check for load balancer (keep traffic flowing)            │
│  Use DEEP check for monitoring/alerting (know what's broken)           │
│                                                                        │
│  /health          → LB checks this (fast, always works if process up)  │
│  /health/ready    → Kubernetes readiness probe                         │
│  /health/detailed → Monitoring systems (Datadog, etc.)                 │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
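
The threshold logic behind the timeline above can be sketched as a small state machine. This is an illustrative Python model, not how any particular load balancer is implemented; the `HealthTracker` class and its field names are assumptions, while the thresholds (2 successes, 3 failures) and the probe sequence come from the example configuration and timeline.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Tracks one backend's pool membership from consecutive probe results."""
    healthy_threshold: int = 2    # consecutive successes needed to enter the pool
    unhealthy_threshold: int = 3  # consecutive failures needed to leave the pool
    in_pool: bool = False         # a new target starts out of rotation
    successes: int = 0
    failures: int = 0

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result; return whether the backend is in the pool."""
        if probe_ok:
            self.successes += 1
            self.failures = 0
            if not self.in_pool and self.successes >= self.healthy_threshold:
                self.in_pool = True
        else:
            self.failures += 1
            self.successes = 0
            if self.in_pool and self.failures >= self.unhealthy_threshold:
                self.in_pool = False
        return self.in_pool


# Probe results from the timeline (0s through 210s, one probe every 30s).
probes = [True, True, True, False, False, False, True, True]
tracker = HealthTracker()
history = [tracker.record(ok) for ok in probes]
# history -> [False, True, True, True, True, False, False, True]
```

Note the cost of these settings: in the timeline the first failed probe is at 90s and removal happens at 150s, so a broken server can keep receiving traffic for a minute or more. Shorter intervals or lower thresholds detect failures sooner but react more to transient errors.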

Load Balancer Providers

┌────────────────────────────────────────────────────────────────────────┐
│                    LOAD BALANCER OPTIONS                               │
│                                                                        │
│  CLOUD LOAD BALANCERS                                                  │
│  ══════════════════════════════════════════════════════════════════    │
│                                                                        │
│  AWS                                                                   │
│  ────────────────────────────────────────────────────────────────────  │
│  ALB (Application Load Balancer) - Layer 7                             │
│  • HTTP/HTTPS traffic                                                  │
│  • Path-based and host-based routing                                   │
│  • WebSocket and HTTP/2 support                                        │
│  • Integrated with WAF, Cognito                                        │
│  • ~$0.0225/hour + $0.008/LCU-hour                                     │
│                                                                        │
│  NLB (Network Load Balancer) - Layer 4                                 │
│  • TCP/UDP/TLS traffic                                                 │
│  • Ultra-low latency (<100μs)                                          │
│  • Millions of requests per second                                     │
│  • Static IP / Elastic IP support                                      │
│  • ~$0.0225/hour + $0.006/NLCU-hour                                    │
│                                                                        │
│  GCP                                                                   │
│  ────────────────────────────────────────────────────────────────────  │
│  Application Load Balancer (formerly HTTP(S) LB) - Global L7           │
│  • Single anycast IP worldwide                                         │
│  • Automatic multi-region failover                                     │
│  • Cloud CDN integration                                               │
│                                                                        │
│  Network Load Balancer - Regional L4                                   │
│  • High-performance TCP/UDP                                            │
│  • Regional scope                                                      │
│                                                                        │
│  SOFTWARE LOAD BALANCERS                                               │
│  ══════════════════════════════════════════════════════════════════    │
│                                                                        │
│  NGINX                                                                 │
│  ────────────────────────────────────────────────────────────────────  │
│  • One of the most widely used web servers / reverse proxies           │
│  • HTTP, TCP, UDP load balancing                                       │
│  • Open source (free) + NGINX Plus (commercial)                        │
│  • Event-driven, high performance                                      │
│  • Great documentation and community                                   │
│                                                                        │
│  Config example:                                                       │
│  upstream backend {                                                    │
│      least_conn;                                                       │
│      server 10.0.0.1:8080 weight=3;                                    │
│      server 10.0.0.2:8080;                                             │
│      server 10.0.0.3:8080 backup;                                      │
│  }                                                                     │
│  server {                                                              │
│      listen 80;                                                        │
│      location / {                                                      │
│          proxy_pass http://backend;                                    │
│      }                                                                 │
│  }                                                                     │
│                                                                        │
│  HAProxy                                                               │
│  ────────────────────────────────────────────────────────────────────  │
│  • High-performance TCP/HTTP load balancer                             │
│  • Battle-tested (GitHub, Reddit, Stack Overflow, Twitter)             │
│  • Very detailed metrics and logging                                   │
│  • Advanced health checking                                            │
│                                                                        │
│  Envoy                                                                 │
│  ────────────────────────────────────────────────────────────────────  │
│  • Modern, cloud-native proxy                                          │
│  • Originally from Lyft, now CNCF                                      │
│  • Foundation of service meshes (Istio, AWS App Mesh)                  │
│  • Native gRPC support                                                 │
│  • Dynamic configuration via xDS API                                   │
│  • Best for: Kubernetes, service mesh, microservices                   │
│                                                                        │
│  Traefik                                                               │
│  ────────────────────────────────────────────────────────────────────  │
│  • Kubernetes-native                                                   │
│  • Auto-discovery of services                                          │
│  • Automatic Let's Encrypt certificates                                │
│  • Best for: Kubernetes, Docker Swarm                                  │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
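
The NGINX config example above combines `least_conn`, a `weight`, and a `backup` server. A rough Python sketch of that selection policy follows. It is a simplified model for intuition, not NGINX's actual implementation; the `Server` class and `pick` function are hypothetical, and connections are never closed in the demo so the distribution is easy to see.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    address: str
    weight: int = 1
    backup: bool = False
    up: bool = True
    active: int = 0  # connections currently open to this server


def pick(servers: List[Server]) -> Optional[Server]:
    """Choose the live server with the fewest active connections per unit weight."""
    live = [s for s in servers if s.up]
    primaries = [s for s in live if not s.backup]
    candidates = primaries or live  # backups are used only if no primary is up
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.active / s.weight)


pool = [
    Server("10.0.0.1:8080", weight=3),
    Server("10.0.0.2:8080"),
    Server("10.0.0.3:8080", backup=True),
]

# Open 8 connections without closing any, to see how the load spreads.
for _ in range(8):
    chosen = pick(pool)
    chosen.active += 1
counts = {s.address: s.active for s in pool}
# counts -> {"10.0.0.1:8080": 6, "10.0.0.2:8080": 2, "10.0.0.3:8080": 0}
```

The weight-3 server ends up with three times the connections of the weight-1 server, and the backup stays idle until both primaries are marked down.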

📚 Further Reading: Load Balancing

NGINX Load Balancing Guide: https://docs.nginx.com/nginx/admin-guide/load-balancer/http-load-balancer/
AWS Elastic Load Balancing: https://docs.aws.amazon.com/elasticloadbalancing/
HAProxy Documentation: https://www.haproxy.com/documentation/
Google SRE Book - Load Balancing: https://sre.google/sre-book/load-balancing-frontend/
Envoy Proxy: https://www.envoyproxy.io/docs/

1.3 Domain Name System (DNS)

What Is DNS?

DNS (Domain Name System) is the internet's phone book. It translates human-readable domain names (like www.example.com) into IP addresses (like 93.184.216.34) that computers use to identify each other.

Without DNS, you'd need to memorize IP addresses for every website you visit. DNS makes the internet usable by humans.

Why DNS Matters for System Design

DNS is much more than name resolution. In modern systems, DNS is used for:

• Load balancing: Distributing traffic across multiple servers
• Geographic routing: Directing users to the nearest datacenter
• Failover: Automatically switching to backup servers when the primary fails
• Service discovery: Letting microservices find each other
• Traffic management: Canary deployments, blue-green deployments

Understanding DNS deeply is essential for designing globally distributed, highly available systems.

How DNS Works — Step by Step

When you type www.example.com in your browser, here's what happens:

DNS RESOLUTION FLOW

STEP 1: Browser Cache Check
─────────────────────────────────────────────────────────────────────
Browser: "Have I looked up this domain recently?"
If YES → Use cached IP (typically cached for TTL duration)
If NO  → Ask Operating System

STEP 2: Operating System Cache
─────────────────────────────────────────────────────────────────────
OS: "Is this domain in my local DNS cache?"
If YES → Return cached IP to browser
If NO  → Ask configured DNS resolver

STEP 3: DNS Resolver (Recursive Resolver)
─────────────────────────────────────────────────────────────────────
OS asks configured DNS resolver (ISP or 8.8.8.8 or 1.1.1.1):

Resolver: "Is it in my cache?"
If YES → Return cached IP
If NO  → Start recursive resolution (Steps 4-6)

STEP 4: Root DNS Server
─────────────────────────────────────────────────────────────────────
Resolver asks Root Server: "Where is www.example.com?"

13 root server identities (a.root-servers.net through m.root-servers.net)
Each identity is served by many Anycast instances around the world

Root Server: "I don't know, but .com TLD is handled by:
              a.gtld-servers.net, b.gtld-servers.net, ..."

STEP 5: TLD (Top Level Domain) Server
─────────────────────────────────────────────────────────────────────
Resolver asks .com TLD Server: "Where is www.example.com?"

TLD Server: "example.com is handled by nameservers:
             ns1.example.com (IP: 93.184.216.34)
             ns2.example.com (IP: 93.184.216.35)"

STEP 6: Authoritative DNS Server
─────────────────────────────────────────────────────────────────────
Resolver asks ns1.example.com: "What is IP of www.example.com?"

Authoritative Server: "www.example.com = 93.184.216.34, TTL=300"

STEP 7: Response Returns
─────────────────────────────────────────────────────────────────────
• Resolver caches answer (respects TTL)
• Returns IP to OS
• OS caches and returns to browser
• Browser caches and connects to 93.184.216.34

TIMING:
Full resolution (no cache): 50-200ms
Cached at resolver: 1-10ms
Cached locally: <1ms
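
The resolver's part of this flow (Steps 3 to 7) can be modeled in a few lines of Python. This is a toy simulation under stated assumptions: the three dictionaries stand in for the root, TLD, and authoritative servers, the `CachingResolver` class is hypothetical, and no real network queries are made. The names, IP, and TTL are the ones from the walkthrough above.

```python
from typing import Callable, Dict, Tuple

# In-memory stand-ins for the DNS hierarchy in the walkthrough.
ROOT = {"com": "a.gtld-servers.net"}
TLD = {"example.com": "ns1.example.com"}
AUTHORITATIVE = {"www.example.com": ("93.184.216.34", 300)}  # (IP, TTL seconds)


class CachingResolver:
    """Answers from cache while the TTL is valid, otherwise walks the hierarchy."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self.clock = clock
        self.cache: Dict[str, Tuple[str, float]] = {}  # name -> (ip, expires_at)
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        now = self.clock()
        cached = self.cache.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]                        # Step 3: cache hit
        labels = name.split(".")
        tld_server = ROOT[labels[-1]]               # Step 4: root refers to the TLD
        name_server = TLD[".".join(labels[-2:])]    # Step 5: TLD refers to the nameserver
        ip, ttl = AUTHORITATIVE[name]               # Step 6: authoritative answer
        self.referrals = [tld_server, name_server]  # kept only for inspection
        self.upstream_queries += 3
        self.cache[name] = (ip, now + ttl)          # Step 7: cache until the TTL expires
        return ip


# A fake clock makes the TTL behavior easy to observe.
now = [0.0]
resolver = CachingResolver(clock=lambda: now[0])
first = resolver.resolve("www.example.com")    # cold: 3 upstream queries
now[0] = 299.0
second = resolver.resolve("www.example.com")   # still cached: no new queries
queries_while_cached = resolver.upstream_queries
now[0] = 301.0
third = resolver.resolve("www.example.com")    # TTL expired: 3 more queries
```

The gap between the cold and cached paths is why the timing figures above differ by one to two orders of magnitude.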

DNS Record Types

┌────────────────────────────────────────────────────────────────────────┐
│                      DNS RECORD TYPES                                  │
│                                                                        │
│  A RECORD (Address)                                                    │
│  ────────────────────────────────────────────────────────────────────  │
│  Maps domain to IPv4 address                                           │
│                                                                        │
│  example.com.      300   IN   A   93.184.216.34                        │
│  api.example.com.  60    IN   A   10.0.1.100                           │
│                                                                        │
│  Multiple A records = DNS round-robin load balancing                   │
│  api.example.com.  60   IN   A   10.0.1.100                            │
│  api.example.com.  60   IN   A   10.0.1.101                            │
│  api.example.com.  60   IN   A   10.0.1.102                            │
│                                                                        │
│  AAAA RECORD (IPv6)                                                    │
│  ────────────────────────────────────────────────────────────────────  │
│  Maps domain to IPv6 address                                           │
│                                                                        │
│  example.com.  300  IN  AAAA  2606:2800:220:1:248:1893:25c8:1946       │
│                                                                        │
│  CNAME RECORD (Canonical Name / Alias)                                 │
│  ────────────────────────────────────────────────────────────────────  │
│  Creates an alias pointing to another domain                           │
│                                                                        │
│  www.example.com.   300  IN  CNAME  example.com.                       │
│  cdn.example.com.   300  IN  CNAME  d123456.cloudfront.net.            │
│  blog.example.com.  300  IN  CNAME  mycompany.ghost.io.                │
│                                                                        │
│  ⚠️ IMPORTANT: Cannot use CNAME at zone apex (example.com)             │
│     Workarounds: ALIAS (Route 53), ANAME, CNAME flattening (Cloudflare)│
│                                                                        │
│  MX RECORD (Mail Exchange)                                             │
│  ────────────────────────────────────────────────────────────────────  │
│  Specifies mail servers for the domain                                 │
│                                                                        │
│  example.com.  300  IN  MX  10  mail1.example.com.                     │
│  example.com.  300  IN  MX  20  mail2.example.com.  (backup)           │
│                           ↑                                            │
│                      Priority (lower = higher priority)                │
│                                                                        │
│  TXT RECORD (Text)                                                     │
│  ────────────────────────────────────────────────────────────────────  │
│  Stores text, used for verification and email security                 │
│                                                                        │
│  Domain verification:                                                  │
│  example.com.  IN  TXT  "google-site-verification=abc123..."           │
│                                                                        │
│  SPF (email authentication):                                           │
│  example.com.  IN  TXT  "v=spf1 include:_spf.google.com ~all"          │
│                                                                        │
│  DMARC (email policy):                                                 │
│  _dmarc.example.com.  IN  TXT  "v=DMARC1; p=reject; rua=..."           │
│                                                                        │
│  NS RECORD (Name Server)                                               │
│  ────────────────────────────────────────────────────────────────────  │
│  Delegates DNS authority to nameservers                                │
│                                                                        │
│  example.com.  86400  IN  NS  ns1.example.com.                         │
│  example.com.  86400  IN  NS  ns2.example.com.                         │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
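
To show how these record types behave together, here is a small Python sketch that parses a few of the records above and answers lookups. It is a teaching aid, not a real zone-file parser: `parse`, `lookup`, and `mail_servers` are hypothetical helpers, and the parser assumes every line has the form `name TTL IN TYPE rdata`.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rtype: str
    rdata: str


def parse(line: str) -> Record:
    """Parse a 'name TTL IN TYPE rdata...' line as shown in the table above."""
    name, ttl, _cls, rtype, *rdata = line.split()
    return Record(name, int(ttl), rtype, " ".join(rdata))


ZONE = [parse(line) for line in """
example.com.      300  IN  A      93.184.216.34
www.example.com.  300  IN  CNAME  example.com.
example.com.      300  IN  MX     20 mail2.example.com.
example.com.      300  IN  MX     10 mail1.example.com.
""".strip().splitlines()]


def lookup(name: str, rtype: str) -> List[str]:
    """Return rdata for name/rtype, following CNAME aliases to their target."""
    for _ in range(8):  # bound the alias chain to avoid loops
        exact = [r.rdata for r in ZONE if r.name == name and r.rtype == rtype]
        if exact:
            return exact
        alias = [r.rdata for r in ZONE if r.name == name and r.rtype == "CNAME"]
        if not alias:
            return []
        name = alias[0]
    return []


def mail_servers(domain: str) -> List[str]:
    """Order MX hosts by preference; the lowest number is tried first."""
    pairs = [rdata.split() for rdata in lookup(domain, "MX")]
    return [host for _prio, host in sorted(pairs, key=lambda p: int(p[0]))]
```

For example, `lookup("www.example.com.", "A")` follows the CNAME to `example.com.` and returns its A record, and `mail_servers("example.com.")` lists `mail1.example.com.` before `mail2.example.com.` because 10 is a lower (more preferred) value than 20.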

DNS in System Design

┌────────────────────────────────────────────────────────────────────────┐
│                   DNS USE CASES IN SYSTEM DESIGN                       │
│                                                                        │
│  1. GEOGRAPHIC ROUTING (GeoDNS)                                        │
│  ────────────────────────────────────────────────────────────────────  │
│  Route users to nearest datacenter:                                    │
│                                                                        │
│  User in Tokyo  → api.example.com → 10.1.0.1 (Asia server)             │
│  User in NYC    → api.example.com → 10.2.0.1 (US East server)          │
│  User in London → api.example.com → 10.3.0.1 (EU server)               │
│                                                                        │
│  Benefits: Lower latency, data residency compliance                    │
│                                                                        │
│  2. FAILOVER                                                           │
│  ────────────────────────────────────────────────────────────────────  │
│  Automatic switch to backup when primary fails:                        │
│                                                                        │
│  Primary healthy:  api.example.com → primary-dc.example.com            │
│  Primary fails:    api.example.com → backup-dc.example.com             │
│                                                                        │
│  ⚠️ TTL affects failover speed! Lower TTL = faster failover            │
│                                                                        │
│  3. WEIGHTED ROUTING (Canary Deployments)                              │
│  ────────────────────────────────────────────────────────────────────  │
│  Split traffic between versions:                                       │
│                                                                        │
│  api.example.com                                                       │
│  ├── Weight 95: stable.internal (v1.0)                                 │
│  └── Weight 5:  canary.internal (v1.1)                                 │
│                                                                        │
│  4. SERVICE DISCOVERY (Internal DNS)                                   │
│  ────────────────────────────────────────────────────────────────────  │
│  Services find each other by DNS name:                                 │
│                                                                        │
│  user-service.internal → 10.0.1.1, 10.0.1.2, 10.0.1.3                  │
│  order-service.internal → 10.0.2.1, 10.0.2.2                           │
│                                                                        │
│  No hardcoded IPs! Works with auto-scaling.                            │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
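
The weighted-routing and failover policies above can be approximated in Python as follows. This is a sketch of the decision logic only; the function names are hypothetical, and the failover estimate is a rough upper bound that assumes health checks like those described earlier and ignores probe timeouts and resolvers that do not honor TTLs.

```python
import random
from typing import Dict, List, Optional


def pick_weighted(weights: Dict[str, int], rng: random.Random) -> str:
    """Return one target, chosen with probability proportional to its weight."""
    targets = list(weights)
    return rng.choices(targets, weights=[weights[t] for t in targets], k=1)[0]


def pick_failover(targets: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first healthy target in priority order, or None if all are down."""
    for target in targets:
        if healthy.get(target, False):
            return target
    return None


def rough_failover_seconds(check_interval: int, unhealthy_threshold: int, ttl: int) -> int:
    """Rough upper bound: time to detect the failure plus time for caches to expire."""
    return check_interval * unhealthy_threshold + ttl


rng = random.Random(42)  # seeded so the run is repeatable
picks = [
    pick_weighted({"stable.internal": 95, "canary.internal": 5}, rng)
    for _ in range(10_000)
]
canary_share = picks.count("canary.internal") / len(picks)  # close to 0.05
```

With 30-second checks, an unhealthy threshold of 3, and a TTL of 300, `rough_failover_seconds(30, 3, 300)` gives 390 seconds; dropping the TTL to 60 brings it to 150. Also keep in mind that resolvers cache answers, so a DNS-level 95/5 split applies per resolver lookup rather than per request and is only approximate.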

DNS Providers

MANAGED DNS:
├── Route 53 (AWS): Health checks, latency/geo routing, $0.50/zone/mo
├── Cloudflare DNS: Free tier, low query latency, DDoS protection, fast propagation
├── Cloud DNS (GCP): Anycast, private zones
└── Azure DNS: Azure integration

PUBLIC RESOLVERS:
├── 8.8.8.8 / 8.8.4.4    - Google Public DNS
├── 1.1.1.1 / 1.0.0.1    - Cloudflare (speed- and privacy-focused)
├── 9.9.9.9              - Quad9 (security-focused)
└── 208.67.222.222       - OpenDNS

📚 Further Reading: DNS

How DNS Works (Interactive Comic): https://howdns.works/
AWS Route 53 Documentation: https://docs.aws.amazon.com/Route53/
Cloudflare DNS Learning: https://www.cloudflare.com/learning/dns/what-is-dns/

Chapter 2: Application Layer

The application layer is where your business logic runs.

2.1 Stateless vs Stateful Applications

Whether a server keeps session state in its own memory largely determines how easily the system scales horizontally:

┌────────────────────────────────────────────────────────────────────────┐
│                    STATEFUL SERVER (Problematic)                       │
│                                                                        │
│  Server stores session data in memory:                                 │
│                                                                        │
│  Request 1: User logs in → Server A stores session in memory           │
│  Request 2: User loads profile → Must go to Server A (has session!)    │
│  Request 3: Server A crashes → Session LOST! User logged out!          │
│                                                                        │
│  PROBLEMS:                                                             │
│  ✗ User MUST return to same server (sticky sessions required)          │
│  ✗ Server crash = data loss                                            │
│  ✗ Can't add servers easily (new servers don't have sessions)          │
│  ✗ Memory grows with users                                             │
│  ✗ Horizontal scaling is painful                                       │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│                    STATELESS SERVER (Preferred)                        │
│                                                                        │
│  Server stores NO session data. All state is external.                 │
│                                                                        │
│  Request 1: User logs in → Server A creates session in Redis           │
│             Returns a JWT to the client                                │
│  Request 2: User loads profile → Goes to Server B (any server works!)  │
│             Server B validates JWT, gets session from Redis            │
│  Request 3: Server A crashes → No problem! User continues on B, C      │
│                                                                        │
│  BENEFITS:                                                             │
│  ✓ Any server can handle any request                                   │
│  ✓ Easy horizontal scaling (just add servers)                          │
│  ✓ Server failure = no data loss                                       │
│  ✓ Simple load balancing (round-robin works)                           │
│  ✓ Easy deployments (no session migration)                             │
│                                                                        │
│  WHERE TO STORE STATE:                                                 │
│  • Sessions → Redis (fast, TTL support)                                │
│  • User data → PostgreSQL (persistent)                                 │
│  • Files → S3 (durable)                                                │
│  • Cache → Redis/Memcached                                             │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
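
The stateless pattern above can be demonstrated with a short Python sketch. Several things here are stand-ins chosen to keep the example self-contained: a plain dictionary plays the role of Redis, an HMAC-signed session ID plays the role of the JWT, and the `Server` class, `SECRET`, and `sign` helper are all hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional

SECRET = b"demo-only-secret"          # hypothetical; load from secure config in practice
SESSION_STORE: Dict[str, dict] = {}   # stand-in for Redis, shared by every server


def sign(session_id: str) -> str:
    """Attach an HMAC so servers can detect tampered tokens."""
    mac = hmac.new(SECRET, session_id.encode(), hashlib.sha256).hexdigest()
    return f"{session_id}.{mac}"


class Server:
    """Holds no per-user state; everything lives in the shared store."""

    def __init__(self, name: str) -> None:
        self.name = name

    def login(self, user: str) -> str:
        session_id = secrets.token_hex(16)
        SESSION_STORE[session_id] = {"user": user}
        return sign(session_id)       # token handed back to the client

    def profile(self, token: str) -> Optional[str]:
        session_id, _, mac = token.partition(".")
        expected = sign(session_id).partition(".")[2]
        if not hmac.compare_digest(mac.encode(), expected.encode()):
            return None               # tampered or malformed token
        session = SESSION_STORE.get(session_id)
        return f"{session['user']} via {self.name}" if session else None


server_a, server_b = Server("A"), Server("B")
token = server_a.login("alice")       # Request 1 lands on Server A
del server_a                          # Server A "crashes"
result = server_b.profile(token)      # Request 2 lands on Server B and still works
```

Because neither server object stores anything about the user, losing Server A costs nothing; the trade-off is that the shared store itself now has to be highly available.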

2.2 Monolith vs Microservices

┌──────────────────────────────────────────────────────────────────────┐
│  MONOLITH                         │  MICROSERVICES                   │
│  ─────────────────────────────────┼────────────────────────────────  │
│                                   │                                  │
│  ┌────────────────────────────┐   │ ┌────────┐ ┌────────┐ ┌────────┐ │
│  │    SINGLE APPLICATION      │   │ │ User   │ │ Order  │ │Payment │ │
│  │                            │   │ │Service │ │Service │ │Service │ │
│  │  ┌──────┐ ┌──────┐         │   │ │ ┌────┐ │ │ ┌────┐ │ │ ┌────┐ │ │
│  │  │ User │ │Order │         │   │ │ │ DB │ │ │ │ DB │ │ │ │ DB │ │ │
│  │  │Module│ │Module│         │   │ │ └────┘ │ │ └────┘ │ │ └────┘ │ │
│  │  └──────┘ └──────┘         │   │ └────────┘ └────────┘ └────────┘ │
│  │                            │   │                                  │
│  │      SHARED DATABASE       │   │  Independent databases & deploys │
│  └────────────────────────────┘   │                                  │
│                                   │                                  │
│  PROS:                            │  PROS:                           │
│  ✓ Simple to develop              │  ✓ Scale services independently  │
│  ✓ Easy to test                   │  ✓ Technology diversity          │
│  ✓ Simple deployment              │  ✓ Isolated failures             │
│  ✓ No network latency             │  ✓ Team autonomy                 │
│  ✓ ACID transactions easy         │  ✓ Faster CI/CD per service      │
│                                   │                                  │
│  CONS:                            │  CONS:                           │
│  ✗ Scale everything together      │  ✗ Complex operations            │
│  ✗ Single failure affects all     │  ✗ Network latency               │
│  ✗ Technology lock-in             │  ✗ Distributed transactions hard │
│  ✗ Slow builds as code grows      │  ✗ Testing complexity            │
│  ✗ Team coordination hard         │  ✗ Need observability tooling    │
│                                   │                                  │
└──────────────────────────────────────────────────────────────────────┘

WHEN TO USE WHICH:

START WITH MONOLITH:
• Small team (< 10 developers)
• New product (unclear requirements)
• Need to move fast (MVP)
• Limited DevOps expertise

CONSIDER MICROSERVICES:
• Large team (can dedicate team per service)
• Clear service boundaries exist
• Different scaling requirements
• Strong DevOps culture
• Have observability infrastructure

"Start with a monolith, extract services when needed" (paraphrasing Martin Fowler's "Monolith First")

📚 Further Reading: Architecture

Martin Fowler - Microservices: https://martinfowler.com/articles/microservices.html
Monolith First: https://martinfowler.com/bliki/MonolithFirst.html
Building Microservices (Book) - Sam Newman
Microservices.io Patterns: https://microservices.io/

Chapter 3: Data Layer

3.1 SQL vs NoSQL Databases

┌─────────────────────────────────────────────────────────────────────────┐
│                    SQL (Relational) DATABASES                           │
│                                                                         │
│  Examples: PostgreSQL, MySQL, SQL Server                                │
│                                                                         │
│  STRUCTURE: Fixed schema, tables with rows and columns                  │
│                                                                         │
│  users                        orders                                    │
│  ┌─────────────────────┐     ┌──────────────────────────┐               │
│  │ id (PK)       INT   │     │ id (PK)        INT       │               │
│  │ email         TEXT  │◀────│ user_id (FK)   INT       │               │
│  │ name          TEXT  │ 1:N │ total          DECIMAL   │               │
│  │ created_at    TIME  │     │ status         TEXT      │               │
│  └─────────────────────┘     └──────────────────────────┘               │
│                                                                         │
│  ACID TRANSACTIONS:                                                     │
│  BEGIN;                                                                 │
│    UPDATE accounts SET balance = balance - 100 WHERE id = 1;            │
│    UPDATE accounts SET balance = balance + 100 WHERE id = 2;            │
│  COMMIT;  -- Both happen or neither happens                             │
│                                                                         │
│  BEST FOR:                                                              │
│  ✓ Complex queries with JOINs                                           │
│  ✓ Transactions (banking, orders)                                       │
│  ✓ Data integrity critical                                              │
│  ✓ Structured, predictable data                                         │
│  ✓ Reporting and analytics                                              │
│                                                                         │
│  DEFAULT CHOICE: PostgreSQL (handles most use cases well)               │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
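
The all-or-nothing behaviour of the transfer above can be tried locally. The sketch below uses Python's built-in `sqlite3` module rather than PostgreSQL; the `CHECK (balance >= 0)` constraint and the starting balances are assumptions added so that a failed transfer can be demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 150), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed; the credit was rolled back too


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


print(transfer(conn, 1, 2, 100), balances(conn))  # first transfer succeeds
print(transfer(conn, 1, 2, 100), balances(conn))  # second would overdraw account 1
```

The credit is deliberately applied before the debit so that the second call fails halfway through: the rollback then undoes the credit that had already been applied, which is the atomicity guarantee in action.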

┌─────────────────────────────────────────────────────────────────────────┐
│                      NOSQL DATABASE TYPES                               │
│                                                                         │
│  1. DOCUMENT STORE (MongoDB, CouchDB)                                   │
│  ─────────────────────────────────────────────────────────────────────  │
│  {                                                                      │
│    "_id": "user_123",                                                   │
│    "name": "John",                                                      │
│    "orders": [                    // Embedded documents                 │
│      {"id": "ord1", "total": 99.99}                                     │
│    ]                                                                    │
│  }                                                                      │
│                                                                         │
│  ✓ Flexible schema, natural for hierarchical data                       │
│  ✗ Limited JOINs (denormalize instead)                                  │
│  Best for: Content management, catalogs, user profiles                  │
│                                                                         │
│  2. KEY-VALUE STORE (Redis, DynamoDB)                                   │
│  ─────────────────────────────────────────────────────────────────────  │
│  user:123  →  {"name": "John", "email": "j@ex.com"}                     │
│  session:abc  →  {"user_id": 123, "expires": ...}                       │
│                                                                         │
│  ✓ Extremely fast O(1) lookups                                          │
│  ✗ No complex queries                                                   │
│  Best for: Sessions, caching, real-time data, leaderboards              │
│                                                                         │
│  3. WIDE-COLUMN STORE (Cassandra, HBase)                                │
│  ─────────────────────────────────────────────────────────────────────  │
│  Row Key    │ Column1      │ Column2        │ Column3                   │
│  user:123   │ name:John    │ email:j@ex.com │ age:30                    │
│  user:456   │ name:Jane    │ city:NYC       │ (no age!)                 │
│                                                                         │
│  ✓ Massive write throughput, linear scaling                             │
│  ✗ Limited query patterns                                               │
│  Best for: Time-series, IoT, logging, high write volume                 │
│                                                                         │
│  4. GRAPH DATABASE (Neo4j, Neptune)                                     │
│  ─────────────────────────────────────────────────────────────────────  │
│        (Alice)──FOLLOWS──▶(Bob)──WORKS_AT──▶(Acme)                      │
│                                                                         │
│  ✓ Natural for connected data, fast traversals                          │
│  Best for: Social networks, recommendations, fraud detection            │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
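
To compare the four data models side by side, the records from the box above can be written as plain Python structures. This is only a sketch of the shapes involved: no database is used, and the helper names `neighbors` and `employers_of_followed` are hypothetical.

```python
# Document store: one self-contained record with embedded orders.
user_doc = {
    "_id": "user_123",
    "name": "John",
    "orders": [{"id": "ord1", "total": 99.99}],
}

# Key-value store: opaque values addressed only by their key.
kv = {
    "user:123": '{"name": "John", "email": "j@ex.com"}',
    "session:abc": '{"user_id": 123}',
}

# Wide-column store: each row key maps to its own sparse set of columns.
wide = {
    "user:123": {"name": "John", "email": "j@ex.com", "age": 30},
    "user:456": {"name": "Jane", "city": "NYC"},  # no "age" column
}

# Graph: nodes connected by typed edges.
edges = [("Alice", "FOLLOWS", "Bob"), ("Bob", "WORKS_AT", "Acme")]


def neighbors(node, edge_type):
    """Nodes reachable from `node` over one edge of the given type."""
    return [dst for src, etype, dst in edges if src == node and etype == edge_type]


def employers_of_followed(person):
    """Two-hop traversal: where do the people this person follows work?"""
    return [
        company
        for followed in neighbors(person, "FOLLOWS")
        for company in neighbors(followed, "WORKS_AT")
    ]


print(sum(order["total"] for order in user_doc["orders"]))  # no join needed
print(sorted(wide["user:456"]))                             # columns vary per row
print(employers_of_followed("Alice"))
```

The document answers "what did this user order?" in a single read, while the graph answers a relationship question by following edges instead of joining tables.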

DATABASE SELECTION QUICK GUIDE:
─────────────────────────────────────────────────────────────────────────
Need ACID transactions?        → PostgreSQL, MySQL
Flexible schema?               → MongoDB, DynamoDB
Simple key-value + caching?    → Redis
Massive writes?                → Cassandra
Full-text search?              → Elasticsearch
Graph relationships?           → Neo4j
Default choice?                → PostgreSQL

3.2 Caching

┌─────────────────────────────────────────────────────────────────────────┐
│                       WHY CACHING MATTERS                               │
│                                                                         │
│  WITHOUT CACHE:                                                         │
│  Every request → Database (50ms)                                        │
│  1000 requests = 1000 DB queries = DB overload                          │
│                                                                         │
│  WITH CACHE (90% hit rate):                                             │
│  900 requests → Cache (1ms)                                             │
│  100 requests → Database (50ms)                                         │
│                                                                         │
│  Average latency: (0.9 × 1) + (0.1 × 50) = 5.9ms                        │
│  vs 50ms without cache = 8.5x faster!                                   │
│  Database load reduced by 90%!                                          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
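
The latency arithmetic above generalizes to any hit rate. The function below is a minimal sketch using the same assumed numbers (1ms cache, 50ms database); like the box, it ignores the small cost of the cache lookup that precedes a miss.

```python
def average_latency_ms(hit_rate, cache_ms=1.0, db_ms=50.0):
    """Expected latency per request for a given cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ms + (1.0 - hit_rate) * db_ms


for rate in (0.0, 0.5, 0.9, 0.99):
    latency = average_latency_ms(rate)
    print(f"hit rate {rate:.0%}: {latency:.2f} ms, {50.0 / latency:.1f}x faster")
```

Going from a 90% to a 99% hit rate cuts the miss traffic by a factor of ten, which is why the last few percentage points of hit rate matter more than they first appear.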

┌─────────────────────────────────────────────────────────────────────────┐
│                   CACHE-ASIDE PATTERN (Most Common)                     │
│                                                                         │
│  def get_user(user_id):                                                 │
│      # 1. Check cache first                                             │
│      cached = redis.get(f"user:{user_id}")                              │
│      if cached:                                                         │
│          return json.loads(cached)  # Cache HIT                         │
│                                                                         │
│      # 2. Cache MISS - query database                                   │
│      user = db.query("SELECT * FROM users WHERE id = ?", user_id)       │
│                                                                         │
│      # 3. Store in cache for next time                                  │
│      redis.setex(f"user:{user_id}", 3600, json.dumps(user))  # 1hr TTL  │
│                                                                         │
│      return user                                                        │
│                                                                         │
│  INVALIDATION (on update):                                              │
│  def update_user(user_id, data):                                        │
│      db.update(user_id, data)                                           │
│      redis.delete(f"user:{user_id}")  # Invalidate cache                │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
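
A self-contained version of the pattern above can be run without Redis or a database server. In this sketch `FakeCache` is a hypothetical in-memory stand-in that mimics Redis `GET`, `SETEX`, and `DEL`, SQLite replaces the real database, and the `stats` counter exists only to show when the database is actually hit.

```python
import json
import sqlite3
import time


class FakeCache:
    """Hypothetical in-memory stand-in for Redis GET / SETEX / DEL."""

    def __init__(self):
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None or time.monotonic() >= entry[0]:
            self._items.pop(key, None)   # missing or expired
            return None
        return entry[1]

    def setex(self, key, ttl_seconds, value):
        self._items[key] = (time.monotonic() + ttl_seconds, value)

    def delete(self, key):
        self._items.pop(key, None)


cache = FakeCache()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'John')")
db.commit()
stats = {"db_reads": 0}  # counts how often the database is queried


def get_user(user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                 # cache hit
        return json.loads(cached)
    stats["db_reads"] += 1                 # cache miss: fall through to the DB
    row = db.execute(
        "SELECT id, name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    user = {"id": row[0], "name": row[1]}
    cache.setex(key, 3600, json.dumps(user))   # 1 hour TTL
    return user


def update_user(user_id, name):
    db.execute("UPDATE users SET name = ? WHERE id = ?", (name, user_id))
    db.commit()
    cache.delete(f"user:{user_id}")        # invalidate after the write commits


get_user(1)                # miss: reads the database and fills the cache
get_user(1)                # hit: served from the cache
update_user(1, "Jane")     # write, then invalidate
print(get_user(1), stats)  # miss again, returns the fresh row
```

Comparing against `None` rather than relying on truthiness avoids treating an empty cached value as a miss. Deleting the key on update, instead of rewriting it, keeps the write path simple at the cost of one extra miss afterwards.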

CACHE EVICTION POLICIES:
─────────────────────────────────────────────────────────────────────────
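LRU (Least Recently Used) - Evict the entry that was accessed longest ago (most common choice)
LFU (Least Frequently Used) - Evict the entry with the fewest accesses
FIFO (First In First Out) - Evict the oldest entry, regardless of access
TTL (Time To Live) - Expire entries after a fixed lifetime (usually combined with one of the above)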
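
LRU is simple enough to sketch directly. The class below is a minimal illustration built on `collections.OrderedDict`; the `LRUCache` name and its methods are hypothetical, and real caches typically use approximations of LRU rather than exact ordering.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # least recently used first, most recent last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def keys(self):
        return list(self._items)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now more recent than "b"
cache.put("c", 3)    # over capacity: "b" is evicted
print(cache.keys())
```

Reading "a" before inserting "c" is what saves it: under FIFO the same sequence would have evicted "a" instead.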

REDIS vs MEMCACHED:
─────────────────────────────────────────────────────────────────────────
Redis: Rich data structures, persistence, pub/sub, Lua scripts
Memcached: Simpler, multi-threaded, lower memory overhead for plain string values

DEFAULT CHOICE: Redis (more versatile)

📚 Further Reading: Databases & Caching

PostgreSQL Documentation: https://www.postgresql.org/docs/
Redis Documentation: https://redis.io/docs/
MongoDB University: https://university.mongodb.com/
Designing Data-Intensive Applications (Book) - Martin Kleppmann

3.3 Object Storage

┌────────────────────────────────────────────────────────────────────────┐
│                       OBJECT STORAGE (S3)                              │
│                                                                        │
│  STRUCTURE:                                                            │
│  Bucket: my-app-uploads                                                │
│  ├── images/profiles/user123.jpg      (Object)                         │
│  ├── videos/uploads/intro.mp4         (Object)                         │
│  └── documents/reports/q3.pdf         (Object)                         │
│                                                                        │
│  CHARACTERISTICS:                                                      │
│  ✓ Virtually unlimited scale                                           │
│  ✓ 99.999999999% durability (11 nines!)                                │
│  ✓ HTTP access (easy CDN integration)                                  │
│  ✓ Cheap (compared to block storage)                                   │
│  ✓ Versioning, lifecycle policies                                      │
│                                                                        │
│  ✗ Not a native filesystem (mounting needs an adapter)                  │
│  ✗ No partial updates (replace whole object)                           │
│  ✗ Higher latency than local disk                                      │
│                                                                        │
│  USE CASES:                                                            │
│  • User uploads (images, videos, documents)                            │
│  • Static website assets                                               │
│  • Backups and archives                                                │
│  • Data lake storage                                                   │
│                                                                        │
│  PROVIDERS:                                                            │
│  • AWS S3 (the original, most popular)                                 │
│  • Google Cloud Storage                                                │
│  • Azure Blob Storage                                                  │
│  • Cloudflare R2 (S3-compatible, no egress fees!)                      │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
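
The "11 nines" figure becomes more concrete as arithmetic. The sketch below assumes the durability number is read as a per-object, per-year survival probability, which is how the design target is usually stated; the object counts are arbitrary examples.

```python
# 100% - 99.999999999% = 1e-11 chance of losing a given object in a year
ANNUAL_LOSS_PROBABILITY = 1e-11


def expected_losses_per_year(object_count):
    """Average number of objects lost per year at 11 nines of durability."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_lost_object(object_count):
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count)


for count in (10_000_000, 1_000_000_000):
    print(f"{count:,} objects: about one loss every "
          f"{years_per_lost_object(count):,.0f} years")
```

Storing ten million objects works out to roughly one lost object every 10,000 years on average. Durability says nothing about availability or accidental deletion, which is why versioning and lifecycle policies still matter.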

Chapter 4: Messaging Layer

4.1 Message Queues

┌────────────────────────────────────────────────────────────────────────┐
│                    MESSAGE QUEUE PATTERNS                              │
│                                                                        │
│  POINT-TO-POINT (Work Queue)                                           │
│  ────────────────────────────────────────────────────────────────────  │
│                                                                        │
│  ┌──────────┐                             ┌────────────┐               │
│  │Producer A│──┐                     ┌───▶│Consumer 1  │               │
│  └──────────┘  │    ┌────────────┐   │    └────────────┘               │
│                ├───▶│   QUEUE    │───┤                                 │
│  ┌──────────┐  │    │ [3][2][1]  │   │    ┌────────────┐               │
│  │Producer B│──┘    └────────────┘   └───▶│Consumer 2  │               │
│  └──────────┘                             └────────────┘               │
│                                                                        │
│  Each message consumed by ONE consumer only                            │
│  Use for: Background jobs, task distribution, work queues              │
│                                                                        │
│  PUBLISH-SUBSCRIBE (Fan-out)                                           │
│  ────────────────────────────────────────────────────────────────────  │
│                                                                        │
│                                          ┌────────────┐                │
│                                     ┌───▶│Subscriber A│                │
│  ┌──────────┐   ┌────────────┐      │    └────────────┘                │
│  │Publisher │──▶│   TOPIC    │──────┼───▶ Subscriber B                 │
│  └──────────┘   └────────────┘      │    ┌────────────┐                │
│                                     └───▶│Subscriber C│                │
│                                          └────────────┘                │
│                                                                        │
│  Each message delivered to ALL subscribers                             │
│  Use for: Events, notifications, real-time updates                     │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
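
The difference between the two patterns is easiest to see in code. The classes below are a minimal in-process sketch, not a broker: `WorkQueue` and `Topic` are hypothetical names, and the strict round-robin dispatch is a simplification (real brokers usually hand a message to whichever consumer is ready).

```python
from collections import deque
from itertools import cycle


class WorkQueue:
    """Point-to-point: each message goes to exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def drain(self, consumers):
        """Hand queued messages to consumers in round-robin order."""
        for consumer in cycle(consumers):
            if not self._messages:
                break
            consumer(self._messages.popleft())


class Topic:
    """Publish-subscribe: every subscriber receives its own copy."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


worker_1, worker_2 = [], []
queue = WorkQueue()
for job in ("job-1", "job-2", "job-3"):
    queue.send(job)
queue.drain([worker_1.append, worker_2.append])
print(worker_1, worker_2)   # the jobs are split between the workers

inbox_a, inbox_b = [], []
topic = Topic()
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.publish("price-changed")
print(inbox_a, inbox_b)     # both subscribers see the same message
```

Adding a third worker to the queue spreads the same jobs more thinly, while adding a third subscriber to the topic creates one more copy of every message.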

MESSAGE QUEUE COMPARISON:
─────────────────────────────────────────────────────────────────────────
             │ RabbitMQ  │ Kafka      │ SQS        │ Redis Streams
─────────────┼───────────┼────────────┼────────────┼─────────────────
Throughput   │ Medium    │ Very High  │ High       │ Medium
Replay       │ No        │ Yes        │ No         │ Yes
Ordering     │ Queue     │ Partition  │ FIFO opt   │ Stream
Persistence  │ Optional  │ Yes        │ Yes        │ Optional
Best for     │ Routing   │ Events     │ Tasks      │ Lightweight
─────────────────────────────────────────────────────────────────────────
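
The "Partition" entry in the Ordering row means that order is guaranteed only among messages that land in the same partition, which in practice means messages sharing a key. The sketch below illustrates the idea with in-memory lists; CRC32 is an assumed stand-in for the hash a real client would use, and `partition_for` and `produce` are hypothetical helpers.

```python
import zlib


def partition_for(key, num_partitions):
    """Stable key -> partition mapping (CRC32 here is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(events, num_partitions):
    """Append each (key, payload) event to the log of its key's partition."""
    partitions = [[] for _ in range(num_partitions)]
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
partitions = produce(events, num_partitions=3)

for index, log in enumerate(partitions):
    print(index, log)
```

Every event for a given order stays in one partition and in its original sequence, so a consumer reading that partition never sees "shipped" before "paid". No such guarantee holds between different orders in different partitions.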

WHEN TO USE QUEUES:
✓ Async processing (user doesn't wait for slow task)
✓ Decoupling services (Order → Inventory, Email, Analytics)
✓ Handling traffic spikes (queue buffers requests)
✓ Reliable delivery (payment events)
✓ Rate limiting external APIs

4.2 Event-Driven Architecture

┌────────────────────────────────────────────────────────────────────────┐
│                   EVENT-DRIVEN ARCHITECTURE                            │
│                                                                        │
│  INSTEAD OF:                                                           │
│  Order Service → calls → Inventory → calls → Email → calls → Analytics │
│  (synchronous chain - slow, coupled, one failure breaks all)           │
│                                                                        │
│  WE DO:                                                                │
│  ┌──────────────────┐                                                  │
│  │  Order Service   │                                                  │
│  │                  │                                                  │
│  │  1. Create order │                                                  │
│  │  2. Emit event:  │                                                  │
│  │    OrderCreated  │                                                  │
│  └────────┬─────────┘                                                  │
│           │                                                            │
│           ▼                                                            │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                    EVENT BUS (Kafka)                            │   │
│  │  { "type": "order.created", "order_id": "123", "total": 99.99 } │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│           │                    │                    │                  │
│           ▼                    ▼                    ▼                  │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐         │
│  │   Inventory     │  │     Email       │  │   Analytics     │         │
│  │   Service       │  │    Service      │  │    Service      │         │
│  │                 │  │                 │  │                 │         │
│  │ Reserve stock   │  │ Send confirm    │  │ Track metrics   │         │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘         │
│                                                                        │
│  BENEFITS:                                                             │
│  ✓ Services don't know about each other (loose coupling)               │
│  ✓ Easy to add new consumers                                           │
│  ✓ Failure isolation                                                   │
│  ✓ Event replay capability                                             │
│                                                                        │
│  CHALLENGES:                                                           │
│  ✗ Harder to trace request flow                                        │
│  ✗ Eventual consistency                                                │
│  ✗ Need good observability                                             │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
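
The flow in the diagram can be sketched with an in-process event bus. This is an illustration under stated assumptions: `EventBus` is a hypothetical class standing in for a broker such as Kafka, delivery here is synchronous rather than asynchronous, and the email handler fails on purpose to show failure isolation.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical in-process bus; a real system would use a broker."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self.failures = []  # (event type, handler name, error message)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event):
        for handler in self._handlers[event["type"]]:
            try:
                handler(event)
            except Exception as exc:  # one failing consumer must not stop the rest
                self.failures.append((event["type"], handler.__name__, str(exc)))


stock = {"reserved": []}
metrics = {"orders": 0, "revenue": 0.0}


def reserve_stock(event):
    stock["reserved"].append(event["order_id"])


def send_confirmation(event):
    raise RuntimeError("mail server unavailable")  # simulated outage


def track_metrics(event):
    metrics["orders"] += 1
    metrics["revenue"] += event["total"]


bus = EventBus()
for handler in (reserve_stock, send_confirmation, track_metrics):
    bus.subscribe("order.created", handler)

# The order service only emits the event; it never calls the consumers directly.
bus.emit({"type": "order.created", "order_id": "123", "total": 99.99})
print(stock, metrics, bus.failures)
```

Stock is reserved and metrics are recorded even though the email handler raised. With a durable broker the email service could also retry the event later, which this synchronous sketch does not attempt.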

📚 Further Reading: Messaging

RabbitMQ Tutorials: https://www.rabbitmq.com/getstarted.html
Kafka Documentation: https://kafka.apache.org/documentation/
AWS SQS Guide: https://docs.aws.amazon.com/sqs/
Event-Driven Architecture (AWS): https://aws.amazon.com/event-driven-architecture/

Chapter 5: Quick Reference

Component Selection Cheatsheet

┌─────────────────────────────────────────────────────────────────────────┐
│                   COMPONENT SELECTION CHEATSHEET                        │
│                                                                         │
│  "I need to..."                        │  Use...                        │
│  ══════════════════════════════════════╪════════════════════════════════│
│  Cache static content globally         │  CDN (CloudFront, Cloudflare)  │
│  Distribute traffic to servers         │  Load Balancer (ALB, Nginx)    │
│  Handle auth, rate limiting for APIs   │  API Gateway (Kong, AWS APIGW) │
│  Route domain to servers               │  DNS (Route 53, Cloudflare)    │
│  Store structured data + transactions  │  PostgreSQL, MySQL             │
│  Store documents, flexible schema      │  MongoDB, DynamoDB             │
│  Fast key-value lookups, caching       │  Redis                         │
│  Store files (images, videos)          │  S3, GCS                       │
│  Full-text search                      │  Elasticsearch, Algolia        │
│  Async task processing                 │  SQS, RabbitMQ                 │
│  High-throughput event streaming       │  Kafka                         │
│  Real-time pub/sub                     │  Redis Pub/Sub                 │
│                                                                         │
│  DEFAULT CHOICES:                                                       │
│  • Database: PostgreSQL                                                 │
│  • Cache: Redis                                                         │
│  • Queue: SQS (AWS) or Kafka (high throughput)                          │
│  • CDN: CloudFront (AWS) or Cloudflare                                  │
│  • Object Storage: S3                                                   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Typical Architecture Patterns

SIMPLE WEB APP:
CDN → Load Balancer → App Servers → PostgreSQL + Redis

MICROSERVICES:
CDN → API Gateway → Load Balancers → Services → DBs + Kafka

REAL-TIME APP:
CDN → Load Balancer → WebSocket Servers → Redis Pub/Sub → Kafka

Summary

┌─────────────────────────────────────────────────────────────────────────┐
│                    PART 2 KEY TAKEAWAYS                                 │
│                                                                         │
│  EDGE LAYER:                                                            │
│  • CDN: Cache static content at edge, 95%+ hit ratio target             │
│  • Load Balancer: Distribute traffic, L4 vs L7, health checks           │
│  • API Gateway: Auth, rate limiting, routing, transformation            │
│  • DNS: Domain resolution, geo routing, failover                        │
│                                                                         │
│  APPLICATION LAYER:                                                     │
│  • Keep servers STATELESS (externalize state to Redis/DB)               │
│  • Start MONOLITH, extract microservices when needed                    │
│                                                                         │
│  DATA LAYER:                                                            │
│  • SQL (PostgreSQL): Transactions, relationships, complex queries       │
│  • NoSQL: Document (MongoDB), Key-Value (Redis), Column (Cassandra)     │
│  • Caching: Cache-aside pattern, LRU eviction                           │
│  • Object Storage: S3 for files                                         │
│                                                                         │
│  MESSAGING LAYER:                                                       │
│  • Queues: Async processing, decoupling, traffic buffering              │
│  • Kafka: High-throughput event streaming                               │
│  • Event-driven: Loose coupling, failure isolation                      │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

📚 Comprehensive Resource List

Books

Designing Data-Intensive Applications - Martin Kleppmann (Essential!)
Building Microservices - Sam Newman
System Design Interview - Alex Xu
Web Scalability for Startup Engineers - Artur Ejsmont

Online Resources

High Scalability Blog: http://highscalability.com/
AWS Architecture Center: https://aws.amazon.com/architecture/
System Design Primer: https://github.com/donnemartin/system-design-primer
ByteByteGo Newsletter: https://bytebytego.com/

Documentation

AWS Well-Architected: https://aws.amazon.com/architecture/well-architected/
Google Cloud Architecture: https://cloud.google.com/architecture
Azure Architecture Center: https://docs.microsoft.com/en-us/azure/architecture/

End of Week 0 — Part 2

Next: Part 3 covers Back-of-the-Envelope Estimation — how to calculate traffic, storage, bandwidth, and infrastructure sizing during interviews.
