Week 0 — Part 3: Back-of-the-Envelope Estimation
The Art of System Sizing
Introduction: Why Estimation Matters
In system design interviews, you'll be asked to design systems that handle real-world scale. Interviewers want to see that you can:
- Think about scale — Not just "it works" but "it works at 1M users"
- Make informed decisions — Choose technologies based on numbers
- Identify bottlenecks — Know where the system will break
- Size infrastructure — How many servers, how much storage
- Communicate clearly — Show your reasoning
This document gives you the formulas, numbers, and techniques to estimate quickly and accurately.
Chapter 1: Numbers Every Engineer Should Know
1.1 Powers of Two
These are essential for memory, storage, and data size calculations.
┌─────────────────────────────────────────────────────────────────────────┐
│ POWERS OF TWO │
│ │
│ Power Exact Value Approximate Name │
│ ────────────────────────────────────────────────────────────────────── │
│ 2^10 1,024 ~1 Thousand 1 KB (Kilobyte) │
│ 2^20 1,048,576 ~1 Million 1 MB (Megabyte) │
│ 2^30 1,073,741,824 ~1 Billion 1 GB (Gigabyte) │
│ 2^40 1,099,511,627,776 ~1 Trillion 1 TB (Terabyte) │
│ 2^50 1,125,899,906,842,624 ~1 Quadrillion 1 PB (Petabyte) │
│ │
│ MEMORY AIDS: │
│ • Every 10 powers of 2 ≈ 3 powers of 10 │
│ • 2^10 ≈ 10^3 (thousand) │
│ • 2^20 ≈ 10^6 (million) │
│ • 2^30 ≈ 10^9 (billion) │
│ │
│ PRACTICAL CONVERSIONS: │
│ • 1 GB = 1,024 MB ≈ 1,000 MB (use 1,000 for estimation) │
│ • 1 TB = 1,024 GB ≈ 1,000 GB │
│ • 1 PB = 1,024 TB ≈ 1,000 TB │
│ │
└─────────────────────────────────────────────────────────────────────────┘
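The "use 1,000 instead of 1,024" shortcut drifts a little further from the exact value at each unit. The sketch below is only an illustration of that drift; the function name `approximation_error` is made up for this example.

```python
# How far off is the "2^(10k) ≈ 10^(3k)" shortcut for each storage unit?
UNIT_NAMES = ["KB", "MB", "GB", "TB", "PB"]


def approximation_error(k: int) -> float:
    """Fraction by which 2^(10k) exceeds 10^(3k)."""
    return 2 ** (10 * k) / 10 ** (3 * k) - 1


for k, name in enumerate(UNIT_NAMES, start=1):
    error = approximation_error(k)
    print(f"1 {name}: 2^{10 * k} is {error:.1%} larger than 10^{3 * k}")
```

The gap grows from about 2% at the KB level to about 13% at the PB level, which is still well inside back-of-the-envelope tolerance.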
1.2 Time Conversions
┌─────────────────────────────────────────────────────────────────────────┐
│ TIME CONVERSIONS │
│ │
│ Unit Seconds Round To │
│ ────────────────────────────────────────────────────────────────────── │
│ 1 minute 60 60 │
│ 1 hour 3,600 ~4,000 │
│ 1 day 86,400 ~100,000 (10^5) │
│ 1 week 604,800 ~600,000 │
│ 1 month 2,592,000 ~2.5 million │
│ 1 year 31,536,000 ~30 million (3 × 10^7) │
│ │
│ KEY SHORTCUTS: │
│ ────────────────────────────────────────────────────────────────────── │
│ • 1 day ≈ 10^5 seconds (use 100,000) │
│ • 1 year ≈ 3 × 10^7 seconds │
│ • 1 month ≈ 2.5 × 10^6 seconds │
│ │
│ DAILY TO PER-SECOND CONVERSION: │
│ ────────────────────────────────────────────────────────────────────── │
│ Requests per day ÷ 100,000 ≈ requests per second │
│ (the shortcut runs about 14% low; fine for estimation) │
│ │
│ Examples: │
│ • 100 million requests/day ≈ 1,000 requests/second │
│ • 1 billion requests/day ≈ 10,000 requests/second │
│ • 10 million requests/day ≈ 100 requests/second │
│ │
│ FORMULA: │
│ Daily Volume │
│ RPS = ──────────────────────── │
│ 86,400 (or 100,000) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
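As a quick check on the daily-to-per-second shortcut, here is a minimal sketch that compares the rounded divisor (100,000) with the exact one (86,400). The helper name `daily_to_rps` is an assumption for illustration.

```python
SECONDS_PER_DAY = 86_400
ROUNDED_SECONDS_PER_DAY = 100_000  # the 10^5 shortcut


def daily_to_rps(daily_volume: float, precise: bool = False) -> float:
    """Convert a daily request count to an average per-second rate."""
    divisor = SECONDS_PER_DAY if precise else ROUNDED_SECONDS_PER_DAY
    return daily_volume / divisor


shortcut = daily_to_rps(100_000_000)              # mental-math version
exact = daily_to_rps(100_000_000, precise=True)   # exact version
shortfall = 1 - shortcut / exact                  # how much the shortcut undercounts

print(f"shortcut: {shortcut:,.0f} RPS, exact: {exact:,.0f} RPS")
print(f"the shortcut is {shortfall:.1%} lower than the exact figure")
```

Because the shortcut always undercounts by the same fraction, applying a peak factor afterwards more than covers the difference.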
1.3 Latency Numbers Every Programmer Should Know
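These numbers come from Jeff Dean's well-known "Numbers Everyone Should Know" talk and are essential for understanding system performance. Treat them as order-of-magnitude figures: exact values vary with hardware generation.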
┌────────────────────────────────────────────────────────────────────────┐
│ LATENCY COMPARISON NUMBERS │
│ │
│ Operation Time Scale │
│ ──────────────────────────────────────────────────────────────────── │
│ L1 cache reference 0.5 ns | │
│ Branch mispredict 5 ns | │
│ L2 cache reference 7 ns | │
│ Mutex lock/unlock 25 ns | │
│ Main memory reference 100 ns |= │
│ Compress 1KB with Snappy 3,000 ns |=== │
│ Send 1 KB over 1 Gbps network 10,000 ns |==== │
│ Read 4 KB randomly from SSD 150,000 ns |====== │
│ Read 1 MB sequentially from memory 250,000 ns |======= │
│ Round trip within same datacenter 500,000 ns |======== │
│ Read 1 MB sequentially from SSD 1,000,000 ns |========= │
│ HDD seek 10,000,000 ns |========== │
│ Read 1 MB sequentially from HDD 20,000,000 ns |=========== │
│ Send packet CA → Netherlands → CA 150,000,000 ns |============ │
│ │
│ HUMAN-READABLE SCALE: │
│ ──────────────────────────────────────────────────────────────────── │
│ • L1 cache: 0.5 ns │
│ • Main memory: 100 ns = 200x slower than L1 │
│ • SSD random read: 150 μs = 150,000 ns = 1,500x slower than memory │
│ • HDD seek: 10 ms = 10,000,000 ns = 67x slower than SSD │
│ • Network round trip (datacenter): 0.5 ms │
│ • Network round trip (cross-continent): 150 ms │
│ │
│ KEY INSIGHTS: │
│ ───────────────────────────────────────────────────────────────────── │
│ 1. Memory is ~1,000x faster than SSD for random access │
│ 2. SSD is ~20-70x faster than HDD (20x sequential, ~67x random) │
│ 3. Network within datacenter is fast (~0.5ms) │
│ 4. Cross-continent network is slow (~150ms) │
│ 5. Sequential reads are MUCH faster than random reads │
│ │
│ DESIGN IMPLICATIONS: │
│ ───────────────────────────────────────────────────────────────────── │
│ • Cache aggressively (memory >> disk) │
│ • Minimize network round trips │
│ • Batch operations when possible │
│ • Use CDN for global users │
│ • Prefer SSDs over HDDs for performance │
│ • Use sequential access patterns when possible │
│ │
└────────────────────────────────────────────────────────────────────────┘
📚 Source: Jeff Dean's "Numbers Everyone Should Know" - https://brenocon.com/dean_perf.html
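The ratios in the table can be recomputed directly, and the same numbers show why minimizing round trips matters. This is a sketch using a subset of the latencies listed above; the dictionary keys and helper names are invented for the example.

```python
# Latencies in nanoseconds, copied from the comparison table above.
LATENCY_NS = {
    "l1_cache": 0.5,
    "main_memory": 100,
    "ssd_random_read_4kb": 150_000,
    "datacenter_round_trip": 500_000,
    "hdd_seek": 10_000_000,
    "cross_continent_round_trip": 150_000_000,
}


def slowdown(slower: str, faster: str) -> float:
    """How many times slower one operation is than another."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


def sequential_calls_in_budget(operation: str, budget_ms: int) -> int:
    """How many back-to-back operations fit inside a latency budget."""
    return int(budget_ms * 1_000_000 // LATENCY_NS[operation])


print(f"memory vs L1: {slowdown('main_memory', 'l1_cache'):.0f}x")
print(f"SSD vs memory: {slowdown('ssd_random_read_4kb', 'main_memory'):.0f}x")
print(f"HDD seek vs SSD: {slowdown('hdd_seek', 'ssd_random_read_4kb'):.0f}x")
print("datacenter round trips in 100 ms:",
      sequential_calls_in_budget("datacenter_round_trip", 100))
print("cross-continent round trips in 100 ms:",
      sequential_calls_in_budget("cross_continent_round_trip", 100))
```

A 100 ms budget leaves room for a couple of hundred sequential in-datacenter calls, but not even one cross-continent round trip, which is the arithmetic behind "use a CDN for global users".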
1.4 Typical Data Sizes
┌─────────────────────────────────────────────────────────────────────────┐
│ COMMON DATA SIZES │
│ │
│ TEXT DATA │
│ ───────────────────────────────────────────────────────────────────── │
│ Item Size │
│ ────────────────────────────────────────────── │
│ Character (ASCII) 1 byte │
│ Character (UTF-8, average) 1-4 bytes │
│ UUID 36 bytes (string) / 16 bytes (bin)│
│ Email address ~30 bytes │
│ URL ~100 bytes │
│ Tweet (280 chars + metadata) ~500 bytes │
│ Typical JSON API response 2-10 KB │
│ Average email 50-100 KB │
│ Average web page (HTML + assets) 2-3 MB │
│ │
│ MEDIA │
│ ───────────────────────────────────────────────────────────────────── │
│ Item Size │
│ ────────────────────────────────────────────── │
│ Favicon ~1 KB │
│ Thumbnail image 5-10 KB │
│ Profile picture (small) 50-100 KB │
│ Regular photo (compressed) 200 KB - 2 MB │
│ High-res photo 5-10 MB │
│ 1 minute of video (720p) ~50 MB │
│ 1 minute of video (1080p) ~150 MB │
│ 1 minute of video (4K) ~350 MB │
│ 1 minute of audio (MP3) ~1 MB │
│ │
│ DATABASE RECORDS (typical) │
│ ───────────────────────────────────────────────────────────────────── │
│ Record Type Size │
│ ────────────────────────────────────────────── │
│ User profile (basic) 500 bytes - 1 KB │
│ Product listing 1-5 KB │
│ Order record 500 bytes - 1 KB │
│ Log entry 200-500 bytes │
│ Search index entry 200 bytes │
│ Session data 1-2 KB │
│ │
│ RULE OF THUMB: When in doubt, estimate 1 KB per record │
│ │
└─────────────────────────────────────────────────────────────────────────┘
1.5 Throughput Reference Numbers
┌────────────────────────────────────────────────────────────────────────┐
│ THROUGHPUT REFERENCE │
│ │
│ NETWORK BANDWIDTH │
│ ──────────────────────────────────────────────────────────────────── │
│ Connection Type Speed Data/second │
│ ────────────────────────────────────────────────────────────────── │
│ Home broadband 100 Mbps 12.5 MB/s │
│ Home fiber 1 Gbps 125 MB/s │
│ AWS within Availability Zone 10-25 Gbps 1-3 GB/s │
│ Datacenter backbone 100+ Gbps 12+ GB/s │
│ │
│ Formula: Gbps ÷ 8 = GB/s (bits to bytes) │
│ │
│ DATABASE THROUGHPUT │
│ ──────────────────────────────────────────────────────────────────── │
│ Database Operations/second (approximate) │
│ ────────────────────────────────────────────────────────────────── │
│ PostgreSQL (single) 10,000-50,000 queries/sec │
│ MySQL (single) 10,000-50,000 queries/sec │
│ Redis (single) 100,000+ ops/sec │
│ Cassandra (per node) 10,000-50,000 writes/sec │
│ MongoDB (single) 10,000-30,000 ops/sec │
│ DynamoDB (on-demand) Scales horizontally (partition limits apply) │
│ │
│ Note: Actual throughput depends heavily on: │
│ • Query complexity │
│ • Data size │
│ • Hardware specs │
│ • Network latency │
│ │
│ WEB SERVER THROUGHPUT │
│ ──────────────────────────────────────────────────────────────────── │
│ Server Type Requests/second (approximate) │
│ ────────────────────────────────────────────────────────────────── │
│ Node.js (simple API) 5,000-10,000 req/sec │
│ Go (simple API) 30,000-50,000 req/sec │
│ Nginx (static files) 50,000-100,000 req/sec │
│ Node.js (complex logic) 1,000-3,000 req/sec │
│ Python/Django 500-2,000 req/sec │
│ │
│ Conservative estimate: 5,000 req/sec per server for simple APIs │
│ │
└────────────────────────────────────────────────────────────────────────┘
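One practical use of the bits-to-bytes formula is estimating how long a bulk transfer takes. The sketch below assumes decimal units (1 TB = 1,000 GB) and an ideal link with no protocol overhead; both function names are illustrative.

```python
def gbps_to_gb_per_sec(gbps: float) -> float:
    """Convert link speed in gigabits/second to gigabytes/second."""
    return gbps / 8


def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Ideal time to move size_gb over a link, ignoring overhead."""
    return size_gb / gbps_to_gb_per_sec(link_gbps)


for link_gbps in (1, 10, 100):
    seconds = transfer_seconds(1_000, link_gbps)  # 1 TB payload
    print(f"1 TB over {link_gbps} Gbps: {seconds:,.0f} s ({seconds / 3600:.2f} h)")
```

Real transfers run slower than this because of protocol overhead and contention, so treat the result as a lower bound.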
Chapter 2: The Estimation Framework
2.1 The Four Types of Estimates
Every system design requires estimating these four things:
┌────────────────────────────────────────────────────────────────────────┐
│ THE FOUR ESTIMATES │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 1. TRAFFIC ESTIMATES │ │
│ │ │ │
│ │ • Daily/monthly active users (DAU/MAU) │ │
│ │ • Requests per second (RPS) │ │
│ │ • Read vs write ratio │ │
│ │ • Peak vs average traffic │ │
│ │ │ │
│ │ Key question: How many requests will we handle? │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 2. STORAGE ESTIMATES │ │
│ │ │ │
│ │ • Data per user/object │ │
│ │ • Total storage needed (now and future) │ │
│ │ • Growth rate │ │
│ │ • Retention period │ │
│ │ │ │
│ │ Key question: How much data will we store? │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 3. BANDWIDTH ESTIMATES │ │
│ │ │ │
│ │ • Ingress (data coming in) │ │
│ │ • Egress (data going out) │ │
│ │ • Peak bandwidth requirements │ │
│ │ │ │
│ │ Key question: How much data will flow through the system? │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ 4. INFRASTRUCTURE ESTIMATES │ │
│ │ │ │
│ │ • Number of servers │ │
│ │ • Database sizing and replication │ │
│ │ • Cache memory requirements │ │
│ │ │ │
│ │ Key question: What infrastructure do we need? │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
2.2 Traffic Estimation
The Formula
┌─────────────────────────────────────────────────────────────────────────┐
│ TRAFFIC ESTIMATION FORMULAS │
│ │
│ STEP 1: CALCULATE DAILY VOLUME │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Daily Actions = DAU × Actions per User │
│ │
│ Example: │
│ • DAU: 10 million │
│ • Actions per user: 5 posts + 50 reads = 55 actions │
│ • Daily actions: 10M × 55 = 550 million │
│ │
│ │
│ STEP 2: CONVERT TO REQUESTS PER SECOND (RPS) │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Daily Actions │
│ RPS = ──────────────────────── │
│ 86,400 (seconds per day) │
│ │
│ Shortcut: Daily / 100,000 ≈ RPS │
│ │
│ Example: │
│ • 550 million / 100,000 = 5,500 RPS │
│ │
│ │
│ STEP 3: CALCULATE PEAK RPS │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Peak RPS = Average RPS × Peak Factor │
│ │
│ Peak factors: │
│ • Normal apps: 2x average │
│ • Social/viral: 3x average │
│ • Flash sales/events: 5-10x average │
│ │
│ Example: │
│ • Average: 5,500 RPS │
│ • Peak (2x): 11,000 RPS │
│ │
│ │
│ STEP 4: SPLIT BY READ/WRITE │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ If ratio is 10:1 (reads:writes): │
│ • Total: 5,500 RPS │
│ • Reads: 5,500 × 10/11 = 5,000 RPS │
│ • Writes: 5,500 × 1/11 = 500 RPS │
│ │
│ Common ratios: │
│ • Read-heavy (social media): 100:1 to 1000:1 │
│ • Mixed (e-commerce): 10:1 to 100:1 │
│ • Write-heavy (logging): 1:1 to 1:10 │
│ │
└─────────────────────────────────────────────────────────────────────────┘
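The four steps above can be chained into one small function. This is a sketch under the same assumptions as the box (rounded 100,000 seconds per day, 2x peak factor, 10:1 read/write ratio); `TrafficEstimate` and `estimate_traffic` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class TrafficEstimate:
    daily_actions: int
    average_rps: float
    peak_rps: float
    read_rps: float
    write_rps: float


def estimate_traffic(
    dau: int,
    actions_per_user: int,
    peak_factor: float = 2.0,
    reads_per_write: float = 10.0,
    seconds_per_day: int = 100_000,  # rounded; use 86_400 for the exact figure
) -> TrafficEstimate:
    daily_actions = dau * actions_per_user            # Step 1: daily volume
    average_rps = daily_actions / seconds_per_day     # Step 2: average RPS
    peak_rps = average_rps * peak_factor              # Step 3: peak RPS
    write_rps = average_rps / (reads_per_write + 1)   # Step 4: split by ratio
    read_rps = average_rps - write_rps
    return TrafficEstimate(daily_actions, average_rps, peak_rps, read_rps, write_rps)


estimate = estimate_traffic(dau=10_000_000, actions_per_user=55)
print(estimate)
```

With the example inputs this reproduces the figures in the box: 550 million daily actions, 5,500 average RPS, 11,000 peak RPS, split into 5,000 reads and 500 writes per second.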
Worked Example: Twitter-like Service
┌─────────────────────────────────────────────────────────────────────────┐
│ EXAMPLE: TWITTER TRAFFIC ESTIMATION │
│ │
│ GIVEN: │
│ • 500 million monthly active users (MAU) │
│ • 200 million daily active users (DAU) │
│ • Each user views 20 tweets per day │
│ • Each user posts 0.5 tweets per day │
│ │
│ CALCULATE: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Reads per day: │
│ 200M users × 20 views = 4,000,000,000 reads/day │
│ = 4 billion reads/day │
│ │
│ Writes per day: │
│ 200M users × 0.5 posts = 100,000,000 writes/day │
│ = 100 million writes/day │
│ │
│ Read/Write ratio: │
│ 4B / 100M = 40:1 │
│ │
│ Reads per second: │
│ 4B / 100,000 = 40,000 read RPS │
│ (More precise: 4B / 86,400 = 46,296 RPS) │
│ │
│ Writes per second: │
│ 100M / 100,000 = 1,000 write RPS │
│ (More precise: 100M / 86,400 = 1,157 RPS) │
│ │
│ Total: ~47,000 RPS average │
│ Peak (2x): ~94,000 RPS │
│ │
│ INTERVIEW PHRASE: │
│ "With 200M DAU viewing 20 tweets each, that's 4 billion reads │
│ per day, or about 46,000 reads per second. Writes are much │
│ lower at about 1,000 per second. Peak would be roughly double, │
│ so we need to design for ~100,000 RPS." │
│ │
└─────────────────────────────────────────────────────────────────────────┘
2.3 Storage Estimation
The Formula
┌─────────────────────────────────────────────────────────────────────────┐
│ STORAGE ESTIMATION FORMULAS │
│ │
│ BASIC FORMULA: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Storage = New Objects per Day × Size per Object × Retention (days) │
│ │
│ │
│ DAILY GROWTH: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Daily Growth = New Objects per Day × Size per Object │
│ │
│ │
│ YEARLY STORAGE: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Yearly Storage = Daily Growth × 365 │
│ │
│ │
│ WITH REPLICATION: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Total Storage = Raw Storage × Replication Factor │
│ │
│ Common replication factors: │
│ • Databases: 3x (1 primary + 2 replicas) │
│ • Object storage (S3): Already replicated (built-in) │
│ • Kafka: 3x (3 replicas per partition) │
│ │
│ │
│ DON'T FORGET: │
│ ───────────────────────────────────────────────────────────────────── │
│ • Indexes: Add 20-30% for database indexes │
│ • Logs: Often as large as main data │
│ • Backups: Often 2-3x for backup retention │
│ │
└─────────────────────────────────────────────────────────────────────────┘
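To see how replication and index overhead compound, here is a minimal sketch of the storage formula. The 25% index overhead is simply the midpoint of the 20-30% range above, and the input numbers (1 million records per day at 1 KB each) are hypothetical.

```python
def total_storage_bytes(
    new_objects_per_day: int,
    bytes_per_object: int,
    retention_days: int,
    replication_factor: int = 3,    # 1 primary + 2 replicas
    index_overhead: float = 0.25,   # assumed midpoint of the 20-30% guideline
) -> float:
    """Raw data, plus index overhead, multiplied by the replication factor."""
    raw = new_objects_per_day * bytes_per_object * retention_days
    return raw * (1 + index_overhead) * replication_factor


# Hypothetical workload: 1M new 1 KB records per day, kept for one year.
raw_gb = 1_000_000 * 1_000 * 365 / 1e9
total_gb = total_storage_bytes(1_000_000, 1_000, 365) / 1e9
print(f"raw: {raw_gb:,.0f} GB, provisioned: {total_gb:,.0f} GB")
```

The provisioned figure comes out at nearly four times the raw data, before logs and backups are counted.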
Worked Example: Twitter-like Service
┌─────────────────────────────────────────────────────────────────────────┐
│ EXAMPLE: TWITTER STORAGE ESTIMATION │
│ │
│ GIVEN: │
│ • 100 million new tweets per day │
│ • Average tweet: 300 bytes (text + metadata) │
│ • 10% of tweets have images (average 200 KB) │
│ • 1% of tweets have videos (average 5 MB) │
│ • Retention: Keep everything (5 years for estimation) │
│ │
│ CALCULATE: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ TEXT STORAGE (per day): │
│ 100M tweets × 300 bytes = 30 GB/day │
│ │
│ IMAGE STORAGE (per day): │
│ 100M × 10% = 10M images │
│ 10M × 200 KB = 2 TB/day │
│ │
│ VIDEO STORAGE (per day): │
│ 100M × 1% = 1M videos │
│ 1M × 5 MB = 5 TB/day │
│ │
│ TOTAL DAILY: │
│ 30 GB + 2 TB + 5 TB ≈ 7 TB/day │
│ (Media dominates storage!) │
│ │
│ YEARLY: │
│ 7 TB × 365 = 2.5 PB/year │
│ │
│ 5-YEAR STORAGE: │
│ 2.5 PB × 5 = 12.5 PB │
│ │
│ WITH REPLICATION (3x for database, S3 handles media): │
│ Text DB: 30 GB/day × 365 × 5 × 3 ≈ 164 TB │
│ Media (S3): 7 TB × 365 × 5 = 12.8 PB (S3 replicates internally) │
│ │
│ SUMMARY: │
│ "We'll need about 7 TB of new storage per day, or 2.5 PB per year. │
│ Over 5 years, that's roughly 13 PB. Media is 99%+ of storage." │
│ │
└─────────────────────────────────────────────────────────────────────────┘
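The worked example is easy to re-run with different media percentages. The sketch below uses the same inputs as the box and decimal units; the variable names are chosen for this illustration.

```python
KB, MB, GB, TB, PB = 10**3, 10**6, 10**9, 10**12, 10**15

TWEETS_PER_DAY = 100_000_000

text_per_day = TWEETS_PER_DAY * 300                  # 300 bytes per tweet
image_per_day = TWEETS_PER_DAY // 10 * 200 * KB      # 10% carry a 200 KB image
video_per_day = TWEETS_PER_DAY // 100 * 5 * MB       # 1% carry a 5 MB video

total_per_day = text_per_day + image_per_day + video_per_day
media_share = (image_per_day + video_per_day) / total_per_day
five_year_pb = total_per_day * 365 * 5 / PB

print(f"text:  {text_per_day / GB:,.0f} GB/day")
print(f"image: {image_per_day / TB:,.0f} TB/day")
print(f"video: {video_per_day / TB:,.0f} TB/day")
print(f"media share: {media_share:.1%}, 5-year total: {five_year_pb:.1f} PB")
```

Changing the image or video percentage moves the total almost linearly, while changing the text size barely registers.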
2.4 Bandwidth Estimation
The Formula
┌─────────────────────────────────────────────────────────────────────────┐
│ BANDWIDTH ESTIMATION FORMULAS │
│ │
│ INGRESS (data coming IN): │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Ingress = Write RPS × Average Request Size │
│ │
│ Typically smaller (just the data being uploaded) │
│ │
│ │
│ EGRESS (data going OUT): │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Egress = Read RPS × Average Response Size │
│ │
│ Usually larger (responses include more data) │
│ │
│ │
│ CONVERTING UNITS: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ MB/s × 8 = Mbps (megabits per second) │
│ GB/s × 8 = Gbps (gigabits per second) │
│ │
│ Example: 100 MB/s = 800 Mbps = 0.8 Gbps │
│ │
└─────────────────────────────────────────────────────────────────────────┘
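Both bandwidth formulas are the same multiplication followed by a bytes-to-bits conversion. A minimal sketch, with a hypothetical workload of 10 KB payloads; `bandwidth_mbps` is an illustrative name.

```python
def bandwidth_mbps(rps: float, payload_bytes: float) -> float:
    """Bandwidth in megabits/second for a request rate and payload size."""
    bytes_per_sec = rps * payload_bytes
    return bytes_per_sec * 8 / 1_000_000  # bytes -> bits, then bits -> megabits


# Hypothetical workload: 1,000 writes/s and 10,000 reads/s, 10 KB each.
ingress = bandwidth_mbps(rps=1_000, payload_bytes=10_000)
egress = bandwidth_mbps(rps=10_000, payload_bytes=10_000)
print(f"ingress: {ingress:,.0f} Mbps, egress: {egress:,.0f} Mbps")
```

The egress case is 100 MB/s, which matches the 800 Mbps conversion example in the box.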
Worked Example: Twitter-like Service
┌─────────────────────────────────────────────────────────────────────────┐
│ EXAMPLE: TWITTER BANDWIDTH ESTIMATION │
│ │
│ GIVEN (from traffic estimation): │
│ • Read RPS: 46,000 │
│ • Write RPS: 1,150 │
│ • Tweet read response: 2 KB (tweet + user info) │
│ • Tweet write request: 300 bytes │
│ • 20% of read responses include image (200 KB) │
│ │
│ CALCULATE: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ INGRESS (writes coming in): │
│ 1,150 writes/sec × 300 bytes = 345 KB/sec │
│ With 10% images: 1,150 × 0.1 × 200 KB = 23 MB/sec │
│ Total ingress: ~24 MB/sec = ~200 Mbps │
│ │
│ EGRESS (reads going out): │
│ Text only: 46,000 × 2 KB = 92 MB/sec │
│ With images: 46,000 × 0.2 × 200 KB = 1,840 MB/sec │
│ Total egress: ~1.9 GB/sec = ~15 Gbps │
│ │
│ PEAK (2x): │
│ Ingress peak: ~400 Mbps │
│ Egress peak: ~30 Gbps │
│ │
│ NOTE: │
│ This is why CDN is essential! CDN handles most of the 30 Gbps │
│ egress for images. Origin servers see much less. │
│ │
│ With CDN (90% cache hit rate for images): │
│ Origin egress: 92 MB/sec + (1,840 MB/sec × 0.1) = ~276 MB/sec │
│ = ~2.2 Gbps (much more manageable) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
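Origin load is very sensitive to the CDN hit rate, so it is worth trying a few values. This sketch assumes, as the example does, that only image traffic is cached and text responses always reach the origin; the function name is hypothetical.

```python
def origin_egress_mb_per_sec(
    text_mb_per_sec: float,
    media_mb_per_sec: float,
    cdn_hit_rate: float,
) -> float:
    """Egress served by origin: all text plus the media the CDN misses."""
    return text_mb_per_sec + media_mb_per_sec * (1 - cdn_hit_rate)


for hit_rate in (0.0, 0.5, 0.9, 0.99):
    mb_per_sec = origin_egress_mb_per_sec(92, 1_840, hit_rate)
    gbps = mb_per_sec * 8 / 1_000
    print(f"hit rate {hit_rate:.0%}: {mb_per_sec:,.0f} MB/s = {gbps:.1f} Gbps")
```

Going from a 90% to a 99% hit rate cuts the media portion of origin traffic by another factor of ten, after which the uncached text responses dominate.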
2.5 Infrastructure Estimation
Server Sizing
┌─────────────────────────────────────────────────────────────────────────┐
│ INFRASTRUCTURE ESTIMATION │
│ │
│ WEB/API SERVERS │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Peak RPS │
│ Servers = ──────────────────────────────── │
│ RPS per Server × Utilization Target │
│ │
│ Guidelines: │
│ • Simple API (CRUD): 5,000-10,000 RPS per server │
│ • Complex logic: 1,000-3,000 RPS per server │
│ • CPU-intensive: 100-500 RPS per server │
│ • Target utilization: 70% (leave headroom) │
│ │
│ Example: │
│ Peak RPS: 100,000 │
│ RPS per server: 5,000 │
│ Utilization: 70% │
│ Servers = 100,000 / (5,000 × 0.7) = 29 servers │
│ Round up to 30-35 for availability │
│ │
│ │
│ DATABASE SIZING │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ For reads (read replicas): │
│ Read RPS │
│ Read Replicas = ────────────────────── │
│ Queries per DB (30,000) │
│ │
│ For writes: │
│ Usually single primary (shard if > 10,000 writes/sec) │
│ │
│ Example: │
│ Read RPS: 46,000 │
│ Replicas: 46,000 / 30,000 ≈ 1.5 → round up to 2 read replicas │
│ Total: 1 primary + 2 replicas = 3 DB servers │
│ │
│ │
│ CACHE SIZING │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ Memory = Hot Data Size │
│ │
│ Rules of thumb: │
│ • Cache 20% of data (80/20 rule) │
│ • Or cache last N hours/days of activity │
│ • Aim for 90%+ cache hit rate │
│ │
│ Example: │
│ Daily new data: 30 GB (text only) │
│ Cache last 7 days: 210 GB │
│ Use 256 GB Redis cluster │
│ │
└─────────────────────────────────────────────────────────────────────────┘
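The sizing formulas always round up, since a fraction of a server is not an option. Below is a minimal sketch using the guideline values from the box as defaults; the function names are made up for this example.

```python
import math


def api_servers(peak_rps: float, rps_per_server: int = 5_000,
                utilization: float = 0.7) -> int:
    """Servers needed to serve peak_rps while staying at the target utilization."""
    return math.ceil(peak_rps / (rps_per_server * utilization))


def read_replicas(read_rps: float, queries_per_replica: int = 30_000) -> int:
    """Read replicas needed, assuming each handles queries_per_replica."""
    return math.ceil(read_rps / queries_per_replica)


def cache_size_gb(daily_new_data_gb: float, days_cached: int) -> float:
    """Cache memory needed to hold the last N days of new data."""
    return daily_new_data_gb * days_cached


print("API servers:", api_servers(100_000))
print("Read replicas:", read_replicas(46_000))
print("Cache size (GB):", cache_size_gb(30, 7))
```

These results are minimums for capacity only. Redundancy across zones and room for failover come on top, which is why the example rounds 29 up to 30-35.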
Chapter 3: Quick Reference Formulas
3.1 The Cheat Sheet
┌────────────────────────────────────────────────────────────────────────┐
│ ESTIMATION CHEAT SHEET │
│ │
│ TRAFFIC │
│ ──────────────────────────────────────────────────────────────────── │
│ RPS = Daily requests / 100,000 │
│ Peak RPS = Average RPS × 2 (or 3 for viral apps) │
│ │
│ STORAGE │
│ ──────────────────────────────────────────────────────────────────── │
│ Daily storage = New objects × Size per object │
│ Yearly storage = Daily × 365 │
│ With replication = Raw × 3 │
│ │
│ BANDWIDTH │
│ ──────────────────────────────────────────────────────────────────── │
│ Egress = Read RPS × Response size │
│ Ingress = Write RPS × Request size │
│ MB/s × 8 = Mbps │
│ │
│ SERVERS │
│ ──────────────────────────────────────────────────────────────────── │
│ API servers = Peak RPS / 5,000 / 0.7 │
│ DB replicas = Read RPS / 30,000 │
│ Cache = Hot data size (20% of total or N days) │
│ │
│ COMMON VALUES │
│ ──────────────────────────────────────────────────────────────────── │
│ 1 day = 100,000 seconds (approximately) │
│ 1 year = 30 million seconds (approximately) │
│ Typical API response: 2-5 KB │
│ Typical DB record: 500 bytes - 1 KB │
│ Image: 200 KB - 2 MB │
│ Server capacity: 5,000 simple requests/sec │
│ DB capacity: 30,000 queries/sec │
│ Redis capacity: 100,000 ops/sec │
│ │
└────────────────────────────────────────────────────────────────────────┘
3.2 Common Estimation Templates
┌─────────────────────────────────────────────────────────────────────────┐
│ SOCIAL MEDIA TEMPLATE │
│ │
│ INPUTS: │
│ • MAU: _______ │
│ • DAU: _______ (usually 20-40% of MAU) │
│ • Posts per user per day: _______ │
│ • Views per user per day: _______ │
│ • % posts with media: _______ │
│ │
│ CALCULATIONS: │
│ Write RPS = DAU × Posts / 100,000 │
│ Read RPS = DAU × Views / 100,000 │
│ Storage/day = DAU × Posts × (text size + media% × media size) │
│ API servers = (Read + Write RPS) × 2 / 5,000 / 0.7 │
│ │
└─────────────────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────────────────┐
│ E-COMMERCE TEMPLATE │
│ │
│ INPUTS: │
│ • Daily visitors: _______ │
│ • Pages per visit: _______ │
│ • Conversion rate: _______ (1-3%) │
│ • Products in catalog: _______ │
│ │
│ CALCULATIONS: │
│ Page views/day = Visitors × Pages │
│ Orders/day = Visitors × Conversion rate │
│ Read RPS = Page views / 100,000 │
│ Product catalog size = Products × 5 KB │
│ │
└────────────────────────────────────────────────────────────────────────┘
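Filling in the e-commerce template is mechanical once the inputs are chosen. The sketch below uses hypothetical inputs (5 million daily visitors, 10 pages per visit, 2% conversion, 1 million products); the function name and dictionary keys are illustrative.

```python
def ecommerce_estimate(
    daily_visitors: int,
    pages_per_visit: int,
    conversion_rate: float,
    catalog_products: int,
    bytes_per_product: int = 5_000,  # the template's 5 KB per product
) -> dict:
    page_views = daily_visitors * pages_per_visit
    return {
        "page_views_per_day": page_views,
        "orders_per_day": daily_visitors * conversion_rate,
        "read_rps": page_views / 100_000,  # rounded seconds per day
        "catalog_bytes": catalog_products * bytes_per_product,
    }


result = ecommerce_estimate(
    daily_visitors=5_000_000,
    pages_per_visit=10,
    conversion_rate=0.02,
    catalog_products=1_000_000,
)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

With these inputs the whole catalog is about 5 GB, small enough to fit in memory on a single cache node, while orders arrive at roughly one per second on average.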
┌─────────────────────────────────────────────────────────────────────────┐
│ URL SHORTENER TEMPLATE │
│ │
│ INPUTS: │
│ • New URLs per month: _______ │
│ • Read:Write ratio: _______ (usually 100:1) │
│ • URL record size: _______ (~500 bytes) │
│ │
│ CALCULATIONS: │
│ Write RPS = URLs/month / 2.5M (seconds/month) │
│ Read RPS = Write RPS × ratio │
│ Storage = URLs/month × 12 × retention years × 500 bytes │
│ Short code length = ⌈log₆₂(total URLs needed)⌉ (round up) │
│ 6 chars ≈ 56.8B combinations, 7 chars ≈ 3.5T │
│ │
└─────────────────────────────────────────────────────────────────────────┘
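The short-code length can be found without floating-point logarithms by multiplying up until the keyspace is large enough. This sketch assumes a 62-character alphabet (a-z, A-Z, 0-9) and a hypothetical volume of 100 million new URLs per month kept for 10 years.

```python
ALPHABET_SIZE = 62  # a-z, A-Z, 0-9


def short_code_length(total_urls: int) -> int:
    """Smallest code length whose keyspace covers total_urls."""
    length, capacity = 1, ALPHABET_SIZE
    while capacity < total_urls:
        length += 1
        capacity *= ALPHABET_SIZE
    return length


# Hypothetical volume: 100M new URLs/month, retained for 10 years.
total_urls = 100_000_000 * 12 * 10
length = short_code_length(total_urls)
print(f"{total_urls:,} URLs need {length} characters "
      f"({ALPHABET_SIZE ** length:,} combinations)")
```

Integer arithmetic avoids the rounding surprises that a floating-point logarithm can produce right at a power of 62.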
Chapter 4: Interview Tips
4.1 How to Present Estimates
┌─────────────────────────────────────────────────────────────────────────┐
│ PRESENTING ESTIMATES │
│ │
│ DO: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ ✓ State assumptions clearly FIRST │
│ "Let me assume we have 100 million DAU..." │
│ │
│ ✓ Show your math step by step │
│ "100 million users × 10 posts = 1 billion posts per day" │
│ │
│ ✓ Round to nice numbers │
│ "That's about 12,000 per second, let's call it 10K RPS" │
│ │
│ ✓ Convert to useful units │
│ "10 KB per request × 10K RPS = 100 MB/sec" │
│ │
│ ✓ Sanity check your results │
│ "10 PB seems large but Netflix stores 100+ PB, so this is │
│ reasonable for a video platform" │
│ │
│ ✓ Explain implications │
│ "At 100K RPS, we'll need about 30 API servers and should │
│ definitely use a caching layer" │
│ │
│ │
│ DON'T: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ ✗ Spend more than 5 minutes on estimation │
│ Get to approximate numbers and move on │
│ │
│ ✗ Use overly precise numbers │
│ "12,453 requests per second" → Just say "~12,000" or "~10K" │
│ │
│ ✗ Forget peak traffic │
│ Always mention peak is 2-3x average │
│ │
│ ✗ Skip the "so what" │
│ Numbers should lead to design decisions │
│ │
└─────────────────────────────────────────────────────────────────────────┘
4.2 Common Pitfalls
┌─────────────────────────────────────────────────────────────────────────┐
│ ESTIMATION PITFALLS │
│ │
│ PITFALL 1: FORGETTING PEAK TRAFFIC │
│ ───────────────────────────────────────────────────────────────────── │
│ ✗ "We get 10,000 RPS" → designs for 10K │
│ ✓ "Average is 10K RPS, peak is 20-30K" → designs for 30K │
│ │
│ PITFALL 2: IGNORING GROWTH │
│ ───────────────────────────────────────────────────────────────────── │
│ ✗ "We need 100 GB storage" │
│ ✓ "We need 100 GB now, 1 TB in a year at 10x growth" │
│ │
│ PITFALL 3: FORGETTING REPLICATION │
│ ───────────────────────────────────────────────────────────────────── │
│ ✗ "Database needs 500 GB" │
│ ✓ "500 GB × 3 replicas = 1.5 TB total" │
│ │
│ PITFALL 4: UNDERESTIMATING MEDIA │
│ ───────────────────────────────────────────────────────────────────── │
│ ✗ "Posts are 1 KB, so storage is small" │
│ ✓ "Text is 1 KB, but 20% have 500 KB images = 100x more storage" │
│ │
│ PITFALL 5: OVERLOOKING BANDWIDTH │
│ ───────────────────────────────────────────────────────────────────── │
│ ✗ Calculate storage but forget egress costs │
│ ✓ "30 Gbps egress = need CDN to reduce costs" │
│ │
│ PITFALL 6: WRONG UNITS │
│ ───────────────────────────────────────────────────────────────────── │
│ ✗ Mixing MB and Mb (megabytes vs megabits) │
│ ✓ Always clarify: "100 megabytes per second" │
│ │
└─────────────────────────────────────────────────────────────────────────┘
4.3 Quick Mental Math Tricks
┌─────────────────────────────────────────────────────────────────────────┐
│ MENTAL MATH TRICKS │
│ │
│ MULTIPLYING LARGE NUMBERS: │
│ ───────────────────────────────────────────────────────────────────── │
│ 100 million × 10 = 1 billion │
│ 100 million × 100 = 10 billion │
│ 1 billion × 1000 = 1 trillion │
│ │
│ Trick: Count zeros, add exponents │
│ 10^8 × 10^2 = 10^10 = 10 billion │
│ │
│ DAILY TO PER-SECOND: │
│ ───────────────────────────────────────────────────────────────────── │
│ Divide by 100,000 (or 10^5) │
│ 1 billion/day = 10,000/second │
│ 100 million/day = 1,000/second │
│ 10 million/day = 100/second │
│ │
│ PERCENTAGES: │
│ ───────────────────────────────────────────────────────────────────── │
│ 1% = ÷ 100 │
│ 10% = ÷ 10 │
│ 20% = ÷ 5 │
│ 25% = ÷ 4 │
│ 50% = ÷ 2 │
│ │
│ STORAGE CONVERSIONS: │
│ ───────────────────────────────────────────────────────────────────── │
│ 1000 KB = 1 MB │
│ 1000 MB = 1 GB │
│ 1000 GB = 1 TB │
│ 1000 TB = 1 PB │
│ │
│ 1 million KB = 1 GB │
│ 1 billion KB = 1 TB │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Chapter 5: Practice Problems
5.1 Practice: Instagram
┌─────────────────────────────────────────────────────────────────────────┐
│ PRACTICE: INSTAGRAM │
│ │
│ PROBLEM: │
│ Estimate the infrastructure needs for Instagram-like app │
│ │
│ GIVEN: │
│ • 1 billion MAU │
│ • 500 million DAU │
│ • Each user views 50 photos per day │
│ • Each user posts 1 photo per week │
│ • Average photo: 2 MB │
│ • Keep photos forever │
│ │
│ CALCULATE: │
│ 1. Photo views per second │
│ 2. Photo uploads per second │
│ 3. Daily storage growth │
│ 4. 5-year storage │
│ 5. Egress bandwidth │
│ 6. Number of API servers needed │
│ │
│ TRY IT YOURSELF FIRST! │
│ │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ SOLUTION: │
│ │
│ 1. Photo views/second: │
│ 500M DAU × 50 views = 25 billion views/day │
│ 25B / 100,000 = 250,000 read RPS │
│ │
│ 2. Photo uploads/second: │
│ 500M DAU / 7 (weekly) = 71 million uploads/day │
│ 71M / 100,000 = 710 write RPS │
│ │
│ 3. Daily storage: │
│ 71M photos × 2 MB = 142 TB/day │
│ │
│ 4. 5-year storage: │
│ 142 TB × 365 × 5 = 259 PB │
│ (This is just raw photos - thumbnails add more) │
│ │
│ 5. Egress bandwidth: │
│ 250,000 RPS × 2 MB = 500 GB/sec = 4 Tbps │
│ (This is why CDN is ESSENTIAL) │
│ With 95% CDN hit rate: 25 GB/sec origin traffic │
│ │
│ 6. API servers: │
│ Peak: 250,000 × 2 = 500,000 RPS │
│ Servers: 500,000 / 5,000 / 0.7 = 143 servers │
│ Round to: 150 API servers (more for redundancy) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
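The solution can be re-derived in a few lines, which is a useful way to check hand arithmetic. This sketch keeps the same assumptions as the solution (rounded 100,000 seconds per day, 2x peak, 5,000 RPS per server at 70% utilization) but does not round intermediate values, so some results differ slightly from the box.

```python
import math

DAU = 500_000_000
SECONDS_PER_DAY = 100_000  # rounded, as in the solution
PHOTO_MB = 2

read_rps = DAU * 50 / SECONDS_PER_DAY                      # 1. photo views/second
uploads_per_day = DAU / 7                                  # one photo per user per week
write_rps = uploads_per_day / SECONDS_PER_DAY              # 2. uploads/second
daily_storage_tb = uploads_per_day * PHOTO_MB / 1_000_000  # 3. MB -> TB
five_year_pb = daily_storage_tb * 365 * 5 / 1_000          # 4. TB -> PB
egress_gb_per_sec = read_rps * PHOTO_MB / 1_000            # 5. MB/s -> GB/s
egress_tbps = egress_gb_per_sec * 8 / 1_000                #    GB/s -> Tbps
origin_gb_per_sec = egress_gb_per_sec * (1 - 0.95)         #    after 95% CDN hit rate
servers = math.ceil(read_rps * 2 / (5_000 * 0.7))          # 6. API servers at peak

print(f"reads: {read_rps:,.0f} RPS, writes: {write_rps:,.0f} RPS")
print(f"storage: {daily_storage_tb:,.0f} TB/day, {five_year_pb:,.0f} PB over 5 years")
print(f"egress: {egress_tbps:.0f} Tbps total, {origin_gb_per_sec:.0f} GB/s at origin")
print(f"API servers: {servers}")
```

Without intermediate rounding, the 5-year figure lands near 261 PB rather than 259 PB. A difference of that size is exactly the kind that does not matter in an estimate.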
5.2 Practice: YouTube
┌─────────────────────────────────────────────────────────────────────────┐
│ PRACTICE: YOUTUBE │
│ │
│ PROBLEM: │
│ Estimate infrastructure for YouTube-like video platform │
│ │
│ GIVEN: │
│ • 2 billion MAU │
│ • 1 billion DAU │
│ • Average watch time: 1 hour/day │
│ • 500 hours of video uploaded per minute │
│ • Average video: 5 minutes, stored at multiple resolutions │
│ • Resolutions: 360p (200MB/hr), 720p (500MB/hr), 1080p (1.5GB/hr) │
│ │
│ CALCULATE: │
│ 1. Video streaming bandwidth │
│ 2. Upload volume per day │
│ 3. Storage per year │
│ 4. CDN requirements │
│ │
│ HINTS: │
│ • Focus on streaming bandwidth (it's massive) │
│ • Consider multiple resolutions │
│ • CDN is absolutely essential │
│ │
│ TRY IT YOURSELF! │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Summary
┌────────────────────────────────────────────────────────────────────────┐
│ PART 3 KEY TAKEAWAYS │
│ │
│ MEMORIZE THESE NUMBERS: │
│ • 1 day = 86,400 seconds ≈ 100,000 for estimation │
│ • 1 year ≈ 30 million seconds │
│ • L1 cache: 0.5ns, Memory: 100ns, SSD: 150μs, HDD: 10ms │
│ • Datacenter RTT: 0.5ms, Cross-continent: 150ms │
│ │
│ THE FOUR ESTIMATES: │
│ 1. Traffic (RPS = daily / 100,000, peak = 2-3x) │
│ 2. Storage (objects × size × retention × replication) │
│ 3. Bandwidth (RPS × payload size) │
│ 4. Infrastructure (RPS / capacity per server) │
│ │
│ KEY FORMULAS: │
│ • RPS = Daily actions / 100,000 │
│ • Peak = Average × 2 (or 3) │
│ • Servers = Peak RPS / 5,000 / 0.7 │
│ • DB replicas = Read RPS / 30,000 │
│ │
│ INTERVIEW TIPS: │
│ • State assumptions clearly │
│ • Show your math │
│ • Round to nice numbers │
│ • Sanity check results │
│ • Connect numbers to design decisions │
│ • Don't spend more than 5 minutes │
│ │
└────────────────────────────────────────────────────────────────────────┘
📚 Further Reading
References
- Jeff Dean's Latency Numbers: https://brenocon.com/dean_perf.html
- Latency Numbers Interactive: https://colin-scott.github.io/personal_website/research/interactive_latency.html
- System Design Primer - Back of Envelope: https://github.com/donnemartin/system-design-primer#back-of-the-envelope-calculations
Books
- Designing Data-Intensive Applications - Martin Kleppmann (Chapter 1)
- System Design Interview - Alex Xu (Chapter 2)
Practice
- Pramp: Free mock interviews
- Interviewing.io: Practice with real interviewers
- LeetCode Discuss: System design questions
End of Week 0 — Part 3
Week 0 Complete!
You now have the foundation for the 10-week journey:
- Part 1: System Design Framework (HLD/LLD, interview structure)
- Part 2: Infrastructure Building Blocks (all the components)
- Part 3: Back-of-the-Envelope Estimation (sizing systems)
Ready for Week 1: Foundations of Scale!