Week 0 — Part 4: Networking Fundamentals for System Design
System Design Mastery Series
Preface
Every system design interview involves networks. When you say "the client sends a request to the server," you're glossing over dozens of networking concepts. Interviewers notice.
What you say: "Client connects to server"
What happens:
1. DNS resolution (which server?)
2. TCP handshake (establish connection)
3. TLS handshake (secure it)
4. HTTP request (format the data)
5. Load balancer routing (which instance?)
6. Response travels back (same path, reverse)
Each step can fail. Each step has latency.
This document covers everything about networking you need for system design interviews — from the physical layer to application protocols, with interview questions and real-world examples.
Part I: The Network Stack
Chapter 1: OSI Model — The Mental Framework
The OSI model gives us a shared vocabulary. You don't need to memorize all 7 layers, but you must understand how they map to real technologies.
┌────────────────────────────────────────────────────────────────────────┐
│ OSI MODEL │
│ │
│ Layer 7: APPLICATION │
│ ──────────────────────────────────────────────────────────────────── │
│ What you see: HTTP, HTTPS, WebSocket, gRPC, DNS, SMTP │
│ Data unit: Message/Data │
│ System design relevance: API design, protocols, serialization │
│ │
│ Layer 6: PRESENTATION │
│ ──────────────────────────────────────────────────────────────────── │
│ What it does: Encryption, compression, encoding │
│ Examples: TLS/SSL, JPEG, JSON serialization │
│ System design relevance: HTTPS, data formats │
│ │
│ Layer 5: SESSION │
│ ──────────────────────────────────────────────────────────────────── │
│ What it does: Manages connections, sessions │
│ Examples: NetBIOS, RPC, session tokens │
│ System design relevance: Connection pooling, session management │
│ │
│ Layer 4: TRANSPORT │
│ ──────────────────────────────────────────────────────────────────── │
│ What you see: TCP, UDP, QUIC │
│ Data unit: Segment (TCP) / Datagram (UDP) │
│ System design relevance: Reliability, ordering, flow control │
│ │
│ Layer 3: NETWORK │
│ ──────────────────────────────────────────────────────────────────── │
│ What you see: IP (IPv4, IPv6), ICMP, routing │
│ Data unit: Packet │
│ System design relevance: IP addressing, subnets, VPCs │
│ │
│ Layer 2: DATA LINK │
│ ──────────────────────────────────────────────────────────────────── │
│ What you see: Ethernet, MAC addresses, switches │
│ Data unit: Frame │
│ System design relevance: Usually abstracted in cloud │
│ │
│ Layer 1: PHYSICAL │
│ ──────────────────────────────────────────────────────────────────── │
│ What it is: Cables, fiber, wireless signals │
│ System design relevance: Datacenter placement, latency │
│ │
└────────────────────────────────────────────────────────────────────────┘
The Practical Model (TCP/IP)
In practice, we use the simpler TCP/IP model:
┌────────────────────────────────────────────────────────────────────────┐
│ TCP/IP MODEL (What You Actually Use) │
│ │
│ APPLICATION HTTP, HTTPS, DNS, WebSocket, gRPC │
│ ──────────────────────────────────────────────────────────────────── │
│ TRANSPORT TCP, UDP, QUIC │
│ ──────────────────────────────────────────────────────────────────── │
│ INTERNET IP, ICMP, ARP │
│ ──────────────────────────────────────────────────────────────────── │
│ NETWORK ACCESS Ethernet, Wi-Fi, Fiber │
│ │
└────────────────────────────────────────────────────────────────────────┘
Interview Tip
When asked: "At what layer does a load balancer operate?"
Good answer: "It depends on the type. A Layer 4 load balancer works at the
transport layer — it sees IP addresses and ports but not HTTP headers. It's
faster but can't do content-based routing. A Layer 7 load balancer works at
the application layer — it can inspect HTTP headers, cookies, and URLs for
intelligent routing, but has more overhead."
Chapter 2: IP Addressing and Subnets
2.1 IPv4 Addresses
IPv4 ADDRESS STRUCTURE
192 . 168 . 1 . 100
└─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘
8 bits 8 bits 8 bits 8 bits
Total: 32 bits = ~4.3 billion addresses
Binary form:
192.168.1.100 = 11000000.10101000.00000001.01100100
2.2 Private IP Ranges
These are reserved for internal networks (not routable on internet):
┌────────────────────────────────────────────────────────────────────────┐
│ PRIVATE IP RANGES │
│ │
│ Class A: 10.0.0.0 – 10.255.255.255 (16 million addresses) │
│ Used by: Large enterprises, cloud VPCs │
│ │
│ Class B: 172.16.0.0 – 172.31.255.255 (1 million addresses) │
│ Used by: Medium networks │
│ │
│ Class C: 192.168.0.0 – 192.168.255.255 (65,536 addresses) │
│ Used by: Home networks, small offices │
│ │
│ Loopback: 127.0.0.0 – 127.255.255.255 (localhost) │
│ 127.0.0.1 = "this machine" │
│ │
└────────────────────────────────────────────────────────────────────────┘
2.3 CIDR Notation
CIDR (Classless Inter-Domain Routing) defines network size:
CIDR NOTATION
10.0.0.0/8
└── Number of bits for network prefix
/8 = First 8 bits fixed = 16,777,216 addresses (10.0.0.0 - 10.255.255.255)
/16 = First 16 bits fixed = 65,536 addresses (10.0.0.0 - 10.0.255.255)
/24 = First 24 bits fixed = 256 addresses (10.0.0.0 - 10.0.0.255)
/32 = All 32 bits fixed = 1 address (10.0.0.0 only)
COMMON CIDR BLOCKS:
/8 16,777,216 IPs Large cloud VPC
/16 65,536 IPs Regional network
/20 4,096 IPs Subnet for services
/24 256 IPs Small subnet
/28 16 IPs Tiny subnet (load balancers)
/32 1 IP Single host
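These calculations are easy to verify with Python's standard `ipaddress` module — a quick sketch:

```python
import ipaddress

# CIDR arithmetic with the standard library
net = ipaddress.ip_network("10.0.0.0/24")
print(net.num_addresses)                           # 256
print(net.network_address, net.broadcast_address)  # 10.0.0.0 10.0.0.255

# Membership test: is a given host inside this block?
print(ipaddress.ip_address("10.0.0.42") in net)    # True
print(ipaddress.ip_address("10.0.1.1") in net)     # False
```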
2.4 Subnets in System Design
┌───────────────────────────────────────────────────────────────────────┐
│ VPC SUBNET DESIGN EXAMPLE │
│ │
│ VPC: 10.0.0.0/16 (65,536 addresses) │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ AVAILABILITY ZONE A │ │
│ │ │ │
│ │ Public Subnet: 10.0.1.0/24 (256 IPs) │ │
│ │ └── Load balancers, NAT gateways, bastion hosts │ │
│ │ │ │
│ │ Private Subnet: 10.0.10.0/24 (256 IPs) │ │
│ │ └── Application servers │ │
│ │ │ │
│ │ Database Subnet: 10.0.20.0/24 (256 IPs) │ │
│ │ └── RDS, ElastiCache (no internet access) │ │
│ │ │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────────┐ │
│ │ AVAILABILITY ZONE B │ │
│ │ │ │
│ │ Public Subnet: 10.0.2.0/24 │ │
│ │ Private Subnet: 10.0.11.0/24 │ │
│ │ Database Subnet: 10.0.21.0/24 │ │
│ │ │ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
└───────────────────────────────────────────────────────────────────────┘
WHY THIS STRUCTURE?
Public subnets: Have route to Internet Gateway (IGW)
Private subnets: Route to NAT Gateway for outbound only
Database subnets: No internet access at all (maximum security)
2.5 IPv6
IPv6 ADDRESS
2001:0db8:85a3:0000:0000:8a2e:0370:7334
└──────────────────────────────────────┘
128 bits (vs 32 for IPv4)
= 340 undecillion addresses (340 × 10^36)
Shortened form:
2001:db8:85a3::8a2e:370:7334
(consecutive zeros can be replaced with ::)
System design relevance:
- Most cloud providers support dual-stack (IPv4 + IPv6)
- Mobile networks increasingly use IPv6
- No NAT needed (every device can have public IP)
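The :: compression rule is mechanical, and the `ipaddress` module applies it for you — a small sketch using the address from above:

```python
import ipaddress

addr = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")
print(addr)           # 2001:db8:85a3::8a2e:370:7334  (:: compression applied)
print(addr.exploded)  # 2001:0db8:85a3:0000:0000:8a2e:0370:7334 (full form)
```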
Chapter 3: TCP — The Reliable Protocol
TCP (Transmission Control Protocol) is the backbone of most internet traffic.
3.1 TCP Characteristics
┌────────────────────────────────────────────────────────────────────────┐
│ TCP GUARANTEES │
│ │
│ ✓ RELIABLE DELIVERY │
│ Every byte sent will be received (or connection fails) │
│ Achieved through: Acknowledgments + Retransmissions │
│ │
│ ✓ ORDERED DELIVERY │
│ Bytes received in same order as sent │
│ Achieved through: Sequence numbers │
│ │
│ ✓ FLOW CONTROL │
│ Sender won't overwhelm receiver │
│ Achieved through: Receive window │
│ │
│ ✓ CONGESTION CONTROL │
│ Sender adapts to network capacity │
│ Achieved through: Congestion window algorithms │
│ │
│ ✗ NO MESSAGE BOUNDARIES │
│ TCP is a byte stream, not message-based │
│ Application must define message framing │
│ │
└────────────────────────────────────────────────────────────────────────┘
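Because TCP delivers a byte stream with no message boundaries, the application must frame its own messages. A common scheme, sketched here with hypothetical helper names, is a 4-byte length prefix:

```python
import struct

def frame(payload: bytes) -> bytes:
    """Prefix each message with a 4-byte big-endian length."""
    return struct.pack("!I", len(payload)) + payload

def unframe(buffer: bytes):
    """Extract complete messages; return any leftover partial bytes."""
    msgs = []
    while len(buffer) >= 4:
        (length,) = struct.unpack("!I", buffer[:4])
        if len(buffer) < 4 + length:
            break                      # message not fully received yet
        msgs.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return msgs, buffer
```

On the receive side you append each `recv()` result to a buffer and call `unframe` — partial messages simply wait for more bytes.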
3.2 TCP Three-Way Handshake
TCP CONNECTION ESTABLISHMENT
Client Server
│ │
│ ──────── SYN (seq=100) ──────────▶ │ Step 1: Client initiates
│ │ "I want to connect"
│ │
│ ◀─── SYN-ACK (seq=300, ack=101) ─── │ Step 2: Server acknowledges
│ │ "OK, I'm ready too"
│ │
│ ──────── ACK (ack=301) ──────────▶ │ Step 3: Client confirms
│ │ "Let's go!"
│ │
│ ══════ Connection Established ══════ │
│ │
LATENCY COST:
The full handshake takes 1.5 RTT, but the client may send data along with
its final ACK — so roughly 1 RTT (Round Trip Time) passes before
application data can flow
Example:
Client in NYC, Server in London
RTT ≈ 80ms
Added latency ≈ 80ms before the first request byte (more once TLS is added)
3.3 TCP Connection Termination
TCP CONNECTION TERMINATION (Four-Way Handshake)
Client Server
│ │
│ ──────── FIN ────────────────────▶ │ "I'm done sending"
│ │
│ ◀─────── ACK ──────────────────── │ "Got it"
│ │
│ ◀─────── FIN ──────────────────── │ "I'm done too"
│ │
│ ──────── ACK ────────────────────▶ │ "Goodbye"
│ │
│ [TIME_WAIT: 2×MSL] │
│ │
TIME_WAIT STATE:
- Lasts 2 × MSL (Maximum Segment Lifetime); on Linux a fixed 60 seconds
- Prevents stray packets from an old connection corrupting a new one
- Can cause "port exhaustion" with many short-lived connections
SYSTEM DESIGN IMPACT:
- Connection pooling is critical for high-throughput systems
- HTTP Keep-Alive reuses connections
- TIME_WAIT can exhaust ephemeral ports (see Chapter on ports)
3.4 TCP Flow Control: Receive Window
RECEIVE WINDOW (Flow Control)
Sender Receiver
│ │
│ ──── Data (1000 bytes) ────────────▶ │
│ │ Buffer: [1000/4096]
│ ◀──── ACK, Window=3096 ──────────── │ "I can receive 3096 more"
│ │
│ ──── Data (3000 bytes) ────────────▶ │
│ │ Buffer: [4000/4096]
│ ◀──── ACK, Window=96 ────────────── │ "Only 96 bytes free!"
│ │
│ [Sender pauses, waits for window] │
│ │ [Receiver processes data]
│ ◀──── Window Update=4096 ───────── │ "Buffer cleared"
│ │
│ [Sender resumes] │
│ │
SYSTEM DESIGN IMPACT:
- Slow consumers cause sender to block
- Can lead to backpressure through the system
- Monitor receive buffer utilization
3.5 TCP Congestion Control
CONGESTION CONTROL PHASES
Throughput
│
│ ┌─── Congestion detected
│ │ (packet loss)
│ ▼
│ ╱╲ ╱╲
│ ╱ ╲ ╱ ╲
│ ╱ ╲ ╱ ╲
│ ╱ ╲ ╱ ╲
│ ╱ Slow ╲╱ ╲
│ ╱ Start │ ╲
│ ╱ │ ╲
│ ╱ │ Congestion
│ ╱ │ Avoidance
└─────────────────────────────▶ Time
PHASES:
1. Slow Start: Exponential growth (double each RTT)
2. Congestion Avoidance: Linear growth (cautious)
3. Congestion Detected: Cut window, restart
ALGORITHMS:
- Cubic (Linux default): Aggressive, good for high bandwidth
- BBR (Google): Estimates bandwidth, doesn't rely on packet loss
- Reno: Classic, conservative
SYSTEM DESIGN IMPACT:
- New connections start slow (Slow Start)
- Long-lived connections more efficient
- Connection pooling helps!
3.6 TCP Head-of-Line Blocking
HEAD-OF-LINE BLOCKING PROBLEM
Sender sends packets: [1] [2] [3] [4] [5]
Network delivers: [1] [X] [3] [4] [5]
│
└── Packet 2 lost!
Receiver has: [1] [_] [3] [4] [5]
│
└── Can't deliver 3,4,5 to application!
Must wait for retransmission of 2
Application sees: [1] ... waiting ... [2] [3] [4] [5]
│
└── Delayed by retransmission time!
WHY IT MATTERS:
- HTTP/2 multiplexes streams over single TCP
- One lost packet blocks ALL streams
- This is why HTTP/3 uses QUIC (UDP-based)
Chapter 4: UDP — The Fast Protocol
4.1 UDP Characteristics
┌────────────────────────────────────────────────────────────────────────┐
│ UDP CHARACTERISTICS │
│ │
│ ✓ FAST │
│ No handshake needed, just send │
│ │
│ ✓ LIGHTWEIGHT │
│ 8-byte header vs 20+ bytes for TCP │
│ │
│ ✓ MESSAGE-BASED │
│ Preserves message boundaries (unlike TCP byte stream) │
│ │
│ ✓ SUPPORTS MULTICAST/BROADCAST │
│ Send to multiple recipients efficiently │
│ │
│ ✗ NO RELIABILITY │
│ Packets may be lost, duplicated, or reordered │
│ Application must handle this │
│ │
│ ✗ NO CONGESTION CONTROL │
│ Can overwhelm network if unchecked │
│ Application should implement rate limiting │
│ │
└────────────────────────────────────────────────────────────────────────┘
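The "message-based" property is easy to demonstrate: two `sendto()` calls arrive as two separate datagrams, where TCP would merge them into one byte stream. A self-contained sketch over loopback (real networks may drop or reorder datagrams):

```python
import socket

# Receiver: bind a UDP socket to an OS-chosen port on loopback
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
rx.settimeout(2.0)

# Sender: two sends become two distinct datagrams
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"first", rx.getsockname())
tx.sendto(b"second", rx.getsockname())

msg1, _ = rx.recvfrom(2048)   # one datagram per recvfrom call
msg2, _ = rx.recvfrom(2048)   # boundaries preserved, never merged
print(msg1, msg2)
tx.close(); rx.close()
```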
4.2 UDP Use Cases
WHEN TO USE UDP:
1. REAL-TIME APPLICATIONS
- Voice/video calls (Zoom, Discord)
- Live video with tight latency budgets (WebRTC)
- Online gaming
Why: Late data is useless. Better to skip than wait.
(Note: on-demand streaming like Netflix actually runs over HTTP/TCP —
buffering hides the latency, so reliability wins there.)
2. DNS QUERIES
- Simple request/response
- Fits in single packet
- Can retry if lost
3. HEALTH CHECKS / HEARTBEATS
- Small, frequent messages
- Missing one is not critical
4. QUIC PROTOCOL (HTTP/3)
- Built reliability on top of UDP
- Avoids TCP head-of-line blocking
4.3 TCP vs UDP Comparison
┌────────────────────────────────────────────────────────────────────────┐
│ TCP vs UDP │
│ │
│ Feature │ TCP │ UDP │
│ ─────────────────────┼──────────────────┼─────────────────────────── │
│ Connection │ Required (3-way) │ None │
│ Reliability │ Guaranteed │ Best effort │
│ Ordering │ Guaranteed │ No guarantee │
│ Flow control │ Yes │ No │
│ Congestion control │ Yes │ No │
│ Message boundaries │ No (byte stream) │ Yes (datagrams) │
│ Header size │ 20-60 bytes │ 8 bytes │
│ Speed │ Slower │ Faster │
│ Use cases │ HTTP, SMTP, SSH │ DNS, VoIP, Gaming, QUIC │
│ │
└────────────────────────────────────────────────────────────────────────┘
Chapter 5: Ports and Sockets
5.1 Port Numbers
PORT NUMBER RANGES
0 - 1023 WELL-KNOWN PORTS (System/Privileged)
─────────────────────────────────────────────────────
20, 21 FTP (data, control)
22 SSH
23 Telnet
25 SMTP
53 DNS
80 HTTP
443 HTTPS
1024 - 49151 REGISTERED PORTS (User)
─────────────────────────────────────────────────────
3306 MySQL
5432 PostgreSQL
6379 Redis
8080 HTTP alternate
9092 Kafka
27017 MongoDB
49152 - 65535 DYNAMIC/EPHEMERAL PORTS
─────────────────────────────────────────────────────
Used for client-side connections
OS assigns automatically
5.2 Sockets: The Connection Identifier
SOCKET = IP Address + Port + Protocol
A connection is uniquely identified by 5-tuple:
┌────────────────────────────────────────────────────────────────────────┐
│ │
│ (Protocol, Source IP, Source Port, Dest IP, Dest Port) │
│ │
│ Example: │
│ (TCP, 192.168.1.100, 54321, 93.184.216.34, 443) │
│ └─────────────────┘ └────────────────────┘ │
│ Client side Server side │
│ │
└────────────────────────────────────────────────────────────────────────┘
SERVER SOCKET:
- Listens on (*, 443) - any IP, port 443
- Can handle many connections simultaneously
- Each connection is unique 5-tuple
CLIENT SOCKET:
- OS assigns ephemeral port (e.g., 54321)
- Connects to server IP:port
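The 5-tuple is visible from any connected socket. A self-contained sketch with a throwaway local server (the protocol element of the tuple is TCP here):

```python
import socket, threading

# Throwaway local server so the example needs no network access
srv = socket.socket()
srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
srv.listen(1)
server_port = srv.getsockname()[1]
threading.Thread(target=srv.accept, daemon=True).start()

client = socket.create_connection(("127.0.0.1", server_port))
src_ip, src_port = client.getsockname()   # OS-assigned ephemeral port
dst_ip, dst_port = client.getpeername()   # the server we dialed
print(("TCP", src_ip, src_port, dst_ip, dst_port))
client.close(); srv.close()
```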
5.3 Ephemeral Port Exhaustion
EPHEMERAL PORT EXHAUSTION PROBLEM
Server making many outbound connections (e.g., to database, cache)
Available ephemeral ports: 49152 - 65535 = ~16,000 ports
Each connection uses one port until TIME_WAIT expires (60-120 sec)
At 500 connections/second:
500 × 60 = 30,000 ports needed
But only 16,000 available!
RESULT: "Cannot assign requested address" errors
SOLUTIONS:
1. Connection pooling (reuse connections)
2. Widen the ephemeral port range: net.ipv4.ip_local_port_range
3. Allow reuse of TIME_WAIT sockets for outbound connections:
   net.ipv4.tcp_tw_reuse = 1 (note: this does not shorten TIME_WAIT itself)
4. Use connection timeouts
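The exhaustion arithmetic above, as a quick back-of-the-envelope script:

```python
conn_per_sec = 500               # new outbound connections per second
time_wait_sec = 60               # each closed socket lingers this long
ports_in_use = conn_per_sec * time_wait_sec   # steady-state port demand
ports_available = 65535 - 49152 + 1           # default dynamic range

print(ports_in_use)                       # 30000
print(ports_available)                    # 16384
print(ports_in_use > ports_available)     # True → exhaustion
```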
Part II: Application Layer Protocols
Chapter 6: HTTP Protocol Deep Dive
6.1 HTTP/1.1
HTTP/1.1 REQUEST
GET /api/users/123 HTTP/1.1 ← Request line
Host: api.example.com ← Required header
Accept: application/json ← Content negotiation
Authorization: Bearer xyz123 ← Authentication
Connection: keep-alive ← Reuse connection
[empty line] ← Separates headers from body
[optional body]
HTTP/1.1 RESPONSE
HTTP/1.1 200 OK ← Status line
Content-Type: application/json ← Body format
Content-Length: 256 ← Body size
Cache-Control: max-age=3600 ← Caching directive
[empty line]
{"id": 123, "name": "John"} ← Response body
6.2 HTTP Methods
┌────────────────────────────────────────────────────────────────────────┐
│ HTTP METHODS │
│ │
│ Method │ Safe │ Idempotent │ Body │ Use Case │
│ ──────────┼──────┼────────────┼──────┼────────────────────────────── │
│ GET │ Yes │ Yes │ No │ Retrieve resource │
│ HEAD │ Yes │ Yes │ No │ Get headers only │
│ POST │ No │ No │ Yes │ Create resource │
│ PUT │ No │ Yes │ Yes │ Replace resource │
│ PATCH │ No │ No* │ Yes │ Partial update │
│ DELETE │ No │ Yes │ No │ Remove resource │
│ OPTIONS │ Yes │ Yes │ No │ Get allowed methods │
│ │
│ Safe: Doesn't modify server state │
│ Idempotent: Multiple identical requests = same result as one │
│ * PATCH can be idempotent if designed correctly │
│ │
└────────────────────────────────────────────────────────────────────────┘
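Idempotency is easiest to see in code. A toy in-memory store (hypothetical helpers, not a real framework) showing why repeating PUT is safe while repeating POST is not:

```python
store = {}
_next_id = 0

def put(resource_id, doc):
    # PUT: the client names the resource; repeating it changes nothing more
    store[resource_id] = doc

def post(doc):
    # POST: the server assigns a fresh ID; repeating it creates a duplicate
    global _next_id
    _next_id += 1
    store[_next_id] = doc
    return _next_id

put("u123", {"name": "John"})
put("u123", {"name": "John"})   # idempotent: still exactly one resource
post({"name": "Jane"})
post({"name": "Jane"})          # NOT idempotent: two resources created
print(len(store))               # 3
```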
6.3 HTTP Status Codes
┌────────────────────────────────────────────────────────────────────────┐
│ HTTP STATUS CODES │
│ │
│ 1xx INFORMATIONAL │
│ ──────────────────────────────────────────────────────────────────── │
│ 100 Continue "Keep sending, I'm ready" │
│ 101 Switching Proto "Upgrading to WebSocket" │
│ │
│ 2xx SUCCESS │
│ ──────────────────────────────────────────────────────────────────── │
│ 200 OK Standard success │
│ 201 Created Resource created (POST) │
│ 202 Accepted Request accepted, processing async │
│ 204 No Content Success, no body (DELETE) │
│ │
│ 3xx REDIRECTION │
│ ──────────────────────────────────────────────────────────────────── │
│ 301 Moved Permanent Resource moved forever │
│ 302 Found Temporary redirect │
│ 304 Not Modified Use cached version │
│ 307 Temp Redirect Redirect, keep method │
│ 308 Perm Redirect Permanent, keep method │
│ │
│ 4xx CLIENT ERROR │
│ ──────────────────────────────────────────────────────────────────── │
│ 400 Bad Request Invalid syntax │
│ 401 Unauthorized Authentication required │
│ 403 Forbidden Authenticated but not allowed │
│ 404 Not Found Resource doesn't exist │
│ 405 Method Not Allow This method not supported │
│ 408 Request Timeout Client too slow │
│ 409 Conflict State conflict (concurrent edit) │
│ 422 Unprocessable Validation failed │
│ 429 Too Many Request Rate limited │
│ │
│ 5xx SERVER ERROR │
│ ──────────────────────────────────────────────────────────────────── │
│ 500 Internal Error Generic server error │
│ 502 Bad Gateway Upstream error (from proxy/LB) │
│ 503 Service Unavail Server overloaded/maintenance │
│ 504 Gateway Timeout Upstream timeout │
│ │
└────────────────────────────────────────────────────────────────────────┘
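Status-code families drive client retry logic. A hedged sketch of a typical policy (the function names and the exact retryable set are illustrative, not a standard):

```python
RETRYABLE = {429, 500, 502, 503, 504}   # transient server/infra trouble

def should_retry(status, attempt, max_attempts=3):
    # Other 4xx codes mean the request itself is wrong — retrying won't help
    return status in RETRYABLE and attempt < max_attempts

def backoff_seconds(attempt, base=0.5, cap=30.0):
    # Exponential backoff; production code should also add jitter
    return min(base * (2 ** attempt), cap)

print(should_retry(503, 0))   # True  — server overloaded, try again
print(should_retry(404, 0))   # False — the resource just isn't there
print(backoff_seconds(3))     # 4.0
```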
6.4 HTTP/1.1 Limitations
HTTP/1.1 PROBLEMS
1. HEAD-OF-LINE BLOCKING
─────────────────────────────────────────────────────────────────────
One connection = one request at a time
Request 1 ────────────────────▶ [slow response]
Request 2 [waiting...]
Request 3 [waiting...]
Workaround: Open multiple connections (browsers typically allow 6 per origin)
Problem: More TCP handshakes, more memory
2. HEADER OVERHEAD
─────────────────────────────────────────────────────────────────────
Headers sent as plain text
Same headers repeated for every request
Cookie: session=abc123def456... (100s of bytes, every request!)
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)...
Accept: text/html,application/xhtml+xml,application/xml;q=0.9...
3. NO SERVER PUSH
─────────────────────────────────────────────────────────────────────
Server can't proactively send data
Client must request everything
6.5 HTTP/2
┌───────────────────────────────────────────────────────────────────────┐
│ HTTP/2 IMPROVEMENTS │
│ │
│ 1. MULTIPLEXING │
│ ─────────────────────────────────────────────────────────────────── │
│ Multiple requests/responses over single TCP connection │
│ No head-of-line blocking at HTTP level │
│ │
│ Connection ═══════════════════════════════════════════════════ │
│ │ Stream 1 │ Stream 3 │ Stream 1 │ Stream 5 │ │
│ │ Request │ Request │ Response │ Request │ │
│ │
│ 2. HEADER COMPRESSION (HPACK) │
│ ─────────────────────────────────────────────────────────────────── │
│ Headers compressed and indexed │
│ Send only differences from previous request │
│ Can reduce header size by 85-90% │
│ │
│ 3. BINARY PROTOCOL │
│ ─────────────────────────────────────────────────────────────────── │
│ Binary framing instead of text │
│ More efficient parsing │
│ Not human-readable (use tools like curl, Wireshark) │
│ │
│ 4. SERVER PUSH │
│ ─────────────────────────────────────────────────────────────────── │
│ Server can send resources before client requests │
│ Example: Push CSS when HTML is requested │
│ (Rarely used in practice, being removed in some browsers) │
│ │
│ 5. STREAM PRIORITIZATION │
│ ─────────────────────────────────────────────────────────────────── │
│ Client can indicate priority of streams │
│ Server can optimize delivery order │
│ │
│ LIMITATION: │
│ Still runs on TCP → TCP head-of-line blocking affects all streams │
│ │
└───────────────────────────────────────────────────────────────────────┘
6.6 HTTP/3 and QUIC
┌────────────────────────────────────────────────────────────────────────┐
│ HTTP/3 (QUIC) │
│ │
│ KEY CHANGE: Runs on UDP instead of TCP │
│ │
│ HTTP/1.1, HTTP/2: HTTP → TCP → IP │
│ HTTP/3: HTTP → QUIC → UDP → IP │
│ │
│ BENEFITS: │
│ ──────────────────────────────────────────────────────────────────── │
│ │
│ 1. NO TCP HEAD-OF-LINE BLOCKING │
│ Lost packet only affects its stream, not others │
│ │
│ 2. FASTER CONNECTION ESTABLISHMENT │
│ 0-RTT or 1-RTT (vs 2-3 RTT for TCP+TLS) │
│ │
│ TCP + TLS 1.3: │
│ TCP SYN → SYN-ACK → ACK (1.5 RTT) │
│ TLS ClientHello → ServerHello → Finished (1 RTT) │
│ Total: 2-3 RTT │
│ │
│ QUIC: │
│ Initial packet contains crypto + data (1 RTT) │
│ Resumption: 0-RTT (send data immediately) │
│ │
│ 3. CONNECTION MIGRATION │
│ Connection survives IP change (mobile switching WiFi to cellular) │
│ Connection identified by ID, not IP:port │
│ │
│ 4. BUILT-IN ENCRYPTION │
│ TLS 1.3 mandatory, even packet numbers encrypted │
│ │
│ ADOPTION: │
│ - Google services (YouTube, Search) │
│ - Cloudflare │
│ - Facebook │
│ - ~25% of web traffic (and growing) │
│ │
└────────────────────────────────────────────────────────────────────────┘
6.7 HTTP Keep-Alive vs Connection Pooling
HTTP KEEP-ALIVE (HTTP/1.1)
Without Keep-Alive:
Request 1: TCP Handshake → HTTP Request → Response → TCP Close
Request 2: TCP Handshake → HTTP Request → Response → TCP Close
Request 3: TCP Handshake → HTTP Request → Response → TCP Close
With Keep-Alive:
TCP Handshake
Request 1 → Response 1
Request 2 → Response 2 (reuse same connection)
Request 3 → Response 3
TCP Close (after idle timeout)
CONNECTION POOLING (Application Level)
┌─────────────────────────────────────────────────────────────────┐
│ CONNECTION POOL │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Connection 1│ │ Connection 2│ │ Connection 3│ │
│ │ (idle) │ │ (in use) │ │ (idle) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │
│ Pool size: 10 │
│ Active: 3 │
│ Idle: 7 │
│ Waiting: 0 │
│ │
└─────────────────────────────────────────────────────────────────┘
Benefits:
- Avoid TCP/TLS handshake per request
- Limit total connections (prevent exhaustion)
- Health checking of connections
Common configurations:
- Database pool: 10-50 connections
- HTTP client pool: 100-500 connections
- Redis: Usually single connection per client (pipelining)
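A minimal connection-pool sketch, under the assumption that a blocking queue is acceptable (blocking when the pool is exhausted is exactly the backpressure you want):

```python
import queue

class ConnectionPool:
    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())   # pre-open all connections up front

    def acquire(self, timeout=5.0):
        # Blocks (up to timeout) when every connection is checked out
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)

# Toy "connections" stand in for real sockets here
pool = ConnectionPool(factory=object, size=3)
conn = pool.acquire()
# ... use conn ...
pool.release(conn)
print(pool._idle.qsize())   # 3 — all connections back in the pool
```

Real pools add health checks, idle timeouts, and lazy creation on top of this shape.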
Chapter 7: WebSocket Protocol
7.1 WebSocket vs HTTP
┌────────────────────────────────────────────────────────────────────────┐
│ HTTP vs WEBSOCKET │
│ │
│ HTTP (Request-Response) │
│ ──────────────────────────────────────────────────────────────────── │
│ │
│ Client Server │
│ │ │ │
│ │ ──── Request ────────────────▶ │ │
│ │ ◀─── Response ─────────────── │ │
│ │ │ Connection closed │
│ │ ──── Request ────────────────▶ │ │
│ │ ◀─── Response ─────────────── │ │
│ │ │ │
│ │ Server cannot initiate! │ │
│ │
│ WEBSOCKET (Bidirectional) │
│ ──────────────────────────────────────────────────────────────────── │
│ │
│ Client Server │
│ │ │ │
│ │ ──── HTTP Upgrade Request ───▶ │ │
│ │ ◀─── 101 Switching Protocols ─ │ │
│ │ │ │
│ │ ══════ WebSocket Open ═══════ │ │
│ │ │ │
│ │ ──── Message ────────────────▶ │ │
│ │ ◀─── Message ──────────────── │ Server can push anytime! │
│ │ ◀─── Message ──────────────── │ │
│ │ ──── Message ────────────────▶ │ │
│ │ │ │
│ │ [Connection persists] │ │
│ │
└────────────────────────────────────────────────────────────────────────┘
7.2 WebSocket Handshake
WEBSOCKET UPGRADE REQUEST
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket ← Request upgrade
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZS... ← Random key
Sec-WebSocket-Version: 13
WEBSOCKET UPGRADE RESPONSE
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxa... ← Key + magic string, hashed
After this, raw WebSocket frames flow (binary protocol)
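The `Sec-WebSocket-Accept` value is computable: SHA-1 of the client key concatenated with a fixed GUID, then base64 (per RFC 6455). The sample key below is the RFC's own test vector:

```python
import base64, hashlib

MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"   # fixed GUID from RFC 6455

def websocket_accept(client_key):
    digest = hashlib.sha1((client_key + MAGIC).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
# → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The random key proves the server actually speaks WebSocket (a cached or naive HTTP response couldn't produce the right hash).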
7.3 WebSocket Use Cases
WHEN TO USE WEBSOCKET:
✓ Real-time bidirectional communication
- Chat applications
- Collaborative editing (Google Docs)
- Live sports scores
✓ Server needs to push frequently
- Stock tickers
- Gaming state updates
- Live dashboards
✓ Low latency required
- Online gaming
- Trading platforms
WHEN NOT TO USE:
✗ Request-response pattern
- REST APIs (use HTTP)
✗ Infrequent updates
- Use polling or long-polling
✗ One-way server push
- Consider Server-Sent Events (SSE)
7.4 WebSocket Scaling Challenges
WEBSOCKET SCALING CHALLENGES
1. CONNECTION PERSISTENCE
─────────────────────────────────────────────────────────────────────
Each connection = memory + file descriptor
100K connections = significant server resources
Solutions:
- Use async I/O (Node.js, Go, Rust)
- Tune OS limits (ulimit, sysctl)
2. LOAD BALANCER STICKINESS
─────────────────────────────────────────────────────────────────────
WebSocket = stateful connection
Must route to same server!
Solutions:
- Sticky sessions (IP hash, cookie)
- Connection-aware load balancing
3. HORIZONTAL SCALING
─────────────────────────────────────────────────────────────────────
User A on Server 1, User B on Server 2
How does A's message reach B?
┌──────────┐ ┌──────────┐
│ Server 1 │ │ Server 2 │
│ User A ──┼──?? How to reach ??───▶│── User B │
└──────────┘ └──────────┘
Solution: Pub/Sub backbone (Redis, Kafka)
┌──────────┐ ┌─────────┐ ┌──────────┐
│ Server 1 │────▶│ Redis │◀─────│ Server 2 │
│ User A │ │ Pub/Sub │ │ User B │
└──────────┘ └─────────┘ └──────────┘
Server 1 publishes to Redis
Server 2 subscribes, delivers to User B
4. RECONNECTION HANDLING
─────────────────────────────────────────────────────────────────────
Mobile connections drop frequently
Must handle reconnection + state sync
Solution:
- Client-side reconnection with exponential backoff
- Last-message-ID for catching up
Chapter 8: Server-Sent Events (SSE)
┌────────────────────────────────────────────────────────────────────────┐
│ SERVER-SENT EVENTS (SSE) │
│ │
│ One-way channel: Server → Client │
│ │
│ Client Server │
│ │ │ │
│ │ ──── GET /events ────────────▶ │ │
│ │ ◀─── 200 OK ───────────────── │ │
│ │ Content-Type: text/event-stream │
│ │ │ │
│ │ ◀─── data: {"price": 100} ─── │ │
│ │ ◀─── data: {"price": 101} ─── │ │
│ │ ◀─── data: {"price": 99} ──── │ │
│ │ │ │
│ │ [Connection held open] │ │
│ │
│ ADVANTAGES OVER WEBSOCKET: │
│ ───────────────────────────────────────────────────────────────────── │
│ ✓ Uses standard HTTP (works with existing infrastructure) │
│ ✓ Automatic reconnection built into browser API │
│ ✓ Simpler than WebSocket for server push only │
│ ✓ Works through HTTP/2 efficiently │
│ │
│ LIMITATIONS: │
│ ──────────────────────────────────────────────────────────────────── │
│ ✗ One-way only (server to client) │
│ ✗ Text-based (no binary) │
│ ✗ Some browser limits on connections per domain │
│ │
│ USE CASES: │
│ - News feeds │
│ - Stock prices │
│ - Notifications │
│ - Build/deployment status │
│ │
└────────────────────────────────────────────────────────────────────────┘
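The `text/event-stream` format is plain text: `data:` lines terminated by a blank line. A small formatter sketch (the helper name is illustrative):

```python
def sse_event(data, event=None, event_id=None):
    """Format one Server-Sent Event for a text/event-stream response."""
    lines = []
    if event_id is not None:
        lines.append("id: " + event_id)     # client echoes this back via
    if event is not None:                   # Last-Event-ID on reconnect
        lines.append("event: " + event)
    for chunk in data.splitlines() or [""]:
        lines.append("data: " + chunk)      # multi-line data → repeated field
    return "\n".join(lines) + "\n\n"        # blank line terminates the event

print(repr(sse_event('{"price": 100}')))
# 'data: {"price": 100}\n\n'
```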
Chapter 9: gRPC Protocol
9.1 gRPC Overview
┌─────────────────────────────────────────────────────────────────────────┐
│ gRPC │
│ │
│ gRPC = Google Remote Procedure Call │
│ │
│ Stack: │
│ Application → gRPC → Protocol Buffers → HTTP/2 → TCP │
│ │
│ KEY FEATURES: │
│ ───────────────────────────────────────────────────────────────────── │
│ │
│ 1. PROTOCOL BUFFERS (Protobuf) │
│ Binary serialization format │
│ Strongly typed │
│ Markedly smaller and faster to parse than JSON │
│ │
│ 2. HTTP/2 │
│ Multiplexing, header compression │
│ Bidirectional streaming │
│ │
│ 3. CODE GENERATION │
│ Generate client/server stubs from .proto files │
│ Supports many languages (Go, Java, Python, C++, etc.) │
│ │
│ 4. FOUR COMMUNICATION PATTERNS │
│ Unary, Server streaming, Client streaming, Bidirectional │
│ │
└─────────────────────────────────────────────────────────────────────────┘
9.2 gRPC Communication Patterns
gRPC COMMUNICATION PATTERNS
1. UNARY (Simple Request-Response)
─────────────────────────────────────────────────────────────────────
Client ─── Request ───▶ Server
Client ◀── Response ─── Server
Use: Simple API calls (like REST)
2. SERVER STREAMING
─────────────────────────────────────────────────────────────────────
Client ─── Request ──────────────────────▶ Server
Client ◀── Response 1 ─── Server
Client ◀── Response 2 ─── Server
Client ◀── Response 3 ─── Server
Client ◀── End ────────── Server
Use: Large result sets, real-time updates
3. CLIENT STREAMING
─────────────────────────────────────────────────────────────────────
Client ─── Request 1 ───▶ Server
Client ─── Request 2 ───▶ Server
Client ─── Request 3 ───▶ Server
Client ─── End ──────────▶ Server
Client ◀── Response ───── Server
Use: File upload, batch processing
4. BIDIRECTIONAL STREAMING
─────────────────────────────────────────────────────────────────────
Client ─── Request 1 ───▶ Server
Client ◀── Response 1 ─── Server
Client ─── Request 2 ───▶ Server
Client ◀── Response 2 ─── Server
Client ◀── Response 3 ─── Server
Client ─── Request 3 ───▶ Server
Use: Chat, real-time collaboration
9.3 gRPC vs REST
┌────────────────────────────────────────────────────────────────────────┐
│ gRPC vs REST │
│ │
│ Aspect │ gRPC │ REST │
│ ───────────────────┼───────────────────┼─────────────────────────── │
│ Protocol │ HTTP/2 │ HTTP/1.1 or HTTP/2 │
│ Data format │ Protobuf (binary) │ JSON (text) │
│ API contract │ .proto files │ OpenAPI/Swagger │
│ Code generation │ Built-in │ Optional │
│ Browser support │ Limited (grpc-web)│ Native │
│ Streaming │ Full support │ Limited │
│ Human readable │ No │ Yes │
│ Performance │ Faster │ Slower │
│ Tooling │ Specialized │ Universal (curl, etc.) │
│ Learning curve │ Steeper │ Gentler │
│ │
│ USE gRPC WHEN: │
│ - Microservices communication (internal) │
│ - High performance required │
│ - Streaming needed │
│ - Strong typing desired │
│ │
│ USE REST WHEN: │
│ - Public API │
│ - Browser clients │
│ - Simple CRUD operations │
│ - Human debugging needed │
│ │
└────────────────────────────────────────────────────────────────────────┘
Part III: Security and Encryption
Chapter 10: TLS/SSL Deep Dive
10.1 TLS Handshake
TLS 1.3 HANDSHAKE (Simplified)
Client Server
│ │
│ ──── ClientHello ───────────────────▶ │
│ + Supported cipher suites │
│ + Random number │
│ + Key share (for ECDH) │
│ │
│ ◀─── ServerHello ─────────────────── │
│ + Chosen cipher suite │
│ + Random number │
│ + Key share │
│ + Certificate │
│ + Certificate verify │
│ + Finished │
│ │
│ ──── Finished ──────────────────────▶ │
│ │
│ ═════ Encrypted Data ═══════════════ │
│ │
TLS 1.3 IMPROVEMENTS:
- 1-RTT handshake (vs 2-RTT in TLS 1.2)
- 0-RTT resumption (with caveats)
- Removed weak ciphers
- Encrypted more of the handshake
10.2 Certificate Chain
CERTIFICATE CHAIN OF TRUST
┌────────────────────────────────────────────────────────────────┐
│ ROOT CA │
│ (DigiCert, Let's Encrypt, etc.) │
│ │
│ - Pre-installed in browsers/OS │
│ - Self-signed │
│ - Extremely protected (offline) │
│ │
│ Signs ↓ │
│ │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ INTERMEDIATE CA │ │
│ │ │ │
│ │ - Signed by Root CA │ │
│ │ - Issues end-entity certificates │ │
│ │ - Can be revoked without affecting root │ │
│ │ │ │
│ │ Signs ↓ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ END-ENTITY CERTIFICATE │ │ │
│ │ │ │ │ │
│ │ │ Subject: www.example.com │ │ │
│ │ │ Public Key: [RSA 2048 or ECDSA P-256] │ │ │
│ │ │ Valid: 2024-01-01 to 2025-01-01 │ │ │
│ │ │ Issuer: Intermediate CA │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
VALIDATION:
1. Server sends: End cert + Intermediate cert
2. Client has: Root certs pre-installed
3. Client verifies: End → Intermediate → Root (chain complete)
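The client side of this validation is what a default TLS context does for you. A small sketch showing the two settings in Python's `ssl` module that enforce steps 2-3:

```python
import ssl

# create_default_context() loads the platform's pre-installed root CAs
# and refuses any presented chain it cannot complete back to one of them.
ctx = ssl.create_default_context()

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True: chain must validate
print(ctx.check_hostname)                    # True: cert must match the host
```

Disabling either of these (a common "fix" for local dev) removes the authentication property entirely, so it should never reach production config.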
10.3 HTTPS Everywhere
WHY HTTPS MATTERS FOR SYSTEM DESIGN
1. DATA INTEGRITY
Prevents man-in-the-middle modifications
2. CONFIDENTIALITY
Encrypts all data in transit
3. AUTHENTICATION
Verifies server identity
4. SEO & TRUST
Google ranks HTTPS higher
Browsers show "Not Secure" for HTTP
5. REQUIRED FOR MODERN FEATURES
HTTP/2, HTTP/3, Service Workers, Geolocation all require HTTPS
PERFORMANCE OVERHEAD:
- TLS handshake: 1-2 RTT (use TLS 1.3, session resumption)
- Encryption: ~1-2% CPU overhead (hardware acceleration helps)
- Worth it: Security benefits outweigh costs
10.4 mTLS (Mutual TLS)
MUTUAL TLS (mTLS)
Standard TLS:
Client verifies server's certificate
Server doesn't verify client
Mutual TLS:
Client verifies server's certificate
Server ALSO verifies client's certificate
Client Server
│ │
│ Presents client certificate ───────▶ │
│ │
│ ◀─────── Validates certificate ───── │
│ │
│ Both sides authenticated │
USE CASES:
- Service-to-service authentication (microservices)
- API authentication (replacement for API keys)
- Zero-trust networks
- IoT device authentication
IMPLEMENTATION:
- Service mesh (Istio, Linkerd) handles automatically
- Or configure in application/load balancer
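If you configure mTLS in the application rather than a service mesh, the essential change is on the server context. A sketch with Python's `ssl` module; the file paths are placeholders, not real files:

```python
import ssl

def make_mtls_server_context(cert_file: str, key_file: str,
                             client_ca_file: str) -> ssl.SSLContext:
    """Server-side context that REQUIRES a client certificate (mTLS).
    All three paths are caller-supplied placeholders in this sketch."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(cert_file, key_file)    # the server's own identity
    ctx.load_verify_locations(client_ca_file)   # CA that signed client certs
    ctx.verify_mode = ssl.CERT_REQUIRED         # the "mutual" part of mTLS
    return ctx
```

The single line `verify_mode = ssl.CERT_REQUIRED` is what turns standard TLS into mutual TLS: without it the server accepts any client; with it, clients lacking a certificate signed by the trusted CA fail the handshake.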
Chapter 11: DNS Security
11.1 DNS Attacks and Mitigations
DNS SECURITY THREATS
1. DNS SPOOFING / CACHE POISONING
─────────────────────────────────────────────────────────────────────
Attacker injects false DNS records into cache
User asks: "Where is bank.com?"
Attacker responds: "192.168.1.100" (attacker's server)
User visits fake site, enters credentials
MITIGATION: DNSSEC
2. DNS AMPLIFICATION ATTACK (DDoS)
─────────────────────────────────────────────────────────────────────
Attacker sends DNS queries with spoofed source IP (victim's IP)
DNS servers send large responses to victim
Small query → Large response = Amplification
MITIGATION: Rate limiting, response rate limiting (RRL)
3. DNS TUNNELING
─────────────────────────────────────────────────────────────────────
Encode data in DNS queries to bypass firewalls
Used for data exfiltration
MITIGATION: DNS monitoring, query analysis
11.2 DNSSEC
DNSSEC (DNS Security Extensions)
PROBLEM:
DNS responses are not authenticated
Anyone can claim to be the authority for a domain
SOLUTION:
Cryptographically sign DNS records
HOW IT WORKS:
1. Zone owner creates DNSKEY (public/private key pair)
2. Signs all records with private key → RRSIG records
3. Parent zone signs child's DNSKEY → DS record
4. Chain of trust from root → TLD → domain
Resolver validates signatures up to root
LIMITATIONS:
- Doesn't encrypt (use DoH/DoT for that)
- Complex to deploy
- Larger responses
- Not universally adopted
11.3 DNS over HTTPS (DoH) and DNS over TLS (DoT)
ENCRYPTED DNS
PROBLEM:
Traditional DNS is plaintext
ISP, network operators can see all your DNS queries
SOLUTIONS:
DNS over HTTPS (DoH):
- DNS queries over HTTPS (port 443)
- Looks like regular web traffic
- Harder to block
- Used by browsers (Firefox, Chrome)
DNS over TLS (DoT):
- DNS queries over TLS (port 853)
- Dedicated port, easier to identify
- Used by Android, system resolvers
PUBLIC DoH SERVERS:
- Cloudflare: https://cloudflare-dns.com/dns-query
- Google: https://dns.google/dns-query
- Quad9: https://dns.quad9.net/dns-query
Part IV: Network Performance
Chapter 12: Latency Deep Dive
12.1 Components of Latency
┌────────────────────────────────────────────────────────────────────────┐
│ LATENCY COMPONENTS │
│ │
│ Total Latency = Propagation + Transmission + Processing + Queuing │
│ │
│ 1. PROPAGATION DELAY │
│ ──────────────────────────────────────────────────────────────────── │
│ Time for signal to travel through medium │
│ Limited by speed of light! │
│ │
│ Light in fiber: ~200,000 km/s (2/3 speed of light) │
│ NYC to London: ~5,500 km │
│ Minimum RTT: ~55 ms (just physics!) │
│ │
│ 2. TRANSMISSION DELAY │
│ ──────────────────────────────────────────────────────────────────── │
│ Time to push all bits onto the wire │
│ = Packet size / Bandwidth │
│ │
│ 1 MB file on 100 Mbps link = 80 ms │
│ 1 MB file on 1 Gbps link = 8 ms │
│ │
│ 3. PROCESSING DELAY │
│ ──────────────────────────────────────────────────────────────────── │
│ Time for routers/servers to process packet │
│ Usually microseconds per hop │
│ │
│ 4. QUEUING DELAY │
│ ──────────────────────────────────────────────────────────────────── │
│ Time waiting in buffers │
│ Highly variable! │
│ Depends on congestion │
│ │
└────────────────────────────────────────────────────────────────────────┘
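The transmission-delay numbers in the box fall straight out of the formula. A quick sketch to reproduce them:

```python
def transmission_delay_ms(size_bytes: float, bandwidth_bps: float) -> float:
    """Time to push all bits onto the wire: packet size / bandwidth."""
    return size_bytes * 8 / bandwidth_bps * 1000  # bits / bps -> seconds -> ms

print(transmission_delay_ms(1_000_000, 100e6))  # 80.0  (1 MB on 100 Mbps)
print(transmission_delay_ms(1_000_000, 1e9))    # 8.0   (1 MB on 1 Gbps)
```

Note that this component scales with link speed, while propagation delay does not; upgrading bandwidth never helps a workload dominated by round trips.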
12.2 Real-World Latency Numbers
LATENCY REFERENCE (Approximate)
NETWORK OPERATIONS:
───────────────────────────────────────────────────────────────────────────
Same datacenter (same rack): 0.5 ms
Same datacenter (different rack): 1 ms
Same region (different AZ): 1-2 ms
Cross-region (same continent): 20-50 ms
Cross-continent: 100-200 ms
Geostationary satellite: 600+ ms
TCP CONNECTION (per handshake):
───────────────────────────────────────────────────────────────────────────
Same datacenter:                    ~0.75 ms (1.5 RTT × 0.5 ms RTT)
Cross-region:                       ~30-75 ms (1.5 RTT × 20-50 ms RTT)
TLS HANDSHAKE (additional):
───────────────────────────────────────────────────────────────────────────
TLS 1.2: 2 RTT
TLS 1.3: 1 RTT
TLS 1.3 resumption: 0 RTT
DNS RESOLUTION:
───────────────────────────────────────────────────────────────────────────
Cached locally: <1 ms
Cached at resolver: 1-10 ms
Full resolution: 50-200 ms
12.3 Reducing Latency
LATENCY OPTIMIZATION STRATEGIES
1. MOVE CLOSER TO USERS
- CDN for static content
   - Edge computing for dynamic content
- Multi-region deployment
2. REDUCE ROUND TRIPS
- Connection pooling (avoid TCP handshake)
- HTTP/2, HTTP/3 (multiplexing)
- TLS session resumption
- DNS prefetching
3. REDUCE DATA SIZE
- Compression (gzip, brotli)
- Efficient formats (protobuf vs JSON)
- Pagination
4. PARALLELIZE
- Concurrent requests
- Async I/O
- Connection multiplexing
5. CACHE AGGRESSIVELY
- Browser cache
- CDN cache
- Application cache
- Database query cache
Chapter 13: Bandwidth and Throughput
13.1 Understanding the Difference
BANDWIDTH vs THROUGHPUT vs LATENCY
BANDWIDTH (Capacity):
Maximum data rate of a link
Like: Width of a highway (lanes)
Measured in: bits per second (bps, Mbps, Gbps)
THROUGHPUT (Actual Rate):
Actual data rate achieved
Like: Number of cars actually passing per hour
Always <= Bandwidth
Measured in: bits per second
LATENCY (Delay):
Time for one unit to travel
Like: Time for one car to travel the highway
Measured in: milliseconds
EXAMPLE:
Pipe: 1 Gbps bandwidth
Actual throughput: 800 Mbps (due to overhead, congestion)
Latency: 50 ms
Sending 100 MB:
Time = Latency + (Size / Throughput)
Time = 50ms + (800 Mb / 800 Mbps)
Time = 50ms + 1000ms
Time = 1050ms
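The worked example above can be checked with a one-line helper:

```python
def transfer_time_ms(size_bytes: float, throughput_bps: float,
                     latency_ms: float) -> float:
    """Total transfer time = latency + size / throughput."""
    return latency_ms + size_bytes * 8 / throughput_bps * 1000

# 100 MB over an 800 Mbps effective pipe with 50 ms of latency:
print(transfer_time_ms(100e6, 800e6, 50))  # 1050.0
```

For large transfers the size/throughput term dominates; for small API calls the latency term does, which is why the optimization strategies below split along the same line.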
13.2 Bandwidth-Delay Product
BANDWIDTH-DELAY PRODUCT (BDP)
BDP = Bandwidth × RTT
This is how much data can be "in flight" in the network.
EXAMPLE:
Bandwidth: 1 Gbps
RTT: 100 ms
BDP = 1 Gbps × 100ms = 100 Mb = 12.5 MB
To fully utilize this link, you need 12.5 MB of data
in transit at any time!
WHY IT MATTERS:
- TCP receive window must be >= BDP
- Default window often too small for long-fat networks
- Need TCP window scaling for high-bandwidth, high-latency links
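The BDP arithmetic from the example, as a small function:

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: how many bytes must be in flight
    to keep the link fully utilized."""
    return bandwidth_bps * rtt_seconds / 8  # bits in flight -> bytes

print(bdp_bytes(1e9, 0.100))  # 12500000.0 bytes = 12.5 MB
```

Compare that 12.5 MB against a default TCP receive window of 64 KB (without window scaling): the sender would stall after 64 KB each RTT, using well under 1% of the link.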
Chapter 14: Network Address Translation (NAT)
14.1 How NAT Works
NAT (Network Address Translation)
PROBLEM:
Not enough public IPv4 addresses for every device
SOLUTION:
Multiple devices share one public IP
HOW IT WORKS:
Private Network NAT Router Internet
─────────────────────────────────────────────────────────────────
Device A: 192.168.1.100
│
│ Src: 192.168.1.100:54321
│ Dst: 93.184.216.34:443
│
└──────────────▶ NAT ──────────────▶ Server
│
│ Src: 203.0.113.1:10001 (NAT's public IP:port)
│ Dst: 93.184.216.34:443
│
│ NAT Table:
│ ┌────────────────────┬──────────────────┐
│ │ Internal │ External │
│ ├────────────────────┼──────────────────┤
│ │ 192.168.1.100:54321│ 203.0.113.1:10001│
│ │ 192.168.1.101:54322│ 203.0.113.1:10002│
│ └────────────────────┴──────────────────┘
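The NAT table above is essentially a bidirectional map keyed by (internal IP, internal port). A toy sketch of port-address translation (a model, not a real router; addresses match the diagram):

```python
import itertools

class Nat:
    """Toy NAT: maps (private_ip, private_port) to a fresh port on the
    router's single public IP, and back again for return traffic."""
    def __init__(self, public_ip: str, first_port: int = 10001):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)  # next free public port
        self.table = {}    # (priv_ip, priv_port) -> (pub_ip, pub_port)
        self.reverse = {}  # pub_port -> (priv_ip, priv_port)

    def translate_out(self, priv_ip: str, priv_port: int):
        key = (priv_ip, priv_port)
        if key not in self.table:          # first packet of this flow
            pub_port = next(self._ports)
            self.table[key] = (self.public_ip, pub_port)
            self.reverse[pub_port] = key
        return self.table[key]

    def translate_in(self, pub_port: int):
        # Return traffic with no mapping is simply dropped (None).
        return self.reverse.get(pub_port)

nat = Nat("203.0.113.1")
print(nat.translate_out("192.168.1.100", 54321))  # ('203.0.113.1', 10001)
print(nat.translate_out("192.168.1.101", 54322))  # ('203.0.113.1', 10002)
print(nat.translate_in(10001))                    # ('192.168.1.100', 54321)
```

The `translate_in` returning `None` for unknown ports is exactly why unsolicited inbound connections fail behind NAT, which motivates the hole-punching techniques in the next section.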
14.2 NAT Types and Implications
NAT TYPES
1. FULL CONE NAT
Any external host can send to mapped port
Most permissive
2. RESTRICTED CONE NAT
Only hosts that internal client contacted can send back
3. SYMMETRIC NAT
Different mapping for each destination
Most restrictive, breaks some P2P protocols
SYSTEM DESIGN IMPLICATIONS:
1. INBOUND CONNECTIONS
Can't directly connect to device behind NAT
Solutions: Port forwarding, UPnP, hole punching
2. P2P COMMUNICATION
Both clients behind NAT = difficult
Solutions: STUN, TURN, ICE (used by WebRTC)
3. CONNECTION LIMITS
NAT port exhaustion possible with many connections
Part V: Interview Questions and Answers
Chapter 15: Common Interview Questions
15.1 TCP/UDP Questions
Q: "What's the difference between TCP and UDP? When would you use each?"
GREAT ANSWER:
"TCP is a reliable, connection-oriented protocol that guarantees in-order
delivery through acknowledgments and retransmissions. It has congestion
control built in. The tradeoff is higher latency due to the three-way
handshake and head-of-line blocking.
UDP is a lightweight, connectionless protocol with no delivery guarantees.
It's faster since there's no handshake, and packets are independent.
I'd use TCP for:
- HTTP APIs where every request must succeed
- File transfers where completeness matters
- Database connections
I'd use UDP for:
- Real-time video/voice where late data is useless
- DNS queries (simple request-response)
- Gaming where speed matters more than occasional loss
- QUIC/HTTP3 which builds reliability on top of UDP"
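The "no handshake" point is easy to demonstrate with raw sockets. A minimal UDP exchange over loopback: the very first packet carries data, where TCP would first need listen/connect/accept.

```python
import socket

# UDP: no connection setup. Bind a receiver, then just send a datagram.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"ping", addr)         # the first packet IS the payload
data, peer = server.recvfrom(1024)
print(data)                          # b'ping'

client.close()
server.close()
```

On loopback this always arrives; over a real network the same code gets no delivery guarantee at all, which is precisely the TCP/UDP tradeoff described above.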
────────────────────────────────────────────────────────────────────────────
Q: "Why does TCP use a three-way handshake instead of two-way?"
GREAT ANSWER:
"The three-way handshake establishes that BOTH sides can send AND receive.
With a two-way handshake:
- Client sends SYN
- Server sends SYN-ACK
The server knows the client can send and it can receive. But the client
doesn't know if the server received its SYN — maybe the SYN-ACK was for
an old connection.
The third ACK confirms the client received the server's SYN-ACK, so the
server knows its message got through. Both sides now have proof of
bidirectional communication.
It also prevents replay attacks where old SYN packets could establish
unwanted connections."
────────────────────────────────────────────────────────────────────────────
Q: "What is TCP head-of-line blocking and how does HTTP/3 solve it?"
GREAT ANSWER:
"TCP guarantees in-order delivery. If packet 2 is lost but packets 3, 4, 5
arrive, TCP can't deliver them to the application until packet 2 is
retransmitted and received.
In HTTP/2, multiple streams share one TCP connection. If one stream loses
a packet, ALL streams are blocked waiting for retransmission, even if their
data arrived fine.
HTTP/3 uses QUIC over UDP. QUIC implements its own reliability per-stream.
If stream A loses a packet, only stream A is blocked. Streams B and C
continue normally. This dramatically improves performance on lossy networks
like mobile."
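The difference between the two delivery models can be shown with a toy simulation. Packets carry (stream, seq); packet (B, 0) is lost. "TCP-style" delivery stalls at the first gap in the single global sequence; "QUIC-style" delivery only stalls the stream that has the gap:

```python
# (B, 0) was lost in transit; everything else arrived.
sent_order = [("A", 0), ("A", 1), ("B", 0), ("B", 1), ("C", 0)]
arrived = [("A", 0), ("A", 1), ("B", 1), ("C", 0)]

def tcp_deliverable(packets):
    """One global byte stream: deliver until the first gap, then block."""
    got = set(packets)
    out = []
    for p in sent_order:
        if p not in got:
            break                   # gap: ALL later data is held back
        out.append(p)
    return out

def quic_deliverable(packets):
    """Per-stream sequencing: a gap in one stream blocks only that stream."""
    out, next_seq = [], {}
    for sid, seq in sorted(packets):
        if seq == next_seq.get(sid, 0):
            out.append((sid, seq))
            next_seq[sid] = seq + 1
    return out

print(tcp_deliverable(arrived))   # [('A', 0), ('A', 1)] -- C blocked by B's loss
print(quic_deliverable(arrived))  # [('A', 0), ('A', 1), ('C', 0)]
```

Stream C's data arrived intact in both models, but only the QUIC-style model hands it to the application, which is the whole HTTP/3 argument in miniature.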
15.2 HTTP Questions
Q: "Walk me through what happens when you type a URL in the browser."
GREAT ANSWER:
"Let me walk through each step:
1. DNS Resolution:
- Browser checks its cache
- OS checks its cache
- Query goes to recursive resolver
- If not cached, resolver queries root → TLD → authoritative
- Returns IP address
2. TCP Connection:
- Browser opens TCP connection to server IP:443
- Three-way handshake: SYN → SYN-ACK → ACK
- Takes 1.5 RTT
3. TLS Handshake:
- ClientHello with supported ciphers
- ServerHello with certificate
- Key exchange (usually ECDHE)
- Both sides derive session keys
- Takes 1-2 RTT (TLS 1.3 is 1 RTT)
4. HTTP Request:
- Browser sends GET request with headers
- Host, Accept, Cookies, etc.
5. Server Processing:
- Load balancer routes to server
- Application processes request
- May query databases, caches
6. HTTP Response:
- Server sends status code, headers, body
- Chunked encoding for streaming
- Compression if supported (gzip/brotli)
7. Rendering:
- Browser parses HTML
- Discovers CSS, JS, images
- Makes additional requests (hopefully on same connection)
- Builds DOM, CSSOM, renders
If I were optimizing this, I'd focus on:
- CDN to reduce RTT
- HTTP/2 or HTTP/3 for multiplexing
- Connection keep-alive
- DNS prefetching
- Caching at every layer"
────────────────────────────────────────────────────────────────────────────
Q: "Explain the difference between HTTP/1.1, HTTP/2, and HTTP/3."
GREAT ANSWER:
"HTTP/1.1:
- Text-based protocol
- One request per connection at a time
- Head-of-line blocking (have to wait for response before next request)
- Browsers work around this by opening 6-8 connections
- Headers sent uncompressed, repeatedly
HTTP/2:
- Binary protocol
- Multiplexing: multiple streams over single connection
- Header compression (HPACK)
- Server push (rarely used now)
- Still uses TCP, so TCP head-of-line blocking affects all streams
HTTP/3:
- Uses QUIC over UDP instead of TCP
- Independent streams (no cross-stream blocking)
- Built-in TLS 1.3 (faster handshake)
- Connection migration (survives IP changes)
- Better for lossy networks (mobile)
In a system design context, I'd consider HTTP/3 for:
- Mobile-heavy applications
- Real-time features
- Global users with varying network quality
But HTTP/2 is still the safe default for most applications since it's
widely supported and HTTP/3 is still being adopted."
15.3 Network Architecture Questions
Q: "How would you design the network architecture for a multi-region deployment?"
GREAT ANSWER:
"I'd approach this in layers:
1. Global Traffic Management:
- Route 53 or Cloudflare for GeoDNS
- Route users to nearest region based on latency
- Health checks with automatic failover
2. Edge Layer:
- CDN for static content (CloudFront, Cloudflare)
- Edge functions for simple customization
- DDoS protection at edge
3. Regional Architecture:
- VPC per region with /16 CIDR
- Multiple availability zones (at least 2)
- Public subnets for load balancers
- Private subnets for application servers
- Database subnets with no internet access
4. Cross-Region Communication:
- VPC peering or Transit Gateway
- Private connectivity (not over internet)
- Consider latency for sync vs async
5. Data Strategy:
- Primary database in one region
- Read replicas in other regions
- OR active-active with conflict resolution
- Async replication for non-critical data
Key considerations:
- Data residency requirements (GDPR)
- Cost of cross-region traffic
- Acceptable latency for cross-region calls
- Consistency vs availability tradeoffs"
────────────────────────────────────────────────────────────────────────────
Q: "How do you handle 10,000 concurrent WebSocket connections?"
GREAT ANSWER:
"First, let me break down the challenges:
1. Connection Management:
- Each connection = file descriptor + memory (~10KB)
- 10K connections = ~100MB memory just for connections
- Need to tune OS limits (ulimit, sysctl)
2. Server Architecture:
- Use event-driven runtime (Node.js, Go, Rust)
- Avoid thread-per-connection (doesn't scale)
- Connection timeout and heartbeat handling
3. Load Balancing:
- Sticky sessions required (WebSocket = stateful)
- IP hash or cookie-based routing
- Health checks that understand WebSocket
4. Horizontal Scaling:
- Users on different servers need to communicate
- Pub/sub backbone with Redis or Kafka
- Server 1 publishes, Server 2 subscribes and delivers
5. Specific numbers for 10K connections:
- One modern server can handle 10K easily
- For redundancy, use 3+ servers with ~4K each
- Redis Pub/Sub for cross-server messaging
- Monitor connections per server, memory usage
If we need 100K+ connections, I'd consider:
- Dedicated WebSocket servers (separate from API)
- Connection proxies (e.g., Envoy)
- Sharding by channel/room"
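The pub/sub backbone in step 4 can be sketched with an in-memory bus standing in for Redis (a toy model; real deployments would use `redis-py` or similar):

```python
from collections import defaultdict

class PubSub:
    """In-memory stand-in for the Redis Pub/Sub backbone: each WebSocket
    server subscribes to the rooms its local connections have joined."""
    def __init__(self):
        self.subscribers = defaultdict(list)   # channel -> callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        for cb in self.subscribers[channel]:   # fan out to every subscriber
            cb(message)

# Server 2 holds a WebSocket in "room:42"; Server 1 publishes into it.
bus = PubSub()
delivered = []
bus.subscribe("room:42", delivered.append)     # server 2's local fan-out
bus.publish("room:42", "hello from server 1")  # server 1's broadcast
print(delivered)                               # ['hello from server 1']
```

The key property: Server 1 never needs to know which server holds the target connection; it publishes to the channel, and whichever server subscribed delivers locally.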
15.4 Security Questions
Q: "How does HTTPS protect data in transit?"
GREAT ANSWER:
"HTTPS uses TLS to provide three security properties:
1. Authentication:
- Server presents certificate signed by trusted CA
- Client verifies certificate chain to root CA
- Ensures you're talking to the real server, not imposter
2. Confidentiality:
- Asymmetric encryption (RSA/ECDHE) for key exchange
- Symmetric encryption (AES-GCM) for data
- Only client and server can read the data
3. Integrity:
- HMAC or AEAD ensures data wasn't modified
- Any tampering is detected
The TLS handshake:
1. Client sends supported cipher suites
2. Server sends certificate and chosen cipher
3. Key exchange (usually ECDHE for forward secrecy)
4. Both derive session keys
5. All subsequent data is encrypted
For system design, HTTPS is non-negotiable. The ~1-2% CPU overhead
is worth it. I'd also consider mTLS for service-to-service auth in
a microservices architecture."
────────────────────────────────────────────────────────────────────────────
Q: "What's the difference between authentication and authorization?"
GREAT ANSWER:
"Authentication is WHO you are. Authorization is WHAT you can do.
Authentication:
- Verifying identity
- 'Is this really user X?'
- Methods: password, MFA, OAuth, certificates, biometrics
- Results in: identity token, session
Authorization:
- Verifying permissions
- 'Can user X perform action Y on resource Z?'
- Methods: RBAC, ABAC, ACLs, policy engines
- Results in: allow/deny decision
Example flow:
1. User logs in with password (authentication)
2. Server issues JWT with user ID (authentication complete)
3. User tries to delete a document
4. Server checks: 'Can this user delete this document?' (authorization)
5. Server allows or denies based on permissions
In system design:
- Centralize authentication (OAuth provider, Auth0, Cognito)
- Distribute authorization checks to services
- Consider policy-as-code (OPA, Cedar) for complex rules
- Cache authorization decisions where safe"
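The example flow above reduces to a small RBAC check. A minimal sketch with illustrative names (real systems would back this with a policy engine or database):

```python
# Authorization table: role -> set of permissions. Authentication has
# already happened; we only decide WHAT the identified user may do.
ROLE_PERMS = {
    "viewer": {"document:read"},
    "editor": {"document:read", "document:write"},
    "admin":  {"document:read", "document:write", "document:delete"},
}
USER_ROLES = {"alice": "editor", "bob": "viewer"}  # illustrative users

def is_authorized(user: str, permission: str) -> bool:
    role = USER_ROLES.get(user)                       # unknown user -> no role
    return permission in ROLE_PERMS.get(role, set())  # unknown role -> deny

print(is_authorized("alice", "document:write"))   # True
print(is_authorized("bob", "document:delete"))    # False
```

Note the default-deny shape: an unknown user or unmapped role falls through to an empty permission set rather than raising or accidentally allowing.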
Summary
┌────────────────────────────────────────────────────────────────────────┐
│ NETWORKING KEY TAKEAWAYS │
│ │
│ NETWORK LAYERS: │
│ • Know TCP/IP model (Application, Transport, Internet, Network) │
│ • Understand L4 (TCP/UDP) vs L7 (HTTP) in load balancer context │
│ │
│ TCP: │
│ • Reliable, ordered, connection-oriented │
│ • 3-way handshake adds latency (1.5 RTT) │
│ • Head-of-line blocking affects all streams │
│ • Flow control (receive window) and congestion control built-in │
│ │
│ UDP: │
│ • Fast, unreliable, connectionless │
│ • Use for: real-time, DNS, QUIC │
│ │
│ HTTP EVOLUTION: │
│ • HTTP/1.1: Text, one request at a time, multiple connections │
│ • HTTP/2: Binary, multiplexing, header compression, still TCP │
│ • HTTP/3: QUIC over UDP, no head-of-line blocking │
│ │
│ WEBSOCKET: │
│ • Bidirectional, persistent connection │
│ • Scaling requires pub/sub backbone │
│ • Consider SSE for server-push only │
│ │
│ SECURITY: │
│ • TLS provides auth, confidentiality, integrity │
│ • mTLS for service-to-service │
│ • DNSSEC for DNS integrity │
│ │
│ PERFORMANCE: │
│ • Latency = propagation + transmission + processing + queuing │
│ • Speed of light limits minimum latency │
│ • Connection pooling reduces handshake overhead │
│ • CDN moves content closer to users │
│ │
└────────────────────────────────────────────────────────────────────────┘
📚 Further Reading
RFCs (The Source)
- RFC 793 - TCP: https://tools.ietf.org/html/rfc793
- RFC 768 - UDP: https://tools.ietf.org/html/rfc768
- RFC 9000 - QUIC: https://tools.ietf.org/html/rfc9000
- RFC 9114 - HTTP/3: https://tools.ietf.org/html/rfc9114
- RFC 8446 - TLS 1.3: https://tools.ietf.org/html/rfc8446
Books
- Computer Networking: A Top-Down Approach by Kurose & Ross
- TCP/IP Illustrated, Volume 1 by W. Richard Stevens
- High Performance Browser Networking by Ilya Grigorik (free online)
Interactive Learning
- How DNS Works: https://howdns.works/
- How HTTPS Works: https://howhttps.works/
- Cloudflare Learning Center: https://www.cloudflare.com/learning/
Tools
- Wireshark: Packet analysis
- curl: HTTP debugging
- dig/nslookup: DNS debugging
- tcpdump: Command-line packet capture
- mtr: Network diagnostics (traceroute + ping)
Engineering Blogs
- Cloudflare Blog: https://blog.cloudflare.com/
- Netflix Tech Blog (networking posts): https://netflixtechblog.com/
- Google Cloud Blog: https://cloud.google.com/blog/
End of Week 0 — Part 4: Networking Fundamentals
Next: Part 5 will cover Operating System concepts essential for system design — processes, threads, memory, I/O models, and concurrency patterns.