Week 0 — Part 1: System Design Fundamentals
The Foundation Before The Journey
Introduction: Why Week 0?
Before we dive into distributed systems, failure handling, and complex architectures, we need a solid foundation. This "Week 0" ensures everyone starts with the same vocabulary, mental models, and understanding of the building blocks.
What This Covers:
- Part 1: The System Design Interview Framework
- Part 2: Core Infrastructure Components
- Part 3: Back-of-the-Envelope Estimation Mastery
Think of this as the "boot camp" before the main training begins.
Chapter 1: What Is System Design?
1.1 The Big Picture
System design is the process of defining the architecture, components, modules, interfaces, and data flow of a system to satisfy specified requirements.
┌─────────────────────────────────────────────────────────────────────────┐
│ THE SYSTEM DESIGN SPECTRUM │
│ │
│ REQUIREMENTS │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ HIGH-LEVEL DESIGN (HLD) │ │
│ │ │ │
│ │ "What are the major components and how do they interact?" │ │
│ │ │ │
│ │ • System architecture │ │
│ │ • Component identification │ │
│ │ • Data flow between components │ │
│ │ • Technology choices │ │
│ │ • Scalability strategy │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ LOW-LEVEL DESIGN (LLD) │ │
│ │ │ │
│ │ "How does each component work internally?" │ │
│ │ │ │
│ │ • Database schemas │ │
│ │ • API contracts │ │
│ │ • Class diagrams │ │
│ │ • Algorithm details │ │
│ │ • Error handling │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ IMPLEMENTATION │
│ │
└─────────────────────────────────────────────────────────────────────────┘
1.2 High-Level Design (HLD) Explained
HLD answers: "What are we building and how do the pieces fit together?"
What HLD Includes
HIGH-LEVEL DESIGN COMPONENTS
│
├── 1. ARCHITECTURE DIAGRAM
│ └── Visual representation of system components
│ └── Shows how components connect
│ └── Identifies boundaries (internal vs external)
│
├── 2. COMPONENT IDENTIFICATION
│ └── What services/modules exist?
│ └── What is each responsible for?
│ └── Single Responsibility Principle
│
├── 3. DATA FLOW
│ └── How does data move through the system?
│ └── Request → Processing → Response path
│ └── Async vs Sync communication
│
├── 4. TECHNOLOGY CHOICES
│ └── Database: SQL vs NoSQL
│ └── Cache: Redis vs Memcached
│ └── Message Queue: Kafka vs RabbitMQ
│ └── Justification for each choice
│
├── 5. SCALABILITY STRATEGY
│ └── Horizontal vs Vertical scaling
│ └── Stateless vs Stateful components
│ └── Load balancing approach
│
└── 6. NON-FUNCTIONAL REQUIREMENTS
└── Latency targets (P50, P99)
└── Availability targets (99.9%, 99.99%)
└── Throughput requirements
HLD Example: URL Shortener
┌─────────────────────────────────────────────────────────────────────────┐
│ URL SHORTENER - HIGH LEVEL DESIGN │
│ │
│ ┌─────────────┐ │
│ │ USERS │ │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌─────────────┐ │
│ │ CDN │ │
│ │ (Static) │ │
│ └──────┬──────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────┐ │
│ │ LOAD BALANCER │ │
│ └───────────┬───────────┘ │
│ │ │
│ ┌─────────────┼─────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ API │ │ API │ │ API │ │
│ │ Server 1 │ │ Server 2 │ │ Server N │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │
│ └────────────┼────────────┘ │
│ │ │
│ ┌─────────────────┼─────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ CACHE │ │ DATABASE │ │ COUNTER │ │
│ │ (Redis) │ │ (Primary) │ │ SERVICE │ │
│ │ │ │ │ │ │ │
│ │ short→long│ │ Mapping │ │ ID Gen │ │
│ │ mapping │ │ Storage │ │ │ │
│ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ Components: │
│ • CDN: Serves static content (landing page) │
│ • Load Balancer: Distributes traffic across API servers │
│ • API Servers: Handle create/redirect requests (stateless) │
│ • Cache: Fast lookup for popular URLs │
│ • Database: Persistent storage for URL mappings │
│ • Counter Service: Generates unique IDs for short codes │
│ │
└─────────────────────────────────────────────────────────────────────────┘
1.3 Low-Level Design (LLD) Explained
LLD answers: "How does each piece work internally?"
What LLD Includes
LOW-LEVEL DESIGN COMPONENTS
│
├── 1. DATABASE SCHEMA
│ └── Table definitions
│ └── Columns and data types
│ └── Indexes and constraints
│ └── Relationships (foreign keys)
│
├── 2. API CONTRACTS
│ └── Endpoints (REST/GraphQL/gRPC)
│ └── Request/Response formats
│ └── Error codes and messages
│ └── Rate limiting rules
│
├── 3. CLASS/MODULE DESIGN
│ └── Classes and their responsibilities
│ └── Methods and their signatures
│ └── Inheritance/Composition
│ └── Design patterns used
│
├── 4. ALGORITHMS
│ └── Core business logic
│ └── Time/Space complexity
│ └── Edge-case handling
│
├── 5. ERROR HANDLING
│ └── Exception types
│ └── Retry strategies
│ └── Fallback behaviors
│
└── 6. DATA STRUCTURES
└── In-memory data structures
└── Cache key formats
└── Message formats
LLD Example: URL Shortener
# DATABASE SCHEMA
"""
Table: urls
+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| id             | BIGINT       | NO   | PRI | NULL    | AI    |
| short_code     | VARCHAR(10)  | NO   | UNI | NULL    |       |
| original_url   | VARCHAR(2048)| NO   |     | NULL    |       |
| user_id        | BIGINT       | YES  | MUL | NULL    |       |
| created_at     | TIMESTAMP    | NO   |     | NOW()   |       |
| expires_at     | TIMESTAMP    | YES  |     | NULL    |       |
| click_count    | BIGINT       | NO   |     | 0       |       |
+----------------+--------------+------+-----+---------+-------+
Indexes:
- PRIMARY KEY (id)
- UNIQUE INDEX idx_short_code (short_code)
- INDEX idx_user_id (user_id)
- INDEX idx_expires_at (expires_at) -- For cleanup job
"""
# API CONTRACT
"""
POST /api/v1/shorten

Request:
{
    "url": "https://very-long-url.com/path/to/page",
    "custom_code": "optional-custom",   // optional
    "expires_in": 86400                 // optional, seconds
}

Response (201 Created):
{
    "short_url": "https://short.url/abc123",
    "short_code": "abc123",
    "original_url": "https://very-long-url.com/path/to/page",
    "expires_at": "2024-01-16T10:00:00Z",
    "created_at": "2024-01-15T10:00:00Z"
}

Response (400 Bad Request):
{
    "error": "invalid_url",
    "message": "The provided URL is not valid"
}

Response (409 Conflict):
{
    "error": "code_taken",
    "message": "The custom code is already in use"
}

---

GET /{short_code}

Response (302 Found):
Headers:
    Location: https://original-url.com/path
    Cache-Control: private, max-age=3600

Response (404 Not Found):
{
    "error": "not_found",
    "message": "Short URL not found"
}

Response (410 Gone):
{
    "error": "expired",
    "message": "This short URL has expired"
}
"""
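# CLASS DESIGN
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import urlparse

CACHE_TTL_SECONDS = 3600


class InvalidURLError(ValueError):
    """The submitted URL is not an absolute http(s) URL."""


class CodeTakenError(Exception):
    """The requested custom code is already in use."""


class URLNotFoundError(LookupError):
    """No URL is stored under the short code."""


class URLExpiredError(Exception):
    """The short URL exists but its expiry time has passed."""


def _utcnow() -> datetime:
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


@dataclass
class URL:
    """Domain entity for a shortened URL."""
    id: Optional[int]
    short_code: str
    original_url: str
    user_id: Optional[int]
    created_at: datetime
    expires_at: Optional[datetime]
    click_count: int = 0

    def is_expired(self) -> bool:
        if self.expires_at is None:
            return False
        return _utcnow() > self.expires_at

    def increment_clicks(self) -> None:
        self.click_count += 1


class ShortCodeGenerator:
    """Generates unique short codes for URLs."""

    def __init__(self, counter_service):
        self.counter = counter_service
        self.alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

    def generate(self) -> str:
        """Generate a unique short code using counter-based approach."""
        # Get next unique ID from distributed counter
        unique_id = self.counter.get_next()
        # Convert to base62
        return self._to_base62(unique_id)

    def _to_base62(self, num: int) -> str:
        """Convert a non-negative number to a base62 string."""
        if num == 0:
            return self.alphabet[0]
        result = []
        while num > 0:
            result.append(self.alphabet[num % 62])
            num //= 62
        return ''.join(reversed(result))


class URLShortenerService:
    """Main service for URL shortening operations."""

    def __init__(self, repository, cache, code_generator):
        self.repository = repository
        self.cache = cache
        self.code_generator = code_generator

    def shorten(self, original_url: str, user_id: Optional[int] = None,
                custom_code: Optional[str] = None,
                expires_in: Optional[int] = None) -> URL:
        """Create a shortened URL."""
        # Validate URL
        if not self._is_valid_url(original_url):
            raise InvalidURLError(original_url)
        # Generate or validate short code. The exists() check is only a fast
        # path: two concurrent requests can both pass it, so the UNIQUE index
        # on short_code remains the real guard against duplicates.
        if custom_code:
            if self.repository.exists(custom_code):
                raise CodeTakenError(custom_code)
            short_code = custom_code
        else:
            short_code = self.code_generator.generate()
        # Calculate expiration
        now = _utcnow()
        expires_at = None
        if expires_in:
            expires_at = now + timedelta(seconds=expires_in)
        # Create URL entity
        url = URL(
            id=None,
            short_code=short_code,
            original_url=original_url,
            user_id=user_id,
            created_at=now,
            expires_at=expires_at
        )
        # Persist
        saved_url = self.repository.save(url)
        # Cache for fast lookups, but never longer than the URL itself lives
        self.cache.set(short_code, original_url, ttl=self._cache_ttl(expires_at))
        return saved_url

    def resolve(self, short_code: str) -> str:
        """Resolve short code to original URL."""
        # Try cache first
        cached = self.cache.get(short_code)
        if cached is not None:
            # Async increment (don't block redirect)
            self._async_increment_clicks(short_code)
            return cached
        # Cache miss - query database
        url = self.repository.find_by_code(short_code)
        if not url:
            raise URLNotFoundError(short_code)
        if url.is_expired():
            raise URLExpiredError(short_code)
        # Update cache
        self.cache.set(short_code, url.original_url,
                       ttl=self._cache_ttl(url.expires_at))
        # Async increment
        self._async_increment_clicks(short_code)
        return url.original_url

    def _cache_ttl(self, expires_at: Optional[datetime]) -> int:
        """Cap the cache TTL so a cache hit cannot outlive the URL's expiry."""
        if expires_at is None:
            return CACHE_TTL_SECONDS
        remaining = int((expires_at - _utcnow()).total_seconds())
        return max(1, min(CACHE_TTL_SECONDS, remaining))

    def _is_valid_url(self, url: str) -> bool:
        """Accept only absolute http(s) URLs with a host."""
        parsed = urlparse(url)
        return parsed.scheme in ("http", "https") and bool(parsed.netloc)

    def _async_increment_clicks(self, short_code: str) -> None:
        """Record a click without blocking the redirect."""
        # Assumption: the repository exposes increment_clicks(). A real
        # deployment would enqueue this work instead of calling it inline.
        self.repository.increment_clicks(short_code)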
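The base62 step in ShortCodeGenerator can be checked in isolation. The following is a minimal sketch that assumes the same 62-character alphabet; decode_base62 and capacity are illustrative helpers and are not part of the design above.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BASE = len(ALPHABET)  # 62


def encode_base62(num: int) -> str:
    """Encode a non-negative counter value as a base62 string."""
    if num < 0:
        raise ValueError("num must be non-negative")
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num > 0:
        num, remainder = divmod(num, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62 (hypothetical helper for round-trip checks)."""
    num = 0
    for char in code:
        num = num * BASE + ALPHABET.index(char)
    return num


def capacity(length: int) -> int:
    """How many counter values encode to at most `length` characters."""
    return BASE ** length


print(encode_base62(62))   # first two-character code
print(capacity(7))         # counter values that fit in 7 characters
```

Seven characters cover 62^7, roughly 3.5 trillion, counter values, which gives a feel for how long codes need to be at a given scale.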
1.4 HLD vs LLD: When to Use Each
┌────────────────────────────────────────────────────────────────────────┐
│ HLD vs LLD COMPARISON │
│ │
│ ┌─────────────────────────────┐ ┌─────────────────────────────┐ │
│ │ HIGH-LEVEL DESIGN │ │ LOW-LEVEL DESIGN │ │
│ ├─────────────────────────────┤ ├─────────────────────────────┤ │
│ │ │ │ │ │
│ │ AUDIENCE: │ │ AUDIENCE: │ │
│ │ • Stakeholders │ │ • Developers │ │
│ │ • Architects │ │ • Tech Leads │ │
│ │ • Product Managers │ │ • Code Reviewers │ │
│ │ │ │ │ │
│ │ FOCUS: │ │ FOCUS: │ │
│ │ • What components exist │ │ • How components work │ │
│ │ • How they interact │ │ • Implementation details │ │
│ │ • Why this architecture │ │ • Data structures │ │
│ │ │ │ │ │
│ │ ABSTRACTION: │ │ ABSTRACTION: │ │
│ │ • Boxes and arrows │ │ • Code and schemas │ │
│ │ • Component names │ │ • Method signatures │ │
│ │ • Data flow │ │ • Algorithm logic │ │
│ │ │ │ │ │
│ │ INTERVIEW TIME: │ │ INTERVIEW TIME: │ │
│ │ • First 20-30 minutes │ │ • Last 15-20 minutes │ │
│ │ • Most of the discussion │ │ • Deep dive on 1-2 areas │ │
│ │ │ │ │ │
│ └─────────────────────────────┘ └─────────────────────────────┘ │
│ │
│ INTERVIEW TIP: │
│ Start with HLD (big picture), then drill into LLD for critical │
│ components. Interviewers often ask: "Let's dive deeper into X" │
│ │
└────────────────────────────────────────────────────────────────────────┘
Chapter 2: The System Design Interview Framework
2.1 The Four-Step Framework
Most system design interviews can be structured into four steps:
┌────────────────────────────────────────────────────────────────────────┐
│ THE FOUR-STEP INTERVIEW FRAMEWORK │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ STEP 1: REQUIREMENTS CLARIFICATION (3-5 minutes) │ │
│ │ │ │
│ │ • Ask questions about scope │ │
│ │ • Understand functional requirements │ │
│ │ • Identify non-functional requirements │ │
│ │ • Clarify constraints and assumptions │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ STEP 2: BACK-OF-ENVELOPE ESTIMATION (5-7 minutes) │ │
│ │ │ │
│ │ • Calculate traffic volume │ │
│ │ • Estimate storage requirements │ │
│ │ • Determine bandwidth needs │ │
│ │ • Size infrastructure (servers, databases) │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ STEP 3: HIGH-LEVEL DESIGN (15-20 minutes) │ │
│ │ │ │
│ │ • Draw the architecture diagram │ │
│ │ • Identify core components │ │
│ │ • Define data flow │ │
│ │ • Discuss technology choices │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ STEP 4: DEEP DIVE (10-15 minutes) │ │
│ │ │ │
│ │ • Database schema design │ │
│ │ • API design │ │
│ │ • Algorithm for critical paths │ │
│ │ • Handle edge cases and failures │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
2.2 Step 1: Requirements Clarification
Goal: Understand what you're building before you build it.
Questions to Ask
FUNCTIONAL REQUIREMENTS (What the system does)
│
├── CORE FEATURES
│ • "What are the main features we need to support?"
│ • "What can users do with this system?"
│ • "Are there different user types with different capabilities?"
│
├── SCOPE BOUNDARIES
│ • "What features are OUT of scope for this discussion?"
│ • "Should I focus on a particular aspect?"
│ • "Are we designing the entire system or a specific component?"
│
├── DATA
│ • "What data do we need to store?"
│ • "How long do we keep the data?"
│ • "Who owns the data?"
│
└── INTEGRATIONS
• "What external systems do we interact with?"
• "Are there existing APIs we must use?"
NON-FUNCTIONAL REQUIREMENTS (How well the system performs)
│
├── SCALE
│ • "How many users do we expect?"
│ • "What's the read/write ratio?"
│ • "What's the expected growth rate?"
│
├── PERFORMANCE
│ • "What latency is acceptable?"
│ • "Any specific SLA requirements?"
│
├── AVAILABILITY
│ • "What's the uptime requirement?"
│ • "Can we have scheduled maintenance windows?"
│
├── CONSISTENCY
│ • "Do we need strong consistency?"
│ • "Is eventual consistency acceptable?"
│
└── GEOGRAPHY
• "Is this global or regional?"
• "Where are most users located?"
Example Dialogue
INTERVIEWER: "Design a Twitter-like social media platform."
YOU: "Before I start, I'd like to clarify some requirements.
For scope - should I focus on the core feed experience, or
include features like DMs, search, and trending topics?"
INTERVIEWER: "Focus on the core feed - posting and viewing tweets."
YOU: "Got it. For scale - what's our expected user base?"
INTERVIEWER: "Let's say 100 million monthly active users."
YOU: "And what's the read/write ratio? I imagine people view
tweets far more often than they post."
INTERVIEWER: "Yes, assume 1000:1 read to write ratio."
YOU: "For non-functional requirements - what latency targets?"
INTERVIEWER: "Feed loads should be under 200ms at P99."
YOU: "Perfect. Let me summarize:
- Core features: Post tweets, view home timeline
- 100M MAU
- 1000:1 read/write ratio
- P99 latency < 200ms for feed loads
- I'll assume eventual consistency is acceptable for the feed.
Does this sound right?"
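The numbers agreed in this dialogue can be turned into a rough traffic estimate. This is only a sketch: the daily-active fraction, posts per user per day, and peak factor are assumptions chosen for illustration, not figures given by the interviewer.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class TrafficEstimate:
    write_qps: float
    read_qps: float
    peak_read_qps: float


def estimate_traffic(mau: int, read_write_ratio: int,
                     dau_fraction: float = 0.2,           # assumption
                     posts_per_dau_per_day: float = 1.0,  # assumption
                     peak_factor: float = 2.0) -> TrafficEstimate:  # assumption
    """Convert monthly users and a read/write ratio into average and peak QPS."""
    daily_active_users = mau * dau_fraction
    writes_per_day = daily_active_users * posts_per_dau_per_day
    write_qps = writes_per_day / SECONDS_PER_DAY
    read_qps = write_qps * read_write_ratio
    return TrafficEstimate(write_qps, read_qps, read_qps * peak_factor)


estimate = estimate_traffic(mau=100_000_000, read_write_ratio=1000)
print(f"writes/s ~ {estimate.write_qps:,.0f}")
print(f"reads/s  ~ {estimate.read_qps:,.0f}")
print(f"peak reads/s ~ {estimate.peak_read_qps:,.0f}")
```

Changing any of the three assumed parameters shifts the result linearly, so it is worth stating them aloud and letting the interviewer correct them.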
2.3 Functional vs Non-Functional Requirements
┌──────────────────────────────────────────────────────────────────────┐
│ FUNCTIONAL vs NON-FUNCTIONAL REQUIREMENTS │
│ │
│ ┌───────────────────────────────┐ ┌────────────────────────────────┐│
│ │ FUNCTIONAL │ │ NON-FUNCTIONAL ││
│ │ (Features) │ │ (Quality Attributes) ││
│ ├───────────────────────────────┤ ├────────────────────────────────┤│
│ │ │ │ ││
│ │ "What the system DOES" │ │ "How WELL the system does it" ││
│ │ │ │ ││
│ │ Examples: │ │ Examples: ││
│ │ • User can create account │ │ • 99.99% uptime ││
│ │ • User can upload photos │ │ • <100ms latency ││
│ │ • User can search content │ │ • Handle 1M requests/sec ││
│ │ • User can send messages │ │ • Encrypt data at rest ││
│ │ • Admin can ban users │ │ • GDPR compliant ││
│ │ │ │ • Auto-scale during peaks ││
│ │ │ │ ││
│ │ Affects: │ │ Affects: ││
│ │ • Feature set │ │ • Architecture choices ││
│ │ • User interface │ │ • Technology selection ││
│ │ • Business logic │ │ • Infrastructure sizing ││
│ │ │ │ • Cost ││
│ │ │ │ ││
│ └───────────────────────────────┘ └────────────────────────────────┘│
│ │
│ INTERVIEW TIP: Always clarify BOTH types. Functional requirements │
│ tell you WHAT to build. Non-functional requirements tell you HOW │
│ to build it and what trade-offs to make. │
│ │
└──────────────────────────────────────────────────────────────────────┘
2.4 Common Non-Functional Requirements Reference
┌────────────────────────────────────────────────────────────────────────┐
│ NON-FUNCTIONAL REQUIREMENTS CHEATSHEET │
│ │
│ SCALABILITY │
│ ───────────────────────────────────────────────────────────────────── │
│ • Horizontal scaling (add more machines) │
│ • Vertical scaling (bigger machines) │
│ • Traffic: requests/second, concurrent users │
│ • Data: storage growth rate │
│ │
│ AVAILABILITY │
│ ───────────────────────────────────────────────────────────────────── │
│ 99% = 3.65 days downtime/year (low availability) │
│ 99.9% = 8.76 hours downtime/year (standard) │
│ 99.99% = 52.6 minutes downtime/year (high availability) │
│ 99.999% = 5.26 minutes downtime/year (mission critical) │
│ │
│ LATENCY │
│ ───────────────────────────────────────────────────────────────────── │
│ P50 = Median (50th percentile) │
│ P90 = 90% of requests are faster │
│ P99 = 99% of requests are faster │
│ P999 = 99.9% of requests are faster │
│ │
│ Typical targets: │
│ • API response: <100ms P99 │
│ • Page load: <1s P99 │
│ • Database query: <10ms P99 │
│ • Cache hit: <1ms P99 │
│ │
│ CONSISTENCY │
│ ───────────────────────────────────────────────────────────────────── │
│ Strong: Read always returns latest write │
│ Eventual: Read might return stale data temporarily │
│ Causal: Related events appear in order │
│ │
│ DURABILITY │
│ ───────────────────────────────────────────────────────────────────── │
│ • Data survives failures │
│ • Measured in "nines" (99.999999999% = 11 nines) │
│ • Achieved through replication │
│ │
│ THROUGHPUT │
│ ───────────────────────────────────────────────────────────────────── │
│ • Requests per second (RPS) │
│ • Transactions per second (TPS) │
│ • Messages per second (for queues) │
│ • Bytes per second (for data transfer) │
│ │
└────────────────────────────────────────────────────────────────────────┘
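The availability and latency rows of the cheatsheet can be reproduced with a few lines of arithmetic. A minimal sketch, assuming a 365-day year and the nearest-rank definition of a percentile (other definitions interpolate between samples); the function names are illustrative.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Allowed downtime per year for a given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


for target in (99, 99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} min/year")

# 98 fast requests and 2 slow ones: the median hides the slow tail, P99 does not.
latencies_ms = [10] * 98 + [500] * 2
print(percentile(latencies_ms, 50), percentile(latencies_ms, 99))
```

The second example is the reason latency targets are usually stated at P99 rather than as an average: a small fraction of slow requests barely moves the median.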
Chapter 3: Diagramming for System Design
3.1 Types of Diagrams
┌─────────────────────────────────────────────────────────────────────────┐
│ SYSTEM DESIGN DIAGRAM TYPES │
│ │
│ 1. ARCHITECTURE DIAGRAM (Most Common) │
│ • Shows components and their connections │
│ • High-level view of the entire system │
│ • Used in HLD phase │
│ │
│ 2. SEQUENCE DIAGRAM │
│ • Shows how components interact over time │
│ • Request/response flows │
│ • Used for complex workflows │
│ │
│ 3. DATA FLOW DIAGRAM │
│ • Shows how data moves through the system │
│ • Transformations at each step │
│ • Good for data pipelines │
│ │
│ 4. ENTITY-RELATIONSHIP DIAGRAM (ERD) │
│ • Shows database tables and relationships │
│ • Used in LLD for database design │
│ │
│ 5. STATE DIAGRAM │
│ • Shows states and transitions │
│ • Good for order status, user status, etc. │
│ │
└─────────────────────────────────────────────────────────────────────────┘
3.2 Architecture Diagram Basics
Components You'll Draw
┌─────────────────────────────────────────────────────────────────────────┐
│ COMMON DIAGRAM COMPONENTS │
│ │
│ USERS/CLIENTS SERVERS/SERVICES │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ 👤 User │ │ Service │ │
│ └─────────────┘ │ Name │ │
│ └─────────────┘ │
│ ┌─────────────┐ │
│ │ Mobile App │ │
│ └─────────────┘ DATABASES │
│ ┌─────────────┐ │
│ ┌─────────────┐ │ ╔═════════╗ │ │
│ │ Web Browser │ │ ║ DB ║ │ │
│ └─────────────┘ │ ╚═════════╝ │ │
│ └─────────────┘ │
│ LOAD BALANCERS │
│ ┌─────────────────┐ CACHES │
│ │ ┌─┐ ┌─┐ ┌─┐ │ ┌─────────────┐ │
│ │ └─┘ └─┘ └─┘ LB │ │ (( Cache ))│ │
│ └─────────────────┘ └─────────────┘ │
│ │
│ MESSAGE QUEUES CDN │
│ ┌─────────────────┐ ┌─────────────┐ │
│ │ ═══════════════ │ │ ☁ CDN │ │
│ │ Queue │ └─────────────┘ │
│ └─────────────────┘ │
│ │
│ ARROWS │
│ ─────────▶ Synchronous call (request/response) │
│ - - - - -▶ Asynchronous call (fire and forget) │
│ ◀────────▶ Bidirectional │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Layered Architecture Pattern
┌────────────────────────────────────────────────────────────────────────┐
│ STANDARD LAYERED ARCHITECTURE │
│ │
│ ┌─────────────┐ │
│ │ CLIENTS │ │
│ └──────┬──────┘ │
│ │ │
│ ┌─────────────────────────────┼─────────────────────────────┐ │
│ │ PRESENTATION LAYER (Edge) │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ CDN │ │ WAF │ │ LB │ │ API GW │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ └─────────────────────────────┼─────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────┼─────────────────────────────┐ │
│ │ APPLICATION LAYER (Business Logic) │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Service A │ │ Service B │ │ Service C │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └─────────────────────────────┼─────────────────────────────┘ │
│ │ │
│ ┌─────────────────────────────┼─────────────────────────────┐ │
│ │ DATA LAYER (Persistence) │ │
│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
│ │ │ Cache │ │ Primary │ │ Replica │ │ Queue │ │ │
│ │ │ (Redis) │ │ (SQL) │ │ (SQL) │ │ (Kafka) │ │ │
│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
│ └───────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
3.3 Sequence Diagram Basics
┌─────────────────────────────────────────────────────────────────────────┐
│ SEQUENCE DIAGRAM: USER LOGIN FLOW │
│ │
│ Client API Gateway Auth Service Database │
│ │ │ │ │ │
│ │ POST /login │ │ │ │
│ │────────────────▶│ │ │ │
│ │ │ Validate credentials │ │ │
│ │ │──────────────────▶│ │ │
│ │ │ │ Find user │ │
│ │ │ │───────────────▶│ │
│ │ │ │ │ │
│ │ │ │ User data │ │
│ │ │ │◀───────────────│ │
│ │ │ │ │ │
│ │ │ JWT token │ │ │
│ │ │◀──────────────────│ │ │
│ │ │ │ │ │
│ │ 200 OK + token │ │ │ │
│ │◀────────────────│ │ │ │
│ │ │ │ │ │
│ │
│ Key: ────▶ = Synchronous request │
│ ◀──── = Response │
│ │
└─────────────────────────────────────────────────────────────────────────┘
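The Auth Service step in this flow (look up the user, check the password, return a signed token) can be sketched with the standard library alone. This is an illustration under stated assumptions, not a production implementation: the in-memory user store, the function names, the PBKDF2 iteration count, and the HS256-style token layout are all chosen for the example, and a real service would use a vetted JWT library.

```python
import base64
import hashlib
import hmac
import json
import os
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def make_user(user_id: int, password: str) -> dict:
    """Hypothetical user record holding a salted password hash."""
    salt = os.urandom(16)
    return {"id": user_id, "salt": salt, "password_hash": hash_password(password, salt)}


def issue_token(user_id: int, secret: bytes, now: int, ttl_seconds: int = 3600) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    claims = {"sub": str(user_id), "exp": now + ttl_seconds}
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: bytes, now: int) -> Optional[dict]:
    """Return the claims if the signature matches and the token is unexpired."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}", secret)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    padded = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims if claims["exp"] > now else None


def login(users_by_email: dict, email: str, password: str,
          secret: bytes, now: int) -> Optional[str]:
    """Auth Service: find the user, check the password, issue a token."""
    record = users_by_email.get(email)
    if record is None:
        return None
    candidate = hash_password(password, record["salt"])
    if not hmac.compare_digest(candidate, record["password_hash"]):
        return None
    return issue_token(record["id"], secret, now)
```

Passing the current time in as `now` keeps the functions deterministic, which makes the expiry behavior easy to test.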
3.4 Data Flow Diagram Basics
┌─────────────────────────────────────────────────────────────────────────┐
│ DATA FLOW: IMAGE UPLOAD PIPELINE │
│ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Upload │───▶│ Validate│───▶│ Process │───▶│ Store │ │
│ │ Image │ │ Format │ │ Resize │ │ in S3 │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
│ │ │ │ │ │
│ │ │ │ │ │
│ ▼ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Raw │ │ Valid │ │ Multiple│ │ URLs │ │
│ │ bytes │ │ image │ │ sizes │ │ stored │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
│ │
│ Data transformation at each step: │
│ 1. Raw bytes → Decoded image │
│ 2. Decoded image → Validated (format, size, content) │
│ 3. Validated → Multiple resolutions (thumb, medium, large) │
│ 4. Processed images → Stored, URLs generated │
│ │
└─────────────────────────────────────────────────────────────────────────┘
3.5 Entity-Relationship Diagram Basics
┌────────────────────────────────────────────────────────────────────────┐
│ ERD: E-COMMERCE SYSTEM │
│ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ USER │ │ ORDER │ │ PRODUCT │ │
│ ├───────────────┤ ├───────────────┤ ├───────────────┤ │
│ │ PK id │ │ PK id │ │ PK id │ │
│ │ email │ │ FK user_id │────┐ │ name │ │
│ │ name │◀────────│ total │ │ │ price │ │
│ │ created_at │ 1:N │ status │ │ │ stock │ │
│ └───────────────┘ │ created_at │ │ └───────────────┘ │
│ └───────────────┘ │ │ │
│ │ │ │ │
│ │ 1:N │ │ │
│ ▼ │ │ │
│ ┌───────────────┐ │ │ │
│ │ ORDER_ITEM │ │ │ M:N │
│ ├───────────────┤ │ │ │
│ │ PK id │ │ │ │
│ │ FK order_id │────┘ │ │
│ │ FK product_id │◀───────────────┘ │
│ │ quantity │ │
│ │ price │ │
│ └───────────────┘ │
│ │
│ Relationships: │
│ • User 1:N Order (one user has many orders) │
│ • Order 1:N OrderItem (one order has many items) │
│ • Product M:N Order (via OrderItem junction table) │
│ │
└────────────────────────────────────────────────────────────────────────┘
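The ERD above maps directly onto table definitions. A minimal sketch using Python's built-in sqlite3 module with an in-memory database; the plural table names (ORDER is a reserved word in SQL), the integer cents columns, and the sample rows are assumptions made for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    email      TEXT NOT NULL UNIQUE,
    name       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE products (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    price_cents INTEGER NOT NULL,
    stock       INTEGER NOT NULL
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES users(id),
    total_cents INTEGER NOT NULL,
    status      TEXT NOT NULL,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Junction table: resolves the M:N relationship between orders and products
CREATE TABLE order_items (
    id          INTEGER PRIMARY KEY,
    order_id    INTEGER NOT NULL REFERENCES orders(id),
    product_id  INTEGER NOT NULL REFERENCES products(id),
    quantity    INTEGER NOT NULL,
    price_cents INTEGER NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO users (id, email, name) VALUES (1, 'ada@example.com', 'Ada')")
    conn.executemany(
        "INSERT INTO products (id, name, price_cents, stock) VALUES (?, ?, ?, ?)",
        [(1, "Keyboard", 5000, 10), (2, "Mouse", 2000, 25)],
    )
    conn.execute("INSERT INTO orders (id, user_id, total_cents, status) VALUES (1, 1, 9000, 'paid')")
    conn.executemany(
        "INSERT INTO order_items (order_id, product_id, quantity, price_cents) VALUES (?, ?, ?, ?)",
        [(1, 1, 1, 5000), (1, 2, 2, 2000)],
    )
    conn.commit()
    return conn


def products_for_user(conn: sqlite3.Connection, user_id: int) -> list:
    """Walk User -> Order -> OrderItem -> Product through the junction table."""
    return conn.execute(
        """
        SELECT p.name, oi.quantity
        FROM orders o
        JOIN order_items oi ON oi.order_id = o.id
        JOIN products p ON p.id = oi.product_id
        WHERE o.user_id = ?
        ORDER BY p.name
        """,
        (user_id,),
    ).fetchall()
```

Storing the price on the order item as well as on the product, as the diagram does, preserves what the customer actually paid even if the product price changes later.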
Chapter 4: Communication in System Design
4.1 Synchronous vs Asynchronous Communication
┌─────────────────────────────────────────────────────────────────────────┐
│ SYNCHRONOUS vs ASYNCHRONOUS COMMUNICATION │
│ │
│ SYNCHRONOUS (Request-Response) │
│ ───────────────────────────────────────────────────────────────────────│
│ │
│ Client Server │
│ │ │ │
│ │ Request │ │
│ │───────────────────────▶│ │
│ │ │ Processing... │
│ │ (waiting) │ (could be slow) │
│ │ │ │
│ │ Response │ │
│ │◀───────────────────────│ │
│ │ │ │
│ │
│ ✓ Simple to understand │
│ ✓ Immediate feedback │
│ ✗ Client blocked while waiting │
│ ✗ Tight coupling │
│ ✗ Server slowness or failure propagates directly to the caller │
│ │
│ ASYNCHRONOUS (Message-Based) │
│ ───────────────────────────────────────────────────────────────────────│
│ │
│ Producer Queue Consumer │
│ │ │ │ │
│ │ Send message │ │ │
│ │───────────────────▶│ │ │
│ │ │ │ │
│ │ ACK (received) │ │ │
│ │◀───────────────────│ │ │
│ │ │ Poll/Push │ │
│ │ (continues work) │───────────────────▶│ │
│ │ │ │ Processing... │
│ │ │ ACK (processed) │ │
│ │ │◀───────────────────│ │
│ │
│ ✓ Loose coupling │
│ ✓ Producer not blocked │
│ ✓ Handles traffic spikes (queue as buffer) │
│ ✓ Retry/replay possible │
│ ✗ More complex │
│ ✗ Eventual consistency │
│ ✗ Debugging harder │
│ │
└─────────────────────────────────────────────────────────────────────────┘
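The difference between the two styles can be shown in a single process. A minimal sketch in which a thread-safe in-memory queue stands in for a message broker; the job names and the trivial process function are placeholders.

```python
import queue
import threading


def process(job: str) -> str:
    """Stand-in for slow work such as resizing or transcoding."""
    return job.upper()


def handle_sync(job: str) -> str:
    """Synchronous: the caller waits for the result before continuing."""
    return process(job)


def start_worker(jobs: queue.Queue, results: list) -> threading.Thread:
    """Consumer: pulls jobs off the queue until it sees the None sentinel."""
    def run() -> None:
        while True:
            job = jobs.get()
            try:
                if job is None:
                    return
                results.append(process(job))
            finally:
                jobs.task_done()  # acknowledge, whether processed or sentinel

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def run_async_demo() -> list:
    jobs: queue.Queue = queue.Queue(maxsize=100)  # bounded buffer absorbs spikes
    results: list = []
    worker = start_worker(jobs, results)
    for name in ["resize", "transcode"]:
        jobs.put(name)  # returns as soon as the job is enqueued
    jobs.put(None)      # tell the worker to stop
    jobs.join()         # wait until every queued item has been acknowledged
    worker.join(timeout=1)
    return results


print(handle_sync("resize"))
print(run_async_demo())
```

In the asynchronous path the producer only waits for the enqueue, not for the processing. An in-process queue loses its contents on a crash, which is one reason real systems use a durable broker for this role.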
4.2 When to Use Each
USE SYNCHRONOUS WHEN:
├── User needs immediate response
│ └── Login, checkout, search results
├── Operation is fast (<100ms)
├── Strong consistency required
│ └── Check balance before transfer
└── Simple request-response pattern
USE ASYNCHRONOUS WHEN:
├── Operation is slow
│ └── Video processing, report generation
├── Notification/event delivery
│ └── Emails, push notifications
├── Decoupling services
│ └── Order placed → Inventory, Shipping, Email
├── Handling traffic spikes
│ └── Buffer requests during peak
└── At-least-once delivery needed
└── Payment webhooks, critical events
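At-least-once delivery means the same message can arrive more than once, so the consumer has to make repeated handling harmless. A minimal sketch of an idempotent consumer, assuming each message carries a unique ID; the Message and IdempotentConsumer names and the in-memory set are illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    message_id: str    # assumed unique per logical event
    amount_cents: int


@dataclass
class IdempotentConsumer:
    processed_ids: set = field(default_factory=set)
    balance_cents: int = 0

    def handle(self, message: Message) -> bool:
        """Apply the message once; return False for a redelivered duplicate."""
        if message.message_id in self.processed_ids:
            return False
        self.balance_cents += message.amount_cents
        self.processed_ids.add(message.message_id)
        return True


consumer = IdempotentConsumer()
# The broker redelivers "evt-1" because the first acknowledgement was lost.
deliveries = [Message("evt-1", 500), Message("evt-2", 250), Message("evt-1", 500)]
applied = [consumer.handle(message) for message in deliveries]
print(applied, consumer.balance_cents)
```

In a real system the processed-ID record and the side effect would need to be committed together (for example in one database transaction); otherwise a crash between the two steps reintroduces the duplicate.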
4.3 API Styles
┌────────────────────────────────────────────────────────────────────────┐
│ API STYLE COMPARISON │
│ │
│ REST (Representational State Transfer) │
│ ──────────────────────────────────────────────────────────────────────│
│ • Resource-based URLs: /users/123, /orders/456 │
│ • HTTP verbs: GET, POST, PUT, DELETE │
│ • Stateless │
│ • JSON/XML payloads │
│ • Good for: CRUD operations, public APIs │
│ │
│ Example: │
│ GET /api/users/123 → Get user 123 │
│ POST /api/users → Create new user │
│ PUT /api/users/123 → Update user 123 │
│ DELETE /api/users/123 → Delete user 123 │
│ │
│ GraphQL │
│ ──────────────────────────────────────────────────────────────────────│
│ • Single endpoint: /graphql │
│ • Client specifies exactly what data it needs │
│ • Reduces over-fetching and under-fetching │
│ • Good for: Complex frontends, mobile apps │
│ │
│ Example: │
│ query { │
│ user(id: 123) { │
│ name │
│ email │
│ orders(limit: 5) { │
│ id │
│ total │
│ } │
│ } │
│ } │
│ │
│ gRPC (gRPC Remote Procedure Calls, originally developed at Google) │
│ ──────────────────────────────────────────────────────────────────────│
│ • Binary protocol (Protocol Buffers) │
│ • Strongly typed contracts (.proto files) │
│ • Bidirectional streaming │
│ • Good for: Microservices, internal APIs, low latency │
│ │
│ Example (.proto file): │
│ service UserService { │
│ rpc GetUser(GetUserRequest) returns (User); │
│ rpc CreateUser(CreateUserRequest) returns (User); │
│ } │
│ │
│ WebSocket │
│ ──────────────────────────────────────────────────────────────────────│
│ • Full-duplex communication │
│ • Persistent connection │
│ • Server can push to client │
│ • Good for: Chat, live updates, gaming │
│ │
└────────────────────────────────────────────────────────────────────────┘
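The REST example in this comparison maps HTTP verbs onto CRUD operations for one resource. A minimal sketch of that mapping as a plain in-memory dispatcher with no web framework; the UserApi class, its handle method, and the error bodies are assumptions for the example.

```python
from typing import Any, Optional


class UserApi:
    """In-memory stand-in for the four /api/users routes listed above."""

    def __init__(self) -> None:
        self._users: dict = {}
        self._next_id = 1

    def handle(self, method: str, path: str,
               body: Optional[dict] = None) -> tuple:
        """Return (status_code, response_body) for a request."""
        parts = [part for part in path.split("/") if part]
        if parts[:2] != ["api", "users"] or len(parts) > 3:
            return 404, {"error": "not_found"}

        if len(parts) == 2:  # collection route: /api/users
            if method != "POST":
                return 405, {"error": "method_not_allowed"}
            user = {**(body or {}), "id": self._next_id}  # server assigns the ID
            self._users[self._next_id] = user
            self._next_id += 1
            return 201, user

        try:  # item route: /api/users/{id}
            user_id = int(parts[2])
        except ValueError:
            return 404, {"error": "not_found"}
        if user_id not in self._users:
            return 404, {"error": "not_found"}
        if method == "GET":
            return 200, self._users[user_id]
        if method == "PUT":  # full replacement of the resource
            self._users[user_id] = {**(body or {}), "id": user_id}
            return 200, self._users[user_id]
        if method == "DELETE":
            del self._users[user_id]
            return 204, None
        return 405, {"error": "method_not_allowed"}
```

The handler keeps no per-client state between calls: everything it needs arrives in the method, path, and body, which is what "stateless" means in the REST bullet list.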
Summary: Part 1 Concepts
┌────────────────────────────────────────────────────────────────────────┐
│ PART 1 KEY TAKEAWAYS │
│ │
│ 1. SYSTEM DESIGN = Requirements → HLD → LLD → Implementation │
│ │
│ 2. HLD = Components + Interactions (the "what") │
│ LLD = Internal details (the "how") │
│ │
│ 3. INTERVIEW FRAMEWORK: │
│ Step 1: Clarify requirements (functional + non-functional) │
│ Step 2: Estimate scale (back-of-envelope) │
│ Step 3: Draw high-level design │
│ Step 4: Deep dive into critical components │
│ │
│ 4. NON-FUNCTIONAL REQUIREMENTS matter: │
│ • Scalability (how much traffic) │
│ • Availability (how much uptime) │
│ • Latency (how fast) │
│ • Consistency (how accurate) │
│ │
│ 5. COMMUNICATION STYLES: │
│ • Sync: Immediate feedback, simple, blocking │
│ • Async: Decoupled, resilient, complex │
│ │
│ 6. API STYLES: REST (simple), GraphQL (flexible), gRPC (fast) │
│ │
└────────────────────────────────────────────────────────────────────────┘
End of Week 0 — Part 1
Next: Part 2 covers the infrastructure building blocks (CDN, Load Balancer, API Gateway, Databases, Caches, Message Queues) that we'll use throughout the 10 weeks.