Bonus Problem 6: Netflix Streaming
The World's Most Sophisticated Video Delivery Platform
Delivering 94 Billion Hours of Entertainment, Without a Single Buffer
Imagine this challenge: It's 8 PM on a Friday night. Across the globe, 300 million people are about to press "play" at roughly the same time.
Each viewer expects instant playback: no buffering wheel, no quality drops, no frozen frames. They're watching on everything from 4K smart TVs in Tokyo to mobile phones in rural Brazil. Their internet connections range from gigabit fiber to spotty mobile data.
Now do that for 94 billion hours of viewing per year. At peak hours, your traffic accounts for 15% of all downstream internet bandwidth worldwide.
This is Netflix: the platform that reinvented how we consume entertainment and built the most sophisticated video delivery system ever created.
┌──────────────────────────────────────────────────────────────────────────────
│  THE NETFLIX SCALE (2025)
│
│  SUBSCRIBERS
│  ───────────
│  Global paid subscribers:        301.6 million
│  Countries served:               190+
│  Ad-supported tier users:        94 million monthly actives
│  Viewing hours (H2 2024):        94 billion
│
│  CONTENT
│  ───────
│  Total titles available:         ~18,000+
│  Content spend (2024):           $17 billion
│  Original productions (2023):    891 titles
│  Languages supported:            60+
│
│  INFRASTRUCTURE
│  ──────────────
│  Open Connect servers:           17,000+ worldwide
│  ISP partnerships:               1,000+ locations in 158 countries
│  % of global internet traffic:   ~15% at peak
│  Backend services (AWS):         1,000+ microservices
│
│  BUSINESS
│  ────────
│  Annual revenue (2024):          $39 billion
│  Market cap (May 2025):          $491.7 billion
│  Employees:                      ~14,000
└──────────────────────────────────────────────────────────────────────────────
This is the system we'll design today, and along the way we'll see how Netflix achieves buffer-free streaming for hundreds of millions of concurrent viewers worldwide.
The Interview Begins
You walk into the interview room. The interviewer smiles and gestures to the whiteboard.
Interviewer: "Thanks for coming in. Today we're going to design a video streaming platform similar to Netflix. I want to see how you think about large-scale content delivery, client-server architecture, and handling global distribution. Feel free to ask questions; this should be collaborative."
They write on the whiteboard:
┌──────────────────────────────────────────────────────────────────────────────
│  Design a Global Video Streaming Platform
│
│  Build a streaming service that can:
│  - Stream video to 300M+ subscribers worldwide
│  - Support devices from phones to 4K TVs
│  - Deliver content with <2 second start time
│  - Handle peak traffic of millions of concurrent streams
│  - Provide personalized recommendations
│
│  Consider:
│  - How do you encode and store video efficiently?
│  - How do you deliver content globally with low latency?
│  - How do you adapt quality to network conditions?
│  - How do you handle millions of concurrent users?
└──────────────────────────────────────────────────────────────────────────────
Interviewer: "Take a few minutes to think about this, then walk me through your approach. We have about 45 minutes."
Phase 1: Requirements Clarification (5 minutes)
Before diving in, you take a breath and start asking questions.
Your Questions
You: "Before I start designing, I'd like to clarify some requirements. First, what's our target scale: how many concurrent viewers should we support at peak?"
Interviewer: "At peak, we need to support around 10 million concurrent streams. That's a typical Friday evening."
You: "What quality levels do we need to support? Just HD or up to 4K/HDR?"
Interviewer: "We need the full range β from 240p for poor mobile connections up to 4K HDR for premium users on home theater systems."
You: "How important is start time versus sustained quality?"
Interviewer: "Both are critical. Users expect playback to start within 2 seconds, but we also can't have constant rebuffering. A smooth experience is paramount."
You: "Should we design for VOD only, or also live streaming like sports events?"
Interviewer: "Focus on VOD for now, but keep in mind we're expanding into live events."
You: "What about content protection? Do we need DRM?"
Interviewer: "Yes, premium content requires robust DRM across all devices."
You: "Perfect. Let me summarize the requirements."
Functional Requirements
1. VIDEO PLAYBACK
- Stream on-demand video content
- Support resolutions from 240p to 4K HDR
- Adaptive bitrate based on network conditions
- Resume playback from where user left off
- Support subtitles and multiple audio tracks
2. CONTENT CATALOG
- Browse and search content library
- Organize content by genre, trending, personalized rows
- Display metadata, trailers, ratings
- Support multiple profiles per account
3. RECOMMENDATIONS
- Personalized content suggestions per user
- "Continue Watching" functionality
- "Because You Watched X" recommendations
- Trending and top 10 lists
4. USER MANAGEMENT
- Account creation and authentication
- Multiple profiles per household
- Parental controls and viewing restrictions
- Download for offline viewing
Non-Functional Requirements
1. SCALE
- 300M+ subscribers globally
- 10M+ concurrent streams at peak
- 18,000+ titles in catalog
- 190+ countries
2. LATENCY
- Playback start: <2 seconds
- Seek response: <500ms
- API response: <100ms p99
3. AVAILABILITY
- 99.99% uptime (52 minutes downtime/year)
- Graceful degradation under failures
- Multi-region redundancy
4. QUALITY
- Zero buffering on stable connections
- Seamless quality adaptation
- Consistent experience across devices
Phase 2: Back of the Envelope Estimation (5 minutes)
You: "Let me work through the numbers to understand the infrastructure we need."
Traffic Estimation
CONCURRENT STREAMS AT PEAK
Total subscribers: 300 million
Peak concurrent rate: ~3% (Friday 8 PM)
Peak concurrent streams: ~10 million
Average stream bitrate: 5 Mbps (mix of qualities)
Peak bandwidth: 10M × 5 Mbps = 50 Tbps
Daily viewing hours:
Average per subscriber: 2 hours/day
Total daily: 600 million hours
Stream starts (avg, assuming ~1-hour sessions): ~7,000/second
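These figures can be sanity-checked with a few lines of arithmetic. A rough sketch; the constants simply mirror the assumptions above:

```python
# Back-of-envelope traffic check; constants mirror the assumptions above.
SUBSCRIBERS = 300_000_000
PEAK_CONCURRENT_RATE = 0.03      # ~3% streaming at the Friday-evening peak
AVG_BITRATE_MBPS = 5             # blended across quality levels
AVG_HOURS_PER_SUB_PER_DAY = 2

peak_streams = SUBSCRIBERS * PEAK_CONCURRENT_RATE          # 9M, call it ~10M
peak_tbps = peak_streams * AVG_BITRATE_MBPS / 1_000_000    # Mbps -> Tbps
daily_hours = SUBSCRIBERS * AVG_HOURS_PER_SUB_PER_DAY      # 600M hours/day
starts_per_sec = daily_hours / 86_400                      # ~1-hour sessions

print(f"Peak streams:     {peak_streams:,.0f}")
print(f"Peak bandwidth:   {peak_tbps:.0f} Tbps")
print(f"Starts/sec (avg): {starts_per_sec:,.0f}")
```

Note the 45 Tbps result rounds up to the ~50 Tbps figure once you add protocol overhead and headroom.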
Storage Estimation
VIDEO STORAGE
Titles in catalog: 18,000
Average title length: 90 minutes (mix of movies/series)
Per-title encoding:
Resolutions: 8 (240p to 4K)
Bitrates per resolution: 3-4 variants
Total streams per title: ~30 encoded versions
Average encoded title size:
Low quality (240p-480p): ~500 MB
HD (720p-1080p): ~4 GB
4K HDR: ~15 GB
Total per title: ~20 GB average
Total storage:
18,000 titles × 20 GB = 360 TB (all encoded versions, central)
With CDN replication (~100×): ~36 PB distributed globally
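The same arithmetic for storage, as a quick sketch:

```python
# Storage estimate: central encoded library and its CDN footprint.
TITLES = 18_000
AVG_ENCODED_GB_PER_TITLE = 20    # all ~30 encoded variants combined
CDN_REPLICATION_FACTOR = 100     # copies spread across the OCA fleet

central_tb = TITLES * AVG_ENCODED_GB_PER_TITLE / 1_000   # GB -> TB
cdn_pb = central_tb * CDN_REPLICATION_FACTOR / 1_000     # TB -> PB

print(f"Central encoded library: {central_tb:.0f} TB")   # 360 TB
print(f"CDN-distributed copies:  {cdn_pb:.0f} PB")       # 36 PB
```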
Infrastructure Summary
┌──────────────────────────────────────────────────────────────────────────────
│  INFRASTRUCTURE SUMMARY
│
│  BANDWIDTH
│  ├── Peak egress: 50+ Tbps
│  ├── Daily data transfer: ~1 EB (exabyte)
│  └── % of internet traffic: ~15% at peak
│
│  STORAGE
│  ├── Source content: ~500 TB
│  ├── Encoded library (central): ~360 TB
│  └── CDN distributed copies: ~36 PB globally
│
│  COMPUTE (Control Plane)
│  ├── Microservices: 1,000+
│  ├── API requests/sec: ~1 million
│  └── Recommendation calls: ~100K/sec
│
│  CDN (Data Plane)
│  ├── Edge servers: 17,000+
│  ├── ISP locations: 1,000+
│  └── Countries: 158
└──────────────────────────────────────────────────────────────────────────────
Phase 3: High-Level Design (10 minutes)
You: "Netflix has a unique architecture with two distinct planes; let me explain."
System Architecture
┌──────────────────────────────────────────────────────────────────────────────
│                       NETFLIX ARCHITECTURE OVERVIEW
│
│                            CLIENT DEVICES
│      [Smart TV]   [Phone]   [Tablet]   [Web]   [Console]
│           │          │          │        │         │
│           └──────────┴──────────┼────────┴─────────┘
│                                 ▼
│  ┌─────────────────────────────────────────────────────────────────────
│  │ CONTROL PLANE (AWS)
│  │
│  │  [Zuul Gateway] ──────► MICROSERVICES
│  │  [Eureka Discovery]       Auth · Catalog · Recommendations
│  │                           Playback · Billing · Search
│  │
│  │  DATA STORES: Cassandra (users) · EVCache (cache)
│  │               MySQL (billing)   · S3 (media)
│  └─────────────────────────────────────────────────────────────────────
│                                 │ playback URLs
│                                 ▼
│  ┌─────────────────────────────────────────────────────────────────────
│  │ DATA PLANE (Open Connect CDN)
│  │
│  │   ISP level            IXP level            Origin
│  │  [OCA (edge)] ──miss──► [OCA (metro)] ──miss──► [AWS S3]
│  │        │
│  │        │ video streams
│  │        ▼
│  │    END USERS
│  └─────────────────────────────────────────────────────────────────────
└──────────────────────────────────────────────────────────────────────────────
The Two-Plane Architecture
You: "Netflix separates its architecture into two distinct systems optimized for different purposes."
Control Plane (AWS)
Purpose: Handle all user interactions before playback: browsing, authentication, recommendations, playback authorization.
Key characteristics:
- Runs on AWS across multiple regions
- 1,000+ microservices
- Handles metadata, not video data
- Optimized for consistency and availability
- ~1M API requests/second
Data Plane (Open Connect)
Purpose: Deliver actual video content to users with minimum latency.
Key characteristics:
- Netflix's proprietary CDN
- 17,000+ servers in 1,000+ ISP locations
- Servers placed inside ISP networks
- Handles 100% of video streaming
- Optimized for throughput and proximity
You: "This separation is crucial. The control plane needs strong consistency for things like authentication and billing. The data plane needs raw throughput for video delivery. Different problems, different solutions."
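The handoff between the two planes happens at "play" time and can be sketched in a few lines. This is a hypothetical illustration; the function name, response fields, and OCA URLs are invented for the sketch, not Netflix's real API:

```python
# Hypothetical sketch of the control-plane -> data-plane handoff at "play" time.
# Function, field, and URL names are illustrative, not Netflix's real API.
def authorize_playback(user_id: str, title_id: str) -> dict:
    """Control plane: authenticate the user, issue a DRM license, and
    return a ranked list of nearby OCAs to stream from."""
    return {
        "drm_license": f"license-for-{user_id}",
        "manifest_urls": [  # data-plane endpoints, tried in order
            f"https://oca-isp-1.example/{title_id}/manifest",
            f"https://oca-ixp-1.example/{title_id}/manifest",
        ],
    }

resp = authorize_playback("user-42", "title-99")
# The client never streams video through the control plane; it fetches
# chunks directly from the first reachable OCA in the list.
print(resp["manifest_urls"][0])
```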
Phase 4: Deep Dives (20 minutes)
Interviewer: "Great overview. Let's dive deeper. How does adaptive bitrate streaming work?"
Deep Dive 1: Adaptive Bitrate Streaming (Week 2 - Timeouts & Latency)
You: "Adaptive bitrate streaming is the core technology that enables buffer-free playback across varying network conditions."
The Problem
NETWORK VARIABILITY CHALLENGE
User watching on mobile:
- Start: 20 Mbps (WiFi at home)
- Minute 5: 5 Mbps (left home, on LTE)
- Minute 15: 500 Kbps (entered subway)
- Minute 20: 0 Mbps (tunnel)
- Minute 25: 10 Mbps (emerged from tunnel)
Without ABR:
- Fixed 8 Mbps stream -> constant buffering on LTE
- Fixed 500 Kbps stream -> terrible quality on WiFi
With ABR:
- Dynamically adjust quality based on conditions
- Seamless transitions between quality levels
- Buffer management to avoid interruptions
How ABR Works
ADAPTIVE BITRATE STREAMING ARCHITECTURE

ENCODING PIPELINE
─────────────────
Source Video -> Encoder -> Multiple Quality Levels -> Chunked Storage

Quality Ladder (per-title optimized):
┌────────────┬───────────┬───────────────────────────────┐
│ Resolution │ Bitrate   │ Use Case                      │
├────────────┼───────────┼───────────────────────────────┤
│ 240p       │ 235 Kbps  │ Extremely poor connections    │
│ 360p       │ 560 Kbps  │ Mobile data saving            │
│ 480p       │ 1.0 Mbps  │ Standard mobile               │
│ 720p       │ 3.0 Mbps  │ Tablet/laptop                 │
│ 1080p      │ 5.0 Mbps  │ HD TV                         │
│ 1080p HDR  │ 7.0 Mbps  │ Premium HD                    │
│ 4K         │ 15 Mbps   │ 4K TV                         │
│ 4K HDR     │ 25 Mbps   │ Premium 4K experience         │
└────────────┴───────────┴───────────────────────────────┘

CHUNKING
────────
Each quality level is split into 2-4 second chunks.
The client can switch quality at any chunk boundary.

Video Timeline:
[Chunk 1][Chunk 2][Chunk 3][Chunk 4][Chunk 5][Chunk 6]...
   4K       4K     1080p     720p     720p    1080p     <- quality varies per chunk
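To make the chunking concrete, here is the size of one 4-second chunk at each rung of the ladder above (pure arithmetic from the bitrates):

```python
# Bytes in a single 4-second chunk at each rung of the quality ladder.
LADDER_BPS = {
    "240p": 235_000, "360p": 560_000, "480p": 1_000_000,
    "720p": 3_000_000, "1080p": 5_000_000,
    "1080p HDR": 7_000_000, "4K": 15_000_000, "4K HDR": 25_000_000,
}
CHUNK_SECONDS = 4

for name, bps in LADDER_BPS.items():
    chunk_mb = bps * CHUNK_SECONDS / 8 / 1_000_000   # bits -> bytes -> MB
    print(f"{name:>9}: {chunk_mb:5.2f} MB")
```

Switching down from 4K HDR to 720p at a chunk boundary cuts the next download from 12.5 MB to 1.5 MB, which is why ABR can react within seconds.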
Per-Title Encoding Optimization
You: "Netflix doesn't use a fixed bitrate ladder. They analyze each title and create custom encoding settings."
PER-TITLE ENCODING OPTIMIZATION
Problem: Not all content is equal
- Animation: Compresses well, needs less bitrate
- Action movies: Complex scenes need more bitrate
- Dark scenes: Compression artifacts more visible
Traditional approach:
One bitrate ladder for all content
Result: Wasted bandwidth on simple content, poor quality on complex content
Netflix's approach:
1. Analyze content complexity using VMAF (Video Multimethod Assessment Fusion)
2. Generate custom bitrate ladder per title
3. Some titles get good 1080p at 3 Mbps, others need 6 Mbps
Result: 20-30% bandwidth savings with better quality
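The ladder-selection step reduces to: for each title, pick the cheapest bitrate whose measured quality clears a target. A minimal sketch; the scores below are made-up stand-ins for real per-title VMAF measurements:

```python
# Pick the cheapest bitrate whose measured quality clears a target score.
# The scores are invented stand-ins for real per-title VMAF runs.
def pick_bitrate(scores: dict, target: float):
    """Return the lowest bitrate (bps) whose quality score meets target."""
    for bitrate in sorted(scores):
        if scores[bitrate] >= target:
            return bitrate
    return None  # no rung is good enough; the ladder needs a higher rung

animation = {3_000_000: 95.0, 4_500_000: 97.0, 6_000_000: 98.0}  # compresses well
action    = {3_000_000: 80.0, 4_500_000: 89.0, 6_000_000: 94.0}  # complex scenes

print(pick_bitrate(animation, 93.0))  # 3000000 - good 1080p at 3 Mbps
print(pick_bitrate(action, 93.0))     # 6000000 - needs 6 Mbps
```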
ABR Client Implementation
# streaming/abr_client.py
"""
Adaptive Bitrate Streaming Client

Demonstrates how Netflix clients select quality levels dynamically.
"""
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import List


class QualityLevel(Enum):
    """Available quality levels."""
    Q_240P = (240, 235_000)  # (height, bitrate_bps)
    Q_360P = (360, 560_000)
    Q_480P = (480, 1_000_000)
    Q_720P = (720, 3_000_000)
    Q_1080P = (1080, 5_000_000)
    Q_4K = (2160, 15_000_000)


@dataclass
class ChunkInfo:
    """Information about a video chunk."""
    index: int
    duration_seconds: float
    quality: QualityLevel
    size_bytes: int
    url: str


@dataclass
class BufferState:
    """Current state of playback buffer."""
    buffered_seconds: float
    current_position: float
    is_playing: bool


class ABRController:
    """
    Adaptive Bitrate Controller.

    Implements a buffer-based ABR algorithm similar to Netflix's approach.
    Balances between:
    - Maximizing video quality
    - Avoiding rebuffering
    - Minimizing quality switches

    Applies concepts from:
    - Week 2: Timeout management and graceful degradation
    - Week 4: Caching and buffer management
    """

    # Buffer thresholds (seconds)
    BUFFER_MIN = 5.0      # Minimum buffer before panic
    BUFFER_LOW = 15.0     # Start being conservative
    BUFFER_TARGET = 30.0  # Ideal buffer level
    BUFFER_MAX = 60.0     # Maximum buffer

    # Quality switch dampening
    SWITCH_COOLDOWN = 10.0  # Seconds between quality changes

    def __init__(self, quality_levels: List[QualityLevel]):
        self.quality_levels = sorted(
            quality_levels,
            key=lambda q: q.value[1]  # Sort by bitrate
        )
        self.current_quality_idx = 0
        self.last_switch_time = 0.0
        self.bandwidth_estimates: List[float] = []

    def estimate_bandwidth(self, chunk_bytes: int, download_time: float) -> float:
        """
        Estimate available bandwidth from a completed chunk download.

        Keeps the last 5 samples and returns their harmonic mean, a
        conservative estimate that is dominated by the slower samples.
        """
        if download_time <= 0:
            return 0.0
        measured_bps = (chunk_bytes * 8) / download_time
        # Keep last 5 measurements
        self.bandwidth_estimates.append(measured_bps)
        if len(self.bandwidth_estimates) > 5:
            self.bandwidth_estimates.pop(0)
        harmonic_mean = len(self.bandwidth_estimates) / sum(
            1 / bw for bw in self.bandwidth_estimates
        )
        return harmonic_mean

    def select_quality(
        self,
        buffer_state: BufferState,
        estimated_bandwidth: float,
        current_time: float
    ) -> QualityLevel:
        """
        Select optimal quality level based on buffer and bandwidth.

        Algorithm:
        1. If buffer critically low -> drop to lowest quality
        2. If buffer low -> be conservative, don't increase
        3. If buffer healthy -> select highest sustainable quality
        4. Apply switch dampening to avoid oscillation
        """
        buffer = buffer_state.buffered_seconds

        # PANIC MODE: Buffer critically low
        if buffer < self.BUFFER_MIN:
            self.current_quality_idx = 0
            self.last_switch_time = current_time
            return self.quality_levels[0]

        # Find highest sustainable quality
        target_idx = 0
        for i, quality in enumerate(self.quality_levels):
            bitrate = quality.value[1]
            # Need 20% headroom for safety
            if bitrate * 1.2 <= estimated_bandwidth:
                target_idx = i

        # Buffer-based adjustment
        if buffer < self.BUFFER_LOW:
            # Don't increase quality when buffer is low
            target_idx = min(target_idx, self.current_quality_idx)
        # (Above BUFFER_TARGET, increases are already allowed by default.)

        # Switch dampening: avoid rapid oscillation
        time_since_switch = current_time - self.last_switch_time
        if time_since_switch < self.SWITCH_COOLDOWN:
            # Only allow quality decrease, not increase
            target_idx = min(target_idx, self.current_quality_idx)

        # Apply change
        if target_idx != self.current_quality_idx:
            self.current_quality_idx = target_idx
            self.last_switch_time = current_time

        return self.quality_levels[self.current_quality_idx]

    def get_initial_quality(self) -> QualityLevel:
        """
        Select initial quality for playback start.

        Strategy: start at medium quality for fast startup,
        then adapt based on actual bandwidth.
        """
        # Start at 480p - reasonable quality, fast start
        for i, q in enumerate(self.quality_levels):
            if q.value[0] >= 480:
                self.current_quality_idx = i
                return q
        return self.quality_levels[0]


class StreamingSession:
    """
    Manages a complete streaming session.

    Coordinates:
    - Chunk downloading
    - Buffer management
    - Quality selection
    - Playback control
    """

    def __init__(self, manifest_url: str):
        self.manifest_url = manifest_url
        self.abr = ABRController(list(QualityLevel))
        self.buffer = BufferState(
            buffered_seconds=0.0,
            current_position=0.0,
            is_playing=False
        )
        self.chunks_downloaded = 0
        self.total_rebuffers = 0    # incremented by the playback loop (omitted)
        self.quality_switches = 0   # incremented by the playback loop (omitted)

    async def start_playback(self):
        """
        Initialize playback session.

        1. Fetch manifest
        2. Select initial quality
        3. Pre-buffer minimum amount
        4. Start playback
        """
        # Start with medium quality for fast startup
        initial_quality = self.abr.get_initial_quality()
        # Pre-buffer 5 seconds before starting
        while self.buffer.buffered_seconds < 5.0:
            await self._download_next_chunk(initial_quality)
        self.buffer.is_playing = True

    async def _download_next_chunk(self, quality: QualityLevel):
        """Download next chunk at specified quality."""
        # Simulated download - in reality this would fetch from the CDN
        chunk_bytes = quality.value[1] * 4 // 8        # 4 seconds of video, in bytes
        download_time = (chunk_bytes * 8) / 5_000_000  # Simulate a 5 Mbps link
        await self._simulate_download(download_time)
        self.buffer.buffered_seconds += 4.0
        self.chunks_downloaded += 1
        # Update bandwidth estimate
        self.abr.estimate_bandwidth(chunk_bytes, download_time)

    async def _simulate_download(self, seconds: float):
        """Simulate network download time."""
        await asyncio.sleep(seconds)

    def get_session_stats(self) -> dict:
        """Return session quality metrics."""
        return {
            "chunks_downloaded": self.chunks_downloaded,
            "rebuffer_events": self.total_rebuffers,
            "quality_switches": self.quality_switches,
            "current_quality": self.abr.quality_levels[
                self.abr.current_quality_idx
            ].name,
        }
Deep Dive 2: Open Connect CDN (Week 1 - Partitioning & Replication)
Interviewer: "Tell me about Netflix's CDN strategy. Why did they build their own?"
You: "Open Connect is Netflix's secret weapon. It's why they can deliver 15% of internet traffic without breaking the bank."
Why Build Your Own CDN?
THE CDN ECONOMICS PROBLEM
Third-party CDN costs (2010s):
- $0.02 - $0.05 per GB delivered
- Netflix daily traffic: ~1 exabyte
- Daily CDN cost: $20M - $50M (!)
- Annual: $7B - $18B just for delivery
Netflix's solution:
- Build own CDN infrastructure
- One-time investment: ~$1B over decade
- Ongoing costs: Much lower than third-party
- Bonus: Better control over quality
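The quoted cost range checks out with a one-liner per price point (1 EB/day is roughly a billion gigabytes):

```python
# Check the quoted third-party CDN bill: ~1 EB/day at $0.02 - $0.05 per GB.
DAILY_TRAFFIC_GB = 1_000_000_000   # 1 exabyte = 1 billion GB

for price in (0.02, 0.05):
    daily = DAILY_TRAFFIC_GB * price
    print(f"${price:.2f}/GB -> ${daily / 1e6:.0f}M/day, "
          f"${daily * 365 / 1e9:.1f}B/year")
```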
Open Connect Architecture
OPEN CONNECT CDN ARCHITECTURE
┌──────────────────────────────────────────────────────────────────────────────
│  CONTENT DISTRIBUTION TIERS
│
│  TIER 1: ORIGIN (AWS S3)
│  • Complete content library
│  • All encoded versions
│  • Source of truth
│        │
│        │ Nightly fill (popular content)
│        ▼
│  TIER 2: INTERNET EXCHANGE POINTS (IXP)
│  • Large OCAs at peering points
│  • Store ~95% of catalog
│  • Serve multiple ISPs
│        │
│        │ Cache fill on demand
│        ▼
│  TIER 3: ISP-EMBEDDED OCAs
│  • Servers inside ISP networks
│  • Store regionally popular content
│  • Shortest path to users
│  • 17,000+ servers worldwide
│        │
│        │ Video streams
│        ▼
│  END USERS
│  • Typically served from ISP OCA
│  • 90%+ cache hit rate
│  • Minimal latency
└──────────────────────────────────────────────────────────────────────────────
Open Connect Appliance (OCA) Specs
OCA HARDWARE SPECIFICATIONS
Standard OCA Server:
┌──────────────────────────────────────────────────────────────────────────────
│  STORAGE
│  • 36 × 8TB HDDs = 288 TB raw
│  • Or: 18 × 16TB SSDs = 288 TB (flash variant)
│  • Typical usable: ~240 TB
│
│  NETWORK
│  • 4 × 25 Gbps NICs = 100 Gbps total
│  • Can serve ~20,000 concurrent streams
│
│  COMPUTE
│  • Minimal CPU (video serving is I/O bound)
│  • Custom FreeBSD-based OS
│  • Netflix-optimized nginx
│
│  EFFICIENCY
│  • Power: ~500W under load
│  • Cost per GB delivered: <$0.001
└──────────────────────────────────────────────────────────────────────────────
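The ~20,000-stream figure falls straight out of the NIC bandwidth, since video serving is I/O bound rather than CPU bound:

```python
# Why one OCA tops out around ~20,000 streams: it's a bandwidth limit.
NIC_GBPS = 4 * 25                 # four 25 Gbps ports
AVG_STREAM_MBPS = 5               # blended average bitrate

concurrent_streams = NIC_GBPS * 1_000 // AVG_STREAM_MBPS   # Gbps -> Mbps
print(concurrent_streams)         # 20000
```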
Content Placement Algorithm
# cdn/content_placement.py
"""
Content Placement Algorithm for Open Connect

Determines which content to cache on which OCA servers.
"""
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Set


@dataclass
class ContentItem:
    """A piece of content in the catalog."""
    content_id: str
    size_bytes: int
    popularity_score: float
    release_date: datetime
    regions: List[str]  # Regions where content is available


@dataclass
class OCAServer:
    """An Open Connect Appliance server."""
    server_id: str
    location: str
    region: str
    capacity_bytes: int
    used_bytes: int
    cached_content: Set[str]

    @property
    def available_bytes(self) -> int:
        return self.capacity_bytes - self.used_bytes


class ContentPlacementEngine:
    """
    Determines optimal content placement across the OCA fleet.

    Goals:
    1. Maximize cache hit rate
    2. Minimize origin fetches
    3. Balance load across servers
    4. Account for regional popularity differences

    Applies concepts from:
    - Week 1: Partitioning and replication strategies
    - Week 4: Cache placement and invalidation
    """

    def __init__(self):
        self.servers: Dict[str, OCAServer] = {}
        self.content: Dict[str, ContentItem] = {}
        self.regional_popularity: Dict[str, Dict[str, float]] = {}

    def add_server(self, server: OCAServer):
        """Register an OCA server."""
        self.servers[server.server_id] = server

    def update_popularity(self, region: str, content_id: str, score: float):
        """Update regional popularity score for content."""
        if region not in self.regional_popularity:
            self.regional_popularity[region] = {}
        self.regional_popularity[region][content_id] = score

    def compute_placement(self, region: str) -> Dict[str, List[str]]:
        """
        Compute optimal content placement for a region.

        Algorithm:
        1. Get all servers in region
        2. Rank content by regional popularity
        3. Place highest-popularity content on all servers
        4. Fill remaining space with long-tail content

        Returns: Dict mapping server_id to list of content_ids
        """
        regional_servers = [
            s for s in self.servers.values()
            if s.region == region
        ]
        if not regional_servers:
            return {}

        # Get regional popularity scores
        popularity = self.regional_popularity.get(region, {})

        # Sort content by regional popularity
        sorted_content = sorted(
            self.content.values(),
            key=lambda c: popularity.get(c.content_id, 0),
            reverse=True
        )

        placement: Dict[str, List[str]] = {
            s.server_id: [] for s in regional_servers
        }

        # Tier 1: the top 20% most popular content goes on ALL servers
        cutoff = int(len(sorted_content) * 0.2)
        popular_content = sorted_content[:cutoff]
        for content in popular_content:
            for server in regional_servers:
                if server.available_bytes >= content.size_bytes:
                    placement[server.server_id].append(content.content_id)
                    server.used_bytes += content.size_bytes
                    server.cached_content.add(content.content_id)

        # Tier 2: long-tail content spread across servers by hash.
        # (A production system would use consistent hashing so that
        # adding or removing a server reshuffles far less content.)
        remaining_content = sorted_content[cutoff:]
        for content in remaining_content:
            server_idx = hash(content.content_id) % len(regional_servers)
            server = regional_servers[server_idx]
            if server.available_bytes >= content.size_bytes:
                placement[server.server_id].append(content.content_id)
                server.used_bytes += content.size_bytes
                server.cached_content.add(content.content_id)

        return placement

    def select_server_for_request(
        self,
        content_id: str,
        user_region: str,
        user_isp: str
    ) -> Optional[OCAServer]:
        """
        Select the best OCA server to serve a content request.

        Priority:
        1. ISP-embedded OCA with content cached
        2. Regional IXP OCA with content cached
        3. Any OCA with content cached
        4. Fall back to origin (cache miss)
        """
        candidates = []
        for server in self.servers.values():
            if content_id not in server.cached_content:
                continue
            # Score servers by proximity and load
            score = 0.0
            if server.location == user_isp:
                # Prefer ISP-embedded servers
                score += 100
            elif server.region == user_region:
                # Then same-region servers
                score += 50
            # Penalize highly loaded servers
            load_pct = server.used_bytes / server.capacity_bytes
            score -= load_pct * 20
            candidates.append((score, server))

        if not candidates:
            return None  # Cache miss - fetch from origin

        # Return highest-scoring server
        candidates.sort(key=lambda x: x[0], reverse=True)
        return candidates[0][1]
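The long-tail placement above uses simple hash-mod assignment, which reshuffles most assignments whenever a server joins or leaves the region. A consistent-hash ring avoids that: only the keys between the new server and its predecessor move. A minimal sketch (the `vnodes` count and server names are illustrative, not Netflix's actual scheme):

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: when a server is added, only the keys
    between it and its predecessor on the ring are reassigned."""

    def __init__(self, servers, vnodes: int = 100):
        # Each server owns `vnodes` points on the ring for smoother balance.
        self._ring = sorted(
            (self._hash(f"{s}#{v}"), s) for s in servers for v in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # md5 keeps assignment stable across processes (unlike builtin hash()).
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def server_for(self, content_id: str) -> str:
        """Walk clockwise from the key's position to the next server point."""
        idx = bisect.bisect(self._keys, self._hash(content_id)) % len(self._keys)
        return self._ring[idx][1]

ring = HashRing(["oca-1", "oca-2", "oca-3"])
print(ring.server_for("stranger-things-s4e1"))  # stable, deterministic pick
```

Note the use of md5 rather than Python's builtin `hash()`: placement must be stable across processes and restarts, and builtin `hash()` is salted per process.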
Deep Dive 3: Microservices Architecture (Week 2 - Circuit Breakers)
Interviewer: "Netflix is famous for their microservices. How do they handle failures at scale?"
You: "Netflix pioneered many of the patterns we now consider standard: circuit breakers, bulkheads, and chaos engineering."
The Microservices Landscape
NETFLIX MICROSERVICES ECOSYSTEM
┌──────────────────────────────────────────────────────────────────────────────
│  REQUEST FLOW THROUGH NETFLIX BACKEND
│
│  Client Request
│        │
│        ▼
│  ┌───────────┐
│  │   Zuul    │  API Gateway
│  │  Gateway  │  - Authentication
│  └─────┬─────┘  - Rate limiting
│        │        - Request routing
│        ▼
│  ┌───────────┐
│  │  Eureka   │  Service Discovery
│  │ Registry  │  - Dynamic service registration
│  └─────┬─────┘  - Health monitoring
│        │        - Load balancing info
│        ▼
│  SERVICE MESH
│  [User Service] ──► [Playback Service] ──► [Reco Service] ──► [Search Service]
│        │                   │                     │                   │
│        ▼                   ▼                     ▼                   ▼
│  [Cassandra]          [EVCache]            [Cassandra]       [Elasticsearch]
│
│  Each service-to-service call is wrapped in a Hystrix circuit breaker.
└──────────────────────────────────────────────────────────────────────────────
Circuit Breaker Pattern
# resilience/circuit_breaker.py
"""
Circuit Breaker Implementation

Based on Netflix's Hystrix pattern for fault tolerance.
"""
import threading
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Any, Callable, Optional


class CircuitState(Enum):
    """Circuit breaker states."""
    CLOSED = "closed"        # Normal operation
    OPEN = "open"            # Failing, reject requests
    HALF_OPEN = "half_open"  # Testing if service recovered


@dataclass
class CircuitBreakerConfig:
    """Configuration for circuit breaker behavior."""
    failure_threshold: int = 5     # Failures before opening
    success_threshold: int = 3     # Successes to close from half-open
    timeout_seconds: float = 30.0  # Time before trying half-open
    half_open_max_calls: int = 3   # Max calls in half-open state


@dataclass
class CircuitBreakerStats:
    """Runtime statistics for monitoring."""
    total_calls: int = 0
    successful_calls: int = 0
    failed_calls: int = 0
    rejected_calls: int = 0
    state_changes: int = 0


class CircuitBreaker:
    """
    Circuit Breaker for protecting service calls.

    States:
    - CLOSED: Normal operation, requests pass through
    - OPEN: Service failing, reject requests immediately
    - HALF_OPEN: Testing if service recovered

    Transitions:
    - CLOSED -> OPEN: When failure_threshold exceeded
    - OPEN -> HALF_OPEN: After timeout_seconds
    - HALF_OPEN -> CLOSED: When success_threshold reached
    - HALF_OPEN -> OPEN: On any failure

    Applies concepts from:
    - Week 2: Failure handling, timeouts, graceful degradation
    """

    def __init__(
        self,
        name: str,
        config: Optional[CircuitBreakerConfig] = None,
        fallback: Optional[Callable] = None
    ):
        self.name = name
        self.config = config or CircuitBreakerConfig()
        self.fallback = fallback
        self._state = CircuitState.CLOSED
        self._failure_count = 0
        self._success_count = 0
        self._last_failure_time: Optional[datetime] = None
        self._half_open_calls = 0
        self.stats = CircuitBreakerStats()
        self._lock = threading.Lock()

    @property
    def state(self) -> CircuitState:
        """Get current state, checking for timeout transitions."""
        with self._lock:
            if self._state == CircuitState.OPEN:
                if self._should_attempt_reset():
                    self._transition_to(CircuitState.HALF_OPEN)
            return self._state

    def _should_attempt_reset(self) -> bool:
        """Check if enough time passed to try half-open."""
        if self._last_failure_time is None:
            return False
        elapsed = (datetime.utcnow() - self._last_failure_time).total_seconds()
        return elapsed >= self.config.timeout_seconds

    def _transition_to(self, new_state: CircuitState):
        """Transition to new state with logging."""
        old_state = self._state
        self._state = new_state
        self.stats.state_changes += 1
        if new_state == CircuitState.HALF_OPEN:
            self._half_open_calls = 0
            self._success_count = 0
        elif new_state == CircuitState.CLOSED:
            self._failure_count = 0
        print(f"Circuit {self.name}: {old_state.value} -> {new_state.value}")

    def call(self, func: Callable, *args, **kwargs) -> Any:
        """
        Execute function through circuit breaker.

        Returns result or fallback value.
        Raises exception if no fallback and circuit open.
        """
        self.stats.total_calls += 1
        current_state = self.state

        # OPEN: Reject immediately
        if current_state == CircuitState.OPEN:
            self.stats.rejected_calls += 1
            return self._handle_rejection()

        # HALF_OPEN: Limit concurrent calls
        if current_state == CircuitState.HALF_OPEN:
            with self._lock:
                if self._half_open_calls >= self.config.half_open_max_calls:
                    self.stats.rejected_calls += 1
                    return self._handle_rejection()
                self._half_open_calls += 1

        # Execute the call
        try:
            result = func(*args, **kwargs)
            self._on_success()
            return result
        except Exception as e:
            self._on_failure()
            return self._handle_failure(e)

    def _on_success(self):
        """Handle successful call."""
        with self._lock:
            self.stats.successful_calls += 1
            if self._state == CircuitState.HALF_OPEN:
                self._success_count += 1
                if self._success_count >= self.config.success_threshold:
                    self._transition_to(CircuitState.CLOSED)
            elif self._state == CircuitState.CLOSED:
                # Reset failure count on success
                self._failure_count = 0

    def _on_failure(self):
        """Handle failed call."""
        with self._lock:
            self.stats.failed_calls += 1
            self._last_failure_time = datetime.utcnow()
            if self._state == CircuitState.HALF_OPEN:
                # Any failure in half-open goes back to open
                self._transition_to(CircuitState.OPEN)
            elif self._state == CircuitState.CLOSED:
                self._failure_count += 1
                if self._failure_count >= self.config.failure_threshold:
                    self._transition_to(CircuitState.OPEN)

    def _handle_rejection(self) -> Any:
        """Handle rejected call (circuit open)."""
        if self.fallback:
            return self.fallback()
        raise CircuitOpenError(f"Circuit {self.name} is OPEN")

    def _handle_failure(self, exception: Exception) -> Any:
        """Handle failed call."""
        if self.fallback:
            return self.fallback()
        raise exception


class CircuitOpenError(Exception):
    """Raised when circuit breaker rejects a call."""
    pass


# Example usage with service calls
class RecommendationService:
    """
    Service that uses circuit breakers for dependency calls.
    """

    def __init__(self):
        self.user_service_breaker = CircuitBreaker(
            name="user-service",
            config=CircuitBreakerConfig(
                failure_threshold=5,
                timeout_seconds=30,
                success_threshold=3
            ),
            fallback=self._get_default_user
        )
        self.ml_service_breaker = CircuitBreaker(
            name="ml-recommendations",
            config=CircuitBreakerConfig(
                failure_threshold=3,
                timeout_seconds=60,
                success_threshold=2
            ),
            fallback=self._get_popular_content
        )

    def get_recommendations(self, user_id: str) -> list:
        """Get personalized recommendations for user."""
        # Get user profile (with circuit breaker)
        user = self.user_service_breaker.call(
self._fetch_user_profile, user_id
)
# Get ML recommendations (with circuit breaker)
recommendations = self.ml_service_breaker.call(
self._fetch_ml_recommendations, user
)
return recommendations
def _fetch_user_profile(self, user_id: str) -> dict:
"""Fetch user profile from user service."""
# Actual HTTP call would go here
pass
def _fetch_ml_recommendations(self, user: dict) -> list:
"""Fetch recommendations from ML service."""
# Actual HTTP call would go here
pass
def _get_default_user(self) -> dict:
"""Fallback: return anonymous user profile."""
return {"id": "anonymous", "preferences": []}
def _get_popular_content(self) -> list:
"""Fallback: return globally popular content."""
return [
{"id": "trending_1", "title": "Popular Movie 1"},
{"id": "trending_2", "title": "Popular Series 1"},
]
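The breaker above reduces to a small state machine: CLOSED → OPEN after repeated failures, OPEN → HALF_OPEN after a timeout, HALF_OPEN → CLOSED on a successful probe. A minimal, self-contained sketch of just that machine (the `TinyBreaker` class and the flaky `dependency` are invented for this example, and it closes after a single half-open success rather than honoring a `success_threshold`):

```python
import time

class TinyBreaker:
    def __init__(self, failure_threshold=3, timeout_seconds=0.1):
        self.failure_threshold = failure_threshold
        self.timeout_seconds = timeout_seconds
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, fallback):
        if self.state == "OPEN":
            if time.monotonic() - self.opened_at >= self.timeout_seconds:
                self.state = "HALF_OPEN"      # timeout elapsed: probe again
            else:
                return fallback()             # fail fast, dependency not called
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.state = "OPEN"
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0
        self.state = "CLOSED"                 # one success closes this toy breaker
        return result

# A flaky dependency: fails until 'healthy' is flipped on.
healthy = False
def dependency():
    if not healthy:
        raise RuntimeError("service down")
    return "real result"

breaker = TinyBreaker()
for _ in range(3):
    breaker.call(dependency, lambda: "fallback")
assert breaker.state == "OPEN"                # failure threshold reached

assert breaker.call(dependency, lambda: "fallback") == "fallback"  # rejected fast

time.sleep(0.15)                              # wait out the open timeout
healthy = True
assert breaker.call(dependency, lambda: "fallback") == "real result"
assert breaker.state == "CLOSED"              # probe succeeded, circuit closes
```

The key property to notice: while OPEN, the fallback returns immediately without touching the dependency at all, which is what protects a struggling service from a retry storm.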
Deep Dive 4: Recommendation System (Week 5 - Distributed Computing)
Interviewer: "Tell me about Netflix's recommendation system. How does it power 80% of viewing?"
You: "The recommendation system is Netflix's competitive moat. It processes billions of events to personalize every user's experience."
Recommendation Architecture
NETFLIX RECOMMENDATION SYSTEM
┌──────────────────────────────────────────────────────────────────────────────┐
│                                                                              │
│  DATA COLLECTION                                                             │
│  ───────────────                                                             │
│                                                                              │
│  User Events:                                                                │
│  • What you watched (and for how long)                                       │
│  • What you searched for                                                     │
│  • What you scrolled past                                                    │
│  • Time of day you watch                                                     │
│  • Device you watch on                                                       │
│  • What you added to "My List"                                               │
│                                                                              │
│  Content Metadata:                                                           │
│  • Genre, cast, director                                                     │
│  • Runtime, release year                                                     │
│  • Visual features (extracted by ML)                                         │
│  • Audio features                                                            │
│  • Maturity rating                                                           │
│                                                                              │
│                                      │                                       │
│                                      ▼                                       │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                               ML MODELS                                │  │
│  │                                                                        │  │
│  │  ┌─────────────┐  ┌─────────────┐  ┌───────────────────────────┐       │  │
│  │  │Collaborative│  │Content-Based│  │   Deep Learning Models    │       │  │
│  │  │  Filtering  │  │  Filtering  │  │     (Neural Networks)     │       │  │
│  │  └──────┬──────┘  └──────┬──────┘  └────────────┬──────────────┘       │  │
│  │         │                │                      │                      │  │
│  │         └────────────────┴──────────────────────┘                      │  │
│  │                          │                                             │  │
│  │                          ▼                                             │  │
│  │                 ┌─────────────────┐                                    │  │
│  │                 │  Ensemble Model │                                    │  │
│  │                 │  (Combines all) │                                    │  │
│  │                 └────────┬────────┘                                    │  │
│  └──────────────────────────┼─────────────────────────────────────────────┘  │
│                             │                                                │
│                             ▼                                                │
│  ┌────────────────────────────────────────────────────────────────────────┐  │
│  │                            PERSONALIZATION                             │  │
│  │                                                                        │  │
│  │  Homepage Rows:                                                        │  │
│  │  ┌──────────────────────────────────────────────────────────────┐      │  │
│  │  │ "Because You Watched Stranger Things"                        │      │  │
│  │  │ [Show1] [Show2] [Show3] [Show4] [Show5] →                    │      │  │
│  │  └──────────────────────────────────────────────────────────────┘      │  │
│  │  ┌──────────────────────────────────────────────────────────────┐      │  │
│  │  │ "Trending Now"                                               │      │  │
│  │  │ [Show1] [Show2] [Show3] [Show4] [Show5] →                    │      │  │
│  │  └──────────────────────────────────────────────────────────────┘      │  │
│  │  ┌──────────────────────────────────────────────────────────────┐      │  │
│  │  │ "Top Picks for You"                                          │      │  │
│  │  │ [Show1] [Show2] [Show3] [Show4] [Show5] →                    │      │  │
│  │  └──────────────────────────────────────────────────────────────┘      │  │
│  │                                                                        │  │
│  │  Even the artwork shown varies per user!                               │  │
│  │                                                                        │  │
│  └────────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
Recommendation Algorithm Implementation
# recommendations/recommender.py
"""
Netflix-style Recommendation System
Demonstrates collaborative filtering and personalization.
"""
from dataclasses import dataclass
from typing import Dict, List, Tuple, Optional
import numpy as np
from collections import defaultdict
@dataclass
class UserProfile:
"""User viewing profile."""
user_id: str
viewing_history: List[str] # Content IDs watched
    ratings: Dict[str, float]      # Content ID → rating
    preferences: Dict[str, float]  # Genre → preference score
@dataclass
class ContentItem:
"""Content metadata."""
content_id: str
title: str
genres: List[str]
tags: List[str]
avg_rating: float
popularity_score: float
class CollaborativeFilter:
"""
Collaborative Filtering using Matrix Factorization.
Learns latent factors for users and items that explain
viewing patterns. Similar users have similar factor vectors.
Applies concepts from:
- Week 5: Distributed computation (training at scale)
"""
def __init__(self, num_factors: int = 50, learning_rate: float = 0.01):
self.num_factors = num_factors
self.learning_rate = learning_rate
self.user_factors: Dict[str, np.ndarray] = {}
self.item_factors: Dict[str, np.ndarray] = {}
def fit(self, interactions: List[Tuple[str, str, float]]):
"""
Train model on user-item interactions.
Args:
interactions: List of (user_id, item_id, rating) tuples
"""
# Initialize random factors
users = set(u for u, _, _ in interactions)
items = set(i for _, i, _ in interactions)
for user in users:
self.user_factors[user] = np.random.randn(self.num_factors) * 0.1
for item in items:
self.item_factors[item] = np.random.randn(self.num_factors) * 0.1
# Stochastic Gradient Descent
for epoch in range(20):
np.random.shuffle(interactions)
total_error = 0
for user_id, item_id, rating in interactions:
# Predict rating
pred = np.dot(
self.user_factors[user_id],
self.item_factors[item_id]
)
error = rating - pred
total_error += error ** 2
# Update factors
user_grad = error * self.item_factors[item_id]
item_grad = error * self.user_factors[user_id]
self.user_factors[user_id] += self.learning_rate * user_grad
self.item_factors[item_id] += self.learning_rate * item_grad
rmse = np.sqrt(total_error / len(interactions))
if epoch % 5 == 0:
print(f"Epoch {epoch}, RMSE: {rmse:.4f}")
def predict(self, user_id: str, item_id: str) -> float:
"""Predict rating for user-item pair."""
if user_id not in self.user_factors:
return 3.0 # Default rating for unknown users
if item_id not in self.item_factors:
return 3.0 # Default rating for unknown items
return np.dot(
self.user_factors[user_id],
self.item_factors[item_id]
)
def recommend(self, user_id: str, n: int = 10) -> List[Tuple[str, float]]:
"""Get top N recommendations for user."""
if user_id not in self.user_factors:
return []
scores = []
for item_id, item_factors in self.item_factors.items():
score = np.dot(self.user_factors[user_id], item_factors)
scores.append((item_id, score))
scores.sort(key=lambda x: x[1], reverse=True)
return scores[:n]
class ContentBasedFilter:
"""
Content-Based Filtering using item features.
Recommends items similar to what user has liked,
based on content features (genre, tags, etc.)
"""
def __init__(self):
self.item_features: Dict[str, np.ndarray] = {}
self.feature_index: Dict[str, int] = {}
def index_content(self, items: List[ContentItem]):
"""Build feature vectors for all content."""
# Build feature vocabulary
all_features = set()
for item in items:
all_features.update(item.genres)
all_features.update(item.tags)
self.feature_index = {f: i for i, f in enumerate(all_features)}
num_features = len(self.feature_index)
# Build feature vectors
for item in items:
vector = np.zeros(num_features)
for genre in item.genres:
vector[self.feature_index[genre]] = 1.0
for tag in item.tags:
vector[self.feature_index[tag]] = 0.5
# Normalize
norm = np.linalg.norm(vector)
if norm > 0:
vector /= norm
self.item_features[item.content_id] = vector
def get_similar(self, item_id: str, n: int = 10) -> List[Tuple[str, float]]:
"""Find N most similar items."""
if item_id not in self.item_features:
return []
target = self.item_features[item_id]
similarities = []
for other_id, features in self.item_features.items():
if other_id == item_id:
continue
similarity = np.dot(target, features)
similarities.append((other_id, similarity))
similarities.sort(key=lambda x: x[1], reverse=True)
return similarities[:n]
def recommend_for_user(
self,
user: UserProfile,
n: int = 10
) -> List[Tuple[str, float]]:
"""Recommend based on user's viewing history."""
# Aggregate scores from similar items to watched content
scores = defaultdict(float)
for watched_id in user.viewing_history[-20:]: # Last 20 watched
similar = self.get_similar(watched_id, n=50)
for item_id, similarity in similar:
if item_id not in user.viewing_history:
scores[item_id] += similarity
# Sort by aggregated score
recommendations = sorted(
scores.items(),
key=lambda x: x[1],
reverse=True
)
return recommendations[:n]
class HybridRecommender:
"""
Hybrid Recommender combining multiple signals.
Netflix uses ensemble of multiple models:
- Collaborative filtering
- Content-based filtering
- Popularity-based
- Context-aware (time, device)
"""
def __init__(self):
self.collaborative = CollaborativeFilter()
self.content_based = ContentBasedFilter()
        # Weights for combining models. Popularity and recency are listed
        # for completeness; only the first two signals are wired up below.
        self.weights = {
            "collaborative": 0.4,
            "content_based": 0.3,
            "popularity": 0.2,
            "recency": 0.1
        }
    def recommend(
        self,
        user: UserProfile,
        context: dict = None,
        n: int = 20
    ) -> List[Tuple[str, float]]:
"""
Generate personalized recommendations.
Combines multiple signals with learned weights.
"""
scores = defaultdict(float)
# Collaborative filtering scores
cf_recs = self.collaborative.recommend(user.user_id, n=100)
for item_id, score in cf_recs:
scores[item_id] += score * self.weights["collaborative"]
# Content-based scores
cb_recs = self.content_based.recommend_for_user(user, n=100)
for item_id, score in cb_recs:
scores[item_id] += score * self.weights["content_based"]
# Filter out already watched
for watched_id in user.viewing_history:
scores.pop(watched_id, None)
# Sort and return top N
sorted_items = sorted(
scores.items(),
key=lambda x: x[1],
reverse=True
)
return sorted_items[:n]
def get_row_recommendations(
self,
user: UserProfile,
row_type: str
) -> List[str]:
"""
Get recommendations for a specific homepage row.
Row types:
- "because_you_watched": Similar to recently watched
- "trending": Popular in user's region
- "top_picks": Personalized top recommendations
- "continue_watching": Incomplete titles
"""
if row_type == "because_you_watched":
if not user.viewing_history:
return []
last_watched = user.viewing_history[-1]
similar = self.content_based.get_similar(last_watched, n=10)
return [item_id for item_id, _ in similar]
elif row_type == "top_picks":
recs = self.recommend(user, n=10)
return [item_id for item_id, _ in recs]
# ... other row types
return []
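The "because you watched" row in `ContentBasedFilter` can be seen end-to-end in a few lines. A self-contained toy version (the catalog titles and genres here are made up, and cosine similarity over one-hot genre vectors stands in for the real feature pipeline):

```python
import numpy as np

GENRES = ["thriller", "sci-fi", "drama", "comedy"]
CATALOG = {
    "stranger_things": ["thriller", "sci-fi", "drama"],
    "dark":            ["thriller", "sci-fi"],
    "the_crown":       ["drama"],
    "space_force":     ["sci-fi", "comedy"],
}

def vectorize(genres):
    """One-hot genre vector, unit-normalized so a dot product is cosine similarity."""
    v = np.array([1.0 if g in genres else 0.0 for g in GENRES])
    return v / np.linalg.norm(v)

vectors = {title: vectorize(g) for title, g in CATALOG.items()}

def because_you_watched(title, n=2):
    """Rank all other titles by similarity to the one just watched."""
    target = vectors[title]
    scores = [(other, float(np.dot(target, v)))
              for other, v in vectors.items() if other != title]
    scores.sort(key=lambda x: x[1], reverse=True)
    return [t for t, _ in scores[:n]]

print(because_you_watched("stranger_things"))   # → ['dark', 'the_crown']
```

Dark shares two of Stranger Things' three genres, so it ranks first; the real system does the same thing with thousands of learned features instead of four hand-picked genres.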
Phase 5: Scaling and Edge Cases (5 minutes)
Interviewer: "How does Netflix handle extreme scale events?"
Scaling for Major Releases
SQUID GAME SEASON 2 LAUNCH (Example)
Peak load characteristics:
- 87 million views in first week
- Global simultaneous interest
- Massive traffic spike at midnight releases
Preparation:
1. Pre-position content on ALL OCAs globally
2. Scale up control plane capacity
3. Implement launch-specific caching
4. Prepare fallback recommendations if ML overloaded
Traffic shaping:
- Stagger release times by region (reduce peak)
- Pre-generate recommendations for likely viewers
- Cache homepage variations
Edge Cases
Edge Case 1: ISP OCA Failure
Scenario: ISP-embedded OCA goes offline
Impact: Users in that ISP see degraded performance
Solution:
- Automatic failover to IXP-level OCA
- BGP routing updates within seconds
- User sees brief quality dip, then recovery
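The client-side half of this failover is simple to sketch: the control plane hands the player a ranked list of OCA endpoints, and the player walks down the list when the preferred server stops responding. A toy version (server names and the health check are invented for illustration; real steering also uses BGP and control-plane signals):

```python
def pick_server(ranked_ocas, is_healthy):
    """Return the first healthy OCA from the ranked list, else None."""
    for oca in ranked_ocas:
        if is_healthy(oca):
            return oca
    return None

# Ranked by proximity: ISP-embedded first, then IXP, then regional.
ranked = ["isp-oca-42.example", "ixp-oca-7.example", "regional-oca-1.example"]
down = {"isp-oca-42.example"}   # the ISP-embedded OCA just went offline

chosen = pick_server(ranked, lambda oca: oca not in down)
print(chosen)   # → "ixp-oca-7.example"
```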
Edge Case 2: Regional Control Plane Failure
Scenario: AWS region experiences outage
Impact: Users can't browse or start new streams
Solution:
- Multi-region deployment with automatic failover
- Continue Watching cached locally
- Graceful degradation: show cached recommendations
Edge Case 3: Thundering Herd at Midnight Release
Scenario: Millions request same new content simultaneously
Impact: Potential OCA overload
Solution:
- Pre-warm all caches with new content
- Distribute load across replica OCAs
- Queue and rate-limit API requests
- Start users at slightly different positions
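Two of these mitigations can be sketched concretely: deterministic per-user jitter to spread the midnight spike over a short window, and a token bucket to shed whatever load still arrives at once. The window size, rates, and the `TokenBucket` shape are illustrative, not Netflix's actual admission control:

```python
import hashlib

def release_jitter_seconds(user_id: str, window_seconds: int = 120) -> int:
    """Spread 'available at midnight' over a window, deterministically per
    user so retries from the same client land in the same slot."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % window_seconds

class TokenBucket:
    """Admit at most `rate` requests per tick, with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity

    def tick(self):                      # called once per time unit
        self.tokens = min(self.capacity, self.tokens + self.rate)

    def allow(self) -> bool:
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Jitter spreads users across a 2-minute window:
slots = {release_jitter_seconds(f"user-{i}") for i in range(1000)}
print(f"1000 users spread over {len(slots)} distinct seconds")

# The bucket sheds everything beyond the burst capacity in a single tick:
bucket = TokenBucket(rate=100, capacity=200)
admitted = sum(bucket.allow() for _ in range(500))
print(f"admitted {admitted} of 500 simultaneous requests")
```

Hashing the user ID (rather than random jitter) matters: a client that retries gets the same slot every time instead of re-rolling into the peak.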
Phase 6: Monitoring and Operations
Monitoring Dashboard
┌──────────────────────────────────────────────────────────────────────────────┐
│                         NETFLIX STREAMING MONITORING                         │
│                                                                              │
│  PLAYBACK HEALTH                                                             │
│  ───────────────                                                             │
│  Active streams:           8.2M concurrent           82%                     │
│  Play start success:       99.7%                     OK                      │
│  Rebuffer rate:            0.02%                     OK                      │
│  Avg video quality:        4.2 Mbps                  Good                    │
│                                                                              │
│  CDN HEALTH                                                                  │
│  ──────────                                                                  │
│  OCA servers online:       16,892 / 17,000           99.4%                   │
│  Cache hit rate:           94.2%                     Good                    │
│  Origin bandwidth:         2.1 Tbps                  Normal                  │
│  P99 latency to OCA:       12ms                      Excellent               │
│                                                                              │
│  CONTROL PLANE                                                               │
│  ─────────────                                                               │
│  API requests/sec:         892,341                   89%                     │
│  API latency p99:          45ms                      Good                    │
│  Service health:           1,247 / 1,250             99.8%                   │
│  Circuit breakers open:    3                         OK                      │
│                                                                              │
│  QUALITY OF EXPERIENCE                                                       │
│  ─────────────────────                                                       │
│  Play start time p50:      1.2s                      Good                    │
│  Play start time p99:      3.1s                      OK                      │
│  Quality switches/hour:    2.1                       Normal                  │
│  Session abandonment:      0.8%                      Low                     │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
Chaos Engineering
SIMIAN ARMY - NETFLIX'S CHAOS ENGINEERING TOOLS
┌──────────────────────────────────────────────────────────────────────────────┐
│                                                                              │
│  CHAOS MONKEY                                                                │
│  • Randomly terminates instances during business hours                       │
│  • Ensures services handle instance failures gracefully                      │
│  • "If you can't handle a single instance dying, fix it first"               │
│                                                                              │
│  LATENCY MONKEY                                                              │
│  • Injects artificial delays into service calls                              │
│  • Tests timeout and fallback behavior                                       │
│  • Ensures graceful degradation under slow dependencies                      │
│                                                                              │
│  CHAOS GORILLA                                                               │
│  • Simulates entire AWS Availability Zone failure                            │
│  • Tests regional failover                                                   │
│  • Run during low-traffic periods                                            │
│                                                                              │
│  CHAOS KONG                                                                  │
│  • Simulates entire AWS Region failure                                       │
│  • Ultimate test of multi-region resilience                                  │
│  • Run very carefully with full preparation                                  │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
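Latency Monkey's core trick, injecting delay in front of a dependency and checking that the caller degrades rather than hangs, can be sketched in a few lines. Everything here (the wrapper, delays, probabilities, the fake profile service) is invented for illustration, and the timeout check is deliberately crude: it still blocks for the full delay and merely distrusts slow results:

```python
import random
import time

def with_injected_latency(func, delay_seconds: float, probability: float):
    """Return a wrapped version of `func` that sometimes sleeps before answering."""
    def wrapped(*args, **kwargs):
        if random.random() < probability:
            time.sleep(delay_seconds)          # injected fault
        return func(*args, **kwargs)
    return wrapped

def call_with_timeout(func, timeout_seconds: float, fallback):
    """Crude check: time the call and fall back if it ran past budget."""
    start = time.monotonic()
    result = func()
    if time.monotonic() - start > timeout_seconds:
        return fallback                        # too slow: degrade gracefully
    return result

random.seed(7)
slow_profile_service = with_injected_latency(
    lambda: {"id": "u1"}, delay_seconds=0.05, probability=0.5)

results = [call_with_timeout(slow_profile_service, 0.02, {"id": "anonymous"})
           for _ in range(20)]
fallbacks = sum(r["id"] == "anonymous" for r in results)
print(f"{fallbacks}/20 calls degraded to the fallback profile")
```

A chaos test then asserts exactly this: under injected latency, every call still returns something usable within budget, never an exception or a hang.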
Interview Conclusion
Interviewer: "Excellent work. You've covered the key aspects of Netflix's architecture: the two-plane separation, adaptive streaming, their CDN strategy, and resilience patterns. Any questions for me?"
You: "Thank you! I'm curious how Netflix handles the transition to live streaming for events like sports. The infrastructure seems optimized for VOD; what changes for live?"
Interviewer: "Great question. Live requires entirely different latency characteristics; you can't pre-cache content. Netflix is building new infrastructure specifically for live, with different CDN strategies focused on minimal latency rather than maximum caching. It's a fascinating new challenge for their team."
Summary: Concepts Applied
┌──────────────────────────────────────────────────────────────────────────────┐
│                CONCEPTS FROM 10-WEEK COURSE IN NETFLIX DESIGN                │
│                                                                              │
│  WEEK 1: DATA AT SCALE                                                       │
│  ├── Partitioning: Content distributed across OCA tiers                      │
│  ├── Replication: Popular content on all regional servers                    │
│  └── Hot Keys: Trending content pre-positioned everywhere                    │
│                                                                              │
│  WEEK 2: FAILURE-FIRST DESIGN                                                │
│  ├── Circuit breakers: Hystrix pattern for service calls                     │
│  ├── Timeouts: Strict budgets for playback start                             │
│  ├── Graceful degradation: Cached recommendations as fallback                │
│  └── Chaos engineering: Simian Army for resilience testing                   │
│                                                                              │
│  WEEK 3: MESSAGING & ASYNC                                                   │
│  ├── Event streaming: User events for recommendations                        │
│  └── Async processing: Background content encoding                           │
│                                                                              │
│  WEEK 4: CACHING                                                             │
│  ├── Multi-tier CDN: ISP → IXP → Origin                                      │
│  ├── Cache warming: Pre-position content before releases                     │
│  └── EVCache: In-memory caching for API responses                            │
│                                                                              │
│  WEEK 5: CONSISTENCY & COORDINATION                                          │
│  └── Eventual consistency: Recommendations update async                      │
│                                                                              │
│  WEEK 6: NOTIFICATION SYSTEMS                                                │
│  └── Push notifications: New content alerts                                  │
│                                                                              │
│  WEEK 10: PRODUCTION READINESS                                               │
│  ├── SLOs: Play start time, rebuffer rate                                    │
│  ├── Observability: Atlas metrics, distributed tracing                       │
│  └── Canary deployments: Spinnaker for safe rollouts                         │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────────────┐
│                                                                              │
│                     WHY NETFLIX IS AN ENGINEERING MARVEL                     │
│                                                                              │
│  SCALE                                                                       │
│  ─────                                                                       │
│  • 300+ million subscribers watching content globally                        │
│  • 94 billion hours viewed in just 6 months                                  │
│  • 15% of global internet traffic at peak hours                              │
│  • 17,000+ CDN servers in 158 countries                                      │
│                                                                              │
│  INNOVATION                                                                  │
│  ──────────                                                                  │
│  • Pioneered adaptive bitrate streaming at scale                             │
│  • Invented per-title encoding optimization (20-30% bandwidth savings)       │
│  • Created chaos engineering discipline                                      │
│  • Open-sourced foundational microservices tools                             │
│                                                                              │
│  EFFICIENCY                                                                  │
│  ──────────                                                                  │
│  • Built own CDN saving billions vs third-party                              │
│  • <$0.001 per GB delivered (vs $0.02+ industry)                             │
│  • 80% of viewing driven by recommendations                                  │
│  • Zero buffering goal achieved for stable connections                       │
│                                                                              │
│  RESILIENCE                                                                  │
│  ──────────                                                                  │
│  • 99.99% availability target achieved                                       │
│  • Survives AWS region failures                                              │
│  • Graceful degradation at every layer                                       │
│  • "Chaos Monkey" became industry standard                                   │
│                                                                              │
│  "Netflix didn't just build a streaming service; they reinvented             │
│   how the internet delivers video and how companies build                    │
│   resilient distributed systems."                                            │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
Sources
Official Documentation:
- Netflix Open Connect Overview: https://openconnect.netflix.com/
- Netflix Tech Blog: https://netflixtechblog.com/
- Netflix Research: https://research.netflix.com/
Statistics and Data:
- Netflix Q4 2024 Earnings Report
- Netflix Engagement Report H2 2024: https://about.netflix.com/en/news/what-we-watched-the-second-half-of-2024
- Statista Netflix Subscriber Statistics
Architecture and Technical:
- "Mastering Chaos - A Netflix Guide to Microservices" - Josh Evans (QCon)
- Netflix OSS: https://netflix.github.io/
- Per-Title Encoding Optimization: Netflix Tech Blog
Open Source Tools:
- Eureka: Service discovery
- Zuul: API gateway
- Hystrix: Circuit breaker (archived but influential)
- Spinnaker: Continuous delivery
- Chaos Monkey: Chaos engineering
Further Reading
Official Netflix Engineering:
- Netflix Tech Blog: https://netflixtechblog.com/ (Extensive technical articles)
- Netflix Research: https://research.netflix.com/ (ML, video encoding papers)
Engineering Talks (Highly Recommended):
- Josh Evans - "Mastering Chaos: A Netflix Guide to Microservices" (QCon 2016)
- Nora Jones - "Chaos Engineering at Netflix" (Strange Loop)
- Anne Lovelace - "Open Connect: Netflix's CDN" (NANOG)
Books:
- "Chaos Engineering" by Casey Rosenthal & Nora Jones (written by Netflix engineers)
- "Building Microservices" by Sam Newman (references Netflix patterns)
- "Designing Data-Intensive Applications" by Martin Kleppmann (CDN and caching principles)
Related Systems to Study:
- YouTube: Different approach to video delivery
- Twitch: Live streaming challenges
- Disney+: Competing architecture
Self-Assessment Checklist
After studying this case study, you should be able to:
- Explain the two-plane architecture (control plane vs data plane)
- Describe how adaptive bitrate streaming works
- Explain Netflix's per-title encoding optimization
- Design a content placement strategy for a global CDN
- Implement a circuit breaker pattern for microservices
- Explain the Open Connect CDN architecture
- Describe how recommendation systems use collaborative filtering
- Design monitoring and alerting for a streaming platform
- Explain chaos engineering principles and the Simian Army
- Calculate infrastructure requirements for a streaming service
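For the last checklist item, a back-of-envelope in code, using round numbers from this case study (roughly 8M concurrent streams at peak and a 4.2 Mbps average bitrate, with a ~94% CDN cache hit rate). The per-server egress figure is an assumption for illustration, and the result deliberately ignores redundancy, content placement, and geography:

```python
concurrent_streams = 8_000_000
avg_bitrate_mbps = 4.2
cache_hit_rate = 0.94
oca_capacity_gbps = 100          # assumed egress per CDN server (illustrative)

# Total edge egress at peak, in Tbps (Mbps → Tbps is a factor of 1e6).
peak_egress_tbps = concurrent_streams * avg_bitrate_mbps / 1e6

# Cache misses must be filled from origin.
origin_tbps = peak_egress_tbps * (1 - cache_hit_rate)

# Lower bound on servers if raw egress were the only constraint.
servers_needed = peak_egress_tbps * 1000 / oca_capacity_gbps

print(f"Peak edge egress: {peak_egress_tbps:.1f} Tbps")
print(f"Origin fill:      {origin_tbps:.1f} Tbps")
print(f"Min CDN servers:  {servers_needed:,.0f} (at 100% utilization)")
```

Raw egress says a few hundred servers could carry peak traffic, yet the fleet is 17,000+: placement (every ISP pocket needs its own local copy of the catalog), redundancy, and headroom dominate the count, not bandwidth. Note the origin fill of ~2 Tbps lines up with the monitoring dashboard figure earlier.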
This case study demonstrates how Netflix combined innovations in video encoding, content delivery, microservices architecture, and machine learning to create the world's dominant streaming platform. The same principles apply to any large-scale media delivery system.
End of Bonus Problem 6: Netflix Streaming
Document Statistics:
- Core concepts covered: 20+
- Code implementations: 4 (ABR client, content placement, circuit breaker, recommender)
- Architecture diagrams: 10+
- Real-world scale numbers: 40+