Bonus Problem 10: Twitter/X
The Real-Time Information Network That Delivers 500 Million Tweets Per Day
How Do You Deliver a Celebrity's Tweet to 100 Million Followers in Under 5 Seconds?
Imagine this challenge: A celebrity with 100 million followers posts a tweet. Within 5 seconds, that tweet must appear in the home timeline of every single follower. Meanwhile, 500 million other tweets are being posted that same day, each needing to reach their own audiences.
This is Twitter/X: a real-time information network where a single tweet can trigger over a hundred million write operations. It's not just a social network. It's a masterclass in fan-out architecture, timeline construction, and solving the "celebrity problem" at scale.
THE TWITTER/X SCALE (2025)
============================================================

  USERS
  -----
  Monthly Active Users:      ~600 Million
  Daily Active Users:        ~250 Million
  Tweets per day:            500+ Million
  Tweets per second (avg):   ~6,000
  Tweets per second (peak):  ~150,000+

  TIMELINES
  ---------
  Timeline reads (QPS):      300,000+
  Timeline writes (QPS):     ~6,000
  Read/Write ratio:          50:1
  Target delivery time:      < 5 seconds

  SOCIAL GRAPH
  ------------
  Average followers:         ~700
  Celebrity followers:       100M+ (top accounts)
  Follow relationships:      Hundreds of billions

  INFRASTRUCTURE
  --------------
  Timeline cache size:       800 tweets per user
  Redis clusters:            Thousands of nodes
  Cache replication:         3x per timeline
  Data centers:              Multiple (hybrid cloud)

  TOP ACCOUNTS (Feb 2025)
  -----------------------
  @elonmusk:      215M followers
  @barackobama:   131M followers
  @cristiano:     112M followers
  @justinbieber:  111M followers

============================================================
This is the system we'll design today, and along the way we'll see how Twitter solved one of the hardest problems in distributed systems: the fan-out problem.
The Interview Begins
You walk into the interview room. The interviewer smiles and gestures to the whiteboard.
Interviewer: "Thanks for coming in. Today we're going to design a social network timeline system like Twitter. I'm particularly interested in how you'd handle the fan-out problem: delivering tweets to millions of followers in real time. Please think out loud."
They write on the whiteboard:
============================================================

  Design a Twitter-like Timeline

  Requirements:
  - Users can post tweets (short messages)
  - Users can follow other users
  - Users see a home timeline with tweets from people they follow
  - Tweets should appear in followers' timelines within 5 seconds
  - Handle celebrities with 100M+ followers
  - Support 500M+ tweets per day

============================================================
Interviewer: "Take a few minutes to think about this, then walk me through your approach. We have about 45 minutes."
Phase 1: Requirements Clarification (5 minutes)
Before diving in, you take a breath and start asking questions.
Your Questions
You: "Before I start designing, let me clarify some requirements. What's the expected scale in terms of users and tweets?"
Interviewer: "Think 500-600 million monthly active users, about 250 million daily active users, and 500 million tweets per day."
You: "For the timeline, how many tweets should we show and how fresh should they be?"
Interviewer: "The home timeline should show the most recent tweets from people you follow. Target delivery under 5 seconds for most users."
You: "What about the follower distribution? How do we handle users with millions of followers?"
Interviewer: "Great question. The top accounts have 100+ million followers. A tweet from them needs special handling: you can't fan out to 100 million timelines synchronously."
You: "Should we support real-time updates or is pull-based refresh acceptable?"
Interviewer: "Pull-based is fine for the home timeline, but we should discuss the trade-offs."
You: "Perfect. Let me summarize the requirements."
Functional Requirements
1. TWEET OPERATIONS
- Create tweet (280 characters, media optional)
- Delete tweet
- Retweet / Quote tweet
- Like tweet
- Reply to tweet
2. FOLLOW OPERATIONS
- Follow a user
- Unfollow a user
- View followers list
- View following list
3. TIMELINE OPERATIONS
- View home timeline (tweets from followed users)
- View user timeline (tweets by specific user)
- Refresh timeline to see new tweets
4. SEARCH & DISCOVERY
- Search tweets by keyword/hashtag
- Trending topics
- User search
Non-Functional Requirements
1. LATENCY
- Tweet delivery to timelines: < 5 seconds (p99)
- Timeline read: < 200ms (p99)
- Tweet post confirmation: < 500ms
2. AVAILABILITY
- 99.99% for reads (timeline viewing)
- 99.9% for writes (posting tweets)
- Graceful degradation during peaks
3. SCALE
- 500M tweets/day = ~6,000 tweets/second
- 300K timeline reads/second
- Peak: 10-20x during major events
4. CONSISTENCY
- Eventual consistency acceptable for timelines
- Strong consistency for tweet storage
- Read-your-own-writes for tweet authors
Phase 2: Back of the Envelope Estimation (5 minutes)
You: "Let me work through the numbers to understand the scale."
Traffic Estimation
TWEET WRITE TRAFFIC
Tweets per day: 500 million
Tweets per second (avg): 500M / 86,400 ≈ 6,000 TPS
Peak multiplier: 10x (major events)
Peak tweets per second: 60,000 TPS
TIMELINE READ TRAFFIC
Daily active users: 250 million
Timeline refreshes per user: ~20/day
Timeline reads per day: 5 billion
Timeline reads per second: 5B / 86,400 ≈ 58,000 QPS
Peak (with buffer): 300,000 QPS
Fan-Out Estimation
FAN-OUT CALCULATIONS
Average followers per user: 700
Tweets per second: 6,000
Fan-out writes per second: 6,000 × 700 = 4.2 million
Celebrity problem:
- Celebrity with 100M followers tweets
- Fan-out to 100M timelines
- At 100K writes/second = 1,000 seconds (16 minutes!)
- THIS IS TOO SLOW → Need hybrid approach
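The arithmetic behind that conclusion is easy to check in a few lines of Python. The 100K timeline inserts/second throughput is the assumed figure from the estimate above, not a measured number:

```python
# Back-of-envelope: how long does synchronous fan-out take?

def fanout_seconds(followers: int, writes_per_second: int) -> float:
    """Seconds to insert one tweet into every follower's timeline."""
    return followers / writes_per_second

WRITE_RATE = 100_000  # assumed timeline inserts/second across the cluster

avg_user = fanout_seconds(700, WRITE_RATE)           # ~7 ms: instant
celebrity = fanout_seconds(100_000_000, WRITE_RATE)  # 1,000 seconds

print(f"average user: {avg_user * 1000:.0f} ms")
print(f"celebrity:    {celebrity:.0f} s (~{celebrity / 60:.0f} minutes)")
```

A thousand-fold gap in fan-out cost between the average user and a top account is the whole motivation for the hybrid approach.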
Storage Estimation
TWEET STORAGE
Tweet size (avg): ~1 KB (text, metadata, indices)
Tweets per day: 500 million
Daily storage: 500 GB/day
Yearly storage: ~180 TB/year
TIMELINE CACHE
Active users: 250 million
Timeline entries per user: 800 (tweet IDs only)
Entry size: ~20 bytes (tweet_id + metadata)
Per-user timeline: 16 KB
Total timeline cache: 250M × 16 KB = 4 TB
With 3x replication: 12 TB (fits in Redis cluster)
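The cache-sizing figures above follow directly from multiplying the estimates together; a quick sketch to verify them (all inputs are the assumed estimates, not measurements):

```python
# Verify the timeline-cache sizing estimate.

def cache_bytes(users: int, entries_per_user: int, entry_bytes: int) -> int:
    """Total bytes needed to cache tweet-ID timelines for all users."""
    return users * entries_per_user * entry_bytes

per_user = cache_bytes(1, 800, 20)           # 16,000 B = 16 KB per user
total = cache_bytes(250_000_000, 800, 20)    # 4e12 B = 4 TB
replicated = total * 3                       # 12 TB with 3x replication

print(per_user, total / 1e12, replicated / 1e12)
```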
Key Metrics Summary
============================================================
                   ESTIMATION SUMMARY

  TRAFFIC
  - Tweet writes:    6,000/second (60K peak)
  - Timeline reads:  300,000/second
  - Fan-out writes:  4.2 million/second

  STORAGE
  - Tweet storage:   500 GB/day
  - Timeline cache:  12 TB (Redis)
  - Social graph:    Hundreds of GB

  KEY INSIGHT
  -----------
  Read/write ratio is 50:1
  -> Optimize for reads (pre-compute timelines)
  -> Accept higher write latency
============================================================
Phase 3: High-Level Design (10 minutes)
You: "Now let me sketch out the high-level architecture."
System Architecture
============================================================
            TWITTER HIGH-LEVEL ARCHITECTURE

  CLIENTS:   [Web]     [iOS]     [Android]
                         |
                         v
  +------------------------------------------------+
  |               API GATEWAY / LB                 |
  |        Rate Limiting | Auth | Routing          |
  +------------------------------------------------+
                         |
  WRITE PATH             v
  +------------------------------------------------+
  |  Tweet Service --> Fan-Out Service             |
  |        |                 |        ^            |
  |        v                 v        |            |
  |  Tweet Storage     Timeline    Social Graph    |
  |  (Manhattan)       Cache (Redis)               |
  +------------------------------------------------+

  READ PATH
  +------------------------------------------------+
  |  Timeline Service <-- Timeline Cache (Redis)   |
  |        |          <-- Tweet Service            |
  |        v                                       |
  |  CELEBRITY MERGE                               |
  |  (Pull celebrity tweets at read time)          |
  +------------------------------------------------+

  SEARCH & TRENDS
  +------------------------------------------------+
  |  Search Service  |  Trending Service           |
  |          Elasticsearch Cluster                 |
  +------------------------------------------------+
============================================================
The Two Timeline Approaches
============================================================
              FAN-OUT STRATEGIES COMPARED

  APPROACH 1: FAN-OUT ON WRITE (Push)
  -----------------------------------
  User A posts tweet
        |
        v
  Fan-Out Service --> Write to all followers' timelines
                        Follower 1 timeline <- tweet_id
                        Follower 2 timeline <- tweet_id
                        Follower 3 timeline <- tweet_id
                        ...
                        Follower N timeline <- tweet_id

  + Fast reads (timeline already assembled)
  - Slow writes for users with many followers
  - Celebrity problem: 100M writes per tweet!

  APPROACH 2: FAN-OUT ON READ (Pull)
  ----------------------------------
  User B requests timeline
        |
        v
  Timeline Service --> Look up all users B follows
                       Fetch recent tweets from each
                       Merge and sort by timestamp
                       Return top N tweets

  + Fast writes (just store the tweet)
  - Slow reads (must query many users)
  - Doesn't scale for users following many accounts

  APPROACH 3: HYBRID (Twitter's actual approach)
  ----------------------------------------------
  Regular users (< 10K followers):  Fan-out on write
  Celebrities   (> 10K followers):  Fan-out on read

  At read time:
  1. Fetch pre-computed timeline from cache
  2. Fetch celebrity tweets separately
  3. Merge and sort
  4. Return to user

  + Best of both worlds
  + Fast reads for most users
  + Handles celebrity problem
============================================================
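The hybrid read path (steps 1-4 above) can be sketched in a few lines. This is an illustrative simplification, assuming both inputs are lists of (timestamp, tweet_id) tuples rather than the full timeline entries the real service would use:

```python
# Sketch of the hybrid merge: pre-computed timeline + celebrity tweets.

def merge_home_timeline(cached, celebrity, limit=50):
    """Merge both sources, sort newest-first, deduplicate, cap at limit."""
    seen = set()
    merged = []
    for ts, tid in sorted(cached + celebrity, reverse=True):
        if tid not in seen:          # same tweet can appear in both lists
            seen.add(tid)
            merged.append((ts, tid))
            if len(merged) == limit:
                break
    return merged

cached = [(100, "t3"), (90, "t2"), (80, "t1")]   # fan-out on write
celebrity = [(95, "c2"), (85, "c1")]             # pulled at read time
print(merge_home_timeline(cached, celebrity))
# [(100, 't3'), (95, 'c2'), (90, 't2'), (85, 'c1'), (80, 't1')]
```

The user never sees the seam: regular and celebrity tweets interleave by timestamp as if they had all been fanned out.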
Phase 4: Deep Dives (20 minutes)
Interviewer: "Let's dive deeper. Tell me about the fan-out service."
Deep Dive 1: Fan-Out Service (Week 3 Concepts)
The Problem
FAN-OUT CHALLENGE
When a user with 1,000 followers tweets:
- Write tweet to storage
- Insert tweet_id into 1,000 follower timelines
- Must complete within 5 seconds
When a celebrity with 100M followers tweets:
- Same tweet storage
- Insert into 100M timelines = IMPOSSIBLE in 5 seconds
- At 100K inserts/second = 16 minutes!
Solution: Don't fan out to celebrity followers at all.
Instead, merge celebrity tweets at read time.
Implementation
# timeline/fanout_service.py
"""
Fan-Out Service for Twitter Timeline

Distributes tweets to follower timelines using a hybrid
push/pull approach based on follower count.
"""
from dataclasses import dataclass
from typing import List
from enum import Enum
import asyncio


class FanoutStrategy(Enum):
    PUSH = "push"  # Fan-out on write
    PULL = "pull"  # Fan-out on read (celebrities)


@dataclass
class FanoutConfig:
    celebrity_threshold: int = 10_000   # Followers above this = celebrity
    batch_size: int = 1000              # Followers per batch
    max_concurrent_batches: int = 100   # Parallel batch processing
    timeline_max_size: int = 800        # Max tweets in timeline cache


@dataclass
class Tweet:
    tweet_id: str
    author_id: str
    content: str
    timestamp: int
    is_celebrity: bool = False


class FanoutService:
    """
    Hybrid fan-out service using push for regular users,
    pull for celebrities.

    Applies concepts:
    - Week 3: Async message processing
    - Week 4: Cache-aside pattern for timelines
    - Week 1: Partitioning by user_id
    """

    def __init__(
        self,
        config: FanoutConfig,
        social_graph,
        timeline_cache,
        tweet_storage,
        message_queue
    ):
        self.config = config
        self.social_graph = social_graph
        self.timeline_cache = timeline_cache
        self.tweet_storage = tweet_storage
        self.queue = message_queue

    async def process_tweet(self, tweet: Tweet) -> None:
        """
        Process a new tweet for fan-out.

        1. Store the tweet
        2. Determine fan-out strategy
        3. If push: queue fan-out jobs
        4. If pull: mark author as celebrity
        """
        # Store tweet first (source of truth)
        await self.tweet_storage.store(tweet)

        # Get follower count to determine strategy
        follower_count = await self.social_graph.get_follower_count(
            tweet.author_id
        )
        strategy = self._determine_strategy(follower_count)

        if strategy == FanoutStrategy.PUSH:
            # Regular user: fan out to all followers
            await self._push_fanout(tweet)
        else:
            # Celebrity: don't fan out, mark for pull
            await self._mark_celebrity_tweet(tweet)

    def _determine_strategy(self, follower_count: int) -> FanoutStrategy:
        """Determine push vs pull based on follower count."""
        if follower_count > self.config.celebrity_threshold:
            return FanoutStrategy.PULL
        return FanoutStrategy.PUSH

    async def _push_fanout(self, tweet: Tweet) -> None:
        """
        Fan out tweet to all follower timelines.
        Uses batching and parallelism for efficiency.
        """
        # Get all followers
        followers = await self.social_graph.get_followers(
            tweet.author_id
        )

        # Process in batches for efficiency
        batches = self._create_batches(
            followers,
            self.config.batch_size
        )

        # Queue batch jobs (processed by workers)
        for batch in batches:
            await self.queue.enqueue({
                "type": "fanout_batch",
                "tweet_id": tweet.tweet_id,
                "author_id": tweet.author_id,
                "timestamp": tweet.timestamp,
                "follower_ids": batch
            })

    async def process_fanout_batch(
        self,
        tweet_id: str,
        author_id: str,
        timestamp: int,
        follower_ids: List[str]
    ) -> None:
        """
        Process a batch of follower timeline updates.
        Called by queue workers.
        """
        # Create timeline entry (minimal data)
        entry = {
            "tweet_id": tweet_id,
            "author_id": author_id,
            "timestamp": timestamp
        }

        # Update each follower's timeline cache
        tasks = [
            self._update_timeline(follower_id, entry)
            for follower_id in follower_ids
        ]
        await asyncio.gather(*tasks, return_exceptions=True)

    async def _update_timeline(
        self,
        user_id: str,
        entry: dict
    ) -> None:
        """
        Add tweet to user's cached timeline.
        Uses Redis list with LPUSH + LTRIM pattern.
        """
        timeline_key = f"timeline:{user_id}"

        # Push to front of list
        await self.timeline_cache.lpush(timeline_key, entry)

        # Trim to max size (keep only the most recent 800)
        await self.timeline_cache.ltrim(
            timeline_key,
            0,
            self.config.timeline_max_size - 1
        )

    async def _mark_celebrity_tweet(self, tweet: Tweet) -> None:
        """
        Mark tweet for celebrity pull-based delivery.
        Store in celebrity tweets index for merge at read time.
        """
        await self.timeline_cache.zadd(
            f"celebrity_tweets:{tweet.author_id}",
            {tweet.tweet_id: tweet.timestamp}
        )

        # Trim old celebrity tweets (keep the last 100)
        await self.timeline_cache.zremrangebyrank(
            f"celebrity_tweets:{tweet.author_id}",
            0,
            -101
        )

    def _create_batches(
        self,
        items: List[str],
        batch_size: int
    ) -> List[List[str]]:
        """Split a list into fixed-size batches."""
        return [
            items[i:i + batch_size]
            for i in range(0, len(items), batch_size)
        ]
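The LPUSH + LTRIM pair in `_update_timeline` is what keeps every cached timeline bounded at 800 entries. A pure-Python model of that pattern, useful for seeing the eviction behavior without a Redis server (a real deployment would issue the two Redis commands, ideally pipelined):

```python
from collections import deque

class BoundedTimeline:
    """Models LPUSH (insert at head) followed by LTRIM 0 max_size-1."""

    def __init__(self, max_size: int = 800):
        # deque with maxlen evicts from the opposite end on overflow,
        # mirroring LTRIM's truncation of the oldest entries
        self.entries = deque(maxlen=max_size)

    def push(self, tweet_id: str) -> None:
        self.entries.appendleft(tweet_id)   # newest entry at the front

    def recent(self, n: int) -> list:
        return list(self.entries)[:n]       # like LRANGE key 0 n-1

tl = BoundedTimeline(max_size=3)
for tid in ["t1", "t2", "t3", "t4"]:
    tl.push(tid)
print(tl.recent(3))  # ['t4', 't3', 't2'] -- t1 was trimmed away
```

Bounding the list is what makes the 16 KB-per-user cache estimate hold: a timeline can never grow past 800 entries no matter how active the followed accounts are.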
Deep Dive 2: Timeline Service with Celebrity Merge (Week 4 Concepts)
Interviewer: "How does the timeline read work with the celebrity merge?"
The Solution
============================================================
           TIMELINE READ WITH CELEBRITY MERGE

  User requests home timeline
        |
        v
  STEP 1: Fetch pre-computed timeline from cache
          (contains tweets from the regular users they follow)
          Redis:  LRANGE timeline:user123 0 800
          Result: [tweet_501, tweet_499, tweet_495, ...]
        |
        v
  STEP 2: Get list of celebrities the user follows
          Graph query: followed users where follower_count > 10K
          Result: [elonmusk, taylorswift, cristiano]
        |
        v
  STEP 3: Fetch recent tweets from each celebrity
          Redis:  ZREVRANGE celebrity_tweets:elonmusk 0 20
          Redis:  ZREVRANGE celebrity_tweets:taylorswift 0 20
          Result: [tweet_503, tweet_500, tweet_498, ...]
        |
        v
  STEP 4: Merge and sort by timestamp
          Merge:  [cached_timeline] + [celebrity_tweets]
          Sort:   by timestamp descending
          Result: [tweet_503, tweet_501, tweet_500, tweet_499, ...]
        |
        v
  STEP 5: Hydrate tweets (fetch full content)
          MGET tweet:503 tweet:501 tweet:500 tweet:499 ...
          Return full tweet objects to client
============================================================
Implementation
# timeline/timeline_service.py
"""
Timeline Service with Celebrity Merge

Constructs the user's home timeline by merging the pre-computed
timeline with celebrity tweets fetched at read time.
"""
from dataclasses import dataclass
from typing import List, Optional
import asyncio
import heapq


@dataclass
class TimelineEntry:
    tweet_id: str
    author_id: str
    timestamp: int


@dataclass
class FullTweet:
    tweet_id: str
    author_id: str
    author_name: str
    content: str
    timestamp: int
    likes: int
    retweets: int
    replies: int


class TimelineService:
    """
    Timeline service with hybrid push/pull architecture.

    Applies concepts:
    - Week 4: Cache-aside for timeline reads
    - Week 1: Read optimization through pre-computation
    - Week 3: Async parallel fetching
    """

    CELEBRITY_THRESHOLD = 10_000
    DEFAULT_TIMELINE_SIZE = 50
    MAX_CELEBRITIES_TO_FETCH = 100

    def __init__(
        self,
        timeline_cache,
        tweet_storage,
        social_graph,
        user_service
    ):
        self.timeline_cache = timeline_cache
        self.tweet_storage = tweet_storage
        self.social_graph = social_graph
        self.user_service = user_service

    async def get_home_timeline(
        self,
        user_id: str,
        count: int = 50,
        max_id: Optional[str] = None
    ) -> List[FullTweet]:
        """
        Get user's home timeline with celebrity merge.

        Merges pre-computed timeline (push) with
        celebrity tweets (pull) at read time.
        """
        # Step 1: Get pre-computed timeline from cache
        cached_timeline = await self._get_cached_timeline(
            user_id, count * 2  # Fetch extra for merge buffer
        )

        # Step 2: Get celebrities this user follows
        celebrities = await self._get_followed_celebrities(user_id)

        # Step 3: Fetch recent tweets from celebrities
        celebrity_tweets = await self._fetch_celebrity_tweets(
            celebrities
        )

        # Step 4: Merge and sort
        merged = self._merge_timelines(
            cached_timeline,
            celebrity_tweets
        )

        # Step 5: Apply pagination (assumes time-sortable IDs,
        # e.g. snowflake IDs, so lexicographic comparison works)
        if max_id:
            merged = [t for t in merged if t.tweet_id < max_id]

        # Step 6: Hydrate tweets (get full content)
        tweet_ids = [t.tweet_id for t in merged[:count]]
        full_tweets = await self._hydrate_tweets(tweet_ids)

        return full_tweets

    async def _get_cached_timeline(
        self,
        user_id: str,
        count: int
    ) -> List[TimelineEntry]:
        """Fetch pre-computed timeline from Redis."""
        timeline_key = f"timeline:{user_id}"

        # Check if timeline exists
        exists = await self.timeline_cache.exists(timeline_key)
        if not exists:
            # Rebuild timeline for inactive user
            await self._rebuild_timeline(user_id)

        # Fetch timeline entries
        entries = await self.timeline_cache.lrange(
            timeline_key, 0, count - 1
        )
        return [
            TimelineEntry(**entry)
            for entry in entries
        ]

    async def _get_followed_celebrities(
        self,
        user_id: str
    ) -> List[str]:
        """Get list of celebrities the user follows."""
        # Get all users this person follows
        following = await self.social_graph.get_following(user_id)

        # Filter to celebrities (high follower count)
        celebrities = []
        for followed_id in following:
            follower_count = await self.social_graph.get_follower_count(
                followed_id
            )
            if follower_count > self.CELEBRITY_THRESHOLD:
                celebrities.append(followed_id)
                if len(celebrities) >= self.MAX_CELEBRITIES_TO_FETCH:
                    break
        return celebrities

    async def _fetch_celebrity_tweets(
        self,
        celebrity_ids: List[str]
    ) -> List[TimelineEntry]:
        """Fetch recent tweets from celebrities in parallel."""
        if not celebrity_ids:
            return []

        # Parallel fetch from all celebrities
        tasks = [
            self._get_celebrity_recent_tweets(celeb_id)
            for celeb_id in celebrity_ids
        ]
        results = await asyncio.gather(*tasks)

        # Flatten results
        all_tweets = []
        for tweets in results:
            all_tweets.extend(tweets)
        return all_tweets

    async def _get_celebrity_recent_tweets(
        self,
        celebrity_id: str,
        count: int = 20
    ) -> List[TimelineEntry]:
        """Get recent tweets from a celebrity."""
        key = f"celebrity_tweets:{celebrity_id}"

        # Sorted set: score = timestamp
        tweet_data = await self.timeline_cache.zrevrange(
            key, 0, count - 1, withscores=True
        )
        return [
            TimelineEntry(
                tweet_id=tweet_id,
                author_id=celebrity_id,
                timestamp=int(score)
            )
            for tweet_id, score in tweet_data
        ]

    def _merge_timelines(
        self,
        cached: List[TimelineEntry],
        celebrity: List[TimelineEntry]
    ) -> List[TimelineEntry]:
        """
        Merge two timelines into one list ordered by timestamp,
        using a heap to pop entries newest-first.
        """
        # Convert to tuples for the heap: (-timestamp, index, entry).
        # Negative because heapq is a min-heap; the index breaks ties.
        all_entries = [
            (-e.timestamp, i, e)
            for i, e in enumerate(cached + celebrity)
        ]
        heapq.heapify(all_entries)

        merged = []
        seen_ids = set()
        while all_entries and len(merged) < 200:
            _, _, entry = heapq.heappop(all_entries)
            # Deduplicate (same tweet might appear twice)
            if entry.tweet_id not in seen_ids:
                seen_ids.add(entry.tweet_id)
                merged.append(entry)
        return merged

    async def _hydrate_tweets(
        self,
        tweet_ids: List[str]
    ) -> List[FullTweet]:
        """Fetch full tweet content from storage."""
        if not tweet_ids:
            return []

        # Batch fetch from tweet storage
        tweets = await self.tweet_storage.mget(tweet_ids)

        # Get author info
        author_ids = list(set(t.author_id for t in tweets if t))
        authors = await self.user_service.mget(author_ids)
        author_map = {a.user_id: a for a in authors}

        return [
            FullTweet(
                tweet_id=t.tweet_id,
                author_id=t.author_id,
                author_name=getattr(
                    author_map.get(t.author_id), "name", "Unknown"
                ),
                content=t.content,
                timestamp=t.timestamp,
                likes=t.likes,
                retweets=t.retweets,
                replies=t.replies
            )
            for t in tweets if t
        ]

    async def _rebuild_timeline(self, user_id: str) -> None:
        """
        Rebuild timeline for inactive user.

        Called when a user returns after 30+ days of inactivity
        and their timeline has been evicted from cache.
        """
        # Get users they follow
        following = await self.social_graph.get_following(user_id)

        # Fetch recent tweets from each (excluding celebrities)
        all_tweets = []
        for followed_id in following:
            follower_count = await self.social_graph.get_follower_count(
                followed_id
            )
            if follower_count <= self.CELEBRITY_THRESHOLD:
                tweets = await self.tweet_storage.get_user_tweets(
                    followed_id, limit=50
                )
                all_tweets.extend(tweets)

        # Sort by timestamp, newest first
        all_tweets.sort(key=lambda t: t.timestamp, reverse=True)

        # Store in cache
        timeline_key = f"timeline:{user_id}"
        for tweet in all_tweets[:800]:
            await self.timeline_cache.rpush(timeline_key, {
                "tweet_id": tweet.tweet_id,
                "author_id": tweet.author_id,
                "timestamp": tweet.timestamp
            })

        # Set TTL (30 days)
        await self.timeline_cache.expire(timeline_key, 30 * 86400)
Deep Dive 3: Social Graph Service (Week 1 Concepts)
Interviewer: "How do you store and query the social graph efficiently?"
The Problem
SOCIAL GRAPH CHALLENGES
Scale:
- 600M users
- Average 700 followers per user
- Total edges: 420 billion relationships
Queries needed:
- Get all followers of user X (for fan-out)
- Get all users that X follows (for timeline)
- Check if A follows B
- Get follower count
Naive approach (SQL):
- followers table: (follower_id, followee_id)
- "Get followers of @elonmusk": Returns 200M rows!
- Not practical
Implementation
# graph/social_graph_service.py
"""
Social Graph Service

Stores and queries follow relationships using a
hybrid approach: Redis for hot data, distributed
storage for the full graph.
"""
from dataclasses import dataclass
from typing import List, Optional
import time


@dataclass
class FollowRelation:
    follower_id: str
    followee_id: str
    timestamp: int


class SocialGraphService:
    """
    Social graph with optimized follower/following queries.

    Uses Redis sets for fast membership checks and counts,
    with Manhattan/Cassandra for full graph storage.

    Applies concepts:
    - Week 1: Partitioning by user_id
    - Week 4: Caching hot data (follower counts)
    - Week 5: Eventual consistency for graph updates
    """

    def __init__(self, redis_client, graph_storage):
        self.redis = redis_client
        self.storage = graph_storage

    async def follow(
        self,
        follower_id: str,
        followee_id: str
    ) -> bool:
        """
        Create a follow relationship.
        Updates both directions for bidirectional queries.
        """
        # Check if already following
        if await self.is_following(follower_id, followee_id):
            return False

        timestamp = int(time.time())

        # Store in persistent storage
        await self.storage.put(
            FollowRelation(follower_id, followee_id, timestamp)
        )

        # Update Redis caches
        # 1. Add to follower's "following" set
        await self.redis.sadd(
            f"following:{follower_id}",
            followee_id
        )
        # 2. Add to followee's "followers" set
        await self.redis.sadd(
            f"followers:{followee_id}",
            follower_id
        )
        # 3. Increment counts
        await self.redis.incr(f"follower_count:{followee_id}")
        await self.redis.incr(f"following_count:{follower_id}")
        return True

    async def unfollow(
        self,
        follower_id: str,
        followee_id: str
    ) -> bool:
        """Remove a follow relationship."""
        if not await self.is_following(follower_id, followee_id):
            return False

        # Remove from persistent storage
        await self.storage.delete(follower_id, followee_id)

        # Update Redis caches
        await self.redis.srem(f"following:{follower_id}", followee_id)
        await self.redis.srem(f"followers:{followee_id}", follower_id)
        await self.redis.decr(f"follower_count:{followee_id}")
        await self.redis.decr(f"following_count:{follower_id}")
        return True

    async def is_following(
        self,
        follower_id: str,
        followee_id: str
    ) -> bool:
        """Check if follower follows followee. O(1) operation."""
        return await self.redis.sismember(
            f"following:{follower_id}",
            followee_id
        )

    async def get_followers(
        self,
        user_id: str,
        cursor: Optional[str] = None,
        count: int = 1000
    ) -> List[str]:
        """
        Get followers of a user.

        For users with many followers, uses cursor-based
        pagination to avoid loading millions of IDs at once.
        """
        # For small follower sets, get all from Redis
        follower_count = await self.get_follower_count(user_id)
        if follower_count < 10_000:
            return list(await self.redis.smembers(
                f"followers:{user_id}"
            ))

        # For large sets, use cursor-based scan
        return await self._scan_followers(user_id, cursor, count)

    async def get_following(
        self,
        user_id: str
    ) -> List[str]:
        """Get all users that user_id follows."""
        return list(await self.redis.smembers(
            f"following:{user_id}"
        ))

    async def get_follower_count(self, user_id: str) -> int:
        """Get follower count. O(1) operation."""
        count = await self.redis.get(f"follower_count:{user_id}")
        return int(count) if count else 0

    async def get_following_count(self, user_id: str) -> int:
        """Get following count. O(1) operation."""
        count = await self.redis.get(f"following_count:{user_id}")
        return int(count) if count else 0

    async def _scan_followers(
        self,
        user_id: str,
        cursor: Optional[str],
        count: int
    ) -> List[str]:
        """
        Cursor-based scan for large follower sets.

        Returns one page of follower IDs; a production API would
        also surface new_cursor so the caller can fetch the next page.
        """
        start_cursor = int(cursor) if cursor else 0
        new_cursor, followers = await self.redis.sscan(
            f"followers:{user_id}",
            cursor=start_cursor,
            count=count
        )
        return list(followers)
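The class docstring above mentions partitioning by user_id (Week 1). A sketch of what that routing could look like: keying each user's edge lists by the user whose followers they are means "get followers of X" touches a single partition. The fixed shard count and hash choice here are assumptions for illustration; production systems typically use consistent hashing so shards can be added without remapping everything:

```python
import hashlib

NUM_SHARDS = 1024  # assumed fixed shard count for illustration

def shard_for(user_id: str) -> int:
    """Deterministically map a user's edge lists to one shard."""
    digest = hashlib.sha1(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# All followers of "user123" live on one shard, so the fan-out
# service's get_followers() call is a single-partition query.
shard = shard_for("user123")
print(0 <= shard < NUM_SHARDS)        # always in range
print(shard == shard_for("user123"))  # deterministic: True
```

The trade-off is hot shards: a shard holding a 100M-follower account sees far more fan-out traffic than its peers, which is another reason celebrities are handled on the read path instead.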
Deep Dive 4: Trending Topics (Week 8 Concepts)
Interviewer: "How do you compute trending topics in real-time?"
Implementation
# trending/trending_service.py
"""
Trending Topics Service
Computes trending hashtags and topics using
sliding window counts with decay.
"""
from dataclasses import dataclass
from typing import List, Dict, Tuple
from collections import defaultdict
import time
import math
@dataclass
class TrendingTopic:
topic: str
tweet_count: int
velocity: float # Rate of growth
score: float
class TrendingService:
"""
Real-time trending topics computation.
Uses time-bucketed counts with exponential decay
    to identify rapidly growing topics.

    Applies concepts:
    - Week 8: Stream processing
    - Week 4: Time-windowed aggregation
    - Week 3: Async event processing
    """

    BUCKET_SIZE_SECONDS = 60      # 1-minute buckets
    WINDOW_SIZE_BUCKETS = 60      # Look at the last 60 minutes
    DECAY_FACTOR = 0.95           # Recent tweets count more
    MIN_TWEETS_THRESHOLD = 100    # Minimum volume to be trending

    def __init__(self, redis_client, location_service):
        self.redis = redis_client
        self.location_service = location_service

    async def record_hashtag(
        self,
        hashtag: str,
        tweet_id: str,
        location: Optional[str] = None
    ) -> None:
        """
        Record a hashtag occurrence from a tweet.

        Updates both global and location-specific counts.
        """
        current_bucket = self._get_current_bucket()

        # Increment global count for this bucket
        await self.redis.hincrby(
            f"trending:global:{current_bucket}",
            hashtag.lower(),
            1
        )

        # Increment location-specific count
        if location:
            await self.redis.hincrby(
                f"trending:{location}:{current_bucket}",
                hashtag.lower(),
                1
            )
            # Expire the location bucket too, or stale buckets leak memory
            await self.redis.expire(
                f"trending:{location}:{current_bucket}",
                3600 * 2  # 2 hours
            )

        # Set expiry on the global bucket (auto-cleanup)
        await self.redis.expire(
            f"trending:global:{current_bucket}",
            3600 * 2  # 2 hours
        )

    async def get_trending(
        self,
        location: Optional[str] = None,
        count: int = 10
    ) -> List[TrendingTopic]:
        """
        Get current trending topics.

        Combines raw counts with velocity (growth rate)
        to surface rapidly emerging topics.
        """
        # Get counts from recent buckets
        topic_counts = await self._aggregate_buckets(location)

        # Calculate velocity and score
        scored_topics = []
        for topic, counts in topic_counts.items():
            if sum(counts) < self.MIN_TWEETS_THRESHOLD:
                continue

            total_count = self._weighted_count(counts)
            velocity = self._calculate_velocity(counts)
            score = self._calculate_score(total_count, velocity)

            scored_topics.append(TrendingTopic(
                topic=topic,
                tweet_count=sum(counts),
                velocity=velocity,
                score=score
            ))

        # Sort by score and return the top N
        scored_topics.sort(key=lambda t: t.score, reverse=True)
        return scored_topics[:count]

    async def _aggregate_buckets(
        self,
        location: Optional[str]
    ) -> Dict[str, List[int]]:
        """
        Aggregate counts from recent time buckets.

        Returns counts per topic indexed newest-first: counts[i] is
        the count from i buckets (minutes) ago, zero-padded for
        buckets where the topic did not appear.
        """
        current_bucket = self._get_current_bucket()
        prefix = f"trending:{location or 'global'}"

        # Collect counts from each bucket
        topic_counts: Dict[str, List[int]] = defaultdict(list)
        for i in range(self.WINDOW_SIZE_BUCKETS):
            bucket = current_bucket - i
            bucket_data = await self.redis.hgetall(f"{prefix}:{bucket}")
            for topic, count in bucket_data.items():
                # Zero-pad earlier buckets where this topic was absent,
                # so this bucket's count lands at index i
                while len(topic_counts[topic]) < i:
                    topic_counts[topic].append(0)
                topic_counts[topic].append(int(count))
            # Zero-pad topics absent from this bucket
            for counts in topic_counts.values():
                if len(counts) <= i:
                    counts.append(0)
        return dict(topic_counts)

    def _weighted_count(self, counts: List[int]) -> float:
        """
        Calculate weighted count with exponential decay.

        More recent buckets count more heavily.
        """
        total = 0.0
        for i, count in enumerate(counts):
            weight = self.DECAY_FACTOR ** i
            total += count * weight
        return total

    def _calculate_velocity(self, counts: List[int]) -> float:
        """
        Calculate velocity (rate of change).

        Compares the most recent period to the one before it;
        counts are newest-first, matching _aggregate_buckets.
        """
        if len(counts) < 10:
            return 0.0
        recent = sum(counts[:5])          # Last 5 minutes
        earlier = sum(counts[5:10]) or 1  # 5-10 minutes ago
        return (recent - earlier) / earlier

    def _calculate_score(
        self,
        weighted_count: float,
        velocity: float
    ) -> float:
        """
        Calculate trending score.

        Combines volume with velocity to surface both popular
        and rapidly growing topics.
        """
        # Log scale for volume (diminishing returns)
        volume_score = math.log10(weighted_count + 1)
        # Velocity boost for growing topics
        velocity_boost = max(0, velocity) * 2
        return volume_score * (1 + velocity_boost)

    def _get_current_bucket(self) -> int:
        """Get the current time bucket."""
        return int(time.time()) // self.BUCKET_SIZE_SECONDS
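To make the decay and velocity math concrete, here is a small standalone sketch of the same formulas with a worked example (the counts are hypothetical; bucket index 0 is the current minute):

```python
import math

DECAY_FACTOR = 0.95

def weighted_count(counts):
    # Bucket i minutes ago is weighted by 0.95 ** i
    return sum(c * DECAY_FACTOR ** i for i, c in enumerate(counts))

def velocity(counts):
    # Growth of the last 5 minutes vs. the 5 minutes before that
    if len(counts) < 10:
        return 0.0
    recent = sum(counts[:5])
    earlier = sum(counts[5:10]) or 1
    return (recent - earlier) / earlier

def score(wc, v):
    # Log-scale volume, boosted by positive velocity
    return math.log10(wc + 1) * (1 + max(0, v) * 2)

# A hashtag doubling its volume: 200 tweets in the last 5 minutes
# vs. 100 in the 5 minutes before
counts = [40] * 5 + [20] * 5
v = velocity(counts)   # (200 - 100) / 100 = 1.0
s = score(weighted_count(counts), v)
```

With velocity 1.0 the score triples relative to a flat topic with the same weighted volume, which is exactly the "rapidly growing" boost the scoring aims for.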
Phase 5: Scaling and Edge Cases
Scaling Strategies
┌──────────────────────────────────────────────────────────────────────────
│                          SCALING STRATEGIES
│
│  TIMELINE CACHE SHARDING
│  ───────────────────────
│  • Partition by user_id hash
│  • Each shard handles ~10M users
│  • 3x replication for fault tolerance
│  • Total: ~60 Redis shards
│
│  FAN-OUT WORKERS
│  ───────────────
│  • Queue-based processing (Kafka)
│  • Horizontal scaling based on queue depth
│  • Priority lanes for time-sensitive tweets
│  • Backpressure handling during peaks
│
│  TWEET STORAGE
│  ─────────────
│  • Manhattan (Twitter's distributed KV store)
│  • Partitioned by tweet_id
│  • Separate hot (recent) and cold (archive) tiers
│
│  MULTI-DATACENTER
│  ────────────────
│  • Active-active in multiple regions
│  • Route users to nearest datacenter
│  • Async replication of tweets and timelines
│
└──────────────────────────────────────────────────────────────────────────
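The shard routing in the first strategy can be sketched as a hash of user_id, with the next shards in sequence as replicas. The shard and replica counts come from the estimate above; the placement scheme itself is an assumption (production systems more often use consistent hashing so resharding moves fewer keys):

```python
import hashlib

NUM_SHARDS = 60          # ~10M users per shard at 600M users
REPLICATION_FACTOR = 3   # each timeline is stored 3x

def timeline_shards(user_id: str) -> list:
    """Return the primary shard plus replica shards for a user's timeline."""
    # md5 gives a stable, well-distributed hash across processes
    # (Python's builtin hash() is salted per process, so it won't do)
    h = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
    primary = h % NUM_SHARDS
    return [(primary + i) % NUM_SHARDS for i in range(REPLICATION_FACTOR)]
```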
Edge Cases
EDGE CASE 1: Celebrity Mentions Another Celebrity
Scenario: @elonmusk replies to @taylorswift
Problem: Both have 100M+ followers
Solution:
- Store the tweet in both celebrities' tweet indexes
- Users following both see it once (deduplication at read)
- Reply appears in both users' mention timelines
EDGE CASE 2: Reply Arrives Before Original Tweet
Scenario: The original tweet's fan-out is slow, and the reply's fan-out completes first
Problem: User sees reply to tweet they haven't received yet
Solution:
- Store parent_tweet_id with replies
- At read time, fetch parent if not in timeline
- Display as thread with "Show original tweet"
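A minimal sketch of the read-time fix, assuming a tweet store with an async `get` and tweets stored as dicts with an optional `parent_tweet_id` (the names and shapes are hypothetical):

```python
import asyncio

class InMemoryTweetStore:
    """Stand-in for the real tweet store (hypothetical)."""
    def __init__(self, tweets):
        self._tweets = tweets
    async def get(self, tweet_id):
        return self._tweets[tweet_id]

async def hydrate_timeline(tweet_ids, tweet_store):
    """Fetch each tweet for the page; if a reply's parent is missing
    from the page (fan-out raced the reply), fetch the parent too so
    the client can render a "Show original tweet" thread."""
    tweets = {tid: await tweet_store.get(tid) for tid in tweet_ids}
    for tweet in list(tweets.values()):
        parent_id = tweet.get("parent_tweet_id")
        if parent_id and parent_id not in tweets:
            tweets[parent_id] = await tweet_store.get(parent_id)
    return tweets

store = InMemoryTweetStore({
    "t1": {"id": "t1", "text": "original"},
    "t2": {"id": "t2", "text": "reply", "parent_tweet_id": "t1"},
})
# The timeline only delivered the reply, but the page includes both
page = asyncio.run(hydrate_timeline(["t2"], store))
```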
EDGE CASE 3: Inactive User Returns After 30 Days
Scenario: User's timeline evicted from cache
Problem: No pre-computed timeline exists
Solution:
- Detect cache miss on timeline request
- Rebuild timeline from followed users' tweets
- Cache rebuilt timeline
- Slightly slower first load (acceptable)
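The rebuild step is essentially a k-way merge of each followed user's recent tweets, capped at the usual 800-entry timeline. A sketch under the assumption that each per-user list arrives newest-first (in production these lists would come from the tweet store, not memory):

```python
import heapq
import itertools

TIMELINE_LIMIT = 800  # same cap as pre-computed timelines

def rebuild_timeline(per_user_tweets):
    """Merge newest-first (timestamp, tweet_id) lists, one per followed
    user, into a single newest-first timeline of at most 800 entries."""
    # Negating the timestamp key lets heapq.merge produce descending order
    merged = heapq.merge(*per_user_tweets, key=lambda t: -t[0])
    return list(itertools.islice(merged, TIMELINE_LIMIT))
```

The merge is lazy, so even a user following thousands of accounts only pays for the first 800 entries.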
EDGE CASE 4: Viral Tweet (Massive Engagement)
Scenario: Tweet gets millions of likes/retweets rapidly
Problem: Hot key for like/retweet counters
Solution:
- Counter sharding (like_count_1, like_count_2, ...)
- Aggregate on read or periodically
- Eventually consistent counts (acceptable for social)
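Counter sharding can be sketched in-memory like this; in production each sub-counter would be its own Redis key (e.g. like_count:0 through like_count:15), and the shard count here is an assumption to tune per hot key:

```python
import random

NUM_COUNTER_SHARDS = 16  # assumption; tune for how hot the key is

class ShardedCounter:
    """Spread increments for one hot key across N sub-counters,
    then sum them on read. Reads are eventually consistent if the
    sum is cached and refreshed periodically."""
    def __init__(self, shards=NUM_COUNTER_SHARDS):
        self.shards = [0] * shards

    def incr(self, amount=1):
        # A random shard choice spreads write contention evenly
        self.shards[random.randrange(len(self.shards))] += amount

    def read(self):
        return sum(self.shards)
```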
EDGE CASE 5: Major Event (Super Bowl, Elections)
Scenario: 10x normal tweet volume
Problem: Fan-out queues back up
Solution:
- Pre-scale based on event calendar
- Prioritize timeline reads over real-time delivery
- Acceptable to have 10-30 second delays
- Graceful degradation: show slightly stale timeline
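The priority lanes mentioned under the scaling strategies can be sketched as a two-lane queue where time-sensitive fan-out jobs always drain first. This is a minimal in-process sketch; the real system runs these lanes as separate Kafka topics:

```python
import heapq
import itertools

class PriorityFanoutQueue:
    """Time-sensitive jobs (lane 0) always pop before bulk jobs (lane 1);
    within a lane, a monotonic sequence number keeps jobs FIFO."""
    HIGH, BULK = 0, 1

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def push(self, job, lane=BULK):
        heapq.heappush(self._heap, (lane, next(self._seq), job))

    def pop(self):
        return heapq.heappop(self._heap)[2]
```

During a Super Bowl-scale spike, bulk fan-out work queues up in the slow lane while breaking-news tweets keep meeting the delivery SLO.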
Phase 6: Monitoring and Operations
┌──────────────────────────────────────────────────────────────────────────
│                     TWITTER MONITORING DASHBOARD
│
│  TIMELINE HEALTH
│  ├── Timeline read latency (p50): 45ms    [██████████] healthy
│  ├── Timeline read latency (p99): 180ms   [██████████] good
│  ├── Cache hit rate:              99.2%   [██████████] healthy
│  └── Celebrity merge time:        25ms    [██████████] healthy
│
│  FAN-OUT PIPELINE
│  ├── Fan-out queue depth:         45K     [██████████] normal
│  ├── Fan-out latency (p50):       1.2s    [██████████] good
│  ├── Fan-out latency (p99):       4.8s    [██████████] watch
│  └── Failed fan-outs:             0.01%   [██████████] healthy
│
│  TWEET INGESTION
│  ├── Tweets/second:               6,200   [██████████] normal
│  ├── Tweet storage latency:       15ms    [██████████] healthy
│  └── Search indexing lag:         2.1s    [██████████] good
│
│  INFRASTRUCTURE
│  ├── Redis cluster CPU:           45%     [██████████] healthy
│  ├── Redis memory usage:          72%     [██████████] normal
│  ├── Kafka consumer lag:          12K msgs [██████████] healthy
│  └── API error rate:              0.02%   [██████████] healthy
│
└──────────────────────────────────────────────────────────────────────────
Alerting Strategy
CRITICAL (PagerDuty):
- Timeline read latency p99 > 500ms
- Fan-out queue depth > 1M messages
- Cache hit rate < 95%
- Tweet storage failures > 0.1%
WARNING (Slack):
- Fan-out latency p99 > 10 seconds
- Redis memory > 85%
- Celebrity merge taking > 100ms
- Search indexing lag > 30 seconds
INFO (Dashboard):
- Tweets per second trends
- Top trending topics
- Celebrity tweet frequency
- User engagement metrics
Interview Conclusion
Interviewer: "Excellent work. You've covered the fan-out problem really well. Any questions?"
You: "Thank you! I'm curiousβhow does Twitter handle the case where a user unfollows someone? Do you remove old tweets from their timeline cache?"
Interviewer: "Great question. We don't actively remove tweets on unfollowβthat would be expensive. Instead, when serving the timeline, we filter out tweets from users that are no longer followed. The old tweets naturally age out as new ones push them past the 800-tweet limit."
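That read-time filtering can be sketched in a few lines (the entry shape and id fields are assumptions for illustration):

```python
def filter_unfollowed(timeline_entries, followed_ids):
    """Drop cached timeline entries whose author is no longer followed.
    Stale entries are never deleted eagerly; they simply age out past
    the 800-tweet cap as new tweets push them off."""
    followed = set(followed_ids)  # O(1) membership checks
    return [e for e in timeline_entries if e["author_id"] in followed]
```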
Summary: Concepts Applied
┌──────────────────────────────────────────────────────────────────────────
│             CONCEPTS FROM 10-WEEK COURSE IN TWITTER DESIGN
│
│  WEEK 1: DATA AT SCALE
│  ├── Partitioning: User-based sharding for timelines
│  ├── Read optimization: Pre-computed timelines
│  └── Hot keys: Celebrity handling with pull-based reads
│
│  WEEK 2: FAILURE-FIRST DESIGN
│  ├── Graceful degradation: Stale timelines during peaks
│  ├── Retries: Fan-out job retry with idempotency
│  └── Timeouts: Fast-fail on slow fan-out batches
│
│  WEEK 3: MESSAGING & ASYNC
│  ├── Kafka: Fan-out job queue
│  ├── Backpressure: Queue depth monitoring and scaling
│  └── Event-driven: Tweet → Fan-out → Timeline update
│
│  WEEK 4: CACHING
│  ├── Redis lists: Timeline storage with LPUSH/LTRIM
│  ├── Cache-aside: Rebuild timeline on miss
│  └── Multi-tier: Hot timelines in memory, cold on disk
│
│  WEEK 5: CONSISTENCY
│  ├── Eventual consistency: Timelines (acceptable lag)
│  ├── Strong consistency: Tweet storage (source of truth)
│  └── Read-your-own-writes: Author sees their tweet immediately
│
│  WEEK 8: ANALYTICS PIPELINE
│  ├── Trending topics: Real-time stream processing
│  ├── Time-bucketed counts: Sliding window aggregation
│  └── Velocity scoring: Detect rapidly growing topics
│
│  WEEK 10: PRODUCTION READINESS
│  ├── SLOs: < 5s fan-out, < 200ms reads
│  ├── Observability: Queue depth, cache hit rate, latencies
│  └── Graceful degradation: Prioritize reads during overload
│
└──────────────────────────────────────────────────────────────────────────
┌──────────────────────────────────────────────────────────────────────────
│                   WHY TWITTER IS AN ENGINEERING MARVEL
│
│  THE FAN-OUT PROBLEM
│  ───────────────────
│  • A single tweet can trigger 100M+ write operations
│  • Hybrid push/pull elegantly solves the celebrity problem
│  • Pre-computed timelines enable 300K QPS reads
│
│  REAL-TIME AT SCALE
│  ──────────────────
│  • 500M tweets/day delivered in seconds
│  • Breaking news spreads globally in real time
│  • Trending topics computed on the live stream
│
│  CACHE AS SOURCE OF TRUTH
│  ────────────────────────
│  • The timeline cache IS the primary data store
│  • The database is for persistence, not serving
│  • Inverts the traditional cache/DB relationship
│
│  ASYMMETRIC SOCIAL GRAPH
│  ───────────────────────
│  • Followers ≠ Following (unlike Facebook)
│  • Creates unique scaling challenges
│  • Enables powerful interest-based discovery
│
└──────────────────────────────────────────────────────────────────────────
Sources
Twitter Engineering Blog:
- Twitter Engineering: https://blog.twitter.com/engineering
- Timelines at Scale: https://www.infoq.com/presentations/Twitter-Timeline-Scalability/
- Infrastructure Behind Twitter: https://blog.twitter.com/engineering/en_us/topics/infrastructure/2017/the-infrastructure-behind-twitter-scale
Architecture Deep Dives:
- High Scalability - Twitter Architecture: https://highscalability.com/the-architecture-twitter-uses-to-deal-with-150m-active-users/
- System Design Primer - Twitter: https://github.com/donnemartin/system-design-primer/blob/master/solutions/system_design/twitter/README.md
Statistics:
- Backlinko - Twitter Statistics: https://backlinko.com/twitter-users
- Business of Apps - X Statistics: https://www.businessofapps.com/data/twitter-statistics/
- DemandSage - Twitter Users: https://www.demandsage.com/twitter-statistics/
Open Source:
- Twitter Algorithm (Open Sourced 2023): https://github.com/twitter/the-algorithm
Self-Assessment Checklist
After studying this case study, you should be able to:
- Explain fan-out on write vs fan-out on read trade-offs
- Design a hybrid push/pull timeline system
- Solve the celebrity problem (users with 100M+ followers)
- Implement timeline cache using Redis lists
- Design a social graph service with efficient queries
- Calculate real-time trending topics with velocity scoring
- Handle the reply-before-original edge case
- Scale timeline delivery to 300K+ QPS
- Implement graceful degradation during traffic spikes
- Build observability for fan-out pipeline health
- Articulate when eventual consistency is acceptable
This case study covers Twitter's architecture for delivering 500 million tweets per day to 600 million users worldwide, with a focus on the fan-out problem and timeline construction.