Week 5 Capstone: Multiplayer Game Matchmaking System
A Real-World Problem Covering Everything You've Learned
The Interview Begins
You walk into the interview room at a gaming company. The interviewer, a Staff Engineer, smiles and gestures to the whiteboard.
Interviewer: "Thanks for coming in. Today we're going to work through a system design problem together. I'm interested in your thought process, so please think out loud. Feel free to ask questions β this is meant to be collaborative."
They write on the whiteboard:
────────────────────────────────────────────────────────────────────────

  Design a Multiplayer Game Matchmaking System

  We're building a competitive multiplayer game. Players queue up
  to find matches with others of similar skill. Once matched, they
  connect to a game server and play.

  Key requirements:
  - Match players of similar skill (ELO rating)
  - Support multiple game modes (1v1, 5v5, battle royale)
  - Minimize wait time while maintaining fair matches
  - Handle players going offline during queue/matching
  - Update player ratings after each match
  - Support party queue (friends playing together)

────────────────────────────────────────────────────────────────────────
Interviewer: "Take a few minutes to think about this, then walk me through your approach. We have about 45 minutes."
Phase 1: Requirements Clarification (5 minutes)
Before diving in, you take a breath and start asking questions. This is crucial: never assume.
Your Questions
You: "Before I start designing, I'd like to clarify a few requirements. First, what's the scale we're targeting? How many concurrent players?"
Interviewer: "At peak, we expect 2 million concurrent players in queue or in matches. We're targeting a global audience."
You: "And for match sizes β you mentioned 1v1, 5v5, and battle royale. What's the battle royale size?"
Interviewer: "Battle royale is 100 players. The 5v5 mode is our most popular, accounting for about 70% of matches."
You: "How quickly do players expect to find a match?"
Interviewer: "For 5v5, median wait time should be under 30 seconds. We'd accept up to 2 minutes for 95th percentile. Players get frustrated if they wait longer."
You: "What about skill matching? How strict should we be?"
Interviewer: "We use ELO ratings from 0-3000. Ideally, players in a match should be within 100 ELO of each other. But if wait time exceeds 60 seconds, we can expand the range. Quality versus speed trade-off."
You: "For party queue β can a high-skill player queue with a low-skill friend?"
Interviewer: "Yes, but we need to handle it carefully. The match should be fair for everyone. We typically use the highest-rated player in the party for matchmaking."
You: "What happens to ELO ratings after a match? Is it updated immediately?"
Interviewer: "Yes, ratings update immediately after the match ends. All 10 players in a 5v5 should see their new rating within seconds. This is important for competitive integrity β we can't have players see stale ratings."
You: "Finally, what about regional servers? Do players only match within their region?"
Interviewer: "We have game servers in 5 regions: NA, EU, APAC, SA, OCE. Players prefer low latency, so we match within region when possible. But during off-peak, we might cross-region match rather than have long waits."
You: "Perfect. Let me summarize the requirements."
Functional Requirements
1. QUEUE MANAGEMENT
- Players can join/leave matchmaking queue
- Support solo queue and party queue (2-5 players)
- Multiple game modes: 1v1, 5v5, battle royale (100 players)
- Players can queue for multiple modes simultaneously
2. MATCHMAKING
- Match players within ELO range (±100 preferred, expand over time)
- Balance teams by average ELO
- Respect party groupings (keep friends on same team)
- Consider player region for low latency
3. GAME SESSION
- Allocate game server for matched players
- Handle player confirmation (ready check)
- Reconnect players who disconnect mid-match
- Determine match outcome (win/loss/draw)
4. RATING UPDATES
- Calculate new ELO for all players after match
- Update ratings immediately (strong consistency)
- Handle edge cases (disconnects, abandons)
- Support rating decay for inactive players
Non-Functional Requirements
1. SCALE
- 2M concurrent players
- ~24K matches created per minute (peak)
- 10M rating updates per hour
2. LATENCY
- Queue join: <100ms p99
- Match found notification: <50ms p99
- Rating update visible: <500ms p99
- Median queue time: <30 seconds (5v5)
3. CONSISTENCY
- Ratings: Strong consistency (no stale reads)
- Queue state: At-least-once processing
- Match assignment: Exactly-once (no double-booking)
4. AVAILABILITY
- 99.9% uptime for matchmaking
- Graceful degradation during partial outages
- No lost players in queue during failures
Phase 2: Back of the Envelope Estimation (5 minutes)
You: "Let me work through the numbers to understand the scale."
Traffic Estimation
MATCHMAKING TRAFFIC
Concurrent players: 2,000,000
Players in queue (estimated): ~400,000 (20% at any time)
Players in matches: ~1,600,000
Match composition (5v5):
Players per match: 10
Active matches: 160,000
Match duration (average): 20 minutes
Matches ending per minute: 160,000 / 20 = 8,000 matches/min
New matches starting: ~8,000 matches/min
Queue operations:
Join queue: ~400K players / 30 sec avg wait
= ~13,000 joins/sec
Leave queue (matched): ~13,000/sec
Leave queue (cancel): ~1,000/sec
Peak multiplier: 3x
Peak joins: ~40,000/sec
Storage Estimation
PLAYER DATA
Per player:
player_id: 16 bytes (UUID)
username: 32 bytes
elo_rating: 4 bytes (int)
region: 4 bytes
last_match: 8 bytes (timestamp)
Total: ~100 bytes
Total players (registered): 50M
Player data: 50M × 100B = 5GB
MATCH HISTORY
Per match:
match_id: 16 bytes
game_mode: 4 bytes
player_ids (10): 160 bytes
elo_changes (10): 40 bytes
outcome: 20 bytes
timestamps: 16 bytes
Total: ~300 bytes
Matches per day: 8,000/min × 60 × 24 = 11.5M
Daily match data: 11.5M × 300B = 3.5GB
Yearly (retained): 1.3TB
QUEUE STATE (in-memory)
Per queued player:
player_id: 16 bytes
elo: 4 bytes
queue_time: 8 bytes
game_modes: 8 bytes
party_id: 16 bytes
region: 4 bytes
Total: ~60 bytes
Queue memory (400K players): 400K × 60B = 24MB
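A quick way to sanity-check these figures is to run the arithmetic directly. The constants below are just the assumptions already stated above (2M concurrent, 20% queued, 10-player matches, 20-minute games, ~300 B per match record), not new data.

# Back-of-the-envelope check for the figures above (assumptions, not measurements).
CONCURRENT = 2_000_000
IN_QUEUE = int(CONCURRENT * 0.20)          # ~400K players queued at any time
PLAYERS_PER_MATCH = 10
MATCH_MINUTES = 20

active_matches = (CONCURRENT - IN_QUEUE) // PLAYERS_PER_MATCH   # 160,000
matches_per_min = active_matches // MATCH_MINUTES               # 8,000
queue_joins_per_sec = IN_QUEUE // 30                            # ~13,333 (30s avg wait)
peak_joins_per_sec = queue_joins_per_sec * 3                    # ~40,000

queue_memory_mb = IN_QUEUE * 60 / 1e6                           # ~24 MB
yearly_match_tb = matches_per_min * 60 * 24 * 300 * 365 / 1e12  # ~1.3 TB

print(matches_per_min, peak_joins_per_sec, round(queue_memory_mb), round(yearly_match_tb, 1))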
Key Metrics Summary
────────────────────────────────────────────────────────────────────────
  ESTIMATION SUMMARY

  TRAFFIC
  ├── Peak queue joins:        40,000 /second
  ├── Matches created:         ~130 /second
  └── Rating updates:          ~1,300 /second (10 per match)

  STORAGE
  ├── Player profiles:         5 GB
  ├── Match history (yearly):  1.3 TB
  └── Queue state (memory):    24 MB

  INFRASTRUCTURE (rough)
  ├── Queue service:           10 instances (sharded by region)
  ├── Matchmaker:              5 instances (one leader per region)
  ├── Rating service:          10 instances
  ├── Database:                3-node cluster (primary + 2 replicas)
  └── Redis (queue state):     3-node cluster
────────────────────────────────────────────────────────────────────────
Phase 3: High-Level Design (10 minutes)
You: "Now let me sketch out the high-level architecture."
System Architecture
                        MATCHMAKING SYSTEM ARCHITECTURE

                                ┌─────────────┐
                                │   Players   │
                                │  (Clients)  │
                                └──────┬──────┘
                                       │
                                       ▼
                                ┌─────────────┐
                                │     CDN     │
                                │  WebSocket  │
                                │   Gateway   │
                                └──────┬──────┘
                                       │
                 ┌─────────────────────┼─────────────────────┐
                 ▼                     ▼                     ▼
          ┌─────────────┐       ┌─────────────┐       ┌─────────────┐
          │    Queue    │       │  Matchmaker │       │   Rating    │
          │   Service   │       │   Service   │       │   Service   │
          └──────┬──────┘       └──────┬──────┘       └──────┬──────┘
                 │                     │                     │
                 ▼                     │                     ▼
      ┌──────────────────────┐         │          ┌──────────────────────┐
      │    Redis Cluster     │         │          │      PostgreSQL      │
      │    (Queue State)     │         │          │ (Ratings / Matches)  │
      └──────────────────────┘         │          └──────────────────────┘
                                       │
                          ┌────────────┴────────────┐
                          ▼                         ▼
                   ┌─────────────┐           ┌─────────────┐
                   │    Kafka    │           │    etcd     │
                   │   (Events)  │           │   (Leader   │
                   │             │           │  Election)  │
                   └─────────────┘           └─────────────┘

          ┌─────────────┐       ┌─────────────┐       ┌─────────────┐
          │ Game Server │       │ Game Server │       │ Game Server │
          │  Pool (NA)  │       │  Pool (EU)  │       │ Pool (APAC) │
          └─────────────┘       └─────────────┘       └─────────────┘
Component Breakdown
You: "Let me walk through each component..."
1. WebSocket Gateway
Purpose: Maintain persistent connections with players
Key responsibilities:
- WebSocket connection management
- Authentication and session handling
- Route messages to appropriate services
- Push notifications (match found, rating updates)
Technology: Custom service with WebSocket support, horizontally scaled
2. Queue Service
Purpose: Manage player queue state
Key responsibilities:
- Add/remove players from queue
- Track queue time for each player
- Handle party groupings
- Publish events when queue state changes
Technology: Custom service + Redis for state
3. Matchmaker Service
Purpose: Create balanced matches from queued players
Key responsibilities:
- Run matching algorithm periodically
- Balance teams by ELO
- Allocate game servers
- Coordinate ready checks
Technology: Custom service with leader election (one per region)
4. Rating Service
Purpose: Manage player ratings with strong consistency
Key responsibilities:
- Calculate ELO changes after matches
- Update ratings transactionally
- Serve rating queries
- Handle rating decay
Technology: Custom service + PostgreSQL
Data Flow
You: "Let me trace through a typical matchmaking flow..."
MATCHMAKING FLOW (5v5)
Step 1: QUEUE JOIN
Player clicks "Find Match"
Client ──▶ Gateway ──▶ Queue Service
Queue Service adds player to Redis sorted set (by ELO)
Step 2: MATCHMAKING (runs every 500ms)
Matchmaker (leader) reads queue from Redis
Finds 10 compatible players
Creates provisional match
Step 3: READY CHECK
Matchmaker ──▶ Gateway ──▶ 10 Players
"Match found! Accept?"
Players confirm (30 second timeout)
Step 4: MATCH CONFIRMATION
All 10 accept:
Matchmaker allocates game server
Matchmaker writes match to PostgreSQL
Matchmaker publishes MatchCreated event
Some decline:
Decliners removed from queue
Accepters returned to front of queue
Step 5: GAME SESSION
Players connect to game server
Game plays out (20 minutes average)
Game server reports outcome
Step 6: RATING UPDATE
Game Server ──▶ Kafka ──▶ Rating Service
Rating Service calculates new ELO
Rating Service updates PostgreSQL (transaction)
Rating Service publishes RatingUpdated events
Gateway pushes new ratings to players
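Two small mechanisms in this flow are worth sketching before the deep dives: Step 1's queue stored as a Redis sorted set scored by ELO, and the "±100 preferred, expand over time" rule from the requirements. The sketch below is illustrative only; the key names (queue:{region}:{mode}) and the exact expansion schedule are assumptions, not part of the stated design.

# Sketch: ELO-scored queue plus a wait-time-based ELO window (illustrative key layout).
import time
import redis.asyncio as redis

r = redis.Redis()

async def join_queue(region: str, mode: str, player_id: str, elo: int) -> None:
    """Step 1: add the player to the per-region, per-mode queue, scored by ELO."""
    await r.zadd(f"queue:{region}:{mode}", {player_id: elo})
    await r.hset(f"queue_meta:{player_id}", mapping={"joined_at": time.time(), "elo": elo})

def allowed_elo_window(wait_seconds: float) -> int:
    """±100 for the first 60 seconds, then widen gradually (assumed schedule, capped at ±400)."""
    if wait_seconds <= 60:
        return 100
    return min(400, 100 + int((wait_seconds - 60) / 15) * 50)

async def candidates_near(region: str, mode: str, elo: int, wait_seconds: float) -> list:
    """Step 2: fetch queued players whose ELO falls inside the current window."""
    window = allowed_elo_window(wait_seconds)
    return await r.zrangebyscore(f"queue:{region}:{mode}", elo - window, elo + window)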
Phase 4: Deep Dives (20 minutes)
Interviewer: "Great high-level design. Let's dive deeper into a few areas. How do you ensure only one matchmaker is running per region?"
Deep Dive 1: Leader Election for Matchmaker (Week 5, Day 5)
You: "This is critical β if two matchmakers run simultaneously, players could be double-booked into different matches. I'd use leader election with fencing tokens."
The Problem
DOUBLE-BOOKING SCENARIO
Without proper leader election:
Matchmaker A (thinks it's leader):
Reads queue: [Player1, Player2, ... Player10]
Creates Match-1 with these players
Matchmaker B (also thinks it's leader):
Reads queue: [Player1, Player2, ... Player10]
Creates Match-2 with same players!
Result: Players assigned to two matches
→ Confusion, broken game state
The Solution
You: "I'd use etcd for leader election with fencing tokens. The matchmaker must include its fencing token when modifying queue state."
LEADER ELECTION FLOW
                        MATCHMAKER LEADER ELECTION

  Region: NA

   ┌─────────────┐         ┌─────────────┐         ┌─────────────┐
   │  Matchmaker │         │  Matchmaker │         │  Matchmaker │
   │      A      │         │      B      │         │      C      │
   │   (LEADER)  │         │  (standby)  │         │  (standby)  │
   │   token=42  │         │             │         │             │
   └──────┬──────┘         └──────┬──────┘         └──────┬──────┘
          │                       │                       │
          └───────────────────────┼───────────────────────┘
                                  │
                                  ▼
                           ┌─────────────┐
                           │    etcd     │
                           │ /matchmaker │
                           │ /na/leader  │
                           └─────────────┘
Implementation
# Matchmaker Leader Election with Fencing Tokens
import etcd3
from dataclasses import dataclass
from typing import Optional
import asyncio
import time
@dataclass
class LeaderLease:
"""Leadership lease with fencing token."""
token: int
expires_at: float
def is_valid(self) -> bool:
return time.time() < self.expires_at
class MatchmakerLeaderElection:
"""
Leader election for matchmaker service.
Ensures only ONE matchmaker runs per region.
Uses fencing tokens to prevent zombie leaders.
Week 5, Day 5: Leader Election concepts applied
"""
def __init__(
self,
etcd_host: str,
region: str,
node_id: str,
lease_ttl: int = 10
):
self.client = etcd3.client(host=etcd_host)
self.region = region
self.node_id = node_id
self.lease_ttl = lease_ttl
self.election_key = f"/matchmaker/{region}/leader"
self.lease: Optional[LeaderLease] = None
self.etcd_lease = None
self.is_leader = False
async def campaign(self) -> Optional[int]:
"""
Campaign for leadership.
Returns fencing token if elected, None otherwise.
"""
# Create lease
self.etcd_lease = self.client.lease(self.lease_ttl)
# Try to become leader (atomic compare-and-set)
success, responses = self.client.transaction(
compare=[
self.client.transactions.version(self.election_key) == 0
],
success=[
self.client.transactions.put(
self.election_key,
self.node_id.encode(),
lease=self.etcd_lease
)
],
failure=[]
)
if success:
# We're the leader - get fencing token from key version
value, metadata = self.client.get(self.election_key)
fencing_token = metadata.mod_revision # Monotonically increasing
self.lease = LeaderLease(
token=fencing_token,
expires_at=time.time() + self.lease_ttl
)
self.is_leader = True
# Start keepalive
asyncio.create_task(self._keepalive_loop())
return fencing_token
return None
async def _keepalive_loop(self):
"""Keep lease alive while we're leader."""
while self.is_leader:
try:
self.etcd_lease.refresh()
self.lease.expires_at = time.time() + self.lease_ttl
await asyncio.sleep(self.lease_ttl / 3)
except Exception as e:
self.is_leader = False
break
def get_fencing_token(self) -> Optional[int]:
"""Get current fencing token."""
if self.lease and self.lease.is_valid():
return self.lease.token
return None
async def resign(self):
"""Resign from leadership."""
self.is_leader = False
if self.etcd_lease:
self.etcd_lease.revoke()
class FencedQueueOperations:
"""
Queue operations protected by fencing tokens.
Redis operations include fencing token validation.
Prevents zombie leaders from corrupting state.
"""
def __init__(self, redis_client):
self.redis = redis_client
# Track highest token seen per region
self.highest_token_key = "matchmaker:highest_token:{region}"
async def remove_players_from_queue(
self,
region: str,
player_ids: list[str],
fencing_token: int
) -> bool:
"""
Remove players from queue atomically with fencing.
Only succeeds if fencing_token >= highest seen.
"""
# Lua script for atomic fencing check + removal
script = """
local highest_key = KEYS[1]
local queue_key = KEYS[2]
local token = tonumber(ARGV[1])
local player_count = tonumber(ARGV[2])
-- Check fencing token
local highest = tonumber(redis.call('GET', highest_key) or '0')
if token < highest then
return 0 -- Stale token, reject
end
-- Update highest token
redis.call('SET', highest_key, token)
-- Remove players from queue
for i = 3, 2 + player_count do
redis.call('ZREM', queue_key, ARGV[i])
end
return 1 -- Success
"""
result = await self.redis.eval(
script,
keys=[
self.highest_token_key.format(region=region),
f"queue:{region}"
],
args=[fencing_token, len(player_ids)] + player_ids
)
return result == 1
Edge Cases
Interviewer: "What happens if the leader crashes mid-match-creation?"
You: "The match creation is a saga. If the leader crashes, the saga either completes via compensation or a new leader takes over and cleans up."
LEADER CRASH RECOVERY
Scenario:
Leader starts creating match
Leader crashes after removing from queue but before writing match
Recovery:
1. etcd lease expires (10 seconds)
2. New leader elected with higher fencing token
3. New leader checks for "orphaned" queue removals
4. Players without match assignment are re-queued
Implementation:
- Match creation is a saga with compensation
- Queue removal records "pending_match_id"
- If match not confirmed within timeout, players re-queued
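A minimal sketch of steps 3-4 of that recovery, assuming each queue removal left a pending:{region}:{match_id} hash in Redis with the player list and a created_at timestamp (those key names and fields are assumptions for illustration, not part of the design above):

# Sketch: new leader re-queues players whose pending match never got confirmed.
import time

PENDING_TIMEOUT_SECONDS = 60  # assumed saga timeout

async def requeue_orphans(redis, queue_service, region: str) -> None:
    """Scan pending-match records; re-queue players from stale, unconfirmed ones."""
    now = time.time()
    async for key in redis.scan_iter(match=f"pending:{region}:*"):
        entry = await redis.hgetall(key)
        if not entry:
            continue
        created_at = float(entry[b"created_at"])
        confirmed = entry.get(b"confirmed") == b"1"
        if not confirmed and now - created_at > PENDING_TIMEOUT_SECONDS:
            player_ids = entry[b"player_ids"].decode().split(",")
            # Put abandoned players back at the front of the queue.
            await queue_service.add_players_priority(
                region=region, player_ids=player_ids, priority=True
            )
            await redis.delete(key)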
Deep Dive 2: Rating Updates with Strong Consistency (Week 5, Day 1)
You: "Rating updates need strong consistency. Players must see their correct rating immediately after a match."
The Problem
STALE RATING PROBLEM
Without strong consistency:
T=0: Match ends, Player A won
Server 1 updates rating: 1500 → 1520
T=1: Player A checks profile (reads from replica)
Replica hasn't received update yet
Player A sees: 1500 (STALE!)
T=2: Player A queues for next match
Queue uses stale rating: 1500
Matchmaker creates unfair match!
This violates competitive integrity.
The Solution
You: "I'd use read-your-writes consistency for the player who just finished a match, and linearizable reads for matchmaking."
CONSISTENCY STRATEGIES
RATING CONSISTENCY MODEL

| Operation | Consistency | Implementation |
|---|---|---|
| Rating update | Linearizable | Write to primary |
| Player views own rating | Read-your-writes | Read from primary if recent update |
| Matchmaker reads rating | Linearizable | Always read from primary |
| Leaderboard | Eventual | Read from replica (cached) |
Implementation
# Rating Service with Strong Consistency
from dataclasses import dataclass
from typing import Optional, Dict, Tuple
from datetime import datetime, timedelta
import asyncpg
import redis.asyncio as redis  # async client: the Redis calls below are awaited


class PlayerNotFoundError(Exception):
    """Raised when a player_id is not found in the players table."""
@dataclass
class RatingUpdate:
"""Result of a rating update."""
player_id: str
old_rating: int
new_rating: int
match_id: str
updated_at: datetime
class RatingService:
"""
Rating service with consistency guarantees.
Week 5, Day 1: Consistency Models applied
- Linearizable writes (always to primary)
- Read-your-writes for recent updates
- Causal consistency via version tracking
"""
def __init__(
self,
primary_db: asyncpg.Pool,
replica_db: asyncpg.Pool,
redis_client: redis.Redis
):
self.primary = primary_db
self.replica = replica_db
self.redis = redis_client
# Track recent updates for read-your-writes
self.recent_update_ttl = timedelta(seconds=30)
async def update_ratings_for_match(
self,
match_id: str,
winner_ids: list[str],
loser_ids: list[str],
is_draw: bool = False
) -> list[RatingUpdate]:
"""
Update ratings for all players in a match.
Uses transaction for atomicity.
Linearizable consistency (writes to primary).
"""
updates = []
async with self.primary.acquire() as conn:
async with conn.transaction():
# Get current ratings
all_player_ids = winner_ids + loser_ids
ratings = await conn.fetch(
"""
SELECT player_id, elo_rating
FROM players
WHERE player_id = ANY($1)
FOR UPDATE -- Lock rows for consistent read-modify-write
""",
all_player_ids
)
rating_map = {r['player_id']: r['elo_rating'] for r in ratings}
# Calculate new ratings (ELO formula)
new_ratings = self._calculate_elo_changes(
winners={pid: rating_map[pid] for pid in winner_ids},
losers={pid: rating_map[pid] for pid in loser_ids},
is_draw=is_draw
)
# Update all ratings atomically
for player_id, new_rating in new_ratings.items():
old_rating = rating_map[player_id]
await conn.execute(
"""
UPDATE players
SET elo_rating = $1,
last_match_at = NOW(),
rating_version = rating_version + 1
WHERE player_id = $2
""",
new_rating, player_id
)
# Record update for read-your-writes
await self._record_recent_update(player_id)
updates.append(RatingUpdate(
player_id=player_id,
old_rating=old_rating,
new_rating=new_rating,
match_id=match_id,
updated_at=datetime.utcnow()
))
return updates
async def get_rating(
self,
player_id: str,
require_fresh: bool = False
) -> Tuple[int, int]:
"""
Get player's rating.
Returns (rating, version).
If require_fresh=True or player has recent update,
reads from primary (linearizable).
Otherwise reads from replica (eventually consistent).
"""
# Check for recent update (read-your-writes)
has_recent_update = await self._has_recent_update(player_id)
if require_fresh or has_recent_update:
# Read from primary for consistency
result = await self.primary.fetchrow(
"SELECT elo_rating, rating_version FROM players WHERE player_id = $1",
player_id
)
else:
# Read from replica (faster, eventual consistency)
result = await self.replica.fetchrow(
"SELECT elo_rating, rating_version FROM players WHERE player_id = $1",
player_id
)
if result:
return result['elo_rating'], result['rating_version']
raise PlayerNotFoundError(player_id)
async def get_ratings_for_matchmaking(
self,
player_ids: list[str]
) -> Dict[str, int]:
"""
Get ratings for matchmaking.
ALWAYS reads from primary for linearizable consistency.
Matchmaker needs accurate ratings for fair matches.
"""
results = await self.primary.fetch(
"SELECT player_id, elo_rating FROM players WHERE player_id = ANY($1)",
player_ids
)
return {r['player_id']: r['elo_rating'] for r in results}
async def _record_recent_update(self, player_id: str):
"""Record that player has a recent rating update."""
key = f"rating:recent_update:{player_id}"
await self.redis.setex(
key,
int(self.recent_update_ttl.total_seconds()),
"1"
)
async def _has_recent_update(self, player_id: str) -> bool:
"""Check if player has a recent rating update."""
key = f"rating:recent_update:{player_id}"
return await self.redis.exists(key)
def _calculate_elo_changes(
self,
winners: Dict[str, int],
losers: Dict[str, int],
is_draw: bool,
k_factor: int = 32
) -> Dict[str, int]:
"""Calculate new ELO ratings using standard formula."""
# Average ratings
winner_avg = sum(winners.values()) / len(winners)
loser_avg = sum(losers.values()) / len(losers)
# Expected scores
exp_winner = 1 / (1 + 10 ** ((loser_avg - winner_avg) / 400))
exp_loser = 1 - exp_winner
# Actual scores
if is_draw:
actual_winner = 0.5
actual_loser = 0.5
else:
actual_winner = 1.0
actual_loser = 0.0
# Calculate changes
new_ratings = {}
for pid, rating in winners.items():
change = k_factor * (actual_winner - exp_winner)
new_ratings[pid] = max(0, min(3000, round(rating + change)))
for pid, rating in losers.items():
change = k_factor * (actual_loser - exp_loser)
new_ratings[pid] = max(0, min(3000, round(rating + change)))
return new_ratings
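A quick worked example of the formula above (hypothetical numbers, K=32, team averages exactly as in _calculate_elo_changes): a team averaging 1500 beats a team averaging 1600.

# Worked example: underdogs (avg 1500) beat favorites (avg 1600); K = 32.
winner_avg, loser_avg, k_factor = 1500, 1600, 32

expected_winner = 1 / (1 + 10 ** ((loser_avg - winner_avg) / 400))   # ~0.36
rating_change = k_factor * (1.0 - expected_winner)                   # ~+20.5

print(round(expected_winner, 2), round(rating_change))  # 0.36 20

So the underdog winners gain roughly 20 points each and the favorites lose roughly 20. Had the favorites won instead, the expected score would have been ~0.64 and the transfer only about 12 points, which is the asymmetry that makes ELO self-correcting.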
Deep Dive 3: Match Creation Saga (Week 5, Day 2)
You: "Match creation is a multi-step process that can fail at any point. I'd use the saga pattern with compensation."
The Problem
MATCH CREATION STEPS (can fail at any point)
Step 1: Remove players from queue
Step 2: Send ready check to all players
Step 3: Wait for confirmations (timeout possible)
Step 4: Allocate game server
Step 5: Write match to database
Step 6: Notify players of server address
If Step 4 fails (no servers available):
- Players already removed from queue
- They've confirmed they're ready
- But no match happens!
Need compensation to put players back in queue.
The Solution
MATCH CREATION SAGA
────────────────────────────────────────────────────────────────────────
  MATCH CREATION SAGA

  Step 1: Remove from queue
          Compensation: Re-add to queue (front)

  Step 2: Send ready checks
          Compensation: Cancel ready check UI

  Step 3: Collect confirmations
          Compensation: Return declined players to queue
          DECISION POINT: All confirmed? Continue : Partial compensate

  Step 4: Allocate game server   ← PIVOT POINT
          If fail: Full compensation (re-queue all)
          If success: Forward recovery only

  Step 5: Write match record
          Forward recovery: Retry until success

  Step 6: Notify players
          Forward recovery: Retry until success
────────────────────────────────────────────────────────────────────────
Implementation
# Match Creation Saga
from dataclasses import dataclass, field
from typing import Optional, List, Dict
from enum import Enum
from datetime import datetime, timedelta
import asyncio


class NoServersAvailableError(Exception):
    """Raised when no game server can be allocated for the match (saga pivot failure)."""


class FencingError(Exception):
    """Raised when a queue operation is rejected because the fencing token is stale."""
class MatchStatus(Enum):
PENDING_CONFIRMATIONS = "pending_confirmations"
ALLOCATING_SERVER = "allocating_server"
CONFIRMED = "confirmed"
CANCELLED = "cancelled"
FAILED = "failed"
@dataclass
class MatchCandidate:
"""A potential match being formed."""
match_id: str
region: str
game_mode: str
team1_player_ids: List[str]
team2_player_ids: List[str]
status: MatchStatus
created_at: datetime
fencing_token: int
# Saga state
removed_from_queue: bool = False
ready_checks_sent: bool = False
confirmations: Dict[str, bool] = field(default_factory=dict)
server_allocated: Optional[str] = None
match_written: bool = False
class MatchCreationSaga:
"""
Saga for creating a match.
Week 5, Day 2: Saga Pattern with compensation
Handles partial failures and ensures players
are never left in limbo.
"""
def __init__(
self,
queue_service,
gateway_service,
server_allocator,
match_repository,
fencing_token: int
):
self.queue = queue_service
self.gateway = gateway_service
self.servers = server_allocator
self.matches = match_repository
self.fencing_token = fencing_token
async def execute(self, candidate: MatchCandidate) -> MatchCandidate:
"""
Execute the match creation saga.
Returns the final state of the match candidate.
"""
all_players = candidate.team1_player_ids + candidate.team2_player_ids
try:
# Step 1: Remove players from queue
await self._step_remove_from_queue(candidate, all_players)
# Step 2: Send ready checks
await self._step_send_ready_checks(candidate, all_players)
# Step 3: Wait for confirmations
confirmed = await self._step_collect_confirmations(
candidate,
all_players,
timeout=timedelta(seconds=30)
)
if not confirmed:
# Partial compensation - some players declined
await self._compensate_unconfirmed(candidate)
candidate.status = MatchStatus.CANCELLED
return candidate
# Step 4: Allocate game server (PIVOT POINT)
try:
server = await self._step_allocate_server(candidate)
except NoServersAvailableError:
# Full compensation - re-queue everyone
await self._compensate_full(candidate, all_players)
candidate.status = MatchStatus.FAILED
return candidate
# PIVOT PASSED - Forward recovery only from here
candidate.server_allocated = server.address
# Step 5: Write match to database (retry until success)
await self._step_write_match(candidate)
# Step 6: Notify players (retry until success)
await self._step_notify_players(candidate, all_players)
candidate.status = MatchStatus.CONFIRMED
return candidate
except Exception as e:
# Unexpected error - full compensation
await self._compensate_full(candidate, all_players)
candidate.status = MatchStatus.FAILED
raise
async def _step_remove_from_queue(
self,
candidate: MatchCandidate,
player_ids: List[str]
):
"""Step 1: Remove players from queue."""
success = await self.queue.remove_players(
region=candidate.region,
player_ids=player_ids,
fencing_token=self.fencing_token,
pending_match_id=candidate.match_id
)
if not success:
raise FencingError("Stale fencing token")
candidate.removed_from_queue = True
async def _step_send_ready_checks(
self,
candidate: MatchCandidate,
player_ids: List[str]
):
"""Step 2: Send ready checks to all players."""
await self.gateway.send_ready_checks(
player_ids=player_ids,
match_id=candidate.match_id,
game_mode=candidate.game_mode
)
candidate.ready_checks_sent = True
async def _step_collect_confirmations(
self,
candidate: MatchCandidate,
player_ids: List[str],
timeout: timedelta
) -> bool:
"""Step 3: Collect player confirmations."""
deadline = datetime.utcnow() + timeout
while datetime.utcnow() < deadline:
# Get current confirmations
confirmations = await self.gateway.get_confirmations(
candidate.match_id
)
candidate.confirmations = confirmations
# Check if all confirmed
confirmed_count = sum(1 for c in confirmations.values() if c)
declined_count = sum(1 for c in confirmations.values() if c is False)
if confirmed_count == len(player_ids):
return True
if declined_count > 0:
# Someone explicitly declined
return False
await asyncio.sleep(0.5)
# Timeout - treat as decline
return False
async def _step_allocate_server(
self,
candidate: MatchCandidate
):
"""Step 4: Allocate game server (PIVOT)."""
server = await self.servers.allocate(
region=candidate.region,
game_mode=candidate.game_mode,
match_id=candidate.match_id
)
return server
async def _step_write_match(self, candidate: MatchCandidate):
"""Step 5: Write match to database (forward recovery)."""
max_retries = 10
for attempt in range(max_retries):
try:
await self.matches.create(
match_id=candidate.match_id,
region=candidate.region,
game_mode=candidate.game_mode,
team1=candidate.team1_player_ids,
team2=candidate.team2_player_ids,
server_address=candidate.server_allocated
)
candidate.match_written = True
return
except Exception as e:
if attempt == max_retries - 1:
raise
await asyncio.sleep(2 ** attempt) # Exponential backoff
async def _step_notify_players(
self,
candidate: MatchCandidate,
player_ids: List[str]
):
"""Step 6: Notify players of match (forward recovery)."""
# Retry until all notified
for _ in range(10):
try:
await self.gateway.notify_match_ready(
player_ids=player_ids,
match_id=candidate.match_id,
server_address=candidate.server_allocated
)
return
except Exception:
await asyncio.sleep(1)
async def _compensate_unconfirmed(self, candidate: MatchCandidate):
"""Compensate for unconfirmed players."""
# Re-queue players who confirmed
confirmed_players = [
pid for pid, confirmed in candidate.confirmations.items()
if confirmed
]
if confirmed_players:
await self.queue.add_players_priority(
region=candidate.region,
player_ids=confirmed_players,
priority=True # Put at front of queue
)
# Notify confirmed players match was cancelled
await self.gateway.notify_match_cancelled(
player_ids=confirmed_players,
reason="Other players did not confirm"
)
async def _compensate_full(
self,
candidate: MatchCandidate,
player_ids: List[str]
):
"""Full compensation - re-queue all players."""
if candidate.removed_from_queue:
await self.queue.add_players_priority(
region=candidate.region,
player_ids=player_ids,
priority=True
)
if candidate.ready_checks_sent:
await self.gateway.notify_match_cancelled(
player_ids=player_ids,
reason="Match creation failed"
)
Deep Dive 4: Party Queue with Conflict Resolution (Week 5, Day 4)
You: "Party queue adds complexity β players might update their party membership concurrently. I'd use CRDTs for party state."
The Problem
PARTY CONFLICT SCENARIO
Alice creates party, invites Bob and Carol:
Party: {Alice (leader), Bob, Carol}
Concurrent operations:
- Alice kicks Carol (on her device)
- Carol leaves party (on her device)
- Bob invites Dave (on his device)
All three operations happen before any sync.
Questions:
- What's the final party state?
- What if Alice's kick and Carol's leave have different effects?
- Is Dave in the party or not?
The Solution
You: "I'd model party membership as an OR-Set CRDT. Each add/remove is tagged, and add wins over concurrent remove."
PARTY STATE AS CRDT
Party modeled as OR-Set:
- Add member = add to set with unique tag
- Remove member = tombstone existing tags
- Concurrent add+remove = add wins (member stays)
Example resolution:
Initial: {Alice, Bob, Carol}
Operation 1: Alice kicks Carol
  → Tombstone Carol's tag
Operation 2: Carol leaves
  → Tombstone Carol's tag (same effect)
Operation 3: Bob invites Dave
  → Add Dave with new tag
Final state after merge: {Alice, Bob, Dave}
Carol is out (both operations removed her)
Dave is in (no conflicting remove)
Implementation
# Party Queue with CRDT
from dataclasses import dataclass, field
from typing import Dict, Set, Optional
import uuid
import time
import json  # used by PartyService below to serialize party state
@dataclass
class PartyMember:
"""Member in a party with metadata."""
player_id: str
joined_at: float
is_leader: bool
class PartyORSet:
"""
Party membership using OR-Set CRDT.
Week 5, Day 4: Conflict Resolution with CRDTs
Handles concurrent joins/leaves/kicks without conflicts.
Add-wins semantics ensures players aren't accidentally removed.
"""
def __init__(self, party_id: str, creator_id: str):
self.party_id = party_id
self.leader_id = creator_id
# OR-Set: player_id -> {tag -> (tombstoned, joined_at)}
self.members: Dict[str, Dict[str, tuple]] = {}
# Add creator as first member
self._add_member(creator_id, is_leader=True)
def _generate_tag(self) -> str:
"""Generate unique tag for this operation."""
return f"{uuid.uuid4()}:{time.time()}"
def _add_member(self, player_id: str, is_leader: bool = False):
"""Add a member with new unique tag."""
tag = self._generate_tag()
if player_id not in self.members:
self.members[player_id] = {}
self.members[player_id][tag] = (False, time.time(), is_leader)
def add_member(self, player_id: str) -> bool:
"""Add a player to the party."""
if self.get_member_count() >= 5:
return False # Party full
self._add_member(player_id)
return True
def remove_member(self, player_id: str):
"""Remove a player (tombstone all their tags)."""
if player_id in self.members:
for tag in self.members[player_id]:
tombstoned, joined_at, is_leader = self.members[player_id][tag]
self.members[player_id][tag] = (True, joined_at, is_leader)
def is_member(self, player_id: str) -> bool:
"""Check if player is currently a member."""
if player_id not in self.members:
return False
# Member if any non-tombstoned tag exists
return any(
not tombstoned
for tombstoned, _, _ in self.members[player_id].values()
)
def get_members(self) -> list[str]:
"""Get all current members."""
return [
pid for pid in self.members
if self.is_member(pid)
]
def get_member_count(self) -> int:
"""Get count of current members."""
return len(self.get_members())
def merge(self, other: 'PartyORSet'):
"""
Merge with another party state.
OR-Set merge:
- Union of all tags
- Tombstoned if tombstoned in either
- Add wins: new tag from concurrent add survives
"""
all_players = set(self.members.keys()) | set(other.members.keys())
for player_id in all_players:
self_tags = self.members.get(player_id, {})
other_tags = other.members.get(player_id, {})
merged_tags = {}
all_tag_ids = set(self_tags.keys()) | set(other_tags.keys())
for tag in all_tag_ids:
if tag in self_tags and tag in other_tags:
# Both have this tag - tombstoned if either tombstoned
self_tomb, self_time, self_leader = self_tags[tag]
other_tomb, other_time, other_leader = other_tags[tag]
merged_tags[tag] = (
self_tomb or other_tomb,
max(self_time, other_time),
self_leader or other_leader
)
elif tag in self_tags:
merged_tags[tag] = self_tags[tag]
else:
merged_tags[tag] = other_tags[tag]
if merged_tags:
self.members[player_id] = merged_tags
def to_dict(self) -> dict:
"""Serialize for storage/transmission."""
return {
"party_id": self.party_id,
"leader_id": self.leader_id,
"members": {
pid: {
tag: {"tombstoned": t, "joined_at": j, "is_leader": l}
for tag, (t, j, l) in tags.items()
}
for pid, tags in self.members.items()
}
}
@classmethod
def from_dict(cls, data: dict) -> 'PartyORSet':
"""Deserialize from storage."""
party = cls.__new__(cls)
party.party_id = data["party_id"]
party.leader_id = data["leader_id"]
party.members = {
pid: {
tag: (info["tombstoned"], info["joined_at"], info["is_leader"])
for tag, info in tags.items()
}
for pid, tags in data["members"].items()
}
return party
class PartyService:
"""
Party management service with CRDT-based conflict resolution.
"""
def __init__(self, redis_client):
self.redis = redis_client
async def create_party(self, creator_id: str) -> str:
"""Create a new party."""
party_id = str(uuid.uuid4())
party = PartyORSet(party_id, creator_id)
await self._save_party(party)
await self._set_player_party(creator_id, party_id)
return party_id
async def join_party(self, party_id: str, player_id: str) -> bool:
"""Join an existing party."""
party = await self._load_party(party_id)
if not party:
return False
if party.add_member(player_id):
await self._save_party(party)
await self._set_player_party(player_id, party_id)
return True
return False
async def leave_party(self, party_id: str, player_id: str):
"""Leave a party."""
party = await self._load_party(party_id)
if not party:
return
party.remove_member(player_id)
await self._save_party(party)
await self._clear_player_party(player_id)
# Check if party is now empty
if party.get_member_count() == 0:
await self._delete_party(party_id)
async def merge_party_updates(
self,
party_id: str,
remote_state: dict
):
"""
Merge remote party state (from another server).
Used for cross-region party sync.
"""
local_party = await self._load_party(party_id)
remote_party = PartyORSet.from_dict(remote_state)
if local_party:
local_party.merge(remote_party)
await self._save_party(local_party)
else:
await self._save_party(remote_party)
async def get_party_for_matchmaking(
self,
party_id: str
) -> Optional[list[str]]:
"""Get party members for matchmaking."""
party = await self._load_party(party_id)
if not party:
return None
return party.get_members()
async def _save_party(self, party: PartyORSet):
"""Save party to Redis."""
await self.redis.set(
f"party:{party.party_id}",
json.dumps(party.to_dict())
)
async def _load_party(self, party_id: str) -> Optional[PartyORSet]:
"""Load party from Redis."""
data = await self.redis.get(f"party:{party_id}")
if data:
return PartyORSet.from_dict(json.loads(data))
return None
async def _delete_party(self, party_id: str):
"""Delete party from Redis."""
await self.redis.delete(f"party:{party_id}")
async def _set_player_party(self, player_id: str, party_id: str):
"""Set player's current party."""
await self.redis.set(f"player_party:{player_id}", party_id)
async def _clear_player_party(self, player_id: str):
"""Clear player's current party."""
await self.redis.delete(f"player_party:{player_id}")
Phase 5: Scaling and Edge Cases (5 minutes)
Interviewer: "How would this system scale to 10x the current load?"
Scaling Strategy
You: "There are several scaling vectors to consider..."
Horizontal Scaling
CURRENT → 10X SCALE
Queue Service:
Current: 10 instances (1 per region shard)
Scaled: 30 instances (3 per region for HA)
Matchmaker:
Current: 5 leaders (1 per region)
Scaled: 5 leaders + 10 standbys (hot failover)
Note: Cannot horizontally scale leaders (one per region)
Instead, optimize matching algorithm
Rating Service:
Current: 10 instances
Scaled: 50 instances (stateless, scales linearly)
Database:
Current: 3-node cluster
Scaled: Sharded by region (5 clusters Γ 3 nodes)
Redis:
Current: 3-node cluster
Scaled: 5 clusters (1 per region)
Bottleneck Analysis
BOTTLENECK ANALYSIS

| Component | Current Limit | Scaling Solution |
|---|---|---|
| Queue joins | 40K/sec | Shard by region + game mode |
| Matchmaker cycle | 500ms/region | Optimize algorithm, parallelize |
| Rating updates | 2K/sec | Batch updates, async writes |
| WebSocket conns | 500K/server | Add gateway instances |
| Database writes | 10K/sec | Shard by player_id hash |

FIRST BOTTLENECK: Matchmaker cycle time
At 10x scale, 500ms is not enough to process 4M queued players.
Solution: Parallelize within region by ELO bands.
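The "parallelize by ELO bands" idea can be sketched simply: split the sorted-set queue into fixed ELO ranges and run an independent matching pass per band, so cycle time tracks the largest band rather than the whole queue. The band boundaries below are assumptions for illustration.

# Sketch: run one matching pass per ELO band concurrently within a region.
import asyncio

ELO_BANDS = [(0, 1000), (1000, 1500), (1500, 2000), (2000, 2500), (2500, 3000)]  # assumed

async def match_band(redis, region: str, mode: str, lo: int, hi: int) -> None:
    """One matching pass over a single ELO band of the region's queue."""
    players = await redis.zrangebyscore(f"queue:{region}:{mode}", lo, hi, withscores=True)
    # ... greedy grouping of these candidates into 10-player matches (omitted) ...

async def matchmaking_cycle(redis, region: str, mode: str) -> None:
    """Run all bands in parallel; cycle time is bounded by the slowest band."""
    await asyncio.gather(*(
        match_band(redis, region, mode, lo, hi) for lo, hi in ELO_BANDS
    ))

The trade-off is that players near a band boundary only see candidates on their side of it; overlapping the bands slightly is one common mitigation.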
Edge Cases
Interviewer: "What are some edge cases we should handle?"
You: "Let me walk through several..."
Edge Case 1: Player Disconnects During Ready Check
Scenario: Player confirms ready, then disconnects before match starts
Handling:
- 10-second grace period for reconnection
- If not reconnected: treat as decline
- Re-queue other confirmed players (priority)
- Player who disconnected gets short queue ban (1 minute)
Edge Case 2: Rating Update During Queue
Scenario: Player queued at 1500 ELO, plays another mode,
rating updates to 1550 while still in first queue
Handling:
- Queue stores snapshot of rating at join time
- Matchmaker uses queue-time rating for matching
- After match, new rating is used for next queue
- Prevents gaming the system
Edge Case 3: Cross-Region Party
Scenario: Alice (NA) parties with Bob (EU)
Handling:
- Party stored in both regions (CRDT sync)
- When queuing, use lowest latency region for party
- Show warning about potential latency for some members
- Respect party leader's preference
Failure Scenarios
| Failure | Detection | Impact | Recovery |
|---|---|---|---|
| Matchmaker crashes | etcd lease expires | Queue builds up (30 sec) | New leader elected, continues |
| Redis cluster fails | Health checks | Players can't queue | Fallback to local memory, replay from Kafka |
| Database primary fails | Connection errors | Rating updates delayed | Promote replica, replay pending updates |
| Game server crashes | Heartbeat timeout | Match lost | Players returned to queue, no ELO change |
| Kafka unavailable | Connection errors | Events not published | Local queue, retry when available |
Phase 6: Monitoring and Operations
Interviewer: "How would you monitor this system in production?"
Key Metrics
You: "I would track metrics at multiple levels..."
Business Metrics
- Median queue time by region: Target <30s, Alert if >60s
- Match quality (ELO difference): Target <100, Alert if >200
- Player satisfaction (match completion rate): Target >95%
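As one possible way to expose the first two of these from the matchmaker, here is a small instrumentation sketch using prometheus_client (metric names, labels, and buckets are assumptions):

# Sketch: instrumenting queue wait time and match ELO spread with prometheus_client.
from prometheus_client import Histogram

QUEUE_WAIT_SECONDS = Histogram(
    "matchmaking_queue_wait_seconds",
    "Time from queue join to match found",
    ["region", "game_mode"],
    buckets=(5, 10, 20, 30, 45, 60, 90, 120, 180),
)

MATCH_ELO_SPREAD = Histogram(
    "matchmaking_match_elo_spread",
    "Max-min ELO difference among players in a created match",
    ["region", "game_mode"],
    buckets=(25, 50, 100, 150, 200, 300),
)

def record_match(region: str, mode: str, wait_seconds: float, elo_spread: int) -> None:
    """Called by the matchmaker when a match is confirmed."""
    QUEUE_WAIT_SECONDS.labels(region=region, game_mode=mode).observe(wait_seconds)
    MATCH_ELO_SPREAD.labels(region=region, game_mode=mode).observe(elo_spread)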
System Metrics
────────────────────────────────────────────────────────────────────────
  MATCHMAKING DASHBOARD

  QUEUE HEALTH
  ├── Queue depth by region:      400K players
  ├── Median wait time:           28 seconds
  └── P95 wait time:              85 seconds

  MATCHMAKER HEALTH
  ├── Matches created/min:        8,200
  ├── Cycle time (p99):           450ms
  └── Failed matches/min:         12

  RATING SERVICE
  ├── Updates/sec:                1,350
  ├── Update latency (p99):       45ms
  └── Primary DB lag:             2ms

  LEADER ELECTION
  ├── Current leaders by region:  NA✓ EU✓ APAC✓ SA✓ OCE✓
  ├── Last failover:              2 hours ago
  └── Failover count (24h):       1
────────────────────────────────────────────────────────────────────────
Alerting Strategy
CRITICAL (PagerDuty):
- No matchmaker leader for region > 30 seconds
- Queue depth growing > 10% per minute for 5 minutes
- Rating update latency p99 > 5 seconds
- Database replication lag > 10 seconds
WARNING (Slack):
- Median queue time > 45 seconds
- Match ELO spread > 150 average
- Game server allocation failures > 1%
- Kafka consumer lag > 1000 messages
INFO (Dashboard only):
- Leader election occurred
- New region peak queue depth
- Daily match count records
Runbook Snippet
RUNBOOK: Matchmaker Leader Not Elected
SYMPTOMS:
- Alert: "No matchmaker leader for NA > 30s"
- Queue depth increasing
- No new matches being created
DIAGNOSIS:
1. Check etcd cluster health:
$ etcdctl endpoint health
2. Check matchmaker pods:
$ kubectl get pods -l app=matchmaker
3. Check etcd leader election key:
$ etcdctl get /matchmaker/na/leader
RESOLUTION:
1. If etcd unhealthy:
- Check etcd pod logs
- Restart etcd pods if necessary
2. If matchmaker pods crashed:
- Check for OOM or panic in logs
- Scale up replicas
3. If election stuck:
- Delete stale lease: etcdctl lease revoke <lease-id>
- Matchmakers will re-elect
4. Escalate to: Platform Team
Interview Conclusion
Interviewer: "Excellent work. You've demonstrated strong understanding of distributed systems, particularly around leader election and consistency models. The saga pattern for match creation was well thought out. Any questions for me?"
You: "Thank you! I'm curious how you handle the matchmaking algorithm itself β do you use any ML to predict match quality? And how do you balance queue time versus match fairness in practice?"
Interviewer: "Great questions. We actually use a learned model to predict match outcomes and optimize for close games. The queue time trade-off is a constant tuning exercise β we have knobs for each region based on player population."
Summary: Week 5 Concepts Applied
Week 5: Consistency and Coordination
| Day | Concept | Application in This Design |
|---|---|---|
| Day 1 | Consistency Models | Linearizable reads for matchmaking ratings, read-your-writes for player's own rating after match |
| Day 2 | Saga Pattern | Match creation as saga with compensation for failed/declined matches |
| Day 3 | Workflow Orchestration | Could use Temporal for complex match lifecycle (mentioned as alternative) |
| Day 4 | Conflict Resolution | Party membership as OR-Set CRDT for concurrent join/leave/kick |
| Day 5 | Leader Election | Single matchmaker leader per region with fencing tokens |
Code Patterns Demonstrated
1. LEADER ELECTION WITH FENCING
- MatchmakerLeaderElection class
- FencedQueueOperations for safe state modification
- etcd-based election with lease TTL
2. CONSISTENCY GUARANTEES
- RatingService with read-your-writes
- Linearizable reads for matchmaking
- Eventual consistency for leaderboards
3. SAGA PATTERN
- MatchCreationSaga with compensation
- Pivot point at server allocation
- Forward recovery after pivot
4. CRDT FOR CONFLICT RESOLUTION
- PartyORSet for membership
- Add-wins semantics
- Merge operation for cross-region sync
5. FENCING TOKEN VALIDATION
- Lua script in Redis for atomic check
- Prevents zombie leader writes
- Monotonically increasing tokens
Self-Assessment Checklist
After studying this capstone, you should be able to:
- Explain why leader election is necessary for matchmaking
- Implement fencing tokens to prevent zombie leader corruption
- Design consistency models based on operation requirements
- Use sagas for multi-step operations with compensation
- Apply CRDTs for concurrent state modification
- Choose between choreography and orchestration for workflows
- Identify pivot points in saga design
- Handle partial failures gracefully
- Design monitoring for distributed coordination
- Debug leader election and consistency issues
Key Interview Takeaways
────────────────────────────────────────────────────────────────────────
  MATCHMAKING SYSTEM CHECKLIST

  ✓ Leader election for matchmaker (one per region)
  ✓ Fencing tokens prevent zombie leader writes
  ✓ Saga pattern for match creation with compensation
  ✓ Strong consistency for ratings (linearizable)
  ✓ CRDTs for party state (conflict-free merging)
  ✓ Read-your-writes for player viewing own rating
  ✓ Graceful degradation during failures
  ✓ Horizontal scaling where possible
  ✓ Clear monitoring and alerting strategy

  KEY TRADE-OFFS:
  • Queue time vs match quality (configurable ELO range)
  • Strong vs eventual consistency (per-operation choice)
  • Single leader vs parallel processing (leader per region)
────────────────────────────────────────────────────────────────────────
This capstone problem integrates all concepts from Week 5 of the System Design Mastery Series. Use this as a template for approaching similar interview problems involving distributed coordination.