Bonus Problem 9: Uber
The Real-Time Marketplace That Moves 30 Million People Every Day
How Do You Match Millions of Riders with Drivers in Under 3 Seconds?
Imagine this challenge: Every second, thousands of people open an app expecting to see available cars near them. Within 3 seconds, you must find the optimal driver from potentially hundreds nearby, calculate the fare, estimate arrival time, and begin the match, all while the driver and rider are both moving.
This is Uber: a real-time, location-based marketplace operating in 10,000+ cities across 70 countries. It's not just a taxi app. It's a masterclass in geospatial indexing, dynamic pricing, and building systems that make split-second decisions at global scale.
THE UBER SCALE (2025)
═════════════════════

MARKETPLACE
───────────
  Monthly Active Users:     156 Million
  Active Drivers:           7.4 Million
  Daily Trips:              30 Million
  Cities:                   10,000+
  Countries:                70

REAL-TIME
─────────
  Location updates/second:  Millions
  Match latency target:     < 3 seconds
  ETA accuracy:             ~95% within 3 minutes
  Driver ping response:     < 15 seconds

DATA
────
  Events processed/day:     Hundreds of billions
  Data pipelines:           200,000+
  ML models in production:  Thousands
  Kafka messages/day:       Trillions

INFRASTRUCTURE
──────────────
  Microservices:            Thousands
  Data centers:             Multiple (hybrid cloud)
  Engineers:                ~6,000

BUSINESS (2024)
───────────────
  Gross Bookings:           $163 Billion
  Revenue:                  $43.9 Billion
  Trips completed:          ~10 Billion (cumulative)
This is the system we'll design today, and along the way we'll discover how Uber matches riders with drivers faster than you can blink.
The Interview Begins
You walk into the interview room. The interviewer smiles and gestures to the whiteboard.
Interviewer: "Thanks for coming in. Today we're going to design a ride-hailing platform like Uber. I'm interested in how you think about real-time matching, geospatial systems, and dynamic pricing. Please think out loudβthis is collaborative."
They write on the whiteboard:
Design a Ride-Hailing Platform

Requirements:
- Match riders with nearby drivers in real-time (< 3 seconds)
- Track driver locations continuously
- Calculate accurate ETAs and fares
- Implement dynamic (surge) pricing
- Support multiple ride types (UberX, Pool, Black, etc.)
- Handle global scale across thousands of cities
Interviewer: "Take a few minutes to think about this, then walk me through your approach. We have about 45 minutes."
Phase 1: Requirements Clarification (5 minutes)
Before diving in, you take a breath and start asking questions.
Your Questions
You: "Before I start designing, let me clarify some requirements. What's our target scaleβhow many concurrent users and trips?"
Interviewer: "Think global scale. 150+ million monthly users, 7+ million drivers, 30 million trips per day."
You: "For the matching process, what's the acceptable latency? How fast should a rider see available cars?"
Interviewer: "Under 3 seconds from request to match. Users should see nearby drivers update in real-time as they move."
You: "How frequently do drivers report their location?"
Interviewer: "Every 4 seconds while online. That's millions of location updates per second globally."
You: "What about the ETA calculation? How accurate does it need to be?"
Interviewer: "Within 3 minutes of actual arrival time, 95% of the time. This is critical for user trust."
You: "Perfect. Let me summarize the requirements."
Functional Requirements
1. RIDER EXPERIENCE
- See nearby available drivers on map
- Request a ride with pickup and destination
- Get fare estimate before confirming
- Track driver en route to pickup
- Track trip progress to destination
- Rate driver after trip
2. DRIVER EXPERIENCE
- Go online/offline
- Receive ride requests with accept/decline
- Navigate to pickup and destination
- See earnings and trip history
- Receive surge notifications
3. MATCHING
- Find optimal driver for rider request
- Consider distance, ETA, driver rating
- Support ride types (X, Pool, Black, XL)
- Handle driver cancellations gracefully
4. PRICING
- Calculate base fare (time + distance)
- Apply dynamic surge pricing
- Show upfront pricing to riders
- Split fares for Pool rides
Non-Functional Requirements
1. LATENCY
- Match request to driver assignment: < 3 seconds
- Location update processing: < 100ms
- ETA calculation: < 500ms
- Fare calculation: < 200ms
2. AVAILABILITY
- 99.99% for ride requests
- Graceful degradation (show cached data if needed)
- Multi-region failover
3. SCALE
- 30 million trips/day
- Millions of location updates/second
- Peak handling (New Year's Eve, major events)
4. CONSISTENCY
- Strong consistency for trip state
- Eventual consistency for driver locations
- Exactly-once trip creation
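The "exactly-once trip creation" requirement is usually met with an idempotency key: the client generates a request ID, and retries carrying the same ID return the original trip instead of creating a duplicate. A minimal sketch, with an in-memory dict standing in for the trip database (all names here are illustrative):

```python
# Hypothetical sketch of idempotent trip creation.
import uuid

class TripStore:
    """In-memory stand-in for the trip database (illustrative only)."""
    def __init__(self):
        self._by_request_id = {}  # request_id -> trip record

    def create_trip(self, request_id: str, rider_id: str, driver_id: str) -> dict:
        # If this request_id was already processed, return the existing
        # trip instead of creating a second one.
        existing = self._by_request_id.get(request_id)
        if existing is not None:
            return existing
        trip = {
            "trip_id": str(uuid.uuid4()),
            "request_id": request_id,
            "rider_id": rider_id,
            "driver_id": driver_id,
            "state": "CREATED",
        }
        self._by_request_id[request_id] = trip
        return trip

store = TripStore()
first = store.create_trip("req-123", "rider-1", "driver-9")
retry = store.create_trip("req-123", "rider-1", "driver-9")  # network retry
assert first["trip_id"] == retry["trip_id"]  # no duplicate trip created
```

In production the `request_id` lookup would be a unique-constraint insert in the trip database, so the dedup survives process restarts.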
Phase 2: Back of the Envelope Estimation (5 minutes)
You: "Let me work through the numbers to understand the scale."
Traffic Estimation
RIDE REQUESTS
Daily trips: 30 million
Peak multiplier: 3x (evening rush, events)
Peak trips/hour: 30M × 3 / 24 = 3.75 million
Peak trips/second: ~1,000 trips/second
LOCATION UPDATES
Active drivers: 7.4 million
Online at any time: ~20% = 1.5 million
Update frequency: Every 4 seconds
Location updates/second: 1.5M / 4 = 375,000/second
Peak (1.5x): ~500,000 updates/second
Storage Estimation
TRIP DATA
Trips per day: 30 million
Trip record size: ~5 KB (locations, timestamps, fare)
Daily trip storage: 30M × 5 KB = 150 GB/day
Yearly: ~55 TB/year
LOCATION HISTORY
Updates per second: 500,000
Bytes per update: ~100 bytes (lat, lng, timestamp, driver_id)
Daily location data: 500K × 100 × 86,400 = 4.3 TB/day
Retention (7 days): ~30 TB
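The arithmetic above is worth sanity-checking in code; this snippet just replays the estimates (the 375K figure in the text rounds 7.4M online drivers up to 1.5M):

```python
# Back-of-the-envelope numbers from the estimation above, verified.
daily_trips = 30_000_000
peak_trips_per_hour = daily_trips * 3 / 24            # 3x peak multiplier
peak_trips_per_sec = peak_trips_per_hour / 3600
assert round(peak_trips_per_hour) == 3_750_000
assert 1_000 <= peak_trips_per_sec <= 1_100           # ~1,000 trips/second

online_drivers = 7_400_000 * 0.20                     # ~20% online at once
updates_per_sec = online_drivers / 4                  # one ping every 4 s
assert updates_per_sec == 370_000                     # text rounds to 375K

trip_storage_gb = daily_trips * 5 / 1e6               # 5 KB/trip -> GB/day
loc_storage_tb = 500_000 * 100 * 86_400 / 1e12        # 100 B/update -> TB/day
assert trip_storage_gb == 150.0
assert abs(loc_storage_tb - 4.32) < 0.01
```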
Key Metrics Summary
ESTIMATION SUMMARY

TRAFFIC
├── Peak trip requests:      1,000/second
├── Location updates:        500,000/second
└── API requests (total):    ~5 million/second

STORAGE
├── Trip data:               150 GB/day
├── Location data:           4.3 TB/day
└── Total (with retention):  ~100 TB active

COMPUTE
├── Matching decisions:      1,000/second
├── ETA calculations:        10,000/second
└── Surge calculations:      Per cell, every 30 seconds
Phase 3: High-Level Design (10 minutes)
You: "Now let me sketch out the high-level architecture."
System Architecture
UBER HIGH-LEVEL ARCHITECTURE

CLIENTS
├── Rider App
└── Driver App
        │
        │  WebSocket + HTTP
        ▼
API GATEWAY
    Rate Limiting | Auth | Routing | Load Balancing
        │
        ▼
REAL-TIME LAYER
├── WebSocket Manager
├── Location Service
└── Push Service
        │
        ▼
CORE SERVICES
├── Matching Service
├── Trip Service
├── Pricing Service
├── ETA Service
├── Map Service
└── Payment Service
        │
        ▼
DATA LAYER
├── Spanner   (trips)
├── Redis     (live driver locations)
├── Cassandra (location history)
└── Kafka     (events)

GEOSPATIAL INDEX
└── H3 Hexagonal Grid (driver location index)
Data Flow: Requesting a Ride
RIDE REQUEST FLOW
1. RIDER OPENS APP
   ├── WebSocket connection established
   └── See nearby drivers (from Location Service)

2. RIDER REQUESTS RIDE
   ├── POST /ride-request {pickup, destination, ride_type}
   └── API Gateway → Matching Service

3. MATCHING SERVICE
   ├── Query Location Service for nearby drivers
   ├── Filter by ride type, rating, availability
   ├── Calculate ETA for each candidate
   ├── Score and rank candidates
   └── Select optimal driver

4. DRIVER NOTIFICATION
   ├── Push notification via WebSocket
   └── Driver has 15 seconds to accept

5. IF ACCEPTED
   ├── Trip Service creates trip record
   ├── Pricing Service calculates fare
   ├── Both rider and driver get trip details
   └── Navigation begins

6. DURING TRIP
   ├── Driver location updates every 4 seconds
   ├── ETA recalculated continuously
   └── Both apps show real-time progress

7. TRIP COMPLETION
   ├── Calculate final fare
   ├── Process payment
   ├── Request ratings
   └── Update driver availability

TOTAL LATENCY: < 3 seconds (request to driver notification)
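The flow above implies a trip lifecycle with a fixed set of legal transitions. A minimal sketch of that state machine (state and function names are illustrative, not Uber's actual schema):

```python
# Trip lifecycle implied by the ride request flow, as a state machine.
from enum import Enum

class TripState(Enum):
    REQUESTED = "requested"
    MATCHED = "matched"
    EN_ROUTE_TO_PICKUP = "en_route_to_pickup"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    CANCELLED = "cancelled"

# Allowed transitions; anything else is rejected.
TRANSITIONS = {
    TripState.REQUESTED: {TripState.MATCHED, TripState.CANCELLED},
    TripState.MATCHED: {TripState.EN_ROUTE_TO_PICKUP, TripState.CANCELLED},
    TripState.EN_ROUTE_TO_PICKUP: {TripState.IN_PROGRESS, TripState.CANCELLED},
    TripState.IN_PROGRESS: {TripState.COMPLETED},
    TripState.COMPLETED: set(),       # terminal
    TripState.CANCELLED: set(),       # terminal
}

def advance(current: TripState, target: TripState) -> TripState:
    """Move a trip to `target`, raising on an illegal transition."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target

state = advance(TripState.REQUESTED, TripState.MATCHED)
assert state is TripState.MATCHED
```

Guarding transitions like this is what keeps a double-delivered "trip completed" event from corrupting trip state, which matters given the strong-consistency requirement on trips.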
Phase 4: Deep Dives (20 minutes)
Interviewer: "Let's dive deeper. Tell me about the geospatial indexing system."
Deep Dive 1: H3 Geospatial Indexing (Week 1 Concepts)
The Problem
GEOSPATIAL CHALLENGE
Finding nearby drivers:
- "Find all drivers within 2km of this location"
- Naive approach: Check distance to ALL drivers
- With 1.5M online drivers: Too slow
Requirements:
- Sub-100ms lookups
- Works globally (not just one city)
- Handles drivers that are constantly moving
- Efficient for both sparse and dense areas
The Solution: H3 Hexagonal Grid
H3 HEXAGONAL GRID SYSTEM

WHY HEXAGONS?
─────────────
  Squares:   2 different neighbor distances (edge vs corner)
  Triangles: 3 different neighbor distances
  Hexagons:  1 uniform distance to ALL neighbors ✓

  Every hexagon has exactly six neighbors, and each neighbor's center
  is the same distance from the cell's own center. That makes
  distance-by-ring ("k-ring") queries uniform in every direction.

H3 RESOLUTION LEVELS
────────────────────
  Resolution 0:  ~4.25M km² avg cell  (continent scale)
  Resolution 7:  ~5.16 km²            (city district)
  Resolution 9:  ~0.105 km²           (a few street blocks) ← Uber typically uses
  Resolution 15: ~0.9 m²              (precise location)

HIERARCHICAL STRUCTURE
──────────────────────
  Each hexagon subdivides into ~7 child hexagons.
  Parent cells contain children (useful for aggregation).

  Resolution 7 (city)
  └── Resolution 8 (neighborhood)
      └── Resolution 9 (street block)
          └── Resolution 10 (building level)
Implementation
# geospatial/h3_location_index.py
"""
H3-Based Location Index for Uber

Uses Uber's open-source H3 hexagonal grid system for
efficient geospatial queries.
"""
from dataclasses import dataclass
from math import radians, sin, cos, sqrt, atan2
from typing import List, Set

import h3
import redis


@dataclass
class DriverLocation:
    driver_id: str
    latitude: float
    longitude: float
    timestamp: float
    ride_types: List[str]
    rating: float
    is_available: bool


class H3LocationIndex:
    """
    Location index using H3 hexagonal grid.

    Key insight: Convert 2D lat/lng to 1D H3 cell ID
    for O(1) lookups instead of O(n) distance calculations.

    Applies concepts:
    - Week 1: Partitioning strategies
    - Week 4: Caching for low-latency lookups
    """

    # Resolution 9: ~0.1 km² cells, good for urban areas
    DEFAULT_RESOLUTION = 9

    def __init__(self, redis_client: redis.Redis):
        # The client should be created with decode_responses=True so
        # hash/set values come back as str (the code below compares strings).
        self.redis = redis_client
        self.resolution = self.DEFAULT_RESOLUTION

    def update_driver_location(self, driver: DriverLocation) -> None:
        """
        Update driver location in the index.

        Called every 4 seconds per driver.
        Must be extremely fast (< 10ms).
        """
        # Convert lat/lng to H3 cell
        cell_id = h3.latlng_to_cell(
            driver.latitude,
            driver.longitude,
            self.resolution
        )

        # Remove from old cell (if moved)
        old_cell = self.redis.hget(f"driver:{driver.driver_id}", "cell")
        if old_cell and old_cell != cell_id:
            self.redis.srem(f"cell:{old_cell}", driver.driver_id)

        # Add to new cell
        self.redis.sadd(f"cell:{cell_id}", driver.driver_id)

        # Store driver details
        self.redis.hset(f"driver:{driver.driver_id}", mapping={
            "cell": cell_id,
            "lat": driver.latitude,
            "lng": driver.longitude,
            "ts": driver.timestamp,
            "available": int(driver.is_available),
            "rating": driver.rating
        })

        # Set TTL for automatic cleanup if driver goes offline
        self.redis.expire(f"driver:{driver.driver_id}", 30)

    def find_nearby_drivers(
        self,
        latitude: float,
        longitude: float,
        radius_km: float = 2.0,
        ride_type: str = "UberX",
        limit: int = 10
    ) -> List[DriverLocation]:
        """
        Find available drivers near a location.

        Uses H3 k-ring to get nearby cells, then fetches
        drivers from those cells.
        """
        # Get center cell
        center_cell = h3.latlng_to_cell(
            latitude, longitude, self.resolution
        )

        # Calculate k-ring size based on radius.
        # At resolution 9 (edge length ~174 m), neighboring cell centers
        # are roughly 0.3 km apart, so k rings cover about k * 0.3 km.
        k = int(radius_km / 0.3) + 1

        # Get all cells in radius (k-ring)
        nearby_cells = h3.grid_disk(center_cell, k)

        # Fetch drivers from all cells
        candidates = []
        for cell_id in nearby_cells:
            driver_ids = self.redis.smembers(f"cell:{cell_id}")
            for driver_id in driver_ids:
                driver_data = self.redis.hgetall(f"driver:{driver_id}")
                # "available" is stored as "0"/"1"; a plain truthiness
                # check would wrongly treat the string "0" as available.
                if driver_data and driver_data.get("available") == "1":
                    candidates.append(self._to_driver_location(
                        driver_id, driver_data
                    ))

        # Sort by distance and return top matches
        candidates.sort(
            key=lambda d: self._haversine_distance(
                latitude, longitude, d.latitude, d.longitude
            )
        )
        return candidates[:limit]

    def get_drivers_in_cell(self, cell_id: str) -> Set[str]:
        """Get all driver IDs in a specific H3 cell."""
        return self.redis.smembers(f"cell:{cell_id}")

    def _to_driver_location(
        self, driver_id: str, data: dict
    ) -> DriverLocation:
        """Rehydrate a DriverLocation from its Redis hash."""
        return DriverLocation(
            driver_id=driver_id,
            latitude=float(data["lat"]),
            longitude=float(data["lng"]),
            timestamp=float(data["ts"]),
            ride_types=[],  # ride types not stored in this sketch
            rating=float(data["rating"]),
            is_available=data["available"] == "1"
        )

    def _haversine_distance(
        self, lat1: float, lng1: float,
        lat2: float, lng2: float
    ) -> float:
        """Calculate great-circle distance between two points in km."""
        R = 6371  # Earth's radius in km
        lat1, lng1, lat2, lng2 = map(
            radians, [lat1, lng1, lat2, lng2]
        )
        dlat = lat2 - lat1
        dlng = lng2 - lng1
        a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlng / 2) ** 2
        c = 2 * atan2(sqrt(a), sqrt(1 - a))
        return R * c
Deep Dive 2: Dynamic Surge Pricing (Week 5 Concepts)
Interviewer: "How does surge pricing work?"
The Problem
SUPPLY-DEMAND IMBALANCE
Scenario: Friday night, 11 PM, downtown
- Rider requests: 500/minute
- Available drivers: 50
- Result: 10-minute wait times, unhappy users
Without surge pricing:
- Drivers don't know demand is high
- No incentive to drive to busy areas
- Riders wait forever or can't get a ride
With surge pricing:
- Higher prices attract more drivers
- Some riders defer (reducing demand)
- Market reaches equilibrium faster
The Solution: Real-Time Surge Calculation
SURGE PRICING ALGORITHM

INPUT SIGNALS
─────────────
  • Ride requests in H3 cell (last 5 minutes)
  • Available drivers in cell + neighbors
  • Historical demand patterns (time, day, events)
  • Current trip completion rate
  • Weather data

SURGE CALCULATION
─────────────────
  demand_supply_ratio = requests / available_drivers

  If ratio < 1.0  → surge = 1.0 (no surge)
  If ratio = 2.0  → surge ≈ 1.5x
  If ratio = 4.0  → surge ≈ 2.5x
  If ratio > 10.0 → surge = cap (e.g., 5.0x)

SMOOTHING
─────────
  • Don't change surge too rapidly (confusing for users)
  • Use exponential moving average
  • Gradually ramp up and down

VISUALIZATION
─────────────
  Picture the city map tiled with H3 cells, each colored by its own
  surge multiplier:

    1.0x = No surge     (gray)
    1.5x = Light surge  (yellow)
    2.5x = Medium surge (orange)
    3.0x = High surge   (red)

  Hot cells (downtown, stadiums) light up while neighboring cells a
  few blocks away can stay at 1.0x.
Implementation
# pricing/surge_service.py
"""
Dynamic Surge Pricing Service

Calculates real-time surge multipliers based on
supply-demand imbalance in each H3 cell.
"""
from dataclasses import dataclass
from datetime import datetime

import h3


@dataclass
class SurgeConfig:
    min_surge: float = 1.0
    max_surge: float = 5.0
    update_interval_seconds: int = 30
    smoothing_factor: float = 0.3  # For exponential moving average
    demand_window_minutes: int = 5


@dataclass
class CellSurge:
    cell_id: str
    surge_multiplier: float
    demand_count: int
    supply_count: int
    updated_at: datetime


class SurgePricingService:
    """
    Calculate surge pricing per H3 cell.

    Key principles:
    - Higher demand → higher surge
    - More drivers → lower surge
    - Smooth transitions (no sudden jumps)
    - Respect max cap (emergencies, PR concerns)

    Applies concepts:
    - Week 5: Eventual consistency (surge propagates)
    - Week 8: Real-time analytics pipeline
    """

    def __init__(
        self,
        config: SurgeConfig,
        location_index,
        demand_tracker,
        cache
    ):
        self.config = config
        self.location_index = location_index
        self.demand_tracker = demand_tracker
        self.cache = cache

        # Surge curve parameters (tuned via ML)
        self.surge_curve = [
            (1.0, 1.0),    # ratio 1.0 → surge 1.0
            (1.5, 1.25),   # ratio 1.5 → surge 1.25
            (2.0, 1.5),    # ratio 2.0 → surge 1.5
            (3.0, 2.0),    # ratio 3.0 → surge 2.0
            (5.0, 3.0),    # ratio 5.0 → surge 3.0
            (10.0, 5.0),   # ratio 10.0 → surge 5.0 (cap)
        ]

    def get_surge_for_location(
        self,
        latitude: float,
        longitude: float
    ) -> float:
        """Get current surge multiplier for a location."""
        cell_id = h3.latlng_to_cell(latitude, longitude, 9)

        # Check cache first
        cached = self.cache.get(f"surge:{cell_id}")
        if cached:
            return float(cached)

        # Calculate fresh surge
        surge = self._calculate_cell_surge(cell_id)

        # Cache for 30 seconds
        self.cache.setex(f"surge:{cell_id}", 30, surge)
        return surge

    def _calculate_cell_surge(self, cell_id: str) -> float:
        """Calculate surge for a specific H3 cell."""
        # Get demand (requests in last 5 minutes)
        demand = self.demand_tracker.get_request_count(
            cell_id,
            window_minutes=self.config.demand_window_minutes
        )

        # Get supply (available drivers in cell + neighbors)
        nearby_cells = h3.grid_disk(cell_id, 1)  # Cell + 6 neighbors
        supply = sum(
            len(self.location_index.get_drivers_in_cell(c))
            for c in nearby_cells
        )

        # Prevent division by zero
        if supply == 0:
            return self.config.max_surge if demand > 0 else self.config.min_surge

        # Calculate demand/supply ratio
        ratio = demand / supply

        # Map ratio to surge using curve
        raw_surge = self._ratio_to_surge(ratio)

        # Apply smoothing (don't change too fast)
        previous_surge = self.cache.get(f"surge:{cell_id}")
        if previous_surge:
            raw_surge = self._smooth_surge(
                float(previous_surge),
                raw_surge
            )

        # Clamp to bounds
        return max(
            self.config.min_surge,
            min(self.config.max_surge, raw_surge)
        )

    def _ratio_to_surge(self, ratio: float) -> float:
        """Convert demand/supply ratio to surge multiplier."""
        # Below the first curve point: supply covers demand, no surge.
        # (Without this check, sub-1.0 ratios would fall through the
        # loop and wrongly return the cap.)
        if ratio <= self.surge_curve[0][0]:
            return self.surge_curve[0][1]

        # Linear interpolation between curve points
        for i in range(len(self.surge_curve) - 1):
            r1, s1 = self.surge_curve[i]
            r2, s2 = self.surge_curve[i + 1]
            if r1 <= ratio <= r2:
                t = (ratio - r1) / (r2 - r1)
                return s1 + t * (s2 - s1)

        # Above max ratio
        return self.surge_curve[-1][1]

    def _smooth_surge(
        self,
        previous: float,
        current: float
    ) -> float:
        """Apply exponential moving average for smooth transitions."""
        alpha = self.config.smoothing_factor
        return alpha * current + (1 - alpha) * previous

    def calculate_fare(
        self,
        distance_km: float,
        duration_minutes: float,
        surge: float,
        ride_type: str = "UberX"
    ) -> float:
        """Calculate total fare with surge."""
        # Base rates (vary by city/ride type)
        rates = {
            "UberX": {"base": 2.50, "per_km": 1.20, "per_min": 0.25},
            "UberXL": {"base": 4.00, "per_km": 1.80, "per_min": 0.35},
            "UberBlack": {"base": 8.00, "per_km": 3.00, "per_min": 0.50},
        }
        rate = rates.get(ride_type, rates["UberX"])

        base_fare = (
            rate["base"] +
            rate["per_km"] * distance_km +
            rate["per_min"] * duration_minutes
        )

        # Apply surge
        surged_fare = base_fare * surge

        # Minimum fare
        return max(surged_fare, 5.00)
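The two numeric pieces of the surge service, the piecewise-linear ratio-to-surge curve and the EMA smoothing, are easy to verify in isolation. A standalone version using the same illustrative curve points as the service above:

```python
# Standalone ratio->surge mapping and EMA smoothing, mirroring
# SurgePricingService's curve (values are illustrative, tuned via ML
# in a real deployment).
SURGE_CURVE = [(1.0, 1.0), (1.5, 1.25), (2.0, 1.5),
               (3.0, 2.0), (5.0, 3.0), (10.0, 5.0)]

def ratio_to_surge(ratio: float) -> float:
    if ratio <= SURGE_CURVE[0][0]:
        return SURGE_CURVE[0][1]              # supply covers demand: no surge
    for (r1, s1), (r2, s2) in zip(SURGE_CURVE, SURGE_CURVE[1:]):
        if r1 <= ratio <= r2:
            t = (ratio - r1) / (r2 - r1)      # linear interpolation
            return s1 + t * (s2 - s1)
    return SURGE_CURVE[-1][1]                 # above max ratio: capped

def smooth(previous: float, current: float, alpha: float = 0.3) -> float:
    """Exponential moving average: step only 30% toward the new value."""
    return alpha * current + (1 - alpha) * previous

assert ratio_to_surge(0.5) == 1.0    # more drivers than requests
assert ratio_to_surge(2.0) == 1.5    # exact curve point
assert ratio_to_surge(2.5) == 1.75   # halfway between 1.5x and 2.0x
assert ratio_to_surge(50.0) == 5.0   # capped
assert abs(smooth(1.0, 2.0) - 1.3) < 1e-9
```

The interpolation keeps surge continuous in the demand/supply ratio, so a single extra request never causes a price cliff; the EMA keeps it continuous in time.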
Deep Dive 3: Matching Algorithm (Week 1 Concepts)
Interviewer: "How do you match riders with the optimal driver?"
The Problem
MATCHING CHALLENGE
Given:
- 1 rider requesting a ride
- 50 nearby available drivers
Find the "best" match considering:
- Distance (affects ETA and driver cost)
- Driver rating
- Driver's acceptance rate
- Ride type compatibility
- Fairness (drivers waiting longer get priority)
The Solution: Weighted Scoring with Dispatch
MATCHING ALGORITHM

STEP 1: CANDIDATE SELECTION
───────────────────────────
  Query H3 index for drivers within radius
  Filter: available, correct ride type, not already pinged

STEP 2: SCORING
───────────────
  For each candidate, calculate:

    score = w1 × ETA_score +
            w2 × rating_score +
            w3 × acceptance_score +
            w4 × wait_time_score

  Where:
  • ETA_score        = 1 - (ETA / max_ETA)          [closer is better]
  • rating_score     = (rating - 4.0) / 1.0         [higher is better]
  • acceptance_score = acceptance_rate              [higher is better]
  • wait_time_score  = min(wait_minutes / 30, 1)    [longer wait = priority]

STEP 3: DISPATCH
────────────────
  Send request to highest-scoring driver
  Wait up to 15 seconds for response
  If declined/timeout → next best driver

OPTIMIZATION: BATCH MATCHING
────────────────────────────
  For high-volume periods:
  • Collect requests for 2-second windows
  • Solve assignment problem (Hungarian algorithm)
  • Globally optimize rider-driver pairs
Implementation
# matching/dispatch_service.py
"""
Rider-Driver Matching Service

Finds optimal driver for each ride request using
weighted scoring and ETA estimation.
"""
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class MatchResult(Enum):
    MATCHED = "matched"
    NO_DRIVERS = "no_drivers"
    ALL_DECLINED = "all_declined"
    TIMEOUT = "timeout"


@dataclass
class MatchRequest:
    request_id: str
    rider_id: str
    pickup_lat: float
    pickup_lng: float
    destination_lat: float
    destination_lng: float
    ride_type: str


@dataclass
class DriverCandidate:
    driver_id: str
    latitude: float
    longitude: float
    rating: float
    acceptance_rate: float
    wait_time_seconds: int
    eta_seconds: int
    score: float = 0.0


class MatchingService:
    """
    Match riders with optimal drivers.

    Uses weighted scoring to balance multiple factors:
    - ETA (user experience)
    - Driver rating (quality)
    - Acceptance rate (reliability)
    - Wait time (fairness to drivers)

    Applies concepts:
    - Week 1: Partitioning (H3-based driver lookup)
    - Week 2: Timeouts and retries
    """

    # Scoring weights (tuned via ML/experimentation)
    WEIGHTS = {
        "eta": 0.4,
        "rating": 0.2,
        "acceptance": 0.2,
        "wait_time": 0.2
    }

    DRIVER_RESPONSE_TIMEOUT = 15  # seconds
    MAX_CANDIDATES = 10

    def __init__(
        self,
        location_index,
        eta_service,
        driver_service,
        notification_service
    ):
        self.location_index = location_index
        self.eta_service = eta_service
        self.driver_service = driver_service
        self.notifications = notification_service

    async def find_match(
        self,
        request: MatchRequest
    ) -> Tuple[MatchResult, Optional[str]]:
        """
        Find and dispatch to optimal driver.

        Returns (result, driver_id or None)
        """
        # Step 1: Find nearby candidates
        candidates = await self._get_candidates(request)
        if not candidates:
            return (MatchResult.NO_DRIVERS, None)

        # Step 2: Score and rank candidates
        scored = await self._score_candidates(request, candidates)
        scored.sort(key=lambda c: c.score, reverse=True)

        # Step 3: Dispatch to drivers in order
        for candidate in scored[:self.MAX_CANDIDATES]:
            accepted = await self._dispatch_to_driver(
                request, candidate
            )
            if accepted:
                return (MatchResult.MATCHED, candidate.driver_id)

        return (MatchResult.ALL_DECLINED, None)

    async def _get_candidates(
        self,
        request: MatchRequest
    ) -> List[DriverCandidate]:
        """Get available drivers near pickup location."""
        drivers = self.location_index.find_nearby_drivers(
            request.pickup_lat,
            request.pickup_lng,
            radius_km=3.0,
            ride_type=request.ride_type,
            limit=50
        )

        candidates = []
        for driver in drivers:
            if not driver.is_available:
                continue

            # Get driver stats
            stats = await self.driver_service.get_stats(driver.driver_id)

            # Calculate ETA
            eta = await self.eta_service.calculate(
                driver.latitude, driver.longitude,
                request.pickup_lat, request.pickup_lng
            )

            candidates.append(DriverCandidate(
                driver_id=driver.driver_id,
                latitude=driver.latitude,
                longitude=driver.longitude,
                rating=driver.rating,
                acceptance_rate=stats.acceptance_rate,
                wait_time_seconds=stats.wait_time_seconds,
                eta_seconds=eta
            ))
        return candidates

    async def _score_candidates(
        self,
        request: MatchRequest,
        candidates: List[DriverCandidate]
    ) -> List[DriverCandidate]:
        """Calculate match score for each candidate."""
        max_eta = max(c.eta_seconds for c in candidates) or 1

        for candidate in candidates:
            # Normalize scores to 0-1 range
            eta_score = 1 - (candidate.eta_seconds / max_eta)
            rating_score = (candidate.rating - 4.0) / 1.0
            acceptance_score = candidate.acceptance_rate
            wait_score = min(candidate.wait_time_seconds / 1800, 1.0)

            # Weighted sum
            candidate.score = (
                self.WEIGHTS["eta"] * eta_score +
                self.WEIGHTS["rating"] * max(0, rating_score) +
                self.WEIGHTS["acceptance"] * acceptance_score +
                self.WEIGHTS["wait_time"] * wait_score
            )
        return candidates

    async def _dispatch_to_driver(
        self,
        request: MatchRequest,
        candidate: DriverCandidate
    ) -> bool:
        """Send ride request to driver and wait for response."""
        # Mark driver as unavailable (prevent double-dispatch)
        await self.driver_service.set_pending(candidate.driver_id)

        accepted = False
        try:
            # Send push notification
            await self.notifications.send_ride_request(
                driver_id=candidate.driver_id,
                request_id=request.request_id,
                pickup_address="123 Main St",  # Would resolve address
                eta_seconds=candidate.eta_seconds
            )

            # Wait for response with timeout
            response = await asyncio.wait_for(
                self.driver_service.wait_for_response(
                    candidate.driver_id,
                    request.request_id
                ),
                timeout=self.DRIVER_RESPONSE_TIMEOUT
            )
            accepted = (response == "accepted")
            return accepted
        except asyncio.TimeoutError:
            return False
        finally:
            # Release the driver only if the match didn't go through;
            # an accepted driver stays locked to this trip
            if not accepted:
                await self.driver_service.clear_pending(candidate.driver_id)
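The batch-matching optimization mentioned earlier is an assignment problem: pick one driver per rider so the total pickup ETA is minimized. Production systems use the Hungarian algorithm (O(n³)); for a toy batch the same objective can be shown by brute force over permutations. The ETA matrix below is made up for illustration:

```python
# Brute-force assignment over a small batch, showing why global
# optimization beats greedy per-rider matching.
from itertools import permutations

def best_assignment(eta):
    """eta[r][d] = pickup ETA (minutes) if rider r gets driver d.
    Returns (driver index per rider, total ETA) minimizing the sum."""
    n = len(eta)
    best, best_cost = None, float("inf")
    for perm in permutations(range(n)):       # one driver per rider
        cost = sum(eta[r][perm[r]] for r in range(n))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost

# 3 riders x 3 drivers. Greedy matching (each rider grabs the closest
# remaining driver in turn) yields 2 + 7 + 4 = 13 total minutes; the
# global optimum reshuffles to 3 + 3 + 4 = 10.
eta = [
    [2, 3, 8],
    [3, 7, 9],
    [7, 8, 4],
]
assignment, total = best_assignment(eta)
assert total == 10
assert assignment == (1, 0, 2)
```

Collecting requests into 2-second windows trades a little per-rider latency for a globally better set of matches, which is exactly the trade the box above describes.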
Deep Dive 4: ETA Prediction (Week 8 Concepts)
Interviewer: "How do you calculate accurate ETAs?"
The Problem
ETA CHALLENGES
Simple distance/speed calculation fails because:
- Traffic varies by time of day
- Accidents, road closures
- Weather affects speed
- One-way streets
- Pickup location accessibility (building entrance vs street)
- Historical patterns matter
Uber's DeepETA model:
- 95% accuracy within 3 minutes
- Uses ML to combine multiple signals
Implementation
# routing/eta_service.py
"""
ETA Prediction Service

Combines routing engine with ML model for accurate
arrival time predictions.
"""
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ETAResult:
    eta_seconds: int
    distance_meters: int
    confidence: float
    route_polyline: str


class ETAService:
    """
    Calculate estimated time of arrival.

    Combines:
    - Graph-based shortest path (baseline)
    - Real-time traffic data
    - ML model adjustments (DeepETA)
    - Historical patterns

    Applies concepts:
    - Week 8: ML pipeline integration
    - Week 4: Caching for repeated queries
    """

    def __init__(
        self,
        routing_engine,
        traffic_service,
        ml_model,
        cache
    ):
        self.routing = routing_engine
        self.traffic = traffic_service
        self.model = ml_model
        self.cache = cache

    async def calculate(
        self,
        origin_lat: float,
        origin_lng: float,
        dest_lat: float,
        dest_lng: float,
        departure_time: Optional[datetime] = None
    ) -> ETAResult:
        """Calculate ETA from origin to destination."""
        departure_time = departure_time or datetime.utcnow()

        # Check cache for recent identical query
        cache_key = self._make_cache_key(
            origin_lat, origin_lng, dest_lat, dest_lng
        )
        cached = self.cache.get(cache_key)
        if cached:
            return cached

        # Step 1: Get baseline route from routing engine
        route = await self.routing.get_route(
            origin_lat, origin_lng,
            dest_lat, dest_lng
        )

        # Step 2: Get current traffic conditions
        traffic_factor = await self.traffic.get_factor(
            route.segments,
            departure_time
        )

        # Step 3: ML model adjustment
        features = self._extract_features(
            route, traffic_factor, departure_time
        )
        ml_adjustment = self.model.predict(features)

        # Step 4: Combine estimates
        base_eta = route.duration_seconds
        traffic_eta = base_eta * traffic_factor
        final_eta = int(traffic_eta * ml_adjustment)

        result = ETAResult(
            eta_seconds=final_eta,
            distance_meters=route.distance_meters,
            confidence=0.95 if traffic_factor < 1.5 else 0.85,
            route_polyline=route.polyline
        )

        # Cache for 60 seconds (the cache layer is assumed to
        # serialize the dataclass, e.g. as JSON)
        self.cache.setex(cache_key, 60, result)
        return result

    def _make_cache_key(
        self,
        origin_lat: float, origin_lng: float,
        dest_lat: float, dest_lng: float
    ) -> str:
        """Round coordinates so nearby queries share a cache entry."""
        return (
            f"eta:{origin_lat:.4f}:{origin_lng:.4f}:"
            f"{dest_lat:.4f}:{dest_lng:.4f}"
        )

    def _extract_features(
        self,
        route,
        traffic_factor: float,
        departure_time: datetime
    ) -> dict:
        """Extract features for ML model."""
        return {
            "distance_km": route.distance_meters / 1000,
            "base_duration_min": route.duration_seconds / 60,
            "traffic_factor": traffic_factor,
            "hour_of_day": departure_time.hour,
            "day_of_week": departure_time.weekday(),
            "is_rush_hour": self._is_rush_hour(departure_time),
            "num_turns": route.num_turns,
            "num_traffic_lights": route.num_traffic_lights,
            "highway_fraction": route.highway_fraction,
        }

    def _is_rush_hour(self, dt: datetime) -> bool:
        """Check if time is during typical rush hour."""
        hour = dt.hour
        is_weekday = dt.weekday() < 5
        morning_rush = 7 <= hour <= 9
        evening_rush = 17 <= hour <= 19
        return is_weekday and (morning_rush or evening_rush)
Phase 5: Scaling and Edge Cases
Scaling Strategies
SCALING STRATEGIES

GEOGRAPHIC SHARDING
───────────────────
  • Each city/region is a shard
  • Location data stays local (low latency)
  • Cross-region trips handled specially

EVENT-DRIVEN ARCHITECTURE
─────────────────────────
  • All state changes → Kafka events
  • Services react to events asynchronously
  • Enables real-time analytics pipeline

CELL-BASED LOAD BALANCING
─────────────────────────
  • Hot H3 cells (downtown) get dedicated capacity
  • Cold cells share resources
  • Dynamic rebalancing based on demand

HYBRID CLOUD
────────────
  • Core services in Uber data centers
  • Burst capacity in cloud (GCP, AWS)
  • Multi-region active-active
Edge Cases
EDGE CASE 1: Driver Goes Offline Mid-Match
Scenario: Driver selected, then loses connectivity
Handling:
- 15-second timeout for response
- Automatic retry with next best driver
- Driver marked unavailable after 30s silence
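The timeout-and-failover loop can be sketched as follows. The `send_offer` hook, the `Driver` class, and the return convention (True/False/None for accepted/declined/no response) are assumptions for illustration:

```python
import time

RESPONSE_TIMEOUT_S = 15   # driver must accept within 15 seconds

class Driver:
    def __init__(self, driver_id):
        self.driver_id = driver_id
        self.unresponsive_since = None

def dispatch(ranked_drivers, send_offer, now=time.monotonic):
    """Offer the trip to drivers in ranked order. `send_offer(driver,
    timeout)` is an assumed hook returning True (accepted), False
    (declined), or None (no response before the timeout)."""
    for driver in ranked_drivers:
        response = send_offer(driver, RESPONSE_TIMEOUT_S)
        if response is True:
            return driver                      # matched
        if response is None:
            driver.unresponsive_since = now()  # start the 30s offline clock
    return None                                # no taker: widen the search

# Simulated offers: first driver silent, second accepts
drivers = [Driver("d1"), Driver("d2")]
matched = dispatch(drivers, lambda d, t: True if d.driver_id == "d2" else None)
print(matched.driver_id)  # d2
```

A background sweeper would then mark any driver whose `unresponsive_since` exceeds 30 seconds as unavailable, so they stop appearing in candidate lists.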
EDGE CASE 2: Mass Event Ending (Concert, Sports)
Scenario: 50,000 people request rides simultaneously
Handling:
- Surge pricing activates (attracts drivers)
- Virtual queue system for fairness
- Pre-positioning drivers before event ends
- Predictive scaling based on event schedule
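The virtual queue is essentially a FIFO that decouples request arrival from matching, so a 50,000-person spike is drained in order instead of thundering the matcher. A minimal sketch (class and method names are mine):

```python
from collections import deque

class VirtualQueue:
    """FIFO queue for demand spikes: riders join and are matched in
    arrival order as drivers free up, which is the fairness guarantee."""
    def __init__(self):
        self._waiting = deque()

    def join(self, rider_id: str) -> int:
        self._waiting.append(rider_id)
        return len(self._waiting)      # queue position shown to the rider

    def match_next(self, driver_id: str):
        if not self._waiting:
            return None
        return (self._waiting.popleft(), driver_id)

q = VirtualQueue()
q.join("rider-a"); q.join("rider-b")
print(q.match_next("driver-1"))   # ('rider-a', 'driver-1'), FIFO order
```

Showing the rider their queue position also manages expectations, which matters as much as the matching itself during a mass event.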
EDGE CASE 3: GPS Drift/Inaccuracy
Scenario: Driver appears in wrong H3 cell
Handling:
- Kalman filter smooths location jumps
- Query neighboring cells (k-ring)
- Fall back to larger radius search
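The k-ring fallback is an expanding search: query the rider's own cell first, then successively larger rings until enough drivers are found. A sketch with a toy one-dimensional grid standing in for H3's `grid_ring` (the real system would call the h3 library; `toy_ring` is purely illustrative):

```python
def expanding_search(origin, drivers_by_cell, ring, max_k=3, want=5):
    """Query the origin cell first, then rings of increasing radius k,
    until `want` drivers are found or max_k is exhausted. `ring(cell, k)`
    stands in for H3's grid_ring (cells exactly k steps away)."""
    found = []
    for k in range(max_k + 1):
        for cell in ring(origin, k):
            found.extend(drivers_by_cell.get(cell, []))
        if len(found) >= want:
            break
    return found[:want]

def toy_ring(cell, k):
    # Toy 1-D grid: integer cells, neighbours sit exactly k steps away
    return [cell] if k == 0 else [cell - k, cell + k]

drivers = {0: ["d1"], 2: ["d2", "d3"]}
print(expanding_search(0, drivers, toy_ring, max_k=3, want=3))  # ['d1', 'd2', 'd3']
```

Because rings at distance k are queried only once, a driver placed in the wrong cell by GPS drift is still picked up on the next iteration instead of being missed entirely.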
EDGE CASE 4: Payment Failure
Scenario: Card declined after trip completion
Handling:
- Retry with backup payment method
- Grace period before blocking rider
- Driver always gets paid (Uber absorbs risk)
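The retry-with-backup behavior can be sketched as a simple ordered walk over the rider's stored payment methods. The `charge` gateway hook is an assumption; driver payout is intentionally outside this loop, matching the "driver always gets paid" rule:

```python
def settle_fare(fare_cents, payment_methods, charge):
    """Attempt collection against each stored method in order.
    `charge(method, amount)` is an assumed gateway hook returning True on
    success. Driver payout is decoupled and proceeds either way."""
    for method in payment_methods:
        if charge(method, fare_cents):
            return method        # collected from this method
    return None                  # all failed: enter grace period / dunning

# Primary card declines, backup card succeeds
used = settle_fare(1850, ["visa-1234", "mc-9876"],
                   lambda method, amount: method == "mc-9876")
print(used)  # mc-9876
```

A `None` result would kick off the grace-period flow (retry later, then restrict new trips) rather than clawing anything back from the driver.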
Phase 6: Monitoring and Operations
Uber Monitoring Dashboard

Marketplace health:
- Trips completing/minute: ~21,000 (current: 20,834)
- Average ETA accuracy: 94.7%
- Match success rate: 98.2%
- Average surge: 1.15x

Latency:
- Match time (p50): 1.8s (on target)
- Match time (p99): 4.2s (warning)
- Location update processing: 45ms (healthy)
- ETA calculation: 180ms (healthy)

Supply-demand:
- Online drivers: 1.2M
- Active trips: 890K
- Pending requests: 12K
- Surge zones (>1.5x): 847

Infrastructure:
- Kafka lag: < 100ms (healthy)
- Redis hit rate: 99.2% (healthy)
- API error rate: 0.02% (healthy)
- WebSocket connections: 45M
Alerting Strategy
CRITICAL (PagerDuty):
- Match success rate < 90%
- Trip completion rate < 95%
- Location service latency > 500ms
- Payment processing failures > 1%
WARNING (Slack):
- Surge zones > 2000 (unusual demand)
- Driver online rate dropping > 10%
- ETA accuracy < 90%
- Kafka consumer lag > 1 minute
INFO (Dashboard):
- Regional traffic patterns
- Driver earnings distribution
- New city launch metrics
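The tiered thresholds above map naturally to a small rule table evaluated per metric. A minimal sketch (metric names and the rule-table layout are illustrative, the thresholds mirror the strategy listed above):

```python
# Illustrative thresholds mirroring the alerting strategy above
CRITICAL_RULES = {
    "match_success_rate": lambda v: v < 0.90,
    "trip_completion_rate": lambda v: v < 0.95,
    "location_latency_ms": lambda v: v > 500,
    "payment_failure_rate": lambda v: v > 0.01,
}
WARNING_RULES = {
    "surge_zones": lambda v: v > 2000,
    "eta_accuracy": lambda v: v < 0.90,
    "kafka_lag_s": lambda v: v > 60,
}

def alert_level(metric: str, value: float) -> str:
    """Classify a metric reading into the three alerting tiers."""
    if metric in CRITICAL_RULES and CRITICAL_RULES[metric](value):
        return "CRITICAL"   # page on-call via PagerDuty
    if metric in WARNING_RULES and WARNING_RULES[metric](value):
        return "WARNING"    # post to Slack
    return "INFO"           # dashboard only

print(alert_level("match_success_rate", 0.88))  # CRITICAL
print(alert_level("surge_zones", 2500))         # WARNING
```

Keeping the rules as data rather than code makes it easy to tune thresholds per region without a deploy.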
Interview Conclusion
Interviewer: "Excellent work. You've covered the geospatial system, pricing, and matching really well. Any questions?"
You: "Thank you! I'm curious about how Uber handles the transition when a driver crosses city boundariesβdoes the surge pricing change mid-trip?"
Interviewer: "Great question. The fare is locked at request time based on pickup location. If surge changes during the trip, the rider pays what was quoted upfront. We introduced upfront pricing specifically to avoid surprises."
Summary: Concepts Applied
Week 1 (Data at Scale):
- H3 partitioning: geospatial sharding by hexagon
- Geographic sharding: city-level data isolation
- Hot key handling: high-demand cells get dedicated capacity

Week 2 (Failure-First Design):
- Timeouts: 15s driver response window
- Retries: automatic failover to the next driver
- Circuit breakers: protect downstream services

Week 3 (Messaging & Async):
- Kafka: trillions of events per day
- Event-driven: all state changes published as events
- Real-time processing: Flink for surge calculation

Week 4 (Caching):
- Redis: driver locations for sub-ms lookups
- ETA cache: avoid redundant routing calculations
- Surge cache: 30-second TTL for stability

Week 5 (Consistency):
- Strong: trip state (Spanner)
- Eventual: driver locations, surge pricing
- Optimistic locking: prevent double-dispatch

Week 8 (Analytics Pipeline):
- Real-time: surge pricing, fraud detection
- Batch: ML model training, demand forecasting
- ML integration: DeepETA, demand prediction

Week 10 (Production Readiness):
- SLOs: < 3s match, 95% ETA accuracy
- Multi-region: active-active deployment
- Observability: real-time marketplace health
Why Uber Is an Engineering Marvel

Real-time scale:
- 30 million trips matched daily
- Millions of location updates per second
- Sub-3-second matching latency

Innovation:
- H3: open-sourced hexagonal geospatial indexing system
- DeepETA: ML-based ETA model (~95% accuracy)
- Dynamic pricing: real-time market equilibrium

Global complexity:
- 10,000+ cities with local regulations
- Multiple products: Rides, Eats, Freight
- 7.4 million independent driver-partners

Data platform:
- 200,000+ data pipelines
- Petabytes processed daily
- Thousands of ML models in production
Sources
Official Uber Engineering:
- Uber Engineering Blog: https://www.uber.com/blog/engineering/
- H3 Geospatial Indexing: https://www.uber.com/blog/h3/
- Fulfillment Platform Re-architecture: https://www.uber.com/blog/fulfillment-platform-rearchitecture/
- Tech Stack Foundation: https://www.uber.com/blog/tech-stack-part-one-foundation/
H3 Documentation:
- H3 GitHub: https://github.com/uber/h3
- H3 Documentation: https://h3geo.org/
Architecture:
- Astronomer - Uber's 200K Data Pipelines: https://www.astronomer.io/blog/airflow-in-action-uber/
- ByteByteGo - Uber System Design: https://blog.bytebytego.com/
Pricing:
- Uber Marketplace Surge Pricing: https://www.uber.com/us/en/marketplace/pricing/surge-pricing/
- Uber Dynamic Pricing: https://www.uber.com/en-GB/blog/uber-dynamic-pricing/
Self-Assessment Checklist
After studying this case study, you should be able to:
- Explain why hexagons are superior to squares for geospatial grids
- Design an H3-based location index with O(1) lookups
- Implement a surge pricing algorithm with demand-supply balancing
- Design a driver-rider matching system with weighted scoring
- Calculate ETAs combining routing, traffic, and ML models
- Handle edge cases: GPS drift, mass events, driver cancellations
- Design for global scale with geographic sharding
- Implement real-time location tracking with WebSockets
- Balance consistency needs (strong for trips, eventual for locations)
- Build observability for a real-time marketplace
- Articulate trade-offs in two-sided marketplace design
This case study covers Uber's architecture for matching millions of riders with drivers in real time across 10,000+ cities worldwide.