Himanshu Kukreja

Week 1 — Day 5: Session Store Design

System Design Mastery Series


Preface

This is the capstone session for Week 1. Over the past four days, you've learned:

  • Day 1: Partitioning — how to split data across machines
  • Day 2: Replication — how to copy data for availability and scale
  • Day 3: Rate Limiting — how to protect systems from overload
  • Day 4: Hot Keys — how to handle skewed traffic patterns

Today, we tie it all together by designing a complete system from scratch: a session store for 10 million concurrent users. This is a real system that every large-scale application needs, and it exercises every concept from this week.

By the end of this session, you'll not only have a complete session store design—you'll have a template for approaching any storage system design.

Let's begin.


Part I: Foundations

Chapter 1: Understanding Session Management

1.1 What Is a Session?

A session is a temporary, server-side store for user-specific data that persists across multiple requests. When a user logs in to a web application, the server creates a session containing:

  • Authentication state (user ID, roles, permissions)
  • User preferences (language, timezone)
  • Shopping cart contents
  • CSRF tokens
  • Temporary workflow state

The client receives a session ID (usually in a cookie), which it sends with every request. The server uses this ID to retrieve the session data.

Request 1 (Login):
  Client → Server: Credentials
  Server: Create session {user_id: 123, roles: ['user']}
  Server → Client: Set-Cookie: session_id=abc123

Request 2 (Protected page):
  Client → Server: Cookie: session_id=abc123
  Server: Look up abc123 → {user_id: 123, roles: ['user']}
  Server: User is authenticated, serve page
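
Here's a minimal sketch of that flow, using an in-memory dict in place of a real store (illustrative only):

import secrets

SESSIONS: dict[str, dict] = {}   # stand-in for a real session store

def login(user_id: int) -> str:
    """Request 1: create a session; the ID goes back as the cookie value."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = {"user_id": user_id, "roles": ["user"]}
    return session_id            # sent as Set-Cookie: session_id=...

def handle_request(session_id: str) -> str:
    """Request 2: resolve the cookie back to user state."""
    session = SESSIONS.get(session_id)
    if session is None:
        return "401: please log in"
    return f"200: hello user {session['user_id']}"

sid = login(123)
print(handle_request(sid))       # 200: hello user 123
print(handle_request("bogus"))   # 401: please log in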

1.2 Session Store Requirements

A production session store must satisfy:

Requirement  | Target           | Why It Matters
------------ | ---------------- | ----------------------------------------------------------------------
Latency      | < 5 ms P99       | Every request reads the session; it can't add latency
Availability | 99.99%           | Session unavailable = user logged out
Durability   | Best effort      | Lost session = user re-authenticates (inconvenient, not catastrophic)
Consistency  | Read-your-writes | Users must see their own changes
Scale        | 10M concurrent   | Enterprise SaaS, consumer apps, gaming

1.3 Scale Analysis

Let's understand what "10M concurrent users" means:

Session characteristics:

  • Average session size: 2 KB (JSON with user data, preferences, tokens)
  • Session TTL: 24 hours (extended on activity)
  • Active sessions: 10 million

Storage requirement:

10M sessions × 2 KB = 20 GB

Not huge—fits in memory of a few servers.

Request pattern:

  • Each active user makes ~1 request per second (average)
  • Peak: 10M requests/second (unlikely—users aren't perfectly synchronized)
  • Realistic peak: 2-3M requests/second (20-30% concurrently active)

Operation breakdown:

  • 80% reads (session validation on each request)
  • 15% updates (refresh TTL, update preferences)
  • 5% creates/deletes (login/logout)

This is a read-heavy workload with moderate writes.
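
The arithmetic, as a quick script you can adapt (the 25% activity figure is an assumption within the 20-30% range above):

# Back-of-envelope capacity math from the numbers above
sessions = 10_000_000
session_size_kb = 2
storage_gb = sessions * session_size_kb / 1_000_000    # 20 GB

active_fraction = 0.25                                  # assume 25% concurrently active
peak_rps = int(sessions * active_fraction)              # ~2.5M requests/sec

reads   = int(peak_rps * 0.80)   # session validation
updates = int(peak_rps * 0.15)   # TTL refresh, preference changes
churn   = int(peak_rps * 0.05)   # login/logout

print(f"{storage_gb:.0f} GB storage, {peak_rps:,} peak rps")
print(f"{reads:,} reads, {updates:,} updates, {churn:,} creates/deletes")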


Chapter 2: Architecture Approaches

There are three fundamental approaches to session management. Each has distinct trade-offs.

2.1 Approach 1: Sticky Sessions (Server Affinity)

Route all requests from one user to the same server. Store sessions in that server's memory.

                    Load Balancer
                    (sticky routing)
                         │
         ┌───────────────┼───────────────┐
         │               │               │
         ▼               ▼               ▼
    ┌─────────┐     ┌─────────┐     ┌─────────┐
    │Server 1 │     │Server 2 │     │Server 3 │
    │Sessions:│     │Sessions:│     │Sessions:│
    │ A, D, G │     │ B, E, H │     │ C, F, I │
    └─────────┘     └─────────┘     └─────────┘

How it works:

  • Load balancer hashes user identifier (IP, cookie) to server
  • All requests from that user go to same server
  • Session stored in server's local memory

Strengths:

  • ✅ No external dependencies
  • ✅ Fastest reads (local memory)
  • ✅ Simplest implementation

Weaknesses:

  • ❌ Server failure loses all its sessions (users logged out)
  • ❌ Uneven load distribution (some users heavier than others)
  • ❌ Scaling requires session migration
  • ❌ Can't do rolling deployments without session loss

When to use: Small scale, simple applications, sessions are easily recreated.

2.2 Approach 2: Distributed Session Store

Store sessions in a dedicated, distributed storage system (Redis, Memcached, DynamoDB).

                    Load Balancer
                    (any server)
                         │
         ┌───────────────┼───────────────┐
         │               │               │
         ▼               ▼               ▼
    ┌─────────┐     ┌─────────┐     ┌─────────┐
    │Server 1 │     │Server 2 │     │Server 3 │
    └────┬────┘     └────┬────┘     └────┬────┘
         │               │               │
         └───────────────┼───────────────┘
                         │
                         ▼
              ┌─────────────────────┐
              │   Session Store     │
              │   (Redis Cluster)   │
              └─────────────────────┘

How it works:

  • Sessions stored in external distributed store
  • Any server can handle any request
  • Session ID → lookup in store

Strengths:

  • ✅ Server failures don't lose sessions
  • ✅ Easy horizontal scaling
  • ✅ Rolling deployments are trivial
  • ✅ Load balancing is simple (round-robin)

Weaknesses:

  • ❌ Network hop for every request (+1-2ms)
  • ❌ Session store is a new dependency (availability concerns)
  • ❌ More complex infrastructure

When to use: Most production applications. This is the standard approach.

2.3 Approach 3: Client-Side Sessions (JWT/Encrypted Cookies)

Store session data in signed/encrypted cookies. No server-side storage.

Request:
  Cookie: session=eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyX2lkIjoxMjN9.signature
  
Server decodes: {user_id: 123, roles: ['user'], exp: 1234567890}

How it works:

  • Session data encoded in JWT or encrypted cookie
  • Server validates signature/decrypts on each request
  • No storage needed

Strengths:

  • ✅ No storage infrastructure
  • ✅ Infinite horizontal scaling
  • ✅ Works across datacenters without replication

Weaknesses:

  • ❌ Can't invalidate sessions (until expiry)
  • ❌ Session size limited by cookie size (4KB)
  • ❌ Every request sends full session (bandwidth)
  • ❌ Secret key rotation is complex

When to use: Stateless APIs, microservices auth, when you can't do server-side storage.
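
A minimal sketch of this approach using the PyJWT library (the secret and claim names are illustrative):

import time
import jwt  # PyJWT

SECRET = "load-from-a-secrets-manager"  # illustrative placeholder

def issue_token(user_id: int, roles: list[str], ttl: int = 3600) -> str:
    claims = {"user_id": user_id, "roles": roles, "exp": int(time.time()) + ttl}
    return jwt.encode(claims, SECRET, algorithm="HS256")

def verify_token(token: str) -> dict | None:
    try:
        # decode() verifies the signature and rejects expired tokens
        return jwt.decode(token, SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError:  # covers bad signatures and expiry
        return None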

2.4 Hybrid: Distributed Store + Client Cache

Combine approaches for the best of both worlds:

Request flow:
1. Check encrypted cookie for basic auth (user_id, roles)
2. If more data needed, fetch from distributed store
3. Cache frequently accessed session data locally (short TTL)

Strengths:

  • Most requests don't hit the session store (cookie has basics)
  • Full session data available when needed
  • Can invalidate by marking session invalid in store

This is what we'll design today.


Chapter 3: Sticky Sessions Deep Dive

Before moving to distributed sessions, let's understand sticky sessions thoroughly—you may need them as a fallback or for specific use cases.

3.1 Implementation Options

Option 1: Cookie-based routing

Load balancer sets a cookie indicating which server should handle requests:

First request:
  LB → Server 2 (random assignment)
  Server 2 → Client: Set-Cookie: SERVERID=server2

Subsequent requests:
  Client → LB: Cookie: SERVERID=server2
  LB → Server 2 (based on cookie)

Option 2: IP hash

Hash client IP to determine server:

server_index = hash(client_ip) % num_servers

Problem: Users behind NAT share IP → same server.

Option 3: Consistent hashing

Use consistent hashing for more stable routing during scale events:

class StickyLoadBalancer:
    def __init__(self, servers: list[str]):
        self.ring = ConsistentHashRing(servers)
    
    def get_server(self, user_id: str) -> str:
        return self.ring.get_node(user_id)
    
    def add_server(self, server: str):
        # Only 1/N of sessions need to move
        self.ring.add_node(server)
    
    def remove_server(self, server: str):
        # Sessions for this server will be redistributed
        self.ring.remove_node(server)
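
The ConsistentHashRing used above isn't defined in the snippet; here's a minimal version with virtual nodes (a sketch, not a production ring):

import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes: list[str] = None, vnodes: int = 100):
        self.vnodes = vnodes
        self._positions = []   # sorted hash positions on the ring
        self._ring = {}        # position -> node
        for node in (nodes or []):
            self.add_node(node)
    
    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)
    
    def add_node(self, node: str):
        # Virtual nodes smooth out the distribution
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            bisect.insort(self._positions, pos)
            self._ring[pos] = node
    
    def remove_node(self, node: str):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            self._positions.remove(pos)
            del self._ring[pos]
    
    def get_node(self, key: str) -> str:
        # First position clockwise from the key's hash owns the key
        idx = bisect.bisect(self._positions, self._hash(key)) % len(self._positions)
        return self._ring[self._positions[idx]]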

3.2 Handling Server Failures

When a sticky server dies, its users lose sessions. Mitigations:

Mitigation 1: Session replication

Replicate sessions to a backup server:

Primary: Server 2 handles User A
Backup:  Server 5 has copy of User A's session

Server 2 dies:
  LB detects failure
  LB routes User A to Server 5
  User A's session still exists

But this adds complexity—at this point, just use a distributed store.

Mitigation 2: Graceful degradation

Accept that server failure = session loss. Make session recreation easy:

@app.route('/dashboard')
def dashboard():
    session = get_session()
    if not session:
        # Session lost - redirect to login
        return redirect('/login?reason=session_expired')
    return render_dashboard(session)

If login is fast (SSO, remember-me cookies), this is acceptable.

3.3 When Sticky Sessions Make Sense

  • WebSocket connections: Must maintain connection to same server
  • In-memory caching: Server caches user-specific data
  • Legacy systems: Application stores state in memory
  • Cost constraints: Can't afford session store infrastructure

Part II: The Design Challenge

Chapter 4: Designing a Distributed Session Store

4.1 Requirements Recap

  • Capacity: 10M concurrent sessions, 2KB each = 20GB
  • Throughput: 2-3M reads/second peak, 500K writes/second
  • Latency: <5ms P99
  • Availability: 99.99% (52 minutes downtime/year)
  • Durability: Best-effort (session loss is inconvenient, not catastrophic)
  • Consistency: Read-your-writes

4.2 High-Level Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                           Application Tier                               │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐         ┌──────────┐           │
│  │ App Srv  │ │ App Srv  │ │ App Srv  │   ...   │ App Srv  │           │
│  │   + LC   │ │   + LC   │ │   + LC   │         │   + LC   │           │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘         └────┬─────┘           │
│       │            │            │                    │                  │
│       └────────────┴────────────┴────────────────────┘                  │
│                                 │                                        │
│                    LC = Local Cache (optional)                          │
└─────────────────────────────────┼───────────────────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          Session Store Tier                              │
│                                                                          │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                      Redis Cluster                                  │ │
│  │                                                                     │ │
│  │   ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐        │ │
│  │   │ Shard 0 │    │ Shard 1 │    │ Shard 2 │    │ Shard N │        │ │
│  │   │Primary  │    │Primary  │    │Primary  │    │Primary  │        │ │
│  │   │+Replica │    │+Replica │    │+Replica │    │+Replica │        │ │
│  │   └─────────┘    └─────────┘    └─────────┘    └─────────┘        │ │
│  │                                                                     │ │
│  └────────────────────────────────────────────────────────────────────┘ │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

4.3 Data Model

Session structure:

@dataclass
class Session:
    session_id: str          # Unique identifier (UUID or secure random)
    user_id: str             # Authenticated user
    created_at: int          # Unix timestamp
    last_accessed: int       # Unix timestamp (for TTL extension)
    expires_at: int          # Unix timestamp
    
    # Authentication data
    roles: list[str]
    permissions: list[str]
    
    # User preferences
    locale: str
    timezone: str
    
    # CSRF protection
    csrf_token: str
    
    # Application-specific data
    data: dict               # Flexible storage for app needs

# Example session
session = Session(
    session_id="sess_a1b2c3d4e5f6",
    user_id="user_123",
    created_at=1700000000,
    last_accessed=1700003600,
    expires_at=1700086400,
    roles=["user", "premium"],
    permissions=["read", "write", "delete"],
    locale="en-US",
    timezone="America/New_York",
    csrf_token="csrf_xyz789",
    data={
        "cart_items": ["item_1", "item_2"],
        "last_page": "/products/shoes"
    }
)

Redis storage:

Key:   session:{session_id}
Value: JSON-encoded session data
TTL:   24 hours (extended on access)

4.4 Partitioning Strategy

With 10M sessions across a Redis cluster, we need to partition effectively.

Hash partitioning by session_id:

def get_partition(session_id: str, num_partitions: int) -> int:
    # crc16 stands in for Redis Cluster's CRC16; any stable hash works here
    return crc16(session_id) % num_partitions

Redis Cluster does this automatically using hash slots (16384 slots).

Capacity planning:

Total data: 20 GB
Target per shard: 5 GB (leave headroom)
Number of shards: 20 GB / 5 GB = 4 shards minimum

For redundancy and performance:
  6 shards × 2 (primary + replica) = 12 nodes
  
Each shard:
  ~1.7M sessions
  ~3.3 GB memory
  ~500K reads/second capacity (plenty of headroom)

4.5 Replication Strategy

For 99.99% availability, we need replication with automatic failover.

Redis Cluster configuration:

Each shard:
  1 primary (handles reads + writes)
  1 replica (handles reads, ready for promotion)

Replication: Asynchronous (default)
  - Writes acknowledged as soon as the primary applies them (no wait for replicas)
  - Replica catches up in background
  - Failover may lose the last few writes (acceptable for sessions)

Failover behavior:

Normal operation:
  Primary handles all writes
  Reads distributed to primary + replica

Primary fails:
  1. Cluster detects failure (1-2 seconds)
  2. Replica promoted to primary (automatic)
  3. Brief write unavailability (seconds)
  4. Old primary rejoins as replica when recovered
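
During that brief window, clients see connection errors. A small retry wrapper smooths it over (a sketch, using redis-py's exception types):

import time
import redis

def with_retry(op, attempts: int = 3, base_delay: float = 0.05):
    """Retry a Redis operation across a short failover window."""
    for attempt in range(attempts):
        try:
            return op()
        except (redis.ConnectionError, redis.TimeoutError):
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff

# Usage: session = with_retry(lambda: store.get(session_id))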

4.6 Session Operations

import redis
import json
import secrets
import time
from typing import Optional
from dataclasses import asdict

class SessionStore:
    def __init__(
        self,
        redis_cluster: redis.RedisCluster,
        default_ttl: int = 86400,  # 24 hours
        extend_on_access: bool = True
    ):
        self.redis = redis_cluster
        self.default_ttl = default_ttl
        self.extend_on_access = extend_on_access
    
    def create(self, user_id: str, **kwargs) -> Session:
        """Create a new session for a user."""
        now = int(time.time())
        
        session = Session(
            session_id=f"sess_{secrets.token_urlsafe(32)}",
            user_id=user_id,
            created_at=now,
            last_accessed=now,
            expires_at=now + self.default_ttl,
            roles=kwargs.get('roles', []),
            permissions=kwargs.get('permissions', []),
            locale=kwargs.get('locale', 'en-US'),
            timezone=kwargs.get('timezone', 'UTC'),
            csrf_token=secrets.token_urlsafe(32),
            data=kwargs.get('data', {})
        )
        
        key = f"session:{session.session_id}"
        self.redis.setex(key, self.default_ttl, json.dumps(asdict(session)))
        
        # Index by user_id for "logout all devices"
        self.redis.sadd(f"user_sessions:{user_id}", session.session_id)
        self.redis.expire(f"user_sessions:{user_id}", self.default_ttl)
        
        return session
    
    def get(self, session_id: str) -> Optional[Session]:
        """Retrieve a session by ID."""
        key = f"session:{session_id}"
        data = self.redis.get(key)
        
        if not data:
            return None
        
        session_dict = json.loads(data)
        session = Session(**session_dict)
        
        # Check expiration
        if session.expires_at < time.time():
            self.delete(session_id)
            return None
        
        # Extend TTL on access
        if self.extend_on_access:
            self._extend_ttl(session)
        
        return session
    
    def _extend_ttl(self, session: Session):
        """Extend session TTL on access."""
        now = int(time.time())
        session.last_accessed = now
        session.expires_at = now + self.default_ttl
        
        key = f"session:{session.session_id}"
        self.redis.setex(key, self.default_ttl, json.dumps(asdict(session)))
    
    def update(self, session_id: str, **updates) -> Optional[Session]:
        """Update session data."""
        session = self.get(session_id)
        if not session:
            return None
        
        # Apply updates: known fields directly, everything else into .data
        for field_name, value in updates.items():
            if hasattr(session, field_name):
                setattr(session, field_name, value)
            else:
                session.data[field_name] = value
        
        session.last_accessed = int(time.time())
        
        key = f"session:{session_id}"
        ttl = self.redis.ttl(key)  # -2 if the key is gone; keep at least 60s
        self.redis.setex(key, max(ttl, 60), json.dumps(asdict(session)))
        
        return session
    
    def delete(self, session_id: str) -> bool:
        """Delete a session (logout)."""
        session = self.get(session_id)
        if not session:
            return False
        
        # Remove from user index
        self.redis.srem(f"user_sessions:{session.user_id}", session_id)
        
        # Delete session
        key = f"session:{session_id}"
        return self.redis.delete(key) > 0
    
    def delete_all_for_user(self, user_id: str) -> int:
        """Delete all sessions for a user (logout all devices)."""
        session_ids = self.redis.smembers(f"user_sessions:{user_id}")
        
        if not session_ids:
            return 0
        
        # Delete all sessions
        pipe = self.redis.pipeline()
        for sid in session_ids:
            pipe.delete(f"session:{sid}")
        pipe.delete(f"user_sessions:{user_id}")
        results = pipe.execute()
        
        return sum(1 for r in results[:-1] if r > 0)
    
    def validate(self, session_id: str, csrf_token: Optional[str] = None) -> bool:
        """Validate a session exists and optionally verify CSRF token."""
        session = self.get(session_id)
        if not session:
            return False
        
        if csrf_token and session.csrf_token != csrf_token:
            return False
        
        return True
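
Usage looks like this (the cluster endpoint is a placeholder):

r = redis.RedisCluster(host="redis-cluster.example.com", port=6379)
store = SessionStore(r)

session = store.create("user_123", roles=["user"], locale="en-US")
assert store.validate(session.session_id, session.csrf_token)

store.update(session.session_id, locale="de-DE")     # known field, set directly
store.update(session.session_id, last_page="/home")  # unknown field, lands in .data
store.delete(session.session_id)                     # logout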

4.7 Performance Optimizations

Optimization 1: Connection Pooling

Reuse connections instead of creating new ones per request:

# Redis cluster with connection pooling
redis_cluster = redis.RedisCluster(
    host='redis-cluster.example.com',
    port=6379,
    max_connections=100,
    socket_timeout=1.0,
    socket_connect_timeout=1.0
)

Optimization 2: Local Caching

Cache hot sessions locally to avoid network round-trips:

from cachetools import TTLCache
import threading

class CachedSessionStore:
    def __init__(self, session_store: SessionStore, local_ttl: int = 5):
        self.store = session_store
        self.cache = TTLCache(maxsize=10000, ttl=local_ttl)
        self.lock = threading.Lock()
    
    def get(self, session_id: str) -> Optional[Session]:
        # Check local cache
        with self.lock:
            if session_id in self.cache:
                return self.cache[session_id]
        
        # Cache miss - fetch from Redis
        session = self.store.get(session_id)
        
        if session:
            with self.lock:
                self.cache[session_id] = session
        
        return session
    
    def invalidate(self, session_id: str):
        """Invalidate local cache entry."""
        with self.lock:
            self.cache.pop(session_id, None)
    
    # Write operations bypass cache and invalidate
    def update(self, session_id: str, **updates) -> Optional[Session]:
        self.invalidate(session_id)
        return self.store.update(session_id, **updates)
    
    def delete(self, session_id: str) -> bool:
        self.invalidate(session_id)
        return self.store.delete(session_id)

Trade-off: Local cache may serve stale data for up to local_ttl seconds. For sessions, this is usually acceptable—permissions changes can tolerate a few seconds delay.

Optimization 3: Batch Operations

For endpoints that need multiple sessions (admin dashboards):

def get_sessions_batch(self, session_ids: list[str]) -> dict[str, Session]:
    """Fetch multiple sessions in one round-trip."""
    pipe = self.redis.pipeline()
    
    for sid in session_ids:
        pipe.get(f"session:{sid}")
    
    results = pipe.execute()
    
    sessions = {}
    for sid, data in zip(session_ids, results):
        if data:
            sessions[sid] = Session(**json.loads(data))
    
    return sessions

Optimization 4: Compression for Large Sessions

If sessions grow large, compress them before storing (the sketch below compresses anything over 1 KB):

import zlib

class CompressedSessionStore:
    COMPRESSION_THRESHOLD = 1024  # bytes
    
    def _serialize(self, session: Session) -> bytes:
        data = json.dumps(asdict(session)).encode()
        if len(data) > self.COMPRESSION_THRESHOLD:
            return b'Z' + zlib.compress(data)
        return b'R' + data
    
    def _deserialize(self, data: bytes) -> Session:
        if data[0:1] == b'Z':
            data = zlib.decompress(data[1:])
        else:
            data = data[1:]
        return Session(**json.loads(data))

Chapter 5: Multi-Datacenter Design

5.1 The Challenge

You have users globally. A single-region session store means:

  • High latency for distant users
  • Single point of failure
  • Compliance issues (data residency)

5.2 Architecture Options

Option 1: Active-Passive (DR)

One primary region, one standby:

US-East (Primary)           US-West (Standby)
┌─────────────────┐         ┌─────────────────┐
│  Redis Cluster  │────────▶│  Redis Cluster  │
│   (Read/Write)  │  async  │   (Read-only)   │
└─────────────────┘  repl   └─────────────────┘
        │
        ▼
   All traffic

Failover: Promote US-West to primary, redirect traffic.

Problems:

  • US-West users always have high latency
  • Async replication = data loss on failover

Option 2: Active-Active (Multi-Master)

Both regions accept writes:

US-East                     US-West
┌─────────────────┐         ┌─────────────────┐
│  Redis Cluster  │◀───────▶│  Redis Cluster  │
│  (Read/Write)   │  bi-dir │  (Read/Write)   │
└─────────────────┘  repl   └─────────────────┘
        │                           │
        ▼                           ▼
   US-East users              US-West users

Problems:

  • Conflict resolution for same session updated in both regions
  • More complex replication

Option 3: Session Affinity by Region

Sessions are created in and served from one region:

US-East                     US-West
┌─────────────────┐         ┌─────────────────┐
│  Redis Cluster  │         │  Redis Cluster  │
│ Sessions: A,B,C │         │ Sessions: X,Y,Z │
└─────────────────┘         └─────────────────┘
        │                           │
        ▼                           ▼
   Users A,B,C                Users X,Y,Z
   (created here)             (created here)

How it works:

  • Session ID encodes region: sess_east_abc123
  • Requests routed to session's home region
  • User always goes to same region for that session

This is the recommended approach for most applications.

5.3 Implementing Region-Aware Sessions

from enum import Enum
from typing import Optional
import re

class Region(Enum):
    US_EAST = "use1"
    US_WEST = "usw2"
    EU_WEST = "euw1"
    AP_SOUTH = "aps1"

class MultiRegionSessionStore:
    def __init__(
        self,
        local_region: Region,
        redis_clients: dict[Region, redis.RedisCluster]
    ):
        self.local_region = local_region
        self.clients = redis_clients
        self.local_store = SessionStore(redis_clients[local_region])
    
    def create(self, user_id: str, **kwargs) -> Session:
        """Create session in local region."""
        session = self.local_store.create(user_id, **kwargs)
        old_key = f"session:{session.session_id}"
        
        # Encode region in session ID (replace the plain "sess_" prefix)
        session.session_id = f"sess_{self.local_region.value}_{session.session_id[5:]}"
        
        # Save under the region-prefixed ID and drop the un-prefixed record
        # (a fuller version would also update the user_sessions index)
        key = f"session:{session.session_id}"
        self.clients[self.local_region].setex(
            key, 
            self.local_store.default_ttl, 
            json.dumps(asdict(session))
        )
        self.clients[self.local_region].delete(old_key)
        
        return session
    
    def get(self, session_id: str) -> Optional[Session]:
        """Get session from appropriate region."""
        region = self._extract_region(session_id)
        
        if region == self.local_region:
            return self.local_store.get(session_id)
        
        # Remote region - need cross-region call
        remote_client = self.clients.get(region)
        if not remote_client:
            return None
        
        key = f"session:{session_id}"
        data = remote_client.get(key)
        
        if data:
            return Session(**json.loads(data))
        return None
    
    def _extract_region(self, session_id: str) -> Region:
        """Extract region from session ID."""
        # Format: sess_{region}_{random}
        match = re.match(r'sess_([a-z0-9]+)_', session_id)
        if match:
            region_code = match.group(1)
            for region in Region:
                if region.value == region_code:
                    return region
        return self.local_region  # Default to local
    
    def should_redirect(self, session_id: str) -> Optional[str]:
        """
        Check if request should be redirected to session's home region.
        Returns redirect URL or None.
        """
        region = self._extract_region(session_id)
        if region != self.local_region:
            return self._get_region_url(region)
        return None
    
    def _get_region_url(self, region: Region) -> str:
        return {
            Region.US_EAST: "https://us-east.example.com",
            Region.US_WEST: "https://us-west.example.com",
            Region.EU_WEST: "https://eu-west.example.com",
            Region.AP_SOUTH: "https://ap-south.example.com"
        }[region]

5.4 The Datacenter Failover Question

Scenario: US-East datacenter goes completely offline. What happens to sessions?

Analysis:

  1. Sessions created in US-East are inaccessible

    • Session data is in US-East Redis
    • Other regions can't read it
  2. Users from US-East need to re-authenticate

    • Their sessions are gone (or inaccessible)
    • This is acceptable for most applications
  3. Options for mitigation:

    Option A: Accept session loss

    • Users log in again
    • Simplest approach
    • Acceptable if login is easy (SSO, remember-me)

    Option B: Cross-region replication

    • Replicate sessions to another region
    • On failover, sessions available in backup
    • More complex, higher cost

    Option C: Session reconstruction (sketched after this list)

    • Store minimal auth in encrypted cookie
    • Reconstruct session from user database
    • Balance between A and B
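
A minimal sketch of Option C, assuming a verified encrypted cookie and a hypothetical user_db lookup:

def get_or_reconstruct(session_id, cookie_claims, session_store, user_db):
    """Serve the stored session if it survived; otherwise rebuild it."""
    session = session_store.get(session_id)
    if session:
        return session
    
    # cookie_claims came from a verified, encrypted cookie (user_id at minimum)
    profile = user_db.load_profile(cookie_claims["user_id"])  # hypothetical call
    return session_store.create(
        user_id=profile.user_id,
        roles=profile.roles,
        locale=profile.locale,
    )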

My recommendation for 99.99% availability target:

class ResilientSessionStore:
    """
    Session store with cross-region backup for critical sessions.
    """
    
    def __init__(
        self,
        local_region: Region,
        backup_region: Region,
        redis_clients: dict[Region, redis.RedisCluster]
    ):
        self.local = redis_clients[local_region]
        self.backup = redis_clients[backup_region]
        self.local_region = local_region
        self.backup_region = backup_region
    
    def create(self, user_id: str, **kwargs) -> Session:
        # _create_session builds a Session just as SessionStore.create does (omitted)
        session = self._create_session(user_id, **kwargs)
        
        # Write to local
        self._save(self.local, session)
        
        # Async replicate to backup
        threading.Thread(
            target=self._save,
            args=(self.backup, session),
            daemon=True
        ).start()
        
        return session
    
    def get(self, session_id: str) -> Optional[Session]:
        # Try local first
        session = self._load(self.local, session_id)
        if session:
            return session
        
        # Try backup (maybe we're recovering from local failure)
        session = self._load(self.backup, session_id)
        if session:
            # Restore to local
            self._save(self.local, session)
        
        return session
    
    def _save(self, client, session: Session):
        key = f"session:{session.session_id}"
        client.setex(key, 86400, json.dumps(asdict(session)))
    
    def _load(self, client, session_id: str) -> Optional[Session]:
        try:
            key = f"session:{session_id}"
            data = client.get(key)
            return Session(**json.loads(data)) if data else None
        except redis.RedisError:
            return None

Part III: Technology Comparison

Chapter 6: Redis Cluster vs DynamoDB vs Custom

You need to choose a technology. Let's compare the options.

6.1 Redis Cluster

Architecture: In-memory key-value store with clustering and replication.

Strengths:

  • ✅ Sub-millisecond latency
  • ✅ Rich data structures (useful for complex sessions)
  • ✅ Built-in clustering and replication
  • ✅ TTL support (perfect for sessions)
  • ✅ Pub/Sub for session invalidation

Weaknesses:

  • ❌ Memory-only (data loss on restart without persistence)
  • ❌ Single-threaded per shard (CPU-bound at extreme scale)
  • ❌ Operational complexity (managing cluster)
  • ❌ Cross-region replication requires additional tooling

Cost estimate (AWS ElastiCache):

6 shards × 2 nodes × cache.r6g.large = 12 nodes
12 × $0.126/hour × 730 hours/month = ~$1,100/month

6.2 Amazon DynamoDB

Architecture: Managed NoSQL database with automatic scaling.

Strengths:

  • ✅ Fully managed (no operational burden)
  • ✅ Automatic scaling
  • ✅ Global Tables for multi-region
  • ✅ Built-in TTL
  • ✅ Highly durable (replicated across AZs)

Weaknesses:

  • ❌ Higher latency than Redis (5-10ms vs 1ms)
  • ❌ Cost can spike with high throughput
  • ❌ Less flexible than Redis (no complex data structures)
  • ❌ Consistency model complexity (eventual by default)

Cost estimate (DynamoDB):

10M sessions, 2KB each = 20GB storage
3M reads/sec peak, 500K writes/sec peak

On-demand pricing:
  Reads: 3M × $0.25/million = $0.75/sec (peak)
  Writes: 500K × $1.25/million = $0.625/sec (peak)
  
Provisioned (more predictable):
  Read capacity: 3M RCU × $0.00013/hour = ~$390/hour (peak)
  
This gets expensive fast. Use DAX caching.

6.3 Custom Solution (PostgreSQL + Caching)

Architecture: Sessions in PostgreSQL with Redis/Memcached caching layer.

Strengths:

  • ✅ Use existing infrastructure
  • ✅ Full SQL capabilities (complex queries)
  • ✅ Strong durability
  • ✅ Familiar operational model

Weaknesses:

  • ❌ More complex architecture (two systems)
  • ❌ Cache invalidation complexity
  • ❌ Higher latency without cache
  • ❌ PostgreSQL not designed for this access pattern

Best for: when you already operate PostgreSQL (plus a cache) and don't want to adopt yet another datastore.

6.4 Decision Matrix

Factor             | Redis | DynamoDB | Custom (PG+Cache)
------------------ | ----- | -------- | -----------------
Latency            | ⭐⭐⭐   | ⭐⭐       | ⭐⭐ (with cache)
Operational burden | ⭐⭐    | ⭐⭐⭐      | ⭐
Cost (at scale)    | ⭐⭐    | ⭐        | ⭐⭐
Multi-region       | ⭐⭐    | ⭐⭐⭐      | ⭐
Durability         | ⭐     | ⭐⭐⭐      | ⭐⭐⭐
Flexibility        | ⭐⭐⭐   | ⭐⭐       | ⭐⭐⭐

6.5 My Recommendation

For most applications: Redis Cluster

Reasons:

  1. Sessions are ephemeral—durability is nice but not critical
  2. Sub-millisecond latency matters for every request
  3. TTL support is exactly what sessions need
  4. Operational complexity is manageable with managed services (ElastiCache, Redis Cloud)

Exception: Use DynamoDB Global Tables when:

  • Multi-region with write availability in each region is critical
  • You want zero operational burden
  • Slightly higher latency is acceptable

Exception: Use Custom when:

  • You have strict data residency requirements
  • You already have PostgreSQL and Redis
  • Sessions need complex querying (admin features)

Chapter 7: Decision Document Template

Here's a template for documenting your technology choice:

# Session Store Technology Decision

## Context
We need a session store for 10M concurrent users with <5ms P99 latency
and 99.99% availability.

## Decision
We will use **Redis Cluster** via AWS ElastiCache.

## Rationale

### Why Redis
1. **Latency**: Sub-millisecond reads meet our <5ms requirement with margin
2. **TTL**: Native TTL support matches session lifecycle
3. **Proven**: Battle-tested for sessions at companies like GitHub, Twitter

### Why Not DynamoDB
1. **Latency**: 5-10ms baseline doesn't leave margin for our 5ms target
2. **Cost**: At 3M reads/sec, on-demand pricing is prohibitive
3. **Complexity**: DAX required to match Redis latency, adding another layer

### Why Not Custom
1. **Operational burden**: Managing PostgreSQL + Redis is more work than managed Redis
2. **Not needed**: We don't require SQL capabilities for sessions

## Architecture
- 6 shards with 1 replica each (12 nodes total)
- ElastiCache for Redis 7.0 (latest stable)
- Multi-AZ deployment for high availability
- Local application caching (5-second TTL) to reduce load

## Multi-Region Strategy
- Primary region: US-East
- Sessions are region-affine (encoded in session ID)
- Cross-region reads for mobile users who travel
- Async backup replication to US-West for DR

## Failure Handling
- Redis node failure: Automatic failover to replica (<30 seconds)
- Datacenter failure: Users re-authenticate (acceptable UX)
- Full region failure: DR failover to US-West (manual, 15-minute RTO)

## Cost Estimate
- ElastiCache: $1,100/month
- Data transfer: ~$200/month
- Total: ~$1,300/month

## Monitoring
- Latency: Alert if P99 > 3ms
- Memory: Alert if > 80% utilization
- Replication lag: Alert if > 1 second
- Connection count: Alert if > 80% of max

## Review Date
Re-evaluate in 6 months or when reaching 50M concurrent sessions.

Part IV: Advanced Topics

Chapter 8: Security Considerations

8.1 Session ID Security

Session IDs must be unpredictable to prevent session hijacking:

import secrets

def generate_session_id() -> str:
    # 256 bits of randomness = practically unguessable
    return f"sess_{secrets.token_urlsafe(32)}"

# Good: sess_7Hj2kL9mNpQrStUvWxYz1234567890abcdef
# Bad:  sess_1, sess_2, sess_3 (sequential = guessable)

8.2 Session Fixation Prevention

Don't accept session IDs from clients before authentication:

def login(username: str, password: str, old_session_id: Optional[str]) -> Session:
    user = authenticate(username, password)
    
    if old_session_id:
        # Destroy old session to prevent fixation
        session_store.delete(old_session_id)
    
    # Create new session with new ID
    return session_store.create(user.id)

8.3 Secure Cookie Attributes

Set the session cookie with flags that keep it away from scripts and plain HTTP:

response.set_cookie(
    'session_id',
    session.session_id,
    httponly=True,      # JavaScript can't access
    secure=True,        # HTTPS only
    samesite='Lax',     # CSRF protection
    max_age=86400,      # 24 hours
    domain='.example.com'  # Subdomain sharing if needed
)

8.4 Session Data Encryption

For sensitive data in sessions:

from cryptography.fernet import Fernet

class EncryptedSessionStore:
    def __init__(self, session_store: SessionStore, encryption_key: bytes):
        self.store = session_store
        self.cipher = Fernet(encryption_key)
    
    def create(self, user_id: str, **kwargs) -> Session:
        session = self.store.create(user_id, **kwargs)
        
        # Encrypt sensitive fields. Note: create() above already wrote them in
        # plaintext; encrypting before the first write would close that window.
        if 'data' in kwargs:
            encrypted = self.cipher.encrypt(json.dumps(kwargs['data']).encode())
            session.data = {'_encrypted': encrypted.decode()}
            self.store.update(session.session_id, data=session.data)
        
        return session
    
    def get(self, session_id: str) -> Optional[Session]:
        session = self.store.get(session_id)
        if session and '_encrypted' in session.data:
            decrypted = self.cipher.decrypt(session.data['_encrypted'].encode())
            session.data = json.loads(decrypted)
        return session

Chapter 9: Monitoring and Observability

9.1 Key Metrics

from prometheus_client import Counter, Histogram, Gauge

# Operation counts
session_operations = Counter(
    'session_operations_total',
    'Session store operations',
    ['operation', 'status']  # operation: create/get/update/delete, status: success/failure
)

# Latency
session_latency = Histogram(
    'session_operation_seconds',
    'Session operation latency',
    ['operation'],
    buckets=[.001, .0025, .005, .01, .025, .05, .1, .25, .5, 1]
)

# Active sessions
active_sessions = Gauge(
    'active_sessions_total',
    'Number of active sessions'
)

# Cache hit rate
cache_hits = Counter(
    'session_cache_hits_total',
    'Local cache hits'
)
cache_misses = Counter(
    'session_cache_misses_total',
    'Local cache misses'
)

9.2 Health Checks

import time
import uuid

# assumes an asyncio Redis client (e.g. redis.asyncio.Redis) bound to `redis`
async def session_store_health() -> dict:
    """Health check for session store."""
    results = {
        'status': 'healthy',
        'checks': {}
    }
    
    # Check Redis connectivity
    try:
        start = time.time()
        await redis.ping()
        latency = (time.time() - start) * 1000
        
        results['checks']['redis_ping'] = {
            'status': 'pass',
            'latency_ms': latency
        }
    except Exception as e:
        results['status'] = 'unhealthy'
        results['checks']['redis_ping'] = {
            'status': 'fail',
            'error': str(e)
        }
    
    # Check read/write round-trip
    try:
        test_key = f"health_check_{uuid.uuid4()}"
        await redis.setex(test_key, 10, 'test')
        value = await redis.get(test_key)
        await redis.delete(test_key)
        
        if value not in (b'test', 'test'):
            raise ValueError(f"read-after-write mismatch: {value!r}")
        
        results['checks']['redis_readwrite'] = {'status': 'pass'}
    except Exception as e:
        results['status'] = 'unhealthy'
        results['checks']['redis_readwrite'] = {
            'status': 'fail',
            'error': str(e)
        }
    
    return results

9.3 Alerting Rules

groups:
  - name: session_store
    rules:
      - alert: SessionStoreHighLatency
        expr: histogram_quantile(0.99, rate(session_operation_seconds_bucket[5m])) > 0.005
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "Session store P99 latency > 5ms"
      
      - alert: SessionStoreErrors
        expr: rate(session_operations_total{status="failure"}[5m]) > 0.01
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Session store error rate > 1%"
      
      - alert: SessionStoreCacheLow
        expr: rate(session_cache_hits_total[5m]) / (rate(session_cache_hits_total[5m]) + rate(session_cache_misses_total[5m])) < 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Session cache hit rate below 80%"

Part V: Interview Questions and Answers

Chapter 10: Real-World Interview Scenarios

10.1 Conceptual Questions

Question 1: "Compare sticky sessions vs distributed sessions."

Interviewer's Intent: Testing understanding of trade-offs.

Strong Answer:

"Sticky sessions route all requests from one user to the same server, storing sessions in local memory. Distributed sessions store sessions in an external system accessible by all servers.

Sticky sessions are simpler and faster (no network hop), but have critical weaknesses: server failure loses all its sessions, scaling is harder, and load can be uneven.

Distributed sessions require infrastructure (Redis, DynamoDB) but provide resilience—server failures don't affect sessions, scaling is straightforward, and load balancing is simple.

For production at scale, I'd almost always choose distributed sessions. The operational benefits far outweigh the small latency cost. The exception is WebSocket applications where connection affinity is required—but even then, I'd store session data externally and only keep connection state locally."


Question 2: "How do you handle session invalidation across multiple servers?"

Interviewer's Intent: Testing understanding of consistency challenges.

Strong Answer:

"With distributed sessions in Redis, invalidation is straightforward—delete the key, and all servers will see the session is gone on their next request.

The challenge comes with local caching. If each app server caches sessions locally for performance, deleting from Redis doesn't immediately invalidate local caches.

Three solutions:

  1. Short TTL on local cache: 5-10 seconds. Stale sessions expire quickly. Simple but not immediate.

  2. Pub/Sub invalidation: Publish invalidation message when session deleted. All servers subscribe and clear local cache. Near-immediate but requires reliable pub/sub.

  3. Versioning: Store version in Redis. Local cache checks version on each read. If mismatched, refetch. Accurate but adds a Redis call.

I'd typically use short TTL (5 seconds) because it's simple and good enough. For security-critical scenarios (user revokes access), I'd add pub/sub for immediate invalidation."
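
A minimal sketch of the pub/sub option with redis-py (the channel name is arbitrary):

import threading

CHANNEL = "session_invalidations"  # arbitrary channel name

def publish_invalidation(redis_client, session_id: str):
    redis_client.publish(CHANNEL, session_id)

def start_invalidation_listener(redis_client, local_cache, lock):
    """Evict invalidated sessions from this server's local cache."""
    pubsub = redis_client.pubsub()
    pubsub.subscribe(CHANNEL)
    
    def listen():
        for message in pubsub.listen():
            if message["type"] == "message":
                sid = message["data"].decode()
                with lock:
                    local_cache.pop(sid, None)
    
    threading.Thread(target=listen, daemon=True).start()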


Question 3: "What happens to sessions during a datacenter failover?"

Interviewer's Intent: Testing disaster recovery thinking.

Strong Answer:

"It depends on the architecture. Let me walk through options:

Single-region with no replication: All sessions are lost. Users must re-authenticate. This is often acceptable—inconvenient but not catastrophic.

Active-passive with replication: Sessions replicated to standby region. On failover, sessions are available. But there's a window (replication lag) where recent sessions might be lost.

Active-active with region-affine sessions: Each session lives in one region. If US-East fails, US-East sessions are gone, but US-West sessions are fine. Users can re-authenticate against US-West and get new sessions.

For 99.99% availability, I'd use region-affine sessions with optional cross-region backup for critical sessions. The backup is async—we accept potential data loss for simplicity. We make re-authentication fast (SSO, remember-me cookies) so session loss isn't painful.

Key insight: Sessions are ephemeral by nature. Designing for zero session loss adds significant complexity. It's often better to make session recreation seamless than to guarantee session survival."


10.2 Design Questions

Question 4: "Design a session store for 10M concurrent users."

Interviewer's Intent: Testing end-to-end system design.

Strong Answer:

"Let me start with requirements and scale analysis.

Scale analysis:

  • 10M sessions × 2KB = 20GB storage
  • 2-3M reads/sec (each user makes ~1 req/sec when active)
  • 500K writes/sec (TTL extensions, updates)
  • <5ms P99 latency target

Storage choice: Redis Cluster. It provides sub-millisecond latency, native TTL, and can handle this load comfortably. DynamoDB would work but has higher latency.

Architecture:

App Servers (with local cache)
         │
         ▼
    Redis Cluster
    (6 shards × 2 nodes: primary + replica)

Partitioning: Hash by session_id across 6 shards. Each shard handles ~3.3GB and ~400K reads/sec—well within Redis capacity.

Replication: Each shard has one replica. Async replication for performance. Automatic failover if primary dies.

Local caching: 5-second TTL cache on app servers. Reduces Redis load by 80%+ for read-heavy traffic. Accept stale reads for that window.

Session operations:

  • Create: Generate secure random ID, store in Redis with TTL
  • Get: Check local cache → Redis → return or null
  • Update: Update Redis, invalidate local cache
  • Delete: Delete from Redis, broadcast invalidation

Multi-region (if needed): Region-encoded session IDs. Sessions served from home region. Cross-region backup for DR.

Failure handling:

  • Redis node fails: Automatic replica promotion (~30 sec)
  • App server fails: No impact (sessions in Redis)
  • Region fails: Users re-authenticate (acceptable)

Monitoring: Latency percentiles, cache hit rate, error rate, memory utilization."


Question 5: "How would you implement 'logout from all devices'?"

Interviewer's Intent: Testing detailed implementation thinking.

Strong Answer:

"This requires tracking which sessions belong to which user. Here's my approach:

Data model:

  • session:{session_id} → session data
  • user_sessions:{user_id} → set of session IDs

On session create:

redis.setex(f'session:{session_id}', TTL, session_data)
redis.sadd(f'user_sessions:{user_id}', session_id)
redis.expire(f'user_sessions:{user_id}', TTL)

On logout all devices:

session_ids = redis.smembers(f'user_sessions:{user_id}')
pipe = redis.pipeline()
for sid in session_ids:
    pipe.delete(f'session:{sid}')
pipe.delete(f'user_sessions:{user_id}')
pipe.execute()

Considerations:

  1. Atomicity: Use a transaction or Lua script if needed, but for sessions, eventual consistency is fine.

  2. Performance: If a user has 100+ sessions (many devices over time), this is still fast—Redis pipeline handles it efficiently.

  3. TTL sync: The user_sessions set should have same TTL as sessions. Otherwise, it accumulates expired session IDs.

  4. Alternative approach: Instead of deleting sessions, we could increment a 'session_version' for the user. Session validation checks if session's version matches current. Cheaper for many sessions but requires extra lookup.

For most applications, the straightforward delete approach works fine. The version approach is for extreme cases (users with thousands of sessions)."
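
The version approach, sketched (assumes each session records the version that was current when it was created):

def logout_all_devices(r, user_id: str):
    # Bumping the counter logically invalidates every existing session
    r.incr(f"session_version:{user_id}")

def is_session_current(r, session) -> bool:
    current = int(r.get(f"session_version:{session.user_id}") or 0)
    return session.data.get("version", 0) == current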


10.3 Scenario-Based Questions

Question 6: "Your Redis session store is running out of memory. What do you do?"

Interviewer's Intent: Testing operational problem-solving.

Strong Answer:

"Immediate triage:

  1. Check memory metrics: Is it a sudden spike or gradual growth? Spike suggests a specific cause (bug, attack); gradual suggests capacity planning issue.

  2. Check eviction policy: If maxmemory-policy is noeviction, Redis will reject writes. If it's volatile-lru, old sessions should be evicting. Check eviction stats.

Immediate mitigations:

  1. Enable eviction if disabled: Set maxmemory-policy volatile-lru to evict sessions by LRU. Sessions have TTL, so this is safe.

  2. Reduce session TTL temporarily: If sessions are 24 hours, reduce to 12 hours. Forces logout but buys time.

  3. Flush expired keys: Run SCAN with TTL check to delete any zombie sessions that aren't expiring properly.

  4. Add capacity: Add more shards if this is a capacity issue.

Root cause investigation:

  1. Session size growth: Are sessions getting bigger? Check average size over time.

  2. Session count growth: More users, or sessions not expiring?

  3. Memory fragmentation: Redis can have fragmentation issues. Check INFO memory for fragmentation ratio.

  4. Memory leak: Application bug storing too much in sessions?

Long-term fixes:

  1. Right-size sessions: Review what's stored. Move large data (carts, preferences) to dedicated storage.

  2. Compression: Compress sessions over 1KB.

  3. Capacity planning: Monitor growth rate, plan scaling 3-6 months ahead.

  4. Alerting: Alert at 70% memory to catch before it's critical."
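
For the zombie-session flush mentioned above, a single-node sketch (Redis Cluster requires scanning each node):

def purge_zombie_sessions(r, batch: int = 1000) -> int:
    """Delete session keys that have no TTL and would never expire."""
    cursor, removed = 0, 0
    while True:
        cursor, keys = r.scan(cursor, match="session:*", count=batch)
        for key in keys:
            if r.ttl(key) == -1:   # -1 = key exists but has no expiry
                r.delete(key)
                removed += 1
        if cursor == 0:
            return removed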


Question 7: "How would you migrate from sticky sessions to distributed sessions without downtime?"

Interviewer's Intent: Testing migration planning.

Strong Answer:

"This is a stateful migration requiring careful orchestration. Here's my approach:

Phase 1: Dual-write (1-2 weeks)

Modify session handling to write to both locations:

def create_session(user_id):
    # Write to local (existing)
    local_session = create_local_session(user_id)
    
    # Also write to Redis (new)
    redis_session = copy_to_redis(local_session)
    
    return local_session

All new sessions exist in both systems.

Phase 2: Read-through (1 week)

On session read, try local first, then Redis:

def get_session(session_id):
    # Try local first (existing sessions)
    session = get_local_session(session_id)
    if session:
        # Ensure it's in Redis too
        if not redis_has_session(session_id):
            copy_to_redis(session)
        return session
    
    # Fall back to Redis (new sessions)
    return get_redis_session(session_id)

Existing sessions migrate to Redis on access.

Phase 3: Remove sticky routing (traffic switch)

Gradually shift traffic:

  1. 10% of new users get non-sticky routing
  2. Monitor for issues
  3. Increase to 50%, then 100%

Old sticky users still work—their sessions are in Redis now.

Phase 4: Cleanup (after old sessions expire)

Remove local session code. Full cutover complete.

Rollback plan:

At any phase, we can roll back:

  • Phase 1-2: Stop writing to Redis, continue local
  • Phase 3: Re-enable sticky routing

Key insight: The dual-write phase means we never lose sessions. The gradual rollout catches issues early."


10.4 Deep-Dive Questions

Question 8: "Compare Redis Cluster vs DynamoDB for session storage."

Interviewer's Intent: Testing technology depth.

Strong Answer:

"Both can work. Let me compare on key dimensions:

Latency:

  • Redis: Sub-millisecond (0.2-0.5ms typical)
  • DynamoDB: 5-10ms (single-digit millisecond)
  • Winner: Redis, by 10-20x

Scalability:

  • Redis: Manual sharding, can handle millions of ops/sec with enough shards
  • DynamoDB: Automatic scaling, essentially unlimited
  • Winner: DynamoDB for simplicity

Durability:

  • Redis: Memory-only by default. AOF persistence adds durability but increases latency
  • DynamoDB: Fully durable, synchronously replicated across AZs
  • Winner: DynamoDB

Multi-region:

  • Redis: Requires manual setup (Redis Enterprise or custom)
  • DynamoDB: Global Tables with built-in replication
  • Winner: DynamoDB

Cost:

  • Redis: Predictable (pay per node)
  • DynamoDB: Variable (pay per operation). Can spike with traffic
  • Winner: Redis for predictability at scale

Operational burden:

  • Redis: Cluster management, failover monitoring
  • DynamoDB: Fully managed
  • Winner: DynamoDB

My recommendation:

For sessions, I'd choose Redis because:

  1. Latency is critical (every request reads session)
  2. Sessions are ephemeral—durability is nice but not essential
  3. Cost is predictable at high scale

I'd choose DynamoDB when:

  1. Multi-region write availability is required
  2. Operations team is small (want fully managed)
  3. Latency SLA is 10ms+ (more relaxed)

For truly global applications needing writes in multiple regions, DynamoDB Global Tables is hard to beat. For single-region or read-heavy multi-region, Redis is faster and cheaper."


Question 9: "How would you implement session encryption at rest and in transit?"

Interviewer's Intent: Testing security knowledge.

Strong Answer:

"Let me address both separately:

In transit (network encryption):

For Redis:

  • Use TLS connections between app servers and Redis
  • ElastiCache supports in-transit encryption natively
  • Performance impact: ~10-20% latency increase, worth it for security

redis_client = redis.RedisCluster(
    host='redis.example.com',
    port=6379,
    ssl=True,
    ssl_cert_reqs='required',
    ssl_ca_certs='/path/to/ca.pem'
)

At rest (storage encryption):

Option 1: Redis encryption at rest (infrastructure level)

  • ElastiCache supports encryption at rest
  • Transparent to application
  • Keys managed by AWS KMS

Option 2: Application-level encryption

  • Encrypt session data before storing in Redis
  • More control, works with any Redis

from cryptography.fernet import Fernet

class EncryptedSessionStore:
    def __init__(self, redis_client, key: bytes):
        self.redis = redis_client
        self.cipher = Fernet(key)
    
    def set(self, session_id: str, data: dict, ttl: int):
        plaintext = json.dumps(data).encode()
        ciphertext = self.cipher.encrypt(plaintext)
        self.redis.setex(f'session:{session_id}', ttl, ciphertext)
    
    def get(self, session_id: str) -> dict | None:
        ciphertext = self.redis.get(f'session:{session_id}')
        if not ciphertext:
            return None
        plaintext = self.cipher.decrypt(ciphertext)
        return json.loads(plaintext)

Key management:

  • Store encryption key in secrets manager (AWS Secrets Manager, Vault)
  • Rotate keys periodically
  • For rotation: support decrypting with old keys while encrypting with the new one (see the sketch below)
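
The cryptography library's MultiFernet handles exactly this (a sketch; the key variables are placeholders):

from cryptography.fernet import Fernet, MultiFernet

# New key first: it's used for encryption; old keys remain valid for decryption
cipher = MultiFernet([Fernet(new_key), Fernet(old_key)])

token = cipher.encrypt(b"session data")  # always encrypted with new_key
plain = cipher.decrypt(token)            # tries each key in order
rotated = cipher.rotate(old_token)       # re-encrypts an old token with new_key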

Recommendation:

Use infrastructure-level encryption for both transit and rest—simpler, no code changes, managed key rotation. Add application-level encryption only for extremely sensitive data that shouldn't be visible even to infrastructure admins."


Chapter 11: Week 1 Summary

11.1 What You've Learned This Week

Day | Topic         | Key Concepts
--- | ------------- | -------------------------------------------------------------------------
1   | Partitioning  | Hash vs Range vs Directory, Consistent Hashing, Hot Partition mitigation
2   | Replication   | Leader-follower vs Multi-leader, Sync vs Async, Failover handling
3   | Rate Limiting | Sliding window, Token bucket, Distributed limiting, Failure modes
4   | Hot Keys      | Detection (CMS), Mitigation strategies, Fan-out patterns
5   | Session Store | Sticky vs Distributed, Multi-region, Technology comparison

11.2 How They Connect

The session store design uses everything:

  • Partitioning: Sessions distributed across Redis shards by session_id hash
  • Replication: Each shard has replicas for availability
  • Rate Limiting: Protect session store from abuse (many session creates)
  • Hot Keys: Handle celebrity users with many concurrent sessions

11.3 Interview Preparation Checklist

Before your interview, make sure you can:

Partitioning:

  • Explain hash vs range partitioning trade-offs
  • Implement consistent hashing
  • Handle partition rebalancing

Replication:

  • Compare sync vs async replication
  • Design failover procedures
  • Handle replication lag anomalies

Rate Limiting:

  • Implement sliding window algorithm
  • Design distributed rate limiting
  • Handle rate limiter failures

Hot Keys:

  • Detect hot keys with streaming algorithms
  • Apply caching, splitting, replication strategies
  • Design for skewed traffic patterns

Session Store:

  • Compare sticky vs distributed sessions
  • Design for multi-datacenter
  • Choose appropriate technology with justification

Exercises

Exercise 1: Session Store Implementation

Implement a complete session store with:

  • Creation, retrieval, update, deletion
  • TTL extension on access
  • User session index (for logout all devices)
  • Local caching with invalidation
  • Metrics collection

Exercise 2: Multi-Region Design

Design the session management for a global application:

  • Users in US, EU, and Asia
  • 99.99% availability requirement
  • <10ms P99 latency in each region
  • Compliance with GDPR (EU data stays in EU)

Document your architecture, data flow, and failure handling.

Exercise 3: Migration Plan

You're migrating from PHP sessions (file-based) to Redis Cluster:

  • 5M active sessions
  • Zero downtime requirement
  • Rollback capability

Write a detailed migration plan with phases, rollback procedures, and success criteria.


Further Reading

  • Redis Cluster specification: Official documentation on Redis Cluster
  • "Scaling to 100M users" by Alex Xu: Chapter on session management
  • DynamoDB Global Tables: AWS documentation on multi-region
  • Netflix session management: Blog posts on their approach
  • Shopify session architecture: How they handle massive scale

Appendix: Complete Session Store Implementation

A.1 Production-Ready Session Store

"""
Production session store implementation tying together all Week 1 concepts.
"""

import json
import secrets
import time
import threading
import logging
from dataclasses import dataclass, asdict, field
from typing import Optional, Dict, Set, Any
from enum import Enum
import redis
from cachetools import TTLCache
from prometheus_client import Counter, Histogram, Gauge

# ============================================================================
# Metrics
# ============================================================================

session_ops = Counter(
    'session_operations_total',
    'Session store operations',
    ['operation', 'status']
)

session_latency = Histogram(
    'session_operation_seconds',
    'Session operation latency',
    ['operation'],
    buckets=[.0005, .001, .0025, .005, .01, .025, .05, .1]
)

cache_stats = Counter(
    'session_cache_total',
    'Session cache statistics',
    ['result']  # hit, miss
)

active_sessions = Gauge(
    'active_sessions_total',
    'Estimated active sessions'
)

# ============================================================================
# Data Models
# ============================================================================

@dataclass
class Session:
    session_id: str
    user_id: str
    created_at: int
    last_accessed: int
    expires_at: int
    roles: list = field(default_factory=list)
    permissions: list = field(default_factory=list)
    locale: str = 'en-US'
    timezone: str = 'UTC'
    csrf_token: str = ''
    data: dict = field(default_factory=dict)
    
    def is_expired(self) -> bool:
        return time.time() > self.expires_at
    
    def to_dict(self) -> dict:
        return asdict(self)
    
    @classmethod
    def from_dict(cls, data: dict) -> 'Session':
        return cls(**data)

class Region(Enum):
    US_EAST = "use1"
    US_WEST = "usw2"
    EU_WEST = "euw1"

# ============================================================================
# Session Store Implementation
# ============================================================================

class ProductionSessionStore:
    """
    Production session store with:
    - Redis Cluster backend
    - Local caching
    - Multi-region support
    - Comprehensive metrics
    - Security features
    """
    
    def __init__(
        self,
        redis_cluster: redis.RedisCluster,
        region: Region = Region.US_EAST,
        default_ttl: int = 86400,
        local_cache_ttl: int = 5,
        local_cache_size: int = 10000,
        extend_on_access: bool = True
    ):
        self.redis = redis_cluster
        self.region = region
        self.default_ttl = default_ttl
        self.extend_on_access = extend_on_access
        
        # Local cache
        self.local_cache = TTLCache(maxsize=local_cache_size, ttl=local_cache_ttl)
        self.cache_lock = threading.Lock()
        
        self.logger = logging.getLogger('session_store')
        
        # Invalidation via pub/sub (started last: the listener thread uses self.logger)
        self._start_invalidation_listener()
    
    # -------------------------------------------------------------------------
    # Core Operations
    # -------------------------------------------------------------------------
    
    def create(
        self,
        user_id: str,
        roles: Optional[list] = None,
        permissions: Optional[list] = None,
        **kwargs
    ) -> Session:
        """Create a new session."""
        start = time.time()
        
        try:
            now = int(time.time())
            session = Session(
                session_id=self._generate_session_id(),
                user_id=user_id,
                created_at=now,
                last_accessed=now,
                expires_at=now + self.default_ttl,
                roles=roles or [],
                permissions=permissions or [],
                csrf_token=secrets.token_urlsafe(32),
                **kwargs
            )
            
            # Store in Redis
            self._store_session(session)
            
            # Add to user index
            self._add_to_user_index(session)
            
            session_ops.labels(operation='create', status='success').inc()
            return session
            
        except Exception as e:
            session_ops.labels(operation='create', status='failure').inc()
            self.logger.error(f"Failed to create session: {e}")
            raise
        
        finally:
            session_latency.labels(operation='create').observe(time.time() - start)
    
    def get(self, session_id: str) -> Optional[Session]:
        """Retrieve a session by ID."""
        start = time.time()
        
        try:
            # Check local cache first (its short TTL bounds staleness, but an
            # entry can still outlive the session, so re-check expiry here)
            session = self._check_local_cache(session_id)
            if session and not session.is_expired():
                cache_stats.labels(result='hit').inc()
                session_ops.labels(operation='get', status='success').inc()
                return session
            
            cache_stats.labels(result='miss').inc()
            
            # Fetch from Redis
            session = self._fetch_from_redis(session_id)
            
            if session:
                if session.is_expired():
                    self.delete(session_id)
                    return None
                
                # Extend TTL if configured
                if self.extend_on_access:
                    self._extend_ttl(session)
                
                # Update local cache
                self._update_local_cache(session)
                
                session_ops.labels(operation='get', status='success').inc()
            else:
                session_ops.labels(operation='get', status='not_found').inc()
            
            return session
            
        except Exception as e:
            session_ops.labels(operation='get', status='failure').inc()
            self.logger.error(f"Failed to get session: {e}")
            return None
        
        finally:
            session_latency.labels(operation='get').observe(time.time() - start)
    
    def update(self, session_id: str, **updates) -> Optional[Session]:
        """Update session data."""
        start = time.time()
        
        try:
            session = self.get(session_id)
            if not session:
                return None
            
            # Apply updates
            for key, value in updates.items():
                if key == 'data':
                    session.data.update(value)
                elif hasattr(session, key):
                    setattr(session, key, value)
            
            session.last_accessed = int(time.time())
            
            # Store and invalidate cache
            self._store_session(session)
            self._invalidate_cache(session_id)
            
            session_ops.labels(operation='update', status='success').inc()
            return session
            
        except Exception as e:
            session_ops.labels(operation='update', status='failure').inc()
            self.logger.error(f"Failed to update session: {e}")
            raise
        
        finally:
            session_latency.labels(operation='update').observe(time.time() - start)
    
    def delete(self, session_id: str) -> bool:
        """Delete a session."""
        start = time.time()
        
        try:
            # Get session to find user_id
            session = self._fetch_from_redis(session_id)
            
            if session:
                # Remove from user index
                self.redis.srem(f"user_sessions:{session.user_id}", session_id)
            
            # Delete session
            result = self.redis.delete(f"session:{session_id}") > 0
            
            # Invalidate cache
            self._invalidate_cache(session_id)
            
            session_ops.labels(operation='delete', status='success').inc()
            return result
            
        except Exception as e:
            session_ops.labels(operation='delete', status='failure').inc()
            self.logger.error(f"Failed to delete session: {e}")
            return False
        
        finally:
            session_latency.labels(operation='delete').observe(time.time() - start)
    
    def delete_all_for_user(self, user_id: str) -> int:
        """Delete all sessions for a user (logout all devices)."""
        session_ids = self.redis.smembers(f"user_sessions:{user_id}")
        
        if not session_ids:
            return 0
        
        pipe = self.redis.pipeline()
        for sid in session_ids:
            sid_str = sid.decode() if isinstance(sid, bytes) else sid
            pipe.delete(f"session:{sid_str}")
            self._invalidate_cache(sid_str)
        
        pipe.delete(f"user_sessions:{user_id}")
        results = pipe.execute()
        
        return sum(1 for r in results[:-1] if r > 0)
    
    # -------------------------------------------------------------------------
    # Validation
    # -------------------------------------------------------------------------
    
    def validate(
        self,
        session_id: str,
        csrf_token: Optional[str] = None,
        required_roles: Optional[list] = None
    ) -> tuple[bool, Optional[str]]:
        """
        Validate a session.
        Returns (is_valid, error_message).
        """
        session = self.get(session_id)
        
        if not session:
            return False, "Session not found"
        
        if session.is_expired():
            return False, "Session expired"
        
        # Constant-time comparison avoids leaking token contents via timing
        if csrf_token and not secrets.compare_digest(session.csrf_token, csrf_token):
            return False, "Invalid CSRF token"
        
        if required_roles:
            if not set(required_roles).issubset(set(session.roles)):
                return False, "Insufficient permissions"
        
        return True, None
    
    # -------------------------------------------------------------------------
    # Internal Methods
    # -------------------------------------------------------------------------
    
    def _generate_session_id(self) -> str:
        """Generate a secure session ID with region prefix."""
        random_part = secrets.token_urlsafe(32)
        return f"sess_{self.region.value}_{random_part}"
    
    def _store_session(self, session: Session):
        """Store session in Redis."""
        key = f"session:{session.session_id}"
        ttl = session.expires_at - int(time.time())
        self.redis.setex(key, max(ttl, 1), json.dumps(session.to_dict()))
    
    def _fetch_from_redis(self, session_id: str) -> Optional[Session]:
        """Fetch session from Redis."""
        key = f"session:{session_id}"
        data = self.redis.get(key)
        
        if data:
            return Session.from_dict(json.loads(data))
        return None
    
    def _add_to_user_index(self, session: Session):
        """Add session to user's session index."""
        key = f"user_sessions:{session.user_id}"
        self.redis.sadd(key, session.session_id)
        self.redis.expire(key, self.default_ttl)
    
    def _extend_ttl(self, session: Session):
        """Extend session TTL."""
        now = int(time.time())
        session.last_accessed = now
        session.expires_at = now + self.default_ttl
        self._store_session(session)
    
    def _check_local_cache(self, session_id: str) -> Optional[Session]:
        """Check local cache for session."""
        with self.cache_lock:
            return self.local_cache.get(session_id)
    
    def _update_local_cache(self, session: Session):
        """Update local cache."""
        with self.cache_lock:
            self.local_cache[session.session_id] = session
    
    def _invalidate_cache(self, session_id: str):
        """Invalidate local cache and broadcast to other servers."""
        with self.cache_lock:
            self.local_cache.pop(session_id, None)
        
        # Broadcast invalidation
        try:
            self.redis.publish('session_invalidation', session_id)
        except Exception as e:
            self.logger.warning(f"Failed to broadcast invalidation: {e}")
    
    def _start_invalidation_listener(self):
        """Start pub/sub listener for cache invalidation."""
        def listen():
            # Retry loop: on failure, back off and resubscribe from this same
            # thread rather than recursively spawning new listener threads.
            while True:
                try:
                    pubsub = self.redis.pubsub()
                    pubsub.subscribe('session_invalidation')
                    
                    for message in pubsub.listen():
                        if message['type'] == 'message':
                            session_id = message['data']
                            if isinstance(session_id, bytes):
                                session_id = session_id.decode()
                            
                            with self.cache_lock:
                                self.local_cache.pop(session_id, None)
                
                except Exception as e:
                    self.logger.error(f"Invalidation listener error: {e}")
                    time.sleep(5)
        
        thread = threading.Thread(target=listen, daemon=True)
        thread.start()
    
    # -------------------------------------------------------------------------
    # Health Check
    # -------------------------------------------------------------------------
    
    def health_check(self) -> dict:
        """Check session store health."""
        result = {
            'status': 'healthy',
            'checks': {}
        }
        
        # Redis ping
        try:
            start = time.time()
            self.redis.ping()
            latency = (time.time() - start) * 1000
            
            result['checks']['redis_ping'] = {
                'status': 'pass',
                'latency_ms': round(latency, 2)
            }
        except Exception as e:
            result['status'] = 'unhealthy'
            result['checks']['redis_ping'] = {
                'status': 'fail',
                'error': str(e)
            }
        
        # Read/write test
        try:
            test_id = f"health_{secrets.token_hex(8)}"
            test_session = self.create(user_id='health_check')
            retrieved = self.get(test_session.session_id)
            self.delete(test_session.session_id)
            
            if retrieved and retrieved.user_id == 'health_check':
                result['checks']['read_write'] = {'status': 'pass'}
            else:
                result['status'] = 'unhealthy'
                result['checks']['read_write'] = {'status': 'fail'}
        except Exception as e:
            result['status'] = 'unhealthy'
            result['checks']['read_write'] = {
                'status': 'fail',
                'error': str(e)
            }
        
        return result


# ============================================================================
# Multi-Region Wrapper
# ============================================================================

class MultiRegionSessionStore:
    """
    Session store supporting multiple regions with session affinity.
    """
    
    def __init__(
        self,
        local_region: Region,
        stores: Dict[Region, ProductionSessionStore]
    ):
        self.local_region = local_region
        self.stores = stores
        self.local_store = stores[local_region]
    
    def create(self, user_id: str, **kwargs) -> Session:
        """Create session in local region."""
        return self.local_store.create(user_id, **kwargs)
    
    def get(self, session_id: str) -> Optional[Session]:
        """Get session from appropriate region."""
        region = self._extract_region(session_id)
        store = self.stores.get(region, self.local_store)
        return store.get(session_id)
    
    def _extract_region(self, session_id: str) -> Region:
        """Extract region from session ID."""
        # Format: sess_{region}_{random}
        parts = session_id.split('_')
        if len(parts) >= 2:
            region_code = parts[1]
            for region in Region:
                if region.value == region_code:
                    return region
        return self.local_region
    
    # Region → public endpoint mapping for cross-region redirects
    _region_urls = {
        Region.US_EAST: "https://us-east.example.com",
        Region.US_WEST: "https://us-west.example.com",
        Region.EU_WEST: "https://eu-west.example.com"
    }
    
    def get_redirect_url(self, session_id: str) -> Optional[str]:
        """Get redirect URL if the session lives in a different region."""
        region = self._extract_region(session_id)
        if region != self.local_region:
            return self._region_urls.get(region)
        return None

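A.2 Usage Sketch

A minimal wiring example for the classes above. The cluster hostname is an assumption; substitute your own startup node(s).

import redis

# Placeholder endpoint; substitute your cluster's startup node.
cluster = redis.RedisCluster(host="redis-cluster.internal", port=6379)

store = ProductionSessionStore(cluster, region=Region.US_EAST)

session = store.create(user_id="123", roles=["user"])
ok, err = store.validate(session.session_id, csrf_token=session.csrf_token)
assert ok, err

store.delete_all_for_user("123")  # "logout all devices"
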
End of Week 1: Foundations of Scale

Congratulations! You've completed Week 1. You now have a solid foundation in:

  • Data partitioning and distribution
  • Replication and consistency
  • Protection mechanisms (rate limiting)
  • Handling real-world traffic patterns (hot keys)
  • Designing complete storage systems

Next Week: Week 2 — Building Blocks. We'll dive into messaging systems, caching patterns, and coordination services that tie distributed systems together.