Scaling Patterns Reference

📈 Horizontal vs Vertical Scaling

Vertical (Scale Up)

Add more power (CPU, RAM, storage) to an existing machine

✓ Simple, no code changes
✓ No distributed complexity
✗ Hardware limits (ceiling)
✗ Single point of failure
✗ Expensive at scale

Best for: Databases, quick wins

Horizontal (Scale Out)

Add more machines

✓ Near-unlimited scaling
✓ Fault tolerance (redundancy)
✓ Cost-effective at scale
✗ Distributed complexity
✗ Data consistency challenges

Best for: Stateless services, high traffic

⚖️ Load Balancing Strategies

Algorithm	How It Works	Best For
Round Robin	Rotate through servers sequentially	Similar capacity servers
Weighted Round Robin	More traffic to higher-weight servers	Mixed server capacities
Least Connections	Route to server with fewest active connections	Long-lived connections
IP Hash	Same IP always routes to same server	Session affinity needed
Random	Random server selection	Simple, statistically even
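
The selection rules in the table can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements them; the server names and the helper function names are made up for the example, and the weighted variant is the naive "repeat each server by its weight" form.

```python
import hashlib
import itertools
import random

SERVERS = ["app-1", "app-2", "app-3"]  # hypothetical backend names


def round_robin(servers):
    """Return an iterator that rotates through servers in order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Naive weighted rotation: each server appears `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server with a hash that is stable across runs."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def random_choice(servers, rng=random):
    """Pick any server uniformly at random."""
    return rng.choice(servers)


rr = round_robin(SERVERS)
first_four = [next(rr) for _ in range(4)]  # wraps back to app-1
busiest_avoided = least_connections({"app-1": 5, "app-2": 2, "app-3": 7})
```

Note that `ip_hash` keeps a client on the same server only while the server list is unchanged; adding or removing a server changes the modulus and remaps many clients.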

🔀 Database Sharding

Sharding Strategies

Hash-based: hash(user_id) % N shards
→ Even distribution, but adding shards remaps most keys
Range-based: A-M on shard1, N-Z on shard2
→ Easy range queries, can get unbalanced
Directory-based: Lookup table maps key → shard
→ Flexible, but the lookup table can become a bottleneck
Geo-based: Shard by user location
→ Lower latency, cross-region queries hard
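
As a rough sketch, each of the four strategies boils down to a small routing function. The shard names, the `DIRECTORY` and `GEO_SHARDS` tables, and the function names below are assumptions made for illustration; a real system would load these mappings from configuration or a metadata service.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash that is identical across processes (unlike built-in hash() on str)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def hash_shard(user_id: str, num_shards: int) -> int:
    """Hash-based: hash(user_id) % N."""
    return stable_hash(user_id) % num_shards


# Range-based: names starting with A-M go to shard1, N-Z to shard2.
RANGE_STARTS = ["A", "N"]
RANGE_SHARDS = ["shard1", "shard2"]


def range_shard(name: str) -> str:
    idx = bisect.bisect_right(RANGE_STARTS, name[0].upper()) - 1
    return RANGE_SHARDS[max(idx, 0)]  # anything sorting before "A" falls into shard1


# Directory-based: an explicit lookup table (hypothetical entries).
DIRECTORY = {"tenant-42": "shard3"}


def directory_shard(key: str, default: str = "shard1") -> str:
    return DIRECTORY.get(key, default)


# Geo-based: shard chosen by the user's region (hypothetical regions).
GEO_SHARDS = {"eu": "shard-eu", "us": "shard-us"}


def geo_shard(region: str) -> str:
    return GEO_SHARDS[region]
```

For example, `range_shard("Alice")` returns `"shard1"` and `range_shard("Nina")` returns `"shard2"`, while `hash_shard` spreads keys without regard to their order, which is why range queries are easy in the first case and not in the second.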

⚠️ Sharding Challenges

❌ Cross-shard joins - Very expensive
❌ Rebalancing - Moving data is painful
❌ Hotspots - Popular keys overwhelm one shard
❌ Transactions - ACID across shards is hard
❌ Unique constraints - Global uniqueness is complex
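
To make the rebalancing cost concrete, the sketch below counts how many keys land on a different shard when a plain `hash % N` scheme grows from 4 to 5 shards. The key names and the helper functions are invented for the example. With a uniform hash, a key stays put only when its hash gives the same remainder modulo 4 and modulo 5, which happens for about 1 in 5 keys, so roughly 80% of the data has to move.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Modulo sharding on a stable hash of the key."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(keys, old_n: int, new_n: int) -> float:
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


keys = [f"user-{i}" for i in range(10_000)]  # synthetic keys
moved = fraction_moved(keys, old_n=4, new_n=5)
print(f"{moved:.0%} of keys change shard when going from 4 to 5 shards")
```

Schemes such as consistent hashing or a directory of fixed-size virtual partitions are commonly used to reduce this movement; they are not shown here.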

📋 Database Replication

Type	How It Works	Trade-offs
Primary-Replica (Master-Slave)	Writes → primary; reads → replicas	✓ Read scalability ✗ Replica lag, write bottleneck
Multi-Primary (Multi-Master)	Any node accepts writes; conflict resolution needed	✓ Write availability ✗ Conflict complexity
Synchronous	Wait for replicas before ack; strong consistency	✓ No data loss ✗ Higher latency
Asynchronous	Ack immediately, replicate later; eventual consistency	✓ Low latency ✗ Possible data loss
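
The difference between synchronous and asynchronous acknowledgement can be shown with a toy in-memory model. The `Primary` and `Replica` classes below are purely illustrative and do not correspond to any database's API; the `flush` method stands in for the background replication stream.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = []  # writes acknowledged but not yet replicated (async only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: every replica applies the write before the ack.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Asynchronous: ack now, ship the write later.
            self.pending.append((key, value))
        return "ack"

    def flush(self):
        """Simulate the replication stream catching up."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica.apply(key, value)
        self.pending.clear()


sync_replica = Replica()
Primary([sync_replica], synchronous=True).write("k", 1)

async_replica = Replica()
async_primary = Primary([async_replica], synchronous=False)
async_primary.write("k", 1)
lagging = "k" not in async_replica.data  # replica has not seen the write yet
async_primary.flush()
```

In the asynchronous case, anything still in `pending` when the primary fails is the "possible data loss" from the table, and a read from the replica before `flush` is the replica lag.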

🧩 Common Scaling Patterns

CDN

Cache static content at edge locations

→ Lower latency, reduce origin load
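
A CDN edge node behaves, in essence, like a time-limited cache in front of the origin. The sketch below assumes a simple fixed TTL per object; the `EdgeCache` class, its parameters, and the `origin_requests` counter are hypothetical names for the example, and real CDNs also honor cache headers, purge requests, and size limits.

```python
import time


class EdgeCache:
    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin  # callable: path -> body
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._entries = {}               # path -> (body, expires_at)
        self.origin_requests = 0

    def get(self, path):
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and entry[1] > now:
            return entry[0]              # cache hit: origin is not contacted
        body = self._fetch(path)         # miss or expired: go to the origin
        self.origin_requests += 1
        self._entries[path] = (body, now + self._ttl)
        return body


cache = EdgeCache(lambda path: f"body of {path}", ttl_seconds=60.0)
cache.get("/logo.png")
cache.get("/logo.png")  # served from the edge; origin_requests stays at 1
```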

Message Queue

Decouple producers and consumers

→ Handle traffic spikes, async processing
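
An in-process analogue of this pattern uses Python's standard `queue.Queue` with worker threads. This is only a sketch of the decoupling idea: a production setup would use a broker such as RabbitMQ, Kafka, or SQS, and the `run_pipeline` function and the "double the job" work are placeholders.

```python
import queue
import threading


def run_pipeline(jobs, num_workers=2, max_queue_size=100):
    """Producer puts jobs on a bounded queue; workers drain it independently."""
    q = queue.Queue(maxsize=max_queue_size)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = q.get()
            if job is None:        # sentinel: no more work for this worker
                break
            with lock:
                results.append(job * 2)  # stand-in for real processing

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for job in jobs:
        q.put(job)                 # blocks when the queue is full (backpressure)
    for _ in workers:
        q.put(None)                # one sentinel per worker
    for w in workers:
        w.join()
    return results


processed = sorted(run_pipeline(range(10)))
```

The bounded queue is what absorbs a spike: producers keep enqueueing at their own pace up to `max_queue_size`, while consumers work through the backlog at theirs.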

Microservices

Split into independent services

→ Scale bottlenecks independently

CQRS

Separate read and write models

→ Optimize each path independently
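
A minimal CQRS sketch, assuming an order domain invented for the example: the write side validates commands and records events, and the read side keeps a denormalized view shaped for one query. The class and method names are hypothetical, and the projection here is applied synchronously, whereas many CQRS systems update the read model asynchronously.

```python
from collections import defaultdict


class OrderQueries:
    """Read model: precomputed totals, cheap to query."""

    def __init__(self):
        self._total_by_user = defaultdict(int)

    def project(self, event):
        if event["type"] == "order_placed":
            self._total_by_user[event["user_id"]] += event["amount"]

    def total_spent(self, user_id):
        return self._total_by_user.get(user_id, 0)


class OrderCommands:
    """Write model: validates input and records what happened."""

    def __init__(self, read_model):
        self.events = []
        self._read_model = read_model

    def place_order(self, user_id, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        event = {"type": "order_placed", "user_id": user_id, "amount": amount}
        self.events.append(event)
        self._read_model.project(event)  # keep the read side up to date


queries = OrderQueries()
commands = OrderCommands(queries)
commands.place_order("u1", 30)
commands.place_order("u1", 12)
```

After these two commands, `queries.total_spent("u1")` returns 42 without scanning the order history.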

Read Replicas

Replicate DB for read queries

→ Scale reads without sharding
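
One way to picture the routing is a small helper that sends writes to the primary and spreads reads over the replicas. The `ReplicaRouter` class is an assumption for illustration, and detecting reads by a leading `SELECT` is a deliberate simplification; the `read_your_writes` flag shows the usual escape hatch for reads that cannot tolerate replica lag.

```python
import itertools


class ReplicaRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # round robin over replicas

    def route(self, sql, read_your_writes=False):
        """Return the node that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and not read_your_writes:
            return next(self._replicas)
        return self.primary  # writes, and reads that must see the latest data


router = ReplicaRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM users"),
    router.route("select * from orders"),
    router.route("INSERT INTO users VALUES (1)"),
    router.route("SELECT * FROM users WHERE id = 1", read_your_writes=True),
]
```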

Connection Pooling

Reuse DB connections

→ Reduce connection overhead
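
A bare-bones pool can be built from a bounded queue of pre-opened connections. The sketch uses the standard-library `sqlite3` module so it runs anywhere; the `ConnectionPool` class is hypothetical, and in practice you would normally rely on the pooling built into your driver or framework.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    def __init__(self, size, database=":memory:"):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            # Note: each ":memory:" connection is its own separate database.
            conn = sqlite3.connect(database, check_same_thread=False)
            self._pool.put(conn)

    @contextmanager
    def connection(self, timeout=5.0):
        """Borrow a connection; it is returned to the pool on exit."""
        conn = self._pool.get(timeout=timeout)  # raises queue.Empty if exhausted
        try:
            yield conn
        finally:
            self._pool.put(conn)

    def close(self):
        while not self._pool.empty():
            self._pool.get_nowait().close()


pool = ConnectionPool(size=2)
with pool.connection() as conn:
    row = conn.execute("SELECT 1").fetchone()
pool.close()
```

The pool size caps how many connections the database sees, no matter how many requests the application handles concurrently; callers beyond that limit wait up to `timeout` seconds.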

✅ Scaling Checklist

Before Scaling

☐ Profile and identify bottleneck
☐ Optimize queries and indexes
☐ Add caching layer
☐ Consider vertical scaling first

Horizontal Scaling

☐ Make services stateless
☐ Add load balancer
☐ Externalize session state
☐ Plan for failure (redundancy)
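
The "stateless service, externalized session" items can be illustrated with two application instances that share one session store. `SessionStore` here is an in-memory stand-in for an external store such as Redis or a database, and all names are made up for the example.

```python
class SessionStore:
    """Stand-in for an external session store shared by all instances."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return dict(self._data.get(session_id, {}))  # copy, like a network read

    def set(self, session_id, session):
        self._data[session_id] = dict(session)


class AppInstance:
    def __init__(self, name, store):
        self.name = name
        self.store = store  # the instance itself keeps no session state

    def handle(self, session_id):
        session = self.store.get(session_id)
        session["visits"] = session.get("visits", 0) + 1
        self.store.set(session_id, session)
        return f"{self.name}: visit {session['visits']}"


store = SessionStore()
app_1 = AppInstance("app-1", store)
app_2 = AppInstance("app-2", store)
first = app_1.handle("session-abc")
second = app_2.handle("session-abc")  # different instance, same session
```

Because the visit count lives in the store, the load balancer can send each request to any instance, and losing one instance does not lose the session.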

🎯 Interview Template

"For scaling this system, I'd first ensure the application layer is stateless so we can horizontally scale with a load balancer. For the database, I'd start with read replicas for [read-heavy workload]. If we need to scale writes, I'd consider sharding by [user_id/geo] using [hash-based] partitioning."
