4 min read By Vamsi Karuturi · Senior Backend Engineer at Salesforce

Database Sharding

Real Incident: Instagram, 2012

Instagram hit 25M users on a single PostgreSQL instance. Write throughput maxed at 15K TPS. Vertical scaling hit $30K/month EC2 ceiling. They sharded by user_id — each shard holds a range of users. Scaled to 2B+ accounts. Every large-scale system eventually shards. The question isn't IF but WHEN and HOW.

Why This Comes Up in Every Design Interview

When you say "I'll use PostgreSQL" in an interview, the natural follow-up is: "What happens at 100M users? 1B rows? 50K writes/sec?" The answer is sharding. Interviewers specifically test:

Can you identify WHEN to shard (not too early, not too late)
Can you choose a shard key that avoids hotspots
Do you understand the problems sharding creates (joins, transactions, resharding)
Can you reason about alternatives before jumping to sharding

The Scaling Staircase — Shard LAST, Not First

Stage	Strategy	Handles	Cost
1	Optimize queries + indexes	10x current load	Free
2	Vertical scaling (bigger machine)	Up to ~$50K/month instance	Money
3	Read replicas (write to primary, read from replicas)	5-10x read throughput	Replication lag complexity
4	Caching (Redis/Memcached in front of DB)	100x read throughput	Cache invalidation
5	Table partitioning (within single instance)	Larger data, faster queries	DB-managed, limited
6	Sharding (data across multiple instances)	Near-unlimited scale	Significant complexity

Interview gold: "I'd exhaust vertical scaling, read replicas, and caching before sharding. Sharding adds enormous complexity — cross-shard queries, distributed transactions, resharding operations. But when a single machine can't handle the write throughput or data volume, it's the only option."

Back-of-Envelope: When to Shard

PostgreSQL practical limits on a single instance (~$20K/month machine):

Resource	Practical Limit	Warning Sign
Table size	~1-5 TB	Queries slow, vacuum takes hours
Write TPS	~20-50K TPS	Write latency increasing
Connections	~500-1000	Connection pool exhaustion
Storage IOPS	~50K IOPS (NVMe)	I/O wait increasing

Example calculation: Social media platform, 500M users, 5 posts/user/day:

Metric	Value	Calculation
Daily posts	2.5B	500M × 5
Writes/sec	~29K	2.5B / 86400
Peak writes/sec (10x)	~290K	Well beyond single-instance
Storage/year	~25 TB	2.5B × 365 × ~30 bytes per row
Verdict	Must shard	Single instance maxes at ~50K writes/sec

Choosing a Shard Key (The Most Critical Decision)

The shard key determines everything: query routing, data distribution, hotspot risk. Once chosen, it's extremely expensive to change.

Properties of a good shard key:

Property	Why	Example
High cardinality	Even distribution across many shards	user_id (millions of values)
Even distribution	No shard gets disproportionate load	UUID > sequential ID (sequential causes write hotspot on latest shard)
Matches access pattern	Queries hit one shard, not all	Shard by user_id if most queries are per-user
Immutable	Key change = data movement	user_id (never changes) > email (can change)
Not correlated with time	Avoids "hot newest shard"	Hash of user_id > creation_date

How to evaluate a shard key — the scatter/gather test:

"What % of my queries can be answered by a single shard?"
Target: >95% single-shard queries
If most queries need ALL shards (scatter-gather), your key is wrong

Sharding Strategies Compared

Hash-Based Sharding

Text Only

shard = hash(user_id) % num_shards

Pros	Cons
Even distribution	Range queries impossible (scan all shards)
Simple routing	Adding shards = rehash everything (unless consistent hashing)
No hotspots (with good hash)	Can't colocate related data

When: Even distribution is priority, access is by primary key.

Range-Based Sharding

Text Only

Shard 1: user_id 1 - 1M
Shard 2: user_id 1M - 2M
...

Pros	Cons
Range queries efficient	Hotspot on "active range" (newest users)
Easy to understand	Uneven distribution over time
Can add shards for new ranges	Requires manual rebalancing

When: Data is naturally ordered and range queries are important (time-series, analytics).

Directory-Based Sharding

Key	Shard
user_1 - user_500K	shard_1
user_500K - user_1.2M	shard_2
user_1.2M - user_3M	shard_3

Pros	Cons
Flexible mapping (any logic)	Directory = SPOF + bottleneck
Easy rebalancing	Extra network hop for lookup
Handles uneven data	Directory must be cached/replicated

When: Need flexibility, or combining strategies (move hot users to dedicated shards).

Compound Shard Keys (DynamoDB Pattern)

Problem: You need both even distribution AND range queries within an entity.

Solution: Compound key = Partition Key (hash distribution) + Sort Key (range within partition)

Example	Partition Key	Sort Key	Query Pattern
User's posts	user_id	timestamp	"All posts by user in date range"
Order items	order_id	item_id	"All items in an order"
Chat messages	conversation_id	message_timestamp	"Recent messages in conversation"

Why this works: All data for one entity (user, order, conversation) lives on the same shard. Range queries work within-shard. No scatter-gather needed.

The Hard Problems Sharding Creates

1. Cross-Shard Queries (Scatter-Gather)

Problem: "Show all orders over $100 across all users"

Must query EVERY shard, merge results
Latency = slowest shard
Can't use DB-level sorting/pagination across shards

Mitigations:

Denormalize: duplicate data to enable single-shard queries
Secondary index service: Elasticsearch indexes all data for complex queries
Materialized views: Pre-compute aggregates

2. Cross-Shard Transactions

Problem: Transfer money from user on Shard 1 to user on Shard 3

No single DB transaction spans shards
Need distributed transaction (2PC) or eventual consistency

Mitigations:

Approach	How	Trade-off
Two-Phase Commit (2PC)	Coordinator asks all shards to prepare, then commit	Slow, blocks on failures
Saga pattern	Sequence of local transactions + compensations	Eventually consistent, complex
Co-locate	Put related data on same shard	Limits flexibility

3. Resharding (Adding Shards)

Problem: Started with 10 shards, now need 20. With hash % N, ALL keys remap.

Solutions:

Strategy	How	Downtime
Consistent hashing	Only 1/N keys move	Minimal
Logical shards	Create 1000 logical shards, map to physical. Remap = move logical shards.	Minutes
Double-write migration	Write to old + new, read from new once caught up	Zero downtime
Vitess/ProxySQL	Middleware handles resharding transparently	Application unaware

4. Unique Constraints Across Shards

Problem: "Email must be unique globally" but data is on different shards.

Solutions:

Global unique index (separate service/table)
Shard by the unique field (email)
UUIDs instead of sequences (guaranteed unique without coordination)
Snowflake IDs (Twitter): timestamp + machine_id + sequence = unique + sortable

5. JOINs Across Shards

Problem: SELECT * FROM orders JOIN users ON... when orders and users on different shards.

Solutions:

Denormalize: Store user_name in orders table (duplicate data)
Application-level join: Fetch from both shards, join in code
Co-locate: Shard orders by user_id too (same shard)
Separate analytics DB: ETL into data warehouse for complex queries

Real-World Sharding Examples

Company	Strategy	Shard Key	Details
Instagram	Hash (pgbouncer + custom)	user_id	Logical shards → physical mapping
Uber	Geo + hash	city_id	Trips sharded by city for locality
Stripe	Directory	merchant_id	Hot merchants on dedicated shards
Discord	Hash (Cassandra)	guild_id	Messages sharded per guild
Vitess (YouTube)	Hash (MySQL)	Configurable keyspace	Middleware manages sharding transparently
Pinterest	Hash (MySQL)	object_id	8192 logical shards, ~100 physical
Facebook	Hash (MySQL + TAO)	user_id	Thousands of shards, custom routing

Logical vs Physical Shards (Pinterest/Uber Pattern)

The trick: Don't map keys directly to physical machines. Add an indirection layer.

Text Only

Step 1: key → logical_shard_id (hash % 8192)
Step 2: logical_shard_id → physical_machine (lookup table)

Why:

Start with 8192 logical shards on 4 physical machines (2048 logical per physical)
Need more capacity? Move 1024 logical shards to a new machine
No key remapping, no data copying logic change
Just update the mapping table

Pinterest's approach: Started with 8192 logical shards. Currently mapped to ~150 physical database instances. Never needed to rehash.

Interview Framework

When the interviewer asks about scaling your database:

Step 1 — Delay sharding: "First I'd add read replicas for read-heavy queries and a cache layer. This gets us to [X] scale without sharding complexity."

Step 2 — When to shard: "Once we exceed [single-machine write capacity / storage], I'd introduce sharding. For this system, that's around [Y] users / [Z] writes/sec."

Step 3 — Shard key choice: "I'd shard by [user_id/merchant_id/entity_id] because [95%+ of queries are per-entity, it's immutable, high cardinality, evenly distributed]."

Step 4 — Strategy: "I'd use [hash-based with consistent hashing / logical shards mapped to physical] so resharding doesn't require data rehashing."

Step 5 — Address hard problems: "For cross-shard queries, I'd [denormalize / use Elasticsearch]. For distributed transactions, I'd use the [saga pattern / co-locate]. For unique constraints, [global index / UUID]."

Quick Recall

Question	Answer
When to shard?	After exhausting vertical scaling + read replicas + caching
Best shard key properties?	High cardinality, even distribution, matches access pattern, immutable
Scatter-gather test?	>95% queries should hit single shard
Hash vs range?	Hash = even distribution. Range = efficient range queries.
Resharding without downtime?	Logical shard indirection OR consistent hashing
Cross-shard joins?	Denormalize, co-locate, or separate analytics service
Unique constraints across shards?	Global unique index, UUID, or shard by the unique field
Pinterest pattern?	8192 logical shards → N physical machines. Remap mapping, not data.

Database Sharding

Why This Comes Up in Every Design Interview

The Scaling Staircase — Shard LAST, Not First

Back-of-Envelope: When to Shard

Choosing a Shard Key (The Most Critical Decision)

Sharding Strategies Compared

Hash-Based Sharding

Range-Based Sharding

Directory-Based Sharding

Compound Shard Keys (DynamoDB Pattern)

The Hard Problems Sharding Creates

1. Cross-Shard Queries (Scatter-Gather)

2. Cross-Shard Transactions

3. Resharding (Adding Shards)

4. Unique Constraints Across Shards

5. JOINs Across Shards

Real-World Sharding Examples

Logical vs Physical Shards (Pinterest/Uber Pattern)

Interview Framework

Quick Recall

5-Minute System Design — Weekly