3 min read By Vamsi Karuturi · Senior Backend Engineer at Salesforce

CAP Theorem

Real Incident: GitHub, October 2018

A 43-second network partition split GitHub's MySQL cluster. Both sides promoted a primary. Two databases accepted writes simultaneously. Result: 24 hours of degraded service, millions of developers affected. This is CAP theorem hitting production.

The 30-Second Explanation

In a distributed system, when a network partition happens, you must choose:

🔒

Consistency (CP)

"I'd rather show an error than wrong data"

Banks, inventory, bookings

🌐

Availability (AP)

"I'd rather show stale data than nothing"

Social feeds, DNS, shopping carts

The key insight: You're not choosing between C, A, and P. Partitions are inevitable. You're choosing between C and A when a partition happens.

The Pizza Shop Analogy

You own 3 pizza shops in SF, NYC, and Chicago. All must serve the same menu.

The phone lines between shops go down (partition). A customer in Chicago asks for a pizza you just added in SF 5 minutes ago.

You choose...	What happens	Real-world equivalent
Consistency	"Sorry, system is down. Come back later."	Banking app during outage
Availability	"Here's our menu!" (missing the new pizza)	Netflix showing stale recommendations

That's it. That's the entire theorem.

What FAANG Interviewers Actually Ask

"Given this system, would you choose CP or AP?"

Framework to answer:

If the data is...	Choose	Because	Example
Financial / transactional	CP	Wrong balance = lawsuit	Stripe, banks
User-generated content	AP	Stale likes > error page	Instagram, Twitter
Inventory / booking	CP	Overselling = real money lost	Uber seats, airline tickets
Session / preference	AP	Show old settings > force re-login	Netflix, Spotify
Leader election / config	CP	Split-brain = catastrophe	ZooKeeper, etcd

Real Systems Mapped

System	Choice	What happens during partition
ZooKeeper	🔒 CP	Minority side stops accepting writes. Waits for quorum.
etcd	🔒 CP	Raft consensus — no leader = no writes.
Google Spanner	🔒 CP	TrueTime + Paxos. Blocks rather than diverge.
Cassandra	🌐 AP	Every node accepts writes. Resolves conflicts later (last-write-wins).
DynamoDB	🌐 AP	Eventually consistent by default. Strong consistency optional (costs 2x).
MongoDB	🔒 CP	Primary goes down → election (10-30s downtime). Reads can be stale on secondaries.

PACELC — The Follow-Up They Always Ask

"OK, so you know CAP. What about when there's NO partition?"

PACELC extends CAP: if P**artition → choose **A or C. **E**lse (normal operation) → choose **L**atency or **C**onsistency.

System	During Partition	Normal Operation
Cassandra	A (stay available)	L (fast, eventually consistent)
DynamoDB	A	L (default) or C (strong reads cost 2x)
ZooKeeper	C (refuse writes)	C (always consistent, higher latency)
Spanner	C	C (TrueTime makes it fast despite consistency)
MongoDB	C	L (reads from secondaries are faster but stale)

Interview Gold

When an interviewer asks "What's the trade-off of using DynamoDB?" — answer with PACELC: "It's PA/EL by default. Available during partitions, low-latency in normal operation, but eventually consistent. You can opt into strong consistency per-read at 2x cost."

Consistency Models (From Strict to Relaxed)

Model	Promise	Real-World Feel
Linearizability	Every read sees the absolute latest write	Your bank balance — always correct
Sequential	All nodes agree on ordering, maybe slightly behind	A shared Google Doc
Causal	If A caused B, everyone sees A before B	Chat messages in order
Eventual	Everyone will eventually agree	Instagram like count (might be off by a few)

The 3 Interview Mistakes That Get You Rejected

Don't Say These

"Just pick CP for everything" — Shows you don't understand the trade-off. No interviewer wants a system that goes down during every network blip.
"CA is a valid option" — In any real distributed system, partitions WILL happen. CA only exists on a single machine (not distributed).
"Eventual consistency means data loss" — No. It means temporary staleness. All writes are eventually propagated. No data is lost.

Your Interview Answer Template

When asked "How would you handle consistency in [System X]?"

"For [specific use case], I'd choose [CP/AP] because [reason tied to business impact]. During normal operation, I'd optimize for [latency/consistency]. Specifically, I'd use [real system] which gives me [specific guarantee]. For the parts that need [opposite choice], I'd use a separate store — for example [second system] for [that specific data]."

Example: "For the payment service, I'd choose CP using PostgreSQL with synchronous replication — a wrong balance is worse than brief unavailability. But for the activity feed, I'd use Cassandra (AP) because showing a slightly stale feed is fine, and we need sub-10ms reads globally."

Quick Recall Card

Question	Answer
What is CAP?	During a partition, choose Consistency or Availability
Why not both?	Partitions are inevitable in distributed systems
CP examples?	ZooKeeper, etcd, Spanner, HBase, MongoDB
AP examples?	Cassandra, DynamoDB, CouchDB, DNS
What's PACELC?	Extends CAP: what do you trade-off when there's NO partition?
Most common mistake?	Treating it as "pick 2 of 3" instead of "CP or AP during partition"