The CAP Theorem

The "Pick Two" Problem

There's an old saying in the trades: "Cheap, Fast, and Good — pick two." The CAP theorem applies exactly the same logic to distributed databases.

The CAP theorem states that a distributed system can deliver only two of three properties: Consistency, Availability, and Partition Tolerance.

First articulated by Eric Brewer in 2000 (and formally proven by Seth Gilbert and Nancy Lynch at MIT in 2002), CAP is the foundational principle every backend engineer must understand before choosing a database for a distributed system.


The Three Properties

C β€” Consistency

Every client sees the same data at the same time, regardless of which node they connect to.

When data is written to one node, it must be replicated to all other nodes before the write is acknowledged as successful. There is no stale read — all nodes agree on the current state.

Write to Node A → replicate to B and C → confirm success

Client reads from B → gets the latest data ✅
Client reads from C → gets the latest data ✅

This is not the same as ACID consistency (which is about data validity within a single database). CAP consistency is about read-after-write coherence across nodes.
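The write path above can be sketched as a toy cluster where a write is acknowledged only after every replica has applied it. This is an illustration of the idea, not any real database's replication protocol:

```python
# Toy synchronous replication: a write succeeds only after every
# replica has applied it, so any subsequent read is up to date.
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}

class Cluster:
    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, key, value):
        # Replicate to ALL nodes before acknowledging the write.
        for node in self.nodes:
            node.data[key] = value
        return "ack"

    def read(self, node, key):
        return node.data.get(key)

a, b, c = Node("A"), Node("B"), Node("C")
cluster = Cluster([a, b, c])
cluster.write("balance", 100)

# Read-after-write coherence: every node returns the latest value.
assert cluster.read(b, "balance") == 100
assert cluster.read(c, "balance") == 100
```

The cost of this guarantee is that the write blocks until the slowest replica responds, which is exactly the latency trade-off discussed later under PACELC.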


A β€” Availability

Every request receives a response — even if one or more nodes are down.

Every non-failing node returns a valid (non-error) response for any request it receives. The system never refuses or times out a request as long as at least one node is reachable — though the response is not guaranteed to reflect the most recent write.


P β€” Partition Tolerance

The system continues operating despite network partitions (communication failures between nodes).

A partition is a break in communication between nodes — a dropped connection, a network delay, or a split-brain scenario. Partition tolerance means the cluster keeps working even when some nodes cannot talk to each other.


Why You Can Only Pick Two

Network partitions in distributed systems are inevitable — hardware fails, networks hiccup, data centers lose connectivity. Because partitions cannot be entirely prevented, the practical choice in a distributed system is always between CP and AP.

| Choice | Trade-off | Practical Meaning |
|---|---|---|
| CP (Consistency + Partition Tolerance) | Sacrifices Availability | System rejects or blocks requests during a partition to prevent inconsistency |
| AP (Availability + Partition Tolerance) | Sacrifices Consistency | System returns data (may be stale) during a partition; syncs when the partition heals |
| CA (Consistency + Availability) | Sacrifices Partition Tolerance | Only possible if partitions never occur — impractical in distributed systems |

CA in practice: Relational databases like PostgreSQL are often called "CA databases" when deployed to a single node. When you add replication and run them across multiple nodes, you must choose between C and A during a partition — just like everything else.
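The CP/AP fork can be made concrete with a toy replica that is cut off from its peers. The two modes differ only in how the replica answers a read while partitioned (purely illustrative logic):

```python
# Toy illustration of the CP vs AP fork: how a single replica answers
# a read while it is cut off from its peers (during a partition).
STALE_VALUE = "v1"   # last value this replica saw before the partition

def read(mode, partitioned):
    """mode is 'CP' or 'AP'; partitioned means peers are unreachable."""
    if not partitioned:
        return STALE_VALUE            # normal operation: just answer
    if mode == "CP":
        raise TimeoutError("unavailable: cannot confirm latest value")
    return STALE_VALUE                # AP: answer, possibly stale

assert read("AP", partitioned=True) == "v1"   # available, maybe stale
try:
    read("CP", partitioned=True)              # consistent, not available
except TimeoutError:
    pass
```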


CP Databases: Consistency Over Availability

When a partition occurs, a CP database makes the cut-off nodes unavailable (rejecting their requests) rather than serve potentially stale data. Once the partition heals, those nodes rejoin and catch up.

Real-world example: MongoDB

MongoDB is a CP database. It uses a single-primary replica set model:

  • Only the primary node accepts writes

  • Secondary nodes replicate the primary's operation log

  • If the primary becomes unreachable, secondaries elect a new primary

  • During the election window, the cluster refuses writes to stay consistent

This means MongoDB prioritizes consistency — you'll never read data that doesn't reflect a committed write, but you may briefly lose write availability during failover.
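The failover sequence above can be sketched with a toy single-primary replica set. The class and method names are illustrative, not MongoDB's actual implementation:

```python
# Toy single-primary replica set: writes go only to the primary;
# while an election is in progress there is no primary, so writes fail.
class ReplicaSet:
    def __init__(self, members):
        self.members = members
        self.primary = members[0]

    def write(self, doc):
        if self.primary is None:
            raise RuntimeError("no primary: election in progress")
        return f"written to {self.primary}"

    def primary_failed(self):
        # Primary unreachable: the cluster refuses writes until a new
        # primary is elected, preserving consistency (CP behaviour).
        self.primary = None

    def elect(self):
        self.primary = self.members[1]   # a secondary is promoted

rs = ReplicaSet(["node-1", "node-2", "node-3"])
assert rs.write({"x": 1}) == "written to node-1"
rs.primary_failed()
try:
    rs.write({"x": 2})        # rejected during the election window
except RuntimeError:
    pass
rs.elect()
assert rs.write({"x": 2}) == "written to node-2"
```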

Other CP databases: HBase, ZooKeeper, etcd


AP Databases: Availability Over Consistency

When a partition occurs, an AP database keeps all nodes available and continues serving requests. Nodes that become partitioned may serve slightly stale data. When the partition heals, the system reconciles differences to reach eventual consistency.

Real-world example: Apache Cassandra

Cassandra is an AP database with a masterless architecture (every node is equal):

  • Any node can accept reads and writes (no single point of failure)

  • During a partition, partitioned nodes serve requests with potentially old data

  • After the partition heals, Cassandra's repair mechanism syncs diverged nodes

This means Cassandra prioritizes availability — the system is always up, but you may briefly read slightly older data.
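Reconciliation after a partition can be sketched with timestamped last-write-wins merging. This is a simplified version of the idea; real repair mechanisms are more involved:

```python
# Toy eventual consistency: two replicas diverge during a partition,
# then reconcile by keeping the value with the newest timestamp
# (last-write-wins). Data and keys are hypothetical.
replica_a = {"user:42": ("alice@old.example", 100)}   # (value, timestamp)
replica_b = {"user:42": ("alice@new.example", 250)}   # newer write

def reconcile(r1, r2):
    merged = dict(r1)
    for key, (value, ts) in r2.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

# After the partition heals, both replicas converge on the newer value.
healed = reconcile(replica_a, replica_b)
assert healed["user:42"][0] == "alice@new.example"
```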

Other AP databases: CouchDB, DynamoDB (default), Riak, Couchbase


CA Databases: Consistent and Available (No Partitions)

A CA system is consistent and always available — but only works if there are no partitions. Since partitions are unavoidable in distributed systems at scale, CA is effectively the model for single-node or tightly coupled relational databases.

| Database | Type | Notes |
|---|---|---|
| PostgreSQL (single node) | CA | Consistent + available, not distributed |
| MySQL (single node) | CA | Same — adding replication brings the CP/AP trade-off |
| SQLite | CA | Single file, not distributed |

When you do deploy PostgreSQL with replication across multiple machines, you re-enter the CP/AP trade-off depending on your replication and failover configuration.


Quick Reference: Database Classification

| Database | CAP Type | Consistency Model | Best For |
|---|---|---|---|
| MongoDB | CP | Strong (per-shard) | General purpose, documents |
| Apache Cassandra | AP | Eventual | High write throughput, IoT, time-series |
| CouchDB | AP | Eventual | Offline-first, sync scenarios |
| Amazon DynamoDB | AP (default) | Eventual / Strong (opt-in) | Serverless, key-value at scale |
| Apache HBase | CP | Strong | Big data, Hadoop ecosystem |
| etcd / ZooKeeper | CP | Strong (linearizable) | Config storage, distributed coordination |
| Redis Cluster | AP | Eventual | Cache, sessions, pub/sub |
| PostgreSQL (replicated) | CP | Strong (synchronous replication) | Relational, ACID, complex queries |


PACELC: Beyond CAP

The CAP theorem only describes behaviour during a partition. A more complete model is PACELC (proposed by Daniel Abadi):

If there is a Partition, choose between Availability and Consistency — Else (normal operation), choose between Latency and Consistency.

| Database | PACELC | Meaning |
|---|---|---|
| Cassandra | PA/EL | Available during partition, low latency normally |
| MongoDB | PC/EC | Consistent during partition, consistent normally |
| DynamoDB | PA/EL | Available during partition; low latency normally (tunable) |

PACELC helps explain why some nominally "AP" databases still show higher latency in normal operation — a system that chooses consistency outside of partitions (EC) pays a coordination cost on every write, even when the network is healthy.
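One concrete knob behind the latency-vs-consistency trade-off is quorum sizing: with N replicas, requiring W acknowledgements per write and R replies per read guarantees that read and write sets overlap (so reads see the latest write) whenever R + W > N, at the cost of waiting for more nodes. A minimal sketch of that arithmetic:

```python
# Quorum intersection: with N replicas, a read quorum R and a write
# quorum W overlap on at least one replica whenever R + W > N, so
# every read is guaranteed to see the latest write.
# Larger quorums mean waiting for more replicas (higher latency).
def quorums_overlap(n, r, w):
    return r + w > n

N = 3
assert quorums_overlap(N, r=2, w=2)        # QUORUM-style: consistent, slower
assert not quorums_overlap(N, r=1, w=1)    # ONE-style: fast, may read stale
```

Tunable-consistency databases expose exactly this dial per request, letting you slide between the "EL" and "EC" ends of PACELC.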


Applying CAP When Choosing a Database

Ask these questions when picking a database for a distributed service:

Question 1: Can your application tolerate stale reads?

  • Yes (e.g., social media feeds, product catalog, analytics dashboards) → AP database (Cassandra, DynamoDB) is fine

  • No (e.g., bank account balance, inventory count, payment status) → CP database (PostgreSQL, MongoDB) is required

Question 2: What happens during a network partition?

  • Must stay available at all costs (even with stale data)? → AP

  • Must stay consistent even if unavailable for a short period? → CP

Question 3: Real-world examples by domain

| Domain | Recommended | Reason |
|---|---|---|
| Payment processing | CP (PostgreSQL) | Cannot have inconsistent balances |
| Shopping cart | AP (DynamoDB) | Stale reads acceptable; availability critical |
| User sessions | AP (Redis) | Slight staleness fine; must be fast |
| Inventory management | CP (PostgreSQL) | Preventing overselling requires consistency |
| Activity feed / timeline | AP (Cassandra) | High write volume; eventual consistency fine |
| Configuration / feature flags | CP (etcd) | All nodes must see the same config |
| Log aggregation | AP (Cassandra) | Volume over consistency |


Connection to ACID and Transactions

You've already learned about ACID in the context of a single database. CAP is the distributed systems equivalent:

| Property | ACID (single database) | CAP (distributed system) |
|---|---|---|
| Consistency | Data stays valid per schema rules | All nodes see the same data |
| Isolation | Concurrent transactions don't interfere | N/A (CAP doesn't address this) |
| Atomicity | All-or-nothing per transaction | N/A |
| Durability | Committed data survives crashes | Partition tolerance is loosely related |

In microservices and distributed architectures, you often cannot use ACID transactions across services. This is where patterns like Saga, Two-Phase Commit (2PC), and eventual consistency come in — all shaped by CAP trade-offs.
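To give a flavour of what replaces cross-service ACID, here is a minimal saga sketch: each step has a compensating action that undoes it if a later step fails. The step names are hypothetical:

```python
# Minimal saga: run steps in order; if one fails, run the compensating
# actions of the already-completed steps in reverse order. There is no
# cross-service ACID transaction, only explicit undo logic.
def run_saga(steps):
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        for compensate in reversed(done):
            compensate()   # undo in reverse order
        return "rolled back"
    return "committed"

log = []

def fail_shipping():
    raise RuntimeError("shipping failed")

steps = [
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"),   lambda: log.append("refund card")),
    (fail_shipping,                        lambda: None),
]
assert run_saga(steps) == "rolled back"
assert log == ["reserve stock", "charge card", "refund card", "release stock"]
```

Note the compensations run newest-first, mirroring how a database rolls back a transaction log.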


Key Takeaways

  • The CAP theorem says a distributed system can guarantee at most two of three: Consistency, Availability, Partition Tolerance

  • Since partitions are unavoidable in real distributed systems, the practical choice is always CP vs AP

  • CP (MongoDB, etcd, PostgreSQL with sync replication): blocks or rejects requests during a partition to stay consistent

  • AP (Cassandra, CouchDB, DynamoDB): stays available during a partition, reconciles to eventual consistency afterwards

  • CA (single-node PostgreSQL, MySQL): not truly distributed; becomes CP or AP once you add replication

  • PACELC extends CAP to also describe the latency vs consistency trade-off during normal operation

  • Choose your database based on your application's tolerance for stale reads and brief unavailability

