# The CAP Theorem

## The "Pick Two" Problem

There's an old saying in the trades: *"Cheap, Fast, and Good — pick two."* The **CAP theorem** applies exactly the same logic to distributed databases.

> The CAP theorem states that a distributed system can deliver only **two** of three properties: **Consistency**, **Availability**, and **Partition Tolerance**.

First articulated by Professor Eric A. Brewer in 2000 (and formally proven by Seth Gilbert and Nancy Lynch at MIT in 2002), CAP is the foundational principle every backend engineer must understand before choosing a database for a distributed system.

***

## The Three Properties

### C — Consistency

**Every client sees the same data at the same time, regardless of which node they connect to.**

When data is written to one node, it must be replicated to all other nodes *before* the write is acknowledged as successful. There is no stale read — all nodes agree on the current state.

```
Write to Node A → replicate to B and C → confirm success
              
Client reads from B → gets the latest data ✅
Client reads from C → gets the latest data ✅
```

This is *not* the same as ACID consistency (which is about data validity within a single database). CAP consistency is about **read-after-write coherence across nodes**.
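The write path above can be sketched as a toy simulation (plain Python, not any real database's API): the write is acknowledged only after *every* replica has applied it, so a subsequent read from any node returns the latest value.

```python
class Node:
    """A single replica holding a key-value store."""
    def __init__(self, name):
        self.name = name
        self.data = {}

class SyncReplicatedStore:
    """Toy consistency-first store: a write is acknowledged only
    after it has been applied to every node (synchronous replication)."""
    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, key, value):
        # Replicate to all nodes *before* confirming success.
        for node in self.nodes:
            node.data[key] = value
        return "ack"  # acknowledged only now

    def read(self, key, node_name):
        # Reading from any node returns the same, latest value.
        node = next(n for n in self.nodes if n.name == node_name)
        return node.data[key]

store = SyncReplicatedStore([Node("A"), Node("B"), Node("C")])
store.write("balance", 100)
print(store.read("balance", "B"))  # 100, never stale
print(store.read("balance", "C"))  # 100
```

The cost is visible in the `write` loop: the acknowledgement waits on every replica, which is exactly the latency price CP systems pay.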

***

### A — Availability

**Every request receives a response — even if one or more nodes are down.**

Every non-failing node in the distributed system returns a valid (non-error) response for *any* request. The system never refuses a request outright as long as at least one node is reachable.

```
Node B is down
              
Client reads from A → gets a response ✅
Client reads from C → gets a response ✅
(even if the data might be slightly stale)
```
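From a client's perspective, availability often looks like simple failover: try replicas in order and accept the first response, even if it might be stale. A minimal sketch with hypothetical node objects (not a real driver API):

```python
class Replica:
    def __init__(self, name, value, up=True):
        self.name, self.value, self.up = name, value, up

    def get(self):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.value

def read_any(replicas):
    """Return the first successful response; fail only if *all* nodes do."""
    for r in replicas:
        try:
            return r.get()
        except ConnectionError:
            continue  # fall back to the next node
    raise ConnectionError("no node reachable")

nodes = [
    Replica("B", value=None, up=False),  # Node B is down
    Replica("A", value="v2"),
    Replica("C", value="v1"),            # slightly stale copy
]
print(read_any(nodes))  # v2: a response despite B being down
```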

***

### P — Partition Tolerance

**The system continues operating despite network partitions (communication failures between nodes).**

A partition is a break in communication between nodes — a dropped connection, a network delay, or a split-brain scenario. Partition tolerance means the cluster keeps working even when some nodes cannot talk to each other.

```
Node A ←─✗─→ Node B  (partition: A and B cannot communicate)

Partition tolerant system: both A and B continue serving requests
Non-tolerant system: shuts down or becomes unavailable
```
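The split-brain scenario can be made concrete: if both sides of a partition keep accepting writes without coordinating, their copies diverge until the partition heals. A toy sketch:

```python
class PartitionedNode:
    """A node that keeps serving locally when its peer link is cut."""
    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value):
        self.data[key] = value  # applied locally, no coordination

a, b = PartitionedNode("A"), PartitionedNode("B")

# Partition: A and B cannot communicate, but both stay up.
a.write("config", "blue")
b.write("config", "green")

print(a.data["config"], b.data["config"])  # blue green
assert a.data != b.data  # the copies disagree until reconciliation
```

This divergence is the thing every partition-tolerant design must eventually deal with, which is where the CP vs AP choice below comes from.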

***

## Why You Can Only Pick Two

Network partitions in distributed systems are **inevitable** — hardware fails, networks hiccup, data centers lose connectivity. Because partitions cannot be entirely prevented, the practical choice in a distributed system is always between **CP** or **AP**.

```
                 CAP Triangle

               Consistency (C)
                     /\
                    /  \
  CA (theoretical: /    \ CP
   no partitions) /      \
                 /________\
   Availability      AP      Partition
        (A)                  Tolerance (P)
```

| Choice                                      | Trade-off                      | Practical Meaning                                                                 |
| ------------------------------------------- | ------------------------------ | --------------------------------------------------------------------------------- |
| **CP** (Consistency + Partition Tolerance)  | Sacrifices Availability        | System rejects or blocks requests during a partition to prevent inconsistency     |
| **AP** (Availability + Partition Tolerance) | Sacrifices Consistency         | System returns data (may be stale) during a partition; syncs when partition heals |
| **CA** (Consistency + Availability)         | Sacrifices Partition Tolerance | Only possible if partitions *never* occur — impractical in distributed systems    |

> **CA in practice:** Relational databases like PostgreSQL are often called "CA databases" when deployed to a *single node*. When you add replication and run them across multiple nodes, you must choose between C and A during a partition — just like everything else.
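The CP/AP split in the table can be expressed as a single decision rule. This is a sketch of the *stance*, not any database's actual logic; the `mode` flag is made up for illustration:

```python
def handle_request(mode, partition_active, local_copy_fresh):
    """How a node answers a request during a partition, by CAP stance.

    mode: "CP" or "AP" (hypothetical flag for this sketch).
    """
    if not partition_active:
        return "serve"  # no partition: everyone serves normally
    if mode == "CP":
        # Refuse rather than risk inconsistent data.
        return "serve" if local_copy_fresh else "reject"
    # AP: always answer, even if the copy may be stale.
    return "serve (possibly stale)"

print(handle_request("CP", partition_active=True, local_copy_fresh=False))  # reject
print(handle_request("AP", partition_active=True, local_copy_fresh=False))  # serve (possibly stale)
```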

***

## CP Databases: Consistency Over Availability

When a partition occurs, a CP database **takes the partitioned node out of service** (sacrificing availability) rather than let it serve potentially stale data. Once the partition heals, the node rejoins and catches up.

**Real-world example: MongoDB**

MongoDB is a CP database. It uses a **single-primary replica set** model:

* Only the primary node accepts writes
* Secondary nodes replicate the primary's operation log
* If the primary becomes unreachable, secondaries **elect a new primary**
* During the election window, the cluster **refuses writes** to stay consistent

```
          Write Request
               ↓
        [Primary Node]  ←── clients write here only
         /         \
        /           \
[Secondary A]  [Secondary B]  ←── replicate primary's log

If Primary fails:
  → Election process begins
  → Writes blocked during election (typically a few seconds, ~12s at most by default)
  → New primary elected
  → Cluster available again, fully consistent
```

This means MongoDB prioritizes consistency — you'll never read data that doesn't reflect a committed write, but you may briefly lose write availability during failover.
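The failover sequence can be simulated without MongoDB itself: writes are accepted only while a primary exists, rejected during the election window, and resume once a new primary is chosen. A pure-Python sketch (class and method names are made up for illustration):

```python
class ReplicaSet:
    """Toy single-primary replica set: CP behaviour during failover."""
    def __init__(self, members):
        self.members = members
        self.primary = members[0]

    def write(self, doc):
        if self.primary is None:
            # Election in progress: refuse writes to stay consistent.
            raise RuntimeError("no primary: write rejected during election")
        return f"written to {self.primary}"

    def primary_fails(self):
        self.members.remove(self.primary)
        self.primary = None  # election window begins

    def elect(self):
        self.primary = self.members[0]  # simplest possible "election"

rs = ReplicaSet(["node-1", "node-2", "node-3"])
print(rs.write({"x": 1}))  # written to node-1
rs.primary_fails()
# rs.write(...) would now raise: the cluster is briefly unavailable for writes
rs.elect()
print(rs.write({"x": 2}))  # written to node-2
```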

**Other CP databases:** HBase, ZooKeeper, etcd

***

## AP Databases: Availability Over Consistency

When a partition occurs, an AP database keeps **all nodes available** and continues serving requests. Nodes that become partitioned may serve slightly stale data. When the partition heals, the system **reconciles differences** to reach eventual consistency.

**Real-world example: Apache Cassandra**

Cassandra is an AP database with a **masterless architecture** (every node is equal):

* Any node can accept reads and writes (no single point of failure)
* During a partition, partitioned nodes serve requests with potentially old data
* After the partition heals, Cassandra's repair mechanisms (**hinted handoff**, **read repair**, and **anti-entropy repair**) sync diverged nodes

```
       Write (any node accepts)
              ↓
    [Node A] ←─────→ [Node B] ←─────→ [Node C]
    
If A and B are partitioned from C:
  → A and B continue serving reads and writes ✅
  → C continues serving reads and writes ✅
  → Reads may return stale data on C
  → After partition heals: reconcile → eventual consistency
```

This means Cassandra prioritizes availability — the system is always up, but you may briefly read slightly older data.
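Reconciliation after a partition is commonly done with per-value timestamps, e.g. **last-write-wins** (one of several strategies; Cassandra resolves conflicting cells by timestamp this way). A toy sketch:

```python
def reconcile(replica_a, replica_b):
    """Merge two diverged key -> (value, timestamp) maps; last write wins."""
    merged = dict(replica_a)
    for key, (value, ts) in replica_b.items():
        if key not in merged or ts > merged[key][1]:
            merged[key] = (value, ts)
    return merged

# Diverged during the partition: each side took writes independently.
a = {"user:1": ("alice@old.example", 100)}
b = {"user:1": ("alice@new.example", 250)}  # the newer write landed on B

healed = reconcile(a, b)
print(healed["user:1"][0])  # alice@new.example: the later write wins
```

Last-write-wins is simple but silently drops the older concurrent write; that trade-off is part of the price of eventual consistency.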

**Other AP databases:** CouchDB, DynamoDB (default), Riak, Couchbase

***

## CA Databases: Consistent and Available (No Partitions)

A CA system is consistent and always available — but only works if there are **no partitions**. Since partitions are unavoidable in distributed systems at scale, CA is effectively the model for **single-node** or **tightly-coupled** relational databases.

| Database                     | Type | Notes                                       |
| ---------------------------- | ---- | ------------------------------------------- |
| **PostgreSQL** (single node) | CA   | Consistent + available, not distributed     |
| **MySQL** (single node)      | CA   | Same — add replication → CP or AP trade-off |
| **SQLite**                   | CA   | Single file, not distributed                |

When you *do* deploy PostgreSQL with replication across multiple machines, you re-enter the CP/AP trade-off depending on your replication and failover configuration.

***

## Quick Reference: Database Classification

| Database                | CAP Type         | Consistency Model                | Best For                                 |
| ----------------------- | ---------------- | -------------------------------- | ---------------------------------------- |
| MongoDB                 | **CP**           | Strong (per-shard)               | General purpose, documents               |
| Apache Cassandra        | **AP**           | Eventual                         | High write throughput, IoT, time-series  |
| CouchDB                 | **AP**           | Eventual                         | Offline-first, sync scenarios            |
| Amazon DynamoDB         | **AP** (default) | Eventual / Strong (opt-in)       | Serverless, key-value at scale           |
| Apache HBase            | **CP**           | Strong                           | Big data, Hadoop ecosystem               |
| etcd / ZooKeeper        | **CP**           | Strong (linearizable)            | Config storage, distributed coordination |
| Redis Cluster           | **AP**           | Eventual                         | Cache, sessions, pub/sub                 |
| PostgreSQL (replicated) | **CP**           | Strong (synchronous replication) | Relational, ACID, complex queries        |
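Several systems in the table (Cassandra, DynamoDB) expose *tunable* consistency: with N replicas, a read quorum of R nodes and a write quorum of W nodes, the condition R + W > N guarantees every read quorum overlaps the latest write. A quick check:

```python
def read_sees_latest_write(n, r, w):
    """True when every read quorum overlaps every write quorum (R + W > N)."""
    return r + w > n

# Classic settings for N = 3 replicas:
print(read_sees_latest_write(3, r=2, w=2))  # True: QUORUM reads + QUORUM writes
print(read_sees_latest_write(3, r=1, w=1))  # False: fast, but reads may be stale
```

This is why the same database can sit on either side of the CP/AP line depending on configuration.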

***

## PACELC: Beyond CAP

The CAP theorem only describes behaviour *during* a partition. A more complete model is **PACELC** (proposed by Daniel Abadi):

> **If** there is a **Partition**, choose between **Availability** and **Consistency** — **Else** (normal operation), choose between **Latency** and **Consistency**.

```
PACELC: PA/EL → Prioritize Availability during partition, low Latency normally
        PC/EC → Prioritize Consistency during partition, strong Consistency normally
```

| Database  | PACELC | Meaning                                          |
| --------- | ------ | ------------------------------------------------ |
| Cassandra | PA/EL  | Available during partition, low latency normally |
| MongoDB   | PC/EC  | Consistent during partition, consistent normally |
| DynamoDB  | PA/EL  | Available during partition, tunable              |

PACELC helps explain why two databases with the same CAP label can perform very differently in normal operation: a PC/EC system pays a consistency cost (waiting on replicas) during every write, while a PA/EL system does not.
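The latency half of PACELC follows directly from replication mode: a consistency-first write waits for the *slowest* replica's acknowledgement, while a latency-first write needs only the fastest. A sketch with made-up round-trip times:

```python
def write_latency(replica_rtts_ms, mode):
    """Acknowledgement latency under two replication strategies.

    mode "EC": wait for all replicas (consistency over latency).
    mode "EL": wait for the first replica (latency over consistency).
    """
    if mode == "EC":
        return max(replica_rtts_ms)  # the slowest replica gates the write
    return min(replica_rtts_ms)      # fastest ack wins; others catch up later

rtts = [5, 12, 80]  # ms; one replica sits in a distant region
print(write_latency(rtts, "EC"))  # 80: pay the consistency cost on every write
print(write_latency(rtts, "EL"))  # 5: low latency, but data briefly diverges
```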

***

## Applying CAP When Choosing a Database

Ask these questions when picking a database for a distributed service:

### Question 1: Can your application tolerate stale reads?

* **Yes** (e.g., social media feeds, product catalog, analytics dashboards) → **AP database** (Cassandra, DynamoDB) is fine
* **No** (e.g., bank account balance, inventory count, payment status) → **CP database** (PostgreSQL, MongoDB) is required

### Question 2: What happens during a network partition?

* Must stay **available at all costs** (even with stale data)? → **AP**
* Must stay **consistent even if unavailable** for a short period? → **CP**

### Question 3: Real-world examples by domain

| Domain                        | Recommended     | Reason                                        |
| ----------------------------- | --------------- | --------------------------------------------- |
| Payment processing            | CP (PostgreSQL) | Cannot have inconsistent balances             |
| Shopping cart                 | AP (DynamoDB)   | Stale reads acceptable; availability critical |
| User sessions                 | AP (Redis)      | Slight staleness fine; must be fast           |
| Inventory management          | CP (PostgreSQL) | Overselling requires consistency              |
| Activity feed / timeline      | AP (Cassandra)  | High write volume; eventual consistency fine  |
| Configuration / feature flags | CP (etcd)       | All nodes must see the same config            |
| Log aggregation               | AP (Cassandra)  | Volume over consistency                       |

***

## Connection to ACID and Transactions

You've already learned about [ACID](https://blog.htunnthuthu.com/getting-started/programming/database-101/database-101-transactions-and-acid) in the context of a single database. CAP is the distributed systems equivalent:

| Property    | ACID (single database)                  | CAP (distributed system)       |
| ----------- | --------------------------------------- | ------------------------------ |
| Consistency | Data stays valid per schema rules       | All nodes see the same data    |
| Isolation   | Concurrent transactions don't interfere | N/A (CAP doesn't address this) |
| Atomicity   | All-or-nothing per transaction          | N/A                            |
| Durability  | Committed data survives crashes         | Loosely related: replication preserves data across node failures |

In microservices and distributed architectures, you often **cannot use ACID transactions across services**. This is where patterns like **Saga**, **2-Phase Commit (2PC)**, and **eventual consistency** come in — all shaped by CAP trade-offs.
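The Saga pattern replaces one cross-service ACID transaction with a chain of local steps, each paired with a compensating action that undoes it if a later step fails. A minimal sketch (the step names are hypothetical):

```python
def run_saga(steps):
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for undo in reversed(done):
                undo()  # roll back what already committed
            return "rolled back"
    return "committed"

def fail_shipping():
    raise RuntimeError("shipping service unavailable")

log = []
steps = [
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"),   lambda: log.append("refund card")),
    (fail_shipping,                       lambda: None),
]
print(run_saga(steps))  # rolled back
print(log)  # ['reserve stock', 'charge card', 'refund card', 'release stock']
```

Note that there is no isolation here: other services can observe the intermediate state before the compensations run, which is the CAP-shaped trade-off the pattern accepts.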

***

## Key Takeaways

* The CAP theorem says a distributed system can guarantee only **two of three**: Consistency, Availability, Partition Tolerance
* Since **partitions are unavoidable** in real distributed systems, the practical choice is always **CP vs AP**
* **CP** (MongoDB, etcd, PostgreSQL with sync replication): blocks or rejects requests during a partition to stay consistent
* **AP** (Cassandra, CouchDB, DynamoDB): stays available during a partition, reconciles to eventual consistency afterwards
* **CA** (single-node PostgreSQL, MySQL): not truly distributed; becomes CP or AP once you add replication
* **PACELC** extends CAP to also describe the latency vs consistency trade-off during *normal* operation
* Choose your database based on your application's tolerance for **stale reads** and **brief unavailability**

***

## Further Reading

* [IBM: What is the CAP Theorem?](https://www.ibm.com/topics/cap-theorem)
* [Brewer's Original CAP Conjecture (2000)](https://people.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf)
* [Martin Kleppmann — Designing Data-Intensive Applications](https://dataintensive.net/) *(the definitive textbook on this topic)*
* [PACELC — Daniel Abadi](http://cs-www.cs.yale.edu/homes/dna/papers/abadi-pacelc.pdf)
* ← Previous: [Transactions and ACID](https://blog.htunnthuthu.com/getting-started/programming/database-101/database-101-transactions-and-acid)
* → Next: [Views, Functions, and Stored Procedures](https://blog.htunnthuthu.com/getting-started/programming/database-101/database-101-views-functions-stored-procedures)
