Transactions and ACID

My journey from data corruption disasters to bulletproof consistency


Table of Contents


Introduction: The Partial Update That Cost Me a Weekend

My blog had a "publish post" workflow: set status = 'published', update published_at, and send a notification email. I'd written it as three separate database calls in my Python code:

One night, the email server was unavailable. The send_notification_email call threw an exception. The post was now published with a published_at timestamp, but no notification was sent. Fineβ€”I could resend the email.

But then I noticed something worse: the email server failure had caused my code to exit before updating a post_metadata table. Twenty posts had their status as published in one table but their metadata was out of sync. My homepage query joined both tables and returned broken results.

I spent a weekend writing a repair script to find and fix all the inconsistent rows.

The fix was simple: transactions. All related operations should either all succeed or all fail. There's no acceptable in-between.


What Is a Transaction?

A transaction is a sequence of SQL operations treated as a single unit of work. It either commits (all changes applied) or rolls back (all changes undone).

spinner

In PostgreSQL, every SQL statement is implicitly wrapped in a transaction unless you use explicit BEGIN/COMMIT. A single UPDATE without BEGIN is auto-committed immediately.


ACID Properties Explained

ACID is the gold standard for reliable database transactions. Each property addresses a specific failure scenario.

Atomicity: All or Nothing

Either all operations in a transaction succeed, or none of them do.

If any statement fails between BEGIN and COMMIT, PostgreSQL automatically rolls back all prior changes in that transaction.

Consistency: Valid State to Valid State

A transaction moves the database from one valid state to another. Constraints (NOT NULL, FOREIGN KEY, UNIQUE, CHECK) are enforced at commit time.

Isolation: Concurrent Transactions Don't See Each Other's In-Progress Work

Two concurrent users editing different posts won't interfere with each other. The database behaves as if transactions run serially (to the degree specified by the isolation level).

Durability: Committed Data Survives Crashes

Once PostgreSQL returns "COMMIT", the data is persisted to disk. A power failure immediately after won't lose committed dataβ€”PostgreSQL uses a Write-Ahead Log (WAL) to guarantee this.


Basic Transaction Syntax

Transaction in psql


Savepoints: Partial Rollbacks

Savepoints let you roll back to a specific point within a transaction without losing earlier work:

This is useful in batch processingβ€”you can retry individual items without losing the entire batch.


Transaction Isolation Levels

PostgreSQL supports four isolation levels, each with different trade-offs between consistency and concurrency:

Isolation Level
Dirty Read
Non-Repeatable Read
Phantom Read

READ UNCOMMITTED

βœ… Possible

βœ… Possible

βœ… Possible

READ COMMITTED (default)

❌ Prevented

βœ… Possible

βœ… Possible

REPEATABLE READ

❌ Prevented

❌ Prevented

❌ Prevented*

SERIALIZABLE

❌ Prevented

❌ Prevented

❌ Prevented

*PostgreSQL's MVCC prevents phantom reads at REPEATABLE READ level.

Setting Isolation Level

When to Use Each Level

  • READ COMMITTED (default): Suitable for most OLTP workloads

  • REPEATABLE READ: Reporting/analytics queries that must see a consistent snapshot

  • SERIALIZABLE: Financial operations, booking systems, anything requiring strict correctness


Concurrency Problems and How Isolation Levels Solve Them

Dirty Read

Reading uncommitted data from another transaction:

PostgreSQL's MVCC prevents dirty reads even at READ UNCOMMITTEDβ€”they're effectively impossible in PostgreSQL.

Non-Repeatable Read

Reading the same row twice in one transaction and getting different values:

Phantom Read

A re-query returns new rows that didn't exist before:


Locking in PostgreSQL

PostgreSQL uses MVCC (Multi-Version Concurrency Control) to allow readers and writers to coexist without blocking each other. But explicit locks are sometimes needed.

Row-Level Locks

Table-Level Locks

PostgreSQL automatically acquires appropriate locks for INSERT, UPDATE, DELETE.


Deadlocks: What They Are and How to Prevent Them

A deadlock occurs when two transactions each hold a lock the other needs:

spinner

Preventing Deadlocks

  1. Always acquire locks in the same order across your application:

  1. Use NOWAIT to fail immediately rather than waiting:

  1. Keep transactions short to reduce lock contention windows.


Practical Transaction Patterns

Publishing a Post (Atomically)

Safe Counter Increment (No Race Condition)

Transfer/Move Between Entities


Advisory Locks for Application-Level Coordination

Advisory locks are PostgreSQL locks that your application controls directlyβ€”not tied to any specific table row:

Useful for distributed scheduled jobs where only one instance should run at a time.


Two-Phase Commit (Distributed Transactions)

For operations spanning multiple databases or servers:

Two-phase commit is rarely needed in single-database applications but is essential for distributed systems.


Common Transaction Mistakes

1. Transactions That Are Too Long

βœ… Keep transactions as short as possible. Prepare data outside the transaction, then use a short transaction just for the database writes.

2. Forgetting to Handle Errors

If an error occurs inside a transaction in PostgreSQL, the transaction is aborted. Any further statements will fail until you ROLLBACK:

3. Auto-Commit Confusion

In psql and most clients, AUTOCOMMIT is on by default. This means every statement is its own transaction:


What I Learned About Transactions

The weekend I spent fixing my inconsistent database taught me that transactions are not optional. Any operation that modifies more than one row or more than one table must be wrapped in a transaction.

My mental checklist now:

Always use a transaction when:

Best practices:

  • βœ… Keep transactions shortβ€”do work outside, commit quickly

  • βœ… Use SAVEPOINTS for complex batch operations with partial rollback needs

  • βœ… Use SELECT ... FOR UPDATE to prevent lost updates

  • βœ… Use SKIP LOCKED for queue processing patterns

  • βœ… Always acquire locks in a consistent order to prevent deadlocks

  • ❌ Never hold a transaction open waiting for user input or external API calls


Next Steps

With data integrity secured through transactions, let's explore Views, Functions, and Stored Proceduresβ€”powerful tools for encapsulating and reusing database logic.


Part of the Database 101 Series

Last updated