Indexes and Performance

My journey from 45-second page loads to sub-100ms responses


Table of Contents


Introduction: The Day My Blog Became Unusable

It was a Saturday morning. I had just crossed 50,000 rows in my posts table after three years of writing. I opened my blog's admin panel to review comments, and nothing loaded. After 30 seconds, a timeout error.

I restarted the server. Same thing. Cleared the application cache. Still broken.

I connected directly to the database and ran the exact query my admin panel used:

It returned… in 43 seconds.

The same query had been running in 80ms when I had 5,000 posts. Nobody told me databases slow down as they grow. Nobody told me I needed to design for scale.

Two hours later, after adding the right indexes, the same query ran in 4ms.

This article is the guide I wish had existed that Saturday morning.


What Is an Index?

An index is a separate data structure that PostgreSQL maintains alongside your table. It's like the index at the back of a textbookβ€”instead of reading every page to find "ACID properties", you look it up in the index and jump directly to page 347.

spinner

Trade-off: Indexes speed up reads but slow down writes (INSERT, UPDATE, DELETE), because PostgreSQL must update the index alongside the table. Choose indexes that serve your actual query patterns.


How PostgreSQL Finds Data Without an Index

Without an index, PostgreSQL performs a Sequential Scan (seq scan): it reads every single row in the table from start to finish, checks if the row matches the WHERE condition, and either includes or discards it.

For a table with 50,000 posts:

  • Sequential scan: read all 50,000 rows

  • Your query matches 12,000 published posts

  • PostgreSQL discarded 38,000 rows it didn't need

With an index on status:

  • Index lookup: jump directly to the 12,000 matching rows

  • No wasted reads

Sequential scans aren't always badβ€”for small tables or when you're fetching most rows, they can be faster than index scans. PostgreSQL's query planner decides which to use.


B-Tree Indexes: The Default Workhorse

The default index type in PostgreSQL is a B-Tree (Balanced Tree). It handles:

  • =, <, >, <=, >= comparisons

  • BETWEEN ranges

  • IN lists

  • LIKE 'prefix%' (prefix matches only, not '%suffix')

  • IS NULL, IS NOT NULL

  • ORDER BY without a separate sort

πŸ’‘ Foreign keys are not automatically indexed in PostgreSQL. Always add indexes on foreign key columnsβ€”they're used in almost every JOIN.

Naming Convention


Other Index Types in PostgreSQL

Hash Index

For equality comparisons only ( = ), slightly faster than B-Tree for simple lookups:

In practice, B-Tree is almost always preferred because it supports range queries too and hash indexes weren't crash-safe before PostgreSQL 10.

GIN Index (Generalised Inverted Index)

Best for array columns, JSONB, and full-text searchβ€”any column where a single row contains multiple values to index:

GIN indexes are larger and slower to build but fast for lookups within complex values.

GiST Index (Generalised Search Tree)

Used for geometric data, range types, and fuzzy string matching:

BRIN Index (Block Range Index)

Very small, efficient for naturally ordered data in huge tables (timestamps, sequential IDs):

BRIN indexes are tiny (kilobytes vs megabytes for B-Tree) but only efficient when values are physically stored in order.


Index Strategies for Real Queries

Always Index Foreign Keys

Index Columns Used in WHERE Clauses

Identify the most common filter conditions:

Index Columns Used in ORDER BY

Without an index, ORDER BY requires a full sort. With the right index, results arrive pre-sorted:


EXPLAIN and EXPLAIN ANALYZE

EXPLAIN shows the query plan. EXPLAIN ANALYZE actually executes the query and shows real timing.

Reading EXPLAIN Output

Key things to look for:

Term
Meaning

Seq Scan

⚠️ Reading every rowβ€”may need an index

Index Scan

βœ… Using an index

Index Only Scan

βœ…βœ… Reading only from index (fastest)

Bitmap Heap Scan

βœ… Multiple index matches, then heap lookup

Hash Join

Joining using a hash table

Nested Loop

Joining row by row (good for small inner sets)

Merge Join

Joining pre-sorted data

cost=X..Y

Estimated startup..total cost (arbitrary units)

actual time=X..Y

Real execution time in milliseconds

rows=N

Estimated vs actual row count

When estimated rows β‰  actual rows β€” your table statistics are stale. Run:


Partial Indexes: Index Only What You Need

A partial index only includes rows matching a WHERE condition. Smaller, faster, and perfectly targeted:

Partial indexes can be 10-100x smaller than full-table indexes while covering 99% of real queries.


Composite Indexes and Column Order

A composite (multi-column) index can serve multiple query patterns, but column order matters enormously.

The "Equality First, Range Last" Rule

Put equality conditions (=) before range conditions (<, >, BETWEEN) in composite indexes:


Index-Only Scans and Covering Indexes

An index-only scan retrieves data solely from the index without touching the main table. This is extremely fast:

INCLUDE columns (PostgreSQL 11+) don't affect sort order but allow index-only scans for those columns.


The Cost of Indexes

Indexes aren't free. Every index:

  • Slows down writes: INSERT, UPDATE, DELETE must update each index

  • Uses disk space: An index can be as large as the table itself

  • Needs maintenance: VACUUM and AUTOVACUUM spend time on indexes

When NOT to Add an Index

  • Tables with fewer than ~1,000 rows (seq scan is faster)

  • Columns with very low cardinality (e.g., boolean or status with 2 values in a small table)

  • Columns rarely used in WHERE, JOIN, or ORDER BY

  • Write-heavy tables where index maintenance outweighs read benefit


Query Optimisation Techniques

Avoid Functions on Indexed Columns in WHERE

Use LIMIT Aggressively

Avoid SELECT * in Production

Use Connection Pooling

Individual connections are expensive. Use PgBouncer or application-level pooling:


Monitoring Index Usage

Find Unused Indexes (Index Bloat)

Find Slow Queries

Find Missing Indexes (Sequential Scans on Large Tables)

High seq_tup_read on a large table = potential missing index.


What I Learned About Performance

The performance incident that broke my blog was entirely avoidable. The checklist I now follow for every new table:

When creating a table:

When a query feels slow:

  1. Run EXPLAIN ANALYZE firstβ€”never guess

  2. Look for Seq Scan on large tables

  3. Check if estimated vs actual row counts diverge significantly (run ANALYZE)

  4. Add the targeted index

  5. Verify improvement with EXPLAIN ANALYZE again

Golden rules:

  • βœ… Always index foreign keys

  • βœ… Index columns in WHERE, JOIN ON, and ORDER BY

  • βœ… Partial indexes for filtered queries

  • βœ… Use INCLUDE for covering indexes (PostgreSQL 11+)

  • βœ… Monitor with pg_stat_user_indexes and pg_stat_statements

  • ❌ Don't add indexes "just in case"β€”measure first


Next Steps

With indexes and performance mastered, the next topic is ensuring your data stays consistent even when things go wrong: Transactions and ACID.


Part of the Database 101 Series

Last updated