Backup and Recovery

My journey from praying nothing breaks to sleeping soundly with automated backups


Table of Contents


Introduction: The Backup I Never Made

I had been developing my blog for two years when my VPS provider sent an email: "Your server has experienced a hardware failure and cannot be recovered."

I went cold. I searched my laptop for any local copies. I found SQL export filesβ€”but they were eight months old. My staging environment had a copy but it was three weeks behind.

I lost eight months of posts, all the comments and user data from that period, and countless hours of work.

The sickening part? Backups would have taken 30 minutes to set up properly. I had been saying "I'll do it properly later" for two years.

Don't be me. Set up backups today.


Backup Strategies Overview

spinner
Strategy
Recovery Point
Recovery Time
Use Case

pg_dump (daily)

24 hours ago

Minutes-hours

Small databases, portability

pg_basebackup (weekly)

Last full backup

Minutes

Medium databases

WAL archiving + PITR

Any point in time

Minutes + replay time

Production, compliance

Streaming Replication

Near real-time

Seconds

High availability

The 3-2-1 Rule: 3 copies, 2 different media, 1 offsite location.


pg_dump: Logical Backups

pg_dump exports the database as SQL statements or binary format. It's consistentβ€”it takes a snapshot without locking tables.

Basic Usage

pg_dump Formats

Format
Flag
Notes

Plain SQL

-Fp (default)

Human-readable, largest size

Custom

-Fc

Compressed, selective restore

Directory

-Fd

Parallel dump (large databases)

Tar

-Ft

Archive, no compression

Selective Dumps

Dump All Databases


pg_basebackup: Physical Backups

pg_basebackup copies the entire PostgreSQL data directory as a binary backup. It's faster for large databases but less portable (same PostgreSQL major version required for restore).

Create a Replication Role for Backups


Continuous Archiving and PITR

Write-Ahead Logging (WAL) records every database change. By archiving WAL files, you can restore to any point in timeβ€”not just the last backup.

Configure WAL Archiving

Full Backup + WAL Archive Workflow

Point-in-Time Recovery


Restoring from Backups

Restore from pg_dump

Restore Specific Tables (Emergency Recovery)

Verify Restored Data


Backup Automation and Scheduling

Shell Script for Daily Backups

Scheduling with Cron

Scheduling with pg_cron


Verifying Your Backups

An unverified backup is not a backup. Schedule regular restore tests:


Replication for High Availability

Streaming replication keeps a standby server up to date with the primary in near real-time:

spinner

Setting Up a Standby Server

Failover

Use Patroni or pg_auto_failover for automated failover in production.


Cloud Backup Solutions

Most cloud-managed PostgreSQL services handle backups automatically:

Service
Backup Type
PITR
Retention

AWS RDS PostgreSQL

Automated snapshots

βœ… Yes

1–35 days

AWS Aurora PostgreSQL

Continuous backup

βœ… Yes

1–35 days

GCP Cloud SQL

Automated backups

βœ… Yes

Up to 365 days

Azure Database for PostgreSQL

Full + differential

βœ… Yes

7–35 days

Supabase

Daily backups

βœ… Pro plan

7 days

Even with managed backups, test restores regularly and keep your own exports for compliance.


Disaster Recovery Planning

A backup without a recovery plan is just a file. Document your recovery procedure:

Recovery Time Objective (RTO) and Recovery Point Objective (RPO)

  • RTO: How long can the service be down? (1 hour? 4 hours? 24 hours?)

  • RPO: How much data can you lose? (Last hour? Last day? Zero data loss?)

These determine your backup strategy:

RPO
RPO
Strategy

1 hour

4 hours

Daily pg_dump + hourly WAL archiving

5 minutes

30 minutes

WAL archiving every 5 min + hot standby

~0

Seconds

Synchronous replication + hot standby

Runbook Template


What I Learned About Backups

That data loss incident was the defining moment in my relationship with infrastructure. The lesson was brutal but clear: the data you haven't backed up doesn't really belong to youβ€”you're just borrowing it.

The mindset shift that made everything click:

Backups are not for when things go wrong. Backups are for when things go wrong and you don't panic.

My current setup:

  • Daily automated pg_dump to S3 with 30-day retention

  • WAL archiving with 7-day PITR window

  • Weekly automated restore test to a throw-away database

  • Monitoring alert if that test fails

Total setup time: about 2 hours. Total peace of mind: priceless.

The non-negotiables:

  • βœ… Automate backupsβ€”manual backups don't get done

  • βœ… Test restores regularlyβ€”untested backups are a false sense of security

  • βœ… Store backups offsite or in a different availability zone

  • βœ… Document your recovery procedure before you need it

  • βœ… Monitor that your backup jobs are completing successfully


Next Steps

With backups secured, we complete the series with comprehensive Database Design Best Practicesβ€”pulling together everything we've learned into production-ready patterns.


Part of the Database 101 Series

Last updated