Database Design


Introduction

Choosing the right database and designing it properly is one of the most critical decisions in system architecture. I've made both good and poor database choices in my career, and the consequences of a wrong decision can haunt you for years.

This article covers the database design patterns I've used in production systems—the trade-offs, the pain points, and the solutions that actually worked.

SQL vs NoSQL: The Real Trade-offs

The SQL vs NoSQL debate isn't about which is "better"—it's about which fits your use case.

When I Choose SQL (PostgreSQL, MySQL)

Use cases:

  • Financial transactions requiring ACID guarantees

  • Complex queries with joins across multiple tables

  • Data with clear relationships and schema

  • When data integrity is non-negotiable

Example: E-commerce order system

-- PostgreSQL schema for order management
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE products (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    price DECIMAL(10, 2) NOT NULL,
    stock INT NOT NULL DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT positive_price CHECK (price > 0),
    CONSTRAINT non_negative_stock CHECK (stock >= 0)
);

CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id),
    status VARCHAR(50) NOT NULL DEFAULT 'pending',
    total DECIMAL(10, 2) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT valid_status CHECK (status IN ('pending', 'paid', 'shipped', 'delivered', 'cancelled'))
);

CREATE TABLE order_items (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    order_id UUID NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
    product_id UUID NOT NULL REFERENCES products(id),
    quantity INT NOT NULL,
    price DECIMAL(10, 2) NOT NULL,
    CONSTRAINT positive_quantity CHECK (quantity > 0)
);

-- Indexes for common queries
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);

Python implementation with transaction:
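A minimal sketch of what that looks like with psycopg2 against the schema above; the function shape, error handling, and connection management are illustrative rather than a drop-in implementation:

import uuid

import psycopg2


def create_order(conn, user_id, items):
    """Atomically create an order. items is a list of (product_id, quantity)."""
    with conn:  # commit on success, roll back on any exception
        with conn.cursor() as cur:
            order_id = str(uuid.uuid4())
            cur.execute(
                "INSERT INTO orders (id, user_id, total) VALUES (%s, %s, 0)",
                (order_id, user_id),
            )
            total = 0
            for product_id, quantity in items:
                # Lock the product row so concurrent orders can't oversell stock
                cur.execute(
                    "SELECT price, stock FROM products WHERE id = %s FOR UPDATE",
                    (product_id,),
                )
                price, stock = cur.fetchone()
                if stock < quantity:
                    raise ValueError(f"Insufficient stock for product {product_id}")
                cur.execute(
                    "UPDATE products SET stock = stock - %s WHERE id = %s",
                    (quantity, product_id),
                )
                cur.execute(
                    "INSERT INTO order_items (order_id, product_id, quantity, price) "
                    "VALUES (%s, %s, %s, %s)",
                    (order_id, product_id, quantity, price),
                )
                total += price * quantity
            cur.execute(
                "UPDATE orders SET total = %s WHERE id = %s",
                (total, order_id),
            )
    return order_id

The FOR UPDATE lock is what makes the stock check safe under concurrency: without it, two requests could both see stock = 1 and both decrement it.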

When I Choose NoSQL (MongoDB, DynamoDB)

Use cases:

  • Flexible schema that evolves frequently

  • Massive scale (millions of writes/sec)

  • Document-oriented data (user profiles, product catalogs)

  • When eventual consistency is acceptable

Example: User profile system (MongoDB)
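A rough sketch with pymongo showing why the document model fits profiles; the database name, connection string, and field names are all illustrative:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
profiles = client.app.user_profiles

# Documents in the same collection can carry different fields as the schema evolves
profiles.insert_one({
    "user_id": "u_123",
    "display_name": "Ada",
    "preferences": {"theme": "dark", "newsletter": True},
    "addresses": [
        {"label": "home", "city": "Berlin", "country": "DE"},
    ],
})

# Index the fields you actually query on
profiles.create_index("user_id", unique=True)

# Update nested fields and grow the document without a migration
profiles.update_one(
    {"user_id": "u_123"},
    {"$set": {"preferences.theme": "light"}, "$push": {"interests": "databases"}},
)

The flip side is that the application owns the schema: every reader has to tolerate documents written by older versions of the code.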

My Decision Matrix

Factor              | SQL          | NoSQL
--------------------|--------------|-------------
ACID Transactions   | ✅ Strong    | ❌ Limited
Schema Flexibility  | ❌ Rigid     | ✅ Flexible
Horizontal Scaling  | ⚠️ Complex   | ✅ Easy
Complex Joins       | ✅ Excellent | ❌ Limited
Write Performance   | ⚠️ Good      | ✅ Excellent
Consistency         | ✅ Strong    | ⚠️ Eventual
Tooling/Ecosystem   | ✅ Mature    | ⚠️ Growing

Database Replication

Replication improves availability and read performance by maintaining copies of data across multiple servers.

Master-Slave Replication

One primary (master) handles all writes, while multiple replicas (slaves) handle reads.

PostgreSQL streaming replication setup:
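The exact steps depend on the PostgreSQL version; on 12+ the core of it looks roughly like this (hostnames, the replication user, and paths are placeholders):

# On the primary (postgresql.conf)
wal_level = replica
max_wal_senders = 10

# On the primary (pg_hba.conf): allow the replica to connect for replication
host    replication    replicator    10.0.0.0/24    scram-sha-256

# On the replica: clone the primary and start it as a standby
pg_basebackup -h primary.internal -U replicator -D /var/lib/postgresql/data -P -R
# -R writes standby.signal plus primary_conninfo, so the replica streams WAL
# from the primary as soon as it starts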

Challenges I've faced:

  • Replication lag: Replicas can be seconds behind the master

  • Read-after-write consistency: A user updates their profile, then immediately reads stale data from a replica

  • Failover complexity: What happens when the master fails?

My solutions:
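For read-after-write in particular, one common fix is to pin a user's reads to the primary for a few seconds after they write, which hides replication lag from the user who caused the write. A minimal sketch; the class name, pin window, and replica-picking logic are illustrative:

import time


class ReadYourWritesRouter:
    """Pin a user's reads to the primary briefly after they write."""

    def __init__(self, primary, replicas, pin_seconds=5):
        self.primary = primary
        self.replicas = replicas
        self.pin_seconds = pin_seconds
        self._last_write = {}  # user_id -> timestamp of their last write

    def for_write(self, user_id):
        self._last_write[user_id] = time.monotonic()
        return self.primary

    def for_read(self, user_id):
        wrote_at = self._last_write.get(user_id)
        if wrote_at and time.monotonic() - wrote_at < self.pin_seconds:
            return self.primary  # still inside the replication-lag window
        return self.replicas[hash(user_id) % len(self.replicas)]

Failover itself is better handled by dedicated tooling such as Patroni (used in the setup later in this article) than by hand-rolled logic.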

Multi-Master Replication

Multiple nodes can accept writes. This is more complex to operate, but it eliminates the single point of failure.
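The complexity shows up when two nodes accept conflicting writes to the same row before they have synced. The simplest resolution is last-write-wins, sketched below (the record shape is illustrative); it is easy to reason about but silently drops the losing write, so it only suits fields where that is acceptable:

def resolve_conflict(version_a: dict, version_b: dict) -> dict:
    """Last-write-wins: keep whichever version carries the newer timestamp."""
    # Each replica stamps its writes with a timestamp (wall clock or hybrid
    # logical clock). LWW discards the losing write entirely, which is fine
    # for something like last_seen_at and dangerous for a balance or counter.
    if version_a["updated_at"] >= version_b["updated_at"]:
        return version_a
    return version_b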

When I use multi-master:

  • Global applications with users in different regions

  • High write throughput requirements

  • When downtime is absolutely unacceptable

Trade-offs:

  • ✅ No single point of failure

  • ✅ Better write performance

  • ✅ Lower latency for geographically distributed users

  • ❌ Conflict resolution complexity

  • ❌ Eventual consistency (not strong consistency)

  • ❌ More complex operations

Database Sharding

Sharding splits data across multiple databases to handle massive scale.

Sharding Strategies

1. Range-based sharding:
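A sketch of the idea, routing by numeric user ID ranges (the boundaries and shard names are illustrative):

# Each shard owns a contiguous range of user IDs
SHARD_RANGES = [
    (0,         1_000_000, "shard1"),
    (1_000_000, 2_000_000, "shard2"),
    (2_000_000, 3_000_000, "shard3"),
]


def shard_for_user(user_id: int) -> str:
    for low, high, shard in SHARD_RANGES:
        if low <= user_id < high:
            return shard
    raise ValueError(f"No shard configured for user_id {user_id}")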

Problems I've hit:

  • Uneven distribution (shard1 has way more users than shard2)

  • Hard to rebalance

  • Hot shards (one shard gets disproportionate traffic)

2. Hash-based sharding (my preferred method):
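A sketch: hash the user ID so keys spread evenly regardless of how IDs were assigned (the shard count and naming are illustrative):

import hashlib

NUM_SHARDS = 8


def shard_for_user(user_id: str) -> str:
    # Use a stable hash, not Python's built-in hash(), which varies per process
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return f"shard{int(digest, 16) % NUM_SHARDS}"

The trade-off is resharding: changing NUM_SHARDS remaps almost every key, which is why consistent hashing (or a fixed set of virtual buckets mapped onto physical shards) is the usual refinement.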

3. Geographic sharding:
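A sketch: each user's data lives in a shard for their home region, which keeps queries local and helps with data-residency requirements (regions and connection strings are illustrative):

REGION_SHARDS = {
    "eu": "postgresql://db-eu.internal:5432/app",
    "us": "postgresql://db-us.internal:5432/app",
    "ap": "postgresql://db-ap.internal:5432/app",
}


def shard_for_region(region: str) -> str:
    return REGION_SHARDS[region]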

Sharding Challenges

Cross-shard queries:
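Anything that spans shards has to fan out to every shard and merge the results in application code, which is slower and easier to get wrong. A rough sketch of what a simple aggregate turns into (assuming one open connection per shard):

def total_paid_revenue(shard_connections):
    """Sum revenue across every shard — one round trip per shard."""
    total = 0
    for conn in shard_connections:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT COALESCE(SUM(total), 0) FROM orders WHERE status = 'paid'"
            )
            total += cur.fetchone()[0]
    return total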

My approach to minimizing cross-shard queries:

  1. Denormalize data when necessary

  2. Use a separate analytics database

  3. Design shard key to keep related data together

  4. Accept eventual consistency for aggregations

Database Partitioning

Partitioning splits large tables within a single database.
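In PostgreSQL this is typically declarative range partitioning, for example splitting an append-heavy table by month (the table and columns here are illustrative, not part of the earlier schema):

-- Partitioned parent table; the partition key must be part of the primary key
CREATE TABLE events (
    id BIGSERIAL,
    user_id UUID NOT NULL,
    payload JSONB,
    created_at TIMESTAMP NOT NULL,
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- One partition per month
CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

Queries that filter on created_at scan only the matching partitions, and retiring old data becomes a cheap DROP of the old partition instead of a massive DELETE.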

Indexing Strategies

Indexes are critical for query performance.

Indexing rules I follow:
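The core of it is matching indexes to real query shapes rather than indexing everything. As an illustration against the orders schema above (the queries and index names are purely illustrative):

-- A composite index should match the query's filter and sort:
--   SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC LIMIT 20;
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC);

-- A partial index covers only the rows a hot query touches, and stays small:
CREATE INDEX idx_orders_pending_created ON orders (created_at)
    WHERE status = 'pending';

-- Verify with EXPLAIN ANALYZE that the planner actually uses the index.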

Real-World Database Architecture

Here's a production setup I've used:

Tier 1: Application Database (PostgreSQL)

  • Master: Handles all writes

  • 3 read replicas: Distribute read load

  • Connection pooling: PgBouncer

  • Automatic failover: Patroni + etcd

Tier 2: Cache Layer (Redis)

  • Reduces database load by 85%

  • Stores session data, frequently accessed records

  • Redis Cluster for high availability

Tier 3: Analytics Database (ClickHouse)

  • Separate from transactional database

  • Optimized for aggregations and reporting

  • Data replicated from PostgreSQL via Kafka

Tier 4: Search Engine (Elasticsearch)

  • Full-text search capabilities

  • Product catalog search

  • Log aggregation and analysis
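Tying the tiers together, a simplified sketch of how a read and a write flow through them (the client objects, cache keys, and TTL are illustrative; the analytics and search tiers are fed asynchronously, never in the request path):

import json

CACHE_TTL_SECONDS = 300


def get_product(product_id, redis_client, replica_conn):
    """Read path: Redis first, then a PostgreSQL read replica."""
    key = f"product:{product_id}"
    cached = redis_client.get(key)
    if cached:  # Tier 2 cache hit
        return json.loads(cached)
    with replica_conn.cursor() as cur:  # Tier 1 read replica
        cur.execute(
            "SELECT name, price, stock FROM products WHERE id = %s",
            (product_id,),
        )
        name, price, stock = cur.fetchone()
    product = {"name": name, "price": str(price), "stock": stock}
    redis_client.setex(key, CACHE_TTL_SECONDS, json.dumps(product))
    return product


def update_price(product_id, new_price, primary_conn, redis_client):
    """Write path: primary only, then invalidate the cached entry."""
    with primary_conn:
        with primary_conn.cursor() as cur:
            cur.execute(
                "UPDATE products SET price = %s WHERE id = %s",
                (new_price, product_id),
            )
    redis_client.delete(f"product:{product_id}")
    # ClickHouse and Elasticsearch pick up the change asynchronously
    # (e.g. via Kafka/CDC), not synchronously inside the request.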

Lessons Learned

What worked:

  1. Start with a single database, scale when you have real metrics

  2. Use read replicas before sharding

  3. PostgreSQL handles way more than you think

  4. Monitor query performance from day one

  5. Cache aggressively, invalidate carefully

What didn't work:

  1. Premature sharding (added complexity without benefit)

  2. Choosing NoSQL for everything (missed SQL's strengths)

  3. Not setting up proper indexes (queries became unbearably slow)

  4. Ignoring replication lag (caused user-facing bugs)

  5. Over-normalizing data (too many joins killed performance)

What's Next

With database design covered, let's move on to asynchronous processing with message queues in the next article.

