Relationships and Data Integrity

My journey from comma-separated chaos to properly related data

Introduction: The Comma-Separated Nightmare

When I first started managing my blog, I stored categories as a comma-separated string in each post:

Post: "PostgreSQL Tips"
Categories: "Database,PostgreSQL,Tutorial,Backend"

Seemed simple enough, right? Then I needed to find all posts in the "PostgreSQL" category.

My first attempt:

SELECT * FROM posts WHERE categories LIKE '%PostgreSQL%';

Worked great... until I added a post with the "PostgreSQL Performance" category. Now my query matched posts with "PostgreSQL" and "PostgreSQL Performance"—not what I wanted.

I tried getting clever with commas:

SELECT * FROM posts WHERE categories LIKE '%,PostgreSQL,%';

But this missed posts where "PostgreSQL" was the first or last category. My workarounds got increasingly complex:

WHERE categories LIKE 'PostgreSQL,%'  -- First position
   OR categories LIKE '%,PostgreSQL'   -- Last position
   OR categories LIKE '%,PostgreSQL,%' -- Middle position
   OR categories = 'PostgreSQL'        -- Only category

This is madness. And it gets worse:

Want to rename a category? Search and replace across all posts
Want to count posts per category? Parse comma-separated strings
Want to ensure category names are spelled consistently? Good luck
Want to add category descriptions or colors? Nowhere to put them

That's when I learned about relational databases and foreign keys. The "relational" part isn't just a fancy name—it's the entire point. Let me show you how properly modeling relationships transformed my database design.

Why Relationships Matter

In the real world, data is connected:

Authors write posts (one-to-many)
Posts have comments (one-to-many)
Posts belong to categories (many-to-many)
Users have profiles (one-to-one)

Modeling these relationships correctly means:

✅ Data integrity: Can't assign a post to a non-existent author
✅ Consistency: Author names update everywhere automatically
✅ Efficiency: No duplicate data (authors stored once, not repeated)
✅ Queryability: Find related data with JOINs
✅ Maintainability: Change category name in one place

The Relational Model Visualization

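Here's how our blog data relates:

authors 1 → N posts (each post has exactly one author)
posts 1 → N comments
posts N ↔ N categories (through the post_categories junction table)
posts 1 ↔ 1 post_seo

Don't worry if this looks complex. We'll build it step by step!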

Understanding Foreign Keys

A foreign key is a column that references the primary key of another table. It creates a link between tables.

Basic Foreign Key Example

-- The authors table (already created in Part 2)
CREATE TABLE authors (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    full_name VARCHAR(100)
);

-- The posts table with a foreign key
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    author_id INTEGER REFERENCES authors(id),  -- Foreign key!
    title VARCHAR(200) NOT NULL,
    content TEXT
);

What the Foreign Key Does

The author_id INTEGER REFERENCES authors(id) line means:

author_id must be an integer (matching the type of authors.id)
REFERENCES authors(id) means it must be a valid author ID
PostgreSQL enforces this constraint—you can't insert invalid values

Let's see it in action:

-- Insert some authors
INSERT INTO authors (username, email, full_name) 
VALUES 
    ('john_doe', 'john@example.com', 'John Doe'),
    ('jane_smith', 'jane@example.com', 'Jane Smith');

-- Insert a post by John (id=1)
INSERT INTO posts (author_id, title, content) 
VALUES (1, 'My First Post', 'Content here...');
-- SUCCESS!

-- Try to insert a post by non-existent author
INSERT INTO posts (author_id, title, content) 
VALUES (999, 'Invalid Post', 'This should fail...');
-- ERROR: insert or update on table "posts" violates foreign key constraint
-- DETAIL: Key (author_id)=(999) is not present in table "authors".

PostgreSQL protects you from orphaned data. No post can reference a non-existent author!

One-to-Many Relationships

One-to-many is the most common relationship type. One author can write many posts, but each post has only one author.

Setting Up One-to-Many

-- Drop existing tables to start fresh
DROP TABLE IF EXISTS posts CASCADE;
DROP TABLE IF EXISTS authors CASCADE;

-- Create authors table
CREATE TABLE authors (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    full_name VARCHAR(100),
    bio TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create posts table with foreign key to authors
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    title VARCHAR(200) NOT NULL,
    slug VARCHAR(200) UNIQUE NOT NULL,
    content TEXT,
    published BOOLEAN DEFAULT false,
    view_count INTEGER DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create comments table (one-to-many with posts)
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES posts(id),
    author_name VARCHAR(100) NOT NULL,
    author_email VARCHAR(100),
    content TEXT NOT NULL,
    approved BOOLEAN DEFAULT false,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Inserting Related Data

-- Step 1: Insert authors
INSERT INTO authors (username, email, full_name, bio) 
VALUES 
    ('john_doe', 'john@example.com', 'John Doe', 'Tech blogger'),
    ('jane_smith', 'jane@example.com', 'Jane Smith', 'Database enthusiast')
RETURNING id, username;

-- Step 2: Insert posts (using author IDs from above)
INSERT INTO posts (author_id, title, slug, content, published) 
VALUES 
    (1, 'PostgreSQL Tips', 'postgresql-tips', 'Great tips here...', true),
    (1, 'SQL Basics', 'sql-basics', 'Learn SQL...', true),
    (2, 'Database Design', 'database-design', 'Design patterns...', true);

-- Step 3: Insert comments
INSERT INTO comments (post_id, author_name, content, approved) 
VALUES 
    (1, 'Alice', 'Great article!', true),
    (1, 'Bob', 'Very helpful, thanks!', true),
    (2, 'Charlie', 'Could you explain more about...', false);

Querying One-to-Many with COUNT

-- How many posts does each author have?
SELECT 
    a.username,
    a.full_name,
    COUNT(p.id) AS post_count
FROM authors a
LEFT JOIN posts p ON a.id = p.author_id
GROUP BY a.id, a.username, a.full_name
ORDER BY post_count DESC;

-- Output:
--  username   | full_name   | post_count
-- ------------+-------------+------------
--  john_doe   | John Doe    |          2
--  jane_smith | Jane Smith  |          1

Many-to-Many Relationships

Many-to-many means both sides can have multiple connections. A post can have multiple categories, and a category can contain multiple posts.

You cannot model this with a simple foreign key. You need a junction table (also called a "join table" or "bridge table").

The Problem with Direct Many-to-Many

Imagine trying to store categories directly in the posts table:

-- WRONG: Can't store multiple category IDs in one column
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    title TEXT,
    category_id INTEGER  -- What if post has 3 categories?
);

Or multiple columns?

-- ALSO WRONG: Limited to 3 categories, lots of NULLs
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    title TEXT,
    category_id_1 INTEGER,
    category_id_2 INTEGER,
    category_id_3 INTEGER
);

The Solution: Junction Table

-- Create categories table
CREATE TABLE categories (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50) UNIQUE NOT NULL,
    slug VARCHAR(50) UNIQUE NOT NULL,
    description TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create junction table
CREATE TABLE post_categories (
    post_id INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
    category_id INTEGER NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (post_id, category_id)  -- Composite primary key
);

Understanding the Junction Table

The post_categories table has:

post_id: References a post
category_id: References a category
PRIMARY KEY (post_id, category_id): Prevents duplicate pairings
ON DELETE CASCADE: We'll explain this soon!

Each row represents one connection between a post and a category.

Inserting Many-to-Many Data

-- Step 1: Insert categories
INSERT INTO categories (name, slug, description) 
VALUES 
    ('PostgreSQL', 'postgresql', 'All about PostgreSQL'),
    ('Tutorial', 'tutorial', 'Step-by-step guides'),
    ('Backend', 'backend', 'Backend development'),
    ('Database', 'database', 'Database topics');

-- Step 2: Assign categories to posts
-- Post 1 gets 3 categories
INSERT INTO post_categories (post_id, category_id) 
VALUES 
    (1, 1),  -- PostgreSQL Tips -> PostgreSQL
    (1, 2),  -- PostgreSQL Tips -> Tutorial
    (1, 4);  -- PostgreSQL Tips -> Database

-- Post 2 gets 2 categories
INSERT INTO post_categories (post_id, category_id) 
VALUES 
    (2, 2),  -- SQL Basics -> Tutorial
    (2, 4);  -- SQL Basics -> Database

-- Post 3 gets 2 categories
INSERT INTO post_categories (post_id, category_id) 
VALUES 
    (3, 3),  -- Database Design -> Backend
    (3, 4);  -- Database Design -> Database
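The composite primary key is what stops the same pairing from being stored twice. Here's a small sketch of that, assuming the sample rows above are already inserted (so the pair (1, 1) exists) and that the primary key kept PostgreSQL's default name, post_categories_pkey:

```sql
-- Assigning post 1 to category 1 a second time is rejected
INSERT INTO post_categories (post_id, category_id)
VALUES (1, 1);
-- ERROR: duplicate key value violates unique constraint "post_categories_pkey"

-- ON CONFLICT turns the duplicate into a no-op instead of an error,
-- which is handy when the application re-sends the same assignment
INSERT INTO post_categories (post_id, category_id)
VALUES (1, 1)
ON CONFLICT (post_id, category_id) DO NOTHING;
```

Whether you want the error or the silent no-op depends on the caller. I tend to use DO NOTHING for "make sure this post has this category" style operations.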

Querying Many-to-Many

-- Get all categories for a specific post
SELECT c.name, c.slug
FROM categories c
INNER JOIN post_categories pc ON c.id = pc.category_id
WHERE pc.post_id = 1;

-- Get all posts in a specific category
SELECT p.title, p.slug
FROM posts p
INNER JOIN post_categories pc ON p.id = pc.post_id
WHERE pc.category_id = 4  -- Database category
ORDER BY p.created_at DESC;

-- Count posts per category
SELECT 
    c.name,
    COUNT(pc.post_id) AS post_count
FROM categories c
LEFT JOIN post_categories pc ON c.id = pc.category_id
GROUP BY c.id, c.name
ORDER BY post_count DESC;
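One more pattern that comes up often: posts that belong to all of several categories, not just one of them. A minimal sketch, assuming the sample data inserted above and its 'tutorial' and 'database' slugs:

```sql
-- Posts that are in BOTH the Tutorial and Database categories
SELECT p.title
FROM posts p
INNER JOIN post_categories pc ON p.id = pc.post_id
INNER JOIN categories c ON pc.category_id = c.id
WHERE c.slug IN ('tutorial', 'database')
GROUP BY p.id, p.title
HAVING COUNT(DISTINCT c.id) = 2  -- must match every slug in the list
ORDER BY p.title;

-- With the sample data this returns:
--  PostgreSQL Tips
--  SQL Basics
```

The number in HAVING has to equal the number of slugs in the IN list. "Database Design" drops out because it is only in one of the two.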

One-to-One Relationships

One-to-one means each row in table A relates to at most one row in table B, and each row in B belongs to exactly one row in A. These are less common but useful for:

Separating optional data: User and UserProfile
Security: User and PasswordHash
Performance: Frequently-accessed vs rarely-accessed data

Example: Posts and SEO Metadata

CREATE TABLE post_seo (
    post_id INTEGER PRIMARY KEY REFERENCES posts(id) ON DELETE CASCADE,
    meta_title VARCHAR(200),
    meta_description VARCHAR(300),
    og_image_url TEXT,
    canonical_url TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

The key is post_id INTEGER PRIMARY KEY:

PRIMARY KEY ensures each post can have only one SEO entry
REFERENCES posts(id) ensures the post exists

-- Insert SEO data for post 1
INSERT INTO post_seo (post_id, meta_title, meta_description) 
VALUES (
    1, 
    'PostgreSQL Tips - Complete Guide',
    'Learn essential PostgreSQL tips for better database management'
);

-- Query post with its SEO data
SELECT 
    p.title,
    p.slug,
    s.meta_title,
    s.meta_description
FROM posts p
LEFT JOIN post_seo s ON p.id = s.post_id
WHERE p.id = 1;
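Because post_id is the primary key of post_seo, a second plain INSERT for the same post fails. When the goal is "create or replace the SEO entry", an upsert is one way to handle it. A sketch, assuming the row for post 1 from above already exists (the new title and description are made-up values):

```sql
-- Insert the SEO row, or overwrite it if post 1 already has one
INSERT INTO post_seo (post_id, meta_title, meta_description)
VALUES (
    1,
    'PostgreSQL Tips - Updated Guide',
    'A revised description for the same post'
)
ON CONFLICT (post_id) DO UPDATE
SET meta_title = EXCLUDED.meta_title,            -- EXCLUDED = the row we tried to insert
    meta_description = EXCLUDED.meta_description;
```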

JOIN Operations Explained

JOINs are how you query related data. Let me explain each type with real examples.

INNER JOIN: Only Matching Rows

Returns only rows where both tables have matching values.

-- Get posts with their author names (only published posts)
SELECT 
    p.title,
    p.created_at,
    a.full_name,
    a.email
FROM posts p
INNER JOIN authors a ON p.author_id = a.id
WHERE p.published = true;

LEFT JOIN: All Rows from Left Table

Returns all rows from the left table, even if there's no match in the right table.

-- Get all posts with comment counts (including posts with 0 comments)
SELECT 
    p.title,
    COUNT(c.id) AS comment_count
FROM posts p
LEFT JOIN comments c ON p.id = c.post_id
GROUP BY p.id, p.title
ORDER BY comment_count DESC;

Why LEFT JOIN here? If we used INNER JOIN, posts without comments would disappear from results.
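The same NULL-filled rows that LEFT JOIN keeps can also be used to find what is missing. A minimal sketch of that "anti-join" pattern, assuming the sample posts and comments inserted earlier:

```sql
-- Posts that have no comments at all
SELECT p.id, p.title
FROM posts p
LEFT JOIN comments c ON p.id = c.post_id
WHERE c.id IS NULL;  -- no matching comment row was found

-- An equivalent formulation that some people find easier to read
SELECT p.id, p.title
FROM posts p
WHERE NOT EXISTS (
    SELECT 1 FROM comments c WHERE c.post_id = p.id
);
```

With the sample data from earlier, only "Database Design" comes back from either query.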

RIGHT JOIN: All Rows from Right Table

Less common. Returns all rows from the right table.

-- Get all authors (even those without posts)
SELECT 
    a.username,
    COUNT(p.id) AS post_count
FROM posts p
RIGHT JOIN authors a ON p.author_id = a.id
GROUP BY a.id, a.username;

Note: RIGHT JOIN is usually rewritten as LEFT JOIN by reversing table order:

-- Same query, more readable
SELECT 
    a.username,
    COUNT(p.id) AS post_count
FROM authors a  -- Now on the left
LEFT JOIN posts p ON a.id = p.author_id
GROUP BY a.id, a.username;

Joining Multiple Tables

-- Get posts with author names and category names
SELECT 
    p.title,
    a.full_name AS author,
    STRING_AGG(DISTINCT c.name, ', ') AS categories,
    COUNT(DISTINCT com.id) AS comment_count
FROM posts p
INNER JOIN authors a ON p.author_id = a.id
LEFT JOIN post_categories pc ON p.id = pc.post_id
LEFT JOIN categories c ON pc.category_id = c.id
LEFT JOIN comments com ON p.id = com.post_id
WHERE p.published = true
GROUP BY p.id, p.title, a.full_name
ORDER BY p.created_at DESC
LIMIT 10;

Breaking down this complex query:

Start with posts (p)
INNER JOIN authors (a) - all posts have authors
LEFT JOIN post_categories (pc) - some posts may have no categories
LEFT JOIN categories (c) - get category names
LEFT JOIN comments (com) - count comments
STRING_AGG(DISTINCT c.name, ', ') - combine categories into one string; DISTINCT stops each name repeating once per comment
COUNT(DISTINCT com.id) - count unique comments, since each comment row repeats once per category
GROUP BY - because we're aggregating
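Joining both categories and comments to posts multiplies rows: a post with 3 categories and 2 comments produces 6 joined rows, which is why the query above has to count DISTINCT comment IDs. An alternative I sometimes reach for is to aggregate each child table on its own first. This is a sketch of that idea against the same tables, not the only way to write it:

```sql
SELECT
    p.title,
    a.full_name AS author,
    cat.categories,
    COALESCE(com.comment_count, 0) AS comment_count
FROM posts p
INNER JOIN authors a ON p.author_id = a.id
-- One row per post: its category names, already combined
LEFT JOIN (
    SELECT pc.post_id,
           STRING_AGG(c.name, ', ' ORDER BY c.name) AS categories
    FROM post_categories pc
    INNER JOIN categories c ON pc.category_id = c.id
    GROUP BY pc.post_id
) cat ON cat.post_id = p.id
-- One row per post: its comment count
LEFT JOIN (
    SELECT post_id, COUNT(*) AS comment_count
    FROM comments
    GROUP BY post_id
) com ON com.post_id = p.id
WHERE p.published = true
ORDER BY p.created_at DESC
LIMIT 10;
```

Each subquery returns at most one row per post, so the outer query needs no GROUP BY and no DISTINCT.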

JOIN Performance Visualization

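Each JOIN has to look up matching rows in the other table, and without an index on the join column PostgreSQL may have to scan that whole table to find them. This is why we'll create indexes in Part 4!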

CASCADE Options and Referential Integrity

What happens when you delete an author who has posts? By default, PostgreSQL prevents it:

-- Try to delete an author with posts
DELETE FROM authors WHERE id = 1;
-- ERROR: update or delete on table "authors" violates foreign key constraint
-- DETAIL: Key (id)=(1) is still referenced from table "posts".

This is referential integrity—the database protects you from orphaned data.

CASCADE Options

You can control this behavior with CASCADE options:

CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    author_id INTEGER REFERENCES authors(id) ON DELETE CASCADE,
    -- When author is deleted, delete all their posts too
    title TEXT
);

All CASCADE Options

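Option      | Behavior                                        | When to Use
------------+-------------------------------------------------+----------------------------------------------
NO ACTION   | Prevent delete (default); check may be deferred | When child data must always have a parent
RESTRICT    | Prevent delete; always checked immediately      | Explicit prevention
CASCADE     | Delete child rows too                           | When child data is meaningless without parent
SET NULL    | Set foreign key to NULL                         | When child can exist independently
SET DEFAULT | Set foreign key to its default value            | Rarely used

NO ACTION and RESTRICT both block the delete. The difference is that a NO ACTION check on a deferrable constraint can wait until the end of the transaction, while RESTRICT never waits.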

Real-World Example

-- Comments should be deleted if post is deleted
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    post_id INTEGER REFERENCES posts(id) ON DELETE CASCADE,
    content TEXT
);

-- But posts shouldn't auto-delete if author is deleted
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    author_id INTEGER REFERENCES authors(id) ON DELETE RESTRICT,
    title TEXT
);

-- Or: Keep posts but set author to NULL (anonymous)
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    author_id INTEGER REFERENCES authors(id) ON DELETE SET NULL,
    title TEXT
);

My rule of thumb:

CASCADE: Child data is meaningless without parent (comments without post)
RESTRICT/NO ACTION: Parent data should be preserved (don't accidentally delete authors)
SET NULL: Child can exist independently (posts can become anonymous)

Transactions and ACID Properties

A transaction is a group of SQL statements that must all succeed or all fail together.

Why Transactions Matter

Imagine transferring money between accounts:

-- Step 1: Subtract from account A
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Step 2: Add to account B
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

If Step 2 fails (power outage, network issue), you've just lost $100! Transactions prevent this.
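Here is the same transfer wrapped in a transaction, as a minimal sketch. The accounts table is the hypothetical one from the example above, not part of the blog schema:

```sql
BEGIN;

-- Step 1: Subtract from account A
UPDATE accounts SET balance = balance - 100 WHERE id = 1;

-- Step 2: Add to account B
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

COMMIT;  -- Only now do both changes become permanent
```

If the connection drops or the server crashes before COMMIT, neither UPDATE is kept, so the $100 never leaves account A.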

Basic Transaction Syntax

BEGIN;  -- Start transaction

-- Multiple operations
INSERT INTO posts (author_id, title, slug) 
VALUES (1, 'New Post', 'new-post');

INSERT INTO post_categories (post_id, category_id) 
VALUES (CURRVAL('posts_id_seq'), 1);

COMMIT;  -- Make changes permanent

-- OR

ROLLBACK;  -- Undo everything since BEGIN
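CURRVAL works, but it relies on knowing the sequence name. Another option is to let the INSERT hand back the new ID with RETURNING and feed it straight into the next statement. A sketch, assuming the categories inserted earlier (the title and slug are made-up values):

```sql
BEGIN;

WITH new_post AS (
    -- Create the post and capture its generated id
    INSERT INTO posts (author_id, title, slug)
    VALUES (1, 'Another New Post', 'another-new-post')
    RETURNING id
)
-- Attach the new post to two categories, looked up by slug
INSERT INTO post_categories (post_id, category_id)
SELECT new_post.id, c.id
FROM new_post
CROSS JOIN categories c
WHERE c.slug IN ('postgresql', 'tutorial');

COMMIT;
```

Looking categories up by slug also avoids hard-coding category IDs, which may differ between databases.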

Real-World Transaction Example

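-- Publish a post: update status AND log the event
-- (assumes posts has a published_at column, as in the complete schema below,
-- and that an activity_log table exists; neither is created above)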
BEGIN;

UPDATE posts 
SET 
    published = true,
    published_at = CURRENT_TIMESTAMP
WHERE id = 5;

INSERT INTO activity_log (post_id, action, timestamp) 
VALUES (5, 'published', CURRENT_TIMESTAMP);

COMMIT;  -- Both or neither!

If the INSERT fails, PostgreSQL marks the whole transaction as aborted. The COMMIT then acts as a ROLLBACK, so the UPDATE is undone as well.

ACID Properties Explained

ACID is what makes transactions reliable:

Atomicity: All or nothing

BEGIN;
INSERT INTO posts ...;
INSERT INTO post_categories ...;
COMMIT;  -- Either both succeed or both fail

Consistency: Database rules are never violated

-- A statement that breaks a constraint fails and aborts the transaction
BEGIN;
INSERT INTO posts (author_id, ...) VALUES (999, ...);  -- ERROR: invalid author
ROLLBACK;  -- Nothing from this transaction is kept

Isolation: Transactions don't interfere with each other

-- Transaction 1 sees consistent snapshot
-- Transaction 2 doesn't see Transaction 1's changes until COMMIT
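To make those two comment lines concrete, here is a sketch of two psql sessions open at the same time. It assumes post 1 exists and PostgreSQL's default READ COMMITTED isolation level. The statements need to be run in two separate connections, in the order shown:

```sql
-- Session A
BEGIN;
UPDATE posts SET view_count = view_count + 1 WHERE id = 1;
-- (not committed yet)

-- Session B, while A is still open
SELECT view_count FROM posts WHERE id = 1;
-- Still returns the old value: A's change is invisible to B

-- Session A
COMMIT;

-- Session B
SELECT view_count FROM posts WHERE id = 1;
-- Now returns the incremented value
```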

Durability: Committed data is permanent

COMMIT;  -- Even if power fails now, data is saved

Indexes for Query Performance

An index is like a book's index—it helps find data quickly without scanning everything.

Without an Index

-- Find all posts by author 5
SELECT * FROM posts WHERE author_id = 5;
-- PostgreSQL scans EVERY row (slow on large tables)

With an Index

-- Create index on author_id
CREATE INDEX idx_posts_author ON posts(author_id);

-- Same query; the planner can now use the index instead of scanning every row
SELECT * FROM posts WHERE author_id = 5;
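Rather than trusting that the index is used, you can ask PostgreSQL. A minimal sketch (I'm leaving the output out because the plan depends on your data and PostgreSQL version):

```sql
-- Show the plan PostgreSQL chose and how long the query actually took
EXPLAIN ANALYZE
SELECT * FROM posts WHERE author_id = 5;
```

Look for "Index Scan" or "Bitmap Index Scan" on idx_posts_author in the plan. On a table with only a handful of rows you will likely still see "Seq Scan", because reading the whole table is cheaper than going through the index.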

Types of Indexes

-- Single-column index
CREATE INDEX idx_posts_published ON posts(published);

-- Multi-column (composite) index
CREATE INDEX idx_posts_published_created ON posts(published, created_at DESC);

-- Unique index (a UNIQUE constraint creates one automatically, so posts.slug
-- already has one; the statement is shown here only for the syntax)
CREATE UNIQUE INDEX idx_posts_slug ON posts(slug);

-- Partial index (only for specific rows)
CREATE INDEX idx_published_posts ON posts(created_at) WHERE published = true;

When to Create Indexes

✅ Foreign key columns (author_id, post_id)
✅ Columns in WHERE clauses (published, status)
✅ Columns in ORDER BY (created_at)
✅ Columns in JOIN conditions
✅ Unique columns (slug, email)

❌ Columns rarely queried
❌ Small tables (< 1000 rows)
❌ Columns with many duplicates (low cardinality)

Index Trade-offs

Benefits:

Faster SELECT queries
Faster JOINs
Faster sorting

Costs:

Slower INSERTs (index must be updated)
Slower UPDATEs on indexed columns
Extra disk space
Maintenance overhead

Complete Blog Schema

Let's put it all together with our complete, production-ready blog schema:

-- Clean slate
DROP TABLE IF EXISTS post_categories CASCADE;
DROP TABLE IF EXISTS categories CASCADE;
DROP TABLE IF EXISTS comments CASCADE;
DROP TABLE IF EXISTS post_seo CASCADE;
DROP TABLE IF EXISTS posts CASCADE;
DROP TABLE IF EXISTS authors CASCADE;

-- Authors table
CREATE TABLE authors (
    id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    full_name VARCHAR(100),
    bio TEXT,
    active BOOLEAN DEFAULT true,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Posts table
CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id) ON DELETE RESTRICT,
    title VARCHAR(200) NOT NULL,
    slug VARCHAR(200) UNIQUE NOT NULL,
    content TEXT,
    excerpt VARCHAR(500),
    published BOOLEAN DEFAULT false,
    featured BOOLEAN DEFAULT false,
    view_count INTEGER DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    published_at TIMESTAMP,
    
    CONSTRAINT check_title_length CHECK (LENGTH(title) >= 3),
    CONSTRAINT check_view_count_positive CHECK (view_count >= 0)
);

-- Categories table
CREATE TABLE categories (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50) UNIQUE NOT NULL,
    slug VARCHAR(50) UNIQUE NOT NULL,
    description TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Post-Categories junction table (many-to-many)
CREATE TABLE post_categories (
    post_id INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
    category_id INTEGER NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (post_id, category_id)
);

-- Comments table
CREATE TABLE comments (
    id SERIAL PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
    parent_id INTEGER REFERENCES comments(id) ON DELETE CASCADE,  -- For nested comments
    author_name VARCHAR(100) NOT NULL,
    author_email VARCHAR(100),
    content TEXT NOT NULL,
    approved BOOLEAN DEFAULT false,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Post SEO metadata (one-to-one)
CREATE TABLE post_seo (
    post_id INTEGER PRIMARY KEY REFERENCES posts(id) ON DELETE CASCADE,
    meta_title VARCHAR(200),
    meta_description VARCHAR(300),
    og_image_url TEXT,
    canonical_url TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Indexes for performance
CREATE INDEX idx_posts_author ON posts(author_id);
CREATE INDEX idx_posts_published ON posts(published);
CREATE INDEX idx_posts_created ON posts(created_at DESC);
CREATE INDEX idx_posts_published_created ON posts(published, created_at DESC);
CREATE INDEX idx_comments_post ON comments(post_id);
CREATE INDEX idx_comments_approved ON comments(approved);
-- No separate index on post_categories(post_id): the composite primary key
-- (post_id, category_id) already serves lookups by post_id
CREATE INDEX idx_post_categories_category ON post_categories(category_id);
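The parent_id column on comments is what makes nested replies possible. Here's a sketch of reading one post's thread with a recursive CTE, assuming post 1 has some comments with parent_id set (the sample data earlier doesn't include any replies):

```sql
WITH RECURSIVE thread AS (
    -- Anchor: top-level comments on the post
    SELECT id, parent_id, author_name, content, 1 AS depth
    FROM comments
    WHERE post_id = 1
      AND parent_id IS NULL

    UNION ALL

    -- Recursive step: replies to comments already in the thread
    SELECT c.id, c.parent_id, c.author_name, c.content, t.depth + 1
    FROM comments c
    INNER JOIN thread t ON c.parent_id = t.id
)
SELECT id, parent_id, depth, author_name, content
FROM thread
ORDER BY depth, id;
```

The depth column tells the application how far to indent each reply.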

Real-World Relationship Patterns

Here are queries I use constantly in real blog applications:

Blog Homepage: Latest Posts with Authors and Categories

SELECT 
    p.id,
    p.title,
    p.slug,
    p.excerpt,
    p.view_count,
    p.created_at,
    a.full_name AS author,
    a.username AS author_username,
    STRING_AGG(DISTINCT c.name, ', ' ORDER BY c.name) AS categories,
    COUNT(DISTINCT com.id) AS comment_count
FROM posts p
INNER JOIN authors a ON p.author_id = a.id
LEFT JOIN post_categories pc ON p.id = pc.post_id
LEFT JOIN categories c ON pc.category_id = c.id
LEFT JOIN comments com ON p.id = com.post_id AND com.approved = true
WHERE p.published = true
GROUP BY p.id, p.title, p.slug, p.excerpt, p.view_count, p.created_at, a.full_name, a.username
ORDER BY p.created_at DESC
LIMIT 10;

Single Post View with Full Details

SELECT 
    p.*,
    a.full_name AS author_name,
    a.bio AS author_bio,
    a.username AS author_username,
    STRING_AGG(DISTINCT c.name, ', ') AS categories,
    s.meta_title,
    s.meta_description,
    COUNT(DISTINCT com.id) AS comment_count
FROM posts p
INNER JOIN authors a ON p.author_id = a.id
LEFT JOIN post_categories pc ON p.id = pc.post_id
LEFT JOIN categories c ON pc.category_id = c.id
LEFT JOIN post_seo s ON p.id = s.post_id
LEFT JOIN comments com ON p.id = com.post_id AND com.approved = true
WHERE p.slug = 'postgresql-tips'  -- From URL
GROUP BY p.id, a.full_name, a.bio, a.username, s.meta_title, s.meta_description;

Category Page: All Posts in a Category

SELECT 
    p.title,
    p.slug,
    p.excerpt,
    p.created_at,
    a.full_name AS author
FROM posts p
INNER JOIN authors a ON p.author_id = a.id
INNER JOIN post_categories pc ON p.id = pc.post_id
INNER JOIN categories cat ON pc.category_id = cat.id
WHERE cat.slug = 'postgresql'  -- From URL
  AND p.published = true
ORDER BY p.created_at DESC;

Author Profile: All Posts by Author

SELECT 
    a.full_name,
    a.bio,
    COUNT(DISTINCT p.id) AS total_posts,
    SUM(p.view_count) AS total_views,
    MAX(p.created_at) AS latest_post_date
FROM authors a
LEFT JOIN posts p ON a.id = p.author_id AND p.published = true
WHERE a.username = 'john_doe'
GROUP BY a.id, a.full_name, a.bio;

Popular Posts by Category

SELECT 
    c.name AS category,
    p.title,
    p.view_count,
    a.full_name AS author
FROM categories c
INNER JOIN post_categories pc ON c.id = pc.category_id
INNER JOIN posts p ON pc.post_id = p.id
INNER JOIN authors a ON p.author_id = a.id
WHERE p.published = true
ORDER BY c.name, p.view_count DESC;
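The last query lists every published post in every category. If you only want the top few per category, a window function can number the posts within each category first. A sketch, with the cutoff of 3 chosen arbitrarily:

```sql
SELECT category, title, view_count
FROM (
    SELECT
        c.name AS category,
        p.title,
        p.view_count,
        -- Number posts within each category, most viewed first
        ROW_NUMBER() OVER (
            PARTITION BY c.id
            ORDER BY p.view_count DESC
        ) AS rank_in_category
    FROM categories c
    INNER JOIN post_categories pc ON c.id = pc.category_id
    INNER JOIN posts p ON pc.post_id = p.id
    WHERE p.published = true
) ranked
WHERE rank_in_category <= 3
ORDER BY category, view_count DESC;
```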

Common Relationship Mistakes

Mistake 1: Forgetting CASCADE

-- Created without ON DELETE CASCADE
CREATE TABLE comments (
    post_id INTEGER REFERENCES posts(id)  -- Missing CASCADE!
);

-- Now you can't delete posts with comments
DELETE FROM posts WHERE id = 1;
-- ERROR: violates foreign key constraint

Fix: Add CASCADE or manually delete comments first.
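On a table that already exists, "adding CASCADE" means replacing the constraint. A sketch, assuming the constraint kept PostgreSQL's default name comments_post_id_fkey (check the real name with \d comments in psql before running this):

```sql
BEGIN;

-- Drop the old constraint, which blocks the delete
ALTER TABLE comments
    DROP CONSTRAINT comments_post_id_fkey;

-- Re-create it with the delete behavior we actually want
ALTER TABLE comments
    ADD CONSTRAINT comments_post_id_fkey
    FOREIGN KEY (post_id) REFERENCES posts(id) ON DELETE CASCADE;

COMMIT;
```

Doing both steps in one transaction means the table is never left without the foreign key.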

Mistake 2: Using VARCHAR for Foreign Keys

-- WRONG: Using string as foreign key
CREATE TABLE posts (
    author_username VARCHAR(50) REFERENCES authors(username)
);
-- What if author changes username? Nightmare!

-- RIGHT: Use integer ID
CREATE TABLE posts (
    author_id INTEGER REFERENCES authors(id)
);

Mistake 3: Missing Indexes on Foreign Keys

-- Created foreign key but no index (PostgreSQL doesn't index author_id for you)
CREATE TABLE posts (
    author_id INTEGER REFERENCES authors(id)
);

-- This query will be slow on large tables!
SELECT * FROM posts WHERE author_id = 5;

-- Fix: Create index
CREATE INDEX idx_posts_author ON posts(author_id);

Mistake 4: Joining Without WHERE

-- WRONG: Cartesian product (every post matched with every author!)
SELECT p.title, a.full_name
FROM posts p, authors a;
-- Returns posts * authors rows!

-- RIGHT: Use proper JOIN condition
SELECT p.title, a.full_name
FROM posts p
INNER JOIN authors a ON p.author_id = a.id;

Mistake 5: Many-to-Many Without Junction Table

-- WRONG: Trying to store multiple categories in one column
CREATE TABLE posts (
    categories TEXT  -- 'PostgreSQL,Tutorial,Database'
);
-- Querying is a nightmare!

-- RIGHT: Use junction table
CREATE TABLE post_categories (
    post_id INTEGER NOT NULL REFERENCES posts(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (post_id, category_id)  -- Prevents duplicate pairings
);

What I Learned About Data Relationships

Reflecting on my journey from comma-separated chaos to properly modeled relationships:

1. Relationships Are the Point

The "relational" in relational database isn't just terminology—it's the entire value proposition. Once I understood how to properly model relationships, my queries became simpler and my data more reliable.

2. Foreign Keys Are Your Best Friend

Initially, I thought foreign keys were restrictive. "Can't I just store the ID without the constraint?" Sure, but then you'll have orphaned data, broken references, and mysterious bugs. Foreign keys protect you.

3. Junction Tables Aren't Complex—They're Elegant

My first many-to-many relationship felt awkward. Why do I need an extra table? But once I understood that post_categories is just a list of connections, it clicked. It's actually simpler than trying to cram multiple values into one column.

4. CASCADE Options Require Thought

Don't just slap ON DELETE CASCADE on everything. Think about your business logic:

Should comments disappear if a post is deleted? (Probably yes)
Should posts disappear if an author is deleted? (Probably no)

5. Indexes Make or Break Performance

Without indexes, my JOIN queries on 10,000+ posts took seconds. With proper indexes on foreign keys, they ran in milliseconds. Always index your foreign keys!

6. Start Simple, Then Add Complexity

Don't try to model every possible relationship upfront. Start with the core entities (authors, posts), get that working, then add relationships (comments, categories) as needed.

Next Steps

Congratulations! You've mastered database relationships, the heart of relational databases. You now understand:

✅ Foreign keys and referential integrity
✅ One-to-many relationships (authors → posts)
✅ Many-to-many relationships (posts ↔ categories)
✅ One-to-one relationships (posts ↔ SEO data)
✅ JOIN operations (INNER, LEFT, RIGHT)
✅ CASCADE options for delete behavior
✅ Transactions and ACID properties
✅ Basic indexing for performance

But we can do even better! Our schema works, but is it optimized? In Part 4: Database Design and Best Practices, we'll level up with:

Normalization: Organizing data to eliminate redundancy
Indexing strategies: Making queries blazing fast
PostgreSQL-specific features: JSONB, GIN indexes, full-text search
Query optimization: Using EXPLAIN ANALYZE
Schema design patterns: Timestamps, soft deletes, audit trails
Performance best practices: What I learned from production databases

The final chapter of your database journey awaits! 🚀

Practice Exercise: Before moving to Part 4:

Model a many-to-many relationship between posts and tags
Create a nested comments system (parent_id self-reference)
Write a query to find posts with more than 5 comments
Add CASCADE to all appropriate foreign keys
Create indexes on all foreign key columns
Write a transaction that creates a post and assigns it to 3 categories

These exercises will prepare you for advanced database design!


Last updated 1 month ago
