# Database Security

*My journey from a public-facing root user to production-grade access control*

***

## Table of Contents

1. [Introduction: The Security Audit That Humbled Me](#introduction-the-security-audit-that-humbled-me)
2. [PostgreSQL Authentication and pg\_hba.conf](#postgresql-authentication-and-pg_hbaconf)
3. [Roles and Privileges: The Foundation of Access Control](#roles-and-privileges-the-foundation-of-access-control)
4. [Row-Level Security (RLS)](#row-level-security-rls)
5. [SQL Injection: The Most Common Attack](#sql-injection-the-most-common-attack)
6. [Encryption: Data at Rest and in Transit](#encryption-data-at-rest-and-in-transit)
7. [Auditing and Monitoring](#auditing-and-monitoring)
8. [Connection Security](#connection-security)
9. [Secrets Management](#secrets-management)
10. [Security Checklist for Production](#security-checklist-for-production)
11. [What I Learned About Database Security](#what-i-learned-about-database-security)
12. [Next Steps](#next-steps)

***

## Introduction: The Security Audit That Humbled Me

When a security-conscious friend reviewed my blog infrastructure, the first thing he asked was: "What database user does your application use?"

"Postgres," I said.

He stared at me. "`postgres` is the superuser. Your application has superuser access to your entire database."

I shrugged. "It's just a blog."

"A blog that accepts user uploads, user comments, and runs arbitrary search queries. One SQL injection and an attacker can read your entire database, create new accounts, or drop your tables."

That conversation changed how I think about security permanently. **The principle of least privilege**—give every component only the minimum access it needs—isn't paranoia. It's the baseline.

***

## PostgreSQL Authentication and pg\_hba.conf

`pg_hba.conf` (Host-Based Authentication) is PostgreSQL's access control file. It defines who can connect, from where, and how they authenticate.

```bash
# Location: typically /etc/postgresql/{version}/main/pg_hba.conf
# or find it with:
SHOW hba_file;
```

### pg\_hba.conf Format

```
# TYPE  DATABASE    USER         ADDRESS        METHOD
local   all         postgres                    peer
local   all         all                         md5
host    all         all          127.0.0.1/32   scram-sha-256
host    blog_db     blog_app     10.0.0.0/8     scram-sha-256
hostssl all         all          0.0.0.0/0      scram-sha-256
```

**Authentication Methods (newest first):**

| Method          | Security | Notes                                 |
| --------------- | -------- | ------------------------------------- |
| `scram-sha-256` | ✅ Strong | Recommended for all modern setups     |
| `md5`           | ⚠️ Weak  | Legacy; avoid if possible             |
| `peer`          | ✅ Strong | Unix socket only; matches OS username |
| `trust`         | ❌ None   | Never use for network connections     |
| `reject`        | Block    | Explicitly deny connections           |

### Reload After Changes

```bash
# Reload without restarting PostgreSQL
pg_ctlcluster 14 main reload
# or
psql -c "SELECT pg_reload_conf();"
```

***

## Roles and Privileges: The Foundation of Access Control

PostgreSQL uses a unified **role** system for both users and groups.

### Role Hierarchy for a Blog Application

{% @mermaid/diagram content="graph TD
Superuser\["postgres (superuser)"]
Superuser --> AppRole\["blog\_app\_role\n(application role)"]
Superuser --> ReadRole\["blog\_read\_role\n(read-only role)"]
Superuser --> AdminRole\["blog\_admin\_role\n(admin role)"]
AppRole --> AppUser\["blog\_app\n(application user)"]
ReadRole --> ReportUser\["blog\_reporter\n(analytics user)"]
AdminRole --> AdminUser\["blog\_admin\n(DBA user)"]" %}

### Creating Roles and Users

```sql
-- Create a read-only role
CREATE ROLE blog_read_role NOLOGIN;
GRANT CONNECT ON DATABASE blog_db TO blog_read_role;
GRANT USAGE ON SCHEMA public TO blog_read_role;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO blog_read_role;
-- Future tables automatically granted
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO blog_read_role;

-- Create the application role (read + write, no DDL)
CREATE ROLE blog_app_role NOLOGIN;
GRANT CONNECT ON DATABASE blog_db TO blog_app_role;
GRANT USAGE ON SCHEMA public TO blog_app_role;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO blog_app_role;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO blog_app_role;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO blog_app_role;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT USAGE ON SEQUENCES TO blog_app_role;

-- Create actual login users and assign roles
CREATE USER blog_app WITH PASSWORD 'strong_random_password_here' LOGIN;
GRANT blog_app_role TO blog_app;

CREATE USER blog_reporter WITH PASSWORD 'another_strong_password' LOGIN;
GRANT blog_read_role TO blog_reporter;
```

### Principle of Least Privilege in Practice

```sql
-- Your application user should NEVER be superuser
-- Check current privileges
\du                                -- in psql
SELECT rolname, rolsuper, rolcreatedb, rolcreaterole, rolcanlogin
FROM pg_roles
WHERE rolname NOT LIKE 'pg_%';

-- Revoke dangerous privileges if accidentally granted
REVOKE SUPERUSER FROM blog_app;
REVOKE CREATEDB FROM blog_app;
REVOKE CREATEROLE FROM blog_app;
```

### Column-Level Privileges

```sql
-- Application can read all post columns but only specific author columns
REVOKE SELECT ON authors FROM blog_app_role;
GRANT SELECT (id, name, bio, created_at) ON authors TO blog_app_role;
-- password_hash column is never accessible through blog_app_role
```

### Revoking Privileges

```sql
-- Remove all privileges on a table
REVOKE ALL ON posts FROM blog_reporter;
GRANT SELECT ON posts TO blog_reporter;  -- Only SELECT now

-- Revoke from public schema (important hardening step)
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON DATABASE blog_db FROM PUBLIC;
```

***

## Row-Level Security (RLS)

Row-Level Security restricts which rows a user can see or modify, based on policies:

```sql
-- Enable RLS on the posts table
ALTER TABLE posts ENABLE ROW LEVEL SECURITY;

-- Policy: users can only see their own drafts, but everyone sees published posts
CREATE POLICY posts_visibility ON posts
    FOR SELECT
    USING (
        status = 'published'
        OR author_id = current_setting('app.current_user_id')::INTEGER
    );

-- Policy: users can only update/delete their own posts
CREATE POLICY posts_author_modify ON posts
    FOR UPDATE USING (
        author_id = current_setting('app.current_user_id')::INTEGER
    );

CREATE POLICY posts_author_delete ON posts
    FOR DELETE USING (
        author_id = current_setting('app.current_user_id')::INTEGER
    );

-- Superuser bypasses RLS by default
-- To bypass RLS for maintenance: ALTER TABLE posts FORCE ROW LEVEL SECURITY;
-- Then add explicit policy for admin role
```

### Setting Application Context

```sql
-- Your application sets the current user ID at connection start
SET app.current_user_id = '42';

-- Or at transaction level
BEGIN;
SET LOCAL app.current_user_id = '42';
SELECT * FROM posts;  -- Only sees this user's drafts + all published posts
COMMIT;
```

### Multi-Tenant with RLS

```sql
-- Multi-tenant setup where each tenant only sees their data
ALTER TABLE posts ADD COLUMN tenant_id INTEGER NOT NULL;
ALTER TABLE posts ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON posts
    USING (tenant_id = current_setting('app.tenant_id')::INTEGER);

-- Set in application before any query
SET app.tenant_id = '100';
```

***

## SQL Injection: The Most Common Attack

SQL injection occurs when user input is concatenated directly into SQL strings.

### The Attack

Imagine a search endpoint:

```python
# VULNERABLE: Never do this
query = f"SELECT * FROM posts WHERE title LIKE '%{user_input}%'"
db.execute(query)
```

If `user_input = "' OR '1'='1"`:

```sql
-- Resulting query - returns ALL posts!
SELECT * FROM posts WHERE title LIKE '%' OR '1'='1%'
```

If `user_input = "'; DROP TABLE posts; --"`:

```sql
-- Catastrophic injection
SELECT * FROM posts WHERE title LIKE '%'; DROP TABLE posts; --%'
```

### Prevention: Always Use Parameterised Queries

```python
# SAFE: Parameterised query
db.execute("SELECT * FROM posts WHERE title LIKE %s", [f"%{user_input}%"])

# In psycopg2
cursor.execute("SELECT * FROM posts WHERE id = %s AND status = %s",
               (post_id, 'published'))
```

```javascript
// SAFE: Node.js with pg
const result = await client.query(
    'SELECT * FROM posts WHERE id = $1 AND status = $2',
    [postId, 'published']
);
```

```sql
-- In SQL itself: use EXECUTE with USING for dynamic SQL in functions
CREATE OR REPLACE FUNCTION safe_search(search_term TEXT)
RETURNS SETOF posts AS $$
BEGIN
    RETURN QUERY
    EXECUTE 'SELECT * FROM posts WHERE title ILIKE $1'
    USING '%' || search_term || '%';
END;
$$ LANGUAGE plpgsql;
```

### Defence in Depth

Even with parameterised queries, apply additional layers:

1. **Least privilege**: Application user can't drop tables
2. **Input validation**: Reject obviously malicious patterns at the application layer
3. **WAF**: Web Application Firewall at the network level
4. **pg\_audit**: Audit all SQL executed

***

## Encryption: Data at Rest and in Transit

### SSL/TLS (Encryption in Transit)

```sql
-- Check if SSL is enabled
SHOW ssl;

-- Verify your connection is using SSL
SELECT ssl, version FROM pg_stat_ssl WHERE pid = pg_backend_pid();
```

```bash
# postgresql.conf
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
ssl_ca_file = 'root.crt'

# Force SSL for all connections in pg_hba.conf
hostssl  all  all  0.0.0.0/0  scram-sha-256
```

### Encrypting Sensitive Columns (pgcrypto)

```sql
-- Install pgcrypto extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Store encrypted data
INSERT INTO users (email, password_hash, api_key_encrypted)
VALUES (
    'user@example.com',
    crypt('their_password', gen_salt('bf', 12)),   -- bcrypt hash
    pgp_sym_encrypt('their_api_key', 'encryption_key_from_env')
);

-- Verify password
SELECT * FROM users
WHERE email = 'user@example.com'
  AND password_hash = crypt('their_password', password_hash);

-- Decrypt when needed
SELECT pgp_sym_decrypt(api_key_encrypted::bytea, 'encryption_key_from_env')
FROM users
WHERE id = 42;
```

> ⚠️ **Never store passwords as plain text or simple MD5.** Use bcrypt (via `crypt()` in pgcrypto) or argon2 at the application level.

### Transparent Data Encryption

For encryption at rest of the entire database, use filesystem-level encryption (LUKS on Linux, FileVault on macOS) or cloud provider managed encryption (AWS RDS encryption, GCP Cloud SQL encryption).

***

## Auditing and Monitoring

### pg\_audit Extension

```sql
-- Install pgaudit
-- In postgresql.conf:
-- shared_preload_libraries = 'pgaudit'
-- pgaudit.log = 'all'   -- or specific: 'read,write,ddl'

CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Audit only specific roles
ALTER ROLE blog_app SET pgaudit.log = 'write';
```

### Custom Audit Triggers

```sql
-- From the Views and Functions article - apply to sensitive tables
CREATE TRIGGER trg_audit_posts
AFTER INSERT OR UPDATE OR DELETE ON posts
FOR EACH ROW EXECUTE FUNCTION log_post_changes();

-- Audit access to sensitive author data
CREATE TABLE sensitive_access_log (
    id          SERIAL PRIMARY KEY,
    table_name  TEXT,
    operation   TEXT,
    accessed_by TEXT DEFAULT current_user,
    accessed_at TIMESTAMPTZ DEFAULT NOW(),
    row_id      INTEGER
);
```

### Monitor Failed Login Attempts

```sql
-- Check authentication failures in PostgreSQL log
-- In postgresql.conf:
-- log_connections = on
-- log_disconnections = on
-- log_failed_connections = on
-- log_min_duration_statement = 1000  -- log queries taking > 1 second

-- Monitor active connections
SELECT
    datname,
    usename,
    client_addr,
    state,
    query_start,
    LEFT(query, 60) AS current_query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY query_start;
```

***

## Connection Security

### Connection Limits

```sql
-- Limit connections per user to prevent DoS
ALTER USER blog_app CONNECTION LIMIT 50;
ALTER USER blog_reporter CONNECTION LIMIT 10;

-- Database-level connection limit
ALTER DATABASE blog_db CONNECTION LIMIT 100;

-- Monitor current connections
SELECT datname, count(*)
FROM pg_stat_activity
GROUP BY datname;
```

### Enforce Password Policies

```sql
-- Set password expiry
ALTER USER blog_app VALID UNTIL '2027-01-01';

-- Force password change
ALTER USER blog_app PASSWORD 'new_strong_password';

-- Require strong passwords via passwordcheck extension
shared_preload_libraries = 'passwordcheck'
```

### Protect Against Brute Force

PostgreSQL itself doesn't have built-in rate limiting for login attempts. Protect at the infrastructure level:

* **pg\_bouncer** with connection limits
* **fail2ban** watching PostgreSQL auth logs
* **Cloud security groups / VPC firewall** blocking non-application IPs

***

## Secrets Management

### Never Hardcode Credentials

```python
# WRONG - credentials in code
conn = psycopg2.connect(
    host="localhost",
    database="blog_db",
    user="blog_app",
    password="hardcoded_password"  # ❌ Never do this
)

# RIGHT - use environment variables
import os
conn = psycopg2.connect(
    host=os.environ["DB_HOST"],
    database=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"]
)
```

### Use a `.env` File (Development Only)

```bash
# .env (add to .gitignore!)
DB_HOST=localhost
DB_PORT=5432
DB_NAME=blog_db
DB_USER=blog_app
DB_PASSWORD=dev_password_only

# .gitignore
.env
*.env
```

### Production: Use Secret Managers

* **AWS**: AWS Secrets Manager or Parameter Store
* **GCP**: Secret Manager
* **Azure**: Key Vault
* **Self-hosted**: HashiCorp Vault
* **Kubernetes**: Sealed Secrets or External Secrets Operator

***

## Security Checklist for Production

```
AUTHENTICATION
[ ] Use scram-sha-256 in pg_hba.conf (not md5 or trust)
[ ] SSL/TLS enabled for all network connections
[ ] Strong passwords enforced

ACCESS CONTROL
[ ] Application uses a non-superuser account
[ ] Application role has only SELECT, INSERT, UPDATE, DELETE
[ ] No application access to pg_catalog or system tables
[ ] REVOKE CREATE ON SCHEMA public FROM PUBLIC
[ ] Column-level security for sensitive fields (passwords, PII)

SQL INJECTION
[ ] All queries use parameterised statements (no string concatenation)
[ ] No dynamic SQL in application code
[ ] Input validation at application layer

ENCRYPTION
[ ] SSL enabled (ssl = on)
[ ] Passwords stored as bcrypt hashes (never plain text or MD5)
[ ] Sensitive columns encrypted with pgcrypto

AUDITING
[ ] pg_audit or custom audit triggers on sensitive tables
[ ] log_connections = on
[ ] log_failed_connections = on
[ ] Activity monitoring dashboard in place

SECRETS
[ ] No credentials in source code or version control
[ ] Environment variables or secret manager for credentials
[ ] Secrets rotation policy in place

NETWORK
[ ] Database port not exposed to public internet
[ ] Application connects via VPC/private network
[ ] Connection limits set per user and database
```

***

## What I Learned About Database Security

Security isn't a feature you add at the end—it's a practice you embed from the start. The gap between "it works" and "it's secure" is enormous, and closing it doesn't require heroic effort. Most of it is just setting up the right roles, using parameterised queries, and following the principle of least privilege.

The changes that matter most:

1. **Separate database users for each application/service** — never share credentials
2. **Parameterised queries everywhere** — no exceptions
3. **Row-Level Security for multi-tenant or user-scoped data** — push access control into the database itself
4. **SSL always** — encryption in transit is non-negotiable
5. **Audit logs for sensitive tables** — know who changed what, when

***

## Next Steps

With security in place, the final operational concern is: what happens when things go wrong? **Backup and Recovery** ensures you can always restore your data.

* [**→ Next: Backup and Recovery**](https://blog.htunnthuthu.com/getting-started/programming/database-101/database-101-backup-and-recovery)
* [**← Previous: Views, Functions, and Stored Procedures**](https://blog.htunnthuthu.com/getting-started/programming/database-101/database-101-views-functions-stored-procedures)

***

*Part of the* [*Database 101 Series*](https://blog.htunnthuthu.com/getting-started/programming/database-101)
