Data Architecture Patterns

It was 11 PM on a Friday when I realized we had a serious problem. A customer's order showed "Mohinga - 3500 MMK" but our Inventory Service said the current price was 4000 MMK. The order was placed two weeks ago, but now we couldn't tell what the customer actually paid because we'd joined live data from the Inventory Service.

That's when I learned data architecture isn't just about choosing databasesβ€”it's about data ownership, consistency models, and accepting trade-offs between perfect accuracy and system independence.

The Problem: Shared Database Chaos

Initially, all services shared one PostgreSQL database:

# All services accessing the same database - BAD
from shared_db import SessionLocal, Product, Order

def create_order(product_id: str):
    db = SessionLocal()
    
    # Order Service reading Product table (owned by Inventory Service)
    product = db.query(Product).filter(Product.id == product_id).first()
    
    order = Order(
        product_id=product.id,
        product_name=product.name,  # ← Reading another service's data
        price=product.price,         # ← Will change when Inventory updates it
    )
    db.add(order)
    db.commit()

Problems we hit:

  1. Tight coupling: Can't deploy Inventory Service without coordinating with Order Service

  2. Schema conflicts: Inventory wants to change Product.price type, breaks Order queries

  3. Performance: Inventory's slow product search query locks tables, slowing down order creation

  4. No clear ownership: Who owns the products table?

Database-Per-Service Pattern

Each service owns its database and data:

spinner

Why Different Databases?

PostgreSQL for transactional services:

  • Auth Service: ACID for user data

  • POS Core: Strong consistency for financial records

  • Payment Service: Transactional integrity

  • Restaurant Service: Relational menu structure

MongoDB for Inventory Service:

Product schemas vary wildly across restaurantsβ€”some need nutritional info, some need halal certification, some need multiple photos. MongoDB's flexibility was essential.

Data Duplication: The Necessary Evil

When POS Core creates an order, it duplicates product data:

Why duplicate?

  1. Historical accuracy: Order shows what customer paid, not current price

  2. Service independence: POS Core doesn't break if Inventory Service is down

  3. Performance: No join across services at query time

  4. Auditability: Legal requirement to preserve order details exactly as sold

Trade-off: Data might be "stale" (product renamed, but old orders show old name). This is acceptable for our business requirements.

Eventual Consistency

When product stock changes, systems don't update instantly:

POS Core subscribes to events:

Eventual consistency window: 50-500ms between Inventory update and POS Core knowing about it. For our business, this is acceptable.

The Saga Pattern: Coordinating Distributed Transactions

When an order is placed, we need to:

  1. Reserve inventory

  2. Process payment

  3. Create order record

But we can't use database transactions across services. Enter the Saga pattern:

Saga execution flow:

spinner

When NOT to Use Database-Per-Service

This pattern isn't always right:

Don't use if:

  • ❌ Single service application (over-engineering)

  • ❌ Need true ACID transactions (banking systems)

  • ❌ Complex joins across all data (reporting/analytics)

  • ❌ Small team without ops capacity

Do use if:

  • βœ… Multiple services with clear boundaries

  • βœ… Independent scaling requirements

  • βœ… Different data access patterns

  • βœ… Team autonomy is important

Production Lessons

Lesson 1: Data Migration is Hard

When we split the monolith database:

Challenges:

  • Data inconsistencies discovered during migration

  • Downtime required (we chose 3 AM Sunday)

  • Rollback plan needed (we kept monolith DB for 2 weeks)

Lesson 2: Accept Eventual Consistency

One customer complained: "I see 5 items in stock but checkout says out of stock!"

What happened:

  1. Customer loaded product page (cached: 5 in stock)

  2. Another customer bought all 5 (Inventory Service updated)

  3. First customer tried to checkout (POS Core cache not updated yet)

Solution: Better UX messaging:

Lesson 3: Query Complexity Increases

Reporting queries that were simple joins became complex aggregations:

Key Learnings

βœ… Data Ownership

  • Each service owns its data completely

  • Other services access via API, never direct database queries

  • Clear ownership prevents conflicts

βœ… Polyglot Persistence

  • Use the right database for each service's needs

  • PostgreSQL for transactions

  • MongoDB for flexible schemas

βœ… Data Duplication is OK

  • Duplicate data for service independence

  • Snapshots preserve historical accuracy

  • Trade consistency for availability

βœ… Eventual Consistency

  • Accept temporary inconsistency

  • Use events to propagate changes

  • Business requirements determine acceptable delay

βœ… Sagas for Distributed Transactions

  • Orchestrate multi-service operations

  • Implement compensating transactions

  • Handle partial failures gracefully

Common Mistakes

❌ Sharing databases "just this once"

  • Slippery slope back to tight coupling

❌ No compensating transactions

  • Saga failures leave inconsistent state

❌ Over-normalizing data

  • Join across services is expensive, duplicate when needed

❌ Ignoring migration complexity

  • Splitting databases is harder than splitting code

Next Steps

We've designed data architecture, but services still need to communicate about changes in real-time. Polling databases isn't the answer.

Next, we'll explore Event-Driven Architectureβ€”how Inventory Service tells POS Core about stock changes without tight coupling.

Continue to: Event-Driven Architecture Basics

Last updated