Microservice Architecture

The Decision That Took 18 Months to Get Right

I mentioned in Monolithic Architecture that my POS system started as a monolith and that my first attempt at microservices failed badly. Here is the full story of how I eventually got it right, and what I understand now that I did not then.

The first split happened when I noticed a pattern: the chatbot service was consuming disproportionate CPU during peak hours, and its long-running LLM inference calls were blocking resources needed by real-time order processing. The chatbot's workload was genuinely different from the order processing workload. That was a real domain boundary, with a real operational reason to separate.

That combination of operational pressure and a real domain boundary is the signal I now look for before splitting a service.

What Is a Microservice?

A microservice is a small, independently deployable service focused on a single business capability. Independently deployable means I can release a new version of the Inventory service without touching the Order service.

Key characteristics:

Single responsibility — owns one bounded context
Independent deployment — its own CI/CD pipeline, its own release cycle
Data ownership — its own database, not shared with other services
Communication through APIs or events — no shared libraries for business logic, no shared databases

The Decomposition Decision

How do I decide what becomes a service? I use three criteria:

1. Domain Boundary

If two parts of the system belong to separate bounded contexts (orders vs. inventory vs. payments), they are candidates for separation. Domain boundaries are the natural seams.

2. Independent Scaling Need

If one part needs to scale differently from the rest, that is a signal. The chatbot service needed GPU-adjacent resources and handled long-running requests. The order service needed low latency and high throughput. These are different resource profiles.

3. Independent Deployment Rhythm

If different parts of the system change at very different rates — the chatbot model and prompt logic changed weekly; the payment integration changed monthly; the order schema almost never changed — that is a reason to separate.

Not all three need to align for every split, but I am suspicious of splits that satisfy none of them.
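The three criteria can be written down as a small checklist. This is only a sketch of how I think about it, not a tool I run: the `SplitCandidate` class and the `receipt-formatter` example are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SplitCandidate:
    """A part of the system being considered for extraction (hypothetical helper)."""
    name: str
    domain_boundary: bool         # criterion 1: separate bounded context
    independent_scaling: bool     # criterion 2: different resource profile
    independent_deployment: bool  # criterion 3: different rate of change

    def signals(self) -> List[str]:
        """Names of the criteria this candidate satisfies."""
        checks = {
            "domain boundary": self.domain_boundary,
            "independent scaling": self.independent_scaling,
            "independent deployment rhythm": self.independent_deployment,
        }
        return [label for label, met in checks.items() if met]

    def is_suspicious(self) -> bool:
        """A split that satisfies none of the criteria deserves scrutiny."""
        return not self.signals()


# The chatbot satisfied all three; the second candidate is an invented counter-example.
chatbot = SplitCandidate("chatbot", True, True, True)
receipt_formatter = SplitCandidate("receipt-formatter", False, False, False)
```

Writing the reasons down per candidate makes it harder to justify a split with "it feels cleaner".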

My POS Microservices Architecture

Here is what each service in the POS system owns:

| Service | Responsibility | Database | Language |
| --- | --- | --- | --- |
| Auth Service | Authentication, JWT, tenant isolation | PostgreSQL | Python (FastAPI) |
| POS Core | Order lifecycle, order items, receipts | PostgreSQL | Python (FastAPI) |
| Inventory | Products, stock levels, stock reservations | PostgreSQL | Python (FastAPI) |
| Payment | Payment processing, transaction records | PostgreSQL | Python (FastAPI) |
| Restaurant | Tenant management, table configuration, staff | PostgreSQL | Python (FastAPI) |
| Chatbot | LLM conversation history, context, responses | MongoDB | Python (FastAPI) |

The chatbot uses MongoDB because its data model is document-oriented (conversation threads with variable structure), while all other services have relational data with clear schemas.

Inter-Service Communication

Synchronous (REST)

For real-time calls where the caller needs an immediate response:

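# pos_core/clients/inventory_client.py
import httpx
from config import settings

class InventoryClient:
    """HTTP client POS Core uses to call the Inventory service."""

    def __init__(self):
        self._base_url = settings.INVENTORY_SERVICE_URL
        # 5 seconds each for connect, read, write and pool acquisition,
        # so an order never hangs on a slow Inventory service.
        self._timeout = httpx.Timeout(5.0)

    async def reserve_stock(
        self,
        tenant_id: str,
        product_id: int,
        quantity: int
    ) -> bool:
        """Ask Inventory to reserve stock and return its "reserved" flag."""
        async with httpx.AsyncClient(timeout=self._timeout) as client:
            response = await client.post(
                f"{self._base_url}/internal/reserve",
                json={"tenant_id": tenant_id, "product_id": product_id, "quantity": quantity},
                # Identifies the calling service to Inventory.
                headers={"X-Service": "pos-core"}
            )
            # Raises httpx.HTTPStatusError on any 4xx/5xx response.
            response.raise_for_status()
            return response.json()["reserved"]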
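The client above sets a timeout but does not retry. A minimal sketch of retry with exponential backoff follows, written against plain `asyncio` so it does not depend on any HTTP library. The helper name `call_with_retry` and its parameters are my own illustration, not code from the POS system.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def call_with_retry(
    operation: Callable[[], Awaitable[T]],
    *,
    attempts: int = 3,
    base_delay: float = 0.2,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Run `operation`, retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each time (base, 2x base, 4x base, ...), plus
            # jitter so that many callers do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("unreachable")
```

With httpx, I would expect to pass `retry_on=(httpx.TransportError,)` and wrap the call as `call_with_retry(lambda: client.reserve_stock(tenant_id, product_id, quantity))`. One caveat: retrying a POST such as a stock reservation is only safe if the endpoint is idempotent, otherwise a retry after a lost response can reserve the same stock twice.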

Asynchronous (Events)

For operations where the caller does not need to wait for completion:

# pos_core/events/publisher.py
import json
import redis.asyncio as redis
from config import settings

class EventPublisher:
    """Publishes POS Core domain events to Redis pub/sub."""

    def __init__(self):
        self._redis = redis.from_url(settings.REDIS_URL)

    async def order_placed(self, tenant_id: str, order_id: int, total: float):
        event = {
            "event": "order.placed",
            "tenant_id": tenant_id,
            "order_id": order_id,
            "total": total,
        }
        # Redis pub/sub is fire-and-forget: only subscribers connected
        # at publish time receive the message.
        await self._redis.publish("pos.events", json.dumps(event))

The Notification service and the Restaurant service subscribe to pos.events and handle order.placed independently. POS Core does not know which services consume the event; the only contract they share is the event payload.
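On the consuming side, each subscriber has to decode the message and decide whether it cares about it. Here is a minimal sketch of that dispatch step, assuming messages arrive as the JSON strings published above. The `on` decorator, the `dispatch` function and the `notify_kitchen` handler are hypothetical names; the Redis subscription loop that would feed `dispatch` is left out.

```python
import json
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]

# Maps an event name such as "order.placed" to the functions interested in it.
_handlers: Dict[str, List[Handler]] = {}


def on(event_name: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event name."""
    def register(handler: Handler) -> Handler:
        _handlers.setdefault(event_name, []).append(handler)
        return handler
    return register


def dispatch(raw_message: str) -> int:
    """Decode one message from the channel and run the matching handlers.

    Returns the number of handlers that ran. Unknown events are ignored so
    that publishers can add new event types without breaking subscribers.
    """
    event = json.loads(raw_message)
    handlers = _handlers.get(event.get("event"), [])
    for handler in handlers:
        handler(event)
    return len(handlers)


notified_orders: List[int] = []


@on("order.placed")
def notify_kitchen(event: Event) -> None:
    # Hypothetical side effect; a real handler would send a notification.
    notified_orders.append(event["order_id"])
```

Ignoring unknown event names is a deliberate choice here: it lets POS Core start publishing a new event type before every subscriber has been updated.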

Data Ownership

The hardest part of microservices is accepting that each service owns its data and no other service touches it directly.

This means:

No shared database tables between services
No cross-service JOINs in SQL
If Service A needs data from Service B, it calls Service B's API

In practice, this led me to duplicate certain read-optimised data. For example, the Chatbot service caches product names so it can answer questions about the menu without calling the Inventory service on every user message. When the Inventory service updates a product name, it publishes a product.updated event, and the Chatbot service updates its local cache.

This is eventual consistency — the chatbot's view of product data may be seconds behind, but that is acceptable for its use case.
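A minimal sketch of that local cache, assuming the product.updated event carries `tenant_id`, `product_id` and `name` fields. The payload shape and the `ProductNameCache` class are assumptions for illustration; the real Chatbot service may store this differently.

```python
import json
from typing import Dict, Optional, Tuple


class ProductNameCache:
    """Local, read-only copy of product names owned by the Inventory service."""

    def __init__(self) -> None:
        # Keyed by (tenant_id, product_id) so tenants never see each other's data.
        self._names: Dict[Tuple[str, int], str] = {}

    def apply(self, raw_event: str) -> None:
        """Update the cache from one event; ignore events it does not care about."""
        event = json.loads(raw_event)
        if event.get("event") != "product.updated":
            return
        key = (event["tenant_id"], event["product_id"])
        self._names[key] = event["name"]

    def name_of(self, tenant_id: str, product_id: int) -> Optional[str]:
        """Return the cached name, or None if no update has been seen yet."""
        return self._names.get((tenant_id, product_id))
```

The `None` case matters: a cache built purely from events starts empty, so the service needs either an initial load through the Inventory API or a fallback call for products it has not seen yet.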

Service Discovery and Routing

In development, I use Docker Compose with service names as DNS hostnames:

# docker-compose.yml
# Excerpt: inventory-service and the database containers (auth-db, pos-db) are omitted.
services:
  api-gateway:
    image: nginx:alpine
    ports:
      - "80:80"
    depends_on:
      - auth-service
      - pos-core
      - inventory-service

  auth-service:
    build: ./auth-service
    environment:
      - DATABASE_URL=postgresql://postgres:password@auth-db:5432/auth
    depends_on:
      - auth-db

  pos-core:
    build: ./pos-core
    environment:
      - DATABASE_URL=postgresql://postgres:password@pos-db:5432/pos
      - INVENTORY_SERVICE_URL=http://inventory-service:8003
      - AUTH_SERVICE_URL=http://auth-service:8001
    depends_on:
      - pos-db
      - auth-service
      - inventory-service

In production on Kubernetes, service discovery happens through cluster DNS — services refer to each other by service name, and Kubernetes resolves it to the correct pod IP.
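Because both environments resolve services by name, the application code only needs the URL from an environment variable. The sketch below is a guess at what a `config` module behind `from config import settings` could look like, using only the standard library. The `Settings` class, the `load_settings` function, the defaults and the `pos` namespace in the usage note are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    INVENTORY_SERVICE_URL: str
    AUTH_SERVICE_URL: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables.

    The defaults match the Docker Compose service names used in development.
    """
    return Settings(
        INVENTORY_SERVICE_URL=env.get(
            "INVENTORY_SERVICE_URL", "http://inventory-service:8003"
        ),
        AUTH_SERVICE_URL=env.get("AUTH_SERVICE_URL", "http://auth-service:8001"),
    )


settings = load_settings()
```

On Kubernetes the same code runs unchanged. The deployment would set `INVENTORY_SERVICE_URL` to the Service name, or to the fully qualified form such as `http://inventory-service.pos.svc.cluster.local:8003` when calling across namespaces.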

Deployment with Docker and Docker Compose

Each service has its own Dockerfile:

# auth-service/Dockerfile
FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8001

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8001"]

The benefit: I can rebuild and redeploy only the Auth service without touching any other container. That independent deployment cycle is one of the practical payoffs of microservices.

Operational Realities

Microservices introduce complexity that a monolith does not have:

| Challenge | What I Do About It |
| --- | --- |
| Distributed tracing | Correlation IDs passed in request headers across all services |
| Partial failures | Circuit breakers + retry with exponential backoff on HTTP clients |
| Cross-service debugging | Structured logging with tenant_id and correlation_id in every log line |
| Schema evolution | Versioned APIs (/api/v1/, /api/v2/), never break existing contracts |
| Local development | Docker Compose brings up all 6 services + databases in one command |

None of these is solved once and for all. They require ongoing attention.
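Correlation IDs and structured logging work together, so here is a minimal sketch of both using only the standard library. The header name `X-Correlation-ID` and the helper names are assumptions, and in a FastAPI service `begin_request` would be called from middleware rather than by hand.

```python
import contextvars
import json
import logging
import uuid
from typing import Dict

# Holds the correlation ID for whatever request the current task is handling.
correlation_id = contextvars.ContextVar("correlation_id", default="-")

HEADER = "X-Correlation-ID"  # assumed header name


def begin_request(headers: Dict[str, str]) -> str:
    """Reuse the caller's correlation ID, or mint a new one at the edge.

    A plain dict lookup is case-sensitive; web frameworks usually expose
    case-insensitive header mappings instead.
    """
    cid = headers.get(HEADER) or uuid.uuid4().hex
    correlation_id.set(cid)
    return cid


def outgoing_headers() -> Dict[str, str]:
    """Headers to attach to every call made to another service."""
    return {HEADER: correlation_id.get()}


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
            # Supplied per call via logger.info(..., extra={"tenant_id": ...}).
            "tenant_id": getattr(record, "tenant_id", None),
        })
```

A context variable fits here because each asyncio task gets its own copy of the context, so concurrent requests in the same process do not overwrite each other's ID.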

When Not to Use Microservices

I would not start with microservices if:

The team has fewer than 3–4 people
The domain boundaries are not understood yet
There is no CI/CD pipeline (deploying 6 services manually is worse than a monolith)
The system is a proof of concept or early-stage product

Start with a well-structured monolith, identify the natural seams, then extract services when you feel the operational pressure described in the decomposition section.

Lessons Learned

The first split should happen when you feel operational pain, not when you read about microservices.
Own your data completely or do not split the service. Shared-database microservices are distributed monoliths.
HTTP between services is a network call that can fail. Every inter-service call needs a timeout, a retry policy, and a fallback.
Observability is not optional. Without distributed tracing and structured logs, debugging a failure that involves three services is nearly impossible.
Microservices scale development teams, not just load. The real benefit of microservices is that two teams can work independently. If you do not have two teams, you may not need the separation.
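The lesson about timeouts, retries and fallbacks is easier to apply with a concrete shape in mind. The following is a minimal sketch of a circuit breaker with a fallback, kept synchronous for brevity. The `CircuitBreaker` class, its thresholds and the injectable `clock` are illustrative assumptions, not the implementation used in the POS system.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Stop calling a failing dependency for a while, then probe it again."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, operation: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                return fallback()  # open: fail fast without touching the network
            # Cool-down elapsed: half-open, let this one call through as a probe.
        try:
            result = operation()
        except Exception:  # a real breaker would count only transport errors
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed probe
            return fallback()
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

What the fallback does depends on the call. For a chatbot menu lookup it could be cached data; for a stock reservation the only safe fallback is probably to refuse the order line rather than pretend the stock was reserved.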


Last updated 5 days ago
