Caching Strategies & Session Management

Introduction

One of the hardest lessons I learned running my multi-tenant POS system was this: databases are slow, and users are impatient. When the Chatbot Service needed to aggregate data from 5 different services to answer "What are my top-selling products today?", response times hit 800ms. Customers complained about the lag.

The solution wasn't faster databases or more powerful servers—it was strategic caching. By introducing Redis as a caching layer, I reduced chatbot response times to 150ms (an 80% improvement) and cut database load by 70%.

In this article, I'll share how I implemented Redis caching across my POS architecture, from simple session management in the Auth Service to complex aggregation caching in the Chatbot. We'll cover cache-aside patterns, TTL strategies, invalidation techniques, and the critical multi-tenant isolation concerns.

The Performance Problem

Before caching, here's what happened when a user asked the chatbot "Show me today's sales":

  1. Chatbot → POS Core: Get all orders (200ms, query 500+ orders)

  2. Chatbot → Payment: Get payment details (150ms, join payment records)

  3. Chatbot → Inventory: Get product names (180ms, MongoDB query)

  4. Chatbot → Restaurant: Get table assignments (120ms)

  5. Chatbot: Aggregate and calculate (100ms, in-memory processing)

Total: ~750ms for a query that users ran dozens of times per day—with the exact same results for subsequent requests within the same hour.

This was wasteful. The data didn't change every second, yet we hit the database every time. Classic caching opportunity.

Redis in the Auth Service

The Auth Service (port 4001) uses Redis for two purposes: session storage and JWT token blacklisting.

Session Management

After a user logs in with email/password, I store their session in Redis instead of a database:
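A sketch of the shape (assuming a redis-py client `r`; the key layout and the 24-hour TTL are illustrative):

```python
import json

SESSION_TTL = 60 * 60 * 24  # 24 hours; illustrative, match your login policy

def session_key(tenant_id: str, user_id: str) -> str:
    # tenant_id in the key keeps sessions isolated per tenant
    return f"session:{tenant_id}:{user_id}"

def store_session(r, tenant_id: str, user_id: str, session: dict) -> None:
    # SETEX writes the value and its expiry atomically, so Redis cleans
    # up the session on its own even if the user never logs out
    r.setex(session_key(tenant_id, user_id), SESSION_TTL, json.dumps(session))

def load_session(r, tenant_id: str, user_id: str):
    raw = r.get(session_key(tenant_id, user_id))
    return json.loads(raw) if raw else None
```

On logout, a single `r.delete(session_key(...))` ends the session immediately — no database round-trip either way.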

JWT Blacklist

When users log out, their JWT is still valid until expiration. To handle this, I blacklist tokens in Redis:
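The shape of that blacklist, assuming the JWT carries the standard `jti` (token ID) and `exp` (expiry) claims and `r` is a redis-py client:

```python
import time

def blacklist_token(r, jti: str, exp: int) -> None:
    # TTL = seconds until the token would expire anyway, so each
    # blacklist entry cleans itself up and the set never grows unbounded
    ttl = max(int(exp - time.time()), 1)
    r.setex(f"blacklist:{jti}", ttl, "1")

def is_blacklisted(r, jti: str) -> bool:
    return r.exists(f"blacklist:{jti}") > 0
```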

FastAPI middleware checks the blacklist on every request:
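Stripped of the FastAPI plumbing, the per-request check boils down to this (a sketch; `token_payload` is the already-signature-verified JWT payload):

```python
def verify_request(r, token_payload: dict) -> None:
    # The jti claim identifies this specific token in the blacklist;
    # a missing jti is treated the same as a revoked token
    jti = token_payload.get("jti")
    if jti is None or r.exists(f"blacklist:{jti}"):
        raise PermissionError("token revoked or malformed")
```

In the real middleware, the `PermissionError` becomes a 401 response before the request ever reaches a route handler.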

This pattern gives me:

  • Instant logout: Tokens are blacklisted immediately

  • Automatic cleanup: Redis TTL removes expired tokens

  • Tenant isolation: Session keys include tenant_id

Redis in the Chatbot Service

The Chatbot Service (port 4006) aggregates data from 5 services. Without caching, it was the slowest service in my architecture. Here's how I fixed it:

Aggregation Cache
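
The cache is a thin wrapper over Redis that scopes every key by tenant and serializes results as JSON (a sketch; class and key names are illustrative, `r` is a redis-py client):

```python
import json

class AggregationCache:
    def __init__(self, r, ttl: int = 300):  # 5-minute default TTL
        self.r = r
        self.ttl = ttl

    def key(self, tenant_id: str, query: str) -> str:
        # tenant_id baked into every key, so one tenant can never
        # read another tenant's aggregates
        return f"chatbot:agg:{tenant_id}:{query}"

    def get(self, tenant_id: str, query: str):
        raw = self.r.get(self.key(tenant_id, query))
        return json.loads(raw) if raw else None

    def set(self, tenant_id: str, query: str, value) -> None:
        self.r.setex(self.key(tenant_id, query), self.ttl, json.dumps(value))
```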

Using the Cache

Here's how the chatbot uses this cache for "top selling products" queries:
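In outline (a sketch; `cache` is any object with tenant-scoped `get`/`set` methods, and `aggregate_top_products` stands in for the real five-service fan-out):

```python
def top_selling_products(cache, tenant_id: str, aggregate_top_products):
    cached = cache.get(tenant_id, "top_products_today")
    if cached is not None:
        return cached                      # cache hit: just a Redis lookup
    result = aggregate_top_products()      # cache miss: expensive fan-out
    cache.set(tenant_id, "top_products_today", result)
    return result
```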

Performance impact:

  • First request: 750ms (cache miss, aggregates from 5 services)

  • Subsequent requests: 12ms (cache hit, just Redis lookup)

  • Cache hit rate: 85% in production (same queries repeated throughout the day)

Cache-Aside Pattern Implementation

The cache-aside (lazy loading) pattern I use follows this flow: on a read, check Redis first; on a hit, return the cached value; on a miss, read from the database, write the result into Redis with a TTL, and return it. Writes go straight to the database, and the cache entry is either invalidated or simply left to expire.

Here's a reusable decorator I built:
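Reconstructed here as a sketch (`r` is a redis-py client, `key_fn` builds the tenant-scoped key from the wrapped function's arguments):

```python
import functools
import json

def cached(r, key_fn, ttl: int):
    """Cache-aside decorator: check Redis first, fall through to the function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            key = key_fn(*args, **kwargs)
            hit = r.get(key)
            if hit is not None:
                return json.loads(hit)     # cache hit
            result = fn(*args, **kwargs)   # cache miss: do the real work
            r.setex(key, ttl, json.dumps(result))
            return result
        return wrapper
    return decorator
```

Usage is one line above any expensive read, e.g. `@cached(r, key_fn=lambda tenant_id: f"sales:{tenant_id}:today", ttl=300)`.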

TTL Strategies

Different data types need different Time-To-Live (TTL) values. Here's what I learned works best:
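The exact numbers depend on your traffic, but the shape looks roughly like this (values are illustrative, not prescriptions — tune them against your own hit rates):

```python
# TTLs in seconds; illustrative values, not the one true answer
TTL = {
    "session": 60 * 60 * 24,     # 24 h: lives as long as the login does
    "jwt_blacklist": None,       # derived: seconds until the token's own exp
    "aggregation": 60 * 5,       # 5 min: "today's sales" can lag slightly
    "product_catalog": 60 * 60,  # 1 h: changes rarely, read constantly
    "live_stats": 30,            # 30 s: near-real-time dashboards
}
```

The general rule: the more tolerable staleness is, and the more expensive the recomputation, the longer the TTL.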

Multi-Tenant Cache Isolation

Critical lesson: Always include tenant_id in cache keys to prevent data leakage between tenants.

I also namespace by service:
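A minimal helper capturing both rules (a sketch; the `service:tenant:resource` layout is illustrative):

```python
def cache_key(service: str, tenant_id: str, *parts: str) -> str:
    # service first for namespacing; tenant_id is a mandatory positional
    # argument, so a forgotten tenant can never silently produce a key
    # that is shared across tenants
    return ":".join([service, tenant_id, *parts])
```

Every service builds keys through one helper like this, so tenant isolation is enforced in exactly one place.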

Cache Invalidation Patterns

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

Pattern 1: Event-Based Invalidation

Using the event bus from the previous article:

Pattern 2: Write-Through Cache

Update cache and database simultaneously:
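A sketch (the `db.update` call stands in for whatever persistence layer you use; key name and TTL are illustrative):

```python
import json

def update_product(db, r, tenant_id: str, product_id: str, fields: dict) -> dict:
    # Write-through: persist first, then overwrite the cache entry with
    # the fresh row, so readers never observe the old value
    product = db.update(product_id, fields)
    key = f"inventory:{tenant_id}:product:{product_id}"
    r.setex(key, 60 * 60, json.dumps(product))
    return product
```

The ordering matters: database first, cache second. If the cache write fails you serve slightly stale data until the TTL expires; if you wrote the cache first and the database failed, you'd serve data that was never persisted.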

Pattern 3: Time-Based Invalidation

Let TTL handle invalidation for read-heavy, write-light data:
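A sketch for a product-catalog read, where the TTL is the only invalidation (`r` is a redis-py client; the 1-hour value is illustrative):

```python
import json

def get_catalog(r, tenant_id: str, load_catalog):
    key = f"inventory:{tenant_id}:catalog"
    raw = r.get(key)
    if raw is not None:
        return json.loads(raw)
    catalog = load_catalog()                     # hit the database only on expiry
    r.setex(key, 60 * 60, json.dumps(catalog))   # 1 h TTL does the invalidation
    return catalog
```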

Production Lessons Learned

Lesson 1: Cache Stampede

During a deployment, all caches expired simultaneously. Thousands of requests hit the database at once, causing a 30-second outage.

Solution: Stagger TTLs with jitter:
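A small helper does it (a sketch; the ±10% spread is illustrative):

```python
import random

def ttl_with_jitter(base: int, spread: float = 0.1) -> int:
    # Spread expirations over ±10% of the base TTL so that keys written
    # together (e.g. right after a deployment) don't all expire together
    return int(base * random.uniform(1 - spread, 1 + spread))
```

A 5-minute TTL becomes anything between ~4.5 and ~5.5 minutes, which is enough to smear the refill load across time instead of concentrating it into one thundering herd.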

Lesson 2: Stale Data During Outages

When the Inventory Service was down, the cache served stale data, causing customers to order out-of-stock items.

Solution: Add health check to cache decorator:
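A sketch, assuming a `healthy()` callable that pings the source service's health endpoint — for stock levels, failing fast beats serving a stale answer:

```python
def get_stock(r, tenant_id: str, product_id: str, fetch, healthy) -> int:
    key = f"inventory:{tenant_id}:stock:{product_id}"
    if healthy():
        cached = r.get(key)
        if cached is not None:
            return int(cached)
        stock = fetch(product_id)
        r.setex(key, 30, str(stock))   # short TTL: stock changes fast
        return stock
    # Source service is down: refuse to serve possibly-stale stock
    raise RuntimeError("inventory service unavailable; not serving cached stock")
```

For less critical data you might do the opposite — serve stale on outage — which is exactly why the health check belongs in the decorator, configurable per data type.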

Lesson 3: Memory Exhaustion

Redis memory filled up with cached aggregations, causing evictions of session data (critical).

Solution: Use different Redis instances or databases:
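A sketch of the split (hostnames and URLs are illustrative; the eviction policies are the real point):

```python
# Critical data and evictable cache live in separate Redis instances
# (or at minimum separate databases), each with the right policy:
#
#   sessions / JWT blacklist -> maxmemory-policy noeviction  (never drop)
#   aggregation cache        -> maxmemory-policy allkeys-lru (safe to evict)
#
# With redis-py that maps to two clients, e.g.:
#   sessions = redis.Redis.from_url(SESSION_DB_URL)
#   cache    = redis.Redis.from_url(CACHE_DB_URL)
SESSION_DB_URL = "redis://redis-critical:6379/0"  # noeviction
CACHE_DB_URL = "redis://redis-cache:6379/0"       # allkeys-lru
```

With `noeviction` on the critical instance, memory pressure surfaces as a loud write error instead of silently logging users out.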

Best Practices

Based on production experience:

  1. Always include tenant_id in keys to prevent cross-tenant data leakage

  2. Use TTL jitter to prevent cache stampede

  3. Separate critical from cacheable data (different Redis instances/DBs)

  4. Monitor cache hit rates - a consistently low hit rate means your keys or TTLs need rethinking

  5. Invalidate on events for data that changes unpredictably

  6. Use write-through for data that must be consistent

  7. Let TTL handle invalidation for read-heavy, rarely changing data

  8. Add circuit breakers - don't let cache failures take down your service

Next Steps

Caching and session management are foundational to performant distributed systems. In my POS architecture, Redis reduced:

  • Chatbot response time by 80% (800ms → 150ms)

  • Database load by 70%

  • Auth Service response time by 60%

In the next article, we'll explore Integration & Orchestration Patterns, where the Chatbot Service orchestrates calls to 5 downstream services—and how caching plays a crucial role in making that pattern fast and reliable.


This is part of the Software Architecture 101 series, where I share lessons learned building a production multi-tenant POS system with 6 microservices.
