Caching Strategies & Session Management
Introduction
One of the hardest lessons I learned running my multi-tenant POS system was this: databases are slow, and users are impatient. When the Chatbot Service needed to aggregate data from 5 different services to answer "What are my top-selling products today?", response times hit 800ms. Customers complained about the lag.
The solution wasn't faster databases or more powerful servers—it was strategic caching. By introducing Redis as a caching layer, I reduced chatbot response times to 150ms (an 80% improvement) and cut database load by 70%.
In this article, I'll share how I implemented Redis caching across my POS architecture, from simple session management in the Auth Service to complex aggregation caching in the Chatbot. We'll cover cache-aside patterns, TTL strategies, invalidation techniques, and the critical multi-tenant isolation concerns.
The Performance Problem
Before caching, here's what happened when a user asked the chatbot "Show me today's sales":
Chatbot → POS Core: Get all orders (200ms, query 500+ orders)
Chatbot → Payment: Get payment details (150ms, join payment records)
Chatbot → Inventory: Get product names (180ms, MongoDB query)
Chatbot → Restaurant: Get table assignments (120ms)
Chatbot: Aggregate and calculate (100ms, in-memory processing)
Total: ~750ms for a query that users ran dozens of times per day—with the exact same results for subsequent requests within the same hour.
This was wasteful. The data didn't change every second, yet we hit the database every time. Classic caching opportunity.
Redis in the Auth Service
The Auth Service (port 4001) uses Redis for two purposes: session storage and JWT token blacklisting.
Session Management
After a user logs in with email/password, I store their session in Redis instead of a database:
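A minimal sketch of what that looks like. The key format, `store_session` helper, and one-hour TTL are my illustrative assumptions; the dict-backed `FakeRedis` stands in for a real `redis.Redis` client so the example runs without a server.

```python
import json
import uuid

class FakeRedis:
    """Dict-backed stand-in for redis.Redis (setex/get only) so the
    sketch runs standalone; swap in a real client in production."""
    def __init__(self):
        self.store = {}
    def setex(self, key, ttl_seconds, value):
        self.store[key] = value  # the fake ignores TTL
    def get(self, key):
        return self.store.get(key)

SESSION_TTL = 3600  # illustrative: one-hour sessions

def store_session(client, tenant_id, user_id, user_data):
    """Persist a login session under a tenant-scoped key with a TTL."""
    session_id = str(uuid.uuid4())
    key = f"auth:session:{tenant_id}:{session_id}"
    client.setex(key, SESSION_TTL, json.dumps({"user_id": user_id, **user_data}))
    return session_id
```

Because the key embeds `tenant_id`, a session lookup can never cross tenant boundaries, and Redis expiry cleans up abandoned sessions automatically.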
JWT Blacklist
When users log out, their JWT is still valid until expiration. To handle this, I blacklist tokens in Redis:
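A sketch of the blacklist logic, assuming the token carries a `jti` claim and a known expiry timestamp (names are mine, not necessarily the service's). The trick is to set the blacklist entry's TTL to the token's remaining lifetime, so Redis deletes it the moment the token would have expired anyway.

```python
import time

class FakeRedis:
    """Dict-backed stand-in for redis.Redis; use a real client in production."""
    def __init__(self):
        self.store = {}
    def setex(self, key, ttl_seconds, value):
        self.store[key] = value
    def get(self, key):
        return self.store.get(key)

def blacklist_token(client, jti, token_exp):
    """On logout, blacklist the token's jti for exactly as long as it
    would otherwise remain valid; Redis expiry then cleans it up."""
    remaining = int(token_exp - time.time())
    if remaining > 0:
        client.setex(f"auth:blacklist:{jti}", remaining, "1")

def is_blacklisted(client, jti):
    return client.get(f"auth:blacklist:{jti}") is not None
```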
FastAPI middleware checks the blacklist on every request:
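A framework-free sketch of the check that middleware performs; in the real service this would be wired as FastAPI middleware or a dependency, but the function name and exception type here are my own illustration.

```python
class FakeRedis:
    """Preloadable dict-backed stand-in for redis.Redis."""
    def __init__(self, store=None):
        self.store = store or {}
    def get(self, key):
        return self.store.get(key)

class TokenRevokedError(Exception):
    """Raised when a request carries a logged-out (blacklisted) token."""

def verify_not_blacklisted(client, claims):
    """The per-request check: reject the token if its jti is blacklisted."""
    jti = claims.get("jti")
    if jti is not None and client.get(f"auth:blacklist:{jti}") is not None:
        raise TokenRevokedError("token has been logged out")
    return claims
```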
This pattern gives me:
Instant logout: Tokens are blacklisted immediately
Automatic cleanup: Redis TTL removes expired tokens
Tenant isolation: Session keys include tenant_id
Redis in the Chatbot Service
The Chatbot Service (port 4006) aggregates data from 5 services. Without caching, it was the slowest service in my architecture. Here's how I fixed it:
Aggregation Cache
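A sketch of what a tenant-scoped aggregation cache can look like. The class name, key format, and five-minute default TTL are my assumptions; the dict-backed `FakeRedis` stands in for a real client.

```python
import json

class FakeRedis:
    """Dict-backed stand-in for redis.Redis so the sketch runs standalone."""
    def __init__(self):
        self.store = {}
    def setex(self, key, ttl_seconds, value):
        self.store[key] = value
    def get(self, key):
        return self.store.get(key)

class AggregationCache:
    """Tenant-scoped cache for expensive cross-service aggregations."""
    def __init__(self, client, ttl=300):  # minutes of staleness are acceptable
        self.client = client
        self.ttl = ttl
    def _key(self, tenant_id, query):
        return f"chatbot:agg:{tenant_id}:{query}"
    def get(self, tenant_id, query):
        raw = self.client.get(self._key(tenant_id, query))
        return json.loads(raw) if raw is not None else None
    def set(self, tenant_id, query, result):
        self.client.setex(self._key(tenant_id, query), self.ttl, json.dumps(result))
```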
Using the Cache
Here's how the chatbot uses this cache for "top selling products" queries:
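A sketch of the lookup path. A plain dict stands in for the Redis-backed cache, and `fetch_orders` represents the slow multi-service aggregation; both names are mine.

```python
def top_selling_products(cache, tenant_id, fetch_orders):
    """Answer 'top selling products' from cache when possible; otherwise
    aggregate order lines (the slow, multi-service path) and cache the result."""
    key = (tenant_id, "top_selling_products:today")
    if key in cache:
        return cache[key]                 # cache hit: a single lookup
    totals = {}
    for line in fetch_orders(tenant_id):  # cache miss: the ~750ms path
        totals[line["product"]] = totals.get(line["product"], 0) + line["qty"]
    ranked = sorted(totals, key=totals.get, reverse=True)
    cache[key] = ranked
    return ranked
```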
Performance impact:
First request: 750ms (cache miss, aggregates from 5 services)
Subsequent requests: 12ms (cache hit, just Redis lookup)
Cache hit rate: 85% in production (same queries repeated throughout the day)
Cache-Aside Pattern Implementation
The cache-aside (lazy loading) pattern I use follows this flow: check Redis first; on a hit, return the cached value; on a miss, read from the source, write the result into Redis with a TTL, and return it.
Here's a reusable decorator I built:
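A sketch of such a decorator. The `cache_aside` name, key scheme, and JSON serialization are my assumptions, and the dict-backed `FakeRedis` stands in for a real client; the structure (read cache, fall through to the function, write back with a TTL) is the standard cache-aside shape.

```python
import functools
import json

class FakeRedis:
    """Dict-backed stand-in for redis.Redis; swap in a real client in production."""
    def __init__(self):
        self.store = {}
    def setex(self, key, ttl_seconds, value):
        self.store[key] = value
    def get(self, key):
        return self.store.get(key)

def cache_aside(client, prefix, ttl):
    """Decorator implementing cache-aside: try Redis first; on a miss,
    call the wrapped function, store its JSON result, and return it."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(tenant_id, *args):
            key = ":".join([prefix, str(tenant_id), *map(str, args)])
            hit = client.get(key)
            if hit is not None:
                return json.loads(hit)
            result = fn(tenant_id, *args)
            client.setex(key, ttl, json.dumps(result))
            return result
        return wrapper
    return decorator
```

Note the tenant_id is always the first positional argument and always lands in the key, which keeps the multi-tenant isolation rule enforceable in one place.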
TTL Strategies
Different data types need different Time-To-Live (TTL) values. Here's what I learned works best:
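The exact production values aren't reproduced here, but the shape of the trade-off is: hotter, faster-changing data gets shorter TTLs. An illustrative mapping, with all numbers being my assumptions:

```python
# Illustrative TTLs per data class, in seconds (not the article's exact values).
TTL_SECONDS = {
    "auth:session": 3600,      # sessions: match the token lifetime
    "chatbot:agg": 300,        # aggregations: a few minutes of staleness is fine
    "inventory:product": 600,  # product names/prices change rarely
    "pos:orders:today": 60,    # sales figures should stay near-real-time
}
```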
Multi-Tenant Cache Isolation
Critical lesson: Always include tenant_id in cache keys to prevent data leakage between tenants.
I also namespace by service:
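Both rules can be enforced in a single key builder so that forgetting the tenant becomes a hard error rather than a silent leak. The function name and key layout are my illustration:

```python
def cache_key(service, entity, tenant_id, *parts):
    """Build a namespaced, tenant-scoped key:
    <service>:<entity>:<tenant_id>[:extra parts].
    Rejecting empty tenant_ids turns cross-tenant leakage into a loud failure."""
    if not tenant_id:
        raise ValueError("tenant_id is required in every cache key")
    return ":".join([service, entity, str(tenant_id), *map(str, parts)])
```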
Cache Invalidation Patterns
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
Pattern 1: Event-Based Invalidation
Using the event bus from the previous article:
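A sketch of the wiring, with an in-process `EventBus` standing in for the real bus (whose subscribe/publish API may differ) and a dict standing in for Redis:

```python
class EventBus:
    """Minimal in-process stand-in for the event bus from the previous article."""
    def __init__(self):
        self.handlers = {}
    def subscribe(self, event, handler):
        self.handlers.setdefault(event, []).append(handler)
    def publish(self, event, payload):
        for handler in self.handlers.get(event, []):
            handler(payload)

def register_cache_invalidation(bus, cache):
    """Delete a tenant's sales aggregations whenever one of its orders changes."""
    def on_order_event(event):
        cache.pop(f"chatbot:agg:{event['tenant_id']}:top_selling_products", None)
    bus.subscribe("order.created", on_order_event)
    bus.subscribe("order.updated", on_order_event)
```

Only the affected tenant's keys are dropped; other tenants keep their warm cache.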
Pattern 2: Write-Through Cache
Update cache and database simultaneously:
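A sketch of the write path, with dicts standing in for the database and the Redis client; the function name and field-merge behavior are my illustration:

```python
import json

def update_product(db, cache, tenant_id, product_id, fields):
    """Write-through: apply the update to the database and refresh the
    cache in the same operation, so reads never see a stale entry."""
    record = {**db.get((tenant_id, product_id), {}), **fields}
    db[(tenant_id, product_id)] = record                    # 1. write the database
    cache[f"inventory:product:{tenant_id}:{product_id}"] = json.dumps(record)  # 2. refresh the cache
    return record
```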
Pattern 3: Time-Based Invalidation
Let TTL handle invalidation for read-heavy, write-light data:
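The defining feature is what's absent: no delete call anywhere. A sketch using a restaurant menu as the read-heavy example (my choice), with a dict standing in for Redis:

```python
import json

MENU_TTL = 600  # ten minutes; illustrative value for rarely changing data

def get_menu(cache, tenant_id, load_menu):
    """TTL-only invalidation: this code never deletes the key; with a real
    client, setex(key, MENU_TTL, ...) lets expiry cap the staleness."""
    key = f"restaurant:menu:{tenant_id}"
    if key in cache:
        return json.loads(cache[key])
    menu = load_menu(tenant_id)
    cache[key] = json.dumps(menu)  # real code: client.setex(key, MENU_TTL, ...)
    return menu
```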
Production Lessons Learned
Lesson 1: Cache Stampede
During a deployment, all caches expired simultaneously. Thousands of requests hit the database at once, causing a 30-second outage.
Solution: Stagger TTLs with jitter:
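A minimal jitter helper (the ±20% spread is my illustrative default):

```python
import random

def ttl_with_jitter(base_ttl, spread=0.2):
    """Randomize each key's TTL by up to ±spread so entries written together
    (e.g. right after a deployment) don't all expire at the same instant."""
    delta = int(base_ttl * spread)
    return base_ttl + random.randint(-delta, delta)
```

With `base_ttl=300`, keys expire anywhere between 240 and 360 seconds, spreading refresh load over a two-minute window instead of one spike.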
Lesson 2: Stale Data During Outages
When the Inventory Service was down, the cache served stale data, causing customers to order out-of-stock items.
Solution: Add health check to cache decorator:
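One reasonable reading of that fix, sketched as a function rather than a decorator: for stock-sensitive reads, check the source's health first and fail fast instead of answering from a possibly stale cache. All names here are my illustration, and a dict stands in for Redis:

```python
import json

class SourceUnavailableError(Exception):
    """Raised instead of answering from a possibly stale cache."""

def get_stock(cache, tenant_id, product_id, fetch_stock, inventory_healthy):
    """Refuse to answer stock queries while the Inventory Service is down,
    rather than risk serving quantities the cache can no longer verify."""
    if not inventory_healthy():
        raise SourceUnavailableError("Inventory Service unreachable")
    key = f"inventory:stock:{tenant_id}:{product_id}"
    if key in cache:
        return json.loads(cache[key])
    value = fetch_stock(tenant_id, product_id)
    cache[key] = json.dumps(value)  # a real client would use a short TTL here
    return value
```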
Lesson 3: Memory Exhaustion
Redis memory filled up with cached aggregations, causing evictions of session data (critical).
Solution: Use different Redis instances or databases:
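An illustrative split (hostnames and the config shape are my assumptions; the eviction policy names are real Redis server settings): sessions live on a `noeviction` instance so memory pressure can never silently drop them, while the bulky aggregation cache uses `allkeys-lru` and is allowed to shed keys.

```python
# Two logical Redis deployments so LRU eviction of bulky aggregations
# can never push out session data. Values are illustrative.
REDIS_DEPLOYMENTS = {
    "sessions": {  # critical: must never be evicted
        "host": "redis-sessions", "port": 6379,
        "maxmemory_policy": "noeviction",
    },
    "cache": {     # disposable: evict least-recently-used keys under pressure
        "host": "redis-cache", "port": 6379,
        "maxmemory_policy": "allkeys-lru",
    },
}
```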
Best Practices
Based on production experience:
Always include tenant_id in keys to prevent cross-tenant data leakage
Use TTL jitter to prevent cache stampede
Separate critical from cacheable data (different Redis instances/DBs)
Monitor cache hit rates - low hit rate means bad caching strategy
Invalidate on events for data that changes unpredictably
Use write-through for data that must be consistent
Let TTL handle invalidation for read-heavy, rarely changing data
Add circuit breakers - don't let cache failures take down your service
Next Steps
Caching and session management are foundational to performant distributed systems. In my POS architecture, Redis reduced:
Chatbot response time by 80% (800ms → 150ms)
Database load by 70%
Auth Service response time by 60%
In the next article, we'll explore Integration & Orchestration Patterns, where the Chatbot Service orchestrates calls to 5 downstream services—and how caching plays a crucial role in making that pattern fast and reliable.
This is part of the Software Architecture 101 series, where I share lessons learned building a production multi-tenant POS system with 6 microservices.