Caching Strategies

Why Caching Matters

Caching is one of the most effective ways to improve system performance. A well-implemented cache can reduce database load by 80-90%, decrease API response times from seconds to milliseconds, and save significant infrastructure costs.

I've seen caching transform struggling systems into performant ones. But I've also seen poorly implemented caching cause subtle bugs and stale data issues. The key is understanding when to cache, what to cache, and how to invalidate cached data.

Caching Fundamentals

What to Cache

Good cache candidates (from my experience):

  • Data that's read frequently but changes rarely (product catalogs, user profiles)

  • Expensive computations (aggregations, reports)

  • External API responses (third-party data)

  • Session data and authentication tokens

  • Static content (images, CSS, JavaScript)

Poor cache candidates:

  • Data that changes constantly (real-time stock prices, live sports scores)

  • User-specific data that's rarely reused

  • Data that must always be current (financial transactions)

  • Large objects that exceed cache memory limits

Cache Hit Ratio

The percentage of requests served from the cache instead of the database: hits / (hits + misses).

My target metrics:

  • Cache hit ratio > 80% for frequently accessed data

  • Cache hit ratio > 95% for static content

  • If hit ratio < 70%, I reconsider the caching strategy
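
A quick way to track this with Redis: the server keeps cumulative hit/miss counters, so the ratio falls out of `INFO stats`. A minimal sketch with redis-py (connection details are assumptions):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Redis tracks cumulative keyspace hits/misses in the "stats" section.
stats = r.info("stats")
hits = stats["keyspace_hits"]
misses = stats["keyspace_misses"]

total = hits + misses
hit_ratio = hits / total if total else 0.0
print(f"Cache hit ratio: {hit_ratio:.1%}")  # worth alerting if this dips below 70%
```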

Caching Patterns

1. Cache-Aside (Lazy Loading)

The application manages the cache directly. Most common pattern I use.

How it works:

  1. Check cache first

  2. If miss, fetch from database

  3. Store in cache

  4. Return data
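
In code, those four steps map almost one-to-one onto a small helper. A minimal sketch with redis-py, where `fetch_user_from_db` is a hypothetical stand-in for the real database call:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
USER_TTL = 300  # seconds; tune to how stale a profile is allowed to be

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"

    # 1. Check cache first
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # 2. On a miss, fetch from the database (hypothetical helper)
    user = fetch_user_from_db(user_id)

    # 3. Store in cache so the next read is a hit
    r.set(key, json.dumps(user), ex=USER_TTL)

    # 4. Return data
    return user
```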

When I use cache-aside:

  • Read-heavy workloads

  • When cache misses are acceptable

  • When I need fine control over caching logic

Challenges I've faced:

  • Cache stampede (multiple requests hit the DB at once when a hot key expires; a mitigation is sketched after this list)

  • Inconsistency between cache and database

  • Complex invalidation logic
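
For the stampede problem in particular, one mitigation I reach for is a short-lived lock so only one caller recomputes an expired entry. A sketch using Redis `SET NX`; `rebuild_value` is a hypothetical expensive recompute:

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_with_stampede_protection(key: str, ttl: int = 300) -> str:
    cached = r.get(key)
    if cached is not None:
        return cached

    # Only one caller wins the lock (NX = set only if the key doesn't exist).
    if r.set(f"lock:{key}", "1", nx=True, ex=10):
        try:
            value = rebuild_value(key)  # hypothetical expensive recompute
            r.set(key, value, ex=ttl)
            return value
        finally:
            r.delete(f"lock:{key}")

    # Everyone else backs off briefly and retries instead of hitting the DB.
    time.sleep(0.05)
    return get_with_stampede_protection(key, ttl)
```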

2. Write-Through Cache

Data is written to cache and database simultaneously.
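
A minimal write-through sketch: the write path updates the database and the cache in the same call, so reads never see a cache that lags behind. `save_user_to_db` is a hypothetical database helper:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_user(user_id: int, user: dict) -> None:
    # Database first: if this fails, the cache is never touched.
    save_user_to_db(user_id, user)  # hypothetical DB write

    # Then mirror the value into the cache so subsequent reads hit.
    r.set(f"user:{user_id}", json.dumps(user), ex=3600)
```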

When I use write-through:

  • When data consistency is critical

  • When the read-to-write ratio is high (written data is read many times)

  • When I can tolerate slightly slower writes

Trade-offs:

  • ✅ Cache always synchronized with database

  • ✅ No cache misses for written data

  • ❌ Higher write latency

  • ❌ Wasted cache space if data is never read

3. Write-Back (Write-Behind) Cache

Data is written to cache first, then asynchronously written to database.
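
A toy sketch to make the flow concrete: writes land in the cache and on a queue immediately, and a background worker drains the queue to the database. A production version needs batching, retries, and durable persistence, none of which is shown here:

```python
import json
import queue
import threading
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
write_queue: "queue.Queue[tuple[int, dict]]" = queue.Queue()

def save_user(user_id: int, user: dict) -> None:
    # Fast path: cache write plus enqueue; the caller returns immediately.
    r.set(f"user:{user_id}", json.dumps(user))
    write_queue.put((user_id, user))

def flush_worker() -> None:
    # Asynchronously persist queued writes to the database.
    while True:
        user_id, user = write_queue.get()
        save_user_to_db(user_id, user)  # hypothetical DB write
        write_queue.task_done()

threading.Thread(target=flush_worker, daemon=True).start()
```

Note that an in-process queue like this loses pending writes on a crash, which is exactly the first risk below.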

When I use write-back (rarely):

  • High-write workloads where latency is critical

  • Analytics and counters where eventual consistency is acceptable

  • When I have reliable cache infrastructure with persistence

⚠️ Risks:

  • Data loss if cache fails before writing to database

  • Complex error handling

  • Harder to debug issues

4. Refresh-Ahead Cache

Proactively refresh cache before expiration.
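
One way to sketch it: on each read, check how much TTL remains and kick off a background recompute once it drops below a threshold, so the entry is replaced before it ever expires. The threshold and `compute_report` helper are illustrative:

```python
import threading
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
TTL = 600           # full lifetime of a cached report
REFRESH_BELOW = 60  # refresh once less than a minute remains

def get_report(key: str) -> str:
    cached = r.get(key)
    if cached is not None:
        # Still valid, but refresh in the background if it's about to expire.
        if r.ttl(key) < REFRESH_BELOW:
            threading.Thread(target=refresh, args=(key,), daemon=True).start()
        return cached
    return refresh(key)

def refresh(key: str) -> str:
    value = compute_report(key)  # hypothetical expensive computation
    r.set(key, value, ex=TTL)
    return value
```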

When I use refresh-ahead:

  • Expensive computations that are frequently accessed

  • Dashboards and analytics

  • When avoiding cache misses is critical

Cache Invalidation

Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things." He was right.

Time-Based Expiration (TTL)

The simplest invalidation strategy: every entry expires after a fixed time-to-live, so staleness is bounded by the TTL you pick.
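
In Redis this is a one-liner per write; the payloads below are placeholders, and the TTLs are the kind of per-data-type values I'd start with:

```python
import redis

r = redis.Redis(host="localhost", port=6379)

profile_json = '{"id": 42, "name": "Ada"}'  # placeholder payloads
catalog_json = '{"products": []}'

# Pick TTLs per data type rather than one global value.
r.set("user:42:profile", profile_json, ex=300)   # 5 min: changes occasionally
r.set("product:catalog", catalog_json, ex=3600)  # 1 hour: mostly static
r.setex("session:abc123", 1800, "session-data")  # SETEX: TTL-first variant
```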

Event-Based Invalidation

Invalidate cache when data changes.
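
The trick is to put the invalidation in the same code path as the write, so the next read repopulates fresh data via cache-aside. A sketch with a hypothetical `update_user_in_db`:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def update_user(user_id: int, changes: dict) -> None:
    update_user_in_db(user_id, changes)  # hypothetical DB write

    # Drop everything derived from this user; the next read repopulates.
    r.delete(f"user:{user_id}", f"user:{user_id}:permissions")
```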

Cache Tags

Group related cache entries for easier invalidation.
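
Redis has no built-in tags, but a set per tag works well enough: record each key under its tags, then delete every member when the tag is invalidated. A minimal sketch:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def set_with_tags(key: str, value: str, tags: list[str], ttl: int = 300) -> None:
    r.set(key, value, ex=ttl)
    for tag in tags:
        r.sadd(f"tag:{tag}", key)  # remember which keys carry this tag

def invalidate_tag(tag: str) -> None:
    tag_key = f"tag:{tag}"
    keys = r.smembers(tag_key)
    if keys:
        r.delete(*keys)  # drop every entry tagged with this tag
    r.delete(tag_key)

# e.g. a price change can invalidate every entry that touched product 7:
set_with_tags("product:7:detail", "...", tags=["product:7"])
invalidate_tag("product:7")
```

One caveat: the tag sets outlive entries that expire on their own, so invalidation may delete keys that are already gone; that's harmless but worth knowing.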

CDN Caching

Content Delivery Networks cache static assets close to users.

CDN Configuration

Cache-Control Headers
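
These headers, sent by the origin, drive what both CDNs and browsers may cache and for how long. A sketch of typical values, set here from a hypothetical Flask app (the header values are the point, not the framework):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/assets/app.css")
def stylesheet():
    resp = app.send_static_file("app.css")
    # Fingerprinted static assets: cache aggressively, in shared caches too.
    resp.headers["Cache-Control"] = "public, max-age=31536000, immutable"
    return resp

@app.route("/api/profile")
def profile():
    resp = jsonify({"name": "Ada"})
    # User-specific data: only the browser may cache it, and only briefly.
    resp.headers["Cache-Control"] = "private, max-age=60"
    return resp
```

`public` allows shared caches like CDNs to store the response; `private` restricts caching to the end user's browser.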

Redis Best Practices

Redis is my go-to caching solution. Here are patterns I use:

Memory Management
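
The two settings that matter most are a hard memory cap and an eviction policy; without a cap, a busy cache eventually OOMs, which is exactly what bit me (see "What didn't work" below). A sketch of setting both at runtime with redis-py, though in production I'd put the same values in redis.conf:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Cap memory so Redis evicts keys instead of exhausting the host.
r.config_set("maxmemory", "2gb")

# allkeys-lru: evict the least-recently-used key, TTL or not.
r.config_set("maxmemory-policy", "allkeys-lru")

# Quick sanity check of current usage versus the cap.
info = r.info("memory")
print(info["used_memory_human"], "of", r.config_get("maxmemory")["maxmemory"])
```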

Lessons Learned

What worked:

  1. Start with cache-aside pattern - simplest and most flexible

  2. Use Redis for almost everything - it's fast, reliable, and well-supported

  3. Set conservative TTLs initially, then optimize based on metrics

  4. Tag-based invalidation for complex scenarios

  5. Monitor cache hit ratio religiously

What didn't work:

  1. Caching everything without measuring benefit

  2. Very long TTLs without invalidation strategy

  3. Write-back caching without proper backup mechanisms

  4. Sharing Redis instance between different use cases (sessions, cache, queues)

  5. Not setting max memory limits - caused OOM issues

My caching checklist:

  • ✅ Defined clear cache keys with namespace

  • ✅ Set appropriate TTLs

  • ✅ Implemented invalidation strategy

  • ✅ Added cache metrics monitoring

  • ✅ Handled cache failures gracefully

  • ✅ Documented caching decisions

What's Next

With caching strategies in place, let's explore database design for distributed systems.

