What You'll Learn
How to use Redis to drastically improve application performance. We'll cover the Cache-Aside, Write-Through, and Write-Behind strategies, plus one of the hardest problems in computer science: cache invalidation.
Why Redis?
Redis (Remote Dictionary Server) is an in-memory key-value data store. Because it keeps data in RAM instead of on disk, it can typically answer requests in under a millisecond, and a single instance can often serve hundreds of thousands of operations per second (more with pipelining or clustering, depending on hardware and workload).
However, RAM is expensive and volatile. Therefore, Redis is most commonly used alongside a primary database (like PostgreSQL) as a caching layer.
1. Cache-Aside (Lazy Loading)
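This is the most common caching strategy. The application talks to both the cache and the database: it checks the cache first, falls back to the database only on a miss, and then stores the result in the cache for the next reader.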
async function getUserProfile(userId) {
  // 1. Check the cache first
  const cacheKey = `user:${userId}:profile`;
  const cachedData = await redis.get(cacheKey);

  // Cache HIT! Return immediately
  if (cachedData) {
    return JSON.parse(cachedData);
  }

  // 2. Cache MISS! Fetch from the primary database.
  // Assumes db.query resolves to a single row object (or null when nothing matches);
  // drivers that return an array of rows need the first element picked out here.
  const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);

  if (user) {
    // 3. Write to the cache for next time and set a TTL (Time To Live)
    await redis.set(cacheKey, JSON.stringify(user), 'EX', 3600); // Expire in 1 hour
  }
  return user;
}
2. Write-Through Caching
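In this strategy, every write updates both the cache and the database synchronously, as part of the same operation, so the cache does not lag behind the database. In the strict form, the cache acts as the primary interface and writes through to the database itself; the simplified example below performs both writes in application code, database first.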
async function updateUserName(userId, newName) {
  // 1. Update the database
  await db.query('UPDATE users SET name = ? WHERE id = ?', [newName, userId]);

  // 2. Update the cache immediately
  const cacheKey = `user:${userId}:profile`;
  // Re-read the full row so the cached object matches what the database now holds
  const updatedUser = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
  // Keep the same 1-hour TTL as the read path; a plain SET would clear the key's existing expiry
  await redis.set(cacheKey, JSON.stringify(updatedUser), 'EX', 3600);
  return updatedUser;
}
3. Write-Behind (Write-Back) Caching
The application writes data ONLY to the cache and returns immediately. A background process asynchronously syncs the cache data to the primary database later.
This is highly performant for write-heavy workloads (like counting YouTube views or tracking live game scores), but it risks data loss if the cache server crashes before the sync happens.
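A minimal sketch of the write-behind idea for a view counter. It assumes an ioredis-style client (`redis`) and the same hypothetical `db.query` helper used in the earlier examples; the key names, the `videos` table, and the function names are illustrative, not part of any library.

```javascript
// Write-behind sketch: requests only touch Redis; a background job persists later.
// Key names and the `videos` table are hypothetical.
const DIRTY_SET = 'views:dirty';

// Request path: increment a pending counter and remember which video changed.
async function recordView(redis, videoId) {
  await redis.incr(`views:${videoId}:pending`);
  await redis.sadd(DIRTY_SET, videoId);
}

// Background path: run on a timer (for example every few seconds).
async function flushViews(redis, db) {
  const videoIds = await redis.smembers(DIRTY_SET);
  for (const videoId of videoIds) {
    // Un-mark first: a view arriving mid-flush re-adds the id, so it is not lost.
    await redis.srem(DIRTY_SET, videoId);
    // GETSET reads the pending count and resets it to 0 in one atomic step.
    const pending = Number(await redis.getset(`views:${videoId}:pending`, 0));
    if (pending > 0) {
      // If this write fails, or Redis crashed before the flush, these views are gone:
      // that is the data-loss risk described above.
      await db.query('UPDATE videos SET views = views + ? WHERE id = ?', [pending, videoId]);
    }
  }
}
```

In an application, `flushViews` might be scheduled with something like `setInterval(() => flushViews(redis, db), 5000)`; a shorter interval narrows the window in which a crash can lose writes.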
The Hardest Problem: Cache Invalidation
Phil Karlton famously said: "There are only two hard things in Computer Science: cache invalidation and naming things."
How do you ensure users don't see old data? Here are the top strategies:
1. Time To Live (TTL)
Every cached item is given an expiration time (e.g., 5 minutes). This is simple, but it means users may see data that is up to 5 minutes stale.
2. Event-Driven Invalidation
When data is updated in the DB, publish an event (e.g., to RabbitMQ/Kafka or Redis Pub/Sub). A worker listens to this event and actively deletes the `user:123:profile` key from Redis.
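As a rough sketch of this flow using Redis Pub/Sub, assuming ioredis-style clients: `publish`, `subscribe`, and the `'message'` event are used as ioredis exposes them, while the channel name and function names are made up for illustration. The worker takes two connections because a Redis connection in subscriber mode cannot issue regular commands such as DEL.

```javascript
// Event-driven invalidation sketch. CHANNEL and the function names are hypothetical.
const CHANNEL = 'user-updated';

// Writer side: call this after the database update has succeeded.
async function publishUserUpdated(publisher, userId) {
  await publisher.publish(CHANNEL, JSON.stringify({ userId }));
}

// Worker side: `subscriber` listens for events, `redis` is a second,
// ordinary connection used to delete the stale key.
async function startInvalidationWorker(subscriber, redis) {
  subscriber.on('message', async (channel, message) => {
    if (channel !== CHANNEL) return;
    try {
      const { userId } = JSON.parse(message);
      await redis.del(`user:${userId}:profile`);
    } catch (err) {
      console.error('Failed to invalidate cache entry:', err);
    }
  });
  await subscriber.subscribe(CHANNEL);
}
```

Redis Pub/Sub does not store messages, so a worker that is offline when an event is published never sees it. Pairing this with a TTL gives a fallback for missed events.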
3. Versioning (Cache Busting)
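Append a version number to the cache key: `user:123:profile:v2`. When the user updates their profile, increment the version to `v3`. The old `v2` key is never read again; it disappears when its TTL expires or, if Redis is configured with a `maxmemory` limit and an eviction policy such as `allkeys-lru`, when memory pressure evicts it.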
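One way to sketch versioned keys is to keep the current version number in Redis itself and build the cache key from it. This assumes an ioredis-style client; the `:version` key and all function names are hypothetical. Note that this costs one extra read per lookup to fetch the version.

```javascript
// Versioned-key sketch. The `:version` counter key is an illustrative convention.
const versionKey = (userId) => `user:${userId}:profile:version`;

// Builds the key for the current version, e.g. `user:123:profile:v2`.
async function currentProfileKey(redis, userId) {
  // A missing counter is treated as version 0, so the first INCR moves to v1.
  const version = (await redis.get(versionKey(userId))) || '0';
  return `user:${userId}:profile:v${version}`;
}

async function cacheProfile(redis, userId, profile) {
  const key = await currentProfileKey(redis, userId);
  // The TTL makes sure abandoned versions expire on their own.
  await redis.set(key, JSON.stringify(profile), 'EX', 3600);
}

async function readProfile(redis, userId) {
  const cached = await redis.get(await currentProfileKey(redis, userId));
  return cached === null ? null : JSON.parse(cached);
}

// "Invalidation" is just a counter bump: old keys are simply never read again.
async function invalidateProfile(redis, userId) {
  await redis.incr(versionKey(userId));
}
```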
The Cache Stampede (Thundering Herd)
If a highly popular cached item (like the front page of a news site) expires, 10,000 concurrent requests might all experience a cache miss at nearly the same moment. They will ALL query the database simultaneously, which can overload it or even bring it down.
Solutions:
- Locking: The first request acquires a Redis lock, fetches from DB, and updates the cache. The other 9,999 requests wait for the lock to release.
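- Probabilistic Early Expiration: Each request has a small random chance of treating the entry as expired slightly *before* its TTL actually runs out, and that chance grows as the expiry approaches. One request ends up refreshing the cache early in the background while everyone else keeps reading the still-valid cached value.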
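Both solutions can be sketched in a few lines. The lock version assumes an ioredis-style client, where `SET key value EX seconds NX` resolves to `'OK'` only for the first caller. The early-expiration check follows the commonly described "XFetch" formula, in which `recomputeMs` is how long a refresh usually takes and `beta` tunes how eager the early refresh is. All function names, the lock key suffix, and the timings are illustrative assumptions.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Locking: only the request that wins the lock queries the database.
// `loadFromDb` is a caller-supplied async function returning a JSON-serializable value.
async function getWithLock(redis, key, ttlSeconds, loadFromDb, maxAttempts = 20) {
  const lockKey = `${key}:lock`;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const cached = await redis.get(key);
    if (cached !== null) return JSON.parse(cached);

    // NX: succeed only if no one holds the lock. EX: a crashed holder cannot block forever.
    const gotLock = await redis.set(lockKey, '1', 'EX', 10, 'NX');
    if (gotLock === 'OK') {
      try {
        const fresh = await loadFromDb();
        await redis.set(key, JSON.stringify(fresh), 'EX', ttlSeconds);
        return fresh;
      } finally {
        // Simplification: a production lock would store a unique token and only
        // delete the lock if the token still matches.
        await redis.del(lockKey);
      }
    }
    await sleep(50); // Someone else is refreshing; wait, then re-check the cache.
  }
  throw new Error(`Timed out waiting for cache key ${key}`);
}

// Probabilistic early expiration: returns true when this request should refresh.
// -Math.log(rand) is a positive random number, usually small and occasionally large,
// so refreshes become more likely as `now` approaches `expiresAtMs`.
function shouldRefreshEarly(expiresAtMs, recomputeMs, beta = 1, now = Date.now(), rand = Math.random()) {
  return now - recomputeMs * beta * Math.log(rand) >= expiresAtMs;
}
```

To use `shouldRefreshEarly`, the expiry timestamp and the measured recompute time need to be stored alongside the cached value, since Redis only reports a key's remaining TTL and not how expensive the value was to build.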