Merged
Changes from 5 commits
1 change: 1 addition & 0 deletions content/develop/use-cases/_index.md
@@ -22,3 +22,4 @@ This section provides practical examples and reference implementations for commo
* [Time series dashboard]({{< relref "/develop/use-cases/time-series-dashboard" >}}) - Build a rolling sensor graph demo with Redis time series data
* [Leaderboards]({{< relref "/develop/use-cases/leaderboard" >}}) - Build a ranked leaderboard with sorted sets and user metadata
* [Job queue]({{< relref "/develop/use-cases/job-queue" >}}) - Run a reliable background job queue with at-least-once delivery and visibility-timeout reclaim
* [Prefetch cache]({{< relref "/develop/use-cases/prefetch-cache" >}}) - Pre-load reference data into Redis so every read is a cache hit, kept current by a CDC sync worker
105 changes: 105 additions & 0 deletions content/develop/use-cases/prefetch-cache/_index.md
@@ -0,0 +1,105 @@
---
categories:
- docs
- develop
- stack
- oss
- rs
- rc
description: Pre-load reference data into Redis so every read is a cache hit.
hideListLinks: true
linkTitle: Prefetch cache
title: Redis prefetch cache
weight: 5
---

## When to use Redis prefetch cache

Use a Redis prefetch cache when you need to pre-load reference or master data into cache before the first request arrives, so every read is a hit and no request ever falls through to the primary database.

## Why the problem is hard

Cache-aside guarantees cold-start misses: the first request for every key hits the primary, and between TTL expiry and the next read, every service re-fetches the same rows from a slow backend. At scale this creates latency spikes and sustained read pressure on the system of record — the load pattern is worst exactly when traffic is highest.

Prefetch solves this by loading data proactively, but that brings its own constraints. The entire working set must fit in memory, and it must stay current as the source of truth changes. Building and maintaining the sync pipeline from the source database adds engineering cost and ongoing operational burden — once the cache is the only read path, any sync lag becomes a correctness problem rather than a freshness one.

This pattern is distinct from cache-aside, where the cache populates reactively on miss and the primary is always available as a fall-back. With prefetch, the application treats the cache as authoritative on the read path; on a miss, it does not fall back to the primary, and a sustained miss rate is treated as an incident. It is also distinct from write-through caching, where every application write updates the cache and the primary in lock-step; prefetch decouples the write path from the cache and lets a separate sync pipeline catch up.
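To make the read-path contract concrete, here is a minimal Python sketch (class, prefix, and counter names are hypothetical; `client` is anything exposing a redis-py-style `hgetall`): a miss returns nothing and bumps a counter for alerting, rather than querying the primary.

```python
class PrefetchReader:
    """Read path for a prefetch cache: Redis is the only source.

    A miss never falls through to the primary database; it is
    counted so a sustained miss rate can page an operator.
    """

    def __init__(self, client, prefix="ref:"):
        self.client = client  # e.g. redis.Redis(decode_responses=True)
        self.prefix = prefix
        self.misses = 0

    def get(self, entity_id):
        record = self.client.hgetall(self.prefix + entity_id)
        if not record:
            self.misses += 1  # incident signal, not a fallback trigger
            return None
        return record
```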

## What you can expect from a Redis solution

You can:

- Achieve near-100% cache hit ratios for country codes, product categories, translations, configuration, and other reference tables.
- Keep P95 read latency under 1 ms for lookup-heavy request paths at peak traffic.
- Sync source database changes into cache within seconds using a managed CDC pipeline (such as Redis Data Integration), or a small consumer in front of Debezium, Kafka, or a Redis stream.
- Offload all reference-data reads from the primary database, avoiding the cost of dedicated read replicas.
- Pre-warm the cache on deploy or restart so cold starts never reach the backend.
- Bound memory with a long safety-net TTL that expires entries if the sync pipeline ever stops, so a silent failure never serves stale data forever.

## How Redis supports the solution

In practice, the application loads the full working set into Redis once at startup using a pipelined bulk write, then a separate sync worker keeps Redis current as the source of truth changes. Every reference-data read goes to Redis only — there is no fall-back path to the primary on the request critical path.
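A minimal Python sketch of that startup load, assuming a redis-py-style client and a hypothetical `ref:` key prefix: each record becomes a hash write plus a safety-net `EXPIRE`, all queued on one pipeline so the whole set loads in a single round trip.

```python
def bulk_load(r, records, prefix="ref:", safety_ttl=86_400):
    """Load the full working set into Redis in one round trip.

    Each record is a flat dict with an "id" field. The TTL is a
    safety net in case the sync pipeline silently stops; it is not
    the freshness mechanism.
    """
    pipe = r.pipeline(transaction=False)  # no atomicity needed, just batching
    for rec in records:
        key = prefix + rec["id"]
        pipe.hset(key, mapping=rec)
        pipe.expire(key, safety_ttl)
    pipe.execute()
    return len(records)
```

With a real redis-py client, `records` would come from the primary store's bulk-read endpoint at boot.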

Redis provides the following features that make it a good fit for prefetch caching:

- [Hashes]({{< relref "/develop/data-types/hashes" >}})
([`HSET`]({{< relref "/commands/hset" >}}),
[`HGETALL`]({{< relref "/commands/hgetall" >}})) and native
[JSON]({{< relref "/develop/data-types/json" >}}) documents
([`JSON.SET`]({{< relref "/commands/json.set" >}}),
[`JSON.GET`]({{< relref "/commands/json.get" >}})) map directly to common
reference-data lookup patterns — id-keyed records with a fixed set of fields,
or richer nested documents accessed by JSONPath.
- [Pipelined]({{< relref "/develop/clients/pools-and-muxing" >}})
[`HSET`]({{< relref "/commands/hset" >}}) or
[`MSET`]({{< relref "/commands/mset" >}}) batches make the initial bulk load
fast: a few thousand records load in a single round trip, so the application
starts serving from a fully-warm cache within seconds of boot.
- [`EXPIRE`]({{< relref "/commands/expire" >}}) sets a long safety-net TTL on
each entry so memory stays bounded even if the sync pipeline silently stops —
not as the freshness mechanism, but as a guardrail.
- [`SCAN`]({{< relref "/commands/scan" >}}) iterates the prefetched keyspace
without blocking the server, so the application can audit cache coverage,
list available IDs, or run a periodic reconciliation pass against the source.
- [Streams]({{< relref "/develop/data-types/streams" >}})
([`XADD`]({{< relref "/commands/xadd" >}}),
[`XREAD`]({{< relref "/commands/xread" >}})) provide a durable, replayable
change feed when the sync worker needs to resume from a known offset after
a restart — the canonical pattern for CDC consumers feeding Redis.
- Sub-millisecond reads from memory, so reference-data lookups never appear on
a flame graph. If Redis is already in the stack for sessions, rate limiting,
or cache-aside, prefetch runs on the same instance at zero marginal cost.

## Ecosystem

The following libraries and frameworks support Redis-backed prefetch caching:

- **Java**:
[Spring Cache abstraction (`@Cacheable` with Redis cache store)](https://docs.spring.io/spring-data/redis/reference/redis/redis-cache.html),
populated by a startup `CommandLineRunner` for the bulk load.
- **Node.js**:
[Redis OM](https://github.com/redis/redis-om-node) for object-mapping
prefetched JSON documents.
- **Change-data-capture (CDC)** pipelines that stream source-database changes
into Redis without custom application code:
[Redis Data Integration (RDI)]({{< relref "/integrate/redis-data-integration" >}})
for relational and NoSQL sources on Redis Enterprise / Redis Cloud;
[Debezium](https://debezium.io/) plus a lightweight Redis consumer for
open-source Redis.
- **API gateways**:
[Kong](https://docs.konghq.com/hub/) plugins to route reference-data reads to
Redis directly, bypassing the backend service entirely.

## Code examples to build your own Redis prefetch cache

The following guides show how to build a simple Redis-backed prefetch cache in front of a primary store of reference data. Each guide includes a runnable interactive demo that pre-loads records on startup, runs a background sync worker that applies primary-store changes to Redis within milliseconds, and lets you watch the cache stay current as records are added, updated, and deleted on the source.

* [redis-py (Python)]({{< relref "/develop/use-cases/prefetch-cache/redis-py" >}})
* [node-redis (Node.js)]({{< relref "/develop/use-cases/prefetch-cache/nodejs" >}})
* [go-redis (Go)]({{< relref "/develop/use-cases/prefetch-cache/go" >}})
* [Jedis (Java)]({{< relref "/develop/use-cases/prefetch-cache/java-jedis" >}})
* [Lettuce (Java)]({{< relref "/develop/use-cases/prefetch-cache/java-lettuce" >}})
* [StackExchange.Redis (C#)]({{< relref "/develop/use-cases/prefetch-cache/dotnet" >}})
* [Predis (PHP)]({{< relref "/develop/use-cases/prefetch-cache/php" >}})
* [redis-rb (Ruby)]({{< relref "/develop/use-cases/prefetch-cache/ruby" >}})
* [redis-rs (Rust)]({{< relref "/develop/use-cases/prefetch-cache/rust" >}})
197 changes: 197 additions & 0 deletions content/develop/use-cases/prefetch-cache/dotnet/MockPrimaryStore.cs
@@ -0,0 +1,197 @@
using System.Collections.Concurrent;

namespace PrefetchCacheDemo;

/// <summary>
/// Mock primary data store for the prefetch-cache demo.
///
/// This stands in for a source-of-truth database (Postgres, MySQL,
/// Mongo, etc.) that holds reference data the application serves to
/// users.
///
/// Every mutation appends a change event to an in-process queue, which
/// the sync worker drains and applies to Redis. In a real system the
/// queue is replaced by a CDC pipeline — Redis Data Integration,
/// Debezium plus a lightweight consumer, or an equivalent tool that
/// tails the source's binlog/WAL and pushes changes into Redis.
///
/// The store also exposes <see cref="ReadLatencyMs"/> so the demo can
/// illustrate how much slower a direct primary read would be than a
/// Redis hit.
/// </summary>
public class MockPrimaryStore
{
    public int ReadLatencyMs { get; }

    private readonly object _lock = new();
    private long _reads;
    private readonly Dictionary<string, Dictionary<string, string>> _records;
    private readonly BlockingCollection<ChangeEvent> _changes = new(new ConcurrentQueue<ChangeEvent>());

    public MockPrimaryStore(int readLatencyMs = 80)
    {
        ReadLatencyMs = readLatencyMs;
        _records = new Dictionary<string, Dictionary<string, string>>(StringComparer.Ordinal)
        {
            ["cat-001"] = new()
            {
                ["id"] = "cat-001",
                ["name"] = "Beverages",
                ["display_order"] = "1",
                ["featured"] = "true",
                ["parent_id"] = "",
            },
            ["cat-002"] = new()
            {
                ["id"] = "cat-002",
                ["name"] = "Bakery",
                ["display_order"] = "2",
                ["featured"] = "true",
                ["parent_id"] = "",
            },
            ["cat-003"] = new()
            {
                ["id"] = "cat-003",
                ["name"] = "Pantry Staples",
                ["display_order"] = "3",
                ["featured"] = "false",
                ["parent_id"] = "",
            },
            ["cat-004"] = new()
            {
                ["id"] = "cat-004",
                ["name"] = "Frozen",
                ["display_order"] = "4",
                ["featured"] = "false",
                ["parent_id"] = "",
            },
            ["cat-005"] = new()
            {
                ["id"] = "cat-005",
                ["name"] = "Specialty Cheeses",
                ["display_order"] = "5",
                ["featured"] = "false",
                ["parent_id"] = "cat-002",
            },
        };
    }

    public List<string> ListIds()
    {
        lock (_lock)
        {
            var ids = _records.Keys.ToList();
            ids.Sort(StringComparer.Ordinal);
            return ids;
        }
    }

    /// <summary>Return every record. Used by the cache's bulk-load path on startup.</summary>
    public List<Dictionary<string, string>> ListRecords()
    {
        Thread.Sleep(ReadLatencyMs);
        lock (_lock)
        {
            Interlocked.Increment(ref _reads);
            return _records.Values
                .Select(r => new Dictionary<string, string>(r, StringComparer.Ordinal))
                .ToList();
        }
    }

    /// <summary>Single-record read. Not on the demo's normal read path.</summary>
    public Dictionary<string, string>? Read(string entityId)
    {
        Thread.Sleep(ReadLatencyMs);
        lock (_lock)
        {
            Interlocked.Increment(ref _reads);
            return _records.TryGetValue(entityId, out var record)
                ? new Dictionary<string, string>(record, StringComparer.Ordinal)
                : null;
        }
    }

    public bool AddRecord(Dictionary<string, string> record)
    {
        if (!record.TryGetValue("id", out var entityId) || string.IsNullOrWhiteSpace(entityId))
        {
            return false;
        }
        entityId = entityId.Trim();
        lock (_lock)
        {
            if (_records.ContainsKey(entityId))
            {
                return false;
            }
            _records[entityId] = new Dictionary<string, string>(record, StringComparer.Ordinal);
            // Emit while the lock is held so the queue order matches the
            // mutation order. Two concurrent callers cannot interleave
            // mutation A -> mutation B -> emit B -> emit A.
            EmitChangeLocked(ChangeOp.Upsert, entityId, new Dictionary<string, string>(record, StringComparer.Ordinal));
        }
        return true;
    }

    public bool UpdateField(string entityId, string field, string value)
    {
        lock (_lock)
        {
            if (!_records.TryGetValue(entityId, out var record))
            {
                return false;
            }
            record[field] = value;
            EmitChangeLocked(
                ChangeOp.Upsert,
                entityId,
                new Dictionary<string, string>(record, StringComparer.Ordinal));
        }
        return true;
    }

    public bool DeleteRecord(string entityId)
    {
        lock (_lock)
        {
            if (!_records.Remove(entityId))
            {
                return false;
            }
            EmitChangeLocked(ChangeOp.Delete, entityId, null);
        }
        return true;
    }

    /// <summary>Block up to <paramref name="timeout"/> for the next change event.</summary>
    public ChangeEvent? NextChange(TimeSpan timeout)
    {
        if (_changes.TryTake(out var change, timeout))
        {
            return change;
        }
        return null;
    }

    public long Reads => Interlocked.Read(ref _reads);

    public void ResetReads() => Interlocked.Exchange(ref _reads, 0);

    /// <summary>
    /// Append a change event to the feed. Caller must hold <c>_lock</c>.
    ///
    /// <see cref="BlockingCollection{T}.Add(T)"/> is itself thread-safe
    /// and never tries to acquire <c>_lock</c>, so calling it while
    /// holding the records lock cannot deadlock. Holding the lock here
    /// is what guarantees that the queue order matches the order in
    /// which the records dict was mutated.
    /// </summary>
    private void EmitChangeLocked(ChangeOp op, string entityId, Dictionary<string, string>? fields)
    {
        // Use millisecond-precision unix timestamp so the sync-lag
        // metric is in the same shape as the Python reference.
        var timestampMs = (double)DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
        _changes.Add(new ChangeEvent(op, entityId, fields, timestampMs));
    }
}