diff --git a/.agents/skills/redis-use-case-ports/assets/audit-checklist.md b/.agents/skills/redis-use-case-ports/assets/audit-checklist.md index 05b7cacbeb..69acc2c078 100644 --- a/.agents/skills/redis-use-case-ports/assets/audit-checklist.md +++ b/.agents/skills/redis-use-case-ports/assets/audit-checklist.md @@ -205,7 +205,54 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a --- -## 14. Subscribe-acknowledgement race in pub/sub-style helpers +## 14. Empty-fields `HSET` guard in change-event consumers + +**What to scan for:** any code path that takes a "fields" payload from a change event / message / callback and forwards it to `HSET` (or the client-equivalent `hSet` / `hSetMultiple` / `HashSet` / `hMSet` / etc.). Typically this is a CDC consumer, sync worker, or write-through path. + +**Pass criterion:** before the `HSET` call, the code explicitly guards against `fields` being null, missing, or empty, and returns early on the malformed case (or routes to a dead-letter, etc.). The guard must run before the pipeline / transaction is opened. + +**Sample audit prompt:** + +> Audit every code path in the 9 client implementations under `content/develop/use-cases/{{USE_CASE_NAME}}/` that forwards a fields payload from a change-event / callback / message to `HSET` (or the client equivalent). For each, confirm there is an explicit early-return guard for null / missing / empty fields **before** any pipeline or transaction is constructed. Flag any port without the guard with file path and line number. + +**Why on list:** Every Redis client tested in the prefetch-cache use case raises or panics on `HSET` with an empty fields mapping: redis-py `DataError`, node-redis throws, Predis "wrong number of arguments", redis-rs **panics** on `pipe().hset_multiple(&key, &[])`, Jedis errors, go-redis errors. A defensive `|| {}` fallback that LOOKS like it handles the empty case is actually misleading — Cursor bugbot caught this on the reference implementation. ([PR #3317 comment](https://github.com/redis/docs/pull/3317)) + +--- + +## 15. TTL sentinel preservation across libraries + +**What to scan for:** any `TTL` / `ttl_remaining` / `ttlRemaining` helper that wraps the client's TTL command. Particularly any code that converts the library's return type (often `time.Duration`, `TimeSpan?`, `Long`) into integer seconds. + +**Pass criterion:** the helper returns **`-2`** for a missing key and **`-1`** for a key with no TTL, as integer seconds (or the language's native integer type). Libraries encode these sentinels inconsistently: + +- **redis-py**: returns `int` directly with `-2` / `-1` preserved. +- **go-redis**: returns `time.Duration` with `-2` / `-1` as **raw nanoseconds** (not seconds-scaled). A naive `int(d.Seconds())` truncates to `0`. +- **StackExchange.Redis**: `KeyTimeToLive` returns `TimeSpan?` and collapses **both** missing-key and no-TTL into `null` — a null-coalesce loses the `-2` sentinel. +- **node-redis / Jedis / Lettuce / Predis / redis-rb**: return integer-typed seconds with `-2` / `-1` preserved. + +The recommended cross-client idiom is to **bypass the library wrapper** and send the raw command (`client.Do(ctx, "TTL", key).Int64()` in Go, `IDatabase.Execute("TTL", key)` in .NET) so the integer reply comes through untouched. + +**Sample audit prompt:** + +> For each port's `TTLRemaining` (or equivalent) under `content/develop/use-cases/{{USE_CASE_NAME}}/`, confirm it returns `-2` for a missing key and `-1` for a key with no TTL. 
Test each by reading a non-existent ID and by running `PERSIST` on an existing cache key then reading it. Flag any port that returns `0`, `null`, or collapses the two sentinels into one value. + +**Why on list:** Caught in the prefetch-cache cross-port audit. go-redis and StackExchange.Redis both shipped with subtle bugs in their TTL conversion that the audit caught. ([PR #3317 audit B](https://github.com/redis/docs/pull/3317)) + +--- + +## 16. Locked-emit ordering for producer/consumer queues + +**What to scan for:** any mock primary store, in-memory writer, or producer that (a) mutates internal state under a lock and (b) appends a corresponding event to an out-of-process or out-of-thread queue/stream/channel. Typical methods: `add_record` / `update_field` / `delete_record`, `enqueue`, `publish_change`. + +**Pass criterion:** the queue append happens **inside the same locked section** as the state mutation, not after it. Without this, two concurrent mutations can complete in one order but enqueue their events in the opposite order, and a downstream consumer applies them out of order — the cache ends up divergent from the source. For cross-process producers (PHP, etc.), the equivalent is wrapping the mutation + `LPUSH` in a Lua script so the server enforces ordering. + +**Sample audit prompt:** + +> Audit every mutation method in each port's mock primary store (or equivalent producer) under `content/develop/use-cases/{{USE_CASE_NAME}}/`. For each, confirm the change event is appended to the queue / stream / channel **while the mutation lock is still held** (or, for cross-process ports, wrapped in a Lua script that combines the record write and the LPUSH server-side). Flag any port where the emit happens after the lock release. + +**Why on list:** Locked-emit ordering is what guarantees a CDC consumer can replay events deterministically. Caught and fixed in the prefetch-cache reference's `_emit_change_locked` pattern after Codex review; the prefetch-cache cross-port audit confirmed all 9 ports preserve the invariant, including PHP's Lua-script equivalent. ([PR #3317 audit C](https://github.com/redis/docs/pull/3317)) + +## 17. Subscribe-acknowledgement race in pub/sub-style helpers **What to scan for:** the constructor or registration path of any subscriber object (pub/sub Subscription, message-listener, channel consumer). Specifically, the code path between "request the SUBSCRIBE / PSUBSCRIBE" and "return the Subscription handle to the caller". @@ -219,7 +266,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a --- -## 15. Concurrent-name reservation race in async helpers +## 18. Concurrent-name reservation race in async helpers **What to scan for:** any helper that does "check map for duplicate → release lock → do async work → acquire lock → insert". This shape is common in Rust (`std::sync::Mutex` is `!Send`, so can't be held across `await`) and any async language where the check and the insert are bracketed by an `await` that releases the lock implicitly. @@ -233,7 +280,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a --- -## 16. Detached-worker PID capture +## 19. Detached-worker PID capture **What to scan for:** in any port that spawns subscriber/worker processes from a request handler (typically PHP under `php -S`, but any helper that uses `proc_open`, `subprocess.Popen`, `child_process.spawn`, `posix_spawn`, etc.), how is the worker's PID recorded? 
Look for `proc_get_status()['pid']` after `proc_open([...])`, or `pid` properties on subprocess handles. @@ -247,7 +294,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a --- -## 17. Silent timeout fallthrough in readiness waits +## 20. Silent timeout fallthrough in readiness waits **What to scan for:** functions named `waitFor*`, `pollUntil*`, `awaitReady`, etc. that loop with a deadline. Especially ones that return `void` / `None` / `()` instead of a status. @@ -261,7 +308,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a --- -## 18. Pub/sub introspection commands are server-wide +## 21. Pub/sub introspection commands are server-wide **What to scan for:** any test or smoke-test step that asserts an **absolute** value of `PUBSUB CHANNELS`, `PUBSUB NUMSUB`, or `PUBSUB NUMPAT`. Especially common in pub/sub-style use cases. diff --git a/.agents/skills/redis-use-case-ports/assets/cross-diff-checklist.md b/.agents/skills/redis-use-case-ports/assets/cross-diff-checklist.md index 7e0fd10038..3f5e48cf54 100644 --- a/.agents/skills/redis-use-case-ports/assets/cross-diff-checklist.md +++ b/.agents/skills/redis-use-case-ports/assets/cross-diff-checklist.md @@ -25,7 +25,7 @@ A sub-agent can run this in read-only mode. For each row, produce a 9-column com | Write path | `HSET` (with all fields) + `EXPIRE`, ideally pipelined or in a single `HSET ... EXPIRE` MULTI. | | Invalidate | `DEL` (not `EXPIRE 0`, not `UNLINK`). | | Field update | `HSET key field value` + `EXPIRE` inside a conditional transaction or `Condition.KeyExists`. | -| TTL inspection | `TTL` (not `PTTL`, not `OBJECT`). | +| TTL inspection | `TTL` (not `PTTL`, not `OBJECT`). The wrapper must preserve the `-2` (missing key) and `-1` (no TTL) sentinels as integer seconds; if the client's typed wrapper collapses or rescales them (go-redis's `time.Duration` with nanosecond-encoded sentinels, StackExchange.Redis's `KeyTimeToLive` returning `null` for both cases), bypass it with the raw command (`Do("TTL", ...)` / `Execute("TTL", ...)`). See audit-checklist row 15. | | Single-flight acquire | Lua script using `SET NX PX`. | | Single-flight release | Lua script using `GET == token` check + `DEL`. | | Counters (where stats are in Redis, e.g. PHP) | `HINCRBY`. | @@ -187,10 +187,11 @@ The only per-client variation should be the **pill text** at the top of `` | `Get the source files` subsection | Every `_index.md` has a `### Get the source files` subsection as the first child of `## Running the demo`. It contains a `mkdir -demo && cd -demo`, a `BASE=https://raw.githubusercontent.com/redis/docs/main/...` variable, and one `curl -O $BASE/` per source file the port needs. | | Files curled match files run | The set of files in the curl block matches what the existing run command (e.g. `python3 demo_server.py`, `dotnet run`, `php -S ... demo_server.php`) actually requires. No missing config files (`package.json`, `composer.json`, `*.csproj`, `go.mod`, `Cargo.toml`), no extras (`Cargo.lock` only if `cargo` expects it; build outputs never). | | Rust folder layout | The curl block matches the port's on-disk layout: if files live under `src/`, the block does `mkdir -p .../src && cd ...` then `curl -o src/ $BASE/src/`; if files are flat at the project root (driven by explicit `path =` in `Cargo.toml`), `curl -O $BASE/` for all of them. 
| +| Source-file count in prose matches curl block | Prose like *"The demo consists of N files"* in `### Get the source files` must match the actual number of `curl -O` lines in the block. Easy drift when a port adds an extra worker entry point (e.g. PHP's separate `sync_worker.php`) and the count is not updated. | **Audit prompt:** -> For each of the 9 client implementations of `content/develop/use-cases/{{USE_CASE_NAME}}/`, grep `_index.md` with `grep -nE "\]\(([^h)][^)]*\.[a-z]+)\)"` — the result must be empty (no relative file links). Then confirm `## Running the demo` is followed by `### Get the source files`, and that the curl block downloads the same files the run command needs. Flag any port where the curl-block file set diverges from the run-time requirements, or where a Rust port's `src/` layout doesn't match its on-disk reality. +> For each of the 9 client implementations of `content/develop/use-cases/{{USE_CASE_NAME}}/`, grep `_index.md` with `grep -nE "\]\(([^h)][^)]*\.[a-z]+)\)"` — the result must be empty (no relative file links). Then confirm `## Running the demo` is followed by `### Get the source files`, and that the curl block downloads the same files the run command needs. Count the `curl -O` lines and confirm the prose intro ("The demo consists of N files") matches. Flag any port where the curl-block file set diverges from the run-time requirements, or where a Rust port's `src/` layout doesn't match its on-disk reality. ## File names per client diff --git a/.agents/skills/redis-use-case-ports/assets/redis-conventions.md b/.agents/skills/redis-use-case-ports/assets/redis-conventions.md index a2dffe2b06..4c602df21c 100644 --- a/.agents/skills/redis-use-case-ports/assets/redis-conventions.md +++ b/.agents/skills/redis-use-case-ports/assets/redis-conventions.md @@ -271,6 +271,9 @@ PHP runs each HTTP request in a fresh process under `php -S`. This means: - **In-process state doesn't persist.** Cache stats, primary record state, primary read counters, and per-job-queue counters must live in Redis (under a `demo:*` keyspace, or a `:{name}:stats` hash). - **Spawning sub-processes from a request handler must detach from the dev server's listen socket.** This bites both `pcntl_fork` (forked children inherit the accept socket) and `proc_open` (children inherit FDs unless explicitly redirected). The fix is **`setsid` on Linux**, and a shell-based new-session wrapper on macOS (which lacks `setsid(1)`). The detach also needs to redirect stdin/stdout/stderr to files; closing them alone isn't enough. - **Predis 3.x's `hset()` is variadic, not associative.** The 1.x `$redis->hset($key, ['field' => 'value'])` form raises `wrong number of arguments for 'hset'` against a 3.x client/server. Use `$redis->hset($key, 'field', 'value', 'field2', 'value2', ...)` and write a small `flattenFields()` helper if you're storing a map. +- **Predis `BRPOP` only accepts whole-second timeouts.** Sub-second polling intervals (e.g. a 50 ms `next_change` loop in the reference Python) need a workaround: use a 1 s `BRPOP` for change draining plus a separate fast pause-flag poll (e.g. 20 ms `usleep`) so pause/resume latency stays low even when the main `BRPOP` is parked. +- **Cross-process pause/resume goes through Redis flags.** Where threaded ports use a `threading.Event` (or equivalent) inside one process, PHP needs the demo server and the long-running sync worker to coordinate across processes. The pattern is two keys: `demo:sync:paused` (writer to worker) and `demo:sync:idle` (worker acks parked state). 
The demo's `/clear` and `/reprefetch` handlers set `paused=1`, spin-wait for `idle=1` with a 10 ms poll and a 2 s timeout, do the cache write, then set `paused=0`. The worker checks `paused` on each loop iteration; if set, writes `idle=1` and spin-waits for it to clear. Established in the prefetch-cache PHP port. +- **Mutation + change-event emit needs Lua-script atomicity** when the producer is also stateless (PHP). The reference Python uses an in-process `Lock` to make "mutate-then-emit" atomic; the PHP equivalent is wrapping the record write and the `LPUSH` (or `XADD`) onto the change feed in a single `EVAL`. Without this, two concurrent mutations on the same key can land in queue order opposite to their server-side commit order. (Audit-checklist row 16.) - The brief should call out that the cross-process supervision approach is **PHP-specific** in the production-usage section. ## .NET-specific notes @@ -281,20 +284,25 @@ PHP runs each HTTP request in a fresh process under `php -S`. This means: - **StackExchange.Redis intentionally does not expose blocking pops** (`BRPOPLPUSH` / `BLMOVE` with a timeout) because they would monopolise the multiplexer's single command pipeline. Use cases that need a blocking claim (job queue, etc.) should poll the non-blocking `IDatabase.ListRightPopLeftPush` on a short interval (50 ms is a reasonable default). Document this in the helper's "Claiming jobs" / "How it works" section. - **`RedisChannel` no longer has an implicit `string` conversion in 2.7+.** `db.Publish(...)` needs `RedisChannel.Literal("channel:name")` or `RedisChannel.Pattern(...)` explicitly. - StackExchange.Redis transparently caches Lua scripts: the first `ScriptEvaluate(script, keys, args)` sends `EVAL`, subsequent calls switch to `EVALSHA` automatically. No need to manage SHAs by hand. +- **`IDatabase.KeyTimeToLive` collapses the `-2` (missing) and `-1` (no TTL) sentinels into a single `TimeSpan?` null.** For any `TTL` lookup that needs to distinguish them, send the raw command instead: `(long) db.Execute("TTL", key)` returns the integer the server actually replied with. (Audit-checklist row 15.) +- **`IServer.Keys` (the typed SCAN enumerator) requires `AllowAdmin = true` on the `ConfigurationOptions`** — which also grants `FLUSHDB` / `CONFIG`, a real security concern in production. Where SCAN-style enumeration is needed (e.g. a `clear()` helper) prefer `db.Execute("SCAN", cursor, "MATCH", pattern, "COUNT", count)` so the demo doesn't pull in admin-privileged client config. ## Java-specific notes - **Jedis**: use `JedisPool` and acquire a `Jedis` instance per call with try-with-resources. Each transaction gets its own connection; no in-process lock is needed. - **Jedis 5.x's `brpoplpush` takes integer seconds.** Sub-second blocking-claim timeouts (e.g. 500 ms polling windows) round up to 1 s on the wire. The polling loop still observes its stop flag promptly enough; just be aware the per-iteration block is longer than the reference suggests. - **Lettuce**: by default the demo shares one `StatefulRedisConnection` across HTTP handlers. Lettuce is thread-safe for individual commands but pipelined sequences and transactions are connection-scoped — concurrent pipelines or `MULTI`/`EXEC` blocks on one connection can interleave. Options when an enqueue / update needs two-or-more commands atomic-ish: (a) wrap in a `ReentrantLock`; (b) use `MULTI`/`EXEC` with the same lock; (c) merge into a Lua script (preferred — atomic server-side and lock-free, but requires writing the script). 
The production-usage section should explain you'd switch to `ConnectionPoolSupport.createGenericObjectPool(...)` in production and drop the lock. +- **Lettuce sync API does not cooperate with `setAutoFlushCommands(false)`.** Each sync call internally awaits its `CompletableFuture`; with auto-flush off, those futures never complete because nothing flushes. Symptom: bulk-load deadlocks silently — no exception, just a hung process. Use the **async API** (`RedisAsyncCommands`) for any pipelined batch where you intend to flush at the end: queue commands without awaiting each one, then `connection.flushCommands()` and await the futures in bulk. Documented after the prefetch-cache Lettuce port hit it during testing. - Lettuce's `BLMOVE` accepts a `double` timeout in seconds with sub-second precision (`bRPopLPush(timeout: double)`). Don't use the older `long`-overload — pre-6.x builds treated values < 1 as "block forever". - Both Java demos depend on a small classpath. The `_index.md` should give an example `javac` + `java` command listing the jars by name. +- **JDK version: pick text blocks (15+) or string concatenation (11+) and apply it across both Java ports of the same use case.** Text blocks (`"""..."""`) keep the inlined HTML readable; concatenation works on older JDKs. The cache-aside Java ports use concatenation with JDK 11+ prereq; the prefetch-cache Java ports use text blocks with JDK 17+ prereq. Either is fine — just don't mix within a use case, and set Prerequisites accordingly. ## Go-specific notes - Use `package ` (e.g., `package cacheaside`) for all files, including the demo server. Expose the entry point as a `RunDemoServer()` function rather than `main()` directly. - Ask the user to create a one-line `main.go` next to the files: `package main; import ""; func main() { .RunDemoServer() }`. This avoids the Go limitation that `package main` can't coexist with another package in the same directory. - `go.mod` should declare `module ` and `require github.com/redis/go-redis/v9` at a recent stable version. +- **go-redis encodes the `TTL` sentinels `-2` / `-1` as raw nanoseconds**, not seconds-scaled. `client.TTL(...).Result()` returns `time.Duration(-2)` (one nanosecond) for a missing key, and a naive `int(d.Seconds())` truncates it to `0`. For any `TTL` lookup, bypass the typed wrapper: `client.Do(ctx, "TTL", key).Int64()` returns the integer reply directly. Same idiom maps to the .NET `Execute("TTL", ...)` workaround. (Audit-checklist row 15.) ## Rust-specific notes @@ -387,6 +395,27 @@ Every client has a `MockPrimaryStore` that: - Is thread-safe (mutex around the records map, atomic on the counter). - Lives entirely in-process — except in PHP, where it persists in Redis under `demo:primary:*` keys for cross-request survival. +### Locked-emit ordering for producer/consumer use cases + +When the mock primary store doubles as the *producer* of a change feed that some downstream consumer (CDC worker, sync worker, replicator) drains — as in the prefetch-cache use case — every mutation method must emit its change event **while the mutation lock is still held**. The append-to-queue cannot happen after the lock is released, even though the queue itself is thread-safe. + +Without this, two concurrent `update_field` calls can mutate the records map in one order (T1 then T2 → primary state ends at T2's value) and then enqueue their events in the opposite order (T2 then T1 → consumer applies T1 last → cache ends at T1's value, divergent from primary). 
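+
+A minimal C# sketch of the invariant, matching the shape of the .NET port's `MockPrimaryStore.UpdateField` (the per-language equivalents are tabulated below):
+
+```csharp
+public bool UpdateField(string id, string field, string value)
+{
+    lock (_lock)                  // mutation lock
+    {
+        if (!_records.TryGetValue(id, out var record)) return false;
+        record[field] = value;    // 1. mutate primary state
+        // 2. emit while the lock is still held, so queue order
+        //    always matches mutation order under concurrency.
+        EmitChangeLocked(ChangeOp.Upsert, id,
+            new Dictionary<string, string>(record));
+    }
+    return true;
+}
+```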
+ +The reference Python pattern is an `_emit_change_locked(...)` helper called inside each `with self._lock:` block. The equivalent in other languages: + +| Language | Pattern | +|---|---| +| Python | `_emit_change_locked` inside `with self._lock:` | +| Node.js | mutation + emit are synchronous within the same function; no `await` between them (single-threaded event loop guarantees serial execution) | +| Go | `defer mu.Unlock()` + `emitChangeLocked` before the deferred unlock | +| Java | `synchronized (lock) { ...mutate...; emitChangeLocked(...); }` | +| C# | `lock (_lock) { ...mutate...; EmitChangeLocked(...); }` | +| PHP | Lua scripts that combine the record write and the `LPUSH` server-side (no in-process lock to hold across requests) | +| Ruby | `@lock.synchronize { ...mutate...; emit_change_locked(...); }` | +| Rust | `emit_locked(...)` while the `MutexGuard` is still in scope (call before drop) | + +See audit-checklist row 16 for the audit prompt. + ## Library versions to standardise (when this skill is updated) Pin the recommended versions in the `_index.md` Prerequisites section. As of the cache-aside use case: diff --git a/content/develop/use-cases/_index.md b/content/develop/use-cases/_index.md index 1f377ce457..79e83e1a1a 100644 --- a/content/develop/use-cases/_index.md +++ b/content/develop/use-cases/_index.md @@ -22,4 +22,5 @@ This section provides practical examples and reference implementations for commo * [Time series dashboard]({{< relref "/develop/use-cases/time-series-dashboard" >}}) - Build a rolling sensor graph demo with Redis time series data * [Leaderboards]({{< relref "/develop/use-cases/leaderboard" >}}) - Build a ranked leaderboard with sorted sets and user metadata * [Job queue]({{< relref "/develop/use-cases/job-queue" >}}) - Run a reliable background job queue with at-least-once delivery and visibility-timeout reclaim +* [Prefetch cache]({{< relref "/develop/use-cases/prefetch-cache" >}}) - Pre-load reference data into Redis so every read is a cache hit, kept current by a CDC sync worker * [Pub/sub messaging]({{< relref "/develop/use-cases/pub-sub" >}}) - Broadcast real-time events to many consumers with channel and pattern subscriptions diff --git a/content/develop/use-cases/prefetch-cache/_index.md b/content/develop/use-cases/prefetch-cache/_index.md new file mode 100644 index 0000000000..0f6012ec1a --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/_index.md @@ -0,0 +1,105 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Pre-load reference data into Redis so every read is a cache hit. +hideListLinks: true +linkTitle: Prefetch cache +title: Redis prefetch cache +weight: 5 +--- + +## When to use Redis prefetch cache + +Use a Redis prefetch cache when you need to pre-load reference or master data into cache before the first request arrives, so every read is a hit and no request ever falls through to the primary database. + +## Why the problem is hard + +Cache-aside guarantees cold-start misses: the first request for every key hits the primary, and between TTL expiry and the next read, every service re-fetches the same rows from a slow backend. At scale this creates latency spikes and sustained read pressure on the system of record — the load pattern is worst exactly when traffic is highest. + +Prefetch solves this by loading data proactively, but that brings its own constraints. The entire working set must fit in memory, and it must stay current as the source of truth changes. 
Building and maintaining the sync pipeline from the source database adds engineering cost and ongoing operational burden — once the cache is the only read path, any sync lag becomes a correctness problem rather than a freshness one. + +This pattern is distinct from cache-aside, where the cache populates reactively on miss and the primary is always available as a fall-back. With prefetch, the application assumes the cache is authoritative on the read path; on a miss, it does not fall back to the primary (and a sustained miss rate is treated as an incident). It is also distinct from write-through caching, where every write to the application writes both the cache and the primary in lock-step — prefetch decouples the write path from the cache and lets a separate sync pipeline catch up. + +## What you can expect from a Redis solution + +You can: + +- Achieve near-100% cache hit ratios for country codes, product categories, translations, configuration, and other reference tables. +- Keep P95 read latency under 1 ms for lookup-heavy request paths at peak traffic (that is to say 95% of requests have a latency of 1 ms or less). +- Sync source database changes into cache within seconds using a managed CDC pipeline (such as Redis Data Integration), or a small consumer in front of Debezium, Kafka, or a Redis stream. +- Offload all reference-data reads from the primary database, avoiding the cost of dedicated read replicas. +- Pre-warm the cache on deploy or restart so cold starts never reach the backend. +- Bound memory with a long safety-net TTL that expires entries if the sync pipeline ever stops, so a silent failure never serves stale data forever. + +## How Redis supports the solution + +In practice, the application loads the full working set into Redis once at startup using a pipelined bulk write, then a separate sync worker keeps Redis current as the source of truth changes. Every reference-data read goes to Redis only — there is no fall-back path to the primary on the request critical path. + +Redis provides the following features that make it a good fit for prefetch caching: + +- [Hashes]({{< relref "/develop/data-types/hashes" >}}) + ([`HSET`]({{< relref "/commands/hset" >}}), + [`HGETALL`]({{< relref "/commands/hgetall" >}})) and native + [JSON]({{< relref "/develop/data-types/json" >}}) documents + ([`JSON.SET`]({{< relref "/commands/json.set" >}}), + [`JSON.GET`]({{< relref "/commands/json.get" >}})) map directly to common + reference-data lookup patterns — id-keyed records with a fixed set of fields, + or richer nested documents accessed by JSONPath. +- [Pipelined]({{< relref "/develop/clients/pools-and-muxing" >}}) + [`HSET`]({{< relref "/commands/hset" >}}) or + [`MSET`]({{< relref "/commands/mset" >}}) batches make the initial bulk load + fast: a few thousand records load in a single round trip, so the application + starts serving from a fully-warm cache within seconds of boot. +- [`EXPIRE`]({{< relref "/commands/expire" >}}) sets a long safety-net TTL on + each entry so memory stays bounded even if the sync pipeline silently stops — + not as the freshness mechanism, but as a guardrail. +- [`SCAN`]({{< relref "/commands/scan" >}}) iterates the prefetched keyspace + without blocking the server, so the application can audit cache coverage, + list available IDs, or run a periodic reconciliation pass against the source. 
+- [Streams]({{< relref "/develop/data-types/streams" >}}) + ([`XADD`]({{< relref "/commands/xadd" >}}), + [`XREAD`]({{< relref "/commands/xread" >}})) provide a durable, replayable + change feed when the sync worker needs to resume from a known offset after + a restart — the canonical pattern for CDC consumers feeding Redis. +- Sub-millisecond reads from memory, so reference-data lookups never appear on + a flame graph. If Redis is already in the stack for sessions, rate limiting, + or cache-aside, prefetch runs on the same instance at zero marginal cost. + +## Ecosystem + +The following libraries and frameworks support Redis-backed prefetch caching: + +- **Java**: + [Spring Cache abstraction (`@Cacheable` with Redis cache store)](https://docs.spring.io/spring-data/redis/reference/redis/redis-cache.html), + populated by a startup `CommandLineRunner` for the bulk load. +- **Node.js**: + [Redis OM](https://github.com/redis/redis-om-node) for object-mapping + prefetched JSON documents. +- **Change-data-capture (CDC)** pipelines that stream source-database changes + into Redis without custom application code: + [Redis Data Integration (RDI)]({{< relref "/integrate/redis-data-integration" >}}) + for relational and NoSQL sources on Redis Enterprise / Redis Cloud; + [Debezium](https://debezium.io/) plus a lightweight Redis consumer for + open-source Redis. +- **API gateways**: + [Kong](https://docs.konghq.com/hub/) plugins to route reference-data reads to + Redis directly, bypassing the backend service entirely. + +## Code examples to build your own Redis prefetch cache + +The following guides show how to build a simple Redis-backed prefetch cache in front of a primary store of reference data. Each guide includes a runnable interactive demo that pre-loads records on startup, runs a background sync worker that applies primary-store changes to Redis within milliseconds, and lets you watch the cache stay current as records are added, updated, and deleted on the source. + +* [redis-py (Python)]({{< relref "/develop/use-cases/prefetch-cache/redis-py" >}}) +* [node-redis (Node.js)]({{< relref "/develop/use-cases/prefetch-cache/nodejs" >}}) +* [go-redis (Go)]({{< relref "/develop/use-cases/prefetch-cache/go" >}}) +* [Jedis (Java)]({{< relref "/develop/use-cases/prefetch-cache/java-jedis" >}}) +* [Lettuce (Java)]({{< relref "/develop/use-cases/prefetch-cache/java-lettuce" >}}) +* [StackExchange.Redis (C#)]({{< relref "/develop/use-cases/prefetch-cache/dotnet" >}}) +* [Predis (PHP)]({{< relref "/develop/use-cases/prefetch-cache/php" >}}) +* [redis-rb (Ruby)]({{< relref "/develop/use-cases/prefetch-cache/ruby" >}}) +* [redis-rs (Rust)]({{< relref "/develop/use-cases/prefetch-cache/rust" >}}) diff --git a/content/develop/use-cases/prefetch-cache/dotnet/MockPrimaryStore.cs b/content/develop/use-cases/prefetch-cache/dotnet/MockPrimaryStore.cs new file mode 100644 index 0000000000..910996579e --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/dotnet/MockPrimaryStore.cs @@ -0,0 +1,197 @@ +using System.Collections.Concurrent; + +namespace PrefetchCacheDemo; + +/// +/// Mock primary data store for the prefetch-cache demo. +/// +/// This stands in for a source-of-truth database (Postgres, MySQL, +/// Mongo, etc.) that holds reference data the application serves to +/// users. +/// +/// Every mutation appends a change event to an in-process queue, which +/// the sync worker drains and applies to Redis. 
In a real system the +/// queue is replaced by a CDC pipeline — Redis Data Integration, +/// Debezium plus a lightweight consumer, or an equivalent tool that +/// tails the source's binlog/WAL and pushes changes into Redis. +/// +/// The store also exposes so the demo can +/// illustrate how much slower a direct primary read would be than a +/// Redis hit. +/// +public class MockPrimaryStore +{ + public int ReadLatencyMs { get; } + + private readonly object _lock = new(); + private long _reads; + private readonly Dictionary> _records; + private readonly BlockingCollection _changes = new(new ConcurrentQueue()); + + public MockPrimaryStore(int readLatencyMs = 80) + { + ReadLatencyMs = readLatencyMs; + _records = new Dictionary>(StringComparer.Ordinal) + { + ["cat-001"] = new() + { + ["id"] = "cat-001", + ["name"] = "Beverages", + ["display_order"] = "1", + ["featured"] = "true", + ["parent_id"] = "", + }, + ["cat-002"] = new() + { + ["id"] = "cat-002", + ["name"] = "Bakery", + ["display_order"] = "2", + ["featured"] = "true", + ["parent_id"] = "", + }, + ["cat-003"] = new() + { + ["id"] = "cat-003", + ["name"] = "Pantry Staples", + ["display_order"] = "3", + ["featured"] = "false", + ["parent_id"] = "", + }, + ["cat-004"] = new() + { + ["id"] = "cat-004", + ["name"] = "Frozen", + ["display_order"] = "4", + ["featured"] = "false", + ["parent_id"] = "", + }, + ["cat-005"] = new() + { + ["id"] = "cat-005", + ["name"] = "Specialty Cheeses", + ["display_order"] = "5", + ["featured"] = "false", + ["parent_id"] = "cat-002", + }, + }; + } + + public List ListIds() + { + lock (_lock) + { + var ids = _records.Keys.ToList(); + ids.Sort(StringComparer.Ordinal); + return ids; + } + } + + /// Return every record. Used by the cache's bulk-load path on startup. + public List> ListRecords() + { + Thread.Sleep(ReadLatencyMs); + lock (_lock) + { + Interlocked.Increment(ref _reads); + return _records.Values + .Select(r => new Dictionary(r, StringComparer.Ordinal)) + .ToList(); + } + } + + /// Single-record read. Not on the demo's normal read path. + public Dictionary? Read(string entityId) + { + Thread.Sleep(ReadLatencyMs); + lock (_lock) + { + Interlocked.Increment(ref _reads); + return _records.TryGetValue(entityId, out var record) + ? new Dictionary(record, StringComparer.Ordinal) + : null; + } + } + + public bool AddRecord(Dictionary record) + { + if (!record.TryGetValue("id", out var entityId) || string.IsNullOrEmpty(entityId?.Trim())) + { + return false; + } + entityId = entityId.Trim(); + lock (_lock) + { + if (_records.ContainsKey(entityId)) + { + return false; + } + _records[entityId] = new Dictionary(record, StringComparer.Ordinal); + // Emit while the lock is held so the queue order matches the + // mutation order. Two concurrent callers cannot interleave + // mutation A -> mutation B -> emit B -> emit A. + EmitChangeLocked(ChangeOp.Upsert, entityId, new Dictionary(record, StringComparer.Ordinal)); + } + return true; + } + + public bool UpdateField(string entityId, string field, string value) + { + lock (_lock) + { + if (!_records.TryGetValue(entityId, out var record)) + { + return false; + } + record[field] = value; + EmitChangeLocked( + ChangeOp.Upsert, + entityId, + new Dictionary(record, StringComparer.Ordinal)); + } + return true; + } + + public bool DeleteRecord(string entityId) + { + lock (_lock) + { + if (!_records.Remove(entityId)) + { + return false; + } + EmitChangeLocked(ChangeOp.Delete, entityId, null); + } + return true; + } + + /// Block up to for the next change event. 
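+    /// Returns null if no event arrives before the timeout elapses.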
+ public ChangeEvent? NextChange(TimeSpan timeout) + { + if (_changes.TryTake(out var change, timeout)) + { + return change; + } + return null; + } + + public long Reads => Interlocked.Read(ref _reads); + + public void ResetReads() => Interlocked.Exchange(ref _reads, 0); + + /// + /// Append a change event to the feed. Caller must hold _lock. + /// + /// is itself thread-safe + /// and never tries to acquire _lock, so calling it while + /// holding the records lock cannot deadlock. Holding the lock here + /// is what guarantees that the queue order matches the order in + /// which the records dict was mutated. + /// + private void EmitChangeLocked(ChangeOp op, string entityId, Dictionary? fields) + { + // Use millisecond-precision unix timestamp so the sync-lag + // metric is in the same shape as the Python reference. + var timestampMs = (double) DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(); + _changes.Add(new ChangeEvent(op, entityId, fields, timestampMs)); + } +} diff --git a/content/develop/use-cases/prefetch-cache/dotnet/PrefetchCache.cs b/content/develop/use-cases/prefetch-cache/dotnet/PrefetchCache.cs new file mode 100644 index 0000000000..571238686e --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/dotnet/PrefetchCache.cs @@ -0,0 +1,307 @@ +using StackExchange.Redis; + +namespace PrefetchCacheDemo; + +/// +/// Redis prefetch-cache helper. +/// +/// Each cached entity is stored as a Redis hash under +/// {prefix}{id} with a long safety-net TTL that bounds memory if +/// the sync pipeline ever stops, but is not the freshness mechanism. +/// Freshness comes from the path, which the +/// sync worker calls every time a primary mutation arrives. +/// +/// Reads run HGETALL against Redis only. A miss is not a +/// fall-back trigger — the application treats it as an error or a +/// deliberate for testing. In production a +/// sustained miss rate means the prefetch or the sync pipeline is +/// broken, not that the primary should be re-queried on the request +/// path. +/// +public class PrefetchCache +{ + private readonly IDatabase _db; + private readonly string _prefix; + private readonly int _ttlSeconds; + + private readonly object _statsLock = new(); + private long _hits; + private long _misses; + private long _prefetched; + private long _syncEventsApplied; + private double _syncLagMsTotal; + private long _syncLagSamples; + + public PrefetchCache(IDatabase db, string prefix = "cache:category:", int ttlSeconds = 3600) + { + _db = db ?? throw new ArgumentNullException(nameof(db)); + if (ttlSeconds < 1) throw new ArgumentException("ttlSeconds must be at least 1 second", nameof(ttlSeconds)); + _prefix = string.IsNullOrEmpty(prefix) ? "cache:category:" : prefix; + _ttlSeconds = ttlSeconds; + } + + public int TtlSeconds => _ttlSeconds; + public string Prefix => _prefix; + + public sealed record ReadResult(Dictionary? Record, bool Hit, double RedisLatencyMs); + + /// + /// Pipeline DEL + HSET + EXPIRE for every record. Returns the count loaded. + /// + /// The batch is non-transactional: it is fast on startup (when + /// nothing is reading the cache) and on the live /reprefetch + /// path (when the demo pauses the sync worker around the call). + /// Calling BulkLoad on a cache that is actively being read + /// and written to can briefly expose a key that has been deleted + /// but not yet rewritten; pause the writers first or use a + /// transaction if that matters. 
+ /// + public int BulkLoad(IEnumerable> records) + { + var batch = _db.CreateBatch(); + var tasks = new List(); + var loaded = 0; + foreach (var record in records) + { + if (!record.TryGetValue("id", out var entityId) || string.IsNullOrEmpty(entityId)) + { + continue; + } + var cacheKey = CacheKey(entityId); + tasks.Add(batch.KeyDeleteAsync(cacheKey)); + tasks.Add(batch.HashSetAsync( + cacheKey, + record.Select(p => new HashEntry(p.Key, p.Value)).ToArray())); + tasks.Add(batch.KeyExpireAsync(cacheKey, TimeSpan.FromSeconds(_ttlSeconds))); + loaded++; + } + if (loaded > 0) + { + batch.Execute(); + Task.WaitAll(tasks.ToArray()); + } + lock (_statsLock) + { + _prefetched += loaded; + } + return loaded; + } + + /// + /// Return (record, hit, redisLatencyMs) for an HGETALL against Redis. + /// + /// Prefetch-cache reads do not fall back to the primary. A miss is + /// a signal that the cache is incomplete, not a trigger to re-query + /// the source. The caller decides how to surface it. + /// + public ReadResult Get(string entityId) + { + var cacheKey = CacheKey(entityId); + var sw = System.Diagnostics.Stopwatch.StartNew(); + var entries = _db.HashGetAll(cacheKey); + sw.Stop(); + var redisLatencyMs = sw.Elapsed.TotalMilliseconds; + + if (entries.Length > 0) + { + lock (_statsLock) { _hits++; } + return new ReadResult(ToDict(entries), Hit: true, redisLatencyMs); + } + + lock (_statsLock) { _misses++; } + return new ReadResult(null, Hit: false, redisLatencyMs); + } + + /// + /// Apply a primary change event to Redis. + /// + /// The sync worker calls this for every event the primary emits. + /// For an upsert, the helper rewrites the hash and refreshes the + /// safety-net TTL inside a transaction. For a delete, it removes + /// the cache key. + /// + public void ApplyChange(ChangeEvent change) + { + if (string.IsNullOrEmpty(change.Id)) return; + var cacheKey = CacheKey(change.Id); + + if (change.Op == ChangeOp.Upsert) + { + if (change.Fields is null || change.Fields.Count == 0) + { + // Malformed upsert with no fields. Skip rather than + // crash the sync worker: HSET with an empty array + // throws, and there's nothing to write anyway. A real + // CDC consumer would route this to a dead-letter queue + // and alert; the demo just drops it. + return; + } + // StackExchange.Redis transactions are optimistic (WATCH- + // based) rather than full MULTI/EXEC, but the three commands + // here have no conditions and can be queued and dispatched + // atomically in one round trip via CreateTransaction. + var tx = _db.CreateTransaction(); + _ = tx.KeyDeleteAsync(cacheKey); + _ = tx.HashSetAsync( + cacheKey, + change.Fields.Select(p => new HashEntry(p.Key, p.Value)).ToArray()); + _ = tx.KeyExpireAsync(cacheKey, TimeSpan.FromSeconds(_ttlSeconds)); + tx.Execute(); + } + else if (change.Op == ChangeOp.Delete) + { + _db.KeyDelete(cacheKey); + } + else + { + return; + } + + lock (_statsLock) + { + _syncEventsApplied++; + if (change.TimestampMs > 0.0) + { + var nowMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(); + var lagMs = Math.Max(0.0, nowMs - change.TimestampMs); + _syncLagMsTotal += lagMs; + _syncLagSamples++; + } + } + } + + /// Delete one cache key. Demo-only: simulates a broken sync pipeline. + public bool Invalidate(string entityId) + { + return _db.KeyDelete(CacheKey(entityId)); + } + + /// Delete every key under this cache's prefix and return the count. 
+ public int Clear() + { + var deleted = 0; + var batch = new List(500); + foreach (var key in ScanKeys()) + { + batch.Add(key); + if (batch.Count >= 500) + { + deleted += (int) _db.KeyDelete(batch.ToArray()); + batch.Clear(); + } + } + if (batch.Count > 0) + { + deleted += (int) _db.KeyDelete(batch.ToArray()); + } + return deleted; + } + + /// Return every entity id currently in the cache. + public List Ids() + { + var ids = new List(); + foreach (var key in ScanKeys()) + { + var s = (string) key!; + ids.Add(s.StartsWith(_prefix, StringComparison.Ordinal) ? s.Substring(_prefix.Length) : s); + } + ids.Sort(StringComparer.Ordinal); + return ids; + } + + /// + /// Iterate every key under the cache's prefix using a raw SCAN command. + /// + /// Sending SCAN through IDatabase.Execute avoids + /// IServer.Keys, which would require AllowAdmin=true on + /// the connection options — a flag that also grants + /// FLUSHDB/CONFIG and is best avoided in production. + /// + private IEnumerable ScanKeys() + { + var cursor = "0"; + var match = $"{_prefix}*"; + do + { + var reply = (RedisResult[]) _db.Execute( + "SCAN", cursor, "MATCH", match, "COUNT", 500)!; + cursor = (string) reply[0]!; + var keys = (RedisResult[]) reply[1]!; + foreach (var key in keys) + { + yield return (RedisKey) (string) key!; + } + } while (cursor != "0"); + } + + public int Count() => Ids().Count; + + public long TtlRemaining(string entityId) + { + // Use Execute("TTL", ...) rather than KeyTimeToLive: the latter + // returns `null` for BOTH a missing key and a key without a TTL, + // collapsing the -2 and -1 sentinels. Execute returns the raw + // integer so the demo UI can show the correct value in each case. + return (long) _db.Execute("TTL", CacheKey(entityId)); + } + + public Dictionary Stats() + { + lock (_statsLock) + { + var total = _hits + _misses; + var hitRate = total == 0 ? 0.0 : Math.Round(100.0 * _hits / total, 1); + var avgLag = _syncLagSamples == 0 + ? 0.0 + : Math.Round(_syncLagMsTotal / _syncLagSamples, 2); + return new Dictionary + { + ["hits"] = _hits, + ["misses"] = _misses, + ["hit_rate_pct"] = hitRate, + ["prefetched"] = _prefetched, + ["sync_events_applied"] = _syncEventsApplied, + ["sync_lag_ms_avg"] = avgLag, + }; + } + } + + public void ResetStats() + { + lock (_statsLock) + { + _hits = 0; + _misses = 0; + _prefetched = 0; + _syncEventsApplied = 0; + _syncLagMsTotal = 0.0; + _syncLagSamples = 0; + } + } + + private static Dictionary ToDict(HashEntry[] entries) + { + var result = new Dictionary(entries.Length, StringComparer.Ordinal); + foreach (var entry in entries) + { + result[entry.Name!] = entry.Value!; + } + return result; + } + + private string CacheKey(string id) => _prefix + id; +} + +public enum ChangeOp +{ + Upsert, + Delete, +} + +/// +/// A single primary change event. is null for +/// deletes and a fully-formed record for upserts. +/// is the unix epoch in milliseconds (with sub-millisecond precision). +/// +public sealed record ChangeEvent(ChangeOp Op, string Id, Dictionary? 
Fields, double TimestampMs); diff --git a/content/develop/use-cases/prefetch-cache/dotnet/PrefetchCacheDemo.csproj b/content/develop/use-cases/prefetch-cache/dotnet/PrefetchCacheDemo.csproj new file mode 100644 index 0000000000..909cf1efd3 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/dotnet/PrefetchCacheDemo.csproj @@ -0,0 +1,14 @@ + + + + net8.0 + enable + enable + PrefetchCacheDemo + + + + + + + diff --git a/content/develop/use-cases/prefetch-cache/dotnet/Program.cs b/content/develop/use-cases/prefetch-cache/dotnet/Program.cs new file mode 100644 index 0000000000..375dfb2a0d --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/dotnet/Program.cs @@ -0,0 +1,642 @@ +using PrefetchCacheDemo; +using StackExchange.Redis; + +// .NET grows its ThreadPool gradually (~2 threads/sec under load), +// which can starve polling threads in the pause/resume race test and +// produce false fall-through reads. Raising the floor up front keeps +// the demo's "cache converges to the primary state under load" +// behaviour clean. A production helper would be async (HashGetAllAsync, +// await Task.Delay) and avoid this entirely. +ThreadPool.SetMinThreads(64, 64); + +// pauseMu serialises /clear and /reprefetch so two concurrent admin +// callers cannot pause/resume each other into a sync-worker live state. +// Mirrors the `pauseMu sync.Mutex` in the go-redis port. +var pauseMu = new object(); + +var host = "127.0.0.1"; +var port = 8787; +var redisHost = "localhost"; +var redisPort = 6379; +var cachePrefix = "cache:category:"; +var ttlSeconds = 3600; +var primaryLatencyMs = 80; + +for (var i = 0; i < args.Length; i++) +{ + switch (args[i]) + { + case "--host" when i + 1 < args.Length: host = args[++i]; break; + case "--port" when i + 1 < args.Length: port = int.Parse(args[++i]); break; + case "--redis-host" when i + 1 < args.Length: redisHost = args[++i]; break; + case "--redis-port" when i + 1 < args.Length: redisPort = int.Parse(args[++i]); break; + case "--cache-prefix" when i + 1 < args.Length: cachePrefix = args[++i]; break; + case "--ttl-seconds" when i + 1 < args.Length: ttlSeconds = int.Parse(args[++i]); break; + case "--primary-latency-ms" when i + 1 < args.Length: primaryLatencyMs = int.Parse(args[++i]); break; + } +} + +port = int.TryParse(Environment.GetEnvironmentVariable("PORT"), out var envPort) ? envPort : port; +redisHost = Environment.GetEnvironmentVariable("REDIS_HOST") ?? redisHost; +redisPort = int.TryParse(Environment.GetEnvironmentVariable("REDIS_PORT"), out var envRedisPort) + ? 
envRedisPort + : redisPort; + +ConnectionMultiplexer redis; +try +{ + var configuration = ConfigurationOptions.Parse($"{redisHost}:{redisPort}"); + redis = ConnectionMultiplexer.Connect(configuration); + redis.GetDatabase().Ping(); +} +catch (Exception ex) +{ + Console.Error.WriteLine($"Failed to connect to Redis at {redisHost}:{redisPort}: {ex.Message}"); + return 1; +} + +var cache = new PrefetchCache(redis.GetDatabase(), prefix: cachePrefix, ttlSeconds: ttlSeconds); +var primary = new MockPrimaryStore(primaryLatencyMs); +var sync = new SyncWorker(primary, cache); + +var startupSw = System.Diagnostics.Stopwatch.StartNew(); +cache.Clear(); +var initialLoaded = cache.BulkLoad(primary.ListRecords()); +startupSw.Stop(); +sync.Start(); + +var builder = WebApplication.CreateBuilder(); +builder.WebHost.UseUrls($"http://{host}:{port}"); +builder.Logging.SetMinimumLevel(LogLevel.Warning); +var app = builder.Build(); + +Dictionary BuildStats() +{ + var stats = cache.Stats(); + stats["primary_reads_total"] = primary.Reads; + stats["primary_read_latency_ms"] = primary.ReadLatencyMs; + return stats; +} + +double Round2(double value) => Math.Round(value, 2); + +app.MapGet("/", () => Results.Content(HtmlPage.Generate(cache.TtlSeconds), "text/html; charset=utf-8")); + +app.MapGet("/categories", () => Results.Json(new +{ + cache_ids = cache.Ids(), + primary_ids = primary.ListIds(), +})); + +app.MapGet("/read", (string? id) => +{ + if (string.IsNullOrEmpty(id)) + { + return Results.BadRequest(new { error = "Missing 'id' query parameter." }); + } + var result = cache.Get(id); + return Results.Json(new + { + id, + record = result.Record, + hit = result.Hit, + redis_latency_ms = Round2(result.RedisLatencyMs), + ttl_remaining = cache.TtlRemaining(id), + stats = BuildStats(), + }); +}); + +app.MapGet("/stats", () => Results.Json(BuildStats())); + +app.MapPost("/update", async (HttpContext ctx) => +{ + var form = await ctx.Request.ReadFormAsync(); + var id = form["id"].ToString(); + var field = form["field"].ToString(); + var value = form["value"].ToString(); + if (string.IsNullOrEmpty(id) || string.IsNullOrEmpty(field)) + { + return Results.BadRequest(new { error = "Missing 'id' or 'field'." }); + } + if (!primary.UpdateField(id, field, value)) + { + return Results.NotFound(new { error = $"Unknown category '{id}'." }); + } + return Results.Json(new { id, field, value, stats = BuildStats() }); +}); + +app.MapPost("/add", async (HttpContext ctx) => +{ + var form = await ctx.Request.ReadFormAsync(); + var id = form["id"].ToString().Trim(); + var name = form["name"].ToString().Trim(); + if (string.IsNullOrEmpty(id) || string.IsNullOrEmpty(name)) + { + return Results.BadRequest(new { error = "Missing 'id' or 'name'." }); + } + var displayOrder = form["display_order"].ToString(); + if (string.IsNullOrEmpty(displayOrder)) displayOrder = "99"; + var featured = form["featured"].ToString(); + if (string.IsNullOrEmpty(featured)) featured = "false"; + var parentId = form["parent_id"].ToString(); + var record = new Dictionary(StringComparer.Ordinal) + { + ["id"] = id, + ["name"] = name, + ["display_order"] = displayOrder, + ["featured"] = featured, + ["parent_id"] = parentId, + }; + if (!primary.AddRecord(record)) + { + return Results.Json(new { error = $"Category '{id}' already exists." 
}, statusCode: 409); + } + return Results.Json(new { id, record, stats = BuildStats() }); +}); + +app.MapPost("/delete", async (HttpContext ctx) => +{ + var form = await ctx.Request.ReadFormAsync(); + var id = form["id"].ToString(); + if (string.IsNullOrEmpty(id)) + { + return Results.BadRequest(new { error = "Missing 'id'." }); + } + if (!primary.DeleteRecord(id)) + { + return Results.NotFound(new { error = $"Unknown category '{id}'." }); + } + return Results.Json(new { id, stats = BuildStats() }); +}); + +app.MapPost("/invalidate", async (HttpContext ctx) => +{ + var form = await ctx.Request.ReadFormAsync(); + var id = form["id"].ToString(); + if (string.IsNullOrEmpty(id)) + { + return Results.BadRequest(new { error = "Missing 'id'." }); + } + var deleted = cache.Invalidate(id); + return Results.Json(new { id, deleted, stats = BuildStats() }); +}); + +app.MapPost("/clear", () => +{ + // Serialise admin handlers so two concurrent callers cannot + // pause/resume each other into a sync-worker live state. + lock (pauseMu) + { + // Pause the sync worker so it cannot recreate keys between SCAN + // and DEL. Queued events accumulate and apply after resume. + sync.Pause(); + int deleted; + try + { + deleted = cache.Clear(); + } + finally + { + sync.Resume(); + } + return Results.Json(new { deleted, stats = BuildStats() }); + } +}); + +app.MapPost("/reprefetch", () => +{ + // Serialise admin handlers so two concurrent callers cannot + // pause/resume each other into a sync-worker live state. + lock (pauseMu) + { + // Pause the sync worker so it cannot interleave with the + // clear + snapshot + bulk_load sequence. Without this, a change + // applied between ListRecords() and BulkLoad() would be overwritten + // by the stale snapshot. + sync.Pause(); + int loaded; + double elapsedMs; + try + { + var sw = System.Diagnostics.Stopwatch.StartNew(); + cache.Clear(); + loaded = cache.BulkLoad(primary.ListRecords()); + sw.Stop(); + elapsedMs = sw.Elapsed.TotalMilliseconds; + } + finally + { + sync.Resume(); + } + return Results.Json(new + { + loaded, + elapsed_ms = Round2(elapsedMs), + stats = BuildStats(), + }); + } +}); + +app.MapPost("/reset", () => +{ + cache.ResetStats(); + primary.ResetReads(); + return Results.Json(BuildStats()); +}); + +Console.WriteLine($"Redis prefetch-cache demo server listening on http://{host}:{port}"); +Console.WriteLine( + $"Using Redis at {redisHost}:{redisPort}" + + $" with cache prefix '{cachePrefix}' and TTL {ttlSeconds}s"); +Console.WriteLine($"Prefetched {initialLoaded} records in {startupSw.Elapsed.TotalMilliseconds:F1} ms; sync worker running"); + +AppDomain.CurrentDomain.ProcessExit += (_, _) => sync.Stop(); +Console.CancelKeyPress += (_, _) => sync.Stop(); + +app.Run(); +sync.Stop(); +return 0; + +static class HtmlPage +{ + public static string Generate(int cacheTtl) + { + return Template.Replace("__CACHE_TTL__", cacheTtl.ToString()); + } + + // Verbatim copy of the Python reference's HTML_TEMPLATE. The pill + // text is changed to describe the .NET stack; everything else is + // identical so the demo UI matches across clients. + private const string Template = """ + + + + + + Redis Prefetch Cache Demo + + + +
+    <!-- (Template body omitted in this extract. It renders the pill
+    "StackExchange.Redis + ASP.NET Core minimal API", the heading
+    "Redis Prefetch Cache Demo", an intro paragraph explaining that every
+    record is pre-loaded, reads run HGETALL against Redis only with no
+    fall-back to the primary, a background sync worker applies change
+    events within milliseconds, and the __CACHE_TTL__ s safety-net TTL
+    bounds memory if sync ever stops; plus panels for Cache state, Read a
+    category, Update a field, Add a category, Delete a category, Break
+    the cache, Cache stats, and Last result, each wired to the demo's
+    HTTP endpoints.) -->
+ + + + +"""; +} + diff --git a/content/develop/use-cases/prefetch-cache/dotnet/SyncWorker.cs b/content/develop/use-cases/prefetch-cache/dotnet/SyncWorker.cs new file mode 100644 index 0000000000..7c5d12f621 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/dotnet/SyncWorker.cs @@ -0,0 +1,139 @@ +namespace PrefetchCacheDemo; + +/// +/// Background sync worker for the prefetch-cache demo. +/// +/// A long-running background drains the primary's +/// change queue and applies each event to Redis through +/// . In a real system, the queue +/// is replaced by a CDC pipeline (Redis Data Integration, Debezium, or +/// an equivalent) that tails the primary's binlog/WAL and writes the +/// same shape of events. +/// +/// The worker exposes and so +/// maintenance paths (/reprefetch, ) +/// can stop event application without tearing the thread down. +/// blocks until the worker is parked, so the caller +/// knows no apply is in flight by the time it returns. +/// +public class SyncWorker +{ + private readonly MockPrimaryStore _primary; + private readonly PrefetchCache _cache; + private readonly TimeSpan _pollTimeout; + private readonly ManualResetEventSlim _stopEvent = new(false); + private readonly ManualResetEventSlim _pauseEvent = new(false); + private readonly ManualResetEventSlim _pausedIdleEvent = new(false); + private readonly object _threadLock = new(); + private Thread? _thread; + + public SyncWorker(MockPrimaryStore primary, PrefetchCache cache, TimeSpan? pollTimeout = null) + { + _primary = primary ?? throw new ArgumentNullException(nameof(primary)); + _cache = cache ?? throw new ArgumentNullException(nameof(cache)); + _pollTimeout = pollTimeout ?? TimeSpan.FromMilliseconds(50); + } + + public void Start() + { + lock (_threadLock) + { + if (_thread is not null && _thread.IsAlive) return; + _stopEvent.Reset(); + _pauseEvent.Reset(); + _pausedIdleEvent.Reset(); + _thread = new Thread(Run) + { + Name = "prefetch-cache-sync", + IsBackground = true, + }; + _thread.Start(); + } + } + + /// + /// Signal the worker to exit and join its thread. + /// + /// If the join times out the worker is wedged inside + /// ; we leave + /// _thread populated so a subsequent + /// does not spawn a second worker on top of the orphan. + /// + public void Stop(TimeSpan? joinTimeout = null) + { + var timeout = joinTimeout ?? TimeSpan.FromSeconds(2); + _stopEvent.Set(); + Thread? toJoin; + lock (_threadLock) { toJoin = _thread; } + if (toJoin is null) return; + if (toJoin.Join(timeout)) + { + lock (_threadLock) + { + if (!toJoin.IsAlive) _thread = null; + } + } + } + + /// + /// Stop applying events and block until the worker is parked. + /// + /// Returns true once the worker has confirmed it is idle, or + /// false if the timeout elapsed first. While paused, change + /// events accumulate in the primary's queue and are applied in + /// order after . + /// + public bool Pause(TimeSpan? timeout = null) + { + var waitFor = timeout ?? TimeSpan.FromSeconds(2); + _pausedIdleEvent.Reset(); + _pauseEvent.Set(); + Thread? current; + lock (_threadLock) { current = _thread; } + if (current is null || !current.IsAlive) return true; + return _pausedIdleEvent.Wait(waitFor); + } + + public void Resume() + { + _pauseEvent.Reset(); + _pausedIdleEvent.Reset(); + } + + private void Run() + { + while (!_stopEvent.IsSet) + { + if (_pauseEvent.IsSet) + { + // Park until the pause is lifted or the worker is stopped. 
+ // Re-Set _pausedIdleEvent on every iteration so a *new* + // Pause call that arrives while we are still parked from + // the previous cycle gets acknowledged within one poll + // interval, not the Pause's 2 s timeout. + while (_pauseEvent.IsSet && !_stopEvent.IsSet) + { + _pausedIdleEvent.Set(); + _stopEvent.Wait(_pollTimeout); + } + _pausedIdleEvent.Reset(); + continue; + } + + var change = _primary.NextChange(_pollTimeout); + if (change is null) continue; + try + { + _cache.ApplyChange(change); + } + catch (Exception ex) + { + // Demo behaviour: log and drop the event. A production + // CDC consumer would retry with bounded backoff and + // expose a dead-letter / error counter; see the guide's + // "Production usage" section. + Console.Error.WriteLine($"[sync] failed to apply {change}: {ex.Message}"); + } + } + } +} diff --git a/content/develop/use-cases/prefetch-cache/dotnet/_index.md b/content/develop/use-cases/prefetch-cache/dotnet/_index.md new file mode 100644 index 0000000000..fc773433d3 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/dotnet/_index.md @@ -0,0 +1,426 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in C# with StackExchange.Redis +linkTitle: StackExchange.Redis example (C#) +title: Redis prefetch cache with StackExchange.Redis +weight: 6 +--- + +This guide shows you how to implement a Redis prefetch cache in C# with [StackExchange.Redis](https://stackexchange.github.io/StackExchange.Redis/). It includes a small local web server built with ASP.NET Core minimal APIs so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. + +That gives you: + +* Near-100% cache hit ratios for reference and master data +* Sub-millisecond reads for lookup-heavy paths at peak traffic +* All reference-data reads offloaded from the primary database +* Source-database changes propagated into Redis within a few milliseconds +* A long safety-net TTL that bounds memory if the sync pipeline ever stops + +In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `cache.BulkLoad(primary.ListRecords())`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. **On every read**, the application calls `cache.Get(entityId)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to an in-process queue. The sync worker thread drains the queue and calls `cache.ApplyChange(event)`. For an `Upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `Delete`, it removes the cache key. 
+
+In a real system the in-process change queue is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis.
+
+## The prefetch-cache helper
+
+The `PrefetchCache` class wraps the cache operations
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/dotnet/PrefetchCache.cs)):
+
+```csharp
+using StackExchange.Redis;
+using PrefetchCacheDemo;
+
+var redis = ConnectionMultiplexer.Connect("localhost:6379");
+var primary = new MockPrimaryStore();
+var cache = new PrefetchCache(redis.GetDatabase(), ttlSeconds: 3600);
+
+// Pre-load every primary record into Redis in one pipelined round trip.
+cache.BulkLoad(primary.ListRecords());
+
+// Start the sync worker so primary mutations propagate into Redis.
+var sync = new SyncWorker(primary, cache);
+sync.Start();
+
+// Read paths now go to Redis only.
+var result = cache.Get("cat-001");
+```
+
+### Data model
+
+Each cached category is stored in a Redis hash:
+
+```text
+cache:category:cat-001
+  id            = cat-001
+  name          = Beverages
+  display_order = 1
+  featured      = true
+  parent_id     =
+```
+
+The implementation uses:
+
+* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), batched, for the bulk load and every sync event
+* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path
+* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation
+* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix
+* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI
+
+## Bulk load on startup
+
+The `BulkLoad` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record through a StackExchange.Redis `IBatch`, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table:
+
+```csharp
+public int BulkLoad(IEnumerable<Dictionary<string, string>> records)
+{
+    var batch = _db.CreateBatch();
+    var tasks = new List<Task>();
+    var loaded = 0;
+    foreach (var record in records)
+    {
+        if (!record.TryGetValue("id", out var entityId) || string.IsNullOrEmpty(entityId)) continue;
+        var cacheKey = CacheKey(entityId);
+        tasks.Add(batch.KeyDeleteAsync(cacheKey));
+        tasks.Add(batch.HashSetAsync(
+            cacheKey,
+            record.Select(p => new HashEntry(p.Key, p.Value)).ToArray()));
+        tasks.Add(batch.KeyExpireAsync(cacheKey, TimeSpan.FromSeconds(_ttlSeconds)));
+        loaded++;
+    }
+    if (loaded > 0)
+    {
+        batch.Execute();
+        Task.WaitAll(tasks.ToArray());
+    }
+    return loaded;
+}
+```
+
+`IBatch` is non-transactional on purpose for the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `BulkLoad` directly from your own code on a cache that is already serving reads, either pause your writers first or rewrite it with `IDatabase.CreateTransaction()` so callers cannot observe a half-loaded record.
+
+## Reads from Redis only
+
+The `Get` method runs `HGETALL` and returns the cached hash.
**It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident: + +```csharp +public ReadResult Get(string entityId) +{ + var cacheKey = CacheKey(entityId); + var sw = System.Diagnostics.Stopwatch.StartNew(); + var entries = _db.HashGetAll(cacheKey); + sw.Stop(); + var redisLatencyMs = sw.Elapsed.TotalMilliseconds; + + if (entries.Length > 0) + { + lock (_statsLock) { _hits++; } + return new ReadResult(ToDict(entries), Hit: true, redisLatencyMs); + } + + lock (_statsLock) { _misses++; } + return new ReadResult(null, Hit: false, redisLatencyMs); +} +``` + +This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load. + +## Applying sync events + +The sync worker calls `ApplyChange` for every primary mutation. For an `Upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL inside a StackExchange.Redis transaction (`IDatabase.CreateTransaction()`) so the cache never holds a stale mix of old and new fields. For a `Delete`, it removes the cache key: + +```csharp +public void ApplyChange(ChangeEvent change) +{ + if (string.IsNullOrEmpty(change.Id)) return; + var cacheKey = CacheKey(change.Id); + + if (change.Op == ChangeOp.Upsert) + { + if (change.Fields is null || change.Fields.Count == 0) return; + var tx = _db.CreateTransaction(); + _ = tx.KeyDeleteAsync(cacheKey); + _ = tx.HashSetAsync( + cacheKey, + change.Fields.Select(p => new HashEntry(p.Key, p.Value)).ToArray()); + _ = tx.KeyExpireAsync(cacheKey, TimeSpan.FromSeconds(_ttlSeconds)); + tx.Execute(); + } + else if (change.Op == ChangeOp.Delete) + { + _db.KeyDelete(cacheKey); + } +} +``` + +The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. StackExchange.Redis transactions are optimistic (WATCH-based under the hood), but the three commands here have no conditions so they queue and dispatch atomically in a single round trip. + +The "skip empty upserts" early-return is important: `HSET` with an empty array of fields throws, and a CDC pipeline that ever emits an upsert without fields would crash the sync worker on first encounter. A production consumer would route the bad event to a dead-letter queue and alert; the demo simply drops it. + +## The sync worker + +The `SyncWorker` runs a long-running background `Thread` (not a `Task`) so it can poll on the change queue without consuming a ThreadPool slot. Every change is applied to Redis as soon as it arrives +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/dotnet/SyncWorker.cs)): + +```csharp +private void Run() +{ + while (!_stopEvent.IsSet) + { + if (_pauseEvent.IsSet) + { + _pausedIdleEvent.Set(); + while (_pauseEvent.IsSet && !_stopEvent.IsSet) + { + _stopEvent.Wait(_pollTimeout); + } + _pausedIdleEvent.Reset(); + continue; + } + + var change = _primary.NextChange(_pollTimeout); + if (change is null) continue; + try { _cache.ApplyChange(change); } + catch (Exception ex) + { + Console.Error.WriteLine($"[sync] failed to apply {change}: {ex.Message}"); + } + } +} +``` + +`ManualResetEventSlim` provides the pause and stop signals. 
`BlockingCollection<T>.TryTake(out _, timeout)` is the .NET equivalent of `queue.Queue.get(timeout=…)` from the reference; the 50 ms timeout keeps the worker responsive to pause and stop requests without busy-looping.
+
+In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset.
+
+## Invalidation and re-prefetch
+
+Two helpers exist for testing and recovery:
+
+* `Invalidate(entityId)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record.
+* `Clear()` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss.
+
+In both cases, the recovery path is to call `BulkLoad(primary.ListRecords())` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation.
+
+### Re-prefetch under load
+
+`Clear()` and `BulkLoad()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.ListRecords()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `Clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation:
+
+```csharp
+sync.Pause();
+try
+{
+    cache.Clear();
+    cache.BulkLoad(primary.ListRecords());
+}
+finally
+{
+    sync.Resume();
+}
+```
+
+`Pause()` waits for the worker to finish whatever event it is currently applying, parks the run loop, and returns. Change events that arrive during the pause sit in the primary's `BlockingCollection` queue and apply in order once `Resume()` is called, so no event is lost.
+
+## Hit/miss accounting
+
+The helper keeps in-process counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up:
+
+```csharp
+public Dictionary<string, object> Stats()
+{
+    lock (_statsLock)
+    {
+        var total = _hits + _misses;
+        var hitRate = total == 0 ? 0.0 : Math.Round(100.0 * _hits / total, 1);
+        var avgLag = _syncLagSamples == 0
+            ? 0.0
+            : Math.Round(_syncLagMsTotal / _syncLagSamples, 2);
+        return new Dictionary<string, object>
+        {
+            ["hits"] = _hits,
+            ["misses"] = _misses,
+            ["hit_rate_pct"] = hitRate,
+            ["prefetched"] = _prefetched,
+            ["sync_events_applied"] = _syncEventsApplied,
+            ["sync_lag_ms_avg"] = avgLag,
+        };
+    }
+}
+```
+
+In production you would emit these as counters and gauges through `Meter`/`Counter` and scrape with Prometheus or OpenTelemetry. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind.
+
+## Prerequisites
+
+Before running the demo, make sure that:
+
+* Redis is running and accessible. By default, the demo connects to `localhost:6379`.
+* The [.NET 8 SDK](https://dotnet.microsoft.com/download) (or newer) is installed:
+
+```bash
+dotnet --version
+```
+
+The project file pins `StackExchange.Redis` at 2.7+, which `dotnet run` restores automatically on first invocation.
+
+If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`.
+
+## Running the demo
+
+### Get the source files
+
+The demo consists of five files.
Download them from the [`dotnet` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/dotnet) on GitHub, or grab them with `curl`:
+
+```bash
+mkdir prefetch-cache-demo && cd prefetch-cache-demo
+BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/dotnet
+curl -O $BASE/PrefetchCacheDemo.csproj
+curl -O $BASE/PrefetchCache.cs
+curl -O $BASE/MockPrimaryStore.cs
+curl -O $BASE/SyncWorker.cs
+curl -O $BASE/Program.cs
+```
+
+### Start the demo server
+
+From that directory:
+
+```bash
+dotnet run
+```
+
+You should see something like:
+
+```text
+Redis prefetch-cache demo server listening on http://127.0.0.1:8787
+Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s
+Prefetched 5 records in 92.4 ms; sync worker running
+```
+
+After starting the server, visit `http://localhost:8787`.
+
+The demo server uses ASP.NET Core minimal APIs and only standard .NET threading primitives:
+
+* `WebApplication.CreateBuilder()` for HTTP routing
+* `BlockingCollection<ChangeEvent>` for the change-event queue
+* `Thread` + `ManualResetEventSlim` for the sync worker
+
+It exposes a small interactive page where you can:
+
+* See which IDs are in the cache and in the primary, side by side
+* Read a category through the cache and confirm every read is a hit
+* Update a field on the primary and watch the sync worker rewrite the cache hash
+* Add and delete categories and watch them appear and disappear from the cache
+* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure
+* Re-prefetch from the primary to recover from a broken cache state
+* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all
+
+## The mock primary store
+
+To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/dotnet/MockPrimaryStore.cs)):
+
+```csharp
+public class MockPrimaryStore
+{
+    public MockPrimaryStore(int readLatencyMs = 80) { ... }
+
+    public List<Dictionary<string, string>> ListRecords()
+    {
+        Thread.Sleep(ReadLatencyMs);
+        ...
+    }
+
+    public bool UpdateField(string entityId, string field, string value)
+    {
+        lock (_lock)
+        {
+            ...
+            EmitChangeLocked(ChangeOp.Upsert, entityId, snapshot);
+        }
+        return true;
+    }
+}
+```
+
+Every mutation appends a `ChangeEvent` to an in-process [`BlockingCollection<ChangeEvent>`](https://learn.microsoft.com/dotnet/api/system.collections.concurrent.blockingcollection-1). The sync worker drains the queue with a 50 ms timeout and applies each event to Redis. The emit happens while the mutation lock is held so two concurrent updates cannot interleave their event order on the queue. In a real system this queue is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis.
+
+## Production usage
+
+This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it.
+
+### Replace the in-process change queue with a real CDC pipeline
+
+The demo's in-process queue is the simplest possible stand-in for a CDC change feed.
In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset. + +### Use a long safety-net TTL, not a freshness TTL + +The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key). Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys. + +### Decide what to do on a cache miss + +A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are: + +* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache. +* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken. + +Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides. + +### Bound the working set to what fits in memory + +Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM. + +### Reconcile periodically against the primary + +CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss. + +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` argument exactly for this. + +### Prefer async on hot paths + +The demo helper is synchronous (`HashGetAll`, `KeyDelete`, etc.) to keep the example compact. .NET's `ThreadPool` grows by only a couple of threads per second under load, so a synchronous helper combined with many concurrent HTTP handlers can starve workers and produce false cache-misses during traffic spikes. The demo works around this by calling `ThreadPool.SetMinThreads(64, 64)` at startup; a production helper would expose `async` methods (`HashGetAllAsync`, `KeyDeleteAsync`, `await Task.Delay`) and route requests through an async pipeline end-to-end. That removes the synchronous-blocking risk entirely and is the idiomatic shape for ASP.NET Core handlers. 
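+
+A minimal sketch of what that conversion looks like on the read path, assuming the same `_db`, `CacheKey`, `ToDict`, `ReadResult`, and stats fields as the synchronous helper (the method name `GetAsync` is illustrative and not part of the demo files):
+
+```csharp
+// Hypothetical async counterpart to the demo's synchronous Get.
+public async Task<ReadResult> GetAsync(string entityId)
+{
+    var cacheKey = CacheKey(entityId);
+    var sw = System.Diagnostics.Stopwatch.StartNew();
+    // Awaiting HashGetAllAsync returns the thread to the pool while
+    // Redis replies, instead of blocking a ThreadPool worker.
+    var entries = await _db.HashGetAllAsync(cacheKey).ConfigureAwait(false);
+    sw.Stop();
+
+    if (entries.Length > 0)
+    {
+        lock (_statsLock) { _hits++; }
+        return new ReadResult(ToDict(entries), Hit: true, sw.Elapsed.TotalMilliseconds);
+    }
+    lock (_statsLock) { _misses++; }
+    return new ReadResult(null, Hit: false, sw.Elapsed.TotalMilliseconds);
+}
+```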
+ +### Use TickCount64 (not TickCount) for any deadline arithmetic + +If you add timeout/deadline logic to your sync worker or maintenance handlers, use `Environment.TickCount64`, never `Environment.TickCount`. The 32-bit variant wraps every 24.9 days and adding a positive offset near the wraparound boundary produces a negative deadline that immediately exits the polling loop. The 64-bit variant has no practical wrap interval. + +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. + +## Learn more + +* [StackExchange.Redis documentation](https://stackexchange.github.io/StackExchange.Redis/) - Install and use the StackExchange.Redis client +* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields +* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash +* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds +* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete +* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server +* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key +* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud diff --git a/content/develop/use-cases/prefetch-cache/go/_index.md b/content/develop/use-cases/prefetch-cache/go/_index.md new file mode 100644 index 0000000000..18844b7313 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/go/_index.md @@ -0,0 +1,456 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in Go with go-redis +linkTitle: go-redis example (Go) +title: Redis prefetch cache with go-redis +weight: 3 +--- + +This guide shows you how to implement a Redis prefetch cache in Go with [`go-redis`]({{< relref "/develop/clients/go" >}}). It includes a small local web server built with Go's standard `net/http` package so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. 
+
+That gives you:
+
+* Near-100% cache hit ratios for reference and master data
+* Sub-millisecond reads for lookup-heavy paths at peak traffic
+* All reference-data reads offloaded from the primary database
+* Source-database changes propagated into Redis within a few milliseconds
+* A long safety-net TTL that bounds memory if the sync pipeline ever stops
+
+In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case.
+
+## How it works
+
+The flow has three independent paths:
+
+1. **On startup**, the demo server calls `cache.BulkLoad(ctx, primary.ListRecords())`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip.
+2. **On every read**, the application calls `cache.Get(ctx, id)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary.
+3. **On every primary mutation**, the primary appends a change event to an in-process channel. A sync-worker goroutine drains the channel and calls `cache.ApplyChange(ctx, event)`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key.
+
+In a real system the in-process channel is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis.
+
+## The prefetch-cache helper
+
+The `PrefetchCache` type wraps the cache operations
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/go/cache.go)):
+
+```go
+package main
+
+import (
+	"context"
+	"time"
+
+	"github.com/redis/go-redis/v9"
+	"prefetchcache"
+)
+
+func main() {
+	client := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
+	primary := prefetchcache.NewMockPrimaryStore(80)
+	cache := prefetchcache.NewPrefetchCache(client, "cache:category:", 3600)
+
+	ctx := context.Background()
+
+	// Pre-load every primary record into Redis in one pipelined round trip.
+	_, _ = cache.BulkLoad(ctx, primary.ListRecords())
+
+	// Start the sync worker so primary mutations propagate into Redis.
+	sync := prefetchcache.NewSyncWorker(primary, cache)
+	sync.Start()
+	defer sync.Stop(2 * time.Second)
+
+	// Read paths now go to Redis only.
+	result, _ := cache.Get(ctx, "cat-001")
+	_ = result
+}
+```
+
+### Data model
+
+Each cached category is stored in a Redis hash:
+
+```text
+cache:category:cat-001
+  id            = cat-001
+  name          = Beverages
+  display_order = 1
+  featured      = true
+  parent_id     =
+```
+
+The implementation uses:
+
+* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event
+* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path
+* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation
+* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix
+* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI
+
+## Bulk load on startup
+
+`BulkLoad` pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record.
The pipeline is sent in a single round trip, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table: + +```go +func (c *PrefetchCache) BulkLoad(ctx context.Context, records []map[string]string) (int, error) { + loaded := 0 + pipe := c.client.Pipeline() + for _, record := range records { + id := record["id"] + if id == "" { + continue + } + cacheKey := c.cacheKey(id) + pipe.Del(ctx, cacheKey) + pipe.HSet(ctx, cacheKey, hashFields(record)...) + pipe.Expire(ctx, cacheKey, time.Duration(c.ttlSeconds)*time.Second) + loaded++ + } + if loaded > 0 { + if _, err := pipe.Exec(ctx); err != nil { + return 0, err + } + } + return loaded, nil +} +``` + +The pipeline uses `client.Pipeline()` (non-transactional) on the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `BulkLoad` directly from your own code on a cache that is already serving reads, either pause your writers first or rewrite it with `client.TxPipeline()` so callers cannot observe a half-loaded record. + +## Reads from Redis only + +`Get` runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident: + +```go +func (c *PrefetchCache) Get(ctx context.Context, id string) (GetResult, error) { + cacheKey := c.cacheKey(id) + started := time.Now() + cached, err := c.client.HGetAll(ctx, cacheKey).Result() + latencyMs := float64(time.Since(started).Microseconds()) / 1000.0 + if err != nil { + return GetResult{RedisLatencyMs: latencyMs}, err + } + if len(cached) > 0 { + c.recordHit() + return GetResult{Record: cached, Hit: true, RedisLatencyMs: latencyMs}, nil + } + c.recordMiss() + return GetResult{Record: nil, Hit: false, RedisLatencyMs: latencyMs}, nil +} +``` + +This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load. + +## Applying sync events + +The sync worker calls `ApplyChange` for every primary mutation. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL in one pipelined transaction so the cache never holds a stale mix of old and new fields. For a `delete`, it removes the cache key: + +```go +func (c *PrefetchCache) ApplyChange(ctx context.Context, change Change) error { + if change.ID == "" { + return nil + } + cacheKey := c.cacheKey(change.ID) + + switch change.Op { + case ChangeOpUpsert: + if len(change.Fields) == 0 { + // Malformed upsert with no fields. Skip rather than crash + // the sync worker: HSET with an empty mapping errors, and + // there's nothing to write anyway. + return nil + } + pipe := c.client.TxPipeline() + pipe.Del(ctx, cacheKey) + pipe.HSet(ctx, cacheKey, hashFields(change.Fields)...) 
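+		// Refresh the safety-net TTL on every upsert. The TTL is a
+		// memory bound for a stalled pipeline, not the freshness mechanism.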
+ pipe.Expire(ctx, cacheKey, time.Duration(c.ttlSeconds)*time.Second) + if _, err := pipe.Exec(ctx); err != nil { + return err + } + case ChangeOpDelete: + if err := c.client.Del(ctx, cacheKey).Err(); err != nil { + return err + } + default: + return nil + } + return nil +} +``` + +The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. `TxPipeline` wraps the three commands in `MULTI`/`EXEC` so concurrent readers can never observe the half-written intermediate state. + +## The sync worker + +`SyncWorker` runs a single goroutine that blocks on the primary's change channel with a short timeout. Every change is applied to Redis as soon as it arrives +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/go/sync_worker.go)): + +```go +func (w *SyncWorker) run(ctx context.Context, done chan struct{}) { + defer close(done) + for { + select { + case <-ctx.Done(): + return + default: + } + + // ... park here if paused ... + + change, ok := w.primary.NextChange(w.pollTimeout) + if !ok { + continue + } + if err := w.cache.ApplyChange(ctx, change); err != nil { + log.Printf("[sync] failed to apply %s %s: %v", change.Op, change.ID, err) + } + } +} +``` + +Pause and resume are coordinated through two channels stored on the worker: + +* `pausedIdle` is closed by the worker when the run loop has parked itself. `Pause()` waits on this channel so it can prove no `ApplyChange` is in flight before returning. +* `resumeCh` is closed by `Resume()` to wake the parked select. Both channels are replaced with fresh values on each `Pause()` so a stale `Resume` from a previous cycle cannot prematurely unblock the next pause. + +In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset. + +## Invalidation and re-prefetch + +Two helpers exist for testing and recovery: + +* `Invalidate(ctx, id)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record. +* `Clear(ctx)` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss. + +In both cases, the recovery path is to call `BulkLoad(ctx, primary.ListRecords())` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation. + +### Re-prefetch under load + +`Clear()` and `BulkLoad()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.ListRecords()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `Clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation: + +```go +s.sync.Pause(2 * time.Second) +_, _ = s.cache.Clear(ctx) +loaded, _ := s.cache.BulkLoad(ctx, s.primary.ListRecords()) +s.sync.Resume() +``` + +`Pause()` waits for the worker goroutine to finish whatever event it is currently applying, parks the run loop, and returns. Change events that arrive during the pause sit on the primary's channel and apply in order once `Resume()` is called, so no event is lost. 
The demo also wraps the pause/resume pair in a `sync.Mutex` so two concurrent admin callers cannot interleave their pause/resume cycles.
+
+## Hit/miss accounting
+
+The helper keeps in-process counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up:
+
+```go
+func (c *PrefetchCache) Stats() map[string]any {
+	c.mu.Lock()
+	defer c.mu.Unlock()
+	total := c.hits + c.misses
+	hitRate := 0.0
+	if total > 0 {
+		hitRate = roundTo(100.0*float64(c.hits)/float64(total), 1)
+	}
+	avgLag := 0.0
+	if c.syncLagSamples > 0 {
+		avgLag = roundTo(c.syncLagMsTotal/float64(c.syncLagSamples), 2)
+	}
+	return map[string]any{
+		"hits":                c.hits,
+		"misses":              c.misses,
+		"hit_rate_pct":        hitRate,
+		"prefetched":          c.prefetched,
+		"sync_events_applied": c.syncEventsApplied,
+		"sync_lag_ms_avg":     avgLag,
+	}
+}
+```
+
+In production you would emit these as Prometheus counters and gauges. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind.
+
+## Prerequisites
+
+* Redis running and accessible. By default, the demo connects to `localhost:6379`.
+* Go 1.21 or later.
+* The `go-redis` client. The included `go.mod` pins:
+
+  ```text
+  require github.com/redis/go-redis/v9 v9.18.0
+  ```
+
+If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`.
+
+## Running the demo
+
+### Get the source files
+
+The demo consists of four Go source files plus `go.mod` and `go.sum`. Download them from the [`go` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/go) on GitHub, or grab them with `curl`:
+
+```bash
+mkdir prefetch-cache-demo && cd prefetch-cache-demo
+BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/go
+curl -O $BASE/cache.go
+curl -O $BASE/primary.go
+curl -O $BASE/sync_worker.go
+curl -O $BASE/demo_server.go
+curl -O $BASE/go.mod
+curl -O $BASE/go.sum
+```
+
+### Start the demo server
+
+The helper, mock primary, sync worker, and demo handlers all live in `package prefetchcache`. Go's `package main` can't live in the same directory as another package, so create a tiny `main.go` shim in a subdirectory that calls into the package:
+
+```bash
+mkdir -p cmd/demo
+cat > cmd/demo/main.go <<'EOF'
+package main
+
+import "prefetchcache"
+
+func main() { prefetchcache.RunDemoServer() }
+EOF
+```
+
+Then build and run:
+
+```bash
+go mod tidy
+go run ./cmd/demo
+```
+
+You should see something like:
+
+```text
+Redis prefetch-cache demo server listening on http://127.0.0.1:8784
+Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s
+Prefetched 5 records in 83.0 ms; sync worker running
+```
+
+After starting the server, visit [http://localhost:8784](http://localhost:8784).
+ +The demo server uses only Go's standard library plus `go-redis`: + +* [`net/http`](https://pkg.go.dev/net/http) for the web server +* [`flag`](https://pkg.go.dev/flag) for CLI flags +* Goroutines, channels, and `sync.Mutex` for the sync worker and stats counters + +It exposes a small interactive page where you can: + +* See which IDs are in the cache and in the primary, side by side +* Read a category through the cache and confirm every read is a hit +* Update a field on the primary and watch the sync worker rewrite the cache hash +* Add and delete categories and watch them appear and disappear from the cache +* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure +* Re-prefetch from the primary to recover from a broken cache state +* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all + +If you want to run the demo against a non-default cache prefix or port, pass `--port` and `--cache-prefix`: + +```bash +go run ./cmd/demo --port 8784 --cache-prefix 'cache:category:' +``` + +## The mock primary store + +To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/go/primary.go)): + +```go +type MockPrimaryStore struct { + readLatencyMs int + + mu sync.Mutex + reads int + changes chan Change + records map[string]map[string]string +} + +func (p *MockPrimaryStore) ListRecords() []map[string]string { + time.Sleep(time.Duration(p.readLatencyMs) * time.Millisecond) + // ... return a deep copy of every record under p.mu ... +} + +func (p *MockPrimaryStore) UpdateField(id, field, value string) bool { + p.mu.Lock() + defer p.mu.Unlock() + rec, ok := p.records[id] + if !ok { + return false + } + rec[field] = value + p.emitChangeLocked(ChangeOpUpsert, id, copyRecord(rec)) + return true +} +``` + +Every mutation appends a change event to an in-process buffered `chan Change`. The sync worker drains the channel with a 50 ms timeout via `NextChange` and applies each event to Redis. The change event is **emitted while the record lock is still held** (`emitChangeLocked` runs inside the `mu.Lock()` block) so two concurrent `UpdateField` calls cannot produce out-of-order events on the channel. + +In a real system this channel is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis. + +## Production usage + +This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it. + +### Replace the in-process change channel with a real CDC pipeline + +The demo's in-process channel is the simplest possible stand-in for a CDC change feed. In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset. + +### Use a long safety-net TTL, not a freshness TTL + +The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. 
The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key). Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys. + +### Decide what to do on a cache miss + +A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are: + +* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache. +* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken. + +Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides. + +### Bound the working set to what fits in memory + +Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM. + +### Reconcile periodically against the primary + +CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss. + +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` constructor argument exactly for this. + +### Wire shutdown through `context.Context` + +The sync worker runs on its own goroutine that blocks in `NextChange` (a channel select with a 50 ms timeout). The demo's `RunDemoServer` calls `syncWorker.Stop(2 * time.Second)` on SIGINT/SIGTERM, which cancels the worker's internal context and joins the goroutine. Wire your real sync worker to your service's shutdown context so `SIGTERM` produces a clean drain instead of a hard kill. + +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. 
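+
+The reconciliation job described in [Reconcile periodically against the primary](#reconcile-periodically-against-the-primary) can stay small. The sketch below is illustrative and not part of the demo files: it reuses the demo's own `Change`, `ChangeOpUpsert`, and `GetResult` types, assumes it sits in `package prefetchcache` next to `cache.go`, and only re-adds missing or divergent entries (a fuller job would also delete cache keys whose primary record is gone):
+
+```go
+// Illustrative reconciliation pass (not part of the demo files). It
+// assumes the package's Change, ChangeOpUpsert, and GetResult types
+// are in scope and needs "context" and "maps" (Go 1.21+) imported.
+func reconcile(ctx context.Context, cache *PrefetchCache, primary *MockPrimaryStore) error {
+	for _, record := range primary.ListRecords() {
+		got, err := cache.Get(ctx, record["id"])
+		if err != nil {
+			return err
+		}
+		if got.Hit && maps.Equal(got.Record, record) {
+			continue // cache agrees with the primary
+		}
+		// Missing or divergent: rewrite the entry as a synthetic upsert,
+		// which also re-arms the safety-net TTL.
+		if err := cache.ApplyChange(ctx, Change{
+			Op:     ChangeOpUpsert,
+			ID:     record["id"],
+			Fields: record,
+		}); err != nil {
+			return err
+		}
+	}
+	return nil
+}
+```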
+ +## Learn more + +* [go-redis guide]({{< relref "/develop/clients/go" >}}) - Install and use the Go Redis client +* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields +* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash +* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds +* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete +* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server +* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key +* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud diff --git a/content/develop/use-cases/prefetch-cache/go/cache.go b/content/develop/use-cases/prefetch-cache/go/cache.go new file mode 100644 index 0000000000..a473c2f719 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/go/cache.go @@ -0,0 +1,354 @@ +// Redis prefetch-cache helper. +// +// Each cached entity is stored as a Redis hash under cache:{prefix}:{id} +// with a long safety-net TTL that bounds memory if the sync pipeline +// ever stops, but is not the freshness mechanism. Freshness comes from +// the ApplyChange path, which the sync worker calls every time a +// primary mutation arrives. +// +// Reads run HGETALL against Redis only. A miss is not a fall-back +// trigger -- the application treats it as an error or a deliberate +// Invalidate for testing. In production a sustained miss rate means the +// prefetch or the sync pipeline is broken, not that the primary should +// be re-queried on the request path. +package prefetchcache + +import ( + "context" + "sort" + "sync" + "time" + + "github.com/redis/go-redis/v9" +) + +// PrefetchCache is a prefetch-cache helper backed by Redis hashes with +// a safety-net TTL. +type PrefetchCache struct { + client *redis.Client + prefix string + ttlSeconds int + + mu sync.Mutex + hits int + misses int + prefetched int + syncEventsApplied int + syncLagMsTotal float64 + syncLagSamples int +} + +// NewPrefetchCache returns a PrefetchCache. Pass an empty prefix to use +// the default "cache:category:" and 0 for ttlSeconds to use the default +// 3600. +func NewPrefetchCache(client *redis.Client, prefix string, ttlSeconds int) *PrefetchCache { + if prefix == "" { + prefix = "cache:category:" + } + if ttlSeconds == 0 { + ttlSeconds = 3600 + } + return &PrefetchCache{ + client: client, + prefix: prefix, + ttlSeconds: ttlSeconds, + } +} + +// Prefix returns the configured cache-key prefix. +func (c *PrefetchCache) Prefix() string { return c.prefix } + +// TTLSeconds returns the configured safety-net TTL in seconds. +func (c *PrefetchCache) TTLSeconds() int { return c.ttlSeconds } + +func (c *PrefetchCache) cacheKey(id string) string { return c.prefix + id } + +func (c *PrefetchCache) stripPrefix(key string) string { + if len(key) >= len(c.prefix) && key[:len(c.prefix)] == c.prefix { + return key[len(c.prefix):] + } + return key +} + +// BulkLoad pipelines DEL + HSET + EXPIRE for every record. Returns the +// number of records loaded. +// +// The pipeline is non-transactional: it is fast on startup (when +// nothing is reading the cache) and on the live /reprefetch path (when +// the demo pauses the sync worker around the call). 
Calling BulkLoad +// on a cache that is actively being read and written to can briefly +// expose a key that has been deleted but not yet rewritten; pause the +// writers first or rewrite this with TxPipeline if that matters. +func (c *PrefetchCache) BulkLoad(ctx context.Context, records []map[string]string) (int, error) { + loaded := 0 + pipe := c.client.Pipeline() + for _, record := range records { + id := record["id"] + if id == "" { + continue + } + cacheKey := c.cacheKey(id) + pipe.Del(ctx, cacheKey) + fields := hashFields(record) + pipe.HSet(ctx, cacheKey, fields...) + pipe.Expire(ctx, cacheKey, time.Duration(c.ttlSeconds)*time.Second) + loaded++ + } + if loaded > 0 { + if _, err := pipe.Exec(ctx); err != nil { + return 0, err + } + } + c.mu.Lock() + c.prefetched += loaded + c.mu.Unlock() + return loaded, nil +} + +// GetResult bundles the record, hit/miss flag, and Redis-side latency +// for a Get call. +type GetResult struct { + Record map[string]string + Hit bool + RedisLatencyMs float64 +} + +// Get runs HGETALL against Redis and returns the cached hash with the +// hit flag and Redis-side latency in milliseconds. +// +// Prefetch-cache reads do not fall back to the primary. A miss is a +// signal that the cache is incomplete, not a trigger to re-query the +// source. The caller decides how to surface it. +func (c *PrefetchCache) Get(ctx context.Context, id string) (GetResult, error) { + cacheKey := c.cacheKey(id) + started := time.Now() + cached, err := c.client.HGetAll(ctx, cacheKey).Result() + latencyMs := float64(time.Since(started).Microseconds()) / 1000.0 + if err != nil { + return GetResult{RedisLatencyMs: latencyMs}, err + } + if len(cached) > 0 { + c.mu.Lock() + c.hits++ + c.mu.Unlock() + return GetResult{Record: cached, Hit: true, RedisLatencyMs: latencyMs}, nil + } + c.mu.Lock() + c.misses++ + c.mu.Unlock() + return GetResult{Record: nil, Hit: false, RedisLatencyMs: latencyMs}, nil +} + +// ApplyChange applies a primary change event to Redis. +// +// For an upsert, the helper rewrites the cache hash and refreshes the +// safety-net TTL in one transactional pipeline so the cache never holds +// a stale mix of old and new fields. For a delete, it removes the cache +// key. An upsert with no fields is dropped silently: HSET with an empty +// mapping errors in most clients, and there is nothing to write. +func (c *PrefetchCache) ApplyChange(ctx context.Context, change Change) error { + if change.ID == "" { + return nil + } + cacheKey := c.cacheKey(change.ID) + + switch change.Op { + case ChangeOpUpsert: + if len(change.Fields) == 0 { + // Malformed upsert with no fields. Skip rather than + // crash the sync worker: HSET with an empty mapping + // errors, and there's nothing to write anyway. A real + // CDC consumer would route this to a dead-letter queue + // and alert; the demo just drops it. + return nil + } + pipe := c.client.TxPipeline() + pipe.Del(ctx, cacheKey) + pipe.HSet(ctx, cacheKey, hashFields(change.Fields)...) 
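+		// Re-arm the safety-net TTL inside the same transaction so a
+		// healthy sync pipeline refreshes it on every upsert.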
+ pipe.Expire(ctx, cacheKey, time.Duration(c.ttlSeconds)*time.Second) + if _, err := pipe.Exec(ctx); err != nil { + return err + } + case ChangeOpDelete: + if err := c.client.Del(ctx, cacheKey).Err(); err != nil { + return err + } + default: + return nil + } + + c.mu.Lock() + c.syncEventsApplied++ + if change.TimestampMs > 0 { + nowMs := float64(time.Now().UnixNano()) / 1e6 + lag := nowMs - change.TimestampMs + if lag < 0 { + lag = 0 + } + c.syncLagMsTotal += lag + c.syncLagSamples++ + } + c.mu.Unlock() + return nil +} + +// Invalidate deletes one cache key. Returns true if a key was removed. +// Demo-only: simulates a broken sync pipeline. +func (c *PrefetchCache) Invalidate(ctx context.Context, id string) (bool, error) { + n, err := c.client.Del(ctx, c.cacheKey(id)).Result() + if err != nil { + return false, err + } + return n == 1, nil +} + +// Clear deletes every key under this cache's prefix using SCAN + DEL in +// batches. Returns the number of keys deleted. +func (c *PrefetchCache) Clear(ctx context.Context) (int, error) { + var ( + cursor uint64 + deleted int + ) + for { + keys, next, err := c.client.Scan(ctx, cursor, c.prefix+"*", 500).Result() + if err != nil { + return deleted, err + } + if len(keys) > 0 { + n, err := c.client.Del(ctx, keys...).Result() + if err != nil { + return deleted, err + } + deleted += int(n) + } + cursor = next + if cursor == 0 { + break + } + } + return deleted, nil +} + +// IDs returns every entity ID currently in the cache, sorted, with the +// prefix stripped. +func (c *PrefetchCache) IDs(ctx context.Context) ([]string, error) { + var ( + cursor uint64 + out []string + ) + for { + keys, next, err := c.client.Scan(ctx, cursor, c.prefix+"*", 500).Result() + if err != nil { + return nil, err + } + for _, k := range keys { + out = append(out, c.stripPrefix(k)) + } + cursor = next + if cursor == 0 { + break + } + } + sortStrings(out) + return out, nil +} + +// Count returns the number of keys under the cache prefix. +func (c *PrefetchCache) Count(ctx context.Context) (int, error) { + var ( + cursor uint64 + count int + ) + for { + keys, next, err := c.client.Scan(ctx, cursor, c.prefix+"*", 500).Result() + if err != nil { + return 0, err + } + count += len(keys) + cursor = next + if cursor == 0 { + break + } + } + return count, nil +} + +// TTLRemaining returns the remaining TTL on the cached key in seconds +// (Redis TTL semantics: -2 = missing, -1 = no expiry). +// +// Use Do("TTL", ...) rather than client.TTL().Result(): the latter +// returns time.Duration, encoding the -2 / -1 sentinels as raw +// nanoseconds (so a naive int(d.Seconds()) would truncate them to 0). +// Sending the raw command and reading the integer reply preserves the +// value Redis actually returned. +func (c *PrefetchCache) TTLRemaining(ctx context.Context, id string) (int, error) { + n, err := c.client.Do(ctx, "TTL", c.cacheKey(id)).Int64() + if err != nil { + return 0, err + } + return int(n), nil +} + +// Stats returns the in-process counters and derived rates. JSON keys +// are snake_case to match the other client ports. 
+func (c *PrefetchCache) Stats() map[string]any { + c.mu.Lock() + defer c.mu.Unlock() + total := c.hits + c.misses + hitRate := 0.0 + if total > 0 { + hitRate = roundTo(100.0*float64(c.hits)/float64(total), 1) + } + avgLag := 0.0 + if c.syncLagSamples > 0 { + avgLag = roundTo(c.syncLagMsTotal/float64(c.syncLagSamples), 2) + } + return map[string]any{ + "hits": c.hits, + "misses": c.misses, + "hit_rate_pct": hitRate, + "prefetched": c.prefetched, + "sync_events_applied": c.syncEventsApplied, + "sync_lag_ms_avg": avgLag, + } +} + +// ResetStats zeroes every counter. +func (c *PrefetchCache) ResetStats() { + c.mu.Lock() + c.hits = 0 + c.misses = 0 + c.prefetched = 0 + c.syncEventsApplied = 0 + c.syncLagMsTotal = 0 + c.syncLagSamples = 0 + c.mu.Unlock() +} + +// hashFields flattens a map into the [key1, val1, key2, val2, ...] slice +// go-redis expects for HSet. +func hashFields(record map[string]string) []any { + fields := make([]any, 0, len(record)*2) + for k, v := range record { + fields = append(fields, k, v) + } + return fields +} + +// roundTo rounds x to decimals digits after the decimal point. +func roundTo(x float64, decimals int) float64 { + mul := 1.0 + for i := 0; i < decimals; i++ { + mul *= 10 + } + if x >= 0 { + return float64(int64(x*mul+0.5)) / mul + } + return float64(int64(x*mul-0.5)) / mul +} + +// sortStrings sorts a slice of IDs in place. Pulled out so callers can +// rely on a sorted result regardless of SCAN return order. +func sortStrings(s []string) { + sort.Strings(s) +} diff --git a/content/develop/use-cases/prefetch-cache/go/demo_server.go b/content/develop/use-cases/prefetch-cache/go/demo_server.go new file mode 100644 index 0000000000..528137829f --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/go/demo_server.go @@ -0,0 +1,790 @@ +// Redis prefetch-cache demo server. +// +// Create a main.go file in a subdirectory (Go's package main can't live +// in the same directory as package prefetchcache): +// +// mkdir -p cmd/demo +// cat > cmd/demo/main.go <<'EOF' +// package main +// +// import "prefetchcache" +// +// func main() { prefetchcache.RunDemoServer() } +// EOF +// +// Then build and run: +// +// go mod tidy +// go run ./cmd/demo --port 8784 +// +// Visit http://localhost:8784 to watch a prefetch cache in action: the +// demo bulk-loads every primary record into Redis on startup, runs a +// background sync worker that applies primary mutations within +// milliseconds, and lets you add, update, delete, and re-prefetch +// records to see how the cache stays current without ever falling back +// to the primary on the read path. +package prefetchcache + +import ( + "context" + "encoding/json" + "flag" + "fmt" + "log" + "net/http" + "os" + "os/signal" + "strconv" + "strings" + "sync" + "syscall" + "time" + + "github.com/redis/go-redis/v9" +) + +type demoServer struct { + cache *PrefetchCache + primary *MockPrimaryStore + sync *SyncWorker +} + +// RunDemoServer parses CLI flags and starts the prefetch-cache demo +// HTTP server. It is the entry point your cmd/demo/main.go shim calls. 
+func RunDemoServer() { + host := flag.String("host", "127.0.0.1", "HTTP bind host") + port := flag.Int("port", 8784, "HTTP bind port") + redisHost := flag.String("redis-host", "localhost", "Redis host") + redisPort := flag.Int("redis-port", 6379, "Redis port") + cachePrefix := flag.String("cache-prefix", "cache:category:", "Cache key prefix") + ttlSeconds := flag.Int("ttl-seconds", 3600, "Safety-net TTL in seconds (refreshed on every sync event)") + primaryLatencyMs := flag.Int("primary-latency-ms", 80, + "Simulated primary read latency (only affects bulk loads and reconciliations)") + flag.Parse() + + client := redis.NewClient(&redis.Options{ + Addr: fmt.Sprintf("%s:%d", *redisHost, *redisPort), + }) + ctx, cancel := context.WithCancel(context.Background()) + defer cancel() + if err := client.Ping(ctx).Err(); err != nil { + log.Fatalf("could not reach Redis at %s:%d: %v", *redisHost, *redisPort, err) + } + + cache := NewPrefetchCache(client, *cachePrefix, *ttlSeconds) + primary := NewMockPrimaryStore(*primaryLatencyMs) + syncWorker := NewSyncWorker(primary, cache) + + started := time.Now() + if _, err := cache.Clear(ctx); err != nil { + log.Fatalf("clear cache: %v", err) + } + loaded, err := cache.BulkLoad(ctx, primary.ListRecords()) + if err != nil { + log.Fatalf("bulk load: %v", err) + } + elapsedMs := float64(time.Since(started).Microseconds()) / 1000.0 + syncWorker.Start() + + srv := &demoServer{cache: cache, primary: primary, sync: syncWorker} + + mux := http.NewServeMux() + mux.HandleFunc("/", srv.handleRoot) + mux.HandleFunc("/categories", srv.handleCategories) + mux.HandleFunc("/read", srv.handleRead) + mux.HandleFunc("/stats", srv.handleStats) + mux.HandleFunc("/update", srv.handleUpdate) + mux.HandleFunc("/add", srv.handleAdd) + mux.HandleFunc("/delete", srv.handleDelete) + mux.HandleFunc("/invalidate", srv.handleInvalidate) + mux.HandleFunc("/clear", srv.handleClear) + mux.HandleFunc("/reprefetch", srv.handleReprefetch) + mux.HandleFunc("/reset", srv.handleReset) + + addr := fmt.Sprintf("%s:%d", *host, *port) + httpSrv := &http.Server{Addr: addr, Handler: mux} + + go func() { + log.Printf("Redis prefetch-cache demo server listening on http://%s", addr) + log.Printf("Using Redis at %s:%d with cache prefix '%s' and TTL %ds", + *redisHost, *redisPort, *cachePrefix, *ttlSeconds) + log.Printf("Prefetched %d records in %.1f ms; sync worker running", loaded, elapsedMs) + if err := httpSrv.ListenAndServe(); err != nil && err != http.ErrServerClosed { + log.Fatalf("http server: %v", err) + } + }() + + sigCh := make(chan os.Signal, 1) + signal.Notify(sigCh, os.Interrupt, syscall.SIGTERM) + <-sigCh + log.Print("shutting down") + syncWorker.Stop(2 * time.Second) + shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 2*time.Second) + defer shutdownCancel() + _ = httpSrv.Shutdown(shutdownCtx) +} + +// --- HTTP handlers --- + +func (s *demoServer) handleRoot(w http.ResponseWriter, r *http.Request) { + if r.URL.Path != "/" && r.URL.Path != "/index.html" { + http.NotFound(w, r) + return + } + w.Header().Set("Content-Type", "text/html; charset=utf-8") + _, _ = w.Write([]byte(s.htmlPage())) +} + +func (s *demoServer) handleCategories(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodGet { + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + return + } + ids, err := s.cache.IDs(r.Context()) + if err != nil { + s.writeJSON(w, http.StatusInternalServerError, map[string]any{"error": err.Error()}) + return + } + if ids == nil { + ids = 
[]string{} + } + primaryIDs := s.primary.ListIDs() + if primaryIDs == nil { + primaryIDs = []string{} + } + s.writeJSON(w, http.StatusOK, map[string]any{ + "cache_ids": ids, + "primary_ids": primaryIDs, + }) +} + +func (s *demoServer) handleRead(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodGet { + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + return + } + id := r.URL.Query().Get("id") + if id == "" { + s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": "Missing 'id'."}) + return + } + result, err := s.cache.Get(r.Context(), id) + if err != nil { + s.writeJSON(w, http.StatusInternalServerError, map[string]any{"error": err.Error()}) + return + } + ttl, err := s.cache.TTLRemaining(r.Context(), id) + if err != nil { + s.writeJSON(w, http.StatusInternalServerError, map[string]any{"error": err.Error()}) + return + } + var record any + if result.Record == nil { + record = nil + } else { + record = result.Record + } + s.writeJSON(w, http.StatusOK, map[string]any{ + "id": id, + "record": record, + "hit": result.Hit, + "redis_latency_ms": roundTo(result.RedisLatencyMs, 2), + "ttl_remaining": ttl, + "stats": s.buildStats(), + }) +} + +func (s *demoServer) handleStats(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodGet { + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + return + } + s.writeJSON(w, http.StatusOK, s.buildStats()) +} + +func (s *demoServer) handleUpdate(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + return + } + if err := r.ParseForm(); err != nil { + s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": err.Error()}) + return + } + id := r.FormValue("id") + field := r.FormValue("field") + value := r.FormValue("value") + if id == "" || field == "" { + s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": "Missing 'id' or 'field'."}) + return + } + if !s.primary.UpdateField(id, field, value) { + s.writeJSON(w, http.StatusNotFound, map[string]any{"error": fmt.Sprintf("Unknown category '%s'.", id)}) + return + } + s.writeJSON(w, http.StatusOK, map[string]any{ + "id": id, + "field": field, + "value": value, + "stats": s.buildStats(), + }) +} + +func (s *demoServer) handleAdd(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + return + } + if err := r.ParseForm(); err != nil { + s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": err.Error()}) + return + } + id := strings.TrimSpace(r.FormValue("id")) + name := strings.TrimSpace(r.FormValue("name")) + if id == "" || name == "" { + s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": "Missing 'id' or 'name'."}) + return + } + displayOrder := r.FormValue("display_order") + if displayOrder == "" { + displayOrder = "99" + } + featured := r.FormValue("featured") + if featured == "" { + featured = "false" + } + parentID := r.FormValue("parent_id") + record := map[string]string{ + "id": id, + "name": name, + "display_order": displayOrder, + "featured": featured, + "parent_id": parentID, + } + if !s.primary.AddRecord(record) { + s.writeJSON(w, http.StatusConflict, map[string]any{"error": fmt.Sprintf("Category '%s' already exists.", id)}) + return + } + s.writeJSON(w, http.StatusOK, map[string]any{ + "id": id, + "record": record, + "stats": s.buildStats(), + }) +} + +func (s *demoServer) handleDelete(w http.ResponseWriter, r 
*http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	if err := r.ParseForm(); err != nil {
		s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": err.Error()})
		return
	}
	id := r.FormValue("id")
	if id == "" {
		s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": "Missing 'id'."})
		return
	}
	if !s.primary.DeleteRecord(id) {
		s.writeJSON(w, http.StatusNotFound, map[string]any{"error": fmt.Sprintf("Unknown category '%s'.", id)})
		return
	}
	s.writeJSON(w, http.StatusOK, map[string]any{
		"id":    id,
		"stats": s.buildStats(),
	})
}

func (s *demoServer) handleInvalidate(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	if err := r.ParseForm(); err != nil {
		s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": err.Error()})
		return
	}
	id := r.FormValue("id")
	if id == "" {
		s.writeJSON(w, http.StatusBadRequest, map[string]any{"error": "Missing 'id'."})
		return
	}
	deleted, err := s.cache.Invalidate(r.Context(), id)
	if err != nil {
		s.writeJSON(w, http.StatusInternalServerError, map[string]any{"error": err.Error()})
		return
	}
	s.writeJSON(w, http.StatusOK, map[string]any{
		"id":      id,
		"deleted": deleted,
		"stats":   s.buildStats(),
	})
}

// pauseMu serialises /clear and /reprefetch so two concurrent admin
// callers cannot interleave pause/resume and leave the sync worker in
// the wrong state (resumed mid-maintenance, or still parked after both
// handlers return).
var pauseMu sync.Mutex

func (s *demoServer) handleClear(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	pauseMu.Lock()
	defer pauseMu.Unlock()
	// Pause the sync worker so it cannot recreate keys between SCAN
	// and DEL. Queued events accumulate and apply after resume.
	// `defer Resume()` guarantees the worker is unparked even if
	// Clear panics or returns an error mid-way.
	s.sync.Pause(2 * time.Second)
	defer s.sync.Resume()
	deleted, err := s.cache.Clear(r.Context())
	if err != nil {
		s.writeJSON(w, http.StatusInternalServerError, map[string]any{"error": err.Error()})
		return
	}
	s.writeJSON(w, http.StatusOK, map[string]any{
		"deleted": deleted,
		"stats":   s.buildStats(),
	})
}
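// A quick way to exercise these endpoints from a shell (illustrative
// session; the port matches the default --port 8784):
//
//	curl 'http://localhost:8784/read?id=cat-001'   # hit: record + ttl_remaining
//	curl -X POST http://localhost:8784/clear        # empty the cache
//	curl -X POST http://localhost:8784/reprefetch   # rebuild from the primary
//
// The ttl_remaining field in the /read response uses Redis TTL
// semantics: -2 for a missing key, -1 for a key with no expiry.
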
func (s *demoServer) handleReprefetch(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	pauseMu.Lock()
	defer pauseMu.Unlock()
	// Pause the sync worker so it cannot interleave with the clear
	// + snapshot + bulk_load sequence. Without this, a change applied
	// between ListRecords() and BulkLoad() would be overwritten by
	// the stale snapshot. `defer Resume()` guarantees the worker is
	// unparked even if Clear or BulkLoad panics mid-way.
	s.sync.Pause(2 * time.Second)
	defer s.sync.Resume()
	started := time.Now()
	var loaded int
	var clearErr, loadErr error
	if _, clearErr = s.cache.Clear(r.Context()); clearErr == nil {
		loaded, loadErr = s.cache.BulkLoad(r.Context(), s.primary.ListRecords())
	}
	elapsedMs := float64(time.Since(started).Microseconds()) / 1000.0
	if clearErr != nil {
		s.writeJSON(w, http.StatusInternalServerError, map[string]any{"error": clearErr.Error()})
		return
	}
	if loadErr != nil {
		s.writeJSON(w, http.StatusInternalServerError, map[string]any{"error": loadErr.Error()})
		return
	}
	s.writeJSON(w, http.StatusOK, map[string]any{
		"loaded":     loaded,
		"elapsed_ms": roundTo(elapsedMs, 2),
		"stats":      s.buildStats(),
	})
}
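// Without the pause, one interleaving silently loses an update
// (hypothetical timeline, not code from the demo):
//
//	t0  /reprefetch: snapshot := primary.ListRecords()
//	t1  /update:     primary mutates a record, emits upsert U
//	t2  sync worker: applies U to Redis
//	t3  /reprefetch: BulkLoad(snapshot) overwrites U with stale data
//
// Holding the worker paused across clear + snapshot + load closes the
// window: U stays queued until the load has finished.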

func (s *demoServer) handleReset(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	s.cache.ResetStats()
	s.primary.ResetReads()
	s.writeJSON(w, http.StatusOK, s.buildStats())
}

// --- helpers ---

func (s *demoServer) buildStats() map[string]any {
	stats := s.cache.Stats()
	stats["primary_reads_total"] = s.primary.Reads()
	stats["primary_read_latency_ms"] = s.primary.ReadLatencyMs()
	return stats
}
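// For reference, a /stats response assembled from these counters looks
// roughly like this (values illustrative only):
//
//	{
//	  "hits": 42, "misses": 0, "hit_rate_pct": 100.0,
//	  "prefetched": 5, "sync_events_applied": 3, "sync_lag_ms_avg": 1.87,
//	  "primary_reads_total": 1, "primary_read_latency_ms": 80
//	}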
func (s *demoServer) writeJSON(w http.ResponseWriter, status int, payload any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(payload)
}

func (s *demoServer) htmlPage() string {
	return strings.ReplaceAll(htmlTemplate, "__CACHE_TTL__", strconv.Itoa(s.cache.TTLSeconds()))
}

// htmlTemplate is the inline demo UI, ported verbatim from the Python
// reference. The only substitutions are the pill text (top of the page)
// and __CACHE_TTL__.
const htmlTemplate = `
<!-- The full inline HTML/JS demo page is not reproduced here: its markup
     was lost in extraction. Recoverable content: pill text "go-redis +
     Go net/http"; page title "Redis Prefetch Cache Demo"; intro copy
     explaining that every record is pre-loaded into Redis, reads run
     HGETALL against Redis only with no fall-back to the primary, a
     background sync worker applies mutations within milliseconds, and a
     safety-net TTL (__CACHE_TTL__ s) is refreshed on every add/update
     event; panels "Cache state", "Read a category", "Update a field",
     "Add a category", "Delete a category", "Break the cache",
     "Cache stats", and "Last result". -->
+ + + + +` diff --git a/content/develop/use-cases/prefetch-cache/go/go.mod b/content/develop/use-cases/prefetch-cache/go/go.mod new file mode 100644 index 0000000000..f2620d662b --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/go/go.mod @@ -0,0 +1,11 @@ +module prefetchcache + +go 1.23 + +require github.com/redis/go-redis/v9 v9.18.0 + +require ( + github.com/cespare/xxhash/v2 v2.3.0 // indirect + github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect + go.uber.org/atomic v1.11.0 // indirect +) diff --git a/content/develop/use-cases/prefetch-cache/go/go.sum b/content/develop/use-cases/prefetch-cache/go/go.sum new file mode 100644 index 0000000000..e25b1f4d0a --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/go/go.sum @@ -0,0 +1,22 @@ +github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs= +github.com/bsm/ginkgo/v2 v2.12.0/go.mod h1:SwYbGRRDovPVboqFv0tPTcG1sN61LM1Z4ARdbAV9g4c= +github.com/bsm/gomega v1.27.10 h1:yeMWxP2pV2fG3FgAODIY8EiRE3dy0aeFYt4l7wh6yKA= +github.com/bsm/gomega v1.27.10/go.mod h1:JyEr/xRbxbtgWNi8tIEVPUYZ5Dzef52k01W3YH0H+O0= +github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs= +github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= +github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= +github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= +github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f h1:lO4WD4F/rVNCu3HqELle0jiPLLBs70cWOduZpkS1E78= +github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f/go.mod h1:cuUVRXasLTGF7a8hSLbxyZXjz+1KgoB3wDUb6vlszIc= +github.com/klauspost/cpuid/v2 v2.0.9 h1:lgaqFMSdTdQYdZ04uHyN2d/eKdOMyi2YLSvlQIBFYa4= +github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg= +github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= +github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= +github.com/redis/go-redis/v9 v9.18.0 h1:pMkxYPkEbMPwRdenAzUNyFNrDgHx9U+DrBabWNfSRQs= +github.com/redis/go-redis/v9 v9.18.0/go.mod h1:k3ufPphLU5YXwNTUcCRXGxUoF1fqxnhFQmscfkCoDA0= +github.com/stretchr/testify v1.3.0 h1:TivCn/peBQ7UY8ooIcPgZFpTNSz0Q2U6UrFlUfqbe0Q= +github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= +github.com/zeebo/xxh3 v1.0.2 h1:xZmwmqxHZA8AI603jOQ0tMqmBr9lPeFwGg6d+xy9DC0= +github.com/zeebo/xxh3 v1.0.2/go.mod h1:5NWz9Sef7zIDm2JHfFlcQvNekmcEl9ekUZQQKCYaDcA= +go.uber.org/atomic v1.11.0 h1:ZvwS0R+56ePWxUNi+Atn9dWONBPp/AUETXlHW0DxSjE= +go.uber.org/atomic v1.11.0/go.mod h1:LUxbIzbOniOlMKjJjyPfpl4v+PKK2cNJn91OQbhoJI0= diff --git a/content/develop/use-cases/prefetch-cache/go/primary.go b/content/develop/use-cases/prefetch-cache/go/primary.go new file mode 100644 index 0000000000..4e3ffc0906 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/go/primary.go @@ -0,0 +1,252 @@ +// Mock primary data store for the prefetch-cache demo. +// +// This stands in for a source-of-truth database (Postgres, MySQL, Mongo, +// etc.) that holds reference data the application serves to users. +// +// Every mutation appends a change event to an in-process Go channel, +// which the sync worker drains and applies to Redis. 
In a real system +// the channel is replaced by a CDC pipeline -- Redis Data Integration, +// Debezium plus a lightweight consumer, or an equivalent tool that +// tails the source's binlog/WAL and pushes changes into Redis. +// +// The store also exposes ReadLatencyMs so the demo can illustrate how +// much slower a direct primary read would be than a Redis hit. +package prefetchcache + +import ( + "sort" + "sync" + "time" +) + +// Change op constants emitted on the change feed. +const ( + ChangeOpUpsert = "upsert" + ChangeOpDelete = "delete" +) + +// Change is a primary mutation event. Fields is nil for delete ops. +type Change struct { + Op string + ID string + Fields map[string]string + TimestampMs float64 +} + +// MockPrimaryStore is an in-memory stand-in for a primary database of +// reference data. +type MockPrimaryStore struct { + readLatencyMs int + + mu sync.Mutex + reads int + changes chan Change + records map[string]map[string]string +} + +// NewMockPrimaryStore returns a MockPrimaryStore seeded with the same +// five sample records as the Python reference. The change channel is +// buffered generously so concurrent mutations never block the producer. +func NewMockPrimaryStore(readLatencyMs int) *MockPrimaryStore { + return &MockPrimaryStore{ + readLatencyMs: readLatencyMs, + changes: make(chan Change, 1024), + records: map[string]map[string]string{ + "cat-001": { + "id": "cat-001", + "name": "Beverages", + "display_order": "1", + "featured": "true", + "parent_id": "", + }, + "cat-002": { + "id": "cat-002", + "name": "Bakery", + "display_order": "2", + "featured": "true", + "parent_id": "", + }, + "cat-003": { + "id": "cat-003", + "name": "Pantry Staples", + "display_order": "3", + "featured": "false", + "parent_id": "", + }, + "cat-004": { + "id": "cat-004", + "name": "Frozen", + "display_order": "4", + "featured": "false", + "parent_id": "", + }, + "cat-005": { + "id": "cat-005", + "name": "Specialty Cheeses", + "display_order": "5", + "featured": "false", + "parent_id": "cat-002", + }, + }, + } +} + +// ReadLatencyMs returns the configured simulated read latency. +func (p *MockPrimaryStore) ReadLatencyMs() int { + return p.readLatencyMs +} + +// ListIDs returns the primary record IDs in sorted order. No sleep, no +// counter increment -- this stands in for a fast metadata query (for example, +// SELECT id FROM categories) rather than a full record read. +func (p *MockPrimaryStore) ListIDs() []string { + p.mu.Lock() + defer p.mu.Unlock() + ids := make([]string, 0, len(p.records)) + for id := range p.records { + ids = append(ids, id) + } + sort.Strings(ids) + return ids +} + +// ListRecords returns every record. Used by the cache's bulk-load path +// on startup. Sleeps for ReadLatencyMs and increments the read counter. +func (p *MockPrimaryStore) ListRecords() []map[string]string { + time.Sleep(time.Duration(p.readLatencyMs) * time.Millisecond) + p.mu.Lock() + defer p.mu.Unlock() + p.reads++ + out := make([]map[string]string, 0, len(p.records)) + // Iterate sorted IDs so the snapshot order is deterministic. + ids := make([]string, 0, len(p.records)) + for id := range p.records { + ids = append(ids, id) + } + sort.Strings(ids) + for _, id := range ids { + out = append(out, copyRecord(p.records[id])) + } + return out +} + +// Read returns a single record by id, or nil if absent. Not on the +// demo's normal read path. 
+func (p *MockPrimaryStore) Read(id string) map[string]string { + time.Sleep(time.Duration(p.readLatencyMs) * time.Millisecond) + p.mu.Lock() + defer p.mu.Unlock() + p.reads++ + rec, ok := p.records[id] + if !ok { + return nil + } + return copyRecord(rec) +} + +// AddRecord inserts a record if id is absent and emits an upsert event. +// Returns false if the id already exists or is empty. +func (p *MockPrimaryStore) AddRecord(record map[string]string) bool { + id := record["id"] + if id == "" { + return false + } + p.mu.Lock() + defer p.mu.Unlock() + if _, exists := p.records[id]; exists { + return false + } + p.records[id] = copyRecord(record) + // Emit while the lock is held so the channel order matches the + // mutation order. Two concurrent callers cannot interleave + // mutation A -> mutation B -> emit B -> emit A. + p.emitChangeLocked(ChangeOpUpsert, id, copyRecord(p.records[id])) + return true +} + +// UpdateField updates a single field in place and emits an upsert event. +// Returns false if the id is unknown. +func (p *MockPrimaryStore) UpdateField(id, field, value string) bool { + p.mu.Lock() + defer p.mu.Unlock() + rec, ok := p.records[id] + if !ok { + return false + } + rec[field] = value + p.emitChangeLocked(ChangeOpUpsert, id, copyRecord(rec)) + return true +} + +// DeleteRecord removes a record and emits a delete event. Returns false +// if the id is unknown. +func (p *MockPrimaryStore) DeleteRecord(id string) bool { + p.mu.Lock() + defer p.mu.Unlock() + if _, ok := p.records[id]; !ok { + return false + } + delete(p.records, id) + p.emitChangeLocked(ChangeOpDelete, id, nil) + return true +} + +// NextChange blocks up to timeout for the next change event. Returns +// (zero Change, false) if the timeout elapsed with nothing on the +// channel. The boolean disambiguates a zero-value Change from a +// genuine timeout. +func (p *MockPrimaryStore) NextChange(timeout time.Duration) (Change, bool) { + if timeout <= 0 { + select { + case change := <-p.changes: + return change, true + default: + return Change{}, false + } + } + timer := time.NewTimer(timeout) + defer timer.Stop() + select { + case change := <-p.changes: + return change, true + case <-timer.C: + return Change{}, false + } +} + +// Reads returns the cumulative number of full-record reads since the +// counter was last reset. +func (p *MockPrimaryStore) Reads() int { + p.mu.Lock() + defer p.mu.Unlock() + return p.reads +} + +// ResetReads zeroes the primary read counter. +func (p *MockPrimaryStore) ResetReads() { + p.mu.Lock() + p.reads = 0 + p.mu.Unlock() +} + +// emitChangeLocked appends a change event to the feed. The caller must +// hold p.mu. Channel sends are themselves thread-safe and never try to +// acquire p.mu, so sending while holding the records lock cannot +// deadlock; holding the lock here is what guarantees that the channel +// order matches the order in which the records map was mutated. 
+func (p *MockPrimaryStore) emitChangeLocked(op, id string, fields map[string]string) { + p.changes <- Change{ + Op: op, + ID: id, + Fields: fields, + TimestampMs: float64(time.Now().UnixNano()) / 1e6, + } +} + +func copyRecord(in map[string]string) map[string]string { + out := make(map[string]string, len(in)) + for k, v := range in { + out[k] = v + } + return out +} diff --git a/content/develop/use-cases/prefetch-cache/go/sync_worker.go b/content/develop/use-cases/prefetch-cache/go/sync_worker.go new file mode 100644 index 0000000000..49ae57fe1f --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/go/sync_worker.go @@ -0,0 +1,215 @@ +// Background sync worker for the prefetch-cache demo. +// +// A daemon goroutine drains the primary's change channel and applies +// each event to Redis through PrefetchCache.ApplyChange. In a real +// system, the channel is replaced by a CDC pipeline (Redis Data +// Integration, Debezium, or an equivalent) that tails the primary's +// binlog/WAL and writes the same shape of events. +// +// The worker exposes Pause() and Resume() so maintenance paths +// (/reprefetch, Clear()) can stop event application without tearing +// the goroutine down. Pause() blocks until the worker is parked, so +// the caller knows no apply is in flight by the time it returns. +package prefetchcache + +import ( + "context" + "log" + "sync" + "time" +) + +// SyncWorker drains primary change events into Redis on a goroutine. +type SyncWorker struct { + primary *MockPrimaryStore + cache *PrefetchCache + pollTimeout time.Duration + + mu sync.Mutex + running bool + cancel context.CancelFunc + done chan struct{} + paused bool + pausedIdle chan struct{} // closed (replaced with a fresh chan) every time the loop parks + resumeCh chan struct{} // closed by Resume to wake the parked loop +} + +// NewSyncWorker creates a SyncWorker. The default poll timeout (50 ms) +// matches the Python reference. +func NewSyncWorker(primary *MockPrimaryStore, cache *PrefetchCache) *SyncWorker { + return &SyncWorker{ + primary: primary, + cache: cache, + pollTimeout: 50 * time.Millisecond, + pausedIdle: make(chan struct{}), + resumeCh: make(chan struct{}), + } +} + +// Start spawns the worker goroutine if it is not already running. +func (w *SyncWorker) Start() { + w.mu.Lock() + defer w.mu.Unlock() + if w.running { + return + } + ctx, cancel := context.WithCancel(context.Background()) + done := make(chan struct{}) + w.cancel = cancel + w.done = done + w.running = true + go w.run(ctx, done) +} + +// Stop signals the worker to exit and waits up to joinTimeout for the +// goroutine to finish. +// +// If the join times out the worker is wedged inside ApplyChange; we +// leave w.running true so a subsequent Start() does not spawn a second +// worker on top of the orphan. +func (w *SyncWorker) Stop(joinTimeout time.Duration) { + w.mu.Lock() + if !w.running { + w.mu.Unlock() + return + } + cancel := w.cancel + done := w.done + // Close resumeCh inside the lock so a concurrent Resume cannot + // pass the "already closed?" check and then race us to close() + // the same channel twice (which would panic). + closeOnce(w.resumeCh) + w.mu.Unlock() + + cancel() + + select { + case <-done: + w.mu.Lock() + w.running = false + w.cancel = nil + w.done = nil + w.mu.Unlock() + case <-time.After(joinTimeout): + // Worker is wedged: leave running=true so Start() is a no-op + // rather than producing a second worker. 
+ } +} + +// Pause sets the pause flag and blocks until the worker confirms it is +// parked, up to timeout. Returns true if confirmed paused. +// +// While paused, change events accumulate on the primary's channel and +// apply in order after Resume(). Calling Pause while already paused is +// idempotent and returns immediately. +func (w *SyncWorker) Pause(timeout time.Duration) bool { + w.mu.Lock() + if !w.running { + w.paused = true + w.mu.Unlock() + return true + } + if w.paused { + idle := w.pausedIdle + w.mu.Unlock() + // Already paused -- wait for the current idle signal (which + // is closed once the worker is parked). + select { + case <-idle: + return true + case <-time.After(timeout): + return false + } + } + // Replace the resume channel with a fresh one so any prior + // Resume() does not immediately unblock this pause. + w.resumeCh = make(chan struct{}) + // Reset the idle channel: a fresh one will be closed by the + // worker when it parks. + w.pausedIdle = make(chan struct{}) + idle := w.pausedIdle + w.paused = true + w.mu.Unlock() + + select { + case <-idle: + return true + case <-time.After(timeout): + return false + } +} + +// Resume clears the pause flag and wakes the parked worker goroutine. +func (w *SyncWorker) Resume() { + w.mu.Lock() + defer w.mu.Unlock() + if !w.paused { + return + } + w.paused = false + // Close inside the lock so a concurrent Stop cannot pass the + // "already closed?" check and then race us to close() the same + // channel twice (which would panic). + closeOnce(w.resumeCh) +} + +// closeOnce closes ch if it isn't already closed. Callers MUST hold +// w.mu while invoking it (the non-blocking receive + close pair is not +// atomic on its own; the mutex provides the missing serialisation). +func closeOnce(ch chan struct{}) { + select { + case <-ch: + // Already closed. + default: + close(ch) + } +} + +func (w *SyncWorker) run(ctx context.Context, done chan struct{}) { + defer close(done) + for { + // Bail out promptly on cancel. + select { + case <-ctx.Done(): + return + default: + } + + // Park if paused. + w.mu.Lock() + paused := w.paused + idle := w.pausedIdle + resumeCh := w.resumeCh + w.mu.Unlock() + if paused { + // Signal "I am parked, no apply in flight". Closing the + // channel lets every waiter on Pause() observe it. + select { + case <-idle: + // Already closed -- nothing to do. + default: + close(idle) + } + // Wait for Resume() to close resumeCh, or for Stop() to + // cancel the context. + select { + case <-resumeCh: + case <-ctx.Done(): + return + } + continue + } + + change, ok := w.primary.NextChange(w.pollTimeout) + if !ok { + continue + } + if err := w.cache.ApplyChange(ctx, change); err != nil { + // Demo behaviour: log and drop the event. A production + // CDC consumer would retry with bounded backoff and + // expose a dead-letter / error counter; see the guide's + // "Production usage" section. 
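			// A bounded-backoff retry here might look like this sketch
			// (illustrative only -- maxAttempts and the 50 ms base are
			// invented for the example, not part of the demo):
			//
			//	for attempt := 1; attempt <= maxAttempts; attempt++ {
			//		if err = w.cache.ApplyChange(ctx, change); err == nil {
			//			break
			//		}
			//		time.Sleep(time.Duration(attempt) * 50 * time.Millisecond)
			//	}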
+ log.Printf("[sync] failed to apply %s %s: %v", change.Op, change.ID, err) + } + } +} diff --git a/content/develop/use-cases/prefetch-cache/java-jedis/DemoServer.java b/content/develop/use-cases/prefetch-cache/java-jedis/DemoServer.java new file mode 100644 index 0000000000..186f7cf455 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-jedis/DemoServer.java @@ -0,0 +1,866 @@ +import com.sun.net.httpserver.HttpExchange; +import com.sun.net.httpserver.HttpHandler; +import com.sun.net.httpserver.HttpServer; +import redis.clients.jedis.JedisPool; +import redis.clients.jedis.JedisPoolConfig; + +import java.io.IOException; +import java.io.InputStream; +import java.io.OutputStream; +import java.net.InetSocketAddress; +import java.net.URLDecoder; +import java.nio.charset.StandardCharsets; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.Executors; + +/** + * Redis prefetch-cache demo server (Jedis + JDK HttpServer). + * + *

 * <p>Run this file and visit {@code http://localhost:8785} to watch a
 * prefetch cache in action: the demo bulk-loads every primary record
 * into Redis on startup, runs a background sync worker that applies
 * primary mutations within milliseconds, and lets you add, update,
 * delete, and re-prefetch records to see how the cache stays current
 * without ever falling back to the primary on the read path.
 *
 * <pre>{@code
 * javac -cp jedis-5.1.2.jar:slf4j-api-2.0.13.jar \
 *     PrefetchCache.java MockPrimaryStore.java SyncWorker.java DemoServer.java
 * java -cp .:jedis-5.1.2.jar:slf4j-api-2.0.13.jar \
 *     DemoServer --port 8785 --redis-host localhost --redis-port 6379
 * }</pre>
+ */ +public class DemoServer { + + private static PrefetchCache cache; + private static MockPrimaryStore primary; + private static SyncWorker sync; + private static JedisPool jedisPool; + + public static void main(String[] args) { + String host = "127.0.0.1"; + int port = 8785; + String redisHost = "localhost"; + int redisPort = 6379; + String cachePrefix = PrefetchCache.DEFAULT_PREFIX; + int ttlSeconds = PrefetchCache.DEFAULT_TTL_SECONDS; + int primaryLatencyMs = 80; + + for (int i = 0; i < args.length; i++) { + switch (args[i]) { + case "--host": + host = args[++i]; + break; + case "--port": + port = Integer.parseInt(args[++i]); + break; + case "--redis-host": + redisHost = args[++i]; + break; + case "--redis-port": + redisPort = Integer.parseInt(args[++i]); + break; + case "--cache-prefix": + cachePrefix = args[++i]; + break; + case "--ttl-seconds": + ttlSeconds = Integer.parseInt(args[++i]); + break; + case "--primary-latency-ms": + primaryLatencyMs = Integer.parseInt(args[++i]); + break; + default: + break; + } + } + + try { + jedisPool = new JedisPool(new JedisPoolConfig(), redisHost, redisPort); + jedisPool.getResource().close(); + } catch (Exception e) { + System.err.printf("Failed to connect to Redis at %s:%d: %s%n", redisHost, redisPort, e.getMessage()); + System.exit(1); + } + + cache = new PrefetchCache(jedisPool, cachePrefix, ttlSeconds); + primary = new MockPrimaryStore(primaryLatencyMs); + sync = new SyncWorker(primary, cache); + + long startedNs = System.nanoTime(); + cache.clear(); + int loaded = cache.bulkLoad(primary.listRecords()); + double elapsedMs = (System.nanoTime() - startedNs) / 1_000_000.0; + sync.start(); + + try { + HttpServer server = HttpServer.create(new InetSocketAddress(host, port), 0); + server.createContext("/", new RootHandler()); + server.createContext("/categories", new CategoriesHandler()); + server.createContext("/read", new ReadHandler()); + server.createContext("/stats", new StatsHandler()); + server.createContext("/update", new UpdateHandler()); + server.createContext("/add", new AddHandler()); + server.createContext("/delete", new DeleteHandler()); + server.createContext("/invalidate", new InvalidateHandler()); + server.createContext("/clear", new ClearHandler()); + server.createContext("/reprefetch", new ReprefetchHandler()); + server.createContext("/reset", new ResetHandler()); + server.setExecutor(Executors.newFixedThreadPool(16)); + server.start(); + + System.out.printf("Redis prefetch-cache demo server listening on http://%s:%d%n", host, port); + System.out.printf( + "Using Redis at %s:%d with cache prefix '%s' and TTL %ds%n", + redisHost, redisPort, cachePrefix, ttlSeconds); + System.out.printf("Prefetched %d records in %.1f ms; sync worker running%n", loaded, elapsedMs); + + Runtime.getRuntime().addShutdownHook(new Thread(() -> { + sync.stop(); + server.stop(0); + jedisPool.close(); + })); + } catch (IOException e) { + System.err.println("Failed to start server: " + e.getMessage()); + sync.stop(); + jedisPool.close(); + System.exit(1); + } + } + + /* --- Handlers --- */ + + static class RootHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + String path = exchange.getRequestURI().getPath(); + if (!"GET".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + if (path.equals("/") || path.equals("/index.html")) { + byte[] body = renderHtmlPage(cache.getTtlSeconds()).getBytes(StandardCharsets.UTF_8); + 
exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8"); + exchange.sendResponseHeaders(200, body.length); + try (OutputStream os = exchange.getResponseBody()) { + os.write(body); + } + return; + } + sendJson(exchange, 404, "{\"error\":\"Not Found\"}"); + } + } + + static class CategoriesHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + Map payload = new LinkedHashMap<>(); + payload.put("cache_ids", cache.ids()); + payload.put("primary_ids", primary.listIds()); + sendJson(exchange, 200, toJson(payload)); + } + } + + static class ReadHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + Map query = parseQuery(exchange.getRequestURI().getQuery()); + String id = query.getOrDefault("id", ""); + if (id.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id'.\"}"); + return; + } + PrefetchCache.Result result = cache.get(id); + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("record", result.record); + response.put("hit", result.hit); + response.put("redis_latency_ms", round2(result.redisLatencyMs)); + response.put("ttl_remaining", cache.ttlRemaining(id)); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class StatsHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + sendJson(exchange, 200, toJson(buildStats())); + } + } + + static class UpdateHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", ""); + String field = form.getOrDefault("field", ""); + String value = form.getOrDefault("value", ""); + if (id.isEmpty() || field.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id' or 'field'.\"}"); + return; + } + if (!primary.updateField(id, field, value)) { + sendJson(exchange, 404, "{\"error\":\"Unknown category '" + jsonEscape(id) + "'.\"}"); + return; + } + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("field", field); + response.put("value", value); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class AddHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", "").trim(); + String name = form.getOrDefault("name", "").trim(); + if (id.isEmpty() || name.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id' or 'name'.\"}"); + return; + } + String displayOrder = form.getOrDefault("display_order", "99"); + if (displayOrder.isEmpty()) { + displayOrder = "99"; + } + String featured = form.getOrDefault("featured", "false"); + if (featured.isEmpty()) { + featured = "false"; + } + String parentId = form.getOrDefault("parent_id", ""); + Map record = new LinkedHashMap<>(); + record.put("id", id); + record.put("name", name); + record.put("display_order", displayOrder); + record.put("featured", featured); + record.put("parent_id", 
parentId); + if (!primary.addRecord(record)) { + sendJson(exchange, 409, "{\"error\":\"Category '" + jsonEscape(id) + "' already exists.\"}"); + return; + } + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("record", record); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class DeleteHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", ""); + if (id.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id'.\"}"); + return; + } + if (!primary.deleteRecord(id)) { + sendJson(exchange, 404, "{\"error\":\"Unknown category '" + jsonEscape(id) + "'.\"}"); + return; + } + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class InvalidateHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", ""); + if (id.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id'.\"}"); + return; + } + boolean deleted = cache.invalidate(id); + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("deleted", deleted); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class ClearHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + // Pause the sync worker so it cannot recreate keys between + // SCAN and DEL. Queued events accumulate and apply after resume. + sync.pause(); + int deleted; + try { + deleted = cache.clear(); + } finally { + sync.resume(); + } + Map response = new LinkedHashMap<>(); + response.put("deleted", deleted); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class ReprefetchHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + // Pause the sync worker so it cannot interleave with the + // clear + snapshot + bulkLoad sequence. Without this, a + // change applied between listRecords() and bulkLoad() would + // be overwritten by the stale snapshot. 
+ sync.pause(); + int loaded; + double elapsedMs; + try { + long startedNs = System.nanoTime(); + cache.clear(); + loaded = cache.bulkLoad(primary.listRecords()); + elapsedMs = (System.nanoTime() - startedNs) / 1_000_000.0; + } finally { + sync.resume(); + } + Map response = new LinkedHashMap<>(); + response.put("loaded", loaded); + response.put("elapsed_ms", round2(elapsedMs)); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class ResetHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + cache.resetStats(); + primary.resetReads(); + sendJson(exchange, 200, toJson(buildStats())); + } + } + + /* --- helpers --- */ + + private static Map buildStats() { + Map stats = cache.stats(); + stats.put("primary_reads_total", primary.getReads()); + stats.put("primary_read_latency_ms", primary.getReadLatencyMs()); + return stats; + } + + private static double round2(double value) { + return Math.round(value * 100.0) / 100.0; + } + + private static String readRequestBody(HttpExchange exchange) throws IOException { + try (InputStream inputStream = exchange.getRequestBody()) { + return new String(inputStream.readAllBytes(), StandardCharsets.UTF_8); + } + } + + private static Map parseFormData(String body) { + Map params = new HashMap<>(); + if (body == null || body.isEmpty()) { + return params; + } + for (String pair : body.split("&")) { + String[] kv = pair.split("=", 2); + if (kv.length != 2 || kv[0].isEmpty()) { + continue; + } + params.put(URLDecoder.decode(kv[0], StandardCharsets.UTF_8), + URLDecoder.decode(kv[1], StandardCharsets.UTF_8)); + } + return params; + } + + private static Map parseQuery(String query) { + if (query == null || query.isEmpty()) { + return new HashMap<>(); + } + return parseFormData(query); + } + + private static void sendJson(HttpExchange exchange, int status, String body) throws IOException { + byte[] bytes = body.getBytes(StandardCharsets.UTF_8); + exchange.getResponseHeaders().set("Content-Type", "application/json"); + exchange.sendResponseHeaders(status, bytes.length); + try (OutputStream os = exchange.getResponseBody()) { + os.write(bytes); + } + } + + private static String toJson(Object value) { + StringBuilder sb = new StringBuilder(); + appendJson(sb, value); + return sb.toString(); + } + + @SuppressWarnings("unchecked") + private static void appendJson(StringBuilder sb, Object value) { + if (value == null) { + sb.append("null"); + } else if (value instanceof Boolean) { + sb.append(value); + } else if (value instanceof Number) { + sb.append(value); + } else if (value instanceof Map) { + sb.append('{'); + boolean first = true; + for (Map.Entry entry : ((Map) value).entrySet()) { + if (!first) sb.append(','); + first = false; + appendJsonString(sb, String.valueOf(entry.getKey())); + sb.append(':'); + appendJson(sb, entry.getValue()); + } + sb.append('}'); + } else if (value instanceof List) { + sb.append('['); + boolean first = true; + for (Object item : (List) value) { + if (!first) sb.append(','); + first = false; + appendJson(sb, item); + } + sb.append(']'); + } else { + appendJsonString(sb, String.valueOf(value)); + } + } + + private static void appendJsonString(StringBuilder sb, String value) { + sb.append('"').append(jsonEscape(value)).append('"'); + } + + private static String jsonEscape(String value) { + 
        StringBuilder sb = new StringBuilder(value.length() + 4);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    private static String renderHtmlPage(int cacheTtlSeconds) {
        return HTML_TEMPLATE.replace("__CACHE_TTL__", Integer.toString(cacheTtlSeconds));
    }

    private static final String HTML_TEMPLATE = """
            <!-- The full inline HTML/JS demo page is not reproduced here:
                 its markup was lost in extraction. It is the same demo UI
                 as the other ports, with pill text "Jedis + JDK HttpServer",
                 page title "Redis Prefetch Cache Demo", the same intro copy,
                 and the panels "Cache state", "Read a category", "Update a
                 field", "Add a category", "Delete a category", "Break the
                 cache", "Cache stats", and "Last result". -->
+ + + + +"""; +} diff --git a/content/develop/use-cases/prefetch-cache/java-jedis/MockPrimaryStore.java b/content/develop/use-cases/prefetch-cache/java-jedis/MockPrimaryStore.java new file mode 100644 index 0000000000..326864410b --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-jedis/MockPrimaryStore.java @@ -0,0 +1,181 @@ +import java.util.ArrayList; +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.TreeMap; +import java.util.concurrent.LinkedBlockingQueue; +import java.util.concurrent.TimeUnit; + +/** + * Mock primary data store for the prefetch-cache demo. + * + *

 * <p>This stands in for a source-of-truth database (Postgres, MySQL,
 * Mongo, etc.) that holds reference data the application serves to
 * users. Every mutation appends a change event to an in-process queue,
 * which the sync worker drains and applies to Redis. In a real system
 * the queue is replaced by a CDC pipeline — Redis Data
 * Integration, Debezium plus a lightweight consumer, or an equivalent
 * tool that tails the source's binlog/WAL and pushes events into
 * Redis.
 *
 * <p>The store also exposes {@code readLatencyMs} so the demo can
 * illustrate how much slower a direct primary read would be than a
 * Redis hit.
+ */ +public class MockPrimaryStore { + + public static final String OP_UPSERT = "upsert"; + public static final String OP_DELETE = "delete"; + + private final int readLatencyMs; + private final Object lock = new Object(); + private long reads; + private final LinkedBlockingQueue> changes = new LinkedBlockingQueue<>(); + private final Map> records = new TreeMap<>(); + + public MockPrimaryStore() { + this(80); + } + + public MockPrimaryStore(int readLatencyMs) { + this.readLatencyMs = readLatencyMs; + seed("cat-001", "Beverages", "1", "true", ""); + seed("cat-002", "Bakery", "2", "true", ""); + seed("cat-003", "Pantry Staples", "3", "false", ""); + seed("cat-004", "Frozen", "4", "false", ""); + seed("cat-005", "Specialty Cheeses", "5", "false", "cat-002"); + } + + public int getReadLatencyMs() { + return readLatencyMs; + } + + /** Sorted IDs. No sleep, no counter increment (metadata-only query). */ + public List listIds() { + synchronized (lock) { + List ids = new ArrayList<>(records.keySet()); + Collections.sort(ids); + return ids; + } + } + + /** Slow read of every record. Used by the cache's bulk-load path. */ + public List> listRecords() { + sleepLatency(); + synchronized (lock) { + reads++; + List> out = new ArrayList<>(records.size()); + for (Map record : records.values()) { + out.add(new LinkedHashMap<>(record)); + } + return out; + } + } + + /** Single-record read. Not on the demo's normal read path. */ + public Map read(String entityId) { + sleepLatency(); + synchronized (lock) { + reads++; + Map record = records.get(entityId); + return record == null ? null : new LinkedHashMap<>(record); + } + } + + public boolean addRecord(Map record) { + if (record == null) { + return false; + } + String entityId = record.getOrDefault("id", "").trim(); + if (entityId.isEmpty()) { + return false; + } + synchronized (lock) { + if (records.containsKey(entityId)) { + return false; + } + Map stored = new LinkedHashMap<>(record); + records.put(entityId, stored); + // Emit while the lock is held so the queue order matches the + // mutation order. Two concurrent callers cannot interleave + // mutation A -> mutation B -> emit B -> emit A. + emitChangeLocked(OP_UPSERT, entityId, new LinkedHashMap<>(stored)); + } + return true; + } + + public boolean updateField(String entityId, String field, String value) { + synchronized (lock) { + Map record = records.get(entityId); + if (record == null) { + return false; + } + record.put(field, value); + emitChangeLocked(OP_UPSERT, entityId, new LinkedHashMap<>(record)); + } + return true; + } + + public boolean deleteRecord(String entityId) { + synchronized (lock) { + if (!records.containsKey(entityId)) { + return false; + } + records.remove(entityId); + emitChangeLocked(OP_DELETE, entityId, null); + } + return true; + } + + /** Block up to {@code timeoutMs} milliseconds for the next change event. */ + public Map nextChange(long timeoutMs) throws InterruptedException { + return changes.poll(timeoutMs, TimeUnit.MILLISECONDS); + } + + public long getReads() { + synchronized (lock) { + return reads; + } + } + + public void resetReads() { + synchronized (lock) { + reads = 0; + } + } + + private void emitChangeLocked(String op, String entityId, Map fields) { + // queue.put is thread-safe and never tries to acquire `lock`, so + // calling it while holding the records lock cannot deadlock. + // Holding the lock here is what guarantees that the queue order + // matches the order in which the records map was mutated. 
+ Map event = new LinkedHashMap<>(); + event.put("op", op); + event.put("id", entityId); + event.put("fields", fields); + event.put("timestamp_ms", (double) System.currentTimeMillis()); + changes.add(event); + } + + private void sleepLatency() { + if (readLatencyMs <= 0) { + return; + } + try { + Thread.sleep(readLatencyMs); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + } + } + + private void seed(String id, String name, String displayOrder, String featured, String parentId) { + Map record = new LinkedHashMap<>(); + record.put("id", id); + record.put("name", name); + record.put("display_order", displayOrder); + record.put("featured", featured); + record.put("parent_id", parentId); + records.put(id, record); + } +} diff --git a/content/develop/use-cases/prefetch-cache/java-jedis/PrefetchCache.java b/content/develop/use-cases/prefetch-cache/java-jedis/PrefetchCache.java new file mode 100644 index 0000000000..e323d5c5ee --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-jedis/PrefetchCache.java @@ -0,0 +1,317 @@ +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicLong; + +import redis.clients.jedis.Jedis; +import redis.clients.jedis.JedisPool; +import redis.clients.jedis.Pipeline; +import redis.clients.jedis.Transaction; +import redis.clients.jedis.params.ScanParams; +import redis.clients.jedis.resps.ScanResult; + +/** + * Redis prefetch-cache helper. + * + *

 * <p>Each cached entity is stored as a Redis hash under
 * {@code <prefix><id>} with a long safety-net TTL that bounds memory if
 * the sync pipeline ever stops, but is not the freshness mechanism.
 * Freshness comes from the {@link #applyChange(Map)} path, which the
 * sync worker calls every time a primary mutation arrives.
 *
 * <p>Reads run {@code HGETALL} against Redis only. A miss is not a
 * fall-back trigger — the application treats it as an error or a
 * deliberate {@link #invalidate(String)} for testing. In production a
 * sustained miss rate means the prefetch or the sync pipeline is broken,
 * not that the primary should be re-queried on the request path.
+ */ +public class PrefetchCache { + + public static final String DEFAULT_PREFIX = "cache:category:"; + public static final int DEFAULT_TTL_SECONDS = 3600; + + private final JedisPool pool; + private final String prefix; + private final int ttlSeconds; + + private final Object statsLock = new Object(); + private long hits; + private long misses; + private long prefetched; + private long syncEventsApplied; + private double syncLagMsTotal; + private long syncLagSamples; + + public PrefetchCache(JedisPool pool) { + this(pool, DEFAULT_PREFIX, DEFAULT_TTL_SECONDS); + } + + public PrefetchCache(JedisPool pool, String prefix, int ttlSeconds) { + if (pool == null) { + throw new IllegalArgumentException("pool is required"); + } + if (ttlSeconds < 1) { + throw new IllegalArgumentException("ttlSeconds must be at least 1 second"); + } + this.pool = pool; + this.prefix = (prefix == null || prefix.isEmpty()) ? DEFAULT_PREFIX : prefix; + this.ttlSeconds = ttlSeconds; + } + + public String getPrefix() { + return prefix; + } + + public int getTtlSeconds() { + return ttlSeconds; + } + + /** Result of an {@link #get(String)} read. */ + public static final class Result { + public final Map record; + public final boolean hit; + public final double redisLatencyMs; + + public Result(Map record, boolean hit, double redisLatencyMs) { + this.record = record; + this.hit = hit; + this.redisLatencyMs = redisLatencyMs; + } + } + + /** + * Pipeline {@code DEL} + {@code HSET} + {@code EXPIRE} for every record. + * Returns the count loaded. + * + *

 * <p>The pipeline is non-transactional: it is fast on startup (when
 * nothing is reading the cache) and on the live {@code /reprefetch}
 * path (when the demo pauses the sync worker around the call).
 * Calling {@code bulkLoad} on a cache that is actively being read
 * and written to can briefly expose a key that has been deleted but
 * not yet rewritten; pause the writers first or rewrite this with a
 * transaction if that matters.
+ */ + public int bulkLoad(Iterable> records) { + int loaded = 0; + try (Jedis jedis = pool.getResource()) { + Pipeline pipe = jedis.pipelined(); + for (Map record : records) { + if (record == null) { + continue; + } + String entityId = record.get("id"); + if (entityId == null || entityId.isEmpty()) { + continue; + } + String cacheKey = cacheKey(entityId); + pipe.del(cacheKey); + pipe.hset(cacheKey, record); + pipe.expire(cacheKey, ttlSeconds); + loaded++; + } + if (loaded > 0) { + pipe.sync(); + } + } + synchronized (statsLock) { + prefetched += loaded; + } + return loaded; + } + + /** + * Run {@code HGETALL} against Redis and return the cached record. + * + *

 * <p>Prefetch-cache reads do not fall back to the primary. A miss is
 * a signal that the cache is incomplete, not a trigger to re-query
 * the source. The caller decides how to surface it.
+ */ + public Result get(String entityId) { + String cacheKey = cacheKey(entityId); + long startedNs = System.nanoTime(); + Map cached; + try (Jedis jedis = pool.getResource()) { + cached = jedis.hgetAll(cacheKey); + } + double redisLatencyMs = (System.nanoTime() - startedNs) / 1_000_000.0; + + if (cached != null && !cached.isEmpty()) { + synchronized (statsLock) { + hits++; + } + return new Result(cached, true, redisLatencyMs); + } + synchronized (statsLock) { + misses++; + } + return new Result(null, false, redisLatencyMs); + } + + /** + * Apply a primary change event to Redis. + * + *

+     * <p>The sync worker calls this for every event the primary emits.
+     * For an upsert, the helper rewrites the hash and refreshes the
+     * safety-net TTL inside a {@code MULTI}/{@code EXEC} transaction so
+     * the cache never holds a stale mix of old and new fields. For a
+     * delete, it removes the cache key.
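+     *
+     * <p>The event shape this method expects (illustrative values,
+     * matching what {@code MockPrimaryStore} emits):
+     * <pre>{@code
+     * Map<String, Object> event = new HashMap<>();
+     * event.put("op", "upsert");           // or "delete"
+     * event.put("id", "cat-001");
+     * event.put("fields", recordSnapshot); // Map<String, String>; null for deletes
+     * event.put("timestamp_ms", (double) System.currentTimeMillis());
+     * cache.applyChange(event);
+     * }</pre>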

+ */ + public void applyChange(Map change) { + if (change == null) { + return; + } + Object op = change.get("op"); + Object idValue = change.get("id"); + if (!(idValue instanceof String)) { + return; + } + String entityId = (String) idValue; + if (entityId.isEmpty()) { + return; + } + String cacheKey = cacheKey(entityId); + + if ("upsert".equals(op)) { + @SuppressWarnings("unchecked") + Map fields = (Map) change.get("fields"); + if (fields == null || fields.isEmpty()) { + // Malformed upsert with no fields. Skip rather than crash + // the sync worker: HSET with an empty map raises and there + // is nothing to write anyway. A real CDC consumer would + // route this to a dead-letter queue and alert; the demo + // just drops it. + return; + } + try (Jedis jedis = pool.getResource()) { + Transaction tx = jedis.multi(); + tx.del(cacheKey); + tx.hset(cacheKey, fields); + tx.expire(cacheKey, ttlSeconds); + tx.exec(); + } + } else if ("delete".equals(op)) { + try (Jedis jedis = pool.getResource()) { + jedis.del(cacheKey); + } + } else { + return; + } + + synchronized (statsLock) { + syncEventsApplied++; + Object ts = change.get("timestamp_ms"); + if (ts instanceof Number) { + double timestampMs = ((Number) ts).doubleValue(); + double lagMs = Math.max(0.0, (System.currentTimeMillis()) - timestampMs); + syncLagMsTotal += lagMs; + syncLagSamples++; + } + } + } + + /** Delete one cache key. Demo-only: simulates a broken sync pipeline. */ + public boolean invalidate(String entityId) { + try (Jedis jedis = pool.getResource()) { + return jedis.del(cacheKey(entityId)) == 1L; + } + } + + /** Delete every key under this cache's prefix and return the count. */ + public int clear() { + int deleted = 0; + String match = prefix + "*"; + ScanParams params = new ScanParams().match(match).count(500); + try (Jedis jedis = pool.getResource()) { + String cursor = ScanParams.SCAN_POINTER_START; + do { + ScanResult scan = jedis.scan(cursor, params); + cursor = scan.getCursor(); + List keys = scan.getResult(); + if (keys != null && !keys.isEmpty()) { + // DEL accepts multiple keys; one round trip per batch. + deleted += (int) jedis.del(keys.toArray(new String[0])); + } + } while (!ScanParams.SCAN_POINTER_START.equals(cursor)); + } + return deleted; + } + + /** Return every entity id currently in the cache, sorted. */ + public List ids() { + List result = new ArrayList<>(); + ScanParams params = new ScanParams().match(prefix + "*").count(500); + try (Jedis jedis = pool.getResource()) { + String cursor = ScanParams.SCAN_POINTER_START; + do { + ScanResult scan = jedis.scan(cursor, params); + cursor = scan.getCursor(); + for (String key : scan.getResult()) { + result.add(stripPrefix(key)); + } + } while (!ScanParams.SCAN_POINTER_START.equals(cursor)); + } + Collections.sort(result); + return result; + } + + public int count() { + int n = 0; + ScanParams params = new ScanParams().match(prefix + "*").count(500); + try (Jedis jedis = pool.getResource()) { + String cursor = ScanParams.SCAN_POINTER_START; + do { + ScanResult scan = jedis.scan(cursor, params); + cursor = scan.getCursor(); + n += scan.getResult().size(); + } while (!ScanParams.SCAN_POINTER_START.equals(cursor)); + } + return n; + } + + public long ttlRemaining(String entityId) { + try (Jedis jedis = pool.getResource()) { + return jedis.ttl(cacheKey(entityId)); + } + } + + /** Hit/miss + sync counters with derived hit-rate and average sync lag. */ + public Map stats() { + synchronized (statsLock) { + long total = hits + misses; + double hitRate = total == 0 ? 
0.0 + : Math.round(1000.0 * hits / total) / 10.0; + double avgLag = syncLagSamples == 0 ? 0.0 + : Math.round(100.0 * syncLagMsTotal / syncLagSamples) / 100.0; + Map stats = new LinkedHashMap<>(); + stats.put("hits", hits); + stats.put("misses", misses); + stats.put("hit_rate_pct", hitRate); + stats.put("prefetched", prefetched); + stats.put("sync_events_applied", syncEventsApplied); + stats.put("sync_lag_ms_avg", avgLag); + return stats; + } + } + + public void resetStats() { + synchronized (statsLock) { + hits = 0; + misses = 0; + prefetched = 0; + syncEventsApplied = 0; + syncLagMsTotal = 0.0; + syncLagSamples = 0; + } + } + + private String cacheKey(String entityId) { + return prefix + entityId; + } + + private String stripPrefix(String key) { + return key.startsWith(prefix) ? key.substring(prefix.length()) : key; + } +} diff --git a/content/develop/use-cases/prefetch-cache/java-jedis/SyncWorker.java b/content/develop/use-cases/prefetch-cache/java-jedis/SyncWorker.java new file mode 100644 index 0000000000..a397a0ed69 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-jedis/SyncWorker.java @@ -0,0 +1,197 @@ +import java.util.Map; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.locks.Condition; +import java.util.concurrent.locks.ReentrantLock; + +/** + * Background sync worker for the prefetch-cache demo. + * + *

+ * <p>A daemon thread drains the primary's change queue and applies each
+ * event to Redis through {@link PrefetchCache#applyChange(Map)}. In a
+ * real system, the queue is replaced by a CDC pipeline (Redis Data
+ * Integration, Debezium, or an equivalent) that tails the primary's
+ * binlog/WAL and writes the same shape of events.
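+ *
+ * <p>Lifecycle sketch:
+ * <pre>{@code
+ * SyncWorker worker = new SyncWorker(primary, cache);
+ * worker.start();
+ * // ... serve traffic; primary mutations now propagate to Redis ...
+ * worker.stop(2000);
+ * }</pre>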

+ * + *

+ * <p>The worker exposes {@link #pause()} and {@link #resume()} so
+ * maintenance paths ({@code /reprefetch}, {@code clear()}) can stop
+ * event application without tearing the thread down. {@code pause()}
+ * blocks until the worker is parked, so the caller knows no apply is in
+ * flight by the time it returns.
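+ *
+ * <p>Maintenance sketch, close to what the demo's {@code /reprefetch}
+ * handler does:
+ * <pre>{@code
+ * worker.pause(2000);
+ * try {
+ *     cache.clear();
+ *     cache.bulkLoad(primary.listRecords());
+ * } finally {
+ *     worker.resume();
+ * }
+ * }</pre>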

+ */ +public class SyncWorker { + + private final MockPrimaryStore primary; + private final PrefetchCache cache; + private final long pollTimeoutMs; + + private final ReentrantLock lock = new ReentrantLock(); + /** Signalled when {@link #pause()} or {@link #resume()} changes the run-state flag. */ + private final Condition stateChanged = lock.newCondition(); + /** Signalled when the worker has confirmed it is parked. */ + private final Condition idleSignal = lock.newCondition(); + + private volatile boolean stopRequested; + private volatile boolean pauseRequested; + private volatile boolean workerIdle; + + private Thread thread; + + public SyncWorker(MockPrimaryStore primary, PrefetchCache cache) { + this(primary, cache, 50L); + } + + public SyncWorker(MockPrimaryStore primary, PrefetchCache cache, long pollTimeoutMs) { + if (primary == null || cache == null) { + throw new IllegalArgumentException("primary and cache are required"); + } + this.primary = primary; + this.cache = cache; + this.pollTimeoutMs = pollTimeoutMs; + } + + /** Spawn the worker if it isn't already running. */ + public synchronized void start() { + if (thread != null && thread.isAlive()) { + return; + } + stopRequested = false; + pauseRequested = false; + workerIdle = false; + thread = new Thread(this::run, "prefetch-cache-sync"); + thread.setDaemon(true); + thread.start(); + } + + /** + * Signal the worker to exit and join its thread. + * + *

+     * <p>If the join times out, the worker is wedged inside an apply; we
+     * leave {@code thread} populated so a subsequent {@link #start()}
+     * does not spawn a second worker on top of the orphan.
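+     *
+     * <p>The demo installs this in a JVM shutdown hook (sketch):
+     * <pre>{@code
+     * Runtime.getRuntime().addShutdownHook(new Thread(() -> worker.stop(2000)));
+     * }</pre>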

+ */ + public synchronized void stop(long joinTimeoutMs) { + stopRequested = true; + lock.lock(); + try { + stateChanged.signalAll(); + } finally { + lock.unlock(); + } + if (thread == null) { + return; + } + try { + thread.join(joinTimeoutMs); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + } + if (!thread.isAlive()) { + thread = null; + } + } + + public void stop() { + stop(2000L); + } + + /** + * Stop applying events and block until the worker is parked. + * + *

+     * <p>Returns {@code true} once the worker has confirmed it is idle,
+     * or {@code false} if the timeout elapsed first. While paused,
+     * change events accumulate in the primary's queue and are applied
+     * in order after {@link #resume()}.

+ */ + public boolean pause(long timeoutMs) { + lock.lock(); + try { + pauseRequested = true; + workerIdle = false; + stateChanged.signalAll(); + if (thread == null || !thread.isAlive()) { + return true; + } + long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs); + while (!workerIdle) { + long remaining = deadline - System.nanoTime(); + if (remaining <= 0L) { + return false; + } + try { + idleSignal.awaitNanos(remaining); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + return false; + } + } + return true; + } finally { + lock.unlock(); + } + } + + public boolean pause() { + return pause(2000L); + } + + public void resume() { + lock.lock(); + try { + pauseRequested = false; + workerIdle = false; + stateChanged.signalAll(); + } finally { + lock.unlock(); + } + } + + private void run() { + while (!stopRequested) { + if (pauseRequested) { + lock.lock(); + try { + // Park until resume/stop. Re-announce "I am idle" on + // every iteration so a *new* pause() that arrives + // while we are still parked from the previous cycle + // gets acknowledged immediately, not after the + // caller's full pause-timeout. + while (pauseRequested && !stopRequested) { + workerIdle = true; + idleSignal.signalAll(); + try { + stateChanged.await(); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + stopRequested = true; + break; + } + } + workerIdle = false; + } finally { + lock.unlock(); + } + continue; + } + + Map change; + try { + change = primary.nextChange(pollTimeoutMs); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + break; + } + if (change == null) { + continue; + } + try { + cache.applyChange(change); + } catch (Exception exc) { + // Demo behaviour: log and drop the event. A production CDC + // consumer would retry with bounded backoff and expose a + // dead-letter / error counter; see the guide's "Production + // usage" section. + System.err.println("[sync] failed to apply " + change + ": " + exc); + } + } + } +} diff --git a/content/develop/use-cases/prefetch-cache/java-jedis/_index.md b/content/develop/use-cases/prefetch-cache/java-jedis/_index.md new file mode 100644 index 0000000000..62415343a3 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-jedis/_index.md @@ -0,0 +1,401 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in Java with Jedis +linkTitle: Jedis example (Java) +title: Redis prefetch cache with Jedis +weight: 4 +--- + +This guide shows you how to implement a Redis prefetch cache in Java with the [Jedis]({{< relref "/develop/clients/jedis" >}}) client library. It includes a small local web server built on the JDK's `com.sun.net.httpserver` so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. 
+ +That gives you: + +* Near-100% cache hit ratios for reference and master data +* Sub-millisecond reads for lookup-heavy paths at peak traffic +* All reference-data reads offloaded from the primary database +* Source-database changes propagated into Redis within a few milliseconds +* A long safety-net TTL that bounds memory if the sync pipeline ever stops + +In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `cache.bulkLoad(primary.listRecords())`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. **On every read**, the application calls `cache.get(entityId)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to an in-process queue. The sync worker thread drains the queue and calls `cache.applyChange(event)`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key. + +In a real system the in-process change queue is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis. + +## The prefetch-cache helper + +The `PrefetchCache` class wraps the cache operations +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/java-jedis/PrefetchCache.java)): + +```java +import redis.clients.jedis.JedisPool; +import redis.clients.jedis.JedisPoolConfig; + +JedisPool pool = new JedisPool(new JedisPoolConfig(), "localhost", 6379); +MockPrimaryStore primary = new MockPrimaryStore(80); +PrefetchCache cache = new PrefetchCache(pool); + +// Pre-load every primary record into Redis in one pipelined round trip. +cache.bulkLoad(primary.listRecords()); + +// Start the sync worker so primary mutations propagate into Redis. +SyncWorker sync = new SyncWorker(primary, cache); +sync.start(); + +// Read paths now go to Redis only. +PrefetchCache.Result result = cache.get("cat-001"); +System.out.printf("hit=%s latency=%.2fms record=%s%n", + result.hit, result.redisLatencyMs, result.record); +``` + +### Data model + +Each cached category is stored in a Redis hash: + +```text +cache:category:cat-001 + id = cat-001 + name = Beverages + display_order = 1 + featured = true + parent_id = +``` + +The implementation uses: + +* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event +* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path +* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation +* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix +* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI + +## Bulk load on startup + +The `bulkLoad` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record. 
The pipeline is sent in a single round trip, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table:
+
+```java
+public int bulkLoad(Iterable<Map<String, String>> records) {
+    int loaded = 0;
+    try (Jedis jedis = pool.getResource()) {
+        Pipeline pipe = jedis.pipelined();
+        for (Map<String, String> record : records) {
+            if (record == null) continue;
+            String entityId = record.get("id");
+            if (entityId == null || entityId.isEmpty()) continue;
+            String cacheKey = cacheKey(entityId);
+            pipe.del(cacheKey);
+            pipe.hset(cacheKey, record);
+            pipe.expire(cacheKey, ttlSeconds);
+            loaded++;
+        }
+        if (loaded > 0) pipe.sync();
+    }
+    return loaded;
+}
+```
+
+The pipeline is intentionally non-transactional on the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `bulkLoad` directly from your own code on a cache that is already serving reads, either pause your writers first or wrap the writes in a `Transaction` so callers cannot observe a half-loaded record.
+
+## Reads from Redis only
+
+The `get` method runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident:
+
+```java
+public Result get(String entityId) {
+    String cacheKey = cacheKey(entityId);
+    long startedNs = System.nanoTime();
+    Map<String, String> cached;
+    try (Jedis jedis = pool.getResource()) {
+        cached = jedis.hgetAll(cacheKey);
+    }
+    double redisLatencyMs = (System.nanoTime() - startedNs) / 1_000_000.0;
+
+    if (cached != null && !cached.isEmpty()) {
+        synchronized (statsLock) { hits++; }
+        return new Result(cached, true, redisLatencyMs);
+    }
+    synchronized (statsLock) { misses++; }
+    return new Result(null, false, redisLatencyMs);
+}
+```
+
+This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load.
+
+## Applying sync events
+
+The sync worker calls `applyChange` for every primary mutation. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL inside a `MULTI`/`EXEC` transaction so the cache never holds a stale mix of old and new fields. For a `delete`, it removes the cache key:
+
+```java
+public void applyChange(Map<String, Object> change) {
+    Object op = change.get("op");
+    String entityId = (String) change.get("id");
+    if (entityId == null || entityId.isEmpty()) return;
+    String cacheKey = cacheKey(entityId);
+
+    if ("upsert".equals(op)) {
+        @SuppressWarnings("unchecked")
+        Map<String, String> fields = (Map<String, String>) change.get("fields");
+        if (fields == null || fields.isEmpty()) {
+            // Malformed upsert: skip rather than crash the sync worker.
+ return; + } + try (Jedis jedis = pool.getResource()) { + Transaction tx = jedis.multi(); + tx.del(cacheKey); + tx.hset(cacheKey, fields); + tx.expire(cacheKey, ttlSeconds); + tx.exec(); + } + } else if ("delete".equals(op)) { + try (Jedis jedis = pool.getResource()) { + jedis.del(cacheKey); + } + } +} +``` + +The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. + +## The sync worker + +The `SyncWorker` runs a daemon thread that polls the primary's change queue with a short timeout. Every change is applied to Redis as soon as it arrives +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/java-jedis/SyncWorker.java)): + +```java +private void run() { + while (!stopRequested) { + if (pauseRequested) { + // Park until pause is lifted (see pause/resume below). + parkUntilResumed(); + continue; + } + Map change = primary.nextChange(pollTimeoutMs); + if (change == null) continue; + try { + cache.applyChange(change); + } catch (Exception exc) { + System.err.println("[sync] failed to apply " + change + ": " + exc); + } + } +} +``` + +In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset. + +## Invalidation and re-prefetch + +Two helpers exist for testing and recovery: + +* `invalidate(entityId)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record. +* `clear()` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss. + +In both cases, the recovery path is to call `bulkLoad(primary.listRecords())` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation. + +### Re-prefetch under load + +`clear()` and `bulkLoad()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.listRecords()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation: + +```java +sync.pause(); +try { + cache.clear(); + cache.bulkLoad(primary.listRecords()); +} finally { + sync.resume(); +} +``` + +`pause()` is built on a `ReentrantLock` plus two `Condition`s: it sets a pause flag, then blocks on the worker's "idle" condition until the worker reports it has finished whatever event it was applying. Change events that arrive during the pause sit in the primary's `LinkedBlockingQueue` and apply in order once `resume()` is called, so no event is lost. + +## Hit/miss accounting + +The helper keeps in-process counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. All counter access is guarded by an intrinsic lock so the demo's parallel HTTP handlers and the sync thread can read and write them without tearing: + +```java +public Map stats() { + synchronized (statsLock) { + long total = hits + misses; + double hitRate = total == 0 ? 0.0 : Math.round(1000.0 * hits / total) / 10.0; + double avgLag = syncLagSamples == 0 ? 
0.0 + : Math.round(100.0 * syncLagMsTotal / syncLagSamples) / 100.0; + Map stats = new LinkedHashMap<>(); + stats.put("hits", hits); + stats.put("misses", misses); + stats.put("hit_rate_pct", hitRate); + stats.put("prefetched", prefetched); + stats.put("sync_events_applied", syncEventsApplied); + stats.put("sync_lag_ms_avg", avgLag); + return stats; + } +} +``` + +In production you would emit these as Micrometer counters and gauges rather than holding them in process memory. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind. + +## Prerequisites + +Before running the demo, make sure that: + +* Redis is running and accessible. By default, the demo connects to `localhost:6379`. +* JDK 11 or later is installed. +* The Jedis JAR (5.0+) and its dependencies are on your classpath. Get them from [Maven Central](https://repo1.maven.org/maven2/redis/clients/jedis/), or via Maven/Gradle in a project setup. You also need [`slf4j-api`](https://repo1.maven.org/maven2/org/slf4j/slf4j-api/) (Jedis declares it as a transitive dependency). + +If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`. + +## Running the demo + +### Get the source files + +The demo consists of four files. Download them from the [`java-jedis` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/java-jedis) on GitHub, or grab them with `curl`: + +```bash +mkdir prefetch-cache-demo && cd prefetch-cache-demo +BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/java-jedis +curl -O $BASE/PrefetchCache.java +curl -O $BASE/MockPrimaryStore.java +curl -O $BASE/SyncWorker.java +curl -O $BASE/DemoServer.java +``` + +### Start the demo server + +From that directory, with the Jedis and `slf4j-api` jars in the working directory (the example assumes Jedis 5.1.2 and slf4j-api 2.0.13 — adjust the filenames to match your versions): + +```bash +javac -cp jedis-5.1.2.jar:slf4j-api-2.0.13.jar \ + PrefetchCache.java MockPrimaryStore.java SyncWorker.java DemoServer.java + +java -cp .:jedis-5.1.2.jar:slf4j-api-2.0.13.jar \ + DemoServer --port 8785 --redis-host localhost --redis-port 6379 +``` + +You should see something like: + +```text +Redis prefetch-cache demo server listening on http://127.0.0.1:8785 +Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s +Prefetched 5 records in 103.3 ms; sync worker running +``` + +After starting the server, visit `http://localhost:8785`. 
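+You can also exercise the endpoints without the browser. A minimal smoke test (a sketch: the `/read` path and `id` parameter mirror the demo server's handlers; adjust host and port if you changed them):
+
+```java
+import java.net.URI;
+import java.net.http.HttpClient;
+import java.net.http.HttpRequest;
+import java.net.http.HttpResponse;
+
+public class SmokeTest {
+    public static void main(String[] args) throws Exception {
+        HttpClient client = HttpClient.newHttpClient();
+        HttpRequest request = HttpRequest.newBuilder(
+                URI.create("http://localhost:8785/read?id=cat-001")).build();
+        HttpResponse<String> response =
+                client.send(request, HttpResponse.BodyHandlers.ofString());
+        // Prints JSON with the record, hit flag, Redis latency, and stats.
+        System.out.println(response.body());
+    }
+}
+```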
+ +The demo server uses standard JDK libraries for HTTP handling and concurrency: + +* [`com.sun.net.httpserver.HttpServer`](https://docs.oracle.com/en/java/javase/21/docs/api/jdk.httpserver/com/sun/net/httpserver/HttpServer.html) for the web server, with a 16-thread fixed pool from `Executors.newFixedThreadPool(16)` +* [`java.util.concurrent.LinkedBlockingQueue`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/LinkedBlockingQueue.html) for the change feed between the mock primary and the sync worker +* [`java.util.concurrent.locks.ReentrantLock`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/locks/ReentrantLock.html) and [`Condition`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/locks/Condition.html) for the sync worker's pause/resume coordination + +It exposes a small interactive page where you can: + +* See which IDs are in the cache and in the primary, side by side +* Read a category through the cache and confirm every read is a hit +* Update a field on the primary and watch the sync worker rewrite the cache hash +* Add and delete categories and watch them appear and disappear from the cache +* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure +* Re-prefetch from the primary to recover from a broken cache state +* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all + +## The mock primary store + +To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/java-jedis/MockPrimaryStore.java)): + +```java +public List> listRecords() { + sleepLatency(); + synchronized (lock) { + reads++; + // ... return a copy of every record + } +} + +public boolean updateField(String entityId, String field, String value) { + synchronized (lock) { + Map record = records.get(entityId); + if (record == null) return false; + record.put(field, value); + // Emit the change event while still holding the records lock, + // so queue order matches mutation order. + emitChangeLocked(OP_UPSERT, entityId, new LinkedHashMap<>(record)); + } + return true; +} +``` + +Every mutation appends a change event to an in-process [`LinkedBlockingQueue`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/LinkedBlockingQueue.html). The sync worker drains the queue with a 50 ms timeout and applies each event to Redis. In a real system this queue is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis. + +## Production usage + +This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it. + +### Replace the in-process change queue with a real CDC pipeline + +The demo's in-process queue is the simplest possible stand-in for a CDC change feed. In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. 
Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset. + +### Use a long safety-net TTL, not a freshness TTL + +The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key). Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys. + +### Decide what to do on a cache miss + +A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are: + +* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache. +* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken. + +Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides. + +### Bound the working set to what fits in memory + +Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM. + +### Reconcile periodically against the primary + +CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss. + +### Tune the JedisPool + +The demo uses a default `JedisPoolConfig`. In production, set `setMaxTotal()` and `setMaxIdle()` to match your concurrency profile and server-side `maxclients`. Every cache call acquires a connection with try-with-resources so connections are returned to the pool even on exceptions; this is what lets the sync worker, the HTTP handlers, and any background job share Redis safely without an in-process lock. + +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` argument exactly for this. + +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. 
If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. + +## Learn more + +* [Jedis guide]({{< relref "/develop/clients/jedis" >}}) - Install and use the Jedis Redis client +* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields +* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash +* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds +* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete +* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server +* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key +* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud diff --git a/content/develop/use-cases/prefetch-cache/java-lettuce/DemoServer.java b/content/develop/use-cases/prefetch-cache/java-lettuce/DemoServer.java new file mode 100644 index 0000000000..a3aa030a87 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-lettuce/DemoServer.java @@ -0,0 +1,865 @@ +import com.sun.net.httpserver.HttpExchange; +import com.sun.net.httpserver.HttpHandler; +import com.sun.net.httpserver.HttpServer; +import io.lettuce.core.RedisClient; +import io.lettuce.core.RedisURI; +import io.lettuce.core.api.StatefulRedisConnection; + +import java.io.IOException; +import java.io.InputStream; +import java.io.OutputStream; +import java.net.InetSocketAddress; +import java.net.URLDecoder; +import java.nio.charset.StandardCharsets; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.Executors; + +/** + * Redis prefetch-cache demo server using Lettuce. + * + *

+ * <p>Run this file and visit http://localhost:8786 to watch a prefetch
+ * cache in action: the demo bulk-loads every primary record into Redis
+ * on startup, runs a background sync worker that applies primary
+ * mutations within milliseconds, and lets you add, update, delete, and
+ * re-prefetch records to see how the cache stays current without ever
+ * falling back to the primary on the read path.
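+ *
+ * <p>Launch sketch (the flags come from {@code main}'s argument parser;
+ * the classpath must also contain lettuce-core and its transitive
+ * dependencies such as netty and reactor-core):
+ * <pre>{@code
+ * java -cp .:lettuce-core-<version>.jar:<transitive jars> \
+ *     DemoServer --port 8786 --redis-host localhost --redis-port 6379
+ * }</pre>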

+ */ +public class DemoServer { + + private static PrefetchCache cache; + private static MockPrimaryStore primary; + private static SyncWorker sync; + private static RedisClient redisClient; + private static StatefulRedisConnection connection; + + public static void main(String[] args) { + String host = "127.0.0.1"; + int port = 8786; + String redisHost = "localhost"; + int redisPort = 6379; + String cachePrefix = "cache:category:"; + int ttlSeconds = 3600; + int primaryLatencyMs = 80; + + for (int i = 0; i < args.length; i++) { + switch (args[i]) { + case "--host": + host = args[++i]; + break; + case "--port": + port = Integer.parseInt(args[++i]); + break; + case "--redis-host": + redisHost = args[++i]; + break; + case "--redis-port": + redisPort = Integer.parseInt(args[++i]); + break; + case "--cache-prefix": + cachePrefix = args[++i]; + break; + case "--ttl-seconds": + ttlSeconds = Integer.parseInt(args[++i]); + break; + case "--primary-latency-ms": + primaryLatencyMs = Integer.parseInt(args[++i]); + break; + default: + break; + } + } + + try { + redisClient = RedisClient.create( + RedisURI.builder().withHost(redisHost).withPort(redisPort).build()); + connection = redisClient.connect(); + connection.sync().ping(); + } catch (Exception e) { + System.err.printf("Failed to connect to Redis at %s:%d: %s%n", + redisHost, redisPort, e.getMessage()); + System.exit(1); + } + + cache = new PrefetchCache(connection, cachePrefix, ttlSeconds); + primary = new MockPrimaryStore(primaryLatencyMs); + sync = new SyncWorker(primary, cache); + + long startedNs = System.nanoTime(); + cache.clear(); + int loaded = cache.bulkLoad(primary.listRecords()); + double elapsedMs = (System.nanoTime() - startedNs) / 1_000_000.0; + sync.start(); + + try { + HttpServer server = HttpServer.create(new InetSocketAddress(host, port), 0); + server.createContext("/", new RootHandler()); + server.createContext("/categories", new CategoriesHandler()); + server.createContext("/read", new ReadHandler()); + server.createContext("/stats", new StatsHandler()); + server.createContext("/update", new UpdateHandler()); + server.createContext("/add", new AddHandler()); + server.createContext("/delete", new DeleteHandler()); + server.createContext("/invalidate", new InvalidateHandler()); + server.createContext("/clear", new ClearHandler()); + server.createContext("/reprefetch", new ReprefetchHandler()); + server.createContext("/reset", new ResetHandler()); + server.setExecutor(Executors.newFixedThreadPool(16)); + + Runtime.getRuntime().addShutdownHook(new Thread(() -> { + try { + sync.stop(2000); + } catch (Exception ignored) { + } + if (connection != null) connection.close(); + if (redisClient != null) redisClient.shutdown(); + })); + + server.start(); + System.out.printf("Redis prefetch-cache demo server listening on http://%s:%d%n", + host, port); + System.out.printf("Using Redis at %s:%d with cache prefix '%s' and TTL %ds%n", + redisHost, redisPort, cachePrefix, ttlSeconds); + System.out.printf("Prefetched %d records in %.1f ms; sync worker running%n", + loaded, elapsedMs); + } catch (IOException e) { + System.err.println("Failed to start server: " + e.getMessage()); + System.exit(1); + } + } + + // ----- Handlers --------------------------------------------------- + + static class RootHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + String path = exchange.getRequestURI().getPath(); + if (!path.equals("/") && !path.equals("/index.html")) { + sendJson(exchange, 404, 
"{\"error\":\"Not Found\"}"); + return; + } + if (!"GET".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + byte[] body = renderHtmlPage(cache.getTtlSeconds()) + .getBytes(StandardCharsets.UTF_8); + exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8"); + exchange.sendResponseHeaders(200, body.length); + try (OutputStream os = exchange.getResponseBody()) { + os.write(body); + } + } + } + + static class CategoriesHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + Map response = new LinkedHashMap<>(); + response.put("cache_ids", cache.ids()); + response.put("primary_ids", primary.listIds()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class ReadHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + Map query = parseQuery(exchange.getRequestURI().getRawQuery()); + String id = query.getOrDefault("id", ""); + if (id.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id'.\"}"); + return; + } + PrefetchCache.Result result = cache.get(id); + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("record", result.record); // may be null on miss + response.put("hit", result.hit); + response.put("redis_latency_ms", round2(result.redisLatencyMs)); + response.put("ttl_remaining", cache.ttlRemaining(id)); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class StatsHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + sendJson(exchange, 200, toJson(buildStats())); + } + } + + static class UpdateHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", ""); + String field = form.getOrDefault("field", ""); + String value = form.getOrDefault("value", ""); + if (id.isEmpty() || field.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id' or 'field'.\"}"); + return; + } + if (!primary.updateField(id, field, value)) { + sendJson(exchange, 404, "{\"error\":\"Unknown category '" + jsonEscape(id) + "'.\"}"); + return; + } + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("field", field); + response.put("value", value); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class AddHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", "").trim(); + String name = form.getOrDefault("name", "").trim(); + if (id.isEmpty() || name.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id' or 'name'.\"}"); + return; + } + String displayOrder = form.getOrDefault("display_order", ""); + if (displayOrder.isEmpty()) displayOrder = "99"; + String featured = form.getOrDefault("featured", ""); + if (featured.isEmpty()) featured = "false"; + String parentId = 
form.getOrDefault("parent_id", ""); + + Map record = new LinkedHashMap<>(); + record.put("id", id); + record.put("name", name); + record.put("display_order", displayOrder); + record.put("featured", featured); + record.put("parent_id", parentId); + + if (!primary.addRecord(record)) { + sendJson(exchange, 409, + "{\"error\":\"Category '" + jsonEscape(id) + "' already exists.\"}"); + return; + } + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("record", record); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class DeleteHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", ""); + if (id.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id'.\"}"); + return; + } + if (!primary.deleteRecord(id)) { + sendJson(exchange, 404, "{\"error\":\"Unknown category '" + jsonEscape(id) + "'.\"}"); + return; + } + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class InvalidateHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + Map form = parseFormData(readRequestBody(exchange)); + String id = form.getOrDefault("id", ""); + if (id.isEmpty()) { + sendJson(exchange, 400, "{\"error\":\"Missing 'id'.\"}"); + return; + } + boolean deleted = cache.invalidate(id); + Map response = new LinkedHashMap<>(); + response.put("id", id); + response.put("deleted", deleted); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class ClearHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + // Pause the sync worker so it cannot recreate keys between + // SCAN and DEL. Queued events accumulate and apply after resume. + sync.pause(2000); + long deleted; + try { + deleted = cache.clear(); + } finally { + sync.resume(); + } + Map response = new LinkedHashMap<>(); + response.put("deleted", deleted); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class ReprefetchHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + // Pause the sync worker so it cannot interleave with the + // clear + snapshot + bulk_load sequence. Without this, a + // change applied between listRecords() and bulkLoad() would + // be overwritten by the stale snapshot. 
+ sync.pause(2000); + int loaded; + double elapsedMs; + try { + long startedNs = System.nanoTime(); + cache.clear(); + loaded = cache.bulkLoad(primary.listRecords()); + elapsedMs = (System.nanoTime() - startedNs) / 1_000_000.0; + } finally { + sync.resume(); + } + Map response = new LinkedHashMap<>(); + response.put("loaded", loaded); + response.put("elapsed_ms", round2(elapsedMs)); + response.put("stats", buildStats()); + sendJson(exchange, 200, toJson(response)); + } + } + + static class ResetHandler implements HttpHandler { + @Override + public void handle(HttpExchange exchange) throws IOException { + if (!"POST".equalsIgnoreCase(exchange.getRequestMethod())) { + sendJson(exchange, 405, "{\"error\":\"Method Not Allowed\"}"); + return; + } + cache.resetStats(); + primary.resetReads(); + sendJson(exchange, 200, toJson(buildStats())); + } + } + + // ----- Helpers ---------------------------------------------------- + + private static Map buildStats() { + Map stats = cache.stats(); + stats.put("primary_reads_total", primary.reads()); + stats.put("primary_read_latency_ms", primary.getReadLatencyMs()); + return stats; + } + + private static double round2(double value) { + return Math.round(value * 100.0) / 100.0; + } + + private static String readRequestBody(HttpExchange exchange) throws IOException { + try (InputStream inputStream = exchange.getRequestBody()) { + return new String(inputStream.readAllBytes(), StandardCharsets.UTF_8); + } + } + + private static Map parseFormData(String body) { + Map params = new HashMap<>(); + if (body == null || body.isEmpty()) return params; + for (String pair : body.split("&")) { + String[] kv = pair.split("=", 2); + if (kv.length != 2 || kv[0].isEmpty()) continue; + params.put(URLDecoder.decode(kv[0], StandardCharsets.UTF_8), + URLDecoder.decode(kv[1], StandardCharsets.UTF_8)); + } + return params; + } + + private static Map parseQuery(String query) { + if (query == null || query.isEmpty()) { + return new HashMap<>(); + } + return parseFormData(query); + } + + private static void sendJson(HttpExchange exchange, int status, String body) throws IOException { + byte[] bytes = body.getBytes(StandardCharsets.UTF_8); + exchange.getResponseHeaders().set("Content-Type", "application/json"); + exchange.sendResponseHeaders(status, bytes.length); + try (OutputStream os = exchange.getResponseBody()) { + os.write(bytes); + } + } + + private static String toJson(Object value) { + StringBuilder sb = new StringBuilder(); + appendJson(sb, value); + return sb.toString(); + } + + @SuppressWarnings("unchecked") + private static void appendJson(StringBuilder sb, Object value) { + if (value == null) { + sb.append("null"); + } else if (value instanceof Boolean) { + sb.append(value); + } else if (value instanceof Number) { + sb.append(value); + } else if (value instanceof Map) { + sb.append('{'); + boolean first = true; + for (Map.Entry entry : ((Map) value).entrySet()) { + if (!first) sb.append(','); + first = false; + appendJsonString(sb, String.valueOf(entry.getKey())); + sb.append(':'); + appendJson(sb, entry.getValue()); + } + sb.append('}'); + } else if (value instanceof List) { + sb.append('['); + boolean first = true; + for (Object item : (List) value) { + if (!first) sb.append(','); + first = false; + appendJson(sb, item); + } + sb.append(']'); + } else { + appendJsonString(sb, String.valueOf(value)); + } + } + + private static void appendJsonString(StringBuilder sb, String value) { + sb.append('"').append(jsonEscape(value)).append('"'); + } + + private static String 
jsonEscape(String value) { + StringBuilder sb = new StringBuilder(value.length() + 4); + for (int i = 0; i < value.length(); i++) { + char c = value.charAt(i); + switch (c) { + case '"': sb.append("\\\""); break; + case '\\': sb.append("\\\\"); break; + case '\n': sb.append("\\n"); break; + case '\r': sb.append("\\r"); break; + case '\t': sb.append("\\t"); break; + default: + if (c < 0x20) { + sb.append(String.format("\\u%04x", (int) c)); + } else { + sb.append(c); + } + } + } + return sb.toString(); + } + + private static String renderHtmlPage(int cacheTtl) { + return HTML_TEMPLATE.replace("__CACHE_TTL__", Integer.toString(cacheTtl)); + } + + // Same HTML as the Python reference. The pill text is the + // Lettuce + JDK HttpServer label per the Lettuce port brief, and + // __CACHE_TTL__ is substituted at render time with the safety-net + // TTL the cache was constructed with. + private static final String HTML_TEMPLATE = """ + + + + + + Redis Prefetch Cache Demo + + + +
+ <!-- Template body: the original HTML markup was lost in extraction.
+      The page it renders has a "Lettuce + JDK HttpServer" pill, a
+      "Redis Prefetch Cache Demo" heading, an intro paragraph (records
+      pre-loaded into Redis, HGETALL-only reads, a sync worker applying
+      changes within milliseconds, __CACHE_TTL__ s safety-net TTL), and
+      panels for: Cache state, Read a category, Update a field, Add a
+      category, Delete a category, Break the cache (invalidate, clear,
+      re-prefetch), Cache stats, and Last result. -->
+ + + + + """; +} diff --git a/content/develop/use-cases/prefetch-cache/java-lettuce/MockPrimaryStore.java b/content/develop/use-cases/prefetch-cache/java-lettuce/MockPrimaryStore.java new file mode 100644 index 0000000000..89f7d1d5c2 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-lettuce/MockPrimaryStore.java @@ -0,0 +1,186 @@ +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.LinkedBlockingQueue; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicLong; + +/** + * Mock primary data store for the prefetch-cache demo. + * + *

+ * <p>This stands in for a source-of-truth database (Postgres, MySQL,
+ * Mongo, etc.) that holds reference data the application serves to
+ * users.

+ * + *

+ * <p>Every mutation appends a change event to an in-process queue,
+ * which the sync worker drains and applies to Redis. In a real system
+ * the queue is replaced by a CDC pipeline — Redis Data Integration,
+ * Debezium plus a lightweight consumer, or an equivalent tool that
+ * tails the source's binlog/WAL and pushes changes into Redis.
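+ *
+ * <p>Consumer-side shape (what the sync worker's loop reduces to):
+ * <pre>{@code
+ * Map<String, Object> event = primary.nextChange(50);
+ * if (event != null) {
+ *     cache.applyChange(event);
+ * }
+ * }</pre>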

+ * + *

+ * <p>The store also exposes a read-latency knob so the demo can
+ * illustrate how much slower a direct primary read would be than a
+ * Redis hit.
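+ *
+ * <p>For example (sketch; {@code cache} is a {@code PrefetchCache}):
+ * <pre>{@code
+ * MockPrimaryStore primary = new MockPrimaryStore(80); // ~80 ms per read
+ * Map<String, String> slow = primary.read("cat-001");  // sleeps, counts a read
+ * PrefetchCache.Result fast = cache.get("cat-001");    // sub-millisecond
+ * }</pre>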

+ */ +public class MockPrimaryStore { + + public static final String CHANGE_OP_UPSERT = "upsert"; + public static final String CHANGE_OP_DELETE = "delete"; + + private final int readLatencyMs; + private final AtomicLong reads = new AtomicLong(); + private final Object lock = new Object(); + private final Map> records = new LinkedHashMap<>(); + private final LinkedBlockingQueue> changes = new LinkedBlockingQueue<>(); + + public MockPrimaryStore() { + this(80); + } + + public MockPrimaryStore(int readLatencyMs) { + this.readLatencyMs = readLatencyMs; + seed("cat-001", "Beverages", "1", "true", ""); + seed("cat-002", "Bakery", "2", "true", ""); + seed("cat-003", "Pantry Staples", "3", "false", ""); + seed("cat-004", "Frozen", "4", "false", ""); + seed("cat-005", "Specialty Cheeses", "5", "false", "cat-002"); + } + + public int getReadLatencyMs() { + return readLatencyMs; + } + + /** Sorted list of every record id. No sleep, no counter increment (metadata-only). */ + public List listIds() { + List ids; + synchronized (lock) { + ids = new ArrayList<>(records.keySet()); + } + Collections.sort(ids); + return ids; + } + + /** Return every record. Used by the cache's bulk-load path on startup. */ + public List> listRecords() { + sleepReadLatency(); + synchronized (lock) { + reads.incrementAndGet(); + List> out = new ArrayList<>(records.size()); + for (Map record : records.values()) { + out.add(new LinkedHashMap<>(record)); + } + return out; + } + } + + /** Single-record read. Not on the demo's normal read path. */ + public Map read(String entityId) { + sleepReadLatency(); + synchronized (lock) { + reads.incrementAndGet(); + Map record = records.get(entityId); + return record == null ? null : new LinkedHashMap<>(record); + } + } + + public boolean addRecord(Map record) { + if (record == null) return false; + String entityId = record.getOrDefault("id", "").trim(); + if (entityId.isEmpty()) return false; + synchronized (lock) { + if (records.containsKey(entityId)) { + return false; + } + records.put(entityId, new LinkedHashMap<>(record)); + // Emit while the lock is held so the queue order matches + // the mutation order. Two concurrent callers cannot + // interleave mutation A → mutation B → emit B → emit A. + emitChangeLocked(CHANGE_OP_UPSERT, entityId, new LinkedHashMap<>(record)); + } + return true; + } + + public boolean updateField(String entityId, String field, String value) { + synchronized (lock) { + Map record = records.get(entityId); + if (record == null) { + return false; + } + record.put(field, value); + emitChangeLocked(CHANGE_OP_UPSERT, entityId, new LinkedHashMap<>(record)); + } + return true; + } + + public boolean deleteRecord(String entityId) { + synchronized (lock) { + if (!records.containsKey(entityId)) { + return false; + } + records.remove(entityId); + emitChangeLocked(CHANGE_OP_DELETE, entityId, null); + } + return true; + } + + /** + * Block up to {@code timeoutMs} for the next change event. Returns + * {@code null} on timeout. + */ + public Map nextChange(long timeoutMs) { + try { + return changes.poll(timeoutMs, TimeUnit.MILLISECONDS); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + return null; + } + } + + public long reads() { + return reads.get(); + } + + public void resetReads() { + reads.set(0); + } + + /** + * Append a change event to the feed. Caller must hold {@link #lock}. + * + *

{@link LinkedBlockingQueue#offer(Object)} is itself thread-safe + * and never tries to acquire {@link #lock}, so calling it while + * holding the records lock cannot deadlock. Holding the lock here + * is what guarantees the queue order matches the order in which + * the records map was mutated.

+ */ + private void emitChangeLocked(String op, String entityId, Map fields) { + Map event = new HashMap<>(); + event.put("op", op); + event.put("id", entityId); + event.put("fields", fields); + event.put("timestamp_ms", (double) System.currentTimeMillis()); + changes.offer(event); + } + + private void sleepReadLatency() { + if (readLatencyMs <= 0) return; + try { + Thread.sleep(readLatencyMs); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + } + } + + private void seed(String id, String name, String displayOrder, String featured, String parentId) { + Map record = new LinkedHashMap<>(); + record.put("id", id); + record.put("name", name); + record.put("display_order", displayOrder); + record.put("featured", featured); + record.put("parent_id", parentId); + records.put(id, record); + } +} diff --git a/content/develop/use-cases/prefetch-cache/java-lettuce/PrefetchCache.java b/content/develop/use-cases/prefetch-cache/java-lettuce/PrefetchCache.java new file mode 100644 index 0000000000..3d9680290d --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-lettuce/PrefetchCache.java @@ -0,0 +1,338 @@ +import io.lettuce.core.KeyScanCursor; +import io.lettuce.core.RedisFuture; +import io.lettuce.core.ScanArgs; +import io.lettuce.core.api.StatefulRedisConnection; +import io.lettuce.core.api.async.RedisAsyncCommands; +import io.lettuce.core.api.sync.RedisCommands; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicLong; +import java.util.concurrent.locks.ReentrantLock; + +/** + * Redis prefetch-cache helper. + * + *

+ * <p>Each cached entity is stored as a Redis hash under
+ * {@code <prefix><id>} with a long safety-net TTL that bounds memory if
+ * the sync pipeline ever stops, but is not the freshness mechanism.
+ * Freshness comes from the {@link #applyChange(Map)} path, which the
+ * sync worker calls every time a primary mutation arrives.

+ * + *

Reads run {@code HGETALL} against Redis only. A miss is not a + * fall-back trigger — the application treats it as an error or a + * deliberate {@link #invalidate(String)} for testing. In production a + * sustained miss rate means the prefetch or the sync pipeline is broken, + * not that the primary should be re-queried on the request path.

+ */ +public class PrefetchCache { + + /** + * Serializes every transactional sequence (MULTI/EXEC) because the + * demo shares a single {@link StatefulRedisConnection} across HTTP + * handler threads. Lettuce is thread-safe for individual commands, + * but transactions are connection-scoped: two threads issuing + * MULTI/EXEC over the same connection would interleave their queued + * commands. The lock covers {@link #applyChange(Map)}'s upsert + * transaction so a queued HSET in one thread cannot end up inside + * another thread's transaction. In production, hand each transactional + * caller its own connection from a {@code ConnectionPoolSupport} pool + * and drop the lock. + */ + private final ReentrantLock txLock = new ReentrantLock(); + + private final StatefulRedisConnection connection; + private final String prefix; + private final int ttlSeconds; + + private final AtomicLong hits = new AtomicLong(); + private final AtomicLong misses = new AtomicLong(); + private final AtomicLong prefetched = new AtomicLong(); + private final AtomicLong syncEventsApplied = new AtomicLong(); + + // Sync-lag is recorded as a running total and sample count behind + // a small lock so the average is computed without losing samples. + private final Object lagLock = new Object(); + private double syncLagMsTotal = 0.0; + private long syncLagSamples = 0L; + + public PrefetchCache(StatefulRedisConnection connection) { + this(connection, "cache:category:", 3600); + } + + public PrefetchCache( + StatefulRedisConnection connection, + String prefix, + int ttlSeconds) { + if (connection == null) { + throw new IllegalArgumentException("connection is required"); + } + if (ttlSeconds < 1) { + throw new IllegalArgumentException("ttlSeconds must be at least 1"); + } + this.connection = connection; + this.prefix = (prefix == null || prefix.isEmpty()) ? "cache:category:" : prefix; + this.ttlSeconds = ttlSeconds; + } + + public int getTtlSeconds() { + return ttlSeconds; + } + + public String getPrefix() { + return prefix; + } + + /** Result of a cache read: the record (or null on miss), hit flag, and Redis round-trip in ms. */ + public static final class Result { + public final Map record; + public final boolean hit; + public final double redisLatencyMs; + + public Result(Map record, boolean hit, double redisLatencyMs) { + this.record = record; + this.hit = hit; + this.redisLatencyMs = redisLatencyMs; + } + } + + /** + * Pipeline {@code DEL} + {@code HSET} + {@code EXPIRE} for every record. + * Returns the count loaded. + * + *

The pipeline is non-transactional: it is fast on startup (when + * nothing is reading the cache) and on the live {@code /reprefetch} + * path (when the demo pauses the sync worker around the call). Calling + * {@code bulkLoad} on a cache that is actively being read and written + * to can briefly expose a key that has been deleted but not yet + * rewritten; pause the writers first or rewrite this with + * {@link RedisCommands#multi()} if that matters.

+ */ + public int bulkLoad(Iterable> records) { + RedisAsyncCommands async = connection.async(); + // Disable auto-flushing so all commands batch into one network + // round trip, the Lettuce equivalent of a non-transactional + // pipeline in redis-py. We use the async API here because the + // sync API blocks on every command, which would defeat batching. + connection.setAutoFlushCommands(false); + List> futures = new ArrayList<>(); + int loaded = 0; + try { + for (Map record : records) { + if (record == null) continue; + String entityId = record.get("id"); + if (entityId == null || entityId.isEmpty()) continue; + String cacheKey = cacheKey(entityId); + futures.add(async.del(cacheKey)); + futures.add(async.hset(cacheKey, record)); + futures.add(async.expire(cacheKey, ttlSeconds)); + loaded += 1; + } + connection.flushCommands(); + // Wait for every queued command to complete before returning. + for (RedisFuture future : futures) { + try { + future.get(); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + throw new RuntimeException("bulkLoad interrupted", e); + } catch (java.util.concurrent.ExecutionException e) { + throw new RuntimeException("bulkLoad command failed", e.getCause()); + } + } + } finally { + connection.setAutoFlushCommands(true); + } + if (loaded > 0) { + prefetched.addAndGet(loaded); + } + return loaded; + } + + /** + * Return {@code (record, hit, redisLatencyMs)} for an {@code HGETALL} + * against Redis. Prefetch-cache reads do not fall back to the + * primary. A miss is a signal that the cache is incomplete, not a + * trigger to re-query the source. The caller decides how to surface it. + */ + public Result get(String entityId) { + RedisCommands sync = connection.sync(); + String cacheKey = cacheKey(entityId); + + long startedNs = System.nanoTime(); + Map cached = sync.hgetall(cacheKey); + double redisLatencyMs = (System.nanoTime() - startedNs) / 1_000_000.0; + + if (cached != null && !cached.isEmpty()) { + hits.incrementAndGet(); + return new Result(cached, true, redisLatencyMs); + } + misses.incrementAndGet(); + return new Result(null, false, redisLatencyMs); + } + + /** + * Apply a primary change event to Redis. + * + *

The sync worker calls this for every event the primary emits. + * For an upsert, the helper rewrites the hash and refreshes the + * safety-net TTL inside a {@code MULTI}/{@code EXEC} block (serialized + * by {@link #txLock} so a concurrent caller cannot interleave). + * For a delete, it removes the cache key.

+ */ + public void applyChange(Map change) { + if (change == null) return; + Object opObj = change.get("op"); + Object idObj = change.get("id"); + if (!(opObj instanceof String) || !(idObj instanceof String)) return; + String op = (String) opObj; + String entityId = (String) idObj; + if (entityId.isEmpty()) return; + + String cacheKey = cacheKey(entityId); + RedisCommands sync = connection.sync(); + + if ("upsert".equals(op)) { + Object fieldsObj = change.get("fields"); + if (!(fieldsObj instanceof Map)) return; + @SuppressWarnings("unchecked") + Map fields = (Map) fieldsObj; + if (fields.isEmpty()) { + // Malformed upsert with no fields. Skip rather than + // crash the sync worker: HSET with an empty mapping + // raises in Lettuce, and there is nothing to write + // anyway. A real CDC consumer would route this to a + // dead-letter queue and alert; the demo just drops it. + return; + } + txLock.lock(); + try { + sync.multi(); + sync.del(cacheKey); + sync.hset(cacheKey, fields); + sync.expire(cacheKey, ttlSeconds); + sync.exec(); + } finally { + txLock.unlock(); + } + } else if ("delete".equals(op)) { + sync.del(cacheKey); + } else { + return; + } + + syncEventsApplied.incrementAndGet(); + Object tsObj = change.get("timestamp_ms"); + if (tsObj instanceof Number) { + double lagMs = Math.max(0.0, + System.currentTimeMillis() - ((Number) tsObj).doubleValue()); + synchronized (lagLock) { + syncLagMsTotal += lagMs; + syncLagSamples += 1; + } + } + } + + /** Delete one cache key. Demo-only: simulates a broken sync pipeline. */ + public boolean invalidate(String entityId) { + return connection.sync().del(cacheKey(entityId)) == 1L; + } + + /** Delete every key under this cache's prefix and return the count. */ + public long clear() { + RedisCommands sync = connection.sync(); + long deleted = 0; + ScanArgs args = ScanArgs.Builder.matches(prefix + "*").limit(500); + KeyScanCursor cursor = sync.scan(args); + while (true) { + List keys = cursor.getKeys(); + if (keys != null && !keys.isEmpty()) { + String[] keyArray = keys.toArray(new String[0]); + deleted += sync.del(keyArray); + } + if (cursor.isFinished()) { + break; + } + cursor = sync.scan(cursor, args); + } + return deleted; + } + + /** Return every entity id currently in the cache, sorted. */ + public List ids() { + RedisCommands sync = connection.sync(); + List ids = new ArrayList<>(); + ScanArgs args = ScanArgs.Builder.matches(prefix + "*").limit(500); + KeyScanCursor cursor = sync.scan(args); + while (true) { + List keys = cursor.getKeys(); + if (keys != null) { + for (String key : keys) { + ids.add(stripPrefix(key)); + } + } + if (cursor.isFinished()) { + break; + } + cursor = sync.scan(cursor, args); + } + Collections.sort(ids); + return ids; + } + + public long count() { + return ids().size(); + } + + /** Remaining TTL in seconds (Redis {@code TTL} semantics: -2 missing, -1 no expiry). */ + public long ttlRemaining(String entityId) { + return connection.sync().ttl(cacheKey(entityId)); + } + + /** + * Snapshot of the helper's counters: hits, misses, hit_rate_pct, + * prefetched, sync_events_applied, sync_lag_ms_avg. + */ + public Map stats() { + long h = hits.get(); + long m = misses.get(); + long total = h + m; + double hitRate = total == 0 ? 0.0 : Math.round(1000.0 * h / total) / 10.0; + double avgLag; + synchronized (lagLock) { + avgLag = syncLagSamples == 0 + ? 
0.0 + : Math.round(100.0 * syncLagMsTotal / syncLagSamples) / 100.0; + } + Map stats = new LinkedHashMap<>(); + stats.put("hits", h); + stats.put("misses", m); + stats.put("hit_rate_pct", hitRate); + stats.put("prefetched", prefetched.get()); + stats.put("sync_events_applied", syncEventsApplied.get()); + stats.put("sync_lag_ms_avg", avgLag); + return stats; + } + + public void resetStats() { + hits.set(0); + misses.set(0); + prefetched.set(0); + syncEventsApplied.set(0); + synchronized (lagLock) { + syncLagMsTotal = 0.0; + syncLagSamples = 0L; + } + } + + private String cacheKey(String entityId) { + return prefix + entityId; + } + + private String stripPrefix(String key) { + return key.startsWith(prefix) ? key.substring(prefix.length()) : key; + } +} diff --git a/content/develop/use-cases/prefetch-cache/java-lettuce/SyncWorker.java b/content/develop/use-cases/prefetch-cache/java-lettuce/SyncWorker.java new file mode 100644 index 0000000000..299bdd925a --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-lettuce/SyncWorker.java @@ -0,0 +1,171 @@ +import java.util.Map; + +/** + * Background sync worker for the prefetch-cache demo. + * + *

A daemon {@link Thread} drains the primary's change queue and + * applies each event to Redis through + * {@link PrefetchCache#applyChange(Map)}. In a real system, the queue + * is replaced by a CDC pipeline (Redis Data Integration, Debezium, or + * an equivalent) that tails the primary's binlog/WAL and writes the + * same shape of events.

+ * + *

The worker exposes {@link #pause(long)} and {@link #resume()} so + * maintenance paths ({@code /reprefetch}, {@link PrefetchCache#clear()}) + * can stop event application without tearing the thread down. + * {@code pause} blocks until the worker is parked, so the caller knows + * no apply is in flight by the time it returns.

+ */ +public class SyncWorker { + + private final MockPrimaryStore primary; + private final PrefetchCache cache; + private final long pollTimeoutMs; + + private volatile boolean stopRequested = false; + private volatile boolean pauseRequested = false; + + // Signals worker has parked itself inside the pause loop. Cleared + // by the worker on resume so the next pause can wait again. + private final Object pausedSignal = new Object(); + private volatile boolean pausedIdle = false; + + private Thread thread; + + public SyncWorker(MockPrimaryStore primary, PrefetchCache cache) { + this(primary, cache, 50L); + } + + public SyncWorker(MockPrimaryStore primary, PrefetchCache cache, long pollTimeoutMs) { + if (primary == null || cache == null) { + throw new IllegalArgumentException("primary and cache are required"); + } + this.primary = primary; + this.cache = cache; + this.pollTimeoutMs = pollTimeoutMs; + } + + public synchronized void start() { + if (thread != null && thread.isAlive()) { + return; + } + stopRequested = false; + pauseRequested = false; + pausedIdle = false; + thread = new Thread(this::run, "prefetch-cache-sync"); + thread.setDaemon(true); + thread.start(); + } + + /** + * Signal the worker to exit and join its thread. + * + *

If the join times out the worker is wedged inside + * {@code applyChange}; we leave {@link #thread} populated so a + * subsequent {@link #start()} does not spawn a second worker on + * top of the orphan.

+ */ + public synchronized void stop(long joinTimeoutMs) { + stopRequested = true; + // Also wake the worker out of any park loop. + synchronized (pausedSignal) { + pausedSignal.notifyAll(); + } + if (thread == null) return; + try { + thread.join(joinTimeoutMs); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + return; + } + if (!thread.isAlive()) { + thread = null; + } + } + + /** + * Stop applying events and block until the worker is parked. + * + *

Returns {@code true} once the worker has confirmed it is idle, + * or {@code false} if the timeout elapsed first. While paused, + * change events accumulate in the primary's queue and are applied + * in order after {@link #resume()}.

+ */ + public boolean pause(long timeoutMs) { + Thread workerSnapshot; + synchronized (this) { + workerSnapshot = thread; + } + pausedIdle = false; + pauseRequested = true; + if (workerSnapshot == null || !workerSnapshot.isAlive()) { + // No worker running — nothing to wait on, treat as paused. + return true; + } + long deadline = System.nanoTime() + timeoutMs * 1_000_000L; + synchronized (pausedSignal) { + while (!pausedIdle && System.nanoTime() < deadline) { + long remainingNanos = deadline - System.nanoTime(); + if (remainingNanos <= 0) break; + long remainingMs = Math.max(1L, remainingNanos / 1_000_000L); + try { + pausedSignal.wait(remainingMs); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + return pausedIdle; + } + } + } + return pausedIdle; + } + + public void resume() { + pauseRequested = false; + pausedIdle = false; + synchronized (pausedSignal) { + pausedSignal.notifyAll(); + } + } + + private void run() { + while (!stopRequested) { + if (pauseRequested) { + synchronized (pausedSignal) { + // Park until pause is lifted or worker is stopped. + // Re-announce "pausedIdle" on every iteration so a + // *new* pause() that arrives while we are still + // parked from the previous cycle gets acknowledged + // immediately, not after the caller's full + // pause-timeout. + while (pauseRequested && !stopRequested) { + pausedIdle = true; + pausedSignal.notifyAll(); + try { + pausedSignal.wait(pollTimeoutMs); + } catch (InterruptedException e) { + Thread.currentThread().interrupt(); + return; + } + } + pausedIdle = false; + } + continue; + } + + Map change = primary.nextChange(pollTimeoutMs); + if (change == null) { + continue; + } + try { + cache.applyChange(change); + } catch (Exception exc) { + // Demo behaviour: log and drop the event. A production + // CDC consumer would retry with bounded backoff and + // expose a dead-letter / error counter; see the guide's + // "Production usage" section. + System.err.printf("[sync] failed to apply %s: %s%n", + change, exc.getMessage()); + } + } + } +} diff --git a/content/develop/use-cases/prefetch-cache/java-lettuce/_index.md b/content/develop/use-cases/prefetch-cache/java-lettuce/_index.md new file mode 100644 index 0000000000..e575788cfb --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/java-lettuce/_index.md @@ -0,0 +1,435 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in Java with Lettuce +linkTitle: Lettuce example (Java) +title: Redis prefetch cache with Lettuce +weight: 5 +--- + +This guide shows you how to implement a Redis prefetch cache in Java with the [Lettuce]({{< relref "/develop/clients/lettuce" >}}) client library. It includes a small local web server built on the JDK's `com.sun.net.httpserver` so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. 
+ +That gives you: + +* Near-100% cache hit ratios for reference and master data +* Sub-millisecond reads for lookup-heavy paths at peak traffic +* All reference-data reads offloaded from the primary database +* Source-database changes propagated into Redis within a few milliseconds +* A long safety-net TTL that bounds memory if the sync pipeline ever stops + +In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +This guide uses Lettuce's synchronous command API (`StatefulRedisConnection.sync()`) for reads and event application, and the asynchronous API (`async()`) inside `bulkLoad` so that the startup pipeline of `DEL` + `HSET` + `EXPIRE` triples can batch into a single round trip without each command blocking on its own future. Lettuce's reactive API would work equally well for either path. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `cache.bulkLoad(primary.listRecords())`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. **On every read**, the application calls `cache.get(entityId)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to an in-process queue. The sync worker thread drains the queue and calls `cache.applyChange(event)`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key. + +In a real system the in-process change queue is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis. + +## The prefetch-cache helper + +The `PrefetchCache` class wraps the cache operations +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/java-lettuce/PrefetchCache.java)): + +```java +import io.lettuce.core.RedisClient; +import io.lettuce.core.RedisURI; +import io.lettuce.core.api.StatefulRedisConnection; + +RedisClient client = RedisClient.create( + RedisURI.builder().withHost("localhost").withPort(6379).build()); +StatefulRedisConnection connection = client.connect(); + +MockPrimaryStore primary = new MockPrimaryStore(80); +PrefetchCache cache = new PrefetchCache(connection, "cache:category:", 3600); + +// Pre-load every primary record into Redis in one pipelined round trip. +cache.bulkLoad(primary.listRecords()); + +// Start the sync worker so primary mutations propagate into Redis. +SyncWorker sync = new SyncWorker(primary, cache); +sync.start(); + +// Read paths now go to Redis only. 
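+
+When you shut your own experiment down, stop the worker before closing the connection so no `applyChange` is mid-flight. A minimal teardown sketch, assuming the objects from the snippet above (the demo server wires the same steps into its own lifecycle, and the timeout value here is just an example):
+
+```java
+sync.stop(1000);      // signal the worker and join it, with a 1 s timeout
+connection.close();   // close the shared Lettuce connection
+client.shutdown();    // release the client's Netty resources
+```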
+
+### Data model
+
+Each cached category is stored in a Redis hash:
+
+```text
+cache:category:cat-001
+    id            = cat-001
+    name          = Beverages
+    display_order = 1
+    featured      = true
+    parent_id     =
+```
+
+The implementation uses:
+
+* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event
+* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path
+* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation
+* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix
+* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI
+* [`MULTI`]({{< relref "/commands/multi" >}})/[`EXEC`]({{< relref "/commands/exec" >}}) for the transactional upsert path in `applyChange`
+
+## Bulk load on startup
+
+The `bulkLoad` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record using Lettuce's async API with `setAutoFlushCommands(false)`. The whole batch flushes in a single network round trip, so loading thousands of records takes one RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table:
+
+```java
+public int bulkLoad(Iterable<Map<String, String>> records) {
+    RedisAsyncCommands<String, String> async = connection.async();
+    connection.setAutoFlushCommands(false);
+    List<RedisFuture<?>> futures = new ArrayList<>();
+    int loaded = 0;
+    try {
+        for (Map<String, String> record : records) {
+            if (record == null) continue;
+            String entityId = record.get("id");
+            if (entityId == null || entityId.isEmpty()) continue;
+            String cacheKey = cacheKey(entityId);
+            futures.add(async.del(cacheKey));
+            futures.add(async.hset(cacheKey, record));
+            futures.add(async.expire(cacheKey, ttlSeconds));
+            loaded += 1;
+        }
+        connection.flushCommands();
+        for (RedisFuture<?> future : futures) {
+            future.get();
+        }
+    } finally {
+        connection.setAutoFlushCommands(true);
+    }
+    if (loaded > 0) prefetched.addAndGet(loaded);
+    return loaded;
+}
+```
+
+The bulk load is intentionally non-transactional: nothing is reading the cache yet on the startup path, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the pipeline fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `bulkLoad` directly from your own code on a cache that is already serving reads, either pause your writers first or rewrite it as a single `MULTI`/`EXEC` block so callers cannot observe a half-loaded record.
+
+Using the async API here is important: the sync API blocks on every command's future, which would defeat the batching even with auto-flush disabled. The async API queues commands locally and only flushes them when `flushCommands()` is called, then waits on the resulting futures in bulk.
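+
+If you want to confirm the single-round-trip claim on your own hardware, a rough timing wrapper around the call is enough. This is a sketch using the objects from the setup snippet, not demo code; the absolute numbers depend on your network and record count:
+
+```java
+long t0 = System.nanoTime();
+int loaded = cache.bulkLoad(primary.listRecords());
+System.out.printf("prefetched %d records in %.1f ms%n",
+        loaded, (System.nanoTime() - t0) / 1e6);
+```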
+
+## Reads from Redis only
+
+The `get` method runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident:
+
+```java
+public Result get(String entityId) {
+    RedisCommands<String, String> sync = connection.sync();
+    String cacheKey = cacheKey(entityId);
+
+    long startedNs = System.nanoTime();
+    Map<String, String> cached = sync.hgetall(cacheKey);
+    double redisLatencyMs = (System.nanoTime() - startedNs) / 1_000_000.0;
+
+    if (cached != null && !cached.isEmpty()) {
+        hits.incrementAndGet();
+        return new Result(cached, true, redisLatencyMs);
+    }
+    misses.incrementAndGet();
+    return new Result(null, false, redisLatencyMs);
+}
+```
+
+This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load.
+
+## Applying sync events
+
+The sync worker calls `applyChange` for every primary mutation. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL in one `MULTI`/`EXEC` block so the cache never holds a stale mix of old and new fields. For a `delete`, it removes the cache key:
+
+```java
+public void applyChange(Map<String, Object> change) {
+    // ... validate op and id ...
+    if ("upsert".equals(op)) {
+        Map<String, String> fields = (Map<String, String>) change.get("fields");
+        if (fields == null || fields.isEmpty()) return;
+        txLock.lock();
+        try {
+            sync.multi();
+            sync.del(cacheKey);
+            sync.hset(cacheKey, fields);
+            sync.expire(cacheKey, ttlSeconds);
+            sync.exec();
+        } finally {
+            txLock.unlock();
+        }
+    } else if ("delete".equals(op)) {
+        sync.del(cacheKey);
+    }
+    // ... record sync_events_applied counter and lag sample ...
+}
+```
+
+The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis.
+
+A Lettuce-specific point: a single `StatefulRedisConnection` is thread-safe for individual command calls, but `MULTI`/`EXEC` is connection-scoped state. If two threads issued transactions over the same connection at the same time, their queued commands would interleave. The demo shares one connection across HTTP handlers and the sync worker, so `txLock` (a `ReentrantLock`) serializes every transactional sequence. In production you would hand each transactional caller its own connection from a pool (see [Production usage](#production-usage)) or migrate the upsert path into a Lua script so the atomicity is server-side and no client-side lock is needed.
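+
+For reference, a server-side variant of the upsert could look roughly like the sketch below. This is an assumed alternative, not code from the demo; the variable names follow the `applyChange` snippet above, and the script simply replays the DEL, HSET, and EXPIRE steps atomically inside Redis:
+
+```java
+import io.lettuce.core.ScriptOutputType;
+
+String upsertLua =
+    "redis.call('DEL', KEYS[1]) " +
+    "for i = 1, #ARGV - 1, 2 do " +
+    "  redis.call('HSET', KEYS[1], ARGV[i], ARGV[i + 1]) " +
+    "end " +
+    "return redis.call('EXPIRE', KEYS[1], ARGV[#ARGV])";
+
+// Flatten the fields map to [field1, value1, ..., fieldN, valueN, ttl].
+List<String> argv = new ArrayList<>();
+fields.forEach((f, v) -> { argv.add(f); argv.add(v); });
+argv.add(String.valueOf(ttlSeconds));
+
+Long expired = sync.eval(upsertLua, ScriptOutputType.INTEGER,
+        new String[] { cacheKey }, argv.toArray(new String[0]));
+```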
+
+## The sync worker
+
+The `SyncWorker` runs a daemon thread that blocks on the primary's change queue with a short timeout. Every change is applied to Redis as soon as it arrives
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/java-lettuce/SyncWorker.java)):
+
+```java
+private void run() {
+    while (!stopRequested) {
+        if (pauseRequested) {
+            // park until resume() ...
+            continue;
+        }
+        Map<String, Object> change = primary.nextChange(pollTimeoutMs);
+        if (change == null) continue;
+        try {
+            cache.applyChange(change);
+        } catch (Exception exc) {
+            System.err.printf("[sync] failed to apply %s: %s%n",
+                    change, exc.getMessage());
+        }
+    }
+}
+```
+
+In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset.
+
+## Invalidation and re-prefetch
+
+Two helpers exist for testing and recovery:
+
+* `invalidate(entityId)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record.
+* `clear()` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss.
+
+In both cases, the recovery path is to call `bulkLoad(primary.listRecords())` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation.
+
+### Re-prefetch under load
+
+`clear()` and `bulkLoad()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.listRecords()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation:
+
+```java
+sync.pause(2000);
+try {
+    cache.clear();
+    cache.bulkLoad(primary.listRecords());
+} finally {
+    sync.resume();
+}
+```
+
+`pause()` waits for the worker to finish whatever event it is currently applying, parks the run loop, and returns. Change events that arrive during the pause sit in the primary's queue and apply in order once `resume()` is called, so no event is lost.
+
+## Hit/miss accounting
+
+The helper keeps in-process counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up:
+
+```java
+public Map<String, Object> stats() {
+    long h = hits.get();
+    long m = misses.get();
+    long total = h + m;
+    double hitRate = total == 0 ? 0.0 : Math.round(1000.0 * h / total) / 10.0;
+    double avgLag;
+    synchronized (lagLock) {
+        avgLag = syncLagSamples == 0
+                ? 0.0
+                : Math.round(100.0 * syncLagMsTotal / syncLagSamples) / 100.0;
+    }
+    Map<String, Object> stats = new LinkedHashMap<>();
+    stats.put("hits", h);
+    stats.put("misses", m);
+    stats.put("hit_rate_pct", hitRate);
+    stats.put("prefetched", prefetched.get());
+    stats.put("sync_events_applied", syncEventsApplied.get());
+    stats.put("sync_lag_ms_avg", avgLag);
+    return stats;
+}
+```
+
+In production you would emit these as Micrometer counters and gauges or push them into your metrics pipeline. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind.
+
+## Prerequisites
+
+Before running the demo, make sure that:
+
+* Redis is running and accessible. By default, the demo connects to `localhost:6379`.
+* JDK 17 or later is installed (the demo uses Java text blocks for the inline HTML).
+* The Lettuce JAR (and its Netty + Reactor dependencies) is on your classpath.
+  Get them from
+  [Maven Central](https://repo1.maven.org/maven2/io/lettuce/lettuce-core/),
+  or via Maven/Gradle in a project setup.
+
+If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`.
+
+## Running the demo
+
+### Get the source files
+
+The demo consists of four Java files. Download them from the [`java-lettuce` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/java-lettuce) on GitHub, or grab them with `curl`:
+
+```bash
+mkdir prefetch-cache-demo && cd prefetch-cache-demo
+BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/java-lettuce
+curl -O $BASE/PrefetchCache.java
+curl -O $BASE/MockPrimaryStore.java
+curl -O $BASE/SyncWorker.java
+curl -O $BASE/DemoServer.java
+```
+
+You also need Lettuce and its runtime dependencies on your classpath. The simplest way is to download them into a local `lib/` directory:
+
+```bash
+mkdir lib && cd lib
+LETTUCE=https://repo1.maven.org/maven2/io/lettuce/lettuce-core/6.5.0.RELEASE
+curl -O $LETTUCE/lettuce-core-6.5.0.RELEASE.jar
+NETTY=https://repo1.maven.org/maven2/io/netty
+for ARTIFACT in netty-buffer netty-codec netty-common netty-handler \
+    netty-resolver netty-transport netty-transport-native-unix-common; do
+  curl -O "$NETTY/$ARTIFACT/4.1.113.Final/$ARTIFACT-4.1.113.Final.jar"
+done
+curl -O https://repo1.maven.org/maven2/io/projectreactor/reactor-core/3.6.6/reactor-core-3.6.6.jar
+curl -O https://repo1.maven.org/maven2/org/reactivestreams/reactive-streams/1.0.4/reactive-streams-1.0.4.jar
+cd ..
+```
+
+### Start the demo server
+
+From the demo directory:
+
+```bash
+javac -cp 'lib/*' PrefetchCache.java MockPrimaryStore.java SyncWorker.java DemoServer.java
+java -cp '.:lib/*' DemoServer --port 8786 --redis-host localhost --redis-port 6379
+```
+
+(Where `lib/` contains `lettuce-core`, `reactor-core`, `reactive-streams`, and the relevant Netty jars.)
+
+You should see something like:
+
+```text
+Redis prefetch-cache demo server listening on http://127.0.0.1:8786
+Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s
+Prefetched 5 records in 90.9 ms; sync worker running
+```
+
+After starting the server, visit `http://localhost:8786`.
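+
+If you want to call the cache from an endpoint of your own, the handler shape is small. This sketch uses the same `com.sun.net.httpserver` API as the demo, but the `/read` route, query parsing, and plain-text response are assumptions rather than the actual `DemoServer` code:
+
+```java
+server.createContext("/read", exchange -> {
+    // Assumes a query string of exactly "id=<entity id>".
+    String id = exchange.getRequestURI().getQuery().replace("id=", "");
+    PrefetchCache.Result res = cache.get(id);
+    String body = res.hit ? res.record.toString() : "miss: " + id;
+    byte[] bytes = body.getBytes(java.nio.charset.StandardCharsets.UTF_8);
+    exchange.sendResponseHeaders(res.hit ? 200 : 404, bytes.length);
+    try (java.io.OutputStream out = exchange.getResponseBody()) {
+        out.write(bytes);
+    }
+});
+```
+
+A miss maps to a 404 here, which matches the guidance under [Production usage](#production-usage): surface the miss, and never re-query the primary from the request path.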
+
+The demo server uses only standard JDK libraries for HTTP handling and concurrency:
+
+* [`com.sun.net.httpserver.HttpServer`](https://docs.oracle.com/en/java/javase/21/docs/api/jdk.httpserver/com/sun/net/httpserver/HttpServer.html) for the web server
+* [`java.util.concurrent.Executors`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/Executors.html) for the request thread pool and sync-worker daemon
+* [`java.util.concurrent.LinkedBlockingQueue`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/LinkedBlockingQueue.html) for the primary's in-process change feed
+
+It exposes a small interactive page where you can:
+
+* See which IDs are in the cache and in the primary, side by side
+* Read a category through the cache and confirm every read is a hit
+* Update a field on the primary and watch the sync worker rewrite the cache hash
+* Add and delete categories and watch them appear and disappear from the cache
+* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure
+* Re-prefetch from the primary to recover from a broken cache state
+* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all
+
+## The mock primary store
+
+To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/java-lettuce/MockPrimaryStore.java)):
+
+```java
+public class MockPrimaryStore {
+    public MockPrimaryStore(int readLatencyMs) { ... }
+
+    public List<Map<String, String>> listRecords() {
+        Thread.sleep(readLatencyMs);
+        // ...
+    }
+
+    public boolean updateField(String entityId, String field, String value) {
+        synchronized (lock) {
+            // ... mutate the record ...
+            emitChangeLocked(CHANGE_OP_UPSERT, entityId, copy);
+        }
+        return true;
+    }
+}
+```
+
+Every mutation appends a change event to an in-process [`LinkedBlockingQueue`](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/LinkedBlockingQueue.html). The sync worker drains the queue with a 50 ms timeout and applies each event to Redis. The mutation lock is held across both the record update and the queue `offer`, so concurrent updates produce change events in the same order as their mutations — a correctness requirement the demo's pause/resume race test relies on. In a real system the queue is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis.
+
+## Production usage
+
+This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it.
+
+### Use a connection pool for transactions
+
+The demo shares a single `StatefulRedisConnection` across HTTP handlers and the sync worker, and serializes every `MULTI`/`EXEC` block with an in-process `ReentrantLock`. In production, use [`ConnectionPoolSupport`](https://github.com/redis/lettuce/wiki/Connection-Pooling) so each transactional caller (or each sync-worker partition, if you shard the change feed) gets its own connection. Once each transaction has a dedicated connection, you can drop `txLock` entirely. An alternative is to merge the `DEL`+`HSET`+`EXPIRE` upsert into a small Lua script invoked with `EVAL` — atomic server-side, lock-free on the client, and a single network round trip per event.
+
+### Replace the in-process change queue with a real CDC pipeline
+
+The demo's in-process queue is the simplest possible stand-in for a CDC change feed. In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset.
+
+### Use a long safety-net TTL, not a freshness TTL
+
+The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key). Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys.
+
+### Decide what to do on a cache miss
+
+A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are:
+
+* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache.
+* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken.
+
+Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides.
+
+### Bound the working set to what fits in memory
+
+Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM.
+
+### Reconcile periodically against the primary
+
+CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss.
+
+### Consider the async or reactive APIs
+
+For high-throughput or event-driven applications, Lettuce's `async()` (`CompletionStage`-based) or `reactive()` (Project Reactor) APIs let request-handling threads return immediately while Redis work continues. The prefetch-cache structure is identical — replace the synchronous `hgetall` / `multi`/`exec` calls with their async counterparts and chain them together. The bulk-load path in this helper already uses the async API to batch its pipeline.
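+
+As a sketch of what the async read path might look like (assumed code, not part of the demo; `RedisFuture` implements `CompletionStage`, so it chains like any other future):
+
+```java
+RedisFuture<Map<String, String>> future =
+        connection.async().hgetall("cache:category:cat-001");
+future.thenAccept(cached -> {
+    if (cached.isEmpty()) {
+        // Surface the miss without blocking the request thread.
+    }
+});
+```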
+ +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` argument exactly for this. + +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. + +## Learn more + +* [Lettuce guide]({{< relref "/develop/clients/lettuce" >}}) - Install and use the Lettuce Redis client +* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields +* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash +* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds +* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete +* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server +* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key +* [MULTI command]({{< relref "/commands/multi" >}}) / [EXEC command]({{< relref "/commands/exec" >}}) - Transactional upsert path in `applyChange` +* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud diff --git a/content/develop/use-cases/prefetch-cache/nodejs/_index.md b/content/develop/use-cases/prefetch-cache/nodejs/_index.md new file mode 100644 index 0000000000..4df155e815 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/nodejs/_index.md @@ -0,0 +1,390 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in Node.js with node-redis +linkTitle: node-redis example (Node.js) +title: Redis prefetch cache with node-redis +weight: 2 +--- + +This guide shows you how to implement a Redis prefetch cache in Node.js with [`node-redis`]({{< relref "/develop/clients/nodejs" >}}). It includes a small local web server built with the Node.js standard `http` module so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. 
+ +That gives you: + +* Near-100% cache hit ratios for reference and master data +* Sub-millisecond reads for lookup-heavy paths at peak traffic +* All reference-data reads offloaded from the primary database +* Source-database changes propagated into Redis within a few milliseconds +* A long safety-net TTL that bounds memory if the sync pipeline ever stops + +In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `cache.bulkLoad(await primary.listRecords())`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. **On every read**, the application calls `cache.get(entityId)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to an in-process queue. The sync worker async task drains the queue and calls `cache.applyChange(event)`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key. + +In a real system the in-process change queue is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis. + +## The prefetch-cache helper + +The `PrefetchCache` class wraps the cache operations +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/nodejs/cache.js)): + +```javascript +const { createClient } = require("redis"); +const { PrefetchCache } = require("./cache"); +const { MockPrimaryStore } = require("./primary"); +const { SyncWorker } = require("./sync_worker"); + +const client = createClient({ socket: { host: "localhost", port: 6379 } }); +await client.connect(); + +const primary = new MockPrimaryStore(); +const cache = new PrefetchCache({ redisClient: client, ttlSeconds: 3600 }); + +// Pre-load every primary record into Redis in one pipelined round trip. +await cache.bulkLoad(await primary.listRecords()); + +// Start the sync worker so primary mutations propagate into Redis. +const sync = new SyncWorker({ primary, cache }); +sync.start(); + +// Read paths now go to Redis only. 
+const { record, hit, redisLatencyMs } = await cache.get("cat-001");
+```
+
+### Data model
+
+Each cached category is stored in a Redis hash:
+
+```text
+cache:category:cat-001
+    id            = cat-001
+    name          = Beverages
+    display_order = 1
+    featured      = true
+    parent_id     =
+```
+
+The implementation uses:
+
+* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event
+* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path
+* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation
+* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix
+* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI
+
+## Bulk load on startup
+
+The `bulkLoad` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record. The pipeline is sent in a single round trip, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table:
+
+```javascript
+async bulkLoad(records) {
+  let loaded = 0;
+  const pipe = this.redis.multi();
+  for (const record of records) {
+    const entityId = record && record.id;
+    if (!entityId) continue;
+    const cacheKey = this._cacheKey(entityId);
+    pipe.del(cacheKey);
+    pipe.hSet(cacheKey, record);
+    pipe.expire(cacheKey, this.ttlSeconds);
+    loaded += 1;
+  }
+  if (loaded > 0) {
+    await pipe.execAsPipeline();
+  }
+  this._prefetched += loaded;
+  return loaded;
+}
+```
+
+`execAsPipeline()` is intentional on the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `bulkLoad` directly from your own code on a cache that is already serving reads, either pause your writers first or rewrite it with `multi().exec()` so callers cannot observe a half-loaded record.
+
+## Reads from Redis only
+
+The `get` method runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident:
+
+```javascript
+async get(entityId) {
+  const cacheKey = this._cacheKey(entityId);
+
+  const started = process.hrtime.bigint();
+  const cached = await this.redis.hGetAll(cacheKey);
+  const redisLatencyMs = Number(process.hrtime.bigint() - started) / 1e6;
+
+  if (cached && Object.keys(cached).length > 0) {
+    this._hits += 1;
+    return { record: cached, hit: true, redisLatencyMs };
+  }
+
+  this._misses += 1;
+  return { record: null, hit: false, redisLatencyMs };
+}
+```
+
+This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load.
+
+## Applying sync events
+
+The sync worker calls `applyChange` for every primary mutation. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL in one pipelined transaction so the cache never holds a stale mix of old and new fields. For a `delete`, it removes the cache key:
+
+```javascript
+async applyChange(change) {
+  const { op, id: entityId, fields } = change;
+  if (!entityId) return;
+
+  const cacheKey = this._cacheKey(entityId);
+
+  if (op === "upsert") {
+    if (!fields || Object.keys(fields).length === 0) return;
+    await this.redis
+      .multi()
+      .del(cacheKey)
+      .hSet(cacheKey, fields)
+      .expire(cacheKey, this.ttlSeconds)
+      .exec();
+  } else if (op === "delete") {
+    await this.redis.del(cacheKey);
+  }
+}
+```
+
+The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. The empty-fields guard avoids crashing the sync worker if a malformed event arrives, because node-redis rejects an `hSet` call with an empty object.
+
+## The sync worker
+
+The `SyncWorker` runs an async task that awaits the primary's change queue with a short timeout. Every change is applied to Redis as soon as it arrives
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/nodejs/sync_worker.js)):
+
+```javascript
+async _run() {
+  while (!this._stopped) {
+    if (this._paused) {
+      if (this._pausedIdleResolve) this._pausedIdleResolve();
+      while (this._paused && !this._stopped) await this._resumePromise;
+      continue;
+    }
+
+    const change = await this.primary.nextChange(this.pollTimeoutMs);
+    if (change == null) continue;
+    try {
+      await this.cache.applyChange(change);
+    } catch (err) {
+      console.error(`[sync] failed to apply ${JSON.stringify(change)}: ${err && err.message}`);
+    }
+  }
+}
+```
+
+In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset.
+
+## Invalidation and re-prefetch
+
+Two helpers exist for testing and recovery:
+
+* `invalidate(entityId)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record.
+* `clear()` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss.
+
+In both cases, the recovery path is to call `bulkLoad(await primary.listRecords())` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation.
+
+### Re-prefetch under load
+
+`clear()` and `bulkLoad()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.listRecords()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation:
+
+```javascript
+await sync.pause();
+try {
+  await cache.clear();
+  await cache.bulkLoad(await primary.listRecords());
+} finally {
+  sync.resume();
+}
+```
+
+`pause()` waits for the worker to finish whatever event it is currently applying, parks the run loop, and returns. Change events that arrive during the pause sit in the primary's queue and apply in order once `resume()` is called, so no event is lost.
+
+## Hit/miss accounting
+
+The helper keeps in-process counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up:
+
+```javascript
+stats() {
+  const total = this._hits + this._misses;
+  const hitRate = total > 0 ? Math.round((1000 * this._hits) / total) / 10 : 0.0;
+  const avgLag = this._syncLagSamples > 0
+    ? Math.round((this._syncLagMsTotal / this._syncLagSamples) * 100) / 100
+    : 0.0;
+  return {
+    hits: this._hits,
+    misses: this._misses,
+    hit_rate_pct: hitRate,
+    prefetched: this._prefetched,
+    sync_events_applied: this._syncEventsApplied,
+    sync_lag_ms_avg: avgLag,
+  };
+}
+```
+
+In production you would emit these as Prometheus counters and gauges. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind.
+
+## Prerequisites
+
+Before running the demo, make sure that:
+
+* Redis is running and accessible. By default, the demo connects to `localhost:6379`.
+* Node.js 18 or later is installed.
+* The `redis` (node-redis) package is installed via `npm install`. The demo pins to the latest 5.x.
+
+If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`.
+
+## Running the demo
+
+### Get the source files
+
+The demo consists of five files. Download them from the [`nodejs` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/nodejs) on GitHub, or grab them with `curl`:
+
+```bash
+mkdir prefetch-cache-demo && cd prefetch-cache-demo
+BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/nodejs
+curl -O $BASE/cache.js
+curl -O $BASE/primary.js
+curl -O $BASE/sync_worker.js
+curl -O $BASE/demoServer.js
+curl -O $BASE/package.json
+npm install
+```
+
+### Start the demo server
+
+From that directory:
+
+```bash
+node demoServer.js
+```
+
+You should see something like:
+
+```text
+Redis prefetch-cache demo server listening on http://127.0.0.1:8783
+Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s
+Prefetched 5 records in 85.5 ms; sync worker running
+```
+
+After starting the server, visit `http://localhost:8783`.
+
+The demo server uses only Node.js standard library features for HTTP and concurrency:
+
+* The [`http`](https://nodejs.org/api/http.html) module for the web server, with manual route dispatch on `req.method` and `url.pathname`
+* The [`url`](https://nodejs.org/api/url.html) module's `URL` and `URLSearchParams` for query and form decoding
+* `async`/`await` for the sync worker's long-running task
+
+It exposes a small interactive page where you can:
+
+* See which IDs are in the cache and in the primary, side by side
+* Read a category through the cache and confirm every read is a hit
+* Update a field on the primary and watch the sync worker rewrite the cache hash
+* Add and delete categories and watch them appear and disappear from the cache
+* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure
+* Re-prefetch from the primary to recover from a broken cache state
+* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all
+
+## The mock primary store
+
+To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/nodejs/primary.js)):
+
+```javascript
+class MockPrimaryStore {
+  constructor({ readLatencyMs = 80 } = {}) { /* ... */ }
+
+  async listRecords() {
+    await new Promise((r) => setTimeout(r, this.readLatencyMs));
+    this._reads += 1;
+    return [...this._records.values()].map((r) => ({ ...r }));
+  }
+
+  updateField(entityId, field, value) {
+    const record = this._records.get(entityId);
+    if (record === undefined) return false;
+    record[field] = String(value);
+    this._emitChange("upsert", entityId, { ...record });
+    return true;
+  }
+}
+```
+
+Every mutation appends a change event to an in-process queue and the sync worker `await`s `primary.nextChange(timeoutMs)` to drain it. Node.js JS execution is single-threaded, so the mutate-then-emit pair runs without any `await` between the two steps — that's how the queue order stays aligned with the mutation order. In a real system this queue is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis.
+
+## Production usage
+
+This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it.
+
+### Replace the in-process change queue with a real CDC pipeline
+
+The demo's in-process queue is the simplest possible stand-in for a CDC change feed. In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset.
+
+### Use a long safety-net TTL, not a freshness TTL
+
+The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key).
Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys. + +### Decide what to do on a cache miss + +A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are: + +* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache. +* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken. + +Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides. + +### Bound the working set to what fits in memory + +Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM. + +### Reconcile periodically against the primary + +CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss. + +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` constructor option exactly for this. + +### Use a connection pool or a single multiplexed client + +The demo shares a single `createClient()` connection across the HTTP handlers and the sync worker. node-redis 5.x multiplexes commands over that one socket, so a single client handles concurrent reads, transactions, and pipelined writes without any in-process locking. In production, monitor the client's `error` event and reconnect-and-retry on transient failures, and consider a small pool of clients if you need to isolate slow blocking commands from the rest of your traffic. + +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. 
+ +## Learn more + +* [node-redis guide]({{< relref "/develop/clients/nodejs" >}}) - Install and use the Node.js Redis client +* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields +* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash +* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds +* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete +* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server +* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key +* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud diff --git a/content/develop/use-cases/prefetch-cache/nodejs/cache.js b/content/develop/use-cases/prefetch-cache/nodejs/cache.js new file mode 100644 index 0000000000..317daefc16 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/nodejs/cache.js @@ -0,0 +1,241 @@ +"use strict"; + +/** + * Redis prefetch-cache helper. + * + * Each cached entity is stored as a Redis hash under `cache:{prefix}:{id}` + * with a long safety-net TTL that bounds memory if the sync pipeline ever + * stops, but is not the freshness mechanism. Freshness comes from the + * `applyChange` path, which the sync worker calls every time a primary + * mutation arrives. + * + * Reads run `HGETALL` against Redis only. A miss is not a fall-back + * trigger — the application treats it as an error or a deliberate + * `invalidate` for testing. In production a sustained miss rate means + * the prefetch or the sync pipeline is broken, not that the primary should + * be re-queried on the request path. + */ + +class PrefetchCache { + constructor({ + redisClient, + prefix = "cache:category:", + ttlSeconds = 3600, + } = {}) { + if (!redisClient) { + throw new Error("A connected redisClient is required."); + } + this.redis = redisClient; + this.prefix = prefix; + this.ttlSeconds = ttlSeconds; + + // Node.js is single-threaded for JS execution, so plain numbers are + // safe for counters. No lock needed; an `await` in one helper cannot + // interleave another helper's counter increment between the read and + // the write on the same line. + this._hits = 0; + this._misses = 0; + this._prefetched = 0; + this._syncEventsApplied = 0; + this._syncLagMsTotal = 0; + this._syncLagSamples = 0; + } + + _cacheKey(entityId) { + return `${this.prefix}${entityId}`; + } + + _stripPrefix(key) { + return key.startsWith(this.prefix) ? key.slice(this.prefix.length) : key; + } + + /** + * Pipeline `DEL` + `HSET` + `EXPIRE` for every record. Returns the count loaded. + * + * The pipeline is non-transactional: it is fast on startup (when + * nothing is reading the cache) and on the live `/reprefetch` path + * (when the demo pauses the sync worker around the call). Calling + * `bulkLoad` on a cache that is actively being read and written to can + * briefly expose a key that has been deleted but not yet rewritten; + * pause the writers first or rewrite this with a transactional MULTI + * if that matters. 
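+   *
+   * (With node-redis, the transactional variant is a one-call change:
+   * `.exec()` on the same `multi()` chain runs the queued commands
+   * inside MULTI/EXEC, whereas `.execAsPipeline()` skips the
+   * transaction wrapper.)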
+ */ + async bulkLoad(records) { + let loaded = 0; + const pipe = this.redis.multi(); + for (const record of records) { + const entityId = record && record.id; + if (!entityId) { + continue; + } + const cacheKey = this._cacheKey(entityId); + pipe.del(cacheKey); + pipe.hSet(cacheKey, record); + pipe.expire(cacheKey, this.ttlSeconds); + loaded += 1; + } + if (loaded > 0) { + // execAsPipeline sends the commands in one round trip without + // wrapping them in MULTI/EXEC. + await pipe.execAsPipeline(); + } + this._prefetched += loaded; + return loaded; + } + + /** + * Return `{ record, hit, redisLatencyMs }` for an `HGETALL` against Redis. + * + * Prefetch-cache reads do not fall back to the primary. A miss is a + * signal that the cache is incomplete, not a trigger to re-query the + * source. The caller decides how to surface it. + */ + async get(entityId) { + const cacheKey = this._cacheKey(entityId); + + const started = process.hrtime.bigint(); + const cached = await this.redis.hGetAll(cacheKey); + const redisLatencyMs = Number(process.hrtime.bigint() - started) / 1e6; + + if (cached && Object.keys(cached).length > 0) { + this._hits += 1; + return { record: cached, hit: true, redisLatencyMs }; + } + + this._misses += 1; + return { record: null, hit: false, redisLatencyMs }; + } + + /** + * Apply a primary change event to Redis. + * + * The sync worker calls this for every event the primary emits. For an + * upsert, the helper rewrites the hash and refreshes the safety-net + * TTL. For a delete, it removes the cache key. + */ + async applyChange(change) { + if (!change) { + return; + } + const { op, id: entityId, fields, timestamp_ms: timestampMs } = change; + if (!entityId) { + return; + } + + const cacheKey = this._cacheKey(entityId); + + if (op === "upsert") { + if (!fields || Object.keys(fields).length === 0) { + // Malformed upsert with no fields. Skip rather than crash the + // sync worker: hSet with an empty mapping rejects in node-redis, + // and there's nothing to write anyway. A real CDC consumer would + // route this to a dead-letter queue and alert; the demo just drops it. + return; + } + await this.redis + .multi() + .del(cacheKey) + .hSet(cacheKey, fields) + .expire(cacheKey, this.ttlSeconds) + .exec(); + } else if (op === "delete") { + await this.redis.del(cacheKey); + } else { + return; + } + + this._syncEventsApplied += 1; + if (typeof timestampMs === "number" && Number.isFinite(timestampMs)) { + const lagMs = Math.max(0, Date.now() - timestampMs); + this._syncLagMsTotal += lagMs; + this._syncLagSamples += 1; + } + } + + /** Delete one cache key. Demo-only: simulates a broken sync pipeline. */ + async invalidate(entityId) { + const deleted = await this.redis.del(this._cacheKey(entityId)); + return deleted === 1; + } + + /** Delete every key under this cache's prefix and return the count. */ + async clear() { + let deleted = 0; + let pipe = this.redis.multi(); + let batch = 0; + const match = `${this.prefix}*`; + // node-redis 5.x scanIterator yields batches (arrays) of keys, not + // individual keys. Iterate over each batch and flatten. + for await (const keys of this.redis.scanIterator({ MATCH: match, COUNT: 500 })) { + for (const key of keys) { + pipe.del(key); + batch += 1; + if (batch >= 500) { + const results = await pipe.execAsPipeline(); + deleted += results.reduce((acc, r) => acc + (r ? 
Number(r) : 0), 0); + pipe = this.redis.multi(); + batch = 0; + } + } + } + if (batch > 0) { + const results = await pipe.execAsPipeline(); + deleted += results.reduce((acc, r) => acc + (r ? Number(r) : 0), 0); + } + return deleted; + } + + /** Return every entity id currently in the cache. */ + async ids() { + const out = []; + const match = `${this.prefix}*`; + for await (const keys of this.redis.scanIterator({ MATCH: match, COUNT: 500 })) { + for (const key of keys) { + out.push(this._stripPrefix(key)); + } + } + out.sort(); + return out; + } + + async count() { + let n = 0; + const match = `${this.prefix}*`; + for await (const keys of this.redis.scanIterator({ MATCH: match, COUNT: 500 })) { + n += keys.length; + } + return n; + } + + async ttlRemaining(entityId) { + return this.redis.ttl(this._cacheKey(entityId)); + } + + stats() { + const total = this._hits + this._misses; + const hitRate = total > 0 ? Math.round((1000 * this._hits) / total) / 10 : 0.0; + const avgLag = + this._syncLagSamples > 0 + ? Math.round((this._syncLagMsTotal / this._syncLagSamples) * 100) / 100 + : 0.0; + return { + hits: this._hits, + misses: this._misses, + hit_rate_pct: hitRate, + prefetched: this._prefetched, + sync_events_applied: this._syncEventsApplied, + sync_lag_ms_avg: avgLag, + }; + } + + resetStats() { + this._hits = 0; + this._misses = 0; + this._prefetched = 0; + this._syncEventsApplied = 0; + this._syncLagMsTotal = 0; + this._syncLagSamples = 0; + } +} + +module.exports = { PrefetchCache }; diff --git a/content/develop/use-cases/prefetch-cache/nodejs/demoServer.js b/content/develop/use-cases/prefetch-cache/nodejs/demoServer.js new file mode 100644 index 0000000000..4887fbc24f --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/nodejs/demoServer.js @@ -0,0 +1,699 @@ +#!/usr/bin/env node +"use strict"; + +/** + * Redis prefetch-cache demo server. + * + * Run this file and visit http://localhost:8783 to watch a prefetch cache + * in action: the demo bulk-loads every primary record into Redis on + * startup, runs a background sync worker that applies primary mutations + * within milliseconds, and lets you add, update, delete, and re-prefetch + * records to see how the cache stays current without ever falling back to + * the primary on the read path. + */ + +const http = require("http"); +const { URL, URLSearchParams } = require("url"); +const { createClient } = require("redis"); + +const { PrefetchCache } = require("./cache"); +const { MockPrimaryStore } = require("./primary"); +const { SyncWorker } = require("./sync_worker"); + +const HTML_TEMPLATE = ` + + + + + Redis Prefetch Cache Demo + + + +
+
node-redis + Node.js standard http module
+

Redis Prefetch Cache Demo

+

+ Every record from the primary store has been pre-loaded into Redis. + Reads run HGETALL against Redis only — there is no + fall-back to the primary on the read path. When you add, update, or + delete a record, the primary emits a change event that a background + sync worker applies to Redis within a few milliseconds. A long + safety-net TTL (__CACHE_TTL__ s) is refreshed on every add or update + event (delete events remove the key) and bounds memory if sync ever stops. +

+ +
+
+

Cache state

+
Loading...
+ +
+ +
+

Read a category

+

Reads come from Redis only. Every read should be a hit because + the cache was pre-loaded and the sync worker keeps it current.

+ + + +
+ +
+

Update a field

+

Updates write to the primary. The sync worker picks up the + change event and rewrites the cache hash within milliseconds.

+ + + + + + + +
+ +
+

Add a category

+

Inserts to the primary propagate to the cache through the same + sync path.

+ + + + + + + +
+ +
+

Delete a category

+

Deletes remove the record from the primary, and the sync worker + removes the cache entry.

+ + + +
+ +
+

Break the cache

+

Simulate a failure of the sync pipeline. Reads against the + affected key(s) return a miss until you re-prefetch.

+ + +
+ + +
+ +
+ +
+

Cache stats

+
Loading...
+ +
+ +
+

Last result

+

Read a category to see the cached record and timing.

+
+
+ +
+
+ + + + +`; + +function parseArgs() { + const args = process.argv.slice(2); + const config = { + host: "127.0.0.1", + port: 8783, + redisHost: "localhost", + redisPort: 6379, + cachePrefix: "cache:category:", + ttlSeconds: 3600, + primaryLatencyMs: 80, + }; + + for (let i = 0; i < args.length; i += 1) { + switch (args[i]) { + case "--host": + config.host = args[++i]; + break; + case "--port": + config.port = Number.parseInt(args[++i], 10); + break; + case "--redis-host": + config.redisHost = args[++i]; + break; + case "--redis-port": + config.redisPort = Number.parseInt(args[++i], 10); + break; + case "--cache-prefix": + config.cachePrefix = args[++i]; + break; + case "--ttl-seconds": + config.ttlSeconds = Number.parseInt(args[++i], 10); + break; + case "--primary-latency-ms": + config.primaryLatencyMs = Number.parseInt(args[++i], 10); + break; + default: + break; + } + } + return config; +} + +function readBody(req) { + return new Promise((resolve, reject) => { + let body = ""; + req.on("data", (chunk) => { + body += chunk; + }); + req.on("end", () => resolve(body)); + req.on("error", reject); + }); +} + +function sendJson(res, status, payload) { + const body = JSON.stringify(payload); + res.writeHead(status, { "Content-Type": "application/json" }); + res.end(body); +} + +function htmlPage(cacheTtlSeconds) { + return HTML_TEMPLATE.replace("__CACHE_TTL__", String(cacheTtlSeconds)); +} + +async function buildStats(cache, primary) { + const stats = cache.stats(); + stats.primary_reads_total = primary.reads(); + stats.primary_read_latency_ms = primary.readLatencyMs; + return stats; +} + +async function handleRead(url, cache, primary) { + const id = url.searchParams.get("id") || ""; + if (!id) { + return { status: 400, body: { error: "Missing 'id'." } }; + } + const { record, hit, redisLatencyMs } = await cache.get(id); + return { + status: 200, + body: { + id, + record, + hit, + redis_latency_ms: Math.round(redisLatencyMs * 100) / 100, + ttl_remaining: await cache.ttlRemaining(id), + stats: await buildStats(cache, primary), + }, + }; +} + +async function handleUpdate(form, cache, primary) { + const id = form.get("id") || ""; + const field = form.get("field") || ""; + const value = form.get("value") || ""; + if (!id || !field) { + return { status: 400, body: { error: "Missing 'id' or 'field'." } }; + } + if (!primary.updateField(id, field, value)) { + return { status: 404, body: { error: `Unknown category '${id}'.` } }; + } + return { + status: 200, + body: { id, field, value, stats: await buildStats(cache, primary) }, + }; +} + +async function handleAdd(form, cache, primary) { + const id = (form.get("id") || "").trim(); + const name = (form.get("name") || "").trim(); + if (!id || !name) { + return { status: 400, body: { error: "Missing 'id' or 'name'." } }; + } + const record = { + id, + name, + display_order: form.get("display_order") || "99", + featured: form.get("featured") || "false", + parent_id: form.get("parent_id") || "", + }; + if (!primary.addRecord(record)) { + return { status: 409, body: { error: `Category '${id}' already exists.` } }; + } + return { + status: 200, + body: { id, record, stats: await buildStats(cache, primary) }, + }; +} + +async function handleDelete(form, cache, primary) { + const id = form.get("id") || ""; + if (!id) { + return { status: 400, body: { error: "Missing 'id'." 
} }; + } + if (!primary.deleteRecord(id)) { + return { status: 404, body: { error: `Unknown category '${id}'.` } }; + } + return { + status: 200, + body: { id, stats: await buildStats(cache, primary) }, + }; +} + +async function handleInvalidate(form, cache, primary) { + const id = form.get("id") || ""; + if (!id) { + return { status: 400, body: { error: "Missing 'id'." } }; + } + const deleted = await cache.invalidate(id); + return { + status: 200, + body: { id, deleted, stats: await buildStats(cache, primary) }, + }; +} + +async function main() { + const config = parseArgs(); + + const client = createClient({ + socket: { host: config.redisHost, port: config.redisPort }, + }); + client.on("error", (err) => console.error("Redis error:", err)); + await client.connect(); + + const cache = new PrefetchCache({ + redisClient: client, + prefix: config.cachePrefix, + ttlSeconds: config.ttlSeconds, + }); + const primary = new MockPrimaryStore({ readLatencyMs: config.primaryLatencyMs }); + const sync = new SyncWorker({ primary, cache }); + + const started = process.hrtime.bigint(); + await cache.clear(); + const loaded = await cache.bulkLoad(await primary.listRecords()); + const elapsedMs = Number(process.hrtime.bigint() - started) / 1e6; + sync.start(); + + console.log( + `Redis prefetch-cache demo server listening on http://${config.host}:${config.port}` + ); + console.log( + `Using Redis at ${config.redisHost}:${config.redisPort} ` + + `with cache prefix '${config.cachePrefix}' and TTL ${config.ttlSeconds}s` + ); + console.log(`Prefetched ${loaded} records in ${elapsedMs.toFixed(1)} ms; sync worker running`); + + const server = http.createServer(async (req, res) => { + const url = new URL(req.url, `http://${req.headers.host || "localhost"}`); + + try { + if (req.method === "GET" && (url.pathname === "/" || url.pathname === "/index.html")) { + const html = htmlPage(cache.ttlSeconds); + res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" }); + res.end(html); + return; + } + if (req.method === "GET" && url.pathname === "/categories") { + sendJson(res, 200, { + cache_ids: await cache.ids(), + primary_ids: primary.listIds(), + }); + return; + } + if (req.method === "GET" && url.pathname === "/read") { + const r = await handleRead(url, cache, primary); + sendJson(res, r.status, r.body); + return; + } + if (req.method === "GET" && url.pathname === "/stats") { + sendJson(res, 200, await buildStats(cache, primary)); + return; + } + if (req.method === "POST" && url.pathname === "/update") { + const form = new URLSearchParams(await readBody(req)); + const r = await handleUpdate(form, cache, primary); + sendJson(res, r.status, r.body); + return; + } + if (req.method === "POST" && url.pathname === "/add") { + const form = new URLSearchParams(await readBody(req)); + const r = await handleAdd(form, cache, primary); + sendJson(res, r.status, r.body); + return; + } + if (req.method === "POST" && url.pathname === "/delete") { + const form = new URLSearchParams(await readBody(req)); + const r = await handleDelete(form, cache, primary); + sendJson(res, r.status, r.body); + return; + } + if (req.method === "POST" && url.pathname === "/invalidate") { + const form = new URLSearchParams(await readBody(req)); + const r = await handleInvalidate(form, cache, primary); + sendJson(res, r.status, r.body); + return; + } + if (req.method === "POST" && url.pathname === "/clear") { + // Pause the sync worker so it cannot recreate keys between SCAN + // and DEL. Queued events accumulate and apply after resume. 
+ await sync.pause(); + let deleted = 0; + try { + deleted = await cache.clear(); + } finally { + sync.resume(); + } + sendJson(res, 200, { deleted, stats: await buildStats(cache, primary) }); + return; + } + if (req.method === "POST" && url.pathname === "/reprefetch") { + // Pause the sync worker so it cannot interleave with the clear + + // snapshot + bulkLoad sequence. Without this, a change applied + // between listRecords() and bulkLoad() would be overwritten by + // the stale snapshot. + await sync.pause(); + let loadedHere = 0; + let elapsedHere = 0; + try { + const t0 = process.hrtime.bigint(); + await cache.clear(); + loadedHere = await cache.bulkLoad(await primary.listRecords()); + elapsedHere = Number(process.hrtime.bigint() - t0) / 1e6; + } finally { + sync.resume(); + } + sendJson(res, 200, { + loaded: loadedHere, + elapsed_ms: Math.round(elapsedHere * 100) / 100, + stats: await buildStats(cache, primary), + }); + return; + } + if (req.method === "POST" && url.pathname === "/reset") { + cache.resetStats(); + primary.resetReads(); + sendJson(res, 200, await buildStats(cache, primary)); + return; + } + res.writeHead(404, { "Content-Type": "text/plain" }); + res.end("Not Found"); + } catch (err) { + console.error("Request error:", err); + sendJson(res, 500, { error: (err && err.message) || "Internal error" }); + } + }); + + const shutdown = async () => { + console.log("\nShutting down..."); + await sync.stop(); + server.close(); + try { + await client.quit(); + } catch (_e) { + // ignore + } + process.exit(0); + }; + process.on("SIGINT", shutdown); + process.on("SIGTERM", shutdown); + + server.listen(config.port, config.host); +} + +main().catch((err) => { + console.error(err); + process.exit(1); +}); diff --git a/content/develop/use-cases/prefetch-cache/nodejs/package.json b/content/develop/use-cases/prefetch-cache/nodejs/package.json new file mode 100644 index 0000000000..bad8830d69 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/nodejs/package.json @@ -0,0 +1,16 @@ +{ + "name": "redis-prefetch-cache-demo", + "version": "1.0.0", + "description": "Redis prefetch-cache demo with node-redis and the Node.js standard http module.", + "private": true, + "main": "demoServer.js", + "scripts": { + "start": "node demoServer.js" + }, + "dependencies": { + "redis": "^5.0.0" + }, + "engines": { + "node": ">=18" + } +} diff --git a/content/develop/use-cases/prefetch-cache/nodejs/primary.js b/content/develop/use-cases/prefetch-cache/nodejs/primary.js new file mode 100644 index 0000000000..0029ae7aef --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/nodejs/primary.js @@ -0,0 +1,172 @@ +"use strict"; + +/** + * Mock primary data store for the prefetch-cache demo. + * + * This stands in for a source-of-truth database (Postgres, MySQL, Mongo, + * etc.) that holds reference data the application serves to users. + * + * Every mutation appends a change event to an in-process queue, which the + * sync worker drains and applies to Redis. In a real system the queue is + * replaced by a CDC pipeline — Redis Data Integration, Debezium plus a + * lightweight consumer, or an equivalent tool that tails the source's + * binlog/WAL and pushes changes into Redis. + * + * The store also exposes `readLatencyMs` so the demo can illustrate + * how much slower a direct primary read would be than a Redis hit. 
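+ *
+ * Change events carry the shape `applyChange` expects, e.g.
+ * (illustrative values):
+ *
+ *   { op: "upsert" | "delete", id: "cat-001",
+ *     fields: { ...record } | null, timestamp_ms: 1712345678901 }
+ *
+ * See `_emitChange` below for the actual construction.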
+ */ + +const CHANGE_OP_UPSERT = "upsert"; +const CHANGE_OP_DELETE = "delete"; + +class MockPrimaryStore { + constructor({ readLatencyMs = 80 } = {}) { + this.readLatencyMs = readLatencyMs; + this._reads = 0; + + // Change-event queue. The sync worker awaits `nextChange()` and we + // resolve the oldest pending waiter (or buffer the event) so the + // queue order matches the mutation order. + this._pendingEvents = []; + this._waiters = []; + + // Node.js JS execution is single-threaded, so the only "lock" we need + // is a no-op marker: we serialise a mutation and its emit by doing + // both synchronously, without any `await` in between. That gives us + // the same guarantee Python's `_emit_change_locked` gives — two + // concurrent callers cannot interleave mutation A → mutation B → + // emit B → emit A. + this._records = new Map([ + [ + "cat-001", + { id: "cat-001", name: "Beverages", display_order: "1", featured: "true", parent_id: "" }, + ], + [ + "cat-002", + { id: "cat-002", name: "Bakery", display_order: "2", featured: "true", parent_id: "" }, + ], + [ + "cat-003", + { id: "cat-003", name: "Pantry Staples", display_order: "3", featured: "false", parent_id: "" }, + ], + [ + "cat-004", + { id: "cat-004", name: "Frozen", display_order: "4", featured: "false", parent_id: "" }, + ], + [ + "cat-005", + { id: "cat-005", name: "Specialty Cheeses", display_order: "5", featured: "false", parent_id: "cat-002" }, + ], + ]); + } + + /** Metadata-only listing. No simulated latency, no read counter increment. */ + listIds() { + return [...this._records.keys()].sort(); + } + + /** Return every record. Used by the cache's bulk-load path on startup. */ + async listRecords() { + await new Promise((resolve) => setTimeout(resolve, this.readLatencyMs)); + this._reads += 1; + return [...this._records.values()].map((r) => ({ ...r })); + } + + /** Single-record read. Not on the demo's normal read path. */ + async read(entityId) { + await new Promise((resolve) => setTimeout(resolve, this.readLatencyMs)); + this._reads += 1; + const record = this._records.get(entityId); + return record ? { ...record } : null; + } + + addRecord(record) { + const entityId = (record && record.id ? String(record.id) : "").trim(); + if (!entityId) { + return false; + } + if (this._records.has(entityId)) { + return false; + } + const snapshot = { ...record }; + this._records.set(entityId, snapshot); + // Emit synchronously after mutation, before yielding to the event + // loop, so queue order matches mutation order. + this._emitChange(CHANGE_OP_UPSERT, entityId, { ...snapshot }); + return true; + } + + updateField(entityId, field, value) { + const record = this._records.get(entityId); + if (record === undefined) { + return false; + } + record[field] = String(value); + this._emitChange(CHANGE_OP_UPSERT, entityId, { ...record }); + return true; + } + + deleteRecord(entityId) { + if (!this._records.has(entityId)) { + return false; + } + this._records.delete(entityId); + this._emitChange(CHANGE_OP_DELETE, entityId, null); + return true; + } + + /** + * Block up to `timeoutMs` for the next change event. Resolves to the + * event, or null if the timeout elapsed. + */ + nextChange(timeoutMs) { + if (this._pendingEvents.length > 0) { + return Promise.resolve(this._pendingEvents.shift()); + } + return new Promise((resolve) => { + let resolved = false; + const timer = setTimeout(() => { + if (resolved) return; + resolved = true; + // Remove ourselves from the waiter list. 
+ const idx = this._waiters.findIndex((w) => w.resolve === waiterResolve); + if (idx !== -1) { + this._waiters.splice(idx, 1); + } + resolve(null); + }, timeoutMs); + const waiterResolve = (event) => { + if (resolved) return; + resolved = true; + clearTimeout(timer); + resolve(event); + }; + this._waiters.push({ resolve: waiterResolve }); + }); + } + + reads() { + return this._reads; + } + + resetReads() { + this._reads = 0; + } + + _emitChange(op, entityId, fields) { + const event = { + op, + id: entityId, + fields, + timestamp_ms: Date.now(), + }; + if (this._waiters.length > 0) { + const waiter = this._waiters.shift(); + waiter.resolve(event); + } else { + this._pendingEvents.push(event); + } + } +} + +module.exports = { MockPrimaryStore, CHANGE_OP_UPSERT, CHANGE_OP_DELETE }; diff --git a/content/develop/use-cases/prefetch-cache/nodejs/sync_worker.js b/content/develop/use-cases/prefetch-cache/nodejs/sync_worker.js new file mode 100644 index 0000000000..33790e82f3 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/nodejs/sync_worker.js @@ -0,0 +1,152 @@ +"use strict"; + +/** + * Background sync worker for the prefetch-cache demo. + * + * An async task drains the primary's change queue and applies each event + * to Redis through `PrefetchCache.applyChange`. In a real system, the + * queue is replaced by a CDC pipeline (Redis Data Integration, Debezium, + * or an equivalent) that tails the primary's binlog/WAL and writes the + * same shape of events. + * + * The worker exposes `pause()` and `resume()` so maintenance paths + * (`/reprefetch`, `clear()`) can stop event application without tearing + * the task down. `pause()` blocks until the worker is parked, so the + * caller knows no apply is in flight by the time it returns. + */ + +class SyncWorker { + constructor({ primary, cache, pollTimeoutMs = 50 } = {}) { + if (!primary || !cache) { + throw new Error("primary and cache are required."); + } + this.primary = primary; + this.cache = cache; + this.pollTimeoutMs = pollTimeoutMs; + + this._stopped = true; + this._paused = false; + this._runPromise = null; + + // Promise that resolves when the worker has confirmed it is parked + // (idle, with no apply in flight). pause() awaits this. + this._pausedIdlePromise = null; + this._pausedIdleResolve = null; + + // Promise that resolves when the worker should leave its parked + // state — either resume() is called or the worker is stopped. + this._resumePromise = null; + this._resumeResolve = null; + } + + start() { + if (this._runPromise && !this._stopped) { + return; + } + this._stopped = false; + this._paused = false; + this._resetIdleSignal(); + this._resetResumeSignal(); + this._runPromise = this._run(); + } + + /** + * Signal the worker to exit and await its task. + * + * If the join times out the worker is wedged inside applyChange; we + * leave `_runPromise` populated so a subsequent `start()` does not + * spawn a second worker on top of the orphan. + */ + async stop(joinTimeoutMs = 2000) { + this._stopped = true; + // Wake any parked waiter so the worker can observe the stop flag. + if (this._resumeResolve) { + this._resumeResolve(); + } + if (!this._runPromise) { + return; + } + const timeout = new Promise((resolve) => setTimeout(() => resolve("timeout"), joinTimeoutMs)); + const result = await Promise.race([this._runPromise.then(() => "ok"), timeout]); + if (result === "ok") { + this._runPromise = null; + } + } + + /** + * Stop applying events and wait until the worker is parked. 
+ * + * Returns `true` once the worker has confirmed it is idle, or `false` + * if the timeout elapsed first. While paused, change events accumulate + * in the primary's queue and are applied in order after `resume()`. + */ + async pause(timeoutMs = 2000) { + this._resetIdleSignal(); + this._paused = true; + if (!this._runPromise || this._stopped) { + return true; + } + const timeout = new Promise((resolve) => setTimeout(() => resolve(false), timeoutMs)); + const idle = this._pausedIdlePromise.then(() => true); + return Promise.race([idle, timeout]); + } + + resume() { + this._paused = false; + // Wake the parked worker. + if (this._resumeResolve) { + this._resumeResolve(); + } + this._resetResumeSignal(); + this._resetIdleSignal(); + } + + _resetIdleSignal() { + this._pausedIdlePromise = new Promise((resolve) => { + this._pausedIdleResolve = resolve; + }); + } + + _resetResumeSignal() { + this._resumePromise = new Promise((resolve) => { + this._resumeResolve = resolve; + }); + } + + async _run() { + while (!this._stopped) { + if (this._paused) { + // Park until resume() or stop() is called. + // Resolve `_pausedIdleResolve` on every iteration (after each + // resume/pause cycle that's re-armed by `_resetIdleSignal`) so a + // *new* pause() that arrives while we are still parked from the + // previous cycle gets acknowledged immediately, not after the + // caller's full pause-timeout. + while (this._paused && !this._stopped) { + if (this._pausedIdleResolve) { + this._pausedIdleResolve(); + } + await this._resumePromise; + } + // Loop will re-check flags at the top. + continue; + } + + const change = await this.primary.nextChange(this.pollTimeoutMs); + if (change === null || change === undefined) { + continue; + } + try { + await this.cache.applyChange(change); + } catch (err) { + // Demo behaviour: log and drop the event. A production CDC + // consumer would retry with bounded backoff and expose a + // dead-letter / error counter; see the guide's "Production + // usage" section. + console.error(`[sync] failed to apply ${JSON.stringify(change)}: ${err && err.message}`); + } + } + } +} + +module.exports = { SyncWorker }; diff --git a/content/develop/use-cases/prefetch-cache/php/Cache.php b/content/develop/use-cases/prefetch-cache/php/Cache.php new file mode 100644 index 0000000000..010ee26ee2 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/php/Cache.php @@ -0,0 +1,317 @@ +redis = $redis; + $this->prefix = $prefix; + $this->ttlSeconds = $ttlSeconds; + $this->statsKey = 'demo:stats:' . $prefix; + } + + public function getPrefix(): string + { + return $this->prefix; + } + + public function getTtlSeconds(): int + { + return $this->ttlSeconds; + } + + private function cacheKey(string $entityId): string + { + return $this->prefix . $entityId; + } + + private function stripPrefix(string $key): string + { + if (strpos($key, $this->prefix) === 0) { + return substr($key, strlen($this->prefix)); + } + return $key; + } + + /** + * Flatten an associative ["field" => "value"] map into the variadic + * field/value/field/value form HSET expects in Predis 3.x. + * + * Predis 3.x dropped the 1.x convenience signature that accepted an + * associative array; the 1.x form raises "wrong number of arguments + * for 'hset'" against 3.x. + * + * @return list + */ + private static function flattenFields(array $fields): array + { + $out = []; + foreach ($fields as $k => $v) { + $out[] = (string) $k; + $out[] = (string) $v; + } + return $out; + } + + /** + * Pipeline DEL + HSET + EXPIRE for every record. 
Returns the count
+     * loaded.
+     *
+     * The pipeline is non-transactional: it is fast on startup (when
+     * nothing is reading the cache) and on the live /reprefetch path
+     * (when the demo pauses the sync worker around the call). Calling
+     * bulkLoad on a cache that is actively being read and written to
+     * can briefly expose a key that has been deleted but not yet
+     * rewritten; pause the writers first or rewrite this as a
+     * transactional pipeline if that matters.
+     *
+     * @param iterable<array<string, string>> $records
+     */
+    public function bulkLoad(iterable $records): int
+    {
+        $loaded = 0;
+        $pipe = $this->redis->pipeline();
+        foreach ($records as $record) {
+            $entityId = $record['id'] ?? '';
+            if ($entityId === '') {
+                continue;
+            }
+            $cacheKey = $this->cacheKey($entityId);
+            $pipe->del([$cacheKey]);
+            $pipe->hset($cacheKey, ...self::flattenFields($record));
+            $pipe->expire($cacheKey, $this->ttlSeconds);
+            $loaded++;
+        }
+        if ($loaded > 0) {
+            $pipe->execute();
+            $this->redis->hincrby($this->statsKey, 'prefetched', $loaded);
+        }
+        return $loaded;
+    }
+
+    /**
+     * Return [record_or_null, hit, redisLatencyMs] for an HGETALL.
+     *
+     * Prefetch-cache reads do not fall back to the primary. A miss is
+     * a signal that the cache is incomplete, not a trigger to re-query
+     * the source. The caller decides how to surface it.
+     *
+     * @return array{0: ?array<string, string>, 1: bool, 2: float}
+     */
+    public function get(string $entityId): array
+    {
+        $cacheKey = $this->cacheKey($entityId);
+        $started = microtime(true);
+        $cached = $this->redis->hgetall($cacheKey);
+        $redisLatencyMs = (microtime(true) - $started) * 1000.0;
+
+        if (is_array($cached) && !empty($cached)) {
+            $this->redis->hincrby($this->statsKey, 'hits', 1);
+            return [$cached, true, $redisLatencyMs];
+        }
+
+        $this->redis->hincrby($this->statsKey, 'misses', 1);
+        return [null, false, $redisLatencyMs];
+    }
+
+    /**
+     * Apply a primary change event to Redis.
+     *
+     * For an upsert, rewrites the hash (DEL + HSET + EXPIRE in a
+     * MULTI/EXEC transaction) and refreshes the safety-net TTL. For a
+     * delete, removes the cache key.
+     *
+     * If op=="upsert" and fields is missing or empty, returns early
+     * without writing — HSET with an empty map raises in most clients,
+     * including Predis 3.x. A real CDC consumer would route this to a
+     * dead-letter queue; the demo drops it.
+     *
+     * @param array{op:string,id:string,fields:?array<string, string>,timestamp_ms:float} $change
+     */
+    public function applyChange(array $change): void
+    {
+        $op = $change['op'] ?? '';
+        $entityId = $change['id'] ?? '';
+        if ($entityId === '') {
+            return;
+        }
+        $cacheKey = $this->cacheKey($entityId);
+
+        if ($op === 'upsert') {
+            $fields = $change['fields'] ?? null;
+            if (!is_array($fields) || empty($fields)) {
+                return;
+            }
+            $tx = $this->redis->transaction();
+            $tx->del([$cacheKey]);
+            $tx->hset($cacheKey, ...self::flattenFields($fields));
+            $tx->expire($cacheKey, $this->ttlSeconds);
+            $tx->execute();
+        } elseif ($op === 'delete') {
+            $this->redis->del([$cacheKey]);
+        } else {
+            return;
+        }
+
+        $this->redis->hincrby($this->statsKey, 'sync_events_applied', 1);
+        $timestampMs = $change['timestamp_ms'] ?? null;
+        if (is_int($timestampMs) || is_float($timestampMs)) {
+            $lagMs = max(0.0, (microtime(true) * 1000.0) - (float) $timestampMs);
+            // Track sum + sample count separately; stats() divides on
+            // read. HINCRBYFLOAT is the only sane way to accumulate a
+            // floating-point sum across processes without round-trip
+            // read-modify-writes. 
+ $this->redis->hincrbyfloat($this->statsKey, 'sync_lag_ms_total', $lagMs); + $this->redis->hincrby($this->statsKey, 'sync_lag_samples', 1); + } + } + + /** + * Delete one cache key. Demo-only: simulates a broken sync pipeline. + */ + public function invalidate(string $entityId): bool + { + return (int) $this->redis->del([$this->cacheKey($entityId)]) === 1; + } + + /** + * Delete every key under this cache's prefix and return the count. + */ + public function clear(): int + { + $deleted = 0; + $batch = []; + $cursor = '0'; + do { + $result = $this->redis->scan($cursor, ['MATCH' => $this->prefix . '*', 'COUNT' => 500]); + $cursor = (string) $result[0]; + $keys = $result[1] ?? []; + foreach ($keys as $key) { + $batch[] = (string) $key; + if (count($batch) >= 500) { + $deleted += (int) $this->redis->del($batch); + $batch = []; + } + } + } while ($cursor !== '0'); + if (!empty($batch)) { + $deleted += (int) $this->redis->del($batch); + } + return $deleted; + } + + /** + * Return every entity id currently in the cache, sorted. + * + * @return list + */ + public function ids(): array + { + $ids = []; + $cursor = '0'; + do { + $result = $this->redis->scan($cursor, ['MATCH' => $this->prefix . '*', 'COUNT' => 500]); + $cursor = (string) $result[0]; + $keys = $result[1] ?? []; + foreach ($keys as $key) { + $ids[] = $this->stripPrefix((string) $key); + } + } while ($cursor !== '0'); + sort($ids, SORT_STRING); + return $ids; + } + + public function count(): int + { + $count = 0; + $cursor = '0'; + do { + $result = $this->redis->scan($cursor, ['MATCH' => $this->prefix . '*', 'COUNT' => 500]); + $cursor = (string) $result[0]; + $keys = $result[1] ?? []; + $count += count($keys); + } while ($cursor !== '0'); + return $count; + } + + public function ttlRemaining(string $entityId): int + { + return (int) $this->redis->ttl($this->cacheKey($entityId)); + } + + /** + * Return the counter snapshot. + * + * Counters live in Redis under demo:stats:{prefix}, so every + * request and the sync worker see the same totals. The average lag + * is computed at read time from a running sum and a sample count + * so cross-process increments don't have to coordinate to update + * a single float. + * + * @return array{hits:int,misses:int,hit_rate_pct:float,prefetched:int,sync_events_applied:int,sync_lag_ms_avg:float} + */ + public function stats(): array + { + $raw = $this->redis->hgetall($this->statsKey) ?: []; + $hits = (int) ($raw['hits'] ?? 0); + $misses = (int) ($raw['misses'] ?? 0); + $prefetched = (int) ($raw['prefetched'] ?? 0); + $applied = (int) ($raw['sync_events_applied'] ?? 0); + $lagTotal = (float) ($raw['sync_lag_ms_total'] ?? 0.0); + $lagSamples = (int) ($raw['sync_lag_samples'] ?? 0); + + $total = $hits + $misses; + $hitRate = $total > 0 ? round(100.0 * $hits / $total, 1) : 0.0; + $avgLag = $lagSamples > 0 ? 
round($lagTotal / $lagSamples, 2) : 0.0; + + return [ + 'hits' => $hits, + 'misses' => $misses, + 'hit_rate_pct' => $hitRate, + 'prefetched' => $prefetched, + 'sync_events_applied' => $applied, + 'sync_lag_ms_avg' => $avgLag, + ]; + } + + public function resetStats(): void + { + $this->redis->del([$this->statsKey]); + } +} diff --git a/content/develop/use-cases/prefetch-cache/php/Primary.php b/content/develop/use-cases/prefetch-cache/php/Primary.php new file mode 100644 index 0000000000..5aeed54838 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/php/Primary.php @@ -0,0 +1,395 @@ +> + */ + private static array $seedRecords = [ + 'cat-001' => [ + 'id' => 'cat-001', + 'name' => 'Beverages', + 'display_order' => '1', + 'featured' => 'true', + 'parent_id' => '', + ], + 'cat-002' => [ + 'id' => 'cat-002', + 'name' => 'Bakery', + 'display_order' => '2', + 'featured' => 'true', + 'parent_id' => '', + ], + 'cat-003' => [ + 'id' => 'cat-003', + 'name' => 'Pantry Staples', + 'display_order' => '3', + 'featured' => 'false', + 'parent_id' => '', + ], + 'cat-004' => [ + 'id' => 'cat-004', + 'name' => 'Frozen', + 'display_order' => '4', + 'featured' => 'false', + 'parent_id' => '', + ], + 'cat-005' => [ + 'id' => 'cat-005', + 'name' => 'Specialty Cheeses', + 'display_order' => '5', + 'featured' => 'false', + 'parent_id' => 'cat-002', + ], + ]; + + public function __construct(ClientInterface $redis, int $readLatencyMs = 80) + { + $this->redis = $redis; + $this->readLatencyMs = $readLatencyMs; + $this->idsKey = 'demo:primary:ids'; + $this->hashKeyPrefix = 'demo:primary:hash:'; + $this->changesKey = 'demo:primary:changes'; + $this->readsKey = 'demo:primary:reads'; + } + + public function getReadLatencyMs(): int + { + return $this->readLatencyMs; + } + + public function getChangesKey(): string + { + return $this->changesKey; + } + + private function hashKey(string $id): string + { + return $this->hashKeyPrefix . $id; + } + + /** + * Wipe primary state and re-seed the five demo categories. Called + * by the demo server on startup so the data is always in a known + * shape across restarts. + */ + public function seedIfEmpty(): void + { + if ((int) $this->redis->exists($this->idsKey) > 0) { + return; + } + $pipe = $this->redis->pipeline(); + foreach (self::$seedRecords as $id => $record) { + $pipe->sadd($this->idsKey, [$id]); + $hashKey = $this->hashKey($id); + $pipe->del([$hashKey]); + $args = []; + foreach ($record as $k => $v) { + $args[] = $k; + $args[] = $v; + } + $pipe->hset($hashKey, ...$args); + } + $pipe->execute(); + } + + public function resetSeed(): void + { + // Clear every primary key. Used on /reset in tests. + $ids = $this->redis->smembers($this->idsKey) ?: []; + $pipe = $this->redis->pipeline(); + foreach ($ids as $id) { + $pipe->del([$this->hashKey((string) $id)]); + } + $pipe->del([$this->idsKey, $this->changesKey, $this->readsKey]); + $pipe->execute(); + $this->seedIfEmpty(); + } + + /** + * @return list + */ + public function listIds(): array + { + $ids = $this->redis->smembers($this->idsKey) ?: []; + $ids = array_map('strval', $ids); + sort($ids, SORT_STRING); + return $ids; + } + + /** + * Return every record. Used by the cache's bulk-load path. 
+ * + * @return list> + */ + public function listRecords(): array + { + if ($this->readLatencyMs > 0) { + usleep($this->readLatencyMs * 1000); + } + $this->redis->incr($this->readsKey); + + $ids = $this->listIds(); + $pipe = $this->redis->pipeline(); + foreach ($ids as $id) { + $pipe->hgetall($this->hashKey($id)); + } + $results = $pipe->execute(); + + $records = []; + foreach ($results as $row) { + if (is_array($row) && !empty($row)) { + $records[] = array_map('strval', $row); + } + } + return $records; + } + + /** + * Single-record read. Not on the demo's normal read path. + * + * @return ?array + */ + public function read(string $entityId): ?array + { + if ($this->readLatencyMs > 0) { + usleep($this->readLatencyMs * 1000); + } + $this->redis->incr($this->readsKey); + $row = $this->redis->hgetall($this->hashKey($entityId)); + if (!is_array($row) || empty($row)) { + return null; + } + return array_map('strval', $row); + } + + /** + * Insert if absent, emit an upsert event under the same Lua script + * so the queue order matches the mutation order. + * + * @param array $record + */ + public function addRecord(array $record): bool + { + $entityId = trim((string) ($record['id'] ?? '')); + if ($entityId === '') { + return false; + } + $normalised = []; + foreach ($record as $k => $v) { + $normalised[(string) $k] = (string) $v; + } + $changeJson = json_encode([ + 'op' => self::CHANGE_OP_UPSERT, + 'id' => $entityId, + 'fields' => $normalised, + 'timestamp_ms' => $this->nowMs(), + ], JSON_UNESCAPED_SLASHES); + $fieldsJson = json_encode($normalised, JSON_UNESCAPED_SLASHES); + + $result = $this->redis->eval( + self::ADD_SCRIPT, + 3, + $this->idsKey, + $this->hashKey($entityId), + $this->changesKey, + $entityId, + $fieldsJson, + $changeJson + ); + return (int) $result === 1; + } + + /** + * Atomic update + emit. Two concurrent callers cannot interleave + * mutation A → mutation B → emit B → emit A because the Lua + * script holds the Redis main thread for the duration. + */ + public function updateField(string $entityId, string $field, string $value): bool + { + $result = $this->redis->eval( + self::UPDATE_SCRIPT, + 3, + $this->idsKey, + $this->hashKey($entityId), + $this->changesKey, + $entityId, + $field, + $value, + (string) $this->nowMs() + ); + return (int) $result === 1; + } + + public function deleteRecord(string $entityId): bool + { + $result = $this->redis->eval( + self::DELETE_SCRIPT, + 3, + $this->idsKey, + $this->hashKey($entityId), + $this->changesKey, + $entityId, + (string) $this->nowMs() + ); + return (int) $result === 1; + } + + /** + * Block up to $timeoutSeconds for the next change event. Returns + * null on timeout. + * + * @return ?array + */ + public function nextChange(int $timeoutSeconds): ?array + { + // BRPOP on the changes list. Predis returns [key, value] on + // success or null on timeout. + $result = $this->redis->brpop([$this->changesKey], $timeoutSeconds); + if ($result === null || $result === false) { + return null; + } + $raw = is_array($result) ? ($result[1] ?? null) : null; + if (!is_string($raw)) { + return null; + } + $change = json_decode($raw, true); + if (!is_array($change)) { + return null; + } + // cjson serialises null as a real null on the consumer side + // for the delete case; normalise that into PHP null. + if (array_key_exists('fields', $change) && $change['fields'] === null) { + $change['fields'] = null; + } + return $change; + } + + public function reads(): int + { + return (int) ($this->redis->get($this->readsKey) ?? 
0); + } + + public function resetReads(): void + { + $this->redis->del([$this->readsKey]); + } + + private function nowMs(): float + { + return microtime(true) * 1000.0; + } +} diff --git a/content/develop/use-cases/prefetch-cache/php/SyncWorker.php b/content/develop/use-cases/prefetch-cache/php/SyncWorker.php new file mode 100644 index 0000000000..808e8c7231 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/php/SyncWorker.php @@ -0,0 +1,333 @@ +redis = $redis; + $this->primary = $primary; + $this->cache = $cache; + // BRPOP timeout is in whole seconds. A 1-second poll keeps the + // worker responsive to pause / stop signals without spinning. + $this->pollTimeoutS = max(1, $pollTimeoutS); + $this->pausedKey = 'demo:sync:paused'; + $this->idleKey = 'demo:sync:idle'; + $this->pidKey = 'demo:sync:pid'; + } + + /** + * Drain the primary's change feed until requestStop() is called or + * the host signals SIGTERM / SIGINT. + */ + public function run(): void + { + if (function_exists('pcntl_async_signals')) { + pcntl_async_signals(true); + pcntl_signal(SIGTERM, function () { $this->stop = true; }); + pcntl_signal(SIGINT, function () { $this->stop = true; }); + } + + $this->redis->set($this->pidKey, (string) getmypid()); + + while (!$this->stop) { + if ($this->isPaused()) { + // Tell the supervisor we are parked. The supervisor's + // /clear and /reprefetch handlers wait for this flag + // before writing to the cache, so a queued event can't + // be applied between cache.clear() and bulk_load(). + // + // Re-SET idle=1 on every iteration so a *new* pause + // request that arrives while we are still parked from + // the previous cycle gets acknowledged within one tick, + // not after the supervisor's full pause-timeout. + while ($this->isPaused() && !$this->stop) { + $this->redis->set($this->idleKey, '1'); + usleep(20 * 1000); + } + $this->redis->set($this->idleKey, '0'); + continue; + } + + $change = $this->primary->nextChange($this->pollTimeoutS); + if ($change === null) { + continue; + } + try { + $this->cache->applyChange($change); + } catch (\Throwable $exc) { + // Demo behaviour: log and drop the event. A real CDC + // consumer would retry with bounded backoff and route + // poison events to a dead-letter queue; the guide's + // "Production usage" section spells that out. + fwrite(STDERR, "[sync] failed to apply event: " . $exc->getMessage() . "\n"); + } + } + } + + public function requestStop(): void + { + $this->stop = true; + } + + private function isPaused(): bool + { + return ((string) $this->redis->get($this->pausedKey)) === '1'; + } +} + +/** + * Cross-process supervisor: spawns and kills the sync_worker.php + * process and drives pause/resume from inside the demo server's HTTP + * handlers. + */ +class SyncWorkerSupervisor +{ + private ClientInterface $redis; + private string $workerScript; + private string $phpBinary; + private array $env; + + private string $pidKey; + private string $pausedKey; + private string $idleKey; + + public function __construct( + ClientInterface $redis, + string $workerScript, + array $env = [], + string $phpBinary = 'php' + ) { + $this->redis = $redis; + $this->workerScript = $workerScript; + $this->phpBinary = $phpBinary; + $this->env = $env; + + $this->pidKey = 'demo:sync:pid'; + $this->pausedKey = 'demo:sync:paused'; + $this->idleKey = 'demo:sync:idle'; + } + + /** + * Spawn one sync_worker.php process if none is alive. Idempotent — + * calling it while a worker is already running is a no-op. 
+ */ + public function start(): int + { + $existing = $this->runningPid(); + if ($existing > 0) { + return $existing; + } + // Clear any stale coordination flags from a previous run. + $this->redis->del([$this->pausedKey, $this->idleKey]); + + $pid = $this->spawnWorker(); + if ($pid > 0) { + $this->redis->set($this->pidKey, (string) $pid); + } + return $pid; + } + + /** + * SIGTERM the worker and forget its PID. Returns true if a worker + * was running. + */ + public function stop(): bool + { + $pid = $this->runningPid(); + if ($pid <= 0) { + $this->redis->del([$this->pidKey, $this->pausedKey, $this->idleKey]); + return false; + } + $this->killPid($pid); + // Give the worker a chance to drain its BRPOP and exit. + for ($i = 0; $i < 20; $i++) { + if (!$this->isAlive($pid)) { + break; + } + usleep(50 * 1000); + } + $this->redis->del([$this->pidKey, $this->pausedKey, $this->idleKey]); + return true; + } + + public function running(): bool + { + return $this->runningPid() > 0; + } + + public function runningPid(): int + { + $raw = $this->redis->get($this->pidKey); + if ($raw === null) { + return 0; + } + $pid = (int) $raw; + if ($pid <= 0 || !$this->isAlive($pid)) { + $this->redis->del([$this->pidKey]); + return 0; + } + return $pid; + } + + /** + * Request a pause and block up to $timeoutMs for the worker to + * acknowledge it has parked itself. Returns true once the worker + * has confirmed it is idle (via the demo:sync:idle key), or false + * if the timeout elapsed first. + * + * If no worker is running, returns true immediately — there is + * nothing to wait for, and the caller's cache write is safe by + * construction. + */ + public function pause(int $timeoutMs = 2000): bool + { + if (!$this->running()) { + return true; + } + $this->redis->set($this->idleKey, '0'); + $this->redis->set($this->pausedKey, '1'); + $started = microtime(true); + while ((microtime(true) - $started) * 1000.0 < $timeoutMs) { + if (((string) $this->redis->get($this->idleKey)) === '1') { + return true; + } + usleep(10 * 1000); + } + return false; + } + + public function resume(): void + { + $this->redis->set($this->pausedKey, '0'); + $this->redis->set($this->idleKey, '0'); + } + + private function spawnWorker(): int + { + // The php -S dev server keeps its listening socket open in + // every child a request handler forks, which would let the + // worker hijack the port. Launch the worker through `setsid` + // (Linux) or a /bin/sh detach (macOS) so it gets a new session + // and detaches from the dev server's process group. Redirect + // every standard FD to files so socket FDs can't leak in. + $cmdArgs = [ + $this->phpBinary, + $this->workerScript, + ]; + foreach ($this->env as $k => $v) { + $cmdArgs[] = '--' . $k; + $cmdArgs[] = (string) $v; + } + + $logPath = '/tmp/prefetch_cache_sync_worker.log'; + + if (PHP_OS_FAMILY === 'Darwin') { + // macOS ships `setsid` without the -f flag; fall back to a + // shell command that backgrounds and detaches. + $escaped = array_map('escapeshellarg', $cmdArgs); + $shellCmd = sprintf( + 'exec %s >>%s 2>&1 ['file', '/dev/null', 'r'], + 1 => ['pipe', 'w'], + 2 => ['file', $logPath, 'a'], + ]; + + $proc = proc_open($args, $descriptorSpec, $pipes); + if (!is_resource($proc)) { + return 0; + } + + $childPid = 0; + if (PHP_OS_FAMILY === 'Darwin') { + $line = trim((string) fgets($pipes[1])); + $childPid = (int) $line; + } else { + $status = proc_get_status($proc); + $childPid = (int) ($status['pid'] ?? 
0); + } + foreach ($pipes as $pipe) { + if (is_resource($pipe)) { + fclose($pipe); + } + } + proc_close($proc); + return $childPid; + } + + private function killPid(int $pid): bool + { + if ($pid <= 0 || !function_exists('posix_kill')) { + return false; + } + @posix_kill($pid, defined('SIGTERM') ? SIGTERM : 15); + return true; + } + + private function isAlive(int $pid): bool + { + if ($pid <= 0 || !function_exists('posix_kill')) { + return false; + } + return @posix_kill($pid, 0); + } +} diff --git a/content/develop/use-cases/prefetch-cache/php/_index.md b/content/develop/use-cases/prefetch-cache/php/_index.md new file mode 100644 index 0000000000..27a0ebbde4 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/php/_index.md @@ -0,0 +1,447 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in PHP with Predis +linkTitle: Predis example (PHP) +title: Redis prefetch cache with Predis +weight: 7 +--- + +This guide shows you how to implement a Redis prefetch cache in PHP with [Predis](https://github.com/predis/predis). It includes a small local web server built on PHP's built-in dev server so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. + +That gives you: + +* Near-100% cache hit ratios for reference and master data +* Sub-millisecond reads for lookup-heavy paths at peak traffic +* All reference-data reads offloaded from the primary database +* Source-database changes propagated into Redis within a few milliseconds +* A long safety-net TTL that bounds memory if the sync pipeline ever stops + +In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `$cache->bulkLoad($primary->listRecords())`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. **On every read**, the application calls `$cache->get($entityId)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to a Redis list (`demo:primary:changes`). A separate `sync_worker.php` process drains the list with `BRPOP` and calls `$cache->applyChange($event)`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key. + +In a real system, the Redis-list change feed is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis. 
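+
+While the demo is running, you can inspect the change feed itself. Each entry on the `demo:primary:changes` list is a JSON document shaped like the event described in step 3 (the values shown here are illustrative):
+
+```bash
+redis-cli LRANGE demo:primary:changes 0 -1
+# 1) "{\"op\":\"upsert\",\"id\":\"cat-001\",\"fields\":{\"id\":\"cat-001\",...},\"timestamp_ms\":1712345678901.2}"
+```
+
+The list is usually empty because the worker drains it within milliseconds; you will only see entries accumulate while the worker is paused or stopped.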
+ +## The prefetch-cache helper + +The `PrefetchCache` class wraps the cache operations +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/php/Cache.php)): + +```php +require __DIR__ . '/vendor/autoload.php'; +require __DIR__ . '/Cache.php'; +require __DIR__ . '/Primary.php'; +require __DIR__ . '/SyncWorker.php'; + +use Predis\Client as PredisClient; + +$redis = new PredisClient(['host' => '127.0.0.1', 'port' => 6379]); +$primary = new MockPrimaryStore($redis); +$cache = new PrefetchCache($redis, 'cache:category:', 3600); + +// Pre-load every primary record into Redis in one pipelined round trip. +$cache->bulkLoad($primary->listRecords()); + +// Spawn the long-running sync_worker.php process. +$supervisor = new SyncWorkerSupervisor($redis, __DIR__ . '/sync_worker.php'); +$supervisor->start(); + +// Read paths now go to Redis only. +[$record, $hit, $redisMs] = $cache->get('cat-001'); +``` + +### Data model + +Each cached category is stored in a Redis hash: + +```text +cache:category:cat-001 + id = cat-001 + name = Beverages + display_order = 1 + featured = true + parent_id = +``` + +The implementation uses: + +* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event +* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path +* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation +* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix +* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI + +## Bulk load on startup + +The `bulkLoad` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record. The pipeline is sent in a single round trip, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table: + +```php +public function bulkLoad(iterable $records): int +{ + $loaded = 0; + $pipe = $this->redis->pipeline(); + foreach ($records as $record) { + $entityId = $record['id'] ?? ''; + if ($entityId === '') { + continue; + } + $cacheKey = $this->cacheKey($entityId); + $pipe->del([$cacheKey]); + $pipe->hset($cacheKey, ...self::flattenFields($record)); + $pipe->expire($cacheKey, $this->ttlSeconds); + $loaded++; + } + if ($loaded > 0) { + $pipe->execute(); + $this->redis->hincrby($this->statsKey, 'prefetched', $loaded); + } + return $loaded; +} +``` + +The Predis pipeline is non-transactional by default, which is intentional on the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `bulkLoad` directly from your own code on a cache that is already serving reads, either pause your writers first or wrap the pipeline in a `MULTI`/`EXEC` transaction so callers cannot observe a half-loaded record. + +`flattenFields()` converts the associative `['field' => 'value']` record into the variadic field/value/field/value form `HSET` expects in Predis 3.x. Predis 1.x accepted an associative array directly; the 1.x form raises `wrong number of arguments for 'hset'` against 3.x. 
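+
+As a worked example of that expansion (a hypothetical two-field record; the call mirrors the pipeline line inside `bulkLoad`):
+
+```php
+$record = ['id' => 'cat-001', 'name' => 'Beverages'];
+// flattenFields($record) yields ['id', 'cat-001', 'name', 'Beverages'], so
+// the spread below sends: HSET cache:category:cat-001 id cat-001 name Beverages
+$pipe->hset($cacheKey, ...self::flattenFields($record));
+```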
+ +## Reads from Redis only + +The `get` method runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident: + +```php +public function get(string $entityId): array +{ + $cacheKey = $this->cacheKey($entityId); + $started = microtime(true); + $cached = $this->redis->hgetall($cacheKey); + $redisLatencyMs = (microtime(true) - $started) * 1000.0; + + if (is_array($cached) && !empty($cached)) { + $this->redis->hincrby($this->statsKey, 'hits', 1); + return [$cached, true, $redisLatencyMs]; + } + + $this->redis->hincrby($this->statsKey, 'misses', 1); + return [null, false, $redisLatencyMs]; +} +``` + +This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load. + +## Applying sync events + +The sync worker calls `applyChange` for every primary mutation. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL in one pipelined `MULTI`/`EXEC` transaction so the cache never holds a stale mix of old and new fields. For a `delete`, it removes the cache key: + +```php +public function applyChange(array $change): void +{ + $op = $change['op'] ?? ''; + $entityId = $change['id'] ?? ''; + if ($entityId === '') { + return; + } + $cacheKey = $this->cacheKey($entityId); + + if ($op === 'upsert') { + $fields = $change['fields'] ?? null; + if (!is_array($fields) || empty($fields)) { + return; + } + $tx = $this->redis->transaction(); + $tx->del([$cacheKey]); + $tx->hset($cacheKey, ...self::flattenFields($fields)); + $tx->expire($cacheKey, $this->ttlSeconds); + $tx->execute(); + } elseif ($op === 'delete') { + $this->redis->del([$cacheKey]); + } + // ... record sync_events_applied + sync lag +} +``` + +The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. The malformed-upsert guard (`empty($fields)` returns early) prevents `HSET` from being called with no arguments, which raises an error in Predis 3.x. + +## The sync worker + +The sync worker is a separate long-running PHP process: `sync_worker.php` ([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/php/sync_worker.php)). The demo server spawns one of these on first request through a `SyncWorkerSupervisor` and tracks its PID in Redis ([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/php/SyncWorker.php)): + +```php +public function run(): void +{ + if (function_exists('pcntl_async_signals')) { + pcntl_async_signals(true); + pcntl_signal(SIGTERM, function () { $this->stop = true; }); + } + + while (!$this->stop) { + if ($this->isPaused()) { + $this->redis->set($this->idleKey, '1'); + while ($this->isPaused() && !$this->stop) { + usleep(20 * 1000); + } + $this->redis->set($this->idleKey, '0'); + continue; + } + + $change = $this->primary->nextChange($this->pollTimeoutS); + if ($change === null) { + continue; + } + try { + $this->cache->applyChange($change); + } catch (\Throwable $exc) { + fwrite(STDERR, "[sync] failed to apply event: " . $exc->getMessage() . 
"\n"); + } + } +} +``` + +`MockPrimaryStore::nextChange()` blocks on `BRPOP demo:primary:changes `, so the worker uses a fraction of a CPU when idle and reacts within milliseconds when a change arrives. + +In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset. + +## Invalidation and re-prefetch + +Two helpers exist for testing and recovery: + +* `invalidate($entityId)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record. +* `clear()` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss. + +In both cases, the recovery path is to call `bulkLoad($primary->listRecords())` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation. + +### Re-prefetch under load + +`clear()` and `bulkLoad()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary->listRecords()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation: + +```php +$supervisor->pause(); +try { + $cache->clear(); + $cache->bulkLoad($primary->listRecords()); +} finally { + $supervisor->resume(); +} +``` + +`pause()` writes `demo:sync:paused = "1"`, then waits up to 2 seconds for the worker to write `demo:sync:idle = "1"` (the worker writes this key once it has parked itself on the pause flag). Once `pause()` returns true, no `applyChange` is in flight and no new event will be drained from the change list. Change events that arrive during the pause sit on `demo:primary:changes` and apply in order once `resume()` is called. + +This pattern is forced by PHP's process model: the demo server runs under `php -S`, every HTTP request is a fresh process, and the sync worker lives in a separate long-running process. The pause / idle signals cannot live in shared memory; Redis is the only place every component already talks to. + +## Hit/miss accounting + +The helper keeps counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up. + +In PHP under `php -S`, every HTTP request runs in its own process, so the counters cannot live in object properties. The helper stores them in a Redis hash under `demo:stats:{prefix}` and uses `HINCRBY` / `HINCRBYFLOAT` to update them. Every HTTP request, plus the sync worker process, sees the same totals: + +```php +public function stats(): array +{ + $raw = $this->redis->hgetall($this->statsKey) ?: []; + $hits = (int) ($raw['hits'] ?? 0); + $misses = (int) ($raw['misses'] ?? 0); + $prefetched = (int) ($raw['prefetched'] ?? 0); + $applied = (int) ($raw['sync_events_applied'] ?? 0); + $lagTotal = (float) ($raw['sync_lag_ms_total'] ?? 0.0); + $lagSamples = (int) ($raw['sync_lag_samples'] ?? 0); + + $total = $hits + $misses; + $hitRate = $total > 0 ? round(100.0 * $hits / $total, 1) : 0.0; + $avgLag = $lagSamples > 0 ? 
round($lagTotal / $lagSamples, 2) : 0.0; + + return [ + 'hits' => $hits, + 'misses' => $misses, + 'hit_rate_pct' => $hitRate, + 'prefetched' => $prefetched, + 'sync_events_applied' => $applied, + 'sync_lag_ms_avg' => $avgLag, + ]; +} +``` + +In production you would emit these as Prometheus counters and gauges. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind. + +## Prerequisites + +Before running the demo, make sure that: + +* PHP 8.1 or later is installed (`php --version`) +* [Composer](https://getcomposer.org/) is installed +* Redis is running and accessible. By default, the demo connects to `127.0.0.1:6379` + +If your Redis server is running elsewhere, start the demo with the `PREFETCH_REDIS_HOST` and `PREFETCH_REDIS_PORT` environment variables. + +## Running the demo + +### Get the source files + +The demo consists of six files. Download them from the [`php` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/php) on GitHub, or grab them with `curl`: + +```bash +mkdir prefetch-cache-demo && cd prefetch-cache-demo +BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/php +curl -O $BASE/Cache.php +curl -O $BASE/Primary.php +curl -O $BASE/SyncWorker.php +curl -O $BASE/sync_worker.php +curl -O $BASE/demo_server.php +curl -O $BASE/composer.json +``` + +### Start the demo server + +From that directory, install Predis and start the server: + +```bash +composer install +php -S 127.0.0.1:8788 demo_server.php +``` + +After starting the server, visit `http://127.0.0.1:8788`. + +On the first request the server will: + +* Seed the mock primary store into Redis (records under `demo:primary:hash:{id}`, ID set under `demo:primary:ids`). +* Bulk-load every primary record into the cache. +* Spawn a long-running `sync_worker.php` process via `proc_open` and record its PID in `demo:sync:pid`. + +The sync worker keeps running until you stop it manually. To stop everything cleanly: + +```bash +# Stop the demo server with Ctrl+C, then kill the sync worker: +kill $(redis-cli get demo:sync:pid) +``` + +The demo server uses only the PHP standard library and Predis: + +* PHP's [built-in dev server](https://www.php.net/manual/en/features.commandline.webserver.php) for HTTP +* [`proc_open`](https://www.php.net/manual/en/function.proc-open.php) + `posix_kill` for spawning and signalling the sync worker +* [Predis](https://github.com/predis/predis) for every Redis command + +It exposes a small interactive page where you can: + +* See which IDs are in the cache and in the primary, side by side +* Read a category through the cache and confirm every read is a hit +* Update a field on the primary and watch the sync worker rewrite the cache hash +* Add and delete categories and watch them appear and disappear from the cache +* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure +* Re-prefetch from the primary to recover from a broken cache state +* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all + +## The mock primary store + +To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/php/Primary.php)). 
+ +Each record lives in its own Redis hash under `demo:primary:hash:{id}`, and the set `demo:primary:ids` tracks the current ID universe. The change feed is a Redis list (`demo:primary:changes`); every mutation `LPUSH`es a JSON-encoded change event and the sync worker drains the list with `BRPOP`. + +The reference Python implementation guards mutation + emit with an in-process lock so two concurrent updates produce change events in queue order matching mutation order. PHP doesn't have shared memory across requests, so the PHP port runs each mutation as a Lua script on the Redis server. Lua scripts run atomically on the Redis main thread, so the mutation and the `LPUSH` happen as one step and the queue order can't get scrambled: + +```php +private const UPDATE_SCRIPT = <<<'LUA' +local id = ARGV[1] +local field = ARGV[2] +local value = ARGV[3] +local now_ms = tonumber(ARGV[4]) +local ids_key = KEYS[1] +local hash_key = KEYS[2] +local changes_key = KEYS[3] +if redis.call('SISMEMBER', ids_key, id) == 0 then + return 0 +end +redis.call('HSET', hash_key, field, value) +local raw = redis.call('HGETALL', hash_key) +local fields = {} +for i = 1, #raw, 2 do + fields[raw[i]] = raw[i + 1] +end +local change = cjson.encode({ + op = 'upsert', + id = id, + fields = fields, + timestamp_ms = now_ms, +}) +redis.call('LPUSH', changes_key, change) +return 1 +LUA; +``` + +In a real system this list-based change feed is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis. + +## Production usage + +This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it. + +### Replace the in-process change queue with a real CDC pipeline + +The demo's Redis-list change feed is the simplest possible stand-in for a CDC change feed. In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset. + +### Run the sync worker as a managed service + +Under `php -S`, every HTTP request runs in its own short-lived process, so this demo spawns a separate `sync_worker.php` process via `proc_open` and records its PID in Redis. That keeps the demo self-contained, but it is not how you would run this in production: the supervisor's lifecycle is bound to whichever HTTP request happened to spawn the worker, and a stale PID can survive a server restart. + +In production, run the sync worker as a managed service — systemd, supervisord, Kubernetes, or whatever your platform uses to run long-lived workers — and remove the in-request spawning entirely. The pause/resume coordination still belongs in Redis (`demo:sync:paused` / `demo:sync:idle`), because the application processes that handle `/clear` and `/reprefetch` still need a way to ask the out-of-band worker to park itself before they rewrite the cache. The supervisor in this demo is only there to give you a one-command "press play and watch the cache work" experience. + +### Use a long safety-net TTL, not a freshness TTL + +The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. 
The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key). Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys. + +### Decide what to do on a cache miss + +A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are: + +* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache. +* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken. + +Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides. + +### Bound the working set to what fits in memory + +Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM. + +### Reconcile periodically against the primary + +CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss. + +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` argument exactly for this. + +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. 
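+
+As a concrete starting point for the reconciliation job described above, the sketch below reuses this guide's helpers (`listRecords()`, `get()`, `ids()`, `applyChange()`); the `reconcile()` function itself is hypothetical and not part of the demo:
+
+```php
+// Sketch: repair drift between the primary and the cache in one pass.
+function reconcile(MockPrimaryStore $primary, PrefetchCache $cache): int
+{
+    $fixed = 0;
+    $primaryIds = [];
+    foreach ($primary->listRecords() as $record) {
+        $primaryIds[$record['id']] = true;
+        [$cached, $hit] = $cache->get($record['id']);
+        if (!$hit || $cached != $record) {
+            // Rewrite the entry through the same path the sync worker uses.
+            $cache->applyChange(['op' => 'upsert', 'id' => $record['id'], 'fields' => $record]);
+            $fixed++;
+        }
+    }
+    foreach ($cache->ids() as $id) {
+        if (!isset($primaryIds[$id])) {
+            // Cached entry whose ID no longer exists in the primary.
+            $cache->applyChange(['op' => 'delete', 'id' => $id]);
+            $fixed++;
+        }
+    }
+    return $fixed;
+}
+```
+
+Pause the sync worker around the pass (as `/reprefetch` does) so an in-flight change event cannot race the repair writes.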
+
+## Learn more
+
+* [Predis on GitHub](https://github.com/predis/predis) - The Predis client for PHP
+* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields
+* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash
+* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds
+* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete
+* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server
+* [BRPOP command]({{< relref "/commands/brpop" >}}) - Block on a list for the next change event
+* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key
+* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud
diff --git a/content/develop/use-cases/prefetch-cache/php/composer.json b/content/develop/use-cases/prefetch-cache/php/composer.json
new file mode 100644
index 0000000000..201ce4ca9c
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/php/composer.json
@@ -0,0 +1,8 @@
+{
+    "name": "redis/prefetch-cache-php-demo",
+    "description": "Redis prefetch-cache demo using Predis (PHP).",
+    "require": {
+        "php": ">=8.1",
+        "predis/predis": "^3.0"
+    }
+}
diff --git a/content/develop/use-cases/prefetch-cache/php/demo_server.php b/content/develop/use-cases/prefetch-cache/php/demo_server.php
new file mode 100644
index 0000000000..cdaf07c5e1
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/php/demo_server.php
@@ -0,0 +1,663 @@
+<?php
+
+declare(strict_types=1);
+
+require __DIR__ . '/vendor/autoload.php';
+require __DIR__ . '/Cache.php';
+require __DIR__ . '/Primary.php';
+require __DIR__ . '/SyncWorker.php';
+
+use Predis\Client as PredisClient;
+
+// Connection and cache settings; the host and port can be overridden
+// with the PREFETCH_REDIS_HOST / PREFETCH_REDIS_PORT environment
+// variables described in the guide.
+$redisHost = getenv('PREFETCH_REDIS_HOST') ?: '127.0.0.1';
+$redisPort = (int) (getenv('PREFETCH_REDIS_PORT') ?: 6379);
+$cachePrefix = 'cache:category:';
+$ttlSeconds = 3600;
+$latencyMs = 80;
+
+try {
+    $redis = new PredisClient([
+        'host' => $redisHost,
+        'port' => $redisPort,
+    ]);
+    $redis->ping();
+} catch (\Throwable $e) {
+    http_response_code(500);
+    header('Content-Type: text/plain');
+    echo "Failed to connect to Redis at {$redisHost}:{$redisPort}: " . $e->getMessage();
+    return;
+}
+
+$cache = new PrefetchCache($redis, $cachePrefix, $ttlSeconds);
+$primary = new MockPrimaryStore($redis, $latencyMs);
+$supervisor = new SyncWorkerSupervisor(
+    $redis,
+    __DIR__ . '/sync_worker.php',
+    [
+        'redis-host' => $redisHost,
+        'redis-port' => (string) $redisPort,
+        'cache-prefix' => $cachePrefix,
+        'ttl-seconds' => (string) $ttlSeconds,
+        'primary-latency-ms' => (string) $latencyMs,
+    ]
+);
+
+// First-request bootstrap: seed the primary, prefetch the cache, spawn
+// the sync worker. Idempotent — the next request finds the bootstrap
+// flag and the worker PID already there and skips everything.
+$bootstrapKey = 'demo:bootstrap:' . $cachePrefix;
+$bootstrapDone = ((string) $redis->get($bootstrapKey)) === '1';
+$primary->seedIfEmpty();
+if (!$bootstrapDone || !$supervisor->running()) {
+    if (!$bootstrapDone) {
+        $cache->clear();
+        $cache->bulkLoad($primary->listRecords());
+    }
+    $supervisor->start();
+    $redis->set($bootstrapKey, '1');
+}
+
+$method = $_SERVER['REQUEST_METHOD'];
+$path = parse_url($_SERVER['REQUEST_URI'] ?? '/', PHP_URL_PATH) ?: '/';
+
+if ($method === 'GET' && ($path === '/' || $path === '/index.html')) {
+    send_html(render_page($cache->getTtlSeconds()));
+    return;
+}
+
+if ($method === 'GET' && $path === '/categories') {
+    send_json([
+        'cache_ids' => $cache->ids(),
+        'primary_ids' => $primary->listIds(),
+    ]);
+    return;
+}
+
+if ($method === 'GET' && $path === '/read') {
+    $entityId = (string) ($_GET['id'] ??
''); + if ($entityId === '') { + send_json(['error' => "Missing 'id'."], 400); + return; + } + [$record, $hit, $redisMs] = $cache->get($entityId); + send_json([ + 'id' => $entityId, + 'record' => $record, + 'hit' => $hit, + 'redis_latency_ms' => round($redisMs, 2), + 'ttl_remaining' => $cache->ttlRemaining($entityId), + 'stats' => build_stats($cache, $primary), + ]); + return; +} + +if ($method === 'GET' && $path === '/stats') { + send_json(build_stats($cache, $primary)); + return; +} + +if ($method === 'POST' && $path === '/update') { + $params = read_form_data(); + $entityId = (string) ($params['id'] ?? ''); + $field = (string) ($params['field'] ?? ''); + $value = (string) ($params['value'] ?? ''); + if ($entityId === '' || $field === '') { + send_json(['error' => "Missing 'id' or 'field'."], 400); + return; + } + if (!$primary->updateField($entityId, $field, $value)) { + send_json(['error' => "Unknown category '{$entityId}'."], 404); + return; + } + send_json([ + 'id' => $entityId, + 'field' => $field, + 'value' => $value, + 'stats' => build_stats($cache, $primary), + ]); + return; +} + +if ($method === 'POST' && $path === '/add') { + $params = read_form_data(); + $entityId = trim((string) ($params['id'] ?? '')); + $name = trim((string) ($params['name'] ?? '')); + if ($entityId === '' || $name === '') { + send_json(['error' => "Missing 'id' or 'name'."], 400); + return; + } + $record = [ + 'id' => $entityId, + 'name' => $name, + 'display_order' => (string) ($params['display_order'] ?? '99') ?: '99', + 'featured' => (string) ($params['featured'] ?? 'false') ?: 'false', + 'parent_id' => (string) ($params['parent_id'] ?? '') ?: '', + ]; + if (!$primary->addRecord($record)) { + send_json(['error' => "Category '{$entityId}' already exists."], 409); + return; + } + send_json([ + 'id' => $entityId, + 'record' => $record, + 'stats' => build_stats($cache, $primary), + ]); + return; +} + +if ($method === 'POST' && $path === '/delete') { + $params = read_form_data(); + $entityId = (string) ($params['id'] ?? ''); + if ($entityId === '') { + send_json(['error' => "Missing 'id'."], 400); + return; + } + if (!$primary->deleteRecord($entityId)) { + send_json(['error' => "Unknown category '{$entityId}'."], 404); + return; + } + send_json(['id' => $entityId, 'stats' => build_stats($cache, $primary)]); + return; +} + +if ($method === 'POST' && $path === '/invalidate') { + $params = read_form_data(); + $entityId = (string) ($params['id'] ?? ''); + if ($entityId === '') { + send_json(['error' => "Missing 'id'."], 400); + return; + } + $deleted = $cache->invalidate($entityId); + send_json([ + 'id' => $entityId, + 'deleted' => $deleted, + 'stats' => build_stats($cache, $primary), + ]); + return; +} + +if ($method === 'POST' && $path === '/clear') { + // Pause the sync worker so it cannot recreate keys between SCAN + // and DEL. Queued events accumulate in demo:primary:changes and + // apply in order after resume(). + $supervisor->pause(); + try { + $deleted = $cache->clear(); + } finally { + $supervisor->resume(); + } + send_json([ + 'deleted' => $deleted, + 'stats' => build_stats($cache, $primary), + ]); + return; +} + +if ($method === 'POST' && $path === '/reprefetch') { + // Pause the sync worker so it cannot interleave with the clear + + // snapshot + bulk_load sequence. Without this, a change applied + // between list_records() and bulk_load() would be overwritten by + // the stale snapshot. 
+    $supervisor->pause();
+    try {
+        $started = microtime(true);
+        $cache->clear();
+        $loaded = $cache->bulkLoad($primary->listRecords());
+        $elapsed = (microtime(true) - $started) * 1000.0;
+    } finally {
+        $supervisor->resume();
+    }
+    send_json([
+        'loaded' => $loaded,
+        'elapsed_ms' => round($elapsed, 2),
+        'stats' => build_stats($cache, $primary),
+    ]);
+    return;
+}
+
+if ($method === 'POST' && $path === '/reset') {
+    $cache->resetStats();
+    $primary->resetReads();
+    send_json(build_stats($cache, $primary));
+    return;
+}
+
+http_response_code(404);
+echo 'Not Found';
+
+// ---------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------
+
+function build_stats(PrefetchCache $cache, MockPrimaryStore $primary): array
+{
+    $stats = $cache->stats();
+    $stats['primary_reads_total'] = $primary->reads();
+    $stats['primary_read_latency_ms'] = $primary->getReadLatencyMs();
+    return $stats;
+}
+
+function read_form_data(): array
+{
+    $raw = file_get_contents('php://input') ?: '';
+    $parsed = [];
+    parse_str($raw, $parsed);
+    return $parsed;
+}
+
+function send_html(string $html, int $status = 200): void
+{
+    http_response_code($status);
+    header('Content-Type: text/html; charset=utf-8');
+    echo $html;
+}
+
+function send_json($payload, int $status = 200): void
+{
+    http_response_code($status);
+    header('Content-Type: application/json');
+    echo json_encode($payload, JSON_UNESCAPED_SLASHES);
+}
+
+function render_page(int $ttlSeconds): string
+{
+    $cacheTtl = (string) $ttlSeconds;
+    return <<<HTML
+<!-- Demo page markup: a header ("Redis Prefetch Cache Demo",
+     "Predis + php -S dev server"), intro copy explaining that reads run
+     HGETALL against Redis only and that the safety-net TTL ({$cacheTtl} s)
+     is refreshed on every add or update event, and panels for Cache state,
+     Read a category, Update a field, Add a category, Delete a category,
+     Break the cache, Cache stats, and Last result. The full markup and
+     page script are in the demo_server.php source on GitHub. -->
+HTML;
+}
diff --git a/content/develop/use-cases/prefetch-cache/php/sync_worker.php b/content/develop/use-cases/prefetch-cache/php/sync_worker.php
new file mode 100644
index 0000000000..189e4fd86b
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/php/sync_worker.php
@@ -0,0 +1,83 @@
+<?php
+
+declare(strict_types=1);
+
+require __DIR__ . '/vendor/autoload.php';
+require __DIR__ . '/Cache.php';
+require __DIR__ . '/Primary.php';
+require __DIR__ . '/SyncWorker.php';
+
+use Predis\Client as PredisClient;
+
+function parse_cli_args(array $argv): array
+{
+    $opts = [
+        'redis-host' => '127.0.0.1',
+        'redis-port' => 6379,
+        'cache-prefix' => 'cache:category:',
+        'ttl-seconds' => 3600,
+        'primary-latency-ms' => 80,
+    ];
+    $count = count($argv);
+    for ($i = 1; $i < $count; $i++) {
+        $arg = $argv[$i];
+        if (strpos($arg, '--') !== 0) {
+            continue;
+        }
+        $key = substr($arg, 2);
+        $value = null;
+        $eq = strpos($key, '=');
+        if ($eq !== false) {
+            $value = substr($key, $eq + 1);
+            $key = substr($key, 0, $eq);
+        } elseif ($i + 1 < $count) {
+            $value = $argv[++$i];
+        }
+        if (!array_key_exists($key, $opts)) {
+            fwrite(STDERR, "[sync] unknown option --{$key}\n");
+            exit(2);
+        }
+        $opts[$key] = $value;
+    }
+    return $opts;
+}
+
+$opts = parse_cli_args($argv);
+
+$redis = new PredisClient([
+    'host' => (string) $opts['redis-host'],
+    'port' => (int) $opts['redis-port'],
+]);
+
+try {
+    $redis->ping();
+} catch (\Throwable $exc) {
+    fwrite(STDERR, "[sync] cannot reach Redis at {$opts['redis-host']}:{$opts['redis-port']}: " . $exc->getMessage() . "\n");
+    exit(1);
+}
+
+$primary = new MockPrimaryStore($redis, (int) $opts['primary-latency-ms']);
+$cache = new PrefetchCache(
+    $redis,
+    (string) $opts['cache-prefix'],
+    (int) $opts['ttl-seconds']
+);
+$worker = new SyncWorker($redis, $primary, $cache);
+
+fwrite(STDERR, "[sync] started pid=" . getmypid() . " prefix={$opts['cache-prefix']}\n");
+$worker->run();
+fwrite(STDERR, "[sync] stopped\n");
diff --git a/content/develop/use-cases/prefetch-cache/redis-py/_index.md b/content/develop/use-cases/prefetch-cache/redis-py/_index.md
new file mode 100644
index 0000000000..ee385f59b2
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/redis-py/_index.md
@@ -0,0 +1,372 @@
+---
+categories:
+- docs
+- develop
+- stack
+- oss
+- rs
+- rc
+description: Implement a Redis prefetch cache in Python with redis-py
+linkTitle: redis-py example (Python)
+title: Redis prefetch cache with redis-py
+weight: 1
+---
+
+This guide shows you how to implement a Redis prefetch cache in Python with [`redis-py`]({{< relref "/develop/clients/redis-py" >}}). It includes a small local web server built with the Python standard library so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary.
+
+## Overview
+
+Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path.
+
+That gives you:
+
+* Near-100% cache hit ratios for reference and master data
+* Sub-millisecond reads for lookup-heavy paths at peak traffic
+* All reference-data reads offloaded from the primary database
+* Source-database changes propagated into Redis within a few milliseconds
+* A long safety-net TTL that bounds memory if the sync pipeline ever stops
+
+In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`.
The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `cache.bulk_load(primary.list_records())`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. **On every read**, the application calls `cache.get(entity_id)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to an in-process queue. The sync worker thread drains the queue and calls `cache.apply_change(event)`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key. + +In a real system the in-process change queue is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis. + +## The prefetch-cache helper + +The `PrefetchCache` class wraps the cache operations +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/redis-py/cache.py)): + +```python +import redis +from cache import PrefetchCache +from primary import MockPrimaryStore +from sync_worker import SyncWorker + +r = redis.Redis(host="localhost", port=6379, decode_responses=True) +primary = MockPrimaryStore() +cache = PrefetchCache(redis_client=r, ttl_seconds=3600) + +# Pre-load every primary record into Redis in one pipelined round trip. +cache.bulk_load(primary.list_records()) + +# Start the sync worker so primary mutations propagate into Redis. +sync = SyncWorker(primary=primary, cache=cache) +sync.start() + +# Read paths now go to Redis only. +record, hit, redis_ms = cache.get("cat-001") +``` + +### Data model + +Each cached category is stored in a Redis hash: + +```text +cache:category:cat-001 + id = cat-001 + name = Beverages + display_order = 1 + featured = true + parent_id = +``` + +The implementation uses: + +* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event +* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path +* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation +* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix +* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI + +## Bulk load on startup + +The `bulk_load` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record. 
The pipeline is sent in a single round trip, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table:
+
+```python
+def bulk_load(self, records: Iterable[dict[str, str]]) -> int:
+    loaded = 0
+    pipe = self.redis.pipeline(transaction=False)
+    for record in records:
+        entity_id = record.get("id")
+        if not entity_id:
+            continue
+        cache_key = self._cache_key(entity_id)
+        pipe.delete(cache_key)
+        pipe.hset(cache_key, mapping=record)
+        pipe.expire(cache_key, self.ttl_seconds)
+        loaded += 1
+    if loaded:
+        pipe.execute()
+    return loaded
+```
+
+`transaction=False` is intentional on the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `bulk_load` directly from your own code on a cache that is already serving reads, either pause your writers first or rewrite it with `pipeline(transaction=True)` so callers cannot observe a half-loaded record.
+
+## Reads from Redis only
+
+The `get` method runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident:
+
+```python
+def get(
+    self,
+    entity_id: str,
+) -> tuple[Optional[dict[str, str]], bool, float]:
+    cache_key = self._cache_key(entity_id)
+
+    started = time.perf_counter()
+    cached = self.redis.hgetall(cache_key)
+    redis_latency_ms = (time.perf_counter() - started) * 1000.0
+
+    if cached:
+        with self._stats_lock:
+            self._hits += 1
+        return cached, True, redis_latency_ms
+
+    with self._stats_lock:
+        self._misses += 1
+    return None, False, redis_latency_ms
+```
+
+This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load.
+
+## Applying sync events
+
+The sync worker calls `apply_change` for every primary mutation. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL in one pipelined transaction so the cache never holds a stale mix of old and new fields. For a `delete`, it removes the cache key:
+
+```python
+def apply_change(self, change: dict) -> None:
+    op = change.get("op")
+    entity_id = change.get("id")
+    if not entity_id:
+        return
+
+    cache_key = self._cache_key(entity_id)
+
+    if op == "upsert":
+        fields = change.get("fields")
+        if not fields:
+            # Malformed upsert: HSET with an empty mapping raises
+            # DataError in redis-py, so skip rather than crash.
+            return
+        pipe = self.redis.pipeline(transaction=True)
+        pipe.delete(cache_key)
+        pipe.hset(cache_key, mapping=fields)
+        pipe.expire(cache_key, self.ttl_seconds)
+        pipe.execute()
+    elif op == "delete":
+        self.redis.delete(cache_key)
+```
+
+The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. The early return on a missing or empty `fields` mapping matches the full `cache.py` source: it keeps a malformed upsert from reaching `HSET` with no fields, which raises `DataError` in redis-py.
+
+## The sync worker
+
+The `SyncWorker` runs a daemon thread that blocks on the primary's change queue with a short timeout.
Every change is applied to Redis as soon as it arrives +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/redis-py/sync_worker.py)): + +```python +def _run(self) -> None: + while not self._stop_event.is_set(): + change = self.primary.next_change(timeout=self.poll_timeout_s) + if change is None: + continue + try: + self.cache.apply_change(change) + except Exception as exc: + print(f"[sync] failed to apply {change!r}: {exc}") +``` + +In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset. + +## Invalidation and re-prefetch + +Two helpers exist for testing and recovery: + +* `invalidate(entity_id)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record. +* `clear()` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss. + +In both cases, the recovery path is to call `bulk_load(primary.list_records())` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation. + +### Re-prefetch under load + +`clear()` and `bulk_load()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.list_records()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation: + +```python +self.sync.pause() +try: + self.cache.clear() + self.cache.bulk_load(self.primary.list_records()) +finally: + self.sync.resume() +``` + +`pause()` waits for the worker to finish whatever event it is currently applying, parks the run loop, and returns. Change events that arrive during the pause sit in the primary's queue and apply in order once `resume()` is called, so no event is lost. + +## Hit/miss accounting + +The helper keeps in-process counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up: + +```python +def stats(self) -> dict[str, float]: + with self._stats_lock: + total = self._hits + self._misses + hit_rate = round(100.0 * self._hits / total, 1) if total else 0.0 + avg_lag = ( + round(self._sync_lag_ms_total / self._sync_lag_samples, 2) + if self._sync_lag_samples + else 0.0 + ) + return { + "hits": self._hits, + "misses": self._misses, + "hit_rate_pct": hit_rate, + "prefetched": self._prefetched, + "sync_events_applied": self._sync_events_applied, + "sync_lag_ms_avg": avg_lag, + } +``` + +In production you would emit these as Prometheus counters and gauges. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind. + +## Prerequisites + +Before running the demo, make sure that: + +* Redis is running and accessible. By default, the demo connects to `localhost:6379`. +* The `redis` Python package is installed: + +```bash +pip install redis +``` + +If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`. 
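+
+For example, to point the demo at a remote instance:
+
+```bash
+python3 demo_server.py --redis-host 192.0.2.10 --redis-port 6380
+```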
+ +## Running the demo + +### Get the source files + +The demo consists of four files. Download them from the [`redis-py` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/redis-py) on GitHub, or grab them with `curl`: + +```bash +mkdir prefetch-cache-demo && cd prefetch-cache-demo +BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/redis-py +curl -O $BASE/cache.py +curl -O $BASE/primary.py +curl -O $BASE/sync_worker.py +curl -O $BASE/demo_server.py +``` + +### Start the demo server + +From that directory: + +```bash +python3 demo_server.py +``` + +You should see something like: + +```text +Redis prefetch-cache demo server listening on http://127.0.0.1:8082 +Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s +Prefetched 5 records in 82.1 ms; sync worker running +``` + +After starting the server, visit `http://localhost:8082`. + +The demo server uses only Python standard library features for HTTP handling and concurrency: + +* [`http.server`](https://docs.python.org/3/library/http.server.html) for the web server +* [`urllib.parse`](https://docs.python.org/3/library/urllib.parse.html) for query and form decoding +* [`threading`](https://docs.python.org/3/library/threading.html) for the sync worker daemon + +It exposes a small interactive page where you can: + +* See which IDs are in the cache and in the primary, side by side +* Read a category through the cache and confirm every read is a hit +* Update a field on the primary and watch the sync worker rewrite the cache hash +* Add and delete categories and watch them appear and disappear from the cache +* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure +* Re-prefetch from the primary to recover from a broken cache state +* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all + +## The mock primary store + +To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/redis-py/primary.py)): + +```python +class MockPrimaryStore: + def __init__(self, read_latency_ms: int = 80) -> None: + ... + + def list_records(self) -> list[dict[str, str]]: + time.sleep(self.read_latency_ms / 1000.0) + ... + + def update_field(self, entity_id: str, field: str, value: str) -> bool: + ... + self._emit_change(CHANGE_OP_UPSERT, entity_id, snapshot) + return True +``` + +Every mutation appends a change event to an in-process [`queue.Queue`](https://docs.python.org/3/library/queue.html). The sync worker drains the queue with a 50 ms timeout and applies each event to Redis. In a real system this queue is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis. + +## Production usage + +This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it. + +### Replace the in-process change queue with a real CDC pipeline + +The demo's in-process queue is the simplest possible stand-in for a CDC change feed. 
In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset. + +### Use a long safety-net TTL, not a freshness TTL + +The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key). Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys. + +### Decide what to do on a cache miss + +A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are: + +* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache. +* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken. + +Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides. + +### Bound the working set to what fits in memory + +Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM. + +### Reconcile periodically against the primary + +CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss. + +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` argument exactly for this. + +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. 
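+
+To make the reconciliation job described above concrete, here is a sketch built from this guide's own helpers (`list_records()`, `get()`, `ids()`, `apply_change()`, `pause()`/`resume()`); the `reconcile()` function itself is hypothetical, not part of the demo:
+
+```python
+def reconcile(primary: MockPrimaryStore, cache: PrefetchCache, sync: SyncWorker) -> int:
+    """Repair drift between the primary and the cache in one pass."""
+    fixed = 0
+    sync.pause()  # as /reprefetch does: no in-flight event may race the repair
+    try:
+        primary_ids = set()
+        for record in primary.list_records():
+            primary_ids.add(record["id"])
+            cached, hit, _ = cache.get(record["id"])
+            if not hit or cached != record:
+                # Rewrite through the same path the sync worker uses.
+                cache.apply_change({"op": "upsert", "id": record["id"], "fields": record})
+                fixed += 1
+        for entity_id in cache.ids():
+            if entity_id not in primary_ids:
+                # Cached entry whose ID no longer exists in the primary.
+                cache.apply_change({"op": "delete", "id": entity_id})
+                fixed += 1
+    finally:
+        sync.resume()
+    return fixed
+```
+
+Schedule it from cron or your job runner; even a daily run catches silent divergence.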
+
+## Learn more
+
+* [redis-py guide]({{< relref "/develop/clients/redis-py" >}}) - Install and use the Python Redis client
+* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields
+* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash
+* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds
+* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete
+* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server
+* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key
+* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud
diff --git a/content/develop/use-cases/prefetch-cache/redis-py/cache.py b/content/develop/use-cases/prefetch-cache/redis-py/cache.py
new file mode 100644
index 0000000000..f01b64acdb
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/redis-py/cache.py
@@ -0,0 +1,209 @@
+"""
+Redis prefetch-cache helper.
+
+Each cached entity is stored as a Redis hash under ``{prefix}{id}``
+(for example ``cache:category:cat-001``) with a long safety-net TTL
+that bounds memory if the sync pipeline ever stops, but is not the
+freshness mechanism. Freshness comes from the ``apply_change`` path,
+which the sync worker calls every time a primary mutation arrives.
+
+Reads run ``HGETALL`` against Redis only. A miss is not a fall-back
+trigger — the application treats it as an error or a deliberate
+``invalidate`` for testing. In production a sustained miss rate means
+the prefetch or the sync pipeline is broken, not that the primary should
+be re-queried on the request path.
+"""
+
+from __future__ import annotations
+
+from threading import Lock
+import time
+from typing import Iterable, Optional
+
+import redis
+
+
+class PrefetchCache:
+    """Prefetch-cache helper backed by Redis hashes with a safety-net TTL."""
+
+    def __init__(
+        self,
+        redis_client: Optional[redis.Redis] = None,
+        prefix: str = "cache:category:",
+        ttl_seconds: int = 3600,
+    ) -> None:
+        self.redis = redis_client or redis.Redis(
+            host="localhost",
+            port=6379,
+            decode_responses=True,
+        )
+        self.prefix = prefix
+        self.ttl_seconds = ttl_seconds
+
+        self._stats_lock = Lock()
+        self._hits = 0
+        self._misses = 0
+        self._prefetched = 0
+        self._sync_events_applied = 0
+        self._sync_lag_ms_total = 0.0
+        self._sync_lag_samples = 0
+
+    def _cache_key(self, entity_id: str) -> str:
+        return f"{self.prefix}{entity_id}"
+
+    def _strip_prefix(self, key: str) -> str:
+        return key[len(self.prefix):] if key.startswith(self.prefix) else key
+
+    def bulk_load(self, records: Iterable[dict[str, str]]) -> int:
+        """Pipeline ``HSET`` + ``EXPIRE`` for every record. Returns the count loaded.
+
+        The pipeline is non-transactional: it is fast on startup (when
+        nothing is reading the cache) and on the live ``/reprefetch``
+        path (when the demo pauses the sync worker around the call).
+        Calling ``bulk_load`` on a cache that is actively being read
+        and written to can briefly expose a key that has been deleted
+        but not yet rewritten; pause the writers first or rewrite this
+        with ``pipeline(transaction=True)`` if that matters.
+ """ + loaded = 0 + pipe = self.redis.pipeline(transaction=False) + for record in records: + entity_id = record.get("id") + if not entity_id: + continue + cache_key = self._cache_key(entity_id) + pipe.delete(cache_key) + pipe.hset(cache_key, mapping=record) + pipe.expire(cache_key, self.ttl_seconds) + loaded += 1 + if loaded: + pipe.execute() + with self._stats_lock: + self._prefetched += loaded + return loaded + + def get( + self, + entity_id: str, + ) -> tuple[Optional[dict[str, str]], bool, float]: + """Return ``(record, hit, redis_latency_ms)`` for an ``HGETALL`` against Redis. + + Prefetch-cache reads do not fall back to the primary. A miss is a + signal that the cache is incomplete, not a trigger to re-query the + source. The caller decides how to surface it. + """ + cache_key = self._cache_key(entity_id) + + started = time.perf_counter() + cached = self.redis.hgetall(cache_key) + redis_latency_ms = (time.perf_counter() - started) * 1000.0 + + if cached: + with self._stats_lock: + self._hits += 1 + return cached, True, redis_latency_ms + + with self._stats_lock: + self._misses += 1 + return None, False, redis_latency_ms + + def apply_change(self, change: dict) -> None: + """Apply a primary change event to Redis. + + The sync worker calls this for every event the primary emits. + For an upsert, the helper rewrites the hash and refreshes the + safety-net TTL. For a delete, it removes the cache key. + """ + op = change.get("op") + entity_id = change.get("id") + if not entity_id: + return + + cache_key = self._cache_key(entity_id) + + if op == "upsert": + fields = change.get("fields") + if not fields: + # Malformed upsert with no fields. Skip rather than crash + # the sync worker: HSET with an empty mapping raises + # DataError, and there's nothing to write anyway. A real + # CDC consumer would route this to a dead-letter queue + # and alert; the demo just drops it. + return + pipe = self.redis.pipeline(transaction=True) + pipe.delete(cache_key) + pipe.hset(cache_key, mapping=fields) + pipe.expire(cache_key, self.ttl_seconds) + pipe.execute() + elif op == "delete": + self.redis.delete(cache_key) + else: + return + + with self._stats_lock: + self._sync_events_applied += 1 + timestamp_ms = change.get("timestamp_ms") + if isinstance(timestamp_ms, (int, float)): + lag_ms = max(0.0, (time.time() * 1000.0) - timestamp_ms) + self._sync_lag_ms_total += lag_ms + self._sync_lag_samples += 1 + + def invalidate(self, entity_id: str) -> bool: + """Delete one cache key. 
Demo-only: simulates a broken sync pipeline."""
+        return self.redis.delete(self._cache_key(entity_id)) == 1
+
+    def clear(self) -> int:
+        """Delete every key under this cache's prefix and return the count."""
+        deleted = 0
+        pipe = self.redis.pipeline(transaction=False)
+        batch = 0
+        for key in self.redis.scan_iter(match=f"{self.prefix}*", count=500):
+            pipe.delete(key)
+            batch += 1
+            if batch >= 500:
+                deleted += sum(int(bool(r)) for r in pipe.execute())
+                pipe = self.redis.pipeline(transaction=False)
+                batch = 0
+        if batch:
+            deleted += sum(int(bool(r)) for r in pipe.execute())
+        return deleted
+
+    def ids(self) -> list[str]:
+        """Return every entity id currently in the cache."""
+        return sorted(
+            self._strip_prefix(key)
+            for key in self.redis.scan_iter(match=f"{self.prefix}*", count=500)
+        )
+
+    def count(self) -> int:
+        return sum(1 for _ in self.redis.scan_iter(match=f"{self.prefix}*", count=500))
+
+    def ttl_remaining(self, entity_id: str) -> int:
+        return int(self.redis.ttl(self._cache_key(entity_id)))
+
+    def stats(self) -> dict[str, float]:
+        with self._stats_lock:
+            total = self._hits + self._misses
+            hit_rate = round(100.0 * self._hits / total, 1) if total else 0.0
+            avg_lag = (
+                round(self._sync_lag_ms_total / self._sync_lag_samples, 2)
+                if self._sync_lag_samples
+                else 0.0
+            )
+            return {
+                "hits": self._hits,
+                "misses": self._misses,
+                "hit_rate_pct": hit_rate,
+                "prefetched": self._prefetched,
+                "sync_events_applied": self._sync_events_applied,
+                "sync_lag_ms_avg": avg_lag,
+            }
+
+    def reset_stats(self) -> None:
+        with self._stats_lock:
+            self._hits = 0
+            self._misses = 0
+            self._prefetched = 0
+            self._sync_events_applied = 0
+            self._sync_lag_ms_total = 0.0
+            self._sync_lag_samples = 0
diff --git a/content/develop/use-cases/prefetch-cache/redis-py/demo_server.py b/content/develop/use-cases/prefetch-cache/redis-py/demo_server.py
new file mode 100644
index 0000000000..8c642ba189
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/redis-py/demo_server.py
@@ -0,0 +1,658 @@
+#!/usr/bin/env python3
+"""
+Redis prefetch-cache demo server.
+
+Run this file and visit http://localhost:8082 to watch a prefetch cache
+in action: the demo bulk-loads every primary record into Redis on
+startup, runs a background sync worker that applies primary mutations
+within milliseconds, and lets you add, update, delete, and re-prefetch
+records to see how the cache stays current without ever falling back to
+the primary on the read path.
+"""
+
+from __future__ import annotations
+
+import argparse
+from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
+import json
+from pathlib import Path
+import sys
+import time
+from urllib.parse import parse_qs, urlparse
+
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+
+try:
+    import redis
+
+    from cache import PrefetchCache
+    from primary import MockPrimaryStore
+    from sync_worker import SyncWorker
+except ImportError as exc:
+    print(f"Error: {exc}")
+    print("Make sure the 'redis' package is installed: pip install redis")
+    sys.exit(1)
+
+
+HTML_TEMPLATE = """
+<!-- Demo page markup: a header ("Redis Prefetch Cache Demo",
+     "redis-py + Python standard library HTTP server"), intro copy
+     explaining that reads run HGETALL against Redis only and that the
+     safety-net TTL (__CACHE_TTL__ s) is refreshed on every add or update
+     event, and panels for Cache state, Read a category, Update a field,
+     Add a category, Delete a category, Break the cache, Cache stats, and
+     Last result. The full markup and page script are in the
+     demo_server.py source on GitHub. -->
+ + + + +""" + + +class PrefetchCacheDemoHandler(BaseHTTPRequestHandler): + """Serve the prefetch-cache demo UI and JSON endpoints.""" + + cache: PrefetchCache | None = None + primary: MockPrimaryStore | None = None + sync: SyncWorker | None = None + + def do_GET(self) -> None: + parsed = urlparse(self.path) + if parsed.path in {"/", "/index.html"}: + self._send_html(self._html_page()) + return + if parsed.path == "/categories": + self._send_json( + { + "cache_ids": self.cache.ids(), + "primary_ids": self.primary.list_ids(), + }, + 200, + ) + return + if parsed.path == "/read": + self._handle_read(parse_qs(parsed.query)) + return + if parsed.path == "/stats": + self._send_json(self._build_stats(), 200) + return + self.send_error(404) + + def do_POST(self) -> None: + parsed = urlparse(self.path) + if parsed.path == "/update": + self._handle_update() + return + if parsed.path == "/add": + self._handle_add() + return + if parsed.path == "/delete": + self._handle_delete() + return + if parsed.path == "/invalidate": + self._handle_invalidate() + return + if parsed.path == "/clear": + # Pause the sync worker so it cannot recreate keys between + # SCAN and DEL. Queued events accumulate and apply after resume. + self.sync.pause() + try: + deleted = self.cache.clear() + finally: + self.sync.resume() + self._send_json({"deleted": deleted, "stats": self._build_stats()}, 200) + return + if parsed.path == "/reprefetch": + # Pause the sync worker so it cannot interleave with the + # clear + snapshot + bulk_load sequence. Without this, a + # change applied between list_records() and bulk_load() + # would be overwritten by the stale snapshot. + self.sync.pause() + try: + started = time.perf_counter() + self.cache.clear() + loaded = self.cache.bulk_load(self.primary.list_records()) + elapsed = (time.perf_counter() - started) * 1000.0 + finally: + self.sync.resume() + self._send_json( + {"loaded": loaded, "elapsed_ms": round(elapsed, 2), "stats": self._build_stats()}, + 200, + ) + return + if parsed.path == "/reset": + self.cache.reset_stats() + self.primary.reset_reads() + self._send_json(self._build_stats(), 200) + return + self.send_error(404) + + def _handle_read(self, query: dict[str, list[str]]) -> None: + entity_id = query.get("id", [""])[0] + if not entity_id: + self._send_json({"error": "Missing 'id'."}, 400) + return + record, hit, redis_ms = self.cache.get(entity_id) + self._send_json( + { + "id": entity_id, + "record": record, + "hit": hit, + "redis_latency_ms": round(redis_ms, 2), + "ttl_remaining": self.cache.ttl_remaining(entity_id), + "stats": self._build_stats(), + }, + 200, + ) + + def _handle_update(self) -> None: + params = self._read_form_data() + entity_id = params.get("id", [""])[0] + field = params.get("field", [""])[0] + value = params.get("value", [""])[0] + if not entity_id or not field: + self._send_json({"error": "Missing 'id' or 'field'."}, 400) + return + if not self.primary.update_field(entity_id, field, value): + self._send_json({"error": f"Unknown category '{entity_id}'."}, 404) + return + self._send_json( + {"id": entity_id, "field": field, "value": value, "stats": self._build_stats()}, + 200, + ) + + def _handle_add(self) -> None: + params = self._read_form_data() + entity_id = params.get("id", [""])[0].strip() + name = params.get("name", [""])[0].strip() + if not entity_id or not name: + self._send_json({"error": "Missing 'id' or 'name'."}, 400) + return + record = { + "id": entity_id, + "name": name, + "display_order": params.get("display_order", ["99"])[0] or "99", + 
"featured": params.get("featured", ["false"])[0] or "false", + "parent_id": params.get("parent_id", [""])[0] or "", + } + if not self.primary.add_record(record): + self._send_json({"error": f"Category '{entity_id}' already exists."}, 409) + return + self._send_json({"id": entity_id, "record": record, "stats": self._build_stats()}, 200) + + def _handle_delete(self) -> None: + params = self._read_form_data() + entity_id = params.get("id", [""])[0] + if not entity_id: + self._send_json({"error": "Missing 'id'."}, 400) + return + if not self.primary.delete_record(entity_id): + self._send_json({"error": f"Unknown category '{entity_id}'."}, 404) + return + self._send_json({"id": entity_id, "stats": self._build_stats()}, 200) + + def _handle_invalidate(self) -> None: + params = self._read_form_data() + entity_id = params.get("id", [""])[0] + if not entity_id: + self._send_json({"error": "Missing 'id'."}, 400) + return + deleted = self.cache.invalidate(entity_id) + self._send_json( + {"id": entity_id, "deleted": deleted, "stats": self._build_stats()}, + 200, + ) + + def _build_stats(self) -> dict: + stats = self.cache.stats() + stats["primary_reads_total"] = self.primary.reads() + stats["primary_read_latency_ms"] = self.primary.read_latency_ms + return stats + + def _read_form_data(self) -> dict[str, list[str]]: + content_length = int(self.headers.get("Content-Length", "0")) + raw_body = self.rfile.read(content_length).decode("utf-8") + return parse_qs(raw_body) + + def _send_html(self, html: str, status: int = 200) -> None: + self.send_response(status) + self.send_header("Content-Type", "text/html; charset=utf-8") + self.end_headers() + self.wfile.write(html.encode("utf-8")) + + def _send_json(self, payload: dict, status: int) -> None: + self.send_response(status) + self.send_header("Content-Type", "application/json") + self.end_headers() + self.wfile.write(json.dumps(payload).encode("utf-8")) + + def log_message(self, format: str, *args) -> None: # noqa: A002 + sys.stderr.write(f"[demo] {format % args}\n") + + def _html_page(self) -> str: + return HTML_TEMPLATE.replace("__CACHE_TTL__", str(self.cache.ttl_seconds)) + + +def parse_args() -> argparse.Namespace: + parser = argparse.ArgumentParser(description="Run the Redis prefetch-cache demo server.") + parser.add_argument("--host", default="127.0.0.1", help="HTTP bind host") + parser.add_argument("--port", type=int, default=8082, help="HTTP bind port") + parser.add_argument("--redis-host", default="localhost", help="Redis host") + parser.add_argument("--redis-port", type=int, default=6379, help="Redis port") + parser.add_argument("--cache-prefix", default="cache:category:", help="Cache key prefix") + parser.add_argument( + "--ttl-seconds", + type=int, + default=3600, + help="Safety-net TTL in seconds (refreshed on every sync event)", + ) + parser.add_argument( + "--primary-latency-ms", + type=int, + default=80, + help="Simulated primary read latency (only affects bulk loads and reconciliations)", + ) + return parser.parse_args() + + +def main() -> None: + args = parse_args() + + redis_client = redis.Redis( + host=args.redis_host, + port=args.redis_port, + decode_responses=True, + ) + cache = PrefetchCache( + redis_client=redis_client, + prefix=args.cache_prefix, + ttl_seconds=args.ttl_seconds, + ) + primary = MockPrimaryStore(read_latency_ms=args.primary_latency_ms) + sync = SyncWorker(primary=primary, cache=cache) + + started = time.perf_counter() + cache.clear() + loaded = cache.bulk_load(primary.list_records()) + elapsed_ms = 
(time.perf_counter() - started) * 1000.0 + sync.start() + + PrefetchCacheDemoHandler.cache = cache + PrefetchCacheDemoHandler.primary = primary + PrefetchCacheDemoHandler.sync = sync + + print(f"Redis prefetch-cache demo server listening on http://{args.host}:{args.port}") + print( + f"Using Redis at {args.redis_host}:{args.redis_port}" + f" with cache prefix '{args.cache_prefix}' and TTL {args.ttl_seconds}s" + ) + print(f"Prefetched {loaded} records in {elapsed_ms:.1f} ms; sync worker running") + + server = ThreadingHTTPServer((args.host, args.port), PrefetchCacheDemoHandler) + try: + server.serve_forever() + except KeyboardInterrupt: + pass + finally: + sync.stop() + + +if __name__ == "__main__": + main() diff --git a/content/develop/use-cases/prefetch-cache/redis-py/primary.py b/content/develop/use-cases/prefetch-cache/redis-py/primary.py new file mode 100644 index 0000000000..e7c47ab7f4 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/redis-py/primary.py @@ -0,0 +1,160 @@ +""" +Mock primary data store for the prefetch-cache demo. + +This stands in for a source-of-truth database (Postgres, MySQL, Mongo, +etc.) that holds reference data the application serves to users. + +Every mutation appends a change event to an in-process queue, which the +sync worker drains and applies to Redis. In a real system the queue is +replaced by a CDC pipeline — Redis Data Integration, Debezium plus a +lightweight consumer, or an equivalent tool that tails the source's +binlog/WAL and pushes changes into Redis. + +The store also exposes ``read_latency_ms`` so the demo can illustrate +how much slower a direct primary read would be than a Redis hit. +""" + +from __future__ import annotations + +from queue import Empty, Queue +from threading import Lock +import time +from typing import Iterable, Optional + + +CHANGE_OP_UPSERT = "upsert" +CHANGE_OP_DELETE = "delete" + + +class MockPrimaryStore: + """In-memory stand-in for a primary database of reference data.""" + + def __init__(self, read_latency_ms: int = 80) -> None: + self.read_latency_ms = read_latency_ms + self._lock = Lock() + self._reads = 0 + self._changes: Queue[dict] = Queue() + self._records: dict[str, dict[str, str]] = { + "cat-001": { + "id": "cat-001", + "name": "Beverages", + "display_order": "1", + "featured": "true", + "parent_id": "", + }, + "cat-002": { + "id": "cat-002", + "name": "Bakery", + "display_order": "2", + "featured": "true", + "parent_id": "", + }, + "cat-003": { + "id": "cat-003", + "name": "Pantry Staples", + "display_order": "3", + "featured": "false", + "parent_id": "", + }, + "cat-004": { + "id": "cat-004", + "name": "Frozen", + "display_order": "4", + "featured": "false", + "parent_id": "", + }, + "cat-005": { + "id": "cat-005", + "name": "Specialty Cheeses", + "display_order": "5", + "featured": "false", + "parent_id": "cat-002", + }, + } + + def list_ids(self) -> list[str]: + with self._lock: + return sorted(self._records.keys()) + + def list_records(self) -> list[dict[str, str]]: + """Return every record. Used by the cache's bulk-load path on startup.""" + time.sleep(self.read_latency_ms / 1000.0) + with self._lock: + self._reads += 1 + return [dict(record) for record in self._records.values()] + + def read(self, entity_id: str) -> Optional[dict[str, str]]: + """Single-record read. 
Not on the demo's normal read path.""" + time.sleep(self.read_latency_ms / 1000.0) + with self._lock: + self._reads += 1 + record = self._records.get(entity_id) + return dict(record) if record else None + + def add_record(self, record: dict[str, str]) -> bool: + entity_id = record.get("id", "").strip() + if not entity_id: + return False + with self._lock: + if entity_id in self._records: + return False + self._records[entity_id] = dict(record) + # Emit while the lock is held so the queue order matches the + # mutation order. Two concurrent callers cannot interleave + # mutation A → mutation B → emit B → emit A. + self._emit_change_locked(CHANGE_OP_UPSERT, entity_id, dict(record)) + return True + + def update_field(self, entity_id: str, field: str, value: str) -> bool: + with self._lock: + record = self._records.get(entity_id) + if record is None: + return False + record[field] = value + self._emit_change_locked(CHANGE_OP_UPSERT, entity_id, dict(record)) + return True + + def delete_record(self, entity_id: str) -> bool: + with self._lock: + if entity_id not in self._records: + return False + del self._records[entity_id] + self._emit_change_locked(CHANGE_OP_DELETE, entity_id, None) + return True + + def next_change(self, timeout: float) -> Optional[dict]: + """Block up to ``timeout`` seconds for the next change event.""" + try: + return self._changes.get(timeout=timeout) + except Empty: + return None + + def reads(self) -> int: + with self._lock: + return self._reads + + def reset_reads(self) -> None: + with self._lock: + self._reads = 0 + + def _emit_change_locked( + self, + op: str, + entity_id: str, + fields: Optional[dict[str, str]], + ) -> None: + """Append a change event to the feed. Caller must hold ``self._lock``. + + ``queue.Queue.put`` is itself thread-safe and never tries to acquire + ``self._lock``, so calling it while holding the records lock cannot + deadlock. Holding the lock here is what guarantees that the queue + order matches the order in which the records dict was mutated. + """ + self._changes.put( + { + "op": op, + "id": entity_id, + "fields": fields, + "timestamp_ms": time.time() * 1000.0, + } + ) diff --git a/content/develop/use-cases/prefetch-cache/redis-py/sync_worker.py b/content/develop/use-cases/prefetch-cache/redis-py/sync_worker.py new file mode 100644 index 0000000000..b27c8c629a --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/redis-py/sync_worker.py @@ -0,0 +1,111 @@ +""" +Background sync worker for the prefetch-cache demo. + +A daemon thread drains the primary's change queue and applies each event +to Redis through ``PrefetchCache.apply_change``. In a real system, the +queue is replaced by a CDC pipeline (Redis Data Integration, Debezium, +or an equivalent) that tails the primary's binlog/WAL and writes the +same shape of events. + +The worker exposes ``pause()`` and ``resume()`` so maintenance paths +(``/reprefetch``, ``clear()``) can stop event application without +tearing the thread down. ``pause()`` blocks until the worker is parked, +so the caller knows no apply is in flight by the time it returns. 
+""" + +from __future__ import annotations + +from threading import Event, Thread +from typing import Optional + +from cache import PrefetchCache +from primary import MockPrimaryStore + + +class SyncWorker: + """Drain primary change events into Redis on a daemon thread.""" + + def __init__( + self, + primary: MockPrimaryStore, + cache: PrefetchCache, + poll_timeout_s: float = 0.05, + ) -> None: + self.primary = primary + self.cache = cache + self.poll_timeout_s = poll_timeout_s + self._stop_event = Event() + self._pause_event = Event() + self._paused_idle_event = Event() + self._thread: Optional[Thread] = None + + def start(self) -> None: + if self._thread is not None and self._thread.is_alive(): + return + self._stop_event.clear() + self._pause_event.clear() + self._paused_idle_event.clear() + self._thread = Thread( + target=self._run, + name="prefetch-cache-sync", + daemon=True, + ) + self._thread.start() + + def stop(self, join_timeout_s: float = 2.0) -> None: + """Signal the worker to exit and join its thread. + + If the join times out the worker is wedged inside ``apply_change``; + we leave ``self._thread`` populated so a subsequent ``start()`` does + not spawn a second worker on top of the orphan. + """ + self._stop_event.set() + if self._thread is None: + return + self._thread.join(timeout=join_timeout_s) + if not self._thread.is_alive(): + self._thread = None + + def pause(self, timeout_s: float = 2.0) -> bool: + """Stop applying events and block until the worker is parked. + + Returns ``True`` once the worker has confirmed it is idle, or + ``False`` if the timeout elapsed first. While paused, change + events accumulate in the primary's queue and are applied in order + after ``resume()``. + """ + self._paused_idle_event.clear() + self._pause_event.set() + if self._thread is None or not self._thread.is_alive(): + return True + return self._paused_idle_event.wait(timeout=timeout_s) + + def resume(self) -> None: + self._pause_event.clear() + self._paused_idle_event.clear() + + def _run(self) -> None: + while not self._stop_event.is_set(): + if self._pause_event.is_set(): + # Park until the pause is lifted or the worker is stopped. + # Re-set ``_paused_idle_event`` on every iteration so a *new* + # ``pause()`` that arrives while we are still parked from + # the previous cycle gets acknowledged within one poll + # interval, not the caller's full pause-timeout. + while self._pause_event.is_set() and not self._stop_event.is_set(): + self._paused_idle_event.set() + self._stop_event.wait(timeout=self.poll_timeout_s) + self._paused_idle_event.clear() + continue + + change = self.primary.next_change(timeout=self.poll_timeout_s) + if change is None: + continue + try: + self.cache.apply_change(change) + except Exception as exc: + # Demo behaviour: log and drop the event. A production + # CDC consumer would retry with bounded backoff and + # expose a dead-letter / error counter; see the guide's + # "Production usage" section. 
+ print(f"[sync] failed to apply {change!r}: {exc}") diff --git a/content/develop/use-cases/prefetch-cache/ruby/_index.md b/content/develop/use-cases/prefetch-cache/ruby/_index.md new file mode 100644 index 0000000000..1103a987c2 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/ruby/_index.md @@ -0,0 +1,426 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in Ruby with redis-rb +linkTitle: redis-rb example (Ruby) +title: Redis prefetch cache with redis-rb +weight: 8 +--- + +This guide shows you how to implement a Redis prefetch cache in Ruby with [`redis-rb`]({{< relref "/develop/clients/ruby" >}}). It includes a small local web server built with the Ruby standard library `webrick` so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. + +That gives you: + +* Near-100% cache hit ratios for reference and master data +* Sub-millisecond reads for lookup-heavy paths at peak traffic +* All reference-data reads offloaded from the primary database +* Source-database changes propagated into Redis within a few milliseconds +* A long safety-net TTL that bounds memory if the sync pipeline ever stops + +In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `cache.bulk_load(primary.list_records)`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. **On every read**, the application calls `cache.get(entity_id)`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to an in-process queue. The sync worker thread drains the queue and calls `cache.apply_change(event)`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key. + +In a real system the in-process change queue is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis. + +## The prefetch-cache helper + +The `PrefetchCache` class wraps the cache operations +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/ruby/cache.rb)): + +```ruby +require "redis" +require_relative "cache" +require_relative "primary" +require_relative "sync_worker" + +redis = Redis.new(host: "localhost", port: 6379) +primary = MockPrimaryStore.new +cache = PrefetchCache.new(redis_client: redis, ttl_seconds: 3600) + +# Pre-load every primary record into Redis in one pipelined round trip. 
+cache.bulk_load(primary.list_records) + +# Start the sync worker so primary mutations propagate into Redis. +sync = SyncWorker.new(primary: primary, cache: cache) +sync.start + +# Read paths now go to Redis only. +record, hit, redis_ms = cache.get("cat-001") +``` + +### Data model + +Each cached category is stored in a Redis hash: + +```text +cache:category:cat-001 + id = cat-001 + name = Beverages + display_order = 1 + featured = true + parent_id = +``` + +The implementation uses: + +* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event +* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path +* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation +* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix +* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI + +## Bulk load on startup + +The `bulk_load` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record. The pipeline is sent in a single round trip, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table: + +```ruby +def bulk_load(records) + loaded = 0 + @apply_lock.synchronize do + @redis.pipelined do |pipe| + records.each do |record| + entity_id = record["id"] || record[:id] + next if entity_id.nil? || entity_id.to_s.empty? + cache_key = cache_key_for(entity_id) + pipe.del(cache_key) + pipe.hset(cache_key, stringify_record(record)) + pipe.expire(cache_key, @ttl_seconds) + loaded += 1 + end + end + end + @stats_lock.synchronize { @prefetched += loaded } + loaded +end +``` + +The `pipelined` block sends every command in one network round trip without `MULTI`/`EXEC`. This is intentional on the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `bulk_load` directly from your own code on a cache that is already serving reads, either pause your writers first or rewrite it with `multi` so callers cannot observe a half-loaded record. + +## Reads from Redis only + +The `get` method runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident: + +```ruby +def get(entity_id) + cache_key = cache_key_for(entity_id) + + started = monotonic_ms + cached = @redis.hgetall(cache_key) + redis_latency_ms = monotonic_ms - started + + if cached && !cached.empty? + @stats_lock.synchronize { @hits += 1 } + [cached, true, redis_latency_ms] + else + @stats_lock.synchronize { @misses += 1 } + [nil, false, redis_latency_ms] + end +end +``` + +This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load. + +## Applying sync events + +The sync worker calls `apply_change` for every primary mutation. 
For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL in one `MULTI`/`EXEC` transaction so the cache never holds a stale mix of old and new fields. For a `delete`, it removes the cache key: + +```ruby +def apply_change(change) + op = change[:op] || change["op"] + entity_id = change[:id] || change["id"] + return if entity_id.nil? || entity_id.to_s.empty? + + cache_key = cache_key_for(entity_id) + + case op + when "upsert" + fields = change[:fields] || change["fields"] + return if fields.nil? || fields.empty? + + @apply_lock.synchronize do + @redis.multi do |tx| + tx.del(cache_key) + tx.hset(cache_key, stringify_record(fields)) + tx.expire(cache_key, @ttl_seconds) + end + end + when "delete" + @redis.del(cache_key) + end + # stats update omitted for brevity +end +``` + +The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. The `Mutex` around `multi` is a redis-rb-specific guard: a single `Redis.new` connection is thread-safe for individual commands, but a `multi` block uses the underlying connection for the duration of the transaction, so concurrent transactions on the same client must be serialised by the caller (or you must use a connection pool). The demo uses one shared client plus a mutex; production code should use [`connection_pool`](https://github.com/mperham/connection_pool) and check out a connection per transaction instead. + +## The sync worker + +The `SyncWorker` runs a daemon thread that blocks on the primary's change queue with a short timeout. Every change is applied to Redis as soon as it arrives +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/ruby/sync_worker.rb)): + +```ruby +def run_loop + loop do + should_stop, should_pause = @state_mutex.synchronize { [@stop, @pause] } + break if should_stop + + if should_pause + @state_mutex.synchronize do + @paused_idle = true + @state_cv.broadcast + while @pause && !@stop + @state_cv.wait(@state_mutex, @poll_timeout_s) + end + @paused_idle = false + end + next + end + + change = @primary.next_change(@poll_timeout_s) + next if change.nil? + begin + @cache.apply_change(change) + rescue => exc + warn "[sync] failed to apply #{change.inspect}: #{exc}" + end + end +end +``` + +In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset. + +## Invalidation and re-prefetch + +Two helpers exist for testing and recovery: + +* `invalidate(entity_id)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record. +* `clear` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss. + +In both cases, the recovery path is to call `bulk_load(primary.list_records)` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation. + +### Re-prefetch under load + +`clear` and `bulk_load` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.list_records`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `clear`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated. 
The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation: + +```ruby +@sync.pause +begin + @cache.clear + @cache.bulk_load(@primary.list_records) +ensure + @sync.resume +end +``` + +`pause` waits for the worker to finish whatever event it is currently applying, parks the run loop on a `ConditionVariable`, and returns. Change events that arrive during the pause sit in the primary's `Queue` and apply in order once `resume` is called, so no event is lost. + +## Hit/miss accounting + +The helper keeps in-process counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up: + +```ruby +def stats + @stats_lock.synchronize do + total = @hits + @misses + hit_rate = total.zero? ? 0.0 : (100.0 * @hits / total).round(1) + avg_lag = @sync_lag_samples.zero? ? 0.0 : (@sync_lag_ms_total / @sync_lag_samples).round(2) + { + "hits" => @hits, + "misses" => @misses, + "hit_rate_pct" => hit_rate, + "prefetched" => @prefetched, + "sync_events_applied" => @sync_events_applied, + "sync_lag_ms_avg" => avg_lag, + } + end +end +``` + +In production you would emit these as Prometheus counters and gauges. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind. + +## Prerequisites + +Before running the demo, make sure that: + +* Redis is running and accessible. By default, the demo connects to `localhost:6379`. +* Ruby 2.6 or later is installed (the demo uses `webrick`, which is stdlib in 2.x and is shipped as a bundled gem in 3.x). +* The `redis` gem (`redis-rb` 5.x) is installed: + +```bash +gem install redis webrick +``` + +If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`. + +## Running the demo + +### Get the source files + +The demo consists of four files. Download them from the [`ruby` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/ruby) on GitHub, or grab them with `curl`: + +```bash +mkdir prefetch-cache-demo && cd prefetch-cache-demo +BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/ruby +curl -O $BASE/cache.rb +curl -O $BASE/primary.rb +curl -O $BASE/sync_worker.rb +curl -O $BASE/demo_server.rb +``` + +### Start the demo server + +From that directory: + +```bash +ruby demo_server.rb +``` + +You should see something like: + +```text +Redis prefetch-cache demo server listening on http://127.0.0.1:8789 +Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s +Prefetched 5 records in 86.5 ms; sync worker running +``` + +After starting the server, visit `http://localhost:8789`. 
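+If you prefer to verify from a script instead of the browser, the JSON
+endpoints the page uses are plain HTTP. A minimal check with the Ruby
+standard library (endpoint names taken from `demo_server.rb`; the port
+assumes the default `8789`):
+
+```ruby
+require "json"
+require "net/http"
+
+# Confirm the bulk load ran: the cache should hold every primary ID.
+ids = JSON.parse(Net::HTTP.get(URI("http://localhost:8789/categories")))
+puts ids["cache_ids"] == ids["primary_ids"]   # => true on a warm cache
+
+# Read through the cache: expect a hit with sub-millisecond Redis latency.
+read = JSON.parse(Net::HTTP.get(URI("http://localhost:8789/read?id=cat-001")))
+puts read["hit"]                # => true
+puts read["redis_latency_ms"]   # compare with primary_read_latency_ms in /stats
+```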
+ +The demo server uses only Ruby standard library features for HTTP handling and concurrency: + +* [`webrick`](https://docs.ruby-lang.org/en/master/WEBrick.html) for the web server +* [`uri`](https://docs.ruby-lang.org/en/master/URI.html) and `req.query` for query and form decoding +* [`Thread`](https://docs.ruby-lang.org/en/master/Thread.html), [`Mutex`](https://docs.ruby-lang.org/en/master/Mutex.html), [`ConditionVariable`](https://docs.ruby-lang.org/en/master/ConditionVariable.html), and [`Queue`](https://docs.ruby-lang.org/en/master/Thread/Queue.html) for the sync worker daemon + +It exposes a small interactive page where you can: + +* See which IDs are in the cache and in the primary, side by side +* Read a category through the cache and confirm every read is a hit +* Update a field on the primary and watch the sync worker rewrite the cache hash +* Add and delete categories and watch them appear and disappear from the cache +* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure +* Re-prefetch from the primary to recover from a broken cache state +* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all + +## The mock primary store + +To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/ruby/primary.rb)): + +```ruby +class MockPrimaryStore + def initialize(read_latency_ms: 80) + # ... + end + + def list_records + sleep(@read_latency_ms / 1000.0) + # ... + end + + def update_field(entity_id, field, value) + @lock.synchronize do + record = @records[entity_id] + return false if record.nil? + record[field] = value + emit_change_locked(CHANGE_OP_UPSERT, entity_id, record.dup) + end + true + end +end +``` + +Every mutation appends a change event to an in-process [`Queue`](https://docs.ruby-lang.org/en/master/Thread/Queue.html). The sync worker drains the queue with a 50 ms timeout and applies each event to Redis. Ruby's `Queue#pop` does not accept a timeout directly, so `next_change` wraps a non-blocking `pop(true)` in a short polling loop — that keeps the worker responsive to its stop flag. + +In a real system this queue is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis. + +## Production usage + +This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it. + +### Replace the in-process change queue with a real CDC pipeline + +The demo's in-process queue is the simplest possible stand-in for a CDC change feed. In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset. + +### Use a long safety-net TTL, not a freshness TTL + +The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. 
The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key). Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys. + +### Decide what to do on a cache miss + +A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are: + +* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache. +* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken. + +Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides. + +### Bound the working set to what fits in memory + +Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM. + +### Reconcile periodically against the primary + +CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries. Even running it once a day catches drift that ad-hoc inspection would miss. + +### Use a connection pool for transactions, not a shared client + mutex + +A single `Redis.new` client is thread-safe for individual commands because Ruby's GIL serialises access to the C-level command pipeline. But `multi` blocks use the underlying connection for the duration of the transaction, so two threads cannot run `multi` blocks concurrently on the same client without interleaving. The demo gets away with one shared client guarded by a `Mutex`, which is fine for a small local server. In production, use [`connection_pool`](https://github.com/mperham/connection_pool) to keep a pool of `Redis` clients and check one out per transaction: + +```ruby +require "connection_pool" + +REDIS = ConnectionPool.new(size: 16, timeout: 5) { Redis.new(host: "localhost") } + +REDIS.with do |redis| + redis.multi do |tx| + tx.del(cache_key) + tx.hset(cache_key, fields) + tx.expire(cache_key, ttl_seconds) + end +end +``` + +That gives you per-thread transactions without a global lock and reuses connections across requests. + +### Namespace cache keys in shared Redis deployments + +If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. The helper takes a `prefix` argument exactly for this. 
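+For example, two services sharing one Redis deployment can each construct
+the helper with a distinct prefix, so `SCAN`-based maintenance (`clear`,
+`ids`, `count`) stays confined to that service's keyspace. A sketch using
+the constructor from `cache.rb` (the `billing`/`catalog` names are
+illustrative):
+
+```ruby
+billing_cache = PrefetchCache.new(
+  redis_client: redis,
+  prefix: "cache:billing:category:",
+  ttl_seconds: 3600,
+)
+catalog_cache = PrefetchCache.new(
+  redis_client: redis,
+  prefix: "cache:catalog:category:",
+  ttl_seconds: 3600,
+)
+
+# Clearing one namespace cannot touch the other:
+billing_cache.clear   # deletes only cache:billing:category:* keys
+```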
+ +### Inspect cached entries directly in Redis + +When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect: + +```bash +redis-cli --scan --pattern 'cache:category:*' +redis-cli HGETALL cache:category:cat-001 +redis-cli TTL cache:category:cat-001 +``` + +If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up. + +## Learn more + +* [redis-rb guide]({{< relref "/develop/clients/ruby" >}}) - Install and use the Ruby Redis client +* [HSET command]({{< relref "/commands/hset" >}}) - Write hash fields +* [HGETALL command]({{< relref "/commands/hgetall" >}}) - Read every field of a hash +* [EXPIRE command]({{< relref "/commands/expire" >}}) - Set key expiration in seconds +* [DEL command]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete +* [SCAN command]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server +* [TTL command]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key +* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud diff --git a/content/develop/use-cases/prefetch-cache/ruby/cache.rb b/content/develop/use-cases/prefetch-cache/ruby/cache.rb new file mode 100644 index 0000000000..04bc7eeca6 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/ruby/cache.rb @@ -0,0 +1,233 @@ +# Redis prefetch-cache helper. +# +# Each cached entity is stored as a Redis hash under "cache:{prefix}:{id}" +# with a long safety-net TTL that bounds memory if the sync pipeline ever +# stops, but is not the freshness mechanism. Freshness comes from the +# +apply_change+ path, which the sync worker calls every time a primary +# mutation arrives. +# +# Reads run +HGETALL+ against Redis only. A miss is not a fall-back +# trigger -- the application treats it as an error or a deliberate +# +invalidate+ for testing. In production a sustained miss rate means +# the prefetch or the sync pipeline is broken, not that the primary +# should be re-queried on the request path. + +require "redis" + +class PrefetchCache + # Prefetch-cache helper backed by Redis hashes with a safety-net TTL. + + attr_reader :ttl_seconds, :prefix + + def initialize(redis_client: nil, prefix: "cache:category:", ttl_seconds: 3600) + @redis = redis_client || Redis.new(host: "localhost", port: 6379) + @prefix = prefix + @ttl_seconds = ttl_seconds + + # Ruby's GIL makes individual commands thread-safe, but pipelines and + # transactions need to be serialised so concurrent callers do not + # interleave commands inside one MULTI/EXEC block. + @apply_lock = Mutex.new + @stats_lock = Mutex.new + @hits = 0 + @misses = 0 + @prefetched = 0 + @sync_events_applied = 0 + @sync_lag_ms_total = 0.0 + @sync_lag_samples = 0 + end + + # Pipeline +DEL+ + +HSET+ + +EXPIRE+ for every record. Returns the count + # loaded. + # + # The pipeline is non-transactional: it is fast on startup (when nothing + # is reading the cache) and on the live +/reprefetch+ path (when the + # demo pauses the sync worker around the call). 
Calling +bulk_load+ on + # a cache that is actively being read and written to can briefly expose + # a key that has been deleted but not yet rewritten; pause the writers + # first or rewrite this with a transactional pipeline if that matters. + def bulk_load(records) + loaded = 0 + @apply_lock.synchronize do + @redis.pipelined do |pipe| + records.each do |record| + entity_id = record["id"] || record[:id] + next if entity_id.nil? || entity_id.to_s.empty? + cache_key = cache_key_for(entity_id) + pipe.del(cache_key) + pipe.hset(cache_key, stringify_record(record)) + pipe.expire(cache_key, @ttl_seconds) + loaded += 1 + end + end + end + @stats_lock.synchronize { @prefetched += loaded } + loaded + end + + # Return +[record, hit, redis_latency_ms]+ for an +HGETALL+ against Redis. + # + # Prefetch-cache reads do not fall back to the primary. A miss is a + # signal that the cache is incomplete, not a trigger to re-query the + # source. The caller decides how to surface it. + def get(entity_id) + cache_key = cache_key_for(entity_id) + + started = monotonic_ms + cached = @redis.hgetall(cache_key) + redis_latency_ms = monotonic_ms - started + + if cached && !cached.empty? + @stats_lock.synchronize { @hits += 1 } + [cached, true, redis_latency_ms] + else + @stats_lock.synchronize { @misses += 1 } + [nil, false, redis_latency_ms] + end + end + + # Apply a primary change event to Redis. + # + # The sync worker calls this for every event the primary emits. For an + # upsert, the helper rewrites the hash and refreshes the safety-net + # TTL. For a delete, it removes the cache key. + def apply_change(change) + op = change[:op] || change["op"] + entity_id = change[:id] || change["id"] + return if entity_id.nil? || entity_id.to_s.empty? + + cache_key = cache_key_for(entity_id) + + case op + when "upsert" + fields = change[:fields] || change["fields"] + # Malformed upsert with no fields. Skip rather than crash the sync + # worker: HSET with an empty mapping raises in redis-rb, and there + # is nothing to write anyway. A real CDC consumer would route this + # to a dead-letter queue and alert; the demo just drops it. + return if fields.nil? || fields.empty? + + @apply_lock.synchronize do + @redis.multi do |tx| + tx.del(cache_key) + tx.hset(cache_key, stringify_record(fields)) + tx.expire(cache_key, @ttl_seconds) + end + end + when "delete" + @redis.del(cache_key) + else + return + end + + @stats_lock.synchronize do + @sync_events_applied += 1 + timestamp_ms = change[:timestamp_ms] || change["timestamp_ms"] + if timestamp_ms.is_a?(Numeric) + lag_ms = [0.0, wall_ms - timestamp_ms.to_f].max + @sync_lag_ms_total += lag_ms + @sync_lag_samples += 1 + end + end + end + + # Delete one cache key. Demo-only: simulates a broken sync pipeline. + def invalidate(entity_id) + @redis.del(cache_key_for(entity_id)) == 1 + end + + # Delete every key under this cache's prefix and return the count. + def clear + deleted = 0 + cursor = "0" + loop do + cursor, keys = @redis.scan(cursor, match: "#{@prefix}*", count: 500) + unless keys.empty? + results = @redis.pipelined do |pipe| + keys.each { |k| pipe.del(k) } + end + deleted += results.sum { |r| r.to_i } + end + break if cursor == "0" + end + deleted + end + + # Return every entity id currently in the cache, sorted. 
+ def ids + results = [] + cursor = "0" + loop do + cursor, keys = @redis.scan(cursor, match: "#{@prefix}*", count: 500) + keys.each { |k| results << strip_prefix(k) } + break if cursor == "0" + end + results.sort + end + + def count + total = 0 + cursor = "0" + loop do + cursor, keys = @redis.scan(cursor, match: "#{@prefix}*", count: 500) + total += keys.length + break if cursor == "0" + end + total + end + + def ttl_remaining(entity_id) + @redis.ttl(cache_key_for(entity_id)).to_i + end + + def stats + @stats_lock.synchronize do + total = @hits + @misses + hit_rate = total.zero? ? 0.0 : (100.0 * @hits / total).round(1) + avg_lag = @sync_lag_samples.zero? ? 0.0 : (@sync_lag_ms_total / @sync_lag_samples).round(2) + { + "hits" => @hits, + "misses" => @misses, + "hit_rate_pct" => hit_rate, + "prefetched" => @prefetched, + "sync_events_applied" => @sync_events_applied, + "sync_lag_ms_avg" => avg_lag, + } + end + end + + def reset_stats + @stats_lock.synchronize do + @hits = 0 + @misses = 0 + @prefetched = 0 + @sync_events_applied = 0 + @sync_lag_ms_total = 0.0 + @sync_lag_samples = 0 + end + end + + private + + def cache_key_for(entity_id) + "#{@prefix}#{entity_id}" + end + + def strip_prefix(key) + key.start_with?(@prefix) ? key[@prefix.length..-1] : key + end + + def stringify_record(record) + out = {} + record.each { |k, v| out[k.to_s] = v.to_s } + out + end + + def monotonic_ms + Process.clock_gettime(Process::CLOCK_MONOTONIC) * 1000.0 + end + + def wall_ms + Time.now.to_f * 1000.0 + end +end diff --git a/content/develop/use-cases/prefetch-cache/ruby/demo_server.rb b/content/develop/use-cases/prefetch-cache/ruby/demo_server.rb new file mode 100644 index 0000000000..8817d78fdd --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/ruby/demo_server.rb @@ -0,0 +1,669 @@ +#!/usr/bin/env ruby +# Redis prefetch-cache demo server. +# +# Run this file and visit http://localhost:8789 to watch a prefetch +# cache in action: the demo bulk-loads every primary record into Redis +# on startup, runs a background sync worker that applies primary +# mutations within milliseconds, and lets you add, update, delete, and +# re-prefetch records to see how the cache stays current without ever +# falling back to the primary on the read path. + +require "json" +require "optparse" +require "redis" +require "uri" +require "webrick" + +require_relative "cache" +require_relative "primary" +require_relative "sync_worker" + +HTML_TEMPLATE = <<~'HTML' + + + + + + Redis Prefetch Cache Demo + + + +
+    [HTML template body elided — the markup was lost in extraction. The
+    template renders a header ("Redis Prefetch Cache Demo" / "redis-rb +
+    WEBrick"), an intro paragraph (reads run HGETALL against Redis only;
+    a background sync worker applies primary changes within milliseconds;
+    a __CACHE_TTL__ s safety-net TTL is refreshed on every add or update
+    event), and panels for Cache state, Read a category, Update a field,
+    Add a category, Delete a category, Break the cache, Cache stats, and
+    Last result — each with a one-line explanation and the form controls
+    that drive the JSON endpoints mounted below.]
+ + + + +HTML + +class PrefetchCacheDemoServer + def initialize(host:, port:, cache:, primary:, sync:) + @host = host + @port = port + @cache = cache + @primary = primary + @sync = sync + @server = WEBrick::HTTPServer.new( + BindAddress: host, + Port: port, + Logger: WEBrick::Log.new($stderr, WEBrick::Log::WARN), + AccessLog: [], + ) + mount_routes + end + + def start + trap("INT") { @server.shutdown } + trap("TERM") { @server.shutdown } + @server.start + end + + def shutdown + @server.shutdown + end + + private + + def mount_routes + @server.mount_proc("/") { |req, res| handle_root(req, res) } + @server.mount_proc("/index.html") { |req, res| handle_root(req, res) } + @server.mount_proc("/categories") { |req, res| handle_categories(req, res) } + @server.mount_proc("/read") { |req, res| handle_read(req, res) } + @server.mount_proc("/stats") { |req, res| handle_stats(req, res) } + @server.mount_proc("/update") { |req, res| handle_update(req, res) } + @server.mount_proc("/add") { |req, res| handle_add(req, res) } + @server.mount_proc("/delete") { |req, res| handle_delete(req, res) } + @server.mount_proc("/invalidate") { |req, res| handle_invalidate(req, res) } + @server.mount_proc("/clear") { |req, res| handle_clear(req, res) } + @server.mount_proc("/reprefetch") { |req, res| handle_reprefetch(req, res) } + @server.mount_proc("/reset") { |req, res| handle_reset(req, res) } + end + + def handle_root(_req, res) + res.status = 200 + res["Content-Type"] = "text/html; charset=utf-8" + res.body = HTML_TEMPLATE.gsub("__CACHE_TTL__", @cache.ttl_seconds.to_s) + end + + def handle_categories(_req, res) + send_json(res, 200, { + "cache_ids" => @cache.ids, + "primary_ids" => @primary.list_ids, + }) + end + + def handle_read(req, res) + entity_id = req.query["id"].to_s + if entity_id.empty? + send_json(res, 400, { "error" => "Missing 'id'." }) + return + end + record, hit, redis_ms = @cache.get(entity_id) + send_json(res, 200, { + "id" => entity_id, + "record" => record, + "hit" => hit, + "redis_latency_ms" => redis_ms.round(2), + "ttl_remaining" => @cache.ttl_remaining(entity_id), + "stats" => build_stats, + }) + end + + def handle_stats(_req, res) + send_json(res, 200, build_stats) + end + + def handle_update(req, res) + params = parse_form(req) + entity_id = (params["id"] || "").to_s + field = (params["field"] || "").to_s + value = (params["value"] || "").to_s + if entity_id.empty? || field.empty? + send_json(res, 400, { "error" => "Missing 'id' or 'field'." }) + return + end + unless @primary.update_field(entity_id, field, value) + send_json(res, 404, { "error" => "Unknown category '#{entity_id}'." }) + return + end + send_json(res, 200, { + "id" => entity_id, + "field" => field, + "value" => value, + "stats" => build_stats, + }) + end + + def handle_add(req, res) + params = parse_form(req) + entity_id = (params["id"] || "").to_s.strip + name = (params["name"] || "").to_s.strip + if entity_id.empty? || name.empty? + send_json(res, 400, { "error" => "Missing 'id' or 'name'." }) + return + end + record = { + "id" => entity_id, + "name" => name, + "display_order" => (params["display_order"].to_s.empty? ? "99" : params["display_order"].to_s), + "featured" => (params["featured"].to_s.empty? ? "false" : params["featured"].to_s), + "parent_id" => params["parent_id"].to_s, + } + unless @primary.add_record(record) + send_json(res, 409, { "error" => "Category '#{entity_id}' already exists." 
}) + return + end + send_json(res, 200, { + "id" => entity_id, + "record" => record, + "stats" => build_stats, + }) + end + + def handle_delete(req, res) + params = parse_form(req) + entity_id = (params["id"] || "").to_s + if entity_id.empty? + send_json(res, 400, { "error" => "Missing 'id'." }) + return + end + unless @primary.delete_record(entity_id) + send_json(res, 404, { "error" => "Unknown category '#{entity_id}'." }) + return + end + send_json(res, 200, { "id" => entity_id, "stats" => build_stats }) + end + + def handle_invalidate(req, res) + params = parse_form(req) + entity_id = (params["id"] || "").to_s + if entity_id.empty? + send_json(res, 400, { "error" => "Missing 'id'." }) + return + end + deleted = @cache.invalidate(entity_id) + send_json(res, 200, { "id" => entity_id, "deleted" => deleted, "stats" => build_stats }) + end + + def handle_clear(_req, res) + # Pause the sync worker so it cannot recreate keys between SCAN and + # DEL. Queued events accumulate and apply after resume. + @sync.pause + begin + deleted = @cache.clear + ensure + @sync.resume + end + send_json(res, 200, { "deleted" => deleted, "stats" => build_stats }) + end + + def handle_reprefetch(_req, res) + # Pause the sync worker so it cannot interleave with the clear + + # snapshot + bulk_load sequence. Without this, a change applied + # between list_records and bulk_load would be overwritten by the + # stale snapshot. + @sync.pause + begin + started = monotonic_ms + @cache.clear + loaded = @cache.bulk_load(@primary.list_records) + elapsed_ms = monotonic_ms - started + ensure + @sync.resume + end + send_json(res, 200, { + "loaded" => loaded, + "elapsed_ms" => elapsed_ms.round(2), + "stats" => build_stats, + }) + end + + def handle_reset(_req, res) + @cache.reset_stats + @primary.reset_reads + send_json(res, 200, build_stats) + end + + def build_stats + stats = @cache.stats + stats["primary_reads_total"] = @primary.reads + stats["primary_read_latency_ms"] = @primary.read_latency_ms + stats + end + + def parse_form(req) + # WEBrick parses form-encoded bodies into req.query for application/x-www-form-urlencoded. + # Fall back to URI.decode_www_form on the body if the header is missing or different. + if req.query && !req.query.empty? 
+ req.query + else + URI.decode_www_form(req.body.to_s).to_h + end + end + + def send_json(res, status, payload) + res.status = status + res["Content-Type"] = "application/json" + res.body = JSON.generate(payload) + end + + def monotonic_ms + Process.clock_gettime(Process::CLOCK_MONOTONIC) * 1000.0 + end +end + +def parse_args + options = { + host: "127.0.0.1", + port: 8789, + redis_host: "localhost", + redis_port: 6379, + cache_prefix: "cache:category:", + ttl_seconds: 3600, + primary_latency_ms: 80, + } + OptionParser.new do |opts| + opts.banner = "Usage: ruby demo_server.rb [options]" + opts.on("--host HOST", "HTTP bind host") { |v| options[:host] = v } + opts.on("--port PORT", Integer, "HTTP bind port") { |v| options[:port] = v } + opts.on("--redis-host HOST", "Redis host") { |v| options[:redis_host] = v } + opts.on("--redis-port PORT", Integer, "Redis port") { |v| options[:redis_port] = v } + opts.on("--cache-prefix PREFIX", "Cache key prefix") { |v| options[:cache_prefix] = v } + opts.on("--ttl-seconds N", Integer, "Safety-net TTL in seconds (refreshed on every sync event)") { |v| options[:ttl_seconds] = v } + opts.on("--primary-latency-ms N", Integer, "Simulated primary read latency (only affects bulk loads and reconciliations)") { |v| options[:primary_latency_ms] = v } + end.parse! + options +end + +def main + args = parse_args + + redis_client = Redis.new(host: args[:redis_host], port: args[:redis_port]) + cache = PrefetchCache.new( + redis_client: redis_client, + prefix: args[:cache_prefix], + ttl_seconds: args[:ttl_seconds], + ) + primary = MockPrimaryStore.new(read_latency_ms: args[:primary_latency_ms]) + sync = SyncWorker.new(primary: primary, cache: cache) + + started = Process.clock_gettime(Process::CLOCK_MONOTONIC) * 1000.0 + cache.clear + loaded = cache.bulk_load(primary.list_records) + elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) * 1000.0) - started + sync.start + + puts "Redis prefetch-cache demo server listening on http://#{args[:host]}:#{args[:port]}" + puts "Using Redis at #{args[:redis_host]}:#{args[:redis_port]} with cache prefix '#{args[:cache_prefix]}' and TTL #{args[:ttl_seconds]}s" + puts "Prefetched #{loaded} records in #{format('%.1f', elapsed_ms)} ms; sync worker running" + + server = PrefetchCacheDemoServer.new( + host: args[:host], + port: args[:port], + cache: cache, + primary: primary, + sync: sync, + ) + + begin + server.start + ensure + sync.stop + end +end + +main if __FILE__ == $PROGRAM_NAME diff --git a/content/develop/use-cases/prefetch-cache/ruby/primary.rb b/content/develop/use-cases/prefetch-cache/ruby/primary.rb new file mode 100644 index 0000000000..09e82b0f5d --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/ruby/primary.rb @@ -0,0 +1,172 @@ +# Mock primary data store for the prefetch-cache demo. +# +# This stands in for a source-of-truth database (Postgres, MySQL, Mongo, +# etc.) that holds reference data the application serves to users. +# +# Every mutation appends a change event to an in-process queue, which +# the sync worker drains and applies to Redis. In a real system the +# queue is replaced by a CDC pipeline -- Redis Data Integration, +# Debezium plus a lightweight consumer, or an equivalent tool that +# tails the source's binlog/WAL and pushes changes into Redis. +# +# The store also exposes +read_latency_ms+ so the demo can illustrate +# how much slower a direct primary read would be than a Redis hit. 
+ +require "thread" + +CHANGE_OP_UPSERT = "upsert" +CHANGE_OP_DELETE = "delete" + +class MockPrimaryStore + # In-memory stand-in for a primary database of reference data. + + attr_reader :read_latency_ms + + def initialize(read_latency_ms: 80) + @read_latency_ms = read_latency_ms + @lock = Mutex.new + @reads = 0 + @changes = Queue.new + @records = { + "cat-001" => { + "id" => "cat-001", + "name" => "Beverages", + "display_order" => "1", + "featured" => "true", + "parent_id" => "", + }, + "cat-002" => { + "id" => "cat-002", + "name" => "Bakery", + "display_order" => "2", + "featured" => "true", + "parent_id" => "", + }, + "cat-003" => { + "id" => "cat-003", + "name" => "Pantry Staples", + "display_order" => "3", + "featured" => "false", + "parent_id" => "", + }, + "cat-004" => { + "id" => "cat-004", + "name" => "Frozen", + "display_order" => "4", + "featured" => "false", + "parent_id" => "", + }, + "cat-005" => { + "id" => "cat-005", + "name" => "Specialty Cheeses", + "display_order" => "5", + "featured" => "false", + "parent_id" => "cat-002", + }, + } + end + + def list_ids + @lock.synchronize { @records.keys.sort } + end + + # Return every record. Used by the cache's bulk-load path on startup. + def list_records + sleep(@read_latency_ms / 1000.0) + @lock.synchronize do + @reads += 1 + @records.values.map { |r| r.dup } + end + end + + # Single-record read. Not on the demo's normal read path. + def read(entity_id) + sleep(@read_latency_ms / 1000.0) + @lock.synchronize do + @reads += 1 + record = @records[entity_id] + record ? record.dup : nil + end + end + + def add_record(record) + entity_id = (record["id"] || "").to_s.strip + return false if entity_id.empty? + @lock.synchronize do + return false if @records.key?(entity_id) + @records[entity_id] = record.dup + # Emit while the lock is held so the queue order matches the + # mutation order. Two concurrent callers cannot interleave + # mutation A, mutation B, emit B, emit A. + emit_change_locked(CHANGE_OP_UPSERT, entity_id, record.dup) + end + true + end + + def update_field(entity_id, field, value) + @lock.synchronize do + record = @records[entity_id] + return false if record.nil? + record[field] = value + emit_change_locked(CHANGE_OP_UPSERT, entity_id, record.dup) + end + true + end + + def delete_record(entity_id) + @lock.synchronize do + return false unless @records.key?(entity_id) + @records.delete(entity_id) + emit_change_locked(CHANGE_OP_DELETE, entity_id, nil) + end + true + end + + # Block up to +timeout_seconds+ for the next change event. + # + # Queue#pop does not accept a timeout directly. We use a short polling + # loop on +pop(true)+ (non-blocking), sleeping for a fraction of the + # timeout between attempts, so the worker remains responsive to its + # stop flag. + def next_change(timeout_seconds) + deadline = monotonic + timeout_seconds + loop do + begin + return @changes.pop(true) + rescue ThreadError + remaining = deadline - monotonic + return nil if remaining <= 0 + sleep([remaining, 0.01].min) + end + end + end + + def reads + @lock.synchronize { @reads } + end + + def reset_reads + @lock.synchronize { @reads = 0 } + end + + private + + # Append a change event to the feed. Caller must hold +@lock+. + # + # Queue#push is itself thread-safe and does not try to acquire + # +@lock+, so calling it while holding the records lock cannot + # deadlock. Holding the lock here is what guarantees that the queue + # order matches the order in which the records hash was mutated. 
+ def emit_change_locked(op, entity_id, fields) + @changes.push( + op: op, + id: entity_id, + fields: fields, + timestamp_ms: Time.now.to_f * 1000.0, + ) + end + + def monotonic + Process.clock_gettime(Process::CLOCK_MONOTONIC) + end +end diff --git a/content/develop/use-cases/prefetch-cache/ruby/sync_worker.rb b/content/develop/use-cases/prefetch-cache/ruby/sync_worker.rb new file mode 100644 index 0000000000..e608230c97 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/ruby/sync_worker.rb @@ -0,0 +1,131 @@ +# Background sync worker for the prefetch-cache demo. +# +# A daemon thread drains the primary's change queue and applies each +# event to Redis through +PrefetchCache#apply_change+. In a real system +# the queue is replaced by a CDC pipeline (Redis Data Integration, +# Debezium, or an equivalent) that tails the primary's binlog/WAL and +# writes the same shape of events. +# +# The worker exposes +pause+ and +resume+ so maintenance paths +# (+/reprefetch+, +clear+) can stop event application without tearing +# the thread down. +pause+ blocks until the worker is parked, so the +# caller knows no apply is in flight by the time it returns. + +require "thread" + +class SyncWorker + # Drain primary change events into Redis on a daemon thread. + + def initialize(primary:, cache:, poll_timeout_s: 0.05) + @primary = primary + @cache = cache + @poll_timeout_s = poll_timeout_s + + @state_mutex = Mutex.new + @state_cv = ConditionVariable.new + @stop = false + @pause = false + @paused_idle = false + @thread = nil + end + + def start + @state_mutex.synchronize do + return if @thread && @thread.alive? + @stop = false + @pause = false + @paused_idle = false + end + @thread = Thread.new { run_loop } + @thread.name = "prefetch-cache-sync" if @thread.respond_to?(:name=) + @thread + end + + # Signal the worker to exit and join its thread. + # + # If the join times out the worker is wedged inside +apply_change+; + # we leave +@thread+ populated so a subsequent +start+ does not spawn + # a second worker on top of the orphan. + def stop(join_timeout_s: 2.0) + @state_mutex.synchronize do + @stop = true + @state_cv.broadcast + end + thread = @thread + return if thread.nil? + thread.join(join_timeout_s) + @thread = nil unless thread.alive? + end + + # Stop applying events and block until the worker is parked. + # + # Returns +true+ once the worker has confirmed it is idle, or + # +false+ if the timeout elapsed first. While paused, change events + # accumulate in the primary's queue and are applied in order after + # +resume+. + def pause(timeout_s: 2.0) + deadline = monotonic + timeout_s + @state_mutex.synchronize do + @pause = true + @paused_idle = false + @state_cv.broadcast + return true if @thread.nil? || !@thread.alive? + until @paused_idle + remaining = deadline - monotonic + return false if remaining <= 0 + @state_cv.wait(@state_mutex, remaining) + end + true + end + end + + def resume + @state_mutex.synchronize do + @pause = false + @paused_idle = false + @state_cv.broadcast + end + end + + private + + def run_loop + loop do + should_stop, should_pause = @state_mutex.synchronize { [@stop, @pause] } + break if should_stop + + if should_pause + @state_mutex.synchronize do + # Park until the pause is lifted or the worker is stopped. + # Re-set @paused_idle on every iteration so a *new* pause + # that arrives while we are still parked from the previous + # cycle gets acknowledged immediately, not after the caller's + # full pause-timeout. 
+ while @pause && !@stop + @paused_idle = true + @state_cv.broadcast + @state_cv.wait(@state_mutex, @poll_timeout_s) + end + @paused_idle = false + end + next + end + + change = @primary.next_change(@poll_timeout_s) + next if change.nil? + begin + @cache.apply_change(change) + rescue => exc + # Demo behaviour: log and drop the event. A production CDC + # consumer would retry with bounded backoff and expose a + # dead-letter / error counter; see the guide's "Production + # usage" section. + warn "[sync] failed to apply #{change.inspect}: #{exc}" + end + end + end + + def monotonic + Process.clock_gettime(Process::CLOCK_MONOTONIC) + end +end diff --git a/content/develop/use-cases/prefetch-cache/rust/Cargo.toml b/content/develop/use-cases/prefetch-cache/rust/Cargo.toml new file mode 100644 index 0000000000..0725eb6f4e --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/rust/Cargo.toml @@ -0,0 +1,15 @@ +[package] +name = "prefetch-cache-demo" +version = "0.1.0" +edition = "2021" + +[dependencies] +redis = { version = "0.24", features = ["tokio-comp", "aio", "connection-manager"] } +tokio = { version = "1", features = ["full"] } +axum = "0.7" +serde = { version = "1.0", features = ["derive"] } +serde_json = "1.0" + +[[bin]] +name = "demo_server" +path = "demo_server.rs" diff --git a/content/develop/use-cases/prefetch-cache/rust/_index.md b/content/develop/use-cases/prefetch-cache/rust/_index.md new file mode 100644 index 0000000000..98d772c0f9 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/rust/_index.md @@ -0,0 +1,408 @@ +--- +categories: +- docs +- develop +- stack +- oss +- rs +- rc +description: Implement a Redis prefetch cache in Rust with redis-rs +linkTitle: redis-rs example (Rust) +title: Redis prefetch cache with redis-rs +weight: 9 +--- + +This guide shows you how to implement a Redis prefetch cache in Rust with the [`redis`](https://crates.io/crates/redis) crate (redis-rs). It includes a small local web server built with [`axum`](https://github.com/tokio-rs/axum) and [`tokio`](https://tokio.rs/) so you can watch the cache pre-load at startup, see a background sync worker apply primary mutations within milliseconds, and break the cache to confirm that reads never fall back to the primary. + +## Overview + +Prefetch caching pre-loads a working set of reference data into Redis before the first request arrives, so every read on the request path is a cache hit. A separate sync worker keeps the cache current as the source of truth changes — there is no fall-back to the primary on the read path. + +That gives you: + +* Near-100% cache hit ratios for reference and master data +* Sub-millisecond reads for lookup-heavy paths at peak traffic +* All reference-data reads offloaded from the primary database +* Source-database changes propagated into Redis within a few milliseconds +* A long safety-net TTL that bounds memory if the sync pipeline ever stops + +In this example, each cached category is stored as a Redis hash under a key like `cache:category:{id}`. The hash holds the category fields (`id`, `name`, `display_order`, `featured`, `parent_id`) and the key has a long safety-net TTL that the sync worker refreshes on every add or update event. Delete events remove the cache key outright, so there is no TTL to refresh in that case. + +## How it works + +The flow has three independent paths: + +1. **On startup**, the demo server calls `cache.bulk_load(primary.list_records().await).await`, which pipelines `DEL` + `HSET` + `EXPIRE` for every record in one round trip. +2. 
**On every read**, the application calls `cache.get(entity_id).await`, which runs `HGETALL` against Redis only. A miss is treated as an error, not a trigger to query the primary. +3. **On every primary mutation**, the primary appends a change event to an in-process `tokio::sync::mpsc` channel. The sync worker task drains the channel and calls `cache.apply_change(event).await`. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL; for a `delete`, it removes the cache key. + +In a real system the in-process change channel is replaced by a CDC pipeline — [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}), Debezium plus a lightweight consumer, or an equivalent tool that tails the source's binlog/WAL and pushes events into Redis. + +## The prefetch-cache helper + +The `PrefetchCache` struct wraps the cache operations +([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/rust/cache.rs)): + +```rust +use redis::aio::ConnectionManager; +use redis::Client; + +let client = Client::open("redis://localhost:6379/")?; +let conn = ConnectionManager::new(client).await?; + +let cache = PrefetchCache::new(conn, CacheConfig::default()); +let primary = MockPrimaryStore::new(80); + +// Pre-load every primary record into Redis in one pipelined round trip. +let records = primary.list_records().await; +cache.bulk_load(records).await?; + +// Start the sync worker so primary mutations propagate into Redis. +let sync = Arc::new(SyncWorker::new(primary.clone(), cache.clone())); +sync.start().await; + +// Read paths now go to Redis only. +let result = cache.get("cat-001").await?; +``` + +The helper uses [`ConnectionManager`](https://docs.rs/redis/latest/redis/aio/struct.ConnectionManager.html), which is cheap to clone, auto-reconnects on failure, and is the recommended way to share a redis-rs connection across tokio tasks. + +### Data model + +Each cached category is stored in a Redis hash: + +```text +cache:category:cat-001 + id = cat-001 + name = Beverages + display_order = 1 + featured = true + parent_id = +``` + +The implementation uses: + +* [`HSET`]({{< relref "/commands/hset" >}}) + [`EXPIRE`]({{< relref "/commands/expire" >}}), pipelined, for the bulk load and every sync event +* [`HGETALL`]({{< relref "/commands/hgetall" >}}) on the read path +* [`DEL`]({{< relref "/commands/del" >}}) for sync-delete events and explicit invalidation +* [`SCAN`]({{< relref "/commands/scan" >}}) to enumerate the cached keyspace and to clear the prefix +* [`TTL`]({{< relref "/commands/ttl" >}}) to surface remaining safety-net time in the demo UI + +## Bulk load on startup + +The `bulk_load` method pipelines a `DEL` + `HSET` + `EXPIRE` triple for every record. 
The pipeline is sent in a single round trip, so loading thousands of records takes one network RTT plus the time Redis spends executing the commands locally — typically tens of milliseconds even for a large reference table:
+
+```rust
+pub async fn bulk_load(
+    &self,
+    records: Vec<HashMap<String, String>>,
+) -> RedisResult<u64> {
+    let mut pipe = redis::pipe();
+    let mut loaded: u64 = 0;
+    for record in &records {
+        let entity_id = match record.get("id") {
+            Some(v) if !v.is_empty() => v.clone(),
+            _ => continue,
+        };
+        let cache_key = self.cache_key(&entity_id);
+        let pairs: Vec<(String, String)> =
+            record.iter().map(|(k, v)| (k.clone(), v.clone())).collect();
+        pipe.del(&cache_key).ignore();
+        pipe.hset_multiple(&cache_key, &pairs).ignore();
+        pipe.expire(&cache_key, self.cfg.ttl_seconds).ignore();
+        loaded += 1;
+    }
+    if loaded > 0 {
+        let mut conn = self.conn.clone();
+        pipe.query_async::<_, ()>(&mut conn).await?;
+    }
+    self.stats.prefetched.fetch_add(loaded, Ordering::Relaxed);
+    Ok(loaded)
+}
+```
+
+A plain `redis::pipe()` (without `.atomic()`) is intentional on the **startup** path: nothing is reading the cache yet, the records do not need to be applied atomically as a set, and skipping `MULTI`/`EXEC` keeps the bulk load fast. The same method is used for the live `/reprefetch` reload, which is safe because the demo pauses the sync worker around the clear-and-reload sequence — see [Re-prefetch under load](#re-prefetch-under-load) below. If you call `bulk_load` directly from your own code on a cache that is already serving reads, either pause your writers first or switch to `redis::pipe().atomic()` so callers cannot observe a half-loaded record.
+
+## Reads from Redis only
+
+The `get` method runs `HGETALL` and returns the cached hash. **It does not fall back to the primary on a miss.** In a healthy system, a miss never happens; if it does, the application surfaces it as an error and treats it as a sync-pipeline incident:
+
+```rust
+pub async fn get(&self, entity_id: &str) -> RedisResult<GetResult> {
+    let cache_key = self.cache_key(entity_id);
+    let mut conn = self.conn.clone();
+
+    let started = Instant::now();
+    let cached: HashMap<String, String> = conn.hgetall(&cache_key).await?;
+    let redis_latency_ms = started.elapsed().as_secs_f64() * 1000.0;
+
+    if !cached.is_empty() {
+        self.stats.hits.fetch_add(1, Ordering::Relaxed);
+        Ok(GetResult { record: Some(cached), hit: true, redis_latency_ms })
+    } else {
+        self.stats.misses.fetch_add(1, Ordering::Relaxed);
+        Ok(GetResult { record: None, hit: false, redis_latency_ms })
+    }
+}
+```
+
+This is the key behavioural difference from [cache-aside]({{< relref "/develop/use-cases/cache-aside" >}}): the request path never touches the primary, so reference-data reads cannot contribute to primary database load.
+
+## Applying sync events
+
+The sync worker calls `apply_change` for every primary mutation. For an `upsert`, the helper rewrites the cache hash and refreshes the safety-net TTL in one `MULTI`/`EXEC` pipeline so the cache never holds a stale mix of old and new fields.
For a `delete`, it removes the cache key:
+
+```rust
+pub async fn apply_change(&self, change: &ChangeEvent) -> RedisResult<()> {
+    let cache_key = self.cache_key(&change.id);
+    let mut conn = self.conn.clone();
+
+    match change.op.as_str() {
+        "upsert" => {
+            let fields = match &change.fields {
+                Some(f) if !f.is_empty() => f,
+                _ => return Ok(()),
+            };
+            let pairs: Vec<(String, String)> =
+                fields.iter().map(|(k, v)| (k.clone(), v.clone())).collect();
+            redis::pipe()
+                .atomic()
+                .del(&cache_key).ignore()
+                .hset_multiple(&cache_key, &pairs).ignore()
+                .expire(&cache_key, self.cfg.ttl_seconds).ignore()
+                .query_async::<_, ()>(&mut conn).await?;
+        }
+        "delete" => {
+            let _: i64 = conn.del(&cache_key).await?;
+        }
+        _ => {}
+    }
+    Ok(())
+}
+```
+
+The `DEL` before the `HSET` ensures the cached hash contains exactly the fields the primary record has now — fields that have been dropped from the primary will not linger in Redis. The early return on empty `fields` matters: redis-rs panics if you call `hset_multiple` with an empty slice, so a malformed upsert with no fields is dropped (in a real CDC consumer you would route it to a dead-letter queue).
+
+## The sync worker
+
+The `SyncWorker` runs a long-lived `tokio::task` that polls the primary's change channel with a short timeout. Every change is applied to Redis as soon as it arrives
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/rust/sync_worker.rs)):
+
+```rust
+async fn run_loop(
+    primary: Arc<MockPrimaryStore>,
+    cache: PrefetchCache,
+    poll_timeout: Duration,
+    state: WorkerState,
+) {
+    loop {
+        if *state.stop_rx.borrow() { return; }
+        if *state.pause_rx.borrow() {
+            state.idle_notify.notify_waiters();
+            // park until the pause is lifted or the worker is stopped
+            // ...
+            continue;
+        }
+        match primary.next_change(poll_timeout).await {
+            Some(change) => {
+                if let Err(err) = cache.apply_change(&change).await {
+                    eprintln!("[sync] failed to apply {:?} id={}: {}", change.op, change.id, err);
+                }
+            }
+            None => continue,
+        }
+    }
+}
+```
+
+In production this loop is replaced by a CDC consumer reading from RDI's Redis output stream, Debezium's Kafka topic, or an equivalent change feed. The shape stays the same: drain events, apply them to Redis, advance the consumer offset.
+
+## Invalidation and re-prefetch
+
+Two helpers exist for testing and recovery:
+
+* `invalidate(entity_id)` deletes a single cache key. The demo uses it to simulate a sync-pipeline failure on one record.
+* `clear()` runs `SCAN MATCH cache:category:*` and deletes every key under the prefix. The demo uses it to simulate a full cache loss.
+
+In both cases, the recovery path is to call `bulk_load(primary.list_records().await).await` again — re-prefetching from the primary. The demo exposes this as the "Re-prefetch" button so you can see the cache come back to a fully-warm state in one operation.
+
+### Re-prefetch under load
+
+`clear()` and `bulk_load()` are not atomic against the sync worker. If a change event arrives between the snapshot (`primary.list_records()`) and the bulk write, the bulk write can overwrite a newer value; if a change event arrives between `clear()`'s `SCAN` and `DEL`, the cleared entry can immediately be recreated.
The demo's `/clear` and `/reprefetch` handlers solve this by pausing the sync worker around the operation: + +```rust +let _ = state.sync.pause(Duration::from_secs(2)).await; +let _ = state.cache.clear().await.unwrap_or(0); +let records = state.primary.list_records().await; +let loaded = state.cache.bulk_load(records).await.unwrap_or(0); +state.sync.resume().await; +``` + +`pause()` flips a `tokio::sync::watch` flag and blocks on a `tokio::sync::Notify` until the worker confirms it has parked. Change events that arrive during the pause sit in the primary's channel and apply in order once `resume()` is called, so no event is lost. + +## Hit/miss accounting + +The helper keeps thread-safe atomic counters for hits, misses, prefetched records, sync events applied, and the average lag between a primary change and its application to Redis. The demo UI surfaces these so you can confirm the cache is absorbing all reads and the sync worker is keeping up: + +```rust +pub fn stats(&self) -> serde_json::Value { + let hits = self.stats.hits.load(Ordering::Relaxed); + let misses = self.stats.misses.load(Ordering::Relaxed); + // ... + serde_json::json!({ + "hits": hits, + "misses": misses, + "hit_rate_pct": hit_rate_pct, + "prefetched": prefetched, + "sync_events_applied": sync_events_applied, + "sync_lag_ms_avg": sync_lag_ms_avg, + }) +} +``` + +In production you would emit these as Prometheus counters and gauges. The sync-lag metric is the most important: a sudden rise indicates the CDC pipeline is falling behind. + +## Prerequisites + +Before running the demo, make sure that: + +* Redis is running and accessible. By default, the demo connects to `localhost:6379`. +* You have a working [Rust toolchain](https://www.rust-lang.org/tools/install) (`cargo` 1.70+). +* The demo's `Cargo.toml` pins the `redis` crate at 0.24+ with the `tokio-comp`, `aio`, and `connection-manager` features; `cargo build` will pull the rest of the dependency graph. + +If your Redis server is running elsewhere, start the demo with `--redis-host` and `--redis-port`. + +## Running the demo + +### Get the source files + +The demo consists of five files. Download them from the [`rust` source folder](https://github.com/redis/docs/tree/main/content/develop/use-cases/prefetch-cache/rust) on GitHub, or grab them with `curl`: + +```bash +mkdir prefetch-cache-demo && cd prefetch-cache-demo +BASE=https://raw.githubusercontent.com/redis/docs/main/content/develop/use-cases/prefetch-cache/rust +curl -O $BASE/Cargo.toml +curl -O $BASE/cache.rs +curl -O $BASE/primary.rs +curl -O $BASE/sync_worker.rs +curl -O $BASE/demo_server.rs +``` + +### Start the demo server + +From that directory: + +```bash +cargo run --bin demo_server +``` + +You should see something like: + +```text +Redis prefetch-cache demo server listening on http://127.0.0.1:8790 +Using Redis at localhost:6379 with cache prefix 'cache:category:' and TTL 3600s +Prefetched 5 records in 90.1 ms; sync worker running +``` + +After starting the server, visit `http://localhost:8790`. 
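+
+The JSON endpoints the page polls can also be exercised directly. For example, `GET /stats` returns the counters shown in the UI, shaped roughly like this (the values here are illustrative):
+
+```text
+{
+  "hits": 12,
+  "misses": 0,
+  "hit_rate_pct": 100.0,
+  "prefetched": 5,
+  "sync_events_applied": 3,
+  "sync_lag_ms_avg": 1.42,
+  "primary_reads_total": 1,
+  "primary_read_latency_ms": 80
+}
+```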
+
+The demo accepts these flags:
+
+* `--host` / `--port` — HTTP bind host and port (defaults to `127.0.0.1:8790`)
+* `--redis-host` / `--redis-port` — Redis location
+* `--cache-prefix` — cache key prefix (default `cache:category:`)
+* `--ttl-seconds` — safety-net TTL in seconds (default `3600`)
+* `--primary-latency-ms` — simulated primary read latency (default `80`)
+
+It exposes a small interactive page where you can:
+
+* See which IDs are in the cache and in the primary, side by side
+* Read a category through the cache and confirm every read is a hit
+* Update a field on the primary and watch the sync worker rewrite the cache hash
+* Add and delete categories and watch them appear and disappear from the cache
+* Invalidate one key or clear the entire cache to simulate a sync-pipeline failure
+* Re-prefetch from the primary to recover from a broken cache state
+* Watch the average sync lag, and confirm primary reads stay at one until you re-prefetch — each `/reprefetch` adds another primary read for the snapshot, but normal request traffic never reaches the primary at all
+
+## The mock primary store
+
+To make the demo self-contained, the example includes a `MockPrimaryStore` that stands in for a source-of-truth database
+([source](https://github.com/redis/docs/blob/main/content/develop/use-cases/prefetch-cache/rust/primary.rs)):
+
+```rust
+pub struct MockPrimaryStore {
+    pub read_latency_ms: u64,
+    reads: AtomicU64,
+    inner: Mutex<Inner>, // records + tx
+    rx: Mutex<mpsc::UnboundedReceiver<ChangeEvent>>,
+}
+
+pub async fn update_field(&self, entity_id: &str, field: &str, value: &str) -> bool {
+    let mut inner = self.inner.lock().await;
+    // ... mutate the record ...
+    emit_locked(&inner.tx, CHANGE_OP_UPSERT, entity_id, Some(snapshot));
+    true
+}
+```
+
+Every mutation appends a change event to an in-process [`tokio::sync::mpsc`](https://docs.rs/tokio/latest/tokio/sync/mpsc/index.html) channel. The emit happens **while the mutation lock is held**, so two concurrent updates cannot interleave their channel sends — the queue order always matches the order in which the records map was mutated. The sync worker drains the channel with a 50 ms timeout via `tokio::time::timeout` and applies each event to Redis. In a real system this channel is replaced by a CDC pipeline — RDI on Redis Enterprise or Debezium with a Redis consumer on open-source Redis.
+
+## Production usage
+
+This guide uses a deliberately small local demo so you can focus on the prefetch-cache pattern. In production, you will usually want to harden several aspects of it.
+
+### Replace the in-process change channel with a real CDC pipeline
+
+The demo's in-process channel is the simplest possible stand-in for a CDC change feed. In production, the change feed lives outside the application process: an RDI pipeline configured against your primary database, Debezium connectors writing to Kafka or a Redis stream, or your application explicitly publishing change events from the write path. Whatever you choose, the consumer side stays the same — read events, apply them to Redis, advance the offset.
+
+### Use a long safety-net TTL, not a freshness TTL
+
+The TTL on each cache key is a **safety net**: it bounds memory if the sync pipeline silently stops, so a stuck consumer cannot leave stale data in Redis indefinitely. The TTL is not the freshness mechanism — freshness comes from the sync worker, which refreshes the TTL on every add or update event (delete events remove the key).
Pick a TTL that is comfortably longer than your worst-case sync lag plus your alerting window, so a transient sync hiccup never expires hot keys.
+
+### Decide what to do on a cache miss
+
+A prefetch cache treats a miss as an error or a missing record. The two reasonable strategies are:
+
+* **Return a 404 to the user.** Appropriate when the cache is authoritative for the lookup — for example, when the user is asking for a category by ID and the ID is not in the cache.
+* **Page on-call.** A sustained miss rate on IDs you know exist is an incident: either the prefetch did not run, or the sync pipeline is broken.
+
+Whichever you choose, do not fall back to the primary on the read path — that is what cache-aside is for, and conflating the two patterns breaks the load-isolation guarantee that prefetch provides.
+
+### Bound the working set to what fits in memory
+
+Prefetch only works if the entire dataset fits in Redis memory with headroom. Estimate the size of your reference data, multiply by a growth factor, and confirm the result fits within your Redis instance's `maxmemory` minus what other use cases need. If the working set grows beyond what Redis can hold, switch the dataset to a cache-aside pattern instead — the request path will pay miss latency, but you will not OOM.
+
+### Reconcile periodically against the primary
+
+CDC pipelines are eventually consistent: an event can be lost (broker outage, consumer crash, configuration drift) and the cache can silently diverge from the source. Run a periodic reconciliation job that re-reads all primary records, compares them against the cache, and either re-prefetches or fixes individual entries (a sketch follows at the end of this section). Even running it once a day catches drift that ad-hoc inspection would miss.
+
+### Namespace cache keys in shared Redis deployments
+
+If multiple applications share a Redis deployment, prefix cache keys with the application name (`cache:billing:category:{id}`) so different services cannot clobber each other's entries. `CacheConfig::prefix` exists exactly for this; the demo also accepts `--cache-prefix` on the command line.
+
+### Use `ConnectionManager`, not a bare connection
+
+`redis::aio::ConnectionManager` is the recommended async connection type for redis-rs. It is cheap to clone (the demo passes a clone to every async task), reconnects automatically on transient failures, and pipelines commands behind the scenes. A bare `redis::aio::Connection` is not `Clone` and cannot be safely shared across tokio tasks. If you need a true connection pool — for example, to fan out heavy `MULTI`/`EXEC` traffic across multiple sockets — use [`deadpool-redis`](https://crates.io/crates/deadpool-redis) or [`bb8-redis`](https://crates.io/crates/bb8-redis).
+
+### Inspect cached entries directly in Redis
+
+When testing or troubleshooting, inspect the stored cache keys directly to confirm the bulk load and the sync worker are writing what you expect:
+
+```bash
+redis-cli --scan --pattern 'cache:category:*'
+redis-cli HGETALL cache:category:cat-001
+redis-cli TTL cache:category:cat-001
+```
+
+If a key is missing for an ID that still exists in the primary, the prefetch did not run, the key expired without a sync refresh, or someone invalidated it. If a key is still present for an ID that was deleted in the primary, the delete event has not yet been applied. If the TTL is much lower than the configured safety-net value on a hot key, the sync worker is not keeping up.
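+
+### Sketch: a reconciliation pass
+
+The reconciliation job described above can be sketched with the demo's own types. This is a minimal illustration, not part of the demo source: the `reconcile` function and its repair policy are assumptions, and a production job would add batching, metrics, and error handling:
+
+```rust
+use std::collections::HashSet;
+use std::time::{SystemTime, UNIX_EPOCH};
+
+use crate::cache::PrefetchCache;
+use crate::primary::{ChangeEvent, MockPrimaryStore, CHANGE_OP_UPSERT};
+
+/// Hypothetical reconciliation pass: re-read the primary, repair any
+/// cache entry that is missing or divergent, and drop cache keys whose
+/// records no longer exist. Returns the number of repairs made.
+async fn reconcile(
+    primary: &MockPrimaryStore,
+    cache: &PrefetchCache,
+) -> redis::RedisResult<u64> {
+    let now_ms = SystemTime::now()
+        .duration_since(UNIX_EPOCH)
+        .map(|d| d.as_secs_f64() * 1000.0)
+        .unwrap_or(0.0);
+    let mut repaired: u64 = 0;
+    let mut primary_ids: HashSet<String> = HashSet::new();
+
+    for record in primary.list_records().await {
+        let Some(id) = record.get("id").cloned() else { continue };
+        primary_ids.insert(id.clone());
+        // Compare the cached hash against the source record.
+        let cached = cache.get(&id).await?;
+        if cached.record.as_ref() != Some(&record) {
+            // Reuse the sync path so the repair also refreshes the TTL.
+            cache.apply_change(&ChangeEvent {
+                op: CHANGE_OP_UPSERT.to_string(),
+                id,
+                fields: Some(record),
+                timestamp_ms: now_ms,
+            }).await?;
+            repaired += 1;
+        }
+    }
+
+    // Remove cache keys for records deleted from the primary.
+    for id in cache.ids().await? {
+        if !primary_ids.contains(&id) {
+            cache.invalidate(&id).await?;
+            repaired += 1;
+        }
+    }
+    Ok(repaired)
+}
+```
+
+Routing repairs through `apply_change` keeps the hit/miss and sync-lag counters honest and reuses the same empty-fields guard the live path has. Pausing the sync worker around the pass, as the `/reprefetch` handler does, avoids racing a concurrent change event.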
+
+## Learn more
+
+* [redis-rs crate](https://crates.io/crates/redis) - Install and use the Rust Redis client
+* [`HSET`]({{< relref "/commands/hset" >}}) - Write hash fields
+* [`HGETALL`]({{< relref "/commands/hgetall" >}}) - Read every field of a hash
+* [`EXPIRE`]({{< relref "/commands/expire" >}}) - Set key expiration in seconds
+* [`DEL`]({{< relref "/commands/del" >}}) - Delete a key on invalidation or sync-delete
+* [`SCAN`]({{< relref "/commands/scan" >}}) - Iterate the cached keyspace without blocking the server
+* [`TTL`]({{< relref "/commands/ttl" >}}) - Inspect remaining safety-net time on a key
+* [Redis Data Integration]({{< relref "/integrate/redis-data-integration" >}}) - Configuration-driven CDC into Redis on Redis Enterprise and Redis Cloud
diff --git a/content/develop/use-cases/prefetch-cache/rust/cache.rs b/content/develop/use-cases/prefetch-cache/rust/cache.rs
new file mode 100644
index 0000000000..f6f5a4c72e
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/rust/cache.rs
@@ -0,0 +1,318 @@
+//! Redis prefetch-cache helper.
+//!
+//! Each cached entity is stored as a Redis hash under `{prefix}{id}`
+//! (for example `cache:category:cat-001`) with a long safety-net TTL
+//! that bounds memory if the sync pipeline ever stops, but is not the
+//! freshness mechanism. Freshness comes from the `apply_change` path,
+//! which the sync worker calls every time a primary mutation arrives.
+//!
+//! Reads run `HGETALL` against Redis only. A miss is not a fall-back
+//! trigger — the application treats it as an error or a deliberate
+//! `invalidate` for testing. In production a sustained miss rate means
+//! the prefetch or the sync pipeline is broken, not that the primary
+//! should be re-queried on the request path.
+
+use std::collections::HashMap;
+use std::sync::atomic::{AtomicU64, Ordering};
+use std::sync::Arc;
+use std::time::Instant;
+
+use redis::{aio::ConnectionManager, AsyncCommands, AsyncIter, RedisResult};
+
+use crate::primary::ChangeEvent;
+
+#[derive(Debug, Clone)]
+pub struct CacheConfig {
+    pub prefix: String,
+    pub ttl_seconds: i64,
+}
+
+impl Default for CacheConfig {
+    fn default() -> Self {
+        Self {
+            prefix: "cache:category:".to_string(),
+            ttl_seconds: 3600,
+        }
+    }
+}
+
+#[derive(Debug, Clone)]
+pub struct GetResult {
+    pub record: Option<HashMap<String, String>>,
+    pub hit: bool,
+    pub redis_latency_ms: f64,
+}
+
+#[derive(Default)]
+struct CacheStats {
+    hits: AtomicU64,
+    misses: AtomicU64,
+    prefetched: AtomicU64,
+    sync_events_applied: AtomicU64,
+    // sync lag is tracked as integer microseconds so we can use atomics
+    // without an extra mutex. Converted to f64 ms at read time.
+    sync_lag_us_total: AtomicU64,
+    sync_lag_samples: AtomicU64,
+}
+
+#[derive(Clone)]
+pub struct PrefetchCache {
+    conn: ConnectionManager,
+    cfg: CacheConfig,
+    stats: Arc<CacheStats>,
+}
+
+impl PrefetchCache {
+    pub fn new(conn: ConnectionManager, cfg: CacheConfig) -> Self {
+        Self {
+            conn,
+            cfg,
+            stats: Arc::new(CacheStats::default()),
+        }
+    }
+
+    pub fn ttl_seconds(&self) -> i64 {
+        self.cfg.ttl_seconds
+    }
+
+    #[allow(dead_code)]
+    pub fn prefix(&self) -> &str {
+        &self.cfg.prefix
+    }
+
+    fn cache_key(&self, id: &str) -> String {
+        format!("{}{}", self.cfg.prefix, id)
+    }
+
+    fn strip_prefix<'a>(&self, key: &'a str) -> &'a str {
+        key.strip_prefix(self.cfg.prefix.as_str()).unwrap_or(key)
+    }
+
+    /// Pipeline `DEL` + `HSET` + `EXPIRE` for every record. Returns the
+    /// count loaded.
+    ///
+    /// The pipeline is non-transactional: it is fast on startup (when
+    /// nothing is reading the cache) and on the live `/reprefetch` path
+    /// (when the demo pauses the sync worker around the call). Calling
+    /// `bulk_load` on a cache that is actively being read and written
+    /// to can briefly expose a key that has been deleted but not yet
+    /// rewritten; pause the writers first or rewrite this with an
+    /// atomic pipeline if that matters.
+    pub async fn bulk_load(
+        &self,
+        records: Vec<HashMap<String, String>>,
+    ) -> RedisResult<u64> {
+        let mut pipe = redis::pipe();
+        let mut loaded: u64 = 0;
+        for record in &records {
+            let entity_id = match record.get("id") {
+                Some(v) if !v.is_empty() => v.clone(),
+                _ => continue,
+            };
+            let cache_key = self.cache_key(&entity_id);
+            let pairs: Vec<(String, String)> =
+                record.iter().map(|(k, v)| (k.clone(), v.clone())).collect();
+            pipe.del(&cache_key).ignore();
+            pipe.hset_multiple(&cache_key, &pairs).ignore();
+            pipe.expire(&cache_key, self.cfg.ttl_seconds).ignore();
+            loaded += 1;
+        }
+        if loaded > 0 {
+            let mut conn = self.conn.clone();
+            pipe.query_async::<_, ()>(&mut conn).await?;
+        }
+        self.stats.prefetched.fetch_add(loaded, Ordering::Relaxed);
+        Ok(loaded)
+    }
+
+    /// Return `(record, hit, redis_latency_ms)` for an `HGETALL` against Redis.
+    ///
+    /// Prefetch-cache reads do not fall back to the primary. A miss is a
+    /// signal that the cache is incomplete, not a trigger to re-query
+    /// the source. The caller decides how to surface it.
+    pub async fn get(&self, entity_id: &str) -> RedisResult<GetResult> {
+        let cache_key = self.cache_key(entity_id);
+        let mut conn = self.conn.clone();
+
+        let started = Instant::now();
+        let cached: HashMap<String, String> = conn.hgetall(&cache_key).await?;
+        let redis_latency_ms = started.elapsed().as_secs_f64() * 1000.0;
+
+        if !cached.is_empty() {
+            self.stats.hits.fetch_add(1, Ordering::Relaxed);
+            Ok(GetResult {
+                record: Some(cached),
+                hit: true,
+                redis_latency_ms,
+            })
+        } else {
+            self.stats.misses.fetch_add(1, Ordering::Relaxed);
+            Ok(GetResult {
+                record: None,
+                hit: false,
+                redis_latency_ms,
+            })
+        }
+    }
+
+    /// Apply a primary change event to Redis.
+    ///
+    /// For an `upsert`, rewrite the hash and refresh the safety-net TTL
+    /// in a `MULTI`/`EXEC` pipeline. For a `delete`, remove the cache
+    /// key. If `op == "upsert"` and `fields` is missing or empty,
+    /// return early — calling `HSET` with no pairs raises in most
+    /// clients, and there is nothing to write anyway.
+    pub async fn apply_change(&self, change: &ChangeEvent) -> RedisResult<()> {
+        if change.id.is_empty() {
+            return Ok(());
+        }
+        let cache_key = self.cache_key(&change.id);
+        let mut conn = self.conn.clone();
+
+        match change.op.as_str() {
+            "upsert" => {
+                let fields = match &change.fields {
+                    Some(f) if !f.is_empty() => f,
+                    _ => {
+                        // Malformed upsert with no fields. Skip rather
+                        // than crash the sync worker. A real CDC
+                        // consumer would route this to a dead-letter
+                        // queue and alert; the demo just drops it.
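+                        // (redis-rs panics if hset_multiple is given
+                        // an empty slice, so this guard also protects
+                        // the worker task from being taken down.)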
+                        return Ok(());
+                    }
+                };
+                let pairs: Vec<(String, String)> =
+                    fields.iter().map(|(k, v)| (k.clone(), v.clone())).collect();
+                redis::pipe()
+                    .atomic()
+                    .del(&cache_key)
+                    .ignore()
+                    .hset_multiple(&cache_key, &pairs)
+                    .ignore()
+                    .expire(&cache_key, self.cfg.ttl_seconds)
+                    .ignore()
+                    .query_async::<_, ()>(&mut conn)
+                    .await?;
+            }
+            "delete" => {
+                let _: i64 = conn.del(&cache_key).await?;
+            }
+            _ => return Ok(()),
+        }
+
+        self.stats
+            .sync_events_applied
+            .fetch_add(1, Ordering::Relaxed);
+
+        // Record sync lag: now_ms - change.timestamp_ms, clamped >= 0.
+        let now_ms = now_unix_ms();
+        let lag_ms = (now_ms - change.timestamp_ms).max(0.0);
+        let lag_us = (lag_ms * 1000.0) as u64;
+        self.stats
+            .sync_lag_us_total
+            .fetch_add(lag_us, Ordering::Relaxed);
+        self.stats.sync_lag_samples.fetch_add(1, Ordering::Relaxed);
+        Ok(())
+    }
+
+    /// Delete one cache key. Demo-only: simulates a broken sync pipeline.
+    pub async fn invalidate(&self, entity_id: &str) -> RedisResult<bool> {
+        let mut conn = self.conn.clone();
+        let n: i64 = conn.del(self.cache_key(entity_id)).await?;
+        Ok(n == 1)
+    }
+
+    /// Delete every key under this cache's prefix and return the count.
+    pub async fn clear(&self) -> RedisResult<u64> {
+        let mut conn = self.conn.clone();
+        let pattern = format!("{}*", self.cfg.prefix);
+        let mut keys: Vec<String> = Vec::new();
+        {
+            let mut iter: AsyncIter<'_, String> = conn.scan_match(&pattern).await?;
+            while let Some(key) = iter.next_item().await {
+                keys.push(key);
+            }
+        }
+        if keys.is_empty() {
+            return Ok(0);
+        }
+        let mut deleted: u64 = 0;
+        let mut conn = self.conn.clone();
+        for chunk in keys.chunks(500) {
+            let mut pipe = redis::pipe();
+            for key in chunk {
+                pipe.del(key);
+            }
+            let results: Vec<i64> = pipe.query_async(&mut conn).await?;
+            deleted += results.into_iter().filter(|n| *n > 0).count() as u64;
+        }
+        Ok(deleted)
+    }
+
+    /// Return every entity id currently in the cache, sorted, prefix stripped.
+    pub async fn ids(&self) -> RedisResult<Vec<String>> {
+        let mut conn = self.conn.clone();
+        let pattern = format!("{}*", self.cfg.prefix);
+        let mut out: Vec<String> = Vec::new();
+        let mut iter: AsyncIter<'_, String> = conn.scan_match(&pattern).await?;
+        while let Some(key) = iter.next_item().await {
+            out.push(self.strip_prefix(&key).to_string());
+        }
+        out.sort();
+        Ok(out)
+    }
+
+    pub async fn ttl_remaining(&self, entity_id: &str) -> RedisResult<i64> {
+        let mut conn = self.conn.clone();
+        Ok(conn.ttl(self.cache_key(entity_id)).await?)
+    }
+
+    pub fn stats(&self) -> serde_json::Value {
+        let hits = self.stats.hits.load(Ordering::Relaxed);
+        let misses = self.stats.misses.load(Ordering::Relaxed);
+        let prefetched = self.stats.prefetched.load(Ordering::Relaxed);
+        let sync_events_applied = self.stats.sync_events_applied.load(Ordering::Relaxed);
+        let sync_lag_us_total = self.stats.sync_lag_us_total.load(Ordering::Relaxed);
+        let sync_lag_samples = self.stats.sync_lag_samples.load(Ordering::Relaxed);
+        let total = hits + misses;
+        let hit_rate_pct = if total == 0 {
+            0.0
+        } else {
+            // Round-half-up to 1 decimal in floating point. Earlier
+            // `(1000 * hits / total) as f64 / 10.0` did integer division
+            // and truncated, so 2/3 came out as 66.6 instead of 66.7.
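+            // e.g. hits=2, total=3: 1000.0 * 2/3 = 666.67,
+            // .round() = 667.0, / 10.0 = 66.7.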
+ (1000.0 * hits as f64 / total as f64).round() / 10.0 + }; + let sync_lag_ms_avg = if sync_lag_samples == 0 { + 0.0 + } else { + let avg_us = sync_lag_us_total as f64 / sync_lag_samples as f64; + (avg_us / 10.0).round() / 100.0 + }; + serde_json::json!({ + "hits": hits, + "misses": misses, + "hit_rate_pct": hit_rate_pct, + "prefetched": prefetched, + "sync_events_applied": sync_events_applied, + "sync_lag_ms_avg": sync_lag_ms_avg, + }) + } + + pub fn reset_stats(&self) { + self.stats.hits.store(0, Ordering::Relaxed); + self.stats.misses.store(0, Ordering::Relaxed); + self.stats.prefetched.store(0, Ordering::Relaxed); + self.stats.sync_events_applied.store(0, Ordering::Relaxed); + self.stats.sync_lag_us_total.store(0, Ordering::Relaxed); + self.stats.sync_lag_samples.store(0, Ordering::Relaxed); + } +} + +fn now_unix_ms() -> f64 { + use std::time::{SystemTime, UNIX_EPOCH}; + SystemTime::now() + .duration_since(UNIX_EPOCH) + .map(|d| d.as_secs_f64() * 1000.0) + .unwrap_or(0.0) +} diff --git a/content/develop/use-cases/prefetch-cache/rust/demo_server.rs b/content/develop/use-cases/prefetch-cache/rust/demo_server.rs new file mode 100644 index 0000000000..dc73cf5ed1 --- /dev/null +++ b/content/develop/use-cases/prefetch-cache/rust/demo_server.rs @@ -0,0 +1,746 @@ +//! Redis prefetch-cache demo server. +//! +//! Run this demo and visit http://localhost:8790 to watch a prefetch +//! cache in action: the demo bulk-loads every primary record into Redis +//! on startup, runs a background sync worker that applies primary +//! mutations within milliseconds, and lets you add, update, delete, and +//! re-prefetch records to see how the cache stays current without ever +//! falling back to the primary on the read path. + +mod cache; +mod primary; +mod sync_worker; + +use std::env; +use std::sync::Arc; +use std::time::{Duration, Instant}; + +use axum::{ + extract::{Form, Query, State}, + http::{header, StatusCode}, + response::{IntoResponse, Response}, + routing::{get, post}, + Json, Router, +}; +use redis::aio::ConnectionManager; +use redis::Client; +use serde::Deserialize; +use serde_json::{json, Value}; + +use cache::{CacheConfig, PrefetchCache}; +use primary::MockPrimaryStore; +use sync_worker::SyncWorker; + +#[derive(Clone)] +struct AppState { + cache: PrefetchCache, + primary: Arc, + sync: Arc, +} + +#[tokio::main] +async fn main() { + let mut host = String::from("127.0.0.1"); + let mut port: u16 = 8790; + let mut redis_host = env::var("REDIS_HOST").unwrap_or_else(|_| "localhost".to_string()); + let mut redis_port: u16 = env::var("REDIS_PORT") + .ok() + .and_then(|s| s.parse().ok()) + .unwrap_or(6379); + let mut cache_prefix = String::from("cache:category:"); + let mut ttl_seconds: i64 = 3600; + let mut primary_latency_ms: u64 = 80; + + let args: Vec = env::args().collect(); + let mut i = 1; + while i < args.len() { + match args[i].as_str() { + "--host" if i + 1 < args.len() => { + host = args[i + 1].clone(); + i += 2; + } + "--port" if i + 1 < args.len() => { + port = args[i + 1].parse().expect("invalid --port"); + i += 2; + } + "--redis-host" if i + 1 < args.len() => { + redis_host = args[i + 1].clone(); + i += 2; + } + "--redis-port" if i + 1 < args.len() => { + redis_port = args[i + 1].parse().expect("invalid --redis-port"); + i += 2; + } + "--cache-prefix" if i + 1 < args.len() => { + cache_prefix = args[i + 1].clone(); + i += 2; + } + "--ttl-seconds" if i + 1 < args.len() => { + ttl_seconds = args[i + 1].parse().expect("invalid --ttl-seconds"); + i += 2; + } + "--primary-latency-ms" if i + 1 
< args.len() => {
+                primary_latency_ms = args[i + 1].parse().expect("invalid --primary-latency-ms");
+                i += 2;
+            }
+            _ => {
+                i += 1;
+            }
+        }
+    }
+
+    let url = format!("redis://{}:{}/", redis_host, redis_port);
+    let client = Client::open(url).expect("failed to create Redis client");
+    let conn = ConnectionManager::new(client)
+        .await
+        .expect("failed to connect to Redis");
+
+    let cache = PrefetchCache::new(
+        conn,
+        CacheConfig {
+            prefix: cache_prefix.clone(),
+            ttl_seconds,
+        },
+    );
+    let primary = MockPrimaryStore::new(primary_latency_ms);
+    let sync = Arc::new(SyncWorker::new(primary.clone(), cache.clone()));
+
+    let started = Instant::now();
+    let _ = cache.clear().await.unwrap_or(0);
+    let records = primary.list_records().await;
+    let loaded = cache.bulk_load(records).await.unwrap_or(0);
+    let elapsed_ms = started.elapsed().as_secs_f64() * 1000.0;
+    sync.start().await;
+
+    println!(
+        "Redis prefetch-cache demo server listening on http://{}:{}",
+        host, port
+    );
+    println!(
+        "Using Redis at {}:{} with cache prefix '{}' and TTL {}s",
+        redis_host, redis_port, cache_prefix, ttl_seconds
+    );
+    println!(
+        "Prefetched {} records in {:.1} ms; sync worker running",
+        loaded, elapsed_ms
+    );
+
+    let state = AppState {
+        cache,
+        primary,
+        sync: sync.clone(),
+    };
+
+    let app = Router::new()
+        .route("/", get(index))
+        .route("/categories", get(categories))
+        .route("/read", get(read))
+        .route("/stats", get(stats_handler))
+        .route("/update", post(update))
+        .route("/add", post(add))
+        .route("/delete", post(delete))
+        .route("/invalidate", post(invalidate))
+        .route("/clear", post(clear))
+        .route("/reprefetch", post(reprefetch))
+        .route("/reset", post(reset))
+        .with_state(state);
+
+    let listener = tokio::net::TcpListener::bind((host.as_str(), port))
+        .await
+        .expect("failed to bind");
+    let serve = axum::serve(listener, app);
+    if let Err(err) = serve.await {
+        eprintln!("server error: {}", err);
+    }
+    sync.stop(Duration::from_secs(2)).await;
+}
+
+async fn index(State(state): State<AppState>) -> Response {
+    let html = render_html_page(state.cache.ttl_seconds());
+    ([(header::CONTENT_TYPE, "text/html; charset=utf-8")], html).into_response()
+}
+
+async fn categories(State(state): State<AppState>) -> Response {
+    let cache_ids = state.cache.ids().await.unwrap_or_default();
+    let primary_ids = state.primary.list_ids().await;
+    Json(json!({
+        "cache_ids": cache_ids,
+        "primary_ids": primary_ids,
+    }))
+    .into_response()
+}
+
+#[derive(Deserialize)]
+struct ReadParams {
+    id: Option<String>,
+}
+
+async fn read(State(state): State<AppState>, Query(params): Query<ReadParams>) -> Response {
+    let id = params.id.unwrap_or_default();
+    if id.is_empty() {
+        return error_json(StatusCode::BAD_REQUEST, "Missing 'id'.");
+    }
+    let result = match state.cache.get(&id).await {
+        Ok(r) => r,
+        Err(e) => return error_json(StatusCode::INTERNAL_SERVER_ERROR, &e.to_string()),
+    };
+    let ttl = state.cache.ttl_remaining(&id).await.unwrap_or(-2);
+    Json(json!({
+        "id": id,
+        "record": result.record,
+        "hit": result.hit,
+        "redis_latency_ms": round2(result.redis_latency_ms),
+        "ttl_remaining": ttl,
+        "stats": build_stats(&state).await,
+    }))
+    .into_response()
+}
+
+async fn stats_handler(State(state): State<AppState>) -> Json<Value> {
+    Json(build_stats(&state).await)
+}
+
+#[derive(Deserialize)]
+struct UpdateForm {
+    id: String,
+    field: String,
+    value: Option<String>,
+}
+
+async fn update(State(state): State<AppState>, Form(form): Form<UpdateForm>) -> Response {
+    if form.id.is_empty() || form.field.is_empty() {
+        return error_json(StatusCode::BAD_REQUEST, "Missing 'id' or 'field'.");
}
+    let value = form.value.unwrap_or_default();
+    if !state.primary.update_field(&form.id, &form.field, &value).await {
+        return error_json(
+            StatusCode::NOT_FOUND,
+            &format!("Unknown category '{}'.", form.id),
+        );
+    }
+    Json(json!({
+        "id": form.id,
+        "field": form.field,
+        "value": value,
+        "stats": build_stats(&state).await,
+    }))
+    .into_response()
+}
+
+#[derive(Deserialize)]
+struct AddForm {
+    id: String,
+    name: String,
+    display_order: Option<String>,
+    featured: Option<String>,
+    parent_id: Option<String>,
+}
+
+async fn add(State(state): State<AppState>, Form(form): Form<AddForm>) -> Response {
+    let entity_id = form.id.trim().to_string();
+    let name = form.name.trim().to_string();
+    if entity_id.is_empty() || name.is_empty() {
+        return error_json(StatusCode::BAD_REQUEST, "Missing 'id' or 'name'.");
+    }
+    let display_order = form
+        .display_order
+        .filter(|s| !s.is_empty())
+        .unwrap_or_else(|| "99".to_string());
+    let featured = form
+        .featured
+        .filter(|s| !s.is_empty())
+        .unwrap_or_else(|| "false".to_string());
+    let parent_id = form.parent_id.unwrap_or_default();
+
+    let mut record = std::collections::HashMap::new();
+    record.insert("id".to_string(), entity_id.clone());
+    record.insert("name".to_string(), name);
+    record.insert("display_order".to_string(), display_order);
+    record.insert("featured".to_string(), featured);
+    record.insert("parent_id".to_string(), parent_id);
+
+    if !state.primary.add_record(record.clone()).await {
+        return error_json(
+            StatusCode::CONFLICT,
+            &format!("Category '{}' already exists.", entity_id),
+        );
+    }
+    Json(json!({
+        "id": entity_id,
+        "record": record,
+        "stats": build_stats(&state).await,
+    }))
+    .into_response()
+}
+
+#[derive(Deserialize)]
+struct IdForm {
+    id: String,
+}
+
+async fn delete(State(state): State<AppState>, Form(form): Form<IdForm>) -> Response {
+    if form.id.is_empty() {
+        return error_json(StatusCode::BAD_REQUEST, "Missing 'id'.");
+    }
+    if !state.primary.delete_record(&form.id).await {
+        return error_json(
+            StatusCode::NOT_FOUND,
+            &format!("Unknown category '{}'.", form.id),
+        );
+    }
+    Json(json!({
+        "id": form.id,
+        "stats": build_stats(&state).await,
+    }))
+    .into_response()
+}
+
+async fn invalidate(State(state): State<AppState>, Form(form): Form<IdForm>) -> Response {
+    if form.id.is_empty() {
+        return error_json(StatusCode::BAD_REQUEST, "Missing 'id'.");
+    }
+    let deleted = state.cache.invalidate(&form.id).await.unwrap_or(false);
+    Json(json!({
+        "id": form.id,
+        "deleted": deleted,
+        "stats": build_stats(&state).await,
+    }))
+    .into_response()
+}
+
+async fn clear(State(state): State<AppState>) -> Response {
+    // Pause the sync worker so it cannot recreate keys between SCAN
+    // and DEL. Queued events accumulate and apply after resume.
+    let _ = state.sync.pause(Duration::from_secs(2)).await;
+    let deleted = state.cache.clear().await.unwrap_or(0);
+    state.sync.resume().await;
+    Json(json!({
+        "deleted": deleted,
+        "stats": build_stats(&state).await,
+    }))
+    .into_response()
+}
+
+async fn reprefetch(State(state): State<AppState>) -> Response {
+    // Pause the sync worker so it cannot interleave with the clear +
+    // snapshot + bulk_load sequence. Without this, a change applied
+    // between list_records() and bulk_load() would be overwritten by
+    // the stale snapshot.
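+    // pause() returns false if the worker did not confirm idle within
+    // the timeout; the demo discards the result and proceeds anyway,
+    // trading a small residual race for availability.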
+    let _ = state.sync.pause(Duration::from_secs(2)).await;
+    let started = Instant::now();
+    let _ = state.cache.clear().await.unwrap_or(0);
+    let records = state.primary.list_records().await;
+    let loaded = state.cache.bulk_load(records).await.unwrap_or(0);
+    let elapsed_ms = started.elapsed().as_secs_f64() * 1000.0;
+    state.sync.resume().await;
+    Json(json!({
+        "loaded": loaded,
+        "elapsed_ms": round2(elapsed_ms),
+        "stats": build_stats(&state).await,
+    }))
+    .into_response()
+}
+
+async fn reset(State(state): State<AppState>) -> Json<Value> {
+    state.cache.reset_stats();
+    state.primary.reset_reads();
+    Json(build_stats(&state).await)
+}
+
+async fn build_stats(state: &AppState) -> Value {
+    let mut stats = state.cache.stats();
+    if let Some(obj) = stats.as_object_mut() {
+        obj.insert(
+            "primary_reads_total".to_string(),
+            json!(state.primary.reads()),
+        );
+        obj.insert(
+            "primary_read_latency_ms".to_string(),
+            json!(state.primary.read_latency_ms),
+        );
+    }
+    stats
+}
+
+fn error_json(status: StatusCode, message: &str) -> Response {
+    (status, Json(json!({ "error": message }))).into_response()
+}
+
+fn round2(value: f64) -> f64 {
+    (value * 100.0).round() / 100.0
+}
+
+fn render_html_page(cache_ttl: i64) -> String {
+    HTML_TEMPLATE.replace("__CACHE_TTL__", &cache_ttl.to_string())
+}
+
+const HTML_TEMPLATE: &str = r##"
+[Inline HTML/JS page omitted: its markup was stripped in this view. The template renders a header ("Redis Prefetch Cache Demo", "redis-rs + axum/tokio"), an intro explaining that reads are HGETALL-only against Redis with a __CACHE_TTL__ s safety-net TTL refreshed on every add or update event, and panels for: Cache state, Read a category, Update a field, Add a category, Delete a category, Break the cache (invalidate one key or clear all), Cache stats, and Last result, plus the JavaScript that calls the demo endpoints. See demo_server.rs in the source folder for the full template.]
+"##;
diff --git a/content/develop/use-cases/prefetch-cache/rust/primary.rs b/content/develop/use-cases/prefetch-cache/rust/primary.rs
new file mode 100644
index 0000000000..8494919a84
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/rust/primary.rs
@@ -0,0 +1,201 @@
+//! Mock primary data store for the prefetch-cache demo.
+//!
+//! This stands in for a source-of-truth database (Postgres, MySQL,
+//! Mongo, etc.) that holds reference data the application serves to
+//! users.
+//!
+//! Every mutation appends a change event to an in-process queue, which
+//! the sync worker drains and applies to Redis. In a real system the
+//! queue is replaced by a CDC pipeline — Redis Data Integration,
+//! Debezium plus a lightweight consumer, or an equivalent tool that
+//! tails the source's binlog/WAL and pushes changes into Redis.
+//!
+//! The store also exposes `read_latency_ms` so the demo can illustrate
+//! how much slower a direct primary read would be than a Redis hit.
+
+use std::collections::HashMap;
+use std::sync::atomic::{AtomicU64, Ordering};
+use std::sync::Arc;
+use std::time::Duration;
+
+use tokio::sync::{mpsc, Mutex};
+
+pub const CHANGE_OP_UPSERT: &str = "upsert";
+pub const CHANGE_OP_DELETE: &str = "delete";
+
+#[derive(Debug, Clone)]
+pub struct ChangeEvent {
+    pub op: String,
+    pub id: String,
+    pub fields: Option<HashMap<String, String>>,
+    pub timestamp_ms: f64,
+}
+
+struct Inner {
+    records: HashMap<String, HashMap<String, String>>,
+    tx: mpsc::UnboundedSender<ChangeEvent>,
+}
+
+pub struct MockPrimaryStore {
+    pub read_latency_ms: u64,
+    reads: AtomicU64,
+    inner: Mutex<Inner>,
+    rx: Mutex<mpsc::UnboundedReceiver<ChangeEvent>>,
+}
+
+impl MockPrimaryStore {
+    pub fn new(read_latency_ms: u64) -> Arc<Self> {
+        let (tx, rx) = mpsc::unbounded_channel::<ChangeEvent>();
+        let mut records: HashMap<String, HashMap<String, String>> = HashMap::new();
+        records.insert(
+            "cat-001".to_string(),
+            make_record("cat-001", "Beverages", "1", "true", ""),
+        );
+        records.insert(
+            "cat-002".to_string(),
+            make_record("cat-002", "Bakery", "2", "true", ""),
+        );
+        records.insert(
+            "cat-003".to_string(),
+            make_record("cat-003", "Pantry Staples", "3", "false", ""),
+        );
+        records.insert(
+            "cat-004".to_string(),
+            make_record("cat-004", "Frozen", "4", "false", ""),
+        );
+        records.insert(
+            "cat-005".to_string(),
+            make_record("cat-005", "Specialty Cheeses", "5", "false", "cat-002"),
+        );
+
+        Arc::new(Self {
+            read_latency_ms,
+            reads: AtomicU64::new(0),
+            inner: Mutex::new(Inner { records, tx }),
+            rx: Mutex::new(rx),
+        })
+    }
+
+    pub async fn list_ids(&self) -> Vec<String> {
+        // Metadata-only query: no sleep, no counter increment.
+        let inner = self.inner.lock().await;
+        let mut ids: Vec<String> = inner.records.keys().cloned().collect();
+        ids.sort();
+        ids
+    }
+
+    /// Return every record. Used by the cache's bulk-load path on startup.
+    pub async fn list_records(&self) -> Vec<HashMap<String, String>> {
+        tokio::time::sleep(Duration::from_millis(self.read_latency_ms)).await;
+        self.reads.fetch_add(1, Ordering::Relaxed);
+        let inner = self.inner.lock().await;
+        inner.records.values().cloned().collect()
+    }
+
+    /// Single-record read. Not on the demo's normal read path.
+    #[allow(dead_code)]
+    pub async fn read(&self, entity_id: &str) -> Option<HashMap<String, String>> {
+        tokio::time::sleep(Duration::from_millis(self.read_latency_ms)).await;
+        self.reads.fetch_add(1, Ordering::Relaxed);
+        let inner = self.inner.lock().await;
+        inner.records.get(entity_id).cloned()
+    }
+
+    pub async fn add_record(&self, record: HashMap<String, String>) -> bool {
+        let entity_id = match record.get("id") {
+            Some(v) if !v.is_empty() => v.clone(),
+            _ => return false,
+        };
+        let mut inner = self.inner.lock().await;
+        if inner.records.contains_key(&entity_id) {
+            return false;
+        }
+        inner.records.insert(entity_id.clone(), record.clone());
+        // Emit while the lock is held so the queue order matches the
+        // mutation order. Two concurrent callers cannot interleave
+        // mutation A → mutation B → emit B → emit A.
+        emit_locked(&inner.tx, CHANGE_OP_UPSERT, &entity_id, Some(record));
+        true
+    }
+
+    pub async fn update_field(&self, entity_id: &str, field: &str, value: &str) -> bool {
+        let mut inner = self.inner.lock().await;
+        let snapshot = match inner.records.get_mut(entity_id) {
+            Some(record) => {
+                record.insert(field.to_string(), value.to_string());
+                record.clone()
+            }
+            None => return false,
+        };
+        emit_locked(&inner.tx, CHANGE_OP_UPSERT, entity_id, Some(snapshot));
+        true
+    }
+
+    pub async fn delete_record(&self, entity_id: &str) -> bool {
+        let mut inner = self.inner.lock().await;
+        if inner.records.remove(entity_id).is_none() {
+            return false;
+        }
+        emit_locked(&inner.tx, CHANGE_OP_DELETE, entity_id, None);
+        true
+    }
+
+    /// Block up to `timeout` for the next change event.
+    pub async fn next_change(&self, timeout: Duration) -> Option<ChangeEvent> {
+        let mut rx = self.rx.lock().await;
+        match tokio::time::timeout(timeout, rx.recv()).await {
+            Ok(Some(event)) => Some(event),
+            Ok(None) => None,
+            Err(_) => None,
+        }
+    }
+
+    pub fn reads(&self) -> u64 {
+        self.reads.load(Ordering::Relaxed)
+    }
+
+    pub fn reset_reads(&self) {
+        self.reads.store(0, Ordering::Relaxed);
+    }
+}
+
+/// Append a change event to the feed. Caller must hold the mutation
+/// lock. Holding the lock here is what guarantees that the queue order
+/// matches the order in which the records map was mutated.
+fn emit_locked(
+    tx: &mpsc::UnboundedSender<ChangeEvent>,
+    op: &str,
+    entity_id: &str,
+    fields: Option<HashMap<String, String>>,
+) {
+    let _ = tx.send(ChangeEvent {
+        op: op.to_string(),
+        id: entity_id.to_string(),
+        fields,
+        timestamp_ms: now_unix_ms(),
+    });
+}
+
+fn make_record(
+    id: &str,
+    name: &str,
+    display_order: &str,
+    featured: &str,
+    parent_id: &str,
+) -> HashMap<String, String> {
+    let mut m = HashMap::new();
+    m.insert("id".to_string(), id.to_string());
+    m.insert("name".to_string(), name.to_string());
+    m.insert("display_order".to_string(), display_order.to_string());
+    m.insert("featured".to_string(), featured.to_string());
+    m.insert("parent_id".to_string(), parent_id.to_string());
+    m
+}
+
+fn now_unix_ms() -> f64 {
+    use std::time::{SystemTime, UNIX_EPOCH};
+    SystemTime::now()
+        .duration_since(UNIX_EPOCH)
+        .map(|d| d.as_secs_f64() * 1000.0)
+        .unwrap_or(0.0)
+}
diff --git a/content/develop/use-cases/prefetch-cache/rust/sync_worker.rs b/content/develop/use-cases/prefetch-cache/rust/sync_worker.rs
new file mode 100644
index 0000000000..069f442d99
--- /dev/null
+++ b/content/develop/use-cases/prefetch-cache/rust/sync_worker.rs
@@ -0,0 +1,221 @@
+//! Background sync worker for the prefetch-cache demo.
+//!
+//! A long-running tokio task drains the primary's change queue and
+//! applies each event to Redis through `PrefetchCache::apply_change`.
+//!
In a real system, the queue is replaced by a CDC pipeline (Redis
+//! Data Integration, Debezium, or an equivalent) that tails the
+//! primary's binlog/WAL and writes the same shape of events.
+//!
+//! The worker exposes `pause()` and `resume()` so maintenance paths
+//! (`/reprefetch`, `clear()`) can stop event application without
+//! tearing the task down. `pause()` blocks until the worker is
+//! parked, so the caller knows no apply is in flight by the time it
+//! returns.
+
+use std::sync::Arc;
+use std::time::Duration;
+
+use tokio::sync::{watch, Mutex, Notify};
+use tokio::task::JoinHandle;
+
+use crate::cache::PrefetchCache;
+use crate::primary::MockPrimaryStore;
+
+#[derive(Clone)]
+struct WorkerState {
+    pause_tx: watch::Sender<bool>,
+    pause_rx: watch::Receiver<bool>,
+    stop_tx: watch::Sender<bool>,
+    stop_rx: watch::Receiver<bool>,
+    idle_notify: Arc<Notify>,
+}
+
+pub struct SyncWorker {
+    primary: Arc<MockPrimaryStore>,
+    cache: PrefetchCache,
+    poll_timeout: Duration,
+    state: Mutex<Option<WorkerState>>,
+    handle: Mutex<Option<JoinHandle<()>>>,
+}
+
+impl SyncWorker {
+    pub fn new(
+        primary: Arc<MockPrimaryStore>,
+        cache: PrefetchCache,
+    ) -> Self {
+        Self {
+            primary,
+            cache,
+            poll_timeout: Duration::from_millis(50),
+            state: Mutex::new(None),
+            handle: Mutex::new(None),
+        }
+    }
+
+    pub async fn start(self: &Arc<Self>) {
+        let mut handle_guard = self.handle.lock().await;
+        if let Some(h) = handle_guard.as_ref() {
+            if !h.is_finished() {
+                return;
+            }
+        }
+
+        let (pause_tx, pause_rx) = watch::channel(false);
+        let (stop_tx, stop_rx) = watch::channel(false);
+        let idle_notify = Arc::new(Notify::new());
+
+        let state = WorkerState {
+            pause_tx,
+            pause_rx,
+            stop_tx,
+            stop_rx,
+            idle_notify,
+        };
+
+        {
+            let mut s = self.state.lock().await;
+            *s = Some(state.clone());
+        }
+
+        let primary = self.primary.clone();
+        let cache = self.cache.clone();
+        let poll_timeout = self.poll_timeout;
+        let run_state = state.clone();
+        let handle = tokio::spawn(async move {
+            run_loop(primary, cache, poll_timeout, run_state).await;
+        });
+        *handle_guard = Some(handle);
+    }
+
+    /// Signal the worker to exit and join its task. If the join times
+    /// out the worker is wedged inside `apply_change`; we leave the
+    /// handle populated so a subsequent `start()` does not spawn a
+    /// second worker on top of the orphan.
+    pub async fn stop(&self, join_timeout: Duration) {
+        if let Some(state) = self.state.lock().await.as_ref() {
+            let _ = state.stop_tx.send(true);
+        }
+        let mut handle_guard = self.handle.lock().await;
+        if let Some(h) = handle_guard.as_mut() {
+            match tokio::time::timeout(join_timeout, h).await {
+                Ok(_) => {
+                    *handle_guard = None;
+                }
+                Err(_) => {
+                    // Timed out; leave the handle in place so a future
+                    // start() will see it as still running and refuse
+                    // to spawn a second worker on top of the orphan.
+                }
+            }
+        }
+    }
+
+    /// Stop applying events and block until the worker is parked.
+    /// Returns `true` once the worker has confirmed it is idle, or
+    /// `false` if the timeout elapsed first. While paused, change
+    /// events accumulate in the primary's queue and are applied in
+    /// order after `resume()`.
+    pub async fn pause(&self, timeout: Duration) -> bool {
+        let state = {
+            let guard = self.state.lock().await;
+            match guard.as_ref() {
+                Some(s) => s.clone(),
+                None => return true,
+            }
+        };
+
+        // Check whether the task is still alive — if it's already
+        // exited or never started, treat that as "paused".
+        let handle_alive = {
+            let h = self.handle.lock().await;
+            h.as_ref().map(|j| !j.is_finished()).unwrap_or(false)
+        };
+        if !handle_alive {
+            let _ = state.pause_tx.send(true);
+            return true;
+        }
+
+        // Subscribe to the idle notification BEFORE we flip the pause
+        // flag, so we cannot miss a notification that fires between
+        // the flag flip and `notified().await`.
+        let notify = state.idle_notify.clone();
+        let notified = notify.notified();
+        tokio::pin!(notified);
+        let _ = state.pause_tx.send(true);
+
+        // If the worker is already parked, the notification may have
+        // happened before we subscribed; re-check by peeking at the
+        // pause state from the worker's side. The worker re-emits
+        // `notify_waiters` on each pass through the park loop, so the
+        // worst case is a one-poll-cycle wait.
+        match tokio::time::timeout(timeout, &mut notified).await {
+            Ok(_) => true,
+            Err(_) => false,
+        }
+    }
+
+    pub async fn resume(&self) {
+        if let Some(state) = self.state.lock().await.as_ref() {
+            let _ = state.pause_tx.send(false);
+        }
+    }
+}
+
+async fn run_loop(
+    primary: Arc<MockPrimaryStore>,
+    cache: PrefetchCache,
+    poll_timeout: Duration,
+    state: WorkerState,
+) {
+    let WorkerState {
+        pause_tx: _,
+        mut pause_rx,
+        stop_tx: _,
+        mut stop_rx,
+        idle_notify,
+    } = state;
+
+    loop {
+        if *stop_rx.borrow() {
+            return;
+        }
+        if *pause_rx.borrow() {
+            // Park until the pause is lifted or the worker is stopped.
+            // Re-notify on every iteration so a *new* pause() that
+            // subscribes while we are still parked from the previous
+            // cycle gets a wake-up within one poll interval, not after
+            // the caller's full pause-timeout.
+            loop {
+                if *stop_rx.borrow() {
+                    return;
+                }
+                if !*pause_rx.borrow() {
+                    break;
+                }
+                idle_notify.notify_waiters();
+                tokio::select! {
+                    _ = pause_rx.changed() => {}
+                    _ = stop_rx.changed() => {}
+                    _ = tokio::time::sleep(poll_timeout) => {}
+                }
+            }
+            continue;
+        }
+
+        let change = primary.next_change(poll_timeout).await;
+        let change = match change {
+            Some(c) => c,
+            None => continue,
+        };
+        if let Err(err) = cache.apply_change(&change).await {
+            // Demo behaviour: log and drop the event. A production
+            // CDC consumer would retry with bounded backoff and
+            // expose a dead-letter / error counter; see the guide's
+            // "Production usage" section.
+            eprintln!(
+                "[sync] failed to apply {:?} id={}: {}",
+                change.op, change.id, err
+            );
+        }
+    }
+}