.agents/skills/redis-use-case-ports/assets/audit-checklist.md (49 additions, 0 deletions)

@@ -205,6 +205,55 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a…

---

## 14. Empty-fields `HSET` guard in change-event consumers

**What to scan for:** any code path that takes a "fields" payload from a change event / message / callback and forwards it to `HSET` (or the client-equivalent `hSet` / `hSetMultiple` / `HashSet` / `hMSet` / etc.). Typically this is a CDC consumer, sync worker, or write-through path.

**Pass criterion:** before the `HSET` call, the code explicitly guards against `fields` being null, missing, or empty, and returns early on the malformed case (or routes to a dead-letter, etc.). The guard must run before the pipeline / transaction is opened.
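A minimal redis-py-style sketch of the guard (the event shape, function name, and 300-second TTL are illustrative, not taken from the reference implementation):

```python
def apply_change_event(client, event):
    """Forward one change event's fields payload to HSET, with the guard.

    `client` is any redis-py-compatible client; `event` is a dict like
    {"key": "...", "fields": {...}} (an assumed shape for illustration).
    """
    fields = event.get("fields")
    if not fields:  # None, absent, or {} -- guard BEFORE opening the pipeline
        return False  # caller can drop the event or route it to a dead-letter
    pipe = client.pipeline(transaction=True)
    pipe.hset(event["key"], mapping=fields)
    pipe.expire(event["key"], 300)  # illustrative TTL
    pipe.execute()
    return True
```

The key point is that the early return fires before `pipeline()` is called, so a malformed event never opens a transaction at all.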

**Sample audit prompt:**

> Audit every code path in the 9 client implementations under `content/develop/use-cases/{{USE_CASE_NAME}}/` that forwards a fields payload from a change-event / callback / message to `HSET` (or the client equivalent). For each, confirm there is an explicit early-return guard for null / missing / empty fields **before** any pipeline or transaction is constructed. Flag any port without the guard with file path and line number.

**Why on list:** Every Redis client tested in the prefetch-cache use case raises or panics on `HSET` with an empty fields mapping: redis-py raises `DataError`, node-redis throws, Predis reports "wrong number of arguments", redis-rs **panics** on `pipe().hset_multiple(&key, &[])`, and Jedis and go-redis return errors. A defensive `|| {}` fallback that *looks* like it handles the empty case is actually misleading; Cursor bugbot caught this on the reference implementation. ([PR #3317 comment](https://github.com/redis/docs/pull/3317))

---

## 15. TTL sentinel preservation across libraries

**What to scan for:** any `TTL` / `ttl_remaining` / `ttlRemaining` helper that wraps the client's TTL command. Particularly any code that converts the library's return type (often `time.Duration`, `TimeSpan?`, `Long`) into integer seconds.

**Pass criterion:** the helper returns **`-2`** for a missing key and **`-1`** for a key with no TTL, as integer seconds (or the language's native integer type). Libraries encode these sentinels inconsistently:

- **redis-py**: returns `int` directly with `-2` / `-1` preserved.
- **go-redis**: returns `time.Duration` with `-2` / `-1` as **raw nanoseconds** (not seconds-scaled). A naive `int(d.Seconds())` truncates to `0`.
- **StackExchange.Redis**: `KeyTimeToLive` returns `TimeSpan?` and collapses **both** missing-key and no-TTL into `null` — a null-coalesce loses the `-2` sentinel.
- **node-redis / Jedis / Lettuce / Predis / redis-rb**: return integer-typed seconds with `-2` / `-1` preserved.

The recommended cross-client idiom is to **bypass the library wrapper** and send the raw command (`client.Do(ctx, "TTL", key).Int64()` in Go, `IDatabase.Execute("TTL", key)` in .NET) so the integer reply comes through untouched.
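To see why the naive Go conversion loses the sentinels, here is the arithmetic simulated in Python (a sketch only; the recommended fix remains sending the raw `TTL` command as above):

```python
NANOS_PER_SEC = 1_000_000_000  # go-redis returns time.Duration, i.e. nanoseconds

def naive_ttl_seconds(duration_ns):
    # Equivalent of Go's int(d.Seconds()): scale to seconds, truncate toward zero.
    return int(duration_ns / NANOS_PER_SEC)

def guarded_ttl_seconds(duration_ns):
    # Hypothetical guard: pass the raw sentinel nanoseconds through unscaled.
    if duration_ns in (-2, -1):
        return duration_ns
    return duration_ns // NANOS_PER_SEC

print(naive_ttl_seconds(-2))                    # 0 -- missing-key sentinel lost
print(guarded_ttl_seconds(-2))                  # -2
print(guarded_ttl_seconds(90 * NANOS_PER_SEC))  # 90
```

The naive path collapses both `-2` and `-1` to `0`, which a caller then reads as "key exists but is about to expire", the opposite of the truth.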

**Sample audit prompt:**

> For each port's `TTLRemaining` (or equivalent) under `content/develop/use-cases/{{USE_CASE_NAME}}/`, confirm it returns `-2` for a missing key and `-1` for a key with no TTL. Test each by reading a non-existent ID and by running `PERSIST` on an existing cache key then reading it. Flag any port that returns `0`, `null`, or collapses the two sentinels into one value.

**Why on list:** Caught in the prefetch-cache cross-port audit. go-redis and StackExchange.Redis both shipped with subtle bugs in their TTL conversion that the audit caught. ([PR #3317 audit B](https://github.com/redis/docs/pull/3317))

---

## 16. Locked-emit ordering for producer/consumer queues

**What to scan for:** any mock primary store, in-memory writer, or producer that (a) mutates internal state under a lock and (b) appends a corresponding event to an out-of-process or out-of-thread queue/stream/channel. Typical methods: `add_record` / `update_field` / `delete_record`, `enqueue`, `publish_change`.

**Pass criterion:** the queue append happens **inside the same locked section** as the state mutation, not after it. Without this, two concurrent mutations can complete in one order but enqueue their events in the opposite order, and a downstream consumer applies them out of order — the cache ends up divergent from the source. For cross-process producers (PHP, etc.), the equivalent is wrapping the mutation + `LPUSH` in a Lua script so the server enforces ordering.

**Sample audit prompt:**

> Audit every mutation method in each port's mock primary store (or equivalent producer) under `content/develop/use-cases/{{USE_CASE_NAME}}/`. For each, confirm the change event is appended to the queue / stream / channel **while the mutation lock is still held** (or, for cross-process ports, wrapped in a Lua script that combines the record write and the LPUSH server-side). Flag any port where the emit happens after the lock release.

**Why on list:** Locked-emit ordering is what guarantees a CDC consumer can replay events deterministically. Caught and fixed in the prefetch-cache reference's `_emit_change_locked` pattern after Codex review; the prefetch-cache cross-port audit confirmed all 9 ports preserve the invariant, including PHP's Lua-script equivalent. ([PR #3317 audit C](https://github.com/redis/docs/pull/3317))

---

## How to add a new row

When a bug class is identified after this skill has been used:
@@ -25,7 +25,7 @@ A sub-agent can run this in read-only mode. For each row, produce a 9-column com…
| Write path | `HSET` (with all fields) + `EXPIRE`, ideally pipelined or in a single `HSET ... EXPIRE` MULTI. |
| Invalidate | `DEL` (not `EXPIRE 0`, not `UNLINK`). |
| Field update | `HSET key field value` + `EXPIRE` inside a conditional transaction or `Condition.KeyExists`. |
| TTL inspection | `TTL` (not `PTTL`, not `OBJECT`). The wrapper must preserve the `-2` (missing key) and `-1` (no TTL) sentinels as integer seconds; if the client's typed wrapper collapses or rescales them (go-redis's `time.Duration` with nanosecond-encoded sentinels, StackExchange.Redis's `KeyTimeToLive` returning `null` for both cases), bypass it with the raw command (`Do("TTL", ...)` / `Execute("TTL", ...)`). See audit-checklist row 15. |
| Single-flight acquire | Lua script using `SET NX PX`. |
| Single-flight release | Lua script using `GET == token` check + `DEL`. |
| Counters (where stats are in Redis, e.g. PHP) | `HINCRBY`. |
@@ -187,10 +187,11 @@ The only per-client variation should be the **pill text** at the top of `<body>`…
| `Get the source files` subsection | Every `_index.md` has a `### Get the source files` subsection as the first child of `## Running the demo`. It contains a `mkdir <use-case>-demo && cd <use-case>-demo`, a `BASE=https://raw.githubusercontent.com/redis/docs/main/...` variable, and one `curl -O $BASE/<file>` per source file the port needs. |
| Files curled match files run | The set of files in the curl block matches what the existing run command (e.g. `python3 demo_server.py`, `dotnet run`, `php -S ... demo_server.php`) actually requires. No missing config files (`package.json`, `composer.json`, `*.csproj`, `go.mod`, `Cargo.toml`), no extras (`Cargo.lock` only if `cargo` expects it; build outputs never). |
| Rust folder layout | The curl block matches the port's on-disk layout: if files live under `src/`, the block does `mkdir -p .../src && cd ...` then `curl -o src/<file> $BASE/src/<file>`; if files are flat at the project root (driven by explicit `path =` in `Cargo.toml`), `curl -O $BASE/<file>` for all of them. |
| Source-file count in prose matches curl block | Prose like *"The demo consists of N files"* in `### Get the source files` must match the actual number of `curl -O` lines in the block. Easy drift when a port adds an extra worker entry point (e.g. PHP's separate `sync_worker.php`) and the count is not updated. |

**Audit prompt:**

> For each of the 9 client implementations of `content/develop/use-cases/{{USE_CASE_NAME}}/`, grep `_index.md` with `grep -nE "\]\(([^h)][^)]*\.[a-z]+)\)"` — the result must be empty (no relative file links). Then confirm `## Running the demo` is followed by `### Get the source files`, and that the curl block downloads the same files the run command needs. Count the `curl -O` lines and confirm the prose intro ("The demo consists of N files") matches. Flag any port where the curl-block file set diverges from the run-time requirements, or where a Rust port's `src/` layout doesn't match its on-disk reality.

## File names per client

.agents/skills/redis-use-case-ports/assets/redis-conventions.md (29 additions, 0 deletions)

@@ -271,6 +271,9 @@ PHP runs each HTTP request in a fresh process under `php -S`. This means:
- **In-process state doesn't persist.** Cache stats, primary record state, primary read counters, and per-job-queue counters must live in Redis (under a `demo:*` keyspace, or a `<prefix>:{name}:stats` hash).
- **Spawning sub-processes from a request handler must detach from the dev server's listen socket.** This bites both `pcntl_fork` (forked children inherit the accept socket) and `proc_open` (children inherit FDs unless explicitly redirected). The fix is **`setsid` on Linux**, and a shell-based new-session wrapper on macOS (which lacks `setsid(1)`). The detach also needs to redirect stdin/stdout/stderr to files; closing them alone isn't enough.
- **Predis 3.x's `hset()` is variadic, not associative.** The 1.x `$redis->hset($key, ['field' => 'value'])` form raises `wrong number of arguments for 'hset'` against a 3.x client/server. Use `$redis->hset($key, 'field', 'value', 'field2', 'value2', ...)` and write a small `flattenFields()` helper if you're storing a map.
- **Predis `BRPOP` only accepts whole-second timeouts.** Sub-second polling intervals (e.g. a 50 ms `next_change` loop in the reference Python) need a workaround: use a 1 s `BRPOP` for change draining plus a separate fast pause-flag poll (e.g. 20 ms `usleep`) so pause/resume latency stays low even when the main `BRPOP` is parked.
- **Cross-process pause/resume goes through Redis flags.** Where threaded ports use a `threading.Event` (or equivalent) inside one process, PHP needs the demo server and the long-running sync worker to coordinate across processes. The pattern is two keys: `demo:sync:paused` (writer to worker) and `demo:sync:idle` (worker acks parked state). The demo's `/clear` and `/reprefetch` handlers set `paused=1`, spin-wait for `idle=1` with a 10 ms poll and a 2 s timeout, do the cache write, then set `paused=0`. The worker checks `paused` on each loop iteration; if set, writes `idle=1` and spin-waits for it to clear. Established in the prefetch-cache PHP port.
- **Mutation + change-event emit needs Lua-script atomicity** when the producer is also stateless (PHP). The reference Python uses an in-process `Lock` to make "mutate-then-emit" atomic; the PHP equivalent is wrapping the record write and the `LPUSH` (or `XADD`) onto the change feed in a single `EVAL`. Without this, two concurrent mutations on the same key can land in queue order opposite to their server-side commit order. (Audit-checklist row 16.)
- The brief should call out that the cross-process supervision approach is **PHP-specific** in the production-usage section.
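The two-flag pause/resume handshake can be simulated in-process with Python threads. This is an illustrative sketch with an in-memory stand-in for the flags; a real PHP port would `GET`/`SET` the `demo:sync:paused` and `demo:sync:idle` keys in Redis instead:

```python
import threading
import time

class FlagStore:
    """In-memory stand-in for the demo:sync:paused / demo:sync:idle keys."""
    def __init__(self):
        self.paused = False
        self.idle = False

def worker_loop(flags, stop):
    # The sync worker checks the pause flag on each loop iteration.
    while not stop.is_set():
        if flags.paused:
            flags.idle = True  # ack: worker is parked
            while flags.paused and not stop.is_set():
                time.sleep(0.01)
            flags.idle = False
        time.sleep(0.005)  # stand-in for draining change events

def pause_and_do(flags, critical):
    # The /clear and /reprefetch handlers: pause, wait for the ack
    # (10 ms poll, 2 s timeout), do the cache write, resume.
    flags.paused = True
    deadline = time.time() + 2.0
    while not flags.idle and time.time() < deadline:
        time.sleep(0.01)
    try:
        critical()  # safe: the worker is parked
    finally:
        flags.paused = False
```

The worker only sets `idle` after observing `paused`, so the handler's spin-wait guarantees the cache write never races a concurrent event drain.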

## .NET-specific notes
@@ -281,20 +284,25 @@
- **StackExchange.Redis intentionally does not expose blocking pops** (`BRPOPLPUSH` / `BLMOVE` with a timeout) because they would monopolise the multiplexer's single command pipeline. Use cases that need a blocking claim (job queue, etc.) should poll the non-blocking `IDatabase.ListRightPopLeftPush` on a short interval (50 ms is a reasonable default). Document this in the helper's "Claiming jobs" / "How it works" section.
- **`RedisChannel` no longer has an implicit `string` conversion in 2.7+.** `db.Publish(...)` needs `RedisChannel.Literal("channel:name")` or `RedisChannel.Pattern(...)` explicitly.
- StackExchange.Redis transparently caches Lua scripts: the first `ScriptEvaluate(script, keys, args)` sends `EVAL`, subsequent calls switch to `EVALSHA` automatically. No need to manage SHAs by hand.
- **`IDatabase.KeyTimeToLive` collapses the `-2` (missing) and `-1` (no TTL) sentinels into a single `TimeSpan?` null.** For any `TTL` lookup that needs to distinguish them, send the raw command instead: `(long) db.Execute("TTL", key)` returns the integer the server actually replied with. (Audit-checklist row 15.)
- **`IServer.Keys` (the typed SCAN enumerator) requires `AllowAdmin = true` on the `ConfigurationOptions`** — which also grants `FLUSHDB` / `CONFIG`, a real security concern in production. Where SCAN-style enumeration is needed (e.g. a `clear()` helper) prefer `db.Execute("SCAN", cursor, "MATCH", pattern, "COUNT", count)` so the demo doesn't pull in admin-privileged client config.

## Java-specific notes

- **Jedis**: use `JedisPool` and acquire a `Jedis` instance per call with try-with-resources. Each transaction gets its own connection; no in-process lock is needed.
- **Jedis 5.x's `brpoplpush` takes integer seconds.** Sub-second blocking-claim timeouts (e.g. 500 ms polling windows) round up to 1 s on the wire. The polling loop still observes its stop flag promptly enough; just be aware the per-iteration block is longer than the reference suggests.
- **Lettuce**: by default the demo shares one `StatefulRedisConnection` across HTTP handlers. Lettuce is thread-safe for individual commands but pipelined sequences and transactions are connection-scoped — concurrent pipelines or `MULTI`/`EXEC` blocks on one connection can interleave. Options when an enqueue / update needs two-or-more commands atomic-ish: (a) wrap in a `ReentrantLock`; (b) use `MULTI`/`EXEC` with the same lock; (c) merge into a Lua script (preferred — atomic server-side and lock-free, but requires writing the script). The production-usage section should explain you'd switch to `ConnectionPoolSupport.createGenericObjectPool(...)` in production and drop the lock.
- **Lettuce sync API does not cooperate with `setAutoFlushCommands(false)`.** Each sync call internally awaits its `CompletableFuture`; with auto-flush off, those futures never complete because nothing flushes. Symptom: bulk-load deadlocks silently — no exception, just a hung process. Use the **async API** (`RedisAsyncCommands<K,V>`) for any pipelined batch where you intend to flush at the end: queue commands without awaiting each one, then `connection.flushCommands()` and await the futures in bulk. Documented after the prefetch-cache Lettuce port hit it during testing.
- Lettuce's `BLMOVE` accepts a `double` timeout in seconds with sub-second precision (`bRPopLPush(timeout: double)`). Don't use the older `long`-overload — pre-6.x builds treated values < 1 as "block forever".
- Both Java demos depend on a small classpath. The `_index.md` should give an example `javac` + `java` command listing the jars by name.
- **JDK version: pick text blocks (15+) or string concatenation (11+) and apply it across both Java ports of the same use case.** Text blocks (`"""..."""`) keep the inlined HTML readable; concatenation works on older JDKs. The cache-aside Java ports use concatenation with JDK 11+ prereq; the prefetch-cache Java ports use text blocks with JDK 17+ prereq. Either is fine — just don't mix within a use case, and set Prerequisites accordingly.

## Go-specific notes

- Use `package <use-case-package>` (e.g., `package cacheaside`) for all files, including the demo server. Expose the entry point as a `RunDemoServer()` function rather than `main()` directly.
- Ask the user to create a one-line `main.go` next to the files: `package main; import "<use-case-package>"; func main() { <pkg>.RunDemoServer() }`. This avoids the Go limitation that `package main` can't coexist with another package in the same directory.
- `go.mod` should declare `module <use-case-package>` and `require github.com/redis/go-redis/v9` at a recent stable version.
- **go-redis encodes the `TTL` sentinels `-2` / `-1` as raw nanoseconds**, not seconds-scaled. `client.TTL(...).Result()` returns `time.Duration(-2)` (one nanosecond) for a missing key, and a naive `int(d.Seconds())` truncates it to `0`. For any `TTL` lookup, bypass the typed wrapper: `client.Do(ctx, "TTL", key).Int64()` returns the integer reply directly. Same idiom maps to the .NET `Execute("TTL", ...)` workaround. (Audit-checklist row 15.)

## Rust-specific notes

@@ -313,6 +321,27 @@ Every client has a `MockPrimaryStore` that:
- Is thread-safe (mutex around the records map, atomic on the counter).
- Lives entirely in-process — except in PHP, where it persists in Redis under `demo:primary:*` keys for cross-request survival.

### Locked-emit ordering for producer/consumer use cases

When the mock primary store doubles as the *producer* of a change feed that some downstream consumer (CDC worker, sync worker, replicator) drains — as in the prefetch-cache use case — every mutation method must emit its change event **while the mutation lock is still held**. The append-to-queue cannot happen after the lock is released, even though the queue itself is thread-safe.

Without this, two concurrent `update_field` calls can mutate the records map in one order (T1 then T2 → primary state ends at T2's value) and then enqueue their events in the opposite order (T2 then T1 → consumer applies T1 last → cache ends at T1's value, divergent from primary).

The reference Python pattern is an `_emit_change_locked(...)` helper called inside each `with self._lock:` block. The equivalent in other languages:

| Language | Pattern |
|---|---|
| Python | `_emit_change_locked` inside `with self._lock:` |
| Node.js | mutation + emit are synchronous within the same function; no `await` between them (single-threaded event loop guarantees serial execution) |
| Go | `defer mu.Unlock()` + `emitChangeLocked` before the deferred unlock |
| Java | `synchronized (lock) { ...mutate...; emitChangeLocked(...); }` |
| C# | `lock (_lock) { ...mutate...; EmitChangeLocked(...); }` |
| PHP | Lua scripts that combine the record write and the `LPUSH` server-side (no in-process lock to hold across requests) |
| Ruby | `@lock.synchronize { ...mutate...; emit_change_locked(...); }` |
| Rust | `emit_locked(...)` while the `MutexGuard` is still in scope (call before drop) |

See audit-checklist row 16 for the audit prompt.
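A Python sketch of the pattern (the method name follows the reference's `_emit_change_locked`; the class body and event shape are illustrative):

```python
import threading
from collections import deque

class MockPrimaryStore:
    """Emit the change event while the mutation lock is still held,
    so queue order always matches commit order."""
    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}
        self.changes = deque()  # stands in for the Redis list/stream

    def _emit_change_locked(self, event):
        # Caller MUST hold self._lock; that invariant is the whole point.
        self.changes.append(event)

    def update_field(self, record_id, field, value):
        with self._lock:
            rec = self._records.setdefault(record_id, {})
            rec[field] = value
            self._emit_change_locked(
                {"op": "update", "id": record_id, "fields": {field: value}}
            )
        # Appending here instead -- after the `with` block -- reintroduces
        # the reordering race described above.
```

The anti-pattern is moving the `_emit_change_locked` call below the `with` block: both versions look correct in single-threaded tests, which is why this is an audit row rather than something ordinary testing catches.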

## Library versions to standardise (when this skill is updated)

Pin the recommended versions in the `_index.md` Prerequisites section. As of the cache-aside use case:
content/develop/use-cases/_index.md (1 addition, 0 deletions)

@@ -22,3 +22,4 @@ This section provides practical examples and reference implementations for commo…
* [Time series dashboard]({{< relref "/develop/use-cases/time-series-dashboard" >}}) - Build a rolling sensor graph demo with Redis time series data
* [Leaderboards]({{< relref "/develop/use-cases/leaderboard" >}}) - Build a ranked leaderboard with sorted sets and user metadata
* [Job queue]({{< relref "/develop/use-cases/job-queue" >}}) - Run a reliable background job queue with at-least-once delivery and visibility-timeout reclaim
* [Prefetch cache]({{< relref "/develop/use-cases/prefetch-cache" >}}) - Pre-load reference data into Redis so every read is a cache hit, kept current by a CDC sync worker