Skip to content
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 52 additions & 5 deletions .agents/skills/redis-use-case-ports/assets/audit-checklist.md
Original file line number Diff line number Diff line change
Expand Up @@ -205,7 +205,54 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a

---

## 14. Subscribe-acknowledgement race in pub/sub-style helpers
## 14. Empty-fields `HSET` guard in change-event consumers

**What to scan for:** any code path that takes a "fields" payload from a change event / message / callback and forwards it to `HSET` (or the client-equivalent `hSet` / `hSetMultiple` / `HashSet` / `hMSet` / etc.). Typically this is a CDC consumer, sync worker, or write-through path.

**Pass criterion:** before the `HSET` call, the code explicitly guards against `fields` being null, missing, or empty, and returns early on the malformed case (or routes to a dead-letter, etc.). The guard must run before the pipeline / transaction is opened.

**Sample audit prompt:**

> Audit every code path in the 9 client implementations under `content/develop/use-cases/{{USE_CASE_NAME}}/` that forwards a fields payload from a change-event / callback / message to `HSET` (or the client equivalent). For each, confirm there is an explicit early-return guard for null / missing / empty fields **before** any pipeline or transaction is constructed. Flag any port without the guard with file path and line number.

**Why on list:** Every Redis client tested in the prefetch-cache use case raises or panics on `HSET` with an empty fields mapping: redis-py `DataError`, node-redis throws, Predis "wrong number of arguments", redis-rs **panics** on `pipe().hset_multiple(&key, &[])`, Jedis errors, go-redis errors. A defensive `|| {}` fallback that LOOKS like it handles the empty case is actually misleading — Cursor bugbot caught this on the reference implementation. ([PR #3317 comment](https://github.com/redis/docs/pull/3317))

---

## 15. TTL sentinel preservation across libraries

**What to scan for:** any `TTL` / `ttl_remaining` / `ttlRemaining` helper that wraps the client's TTL command. Particularly any code that converts the library's return type (often `time.Duration`, `TimeSpan?`, `Long`) into integer seconds.

**Pass criterion:** the helper returns **`-2`** for a missing key and **`-1`** for a key with no TTL, as integer seconds (or the language's native integer type). Libraries encode these sentinels inconsistently:

- **redis-py**: returns `int` directly with `-2` / `-1` preserved.
- **go-redis**: returns `time.Duration` with `-2` / `-1` as **raw nanoseconds** (not seconds-scaled). A naive `int(d.Seconds())` truncates to `0`.
- **StackExchange.Redis**: `KeyTimeToLive` returns `TimeSpan?` and collapses **both** missing-key and no-TTL into `null` — a null-coalesce loses the `-2` sentinel.
- **node-redis / Jedis / Lettuce / Predis / redis-rb**: return integer-typed seconds with `-2` / `-1` preserved.

The recommended cross-client idiom is to **bypass the library wrapper** and send the raw command (`client.Do(ctx, "TTL", key).Int64()` in Go, `IDatabase.Execute("TTL", key)` in .NET) so the integer reply comes through untouched.

**Sample audit prompt:**

> For each port's `TTLRemaining` (or equivalent) under `content/develop/use-cases/{{USE_CASE_NAME}}/`, confirm it returns `-2` for a missing key and `-1` for a key with no TTL. Test each by reading a non-existent ID and by running `PERSIST` on an existing cache key then reading it. Flag any port that returns `0`, `null`, or collapses the two sentinels into one value.

**Why on list:** Caught in the prefetch-cache cross-port audit. go-redis and StackExchange.Redis both shipped with subtle bugs in their TTL conversion that the audit caught. ([PR #3317 audit B](https://github.com/redis/docs/pull/3317))

---

## 16. Locked-emit ordering for producer/consumer queues

**What to scan for:** any mock primary store, in-memory writer, or producer that (a) mutates internal state under a lock and (b) appends a corresponding event to an out-of-process or out-of-thread queue/stream/channel. Typical methods: `add_record` / `update_field` / `delete_record`, `enqueue`, `publish_change`.

**Pass criterion:** the queue append happens **inside the same locked section** as the state mutation, not after it. Without this, two concurrent mutations can complete in one order but enqueue their events in the opposite order, and a downstream consumer applies them out of order — the cache ends up divergent from the source. For cross-process producers (PHP, etc.), the equivalent is wrapping the mutation + `LPUSH` in a Lua script so the server enforces ordering.

**Sample audit prompt:**

> Audit every mutation method in each port's mock primary store (or equivalent producer) under `content/develop/use-cases/{{USE_CASE_NAME}}/`. For each, confirm the change event is appended to the queue / stream / channel **while the mutation lock is still held** (or, for cross-process ports, wrapped in a Lua script that combines the record write and the LPUSH server-side). Flag any port where the emit happens after the lock release.

**Why on list:** Locked-emit ordering is what guarantees a CDC consumer can replay events deterministically. Caught and fixed in the prefetch-cache reference's `_emit_change_locked` pattern after Codex review; the prefetch-cache cross-port audit confirmed all 9 ports preserve the invariant, including PHP's Lua-script equivalent. ([PR #3317 audit C](https://github.com/redis/docs/pull/3317))

## 17. Subscribe-acknowledgement race in pub/sub-style helpers

**What to scan for:** the constructor or registration path of any subscriber object (pub/sub Subscription, message-listener, channel consumer). Specifically, the code path between "request the SUBSCRIBE / PSUBSCRIBE" and "return the Subscription handle to the caller".

Expand All @@ -219,7 +266,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a

---

## 15. Concurrent-name reservation race in async helpers
## 18. Concurrent-name reservation race in async helpers

**What to scan for:** any helper that does "check map for duplicate → release lock → do async work → acquire lock → insert". This shape is common in Rust (`std::sync::Mutex` is `!Send`, so can't be held across `await`) and any async language where the check and the insert are bracketed by an `await` that releases the lock implicitly.

Expand All @@ -233,7 +280,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a

---

## 16. Detached-worker PID capture
## 19. Detached-worker PID capture

**What to scan for:** in any port that spawns subscriber/worker processes from a request handler (typically PHP under `php -S`, but any helper that uses `proc_open`, `subprocess.Popen`, `child_process.spawn`, `posix_spawn`, etc.), how is the worker's PID recorded? Look for `proc_get_status()['pid']` after `proc_open([...])`, or `pid` properties on subprocess handles.

Expand All @@ -247,7 +294,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a

---

## 17. Silent timeout fallthrough in readiness waits
## 20. Silent timeout fallthrough in readiness waits

**What to scan for:** functions named `waitFor*`, `pollUntil*`, `awaitReady`, etc. that loop with a deadline. Especially ones that return `void` / `None` / `()` instead of a status.

Expand All @@ -261,7 +308,7 @@ A naked `DEL` without the token check is a bug: if the lock expired and was re-a

---

## 18. Pub/sub introspection commands are server-wide
## 21. Pub/sub introspection commands are server-wide

**What to scan for:** any test or smoke-test step that asserts an **absolute** value of `PUBSUB CHANNELS`, `PUBSUB NUMSUB`, or `PUBSUB NUMPAT`. Especially common in pub/sub-style use cases.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ A sub-agent can run this in read-only mode. For each row, produce a 9-column com
| Write path | `HSET` (with all fields) + `EXPIRE`, ideally pipelined or in a single `HSET ... EXPIRE` MULTI. |
| Invalidate | `DEL` (not `EXPIRE 0`, not `UNLINK`). |
| Field update | `HSET key field value` + `EXPIRE` inside a conditional transaction or `Condition.KeyExists`. |
| TTL inspection | `TTL` (not `PTTL`, not `OBJECT`). |
| TTL inspection | `TTL` (not `PTTL`, not `OBJECT`). The wrapper must preserve the `-2` (missing key) and `-1` (no TTL) sentinels as integer seconds; if the client's typed wrapper collapses or rescales them (go-redis's `time.Duration` with nanosecond-encoded sentinels, StackExchange.Redis's `KeyTimeToLive` returning `null` for both cases), bypass it with the raw command (`Do("TTL", ...)` / `Execute("TTL", ...)`). See audit-checklist row 15. |
| Single-flight acquire | Lua script using `SET NX PX`. |
| Single-flight release | Lua script using `GET == token` check + `DEL`. |
| Counters (where stats are in Redis, e.g. PHP) | `HINCRBY`. |
Expand Down Expand Up @@ -187,10 +187,11 @@ The only per-client variation should be the **pill text** at the top of `<body>`
| `Get the source files` subsection | Every `_index.md` has a `### Get the source files` subsection as the first child of `## Running the demo`. It contains a `mkdir <use-case>-demo && cd <use-case>-demo`, a `BASE=https://raw.githubusercontent.com/redis/docs/main/...` variable, and one `curl -O $BASE/<file>` per source file the port needs. |
| Files curled match files run | The set of files in the curl block matches what the existing run command (e.g. `python3 demo_server.py`, `dotnet run`, `php -S ... demo_server.php`) actually requires. No missing config files (`package.json`, `composer.json`, `*.csproj`, `go.mod`, `Cargo.toml`), no extras (`Cargo.lock` only if `cargo` expects it; build outputs never). |
| Rust folder layout | The curl block matches the port's on-disk layout: if files live under `src/`, the block does `mkdir -p .../src && cd ...` then `curl -o src/<file> $BASE/src/<file>`; if files are flat at the project root (driven by explicit `path =` in `Cargo.toml`), `curl -O $BASE/<file>` for all of them. |
| Source-file count in prose matches curl block | Prose like *"The demo consists of N files"* in `### Get the source files` must match the actual number of `curl -O` lines in the block. Easy drift when a port adds an extra worker entry point (e.g. PHP's separate `sync_worker.php`) and the count is not updated. |

**Audit prompt:**

> For each of the 9 client implementations of `content/develop/use-cases/{{USE_CASE_NAME}}/`, grep `_index.md` with `grep -nE "\]\(([^h)][^)]*\.[a-z]+)\)"` — the result must be empty (no relative file links). Then confirm `## Running the demo` is followed by `### Get the source files`, and that the curl block downloads the same files the run command needs. Flag any port where the curl-block file set diverges from the run-time requirements, or where a Rust port's `src/` layout doesn't match its on-disk reality.
> For each of the 9 client implementations of `content/develop/use-cases/{{USE_CASE_NAME}}/`, grep `_index.md` with `grep -nE "\]\(([^h)][^)]*\.[a-z]+)\)"` — the result must be empty (no relative file links). Then confirm `## Running the demo` is followed by `### Get the source files`, and that the curl block downloads the same files the run command needs. Count the `curl -O` lines and confirm the prose intro ("The demo consists of N files") matches. Flag any port where the curl-block file set diverges from the run-time requirements, or where a Rust port's `src/` layout doesn't match its on-disk reality.

## File names per client

Expand Down
Loading
Loading