Allow CLUSTERSCAN 0 to be executed on any node directly by enjoy-binbin · Pull Request #3675 · valkey-io/valkey

enjoy-binbin · 2026-05-12T06:56:52Z

Currently, for the initial cursor (CLUSTERSCAN 0), we compute the
slot of the string "0" (which is 13907) and redirect the request to
the node that owns that slot. However, the initial cursor "0" should,
in principle, be executable on any node, since its sole purpose is to
return the next CLUSTERSCAN cursor:

    /* Handle cursor "0" case. If slot information is provided we return
     * the updated cursor to scan input slot, else scan from slot 0. */
    if (strcmp(objectGetVal(c->argv[1]), "0") == 0) {
        if (opts.input_slot != -1) {
            slot = opts.input_slot;
        } else if (opts.match_slot != -1) {
            slot = opts.match_slot; /* If match maps to a particular slot, start scan from there */
        } else {
            slot = 0;
        }

        addReplyArrayLen(c, 2);
        if (skip_scan) {
            addReplyBulkCString(c, "0");
        } else {
            sds new_cursor = sdscatfmt(sdsempty(), "0-{%s}-0", crc16_slot_table[slot]);
            addReplyBulkSds(c, new_cursor);
        }
        addReplyArrayLen(c, 0);
        return;
    }
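
The slot numbers in this description can be reproduced with a small standalone sketch. It assumes the usual cluster key-hashing rule (CRC16/XMODEM of the key, or of the non-empty text between the first `{` and the following `}`, reduced to 16384 slots). The names `sketch_crc16` and `sketch_hash_slot` are hypothetical and are not the server's own functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* CRC16/XMODEM: polynomial 0x1021, initial value 0, no bit reflection. */
static uint16_t sketch_crc16(const char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((unsigned char)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical helper: hash only the text between the first '{' and the
 * next '}' when that text is non-empty; otherwise hash the whole string. */
static unsigned int sketch_hash_slot(const char *key) {
    size_t len = strlen(key);
    const char *open = memchr(key, '{', len);
    if (open != NULL) {
        size_t rest = len - (size_t)(open + 1 - key);
        const char *close = memchr(open + 1, '}', rest);
        if (close != NULL && close > open + 1) {
            return sketch_crc16(open + 1, (size_t)(close - open - 1)) & 16383;
        }
    }
    return sketch_crc16(key, len) & 16383;
}
```

Under these assumptions, `sketch_hash_slot("0")` is 13907, which is why the bare cursor is redirected today, while `sketch_hash_slot("0-{06S}-0")` is 0 and `sketch_hash_slot("0-{8M}-0")` is 5461, matching the redirects shown in the transcripts.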

Add clusterscanGetKeys: cursor "0" returns no keys (so it is handled
locally); non-"0" cursors return the cursor itself so that the embedded
{hashtag} routes the request to the correct slot owner.
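
A minimal sketch of the decision this callback makes, not the actual implementation. The struct, the flag value, and the function name below are hypothetical stand-ins; the real callback uses the server's own key-reference types and the CMD_KEY_NOT_KEY flag.

```c
#include <string.h>

/* Hypothetical stand-in for the server's CMD_KEY_NOT_KEY flag. */
#define SKETCH_KEY_NOT_KEY (1u << 0)

/* Hypothetical, simplified key reference: an argv position plus flags. */
typedef struct {
    int pos;
    unsigned int flags;
} sketch_key_ref;

/* For "CLUSTERSCAN <cursor> ...", argv[0] is the command name and argv[1]
 * is the cursor. Returns how many entries were written to out. */
static int sketch_clusterscan_get_keys(int argc, const char **argv,
                                       sketch_key_ref *out) {
    if (argc < 2) return 0;
    /* Initial cursor: nothing to route on, so any node can answer. */
    if (strcmp(argv[1], "0") == 0) return 0;
    /* Any other cursor: report it as a routing-only token so that its
     * embedded {hashtag} selects the slot owner. */
    out[0].pos = 1;
    out[0].flags = SKETCH_KEY_NOT_KEY;
    return 1;
}
```

With this shape, `CLUSTERSCAN 0` reports zero entries and is never redirected, while `CLUSTERSCAN 0-{06S}-0` reports position 1 as a routing token.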

doesCommandHaveKeys: when a command has a getkeys_proc but all of its
key-specs are NOT_KEY (e.g. CLUSTERSCAN), treat it as having no real
keys, so that ACL key checks, COMMAND GETKEYS, and the Module API are
not misled by routing-only tokens.
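
The rule can be sketched as a predicate over a command's key-specs. All types and names here are hypothetical simplifications. The behavior for a command that has a getkeys_proc and no key-specs at all is an assumption of this sketch, not something stated in this PR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the server's CMD_KEY_NOT_KEY flag. */
#define SKETCH_KEY_NOT_KEY (1u << 0)

typedef struct {
    unsigned int flags;
} sketch_key_spec;

typedef struct {
    const sketch_key_spec *specs;
    size_t num_specs;
    bool has_getkeys_proc;
} sketch_command;

static bool sketch_command_has_keys(const sketch_command *cmd) {
    /* Any key-spec that is not NOT_KEY describes a real key. */
    for (size_t i = 0; i < cmd->num_specs; i++) {
        if (!(cmd->specs[i].flags & SKETCH_KEY_NOT_KEY)) return true;
    }
    /* Every key-spec is NOT_KEY: routing-only tokens, no real keys,
     * even when a getkeys_proc is present. */
    if (cmd->num_specs > 0) return false;
    /* Assumption: with no key-specs, fall back to the getkeys_proc. */
    return cmd->has_getkeys_proc;
}
```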

ACLSelectorCheckKey: As a defense-in-depth measure, skip key-pattern
ACL validation for entries flagged CMD_KEY_NOT_KEY, since they are
routing tokens (e.g. CLUSTERSCAN cursor) rather than real user keys.
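
As a rough illustration of the early return, the sketch below models a selector with a single literal key prefix. That is a deliberate simplification: real ACL selectors match glob patterns and permission bits. Every name here is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the server's CMD_KEY_NOT_KEY flag. */
#define SKETCH_KEY_NOT_KEY (1u << 0)

/* Hypothetical, simplified selector: one literal prefix the user may access. */
typedef struct {
    const char *allowed_prefix;
} sketch_selector;

static bool sketch_selector_check_key(const sketch_selector *sel,
                                      const char *arg, unsigned int flags) {
    /* Routing-only tokens are not user keys: skip pattern validation. */
    if (flags & SKETCH_KEY_NOT_KEY) return true;
    /* Real keys must match the (simplified) allowed prefix. */
    return strncmp(arg, sel->allowed_prefix, strlen(sel->allowed_prefix)) == 0;
}
```

A user restricted to `app:` keys would be denied the cursor `0-{06S}-0` if it were checked as an ordinary key, and is allowed once the argument carries the routing-only flag.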

In an empty three-shard cluster, before:

❯ ./src/valkey-cli -p 30001 -c
127.0.0.1:30001> clusterscan 0
-> Redirected to slot [13907] located at 127.0.0.1:30003
1) "0-{06S}-0"
2) (empty array)
127.0.0.1:30003> clusterscan 0-{06S}-0
-> Redirected to slot [0] located at 127.0.0.1:30001
1) "0-{8M}-0"
2) (empty array)
127.0.0.1:30001> clusterscan 0-{8M}-0
-> Redirected to slot [5461] located at 127.0.0.1:30002
1) "0-{63n}-0"
2) (empty array)
127.0.0.1:30002> clusterscan 0-{63n}-0
-> Redirected to slot [10923] located at 127.0.0.1:30003
1) "0"
2) (empty array)

In an empty three-shard cluster, after:

❯ ./src/valkey-cli -p 30001 -c
127.0.0.1:30001> clusterscan 0
1) "0-{06S}-0"
2) (empty array)
127.0.0.1:30001> clusterscan 0-{06S}-0
1) "0-{8M}-0"
2) (empty array)
127.0.0.1:30001> clusterscan 0-{8M}-0
-> Redirected to slot [5461] located at 127.0.0.1:30002
1) "0-{63n}-0"
2) (empty array)
127.0.0.1:30002> clusterscan 0-{63n}-0
-> Redirected to slot [10923] located at 127.0.0.1:30003
1) "0"
2) (empty array)

CLUSTERSCAN was introduced in #2934.


madolson

I prefer the way it is today for clients specifically. I think it's easier on the client side to mark the second key as opposed to a purpose built get keys command. The only benefit we get is that you can send the command to a random node.

Signed-off-by: Binbin <binloveplay1314@qq.com>

coderabbitai · 2026-05-13T03:42:40Z

📝 Walkthrough

CLUSTERSCAN cursors are treated as routing-only tokens (CMD_KEY_NOT_KEY) for ACL checks and key extraction. A new clusterscanGetKeys callback reports no keys for cursor "0" and reports any other cursor as a CMD_KEY_NOT_KEY entry. The doesCommandHaveKeys logic is adjusted, the command metadata is wired to the new callback, and tests are added for ACL and cluster behavior.

Changes

CLUSTERSCAN cluster routing

| Layer / File(s) | Summary |
|---|---|
| ACL bypass for routing-only tokens (`src/acl.c`) | `ACLSelectorCheckKey()` returns early for keyspec entries marked with `CMD_KEY_NOT_KEY`, bypassing normal key-pattern ACL validation for routing-only tokens. |
| CLUSTERSCAN key extraction and doesCommandHaveKeys adjustment (`src/server.h`, `src/db.c`) | Added `clusterscanGetKeys`: cursor `"0"` => numkeys=0; non-`"0"` cursor => report argv[1] as CMD_KEY_NOT_KEY. `doesCommandHaveKeys()` now treats commands as having keys only if at least one keyspec is not `CMD_KEY_NOT_KEY`; commands with only `CMD_KEY_NOT_KEY` specs and a getkeys_proc report no key args. |
| Command metadata wiring (`src/commands/clusterscan.json`, `src/commands.def`) | CLUSTERSCAN command entry now references `clusterscanGetKeys` as its `get_keys_function` callback instead of `NULL`. |
| ACL regression and cluster behavior tests (`tests/unit/cluster/clusterscan.tcl`, `tests/unit/introspection-2.tcl`) | Added ACL regression tests confirming restricted users can run CLUSTERSCAN without ACL errors on cursors, cluster tests asserting `clusterscan 0` returns initial cursor and empty keys on all nodes, and introspection tests asserting `COMMAND GETKEYS*` reports no key args for NOT_KEY-routed commands including CLUSTERSCAN cursor forms. |

Sequence Diagram

sequenceDiagram
  participant Client
  participant CLUSTERSCAN_Command
  participant clusterscanGetKeys
  participant ACLSelectorCheckKey
  participant ClusterRouter

  Client->>CLUSTERSCAN_Command: clusterscan 0
  CLUSTERSCAN_Command->>clusterscanGetKeys: extract keys from argv
  clusterscanGetKeys->>clusterscanGetKeys: cursor == "0"? Yes
  clusterscanGetKeys-->>CLUSTERSCAN_Command: numkeys = 0
  CLUSTERSCAN_Command->>ACLSelectorCheckKey: validate permissions (no keys)
  ACLSelectorCheckKey-->>CLUSTERSCAN_Command: allow
  CLUSTERSCAN_Command-->>Client: initial cursor, empty key array

  Client->>CLUSTERSCAN_Command: clusterscan 0-{06S}-0
  CLUSTERSCAN_Command->>clusterscanGetKeys: extract keys from argv
  clusterscanGetKeys->>clusterscanGetKeys: cursor == "0"? No
  clusterscanGetKeys-->>CLUSTERSCAN_Command: numkeys = 1, pos=1 flagged CMD_KEY_NOT_KEY
  CLUSTERSCAN_Command->>ACLSelectorCheckKey: validate permissions for reported arg
  ACLSelectorCheckKey->>ACLSelectorCheckKey: sees CMD_KEY_NOT_KEY -> bypass
  ACLSelectorCheckKey-->>CLUSTERSCAN_Command: allow routing token
  CLUSTERSCAN_Command->>ClusterRouter: route by embedded hashtag in cursor
  ClusterRouter-->>Client: scan results from target slot

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Allow CLUSTERSCAN 0 to be executed on any node directly' clearly and specifically describes the main change, which is enabling the initial cursor '0' to be handled locally without redirection. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly explains the problem (unnecessary redirection of CLUSTERSCAN 0), the solution (clusterscanGetKeys implementation), and provides concrete before/after examples demonstrating the behavior change. |

enjoy-binbin · 2026-05-13T03:44:00Z

@nmvk @madolson I don't know the ACL details as well as you do. I agree this is a good-to-have option; let me know if we want to close it.

coderabbitai

🧹 Nitpick comments (1)

tests/unit/cluster/clusterscan.tcl (1)

67-94: 💤 Low value

Consider cleaning up the ACL user after the test.

The test creates user scan_acl_leak but doesn't delete it afterward. While this test block likely gets reset between runs, adding cleanup improves test isolation.

♻️ Suggested cleanup

         $rd read

         $rd close
+
+        # Clean up the test user
+        R 0 ACL DELUSER scan_acl_leak
     }

📥 Commits

Reviewing files that changed from the base of the PR and between a813df0 and 688ee70.

📒 Files selected for processing (6)

src/acl.c
src/commands.def
src/commands/clusterscan.json
src/db.c
src/server.h
tests/unit/cluster/clusterscan.tcl

codecov · 2026-05-13T04:07:28Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.68%. Comparing base (a813df0) to head (87a1338).
⚠️ Report is 2 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #3675      +/-   ##
============================================
- Coverage     76.94%   76.68%   -0.26%     
============================================
  Files           162      162              
  Lines         80656    80669      +13     
============================================
- Hits          62058    61865     -193     
- Misses        18598    18804     +206

| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/acl.c | `92.53% <100.00%> (-0.12%)` | ⬇️ |
| src/commands.def | `100.00% <ø> (ø)` | |
| src/db.c | `94.85% <100.00%> (+0.04%)` | ⬆️ |
| src/server.h | `100.00% <ø> (ø)` | |

... and 21 files with indirect coverage changes

Signed-off-by: Binbin <binloveplay1314@qq.com>

nmvk

LGTM, Thanks @enjoy-binbin.

enjoy-binbin mentioned this pull request May 12, 2026

Improve CLUSTERSCAN error handling test with broader coverage #3674

Merged

enjoy-binbin commented May 12, 2026

Comment thread src/db.c

github-actions Bot assigned enjoy-binbin May 12, 2026

madolson reviewed May 12, 2026

Comment thread src/db.c Outdated

enjoy-binbin added 2 commits May 13, 2026 11:39

Merge remote-tracking branch 'upstream/unstable' into clusterscan0

8d95426

Signed-off-by: Binbin <binloveplay1314@qq.com>

Add NOT_KEY flag and skip it in ACL check

688ee70

Signed-off-by: Binbin <binloveplay1314@qq.com>

coderabbitai Bot reviewed May 13, 2026

nmvk reviewed May 13, 2026

Comment thread src/acl.c

Handle doesCommandHaveKeys as well

87a1338

Signed-off-by: Binbin <binloveplay1314@qq.com>

nmvk approved these changes May 13, 2026

nmvk mentioned this pull request May 13, 2026

Skip NOT_KEY commands in client tracking #3699

Open

enjoy-binbin wants to merge 4 commits into valkey-io:unstable from enjoy-binbin:clusterscan0

3 participants
