draft: make dict a fully concrete, transparent set of map helpers for hashtable, removing dict elements where redundant #3720

Draft
rainsupreme wants to merge 4 commits into valkey-io:unstable from valkey-rainfall:no-dictsetkey-v2

@rainsupreme rainsupreme commented May 14, 2026

Disclaimer: this is mainly for discussion. It represents one potential final state for dict: a set of convenience helpers for using hashtable as a map.

Summary

This is an alternative approach to #3566 (eliminate dictSetKey). Instead of migrating replicaKeysWithExpire to a new hashtable struct, this PR:

  1. Eliminates dictSetKey by using dictAddRaw to get the insert position, then setting the owned key directly on the new entry — no post-insert key mutation needed.

  2. Removes all dict accessor wrappers (dictGetKey, dictGetVal, dictSetVal, dictGetUnsignedIntegerVal, dictSetUnsignedIntegerVal, dictIncrUnsignedIntegerVal, etc.) and replaces all call sites with direct field access (de->key, de->v.val, de->v.u64).
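The add-raw pattern from item 1 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's code: `toyEntry`, `toyMap`, `toyAddRaw`, `copyString`, and `rememberKey` are hypothetical stand-ins, and the single linked chain replaces the real hashtable. Only the field names `key` and `v.u64` come from the PR description.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for dictEntry: a plain struct with public fields. */
typedef struct toyEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
    } v;
    struct toyEntry *next;
} toyEntry;

/* Hypothetical stand-in for the table: one chain instead of hashed buckets. */
typedef struct toyMap {
    toyEntry *head;
} toyMap;

/* Heap-allocates a private copy of a NUL-terminated string. */
static char *copyString(const char *s) {
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    assert(copy != NULL);
    memcpy(copy, s, len);
    return copy;
}

/* Assumed analogue of dictAddRaw: returns a new entry if the key is absent;
 * otherwise returns NULL and reports the existing entry through *existing. */
static toyEntry *toyAddRaw(toyMap *m, const char *key, toyEntry **existing) {
    for (toyEntry *e = m->head; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            if (existing) *existing = e;
            return NULL;
        }
    }
    toyEntry *e = calloc(1, sizeof(*e));
    assert(e != NULL);
    e->key = (void *)key; /* borrowed; the caller decides ownership */
    e->next = m->head;
    m->head = e;
    return e;
}

/* Records a flag bit for a key. The key is copied only when a new entry is
 * created, and the copy is stored by plain field assignment. */
static uint64_t rememberKey(toyMap *m, const char *key, uint64_t bit) {
    toyEntry *existing = NULL;
    toyEntry *de = toyAddRaw(m, key, &existing);
    if (de != NULL) {
        de->key = copyString(key); /* owned copy, equal to the lookup key */
        de->v.u64 = 0;
    } else {
        de = existing;
    }
    de->v.u64 |= bit;
    return de->v.u64;
}
```

Because the owned copy compares equal to the lookup key, replacing the pointer does not change where the entry belongs in the table, so no separate set-key helper is needed in this sketch.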

Motivation

If we treat dictEntry as a transparent struct (which it is — it's defined in the public header), then the accessor functions add no abstraction value. They just wrap de->field with extra syntax.

This reframes dict as a set of operation helpers (find, add, delete, iterate) that manage the dictEntry allocation + hashtable position dance, rather than an opaque abstraction that hides its internals behind getters/setters.

What remains in dict.h after this

  • Type aliases: dict = hashtable, dictType = hashtableType, dictIterator = hashtableIterator
  • Macro aliases: dictSize, dictCreate, dictRelease, etc. (just naming)
  • Operation wrappers: dictAdd, dictAddRaw, dictFind, dictDelete, dictNext, etc. (handle zmalloc + cast)
  • Utility: dictEntryGetKey (hashtableType callback), dictEntryMemUsage, dictMemUsage

For discussion

Opening as draft to explore what "dict as transparent convenience layer" looks like in practice. This is relevant to the ongoing discussion about dict's long-term role (#3561, #3566).

22 files changed, ~240 insertions, ~290 deletions. All tests pass.

coderabbitai Bot commented May 14, 2026

Important

Review skipped

Ignore keyword(s) in the title.

⛔ Ignored keywords (1)
  • draft

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 7638db25-13d9-46b0-bdb2-abaf15664286


Replace the dictAddOrFind + dictSetKey pattern in rememberReplicaKeyWithExpire
with dictAddRaw, which separates existence-check from entry creation. On new
entries, set de->key directly to the owned sds copy — no post-insert mutation
needed.

This is the only caller of dictSetKey, so remove it from the dict API entirely.

81 expire tests pass.

Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Rain Valentine <rainval@amazon.com>
dictEntry is a transparent struct — the accessor functions (dictGetKey,
dictGetVal, dictSetVal, dictGetUnsignedIntegerVal, dictSetUnsignedIntegerVal,
dictIncrUnsignedIntegerVal, etc.) add no abstraction value. They just wrap
direct field access with extra function call syntax.

Remove all accessor wrappers from dict.h and replace all call sites with
direct field access:

  dictGetKey(de)                  -> de->key
  dictGetVal(de)                  -> de->v.val
  dictSetVal(d, de, v)            -> de->v.val = v
  dictGetUnsignedIntegerVal(de)   -> de->v.u64
  dictSetUnsignedIntegerVal(de,v) -> de->v.u64 = v
  dictIncrUnsignedIntegerVal(de,v)-> de->v.u64 += v (or --de->v.u64)

This makes the code shorter, removes a layer of indirection in reading,
and makes it explicit that dictEntry is a plain data struct — not an
opaque handle.

dictEntryGetKey (the hashtableType callback) is retained since it serves
as a function pointer for the hashtable infrastructure.
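A rough sketch of why a key-extraction function can still be needed even when the entry struct is transparent: generic table code only sees `void *` entries and must ask the type descriptor how to find the key. The `sketchType` struct, its `entryGetKey` member, and `sketchFind` are hypothetical stand-ins written for this illustration; they are not the real hashtableType definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed entry layout, matching the field names used in the PR. */
typedef struct sketchEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
    } v;
} sketchEntry;

/* Hypothetical type descriptor: the table calls back to locate the key.
 * A NULL callback means the entry pointer is itself the key. */
typedef struct sketchType {
    const void *(*entryGetKey)(const void *entry);
} sketchType;

/* Callback for map-style entries: the key lives in the entry's key field. */
static const void *sketchEntryGetKey(const void *entry) {
    const sketchEntry *de = entry;
    return de->key;
}

/* Generic lookup over opaque entries, using only the callback. */
static void *sketchFind(const sketchType *type, void **entries, size_t n, const char *key) {
    assert(type != NULL);
    for (size_t i = 0; i < n; i++) {
        const char *k = type->entryGetKey ? type->entryGetKey(entries[i]) : entries[i];
        if (strcmp(k, key) == 0) return entries[i];
    }
    return NULL;
}
```

Call sites that already hold a `sketchEntry *` can read `de->key` directly; the function form exists only so it can be stored as a pointer in the type descriptor.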

22 files changed, 227 insertions(+), 284 deletions(-).
954 tests pass (expire, scripting, multi, pubsub, latency, maxmemory, wait).

Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Rain Valentine <rainval@amazon.com>
Match the concise copyright format used by hashtable.h/hashtable.c.

Signed-off-by: Rain Valentine <rainval@amazon.com>
Contributor Author

@rainsupreme rainsupreme left a comment

It's a little rough maybe, but fine for discussion

Comment thread .gitignore Outdated
Comment thread src/dict.h Outdated
Contributor

@zuiderkwast zuiderkwast left a comment

Interesting!

You removed the most common accessors like dictGetVal and the diff is just about 300 lines in the whole code base! I'm surprised it's not more.

Is the point to remove all abstractions around dictEntry, so that we don't have a leaky abstraction? Either fully opaque or fully concrete, but not a leaky abstraction with non-opaque types?

I could buy that... My main argument against it is to avoid touching all these lines.

In the past, long ago, the dictEntry was not opaque, yet there were accessors like dictGetVal, though they were just simple defines: https://github.com/valkey-io/valkey/blob/6.0/src/dict.h#L144

The reason I made it opaque (in 7.2) was to be able to mess with the internals and do pointer tagging for the dictEntry pointer. I wouldn't have proposed it otherwise. But also, I wouldn't have proposed swapping all dictGetVal to direct field access either... Avoiding unnecessary changes was a strong desire of the maintainers at the time.

@rainsupreme
Contributor Author

@zuiderkwast Yeah, exactly! 😁 My opinion is that it should either be fully opaque or fully concrete "in the end", and this is my idea of what the fully-concrete version might look like.

The first commit alone is more conservative - it's just a version that gets rid of dictSetKey by taking advantage of the fact that dict is transparent. Ignoring dictSetKey for now, I'm more interested in what you think of this style of using dict where I reach through to the hashtable underneath. For some reason I felt like I was sinning when I wrote it, but maybe I'm just used to the old dict 😆

I think I disagree with the old maintainers about avoiding change 😅 I lean towards frequently cleaning and organizing as our needs change and evolve, even if it doesn't fix a bug, add a feature, or improve performance. I've had challenging experiences in the past where the team had leaned too heavily in the "avoid changing too much" direction, and it resulted in massive built up tech debt that was miserable to work with and seriously slowed us down. Of course, to go the always-cleaning route you need very strong unit tests and integration tests to make reviewing easier and avoid accidental regression, but Valkey is doing pretty well in that regard.

As for solid reasons to go the opaque route instead - I think embedding keys into dictEntry could make sense to eliminate 8B overhead and one memory fetch. If we did that then maybe opaque starts to make more sense, as it would help prevent bugs from people incorrectly accessing/modifying it. 🤷‍♀️ On the other hand, maybe those cases should be migrated to hashtable instead - it might make more sense to embed the key data in whatever object we're indexing. I raised this PR as an example: #3706
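For illustration, the key-embedding idea could look something like the sketch below, which uses a C99 flexible array member. Both layouts and all names (`ptrKeyEntry`, `embKeyEntry`, `embKeyEntryCreate`) are assumptions made up for this example and are not taken from the Valkey source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Pointer-key layout: the key is a separate allocation, reached through one
 * extra pointer stored in the entry. */
typedef struct ptrKeyEntry {
    char *key;
    union {
        void *val;
        uint64_t u64;
    } v;
} ptrKeyEntry;

/* Embedded-key layout: key bytes follow the value in the same allocation,
 * so there is no key pointer to store or to dereference. */
typedef struct embKeyEntry {
    union {
        void *val;
        uint64_t u64;
    } v;
    char key[]; /* flexible array member, NUL-terminated */
} embKeyEntry;

/* Allocates an entry with the key bytes stored inline after the header. */
static embKeyEntry *embKeyEntryCreate(const char *key) {
    size_t len = strlen(key) + 1;
    embKeyEntry *e = malloc(sizeof(*e) + len);
    assert(e != NULL);
    e->v.u64 = 0;
    memcpy(e->key, key, len);
    return e;
}
```

On a typical 64-bit platform the dropped pointer field is the 8 bytes mentioned above. The trade-off is that code can no longer swap `key` by assignment, which is where an opaque API could help prevent misuse.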

@rainsupreme rainsupreme changed the title dict: remove dictSetKey and accessor wrappers draft: make dict a fully concrete, transparent set of map helpers for hashtable, removing dict elements where redundant May 14, 2026
… directly

Remove the typedef aliases (dict -> hashtable, dictType -> hashtableType,
dictIterator -> hashtableIterator) and update all callers to use the
hashtable type names directly.

This makes it explicit that dict.h is a helper layer on top of hashtable,
not a separate type system. The dict* operation helpers (dictFind, dictAdd,
dictDelete, etc.) remain as convenience functions that manage dictEntry
allocation.

29 files changed, 279/279 unit tests pass.

Signed-off-by: Rain Valentine <rainval@amazon.com>
@rainsupreme
Contributor Author

Just to see what it looks like, for discussion, I added a fourth commit that removes the dict and dictType typedefs. That makes it about a 600-line change.

codecov Bot commented May 15, 2026

Codecov Report

❌ Patch coverage is 68.61314% with 129 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.68%. Comparing base (fdf13ca) to head (d507179).

Files with missing lines          Patch %   Lines
src/sentinel.c                    0.00%     85 Missing ⚠️
src/valkey-cli.c                  61.90%    16 Missing ⚠️
src/module.c                      74.28%    9 Missing ⚠️
src/fuzzer_command_generator.c    66.66%    3 Missing ⚠️
src/valkey-benchmark.c            25.00%    3 Missing ⚠️
src/cluster.c                     71.42%    2 Missing ⚠️
src/cluster_legacy.c              96.77%    2 Missing ⚠️
src/config.c                      88.88%    2 Missing ⚠️
src/rdb.c                         77.77%    2 Missing ⚠️
src/server.c                      90.00%    2 Missing ⚠️
... and 3 more

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3720      +/-   ##
============================================
+ Coverage     76.65%   76.68%   +0.03%     
============================================
  Files           162      162              
  Lines         80662    80645      -17     
============================================
+ Hits          61830    61846      +16     
+ Misses        18832    18799      -33     

Files with missing lines   Coverage Δ
src/aof.c                  80.30% <100.00%> (-0.07%) ⬇️
src/blocked.c              90.30% <100.00%> (ø)
src/db.c                   94.95% <100.00%> (+0.14%) ⬆️
src/defrag.c               81.12% <100.00%> (ø)
src/dict.h                 100.00% <100.00%> (ø)
src/eval.c                 91.82% <100.00%> (ø)
src/functions.c            96.64% <100.00%> (ø)
src/latency.c              83.33% <100.00%> (ø)
src/module.h               0.00% <ø> (ø)
src/multi.c                97.90% <100.00%> (ø)
... and 17 more

... and 13 files with indirect coverage changes
