| id | 000119 |
|---|---|
| title | Composite Secondary Index Core |
| status | proposal |
| created | 2026-04-15 |
| github_issue | 557 |
Implement RFC 0014 Phase 3 for user-table secondary indexes: a
self-contained composite MemTree/DiskTree core that groups the existing
single-tree runtime index with the checkpointed secondary DiskTree root and
implements the current unique and non-unique secondary-index trait contracts.
This task proves the dual-tree method semantics with real MemTree and DiskTree fixtures, but it does not wire the composite into table foreground access or recovery. The existing single-tree runtime path remains the only table runtime path until RFC 0014 Phase 4.
RFC 0014 has already delivered the durable prerequisites:
- Phase 1 added user-table secondary `DiskTree` roots and concrete persisted `DiskTree` readers/writers.
- Phase 2 publishes those roots as table-checkpoint sidecar state.
RFC 0014 Phase 3 focuses on the reusable and testable part of the composite runtime: method-by-method unique and non-unique semantics over the two tree layers. Phase 4 remains responsible for table runtime wiring and composite recovery.
The current table runtime still stores GenericSecondaryIndex values backed by
one mutable B+Tree per secondary index. This task wraps that existing
single-tree implementation as MemTree, groups it with the persisted cold
DiskTree context in a DualTree-style type, and implements the existing
UniqueIndex and NonUniqueIndex traits on the grouped types.
Issue Labels:
- type:task
- priority:high
- codex
Parent RFC:
- docs/rfcs/0014-dual-tree-secondary-index.md
- Add a user-table-only dual-tree secondary-index core that groups a MemTree backend with one persisted secondary DiskTree root context.
- Implement a unique dual-tree index that implements the existing `UniqueIndex` trait.
- Implement a non-unique dual-tree index that implements the existing `NonUniqueIndex` trait.
- Add an enum wrapper for one secondary index: `DualTreeSecondaryIndex::{Unique, NonUnique}` or an equivalent local name.
- Keep the cold context table-specific. The composite API must not carry a caller-supplied `FileKind`; secondary DiskTree roots are user-table-only, and internal open helpers should use `FileKind::TableFile`.
- Keep key encoding single-sourced. The DiskTree context must not store a duplicate `BTreeKeyEncoder`; DiskTree readers derive encoders from table metadata, and MemTree helper scans return already encoded keys from their own backing tree.
- Preserve current unique index semantics over two layers:
  - MemTree live hit is terminal.
  - MemTree delete-shadow hit is terminal.
  - DiskTree is probed only on MemTree miss.
  - DiskTree duplicate candidates are reported without DiskTree mutation.
  - DiskTree owners can be claimed by installing MemTree overlay state.
  - cold-owner delete shadows are installed in MemTree only.
- Preserve current non-unique index semantics over two layers:
  - exact entries from MemTree and DiskTree are merged.
  - active MemTree exact entries are returned.
  - delete-marked MemTree exact entries suppress only matching DiskTree exact `(logical_key, row_id)` entries.
  - results are deduplicated and ordered by exact key order.
  - foreground-style mutations affect MemTree only.
- Add narrow MemTree helper APIs only where required for correct composite semantics:
  - insert a unique delete-shadow for a cold owner when MemTree misses and DiskTree matches;
  - insert a non-unique delete-marked exact overlay for a cold exact entry;
  - scan MemTree entries with encoded keys, row ids, and delete state for composite merge and suppression.
- Prove with tests that composite mutations leave DiskTree roots unchanged.
- Do not wire the dual-tree core into `Table`, `GenericMemTable`, table foreground DML, table scans, rollback, purge, or recovery.
- Do not replace the current `GenericSecondaryIndex` table runtime array.
- Do not change catalog-table secondary indexes.
- Do not change recovery source selection or remove the persisted-LWC cold secondary-index rebuild path.
- Do not rebuild the current single runtime tree from DiskTree.
- Do not perform foreground persistent DiskTree writes.
- Do not change table-file metadata or storage format.
- Do not add storage-version compatibility work.
- Do not implement MemTree cleanup or eviction.
- Do not perform generic B+Tree backend unification.
- Do not expose non-unique DiskTree delete-mask APIs or value payloads.

No new unsafe code is planned. The implementation should reuse existing safe MemTree, DiskTree, table-file, and readonly-buffer APIs.
If implementation unexpectedly touches low-level B+Tree node internals that
require unsafe changes, keep those changes private, document every invariant
with `// SAFETY:`, and run:
`cargo clippy -p doradb-storage --all-targets -- -D warnings`

- Add a composite secondary-index module.
  - Create `doradb-storage/src/index/composite_secondary_index.rs`.
  - Export it internally from `doradb-storage/src/index/mod.rs`.
  - Keep all new runtime types `pub(crate)` or narrower.
- Define the cold-root context.
  - Add a `SecondaryDiskTreeContext` or similarly named struct containing:
    - `root: BlockID`
    - `index_no: usize`
    - `metadata: Arc<TableMetadata>`
    - `file: Arc<SparseFile>`
    - `disk_pool: QuiescentGuard<ReadonlyBufferPool>`
  - Do not store `FileKind`; internal DiskTree open helpers use `FileKind::TableFile`.
  - Do not store `BTreeKeyEncoder`; open helpers derive encoding through `UniqueDiskTree::new` and `NonUniqueDiskTree::new`, and merge helpers use encoded entries returned by each tree layer.
- Define dual-tree grouping types.
  - Add `DualTreeUniqueIndex<P: BufferPool>` with:
    - `mem: GenericUniqueBTreeIndex<P>`
    - `disk: SecondaryDiskTreeContext`
  - Add `DualTreeNonUniqueIndex<P: BufferPool>` with:
    - `mem: GenericNonUniqueBTreeIndex<P>`
    - `disk: SecondaryDiskTreeContext`
  - Add `DualTreeSecondaryIndex<P: BufferPool>` with unique and non-unique variants for future table runtime use.
  - Add constructors that validate `index_no`, metadata index kind, and MemTree/DiskTree kind alignment.
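
The constructor validation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's code: `IndexKind`, `BuildError`, the `index_kinds` field, and the `u64` root are hypothetical stand-ins, and the file and buffer-pool handles are omitted.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the index kind recorded in table metadata.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IndexKind {
    Unique,
    NonUnique,
}

/// Stand-in for `TableMetadata`; only per-index kinds are modeled.
struct TableMetadata {
    index_kinds: Vec<IndexKind>,
}

/// Stand-in cold-root context; `file` and `disk_pool` are left out.
struct SecondaryDiskTreeContext {
    root: u64, // stands in for `BlockID`
    index_no: usize,
    metadata: Arc<TableMetadata>,
}

/// Hypothetical error type for constructor validation.
#[derive(Debug, PartialEq, Eq)]
enum BuildError {
    IndexOutOfRange(usize),
    KindMismatch { expected: IndexKind, found: IndexKind },
}

/// Stand-in for `DualTreeUniqueIndex<P>`; `M` plays the MemTree role.
struct DualTreeUniqueIndex<M> {
    mem: M,
    disk: SecondaryDiskTreeContext,
}

impl<M> DualTreeUniqueIndex<M> {
    /// Accepts an already-built MemTree and cold context, and rejects
    /// an out-of-range `index_no` or a non-unique metadata kind.
    fn new(mem: M, disk: SecondaryDiskTreeContext) -> Result<Self, BuildError> {
        let found = *disk
            .metadata
            .index_kinds
            .get(disk.index_no)
            .ok_or(BuildError::IndexOutOfRange(disk.index_no))?;
        if found != IndexKind::Unique {
            return Err(BuildError::KindMismatch {
                expected: IndexKind::Unique,
                found,
            });
        }
        Ok(Self { mem, disk })
    }
}
```

A non-unique constructor would mirror this with the expected kind flipped, so the enum wrapper can dispatch on the metadata kind without re-validating.
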
- Add table-only DiskTree open helpers.
  - Add private helper methods on `SecondaryDiskTreeContext` to open `UniqueDiskTree` and `NonUniqueDiskTree`.
  - Each helper should allocate or derive a disk pool guard internally from the stored readonly pool, then call the existing DiskTree constructor with `FileKind::TableFile`.
  - Keep the root snapshot immutable for each method call.
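
The shape of such an open helper might look like the sketch below. Everything here is a stand-in: `ToyDiskTreeReader` and `open_reader` are hypothetical names, the real DiskTree constructors take more arguments, and only the `FileKind::TableFile` variant named in this task is modeled.

```rust
/// Stand-in for the storage `FileKind`; only the variant this task uses.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FileKind {
    TableFile,
}

/// Hypothetical reader that records what it was opened with.
#[derive(Debug, PartialEq, Eq)]
struct ToyDiskTreeReader {
    root: u64,
    index_no: usize,
    kind: FileKind,
}

/// Stand-in cold-root context; metadata, file, and pool are omitted.
struct SecondaryDiskTreeContext {
    root: u64, // stands in for `BlockID`
    index_no: usize,
}

impl SecondaryDiskTreeContext {
    /// Callers never pass a `FileKind`; the helper pins it to `TableFile`.
    /// The root is copied once so a single method call sees one snapshot.
    fn open_reader(&self) -> ToyDiskTreeReader {
        let root_snapshot = self.root;
        ToyDiskTreeReader {
            root: root_snapshot,
            index_no: self.index_no,
            kind: FileKind::TableFile,
        }
    }
}
```

Because the helper takes `&self` and returns a fresh reader, no composite method can change the stored root through it.
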
- Add narrow unique MemTree helper APIs.
  - Add a helper on `GenericUniqueBTreeIndex` to insert a delete-shadow `(logical_key -> deleted(row_id))` when the key is currently absent.
  - Add a helper to scan unique MemTree entries as encoded logical keys plus `(row_id, deleted)` state.
  - Keep helpers `pub(crate)` and document that they exist for dual-tree composition, not for DiskTree mutation.
- Add narrow non-unique MemTree helper APIs.
  - Add a helper on `GenericNonUniqueBTreeIndex` to insert a delete-marked exact `(logical_key, row_id)` overlay when the exact key is currently absent.
  - Add a helper to scan exact MemTree entries as encoded exact keys plus `(row_id, deleted)` state.
  - Preserve the current one-byte MemTree delete flag as a runtime-only concern.
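
As a rough model of the two insert-if-absent helpers and the encoded-state scan, the sketch below uses `BTreeMap` in place of the real B+Tree. The type and method names (`ToyUniqueMemTree`, `insert_delete_shadow_if_absent`, and so on) are hypothetical, and keys are treated as already encoded byte strings.

```rust
use std::collections::BTreeMap;

type RowId = u64;

/// Toy unique MemTree: encoded logical key -> (row_id, deleted).
#[derive(Default)]
struct ToyUniqueMemTree {
    entries: BTreeMap<Vec<u8>, (RowId, bool)>,
}

impl ToyUniqueMemTree {
    /// Installs `key -> deleted(row_id)` only when the key is absent.
    /// Returns whether a shadow was installed.
    fn insert_delete_shadow_if_absent(&mut self, key: &[u8], row_id: RowId) -> bool {
        if self.entries.contains_key(key) {
            return false;
        }
        self.entries.insert(key.to_vec(), (row_id, true));
        true
    }

    /// Returns encoded keys with `(row_id, deleted)` state in key order.
    fn scan_encoded(&self) -> Vec<(Vec<u8>, RowId, bool)> {
        self.entries
            .iter()
            .map(|(key, &(row_id, deleted))| (key.clone(), row_id, deleted))
            .collect()
    }
}

/// Toy non-unique MemTree: exact (logical key, row_id) -> deleted flag.
#[derive(Default)]
struct ToyNonUniqueMemTree {
    entries: BTreeMap<(Vec<u8>, RowId), bool>,
}

impl ToyNonUniqueMemTree {
    /// Installs a delete-marked exact overlay only when the exact key
    /// is absent. Returns whether an overlay was installed.
    fn insert_delete_marked_if_absent(&mut self, key: &[u8], row_id: RowId) -> bool {
        let exact = (key.to_vec(), row_id);
        if self.entries.contains_key(&exact) {
            return false;
        }
        self.entries.insert(exact, true);
        true
    }
}
```

Both helpers refuse to overwrite existing state, which keeps any live or delete-marked MemTree entry terminal for the composite layer.
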
- Implement `UniqueIndex` for `DualTreeUniqueIndex<P>`.
  - `lookup`:
    - probe `mem.lookup`;
    - return any MemTree live or delete-shadow hit;
    - otherwise open the unique DiskTree and return a cold hit as `Some((row_id, false))`.
  - `insert_if_not_exists`:
    - preserve current MemTree duplicate and delete-shadow merge behavior;
    - on MemTree miss, report a DiskTree owner as `IndexInsert::DuplicateKey(row_id, false)` without mutation;
    - otherwise insert into MemTree.
  - `compare_exchange`:
    - MemTree `Ok` and `Mismatch` are terminal;
    - on MemTree `NotExists`, claim a matching DiskTree owner by inserting `new_row_id` into MemTree;
    - return `Mismatch` for a different DiskTree owner and `NotExists` for complete miss.
  - `mask_as_deleted`:
    - mark matching MemTree state deleted when present;
    - when only DiskTree maps the key to `row_id`, install a MemTree delete-shadow for that cold owner;
    - never update DiskTree.
  - `compare_delete`:
    - remove or adjust only MemTree overlay state;
    - if only DiskTree has the matching owner, return success as an idempotent no-op;
    - never update DiskTree.
  - `scan_values`:
    - merge MemTree and DiskTree entries by encoded logical key;
    - MemTree state is terminal per logical key and suppresses the same-key DiskTree candidate;
    - append the resulting row ids in deterministic encoded-key order.
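
The unique method rules above can be modeled with two ordered maps. This is a toy sketch, not the trait implementation: `ToyDualUnique`, `Insert`, and `Exchange` are hypothetical names, the method signatures are assumed, and the delete-shadow merge path of `insert_if_not_exists` is deliberately left out.

```rust
use std::collections::BTreeMap;

type RowId = u64;

#[derive(Debug, PartialEq, Eq)]
enum Insert {
    Ok,
    DuplicateKey(RowId, bool),
}

#[derive(Debug, PartialEq, Eq)]
enum Exchange {
    Ok,
    Mismatch,
    NotExists,
}

/// `mem` maps key -> (row_id, deleted); `disk` maps key -> row_id.
struct ToyDualUnique {
    mem: BTreeMap<Vec<u8>, (RowId, bool)>,
    disk: BTreeMap<Vec<u8>, RowId>,
}

impl ToyDualUnique {
    fn lookup(&self, key: &[u8]) -> Option<(RowId, bool)> {
        // Any MemTree hit, live or delete-shadow, is terminal.
        if let Some(&hit) = self.mem.get(key) {
            return Some(hit);
        }
        // The cold layer is probed only on a MemTree miss.
        self.disk.get(key).map(|&row_id| (row_id, false))
    }

    fn insert_if_not_exists(&mut self, key: &[u8], row_id: RowId) -> Insert {
        if let Some(&(owner, deleted)) = self.mem.get(key) {
            return Insert::DuplicateKey(owner, deleted);
        }
        // A cold owner is reported as a duplicate without any mutation.
        if let Some(&owner) = self.disk.get(key) {
            return Insert::DuplicateKey(owner, false);
        }
        self.mem.insert(key.to_vec(), (row_id, false));
        Insert::Ok
    }

    fn compare_exchange(&mut self, key: &[u8], old: RowId, new: RowId) -> Exchange {
        if let Some(entry) = self.mem.get_mut(key) {
            if entry.0 != old {
                return Exchange::Mismatch;
            }
            *entry = (new, false);
            return Exchange::Ok;
        }
        match self.disk.get(key) {
            // Claim a matching cold owner by installing MemTree overlay state.
            Some(&owner) if owner == old => {
                self.mem.insert(key.to_vec(), (new, false));
                Exchange::Ok
            }
            Some(_) => Exchange::Mismatch,
            None => Exchange::NotExists,
        }
    }

    fn mask_as_deleted(&mut self, key: &[u8], row_id: RowId) -> bool {
        if let Some(entry) = self.mem.get_mut(key) {
            if *entry != (row_id, false) {
                return false;
            }
            entry.1 = true;
            return true;
        }
        // Cold-only owner: install a delete-shadow in MemTree only.
        if self.disk.get(key) == Some(&row_id) {
            self.mem.insert(key.to_vec(), (row_id, true));
            return true;
        }
        false
    }
}
```

Every mutating method takes the cold map by shared reference only, which is the property the composite tests are meant to prove against real DiskTree roots.
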
- Implement `NonUniqueIndex` for `DualTreeNonUniqueIndex<P>`.
  - `lookup`:
    - collect MemTree exact entries for the logical key, including delete state;
    - collect DiskTree exact entries for the logical key;
    - return active MemTree row ids;
    - suppress only matching DiskTree exact entries when MemTree has a delete-marked exact overlay;
    - deduplicate and order by encoded exact key.
  - `lookup_unique`:
    - check MemTree exact state first;
    - a live or delete-marked MemTree exact hit is terminal;
    - on MemTree miss, return DiskTree exact-key presence as `Some(true)`.
  - `insert_if_not_exists`:
    - preserve current MemTree duplicate and exact delete-mark merge behavior;
    - on MemTree miss, report matching DiskTree exact presence as a duplicate without mutation;
    - otherwise insert active exact entry into MemTree.
  - `mask_as_deleted`:
    - mark matching MemTree exact state deleted when present;
    - when only DiskTree contains the exact key, install a delete-marked MemTree exact overlay;
    - never update DiskTree.
  - `mask_as_active`:
    - unmask MemTree exact state only;
    - never update DiskTree.
  - `compare_delete`:
    - remove only matching MemTree exact overlay state;
    - if only DiskTree has the exact key, return success as an idempotent no-op;
    - never update DiskTree.
  - `scan_values`:
    - merge all MemTree and DiskTree exact entries with the same suppression and ordering rules as `lookup`.
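
The non-unique merge and suppression rules can be illustrated with ordered collections keyed by the exact `(logical_key, row_id)` pair. This is a sketch under assumptions: `ToyDualNonUnique` is a hypothetical name, the signatures are guessed, and row ids are assumed to sort in the same order as their encoded form.

```rust
use std::collections::{BTreeMap, BTreeSet};

type RowId = u64;
type ExactKey = (Vec<u8>, RowId);

/// `mem` maps exact key -> deleted flag; `disk` holds cold exact keys.
struct ToyDualNonUnique {
    mem: BTreeMap<ExactKey, bool>,
    disk: BTreeSet<ExactKey>,
}

impl ToyDualNonUnique {
    fn lookup(&self, key: &[u8]) -> Vec<RowId> {
        let lo = (key.to_vec(), RowId::MIN);
        let hi = (key.to_vec(), RowId::MAX);
        // row_id -> visible; the map deduplicates and orders the result.
        let mut merged: BTreeMap<RowId, bool> = BTreeMap::new();
        for (_, row_id) in self.disk.range(lo.clone()..=hi.clone()) {
            merged.insert(*row_id, true);
        }
        // MemTree state wins per exact key: an active entry stays visible,
        // a delete-marked entry suppresses only its own exact match.
        for ((_, row_id), deleted) in self.mem.range(lo..=hi) {
            merged.insert(*row_id, !*deleted);
        }
        merged
            .into_iter()
            .filter(|&(_, visible)| visible)
            .map(|(row_id, _)| row_id)
            .collect()
    }

    fn lookup_unique(&self, key: &[u8], row_id: RowId) -> Option<bool> {
        let exact = (key.to_vec(), row_id);
        if let Some(&deleted) = self.mem.get(&exact) {
            return Some(!deleted);
        }
        self.disk.contains(&exact).then_some(true)
    }

    fn mask_as_deleted(&mut self, key: &[u8], row_id: RowId) -> bool {
        let exact = (key.to_vec(), row_id);
        if let Some(deleted) = self.mem.get_mut(&exact) {
            let changed = !*deleted;
            *deleted = true;
            return changed;
        }
        // Cold-only exact key: install a delete-marked MemTree overlay.
        if self.disk.contains(&exact) {
            self.mem.insert(exact, true);
            return true;
        }
        false
    }
}
```

Masking one row id leaves other row ids under the same logical key visible, which matches the "suppress only matching exact entries" rule.
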
- Keep runtime integration seams explicit.
  - The new dual-tree types may include constructors that accept already-built MemTree indexes and cold contexts.
  - Do not alter `build_secondary_indexes`, `GenericMemTable::new`, or `Table::sec_idx()` in this task.
  - Leave Phase 4 responsible for changing table-owned secondary-index storage and call sites.
- Add focused tests and validation.
  - Keep tests inline in the changed index modules.
  - Use real `GenericUniqueBTreeIndex` and `GenericNonUniqueBTreeIndex` MemTree fixtures.
  - Use real persisted DiskTree roots built through existing batch writers.
  - Run:
    - `cargo fmt --all --check`
    - `cargo clippy -p doradb-storage --all-targets -- -D warnings`
    - `cargo nextest run -p doradb-storage`

Run the alternate backend pass only if implementation changes backend-neutral I/O paths beyond using existing table-file and DiskTree read helpers:
`cargo nextest run -p doradb-storage --no-default-features --features libaio`

- `doradb-storage/src/index/composite_secondary_index.rs`
  - new dual-tree grouping types, cold-root context, trait implementations, and composite merge helpers
- `doradb-storage/src/index/mod.rs`
  - internal module export for the composite index core
- `doradb-storage/src/index/unique_index.rs`
  - narrow unique MemTree helper APIs for cold-owner delete-shadow insertion and encoded-state scans
- `doradb-storage/src/index/non_unique_index.rs`
  - narrow non-unique MemTree helper APIs for delete-marked exact overlay insertion and encoded-state scans
- `doradb-storage/src/index/disk_tree.rs`
  - existing user-table DiskTree readers are reused; optional table-only open helpers may be added if they keep `FileKind` out of the composite API
- `doradb-storage/src/index/secondary_index.rs`
  - optional construction helpers or enum alignment for future Phase 4 wiring; no table runtime replacement in this task

- Unique `lookup`:
  - MemTree live hit returns MemTree row id and does not read DiskTree result.
  - MemTree delete-shadow hit is terminal and does not fall through.
  - MemTree miss plus DiskTree hit returns cold owner as live candidate.
  - complete miss returns `None`.
- Unique `insert_if_not_exists`:
  - active MemTree duplicate is terminal.
  - matching MemTree delete-shadow can merge when requested.
  - DiskTree duplicate returns `DuplicateKey(row_id, false)` without modifying MemTree or DiskTree.
  - complete miss inserts into MemTree.
- Unique `compare_exchange`:
  - matching MemTree state updates normally.
  - mismatched MemTree state is terminal.
  - matching DiskTree owner is claimed by MemTree overlay.
  - mismatched DiskTree owner returns mismatch.
- Unique delete behavior:
  - `mask_as_deleted` marks MemTree state when present.
  - `mask_as_deleted` installs a cold-owner MemTree delete-shadow when only DiskTree matches.
  - `compare_delete` removes only MemTree overlay state.
  - all delete paths preserve the original DiskTree root.
- Unique `scan_values`:
  - MemTree entries are returned in deterministic key order.
  - same-key MemTree state suppresses DiskTree state.
  - delete-shadow suppression is covered.
- Non-unique `lookup`:
  - active MemTree exact entries and DiskTree exact entries are merged.
  - delete-marked MemTree exact entries suppress only matching DiskTree exact keys.
  - different row ids with the same logical key remain visible.
  - output is deduplicated and ordered by `(logical_key, row_id)`.
- Non-unique `lookup_unique`:
  - MemTree active hit returns `Some(true)`.
  - MemTree delete-marked hit returns `Some(false)`.
  - MemTree miss plus DiskTree exact hit returns `Some(true)`.
  - complete miss returns `None`.
- Non-unique mutation methods:
  - `insert_if_not_exists` reports DiskTree exact duplicate without mutation.
  - `mask_as_deleted` installs a MemTree delete-marked overlay for a cold exact key.
  - `mask_as_active` affects only MemTree.
  - `compare_delete` affects only MemTree and treats DiskTree-only matches as idempotent no-ops.
- Non-unique `scan_values`:
  - full exact-entry merge follows lookup suppression and ordering rules.
- Rollback-shaped behavior:
  - unmask/remask/remove sequences mutate only MemTree.
  - DiskTree scan results and root ids remain unchanged across composite mutation tests.
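
One way to keep the "DiskTree unchanged" check uniform across these cases is a small snapshot-and-compare helper. The sketch below is illustrative only: `ToyComposite` and `assert_cold_unchanged` are hypothetical names, and a real test would snapshot the persisted root id plus a DiskTree scan instead of in-memory vectors.

```rust
use std::fmt::Debug;

/// Toy composite: `cold_root` and `cold_entries` model the persisted
/// DiskTree; `mem` models MemTree overlay rows as (key, row_id, deleted).
struct ToyComposite {
    cold_root: u64,
    cold_entries: Vec<(Vec<u8>, u64)>,
    mem: Vec<(Vec<u8>, u64, bool)>,
}

/// Snapshots the cold layer, runs `mutate`, and asserts that the
/// snapshot taken afterwards is identical.
fn assert_cold_unchanged<T, S>(
    index: &mut T,
    snapshot: impl Fn(&T) -> S,
    mutate: impl FnOnce(&mut T),
) where
    S: PartialEq + Debug,
{
    let before = snapshot(&*index);
    mutate(&mut *index);
    let after = snapshot(&*index);
    assert_eq!(before, after, "composite mutation touched the cold layer");
}
```

A test would pass a closure such as `|i| (i.cold_root, i.cold_entries.clone())` as the snapshot and the mutation sequence under test (mask, unmask, remove) as the second closure.
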
None blocking for this task.
Future Phase 4 must decide how DualTreeSecondaryIndex replaces or coexists
with the current GenericSecondaryIndex array inside user-table runtime
storage. This task intentionally leaves that table ownership change out of
scope.