Commit 1772490

chore: composite secondary index core (#558)
* composite secondary index
1 parent 34bf56c commit 1772490

23 files changed: +2679 -672 lines changed
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# Backlog: Persist DiskTree upper fences for prefix compression

## Summary

Persist meaningful upper fences for secondary DiskTree leaf and branch nodes so BTreeNode common-prefix compression can take effect for clustered encoded keys.

## Reference

Current task discussion and implementation in docs/tasks/000119-composite-secondary-index-core.md and docs/rfcs/0014-dual-tree-secondary-index.md; DiskTree currently initializes persisted nodes with an empty upper_fence.

## Deferred From (Optional)

docs/tasks/000119-composite-secondary-index-core.md; docs/rfcs/0014-dual-tree-secondary-index.md

## Deferral Context (Optional)

- Defer Reason: The current fix is scoped to removing full-tree materialization from non-unique prefix scans; changing persisted node fence semantics is a broader storage-format and traversal hardening task.
- Findings: DiskTree leaf and branch writers currently call BTreeNode::init with upper_fence set to an empty slice, so common_prefix_len(lower_fence, upper_fence) is normally zero and BTreeNode prefix compression does not take effect. The streamed prefix scan now uses common-prefix-aware lower-bound helpers, so it should benefit once real upper fences are persisted.
- Direction Hint: Persist real upper fences for DiskTree nodes, then verify exact lookup and range traversal against nonempty node common prefixes before relying on prefix-compressed persisted blocks.
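
The Findings item can be illustrated with a minimal sketch. The `common_prefix_len` below is a stand-in written for this note and is only assumed to behave like the helper named above; the key bytes are made up. It shows that an empty upper fence always yields a zero-length common prefix, while a real upper fence for clustered keys yields a nonempty one.

```rust
/// Length of the longest shared prefix of two byte strings.
/// Stand-in for illustration; not the BTreeNode implementation.
fn common_prefix_len(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

fn main() {
    // Hypothetical clustered encoded keys bounding one node.
    let lower_fence = b"user:0001:aaaa";
    let real_upper = b"user:0001:zzzz";
    // What the writers pass today, per the Findings: an empty slice.
    let empty_upper: &[u8] = b"";

    // Empty upper fence: nothing is shared, so nothing can be compressed.
    assert_eq!(common_prefix_len(lower_fence, empty_upper), 0);
    // Real upper fence: the 10 bytes "user:0001:" are shared.
    assert_eq!(common_prefix_len(lower_fence, real_upper), 10);
}
```

With a shared prefix of 10 bytes, each key in such a node would only need to store its 4-byte suffix, which is the saving this item is after.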

## Scope Hint

Evaluate DiskTree leaf and branch block writing, exact lookup, and streamed range traversal with nonempty upper fences; keep compatibility with current root snapshot semantics.

## Acceptance Hint

DiskTree tests demonstrate nonempty BTreeNode common prefixes for clustered keys, exact lookups and non-unique prefix scans remain correct, and prefix scans benefit from the existing lower-bound helpers.
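
One possible shape for such a test, as a sketch only: the `Node` type, its `build`, `lower_bound`, and `prefix_scan` methods, and the sample keys are all hypothetical and do not mirror the real BTreeNode or DiskTree APIs. The idea is to build the same key set once with an empty upper fence and once with a real one, then check that lower bounds and prefix scans agree.

```rust
/// Length of the longest shared prefix of two byte strings (stand-in helper).
fn common_prefix_len(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

/// Hypothetical node: the common prefix is stored once, keys as suffixes.
struct Node {
    prefix: Vec<u8>,
    suffixes: Vec<Vec<u8>>, // sorted, with the prefix stripped
}

impl Node {
    /// Assumes `keys` is sorted and every key lies within the two fences.
    fn build(keys: &[&[u8]], lower_fence: &[u8], upper_fence: &[u8]) -> Node {
        let p = common_prefix_len(lower_fence, upper_fence);
        Node {
            prefix: lower_fence[..p].to_vec(),
            suffixes: keys.iter().map(|k| k[p..].to_vec()).collect(),
        }
    }

    /// Index of the first stored key >= `key`; the prefix is compared once.
    fn lower_bound(&self, key: &[u8]) -> usize {
        let p = self.prefix.len();
        if key.len() < p || key[..p] != self.prefix[..] {
            // `key` lacks the node prefix, so it sorts before or after every key.
            return if key < self.prefix.as_slice() { 0 } else { self.suffixes.len() };
        }
        let rest = &key[p..];
        self.suffixes.partition_point(|s| s.as_slice() < rest)
    }

    /// All stored keys (reassembled) that start with `query`.
    fn prefix_scan(&self, query: &[u8]) -> Vec<Vec<u8>> {
        self.suffixes[self.lower_bound(query)..]
            .iter()
            .map(|s| [self.prefix.as_slice(), s.as_slice()].concat())
            .take_while(|k| k.starts_with(query))
            .collect()
    }
}

fn main() {
    let keys: [&[u8]; 4] = [b"k:ab:1", b"k:ab:2", b"k:ac:1", b"k:ad:9"];
    let uncompressed = Node::build(&keys, b"k:ab:1", b""); // empty upper fence
    let compressed = Node::build(&keys, b"k:ab:1", b"k:ad:9"); // real upper fence
    assert_eq!(uncompressed.prefix.len(), 0);
    assert_eq!(compressed.prefix, b"k:a".to_vec());

    // Both layouts must answer every probe identically.
    let queries: [&[u8]; 5] = [b"j", b"k:ab", b"k:ac", b"k:b", b"z"];
    for &q in queries.iter() {
        assert_eq!(uncompressed.lower_bound(q), compressed.lower_bound(q));
        assert_eq!(uncompressed.prefix_scan(q), compressed.prefix_scan(q));
    }
}
```

Comparing the two layouts against each other keeps the check independent of any particular expected offsets, which may be convenient if the persisted format changes.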

## Notes (Optional)


## Close Reason (Added When Closed)

When a backlog item is moved to `docs/backlogs/closed/`, append:

```md
## Close Reason

- Type: <implemented|stale|replaced|duplicate|wontfix|already-implemented|other>
- Detail: <reason detail>
- Closed By: <backlog close>
- Reference: <task/issue/pr reference>
- Closed At: <YYYY-MM-DD>
```
Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
# Backlog: Optimize secondary index access path for dual-tree indexes

## Summary

Revisit the secondary-index access path now that user-table indexes are dual-tree; the current UniqueIndex and NonUniqueIndex interfaces were shaped around a single mutable tree and may force unnecessary DiskTree opens, scans, merges, or guard construction. In particular, evaluate whether callers should provide disk pool guards for operations that can touch DiskTree instead of having dual-tree indexes construct those guards internally.

## Reference

Current task discussion; docs/tasks/000119-composite-secondary-index-core.md; docs/rfcs/0014-dual-tree-secondary-index.md; docs/secondary-index.md; doradb-storage/src/index/unique_index.rs; doradb-storage/src/index/non_unique_index.rs; doradb-storage/src/index/composite_secondary_index.rs; table/trx call sites that already pass PoolGuards for other storage access.

## Deferred From (Optional)

docs/tasks/000119-composite-secondary-index-core.md; docs/rfcs/0014-dual-tree-secondary-index.md

## Deferral Context (Optional)

- Defer Reason: Task 000119 is focused on making the composite secondary-index core correct under the existing trait surface; broader access-path optimization may require interface or call-site changes and should be planned separately.
- Findings: The current UniqueIndex and NonUniqueIndex traits expose single-tree-style operations. DualTreeUniqueIndex and DualTreeNonUniqueIndex must internally decide when to consult MemTree, DiskTree, or both, which can lead to extra DiskTree opens/lookups and makes cleanup-only behavior easy to mis-specify. DualTree methods currently receive an index pool guard through the existing traits, but DiskTree access also needs a disk pool guard; constructing that guard inside SecondaryDiskTreeRuntime is opposite to the existing client-passes-guards pattern used by table and transaction code. Some decisions may be better expressed as dual-tree-specific APIs or by changing internal behavior behind the existing table-facing surface.
- Direction Hint: Start from table/trx call sites and classify operations by intent: read lookup, duplicate check, foreground logical delete, compare-exchange claim, rollback cleanup, purge cleanup, and scan. Prefer an access contract that makes cold-layer reads explicit only where they are semantically required. Evaluate whether to change trait signatures to carry caller-owned PoolGuards or explicit disk PoolGuard references, or preserve the public traits while adding internal dual-tree fast paths if that limits churn.
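
As a rough sketch of the "explicit cold-layer reads" option in the Direction Hint: the types below (`DiskPoolGuard`, `DualTreeSketch`) and both method signatures are hypothetical stand-ins, not the existing traits or PoolGuard types. The sketch also assumes, without confirmation from the source, that rollback cleanup can be served from the hot layer alone. Operations that may read the cold layer take a caller-owned guard; cleanup-only operations take none, so they cannot reach the cold layer by construction. A counter records cold-layer reads so a test can assert on them.

```rust
use std::cell::Cell;
use std::collections::BTreeMap;

/// Stand-in for a caller-owned disk pool guard (hypothetical type).
struct DiskPoolGuard;

/// Hypothetical dual-tree index: a hot in-memory layer over a cold layer.
#[derive(Default)]
struct DualTreeSketch {
    mem: BTreeMap<Vec<u8>, u64>,
    disk: BTreeMap<Vec<u8>, u64>,
    disk_reads: Cell<usize>, // counter a test can assert on
}

impl DualTreeSketch {
    /// Read lookup: may fall through to the cold layer, so the caller
    /// must supply the disk guard.
    fn lookup(&self, key: &[u8], _disk_guard: &DiskPoolGuard) -> Option<u64> {
        if let Some(row_id) = self.mem.get(key) {
            return Some(*row_id); // hot hit: no cold-layer work
        }
        self.disk_reads.set(self.disk_reads.get() + 1);
        self.disk.get(key).copied()
    }

    /// Rollback cleanup (assumed hot-layer only): no guard parameter,
    /// so this path cannot touch the cold layer.
    fn rollback_cleanup(&mut self, key: &[u8]) -> bool {
        self.mem.remove(key).is_some()
    }
}

fn main() {
    let mut index = DualTreeSketch::default();
    index.mem.insert(b"hot".to_vec(), 1);
    index.disk.insert(b"cold".to_vec(), 2);
    let guard = DiskPoolGuard; // owned by the caller, as table/trx code does

    assert_eq!(index.lookup(b"hot", &guard), Some(1));
    assert_eq!(index.disk_reads.get(), 0); // served from the hot layer
    assert_eq!(index.lookup(b"cold", &guard), Some(2));
    assert_eq!(index.disk_reads.get(), 1); // exactly one cold-layer read
    assert!(index.rollback_cleanup(b"hot"));
    assert_eq!(index.disk_reads.get(), 1); // cleanup added no cold reads
}
```

Whether the real traits should change this way, or whether an internal fast path is preferable, is exactly what this item leaves open.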

## Scope Hint

Evaluate whether to change the index traits, add dual-tree-specific internal methods, batch or cache DiskTree opens, pass caller-owned disk pool guards, and avoid cold-layer reads for cleanup-only paths while preserving table/trx call-site semantics. Include UniqueIndex and NonUniqueIndex signatures, DualTreeUniqueIndex and DualTreeNonUniqueIndex internals, SecondaryDiskTreeRuntime open paths, and table/trx call sites.

## Acceptance Hint

A future task identifies and implements a cleaner dual-tree access contract or internal fast path, removes unnecessary cold-layer work from common lookup/delete/update paths, and includes benchmarks or targeted counters/tests showing fewer DiskTree opens/scans for representative secondary-index operations. Operations that may consult DiskTree should receive or derive the needed caller-provided disk pool guard from operation context, with internal ad-hoc disk_pool_guard creation removed or limited to clearly justified helper boundaries.

## Notes (Optional)


## Close Reason (Added When Closed)

When a backlog item is moved to `docs/backlogs/closed/`, append:

```md
## Close Reason

- Type: <implemented|stale|replaced|duplicate|wontfix|already-implemented|other>
- Detail: <reason detail>
- Closed By: <backlog close>
- Reference: <task/issue/pr reference>
- Closed At: <YYYY-MM-DD>
```

docs/backlogs/next-id

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-000085
+000087