Skip to content

Commit 3102c3e

Browse files
authored
feat: Dual-Tree Secondary Index (#560)
* implement secondary index runtime access and recovery
1 parent 1772490 commit 3102c3e

21 files changed

+2219
-863
lines changed
Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,45 @@
1+
# Backlog: Refactor recovery process and parallelize log replay
2+
3+
## Summary
4+
5+
Recovery has accumulated timestamp-boundary logic, checkpoint bootstrap handling, hot MemTree rebuild behavior, and sequential redo replay in one coordinator. Add a follow-up to clean up the recovery code structure and evaluate or implement parallel log replay without regressing catalog or user-table replay correctness.
6+
7+
## Reference
8+
9+
Raised during task 000120 while wiring dual-tree secondary-index recovery. Discussion clarified LogRecovery state such as catalog_replay_start_ts, replay_floor, max_recovered_cts, table_states, and recovered_tables, and exposed that recovery needs clearer structure before adding more concurrency.
10+
11+
## Deferred From (Optional)
12+
13+
docs/tasks/000120-secondary-index-runtime-access-and-recovery.md
14+
15+
## Deferral Context (Optional)
16+
17+
- Defer Reason: Task 000120 is scoped to secondary-index runtime access and recovery cutover. Broad recovery cleanup and replay parallelization would widen the task beyond dual-tree integration and risk delaying the current feature.
18+
- Findings: Current recovery tracks multiple replay boundaries: catalog checkpoint replay start, per-table heap redo start, deletion cutoff, a global replay floor, and max recovered CTS. Checkpointed DiskTree secondary-index roots should remain cold state while recovery rebuilds only hot MemTree rows. DDL acts as a pipeline breaker today, and DML replay is still sequential with a todo for dispatching work to multiple threads.
19+
- Direction Hint: Start with documentation and small structural cleanup so timestamp boundary semantics remain explicit. Then plan parallel replay around DDL barriers, catalog-table serialization, user-table independence, and row-page conflict grouping. Preserve max recovered CTS as a global timestamp watermark even for skipped log records.
20+
21+
## Scope Hint
22+
23+
Audit the trx recovery flow, separate checkpoint bootstrap, DDL and catalog replay, user-table DML replay, index rebuild, and page refresh responsibilities where useful, then design parallel DML or log replay boundaries with deterministic ordering around DDL pipeline breakers and per-table or per-page conflicts.
24+
25+
## Acceptance Hint
26+
27+
Recovery code has clearer ownership boundaries and comments, tests cover catalog replay boundaries, per-table heap and delete cutoffs, hot secondary-index rebuilds, and parallel replay ordering, and full doradb-storage nextest passes with no timestamp reuse or index recovery regressions.
28+
29+
## Notes (Optional)
30+
31+
Consider whether a full RFC is needed before implementation because parallel replay touches recovery ordering, error propagation, and replay determinism.
32+
33+
## Close Reason (Added When Closed)
34+
35+
When a backlog item is moved to `docs/backlogs/closed/`, append:
36+
37+
```md
38+
## Close Reason
39+
40+
- Type: <implemented|stale|replaced|duplicate|wontfix|already-implemented|other>
41+
- Detail: <reason detail>
42+
- Closed By: <backlog close>
43+
- Reference: <task/issue/pr reference>
44+
- Closed At: <YYYY-MM-DD>
45+
```
Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
1+
# Backlog: Remove recovery skip option
2+
3+
## Summary
4+
5+
Remove the transaction-system `skip_recovery` configuration path. Skipping redo recovery is dangerous for a storage engine because it can admit runtime operations on stale or partially recovered state and corrupt data. The option was originally useful for tests, but the project should prioritize data safety as the storage engine matures.
6+
7+
## Reference
8+
9+
Deferred from `docs/tasks/000120-secondary-index-runtime-access-and-recovery.md` and parent RFC `docs/rfcs/0014-dual-tree-secondary-index.md` Phase 4 review. The issue surfaced while reviewing recovery startup behavior around `TrxSysConfig::skip_recovery`, `log_recover(skip=true)`, and `TransactionSystem::new` timestamp initialization.
10+
11+
## Deferred From (Optional)
12+
13+
docs/tasks/000120-secondary-index-runtime-access-and-recovery.md; docs/rfcs/0014-dual-tree-secondary-index.md Phase 4
14+
15+
## Deferral Context (Optional)
16+
17+
- Defer Reason: The active task is focused on secondary-index runtime/recovery cutover. Removing the recovery skip option affects broader engine configuration and many tests, so it should be planned separately instead of widening the current task during resolve.
18+
- Findings: The current code exposes `TrxSysConfig::skip_recovery` and many tests set `.skip_recovery(true)`. A reviewed startup path showed that skip mode can bypass recovery entirely and hand a transaction timestamp seed directly to `TransactionSystem::new`. Even if individual bugs are patched, the option itself is unsafe as a storage-engine configuration because it permits startup without replaying durable logs.
19+
- Direction Hint: Prefer deleting the production configuration surface and making recovery mandatory. For tests, first classify why each `.skip_recovery(true)` call exists: fresh temporary root convenience, log subsystem isolation, or explicit recovery bypass. Replace fresh-root cases with normal recovery, isolate log-only tests behind private test fixtures, and avoid any public API that can start a persisted engine while skipping recovery.
20+
21+
## Scope Hint
22+
23+
Audit and remove the production-facing recovery skip option from configuration and startup flow. Replace test uses with safer test-only construction helpers or fixtures that do not bypass recovery for persistent storage roots. Keep any unavoidable no-recovery behavior private to tests and clearly separated from engine configuration.
24+
25+
## Acceptance Hint
26+
27+
`TrxSysConfig` no longer exposes a general `skip_recovery` option for engine startup. Production startup always runs recovery. Existing tests are migrated to safe alternatives, and tests that intentionally need an empty log or fresh temporary root express that setup without disabling recovery. Documentation and comments no longer present recovery skipping as a supported mode.
28+
29+
## Notes (Optional)
30+
31+
32+
## Close Reason (Added When Closed)
33+
34+
When a backlog item is moved to `docs/backlogs/closed/`, append:
35+
36+
```md
37+
## Close Reason
38+
39+
- Type: <implemented|stale|replaced|duplicate|wontfix|already-implemented|other>
40+
- Detail: <reason detail>
41+
- Closed By: <backlog close>
42+
- Reference: <task/issue/pr reference>
43+
- Closed At: <YYYY-MM-DD>
44+
```

docs/backlogs/next-id

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
000087
1+
000089

docs/rfcs/0014-dual-tree-secondary-index.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -502,10 +502,10 @@ SAFETY:` comments, and run the repository lint gate. [D10], [C4], [C7]
502502
visibility authority, rebuilding checkpointed cold DiskTree entries into
503503
MemTree, removing the single-tree catalog implementation, and generic
504504
B+Tree backend unification.
505-
- Task Doc: `docs/tasks/TBD.md`
506-
- Task Issue: `#0`
507-
- Phase Status: `pending`
508-
- Implementation Summary: `pending`
505+
- Task Doc: `docs/tasks/000120-secondary-index-runtime-access-and-recovery.md`
506+
- Task Issue: `#559`
507+
- Phase Status: done
508+
- Implementation Summary: Implemented Phase 4 user-table dual-tree runtime and recovery cutover, preserving catalog single-tree indexes, routing foreground/rollback access through composite indexes, and using checkpointed DiskTree roots as cold recovery state. [Task Resolve Sync: docs/tasks/000120-secondary-index-runtime-access-and-recovery.md @ 2026-04-16]
509509

510510
- **Phase 5: Cleanup, Eviction, And Parity Hardening**
511511
- Scope: remove migration scaffolding only after composite behavior is

0 commit comments

Comments
 (0)