feat(migration): statehistory migration #3658

Open
MaksymMalicki wants to merge 4 commits into maksym/headstate-migration from maksym/statehistory-migration

@MaksymMalicki MaksymMalicki commented May 19, 2026

Summary

Adds the statehistory migration: rewrites the three deprecated contract history layouts (class-hash, nonce, per-slot storage) so each entry stores the post-update value at its block instead of the pre-update value. Gated behind the existing --new-state flag. Depends on the headstate migration (consolidated Contract record) shipped in the sibling PR — the new layout reads contract.{ClassHash, Nonce} and the head storage trie as the "last value" source.

How it works

block │ old (pre-update) │ new (post-update)
──────┼──────────────────┼───────────────────
 100  │ —                │ V₀  ← explicit deploy (class-hash only)
 200  │ V₀               │ V₁
 500  │ V₁               │ V₂
 head │ V₂ (Contract)    │ (read from history)
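
The shift in the table above can be sketched in a few lines of Go. This is a minimal illustration, not the PR's code: `entry`, `shiftToPostUpdate`, and the string values are hypothetical stand-ins, and the real migration works on database rows rather than slices. Each new row takes the pre-value of the next deprecated row, the last row takes the head value, and the class-hash case prepends an explicit deploy row.

```go
package main

import "fmt"

// entry is a hypothetical in-memory stand-in for one history row:
// the block at which a value changed and the value stored for that block.
type entry struct {
	Block uint64
	Value string
}

// shiftToPostUpdate converts pre-update history (each row holds the value
// before the change at its block) into post-update history (each row holds
// the value after the change at its block). Rows must be sorted by block.
// head is the current value, which the old layout keeps outside the history.
// If deployBlock is non-nil, an explicit deploy row carrying the first
// pre-value is prepended (the class-hash case).
// Assumption: a contract with no deprecated rows yields no new rows.
func shiftToPostUpdate(old []entry, head string, deployBlock *uint64) []entry {
	if len(old) == 0 {
		return nil
	}
	out := make([]entry, 0, len(old)+1)
	if deployBlock != nil {
		out = append(out, entry{Block: *deployBlock, Value: old[0].Value})
	}
	for i, e := range old {
		next := head // the last row receives the head value
		if i+1 < len(old) {
			next = old[i+1].Value // otherwise the next row's pre-value
		}
		out = append(out, entry{Block: e.Block, Value: next})
	}
	return out
}

func main() {
	old := []entry{{200, "V0"}, {500, "V1"}}
	deploy := uint64(100)
	for _, e := range shiftToPostUpdate(old, "V2", &deploy) {
		fmt.Printf("%d -> %s\n", e.Block, e.Value)
	}
	// Prints:
	// 100 -> V0
	// 200 -> V1
	// 500 -> V2
}
```

Passing `nil` for `deployBlock` gives the nonce/storage behaviour: the rows are shifted and the count is unchanged.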

Runs three sequential phases (class-hash, nonce, storage), each iterating the Contract bucket. Each phase runs four worker goroutines (ingestorCount); each worker walks one contract's deprecated entries at a time, shifts them into the new layout in shared db.Batches, and issues a DeleteRange for the deprecated rows in the same batch. One committer drains batches to disk; a semaphore caps in-flight batches at ingestorCount + 1.
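
A rough sketch of that worker/committer shape, under simplifying assumptions: `batch`, `runPhase`, and the row strings are hypothetical, each worker builds its own one-row batch instead of sharing batches, and "committing" only counts rows. It is meant to show how a buffered channel used as a semaphore bounds in-flight batches at ingestorCount + 1, not to mirror the PR's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

const ingestorCount = 4

// batch is a hypothetical stand-in for a database write batch.
type batch struct{ rows []string }

// runPhase fans contracts out to ingestorCount workers. Each worker takes a
// semaphore slot before filling a batch and hands the batch to a single
// committer, which releases the slot after "committing". It returns the
// number of rows committed and the peak number of in-flight batches.
func runPhase(contracts []string) (committed int, peak int) {
	sem := make(chan struct{}, ingestorCount+1) // caps in-flight batches
	work := make(chan string)
	toCommit := make(chan *batch)

	var mu sync.Mutex // guards inFlight and peak
	inFlight := 0

	var workers sync.WaitGroup
	for i := 0; i < ingestorCount; i++ {
		workers.Add(1)
		go func() {
			defer workers.Done()
			for c := range work {
				sem <- struct{}{} // blocks while the cap is reached
				mu.Lock()
				inFlight++
				if inFlight > peak {
					peak = inFlight
				}
				mu.Unlock()
				toCommit <- &batch{rows: []string{c + "/new-row"}}
			}
		}()
	}

	done := make(chan struct{})
	go func() { // the single committer
		defer close(done)
		for b := range toCommit {
			committed += len(b.rows) // a real committer would write to disk here
			mu.Lock()
			inFlight--
			mu.Unlock()
			<-sem // release the slot only after the batch is committed
		}
	}()

	for _, c := range contracts {
		work <- c
	}
	close(work)
	workers.Wait()
	close(toCommit)
	<-done
	return committed, peak
}

func main() {
	contracts := []string{"0x1", "0x2", "0x3", "0x4", "0x5", "0x6", "0x7", "0x8"}
	committed, peak := runPhase(contracts)
	fmt.Println("rows committed:", committed)
	fmt.Println("peak in-flight batches within cap:", peak <= ingestorCount+1)
}
```

The extra slot beyond ingestorCount lets a worker start its next batch while the committer is still flushing the previous one, so neither side sits idle waiting for the other.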

  • Class-hash: the deprecated layout never wrote a deploy entry — the deploy-time hash lives only in the first replace's "pre-value". The migration inserts an explicit deploy_h entry on top of the shifted history, growing the count by one per replaced contract.
  • Nonce / storage: the first change entry's pre-value is 0 (the deploy default). Shift only; entry count per contract / per slot is unchanged.
  • Storage: the "last value for a slot" comes from the head storage trie, not the Contract record. The ingestor walks the deprecated history and the head trie in lockstep (both sorted by raw slot bytes); slots with no head leaf (zeroed out) resolve to felt.Zero.
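
The lockstep walk described in the storage bullet is essentially a sorted merge. The sketch below assumes both inputs arrive as slices already sorted by raw slot bytes; `leaf`, `resolveLastValues`, and the use of `uint64` in place of a felt are hypothetical simplifications for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// leaf is a hypothetical stand-in for one leaf of the head storage trie.
type leaf struct {
	Slot  []byte
	Value uint64 // stands in for a felt
}

// resolveLastValues walks historySlots and headLeaves in lockstep. Both
// inputs must be sorted by raw slot bytes. For every history slot it returns
// the head value, or 0 when the slot has no head leaf (zeroed out).
func resolveLastValues(historySlots [][]byte, headLeaves []leaf) []uint64 {
	out := make([]uint64, 0, len(historySlots))
	j := 0
	for _, slot := range historySlots {
		// Skip head leaves that sort before the current history slot.
		for j < len(headLeaves) && bytes.Compare(headLeaves[j].Slot, slot) < 0 {
			j++
		}
		if j < len(headLeaves) && bytes.Equal(headLeaves[j].Slot, slot) {
			out = append(out, headLeaves[j].Value)
		} else {
			out = append(out, 0) // no head leaf: the slot was zeroed out
		}
	}
	return out
}

func main() {
	history := [][]byte{{0x01}, {0x02}, {0x05}}
	head := []leaf{{Slot: []byte{0x01}, Value: 7}, {Slot: []byte{0x05}, Value: 9}}
	fmt.Println(resolveLastValues(history, head)) // prints [7 0 9]
}
```

Because both cursors only move forward, the walk is linear in the combined number of history slots and head leaves, with no per-slot trie lookup.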

What changes

  • New migration/statehistory/ package (migrator, three per-phase ingestors, shared baseIngestor, committer, counter, parse helpers, tests).
  • Registered in node/migration.go as an optional migration gated by cfg.NewState, running after the headstate migration.

Resume safety

  • Per-contract writes + deprecated-row deletion happen in the same batch; pebble batches commit atomically.
  • A contract whose history is large may span more than one batch. Each new entry's value is a pure function of the deprecated source, so re-running over a partially-rewritten contract overwrites with identical values and then deletes the (still-present) deprecated rows.
  • Contracts whose deprecated entries are already gone short-circuit on an empty iterator.
  • Phases run sequentially; a later phase only starts after the earlier phase completes.
  • Context cancellation returns (shouldRerun, ctx.Err()).
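
To make the re-run argument concrete, here is a small in-memory simulation of an interrupted multi-batch rewrite followed by a resume. Everything in it is hypothetical (`store`, `migrateContract`, the `crashAfter` knob), and it models the shift-only nonce/storage case; the point is only that recomputing new rows from the still-present deprecated rows gives identical values.

```go
package main

import "fmt"

type row struct {
	Block uint64
	Value string
}

// store is a hypothetical in-memory stand-in for one contract's key ranges.
type store struct {
	deprecated []row             // pre-update rows, sorted by block
	migrated   map[uint64]string // post-update rows keyed by block
}

// migrateContract rewrites one contract's rows in simulated batches of
// batchSize new rows (batchSize must be positive). The deprecated rows are
// deleted together with the final batch. If crashAfter >= 0, the run stops
// once that many batches have been committed, simulating an interruption.
// It reports whether the contract was fully migrated.
func migrateContract(s *store, head string, batchSize, crashAfter int) bool {
	if len(s.deprecated) == 0 {
		return true // deprecated rows already gone: nothing to walk
	}
	batches := 0
	for start := 0; start < len(s.deprecated); start += batchSize {
		if crashAfter >= 0 && batches == crashAfter {
			return false
		}
		end := start + batchSize
		if end > len(s.deprecated) {
			end = len(s.deprecated)
		}
		for i := start; i < end; i++ {
			next := head
			if i+1 < len(s.deprecated) {
				next = s.deprecated[i+1].Value
			}
			// Pure function of the deprecated source: safe to overwrite.
			s.migrated[s.deprecated[i].Block] = next
		}
		if end == len(s.deprecated) {
			s.deprecated = nil // deleted in the same batch as the last new rows
		}
		batches++
	}
	return true
}

func main() {
	s := &store{
		deprecated: []row{{200, "V0"}, {500, "V1"}, {900, "V2"}},
		migrated:   map[uint64]string{},
	}
	fmt.Println(migrateContract(s, "V3", 2, 1))  // false: interrupted after one batch
	fmt.Println(migrateContract(s, "V3", 2, -1)) // true: resume overwrites, then deletes
	fmt.Println(s.migrated[200], s.migrated[500], s.migrated[900]) // V1 V2 V3
}
```

A third call returns immediately because the deprecated slice is empty, which corresponds to the empty-iterator short-circuit in the list above.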

Alternatives considered

Two earlier attempts were benchmarked and dropped:

  1. Wipe + rewrite from state updates. Drop the deprecated history entirely and rebuild the new layout by replaying state updates block-by-block. Conceptually clean, but the resulting writes touch every history bucket in near-random order: pebble compaction has to merge many small per-block updates across overlapping key ranges, so compaction pressure dominated runtime.

  2. Per-address instead of per-phase. Loop over contracts once and run all three phases (class-hash, nonce, storage) inside the same per-contract worker, finishing each contract before moving on. Saves two passes over the Contract bucket but interleaves writes to three different history buckets per contract — again scattered, again heavy on compaction.

With the current per-phase approach, writes are tightly clustered: one phase writes only one history bucket, in contract-address order, with deprecated DeleteRanges landing in the same batch as the new rows that replace them. Sequential, large, mostly sorted writes are pebble's happy path. The two extra Contract-bucket scans are negligible compared to the compaction savings.

@MaksymMalicki MaksymMalicki marked this pull request as draft May 19, 2026 11:20

codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 62.09913% with 130 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.10%. Comparing base (c2eacce) to head (05dd000).

Files with missing lines                      │ Patch % │ Lines
──────────────────────────────────────────────┼─────────┼────────────────────────────
migration/statehistory/storage_ingestor.go    │ 63.82%  │ 21 Missing and 13 partials ⚠️
migration/statehistory/class_hash_ingestor.go │ 56.45%  │ 15 Missing and 12 partials ⚠️
migration/statehistory/migrator.go            │ 70.49%  │ 13 Missing and 5 partials ⚠️
migration/statehistory/counter.go             │ 41.37%  │ 16 Missing and 1 partial ⚠️
migration/statehistory/nonce_ingestor.go      │ 62.16%  │ 7 Missing and 7 partials ⚠️
migration/statehistory/ingestor.go            │ 66.66%  │ 8 Missing ⚠️
migration/statehistory/parse.go               │ 42.85%  │ 6 Missing and 2 partials ⚠️
migration/statehistory/committer.go           │ 90.00%  │ 1 Missing and 1 partial ⚠️
node/migration.go                             │ 0.00%   │ 2 Missing ⚠️

Additional details and impacted files
@@                      Coverage Diff                       @@
##           maksym/headstate-migration    #3658      +/-   ##
==============================================================
- Coverage                       76.18%   76.10%   -0.09%     
==============================================================
  Files                             400      408       +8     
  Lines                           36571    36913     +342     
==============================================================
+ Hits                            27861    28091     +230     
- Misses                           6715     6792      +77     
- Partials                         1995     2030      +35     

@MaksymMalicki MaksymMalicki force-pushed the maksym/statehistory-migration branch from 5c0928e to 5a09f6c Compare May 19, 2026 22:36
@MaksymMalicki MaksymMalicki marked this pull request as ready for review May 19, 2026 22:41
@MaksymMalicki MaksymMalicki force-pushed the maksym/headstate-migration branch from 0f8a352 to c2eacce Compare May 20, 2026 12:55
@MaksymMalicki MaksymMalicki force-pushed the maksym/statehistory-migration branch from fa47934 to 05dd000 Compare May 20, 2026 12:56