Skip to content

Change Resolver from fn pointer to Arc<dyn Fn> for storage backends#10

Open
mwaddip wants to merge 3 commits into
ergoplatform:mainfrom
mwaddip:main
Open

Change Resolver from fn pointer to Arc<dyn Fn> for storage backends#10
mwaddip wants to merge 3 commits into
ergoplatform:mainfrom
mwaddip:main

Conversation

@mwaddip
Copy link
Copy Markdown

@mwaddip mwaddip commented Apr 4, 2026

Changes Resolver from fn(&Digest32) -> Node to Arc<dyn Fn(&Digest32) -> Node + Send + Sync>. A bare fn can't capture a database handle, so VersionedAVLStorage / PersistentBatchAVLProver can't be backed by real storage today.

Arc not Box because AVLTree: Clone and Box isn't. All 22 tests pass; existing sites wrapped via Arc::new(...).

…upport

The Resolver type was defined as a bare function pointer (fn(&Digest32) -> Node),
which cannot capture state. This makes it impossible to implement VersionedAVLStorage
with a real storage backend — the resolver needs to load nodes from a database, but
a function pointer cannot hold a database reference.

Changed to Arc<dyn Fn(&Digest32) -> Node + Send + Sync> which allows closures that
capture storage handles. Arc (not Box) because AVLTree derives Clone. Send + Sync
for thread safety with concurrent readers.

All 22 existing tests pass unchanged (modulo wrapping bare functions in Arc::new).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@kushti kushti requested review from ross-weir and sethdusek April 5, 2026 22:01
Adds a default-no-op method on the storage trait so callers can force a
durable commit (fsync) of outstanding writes. Implementations using
deferred writes (e.g. redb Durability::None) should override this to
fsync the backing store. In-memory or always-durable storages can keep
the default no-op.

Motivation: PersistentBatchAVLProver::generate_proof_and_update_storage
uses non-durable writes for throughput. Without a caller-triggered flush,
graceful SIGTERM leaves uncommitted state pending; on reopen, redb sees
no valid commit and treats the storage as empty. Periodic flush calls
bound crash data loss.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…bsent

`contains_recursive` previously returned false at the catch-all `else`
arm whenever it reached a non-Internal node. That branch fires for both
Leaf nodes (terminal — return false is correct) and LabelOnly nodes
that the resolver could not materialise (return false is *unsafe*).

Used by `removed_nodes()` to decide which digests to delete from
persistent storage:

  for cn in &self.base.changed_nodes_buffer_to_check {
      if !self.contains(cn) {
          self.base.changed_nodes_buffer.push(cn.clone())
      }
  }

If `contains()` walks into an unresolvable subtree, returning false says
"definitely not in tree → delete it". The caller deletes the node from
storage. But the node may still be referenced from the very subtree we
couldn't resolve. The next walk into that subtree hits a LabelOnly with
the deleted digest and bails:

  ERROR ergo_sync::state: apply_state failed
    error=UTXO state operation failed:
      operation N failed: Should never reach this point.
      If in prover, this is a bug. If in verifier, this proof is wrong.

Observed in production on a Rust full-node implementation of the Ergo
mainnet at v0.4.x. After ~250 blocks of steady-state validation with
the typical "most subtrees are LabelOnly, only walked paths are
materialised" prover state, an on-disk scan revealed:

  Reachable nodes:    6,353,404
  Missing references:       135   (parent in storage, child digest not in NODES_TABLE)
  Orphan nodes:         378,274   (in storage but unreachable from root)

Fix: distinguish Leaf from LabelOnly at the catch-all. Leaf with
non-matching label remains terminal → false. LabelOnly that the
resolver couldn't materialise → true (fail safe). At worst this leaks
orphan nodes; never silently corrupts.

Refactored to use a `Kind` enum so we can scope the immutable borrow
that does the discrimination separately from the mutable borrow that
drives the Internal-node walk.

All 22 existing tests pass unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mwaddip
Copy link
Copy Markdown
Author

mwaddip commented May 18, 2026

Downstream impact heads-up: ergotree-interpreter in sigma-rust has 13 AVLTree::new call sites that pass a bare-fn closure as the Resolver, all of which need Arc::new(|digest: &Digest32| ...) wrapping when this lands.

Scope (3 files):

  • ergotree-interpreter/src/eval/savltree.rs — 10 sites (7 production EvalFns + 3 test sites)
  • ergotree-interpreter/src/eval/tree_lookup.rs — 2 sites
  • ergotree-interpreter/src/eval/create_avl_tree.rs — 1 test site

Pure type-level adaptation, no semantic changes.

Tracking PR on sigma-rust side, currently Draft pending this merge + a version bump: ergoplatform/sigma-rust#865.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant