Implement JIT costing for ErgoTree evaluation with Scala parity #846

Open
arkadianet wants to merge 6 commits into ergoplatform:develop from arkadianet:jit-costing
@arkadianet arkadianet commented Feb 25, 2026

Summary

Implements JIT cost accounting parity improvements for issue #193 and fixes several Scala-parity gaps found during a node/path cost audit.

Fixes

  • Fix Cthreshold crypto verification cost by including TO_BYTES_CONJUNCTION.
  • Charge Scala's StorageContractCost = 50 for storage-rent spends.
  • Enforce cumulative per-transaction cost limit in validate() by giving each input only the remaining block budget.
  • Fix PerItemCost::total_cost(0) to match Scala PerItemCost.chunks, where zero items still charge one chunk.
  • Charge DeserializeContext / DeserializeRegister substitution costs:
    • payload bytes: CostPerByteDeserialized = 2 block cost, converted to JIT scale
    • tree bytes: CostPerTreeByte = 2 block cost, V6-activated only
    • register default fallback does not charge payload cost

Cost Coverage

Audited the IR/eval cost paths against issue #193:

  • fixed-cost expression dispatch is covered in charge_expr_cost
  • runtime-size costs are charged in their eval implementations
  • equality costs dispatch through DataValueComparer
  • deserialize substitution is now explicitly charged before evaluating substituted expressions
  • non-evaluable nodes are either charged by their containing construct or intentionally rejected

Tests

Added targeted JIT conformance tests with hard-coded Scala literals and source citations for:

  • Cthreshold polynomial costs
  • DecodePoint, Exponentiate, MultiplyGroup
  • EQ_GroupElement
  • PerItemCost zero-item and AtLeast chunk boundaries
  • UnsignedBigInt.modInverse
  • SHeader.checkPow
  • DeserializeContext/Register substitution costs
  • cumulative per-input remaining-budget validation

The checked-in corpus tests remain as real-transaction integration smoke coverage, not as the primary correctness proof.

Verification

  • cargo test -p ergotree-interpreter --features arbitrary
  • cargo clippy -p ergotree-interpreter --features arbitrary
  • cargo fmt --check
  • cargo test -p ergo-lib --test cost_parity --release -- --include-ignored

Full corpus parity remains green: 78/78 smoke and 19,549/19,549 full corpus.

Notes

Rust charges deserialize payload costs only after all substitutions complete; the total cost still matches Scala. Scala updates the cost after each payload substitution, so it can detect a cost-limit breach earlier for scripts with multiple deserialize nodes.

@kushti kushti requested a review from sethdusek February 25, 2026 19:08
Closes the known JIT costing parity gaps between sigma-rust and
the Scala sigmastate-interpreter, validated on 19,549 mainnet
transactions across 11 height ranges (500000-1751000) with zero
mismatches.

## What this implements

Per-operation JIT cost accumulation during ErgoTree evaluation,
matching Scala's CostAccumulator model. Every Expr variant (55+)
and method EvalFn (89+) charges costs during evaluation. The full
transaction validation pipeline computes:

  block_cost = (init_cost + SUM(eval_snapped + crypto)) / 10
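
The formula above can be sketched as follows. This is a minimal illustration, not the PR's code: `InputCost` and `tx_block_cost` are hypothetical names, all inputs are assumed to be in JIT units, and `eval_snapped` is assumed to have been snapped by the caller already (the snapping rule itself is not modelled).

```rust
/// JIT cost units per block cost unit (JitCost = 10 * block cost).
const JIT_PER_BLOCK: u64 = 10;

/// Per-input cost components, both in JIT units (hypothetical type).
struct InputCost {
    /// Evaluation cost, assumed to be snapped already by the caller.
    eval_snapped: u64,
    /// Crypto verification cost for the input.
    crypto: u64,
}

/// block_cost = (init_cost + SUM(eval_snapped + crypto)) / 10
/// Returns None if the JIT total overflows u64.
fn tx_block_cost(init_cost: u64, inputs: &[InputCost]) -> Option<u64> {
    let mut total = init_cost;
    for input in inputs {
        total = total
            .checked_add(input.eval_snapped)?
            .checked_add(input.crypto)?;
    }
    // Unsigned integer division floors, matching "floor division, not ceiling".
    Some(total / JIT_PER_BLOCK)
}
```

Under these assumptions, an init cost of 105 JIT plus one input with 50 JIT of evaluation and 3980 JIT of crypto cost sums to 4135 JIT, which floors to a block cost of 413.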

## Bugs fixed vs prior PR ergoplatform#846 attempt

1. ConstPlaceholder resolve-in-place: proposition_for_cost_eval()
   resolves placeholders at cost=1 instead of substituting to
   Const at cost=5. Handles mixed trees correctly.
2. trivial_reduce for P2PK: short-circuits with cost=50 matching
   Scala's EVAL_SIGMA_PROP_CONSTANT.
3. SubstConstants: charges template constant count, not replacement
   count.
4. Collection EQ MATCH_TYPE: adds MatchType(1) dispatch cost.
5. Collection EQ length mismatch: no spurious base_cost charge.
6. ADD_TO_ENV_COST: charges 5 JitCost per lambda invocation in
   ForAll/Exists/Map/Filter/Fold/Option.map/Coll.flatMap.
7. Slice cost: uses output size, not input size.
8. to_block_cost: floor division, not ceiling.
9. estimate_crypto_cost: ProveDlog=3980, ProveDhTuple=7140.
10. validate() pipeline: init cost + per-input fresh context +
    snap-to-block-boundary + crypto cost + floor division.

## Test infrastructure

- ergo-lib/tests/cost_parity.rs: cross-validation harness with
  smoke (78 txs, always runs) and full corpus (19549 txs, #[ignore])
- 31 targeted cost unit tests including regressions for each bug
- 6 crypto cost unit tests

## Parity results

- smoke_cost_parity (external pipeline): 78/78 matched
- smoke_validate_parity (validate() direct): 78/78 matched
- full_corpus_cost_parity: 19549/19549 matched, 0 mismatches

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@arkadianet arkadianet changed the title from "Implement JIT costing for ErgoTree evaluation" to "Implement JIT costing for ErgoTree evaluation with Scala parity" on Apr 12, 2026
1. Add MaxBlockCost(1_000_000) to Parameters::default() — was missing,
   would panic if accessed.
2. make_context() now derives jit_cost_limit from
   state_ctx.parameters.max_block_cost() * 10 instead of hardcoded
   10_000_000 literal.
3. Harness estimate_crypto_cost replaced with real crate import
   (ergotree_interpreter::sigma_protocol::crypto_cost) — fixes
   Cthreshold formula mismatch between harness and shipped code.

Full corpus: 19549 compared, 0 mismatches. 31/31 cost tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
arkadianet and others added 4 commits April 22, 2026 16:25
Two Scala-parity gaps surfaced by cross-checking against the Scala
reference at ergo-core:

- crypto_cost.rs: Cthreshold cost was missing the conjunction node's
  TO_BYTES_CONJUNCTION term. Scala's Interpreter.scala:580-587 adds
  nodeC = ToBytes_ProofTreeConjecture.costKind.cost alongside
  parseC, evalC, and childrenC. Updated the 2-of-3 test expectation
  from 11978 to 11993.

- tx_context.rs: storage-rent spend bypass added zero cost; Scala
  charges Constants.StorageContractCost = 50 (block-cost scale) per
  expired-box spend, per ErgoInterpreter.scala:81 and Constants.scala:35.
  Added STORAGE_CONTRACT_COST_BLOCK = 50 and charge 500 JIT (= 50
  block cost) on successful bypass.

Neither gap is triggered by the 78-tx smoke corpus or 19,549-tx full
corpus — mainnet heights in those ranges don't exercise Cthreshold
or expired-box spends — so the shipped parity tests still pass
78/78, 78/78, and 19,549/19,549 unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously each input in validate() was given a fresh Context with
jit_cost_limit = max_block_cost * 10, so an adversarial multi-input tx
could stack per-input costs (each individually under the cap) with no
cross-input check until the total was returned. This matches Scala's
Interpreter behaviour at the interpreter-call level but skips the
tx-level cumulative cap enforced by ErgoTransaction.scala:133–159.

Per-input budget is now the remaining block budget (Scala's
`costLimit = maxCost - currentTxCost` at ErgoTransaction.scala:135),
converted from block scale each time so sub-block JIT remainders don't
cost the tx up to 9 JIT of unfair headroom. Post-input cumulative checks
match ErgoTransaction.scala:159.
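
The per-input budget described above might look roughly like this. The helper name `remaining_jit_limit` and its signature are assumptions made for illustration; the actual code sets `ctx.jit_cost_limit` inside `validate()`.

```rust
/// JIT cost units per block cost unit (JitCost = 10 * block cost).
const JIT_PER_BLOCK: u64 = 10;

/// Hypothetical helper: JIT limit for the next input, derived from the
/// remaining block budget (costLimit = maxCost - currentTxCost).
/// Working in block scale first means sub-block JIT remainders from
/// earlier inputs never leak into the next input's limit.
/// Returns None if the budget is already exceeded or the product overflows.
fn remaining_jit_limit(max_block_cost: u64, current_tx_block_cost: u64) -> Option<u64> {
    max_block_cost
        .checked_sub(current_tx_block_cost)?
        .checked_mul(JIT_PER_BLOCK)
}
```

With `max_block_cost = 1_000_000` and nothing spent yet, the sketch yields 10,000,000 JIT, the same value as the previously hardcoded literal.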

Changes:

- TxValidationError::CostLimitExceeded { phase, block_cost, limit }
  variant — phase identifies which stage tripped (init, input N,
  storage_rent N, or post_eval N).
- validate(): four explicit checks — init cost pre-loop, post-storage-
  rent cost, mid-eval (via interpreter's own LimitExceeded surfacing),
  and post-eval cumulative check.
- Mid-eval breach: EvalError::CostError(LimitExceeded) is mapped to
  CostLimitExceeded so callers don't unpack VerifierError to diagnose
  cost-driven failures.
- make_context() signature untouched; per-input budget is set by
  mutating ctx.jit_cost_limit after creation.
- cost_accum promoted from pub(crate) to pub so CostError is reachable
  from ergo-lib.
- Two new regression tests: cumulative breach on a tight max_block_cost,
  and init-only breach with max_block_cost = 1.

All existing tests unchanged: 102 ergo-lib unit tests, 373 interpreter
unit tests, 6 crypto_cost tests, 78/78 smoke parity, 78/78 validate
parity, 19,549/19,549 full corpus parity, clippy clean, fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds targeted unit tests that assert cost values against hard-coded Scala
literals (not costs::X.0 constants) so drift vs Scala is catchable by
reading the test alone. Each assertion cites the originating Scala source
file:line. Surfaces and fixes one shared Scala-parity bug along the way.

PerItemCost zero-item fix
- costs.rs: total_cost(n) now uses `n.saturating_sub(1) / chunk_size + 1`,
  matching Scala `PerItemCost.chunks` at CostKind.scala:26 where signed
  arithmetic yields `chunks(0) = (-1)/size + 1 = 1`. The previous
  `if n == 0 { 0 } else { div_ceil }` special-case undercharged every
  PerItemCost op on an empty input by per_chunk_cost. Existing
  per_item_cost_calculation test updated: cost(0) is now 30, not 20.
  Doc comment rewritten to cite Scala and explain the saturating_sub
  guard against u32 underflow.
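
A minimal sketch of the chunk formula described above. The struct and field names are assumptions rather than the crate's actual definitions, and the total is assumed to be `base_cost + chunks * per_chunk_cost`. The sketch also assumes `chunk_size >= 2`, where Scala's truncating signed division gives `(-1) / size = 0`.

```rust
/// Illustrative stand-in for a per-item cost descriptor (names are assumptions).
#[derive(Clone, Copy)]
struct PerItemCost {
    base_cost: u32,
    per_chunk_cost: u32,
    chunk_size: u32, // assumed >= 2 in this sketch
}

impl PerItemCost {
    /// Number of chunks charged, mirroring `(nItems - 1) / chunkSize + 1`.
    /// `saturating_sub` keeps `n_items = 0` from underflowing u32, so zero
    /// items still charge one chunk.
    fn chunks(&self, n_items: u32) -> u32 {
        n_items.saturating_sub(1) / self.chunk_size + 1
    }

    /// Base cost plus one per-chunk charge for every chunk.
    fn total_cost(&self, n_items: u32) -> u32 {
        self.base_cost + self.chunks(n_items) * self.per_chunk_cost
    }
}
```

Taking the `PerItemCost(20,3,5)` triple as base 20, per-chunk 3, and chunk size 5 (an assumed argument order), this formula gives 23 for 0, 1, and 5 items and 26 for 6 items.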

Cthreshold crypto cost — broader coverage
- test_cthreshold_2_of_3_dlog: reworded to carry the Scala formula
  (Interpreter.scala:580-587) and assert the literal 11993.
- test_cthreshold_3_of_5_dlog: new, expects 19990 JIT.
- test_cthreshold_5_of_5_dlog_degenerate: new, expects 19940 JIT.
  Exercises the n_coefs == 0 boundary of the polynomial formula.

Fixed-cost op conformance (each asserting hard-coded JIT literal + Scala ref)
- decode_point_cost_matches_scala_trees_529 -> 305 JIT
- exponentiate_cost_matches_scala_trees_1046 -> 910 JIT
- multiply_group_cost_matches_scala_trees_1067 -> 50 JIT
- eq_group_element_cost_matches_scala_dvc_44 -> 182 JIT
- mod_inverse_cost_matches_scala_methods_574 -> 164 JIT
- sheader_check_pow_cost_matches_scala_methods_1816 -> delta 704 JIT
  (isolated by subtracting the Context/Headers/ByIndex baseline; avoids
   coupling the test to unrelated plumbing costs)

PerItemCost boundary tests
- per_item_cost_zero_items_charges_one_chunk: direct primitive regression
  locking in the Scala-matching formula.
- atleast_cost_one_item / at_chunk_boundary_five / past_chunk_boundary_six:
  evaluate AtLeast(PerItemCost(20,3,5)) at n=1, 5, 6; verifies chunk math
  matches Scala LanguageSpecificationV5.scala:8917.

All 385 interpreter unit tests pass (was 372 before these additions).
Smoke cost parity 78/78 and full corpus 19,549/19,549 unchanged.
Clippy clean, fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ErgoTrees containing DeserializeContext or DeserializeRegister nodes
went through substitute_deserialize() with zero cost charged, while
Scala charges per-byte costs via Interpreter.deserializeMeasured and
reductionWithDeserialize. Close the gap by matching the active Scala
runtime formula (not the unused transformers.scala PerItemCost companion
descriptor).

Scala runtime path (Interpreter.scala):
  :81  CostPerByteDeserialized = 2    (block cost; always charged per
                                        byte of each substituted payload)
  :88  CostPerTreeByte         = 2    (block cost; charged once per
                                        serialized ErgoTree byte, gated
                                        on V6/Evolution activation)
  :99-107  deserializeMeasured adds scriptBytes.length * 2 to initCost
  :240-259 reductionWithDeserialize adds ergoTree.bytes.length * 2 to
           initCost when VersionContext.current.isV6Activated

Converted to JIT scale (JitCost = 10 * block cost), so each byte
contributes 20 JIT.

Implementation:

- ergotree-ir: substitute_deserialize_with_stats(ctx) -> (Expr, Vec<usize>)
  returns the rewritten expression plus the byte lengths of each
  byte-backed substitution that actually took place. Lengths are pushed
  only after the parsed sub-expression passes type-check, so the Vec
  reflects "substitutions performed". Default fallback for
  DeserializeRegister records nothing (no bytes deserialized).
  The existing substitute_deserialize() now delegates and drops stats,
  keeping ergotree-ir cost-agnostic.
- ergotree-interpreter: in reduce_to_crypto, when tree.has_deserialize():
  - if ctx.activated_script_version() >= ErgoTreeVersion::V3, charge
    tree.sigma_serialize_bytes().len() * 20 JIT
  - call substitute_deserialize_with_stats and charge each payload_len
    * 20 JIT
  - evaluate the substituted expression
  Cost conversion goes through a byte_len_to_deserialize_jit_cost helper
  that uses checked_mul(20) and try_into::<u32> so an adversarially sized
  payload can't wrap the accumulator.
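
The checked conversion described above could be sketched like this. The function name comes from the commit message, but the signature, return type, and constant name here are assumptions; the real helper may return a cost error instead of `Option`.

```rust
use std::convert::TryFrom;

/// JIT units charged per byte: 2 block cost * 10 JIT per block cost.
const JIT_PER_DESERIALIZED_BYTE: usize = 20;

/// Sketch: convert a byte length to a JIT cost without wrapping.
/// Returns None if the multiplication overflows usize or the result
/// does not fit in u32.
fn byte_len_to_deserialize_jit_cost(byte_len: usize) -> Option<u32> {
    let jit = byte_len.checked_mul(JIT_PER_DESERIALIZED_BYTE)?;
    u32::try_from(jit).ok()
}
```

A 100-byte payload costs 2000 JIT under this sketch, while an oversized length yields `None` rather than a wrapped value, which a caller would presumably surface as a cost-limit failure.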

Regression tests (3):
  - deserialize_context_cost_pre_v6: activation = V2, only payload cost
    charged (tree-byte cost skipped).
  - deserialize_context_cost_v6_adds_tree_bytes: activation = V3, delta
    vs pre-V6 case equals serialized_tree_bytes.len() * 20.
  - deserialize_register_default_fallback_no_payload_cost: register
    empty, default inlined, no payload cost charged — only eval of the
    default expression.

Each test asserts hard-coded JIT expected values derived from the Scala
formula, with file:line citations.

Granularity caveat: Rust records payload lengths during substitution and
charges them after substitution completes, so totals match Scala. Scala
updates cost after each payload substitution, so cost-limit fail-fast
granularity can differ for scripts with multiple deserialize nodes.
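
The caveat can be illustrated with a toy model, which is not the actual interpreter code. Both functions below are hypothetical: they take per-payload costs in JIT and, on a breach, return how many payloads had already been substituted when the breach was detected.

```rust
/// Toy model of charging after each payload substitution.
/// Err(n) means the limit was found exceeded after n substitutions.
fn charge_per_payload(payload_costs: &[u64], limit: u64) -> Result<u64, usize> {
    let mut total = 0u64;
    for (i, cost) in payload_costs.iter().enumerate() {
        total = total.saturating_add(*cost);
        if total > limit {
            return Err(i + 1);
        }
    }
    Ok(total)
}

/// Toy model of charging once all substitutions are done.
/// Err(n) always reports every payload as already substituted.
fn charge_after_all(payload_costs: &[u64], limit: u64) -> Result<u64, usize> {
    let total = payload_costs
        .iter()
        .fold(0u64, |acc, cost| acc.saturating_add(*cost));
    if total > limit {
        Err(payload_costs.len())
    } else {
        Ok(total)
    }
}
```

Both models agree on the total whenever the limit holds, and both reject a script that exceeds it. They differ only in how much substitution work has happened by the time the breach is noticed.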

All tests pass: 388 interpreter unit tests, 3 corpus tests
(smoke 78/78 + full 19,549/19,549 — deserialize is rare in mainnet so
the corpus is unaffected), clippy clean, fmt clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>