refactor(protocol/sandbox): remove store clear counter#5466
Conversation
# Conflicts: # Cargo.lock # Cargo.toml # ethexe/processor/src/host/api/mod.rs # ethexe/rpc/src/lib.rs # ethexe/signer/src/storage.rs # examples/ctor/build.rs # gcli/src/template.rs # pallets/gear/src/tests.rs # sandbox/host/Cargo.toml # sandbox/host/src/sandbox/wasmer_backend.rs # sandbox/sandbox/Cargo.toml # sandbox/sandbox/src/embedded_executor.rs # utils/calc-stack-height/src/main.rs # utils/gear-wasmer-cache/Cargo.toml # utils/gear-wasmer-cache/src/lib.rs # utils/gring/tests/command.rs # utils/lazy-pages-fuzzer/src/wasmtime_backend.rs
# Conflicts: # Cargo.lock # utils/gear-workspace-hack/Cargo.toml
…store-clear # Conflicts: # Cargo.lock # Cargo.toml # ethexe/processor/src/host/api/mod.rs # protocol/lazy-pages/fuzzer/src/wasmtime_backend.rs # protocol/lazy-pages/src/sys/windows.rs # protocol/sandbox/host/src/sandbox/wasmtime_backend.rs # protocol/sandbox/src/embedded_executor.rs # utils/gear-workspace-hack/Cargo.toml # vara/pallets/gear/src/schedule.rs # vara/runtime/interface/sandbox/src/detail.rs
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refactors the sandbox protocol by eliminating the store clear counter, a mechanism that became redundant with the full adoption of the Wasmtime backend. The change simplifies the host context, removes related configuration options from the CLI, and ensures that sandbox initialization no longer involves a clear limit. A new benchmark has been added to validate that this removal does not negatively impact performance, particularly concerning Wasmtime store growth. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request removes the sandbox store clear counter limit logic and its associated CLI argument, simplifying sandbox initialization. It also introduces a new benchmark, wasmtime_store_growth, to evaluate Wasmtime store growth. However, the benchmark code contains several compilation issues, including references to private wasmtime constants and the use of unstable Rust features like is_multiple_of and let chains, which will fail to compile on stable Rust.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b3829b5d7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| state.sandboxes.clear(&mut state.clear_counter); | ||
| state | ||
| .sandboxes | ||
| .get(supervisor_context.data_ptr()) |
There was a problem hiding this comment.
Preserve periodic Wasmtime store resets
When a worker thread keeps using the same StoreData pointer, this now creates each sandbox memory/instance in the same long-lived Wasmtime Store indefinitely. Wasmtime increments the store's lifetime instance/resource counters on each module instantiation and enforces the default 10,000-instance limit, so after enough program executions on the same runtime store, new_memory/instantiation starts failing even though Gear tears down its handles. The removed clear counter was the only path that refreshed this backend store before store_data_ptr changed, so long-running nodes can hit this limit under normal block processing.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
@codex verify the usage of memory_teardown and instance_teardown functions. I guess the limit should be unattainable
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5b3829b5d7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| state.sandboxes.clear(&mut state.clear_counter); | ||
| state | ||
| .sandboxes | ||
| .get(supervisor_context.data_ptr()) |
There was a problem hiding this comment.
Restore a reset before exhausting Wasmtime store limits
For long-lived runtime stores, relying on memory_teardown/instance_teardown is not enough: those functions only replace Gear's handle slots with None, while the underlying Wasmtime Store keeps monotonically bumped resource counters (DEFAULT_INSTANCE_LIMIT/DEFAULT_MEMORY_LIMIT are 10,000 and are not decremented on teardown). After roughly 10k sandbox instantiations on the same StoreData pointer, instantiate starts returning ERR_MODULE from Wasmtime's resource limit even though every Instance/Memory was dropped; the new benchmark also has to recreate BenchState before that limit, which confirms the limit is reachable unless the store is periodically cleared/recreated here.
Useful? React with 👍 / 👎.
Closes #5465
Summary
Removes the sandbox store clear counter plumbing now that the runtime uses the Wasmtime backend. The sandbox host context no longer tracks a per-thread clear counter, the runtime/node/replay entry points initialize the sandbox backend without a clear limit, and the CLI option was removed.
Keeps the Criterion benchmark that compares a long-lived Wasmtime-backed sandbox store with periodic store clearing, so the old Wasmer-era workaround stays covered by measurable evidence.
How to test
cargo test -p gear-sandbox-host init_selects_backend_without_store_clear_limit cargo clippy -p gear-sandbox-host -p gear-sandbox-interface -p gear-cli -p gear-replay-cli --all-targets -- -D warnings cargo bench -p gear-sandbox-host --bench wasmtime_store_growthNotes
Current local benchmark result did not show a Wasmtime slowdown comparable to the old Wasmer behavior:
Checklist
type(scope): description)