Skip to content

feat: share remote ALT readiness waiters#1288

Open
thlorenz wants to merge 12 commits into
masterfrom
thlorenz/table-mania-shared-readiness
Open

feat: share remote ALT readiness waiters#1288
thlorenz wants to merge 12 commits into
masterfrom
thlorenz/table-mania-shared-readiness

Conversation

@thlorenz

@thlorenz thlorenz commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Summary

Share in-flight ALT remote-readiness waiters across compatible concurrent callers in magicblock-table-mania so that multiple transactions waiting on the same set of address-lookup tables coalesce onto a single batched polling loop instead of each running their own.

  • Adds a batch-level shared waiter keyed by the set of (table_address, required_pubkeys, latest_update_sent_at) tuples
  • Preserves the existing batched get_multiple_accounts_with_commitment(&matching_table_keys, ...) call shape — no extra RPCs for isolated callers
  • Caches successful results briefly (750ms TTL) so back-to-back compatible requests can skip the poll entirely; failures are never cached
  • Lives in its own remote_readiness_waiters module with focused unit tests

Details

magicblock-table-mania

The existing phase-2 "wait for the remote ALT entries to reflect what we just sent" loop ran independently per caller. When several transactions referenced overlapping pubkey sets in the same window, each one issued its own get_multiple_accounts_with_commitment poll loop.

Now compatible callers share a single in-flight waiter:

  • A RemoteReadinessWaiterKey captures the table set, the required pubkeys per table (sorted for deterministic comparison), and the latest_update_sent_at watermark. is_satisfied_by defines when one key's result can satisfy another: same tables, required pubkeys are a subset of the other's, and the other's update watermark is same-or-newer.
  • RemoteReadinessWaiters::wait_or_spawn looks up a compatible cached success first, then a compatible active waiter (joined via tokio::sync::watch), and only spawns a new batch poll if neither exists. The spawned task is decoupled from any individual awaiter, so dropping one caller does not cancel the shared work.
  • Successful results are cached for REMOTE_READINESS_SUCCESS_TTL (750ms) and pruned on every lookup. Failures are surfaced through a new TableManiaError::SharedRemoteReadinessFailure(Arc<TableManiaError>) variant and intentionally not cached so the next caller can retry.
  • The phase-2 batch loop itself is unchanged; it was extracted into wait_for_remote_tables_readiness_batch and is now driven through the shared waiter via wait_for_remote_tables_readiness_shared.

All of this lives in a new remote_readiness_waiters module so manager.rs only deals with key construction and the public flow. The module ships with unit tests covering key sorting, subset/superset matching, missing-table and missing-pubkey rejection, stale-watermark rejection, active-waiter sharing, success-cache reuse and expiry, failure non-caching, and the drop-does-not-cancel behavior.

Expected impact:

  • Isolated caller: identical RPC behavior to before.
  • Concurrent compatible callers: one shared batch loop instead of N.
  • Incompatible callers: untouched — they get their own loop.

Summary by CodeRabbit

  • Refactor
    • Centralized, deduplicated coordination for remote table readiness with short-lived caching to reduce redundant work and improve concurrency performance.
  • Bug Fix / Error Handling
    • Better propagation of shared remote-readiness failures so callers receive clearer aggregated error context.
  • Tests
    • Added comprehensive unit tests for readiness matching, cache/TTL, reuse semantics, batching, and failure scenarios.

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 91ab77d6-e988-4e5d-b22b-ab697c735b31

📥 Commits

Reviewing files that changed from the base of the PR and between e396d35 and 16b3f50.

📒 Files selected for processing (3)
  • magicblock-table-mania/src/error.rs
  • magicblock-table-mania/src/manager.rs
  • magicblock-table-mania/src/remote_readiness_waiters.rs

📝 Walkthrough

Walkthrough

This PR introduces a shared async deduplication mechanism for "remote readiness" checking in table-mania's lookup-table initialization. It adds a new SharedRemoteReadinessFailure error variant, creates a standalone remote_readiness_waiters module that implements request deduplication and TTL-based result caching using Tokio watch channels, and integrates this into TableMania by replacing an inline fetch/verify loop with calls to the new shared helpers. The mechanism ensures concurrent callers with compatible requirements share the same spawned task and reuse cached successes within a 750ms window.

Suggested reviewers

  • GabrielePicco
  • snawaz
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch thlorenz/table-mania-shared-readiness

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
magicblock-table-mania/src/manager.rs (1)

769-776: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace this production unwrap() with an explicit contract error.

Line 775 will panic the whole request if remote_tables is ever missing an address, which is exactly the kind of cross-module invariant that should surface as a TableManiaError instead of aborting the flow. Please convert this into an explicit error path with the table address in the message so shared-waiter regressions stay diagnosable.

As per coding guidelines, "Treat any usage of .unwrap() or .expect() in production Rust code as a MAJOR issue."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@magicblock-table-mania/src/manager.rs` around lines 769 - 776, The closure
building AddressLookupTableAccount uses remote_tables.get(&address).unwrap()
which can panic; replace this unwrap with an explicit error-returning path that
maps a missing entry into a TableManiaError (e.g., a contract/validation
variant) containing the offending table address in the message. Locate the map
over matching_tables.into_keys() and change the addresses field to call
remote_tables.get(&address).cloned().ok_or_else(||
TableManiaError::ContractViolation(format!("missing remote table for address
{address}"))) (or the project’s equivalent error constructor) so the function
returns Err(TableManiaError) instead of panicking. Ensure the error type and
propagation match the surrounding function signature.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@magicblock-table-mania/src/remote_readiness_waiters.rs`:
- Around line 136-155: wait_or_spawn currently keys only on matching_tables so
callers with different timeouts can share a waiter and inherit the wrong
deadline; fix by including the caller’s effective deadline in waiter
compatibility or by enforcing per-caller timeouts when awaiting a shared
receiver: either (A) change RemoteReadinessWaiterKey::new to accept the caller
deadline (e.g. compute Instant::now() + timeout and include it in the key used
by wait_or_spawn and state.find_active/state.find_success) so only compatible
deadlines share a waiter, or (B) keep keys as-is but when you get an active
receiver from state.find_active(&key) do not await it directly—instead clone the
receiver and await it via the caller’s own timeout (e.g. use
tokio::time::timeout with the remaining budget), returning a timeout error for
that caller if exceeded while leaving the underlying spawned batch running;
update wait_or_spawn, RemoteReadinessWaiterKey usage, and any
state.find_active/find_success logic accordingly to ensure per-caller deadline
semantics.
- Around line 154-211: find_active is returning stale watch::Receiver instances
when the spawned task exits abnormally; prune dead waiters before reuse by
checking receiver liveness and removing closed entries from state.active. Update
the logic around state.find_active(&key) / ActiveRemoteReadinessWaiter to: when
you locate a candidate receiver, verify it is still alive (i.e., detect that the
sender is not dropped / receiver will not immediately error) and if it is closed
remove that waiter from state.active and continue searching (or fall through to
create a new sender/receiver pair); ensure the same pruning happens anywhere you
iterate state.active so waiters closed before cleanup don’t get reused by
wait_for_result. Use the watch::Receiver error semantics (detect changed().await
would error or the receiver’s closed state via a non-blocking liveness check) to
identify and drop dead waiters.

---

Outside diff comments:
In `@magicblock-table-mania/src/manager.rs`:
- Around line 769-776: The closure building AddressLookupTableAccount uses
remote_tables.get(&address).unwrap() which can panic; replace this unwrap with
an explicit error-returning path that maps a missing entry into a
TableManiaError (e.g., a contract/validation variant) containing the offending
table address in the message. Locate the map over matching_tables.into_keys()
and change the addresses field to call
remote_tables.get(&address).cloned().ok_or_else(||
TableManiaError::ContractViolation(format!("missing remote table for address
{address}"))) (or the project’s equivalent error constructor) so the function
returns Err(TableManiaError) instead of panicking. Ensure the error type and
propagation match the surrounding function signature.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a0d63bcf-d360-4d3c-b015-c1ac770f3d47

📥 Commits

Reviewing files that changed from the base of the PR and between 8ee6832 and 37aa65a.

📒 Files selected for processing (4)
  • magicblock-table-mania/src/error.rs
  • magicblock-table-mania/src/lib.rs
  • magicblock-table-mania/src/manager.rs
  • magicblock-table-mania/src/remote_readiness_waiters.rs

Comment thread magicblock-table-mania/src/remote_readiness_waiters.rs
Comment thread magicblock-table-mania/src/remote_readiness_waiters.rs
@thlorenz thlorenz requested a review from GabrielePicco June 4, 2026 04:32
@thlorenz thlorenz marked this pull request as ready for review June 4, 2026 04:32
thlorenz and others added 2 commits June 4, 2026 13:15
* master:
  fix: reject short account responses (#1290)
  fix: wait for pubsub listeners before reconnect (#1253)
  fix: retry failed program subscriptions (#1268)
  release: 0.12.0 (#1299)
  Ignore compute unit price in processor fees (#1298)
  Recover recent pending intents on restart (#1296)
  chore: adjust log level (#1297)
  release: v0.11.4 (#1292)
  fix(scheduler): remove block subscription in the scheduler (#1293)
  fix(committor): race-condition on cleanup (#1291)
  Handle ALT extend invalid instruction data (#1287)
  fix: use provided compute limits instead of defaults (#1289)
  feat: snapshot accountsdb even in the replica mode (#1282)
  feat: added vrf ephemeral test queue, delegation record and metadata for mb-test-validator (#1281)
  fix: use wire size (1232), not encoded size (1644), for tx fit checks! (#1285)
@thlorenz thlorenz requested a review from bzawisto June 11, 2026 09:26
@thlorenz thlorenz removed the request for review from bzawisto June 11, 2026 09:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant