Merged
20 commits
3c27f16
fix(e2e): hard-timeout iOS CDP eval/evaluate so a wedged WebView fail…
WiktorStarczewski Jun 27, 2026
4d70c1f
docs(changelog): note iOS CDP eval hard-timeout
WiktorStarczewski Jun 27, 2026
8b4021a
fix(e2e): retry miden-client CLI on transient remote-prover connectio…
WiktorStarczewski Jun 27, 2026
4b20eb7
fix(e2e): give guardian iOS auth-structure read a 90s budget (slow ru…
WiktorStarczewski Jun 27, 2026
2a59246
Revert "fix(e2e): give guardian iOS auth-structure read a 90s budget"
WiktorStarczewski Jun 27, 2026
3f5c9e7
fix(e2e): pause background sync during guardian iOS auth read to brea…
WiktorStarczewski Jun 27, 2026
1de78c9
fix(e2e): quiesce all frontend WASM pollers during guardian iOS auth …
WiktorStarczewski Jun 27, 2026
775c9f1
fix(e2e): guardian iOS auth read skips service.sync() to avoid the re…
WiktorStarczewski Jun 27, 2026
9d14e6f
fix(e2e): read guardian auth structure via AccountInspector (pure sto…
WiktorStarczewski Jun 27, 2026
fd892ce
fix(e2e): unblock guardian iOS auth read (skip wallet mutex on getAcc…
WiktorStarczewski Jun 27, 2026
a2e1e0c
fix(e2e): quiesce frontend WASM pollers during guardian iOS auth read…
WiktorStarczewski Jun 27, 2026
8c903ca
fix(e2e): serve guardian iOS auth structure from a balance-poll-captu…
WiktorStarczewski Jun 28, 2026
697b497
fix(e2e): cap _simPair setup at 8min + restart sim subsystem on overr…
WiktorStarczewski Jun 28, 2026
6f3eb4b
fix(e2e): harden macos-26 sim recovery (shutdown all wedged devices) …
WiktorStarczewski Jun 28, 2026
0d00355
fix(e2e): await the guardian auth-structure capture + log it, so it s…
WiktorStarczewski Jun 28, 2026
695dfef
fix(e2e): guardian auth hook falls back to the single stashed structu…
WiktorStarczewski Jun 28, 2026
ae9767b
fix(e2e): guardian iOS auth read over sync atom — async execute_async…
WiktorStarczewski Jun 28, 2026
028a3fa
fix(e2e): let degraded-but-completing iOS sim setup finish (cap 8->13…
WiktorStarczewski Jun 28, 2026
d5c0c45
ci(e2e): run iOS mobile E2E jobs on dedicated macos-26-xlarge runners…
WiktorStarczewski Jun 28, 2026
22d8d9d
docs(changelog): record macos-26-xlarge runner move for iOS mobile E2E
WiktorStarczewski Jun 28, 2026
44 changes: 28 additions & 16 deletions .github/workflows/e2e-blockchain.yml
@@ -378,8 +378,12 @@ jobs:
name: Mobile E2E (devnet)
if: github.event_name != 'workflow_dispatch' || inputs.network == 'both' || inputs.network == 'devnet'
# macos-26 matches build-mobile.yml — required for iOS 26 SDK symbols
-# used by dapp-browser's WKWebViewController.
-runs-on: macos-26
+# used by dapp-browser's WKWebViewController. The -xlarge size is the
+# DEDICATED Apple-Silicon larger runner (2x vCPU/RAM, no noisy-neighbour
+# IO contention) — the shared standard macos-26 runners were intermittently
+# degraded for hours at a time, with CoreSimulator install/launch crawling
+# so badly that two-sim `_simPair` setup couldn't finish even in 13-15 min.
+runs-on: macos-26-xlarge
# Observed worst case: ~40 min setup (CLI compile on cache miss + iOS
# build + simulator cold boot overlap) + ~55 min suite with one retry.
# 90 minutes guillotined an otherwise-passing run at test 7 of 7.
@@ -534,11 +538,13 @@ jobs:
run: yarn test:e2e:mobile:build

- name: Run blockchain E2E (mobile, devnet)
-# --retries=1 absorbs flaky CDP "no pages found" errors on macos-26
-# runners. Each fixture retries install → launch → CDP connect from
-# scratch, so a transient webinspectord hiccup doesn't fail the run.
-# Config stays at retries: 0 for fast local dev feedback.
-run: yarn playwright test --config playwright.ios.config.ts --retries=1
+# --retries=2 absorbs flaky CDP "no pages found" errors and degraded-
+# CoreSimulator sim-setup failures on macos-26 runners. Each fixture
+# retries install → launch → CDP connect from scratch (and restarts the
+# sim subsystem on a wedged daemon), so a transient runner hiccup gets
+# multiple fresh attempts within the job budget. Config stays at
+# retries: 0 for fast local dev feedback.
+run: yarn playwright test --config playwright.ios.config.ts --retries=2

- name: Upload artifacts
if: always()
@@ -552,7 +558,7 @@ jobs:
mobile-testnet:
name: Mobile E2E (testnet)
if: github.event_name != 'workflow_dispatch' || inputs.network == 'both' || inputs.network == 'testnet'
-runs-on: macos-26
+runs-on: macos-26-xlarge
# Observed worst case: ~40 min setup (CLI compile on cache miss + iOS
# build + simulator cold boot overlap) + ~55 min suite with one retry.
# 90 minutes guillotined an otherwise-passing run at test 7 of 7.
@@ -701,11 +707,13 @@ jobs:
run: yarn test:e2e:mobile:build

- name: Run blockchain E2E (mobile, testnet)
-# --retries=1 absorbs flaky CDP "no pages found" errors on macos-26
-# runners. Each fixture retries install → launch → CDP connect from
-# scratch, so a transient webinspectord hiccup doesn't fail the run.
-# Config stays at retries: 0 for fast local dev feedback.
-run: yarn playwright test --config playwright.ios.config.ts --retries=1
+# --retries=2 absorbs flaky CDP "no pages found" errors and degraded-
+# CoreSimulator sim-setup failures on macos-26 runners. Each fixture
+# retries install → launch → CDP connect from scratch (and restarts the
+# sim subsystem on a wedged daemon), so a transient runner hiccup gets
+# multiple fresh attempts within the job budget. Config stays at
+# retries: 0 for fast local dev feedback.
+run: yarn playwright test --config playwright.ios.config.ts --retries=2

- name: Upload artifacts
if: always()
@@ -747,8 +755,12 @@ jobs:
name: Mobile Guardian E2E (devnet)
if: github.event_name != 'workflow_dispatch' || inputs.network == 'both' || inputs.network == 'devnet'
# macos-26 matches build-mobile.yml — required for iOS 26 SDK symbols
-# used by dapp-browser's WKWebViewController.
-runs-on: macos-26
+# used by dapp-browser's WKWebViewController. The -xlarge size is the
+# DEDICATED Apple-Silicon larger runner (2x vCPU/RAM, no noisy-neighbour
+# IO contention) — the shared standard macos-26 runners were intermittently
+# degraded for hours at a time, with CoreSimulator install/launch crawling
+# so badly that two-sim `_simPair` setup couldn't finish even in 13-15 min.
+runs-on: macos-26-xlarge
# ~40 min setup (CLI compile on cache miss + iOS build + sim cold boot)
# plus a single guardian spec with up to 2 retries. Generous ceiling that
# matches the standard mobile jobs so a passing build is never guillotined.
@@ -921,7 +933,7 @@ jobs:
mobile-guardian-testnet:
name: Mobile Guardian E2E (testnet)
if: github.event_name != 'workflow_dispatch' || inputs.network == 'both' || inputs.network == 'testnet'
-runs-on: macos-26
+runs-on: macos-26-xlarge
# ~40 min setup (CLI compile on cache miss + iOS build + sim cold boot)
# plus a single guardian spec with up to 2 retries. Generous ceiling that
# matches the standard mobile jobs so a passing build is never guillotined.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -15,6 +15,10 @@
* [FIX][mobile] **iOS hot-key signing now requires Face ID / Touch ID on device, matching Android.** The Secure Enclave hot key was created with `.privateKeyUsage` only — a *usage permission* that, contrary to the old code comment, does **not** prompt for authentication — so user-initiated Guardian claims and sends signed silently on iOS, while Android (StrongBox + `setUserAuthenticationRequired`) already prompted. New hot keys now also set `.userPresence`, so every user-initiated hot signature and hot-key reveal requires user presence (Face ID / Touch ID with passcode fallback). `.userPresence` is used rather than `.biometryCurrentSet` so the key survives biometric re-enrollment instead of bricking until re-activation. Background auto-consume is unaffected — it is cold-signed in WASM and never touches the hot key. Scope: the gate applies to device builds only (the simulator / iOS E2E path keeps `.privateKeyUsage`-only silent signing), and existing hot keys keep their prior behavior until re-activated/rotated. (#299)
* [FIX][mobile] **iOS app builds again under Xcode 26.** Two `foundKey as? SecKey` downcasts in the hot-key plugin (`signWithHotKey` / `revealHotKey`) are no-ops for CoreFoundation types — they always succeed — which Xcode 26 now rejects as a hard error, breaking the iOS build. Replaced with a `CFGetTypeID(foundKey) == SecKeyGetTypeID()` guard plus a force-cast: both the correct defensive downcast and Xcode-26-clean. (#299)
* [FIX][all] **Guardian accounts can now connect to dApps (faucet, etc.) instead of failing with "Connection Failed" / `NOT_GRANTED`.** A Guardian account's auth component is built by `@openzeppelin/miden-multisig-client` and its procedures live in the `openzeppelin::auth::*` MASM namespace, so they don't MAST-match any bundled `miden-standards` template. The SDK's `AccountInterface` therefore classifies the component as `Custom` and `Account.getPublicKeyCommitments()` returns `[]`; the wallet's connect flow read that as "no public key" and rejected the connection (surfaced to the dApp as `NOT_GRANTED`). Public-key resolution now falls back, for accounts the SDK can't classify, to reading the hot signer's commitment directly from the account's `openzeppelin::multisig::signer_public_keys` storage map — the key the wallet actually signs with — so Guardian accounts resolve a usable session key. Plain single-key accounts are unaffected (their `AuthSingleSig` component is recognized as before). The same resolution covers the reveal-private-key and advanced-settings public-key views, which broke identically for Guardian accounts. (#300)
+* [CHANGE][ci] **iOS E2E no longer hangs the full timeout when the simulator's CDP bridge wedges.** `CdpBridge.eval`/`evaluate` now race the WebKit `executeAtom` call against a 30s hard timeout (matching `evalAsync`), so a wedged RWI socket or a momentarily-blocked WebView main thread surfaces as a fast throw instead of an indefinite await. Previously `pollForCondition` could only check its deadline *between* iterations, so a single hung `eval` stalled the whole test until Playwright's 15-minute kill (and the rest of the serial suite then skipped); now the poll enforces its own budget and `--retries` restarts on a fresh app + CDP. (#302)
+* [CHANGE][ci] **Blockchain E2E retries the `miden-client` harness CLI on transient remote-prover connection failures.** The CLI deploy/mint/sync retry loop classified only node-RPC and nonce-lag errors as transient; an intermittent TLS/gRPC handshake failure to the delegated prover endpoint on the macOS runners (`failed to connect to the remote prover` / `transport error` / `no native certs found`) was treated as fatal, so a mint failed outright even though a sibling mint in the same test connected fine. These connection-level prover errors are now recognized as transient and retried with backoff, and the three duplicated classifiers were unified into one `isTransientCliError` helper. (#302)
+* [CHANGE][ci] **iOS E2E mobile jobs run on dedicated `macos-26-xlarge` runners, and are more resilient to degraded shared runners.** The shared standard `macos-26` runner pool was intermittently degraded for hours at a time by noisy-neighbour IO contention — every `simctl` op crawled (97 CI samples: per-wallet `_simPair` setup p50 65s, p90 267s, max 401s vs. <5s healthy — so two sequential wallets took up to ~13 min and sometimes never finished even in 15 min), making the whole mobile suite un-runnable. All four mobile E2E jobs now use the dedicated Apple-Silicon `-xlarge` larger runner (2× vCPU/RAM, no noisy neighbours), which restores a healthy ~2-3 min setup and a full green suite. As belt-and-suspenders for any residual slowness, the `_simPair` fixture setup is capped (at 13 min, past the slowest observed completing setup) so a genuinely-hung CoreSimulator fails fast with a named error and a sim-subsystem restart instead of silently eating the whole per-test timeout, while a degraded-but-completing setup is allowed to finish rather than being killed mid-flight (no assertion is relaxed — purely tolerance for degraded IO); the subsystem recovery `simctl shutdown all`s to clear half-booted device state (the `SimError 405` signature); the per-test timeout is 25 min (from 15); and the non-guardian mobile suite runs with `--retries=2` (matching the guardian suite). (#302)
+* [CHANGE][ci] **Guardian iOS E2E reads the on-chain auth structure with a pure storage parse instead of loading the multisig client.** The `verify_guardian_auth_structure` assertion's `__TEST_GUARDIAN_AUTH__` hook used to build a `MultisigService` (`getOrCreateMultisigService` → `MultisigClient.load`) and read it. Against the post-consume state — where the guardian's stored blob lags the on-chain account — that load entered a re-sign/realign loop (~48 `signWithHotKey` calls for this read vs. 26 for a full consume) that hung the single-threaded mobile WASM past the eval budget; the assertion never got far enough to run on iOS. The structure (signer set + procedure thresholds) is immutable and lives in the account's storage maps, so the hook now reads it directly with `AccountInspector.fromAccount` — a pure parse with no signing, no guardian HTTP, and no client load (just one `getAccount`, the same read the balance poll already does). Because even that lone `getAccount` was still starved by other main-thread WASM activity on iOS (the auth eval was observed taking 60s with all the wallet's own pollers paused), the structure is now captured in the wallet's own balance poll (`fetchBalances`, which reliably completes) and stashed on a global; the test reads it as a plain value with no WASM call at all. Finally, the iOS harness reads that stash over the SYNCHRONOUS `execute_script` atom (polled), not the async `execute_async_script` one: appium-remote-debugger's async atom delivers its completion callback in the `arguments[arguments.length-1]` slot as the boolean `true` on this iOS RWI bridge, so `cb(result)` threw `TypeError: cb is not a function`, the promise rejected unhandled, and every `evalAsync` hung to its timeout no matter how fast the script ran — which is why the auth read still timed out at 60s even with the stash already populated. Test-only, gated on `MIDEN_E2E_TEST` and tree-shaken from production. (#302)
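
The hard-timeout entry in the list above describes racing the WebKit `executeAtom` call against a 30s timer. A minimal sketch of that pattern follows. It assumes nothing about the real `CdpBridge` internals, which are not part of this diff; `withHardTimeout`, `evalWithBudget`, and the `label` parameter are hypothetical names used only for illustration.

```typescript
// Hypothetical helper, not the PR's CdpBridge code: bound any promise with a
// hard deadline so a call that never settles still rejects.
function withHardTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      const err = new Error(`${label} exceeded hard timeout of ${ms}ms`);
      err.name = 'HardTimeoutError';
      reject(err);
    }, ms);
    // Whichever settles first wins. The timer is always cleared so it cannot
    // fire after the work has already finished.
    work.then(
      value => {
        clearTimeout(timer);
        resolve(value);
      },
      error => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Usage sketch: 30_000 mirrors the 30s budget named in the changelog entry.
// `executeAtom` is a stand-in parameter, not the real WebKit binding.
async function evalWithBudget(executeAtom: () => Promise<unknown>): Promise<unknown> {
  return withHardTimeout(executeAtom(), 30_000, 'CdpBridge.eval');
}
```

With a bound like this on each call, a polling loop no longer depends on the call returning before it can check its own deadline.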

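The Guardian entry explains why the iOS harness polls a synchronous script rather than awaiting an async completion callback. A rough sketch of that polling shape, under stated assumptions: `SyncEval` stands in for whatever wraps the synchronous `execute_script` atom, `pollSyncValue` is a hypothetical helper, and the expression passed in (for example `window.__STASH__`) is a made-up placeholder, since the stash global's actual name is not given here.

```typescript
// Hypothetical: runs a *synchronous* script in the WebView and resolves with
// its return value. No completion callback is involved.
type SyncEval = (script: string) => Promise<unknown>;

const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

// Re-run a cheap synchronous read until it yields a non-null value or the
// budget runs out. Each round trip returns a plain value, so completion does
// not depend on a callback being delivered.
async function pollSyncValue<T>(
  evalSync: SyncEval,
  expression: string,
  budgetMs: number,
  intervalMs: number
): Promise<T> {
  const deadline = Date.now() + budgetMs;
  for (;;) {
    const value = await evalSync(`return ${expression};`);
    if (value !== null && value !== undefined) return value as T;
    if (Date.now() >= deadline) {
      throw new Error(`${expression} not populated within ${budgetMs}ms`);
    }
    await sleep(intervalMs);
  }
}
```

The trade-off is a few extra round trips while the value is still missing, in exchange for never waiting on a callback slot that the bridge may fill incorrectly.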
## 1.15.2 (2026-06-22)

8 changes: 7 additions & 1 deletion playwright.ios.config.ts
@@ -12,7 +12,13 @@ export default defineConfig({
// Guardian specs run via playwright.ios.guardian.config.ts (dedicated run);
// keep them out of the standard iOS suite.
testIgnore: '**/guardian-*.ios.spec.ts',
-timeout: 900_000, // 15 min per test — WASM prove on simulator is slow (~60-90s per consume)
+// 25 min per test. WASM prove on the simulator is slow (~60-90s per consume),
+// and on degraded macos-26 runners BOTH the two-sim `_simPair` setup (capped
+// at 13 min, see SETUP_DEADLINE_MS) and the test body's simctl/WASM ops crawl.
+// 25 min leaves room for a slow-but-completing setup + a slow test instead of
+// killing a run that would have passed given a little more patience (no
+// assertion is relaxed — this is purely tolerance for degraded-runner IO).
+timeout: 1_500_000,
expect: {
timeout: 60_000,
},
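
The timeout comment in this hunk implies a simple budget: the per-test timeout has to cover the capped setup plus the test body. The arithmetic can be sketched as follows. `minutes` and `bodyBudgetMs` are hypothetical helpers, and the 13-minute value for `SETUP_DEADLINE_MS` is taken from the comment, since the constant's definition is not shown in this diff.

```typescript
const MINUTE_MS = 60_000;
const minutes = (n: number): number => n * MINUTE_MS;

// 25 min, the new per-test timeout (written as 1_500_000 in the config).
const PER_TEST_TIMEOUT_MS = minutes(25);
// 13 min, the setup cap the comment attributes to SETUP_DEADLINE_MS.
// Assumed from the comment; the real definition lives outside this diff.
const SETUP_DEADLINE_MS = minutes(13);

// Time left for the test body if setup uses its entire cap.
function bodyBudgetMs(perTestMs: number, setupCapMs: number): number {
  if (setupCapMs >= perTestMs) {
    throw new Error('setup cap must be smaller than the per-test timeout');
  }
  return perTestMs - setupCapMs;
}

const WORST_CASE_BODY_MS = bodyBudgetMs(PER_TEST_TIMEOUT_MS, SETUP_DEADLINE_MS);
console.log(WORST_CASE_BODY_MS / MINUTE_MS); // 12
```

By the same arithmetic, a 13-minute setup cap under the previous 15-minute timeout would have left only 2 minutes for the test body.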
59 changes: 26 additions & 33 deletions playwright/e2e/helpers/miden-cli.ts
@@ -11,6 +11,29 @@ decimals = 8
symbol = "TST"
`;

+/**
+ * Classify a `miden-client` CLI stderr as a transient failure that should be
+ * retried (vs. a deterministic error that should fail fast). Matched
+ * wrap-tolerantly with `\s+` because miette folds messages at terminal width.
+ *
+ * Categories:
+ * - RPC/transport to the node: 5xx, gRPC framing, reset/timeout.
+ * - `new nonce N is less than old nonce M`: the node's account state lags the
+ * store's optimistic post-submit state while a deploy/mint is still in
+ * flight, and miden-client's sqlite store hard-fails the whole sync on it
+ * (0xMiden/miden-client#2243). Clears once the tx commits.
+ * - Remote-prover connection failures (`failed to connect to ... prover`,
+ * `transport error`, `no native certs found`): the TLS/gRPC handshake to the
+ * delegated prover endpoint flakes intermittently on the macOS CI runners
+ * (a sibling mint in the same test connects fine), so a connection-level
+ * prover error is transient, not a proving-logic failure.
+ */
+export function isTransientCliError(stderr: string): boolean {
+  return /HTTP status code 5\d\d|grpc request failed|grpc-status header missing|connection reset|timed out|Temporary failure|less\s+than\s+old\s+nonce|failed\s+to\s+connect\s+to(\s+the)?(\s+remote)?\s+prover|transport\s+error|no\s+native\s+certs/i.test(
+    stderr
+  );
+}
+
/**
* Resolve the miden-client binary path.
* 1. MIDEN_CLIENT_BIN env var
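
To make the wrap-tolerant matching in `isTransientCliError` concrete, the sketch below runs the classifier from this hunk against a few inputs. The function body is copied from the diff so the example is self-contained; the sample stderr strings are invented for illustration and are not captured `miden-client` output.

```typescript
// Copied from the diff above so the example is self-contained.
function isTransientCliError(stderr: string): boolean {
  return /HTTP status code 5\d\d|grpc request failed|grpc-status header missing|connection reset|timed out|Temporary failure|less\s+than\s+old\s+nonce|failed\s+to\s+connect\s+to(\s+the)?(\s+remote)?\s+prover|transport\s+error|no\s+native\s+certs/i.test(
    stderr
  );
}

// Invented samples: each pairs a label with a made-up stderr string.
const samples: Array<[string, string]> = [
  ['prover error folded across lines', 'failed to connect to the remote\n    prover'],
  ['nonce lag folded across lines', 'new nonce 3 is less than old\n  nonce 4'],
  ['node 5xx', 'HTTP status code 503'],
  ['node 4xx', 'HTTP status code 404'],
  ['deterministic failure', 'invalid account id'],
];

for (const [label, stderr] of samples) {
  // Transient errors are retried; everything else should fail fast.
  console.log(`${label}: ${isTransientCliError(stderr) ? 'retry' : 'fail fast'}`);
}
```

The first three samples classify as retryable and the last two as fatal. Note that only the nonce and prover alternatives use `\s+`; older alternatives such as `timed out` contain a literal space, so a fold in the middle of those phrases would not match.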
@@ -188,17 +211,7 @@ export class MidenCli {
break;
}
lastErr = createResult.stderr;
-const transient =
-// `new nonce N is less than old nonce M` (matched wrap-tolerantly —
-// miette folds the message at terminal width): the node's account
-// state lags the store's optimistic post-submit state while a
-// deploy or mint is still in flight, and miden-client's sqlite
-// store hard-fails the whole sync on it (0xMiden/miden-client#2243).
-// Clears as soon as the tx commits, so it retries like any other
-// transient.
-/HTTP status code 5\d\d|grpc request failed|grpc-status header missing|connection reset|timed out|Temporary failure|less\s+than\s+old\s+nonce/i.test(
-lastErr
-);
+const transient = isTransientCliError(lastErr);
if (!transient || attempt === maxAttempts) break;
const backoffMs = Math.min(30_000, 1_000 * 2 ** (attempt - 1));
// eslint-disable-next-line no-console
@@ -266,17 +279,7 @@ export class MidenCli {
return { txId, noteId };
}
lastErr = result.stderr;
-const transient =
-// `new nonce N is less than old nonce M` (matched wrap-tolerantly —
-// miette folds the message at terminal width): the node's account
-// state lags the store's optimistic post-submit state while a
-// deploy or mint is still in flight, and miden-client's sqlite
-// store hard-fails the whole sync on it (0xMiden/miden-client#2243).
-// Clears as soon as the tx commits, so it retries like any other
-// transient.
-/HTTP status code 5\d\d|grpc request failed|grpc-status header missing|connection reset|timed out|Temporary failure|less\s+than\s+old\s+nonce/i.test(
-lastErr
-);
+const transient = isTransientCliError(lastErr);
if (!transient || attempt === maxAttempts) break;
const backoffMs = Math.min(30_000, 1_000 * 2 ** (attempt - 1));
// eslint-disable-next-line no-console
@@ -296,17 +299,7 @@ export class MidenCli {
const result = await this.run('sync', { timeoutMs: 60_000 });
if (result.exitCode === 0) return;
lastErr = result.stderr;
-const transient =
-// `new nonce N is less than old nonce M` (matched wrap-tolerantly —
-// miette folds the message at terminal width): the node's account
-// state lags the store's optimistic post-submit state while a
-// deploy or mint is still in flight, and miden-client's sqlite
-// store hard-fails the whole sync on it (0xMiden/miden-client#2243).
-// Clears as soon as the tx commits, so it retries like any other
-// transient.
-/HTTP status code 5\d\d|grpc request failed|grpc-status header missing|connection reset|timed out|Temporary failure|less\s+than\s+old\s+nonce/i.test(
-lastErr
-);
+const transient = isTransientCliError(lastErr);
if (!transient || attempt === maxAttempts) break;
const backoffMs = Math.min(30_000, 1_000 * 2 ** (attempt - 1));
// eslint-disable-next-line no-console
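
All three retry loops in these hunks share one shape: classify stderr, stop on a fatal error or on the last attempt, otherwise wait `Math.min(30_000, 1_000 * 2 ** (attempt - 1))` ms and try again. A compact sketch of that loop follows. `retryTransient`, `CliResult`, and the injected `sleep` are hypothetical names, and `maxAttempts` is a parameter here because its real value is not visible in these hunks.

```typescript
// Backoff expression taken from the diff: doubles from 1s, capped at 30s.
function backoffMs(attempt: number): number {
  return Math.min(30_000, 1_000 * 2 ** (attempt - 1));
}

// Hypothetical result shape for one CLI invocation.
type CliResult<T> = { ok: true; value: T } | { ok: false; stderr: string };

// Retry `run` while its failure is classified as transient. `sleep` is
// injected so callers (and tests) control how the delay is realised.
async function retryTransient<T>(
  run: () => Promise<CliResult<T>>,
  isTransient: (stderr: string) => boolean,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>
): Promise<T> {
  let lastErr = '';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await run();
    if (result.ok) return result.value;
    lastErr = result.stderr;
    // Fatal errors and the final attempt both end the loop immediately.
    if (!isTransient(lastErr) || attempt === maxAttempts) break;
    await sleep(backoffMs(attempt));
  }
  throw new Error(`CLI failed: ${lastErr}`);
}

// Delay schedule for the first seven attempts.
console.log([1, 2, 3, 4, 5, 6, 7].map(backoffMs));
// [1000, 2000, 4000, 8000, 16000, 30000, 30000]
```

Injecting `sleep` is a choice made for this sketch only; it keeps the loop testable without real delays and says nothing about how `MidenCli` waits between attempts.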