Implement: Backfill: per-route store swapping for the experiment switcher (#145) #265
codex-review: converged (implementation profile, 3 rounds; record under Merged Note: commits are unsigned (the 1Password SSH-signing agent was locked during the work; pushes went through during unlock windows). Re-sign before merge if signature verification is required.
59498e9 to 7b52430
…ial plumbing
Introduces the per-experiment store-vending substrate that lets every web-ui route operate against the operator's selected experiment:
- store_factory.py: live StoreFactory (per-(experiment_id, role) StoreClient views over one shared httpx.Client; JIT worker-credential bootstrap via BearerCache) + StaticStoreFactory (single pre-built store for the single-experiment / test path).
- credentials.py: deployment-scoped control-plane credential bootstrap (Posture C) + credential-dir resolution.
- routes/_helpers.py: resolve_active_experiment / active_config / resolve_active_context with the StaleSelection / ControlPlaneUnreachable / MissingAdminToken / config-missing exception taxonomy and the unseeded (registered-but-not-seeded) classification.
- app.py: build/accept a store_factory, move experiment_id to a per-request template context processor, add experiment_config_dir + config cache; lifespan closes the factory. Legacy store/admin_store kwargs retained for the W3 fixture sweep.
- cli.py: build the live StoreFactory; new --credential-dir / --experiment-config-dir / --control-plane-worker-id flags; control-plane client uses the bootstrapped deployment-scoped credential.
All 629 existing web-ui tests pass unchanged via the StaticStoreFactory compat path; adds test_store_factory.py + test_resolve_active.py.
Refs #145 (plan W1).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
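To make the "per-(experiment_id, role) views with JIT credential bootstrap" idea concrete, here is a minimal sketch. It is not the code in store_factory.py: `StoreView`, `SketchStoreFactory`, `store_for` and the `bootstrap` callable are hypothetical names, and the shared httpx.Client is left out so the example stays stdlib-only.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class StoreView:
    """Hypothetical stand-in for a per-(experiment, role) StoreClient view."""
    experiment_id: str
    role: str
    bearer: str


class SketchStoreFactory:
    """Caches one view per (experiment_id, role); bootstraps credentials lazily."""

    def __init__(self, bootstrap: Callable[[str], str]) -> None:
        # `bootstrap` is an assumed callable: experiment_id -> bearer token.
        self._bootstrap = bootstrap
        self._bearers: dict[str, str] = {}
        self._views: dict[tuple[str, str], StoreView] = {}

    def store_for(self, experiment_id: str, role: str = "worker") -> StoreView:
        key = (experiment_id, role)
        if key not in self._views:
            # JIT bootstrap: only the first request for an experiment pays for it.
            if experiment_id not in self._bearers:
                self._bearers[experiment_id] = self._bootstrap(experiment_id)
            self._views[key] = StoreView(experiment_id, role, self._bearers[experiment_id])
        return self._views[key]
```

Under these assumptions, two roles on the same experiment share one bootstrapped credential, and a second experiment triggers its own bootstrap on first use.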
…modules
Every per-experiment route handler now resolves the active experiment per-request via resolve_active_context(request) instead of reading the startup-bound request.app.state.store / experiment_config / admin_store / experiment_id. With control_plane=None (single-experiment / test posture) the helper returns the deployment default, so behavior is identical and the existing suite validates the refactor unchanged.
- ideator / executor / evaluator: resolve + thread experiment_id (executor starting-variant) and config (evaluator render/parse helpers) through.
- admin observability / actions / work_refs / index / workers / groups: resolve store + admin_store; actor attribution stays app.state.worker_id (until #140). Narrowed payload casts where tightening store: Any → Store surfaced pre-existing union-access types.
- index.py: resolve store.
Credential-dir fix: the web-ui is a worker host, so resolve_credential_dir now honors the common --credentials-dir / $EDEN_WORKER_CREDENTIALS_DIR (with --credential-dir / $EDEN_CREDENTIAL_DIR as override, XDG as final fallback). Without this the per-experiment BearerCache wrote to a shared XDG path and leaked stale credentials across ephemeral task-stores, breaking the admin-workers / admin-groups real-subprocess e2e tests (stale token → reissue → 404 → startup crash).
Full web-ui suite green (651 incl. e2e); ruff + pyright clean.
Refs #145 (plan W2).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
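The credential-dir precedence described above can be sketched as a simple first-match lookup. This is an illustration only: the function name, the flag-before-env ordering within each pair, and the XDG fallback location (`XDG_DATA_HOME` with an `eden/credentials` subdirectory) are assumptions, not taken from the real resolve_credential_dir.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_credential_dir_sketch(
    override_flag: Optional[str],  # value of --credential-dir, if given
    common_flag: Optional[str],    # value of --credentials-dir, if given
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Return the first configured credential directory, else an XDG fallback."""
    candidates = [
        override_flag,                           # explicit override wins
        env.get("EDEN_CREDENTIAL_DIR"),
        common_flag,                             # common worker-host setting
        env.get("EDEN_WORKER_CREDENTIALS_DIR"),
    ]
    for candidate in candidates:
        if candidate:
            return Path(candidate)
    # Assumed fallback layout; the real XDG path is not stated in this PR.
    xdg_base = env.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(xdg_base) / "eden" / "credentials"
```

The point of the ordering is that a test harness which already sets the common worker credentials directory gets an isolated per-run path, so the shared XDG location is only reached when nothing else is configured.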
…ke_app
make_app now takes store_factory as its sole store dependency (legacy store= / admin_store= kwargs and app.state.store / admin_store are gone; experiment_config / experiment_id stay as the single-experiment config source + resolve fast-path default). The CLI passes its live StoreFactory; tests build a StaticStoreFactory via the new conftest._one_experiment_factory helper. All direct make_app / make_web_ui_app call sites across the test suite converted.
Also fixes AdminGateMiddleware (missed in W2): it read app.state.store for the admins-group check. It now resolves the active experiment's worker store via resolve_active_context, so the admin gate follows the operator's experiment selection. Deployment-scoped admin pages (/admin/experiments, /admin/control) gate against the default experiment and are exempt from per-experiment resolution (they are the redirect target for resolution failures, so resolving them per-experiment loops).
Full web-ui suite green (651); ruff + pyright clean.
Refs #145 (plan W3).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- base.html gains a top-nav switcher dropdown: a no-JS <details> listing
every registered experiment as a CSRF-protected POST to
/admin/experiments/{E}/select, keyed off the session selection (shows
"Active: <id>" / "Default: <id>", highlights the active row). Hidden in
no-control-plane deployments.
- switcher_context template context processor + a 5s in-process TTL cache
on list_experiments (§3.7) so the per-render dropdown doesn't hammer the
control plane.
- form_experiment_guard (§3.6): every worker submit form carries a hidden
form_experiment_id; the ideator/executor/evaluator submit handlers
discard a submission whose form was rendered against a different
experiment than the now-active one and redirect with a clear banner,
rather than silently writing to the wrong experiment.
- The dashboard renders the active-experiment resolution-failure banners
(stale-selection / control-plane-unreachable / cannot-bootstrap-credential
/ task-store-unreachable / config-missing / config-invalid /
switched-mid-form) that the per-route redirects target.
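The form_experiment_guard check in the list above boils down to comparing the hidden field against the now-active experiment before writing anything. A minimal sketch, assuming a plain dict for the form, a hypothetical `write` callback, and a made-up redirect URL shape (only the field name form_experiment_id and the switched-mid-form banner name come from this commit):

```python
from typing import Callable


def handle_submit_sketch(
    form: dict[str, str],
    active_experiment_id: str,
    write: Callable[[str, dict[str, str]], None],
) -> str:
    """Return the redirect target; write only if the form matches the active experiment."""
    rendered_for = form.get("form_experiment_id")
    if rendered_for != active_experiment_id:
        # Assumption: a missing hidden field is treated like a mismatch.
        # The submission is discarded instead of landing in the wrong experiment.
        return "/?banner=switched-mid-form"  # hypothetical URL shape
    write(active_experiment_id, form)
    return "/submitted"  # hypothetical success target
```

The guard runs before any store access, which is what prevents a long-lived draft from silently writing to an experiment the operator has since switched away from.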
Full web-ui suite green (660); ruff + pyright clean.
Refs #145 (plan W4).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
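For readers unfamiliar with the pattern, the 5s in-process TTL cache on list_experiments mentioned in this commit might look roughly like the following. `TTLCache` and its injectable `clock` are hypothetical; the real cache's shape is not shown in this PR.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class TTLCache(Generic[T]):
    """Caches the result of `fetch` for `ttl_seconds` (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[], T],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[T] = None
        self._expires_at: Optional[float] = None

    def get(self) -> T:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cold or expired: one control-plane read refreshes the cache.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value  # type: ignore[return-value]
```

With a cache like this, every page render inside the TTL window reuses the last experiment list, so the dropdown costs at most one control-plane read per window.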
The executor module's local bare clone is now per-experiment. A new RepoMaterializer vends per-experiment clones under <repo-path-parent>/<experiment_id>.git (cloned from the Forgejo remote with the per-experiment URL substituted from the --forgejo-url org base, fetched on each access); a repo_for(request, experiment_id) helper returns the startup-materialized app.state.repo for the deployment default (single-experiment deployments unchanged) and the materialized clone for non-default experiments. The executor submit + draft-render, the admin work-refs list/delete, and the admin dashboard now resolve the active experiment's repo.
Full web-ui suite green (667); ruff + pyright clean. Adds test_per_experiment_repo.py.
Refs #145 (plan W5).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
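The clone layout and the default-versus-non-default choice described in this commit can be illustrated with a small path-only sketch. The function names are hypothetical, no git operations are performed, and the remote URL shape (`<org base>/<experiment_id>.git`) is an assumption about how the --forgejo-url substitution works.

```python
from pathlib import Path


def clone_path_sketch(repo_path: Path, experiment_id: str) -> Path:
    """Per-experiment bare clone lives next to the startup repo: <parent>/<id>.git."""
    return repo_path.parent / f"{experiment_id}.git"


def remote_url_sketch(forgejo_org_base: str, experiment_id: str) -> str:
    """Assumed URL shape; the real substitution rule is not shown in this PR."""
    return f"{forgejo_org_base.rstrip('/')}/{experiment_id}.git"


def repo_for_sketch(
    default_experiment_id: str,
    default_repo: Path,
    repo_path: Path,
    experiment_id: str,
) -> Path:
    """Default experiment keeps the startup repo; others get their own clone path."""
    if experiment_id == default_experiment_id:
        return default_repo
    return clone_path_sketch(repo_path, experiment_id)
```

Keeping the default experiment on the startup-materialized repo is what leaves single-experiment deployments unchanged.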
- glossary: add experiment switcher / selected experiment / active experiment (active_experiment_id) / StoreFactory / active store.
- user-guide §12: rewrite for the control-plane + switcher multi-experiment path (selection now changes data); separate-stacks isolation kept as 12.2.
- docs/operations/web-ui-multi-experiment.md (new) + README index: the switcher, the four credential-bootstrap postures, per-experiment config / repo layout, config-drift caveat.
- compose web-ui: --experiment-config-dir + web-ui-configs bind-mount + explicit --credentials-dir (the new resolver otherwise falls back to a non-persisted in-container XDG path).
- setup-experiment.sh: create web-ui-configs/ + copy each experiment's config to <data-root>/web-ui-configs/<id>.yaml.
- CHANGELOG [Unreleased] entry (closes the 12c §3.6 deferral); roadmap row.
- Deferrals filed: #259 (config wire endpoint), #260 (resolve cache), #261 (v1 switcher affordances), #262 (admin-form guard); #147 (multi-exp smoke) unchanged.
Refs #145 (plan W6).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… gate
The per-experiment wiring pushed make_app to 106 lines and submit_idea to 101 (threshold 100). Extracted _install_healthz_and_error_handlers(app, templates) from make_app, and a _collect_idea_form_fields(form) helper (also DRYs the identical block in add_row). Behavior-preserving; web-ui suite green; complexity-gate clean (0 blocking).
Refs #145.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addresses all 5 codex round-0 findings (record under docs/plans/review/issue-145/impl/):
- Bug 1: active_config no longer silently reuses the default experiment's config for a non-default experiment with no --experiment-config-dir (raises ExperimentConfigMissing → config-missing redirect). --experiment-config is now optional in control-plane mode; _resolve_default_config validates the posture + fails fast on a default-config/config-dir mismatch. make_app.experiment_config is now ExperimentConfig | None.
- Bug 2: resolve_active_experiment catches Unauthorized on the seed probe, evicts the cached credential, re-bootstraps once, and raises MissingAdminToken (cannot-bootstrap-credential) on a persistent 401: the Decision-8 Posture C/D ladder (a 401 is never inferred as unseeded).
- Bug 3: per-experiment clones move to a durable --repo-root (<repo-root>/<id>.git); Compose bind-mounts web-ui-repos + passes the flag (the parent-of-repo-path default lands on the non-durable container fs in Compose).
- Risk 4: StoreFactory.evict(experiment_id) clears the cached bearer + clients, wired into the 401 recovery so reseed/reissue self-heals.
- Risk 5: switcher hidden (not an empty dropdown) when control-plane reads are unavailable with a cold cache (Posture D / CP outage).
New tests: config-missing redirect, 401→evict→MissingAdminToken, switcher-hidden-on-cp-error. Web-ui suite green in isolation; ruff / pyright clean.
Refs #145.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
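The Bug 2 / Risk 4 recovery ladder (401 on the seed probe, evict, retry once, then give up with MissingAdminToken) could be sketched as below. The exception classes here are local stand-ins that only reuse the names from this commit, and `probe` / `evict` are hypothetical callables; the retry is assumed to re-bootstrap implicitly because the evicted credential is fetched again on the next call.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class Unauthorized(Exception):
    """Stand-in for the store client's 401 error."""


class MissingAdminToken(Exception):
    """Stand-in for the cannot-bootstrap-credential failure."""


def probe_with_recovery(
    probe: Callable[[str], T],
    evict: Callable[[str], None],
    experiment_id: str,
) -> T:
    """Run the seed probe; on a 401, evict the cached credential and retry once."""
    try:
        return probe(experiment_id)
    except Unauthorized:
        # The cached bearer may be stale after a reseed or reissue.
        evict(experiment_id)
    try:
        return probe(experiment_id)  # fresh credential is bootstrapped on demand
    except Unauthorized as exc:
        # A persistent 401 is a credential problem, never "unseeded".
        raise MissingAdminToken(experiment_id) from exc
```

Bounding the retry to one attempt keeps a genuinely missing admin token from turning into a bootstrap loop on every request.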
…ashes
Round-1 verdict: 4/5 round-0 findings resolved. Round-2 fixes:
- New Risk: _build_control_plane_client now catches transport / control-plane WireError (not just RuntimeError); a control-plane outage or rejection at startup degrades to the Posture-D banners + hidden-switcher posture instead of aborting web-ui startup.
- Finding 4 (default-experiment credential staleness): scoped to non-default (the default fast path stays zero-overhead, matching pre-#145 behavior) and folded into #260; not a regression.
ruff / pyright clean; control-plane + resolve tests green.
Refs #145.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Commits the impl-stage codex-review iteration record under docs/plans/review/issue-145/impl/ (durable *.md; regenerable *.jsonl/*.stderr/prompt.txt gitignored per top-level .gitignore). Review converged after 3 rounds (0 → 1 → 2): round 0 raised 3 Bugs + 2 Risks, all addressed; round 1 confirmed 4/5 + 1 new Risk; round 2 confirmed no remaining Bug/Risk findings.
Refs #145.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
W2's incoming side removed variant_id from _parse_evaluator_submit_form, but the function body's nested call to _maybe_bundle_evaluator_artifact still references variant_id. Reinstate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7b52430 to 450fb76
…xt additions
#145 added ~16 SLOC to executor.py from threading resolve_active_context through each handler, pushing the file from 800 to 816 SLOC. Per-resource split is a separate refactor candidate (cousin of F-3); not in scope here.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tch check
The 8 compose smokes all append fields (ideation_policy / max_quiescent_iterations) to the mounted experiment-config.yaml *after* setup-experiment has already copied the pre-append config into the web-ui's --experiment-config-dir. The codex-round-1 "fail fast on default-config/config-dir mismatch" check then saw the two configs differ and exited the web-ui (rc=1) → container unhealthy → every compose-smoke / compose-e2e job failed at `up --wait`.
The check was over-strict: the deployment-default experiment ALWAYS resolves its config from --experiment-config (active_config's default branch returns app.state.experiment_config and never reads <config-dir>/<default>.yaml), so a divergent default entry in the config-dir is harmless. Removed the mismatch fail-fast; the posture validation (single-exp requires --experiment-config; control-plane mode requires config OR config-dir) stays.
Verified: `bash reference/compose/healthcheck/smoke.sh` → PASS (web-ui healthy, quiescence reached, all assertions pass). ruff + pyright clean.
Refs #145.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
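The posture validation that remains after this fix is small enough to sketch directly. The function name and the use of ValueError are assumptions; only the two rules (single-experiment requires --experiment-config, control-plane mode requires a config or a config-dir) come from the commit message.

```python
from typing import Optional


def validate_config_posture_sketch(
    control_plane_enabled: bool,
    experiment_config: Optional[str],      # value of --experiment-config
    experiment_config_dir: Optional[str],  # value of --experiment-config-dir
) -> None:
    """Raise ValueError when the flag combination cannot resolve any config."""
    if not control_plane_enabled:
        if not experiment_config:
            raise ValueError("single-experiment mode requires --experiment-config")
        return
    if not (experiment_config or experiment_config_dir):
        raise ValueError(
            "control-plane mode requires --experiment-config or --experiment-config-dir"
        )
    # Deliberately no comparison between the default config and the config-dir
    # copy: the default experiment never reads the config-dir entry.
```

Note that the sketch only checks which sources are present, not whether they agree, which mirrors the reasoning for dropping the mismatch fail-fast.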
Records the round-3 review of the compose-smoke regression fix (removing the over-strict default-config/config-dir mismatch check). Codex confirms it is acceptable and does not reintroduce round-0 Bug 1: the load-bearing fix (non-default experiments never fall back to the default config; raise ExperimentConfigMissing) lives in active_config, which is unchanged. No remaining Bug/Risk. Review converged across 4 rounds (0→1→2→3).
Refs #145.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fixed the compose-smoke regression (all 8 compose-smoke + compose-e2e jobs were red: "container eden-web-ui is unhealthy"). Root cause: the codex-round-1 Bug-1 fix added a "fail fast on default-config / config-dir mismatch" check in Fix ( Verified locally:
Closes #145. Implements docs/plans/issue-145-per-route-store-swap.md.
Summary
- resolve_active_context) and operates against its store / config / integrator-repo, instead of the startup-bound `app.state.store` / `experiment_id` / `experiment_config`. "Select experiment Y" now actually changes the data on ideator / executor / evaluator / all `/admin/*` per-experiment pages.
- `StoreFactory` (per-`(experiment_id, role)` `StoreClient` views over one shared `httpx.Client`, JIT worker-credential bootstrap reusing `eden_service_common.auth.bootstrap_worker_credential`), a top-nav switcher dropdown, a `form_experiment_id` switch-mid-form guard, per-experiment config-dir + integrator-repo, and the four credential-bootstrap postures (§3.2).
Landed as waves W1–W6 (see commit history + CHANGELOG [Unreleased]).
What this does NOT cover
- `GET /v0/experiments/{E}/config` wire endpoint: per-experiment config is loaded from an on-disk `--experiment-config-dir` (Decision 6); the cleaner wire-read that removes the on-disk copy + its drift risk (Risk 12) is a normative chapter-7 amendment, out of scope. Tracked in Wire endpoint GET /v0/experiments/{E}/config (replace web-ui --experiment-config-dir) #259.
- `list_experiments` cache shipped; the per-request seeded/unseeded classification is uncached (correct, but a latency cost on non-default admin pages, and the default-experiment credential-staleness refresh). Tracked in Web-ui: per-request active-experiment resolution cache (Decision 8 TTL) #260.
- `?exp=` permalink override + draft-survives-switch (v1 affordances). Tracked in Web-ui switcher v1 affordances: ?exp= permalink override + draft-survives-switch #261.
- `form_experiment_id` guard on admin mutating forms: shipped on the worker submit forms (the documented long-draft risk); admin forms fail safe (cross-experiment id → `NotFound`) but lack the explicit banner. Tracked in Web-ui: extend form_experiment_id switch-mid-form guard to admin mutating forms #262.
Fresh-operator walkthrough
- `test_e2e_real_subprocess`, `test_admin_e2e`, `test_executor_e2e`, `test_evaluator_e2e`, `test_admin_workers_e2e`, `test_admin_groups_e2e`): these spawn the actual `python -m eden_web_ui` against a real task-store-server and drive claim→draft→submit / admin-reclaim through real HTTP.
- `test_admin_experiments_routes.py` + `test_resolve_active.py` + `test_store_factory.py` (fakes + the real control-plane server over `httpx.MockTransport`), but a live operator click-through of "register 2 experiments → switch → observe data follows" is deferred to the multi-experiment Compose smoke (Backfill: compose-smoke-multi-experiment CI job (Phase 12c deferral) #147).
Notes: single-experiment behavior passed cleanly; multi-experiment is unit/integration-covered but not live-walked.
Test plan
- `uv run ruff check .`: clean
- `uv run pyright reference/services/web-ui/`: 0 errors
- `python3 scripts/check-complexity.py`: clean (0 blocking)
- `python3 scripts/check-rename-discipline.py`: clean
- `python3 scripts/spec-xref-check.py`: clean (no spec edits)
- `uv run pytest -q reference/services/web-ui/tests/`: green (667 → 669 with new tests); the one full-suite `test_e2e_real_subprocess` flake and 2 `eden-checkpoint` failures both pass in isolation (e2e-under-load / test-ordering artifacts, not in this diff's packages)
- `uv run pytest -q` (full reference suite): 1990 passed / 221 skipped / 2 failed (eden-checkpoint, pass in isolation)
- docs/plans/review/issue-145/impl/
- `bash reference/compose/healthcheck/smoke.sh` / `smoke-subprocess.sh`: NOT run (no docker in this environment); the compose changes (`--experiment-config-dir` / `--credentials-dir` / `--repo-root` flags + bind-mounts + setup-experiment dirs) are single-experiment-additive and should be exercised by CI's compose-smoke jobs before merge.
Related issues