Implement: Backfill: compose-smoke-multi-experiment CI job (#147) #258
Open
ealt wants to merge 2 commits into
Wires the chapter-11 control-plane-server into the reference Compose stack as an always-on, Postgres-backed service, and makes lease-driven mode opt-in via an EDEN_CONTROL_PLANE_URL env fallback on the orchestrator + web-ui CLIs (no entrypoint wrappers). The existing six smokes are unchanged in posture: with the URL empty they run single-experiment and ignore the control plane.

- control-plane server: add unauthenticated /healthz (outside /v0/control) for the compose healthcheck + a unit test.
- orchestrator + web-ui CLIs: --control-plane-url defaults to $EDEN_CONTROL_PLANE_URL (empty treated as unset).
- compose.yaml: always-on `control-plane` service (Postgres store in a separate eden_control_plane database, chapter 11 §3.4 Option A); postgres init hook (init-control-plane-db.sh) creates that database on a fresh data dir; orchestrator gains the EDEN_CONTROL_PLANE_URL env, --lease-duration-seconds, and a control-plane depends_on; web-ui gains the EDEN_CONTROL_PLANE_URL env.
- setup-experiment: emit EDEN_CONTROL_PLANE_STORE_URL + POSTGRES_DB_CONTROL_PLANE + EDEN_CONTROL_PLANE_URL= (empty) and create the logs/control-plane substrate dir.
- Delete compose.control-plane.yaml (its only content was web-ui flag-passing, now baked in conditionally); rewrite observability.md §3.4 to the first-class-service + env-toggle flow.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
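The env-fallback behaviour described in this commit ("--control-plane-url defaults to $EDEN_CONTROL_PLANE_URL, empty treated as unset") could look roughly like the sketch below. This is an illustration only, not the PR's code: the function name `resolve_control_plane_url`, the `prog` name, and the example URLs are hypothetical, and treating an explicitly passed empty flag as unset is an assumption.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def resolve_control_plane_url(
    argv: Sequence[str], environ: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the control-plane URL, or None for single-experiment mode.

    Hypothetical helper: the flag and env-var names come from the commit
    message; everything else is an assumption made for illustration.
    """
    parser = argparse.ArgumentParser(prog="orchestrator-sketch")
    parser.add_argument(
        "--control-plane-url",
        # Fall back to the environment when the flag is not passed.
        default=environ.get("EDEN_CONTROL_PLANE_URL", ""),
    )
    args = parser.parse_args(list(argv))
    # An empty (or whitespace-only) value is treated the same as unset.
    url = args.control_plane_url.strip()
    return url or None


if __name__ == "__main__":
    mode = resolve_control_plane_url([])
    print("lease-driven" if mode else "single-experiment")
```

With this shape, a Compose file can always pass `EDEN_CONTROL_PLANE_URL` through to the service, and an empty value keeps the single-experiment behaviour without an entrypoint wrapper.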
Re-scopes #147 (operator-authorized): the reference impl hosts exactly one experiment per task-store-server (single-experiment Store + ExperimentIdMismatch guard; orchestrator multi-loop targets one task-store URL; integrator is one shared bare repo deployment-wide), so the planned two-experiment cross-experiment-isolation smoke is unbuildable. True multi-experiment hosting + that smoke are deferred to

- compose.multi-experiment.yaml: a second orchestrator-2 replica in lease mode (env-fallback flips both replicas when EDEN_CONTROL_PLANE_URL is set).
- smoke-multi-experiment.sh + the compose-smoke-multi-experiment CI job (unrequired initially): control-plane /healthz; one registered experiment; two lease-contending replicas; lease-singleton invariant; kill the holder + assert clean hand-off to the standby; surviving replica drives the pipeline to >=2 variant.integrated; operator-driven terminate; control-plane last_known_state converges to terminated. Validated locally (PASS); smoke.sh regression PASS.
- Re-scope the plan doc (§0 governs); CHANGELOG [Unreleased] entry (closes #147); AGENTS.md Commands row; README "Multi-experiment mode"; user-guide aside; same-PR audit of the retired compose.control-plane overlay refs (issue-110/-182 forward refs updated; issue-157 historical analysis preserved).

Two pre-existing gaps surfaced by the deployed-substrate smoke and filed: the lease-driven orchestrator doesn't self-join the task-store orchestrators group (#254), and its auto-termination decision 403s under wire auth because terminate_experiment is admins-gated (#256). The smoke seeds the group + uses operator-driven termination as workarounds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
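The lease-singleton invariant and the kill-the-holder hand-off that the smoke asserts can be modelled in a few lines. The sketch below is a toy in-memory model under stated assumptions, not the control plane's actual protocol: `LeaseTable`, `try_acquire`, `holder`, the experiment id `exp-1`, and the 10-second duration are all hypothetical, and the real store is Postgres-backed. Only the replica names and the idea of a lease duration come from the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str
    expires_at: float


class LeaseTable:
    """Hypothetical in-memory stand-in for a control-plane lease store."""

    def __init__(self) -> None:
        self._leases: dict[str, Lease] = {}

    def try_acquire(
        self, experiment_id: str, replica: str, now: float, duration: float
    ) -> bool:
        """Acquire or renew the lease; refuse if another replica holds a live one."""
        lease = self._leases.get(experiment_id)
        if lease is not None and lease.expires_at > now and lease.holder != replica:
            return False
        self._leases[experiment_id] = Lease(replica, now + duration)
        return True

    def holder(self, experiment_id: str, now: float) -> Optional[str]:
        """Return the current holder, or None if the lease is absent or expired."""
        lease = self._leases.get(experiment_id)
        if lease is None or lease.expires_at <= now:
            return None
        return lease.holder


def simulate_handoff() -> list[Optional[str]]:
    """Two replicas contend; the holder stops renewing; the standby takes over."""
    table, exp, duration = LeaseTable(), "exp-1", 10.0
    observed: list[Optional[str]] = []

    # t=0: both replicas contend, exactly one wins (singleton invariant).
    table.try_acquire(exp, "orchestrator-2", 0.0, duration)
    table.try_acquire(exp, "orchestrator", 0.0, duration)
    observed.append(table.holder(exp, 0.0))

    # t=5: the holder renews (lease now runs to t=15); the standby is refused.
    table.try_acquire(exp, "orchestrator-2", 5.0, duration)
    table.try_acquire(exp, "orchestrator", 5.0, duration)
    observed.append(table.holder(exp, 5.0))

    # The holder is killed after t=5 and never renews again.
    # t=14: the old lease is still live, so there is no hand-off yet.
    table.try_acquire(exp, "orchestrator", 14.0, duration)
    observed.append(table.holder(exp, 14.0))

    # t=15: the lease has expired, so the standby acquires it.
    table.try_acquire(exp, "orchestrator", 15.0, duration)
    observed.append(table.holder(exp, 15.0))
    return observed


if __name__ == "__main__":
    print(simulate_handoff())
```

In this model the hand-off delay is bounded by the lease duration, which is why a smoke that kills the holder has to wait for at least one lease period before asserting that the standby has taken over.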
Summary

Backfills the Phase 12c CHANGELOG-narrated `compose-smoke-multi-experiment` deferral, re-scoped during impl (operator-authorized 2026-05-31).

- The reference impl hosts exactly one experiment per task-store-server (single-experiment `Store` + `ExperimentIdMismatch` guard at `eden-wire/_dependencies.py:73`; orchestrator multi-loop targets one task-store URL; integrator is one shared bare repo deployment-wide). 12c's multi-experiment surface was validated only against fake stores + the single-IUT conformance binding. So the planned "two experiments + cross-experiment isolation" smoke is unbuildable today.
- Lease-driven mode is opt-in via an `EDEN_CONTROL_PLANE_URL` env fallback on the orchestrator + web-ui CLIs (no entrypoint wrappers, no Dockerfile change). With it empty (the default, and the setting the existing six smokes run with), behavior is unchanged.

See `docs/plans/issue-147-compose-smoke-multi-experiment.md` §0 (governs) for the full re-scope rationale with code citations.

What this does NOT cover
- `dispatch_mode.termination = "auto"` 403s because `terminate_experiment` is `admins`-gated while the orchestrator is in `orchestrators` (spec inter-chapter drift, 03 §6.2 vs 07 §2.9 / 04 §8.2; pre-existing, never caught because existing smokes use `never_terminate` + quiescence-exit and dispatch tests run auth-disabled). Filed as #256, "Orchestrator auto-termination (decision-type 0) 403s under wire auth: terminate_experiment is admins-gated but orchestrator is in orchestrators". The smoke uses the supported operator-driven `terminate_experiment` path instead.
- The lease-driven orchestrator does not self-join the task-store `orchestrators` group: the multi-experiment path joins only the control-plane group, so the smoke seeds the task-store group as a workaround. The in-orchestrator fix is folded into #254, "Multi-experiment task-store-server hosting (prereq for #147 cross-experiment isolation smoke)".

Fresh-operator walkthrough
- `setup-experiment`, the new smoke, the `EDEN_CONTROL_PLANE_URL` toggle, observability §3.4).
- Ran `smoke-multi-experiment.sh` locally end-to-end → PASS (control-plane `/healthz`; one registered experiment; single lease holder = orchestrator-2; chaos kill → clean hand-off to orchestrator; `integrated=2 exec_completed=3 eval_completed=2`; operator-driven terminate → `experiment.terminated`; control-plane `last_known_state` → `terminated`).
- Ran `smoke.sh` → PASS (regression: existing single-experiment smoke unaffected by the always-on control-plane).

Test plan
- `uv run ruff check .`: clean
- `uv run pyright`: clean
- `python3 scripts/check-rename-discipline.py`: clean
- `npx markdownlint-cli2 ...`: clean (changed docs)
- `uv run pytest -q`: 1951 passed; 4 failures are environmental sandbox flakes (`os.killpg` EPERM under cross-worktree load / git-in-tmp / docker), all in `eden-checkpoint` / `_common` container_exec / `ideator` subprocess tests that import none of this PR's changed modules (`test_dispatch_collects_ideas` passes in isolation; `git commit --allow-empty` works directly). CI exercises these on clean runners.
- `bash reference/compose/healthcheck/smoke-multi-experiment.sh`: PASS
- `bash reference/compose/healthcheck/smoke.sh`: PASS (regression)

The new `compose-smoke-multi-experiment` CI job is added unrequired (not branch-protected); bump to required-status after ~2 weeks clean on main, same posture as `compose-smoke-checkpoint` / `compose-smoke-multi-orchestrator`.

Related issues
🤖 Generated with Claude Code