Implement: Backfill: compose-smoke-multi-experiment CI job (#147) #258

Open
ealt wants to merge 2 commits into main from impl/issue-147-compose-smoke-multi-experiment

ealt commented Jun 2, 2026

Summary

Backfills the Phase 12c CHANGELOG-narrated compose-smoke-multi-experiment deferral — re-scoped during impl (operator-authorized 2026-05-31).

  • The reference impl hosts exactly one experiment per task-store-server (single-experiment Store + ExperimentIdMismatch guard at eden-wire/_dependencies.py:73; orchestrator multi-loop targets one task-store URL; integrator is one shared bare repo deployment-wide). 12c's multi-experiment surface was validated only against fake stores + the single-IUT conformance binding. So the planned "two experiments + cross-experiment isolation" smoke is unbuildable today.
  • This PR ships the genuinely-new, genuinely-shippable substrate piece instead: the control-plane as a first-class Compose service, plus a lease-lifecycle + lease-handoff chaos smoke. True multi-experiment hosting and the cross-experiment-isolation smoke are deferred to #254, "Multi-experiment task-store-server hosting (prereq for #147 cross-experiment isolation smoke)".
  • Lease mode is opt-in via an EDEN_CONTROL_PLANE_URL env fallback on the orchestrator + web-ui CLIs (no entrypoint wrappers, no Dockerfile change). With it empty (the default, and the setting the existing six smokes run under), behavior is unchanged.
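
The env-fallback behavior described in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes an argparse-based CLI, and `build_parser`, `lease_mode_enabled`, the `prog` name, and the example URL are all hypothetical. Only the `--control-plane-url` flag, the `EDEN_CONTROL_PLANE_URL` variable, and the "empty treated as unset" rule come from the PR text.

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose --control-plane-url falls back to the environment."""
    env = os.environ if env is None else env
    # An empty value is treated as unset, so `EDEN_CONTROL_PLANE_URL=` keeps
    # lease mode off (the default posture described above).
    fallback = env.get("EDEN_CONTROL_PLANE_URL") or None
    parser = argparse.ArgumentParser(prog="orchestrator-sketch")  # hypothetical name
    parser.add_argument("--control-plane-url", default=fallback)
    return parser


def lease_mode_enabled(argv, env=None):
    """Return True only when a non-empty control-plane URL is in effect."""
    args = build_parser(env).parse_args(argv)
    # An explicitly empty flag value is normalised to "unset" as well.
    return bool(args.control_plane_url)
```

With this shape, `lease_mode_enabled([], {"EDEN_CONTROL_PLANE_URL": ""})` is false, and setting the variable to any non-empty URL flips every CLI that shares the fallback without touching entrypoints or the Dockerfile.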

See docs/plans/issue-147-compose-smoke-multi-experiment.md §0 (governs) for the full re-scope rationale with code citations.

What this does NOT cover

Fresh-operator walkthrough

  • Performed against the changed operator surfaces (compose stack, setup-experiment, the new smoke, the EDEN_CONTROL_PLANE_URL toggle, observability §3.4).
  • Notes: ran smoke-multi-experiment.sh locally end-to-end → PASS (control-plane /healthz; one registered experiment; single lease holder = orchestrator-2; chaos kill → clean hand-off to orchestrator; integrated=2 exec_completed=3 eval_completed=2; operator-driven terminate → experiment.terminated; control-plane last_known_state → terminated). Ran smoke.sh → PASS (regression: existing single-experiment smoke unaffected by the always-on control-plane).
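
The lease-singleton and hand-off checks in the notes above can be pictured with a toy model. This is a sketch under stated assumptions, not the control-plane's real lease store or wire protocol: `LeaseTable`, `contend`, `simulate_handoff`, the 10-second duration, and the timestamps are all hypothetical; only the replica names and the expected holder sequence mirror the smoke run.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LeaseTable:
    """Toy in-memory stand-in for a lease store (hypothetical)."""

    duration: float
    holder: Optional[str] = None
    expires_at: float = 0.0

    def try_acquire(self, replica: str, now: float) -> bool:
        # Grant when the lease is free or expired; renew when already ours.
        free = self.holder is None or now >= self.expires_at
        if free or self.holder == replica:
            self.holder = replica
            self.expires_at = now + self.duration
            return True
        return False


def contend(table: LeaseTable, live_replicas: List[str], now: float) -> List[str]:
    """Return the replicas that hold the lease after one contention round."""
    return [r for r in live_replicas if table.try_acquire(r, now)]


def simulate_handoff() -> List[List[str]]:
    """Two replicas contend, the holder is killed, the standby takes over."""
    table = LeaseTable(duration=10.0)
    both = ["orchestrator-2", "orchestrator"]
    rounds = [
        contend(table, both, now=0.0),   # first contender wins the lease
        contend(table, both, now=5.0),   # holder renews, standby is refused
    ]
    # Chaos kill: orchestrator-2 stops contending and its lease is left to expire.
    rounds.append(contend(table, ["orchestrator"], now=10.0))  # lease still live
    rounds.append(contend(table, ["orchestrator"], now=15.0))  # expired, handed off
    return rounds
```

The singleton invariant is that no round ever returns more than one holder; the hand-off assertion is that the final round returns the surviving replica.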

Test plan

  • uv run ruff check . — clean
  • uv run pyright — clean
  • python3 scripts/check-rename-discipline.py — clean
  • npx markdownlint-cli2 ... — clean (changed docs)
  • uv run pytest -q — 1951 passed; 4 failures are environmental sandbox flakes (os.killpg EPERM under cross-worktree load / git-in-tmp / docker), all in eden-checkpoint / _common container_exec / ideator subprocess tests that import none of this PR's changed modules (test_dispatch_collects_ideas passes in isolation; git commit --allow-empty works directly). CI exercises these on clean runners.
  • bash reference/compose/healthcheck/smoke-multi-experiment.sh — PASS
  • bash reference/compose/healthcheck/smoke.sh — PASS (regression)

The new compose-smoke-multi-experiment CI job is added unrequired (not branch-protected); bump to required-status after ~2 weeks clean on main, same posture as compose-smoke-checkpoint / compose-smoke-multi-orchestrator.

Related issues

🤖 Generated with Claude Code

ealt force-pushed the impl/issue-147-compose-smoke-multi-experiment branch from 859fea9 to 515bd06 June 2, 2026 02:06
ealt enabled auto-merge (squash) June 2, 2026 02:06
ealt force-pushed the impl/issue-147-compose-smoke-multi-experiment branch 4 times, most recently from 13070d3 to 403dc20 June 2, 2026 20:03
ealt and others added 2 commits June 2, 2026 14:11
Wires the chapter-11 control-plane-server into the reference Compose
stack as an always-on, Postgres-backed service, and makes lease-driven
mode opt-in via an EDEN_CONTROL_PLANE_URL env fallback on the
orchestrator + web-ui CLIs (no entrypoint wrappers). The existing six
smokes are unchanged in posture: with the URL empty they run
single-experiment and ignore the control plane.

- control-plane server: add unauthenticated /healthz (outside
  /v0/control) for the compose healthcheck + a unit test.
- orchestrator + web-ui CLIs: --control-plane-url defaults to
  $EDEN_CONTROL_PLANE_URL (empty treated as unset).
- compose.yaml: always-on `control-plane` service (Postgres store in a
  separate eden_control_plane database, chapter 11 §3.4 Option A);
  postgres init hook (init-control-plane-db.sh) creates that database
  on a fresh data dir; orchestrator gains the EDEN_CONTROL_PLANE_URL
  env, --lease-duration-seconds, and a control-plane depends_on;
  web-ui gains the EDEN_CONTROL_PLANE_URL env.
- setup-experiment: emit EDEN_CONTROL_PLANE_STORE_URL +
  POSTGRES_DB_CONTROL_PLANE + EDEN_CONTROL_PLANE_URL= (empty) and
  create the logs/control-plane substrate dir.
- Delete compose.control-plane.yaml (its only content was web-ui
  flag-passing, now baked in conditionally); rewrite observability.md
  §3.4 to the first-class-service + env-toggle flow.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Re-scopes #147 (operator-authorized): the reference impl hosts exactly
one experiment per task-store-server (single-experiment Store +
ExperimentIdMismatch guard; orchestrator multi-loop targets one
task-store URL; integrator is one shared bare repo deployment-wide), so
the planned two-experiment cross-experiment-isolation smoke is
unbuildable. True multi-experiment hosting + that smoke are deferred to

- compose.multi-experiment.yaml: a second orchestrator-2 replica in
  lease mode (env-fallback flips both replicas when
  EDEN_CONTROL_PLANE_URL is set).
- smoke-multi-experiment.sh + the compose-smoke-multi-experiment CI job
  (unrequired initially): control-plane /healthz; one registered
  experiment; two lease-contending replicas; lease-singleton invariant;
  kill the holder + assert clean hand-off to the standby; surviving
  replica drives the pipeline to >=2 variant.integrated; operator-driven
  terminate; control-plane last_known_state converges to terminated.
  Validated locally (PASS); smoke.sh regression PASS.
- Re-scope the plan doc (§0 governs); CHANGELOG [Unreleased] entry
  (closes #147); AGENTS.md Commands row; README "Multi-experiment mode";
  user-guide aside; same-PR audit of the retired compose.control-plane
  overlay refs (issue-110/-182 forward refs updated; issue-157
  historical analysis preserved).

Two pre-existing gaps surfaced by the deployed-substrate smoke and filed:
the lease-driven orchestrator doesn't self-join the task-store
orchestrators group (#254), and its auto-termination decision 403s under
wire auth because terminate_experiment is admins-gated (#256). The smoke
seeds the group + uses operator-driven termination as workarounds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ealt force-pushed the impl/issue-147-compose-smoke-multi-experiment branch from 403dc20 to 75e741a June 2, 2026 21:11

Development

Successfully merging this pull request may close these issues.

Backfill: compose-smoke-multi-experiment CI job (Phase 12c deferral)
