| 1 | Validate v0.1.2 on real Windows 11 + VS Code 1.107+ | high | open | The M3 capability-shape question gets answered live by VS Code's response to our dual-declaration (tools.tasks: true AND top-level tasks: {}). Run on a hardened enterprise machine to validate the cmd→ps1→py wrapper chain. F1 fix (sys.executable everywhere) means this should now succeed even on hardened machines with no python3 alias. |
| 2 | Flesh out reference stub: skills/vs-code-copilot-foundry/references/architecture.md | medium | open | Currently 4-line placeholder. Should cover: foundry-server internals (job manager, skill registry, resource layer), persona-to-server data flow, sampling vs subprocess delegation trade-off. |
| 3 | Flesh out reference stub: skills/vs-code-copilot-foundry/references/byok-setup.md | medium | open | Currently placeholder. Should cover: VS Code 1.117+ BYOK setup for Business/Enterprise, plugging Anthropic / OpenAI / Google keys directly, how it integrates with foundry persona model: chains. |
| 4 | Flesh out reference stub: skills/vs-code-copilot-foundry/references/custom-agents-spec.md | medium | open | Currently placeholder. Should cover: .agent.md frontmatter schema, handoff buttons, model fallback chains, target field, agents allowlist, runSubagent semantics, depth=5 cap, cost-tier constraint. |
| 5 | Flesh out reference stub: skills/vs-code-copilot-foundry/references/setup-copilot-cli.md | medium | open | Currently placeholder. Should cover: copilot --agent foundry -p ... headless mode, /agent slash command, /fleet parallel orchestration, model selection in CLI vs picker. |
| 6 | Flesh out reference stub: skills/vs-code-copilot-foundry/references/troubleshooting.md | medium | open | Currently placeholder. Should consolidate all the "common issues" tables from INSTALL.md, SKILL.md, and setup-vscode-chat.md into one canonical troubleshooting reference. Add Zone Identifier section, mcp.json troubleshooting, persona auto-discovery debugging. |
| 7 | Write foundry_design coordinator script | medium | open | v0.1.x returns a hint asking the caller (@forge) to do parallel codex+gemini calls manually. v1.2 should ship foundry-server/coordinator_design.py that runs the parallel delegations, aggregates, returns a synthesized design hypothesis to forge. |
| 8 | Build foundry-cli helper binary | medium | open | Currently config edits happen by hand-editing ~/.vs-code-foundry/config.json. Ship a small CLI: `foundry-cli enable claude-bridge`, `foundry-cli redeploy --workspace <path>`, `foundry-cli status`, `foundry-cli logs`. Install to ~/.vs-code-foundry/bin/. |
| 9 | Extract foundry-server modules if it grows | low | open | foundry_server.py is ~1080 LOC currently. If it grows past ~1300 LOC, split into foundry_server.py (protocol), foundry_jobs.py (job manager), foundry_skills.py (skill registry), foundry_resources.py (MCP resources). Tests follow. |
| 10 | Optional: thin VS Code extension wrapping foundry-server | low / v2 | open | Use mcpServerDefinitionProviders to ship foundry-server as part of an extension, eliminating manual .vscode/mcp.json editing for users. ~1000-1500 LOC TS, 5-7 dev-days, Marketplace publication. Defer until real-user-demand-signal is clear. |
| 13 | Sign the PowerShell scripts with a code-signing certificate | low | open / future | For environments that enforce AllSigned. Requires acquiring a code-signing cert; cost + maintenance burden. Defer until a real user requests it. |
| 14 | Public-repo decision (currently PRIVATE) | open | future | If/when going public: scrub paths, polish README for public audience, add Marketplace/installer landing pages, decide LICENSE. Currently MIT but not yet committed publicly. |
| 15 | Real-fleet user feedback gathering | low | future | Once 5+ real users have installed foundry, gather telemetry on which personas + tools are actually used. Drives the v1.2 polish backlog. |
| 16 | Remove v0.1.x backward-compat symlink at ~/.vs-code-foundry/skills/vs-code-copilot-foundry | low | open / v0.3.x | The v0.2.0 installer creates this symlink so anyone with hand-coded absolute paths or custom MCP clients pointing at the old location keeps working. Removable in v0.3.x once enough time has passed for consumers to migrate. Drop the migration block in install_skill_family() and document the removal in CHANGELOG. |
| 17 | Schema-validate one SKILL.md from the family against the published agentskills.io schema as part of installer smoke | low | open | Defensive guard against schema drift between foundry's installed skills and MS Copilot's parser expectations. Both use agentskills.io; this would catch any future divergence. ~30 LOC in installer/install-foundry.py smoke_test() or a new helper. |
| 18 | Real Windows validation of v0.2.0 (Developer Mode + non-admin symlink path) | medium | done (superseded) | Closed 2026-05-21 by the win11-laptop install of v0.3.1+. Confirmed: (a) WinError 1314 warning fires cleanly on non-Developer-Mode + try/except continues correctly; (b) ~/.copilot/skills/ canonical location works regardless; (c) Path.home() honors USERPROFILE. Three subsequent fixes (v0.3.1 / v0.3.2 / v0.3.3) shipped same day from findings. v0.3.3 full re-install still pending real-machine validation — tracked as #23. |
| 23 | Real Windows validation of v0.3.3 full re-install path | medium | open | Re-run `install-foundry.cmd -Yes -PrefixPath %USERPROFILE%\.vs-code-foundry` on win11-laptop after git pull + git checkout v0.3.3. Confirm: (a) 6 personas land in ~/.copilot/agents/ AND ~/.vs-code-foundry/agents/; (b) no DeprecationWarning: datetime.utcnow(); (c) personal-os smoke leg completes with status: PASS (no UnicodeEncodeError on report-print); (d) Developer: Reload Window in VS Code → @ autocomplete shows all 6 personas. Then workspace-install round (`-WorkspacePath C:\dev\vs-code-foundry`) → confirm `.vscode\mcp.json` registers both servers + @foundry foundry_health + @kit kit_health both respond. File REPLY thread in remote_claude.md. |
| 19 | Installer auto-creates workspace dir when --workspace points at a missing path | low | open | Discovered during WP-14 smoke 2026-05-18: when `--workspace <path>` points at a non-existent dir, install-foundry.py prints ERROR: Workspace path is not a directory but proceeds anyway. install_workspace_skills_scaffold() creates the dir (and a .github/skills/README.md), then install_workspace() short-circuits because the dir didn't pre-exist. Net result: partial workspace install (just the skills scaffold). Pre-existing v0.2.0 behavior, not v0.3.0 regression. Fix: have install_workspace() also Path.mkdir(parents=True, exist_ok=True) before writing files, OR have _validate_workspace_path() mkdir as part of its check. Either way the misleading ERROR log line should be downgraded to INFO + auto-creation message. |
| 20 | Document Claude Code agent name collision in INSTALL.md / AGENTS.md R-rules | low | done (v0.5.1) | Discovered 2026-05-21 on win11-laptop: VS Code Copilot scans both ~/.copilot/agents/ (foundry) and ~/.claude/agents/ (Claude Code) per MS Agent Skills spec → alf/bob/pa name dupes in Copilot Chat's @ autocomplete. Closed in v0.5.1 (feature/v0.5.1-installer-agent-reconcile): (a) INSTALL.md "Agent scopes & dedup" + "Known interactions (#20)" sections shipped; (b) AGENTS.md R2 carve-out wording (the installer may do a metadata exists() only on ~/.claude/agents/{bob,alf,pa}.md for the warning — no content read/write/delete); (c) the installer now DETECTS the collision at install time and prints a metadata-only warning (warn_claude_collision()), and the unrelated foundry-internal 2× bob duplication is fixed by the reconcile engine + --agent-scope default global. NOTE: the ~/.claude/ name collision itself is a cross-tool fact that foundry only warns about (never deletes Claude's files); real win11 smoke of the warning is tracked in remote_claude.md. |
| 21 | Installer "user-level-only" mode should warn that kit_* tools won't function without workspace MCP wiring | low | open | Discovered 2026-05-21: when user answered Install to current workspace? n, persona files (post-v0.3.3) reach ~/.copilot/agents/ and Copilot Chat shows @kit. But kit_health / kit_status / etc. all fail because personal-os-server isn't registered in any workspace's .vscode/mcp.json. User saw @kit appearing but tools nonfunctional — confusing. Fix: when --with-personal-os is ON and workspace install is being skipped, emit a clear INFO line at end of install explaining the limitation + how to wire a workspace later. |
| 22 | Smoke runner status line on Windows: avoid printing dynamic content that could exceed cp1252 even with reconfigure | low | open | v0.3.2 fixed the immediate UnicodeEncodeError by reconfigure(utf-8, errors=replace), but the underlying issue is that log lines accumulating across all 14 KIT tool exercises CAN contain user-data with non-cp1252 chars (e.g., file paths with em-dashes, sample task titles). Long-term: audit smoke_runner.py's log accumulation for any string interpolation that could surface user content. Belt-and-braces with the v0.3.2 reconfigure should hold; this is preventive. |
| 24 | Evaluate Agent Host Protocol (AHP) for foundry | medium | open / v0.4 eval | VS Code 1.121 introduced AHP (microsoft.github.io/agent-host-protocol/) plus Remote Agents (Preview): agent sessions coordinated across SSH / Dev Tunnels, with a lightweight "agent host" process that survives client disconnection. Foundry today is local-stdio MCP only. Decide one of: (a) ignore for v0.x (foundry stays local), (b) prototype @bob / @kit running on a remote AHP host in v0.4, (c) declare no-fit. Inputs: AHP spec, multi-client coordination story, whether foundry-server can host as AHP server, whether stdio MCP still works inside an AHP host. Deliverable: 1-page decision doc in docs/plans/. |
| 25 | Document Claude-agent permission settings in agents/bob.agent.md + INSTALL.md | medium | open | VS Code 1.121 added github.copilot.chat.claudeAgent.allowAutoPermissions (Auto Mode — execute without permission prompts but with background safety checks) and github.copilot.chat.claudeAgent.allowDangerouslySkipPermissions (unrestricted). @bob is the persona that writes code; users currently get prompted for every edit. Document both settings in agents/bob.agent.md body (recommend allowAutoPermissions for trusted workspaces, never allowDangerouslySkipPermissions) and add an INSTALL.md "Autonomy modes" subsection. R6 (no --no-verify) still holds — these settings affect prompts only, not commit policy. |
| 26 | Add chat.utilityModel / chat.utilitySmallModel recommendations to BYOK reference stub | low | open | Folds into task #3 (BYOK reference stub). VS Code 1.121 added two settings to override the default model for general utility flows (titles, summaries, commit messages, rename suggestions) and lightweight utility tasks. Orthogonal to persona model: chains (R9) — these are user-tier. Recommend cheap small models here to cut token cost; suggest claude-haiku-4-5-20251001 for chat.utilitySmallModel and gemini-flash-line or gpt-5.4-mini (subject to user BYOK) for chat.utilityModel. |
| 27 | Verify persona model: display-names resolve under all 6 BYOK providers | medium | open | April-2026 Copilot release: Business/Enterprise BYOK now covers OpenRouter, Microsoft Foundry, Google, Anthropic, OpenAI, and other Chat-Completions/Responses/Messages-compatible endpoints. R9 mandates display-name strings (e.g., 'Claude Opus 4.7 (anthropic)'). Smoke-test that each of the 6 personas' model: declarations actually resolve when the workspace BYOK admin policy is set. Also note: the Insiders Custom Endpoint Provider replaces the deprecated customoai provider — update references/byok-setup.md (task #3). Output: a docs/byok-matrix.md mapping each persona × each BYOK provider → resolved model. |
| 28 | Audit JobManager.spawn for VSCODE_AGENT env-var awareness | medium | open | VS Code 1.121 sets VSCODE_AGENT on agent-initiated terminals so CLIs can detect agent context and switch to machine-readable output. foundry-server's JobManager.spawn and the future coordinator_design.py (task #7) run subprocesses (foundry_codex etc.) that could benefit. Two-part fix: (a) JobManager.spawn passes VSCODE_AGENT=1 to its subprocess env (so child CLIs know they're under an agent — they already are; just be explicit), (b) when foundry-server itself sees VSCODE_AGENT set by VS Code Copilot, log a single INFO line at startup confirming agent context. |
| 29 | Test foundry_codex long-running jobs vs VS Code background-terminal auto-cleanup | medium | open | VS Code 1.121: background terminals created by chat agents auto-dispose upon command completion. foundry-server's JobManager spawns its own subprocess directly (not via the chat terminal) so the disposal shouldn't reach it, but the new behavior changes the affordance landscape. Smoke-test: start a long-running foundry_codex job, switch chat context, verify the job continues + result is retrievable via foundry_jobs_get. Add to tests/test_foundry_server.py if reproducible. |
| 30 | Evaluate deferring foundry_search_skills to Copilot's local semantic index | low / v2 | open | April-2026: semantic indexing now works in all workspaces; agents can run grep-style search across GitHub repos/orgs via the new githubTextSearch tool; experimental /chronicle queries chat history (github.copilot.chat.localIndex.enabled). foundry-server's SkillRegistry / foundry_search_skills is glob+regex based. Evaluate whether to (a) keep glob (simple, stdlib), (b) defer to Copilot's local index when present, (c) emit both. Decision constrained by R8 stdlib-only. |
| 31 | Optional OTel emission from foundry_* and kit_* tool handlers | low / v2 | open | VS Code 1.121 ships a prebuilt Azure Managed Grafana dashboard visualizing agent operations / token usage / chat sessions / tool calls / per-model latency. If foundry-server emitted OpenTelemetry spans per tool call, users' dashboards would show foundry tool usage natively. Tension with R8 stdlib-only — opentelemetry-api would be foundry's first dep. Possible compromise: env-gated optional dep (pip install vs-code-foundry[otel]); off by default; foundry-server runs without it if not installed. Defer until a user requests dashboard integration. |
| 32 | Review agents/bob.agent.md for stale text vs new in-chat diff visualization | low | open | April-2026: code changes display as inline diffs directly in chat threads. @bob persona body may contain instructions like "your edits will be applied silently" or "the user can't see the diff" that contradict the new affordance. Grep agents/bob.agent.md for such language and update — the new affordance is better UX so any stale defensive text just reads weird now. ~10 LOC change at most. |
| 33 | Document terminal read/write security boundary in INSTALL.md "Security" section | medium | open | April-2026: agents gained read/write capabilities to any open terminal in VS Code Copilot. This is orthogonal to foundry-server's R2 boundary (which governs what foundry-server itself reads), but it changes the user's threat model — any other agent in the same workspace can now read terminals foundry-server's subprocesses (foundry_codex etc.) are running in. INSTALL.md needs a new subsection alongside the Claude Code agent name collision (task #20) covering: (a) what VS Code's new terminal access means, (b) recommend isolating sensitive workspaces, (c) note that VSCODE_AGENT being set is the signal. |
| 34 | Update @forge persona body to emit mermaid for design exploration | low | open | VS Code 1.121: built-in Mermaid Markdown Features extension renders mermaid code blocks in Markdown preview, notebooks, AND chats, with pan/zoom support. @forge does design exploration in chat — emitting mermaid sequence diagrams, component graphs, state machines would substantially improve the design conversation. Update agents/forge.agent.md body to encourage mermaid output for: component relationship graphs, sequence diagrams during cross-CLI deliberation, state machines for cascade transitions. No tools change needed. |
| 35 | Self-review V2 — decide the fate of the signed progress/contract-map.yaml (wire it or remove it) | high | open | Architecture review 2026-05-23 (docs/reviews/2026-05-23-cascade-architecture-review.md). progress/contract-map.yaml + .sig exist (HMAC-signed, "rev 2 covers both servers", AGENTS.md 245-248) but a full grep of foundry-server/, agents/, and tests finds NO runtime read, NO signature verification, NO gate consuming them — vestigial copy of internal-rnd's pattern, never wired in. A signed artifact nothing verifies implies integrity the system doesn't enforce. Decision: (a) wire a real verify step bob/forge must pass before execution (gives the fork one genuine mechanical gate, partially closes V1), OR (b) delete the artifact + document in AGENTS.md/PROJECT.md that vs-code-foundry is prompt-discipline-only. Cheapest high-value item; do first. alf score 6. |
| 36 | Self-review V4 — build efficacy-telemetry rollup on the existing actions.jsonl substrate | medium | open | Architecture review 2026-05-23. No metric exists for bob-PARTIAL rate, Codex/Gemini false-positive rate (forge/alf both cite "~60% FP" with zero measurement behind it), user-override rate, or test-failure-at-completion rate. BUT foundry_log_action + ~/.vs-code-foundry/actions.jsonl already log actions — substrate is there, only the rollup/metric layer is missing. Ship a foundry-cli metrics (or MCP foundry_metrics read-only tool) that aggregates actions.jsonl into the above rates. Makes the "triple-model coverage is the value-add" claim falsifiable. Highest ROI for making every other claim measurable. alf score 8 (MODERATE). |
| 37 | Self-review V7 — add a windows-latest CI job (riskiest platform, least automated coverage) | medium | open | Architecture review 2026-05-23. CI (#11) runs ubuntu+macos × py3.10/3.11/3.12 but EXCLUDES Windows (tests use POSIX paths /bin/cat, /tmp). Yet Windows is where bugs keep surfacing: cp1252 console (#22), py3.14 (v0.3.2), WinError 1314 symlink (#18), datetime.utcnow (#23). The most-targeted enterprise platform is the only one without a CI job. Fix: add @unittest.skipIf(os.name=='nt') (or tempfile/shutil.which abstractions) to the POSIX-path tests, then add a windows-latest leg to .github/workflows/ci.yml. Distinct from #23 (one-off manual re-validation) — this AUTOMATES it. alf score 8 (MODERATE). |
| 38 | Self-review V3 — add an independent verification pass to the design→build arc | medium | open | Architecture review 2026-05-23. Unlike internal-rnd's cold-context dual-verdict, here bob SELF-reports COMPLETE/PARTIAL/FAILED (bob.agent.md Step 5); the only objective check is "run the test suite" (Step 4); "report PARTIAL honestly" is prose (line 156). @alf can review bob's output but only runs on explicit user invocation — it is NOT in the build path. A padded COMPLETE is caught only if a human or separately-invoked alf looks. Candidate: make forge's bob-handoff include an auto-alf-review step on completion (button → alf), OR a lightweight foundry_verify MCP tool that re-runs tests + diffs against the WP plan independently of bob's self-report. alf score 8 (MODERATE). |
| 39 | Self-review V8 — no SAST/secrets gate on bob-generated code | medium | open | Architecture review 2026-05-23. bob writes code with only an OPTIONAL foundry_codex review (bob line 143). No secrets scan, no SAST, no pre-commit security gate. internal-rnd has a pre-push secrets-scan + the S038 SAST batch in flight (#109/#112 there); none exists here. The Copilot fork ships code to user workspaces with no security floor beyond "ask Codex if you feel like it." Candidate: port internal-rnd's scripts/secrets-scan.{sh,py} as a foundry pre-commit hook the installer can wire, + optional bandit/semgrep invocation via a foundry_sast tool. Coordinate with cross-repo-review.md (internal-rnd S038 is the upstream source). alf score 7 (MINOR, just under threshold). |
| 40 | Self-review V1 — no mechanical enforcement floor; all persona hard rules are prose (keystone) | high | open | Architecture review 2026-05-23. The defining structural gap. internal-rnd's thesis ("convert prose rules to subprocess gates because LLMs drift") is dropped entirely here — bob's "Hard rules" (lines 147-157) are unenforced prose with no backstop if the model ignores them. Intrinsic to the Copilot platform choice to a degree (no easy mid-agent subprocess gates), but it means cascade correctness == model instruction-following on a given turn, with no floor. Root cause beneath V2/V3/V5. Not a single fix — a direction: decide how much mechanical floor this fork wants. Minimum viable floor = wire the contract-map (V2/#35) + an independent verify (V3/#38) + a security gate (V8/#39). Revisit scope after #35 and #36 land. alf score 8 (CRITICAL-structural; additive formula compresses it). |
| 41 | Self-review V5 — autonomy fully delegated to IDE host; foundry has no backstop of its own | low | open | Architecture review 2026-05-23. bob.agent.md 118-129 cedes the permission model to VS Code 1.121 claudeAgent.allowAutoPermissions ("execute without prompts but with background safety checks") — but those checks are COPILOT'S, not foundry's. If a user flips allowDangerouslySkipPermissions, nothing foundry-side catches a destructive action. internal-rnd had gates as a host-independent backstop. Low-urgency given the warning text already discourages the dangerous setting, but worth a documented "foundry provides no independent safety check; you are trusting the IDE host" note in INSTALL.md Security section (folds with #33). alf score 6. |
| 42 | Self-review V6 — fork drift from internal-rnd is structural and one-directional | low | open | Architecture review 2026-05-23. AGENTS.md R2 accepts persona drift by design ("this is a fork"). Consequence: internal-rnd's rigor advances (gate system, dual-verdict, S038 security batch) and NONE flows here automatically; cross-repo-review.md is a manual queue. This fork falls progressively behind the original on rigor unless human-pollinated. Not necessarily wrong — but make it a tracked cadence: a periodic (monthly?) pass over internal-rnd's history.md + cross-repo-review.md Outbound entries to decide what to adopt. Otherwise drift compounds silently. alf score 6. |
| 43 | Self-review V10/V11/V13 — minor hygiene: read-only invariant, version drift, bob rollback | low | open | Architecture review 2026-05-23. Three MINOR items bundled. V10: personal-os→foundry "read-only" (R13) is SQLite mode=ro convention, not enforced — add a test asserting no rw open path exists across personal-os reads (score 5). V11: VERSION constant must be hand-synced across foundry_server.py:34 + install-foundry.py ("pre-existing drift acknowledged" in v0.1.2 notes) — single-source it (e.g. read from one VERSION file) so a release can't ship mismatched self-reported versions (score 5). V13: bob runs sequentially in one chat with no .bob-checkpoint.md equivalent; an interrupted multi-WP run loses progress state (per-WP commits mitigate partially) — consider a lightweight resume protocol (score 5). |
| 44 | @testbed forge cycle continuation — finish design sections 3-9 + write canonical design doc + spawn bob | high | open / v0.4 | Forge cycle paused 2026-05-23 at sections 1 (Goal) + 2 (Approach) user-approved. WIP captured at docs/plans/2026-05-23-testbed-design-WIP.md. Decisions frozen: B+C+D modalities (A visual deferred to v0.5 per R8 stdlib-only); terminal testbed (no auto-fix loop, max 1 user-approved recheck); separate testbed-server (3rd MCP server, mirrors personal-os-server per R13); HMAC freshness gate on progress/contract-map.yaml.sig; opt-in dev-server policy. Do NOT re-spawn the triple-model design team — full convergence captured in WIP doc (Claude challenger NEEDS-REWORK 4 CRITICAL, Codex challenger NEEDS-REWORK 3 CRITICAL, Gemini analyst 8 research areas). Continue with: section 3 (Components — files + line-count), 4 (Data flow — verdict pipeline), 5 (Error handling — INCONCLUSIVE / INFRA-FAILED vs DELIVERED-BROKEN), 6 (Testing — stdlib-unittest mocking strategy for browser tools), 7 (Performance — modality run-time budgets), 8 (Open questions), 9 (WP plan for bob). Per forge protocol pause after each section. Then write canonical design at docs/plans/2026-05-23-testbed-and-contract-loop-design.md, generate signed contract map via component-contract-mapping skill, run G1 verify, spawn bob. |
| 45 | Implement testbed-server (v0.4) — Phase 1 B+C+D modalities, terminal posture, dedicated 3rd MCP server | high | open / v0.4 | Blocked on #44 (design doc + contract map). Phase 1 scope (per user decisions captured in #44 WIP): new testbed-server/ Python stdlib MCP server (~6 tools, mirrors foundry-server/ shape) + new persona agents/testbed.agent.md + new skill skills/vs-code-copilot-foundry/references/visual-testing-playbook.md. Cascade integration: @bob auto-invokes when HMAC-fresh ledger artifacts present (Step 4 verification gate); @kit routes via kit_queue_prompt to target-workspace @bob/@pa (no direct invocation); @alf delegates rendered-behavior URL targets; @forge gains optional "Preview with @testbed" handoff button. R-rule update: R4 combined ≤40 tool cap stays under (~15 foundry + 14 personal-os + ~6 testbed = ~35); R13 extends to three-server topology. New tests: ~25 in testbed-server/tests/test_testbed_server.py + ~5 cascade-edge tests in foundry-server/tests/test_foundry_server.py. Installer (install-foundry.py) gains --with-testbed flag (default ON), registers third server in .vscode/mcp.json. R11 cross-repo-review.md entry: informational only (internal-rnd has the analogous visual-arbiter / verification-arbiter but vs-code-foundry's testbed is side-loaded with different boundaries — not a port request). |
| 46 | First-run environment scan + capabilities manifest — foundry-native env-adoption for no-CLI prod environments | high | done (pending win11 M1-M3 validation) 2026-06-07 | Shipped in v0.4.0 (docs/plans/2026-06-06-env-adaptive-cascade-design.md): foundry_env.py scanner + capabilities.json schema v1 + foundry_capabilities MCP tool + first-start daemon scan + installer manifest write + refresh_capabilities.py + verifier G13. M1-M3 manual smoke on win11-laptop still gates the v0.4.0 tag (see remote_claude.md). User requirement 2026-06-05. vs-code-foundry sits on top of agent-foundry-derived content that is CLI-flavored, but prod environments have NO CLIs (no claude/codex/agy, possibly no copilot binary) — VS Code + Copilot only. First run (install-time AND first foundry-server start — prod users may never re-run the installer) must scan the environment: copilot/codex/gemini/claude CLI presence+version, Python, git, network posture. Output: ~/.vs-code-foundry/capabilities.json (tier + per-tool availability, analogous to ~/.claude/state/inventory.json from the env-adoption skill — port the pattern, not the code). Consumers: foundry_health (extend existing probes), new read-only foundry_capabilities MCP tool (R4 budget: 16/25 after add), installer deploy decisions, #47 adaptation layer. Wiki analysis: .wiki/wiki/comparisons/foundry-multi-model-orchestration-options.md §Environment-tier availability. |
| 47 | Capability-adaptive cascade content — no-CLI degradation paths in personas/skills | high | done (pending win11 M1-M3 validation) 2026-06-07 | Shipped in v0.4.0: cascade personas (foundry/forge/bob/alf) got 'agent' tool + ## Capability routing + baked ## Capability floor (Tier 0) + mechanism-conditional hard rules + tier banner; 2 Tier-0 worker personas (challenger/analyst, (copilot) model arrays, subagent-only); TestPersonaTripwires mechanical gates. Chose adaptation mode (c) hybrid → resolved to static-floor + manifest-hint + live-tool (no install-time templating). M1-M3 win11 smoke gates the tag. Original requirement: re-route by tier — Tier 2 (CLIs) = current design; Tier 1/0 = native runSubagent fan-out with per-worker model: pins from Copilot's three-vendor catalog (triple-model challenge SURVIVES with zero CLIs). HARD boundary held: adaptation rewrote foundry's OWN shipped copies only, NEVER ~/.claude/ (R2). User requirement 2026-06-05; depends on #46. |
| 48 | Re-point the Tier-2 analyst lane after the gemini CLI retires 2026-06-18 | medium | closed by design (v0.5.0) 2026-06-08 | Resolved by the v0.5 dynamic perspective dispatch (docs/plans/2026-06-08-dynamic-perspective-dispatch-design.md). The analyst lane is no longer hard-pinned to a single CLI delegate: the analyst STANCE is dispatched dynamically from the perspective_policy plan (the roster), and on a host without a live external CLI it falls to the Tier-0 native floor `runSubagent('analyst', { model: <plan model> })` — design-decision (b), now the primary path. foundry_capabilities._recommended_routing already treats avail("gemini") or avail("agy") as the Tier-2 analyst signal (unchanged in v0.5), and agy is probed by the manifest, so the 2026-06-18 gemini-CLI cutover degrades cleanly with zero code change (analyst stance routes to the floor or to agy when present). Subprocess delegate swap to agy -p (option (a)) remains an OPTIONAL future polish, not a blocker — file a fresh narrow task if/when desired. |
| 49 | Config-driven model selection (was tracked in docs/models.md) | medium | closed by design (v0.5.0) 2026-06-08 | The old docs/models.md "make Layer 3 config-driven" item (hardcoded reviewer model names → read from config.json) is subsumed by the v0.5 model_roster. Model selection is now dispatch-time + roster-driven: roster.py's DEFAULT_ROSTER + read_roster(home) deep-merges a model_roster block from config.json, and resolve_perspectives projects the (stance, model, angle) plan — one config edit swaps models with no code change or test run, and a reinstall preserves the edit (deep-merge, not clobber). docs/models.md is now a thin pointer. R15 codifies the principle. |
| 50 | v0.6 — vs-code-foundry as the VS Code BRIDGE to agent-foundry (kill the "2 sets" maintenance; thin bridge + generated subagent-team floor) | high | design brief ready 2026-06-08 | DESIGN BRIEF → docs/plans/2026-06-08-v0.6-agent-foundry-bridge-design-BRIEF.md; start a fresh forge COMPLEX cycle on it. User direction: agent-foundry = single source of truth (176 skills + flows + canonical agents, "where main dev lives"); vs-code-foundry = thin Copilot BRIDGE (personas bridge the claude/codex/agy CLI use case). ADAPTIVE (standalone floor preserved). GENERATE the Tier-0 floor personas from agent-foundry (build-time adapter) to drive native VS Code 1.123 subagent TEAMS; DELEGATE to agent-foundry's CLI flows when present; ROUTE skills to the 176. Measured this session: the "2 sets" pain is the PERSONAS (bob 624L↔226L, 5 shared lines, drifting); skills barely duplicated; the 1338L server is legitimate bridge. Open sub-Qs (for the design team): generator transform; the enforcement-engine re-import decision (entangled with #35/#40 — its own phase); the agent collision (#20); the R2 amendment. Likely phased: P1 generator+floor / P2 delegation+skills / P3 enforcement. Revises the 2026-05-11 "deliberately separate" directive (governance — reflect in AGENTS.md R2 + cross-repo-review). 1.123 currency verified live 2026-06-08. |
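
Sketch for task #35, option (a): a minimal stdlib verify step for a signed contract map. This is an illustration only. It assumes the `.sig` file holds a hex-encoded HMAC-SHA256 of the raw map bytes and that the key is supplied by the caller; the real signature format and key source are not stated in this backlog, and the function names `sign`, `verify`, and `verify_files` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sign(payload: bytes, key: bytes) -> str:
    """Return the hex HMAC-SHA256 of payload (ASSUMED signature format)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, signature_hex: str, key: bytes) -> bool:
    """Constant-time comparison of the expected and recorded signatures."""
    expected = sign(payload, key)
    return hmac.compare_digest(expected, signature_hex.strip().lower())


def verify_files(map_path, sig_path, key: bytes) -> bool:
    """Verify a map file against its signature file.

    Any unreadable or non-ASCII input counts as a failed gate,
    so a missing artifact can never pass by accident.
    """
    try:
        payload = Path(map_path).read_bytes()
        signature = Path(sig_path).read_text(encoding="ascii")
    except (OSError, UnicodeDecodeError):
        return False
    return verify(payload, signature, key)
```

A gate built on this would refuse execution whenever `verify_files(...)` returns `False`, which is what gives the signed artifact an actual consumer.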
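
Sketch for task #36: one possible shape for the rollup layer over actions.jsonl. The record schema is an assumption, since this backlog does not document it: the keys `persona`, `status`, and `user_override` are hypothetical, as are the function names. Only two of the listed rates are shown.

```python
import json
from collections import Counter
from pathlib import Path


def rollup(lines):
    """Aggregate JSONL action records into simple rates.

    ASSUMPTION: each record is a JSON object with hypothetical keys
    "persona", "status" and "user_override".
    """
    bob_statuses = Counter()
    overrides = 0
    total = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip corrupt lines rather than abort the rollup
        if not isinstance(rec, dict):
            continue
        total += 1
        if rec.get("persona") == "bob":
            bob_statuses[rec.get("status", "UNKNOWN")] += 1
        if rec.get("user_override"):
            overrides += 1
    bob_total = sum(bob_statuses.values())
    return {
        "records": total,
        "bob_partial_rate": bob_statuses["PARTIAL"] / bob_total if bob_total else None,
        "user_override_rate": overrides / total if total else None,
    }


def rollup_file(path):
    """Convenience wrapper that streams a JSONL file through rollup()."""
    with Path(path).open(encoding="utf-8") as fh:
        return rollup(fh)
```

Rates come back as `None` when there is no data, so an empty log is distinguishable from a measured 0%.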
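
Sketch for task #28, part (a): passing VSCODE_AGENT to child processes. The helper names `build_child_env` and `spawn` are hypothetical and this is not the actual JobManager.spawn; it only illustrates the env handling. Using `setdefault` keeps any value VS Code already set, which is a design choice rather than something the task specifies.

```python
import os
import subprocess
import sys


def build_child_env(base=None):
    """Copy the parent environment and mark the child as agent-run."""
    env = dict(os.environ if base is None else base)
    # Keep an existing value (e.g. one set by VS Code); otherwise set "1".
    env.setdefault("VSCODE_AGENT", "1")
    return env


def spawn(argv):
    """Start a subprocess with VSCODE_AGENT present in its environment."""
    return subprocess.Popen(
        argv,
        env=build_child_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )


if __name__ == "__main__":
    child = spawn([sys.executable, "-c",
                   "import os; print(os.environ['VSCODE_AGENT'])"])
    out, _ = child.communicate(timeout=30)
    print(out.strip())
```

Copying the full parent environment, rather than passing a one-key dict, matters on Windows, where child processes generally need variables such as SystemRoot to start.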
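
Illustration for task #49: what "deep-merge, not clobber" means for a roster override. This is a generic recursive merge, not the code in roster.py; the function name `deep_merge` and the sample roster keys are hypothetical.

```python
import copy


def deep_merge(defaults, overrides):
    """Return defaults with overrides layered on top, key by key.

    Nested dicts are merged recursively; any other value in overrides
    replaces the default. Neither input is mutated.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical roster shape, for illustration only.
    default_roster = {
        "analyst": {"model": "model-a", "angle": "research"},
        "challenger": {"model": "model-b"},
    }
    user_config = {"analyst": {"model": "model-c"}}
    print(deep_merge(default_roster, user_config))
```

With a merge like this, a user who overrides one field keeps every default they did not touch, and new defaults shipped by a reinstall still appear alongside the user's edit.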