AGENTS.md — vs-code-foundry developer guidance

Canonical cross-tool dev guidance for this repo. Read by Claude Code (via the CLAUDE.md symlink), GitHub Copilot CLI and Codex CLI (both read AGENTS.md natively), and Gemini CLI (via a GEMINI.md @AGENTS.md import, if you set one up). When working on vs-code-foundry, follow this file.

For the TEMPLATE AGENTS.md that the installer deploys to user workspaces, see templates/AGENTS.md. They are different files. This file is for working ON foundry; the template is for projects that USE foundry.

What this project does

vs-code-foundry is a standalone agentic workflow for VS Code + GitHub Copilot. It implements a design → build → review cascade (foundry / forge / bob / alf / pa) using Copilot's native customization primitives (custom agents, agent skills, MCP) plus a focused Python MCP server.

As of v0.3.0 (2026-05-18) the former sibling vs-code-personal-os has been merged in as a subtree under personal-os/. That subtree adds a sixth persona — @kit — at universal altitude (cross-workspace task tracking, day planning, coaching) and a second MCP server (personal-os-server) co-registered alongside foundry-server. The two-server topology, state isolation, and reconciled rule set are codified in R13 below. Full history of the personal-os repo is preserved via git subtree.

This repo is deliberately separate from joogy06/agent-foundry (public Claude Code skills) and internal-rnd-mirror (private internal-rnd R&D mirror). Foundry does NOT modify ~/.claude/ and does NOT require Claude Code to be installed.

Build / test / lint

# Foundry-server (136 tests as of v0.5.0); the subshell keeps you at the repo root
(cd foundry-server && python3 -m unittest tests.test_foundry_server -v)

# Personal-os-server (43 tests)
(cd personal-os/personal-os-server && python3 -m unittest tests.test_personal_os_server -v)

# Both suites from repo root (179 tests total as of v0.5.0)
python3 -m unittest discover -v -s foundry-server -p 'test_*.py' && \
python3 -m unittest discover -v -s personal-os/personal-os-server -p 'test_*.py'

# Validate JSON files (templates/.vscode/mcp.json registers BOTH servers post-v0.3.0)
python3 -c "import json; json.load(open('templates/.vscode/mcp.json'))"

# Validate Python syntax across all installer + server files
python3 -c "import ast, glob; [ast.parse(open(f).read()) for f in glob.glob('foundry-server/**/*.py', recursive=True) + glob.glob('personal-os/personal-os-server/**/*.py', recursive=True) + glob.glob('installer/*.py') + glob.glob('personal-os/installer/*.py')]"

# Smoke-test the installer locally (isolated prefix; harmless to your real ~/.vs-code-foundry/)
python3 installer/install-foundry.py --yes --prefix /tmp/foundry-test-$(date +%s) --skip-smoke

There is no lint config yet (Python stdlib-only, hand-formatted). If you add ruff or black, document it here.

CRITICAL RULES (read first)

R1 — Windows installer hardening (NEVER violate)

The .cmd and .ps1 files are designed for enterprise-hardened Windows machines. These constraints are non-negotiable and apply to ANY future installer or helper script in this repo:

| Rule | Why |
|---|---|
| NO -ExecutionPolicy Bypass | Hardened machines disable bypass at GPO level; using it fails AND signals disrespect for the policy |
| NO dot-source (.) | Dot-source runs the script in the caller's scope, leaks variables/functions, often blocked by AppLocker / Constrained Language Mode |
| NO -Command (-c) | -File is the safer invocation mode; -Command allows arbitrary script-block injection from quoting accidents |
| -NoProfile | $PROFILE is not loaded: reproducible runs, immune to profile-based hijacking |
| -NonInteractive | PowerShell's Read-Host is disabled; Python's stdin still works (different layer) |
| -File only for .ps1 invocation | The .cmd invokes the .ps1 only via -File; the .ps1 invokes Python only via the call operator &. Never any other shape. |
| CRLF line endings on Windows scripts | .gitattributes enforces this. .cmd/.bat/.ps1 are text eol=crlf; .sh/.py/.md are text eol=lf. |
| Prefer PowerShell 7+ (pwsh.exe), fall back to 5.1 (powershell.exe) | Both supported; pwsh is faster + better-supported |
| Zone Identifier detection | If the run fails AND the .ps1 has a :Zone.Identifier ADS, print 3-option remediation (git clone, Unblock-File, manual) |
| Cross-platform Python detection | Try python, python3, py in that order |
| No PowerShell scripts >250 lines (or >200 if the script contains conditional logic flow, function definitions, or non-trivial loop bodies) | Keep wrappers thin; complex logic stays in Python. Read-only verifiers / installers that are mostly Write-Host formatting are exempt up to 250. The cap targets complex flow in PS, not cosmetic length. |
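The invocation rules in the table can be checked mechanically. Below is a minimal sketch of such a check, for example for use in a unit test; the helper names `expected_ps_argv` and `r1_violations` are hypothetical and do not exist in this repo. It does not catch abbreviated PowerShell parameters, so treat it as a first-line lint rather than a guarantee.

```python
FORBIDDEN_FLAGS = {"-executionpolicy", "-command", "-c"}
REQUIRED_FLAGS = ("-NoProfile", "-NonInteractive", "-File")


def expected_ps_argv(shell: str, script: str, script_args: list[str]) -> list[str]:
    """Return the one invocation shape R1 allows for a .ps1 script."""
    return [shell, "-NoProfile", "-NonInteractive", "-File", script, *script_args]


def r1_violations(argv: list[str]) -> list[str]:
    """Return human-readable R1 violations for a PowerShell command line."""
    lowered = [arg.lower() for arg in argv]
    # Everything after -File belongs to the script, not to PowerShell itself,
    # so only the arguments up to and including -File are inspected.
    end = lowered.index("-file") if "-file" in lowered else len(argv)
    head = argv[1 : end + 1]
    head_lowered = [arg.lower() for arg in head]
    problems = [f"forbidden flag: {arg}" for arg in head if arg.lower() in FORBIDDEN_FLAGS]
    problems += [
        f"missing flag: {flag}" for flag in REQUIRED_FLAGS if flag.lower() not in head_lowered
    ]
    return problems
```

Calling `r1_violations(expected_ps_argv("pwsh", "install-foundry.ps1", ["--yes"]))` returns an empty list, while a command line using -ExecutionPolicy Bypass or -Command is reported.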

R2 — Foundry-server NEVER touches ~/.claude/

R2 governs foundry-server's posture toward Claude state — not a universal claim about ~/.claude/skills/ being off-limits to the broader ecosystem. (As of MS Copilot Insiders 2026-05-13, VS Copilot natively auto-discovers ~/.claude/skills/ per the agentskills.io standardization; that's outside foundry's scope and not something R2 tries to prevent.)

| Aspect | Constraint |
|---|---|
| Foundry's own install location | ~/.vs-code-foundry/ for state files (config, task DB, action log, server) + ~/.copilot/skills/vs-code-copilot-foundry/ for the skill family (v0.2.0+, MS-standardized global path). Never ~/.claude/. |
| Skill enumeration | foundry-server's SkillRegistry always reads ~/.copilot/skills/. It can OPTIONALLY also read ~/.claude/skills/ IF (a) read_claude_skills: true in config AND (b) the caller passes include_claude: true. Both are required (dual-opt-in, F4 fix); neither alone leaks. Defense in depth. |
| Optional Claude bridge | foundry_claude MCP tool is OFF by default. When enable_claude_bridge: false, the tool is hidden from tools/list. Only when enabled does foundry-server invoke claude as a subprocess (it reads its own config; foundry-server doesn't access ~/.claude/ directly). |
| Persona files | Have full bodies in this repo. Do NOT auto-generate from ~/.claude/agents/*.md at runtime or install time. Drift between this repo and Claude Code's canonical bob/alf/pa is acceptable: this is a fork. |
| Backward-compat symlink (v0.2.x only) | The installer leaves a symlink at ~/.vs-code-foundry/skills/vs-code-copilot-foundry pointing to the new canonical location so any v0.1.x consumer with hardcoded paths keeps working. Symlink is documented as removable in v0.3.x. |
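The dual-opt-in for skill enumeration amounts to a logical AND of a config flag and a per-call flag. A minimal sketch follows, assuming a plain dict config; the function name `skill_roots` and the `home` parameter are hypothetical and are not the real SkillRegistry API.

```python
from pathlib import Path


def skill_roots(config: dict[str, object], include_claude: bool, home: Path) -> list[Path]:
    """Return the directories a skill scan may read under the dual-opt-in rule."""
    roots = [home / ".copilot" / "skills"]  # always read
    # Both switches must be explicitly True; either one alone changes nothing.
    if config.get("read_claude_skills") is True and include_claude is True:
        roots.append(home / ".claude" / "skills")
    return roots
```

Comparing with `is True` rather than truthiness is a deliberate choice in this sketch: a stray string such as "false" in the config cannot open the second root by accident.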

What R2 explicitly does NOT prevent: VS Copilot, VS Code Copilot, or any other agentskills.io-conformant client auto-discovering ~/.claude/skills/ natively per the Microsoft standard. That's the user's broader Copilot ecosystem, not foundry-server's behavior. R2 is about what foundry-server reads, not what other tools the user runs alongside it read.

v0.5.1 installer carve-out (tasks.md #20 collision warning): the installer (install-foundry.py, not foundry-server) may run a metadata-only os.path.exists() check on ~/.claude/agents/{bob,alf,pa}.md to print the name-collision warning. It performs no content read, no write, no delete, and no sentinel injection. This is the strictest possible touch (a stat, never an open()) and is covered by the test_claude_collision_metadata_only_no_read_no_write test, which asserts that the file is never opened for write, its mtime is unchanged, and no sentinel is injected. The carve-out does NOT relax R2's "never touch ~/.claude/ content/state" rule for foundry-server. The installer's reconcile engine likewise scopes all deletes to foundry-managed roots (~/.copilot/agents/, ${FOUNDRY_HOME}/agents/, the current-run workspace) and never to ~/.claude/.
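A metadata-only collision check of the kind described above could look like the sketch below. The function name `colliding_personas` is hypothetical; the installer's real implementation may differ.

```python
import os
from pathlib import Path

PERSONA_NAMES = ("bob", "alf", "pa")


def colliding_personas(claude_agents_dir: Path) -> list[str]:
    """Return persona names whose <name>.md exists in the given directory.

    Uses os.path.exists() only, which stats the path and never opens the file.
    """
    return [
        name for name in PERSONA_NAMES if os.path.exists(claude_agents_dir / f"{name}.md")
    ]
```

The caller would pass the ~/.claude/agents/ directory and print a warning for each returned name, without reading or modifying anything.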

R3 — MCP spec conformance

Applies to both foundry-server AND personal-os-server.

  • protocolVersion: "2025-11-25"
  • Declare BOTH tools.tasks: true AND top-level tasks: {} in initialize response (M3 fix from spec review — let client accept whichever shape it parses)
  • Long-running tools return Task handles, never block
  • Streaming progress via the MCP Tasks primitive
  • readOnlyHint: true annotation on read-only tools
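As an illustration of the first two bullets and the last one, here is a sketch of an initialize result and a read-only tool descriptor. Placing both task declarations under a `capabilities` object is an assumption about the response layout, and the helper names and the tool name in the usage note are hypothetical.

```python
from typing import Any

PROTOCOL_VERSION = "2025-11-25"


def initialize_result(server_name: str, version: str) -> dict[str, Any]:
    """Build an initialize result that declares tasks support in both shapes."""
    return {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {
            "tools": {"tasks": True},  # nested shape
            "tasks": {},  # top-level shape; a client accepts whichever it parses
        },
        "serverInfo": {"name": server_name, "version": version},
    }


def read_only_tool(name: str, description: str) -> dict[str, Any]:
    """Build a tool descriptor annotated as read-only."""
    return {
        "name": name,
        "description": description,
        "inputSchema": {"type": "object", "properties": {}},
        "annotations": {"readOnlyHint": True},
    }
```

For example, `read_only_tool("foundry_example", "Illustrative read-only tool")` yields a descriptor whose annotations mark it as safe to call without side effects.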

R4 — Tool count budget

VS Code Copilot caps tools per chat request at 128. Budgets per server:

  • foundry-server ≤ 25 tools. It currently exposes 16 visible tools, or 17 distinct schemas counting the hidden foundry_claude bridge (verified against TOOL_SCHEMAS at v0.5.0). foundry_capabilities, added in v0.4.0, lifted the visible count from 15 → 16; v0.5 adds NO new tool, only a perspective_policy sibling key on the existing foundry_capabilities, so R4 is unchanged.
  • personal-os-server ≤ 15 tools
  • Combined ≤ 40 — leaves comfortable headroom under the 128 cap for other MCP servers the user might install

If you must add tools that would breach a server's cap, consider splitting into a secondary MCP server rather than overflowing.
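The budgets above are simple enough to assert in a test. A minimal sketch, with the hypothetical helper name `budget_violations` and tool counts supplied by the caller:

```python
SERVER_BUDGETS = {"foundry-server": 25, "personal-os-server": 15}
COMBINED_BUDGET = 40


def budget_violations(tool_counts: dict[str, int]) -> list[str]:
    """Return one message per breached R4 budget (per server, then combined)."""
    problems: list[str] = []
    for server, count in tool_counts.items():
        budget = SERVER_BUDGETS.get(server)
        if budget is not None and count > budget:
            problems.append(f"{server}: {count} tools exceeds budget of {budget}")
    total = sum(tool_counts.values())
    if total > COMBINED_BUDGET:
        problems.append(f"combined: {total} tools exceeds budget of {COMBINED_BUDGET}")
    return problems
```

In practice the counts would come from the length of each server's TOOL_SCHEMAS; counting hidden tools as well keeps the check conservative.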

R5 — Cross-platform Python in mcp.json

The workspace .vscode/mcp.json deployed by the installer MUST use sys.executable (the absolute Python path), NOT the string "python3". Windows users don't have python3 in PATH by default — they have python or py. The _customize_mcp_template() helper in install-foundry.py handles this for both server entries (foundry-server and personal-os-server) in the merged template; don't break it.
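A minimal sketch of the substitution R5 requires is shown below. It assumes the template keeps its entries under a top-level "servers" key, each with a "command" field; the function name `customize_mcp_entries` is hypothetical and is not the real `_customize_mcp_template()` signature.

```python
import copy
import sys
from typing import Any

GENERIC_PYTHON_NAMES = {"python", "python3", "py"}


def customize_mcp_entries(
    template: dict[str, Any], python_path: str = sys.executable
) -> dict[str, Any]:
    """Return a copy of the template with generic Python commands made absolute."""
    result = copy.deepcopy(template)  # never mutate the shipped template
    for entry in result.get("servers", {}).values():
        if entry.get("command") in GENERIC_PYTHON_NAMES:
            entry["command"] = python_path
    return result
```

Because the loop visits every entry, both server registrations receive the same interpreter path.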

R6 — Commit conventions

  • NO Co-Authored-By: Claude (or any AI attribution) lines in commit messages
  • NO --no-verify to bypass git hooks unless the user explicitly asks
  • NO direct push to main in a multi-author setup (currently single-author; safe but use PRs when you scale)
  • Subject lines under ~80 chars; body wraps at ~72
  • For changes to either server, run the relevant test suite(s) before commit (see R7).

R7 — Tests must pass

179 tests total — 136 in foundry-server/tests/ + 43 in personal-os/personal-os-server/tests/ (re-counted at the v0.5.0 ship; the combined suite was 136 at v0.4.0). Run both before any commit that touches either server or installer. Acceptance is "full suite green", never a hardcoded number — these counts are a convenience, not a contract.

  • Run foundry-server suite before commits touching foundry-server/ or installer/install-foundry.py
  • Run personal-os-server suite before commits touching personal-os/personal-os-server/ or personal-os/installer/
  • New functionality requires a new test class or test method in the appropriate suite
  • Stdlib unittest only — do NOT introduce pytest or other test framework deps (the stdlib-only constraint is load-bearing for fast installer smoke tests)
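For reference, a new test class in the stdlib-only style might look like the sketch below. The function under test, `tool_names`, and the schema entries are hypothetical stand-ins, not code from either server.

```python
import unittest


def tool_names(schemas: list[dict[str, str]]) -> list[str]:
    """Return the sorted tool names from a list of schema dicts."""
    return sorted(schema["name"] for schema in schemas)


class TestToolNames(unittest.TestCase):
    """Illustrative test class using only the stdlib unittest module."""

    def test_names_are_sorted(self) -> None:
        schemas = [{"name": "foundry_b"}, {"name": "foundry_a"}]
        self.assertEqual(tool_names(schemas), ["foundry_a", "foundry_b"])

    def test_missing_name_raises(self) -> None:
        with self.assertRaises(KeyError):
            tool_names([{"title": "no name key"}])


def run_suite() -> unittest.TestResult:
    """Run the class programmatically, as a discovery run would."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestToolNames)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Placed in a file matching test_*.py, the class is picked up by the unittest discover commands listed under Build / test / lint with no extra dependencies.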

R8 — Python conventions

  • Python 3.10+ (uses | union types, dict[str, Any] style)
  • STDLIB ONLY — no requirements.txt, no pyproject.toml dependencies, no pip install step in the installer
  • Type hints on public function signatures (interior helpers optional)
  • Docstrings on classes + public functions, one-line on private helpers
  • 4-space indent, ~100 char line limit (rough; not enforced by formatter)
  • Strings: prefer double quotes; use single quotes for nested
  • Errors: explicit exception types, not bare except:

R9 — Persona body conventions

Each agents/<name>.agent.md is a complete persona definition. Conventions:

  • Frontmatter model: uses display-name strings: 'Claude Opus 4.8 (anthropic)', NOT IDs like 'claude-opus-4-8'. Display names are portable across VS Code, CLI, JetBrains, Visual Studio.
  • tools: lists are explicit (no wildcards unless intentional). foundry/* is fine for foundry-server tools; personal-os/* for personal-os-server tools when both are registered.
  • agents: declares the subagent allowlist. Foundry → forge → bob is allowed; deeper nesting requires updating the allowlist.
  • handoffs: provides clickable buttons in chat — label, agent, prompt, send: false (user reviews before send).
  • Body uses @name references when describing other personas (e.g., @forge, @bob, @kit).
  • Body uses runSubagent and foundry_<tool> / personal_os_<tool> syntax for orchestration calls (matches VS Code Copilot's expectations).
  • HARD rules in personas should match this AGENTS.md (e.g., no Co-Authored-By, no --no-verify).
  • @kit lives in agents/kit.agent.md alongside the other five; persona-zoo enumerator reads this single dir (R13).

R10 — Documentation

  • Top-level files (PROJECT.md, AGENTS.md, README.md, INSTALL.md, history.md, tasks.md) must NOT exceed ~400 lines each. Detail goes to docs/.
  • docs/ files can be long.
  • references/ files (under skills/) can be long.
  • Markdown style: ATX headers (#), fenced code blocks, tables for tabular data, no horizontal rules except section breaks.
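The ~400-line cap on top-level files is easy to check before a commit. A minimal sketch, assuming the file list from the first bullet; the helper name `oversized_docs` is hypothetical.

```python
from pathlib import Path

TOP_LEVEL_DOCS = (
    "PROJECT.md",
    "AGENTS.md",
    "README.md",
    "INSTALL.md",
    "history.md",
    "tasks.md",
)


def oversized_docs(repo_root: Path, limit: int = 400) -> dict[str, int]:
    """Map each top-level doc that exceeds ``limit`` lines to its line count."""
    result: dict[str, int] = {}
    for name in TOP_LEVEL_DOCS:
        path = repo_root / name
        if path.is_file():
            count = len(path.read_text(encoding="utf-8").splitlines())
            if count > limit:
                result[name] = count
    return result
```

An empty result means every top-level file that exists is within the cap; missing files are skipped rather than reported.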

R11 — Cross-repo review (external siblings)

External siblings: internal-rnd (active, private — mirrors to internal-rnd-mirror), vs-code-personal-os (archived 2026-05-18 — merged here as the personal-os/ subtree). Cross-pollination is tracked in cross-repo-review.md at the root.

internal-rnd and foundry share concepts (forge / bob / alf / pa cascade, contract-driven gates, MCP server patterns, cross-CLI orchestration). Improvements here often apply there.

Whenever you add a new persona, MCP tool, foundry-server / personal-os-server module, cross-CLI pattern, or significant installer feature, append a review entry to cross-repo-review.md. internal-rnd's owner reviews it and decides whether to adopt.

| Trigger an entry for | Skip |
|---|---|
| New persona (agents/<name>.agent.md) | Doc-only edits |
| New MCP tool (added to either server's TOOL_SCHEMAS) | Tiny config fixes |
| New foundry-server / personal-os-server module | Cosmetic refactors |
| New installer feature (esp. cmd/ps1 hardening) | Renames / dep bumps |
| New skill / reference doc with a pattern worth reusing | |
| Significant change to the forge → bob → alf cascade protocol or KIT routing | |
| New .gitattributes rule / new cross-platform compatibility pattern | |

When you commit, mention Cross-repo review: see cross-repo-review.md in the message.

R12 — Anti-over-engineering (personal-os subsystem only)

Inside the personal-os/ subsystem, the user has explicitly asked for the simple version. The following are out of scope for personal-os-server and its installer unless the user explicitly asks:

| Out of scope (do NOT add inside personal-os/) | Why |
|---|---|
| Background daemons / services | Over-engineered; needs elevation or AppLocker exceptions on hardened Win 11 |
| Windows Task Scheduler integration (schtasks) | Locked down in enterprise GPO |
| Cron / launchd / systemd timers | Same: out of the user's policy reach on the target machine |
| File watchers (watchdog, inotify, etc.) | Stdlib has no portable watcher; not worth a pip dep |
| Always-on processes outside VS Code | Runtime model: VS Code is open during work hours; scheduling happens inside the session |
| WebSocket / SSE servers | Over-complicated; stdio MCP is fine |
| Heavy state synchronization (CRDTs, vector clocks) | Single user, single machine; SQLite + BEGIN IMMEDIATE is sufficient |

The rest of foundry is not bound by R12. If a future personal-os requirement seems to need one of these, push back and ask the user before building.
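To make the "SQLite + BEGIN IMMEDIATE is sufficient" point concrete, here is a minimal sketch of a write guarded by an immediate transaction. The `tasks` table, its columns, and both function names are hypothetical; the real kit.db schema is not shown in this file.

```python
import sqlite3


def open_db(path: str) -> sqlite3.Connection:
    """Open a database in autocommit mode so transactions are explicit."""
    conn = sqlite3.connect(path, isolation_level=None, timeout=5.0)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks "
        "(id INTEGER PRIMARY KEY, title TEXT NOT NULL, done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def complete_task(conn: sqlite3.Connection, task_id: int) -> bool:
    """Mark one open task as done; return False if it was missing or already done."""
    # BEGIN IMMEDIATE takes the write lock up front, so a second writer waits
    # (up to the connection timeout) instead of failing halfway through.
    conn.execute("BEGIN IMMEDIATE")
    try:
        cursor = conn.execute(
            "UPDATE tasks SET done = 1 WHERE id = ? AND done = 0", (task_id,)
        )
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise
    return cursor.rowcount == 1
```

With a single user on a single machine, this lock-then-write pattern covers the occasional overlap between two VS Code windows without any extra synchronization layer.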

R13 — Two-server topology + state isolation

v0.3.0 ships two MCP servers co-registered in the workspace .vscode/mcp.json. Strict rules govern their boundaries:

(a) Two MCP servers, namespaced tools. foundry-server exposes foundry_* tools; personal-os-server exposes personal_os_* tools. Names never collide. Each ships its own initialize handshake and its own tools/list.

(b) Personal-os state lives under foundry's home. Default state root: ~/.vs-code-foundry/personal-os/ (containing kit.db, events.jsonl, state/, outbox/, inbox/). One home dir, two subtrees — simpler backups, simpler uninstall.

(c) Personal-os-server MAY read foundry-server state — read-only. Specifically ~/.vs-code-foundry/tasks.db (URI mode=ro) and ~/.vs-code-foundry/status/*.json. It MUST NOT write to foundry-server's state files. Use sqlite3.connect('file:...?mode=ro', uri=True) to enforce at the driver layer.

(d) Personal-os-server IS allowed to write to its own subtree at ~/.vs-code-foundry/personal-os/. That subtree is its sandbox.

(e) Foundry-server does NOT read personal-os state in v0.3.0. Information flow is one-way (personal-os reads foundry). Cross-direction reads, if ever needed, are a future-version decision.

(f) PERSONAL_OS_HOME env override is respected. Default = ${FOUNDRY_HOME:-~/.vs-code-foundry}/personal-os/. Setting PERSONAL_OS_HOME=/some/other/path overrides only the personal-os subtree, not foundry's home.
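Rules (c) and (f) can be sketched together: resolve the two home directories from the environment, then open foundry's task database read-only through a SQLite URI. The function names are hypothetical; the path defaults and the `mode=ro` URI form come from the rules above.

```python
import sqlite3
from collections.abc import Mapping
from pathlib import Path


def foundry_home(env: Mapping[str, str]) -> Path:
    """Resolve foundry's home, treating an unset or empty FOUNDRY_HOME as default."""
    return Path(env.get("FOUNDRY_HOME") or "~/.vs-code-foundry").expanduser()


def personal_os_home(env: Mapping[str, str]) -> Path:
    """Resolve the personal-os subtree; PERSONAL_OS_HOME overrides only this path."""
    override = env.get("PERSONAL_OS_HOME")
    if override:
        return Path(override).expanduser()
    return foundry_home(env) / "personal-os"


def open_foundry_tasks_ro(env: Mapping[str, str]) -> sqlite3.Connection:
    """Open foundry's tasks.db read-only, enforced by the SQLite driver."""
    db_path = foundry_home(env) / "tasks.db"
    # as_uri() produces a properly escaped file: URI; mode=ro rejects all writes.
    return sqlite3.connect(f"{db_path.resolve().as_uri()}?mode=ro", uri=True)
```

Passing `os.environ` as `env` gives the production behavior, while tests can pass a plain dict. Any write attempted through the returned connection raises sqlite3.OperationalError, so the one-way information flow does not depend on caller discipline.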

R14 — Capability adaptation scope + manifest honesty (v0.4)

Capability adaptation rewrites only foundry-shipped copies (never foreign files, never ~/.claude/); capabilities.json is a presence hint, not a routing authority, and never contains secrets/auth state.

R15 — Model selection is dispatch-time + roster-driven (v0.5)

Model selection is dispatch-time + roster-driven, never role→vendor-locked.

  • The cost-tier ceiling master stays Claude Opus 4.8 (anthropic); cheaper perspectives are spawned in-session via runSubagent with an explicit model= drawn from the perspective_policy plan.
  • To change models, edit ONE config block (model_roster in config.json), not persona files.
  • Worker model: arrays are cross-vendor FALLBACK chains (≥2 distinct vendor families). They are the safety net when no explicit model is supplied, NEVER a role→vendor lock (removing the spread would route an over-tier request straight to the master = silent self-review).
  • Diversity claims are tier-qualified: Tier-2 (live external CLI subprocess) is verified; Tier-0 (in-session subagent) is requested-unverified → capped confidence + no cross-model corroboration claim.
  • served_by= self-reports are hints, never proof (R14: roster holds model names + policy only, never secrets/auth).
  • The frozen recommended_routing contract is untouched; perspective_policy is a strictly-additive sibling key (no new MCP tool, so R4 is unchanged).

Repo layout

.
├── README.md                    ← user-facing intro (mentions v0.3.0 KIT)
├── INSTALL.md                   ← install walk-through (POSIX + Windows enterprise; --with-personal-os)
├── PROJECT.md                   ← architecture map (auto-read by Claude Code at session start)
├── AGENTS.md                    ← this file
├── CLAUDE.md → AGENTS.md        ← symlink (Claude Code reads CLAUDE.md natively)
├── CHANGELOG.md
├── history.md                   ← repo history (auto-read at session start)
├── tasks.md                     ← backlog (foundry + personal-os roadmaps)
├── session_control.md           ← active session tracker
├── index.md                     ← index of design docs + references
├── cross-repo-review.md         ← cross-pollination queue (internal-rnd + archived personal-os)
├── remote_claude.md             ← T001-T004 remote-Claude threads
├── LICENSE                      ← MIT
├── .gitattributes               ← line-ending normalization (LF for code, CRLF for .cmd/.ps1)
├── .gitignore
├── installer/
│   ├── install-foundry.{cmd,ps1,py,sh}   ← Windows enterprise entry chain + cross-platform impl
│   ├── verify-foundry-setup.{cmd,ps1,sh} ← 13 G-checks (v0.4.0), verifies BOTH servers
│   └── install-personal-os.py            ← thin shim delegating to install-foundry.py
├── foundry-server/
│   ├── foundry_server.py        ← MCP server (Python stdlib)
│   ├── foundry_env.py           ← shared stdlib capability scanner (v0.4.0)
│   ├── refresh_capabilities.py  ← standalone manual capability refresh (v0.4.0)
│   └── tests/test_foundry_server.py  ← unit tests (incl. TestPersonaTripwires v0.4.0)
├── personal-os/                  ← subtree-merged from vs-code-personal-os (full history preserved)
│   ├── README.md, HANDOVER.md, PROJECT.md, AGENTS.md, history.md, tasks.md, ...
│   ├── personal-os-server/      ← Python MCP server + 43 unit tests + smoke runner
│   │   ├── personal_os_server.py
│   │   ├── db.py, events.py, state.py, brief.py, smoke_runner.py
│   │   ├── tools/{tasks,status,bootstrap,outbox,inbox,prefs,registry}.py
│   │   └── tests/test_personal_os_server.py
│   ├── installer/               ← legacy standalone installer (preserved)
│   ├── templates/               ← legacy workspace assets (preserved)
│   ├── docs/                    ← kit-functional-spec.md, kit-deliberation-2026-05-12.md, ...
│   ├── progress/                ← v0.1 KIT contract map rev 1 (HMAC-signed, immutable)
│   ├── .forge/                  ← v0.1 session material (preserved alongside rev 1)
│   └── .wiki/                   ← personal-os's local markdown wiki stub
├── agents/                       ← 8 custom Copilot personas (v0.4.0: 6 coordinators + 2 workers)
│   ├── foundry.agent.md
│   ├── forge.agent.md
│   ├── bob.agent.md
│   ├── alf.agent.md
│   ├── pa.agent.md
│   ├── kit.agent.md             ← @kit master persona (universal altitude)
│   ├── challenger.agent.md      ← Tier-0 challenge worker (subagent-only, v0.4.0)
│   └── analyst.agent.md         ← Tier-0 analysis worker (subagent-only, v0.4.0)
├── progress/                     ← foundry-wide contract map rev 2 (HMAC-signed, covers both servers)
│   ├── contract-map.yaml
│   └── contract-map.yaml.sig
├── .forge/                       ← session material for rev 2 (session-id + session.key 0600)
├── skills/vs-code-copilot-foundry/  ← skill family (parent + references + scripts)
├── templates/                       ← workspace assets deployed at install time
│   ├── AGENTS.md                    ← TEMPLATE for user workspaces (not THIS AGENTS.md)
│   ├── .vscode/mcp.json             ← registers BOTH foundry-server AND personal-os-server
│   ├── .github/copilot-instructions.md
│   └── .github/agents/README.md
├── docs/
│   ├── design.md                ← 2026-05-11 fork architecture decision doc
│   ├── spec-review.md           ← subagent review of the design
│   ├── architecture.md          ← current-state architecture (incl. Personal-os Subsystem section)
│   ├── conventions.md           ← coding conventions deep-dive
│   ├── CONTRIBUTING.md          ← dev loop (running 179 tests across both servers)
│   ├── models.md                ← v0.5 thin pointer → roster-driven model selection
│   └── plans/2026-05-18-personal-os-merge-design.md
└── research/2026-05-11/         ← 5 research briefs + foundry-server sketch

Branches + repos

| Repo | Visibility | Status | Purpose / Relationship |
|---|---|---|---|
| joogy06/vs-code-foundry | PRIVATE | Active (canonical) | This repo (the distributable). |
| joogy06/vs-code-personal-os | PRIVATE | Archived 2026-05-18 | Merged here as personal-os/ subtree at v0.3.0; full history preserved. No further commits; work on the subtree happens in this repo. |
| internal-rnd-mirror | PRIVATE | Active | internal-rnd project mirror (Claude Code R&D). SEPARATE ecosystem. Cross-pollination tracked via cross-repo-review.md. |
| joogy06/agent-foundry | PUBLIC | Active | Published Claude Code skills/agents/commands. SEPARATE: never publishes vs-code-foundry content. Belt-and-braces guards in ~/.claude/publish-config.json. |

Branch policy: main is the only long-lived branch. Use short-lived feature branches for non-trivial work; PR back to main.

When you start a Claude Code session in this repo

Per global ~/.claude/CLAUDE.md, the session-start protocol reads:

  1. PROJECT.md — architecture (read this first)
  2. history.md — what's happened so far (head + tail if >400 lines)
  3. tasks.md — current backlog
  4. docs/plans/*.md — active design docs (e.g., the 2026-05-18 personal-os merge design)
  5. docs/components/*/COMPONENT.md — N/A here (we don't decompose to that granularity)
  6. session_control.md — active session tracker
  7. index.md — design doc index

After reading those, follow the rules in this AGENTS.md.

Cross-references

Project HARD-RULEs
