From 7dd5c8bc4b01e115fab256068e4209cca84205f8 Mon Sep 17 00:00:00 2001
From: Lachlan Donald <lachlan@buildkite.com>
Date: Tue, 17 Mar 2026 08:05:36 +1100
Subject: [PATCH] docs: split inference credential research and planning

---
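Note for reviewers: a minimal sketch of how the gateway in Proposal A might
map a guest request path to a policy rule and its binding name. The names
`AllowRule`, `service_for_path`, and `match_binding` are hypothetical, and
treating a rule without `models` as unrestricted is an assumption that the
plan does not specify.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class AllowRule:
    """One entry of the repository policy's `inference.allow` list."""
    service: str
    purpose: str
    binding: str
    # Assumption: an empty tuple means "no model restriction".
    models: Tuple[str, ...] = ()


# Mirrors the example policy in the plan document.
POLICY = (
    AllowRule("openai", "codex", "codex_account", ("gpt-5-codex",)),
    AllowRule("anthropic", "claude-code", "claude_code", ("claude-sonnet-4-5",)),
    AllowRule("amp", "amp-cli", "amp_account"),
)


def service_for_path(path: str) -> Optional[str]:
    """Extract <service> from /llm/<service>/... or /agent/<service>/..."""
    parts = path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] in ("llm", "agent") and parts[1]:
        return parts[1]
    return None


def match_binding(
    policy: Sequence[AllowRule], path: str, model: Optional[str] = None
) -> Optional[str]:
    """Return the binding name if policy allows the request, else None."""
    service = service_for_path(path)
    if service is None:
        return None
    for rule in policy:
        if rule.service != service:
            continue
        if rule.models and model not in rule.models:
            continue
        return rule.binding
    return None
```

The returned value is only a binding name. Resolving it to a real credential
would happen in host runtime config, so the guest never takes part in that
step.
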
 docs/plans/inference-credential-mediation.md | 136 +++++++++++++++++++
 docs/research.md                             |  61 +++++++--
 2 files changed, 188 insertions(+), 9 deletions(-)
 create mode 100644 docs/plans/inference-credential-mediation.md

diff --git a/docs/plans/inference-credential-mediation.md b/docs/plans/inference-credential-mediation.md
new file mode 100644
index 00000000..2118e062
--- /dev/null
+++ b/docs/plans/inference-credential-mediation.md
@@ -0,0 +1,136 @@
+# Inference credential mediation plan
+
+This document captures a Cleanroom-aligned design direction for allowing
+inference and agent-tool calls from sandboxes without injecting long-lived
+credentials into guest environment variables or guest-visible files.
+
+## Goals
+
+- Keep inference credentials on the host side of the sandbox boundary.
+- Reuse Cleanroom's existing mediation model rather than creating a second
+  credential path outside the gateway.
+- Keep repository policy backend-agnostic and runtime credential resolution
+  host-local.
+- Support both generic inference APIs and agent-specific CLIs where practical.
+
+## Non-goals
+
+- Making every upstream provider look identical when their auth and protocol
+  surfaces differ materially.
+- Supporting arbitrary guest-configured HTTP proxying.
+- Treating mounted auth files or long-lived env vars as acceptable steady-state
+  solutions.
+
+## Proposal A: inference gateway as the target architecture
+
+Extend the existing host gateway with explicit inference and agent routes, for
+example:
+
+- `/llm/openai/v1/...`
+- `/llm/anthropic/v1/...`
+- `/agent/amp/...`
+
+Properties:
+
+- sandbox identity continues to come from transport identity:
+  - Firecracker: source IP on TAP network
+  - `darwin-vz`: scoped capability token fallback
+- credentials stay entirely host-side and are resolved by provider adapters
+- the guest receives only non-secret routing/config such as endpoint URLs,
+  helper command paths, or generated config files
+- audit events stay centralized in the gateway
+
+Repository policy should declare capability, not credential location:
+
+```yaml
+inference:
+  allow:
+    - service: openai
+      purpose: codex
+      models: [gpt-5-codex]
+      binding: codex_account
+    - service: anthropic
+      purpose: claude-code
+      models: [claude-sonnet-4-5]
+      binding: claude_code
+    - service: amp
+      purpose: amp-cli
+      binding: amp_account
+```
+
+Runtime config should define how bindings are resolved on the host:
+
+```toml
+[credential_bindings.codex_account]
+kind = "codex-auth-state"
+path = "~/.codex/auth.json"
+
+[credential_bindings.claude_code]
+kind = "command"
+command = ["cleanroom-credential-helper", "claude"]
+
+[credential_bindings.amp_account]
+kind = "amp-secret-store"
+path = "~/.local/share/amp/secrets.json"
+```
+
+Rationale:
+
+- matches Cleanroom's gateway-first architecture
+- keeps provider-specific auth details out of repo policy
+- preserves a single audit and enforcement point
+- works for both direct inference APIs and mediated agent services
+
+## Proposal B: helper bridge as a narrow early slice
+
+For tools that already support credential-helper hooks, add a small broker path
+under `/secrets/` and configure the tool to call it through a guest-side stub.
+
+Best fit:
+
+- Claude Code, because `apiKeyHelper` is already part of the documented surface
+
+Tradeoffs:
+
+- smaller and faster to ship than a full request mediation gateway
+- still returns a real provider credential to the guest process, even if the
+  credential is short-lived
+- does not map cleanly to the auth surfaces currently observed for Codex or Amp
+
+This is useful as a bootstrap path, but it should not replace host-side request
+mediation as the long-term architecture.
+
+## Proposal C: host-executed agent adapters
+
+For tools whose auth model is strongly tied to host login state, run the agent
+on the host and expose it to the guest as a mediated Cleanroom tool surface.
+
+Properties:
+
+- the guest invokes a Cleanroom-owned command or RPC surface
+- the actual provider CLI runs on the host with host credentials
+- stdout/stderr and tool/file access are streamed or mediated explicitly
+
+Best fit:
+
+- Codex and Amp, if their auth stores remain awkward to proxy cleanly from
+  inside the guest
+
+Tradeoffs:
+
+- strongest credential isolation
+- less transparent than running the provider CLI natively inside the sandbox
+- more opinionated execution model
+
+## Recommended order
+
+1. Treat Proposal A as the target architecture.
+2. Use Proposal B selectively for Claude Code as an early proving ground.
+3. Use Proposal C for provider CLIs that remain strongly host-login-shaped.
+
+## Design rule
+
+- The guest may know where to send an inference request.
+- The host decides whether to forward it.
+- The host owns the credential.
+- Repository policy decides whether the capability exists at all.
diff --git a/docs/research.md b/docs/research.md
index 7a751bb9..61b9c042 100644
--- a/docs/research.md
+++ b/docs/research.md
@@ -212,12 +212,55 @@ Key properties of this model:
 - Secret injection is scoped by destination host -- a secret bound to `api.github.com` cannot be injected into requests to other hosts.
 - All injection events are logged with secret ID and destination, never with secret values.
 
-## Research conclusions
-
-1. Use Firecracker as the local sandbox backend (inspired by Matchlock patterns). See [backend/firecracker.md](backend/firecracker.md) for implementation details.
-2. Use an in-tree host gateway as the package/registry and git mediation layer,
-   borrowing ideas from tools like `content-cache` and `git-proxy-cache` rather
-   than depending on them directly.
-3. Use a tokenizer-like secret-injection model with host-scoped policy and no
-   plaintext propagation.
-4. Keep CLI first: `cleanroom exec` as primary entrypoint and command pattern.
+## Agent and inference credentials
+
+Validation snapshot date: 2026-03-17.
+
+This section records what auth surfaces the Codex, Claude Code, and Amp CLIs
+expose, what host-side state is present locally, and what material would
+become visible if that state were projected into a guest sandbox.
+
+### Provider auth surface snapshot
+
+| Tool | Auth surfaces observed | Helper or endpoint hooks observed | Host-side persisted state observed | Notes |
+|---|---|---|---|---|
+| Codex CLI | ChatGPT account login; API key login via `codex login --with-api-key` | Official config/auth docs; local CLI exposes login surfaces; local binary strings reference `OPENAI_BASE_URL` | `~/.codex/auth.json` with `auth_mode`, `access_token`, `refresh_token`, `id_token` fields | On this host, `codex login status` returned `Logged in using ChatGPT` |
+| Claude Code | Anthropic docs describe API-key auth plus IAM-backed options | `apiKeyHelper`; custom base URL; LLM gateway docs | Not observed locally in this environment (`claude` CLI not installed) | Docs present a first-class helper-based credential retrieval surface |
+| Amp CLI | `AMP_API_KEY`; `AMP_URL` | CLI help exposes URL override; no helper-style credential hook was found in local help output | `~/.local/share/amp/secrets.json`; `~/.local/share/amp/session.json` | Amp docs describe stored OAuth/local secret state; owner manual says self-hosted/BYOK is not currently supported |
+
+### Security-relevant observations
+
+- The local Codex auth store contains token-bearing fields, including
+  `access_token` and `refresh_token`.
+- The local Amp installation stores credential-related state under the user's
+  home directory rather than relying only on ephemeral process env.
+- Claude Code's published settings include an explicit credential helper hook
+  (`apiKeyHelper`) rather than requiring a static key in config.
+- Codex and Amp both show evidence of host-persisted login state in this
+  environment.
+- Projecting host auth files such as `~/.codex/auth.json` or
+  `~/.local/share/amp/secrets.json` into a guest would expose their contents to
+  guest processes.
+
+### Relevant references
+
+- [Codex configuration/auth docs](https://developers.openai.com/codex/config-advanced/)
+- local CLI `codex login --help` and `codex login status`
+- [Claude Code settings](https://docs.anthropic.com/en/docs/claude-code/settings)
+- [Claude Code IAM and auth guide](https://docs.anthropic.com/en/docs/claude-code/iam)
+- [Claude Code LLM gateway guide](https://docs.anthropic.com/en/docs/claude-code/llm-gateway)
+- [Amp security reference](https://ampcode.com/security)
+- [Amp owner manual](https://ampcode.com/manual)
+- local CLI `amp --help`, `amp usage`, and local config layout
+
+## Research summary
+
+1. Hosted sandbox providers generally default to full internet access and do
+   not provide repository-scoped egress control or credential isolation.
+2. Among the self-hosted tools reviewed, Matchlock is the closest structural
+   prior art to Cleanroom, while still differing on default allow behavior,
+   policy mutability, image pinning, and TLS proxying.
+3. Firecracker maps directly to TAP plus host-firewall enforcement on Linux;
+   macOS requires different enforcement primitives and capability scoping.
+4. Existing prior art for host-side mediation and caching exists in
+   `content-cache`, `git-proxy-cache`, and `tokenizer`.