136 changes: 136 additions & 0 deletions docs/plans/inference-credential-mediation.md
@@ -0,0 +1,136 @@
# Inference credential mediation plan

This document captures a Cleanroom-aligned design direction for allowing
inference and agent-tool calls from sandboxes without injecting long-lived
credentials into guest environment variables or guest-visible files.

## Goals

- Keep inference credentials on the host side of the sandbox boundary.
- Reuse Cleanroom's existing mediation model rather than creating a second
credential path outside the gateway.
- Keep repository policy backend-agnostic and runtime credential resolution
host-local.
- Support both generic inference APIs and agent-specific CLIs where practical.

## Non-goals

- Making every upstream provider look identical when their auth and protocol
surfaces differ materially.
- Supporting arbitrary guest-configured HTTP proxying.
- Treating mounted auth files or long-lived env vars as acceptable steady-state
solutions.

## Proposal A: inference gateway as the target architecture

Extend the existing host gateway with explicit inference and agent routes, for
example:

- `/llm/openai/v1/...`
- `/llm/anthropic/v1/...`
- `/agent/amp/...`

Properties:

- sandbox identity continues to come from transport identity:
  - Firecracker: source IP on TAP network
  - `darwin-vz`: scoped capability token fallback
- credentials stay entirely host-side and are resolved by provider adapters
- the guest receives only non-secret routing/config such as endpoint URLs,
helper command paths, or generated config files
- audit events stay centralized in the gateway
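
A minimal sketch of how the gateway might split the example route prefixes into a route kind, a service name, and the remainder handed to a provider adapter. The `Route` type, `KNOWN_SERVICES` table, and `parse_route` function are hypothetical names used for illustration, not part of Cleanroom; path normalization and identity checks are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    kind: str           # "llm" or "agent"
    service: str        # e.g. "openai", "anthropic", "amp"
    upstream_path: str  # remainder handed to the provider adapter


# Hypothetical route table mirroring the example prefixes above.
KNOWN_SERVICES = {
    "llm": {"openai", "anthropic"},
    "agent": {"amp"},
}


def parse_route(path: str) -> Optional[Route]:
    """Split a gateway path into (kind, service, remainder), or None if unknown."""
    parts = path.lstrip("/").split("/", 2)
    if len(parts) < 2:
        return None
    kind, service = parts[0], parts[1]
    if service not in KNOWN_SERVICES.get(kind, set()):
        return None  # unknown prefixes are rejected rather than proxied
    remainder = "/" + parts[2] if len(parts) == 3 else "/"
    return Route(kind, service, remainder)
```

Rejecting anything outside the table keeps the gateway from acting as a general-purpose proxy, in line with the non-goal on arbitrary guest-configured HTTP proxying.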

Repository policy should declare capability, not credential location:

```yaml
inference:
  allow:
    - service: openai
      purpose: codex
      models: [gpt-5-codex]
      binding: codex_account
    - service: anthropic
      purpose: claude-code
      models: [claude-sonnet-4-5]
      binding: claude_code
    - service: amp
      purpose: amp-cli
      binding: amp_account
```
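
As an illustration of "capability, not credential location", the sketch below looks up a binding name from an already-parsed policy. The `POLICY` dict is the YAML above as it might look after loading (the loader is not shown), and `binding_for` is a hypothetical helper. Treating a rule without `models` as unrestricted by model is an assumption, not something the plan specifies.

```python
from typing import Optional

# The policy above, as it might look after YAML parsing (loader not shown).
POLICY = {
    "inference": {
        "allow": [
            {"service": "openai", "purpose": "codex",
             "models": ["gpt-5-codex"], "binding": "codex_account"},
            {"service": "anthropic", "purpose": "claude-code",
             "models": ["claude-sonnet-4-5"], "binding": "claude_code"},
            {"service": "amp", "purpose": "amp-cli", "binding": "amp_account"},
        ]
    }
}


def binding_for(policy: dict, service: str, model: Optional[str] = None) -> Optional[str]:
    """Return the binding name if the policy grants this capability, else None."""
    for rule in policy.get("inference", {}).get("allow", []):
        if rule.get("service") != service:
            continue
        models = rule.get("models")
        # Assumption: a rule without `models` does not restrict by model.
        if models is None or model in models:
            return rule["binding"]
    return None  # deny by default
```

The function returns only a binding name, so nothing in the policy path ever touches a credential value.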

Runtime config should define how bindings are resolved on the host:

```toml
[credential_bindings.codex_account]
kind = "codex-auth-state"
path = "~/.codex/auth.json"

[credential_bindings.claude_code]
kind = "command"
command = ["cleanroom-credential-helper", "claude"]

[credential_bindings.amp_account]
kind = "amp-secret-store"
path = "~/.local/share/amp/secrets.json"
```

Rationale:

- matches Cleanroom's gateway-first architecture
- keeps provider-specific auth details out of repo policy
- preserves a single audit and enforcement point
- works for both direct inference APIs and mediated agent services

## Proposal B: helper bridge as a narrow early slice

For tools that already support credential-helper hooks, add a small broker path
under `/secrets/` and configure the tool to call it through a guest-side stub.

Best fit:

- Claude Code, because `apiKeyHelper` is already part of the documented surface

Tradeoffs:

- smaller and faster to ship than a full request mediation gateway
- still returns a real provider credential to the guest process, even if
short-lived
- does not map cleanly to the auth surfaces currently observed for Codex or Amp

This is useful as a bootstrap path, but it should not replace host-side request
mediation as the long-term architecture.
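
The guest-side stub could look roughly like the sketch below. It assumes the helper contract is "print the credential on stdout", and the `BROKER_URL` value is purely hypothetical: only the `/secrets/` prefix comes from this plan. The fetch function is injectable so the logic can be exercised without a running gateway.

```python
import sys
import urllib.request
from typing import Callable

# Hypothetical broker endpoint; only the /secrets/ prefix comes from the plan.
BROKER_URL = "http://gateway.internal/secrets/anthropic"


def http_get(url: str) -> str:
    """Fetch the broker response body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def helper_output(fetch: Callable[[str], str] = http_get, url: str = BROKER_URL) -> str:
    """Return the credential the helper should print, or raise if it is empty."""
    key = fetch(url).strip()
    if not key:
        raise RuntimeError("broker returned an empty credential")
    return key


def main() -> int:
    try:
        print(helper_output())
    except Exception as exc:
        # Report failures on stderr so stdout only ever carries the key.
        print(f"credential helper failed: {exc}", file=sys.stderr)
        return 1
    return 0

# When installed as the helper script, the entry point would be: sys.exit(main())
```

The stub itself holds no secret, but whatever it prints is visible to the guest process, which is the tradeoff noted above.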

## Proposal C: host-executed agent adapters

For tools whose auth model is strongly tied to host login state, run the agent
on the host and expose it to the guest as a mediated Cleanroom tool surface.

Properties:

- the guest invokes a Cleanroom-owned command or RPC surface
- the actual provider CLI runs on the host with host credentials
- stdout/stderr and tool/file access are streamed or mediated explicitly

Best fit:

- Codex and Amp, if their auth stores remain awkward to proxy cleanly from
inside the guest

Tradeoffs:

- strongest credential isolation
- less transparent than running the provider CLI natively inside the sandbox
- more opinionated execution model
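
A minimal sketch of the host side of such an adapter: the guest names a tool, the host runs an allow-listed command, and output is streamed back line by line. The `HOST_TOOLS` table and `run_host_tool` function are hypothetical. The stand-in command here is a Python one-liner rather than a real provider CLI, and the RPC transport and file-access mediation are not shown.

```python
import subprocess
import sys
from typing import Iterator, Sequence

# Hypothetical allow-list mapping guest-visible tool names to host commands.
HOST_TOOLS = {
    "echo-agent": [sys.executable, "-c", "import sys; print(sys.argv[1])"],
}


def run_host_tool(name: str, args: Sequence[str]) -> Iterator[str]:
    """Run an allow-listed host command and yield its output line by line."""
    if name not in HOST_TOOLS:
        raise PermissionError(f"tool not exposed to guests: {name}")
    proc = subprocess.Popen(
        [*HOST_TOOLS[name], *args],  # argv list, so no shell interpretation
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,    # merge stderr so the guest sees one stream
        text=True,
    )
    assert proc.stdout is not None
    for line in proc.stdout:
        yield line.rstrip("\n")
    if proc.wait() != 0:
        raise RuntimeError(f"{name} exited with status {proc.returncode}")
```

Because the function is a generator, the allow-list check runs when the caller first iterates, not when `run_host_tool` is called.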

## Recommended order

1. Treat Proposal A as the target architecture.
2. Use Proposal B selectively for Claude as an early proving ground.
3. Use Proposal C for provider CLIs that remain strongly host-login-shaped.

## Design rule

- The guest may know where to send an inference request.
- The host decides whether to forward it.
- The host owns the credential.
- Repository policy decides whether the capability exists at all.
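
The four rules can be sketched as a single mediation step. The `mediate` function and the `GUEST_AUTH_HEADERS` set are hypothetical, and the `Bearer` scheme is an assumption for illustration: real provider adapters would use whatever header their upstream expects.

```python
from typing import Callable, Mapping, Optional

# Headers a guest must never be able to set on the upstream request.
GUEST_AUTH_HEADERS = {"authorization", "x-api-key", "proxy-authorization"}


def mediate(
    guest_headers: Mapping[str, str],
    binding: Optional[str],
    resolve: Callable[[str], str],
) -> Optional[dict]:
    """Return upstream headers, or None when policy grants no capability."""
    if binding is None:
        return None  # repository policy: the capability does not exist
    # Drop any credential-like header supplied by the guest.
    headers = {k: v for k, v in guest_headers.items()
               if k.lower() not in GUEST_AUTH_HEADERS}
    # The host owns the credential; the guest never sees this value.
    # Assumption: the upstream accepts a Bearer token.
    headers["Authorization"] = f"Bearer {resolve(binding)}"
    return headers
```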
61 changes: 52 additions & 9 deletions docs/research.md
@@ -212,12 +212,55 @@ Key properties of this model:
- Secret injection is scoped by destination host -- a secret bound to `api.github.com` cannot be injected into requests to other hosts.
- All injection events are logged with secret ID and destination, never with secret values.
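
A small sketch of these two properties, under stated assumptions: the `SECRETS` table, the `inject` function, and the `Bearer` header are hypothetical illustrations rather than Cleanroom's implementation. The secret is added only when the request's host matches its binding, and the audit record carries the secret ID and destination but never the value.

```python
from urllib.parse import urlsplit

# Hypothetical secret table: ID -> (allowed destination host, value).
SECRETS = {"gh-token": ("api.github.com", "s3cr3t")}

audit_log: list[dict] = []


def inject(secret_id: str, url: str, headers: dict) -> dict:
    """Add the secret only when the URL's host matches its binding."""
    allowed_host, value = SECRETS[secret_id]
    host = urlsplit(url).hostname
    if host != allowed_host:
        return headers  # other hosts get the request untouched
    # Log the secret ID and destination, never the value.
    audit_log.append({"secret_id": secret_id, "destination": host})
    return {**headers, "Authorization": f"Bearer {value}"}
```

Matching on the parsed hostname rather than a substring means a host such as `api.github.com.evil.example` does not qualify.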

## Research conclusions

1. Use Firecracker as the local sandbox backend (inspired by Matchlock patterns). See [backend/firecracker.md](backend/firecracker.md) for implementation details.
2. Use an in-tree host gateway as the package/registry and git mediation layer,
borrowing ideas from tools like `content-cache` and `git-proxy-cache` rather
than depending on them directly.
3. Use a tokenizer-like secret-injection model with host-scoped policy and no
plaintext propagation.
4. Keep CLI first: `cleanroom exec` as primary entrypoint and command pattern.
## Agent and inference credentials

Validation snapshot date: 2026-03-17.

This section records what auth surfaces these tools expose, what host-side
state is present locally, and what material would become visible if that state
were projected into a guest sandbox.

### Provider auth surface snapshot

| Tool | Auth surfaces observed | Helper or endpoint hooks observed | Host-side persisted state observed | Notes |
|---|---|---|---|---|
| Codex CLI | ChatGPT account login; API key login via `codex login --with-api-key` | Official config/auth docs; local CLI exposes login surfaces; local binary strings reference `OPENAI_BASE_URL` | `~/.codex/auth.json` with `auth_mode`, `access_token`, `refresh_token`, `id_token` fields | On this host, `codex login status` returned `Logged in using ChatGPT` |
| Claude Code | Anthropic docs describe API-key auth plus IAM-backed options | `apiKeyHelper`; custom base URL; LLM gateway docs | Not observed locally in this environment (`claude` CLI not installed) | Docs present a first-class helper-based credential retrieval surface |
| Amp CLI | `AMP_API_KEY`; `AMP_URL` | CLI help exposes URL override; no helper-style credential hook was found in local help output | `~/.local/share/amp/secrets.json`; `~/.local/share/amp/session.json` | Amp docs describe stored OAuth/local secret state; owner manual says self-hosted/BYOK is not currently supported |

### Security-relevant observations

- The local Codex auth store contains token-bearing fields, including
`access_token` and `refresh_token`.
- The local Amp installation stores credential-related state under the user's
home directory rather than relying only on ephemeral process env.
- Claude Code's published settings include an explicit credential helper hook
(`apiKeyHelper`) rather than requiring a static key in config.
- Codex and Amp both show evidence of host-persisted login state in this
environment.
- Projecting host auth files such as `~/.codex/auth.json` or
`~/.local/share/amp/secrets.json` into a guest would expose their contents to
guest processes.

### Relevant references

- [Codex configuration/auth docs](https://developers.openai.com/codex/config-advanced/)
- local CLI `codex login --help` and `codex login status`
- [Claude Code settings](https://docs.anthropic.com/en/docs/claude-code/settings)
- [Claude Code IAM and auth guide](https://docs.anthropic.com/en/docs/claude-code/iam)
- [Claude Code LLM gateway guide](https://docs.anthropic.com/en/docs/claude-code/llm-gateway)
- [Amp security reference](https://ampcode.com/security)
- [Amp owner manual](https://ampcode.com/manual)
- local CLI `amp --help`, `amp usage`, and local config layout

## Research summary

1. Hosted sandbox providers generally default to full internet access and do
not provide repository-scoped egress control or credential isolation.
2. Among the self-hosted tools reviewed, Matchlock is the closest structural
prior art to Cleanroom, while still differing on default allow behavior,
policy mutability, image pinning, and TLS proxying.
3. Firecracker maps directly to TAP plus host-firewall enforcement on Linux;
macOS requires different enforcement primitives and capability scoping.
4. Existing prior art for host-side mediation and caching exists in
`content-cache`, `git-proxy-cache`, and `tokenizer`.