Run Claude Code with GPT-5.5 as the backend — through Azure OpenAI, OpenAI direct, or your existing Codex CLI session.
A tiny FastAPI proxy that translates Anthropic's Messages API into the OpenAI Responses API. Point Claude Code's ANTHROPIC_BASE_URL at it and the entire CLI keeps working — same UX, same tools, same sub‑agents — just running on GPT-5.5 instead of Claude.
🇰🇷 Note for Korean users · A lightweight proxy that keeps Claude Code's UX as-is and swaps only the backend to GPT-5.5. Choose between Azure OpenAI · OpenAI · Codex (ChatGPT subscription); a setup wizard walks you through credentials on first run. See README.ko.md for the full Korean documentation.
```
Claude Code ──► claude-code-gpt ──► Azure / OpenAI / Codex
    (UX)          (this repo)        (cheaper compute)
```
Same Claude Code workflow, with the option to point at cheaper backends. From a single internal run on the same task (a small Pygame demo with a sub‑agent enabled, main agent and sub‑agent both on the sonnet tier):
| Backend | Approx. spend on that one run |
|---|---|
| Azure GPT‑5.5 via this proxy | ~$4 |
| Azure GPT‑5.4‑mini via this proxy | ~$0.43 |
Anthropic Opus on the same task would have been several times more expensive. We're not publishing apples‑to‑apples benchmarks — your mileage will vary heavily by task, prompt size, and sub‑agent fan‑out. Treat this as a rough proof that the proxy actually works, not a marketing claim.
```bash
git clone https://github.com/gh777111/claude-code-gpt
cd claude-code-gpt
uv sync                                  # installs fastapi/uvicorn/httpx/python-dotenv
mkdir -p ~/.local/bin
ln -sf "$PWD/claudegpt" ~/.local/bin/claudegpt
# make sure ~/.local/bin is on your PATH
claudegpt                                # first run launches a setup wizard
```

The wizard asks which backend to use (Azure / OpenAI / Codex) and writes a `.env` for you. For Codex it just shells out to `codex login` — no API key to copy/paste. After that, every `claudegpt` invocation boots the proxy and execs Claude Code.
Requirements:
- macOS / Linux
- Python 3.12+
- `uv` (`pip install -e .` also works)
- Claude Code installed
- Credentials for at least one supported backend (see below)
Set `CLAUDEGPT_PROVIDER` in `.env`.
Uses your Azure OpenAI resource. Bring your own deployments:
```
CLAUDEGPT_PROVIDER=azure
AZURE_OPENAI_ENDPOINT=https://YOUR-RESOURCE.cognitiveservices.azure.com/
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_CHAT_DEPLOYMENT_FULL=gpt-5-5       # → claude-opus-*
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-54-mini        # → claude-sonnet-*
AZURE_OPENAI_CHAT_DEPLOYMENT_NANO=gpt-54-nano   # → claude-haiku-*
```

Uses the public OpenAI API with your `OPENAI_API_KEY`:
```
CLAUDEGPT_PROVIDER=openai
OPENAI_API_KEY=sk-...
CLAUDEGPT_OPENAI_OPUS=gpt-5.5
CLAUDEGPT_OPENAI_SONNET=gpt-5.4-mini
```

Reads the local Codex CLI session at `~/.codex/auth.json` and forwards requests to the same backend the Codex CLI uses. Subject to whatever rate limits / quotas your account has there. May break at any time when the upstream changes — treat it as a curiosity, not a contract.
```
CLAUDEGPT_PROVIDER=codex
# ~/.codex/auth.json must already exist (run `codex login` first)
```

When Claude Code asks for `claude-opus-*` / `claude-sonnet-*` / `claude-haiku-*`, the proxy routes to your configured model per tier. Defaults:
| Claude tier | azure | openai | codex |
|---|---|---|---|
| opus | `gpt-5-5` | `gpt-5.5` | `gpt-5.5` |
| sonnet | `gpt-54-mini` | `gpt-5.4-mini` | `gpt-5.4-mini` |
| haiku | `gpt-54-nano` | `gpt-5.4-mini` | `gpt-5.4-mini` |
Override any of them via `.env`.
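Under the hood this is just a lookup keyed by provider and tier. A minimal sketch of the idea, with the defaults above baked in — illustrative only; the real resolver lives in config.py and honors the provider‑specific env names shown earlier:

```python
import os

# Illustrative defaults copied from the table above.
DEFAULTS = {
    "azure":  {"opus": "gpt-5-5", "sonnet": "gpt-54-mini",  "haiku": "gpt-54-nano"},
    "openai": {"opus": "gpt-5.5", "sonnet": "gpt-5.4-mini", "haiku": "gpt-5.4-mini"},
    "codex":  {"opus": "gpt-5.5", "sonnet": "gpt-5.4-mini", "haiku": "gpt-5.4-mini"},
}

def resolve_model(anthropic_model: str) -> str:
    """Map an incoming claude-{opus,sonnet,haiku}-* name to a backend model."""
    provider = os.getenv("CLAUDEGPT_PROVIDER", "azure")
    tiers = DEFAULTS.get(provider, DEFAULTS["azure"])
    for tier in ("opus", "sonnet", "haiku"):
        if tier in anthropic_model:       # e.g. "claude-sonnet-4-5" matches "sonnet"
            return tiers[tier]
    return tiers["sonnet"]                # fallback for unrecognized names (sketch choice)
```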
The GPT‑5 series is a reasoning-model family. Reasoning effort per tier (default `medium`) is configurable; tool‑bearing turns are automatically shifted down to `low` to keep agentic latency reasonable:
```
CLAUDEGPT_REASONING_OPUS=medium
CLAUDEGPT_REASONING_SONNET=medium
CLAUDEGPT_REASONING_HAIKU=medium
CLAUDEGPT_TOOLS_REASONING=low
```

The launcher also intercepts Claude Code's `--effort` flag (or `claudegpt --effort medium`) and applies the requested effort across all tiers for that invocation. The proxy auto‑restarts when the effort changes — Claude Code itself does not forward thinking settings to non‑Anthropic backends, so this launcher‑side translation is what makes `--effort` actually do anything here.
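The effort policy amounts to a couple of lookups. A minimal sketch with a hypothetical helper name — the real mapping lives in config.py:

```python
import os

def pick_reasoning_effort(tier: str, has_tools: bool) -> str:
    """Per-tier reasoning effort, with tool-bearing turns forced down to
    CLAUDEGPT_TOOLS_REASONING to keep agentic latency reasonable."""
    if has_tools:
        return os.getenv("CLAUDEGPT_TOOLS_REASONING", "low")
    return os.getenv(f"CLAUDEGPT_REASONING_{tier.upper()}", "medium")

print(pick_reasoning_effort("sonnet", has_tools=True))   # → "low" with defaults
```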
Anthropic's WebSearch and WebFetch are server‑side tools — they only work when Claude is talking to Anthropic's own backend. The proxy reproduces both on the GPT side:
When Claude Code declares `WebSearch` in the request, the proxy rewrites it to `{"type":"web_search"}` and lets Azure's hosted tool do the actual searching. The model gets results inline, citations and all.
```
CLAUDEGPT_MAP_WEB_SEARCH=1   # default; set 0 to drop instead
```

Cost: charged per search by Azure (typically ~$0.025 / call), in addition to model tokens.
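The rewrite is a declaration‑level swap on the tools array. A rough sketch — the name matching here is an assumption, and the real version lives in translate.py:

```python
def map_tools(anthropic_tools: list[dict]) -> list[dict]:
    """Replace Claude Code's WebSearch declaration with the hosted
    Responses web_search tool; pass everything else through untouched."""
    mapped = []
    for tool in anthropic_tools:
        if tool.get("name") in ("WebSearch", "web_search"):
            mapped.append({"type": "web_search"})   # Azure/OpenAI hosted search
        else:
            mapped.append(tool)
    return mapped
```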
Mirrors Anthropic's architecture: fetch the URL with urllib, run a small‑model first stage to extract the relevant slice per the user's prompt, then feed only that compact result to the main model. Emits `server_tool_use` + `web_fetch_tool_result` content blocks identical to Anthropic's native shape.
```
CLAUDEGPT_MAP_WEB_FETCH=1                      # default
CLAUDEGPT_WEB_FETCH_SUMMARIZER=gpt-54-nano     # 1st-stage model; empty disables
CLAUDEGPT_WEB_FETCH_TIMEOUT=15
CLAUDEGPT_WEB_FETCH_MAX_CHARS=50000            # raw fetch cap
CLAUDEGPT_WEB_FETCH_SUMMARY_MAX_CHARS=4000     # 1st-stage output cap
CLAUDEGPT_WEB_FETCH_MAX_CHAIN=4                # max fetches per turn
```

This keeps the main model's context small even on huge pages — the same trick Anthropic uses to keep WebFetch from blowing up your token budget.
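Conceptually the two‑stage pipeline looks like this — a minimal sketch where `summarize` stands in for a call to the CLAUDEGPT_WEB_FETCH_SUMMARIZER model (hypothetical helper; the real flow lives in webfetch.py and chain.py):

```python
import urllib.request

MAX_CHARS = 50_000      # CLAUDEGPT_WEB_FETCH_MAX_CHARS
SUMMARY_MAX = 4_000     # CLAUDEGPT_WEB_FETCH_SUMMARY_MAX_CHARS

def fetch_and_compact(url: str, user_prompt: str, summarize) -> str:
    """Stage 1: raw fetch, hard-capped. Stage 2: small-model extraction of
    only the slice relevant to the user's prompt. The main model never
    sees the raw page."""
    with urllib.request.urlopen(url, timeout=15) as resp:   # CLAUDEGPT_WEB_FETCH_TIMEOUT
        raw = resp.read().decode("utf-8", errors="replace")[:MAX_CHARS]
    return summarize(prompt=user_prompt, page=raw)[:SUMMARY_MAX]
```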
The proxy can selectively block tool declarations before they reach the backend, saving a few thousand tokens per turn and avoiding tools that can't work over GPT.
```
# Strip MCP-server tool declarations (mcp__*). Useful when you keep ~/.claude
# settings (CLAUDE.md, commands, permissions) but want the per-turn overhead
# of MCP servers off your bill.
CLAUDEGPT_BLOCK_MCP=1

# Drop specific tools by name. Default drops NotebookEdit (most users don't
# touch .ipynb).
CLAUDEGPT_DROP_TOOLS=NotebookEdit,Bash   # example: hand all shell to /codex-exec
```
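In effect the proxy runs a filter like this over the declarations before forwarding them — an illustrative sketch; the real logic lives in translate.py:

```python
import os

def filter_tools(tools: list[dict]) -> list[dict]:
    """Drop MCP-server tools (mcp__*) and explicitly listed tool names
    before the request leaves for the backend."""
    dropped = {t.strip()
               for t in os.getenv("CLAUDEGPT_DROP_TOOLS", "NotebookEdit").split(",")
               if t.strip()}
    block_mcp = os.getenv("CLAUDEGPT_BLOCK_MCP") == "1"
    return [
        tool for tool in tools
        if not (block_mcp and tool.get("name", "").startswith("mcp__"))
        and tool.get("name") not in dropped
    ]
```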
Every turn writes a self‑contained JSON trace under `<repo>/traces/<cwd-tag>/<ts>-<id>.json`. Tracing is on by default; flip `CLAUDEGPT_TRACE=0` to disable.

```
CLAUDEGPT_TRACE=1                      # default
CLAUDEGPT_TRACE_DIR=/some/other/path   # default <repo>/traces
```

Each file holds the incoming Anthropic request, the outgoing OpenAI body, the streamed Azure events (in stream mode), the translated Anthropic response, plus timing/effort/model metadata. WebFetch chains record per-round info under `chain_rounds`. Inspect with `jq` or any editor — the format is human‑readable.
```bash
$ jq '{model: .request_anthropic.model,
       deployment, effort,
       fetches: (.chain_rounds // [] | map(.fetches // []) | add),
       duration_ms}' \
    traces/*/20260505T*.json
```

`traces/` is git‑ignored — trace files contain user prompts.
```
Claude Code
     │
     │  POST /v1/messages  (Anthropic Messages API + SSE)
     ▼
claude-code-gpt ──── translate ────►  POST /v1/responses  (OpenAI Responses API)
     ▲                                      │
     │  Anthropic-style SSE                 │  Responses SSE (response.output_text.delta, etc.)
     └────────────── translate ◄────────────┘
```
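Most of the work happens on the streaming path. A heavily reduced sketch of that bottom arrow — only the plain text‑delta case, everything else omitted; the real driver is stream.py:

```python
import json

async def responses_to_anthropic_sse(upstream):
    """Translate parsed Responses SSE events into Anthropic Messages SSE.
    `upstream` is assumed to be an async iterator of decoded event dicts."""
    async for event in upstream:
        if event["type"] == "response.output_text.delta":
            data = {
                "type": "content_block_delta",
                "index": 0,
                "delta": {"type": "text_delta", "text": event["delta"]},
            }
            yield f"event: content_block_delta\ndata: {json.dumps(data)}\n\n"
        elif event["type"] == "response.completed":
            yield 'event: message_stop\ndata: {"type": "message_stop"}\n\n'
```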
Key implementation choices:

- Responses API, not chat/completions — required for `reasoning_effort` + tools simultaneously, plus better stream semantics.
- Body field order = stable prefix first (`instructions` → `tools` → `input`). Azure's automatic prefix cache hashes the leading 1,024 tokens, so we keep the boilerplate up front and the conversation history at the tail (see the sketch after this list).
- 24h cache retention auto‑injected on Azure (`prompt_cache_retention: "24h"`), extending the default 5–10 minute TTL to a full day. Cuts repeat‑prefix cost across long sessions.
- Tool args buffered + cleaned — empty optional parameters (e.g. Read's `pages`) are stripped before reaching Claude Code, fixing the most common "tool call rejected" loop.
- WebFetch chain controller — when the model issues `function_call(name="WebFetch")`, the proxy intercepts mid‑stream, runs urllib plus a small‑model summarizer, then continues a fresh Azure call with the result spliced into the same Anthropic message envelope. Multi-fetch chaining up to `MAX_CHAIN`.
- Global config isolation — by default the launcher exports `CLAUDE_CONFIG_DIR=/tmp/claudegpt-clean-config` so Claude Code skips your global `~/.claude/` (CLAUDE.md, MCP, skills, agents). This alone cut our system‑prompt overhead from ~20k to ~6.6k tokens. Disable with `CLAUDEGPT_GLOBAL_ISOLATE=0` if you want your global setup back.
- Launcher reads `.env` directly — `~/.claude/.env` and `<repo>/.env` are parsed for `CLAUDEGPT_*` keys, so settings work regardless of which shell rc your terminal/IDE actually loaded.
- Cost‑cutting envs — `DISABLE_NON_ESSENTIAL_MODEL_CALLS=1`, `DISABLE_AUTOCOMPACT=1`, and `MAX_THINKING_TOKENS=0` are exported by the launcher.
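To make the caching bullets concrete: a sketch of how a Responses body can be assembled so the stable prefix leads and the history trails (illustrative; the real assembly is in translate.py):

```python
def build_responses_body(instructions: str, tools: list, history: list,
                         model: str, effort: str) -> dict:
    """Dict insertion order survives JSON serialization, so the fields
    Azure's prefix cache hashes first stay identical across turns."""
    return {
        "model": model,
        "instructions": instructions,        # stable across turns
        "tools": tools,                      # stable across turns
        "input": history,                    # grows every turn — keep at the tail
        "reasoning": {"effort": effort},
        "stream": True,
        "prompt_cache_retention": "24h",     # Azure: stretch cache TTL to a day
    }
```

Anything that varies per turn has to stay out of that leading 1,024‑token window, or the prefix cache misses.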
Known limitations:

- Anthropic prompt caching with `cache_control` markers is not translated. We rely on Azure's automatic prefix cache (with 24h retention) instead — quieter, less explicit, but functional.
- Citations format: WebSearch results come back as inline markdown links rather than Anthropic's distinct citation blocks. Content is identical; the rendering is slightly different in Claude Code's UI.
- WebFetch on dynamic JS sites: pure urllib, no headless browser. Sites that need JS execution will return mostly chrome (nav menus, etc.). Public APIs, static HTML, and simple SSR pages work great.
- The model will insist it is Claude when you ask. That's just how strongly it follows the Claude Code system prompt; the routing logs and your provider's billing dashboard are the real source of truth.
- You are responsible for ToS — pointing Claude Code at a non‑Anthropic backend is your call, and the same goes for whichever backend provider you use.
```
claude-code-gpt/
├── claudegpt        # bash launcher: boots proxy, exports envs, execs `claude`
├── server.py        # FastAPI app, provider dispatch, SSE collect-to-json
├── translate.py     # Anthropic Messages ↔ OpenAI Responses input/output
├── stream.py        # Responses SSE → Anthropic SSE (default driver)
├── chain.py         # WebFetch chain controller — runs fetches mid-stream
├── webfetch.py      # urllib + stdlib HTML→text + Anthropic-shape result blocks
├── trace.py         # per-turn request/response JSON dump (cwd-grouped)
├── config.py        # env loading + model + reasoning_effort mapping
├── pyproject.toml   # fastapi / uvicorn / httpx / python-dotenv
└── .env.example
```
Seven small Python source files plus a bash launcher. No framework lock‑in.
MIT — see LICENSE.
Inspired by aattaran/deepclaude (Claude Code → DeepSeek). Not affiliated with Anthropic, OpenAI, or Microsoft.
