
claude-code-gpt

Run Claude Code with GPT-5.5 as the backend — through Azure OpenAI, OpenAI direct, or your existing Codex CLI session.

Korean README

A tiny FastAPI proxy that translates Anthropic's Messages API into the OpenAI Responses API. Point Claude Code's ANTHROPIC_BASE_URL at it and the entire CLI keeps working — same UX, same tools, same sub‑agents — just running on GPT-5.5 instead of Claude.

claude-code-gpt architecture

🇰🇷 Note for Korean users · A lightweight proxy that keeps Claude Code's UX and swaps only the backend to GPT-5.5. Choose among Azure OpenAI, OpenAI, or Codex (ChatGPT subscription); on first run a setup wizard walks you through credentials. See README.ko.md for the full Korean documentation.

Claude Code  ──►  claude-code-gpt  ──►  Azure / OpenAI / Codex
   (UX)            (this repo)          (cheaper compute)

Why

Same Claude Code workflow, with the option to point at cheaper backends. From a single internal run of the same task (a small Pygame demo, sub‑agent enabled, main agent and sub‑agent both on the sonnet tier):

| Backend | Approx. spend on that one run |
| --- | --- |
| Azure GPT‑5.5 via this proxy | ~$4 |
| Azure GPT‑5.4‑mini via this proxy | ~$0.43 |

Anthropic Opus on the same task would have been several times more expensive. We're not publishing apples‑to‑apples benchmarks — your mileage will vary heavily by task, prompt size, and sub‑agent fan‑out. Treat this as a rough proof that the proxy actually works, not a marketing claim.


Quick start

git clone https://github.com/gh777111/claude-code-gpt
cd claude-code-gpt
uv sync                       # installs fastapi/uvicorn/httpx/python-dotenv

mkdir -p ~/.local/bin
ln -sf "$PWD/claudegpt" ~/.local/bin/claudegpt
# make sure ~/.local/bin is on your PATH

claudegpt                     # first run launches a setup wizard

The wizard asks which backend to use (Azure / OpenAI / Codex) and writes a .env for you. For Codex it just shells out to codex login — no API key to copy/paste. After that, every claudegpt invocation boots the proxy and execs Claude Code.

Requirements:

  • macOS / Linux
  • Python 3.12+
  • uv (pip install -e . also works)
  • Claude Code installed
  • Credentials for at least one supported backend (see below)

Backends

Set CLAUDEGPT_PROVIDER in .env.

azure (recommended for production)

Uses your Azure OpenAI resource. Bring your own deployments:

CLAUDEGPT_PROVIDER=azure
AZURE_OPENAI_ENDPOINT=https://YOUR-RESOURCE.cognitiveservices.azure.com/
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_CHAT_DEPLOYMENT_FULL=gpt-5-5      # → claude-opus-*
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-54-mini       # → claude-sonnet-*
AZURE_OPENAI_CHAT_DEPLOYMENT_NANO=gpt-54-nano  # → claude-haiku-*

openai

Uses the public OpenAI API with your OPENAI_API_KEY:

CLAUDEGPT_PROVIDER=openai
OPENAI_API_KEY=sk-...
CLAUDEGPT_OPENAI_OPUS=gpt-5.5
CLAUDEGPT_OPENAI_SONNET=gpt-5.4-mini

codex (experimental)

Reads the local Codex CLI session at ~/.codex/auth.json and forwards requests to the same backend the Codex CLI uses. Subject to whatever rate limits / quotas your account has there. May break at any time when the upstream changes — treat as a curiosity, not a contract.

CLAUDEGPT_PROVIDER=codex
# ~/.codex/auth.json must already exist (run `codex login` first)

Model mapping

When Claude Code asks for claude-opus-* / claude-sonnet-* / claude-haiku-*, the proxy routes to your configured model per tier. Defaults:

| Claude tier | azure | openai | codex |
| --- | --- | --- | --- |
| opus | gpt-5-5 | gpt-5.5 | gpt-5.5 |
| sonnet | gpt-54-mini | gpt-5.4-mini | gpt-5.4-mini |
| haiku | gpt-54-nano | gpt-5.4-mini | gpt-5.4-mini |

Override any of them via .env.
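The tier routing above can be sketched as a small lookup. This is an illustrative reconstruction, not the proxy's actual code: the uniform `CLAUDEGPT_<PROVIDER>_<TIER>` override scheme matches the `CLAUDEGPT_OPENAI_*` variables shown earlier, while the Azure backend really uses the `AZURE_OPENAI_CHAT_DEPLOYMENT_*` names.

```python
import os

# Default model/deployment names per backend, mirroring the table above.
DEFAULTS = {
    "azure":  {"opus": "gpt-5-5", "sonnet": "gpt-54-mini",  "haiku": "gpt-54-nano"},
    "openai": {"opus": "gpt-5.5", "sonnet": "gpt-5.4-mini", "haiku": "gpt-5.4-mini"},
    "codex":  {"opus": "gpt-5.5", "sonnet": "gpt-5.4-mini", "haiku": "gpt-5.4-mini"},
}

def resolve_model(requested: str, provider: str, env=os.environ) -> str:
    """Map a claude-opus-*/sonnet-*/haiku-* model id to a backend model."""
    for tier in ("opus", "sonnet", "haiku"):
        if tier in requested:
            # Hypothetical uniform override key; Azure's real keys differ.
            override = env.get(f"CLAUDEGPT_{provider.upper()}_{tier.upper()}")
            return override or DEFAULTS[provider][tier]
    return DEFAULTS[provider]["sonnet"]  # unknown ids fall back to the mid tier
```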

Reasoning effort

The GPT‑5 series is a reasoning model family. Effort per tier (default medium) is configurable; tool‑bearing turns are automatically shifted down to low to keep agentic latency reasonable:

CLAUDEGPT_REASONING_OPUS=medium
CLAUDEGPT_REASONING_SONNET=medium
CLAUDEGPT_REASONING_HAIKU=medium
CLAUDEGPT_TOOLS_REASONING=low

The launcher also intercepts Claude Code's --effort flag (or claudegpt --effort medium) and applies the requested effort across all tiers for that invocation. The proxy auto‑restarts when the effort changes — Claude Code itself does not forward thinking to non‑Anthropic backends, so this launcher‑side translation is what makes --effort actually do anything here.


Web tools

Anthropic's WebSearch and WebFetch are server‑side tools — they only work when Claude is talking to Anthropic's own backend. The proxy reproduces both on the GPT side:

WebSearch → Azure hosted web_search

When Claude Code declares WebSearch in the request, the proxy rewrites it to {"type":"web_search"} and lets Azure's hosted tool do the actual searching. The model gets results inline, citations and all.

CLAUDEGPT_MAP_WEB_SEARCH=1   # default; set 0 to drop instead

Cost: charged per search by Azure (typically ~$0.025 / call), in addition to model tokens.
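The rewrite amounts to swapping one tool declaration for another before the request leaves the proxy. A minimal sketch (real tool declarations carry more fields than shown):

```python
def map_web_search(tools: list[dict], enabled: bool = True) -> list[dict]:
    """Rewrite Anthropic's WebSearch declaration into Azure's hosted tool."""
    out = []
    for tool in tools:
        if tool.get("name") == "WebSearch":
            if enabled:                       # CLAUDEGPT_MAP_WEB_SEARCH=1
                out.append({"type": "web_search"})
            # enabled=False: drop the declaration (CLAUDEGPT_MAP_WEB_SEARCH=0)
        else:
            out.append(tool)
    return out
```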

WebFetch → local 2‑stage urllib + summarizer

Mirrors Anthropic's architecture: fetch the URL with urllib, run a small‑model first‑stage to extract the relevant slice per the user's prompt, then feed only that compact result to the main model. Emits server_tool_use + web_fetch_tool_result content blocks identical to Anthropic's native shape.

CLAUDEGPT_MAP_WEB_FETCH=1                          # default
CLAUDEGPT_WEB_FETCH_SUMMARIZER=gpt-54-nano         # 1st-stage model; empty disables
CLAUDEGPT_WEB_FETCH_TIMEOUT=15
CLAUDEGPT_WEB_FETCH_MAX_CHARS=50000                # raw fetch cap
CLAUDEGPT_WEB_FETCH_SUMMARY_MAX_CHARS=4000         # 1st-stage output cap
CLAUDEGPT_WEB_FETCH_MAX_CHAIN=4                    # max fetches per turn

This keeps the main model's context small even on huge pages — same trick Anthropic uses to keep WebFetch from blowing up your token budget.
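The two-stage flow can be sketched in a few lines. `summarize(text, prompt)` stands in for the first-stage small-model call (e.g. the gpt-54-nano summarizer); its name and signature are illustrative, not the proxy's actual internals.

```python
from urllib.request import Request, urlopen

def web_fetch(url: str, prompt: str, summarize, max_chars: int = 50_000,
              timeout: int = 15, summary_max_chars: int = 4_000) -> str:
    """Two-stage WebFetch sketch: raw urllib fetch, then small-model extraction."""
    # Stage 1: plain urllib fetch, capped at max_chars (no headless browser).
    req = Request(url, headers={"User-Agent": "claude-code-gpt"})
    with urlopen(req, timeout=timeout) as resp:
        raw = resp.read().decode("utf-8", errors="replace")[:max_chars]
    # Stage 2: extract only the slice relevant to the user's prompt, so the
    # main model never sees the full page.
    return summarize(raw, prompt)[:summary_max_chars]
```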


Tool customization

The proxy can selectively block tool declarations before they reach the backend, saving a few thousand tokens per turn and avoiding tools that can't work over GPT.

# Strip MCP-server tool declarations (mcp__*). Useful when you keep ~/.claude
# settings (CLAUDE.md, commands, permissions) but want the per-turn overhead
# of MCP servers off your bill.
CLAUDEGPT_BLOCK_MCP=1

# Drop specific tools by name. Default drops NotebookEdit (most users don't
# touch .ipynb).
CLAUDEGPT_DROP_TOOLS=NotebookEdit,Bash      # example: hand all shell to /codex-exec
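Both knobs boil down to a filter pass over the tool list before it reaches the backend. A minimal sketch of that pass (function name is illustrative):

```python
def filter_tools(tools: list[dict], block_mcp: bool = True,
                 drop: tuple[str, ...] = ("NotebookEdit",)) -> list[dict]:
    """Strip tool declarations before they reach the backend (sketch)."""
    kept = []
    for tool in tools:
        name = tool.get("name", "")
        if block_mcp and name.startswith("mcp__"):
            continue                      # CLAUDEGPT_BLOCK_MCP=1
        if name in drop:
            continue                      # CLAUDEGPT_DROP_TOOLS
        kept.append(tool)
    return kept
```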

Tracing

Every turn writes a self‑contained JSON trace under <repo>/traces/<cwd-tag>/<ts>-<id>.json. Default on; flip CLAUDEGPT_TRACE=0 to disable.

CLAUDEGPT_TRACE=1                          # default
CLAUDEGPT_TRACE_DIR=/some/other/path       # default <repo>/traces

Each file holds the incoming Anthropic request, the outgoing OpenAI body, the streamed Azure events (in streaming mode), the translated Anthropic response, plus timing/effort/model metadata. WebFetch chains record per-round info under chain_rounds. Inspect with jq or any editor — the format is human‑readable.

$ jq '{model: .request_anthropic.model,
       deployment, effort,
       fetches: (.chain_rounds // [] | map(.fetches // []) | add),
       duration_ms}' \
     traces/*/20260505T*.json

traces/ is git‑ignored — trace files contain user prompts.
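If you prefer Python to jq, the same roll-up is easy to script. The keys below follow the fields named above; any trace field not mentioned there is not assumed.

```python
import json
import pathlib

def summarize_traces(trace_dir: str = "traces") -> list[dict]:
    """Roll up per-turn trace files under <trace_dir>/<cwd-tag>/*.json."""
    rows = []
    for path in sorted(pathlib.Path(trace_dir).glob("*/*.json")):
        trace = json.loads(path.read_text())
        rows.append({
            "model": trace.get("request_anthropic", {}).get("model"),
            "effort": trace.get("effort"),
            "duration_ms": trace.get("duration_ms"),
        })
    return rows
```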


How it works

Claude Code
   │
   │  POST /v1/messages   (Anthropic Messages API + SSE)
   ▼
claude-code-gpt  ── translate ──►  POST /v1/responses   (OpenAI Responses API)
   ▲                                  │
   │  Anthropic-style SSE             │  Responses SSE (response.output_text.delta, etc.)
   └──────────── translate ◄──────────┘

Key implementation choices:

  • Responses API, not chat/completions — required for reasoning_effort + tools simultaneously, plus better stream semantics.
  • Body field order = stable prefix first (instructions → tools → input). Azure's automatic prefix cache hashes the leading 1,024 tokens, so we keep the boilerplate up front and the conversation history at the tail.
  • 24h cache retention auto‑injected on Azure (prompt_cache_retention: "24h"), extending the default 5–10 minute TTL to a full day. Cuts repeat‑prefix cost across long sessions.
  • Tool args buffered + cleaned — empty optional parameters (e.g. Read's pages) are stripped before reaching Claude Code, fixing the most common "tool call rejected" loop.
  • WebFetch chain controller — when the model issues function_call(name="WebFetch"), the proxy intercepts mid‑stream, runs urllib + a small‑model summarizer, then continues a fresh Azure call with the result spliced into the same Anthropic message envelope. Multi-fetch chaining up to MAX_CHAIN.
  • Global config isolation — by default the launcher exports CLAUDE_CONFIG_DIR=/tmp/claudegpt-clean-config so Claude Code skips your global ~/.claude/ (CLAUDE.md, MCP, skills, agents). This alone cut our system‑prompt overhead from ~20k to ~6.6k tokens. Disable with CLAUDEGPT_GLOBAL_ISOLATE=0 if you want your global setup back.
  • Launcher reads .env directly: ~/.claude/.env and <repo>/.env are parsed for CLAUDEGPT_* keys, so settings work regardless of which shell rc your terminal/IDE actually loaded.
  • Cost‑cutting envs: DISABLE_NON_ESSENTIAL_MODEL_CALLS=1, DISABLE_AUTOCOMPACT=1, and MAX_THINKING_TOKENS=0 are exported by the launcher.
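The cache-friendly body ordering described in the list above can be sketched directly, since Python dicts serialize in insertion order. The top-level field names match the Responses API; the retention knob is the Azure-specific extension mentioned above.

```python
def build_body(instructions: str, tools: list, history: list) -> dict:
    """Order the Responses-API body so Azure's prefix cache hits (sketch)."""
    return {
        "instructions": instructions,     # stable across a session
        "tools": tools,                   # stable across a session
        "input": history,                 # grows every turn — kept at the tail
        "prompt_cache_retention": "24h",  # extend Azure's default 5–10 min TTL
    }
```

Because the stable fields come first, the leading ~1,024 tokens of the serialized body hash identically turn after turn, which is exactly what the automatic prefix cache keys on.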

Limitations

  • Anthropic prompt caching with cache_control markers is not translated. We rely on Azure's automatic prefix cache (with 24h retention) instead — quieter, less explicit, but functional.
  • Citations format: WebSearch results come back as inline markdown links rather than Anthropic's distinct citation blocks. Content is identical; the rendering is slightly different in Claude Code's UI.
  • WebFetch on dynamic JS sites: pure urllib, no headless browser. Sites that need JS execution will return mostly chrome (nav menus, etc.). Public APIs, static HTML, and simple SSR pages work great.
  • The model will insist it is Claude when you ask. That's just how strongly it follows the Claude Code system prompt; the routing logs and your provider's billing dashboard are the real source of truth.
  • You are responsible for ToS — pointing Claude Code at a non‑Anthropic backend is your call, as is your usage of whichever backend you choose.

File layout

claude-code-gpt/
├── claudegpt          # bash launcher: boots proxy, exports envs, execs `claude`
├── server.py          # FastAPI app, provider dispatch, SSE collect-to-json
├── translate.py       # Anthropic Messages ↔ OpenAI Responses input/output
├── stream.py          # Responses SSE → Anthropic SSE (default driver)
├── chain.py           # WebFetch chain controller — runs fetches mid-stream
├── webfetch.py        # urllib + stdlib HTML→text + Anthropic-shape result blocks
├── trace.py           # per-turn request/response JSON dump (cwd-grouped)
├── config.py          # env loading + model + reasoning_effort mapping
├── pyproject.toml     # fastapi / uvicorn / httpx / python-dotenv
└── .env.example

Seven small source files. No framework lock‑in.


License

MIT — see LICENSE.

Inspired by aattaran/deepclaude (Claude Code → DeepSeek). Not affiliated with Anthropic, OpenAI, or Microsoft.
