
claude-code-gpt

Run Claude Code with GPT-5.5 as the backend — through Azure OpenAI, OpenAI direct, or your existing Codex CLI session.

Korean README

A tiny FastAPI proxy that translates Anthropic's Messages API into the OpenAI Responses API. Point Claude Code's ANTHROPIC_BASE_URL at it and the entire CLI keeps working — same UX, same tools, same sub‑agents — just running on GPT-5.5 instead of Claude.

claude-code-gpt architecture

🇰🇷 Note for Korean users · A lightweight proxy that keeps Claude Code's UX and swaps only the backend to GPT-5.5. Choose among Azure OpenAI, OpenAI, or Codex (ChatGPT subscription); on first run a setup wizard walks you through credentials. See README.ko.md for the full Korean documentation.

Claude Code  ──►  claude-code-gpt  ──►  Azure / OpenAI / Codex
   (UX)            (this repo)          (cheaper compute)

Why

Same Claude Code workflow, with the option to point at cheaper backends. From a single internal run of the same task (a small Pygame demo, sub‑agent enabled, main agent and sub‑agent both on the sonnet tier):

| Backend | Approx. spend on that one run |
| --- | --- |
| Azure GPT‑5.5 via this proxy | ~$4 |
| Azure GPT‑5.4‑mini via this proxy | ~$0.43 |

Anthropic Opus on the same task would have been several times more expensive. We're not publishing apples‑to‑apples benchmarks — your mileage will vary heavily by task, prompt size, and sub‑agent fan‑out. Treat this as a rough proof that the proxy actually works, not a marketing claim.


Quick start

git clone https://github.com/gh777111/claude-code-gpt
cd claude-code-gpt
uv sync                       # installs fastapi/uvicorn/httpx/python-dotenv

mkdir -p ~/.local/bin
ln -sf "$PWD/claudegpt" ~/.local/bin/claudegpt
# make sure ~/.local/bin is on your PATH

claudegpt                     # first run launches a setup wizard

The wizard asks which backend to use (Azure / OpenAI / Codex) and writes a .env for you. For Codex it just shells out to codex login — no API key to copy/paste. After that, every claudegpt invocation boots the proxy and execs Claude Code.

Requirements:

  • macOS / Linux
  • Python 3.12+
  • uv (pip install -e . also works)
  • Claude Code installed
  • Credentials for at least one supported backend (see below)

Backends

Set CLAUDEGPT_PROVIDER in .env.

azure (recommended for production)

Uses your Azure OpenAI resource. Bring your own deployments:

CLAUDEGPT_PROVIDER=azure
AZURE_OPENAI_ENDPOINT=https://YOUR-RESOURCE.cognitiveservices.azure.com/
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_CHAT_DEPLOYMENT_FULL=gpt-5-5      # → claude-opus-*
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-54-mini       # → claude-sonnet-*
AZURE_OPENAI_CHAT_DEPLOYMENT_NANO=gpt-54-nano  # → claude-haiku-*

openai

Uses the public OpenAI API with your OPENAI_API_KEY:

CLAUDEGPT_PROVIDER=openai
OPENAI_API_KEY=sk-...
CLAUDEGPT_OPENAI_OPUS=gpt-5.5
CLAUDEGPT_OPENAI_SONNET=gpt-5.4-mini

codex (experimental)

Reads the local Codex CLI session at ~/.codex/auth.json and forwards requests to the same backend the Codex CLI uses. Subject to whatever rate limits / quotas your account has there. May break at any time when the upstream changes — treat as a curiosity, not a contract.

CLAUDEGPT_PROVIDER=codex
# ~/.codex/auth.json must already exist (run `codex login` first)

Model mapping

When Claude Code asks for claude-opus-* / claude-sonnet-* / claude-haiku-*, the proxy routes to your configured model per tier. Defaults:

| Claude tier | azure | openai | codex |
| --- | --- | --- | --- |
| opus | gpt-5-5 | gpt-5.5 | gpt-5.5 |
| sonnet | gpt-54-mini | gpt-5.4-mini | gpt-5.4-mini |
| haiku | gpt-54-nano | gpt-5.4-mini | gpt-5.4-mini |

Override any of them via .env.
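The tier routing above can be sketched as a small lookup. This is an illustrative reconstruction, not the proxy's actual code: the uniform `CLAUDEGPT_<PROVIDER>_<TIER>` override scheme matches the `CLAUDEGPT_OPENAI_*` variables shown earlier, while the Azure backend really uses the `AZURE_OPENAI_CHAT_DEPLOYMENT_*` names.

```python
import os

# Default model/deployment names per backend, mirroring the table above.
DEFAULTS = {
    "azure":  {"opus": "gpt-5-5", "sonnet": "gpt-54-mini",  "haiku": "gpt-54-nano"},
    "openai": {"opus": "gpt-5.5", "sonnet": "gpt-5.4-mini", "haiku": "gpt-5.4-mini"},
    "codex":  {"opus": "gpt-5.5", "sonnet": "gpt-5.4-mini", "haiku": "gpt-5.4-mini"},
}

def resolve_model(requested: str, provider: str, env=os.environ) -> str:
    """Map a claude-opus-*/sonnet-*/haiku-* model id to a backend model."""
    for tier in ("opus", "sonnet", "haiku"):
        if tier in requested:
            # Hypothetical uniform override key; Azure's real keys differ.
            override = env.get(f"CLAUDEGPT_{provider.upper()}_{tier.upper()}")
            return override or DEFAULTS[provider][tier]
    return DEFAULTS[provider]["sonnet"]  # unknown ids fall back to the mid tier
```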

Reasoning effort

The GPT‑5 series is a reasoning model family. Effort per tier (default medium) is configurable; tool‑bearing turns are automatically shifted down to low to keep agentic latency reasonable:

CLAUDEGPT_REASONING_OPUS=medium
CLAUDEGPT_REASONING_SONNET=medium
CLAUDEGPT_REASONING_HAIKU=medium
CLAUDEGPT_TOOLS_REASONING=low

The launcher also intercepts Claude Code's --effort flag (or claudegpt --effort medium) and applies the requested effort across all tiers for that invocation. The proxy auto‑restarts when the effort changes — Claude Code itself does not forward thinking to non‑Anthropic backends, so this launcher‑side translation is what makes --effort actually do anything here.


Web tools

Anthropic's WebSearch and WebFetch are server‑side tools — they only work when Claude is talking to Anthropic's own backend. The proxy reproduces both on the GPT side:

WebSearch → Azure hosted web_search

When Claude Code declares WebSearch in the request, the proxy rewrites it to {"type":"web_search"} and lets Azure's hosted tool do the actual searching. The model gets results inline, citations and all.

CLAUDEGPT_MAP_WEB_SEARCH=1   # default; set 0 to drop instead

Cost: charged per search by Azure (typically ~$0.025 / call), in addition to model tokens.
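The rewrite amounts to swapping one tool declaration for another before the request leaves the proxy. A minimal sketch (real tool declarations carry more fields than shown):

```python
def map_web_search(tools: list[dict], enabled: bool = True) -> list[dict]:
    """Rewrite Anthropic's WebSearch declaration into Azure's hosted tool."""
    out = []
    for tool in tools:
        if tool.get("name") == "WebSearch":
            if enabled:                       # CLAUDEGPT_MAP_WEB_SEARCH=1
                out.append({"type": "web_search"})
            # enabled=False: drop the declaration (CLAUDEGPT_MAP_WEB_SEARCH=0)
        else:
            out.append(tool)
    return out
```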

WebFetch → local 2‑stage urllib + summarizer

Mirrors Anthropic's architecture: fetch the URL with urllib, run a small‑model first‑stage to extract the relevant slice per the user's prompt, then feed only that compact result to the main model. Emits server_tool_use + web_fetch_tool_result content blocks identical to Anthropic's native shape.

CLAUDEGPT_MAP_WEB_FETCH=1                          # default
CLAUDEGPT_WEB_FETCH_SUMMARIZER=gpt-54-nano         # 1st-stage model; empty disables
CLAUDEGPT_WEB_FETCH_TIMEOUT=15
CLAUDEGPT_WEB_FETCH_MAX_CHARS=50000                # raw fetch cap
CLAUDEGPT_WEB_FETCH_SUMMARY_MAX_CHARS=4000         # 1st-stage output cap
CLAUDEGPT_WEB_FETCH_MAX_CHAIN=4                    # max fetches per turn

This keeps the main model's context small even on huge pages — same trick Anthropic uses to keep WebFetch from blowing up your token budget.
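The two-stage flow can be sketched in a few lines. `summarize(text, prompt)` stands in for the first-stage small-model call (e.g. the gpt-54-nano summarizer); its name and signature are illustrative, not the proxy's actual internals.

```python
from urllib.request import Request, urlopen

def web_fetch(url: str, prompt: str, summarize, max_chars: int = 50_000,
              timeout: int = 15, summary_max_chars: int = 4_000) -> str:
    """Two-stage WebFetch sketch: raw urllib fetch, then small-model extraction."""
    # Stage 1: plain urllib fetch, capped at max_chars (no headless browser).
    req = Request(url, headers={"User-Agent": "claude-code-gpt"})
    with urlopen(req, timeout=timeout) as resp:
        raw = resp.read().decode("utf-8", errors="replace")[:max_chars]
    # Stage 2: extract only the slice relevant to the user's prompt, so the
    # main model never sees the full page.
    return summarize(raw, prompt)[:summary_max_chars]
```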


Tool customization

The proxy can selectively block tool declarations before they reach the backend, saving a few thousand tokens per turn and avoiding tools that can't work over GPT.

# Strip MCP-server tool declarations (mcp__*). Useful when you keep ~/.claude
# settings (CLAUDE.md, commands, permissions) but want the per-turn overhead
# of MCP servers off your bill.
CLAUDEGPT_BLOCK_MCP=1

# Drop specific tools by name. Default drops NotebookEdit (most users don't
# touch .ipynb).
CLAUDEGPT_DROP_TOOLS=NotebookEdit,Bash      # example: hand all shell to /codex-exec
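Both knobs boil down to a filter pass over the tool list before it reaches the backend. A minimal sketch of that pass (function name is illustrative):

```python
def filter_tools(tools: list[dict], block_mcp: bool = True,
                 drop: tuple[str, ...] = ("NotebookEdit",)) -> list[dict]:
    """Strip tool declarations before they reach the backend (sketch)."""
    kept = []
    for tool in tools:
        name = tool.get("name", "")
        if block_mcp and name.startswith("mcp__"):
            continue                      # CLAUDEGPT_BLOCK_MCP=1
        if name in drop:
            continue                      # CLAUDEGPT_DROP_TOOLS
        kept.append(tool)
    return kept
```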

Tracing

Every turn writes a self‑contained JSON trace under <repo>/traces/<cwd-tag>/<ts>-<id>.json. Default on; flip CLAUDEGPT_TRACE=0 to disable.

CLAUDEGPT_TRACE=1                          # default
CLAUDEGPT_TRACE_DIR=/some/other/path       # default <repo>/traces

Each file holds the incoming Anthropic request, the outgoing OpenAI body, the streamed Azure events (in streaming mode), the translated Anthropic response, plus timing/effort/model metadata. WebFetch chains record per-round info under chain_rounds. Inspect with jq or any editor — the format is human‑readable.

$ jq '{model: .request_anthropic.model,
       deployment, effort,
       fetches: (.chain_rounds // [] | map(.fetches // []) | add),
       duration_ms}' \
     traces/*/20260505T*.json

traces/ is git‑ignored — trace files contain user prompts.
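If you prefer Python to jq, the same roll-up is easy to script. The keys below follow the fields named above; any trace field not mentioned there is not assumed.

```python
import json
import pathlib

def summarize_traces(trace_dir: str = "traces") -> list[dict]:
    """Roll up per-turn trace files under <trace_dir>/<cwd-tag>/*.json."""
    rows = []
    for path in sorted(pathlib.Path(trace_dir).glob("*/*.json")):
        trace = json.loads(path.read_text())
        rows.append({
            "model": trace.get("request_anthropic", {}).get("model"),
            "effort": trace.get("effort"),
            "duration_ms": trace.get("duration_ms"),
        })
    return rows
```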


How it works

Claude Code
   │
   │  POST /v1/messages   (Anthropic Messages API + SSE)
   ▼
claude-code-gpt  ── translate ──►  POST /v1/responses   (OpenAI Responses API)
   ▲                                  │
   │  Anthropic-style SSE             │  Responses SSE (response.output_text.delta, etc.)
   └──────────── translate ◄──────────┘

Key implementation choices:

  • Responses API, not chat/completions — required for reasoning_effort + tools simultaneously, plus better stream semantics.
  • Body field order = stable prefix first (instructions → tools → input). Azure's automatic prefix cache hashes the leading 1,024 tokens, so we keep the boilerplate up front and the conversation history at the tail.
  • 24h cache retention auto‑injected on Azure (prompt_cache_retention: "24h"), extending the default 5–10 minute TTL to a full day. Cuts repeat‑prefix cost across long sessions.
  • Tool args buffered + cleaned — empty optional parameters (e.g. Read's pages) are stripped before reaching Claude Code, fixing the most common "tool call rejected" loop.
  • WebFetch chain controller — when the model issues function_call(name="WebFetch"), the proxy intercepts mid‑stream, runs urllib + a small‑model summarizer, then continues a fresh Azure call with the result spliced into the same Anthropic message envelope. Multi-fetch chaining up to MAX_CHAIN.
  • Global config isolation — by default the launcher exports CLAUDE_CONFIG_DIR=/tmp/claudegpt-clean-config so Claude Code skips your global ~/.claude/ (CLAUDE.md, MCP, skills, agents). This alone cut our system‑prompt overhead from ~20k to ~6.6k tokens. Disable with CLAUDEGPT_GLOBAL_ISOLATE=0 if you want your global setup back.
  • Launcher reads .env directly: ~/.claude/.env and <repo>/.env are parsed for CLAUDEGPT_* keys, so settings work regardless of which shell rc your terminal/IDE actually loaded.
  • Cost‑cutting envs: DISABLE_NON_ESSENTIAL_MODEL_CALLS=1, DISABLE_AUTOCOMPACT=1, and MAX_THINKING_TOKENS=0 are exported by the launcher.
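The cache-friendly body ordering described in the list above can be sketched directly, since Python dicts serialize in insertion order. The top-level field names match the Responses API; the retention knob is the Azure-specific extension mentioned above.

```python
def build_body(instructions: str, tools: list, history: list) -> dict:
    """Order the Responses-API body so Azure's prefix cache hits (sketch)."""
    return {
        "instructions": instructions,     # stable across a session
        "tools": tools,                   # stable across a session
        "input": history,                 # grows every turn — kept at the tail
        "prompt_cache_retention": "24h",  # extend Azure's default 5–10 min TTL
    }
```

Because the stable fields come first, the leading ~1,024 tokens of the serialized body hash identically turn after turn, which is exactly what the automatic prefix cache keys on.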

Limitations

  • Anthropic prompt caching with cache_control markers is not translated. We rely on Azure's automatic prefix cache (with 24h retention) instead — quieter, less explicit, but functional.
  • Citations format: WebSearch results come back as inline markdown links rather than Anthropic's distinct citation blocks. Content is identical; the rendering is slightly different in Claude Code's UI.
  • WebFetch on dynamic JS sites: pure urllib, no headless browser. Sites that need JS execution will return mostly chrome (nav menus, etc.). Public APIs, static HTML, and simple SSR pages work great.
  • The model will insist it is Claude when you ask. That's just how strongly it follows the Claude Code system prompt; the routing logs and your provider's billing dashboard are the real source of truth.
  • You are responsible for ToS — pointing Claude Code at a non‑Anthropic backend is your call, as is your usage of whichever backend you choose.

File layout

claude-code-gpt/
├── claudegpt          # bash launcher: boots proxy, exports envs, execs `claude`
├── server.py          # FastAPI app, provider dispatch, SSE collect-to-json
├── translate.py       # Anthropic Messages ↔ OpenAI Responses input/output
├── stream.py          # Responses SSE → Anthropic SSE (default driver)
├── chain.py           # WebFetch chain controller — runs fetches mid-stream
├── webfetch.py        # urllib + stdlib HTML→text + Anthropic-shape result blocks
├── trace.py           # per-turn request/response JSON dump (cwd-grouped)
├── config.py          # env loading + model + reasoning_effort mapping
├── pyproject.toml     # fastapi / uvicorn / httpx / python-dotenv
└── .env.example

Seven small source files. No framework lock‑in.


License

MIT — see LICENSE.

Inspired by aattaran/deepclaude (Claude Code → DeepSeek). Not affiliated with Anthropic, OpenAI, or Microsoft.
