Reading time: 20 minutes | Skill level: All levels | Version: v1.4.0 | Last updated: 2026-01-22
cco
❯ 1+1 ?
⏱️ Response: 2-6 MINUTES (should be 3-10 seconds)
💻 CPU at 100%
🔥 Mac fans spinning
# OR: hallucinations, "stuck on Explore" behavior, incoherent responses

Cause: Context mismatch. Claude Code sends ~18K tokens of system prompt + tools, but the default Ollama context is 4K tokens.
Result: Context is truncated → model hallucinates → regenerates constantly → extreme slowness.
- Verify the effective context:

  # During a cco session (in another terminal)
  ollama ps
  # Look at the CONTEXT column - should be 65536 for Claude Code

- Check context usage in Claude Code:

  # During the session
  /context
  # Look for: free space negative or very low
Option 1: Create a custom Modelfile (recommended). This is the method recommended in the official Ollama documentation, and it persists across restarts.
# 1. Create Modelfile directory
mkdir -p ~/.ollama
# 2. Create the Modelfile
cat > ~/.ollama/Modelfile.devstral-64k << 'EOF'
FROM devstral-small-2
PARAMETER num_ctx 65536
PARAMETER temperature 0.15
EOF
# 3. Create the model variant
ollama create devstral-64k -f ~/.ollama/Modelfile.devstral-64k
# 4. Use the 64K model
OLLAMA_MODEL=devstral-64k cco

Why this works:

- PARAMETER num_ctx is embedded in the model → persists across restarts
- temperature 0.15 reduces hallucinations for code generation
- Custom model variants can be listed with ollama list
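To confirm the parameters were actually baked into the variant, inspect it with the standard CLI:

```bash
# The Parameters section should list num_ctx 65536 and temperature 0.15
ollama show devstral-64k
```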
Option 2: Set a global environment variable. Lower priority than the Modelfile, but works as a fallback:
# Set global context length
launchctl setenv OLLAMA_CONTEXT_LENGTH 65536
brew services restart ollama
# Verify
launchctl getenv OLLAMA_CONTEXT_LENGTH
# Should show: 65536

Note: The environment variable has lower priority than a Modelfile PARAMETER.
For large projects or when local performance is insufficient:
# Copilot (free with subscription)
ccc
⏱️ Response: 1-3 seconds ✅
# Anthropic Direct (paid)
ccd
⏱️ Response: 1-2 seconds ✅

Why this works: both providers handle 200K+ token contexts natively.
# 1. Pull recommended model
ollama pull devstral-small-2
# 2. Create 64K Modelfile (see Option 1 above)
# 3. Start session
OLLAMA_MODEL=devstral-64k cco
# 4. Check effective context (in another terminal)
ollama ps
# Expected: CONTEXT = 65536
# 5. Test agentic task
❯ create a file hello.py with print("Hello")
# Should complete in 5-15 seconds, not 2-6 minutes

| Context Size | Model RAM | Cache RAM | Total | Free RAM |
|---|---|---|---|---|
| 32K | 15 GB | 4-6 GB | ~21 GB | ~27 GB |
| 64K | 15 GB | 8-12 GB | ~27 GB | ~21 GB |
Recommendation: Use 64K if possible. If RAM is tight, use 32K or switch to Copilot.
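Before picking a context size, you can check both numbers from the table on your own machine. A macOS sketch; the 16 KB page size is an assumption that holds for Apple Silicon (M-series) Macs:

```bash
# Total installed RAM in GB
sysctl -n hw.memsize | awk '{printf "Total RAM: %.0f GB\n", $1 / 1073741824}'

# Approximate reclaimable RAM (free + inactive pages x 16 KB page size)
vm_stat | awk '/Pages (free|inactive)/ {sum += $NF} END {printf "Approx free: %.1f GB\n", sum * 16384 / 1073741824}'
```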
| Model | Size | SWE-bench | Use Case |
|---|---|---|---|
| devstral-small-2 (default) | 24B | 68% | Best agentic coding |
| ibm/granite4:small-h | 32B (9B active) | ~62% | Long context, 70% less VRAM |
| qwen3-coder:30b | 30B | 85% | Highest accuracy (needs template work) |
cco
ERROR: Model devstral-small-2 not found
Pull it with: ollama pull devstral-small-2

Cause: Model not installed or model name mismatch.
- Pull the recommended model:

  ollama pull devstral-small-2

- Or pull the backup model for long context:

  ollama pull ibm/granite4:small-h

- Check installed models:

  ollama list

- Override the model if needed:

  OLLAMA_MODEL=ibm/granite4:small-h cco
cco
Detected a custom API key in your environment
ANTHROPIC_API_KEY: <YOUR_API_KEY>
Do you want to use this API key?
1. Yes
2. No (recommended) ✓

Cause: ANTHROPIC_API_KEY set in the script triggers Claude Code's API-key validation.
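Before editing anything, confirm where the variable is coming from (a quick shell check):

```bash
# List every ANTHROPIC_* variable currently exported in this shell
env | grep '^ANTHROPIC'
# If ANTHROPIC_API_KEY shows up here, it comes from your shell profile
# or from the claude-switch script itself
```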
Already fixed in v1.1.0+. If you are using an older version:

- Edit ~/bin/claude-switch
- In the _run_ollama() function, remove:

  export ANTHROPIC_API_KEY="ollama"

- Keep only:

  export ANTHROPIC_BASE_URL="http://localhost:11434"
  export ANTHROPIC_AUTH_TOKEN="ollama"
ccc-gpt # or COPILOT_MODEL=gpt-5.2-codex ccc
ERROR Failed to create chat completions Response { status: 400
ERROR HTTP error: { error:
{ message: 'model gpt-5.2-codex is not accessible via the /chat/completions endpoint',
code: 'unsupported_api_for_model' } }

Status (Feb 2026): The official v0.7.0 release has been stalled since October 2025. The caozhiyuan fork (v1.3.1) is now the maintained version.

⚠️ Issue #191: Risk of GitHub API breakage (new API format). Keep an eye on it. Recommendation: use the fork via ccunified for all use cases.

Cause: ALL models in the GPT Codex family require OpenAI's /responses endpoint (launched October 2025) instead of the standard /chat/completions. copilot-api (v0.7.0) only supports /chat/completions, which makes ALL Codex models incompatible:
- ❌ gpt-5.2-codex (GA since January 14, 2026)
- ❌ gpt-5.1-codex (Preview)
- ❌ gpt-5.1-codex-mini (Preview)
- ❌ gpt-5-codex (Preview)

Technical cause: Codex models use a stateful paradigm with previous_response_id for context, which is incompatible with the classic Chat Completions API (see the sketch below).
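For reference, the two call styles differ roughly as sketched here. These curl calls are illustrative only: they target the OpenAI API directly, not copilot-api, and the response ID is made up:

```bash
# Chat Completions (stateless): the full message history is resent each turn
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4.1", "messages": [{"role": "user", "content": "1+1"}]}'

# Responses API (stateful): a turn can point at the previous server-side response
curl https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-codex", "input": "now multiply it by 3", "previous_response_id": "resp_abc123"}'
```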
Option 1: Use compatible GPT models (Recommended)

# GPT models 100% compatible with copilot-api:
COPILOT_MODEL=gpt-4.1 ccc       # Balanced, 0x premium (included)
COPILOT_MODEL=gpt-5 ccc         # Advanced reasoning, 1x premium
COPILOT_MODEL=gpt-5-mini ccc    # Ultra fast, 0x premium

Option 2: Use Claude via Copilot (100% compatible)

ccc-sonnet   # Claude Sonnet 4.6 (default, reliable)
ccc-opus     # Claude Opus 4.6 (best quality)
ccc-haiku    # Claude Haiku 4.5 (ultra fast)

Option 3: Track the fix's development

Community PR ericc-ch/copilot-api#117 is working on support for the /responses endpoint.
| Model | Status | Use |
|---|---|---|
| claude-sonnet-4-6 | ✅ Works | Daily development |
| claude-opus-4-6 | ✅ Works | Critical code |
| claude-haiku-4.5 | ✅ Works | Quick questions |
| gpt-4.1 | ✅ Works | General use, 0x premium |
| gpt-5 | ✅ Works | Advanced reasoning, 1x premium |
| gpt-5-mini | ✅ Works | Ultra fast, 0x premium |
| gemini-3-pro-preview | ✅ Works | Alternative |
| grok-code-fast-1 | ✅ Works | Specialized code |
| raptor-mini | ✅ Works | Lightweight |
| gpt-5.2-codex | ❌ Incompatible | Requires the /responses endpoint |
| gpt-5.1-codex | ❌ Incompatible | Requires the /responses endpoint |
| gpt-5.1-codex-mini | ❌ Incompatible | Requires the /responses endpoint |
| gpt-5-codex | ❌ Incompatible | Requires the /responses endpoint |

# To replace (deprecated since February 17, 2026):
claude-opus-4.1 → claude-opus-4-6
gemini-2.5-pro → gemini-3-pro-preview

ccc
❯ 1+1
API Error: 400 Bad Request
{"error":{"message":"x-anthropic-billing-header is a reserved keyword and may not be used in the system prompt",
"code":"invalid_request_body"}}Claude Code v2.1.15+ injecte la chaîne x-anthropic-billing-header dans son system prompt pour le tracking billing interne. L'API Anthropic (via copilot-api proxy) rejette cette requête car c'est un mot-clé réservé qui ne peut pas apparaître dans les prompts utilisateur.
Cause technique : Anthropic réserve certains mots-clés (comme x-anthropic-billing-header) pour son infrastructure interne. Quand Claude Code inclut ces mots dans le system prompt, l'API les détecte et rejette la requête pour éviter les conflits.
Issue GitHub : ericc-ch/copilot-api#174 (ouverte le 22 janvier 2026)
- ✅ Anthropic Direct (
ccd) : Non affecté (API native gère correctement) - ❌ Copilot via copilot-api (
ccc) : AFFECTÉ (proxy rejette le header) - ✅ Ollama Local (
cco) : Non affecté (pas d'API Anthropic)
Option 1: Use Anthropic Direct (Recommended) ⭐

ccd   # Native Anthropic API, handles x-anthropic-billing-header correctly

Advantages:

- ✅ 100% compatible with all Claude Code versions
- ✅ Best quality (no proxy)
- ✅ Official Anthropic support

Disadvantages:

- 💰 Paid (per-token billing)

Option 2: Use Ollama Local

cco   # 100% private, no Anthropic API

Advantages:

- ✅ Free and unlimited
- ✅ 100% private (no data leaves the machine)
- ✅ Unaffected by Anthropic API issues

Disadvantages:

- 🐌 Slower than cloud (see M4 Pro Optimization)

Option 3: Wait for a copilot-api fix

The issue is actively tracked on GitHub. Possible solutions in development:

- Automatic filtering of the reserved header by copilot-api
- A Claude Code patch that excludes the header when using proxies
- Anthropic API configuration to accept the header via proxies

Tracking: ericc-ch/copilot-api#174

A user reported that manually removing x-anthropic-billing-header from the system message works around the error, but:

❌ Do NOT use this: modifying the system prompt breaks the Claude Code session
❌ Fragile: it will break with every Claude Code update
❌ Complex: it requires intercepting and modifying requests

Prefer Options 1 or 2 above.
If you see this error sporadically:

# Check the Claude Code version
claude --version
# If v2.1.15+, the problem is present

# Check recent logs
tail -50 ~/.claude/claude-switch.log | grep "400\|billing"

# Test with Anthropic Direct
ccd
❯ 1+1
# If it works → confirms the problem comes from copilot-api

Community user @mrhanhan proposed a working patch that automatically filters the reserved header.
# Find the copilot-api installation
which copilot-api
# → /Users/YOU/.nvm/versions/node/vXX.XX.X/bin/copilot-api

# The file to modify is dist/main.js
# Example: ~/.nvm/versions/node/v22.18.0/lib/node_modules/copilot-api/dist/main.js

cd ~/.nvm/versions/node/v22.18.0/lib/node_modules/copilot-api/dist
cp main.js main.js.backup
echo "✅ Backup created: main.js.backup"

Edit dist/main.js and find the translateAnthropicMessagesToOpenAI function (around line 897).

Before (original):
function translateAnthropicMessagesToOpenAI(anthropicMessages, system) {
const systemMessages = handleSystemPrompt(system);
const otherMessages = anthropicMessages.flatMap((message) =>
message.role === "user" ? handleUserMessage(message) : handleAssistantMessage(message)
);
return [...systemMessages, ...otherMessages];
}

After (patched):
function translateAnthropicMessagesToOpenAI(anthropicMessages, system) {
let systemMessages = handleSystemPrompt(system);
// FIX #174: Filter x-anthropic-billing-header from system prompt
systemMessages = systemMessages.map((it) => {
if (typeof it.content === "string" && it.content.startsWith("x-anthropic-billing-header")) {
it.content = it.content.replace(
/x-anthropic-billing-header: \?cc_version=.+; \?cc_entrypoint=\\+\n{0,2}\./,
""
);
console.info('Filtered x-anthropic-billing-header from system message');
}
return it;
});
const otherMessages = anthropicMessages.flatMap((message) =>
message.role === "user" ? handleUserMessage(message) : handleAssistantMessage(message)
);
return [...systemMessages, ...otherMessages];
}

Changes made:

- const systemMessages → let systemMessages (line 2)
- Added the systemMessages.map() filter (lines 3-12)
- A confirmation log when the header is filtered

# Stop the current process
kill $(ps aux | grep "copilot-api start" | grep -v grep | awk '{print $2}')

# Restart with the patch
copilot-api start

Automated test:
A test script is available in the cc-copilot-bridge project:

# In the cc-copilot-bridge repo
./scripts/test-billing-header-fix.sh

Manual test:

# Test 1: Request with the billing header
curl -s -X POST http://localhost:4141/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 100,
    "system": "x-anthropic-billing-header: test\n\nYou are helpful.",
    "messages": [{"role": "user", "content": "Say hello"}]
  }' | jq '.content[0].text // .error'
# Expected result: a normal response (no 400 error)

Test with Claude Code:

ccc
❯ 1+1

Expected result: a normal response, without the invalid_request_body error.

In the terminal where copilot-api is running, you should see:

Filtered x-anthropic-billing-header from system message

each time Claude Code sends a request containing the reserved header.
If the patch causes problems:

# Stop copilot-api
kill $(ps aux | grep "copilot-api start" | grep -v grep | awk '{print $2}')

# Restore the backup
cd ~/.nvm/versions/node/v22.18.0/lib/node_modules/copilot-api/dist
cp main.js.backup main.js

# Restart
copilot-api start

Limitations:

- ❌ Will be overwritten by every npm update copilot-api
- ❌ Not tested on all copilot-api versions
- ❌ May not cover every edge case

v1.7.0 note: Solution 1 (environment variable) remains the most reliable. The regex patch (Solution 2) is overwritten by npm updates.
After a copilot-api update:

# Check whether the patch is still in place
grep -n "FIX #174" ~/.nvm/versions/node/v22.18.0/lib/node_modules/copilot-api/dist/main.js
# If empty → re-apply the patch

Watch copilot-api#174 for an official fix in a future release.

Once the fix is officially merged:

npm update -g copilot-api   # Update
# The manual patch is no longer needed

GitHub is changing its internal API. The official copilot-api (v0.7.0, stalled) could break.

Status: being monitored (February 2026)
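A quick way to tell which build you are actually running (npm ls -g only reports npm-installed copies):

```bash
# Path the binary resolves from, and the installed package version
which copilot-api
npm ls -g copilot-api
```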
- Use the caozhiyuan fork v1.3.1 via ccunified
- The fork is more actively maintained and reacts faster to breaking changes

# Terminal 1: Launch the maintained fork
ccunified

# Terminal 2: Normal usage
ccc-sonnet   # or any other model

Tracking: ericc-ch/copilot-api#191
When you use GPT-4.1, you see API errors in Claude Code:
COPILOT_MODEL=gpt-4.1 ccc
❯ 1+1
API Error: 400 {"error":{"message":"Invalid schema for function 'mcp__grepai__grepai_index_status':
In context=(), object schema missing properties.","code":"invalid_function_parameters"}}

These errors also appear in the copilot-api logs (the terminal where copilot-api start is running):
ERROR HTTP error: { error:
{ message:
"Invalid schema for function 'mcp__grepai__grepai_index_status': In context=(), object schema missing properties.",
code: 'invalid_function_parameters' } }
GPT-4.1 applies strict JSON schema validation to MCP (Model Context Protocol) tools. Some MCP servers ship incomplete or invalid schemas that pass with Claude (permissive) but fail with GPT-4.1.

Typical problem: a schema declared as "type": "object" without a "properties": {} definition - fine for Claude's permissive parser, but rejected by GPT-4.1's strict validation (see the sketch after this list).

Known problematic MCP servers:

- ❌ grepai: grepai_index_status - object schema missing properties
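For illustration, here is the shape of the problem on a hypothetical MCP tool declaration (a sketch, not the actual grepai source):

```jsonc
// Rejected by GPT-4.1's strict validation: "object" with no "properties"
{ "name": "grepai_index_status", "inputSchema": { "type": "object" } }

// Accepted: the empty property map is declared explicitly
{ "name": "grepai_index_status", "inputSchema": { "type": "object", "properties": {} } }
```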
Option 1: Use Claude (100% MCP compatible) ⭐ Recommended

ccc-sonnet   # Claude Sonnet 4.6 (default)
ccc-opus     # Claude Opus 4.6 (best quality)
ccc-haiku    # Claude Haiku 4.5 (ultra fast)

Advantages:

- ✅ 100% compatible with all MCP servers
- ✅ Permissive validation (accepts imperfect schemas)
- ✅ Better quality than GPT-4.1
Option 2: Disable the problematic MCP server

Edit ~/.claude/claude_desktop_config.json:
{
"mcpServers": {
// "grepai": { <- Commenté ou supprimé
// "command": "grepai",
// "args": ["mcp-serve"]
// },
"playwright": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-playwright"]
}
}
}

Then restart Claude Code. (Strictly speaking, JSON does not support comments, so actually delete the disabled entry rather than leaving // lines in the file.)
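To confirm which servers remain configured after the edit (assumes jq is installed):

```bash
# Print the names of all configured MCP servers
jq '.mcpServers | keys' ~/.claude/claude_desktop_config.json
```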
Option 3: Report the problem to the maintainer

If an MCP server you use has this problem, open an issue:

- grepai: https://github.com/grepAI/grepai/issues
- Other servers: look up the server's GitHub repo

When you see the "Invalid schema for function 'mcp__...'" error, run the diagnostic:
# Check all configured MCP servers
mcp-check.sh

# With analysis of recent Claude Code logs
mcp-check.sh --parse-logs

The script identifies the problematic MCP server and suggests solutions.

Example output:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
MCP Server Compatibility Check
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Found 3 MCP server(s) configured:
━━━ grepai ━━━
Command: grepai mcp-serve
✓ Command installed
✗ Known compatibility issue:
grepai_index_status: object schema missing properties
Impact: Fails with GPT-4.1 (strict validation)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Summary
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Servers checked: 3
Compatibility issues: 1
═══ Recommendations ═══
Option 1: Use Claude models (100% MCP compatible)
ccc-sonnet # Claude Sonnet 4.6
ccc-opus # Claude Opus 4.6
Option 2: Disable problematic MCP servers
Edit: /Users/you/.claude/claude_desktop_config.json
Remove or comment out problematic servers
Option 3: Report issue to MCP server maintainer
Example: https://github.com/grepAI/grepai/issues
| Aspect | Claude | GPT-4.1 |
|---|---|---|
| MCP validation | Permissive | Strict |
| Imperfect schemas | ✅ Accepted | ❌ Rejected |
| Compatibility | 100% | ~80% (depends on servers) |
| Recommendation | ⭐ Default | Backup when Claude is unavailable |
- 100% compatible with the /chat/completions endpoint
- 100% compatible with MCP tools (permissive validation)
- No breaking changes on MCP updates
- Best generated-code quality

Conclusion: for a friction-free experience with copilot-api, use Claude by default.
When using Gemini models with tool-calling (file creation, MCP tools, etc.), you experience:
COPILOT_MODEL=gemini-3-pro-preview ccc
❯ Create a file called hello.txt with "test"
# Possible symptoms:
# 1. No response or timeout
# 2. Error: model_not_supported
# 3. Error: invalid_request_body
# 4. Error: INVALID_ARGUMENT
# 5. File not created despite "success" message

Simple prompts work fine:
COPILOT_MODEL=gemini-3-pro-preview ccc -p "1+1"
✅ Returns: 2

Agentic prompts fail:
COPILOT_MODEL=gemini-3-pro-preview ccc -p "Create hello.txt"
❌ No file created, errors in logs

Root cause: copilot-api translates Claude's tool-calling format → OpenAI format → Gemini format. This translation chain introduces incompatibilities:
- Tool format mismatch: Claude uses the Anthropic tool schema, while Gemini expects a Google-specific format (illustrated after this list)
- Subagent calls: Claude Code spawns subagents (the Task tool) that may not work correctly with Gemini
- Preview model instability: gemini-3-*-preview models are experimental and may have incomplete tool support
Issue reference: copilot-api#151
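To make the tool-format mismatch concrete, here is the same minimal tool declared in each dialect. A sketch based on the three public API docs; the write_file tool itself is hypothetical:

```jsonc
// 1. Anthropic format (what Claude Code emits)
{ "name": "write_file", "input_schema": { "type": "object", "properties": { "path": { "type": "string" } } } }

// 2. OpenAI format (copilot-api's intermediate representation)
{ "type": "function", "function": { "name": "write_file", "parameters": { "type": "object", "properties": { "path": { "type": "string" } } } } }

// 3. Gemini format (what the model ultimately expects)
{ "functionDeclarations": [{ "name": "write_file", "parameters": { "type": "object", "properties": { "path": { "type": "string" } } } }] }
```

Each hop renames fields and re-nests the schema, and anything one dialect lacks gets dropped or guessed along the way - which is exactly where agentic calls break.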
Run automated diagnostic to identify the exact issue:
# In cc-copilot-bridge project
cd /path/to/cc-copilot-bridge
# Run test suite
./scripts/test-gemini.sh
# View results
cat debug-gemini/summary.txt
cat debug-gemini/diagnostic-report.md

Diagnostic tests:
| Test | Scenario | What It Checks |
|---|---|---|
| 1 | Simple calculation | Baseline (non-agentic) |
| 2 | File creation | Direct tool calling |
| 3 | MCP grep tool | MCP schema compatibility |
| 4 | Subagent workaround | If routing through GPT fixes issue |
| 5 | Gemini 2.5 stable | Stable model comparison |
Decision tree:
Test 1 fails → copilot-api auth/config issue
Test 2 fails, Test 1 OK → Tool format incompatibility
Test 3 fails → MCP schema validation issue (see "MCP Schema Validation Error" section)
Test 4 succeeds, Test 2 fails → Confirms subagent routing fixes issue
Test 5 succeeds, Test 2 fails → Gemini 3 preview limitation
Option 0: Use Unified Fork (RECOMMENDED for Gemini 3) ✅
The unified fork combines PR #167 (Gemini 3 thinking support) + PR #170 (Codex /responses).
⚠️ Note: PR #167 covers Gemini 3 thinking responses (thought_signature, reasoning_opaque). This is NOT the same as fixing tool-calling format translation. The core issue (Claude → OpenAI → Gemini format) may still exist.
# Terminal 1: Launch unified fork
ccunified
# OR
~/Sites/perso/cc-copilot-bridge/scripts/launch-unified-fork.sh
# Terminal 2: Test Gemini 3
ccc-gemini3 # gemini-3-flash-preview
ccc-gemini3-pro # gemini-3-pro-preview

What to test:
# 1. Baseline (should work)
ccc-gemini3 -p "1+1"
# 2. Agentic mode (uncertain - please report results!)
ccc-gemini3
❯ Create a file test.txt with "hello"
# Check: Was the file created?

Pros:
- ✅ Adds Gemini 3 thinking response support
- ✅ Also supports Codex models (tested, working)
- ✅ Auto-clones and updates fork
- ✅ Agentic mode supported: tool calling improved in fork v1.3.1

Cons:

- ⚠️ Requires running the fork instead of the official copilot-api
- ⚠️ Fork maintenance depends on the community
Source: caozhiyuan/copilot-api branch 'all'
Option 1: Use Stable Gemini 2.5 Pro
COPILOT_MODEL=gemini-2.5-pro ccc
# OR
ccc-gemini # Alias for gemini-2.5-pro
❯ Create hello.txt with "test"
✅ Works reliably for most scenarios

Pros:
- ✅ More stable than preview models
- ✅ Better tool calling support
- ✅ Production-ready
Cons:

- ⚠️ May still have occasional issues with complex multi-tool workflows
- ⚠️ Deprecated: February 17, 2026 (already past) → migrate to gemini-3-pro-preview
Option 2: Use Subagent Workaround (Gemini 3 Preview)
For preview models, route complex operations through a stable subagent:
# Manual usage
COPILOT_MODEL=gemini-3-pro-preview CLAUDE_CODE_SUBAGENT_MODEL=gpt-5-mini ccc
❯ Create hello.txt with "test"
✅ Subagent (GPT-5-mini) handles tool calls

How it works:
- Main agent: Gemini 3 (planning, reasoning)
- Subagent: GPT-5-mini (tool execution)
- When Claude Code spawns Task tool → uses GPT instead of Gemini
Pros:
- ✅ Keeps Gemini 3 for main reasoning
- ✅ Stable tool execution via GPT
- ✅ No need to disable MCP servers
Cons:

- ⚠️ Slight latency increase (two models involved)
- ⚠️ Mixed-model behavior
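If you use this combination regularly, a small shell wrapper keeps it to one command. A minimal sketch - the ccc-gemini3-safe name is hypothetical, not a claude-switch alias:

```bash
# Add to ~/.zshrc: Gemini 3 plans, GPT-5-mini executes tool calls
ccc-gemini3-safe() {
  COPILOT_MODEL=gemini-3-pro-preview \
  CLAUDE_CODE_SUBAGENT_MODEL=gpt-5-mini \
  ccc "$@"
}
```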
Option 3: Use Claude Models (100% Compatible) 🚀 Best Quality
ccc-sonnet # Claude Sonnet 4.6 (default, balanced)
ccc-opus # Claude Opus 4.6 (best quality)
ccc-haiku # Claude Haiku 4.5 (fastest)
❯ Create hello.txt with "test"
✅ Works flawlessly, no workarounds needed

Pros:
- ✅ 100% tool calling compatibility
- ✅ Best agentic performance
- ✅ No translation issues (native Anthropic format)
- ✅ Best code quality
Cons:
- None (Claude via Copilot is the gold standard)
Option 4: Use GPT Models (Reliable Alternative)
COPILOT_MODEL=gpt-4.1 ccc # Balanced, 0x premium
COPILOT_MODEL=gpt-5 ccc # Advanced reasoning, 1x premium
COPILOT_MODEL=gpt-5-mini ccc # Ultra fast, 0x premium
❯ Create hello.txt with "test"
✅ Reliable tool calling (with MCP exclusions if needed)

Pros:
- ✅ Stable tool calling
- ✅ Good agentic performance
- ✅ Fast responses
Cons:

- ⚠️ MCP schema validation (see the "MCP Schema Validation Error" section)
- ⚠️ May need to disable the grepai MCP server
The subagent workaround can be automated in claude-switch script:
# In ~/bin/claude-switch, add to _run_copilot() function:
# Gemini workaround: auto-set subagent for preview models
if [[ "$COPILOT_MODEL" == gemini-3-*-preview ]]; then
export CLAUDE_CODE_SUBAGENT_MODEL="${CLAUDE_CODE_SUBAGENT_MODEL:-gpt-5-mini}"
_log "INFO" "Gemini preview detected: subagent=$CLAUDE_CODE_SUBAGENT_MODEL"
fi

Benefits:
- 🔄 Automatic workaround activation
- 📝 Logged in session logs
- 🎯 Only affects Gemini preview models
| Model | Simple Prompts | Agentic/Tools | Status | Recommendation |
|---|---|---|---|---|
| claude-sonnet-4-6 | ✅ Excellent | ✅ Excellent | Stable | ⭐ Best choice |
| claude-opus-4-6 | ✅ Excellent | ✅ Excellent | Stable | ⭐ Best quality |
| gpt-4.1 | ✅ Excellent | ✅ Good | Stable | ✅ Reliable |
| gpt-5 | ✅ Excellent | ✅ Good | Stable | ✅ Advanced reasoning |
| gemini-2.5-pro | ✅ Good | ⚠️ Occasional failures | Deprecated Feb 17, 2026 (past) | Migrate to gemini-3-pro-preview |
| gemini-3-pro-preview | ✅ Good | ✅ Supported (via fork v1.3.1) | Preview | Use the fork (ccunified) |
| gemini-3-flash-preview | ✅ Good | ✅ Supported (via fork v1.3.1) | Preview | Use the fork (ccunified) |
Gemini 3 Preview Models:

- ❌ Direct tool calling unreliable
- ❌ Subagent spawning may fail
- ❌ MCP tool execution inconsistent
- ⚠️ File operations may silently fail

Gemini 2.5 Pro (deprecated since February 17, 2026):

- ⚠️ Occasional tool calling failures
- ⚠️ Complex multi-tool workflows problematic
- ⚠️ Deprecation date: February 17, 2026 (already past)
Recommended Migration Path:
Current: gemini-2.5-pro
↓
Short-term: gemini-2.5-pro + monitor stability
↓
If issues: Switch to claude-sonnet-4-6 (ccc-sonnet)
↓
When stable: Migrate to gemini-3-pro-preview (with subagent)
If you prefer manual testing:
# Create test directory
cd /tmp && mkdir -p gemini-test && cd gemini-test
# Test 1: Baseline (should work)
COPILOT_MODEL=gemini-3-pro-preview ccc -p "Calculate 1+1"
# Test 2: File creation (may fail)
COPILOT_MODEL=gemini-3-pro-preview ccc -p "Create hello.txt with 'test'"
# Test 3: With subagent workaround (should work)
COPILOT_MODEL=gemini-3-pro-preview CLAUDE_CODE_SUBAGENT_MODEL=gpt-5-mini \
ccc -p "Create hello2.txt with 'test'"
# Test 4: Stable model (should work)
COPILOT_MODEL=gemini-2.5-pro ccc -p "Create hello3.txt with 'test'"
# Verify files created
ls -la hello*.txt

If you see errors, check the copilot-api logs:
# Terminal 1: Start copilot-api in verbose mode
pkill -f copilot-api || true
copilot-api start -v 2>&1 | tee copilot-api-verbose.log
# Terminal 2: Run tests
# ... execute tests ...
# Terminal 1: Look for errors
grep -iE "(error|invalid|model_not_supported)" copilot-api-verbose.logCommon error patterns:
ERROR HTTP error: { error: { message: 'model_not_supported' } }
ERROR Invalid schema for function 'mcp__...'
ERROR INVALID_ARGUMENT: tool_config.function_calling_config ...
- Verify copilot-api is running:

  nc -z localhost 4141 && echo "✅ Running" || echo "❌ Not running"

- Check model availability:

  # In the copilot-api logs, you should see:
  # Available models: claude-*, gpt-*, gemini-*

- Test with a working model first:

  # Establish a baseline with Claude
  ccc-sonnet -p "1+1"
  # If this fails → copilot-api issue, not Gemini-specific

- Run the automated diagnostic:

  ./scripts/test-gemini.sh
  cat debug-gemini/diagnostic-report.md

- Analyze the logs:

  ./scripts/analyze-copilot-logs.sh debug-gemini/copilot-api-verbose.log
For Production Code:

ccc-sonnet   # 100% reliable, best quality

For Experimentation with Gemini:

# Use the subagent workaround
COPILOT_MODEL=gemini-3-pro-preview CLAUDE_CODE_SUBAGENT_MODEL=gpt-5-mini ccc

For Quick Tasks:

ccc-haiku   # Fast, reliable, no Gemini complexity

See also:

- copilot-api Issue #151 - Gemini model compatibility
- Gemini API Tool Calling Docs
- scripts/test-gemini.sh - Automated diagnostic suite
- debug-gemini/README.md - Testing workspace documentation
Symptom:
ccc
ERROR: copilot-api not running on :4141
Start it with: copilot-api start

Solution:
# Start copilot-api
copilot-api start
# If OAuth expired, re-authenticate
copilot-api stop
copilot-api start

Symptom:
cco
ERROR: Ollama not running on :11434
Start it with: ollama serve

Solution:
# Check Homebrew service
brew services info ollama
# Restart if needed
brew services restart ollama
# Verify it's running
curl http://localhost:11434/api/tags

ollama ps
NAME              SIZE     PROCESSOR
devstral-small-2  12 GB    50% GPU    # Should be ~15 GB, 100% GPU

Cause: Not enough RAM, or the model is not fully loaded.
- Check available RAM:

  # macOS
  vm_stat | grep free
  # Should have 20GB+ free for devstral-small-2

- Use Granite4 if RAM-limited (70% less VRAM thanks to its hybrid Mamba architecture):

  ollama pull ibm/granite4:small-h
  OLLAMA_MODEL=ibm/granite4:small-h cco

- Close other applications to free RAM.
claude-switch: permission denied

chmod +x ~/bin/claude-switch
chmod +x ~/bin/ollama-check.sh
chmod +x ~/bin/ollama-optimize.sh

After running ollama-optimize.sh, performance is unchanged.

Cause: Environment variables were set, but the service was not restarted.
# Restart Ollama service
brew services restart ollama
# Verify variables are set
launchctl getenv OLLAMA_FLASH_ATTENTION # Should show: 1
launchctl getenv OLLAMA_CONTEXT_LENGTH # Should show: 8192
# Wait 10 seconds for service to start
sleep 10
# Test
ollama ps

cco
❯ who are you?
⏺ I am Claude, created by Anthropic...

But you're using Devstral (not Claude).
Not a bug. The local model sees Claude Code's system prompt and adopts that identity. This is normal behavior for instruction-tuned models.
It doesn't affect:
- Code quality
- Performance
- Functionality
The model is still Devstral (or Granite4), just confused about its identity from the prompt.
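If you want to verify which model is actually serving the session, ask Ollama instead of the model:

```bash
# The loaded model is whatever Ollama reports, not what the chat claims
ollama ps
# NAME column → devstral-64k / ibm/granite4:small-h, not "Claude"
```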
tail -5 ~/.claude/claude-switch.log

Look for:
[INFO] Provider: Ollama Local - Model: devstral-64k
[INFO] Session started: mode=ollama:...
# Find the Claude Code PID
ps aux | grep "claude --model"

# Check its environment (replace PID; macOS has no /proc, so use ps)
ps eww PID | tr ' ' '\n' | grep ANTHROPIC

Expected for Ollama:

ANTHROPIC_BASE_URL=http://localhost:11434
ANTHROPIC_AUTH_TOKEN=ollama
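As an extra sanity check, hit the base URL directly; /api/version is part of Ollama's standard HTTP API:

```bash
# Any JSON reply means the endpoint Claude Code targets is reachable
curl -s http://localhost:11434/api/version
# → {"version":"0.5.7"}  (the version number will vary)
```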
# Terminal 1: Monitor Ollama port
sudo tcpdump -i lo0 -A 'tcp port 11434'
# Terminal 2: Launch cco and send prompt
# You should see JSON traffic in Terminal 1

If issues persist:
- Run a full diagnostic:

  ollama-check.sh > diagnostic.txt

- Check provider status:

  ccs   # Status of all providers

- Review the logs:

  tail -50 ~/.claude/claude-switch.log

- Report the issue with:
  - Output of ollama-check.sh
  - Last 20 lines of ~/.claude/claude-switch.log
  - Output of ollama ps
  - Your project size (file count)
- README.md - Full documentation
- OPTIMISATION-M4-PRO.md - Performance optimization guide
- STATUS.md - Implementation status
- COMMANDS.md - Command reference
- FAQ - Frequently asked questions
- Decision Trees - Choose the right command/model
- Best Practices - Strategic usage patterns
- Security Guide - Privacy and data flow
- Quick Start Guide - Installation guide
Back to: Documentation Index | Main README