Using Third-Party Models (OpenAI / DeepSeek / Local Models)
This project communicates with LLMs via the Anthropic protocol. By using a protocol translation proxy, you can use any model including OpenAI, DeepSeek, Ollama, etc.
How It Works
claude-code-haha ──Anthropic protocol──▶ LiteLLM Proxy ──OpenAI protocol──▶ Target Model API
(translation)This project sends Anthropic Messages API requests. The LiteLLM proxy automatically translates them to OpenAI Chat Completions API format and forwards them to the target model.
Option 1: LiteLLM Proxy (Recommended)
LiteLLM is a unified proxy gateway supporting 100+ LLMs (41k+ GitHub Stars), with native support for receiving Anthropic protocol requests.
1. Install LiteLLM
pip install 'litellm[proxy]'2. Create Configuration File
Create litellm_config.yaml:
Using OpenAI Models
model_list:
- model_name: gpt-4o
litellm_params:
model: openai/gpt-4o
api_key: os.environ/OPENAI_API_KEY
litellm_settings:
drop_params: true # Drop Anthropic-specific params (thinking, etc.)Using DeepSeek Models
model_list:
- model_name: deepseek-chat
litellm_params:
model: deepseek/deepseek-chat
api_key: os.environ/DEEPSEEK_API_KEY
api_base: https://api.deepseek.com
litellm_settings:
drop_params: trueUsing Ollama Local Models
model_list:
- model_name: llama3
litellm_params:
model: ollama/llama3
api_base: http://localhost:11434
litellm_settings:
drop_params: trueUsing Multiple Models (switchable after startup)
model_list:
- model_name: gpt-4o
litellm_params:
model: openai/gpt-4o
api_key: os.environ/OPENAI_API_KEY
- model_name: deepseek-chat
litellm_params:
model: deepseek/deepseek-chat
api_key: os.environ/DEEPSEEK_API_KEY
api_base: https://api.deepseek.com
- model_name: llama3
litellm_params:
model: ollama/llama3
api_base: http://localhost:11434
litellm_settings:
drop_params: true3. Start the Proxy
# Set your target model's API key
export OPENAI_API_KEY=sk-xxx
# or
export DEEPSEEK_API_KEY=sk-xxx
# Start the proxy
litellm --config litellm_config.yaml --port 4000The proxy will listen on http://localhost:4000 and expose an Anthropic-compatible /v1/messages endpoint.
4. Configure This Project
Choose one of two configuration methods:
Method A: Via .env File
ANTHROPIC_AUTH_TOKEN=sk-anything
ANTHROPIC_BASE_URL=http://localhost:4000
ANTHROPIC_MODEL=gpt-4o
ANTHROPIC_DEFAULT_SONNET_MODEL=gpt-4o
ANTHROPIC_DEFAULT_HAIKU_MODEL=gpt-4o
ANTHROPIC_DEFAULT_OPUS_MODEL=gpt-4o
API_TIMEOUT_MS=3000000
DISABLE_TELEMETRY=1
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1Method B: Via ~/.claude/settings.json
{
"env": {
"ANTHROPIC_AUTH_TOKEN": "sk-anything",
"ANTHROPIC_BASE_URL": "http://localhost:4000",
"ANTHROPIC_MODEL": "gpt-4o",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "gpt-4o",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "gpt-4o",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "gpt-4o",
"API_TIMEOUT_MS": "3000000",
"DISABLE_TELEMETRY": "1",
"CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1"
}
}Note: The
ANTHROPIC_AUTH_TOKENvalue can be any string when using the LiteLLM proxy (LiteLLM uses its own configured key for forwarding), unless you've set amaster_keyon the LiteLLM side.
5. Start and Verify
./bin/claude-hahaIf everything is configured correctly, you should see the normal chat interface, with your configured target model handling the requests.
Option 2: Direct Connection to Anthropic-Compatible Services
Some third-party services directly support the Anthropic Messages API, no proxy needed:
OpenRouter
ANTHROPIC_AUTH_TOKEN=sk-or-v1-xxx
ANTHROPIC_BASE_URL=https://openrouter.ai/api/v1
ANTHROPIC_MODEL=openai/gpt-4o
ANTHROPIC_DEFAULT_SONNET_MODEL=openai/gpt-4o
ANTHROPIC_DEFAULT_HAIKU_MODEL=openai/gpt-4o-mini
ANTHROPIC_DEFAULT_OPUS_MODEL=openai/gpt-4o
DISABLE_TELEMETRY=1
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1MiniMax (pre-configured in .env.example)
ANTHROPIC_AUTH_TOKEN=your_token_here
ANTHROPIC_BASE_URL=https://api.minimaxi.com/anthropic
ANTHROPIC_MODEL=MiniMax-M2.7-highspeed
ANTHROPIC_DEFAULT_SONNET_MODEL=MiniMax-M2.7-highspeed
ANTHROPIC_DEFAULT_HAIKU_MODEL=MiniMax-M2.7-highspeed
ANTHROPIC_DEFAULT_OPUS_MODEL=MiniMax-M2.7-highspeed
API_TIMEOUT_MS=3000000
DISABLE_TELEMETRY=1
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1Option 3: Other Proxy Tools
The community has built several proxy tools specifically for Claude Code:
| Tool | Description | Link |
|---|---|---|
| a2o | Anthropic → OpenAI single binary, zero dependencies | |
| Empero Proxy | Full Anthropic Messages API to OpenAI translation | |
| Alma | Client with built-in OpenAI → Anthropic proxy | |
| Chutes | Docker container supporting 60+ open-source models |
Known Limitations
1. drop_params: true Is Essential
This project sends Anthropic-specific parameters (e.g., thinking, cache_control) that don't exist in the OpenAI API. You must set drop_params: true in the LiteLLM config, otherwise requests will fail.
2. Extended Thinking Unavailable
Anthropic's Extended Thinking is a proprietary feature not supported by other models. It is automatically disabled when using third-party models.
3. Prompt Caching Unavailable
cache_control is an Anthropic-specific feature. Prompt caching won't work with third-party models (but won't cause errors — it's silently ignored by drop_params).
4. Tool Calling Compatibility
This project heavily uses tool calling (tool_use). LiteLLM automatically translates Anthropic's tool_use format to OpenAI's function_calling format. This works in most cases, but some complex tool calls may have compatibility issues. If you encounter problems, try using a more capable model (e.g., GPT-4o).
5. Telemetry and Non-Essential Requests
Configure these environment variables to avoid unnecessary network requests:
DISABLE_TELEMETRY=1
CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1FAQ
Q: LiteLLM proxy returns /v1/responses not found?
Some OpenAI-compatible services only support /v1/chat/completions. Add this to your LiteLLM config:
litellm_settings:
use_chat_completions_url_for_anthropic_messages: trueQ: What's the difference between ANTHROPIC_API_KEY and ANTHROPIC_AUTH_TOKEN?
ANTHROPIC_API_KEY→ Sent viax-api-keyheaderANTHROPIC_AUTH_TOKEN→ Sent viaAuthorization: Bearerheader
LiteLLM proxy accepts Bearer Token format by default, so ANTHROPIC_AUTH_TOKEN is recommended.
Q: Can I configure multiple models?
Yes. Define multiple model_name entries in litellm_config.yaml, then switch by changing the ANTHROPIC_MODEL value.
Q: Local Ollama models don't work well?
This project's system prompts and tool calls require strong model capabilities. Use larger models (e.g., Llama 3 70B+, Qwen 72B+). Smaller models may fail to handle tool calling correctly.