

Claude Code has rapidly emerged as a powerful, unopinionated command-line tool for agentic coding. Developed as an internal research project at Anthropic, it lets developers integrate Claude directly into their terminal workflows to automate complex tasks like refactoring, testing, and even managing git operations. While natively built for Anthropic’s Claude 3.5 and 4.0 series models, many power users are discovering that the tool is flexible enough to accommodate alternative models through an LLM gateway or API proxy.
This guide explores how to break free from the default configuration and leverage a variety of models—ranging from OpenAI's GPT series to local models running on your own hardware—within the Claude Code environment.
What Claude Code Is and How It Uses LLMs
Claude Code is a REPL (Read-Eval-Print Loop) that acts as an agentic assistant, meaning it doesn't just suggest code; it can execute shell commands, read and write files, and orchestrate subagents to solve complex problems.
How it connects to LLMs
By default, Claude Code connects to Anthropic’s official API. It determines which features to enable based on the API format it receives, primarily looking for the Anthropic Messages format (e.g., /v1/messages). It uses these models to reason through a codebase, gather context automatically from files like CLAUDE.md, and use tools via the Model Context Protocol (MCP).
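For illustration, here is roughly what a request in that format looks like; any alternative endpoint you substitute must accept this same shape (the model alias and prompt below are just placeholders):
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-3-5-sonnet-latest", "max_tokens": 1024, "messages": [{"role": "user", "content": "Explain this repo"}]}'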
Why users might want alternative models
While Claude 3.5 Sonnet is the default for its balance of speed and reasoning, developers often seek alternatives for distinct reasons. Cost Management is a primary driver, as third-party gateways can offer flexible usage-based pricing. Others prioritize Context Management, routing background tasks to different models to keep the main session uncluttered. Additionally, Privacy and Local Development needs often lead developers to run a local LLM with Claude Code, ensuring proprietary code never leaves their machine.
Why Use a Different LLM with Claude Code
For power users, routing Claude Code requests to alternative backends offers distinct functional advantages. One of the most immediate benefits is Speed; smaller models like GPT-4o-mini or Claude 3.5 Haiku can process simple tasks, such as writing git commit messages, much faster than their larger counterparts.
Beyond raw speed, these configurations offer Enhanced Features such as automatic failover and retry logic provided by certain gateways, ensuring coding sessions remain uninterrupted. Experimentation also becomes significantly more accessible, since expensive tasks and long vibe-coding sessions are more affordable when backed by cheaper or local OSS models like Qwen3 Coder or Mistral-Small. Finally, for organizations, this approach provides Centralized Control, enabling teams to use gateways like LiteLLM for auditing, logging, and budget tracking across multiple engineers.
Ways to Use a Different LLM with Claude Code
To use a non-Anthropic model, you must provide Claude Code with an endpoint that mimics the Anthropic API. There are four primary methods to achieve this:
1. LLM Gateways
Gateways like OpenRouter, LLMGateway, and ZenMux provide a unified API that translates various model outputs into the Anthropic-compatible format. They offer the advantage of minimal setup and access to hundreds of models with built-in load balancing, though they do introduce a third-party dependency and minor latency.
2. API Proxies
Tools like LiteLLM or custom Python scripts act as a local proxy between Claude Code and other API providers. This method provides high flexibility and supports in-session model switching across multiple providers, though it requires maintaining a secondary proxy process in the background.
3. Local Models
By using Ollama or LM Studio, you can run models directly on your hardware for total privacy and zero per-token costs. While this works offline, it demands significant local compute power (GPU/RAM), and developers should note that local models may occasionally lack the high-level agentic intelligence found in flagship cloud models.
4. Hosted Alternatives
Services like Z.AI or Moonshot AI provide direct, Anthropic-compatible endpoints specifically designed for Claude Code. These are often much cheaper than standard API rates and require no proxy, though users are limited to the specific models hosted by that provider.
Step-by-Step: Using a Different LLM with Claude Code
The core of any Claude Code LLM configuration involves overriding two environment variables: ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN.
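As a minimal sketch, assuming a hypothetical Anthropic-compatible endpoint, the general pattern looks like this:
# General pattern; the endpoint URL is a placeholder
export ANTHROPIC_BASE_URL="https://your-endpoint.example.com"
export ANTHROPIC_AUTH_TOKEN="your_api_key"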
Option A: Using OpenRouter (Direct Connection)
OpenRouter provides an "Anthropic Skin" that speaks the native protocol directly.
1. Set Environment Variables: Initialize your session by pointing the base URL to OpenRouter and clearing any conflicting keys.
export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="your_openrouter_key"
export ANTHROPIC_API_KEY="" # Must be explicitly empty
2. Override the Model Tier: Explicitly define which model OpenRouter should route to for the default Sonnet alias.
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5.2-pro"
3. Start Claude: Launch the tool with the claude command and verify your connection status using /status.
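If the session fails to connect, you can sanity-check that the gateway is reachable and your key is accepted; one quick way is OpenRouter's model listing endpoint:
# Optional sanity check: list models available through the gateway
curl -s https://openrouter.ai/api/v1/models \
  -H "Authorization: Bearer $ANTHROPIC_AUTH_TOKEN" | head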
Option B: Using LiteLLM (Proxy Setup)
LiteLLM is ideal for those who want to switch between local and cloud providers dynamically.
1. Installation & Configuration: Install the proxy with pip install 'litellm[proxy]' and create a config.yaml to map your desired models.
model_list:
  - model_name: custom-sonnet
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
2. Launch & Connect: Start the proxy server using litellm --config config.yaml, then point Claude Code to your local endpoint (usually http://0.0.0.0:4000), as shown in the sketch below.
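A minimal sketch of that final step, assuming the proxy is running on its default port and you have configured a master key for it:
# Point Claude Code at the local LiteLLM proxy (default port 4000)
export ANTHROPIC_BASE_URL="http://0.0.0.0:4000"
export ANTHROPIC_AUTH_TOKEN="your_litellm_master_key"
claude --model custom-sonnet # the model_name defined in config.yaml above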
Option C: Using Local Models via Ollama
1. Point to Localhost: Direct Claude Code to your running Ollama instance by updating the base URL variable.
export ANTHROPIC_BASE_URL="http://localhost:11434"
export ANTHROPIC_API_KEY="ollama"
2. Run with Model Selection: Launch the tool while specifying your local model as the target.
claude --model qwen3-coder
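Note that the model must already exist in your local Ollama library before Claude Code can use it; a quick check, using Ollama's standard CLI:
ollama pull qwen3-coder # downloads the model if it isn't present yet
ollama list             # confirm it appears in your local library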
Configuration Tip: Persistent Setup
To avoid re-exporting these variables every session, add them to your shell profile (~/.zshrc or ~/.bashrc), as in the example below. For more advanced control, you can use apiKeyHelper in your settings.json to handle rotating keys or per-user authentication.
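For example, appending the OpenRouter overrides from Option A to your profile:
# Persist the overrides across shell sessions
cat >> ~/.zshrc <<'EOF'
export ANTHROPIC_BASE_URL="https://openrouter.ai/api"
export ANTHROPIC_AUTH_TOKEN="your_openrouter_key"
export ANTHROPIC_API_KEY=""
EOF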
Limitations and Things to Know
Using alternative models is a powerful workflow enhancement, but it comes with specific technical constraints. Most critically, Tool-Calling Requirement is non-negotiable; Claude Code relies on agentic behaviors to read files and run terminal commands. If your chosen model lacks native tool-calling support, the session will simply fail.
API Compatibility is another hurdle, as gateways must forward specific headers like anthropic-beta to maintain full functionality. Furthermore, Context Window Limits can be an issue—the tool's system prompt alone can exceed 20k tokens, which may overwhelm smaller models. Finally, users should note that MCP Constraints often limit support to HTTP servers in proxy setups, and the general Stability of these unofficial configurations can vary as the Claude Code tool continues to evolve.
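To make the header requirement concrete: anthropic-beta is an ordinary HTTP header that must survive the hop through your proxy. In the sketch below, the gateway URL is a hypothetical placeholder and the beta flag is one example value:
# The gateway must forward beta headers like this one unchanged
curl https://your-gateway.example.com/v1/messages \
  -H "anthropic-beta: prompt-caching-2024-07-31" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_AUTH_TOKEN" \
  -d '{"model": "claude-3-5-sonnet-latest", "max_tokens": 256, "messages": [{"role": "user", "content": "ping"}]}'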
Is Using a Different LLM with Claude Code Worth It?
Whether this path is right for you depends on your technical needs and budget. Developers on a Budget or those hitting Pro plan limits will find significant value in usage-based gateways. Local-First Advocates with sufficient hardware can unlock unparalleled privacy, while Power Users can leverage multi-provider setups to use specialized models for specific tasks.
However, for Engineers needing maximum reliability, the official Anthropic models remain the most thoroughly tested and the most consistent at adhering to complex system prompts. New Users should also likely stick to the defaults, as the learning curve of agentic coding is steep enough without the added complexity of proxy troubleshooting.
Ultimately, the ability to treat Claude Code as an LLM-agnostic digital intern allows you to build more ambitious projects at a fraction of the cost.
FAQ Section
Can Claude Code use models other than Claude?
Yes. By overriding the ANTHROPIC_BASE_URL, you can route requests to any model provider that supports the Anthropic Messages API format.
Does Claude Code support local LLMs?
Yes, though not natively. You can use tools like Ollama or LM Studio to host a local server and then point Claude Code to that local address using environment variables.
Is it safe to route Claude Code through a gateway?
Third-party gateways are not audited by Anthropic. Ensure your gateway provider has a privacy policy that meets your requirements regarding source code logging.
Will Anthropic officially support other LLMs?
There is no official confirmation. However, the tool is designed to be unopinionated, making it easy for developers to integrate their own backends.
Final Thoughts
Claude Code’s unopinionated and flexible architecture transforms it from a simple terminal client into a powerful, model-agnostic platform for agentic coding. While natively optimized for Anthropic’s flagship models, its reliance on standard environment variables allows developers to redirect requests to an expansive ecosystem of gateways, proxies, and self-hosted models.
By integrating alternative models, you can achieve a fine-tuned balance between high-reasoning performance and cost-effective execution. Whether leveraging specialized coding models, using OpenRouter for provider diversity, or running local OSS models via Ollama for total privacy, the ability to switch backends ensures your workflow is never gated by a single provider's limits.
Think of Claude Code as a highly capable specialized toolkit. While the manufacturer provides premium power cells, the tool is designed with a universal port. By adapting different batteries—from high-capacity cloud models to rechargeable local ones—you ensure that your development environment remains powered, efficient, and perfectly tailored to the demands of your project.