OpenClaw + LiteLLM: Use 100+ AI Models via One Proxy
LiteLLM is an open-source AI gateway that gives your OpenClaw agent access to 100+ LLM providers through a single OpenAI-compatible API. This guide covers how to set up LiteLLM with OpenClaw for model routing, load balancing, fallbacks, and cost tracking.
What Is LiteLLM?
LiteLLM is an open-source Python SDK and proxy server by BerriAI that provides a unified interface to call 100+ LLM providers — all using the OpenAI API format. Instead of managing separate API keys and request formats for each provider, you point everything at one LiteLLM endpoint.
With 240M+ Docker pulls and 40K+ GitHub stars, LiteLLM has become one of the most popular AI gateways. It handles authentication, load balancing, cost tracking, and failover automatically.
Why Use LiteLLM with OpenClaw?
OpenClaw already supports multiple providers via OpenRouter. LiteLLM is the alternative when you need:
- Self-hosted proxy — Keep all API traffic within your infrastructure
- Direct provider accounts — Use your own OpenAI, Anthropic, or cloud provider API keys without a middleman
- Load balancing — Spread requests across multiple deployments of the same model
- Automatic fallbacks — If one provider is down, LiteLLM fails over to another
- Cost tracking — Per-key, per-user spend attribution and budget limits
- Enterprise requirements — SSO, RBAC, audit logs, and compliance controls
Supported Providers
LiteLLM supports 100+ providers. Here are the most commonly used with OpenClaw:
| Provider | Models | Connection |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3 | Direct API |
| Anthropic | Claude Opus 4, Sonnet 4, Haiku 3.5 | Direct API |
| Google | Gemini 2.5 Pro, Flash | Via Vertex AI |
| AWS Bedrock | Claude, Llama, Mistral, Titan | AWS credentials |
| Azure OpenAI | GPT-4o, GPT-4 | Azure deployment |
| Ollama | Llama 3, Qwen 3, Mistral, Phi-4 | Local models |
| Cohere | Command R+, Command R | Direct API |
| HuggingFace | Open models via Inference API | HF token |
Setup: LiteLLM + OpenClaw
Step 1: Install LiteLLM
```shell
pip install 'litellm[proxy]'
```

Or run via Docker:

```shell
docker run -d -p 4000:4000 \
  -v ./litellm_config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml
```

Step 2: Configure Your Models
Create a litellm_config.yaml file that defines which models and providers LiteLLM should expose:
```yaml
model_list:
  # Anthropic Claude
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: sk-ant-your-key

  # OpenAI GPT-4o
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-your-openai-key

  # Local Ollama model
  - model_name: llama-local
    litellm_params:
      model: ollama/llama3.3
      api_base: http://localhost:11434

general_settings:
  master_key: sk-your-master-key
```

Step 3: Start LiteLLM Proxy
```shell
litellm --config litellm_config.yaml --port 4000
```

The proxy is now running at http://localhost:4000 with an OpenAI-compatible API. Test it:

```shell
curl http://localhost:4000/v1/models \
  -H "Authorization: Bearer sk-your-master-key"
```

Step 4: Generate a Key for OpenClaw
Create a dedicated virtual key with spend limits for your OpenClaw agent:
```shell
curl -X POST http://localhost:4000/key/generate \
  -H "Authorization: Bearer sk-your-master-key" \
  -H "Content-Type: application/json" \
  -d '{"max_budget": 50, "metadata": {"user": "openclaw-agent"}}'
```

Step 5: Configure OpenClaw
In your OpenClaw configuration (openclaw.json), add LiteLLM as a model provider:
```json
{
  "models": {
    "providers": {
      "litellm": {
        "baseUrl": "http://localhost:4000",
        "apiKey": "sk-generated-key-from-step-4"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "litellm/claude-sonnet"
      }
    }
  }
}
```

The model name is prefixed with the provider name from your config — e.g., litellm/claude-sonnet routes through LiteLLM to the Claude Sonnet model you defined in litellm_config.yaml.
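The provider-prefix convention can be sketched in a few lines. This is an illustration of the idea only, not OpenClaw's actual resolution code — the function and dictionary shapes here are assumptions for the example:

```python
def resolve_model(reference: str, providers: dict) -> tuple:
    """Split a 'provider/model' reference and look up the provider's base URL.

    Illustrative sketch -- OpenClaw's real routing logic may differ.
    """
    provider_name, model_name = reference.split("/", 1)
    provider = providers[provider_name]
    return provider["baseUrl"], model_name

# Mirrors the openclaw.json config above
providers = {"litellm": {"baseUrl": "http://localhost:4000"}}
base_url, model = resolve_model("litellm/claude-sonnet", providers)
# base_url -> "http://localhost:4000", model -> "claude-sonnet"
```

The request is then sent to base_url in OpenAI format, with "claude-sonnet" as the model field — LiteLLM maps that name onto whatever you defined in litellm_config.yaml.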
Load Balancing
LiteLLM can distribute requests across multiple deployments of the same model. Define multiple entries with the same model_name:
```yaml
model_list:
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: sk-ant-key-1
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: sk-ant-key-2

router_settings:
  routing_strategy: latency-based-routing
```

| Strategy | Description |
|---|---|
| Simple Shuffle | Random distribution across deployments (default) |
| Latency-Based | Routes to the fastest responding deployment |
| Usage-Based | Distributes based on current utilization |
| Least-Busy | Routes to the deployment with fewest active requests |
| Cost-Based | Prefers the cheapest provider for each request |
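To make two of these strategies concrete, here is a minimal sketch of how a router might pick a deployment. This is assumed behavior for illustration, not LiteLLM's internal implementation:

```python
import random

# Two deployments of the same model, as in the YAML above
deployments = [
    {"api_key": "sk-ant-key-1", "active_requests": 3},
    {"api_key": "sk-ant-key-2", "active_requests": 1},
]

def simple_shuffle(deps):
    """Pick a deployment uniformly at random (the default strategy)."""
    return random.choice(deps)

def least_busy(deps):
    """Pick the deployment with the fewest in-flight requests."""
    return min(deps, key=lambda d: d["active_requests"])

chosen = least_busy(deployments)  # picks the key-2 deployment here
```

Latency- and usage-based routing follow the same pattern, just with response-time or utilization stats in place of the active-request count.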
Automatic Fallbacks
Configure fallback models so if one provider fails, LiteLLM automatically retries with another:
```yaml
model_list:
  - model_name: primary-model
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514
      api_key: sk-ant-key
  - model_name: fallback-model
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-openai-key

router_settings:
  fallbacks:
    - primary-model: [fallback-model]
  num_retries: 2
  retry_after: 5
```

Cost Tracking & Budgets
LiteLLM tracks spend per virtual key, per user, and per team. Set budget limits when generating keys:
- max_budget — Maximum total spend for the key
- tpm_limit — Tokens per minute limit
- rpm_limit — Requests per minute limit
View spend via the LiteLLM dashboard at http://localhost:4000/ui or the /spend/logs API endpoint.
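Conceptually, per-key budget enforcement is simple accounting: attribute each request's cost to the key that made it, and reject requests once the cap is hit. A toy sketch of that idea — class and field names here are assumptions, not LiteLLM's schema:

```python
class BudgetError(Exception):
    """Raised when a request would push a key past its spend cap."""

class VirtualKey:
    """Toy model of a virtual key with a max_budget cap."""

    def __init__(self, max_budget: float):
        self.max_budget = max_budget
        self.spend = 0.0

    def record(self, cost: float) -> None:
        """Attribute one request's cost to this key, enforcing the cap."""
        if self.spend + cost > self.max_budget:
            raise BudgetError("key exceeded max_budget")
        self.spend += cost

key = VirtualKey(max_budget=50.0)
key.record(0.12)  # a request costing $0.12; a request past $50 would raise
```

In the real proxy this ledger lives in a database, so spend survives restarts and can be aggregated per user and per team as well as per key.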
LiteLLM vs OpenRouter
Both LiteLLM and OpenRouter give OpenClaw access to multiple AI models. Here's how they compare:
| | LiteLLM | OpenRouter |
|---|---|---|
| Hosting | Self-hosted | Managed SaaS |
| API keys | Your own provider keys | Single OpenRouter key |
| Cost | Free (open source) | Small markup on provider pricing |
| Setup complexity | Moderate (deploy proxy) | Easy (just an API key) |
| Load balancing | Built-in (6 strategies) | Automatic |
| Data privacy | Full control | Traffic via OpenRouter servers |
| Best for | Enterprise, self-hosters | Quick start, individuals |
Pricing
| Tier | Cost | Features |
|---|---|---|
| Open Source | Free | Full proxy, 100+ providers, virtual keys, load balancing, cost tracking |
| Enterprise | Custom | SSO (Okta/Google), RBAC, JWT auth, audit logs, SLAs, dedicated support |
You pay for your own infrastructure to host the proxy (typically $100–$400/mo) plus the actual API costs from each provider. LiteLLM itself adds no markup.
Skip the Setup
Don't want to manage a LiteLLM proxy? OpenClaw Launch handles model routing for you with 20+ models pre-configured via OpenRouter. Deploy in 10 seconds, flat pricing from $3/mo, no proxy setup needed.