Guide
OpenClaw + Ollama: Run Local AI Models
Use Ollama to run AI models locally and connect them to OpenClaw — complete privacy, zero API costs, and offline capability.
What Is Ollama?
Ollama is an open-source tool that lets you run large language models (LLMs) locally on your own computer. Instead of sending your prompts to OpenAI, Anthropic, or Google, everything stays on your machine. Ollama supports popular open-source models like Llama, Mistral, Qwen, DeepSeek, and many more.
Why Use Ollama with OpenClaw?
Connecting Ollama to OpenClaw gives you a fully local AI agent with real capabilities:
- Complete privacy — Your conversations never leave your machine. No data is sent to any cloud API.
- Zero API costs — Local models are free to run. No per-token billing, no usage limits, no surprise charges.
- Offline capability — Once a model is downloaded, it works without an internet connection.
- Full agent features — You still get all of OpenClaw's features: Telegram/Discord integration, 5,700+ ClawHub skills, web UI, and session management.
Compatible Local Models
Ollama supports hundreds of models. Here are the most popular ones for use with OpenClaw:
| Model | Parameters | VRAM Needed | Best For |
|---|---|---|---|
| Llama 3.3 70B | 70B | 40 GB | Best open-source all-rounder |
| Llama 3.2 8B | 8B | 5 GB | Fast and lightweight |
| Mistral Small 3.1 | 24B | 14 GB | Strong reasoning at low cost |
| Qwen 3 32B | 32B | 20 GB | Excellent multilingual support |
| DeepSeek R1 14B | 14B | 9 GB | Strong coding and math |
| Phi-4 14B | 14B | 9 GB | Compact Microsoft model |
How to Set Up Ollama
- Install Ollama — Download from ollama.com/download. Available for macOS, Linux, and Windows.
- Pull a model — Open your terminal and run:

  ```
  ollama pull llama3.3
  ```

  This downloads the Llama 3.3 model (~40 GB for 70B). For a lighter option, try `ollama pull llama3.2` (8B, ~5 GB).
- Start the Ollama server — Run `ollama serve` (it may already be running as a background service). The server listens on `http://localhost:11434` by default.
- Verify it works — Run `ollama list` to see your downloaded models, or `ollama run llama3.3` to chat directly in the terminal.
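The verification step can also be done programmatically. The sketch below queries Ollama's `/api/tags` endpoint (the same one `ollama list` uses) and prints the models the server has downloaded; it assumes the server is on the default port and returns `None` if it can't be reached:

```python
import json
import urllib.request
import urllib.error

def list_local_models(base_url: str = "http://localhost:11434", timeout: float = 3.0):
    """Return the names of models the Ollama server has downloaded,
    or None if the server is not reachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
    except (urllib.error.URLError, OSError):
        return None
    # /api/tags responds with {"models": [{"name": "llama3.3:latest", ...}, ...]}
    return [m["name"] for m in data.get("models", [])]

if __name__ == "__main__":
    models = list_local_models()
    if models is None:
        print("Ollama server not reachable. Is `ollama serve` running?")
    else:
        print("Available models:", ", ".join(models) or "(none pulled yet)")
```

If this prints an empty list even though the server is up, you haven't pulled any models yet.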
How to Connect Ollama to OpenClaw
To use Ollama as the AI backend for a self-hosted OpenClaw instance, configure the model provider in your openclaw.json config file:
1. Set Ollama as a model provider
In your OpenClaw config, add an ollama entry under models.providers with the base URL of your Ollama server:
```json
"models": {
  "providers": {
    "ollama": {
      "apiBase": "http://localhost:11434/v1"
    }
  }
}
```

2. Set the default model
Point the agent's primary model to your Ollama model. The model ID must be prefixed with the provider name:
```json
"agents": {
  "defaults": {
    "model": {
      "primary": "ollama/llama3.3"
    }
  }
}
```

3. Restart OpenClaw
After updating the config, restart your OpenClaw container. The agent will now route all requests through your local Ollama server.
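Putting the two snippets together, a minimal `openclaw.json` for this setup might look like the following (a sketch — the exact surrounding keys depend on your existing config):

```json
{
  "models": {
    "providers": {
      "ollama": {
        "apiBase": "http://localhost:11434/v1"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/llama3.3"
      }
    }
  }
}
```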
Note: if OpenClaw runs inside a Docker container, `localhost` refers to the container itself, not your host machine. Set the apiBase to `http://host.docker.internal:11434/v1` instead of localhost.

Local (Ollama) vs. Cloud Models
Both approaches have trade-offs. Here's how they compare:
| | Ollama (Local) | Cloud (OpenClaw Launch) |
|---|---|---|
| Privacy | Full — data stays on device | Encrypted at rest, routed via API |
| Cost | Free (electricity + hardware) | From $3/mo + per-token API costs |
| Model quality | Good (open-source models) | Best (Claude Opus, GPT-5.2, Gemini) |
| Speed | Depends on GPU hardware | Fast (cloud inference) |
| Setup | Install Ollama + self-host OpenClaw | Visual editor, one-click deploy |
| Offline | Yes | No — requires internet |
| Hardware needed | GPU with 5–40 GB VRAM | None — fully managed |
Hardware Requirements
Local model performance depends on your GPU. Here are rough guidelines:
- 8B models (Llama 3.2, Phi-4) — Need ~5 GB VRAM. Run well on most modern GPUs or Apple M-series chips with 16 GB+ unified memory.
- 14–32B models (Mistral Small, Qwen 3) — Need 9–20 GB VRAM. Require a dedicated GPU (RTX 3090/4090) or M-series Mac with 32 GB+.
- 70B models (Llama 3.3) — Need ~40 GB VRAM. Require high-end hardware (dual GPUs, A100, or M-series with 64 GB+).
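The VRAM figures above follow roughly from parameter count times bytes per weight: Ollama's default model tags are typically 4-bit quantized (~0.5 bytes per parameter), plus some headroom for the KV cache and activations. A back-of-envelope sketch (the 20% overhead factor is an illustrative assumption, not a measured value):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: quantized weight size plus a flat
    overhead factor for KV cache and activations (an assumption)."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9  # decimal GB

for name, size in [("Llama 3.2 8B", 8), ("DeepSeek R1 14B", 14),
                   ("Qwen 3 32B", 32), ("Llama 3.3 70B", 70)]:
    print(f"{name}: ~{estimate_vram_gb(size):.0f} GB VRAM")
```

The results land close to the table above (~5 GB for 8B, ~40 GB for 70B); real usage varies with quantization level, context length, and runtime.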
If you don't have a capable GPU, cloud models via OpenClaw Launch are the faster path — no hardware investment needed.
When to Use Each Approach
Choose Ollama + self-hosted OpenClaw if you prioritize data privacy, want zero ongoing costs, have a capable GPU, and are comfortable with Docker and command-line setup.
Choose OpenClaw Launch (cloud) if you want the best model quality (Claude Opus, GPT-5.2), don't want to manage servers, or need your bot running 24/7 without dedicated hardware. Deploy in 30 seconds with zero infrastructure setup.