Guide
OpenClaw + Gemma 4: Google's Open-Source AI Model
Run Gemma 4 with OpenClaw locally via Ollama, or use it in the cloud via OpenRouter — strong multilingual reasoning with no API costs when self-hosted.
What Is Gemma 4?
Gemma 4 is Google's latest generation of open-source language models, released in April 2026. Built on the same research foundations as Google's Gemini family, Gemma 4 is designed to deliver strong performance at a fraction of the resource cost of frontier commercial models.
The Gemma 4 family comes in three sizes to fit different hardware configurations: a lightweight 4B model for fast, low-resource inference; a 12B model that balances quality and speed; and a 27B model that matches or exceeds many commercial models on standard benchmarks. All three are freely available to download and run on your own machine.
Why Use Gemma 4 with OpenClaw?
Gemma 4 pairs well with OpenClaw for several reasons:
- Strong multilingual support — Gemma 4 was trained with extensive multilingual data, making it one of the better open-source choices for non-English conversations across Telegram, Discord, WhatsApp, and WeChat.
- Solid reasoning quality — The 27B variant performs competitively with GPT-4o-mini and Claude Haiku on instruction-following, coding, and general knowledge tasks.
- Free to run locally — When used via Ollama, Gemma 4 has no per-token costs. Once downloaded, it runs entirely on your hardware with no internet dependency.
- Available via OpenRouter — If you prefer a cloud setup with no local hardware requirements, Gemma 4 27B is available on OpenRouter under the model ID `google/gemma-4-27b`, billed per token at a fraction of commercial model pricing.
- Full OpenClaw feature set — Whether local or cloud, you still get all of OpenClaw's capabilities: multi-channel messaging (Telegram, Discord, WhatsApp, WeChat), 3,200+ ClawHub skills, session memory, and the web gateway.
Gemma 4 Model Variants
Choose the variant that fits your hardware and quality requirements:
| Model | Parameters | VRAM Needed | Best For |
|---|---|---|---|
| gemma4:4b | 4B | 3 GB | Lightweight, fastest inference |
| gemma4:12b | 12B | 8 GB | Balanced performance and speed |
| gemma4:27b | 27B | 16 GB | Best quality, strongest reasoning |
For most users, gemma4:12b offers the best trade-off: good response quality without requiring a high-end GPU. If you have an RTX 4090 or Apple M-series chip with 24 GB+ unified memory, gemma4:27b is worth the extra weight.
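The thresholds in the table can double as a quick sanity check before downloading. Here is a small illustrative shell helper (the `pick_variant` name and the whole-GB thresholds are just a sketch of the figures above, not part of any official tooling):

```shell
# Suggest a Gemma 4 variant from available VRAM in whole GB,
# using the approximate thresholds from the table above.
pick_variant() {
  vram_gb=$1
  if [ "$vram_gb" -ge 16 ]; then echo "gemma4:27b"
  elif [ "$vram_gb" -ge 8 ]; then echo "gemma4:12b"
  elif [ "$vram_gb" -ge 3 ]; then echo "gemma4:4b"
  else echo "none (use OpenRouter instead)"
  fi
}

pick_variant 8    # prints: gemma4:12b
```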
Option A: Run Gemma 4 Locally via Ollama
This approach keeps all data on your machine. Your conversations never leave your hardware, and there are no ongoing API costs after the initial model download.
- Install Ollama — Download from ollama.com/download. Available for macOS, Linux, and Windows.
- Pull Gemma 4 — Open your terminal and run:
```shell
ollama pull gemma4
```
This downloads the default Gemma 4 variant (~8 GB for 12B). To select a specific size, use `ollama pull gemma4:4b` or `ollama pull gemma4:27b`.
- Start the Ollama server — Run `ollama serve` if it is not already running as a background service. The server listens on `http://localhost:11434` by default.
- Verify the model is ready — Run `ollama list` to confirm the model appears, or test it with `ollama run gemma4`.
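Since OpenClaw talks to Ollama through its OpenAI-compatible `/v1` API, one extra check worth making is that this endpoint responds. A minimal probe, assuming the default port and a running server, might look like:

```shell
# Probe the OpenAI-compatible endpoint Ollama exposes under /v1.
# -s silences progress output; -f makes curl fail on HTTP errors.
if curl -sf http://localhost:11434/v1/models > /dev/null; then
  echo "Ollama v1 endpoint is reachable"
else
  echo "Ollama v1 endpoint is not reachable (is 'ollama serve' running?)"
fi
```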
Once Ollama is running, configure your OpenClaw instance to use it:
1. Add Ollama as a model provider
In your `openclaw.json` config, add an `ollama` entry under `models.providers`:
```json
"models": {
  "providers": {
    "ollama": {
      "apiBase": "http://localhost:11434/v1"
    }
  }
}
```

2. Set Gemma 4 as the default model
Point the agent's primary model to Gemma 4. The model ID must be prefixed with the provider name:
```json
"agents": {
  "defaults": {
    "model": {
      "primary": "ollama/gemma4:12b"
    }
  }
}
```

Replace `gemma4:12b` with `gemma4:4b` or `gemma4:27b` depending on which variant you pulled.
3. Restart OpenClaw
After saving the config, restart your OpenClaw container. The agent will now route all requests through your local Ollama server using Gemma 4.
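Putting the two fragments together, a minimal sketch of the relevant portion of `openclaw.json` for this setup looks like the following (only the keys shown in the steps above; the rest of your config stays as-is):

```json
{
  "models": {
    "providers": {
      "ollama": {
        "apiBase": "http://localhost:11434/v1"
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/gemma4:12b"
      }
    }
  }
}
```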
Note: if OpenClaw runs inside a Docker container, Ollama on the host is not reachable at localhost. Use http://host.docker.internal:11434/v1 instead of http://localhost:11434/v1 as the API base URL.

Option B: Use Gemma 4 via OpenRouter (No Local Setup)
If you do not have a capable GPU or prefer a zero-maintenance setup, Gemma 4 27B is available on OpenRouter. You pay per token (significantly cheaper than commercial models), with no hardware to manage and no model downloads required.
1. Get an OpenRouter API key
Create a free account at openrouter.ai and generate an API key from your dashboard.
2. Configure OpenClaw to use OpenRouter with Gemma 4
Add OpenRouter as a provider and set `google/gemma-4-27b` as the primary model:
```json
"models": {
  "providers": {
    "openrouter": {
      "apiKey": "YOUR_OPENROUTER_API_KEY"
    }
  }
},
"agents": {
  "defaults": {
    "model": {
      "primary": "openrouter/google/gemma-4-27b"
    }
  }
}
```

3. Restart OpenClaw
Restart your container to apply the new config. Gemma 4 will now be used for all agent responses, routed through OpenRouter's inference infrastructure.
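If you want to double-check the model ID before restarting, OpenRouter's public model list can be queried without an API key. A quick check (requires network access; it prints the ID once if it is listed):

```shell
# Query OpenRouter's public model list and look for the Gemma 4 ID used above.
curl -s https://openrouter.ai/api/v1/models | grep -o '"google/gemma-4-27b"' | head -n 1
```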
Local vs. Cloud: Which Should You Choose?
| | Ollama (Local) | OpenRouter (Cloud) | OpenClaw Launch (Managed) |
|---|---|---|---|
| Privacy | Full — data stays on device | Routed via OpenRouter API | Encrypted at rest, routed via API |
| Cost | Free (electricity + hardware) | Per-token billing (low rates) | From $3/mo + per-token API costs |
| Setup time | 15–30 min (install + download) | 5 min (API key only) | Under 2 min (visual editor) |
| Hardware needed | GPU with 3–16 GB VRAM | None | None — fully managed |
| 24/7 uptime | Only while your machine is on | Yes (cloud inference) | Yes (managed hosting) |
| Offline use | Yes | No | No |
Hardware Requirements for Local Gemma 4
Running Gemma 4 locally requires enough VRAM or unified memory to hold the model in memory during inference. Here are the practical requirements per variant:
- Gemma 4 4B — Needs ~3 GB VRAM. Runs on most modern GPUs (RTX 3060 or newer) or any Apple M-series chip with 8 GB+ unified memory.
- Gemma 4 12B — Needs ~8 GB VRAM. Works on RTX 3070/4070 or Apple M-series with 16 GB+ unified memory.
- Gemma 4 27B — Needs ~16 GB VRAM. Requires RTX 3090/4090 or Apple M-series with 24 GB+ unified memory (M3 Pro or better).
If your hardware does not meet these requirements, using Gemma 4 via OpenRouter or deploying a managed instance on OpenClaw Launch is the simpler path with no hardware investment.
When to Choose Each Approach
Choose Ollama + self-hosted OpenClaw if data privacy is critical, you have a capable GPU, and you are comfortable with Docker and command-line configuration. This is the best choice for running Gemma 4 with zero ongoing costs.
Choose OpenRouter if you want Gemma 4 without local hardware but are comfortable managing your own OpenClaw instance and API keys. Good for developers who already self-host OpenClaw.
Choose OpenClaw Launch if you want the fastest path to a working AI agent with no infrastructure to manage. Deploy in under 2 minutes using the visual configurator, connect Gemma 4 via OpenRouter, and have your bot live on Telegram, Discord, or WhatsApp the same day.