Hermes Agent + Gemini: Use Google's Gemini Models with Hermes
Gemini — Google DeepMind's flagship model family — is a strong choice for Hermes Agent. Gemini 3 Pro brings million-token context for deep research, Gemini 2.5 Flash delivers fast, low-cost responses, and the broader Gemini Pro lineup covers multimodal vision, audio, and code workloads.
What Is Gemini?
Gemini is Google DeepMind's family of multimodal large language models. Built from the ground up to handle text, images, audio, and video in a single model, Gemini is known for its long-context performance and tight integration with Google's search and tool ecosystem. For agent workloads, Gemini 3 Pro's 1M+ token window stands out — useful for multi-document analysis and long sessions.
Hermes Agent reaches Gemini through two API paths: directly through the Google AI Studio API (using GOOGLE_API_KEY or GEMINI_API_KEY), or through OpenRouter, which routes to Gemini alongside 200+ other models on a single key.
Gemini Model Lineup for Hermes
| Model | Best For | Context | Cost (Input) |
|---|---|---|---|
| Gemini 3 Pro | Deep research, multi-document analysis, complex tool use | 1M+ tokens | ~$1.25/M tokens |
| Gemini 2.5 Pro | Balanced everyday agent work, coding, reasoning | 1M tokens | ~$1.25/M tokens |
| Gemini 2.5 Flash | Fast chat, high-volume messaging, low-cost triage | 1M tokens | ~$0.10/M tokens |
| Gemini 2.5 Flash-Lite | Highest-volume bots, classification, routing | 1M tokens | ~$0.04/M tokens |
For most Hermes deployments, Gemini 2.5 Flash is the right starting point. It handles tool calls, produces coherent multi-step responses, and costs roughly ten cents per million input tokens, about an order of magnitude less than the Pro models in the table above. Upgrade to Gemini 3 Pro when you need its reasoning depth and million-token context; drop to Flash-Lite when throughput and cost matter more than response quality.
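To make the price gap concrete, here is a minimal sketch that turns the approximate input rates from the table into a dollar figure for a given token volume. The function name `estimate_input_cost` is hypothetical, the rates are the guide's rounded estimates rather than quoted prices, and output-token costs are not included.

```shell
# Estimate input-token cost in USD.
# Usage: estimate_input_cost <input_tokens> <usd_per_million_tokens>
# Hypothetical helper; rates are the approximate figures from the table above.
estimate_input_cost() {
  LC_ALL=C awk -v tokens="$1" -v rate="$2" \
    'BEGIN { printf "%.2f\n", tokens / 1000000 * rate }'
}

# 500 million input tokens per month at each approximate rate:
estimate_input_cost 500000000 0.10   # Gemini 2.5 Flash      -> prints 50.00
estimate_input_cost 500000000 1.25   # Gemini 3 Pro / 2.5 Pro -> prints 625.00
estimate_input_cost 500000000 0.04   # Gemini 2.5 Flash-Lite -> prints 20.00
```

At that volume the difference between Flash and a Pro model is several hundred dollars a month on input tokens alone, which is why Flash is a sensible default until a workload clearly needs more.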
Option 1: Hermes Agent on OpenClaw Launch (Easiest)
The fastest path to a Gemini-powered Hermes Agent. No API key required, no Docker setup, no config file editing.
- Go to openclawlaunch.com/hermes-hosting and start a Hermes deploy.
- Select Gemini 2.5 Flash (or 3 Pro / 2.5 Pro) from the model dropdown.
- Connect your messaging channel — Telegram, Discord, WhatsApp, or others.
- Click Deploy. Your Gemini-powered Hermes Agent is live in roughly 30 seconds.
Option 2: Google AI Studio API Direct (Self-Hosted)
If you're running Hermes on your own server with a direct Google AI Studio key, set the environment variable and tell Hermes to use the google provider:
# Hermes reads GOOGLE_API_KEY or GEMINI_API_KEY
export GOOGLE_API_KEY=AIza...
hermes inference set google
hermes model set gemini-2.5-flash
# Or configure /opt/data/config.yaml directly:
# inference:
#   provider: google
# model:
#   default: gemini-2.5-flash
Generate an API key at aistudio.google.com/apikey. The free tier covers most prototyping; paid tier billing is usage-based with no monthly minimum.
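A missing key is an easy mistake to make on a fresh server, so a small preflight check before switching providers can help. The sketch below reports which of the two variables is set. The function name `gemini_key_source` is hypothetical, and checking GOOGLE_API_KEY first is an assumption: the guide does not say which variable Hermes prefers when both are present.

```shell
# Print the name of the environment variable that supplies the Gemini key,
# or return non-zero if neither is set.
# Assumption: GOOGLE_API_KEY is checked first; Hermes' real precedence is not documented here.
gemini_key_source() {
  if [ -n "${GOOGLE_API_KEY:-}" ]; then
    echo "GOOGLE_API_KEY"
  elif [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY"
  else
    echo "error: set GOOGLE_API_KEY or GEMINI_API_KEY" >&2
    return 1
  fi
}

# Example use: only switch the provider when a key is present.
#   gemini_key_source && hermes inference set google
```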
Option 3: Gemini via OpenRouter (Self-Hosted)
OpenRouter lets you reach every Gemini model with a single key, alongside Claude, GPT, DeepSeek, Grok, and 200+ others.
export OPENROUTER_API_KEY=sk-or-...
hermes inference set openrouter
hermes model set google/gemini-2.5-flash
When to Choose Each Gemini Model
Choose Gemini 2.5 Flash as your default. It handles everyday chat, coding, tool use, and short research tasks at conversational speed for roughly $0.10 per million input tokens. For high-volume bots, this is one of the cheapest viable options in the frontier-quality tier.
Choose Gemini 3 Pro when you need its 1M+ token context window or its reasoning depth for hard tasks — long research sessions, multi-PDF analysis, complex agentic plans. The pricing is competitive with Claude Sonnet for similar quality on reasoning benchmarks.
Choose Gemini 2.5 Flash-Lite for the highest-volume scenarios: classification, routing, intent detection, or simple Q&A where responses need to be both fast and very cheap.
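The guidance above can be sketched as a small lookup that maps a workload label to a model ID. Everything here is illustrative: the function name `pick_gemini_model` and the workload labels are hypothetical, `gemini-3-pro` is inferred from the `/model` examples in this guide, and `gemini-2.5-flash-lite` is assumed to follow the same naming pattern. Check your provider's model list for the exact IDs before relying on them.

```shell
# pick_gemini_model <workload> [provider]
# Hypothetical helper: workload labels and model IDs are assumptions (see lead-in).
# OpenRouter IDs carry a "google/" prefix, as in the guide's OpenRouter example.
pick_gemini_model() {
  local workload="$1" provider="${2:-google}" model
  case "$workload" in
    research|long-context|planning) model="gemini-3-pro" ;;           # deep reasoning, 1M+ context
    classification|routing|intent)  model="gemini-2.5-flash-lite" ;;  # highest volume, lowest cost
    *)                              model="gemini-2.5-flash" ;;       # default for everything else
  esac
  if [ "$provider" = "openrouter" ]; then
    echo "google/$model"
  else
    echo "$model"
  fi
}

# Example use with the command shown earlier in this guide:
#   hermes model set "$(pick_gemini_model routing openrouter)"
```

Falling through to Flash for any unrecognised label mirrors the guide's advice to treat it as the default and move up or down only when a workload calls for it.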
Switching Models at Runtime
/model google/gemini-2.5-flash
/model google/gemini-2.5-pro
/model google/gemini-3-pro
BYOK on OpenClaw Launch
On managed OpenClaw Launch deploys, you can use your own Google AI Studio or OpenRouter key instead of bundled AI credits. In the configurator, choose BYOK and paste your key — all Hermes inference routes through your account, with usage and billing under your direct control.
What's Next?
- Hermes Agent + Claude — Use Anthropic's Claude family with Hermes
- Hermes Agent + OpenAI — Run Hermes on GPT-5.5 and the OpenAI lineup
- Hermes Agent + OpenRouter — One key for 200+ models, Gemini included
- Hermes Agent + Telegram — Connect your Gemini-powered Hermes bot to Telegram