Does Hermes Agent support Gemini?

Yes. Hermes Agent has a Google provider that accepts GOOGLE_API_KEY or GEMINI_API_KEY. It supports Gemini 3 Pro, 2.5 Pro, 2.5 Flash, and 2.5 Flash-Lite directly or via OpenRouter.

Which Gemini model should I use with Hermes?

Gemini 2.5 Flash is the recommended default: it offers the best balance of speed, quality, and cost, at roughly $0.10 per million input tokens. Use Gemini 3 Pro for long-context research and complex reasoning.

How do I get a Gemini API key?

Visit aistudio.google.com/apikey, sign in with a Google account, and create a free key. The free tier covers most prototyping; paid tier billing is usage-based.

Do I need a Google API key for Hermes on OpenClaw Launch?

No. OpenClaw Launch includes AI credits that cover Gemini requests through OpenRouter. Bring your own Google key only if you want billing under your own Google Cloud account.

Hermes Agent + Gemini: Use Google's Gemini Models with Hermes

Gemini — Google DeepMind's flagship model family — is a strong choice for Hermes Agent. Gemini 3 Pro brings million-token context for deep research, Gemini 2.5 Flash delivers fast, low-cost responses, and the broader Gemini Pro lineup covers multimodal vision, audio, and code workloads.

What Is Gemini?

Gemini is Google DeepMind's family of multimodal large language models. Built from the ground up to handle text, images, audio, and video in a single model, Gemini is known for its long-context performance and tight integration with Google's search and tool ecosystem. For agent workloads, Gemini 3 Pro's 1M+ token window stands out — useful for multi-document analysis and long sessions.

Hermes Agent reaches Gemini through two paths: the Google AI Studio API (using GOOGLE_API_KEY or GEMINI_API_KEY) or OpenRouter, which routes to Gemini alongside 200+ other models on a single key.

Gemini Model Lineup for Hermes

Model	Best For	Context	Cost (Input)
Gemini 3 Pro	Deep research, multi-document analysis, complex tool use	1M+ tokens	~$1.25/M tokens
Gemini 2.5 Pro	Balanced everyday agent work, coding, reasoning	1M tokens	~$1.25/M tokens
Gemini 2.5 Flash	Fast chat, high-volume messaging, low-cost triage	1M tokens	~$0.10/M tokens
Gemini 2.5 Flash-Lite	Highest-volume bots, classification, routing	1M tokens	~$0.04/M tokens

For most Hermes deployments, Gemini 2.5 Flash is the right starting point. It handles tool calls, produces coherent multi-step responses, and costs roughly ten cents per million input tokens, about an order of magnitude less than the Pro models in the table above. Upgrade to Gemini 3 Pro when you need its reasoning depth and million-token context; drop to Flash-Lite when volume and cost matter more than anything else.
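To make the price gap concrete, the arithmetic can be sketched in shell. The helper name estimate_input_cost is hypothetical and not part of Hermes. It uses the approximate input prices from the table and ignores output tokens, which are billed separately.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Hermes command): input-token cost in USD.
# $1 = number of input tokens, $2 = price in USD per million input tokens
estimate_input_cost() {
  awk -v tokens="$1" -v price="$2" \
    'BEGIN { printf "%.2f\n", tokens / 1000000 * price }'
}

# 50 million input tokens per month at the approximate table prices:
estimate_input_cost 50000000 0.04   # Flash-Lite: prints 2.00
estimate_input_cost 50000000 0.10   # Flash:      prints 5.00
estimate_input_cost 50000000 1.25   # Pro models: prints 62.50
```

Treat the results as rough estimates; the table prices are approximate and the volume is an illustrative assumption.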

Option 1: Hermes Agent on OpenClaw Launch (Easiest)

The fastest path to a Gemini-powered Hermes Agent. No API key required, no Docker setup, no config file editing.

1. Go to openclawlaunch.com/hermes-hosting and start a Hermes deploy.
2. Select Gemini 2.5 Flash (or 3 Pro / 2.5 Pro) from the model dropdown.
3. Connect your messaging channel: Telegram, Discord, WhatsApp, or others.
4. Click Deploy. Your Gemini-powered Hermes Agent is live in roughly 30 seconds.

Tip: OpenClaw Launch routes Gemini requests through OpenRouter by default. AI credits are included in every Hermes plan — no separate Google billing required unless you bring your own key.

Option 2: Google AI Studio API Direct (Self-Hosted)

If you're running Hermes on your own server with a direct Google AI Studio key, set the environment variable and tell Hermes to use the google provider:

# Hermes reads GOOGLE_API_KEY or GEMINI_API_KEY
export GOOGLE_API_KEY=AIza...

hermes inference set google
hermes model set gemini-2.5-flash

# Or configure /opt/data/config.yaml directly:
# inference:
#   provider: google
# model:
#   default: gemini-2.5-flash

Generate an API key at aistudio.google.com/apikey. The free tier covers most prototyping; paid tier billing is usage-based with no monthly minimum.
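Before pointing Hermes at a new key, it can help to confirm that Google accepts it. The sketch below is not part of Hermes: the function names are hypothetical, and checking GOOGLE_API_KEY before GEMINI_API_KEY is an assumption about precedence. It calls the Gemini API's model-listing endpoint and prints the HTTP status code.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_gemini_key and check_gemini_key are hypothetical helpers.
# Assumption: GOOGLE_API_KEY takes precedence when both variables are set.

resolve_gemini_key() {
  # Print the first non-empty key; return 1 if neither variable is set.
  if [ -n "${GOOGLE_API_KEY:-}" ]; then
    printf '%s\n' "$GOOGLE_API_KEY"
  elif [ -n "${GEMINI_API_KEY:-}" ]; then
    printf '%s\n' "$GEMINI_API_KEY"
  else
    echo "neither GOOGLE_API_KEY nor GEMINI_API_KEY is set" >&2
    return 1
  fi
}

check_gemini_key() {
  # Print the HTTP status returned by the model-listing endpoint.
  # 200 means the key was accepted; other codes mean the request was rejected.
  local key
  key="$(resolve_gemini_key)" || return 1
  curl -s -o /dev/null -w '%{http_code}\n' \
    -H "x-goog-api-key: ${key}" \
    "https://generativelanguage.googleapis.com/v1beta/models"
}
```

Run check_gemini_key after exporting the key. If it prints anything other than 200, fix the key before debugging the Hermes configuration.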

Option 3: Gemini via OpenRouter (Self-Hosted)

OpenRouter lets you reach every Gemini model with a single key, alongside Claude, GPT, DeepSeek, Grok, and 200+ others.

export OPENROUTER_API_KEY=sk-or-...

hermes inference set openrouter
hermes model set google/gemini-2.5-flash
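The two self-hosted options differ in three places: the environment variable, the provider name, and the model ID, which takes a google/ prefix on OpenRouter. A setup script could derive the last two from whichever key is present. This is a minimal sketch: pick_provider and model_id_for are hypothetical names, and preferring a direct Google key over an OpenRouter key is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: pick_provider and model_id_for are hypothetical helpers.
# Assumption: a direct Google key is preferred over an OpenRouter key.

pick_provider() {
  if [ -n "${GOOGLE_API_KEY:-}${GEMINI_API_KEY:-}" ]; then
    echo google
  elif [ -n "${OPENROUTER_API_KEY:-}" ]; then
    echo openrouter
  else
    echo "no Gemini-capable API key found" >&2
    return 1
  fi
}

model_id_for() {
  # $1 = provider, $2 = bare model name such as gemini-2.5-flash
  case "$1" in
    google)     printf '%s\n' "$2" ;;
    openrouter) printf 'google/%s\n' "$2" ;;
    *)          echo "unknown provider: $1" >&2; return 1 ;;
  esac
}

# Example wiring, using the hermes commands shown above (left commented out):
# provider="$(pick_provider)" || exit 1
# hermes inference set "$provider"
# hermes model set "$(model_id_for "$provider" gemini-2.5-flash)"
```

Keeping the prefix logic in one function avoids passing a bare ID to OpenRouter or a prefixed ID to the google provider.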

When to Choose Each Gemini Model

Choose Gemini 2.5 Flash as your default. It handles everyday chat, coding, tool use, and short research tasks at conversational speed for roughly $0.10 per million input tokens. For high-volume bots, this is one of the cheapest viable options in the frontier-quality tier.

Choose Gemini 3 Pro when you need its 1M+ token context window or its reasoning depth for hard tasks — long research sessions, multi-PDF analysis, complex agentic plans. The pricing is competitive with Claude Sonnet for similar quality on reasoning benchmarks.

Choose Gemini 2.5 Flash-Lite for the highest-volume scenarios: classification, routing, intent detection, or simple Q&A where responses need to be both fast and very cheap.
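The guidance above can be condensed into a small lookup. The function below is a hypothetical helper, not a Hermes feature, and the workload labels are illustrative assumptions. It maps a workload type to a model name and falls back to Gemini 2.5 Flash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a workload label to a Gemini model name.
# The labels are illustrative; adjust them to your own task categories.
choose_gemini_model() {
  case "$1" in
    research|long-context|planning) echo gemini-3-pro ;;
    classification|routing|intent)  echo gemini-2.5-flash-lite ;;
    *)                              echo gemini-2.5-flash ;;  # default
  esac
}

choose_gemini_model research   # prints gemini-3-pro
choose_gemini_model routing    # prints gemini-2.5-flash-lite
choose_gemini_model chat       # prints gemini-2.5-flash
```

With the direct google provider, the result could be passed to hermes model set; with OpenRouter, prepend google/ to the name first.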

Switching Models at Runtime

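Use the /model command to switch models at runtime. The IDs below carry the google/ prefix used with OpenRouter in Option 3; with the direct google provider, the bare ID from Option 2 (for example gemini-2.5-flash) is presumably the form to use.

/model google/gemini-2.5-flash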
/model google/gemini-2.5-pro
/model google/gemini-3-pro

BYOK on OpenClaw Launch

On managed OpenClaw Launch deploys, you can use your own Google AI Studio or OpenRouter key instead of bundled AI credits. In the configurator, choose BYOK and paste your key — all Hermes inference routes through your account, with usage and billing under your direct control.

What's Next?

Hermes Agent + Claude — Use Anthropic's Claude family with Hermes
Hermes Agent + OpenAI — Run Hermes on GPT-5.5 and the OpenAI lineup
Hermes Agent + OpenRouter — One key for 200+ models, Gemini included
Hermes Agent + Telegram — Connect your Gemini-powered Hermes bot to Telegram