Hermes Agent + Gemini: Use Google's Gemini Models with Hermes

Gemini — Google DeepMind's flagship model family — is a strong choice for Hermes Agent. Gemini 3 Pro brings million-token context for deep research, Gemini 2.5 Flash delivers fast, low-cost responses, and the broader Gemini Pro lineup covers multimodal vision, audio, and code workloads.

What Is Gemini?

Gemini is Google DeepMind's family of multimodal large language models. Built from the ground up to handle text, images, audio, and video in a single model, Gemini is known for its long-context performance and tight integration with Google's search and tool ecosystem. For agent workloads, Gemini 3 Pro's 1M+ token window stands out — useful for multi-document analysis and long sessions.

Hermes Agent reaches Gemini through two paths: the Google AI Studio API (authenticated with GOOGLE_API_KEY or GEMINI_API_KEY) or OpenRouter, which serves Gemini alongside 200+ other models on a single key.

Gemini Model Lineup for Hermes

| Model | Best For | Context | Cost (Input) |
| --- | --- | --- | --- |
| Gemini 3 Pro | Deep research, multi-document analysis, complex tool use | 1M+ tokens | ~$1.25/M tokens |
| Gemini 2.5 Pro | Balanced everyday agent work, coding, reasoning | 1M tokens | ~$1.25/M tokens |
| Gemini 2.5 Flash | Fast chat, high-volume messaging, low-cost triage | 1M tokens | ~$0.10/M tokens |
| Gemini 2.5 Flash-Lite | Highest-volume bots, classification, routing | 1M tokens | ~$0.04/M tokens |

For most Hermes deployments, Gemini 2.5 Flash is the right starting point. It handles tool calls, produces coherent multi-step responses, and costs roughly ten cents per million input tokens, about a twelfth of the listed input price for Gemini 3 Pro or 2.5 Pro. Upgrade to Gemini 3 Pro when you need its reasoning depth and million-token context; drop to Flash-Lite when request volume is high enough that cost per call matters more than response quality.
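To make the price gap concrete, here is a minimal sketch that multiplies a monthly input-token volume by the approximate per-million prices from the table above. The `input_cost` helper and the 500M-token workload are hypothetical, chosen only for illustration, and the figures cover input tokens only since the table does not list output pricing.

```shell
#!/usr/bin/env bash
# input_cost <millions_of_input_tokens> <usd_per_million>
# Prints the estimated input-token spend in USD with two decimals.
# Hypothetical helper; prices are the approximate table values, not live rates.
input_cost() {
  awk -v m="$1" -v p="$2" 'BEGIN { printf "%.2f\n", m * p }'
}

# Hypothetical workload: 500M input tokens per month.
echo "Flash-Lite: \$$(input_cost 500 0.04)"
echo "Flash:      \$$(input_cost 500 0.10)"
echo "3 Pro:      \$$(input_cost 500 1.25)"
```

At that volume the estimate is about $20 for Flash-Lite, $50 for Flash, and $625 for 3 Pro, which is why Flash is a sensible default and Pro is better reserved for tasks that need it.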

Option 1: Hermes Agent on OpenClaw Launch (Easiest)

The fastest path to a Gemini-powered Hermes Agent. No API key required, no Docker setup, no config file editing.

  1. Go to openclawlaunch.com/hermes-hosting and start a Hermes deploy.
  2. Select Gemini 2.5 Flash (or 3 Pro / 2.5 Pro) from the model dropdown.
  3. Connect your messaging channel — Telegram, Discord, WhatsApp, or others.
  4. Click Deploy. Your Gemini-powered Hermes Agent is live in roughly 30 seconds.
Tip: OpenClaw Launch routes Gemini requests through OpenRouter by default. AI credits are included in every Hermes plan — no separate Google billing required unless you bring your own key.

Option 2: Google AI Studio API Direct (Self-Hosted)

If you're running Hermes on your own server with a direct Google AI Studio key, set the environment variable and tell Hermes to use the google provider:

# Hermes reads GOOGLE_API_KEY or GEMINI_API_KEY
export GOOGLE_API_KEY=AIza...

hermes inference set google
hermes model set gemini-2.5-flash

# Or configure /opt/data/config.yaml directly:
# inference:
#   provider: google
# model:
#   default: gemini-2.5-flash

Generate an API key at aistudio.google.com/apikey. The free tier covers most prototyping; paid tier billing is usage-based with no monthly minimum.
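Before running the commands above, it can help to confirm that one of the two key variables is actually set in the current shell. The following is a minimal sketch, not part of Hermes: `gemini_key_source` is a hypothetical helper, and checking GOOGLE_API_KEY before GEMINI_API_KEY is an assumption, since the precedence Hermes applies is not stated here.

```shell
#!/usr/bin/env bash
# gemini_key_source: print the name of whichever key variable is set.
# Returns non-zero if neither is set.
# Assumption: GOOGLE_API_KEY is checked first; Hermes may order these differently.
gemini_key_source() {
  if [ -n "${GOOGLE_API_KEY:-}" ]; then
    echo "GOOGLE_API_KEY"
  elif [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY"
  else
    echo "error: set GOOGLE_API_KEY or GEMINI_API_KEY first" >&2
    return 1
  fi
}

# Usage: report which variable will supply the key, if any.
if src="$(gemini_key_source)"; then
  echo "Using key from $src"
fi
```

The function prints only the variable name, never the key itself, so its output is safe to leave in deployment logs.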

Option 3: Gemini via OpenRouter (Self-Hosted)

OpenRouter lets you reach every Gemini model with a single key, alongside Claude, GPT, DeepSeek, Grok, and 200+ others.

export OPENROUTER_API_KEY=sk-or-...

hermes inference set openrouter
hermes model set google/gemini-2.5-flash
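Note that the model ID differs between the two self-hosted options: the google provider takes the bare name (gemini-2.5-flash), while OpenRouter expects the google/ prefix. A small sketch of that mapping follows; `model_id` is a hypothetical helper written for this guide, not a Hermes command.

```shell
#!/usr/bin/env bash
# model_id <provider> <bare-model-name>
# Maps a bare Gemini model name to the ID format each provider uses in this guide:
#   google      -> gemini-2.5-flash
#   openrouter  -> google/gemini-2.5-flash
# Hypothetical helper; any other provider name is rejected.
model_id() {
  case "$1" in
    google)     echo "$2" ;;
    openrouter) echo "google/$2" ;;
    *)          echo "error: unknown provider '$1'" >&2; return 1 ;;
  esac
}

model_id google gemini-2.5-flash
model_id openrouter gemini-2.5-flash
```

In a setup script the result could be passed straight to the command shown above, for example `hermes model set "$(model_id openrouter gemini-2.5-flash)"`.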

When to Choose Each Gemini Model

Choose Gemini 2.5 Flash as your default. It handles everyday chat, coding, tool use, and short research tasks at conversational speed for roughly $0.10 per million input tokens. For high-volume bots, this is one of the cheapest viable options in the frontier-quality tier.

Choose Gemini 3 Pro when you need its 1M+ token context window or its reasoning depth for hard tasks — long research sessions, multi-PDF analysis, complex agentic plans. The pricing is competitive with Claude Sonnet for similar quality on reasoning benchmarks.

Choose Gemini 2.5 Flash-Lite for the highest-volume scenarios: classification, routing, intent detection, or simple Q&A where responses need to be both fast and very cheap.
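The guidance above can be sketched as a simple lookup. Everything here is illustrative: `pick_gemini_model` and the workload labels (research, volume) are hypothetical names, the IDs use the OpenRouter-style google/ prefix, and the Flash-Lite ID is an assumption because this guide does not spell it out.

```shell
#!/usr/bin/env bash
# pick_gemini_model <workload>
# Hypothetical helper mirroring the selection advice in this section.
# Assumption: google/gemini-2.5-flash-lite is the Flash-Lite ID; verify before use.
pick_gemini_model() {
  case "$1" in
    research) echo "google/gemini-3-pro" ;;           # long context, hard reasoning
    volume)   echo "google/gemini-2.5-flash-lite" ;;  # classification, routing, simple Q&A
    *)        echo "google/gemini-2.5-flash" ;;       # everyday default
  esac
}

pick_gemini_model research
pick_gemini_model volume
pick_gemini_model chat
```

Any label the function does not recognize falls through to Flash, which matches the advice to treat it as the default.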

Switching Models at Runtime

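Use the /model command followed by a model ID to switch models during a session. The IDs below use the google/ prefix shown in the OpenRouter setup above:

/model google/gemini-2.5-flash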
/model google/gemini-2.5-pro
/model google/gemini-3-pro

BYOK on OpenClaw Launch

On managed OpenClaw Launch deploys, you can use your own Google AI Studio or OpenRouter key instead of bundled AI credits. In the configurator, choose BYOK and paste your key — all Hermes inference routes through your account, with usage and billing under your direct control.

What's Next?

Deploy Hermes with Gemini

Get a Gemini-powered Hermes Agent running in roughly 30 seconds on OpenClaw Launch.