Hermes Agent on Mac: Install and Run on macOS

Hermes Agent runs natively on macOS — Apple Silicon (M1/M2/M3/M4) or Intel. Whether you want a fully local agent with Apple GPU acceleration or a cloud-backed one with bundled inference, the install is a single command.

Three Ways to Run Hermes on Mac

OpenClaw Launch (zero install). Hermes runs in our cloud; your Mac just opens a web UI.
Homebrew or direct binary. Hermes CLI runs as a native macOS process.
Docker Desktop. Hermes runs in a container; everything is portable.

Option 1: OpenClaw Launch (Easiest, Zero Install)

Your Mac doesn't need to run anything — we host Hermes and you reach it from any browser or messaging app.

Go to openclawlaunch.com/hermes-hosting.
Pick a model and a channel (Telegram, Discord, WhatsApp, Slack, or just the web UI).
Click Deploy. Done in 30 seconds.

Option 2: Homebrew Install (Native Mac Binary)

Hermes ships a native macOS binary for both Apple Silicon and Intel:

# Homebrew (recommended)
brew install nousresearch/tap/hermes-agent

# Or direct download
curl -fsSL https://hermes-agent.dev/install.sh | sh

# Start the gateway and the web UI
hermes start

# Open the web UI
open http://127.0.0.1:7777

Hermes stores config and data under ~/.hermes/. On Apple Silicon Macs, local model inference uses Metal GPU acceleration automatically.
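Before opening the browser, it can help to confirm that the gateway is actually answering. The sketch below polls the web UI address from the steps above with curl. The helper names probe and wait_for_ui are hypothetical, and the sketch assumes the UI answers a plain GET on http://127.0.0.1:7777 with a success status, which this guide does not state explicitly.

```shell
# probe URL: succeed if the URL answers with a non-error HTTP status.
probe() {
  curl -fs -o /dev/null --max-time 2 "$1"
}

# wait_for_ui URL [ATTEMPTS]: poll about once per second until probe succeeds.
# Returns 0 as soon as the URL answers, 1 if every attempt fails.
wait_for_ui() {
  url=$1
  attempts=${2:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if probe "$url"; then
      return 0
    fi
    # Sleep only between attempts, not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  return 1
}
```

After hermes start, running wait_for_ui http://127.0.0.1:7777 15 && open http://127.0.0.1:7777 would open the browser only once the UI responds.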

Run a Local Model on Apple Silicon

The fastest path is via Ollama, which Hermes integrates with natively:

# Install Ollama (Metal GPU support built-in)
brew install ollama

# Pull one model that fits your RAM (run only the matching line)
ollama pull llama3.3:70b   # for 64 GB+ Macs
ollama pull qwen3.5:32b    # for 36 GB Macs
ollama pull llama3.1:8b    # for 16 GB Macs

# Point Hermes at Ollama, using the tag you pulled
hermes inference set ollama
hermes model set llama3.3:70b
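Choosing the tag can be scripted from the machine's installed RAM. This is a minimal sketch that reuses the RAM tiers from the comments above. The function name pick_model is hypothetical, and the thresholds are this guide's rough tiers, not hard limits.

```shell
# pick_model RAM_GB: print the model tag this guide suggests for that much RAM.
# Prints nothing below 16 GB, where the guide points to the hosted option.
pick_model() {
  ram_gb=$1
  if [ "$ram_gb" -ge 64 ]; then
    echo "llama3.3:70b"
  elif [ "$ram_gb" -ge 36 ]; then
    echo "qwen3.5:32b"
  elif [ "$ram_gb" -ge 16 ]; then
    echo "llama3.1:8b"
  fi
}

# On macOS, hw.memsize reports physical RAM in bytes.
if [ "$(uname -s)" = "Darwin" ]; then
  ram_gb=$(( $(sysctl -n hw.memsize) / 1073741824 ))
  echo "Detected ${ram_gb} GB RAM; suggested model: $(pick_model "$ram_gb")"
fi
```

The printed tag can then be passed to ollama pull and to hermes model set in place of the hard-coded one.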

Option 3: Docker Desktop on Mac

If you prefer containerized deploys, install Docker Desktop for Mac and run:

docker run -d --name hermes \
  -p 7777:7777 \
  -v hermes-data:/opt/data \
  ghcr.io/nousresearch/hermes-agent:latest

open http://127.0.0.1:7777

Docker on Mac runs Linux containers in a lightweight VM, so Metal GPU acceleration isn't passed through — if you want Apple GPU for local inference, use the native Homebrew install (Option 2) and let Ollama handle the model layer.
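A containerized Hermes can still talk to an Ollama server running natively on the Mac, which keeps inference on the Apple GPU. The sketch below picks the right Ollama address depending on where it runs and then checks reachability. It relies on Ollama's default port (11434), its /api/tags model-listing endpoint, and the /.dockerenv marker file that Docker creates inside containers. The function names are hypothetical, and how Hermes itself is pointed at that URL is not covered here.

```shell
# ollama_base_url [MARKER]: print the Ollama URL to use from the current environment.
# MARKER defaults to /.dockerenv, which exists inside Docker containers.
ollama_base_url() {
  marker=${1:-/.dockerenv}
  if [ -e "$marker" ]; then
    # Inside a container, 127.0.0.1 is the container itself, not the Mac.
    echo "http://host.docker.internal:11434"
  else
    echo "http://127.0.0.1:11434"
  fi
}

# check_ollama: list the models Ollama has pulled, or fail if it is unreachable.
check_ollama() {
  curl -fsS --max-time 5 "$(ollama_base_url)/api/tags"
}
```

Running check_ollama on the Mac, and again inside the container (for example via docker exec, assuming the image ships a shell and curl), should return the same JSON list of models.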

RAM Sizing for Local Models

Mac	Local Model You Can Run	Notes
MacBook Air M2/M3 (8 GB)	3B q4	OpenClaw Launch is a better fit
MacBook Air/Pro M2/M3 (16 GB)	7–8B q4	Llama 3.1 8B at ~20 tok/sec
MacBook Pro M3/M4 Pro (36 GB)	14–32B q4	Qwen 3.5 32B at ~15 tok/sec
MacBook Pro M3/M4 Max (64 GB)	70B q4	Llama 3.3 70B at ~8 tok/sec
Mac Studio M2 Ultra (192 GB)	Mixtral 8x22B q4 or Llama 70B fp16	Pro-level local inference
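The sizes in the table follow from simple arithmetic: weight memory is roughly the parameter count times the bits per weight, divided by 8. The sketch below computes that lower bound. The helper name weights_gb is hypothetical, and the result excludes the KV cache, runtime overhead, and the memory macOS itself needs, so real headroom requirements are higher.

```shell
# weights_gb PARAMS_BILLIONS BITS_PER_WEIGHT: approximate weight size in GB.
weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

weights_gb 70 4    # 70B at 4-bit  -> 35.0
weights_gb 70 16   # 70B at fp16   -> 140.0
weights_gb 8 4     # 8B at 4-bit   -> 4.0
```

This is why a 70B model at q4 is listed for a 64 GB Mac, while the same model at fp16 only fits on the 192 GB Mac Studio.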

Common Issues

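“Hermes can't be opened, unidentified developer”. Right-click the binary and choose Open; macOS will let you bypass Gatekeeper. On macOS 15 (Sequoia) and later, where the right-click bypass no longer works, allow the binary under System Settings → Privacy & Security instead. Homebrew installs avoid this entirely.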
Port 7777 already in use. Run hermes start --port 7778, or find the conflicting process with lsof -i :7777 and stop it.
Local model is slow. Check Activity Monitor → Memory Pressure; if it's yellow or red, pick a smaller model or a more aggressive quantization (q4 instead of q8).
Hermes in Docker can't reach Ollama. From inside the container, use http://host.docker.internal:11434 rather than 127.0.0.1, which refers to the container itself.
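For the port conflict above, the free port can be found automatically instead of guessed. This is a minimal sketch using lsof, which ships with macOS. The helper names port_in_use and first_free_port are hypothetical; the --port flag is the one shown in the list above.

```shell
# port_in_use PORT: succeed if some process is listening on that TCP port.
port_in_use() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

# first_free_port START: print the first port at or above START with no listener.
# Returns 1 if no port up to 65535 is free.
first_free_port() {
  port=$1
  while [ "$port" -le 65535 ]; do
    if ! port_in_use "$port"; then
      echo "$port"
      return 0
    fi
    port=$((port + 1))
  done
  return 1
}
```

With these defined, hermes start --port "$(first_free_port 7777)" would start on 7777 when it is free and on the next available port otherwise.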

What's Next?

Install Hermes Agent — Full install guide across platforms
Hermes Agent + Ollama — Local inference on Apple Silicon
Hermes Agent + LM Studio — GUI alternative for local models
Hermes Desktop App — The native macOS desktop client