Hermes Agent on Mac: Install and Run on macOS
Hermes Agent runs natively on macOS — Apple Silicon (M1/M2/M3/M4) or Intel. Whether you want a fully local agent with Apple GPU acceleration or a cloud-backed one with bundled inference, the install is a single command.
Three Ways to Run Hermes on Mac
- OpenClaw Launch (zero install). Hermes runs in our cloud; your Mac just opens a web UI.
- Homebrew or direct binary. Hermes CLI runs as a native macOS process.
- Docker Desktop. Hermes runs in a container; everything is portable.
Option 1: OpenClaw Launch (Easiest, Zero Install)
Your Mac doesn't need to run anything — we host Hermes and you reach it from any browser or messaging app.
- Go to openclawlaunch.com/hermes-hosting.
- Pick a model and a channel (Telegram, Discord, WhatsApp, Slack, or just the web UI).
- Click Deploy. Done in 30 seconds.
Option 2: Homebrew Install (Native Mac Binary)
Hermes ships a native macOS binary for both Apple Silicon and Intel:
# Homebrew (recommended)
brew install nousresearch/tap/hermes-agent
# Or direct download
curl -fsSL https://hermes-agent.dev/install.sh | sh
# Start the gateway and the web UI
hermes start
# Open the web UI
open http://127.0.0.1:7777
Hermes stores config and data under ~/.hermes/. On Apple Silicon Macs, local model inference uses Metal GPU acceleration automatically.
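If the browser opens before the gateway has finished starting, the page may fail to load. The following is a minimal sketch of a wait loop, assuming the web UI answers plain HTTP on 127.0.0.1:7777 as stated above. The function names `http_ok` and `wait_for_url` are hypothetical helpers, not part of the Hermes CLI.

```shell
# Default probe: succeeds once the URL answers with a non-error HTTP status.
http_ok() {
  curl -fsS -o /dev/null --max-time 2 "$1" 2>/dev/null
}

# Poll URL $1 up to $2 times (default 30), sleeping $3 seconds (default 1)
# between attempts. $4 optionally names a different probe function
# (a hypothetical hook, useful for testing without a running server).
wait_for_url() {
  wfu_url=$1
  wfu_tries=${2:-30}
  wfu_delay=${3:-1}
  wfu_probe=${4:-http_ok}
  wfu_i=0
  while [ "$wfu_i" -lt "$wfu_tries" ]; do
    if "$wfu_probe" "$wfu_url"; then
      return 0
    fi
    wfu_i=$((wfu_i + 1))
    sleep "$wfu_delay"
  done
  return 1
}

# Usage (assumes Hermes was already started, e.g. in another terminal):
#   wait_for_url http://127.0.0.1:7777 && open http://127.0.0.1:7777
```

The probe is passed in by name so the loop can be exercised without a live gateway; in normal use the default curl probe is enough.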
Run a Local Model on Apple Silicon
The fastest path is via Ollama, which Hermes integrates with natively:
# Install Ollama (Metal GPU support built-in)
brew install ollama
# Pull one model that fits your RAM
ollama pull llama3.3:70b # for 64 GB+ Macs
ollama pull qwen3.5:32b # for 36 GB Macs
ollama pull llama3.1:8b # for 16 GB Macs
# Point Hermes at Ollama
hermes inference set ollama
hermes model set llama3.3:70b
Option 3: Docker Desktop on Mac
If you prefer containerized deploys, install Docker Desktop for Mac and run:
docker run -d --name hermes \
  -p 7777:7777 \
  -v hermes-data:/opt/data \
  ghcr.io/nousresearch/hermes-agent:latest
open http://127.0.0.1:7777
Docker on Mac runs Linux containers in a lightweight VM, so Metal GPU acceleration isn't passed through; if you want Apple GPU for local inference, use the native Homebrew install (Option 2) and let Ollama handle the model layer.
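When Hermes runs in a container but Ollama runs natively on the Mac, the address Hermes should use differs from the native case. The sketch below captures that choice and probes Ollama's model-list endpoint (`/api/tags` on its default port 11434). The helper names are hypothetical, and running the probe inside the container assumes the image ships `curl`, which this guide does not confirm.

```shell
# Print the base URL for reaching Ollama from a given context.
#   host:      Hermes runs natively on the Mac
#   container: Hermes runs inside Docker Desktop
ollama_base_url() {
  case "$1" in
    container) echo "http://host.docker.internal:11434" ;;
    host)      echo "http://127.0.0.1:11434" ;;
    *) echo "usage: ollama_base_url host|container" >&2; return 2 ;;
  esac
}

# Succeeds only if Ollama answers on its model-list endpoint.
ollama_reachable() {
  curl -fsS --max-time 3 "$(ollama_base_url "$1")/api/tags" >/dev/null 2>&1
}

# Usage on the Mac itself:
#   ollama_reachable host && echo "Ollama is up"
```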
RAM Sizing for Local Models
| Mac | Local Model You Can Run | Notes |
|---|---|---|
| MacBook Air M2/M3 (8 GB) | 3B q4 | OpenClaw Launch is a better fit |
| MacBook Air/Pro M2/M3 (16 GB) | 7–8B q4 | Llama 3.1 8B at ~20 tok/sec |
| MacBook Pro M3/M4 Pro (36 GB) | 14–32B q4 | Qwen 3.5 32B at ~15 tok/sec |
| MacBook Pro M3/M4 Max (64 GB) | 70B q4 | Llama 3.3 70B at ~8 tok/sec |
| Mac Studio M2 Ultra (192 GB) | Mixtral 8x22B q4 or Llama 70B fp16 | Pro-level local inference |
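The table can be turned into a small helper that suggests a model tag from installed RAM. This is a rough sketch: the thresholds and tags are taken from the Ollama commands earlier in this guide, `pick_model` and `mac_ram_gb` are hypothetical names, and below 16 GB no tag is suggested because the guide does not name one.

```shell
# Map installed RAM (whole GB) to the largest model tag this guide lists for it.
pick_model() {
  pm_ram_gb=$1
  if   [ "$pm_ram_gb" -ge 64 ]; then echo "llama3.3:70b"
  elif [ "$pm_ram_gb" -ge 36 ]; then echo "qwen3.5:32b"
  elif [ "$pm_ram_gb" -ge 16 ]; then echo "llama3.1:8b"
  else
    # Under 16 GB the table points to the hosted option instead.
    echo "no suggested tag for ${pm_ram_gb} GB" >&2
    return 1
  fi
}

# Installed RAM in whole GB; hw.memsize is the macOS sysctl key
# that reports physical memory in bytes.
mac_ram_gb() {
  echo $(( $(sysctl -n hw.memsize) / 1073741824 ))
}

# Usage on a Mac:
#   ollama pull "$(pick_model "$(mac_ram_gb)")"
```

The thresholds only reflect whether the weights fit; actual speed still depends on the chip and on what else is using memory.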
Common Issues
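- “Hermes can't be opened, unidentified developer”. Right-click the binary and choose Open; macOS will let you bypass Gatekeeper. On macOS 15 (Sequoia) and later, approve the binary under System Settings → Privacy & Security instead. Homebrew installs avoid this entirely.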
- Port 7777 already in use. Run hermes start --port 7778, or find the conflicting process with lsof -i :7777 and stop it.
- Local model is slow. Check Activity Monitor → Memory Pressure; if it's yellow or red, pick a smaller model or a more aggressive quantization (q4 instead of q8).
- Hermes in Docker can't reach Ollama. Use http://host.docker.internal:11434 from inside the container, not 127.0.0.1.
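For the port conflict above, picking the next free port can be scripted. The sketch below assumes `lsof` is available (it ships with macOS) and uses the `--port` flag mentioned in the tip. `port_in_use` and `first_free_port` are hypothetical helper names.

```shell
# True if some process is listening on TCP port $1.
# Note: if lsof is missing, the port is reported as free.
port_in_use() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

# Print the first free port at or after $1, trying at most 20 candidates.
# $2 optionally names a different checker function (useful for testing).
first_free_port() {
  ffp_port=$1
  ffp_check=${2:-port_in_use}
  ffp_tries=0
  while [ "$ffp_tries" -lt 20 ]; do
    if ! "$ffp_check" "$ffp_port"; then
      echo "$ffp_port"
      return 0
    fi
    ffp_port=$((ffp_port + 1))
    ffp_tries=$((ffp_tries + 1))
  done
  return 1
}

# Usage:
#   hermes start --port "$(first_free_port 7777)"
```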
What's Next?
- Install Hermes Agent — Full install guide across platforms
- Hermes Agent + Ollama — Local inference on Apple Silicon
- Hermes Agent + LM Studio — GUI alternative for local models
- Hermes Desktop App — The native macOS desktop client