How to Use NVIDIA Nemotron 3 Ultra with OpenClaw

NVIDIA Nemotron 3 Ultra is a massive mixture-of-experts model built for long-horizon agentic tasks — tool calls, multi-step reasoning, and huge context windows. This guide shows how to wire it into OpenClaw via OpenRouter or a local Ollama server.

What Is NVIDIA Nemotron 3 Ultra?

Nemotron 3 Ultra is an open model released by NVIDIA and optimized for agentic, long-running workflows. A few numbers worth knowing:

  • Architecture: Mixture-of-Experts (MoE), with roughly 550B total parameters and ~55B active per token. Because only about a tenth of the weights run for any given token, per-token compute is far lower than for a dense model of the same total size.
  • Context window: Up to 1 million tokens. Useful for large codebases, lengthy documents, or extended multi-turn agent sessions.
  • Tool use & reasoning: NVIDIA trained the model specifically for function calling and multi-step reasoning — the two capabilities AI agents rely on most.

Because OpenClaw is tool-use-first (skills, MCP servers, browser, file system), pairing it with a model built for the same workload is a natural fit.
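To make the MoE figures above concrete, here is a back-of-envelope sketch. It uses the approximate parameter counts quoted in this guide and the common rule of thumb of about 2 FLOPs per participating parameter for one forward pass; both are rough assumptions, not published benchmarks.

```python
# Back-of-envelope MoE arithmetic using the approximate figures from this guide.
TOTAL_PARAMS = 550_000_000_000   # ~550B total parameters
ACTIVE_PARAMS = 55_000_000_000   # ~55B active per token


def active_fraction(total, active):
    """Share of the weights that participate in a single token's forward pass."""
    return active / total


def forward_flops_per_token(params):
    """Rule of thumb (assumption): ~2 FLOPs per participating parameter."""
    return 2 * params


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
moe_flops = forward_flops_per_token(ACTIVE_PARAMS)
dense_flops = forward_flops_per_token(TOTAL_PARAMS)

print(f"active fraction: {frac:.0%}")
print(f"MoE ~{moe_flops // 10**9} GFLOPs/token vs dense ~{dense_flops // 10**9} GFLOPs/token")
```

This only covers compute per token. Memory is a separate question, because the inactive experts still have to be stored somewhere.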

Option 1 — Managed Hosting on OpenClaw Launch

If you use OpenClaw Launch (the managed service), you pick your model from the dashboard — no config files, no restarts.

  1. Log in at openclawlaunch.com/dashboard.
  2. Open your instance and go to the Model settings tab.
  3. Select OpenRouter as the provider and search for “Nemotron” in the model list.
  4. Pick the Nemotron 3 Ultra entry and save. The change is hot-applied, so no restart is needed.

You can optionally bring your own OpenRouter API key (BYOK) under Settings → API Keys to use your own quota and billing. Without a key, the platform routes requests through its shared key.

Option 2 — Self-Hosting with OpenRouter

On a self-hosted OpenClaw instance, set the model under your agent's primary model config. OpenClaw uses the id format openrouter/nvidia/<model-slug>.
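The id format can be sketched in code: take the vendor/slug string OpenRouter uses and prefix it with openrouter/. The sketch below also lists candidate slugs through OpenRouter's public model-list endpoint. The URL and the response shape (a "data" array of objects with an "id" field) are assumptions to verify against OpenRouter's API reference, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint and response shape: {"data": [{"id": "<vendor>/<slug>", ...}]}
MODELS_URL = "https://openrouter.ai/api/v1/models"


def fetch_models(url=MODELS_URL, timeout=10.0):
    """Download the model list. This is a network call; nothing runs at import."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["data"]


def find_slugs(models, needle="nemotron"):
    """Return sorted model ids that contain `needle`, case-insensitively."""
    needle = needle.lower()
    ids = [m.get("id", "") for m in models]
    return sorted(i for i in ids if i and needle in i.lower())


def to_openclaw_id(openrouter_slug):
    """Prefix an OpenRouter id (vendor/slug) to form the OpenClaw model id."""
    return f"openrouter/{openrouter_slug}"
```

Calling `find_slugs(fetch_models())` and passing each result through `to_openclaw_id` prints values you can paste into the "primary" field. Still confirm the result on the model's detail page, since a substring search can match several Nemotron variants.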

Important: NVIDIA's exact slug on OpenRouter can change between releases. Before copying the snippet below, search “Nemotron” on openrouter.ai/models and copy the current slug from the model's detail page. The snippet below uses a representative slug as a placeholder.

openclaw.json (agent primary model):
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/nvidia/llama-3_1-nemotron-ultra-253b-v1"
      }
    }
  },
  "models": {
    "providers": {
      "openrouter": {
        "apiKey": "sk-or-..."
      }
    }
  }
}

Replace llama-3_1-nemotron-ultra-253b-v1 with the slug you copied from OpenRouter, and fill in your API key. After saving, OpenClaw hot-applies model changes without a full restart.
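Before saving, it can help to sanity-check the file. The sketch below is a hypothetical helper, not part of OpenClaw: it parses the text as strict JSON and checks only the two fields shown in this section. If your OpenClaw version accepts comments or other relaxed syntax, strict parsing will flag them even though OpenClaw would not.

```python
import json


def dig(obj, *keys):
    """Walk nested dicts; return None if any level is missing or not a dict."""
    for key in keys:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def check_openrouter_config(text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        cfg = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} (line {exc.lineno})"]

    problems = []

    # The primary model id should have at least three non-empty segments.
    primary = dig(cfg, "agents", "defaults", "model", "primary")
    parts = primary.split("/") if isinstance(primary, str) else []
    if len(parts) < 3 or parts[0] != "openrouter" or not all(parts):
        problems.append("primary model should look like openrouter/<vendor>/<slug>")

    # Catch a missing key or the "sk-or-..." placeholder left in place.
    api_key = dig(cfg, "models", "providers", "openrouter", "apiKey")
    if not isinstance(api_key, str) or not api_key or api_key.endswith("..."):
        problems.append("openrouter apiKey is missing or still a placeholder")

    return problems
```

For example, `check_openrouter_config(open("openclaw.json").read())` returns an empty list when both fields look plausible. It does not confirm that the slug exists on OpenRouter or that the key is valid.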

Option 3 — Running Locally via Ollama

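Nemotron 3 Ultra is also available through Ollama, which lets you run it entirely on your own hardware (realistically a multi-GPU machine or a workstation with a very large amount of memory). Model tags in the Ollama library can differ from the one shown here, so confirm the exact tag before pulling.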

# Pull the model
ollama pull nemotron-3-ultra

# Verify it loaded
ollama list

Then point OpenClaw at your local Ollama server. In openclaw.json:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "ollama/nemotron-3-ultra"
      }
    }
  },
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://localhost:11434"
      }
    }
  }
}

The Ollama provider in OpenClaw talks directly to your local server, so no API key is required and requests never leave your machine. See the OpenClaw Ollama guide for a full setup walkthrough.
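If OpenClaw cannot reach the model, a quick check is to ask the Ollama server what it has pulled. This sketch assumes the server's `/api/tags` endpoint returns a "models" array whose entries carry a "name" field such as `some-model:latest`; the function names are hypothetical.

```python
import json
import urllib.request


def list_local_models(base_url="http://localhost:11434", timeout=5.0):
    """Ask the Ollama server which models it has (GET /api/tags)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]


def has_model(names, wanted):
    """True if `wanted` matches a listed name, with or without a :tag suffix."""
    for name in names:
        if name == wanted or name.split(":", 1)[0] == wanted:
            return True
    return False
```

With the server running, `has_model(list_local_models(), "nemotron-3-ultra")` should return True once the pull has finished. A connection error from `list_local_models` usually means the server is not running or the base URL in openclaw.json is wrong.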

Cost and Free Tier Notes

Nemotron 3 Ultra is available on OpenRouter, which includes a free tier for many models. Whether Nemotron 3 Ultra sits on the free tier at any given time depends on OpenRouter's current offering — check the model's page on openrouter.ai for live pricing.

  • Free-tier route: On OpenRouter's free tier, rate limits are tighter. Fine for exploration; not recommended for production agents with high message volume.
  • Paid route: Bring your own OpenRouter key (BYOK) for higher limits and predictable billing. Hosted MoE models are generally priced closer to their active-parameter compute than to their total size, so Nemotron 3 Ultra should cost considerably less per token than a dense 550B model would. Check the model page for actual rates.
  • Ollama / self-hosted: No per-token cost once you have the hardware. Although only ~55B parameters are active per token, all ~550B weights must stay loaded, so memory requirements track the total size. Expect multi-GPU setups for full-speed inference.
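For the self-hosted route, a rough sizing sketch helps set expectations. Active-parameter count governs compute per token, but the runtime still has to hold every expert's weights, so the estimate uses the ~550B total. The quantization levels and the 80 GB GPU size are illustrative assumptions, and the numbers cover weights only (no KV cache, activations, or runtime overhead).

```python
import math


def weight_memory_gb(total_params_billions, bits_per_weight):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params_billions * bits_per_weight / 8


def min_gpus(memory_gb, gpu_memory_gb):
    """Smallest GPU count whose combined memory covers the weights."""
    return math.ceil(memory_gb / gpu_memory_gb)


# ~550B total parameters, at three illustrative precisions, on 80 GB GPUs.
for bits in (16, 8, 4):
    gb = weight_memory_gb(550, bits)
    print(f"{bits:>2}-bit weights: ~{gb:.0f} GB, at least {min_gpus(gb, 80)} x 80 GB GPUs")
```

Treat the GPU counts as lower bounds. Long contexts add substantial KV-cache memory on top of the weights.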

Which Model Should I Pick?

Nemotron 3 Ultra is not the right tool for every job. A quick-pick guide:

  • Complex multi-step agents, heavy tool use, large codebases: Nemotron 3 Ultra — its 1M-token context and function-calling focus shine here.
  • Fast back-and-forth chat, low-latency replies: A smaller model (Qwen, Gemma, Mistral) will feel snappier. A model of this size carries more latency per reply, and MoE routing adds some overhead on top.
  • Image or audio tasks: Nemotron 3 Ultra is text-only. Use a multimodal model for those inputs.
  • Budget-constrained, low-volume: Try OpenRouter's free tier first to benchmark quality before committing to paid.
  • Air-gapped / private deployment: The Ollama path keeps everything local.

You can compare model options on the Models page or switch models any time from your dashboard without redeploying.

Frequently Asked Questions

Is Nemotron 3 Ultra free to use?

It is available on OpenRouter, which offers a free tier for many models. Check OpenRouter's current model listing to confirm whether Nemotron 3 Ultra is on the free tier at the time you read this — free-tier availability changes. Running it locally via Ollama has no per-request cost once you have the hardware.

Can I run Nemotron 3 Ultra with OpenClaw?

Yes. OpenClaw supports OpenRouter models through ids prefixed with openrouter/ (for NVIDIA models, openrouter/nvidia/<slug>), and it also supports Ollama for local inference. Both paths are covered above. On OpenClaw Launch (managed), you select the model directly from the dashboard, with no config editing required.

Does Nemotron 3 Ultra work well for AI agents?

Yes. NVIDIA designed it specifically for agentic use cases. The 1M-token context window lets the agent keep far longer histories before anything has to be truncated, and the model's function-calling capability maps directly to OpenClaw's skill and MCP tool system. Users running multi-step workflows (research, coding, task automation) tend to see the biggest benefit.
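To get a feel for how much history fits in a 1M-token window, a crude estimate is often enough. The sketch below assumes roughly 4 characters per token, a common heuristic for English text that can be well off for code or other languages. The reserve for the model's reply is an arbitrary illustrative value.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Very rough token estimate; real tokenizers vary by language and content."""
    return int(len(text) / chars_per_token)


def fits_in_context(texts, window=1_000_000, reserve_for_reply=8_000):
    """True if the combined estimate leaves `reserve_for_reply` tokens free."""
    used = sum(estimate_tokens(t) for t in texts)
    return used + reserve_for_reply <= window
```

For anything close to the limit, count tokens with the model's actual tokenizer rather than relying on this heuristic.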

Prefer not to manage model configs yourself? OpenClaw Launch lets you pick Nemotron 3 Ultra (or any OpenRouter model) from a dropdown in your dashboard: no config files, no restarts, no server management. Deploy in 30 seconds and switch models any time.

Run Nemotron 3 Ultra Without the Setup

Managed OpenClaw hosting — pick your model from the dashboard. Plans from $3/mo.
