Self-Hosting Guide

Run OpenClaw Locally on NVIDIA DGX Spark

Q: What models can I run on DGX Spark with OpenClaw?

With 128 GB of unified memory, the DGX Spark supports inference on models up to approximately 200B parameters and fine-tuning up to around 70B. For OpenClaw agent use, instruction-tuned models in the 7B to 70B range are practical.

Q: Is OpenClaw free to self-host?

Yes, OpenClaw is open-source. Self-hosting is free — you pay only for hardware and electricity. OpenClaw Launch is the managed hosting service starting at $3/mo, which removes all setup and ops burden.

Q: Can I use cloud models alongside a local model on DGX Spark?

Yes. OpenClaw supports multiple model providers simultaneously. You can route some requests to a local Ollama model and fall back to cloud APIs (OpenRouter, OpenAI, Anthropic) for tasks that need it. BYOK is supported in both self-hosted and managed setups.

The NVIDIA DGX Spark is a desktop AI computer powerful enough to run large local language models on-prem. Because OpenClaw is open-source, you can self-host it on your own hardware and point it at a local inference server for a fully private, offline-capable AI agent. This guide walks through how that setup works — and honestly compares it to the managed cloud path for users who don't need $4,699 of hardware.

What Is the NVIDIA DGX Spark?

The DGX Spark is a compact desktop AI computer built on the NVIDIA GB10 Grace Blackwell Superchip, combining a 20-core Arm Grace CPU and a Blackwell GPU in a single unified-memory design. Key specs:

128 GB unified LPDDR5x memory — shared coherently between CPU and GPU, so the full pool is available to your model with no separate VRAM bottleneck
Up to 1 PFLOP (FP4 with sparsity) of AI compute
Run models up to ~200B parameters, and fine-tune up to ~70B
Very small footprint — roughly 1.1 liters, desktop-sized
Price — Founder's Edition starts at $4,699 (4 TB SSD); partner versions from Acer, ASUS, Dell, and MSI are available at 1 TB for somewhat less via authorized retailers

The DGX Spark is designed for researchers, developers, and teams that need serious local inference power without a full data-center rack.

Why Self-Host OpenClaw on DGX Spark?

OpenClaw is the open-source AI agent framework that powers OpenClaw Launch's managed service. Because the framework is open-source, you can run it yourself on any machine — including the DGX Spark. Pairing them gives you:

Full data privacy — your conversations, files, and agent actions never leave your hardware. No API calls to third-party cloud providers.
Offline capability — the agent works even without internet access, using a locally-running model.
Large model headroom — 128 GB unified memory means you can run models that would be impractical on a typical workstation GPU (which usually caps at 24 – 80 GB VRAM).
Zero per-token cost — once you own the hardware, inference is free at the margin.

The trade-off: you are taking on hardware cost, setup, and operations yourself. For most users, the managed OpenClaw Launch service is faster and cheaper. We will compare both paths honestly below.

DGX Spark Self-Host vs Managed OpenClaw Launch

	DGX Spark + Self-Hosted OpenClaw	OpenClaw Launch (Managed)
Upfront cost	$4,699+ (hardware)	$0
Monthly cost	Power + your ops time	$3 first month, then $6 – $20/mo
Data stays on-prem	Yes — fully local	Hosted on managed EU servers
Works offline	Yes (local model)	No — cloud-dependent
Model size headroom	Up to ~200B parameters	API models (no local inference)
Setup time	Hours — install + configure	Under 2 minutes
Updates / ops	You manage everything	Managed for you
Channels (Telegram, Discord, etc.)	Supported (self-configured)	Supported (one-click)
Skills / MCP	3,200+ ClawHub skills (manual install)	3,200+ ClawHub skills (dashboard)

Bottom line: the DGX Spark path makes sense when you have strict data-residency requirements, need offline operation, or want to run very large local models for research or heavy inference workloads. For everyone else, the managed service is a better starting point at a fraction of the cost.

Who Should Consider the DGX Spark Path?

Researchers and teams who need to run 70B+ parameter models locally without paying per-token cloud inference costs
Organizations with strict data-residency or air-gap requirements where no data can leave premises
Developers fine-tuning models up to ~70B and wanting to run the resulting model in a live agent
Users in environments with unreliable or restricted internet access who need offline-capable AI

If none of the above apply to you, you will get a working OpenClaw agent much faster and cheaper through OpenClaw Launch.

Step 1: Get Your DGX Spark Running

Purchase a DGX Spark (Founder's Edition at $4,699 or a partner version from Acer, ASUS, Dell, or MSI) from an authorized NVIDIA retailer. NVIDIA ships with their DGX OS stack pre-configured for AI workloads, including CUDA and driver support for the GB10 Blackwell GPU.

Once powered on and connected to your network, verify the GPU is recognized and that you can run basic compute tasks. Consult the DGX Spark quick-start documentation from NVIDIA (nvidia.com) for OS-level setup steps specific to your unit.

Step 2: Install a Local Inference Server

To run an LLM locally on the DGX Spark, you need a local inference server that exposes an OpenAI-compatible API. Ollama is the most common choice for self-hosted setups — it handles model download, quantization selection, and serving in one command. Other options include vLLM (higher throughput, more complex setup) and llama.cpp-based servers.

Install Ollama following their official documentation (ollama.com), then pull a model. With 128 GB unified memory on the DGX Spark, you can comfortably run models that would not fit on a typical consumer GPU. For general-purpose agent use, a 32B–70B instruction-tuned model is a good starting point. For lighter workloads, 7B–14B models use far less memory and respond faster.

Once Ollama is running and you have pulled a model, it will serve an OpenAI-compatible API on localhost:11434 by default. You can verify it is working with a quick curl request to the chat completions endpoint.

See our OpenClaw + Ollama guide for a detailed walkthrough of the Ollama setup and model selection.

Step 3: Install and Run OpenClaw

OpenClaw is open-source and can be installed on the DGX Spark directly. The recommended path for self-hosting is via Docker, which keeps the agent isolated and makes updates straightforward. You can also install via npm if you prefer a bare metal install.

The key configuration step is pointing OpenClaw at your local Ollama server rather than a cloud model provider. In your openclaw.json, set the model provider to your local inference endpoint and specify the model name you pulled. The exact field names are in the OpenClaw install guide and the upstream OpenClaw documentation.

High-level flow:

Install OpenClaw (Docker is recommended for self-hosting)
Configure openclaw.json to use your local Ollama endpoint as the model provider
Set a model name matching what you pulled into Ollama
Start the OpenClaw gateway

OpenClaw will now route all inference requests to your local model on the DGX Spark, with no external API calls for the LLM layer.

Step 4: Connect Your Channels

With the OpenClaw gateway running locally, connect it to the messaging channels you want to use. OpenClaw supports Telegram, Discord, WhatsApp, WeChat, Slack, and web chat. Each channel requires a bot token or credential from that platform:

Telegram — create a bot via BotFather, add the token to your config
Discord — create a Discord application and bot, add the bot token
WhatsApp, WeChat, Slack — follow the respective channel connection guides in the OpenClaw documentation

Because you are self-hosting, you manage the gateway URL and TLS certificate yourself. If the DGX Spark is on a local network, you will need a way to expose the gateway publicly for Telegram/Discord webhooks, or use a tunnel service.

Step 5: Install Skills

OpenClaw's 3,200+ ClawHub skills work the same way in a self-hosted setup as in the managed service. Install skills from within the agent's chat interface or via the CLI. Skills that require external API keys (search, calendar, etc.) still need those keys configured, but the LLM inference itself stays local.

Frequently Asked Questions

What models can I run on DGX Spark with OpenClaw?

With 128 GB of unified memory, the DGX Spark can run models up to approximately 200B parameters for inference, and fine-tune models up to around 70B. For OpenClaw agent use, well-regarded open instruction-tuned models in the 7B–70B range are practical choices. Specific model recommendations depend on what is available via Ollama or your preferred inference server at the time of your setup.

Do I need a DGX Spark to self-host OpenClaw?

No. OpenClaw can be self-hosted on any Linux machine with Docker. The DGX Spark is a good choice if you specifically need to run very large local models (70B+) or want maximum unified memory. For lighter local models or cloud-API setups, an ordinary server or even a Mac with Apple Silicon works fine. If you just want a working OpenClaw agent without any hardware management, use OpenClaw Launch from $3/mo.

Can I use cloud models alongside a local model on DGX Spark?

Yes. OpenClaw supports multiple model providers in the same config. You can route some requests to your local Ollama model and fall back to a cloud API (OpenRouter, OpenAI, Anthropic, etc.) for tasks that need a capability your local model lacks. BYOK is fully supported in the self-hosted setup just as in the managed service.

Is OpenClaw free to self-host?

Yes, the OpenClaw framework is open-source. Self-hosting is free (you pay for the hardware and electricity). OpenClaw Launch is the managed hosting service around it, priced from $3/mo. If you self-host, you do not pay OpenClaw Launch anything — but you take on all setup and ops yourself.

Why does OpenClaw Launch exist if OpenClaw is open-source?

Self-hosting works, but it requires hardware, Linux knowledge, Docker, network configuration, and ongoing maintenance. OpenClaw Launch handles all of that — provisioning, updates, backups, channel connectivity, and uptime — so you can deploy an agent in under 2 minutes without touching a server. The DGX Spark path is for users who specifically need on-prem or local-model capabilities that the managed service cannot provide.

Deploy Without the Hardware

The DGX Spark + self-hosted OpenClaw setup is powerful for privacy-critical and heavy-inference workloads. For most users — individuals, small teams, and businesses who just want a reliable AI agent on Telegram, Discord, or WhatsApp — the managed path is far simpler.

OpenClaw Launch gives you the same open-source OpenClaw agent, with 20+ models (including cloud BYOK), 3,200+ ClawHub skills, and MCP support, managed for you starting at $3 for the first month.

What's Next?

OpenClaw + Ollama guide — Detailed walkthrough of local model setup
OpenClaw install guide — Full self-hosting setup from scratch
OpenClaw Docker guide — Containerized deployment for self-hosters
Managed vs self-hosted — Full comparison of both paths
See managed pricing — OpenClaw Launch from $3/month, no hardware needed