Guide
GPT-5.4 Image 2 with OpenClaw
Generate images inside your OpenClaw agent using OpenAI's GPT-5.4 Image 2 — GPT-5.4 reasoning combined with the GPT Image 2 generator, one model, one API call.
What is GPT-5.4 Image 2?
GPT-5.4 Image 2 is OpenAI's unified reasoning + image generation model, released April 2026. It combines the GPT-5.4 text model with the GPT Image 2 generator, so a single request can reason about a prompt, refine it, and produce the image — no separate image-gen call needed.
- Model ID: openai/gpt-5.4-image-2
- Context: 272K tokens
- Pricing: $8 per 1M input tokens, $15 per 1M output tokens
- Available via the OpenAI API and OpenRouter
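To make the token pricing concrete, here is a minimal Python sketch that turns the rates listed above into a per-request estimate. The function name and the sample token counts are hypothetical; this guide does not state how many tokens a typical image request consumes, so treat the numbers as placeholders and substitute the usage figures reported for your own requests.

```python
# Rates taken from this guide: $8 per 1M input tokens, $15 per 1M output tokens.
INPUT_USD_PER_MILLION = 8.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 6,000 output tokens.
    print(f"${estimate_cost_usd(2_000, 6_000):.4f}")  # prints $0.1060
```

Because output tokens are billed at nearly twice the input rate, the output side usually dominates the estimate.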
Step 1: Get an OpenAI API Key
- Go to platform.openai.com/api-keys.
- Sign in or create an OpenAI account.
- Click Create new secret key, name it "OpenClaw", and copy the sk-... value.
- Make sure billing is enabled under Settings → Billing. Image generation is pay-as-you-go.
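Before pasting the key anywhere, a quick local check can catch copy-and-paste mistakes. The sketch below is only an illustration: the helper name is hypothetical, and it assumes you have exported the key as the OPENAI_API_KEY environment variable, which is the conventional name used by OpenAI's SDKs (this guide does not say that OpenClaw reads it). It checks the sk- prefix only; it cannot confirm that the key is valid or that billing is enabled.

```python
import os


def looks_like_openai_key(value: str) -> bool:
    """Cheap paste check: 'sk-' prefix, something after it, no inner whitespace."""
    value = value.strip()
    if not value.startswith("sk-") or len(value) <= len("sk-"):
        return False
    return not any(ch.isspace() for ch in value)


if __name__ == "__main__":
    # Assumption: the key was exported as OPENAI_API_KEY in this shell.
    key = os.environ.get("OPENAI_API_KEY", "")
    if looks_like_openai_key(key):
        print("OPENAI_API_KEY is set and has the expected sk- prefix.")
    else:
        print("OPENAI_API_KEY is missing or does not look like an sk-... key.")
```

Keeping the key in an environment variable rather than in a shared file also makes it less likely to end up in version control.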
Step 2: Configure in OpenClaw
Option A: OpenClaw Launch (Easiest)
- Go to openclawlaunch.com and open the configurator.
- In your instance settings, open the Image generation model picker and choose GPT-5.4 Image 2.
- Under Providers → OpenAI, paste your sk-... API key.
- Pick your chat platform (Telegram, Discord, WhatsApp, WeChat, or the web gateway). Paste a bot token for Telegram / Discord, or use the QR code flow for WhatsApp / WeChat. Click Deploy.
Option B: Self-Hosted Config
If you run OpenClaw yourself, add the provider key and set the image generation model in openclaw.json:
{
  "models": {
    "providers": {
      "openai": {
        "apiKey": "sk-..."
      }
    }
  },
  "agents": {
    "defaults": {
      "imageGenerationModel": {
        "primary": "openai/gpt-5.4-image-2"
      }
    }
  }
}
Step 3: Generate an Image
Once deployed, ask your agent to generate an image from chat. Any of these work:
- "Generate an image of a cat astronaut on Mars."
- "Draw a retro 1980s travel poster for Jupiter."
- "Make a hero banner for a SaaS landing page, dark theme, blue glow."
Because GPT-5.4 Image 2 reasons before generating, you can give it loose briefs and it will expand them into detailed prompts on its own — no separate prompt-engineering step.
When to Pick GPT-5.4 Image 2
| Model | Best For | Cost |
|---|---|---|
| GPT-5.4 Image 2 | Reasoning + image in one call, brief-to-image workflows | $8 / $15 per 1M tokens |
| GPT Image 1 | Straight image generation, lower latency | Per-image (cheaper) |
| Gemini 3 Pro Image | Highest quality, slower | Google pricing |
| Gemini 3.1 Flash Image | Fast, Nano-Banana-class | Google pricing |
Use GPT-5.4 Image 2 when you want one model to handle the whole flow — understand the user's request, plan the scene, and generate the image. Use GPT Image 1 or Gemini 3.1 Flash Image if latency or per-image cost matters more than reasoning.
Switching Models
You can switch image models at any time from the dashboard — pick a different model from the image generation picker, save, and your next request uses the new one. See the Models page for the full list.
What's Next?
- Configure OpenAI text models — Use GPT-5.2, o3, or Codex for chat and coding in the same agent
- Connect Telegram — Send prompts and receive generated images in chat
- Compare image models — See all supported image generators side by side
- See pricing — Hosting from $3/mo with AI credits included