Summary of "I cut my OpenClaw API bill by 80% with one config change"
Problem
- OpenClaw routes every request to your primary model by default. Trivial tasks (heartbeats, quick lookups, sub-agent work) use the same expensive model as complex reasoning, which wastes money.
- There is no provider fallback by default, so a single provider's rate limits can halt your agent.
Solution: model tiering + config change
Use different models for different task tiers and add cross-provider fallbacks to avoid outages.
Model tiers
- Frontier models (heavy reasoning)
  - Examples: Opus, GPT-5.2 (architecture, large refactors, and other complex reasoning).
- Mid-tier models (daily work)
  - Examples: Sonnet, DeepSeek R1 (code generation, research, content).
- Cheap / utility models (background, heartbeats)
  - Examples: Gemini Flash-Lite, DeepSeek V3.2, GLM 4.7 (heartbeats, quick lookups, classification, background checks).
Fallbacks
- Add a fallback chain that spans providers (e.g., fall back to GPT-5.2 or another provider's model before a same-provider mid-tier) to prevent total outages when one provider is rate-limited.
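One way to express such a chain in a config is an ordered list that a gateway tries top to bottom. This is an illustrative sketch only: the key name and model identifier strings below are assumptions, not the actual OpenClaw schema.

```json
{
  "fallbacks": [
    "openai/gpt-5.2",
    "anthropic/claude-sonnet"
  ]
}
```

The first fallback deliberately crosses providers (OpenAI), so an Anthropic-wide rate limit cannot stall the agent; only after that does the chain drop to a same-provider mid-tier model.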
Model cost / performance examples
- Opus: ~$30 per million tokens; ~50 tokens/sec.
- GPT-5.2: similar premium pricing.
- Sonnet / DeepSeek R1: mid-tier; cheaper with good reasoning (DeepSeek R1 cited at $2.74 per million tokens).
- Gemini 2.5 Flash-Lite: $0.50 per million tokens.
- DeepSeek V3.2: $0.53 per million tokens.
- Gemini 3 Flash: ~250 tokens/sec (faster than Opus).
- Cheap models can be ~60x cheaper than Opus for trivial requests.
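The ~60x figure follows directly from the prices above; a quick check (the 2,000-token heartbeat size is a made-up example, not from the video):

```python
# Prices cited in the summary, in dollars per million tokens.
OPUS_PRICE = 30.00        # frontier model
FLASH_LITE_PRICE = 0.50   # cheap/utility model

# Ratio between frontier and utility pricing for the same request size.
ratio = OPUS_PRICE / FLASH_LITE_PRICE
print(ratio)  # 60.0

# Cost of a hypothetical 2,000-token heartbeat on each tier.
tokens = 2_000
print(OPUS_PRICE * tokens / 1_000_000)        # 0.06
print(FLASH_LITE_PRICE * tokens / 1_000_000)  # 0.001
```

At tens of thousands of heartbeats per month, that per-request difference is where most of the savings come from.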
Implementation options
- Manual configuration (recommended)
- More control: explicitly map tasks to models and define fallbacks.
- OpenRouter auto-router
- Automatic routing based on prompt complexity; less control but minimal setup.
Practical how-to (steps shown)
- Edit the OpenClaw config file: ~/.openclaw/openclaw.json.
- Define model sections for heartbeats, sub-agents, vision tasks, and fallback chains.
- Create aliases for model shortcuts (e.g., opus, sonnet, flash, ds).
- Restart the OpenClaw gateway.
- Switch models on the fly with commands like /model sonnet or /model opus, or use /models to list providers.
- Test to ensure heartbeats and background tasks now use cheaper models while primary agent tasks still use the frontier model.
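The steps above can be sketched as a single file. All key names and model identifier strings here are illustrative assumptions; the actual openclaw.json schema and snippets are in the video description.

```json
{
  "agents": {
    "primary": { "model": "anthropic/claude-opus" },
    "subagent": { "model": "deepseek/deepseek-r1" },
    "heartbeat": { "model": "google/gemini-2.5-flash-lite" }
  },
  "fallbacks": {
    "anthropic/claude-opus": ["openai/gpt-5.2", "anthropic/claude-sonnet"]
  },
  "aliases": {
    "opus": "anthropic/claude-opus",
    "sonnet": "anthropic/claude-sonnet",
    "flash": "google/gemini-2.5-flash-lite",
    "ds": "deepseek/deepseek-r1"
  }
}
```

The aliases are what make on-the-fly switching (e.g., /model sonnet) quick to type.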
Config highlights (example)
- Heartbeats → Gemini 2.5 Flash-Lite (~$0.50/M tokens) instead of Opus.
- Sub-agents → DeepSeek R1 (~$2.74/M tokens) instead of Opus.
- Primary tasks remain on Opus/GPT-5.2.
- First fallback uses a different provider (e.g., GPT-5.2) to avoid provider-wide rate limits.
Savings examples and tools
- Interactive calculator: https://calculator.vlvt.sh
- Example savings (monthly):
  - Light user: ~$200 → ~$70 (≈65% less).
  - Power user: ~$943 → ~$347 (≈63% less; ≈$600 saved/month).
  - Heavy user: ~$3,000 → ~$1,000 (≈67% less; ≈$2,000 saved/month).
- You can plug in different primary/heartbeat/sub-agent models to see custom savings and copy config snippets.
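The calculator's arithmetic can be approximated in a few lines. The prices are the ones cited above; the 30M-token volume and the 10/10/10 split are hypothetical, chosen only to illustrate the shape of the savings:

```python
# Prices cited in the summary, $ per million tokens (input/output lumped
# together for a rough estimate).
PRICES = {
    "opus": 30.00,        # frontier (primary tasks)
    "deepseek-r1": 2.74,  # mid-tier (sub-agents)
    "flash-lite": 0.50,   # cheap (heartbeats, background)
}

def monthly_cost(usage_mtok: dict) -> float:
    """Estimate monthly spend given millions of tokens per model."""
    return sum(PRICES[model] * mtok for model, mtok in usage_mtok.items())

# Hypothetical user: 30M tokens/month, everything on the frontier model.
before = monthly_cost({"opus": 30})
# Same volume after tiering: 10M stays on Opus, 10M moves to sub-agents,
# 10M moves to heartbeats/background.
after = monthly_cost({"opus": 10, "deepseek-r1": 10, "flash-lite": 10})

print(before)                       # 900.0
print(round(after, 2))              # 332.4
print(f"{1 - after / before:.0%}")  # 63%
```

A ~63% reduction on an even three-way split lines up with the calculator's example figures; shifting more traffic to the cheap tier pushes the savings higher.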
Why not rely on free models
- Free tiers commonly have aggressive rate limits, slow performance, and can disappear unexpectedly.
- The recommendation is to use near-free paid models (tens of cents per million tokens) for production reliability.
Links and resources mentioned
- Config examples and openclaw.json snippets: referenced in the video description.
- Savings calculator: https://calculator.vlvt.sh
- Model/provider lists and aliases: available in the video description.
Actionable checklist to apply now
- Add model tiering to ~/.openclaw/openclaw.json:
  - Heartbeat → cheap model
  - Sub-agents → mid-tier
  - Primary → frontier
- Add a cross-provider fallback chain.
- Create model aliases for quick switching.
- Restart the gateway and test switching with /model commands.
- Use the calculator (calculator.vlvt.sh) to estimate savings for your usage.
Main speaker and sources
- Video presenter / author (demonstrates the config and calculator).
- Providers/models referenced: Anthropic (Opus, Sonnet), OpenAI (GPT-5.2), Google (Gemini Flash variants), OpenRouter, DeepSeek (R1, V3.2), GLM 4.7, Kimi K2.5.
Category
Technology