Agent Cost Monitor — Know What Your Agents Cost
Track token usage, costs, and efficiency across all your OpenClaw agents in real time. Get alerts before you blow your budget.
The Problem
Running multiple agents is powerful — but expensive if you're not watching:
- Which agent is burning the most tokens?
- Are heartbeats wasting money on expensive models?
- Is caching actually saving you anything?
- When will you hit your weekly rate limit?
What This Skill Does
When triggered (via cron or manually), the agent:
1. Checks session_status for each agent
2. Calculates per-agent and total costs
3. Compares against budget thresholds
4. Sends alerts if limits are approaching
5. Suggests optimization moves
Usage
Ask your monitoring agent (or any agent with this skill):
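```
Give me a cost report for all agents
Which agent used the most tokens today?
Will I hit my rate limit this week?
```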
Automated Daily Report (Cron)
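```json5
{
  name: "Daily Cost Report",
  // Runs every day at 20:00 Berlin time
  schedule: { kind: "cron", expr: "0 20 * * *", tz: "Europe/Berlin" },
  payload: {
    kind: "agentTurn",
    message: "Run a cost report for all agents. Check session_status for each agent. Report: total tokens, cost per agent, top spender, budget alerts. Send a summary to the user."
  },
  sessionTarget: "isolated",
  delivery: { mode: "announce" }
}
```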
Cost Report Format
When generating a report, use this structure:
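```markdown
💰 Agent Cost Report - [Date]

Per-Agent Breakdown

| Agent | Model | Tokens (24h) | Est. Cost | Status |
|---|---|---|---|---|
| Central | Opus 4.6 | 125K | $1.87 | ⚠️ High |
| Techops | Opus 4.6 | 89K | $1.33 | ✅ Normal |
| Atlas | Sonnet 4.5 | 45K | $0.27 | ✅ Low |
| Closer | Haiku 4.5 | 23K | $0.02 | ✅ Minimal |
| Heartbeats | Ollama | 12K | $0.00 | ✅ Free |

Summary

- Total (24h): 294K tokens (~$3.49)
- Projected weekly: ~$24.43
- Budget: $20/week → ⚠️ projected at 122% of budget

Recommendations

1. Move Techops from Opus to Sonnet for routine tasks (40% cost reduction)
2. Extend the heartbeat interval from 15 minutes to 30 minutes
3. Enable context pruning on Atlas (idle sessions consume cache)
```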
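The weekly projection in a report is plain arithmetic. The following is a minimal sketch, assuming the projection is a straight 7-day extrapolation of the 24-hour total, which matches the sample figures in this document's example report ($3.49 over 24 hours against a $20 weekly budget). The function name `project_weekly` is hypothetical and not part of OpenClaw.

```python
def project_weekly(daily_cost, weekly_budget):
    """Extrapolate a 24-hour cost to 7 days and express it as a percentage of the budget."""
    weekly = daily_cost * 7
    return weekly, weekly / weekly_budget * 100


# Sample values from the example report: $3.49 in 24 hours, $20/week budget.
weekly, pct = project_weekly(3.49, 20.00)
print(f"Projected weekly: ~${weekly:.2f} ({pct:.0f}% of budget)")
# prints: Projected weekly: ~$24.43 (122% of budget)
```

A projection of 122% of budget means the week would end roughly 22% over the limit if spending continues at the same daily rate.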
Model Cost Reference
Use these rates for estimation (as of 2026):
Anthropic (Claude OAuth / API)
| Model | Input/1M | Output/1M | Cache Read/1M | Cache Write/1M |
|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | $6.25 |
| Sonnet 4.5 | $3.00 | $15.00 | $0.30 | $3.75 |
| Haiku 4.5 | $1.00 | $5.00 | $0.08 | $1.25 |
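To show how these rates turn token counts into a dollar estimate, here is a minimal sketch. It assumes you already have per-category token counts for a session; the dictionary keys, the `estimate_cost` function, and the sample usage numbers are hypothetical and are not the output format of session_status.

```python
# Rates in USD per 1M tokens, copied from the table above.
RATES = {
    "opus-4.6": {"input": 5.00, "output": 25.00, "cache_read": 0.50, "cache_write": 6.25},
    "sonnet-4.5": {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_write": 3.75},
    "haiku-4.5": {"input": 1.00, "output": 5.00, "cache_read": 0.08, "cache_write": 1.25},
}


def estimate_cost(model, tokens):
    """Return the estimated USD cost for a dict of token counts keyed by category."""
    rates = RATES[model]
    return sum(count * rates[kind] / 1_000_000 for kind, count in tokens.items())


# Hypothetical usage for one session.
usage = {"input": 100_000, "output": 20_000, "cache_read": 400_000, "cache_write": 50_000}
print(f"${estimate_cost('sonnet-4.5', usage):.2f}")  # prints: $0.91
```

Pricing for other providers could be added as extra entries in `RATES`, as long as the same four categories are filled in (use 0 where a category does not apply).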
Free Options
| Model | Cost | Use For |
|---|---|---|
| Ollama (local) | $0 | Heartbeats, simple tasks |
| Gemini OAuth | $0* | Fallback (rate limited) |
*Free tier with rate limits
Optimization Playbook
Quick Wins (Do These First)
1. Heartbeats on Ollama
   ```json5
   { "heartbeat": { "model": "ollama/llama3.2:3b" } }
   ```
   Saves: 100% of heartbeat costs (can be $5-10/week with Opus)
2. Haiku Cache Retention Off
   ```json5
   { "anthropic/claude-haiku-4-5": { "params": { "cacheRetention": "none" } } }
   ```
   Saves: cache write costs on a cheap model, where caching is not worth it
3. Context Pruning
   ```json5
   { "contextPruning": { "mode": "cache-ttl", "ttl": "5m" } }
   ```
   Saves: re-reading stale context on every turn
4. Opus/Sonnet Cache Retention Long
   ```json5
   { "anthropic/claude-opus-4-6": { "params": { "cacheRetention": "long" } } }
   ```
   Saves: re-sending the system prompt at full input price on every turn (biggest single saving)
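The caching advice above comes down to a break-even calculation: a cache write costs more than a normal input read, and each later cache read costs much less. The sketch below compares the two, assuming the cache stays warm for the whole conversation (one write, then reads only). The function name `prefix_cost`, the 20K-token prefix, and the 10-turn conversation are hypothetical illustration values; the rates are the Opus 4.6 figures from the cost reference table.

```python
def prefix_cost(prefix_tokens, turns, input_rate, write_rate, read_rate):
    """Return (uncached, cached) USD cost of sending a fixed prompt prefix on every turn.

    Rates are USD per 1M tokens. The cached case assumes one cache write on the
    first turn and a cache read on every later turn.
    """
    millions = prefix_tokens / 1_000_000
    uncached = turns * millions * input_rate
    cached = millions * (write_rate + (turns - 1) * read_rate)
    return uncached, cached


# Opus 4.6: a 20K-token system prompt re-sent over 10 turns.
uncached, cached = prefix_cost(20_000, 10, input_rate=5.00, write_rate=6.25, read_rate=0.50)
print(f"uncached: ${uncached:.3f}, cached: ${cached:.3f}")

# With a single turn, the cache is written but never read, so caching costs more.
one_uncached, one_cached = prefix_cost(20_000, 1, input_rate=5.00, write_rate=6.25, read_rate=0.50)
print(f"single turn, uncached: ${one_uncached:.3f}, cached: ${one_cached:.3f}")
```

Under these assumptions caching pays off from the second turn onward, which is why long retention suits agents with many turns per session and adds cost for sessions whose cache is rarely reused.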
Model Tiering (Biggest Impact)
| Task Type | Use This | Not This | Saving |
|---|---|---|---|
| Coordination, complex reasoning | Opus | - | Justified |
| Finance, data analysis | Sonnet | Opus | -40% |
| Sales drafts, marketing copy | Haiku | Sonnet | -67% |
| Heartbeats, health checks | Ollama | Any paid | -100% |
| Tweet drafts | Haiku or Grok | Opus | -80% |
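The saving percentages in the tiering table can be checked against the per-token rates in the cost reference. This is a minimal sketch, assuming the saving is the relative price difference at equal token usage; the `saving_pct` helper is hypothetical. It uses input rates, and the output rates in the reference table have the same ratios.

```python
# Input rates in USD per 1M tokens, from the cost reference table (Ollama is local and free).
INPUT_RATE = {"Opus": 5.00, "Sonnet": 3.00, "Haiku": 1.00, "Ollama": 0.00}


def saving_pct(use, instead_of):
    """Return the percentage saved by switching from `instead_of` to `use`, rounded to a whole number."""
    return round((1 - INPUT_RATE[use] / INPUT_RATE[instead_of]) * 100)


for use, instead_of in [("Sonnet", "Opus"), ("Haiku", "Sonnet"), ("Ollama", "Opus"), ("Haiku", "Opus")]:
    print(f"{use} instead of {instead_of}: -{saving_pct(use, instead_of)}%")
```

These figures hold only when the cheaper model uses a similar number of tokens for the task; a model that needs more retries or longer outputs will save less.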
Session Management
- Daily reset: Sessions auto-clear at a set hour (reduces token accumulation)
  ```json5
  { "session": { "reset": { "mode": "daily", "atHour": 4, "idleMinutes": 45 } } }
  ```
- Memory flush: Save important context before compaction
  ```json5
  { "compaction": { "memoryFlush": { "enabled": true } } }
  ```
Alert Thresholds
Configure in your monitoring agent's memory:
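```markdown
Budget Alerts
- Daily budget: $5.00 (alert at 80%, i.e. $4.00)
- Weekly budget: $20.00 (alert at 70%, i.e. $14.00)
- Per-agent daily cap: $2.00
- Alert channel: Telegram DM
```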
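How a threshold check might work can be sketched in a few lines. This assumes the example thresholds used in this document (a $5.00 daily budget with an alert at 80%, and a $20.00 weekly budget with an alert at 70%); the `check_budget` function and its three status labels are hypothetical, since the skill itself leaves the comparison to the agent.

```python
def check_budget(spent, budget, alert_ratio):
    """Classify spend as 'ok', 'alert' (alert level reached), or 'over' (budget reached)."""
    if spent >= budget:
        return "over"
    if spent >= budget * alert_ratio:
        return "alert"
    return "ok"


# Daily budget $5.00, alert at 80% ($4.00).
print(check_budget(3.49, 5.00, 0.80))    # prints: ok
# Weekly budget $20.00, alert at 70% ($14.00).
print(check_budget(14.50, 20.00, 0.70))  # prints: alert
print(check_budget(24.43, 20.00, 0.70))  # prints: over
```

The same function can cover the per-agent daily cap by passing an agent's spend with a $2.00 budget and an alert ratio of 1.0.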
Integration with DevOps Agent
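If you have a DevOps or monitoring agent, add this to its AGENTS.md:
```markdown
Cost Monitoring
- Run the cost report daily at 20:00
- Alert if any agent's daily spend exceeds $2
- Send a weekly summary every Monday at 09:00
- Track trends: is usage rising or falling?
```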
FAQ
Q: Does this skill make API calls?
A: No. It uses OpenClaw's built-in session_status tool. No external APIs, no additional costs.
Q: How accurate are cost estimates?
A: They are based on published model pricing. Actual costs may vary with cache hits. Estimates are conservative (slightly high).
Q: Can I track costs per conversation?
A: Not directly. Costs are tracked per session. Use sessions_list to see per-session token counts.
Q: Does it work with non-Anthropic models?
A: Yes. Token counts work for all providers. Cost estimation requires known pricing (add custom rates in the cost reference section).
Changelog
v1.1.0
- Generalized all agent names in examples
- Removed setup-specific references
v1.0.0
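Agent Cost Monitor: Understand Your Agent Costs
Track token usage, costs, and efficiency across all your OpenClaw agents in real time. Get alerts before you exceed your budget.
The Problem
Running multiple agents is powerful, but costs can climb quickly if you are not monitoring them:
- Which agent consumes the most tokens?
- Are heartbeats wasting money on expensive models?
- Is caching actually saving you anything?
- When will you hit your weekly rate limit?
What This Skill Does
When triggered (via cron or manually), the agent:
1. Checks session_status for each agent
2. Calculates per-agent and total costs
3. Compares against budget thresholds
4. Sends alerts if limits are approaching
5. Suggests optimizations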
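Usage
Ask your monitoring agent (or any agent with this skill):
```
Give me a cost report for all agents
Which agent used the most tokens today?
Will I hit my rate limit this week?
```
Automated Daily Report (Cron)
```json5
{
  name: "Daily Cost Report",
  // Runs every day at 20:00 Berlin time
  schedule: { kind: "cron", expr: "0 20 * * *", tz: "Europe/Berlin" },
  payload: {
    kind: "agentTurn",
    message: "Run a cost report for all agents. Check session_status for each agent. Report: total tokens, cost per agent, top spender, budget alerts. Send a summary to the user."
  },
  sessionTarget: "isolated",
  delivery: { mode: "announce" }
}
```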
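Cost Report Format
When generating a report, use this structure:
```markdown
💰 Agent Cost Report - [Date]

Per-Agent Breakdown

| Agent | Model | Tokens (24h) | Est. Cost | Status |
|---|---|---|---|---|
| Central | Opus 4.6 | 125K | $1.87 | ⚠️ High |
| Techops | Opus 4.6 | 89K | $1.33 | ✅ Normal |
| Atlas | Sonnet 4.5 | 45K | $0.27 | ✅ Low |
| Closer | Haiku 4.5 | 23K | $0.02 | ✅ Minimal |
| Heartbeats | Ollama | 12K | $0.00 | ✅ Free |

Summary

- Total (24h): 294K tokens (~$3.49)
- Projected weekly: ~$24.43
- Budget: $20/week → ⚠️ projected at 122% of budget

Recommendations

1. Move Techops from Opus to Sonnet for routine tasks (40% cost reduction)
2. Extend the heartbeat interval from 15 minutes to 30 minutes
3. Enable context pruning on Atlas (idle sessions consume cache)
```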
Model Cost Reference
Use these rates for estimation (as of 2026):
Anthropic (Claude OAuth / API)
| Model | Input/1M | Output/1M | Cache Read/1M | Cache Write/1M |
|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | $6.25 |
| Sonnet 4.5 | $3.00 | $15.00 | $0.30 | $3.75 |
| Haiku 4.5 | $1.00 | $5.00 | $0.08 | $1.25 |
Free Options
| Model | Cost | Use For |
|---|---|---|
| Ollama (local) | $0 | Heartbeats, simple tasks |
| Gemini OAuth | $0* | Fallback (rate limited) |
*Free tier with rate limits
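Optimization Playbook
Quick Wins (Do These First)
1. Heartbeats on Ollama
   ```json5
   { "heartbeat": { "model": "ollama/llama3.2:3b" } }
   ```
   Saves: 100% of heartbeat costs (can be $5-10/week with Opus)
2. Haiku Cache Retention Off
   ```json5
   { "anthropic/claude-haiku-4-5": { "params": { "cacheRetention": "none" } } }
   ```
   Saves: cache write costs on a cheap model, where caching is not worth it
3. Context Pruning
   ```json5
   { "contextPruning": { "mode": "cache-ttl", "ttl": "5m" } }
   ```
   Saves: re-reading stale context on every turn
4. Opus/Sonnet Cache Retention Long
   ```json5
   { "anthropic/claude-opus-4-6": { "params": { "cacheRetention": "long" } } }
   ```
   Saves: re-sending the system prompt on every turn (biggest single saving)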
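Model Tiering (Biggest Impact)
| Task Type | Use This | Not This | Saving |
|---|---|---|---|
| Coordination, complex reasoning | Opus | - | Justified |
| Finance, data analysis | Sonnet | Opus | -40% |
| Sales drafts, marketing copy | Haiku | Sonnet | -67% |
| Heartbeats, health checks | Ollama | Any paid | -100% |
| Tweet drafts | Haiku or Grok | Opus | -80% |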
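Session Management
- Daily reset: Sessions auto-clear at a set hour (reduces token accumulation)
  ```json5
  { "session": { "reset": { "mode": "daily", "atHour": 4, "idleMinutes": 45 } } }
  ```
- Memory flush: Save important context before compaction
  ```json5
  { "compaction": { "memoryFlush": { "enabled": true } } }
  ```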
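Alert Thresholds
Configure in your monitoring agent's memory:
```markdown
Budget Alerts
- Daily budget: $5.00 (alert at 80%, i.e. $4.00)
- Weekly budget: $20.00 (alert at 70%, i.e. $14.00)
- Per-agent daily cap: $2.00
- Alert channel: Telegram DM
```
Integration with DevOps Agent
If you have a DevOps or monitoring agent, add this to its AGENTS.md:
```markdown
Cost Monitoring
- Run the cost report daily at 20:00
- Alert if any agent's daily spend exceeds $2
- Send a weekly summary every Monday at 09:00
- Track trends: is usage rising or falling?
```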
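FAQ
Q: Does this skill make API calls?
A: No. It uses OpenClaw's built-in session_status tool. No external APIs, no additional costs.
Q: How accurate are cost estimates?
A: They are based on published model pricing. Actual costs may vary with cache hits. Estimates are conservative (slightly high).
Q: Can I track costs per conversation?
A: Not directly. Costs are tracked per session. Use sessions_list to see per-session token counts.
Q: Does it work with non-Anthropic models?
A: Yes. Token counts work for all providers. Cost estimation requires known pricing (add custom rates in the cost reference section).
Changelog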
v1.1.0
v1.0.0