Agent Cost Monitor — Know What Your Agents Cost

Track token usage, costs, and efficiency across all your OpenClaw agents in real time. Get alerts before you blow your budget.

The Problem

Running multiple agents is powerful — but expensive if you're not watching:

- Which agent is burning the most tokens?
- Are heartbeats wasting money on expensive models?
- Is caching actually saving you anything?
- When will you hit your weekly rate limit?

What This Skill Does

When triggered (via cron or manually), the agent:

1. Checks session_status for each agent
2. Calculates per-agent and total costs
3. Compares against budget thresholds
4. Sends alerts if limits are approaching
5. Suggests optimization moves
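
Steps 1 and 2 above can be sketched roughly as follows, assuming usage has already been collected into per-session records. The record shape (`agent`, `tokens`, `cost`) and the `summarize` helper are hypothetical; the actual output of session_status is not shown in this document and may look different.

```python
from collections import defaultdict

def summarize(sessions):
    """Aggregate per-session records into per-agent totals, highest cost first."""
    totals = defaultdict(lambda: {"tokens": 0, "cost": 0.0})
    for record in sessions:
        totals[record["agent"]]["tokens"] += record["tokens"]
        totals[record["agent"]]["cost"] += record["cost"]
    return sorted(totals.items(), key=lambda item: item[1]["cost"], reverse=True)

# Hypothetical session records (not real session_status output).
sessions = [
    {"agent": "central", "tokens": 80_000, "cost": 1.20},
    {"agent": "atlas", "tokens": 45_000, "cost": 0.27},
    {"agent": "central", "tokens": 45_000, "cost": 0.67},
]

ranked = summarize(sessions)
for agent, total in ranked:
    print(f"{agent}: {total['tokens']:,} tokens, ${total['cost']:.2f}")
print(f"Top consumer: {ranked[0][0]}")
```

Sorting by cost rather than by token count keeps the ranking meaningful when agents run on differently priced models.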

Usage

Ask your monitoring agent (or any agent with this skill):

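Give me a cost report for all agents
Which agent used the most tokens today?
Will I hit my rate limit this week?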

Automated Daily Report (Cron)

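{
  "name": "Daily Cost Report",
  "schedule": { "kind": "cron", "expr": "0 20 * * *", "tz": "Europe/Berlin" },
  "payload": {
    "kind": "agentTurn",
    "message": "Run a cost report for all agents. Check session_status for each agent. Report: total tokens, cost per agent, top consumer, budget warnings. Send a summary to the user."
  },
  "sessionTarget": "isolated",
  "delivery": { "mode": "announce" }
}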

Cost Report Format

When generating a report, use this structure:

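💰 Agent Cost Report - [date]

Per-Agent Breakdown

| Agent | Model | Tokens (24h) | Est. Cost | Status |
|---|---|---|---|---|
| Central | Opus 4.6 | 125K | $1.87 | ⚠️ High |
| Techops | Opus 4.6 | 89K | $1.33 | ✅ Normal |
| Atlas | Sonnet 4.5 | 45K | $0.27 | ✅ Low |
| Closer | Haiku 4.5 | 23K | $0.02 | ✅ Very low |
| Heartbeats | Ollama | 12K | $0.00 | ✅ Free |

Summary

- 24h total: 294K tokens (~$3.49)
- Projected weekly: ~$24.43
- Budget: $20/week → ⚠️ projected at 122% of budget

Recommendations

1. Move Techops from Opus to Sonnet for routine tasks (40% lower cost)
2. Increase the heartbeat interval from 15 to 30 minutes
3. Enable context pruning on Atlas (idle sessions are consuming cache)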
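
The weekly projection in a report like this is plain arithmetic: multiply the 24-hour cost by seven and compare the result with the weekly budget. The sketch below uses the sample figures from this document's report template ($3.49 over 24 hours against a $20/week budget). `project_weekly` is an illustrative helper, not part of OpenClaw.

```python
def project_weekly(daily_cost, weekly_budget):
    """Return (projected weekly cost, projected cost as a percentage of the budget)."""
    weekly = daily_cost * 7
    return weekly, weekly / weekly_budget * 100

weekly, pct = project_weekly(3.49, 20.00)
print(f"Projected weekly: ~${weekly:.2f} ({pct:.0f}% of budget)")
```

A straight-line projection assumes every day looks like the last 24 hours, so it is best read as an early warning rather than a forecast.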

Model Cost Reference

Use these rates for estimation (as of 2026):

Anthropic (Claude OAuth / API)
| Model | Input/1M | Output/1M | Cache Read/1M | Cache Write/1M |
|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | $6.25 |
| Sonnet 4.5 | $3.00 | $15.00 | $0.30 | $3.75 |
| Haiku 4.5 | $1.00 | $5.00 | $0.08 | $1.25 |
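
As a rough illustration of how these rates turn token counts into a dollar estimate, the sketch below multiplies each token category by its per-million rate. The model keys, the `estimate_cost` helper, and the sample usage numbers are assumptions made for illustration only.

```python
# Rates in USD per 1M tokens, copied from the Anthropic table in this section.
RATES = {
    "opus-4.6":   {"input": 5.00, "output": 25.00, "cache_read": 0.50, "cache_write": 6.25},
    "sonnet-4.5": {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_write": 3.75},
    "haiku-4.5":  {"input": 1.00, "output": 5.00,  "cache_read": 0.08, "cache_write": 1.25},
}

def estimate_cost(model, tokens):
    """Estimate USD cost for a dict mapping token kind to token count."""
    rates = RATES[model]
    return sum(count * rates[kind] / 1_000_000 for kind, count in tokens.items())

# Hypothetical usage: 100K input, 20K output, 500K cache-read tokens on Opus.
usage = {"input": 100_000, "output": 20_000, "cache_read": 500_000}
print(f"${estimate_cost('opus-4.6', usage):.2f}")
```

Rates for other providers could be added to the same dictionary, as long as their pricing is known.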

Free Options
| Model | Cost | Use For |
|---|---|---|
| Ollama (local) | $0 | Heartbeats, simple tasks |
| Gemini OAuth | $0* | Fallback (rate limited) |

*Free tier with rate limits

Optimization Playbook

Quick Wins (Do These First)

1. Heartbeats on Ollama

{ "heartbeat": { "model": "ollama/llama3.2:3b" } }

Saves: 100% of heartbeat costs (can be $5-10/week with Opus)
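
To see where a figure in the $5-10/week range can come from, here is a back-of-the-envelope estimate. It assumes one heartbeat every 15 minutes and roughly 2,000 input tokens per heartbeat (both hypothetical values), billed at the Opus input rate of $5.00 per 1M tokens from the cost reference. Output tokens are ignored.

```python
MINUTES_PER_WEEK = 7 * 24 * 60
OPUS_INPUT_RATE = 5.00  # USD per 1M input tokens, from the cost reference

def weekly_heartbeat_cost(interval_minutes, tokens_per_heartbeat, rate_per_million):
    """Estimate the weekly input-token cost of heartbeats sent at a fixed interval."""
    heartbeats = MINUTES_PER_WEEK // interval_minutes
    return heartbeats * tokens_per_heartbeat * rate_per_million / 1_000_000

# Assumption: about 2,000 input tokens per heartbeat.
print(f"Opus, every 15 min: ${weekly_heartbeat_cost(15, 2_000, OPUS_INPUT_RATE):.2f}/week")
print(f"Opus, every 30 min: ${weekly_heartbeat_cost(30, 2_000, OPUS_INPUT_RATE):.2f}/week")
print(f"Local Ollama, every 15 min: ${weekly_heartbeat_cost(15, 2_000, 0.0):.2f}/week")
```

Under these assumptions, halving the heartbeat frequency halves the cost, while moving heartbeats to a local model removes it entirely.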

2. Haiku Cache Retention Off

{ "anthropic/claude-haiku-4-5": { "params": { "cacheRetention": "none" } } }

Saves: Cache write costs on cheap model (not worth caching)

3. Context Pruning

{ "contextPruning": { "mode": "cache-ttl", "ttl": "5m" } }

Saves: Stale context re-reads on every turn

4. Opus/Sonnet Cache Retention Long

{ "anthropic/claude-opus-4-6": { "params": { "cacheRetention": "long" } } }

Saves: Re-sending system prompt every turn (biggest single saving)
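
A small worked comparison shows why caching the system prompt matters on an expensive model. The sketch assumes a 10,000-token system prompt and 50 turns (hypothetical numbers), uses the Opus rates from the cost reference, and assumes the cache stays warm for the whole session so the prompt is written once and read on every later turn.

```python
INPUT_RATE = 5.00        # USD per 1M tokens, Opus input
CACHE_WRITE_RATE = 6.25  # USD per 1M tokens, Opus cache write
CACHE_READ_RATE = 0.50   # USD per 1M tokens, Opus cache read

def prompt_cost_uncached(prompt_tokens, turns):
    """Cost of re-sending the full system prompt as fresh input on every turn."""
    return prompt_tokens * turns * INPUT_RATE / 1_000_000

def prompt_cost_cached(prompt_tokens, turns):
    """Cost of one cache write followed by a cache read on each later turn."""
    write = prompt_tokens * CACHE_WRITE_RATE / 1_000_000
    reads = prompt_tokens * (turns - 1) * CACHE_READ_RATE / 1_000_000
    return write + reads

uncached = prompt_cost_uncached(10_000, 50)
cached = prompt_cost_cached(10_000, 50)
print(f"Uncached: ${uncached:.2f}, cached: ${cached:.2f}")
```

If the cache expires between turns, each expiry adds another cache write, which is why a long retention setting pairs naturally with the expensive models.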

Model Tiering (Biggest Impact)

| Task Type | Use This | Not This | Saving |
|---|---|---|---|
| Coordination, complex reasoning | Opus | — | Justified |
| Finance, data analysis | | | |

Session Management

- Daily reset: Sessions auto-clear at a set hour (reduces token accumulation)

{ "session": { "reset": { "mode": "daily", "atHour": 4, "idleMinutes": 45 } } }

- Memory flush: Save important context before compaction

{ "compaction": { "memoryFlush": { "enabled": true } } }

Alert Thresholds

Configure in your monitoring agent's memory:

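Budget Alerts

- Daily budget: $5.00 (alert at 80% = $4.00)
- Weekly budget: $20.00 (alert at 70% = $14.00)
- Per-agent daily cap: $2.00
- Alert channel: Telegram DM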
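
The threshold check itself can be sketched as below. The limits mirror the sample budget values given in this document ($5.00/day with a warning at 80%, $20.00/week with a warning at 70%, and a $2.00 per-agent daily cap). The `check_budgets` helper, the `BUDGETS` structure, and the spend figures passed in are hypothetical.

```python
BUDGETS = {
    "daily": {"limit": 5.00, "warn_at": 0.80},
    "weekly": {"limit": 20.00, "warn_at": 0.70},
}
PER_AGENT_DAILY_CAP = 2.00

def check_budgets(daily_spend, weekly_spend, per_agent_daily):
    """Return a list of alert messages for every threshold that has been reached."""
    alerts = []
    for period, spend in (("daily", daily_spend), ("weekly", weekly_spend)):
        budget = BUDGETS[period]
        threshold = budget["limit"] * budget["warn_at"]
        if spend >= threshold:
            alerts.append(f"{period} spend ${spend:.2f} reached warning level ${threshold:.2f}")
    for agent, spend in per_agent_daily.items():
        if spend > PER_AGENT_DAILY_CAP:
            alerts.append(f"{agent} spent ${spend:.2f} today (cap ${PER_AGENT_DAILY_CAP:.2f})")
    return alerts

# Hypothetical spend figures.
alerts = check_budgets(3.49, 15.10, {"central": 1.87, "techops": 2.10})
for message in alerts:
    print(message)
```

Delivering the messages (for example to a chat channel) is left out, since that depends on how the monitoring agent is set up.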

Integration with DevOps Agent

If you have a DevOps or monitoring agent, add this to its AGENTS.md:

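Cost Monitoring

- Run the cost report daily at 20:00
- Alert if any agent spends more than $2 in a day
- Send a weekly summary every Monday at 09:00
- Track trends: is usage rising or falling?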

FAQ

Q: Does this skill make API calls?
A: No. It uses OpenClaw's built-in session_status tool. No external APIs, no additional costs.

Q: How accurate are cost estimates?
A: They are based on published model pricing. Actual costs vary with the cache hit rate. Estimates are conservative (slightly high).

Q: Can I track costs per conversation?
A: Not directly. Costs are tracked per session. Use sessions_list to see per-session token counts.

Q: Does it work with non-Anthropic models?
A: Yes. Token counts work for all providers. Cost estimation requires known pricing (add custom rates in the cost reference section).

Changelog

v1.1.0

- Generalized all agent names in examples
- No specific setup references

v1.0.0

- Initial release

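Agent Cost Monitor: Know What Your Agents Cost

Track token usage, costs, and efficiency across all your OpenClaw agents in real time. Get alerts before you exceed your budget.

The Problem

Running multiple agents is powerful, but it can get expensive if you're not watching:

- Which agent is consuming the most tokens?
- Are heartbeats wasting money on expensive models?
- Is caching actually saving you anything?
- When will you hit your weekly rate limit?

What This Skill Does

When triggered (via cron or manually), the agent:

1. Checks session_status for each agent
2. Calculates per-agent and total costs
3. Compares against budget thresholds
4. Sends alerts when limits are approaching
5. Suggests optimizations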

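Usage

Ask your monitoring agent (or any agent with this skill):

Give me a cost report for all agents
Which agent used the most tokens today?
Will I hit my rate limit this week?

Automated Daily Report (Cron)

{
  "name": "Daily Cost Report",
  "schedule": { "kind": "cron", "expr": "0 20 * * *", "tz": "Europe/Berlin" },
  "payload": {
    "kind": "agentTurn",
    "message": "Run a cost report for all agents. Check session_status for each agent. Report: total tokens, cost per agent, top consumer, budget warnings. Send a summary to the user."
  },
  "sessionTarget": "isolated",
  "delivery": { "mode": "announce" }
}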

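Cost Report Format

When generating a report, use this structure:

💰 Agent Cost Report - [date]

Per-Agent Breakdown

| Agent | Model | Tokens (24h) | Est. Cost | Status |
|---|---|---|---|---|
| Central | Opus 4.6 | 125K | $1.87 | ⚠️ High |
| Techops | Opus 4.6 | 89K | $1.33 | ✅ Normal |
| Atlas | Sonnet 4.5 | 45K | $0.27 | ✅ Low |
| Closer | Haiku 4.5 | 23K | $0.02 | ✅ Very low |
| Heartbeats | Ollama | 12K | $0.00 | ✅ Free |

Summary

- 24h total: 294K tokens (~$3.49)
- Projected weekly: ~$24.43
- Budget: $20/week → ⚠️ projected at 122% of budget

Recommendations

1. Move Techops from Opus to Sonnet for routine tasks (40% lower cost)
2. Increase the heartbeat interval from 15 to 30 minutes
3. Enable context pruning on Atlas (idle sessions are consuming cache)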

Model Cost Reference

Use these rates for estimation (as of 2026):

Anthropic (Claude OAuth / API)

| Model | Input/1M | Output/1M | Cache Read/1M | Cache Write/1M |
|---|---|---|---|---|
| Opus 4.6 | $5.00 | $25.00 | $0.50 | $6.25 |
| Sonnet 4.5 | $3.00 | $15.00 | $0.30 | $3.75 |
| Haiku 4.5 | $1.00 | $5.00 | $0.08 | $1.25 |

Free Options

| Model | Cost | Use For |
|---|---|---|
| Ollama (local) | $0 | Heartbeats, simple tasks |
| Gemini OAuth | $0* | Fallback (rate limited) |

*Free tier with rate limits

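Optimization Playbook

Quick Wins (Do These First)

1. Heartbeats on Ollama

{ "heartbeat": { "model": "ollama/llama3.2:3b" } }

Saves: 100% of heartbeat costs (can be $5-10/week with Opus)

2. Haiku Cache Retention Off

{ "anthropic/claude-haiku-4-5": { "params": { "cacheRetention": "none" } } }

Saves: Cache write costs on a cheap model (not worth caching)

3. Context Pruning

{ "contextPruning": { "mode": "cache-ttl", "ttl": "5m" } }

Saves: Stale context re-reads on every turn

4. Opus/Sonnet Cache Retention Long

{ "anthropic/claude-opus-4-6": { "params": { "cacheRetention": "long" } } }

Saves: Re-sending the system prompt every turn (biggest single saving)

Model Tiering (Biggest Impact)

| Task Type | Use This | Not This | Saving |
|---|---|---|---|
| Coordination, complex reasoning | Opus | — | Justified |
| Finance, data analysis | | | |

Session Management

- Daily reset: Sessions auto-clear at a set hour (reduces token accumulation)

{ "session": { "reset": { "mode": "daily", "atHour": 4, "idleMinutes": 45 } } }

- Memory flush: Save important context before compaction

{ "compaction": { "memoryFlush": { "enabled": true } } }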

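Alert Thresholds

Configure in your monitoring agent's memory:

Budget Alerts

- Daily budget: $5.00 (alert at 80% = $4.00)
- Weekly budget: $20.00 (alert at 70% = $14.00)
- Per-agent daily cap: $2.00
- Alert channel: Telegram DM

Integration with DevOps Agent

If you have a DevOps or monitoring agent, add this to its AGENTS.md:

Cost Monitoring

- Run the cost report daily at 20:00
- Alert if any agent spends more than $2 in a day
- Send a weekly summary every Monday at 09:00
- Track trends: is usage rising or falling?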

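FAQ

Q: Does this skill make API calls?
A: No. It uses OpenClaw's built-in session_status tool. No external APIs, no additional costs.

Q: How accurate are cost estimates?
A: They are based on published model pricing. Actual costs vary with the cache hit rate. Estimates are conservative (slightly high).

Q: Can I track costs per conversation?
A: Not directly. Costs are tracked per session. Use sessions_list to see per-session token counts.

Q: Does it work with non-Anthropic models?
A: Yes. Token counts work for all providers. Cost estimation requires known pricing (add custom rates in the cost reference section).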
