Token Saver v3
💡 Did you know? Every API call sends your workspace files (SOUL.md, USER.md, MEMORY.md, AGENTS.md, etc.) along with your message. These files count toward your context window, slowing responses and costing real money on every message.
Token Saver v3 is model-aware — it knows your model's context window and adapts recommendations accordingly. Using Gemini's 1M context? Presets scale up. On GPT-4o's 128K? Presets adjust down.
What's New in v3
| Feature | v2 | v3 |
|---|---|---|
| Compaction presets | Fixed (80K/120K/160K) | Dynamic (% of model's context) |
| Model detection | Fragile, env-only | Robust fallback chain |
| Context windows | Not tracked | Full registry (9 models) |
| Model info | Hardcoded pricing | JSON registry, easy updates |
| Already-optimized | Re-compressed | Smart bypass |
Commands
| Command | What it does |
|---|---|
| `/optimize` | Full dashboard: files, models, context usage % |
| `/optimize tokens` | Compress workspace files (auto-backup) |
| `/optimize compaction` | Chat compaction control (model-aware) |
| `/optimize compaction balanced` | Apply balanced preset (60% of context) |
| `/optimize compaction 120` | Custom threshold (compact at 120K) |
| `/optimize models` | Detailed model audit with registry |
| `/optimize revert` | Restore backups, disable persistent mode |
Features
📊 Model-Aware Dashboard
Shows current model, context window, and usage percentage:
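```
🤖 Model: Claude Opus 4.5 (200K context)
   Detected from: openclaw.json
📊 Context usage: [████████░░░░░░░░░░░░] 42% (84K/200K)
```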
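As an illustration only, a context-usage line of this kind could be produced as in the minimal sketch below. The function name `render_usage`, the 20-cell bar width, and the round-down rule for filled cells are assumptions made for this example, not details taken from Token Saver itself.

```python
def render_usage(used_tokens: int, context_window: int, width: int = 20) -> str:
    """Return a text bar such as '[████░░░░] 42% (84K/200K)'."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    # Clamp to the 0..1 range so an overfull context still renders a full bar.
    fraction = min(max(used_tokens / context_window, 0.0), 1.0)
    filled = int(fraction * width)  # round down so the bar never overstates usage
    bar = "█" * filled + "░" * (width - filled)
    percent = round(fraction * 100)
    return f"[{bar}] {percent}% ({used_tokens // 1000}K/{context_window // 1000}K)"


print(render_usage(84_000, 200_000))
```

With 84K of a 200K window in use, 8 of the 20 cells are filled and the label reads 42%.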
📁 Workspace File Compression
Scans all `.md` files and shows each file's token count and potential savings. Smart bypass skips already-optimized files.
File-aware compression:
- SOUL.md: Light compression, keeps personality language
- AGENTS.md: Medium compression, dense instructions
- USER.md / MEMORY.md: Heavy compression, key:value format
- PROJECTS.md: No compression (user structure preserved)
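The scan-and-classify step can be pictured with the sketch below. It is a rough approximation under stated assumptions: the 4-characters-per-token estimate is a common heuristic rather than a real tokenizer, and `COMPRESSION_LEVELS`, `estimate_tokens`, and `scan_workspace` are hypothetical names, not the skill's actual code.

```python
from pathlib import Path

# Compression level per file, mirroring the file-aware list in this README.
COMPRESSION_LEVELS = {
    "SOUL.md": "light",
    "AGENTS.md": "medium",
    "USER.md": "heavy",
    "MEMORY.md": "heavy",
    "PROJECTS.md": "none",
}


def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (a heuristic, not a tokenizer)."""
    return (len(text) + 3) // 4


def scan_workspace(workspace: Path) -> list[tuple[str, int, str]]:
    """Return (file name, estimated tokens, compression level) for each .md file."""
    report = []
    for path in sorted(workspace.glob("*.md")):
        tokens = estimate_tokens(path.read_text(encoding="utf-8"))
        # Files not in the table are left alone (an assumption made for this sketch).
        level = COMPRESSION_LEVELS.get(path.name, "none")
        report.append((path.name, tokens, level))
    return report
```

Calling `scan_workspace(Path("."))` in a workspace directory would return one tuple per Markdown file, ready to be printed as a table.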
💬 Dynamic Compaction Presets
Presets adapt to your model's context window:
| Preset | % of Context | Claude 200K | GPT-4o 128K | Gemini 1M |
|---|---|---|---|---|
| Aggressive | 40% | 80K | 51K | 400K |
| Balanced | 60% | 120K | 77K | 600K |
| Conservative | 80% | 160K | 102K | 800K |
| Off | 95% | 190K | 122K | 950K |
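The arithmetic behind the table is a percentage of the model's context window, rounded to the nearest thousand for display. The sketch below reproduces the GPT-4o column; the names `PRESETS`, `compaction_threshold`, and `format_k` are illustrative and not taken from the skill's source.

```python
# Preset percentages as listed in the table.
PRESETS = {"aggressive": 40, "balanced": 60, "conservative": 80, "off": 95}


def compaction_threshold(context_window: int, preset: str) -> int:
    """Return the token count at which chat compaction would trigger."""
    return context_window * PRESETS[preset] // 100


def format_k(tokens: int) -> str:
    """Format a token count in thousands, e.g. 76800 -> '77K'."""
    return f"{round(tokens / 1000)}K"


for name in PRESETS:
    print(name, format_k(compaction_threshold(128_000, name)))
```

Integer arithmetic is used so the thresholds do not depend on floating-point rounding.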
🤖 Model Registry
24+ models with context windows, pricing, and aliases:
- Claude: Opus 4.6 (1M), Opus 4.5, Sonnet 4.5, Sonnet 4, Haiku 4.5, Haiku 3.5 (200K)
- OpenAI: GPT-5.2, GPT-5.1, GPT-5-mini, GPT-5-nano (256K), GPT-4.1, GPT-4o (128K), o1, o3, o4-mini
- Gemini: 3 Pro (2M), 2.5 Pro, 2.0 Flash (1M)
- Others: DeepSeek V3 (64K), Kimi K2.5 (128K), Llama 3.3 70B, Mistral Large
🔍 Robust Model Detection
Detection priority:
1. Runtime injection (`--model=...`)
2. Environment variables (`SKILL_MODEL`, `OPENCLAW_MODEL`)
3. Config file (`~/.openclaw/openclaw.json`)
4. File inference (TOOLS.md, MEMORY.md mentions)
5. Fallback: Claude Sonnet 4 (safe default)
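A fallback chain of this shape can be sketched as follows. This is a simplified illustration, not the skill's implementation: the `"model"` key in `openclaw.json`, the identifier `claude-sonnet-4`, and the function name `detect_model` are assumptions, and the file-inference step is left out.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_MODEL = "claude-sonnet-4"  # hypothetical identifier for the documented fallback


def detect_model(cli_model: Optional[str] = None,
                 env: Optional[Mapping[str, str]] = None,
                 config_path: Optional[Path] = None) -> tuple[str, str]:
    """Return (model, source), trying each source in priority order."""
    env = os.environ if env is None else env
    if cli_model:  # 1. runtime injection (--model=...)
        return cli_model, "runtime"
    for var in ("SKILL_MODEL", "OPENCLAW_MODEL"):  # 2. environment variables
        if env.get(var):
            return env[var], f"env:{var}"
    if config_path is None:
        config_path = Path.home() / ".openclaw" / "openclaw.json"
    try:  # 3. config file
        config = json.loads(config_path.read_text(encoding="utf-8"))
        # The "model" key is an assumption; the real schema may differ.
        model = config.get("model") if isinstance(config, dict) else None
        if model:
            return model, "config"
    except (OSError, ValueError):
        pass  # missing or malformed config: fall through to the next source
    # 4. file inference (TOOLS.md, MEMORY.md) is omitted from this sketch
    return DEFAULT_MODEL, "fallback"  # 5. safe default
```

Returning the source alongside the model lets a dashboard report where the model was detected.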
Unknown model handling:
- Strict version matching: `opus-6.5` won't fuzzy-match to `opus-4.5`
- Unknown models get safe defaults (200K context) plus a warning
- Easy to add new models to `scripts/models.json`
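Strict matching with a safe default might look like the sketch below. The in-memory `REGISTRY` and `ALIASES` dictionaries stand in for the contents of `scripts/models.json`, whose actual schema is not shown in this README; every name here is hypothetical.

```python
import warnings
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    context_window: int
    known: bool = True


# Hypothetical registry entries; the real data lives in scripts/models.json.
REGISTRY = {
    "opus-4.5": ModelInfo("Claude Opus 4.5", 200_000),
    "gpt-4o": ModelInfo("GPT-4o", 128_000),
}
ALIASES = {"claude-opus-4.5": "opus-4.5"}  # hypothetical alias
SAFE_DEFAULT_CONTEXT = 200_000


def lookup_model(model_id: str) -> ModelInfo:
    """Resolve a model by exact key or alias; never by prefix or similarity."""
    key = model_id.strip().lower()
    key = ALIASES.get(key, key)
    if key in REGISTRY:
        return REGISTRY[key]
    warnings.warn(f"unknown model {model_id!r}; assuming "
                  f"{SAFE_DEFAULT_CONTEXT // 1000}K context")
    return ModelInfo(model_id, SAFE_DEFAULT_CONTEXT, known=False)
```

Because lookup is by exact key, `lookup_model("opus-6.5")` returns the 200K default with a warning rather than silently borrowing the `opus-4.5` entry.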
📝 Persistent Mode
Adds writing guidance to AGENTS.md for continued token efficiency:
| File | Writing Style |
|---|---|
| SOUL.md | Evocative, personality-shaping |
| AGENTS.md | Dense instructions, symbols OK |
| USER.md | Key:value facts |
| MEMORY.md | Ultra-dense data |
Safety
- Auto-backup: All modified files get a `.backup` extension
- Integrity > Size: Never sacrifices meaning for a smaller token count
- Smart bypass: Skips already-optimized files
- Revert anytime: `/optimize revert` restores everything
- No external calls: All analysis runs locally
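The backup-and-revert behaviour can be approximated with the sketch below. It assumes backups sit next to the original file and that reverting consumes the backup; both are choices made for this example rather than documented behaviour, and the function names are hypothetical.

```python
import shutil
from pathlib import Path

BACKUP_SUFFIX = ".backup"


def backup_file(path: Path) -> Path:
    """Copy path to '<name>.backup' unless a backup already exists."""
    backup = path.with_name(path.name + BACKUP_SUFFIX)
    if not backup.exists():  # keep the oldest copy (a design assumption)
        shutil.copy2(path, backup)
    return backup


def revert_all(workspace: Path) -> list[str]:
    """Restore every '*.backup' file in the workspace; return the restored names."""
    restored = []
    for backup in sorted(workspace.glob("*" + BACKUP_SUFFIX)):
        original = backup.with_name(backup.name[: -len(BACKUP_SUFFIX)])
        backup.replace(original)  # overwrites the modified file with the backup
        restored.append(original.name)
    return restored
```

Keeping only the first backup means repeated compression runs cannot overwrite the original text with an already-compressed copy.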
Installation
```bash
clawhub install token-saver --registry https://www.clawhub.ai
```
Version History
- 3.0.0: Model registry, dynamic presets, robust detection, smart bypass
- 2.0.1: Chat compaction, file-aware compression, persistent mode
- 1.0.0: Initial release