Context Death Spiral Prevention — OpenClaw Compaction Primer
What Is a Context Death Spiral?
A context death spiral is what happens when an OpenClaw agent accumulates so much
conversation history that its reasoning quality degrades — and then the degradation
makes it handle the accumulation worse, which accelerates the degradation.
You've seen the symptoms:
- Agent starts forgetting instructions it acknowledged 20 turns ago
- Response quality drops noticeably mid-session without any obvious trigger
- Agent begins contradicting itself or repeating earlier failed attempts
- Sudden unexplained context resets that wipe work in progress
- Tool calls become erratic — the agent loses track of what it already tried
These aren't model failures. They're architecture failures. The agent isn't broken —
its context management is.
Why Default OpenClaw Setups Don't Handle This
Out of the box, OpenClaw has no compaction architecture. There is no:
- Threshold configuration that triggers compaction before quality degrades
- Circuit breaker that catches failed compactions before they cascade
- Post-compaction cleanup sequence that verifies the context was actually reduced
- Sequencing logic that governs what gets compacted in what order
- Guard against recursive compaction (compacting a compaction summary)
Without these, the agent operates until it hits the model's hard context limit.
At that point, OpenClaw either crashes, truncates silently, or enters an error
loop. None of these are recoverable without manual intervention.
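As one illustration of the last two missing pieces (sequencing and the recursion guard), here is a minimal sketch. It does not use any OpenClaw API; the `Message` type, the `is_summary` flag, and `split_for_compaction` are hypothetical names invented for this example. The idea is to compact the oldest messages first and never feed an earlier summary back into the compactor.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    role: str
    text: str
    is_summary: bool = False  # hypothetical flag set when a compaction writes a summary


def split_for_compaction(
    history: List[Message], keep_recent: int = 4
) -> Tuple[List[Message], List[Message]]:
    """Return (to_compact, to_keep).

    Only older messages are candidates, and prior summaries are always kept
    as-is so a summary is never summarized again.
    """
    cutoff = max(len(history) - keep_recent, 0)
    older, recent = history[:cutoff], history[cutoff:]
    to_compact = [m for m in older if not m.is_summary]
    to_keep = [m for m in older if m.is_summary] + recent
    return to_compact, to_keep
```

A real implementation would also need to decide where the new summary sits relative to the old one; this sketch only shows the selection step.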
The Four Categories That Control Compaction Behavior
Production compaction architecture covers four distinct areas. You need all four:
1. Threshold Management
The threshold determines when compaction fires. Set it too high and the agent
degrades before compaction helps. Set it too low and you waste tokens on
unnecessary compaction. The right thresholds are not intuitive — they depend
on the model's actual quality degradation curve, not its advertised context window.
Most operators guess. Production deployments measure.
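To make "measure" concrete, here is a minimal sketch of turning evaluation data into a baseline. It assumes you already have (context size, quality score) pairs from your own test runs; the function name, the 0.85 floor, and the sample numbers are all made up for illustration and are not recommended values.

```python
from typing import Sequence, Tuple


def usable_token_limit(
    samples: Sequence[Tuple[int, float]], quality_floor: float
) -> int:
    """Return the largest measured context size before quality first drops below the floor.

    samples: (context_tokens, quality_score) pairs from your own evaluation runs.
    Returns 0 if even the smallest measured context is below the floor.
    """
    limit = 0
    for tokens, quality in sorted(samples):
        if quality < quality_floor:
            break
        limit = tokens
    return limit


# Made-up measurements, for illustration only.
samples = [
    (20_000, 0.95),
    (60_000, 0.93),
    (100_000, 0.88),
    (140_000, 0.71),
    (180_000, 0.55),
]
baseline = usable_token_limit(samples, quality_floor=0.85)
```

With these invented numbers the baseline lands at 100,000 tokens, well below the largest context size that was measured, which is the gap between an advertised window and a usable one.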
2. Autocompact Gate Logic
Compaction shouldn't fire on every threshold breach — some breaches are transient.
A production gate evaluates multiple conditions before triggering: token count,
session age, tool call density, the shape of recent content. A simple token
threshold is not a gate. It's a single condition, and it fires at the wrong time
roughly 30% of the time in active sessions.
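A minimal sketch of a multi-condition gate follows. Everything in it is an assumption for illustration: `SessionState`, its fields, and the default limits are hypothetical, and the "shape of recent content" check is left out because it depends on your workload. The point is only that a token breach is necessary but not sufficient.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    token_count: int
    session_age_s: float
    recent_tool_calls: int  # tool calls in the last few turns
    breach_streak: int      # consecutive turns spent above the trigger threshold


def should_compact(
    state: SessionState,
    trigger_tokens: int,
    min_age_s: float = 120.0,
    max_tool_density: int = 3,
    min_streak: int = 2,
) -> bool:
    """Fire only when the breach is sustained and the session is at a calm point."""
    if state.token_count < trigger_tokens:
        return False  # no breach at all
    if state.breach_streak < min_streak:
        return False  # transient breach; wait to see if it persists
    if state.session_age_s < min_age_s:
        return False  # session too young for a summary to be worth its cost
    if state.recent_tool_calls > max_tool_density:
        return False  # mid tool burst; defer until the agent settles
    return True
```

A gate like this must still yield to the block threshold: deferring during a tool burst is only safe while there is headroom left.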
3. Circuit Breaker
Compaction can fail. When it does, naive implementations retry immediately —
which can send the agent into an infinite compaction loop that burns tokens and
produces nothing. A production circuit breaker counts consecutive failures,
backs off, and eventually halts with a recoverable state.
Without a circuit breaker, one bad compaction attempt can destroy a session.
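The count, back off, halt behavior can be sketched as below. This is a generic circuit breaker under stated assumptions, not OpenClaw's implementation: the class name, the limit of three failures, and the exponential delay are illustrative, and `compact` stands in for whatever callable performs your compaction and returns `True` on success.

```python
import time
from typing import Callable


class CompactionCircuitBreaker:
    def __init__(
        self,
        max_failures: int = 3,
        base_delay_s: float = 1.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_failures = max_failures
        self.base_delay_s = base_delay_s
        self.failures = 0   # consecutive failures so far
        self.open = False   # True once the breaker has halted retries
        self._sleep = sleep

    def run(self, compact: Callable[[], bool]) -> bool:
        """Attempt compaction; return True on success, False once the breaker is open."""
        while not self.open:
            try:
                ok = compact()
            except Exception:
                ok = False  # treat any exception as a failed attempt
            if ok:
                self.failures = 0
                return True
            self.failures += 1
            if self.failures >= self.max_failures:
                self.open = True  # halt; caller should persist state and surface the error
                break
            # Exponential backoff: base, 2x base, 4x base, ...
            self._sleep(self.base_delay_s * 2 ** (self.failures - 1))
        return False

    def reset(self) -> None:
        """Close the breaker again, for example after an operator intervenes."""
        self.failures = 0
        self.open = False
```

Because the breaker stays open until `reset()` is called, a failed session stops burning tokens and can be inspected or resumed rather than looping.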
4. Post-Compaction Cleanup
After compaction runs, the context window needs to be verified. Did it actually
reduce? Was the summary written correctly? Are there orphaned references to
content that no longer exists? Post-compaction cleanup is not optional — without
it, you have no guarantee compaction worked.
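The three questions above map onto three checks. The sketch below is an assumption-heavy illustration: `CompactionResult`, the 20% minimum reduction, and the `[msg:ID]` citation convention are all hypothetical, chosen only so the orphaned-reference check has something concrete to scan for.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class CompactionResult:
    tokens_before: int
    tokens_after: int
    summary: str
    surviving_ids: Set[str]  # ids of messages still present in the context


def verify_compaction(result: CompactionResult, min_reduction: float = 0.2) -> List[str]:
    """Return a list of problems; an empty list means the compaction passed."""
    problems: List[str] = []
    # 1. Did the context actually shrink by a meaningful amount?
    if result.tokens_after > result.tokens_before * (1 - min_reduction):
        problems.append("context not reduced enough")
    # 2. Was a summary actually written?
    if not result.summary.strip():
        problems.append("summary is empty")
    # 3. Does the summary point at content that no longer exists?
    #    Hypothetical convention: the summary cites messages as [msg:ID].
    for ref in re.findall(r"\[msg:([\w-]+)\]", result.summary):
        if ref not in result.surviving_ids:
            problems.append(f"orphaned reference: {ref}")
    return problems
```

A non-empty result is a natural input to a circuit breaker: treat it as a failed compaction rather than continuing with an unverified context.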
Why This Is Harder Than It Looks
The threshold problem alone has three sub-problems:
- Warning threshold: when to signal that compaction is approaching
- Trigger threshold: when to actually compact
- Block threshold: when the context is too full to compact safely and the session must halt
These three values interact. Setting any one of them wrong creates either
unnecessary interruptions or silent degradation. Production deployments derive
all three from the same empirical baseline. Guessing independently at each one
is how operators end up with agents that compact too aggressively, lose
important context, and then compound the problem on the next session.
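One way to keep the three values consistent is to compute them together from a single measured baseline instead of setting each by hand. The sketch below does that; the multipliers (0.8 and 1.25), the `Thresholds` type, and `derive_thresholds` are illustrative assumptions, not validated constants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    warn: int
    trigger: int
    block: int

    def classify(self, tokens: int) -> str:
        """Map a token count to 'ok', 'warn', 'compact', or 'block'."""
        if tokens >= self.block:
            return "block"
        if tokens >= self.trigger:
            return "compact"
        if tokens >= self.warn:
            return "warn"
        return "ok"


def derive_thresholds(
    baseline_tokens: int,
    hard_limit: int,
    warn_frac: float = 0.8,    # assumed multiplier, not a measured constant
    block_frac: float = 1.25,  # assumed multiplier, not a measured constant
) -> Thresholds:
    """Derive all three thresholds from one baseline so they cannot drift apart."""
    warn = int(baseline_tokens * warn_frac)
    trigger = baseline_tokens
    block = int(baseline_tokens * block_frac)
    if not (0 < warn < trigger < block <= hard_limit):
        raise ValueError("thresholds must satisfy 0 < warn < trigger < block <= hard_limit")
    return Thresholds(warn, trigger, block)
```

The ordering check is the useful part: a configuration where the block threshold sits below the trigger, or above the model's hard limit, is rejected up front instead of surfacing as a mid-session failure.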
The Bottom Line
If your OpenClaw agent runs sessions longer than 30 minutes, handles multi-step
autonomous tasks, or operates without supervision — you have a context management
problem, whether you've seen the symptoms yet or not.
Most operators discover this the hard way.
*The full production architecture with all 7 SKILL.md files, including the exact
constants validated in production Claude Code deployments, is available in the
Production Agent Ops bundle on Claw Mart:*
https://www.shopclawmart.com/listings/production-agent-ops-battle-tested-architecture-pack-0d1bb129