Battle-Tested Agent
19 production-hardened patterns for AI agents. Every one earned from failure.
Use this skill when you are:
- hardening an agent that will run repeatedly or autonomously
- tightening memory, verification, or anti-hallucination behavior
- reducing compaction failures, weak handoffs, or orchestration drift
- reviewing an agent workspace for missing production patterns
- debugging why an agent keeps losing context, guessing, or dropping work
Do not use this skill for:
- persona writing or onboarding polish
- one-off prompt tweaks with no reusable pattern behind them
- adding new tools, servers, or runtime capabilities
- turning a simple workspace into process theater
Default workflow
1. Audit first
Run bash scripts/audit.sh <workspace> to see which patterns are present.
The script checks for all 19 patterns and tells you what to fix first.
2. Start with the smallest tier that fits
Implement starter patterns first, then intermediate, then advanced.
Do not cargo-cult every pattern into every agent.
3. Patch the actual failure mode
Change the mechanism, not just the wording. "ALWAYS check X" is not a fix;
a verification gate is.
4. Keep patterns lightweight
Add only the pieces that materially reduce failures or operator burden.
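The difference between a wording change and a mechanism change in step 3 can be sketched as a small gate function. This is a minimal illustration, not part of the skill's scripts: the name verify_gate, its three arguments, and the VERIFIED/UNVERIFIED labels are all assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this skill): run an action, then
# confirm the resulting state with a separate check before reporting success.

# verify_gate <description> <action command> <check command>
verify_gate() {
  local description="$1" action="$2" check="$3"

  # Step 1: perform the action. A failing action is reported, not hidden.
  if ! bash -c "$action"; then
    echo "FAILED: $description (action exited non-zero)"
    return 1
  fi

  # Step 2: check the new state independently. Only this check, not the
  # action's own exit code, is allowed to produce a VERIFIED report.
  if bash -c "$check"; then
    echo "VERIFIED: $description"
  else
    echo "UNVERIFIED: $description (action ran, check did not pass)"
    return 1
  fi
}

# Example: write a file, then verify that it exists and is non-empty.
workdir="$(mktemp -d)"
verify_gate "write report" \
  "echo 'done' > '$workdir/report.txt'" \
  "test -s '$workdir/report.txt'"
```

The point of the design is that success can only be reported through the check command, so an instruction like "ALWAYS check X" becomes something the agent cannot skip.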
Pattern tiers
- Starter (5): baseline reliability for almost every agent
- Intermediate (5): daily-driver patterns for briefs, heartbeats, and recurring work
- Advanced (9): multi-agent orchestration, handoffs, and self-improvement discipline
Pattern clusters
Some patterns reinforce each other naturally. Adopt them together when the failure
mode calls for it:
- Trust chain: WAL Protocol + Anti-Hallucination + Agent Verification. Ensures
data is captured, sourced, and measured before reporting.
- Handoff loop: Delegation Rules + Completion Contract + Acceptance Gate + Task State Tracking.
Prevents work from disappearing between agents or being certified without proof.
- Survival kit: Working Buffer + Compaction Injection Hardening + Silent Worker Recovery.
Keeps context alive across long sessions and prevents silent delegated drift.
- Quality gate: QA Gates + Verify Implementation + Decision Logs. Ensures output
quality and traceable reasoning.
- Delegation hardening: Brief Quality Gate + Scoped Verifier Gate. Keeps delegation
tight without turning the whole system into bureaucracy.
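As one illustration of the handoff loop, an acceptance gate can refuse to certify delegated work unless the worker's completion report points at evidence that actually exists. The sketch below assumes a hypothetical report format (plain "key: value" lines with status and evidence fields) and a hypothetical function name, accept_handoff; the skill's reference files may define the contract differently.

```bash
#!/usr/bin/env bash
# Hypothetical acceptance gate. The report format ("status:" and "evidence:"
# lines) is an assumption made for this example only.

# accept_handoff <completion report file>
accept_handoff() {
  local report="$1" status evidence

  if [ ! -f "$report" ]; then
    echo "REJECT: no completion report"
    return 1
  fi

  # Read the first "status:" and "evidence:" values from the report.
  status="$(sed -n 's/^status: *//p' "$report" | head -n 1)"
  evidence="$(sed -n 's/^evidence: *//p' "$report" | head -n 1)"

  if [ "$status" != "done" ]; then
    echo "REJECT: status is '${status:-missing}'"
    return 1
  fi

  # A claim of "done" is not enough: the evidence path must exist.
  if [ -z "$evidence" ] || [ ! -e "$evidence" ]; then
    echo "REJECT: evidence missing or not found"
    return 1
  fi

  echo "ACCEPT: $evidence"
}

# Example: one report with proof attached, one without.
demo="$(mktemp -d)"
echo "12 tests passed" > "$demo/test-output.log"
printf 'status: done\nevidence: %s\n' "$demo/test-output.log" > "$demo/good.txt"
printf 'status: done\n' > "$demo/bad.txt"

accept_handoff "$demo/good.txt"
accept_handoff "$demo/bad.txt" || echo "(handoff sent back to the worker)"
```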
When patterns conflict
If two patterns seem to give contradictory advice:
- Safety patterns win over speed patterns. Ambiguity Gate overrides Simple Path First
when the request is ambiguous. Verify before acting, even if the simple path is obvious.
- Evidence patterns win over action patterns. Anti-Hallucination overrides "just try it"
when reporting data. Never guess a number to move faster.
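The two precedence rules above can be written down as explicit checks so the ordering is not left to interpretation. This is a sketch under stated assumptions: the function names next_step and report_number, the yes/no inputs, and the "plan" fallback are all invented for illustration and do not come from the skill's files.

```bash
#!/usr/bin/env bash
# Hypothetical encoding of the two conflict rules; all names are illustrative.

# next_step <ambiguous: yes|no> <simple_path_available: yes|no>
next_step() {
  local ambiguous="$1" simple_path="$2"
  if [ "$ambiguous" = "yes" ]; then
    echo "ask"      # Ambiguity Gate is checked first, so it always wins
  elif [ "$simple_path" = "yes" ]; then
    echo "simple"   # Simple Path First applies only to unambiguous requests
  else
    echo "plan"     # assumed fallback when no simple path exists
  fi
}

# report_number <value> <source>
report_number() {
  local value="$1" src="$2"
  if [ -z "$src" ]; then
    echo "unknown (no source)"   # never report an unsourced number
  else
    echo "$value (source: $src)"
  fi
}

next_step yes yes
report_number 42 ""
```

With these inputs, the first call prints "ask" even though a simple path is available, and the second prints "unknown (no source)" rather than the unsourced value.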
Assets — how to use them
The assets/ folder contains starter files you copy into your workspace and customize.
They are templates, not drop-in replacements.
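```bash
# Merge the delegation and decision-log rules into your existing AGENTS.md
cp assets/AGENTS-additions.md ~/workspace/  # review, then merge

# Add QA gates
cp assets/QA-gates.md ~/workspace/QA.md

# Set up self-improvement tracking
mkdir -p ~/workspace/.learnings
cp assets/learnings-template.md ~/workspace/.learnings/LEARNINGS.md
cp assets/errors-template.md ~/workspace/.learnings/ERRORS.md
cp assets/features-template.md ~/workspace/.learnings/FEATURE_REQUESTS.md
```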
Read references/audit-usage.md for the full rollout order and bootstrap workflow.
References
- references/starter-patterns.md: WAL, anti-hallucination, ambiguity, simple-path-first, unblock-before-shelve
- references/intermediate-patterns.md: verification, working buffer, QA gates, decision logs, verify implementation
- references/advanced-patterns.md: delegation, brief quality, proof-based handoffs, acceptance gates, orchestration, stale-worker recovery, compaction hardening, recurrence tracking
- references/audit-usage.md: audit script usage, install/copy snippets, and expected outcomes
Included scripts
- scripts/audit.sh: workspace audit for all 19 patterns (supports AGENTS.md, CLAUDE.md, SOUL.md, and system.md)
Rules of thumb
- Audit before expanding
- Prefer progressive disclosure over giant core files
- Silence is better than hallucination
- Ambiguity is a stop sign, not permission
- The orchestrator should preserve oversight, not sink into implementation
- Mechanism changes beat wording changes
- After acting, verify the new state before declaring success
- Partial progress is not success; recovery steps matter as much as first-attempt steps
Outcome
A leaner, more resilient agent that survives compaction, hands work off cleanly,
reports only what is verified, and improves without spiraling into bureaucracy.