Agent Rule Audit
Audit the files that actually shape an OpenClaw agent's behavior. Focus on behavior-layer quality, not general file cleanup.
Quick start
1. Identify the audit target: which agent/workspace is being reviewed.
2. Read the core behavior-layer files first.
3. Read shared rule files only when the core files explicitly depend on them.
4. Stay with the default core scope first. Widen only when the core files are not enough to explain the behavior, or when the user asks for a deeper audit.
5. Produce two outputs:
   - audit conclusions
   - executable restructuring recommendations
6. Separate root causes from surface symptoms.
Default audit scope
By default, inspect only the agent's core, stable behavior files — the files most likely to be loaded every session and to consistently shape behavior.
Core behavior layer
Read these first when present:
- AGENTS.md
- SOUL.md
- USER.md
- MEMORY.md (if present)
- TOOLS.md
- IDENTITY.md
- HEARTBEAT.md
See references/openclaw-behavior-sources.md for why these matter in OpenClaw.
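The presence check for the core files can be sketched in Python. This is a minimal sketch, assuming the core files sit directly in the workspace root; the function name `find_core_files` and that location are illustrative assumptions, not something OpenClaw defines.

```python
from pathlib import Path

# Core behavior-layer file names, in the order listed in this section.
CORE_FILES = [
    "AGENTS.md",
    "SOUL.md",
    "USER.md",
    "MEMORY.md",
    "TOOLS.md",
    "IDENTITY.md",
    "HEARTBEAT.md",
]


def find_core_files(workspace: str) -> tuple[list[str], list[str]]:
    """Return (present, missing) core file names for a workspace directory.

    Assumption: core files live directly in the workspace root.
    """
    root = Path(workspace)
    present = [name for name in CORE_FILES if (root / name).is_file()]
    missing = [name for name in CORE_FILES if name not in present]
    return present, missing
```

A missing file is not a finding by itself; the list only tells the auditor what there is to read first.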
Optional widening
Only widen beyond the core set when needed, for example:
- the user explicitly asks for a broader audit
- the core files explicitly depend on another file
- the core files look fine but behavior clearly points to another steering source
- a recent correction/example cannot be explained from the core files alone
Possible widening targets:
- shared rule files explicitly referenced by the core files
- any correction / learnings / workflow-improvement layer that exists in the target workspace
- behavior-improvement or trial-related supporting files when they exist in the target workspace
- recent behavioral evidence files when they exist in the target workspace
- user-provided examples/screenshots/transcripts
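One way to surface candidate widening targets is to scan a core file for mentions of other Markdown files. The sketch below is illustrative only: the regular expression and the function name `referenced_files` are assumptions, and a mention is not proof of a dependency, so each hit still needs to be read in context.

```python
import re

# Matches relative paths that end in .md, e.g. "references/problem-types.md".
MD_REF = re.compile(r"[\w./-]+\.md\b")


def referenced_files(text: str, own_name: str) -> list[str]:
    """Return distinct .md paths mentioned in text, excluding the file itself."""
    seen = []
    for match in MD_REF.findall(text):
        if match != own_name and match not in seen:
            seen.append(match)
    return seen
```

For example, calling `referenced_files(agents_text, "AGENTS.md")` on the contents of AGENTS.md lists the other files it names, in order of first mention.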
What to look for
Use the problem categories in references/problem-types.md.
Default categories:
- structure confusion
- repetition / redundancy
- rule conflict
- focus drift
- behavior-layer dilution from too many weak rules
- symptom-vs-root-cause confusion
- style guidance overpowering execution guidance
- stale or superseded rules not cleaned up
- trial rules that never reached the live behavior layer
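Of these categories, repetition is the easiest to pre-screen mechanically. The following is a rough sketch, assuming file contents have already been loaded into a dictionary keyed by file name; `normalize`, `repeated_rules`, and the 20-character threshold are hypothetical choices. It only catches near-verbatim repeats across files, so paraphrased duplicates and the other categories still require judgment.

```python
from collections import defaultdict


def normalize(line: str) -> str:
    """Lowercase, strip leading list markers, and collapse whitespace."""
    return " ".join(line.strip().lstrip("-*0123456789. ").lower().split())


def repeated_rules(files: dict[str, str], min_len: int = 20) -> dict[str, list[str]]:
    """Map each normalized line found in two or more files to those files."""
    where = defaultdict(set)
    for name, text in files.items():
        for line in text.splitlines():
            key = normalize(line)
            # Skip short lines such as headings, which repeat harmlessly.
            if len(key) >= min_len:
                where[key].add(name)
    return {key: sorted(names) for key, names in where.items() if len(names) > 1}
```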
Audit workflow
1. Map the real behavior sources
Do not assume every file matters equally.
First answer:
- Which files are most likely shaping behavior now?
- Which are direct behavior rules vs supporting evidence?
- Which are probably ignored or low-weight?
2. Identify the user's real complaint
Do not let verbose files distract from the actual failure mode.
Ask or infer:
- What is the user truly unhappy with?
- What is the root problem?
- Which observed symptoms are secondary?
Example: “progress-sounding replies” may be a symptom; “not actually doing the work” may be the root issue.
3. Read for layering problems
Check whether files are cleanly separated by role:
- identity/persona
- working style
- hard boundaries
- task execution rules
- temporary trial rules
- business workflow rules
Flag when these are mixed together in ways that weaken the important rules.
4. Check alignment across files
Ask:
- Do the core live rules point in the same direction, or are they pulling behavior apart?
- Does the workspace's correction / learnings layer support or contradict the live rules?
- If the widened scope includes behavior-improvement or trial files, do those files match the live rules?
- Are older rules still pulling behavior in the wrong direction?
5. Judge whether the most important rule is actually prominent enough
The key audit question is not just “is the right rule written somewhere?”
It is:
- Is the right rule clear?
- Is it near the top or buried?
- Is it specific enough to change behavior?
- Is it being diluted by too many softer surrounding rules?
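The "near the top or buried" question can be given a rough number. This is a small sketch under stated assumptions: `rule_position` is a hypothetical helper, it matches a phrase case-insensitively, and it counts only non-empty lines. Position is one signal among several; clarity, specificity, and dilution still have to be judged by reading.

```python
def rule_position(text: str, phrase: str):
    """Return (line_number, relative_depth) of the first line containing phrase.

    line_number counts non-empty lines from 1. relative_depth is 0.0 for the
    first non-empty line and 1.0 for the last. Returns None if absent.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for index, line in enumerate(lines):
        if phrase.lower() in line.lower():
            return index + 1, index / max(len(lines) - 1, 1)
    return None
```

A key rule that first appears at a relative depth close to 1.0 in a long file is a candidate for moving up.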
6. Recommend by role, not by habit
Do not tell the user to rewrite everything.
Recommend changes by file role:
- what should stay in AGENTS.md
- what should move to the workspace's correction / learnings layer
- what should move to references/review/tracking
- what should become a stronger core rule
- what should be deleted or merged
Output structure
Use this default output shape:
1. Audit scope: what was checked
2. Overall judgment: is the behavior layer mostly aligned or not
3. Highest-priority problems: ranked
4. Root cause vs symptoms: where relevant
5. What is already fine: avoid over-editing
6. Recommended changes: concrete and file-specific
For a reusable outline, see references/output-template.md.
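The default shape can also be produced programmatically. The sketch below is illustrative and does not reproduce references/output-template.md, whose contents are not shown here; `render_report` and the placeholder text for empty sections are assumptions.

```python
# Section titles in the default output order.
SECTIONS = [
    "Audit scope",
    "Overall judgment",
    "Highest-priority problems",
    "Root cause vs symptoms",
    "What is already fine",
    "Recommended changes",
]


def render_report(findings: dict[str, list[str]]) -> str:
    """Render findings in the default section order; empty sections say so."""
    out = []
    for number, title in enumerate(SECTIONS, start=1):
        out.append(f"{number}. {title}")
        for item in findings.get(title) or ["(nothing noted)"]:
            out.append(f"   - {item}")
    return "\n".join(out)
```

Keeping every section in the output, even when empty, makes it visible when an audit skipped a step such as separating root cause from symptoms.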
Important judgment rules
- Do not confuse “a rule exists somewhere” with “the agent is actually being steered by it.”
- Do not recommend giant rewrites when a smaller structural cleanup would solve the issue.
- Prefer fewer, clearer, stronger rules over many overlapping weak ones.
- When the user's complaint is concrete, optimize for that real complaint first.
- If a problem is mainly workflow/process rather than prompt wording, say so plainly.
When to widen the audit
Widen beyond the default scope only when needed, for example:
- a shared file is explicitly referenced
- the user asks for a broader workspace audit
- the core files look fine but behavior still points to another steering source
- recent behavioral evidence contains the only concrete signs of how the behavior shifted