Verified Capability Evolver
Extend existing capability evolution workflows with structured verification, rollback, and promotion gating.
This skill does not replace the underlying self-improvement system. It preserves the original learning, hook, and extraction workflow while adding a verification layer so permanent behavior changes are only promoted when they are proven.
Data handling and trust
This skill defines a verification workflow, not automatic data transmission.
- Only structured task data (spec + output) should be used for verification
- Do NOT include secrets, API keys, credentials, private keys, seed phrases, or personal data
- SettlementWitness integration is runtime-controlled and should be used only with explicit user approval
- Verification is applied to selected learnings and promotion events, not all agent activity
Core Principle
An agent should not just improve; it should prove that it improved.
The highest-stakes moment in self-improvement is promotion to permanent memory. A temporary fix becomes permanent behavior only after it passes verification.
Example Usage
Scenario
An agent repeatedly generates invalid JSON for an API response. A fix is applied, but before it is promoted to permanent behavior, the improvement must be verified.
Learning Entry (Before Resolution)
[LRN-20260329-001] correction
Status: pending
Summary
Agent produced invalid JSON for API responses
Suggested Action
Ensure all outputs conform to the required schema before returning
Verification Spec

    {
      "expected": {
        "valid_json": true,
        "matches_schema": true
      }
    }

Verification Result

    {
      "valid_json": true,
      "matches_schema": true
    }

Outcome
- Verification returns PASS
- Status → resolved
- Learning becomes eligible for promotion
- Receipt metadata can be attached (if verification was used)
What This Shows
- Improvements must be proven, not assumed
- Promotion is gated by verification
- Failed fixes do not become permanent behavior
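The comparison between a verification spec and a verification result in the JSON example can be done mechanically. The sketch below is a local illustration only: `Verdict` and `check_against_spec` are hypothetical names, not part of SettlementWitness, and treating a missing key as INDETERMINATE is an assumption.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    INDETERMINATE = "INDETERMINATE"


def check_against_spec(spec: dict, result: dict) -> Verdict:
    """Compare a verification result with the 'expected' block of a spec."""
    expected = spec["expected"]
    # Assumption: a result that omits an expected key cannot be judged.
    if any(key not in result for key in expected):
        return Verdict.INDETERMINATE
    # Every expected key must match exactly for a PASS.
    if all(result[key] == value for key, value in expected.items()):
        return Verdict.PASS
    return Verdict.FAIL


spec = {"expected": {"valid_json": True, "matches_schema": True}}
result = {"valid_json": True, "matches_schema": True}
print(check_against_spec(spec, result).value)
```

Only a `Verdict.PASS` would make the learning eligible for promotion; the other two verdicts leave permanent memory untouched.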
Agent Identity (Required for TrustScore)
When using SettlementWitness verification, provide a stable agent_id:
CODEBLOCK0
Use the format {wallet_address}:capability-evolver so TrustScore history can compound correctly across sessions.
Quick Reference
| Situation | Action |
|---|---|
| Command/operation fails | Log to `.learnings/ERRORS.md` |
| User corrects you | Log to `.learnings/LEARNINGS.md` with category `correction` |
| User wants missing feature | Log to `.learnings/FEATURE_REQUESTS.md` |
| API/external tool fails | Log to `.learnings/ERRORS.md` with integration details |
| Knowledge was outdated | Log to `.learnings/LEARNINGS.md` with category `knowledge_gap` |
| Found better approach | Log to `.learnings/LEARNINGS.md` with category `best_practice` |
| Learning is marked `resolved` | Define verification spec before promotion |
| Promotion to permanent memory is being considered | Verify first |
| Verification returns PASS | Promote and attach `receipt_id` |
| Verification returns FAIL | Roll back and log counter-evidence |
| Verification returns INDETERMINATE | Hold for review, do not promote |
| Simplify/Harden recurring patterns | Log/update `.learnings/LEARNINGS.md` with `Source: simplify-and-harden` and a stable `Pattern-Key` |
| Similar to existing entry | Link with `**See Also**`, consider priority bump |
| Workflow improvements | Promote to `AGENTS.md` (OpenClaw workspace) after verification PASS |
| Tool gotchas | Promote to `TOOLS.md` (OpenClaw workspace) after verification PASS |
| Behavioral patterns | Promote to `SOUL.md` (OpenClaw workspace) after verification PASS |
OpenClaw Setup (Recommended)
OpenClaw is the primary platform for this skill. It uses workspace-based prompt injection with automatic skill loading.
Installation
Via ClawdHub (recommended):
CODEBLOCK1
Manual:
CODEBLOCK2
Workspace Structure
OpenClaw injects these files into every session:
CODEBLOCK3
Create Learning Files
CODEBLOCK4
Then create the log files (or copy from assets/):
- `LEARNINGS.md` - corrections, knowledge gaps, best practices
- `ERRORS.md` - command failures, exceptions
- `FEATURE_REQUESTS.md` - user-requested capabilities
Promotion Targets
When learnings prove broadly applicable, promote them to workspace files:
| Learning Type | Promote To | Example |
|---|---|---|
| Behavioral patterns | `SOUL.md` | "Be concise, avoid disclaimers" |
| Workflow improvements | `AGENTS.md` | "Spawn sub-agents for long tasks" |
| Tool gotchas | `TOOLS.md` | "Git push needs auth configured first" |
Inter-Session Communication
OpenClaw provides tools to share learnings across sessions:
- `sessions_list` - View active/recent sessions
- `sessions_history` - Read another session's transcript
- `sessions_send` - Send a learning to another session
- `sessions_spawn` - Spawn a sub-agent for background work
Optional: Enable Hook
For automatic reminders at session start:
CODEBLOCK5
See references/openclaw-integration.md for complete details.
Generic Setup (Other Agents)
For Claude Code, Codex, Copilot, or other agents, create .learnings/ in your project:
CODEBLOCK6
Copy templates from assets/ or create files with headers.
Add a reference in your agent files (AGENTS.md, CLAUDE.md, or .github/copilot-instructions.md) to remind yourself to log learnings. This is an alternative to hook-based reminders.
Self-Improvement Workflow
When errors or corrections occur:
1. Log to `.learnings/ERRORS.md`, `LEARNINGS.md`, or `FEATURE_REQUESTS.md`
2. Review and promote broadly applicable learnings to:
   - `CLAUDE.md` - project facts and conventions
   - `AGENTS.md` - workflows and automation
   - `.github/copilot-instructions.md` - Copilot context
Logging Format
Learning Entry
Append to .learnings/LEARNINGS.md:
CODEBLOCK7
Error Entry
Append to .learnings/ERRORS.md:
CODEBLOCK8
Actual error message or output
CODEBLOCK9
Feature Request Entry
Append to .learnings/FEATURE_REQUESTS.md:
CODEBLOCK10
ID Generation
Format: TYPE-YYYYMMDD-XXX
- TYPE: `LRN` (learning), `ERR` (error), `FEAT` (feature)
- YYYYMMDD: Current date
- XXX: Sequential number or random 3 chars (e.g., `001`, `A7B`)

Examples: `LRN-20250115-001`, `ERR-20250115-A3F`, INLINECODE46
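The ID format can be produced with a few lines of code. This is a minimal sketch covering only the sequential-number variant; `make_entry_id` and the `PREFIXES` mapping are hypothetical helper names introduced for illustration.

```python
from datetime import date

# Prefixes taken from the TYPE list above; the dictionary keys are illustrative.
PREFIXES = {"learning": "LRN", "error": "ERR", "feature": "FEAT"}


def make_entry_id(kind: str, day: date, sequence: int) -> str:
    """Build a TYPE-YYYYMMDD-XXX identifier with a zero-padded sequence."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must fit in three digits")
    return f"{PREFIXES[kind]}-{day:%Y%m%d}-{sequence:03d}"


print(make_entry_id("learning", date(2025, 1, 15), 1))  # LRN-20250115-001
```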
Resolving Entries
When an issue appears fixed, do not immediately treat it as permanent learning.
Updated Resolution Flow
1. Change `**Status**`: `pending` → INLINECODE48
2. Apply the proposed fix or workflow change
3. Define a deterministic verification spec:
   - What should now succeed?
   - What output should be produced?
   - What failure should no longer occur?
4. Execute a verification task using that spec
5. If external verification is being used, obtain explicit approval before submitting minimal structured task data
6. Verify the result using SettlementWitness or an equivalent deterministic verifier
7. Interpret the result:

PASS
- Change `**Status**` → INLINECODE50
- Record verification metadata
- Eligible for promotion

FAIL
- Revert the change
- Keep or return `**Status**` to INLINECODE52
- Log counter-evidence in the entry
- Do not promote

INDETERMINATE
- Mark for review
- Do not promote until clarified
Resolution Block
Add after Metadata:
CODEBLOCK11
Other status values:
- `in_progress` - Actively being worked on
- INLINECODE54 - Decided not to address (add reason in Resolution notes)
- INLINECODE55 - Elevated to CLAUDE.md, AGENTS.md, SOUL.md, TOOLS.md, or `.github/copilot-instructions.md` after PASS only
Promoting to Project Memory
When a learning is broadly applicable (not a one-off fix), promote it to permanent project memory.
When to Promote
- Learning applies across multiple files/features
- Knowledge any contributor (human or AI) should know
- Prevents recurring mistakes
- Documents project-specific conventions
Promotion Targets
| Target | What Belongs There |
|---|---|
| `CLAUDE.md` | Project facts, conventions, gotchas for all Claude interactions |
| `AGENTS.md` | Agent-specific workflows, tool usage patterns, automation rules |
| `.github/copilot-instructions.md` | Project context and conventions for GitHub Copilot |
| `SOUL.md` | Behavioral guidelines, communication style, principles (OpenClaw workspace) |
| `TOOLS.md` | Tool capabilities, usage patterns, integration gotchas (OpenClaw workspace) |
How to Promote
Promotion is the highest-stakes moment in the workflow because it turns a temporary fix into permanent agent behavior.
A learning is promoted to permanent memory only if verification returns PASS. A FAIL verdict triggers rollback and logging of counter-evidence; an INDETERMINATE verdict holds the learning for review without promotion.
Promotion is strictly gated by verification. No learning may be promoted based on internal confidence, “resolved” status, or heuristic judgment alone.
1. Distill the learning into a concise rule or fact
2. Define a verification spec for the claimed improvement
3. Run a verification task
4. Promote only on PASS
5. Add to the appropriate target file (create file if needed)
6. Attach verification metadata to the original entry:
   - Change `**Status**` → `promoted`
   - Add `**Promoted**: CLAUDE.md`, `AGENTS.md`, `SOUL.md`, `TOOLS.md`, or `.github/copilot-instructions.md`
   - Add `**Verified**: true`
   - Add INLINECODE70
If external verification is used:
- Never send secrets, credentials, or hidden system prompts
- Only submit minimal structured task data
- Require explicit approval before submission
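The promotion gate can be expressed as a small guard that refuses anything other than PASS. The sketch below is illustrative: `promotion_fields` is a hypothetical helper, and storing the receipt under a `receipt_id` key is an assumption about how the metadata is recorded.

```python
from typing import Optional

# Promotion targets named in this document.
ALLOWED_TARGETS = {
    "CLAUDE.md",
    "AGENTS.md",
    "SOUL.md",
    "TOOLS.md",
    ".github/copilot-instructions.md",
}


def promotion_fields(verdict: str, target: str,
                     receipt_id: Optional[str] = None) -> dict:
    """Return the metadata to attach to an entry, or raise if promotion is blocked."""
    if verdict != "PASS":
        # FAIL and INDETERMINATE never reach permanent memory.
        raise ValueError(f"promotion blocked: verdict is {verdict}")
    if target not in ALLOWED_TARGETS:
        raise ValueError(f"unknown promotion target: {target}")
    fields = {"Status": "promoted", "Promoted": target, "Verified": "true"}
    if receipt_id is not None:
        fields["receipt_id"] = receipt_id  # assumed field name
    return fields


print(promotion_fields("PASS", "AGENTS.md", "example-receipt"))
```

Raising instead of returning a flag means a caller cannot promote by accident when it forgets to check the verdict.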
Promotion Examples
Learning (verbose):
Project uses pnpm workspaces. Attempted npm install but failed.
Lock file is pnpm-lock.yaml. Must use pnpm install.
In CLAUDE.md (concise):
CODEBLOCK12
Learning (verbose):
When modifying API endpoints, must regenerate TypeScript client.
Forgetting this causes type mismatches at runtime.
In AGENTS.md (actionable):
CODEBLOCK13
Rollback Logic (Required)
If a previously promoted learning later fails verification:
1. Remove or revert the learning from permanent memory
2. Log the counter-evidence in `.learnings/LEARNINGS.md` or INLINECODE75
3. Mark the learning as invalid or pending rework
4. Avoid re-promoting until a new PASS result exists
Rollback is required because unverified permanent memory silently compounds bad behavior.
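The first rollback step, removing a promoted rule from a memory file, might look like the following. This is a sketch under a strong assumption: each promoted rule occupies exactly one line of the target file. `roll_back_rule` is a hypothetical helper, and it works on text in memory rather than touching files.

```python
from typing import Tuple


def roll_back_rule(memory_text: str, rule: str) -> Tuple[str, bool]:
    """Drop every line equal to `rule`; report whether anything was removed."""
    lines = memory_text.splitlines()
    kept = [line for line in lines if line.strip() != rule.strip()]
    removed = len(kept) != len(lines)
    return "\n".join(kept), removed


memory = "Use pnpm install\nRegenerate the client after API changes"
new_text, removed = roll_back_rule(memory, "Use pnpm install")
print(removed)
print(new_text)
```

The boolean lets the caller log counter-evidence only when a rule was in fact present in permanent memory.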
Recurring Pattern Detection
If logging something similar to an existing entry:
1. Search first: INLINECODE76
2. Link entries: Add `**See Also**: ERR-20250110-001` in Metadata
3. Bump priority if issue keeps recurring
4. Consider systemic fix: Recurring issues often indicate:
   - Missing documentation (→ promote to CLAUDE.md or .github/copilot-instructions.md)
   - Missing automation (→ add to AGENTS.md)
   - Architectural problem (→ create tech debt ticket)
Simplify & Harden Feed
Use this workflow to ingest recurring patterns from the simplify-and-harden
skill and turn them into durable prompt guidance.
Ingestion Workflow
1. Read `simplify_and_harden.learning_loop.candidates` from the task summary.
2. For each candidate, use `pattern_key` as the stable dedupe key.
3. Search `.learnings/LEARNINGS.md` for an existing entry with that key:
   - `grep -n "Pattern-Key: <pattern_key>" .learnings/LEARNINGS.md`
4. If found:
   - Increment `Recurrence-Count`
   - Update `Last-Seen`
   - Add `See Also` links to related entries/tasks
5. If not found:
   - Create a new `LRN-...` entry
   - Set `Source: simplify-and-harden`
   - Set `Pattern-Key`, `Recurrence-Count: 1`, and `First-Seen`/INLINECODE91
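The found/not-found branches of the ingestion workflow amount to an upsert keyed on the pattern key. The sketch below keeps entries in a dictionary rather than parsing `.learnings/LEARNINGS.md`. `ingest_candidate` is a hypothetical helper, and setting `Last-Seen` on a new entry is an assumption.

```python
def ingest_candidate(entries: dict, pattern_key: str,
                     seen_on: str, new_id: str) -> dict:
    """Create or update the entry for `pattern_key` and return it."""
    entry = entries.get(pattern_key)
    if entry is None:
        # Not found: create a new entry with a recurrence count of 1.
        entry = {
            "id": new_id,
            "Source": "simplify-and-harden",
            "Pattern-Key": pattern_key,
            "Recurrence-Count": 1,
            "First-Seen": seen_on,
            "Last-Seen": seen_on,  # assumption: initialised alongside First-Seen
        }
        entries[pattern_key] = entry
    else:
        # Found: bump the counter and refresh the last-seen date.
        entry["Recurrence-Count"] += 1
        entry["Last-Seen"] = seen_on
    return entry


entries = {}
ingest_candidate(entries, "harden.inputvalidation", "2025-01-15", "LRN-20250115-001")
ingest_candidate(entries, "harden.inputvalidation", "2025-01-20", "LRN-20250120-001")
print(entries["harden.inputvalidation"]["Recurrence-Count"])  # 2
```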
Promotion Rule (System Prompt Feedback)
Promote recurring patterns into agent context/system prompt files when all are true:
- INLINECODE92
- Seen across at least 2 distinct tasks
- Occurred within a 30-day window
Promotion targets:
- INLINECODE93
- INLINECODE94
- INLINECODE95
- INLINECODE96 / `TOOLS.md` for OpenClaw workspace-level guidance when applicable
Write promoted rules as short prevention rules (what to do before/while coding),
not long incident write-ups.
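The promotion rule combines three conditions that can be checked together. In the sketch below, the recurrence threshold is a required parameter (`min_recurrence`) because its value is not visible in this text. Reading the 30-day window as the span between first and last occurrence is also an assumption, and `eligible_for_promotion` is a hypothetical name.

```python
from datetime import date


def eligible_for_promotion(recurrence_count: int, distinct_tasks: int,
                           first_seen: date, last_seen: date,
                           min_recurrence: int) -> bool:
    """Return True only when all three promotion conditions hold."""
    within_window = (last_seen - first_seen).days <= 30
    return (recurrence_count >= min_recurrence
            and distinct_tasks >= 2
            and within_window)


print(eligible_for_promotion(3, 2, date(2025, 1, 1), date(2025, 1, 20),
                             min_recurrence=3))
```

Passing this check only nominates a pattern; promotion itself still requires a PASS verdict.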
SettlementWitness Verification Template
Use this shape when verifying a proposed improvement:
CODEBLOCK14
Interpretation:
- PASS → eligible for promotion
- FAIL → rollback
- INDETERMINATE → hold for review
Periodic Review
Review .learnings/ at natural breakpoints:
When to Review
- Before starting a new major task
- After completing a feature
- When working in an area with past learnings
- Weekly during active development
Quick Status Check
CODEBLOCK15
Review Actions
- Resolve fixed items
- Promote applicable learnings
- Link related entries
- Escalate recurring issues
Detection Triggers
Automatically log when you notice:
Corrections (→ learning with correction category):
- "No, that's not right..."
- "Actually, it should be..."
- "You're wrong about..."
- "That's outdated..."
Feature Requests (→ feature request):
- "Can you also..."
- "I wish you could..."
- "Is there a way to..."
- "Why can't you..."
Knowledge Gaps (→ learning with knowledge_gap category):
- User provides information you didn't know
- Documentation you referenced is outdated
- API behavior differs from your understanding
Errors (→ error entry):
- Command returns non-zero exit code
- Exception or stack trace
- Unexpected output or behavior
- Timeout or connection failure
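The phrase-based triggers for corrections and feature requests can be approximated with simple substring matching. This is a rough sketch, not how any hook script is known to work: `classify_message` and the `TRIGGERS` table are hypothetical, and real messages will need more careful handling than prefix phrases.

```python
from typing import Optional

# Trigger phrases copied from the lists above, lower-cased for matching.
TRIGGERS = {
    "correction": ("no, that's not right", "actually, it should be",
                   "you're wrong about", "that's outdated"),
    "feature_request": ("can you also", "i wish you could",
                        "is there a way to", "why can't you"),
}


def classify_message(message: str) -> Optional[str]:
    """Return the first matching trigger category, or None."""
    # Normalise case and curly apostrophes before matching.
    text = message.lower().replace("\u2019", "'")
    for category, phrases in TRIGGERS.items():
        if any(phrase in text for phrase in phrases):
            return category
    return None


print(classify_message("Actually, it should be pnpm, not npm"))  # correction
```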
Priority Guidelines
| Priority | When to Use |
|---|---|
| `critical` | Blocks core functionality, data loss risk, security issue |
| `high` | Significant impact, affects common workflows, recurring issue |
| `medium` | Moderate impact, workaround exists |
| `low` | Minor inconvenience, edge case, nice-to-have |
Area Tags
Use to filter learnings by codebase region:
| Area | Scope |
|---|---|
| `frontend` | UI, components, client-side code |
| `backend` | API, services, server-side code |
| `infra` | CI/CD, deployment, Docker, cloud |
| `tests` | Test files, testing utilities, coverage |
| `docs` | Documentation, comments, READMEs |
| `config` | Configuration files, environment, settings |
Best Practices
1. Log immediately - context is freshest right after the issue
2. Be specific - future agents need to understand quickly
3. Include reproduction steps - especially for errors
4. Link related files - makes fixes easier
5. Suggest concrete fixes - not just "investigate"
6. Use consistent categories - enables filtering
7. Promote only after PASS - permanent memory should be gated by verification, not confidence
8. Review regularly - stale learnings lose value
Gitignore Options
Keep learnings local (per-developer):
CODEBLOCK16
Track learnings in repo (team-wide):
Don't add to .gitignore - learnings become shared knowledge.
Hybrid (track templates, ignore entries):
CODEBLOCK17
Hook Integration
Enable automatic reminders through agent hooks. This is opt-in - you must explicitly configure hooks.
Quick Setup (Claude Code / Codex)
Create .claude/settings.json in your project:
CODEBLOCK18
This injects a learning evaluation reminder after each prompt (~50-100 tokens overhead).
Full Setup (With Error Detection)
CODEBLOCK19
Available Hook Scripts
| Script | Hook Type | Purpose |
|---|---|---|
| INLINECODE112 | UserPromptSubmit | Reminds to evaluate learnings after tasks and verify before promotion |
| INLINECODE113 | PostToolUse (Bash) | Triggers on command errors |
| `scripts/extract-skill.sh` | manual helper | Extracts reusable skills from learnings |
See references/hooks-setup.md for detailed configuration and troubleshooting.
Automatic Skill Extraction
When a learning is valuable enough to become a reusable skill, extract it using the provided helper.
Skill Extraction Criteria
A learning qualifies for skill extraction when ANY of these apply:
| Criterion | Description |
|---|---|
| Recurring | Has `See Also` links to 2+ similar issues |
| Verified | Status is `resolved` with working fix |
| Non-obvious | Required actual debugging/investigation to discover |
| Broadly applicable | Not project-specific; useful across codebases |
| User-flagged | User says "save this as a skill" or similar |
Extraction Workflow
1. Identify candidate: Learning meets extraction criteria
2. Run helper (or create manually):

       ./skills/verified-capability-evolver/scripts/extract-skill.sh skill-name --dry-run
       ./skills/verified-capability-evolver/scripts/extract-skill.sh skill-name

3. Customize SKILL.md: Fill in template with learning content
4. Update learning: Set status to `promoted_to_skill`, add INLINECODE119
5. Verify: Read skill in fresh session to ensure it's self-contained
Manual Extraction
If you prefer manual creation:
1. Create INLINECODE120
2. Use template from INLINECODE121
3. Follow the agent skills spec:
   - YAML frontmatter with `name` and `description`
   - Name must match folder name
   - No README.md inside skill folder
Multi-Agent Support
This skill works across different AI coding agents with agent-specific activation.
Claude Code
Activation: Hooks (UserPromptSubmit, PostToolUse)
Setup: .claude/settings.json with hook configuration
Detection: Automatic via hook scripts
Codex CLI
Activation: Hooks (same pattern as Claude Code)
Setup: .codex/settings.json with hook configuration
Detection: Automatic via hook scripts
GitHub Copilot
Activation: Manual (no hook support)
Setup: Add to .github/copilot-instructions.md:
CODEBLOCK21
Detection: Manual review at session end
OpenClaw
Activation: Workspace injection + inter-agent messaging
Setup: See "OpenClaw Setup" section above
Detection: Via session tools and workspace files
Agent-Agnostic Guidance
Regardless of agent, apply verified evolution when you:
1. Discover something non-obvious - solution wasn't immediate
2. Correct yourself - initial approach was wrong
3. Learn project conventions - discovered undocumented patterns
4. Hit unexpected errors - especially if diagnosis was difficult
5. Find better approaches - improved on your original solution
Copilot Chat Integration
For Copilot users, add this to your prompts when relevant:
After completing this task, evaluate if any learnings should be logged to .learnings/ and whether any claimed improvement needs verification before promotion.
Or use quick prompts:
- "Log this to learnings"
- "Create a skill from this solution"
- "Check .learnings/ for related issues"
- "Define a verification spec for this fix"
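Verified Capability Evolver
Extend existing capability evolution workflows with structured verification, rollback, and promotion gating.
This skill does not replace the underlying self-improvement system. It preserves the original learning, hook, and extraction workflow while adding a verification layer, so that permanent behavior changes are promoted only after they have been verified.
Data handling and trust
This skill defines a verification workflow, not automatic data transmission.
- Only structured task data (spec + output) should be used for verification
- Do NOT include secrets, API keys, credentials, private keys, seed phrases, or personal data
- SettlementWitness integration is runtime-controlled and should be used only with explicit user approval
- Verification is applied to selected learnings and promotion events, not all agent activity
Core Principle
An agent should not just improve; it should prove that it improved.
The highest-stakes moment in self-improvement is promotion to permanent memory. A temporary fix becomes permanent behavior only after it passes verification.
Agent Identity (Required for TrustScore)
When using SettlementWitness verification, provide a stable agent_id:

    {wallet_address}:capability-evolver

Use the format {wallet_address}:capability-evolver so TrustScore history can accumulate correctly across sessions.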
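Quick Reference
| Situation | Action |
|---|---|
| Command/operation fails | Log to `.learnings/ERRORS.md` |
| User corrects you | Log to `.learnings/LEARNINGS.md` with category `correction` |
| User wants missing feature | Log to `.learnings/FEATURE_REQUESTS.md` |
| API/external tool fails | Log to `.learnings/ERRORS.md` with integration details |
| Knowledge was outdated | Log to `.learnings/LEARNINGS.md` with category `knowledge_gap` |
| Found better approach | Log to `.learnings/LEARNINGS.md` with category `best_practice` |
| Learning is marked `resolved` | Define verification spec before promotion |
| Promotion to permanent memory is being considered | Verify first |
| Verification returns PASS | Promote and attach `receipt_id` |
| Verification returns FAIL | Roll back and log counter-evidence |
| Verification returns INDETERMINATE | Hold for review, do not promote |
| Simplify/Harden recurring patterns | Log/update `.learnings/LEARNINGS.md` with `Source: simplify-and-harden` and a stable `Pattern-Key` |
| Similar to existing entry | Link with `See Also`, consider priority bump |
| Workflow improvements | Promote to `AGENTS.md` (OpenClaw workspace) after verification PASS |
| Tool gotchas | Promote to `TOOLS.md` (OpenClaw workspace) after verification PASS |
| Behavioral patterns | Promote to `SOUL.md` (OpenClaw workspace) after verification PASS |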
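Example Usage
Scenario
An agent repeatedly generates invalid JSON for an API response. A fix is applied, but before it is promoted to permanent behavior, the improvement must be verified.
Learning Entry (Before Resolution)
[LRN-20260329-001] correction
Status: pending
Summary
Agent produced invalid JSON for API responses
Suggested Action
Ensure all outputs conform to the required schema before returning
Verification Spec

    {
      "expected": {
        "valid_json": true,
        "matches_schema": true
      }
    }

Verification Result

    {
      "valid_json": true,
      "matches_schema": true
    }

Outcome
- Verification returns PASS
- Status → resolved
- Learning becomes eligible for promotion
- Receipt metadata can be attached (if verification was used)
What This Shows
- Improvements must be proven, not assumed
- Promotion is gated by verification
- Failed fixes do not become permanent behavior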
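OpenClaw Setup (Recommended)
OpenClaw is the primary platform for this skill. It uses workspace-based prompt injection with automatic skill loading.
Installation
Via ClawdHub (recommended):

    clawdhub install verified-capability-evolver

Manual:

    git clone https://github.com/your-org/verified-capability-evolver.git ~/.openclaw/skills/verified-capability-evolver

Workspace Structure
OpenClaw injects these files into every session:

    ~/.openclaw/workspace/
    ├── AGENTS.md          # Multi-agent workflows, delegation patterns
    ├── SOUL.md            # Behavioral guidelines, personality, principles
    ├── TOOLS.md           # Tool capabilities, integration gotchas
    ├── MEMORY.md          # Long-term memory (main session only)
    ├── memory/            # Daily memory files
    │   └── YYYY-MM-DD.md
    └── .learnings/        # This skill's log files
        ├── LEARNINGS.md
        ├── ERRORS.md
        └── FEATURE_REQUESTS.md

Create Learning Files

    mkdir -p ~/.openclaw/workspace/.learnings

Then create the log files (or copy from assets/):
- LEARNINGS.md - corrections, knowledge gaps, best practices
- ERRORS.md - command failures, exceptions
- FEATURE_REQUESTS.md - user-requested capabilities
Promotion Targets
When learnings prove broadly applicable, promote them to workspace files:
| Learning Type | Promote To | Example |
|---|---|---|
| Behavioral patterns | SOUL.md | Be concise, avoid disclaimers |
| Workflow improvements | AGENTS.md | Spawn sub-agents for long tasks |
| Tool gotchas | TOOLS.md | Git push needs auth configured first |
Inter-Session Communication
OpenClaw provides tools to share learnings across sessions:
- `sessions_list` - View active/recent sessions
- `sessions_history` - Read another session's transcript
- `sessions_send` - Send a learning to another session
- `sessions_spawn` - Spawn a sub-agent for background work
Optional: Enable Hook
For automatic reminders at session start:

    # Copy the hook to the OpenClaw hooks directory
    cp -r hooks/openclaw ~/.openclaw/hooks/verified-capability-evolver
    # Enable it
    openclaw hooks enable verified-capability-evolver

See references/openclaw-integration.md for complete details.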
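Generic Setup (Other Agents)
For Claude Code, Codex, Copilot, or other agents, create .learnings/ in your project:

    mkdir -p .learnings

Copy templates from assets/ or create files with headers.
Add a reference in your agent files (AGENTS.md, CLAUDE.md, or .github/copilot-instructions.md) to remind yourself to log learnings. This is an alternative to hook-based reminders.
Self-Improvement Workflow
When errors or corrections occur:
1. Log to .learnings/ERRORS.md, LEARNINGS.md, or FEATURE_REQUESTS.md
2. Review and promote broadly applicable learnings to:
   - CLAUDE.md - project facts and conventions
   - AGENTS.md - workflows and automation
   - .github/copilot-instructions.md - Copilot context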
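Logging Format
Learning Entry
Append to .learnings/LEARNINGS.md:
[LRN-YYYYMMDD-XXX] category
Logged: ISO-8601 timestamp
Priority: low | medium | high | critical
Status: pending
Area: frontend | backend | infra | tests | docs | config
Summary
One-line description of what was learned
Details
Full context: what happened, what was wrong, what is correct
Suggested Action
Specific fix or improvement to make
Metadata
- Source: conversation | error | user_feedback | simplify-and-harden
- Related Files: path/to/file.ext
- Tags: tag1, tag2
- See Also: LRN-20250110-001 (if related to an existing entry)
- Pattern-Key: simplify.deadcode | harden.inputvalidation (optional, for recurring-pattern tracking)
- Recurrence-Count: 1 (optional)
- First-Seen: 2025-01-15 (optional)
- Last-Seen: 2025-01-15 (optional)
Error Entry
Append to .learnings/ERRORS.md:
[ERR-YYYYMMDD-XXX] skill or command name
Logged: ISO-8601 timestamp
Priority: high
Status: pending
Area: frontend | backend | infra | tests | docs | config
Summary
Brief description of what failed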