Verified Capability Evolver

Extend existing capability evolution workflows with structured verification, rollback, and promotion gating.

This skill does not replace the underlying self-improvement system. It preserves the original learning, hook, and extraction workflow while adding a verification layer so permanent behavior changes are only promoted when they are proven.

Data handling and trust

This skill defines a verification workflow, not automatic data transmission.

- Only structured task data (spec + output) should be used for verification
- Do NOT include secrets, API keys, credentials, private keys, seed phrases, or personal data
- SettlementWitness integration is runtime-controlled and should be used only with explicit user approval
- Verification is applied to selected learnings and promotion events, not all agent activity

Core Principle

Example Usage

Scenario

An agent repeatedly generates invalid JSON for an API response. A fix is applied, but before promoting this as permanent behavior, the improvement must be verified.

Learning Entry (Before Resolution)

[LRN-20260329-001] correction

Status: pending

Summary

Agent produced invalid JSON format for API responses

Suggested Action

Ensure all outputs conform to required schema before returning

Verification Spec

{
"expected": {
"valid_json": true,
"matches_schema": true
}
}

Verification Result

{
"valid_json": true,
"matches_schema": true
}

Outcome

- Verification returns PASS
- Status → resolved
- Learning becomes eligible for promotion
- Receipt metadata can be attached (if verification used)
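The comparison between the verification spec and the verification result can be sketched as follows. This is a minimal illustration, not the SettlementWitness implementation: the function name `verdict` is hypothetical, and treating a missing or empty check as INDETERMINATE is an assumption made for the sketch.

```python
from typing import Any, Mapping


def verdict(spec: Mapping[str, Any], result: Mapping[str, Any]) -> str:
    """Compare a result against spec["expected"] and return a verdict string."""
    expected = spec.get("expected", {})
    if not expected:
        # Assumption: with nothing to check against, no claim can be proven.
        return "INDETERMINATE"
    if any(key not in result for key in expected):
        # Assumption: a check that was never reported is neither PASS nor FAIL.
        return "INDETERMINATE"
    if all(result[key] == expected[key] for key in expected):
        return "PASS"
    return "FAIL"


spec = {"expected": {"valid_json": True, "matches_schema": True}}
result = {"valid_json": True, "matches_schema": True}
print(verdict(spec, result))  # PASS
```

Only a PASS verdict makes the learning eligible for promotion; the other two verdicts leave permanent memory unchanged.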

What This Shows

- Improvements must be proven, not assumed
- Promotion is gated by verification
- Failed fixes do not become permanent behavior

An agent should not just improve — it should prove that it improved.

The highest-stakes moment in self-improvement is promotion to permanent memory. A temporary fix becomes permanent behavior only after it passes verification.

Agent Identity (Required for TrustScore)

When using SettlementWitness verification, provide a stable agent_id:

CODEBLOCK0

Use the format {wallet_address}:capability-evolver so TrustScore history can compound correctly across sessions.
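A small helper can keep the identifier stable across sessions. This is a sketch under assumptions: the function name `make_agent_id` is hypothetical, the rejection of empty values and of values containing `:` is an added safeguard rather than a documented rule, and `0xABC123` is a placeholder, not a real wallet address.

```python
def make_agent_id(wallet_address: str) -> str:
    """Build an agent_id in the {wallet_address}:capability-evolver format."""
    address = wallet_address.strip()
    if not address or ":" in address:
        # Assumed safeguard: keep the id unambiguous when split on ":".
        raise ValueError("wallet_address must be non-empty and contain no ':'")
    return f"{address}:capability-evolver"


print(make_agent_id("0xABC123"))  # 0xABC123:capability-evolver
```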

Quick Reference

Situation	Action
Command/operation fails	Log to INLINECODE1
User corrects you

OpenClaw Setup (Recommended)

OpenClaw is the primary platform for this skill. It uses workspace-based prompt injection with automatic skill loading.

Installation

Via ClawdHub (recommended):
CODEBLOCK1

Manual:
CODEBLOCK2

Workspace Structure

OpenClaw injects these files into every session:

CODEBLOCK3

Create Learning Files

CODEBLOCK4

Then create the log files (or copy from assets/):

- LEARNINGS.md - corrections, knowledge gaps, best practices
- INLINECODE21 - command failures, exceptions
- INLINECODE22 - user-requested capabilities

Promotion Targets

When learnings prove broadly applicable, promote them to workspace files:

Learning Type	Promote To	Example
Behavioral patterns	INLINECODE23	"Be concise, avoid disclaimers"
Workflow improvements

Inter-Session Communication

OpenClaw provides tools to share learnings across sessions:

- sessionslist - View active/recent sessions
- sessionshistory - Read another session's transcript
- sessionssend - Send a learning to another session
- sessionsspawn - Spawn a sub-agent for background work

Optional: Enable Hook

For automatic reminders at session start:

CODEBLOCK5

See references/openclaw-integration.md for complete details.

Generic Setup (Other Agents)

For Claude Code, Codex, Copilot, or other agents, create .learnings/ in your project:

CODEBLOCK6

Copy templates from assets/ or create files with headers.

Add a reference in your agent files (AGENTS.md, CLAUDE.md, or .github/copilot-instructions.md) to remind yourself to log learnings. This is an alternative to hook-based reminders.

Self-Improvement Workflow

When errors or corrections occur:

1. Log to .learnings/ERRORS.md, LEARNINGS.md, or INLINECODE31
2. Review and promote broadly applicable learnings to:

- CLAUDE.md - project facts and conventions
- AGENTS.md - workflows and automation
- .github/copilot-instructions.md - Copilot context

Logging Format

Learning Entry

Append to .learnings/LEARNINGS.md:

CODEBLOCK7

Error Entry

Append to .learnings/ERRORS.md:

CODEBLOCK8
Actual error message or output
CODEBLOCK9

Feature Request Entry

Append to .learnings/FEATURE_REQUESTS.md:

CODEBLOCK10

ID Generation

Format: TYPE-YYYYMMDD-XXX

- TYPE: LRN (learning), ERR (error), FEAT (feature)
- YYYYMMDD: Current date
- XXX: Sequential number or random 3 chars (e.g., 001, A7B)

Examples: LRN-20250115-001, ERR-20250115-A3F, INLINECODE46
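The ID format can be generated mechanically. The following is a minimal sketch of the sequential-number variant only; the function name `make_entry_id` and the `PREFIXES` mapping keys are illustrative names, not part of the skill.

```python
from datetime import date
from typing import Optional

# TYPE prefixes as listed above; the dictionary keys are illustrative.
PREFIXES = {"learning": "LRN", "error": "ERR", "feature": "FEAT"}


def make_entry_id(kind: str, seq: int, today: Optional[date] = None) -> str:
    """Return an ID in TYPE-YYYYMMDD-XXX form using a zero-padded sequence."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown entry kind: {kind}")
    if not 0 <= seq <= 999:
        raise ValueError("seq must fit in three digits")
    day = today or date.today()
    return f"{PREFIXES[kind]}-{day:%Y%m%d}-{seq:03d}"


print(make_entry_id("learning", 1, date(2025, 1, 15)))  # LRN-20250115-001
```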

Resolving Entries

When an issue appears fixed, do not immediately treat it as permanent learning.

Updated Resolution Flow

1. Change **Status**: pending → INLINECODE48
2. Apply the proposed fix or workflow change
3. Define a deterministic verification spec:

- What should now succeed?
- What output should be produced?
- What failure should no longer occur?

4. Execute a verification task using that spec
5. If external verification is being used, obtain explicit approval before submitting minimal structured task data
6. Verify the result using SettlementWitness or an equivalent deterministic verifier
7. Interpret the result:

PASS

- Change **Status** → INLINECODE50
- Record verification metadata
- Eligible for promotion

FAIL

- Revert the change
- Keep or return **Status** to INLINECODE52
- Log counter-evidence in the entry
- Do not promote

INDETERMINATE

- Mark for review
- Do not promote until clarified
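The three outcomes above amount to a small state transition on the learning entry. The sketch below is illustrative only: the dictionary field names (`status`, `eligible_for_promotion`, `revert_change`, `needs_review`) are hypothetical, and the status strings `resolved` and `pending` are taken from the Example Usage section as an assumption about the values used here.

```python
from typing import Any, Dict


def apply_verdict(entry: Dict[str, Any], verdict: str) -> Dict[str, Any]:
    """Return a copy of the entry updated according to the verification verdict."""
    updated = dict(entry)
    updated["eligible_for_promotion"] = False  # default: never promote
    if verdict == "PASS":
        updated["status"] = "resolved"  # assumed status value
        updated["eligible_for_promotion"] = True
    elif verdict == "FAIL":
        updated["status"] = "pending"  # assumed status value
        updated["revert_change"] = True  # caller must undo the applied fix
    elif verdict == "INDETERMINATE":
        updated["needs_review"] = True  # hold until clarified
    else:
        raise ValueError(f"unknown verdict: {verdict}")
    return updated


entry = {"id": "LRN-20260329-001", "status": "pending"}
print(apply_verdict(entry, "PASS")["eligible_for_promotion"])  # True
```

Defaulting `eligible_for_promotion` to false means an unexpected code path cannot promote a learning by accident.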

Resolution Block

Add after Metadata:

CODEBLOCK11

Other status values:

- in_progress - Actively being worked on
- INLINECODE54 - Decided not to address (add reason in Resolution notes)
- INLINECODE55 - Elevated to CLAUDE.md, AGENTS.md, SOUL.md, TOOLS.md, or .github/copilot-instructions.md after PASS only

Promoting to Project Memory

When a learning is broadly applicable (not a one-off fix), promote it to permanent project memory.

When to Promote

- Learning applies across multiple files/features
- Knowledge any contributor (human or AI) should know
- Prevents recurring mistakes
- Documents project-specific conventions

Promotion Targets

Target	What Belongs There
INLINECODE57	Project facts, conventions, gotchas for all Claude interactions
INLINECODE58	Agent-specific workflows, tool usage patterns, automation rules
.github/copilot-instructions.md	Project context and conventions for GitHub Copilot
SOUL.md	Behavioral guidelines, communication style, principles (OpenClaw workspace)
TOOLS.md	Tool capabilities, usage patterns, integration gotchas (OpenClaw workspace)

How to Promote

Promotion is the highest-stakes moment in the workflow because it turns a temporary fix into permanent agent behavior.
A learning is promoted to permanent memory only if verification returns PASS. FAIL triggers rollback and logging; INDETERMINATE holds the learning for review.
Promotion is strictly gated by verification. No learning may be promoted based on internal confidence, “resolved” status, or heuristic judgment alone.

1. Distill the learning into a concise rule or fact
2. Define a verification spec for the claimed improvement
3. Run a verification task
4. Promote only on PASS
5. Add to the appropriate target file (create file if needed)
6. Attach verification metadata to the original entry:

- Change **Status** → promoted
- Add **Promoted**: CLAUDE.md, AGENTS.md, SOUL.md, TOOLS.md, or .github/copilot-instructions.md
- Add **Verified**: true
- Add INLINECODE70

If external verification is used:

- never send secrets, credentials, or hidden system prompts
- only submit minimal structured task data
- require explicit approval before submission

Promotion Examples

Learning (verbose):

Project uses pnpm workspaces. Attempted npm install but failed.

Lock file is pnpm-lock.yaml. Must use pnpm install.

In CLAUDE.md (concise):
CODEBLOCK12

Learning (verbose):

When modifying API endpoints, must regenerate TypeScript client.

Forgetting this causes type mismatches at runtime.

In AGENTS.md (actionable):
CODEBLOCK13

Rollback Logic (Required)

If a previously promoted learning later fails verification:

1. Remove or revert the learning from permanent memory
2. Log the counter-evidence in .learnings/LEARNINGS.md or INLINECODE75
3. Mark the learning as invalid or pending rework
4. Avoid re-promoting until a new PASS result exists

Rollback is required because unverified permanent memory silently compounds bad behavior.

Recurring Pattern Detection

If logging something similar to an existing entry:

1. Search first: INLINECODE76
2. Link entries: Add **See Also**: ERR-20250110-001 in Metadata
3. Bump priority if issue keeps recurring
4. Consider systemic fix: Recurring issues often indicate:

- Missing documentation (→ promote to CLAUDE.md or .github/copilot-instructions.md)
- Missing automation (→ add to AGENTS.md)
- Architectural problem (→ create tech debt ticket)

Simplify & Harden Feed

Use this workflow to ingest recurring patterns from the simplify-and-harden
skill and turn them into durable prompt guidance.

Ingestion Workflow

1. Read simplify_and_harden.learning_loop.candidates from the task summary.
2. For each candidate, use pattern_key as the stable dedupe key.
3. Search .learnings/LEARNINGS.md for an existing entry with that key:

- grep -n "Pattern-Key: <pattern_key>" .learnings/LEARNINGS.md

4. If found:

- Increment Recurrence-Count
- Update Last-Seen
- Add See Also links to related entries/tasks

5. If not found:

- Create a new LRN-... entry
- Set Source: simplify-and-harden
- Set Pattern-Key, Recurrence-Count: 1, and First-Seen/INLINECODE91
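The found / not-found branches above form an upsert keyed on `pattern_key`. The sketch below keeps entries in an in-memory dictionary instead of parsing LEARNINGS.md, which is a simplification; the function name `ingest` and the candidate shape (a dict with only a `pattern_key` field) are assumptions, as is the example key `harden.example`.

```python
from typing import Any, Dict, List


def ingest(entries: Dict[str, Dict[str, Any]],
           candidates: List[Dict[str, Any]],
           today: str) -> Dict[str, Dict[str, Any]]:
    """Upsert candidates into entries, deduplicating on pattern_key."""
    for candidate in candidates:
        key = candidate["pattern_key"]
        if key in entries:
            # Found: bump the recurrence count and refresh Last-Seen.
            entries[key]["Recurrence-Count"] += 1
            entries[key]["Last-Seen"] = today
        else:
            # Not found: create a new entry with a count of 1.
            entries[key] = {
                "Source": "simplify-and-harden",
                "Pattern-Key": key,
                "Recurrence-Count": 1,
                "First-Seen": today,
                "Last-Seen": today,
            }
    return entries


log: Dict[str, Dict[str, Any]] = {}
ingest(log, [{"pattern_key": "harden.example"}], "2025-01-15")
ingest(log, [{"pattern_key": "harden.example"}], "2025-01-20")
print(log["harden.example"]["Recurrence-Count"])  # 2
```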

Promotion Rule (System Prompt Feedback)

Promote recurring patterns into agent context/system prompt files when all are true:

- INLINECODE92
- Seen across at least 2 distinct tasks
- Occurred within a 30-day window
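The rule can be expressed as a single predicate. This is a sketch under stated assumptions: the first condition is not spelled out in this text, so the code assumes it is a minimum recurrence count and takes the threshold as a required argument instead of guessing a value. Reading "within a 30-day window" as the span between first and last sighting is also an interpretation, and all names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable


def meets_promotion_rule(recurrence_count: int,
                         task_ids: Iterable[str],
                         first_seen: date,
                         last_seen: date,
                         min_recurrence: int,
                         window_days: int = 30) -> bool:
    """Return True only when all three promotion conditions hold."""
    recurring = recurrence_count >= min_recurrence  # assumed first condition
    distinct_tasks = len(set(task_ids)) >= 2
    in_window = (last_seen - first_seen) <= timedelta(days=window_days)
    return recurring and distinct_tasks and in_window


ok = meets_promotion_rule(
    recurrence_count=4,
    task_ids=["task-a", "task-b", "task-a"],
    first_seen=date(2025, 1, 1),
    last_seen=date(2025, 1, 20),
    min_recurrence=3,  # illustrative threshold, not taken from the source
)
print(ok)  # True
```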

Promotion targets:

- INLINECODE93
- INLINECODE94
- INLINECODE95
- INLINECODE96 / TOOLS.md for OpenClaw workspace-level guidance when applicable

Write promoted rules as short prevention rules (what to do before/while coding),
not long incident write-ups.

SettlementWitness Verification Template

Use this shape when verifying a proposed improvement:

CODEBLOCK14

Interpretation:

- PASS → eligible for promotion
- FAIL → rollback
- INDETERMINATE → hold for review

Periodic Review

Review .learnings/ at natural breakpoints:

When to Review

- Before starting a new major task
- After completing a feature
- When working in an area with past learnings
- Weekly during active development

Quick Status Check

CODEBLOCK15

Review Actions

- Resolve fixed items
- Promote applicable learnings
- Link related entries
- Escalate recurring issues

Detection Triggers

Automatically log when you notice:

Corrections (→ learning with correction category):

- "No, that's not right..."
- "Actually, it should be..."
- "You're wrong about..."
- "That's outdated..."

Feature Requests (→ feature request):

- "Can you also..."
- "I wish you could..."
- "Is there a way to..."
- "Why can't you..."

Knowledge Gaps (→ learning with knowledge_gap category):

- User provides information you didn't know
- Documentation you referenced is outdated
- API behavior differs from your understanding

Errors (→ error entry):

- Command returns non-zero exit code
- Exception or stack trace
- Unexpected output or behavior
- Timeout or connection failure
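The phrase-based triggers for corrections and feature requests can be approximated with simple substring matching. This is a rough sketch, not how the hook scripts work: the function name `classify_message` and the returned labels are hypothetical, and real detection would need more than literal phrase lookup.

```python
from typing import Optional

# Trigger phrases copied from the lists above, lowercased for matching.
TRIGGERS = {
    "correction": (
        "no, that's not right",
        "actually, it should be",
        "you're wrong about",
        "that's outdated",
    ),
    "feature_request": (
        "can you also",
        "i wish you could",
        "is there a way to",
        "why can't you",
    ),
}


def classify_message(message: str) -> Optional[str]:
    """Return the entry type suggested by a user message, or None."""
    text = message.lower()
    for kind, phrases in TRIGGERS.items():
        if any(phrase in text for phrase in phrases):
            return kind
    return None


print(classify_message("Actually, it should be pnpm install"))  # correction
```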

Priority Guidelines

Priority	When to Use
INLINECODE101	Blocks core functionality, data loss risk, security issue
INLINECODE102

Area Tags

Use to filter learnings by codebase region:

Area	Scope
INLINECODE105	UI, components, client-side code
INLINECODE106

Best Practices

1. Log immediately - context is freshest right after the issue
2. Be specific - future agents need to understand quickly
3. Include reproduction steps - especially for errors
4. Link related files - makes fixes easier
5. Suggest concrete fixes - not just "investigate"
6. Use consistent categories - enables filtering
7. Promote only after PASS - permanent memory should be gated by verification, not confidence
8. Review regularly - stale learnings lose value

Gitignore Options

Keep learnings local (per-developer):
CODEBLOCK16

Track learnings in repo (team-wide):
Don't add to .gitignore - learnings become shared knowledge.

Hybrid (track templates, ignore entries):
CODEBLOCK17

Hook Integration

Enable automatic reminders through agent hooks. This is opt-in - you must explicitly configure hooks.

Quick Setup (Claude Code / Codex)

Create .claude/settings.json in your project:

CODEBLOCK18

This injects a learning evaluation reminder after each prompt (~50-100 tokens overhead).

Full Setup (With Error Detection)

CODEBLOCK19

Available Hook Scripts

Script	Hook Type	Purpose
INLINECODE112	UserPromptSubmit	Reminds to evaluate learnings after tasks and verify before promotion
INLINECODE113

See references/hooks-setup.md for detailed configuration and troubleshooting.

Automatic Skill Extraction

When a learning is valuable enough to become a reusable skill, extract it using the provided helper.

Skill Extraction Criteria

A learning qualifies for skill extraction when ANY of these apply:

Criterion	Description
Recurring	Has `See Also` links to 2+ similar issues
Verified

Extraction Workflow

1. Identify candidate: Learning meets extraction criteria
2. Run helper (or create manually):

   ./skills/verified-capability-evolver/scripts/extract-skill.sh skill-name --dry-run
   ./skills/verified-capability-evolver/scripts/extract-skill.sh skill-name

3. Customize SKILL.md: Fill in template with learning content
4. Update learning: Set status to promoted_to_skill, add INLINECODE119
5. Verify: Read skill in fresh session to ensure it's self-contained

Manual Extraction

If you prefer manual creation:

1. Create INLINECODE120
2. Use template from INLINECODE121
3. Follow the agent skills spec:

- YAML frontmatter with name and description
- Name must match folder name
- No README.md inside skill folder
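The frontmatter requirements can be checked before a manually created skill is used. The sketch below is a simplification: it parses only flat `key: value` lines rather than full YAML, it does not check for a stray README.md, and the function name and the example skill name `pnpm-setup` are hypothetical.

```python
from typing import Dict, List


def check_skill_frontmatter(folder_name: str, skill_md: str) -> List[str]:
    """Return a list of problems found in a SKILL.md frontmatter block."""
    lines = [line.strip() for line in skill_md.splitlines()]
    if not lines or lines[0] != "---":
        return ["missing YAML frontmatter"]
    try:
        end = lines.index("---", 1)  # position of the closing delimiter
    except ValueError:
        return ["frontmatter is not closed"]
    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    problems = []
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing field: {required}")
    if fields.get("name") and fields["name"] != folder_name:
        problems.append("name does not match folder name")
    return problems


doc = "---\nname: pnpm-setup\ndescription: Use pnpm in this repo\n---\n# Body\n"
print(check_skill_frontmatter("pnpm-setup", doc))  # []
```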

Multi-Agent Support

This skill works across different AI coding agents with agent-specific activation.

Claude Code

Activation: Hooks (UserPromptSubmit, PostToolUse)
Setup: .claude/settings.json with hook configuration
Detection: Automatic via hook scripts

Codex CLI

Activation: Hooks (same pattern as Claude Code)
Setup: .codex/settings.json with hook configuration
Detection: Automatic via hook scripts

GitHub Copilot

Activation: Manual (no hook support)
Setup: Add to .github/copilot-instructions.md:

CODEBLOCK21

Detection: Manual review at session end

OpenClaw

Activation: Workspace injection + inter-agent messaging
Setup: See "OpenClaw Setup" section above
Detection: Via session tools and workspace files

Agent-Agnostic Guidance

Regardless of agent, apply verified evolution when you:

1. Discover something non-obvious - solution wasn't immediate
2. Correct yourself - initial approach was wrong
3. Learn project conventions - discovered undocumented patterns
4. Hit unexpected errors - especially if diagnosis was difficult
5. Find better approaches - improved on your original solution

Copilot Chat Integration

For Copilot users, add this to your prompts when relevant:

After completing this task, evaluate if any learnings should be logged to .learnings/ and whether any claimed improvement needs verification before promotion.

Or use quick prompts:

- "Log this to learnings"
- "Create a skill from this solution"
- "Check .learnings/ for related issues"
- "Define a verification spec for this fix"

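Verified Capability Evolver

Extend existing capability evolution workflows with structured verification, rollback, and promotion gating.

This skill does not replace the underlying self-improvement system. It preserves the original learning, hook, and extraction workflow while adding a verification layer so permanent behavior changes are promoted only after they are verified.

Data handling and trust

This skill defines a verification workflow, not automatic data transmission.

- Only structured task data (spec + output) should be used for verification
- Do not include secrets, API keys, credentials, private keys, seed phrases, or personal data
- SettlementWitness integration is runtime-controlled and should be used only with explicit user approval
- Verification is applied to selected learnings and promotion events, not all agent activity

Core Principle

An agent should not just improve; it should prove that it improved.

The highest-stakes moment in self-improvement is promotion to permanent memory. A temporary fix becomes permanent behavior only after it passes verification.

Agent Identity (Required for TrustScore)

When using SettlementWitness verification, provide a stable agent_id:

text
{wallet_address}:capability-evolver

Use the format {wallet_address}:capability-evolver so TrustScore history can accumulate correctly across sessions.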

Quick Reference

Situation	Action
Command/operation fails	Log to .learnings/ERRORS.md
User corrects you

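Example Usage

Scenario

An agent repeatedly generates invalid JSON for an API response. A fix is applied, but before promoting this as permanent behavior, the improvement must be verified.

Learning Entry (Before Resolution)

[LRN-20260329-001] correction

Status: pending

Summary

Agent produced invalid JSON format for API responses

Suggested Action

Ensure all outputs conform to the required schema before returning

Verification Spec

{
"expected": {
"valid_json": true,
"matches_schema": true
}
}

Verification Result

{
"valid_json": true,
"matches_schema": true
}

Outcome

- Verification returns PASS
- Status → resolved
- Learning becomes eligible for promotion
- Receipt metadata can be attached (if verification was used)

What This Shows

- Improvements must be proven, not assumed
- Promotion is gated by verification
- Failed fixes do not become permanent behavior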

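OpenClaw Setup (Recommended)

OpenClaw is the primary platform for this skill. It uses workspace-based prompt injection with automatic skill loading.

Installation

Via ClawdHub (recommended):
bash
clawdhub install verified-capability-evolver

Manual:
bash
git clone https://github.com/your-org/verified-capability-evolver.git ~/.openclaw/skills/verified-capability-evolver

Workspace Structure

OpenClaw injects these files into every session:

~/.openclaw/workspace/
├── AGENTS.md # Multi-agent workflows, delegation patterns
├── SOUL.md # Behavioral guidelines, personality, principles
├── TOOLS.md # Tool capabilities, integration gotchas
├── MEMORY.md # Long-term memory (main session only)
├── memory/ # Daily memory files
│   └── YYYY-MM-DD.md
└── .learnings/ # This skill's log files
    ├── LEARNINGS.md
    ├── ERRORS.md
    └── FEATURE_REQUESTS.md

Create Learning Files

bash
mkdir -p ~/.openclaw/workspace/.learnings

Then create the log files (or copy from assets/):

- LEARNINGS.md - corrections, knowledge gaps, best practices
- ERRORS.md - command failures, exceptions
- FEATURE_REQUESTS.md - user-requested capabilities

Promotion Targets

When learnings prove broadly applicable, promote them to workspace files:

Learning Type	Promote To	Example
Behavioral patterns	SOUL.md	Be concise, avoid disclaimers
Workflow improvements

Inter-Session Communication

OpenClaw provides tools to share learnings across sessions:

- sessionslist - View active/recent sessions
- sessionshistory - Read another session's transcript
- sessionssend - Send a learning to another session
- sessionsspawn - Spawn a sub-agent for background work

Optional: Enable Hook

For automatic reminders at session start:

bash
# Copy the hook to the OpenClaw hooks directory
cp -r hooks/openclaw ~/.openclaw/hooks/verified-capability-evolver

# Enable it
openclaw hooks enable verified-capability-evolver

See references/openclaw-integration.md for complete details.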

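Generic Setup (Other Agents)

For Claude Code, Codex, Copilot, or other agents, create .learnings/ in your project:

bash
mkdir -p .learnings

Copy templates from assets/ or create files with headers.

Add a reference in your agent files (AGENTS.md, CLAUDE.md, or .github/copilot-instructions.md) to remind yourself to log learnings. This is an alternative to hook-based reminders.

Self-Improvement Workflow

When errors or corrections occur:

1. Log to .learnings/ERRORS.md, LEARNINGS.md, or FEATURE_REQUESTS.md
2. Review and promote broadly applicable learnings to:

- CLAUDE.md - project facts and conventions
- AGENTS.md - workflows and automation
- .github/copilot-instructions.md - Copilot context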

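Logging Format

Learning Entry

Append to .learnings/LEARNINGS.md:

markdown

[LRN-YYYYMMDD-XXX] category

Logged: ISO-8601 timestamp
Priority: low | medium | high | critical
Status: pending
Area: frontend | backend | infra | tests | docs | config

Summary

One-line description of what was learned

Details

Full context: what happened, what was wrong, what is correct

Suggested Action

Specific fix or improvement to make

Metadata

- Source: conversation | error | user feedback | simplify-and-harden
- Related Files: path/to/file.ext
- Tags: tag1, tag2
- See Also: LRN-20250110-001 (if related to an existing entry)
- Pattern-Key: simplify.deadcode | harden.inputvalidation (optional, for recurring-pattern tracking)
- Recurrence-Count: 1 (optional)
- First-Seen: 2025-01-15 (optional)
- Last-Seen: 2025-01-15 (optional)

Error Entry

Append to .learnings/ERRORS.md:

markdown

[ERR-YYYYMMDD-XXX] skill or command name

Logged: ISO-8601 timestamp
Priority: high
Status: pending
Area: frontend | backend | infra | tests | docs | config

Summary

Brief description of what failed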
