Self-Improving Agent
Capture what matters. Ignore noise. Promote proven patterns. Automate all of it.
Source of truth
- `.learnings/LEARNINGS.md`: corrections, env configs, reusable fixes, architecture decisions
- `.learnings/ERRORS.md`: tool/command failures with fixes
- `.learnings/FEATURE_REQUESTS.md`: missing capabilities worth tracking
- `.learnings/ARCHIVE.md`: entries scored out during retention sweeps (never injected into context, but searchable)
Write gate
Before logging anything, the candidate must pass at least ONE filter:
| Filter | Weight | Description |
|---|---|---|
| Correction | ALWAYS | Omar explicitly corrected the agent |
| Recurrence | HIGH | Same issue hit 2+ times (check existing entries) |
| Cost-to-rediscover | HIGH | Would take >2 tool calls to figure out again |
| Blast radius | MEDIUM | Affects multiple skills, projects, or workflows |
| Decay risk | MEDIUM | Non-obvious env/config detail that changes rarely |
If NONE match → do not log. This replaces any arbitrary line-count threshold.
Never log:
- routine successes
- facts obvious from docs or code
- one-off tasks with no recurrence potential
- anything already in MEMORY.md, SOUL.md, USER.md, or AGENTS.md
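The write gate can be sketched as a small predicate. This is a minimal illustration, not the skill's actual implementation: the `Candidate` fields and function names are hypothetical, and how each signal gets measured (for example, estimating tool calls to rediscover) is left to the caller. The "Never log" exclusions are assumed to be checked separately.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical description of something the agent might log."""
    is_user_correction: bool = False        # Correction filter
    prior_occurrences: int = 0              # times the same issue is already on record
    tool_calls_to_rediscover: int = 0       # estimated cost to figure it out again
    affected_areas: int = 1                 # skills, projects, or workflows touched
    is_nonobvious_env_detail: bool = False  # Decay risk filter


def matched_filters(c: Candidate) -> list[str]:
    """Return the names of every write-gate filter the candidate passes."""
    checks = {
        "Correction": c.is_user_correction,
        # "hit 2+ times" counts the current occurrence plus earlier ones
        "Recurrence": c.prior_occurrences + 1 >= 2,
        "Cost-to-rediscover": c.tool_calls_to_rediscover > 2,
        "Blast radius": c.affected_areas > 1,
        "Decay risk": c.is_nonobvious_env_detail,
    }
    return [name for name, passed in checks.items() if passed]


def should_log(c: Candidate) -> bool:
    """Log only if at least one filter matches."""
    return bool(matched_filters(c))
```

Returning the matched filter names, rather than a bare boolean, makes it easy to record why an entry was admitted.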
Entry format
LEARNINGS.md:
```markdown
- [YYYY-MM-DD] [Category]: [actionable lesson]
```
Categories: `Correction`, `Env`, `Workflow`, `Testing`, `Skills`, `Git`, `Architecture`
ERRORS.md:
```markdown
- [YYYY-MM-DD] [Tool]: [failure description] → [fix]
```
Mark fixed items with [fixed]. Delete stale entries during retention sweeps.
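A small formatter and parser keeps LEARNINGS.md entries consistent. The sketch below assumes the line shape `- [YYYY-MM-DD] [Category]: lesson` and the category names listed for LEARNINGS.md; the function names are hypothetical and not part of the skill.

```python
import re
from datetime import date

# Assumed line shape: "- [YYYY-MM-DD] [Category]: lesson text"
ENTRY_RE = re.compile(r"^- \[(\d{4}-\d{2}-\d{2})\] \[([^\]]+)\]: (.+)$")

# Category names as listed for LEARNINGS.md
CATEGORIES = {"Correction", "Env", "Workflow", "Testing", "Skills", "Git", "Architecture"}


def format_entry(day: date, category: str, lesson: str) -> str:
    """Render one LEARNINGS.md line, rejecting unknown categories."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return f"- [{day.isoformat()}] [{category}]: {lesson.strip()}"


def parse_entry(line: str):
    """Return (date, category, lesson), or None if the line is not an entry."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    day, category, lesson = m.groups()
    return date.fromisoformat(day), category, lesson
```

Parsing the date into a `date` object up front makes the age checks in the retention sweep straightforward.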
Retention gate
Instead of a hard line cap, score each entry periodically:
| Signal | Score |
|---|---|
| Referenced or applied in last 30 days | +3 |
| Matches active project context | +2 |
| Direct correction from Omar | +2 |
| Has prevented a repeat error | +3 |
| Env/config still valid | +1 |
| Superseded by newer entry | −5 |
| >90 days old, never referenced | −3 |
Action:
- score ≥ 2 → keep
- 0 ≤ score < 2 → archive to `.learnings/ARCHIVE.md`
- score < 0 → delete
Run this sweep during heartbeat maintenance (every ~3 days) or when LEARNINGS.md feels noisy.
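The scoring table and action thresholds translate directly into code. The following is a sketch under the assumption that each signal has already been determined as a boolean for the entry; the `EntrySignals` field names are hypothetical, while the weights and cut-offs are the ones given above.

```python
from dataclasses import dataclass


@dataclass
class EntrySignals:
    """Hypothetical per-entry flags, one per row of the scoring table."""
    referenced_last_30_days: bool = False
    matches_active_project: bool = False
    direct_correction: bool = False
    prevented_repeat_error: bool = False
    env_config_still_valid: bool = False
    superseded: bool = False
    stale_unreferenced_90_days: bool = False


# Weights copied from the scoring table
WEIGHTS = {
    "referenced_last_30_days": 3,
    "matches_active_project": 2,
    "direct_correction": 2,
    "prevented_repeat_error": 3,
    "env_config_still_valid": 1,
    "superseded": -5,
    "stale_unreferenced_90_days": -3,
}


def retention_score(s: EntrySignals) -> int:
    """Sum the weights of every signal that is set."""
    return sum(w for name, w in WEIGHTS.items() if getattr(s, name))


def retention_action(score: int) -> str:
    """Map a score to keep / archive / delete using the thresholds above."""
    if score >= 2:
        return "keep"
    if score >= 0:
        return "archive"
    return "delete"
```

Note that under these rules an entry with no signals at all scores 0 and is archived rather than deleted.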
Automated triggers
These fire without user prompting:
1. Post-task scan: After multi-step tasks, check for retried commands, error→workaround sequences, or avoidable file reads. If found, evaluate against the write gate and log if it passes.
2. Session-start sweep: On `.learnings/LEARNINGS.md` read, flag entries >90 days old for retention scoring.
3. Promotion detector: After logging, scan for entries with the same `[Category]` tag appearing 3+ times. If found, auto-suggest a one-liner promotion to:
   - behavior/style → `SOUL.md`
   - workflow/process → `AGENTS.md`
   - tool/env gotcha → `TOOLS.md`
4. Cross-session pattern detection: When `memory_search` returns a daily note describing a workaround, check if `.learnings/` already has it. If not and it passes the write gate, log it.
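The promotion detector's counting step might look like the sketch below. It assumes entries follow the `- [YYYY-MM-DD] [Category]: ...` shape and only reports which categories reach the threshold; deciding whether a promotion belongs in SOUL.md, AGENTS.md, or TOOLS.md is left to the agent. The function name is hypothetical.

```python
import re
from collections import Counter

# Assumed entry shape: "- [YYYY-MM-DD] [Category]: lesson text"
CATEGORY_RE = re.compile(r"^- \[\d{4}-\d{2}-\d{2}\] \[([^\]]+)\]:")


def promotion_candidates(lines, threshold=3):
    """Return categories that appear on at least `threshold` entries, sorted."""
    counts = Counter()
    for line in lines:
        m = CATEGORY_RE.match(line.strip())
        if m:  # skip headings, blank lines, and malformed entries
            counts[m.group(1)] += 1
    return sorted(cat for cat, n in counts.items() if n >= threshold)
```

Calling `promotion_candidates(open(path).read().splitlines())` after each write would surface categories that have accumulated enough entries to suggest a one-liner promotion.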
Dedup
Before logging, scan existing entries for near-duplicates. If the lesson already exists, only update it if the new version is sharper or more general.
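One simple way to approximate the near-duplicate scan is fuzzy string matching with the standard library's `difflib`. This is an assumed technique, not necessarily what the skill uses, and the `0.8` threshold is an arbitrary starting point to tune; judging whether the new wording is "sharper or more general" still needs the agent.

```python
from difflib import SequenceMatcher


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def find_near_duplicate(new_lesson, existing_lessons, threshold=0.8):
    """Return the most similar existing lesson at or above `threshold`, else None."""
    best, best_ratio = None, threshold
    target = _normalize(new_lesson)
    for lesson in existing_lessons:
        ratio = SequenceMatcher(None, target, _normalize(lesson)).ratio()
        if ratio >= best_ratio:
            best, best_ratio = lesson, ratio
    return best
```

If a match is returned, the agent would compare the two versions and update the existing entry only when the new one is an improvement, rather than appending a second line.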
Quality bar
Every entry must help a future session avoid wasted work at a glance.