Agent Memory
Design and implement memory systems that let agents survive context window rotation and maintain continuity across sessions.
Core Problem
LLM agents have finite context windows. Memory is lost when:
- Session ends or rotates
- Context is pruned or compacted under pressure
- Summaries replace detailed history (lossy compression)
Durable memory is not a nice-to-have — it is the agent's continuity substrate.
Architecture Patterns
Three dominant architectures for persistent agent memory:
1. CMA — Continuous Memory Architecture
Agent maintains flat/hierarchical markdown files, reads selectively at boot, writes on state change. Best for: operational state, ongoing projects, agent identity.
- ✅ Simple, no infrastructure, version-controlled
- ✅ Human-readable and auditable
- ✅ Works in any OpenClaw deployment
- ❌ No semantic search without an embedder
- ❌ No temporal reasoning (fact validity over time)
This is the default pattern for OpenClaw agents.
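The read-at-boot, write-on-change loop described above can be sketched roughly as follows. This is a minimal illustration, not OpenClaw's implementation: the helper names `read_memory` and `record_fact` are hypothetical, and the demo writes to a temporary directory standing in for the agent workspace.

```python
from datetime import date
from pathlib import Path
import tempfile


def read_memory(root: Path, names: list[str]) -> dict[str, str]:
    """Selectively load only the requested memory files that exist."""
    loaded = {}
    for name in names:
        path = root / name
        if path.is_file():
            loaded[name] = path.read_text(encoding="utf-8")
    return loaded


def record_fact(root: Path, name: str, fact: str, on: date) -> str:
    """Append one dated bullet to a memory file as soon as state changes."""
    path = root / name
    path.parent.mkdir(parents=True, exist_ok=True)
    line = f"- [{on.isoformat()}] {fact}"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return line


# Demo in a throwaway directory (hypothetical workspace root).
demo_root = Path(tempfile.mkdtemp())
written = record_fact(
    demo_root, "memory/OPEN_LOOPS.md", "Draft reply to operator", date(2026, 3, 27)
)
boot_view = read_memory(demo_root, ["memory/OPEN_LOOPS.md", "memory/GOALS.md"])
```

Because the files are plain text, the same directory can be committed to version control and reviewed by a human without extra tooling.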
2. Semantic RAG Memory
Agent embeds facts into a vector store; retrieval uses embedding similarity. OpenClaw's built-in memory uses node-llama-cpp with 768-dim embeddings (all-MiniLM-L6-v2 compatible).
- ✅ "What do I know about X?" queries across large fact sets
- ✅ Better recall than text search for paraphrased queries
- ❌ No temporal validity — stale facts pollute results
- ❌ Requires embedder infrastructure
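To make the retrieval step concrete, here is a toy sketch of similarity search over stored facts. It deliberately swaps the real embedder for a bag-of-words count vector, so it only matches shared words and does not capture paraphrase the way a learned embedding would. The function names are illustrative assumptions, not OpenClaw's API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Stand-in embedder: a bag-of-words count vector, NOT a real model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, facts: list[str], top_k: int = 2) -> list[str]:
    """Rank stored facts by similarity to the query and keep the best few."""
    query_vec = embed(query)
    ranked = sorted(facts, key=lambda fact: cosine(query_vec, embed(fact)), reverse=True)
    return ranked[:top_k]


facts = [
    "telegram channel is enabled",
    "neo4j runs in docker on port 7687",
    "operator prefers short status updates",
]
hits = search("which port does neo4j use", facts, top_k=1)
```

Note that nothing in the ranking knows when a fact was recorded, which is why stale facts can surface alongside current ones.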
3. Temporal KG Memory (Graphiti/Zep pattern)
Agent builds a knowledge graph with valid_at/invalid_at on every fact edge. Graphiti (open source, wraps Neo4j) is the leading implementation.
- ✅ Handles "what was true at time T?" queries correctly
- ✅ Supersedes stale facts without deleting them
- ✅ Entity deduplication across episodes
- ❌ Requires Neo4j + LLM for ingestion (high latency, not real-time)
- ❌ Best used as async batch-ingest, not inline tool
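The validity-interval idea can be illustrated without a graph database. The sketch below is an in-memory approximation under stated assumptions: `FactEdge`, `assert_fact`, and `value_at` are hypothetical names, not Graphiti's API, and entity deduplication is left out entirely.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FactEdge:
    subject: str
    predicate: str
    value: str
    valid_at: date
    invalid_at: Optional[date] = None  # None means still believed true


def assert_fact(edges: list[FactEdge], new: FactEdge) -> None:
    """Close any open edge for the same subject/predicate, then add the new one."""
    for edge in edges:
        same_slot = (edge.subject, edge.predicate) == (new.subject, new.predicate)
        if same_slot and edge.invalid_at is None:
            edge.invalid_at = new.valid_at  # superseded, not deleted
    edges.append(new)


def value_at(edges: list[FactEdge], subject: str, predicate: str, when: date) -> Optional[str]:
    """Answer 'what was true at time T?' from the validity intervals."""
    for edge in edges:
        if (edge.subject, edge.predicate) != (subject, predicate):
            continue
        if edge.valid_at <= when and (edge.invalid_at is None or when < edge.invalid_at):
            return edge.value
    return None


edges: list[FactEdge] = []
assert_fact(edges, FactEdge("telegram", "status", "disabled", date(2026, 3, 20)))
assert_fact(edges, FactEdge("telegram", "status", "enabled", date(2026, 3, 27)))
```

Both edges remain in the list after the update, so a query dated 2026-03-22 still resolves to the older value while a current query resolves to the newer one.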
Recommendation: Use CMA + semantic RAG for all agents. Add temporal KG only for high-value long-horizon use cases (months of state).
See references/memory-architecture.md for detailed comparison and deployment notes.
Memory File Structure (CMA Pattern)
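
    workspace/
    ├── HEARTBEAT.md              # Current pulse state (keep short: under 40 lines)
    ├── memory/
    │   ├── CORE_MEMORY.md        # Identity and continuity anchors
    │   ├── GOALS.md              # Long-horizon goals
    │   ├── OPEN_LOOPS.md         # Unresolved tasks and commitments
    │   ├── WORLD_MODEL.md        # Verified facts about the environment
    │   ├── CAPABILITIES.md       # Verified tools, channels, limits
    │   ├── RUNTIME_REALITY.md    # Live channel/change/config state
    │   └── research/             # Durable research findings
    └── operator-outbox.jsonl     # Async operator messages
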
What Goes Where
| Fact type | File |
|---|---|
| Who I am, values, drives | CORE_MEMORY.md |
| Current open work | OPEN_LOOPS.md |
| Infrastructure/env facts | WORLD_MODEL.md |
| What tools/channels work | CAPABILITIES.md |
| Live config/channel state | RUNTIME_REALITY.md |
| Research findings | memory/research/*.md |
| Current pulse state | HEARTBEAT.md |
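The routing table above could be encoded as a small lookup so that writes land in a predictable file. This is only a sketch: the fact-type keys (`identity`, `open_work`, and so on) are invented labels for the table rows, and the paths assume the memory files live under a `memory/` directory in the workspace.

```python
# Hypothetical fact-type labels mapped to the files named in the table.
ROUTES = {
    "identity": "memory/CORE_MEMORY.md",
    "open_work": "memory/OPEN_LOOPS.md",
    "environment": "memory/WORLD_MODEL.md",
    "capability": "memory/CAPABILITIES.md",
    "runtime": "memory/RUNTIME_REALITY.md",
    "pulse": "HEARTBEAT.md",
}


def target_file(fact_type: str, topic: str = "") -> str:
    """Return the memory file a fact of this type should be written to."""
    if fact_type == "research":
        if not topic:
            raise ValueError("research facts need a topic for the file name")
        return f"memory/research/{topic}.md"
    if fact_type not in ROUTES:
        raise ValueError(f"unknown fact type: {fact_type}")
    return ROUTES[fact_type]


open_work_path = target_file("open_work")
research_path = target_file("research", "graphiti-notes")
```

Rejecting unknown fact types, rather than defaulting to a catch-all file, keeps facts from silently accumulating somewhere the boot routine never reads.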
Temporal Annotation Convention
Add [YYYY-MM-DD] timestamps to facts in memory files. Mark superseded facts explicitly:
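
    - [2026-03-27] Telegram: enabled, account Morrow Operator Bot
    - ~~[2026-03-20] Telegram: disabled~~ superseded 2026-03-27
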
This is lightweight temporal KG discipline without a full graph backend. See references/temporal-discipline.md.
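One way to apply the convention mechanically is a helper that strikes through the live entry and appends its replacement. The sketch below assumes entries are single bullets of the form `- [YYYY-MM-DD] text` and that a fact is identified by a leading key such as `Telegram:`; both are assumptions for illustration, and the function name `supersede` is hypothetical.

```python
import re
from datetime import date

# Matches a live (not struck-through) dated bullet.
ENTRY = re.compile(r"^- \[(\d{4}-\d{2}-\d{2})\] (.+)$")


def supersede(lines: list[str], key: str, new_fact: str, on: date) -> list[str]:
    """Strike through live entries starting with `key`, then append the new fact."""
    stamp = on.isoformat()
    updated = []
    for line in lines:
        match = ENTRY.match(line)
        if match and match.group(2).startswith(key):
            # Keep the old fact visible but mark it as no longer current.
            updated.append(f"- ~~[{match.group(1)}] {match.group(2)}~~ superseded {stamp}")
        else:
            updated.append(line)
    updated.append(f"- [{stamp}] {new_fact}")
    return updated


history = ["- [2026-03-20] Telegram: disabled"]
history = supersede(history, "Telegram:", "Telegram: enabled", date(2026, 3, 27))
```

Entries that are already struck through begin with `- ~~[` and so do not match the pattern, which means repeated updates never re-mark an old fact.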
Boot Routine
At every session start, an agent should:
1. Read HEARTBEAT.md (injected or explicit)
2. Check operator inbox for new instructions
3. For infrastructure/channel questions: read RUNTIME_REALITY.md (not older prose)
4. For open work: read OPEN_LOOPS.md
5. For nontrivial tasks: read CORE_MEMORY.md, GOALS.md
Never trust session transcript alone for state that should be in memory. Transcripts get compacted.
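The file-selection part of the boot routine might look like the sketch below. The task-kind labels are hypothetical, the paths assume a `memory/` directory in the workspace, and the operator inbox check is omitted because its location is not specified here.

```python
def boot_reading_list(task_kind: str) -> list[str]:
    """Return the files to read at session start, in order, for a given task kind."""
    files = ["HEARTBEAT.md"]  # always read first
    if task_kind == "infrastructure":
        files.append("memory/RUNTIME_REALITY.md")
    elif task_kind == "open_work":
        files.append("memory/OPEN_LOOPS.md")
    elif task_kind == "nontrivial":
        files += ["memory/CORE_MEMORY.md", "memory/GOALS.md"]
    return files


infra_plan = boot_reading_list("infrastructure")
deep_plan = boot_reading_list("nontrivial")
```

Keeping the list short and conditional is the point: the agent spends context only on the files the current task actually needs.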
Compression Defense
OpenClaw's lossless-claw plugin (or similar LCM) compacts older session history. Defend against lossy compression:
1. Write before you forget. Externalize important facts immediately, not at the end of a session.
2. Keep HEARTBEAT.md short. Long heartbeats get truncated first.
3. Use lcm_grep and lcm_expand_query to retrieve compacted history before answering questions about prior work.
4. Separate observation from inference. Memory files should state facts with source and date, not just conclusions.
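Two of these habits are easy to check in code: keeping the heartbeat within a line budget, and recording an observation together with its source and date. The sketch below assumes a 40-line budget and a `(source: ...)` suffix; both the helper names and the suffix format are illustrative assumptions rather than an established convention.

```python
from datetime import date

MAX_HEARTBEAT_LINES = 40  # assumed budget for HEARTBEAT.md


def heartbeat_overflow(text: str, limit: int = MAX_HEARTBEAT_LINES) -> list[str]:
    """Return the lines beyond the budget so they can be moved to a memory file."""
    return text.splitlines()[limit:]


def format_observation(fact: str, source: str, on: date) -> str:
    """Render an observed fact with its date and where it was observed."""
    return f"- [{on.isoformat()}] {fact} (source: {source})"


long_heartbeat = "\n".join(f"line {i}" for i in range(42))
spill = heartbeat_overflow(long_heartbeat)
entry = format_observation("Telegram: enabled", "channel status check", date(2026, 3, 27))
```

Anything returned by `heartbeat_overflow` is a candidate for relocation into a longer-lived file before compaction removes it.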
Semantic Memory (OpenClaw Built-In)
If OpenClaw's local semantic memory is active:
- memory_search(query): semantic search across all memory files
- memory_get(path, from, lines): safe snippet read
Use memory_search before reading memory files directly. It's faster, scoped, and context-efficient.
To verify semantic memory is active: check for memory_search in your tool surface. If absent, memory files must be read explicitly.
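The check-then-fall-back logic can be sketched as below. How an agent actually inspects its tool surface depends on the runtime, so the `tools` dictionary, the `read_file` callable, and the assumed return shape of `memory_search` (an iterable of strings) are all hypothetical stand-ins.

```python
from typing import Callable


def recall(
    query: str,
    tools: dict[str, Callable],
    read_file: Callable[[str], str],
    fallback_files: list[str],
) -> list[str]:
    """Prefer memory_search when the tool is exposed; otherwise read files explicitly."""
    search = tools.get("memory_search")
    if search is not None:
        return list(search(query))
    # No semantic memory: fall back to a plain substring scan of the named files.
    hits = []
    for name in fallback_files:
        for line in read_file(name).splitlines():
            if query.lower() in line.lower():
                hits.append(line)
    return hits


# In-memory stand-ins for the workspace and the tool surface.
files = {"memory/CAPABILITIES.md": "- [2026-03-27] Telegram: enabled\n- [2026-03-27] Email: untested"}
with_tool = recall("telegram", {"memory_search": lambda q: [f"hit for {q}"]}, files.__getitem__, list(files))
without_tool = recall("telegram", {}, files.__getitem__, list(files))
```

The fallback path reads whole files, so it costs more context than a scoped search; that trade-off is the reason to prefer the tool when it is present.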
Graphiti Quick Setup
For temporal KG memory (advanced use):
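
    # 1. Install
    pip install graphiti-core --user --break-system-packages
    # 2. Neo4j (persistent)
    docker run -d --name neo4j \
      --restart=unless-stopped \
      -p 7687:7687 -p 7474:7474 \
      -v neo4j-data:/data \
      -e NEO4J_AUTH=neo4j/yourpassword \
      neo4j:5.26
    # 3. Configure to use OpenClaw /v1 as the LLM + embedder backend
    # See references/memory-architecture.md for the OpenClawLLMClient patch
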
Important: Graphiti's add_episode requires 5-10 LLM calls per episode. Use it via cron/batch job, not inline during agent pulses.
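One way to keep ingestion out of the pulse path is to append episodes to a local queue inline and drain it from a scheduled job. The sketch below assumes a JSONL queue file and a caller-supplied `ingest` callable; in a real job that callable would wrap Graphiti's `add_episode`, which is intentionally not called here. The file name and record fields are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def enqueue_episode(queue: Path, name: str, body: str, reference_time: str) -> None:
    """Cheap inline step: append the episode to a JSONL queue, no LLM calls."""
    record = {"name": name, "body": body, "reference_time": reference_time}
    with queue.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def drain_queue(queue: Path, ingest: Callable[[dict], None]) -> int:
    """Batch step (e.g. from cron): ingest every queued episode, then clear the queue."""
    if not queue.exists():
        return 0
    lines = queue.read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line]
    for record in records:
        ingest(record)  # placeholder for the slow, LLM-backed ingestion call
    queue.write_text("", encoding="utf-8")
    return len(records)


queue_path = Path(tempfile.mkdtemp()) / "episodes.jsonl"
enqueue_episode(queue_path, "pulse-1", "Telegram channel enabled", "2026-03-27T00:00:00Z")
enqueue_episode(queue_path, "pulse-2", "Neo4j container restarted", "2026-03-27T01:00:00Z")
ingested: list[dict] = []
count = drain_queue(queue_path, ingested.append)
```

The queue is cleared only after every record has been ingested, so a failure partway through leaves the file intact and the next run retries the whole batch; a production job would likely want per-record acknowledgement instead.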