Memory Pipeline

Give your AI agent a memory that actually works.

AI agents wake up blank every session. Memory Pipeline fixes that — it extracts what matters from past conversations, connects the dots, and generates a daily briefing so your agent starts each session primed instead of clueless.

What It Does

| Component | When it runs | What it does |
| --- | --- | --- |
| Extract | Between sessions | Pulls structured facts (decisions, preferences, learnings) from daily notes and transcripts |
| Link | | |

Why This Is Different

Most "memory" solutions are just vector search over chat logs. This is a cognitive architecture — inspired by how human memory actually works:

- Extraction over accumulation. Instead of dumping everything into a database, it identifies what's worth remembering: decisions, preferences, learnings, commitments. The rest is noise.
- Knowledge graph, not just embeddings. Facts get linked to each other with bidirectional relationships. Your agent doesn't just find similar text; it understands that a decision about your tech stack relates to a project deadline, which in turn relates to a preference you stated three weeks ago.
- Briefing over retrieval. Rather than hoping the right context gets retrieved at query time, your agent starts every session with a curated cheat sheet: active projects, recent decisions, personality reminders. Zero cold-start lag.
- No mid-swing coaching. Borrowed from performance psychology: corrections happen between sessions, not during. The after-action review feeds into the next briefing. The loop is closed, just not mid-execution.

Quick Start

Install

```bash
clawdhub install memory-pipeline
```

Setup

```bash
bash skills/memory-pipeline/scripts/setup.sh
```

The setup script will detect your workspace, check dependencies (Python 3 + any LLM API key), create the memory/ directory, and run the full pipeline.

Requirements

- Python 3
- At least one LLM API key (auto-detected):
  - OpenAI (OPENAI_API_KEY or ~/.config/openai/api_key)
  - Anthropic (ANTHROPIC_API_KEY or ~/.config/anthropic/api_key)
  - Gemini (GEMINI_API_KEY or ~/.config/gemini/api_key)
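
The auto-detection described above could look roughly like the following. This is a minimal sketch, not the skill's actual code: the helper name `detect_api_key`, the provider order, and the rule that an environment variable wins over a key file are all assumptions. Only the variable names and file paths come from the list above.

```python
import os
from pathlib import Path

# Environment variables and fallback key files, as listed in the requirements.
PROVIDERS = {
    "openai": ("OPENAI_API_KEY", "~/.config/openai/api_key"),
    "anthropic": ("ANTHROPIC_API_KEY", "~/.config/anthropic/api_key"),
    "gemini": ("GEMINI_API_KEY", "~/.config/gemini/api_key"),
}


def detect_api_key(env=None, home=None):
    """Return (provider, key) for the first provider with a key, else None.

    Hypothetical helper: lookup order and precedence are assumptions.
    `home` overrides the home directory, which keeps the function testable.
    """
    env = os.environ if env is None else env
    for provider, (var, path) in PROVIDERS.items():
        # 1. Environment variable
        key = env.get(var, "").strip()
        if key:
            return provider, key
        # 2. Key file under the (possibly overridden) home directory
        if home is None:
            key_file = Path(path).expanduser()
        else:
            key_file = Path(home) / path[2:]  # drop the leading "~/"
        if key_file.is_file():
            key = key_file.read_text(encoding="utf-8").strip()
            if key:
                return provider, key
    return None
```

Calling `detect_api_key()` with no arguments checks the real environment and home directory; a `None` result would mean no supported key was found.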

Run Manually

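```bash
# Full pipeline
python3 skills/memory-pipeline/scripts/memory-extract.py
python3 skills/memory-pipeline/scripts/memory-link.py
python3 skills/memory-pipeline/scripts/memory-briefing.py
```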

Automate via Heartbeat

Add to your HEARTBEAT.md for daily automatic runs:

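```markdown
## Daily Memory Pipeline

- Frequency: Once per day (morning)
- Action: Run the memory pipeline:
  1. python3 skills/memory-pipeline/scripts/memory-extract.py
  2. python3 skills/memory-pipeline/scripts/memory-link.py
  3. python3 skills/memory-pipeline/scripts/memory-briefing.py
```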

Import External Knowledge

Already have years of conversations in ChatGPT? Import them so your agent knows what you know.

ChatGPT Export

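```bash
# 1. Export from ChatGPT: Settings → Data Controls → Export Data
# 2. Drop the zip file in your workspace
# 3. Run:
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip

# Preview first (recommended):
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip --dry-run
```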

What it does:

- Parses ChatGPT's conversation tree format
- Filters out throwaway conversations (configurable: --min-turns, --min-length)
- Supports topic exclusion (edit EXCLUDE_PATTERNS to skip unwanted topics)
- Outputs clean, dated markdown files to memory/knowledge/chatgpt/
- Files are automatically indexed by OpenClaw's semantic search

Options:

- --dry-run: Preview without writing files
- --keep-all: Skip all filtering
- --min-turns N: Minimum user messages to keep (default: 2)
- --min-length N: Minimum total characters (default: 200)
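
To make the filtering options concrete, here is a small sketch of how such a filter might behave. The function name `keep_conversation` and the `(role, text)` message shape are assumptions for illustration; the defaults mirror the documented --min-turns and --min-length values.

```python
def keep_conversation(messages, min_turns=2, min_length=200, keep_all=False):
    """Decide whether a conversation is worth ingesting.

    `messages` is a list of (role, text) pairs (an assumed shape).
    Mirrors the documented options: --keep-all skips filtering,
    --min-turns counts user messages, --min-length counts total characters.
    """
    if keep_all:
        return True
    user_turns = sum(1 for role, _ in messages if role == "user")
    total_chars = sum(len(text) for _, text in messages)
    return user_turns >= min_turns and total_chars >= min_length
```

Under these assumptions a one-question exchange is dropped by default, while passing `keep_all=True` keeps it regardless of size.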

Adding Other Sources

The pattern is extensible. Create ingest-<source>.py, parse the format, write markdown to memory/knowledge/<source>/. The indexer handles the rest.
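
As a rough sketch of that pattern, the writing half of a hypothetical ingest-<source>.py might look like this. The function name `write_conversation`, the `(role, text)` turn shape, and the file naming scheme are assumptions; only the memory/knowledge/<source>/ destination comes from the text above.

```python
import re
from datetime import date
from pathlib import Path


def write_conversation(root, source, day, title, turns):
    """Write one conversation as dated markdown under memory/knowledge/<source>/.

    Hypothetical helper: the "<date>-<slug>.md" naming is an assumption.
    Returns the path of the file that was written.
    """
    out_dir = Path(root) / "memory" / "knowledge" / source
    out_dir.mkdir(parents=True, exist_ok=True)

    # Build a filesystem-safe slug from the title.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"
    path = out_dir / f"{day.isoformat()}-{slug}.md"

    lines = [f"# {title}", ""]
    for role, text in turns:
        lines += [f"**{role}:** {text}", ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

A source-specific parser would then only need to turn its export format into `(title, day, turns)` tuples and call this function once per conversation.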

How the Pipeline Works

Stage 1: Extract

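Script: memory-extract.py

Reads daily notes (memory/YYYY-MM-DD.md) and session transcripts, then uses an LLM to extract structured facts:

```json
{"type": "decision", "content": "Use Rust for the backend", "subject": "project architecture", "confidence": 0.9}
{"type": "preference", "content": "Prefers Google Drive over Notion", "subject": "tools", "confidence": 0.95}
```

Output: memory/extracted.jsonl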
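
Because the output is one JSON object per line, downstream code can read it with the standard library alone. The sketch below assumes each record carries the four fields shown in this document's example records (type, content, subject, confidence); the `Fact` class and `load_facts` helper are hypothetical names, not part of the skill.

```python
import json
from dataclasses import dataclass


@dataclass
class Fact:
    type: str
    content: str
    subject: str
    confidence: float


def load_facts(lines, min_confidence=0.0):
    """Parse JSONL lines into Fact objects, skipping blanks and low-confidence facts."""
    facts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        fact = Fact(
            type=record["type"],
            content=record["content"],
            subject=record["subject"],
            confidence=float(record["confidence"]),
        )
        if fact.confidence >= min_confidence:
            facts.append(fact)
    return facts
```

Since an open file iterates line by line, something like `load_facts(open("memory/extracted.jsonl", encoding="utf-8"))` would be the typical call.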

Stage 2: Link

Script: memory-link.py

Takes extracted facts and builds a knowledge graph:

- Generates embeddings for semantic similarity
- Creates bidirectional links between related facts
- Detects contradictions and marks superseded facts
- Auto-generates domain tags

Output: memory/knowledge-graph.json + memory/knowledge-summary.md
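
The linking step can be pictured as pairwise similarity with a cutoff. The following is only a sketch under stated assumptions: it uses plain cosine similarity over precomputed embedding vectors, treats a score at or above the threshold as a link, and uses the hypothetical names `cosine` and `link_facts`. The 0.3 default matches the threshold mentioned under Customization; the real script may score and store links differently.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def link_facts(embeddings, threshold=0.3):
    """Build bidirectional links between facts whose embeddings are similar.

    `embeddings` maps a fact id to its vector. Returns a dict mapping each
    fact id to a sorted list of linked fact ids.
    """
    links = {fact_id: set() for fact_id in embeddings}
    for a, b in combinations(embeddings, 2):
        if cosine(embeddings[a], embeddings[b]) >= threshold:
            # Record the relationship in both directions.
            links[a].add(b)
            links[b].add(a)
    return {fact_id: sorted(ids) for fact_id, ids in links.items()}
```

Raising the threshold yields fewer, stronger links; lowering it connects more loosely related facts.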

Stage 3: Briefing

Script: memory-briefing.py

Generates a compact daily briefing (< 2000 chars) combining:

- Personality traits (from SOUL.md)
- User context (from USER.md)
- Active projects and recent decisions
- Open todos

Output: BRIEFING.md (workspace root)
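
One simple way to honor the size limit is to add sections in priority order and stop before the budget is exceeded. The sketch below illustrates that idea only; the function name `build_briefing`, the `## ` headings, and the stop-at-first-overflow rule are assumptions, and the actual script may rely on LLM generation instead.

```python
def build_briefing(sections, max_chars=2000):
    """Assemble a briefing that stays strictly under `max_chars`.

    `sections` is a list of (heading, [bullet, ...]) in priority order.
    Whole bullets are added until the next one would reach the limit.
    """
    lines = []
    used = 0  # characters used so far, counting one newline per line
    for heading, bullets in sections:
        for i, bullet in enumerate(bullets):
            # Emit the heading together with the section's first bullet.
            new = ([f"## {heading}"] if i == 0 else []) + [f"- {bullet}"]
            cost = sum(len(line) + 1 for line in new)
            if used + cost >= max_chars:
                return "\n".join(lines)
            lines += new
            used += cost
    return "\n".join(lines)
```

Putting the most important sections first means lower-priority items are the ones dropped when space runs out.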

Performance Hooks (Optional)

Four lifecycle hooks that enforce execution discipline during sessions. Based on a principle from performance psychology: separate preparation from execution.

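```
User Message → Agent Loop
├── before_agent_start → Briefing packet (memory + checklist)
├── before_tool_call → Policy enforcement (deny list)
├── tool_result_persist → Output compression (prevent context bloat)
└── agent_end → After-action review (durable notes)
```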

Configuration

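```json
{
  "enabled": true,
  "briefing": {
    "maxChars": 6000,
    "checklist": [
      "Restate the task in one sentence.",
      "List constraints and success criteria.",
      "Retrieve only the minimum relevant memory.",
      "Prefer tools over guessing when facts matter."
    ],
    "memoryFiles": ["memory/IDENTITY.md", "memory/PROJECTS.md"]
  },
  "tools": {
    "deny": ["dangerous_tool"],
    "maxToolResultChars": 12000
  },
  "afterAction": {
    "writeMemoryFile": "memory/AFTER_ACTION.md",
    "maxBullets": 8
  }
}
```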
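
To illustrate how the tools settings might be applied, here is a hedged sketch of two of the hooks. The function names follow the hook names, and the keys deny and maxToolResultChars are taken from the configuration example in this document. The function bodies, the return conventions, and the truncation marker are assumptions, not the skill's implementation.

```python
CONFIG = {
    "tools": {
        "deny": ["dangerous_tool"],
        "maxToolResultChars": 12000,
    }
}

TRUNCATION_MARKER = "\n[... truncated ...]"  # assumed marker text


def before_tool_call(tool_name, config=CONFIG):
    """Policy enforcement: return True if the tool may run, False if denied."""
    return tool_name not in config["tools"]["deny"]


def tool_result_persist(result, config=CONFIG):
    """Output compression: cap a tool result at maxToolResultChars.

    Long results are cut and end with a marker so the truncation is visible.
    """
    limit = config["tools"]["maxToolResultChars"]
    if len(result) <= limit:
        return result
    return result[: limit - len(TRUNCATION_MARKER)] + TRUNCATION_MARKER
```

Keeping the cap in configuration rather than in code lets the context budget be tuned per workspace.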

Hook Details

| Hook | What it does |
| --- | --- |
| before_agent_start | Loads memory files, builds bounded briefing packet, injects into system prompt |
| before_tool_call | |

Output Files

| File | Location | Purpose |
| --- | --- | --- |
| BRIEFING.md | Workspace root | Daily context cheat sheet |
| extracted.jsonl | memory/ | All extracted facts (append-only) |
| knowledge-graph.json | memory/ | Full graph with embeddings and links |
| knowledge-summary.md | memory/ | Human-readable graph summary |
| knowledge/chatgpt/*.md | memory/ | Ingested ChatGPT conversations |

Customization

- Change LLM models: Edit model names in each script (supports OpenAI, Anthropic, Gemini)
- Adjust extraction: Modify the extraction prompt in memory-extract.py to focus on different fact types
- Tune link sensitivity: Change the similarity threshold in memory-link.py (default: 0.3)
- Filter ingestion: Edit EXCLUDE_PATTERNS in ingest-chatgpt.py for topic exclusion

Troubleshooting

| Problem | Fix |
| --- | --- |
| No facts extracted | Check that daily notes or transcripts exist; verify API key |
| Low-quality links | Add OpenAI key for embedding-based similarity; adjust threshold |
| Briefing too long | Reduce facts in template or let LLM generation handle it (auto-constrained to 2000 chars) |

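Memory Pipeline

Give your AI agent a memory that truly works.

AI agents start every session as a blank slate. Memory Pipeline solves this: it extracts the important information from past conversations, connects the key points, and generates a daily briefing, so your agent begins each session well prepared rather than at a loss.

What It Does

| Component | When it runs | What it does |
| --- | --- | --- |
| Extract | Between sessions | Extracts structured facts (decisions, preferences, learnings) from daily notes and conversation transcripts |
| Link | | |

Why This Is Different

Most memory solutions are just vector search over chat logs. This is a cognitive architecture, inspired by how human memory actually works:

- Extraction over accumulation. Rather than dumping everything into a database, it identifies what is worth remembering: decisions, preferences, learnings, commitments. The rest is noise.
- Knowledge graph, not just embeddings. Facts are connected to each other through bidirectional relationships. Your agent does not just find similar text; it understands that a decision about the tech stack relates to a project deadline, which in turn relates to a preference you expressed three weeks ago.
- Briefing over retrieval. Rather than hoping the right context is retrieved at query time, your agent starts every session with a carefully curated cheat sheet: active projects, recent decisions, personality reminders. Zero cold-start lag.
- No coaching during execution. Borrowed from performance psychology: corrections happen between sessions, not during them. The after-action review feeds into the next briefing. The loop is closed, just not mid-execution.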

Quick Start

Install

```bash
clawdhub install memory-pipeline
```

Setup

```bash
bash skills/memory-pipeline/scripts/setup.sh
```

The setup script detects your workspace, checks dependencies (Python 3 + any LLM API key), creates the memory/ directory, and runs the full pipeline.

Requirements

- Python 3
- At least one LLM API key (auto-detected):
  - OpenAI (OPENAI_API_KEY or ~/.config/openai/api_key)
  - Anthropic (ANTHROPIC_API_KEY or ~/.config/anthropic/api_key)
  - Gemini (GEMINI_API_KEY or ~/.config/gemini/api_key)

Run Manually

```bash
# Full pipeline
python3 skills/memory-pipeline/scripts/memory-extract.py
python3 skills/memory-pipeline/scripts/memory-link.py
python3 skills/memory-pipeline/scripts/memory-briefing.py
```

Automate via Heartbeat

Add this to your HEARTBEAT.md for daily automatic runs:

```markdown
## Daily Memory Pipeline

- Frequency: Once per day (morning)
- Action: Run the memory pipeline:
  1. python3 skills/memory-pipeline/scripts/memory-extract.py
  2. python3 skills/memory-pipeline/scripts/memory-link.py
  3. python3 skills/memory-pipeline/scripts/memory-briefing.py
```

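Import External Knowledge

Already have years of conversation history in ChatGPT? Import it so your agent knows what you know.

ChatGPT Export

```bash
# 1. Export from ChatGPT: Settings → Data Controls → Export Data
# 2. Drop the zip file in your workspace
# 3. Run:
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip

# Preview first (recommended):
python3 skills/memory-pipeline/scripts/ingest-chatgpt.py ~/imports/chatgpt-export.zip --dry-run
```

What it does:

- Parses ChatGPT's conversation tree format
- Filters out throwaway conversations (configurable: --min-turns, --min-length)
- Supports topic exclusion (edit EXCLUDE_PATTERNS to skip unwanted topics)
- Outputs clean, dated markdown files to memory/knowledge/chatgpt/
- Files are automatically indexed by OpenClaw's semantic search

Options:

- --dry-run: Preview without writing files
- --keep-all: Skip all filtering
- --min-turns N: Minimum user messages to keep (default: 2)
- --min-length N: Minimum total characters (default: 200)

Adding Other Sources

The pattern is extensible. Create ingest-<source>.py, parse the format, and write markdown to memory/knowledge/<source>/. The indexer handles the rest.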

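How the Pipeline Works

Stage 1: Extract

Script: memory-extract.py

Reads daily notes (memory/YYYY-MM-DD.md) and session transcripts, then uses an LLM to extract structured facts:

```json
{"type": "decision", "content": "Use Rust for the backend", "subject": "project architecture", "confidence": 0.9}
{"type": "preference", "content": "Prefers Google Drive over Notion", "subject": "tools", "confidence": 0.95}
```

Output: memory/extracted.jsonl

Stage 2: Link

Script: memory-link.py

Takes the extracted facts and builds a knowledge graph:

- Generates embeddings to compute semantic similarity
- Creates bidirectional links between related facts
- Detects contradictions and marks superseded facts
- Auto-generates domain tags

Output: memory/knowledge-graph.json + memory/knowledge-summary.md

Stage 3: Briefing

Script: memory-briefing.py

Generates a compact daily briefing (< 2000 chars) combining:

- Personality traits (from SOUL.md)
- User context (from USER.md)
- Active projects and recent decisions
- Open todos

Output: BRIEFING.md (workspace root)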

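Performance Hooks (Optional)

Four lifecycle hooks that enforce execution discipline during sessions. Based on a principle from performance psychology: separate preparation from execution.

```
User Message → Agent Loop
├── before_agent_start → Briefing packet (memory + checklist)
├── before_tool_call → Policy enforcement (deny list)
├── tool_result_persist → Output compression (prevent context bloat)
└── agent_end → After-action review (durable notes)
```

Configuration

```json
{
  "enabled": true,
  "briefing": {
    "maxChars": 6000,
    "checklist": [
      "Restate the task in one sentence.",
      "List constraints and success criteria.",
      "Retrieve only the minimum relevant memory.",
      "Prefer tools over guessing when facts matter."
    ],
    "memoryFiles": ["memory/IDENTITY.md", "memory/PROJECTS.md"]
  },
  "tools": {
    "deny": ["dangerous_tool"],
    "maxToolResultChars": 12000
  },
  "afterAction": {
    "writeMemoryFile": "memory/AFTER_ACTION.md",
    "maxBullets": 8
  }
}
```

Hook Details

| Hook | What it does |
| --- | --- |
| before_agent_start | Loads memory files, builds a bounded briefing packet, injects it into the system prompt |
| before_tool_call | |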

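Output Files

| File | Location | Purpose |
| --- | --- | --- |
| BRIEFING.md | Workspace root | Daily context cheat sheet |
| extracted.jsonl | memory/ | All extracted facts (append-only) |
| knowledge-graph.json | memory/ | Full graph with embeddings and links |
| knowledge-summary.md | memory/ | Human-readable graph summary |
| knowledge/chatgpt/*.md | memory/ | Imported ChatGPT conversations |

Customization

- Change LLM models: Edit the model names in each script (supports OpenAI, Anthropic, Gemini)
- Adjust extraction: Modify the extraction prompt in memory-extract.py to focus on different fact types
- Tune link sensitivity: Change the similarity threshold in memory-link.py (default: 0.3)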
