Enhanced Memory
Drop-in enhancement for OpenClaw's memory system. It replaces flat vector search with a 4-signal hybrid retrieval pipeline that reached a mean reciprocal rank (MRR) of 0.782, compared with roughly 0.45 for the vector-only baseline.
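For readers unfamiliar with the metric, the following is a minimal sketch of how MRR is conventionally computed. The function name and input shapes are illustrative assumptions and are not taken from the skill's own evaluation code.

```python
def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the first relevant hit per query.

    rankings: one ranked list of result paths per query (hypothetical shape).
    relevant: one set of relevant paths per query, in the same order.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranking, gold in zip(rankings, relevant):
        for rank, item in enumerate(ranking, start=1):
            if item in gold:
                total += 1.0 / rank
                break  # only the first relevant result counts
    return total / len(rankings)
```

A query whose first relevant result is at rank 1 contributes 1.0, at rank 3 it contributes 1/3, and a query with no relevant result contributes 0.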
Setup
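```bash
# Install Ollama and pull the embedding model
ollama pull nomic-embed-text
# Index memory files (run from the workspace root)
python3 skills/enhanced-memory/scripts/embed_memories.py
# Optional: build the cross-reference graph
python3 skills/enhanced-memory/scripts/crossref_memories.py build
```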
Re-run embed_memories.py whenever memory files change significantly.
Scripts
scripts/search_memory.py — Primary Search
Hybrid 4-signal retrieval with automatic adaptation:
```bash
python3 skills/enhanced-memory/scripts/search_memory.py "query" [top_n]
```
Signals fused:
1. Vector similarity (0.4): cosine similarity via nomic-embed-text embeddings
2. Keyword matching (0.25): query term overlap with chunk text
3. Header matching (0.1): query terms in section headers
4. Filepath scoring (0.25): query terms matching file/directory names
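The fusion of these four signals can be sketched as a weighted sum. This is a minimal illustration, assuming each signal is normalised to the range 0 to 1; the helper names `term_overlap` and `fused_score` and the tokenisation rule are hypothetical and may differ from what search_memory.py actually does.

```python
import re

# Weights as listed above; they sum to 1.0.
WEIGHTS = {"vector": 0.4, "keyword": 0.25, "header": 0.1, "filepath": 0.25}


def tokens(text):
    """Lowercase alphanumeric tokens (assumed tokenisation)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_overlap(query, text):
    """Fraction of distinct query terms that also occur in text."""
    terms = tokens(query)
    if not terms:
        return 0.0
    return len(terms & tokens(text)) / len(terms)


def fused_score(signals, weights=WEIGHTS):
    """Weighted sum of the per-signal scores; missing signals count as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in weights.items())


query = "deploy notes"
signals = {
    "vector": 0.8,  # cosine similarity, assumed to be computed elsewhere
    "keyword": term_overlap(query, "Notes on the deploy pipeline"),
    "header": term_overlap(query, "Deploy"),
    "filepath": term_overlap(query, "memory/projects/deploy.md"),
}
score = fused_score(signals)
```

Because the weights sum to 1.0, the fused score stays in the same 0 to 1 range as the individual signals.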
Automatic behaviors:
- Temporal routing: date references ("yesterday", "Feb 8", "last Monday") give matching files a 3x boost
- Adaptive weighting: when keyword overlap is low, the pipeline shifts to 85% vector weight
- Pseudo-relevance feedback (PRF): when the top score is below 0.45, the query is expanded with terms from the initial results and the chunks are re-scored
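The three automatic behaviors can be sketched as follows. This is an illustration under stated assumptions, not the script's actual logic: the `LOW_OVERLAP` cutoff, the proportional rescaling of the non-vector weights, the path-substring date match, and the term-selection rule in `expand_query` are all hypothetical.

```python
import re
from collections import Counter

BASE_WEIGHTS = {"vector": 0.4, "keyword": 0.25, "header": 0.1, "filepath": 0.25}
TEMPORAL_BOOST = 3.0
PRF_THRESHOLD = 0.45
LOW_OVERLAP = 0.2  # hypothetical cutoff for "keyword overlap is low"


def adapt_weights(keyword_overlap, base=BASE_WEIGHTS):
    """Give the vector signal 85% and share the rest proportionally."""
    if keyword_overlap >= LOW_OVERLAP:
        return dict(base)
    rest = sum(w for name, w in base.items() if name != "vector")
    adapted = {n: w * 0.15 / rest for n, w in base.items() if n != "vector"}
    adapted["vector"] = 0.85
    return adapted


def apply_temporal_boost(score, path, date_strings):
    """Multiply the score when the path contains a resolved date string."""
    if any(d in path for d in date_strings):
        return score * TEMPORAL_BOOST
    return score


def needs_prf(top_score):
    """PRF is triggered only when the best initial score is weak."""
    return top_score < PRF_THRESHOLD


def expand_query(query, top_chunks, n_terms=3):
    """Append the most frequent new terms from the initial results."""
    seen = set(re.findall(r"[a-z0-9]+", query.lower()))
    counts = Counter(
        term
        for chunk in top_chunks
        for term in re.findall(r"[a-z0-9]+", chunk.lower())
        if term not in seen and len(term) > 3  # skip short filler words
    )
    extra = [term for term, _ in counts.most_common(n_terms)]
    return f"{query} {' '.join(extra)}" if extra else query
```

After expansion, the new query would be scored through the same fusion step as the original one.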
scripts/enhanced_memory_search.py — JSON-Compatible Search
Same pipeline with JSON output format compatible with OpenClaw's memory_search tool:
```bash
python3 skills/enhanced-memory/scripts/enhanced_memory_search.py --json "query"
```
Returns {results: [{path, startLine, endLine, score, snippet, header}], ...}.
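A caller might consume that output roughly as follows. This is a minimal sketch that assumes the script's stdout has already been captured as a string; `best_hits` is a hypothetical helper, and only the field names listed above are relied on.

```python
import json


def best_hits(raw_json, min_score=0.0):
    """Return (path, startLine, endLine, score) tuples, best score first."""
    payload = json.loads(raw_json)
    hits = [r for r in payload.get("results", []) if r.get("score", 0.0) >= min_score]
    hits.sort(key=lambda r: r["score"], reverse=True)
    return [(r["path"], r["startLine"], r["endLine"], r["score"]) for r in hits]


# Hand-written sample in the documented shape, not real script output.
sample = json.dumps({
    "results": [
        {"path": "memory/a.md", "startLine": 1, "endLine": 9,
         "score": 0.41, "snippet": "...", "header": "A"},
        {"path": "memory/b.md", "startLine": 3, "endLine": 20,
         "score": 0.77, "snippet": "...", "header": "B"},
    ]
})
hits = best_hits(sample, min_score=0.5)
```

Filtering on `score` on the caller's side keeps weak matches out of the agent's context without changing the script.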
scripts/embed_memories.py — Indexing
Chunks all .md files in memory/ plus core workspace files (MEMORY.md, AGENTS.md, etc.) by markdown headers and embeds them:
```bash
python3 skills/enhanced-memory/scripts/embed_memories.py
```
Writes memory/vectors.json. Embeddings are requested in batches of 20, and each chunk is truncated to 2000 characters.
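Header-based chunking with truncation and batching can be sketched as below. This is an assumption-laden illustration: the header regex, the chunk dictionary shape, and the helper names are hypothetical, and the real script may handle cases such as `#` lines inside code fences differently.

```python
import re

MAX_CHARS = 2000   # truncation limit stated above
BATCH_SIZE = 20    # embedding batch size stated above
HEADER_RE = re.compile(r"^#{1,6}\s+(.*)$")


def _emit(chunks, header, lines):
    """Append the accumulated lines as one chunk, skipping empty ones."""
    body = "\n".join(lines).strip()
    if body:
        chunks.append({"header": header, "text": body[:MAX_CHARS]})


def chunk_markdown(text):
    """Split markdown into chunks that each start at a header line."""
    chunks, header, lines = [], "", []
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            _emit(chunks, header, lines)  # close the previous section
            header, lines = match.group(1).strip(), [line]
        else:
            lines.append(line)
    _emit(chunks, header, lines)  # close the final section
    return chunks


def batches(items, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Text that appears before the first header becomes a chunk with an empty header, so no content is dropped.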
scripts/memory_salience.py — Salience Scoring
Surfaces stale/important memory items for heartbeat self-prompting:
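```bash
python3 skills/enhanced-memory/scripts/memory_salience.py           # human-readable prompts
python3 skills/enhanced-memory/scripts/memory_salience.py --json    # programmatic output
python3 skills/enhanced-memory/scripts/memory_salience.py --top 5   # more items
```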
Each item is scored as importance × staleness, taking into account file type (topic > core > daily), size, access frequency, and query gap correlation.
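To make the importance × staleness idea concrete, here is a deliberately simplified sketch. Every constant and formula in it is a hypothetical stand-in: the source states only the ordering topic > core > daily and the list of factors, and query gap correlation is left out entirely.

```python
import math

# Hypothetical values; only the ordering topic > core > daily is from the docs.
TYPE_WEIGHT = {"topic": 1.0, "core": 0.7, "daily": 0.4}


def salience(file_type, size_bytes, days_since_access, access_count):
    """Toy importance x staleness score (assumed formulas, not the script's)."""
    # Larger files count as more important, with diminishing returns.
    importance = TYPE_WEIGHT[file_type] * math.log1p(size_bytes / 1024)
    # Long-untouched, rarely accessed files count as more stale.
    staleness = days_since_access / (1 + access_count)
    return importance * staleness


items = [
    ("memory/topics/deploy.md", salience("topic", 4096, 30, 1)),
    ("memory/2025-02-08.md", salience("daily", 4096, 30, 1)),
]
items.sort(key=lambda pair: pair[1], reverse=True)
```

With equal size and access history, the topic file ranks above the daily file purely because of its type weight.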
scripts/crossref_memories.py — Knowledge Graph
Builds cross-reference links between memory chunks using embedding similarity:
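```bash
python3 skills/enhanced-memory/scripts/crossref_memories.py build         # build the index
python3 skills/enhanced-memory/scripts/crossref_memories.py show <file>   # show references for a file
python3 skills/enhanced-memory/scripts/crossref_memories.py graph         # graph statistics
```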
Uses a file-representative approach (the top 5 chunks per file) to cut the O(n²) all-pairs chunk comparison down to a manageable number. Threshold: 0.75 cosine similarity.
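The file-representative idea can be sketched as follows. The source does not say how the "top 5" chunks are chosen, so this sketch simply takes the first five embeddings per file; that choice, the input shape, and the function names are assumptions.

```python
import math
from itertools import combinations

SIMILARITY_THRESHOLD = 0.75
REPS_PER_FILE = 5


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_links(embeddings_by_file):
    """Link file pairs whose best representative-pair similarity passes the threshold.

    embeddings_by_file: {path: [embedding, ...]} (hypothetical input shape).
    """
    # Assumption: the first REPS_PER_FILE chunks stand in for the whole file.
    reps = {path: vecs[:REPS_PER_FILE] for path, vecs in embeddings_by_file.items()}
    links = []
    for a, b in combinations(sorted(reps), 2):
        best = max((cosine(u, v) for u in reps[a] for v in reps[b]), default=0.0)
        if best >= SIMILARITY_THRESHOLD:
            links.append((a, b, best))
    return links
```

Each file pair costs at most 25 comparisons regardless of how many chunks the files contain, which is where the saving over comparing every chunk with every other chunk comes from.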
Configuration
All tunable constants are at the top of each script. Key parameters:
| Parameter | Default | Script | Purpose |
|---|---|---|---|
| `VECTOR_WEIGHT` | 0.4 | `search_memory.py` | Weight for vector similarity |
| `KEYWORD_WEIGHT` | 0.25 | `search_memory.py` | Weight for keyword overlap |
| `FILEPATH_WEIGHT` | 0.25 | `search_memory.py` | Weight for filepath matching |
| `TEMPORAL_BOOST` | 3.0 | `search_memory.py` | Multiplier for date-matching files |
| `PRF_THRESHOLD` | 0.45 | `search_memory.py` | Score below which PRF activates |
| `SIMILARITY_THRESHOLD` | 0.75 | `crossref_memories.py` | Min similarity for cross-ref links |
| `MODEL` | nomic-embed-text | all | Ollama embedding model |
To use a different embedding model (e.g., mxbai-embed-large), change MODEL in each script and re-run embed_memories.py.
Integration
To replace the default memory search, point your agent's search tool at these scripts. The scripts expect:
- memory/ directory relative to workspace root containing .md files
- memory/vectors.json (created by embed_memories.py)
- Ollama running locally on port 11434
All scripts use only Python stdlib + Ollama HTTP API. No pip dependencies.
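As an illustration of what "stdlib plus Ollama HTTP API" can look like, here is a minimal sketch of an embedding request built with `urllib`. It targets Ollama's `/api/embeddings` endpoint; whether the scripts use this endpoint or another one is an assumption, and the function names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"
MODEL = "nomic-embed-text"


def build_embed_request(text, model=MODEL, url=OLLAMA_URL):
    """Build (but do not send) a JSON POST request for one embedding."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, timeout=30):
    """Send the request to a locally running Ollama and return the vector."""
    with urllib.request.urlopen(build_embed_request(text), timeout=timeout) as resp:
        return json.load(resp)["embedding"]
```

Calling `embed("some text")` requires Ollama to be running on port 11434 with the model already pulled; otherwise `urlopen` raises `urllib.error.URLError`.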