Architecture
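Archive storage lives in ~/archive/ with a tiered structure. See memory-template.md for setup.
```
~/archive/
├── memory.md    # Hot data: recent items, at most 100 lines
├── index.md     # Topic/tag index
├── items/       # Individual archived items
├── projects/    # Items grouped by project
└── history.md   # Search/access log
```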
Quick Reference
| Topic | File |
|---|---|
| What to capture | INLINECODE2 |
| Search patterns | search.md |
| Resurfacing rules | resurface.md |
Core Rules
1. Capture Complete, Not Just Links
When the user sends something to archive:
- Extract full content (not just URL)
- Generate 2-3 line summary
- Identify key quotes/data points
- Ask: "What's this for?" — store the WHY alongside the WHAT
- Assign semantic tags based on content + user history
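The capture checklist above can be sketched as a small record plus a completeness check. This is a minimal illustration, not the skill's actual implementation: `CaptureDraft`, `missing_fields`, and all field names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CaptureDraft:
    """Hypothetical in-memory record built before an item is written to disk."""
    url: str
    full_text: str
    why: str  # the user's answer to "What's this for?"
    summary: str = ""
    key_points: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    archived: date = field(default_factory=date.today)


def missing_fields(draft: CaptureDraft) -> list[str]:
    """Return the names of the capture steps that are still incomplete."""
    problems = []
    if not draft.full_text.strip():
        problems.append("full_text")  # a bare URL is not enough
    if not 2 <= len(draft.summary.strip().splitlines()) <= 3:
        problems.append("summary")  # the rule asks for a 2-3 line summary
    if not draft.key_points:
        problems.append("key_points")
    if not draft.why.strip():
        problems.append("why")
    if not draft.tags:
        problems.append("tags")
    return problems
```

A draft would only be written to the archive once `missing_fields` returns an empty list; an empty `why` is the cue to ask the user what the item is for.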
2. Content Types
| Type | What to extract |
|---|---|
| Article/webpage | Full text, author, date, key quotes |
| Video (YouTube) | Title, creator, duration, timestamps mentioned |
| Tweet/thread | Full text, author, context, media |
| PDF/paper | Title, authors, abstract, cited references |
| Image | Description, source, context given |
| Idea/note | Raw text + timestamp + related items |
3. Storage Structure
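Each archived item is stored as:
```
items/{date}_{short-title}.md

type: article
url: original URL
archived: 2026-02-16
why: pricing strategy research
tags: [pricing, saas, strategy]
project: clawmsg

Summary
...

Key Points
...

Full Content
...
```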
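As a rough sketch of how such a file could be produced, the helpers below build the `items/{date}_{short-title}.md` path and render `key: value` metadata lines followed by Summary, Key Points, and Full Content sections. The function names, the slug rule, and the exact section layout are assumptions made for illustration.

```python
import re
from datetime import date
from pathlib import Path


def slugify(title: str, max_len: int = 40) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)[:max_len].strip("-") or "untitled"


def item_path(root: Path, archived: date, title: str) -> Path:
    """Build items/{date}_{short-title}.md under the archive root."""
    return root / "items" / f"{archived.isoformat()}_{slugify(title)}.md"


def render_item(meta: dict, summary: str, key_points: list[str], full_text: str) -> str:
    """Render metadata lines followed by the three body sections."""
    lines = []
    for key, value in meta.items():
        if isinstance(value, list):
            value = "[" + ", ".join(str(v) for v in value) + "]"
        lines.append(f"{key}: {value}")
    lines += ["", "Summary", summary, "", "Key Points"]
    lines += [f"- {point}" for point in key_points]
    lines += ["", "Full Content", full_text]
    return "\n".join(lines) + "\n"
```

Writing the file is then `item_path(...).write_text(render_item(...), encoding="utf-8")` once the parent directory exists.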
4. Semantic Search
The user can ask naturally:
- "What did I save about X?" → search by concept
- "That article about pricing from last month" → fuzzy time + topic
- "Everything for project Y" → project filter
- "Papers by author Z" → metadata search
NEVER require exact keywords. Match by meaning.
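To make the "fuzzy time + topic" case concrete, here is a minimal sketch that resolves "last month" to a date range and matches a topic against tags through a small related-terms map. The `RELATED` map is a hand-written stand-in for matching by meaning (a fuller implementation would more likely use embeddings), and the item fields are assumed.

```python
from datetime import date, timedelta

# Hypothetical related-terms map standing in for real semantic matching.
RELATED = {"pricing": {"pricing", "price", "monetization"}}


def last_month_range(today: date) -> tuple[date, date]:
    """Return the first and last day of the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    last_of_previous = first_of_this_month - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous


def search(items: list[dict], topic: str, start: date, end: date) -> list[dict]:
    """Keep items archived within [start, end] whose tags relate to the topic."""
    wanted = RELATED.get(topic, {topic})
    return [
        item for item in items
        if start <= item["archived"] <= end and wanted & set(item["tags"])
    ]
```

With this, "that article about pricing from last month" becomes `search(items, "pricing", *last_month_range(date.today()))`, and an item tagged only `monetization` still matches.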
5. Proactive Resurfacing
When the user works on a topic:
- Check if archived items relate
- Surface ONLY if genuinely relevant (max 1-2 per session)
- Include context: "You saved this 3 months ago when researching X"
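One way to enforce the per-session cap and attach the context line is sketched below. It assumes some upstream scorer has already assigned each item a relevance between 0 and 1; the threshold value, the function names, and the item fields are all hypothetical.

```python
from datetime import date

MAX_PER_SESSION = 2  # upper bound from the "max 1-2 per session" rule


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def resurface(scored_items, today, threshold=0.75, already_shown=0):
    """Pick the most relevant items above the threshold, within the session budget.

    `scored_items` is a list of (relevance, item) pairs.
    """
    budget = max(0, MAX_PER_SESSION - already_shown)
    relevant = sorted(
        (pair for pair in scored_items if pair[0] >= threshold),
        key=lambda pair: pair[0],
        reverse=True,
    )
    messages = []
    for _, item in relevant[:budget]:
        months = months_between(item["archived"], today)
        when = "this month" if months == 0 else f"{months} month{'s' if months != 1 else ''} ago"
        messages.append(f'You saved "{item["title"]}" {when} when researching {item["why"]}')
    return messages
```

Tracking `already_shown` across calls keeps the cap per session rather than per query.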
6. Never Delete Without Asking
- Old items → mark as "possibly outdated", don't delete
- Duplicates → merge, keep both URLs
- Project closed → archive to cold storage, don't remove
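These three rules are all non-destructive transformations, which the sketch below illustrates on plain dictionaries. The `status`, `urls`, and `tier` fields are assumed names, not part of any documented schema.

```python
def mark_possibly_outdated(item: dict) -> dict:
    """Flag an old item instead of deleting it; its content is left untouched."""
    return {**item, "status": "possibly outdated"}


def merge_duplicates(a: dict, b: dict) -> dict:
    """Merge two records of the same content, keeping every distinct URL."""
    urls = list(dict.fromkeys(a["urls"] + b["urls"]))  # order-preserving de-duplication
    tags = sorted(set(a["tags"]) | set(b["tags"]))
    return {**a, "urls": urls, "tags": tags}


def move_to_cold_storage(item: dict) -> dict:
    """Mark a closed project's item as cold rather than removing it."""
    return {**item, "tier": "cold"}
```

Each function returns a new record, so the caller can still show the user the before and after states and ask for confirmation before anything is overwritten.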
7. Differentiation from Other Skills
| This skill | What it does | NOT this |
|---|---|---|
| archive | Preserves external content as snapshots | memory (agent context) |
| archive | Captures full content for permanence | bookmark (just URLs) |
| archive | Stores raw material | second-brain (processed knowledge) |
| archive | Immutable snapshots | pkm (evolving notes) |
Scope
This skill ONLY:
- Stores content user explicitly sends to archive
- Searches within archived content
- Surfaces related items when contextually relevant
This skill NEVER:
- Monitors or observes without explicit request
- Deletes content without confirmation
- Modifies original archived content
- Accesses external services without user action
Data Storage
All data in ~/archive/. Create on first use:
```bash
mkdir -p ~/archive/items ~/archive/projects
```
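Architecture
Archive storage lives in ~/archive/ with a tiered structure. See memory-template.md for setup.
```
~/archive/
├── memory.md    # Hot data: recent items, at most 100 lines
├── index.md     # Topic/tag index
├── items/       # Individual archived items
├── projects/    # Items grouped by project
└── history.md   # Search/access log
```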
Quick Reference
| Topic | File |
|---|---|
| Search patterns | search.md |
| Resurfacing rules | resurface.md |
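Core Rules
1. Capture Complete, Not Just Links
When the user sends something to archive:
- Extract full content (not just URL)
- Generate 2-3 line summary
- Identify key quotes/data points
- Ask: "What's this for?" and store the WHY alongside the WHAT
- Assign semantic tags based on content + user history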
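2. Content Types
| Type | What to extract |
|---|---|
| Article/webpage | Full text, author, date, key quotes |
| Video (YouTube) | Title, creator, duration, timestamps mentioned |
| Tweet/thread | Full text, author, context, media |
| PDF/paper | Title, authors, abstract, cited references |
| Image | Description, source, context given |
| Idea/note | Raw text + timestamp + related items |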
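3. Storage Structure
Each archived item is stored as:
```
items/{date}_{short-title}.md

type: article
url: original URL
archived: 2026-02-16
why: pricing strategy research
tags: [pricing, saas, strategy]
project: clawmsg

Summary
...

Key Points
...

Full Content
...
```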
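4. Semantic Search
The user can ask naturally:
- "What did I save about X?" → search by concept
- "That article about pricing from last month" → fuzzy time + topic
- "Everything for project Y" → project filter
- "Papers by author Z" → metadata search
NEVER require exact keywords. Match by meaning.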
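5. Proactive Resurfacing
When the user works on a topic:
- Check if archived items relate
- Surface ONLY if genuinely relevant (max 1-2 per session)
- Include context: "You saved this 3 months ago when researching X"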
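6. Never Delete Without Asking
- Old items → mark as "possibly outdated", don't delete
- Duplicates → merge, keep both URLs
- Project closed → archive to cold storage, don't remove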
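7. Differentiation from Other Skills
| This skill | What it does | NOT this |
|---|---|---|
| archive | Preserves external content as snapshots | memory (agent context) |
| archive | Captures full content for permanence | bookmark (just URLs) |
| archive | Stores raw material | second-brain (processed knowledge) |
| archive | Immutable snapshots | pkm (evolving notes) |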
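Scope
This skill ONLY:
- Stores content user explicitly sends to archive
- Searches within archived content
- Surfaces related items when contextually relevant
This skill NEVER:
- Monitors or observes without explicit request
- Deletes content without confirmation
- Modifies original archived content
- Accesses external services without user action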
Data Storage
All data in ~/archive/. Create on first use:
```bash
mkdir -p ~/archive/items ~/archive/projects
```