Archive

Architecture

Archive storage lives in ~/archive/ with a tiered structure. See memory-template.md for setup.

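```
~/archive/
├── memory.md     # Hot data: recent items, up to 100 lines
├── index.md      # Topic/tag index
├── items/        # Individual archived items
├── projects/     # Grouped by project
└── history.md    # Search/access log
```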

Quick Reference

| Topic | File |
|-------|------|
| What to capture | capture.md |
| Search patterns | search.md |
| Resurfacing rules | resurface.md |

Core Rules

1. Capture Complete, Not Just Links

When the user sends something to archive:

- Extract the full content (not just the URL)
- Generate a 2-3 line summary
- Identify key quotes and data points
- Ask "What's this for?" and store the WHY alongside the WHAT
- Assign semantic tags based on content + user history
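
The capture steps above could be modeled as a single record that refuses to be saved without its full content or its reason. This is a minimal sketch, not the skill's actual implementation: `ArchiveItem`, `make_item`, and the field names are assumptions, and the summary and tags are passed in rather than generated.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ArchiveItem:
    """One captured item; the field names are hypothetical."""
    url: str
    full_content: str   # the extracted text, not just the link
    summary: str        # 2-3 lines
    why: str            # the user's answer to "What's this for?"
    key_points: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    archived: date = field(default_factory=date.today)


def make_item(url, full_content, summary, why, key_points=(), tags=()):
    """Build an item, rejecting captures that are only a link or lack a reason."""
    if not full_content.strip():
        raise ValueError("capture the full content, not just the URL")
    if not why.strip():
        raise ValueError("ask what this is for before archiving")
    summary_lines = [ln for ln in summary.splitlines() if ln.strip()]
    if len(summary_lines) > 3:
        raise ValueError("summary should be 2-3 lines")
    # Normalize tags: lowercase, de-duplicated, original order preserved.
    unique_tags = dict.fromkeys(t.strip().lower() for t in tags if t.strip())
    return ArchiveItem(url, full_content, "\n".join(summary_lines), why.strip(),
                       list(key_points), list(unique_tags))
```

Raising on a missing `why` is one possible design choice: it turns "store the WHY alongside the WHAT" into something the code enforces rather than a convention.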

2. Content Types
| Type | What to extract |
|------|-----------------|
| Article/webpage | Full text, author, date, key quotes |
| Video (YouTube) | Title, creator, duration, timestamps mentioned |

3. Storage Structure

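Each archived item is stored as:

```
items/{date}_{short-title}.md

type: article
url: original URL
archived: 2026-02-16
why: pricing strategy research
tags: [pricing, saas, strategy]
project: clawmsg

Summary

...

Key Points

...

Full Content

...
```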
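
One way to write such a file (named by date and short title under items/, with metadata fields followed by summary, key points, and full content) is sketched below. The `slugify` helper, the `---` delimiters around the metadata, and the `#` section headings are assumptions made for illustration; the function names are hypothetical.

```python
import re
from datetime import date
from pathlib import Path


def slugify(title: str, max_len: int = 40) -> str:
    """Lowercase, hyphen-separated short title (hypothetical helper)."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "untitled"


def render_item(meta: dict, summary: str, key_points: list[str], full_content: str) -> str:
    """Render the metadata plus the three sections as one text document."""
    tags = ", ".join(meta.get("tags", []))
    header = [
        "---",  # assumed delimiter, not confirmed by the source
        f"type: {meta['type']}",
        f"url: {meta['url']}",
        f"archived: {meta['archived'].isoformat()}",
        f"why: {meta['why']}",
        f"tags: [{tags}]",
        f"project: {meta.get('project', '')}",
        "---",
    ]
    points = [f"- {p}" for p in key_points]
    body = ["", "# Summary", summary, "", "# Key Points", *points,
            "", "# Full Content", full_content, ""]
    return "\n".join(header + body)


def write_item(root: Path, title: str, meta: dict, summary: str,
               key_points: list[str], full_content: str) -> Path:
    """Write items/{date}_{slug}.md under root, refusing to overwrite."""
    path = root / "items" / f"{meta['archived'].isoformat()}_{slugify(title)}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" fails if the file already exists, so archived content is never overwritten.
    with path.open("x", encoding="utf-8") as fh:
        fh.write(render_item(meta, summary, key_points, full_content))
    return path
```

Opening with mode `"x"` is a deliberate choice here: a second capture with the same date and title raises `FileExistsError` instead of silently replacing the original snapshot.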

4. Semantic Search

The user can ask naturally:

- "What did I save about X?" → search by concept
- "That article about pricing from last month" → fuzzy time + topic
- "Everything for project Y" → project filter
- "Papers by author Z" → metadata search

NEVER require exact keywords. Match by meaning.
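
Matching by meaning would normally rely on an embedding model or a similar component, which a short example cannot show. The sketch below covers only the structured half of a query such as "that article about pricing from last month": it resolves "last month" to a date range and filters by tag or project. The `Item` tuple, `last_month_range`, and `find` are hypothetical names, not part of the skill.

```python
from datetime import date, timedelta
from typing import NamedTuple, Optional


class Item(NamedTuple):
    title: str
    archived: date
    tags: frozenset
    project: Optional[str] = None


def last_month_range(today: date) -> tuple[date, date]:
    """Return the first and last day of the calendar month before `today`."""
    first_of_this_month = today.replace(day=1)
    last_of_prev = first_of_this_month - timedelta(days=1)
    return last_of_prev.replace(day=1), last_of_prev


def find(items, *, tag=None, project=None, between=None):
    """Filter by any combination of tag, project, and an inclusive date range."""
    results = []
    for item in items:
        if tag is not None and tag.lower() not in item.tags:
            continue
        if project is not None and item.project != project:
            continue
        if between is not None and not (between[0] <= item.archived <= between[1]):
            continue
        results.append(item)
    # Newest first, so the most recent save is offered first.
    return sorted(results, key=lambda i: i.archived, reverse=True)
```

A semantic layer would sit in front of this: it would map the user's phrasing to candidate tags, projects, and time ranges, then pass them to a filter like `find`.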

5. Proactive Resurfacing

When the user works on a topic:

- Check if archived items relate
- Surface ONLY if genuinely relevant (max 1-2 per session)
- Include context: "You saved this 3 months ago when researching X"
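
The resurfacing rules above can be sketched as a small ranking step with a hard cap. Tag overlap stands in for "genuinely relevant" here, which is a simplifying assumption; `resurface`, `months_between`, the `min_overlap` threshold, and the dict keys are all hypothetical.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later` (never negative)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1
    return max(months, 0)


def resurface(current_tags, items, today, min_overlap=2, limit=2):
    """Pick at most `limit` items sharing at least `min_overlap` tags with the current topic.

    `items` is an iterable of dicts with "title", "tags", "archived", and "why".
    """
    current = {t.lower() for t in current_tags}
    scored = []
    for item in items:
        overlap = len(current & {t.lower() for t in item["tags"]})
        if overlap >= min_overlap:
            scored.append((overlap, item))
    # Strongest overlap first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    notes = []
    for _, item in scored[:limit]:
        age = months_between(item["archived"], today)
        if age == 0:
            when = "earlier this month"
        else:
            when = f"{age} month{'s' if age != 1 else ''} ago"
        notes.append(f'{item["title"]}: you saved this {when} when researching {item["why"]}')
    return notes
```

The `limit` default of 2 mirrors the "max 1-2 per session" rule; a caller tracking a session would also need to remember what it has already surfaced, which this sketch leaves out.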

6. Never Delete Without Asking

- Old items → mark as "possibly outdated", don't delete
- Duplicates → merge, keep both URLs
- Project closed → archive to cold storage, don't remove
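
A minimal sketch of these non-destructive operations follows, assuming items are held as plain dicts with `urls` and `tags` lists. The function names, the `status` field, and the `confirmed` flag are illustrative assumptions rather than the skill's actual interface.

```python
def mark_outdated(item: dict) -> dict:
    """Return a copy flagged as possibly outdated; nothing is removed."""
    return {**item, "status": "possibly outdated"}


def merge_duplicates(primary: dict, duplicate: dict) -> dict:
    """Merge two records of the same content, keeping every URL and tag."""
    # dict.fromkeys drops repeats while preserving first-seen order.
    urls = list(dict.fromkeys(primary["urls"] + duplicate["urls"]))
    tags = list(dict.fromkeys(primary["tags"] + duplicate["tags"]))
    return {**primary, "urls": urls, "tags": tags}


def delete_item(items: dict, key: str, confirmed: bool = False) -> bool:
    """Remove an item only when the user has explicitly confirmed."""
    if not confirmed:
        return False  # the caller should ask the user first
    return items.pop(key, None) is not None
```

Both `mark_outdated` and `merge_duplicates` return new dicts and leave their inputs untouched, which keeps the original records available if the user disagrees with the change.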

7. Differentiation from Other Skills
| This skill | What it does | NOT this |
|------------|--------------|----------|
| archive | Preserves external content as snapshots | memory (agent context) |
| archive | Captures full content for permanence | bookmark (just URLs) |

Scope

This skill ONLY:

- Stores content the user explicitly sends to archive
- Searches within archived content
- Surfaces related items when contextually relevant

This skill NEVER:

- Monitors or observes without explicit request
- Deletes content without confirmation
- Modifies original archived content
- Accesses external services without user action

Data Storage

All data is stored in ~/archive/. Create it on first use:

```bash
mkdir -p ~/archive/items ~/archive/projects
```
