Installation
Option 1: ClawHub CLI (recommended)
CODEBLOCK0
Option 2: From GitHub
CODEBLOCK1
Archive Project Skill
Organize a completed project into a complete, long-term searchable archive.
Data Privacy: Archived data (session transcripts, project files) never leaves the internal workspace unless you explicitly approve a publish step. The sanitize script is applied automatically before any archival.
Trigger Conditions
Archiving runs only when you explicitly ask for it: say "archive this" or "can we archive this", or use the slash command below. You always decide when a project is done.
Trigger 2: Slash command
Type //archive followed by your project name to activate the Archive skill.
Example: "//archive cureforge-hr-assessment"
However, in these scenarios, I will prompt but not execute:
- A delivery action just happened (email sent, demo link generated, all subagents done, code committed)
- You start a new project or say "next task" / "different topic"
I will NOT prompt when:
- Project is still in active development
- Task is ongoing operations
- Waiting on external feedback (48h+ silence)
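The trigger rules above can be sketched as a small decision function. This is only an illustration: the `Action` enum, the event names, and `decide_action` are hypothetical and are not part of the skill itself.

```python
from enum import Enum


class Action(Enum):
    ARCHIVE = "archive"  # run the archive flow
    PROMPT = "prompt"    # ask whether to archive, but do not run anything
    NONE = "none"        # stay silent


# Phrases and event names are illustrative assumptions, not the skill's real matcher.
EXPLICIT_PHRASES = ("archive this", "can we archive this")
PROMPT_EVENTS = {"email_sent", "demo_link_generated", "all_subagents_done", "code_committed"}
PROMPT_PHRASES = ("next task", "different topic")


def decide_action(message: str, recent_events: frozenset = frozenset()) -> Action:
    text = message.strip().lower()
    # An explicit request or the slash command are the only paths that execute.
    if text.startswith("//archive ") or any(p in text for p in EXPLICIT_PHRASES):
        return Action.ARCHIVE
    # A delivery just happened or the user is moving on: prompt only.
    if recent_events & PROMPT_EVENTS or any(p in text for p in PROMPT_PHRASES):
        return Action.PROMPT
    # Active development, ongoing operations, waiting on feedback: no prompt.
    return Action.NONE
```

Keeping "prompt" separate from "archive" mirrors the rule that delivery events may suggest archiving but never start it.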
Archive Flow
Step 1: Create project archive directory
CODEBLOCK2
Step 2: Collect session transcripts
Subagent sessions (important — must collect):
CODEBLOCK3
Child subagent transcripts:
CODEBLOCK4
Step 3: Sanitize transcripts (CRITICAL — must do before archiving)
Before archiving, remove:
- API keys, tokens, and authentication credentials
- Personal contact information (emails, phone numbers)
- Internal infrastructure details (hostnames, IPs)
- Any sensitive environment variables
Use the sanitization script:
CODEBLOCK5
The script redacts:
- API keys (GitHub tokens, OpenAI keys, AWS credentials, etc.)
- Email addresses
- Phone numbers
- IP addresses (IPv4 and IPv6)
- Internal hostnames and AWS EC2 DNS names
- Generic secrets and high-entropy tokens
Verify before proceeding:
CODEBLOCK6
After verification, replace the original with the sanitized version:
CODEBLOCK7
Step 4: Write ARCHIVE.md
Use the template below. Fill in decision rationale — this is the most valuable part for future retrospectives.
Step 5: Update MEMORY.md
Add a one-line summary to MEMORY.md: project name + status + link.
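A minimal sketch of that update, assuming a `- name | status | link` line format (the format and the `append_memory_entry` helper are illustrative; the skill only specifies project name + status + link):

```python
from pathlib import Path


def append_memory_entry(memory_path: Path, project: str, status: str, link: str) -> str:
    """Append one summary line (project name + status + link) and return it."""
    line = f"- {project} | {status} | {link}"  # assumed line format
    existing = memory_path.read_text(encoding="utf-8") if memory_path.exists() else ""
    if line in existing.splitlines():
        return line  # already recorded; leave the file unchanged
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    memory_path.write_text(existing + separator + line + "\n", encoding="utf-8")
    return line
```

Skipping an identical existing line keeps re-runs of the archive flow from duplicating entries.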
Step 6: Delete EFS session files (requires approval)
Before deleting any session files from EFS, ask the user:
"Can I delete the EFS session files for this project? They are already backed up in the archive."
Only proceed if the user explicitly approves. Never auto-delete without asking.
If approved:
CODEBLOCK8
If not approved, leave the EFS session files as-is.
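The approval gate in this step could look roughly like the following. `delete_if_approved` and its `ask` parameter are hypothetical names used for illustration; the only behavior taken from the text is that nothing is deleted without an explicit yes.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def delete_if_approved(paths: Iterable[Path], ask: Callable[[str], str] = input) -> List[Path]:
    """Delete the given files only after an explicit 'yes'; return what was removed."""
    question = ("Can I delete the EFS session files for this project? "
                "They are already backed up in the archive. [yes/no] ")
    if ask(question).strip().lower() not in {"yes", "y"}:
        return []  # not approved: leave every file as-is
    removed = []
    for path in paths:
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed
```

Passing `ask` as a parameter lets the same function be driven by a chat prompt or by a test stub instead of `input`.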
Step 7: Git commit (internal workspace only)
CODEBLOCK9
Keep project data private. Archive data is for internal reference only.
ARCHIVE.md Template
CODEBLOCK10 bash
decisions.md Template
CODEBLOCK12
Sanitization Script Reference
The scripts/sanitize_transcript.py script provides deterministic, audited redaction of sensitive data from session transcripts.
What it redacts
| Category | Examples | Replacement |
|---|---|---|
| GitHub tokens | INLINECODE2 , INLINECODE3 | INLINECODE4 |
| OpenAI keys | sk-xxx, sk-proj-xxx | [REDACTED-OPENAI-KEY] |
| Anthropic keys | sk-ant-xxx | [REDACTED-ANTHROPIC-KEY] |
| AWS credentials | AKIAxxx, aws_access_key_id=xxx | [REDACTED] |
| Email addresses | user@example.com | [REDACTED-EMAIL] |
| Phone numbers | +1 555-123-4567 | [REDACTED-PHONE] |
| IPv4 addresses | 192.168.1.1, 10.0.0.1 | [REDACTED-IP] |
| IPv6 addresses | 2001:db8::1 | [REDACTED-IPV6] |
| Internal hostnames | ip-10-0-1-43.local | [REDACTED-HOSTNAME] |
| AWS EC2 DNS | ec2-xxx.amazonaws.com | [REDACTED-AWS-HOST] |
| Generic secrets | High-entropy base64/hex strings | [REDACTED-SECRET] |
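To make the table concrete, here is a minimal sketch of ordered regex redaction for a few of the categories. The patterns are simplified assumptions written for illustration and are not the ones used by scripts/sanitize_transcript.py; only the replacement tokens come from the table.

```python
import re

# Order matters: the more specific Anthropic prefix must run before the generic sk- rule.
# All patterns are illustrative approximations, not the script's actual rules.
PATTERNS = [
    (re.compile(r"\bsk-ant-[A-Za-z0-9_-]+"), "[REDACTED-ANTHROPIC-KEY]"),
    (re.compile(r"\bsk-(?:proj-)?[A-Za-z0-9_-]+"), "[REDACTED-OPENAI-KEY]"),
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED]"),
    (re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"), "[REDACTED-EMAIL]"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[REDACTED-IP]"),
]


def redact(text: str) -> str:
    """Apply every pattern in a fixed order so the same input gives the same output."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

For example, `redact("key sk-proj-abc123 from user@example.com")` returns `"key [REDACTED-OPENAI-KEY] from [REDACTED-EMAIL]"`.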
Usage
CODEBLOCK13
Properties
- Deterministic: Same input always produces identical output
- Non-destructive: Original file is never modified
- Structure-preserving: JSON/JSONL structure is maintained; only string values are redacted
- Testable: Built-in test mode verifies redaction patterns
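The non-destructive and structure-preserving properties can be illustrated with a short sketch that walks each JSONL record and rewrites only string values. It is an assumption-laden example (a single email pattern, hypothetical function names), not the script's implementation.

```python
import json
import re

# One illustrative pattern; the real script covers many more categories.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_value(value):
    """Recursively redact string values; keys, numbers, and nesting are untouched."""
    if isinstance(value, str):
        return EMAIL.sub("[REDACTED-EMAIL]", value)
    if isinstance(value, list):
        return [redact_value(item) for item in value]
    if isinstance(value, dict):
        return {key: redact_value(item) for key, item in value.items()}
    return value


def sanitize_jsonl(src_path: str, dst_path: str) -> None:
    """Write a redacted copy of src_path to dst_path; the source is only read."""
    with open(src_path, encoding="utf-8") as src, open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                record = json.loads(line)
                dst.write(json.dumps(redact_value(record), ensure_ascii=False) + "\n")
```

Parsing each line before redacting means a match can never break JSON quoting, which is the main reason to operate on values instead of raw text.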
Installation
Option 1: ClawHub CLI (recommended)
```bash
openclaw skills install archive-project
# or
clawhub install archive-project
```
Option 2: From GitHub
```bash
# Clone the repository
git clone https://github.com/KaigeGao1110/ArchiveProject.git ~/.openclaw/skills/archive-project

# Or download directly
curl -L https://github.com/KaigeGao1110/ArchiveProject/archive/refs/heads/main.zip -o /tmp/archive-project.zip
unzip /tmp/archive-project.zip -d ~/.openclaw/skills/
mv ~/.openclaw/skills/ArchiveProject-main ~/.openclaw/skills/archive-project
```
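Archive Project Skill
Organize a completed project into a complete, long-term searchable archive.
Data Privacy: Archived data (session transcripts, project files) never leaves the internal workspace unless you explicitly approve a publish step. The sanitize script runs automatically before archiving.
Trigger Conditions
Archiving runs only when you explicitly ask for it: say "archive this" or "can we archive this", or use the slash command below. You always decide when a project is done.
Trigger 2: Slash command
Type //archive followed by your project name to activate the Archive skill.
Example: //archive cureforge-hr-assessment
However, in these scenarios, I will prompt but not execute:
- A delivery action just happened (email sent, demo link generated, all subagents done, code committed)
- You start a new project or say "next task" / "different topic"
I will NOT prompt when:
- Project is still in active development
- Task is ongoing operations
- Waiting on external feedback (48h+ silence)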
Archive Flow
Step 1: Create project archive directory
```
workspace/projects/<project-name>/
  ARCHIVE.md
  session_transcript.jsonl
  subagent_sessions/
  deliverables/
  decisions.md
```
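Step 2: Collect session transcripts
Subagent sessions (important, must collect):
```bash
# Directory containing session transcripts (configurable via SESSION_TRANSCRIPT_PATH)
# Default: ~/.openclaw/agents/main/sessions/ (works for all users)
# Override: set SESSION_TRANSCRIPT_PATH to a custom path (e.g. an EFS mount point)
# The path must end with a slash because the globs below append *.jsonl directly.
SESSION_DIR="${SESSION_TRANSCRIPT_PATH:-$HOME/.openclaw/agents/main/sessions/}"

# Find the main session transcript using an explicit session key
# (taken from the session label or passed as an argument).
# The key/label is matched against the files to pick the exact transcript.
SESSION_KEY="${1:-}"  # pass the session key as an argument or extract it from context
if [ -n "$SESSION_KEY" ]; then
  MAIN_SESSION_PATH=$(grep -l "$SESSION_KEY" "${SESSION_DIR}"*.jsonl 2>/dev/null | head -1)
fi

# Fallback: if no key was provided or nothing matched, use the most recent transcript
if [ -z "$MAIN_SESSION_PATH" ] || [ ! -f "$MAIN_SESSION_PATH" ]; then
  MAIN_SESSION_PATH=$(ls -t "${SESSION_DIR}"*.jsonl 2>/dev/null | head -1)
fi

# Create the project archive directory
mkdir -p "workspace/projects/<project-name>/subagent_sessions/"

# Copy the main session transcript
cp "$MAIN_SESSION_PATH" "workspace/projects/<project-name>/session_transcript.jsonl"
```
Child subagent transcripts:
```bash
# Child session IDs are listed in the main session JSONL
# Look for the childSessions array in the session metadata
# Copy each child session transcript to subagent_sessions/
# Pattern: {SESSION_DIR}/{child-id}.jsonl
```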
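The child-transcript step is described only in comments, so here is a hedged sketch of one way to carry it out. It assumes that `childSessions` appears as a top-level key of some record in the main JSONL and holds a list of string IDs; the actual transcript schema is not shown in this document, and both function names are hypothetical.

```python
import json
import shutil
from pathlib import Path
from typing import List


def find_child_session_ids(main_transcript: Path) -> List[str]:
    """Collect IDs from every top-level "childSessions" array in the JSONL file."""
    ids: List[str] = []
    for line in main_transcript.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Assumption: childSessions sits at the top level of a record and holds string IDs.
        children = record.get("childSessions", []) if isinstance(record, dict) else []
        for child_id in children:
            if child_id not in ids:
                ids.append(child_id)
    return ids


def copy_child_transcripts(session_dir: Path, child_ids: List[str], archive_dir: Path) -> List[Path]:
    """Copy {session_dir}/{child-id}.jsonl into archive_dir/subagent_sessions/."""
    target = archive_dir / "subagent_sessions"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for child_id in child_ids:
        source = session_dir / f"{child_id}.jsonl"
        if source.is_file():  # skip IDs whose transcript is missing
            copied.append(Path(shutil.copy2(source, target / source.name)))
    return copied
```

Comparing the returned list with the ID list shows at a glance which child transcripts were missing from the session directory.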
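Step 3: Sanitize transcripts (CRITICAL, must do before archiving)
Before archiving, remove:
- API keys, tokens, and authentication credentials
- Personal contact information (emails, phone numbers)
- Internal infrastructure details (hostnames, IPs)
- Any sensitive environment variables
Use the sanitization script:
```bash
python3 scripts/sanitize_transcript.py \
  workspace/projects/<project-name>/session_transcript.jsonl \
  -o workspace/projects/<project-name>/session_transcript_sanitized.jsonl
```
The script redacts:
- API keys (GitHub tokens, OpenAI keys, AWS credentials, etc.)
- Email addresses
- Phone numbers
- IP addresses (IPv4 and IPv6)
- Internal hostnames and AWS EC2 DNS names
- Generic secrets and high-entropy tokens
Verify before proceeding:
```bash
# Run the built-in tests to confirm redaction works
python3 scripts/sanitize_transcript.py --test

# Manual spot check (look for any remaining sensitive data)
grep -iE "(token|key|password|email|phone|@|192\.168|10\.)" \
  workspace/projects/<project-name>/session_transcript_sanitized.jsonl || echo "No sensitive data found"
```
After verification, replace the original with the sanitized version:
```bash
mv workspace/projects/<project-name>/session_transcript_sanitized.jsonl \
   workspace/projects/<project-name>/session_transcript.jsonl
```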
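Step 4: Write ARCHIVE.md
Use the template below. Fill in the decision rationale; it is the most valuable part for future retrospectives.
Step 5: Update MEMORY.md
Add a one-line summary to MEMORY.md: project name + status + link.
Step 6: Delete EFS session files (requires approval)
Before deleting any session files from EFS, ask the user:
"Can I delete the EFS session files for this project? They are already backed up in the archive."
Only proceed if the user explicitly approves. Never auto-delete without asking.
If approved:
```bash
# Remove the main session transcript from EFS
rm -f "${SESSION_DIR}$(basename "$MAIN_SESSION_PATH")"

# Remove any child subagent session transcripts from EFS
for CHILD_ID in <child-session-ids>; do
  rm -f "${SESSION_DIR}${CHILD_ID}.jsonl"
done
```
If not approved, leave the EFS session files as-is.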
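Step 7: Git commit (internal workspace only)
```bash
cd workspace
git add projects/<project-name>/
git commit -m "Archive: <project-name>"
```
Keep project data private. Archive data is for internal reference only.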
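ARCHIVE.md Template
````markdown
<Project Name> - Project Archive
Created: <date> | Owner: <owner> | Status: <status>
One-Sentence Summary
<1-2 sentences: what this project does, who it is for, its core value>
Project Background
Client
<Name + contact info; after archiving, keep only what is needed for future reference>
Source Materials
<description> |
Deliverables
Code/Product
Reports/Documents
Demos/Links
Timeline
| Date | Event |
|---|---|
| YYYY-MM-DD | <event> |
| YYYY-MM-DD | <delivery> |
Key Decisions
N. <Decision title>
Options: A vs B (chose A)
Rationale: <why this option was chosen>
Outcome: <what happened>
Open Items
Lessons Learned
N. <Lesson title>
<What was learned and how to improve next time>
Git Commits (internal)
<hash> | <description> |
Rebuild Guide
```bash
<rebuild command>
```
````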
decisions.md Template
```markdown
Key Decisions - <Project Name>
Decision N
- A: <description>
- B: <description>
```
Sanitization Script Reference
The scripts/sanitize_transcript.py script provides deterministic, auditable redaction of sensitive data,