Skill Preflight
A smart plugin for OpenClaw that automatically injects the most relevant skills and protocols into your agent's context before each run. Uses Ollama embeddings — free, offline-capable, no separate embedding API key required.
What It Does
When you run an agent, this plugin:
1. Scans your `skills/` and `memory/protocols/` directories for documentation
2. Embeds each doc using `nomic-embed-text` (via Ollama)
3. Matches the incoming prompt against your docs using cosine similarity
4. Injects only the relevant ones above a configurable threshold
5. Deduplicates within a session (the same doc won't be re-injected)
Result: Agents follow custom protocols and skills without burning tokens on irrelevant context.
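The matching and threshold steps above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's actual code: the function names `cosine_similarity` and `select_docs` are hypothetical, the toy three-dimensional vectors stand in for real `nomic-embed-text` embeddings, and treating a score exactly equal to the threshold as a match is an assumption.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)

def select_docs(prompt_vec, doc_vecs, min_score=0.3, max_results=3):
    """Return (name, score) pairs at or above min_score, best first."""
    scored = [(name, cosine_similarity(prompt_vec, vec))
              for name, vec in doc_vecs.items()]
    kept = [item for item in scored if item[1] >= min_score]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:max_results]

# Toy vectors standing in for real embeddings (hypothetical values).
docs = {
    "debugging": [1.0, 0.0, 0.0],
    "ethereum": [0.0, 1.0, 0.0],
    "memory": [0.7, 0.7, 0.0],
}
selected = select_docs([1.0, 0.1, 0.0], docs)
for name, score in selected:
    print(f"{name}: {score:.2f}")
```

With these toy values, `ethereum` scores about 0.10 and falls below the default `minScore` of 0.3, so only the other two docs are kept.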
Requirements
- OpenClaw ≥ 1.0
- Ollama running locally on `http://localhost:11434`
- Model: `nomic-embed-text` (download with `ollama pull nomic-embed-text`)
Quick Start
1. Install Ollama
Download from ollama.com and install.
2. Pull the embedding model
```bash
ollama pull nomic-embed-text
```
3. Start Ollama
```bash
ollama serve
```
Leave this running in the background. It listens on http://localhost:11434 by default.
4. Install the plugin
Add to your openclaw.json:
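```json
{
  "plugins": {
    "skill-preflight": {
      "enabled": true,
      "config": {
        "minScore": 0.3,
        "maxResults": 3,
        "protocolDirs": ["memory/protocols"],
        "skillsDirs": ["skills"]
      }
    }
  }
}
```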
5. Add your docs
Create your skills and protocols in:
- `skills/`: skill documentation (looks for `SKILL.md` in subdirs or loose `.md` files)
- `memory/protocols/`: protocol docs (`.md` files, 1 level deep)
Configuration
| Option | Default | Description |
|---|---|---|
| `protocolDirs` | `["memory/protocols"]` | Directories to scan for protocol docs (recursive, 1 level deep) |
| `skillsDirs` | `["skills"]` | Directories to scan for skill docs |
| `toolsFiles` | `["TOOLS.md"]` | Individual files to always include in the index |
| `pinnedDocs` | `[]` | Docs always injected first, regardless of score |
| `maxResults` | `3` | Max ranked docs to inject per run (pinned docs don't count toward this) |
| `maxDocLines` | `100` | Truncate injected docs to N lines (0 = no limit) |
| `minScore` | `0.3` | Cosine similarity threshold (0–1). Lower = more permissive. Tune via debug logs. |
| `embedModel` | `nomic-embed-text:latest` | Ollama embedding model |
| `ollamaBaseUrl` | `http://localhost:11434` | Ollama API base URL. For local-only privacy, keep this on `localhost`, `127.0.0.1`, or `::1`. If you point it at a remote host, prompt text and indexed doc content are sent to that host for embeddings. |
| `requestTimeoutMs` | `10000` | Timeout for embedding requests (ms) |
| `minPromptLength` | `20` | Minimum prompt length to trigger preflight. Short prompts skip embedding. |
Pinned Docs
Pin specific docs so they're always injected first, regardless of relevance score:
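```json
{
  "plugins": {
    "skill-preflight": {
      "config": {
        "pinnedDocs": ["memory/protocols/house-rules.md", "skills/ethereum/SKILL.md"]
      }
    }
  }
}
```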
Pinned docs appear first and don't count toward maxResults.
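One way to picture how pinned docs combine with ranked results is the sketch below. It is an illustration under assumptions, not the plugin's code: `build_injection_list` is a hypothetical name, and dropping already-pinned docs from the ranked list before applying the `maxResults` cap is a guess at sensible behavior.

```python
def build_injection_list(pinned_docs, ranked_docs, max_results=3):
    """Pinned docs first, then up to max_results ranked docs not already pinned."""
    pinned = list(dict.fromkeys(pinned_docs))  # keep order, drop repeats
    ranked = [doc for doc in ranked_docs if doc not in pinned][:max_results]
    return pinned + ranked

# ranked_docs is assumed to be sorted best-first already.
result = build_injection_list(
    pinned_docs=["memory/protocols/house-rules.md"],
    ranked_docs=[
        "skills/ethereum/SKILL.md",
        "memory/protocols/house-rules.md",  # also pinned, so not repeated
        "skills/debugging/SKILL.md",
        "skills/git/SKILL.md",
        "skills/docker/SKILL.md",
    ],
)
print(result)
```

Here one pinned doc plus three ranked docs are returned, four in total, because the cap applies only to the ranked part.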
Tuning the Threshold
Enable debug logging in OpenClaw to see similarity scores:
```
skill-preflight: scores — DebuggingProtocol(0.72), EthereumSkill(0.51), MemoryProtocol(0.34), ...
```
Use this to dial in minScore. If too many irrelevant docs are injected, raise it. If relevant docs are missing, lower it.
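When comparing candidate thresholds, it can help to parse a debug line and see which docs would pass at each value. The sketch below assumes the log uses a `Name(score)` format with names made of word characters; `parse_scores` and `passing` are hypothetical helpers, not part of the plugin.

```python
import re

# Matches entries such as "DebuggingProtocol(0.72)".
SCORE_PATTERN = re.compile(r"(\w+)\((\d+(?:\.\d+)?)\)")

def parse_scores(log_line):
    """Extract {doc_name: score} from one debug log line."""
    return {name: float(score) for name, score in SCORE_PATTERN.findall(log_line)}

def passing(scores, min_score):
    """Names of docs whose score is at or above min_score."""
    return [name for name, score in scores.items() if score >= min_score]

line = ("skill-preflight: scores — DebuggingProtocol(0.72), "
        "EthereumSkill(0.51), MemoryProtocol(0.34), ...")
scores = parse_scores(line)
for threshold in (0.3, 0.5, 0.7):
    print(threshold, passing(scores, threshold))
```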
Troubleshooting
"Ollama embedding unavailable"
- Check Ollama is running: `curl http://localhost:11434/api/tags`
- Check model is installed: `ollama list` (should show `nomic-embed-text`)
- Check timeout: if embedding is slow, increase `requestTimeoutMs` in config
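If the checks above pass but embeddings still fail, it may help to call the embedding endpoint directly. The sketch below uses Ollama's `/api/embeddings` route with a `model` and `prompt` payload; whether the plugin calls this route or the newer `/api/embed` is not stated here, so treat the endpoint choice as an assumption. `build_embed_request` and `embed` are hypothetical helper names.

```python
import json
import urllib.request

def build_embed_request(text, base_url="http://localhost:11434",
                        model="nomic-embed-text"):
    """Build (but do not send) a POST request for Ollama's /api/embeddings."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text, timeout_s=10.0, **kwargs):
    """Send the request and return the embedding as a list of floats."""
    request = build_embed_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read())["embedding"]

# Building the request needs no running server; embed() does.
request = build_embed_request("How do I debug a failing deploy?")
print(request.get_method(), request.full_url)
```

Calling `embed("hello")` with Ollama running should return a list of floats; a timeout or connection error here points at Ollama rather than at the plugin.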
"Not injecting docs I expect"
- Enable debug logs in OpenClaw to see scores
- Check file locations: docs must be in the configured `protocolDirs` or `skillsDirs`
- Check doc metadata: docs with `status: deprecated` or `status: archived` are skipped
- Verify content: empty docs or docs with only frontmatter score 0 on all prompts
"Too many/too few docs injected"
- Adjust `minScore` (lower threshold = more docs)
- Adjust `maxResults` (cap on how many ranked docs are injected)
- Use `pinnedDocs` to always include critical docs
Ollama is slow
- `nomic-embed-text` takes ~100–300ms per document on typical hardware
- This is a one-time cost per new doc; embeddings are cached for 1 hour
- For faster iteration during development, raise `minScore` to reduce docs being embedded
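The one-hour caching described above can be pictured as a small time-to-live cache. This is a sketch under assumptions: `TTLCache` is a hypothetical class, the plugin's real cache keys and eviction rules are not documented here, and the injectable clock exists only so expiry can be shown without waiting.

```python
import time

class TTLCache:
    """Cache values for ttl_seconds, recomputing after they expire."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        value = compute()
        self._entries[key] = (now, value)
        return value

# Demo with a fake clock so expiry can be shown immediately.
fake_now = [0.0]
calls = []

def fake_embed():
    calls.append("embed")  # stands in for a slow Ollama call
    return [0.1, 0.2, 0.3]

cache = TTLCache(ttl_seconds=3600, clock=lambda: fake_now[0])
cache.get_or_compute("skills/git/SKILL.md", fake_embed)  # computed
cache.get_or_compute("skills/git/SKILL.md", fake_embed)  # served from cache
fake_now[0] = 3601.0
cache.get_or_compute("skills/git/SKILL.md", fake_embed)  # expired, recomputed
print(len(calls))
```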
File Format
Docs are standard Markdown with optional frontmatter:
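```markdown
---
name: My Custom Skill
description: Brief description of what this does
status: active
---

# My Custom Skill

Detailed instructions, examples, step-by-step procedures...
```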
Frontmatter is optional. If not provided, the first heading or filename is used as the title, and the first few lines become the description.
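The fallback rules above might look roughly like the following. This is a simplified sketch: `parse_doc` is a hypothetical function, it reads only flat `key: value` lines rather than full YAML, and treating a missing `status` as `active` is an assumption.

```python
from pathlib import Path

SKIPPED_STATUSES = {"deprecated", "archived"}

def parse_doc(text, filename):
    """Split optional frontmatter from the body and derive title and status."""
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    meta, body_lines = {}, lines
    # Frontmatter counts only if it opens on line 1 and is closed later.
    if stripped and stripped[0] == "---" and "---" in stripped[1:]:
        end = stripped.index("---", 1)
        for line in lines[1:end]:
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
        body_lines = lines[end + 1:]
    heading = next(
        (line.lstrip("# ").strip() for line in body_lines if line.startswith("#")),
        None,
    )
    # Title priority: frontmatter name, then first heading, then filename.
    title = meta.get("name") or heading or Path(filename).stem
    status = meta.get("status", "active").lower()
    return {
        "title": title,
        "status": status,
        "skip": status in SKIPPED_STATUSES,
        "body": "\n".join(body_lines).strip(),
    }

with_meta = parse_doc(
    "---\nname: My Custom Skill\nstatus: deprecated\n---\n\n# Heading\n\nBody text",
    "skills/custom/SKILL.md",
)
no_meta = parse_doc("# House Rules\n\nBe kind.", "memory/protocols/house-rules.md")
bare = parse_doc("Just some notes.", "notes/deploy-checklist.md")
print(with_meta["title"], no_meta["title"], bare["title"])
```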
How It Works Under the Hood
1. Initialization: Plugin scans configured dirs and builds a doc index
2. Doc caching: Docs are cached for 1 hour to avoid repeated disk reads
3. Embedding: On each agent run, the prompt is embedded via Ollama
4. Ranking: Docs are scored by cosine similarity, and the top N are selected
5. Deduplication: Tracked per session so the same doc isn't re-injected
6. Injection: Matched docs are formatted and prepended to the prompt context
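The deduplication and injection steps can be illustrated with a short sketch. It makes assumptions throughout: `SessionDeduper` and `truncate_doc` are hypothetical names, sessions are identified by a plain string, and truncation simply keeps the first `maxDocLines` lines.

```python
def truncate_doc(text, max_doc_lines=100):
    """Keep the first max_doc_lines lines; 0 (or less) means no limit."""
    if max_doc_lines <= 0:
        return text
    return "\n".join(text.splitlines()[:max_doc_lines])

class SessionDeduper:
    """Remember which docs each session has already received."""

    def __init__(self):
        self._seen = {}  # session_id -> set of doc paths

    def filter_new(self, session_id, doc_paths):
        seen = self._seen.setdefault(session_id, set())
        fresh = []
        for path in doc_paths:
            if path not in seen:
                seen.add(path)
                fresh.append(path)
        return fresh

deduper = SessionDeduper()
first = deduper.filter_new("session-1", ["a.md", "b.md"])
second = deduper.filter_new("session-1", ["b.md", "c.md"])  # b.md already sent
other = deduper.filter_new("session-2", ["b.md"])           # new session, fresh state
print(first, second, other)
```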
Privacy & Performance
- No separate embedding API required: embeddings go through your configured Ollama endpoint
- Local-only when Ollama is local: keep `ollamaBaseUrl` on `localhost`, `127.0.0.1`, or `::1` if you want docs and prompts to stay on the same machine
- Remote Ollama changes the trust boundary: if `ollamaBaseUrl` points to another host, the following are sent to that host for embedding:
  - Prompt text from every agent run
  - Full indexed Markdown content, including any secrets, API keys, credentials, and other sensitive data in your docs
  - Any confidential information embedded in your skills, protocols, and tools documentation
- Offline capable: once the Ollama model is downloaded and running locally, no internet is required
- Caching: docs cached for 1 hour, embeddings cached in memory per session
- Session-aware: the same doc won't be re-injected in a single conversation
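A quick way to confirm that a given `ollamaBaseUrl` stays on the local machine is to inspect its hostname. The helper below is a hypothetical sketch, not something the plugin is known to ship; it only compares the hostname against the three local values named above and does not resolve DNS.

```python
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}

def is_local_ollama(base_url):
    """True when the URL's hostname is one of the known local names."""
    host = urlparse(base_url).hostname  # brackets around IPv6 are stripped
    return host is not None and host.lower() in LOCAL_HOSTS

for url in (
    "http://localhost:11434",
    "http://[::1]:11434",
    "http://ollama.example.com:11434",
):
    print(url, is_local_ollama(url))
```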
License
MIT
Questions? Check the OpenClaw docs at openclaw.ai or report issues on GitHub.