deep-scout
Multi-stage deep intelligence pipeline (Search → Filter → Fetch → Synthesize).
🛠️ Installation
1. Ask OpenClaw (Recommended)
Tell OpenClaw:
"Install the deep-scout skill." The agent will handle the installation and configuration automatically.
2. Manual Installation (CLI)
If you prefer the terminal, run:

    clawhub install deep-scout
🚀 Usage

    /deep-scout your research question [--depth 5] [--freshness pw] [--country US] [--style report]
Options
| Flag | Default | Description |
|---|---|---|
| `--depth N` | 5 | Number of URLs to fully fetch (1–10) |
| `--freshness` | pw | pd=past day, pw=past week, pm=past month, py=past year |
| `--country` | US | 2-letter country code for Brave search |
| `--language` | en | 2-letter language code |
| `--search-count` | 8 | Total results to collect before filtering |
| `--min-score` | 4 | Minimum relevance score to keep (0–10) |
| `--style` | report | report \| comparison \| bullets \| timeline |
| `--dimensions` | auto | Comparison dimensions (comma-separated, for `--style comparison`) |
| `--output FILE` | stdout | Write report to file |
| `--no-browser` | - | Disable browser fallback |
| `--no-firecrawl` | - | Disable Firecrawl fallback |
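For illustration, the flags and defaults in the table can be mirrored with Python's `argparse`. This is only a sketch: the skill's real entry point is `scripts/run.sh`, and `build_parser` is a hypothetical name that does not exist in the skill.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options table (not the skill's own code)."""
    p = argparse.ArgumentParser(prog="deep-scout")
    p.add_argument("query", nargs="+", help="research question")
    p.add_argument("--depth", type=int, default=5, choices=range(1, 11), metavar="N")
    p.add_argument("--freshness", default="pw", choices=["pd", "pw", "pm", "py"])
    p.add_argument("--country", default="US")
    p.add_argument("--language", default="en")
    p.add_argument("--search-count", type=int, default=8)
    p.add_argument("--min-score", type=int, default=4, choices=range(0, 11), metavar="S")
    p.add_argument("--style", default="report",
                   choices=["report", "comparison", "bullets", "timeline"])
    p.add_argument("--dimensions", default="auto")
    p.add_argument("--output", metavar="FILE", default=None)  # None means stdout
    p.add_argument("--no-browser", action="store_true")
    p.add_argument("--no-firecrawl", action="store_true")
    return p


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["EV battery market", "--depth", "3", "--style", "comparison"]
    )
    print(args.depth, args.freshness, args.style, args.no_browser)
```

Restricting `--depth`, `--min-score`, `--freshness`, and `--style` with `choices` rejects out-of-range values at parse time rather than deep inside the pipeline.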
🛠️ Pipeline — Agent Loop Instructions
When this skill is invoked, execute the following four-stage pipeline:
Stage 1: SEARCH
Call web_search with:

    query: <user query>
    count: <search_count>
    country: <country>
    search_lang: <language>
    freshness: <freshness>
Collect: title, url, snippet for each result.
If fewer than 3 results returned, retry with freshness: "py" (relaxed).
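The retry rule above can be sketched in Python. `web_search` is an agent tool, not a Python function, so the sketch takes it as a callable stand-in. `search_with_fallback` and the error message are hypothetical, and the keyword names follow the parameter list used for the search call.

```python
from typing import Callable

# Stand-in type for the agent's web_search tool (assumption: keyword args in, list of dicts out).
SearchFn = Callable[..., list]


def search_with_fallback(web_search: SearchFn, query: str, count: int = 8,
                         country: str = "US", language: str = "en",
                         freshness: str = "pw") -> list:
    """Run the search; if fewer than 3 results come back, retry once with freshness 'py'."""
    params = {"query": query, "count": count, "country": country,
              "search_lang": language, "freshness": freshness}
    results = web_search(**params)
    if len(results) < 3 and freshness != "py":
        # Relax the time window to the past year and try again.
        results = web_search(**{**params, "freshness": "py"})
    if not results:
        raise RuntimeError("search returned no results, even with freshness=py")
    # Keep only the fields later stages need.
    return [{"title": r.get("title", ""), "url": r["url"], "snippet": r.get("snippet", "")}
            for r in results]
```

Passing the tool in as an argument keeps the retry logic testable with a fake search function.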
Stage 2: FILTER
Load prompts/filter.txt. Replace template vars:
- `{{query}}` → the user's query
- `{{freshness}}` → freshness param
- `{{min_score}}` → min_score param
- `{{results_json}}` → JSON array of search results
Call the LLM with this prompt. Parse the returned JSON array.
Keep only results where keep: true. Sort by score descending.
Take top depth URLs as the fetch list.
Deduplication: Max 2 results per root domain (already handled in filter prompt).
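The selection rules above (keep, sort, cap, deduplicate) can be sketched as follows. The field names `keep`, `score`, and `url` follow the text. `select_fetch_list` and `root_domain` are hypothetical helpers, and the domain check is a defensive re-check of what the filter prompt is said to handle already.

```python
from collections import Counter
from urllib.parse import urlparse


def root_domain(url: str) -> str:
    """Naive root domain: last two hostname labels (approximation; ignores suffixes like co.uk)."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def select_fetch_list(scored: list, depth: int = 5, per_domain: int = 2) -> list:
    """Keep results flagged keep=true, sort by score descending, cap per domain, take top `depth`."""
    kept = [r for r in scored if r.get("keep") is True]
    kept.sort(key=lambda r: r["score"], reverse=True)
    seen = Counter()
    urls = []
    for r in kept:
        domain = root_domain(r["url"])
        if seen[domain] >= per_domain:
            continue  # already have the maximum number of results for this domain
        seen[domain] += 1
        urls.append(r["url"])
        if len(urls) == depth:
            break
    return urls
```

A production version would likely use a public-suffix list for the root domain; the two-label shortcut keeps the sketch dependency-free.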
Stage 3: FETCH (Tiered Escalation)
For each URL in the filtered list:
Tier 1 — web_fetch (fast):

    Call web_fetch(url)
    If content length >= 200 chars → accept, trim to max_chars_per_source
Tier 2 — Firecrawl (deep/JS):
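
    If Tier 1 fails or returns < 200 chars:
      Run: scripts/firecrawl-wrap.sh
      If output != FIRECRAWL_UNAVAILABLE and != FIRECRAWL_EMPTY → accept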
Tier 3 — Browser (last resort):
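
    If Tier 2 fails:
      Call browser(action=open, url=url)
      Call browser(action=snapshot)
      Load prompts/browser-extract.txt, substitute {{query}} and {{max_chars_per_source}}
      Call LLM with the snapshot content and the extraction prompt
      If output != FETCH_FAILED:... → accept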
If all tiers fail: Use the original snippet from Stage 1 search results. Mark as [snippet only].
Store: { url: extracted_content } dict.
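A minimal sketch of the tiered escalation, assuming each tool can be wrapped as a Python callable that takes a URL and returns text. The 200-character threshold and the sentinel strings come from the skill's instructions. `fetch_with_escalation`, the `accept_*` helpers, and the callable wrappers are hypothetical.

```python
from typing import Callable, Optional

# Stand-in for an agent tool: url -> extracted text, or None on failure (assumption).
Tool = Callable[[str], Optional[str]]

FIRECRAWL_SENTINELS = {"FIRECRAWL_UNAVAILABLE", "FIRECRAWL_EMPTY"}


def accept_web_fetch(text: Optional[str]) -> bool:
    """Tier 1 accepts content of at least 200 characters."""
    return bool(text) and len(text) >= 200


def accept_firecrawl(text: Optional[str]) -> bool:
    """Tier 2 accepts anything that is not one of the wrapper's sentinel outputs."""
    return bool(text) and text.strip() not in FIRECRAWL_SENTINELS


def accept_browser(text: Optional[str]) -> bool:
    """Tier 3 accepts anything that does not report FETCH_FAILED."""
    return bool(text) and not text.startswith("FETCH_FAILED")


def fetch_with_escalation(url: str, snippet: str, max_chars: int, web_fetch: Tool,
                          firecrawl: Optional[Tool] = None,
                          browser: Optional[Tool] = None) -> str:
    """Try each enabled tier in order; fall back to the search snippet if all fail."""
    tiers = [(web_fetch, accept_web_fetch),
             (firecrawl, accept_firecrawl),
             (browser, accept_browser)]
    for tool, accept in tiers:
        if tool is None:  # tier disabled, e.g. --no-firecrawl or --no-browser
            continue
        try:
            text = tool(url)
        except Exception:
            text = None  # treat any tool error as a failed tier
        if accept(text):
            return text[:max_chars]
    return f"[snippet only] {snippet}"
```

Pairing each tool with its own acceptance rule keeps the loop uniform while each tier keeps its own success criterion.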
Stage 4: SYNTHESIZE
Choose prompt template based on --style:
- report / bullets / timeline → `prompts/synthesize-report.txt`
- comparison → `prompts/synthesize-comparison.txt`
Replace template vars:
- `{{query}}` → user query
- `{{today}}` → current date (YYYY-MM-DD)
- `{{language}}` → language param
- `{{source_count}}` → number of successfully fetched sources
- `{{dimensions_or_auto}}` → dimensions param (or "auto")
- `{{fetched_content_blocks}}` → build as:

      [Source 1] (url1)
      <content>
      [Source 2] (url2)
      <content>
Call LLM with the filled prompt. The output is the final report.
If --output FILE is set, write the report to that file. Otherwise, print to the channel.
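The template-filling step can be sketched as below. The placeholder names in the example (`query`, `today`, `source_count`, `fetched_content_blocks`) are assumptions about the template files, whose real contents are not shown here. `build_content_blocks` and `fill_template` are hypothetical helpers.

```python
import re
from datetime import date


def build_content_blocks(fetched: dict) -> str:
    """Render {url: content} as numbered '[Source N] (url)' blocks, one after another."""
    return "\n".join(
        f"[Source {i}] ({url})\n{content}"
        for i, (url, content) in enumerate(fetched.items(), start=1)
    )


def fill_template(template: str, values: dict) -> str:
    """Replace each {{name}} placeholder with values[name] in a single pass."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for template variable {key!r}")
        return str(values[key])
    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


if __name__ == "__main__":
    fetched = {"https://a.example/1": "Alpha", "https://b.example/2": "Beta"}
    template = ("Query: {{query}} ({{today}}, {{source_count}} sources)\n"
                "{{fetched_content_blocks}}")
    print(fill_template(template, {
        "query": "EV battery market",
        "today": date.today().isoformat(),  # YYYY-MM-DD
        "source_count": len(fetched),
        "fetched_content_blocks": build_content_blocks(fetched),
    }))
```

A single `re.sub` pass, unlike chained `str.replace` calls, does not re-expand `{{...}}` text that appears inside fetched page content.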
⚙️ Configuration
Defaults are in config.yaml. Override via CLI flags above.
📂 Project Structure
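
    skills/deep-scout/
    ├── SKILL.md                         ← This file (agent instructions)
    ├── config.yaml                      ← Default parameter values
    ├── prompts/
    │   ├── filter.txt                   ← Stage 2: relevance scoring prompt
    │   ├── synthesize-report.txt        ← Stage 4: report/bullets/timeline synthesis
    │   ├── synthesize-comparison.txt    ← Stage 4: comparison table synthesis
    │   └── browser-extract.txt          ← Stage 3: browser snapshot extraction
    ├── scripts/
    │   ├── run.sh                       ← CLI entry point (emits pipeline actions)
    │   └── firecrawl-wrap.sh            ← Firecrawl CLI wrapper with fallback handling
    └── examples/
        └── openclaw-acquisition.md      ← Sample output: OpenClaw acquisition intel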
🔧 Error Handling
| Scenario | Handling |
|---|---|
| All fetch attempts fail | Use snippet from Stage 1; mark `[snippet only]` |
| Search returns 0 results | Retry with `freshness: py`; error if still 0 |
| Firecrawl not installed | `firecrawl-wrap.sh` outputs `FIRECRAWL_UNAVAILABLE`, skip silently |
| Browser tool unavailable | Skip Tier 3; proceed with available content |
| LLM synthesis exceeds context | Trim sources proportionally, prioritize high-score sources |
| Rate limit on Brave API | Wait 2s, retry once |
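The table does not say how sources are trimmed when synthesis exceeds the context. One possible reading, sketched here as an assumption, gives each source a share of a character budget in proportion to its relevance score, so high-score sources keep more text. `trim_to_budget` and the character-based budget are hypothetical.

```python
def trim_to_budget(sources: list, budget: int) -> dict:
    """sources: list of (url, score, content). Returns {url: content}, trimmed if over budget.

    Each source's share of `budget` is proportional to its score (minimum weight 1).
    """
    total_chars = sum(len(content) for _, _, content in sources)
    if total_chars <= budget:
        return {url: content for url, _, content in sources}
    total_weight = sum(max(score, 1) for _, score, _ in sources)
    trimmed = {}
    for url, score, content in sources:
        share = budget * max(score, 1) // total_weight  # integer share of the budget
        trimmed[url] = content[:share]
    return trimmed
```

The sketch does not hand unused share from short sources to longer ones; a fuller version could redistribute that slack.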
📋 Example Outputs
See examples/openclaw-acquisition.md for a full sample report.
Deep Scout v0.1.0 · OpenClaw Skills · clawhub: deep-scout
deep-scout
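Multi-stage deep intelligence pipeline (Search → Filter → Fetch → Synthesize).
🛠️ Installation
1. Ask OpenClaw (Recommended)
Tell OpenClaw:
"Install the deep-scout skill." The agent will handle the installation and configuration automatically.
2. Manual Installation (CLI)
If you prefer the terminal, run:

    clawhub install deep-scout

🚀 Usage

    /deep-scout your research question [--depth 5] [--freshness pw] [--country US] [--style report]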
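Options
| Flag | Default | Description |
|---|---|---|
| `--depth N` | 5 | Number of URLs to fully fetch (1–10) |
| `--freshness` | pw | pd=past day, pw=past week, pm=past month, py=past year |
| `--country` | US | 2-letter country code for Brave search |
| `--language` | en | 2-letter language code |
| `--search-count` | 8 | Total results to collect before filtering |
| `--min-score` | 4 | Minimum relevance score to keep (0–10) |
| `--style` | report | report \| comparison \| bullets \| timeline |
| `--dimensions` | auto | Comparison dimensions (comma-separated, for `--style comparison`) |
| `--output FILE` | stdout | Write report to file |
| `--no-browser` | - | Disable browser fallback |
| `--no-firecrawl` | - | Disable Firecrawl fallback |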
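🛠️ Pipeline: Agent Loop Instructions
When this skill is invoked, execute the following four-stage pipeline:
Stage 1: SEARCH
Call web_search with:

    query: <user query>
    count: <search_count>
    country: <country>
    search_lang: <language>
    freshness: <freshness>

Collect: title, url, snippet for each result.
If fewer than 3 results are returned, retry with freshness: py (relaxed).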
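Stage 2: FILTER
Load prompts/filter.txt. Replace template vars:
- `{{query}}` → the user's query
- `{{freshness}}` → freshness param
- `{{min_score}}` → min_score param
- `{{results_json}}` → JSON array of search results

Call the LLM with this prompt. Parse the returned JSON array.
Keep only results where keep: true. Sort by score descending.
Take the top depth URLs as the fetch list.
Deduplication: Max 2 results per root domain (already handled in the filter prompt).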
Stage 3: FETCH (Tiered Escalation)
For each URL in the filtered list:
Tier 1 - web_fetch (fast):

    Call web_fetch(url)
    If content length >= 200 chars → accept, trim to max_chars_per_source

Tier 2 - Firecrawl (deep/JS):

    If Tier 1 fails or returns < 200 chars:
      Run: scripts/firecrawl-wrap.sh
      If output != FIRECRAWL_UNAVAILABLE and != FIRECRAWL_EMPTY → accept

Tier 3 - Browser (last resort):

    If Tier 2 fails:
      Call browser(action=open, url=url)
      Call browser(action=snapshot)
      Load prompts/browser-extract.txt, substitute {{query}} and {{max_chars_per_source}}
      Call LLM with the snapshot content and the extraction prompt
      If output != FETCH_FAILED:... → accept

If all tiers fail: Use the original snippet from Stage 1 search results. Mark as [snippet only].
Store: { url: extracted_content } dict.
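Stage 4: SYNTHESIZE
Choose prompt template based on --style:
- report / bullets / timeline → `prompts/synthesize-report.txt`
- comparison → `prompts/synthesize-comparison.txt`

Replace template vars:
- `{{query}}` → user query
- `{{today}}` → current date (YYYY-MM-DD)
- `{{language}}` → language param
- `{{source_count}}` → number of successfully fetched sources
- `{{dimensions_or_auto}}` → dimensions param (or auto)
- `{{fetched_content_blocks}}` → build as:

      [Source 1] (url1)
      <content>
      [Source 2] (url2)
      <content>

Call the LLM with the filled prompt. The output is the final report.
If --output FILE is set, write the report to that file. Otherwise, print to the channel.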
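⚙️ Configuration
Defaults are in config.yaml. Override via the CLI flags above.
📂 Project Structure

    skills/deep-scout/
    ├── SKILL.md                         ← This file (agent instructions)
    ├── config.yaml                      ← Default parameter values
    ├── prompts/
    │   ├── filter.txt                   ← Stage 2: relevance scoring prompt
    │   ├── synthesize-report.txt        ← Stage 4: report/bullets/timeline synthesis
    │   ├── synthesize-comparison.txt    ← Stage 4: comparison table synthesis
    │   └── browser-extract.txt          ← Stage 3: browser snapshot extraction
    ├── scripts/
    │   ├── run.sh                       ← CLI entry point (emits pipeline actions)
    │   └── firecrawl-wrap.sh            ← Firecrawl CLI wrapper with fallback handling
    └── examples/
        └── openclaw-acquisition.md      ← Sample output: OpenClaw acquisition intel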
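🔧 Error Handling
| Scenario | Handling |
|---|---|
| All fetch attempts fail | Use snippet from Stage 1; mark `[snippet only]` |
| Search returns 0 results | Retry with `freshness: py`; error if still 0 |
| Firecrawl not installed | `firecrawl-wrap.sh` outputs `FIRECRAWL_UNAVAILABLE`, skip silently |
| Browser tool unavailable | Skip Tier 3; proceed with available content |
| LLM synthesis exceeds context | Trim sources proportionally, prioritize high-score sources |
| Rate limit on Brave API | Wait 2s, retry once |

📋 Example Outputs
See examples/openclaw-acquisition.md for a full sample report.
Deep Scout v0.1.0 · OpenClaw Skills · clawhub: deep-scout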