Parallel AI Search (CLI Master)

This is a single “master” skill that replaces the earlier Node-script-based version of parallel-ai-search.

It routes to the right parallel-cli capability for the task:

- Search: quick web lookup with citations (parallel-cli search)
- Extract: turn URLs (including PDFs and JS-heavy pages) into clean, LLM-ready text (parallel-cli extract)
- Deep research: multi-source reports with processor tiers (parallel-cli research ...)
- Enrich: add web-sourced columns to CSV/JSON (parallel-cli enrich ...)
- FindAll: discover entities from the web with optional enrichments (parallel-cli findall ...)
- Monitor: track web changes on a cadence, optionally via webhook (parallel-cli monitor ...)

Routing rules (pick ONE)

Choose the smallest / cheapest action that solves the user’s request:

1. Extract: if the user gives one or more URLs or says “read/summarise this page”, “extract”, “quote”, “pull the content”, “what does this page say”.
2. Deep research: ONLY if the user explicitly asks for a deep, exhaustive, comprehensive, or thorough investigation, or a multi-source “report”.
3. Enrich: if the user provides a list/table (CSV/JSON/inline objects) and wants new columns like CEO, revenue, funding, contact info, etc.
4. FindAll: if the user wants you to discover many entities (companies/people/venues/etc.) that match criteria.
5. Monitor: if the user wants ongoing tracking (“alert me”, “track changes”, “monitor this weekly”) rather than a one-off answer.
6. Search: default for everything else that needs current web info or citations.

Optional manual prefixes if the user invoked this skill directly:

- search: ...
- extract: ...
- research: ...
- enrich: ...
- findall: ...
- monitor: ...

If a prefix is present, honour it.
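
The routing rules above can be sketched as a small shell function. This is only an illustration: pick_action is a hypothetical name, the keyword patterns are assumptions rather than an exhaustive classifier, and the first matching rule wins in the same order as the list.

```bash
# Hypothetical helper: map a user request to ONE action name.
# The keyword patterns are illustrative assumptions, not part of parallel-cli.
pick_action() {
  request=$1
  # An explicit prefix always wins.
  case $request in
    search:*|extract:*|research:*|enrich:*|findall:*|monitor:*)
      printf '%s\n' "${request%%:*}"
      return 0 ;;
  esac
  # Otherwise apply the routing rules in list order (case-insensitive).
  lower=$(printf '%s' "$request" | tr '[:upper:]' '[:lower:]')
  case $lower in
    *http://*|*https://*|*"this page"*)             echo extract ;;
    *deep*|*exhaustive*|*comprehensive*|*thorough*) echo research ;;
    *.csv*|*"add column"*|*enrich*)                 echo enrich ;;
    *"find all"*|*"list of companies"*)             echo findall ;;
    *"alert me"*|*"track changes"*|*monitor*)       echo monitor ;;
    *)                                              echo search ;;
  esac
}
```

A real router would need richer signals than substring matches; the point is only that a prefix short-circuits everything else and Search is the fallback.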

Setup and authentication (only when needed)

Before running any Parallel command, ensure auth works:

    parallel-cli auth

If parallel-cli is missing, install it:

    curl -fsSL https://parallel.ai/install.sh | bash

If you cannot use the install script, use pipx:

    pipx install "parallel-web-tools[cli]"
    pipx ensurepath

Then authenticate (choose one):

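    # Interactive OAuth (opens browser)
    parallel-cli login

    # Headless / SSH / CI
    parallel-cli login --device

    # Or environment variable
    export PARALLEL_API_KEY="your_api_key"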
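
As a rough sketch, the pre-flight check described in this section could be wrapped in one function. ensure_parallel_auth and the PARALLEL_CLI override variable are hypothetical names, and the code assumes that parallel-cli auth exits non-zero when no credentials are configured, which should be verified against the installed version.

```bash
# Hypothetical helper: succeed only if the CLI is installed and authenticated.
# PARALLEL_CLI is an assumed override so the binary can be swapped out in tests.
ensure_parallel_auth() {
  cli=${PARALLEL_CLI:-parallel-cli}
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "parallel-cli not found: install it first" >&2
    return 127
  fi
  # Assumption: the auth subcommand exits non-zero when no credentials are set.
  if ! "$cli" auth >/dev/null 2>&1; then
    echo "not authenticated: log in or set an API key" >&2
    return 1
  fi
}
```

Calling it once before the first Parallel command keeps the install and login steps out of the common path.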

Output & citation rules

- Always cite web-sourced facts with inline markdown links: [Source Title](https://...).
- End with a Sources list whenever you used Search/Extract/Research output.
- Prefer official/primary sources when available.
- For long outputs, save to files in /tmp/ and summarise in-chat.
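
The citation rules above amount to simple string formatting. The two helpers below are a minimal sketch with hypothetical names (cite, sources_list); they only show the shape of an inline link and of a closing Sources list.

```bash
# Hypothetical helpers for the citation rules above.
# cite TITLE URL            -> prints an inline markdown link
# sources_list T1 U1 T2 U2  -> prints a Sources list from title/URL pairs
cite() {
  printf '[%s](%s)' "$1" "$2"
}

sources_list() {
  printf 'Sources\n'
  while [ "$#" -ge 2 ]; do
    printf '%s\n' "- $(cite "$1" "$2")"
    shift 2
  done
}
```

For example, sources_list called with two title/URL pairs prints the word Sources followed by one bulleted link per pair.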

Search (default web lookup)

Use Search for fast, cost-effective answers with citations.

Command template

    parallel-cli search "$OBJECTIVE" --mode agentic --max-results 10 --json

Add any of these only when relevant:

- --after-date YYYY-MM-DD (freshness constraint)
- --include-domains a.com b.org (restrict sources)
- --exclude-domains spam.com (block sources)
- one or more -q "keyword query" flags (extra keyword probes)
- -o "/tmp/$SLUG.search.json" (save full JSON to a file)
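
One way to add optional flags only when relevant is to build the argument list step by step. The wrapper below is a sketch under assumptions: run_search and the PARALLEL_CLI override are hypothetical, and only the --after-date and -o options from the list above are wired in.

```bash
# Hypothetical wrapper: build the search command, appending optional flags
# only when the corresponding argument is non-empty.
# Usage: run_search OBJECTIVE [AFTER_DATE] [OUTPUT_FILE]
run_search() {
  objective=$1 after_date=${2:-} out_file=${3:-}
  # Rebuild the positional parameters as the command line to execute.
  set -- "${PARALLEL_CLI:-parallel-cli}" search "$objective" \
    --mode agentic --max-results 10 --json
  if [ -n "$after_date" ]; then
    set -- "$@" --after-date "$after_date"
  fi
  if [ -n "$out_file" ]; then
    set -- "$@" -o "$out_file"
  fi
  "$@"
}
```

Using the positional parameters as an argument list keeps every value correctly quoted, including objectives that contain spaces.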

Parse + respond

From the JSON results, extract title, url, and any publish_date / excerpt fields.
Answer the user’s question, and cite each claim inline.
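
If jq is available, the parsing step can be sketched as below. The JSON layout is an assumption made for illustration (a top-level results array whose items carry title, url, and an optional publish_date); inspect the real output before relying on these field paths. results_to_links is a hypothetical name.

```bash
# Assumes jq is installed and that the saved JSON has a top-level "results"
# array whose items carry "title" and "url" (plus an optional "publish_date").
# That layout is an assumption for illustration, not a documented schema.
results_to_links() {
  jq -r '.results[]
         | "- [\(.title)](\(.url))"
           + (if .publish_date then " (\(.publish_date))" else "" end)' "$1"
}
```

The output is one markdown link per result, ready to reuse as inline citations or in the closing Sources list.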

Extract (read one or more URLs)

Use Extract when you need the actual contents of specific URLs (webpages, PDFs, JS-heavy sites).

Command template

    parallel-cli extract "$URL" --json

Add when relevant:

- --objective "Focus area" (e.g., pricing, API usage, constraints)
- --full-content (only if the user needs the whole page)
- --no-excerpts (if you only want full content)
- -o "/tmp/$SLUG.extract.json" (save full JSON to a file)

Respond

- If the user asked for a summary, summarise with citations to the extracted URL.
- If the user asked for the verbatim text, provide the extracted markdown only if it is reasonably sized; otherwise provide the key sections and offer to read more from the saved output.
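
The "reasonably sized" check can be made concrete with a byte limit. In this sketch, show_or_preview is a hypothetical helper, and the 8000-byte limit and 40-line preview are arbitrary assumptions to tune.

```bash
# Hypothetical helper: print a file verbatim when it is small enough,
# otherwise print only the first lines and point at the saved copy.
# The 8000-byte limit and 40-line preview are arbitrary assumptions.
show_or_preview() {
  file=$1 limit=${2:-8000}
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -le "$limit" ]; then
    cat "$file"
  else
    head -n 40 "$file"
    printf '\n[truncated: full text saved at %s]\n' "$file"
  fi
}
```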

Deep research (only when explicitly requested)

Deep research is slower and may cost more than Search. Use it only when the user explicitly wants depth.

Step 1 — start (always async)

    parallel-cli research run "$QUESTION" --processor pro-fast --no-wait --json

Parse run_id (and any monitoring URL) from JSON and tell the user the run started.

Step 2 — poll (bounded timeout)

Choose a short slug filename (lowercase-hyphen), then:

    parallel-cli research poll "$RUN_ID" -o "/tmp/$SLUG" --timeout 540

- Share the executive summary printed by the poll command.
- Mention the output files:
  - /tmp/$SLUG.md
  - /tmp/$SLUG.json

If polling times out, re-run the same poll command — the run continues server-side.
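
A lowercase-hyphen slug can be derived from the question itself. The helper below is a sketch: slugify is a hypothetical name and the 40-character cap is an arbitrary assumption.

```bash
# Hypothetical helper: turn a free-text question into a lowercase-hyphen slug.
# The 40-character cap is an arbitrary assumption.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | cut -c1-40 \
    | sed -e 's/^-*//' -e 's/-*$//'
}
```

The result can be stored in SLUG and used in the poll command's -o path, so the report files get a readable name.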

Enrich (CSV/JSON or inline data)

Use Enrich to add web-sourced columns to structured data.

Step 1 — (optional) suggest columns

    parallel-cli enrich suggest "$INTENT" --json

Use this when the user knows the goal but not the exact output schema.

Step 2 — run (always async for large jobs)

For CSV:

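    parallel-cli enrich run \
      --source-type csv \
      --source "input.csv" \
      --target "/tmp/enriched.csv" \
      --source-columns '[{"name":"company","description":"Company name"}]' \
      --intent "$INTENT" \
      --no-wait --json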

For inline JSON rows:

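    parallel-cli enrich run \
      --data '[{"company":"Google"},{"company":"Apple"}]' \
      --target "/tmp/enriched.csv" \
      --intent "$INTENT" \
      --no-wait --json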

Parse taskgroup_id from JSON.
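
Hand-writing JSON for inline rows is error-prone once names contain quotes or ampersands. Assuming jq is installed, a helper like the hypothetical names_to_rows below can generate the array from one name per line; the company key is only an example column name.

```bash
# Assumes jq is installed. Reads one name per line on stdin and prints a
# compact JSON array of {"company": ...} objects, letting jq do the escaping.
# The "company" key is only an example column name.
names_to_rows() {
  jq -Rnc '[inputs | select(length > 0) | {company: .}]'
}
```

The output can be captured with command substitution and passed as the value of --data.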

Step 3 — poll

    parallel-cli enrich poll "$TASKGROUP_ID" --timeout 540 --json

After completion:

- Tell the user the output file path (the --target you chose).
- Preview a few rows (using file read tools if available) and report row counts.
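
When no file read tool is available, a plain shell preview is enough. preview_csv is a hypothetical helper; it counts physical lines, so quoted fields that contain newlines would be over-counted.

```bash
# Hypothetical helper: report the data-row count (excluding the header line)
# and show the header plus the first N rows of the enriched CSV.
# Counts physical lines, so quoted fields containing newlines are over-counted.
preview_csv() {
  file=$1 n=${2:-5}
  total=$(( $(wc -l < "$file") ))
  rows=$(( total - 1 ))
  if [ "$rows" -lt 0 ]; then
    rows=0
  fi
  printf 'rows: %s\n' "$rows"
  head -n "$(( n + 1 ))" "$file"
}
```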

If poll times out, re-run it — the job continues server-side.
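
Because the job keeps running server-side, re-polling is safe to automate with a bounded loop. This sketch assumes the poll command exits non-zero when it times out, which should be confirmed for the installed CLI; retry_poll is a hypothetical name.

```bash
# Hypothetical helper: re-run a poll command until it succeeds or the
# attempt budget is used up. Assumes the command exits non-zero on timeout.
# Usage: retry_poll MAX_ATTEMPTS COMMAND [ARGS...]
retry_poll() {
  max=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      return 0
    fi
    echo "poll attempt $attempt/$max did not finish" >&2
    attempt=$(( attempt + 1 ))
  done
  return 1
}
```

For instance, passing 3 followed by the enrich poll command allows up to three polling windows before giving up and reporting back to the user.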

FindAll (entity discovery)

Use FindAll when the user wants you to discover a set of entities (e.g., “AI startups in healthcare”, “roofing companies in Charlotte”, “YC devtools companies”).

Step 1 — run

    parallel-cli findall run "$OBJECTIVE" --generator core --match-limit 25 --no-wait --json

Useful options:

- --dry-run --json to preview schema before spending money
- --exclude '[{"name":"Example Corp","url":"example.com"}]' to avoid known entities
- --generator preview|base|core|pro (core default; pro for hardest queries)

Parse run_id from JSON.

Step 2 — poll + fetch results

    parallel-cli findall poll "$RUN_ID" --json
    parallel-cli findall result "$RUN_ID" --json

Respond with:

- total entities found
- a clean list/table of the best matches (name + URL + key attributes)
- any caveats about ambiguous matches

Monitor (web change tracking)

Use Monitor when the user wants ongoing tracking.

Create:

    parallel-cli monitor create "$OBJECTIVE" --cadence daily --json

Optional:

- --cadence hourly|daily|weekly|every_two_weeks
- --webhook https://example.com/hook (deliver events externally)
- --output-schema (structured events)

Manage:

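    parallel-cli monitor list --json
    parallel-cli monitor get "$MONITOR_ID" --json
    parallel-cli monitor update "$MONITOR_ID" --cadence weekly --json
    parallel-cli monitor delete "$MONITOR_ID"
    parallel-cli monitor events "$MONITOR_ID" --json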

Respond with the monitor id and how to retrieve events (or confirm webhook delivery).
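
Monitor creation with an optional webhook can be sketched as a thin wrapper. create_monitor and the PARALLEL_CLI override are hypothetical; the wrapper defaults the cadence to daily and adds --webhook only when a URL is supplied.

```bash
# Hypothetical wrapper: create a monitor, adding --webhook only when given.
# Usage: create_monitor OBJECTIVE [CADENCE] [WEBHOOK_URL]
create_monitor() {
  objective=$1 cadence=${2:-daily} webhook=${3:-}
  set -- "${PARALLEL_CLI:-parallel-cli}" monitor create "$objective" \
    --cadence "$cadence" --json
  if [ -n "$webhook" ]; then
    set -- "$@" --webhook "$webhook"
  fi
  "$@"
}
```

Without a webhook, the response should point the user at the monitor events command; with one, it only needs to confirm delivery.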

Reference material

- Copy/paste command templates and patterns: INLINECODE38
- Troubleshooting common failures: INLINECODE39
