Parallel Enrichment

Bulk data enrichment that adds web-sourced fields to lists of companies, people, or products. Describe what you want in natural language.

When to Use

Trigger this skill when the user asks for:

- "enrich this list with...", "add CEO names to...", "find funding for these companies..."
- "look up contact info for...", "get LinkedIn profiles for..."
- Bulk data operations on CSV files or lists
- Adding web-sourced columns to existing datasets
- Lead enrichment, company research, product comparison

Quick Start

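bash
# Inline data
parallel-cli enrich run \
  --data '[{"company": "Google"}, {"company": "Microsoft"}]' \
  --intent "CEO name and founding year" \
  --target output.csv

# CSV file
parallel-cli enrich run \
  --source-type csv \
  --source input.csv \
  --target output.csv \
  --intent "CEO name and founding year"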

CLI Reference

Basic Usage

bash
parallel-cli enrich run [options]

Note: There is no --json flag for enrich. Results are written to the target file.

Common Flags

Flag	Description
--data <json>	Inline JSON array of records
--source-type csv

Processor Tiers

Processor	Use Case
lite-fast	Simple lookups
base-fast

Examples

Inline data enrichment:
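bash
parallel-cli enrich run \
  --data '[{"company": "Stripe"}, {"company": "Square"}, {"company": "Adyen"}]' \
  --intent "CEO name, headquarters city, and latest funding round" \
  --target ./companies-enriched.csv

CSV file enrichment:
bash
parallel-cli enrich run \
  --source-type csv \
  --source ./leads.csv \
  --target ./leads-enriched.csv \
  --source-columns '[{"name": "company_name", "description": "Company name"}]' \
  --intent "Find the CEO name, company size, and LinkedIn company page URL"

With explicit output columns:
bash
parallel-cli enrich run \
  --data '[{"name": "Sam Altman"}, {"name": "Satya Nadella"}]' \
  --source-columns '[{"name": "name", "description": "Full name of the person"}]' \
  --enriched-columns '[
    {"name": "current_company", "description": "Current company/employer"},
    {"name": "title", "description": "Current job title"},
    {"name": "twitter", "description": "Twitter/X handle"}
  ]' \
  --target ./people-enriched.csv

Using AI to suggest columns:
bash
# First, get AI suggestions
parallel-cli enrich suggest \
  --source-type csv \
  --source ./companies.csv \
  --intent "competitor analysis data"

# Then run with the suggested columns
parallel-cli enrich run \
  --source-type csv \
  --source ./companies.csv \
  --target ./companies-analysis.csv \
  --intent "competitor analysis: market position, key products, latest news"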

Best-Practice Prompting

Intent Description

Write 1-2 sentences describing:

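- What specific fields you want to add
- Context about the data (B2B companies, tech startups, etc.)
- Any constraints (recent data, specific sources)

Good:
--intent "Find the CEO name, total funding, and employee count for B2B SaaS companies"

Poor:
--intent "Find information about these companies"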

Source Column Descriptions

When using --source-columns, provide context:

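json
[
  {"name": "company", "description": "Company name, may include Inc/LLC suffix"},
  {"name": "website", "description": "Company website URL, may be incomplete"}
]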
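A starting point for the --source-columns value can be generated from the CSV header. The sketch below is a hypothetical helper (source_columns_template is not part of parallel-cli) that prints one entry per header field with a placeholder description to fill in by hand. It assumes a simple header whose column names contain no commas, quotes, or backslashes.

```shell
# Hypothetical helper, not part of parallel-cli: print a JSON array with one
# {"name", "description"} entry per column found in the CSV header line.
# Assumes column names contain no commas, quotes, or backslashes.
source_columns_template() {
  head -n 1 "$1" | tr -d '\r' | awk -F',' '{
    printf "["
    for (i = 1; i <= NF; i++) {
      printf "%s{\"name\": \"%s\", \"description\": \"TODO\"}", (i > 1 ? ", " : ""), $i
    }
    print "]"
  }'
}
```

Replace each TODO with real context about the column before passing the array to --source-columns.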

Response Format

The CLI outputs:

- A monitoring URL to track progress
- Status updates as rows are processed
- Final output written to target CSV

The target CSV contains:

- All original columns from the source
- New enriched columns as specified
- A _parallel_status column indicating success/failure per row
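To see how many rows succeeded or failed, the _parallel_status column can be tallied. This is a minimal sketch: status_tally is a hypothetical helper, it makes no assumption about which status strings the CLI writes, and it only handles simple CSV with no quoted fields containing commas or newlines. For real-world output a proper CSV parser is safer.

```shell
# Hypothetical helper, not part of parallel-cli: count rows per distinct value
# of the _parallel_status column.
# Assumes simple CSV (no quoted fields containing commas or newlines).
status_tally() {
  awk -F',' '
    { sub(/\r$/, "") }                      # tolerate CRLF line endings
    NR == 1 {
      for (i = 1; i <= NF; i++) if ($i == "_parallel_status") col = i
      if (!col) { print "no _parallel_status column" > "/dev/stderr"; exit 1 }
      next
    }
    { count[$col]++ }
    END { for (s in count) print s, count[s] }
  ' "$1" | sort
}
```

Each output line is a status value followed by the number of rows that carry it.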

Output Handling

After enrichment completes:

1. Report the number of rows enriched
2. Preview the first few rows: head -6 output.csv
3. Share the full path to the output file
4. Note any rows that failed enrichment
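The reporting steps above can be sketched as a small shell function. summarize_output is a hypothetical name, and the row count assumes one record per line (no embedded newlines in fields).

```shell
# Hypothetical helper, not part of parallel-cli: report the data-row count,
# the absolute path, and a short preview of an enriched CSV.
# Assumes one record per line (no newlines inside quoted fields).
summarize_output() {
  local file="$1" lines dir
  lines=$(awk 'END { print NR }' "$file")   # counts a final line without newline too
  dir=$(cd "$(dirname "$file")" && pwd)
  echo "Data rows: $((lines - 1))"          # subtract the header line
  echo "Path: $dir/$(basename "$file")"
  head -6 "$file"                           # header plus the first five rows
}
```

Usage: summarize_output ./companies-enriched.csv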

Configuration File

For complex enrichments, use a YAML config:

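yaml
# enrich-config.yaml
source:
  type: csv
  path: ./input.csv
  columns:
    - name: company_name
      description: Legal company name
    - name: website
      description: Company website URL

target:
  type: csv
  path: ./output.csv

enriched_columns:
  - name: ceo_name
    description: Full name of the current CEO
  - name: employee_count
    description: Approximate number of employees
  - name: funding_total
    description: Total funding in USD

processor: pro-fast

Then run:
bash
parallel-cli enrich run enrich-config.yaml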

Running Out of Context?

For large enrichments, save results and use sessions_spawn:

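bash
parallel-cli enrich run --source-type csv --source input.csv --target /tmp/enriched-<topic>.csv --intent "..."

Then spawn a sub-agent:
json
{
  "tool": "sessions_spawn",
  "task": "Read /tmp/enriched-<topic>.csv and summarize the results. Report the row count, the success rate, and a preview of the first 5 rows.",
  "label": "enrich-summary"
}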

Error Handling

Exit Code	Meaning
0	Success
1
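When scripting around the CLI, the exit code can be checked before the output file is read. The wrapper below is a minimal sketch: run_checked is a hypothetical name, and it treats any nonzero code as a failure, which is the usual shell convention rather than something this document specifies for each code.

```shell
# Hypothetical wrapper, not part of parallel-cli: run a command, report a
# nonzero exit code on stderr, and pass the code through to the caller.
run_checked() {
  "$@"
  local status=$?
  if [ "$status" -ne 0 ]; then
    echo "command failed with exit code $status: $*" >&2
  fi
  return "$status"
}
```

Usage: run_checked parallel-cli enrich run enrich-config.yaml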

Common issues:

- Row failures: Check _parallel_status column in output
- Timeout: Use smaller batches or lower processor tier
- Rate limits: Add delays between large enrichments
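Smaller batches and delays between runs can be combined in a script. The sketch below uses two hypothetical functions: split_csv_batches cuts a CSV into header-preserving chunks, and enrich_batches runs the CLI on each chunk with a pause in between. The 30-second default pause is an arbitrary placeholder, not a documented limit, and the splitting assumes one record per line.

```shell
# Hypothetical helper: split input CSV into files of at most N data rows,
# repeating the header in each. Assumes one record per line.
split_csv_batches() {
  local input="$1" rows="$2" outdir="$3" header body
  mkdir -p "$outdir"
  header=$(head -n 1 "$input")
  tail -n +2 "$input" | split -l "$rows" - "$outdir/body-"
  for body in "$outdir"/body-*; do
    [ -e "$body" ] || continue              # no data rows, nothing to write
    { printf '%s\n' "$header"; cat "$body"; } > "$outdir/batch-${body##*/body-}.csv"
    rm "$body"
  done
}

# Hypothetical helper: enrich each batch in turn, sleeping between runs.
# Only flags shown elsewhere in this document are used.
enrich_batches() {
  local outdir="$1" intent="$2" pause="${3:-30}" batch name
  for batch in "$outdir"/batch-*.csv; do
    name=${batch##*/}
    parallel-cli enrich run \
      --source-type csv \
      --source "$batch" \
      --target "$outdir/enriched-${name#batch-}" \
      --intent "$intent"
    sleep "$pause"
  done
}
```

Usage: split_csv_batches ./leads.csv 100 ./batches, then enrich_batches ./batches "CEO name and company size". Each batch produces its own enriched file, so the results need to be concatenated afterwards if a single CSV is wanted.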

Prerequisites

1. Get an API key at parallel.ai
2. Install the CLI:

bash
curl -fsSL https://parallel.ai/install.sh | bash
export PARALLEL_API_KEY="your-key"
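Before starting a long enrichment it can help to confirm that both prerequisites are in place. This is a minimal sketch: check_prereqs is a hypothetical helper, and it assumes the binary is named parallel-cli and that the key is exported as PARALLEL_API_KEY.

```shell
# Hypothetical pre-flight check, not part of parallel-cli.
# Assumes the binary is named parallel-cli and the key is read from
# PARALLEL_API_KEY; adjust both if your installation differs.
check_prereqs() {
  local missing=0
  if ! command -v parallel-cli >/dev/null 2>&1; then
    echo "parallel-cli not found on PATH" >&2
    missing=1
  fi
  if [ -z "${PARALLEL_API_KEY:-}" ]; then
    echo "PARALLEL_API_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

The function returns 0 only when both checks pass, so it can guard a run: check_prereqs && parallel-cli enrich run enrich-config.yaml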

References

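Parallel Data Enrichment

Bulk data enrichment that adds web-sourced fields to lists of companies, people, or products. Describe what you need in natural language.

When to Use

Trigger this skill when the user makes requests such as:

- "enrich this list with...", "add CEO names to...", "find funding information for these companies..."
- "look up contact info for...", "get LinkedIn profiles for..."
- Bulk data operations on CSV files or lists
- Adding web-sourced columns to existing datasets
- Lead enrichment, company research, product comparison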

Quick Start

bash
# Inline data
parallel-cli enrich run \
  --data '[{"company": "Google"}, {"company": "Microsoft"}]' \
  --intent "CEO name and founding year" \
  --target output.csv

# CSV file
parallel-cli enrich run \
  --source-type csv \
  --source input.csv \
  --target output.csv \
  --intent "CEO name and founding year"

CLI Reference

Basic Usage

bash
parallel-cli enrich run [options]

Note: enrich has no --json flag. Results are written to the target file.

Common Flags

Flag	Description
--data <json>	Inline JSON array of records
--source-type csv

Processor Tiers

Processor	Use Case
lite-fast	Simple lookups
base-fast

Examples

Inline data enrichment:
bash
parallel-cli enrich run \
  --data '[{"company": "Stripe"}, {"company": "Square"}, {"company": "Adyen"}]' \
  --intent "CEO name, headquarters city, and latest funding round" \
  --target ./companies-enriched.csv

CSV file enrichment:
bash
parallel-cli enrich run \
  --source-type csv \
  --source ./leads.csv \
  --target ./leads-enriched.csv \
  --source-columns '[{"name": "company_name", "description": "Company name"}]' \
  --intent "Find the CEO name, company size, and LinkedIn company page URL"

With explicit output columns:
bash
parallel-cli enrich run \
  --data '[{"name": "Sam Altman"}, {"name": "Satya Nadella"}]' \
  --source-columns '[{"name": "name", "description": "Full name of the person"}]' \
  --enriched-columns '[
    {"name": "current_company", "description": "Current company/employer"},
    {"name": "title", "description": "Current job title"},
    {"name": "twitter", "description": "Twitter/X handle"}
  ]' \
  --target ./people-enriched.csv

Using AI to suggest columns:
bash
# First, get AI suggestions
parallel-cli enrich suggest \
  --source-type csv \
  --source ./companies.csv \
  --intent "competitor analysis data"

# Then run with the suggested columns
parallel-cli enrich run \
  --source-type csv \
  --source ./companies.csv \
  --target ./companies-analysis.csv \
  --intent "competitor analysis: market position, key products, latest news"

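Best-Practice Prompting

Intent Description

Write 1-2 sentences describing:

- The specific fields you want to add
- Context about the data (B2B companies, tech startups, etc.)
- Any constraints (recent data, specific sources)

Good:

--intent "Find the CEO name, total funding, and employee count for B2B SaaS companies"

Poor:

--intent "Find information about these companies"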

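Source Column Descriptions

When using --source-columns, provide context:

json
[
  {"name": "company", "description": "Company name, may include Inc/LLC suffix"},
  {"name": "website", "description": "Company website URL, may be incomplete"}
]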

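Response Format

The CLI outputs:

- A monitoring URL to track progress
- Status updates as rows are processed
- Final output written to the target CSV

The target CSV contains:

- All original columns from the source
- The new enriched columns as specified
- A _parallel_status column indicating success/failure per row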

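Output Handling

After enrichment completes:

1. Report the number of rows enriched
2. Preview the first few rows: head -6 output.csv
3. Share the full path to the output file
4. Note any rows that failed enrichment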

Configuration File

For complex enrichments, use a YAML config:

yaml
# enrich-config.yaml
source:
  type: csv
  path: ./input.csv
  columns:
    - name: company_name
      description: Legal company name
    - name: website
      description: Company website URL

target:
  type: csv
  path: ./output.csv

enriched_columns:
  - name: ceo_name
    description: Full name of the current CEO
  - name: employee_count
    description: Approximate number of employees
  - name: funding_total
    description: Total funding in USD

processor: pro-fast

Then run:
bash
parallel-cli enrich run enrich-config.yaml

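Running Out of Context?

For large enrichments, save the results and use sessions_spawn:

bash
parallel-cli enrich run --source-type csv --source input.csv --target /tmp/enriched-<topic>.csv --intent "..."

Then spawn a sub-agent:
json
{
  "tool": "sessions_spawn",
  "task": "Read /tmp/enriched-<topic>.csv and summarize the results. Report the row count, the success rate, and a preview of the first 5 rows.",
  "label": "enrich-summary"
}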

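Error Handling

Exit Code	Meaning
0	Success
1

Common issues:

- Row failures: Check the _parallel_status column in the output
- Timeout: Use smaller batches or a lower processor tier
- Rate limits: Add delays between large enrichments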

Prerequisites

1. Get an API key at parallel.ai
2. Install the CLI:

bash
curl -fsSL https://parallel.ai/install.sh | bash
export PARALLEL_API_KEY="your-key"
