Doubleword Batch Inference

Process multiple AI inference requests asynchronously using the Doubleword batch API.

When to Use Batches

Batches are ideal for:

- Multiple independent requests that can run simultaneously
Workloads that don't require immediate responses
Large volumes that would exceed rate limits if sent individually
Cost-sensitive workloads (24h window offers better pricing)

Quick Start

Basic workflow for any batch job:

1. Create JSONL file with requests (one JSON object per line)
Upload file to get file ID
Create batch using file ID
Poll status until complete
Download results from outputfileid

Workflow

Step 1: Create Batch Request File

Create a .jsonl file where each line contains a single request:

CODEBLOCK0

Required fields per line:

- custom_id: Unique identifier (max 64 chars) - use descriptive IDs like "user-123-question-5" for easier result mapping
INLINECODE3: Always INLINECODE4
INLINECODE5: Always INLINECODE6
INLINECODE7: Standard API request with model and INLINECODE9

Optional body parameters:

- temperature: 0-2 (default: 1.0)
INLINECODE11: Maximum response tokens
INLINECODE12: Nucleus sampling parameter
INLINECODE13: Stop sequences

File limits:

- Max size: 200MB
Format: JSONL only (JSON Lines - newline-delimited JSON)
Split large batches into multiple files if needed

Helper script:
Use scripts/create_batch_file.py to generate JSONL files programmatically:

CODEBLOCK1

Modify the script's requests list to generate your specific batch requests.

Step 2: Upload File

Upload the JSONL file:

CODEBLOCK2

Response contains id field - save this file ID for next step.

Step 3: Create Batch

Create the batch job using the file ID:

CODEBLOCK3

Parameters:

- input_file_id: File ID from upload step
INLINECODE18: Always INLINECODE19
INLINECODE20: Choose "24h" (better pricing) or "1h" (50% premium, faster results)

Response contains batch id - save this for status polling.

Step 4: Poll Status

Check batch progress:

CODEBLOCK4

Status progression:

1. validating - Checking input file format
INLINECODE25 - Processing requests
INLINECODE26 - All requests finished

Other statuses:

- failed - Batch failed (check error_file_id)
INLINECODE29 - Batch timed out
INLINECODE30/cancelled - Batch cancelled

Response includes:

- output_file_id - Download results here
INLINECODE33 - Failed requests (if any)
INLINECODE34 - Total/completed/failed counts

Polling frequency: Check every 30-60 seconds during processing.

Early access: Results available via output_file_id before batch fully completes - check X-Incomplete header.

Step 5: Download Results

Download completed results:

CODEBLOCK5

Response headers:

- X-Incomplete: true - Batch still processing, more results coming
INLINECODE38 - Resume point for partial downloads

Output format (each line):
CODEBLOCK6

Download errors (if any):
CODEBLOCK7

Error format (each line):
CODEBLOCK8

Additional Operations

List All Batches

CODEBLOCK9

Cancel Batch

CODEBLOCK10

Notes:

- Unprocessed requests are cancelled
Already-processed results remain downloadable
Cannot cancel completed batches

Common Patterns

Processing Results

Parse JSONL output line-by-line:

CODEBLOCK11

Handling Partial Results

Check for incomplete batches and resume:

CODEBLOCK12

Retry Failed Requests

Extract failed requests from error file and resubmit:

CODEBLOCK13

Best Practices

1. Descriptive custom_ids: Include context in IDs for easier result mapping

- Good: "user-123-question-5" - Bad: "1", INLINECODE41

2. Validate JSONL locally: Ensure each line is valid JSON before upload

3. Split large files: Keep under 200MB limit

4. Choose appropriate window: Use 24h for cost savings, 1h only when time-sensitive

5. Handle errors gracefully: Always check error_file_id and retry failed requests

6. Monitor request_counts: Track progress via completed/total ratio

7. Save file IDs: Store batchid, inputfileid, outputfile_id for later retrieval

Reference Documentation

For complete API details including authentication, rate limits, and advanced parameters, see:

- API Reference: references/api_reference.md - Full endpoint documentation and schemas

双字批量推理

使用双字批量API异步处理多个AI推理请求。

何时使用批量处理

批量处理适用于：

- 可同时运行的多个独立请求
无需即时响应的工作负载
单独发送会超出速率限制的大量请求
对成本敏感的工作负载（24小时窗口提供更优定价）

快速入门

任何批量作业的基本流程：

1. 创建JSONL文件，包含请求（每行一个JSON对象）
上传文件以获取文件ID
创建批量任务，使用文件ID
轮询状态直至完成
下载结果，从outputfileid获取

工作流程

步骤1：创建批量请求文件

创建一个.jsonl文件，每行包含一个请求：

json
{custom_id: req-1, method: POST, url: /v1/chat/completions, body: {model: anthropic/claude-3-5-sonnet, messages: [{role: user, content: 2+2等于多少？}]}}
{custom_id: req-2, method: POST, url: /v1/chat/completions, body: {model: anthropic/claude-3-5-sonnet, messages: [{role: user, content: 法国的首都是哪里？}]}}

每行必填字段：

- custom_id：唯一标识符（最多64个字符）——使用描述性ID如user-123-question-5以便于结果映射
method：始终为POST
url：始终为/v1/chat/completions
body：包含model和messages的标准API请求

可选body参数：

- temperature：0-2（默认值：1.0）
maxtokens：最大响应令牌数
topp：核采样参数
stop：停止序列

文件限制：

- 最大大小：200MB
格式：仅JSONL（JSON Lines——换行符分隔的JSON）
如有需要，可将大批量拆分为多个文件

辅助脚本：
使用scripts/createbatchfile.py以编程方式生成JSONL文件：

bash
python scripts/createbatchfile.py output.jsonl

修改脚本中的requests列表以生成特定的批量请求。

步骤2：上传文件

上传JSONL文件：

bash
curl https://api.doubleword.ai/v1/files \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
-F purpose=batch \
-F file=@batch_requests.jsonl

响应包含id字段——保存此文件ID用于下一步。

步骤3：创建批量任务

使用文件ID创建批量作业：

bash
curl https://api.doubleword.ai/v1/batches \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
-H Content-Type: application/json \
-d {
inputfileid: file-abc123,
endpoint: /v1/chat/completions,
completion_window: 24h
}

参数：

- inputfileid：上传步骤中的文件ID
endpoint：始终为/v1/chat/completions
completion_window：选择24h（更优定价）或1h（50%溢价，更快结果）

响应包含批量任务id——保存此ID用于状态轮询。

步骤4：轮询状态

检查批量任务进度：

bash
curl https://api.doubleword.ai/v1/batches/batch-xyz789 \
-H Authorization: Bearer $DOUBLEWORDAPIKEY

状态进展：

1. validating——检查输入文件格式
in_progress——正在处理请求
completed——所有请求完成

其他状态：

- failed——批量任务失败（检查errorfileid）
expired——批量任务超时
cancelling/cancelled——批量任务已取消

响应包含：

- outputfileid——在此下载结果
errorfileid——失败的请求（如有）
request_counts——总数/已完成/失败计数

轮询频率：处理期间每30-60秒检查一次。

早期访问：在批量任务完全完成前，可通过outputfileid获取结果——检查X-Incomplete标头。

步骤5：下载结果

下载已完成的结果：

bash
curl https://api.doubleword.ai/v1/files/file-output123/content \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
> results.jsonl

响应标头：

- X-Incomplete: true——批量任务仍在处理，更多结果即将到来
X-Last-Line: 45——部分下载的恢复点

输出格式（每行）：
json
{
id: batch-req-abc,
custom_id: request-1,
response: {
status_code: 200,
body: {
id: chatcmpl-xyz,
choices: [{
message: {
role: assistant,
content: 答案是4。
}
}]
}
}
}

下载错误（如有）：
bash
curl https://api.doubleword.ai/v1/files/file-error123/content \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
> errors.jsonl

错误格式（每行）：
json
{
id: batch-req-def,
custom_id: request-2,
error: {
code: invalid_request,
message: 缺少必需参数
}
}

其他操作

列出所有批量任务

bash
curl https://api.doubleword.ai/v1/batches?limit=10 \
-H Authorization: Bearer $DOUBLEWORDAPIKEY

取消批量任务

bash
curl https://api.doubleword.ai/v1/batches/batch-xyz789/cancel \
-X POST \
-H Authorization: Bearer $DOUBLEWORDAPIKEY

注意：

- 未处理的请求将被取消
已处理的结果仍可下载
无法取消已完成的批量任务

常见模式

处理结果

逐行解析JSONL输出：

python
import json

with open(results.jsonl) as f:
for line in f:
result = json.loads(line)
customid = result[customid]
content = result[response][body][choices][0][message][content]
print(f{custom_id}: {content})

处理部分结果

检查未完成的批量任务并恢复：

python
import requests

response = requests.get(
https://api.doubleword.ai/v1/files/file-output123/content,
headers={Authorization: fBearer {api_key}}
)

if response.headers.get(X-Incomplete) == true:
last_line = int(response.headers.get(X-Last-Line, 0))
print(f批量任务未完成。目前已处理 {last_line} 个请求。)
# 继续轮询并稍后再次下载

重试失败的请求

从错误文件中提取失败的请求并重新提交：

python
import json

failed_ids = []
with open(errors.jsonl) as f:
for line in f:
error = json.loads(line)
failedids.append(error[customid])

print(f失败的请求：{failed_ids})

仅使用失败的请求创建新的批量任务

最佳实践

1. 描述性custom_ids：在ID中包含上下文以便于结果映射

- 好：user-123-question-5 - 差：1、req1

2. 本地验证JSONL：上传前确保每行都是有效的JSON

3. 拆分大文件：保持在200MB限制以下

4. 选择合适的窗口：使用24h节省成本，仅在时间敏感时使用1h

5. 优雅处理错误：始终检查errorfileid并重试失败的请求

6. 监控request_counts：通过completed/total比率跟踪进度

7. 保存文件ID：存储batchid、inputfileid、outputfile_id以便后续检索

参考文档

有关完整的API详情，包括身份验证、速率限制和高级参数，请参阅：

- API参考：references/api_reference.md——完整的端点文档和模式

doubleword-batches双词批量推理

doubleword-batches

Doubleword Batch Inference

When to Use Batches

Quick Start

Workflow

Step 1: Create Batch Request File

Step 2: Upload File

Step 3: Create Batch

Step 4: Poll Status

Step 5: Download Results

Additional Operations

List All Batches

Cancel Batch

Common Patterns

Processing Results

Handling Partial Results

Retry Failed Requests

Best Practices

Reference Documentation

双字批量推理

何时使用批量处理

快速入门

工作流程

步骤1：创建批量请求文件

步骤2：上传文件

步骤3：创建批量任务

步骤4：轮询状态

步骤5：下载结果

其他操作

列出所有批量任务

取消批量任务

常见模式

处理结果

处理部分结果

重试失败的请求

仅使用失败的请求创建新的批量任务

最佳实践

参考文档

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement