Doubleword Batch Inference
Process multiple AI inference requests asynchronously using the Doubleword batch API.
When to Use Batches
Batches are ideal for:
- - Multiple independent requests that can run simultaneously
- Workloads that don't require immediate responses
- Large volumes that would exceed rate limits if sent individually
- Cost-sensitive workloads (24h window offers better pricing)
Quick Start
Basic workflow for any batch job:
- 1. Create JSONL file with requests (one JSON object per line)
- Upload file to get file ID
- Create batch using file ID
- Poll status until complete
- Download results from outputfileid
Workflow
Step 1: Create Batch Request File
Create a .jsonl file where each line contains a single request:
CODEBLOCK0
Required fields per line:
- -
custom_id: Unique identifier (max 64 chars) - use descriptive IDs like "user-123-question-5" for easier result mapping - INLINECODE3 : Always INLINECODE4
- INLINECODE5 : Always INLINECODE6
- INLINECODE7 : Standard API request with
model and INLINECODE9
Optional body parameters:
- -
temperature: 0-2 (default: 1.0) - INLINECODE11 : Maximum response tokens
- INLINECODE12 : Nucleus sampling parameter
- INLINECODE13 : Stop sequences
File limits:
- - Max size: 200MB
- Format: JSONL only (JSON Lines - newline-delimited JSON)
- Split large batches into multiple files if needed
Helper script:
Use scripts/create_batch_file.py to generate JSONL files programmatically:
CODEBLOCK1
Modify the script's requests list to generate your specific batch requests.
Step 2: Upload File
Upload the JSONL file:
CODEBLOCK2
Response contains id field - save this file ID for next step.
Step 3: Create Batch
Create the batch job using the file ID:
CODEBLOCK3
Parameters:
- -
input_file_id: File ID from upload step - INLINECODE18 : Always INLINECODE19
- INLINECODE20 : Choose
"24h" (better pricing) or "1h" (50% premium, faster results)
Response contains batch id - save this for status polling.
Step 4: Poll Status
Check batch progress:
CODEBLOCK4
Status progression:
- 1.
validating - Checking input file format - INLINECODE25 - Processing requests
- INLINECODE26 - All requests finished
Other statuses:
- -
failed - Batch failed (check error_file_id) - INLINECODE29 - Batch timed out
- INLINECODE30 /
cancelled - Batch cancelled
Response includes:
- -
output_file_id - Download results here - INLINECODE33 - Failed requests (if any)
- INLINECODE34 - Total/completed/failed counts
Polling frequency: Check every 30-60 seconds during processing.
Early access: Results available via output_file_id before batch fully completes - check X-Incomplete header.
Step 5: Download Results
Download completed results:
CODEBLOCK5
Response headers:
- -
X-Incomplete: true - Batch still processing, more results coming - INLINECODE38 - Resume point for partial downloads
Output format (each line):
CODEBLOCK6
Download errors (if any):
CODEBLOCK7
Error format (each line):
CODEBLOCK8
Additional Operations
List All Batches
CODEBLOCK9
Cancel Batch
CODEBLOCK10
Notes:
- - Unprocessed requests are cancelled
- Already-processed results remain downloadable
- Cannot cancel completed batches
Common Patterns
Processing Results
Parse JSONL output line-by-line:
CODEBLOCK11
Handling Partial Results
Check for incomplete batches and resume:
CODEBLOCK12
Retry Failed Requests
Extract failed requests from error file and resubmit:
CODEBLOCK13
Best Practices
- 1. Descriptive custom_ids: Include context in IDs for easier result mapping
- Good:
"user-123-question-5"
- Bad:
"1", INLINECODE41
- 2. Validate JSONL locally: Ensure each line is valid JSON before upload
- 3. Split large files: Keep under 200MB limit
- 4. Choose appropriate window: Use
24h for cost savings, 1h only when time-sensitive
- 5. Handle errors gracefully: Always check
error_file_id and retry failed requests
- 6. Monitor request_counts: Track progress via
completed/total ratio
- 7. Save file IDs: Store batchid, inputfileid, outputfile_id for later retrieval
Reference Documentation
For complete API details including authentication, rate limits, and advanced parameters, see:
- - API Reference:
references/api_reference.md - Full endpoint documentation and schemas
双字批量推理
使用双字批量API异步处理多个AI推理请求。
何时使用批量处理
批量处理适用于:
- - 可同时运行的多个独立请求
- 无需即时响应的工作负载
- 单独发送会超出速率限制的大量请求
- 对成本敏感的工作负载(24小时窗口提供更优定价)
快速入门
任何批量作业的基本流程:
- 1. 创建JSONL文件,包含请求(每行一个JSON对象)
- 上传文件以获取文件ID
- 创建批量任务,使用文件ID
- 轮询状态直至完成
- 下载结果,从outputfileid获取
工作流程
步骤1:创建批量请求文件
创建一个.jsonl文件,每行包含一个请求:
json
{custom_id: req-1, method: POST, url: /v1/chat/completions, body: {model: anthropic/claude-3-5-sonnet, messages: [{role: user, content: 2+2等于多少?}]}}
{custom_id: req-2, method: POST, url: /v1/chat/completions, body: {model: anthropic/claude-3-5-sonnet, messages: [{role: user, content: 法国的首都是哪里?}]}}
每行必填字段:
- - custom_id:唯一标识符(最多64个字符)——使用描述性ID如user-123-question-5以便于结果映射
- method:始终为POST
- url:始终为/v1/chat/completions
- body:包含model和messages的标准API请求
可选body参数:
- - temperature:0-2(默认值:1.0)
- maxtokens:最大响应令牌数
- topp:核采样参数
- stop:停止序列
文件限制:
- - 最大大小:200MB
- 格式:仅JSONL(JSON Lines——换行符分隔的JSON)
- 如有需要,可将大批量拆分为多个文件
辅助脚本:
使用scripts/createbatchfile.py以编程方式生成JSONL文件:
bash
python scripts/createbatchfile.py output.jsonl
修改脚本中的requests列表以生成特定的批量请求。
步骤2:上传文件
上传JSONL文件:
bash
curl https://api.doubleword.ai/v1/files \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
-F purpose=batch \
-F file=@batch_requests.jsonl
响应包含id字段——保存此文件ID用于下一步。
步骤3:创建批量任务
使用文件ID创建批量作业:
bash
curl https://api.doubleword.ai/v1/batches \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
-H Content-Type: application/json \
-d {
inputfileid: file-abc123,
endpoint: /v1/chat/completions,
completion_window: 24h
}
参数:
- - inputfileid:上传步骤中的文件ID
- endpoint:始终为/v1/chat/completions
- completion_window:选择24h(更优定价)或1h(50%溢价,更快结果)
响应包含批量任务id——保存此ID用于状态轮询。
步骤4:轮询状态
检查批量任务进度:
bash
curl https://api.doubleword.ai/v1/batches/batch-xyz789 \
-H Authorization: Bearer $DOUBLEWORDAPIKEY
状态进展:
- 1. validating——检查输入文件格式
- in_progress——正在处理请求
- completed——所有请求完成
其他状态:
- - failed——批量任务失败(检查errorfileid)
- expired——批量任务超时
- cancelling/cancelled——批量任务已取消
响应包含:
- - outputfileid——在此下载结果
- errorfileid——失败的请求(如有)
- request_counts——总数/已完成/失败计数
轮询频率:处理期间每30-60秒检查一次。
早期访问:在批量任务完全完成前,可通过outputfileid获取结果——检查X-Incomplete标头。
步骤5:下载结果
下载已完成的结果:
bash
curl https://api.doubleword.ai/v1/files/file-output123/content \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
> results.jsonl
响应标头:
- - X-Incomplete: true——批量任务仍在处理,更多结果即将到来
- X-Last-Line: 45——部分下载的恢复点
输出格式(每行):
json
{
id: batch-req-abc,
custom_id: request-1,
response: {
status_code: 200,
body: {
id: chatcmpl-xyz,
choices: [{
message: {
role: assistant,
content: 答案是4。
}
}]
}
}
}
下载错误(如有):
bash
curl https://api.doubleword.ai/v1/files/file-error123/content \
-H Authorization: Bearer $DOUBLEWORDAPIKEY \
> errors.jsonl
错误格式(每行):
json
{
id: batch-req-def,
custom_id: request-2,
error: {
code: invalid_request,
message: 缺少必需参数
}
}
其他操作
列出所有批量任务
bash
curl https://api.doubleword.ai/v1/batches?limit=10 \
-H Authorization: Bearer $DOUBLEWORDAPIKEY
取消批量任务
bash
curl https://api.doubleword.ai/v1/batches/batch-xyz789/cancel \
-X POST \
-H Authorization: Bearer $DOUBLEWORDAPIKEY
注意:
- - 未处理的请求将被取消
- 已处理的结果仍可下载
- 无法取消已完成的批量任务
常见模式
处理结果
逐行解析JSONL输出:
python
import json
with open(results.jsonl) as f:
for line in f:
result = json.loads(line)
customid = result[customid]
content = result[response][body][choices][0][message][content]
print(f{custom_id}: {content})
处理部分结果
检查未完成的批量任务并恢复:
python
import requests
response = requests.get(
https://api.doubleword.ai/v1/files/file-output123/content,
headers={Authorization: fBearer {api_key}}
)
if response.headers.get(X-Incomplete) == true:
last_line = int(response.headers.get(X-Last-Line, 0))
print(f批量任务未完成。目前已处理 {last_line} 个请求。)
# 继续轮询并稍后再次下载
重试失败的请求
从错误文件中提取失败的请求并重新提交:
python
import json
failed_ids = []
with open(errors.jsonl) as f:
for line in f:
error = json.loads(line)
failedids.append(error[customid])
print(f失败的请求:{failed_ids})
仅使用失败的请求创建新的批量任务
最佳实践
- 1. 描述性custom_ids:在ID中包含上下文以便于结果映射
- 好:user-123-question-5
- 差:1、req1
- 2. 本地验证JSONL:上传前确保每行都是有效的JSON
- 3. 拆分大文件:保持在200MB限制以下
- 4. 选择合适的窗口:使用24h节省成本,仅在时间敏感时使用1h
- 5. 优雅处理错误:始终检查errorfileid并重试失败的请求
- 6. 监控request_counts:通过completed/total比率跟踪进度
- 7. 保存文件ID:存储batchid、inputfileid、outputfile_id以便后续检索
参考文档
有关完整的API详情,包括身份验证、速率限制和高级参数,请参阅:
- - API参考:references/api_reference.md——完整的端点文档和模式