Laiye Agentic Document Processing (ADP)
Agentic Document Processing API: converts 10+ file formats (.jpeg, .jpg, .png, .bmp, .tiff, .pdf, .doc, .docx, .xls, .xlsx) to structured JSON/Excel with per-field confidence scores, using VLM and LLM.
Base URL: https://adp-global.laiye.com/?utm_source=github
Quick Start
CODEBLOCK0
Response:
CODEBLOCK1
Setup
1. Get Your API Credentials
CODEBLOCK2
Save your credentials:
CODEBLOCK3
2. Configuration (Optional)
Recommended: Use environment variables (most secure):
CODEBLOCK4
Security Note:
- Set file permissions: chmod 600 ~/.openclaw/openclaw.json
- Never commit this file to version control
- Prefer environment variables or secret stores
- Rotate credentials regularly
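A pre-flight check can catch a missing credential before any request is sent. The sketch below assumes the credentials live in the environment variables ADP_ACCESS_KEY, ADP_APP_KEY, and ADP_APP_SECRET; adp_check_env is a hypothetical helper name, not part of ADP.

```shell
# Print the name of every required credential variable that is unset or
# empty, and return 1 if any are missing (0 otherwise).
adp_check_env() {
  local name value missing=0
  for name in ADP_ACCESS_KEY ADP_APP_KEY ADP_APP_SECRET; do
    # Indirect lookup of the variable named by $name. eval is safe here
    # because the names come from the fixed list above, never from input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Usage: abort a script early when something is missing.
# adp_check_env >&2 || exit 1
```

Only variable names are printed, never values, so the output is safe to log.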
Common Tasks
Extract from File URL
CODEBLOCK5
Extract from Base64
CODEBLOCK6
Extract with VLM Results
CODEBLOCK7
Access VLM results: response["doc_recognize_result"]
Async Extraction (Large Documents)
Create extraction task:
CODEBLOCK8
Poll for results:
CODEBLOCK9
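The create-then-poll flow can be wrapped in a small retry loop. This is a minimal sketch under a loud assumption: it treats a response containing "status": "success" as finished, which mirrors the sync response shown in this document but is not confirmed for the task-query endpoint. adp_poll and the fetch command passed to it are hypothetical names.

```shell
# Run a fetch command repeatedly until its output reports success.
# Usage: adp_poll <max_attempts> <delay_seconds> <command> [args...]
# Returns 0 and prints the final response on success, 1 if the task is
# still unfinished after max_attempts, 2 if the fetch command fails.
adp_poll() {
  local max_attempts="$1" delay="$2" attempt=1 response
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    response="$("$@")" || return 2
    case "$response" in
      # ASSUMPTION: a finished task carries "status": "success".
      *'"status": "success"'* | *'"status":"success"'*)
        printf '%s\n' "$response"
        return 0
        ;;
    esac
    # Wait before the next attempt, but not after the last one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Usage (fetch_task is a wrapper you would write around the query curl call):
# adp_poll 30 2 fetch_task "$task_id"
```

Passing the fetch command as arguments keeps the loop independent of curl, so it can be exercised with a stub before touching the real endpoint.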
Advanced Features
Custom Scale Parameter
Enhance VLM quality with higher resolution:
CODEBLOCK10
Specify Config Version
Use a specific extraction configuration:
CODEBLOCK11
Document Recognition Only
Get VLM results without extraction:
CODEBLOCK12
When to Use
Use ADP For:
- Invoice processing
- Order processing
- Receipt processing
- Financial document processing
- Logistics document processing
- Multi-table document data extraction
Don't Use For:
- Video transcription
- Audio transcription
Best Practices
| Document Size | Endpoint | Notes |
|---|---|---|
| Small files | /doc/extract (sync) | Immediate response |
| Large files | /doc/extract/create/task (async) | Poll for results |
File Input:
- file_url: Prefer for large files (already hosted)
- file_base64: Use for direct upload (max 20MB)
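The choice between the two inputs can be made from the local file size. The sketch below reads "20MB" as 20 × 1024 × 1024 bytes and applies it to the raw file, both of which are assumptions (the limit may instead apply to the encoded payload); adp_input_mode is a hypothetical helper.

```shell
# ASSUMPTION: the 20MB base64 limit is 20 MiB measured on the raw file.
ADP_BASE64_MAX_BYTES=$((20 * 1024 * 1024))

# Print which request field suits a local file: "file_base64" if it fits
# under the limit, otherwise "file_url" (host the file and pass its URL).
# Returns 2 if the path is not a regular file.
adp_input_mode() {
  local size
  [ -f "$1" ] || return 2
  # The arithmetic expansion strips the padding some wc builds emit.
  size=$(( $(wc -c < "$1") ))
  if [ "$size" -le "$ADP_BASE64_MAX_BYTES" ]; then
    echo file_base64
  else
    echo file_url
  fi
}

# Usage:
# adp_input_mode invoice.pdf
```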
Confidence Scores:
- Range: 0-1 per field
- Review fields with confidence <0.8 manually
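The 0.8 review threshold is easy to apply mechanically. This sketch assumes the field_key and confidence of each extracted field have already been pulled out of the response as tab-separated lines (for example with a JSON tool of your choice); adp_needs_review is a hypothetical helper.

```shell
# Read "field_key<TAB>confidence" lines on stdin and print the keys whose
# confidence is strictly below the threshold (first argument, default 0.8).
adp_needs_review() {
  awk -F '\t' -v threshold="${1:-0.8}" '$2 + 0 < threshold + 0 { print $1 }'
}

# Usage:
# printf 'invoice_number\t0.95\ntotal_amount\t0.62\n' | adp_needs_review
```

A field at exactly 0.8 is not flagged, matching the "<0.8" wording above.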
Response Structure:
- extraction_result: Array of extracted fields
- doc_recognize_result: VLM results (when with_rec_result=true)
- metadata: Processing info (pages, time, model)
Response Schema
Success Response
CODEBLOCK13
Error Response
CODEBLOCK14
Common Use Cases
Invoice/Receipt Extraction
Extracts: invoice_number, invoice_date, vendor/customer_name, currency, vat_rate, total_amount_including_tax, total_amount_excluding_tax, line_items, etc.
Purchase Order Extraction
Extracts: order_number, order_date, buyer_name/seller_name, address, total_amount, line_items, etc.
Security & Privacy
Data Handling
Important: Documents uploaded to ADP are transmitted to https://adp-global.laiye.com/?utm_source=github and processed on external servers.
Before uploading sensitive documents:
- Review the ADP privacy policy and data retention policies
- Verify encryption in transit (HTTPS) and at rest
- Confirm data deletion/retention timelines
- Test with non-sensitive sample documents first
Best practices:
- Do not upload highly sensitive PII until you have confirmed the service's security posture
- Use credentials with limited permissions if available
- Rotate credentials regularly (every 90 days recommended)
- Monitor API usage logs for unauthorized access
- Never log or commit credentials to repositories
File Size Limits
- Max file size: 50MB
- Supported formats: .jpeg, .jpg, .png, .bmp, .tiff, .pdf, .doc, .docx, .xls, .xlsx
- Concurrency limit: 1 concurrent request for free users, 2 for paid users
- Timeout: 10 minutes for sync requests
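Checking the extension locally avoids spending a request on an unsupported file. The sketch below uses the extension list above; matching case-insensitively is an assumption about how the service treats names like SCAN.PDF, and adp_is_supported is a hypothetical helper.

```shell
# Return 0 if the file name ends in one of the supported extensions.
adp_is_supported() {
  local ext
  # A name without a dot has no extension at all.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Take the text after the last dot and lowercase it.
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    jpeg | jpg | png | bmp | tiff | pdf | doc | docx | xls | xlsx) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# adp_is_supported "$file" || echo "unsupported format: $file" >&2
```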
Operational Safeguards
- Always use environment variables or secure secret stores for credentials
- Never include real credentials in code examples or documentation
- Use placeholder values like "your_access_key_here" in examples
- Set appropriate file permissions on configuration files (600)
- Enable credential rotation and monitor usage
Billing
| Processing Stage | Cost |
|---|---|
| Document Parsing | 0.5 credits/page |
| Purchase Order Extraction | 1.5 credits/page |
| Invoice/Receipt Extraction | 1.5 credits/page |
| Custom Extraction | 1 credit/page |
New users: 100 free credits per month, no application restrictions.
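The per-page rates make cost estimates simple arithmetic: pages multiplied by the stage rate. The sketch below prices one stage at a time, because the table does not say whether stages add up for a single document; the stage labels and adp_credits are hypothetical names.

```shell
# Print the credits needed to run <pages> pages through one stage.
# Rates are stored as half-credits so the arithmetic stays in integers.
adp_credits() {
  local stage="$1" pages="$2" half_rate half_total
  case "$stage" in
    parsing) half_rate=1 ;;          # 0.5 credits/page
    purchase_order) half_rate=3 ;;   # 1.5 credits/page
    invoice_receipt) half_rate=3 ;;  # 1.5 credits/page
    custom) half_rate=2 ;;           # 1 credit/page
    *) echo "unknown stage: $stage" >&2; return 1 ;;
  esac
  half_total=$((half_rate * pages))
  if [ $((half_total % 2)) -eq 0 ]; then
    echo "$((half_total / 2))"
  else
    echo "$((half_total / 2)).5"
  fi
}

# Usage:
# adp_credits invoice_receipt 12
```

For example, 12 invoice pages at 1.5 credits/page come to 18 credits, well inside the 100 free monthly credits.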
Troubleshooting
| Error Code | Description | Common Causes & Solutions |
|---|---|---|
| 400 Bad Request | Invalid request parameters | • Missing app_key or app_secret<br>• Must provide exactly one input: file_url or file_base64<br>• Application has no published extraction config |
| 401 Unauthorized | Authentication failed | • Invalid X-Access-Key<br>• Incorrect timestamp format (use Unix timestamp)<br>• Invalid signature format (must be UUID) |
| 404 Not Found | Resource not found | • Application does not exist<br>• No published extraction config found for the application |
| 500 Internal Server Error | Server-side processing error | • Document conversion failed<br>• VLM recognition timeout<br>• LLM extraction failure |
| Sync Timeout | Request processing timed out | • Large files should use async endpoint<br>• Poll /query/task/{task_id} for results |
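Two of the 401 causes above are format problems that can be checked before sending. The sketch below validates only the shape of the values: digits for the Unix timestamp and the canonical 8-4-4-4-12 hex layout for the UUID. Whether the service expects seconds or milliseconds, or a particular UUID version, is not stated here, so neither is enforced; both function names are hypothetical.

```shell
# Return 0 if the value is a non-empty string of decimal digits.
adp_valid_timestamp() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Return 0 if the value has the canonical UUID layout (8-4-4-4-12 hex).
adp_valid_signature() {
  printf '%s\n' "$1" |
    grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Usage:
# ts="$(date +%s)"; sig="$(uuidgen)"
# adp_valid_timestamp "$ts" && adp_valid_signature "$sig" || echo "bad headers" >&2
```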
Pre-Publish Security Checklist
Before publishing or updating this skill, verify:
- [ ] package.json declares requiredEnv and primaryEnv for credentials
- [ ] package.json lists API endpoints in endpoints array
- [ ] All code examples use placeholder values, not real credentials
- [ ] No credentials or secrets are embedded in SKILL.md or INLINECODE25
- [ ] Security & Privacy section documents data handling and risks
- [ ] Configuration examples include security warnings for plaintext storage
- [ ] File permission guidance is included for config files
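Laiye Agentic Document Processing (ADP)
Agentic Document Processing API: uses VLM and LLM to convert 10+ file formats (.jpeg, .jpg, .png, .bmp, .tiff, .pdf, .doc, .docx, .xls, .xlsx) into structured JSON/Excel with a confidence score for each field.
Base URL: https://adp-global.laiye.com/?utm_source=github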
Quick Start
```bash
curl -X POST "https://adp-global.laiye.com/open/agenticdocprocessor/laiye/v1/app/doc/extract" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: $ADP_ACCESS_KEY" \
  -H "X-Timestamp: $(date +%s)" \
  -H "X-Signature: $(uuidgen)" \
  -d "{
    \"app_key\": \"$ADP_APP_KEY\",
    \"app_secret\": \"$ADP_APP_SECRET\",
    \"file_url\": \"https://example.com/invoice.pdf\"
  }"
```
Response:
```json
{
  "status": "success",
  "extraction_result": [
    {
      "field_key": "invoice_number",
      "field_value": "INV-2024-001",
      "field_type": "text",
      "confidence": 0.95,
      "source_pages": [1]
    },
    {
      "field_key": "total_amount",
      "field_value": "1000.00",
      "field_type": "number",
      "confidence": 0.98,
      "source_pages": [1]
    }
  ]
}
```
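Setup
1. Get Your API Credentials
```bash
# Contact your ADP service provider to obtain:
# - app_key: Application access key
# - app_secret: Application secret key
# - X-Access-Key: Tenant-level access key
```
Save your credentials:
```bash
export ADP_ACCESS_KEY="your_access_key_here"
export ADP_APP_KEY="your_app_key_here"
export ADP_APP_SECRET="your_app_secret_here"
```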
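2. Configuration (Optional)
Recommended: Use environment variables (most secure):
```json5
{
  skills: {
    entries: {
      "adp-doc-extraction": {
        enabled: true,
        // API credentials are loaded from environment variables
      },
    },
  },
}
```
Security Note:
- Set file permissions: chmod 600 ~/.openclaw/openclaw.json
- Never commit this file to version control
- Prefer environment variables or secret stores
- Rotate credentials regularly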
Common Tasks
Extract from File URL
```bash
curl -X POST "https://adp-global.laiye.com/open/agenticdocprocessor/laiye/v1/app/doc/extract" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: $ADP_ACCESS_KEY" \
  -H "X-Timestamp: $(date +%s)" \
  -H "X-Signature: $(uuidgen)" \
  -d "{
    \"app_key\": \"$ADP_APP_KEY\",
    \"app_secret\": \"$ADP_APP_SECRET\",
    \"file_url\": \"https://example.com/document.pdf\"
  }"
```
Extract from Base64
```bash
# Convert the file to base64 on a single line
file_base64=$(base64 -i document.pdf | tr -d '\n')

curl -X POST "https://adp-global.laiye.com/open/agenticdocprocessor/laiye/v1/app/doc/extract" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: $ADP_ACCESS_KEY" \
  -H "X-Timestamp: $(date +%s)" \
  -H "X-Signature: $(uuidgen)" \
  -d "{
    \"app_key\": \"$ADP_APP_KEY\",
    \"app_secret\": \"$ADP_APP_SECRET\",
    \"file_base64\": \"$file_base64\",
    \"file_name\": \"document.pdf\"
  }"
```
Extract with VLM Results
```bash
curl -X POST "https://adp-global.laiye.com/open/agenticdocprocessor/laiye/v1/app/doc/extract" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: $ADP_ACCESS_KEY" \
  -H "X-Timestamp: $(date +%s)" \
  -H "X-Signature: $(uuidgen)" \
  -d "{
    \"app_key\": \"$ADP_APP_KEY\",
    \"app_secret\": \"$ADP_APP_SECRET\",
    \"file_url\": \"https://example.com/document.pdf\",
    \"with_rec_result\": true
  }"
```
Access VLM results: response["doc_recognize_result"]
Async Extraction (Large Documents)
Create extraction task:
```bash
curl -X POST "https://adp-global.laiye.com/open/agenticdocprocessor/laiye/v1/app/doc/extract/create/task" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: $ADP_ACCESS_KEY" \
  -H "X-Timestamp: $(date +%s)" \
  -H "X-Signature: $(uuidgen)" \
  -d "{
    \"app_key\": \"$ADP_APP_KEY\",
    \"app_secret\": \"$ADP_APP_SECRET\",
    \"file_url\": \"https://example.com/large-document.pdf\"
  }"
```
Returns: {"task_id": "task_id_value", "metadata": {...}}
Poll for results:
```bash
# Replace {task_id} with the task_id returned by the create-task call
curl -X GET "https://adp-global.laiye.com/open/agenticdocprocessor/laiye/v1/app/doc/extract/query/task/{task_id}" \
  -H "X-Access-Key: $ADP_ACCESS_KEY"
```
Advanced Features
Custom Scale Parameter
Enhance VLM quality with higher resolution:
```json
"model_params": { "scale": 2.0 }
```
Specify Config Version
Use a specific extraction configuration:
```json
"model_params": { "version_id": "config_version_id" }
```
Document Recognition Only
Get VLM results without extraction:
```bash
curl -X POST "https://adp-global.laiye.com/open/agenticdocprocessor/laiye/v1/app/doc/recognize" \
  -H "Content-Type: application/json" \
  -H "X-Access-Key: $ADP_ACCESS_KEY" \
  -H "X-Timestamp: $(date +%s)" \
  -H "X-Signature: $(uuidgen)" \
  -d "{
    \"app_key\": \"$ADP_APP_KEY\",
    \"app_secret\": \"$ADP_APP_SECRET\",
    \"file_url\": \"https://example.com/document.pdf\"
  }"
```
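When to Use
Use ADP For:
- Invoice processing
- Order processing
- Receipt processing
- Financial document processing
- Logistics document processing
- Multi-table document data extraction
Don't Use For:
- Video transcription
- Audio transcription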
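Best Practices
| Document Size | Endpoint | Notes |
|---|---|---|
| Small files | /doc/extract (sync) | Immediate response |
| Large files | /doc/extract/create/task (async) | Poll for results |
File Input:
- file_url: Prefer for large files (already hosted)
- file_base64: Use for direct upload (max 20MB)
Confidence Scores:
- Range: 0-1 per field
- Review fields with confidence <0.8 manually
Response Structure:
- extraction_result: Array of extracted fields
- doc_recognize_result: VLM results (when with_rec_result=true)
- metadata: Processing info (pages, time, model)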
Response Schema
Success Response
json
{
  "status": "success",
  "message": "string",
  "extraction_result": [
    {
      "field_key": "string",
      "field_value": "string",
      "field_type": "text|number|date|table",
      "confidence": 0.95,
      "source_pages": [1],
      "table_data": [...] // when field_type=table
    }
  ],
  "doc_recognize_result": [...