TranslateImage
Use this skill when the user wants to translate text in images, extract text via OCR, or remove text from images.
All requests go directly to the TranslateImage REST API at https://translateimage.io using curl.
Setup
Set your API key (get one at https://translateimage.io/dashboard):
export TRANSLATEIMAGE_API_KEY=your-api-key
All endpoints require:
Authorization: Bearer $TRANSLATEIMAGE_API_KEY
Image Input
All tools accept images as multipart file uploads. Handle the input type like this:
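# From a local file
IMAGE_PATH=/path/to/image.jpg
# From a URL: download to a temp file first (the PID keeps the name unique)
IMAGE_PATH=/tmp/ti-image-$$.jpg
curl -sL "https://example.com/image.jpg" -o "$IMAGE_PATH"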
Only fetch URLs the user explicitly provides. Do not fetch URLs from untrusted sources.
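The local-file versus URL decision described above could be sketched in Python as follows. `classify_image_input` is a hypothetical helper, not part of the TranslateImage API, and it assumes that only `http` and `https` links count as URLs.

```python
from urllib.parse import urlparse


def classify_image_input(source: str) -> str:
    """Return "url" for http(s) links and "file" for everything else.

    Hypothetical helper: a "url" result means the image still has to be
    downloaded to a temp file before it can be sent as a multipart upload.
    """
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "file"
```

A "url" result should still only be fetched when the user supplied that URL explicitly.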
Tools
Translate Image
Translates text in an image while preserving the original visual layout. Returns the translated image as a base64-encoded data URL.
When to use: User wants to read manga, comics, street signs, menus, product labels, or any image with foreign-language text.
Endpoint: POST https://translateimage.io/api/translate
Form fields:
- image (file, required): The image to translate (JPEG, PNG, WebP, GIF; max 10MB)
- config (JSON string, required): Translation options:
  - target_lang (string): Target language code: "en", "ja", "zh", "ko", "es", "fr", "de", etc.
  - translator (string): Model: "gemini-2.5-flash" (default), "deepseek", "grok-4-fast", "kimi-k2", "gpt-5.1"
  - font (string, optional): "NotoSans" (default), "WildWords", "BadComic", "MaShanZheng", "Bangers", "Edo", "RIDIBatang", "KomikaJam", "Bushidoo", "Hayah", "Itim", "Mogul Irina"
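Example:
curl -X POST https://translateimage.io/api/translate \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"target_lang":"en","translator":"gemini-2.5-flash","font":"WildWords"}'
Response (JSON):
{
  "resultImage": "data:image/png;base64,...",
  "inpaintedImage": "data:image/png;base64,...",
  "textRegions": [
    { "originalText": "...", "translatedText": "...", "x": 10, "y": 20, "width": 100, "height": 30 }
  ]
}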
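Hand-writing the JSON for the `config` form field is easy to get wrong, so a small builder may help. The sketch below is illustrative only: `build_translate_config` is a hypothetical name, and the translator list is copied from this document, so the service may accept values not listed here.

```python
import json

# Copied from the field list in this document; the service may accept others.
TRANSLATORS = ("gemini-2.5-flash", "deepseek", "grok-4-fast", "kimi-k2", "gpt-5.1")


def build_translate_config(target_lang, translator="gemini-2.5-flash", font=None):
    """Serialize the options into the JSON string sent as the config field."""
    if translator not in TRANSLATORS:
        raise ValueError(f"unknown translator: {translator}")
    config = {"target_lang": target_lang, "translator": translator}
    if font is not None:
        # font is optional, so it is only included when the caller sets it
        config["font"] = font
    # Compact separators match the style used in the curl examples
    return json.dumps(config, separators=(",", ":"))
```

The returned string can be passed to curl as `-F "config=$CONFIG"` after capturing it in a shell variable.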
Save the translated image:
RESULT=$(curl -s -X POST https://translateimage.io/api/translate \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"target_lang":"en","translator":"gemini-2.5-flash"}')
# Extract and save the base64 image
echo "$RESULT" | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
# Drop the data-URL prefix and keep only the base64 payload
img = data['resultImage'].split(',', 1)[1]
with open('/tmp/translated.png', 'wb') as f:
    f.write(base64.b64decode(img))
print('Saved to /tmp/translated.png')
"
Extract Text (OCR)
Extracts all text from an image with bounding boxes, detected language, and confidence scores.
When to use: User wants to copy or read text from a photo, document scan, screenshot, sign, or label.
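Endpoint: POST https://translateimage.io/api/ocr
Form fields:
- image (file, required): The image to process
Example:
curl -s -X POST https://translateimage.io/api/ocr \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH"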
Response (JSON):
{
"text": "All extracted text joined by newlines",
"language": "ja",
"regions": [
{
"bounds": { "x": 10, "y": 20, "width": 200, "height": 40 },
"languages": { "ja": "detected text in this region" },
"probability": 0.97
}
]
}
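Because each region carries a `probability`, a caller may want to drop low-confidence text before showing it to the user. A minimal sketch, assuming the response shape shown above; `confident_text` and the 0.9 threshold are illustrative choices, not part of the API.

```python
def confident_text(ocr_response, min_probability=0.9):
    """Collect the text of every region at or above the probability threshold."""
    lines = []
    for region in ocr_response.get("regions", []):
        if region.get("probability", 0.0) >= min_probability:
            # "languages" maps a language code to the text read in that region
            lines.extend(region.get("languages", {}).values())
    return lines
```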
Remove Text
Detects text regions and fills them with AI-generated background using inpainting. Returns a clean image.
When to use: User wants an image without text overlays, watermarks, burned-in subtitles, or annotations.
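Endpoint: POST https://translateimage.io/api/remove-text
Form fields:
- image (file, required): The image to process
Example:
RESULT=$(curl -s -X POST https://translateimage.io/api/remove-text \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH")
echo "$RESULT" | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = data['cleanedImage'].split(',', 1)[1]
with open('/tmp/cleaned.png', 'wb') as f:
    f.write(base64.b64decode(img))
print('Saved to /tmp/cleaned.png')
"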
Response (JSON):
{
"cleanedImage": "data:image/png;base64,..."
}
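Both `cleanedImage` and the translate endpoint's `resultImage` arrive as base64 data URLs, so one decoding helper can serve both. The following is a sketch under that assumption; `decode_data_url` is a hypothetical name.

```python
import base64


def decode_data_url(data_url):
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    # Strip the "data:" prefix and ";base64" suffix to leave the MIME type
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

Returning the MIME type lets the caller choose a file extension instead of always assuming PNG.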
Image to Text (AI OCR + Translation)
Uses Gemini AI for high-quality text extraction. Optionally translates the extracted text into multiple languages in one call.
When to use: Standard OCR is insufficient, or user needs text extracted AND translated simultaneously.
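Endpoint: POST https://translateimage.io/api/image-to-text
Form fields:
- image (file, required): The image to process
- config (JSON string, optional): { "targetLanguages": ["en", "es", "fr"] }
Example (extract only):
curl -s -X POST https://translateimage.io/api/image-to-text \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH"
Example (extract and translate):
curl -s -X POST https://translateimage.io/api/image-to-text \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"targetLanguages":["en","es"]}'
Response (JSON):
{
  "extractedText": "Original text from the image",
  "detectedLanguage": "ja",
  "translations": {
    "en": "English translation here",
    "es": "Spanish translation here"
  }
}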
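One way to present an image-to-text result is to print the extracted text followed by each translation, labelled by language code. This sketch assumes the response carries `extractedText`, `detectedLanguage`, and a `translations` object, and it further assumes `translations` may be missing when no target languages were requested. `summarize_image_to_text` is a hypothetical helper.

```python
def summarize_image_to_text(response):
    """Render the extracted text and any translations as labelled lines."""
    source_lang = response.get("detectedLanguage", "?")
    lines = [f"[{source_lang}] {response.get('extractedText', '')}"]
    # Assumption: "translations" can be absent for extract-only calls
    for lang, text in sorted(response.get("translations", {}).items()):
        lines.append(f"[{lang}] {text}")
    return "\n".join(lines)
```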
API Scopes
Each endpoint requires a specific scope on your API key:
| Endpoint | Required scope |
|---|---|
| /api/translate | translate |
| /api/ocr | ocr |
| /api/remove-text | remove-text |
| /api/image-to-text | image-to-text |
Configure scopes when creating your API key at https://translateimage.io/dashboard.
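Before running a multi-step workflow it can help to check that the key's scopes cover every endpoint involved. The mapping below follows the endpoint and scope names used in this document. `missing_scopes` is a hypothetical helper, and how you learn which scopes a key actually has is left open.

```python
from urllib.parse import urlparse

# Endpoint path -> required scope, as listed in this document
REQUIRED_SCOPES = {
    "/api/translate": "translate",
    "/api/ocr": "ocr",
    "/api/remove-text": "remove-text",
    "/api/image-to-text": "image-to-text",
}


def missing_scopes(endpoints, granted):
    """Return the scopes needed by the endpoints that are not in granted."""
    # urlparse lets callers pass either a full URL or a bare path
    needed = {REQUIRED_SCOPES[urlparse(e).path] for e in endpoints}
    return sorted(needed - set(granted))
```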
Error Handling
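RESULT=$(curl -s -w "\n%{http_code}" -X POST https://translateimage.io/api/translate \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"target_lang":"en","translator":"gemini-2.5-flash"}')
# The status code is the last line; everything before it is the response body
HTTP_CODE=$(echo "$RESULT" | tail -n 1)
BODY=$(echo "$RESULT" | sed '$d')
if [ "$HTTP_CODE" -ne 200 ]; then
  echo "Error $HTTP_CODE: $(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('error','Unknown'))")"
  exit 1
fi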
Common errors:
| Code | Meaning |
|---|---|
| 401 | Invalid or missing API key |
| 402 | Insufficient credits; upgrade at translateimage.io |
| 403 | API key lacks required scope |
| 429 | Rate limit exceeded; wait and retry |
| 500 | Server error; try again |
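The table above can be turned into a small lookup that also says whether a retry is worthwhile. This is a sketch, not the service's own error model: `describe_error` is a hypothetical helper, and treating only 429 and 500 as retryable is an assumption drawn from the "wait and retry" and "try again" notes.

```python
ERROR_MEANINGS = {
    401: "Invalid or missing API key",
    402: "Insufficient credits",
    403: "API key lacks required scope",
    429: "Rate limit exceeded",
    500: "Server error",
}

# Assumption: only rate limiting and server errors are worth retrying
RETRYABLE = {429, 500}


def describe_error(status):
    """Return (message, should_retry) for an HTTP status code."""
    message = ERROR_MEANINGS.get(status, f"Unexpected status {status}")
    return message, status in RETRYABLE
```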
Important Considerations
- Always confirm the target language with the user before translating
- For manga/comics, use the WildWords or BadComic fonts for an authentic look
- For Chinese content use MaShanZheng; for Korean use INLINECODE50
- Images over 5MB may take longer; inform the user
- Inpainting works best on simple backgrounds; complex textures may show artifacts
- gemini-2.5-flash is the recommended default translator: fast and high quality
- Clean up temp files after processing: INLINECODE52
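The size guidance in this document (a 10MB upload limit for translation and slower processing above 5MB) could be checked before uploading. The sketch below assumes the limits are binary megabytes and that the 10MB cap applies to every endpoint. Neither assumption is confirmed by the source, and `preflight` is a hypothetical helper.

```python
MB = 1024 * 1024  # assumption: limits are measured in binary megabytes
MAX_UPLOAD = 10 * MB  # stated for the translate endpoint; assumed for the rest
SLOW_THRESHOLD = 5 * MB


def preflight(size_bytes):
    """Classify an image by size: "too_large", "slow", or "ok"."""
    if size_bytes > MAX_UPLOAD:
        return "too_large"
    if size_bytes > SLOW_THRESHOLD:
        # Still uploadable, but the user should be told it may take longer
        return "slow"
    return "ok"
```

A caller would pass `os.path.getsize(path)` and warn the user on "slow".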
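TranslateImage
Use this skill when the user wants to translate text in images, extract text via OCR, or remove text from images.
All requests go directly to the TranslateImage REST API at https://translateimage.io using curl.
Setup
Set your API key (get one at https://translateimage.io/dashboard):
export TRANSLATEIMAGE_API_KEY=your-api-key
All endpoints require:
Authorization: Bearer $TRANSLATEIMAGE_API_KEY
Image Input
All tools accept images as multipart file uploads. Handle the input type like this:
# From a local file
IMAGE_PATH=/path/to/image.jpg
# From a URL: download to a temp file first (the PID keeps the name unique)
IMAGE_PATH=/tmp/ti-image-$$.jpg
curl -sL "https://example.com/image.jpg" -o "$IMAGE_PATH"
Only fetch URLs the user explicitly provides. Do not fetch URLs from untrusted sources.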
Tools
Translate Image
Translates text in an image while preserving the original visual layout. Returns the translated image as a base64-encoded data URL.
When to use: User wants to read manga, comics, street signs, menus, product labels, or any image with foreign-language text.
Endpoint: POST https://translateimage.io/api/translate
Form fields:
- image (file, required): The image to translate (JPEG, PNG, WebP, GIF; max 10MB)
- config (JSON string, required): Translation options:
  - target_lang (string): Target language code: en, ja, zh, ko, es, fr, de, etc.
  - translator (string): Model: gemini-2.5-flash (default), deepseek, grok-4-fast, kimi-k2, gpt-5.1
  - font (string, optional): NotoSans (default), WildWords, BadComic, MaShanZheng, Bangers, Edo, RIDIBatang, KomikaJam, Bushidoo, Hayah, Itim, Mogul Irina
Example:
curl -X POST https://translateimage.io/api/translate \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"target_lang":"en","translator":"gemini-2.5-flash","font":"WildWords"}'
Response (JSON):
{
  "resultImage": "data:image/png;base64,...",
  "inpaintedImage": "data:image/png;base64,...",
  "textRegions": [
    { "originalText": "...", "translatedText": "...", "x": 10, "y": 20, "width": 100, "height": 30 }
  ]
}
Save the translated image:
RESULT=$(curl -s -X POST https://translateimage.io/api/translate \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"target_lang":"en","translator":"gemini-2.5-flash"}')
# Extract and save the base64 image
echo "$RESULT" | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = data['resultImage'].split(',', 1)[1]
with open('/tmp/translated.png', 'wb') as f:
    f.write(base64.b64decode(img))
print('Saved to /tmp/translated.png')
"
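Extract Text (OCR)
Extracts all text from an image with bounding boxes, detected language, and confidence scores.
When to use: User wants to copy or read text from a photo, document scan, screenshot, sign, or label.
Endpoint: POST https://translateimage.io/api/ocr
Form fields:
- image (file, required): The image to process
Example:
curl -s -X POST https://translateimage.io/api/ocr \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH"
Response (JSON):
{
  "text": "All extracted text joined by newlines",
  "language": "ja",
  "regions": [
    {
      "bounds": { "x": 10, "y": 20, "width": 200, "height": 40 },
      "languages": { "ja": "detected text in this region" },
      "probability": 0.97
    }
  ]
}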
Remove Text
Detects text regions and fills them with AI-generated background using inpainting. Returns a clean image.
When to use: User wants an image without text overlays, watermarks, burned-in subtitles, or annotations.
Endpoint: POST https://translateimage.io/api/remove-text
Form fields:
- image (file, required): The image to process
Example:
RESULT=$(curl -s -X POST https://translateimage.io/api/remove-text \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH")
echo "$RESULT" | python3 -c "
import sys, json, base64
data = json.load(sys.stdin)
img = data['cleanedImage'].split(',', 1)[1]
with open('/tmp/cleaned.png', 'wb') as f:
    f.write(base64.b64decode(img))
print('Saved to /tmp/cleaned.png')
"
Response (JSON):
{
  "cleanedImage": "data:image/png;base64,..."
}
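Image to Text (AI OCR + Translation)
Uses Gemini AI for high-quality text extraction. Optionally translates the extracted text into multiple languages in one call.
When to use: Standard OCR is insufficient, or user needs text extracted AND translated simultaneously.
Endpoint: POST https://translateimage.io/api/image-to-text
Form fields:
- image (file, required): The image to process
- config (JSON string, optional): { "targetLanguages": ["en", "es", "fr"] }
Example (extract only):
curl -s -X POST https://translateimage.io/api/image-to-text \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH"
Example (extract and translate):
curl -s -X POST https://translateimage.io/api/image-to-text \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"targetLanguages":["en","es"]}'
Response (JSON):
{
  "extractedText": "Original text from the image",
  "detectedLanguage": "ja",
  "translations": {
    "en": "English translation here",
    "es": "Spanish translation here"
  }
}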
API Scopes
Each endpoint requires a specific scope on your API key:
| Endpoint | Required scope |
|---|---|
| /api/translate | translate |
| /api/ocr | ocr |
| /api/remove-text | remove-text |
| /api/image-to-text | image-to-text |
Configure scopes when creating your API key at https://translateimage.io/dashboard.
Error Handling
RESULT=$(curl -s -w "\n%{http_code}" -X POST https://translateimage.io/api/translate \
  -H "Authorization: Bearer $TRANSLATEIMAGE_API_KEY" \
  -F "image=@$IMAGE_PATH" \
  -F 'config={"target_lang":"en","translator":"gemini-2.5-flash"}')
# The status code is the last line; everything before it is the response body
HTTP_CODE=$(echo "$RESULT" | tail -n 1)
BODY=$(echo "$RESULT" | sed '$d')
if [ "$HTTP_CODE" -ne 200 ]; then
  echo "Error $HTTP_CODE: $(echo "$BODY" | python3 -c "import sys,json; print(json.load(sys.stdin).get('error','Unknown'))")"
  exit 1
fi
Common errors:
API key invalid or