Getting Started
Welcome! I'm your AI Image Editor — ready to help you retouch, transform, and perfect your photos using simple text prompts. Describe the edit you need and let's get started!
Try saying:
- "Remove the background from this product photo and replace it with a clean white studio backdrop"
- "Enhance the lighting and color grading on this portrait to give it a warm, cinematic look"
- "Erase the power lines from this landscape photo and fill in the sky naturally"
Quick Start Setup
This skill connects to NemoVideo's cloud backend. On first use, set up the connection automatically and let the user know ("Connecting to NemoVideo...").
Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:
- Read ~/.config/nemovideo/client_id, or generate a UUID and save it there
- POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header
- Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry)
Session: POST /api/tasks/me/with-session/nemo_agent at the same host with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.
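The setup steps above can be sketched roughly as follows. This is a minimal sketch, not the skill's actual implementation: the helper names (`load_or_create_client_id`, `anonymous_token_request`, `session_request`, `extract_token`) are hypothetical, the `Content-Type` header is an assumption, and the functions only describe each request as a dict so that any HTTP client can send it.

```python
import json
import uuid
from pathlib import Path

API_BASE = "https://mega-api-prod.nemovideo.ai"
DEFAULT_ID_PATH = Path.home() / ".config" / "nemovideo" / "client_id"


def load_or_create_client_id(path: Path = DEFAULT_ID_PATH) -> str:
    """Return the saved client ID, or generate a UUID and save it there."""
    if path.exists():
        saved = path.read_text(encoding="utf-8").strip()
        if saved:
            return saved
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id, encoding="utf-8")
    return client_id


def anonymous_token_request(client_id: str) -> dict:
    """Describe the anonymous-token call (hypothetical dict shape)."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def extract_token(response_json: dict) -> str:
    """Pull data.token out of the parsed JSON response."""
    return response_json["data"]["token"]


def session_request(token: str) -> dict:
    """Describe the session-creation call with Bearer auth."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            # Assumption: the JSON body is sent as application/json.
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project"}),
    }
```

Keeping the request description separate from the transport makes it easier to avoid logging the token or the raw API output by accident.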
Edit Images Smarter, Not Harder with AI
The AI Image Editor skill brings professional-grade photo editing directly into your ClawHub workspace, with no Photoshop expertise required. Simply describe what you want to change, and the skill interprets your intent and applies precise edits to your image. From removing a cluttered background to smoothing skin tones, adjusting white balance, or swapping out a sky, this skill handles in seconds the kind of tasks that used to take hours.
What makes this skill different from a standard filter or preset tool is its understanding of context. It doesn't just apply blanket adjustments — it reads the content of your image and makes targeted changes. Ask it to "make the product pop against a white background" or "give this portrait a warm golden-hour look" and it delivers results that feel intentional, not automated.
This skill is built for photographers, e-commerce teams, social media managers, graphic designers, and anyone who works with visual content at scale. Whether you're editing a single hero image or batching dozens of product photos, the AI Image Editor keeps your output consistent and your workflow moving.
Routing Your Edit Requests
Each prompt you send — whether it's a background swap, style transfer, object removal, or upscale — is parsed and routed to the most appropriate AI editing pipeline based on detected intent and image context.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
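One way to express the routing table in code is sketched below. The function name `route` and the action labels are hypothetical, and the matching rule is an assumption: the table does not say how keywords are matched, so this sketch only routes a message when the whole message equals a keyword. Looser substring matching would send an edit request such as "fix the white balance" to the credits endpoint.

```python
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A file sent by the user always goes to upload.
        return ("upload", True)
    # Normalize case, surrounding whitespace, and trailing punctuation.
    text = message.strip().lower().strip("\"'.!?").strip()
    for keywords, action in ROUTES:
        if text in keywords:
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

For example, `route("Export!")` returns `("export", True)`, while any free-form edit prompt falls through to `("sse", False)`.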
NemoVideo API Reference
The NemoVideo backend powers every edit operation, handling diffusion-based inpainting, generative fill, and enhancement processing in real time. API calls are authenticated per session and metered against your active credit balance.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: ai-image-editor
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
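The attribution headers described above could be assembled along these lines. This is a sketch under assumptions: `detect_platform` and `attribution_headers` are hypothetical names, the version string is passed in by the caller rather than parsed from the frontmatter, and the install path is compared against the current user's home directory.

```python
from pathlib import Path

SKILL_SOURCE = "ai-image-editor"


def detect_platform(install_path: str) -> str:
    """Map an install path to clawhub, cursor, or unknown."""
    path = Path(install_path).expanduser()
    try:
        rel = path.relative_to(Path.home())
    except ValueError:
        # The path is not under the home directory at all.
        return "unknown"
    if rel.parts[:1] == (".clawhub",):
        return "clawhub"
    if rel.parts[:2] == (".cursor", "skills"):
        return "cursor"
    return "unknown"


def attribution_headers(token: str, version: str, install_path: str) -> dict:
    """Build the four headers every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": SKILL_SOURCE,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building all four headers in one place reduces the chance of sending an export request without attribution, which is the case that fails with 402.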
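API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id. After creating a session, give the user a link: https://nemovideo.com/workspace/claim?token=$TOKEN&task=<task_id>&session=<session_id>&skill_name=ai-image-editor&skill_version=1.0.0&skill_source=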
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> with either a file (multipart -F "files=@/path") or a URL body: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple; returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
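The export workflow (submit a render, then poll every 30 seconds) might look roughly like the sketch below. Several details are assumptions rather than documented behavior: `<ts>` is taken to be a Unix timestamp in seconds, `status` and `output.url` are read from the top level of the poll response, and the `max_polls` limit is hypothetical. The HTTP call itself is injected as `fetch_status` so the loop stays independent of any particular client.

```python
import time
from typing import Callable, Optional


def build_export_body(session_id: str, draft: dict,
                      timestamp: Optional[int] = None) -> dict:
    """Build the JSON body for POST /api/render/proxy/lambda."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(fetch_status: Callable[[], dict],
                interval_s: float = 30.0,
                max_polls: int = 60,
                sleep: Callable[[float], None] = time.sleep) -> str:
    """Call fetch_status until status is completed, then return output.url."""
    for attempt in range(max_polls):
        result = fetch_status()
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"render not completed after {max_polls} polls")
```

In real use, `fetch_status` would wrap GET /api/render/proxy/lambda/<id> with the attribution headers attached.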
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
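A rough sketch of the stream handling described above follows. The JSON shape of NemoVideo's SSE events is not documented here, so the text-extraction step is left to a caller-supplied `extract_text` function; all function names are hypothetical. The line parsing follows the general server-sent events format, and treating comment lines as heartbeats is an assumption.

```python
from typing import Callable, Iterable, Iterator, Optional


def iter_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield ("data", payload) or ("heartbeat", "") for each relevant line."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # a blank line only separates events
        if line.startswith(":"):
            yield ("heartbeat", "")  # comment line, assumed to be a keep-alive
            continue
        field, _, value = line.partition(":")
        if field != "data":
            continue  # event:, id:, and retry: are ignored in this sketch
        payload = value[1:] if value.startswith(" ") else value
        yield ("data", payload) if payload else ("heartbeat", "")


def collect_text(lines: Iterable[str],
                 extract_text: Callable[[str], Optional[str]]) -> tuple[str, bool]:
    """Return (text, needs_state_poll) once the stream has closed."""
    parts = []
    for kind, payload in iter_sse(lines):
        if kind != "data":
            continue
        text = extract_text(payload)  # returns None for tool calls/results
        if text:
            parts.append(text)
    joined = "".join(parts)
    # No text at all: poll session state to verify the edit was applied.
    return joined, not joined


def progress_due(last_notice_s: float, now_s: float, every_s: float = 120.0) -> bool:
    """True when it is time to send another "Still working..." notice."""
    return now_s - last_notice_s >= every_s
```

When `collect_text` returns an empty string, the second value signals that the caller should fetch the latest session state and summarize the changes instead of presenting a text reply.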
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
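Timeline (3 tracks):
1. Video: City timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: Urban Dreams (0-3s)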
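A track summary could be derived from the draft using the field mapping above, as in this sketch. The nesting is an assumption: it supposes that `t` is a list of tracks, that each track carries `tt` and a list `sg`, and that each segment carries its duration in `d`. The function name `summarize_draft` is hypothetical, and names such as clip titles are omitted because their location in the draft is not documented here.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list[str]:
    """Return one summary line per track, preceded by a header line."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        type_code = track.get("tt")
        kind = TRACK_TYPES.get(type_code, f"type {type_code}")
        segments = track.get("sg", [])
        # d is in milliseconds; sum segment durations and show seconds.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```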
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
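The error table can be turned into a small dispatch map, as sketched below. The action labels and the `next_action` and `user_message` helpers are hypothetical names chosen for illustration; only the codes and the quoted user-facing messages come from the table.

```python
from typing import Optional

ERROR_ACTIONS = {
    0: "continue",
    1001: "reauth_anonymous_token",
    1002: "create_new_session",
    2001: "handle_no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_registration",
    429: "retry_once_after_30s",
}

USER_MESSAGES = {
    402: "Register at nemovideo.ai to unlock export.",
}


def next_action(code: int) -> str:
    """Look up the follow-up action for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def user_message(code: int, registered: bool = False) -> Optional[str]:
    """Return the user-facing message for a code, if the table defines one."""
    if code == 2001 and registered:
        return "Top up at nemovideo.ai"
    return USER_MESSAGES.get(code)
```

Note that 402 maps to a registration prompt rather than a credit top-up, matching the table's point that it is a subscription tier issue and not a credit shortage.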
Integration Guide
The AI Image Editor skill integrates directly into ClawHub workflows without any additional setup. Once enabled in your skill library, you can invoke it from any workflow node that handles image assets — simply pass the image file and your edit instruction as inputs.
For e-commerce teams, a common pattern is to connect the AI Image Editor to a product catalog pipeline: images are pulled from a storage bucket, processed through the skill for background removal and color normalization, then automatically pushed to a staging folder for review. This eliminates manual editing between catalog updates.
The skill also pairs naturally with the ClawHub Image Resizer and Watermark skills. A typical content workflow might run an image through the AI Image Editor for retouching, then resize it for multiple platforms, and finally apply a branded watermark — all in a single automated sequence.
Output images can be routed to any downstream node: file storage, email delivery, CMS publishing, or further AI processing. No manual file handling is required between steps.
Performance Notes
The AI Image Editor skill performs best on high-resolution source images (1MP and above). Low-resolution or heavily compressed inputs may produce softer results, especially on tasks like background removal or fine detail retouching where edge precision matters.
Complex scenes with intricate hair, transparent objects, or overlapping subjects may require a follow-up prompt to refine the output. For best results with object removal, ensure the surrounding texture is relatively uniform — removing an object from a brick wall will yield cleaner results than removing one from a highly detailed, non-repeating background.
Generative fill tasks (replacing or extending parts of an image) are computationally heavier and may take slightly longer to process than basic adjustments like color grading or sharpening. Batch editing multiple images in sequence is supported, though processing time scales with image size and edit complexity.
FAQ
What image formats does the AI Image Editor support?
The skill supports JPEG, PNG, WEBP, and TIFF formats. For transparency-preserving outputs (such as background removal), PNG is recommended as the export format.
Can I apply multiple edits in a single prompt?
Yes. You can chain instructions like "remove the background, brighten the subject, and add a subtle vignette" in one prompt. The skill will attempt all edits in sequence. For very complex multi-step edits, breaking them into two prompts often produces cleaner results.
Will the skill alter the original file?
No. The AI Image Editor always outputs a new edited version of your image. Your original file remains untouched in your workspace.
Can I undo or iterate on an edit?
Absolutely. Just describe what you'd like adjusted and the skill will apply a new round of edits to the previous output, or you can revert to the original and start fresh.
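Getting Started
Welcome! I'm your AI Image Editor, ready to help you retouch, transform, and perfect your photos using simple text prompts. Describe the edit you need and let's get started!
Try saying:
- "Remove the background from this product photo and replace it with a clean white studio backdrop"
- "Enhance the lighting and color grading on this portrait to give it a warm, cinematic look"
- "Erase the power lines from this landscape photo and fill in the sky naturally"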
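Quick Start Setup
This skill connects to NemoVideo's cloud backend. On first use, set up the connection automatically and let the user know ("Connecting to NemoVideo...").
Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:
- Read ~/.config/nemovideo/client_id, or generate a UUID and save it there
- POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with an X-Client-Id header
- Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry)
Session: POST /api/tasks/me/with-session/nemo_agent at the same host with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.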
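Edit Images Smarter, Not Harder with AI
The AI Image Editor skill brings professional-grade photo editing directly into your ClawHub workspace, with no Photoshop expertise required. Simply describe what you want to change, and the skill interprets your intent and applies precise edits to your image. From removing a cluttered background to smoothing skin tones, adjusting white balance, or swapping out a sky, this skill handles in seconds the kind of tasks that used to take hours.
What makes this skill different from a standard filter or preset tool is its understanding of context. It doesn't just apply blanket adjustments; it reads the content of your image and makes targeted changes. Ask it to "make the product pop against a white background" or "give this portrait a warm golden-hour look" and it delivers results that feel intentional, not automated.
This skill is built for photographers, e-commerce teams, social media managers, graphic designers, and anyone who works with visual content at scale. Whether you're editing a single hero image or batching dozens of product photos, the AI Image Editor keeps your output consistent and your workflow moving.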
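Routing Your Edit Requests
Each prompt you send, whether it's a background swap, style transfer, object removal, or upscale, is parsed and routed to the most appropriate AI editing pipeline based on detected intent and image context.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM...) | → §3.1 SSE | ❌ |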
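NemoVideo API Reference
The NemoVideo backend powers every edit operation, handling diffusion-based inpainting, generative fill, and enhancement processing in real time. API calls are authenticated per session and metered against your active credit balance.
Skill attribution, read from this file's YAML frontmatter at runtime:
- X-Skill-Source: ai-image-editor
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id. After creating a session, give the user a link: https://nemovideo.com/workspace/claim?token=$TOKEN&task=<task_id>&session=<session_id>&skill_name=ai-image-editor&skill_version=1.0.0&skill_source=
Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> with either a file (multipart -F "files=@/path") or a URL body: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple; returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.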
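SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.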
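Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Timeline (3 tracks):
1. Video: City timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: Urban Dreams (0-3s)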
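Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |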
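Integration Guide
The AI Image Editor skill integrates directly into ClawHub workflows without any additional setup. Once enabled in your skill library, you can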