Turn Still Photos Into Living, Breathing Videos
Most image animation tools lock you behind credit systems or monthly caps, forcing you to ration creativity. This skill removes that barrier entirely. Whether you have a single portrait, a product shot, or a landscape photograph, you can describe the movement, pacing, and visual atmosphere you want — and the AI handles the rest. There are no batch limits, no hidden queues, and no watermarks standing between your idea and the final file.
The skill operates through a conversational editing model, meaning you refine your video through plain-language instructions rather than timelines or keyframe editors. Want the camera to slowly pan left? Ask for it. Need the subject to appear to breathe or the background to ripple like water? Describe it. Each iteration responds to your feedback in context, so the output improves with every exchange rather than starting from scratch.
Powering this workflow is the OpenClaw agent, ClawHub's orchestration layer that routes your image and instructions to the appropriate AI backend, manages format conversion, and delivers the rendered video directly in your chosen container — mp4, mov, avi, webm, or mkv. The agent also preserves your session context, so follow-up edits feel like a conversation rather than isolated commands.
Environment Variables
| Variable | Required | Default |
|---|---|---|
| NEMO_TOKEN | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| NEMO_API_URL | No | https://mega-api-prod.nemovideo.ai |
| NEMO_WEB_URL | No | https://nemovideo.com |
| NEMO_CLIENT_ID | No | Auto-generated UUID, persisted to ~/.config/nemovideo/client_id (UUID only, no secrets) |
| SKILL_SOURCE | No | Auto-detected from install path, fallback unknown |
If NEMO_TOKEN is not set, get one (requires X-Client-Id header):
# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
mkdir -p ~/.config/nemovideo && echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}
Save token as NEMO_TOKEN and CLIENT_ID as NEMO_CLIENT_ID. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via Settings → API Tokens on nemovideo.com. If your token expires, request a new one with the same X-Client-Id.
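One dependency-free way to pull the token out of that response is sketched below. The helper name `extract_token` and the sed-based parsing are illustrative assumptions, not part of the skill. The pattern relies on the response shape shown above and would not cope with escaped quotes inside the value; a JSON-aware parser is the safer choice where one is available.

```shell
# Hypothetical helper: print the value of "token" from an anonymous-token response.
# Assumes the token is a plain string with no escaped quotes inside it.
extract_token() {
  printf '%s' "$1" | sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample response in the documented shape (placeholder value, not a real token).
response='{"code":0,"data":{"token":"nmv_usr_xxx","credits":100}}'
NEMO_TOKEN=$(extract_token "$response")
echo "$NEMO_TOKEN"
```

If the response carries no token (for example an error body), the helper prints nothing, so an empty NEMO_TOKEN is the signal to inspect the error code instead.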
Local persistence: This skill writes ~/.config/nemovideo/client_id to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.
2. Routing Incoming Requests
Match each user intent to its corresponding endpoint using the table below.
| User says... | Action | Skip SSE |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
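The routing table can be expressed as a small keyword matcher. This is a minimal sketch: the function name `route_intent` is hypothetical, and plain substring matching is an assumption that will misroute messages that merely mention a keyword (for example an edit request containing the word "export"). File attachments are not text, so the "user sends file" case has to be detected separately.

```shell
# Hypothetical router: print the section number that should handle a user message.
# Keywords are copied from the routing table; ASCII matching is case-insensitive.
route_intent() {
  _msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$_msg" in
    *export*|*导出*|*download*|*"send me the video"*) echo "3.5" ;;
    *credits*|*积分*|*balance*|*余额*)                 echo "3.3" ;;
    *status*|*状态*|*"show tracks"*)                  echo "3.4" ;;
    *upload*|*上传*)                                  echo "3.2" ;;
    *)                                                echo "3.1" ;;  # generate, edit, add BGM...
  esac
}

route_intent "How many credits do I have left?"
```

Only the fallback branch (§3.1) goes through SSE; the other four map to direct API calls.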
3. Primary Workflow Procedures
$API = ${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}, $TOKEN = ${NEMO_TOKEN}, $WEB = ${NEMO_WEB_URL:-https://nemovideo.com}.
Skill attribution variables — do NOT hardcode:
- $SKILL_NAME = this file's YAML frontmatter name
- $SKILL_VERSION = this file's YAML frontmatter version
- $SKILL_SOURCE = detect at runtime using the first match:
  1. Env SKILL_SOURCE if set
  2. If this file's path contains a known platform directory (e.g. ~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor)
  3. Fallback: unknown
CRITICAL: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.
X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
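The detection order for the platform value described above can be sketched as follows. The function name `detect_skill_source` and the example path are assumptions made for illustration; only the two platform directories named in this document are recognized.

```shell
# Hypothetical helper: resolve the value sent as X-Skill-Platform.
# $1 is the path of the skill file. An exported SKILL_SOURCE always wins.
detect_skill_source() {
  if [ -n "${SKILL_SOURCE:-}" ]; then
    echo "$SKILL_SOURCE"
    return
  fi
  case "$1" in
    */.clawhub/*)       echo "clawhub" ;;
    */.cursor/skills/*) echo "cursor" ;;
    *)                  echo "unknown" ;;   # documented fallback
  esac
}

# Example with a made-up install path.
echo "X-Skill-Platform: $(detect_skill_source "/home/user/.clawhub/skills/demo/SKILL.md")"
```

Resolving the value once at startup and reusing it for every request keeps the three attribution headers consistent across calls.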
3.0 Initializing a Session
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
-d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}
A session must be established before any other operations can proceed. This step provisions the context that all subsequent calls will reference.
Open in browser: After creating a session, give the user a link to view/edit the task in NemoVideo:
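$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE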
3.1 Delivering Messages Over SSE
curl -s -X POST "https://mega-api-prod.nemovideo.ai/run_sse" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
-d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'
All conversational messages are transmitted through a persistent SSE connection that streams responses back in real time.
SSE Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Wait silently, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.
Timeout: 10 min heartbeats-only → assume timeout. Never re-send during generation (duplicates + double-charge).
Ignore trailing "I encountered a temporary issue" if prior responses were normal.
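The waiting rules above (a reminder every 2 minutes, timeout after 10 minutes of heartbeats only) reduce to a small decision function. This is a sketch under assumptions: the name `sse_idle_action` is hypothetical, and it expects the caller to invoke it once per second with the number of seconds since the last non-heartbeat event.

```shell
# Hypothetical helper: decide what to do after $1 seconds of heartbeat-only traffic.
# Intended to be called once per second by the stream reader.
sse_idle_action() {
  _idle=$1
  if [ "$_idle" -ge 600 ]; then
    echo "timeout"   # 10 min of heartbeats only: assume timeout
  elif [ "$_idle" -gt 0 ] && [ $((_idle % 120)) -eq 0 ]; then
    echo "notify"    # every 2 min: tell the user "Still working..."
  else
    echo "wait"      # keep waiting; never re-send the message
  fi
}

sse_idle_action 120
```

None of the three outcomes re-sends the request, which matches the warning about duplicates and double charges.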
Silent Response Fallback (CRITICAL)
Approximately 30% of editing operations return no visible text in the stream. When this occurs: (1) do not treat the absence of text as a failure, (2) poll the task state endpoint to confirm the job is progressing, (3) surface a status update to the user such as 'Your request is being processed,' (4) await the completion event before delivering the final result.
Two-stage generation: After the raw video is produced, the backend automatically triggers a second processing stage that layers in background music and a title overlay. Treat this as a two-part pipeline: Stage 1 yields the undecorated clip, and Stage 2 yields the fully composed output. Wait for both stages to complete before presenting the video to the user.
3.2 Handling File Uploads
File upload: INLINECODE34
URL upload: INLINECODE35
Use me in the path; backend resolves user from token.
Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
Image and video file uploads are supported and must be submitted through the designated upload endpoint prior to referencing them in a generation request.
3.3 Checking Available Credits
curl -s "https://mega-api-prod.nemovideo.ai/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}
Query the credits endpoint to verify the user has a sufficient balance before initiating any generation task.
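A pre-flight balance check might look like the sketch below. The helper name `has_credits` and the minimum of 1 credit are assumptions, since the cost of a generation is not stated here; the parsing assumes `available` is an integer in the response shape shown above.

```shell
# Hypothetical helper: succeed if "available" in the balance response is at least $2.
has_credits() {
  _available=$(printf '%s' "$1" | sed -n 's/.*"available"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
  [ -n "$_available" ] && [ "$_available" -ge "$2" ]
}

# Sample response in the documented shape (made-up numbers).
balance='{"code":0,"data":{"available":42,"frozen":0,"total":100}}'
if has_credits "$balance" 1; then
  echo "ok to generate"
else
  echo "insufficient credits"
fi
```

A response without an `available` field is treated as insufficient, so an error body never lets a generation start by accident.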
3.4 Polling Task Status
curl -s "https://mega-api-prod.nemovideo.ai/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
Use me for user in path; backend resolves from token.
Key fields: data.state.draft, data.state.video_infos, data.state.canvas_config, data.state.generated_media.
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Draft ready for export when draft.t exists with at least one track with non-empty sg.
Track summary format:
CODEBLOCK6
3.5 Exporting and Delivering the Final Asset
Export does NOT cost credits. Only generation/editing consumes credits.
To deliver the asset: (a) call the export endpoint with the completed task ID, (b) await the export job confirmation, (c) retrieve the download URL from the response payload, (d) verify the URL is accessible before presenting it, (e) surface the link or embed the video directly for the user.
b) Submit: INLINECODE47
Note: sessionId is camelCase (exception). On failure → new id, retry once.
c) Poll (every 30s, max 10 polls): INLINECODE50
Status at top-level status: pending → processing → completed / failed. Download URL at output.url.
d) Download from output.url → send to user. Fallback: https://mega-api-prod.nemovideo.ai/api/render/proxy/<id>/download.
e) When delivering the video, always also give the task detail link: INLINECODE55
Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + task detail link.
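The polling rules above (status values pending, processing, completed, failed; at most 10 polls; a fresh id on failure) can be sketched as a decision helper. The function name `export_next_step` and the returned labels are hypothetical, and the HTTP calls themselves are left to the caller.

```shell
# Hypothetical helper: $1 = top-level status from the poll response,
# $2 = number of polls already made (one every 30 s, at most 10).
export_next_step() {
  case "$1" in
    completed) echo "download" ;;        # fetch output.url and send it to the user
    failed)    echo "retry-new-id" ;;    # caller must allow this only once
    pending|processing)
      if [ "$2" -ge 10 ]; then
        echo "give-up"
      else
        echo "poll-again"
      fi ;;
    *)         echo "unknown-status" ;;
  esac
}

export_next_step processing 3
```

Keeping the retry counter in the caller makes the "retry once" rule explicit instead of hiding it inside the helper.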
3.6 Recovering from an SSE Disconnection
If the SSE stream drops unexpectedly, follow these steps: (1) detect the disconnection event in your stream listener, (2) wait a brief interval before attempting to reconnect to avoid hammering the server, (3) re-establish the SSE connection using the original session ID, (4) resume polling the task state endpoint to retrieve any events that were missed during the outage, (5) reconcile the recovered state with what was already delivered to the user and continue from the last confirmed checkpoint.
4. Translating GUI Concepts for the Backend
The backend operates under the assumption that all interactions originate from a graphical interface, so GUI-specific language and instructions must never be forwarded verbatim to the API.
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |
Keep content descriptions. Strip GUI actions.
5. Recommended Interaction Patterns
• Confirm the user's intent and any required assets before initiating a generation task to avoid wasted credits.
• Provide incremental progress updates during long-running jobs so the user is never left waiting without feedback.
• When a silent response is detected, proactively reassure the user that processing is underway rather than waiting for them to ask.
• After a completed export, present the final video URL clearly and offer follow-up actions such as re-editing or downloading in a different format.
• On any recoverable error, explain what happened in plain language and automatically retry where the API permits before escalating to the user.
6. Known Limitations
• Generation quality and processing speed are subject to server load and may vary across requests.
• Credit balances are consumed at the point of generation, not at export; failed jobs may still incur a charge depending on how far processing advanced.
• SSE streams do not guarantee message ordering under high-latency conditions; always reconcile final state via the polling endpoint.
• File uploads are subject to size and format restrictions defined by the upload endpoint; unsupported formats will be rejected before a job is queued.
• Concurrent session limits may apply per account; opening excessive parallel sessions can result in rate-limiting or session rejection.
7. Error Handling Reference
Use the table below to map API error codes to their causes and the appropriate recovery actions.
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
Common: no video → generate first; render fail → retry new id; SSE timeout → §3.6; silent edit → §3.1 fallback.
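The error table can be turned into a dispatch helper along these lines. Both function names and the action labels are hypothetical. Two further assumptions are mine rather than the document's: unlisted codes go to "escalate", and the numeric code has already been isolated. Codes 400, 402 and 429 look like HTTP statuses rather than body codes, so the caller may need to pass the HTTP status for those.

```shell
# Hypothetical helper: print the numeric "code" field of a JSON response body.
response_code() {
  printf '%s' "$1" | sed -n 's/.*"code"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: map a code from the table to a recovery action label.
recovery_action() {
  case "$1" in
    0)    echo "continue" ;;
    1001) echo "reauth" ;;              # request a new anonymous token
    1002) echo "new-session" ;;         # §3.0
    2001) echo "no-credits" ;;
    4001) echo "show-formats" ;;
    4002) echo "suggest-compress" ;;
    400)  echo "send-client-id" ;;
    402)  echo "register-to-export" ;;  # subscription tier, not credits
    429)  echo "retry-after-30s" ;;     # once only
    *)    echo "escalate" ;;            # assumption: not covered by the table
  esac
}

recovery_action "$(response_code '{"code":1002,"message":"session not found"}')"
```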
8. Version Information and Token Scopes
Before making any calls, verify that the API version in use matches the version this skill was built and tested against; mismatched versions may cause undocumented behavior. Token scopes must include all permissions required by the endpoints you intend to call — insufficient scopes will result in authorization errors that cannot be resolved at runtime without re-authenticating with the correct scope set.