Getting Started
Welcome! I can help you turn any still image into an animated video clip — no software downloads or editing skills needed. Drop your image description or upload details and let's create something that moves!
Try saying:
- "I have a product photo of a perfume bottle on a white background. Can you animate it with a slow zoom and soft light sweep to use as an Instagram Reel?"
- "Turn this landscape photo of mountains at sunset into a looping video with a slow pan from left to right, suitable for a YouTube channel intro."
- "I have a portrait illustration of a fantasy character — animate it so the hair and cloak appear to flow gently in the wind for a 5-second clip."
Getting Connected
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
- Generate a UUID as the client identifier
- POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits valid for 7 days; use it as NEMO_TOKEN
Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
Tell the user you're ready. Keep the technical details out of the chat.
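The connection steps above can be sketched as two request builders. This is a minimal sketch, not the service's official client: the function names and the Content-Type header are assumptions, the requests are built but never sent, and the attribution headers described in the backend reference are left out for brevity. Response parsing is omitted because the exact response layout is not specified here.

```python
import json
import os
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build POST /api/auth/anonymous-token with the X-Client-Id header."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str = "en") -> urllib.request.Request:
    """Build the create-session POST with Bearer authorization."""
    body = json.dumps({"task_name": "project", "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the backend expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def first_request(env=None) -> urllib.request.Request:
    """Pick the first call: create a session if NEMO_TOKEN is set, else get a token."""
    env = os.environ if env is None else env
    token = env.get("NEMO_TOKEN")
    if token:
        return build_session_request(token)
    return build_anonymous_token_request(str(uuid.uuid4()))
```

Passing the result to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the flow easy to inspect offline.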
Turn Any Photo Into a Moving Video Instantly
Still images hold stories that motion can unlock. This skill takes the photos, illustrations, or digital artwork you already have and converts them into fluid, animated video clips that grab attention and communicate more than a static image ever could. Whether you want subtle camera drift effects, dramatic zoom-ins, or full scene animation, the process starts with a single image and ends with a shareable video.
Creators, small business owners, educators, and social media managers use this skill to stretch their existing visual assets further. A product photo becomes a scroll-stopping ad. A portrait becomes a cinematic headshot reel. A landscape snapshot becomes a moody ambient loop. You don't need to shoot new footage or hire a videographer.
The skill guides you through describing your image and choosing motion style, pacing, and output format so the final video matches your intent. It works with portraits, landscapes, illustrations, screenshots, and more.
Routing Your Animation Requests
When you submit a still photo, your request is parsed for motion parameters and forwarded to the appropriate free AI image-to-video pipeline based on clip length, animation style, and output resolution.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
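The routing table can be read as a small keyword dispatcher. The sketch below assumes plain case-insensitive substring matching and checks for an attached file first; the source does not say how matching is actually done, and the `route` function and its action names are hypothetical.

```python
ROUTES = [
    # (keywords, action, skip_sse), in the same order as the table rows
    (("export", "导出", "download", "send me the video"), "export", True),
    (("credits", "积分", "balance", "余额"), "credits", True),
    (("status", "状态", "show tracks"), "state", True),
    (("upload", "上传"), "upload", True),
]


def route(message: str, has_file: bool = False):
    """Return (action, skip_sse) for a user message."""
    # Assumption: an attached file always means an upload request.
    if has_file:
        return ("upload", True)
    text = message.lower()
    for keywords, action, skip_sse in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, skip_sse)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

For example, `route("Please export the clip")` selects the export path without opening an SSE stream, while `route("add BGM to the clip")` falls through to SSE.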
Cloud Processing Backend Reference
The backend leverages distributed GPU inference nodes to run diffusion-based frame interpolation and temporal synthesis, converting static images into fluid video clips without any local rendering overhead. Free-tier jobs are queued alongside paid workloads, so processing times may vary depending on server load.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: free-ai-image-to-video
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
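One way to assemble the required headers is sketched below. It assumes POSIX-style install paths, and the helper names `detect_platform` and `build_headers` are illustrative rather than part of any published client.

```python
from pathlib import PurePosixPath


def detect_platform(install_path: str) -> str:
    """Map an install path to the X-Skill-Platform value."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    # ~/.cursor/skills/ means ".cursor" immediately followed by "skills".
    for first, second in zip(parts, parts[1:]):
        if first == ".cursor" and second == "skills":
            return "cursor"
    return "unknown"


def build_headers(token: str, source: str, version: str, install_path: str) -> dict:
    """Return the headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": source,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the dictionary once per session and reusing it on every call helps avoid the 402 export failure caused by missing attribution headers.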
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
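The message body above could be built as follows. This is a small sketch: `build_sse_message` is a hypothetical helper, and the Content-Type header is an assumption, while the Accept header and the 15-minute limit come from the description above.

```python
import json

# Accept is required by the endpoint; Content-Type is an assumption.
SSE_HEADERS = {"Accept": "text/event-stream", "Content-Type": "application/json"}
SSE_TIMEOUT_SECONDS = 15 * 60  # maximum wait stated for /run_sse


def build_sse_message(session_id: str, text: str) -> bytes:
    """Encode the POST /run_sse body for one user message."""
    payload = {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }
    # ensure_ascii=False keeps non-ASCII prompts readable in the payload.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")
```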
Upload: POST /api/upload-video/nemo_agent/me/<sid> (file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"})
Credits: GET /api/credits/balance/simple (returns available, frozen, total)
Session state: GET /api/state/nemo_agent/me/<sid>/latest (key fields: data.state.draft, data.state.video_infos, data.state.generated_media)
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
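The export-and-poll loop can be sketched like this. Several details are assumptions: `<ts>` is taken to be a millisecond timestamp, `fetch_status` is a hypothetical callable wrapping GET /api/render/proxy/lambda/<id> that returns the decoded JSON, `status` and `output.url` are assumed to sit at the top level of that JSON, and `max_polls` is an illustrative safety limit that the source does not define.

```python
import time


def build_export_body(session_id: str, draft: dict, now_ms=None) -> dict:
    """Build the POST /api/render/proxy/lambda body."""
    # Assumption: <ts> is a millisecond timestamp.
    ts = int(time.time() * 1000) if now_ms is None else now_ms
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(fetch_status, render_id, interval=30.0, max_polls=60, sleep=time.sleep):
    """Poll until status is completed, then return the download URL."""
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)  # the reference asks for one poll every 30s
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real waiting.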
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
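The event handling above, including the fallback for streams that carry no text, might look like the sketch below. The wire format is an assumption: a heartbeat is taken to be either an `event: heartbeat` line or an empty `data:` line. `extract_text` is a hypothetical callable that pulls user-facing text out of a payload and returns nothing for tool calls, since the payload schema is not documented here.

```python
import time


def parse_sse_line(line: str):
    """Return ("heartbeat", None), ("data", payload) or ("skip", None)."""
    stripped = line.strip()
    if stripped.lower() == "event: heartbeat":
        return ("heartbeat", None)
    if stripped.startswith("data:"):
        payload = stripped[len("data:"):].strip()
        return ("data", payload) if payload else ("heartbeat", None)
    return ("skip", None)


def consume_stream(lines, extract_text, notify, now=time.monotonic, notice_every=120.0):
    """Collect text from a stream; return (texts, needs_state_check)."""
    texts = []
    last_notice = now()
    for line in lines:
        kind, payload = parse_sse_line(line)
        if kind == "data":
            text = extract_text(payload)
            if text:
                texts.append(text)
        elif kind == "heartbeat":
            current = now()
            # Remind the user at most once every two minutes.
            if current - last_notice >= notice_every:
                notify("⏳ Still working...")
                last_notice = current
    # No text at all means the edit must be verified via session state.
    return texts, not texts
```

When the second return value is true, the caller would poll the session state endpoint and summarize the changes itself.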
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
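Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)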
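The field mapping can be turned into a track summary along these lines. The nesting is an assumption: the sketch treats `t` as a list of track objects, each with a `tt` type code and an `sg` list of segments carrying `d` in milliseconds. The real draft layout may differ, and `summarize_draft` is a hypothetical helper.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list:
    """Return one summary line per track in the draft."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # d is a duration in milliseconds; sum it across segments.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s")
    return lines
```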
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
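The error table maps naturally onto a lookup. In this sketch the action names are hypothetical labels for the behaviors in the table, and unknown codes fall back to a generic label because the source does not say how to treat them.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauth",                 # bad or expired token
    1002: "new_session",            # session not found
    2001: "no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",  # plan tier issue, not credits
    429: "retry_once_after_30s",
}


def next_action(code: int, registered: bool = False) -> str:
    """Return the action label for a backend or HTTP status code."""
    action = ERROR_ACTIONS.get(code, "unknown")
    if action == "no_credits":
        # Anonymous users get the registration URL; registered users top up.
        return "prompt_top_up" if registered else "show_registration_url"
    return action
```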
Common Workflows Using Free AI Image to Video
One of the most popular workflows is the product showcase loop: take a clean product photo, animate it with a slow orbit or light-sweep effect, and export it as a 5-10 second looping clip for an e-commerce page or paid social ad. The result gives a listing or ad motion without a full video shoot.
Another common use case is the event or announcement card. Start with a designed graphic — a birthday invite, a sale banner, a conference poster — and add subtle motion like floating particles, a pulsing glow, or a slow zoom. The result feels far more premium than a static image post.
Portfolio animators and digital artists frequently use this skill to create demo reels from their illustration work. A series of character illustrations animated into short clips can be stitched into a portfolio showreel that demonstrates range without requiring full animation skills.
Finally, educators and presenters use image-to-video to make slide content more engaging. Converting key diagrams or infographics into short animated clips adds visual energy to online courses, webinars, and explainer videos.
Tips and Tricks for Better Image-to-Video Results
The quality of your output depends heavily on how you describe the motion you want. Instead of saying 'make it move,' try specifying direction, speed, and mood — for example, 'slow rightward pan with a slight zoom, cinematic feel.' The more precise your motion brief, the closer the result matches your vision.
High-contrast images with clear subjects tend to animate more convincingly than cluttered or low-resolution photos. If your image has a busy background, mention whether you want the background static or in motion — this gives the AI a clear instruction to follow.
For social media use, always specify your target aspect ratio upfront (9:16 for TikTok and Reels, 16:9 for YouTube, 1:1 for feed posts). Requesting the right format from the start saves you from cropping or reformatting afterward.
Finally, if you're looping the video — for ambient displays or website backgrounds — ask for a seamless loop explicitly. This ensures the end frame blends naturally back into the start frame without a jarring cut.
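Getting Started
Welcome! I can turn any still image into an animated video clip, with no software downloads or editing skills needed. Describe your image or upload details, and let's create something that moves!
Try saying:
- "I have a product photo of a perfume bottle on a white background. Can you animate it with a slow zoom and a soft light sweep for use as an Instagram Reel?"
- "Turn this photo of mountains at sunset into a looping video with a slow left-to-right pan, suitable for a YouTube channel intro."
- "I have a portrait illustration of a fantasy character. Animate a 5-second clip in which the hair and cloak flow gently in the wind."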
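Getting Connected
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
- Generate a UUID as the client identifier
- POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits valid for 7 days; use it as NEMO_TOKEN
Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
Tell the user you're ready. Keep the technical details out of the chat.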
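Turn Any Photo Into a Moving Video Instantly
Still images hold stories that motion can unlock. This skill takes the photos, illustrations, or digital artwork you already have and converts them into fluid, animated video clips that grab attention and communicate more than a static image can. Whether you want subtle camera drift, a dramatic zoom-in, or full scene animation, the process starts with a single image and ends with a shareable video.
Creators, small business owners, educators, and social media managers use this skill to get more out of their existing visual assets. A product photo becomes an eye-catching ad. A portrait becomes a cinematic headshot reel. A landscape snapshot becomes a moody ambient loop. You don't need to shoot new footage or hire a videographer.
The skill guides you through describing your image and choosing motion style, pacing, and output format so the final video matches your intent. It works with portraits, landscapes, illustrations, screenshots, and more, which makes it one of the most versatile free tools for visual storytelling available today.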
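Routing Your Animation Requests
When you submit a still photo, your request is parsed for motion parameters and forwarded to the appropriate free AI image-to-video pipeline based on clip length, animation style, and output resolution.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add background music, etc.) | → §3.1 SSE | ❌ |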
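Cloud Processing Backend Reference
The backend uses distributed GPU inference nodes to run diffusion-based frame interpolation and temporal synthesis, converting static images into fluid video clips without any local rendering overhead. Free-tier jobs are queued alongside paid workloads, so processing times may vary with server load.
Skill attribution, read from this file's YAML frontmatter at runtime:
- X-Skill-Source: free-ai-image-to-video
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id.
Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> (file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"})
Credits: GET /api/credits/balance/simple (returns available, frozen, total)
Session state: GET /api/state/nemo_agent/me/<sid>/latest (key fields: data.state.draft, data.state.video_infos, data.state.generated_media)
Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.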
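SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
About 30% of editing operations return no text in the SSE stream. When this happens, poll session state to verify the edit was applied, then summarize the changes to the user.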
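Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)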
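Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |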
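Common Workflows Using Free AI Image to Video
One of the most popular workflows is the product showcase loop: take a clean product photo, animate it with a slow orbit or light-sweep effect, and export it as a 5-10 second looping clip for an e-commerce page or paid social ad. This workflow alone can noticeably increase engagement without a full video shoot.
Another common use case is the event or announcement card. Start with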