Getting Started
Welcome! Ready to turn your photos into eye-catching videos for free? Share your image and tell me the style or motion effect you'd like, and I'll generate your video right away!
Try saying:
- "Animate this product photo with a slow zoom-in effect so I can post it as a video ad on Instagram"
- "Create a slideshow video from these 5 travel photos with smooth crossfade transitions and a cinematic feel"
- "Turn my portrait photo into a short looping video with a subtle floating motion effect"
Getting Connected
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
- Ensure a client identifier exists at ~/.config/image-to-video-generator-free/client_id (create one as a UUID if needed)
- POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits valid for 7 days; use it as NEMO_TOKEN
Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
Tell the user you're ready. Keep the technical details out of the chat.
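The connection steps above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the helper names are hypothetical, the Content-Type header is an assumption, and the requests are only constructed, not sent. The attribution headers required elsewhere in this document are omitted for brevity.

```python
import json
import uuid
from pathlib import Path
from urllib.request import Request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def load_or_create_client_id(path: Path) -> str:
    """Return the stored client ID, writing a fresh UUID if none exists."""
    if path.exists():
        stored = path.read_text().strip()
        if stored:
            return stored
    path.parent.mkdir(parents=True, exist_ok=True)
    client_id = str(uuid.uuid4())
    path.write_text(client_id)
    return client_id


def anonymous_token_request(client_id: str) -> Request:
    """Build (but do not send) the anonymous-token request."""
    return Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def create_session_request(token: str, language: str = "en") -> Request:
    """Build (but do not send) the create-session request."""
    body = json.dumps({"task_name": "project", "language": language}).encode()
    return Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        # Content-Type is an assumption; the document only specifies the body.
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
```

A caller would pass the resulting objects to urllib.request.urlopen and read the token and session_id from the JSON responses.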
Turn Still Photos Into Captivating Videos Instantly
Most people have folders full of great photos that never get the attention they deserve simply because static images scroll by unnoticed. Video content consistently outperforms photos across every major platform — but producing video traditionally requires expensive tools, editing skills, and hours of work. That gap is exactly what this skill is built to close.
With the image-to-video-generator-free skill, you upload one or more images and describe the kind of video motion or style you want, and the skill handles the creative heavy lifting. Whether you want a gentle Ken Burns pan across a landscape photo, a dramatic zoom into a product shot, or a slideshow with smooth transitions set to a mood, the skill generates video output tailored to your request.
This skill is ideal for social media managers building reels, educators creating visual lesson content, e-commerce sellers showcasing products, and anyone who wants to repurpose existing photo libraries into fresh video content — all without spending a dollar or learning a new editing application.
Routing Your Animation Requests
When you submit a photo for animation, your request is parsed for motion style, duration, and output resolution before being dispatched to the appropriate rendering pipeline.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
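One way to express the routing table in code is a keyword lookup. The sketch below assumes case-insensitive substring matching, which the document does not specify; the function and action names are hypothetical.

```python
# Keyword groups taken from the routing table; first match wins.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A user-sent file always goes to the upload path.
        return "upload", True
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

Substring matching is deliberately simple and will misroute a message that only mentions a keyword in passing, so a real router may need stricter intent detection.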
Cloud Rendering API Reference
The free image-to-video backend leverages distributed GPU clusters to handle frame interpolation, motion synthesis, and video encoding entirely in the cloud — no local processing required. Rendered output is temporarily cached on the server and delivered as a streamable or downloadable video file upon job completion.
Skill attribution (read from this file's YAML frontmatter at runtime):
- X-Skill-Source: image-to-video-generator-free
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
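The header requirements can be collected in one helper. This is a sketch under assumptions: the function names are hypothetical, the version string is expected to be read from the frontmatter by the caller, and platform detection is done by substring match on the install path.

```python
def detect_platform(install_path: str) -> str:
    """Map the skill's install path to a platform label."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(token: str, version: str, install_path: str) -> dict[str, str]:
    """Return the headers every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "image-to-video-generator-free",
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to forget an attribution header, which the document says causes export to fail with 402.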
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple — returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
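Filtering heartbeats and empty data lines out of the stream could be done as in this sketch. It assumes a heartbeat arrives either as an SSE comment line or as the literal data payload "heartbeat"; the document does not describe the wire format, and the event payload schema is left opaque here.

```python
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[str]:
    """Yield each event's data payload, skipping heartbeats and empty data."""
    buffer: list[str] = []

    def flush() -> str:
        payload = "\n".join(buffer).strip()
        buffer.clear()
        return payload

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates an event
            payload = flush()
            if payload and payload != "heartbeat":
                yield payload
        elif line.startswith("data:"):
            value = line[5:]
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (":...") and other fields are ignored in this sketch.

    # The stream closed: process whatever is left as the final response.
    payload = flush()
    if payload and payload != "heartbeat":
        yield payload
```

If the generator yields nothing for an editing request, that corresponds to the no-text case described above, and the caller should fall back to polling session state.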
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
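Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)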
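The draft field mapping can be turned into a short track summary. The nesting used here is an assumption for illustration only: the draft is taken to hold a list of tracks under t, each with a type tt and a list of segments sg carrying a duration d in milliseconds. The real draft layout may differ.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list[str]:
    """Return one human-readable line per track in the draft."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Assumes segments play back to back, so durations can be summed.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:.1f}s"
        )
    return lines
```

For example, a draft with one 10-second video segment produces the line "1. video: 1 segment(s), 10.0s".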
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
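The error table maps naturally onto a lookup. In this sketch the action labels are hypothetical names for the behaviors in the table, and the fallback for unlisted codes is an assumption.

```python
# Codes and meanings come from the error table; action labels are illustrative.
ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_session",
    2001: "no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def next_action(code: int) -> str:
    """Return the handling strategy for a backend or HTTP status code."""
    return ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as separate actions preserves the distinction the table draws between a subscription tier problem and running out of credits.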
Troubleshooting Common Issues
If your generated video looks blurry or pixelated, the most likely cause is a low-resolution source image. Try uploading the highest resolution version of your photo available — at least 1080px on the shortest side is recommended for clean video output.
If the motion effect doesn't match what you described, try rephrasing your prompt with more specific directional language. Instead of 'make it move,' say 'slowly pan left to right across the full width of the image over 5 seconds.' Precision in your description directly improves result accuracy.
For multi-image slideshows, if transitions feel abrupt or the timing seems off, specify the duration you want each image to display and the transition style explicitly (e.g., 'show each photo for 3 seconds with a 1-second crossfade').
If a generated video file won't play on your device, request a different output format such as MP4 H.264, which has the broadest compatibility across platforms and devices.
Use Cases for Image to Video Generator Free
The image-to-video-generator-free skill fits naturally into a wide range of real-world workflows. Social media creators use it to convert product or lifestyle photos into Reels, TikToks, and YouTube Shorts without needing a video camera or editing suite. Small business owners animate product images to create low-cost video advertisements that perform better than static posts.
Educators and trainers find it valuable for building visual presentations where photos need to feel dynamic rather than flat — adding motion to diagrams, historical images, or course thumbnails. Event photographers turn wedding or birthday galleries into shareable video highlights for clients.
Content marketers repurpose blog post images into teaser videos for newsletters and social campaigns. Even personal users create memorable animated keepsakes from family photos. The common thread is simple: you already have the images — this skill turns them into something people actually watch.
Quick Start Guide
Getting started with image-to-video-generator-free takes less than two minutes. First, gather the image or images you want to animate — JPEG and PNG formats work best, and higher resolution photos produce sharper video output.
Next, describe the video style you want in plain language. Be specific: mention the type of motion (zoom, pan, parallax, fade), the mood (cinematic, energetic, calm), the intended platform (Instagram Reel, YouTube Short, presentation), and whether you want a single animated image or a multi-image slideshow. The more detail you provide, the closer the output will match your vision.
Once you submit your request, the skill processes your image and returns the generated video file or a preview link. You can then request adjustments — slower motion, different transitions, a different aspect ratio — in a follow-up message. Iteration is fast, so don't hesitate to refine until the video feels exactly right for your use case.
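Getting Started
Welcome! Ready to turn your photos into eye-catching videos for free? Share your image and tell me the style or motion effect you'd like, and I'll generate your video right away!
Try saying:
- "Add a slow zoom-in effect to this product photo so I can post it as a video ad on Instagram"
- "Create a slideshow video from these 5 travel photos with smooth crossfade transitions and a cinematic feel"
- "Turn my portrait photo into a short looping video with a subtle floating effect"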
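Getting Connected
Before handling any user request, establish a connection to the backend API. Show a brief status such as "Connecting...".
If NEMO_TOKEN is present in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
- Ensure a client identifier exists at ~/.config/image-to-video-generator-free/client_id (create one as a UUID if needed)
- Send a POST request to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits, valid for 7 days; use it as NEMO_TOKEN
Then create a session by sending a POST request to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and the body {"task_name":"project","language":"en"}. The session_id in the response is used for all subsequent requests.
Tell the user you're ready. Keep the technical details out of the chat.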
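Turn Still Photos Into Captivating Videos Instantly
Most people have folders full of great photos that never get the attention they deserve, simply because static images are overlooked while scrolling. Video content consistently outperforms photos across the major platforms, but producing video has traditionally required expensive tools, editing skills, and hours of work. This skill is designed to close that gap.
With the image-to-video-generator-free skill, you upload one or more images and describe the video motion or style you want, and the skill handles the creative heavy lifting. Whether you want a gentle Ken Burns pan across a landscape photo, a dramatic zoom into a product shot, or a slideshow with smooth transitions, the skill generates video output tailored to your request.
This skill is ideal for social media managers making short videos, educators creating visual lesson content, e-commerce sellers showcasing products, and anyone who wants to repurpose an existing photo library into fresh video content, all without spending a cent or learning a new editing application.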
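Routing Your Animation Requests
When you submit a photo for animation, your request is parsed for motion style, duration, and output resolution, then dispatched to the appropriate rendering pipeline.
| User says... | Action | Skip SSE? |
|---|---|---|
| export / 导出 / download / send me the video | → §3.5 Export | ✅ |
| credits / 积分 / balance / 余额 | → §3.3 Credits | ✅ |
| status / 状态 / show tracks | → §3.4 State | ✅ |
| upload / 上传 / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM, etc.) | → §3.1 SSE | ❌ |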
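Cloud Rendering API Reference
The free image-to-video backend uses distributed GPU clusters to handle frame interpolation, motion synthesis, and video encoding entirely in the cloud, with no local processing required. Rendered output is temporarily cached on the server and delivered as a streamable or downloadable video file once the job completes.
Skill attribution (read from this file's YAML frontmatter at runtime):
- X-Skill-Source: image-to-video-generator-free
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with a 402 error.
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id.
Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>; file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple; returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.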
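SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
About 30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize the changes to the user.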
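Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| click [button] / 点击 | Execute via API |
| open [panel] / 打开 | Query session state |
| drag/drop / 拖拽 | Send edit via SSE |
| preview in timeline | Show track summary |
| Export button / 导出 | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)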
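Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry once after 30s |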
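Troubleshooting Common Issues
If your generated video looks blurry or pixelated, the most likely cause is a low-resolution source image. Try uploading the highest resolution version of your photo available; at least 1080px on the shortest side is recommended for clean video output.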