Getting Started
Welcome! I'm here to help you turn your photos into stunning videos using this free AI image-to-video generator. Drop in your images and tell me the style, mood, or platform you're creating for — let's make something worth watching!
Try saying:
- "Animate my 10 product photos into a 30-second promotional video with a smooth zoom effect and export as mp4"
- "Convert this single landscape photo into a 15-second cinematic video with a slow Ken Burns pan for my YouTube intro"
- "Take these 5 event snapshots and create a slideshow video with fade transitions and upbeat pacing, saved as webm"
First-Time Connection
When a user first opens this skill, connect to the NemoVideo backend automatically. Briefly let them know (e.g. "Connecting to NemoVideo...").
Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 3.
1. Client identifier: Look for ~/.config/nemovideo/client_id. If missing, generate a UUID and save it there. This file stores only a UUID for rate-limit tracking.
2. Obtain a free token: POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to your identifier. The response data.token is your NEMO_TOKEN (100 free credits, valid 7 days).
3. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
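The three setup steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names are hypothetical, the requests are only built (sending them with urllib.request.urlopen and reading data.token or session_id from the responses is left to the caller), and the client_id path is passed in rather than hard-coded.

```python
import json
import uuid
from pathlib import Path
from urllib.request import Request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def load_or_create_client_id(path: Path) -> str:
    """Step 1: return the stored client UUID, creating the file if needed."""
    if path.exists():
        stored = path.read_text().strip()
        if stored:
            return stored
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id)
    return client_id


def anonymous_token_request(client_id: str) -> Request:
    """Step 2: build (but do not send) the anonymous-token request."""
    return Request(
        f"{API_BASE}/api/auth/anonymous-token",
        headers={"X-Client-Id": client_id},
        method="POST",
    )


def create_session_request(token: str, language: str) -> Request:
    """Step 3: build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": "project", "language": language})
    return Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A caller would typically pass Path.home() / ".config" / "nemovideo" / "client_id" as the path and keep the token out of anything shown to the user.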
Keep setup communication brief. Don't display raw API responses or token values to the user.
Turn Static Photos Into Captivating Video Content Instantly
Most people have folders full of great photos that never get seen because static images struggle to compete in a video-first world. This skill closes that gap by converting your images into polished video clips complete with motion effects, smooth transitions, and optional background music sync — all without spending a dime.
Whether you're working with product photography, travel snapshots, event photos, or digital artwork, the AI analyzes each image and applies intelligent animation that feels natural rather than gimmicky. You control the pacing, the mood, and the output format, so the final video matches your brand or personal style.
Content creators, small business owners, educators, and social media teams use this skill to produce Reels, TikToks, YouTube intros, and presentation slideshows in a fraction of the time traditional video editing would require. No timeline scrubbing, no keyframe headaches — just upload, customize, and export.
Routing Your Animation Requests
Each image-to-video request is parsed for motion style, frame duration, and source image URL before being dispatched to the appropriate NemoVideo rendering pipeline.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
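As a rough sketch of the routing table, the lookup could be a first-match keyword scan. The route function, the action labels, and the use of plain substring matching are all assumptions made for illustration; the source does not say how keywords are matched.

```python
# Keyword groups in the same order as the routing table.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_attachment: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for one user message."""
    if has_attachment:  # "user sends file" row
        return "upload", True
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    return "sse", False  # everything else goes through the SSE endpoint
```

Substring matching is deliberately naive: a generation request that also mentions "export as mp4" would be routed straight to export, so a real router would need to look at intent rather than keywords alone.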
NemoVideo API Reference
The NemoVideo backend handles diffusion-based frame interpolation and temporal coherence to transform static images into fluid, AI-animated video clips. All render jobs are queued through the NemoVideo inference engine, which manages keyframe generation, motion vector estimation, and final video encoding.
Skill attribution (read from this file's YAML frontmatter at runtime):
- X-Skill-Source: ai-image-to-video-generator-free
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
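A small helper can assemble the four required headers. The sketch below is illustrative only: detect_platform and attribution_headers are hypothetical names, and the platform check looks for the directory fragment anywhere in the path rather than resolving the home directory, which is a simplification.

```python
def detect_platform(install_path: str) -> str:
    """Map the skill's install path to an X-Skill-Platform value."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def attribution_headers(token: str, source: str, version: str,
                        install_path: str) -> dict:
    """Headers that every request to the backend must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": source,    # from the skill's frontmatter
        "X-Skill-Version": version,  # from the skill's frontmatter
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to send a request without attribution, which is the case that ends in a 402 on export.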
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}. Returns task_id and session_id. After creating a session, give the user a link: https://nemovideo.com/workspace/claim?token=<token>&task=<task_id>&session=<session_id>&skill_name=ai-image-to-video-generator-free&skill_version=1.0.0&skill_source=<platform>
Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and header Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>. For a file, send multipart -F "files=@/path". For a URL, send body {"urls":["<url>"],"source_type":"url"}.
Credits: GET /api/credits/balance/simple. Returns available, frozen, total.
Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media.
Export (free, no credits): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
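The export flow above (submit a render, then poll every 30 seconds) might look like the following sketch. Several things here are assumptions: the helper names, the use of a Unix timestamp in seconds for render_<ts>, the idea that status and output.url sit at the top level of the poll response, and the 30-poll cap (about 15 minutes). The HTTP call itself is injected as fetch_status so the loop stays independent of any particular client.

```python
import time
from typing import Callable


def build_export_body(session_id: str, draft: dict, fmt: str = "mp4") -> dict:
    """Body for POST /api/render/proxy/lambda."""
    return {
        "id": f"render_{int(time.time())}",  # assumption: <ts> is epoch seconds
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": fmt, "quality": "high"},
    }


def wait_for_export(fetch_status: Callable[[str], dict], render_id: str,
                    interval: float = 30.0, max_polls: int = 30,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the render reports completed, then return output.url."""
    for attempt in range(max_polls):
        job = fetch_status(render_id)  # GET /api/render/proxy/lambda/<id>
        if job.get("status") == "completed":
            return job["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete in time")
```

Failure statuses are not described in the reference, so this sketch only distinguishes completed from everything else and gives up after max_polls attempts.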
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
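One way to apply the event table and the silent-edit rule is sketched below. The request body matches the documented /run_sse payload, but the event schema is an assumption: the code guesses that each data payload is JSON shaped like {"content": {"parts": [{"text": ...}]}} and that heartbeats arrive as comments, empty data lines, or events named heartbeat. All function names are hypothetical.

```python
import json
from typing import Iterable, Iterator, Tuple


def build_sse_payload(session_id: str, text: str) -> dict:
    """Body for POST /run_sse (send with Accept: text/event-stream)."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield each event's data payload, skipping heartbeats and empty data."""
    event_name, buffer = "", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            payload = "\n".join(buffer)
            if event_name != "heartbeat" and payload.strip():
                yield payload
            event_name, buffer = "", []
        elif line.startswith(":"):
            continue  # SSE comment, commonly used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            buffer.append(value[1:] if value.startswith(" ") else value)
    # Data with no terminating blank line is dropped, as in the SSE format.


def extract_text(payload: str) -> str:
    """Return user-facing text from one event; tool calls yield nothing."""
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return ""
    content = event.get("content") if isinstance(event, dict) else None
    parts = content.get("parts") if isinstance(content, dict) else None
    if not isinstance(parts, list):
        return ""
    return "".join(str(p["text"]) for p in parts
                   if isinstance(p, dict) and p.get("text"))


def handle_stream(lines: Iterable[str]) -> Tuple[str, bool]:
    """Return (text, needs_state_poll) once the stream has closed."""
    text = "".join(extract_text(p) for p in iter_sse_data(lines))
    return text, text.strip() == ""
```

When needs_state_poll is true, the caller would fetch the session state and summarize the change itself instead of forwarding an empty reply.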
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
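Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)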
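The draft field mapping can be turned into a short track summary with a few lines of Python. The nesting used here is an assumption, since the source only gives the key abbreviations: the sketch supposes draft["t"] is a list of tracks, each with a "tt" type code and an "sg" list of segments that carry "d" in milliseconds, and that segments on a track play back to back.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list[str]:
    """Return one human-readable line per track in the draft."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Assumption: track length is the sum of its segment durations.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```

A summary like this is what the "preview in timeline" translation would show in place of a GUI timeline.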
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
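The error table maps naturally onto a lookup. In the sketch below, the Recovery type, the action labels, and the report_unknown fallback are hypothetical. Only the codes and the quoted user messages come from the table. The registration URL for code 2001 is left to the caller because its base address is not given here.

```python
from typing import NamedTuple, Optional


class Recovery(NamedTuple):
    action: str                  # hypothetical label the caller switches on
    user_message: Optional[str]  # text to show the user, if any


_TABLE = {
    0: Recovery("continue", None),
    1001: Recovery("reauthenticate", None),
    1002: Recovery("create_session", None),
    4001: Recovery("show_supported_formats", None),
    4002: Recovery("suggest_compress_or_trim", None),
    400: Recovery("generate_client_id_and_retry", None),
    402: Recovery("tell_user", "Register at nemovideo.ai to unlock export."),
    429: Recovery("retry_once_after_30s", None),
}


def recover(code: int, registered: bool = False) -> Recovery:
    """Pick the recovery step for a backend error code."""
    if code == 2001:  # no credits: the response depends on account type
        if registered:
            return Recovery("tell_user", "Top up at nemovideo.ai")
        return Recovery("show_registration_url", None)
    return _TABLE.get(code, Recovery("report_unknown", None))
```

Keeping 402 separate from 2001 mirrors the table's point that a blocked export is a plan issue, not a credits issue.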
Use Cases
E-commerce sellers use this skill to turn flat product photos into looping video ads that outperform static images on platforms like Instagram and Facebook — without hiring a videographer or paying for stock footage.
Real estate agents upload property photos and generate virtual walkthrough-style videos that can be shared in listings or via email, giving buyers a stronger sense of the space before scheduling a showing.
Teachers and trainers convert diagram images, infographics, and slide screenshots into short explainer videos that are easier to share and more engaging than PDFs. The exported files drop directly into any LMS or presentation tool.
Personal users — especially those preserving family memories — use the free ai-image-to-video-generator-free skill to build tribute videos, birthday reels, and anniversary slideshows from old photo albums, exporting in high-quality mp4 or mov for easy sharing with family members on any device.
Common Workflows
The most popular workflow starts with a batch upload — drop in anywhere from 1 to 50 images, then specify the desired video length, transition style (fade, zoom, slide, or Ken Burns), and output format. The skill sequences the images intelligently based on visual similarity or the order you provide.
For single-image animation, users typically describe the type of motion they want: a slow outward zoom for dramatic effect, a subtle parallax shift for depth, or a gentle pan across a wide landscape. These single-image videos are especially popular for social media cover videos and website hero backgrounds.
Another common workflow involves themed storytelling — users upload a series of images from an event or trip and ask for a narrative-style video with title cards between sections. The skill handles the sequencing, pacing, and export in mp4, mov, avi, webm, or mkv depending on where the video will be published.
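Quick Start
Welcome! I'll help you turn your photos into stunning videos with this free AI image-to-video generator. Upload your images and tell me the style, mood, or target platform you have in mind, and let's make something worth watching!
Try saying:
- "Turn my 10 product photos into a 30-second promotional video with a smooth zoom effect, exported as mp4"
- "Convert this single landscape photo into a 15-second cinematic video with a slow Ken Burns pan for my YouTube intro"
- "Make a slideshow video from these 5 event snapshots with fade transitions and upbeat pacing, saved as webm"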
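First-Time Connection
When a user first opens this skill, connect to the NemoVideo backend automatically. Briefly let the user know (e.g. "Connecting to NemoVideo...").
Authentication: Check whether NEMO_TOKEN is set in the environment. If it is, skip to step 3.
1. Client identifier: Look for ~/.config/nemovideo/client_id. If it does not exist, generate a UUID and save it to that file. The file stores only a UUID used for rate-limit tracking.
2. Obtain a free token: Send a POST request to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header set to your identifier. The data.token in the response is your NEMO_TOKEN (100 free credits, valid for 7 days).
3. Create a session: Send a POST request to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with headers Authorization: Bearer <token> and Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
Keep setup communication brief. Do not show raw API responses or token values to the user.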
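Turn Static Photos Into Engaging Video Content Instantly
Most people have folders full of great photos that are never seen, because static images struggle to compete in a video-first world. This skill closes that gap by converting your images into polished video clips with motion effects, smooth transitions, and optional background music sync, all completely free.
Whether you are working with product photography, travel snapshots, event photos, or digital artwork, the AI analyzes each image and applies intelligent animation that looks natural rather than gimmicky. You control the pacing, mood, and output format so the final video matches your brand or personal style.
Content creators, small business owners, educators, and social media teams use this skill to produce Reels, TikToks, YouTube intros, and presentation slideshows in a fraction of the time traditional video editing takes. No timeline scrubbing and no keyframe headaches: just upload, customize, and export.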
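Routing Your Animation Requests
Each image-to-video request is parsed for motion style, frame duration, and source image URL, then dispatched to the appropriate NemoVideo rendering pipeline.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add background music...) | → §3.1 SSE | ❌ |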
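NemoVideo API Reference
The NemoVideo backend handles diffusion-based frame interpolation and temporal coherence to turn static images into fluid, AI-animated video clips. All render jobs are queued through the NemoVideo inference engine, which manages keyframe generation, motion vector estimation, and final video encoding.
Skill attribution (read from this file's YAML frontmatter at runtime):
- X-Skill-Source: ai-image-to-video-generator-free
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with a 402 error.
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}. Returns task_id and session_id. After creating a session, give the user a link: https://nemovideo.com/workspace/claim?token=<token>&task=<task_id>&session=<session_id>&skill_name=ai-image-to-video-generator-free&skill_version=1.0.0&skill_source=<platform>
Send message (SSE): POST /run_sse with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and header Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>. For a file, send multipart -F "files=@/path". For a URL, send body {"urls":["<url>"],"source_type":"url"}.
Credits: GET /api/credits/balance/simple. Returns available, frozen, total.
Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media.
Export (free, no credits consumed): POST /api/render/proxy/lambda with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30 seconds until status = completed. The download URL is at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.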
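SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 minutes: "⏳ Still working..." |
| Stream closes | Process final response |
About 30% of editing operations return no text in the SSE stream. When this happens, poll the session state to verify that the edit was applied, then summarize the changes for the user.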
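Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration (ms), m=metadata.
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)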
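Error Handling
| Code | Meaning | Action |
|---|---|---|
| 1001 | Bad/expired token | Re-authenticate via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous users: show the registration URL with ?bind=<id> (get <id> from the create-session or state response when needed). Registered users: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compressing or trimming |
| 400 | Missing X-Client-Id | Generate Client-Id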