Getting Started
Welcome! I can turn your still photos into captivating videos using the image-to-video-generator-ai-free skill — no editing skills or software needed. Upload your image and tell me the style or mood you're going for, and let's create something worth watching!
Try saying:
- "I have a product photo of my handmade candle. Can you turn it into a short atmospheric video with slow zoom and warm lighting effects for Instagram?"
- "I want to create a memorial slideshow video from 8 family photos with gentle transitions and a soft, emotional feel. Can you help me set that up?"
- "Can you animate my landscape photo of a mountain sunset into a cinematic video clip with a slow pan effect and dramatic music-ready pacing?"
First-Time Connection
When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").
Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.
1. Obtain a free token: Generate a random UUID as the client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to that UUID. The response field data.token is your NEMO_TOKEN (100 free credits, valid 7 days).
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
Keep setup communication brief. Don't display raw API responses or token values to the user.
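The two setup steps can be sketched as plain request descriptions. This is a minimal illustration, not the skill's actual client: the helper names (build_token_request, build_session_request, needs_token) are hypothetical, the attribution headers required elsewhere in this document are left out for brevity, and sending the requests is left to whatever HTTP client you use.

```python
import json
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def needs_token(env):
    """Step 1 is only required when NEMO_TOKEN is missing or empty."""
    return not env.get("NEMO_TOKEN")


def build_token_request(client_id=None):
    """Describe the anonymous-token POST; client_id defaults to a fresh UUID."""
    client_id = client_id or str(uuid.uuid4())
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def build_session_request(token, language="en"):
    """Describe the create-session POST for an already obtained token."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }
```

A caller would check needs_token(os.environ) first, send the token request only when it returns True, read data.token from the response, and then send the session request.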
Turn Your Photos Into Stunning Videos for Free
Still images tell a story, but videos make people stop and watch. This skill bridges the gap between your photo library and compelling video content — no editing experience required. Whether you have a single product shot, a portrait, a landscape, or a collection of event photos, this tool breathes motion into them and delivers a polished video ready to share.
Using advanced AI animation techniques, the image-to-video-generator-ai-free skill applies smooth movement, zoom effects, parallax depth, and stylistic transitions to your images. You can specify the mood, pacing, and style you want — from a slow, cinematic drift to an energetic slideshow with punchy cuts. The result feels intentional and professionally crafted, not like an auto-generated slideshow from a decade-old app.
This is built for real use cases: launching a product on Instagram, creating a memorial video, animating a logo for a YouTube intro, or turning travel photos into a shareable reel. You describe what you want, upload your image or images, and the skill handles the rest — completely free.
Routing Your Animation Requests
When you submit a still photo, the skill parses your motion prompt and frame parameters before dispatching the job to the appropriate AI video synthesis pipeline based on resolution, duration, and animation style.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
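The routing table can be approximated with a small keyword lookup. This is a sketch under assumptions: the source lists trigger phrases but does not say how they are matched, so case-insensitive substring matching and the function name route are illustrative choices only.

```python
# Keyword groups in table order; the first match wins.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message.

    ASSUMPTION: plain substring matching; the real matching rule is unspecified.
    """
    if has_file:
        return ("upload", True)
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM, ...) goes through SSE.
    return ("sse", False)
```

Substring matching is deliberately crude: a sentence that merely mentions "download" in passing would also be routed to export, so a real implementation would likely need intent detection.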
Cloud Rendering API Reference
The backend leverages a distributed diffusion-based video generation API that processes keyframe interpolation and temporal coherence rendering entirely in the cloud, so no local GPU is required. Requests are queued, encoded, and returned as a streamable video file or downloadable MP4 once the inference pass completes.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: image-to-video-generator-ai-free
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
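As a rough sketch of the header set described above: the functions detect_platform and build_headers are hypothetical names, and the version is passed in as an argument rather than parsed from the YAML frontmatter, which the source says happens at runtime.

```python
def detect_platform(install_path):
    """Map an install path to the X-Skill-Platform value."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Return the four headers every request is expected to carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "image-to-video-generator-ai-free",
        "X-Skill-Version": version,  # ASSUMPTION: caller read this from frontmatter
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to forget an attribution header, which the text above says leads to a 402 on export.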
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>. For a file, send multipart with -F "files=@/path"; for a URL, send the JSON body {"urls":["<url>"],"source_type":"url"}.
Credits: GET /api/credits/balance/simple. Returns available, frozen, total.
Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media.
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
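The message and export calls above can be sketched as follows. Several details are assumptions rather than documented behavior: that <ts> is a Unix timestamp in whole seconds, that the poll response exposes status and output.url at the top level, and the cap of 30 polls. fetch_status is a hypothetical callable you supply to perform the GET.

```python
import time


def build_message_body(session_id, text):
    """Body for POST /run_sse, following the reference above."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }


def build_export_body(session_id, draft, now=None):
    """Body for POST /api/render/proxy/lambda.

    ASSUMPTION: <ts> is a Unix timestamp in whole seconds.
    """
    ts = int(time.time() if now is None else now)
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_export(fetch_status, render_id, interval=30, max_polls=30, sleep=time.sleep):
    """Poll every `interval` seconds until status is "completed".

    fetch_status(render_id) must return the parsed JSON of
    GET /api/render/proxy/lambda/<id>. ASSUMPTION: status and output.url
    sit at the top level of that JSON; max_polls is an arbitrary cap.
    """
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Injecting fetch_status and sleep keeps the polling logic independent of any particular HTTP library and lets it be exercised without waiting.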
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
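The handling rules above, including the fallback for streams that carry no text, might look like the sketch below. The event payload shape is not documented in this file, so the code assumes each data: line holds JSON with text under content.parts[].text; that layout, and the names collect_text and next_step, are hypothetical.

```python
import json


def collect_text(sse_lines):
    """Gather user-facing text from raw SSE lines; return "" if none arrived."""
    chunks = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # comments, event names, blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "heartbeat":
            continue  # heartbeat or empty data: keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        content = event.get("content")
        parts = content.get("parts", []) if isinstance(content, dict) else []
        # ASSUMPTION: text parts carry a "text" key; tool calls and results
        # do not, so they are processed silently and never forwarded.
        for part in parts:
            if isinstance(part, dict) and isinstance(part.get("text"), str):
                chunks.append(part["text"])
    return "".join(chunks)


def next_step(sse_lines):
    """Decide what to do once the stream has closed."""
    text = collect_text(sse_lines)
    if text:
        return ("present", text)
    # No text in the stream: verify the edit through session state instead.
    return ("poll_state", None)
```

When next_step returns "poll_state", the caller would fetch the latest session state and summarize what changed.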
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
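Example track summary:
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "City Dreams" (0-3s)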
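To show how the short draft keys could be turned into a track summary, here is a minimal sketch. The nesting is an assumption: the mapping above names the keys but not where they sit, so the code assumes tracks live under t, each track has tt and a list sg, and each segment has a duration d in milliseconds.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Build a short, human-readable timeline summary from a draft dict.

    ASSUMPTION: draft = {"t": [{"tt": int, "sg": [{"d": ms}, ...]}, ...]}.
    """
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s")
    return "\n".join(lines)
```

A richer summary would also read names and volume levels from the metadata field m, whose layout is not described here.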
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
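The error table can be expressed as a lookup so that each code maps to one follow-up action. The action labels and function names below are hypothetical; only the codes and their meanings come from the table.

```python
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",          # bad or expired token
    1002: "create_session",          # session not found
    2001: "show_credit_options",     # no credits
    4001: "show_supported_formats",  # unsupported file
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_plan_upgrade",     # subscription tier issue, not credits
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the follow-up action for a backend code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def retry_delay(code, attempts_so_far):
    """Seconds to wait before retrying, or None if no retry applies.

    Only 429 is retried, and only once, per the table.
    """
    if code == 429 and attempts_so_far == 0:
        return 30
    return None
```

Keeping 402 and 2001 as separate actions reflects the table's point that a blocked export is a plan issue rather than a credit shortage.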
Performance Notes
The image-to-video-generator-ai-free skill performs best with portrait or landscape-oriented images in standard aspect ratios (16:9, 9:16, or 1:1). Unusual crops or extreme panoramic images may require manual aspect ratio guidance when you submit your request.
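One way to decide whether an image needs aspect ratio guidance is to compare it against the three standard ratios before submitting. This is an illustrative check only: the 5% tolerance and the function name closest_standard_ratio are assumptions, not part of the skill.

```python
STANDARD_RATIOS = {"16:9": 16 / 9, "9:16": 9 / 16, "1:1": 1.0}


def closest_standard_ratio(width, height, tolerance=0.05):
    """Return (name, is_close) for the nearest standard aspect ratio.

    ASSUMPTION: an image counts as standard when its ratio is within
    `tolerance` (relative) of the nearest standard ratio.
    """
    ratio = width / height
    name, target = min(STANDARD_RATIOS.items(), key=lambda item: abs(item[1] - ratio))
    is_close = abs(target - ratio) / target <= tolerance
    return name, is_close
```

If is_close is False, for example with a 4000x1000 panorama, the request would benefit from stating the desired output ratio explicitly.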
Animation complexity affects processing time. A simple Ken Burns zoom on a single image is near-instant, while a multi-image sequence with layered transitions and depth effects takes longer to render. For time-sensitive projects, mention your deadline so the skill can prioritize effect simplicity or suggest pre-optimized templates.
Images with busy backgrounds or low subject contrast may produce less precise motion effects, since AI depth estimation relies on clear foreground-background separation. For portraits and product shots, this skill performs exceptionally well. For abstract or highly textured images, expect more stylized, painterly motion rather than realistic parallax.
Common Workflows
The most common workflow is single-image animation: you upload one photo, describe the desired motion (zoom, drift, parallax), specify the output length (typically 3–15 seconds), and receive a video clip ready for social media or presentation use.
A second popular workflow is photo slideshow creation. Users submit 5–20 images with a theme (wedding, travel, product launch) and request a sequenced video with timed transitions, consistent styling, and pacing suited to the target platform. This is especially useful for converting Instagram carousels into Reels or producing event recap videos.
A third workflow involves logo and graphic animation — taking a static brand asset and adding entrance motion, subtle looping animation, or a cinematic reveal effect. This turns a flat PNG into a dynamic intro bumper or branded social post. Simply describe the brand feel and the type of motion you want, and the skill handles the animation logic from there.
Best Practices for Image-to-Video Generation
For the best results with the image-to-video-generator-ai-free skill, start with high-resolution images — ideally 1080p or higher. Low-resolution or heavily compressed photos will produce blurry or pixelated video output, especially when zoom or pan effects are applied.
When describing the effect you want, be specific about motion style. 'Slow zoom in on the subject' produces a very different result than 'parallax depth effect with background separation.' The more detail you provide about mood, pacing, and intended platform (Instagram Reels, YouTube, TikTok), the more targeted the output will be.
For multi-image slideshows, keep a consistent visual theme across your photos — similar lighting, color palette, or subject matter. Wildly mismatched images create jarring transitions even with the smoothest AI effects. Grouping photos by scene or color tone before submitting will give your final video a cohesive, professional look.