Getting Started

Welcome! I can turn your still photos into dynamic, animated video clips using free AI — no cost, no complicated tools. Share your image and tell me what kind of motion or mood you're going for, and let's bring it to life!

Try saying:

- "Here's a photo of a mountain lake at sunset. Can you animate it with gentle water ripples and slow drifting clouds?"
- "I have a product photo of my skincare bottle. Make it into a short video with a slow zoom-in and soft light shimmer effect."
- "Animate this portrait of my dog so it looks like he's subtly turning his head and blinking."

Quick Start Setup

This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").

Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:

- Generate a UUID as client identifier
- POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with X-Client-Id header
- Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry)
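The token check above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function names (`build_token_request`, `extract_token`, `get_token`) and the injectable `send` transport are hypothetical, and the request is assumed to need no body.

```python
import json
import os
import urllib.request
import uuid

AUTH_URL = "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token"


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request."""
    # Assumption: the endpoint needs only the X-Client-Id header and no body.
    return urllib.request.Request(
        AUTH_URL,
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def extract_token(response_body: str) -> str:
    """Pull data.token out of the JSON response body."""
    return json.loads(response_body)["data"]["token"]


def get_token(send=None) -> str:
    """Return NEMO_TOKEN from the environment, or request an anonymous one."""
    token = os.environ.get("NEMO_TOKEN")
    if token:
        return token
    request = build_token_request(str(uuid.uuid4()))
    if send is None:
        # Default transport: a real HTTP call through urllib.
        def send(req):
            with urllib.request.urlopen(req, timeout=30) as resp:
                return resp.read().decode("utf-8")
    return extract_token(send(request))
```

Passing a fake `send` callable keeps the logic testable offline; in normal use `get_token()` is called with no arguments.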

Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
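A session request along these lines might look like the sketch below. The helper names are hypothetical, the `Content-Type` header is an assumption, and because the exact response shape is not documented here, `extract_session_id` accepts `session_id` either at the top level or nested under `data`.

```python
import json
import urllib.request

SESSION_URL = (
    "https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent"
)


def build_session_request(token: str, task_name: str = "project") -> urllib.request.Request:
    """Build (but do not send) the session-creation request."""
    body = json.dumps({"task_name": task_name}).encode("utf-8")
    return urllib.request.Request(
        SESSION_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the backend expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def extract_session_id(response_body: str) -> str:
    """Read session_id from the response."""
    payload = json.loads(response_body)
    # Assumption: the field may be top-level or nested under "data".
    data = payload.get("data")
    container = data if isinstance(data, dict) else payload
    return container["session_id"]
```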

Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.

Turn Any Photo Into a Moving Masterpiece

Still images tell a story, but video makes people feel it. This skill takes your photos — whether they're portraits, travel snapshots, product images, or digital artwork — and generates fluid, eye-catching video clips that look like they were filmed, not fabricated.

Using free AI-powered animation technology, the skill analyzes the content of your image and applies intelligent motion: a gentle breeze through hair, clouds drifting across a skyline, water rippling on a lake surface, or a subtle zoom that gives your product photo a professional commercial feel. You describe the motion you want, and the AI handles the rest.

This is ideal for social media creators who want to stand out on Instagram Reels or TikTok, small business owners who need affordable promotional content, and anyone who wants to repurpose their photo library into engaging video content — no video editing experience required.

Routing Your Animation Requests

When you submit a still photo, your request is parsed for motion parameters — frame interpolation style, duration, and animation intensity — then dispatched to the appropriate image-to-video pipeline based on your selected output resolution and generation mode.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
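The routing table can be expressed as a small lookup. The sketch below is one possible reading: it assumes case-insensitive substring matching, which the table itself does not specify, and the names `ROUTES` and `route` are hypothetical.

```python
# Keyword groups in table order, each mapped to an action label.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A user-sent file always goes to the upload path.
        return "upload", True
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

Substring matching is deliberately simple and can misroute a creative prompt that happens to contain a keyword such as "download", so a stricter matcher may be preferable in practice.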

Cloud Rendering API Reference

All image-to-video synthesis runs on a distributed cloud backend that handles diffusion-based frame generation, temporal consistency smoothing, and MP4 encoding without any local compute required on your end. API calls are stateless and authenticated per session, so each animation job is queued, processed, and returned as a signed video URL within the response payload.

Skill attribution — read from this file's YAML frontmatter at runtime:

- X-Skill-Source: image-to-video-free-ai
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
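As an illustration of the header requirement, the following sketch assembles the four headers and derives the platform from an install path. The function names are hypothetical, and matching the path by substring is an assumption about how detection could work.

```python
def detect_platform(install_path: str) -> str:
    """Map an install path to a platform label."""
    # Normalise Windows separators so one check covers both styles.
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(
    token: str, skill_source: str, skill_version: str, install_path: str
) -> dict[str, str]:
    """Assemble the auth and attribution headers sent with every request."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": skill_source,
        "X-Skill-Version": skill_version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to forget an attribution header, which matters because a missing one causes export to fail with 402.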

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
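A minimal sketch of the message body and of reading the event stream is shown below. The body mirrors the fields listed above; the parser follows the generic Server-Sent Events line format (`data:` lines, blank line ends an event) and makes no assumption about what the backend puts inside each payload. Both function names are hypothetical.

```python
from typing import Iterable, Iterator


def build_message_body(session_id: str, text: str) -> dict:
    """Build the JSON body for POST /run_sse."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in an SSE line stream."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format allows one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line terminates the current event.
            yield "\n".join(buffer)
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored.
    if buffer:
        # Lenient: also flush a trailing event that lacks the final blank line.
        yield "\n".join(buffer)
```

The parser accepts any iterable of text lines, so it can be fed from a decoded HTTP response or from a list of strings in a test.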

Upload: POST /api/upload-video/nemo_agent/me/<sid> with either a multipart file (-F "files=@/path") or a JSON body of URLs: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple, which returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest, with key fields data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
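The export-and-poll loop could be sketched as follows. Several details are assumptions: `<ts>` is taken to be a Unix timestamp in seconds, `status` and `output.url` are read from the top level of the poll response, and the `max_polls` cap is a hypothetical safeguard that the API description does not mention. The status fetcher is injected so the loop can be exercised without a network.

```python
import time
from typing import Callable, Optional


def build_export_body(session_id: str, draft: dict, timestamp: Optional[int] = None) -> dict:
    """Build the JSON body for POST /api/render/proxy/lambda."""
    # Assumption: <ts> is a Unix timestamp in seconds.
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_export(
    fetch_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status is "completed", then return the download URL."""
    for attempt in range(max_polls):
        result = fetch_status()
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"export not completed after {max_polls} polls")
```

In real use `fetch_status` would wrap a GET to /api/render/proxy/lambda/<id>; failure statuses are not documented here, so the sketch only recognises `completed`.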

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
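A pre-upload check against this list might look like the sketch below. Grouping the extensions into video, image, and audio is an assumption made for readability (the list above is flat), and `classify_upload` is a hypothetical helper name.

```python
from pathlib import PurePath
from typing import Optional

# Assumption: grouping by media kind; the source list is flat.
SUPPORTED_FORMATS = {
    "video": {"mp4", "mov", "avi", "webm", "mkv"},
    "image": {"jpg", "png", "gif", "webp"},
    "audio": {"mp3", "wav", "m4a", "aac"},
}


def classify_upload(filename: str) -> Optional[str]:
    """Return the media kind for a supported file, or None if unsupported."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    for kind, extensions in SUPPORTED_FORMATS.items():
        if extension in extensions:
            return kind
    return None
```

Rejecting unsupported files before upload avoids spending a request on something the backend would refuse.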

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
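The silent-stream fallback can be sketched like this. It assumes the caller kept a copy of the draft from before the edit and can fetch the latest draft from the session state; the function name and the wording of the fallback messages are hypothetical.

```python
from typing import Callable, Iterable


def reply_for_turn(
    text_chunks: Iterable[str],
    draft_before: dict,
    fetch_draft: Callable[[], dict],
) -> str:
    """Return the streamed text, or a fallback summary when the stream was silent."""
    text = "".join(text_chunks).strip()
    if text:
        return text
    # Silent stream: check the session state to see whether the draft changed.
    draft_after = fetch_draft()
    if draft_after == draft_before:
        return "No change detected in the draft yet."
    changed = sorted(
        key
        for key in set(draft_before) | set(draft_after)
        if draft_before.get(key) != draft_after.get(key)
    )
    return "Edit applied. Changed draft fields: " + ", ".join(changed)
```

Comparing top-level draft fields is a coarse check; a fuller summary would walk the tracks and segments to describe what actually changed.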

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | |

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

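Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)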
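To show how the draft field mapping might be used, the sketch below turns a draft into one summary line per track. The nesting is an assumption (a top-level `t` list whose tracks each carry `tt` and an `sg` list of segments with `d` in milliseconds), as is treating a track's length as the sum of its segment durations. `summarize_draft` is a hypothetical name.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list[str]:
    """Return one human-readable line per track in the draft."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Assumption: segments play back to back, so durations add up.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```

For a draft with a 10-second video track, a 10-second audio track, and a 3-second text track, this yields three lines in track order, which is the kind of summary to present instead of raw draft JSON.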

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | | |

Best Practices

For the best animation results, start with a high-resolution image — blurry or heavily compressed photos tend to produce less convincing motion. Images with clear foreground and background separation (like a subject against an open sky or landscape) animate most naturally because the AI can apply parallax-style depth movement.

Be specific in your motion description. Instead of saying 'make it move,' try 'add a slow leftward pan with a subtle zoom-in on the subject's face.' The more directional detail you provide, the more precisely the output matches your vision.

Avoid images with dense, overlapping text or complex geometric patterns — these can distort when motion is applied. Portraits work best when the subject is centered and well-lit. Finally, if you're planning to use the clip on social media, mention your target platform so the output can be framed and paced appropriately for that format.

Use Cases

Image-to-video-free-ai serves a surprisingly wide range of real-world needs. Social media managers use it to transform static brand assets into scroll-stopping Reels and TikToks without a video production budget. Wedding photographers offer animated highlight previews to clients by turning a few key photos into short cinematic clips.

E-commerce businesses animate product photos to simulate a 360-degree-style view or spotlight key features with motion, which can draw more engagement than a static listing. Educators and presenters use animated images to make slideshows and explainer content feel more dynamic without filming new footage.

Memorial and tribute creators use the skill to gently animate old family photos, giving cherished memories a new dimension. Digital artists animate their illustrations to share on platforms that favor video content over static posts.

Common Workflows

Most users start by uploading a single image and describing the type of motion they want — this is the fastest path to a finished video clip. For portraits, common requests include subtle head tilts, eye blinks, or hair movement. For landscapes, users typically ask for parallax depth effects, moving clouds, or flowing water.

Another popular workflow is batch animation: uploading multiple product or travel photos and generating a series of short clips that can be stitched together into a slideshow-style video. This is especially useful for e-commerce sellers and travel bloggers.

For social media creators, a third workflow involves starting with AI-generated motion and then layering on captions or music using a separate editing tool — using this skill purely for the animation step before final production polish.
