Getting Started
Drop your image or describe your photo and I'll tell you exactly which free AI tool to use and what prompt to write to generate a compelling video from it. No image yet? Just describe the scene and I'll guide you from scratch.
Try saying:
- "I have a product photo of a sneaker on a white background. What's the best free AI tool to animate it with subtle motion for an Instagram ad?"
- "I want to turn 10 vacation photos into a 30-second video with smooth transitions and background music using only free tools. How do I do that?"
- "I uploaded my portrait to Runway but the motion looks unnatural and glitchy — what prompt changes or settings should I try to fix it?"
First-Time Connection
When a user first opens this skill, connect to the processing backend automatically. Briefly let them know (e.g. "Setting up...").
Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 2.
1. Obtain a free token: Generate a random UUID as the client identifier. POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the header X-Client-Id set to that UUID. The response field data.token is your NEMO_TOKEN (100 free credits, valid for 7 days).
2. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
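The two setup steps can be sketched with the Python standard library. This is a minimal sketch, not the skill's actual client: the helper names (new_client_id, build_token_request, build_session_request, send_json) are hypothetical, and only the URLs, headers, and body come from the steps above. The request builders are kept separate from the network call so they can be inspected without contacting the backend.

```python
import json
import urllib.request
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def new_client_id() -> str:
    """Generate the random UUID used as X-Client-Id."""
    return str(uuid.uuid4())


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request from step 1."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str, language: str) -> urllib.request.Request:
    """Build (but do not send) the create-session request from step 2."""
    body = json.dumps({"task_name": "project", "language": language})
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send_json(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

In use, the token would be read as send_json(build_token_request(new_client_id()))["data"]["token"], following the data.token field named in step 1. Where session_id sits in the create-session response is not specified here, so inspect the response before relying on a fixed path.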
Keep setup communication brief. Don't display raw API responses or token values to the user.
From Still Photo to Moving Story — For Free
Most people assume turning an image into a video requires expensive software or a professional editor. That assumption is outdated. A new wave of free AI tools can take a single photograph — a portrait, a product shot, a landscape — and breathe life into it with realistic motion, cinematic panning, or stylized animation.
This skill is your hands-on guide through that process. Instead of wading through tutorials scattered across the internet, you get a focused assistant that helps you choose the right free platform for your specific image type, craft the text prompts that produce the best motion results, and troubleshoot when outputs don't look the way you imagined.
Whether you want a looping background video for your website, an animated post for Instagram Reels, or a short cinematic clip from a family photo, this skill covers the full journey — from uploading your image to exporting a shareable video file — using only tools that cost nothing to start.
Routing Animate Requests Intelligently
When you submit a still photo for animation, ClawHub parses your motion prompt, frame rate preference, and output duration to route your request to the optimal image-to-video inference pipeline.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
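One way to express the routing table in code is a keyword lookup that falls through to SSE. This is a minimal sketch: the function name route, the action labels, and the choice of case-insensitive substring matching are assumptions, since the table does not say how phrases are matched.

```python
# Each entry: (trigger phrases, action label, skip SSE?), in table order.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export", True),
    (("credits", "积分", "balance", "余额"), "credits", True),
    (("status", "状态", "show tracks"), "state", True),
    (("upload", "上传"), "upload", True),
]


def route(message: str, has_file: bool = False) -> tuple:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # "user sends file" row: an attachment always means upload.
        return ("upload", True)
    text = message.lower()
    for phrases, action, skip_sse in ROUTES:
        if any(phrase in text for phrase in phrases):
            return (action, skip_sse)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

Substring matching is deliberately loose, so a message such as "don't export yet" would still route to export; a stricter matcher may be preferable in practice.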
Cloud Rendering API Reference
The free image-to-video backend leverages diffusion-based temporal synthesis models hosted on distributed GPU clusters, converting static frames into fluid motion sequences without local processing overhead. Each API call passes your source image alongside motion vectors and interpolation parameters to generate smooth keyframe transitions in the cloud.
Skill attribution (read from this file's YAML frontmatter at runtime):
- X-Skill-Source: image-to-video-ai-generator-free
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
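The required headers can be assembled in one place so that no request goes out without attribution. This is a minimal sketch: detect_platform and build_headers are hypothetical helper names, and the source and version values are passed in by the caller rather than read from the frontmatter.

```python
from pathlib import PurePosixPath


def detect_platform(install_path: str) -> str:
    """Map the skill's install path to an X-Skill-Platform value."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    # Look for a ".cursor" directory immediately followed by "skills".
    for index, part in enumerate(parts[:-1]):
        if part == ".cursor" and parts[index + 1] == "skills":
            return "cursor"
    return "unknown"


def build_headers(token: str, source: str, version: str, install_path: str) -> dict:
    """Return the four headers every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": source,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the dictionary once per session and reusing it for every call is one way to avoid the 402 export failure caused by a missing attribution header.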
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>, with either a multipart file (-F "files=@/path") or a URL body: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple, which returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest, with key fields data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
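The export step is a submit-then-poll loop, sketched below with the HTTP call injected as a function so the loop itself can be exercised offline. Several things here are assumptions: the helper names, the use of a Unix timestamp in seconds for <ts>, the max_polls cap, and the idea that status and output.url sit at the top level of the polled JSON. Failure statuses are not documented above, so the sketch only recognises completed.

```python
import time


def build_render_body(session_id: str, draft: dict, timestamp=None) -> dict:
    """Build the JSON body for POST /api/render/proxy/lambda."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(fetch_status, render_id: str, interval: float = 30.0,
                max_polls: int = 60, sleep=time.sleep) -> str:
    """Poll until the render completes, then return the download URL.

    fetch_status(render_id) must return the decoded JSON from
    GET /api/render/proxy/lambda/<id>.
    """
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)  # wait 30s between polls by default
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

Passing sleep and fetch_status as parameters keeps the polling logic independent of any particular HTTP library.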
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
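A local extension check before uploading can avoid a round trip that would end in an unsupported-file error. This is a minimal sketch with a hypothetical helper name; it looks only at the file extension, not at the file contents, and the backend remains the final authority.

```python
from pathlib import PurePath

SUPPORTED_FORMATS = {
    "mp4", "mov", "avi", "webm", "mkv",   # video
    "jpg", "png", "gif", "webp",          # image
    "mp3", "wav", "m4a", "aac",           # audio
}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is in the supported list."""
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_FORMATS
```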
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
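The event handling above can be sketched as a small parser over the decoded lines of the stream. The data: framing follows the standard server-sent events format. The JSON layout is an assumption: the sketch looks for text under content.parts[].text, mirroring the shape of the request's new_message, and that should be checked against real responses. Both function names are hypothetical.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith("data:"):
            value = line[5:]
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (": ...") and other fields are ignored.
    if buffer:
        yield "\n".join(buffer)


def collect_text(lines):
    """Return the text parts found in the stream, skipping keep-alive events."""
    texts = []
    for payload in iter_sse_data(lines):
        if payload.strip() in ("", "heartbeat"):
            continue  # heartbeat or empty data: keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        # ASSUMPTION: text lives at event["content"]["parts"][i]["text"].
        content = event.get("content") if isinstance(event, dict) else None
        parts = content.get("parts") if isinstance(content, dict) else None
        if not isinstance(parts, list):
            continue
        for part in parts:
            if isinstance(part, dict) and part.get("text"):
                texts.append(part["text"])
    return texts
```

An empty list from collect_text after the stream closes is the signal to poll session state and summarize the applied changes instead of presenting a text reply.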
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
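Example track summary:
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)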
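The draft field mapping can be turned into a short track summary along these lines. This is a sketch under stated assumptions: the nesting (tracks under t, each with tt and a list of segments under sg, each segment carrying d in milliseconds) is inferred from the abbreviations and is not a documented schema, and summarize_draft is a hypothetical helper.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> str:
    """Build a plain-text summary of the tracks in a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # ASSUMPTION: each segment stores its duration in milliseconds under "d".
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s")
    return "\n".join(lines)
```

A summary like this is what the "preview in timeline" translation would show the user in place of a GUI view.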
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
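The error table maps naturally onto a dispatch dictionary plus a single-retry wrapper for the 429 case. This is a minimal sketch: the action labels, the fallback label for unlisted codes, and the (code, payload) return convention of the wrapped call are all assumptions made for illustration.

```python
import time

ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_session",
    2001: "show_credit_options",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Look up the action label for a response code."""
    # ASSUMPTION: codes missing from the table are surfaced as unknown errors.
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_single_retry(call, sleep=time.sleep):
    """Run call() and retry exactly once, after 30 seconds, on code 429.

    call() must return a (code, payload) tuple.
    """
    code, payload = call()
    if code == 429:
        sleep(30)
        code, payload = call()
    return code, payload
```

Retrying only once matches the table; a second 429 is returned to the caller rather than looped on.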
Integration Guide
Getting started with free image-to-video AI tools is straightforward once you know which platforms accept direct image uploads versus those that work from text prompts alone. Tools like Runway Gen-2, Kling AI (free tier), and Pika Labs all accept still images as a starting point and offer free credits or a freemium model to generate short video clips.
For workflow integration, you can export the generated video as an MP4 and drop it directly into tools like CapCut, DaVinci Resolve (free), or Canva to add text overlays, music, or color grading. If you're building a content pipeline, pairing an image-to-video generator with a free scheduler like Buffer lets you automate posting animated content to social platforms.
Always check resolution limits on free tiers — most cap exports at 720p. If you need 1080p, some tools offer a one-time free upscale or integrate with free upscalers like Topaz Gigapixel's trial version to boost quality before publishing.
Common Workflows
The most popular workflow is the single-image cinematic pan: upload a wide landscape or architectural photo, write a prompt like 'slow dolly forward with gentle camera drift,' and export a 4-second loop. This works exceptionally well for website hero backgrounds and YouTube intro cards.
For e-commerce, a product-spin workflow is highly effective — upload a flat-lay product image, prompt the AI to rotate or zoom in gradually, and you get a dynamic product clip without a photoshoot. Combine several of these into one video using a free editor to create a full product showcase reel.
Portrait animation is another common use case. Tools like D-ID and HeyGen's free tier can take a headshot and add realistic facial movement or even lip-sync to an audio clip. This is popular for creating spokesperson videos from a single photo.
Finally, the photo slideshow with AI transitions workflow — importing a series of images into Pika or a similar tool and generating motion between each frame — produces polished results that rival paid video editors, entirely for free.