Getting Started

Welcome! I'm here to help you transform your still images into dynamic, motion-filled videos using Vheer AI Image to Video. Share your image or describe what you're working with, and let's create something that moves — literally.

Try saying:

- "Animate this product photo with a slow zoom-in and soft bokeh motion effect for an Instagram reel"
- "Turn my landscape photograph into a cinematic video with a gentle parallax drift and moody atmosphere"
- "Convert this illustrated portrait into a short looping video with subtle facial animation for a social media post"

Getting Connected

Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".

If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:

- Generate a UUID as the client identifier
- POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits valid for 7 days; use it as NEMO_TOKEN

Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
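The connection steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the function names are hypothetical, the `Content-Type` header is assumed, and the attribution headers described later in this document are left out for brevity. The functions only describe the requests, so any HTTP client can send them.

```python
import json
import os
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id=None):
    """Describe the POST that asks for a free starter token."""
    client_id = client_id or str(uuid.uuid4())
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
        "body": None,
    }


def create_session_request(token, language="en"):
    """Describe the POST that creates a session with Bearer authorization."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            # Assumption: the JSON body is sent with this content type.
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project", "language": language}),
    }


def first_request(env=os.environ):
    """Reuse NEMO_TOKEN when it is set; otherwise request a starter token."""
    token = env.get("NEMO_TOKEN")
    if token:
        return create_session_request(token)
    return anonymous_token_request()
```

When `first_request` returns the anonymous-token call, the token from its response would then be passed to `create_session_request`.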

Tell the user you're ready. Keep the technical details out of the chat.

Give Your Photos a Heartbeat With Motion

Still images carry stories, but video carries emotion. Vheer AI Image to Video bridges that gap by intelligently analyzing your photos and generating smooth, natural motion sequences that feel intentional and cinematic — not mechanical or glitchy.

Whether you're working with a portrait, a landscape, a product shot, or an illustrated artwork, this skill interprets the visual content and applies motion that complements the subject. A mountain scene gets a slow atmospheric drift. A portrait gets subtle life-like movement. A product image gets a polished reveal-style animation.

This skill is built for creators who move fast. You don't need a timeline editor, keyframes, or a render farm. Describe your image and your desired motion style, and the skill handles the transformation. The result is shareable video content ready for social media, presentations, or anywhere still images simply don't do justice to your vision.

Motion Request Routing Logic

When you submit an image for animation, Vheer AI parses your motion prompt, frame rate preference, and movement style to route your request to the optimal generation pipeline.

| User says... | Action | Skip SSE? |
| --- | --- | --- |
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
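The routing table can be read as a keyword lookup with an SSE fallback. The sketch below assumes case-insensitive substring matching and checks rows in table order; both choices, along with the `route` helper and its action names, are illustrative assumptions rather than documented behavior.

```python
# Keyword groups in the same order as the routing table.
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message."""
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    if has_file:
        # A message that carries a file is treated as an upload.
        return "upload", True
    # Everything else (generate, edit, add BGM) goes through SSE.
    return "sse", False
```

Substring matching is deliberately simple; a message such as "upload this and animate it" would be routed to upload only, so a real router may need a follow-up SSE call.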

Vheer Cloud Processing Reference

Vheer AI's backend queues your image-to-video job across distributed GPU clusters, applying temporal coherence algorithms to maintain subject integrity across generated frames. Render times scale with output resolution, motion complexity, and current cluster load.

Skill attribution — read from this file's YAML frontmatter at runtime:

- X-Skill-Source: vheer-ai-image-to-video
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
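The header requirements above might be assembled like this. It is a sketch only: `detect_platform` and `attribution_headers` are hypothetical helpers, the version string is passed in rather than parsed from the YAML frontmatter, and the path check is a plain substring test.

```python
def detect_platform(install_path):
    """Map an install path to a platform label."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def attribution_headers(token, version, install_path,
                        source="vheer-ai-image-to-video"):
    """Build the headers that every request is expected to carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": source,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to send an export request without attribution, which the text says fails with 402.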

API base: https://mega-api-prod.nemovideo.ai

Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
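A rough sketch of the message body and of reading the event stream follows. The body fields come from the text above; the parser follows the general `text/event-stream` line format and makes no assumption about what each payload contains. Both function names are hypothetical.

```python
def run_sse_body(session_id, message):
    """Build the JSON body for POST /run_sse."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }


def iter_sse_data(lines):
    """Yield the data payload of each event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # One leading space after the colon is not part of the payload.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line ends the current event.
            yield "\n".join(buffer)
            buffer = []
    if buffer:
        # Lenient choice: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)
```

Comment lines (those starting with `:`) and other fields are ignored here; each yielded string can then be decoded with `json.loads` if the payload turns out to be JSON.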

Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple — returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
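The export-and-poll loop could look roughly like the sketch below. Several details are assumptions: `<ts>` is taken to be a Unix timestamp in seconds, `status` and `output.url` are read as top-level fields of the poll response, the cap of 30 polls is arbitrary, and failure statuses are not handled because the text does not list them. `fetch_status` is a hypothetical callable that performs the GET and returns the decoded JSON.

```python
import time


def export_body(session_id, draft, timestamp=None):
    """Build the JSON body for POST /api/render/proxy/lambda."""
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def wait_for_render(fetch_status, render_id, interval=30, max_polls=30,
                    sleep=time.sleep):
    """Poll until the render reports completed, then return the download URL."""
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete in {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without waiting 30 seconds between polls.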

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
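A client could check a file against this list before uploading. The sketch below assumes the check is by file extension only; `is_supported_file` is a hypothetical helper, and the backend may validate files differently.

```python
import os

SUPPORTED_FORMATS = {
    "mp4", "mov", "avi", "webm", "mkv",   # video
    "jpg", "png", "gif", "webp",          # image
    "mp3", "wav", "m4a", "aac",           # audio
}


def is_supported_file(filename):
    """Return True when the file extension is in the supported list."""
    extension = os.path.splitext(filename)[1].lower().lstrip(".")
    return extension in SUPPORTED_FORMATS
```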

SSE Event Handling

| Event | Action |
| --- | --- |
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
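One way to handle a silent stream is sketched below. It assumes that "verifying the edit" can be approximated by comparing the draft before and after the call; `resolve_turn` and `fetch_draft` are hypothetical names, and `fetch_draft` stands in for a call to the session state endpoint.

```python
def resolve_turn(text_chunks, draft_before, fetch_draft):
    """Decide what to report after an SSE turn finishes."""
    text = "".join(text_chunks).strip()
    if text:
        # The stream carried a reply, so it can be presented directly.
        return {"source": "stream", "text": text}
    # No text arrived: fall back to the session state.
    draft_after = fetch_draft()
    return {
        "source": "state",
        "changed": draft_after != draft_before,
        "draft": draft_after,
    }
```

When `changed` is true, the caller can summarize the difference between the two drafts for the user instead of echoing backend text.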

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
| --- | --- |
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | |

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

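Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)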
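The draft field mapping can be applied to produce a short track summary. The sketch assumes that tracks sit in a top-level `t` list, that each track holds its segments in `sg`, and that each segment carries its own `d`; this nesting is an assumption, since only the abbreviations themselves are documented. `summarize_draft` is a hypothetical helper.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return a short text summary of the tracks in a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        # d is in milliseconds; sum the segments and convert to seconds.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```

For a draft with one 10-second video track and one 3-second text track, this yields three lines: a header, `1. video: 10s`, and `2. text: 3s`.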

Error Handling

| Code | Meaning | Action |
| --- | --- | --- |
| 0 | Success | Continue |
| 1001 | | |

Performance Notes

Vheer AI Image to Video performs best with images in standard aspect ratios such as 1:1, 4:5, 16:9, or 9:16, which correspond to common social and video platform formats. Unusual crops or extreme panoramic images may require additional guidance on which section to animate.

Generation time varies based on the complexity of the requested motion and the resolution of the source image. Simple zoom or drift effects on clean images typically process faster than multi-layered parallax animations on detailed scenes.

Output videos are optimized for digital distribution and are well-suited for direct upload to platforms like Instagram, TikTok, LinkedIn, and YouTube Shorts. If you need a specific duration or frame rate, mention it upfront so the output matches your platform's requirements without post-processing adjustments.

Best Practices

For the best results with vheer-ai-image-to-video, start with high-resolution images that have a clear subject and well-defined foreground and background layers. Images with strong compositional depth — like a subject in front of a landscape — tend to produce the most convincing parallax and motion effects.

Be specific when describing the motion style you want. Instead of saying 'make it move,' try 'apply a slow rightward pan with a slight zoom on the subject.' The more directional context you provide, the more the output aligns with your creative intent.

Avoid heavily compressed or low-light images, as artifacts in the source photo can become amplified during motion generation. If your image has a busy background with no clear focal point, consider cropping or adjusting contrast before submission to help the skill identify motion zones accurately.

Use Cases

Vheer AI Image to Video is a versatile skill that serves a wide range of creative and professional needs. E-commerce brands use it to animate product photography into attention-grabbing video ads that outperform static image posts in engagement metrics.

Content creators and influencers use it to repurpose existing photo libraries into fresh video content, extending the lifespan of assets they've already invested in creating. A single well-shot photo can become multiple videos with different motion styles for different platforms.

Event planners, real estate agents, and travel marketers use it to create immersive previews — turning a venue photo into a sweeping walkthrough feel, or a property exterior into a cinematic reveal. Artists and illustrators use it to showcase their work in motion, adding depth and drama that a static gallery simply cannot replicate.

