Getting Started
Welcome! VirtualOver is ready to help you layer virtual graphics, overlays, and composited assets directly onto your video footage. Upload your clip and tell me what you'd like to overlay — let's build something that looks great.
Try saying:
- "Add a semi-transparent logo watermark to the bottom-right corner of my product demo video"
- "Overlay an animated lower-third title card that appears at the 5-second mark of my interview clip"
- "Place a virtual HUD display graphic over my drone footage to make it look like a live mission feed"
First-Time Connection
When a user first opens this skill, connect to the NemoVideo backend automatically. Briefly let them know (e.g. "Connecting to NemoVideo...").
Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 3.
1. Client identifier: Look for ~/.config/nemovideo/client_id. If it is missing, generate a UUID and save it there. This file stores only a UUID for rate-limit tracking.
2. Obtain a free token: POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to your identifier. The response field data.token is your NEMO_TOKEN (100 free credits, valid 7 days).
3. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
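The three connection steps above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the helper names are hypothetical, and the requests are only built, never sent, so the response handling (reading data.token and session_id) is left to the caller.

```python
import json
import uuid
from pathlib import Path
from urllib.request import Request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def load_or_create_client_id(path: Path) -> str:
    """Return the UUID stored at `path`, creating the file if it is missing."""
    if path.exists():
        stored = path.read_text().strip()
        if stored:
            return stored
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id)
    return client_id


def anonymous_token_request(client_id: str) -> Request:
    """Build (but do not send) the POST that obtains a free token."""
    return Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def create_session_request(token: str, language: str) -> Request:
    """Build (but do not send) the POST that creates a session."""
    body = json.dumps({"task_name": "project", "language": language}).encode()
    return Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Keeping request construction separate from sending makes it easy to check headers and bodies without touching the network; passing a built request to urllib.request.urlopen would perform the actual call.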
Keep setup communication brief. Don't display raw API responses or token values to the user.
Blend the Virtual and Real Like a Pro
VirtualOver is designed for one specific job: compositing virtual elements onto your existing video footage with precision and ease. Whether that means dropping a motion graphic onto a talking-head video, layering a transparent logo watermark across a product demo, or placing an animated overlay on top of a screen recording, VirtualOver does it cleanly and quickly.
Unlike general-purpose video editors that bury overlay features under menus and timelines, VirtualOver puts the compositing workflow front and center. You describe what you want, upload your footage, and get back a result that looks intentional and polished rather than slapped together.
Content creators building YouTube videos, social media managers producing branded reels, and indie filmmakers adding VFX touches will all find VirtualOver a natural fit. It works with the video formats you already use (mp4, mov, avi, webm, and mkv), so no conversion step stands between you and your finished product.
Routing Your Overlay Requests
When you describe a virtual element to composite — whether it's a 3D object, animated graphic, or tracked text — VirtualOver parses your intent and routes it to the appropriate NemoVideo rendering pipeline based on overlay type, anchor behavior, and footage context.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
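One way to read the routing table is as a keyword lookup with a fallback to SSE. The sketch below assumes simple case-insensitive substring matching, which the document does not specify; the function name and the action labels are hypothetical.

```python
# Keyword groups taken from the routing table; the matching strategy is an assumption.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A file attachment always goes to the upload endpoint.
        return "upload", True
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

For example, route("please export it") yields ("export", True), while a free-form editing request falls through to ("sse", False).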
NemoVideo API Reference
VirtualOver runs on the NemoVideo backend, which handles real-time motion tracking, depth estimation, and alpha compositing to lock virtual elements convincingly onto moving footage. Every render call passes your clip metadata, overlay specs, and tracking parameters directly to NemoVideo's compositing engine.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: virtualover
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
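A possible way to assemble the required headers is sketched below. Platform detection here inspects path components instead of expanding the home directory, which is a simplification; the function names are hypothetical, and the source and version values are assumed to be supplied by the caller after reading the frontmatter.

```python
from pathlib import PurePosixPath


def detect_platform(install_path: str) -> str:
    """Guess the host platform from the skill's install path."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    # Look for a ".cursor" directory immediately followed by "skills".
    for parent, child in zip(parts, parts[1:]):
        if (parent, child) == (".cursor", "skills"):
            return "cursor"
    return "unknown"


def attribution_headers(token: str, source: str, version: str, install_path: str) -> dict:
    """Build the headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": source,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place reduces the chance of an export failing with 402 because an attribution header was forgotten on a single call.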
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id. After creating a session, give the user a link: INLINECODE29
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: INLINECODE35
Credits: GET /api/credits/balance/simple — returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
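The export-and-poll flow might look like the sketch below. The status fetcher is injected so the loop can be exercised without network access. The poll cap (max_polls) is a hypothetical safeguard, and the response is assumed to expose status and output.url at the top level, which the document implies but does not show.

```python
import time
from typing import Callable


def build_export_body(session_id: str, draft: dict, timestamp: int) -> dict:
    """Assemble the JSON body for POST /api/render/proxy/lambda."""
    return {
        "id": f"render_{timestamp}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_until_completed(
    fetch_status: Callable[[str], dict],
    render_id: str,
    interval_s: float = 30.0,
    max_polls: int = 40,  # hypothetical cap, not from the API description
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the render status until it is completed, then return the download URL."""
    for _ in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        sleep(interval_s)
    raise TimeoutError(f"render {render_id} did not complete after {max_polls} polls")
```

In real use, fetch_status would wrap a GET to /api/render/proxy/lambda/<id> with the attribution headers attached.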
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
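The handling rules above can be illustrated with a small stream reader. The payload format of NemoVideo's SSE events is not documented here, so text extraction is delegated to a caller-supplied function; the sample extractor text_field, which assumes a JSON object with a "text" key, is purely hypothetical.

```python
import json
from typing import Callable, Iterable, Optional


def collect_text(
    lines: Iterable[str],
    extract_text: Callable[[str], Optional[str]],
) -> tuple[str, bool]:
    """Return (text, needs_state_poll) once the stream has closed."""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # event names, comments, blank separators
        payload = line[len("data:"):].strip()
        if not payload or payload == "heartbeat":
            continue  # heartbeat or empty data: keep waiting
        text = extract_text(payload)
        if text:
            chunks.append(text)
    # No text at all means the edit must be verified via session state.
    return "".join(chunks), not chunks


def text_field(payload: str) -> Optional[str]:
    """Hypothetical extractor: assumes each payload is JSON with a 'text' key."""
    return json.loads(payload).get("text")
```

When the second element of the result is True, the caller would poll the session state endpoint and summarize the changes itself.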
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
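Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)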
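To show how the draft field mapping might be turned into a track summary, here is a small sketch. The nesting is an assumption: it supposes the draft holds a list of tracks under t, each with a type tt and a list of segments sg whose durations are stored in d. The real draft layout may differ.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_tracks(draft: dict) -> list[str]:
    """Produce one summary line per track: index, type, total duration in seconds."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        # Sum segment durations (milliseconds) for the whole track.
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return lines


# Hypothetical draft with one video track and one text track.
example_draft = {
    "t": [
        {"tt": 0, "sg": [{"d": 10000}]},
        {"tt": 7, "sg": [{"d": 3000}]},
    ]
}
```

Calling summarize_tracks(example_draft) returns ["1. video: 10s", "2. text: 3s"].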
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
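The error table lends itself to a lookup from code to next step. The action labels below are hypothetical names chosen for illustration, and the fallback for unlisted codes is an assumption.

```python
# Error code -> next step, following the table above. Labels are illustrative.
ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_session",
    2001: "prompt_for_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_registration",
    429: "retry_after_30s",
}


def next_action(code: int, retries_so_far: int = 0) -> str:
    """Pick the follow-up for an error code; 429 is retried only once."""
    if code == 429 and retries_so_far >= 1:
        return "give_up"
    return ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 separate from 2001 mirrors the table's point that a blocked export is a subscription tier issue, not a credit shortage.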
Use Cases
VirtualOver shines across a wide range of video production scenarios. Marketers use it to stamp branded overlays onto testimonial videos before publishing across social channels — keeping visual identity consistent without a full editing suite. YouTubers and streamers add lower-thirds, subscriber CTAs, and animated badges to their footage to give uploads a more produced feel.
Filmmakers and indie creators use VirtualOver to composite simple VFX elements, such as glowing screens, digital readouts, or sci-fi interface graphics, onto live-action footage without needing After Effects expertise. Corporate teams apply it to training videos, adding instructional callouts or highlight boxes that draw viewer attention to key moments.
Even podcasters producing video versions of their shows use VirtualOver to add chapter titles, speaker name cards, and sponsor graphics cleanly over their recordings. If your workflow involves putting something virtual on top of something real, VirtualOver is built for it.
Troubleshooting
If your overlay appears in the wrong position, double-check how you described the placement. Terms like 'top-left', 'center-bottom', or specific pixel coordinates (e.g., '50px from the right edge, 30px from the bottom') give VirtualOver the clearest instructions and reduce placement guesswork.
If the overlay timing feels off, appearing too early or cutting out before expected, revisit your timecode instructions. Specifying start and end times in HH:MM:SS format tends to produce more accurate results than vague descriptions like 'near the end'.
For transparency issues where a PNG overlay appears with a white or black background instead of being transparent, confirm that your source image file genuinely has an alpha channel. Not all PNG files exported from design tools preserve transparency correctly. Re-export from your design source with 'transparent background' explicitly enabled.
If the output video quality looks degraded, state your preferred output resolution or bitrate in your prompt. VirtualOver respects quality parameters when they are clearly specified.
Quick Start Guide
Getting started with VirtualOver takes less than a minute. First, upload your base video file; mp4, mov, avi, webm, and mkv are all supported. Then describe the overlay you want: specify what it is, where it should appear on the frame, when it should show up (timecode or duration), and any sizing or opacity preferences you have.
For image-based overlays like logos or PNG graphics, mention the file or describe the asset and VirtualOver will work with what you provide. For text overlays like lower-thirds or title cards, just describe the text content, font style preference, and position. The more specific your description, the more closely the output will match your vision on the first pass.
Once processed, review the output and use follow-up prompts to fine-tune positioning, timing, or opacity. Iterating is fast — you don't need to re-upload the base footage each time you adjust overlay details.
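Quick Start
Welcome! VirtualOver is ready to help you layer virtual graphics, overlays, and composited assets directly onto your video footage. Upload your clip and tell me what you'd like to overlay, and let's build something impressive together.
Try saying:
- "Add a semi-transparent logo watermark to the bottom-right corner of my product demo video"
- "Overlay an animated lower-third title card at the 5-second mark of my interview clip"
- "Place a virtual HUD display graphic over my drone footage to make it look like a live mission feed"
First-Time Connection
When a user first opens this skill, connect to the NemoVideo backend automatically. Briefly let them know (e.g. "Connecting to NemoVideo...").
Authentication: Check if NEMO_TOKEN is set in the environment. If it is, skip to step 3.
1. Client identifier: Look for ~/.config/nemovideo/client_id. If it is missing, generate a UUID and save it to this file. The file stores only a UUID for rate-limit tracking.
2. Obtain a free token: POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id set to your identifier. The response field data.token is your NEMO_TOKEN (100 free credits, valid 7 days).
3. Create a session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Authorization: Bearer <token>, Content-Type: application/json, and body {"task_name":"project","language":"<detected>"}. Store the returned session_id for all subsequent requests.
Keep setup communication brief. Don't display raw API responses or token values to the user.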
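Blend the Virtual and Real Like a Pro
VirtualOver is designed for one specific job: compositing virtual elements onto your existing video footage with precision and ease. Whether that means adding a motion graphic to a talking-head interview video, layering a transparent logo watermark over a product demo, or placing an animated overlay on a screen recording, VirtualOver does it cleanly and quickly.
Unlike general-purpose video editors that hide overlay features under menus and timelines, VirtualOver puts the compositing workflow at the center. You describe what you need, upload your footage, and get back a result that looks carefully designed and professional rather than thrown together.
Content creators making YouTube videos, social media managers producing branded short videos, and indie filmmakers adding visual effects will all find VirtualOver a good fit. It supports the formats you already use (mp4, mov, avi, webm, and mkv), so no conversion step stands between you and the finished product.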
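Routing Your Overlay Requests
When you describe a virtual element to composite, whether it's a 3D object, an animated graphic, or tracked text, VirtualOver parses your intent and routes it to the appropriate NemoVideo rendering pipeline based on overlay type, anchor behavior, and footage context.
| User says... | Action | Skip SSE? |
|---|---|---|
| export / 导出 / download / send me the video | → §3.5 Export | ✅ |
| credits / 积分 / balance / 余额 | → §3.3 Credits | ✅ |
| status / 状态 / show tracks | → §3.4 State | ✅ |
| upload / 上传 / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM, etc.) | → §3.1 SSE | ❌ |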
NemoVideo API Reference
VirtualOver runs on the NemoVideo backend, which handles real-time motion tracking, depth estimation, and alpha compositing to lock virtual elements convincingly onto moving footage. Every render call passes your clip metadata, overlay specs, and tracking parameters directly to NemoVideo's compositing engine.
Skill attribution, read from this file's YAML frontmatter at runtime:
- X-Skill-Source: virtualover
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with a 402 error.
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent, with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id. After creating a session, give the user a link: https://nemovideo.com/workspace/claim?token=$TOKEN&task=id>&session=id>&skillname=virtualover&skillversion=1.0.0&skill_source=
Send message (SSE): POST /run_sse, with body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} and Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>, with file: multipart -F "files=@/path", or URL: {urls:[],sourcetype:url}
Credits: GET /api/credits/balance/simple; returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda, with body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
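SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
About 30% of editing operations return no text in the SSE stream. When this happens, poll the session state to verify the edit was applied, then summarize the changes for the user.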
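Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| click [button] / 点击 | Execute via API |
| open [panel] / 打开 | Query session state |
| drag/drop / 拖拽 | Send edit via SSE |
| preview in timeline | Show track summary |
| Export button / 导出 | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration (ms), m=metadata.
Timeline (3 tracks): 1. Video: city timelapse (0-10s) 2. BGM: Lo-fi (0-10s, 35%) 3. Title: "Urban Dreams" (0-3s)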
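Error Handling
| Code | Meaning | Action |
|---|---|---|
| 1001 | Bad/expired token | Re-authenticate via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue |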