Getting Started
Welcome! I'm here to help you take your video editing to the next level using AI — from planning your cut sequence to writing captions and generating b-roll ideas. Tell me about your project and let's start building something great together.
Try saying:
- "I have a 10-minute interview video. Suggest an edit structure that keeps viewers engaged and where I should cut for pacing."
- "Write 5 caption options for a 30-second product demo video aimed at Instagram Reels — upbeat, punchy tone."
- "I'm editing a travel vlog set in Tokyo. What b-roll shots should I look for in my footage to make transitions feel cinematic?"
Quick Start Setup
This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").
Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:
- Generate a UUID as client identifier
- POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with X-Client-Id header
- Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry)
Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.
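The token check and session creation above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the helper names, the injected fetch_json callable, and the Content-Type header are assumptions. The requests are built but not sent, so the network layer can be swapped in separately.

```python
import json
import os
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id=None):
    """Build (not send) the anonymous-token request with a client id."""
    client_id = client_id or str(uuid.uuid4())
    return urllib.request.Request(
        API_BASE + "/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token, task_name="project"):
    """Build (not send) the session-creation request with Bearer auth."""
    body = json.dumps({"task_name": task_name}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            # Assumption: the backend expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def resolve_token(env=os.environ, fetch_json=None):
    """Return NEMO_TOKEN from the environment, else request an anonymous one.

    fetch_json is a hypothetical callable(Request) -> dict, injected so the
    network call can be stubbed or replaced.
    """
    token = env.get("NEMO_TOKEN")
    if token:
        return token
    response = fetch_json(build_anonymous_token_request())
    return response["data"]["token"]
```

Keeping the token inside these helpers, rather than printing it, matches the rule above about not exposing tokens or raw API output.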
Edit Smarter: Your AI Co-Editor Is Ready
Most video editors don't have a shortage of footage — they have a shortage of time and creative clarity. This skill is built specifically to help you move faster from raw clips to a finished cut by acting as a knowledgeable creative partner at every decision point.
With ai-for-video-editing, you can describe your footage and get back a suggested edit structure, ask for caption ideas that match your video's tone, or request b-roll concepts that would strengthen a particular scene. You can even paste a rough script and get pacing notes, transition suggestions, or a breakdown of where to cut for maximum impact.
This skill is especially useful during the planning phase — before you even open your editing software — and during the review phase, when you need a second perspective on whether a sequence is landing the way you intended. Think of it as having an experienced editor in the room who's always available, never tired, and full of ideas.
Routing Cuts and Caption Requests
When you submit a prompt — whether it's a rough cut instruction, an auto-caption request, or a creative direction note — ClawHub parses the intent and routes it to the appropriate AI processing pipeline for timeline editing, transcription, or style generation.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
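One way to express the routing table in code is a keyword lookup that falls through to SSE. This is a rough sketch: the function name, the action labels, and the use of plain substring matching are assumptions, since the source does not say how intent is actually detected.

```python
# Keyword lists copied from the routing table; order decides ties.
ROUTES = [
    ("export", ["export", "导出", "download", "send me the video"]),
    ("credits", ["credits", "积分", "balance", "余额"]),
    ("state", ["status", "状态", "show tracks"]),
    ("upload", ["upload", "上传"]),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A file attachment always goes to the upload flow.
        return ("upload", True)
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

Substring matching is deliberately naive: a request such as "fix the white balance" would be routed to credits, so a real router would likely need stricter matching.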
Cloud Processing API Reference
Video processing calls are handled by a distributed cloud backend that runs frame analysis, speech-to-text transcription, and generative cut suggestions asynchronously — so heavy renders don't block your session. Large files are chunked and processed in parallel before the edited timeline or caption track is returned to your workspace.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: ai-for-video-editing
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
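The required headers could be assembled along these lines. The function names and the string-based path check are illustrative assumptions; how the skill reads its frontmatter version is not shown in the source, so the version is passed in as a plain argument.

```python
def detect_platform(install_path):
    """Map an install path to a platform name using the rules above."""
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token, skill_source, skill_version, install_path):
    """Return the four headers every request must carry."""
    return {
        "Authorization": "Bearer " + token,
        "X-Skill-Source": skill_source,
        "X-Skill-Version": skill_version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to send a request without attribution, which is the case that causes export to fail with 402.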
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
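As a small illustration, the /run_sse body can be built from the session id and the user's message. The helper and constant names here are assumptions; only the field names and values come from the line above.

```python
# Values taken from the description above.
SSE_HEADERS = {"Accept": "text/event-stream"}
SSE_TIMEOUT_SECONDS = 15 * 60  # maximum timeout: 15 minutes


def build_sse_payload(session_id, message):
    """Return the JSON body for POST /run_sse."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }
```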
Upload: POST /api/upload-video/nemo_agent/me/<sid>, either as a multipart file (-F "files=@/path") or as a URL body {"urls":["<url>"],"source_type":"url"}.
Credits: GET /api/credits/balance/simple, which returns available, frozen, total.
Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media.
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
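The export-and-poll flow might look like the sketch below. Several details are assumptions: <ts> is treated as a caller-supplied timestamp, the status response is assumed to carry status and output.url at the top level, and max_polls is an invented safety limit. The get_status callable is hypothetical and stands in for the GET request.

```python
import time


def build_export_body(session_id, draft, timestamp_ms):
    """Return the JSON body for POST /api/render/proxy/lambda."""
    return {
        "id": "render_" + str(timestamp_ms),
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_export(render_id, get_status, interval_s=30, max_polls=60,
                sleep=time.sleep):
    """Poll until status is 'completed', then return the download URL.

    get_status is a hypothetical callable(render_id) -> dict wrapping
    GET /api/render/proxy/lambda/<id>.
    """
    for attempt in range(max_polls):
        result = get_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError("render did not complete after %d polls" % max_polls)
```

Injecting sleep and get_status keeps the loop testable without waiting 30 seconds per poll or touching the network.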
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
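A client-side check against this list can catch unsupported files before upload. This sketch assumes the file extension is a reliable indicator of format, which the source does not state; the helper name is illustrative.

```python
import os

SUPPORTED_FORMATS = {
    "mp4", "mov", "avi", "webm", "mkv",   # video
    "jpg", "png", "gif", "webp",          # image
    "mp3", "wav", "m4a", "aac",           # audio
}


def is_supported(filename):
    """Return True if the file extension is in the supported list."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return extension in SUPPORTED_FORMATS
```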
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
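The event handling above can be sketched as a line classifier plus a collector. The event shape is an assumption not confirmed by the source: each data payload is taken to be a JSON object like {"content": {"parts": [...]}}, where parts with a "text" key are user-facing and everything else is internal.

```python
import json


def classify_sse_line(line):
    """Classify one raw SSE line as 'heartbeat', 'text', or 'internal'."""
    line = line.strip()
    if not line.startswith("data:"):
        return ("heartbeat", None)
    payload = line[len("data:"):].strip()
    if not payload:
        return ("heartbeat", None)
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return ("heartbeat", None)
    if not isinstance(event, dict):
        return ("internal", None)
    parts = (event.get("content") or {}).get("parts") or []
    texts = [p["text"] for p in parts if isinstance(p, dict) and p.get("text")]
    if texts:
        return ("text", "".join(texts))
    # Assumed to be a tool call or result: process, but do not forward.
    return ("internal", None)


def collect_text(lines):
    """Gather user-facing text; an empty result means poll session state."""
    chunks = []
    for line in lines:
        kind, text = classify_sse_line(line)
        if kind == "text":
            chunks.append(text)
    return "".join(chunks)
```

When collect_text returns an empty string after the stream closes, that corresponds to the no-text case described above, where the session state should be polled to confirm the edit.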
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
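The translation table could be applied with a simple phrase lookup, sketched below. The action labels, the substring matching, and the rule order are assumptions; the export rule is checked first here so that "click the Export button" maps to the export workflow instead of a generic click.

```python
# (keywords, action) pairs derived from the table; first match wins.
GUI_RULES = [
    (("export button", "导出"), "run_export_workflow"),
    (("preview in timeline",), "show_track_summary"),
    (("drag", "drop", "拖拽"), "send_edit_via_sse"),
    (("open", "打开"), "query_session_state"),
    (("click", "点击"), "execute_via_api"),
]


def translate_gui_instruction(backend_text):
    """Return the API-side action for a GUI-style instruction, or None."""
    text = backend_text.lower()
    for keywords, action in GUI_RULES:
        if any(keyword in text for keyword in keywords):
            return action
    return None
```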
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
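Example track summary:
Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)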
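The draft field mapping can be used to produce a short track summary for the user. This sketch assumes that d (duration in ms) sits on each segment and that segments within a track do not overlap; neither is stated in the source, and the output format is only illustrative.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return a human-readable summary of the tracks in a draft."""
    tracks = draft.get("t", [])
    lines = ["Timeline (%d tracks):" % len(tracks)]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Assumption: each segment carries its own duration in ms.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append("%d. %s: %d segment(s), %.1fs"
                     % (index, kind, len(segments), total_ms / 1000))
    return "\n".join(lines)
```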
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
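The error table maps naturally onto a lookup from code to next step. In this sketch the action labels and function names are illustrative, and registration_url is a hypothetical parameter because the source does not give the registration URL.

```python
# Codes and meanings from the table; action labels are illustrative.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_session",
    2001: "prompt_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code):
    """Return the next step for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def no_credits_message(is_anonymous, registration_url="", bind_id=""):
    """Build the user-facing message for code 2001."""
    if is_anonymous:
        # registration_url is hypothetical; bind_id comes from the
        # create-session or state response.
        return "Register to continue: %s?bind=%s" % (registration_url, bind_id)
    return "Top up credits in your account"
```

Codes 402 and 2001 are kept as separate actions because 402 is a subscription tier issue and is unrelated to the credit balance.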
Tips and Tricks
To get the most out of ai-for-video-editing, treat the skill like a creative collaborator rather than a search engine. Instead of asking broad questions like 'how do I edit better,' describe a specific problem: 'My talking-head interview feels slow around the 2-minute mark — what editing techniques can tighten it up?'
Use the skill iteratively. Start with a high-level edit plan, then zoom into individual scenes. Ask for three different pacing approaches and compare them before committing. If you're writing captions, request multiple tone variations — one formal, one casual, one punchy — so you have real options to choose from.
Another powerful use: paste in your video script or a rough transcript and ask the skill to flag natural cut points, suggest where graphics or text overlays would add clarity, or identify moments where a cutaway would improve the flow. This works especially well for educational content, explainer videos, and documentary-style edits where structure is everything.
Quick Start Guide
Getting started with this skill is straightforward — just describe your video project in plain language. You don't need to upload files or share technical specs. Start by telling the skill what type of video you're editing (tutorial, vlog, ad, documentary, short film), who your audience is, and what platform it's destined for.
From there, you can ask for a suggested edit order if you have a list of clips, request caption or subtitle copy, brainstorm a hook for your opening sequence, or get feedback on a rough cut description you paste in. The more context you give — tone, length, goal, audience — the more tailored and useful the output will be.
A good first prompt might be: 'I'm editing a 60-second brand video for a coffee shop. The footage includes barista close-ups, latte art, and customer reactions. What edit structure would work best?' From that starting point, you can drill down into transitions, music mood, caption style, and more.