0. First Contact
When the user opens this skill or sends their first message, greet them immediately:
🎵 Welcome! I can edit YouTube videos for you. Share a video file or tell me your idea!
Try saying:
- "add a fade-in transition"
- "make it look cinematic"
- "speed up by 2x"
IMPORTANT: Always greet the user proactively on first contact. Let them know you're setting up while connecting.
Auto-Setup
When the user first interacts, set up the connection:
1. Check token: If the NEMO_TOKEN env var is set, use it. Otherwise, acquire one via steps 2 and 3.
2. Read or generate Client-ID:
   - Read ~/.config/nemovideo/client_id if it exists
   - Otherwise generate a UUID and save it to ~/.config/nemovideo/client_id
3. Acquire anonymous token:
   curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
   Store the returned token as NEMO_TOKEN for this session. You get 100 free credits.
4. Create a session (§3.0) so you're ready to work immediately.
Let the user know briefly: "Setting things up… ready!" then proceed with their request.
Edit YouTube Videos by Just Describing What You Want
Most video editing tools assume you already know what a B-roll cutaway or a J-cut is. The free-youtube-video-editor skill flips that assumption entirely. You upload your raw footage and tell the AI what the final video should look like — 'remove the first 30 seconds where I'm fumbling with the mic' or 'cut straight to where I start the demo' — and the system handles the rest.
Under the hood, ClawHub's OpenClaw agent interprets your natural-language instructions and maps them to precise edit operations: trimming, splitting, reordering, and stitching segments back together in the sequence you described. The agent maintains context across the conversation, so you can refine iteratively — asking for a tighter cut here, a longer pause there — without starting over from scratch each time.
This skill is specifically designed around the YouTube production workflow. That means the output is formatted for direct upload: correct aspect ratios, clean in/out points, and no re-encoding artifacts that tank quality on YouTube's compression pipeline. Whether you're cleaning up a 45-minute tutorial or punching up a 90-second product showcase, the editing conversation stays focused on your content, not on software menus.
Environment Variables
| Variable | Required | Default |
|---|---|---|
| NEMO_TOKEN | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| NEMO_API_URL | No | https://mega-api-prod.nemovideo.ai |
| NEMO_WEB_URL | No | https://nemovideo.com |
| NEMO_CLIENT_ID | No | Auto-generated UUID, persisted to ~/.config/nemovideo/client_id (UUID only, no secrets) |
| SKILL_SOURCE | No | Auto-detected from install path, fallback unknown |
If NEMO_TOKEN is not set, get one (requires X-Client-Id header):
# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
mkdir -p ~/.config/nemovideo && echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}
Save token as NEMO_TOKEN and CLIENT_ID as NEMO_CLIENT_ID. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via Settings → API Tokens on nemovideo.com. If your token expires, request a new one with the same X-Client-Id.
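As a minimal sketch, the token can be pulled out of the anonymous-token response with sed alone, assuming the response keeps the shape shown in the comment above. The helper name extract_token is hypothetical, and the sample response below is hard-coded so no network call is made.

```bash
# Hypothetical helper: read the JSON body on stdin and print the value of
# the "token" field, or nothing if the field is absent.
extract_token() {
  sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example with the documented response shape (no request is sent here).
response='{"code":0,"data":{"token":"nmv_usr_xxx","credits":100}}'
NEMO_TOKEN=$(printf '%s\n' "$response" | extract_token)
echo "$NEMO_TOKEN"   # prints nmv_usr_xxx
```

An empty result is a reasonable signal to treat the request as failed rather than continuing with a blank token.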
Local persistence: This skill writes ~/.config/nemovideo/client_id to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.
2. Routing Incoming Requests to the Correct Endpoint
Use the table below to determine which endpoint handles each type of user request.
| User says... | Action | Skip SSE |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
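The routing table could be approximated with a plain case statement. This is only a sketch: route_request and its return labels are hypothetical names, and naive substring matching will misroute messages such as "don't export yet", so a real implementation would likely need intent detection. File attachments are not covered here.

```bash
# Hypothetical router: print which workflow should handle a user message.
route_request() {
  # Lowercase ASCII letters only, so the Chinese keywords stay untouched.
  msg=$(printf '%s' "$1" | LC_ALL=C tr 'A-Z' 'a-z')
  case "$msg" in
    *export*|*导出*|*download*|*"send me the video"*) echo "export" ;;
    *credits*|*积分*|*balance*|*余额*)               echo "credits" ;;
    *status*|*状态*|*"show tracks"*)                 echo "state" ;;
    *upload*|*上传*)                                 echo "upload" ;;
    *)                                               echo "sse" ;;
  esac
}
```

For example, `route_request "Export the video"` selects the export workflow, while any unmatched message falls through to the SSE channel.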
3. Primary Workflow Sequences
$API = ${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}, $TOKEN = ${NEMO_TOKEN}, $WEB = ${NEMO_WEB_URL:-https://nemovideo.com}.
Skill attribution variables — do NOT hardcode:
- $SKILL_NAME = the name field in this file's YAML frontmatter
- $SKILL_VERSION = the version field in this file's YAML frontmatter
- $SKILL_SOURCE = detect at runtime using the first match:
  1. Env SKILL_SOURCE if set
  2. If this file's path contains a known platform directory (e.g. ~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor)
  3. Fallback: unknown
CRITICAL: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.
X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
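One way to avoid forgetting the attribution headers is to route every call through a small wrapper. The sketch below assumes $TOKEN, $SKILL_NAME, $SKILL_VERSION and $SKILL_SOURCE are already set as described in this section; nemo_curl is a hypothetical name, not part of the API.

```bash
# Hypothetical wrapper: adds the Authorization header and the three
# X-Skill-* headers, then passes all remaining arguments through to curl.
nemo_curl() {
  curl -s \
    -H "Authorization: Bearer $TOKEN" \
    -H "X-Skill-Source: $SKILL_NAME" \
    -H "X-Skill-Version: $SKILL_VERSION" \
    -H "X-Skill-Platform: $SKILL_SOURCE" \
    "$@"
}
```

A call such as `nemo_curl "$API/api/credits/balance/simple"` would then carry the same headers as the explicit curl commands in the subsections that follow.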
3.0 Establishing a New Session
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
-d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}
Before any editing actions can occur, a session must be initialized with the backend. This session ID ties all subsequent requests together for the duration of the editing task.
Open in browser: After creating a session, give the user a link to view/edit the task in NemoVideo:
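$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE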
3.1 Delivering Messages Through the SSE Channel
curl -s -X POST "https://mega-api-prod.nemovideo.ai/run_sse" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
-d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'
All conversational messages to the editing backend are transmitted as Server-Sent Events over a persistent connection.
SSE Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.
Timeout: 10 min heartbeats-only → assume timeout. Never re-send during generation (duplicates + double-charge).
Ignore trailing "I encountered a temporary issue" if prior responses were normal.
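The event handling rules could be sketched as a per-line classifier. The exact wire format is not specified here, so this assumes standard "data:" framing and that keep-alive lines contain the word "heartbeat" or carry an empty data field; classify_sse_line and its labels are hypothetical.

```bash
# Hypothetical classifier for one raw line read from the event stream.
# Caveat: a payload that happens to contain the word "heartbeat" would be
# treated as a keep-alive by this naive match.
classify_sse_line() {
  case "$1" in
    "")                echo "separator" ;;  # blank line ends an event
    *heartbeat*)       echo "keepalive" ;;
    "data:"|"data: ")  echo "keepalive" ;;  # empty data field
    "data:"*)          echo "payload" ;;
    *)                 echo "other" ;;      # event:, id:, comment lines
  esac
}
```

A reader loop would keep waiting on keepalive lines, hand payload lines to the text or tool-call handling described in the table, and start the 10-minute timeout clock only while nothing but keep-alives arrive.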
Silent Response Fallback (CRITICAL)
Approximately 30% of editing operations complete without returning any text in the response body. When this occurs: (1) do not treat the absence of text as an error or failure, (2) poll the session state endpoint to confirm the operation status, (3) surface a neutral status message to the user such as 'Your edit is being processed,' and (4) proceed to the export step once the state confirms completion.
Two-stage generation: After delivering the raw edited video, the backend automatically initiates a second processing stage that layers in background music and generates a title overlay. These two stages are sequential and distinct — do not attempt to trigger the second stage manually or treat the intermediate raw output as the final deliverable.
3.2 Handling File Uploads
File upload: INLINECODE39
URL upload: INLINECODE40
Use me in the path; backend resolves user from token.
Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
The upload endpoint accepts video files submitted directly by the user and returns a file reference ID for use in subsequent editing requests.
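A quick client-side check against the supported list can save a failed upload. This is a sketch based only on the file extension; is_supported_upload is a hypothetical helper, and the backend remains the authority on what it accepts.

```bash
# Hypothetical pre-upload check: succeeds when the file name ends in one of
# the extensions listed as supported (case-insensitive).
is_supported_upload() {
  # Names without a dot have no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Take the text after the last dot and lowercase it.
  ext=$(printf '%s' "${1##*.}" | LC_ALL=C tr 'A-Z' 'a-z')
  case "$ext" in
    mp4|mov|avi|webm|mkv|jpg|png|gif|webp|mp3|wav|m4a|aac) return 0 ;;
    *) return 1 ;;
  esac
}
```

If the check fails, showing the supported formats to the user mirrors the handling of error 4001.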
3.3 Checking Available Credits
curl -s "https://mega-api-prod.nemovideo.ai/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}
Query the credits endpoint before initiating any edit operation to confirm the user has a sufficient balance to proceed.
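The balance check could look like the sketch below, which parses the documented response shape with sed. The cost of an individual operation is not stated in this document, so the helper only tests for a positive balance; credits_available and has_credits are hypothetical names.

```bash
# Hypothetical helper: read the balance JSON on stdin and print the integer
# value of the "available" field, or nothing if it is missing.
credits_available() {
  sed -n 's/.*"available"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Succeeds when the balance JSON passed as $1 reports at least one credit.
has_credits() {
  available=$(printf '%s\n' "$1" | credits_available)
  [ -n "$available" ] && [ "$available" -gt 0 ]
}
```

For example, `has_credits "$(curl -s ... )"` with the request shown above would gate an edit on the returned balance.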
3.4 Retrieving Current Session State
curl -s "https://mega-api-prod.nemovideo.ai/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
Use me for user in path; backend resolves from token.
Key fields: data.state.draft, data.state.video_infos, data.state.canvas_config, data.state.generated_media.
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Draft ready for export when draft.t exists with at least one track with non-empty sg.
Track summary format:
CODEBLOCK7
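The draft field mapping and the export-readiness rule can be sketched as two small helpers. The readiness check assumes jq is installed and that the state response nests the draft under data.state.draft as listed in the key fields; track_type_name and draft_ready are hypothetical names.

```bash
# Map the numeric tt field to a readable track type.
track_type_name() {
  case "$1" in
    0) echo "video" ;;
    1) echo "audio" ;;
    7) echo "text" ;;
    *) echo "unknown($1)" ;;
  esac
}

# Read a state response on stdin. Exit status 0 means draft.t exists and at
# least one track has a non-empty sg array, so the draft can be exported.
draft_ready() {
  jq -e '(.data.state.draft.t // []) | any(.[]; (.sg // []) | length > 0)' >/dev/null
}
```

Piping the output of the state request into draft_ready gives a yes/no answer without printing the payload to the user.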
3.5 Triggering Export and Delivering the Final File
Export does NOT cost credits. Only generation/editing consumes credits.
The export sequence proceeds as follows: (a) confirm the session state shows a completed edit, (b) call the export endpoint with the active session ID, (c) poll for the export job status until it returns a completed state, (d) retrieve the download URL from the completed export response, and (e) present that URL to the user as their ready-to-upload file.
b) Submit: INLINECODE52
Note: sessionId is camelCase (exception). On failure → new id, retry once.
c) Poll (every 30s, max 10 polls): INLINECODE55
Status at top-level status: pending → processing → completed / failed. Download URL at output.url.
d) Download from output.url → send to user. Fallback: https://mega-api-prod.nemovideo.ai/api/render/proxy/<id>/download.
e) When delivering the video, always also give the task detail link: INLINECODE60
Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + task detail link.
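The polling step (every 30s, at most 10 polls) could be structured as below. The status endpoint itself is not reproduced here, so the sketch takes the status lookup as a parameter: STATUS_CMD is a hypothetical command or function that prints the current top-level status, and poll_export is likewise a hypothetical name.

```bash
# poll_export STATUS_CMD [MAX_POLLS] [INTERVAL_SECONDS]
# Returns 0 on completed, 1 on failed, 2 if the poll budget runs out.
poll_export() {
  status_cmd=$1
  max_polls=${2:-10}
  interval=${3:-30}
  i=0
  while [ "$i" -lt "$max_polls" ]; do
    st=$("$status_cmd")
    case "$st" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    i=$((i + 1))
    # Sleep between polls, but not after the final attempt.
    [ "$i" -lt "$max_polls" ] && sleep "$interval"
  done
  return 2
}
```

Keeping the status lookup separate makes the loop easy to exercise with a stub and leaves the retry-once-with-a-new-id rule to the caller.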
3.6 Recovering from an SSE Disconnection
If the SSE stream drops unexpectedly, follow these recovery steps: (1) immediately attempt to re-establish the SSE connection using the existing session ID rather than creating a new session, (2) query the session state endpoint to determine how much of the previous operation completed before the disconnect, (3) if the operation was already finished, proceed directly to export without replaying the edit request, (4) if the operation was still in progress, resume listening on the reconnected stream and await the completion event, and (5) notify the user of the brief interruption only if the reconnection attempt exceeds a reasonable timeout threshold.
4. Translating Backend GUI References for the User
The backend is built around a graphical interface and will occasionally reference UI elements in its responses — never pass those GUI-specific instructions through to the user verbatim.
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |
Keep content descriptions. Strip GUI actions.
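The translation table might be approximated with keyword matching, as in this sketch. translate_gui_hint and the action labels are hypothetical; the export patterns are checked first so that "click the Export button" maps to the export flow rather than a generic API call.

```bash
# Hypothetical mapper: given one sentence from the backend, print the action
# the skill should take instead of forwarding the GUI instruction.
translate_gui_hint() {
  case "$1" in
    *"Export button"*|*"export button"*|*导出*) echo "run_export" ;;         # §3.5
    *click*|*Click*|*点击*)                     echo "call_api" ;;
    *open*|*Open*|*打开*)                       echo "show_state" ;;         # §3.4
    *drag*|*drop*|*拖拽*)                       echo "send_sse_edit" ;;
    *"preview in timeline"*)                    echo "show_track_summary" ;;
    *"check account"*|*billing*)                echo "check_credits" ;;      # §3.3
    *)                                          echo "pass_through" ;;
  esac
}
```

Sentences that come back as pass_through are content descriptions and can be shown to the user unchanged.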
5. Recommended Conversational Patterns
• Confirm what the user wants to accomplish before initiating any API call, especially for destructive operations like trimming or cutting.
• After each editing step completes, summarize what changed in plain language rather than exposing raw response payloads.
• When a silent response occurs, bridge the gap with a brief progress acknowledgment so the user does not assume something went wrong.
• If the user requests an action that would exceed their credit balance, explain the limitation clearly and offer to check remaining credits before proceeding.
• Always present the final export URL as a direct, clickable link accompanied by a short confirmation that the file is ready to upload.
6. Known Constraints and Limitations
• Only one active editing session per user is supported at a time; starting a new session will not automatically close a previous one.
• The two-stage post-processing pipeline for BGM and title overlays cannot be skipped or reordered by the AI layer.
• Export operations are available only after the session state reflects a fully completed edit — calling export prematurely will return an error.
• File uploads are subject to size and format restrictions defined by the upload endpoint; the AI layer cannot override these constraints.
• Credit balances are read-only from the AI perspective — the skill can check and display them but cannot add, adjust, or refund credits.
7. Error Recognition and Response Guidance
The table below maps the error codes returned by the API (application-level codes as well as HTTP status codes) to their likely causes and the recommended recovery action for each.
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
Common: no video → generate first; render fail → retry new id; SSE timeout → §3.6; silent edit → §3.1 fallback.
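The error table translates naturally into a lookup function. In this sketch action_for_code and the action labels are hypothetical names; the caller decides how each label is carried out.

```bash
# Hypothetical lookup: print the recovery action for a given error code.
action_for_code() {
  case "$1" in
    0)    echo "continue" ;;
    1001) echo "reauth" ;;                       # anonymous-token again
    1002) echo "new_session" ;;                  # §3.0
    2001) echo "no_credits" ;;
    4001) echo "show_supported_formats" ;;
    4002) echo "suggest_compress_or_trim" ;;
    400)  echo "generate_client_id_and_retry" ;;
    402)  echo "prompt_registration" ;;          # plan tier, not credits
    429)  echo "retry_once_after_30s" ;;
    *)    echo "unknown" ;;
  esac
}
```

Routing unrecognized codes to a single unknown label keeps unexpected responses from being silently treated as success.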
8. API Version Compatibility and Required Token Scopes
Always verify that the API version specified in the request header matches the version this skill was certified against before making any calls. Token scopes must include read access for session state and credits endpoints, and write access for the message, upload, and export endpoints. Attempting to call an endpoint without the appropriate scope will result in a 403 response regardless of token validity. If a version mismatch is detected, surface a clear notice to the user rather than proceeding with potentially incompatible calls.