From Full Episode to Highlight Reel in Minutes
Podcast episodes are packed with quotable moments, strong arguments, and emotional peaks — but finding and clipping them by hand is a time sink that most creators can't afford. The podcast-clip-maker skill was built specifically to solve this problem for audio-video podcast content, not generic video files.
When you submit a podcast recording, the OpenClaw agent analyzes the transcript and audio energy patterns together, pinpointing segments where speaker engagement is highest — think punchy one-liners, surprising revelations, or heated debates. It then trims those segments with clean in and out points, optionally adds burned-in captions, and packages each clip at the right aspect ratio for the platform you're targeting.
Under the hood, the AI backend combines speech-to-text alignment with prosody scoring to rank candidate clips by shareability before a single frame is exported. This means you're not just getting random cuts — you're getting the moments most likely to make someone stop scrolling. The conversational editing model lets you refine results through follow-up instructions, so you stay in control of tone, length, and branding without ever touching a timeline editor.
Environment Variables
| Variable | Required | Default |
|---|---|---|
| `NEMO_TOKEN` | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| `NEMO_API_URL` | No | https://mega-api-prod.nemovideo.ai |
| `NEMO_WEB_URL` | No | https://nemovideo.com |
| `NEMO_CLIENT_ID` | No | Auto-generated UUID, persisted to `~/.config/nemovideo/client_id` (UUID only, no secrets) |
| `SKILL_SOURCE` | No | Auto-detected from install path, fallback `unknown` |
If NEMO_TOKEN is not set, get one (requires X-Client-Id header):
# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
mkdir -p ~/.config/nemovideo && echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}
Save `token` as `NEMO_TOKEN` and `CLIENT_ID` as `NEMO_CLIENT_ID`. Anonymous tokens are limited to 1 per client per 7 days; each token expires in 7 days and can be revoked at any time via Settings → API Tokens on nemovideo.com. If your token expires, request a new one with the same `X-Client-Id`.
Local persistence: This skill writes ~/.config/nemovideo/client_id to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.
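The save step above can be sketched in plain shell. The helper name `extract_token` is hypothetical, and the `sed` pattern assumes the compact JSON shape shown in the sample response; a real JSON parser would be more robust.

```shell
# Hypothetical helper: print the value of "token" from a compact JSON response.
# Assumes the token contains no escaped quotes, as in the sample response.
extract_token() {
  printf '%s\n' "$1" | sed -n 's/.*"token":"\([^"]*\)".*/\1/p'
}

# Example using the sample response shape (placeholder value, not a real token).
sample='{"code":0,"data":{"token":"nmv_usr_xxx","credits":100}}'
NEMO_TOKEN=$(extract_token "$sample")
echo "$NEMO_TOKEN"
```

If the response carries no `token` field (for example an error payload), the helper prints nothing, so an empty `NEMO_TOKEN` can be treated as a failed request.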
2. Routing Incoming Requests
Match each user intent to the appropriate endpoint using the table below.
| User says... | Action | Skip SSE |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
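One way to read the routing table is as a first-match keyword lookup. The sketch below assumes simple substring matching on the user's message; the function name `route_intent` and the returned labels are hypothetical, and detecting an attached file (the "user sends file" case) is left to the caller.

```shell
# Hypothetical router: map a user message to the section that handles it.
# Patterns are checked in table order; the first match wins.
route_intent() {
  # Lowercase ASCII letters so "Export" and "export" match alike.
  msg=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  case "$msg" in
    *export*|*导出*|*download*|*"send me the video"*) echo "3.5-export" ;;
    *credits*|*积分*|*balance*|*余额*)                 echo "3.3-credits" ;;
    *status*|*状态*|*"show tracks"*)                   echo "3.4-state" ;;
    *upload*|*上传*)                                   echo "3.2-upload" ;;
    *)                                                 echo "3.1-sse" ;;
  esac
}

route_intent "Please export the clip"
route_intent "add BGM to the intro"
```

Substring matching is deliberately crude; a message such as "what is the status of my export" routes to export because that row comes first in the table.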
3. Primary Operation Flows
$API = ${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}, $TOKEN = ${NEMO_TOKEN}, $WEB = ${NEMO_WEB_URL:-https://nemovideo.com}.
Skill attribution variables (do NOT hardcode):
- `$SKILL_NAME` = this file's YAML frontmatter `name`
- `$SKILL_VERSION` = this file's YAML frontmatter `version`
- `$SKILL_SOURCE` = detect at runtime using the first match:
  1. Env `SKILL_SOURCE` if set
  2. If this file's path contains a known platform directory (e.g. `~/.clawhub/` → `clawhub`, `~/.cursor/skills/` → `cursor`)
  3. Fallback: `unknown`
CRITICAL: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.
X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
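The three-step detection order for `$SKILL_SOURCE` could look roughly like this. The function name `detect_skill_source` is hypothetical, and only the two platform directories named in this document are checked; other platforms would need their own patterns.

```shell
# Hypothetical helper: resolve the platform name using the documented order.
# $1 is the path of the skill file; the SKILL_SOURCE env var takes precedence.
detect_skill_source() {
  if [ -n "${SKILL_SOURCE:-}" ]; then
    echo "$SKILL_SOURCE"
    return
  fi
  case "$1" in
    */.clawhub/*)       echo "clawhub" ;;
    */.cursor/skills/*) echo "cursor" ;;
    *)                  echo "unknown" ;;
  esac
}

detect_skill_source "$HOME/.cursor/skills/podcast-clip-maker/SKILL.md"
```

The result is what goes into the `X-Skill-Platform` header on every request.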
3.0 Initializing a Session
curl -s -X POST "$API/api/tasks/me/with-session/nemo_agent" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
-d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}
A session must be established before any other operations can proceed. The session ID returned here is required for all subsequent requests.
Open in browser: After creating a session, give the user a link to view/edit the task in NemoVideo:
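$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE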
3.1 Delivering Messages Over SSE
curl -s -X POST "$API/run_sse" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
-d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'
All conversational exchanges with the backend are transmitted through a Server-Sent Events stream.
SSE Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Wait silently, don't forward |
| `heartbeat` / empty `data:` | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.
Timeout: 10 min heartbeats-only → assume timeout. Never re-send during generation (duplicates + double-charge).
Ignore trailing "I encountered a temporary issue" if prior responses were normal.
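The heartbeat rules above (a progress note every 2 minutes, give up after 10 minutes of heartbeats only) reduce to a small decision function. This is a minimal sketch assuming the caller tracks the seconds elapsed since the last non-heartbeat event and calls the function once per second; the name `heartbeat_action` and its return labels are hypothetical.

```shell
# Hypothetical helper: decide what to do after $1 seconds of heartbeats only.
# Thresholds follow the text: notify every 120 s, treat 600 s as a timeout.
heartbeat_action() {
  elapsed=$1
  if [ "$elapsed" -ge 600 ]; then
    echo "timeout"
  elif [ "$elapsed" -gt 0 ] && [ $((elapsed % 120)) -eq 0 ]; then
    echo "notify"
  else
    echo "wait"
  fi
}

heartbeat_action 120
```

On `timeout` the message must not be re-sent, per the double-charge warning above; the recovery path is to check state instead.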
Silent Response Fallback (CRITICAL)
Approximately 30% of edit operations return no visible text in the stream. When this occurs: first, wait for the SSE stream to close naturally; then call the state endpoint to check job status; next, poll at 5-second intervals until a terminal status is reached; finally, surface the result to the user only after a completed state is confirmed.
Two-stage generation: The backend automatically runs a two-stage pipeline after raw video is produced. Stage one delivers the unprocessed video clip. Stage two, triggered automatically, overlays background music and attaches a generated title. Do not prompt the user for these additions — they are applied without any additional input.
3.2 Handling File Uploads
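File upload: curl -s -X POST "$API/api/upload-video/nemo_agent/me/<sid>" -H "Authorization: Bearer $TOKEN" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" -F "files=@/path/to/file"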
URL upload: INLINECODE35
Use me in the path; backend resolves user from token.
Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
Both local files and remote URLs are accepted for uploading source podcast media.
3.3 Checking Available Credits
curl -s "$API/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}
Query the credits endpoint before initiating any processing job to confirm the user has a sufficient balance.
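A balance pre-check along these lines might compare the `available` field against the amount an operation is expected to need. The helper name `has_credits` and the threshold are assumptions (this document does not state per-operation costs), and the `sed` pattern assumes the compact response shape shown above with an integer `available`.

```shell
# Hypothetical helper: succeed if "available" in the balance response is >= $2.
# $1 is the raw JSON response, $2 the number of credits the caller wants to spend.
has_credits() {
  available=$(printf '%s\n' "$1" | sed -n 's/.*"available":\([0-9][0-9]*\).*/\1/p')
  [ -n "$available" ] && [ "$available" -ge "$2" ]
}

# Example with placeholder numbers.
resp='{"code":0,"data":{"available":80,"frozen":0,"total":100}}'
if has_credits "$resp" 50; then
  echo "enough credits"
else
  echo "insufficient credits"
fi
```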
3.4 Polling Job Status
curl -s "$API/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
Use `me` for user in path; backend resolves from token.
Key fields: `data.state.draft`, `data.state.video_infos`, `data.state.canvas_config`, `data.state.generated_media`.
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Draft ready for export when draft.t exists with at least one track with non-empty sg.
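When summarizing a draft for the user, the abbreviated fields need decoding. The sketch below covers only the two conversions stated in the mapping above: track type codes and millisecond durations. Both function names are hypothetical.

```shell
# Hypothetical helper: translate the numeric "tt" field into a readable name.
track_type_name() {
  case "$1" in
    0) echo "video" ;;
    1) echo "audio" ;;
    7) echo "text" ;;
    *) echo "unknown($1)" ;;
  esac
}

# Hypothetical helper: convert a "d" duration in milliseconds to whole seconds.
ms_to_seconds() {
  echo $(( $1 / 1000 ))
}

echo "$(track_type_name 1) track, $(ms_to_seconds 45500)s"
```

Codes other than 0, 1, and 7 are not documented here, so the helper reports them as unknown rather than guessing.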
Track summary format:
CODEBLOCK6
3.5 Exporting and Delivering the Final Clip
Export does NOT cost credits. Only generation/editing consumes credits.
The delivery sequence is: (a) confirm the job has reached a completed state; (b) call the export endpoint with the job ID; (c) receive the signed download URL from the response; (d) present the URL to the user with a clear download prompt; (e) optionally include clip metadata such as duration and title in the same message.
b) Submit: INLINECODE47
Note: sessionId is camelCase (exception). On failure → new id, retry once.
c) Poll (every 30s, max 10 polls): INLINECODE50
Status at top-level status: pending → processing → completed / failed. Download URL at output.url.
d) Download from output.url → send to user. Fallback: $API/api/render/proxy/<id>/download.
e) When delivering the video, always also give the task detail link: INLINECODE55
Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + task detail link.
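The poll step (every 30s, at most 10 polls, stopping on `completed` or `failed`) can be sketched as a bounded loop. The status lookup is passed in as a command name so the loop stays independent of the exact render endpoint; `poll_render` and the `fetch_render_status` name in the usage note are hypothetical.

```shell
# Hypothetical poller: calls the command named in $1 up to $2 times, sleeping
# $3 seconds between attempts, until it prints "completed" or "failed".
# The command stands in for a request that extracts the top-level status field.
poll_render() {
  fetch=$1 max=$2 interval=$3 attempt=1
  while [ "$attempt" -le "$max" ]; do
    status=$("$fetch")
    case "$status" in
      completed) echo "completed"; return 0 ;;
      failed)    echo "failed";    return 1 ;;
    esac
    attempt=$((attempt + 1))
    # Sleep only if another attempt is still allowed.
    [ "$attempt" -le "$max" ] && sleep "$interval"
  done
  echo "timeout"
  return 2
}
```

With the documented limits this would be called as `poll_render fetch_render_status 10 30`, where `fetch_render_status` is whatever function wraps the poll request and prints the status string.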
3.6 Recovering From an SSE Disconnection
If the SSE connection drops unexpectedly, follow these steps: (1) Record the last event ID received before the disconnect. (2) Attempt to reconnect to the SSE endpoint immediately, passing the last event ID in the request header. (3) If reconnection fails within 10 seconds, fall back to polling the state endpoint. (4) Continue polling at 5-second intervals until a terminal status is returned. (5) Resume normal delivery flow once a completed or failed state is confirmed.
4. Translating GUI Behavior
The backend is built with a graphical interface in mind, so never pass GUI-specific instructions or button labels through to the user.
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |
Keep content descriptions. Strip GUI actions.
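As an illustration of the translation table, the sketch below classifies a backend sentence by the GUI phrase it contains. The function name `gui_action` and the action labels are hypothetical. Specific phrases are checked before the generic "click" so that "click the Export button" maps to export; that ordering is a design choice of this sketch, not something the table prescribes.

```shell
# Hypothetical mapper: pick the action for a GUI-flavoured backend sentence.
gui_action() {
  msg=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  case "$msg" in
    *"export button"*|*导出*)    echo "run-export" ;;         # §3.5
    *"preview in timeline"*)     echo "show-track-summary" ;;
    *"check account"*|*billing*) echo "check-credits" ;;      # §3.3
    *drag*|*drop*|*拖拽*)        echo "send-edit-via-sse" ;;
    *open*|*打开*)               echo "show-state" ;;         # §3.4
    *click*|*点击*)              echo "execute-via-api" ;;
    *)                           echo "pass-through" ;;
  esac
}

gui_action "Click the Export button to finish"
```

Text that matches none of the GUI phrases is passed through unchanged, which corresponds to keeping content descriptions.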
5. Recommended Interaction Patterns
• Confirm the upload has succeeded and the job has been queued before telling the user that processing has started.
• When a silent response occurs, communicate progress proactively rather than leaving the user without feedback.
• Always present credit balance information before starting an operation that will consume credits.
• After export, deliver the download link alongside a brief summary of the clip's key details.
• If a job reaches a failed state, explain the issue in plain language and suggest a concrete next step.
6. Known Limitations
• A single session cannot process more than one job concurrently; queue additional requests until the active job finishes.
• Source audio or video files must not exceed the documented maximum file size; advise users to compress large files before uploading.
• Background music and title overlays are applied automatically and cannot be disabled or customized through this interface.
• SSE streams may time out on long-running jobs; the silent response fallback and polling logic must always be implemented.
• Credit balances are read at the time of the check and may not reflect simultaneous usage from other active sessions.
7. Error Handling Reference
Use the table below to map HTTP status codes and API error responses to appropriate user-facing messages and retry behaviors.
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with `?bind=<id>` (get `<id>` from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
Common: no video → generate first; render fail → retry new id; SSE timeout → §3.6; silent edit → §3.1 fallback.
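The table lends itself to a single dispatch point. The following is a minimal sketch in which the function name `next_step_for_code` and the step labels are hypothetical; the caller is assumed to pass either the API `code` field or the HTTP status, whichever signalled the error.

```shell
# Hypothetical dispatcher: map a response code from the table to a next step.
next_step_for_code() {
  case "$1" in
    0)    echo "continue" ;;
    1001) echo "reauth" ;;
    1002) echo "new-session" ;;
    2001) echo "prompt-topup" ;;
    4001) echo "show-formats" ;;
    4002) echo "suggest-compress" ;;
    400)  echo "generate-client-id" ;;
    402)  echo "prompt-register" ;;
    429)  echo "retry-after-30s" ;;
    *)    echo "report-unknown" ;;
  esac
}

next_step_for_code 1002
```

Codes not listed in the table fall through to `report-unknown` so that unexpected failures are surfaced to the user instead of being retried blindly.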
8. API Version and Token Scopes
Before making any requests, verify that the API version in the base URL matches the version this skill was built against. Token scopes must include read access for session and state endpoints, and write access for upload, message, and export endpoints. If a 403 is returned, the most likely cause is a missing or insufficient scope on the access token rather than an invalid credential.