Getting Started
Welcome! I'm here to help you trim video files with precision using FFmpeg — whether you need to cut a single clip or batch-process a whole library. Tell me your video's start time, end time, and what you'd like to keep, and let's get trimming!
Try saying:
- "Trim my video from 00:01:15 to 00:03:45 and save it as a new MP4 file without re-encoding"
- "Cut out the first 30 seconds and last 10 seconds from this recorded Zoom call"
- "Split a 1-hour webinar into 5-minute segments starting at every 5-minute mark"
Getting Connected
Before handling any user request, establish a connection to the backend API. Show a brief status like "Connecting...".
If NEMO_TOKEN is in the environment, use it directly and create a session. Otherwise, acquire a free starter token:
- Generate a UUID as client identifier
- POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- The response includes a token with 100 free credits valid for 7 days; use it as NEMO_TOKEN
Then create a session by POSTing to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer authorization and body {"task_name":"project","language":"en"}. The session_id in the response is needed for all following requests.
Tell the user you're ready. Keep the technical details out of the chat.
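The connection steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the helper names are hypothetical, the requests are only described as plain dictionaries (sending them is left to whatever HTTP client you use), and the attribution headers covered later are omitted for brevity.

```python
import json
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def token_request(client_id=None):
    """Describe the anonymous-token request (hypothetical helper)."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        # A fresh UUID identifies this client when none is supplied.
        "headers": {"X-Client-Id": client_id or str(uuid.uuid4())},
        "body": None,
    }


def session_request(token, language="en"):
    """Describe the create-session request for a given NEMO_TOKEN."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {"Authorization": f"Bearer {token}"},
        "body": json.dumps({"task_name": "project", "language": language}),
    }


def connection_plan(env):
    """Return the first request to send, given an environment mapping."""
    token = env.get("NEMO_TOKEN")
    if token:
        return [session_request(token)]
    # No token yet: acquire one first. The session request follows once
    # the token has been read from the response.
    return [token_request()]
```

Calling `connection_plan(os.environ)` would pick the right first step; after an anonymous token arrives, `session_request(token)` describes the follow-up call.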
Cut the Noise — Keep Only What Matters
Raw video footage is almost never ready to share straight out of the camera. There are awkward pauses at the beginning, dead air at the end, and unwanted sections buried in the middle. The ffmpeg-trim-video skill gives you a fast, reliable way to cut your video files down to exactly what you need — down to the second or even the frame.
Whether you're trimming a long recording to extract a single highlight, chopping up a webinar into digestible segments, or preparing clips for social media, this skill handles the heavy lifting. You specify the start time, end time, and output format — and it delivers a clean, trimmed file ready to use.
Unlike consumer video editors that force you through a GUI workflow, this skill is built for speed and repeatability. It's ideal for anyone who works with video programmatically — developers automating pipelines, creators processing batches of clips, or teams standardizing how footage gets prepared before publishing.
Routing Your Trim Requests
When you specify a timecode range — like -ss 00:01:30 -to 00:02:45 or a duration flag — the skill parses your input and routes the trim job to the appropriate processing endpoint based on format, codec, and whether you need keyframe-accurate or frame-precise cutting.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
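The routing table above could be implemented along these lines. This is a sketch only: matching by case-insensitive substring and checking rows in table order are assumptions, and the action labels are hypothetical names for the table's targets.

```python
# Keyword rows in the same order as the routing table.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message, has_file=False):
    """Return the action label for a user message; 'sse' is the fallback."""
    text = message.lower()
    for keywords, action in ROUTES:
        matched = any(keyword in text for keyword in keywords)
        # An attached file counts as an upload request.
        if matched or (action == "upload" and has_file):
            return action
    return "sse"


def skips_sse(action):
    """Every route except the SSE fallback bypasses the message stream."""
    return action != "sse"
```

A trim request such as "Trim my video from 00:01:15 to 00:03:45" matches no keyword row and therefore goes through the SSE path.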
Cloud FFmpeg API Reference
The backend spins up an isolated FFmpeg instance in the cloud, applying your -ss, -t, -to, and -c copy or re-encode parameters server-side — no local FFmpeg installation required. Processed clips are returned via a secure download link, with the original stream metadata and container format preserved unless you explicitly request a transcode.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: ffmpeg-trim-video
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
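A minimal sketch of assembling these headers is shown below. The function names are hypothetical, the version string is passed in rather than parsed from the frontmatter, and detecting the platform by substring match on the expanded install path is an assumption about how "detect from install path" might work.

```python
def detect_platform(install_path):
    """Map an install path to a platform label (assumed substring match)."""
    path = install_path.replace("\\", "/")
    if "/.clawhub/" in path:
        return "clawhub"
    if "/.cursor/skills/" in path:
        return "cursor"
    return "unknown"


def build_headers(token, version, install_path, source="ffmpeg-trim-video"):
    """Return the four headers that every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": source,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the dictionary once per session and reusing it for every call is one way to avoid the 402 export failure caused by a missing attribution header.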
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> with either a file (multipart -F "files=@/path") or a URL body: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple, which returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
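The export call described above (submit a render, then poll every 30 seconds) might look like the following sketch. Several details are assumptions: `<ts>` is taken to be a Unix timestamp in seconds, `status` and `output.url` are assumed to sit at the top level of the poll response, and the `max_polls` cap is hypothetical since the source gives no upper bound. The HTTP call is injected as `get_status` so the loop can be exercised without a network.

```python
import time


def build_export_body(session_id, draft, now=None):
    """Build the render request body for POST /api/render/proxy/lambda."""
    timestamp = int(now if now is not None else time.time())
    return {
        "id": f"render_{timestamp}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_export(get_status, render_id, interval=30, max_polls=30, sleep=time.sleep):
    """Poll until the render completes and return the download URL.

    get_status(render_id) must return the decoded JSON of
    GET /api/render/proxy/lambda/<id>.
    """
    for _ in range(max_polls):
        result = get_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        sleep(interval)
    raise TimeoutError(f"render {render_id} did not complete in time")
```

Passing `sleep` as a parameter keeps the 30-second wait out of tests while leaving production behaviour unchanged.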
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
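The event handling above can be sketched as a small stream reader. The request payload mirrors the documented /run_sse body, but the shape of each event (`content.parts[].text`) is an assumption, since the source does not show a sample event; the helper names are hypothetical as well.

```python
import json


def build_message(session_id, text):
    """Build the documented /run_sse request body."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": text}]},
    }


def iter_sse_events(lines):
    """Yield decoded JSON events, skipping heartbeats and empty data lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # comment lines, event names, blank separators
        payload = line[len("data:"):].strip()
        if not payload:
            continue  # empty "data:" line: keep waiting
        try:
            yield json.loads(payload)
        except json.JSONDecodeError:
            continue  # non-JSON keep-alive text


def collect_text(events):
    """Join all text parts; an empty result means 'poll session state'."""
    pieces = []
    for event in events:
        # Assumed event shape: {"content": {"parts": [{"text": ...}]}}
        for part in event.get("content", {}).get("parts", []):
            if "text" in part:
                pieces.append(part["text"])
    return "".join(pieces)
```

When `collect_text` returns an empty string after the stream closes, the caller would fall back to the session-state endpoint to confirm that the edit was applied.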
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
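Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: Urban Dreams (0-3s)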
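Turning a draft into a track summary could be done roughly as follows. The field names (t, tt, sg, d) come from the mapping above, but how they nest is an assumption: this sketch supposes that `t` is a list of tracks, each with a `tt` type code and an `sg` list of segments that carry a duration `d` in milliseconds.

```python
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Return a short, human-readable track summary for a draft."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Durations are stored in milliseconds; report seconds.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return "\n".join(lines)
```

Unknown track type codes are reported as "unknown" rather than raising, since only three codes are documented.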
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
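One way to encode the error table is a lookup from code to a next step, sketched below. The action labels are hypothetical names for the "Action" column, and treating any unlisted code as an unknown error is an assumption.

```python
# Hypothetical labels for the "Action" column of the error table.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauthenticate",
    1002: "create_new_session",
    2001: "explain_no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "explain_plan_upgrade",
    429: "retry_once_after_30s",
}


def next_action(code, already_retried=False):
    """Return the label of the step to take for a response code."""
    if code == 429 and already_retried:
        # The table allows a single retry for rate limits.
        return "give_up"
    return ERROR_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as separate entries reflects the table's point that a blocked export is a plan issue, not a credit issue.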
Common Workflows
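One of the most common workflows is lossless trimming: using stream copy to cut a video without re-encoding. This is the fastest approach and preserves the original quality exactly, because the encoded streams are copied rather than decoded and compressed again. With FFmpeg, a stream-copy cut generally has to start on a keyframe, so the actual start may land slightly off the requested in point; ask for a re-encode when the cut must be frame-exact. Otherwise, just specify your in and out points, and the skill trims the container without touching the codec data.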
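For reference, a local FFmpeg invocation that corresponds to a stream-copy trim can be built as shown below. The skill runs FFmpeg server-side, so this only illustrates the -ss, -to, and -c copy flags named earlier; the helper name and file paths are hypothetical, and the exact command the backend uses is not documented.

```python
import shlex


def build_copy_trim_command(source, output, start, end):
    """Build an ffmpeg argument list that trims without re-encoding."""
    return [
        "ffmpeg",
        "-ss", start,   # seek to the in point before reading the input
        "-to", end,     # stop reading the input at the out point
        "-i", source,
        "-c", "copy",   # copy all streams instead of re-encoding
        output,
    ]


command = build_copy_trim_command("input.mp4", "clip.mp4", "00:01:15", "00:03:45")
printable = shlex.join(command)  # shell-safe string for logs or dry runs
```

Returning a list rather than a string lets the command go straight to `subprocess.run` without shell quoting concerns.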
Another frequent workflow is segment extraction for social media. Users provide a single long video and a list of timestamp pairs, and the skill outputs multiple short clips — each trimmed and ready for upload. This is popular for turning conference talks or interviews into shareable soundbites.
A third workflow involves trimming combined with format conversion — for instance, trimming a section of an MKV file and outputting it as an H.264 MP4 for broader compatibility. This is useful when source footage comes from cameras or screen recorders that produce formats not natively supported by all platforms.
Use Cases
The ffmpeg-trim-video skill fits naturally into a wide range of real-world workflows. Content creators use it to extract highlight clips from long-form recordings — pulling a 90-second moment from a two-hour livestream without sitting through a full export cycle. Podcast producers with video tracks use it to remove pre-show chatter and post-show wind-down before publishing.
Developers building media pipelines rely on it to programmatically slice uploaded videos into defined segments — for example, trimming user-submitted videos to a platform's maximum allowed length. Marketing teams use it to repurpose long product demos into short, punchy clips sized for LinkedIn, Instagram, or YouTube Shorts.
Educators and course creators trim recorded lectures into topic-specific modules, making content easier to navigate. Essentially, anyone who regularly works with video files and needs to cut them cleanly and consistently will find immediate value here.
Integration Guide
Integrating ffmpeg-trim-video into your workflow is straightforward. The skill accepts a video file path or URL, a start timestamp, and an end timestamp — all in standard HH:MM:SS or seconds format. You can optionally specify whether to use stream copy mode (no re-encoding, ultra-fast) or a specific codec for the output.
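Since timestamps may arrive either as HH:MM:SS or as plain seconds, a small normalizer helps before building a request. The sketch below is one possible approach with hypothetical function names; accepting MM:SS and fractional seconds is an assumption beyond the two formats the text names.

```python
def parse_timestamp(value):
    """Convert 'HH:MM:SS', 'MM:SS', or plain seconds to float seconds."""
    parts = str(value).strip().split(":")
    if len(parts) > 3:
        raise ValueError(f"too many fields in timestamp: {value!r}")
    seconds = 0.0
    for part in parts:
        # Each field to the left is worth 60 times the one to its right.
        seconds = seconds * 60 + float(part)
    if seconds < 0:
        raise ValueError(f"timestamp must not be negative: {value!r}")
    return seconds


def trim_range(start, end):
    """Return (start, end) in seconds, rejecting empty or reversed ranges."""
    start_s, end_s = parse_timestamp(start), parse_timestamp(end)
    if end_s <= start_s:
        raise ValueError("end must be after start")
    return start_s, end_s
```

Validating the range up front avoids spending a request on a trim that cannot succeed.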
For batch processing, you can chain multiple trim requests in sequence, passing in a list of segments with their respective time ranges. Output files can be named dynamically based on timestamps or custom labels you provide, making it easy to organize trimmed clips automatically.
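Batch segmentation and timestamp-based naming can be sketched like this. The function names and the filename pattern are hypothetical; the skill's own naming scheme is not documented, so treat this as one possible convention.

```python
def format_timestamp(total_seconds):
    """Format whole seconds as HH:MM:SS."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def fixed_segments(duration, length):
    """Split [0, duration) into consecutive (start, end) pairs in seconds."""
    return [
        (start, min(start + length, duration))
        for start in range(0, duration, length)
    ]


def segment_filename(stem, start, end, extension="mp4"):
    """Name a clip after its time range, e.g. webinar_00-00-00_00-05-00.mp4."""
    first = format_timestamp(start).replace(":", "-")
    last = format_timestamp(end).replace(":", "-")
    return f"{stem}_{first}_{last}.{extension}"
```

For a one-hour webinar, `fixed_segments(3600, 300)` yields twelve five-minute ranges, and each range can be passed to `segment_filename` to label its clip.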
The skill integrates cleanly into automation platforms, CI/CD pipelines, or custom scripts. If you're processing uploads in a web application, simply pass the file reference and trimming parameters — the skill returns the path or binary of the trimmed output ready for storage or delivery.