Getting Started
Welcome! I'm your video format converter assistant — ready to help you convert, reformat, or compress any video file for any platform or purpose. Tell me what format you're working with and where you need to take it, and let's get started!
Try saying:
- "Convert my MKV files to MP4 with H.264 encoding while keeping the original audio quality"
- "What's the best format and bitrate settings to export a video for Instagram Reels without losing quality?"
- "I need to batch convert a folder of MOV files to WebM for a web project — walk me through the fastest way to do it"
Quick Start Setup
This skill connects to a cloud processing backend. On first use, set up the connection automatically and let the user know ("Connecting...").
Token check: Look for NEMO_TOKEN in the environment. If found, skip to session creation. Otherwise:
- Generate a UUID as the client identifier
- POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with the X-Client-Id header
- Extract data.token from the response; this is your NEMO_TOKEN (100 free credits, 7-day expiry)
Session: POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Keep the returned session_id for all operations.
Let the user know with a brief "Ready!" when setup is complete. Don't expose tokens or raw API output.
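The token check and session step can be sketched as plain request descriptions. This is a minimal sketch, not the skill's actual client: the function names are hypothetical, the Content-Type header is an assumption, and nothing here is sent over the network, so any HTTP library could consume the returned dicts.

```python
import json
import os
import uuid

API_BASE = "https://mega-api-prod.nemovideo.ai"


def anonymous_token_request(client_id):
    """Describe the POST that requests an anonymous token."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/auth/anonymous-token",
        "headers": {"X-Client-Id": client_id},
    }


def extract_token(response_body):
    """Pull data.token out of the JSON response body."""
    return json.loads(response_body)["data"]["token"]


def session_request(token):
    """Describe the POST that creates a session with Bearer auth."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        "headers": {
            "Authorization": f"Bearer {token}",
            # Assumption: the backend expects a JSON content type.
            "Content-Type": "application/json",
        },
        "body": json.dumps({"task_name": "project"}),
    }


def first_request(env=os.environ):
    """Skip token creation when NEMO_TOKEN is already set."""
    token = env.get("NEMO_TOKEN")
    if token:
        return session_request(token)
    return anonymous_token_request(str(uuid.uuid4()))
```

Keeping the request builders free of I/O makes it easier to avoid logging the token or raw API output by accident.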
Convert Any Video Format Without the Guesswork
Dealing with incompatible video formats is one of those problems that should have been solved a decade ago — yet here we are, still wrestling with codecs, containers, and bitrate settings every time a client sends the wrong file type. This skill exists to end that friction.
The video-format-converter skill lets you describe what you're starting with and where you need to end up. Say you've got a batch of MKV files that need to become H.264 MP4s for a streaming platform, or a ProRes master that needs a compressed H.265 version for archiving — just tell it what you need and get precise, actionable conversion instructions or automated workflows tailored to your setup.
This isn't a one-size-fits-all tool. It accounts for frame rate, aspect ratio, audio codec compatibility, container limitations, and platform-specific requirements like YouTube's preferred specs or Instagram's size caps. Whether you're converting a single clip or planning a large-scale format standardization project, this skill gives you clear steps and smart recommendations without the trial-and-error.
Routing Your Conversion Requests
When you submit a video conversion job, ClawHub parses your target format, codec preferences, and resolution parameters to route the request to the optimal processing pipeline.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
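One way to read the routing table is as a keyword lookup. The sketch below assumes case-insensitive substring matching and assumes the table's row order is also the precedence order; neither is stated in this document, and the `route` function is hypothetical.

```python
# Keyword groups in table order; first match wins (assumed precedence).
ROUTES = [
    ("export", ("export", "导出", "download", "send me the video")),
    ("credits", ("credits", "积分", "balance", "余额")),
    ("state", ("status", "状态", "show tracks")),
    ("upload", ("upload", "上传")),
]


def route(message, has_file=False):
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # "user sends file" routes to upload regardless of the text.
        return "upload", True
    text = message.lower()
    for action, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return action, True
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return "sse", False
```

Substring matching is deliberately naive: a request such as "how should I export for Reels?" would route to export, so a real router would likely need intent detection rather than keywords alone.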
Cloud Processing API Reference
Video Format Converter runs on a distributed cloud transcoding backend that handles container remuxing, codec re-encoding, and bitrate normalization in parallel across multiple nodes. Large files are chunked and processed concurrently, so even 4K source footage or high-bitrate MKV files move through the pipeline without bottlenecking.
Skill attribution (read from this file's YAML frontmatter at runtime):
- X-Skill-Source: video-format-converter
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
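The required headers can be assembled along these lines. This is a sketch under assumptions: the function names are hypothetical, the X-Skill-Source value is taken to be the skill name video-format-converter, and platform detection assumes POSIX-style paths matched by directory component.

```python
from pathlib import PurePosixPath


def detect_platform(install_path):
    """Map an install path to the X-Skill-Platform value."""
    parts = PurePosixPath(install_path).parts
    if ".clawhub" in parts:
        return "clawhub"
    # ~/.cursor/skills/ needs both components, adjacent and in order.
    for first, second in zip(parts, parts[1:]):
        if first == ".cursor" and second == "skills":
            return "cursor"
    return "unknown"


def build_headers(token, version, install_path):
    """Return the headers every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": "video-format-converter",  # assumed: the skill name
        "X-Skill-Version": version,  # read from the YAML frontmatter by the caller
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building all four headers in one place helps avoid the 402 export failure caused by a missing attribution header.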
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid>. File: multipart -F "files=@/path". URL: body {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple. Returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest. Key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
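The export-and-poll flow described above might look like the following. It is a sketch under assumptions: `fetch_status` is a hypothetical caller-supplied function that performs GET /api/render/proxy/lambda/<id> and returns the decoded JSON, `status` and `output` are assumed to be top-level keys of that JSON, and the `max_polls` cap is not part of the documented API.

```python
import time


def export_request(session_id, draft, ts):
    """Build the JSON body for POST /api/render/proxy/lambda."""
    return {
        "id": f"render_{ts}",  # ts is whatever timestamp the caller chooses
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(fetch_status, render_id, interval=30.0, max_polls=60, sleep=time.sleep):
    """Poll until status is "completed", then return output.url."""
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval)  # documented cadence: every 30 seconds
    raise TimeoutError(f"render {render_id} not completed after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable without network access or real waiting. Failure statuses are not documented here, so the sketch only recognizes completion.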
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
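A small parser for the event stream could be sketched like this. It follows the general Server-Sent Events line format (data: fields, blank line ends an event, lines starting with a colon are comments). Treating the literal payload "heartbeat" as a keep-alive is an assumption drawn from the table above, and both function names are hypothetical.

```python
def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # one optional space after the colon
            buffer.append(value)
    if buffer:
        yield "\n".join(buffer)


def is_heartbeat(payload):
    """True for empty payloads and the assumed literal "heartbeat"."""
    return payload.strip() in ("", "heartbeat")
```

If every payload in a closed stream is a heartbeat, that is the cue to poll session state. Telling text responses apart from tool calls among the remaining payloads depends on the backend's event JSON, which this document does not specify.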
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
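Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)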
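The draft field mapping can be turned into a track summary roughly as follows. The nesting is an assumption: the sketch takes `t` to be a list of tracks, each with a `tt` type code and an `sg` list of segments that each carry a `d` duration in milliseconds. The real draft layout may differ, and `summarize_draft` is a hypothetical helper.

```python
# Track type codes from the field mapping above.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft):
    """Render a short per-track summary from an (assumed) draft layout."""
    tracks = draft.get("t", [])
    lines = [f"Timeline ({len(tracks)} tracks):"]
    for index, track in enumerate(tracks, start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        # Assumption: each segment stores its own duration under "d" (ms).
        total_ms = sum(segment.get("d", 0) for segment in track.get("sg", []))
        lines.append(f"{index}. {kind}: {total_ms / 1000:g}s")
    return "\n".join(lines)
```

A summary like this is what "preview in timeline" would map to when no GUI is available.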
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
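The error table maps naturally onto a dispatch function. In this sketch the action labels are hypothetical names for the behaviors in the table, and the fallback for unlisted codes is an assumption.

```python
# Codes whose action does not depend on the account type.
SIMPLE_ACTIONS = {
    0: "continue",
    1001: "reauth_with_anonymous_token",
    1002: "create_new_session",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",  # subscription tier issue, not credits
    429: "retry_once_after_30s",
}


def action_for(code, registered=False):
    """Return the action label for a backend error code."""
    if code == 2001:
        # No credits: the response differs for anonymous and registered users.
        return "prompt_top_up" if registered else "show_registration_url"
    # Assumption: unlisted codes are surfaced as a generic error.
    return SIMPLE_ACTIONS.get(code, "report_unknown_error")
```

Keeping 402 and 2001 as distinct branches mirrors the table's warning that a blocked export is a plan issue rather than a credit shortage.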
Integration Guide
The video-format-converter skill is designed to slot into the tools and workflows you're already using. If you're working with FFmpeg — the industry-standard command-line converter — this skill can generate precise, ready-to-run commands based on your input and output requirements, saving you from memorizing flag syntax.
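As an illustration of the kind of command involved, the MKV to H.264 MP4 case could be built like this. The file names are placeholders and the helper is hypothetical; the flags themselves (`-i`, `-c:v libx264`, `-c:a copy`) are standard FFmpeg options.

```python
import shlex


def mkv_to_mp4_command(source, target):
    """Build an FFmpeg argument list: H.264 video, audio stream copied as-is."""
    return [
        "ffmpeg",
        "-i", source,
        "-c:v", "libx264",  # re-encode video as H.264
        "-c:a", "copy",     # keep the original audio without re-encoding
        target,
    ]


# shlex.join renders the list as a shell-safe string for display.
command_text = shlex.join(mkv_to_mp4_command("input.mkv", "output.mp4"))
```

Copying the audio stream preserves its quality exactly, but it only works when the source audio codec is one the MP4 container accepts; otherwise the audio would need re-encoding, for example to AAC.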
For teams using Adobe Premiere, DaVinci Resolve, or Final Cut Pro, the skill can recommend export presets and codec settings that match your downstream delivery specs. It understands the difference between editing-friendly formats (like ProRes or DNxHD) and delivery formats (like H.264 or AV1), and will guide you toward the right choice for each stage of your pipeline.
Developers building media processing applications can use this skill to map out conversion logic, understand container and codec compatibility matrices, and troubleshoot format-related errors. Whether you're integrating FFmpeg into a Node.js backend or configuring a cloud media transcoding service, this skill provides the format knowledge layer your pipeline needs.
Common Workflows
One of the most frequent use cases is social media preparation — taking a high-resolution master file and converting it to platform-optimized versions for YouTube (H.264, up to 4K), TikTok (MP4, 9:16 aspect ratio), and LinkedIn (under 5GB, MP4 preferred) all from a single source file. This skill walks you through each platform's specs and flags anything your source file might be missing.
Another common workflow is archive conversion — taking older formats like AVI, WMV, or Flash video (FLV) and migrating them to modern, space-efficient containers like MKV with H.265 encoding. This can dramatically reduce storage costs while preserving visual quality.
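A batch plan for the archive workflow might be sketched as below. The helper name, the CRF default of 28 (x265's own default), and the choice to copy audio are assumptions; a suitable CRF depends on the footage and how much quality loss is acceptable.

```python
from pathlib import PurePosixPath

LEGACY_EXTENSIONS = {".avi", ".wmv", ".flv"}


def archive_commands(paths, crf=28):
    """Plan one FFmpeg command per legacy file, targeting H.265 in MKV."""
    commands = []
    for path in map(PurePosixPath, paths):
        if path.suffix.lower() not in LEGACY_EXTENSIONS:
            continue  # leave files that are already in a modern format
        commands.append([
            "ffmpeg", "-i", str(path),
            "-c:v", "libx265", "-crf", str(crf),  # lower CRF = higher quality
            "-c:a", "copy",  # assumed: the original audio is kept as-is
            str(path.with_suffix(".mkv")),
        ])
    return commands
```

Returning argument lists rather than running anything lets the plan be reviewed, or spot-checked on a single file, before committing to a full migration.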
For video editors, the proxy workflow is a recurring need: converting large RAW or 4K footage into lightweight proxy files for smooth editing, then relinking to the original high-res files at export. This skill can outline the exact conversion settings to create proxies that match your editing software's expectations and keep your timeline clean.