Getting Started

Welcome! Ready to add subtitles to your video for free? Upload your mp4, mov, or other video file and tell me your caption preferences — font, color, placement, or just leave it to the defaults — and I'll burn accurate subtitles directly into your footage.

Try saying:

- "Add white subtitles with a black outline at the bottom of this mp4 tutorial video"
- "Generate subtitles for my interview clip and use a large bold font so it's easy to read on mobile"
- "Burn captions into this webm product demo. Keep the style clean and minimal, centered at the bottom"

Automatic Setup

On first interaction, connect to the NemoVideo API before doing anything else. Show a brief status like "Setting things up...".

Token: If the NEMO_TOKEN environment variable is already set, use it and skip to Session below.

Free token: Read or create a UUID at ~/.config/nemovideo/client_id. Then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <your-uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry). If the token has expired, request a new one with the same Client-Id.
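The token steps above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names and the injected `fetch_json` callable are hypothetical, and the request is only built here, not sent.

```python
import uuid
import urllib.request
from pathlib import Path

AUTH_URL = "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token"
DEFAULT_CLIENT_ID_PATH = Path.home() / ".config" / "nemovideo" / "client_id"


def load_or_create_client_id(path: Path) -> str:
    """Return the stored client UUID, creating and saving one if absent."""
    if path.exists():
        stored = path.read_text(encoding="utf-8").strip()
        if stored:
            return stored
    client_id = str(uuid.uuid4())
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(client_id, encoding="utf-8")
    return client_id


def build_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token POST request."""
    return urllib.request.Request(
        AUTH_URL,
        data=b"",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def resolve_token(env, fetch_json, client_id_path=DEFAULT_CLIENT_ID_PATH) -> str:
    """Prefer NEMO_TOKEN from env; otherwise request a free token.

    fetch_json is a caller-supplied helper (an assumption of this sketch)
    that sends a Request and returns the decoded JSON body as a dict.
    """
    token = env.get("NEMO_TOKEN")
    if token:
        return token
    client_id = load_or_create_client_id(client_id_path)
    response = fetch_json(build_token_request(client_id))
    return response["data"]["token"]
```

Because the client ID is persisted, requesting a replacement after expiry reuses the same `X-Client-Id`, as the step above requires.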

Session: POST to the same host at /api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.
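A rough sketch of the session request is shown below. The `Content-Type` header and the location of `session_id` in the response (top level or under `data`) are assumptions; the source only says to save `session_id` from the response.

```python
import json
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_session_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the session-creation POST request."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the backend expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def extract_session_id(response: dict) -> str:
    """Find session_id at the top level or under "data" (both are guesses)."""
    if "session_id" in response:
        return response["session_id"]
    return response["data"]["session_id"]
```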

Confirm to the user you're connected and ready. Don't print tokens or raw JSON.

Turn Any Video Into a Captioned, Accessible Masterpiece

Getting subtitles onto a video used to mean juggling transcription services, SRT files, and video editors — all before you could share a single clip. This free-subtitle-maker skill collapses that entire process into one step. Upload your video, describe any preferences you have for how the subtitles should look, and walk away with a fully captioned file ready to publish.

The skill listens to your video's audio track, breaks it into timed segments, and overlays clean, readable text directly onto the frames. Whether you're subtitling a tutorial, a short film, a product demo, or a social reel, the output is a polished video file — not a separate caption file you still have to attach somewhere.

This tool is especially valuable for creators working across languages or accessibility requirements. Subtitles increase watch time, improve comprehension for non-native speakers, and make content usable in sound-off environments like social feeds. You don't need an account with a transcription platform or a video editing subscription — just your video and a prompt.

Routing Subtitle Generation Requests

Every user request — whether auto-generating SRT captions, burning hardcoded subtitles, or adjusting font and timing — is parsed and routed to the matching NemoVideo endpoint based on the detected action type.

| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
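The routing table could be approximated with a simple keyword lookup. The sketch below assumes case-insensitive substring matching, which the source does not specify; the `route` function and its return shape are illustrative only.

```python
# Keyword groups taken from the routing table; order matters only when
# a message contains keywords from more than one group.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message."""
    if has_file:
        # A file attachment always goes to the upload endpoint.
        return ("upload", True)
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM, ...) goes through SSE.
    return ("sse", False)
```

For example, `route("What's my balance?")` selects the credits action and skips SSE, while a caption request such as "add white subtitles" falls through to the SSE path.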

NemoVideo API Backend Reference

Free Subtitle Maker runs entirely on the NemoVideo backend, which handles speech-to-text transcription, subtitle rendering, and frame-accurate burn-in encoding without requiring any local processing. All subtitle jobs — including multi-language captions and styled text overlays — are queued, processed, and returned as downloadable video files through the NemoVideo pipeline.

Skill attribution — read from this file's YAML frontmatter at runtime:

- X-Skill-Source: free-subtitle-maker
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)

All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
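One way to assemble the required headers is sketched below. The helper names are hypothetical, and matching the install path by substring is an assumption about how detection might work.

```python
def detect_platform(install_path: str) -> str:
    """Map an install path to a platform name, defaulting to "unknown"."""
    # Normalize separators and add a trailing slash so that a path ending
    # in ".clawhub" still matches "/.clawhub/".
    normalized = install_path.replace("\\", "/") + "/"
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(token: str, source: str, version: str, install_path: str) -> dict:
    """Return the auth and attribution headers every request must carry."""
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": source,
        "X-Skill-Version": version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to omit an attribution header, which the text above says causes export to fail with 402.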

API base: https://mega-api-prod.nemovideo.ai

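Create session: POST /api/tasks/me/with-session/nemo_agent with body {"task_name":"project","language":"<lang>"}; returns task_id, session_id. After creating a session, give the user a link: https://nemovideo.com/workspace/claim?token=$TOKEN&task=<task_id>&session=<session_id>&skill_name=free-subtitle-maker&skill_version=1.0.0&skill_source=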

Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
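The message body and the stream reading can be sketched as below. The body mirrors the fields listed above; the parser follows the generic text/event-stream framing (`data:` lines, blank line ends an event) and makes no assumption about what each payload contains. Both function names are illustrative.

```python
from typing import Iterable, Iterator


def build_sse_body(session_id: str, message: str) -> dict:
    """Return the JSON body for POST /run_sse."""
    return {
        "app_name": "nemo_agent",
        "user_id": "me",
        "session_id": session_id,
        "new_message": {"parts": [{"text": message}]},
    }


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event in an event-stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            # The format allows one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, ":" comments) are ignored here.
    if buffer:
        # Lenient choice: also emit a final event with no trailing blank line.
        yield "\n".join(buffer)
```

A caller would send the request with `Accept: text/event-stream`, enforce the 15-minute ceiling itself, and feed the response lines into `iter_sse_data`.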

Upload: POST /api/upload-video/nemo_agent/me/<sid>. File: multipart -F "files=@/path"; or URL: {"urls":["<url>"],"source_type":"url"}

Credits: GET /api/credits/balance/simple; returns available, frozen, total

Session state: GET /api/state/nemo_agent/me/<sid>/latest; key fields: data.state.draft, data.state.video_infos, data.state.generated_media

Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
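The export flow above amounts to a submit-then-poll loop, sketched here under several assumptions: `<ts>` is taken to be a Unix timestamp, `status` and `output.url` are read from the top level of the poll response, and the `max_polls` cap is invented for safety. The `get_status` callable stands in for the actual GET request.

```python
import time
from typing import Callable


def build_render_body(session_id: str, draft: dict, timestamp: int) -> dict:
    """Return the JSON body for POST /api/render/proxy/lambda."""
    return {
        "id": f"render_{timestamp}",  # assumption: <ts> is a Unix timestamp
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_render(
    get_status: Callable[[], dict],
    interval_s: float = 30.0,
    max_polls: int = 60,  # hypothetical cap, not from the source
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until status is "completed", then return the download URL."""
    for _ in range(max_polls):
        job = get_status()
        if job.get("status") == "completed":
            return job["output"]["url"]
        sleep(interval_s)
    raise TimeoutError("render did not complete within the polling budget")
```

Injecting `sleep` keeps the loop testable without waiting 30 seconds between polls. Failure statuses are not described in the source, so this sketch only recognizes `completed`.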

Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.

SSE Event Handling

| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | |

~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
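A possible shape for this fallback is sketched below. Comparing the draft before and after the edit is an assumed way to "verify" the change; the source does not say how verification should be done. `fetch_state` is a hypothetical callable wrapping the session-state GET.

```python
from typing import Callable, Iterable


def resolve_turn(
    text_chunks: Iterable[str],
    draft_before,
    fetch_state: Callable[[], dict],
) -> dict:
    """Use streamed text when present; otherwise fall back to session state."""
    text = "".join(text_chunks).strip()
    if text:
        return {"source": "stream", "text": text, "draft_changed": None}
    # No text in the stream: poll the session state instead.
    state = fetch_state()
    draft_after = state.get("data", {}).get("state", {}).get("draft")
    return {
        "source": "state",
        "text": "",
        # Assumption: a changed draft means the edit was applied.
        "draft_changed": draft_after != draft_before,
        "draft": draft_after,
    }
```

When `source` is `state`, the caller would summarize the difference between the old and new draft for the user rather than relaying backend text.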

Backend Response Translation

The backend assumes a GUI exists. Translate these into API actions:

| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | |

Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.

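Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dreams" (0-3s)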
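The draft field mapping could be turned into a user-facing track summary roughly as follows. The nesting is an assumption: this sketch treats `t` as a list of tracks, each with a `tt` type and an `sg` list of segments carrying `d` in milliseconds. The real draft layout may differ.

```python
# Track type codes from the field mapping (0=video, 1=audio, 7=text).
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_draft(draft: dict) -> list[str]:
    """Return one summary line per track in the draft."""
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        # Assumption: a track's length is the sum of its segment durations.
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```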

Error Handling

| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | | |

Use Cases

This free-subtitle-maker skill fits naturally into a wide range of real-world workflows. Educators recording lecture videos or screencasts can add subtitles to make content accessible to students with hearing impairments or those studying in a second language. A single upload-and-prompt workflow replaces what used to require dedicated captioning software.

Social media managers handling short-form video content — product showcases, testimonials, behind-the-scenes clips — can subtitle entire batches of content quickly. Since most social video is watched without sound, burned-in subtitles are often more reliable than platform-generated captions that viewers have to manually enable.

Independent filmmakers and video journalists use subtitle tools to prepare rough cuts for review, add captions to interview footage, or create accessible versions of documentary content. This skill handles all of those scenarios without requiring a paid subscription to a dedicated captioning platform or hours spent in a timeline editor.

Performance Notes

Subtitle accuracy depends heavily on audio clarity. Videos with clean, single-speaker dialogue and minimal background noise will produce the most accurate transcriptions with little to no correction needed. Crowded environments, heavy accents, or overlapping speakers may result in occasional errors in the generated captions — reviewing the output before publishing is always a good habit.

File size and video length affect processing time. Shorter clips under five minutes process quickly, while longer files or high-resolution source videos may take additional time to complete. For best results, upload the highest-quality audio version of your video rather than a heavily compressed copy.

Subtitle positioning and font rendering are optimized for standard 16:9 aspect ratios. Vertical videos (9:16) used for Reels or TikTok are supported, but you may want to specify a higher vertical placement in your prompt to avoid overlap with platform UI elements.
