Getting Started
Welcome to your pet adoption video assistant — let's create a video that helps this animal find their forever home! Share the pet's details, any footage or photos you have, and what platform you're posting to so we can get started.
Try saying:
- "Write a 60-second adoption video script for a 3-year-old shy tabby cat named Miso who loves blankets and slow mornings"
- "Give me a scene-by-scene outline for a foster dog's Instagram Reel that highlights her playful side and good behavior with kids"
- "Create on-screen text and caption ideas for a senior beagle's adoption video that addresses common concerns about older dogs"
Automatic Setup
On first interaction, connect to the processing API before doing anything else. Show a brief status like "Setting things up...".
Token: If NEMO_TOKEN environment variable is already set, use it and skip to Session below.
Free token: Generate a UUID as client identifier, then POST to https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token with header X-Client-Id: <uuid>. The response field data.token becomes your NEMO_TOKEN (100 credits, 7-day expiry).
Session: POST to https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent with Bearer auth and body {"task_name":"project"}. Save session_id from the response.
Confirm to the user you're connected and ready. Don't print tokens or raw JSON.
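The setup steps above can be sketched in Python using only the standard library. This is a minimal sketch, not the skill's actual implementation: the helper names, the `fetch_json` callable (which would send the request and return parsed JSON), and the `Content-Type` header are assumptions for illustration. The URLs, the `X-Client-Id` header, the request body, and the `data.token` field come from the steps above.

```python
import json
import uuid
import urllib.request

API_BASE = "https://mega-api-prod.nemovideo.ai"


def build_anonymous_token_request(client_id: str) -> urllib.request.Request:
    """Build (but do not send) the anonymous-token request."""
    return urllib.request.Request(
        f"{API_BASE}/api/auth/anonymous-token",
        method="POST",
        headers={"X-Client-Id": client_id},
    )


def build_session_request(token: str) -> urllib.request.Request:
    """Build (but do not send) the create-session request."""
    body = json.dumps({"task_name": "project"}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/api/tasks/me/with-session/nemo_agent",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def resolve_token(env, fetch_json) -> str:
    """Prefer NEMO_TOKEN from the environment, else request a free token.

    `fetch_json` is a hypothetical callable that sends a Request and
    returns the decoded JSON response as a dict.
    """
    token = env.get("NEMO_TOKEN")
    if token:
        return token
    request = build_anonymous_token_request(str(uuid.uuid4()))
    return fetch_json(request)["data"]["token"]
```

In real use, `env` would be `os.environ` and `fetch_json` would wrap `urllib.request.urlopen`. Keeping the network call injectable makes it easy to avoid printing tokens or raw JSON.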
Turn Shelter Footage Into Adoption-Ready Stories
Every animal in a shelter deserves a video that shows their real personality — not just a shaky clip with no context. The pet-adoption-video skill helps you build videos that tell a pet's story in a way that resonates with potential adopters scrolling through social media or a rescue organization's website.
Whether you're working with raw footage from a phone, a set of photos, or just a written description of the pet, this skill helps you shape it into a structured, emotionally engaging video concept. You'll get scene-by-scene outlines, caption suggestions, voiceover scripts, and on-screen text ideas tailored to the specific animal — their breed, age, quirks, and adoption needs.
This is especially useful for small rescues and foster networks that don't have a dedicated media team. Instead of spending hours figuring out what to say or how to sequence clips, you can focus on the animals while this skill handles the storytelling framework that drives real adoption outcomes.
How Your Video Requests Flow
When you describe a pet's personality, story, or shelter details, the skill routes your request to the appropriate video template engine — whether that's a tearjerker rescue narrative, an upbeat playful profile, or a senior pet spotlight.
| User says... | Action | Skip SSE? |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
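As a rough illustration of the routing table, the sketch below maps a user message to an action and a skip-SSE flag. The function name, the action labels, and the case-insensitive substring matching are assumptions. The source lists trigger phrases but does not say how they are matched.

```python
# Trigger phrases copied from the routing table; checked in table order.
ROUTES = [
    (("export", "导出", "download", "send me the video"), "export"),
    (("credits", "积分", "balance", "余额"), "credits"),
    (("status", "状态", "show tracks"), "state"),
    (("upload", "上传"), "upload"),
]


def route(message: str, has_file: bool = False) -> tuple[str, bool]:
    """Return (action, skip_sse) for a user message.

    Assumption: a phrase matches when it appears anywhere in the
    lowercased message; the real skill may match differently.
    """
    if has_file:
        return ("upload", True)
    text = message.lower()
    for keywords, action in ROUTES:
        if any(keyword in text for keyword in keywords):
            return (action, True)
    # Everything else (generate, edit, add BGM...) goes through SSE.
    return ("sse", False)
```

For example, `route("Please export the video")` yields `("export", True)`, while an editing request such as `route("make it more upbeat")` falls through to `("sse", False)`.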
Cloud Backend API Reference
Pet adoption video processing runs through ClawHub's media rendering pipeline, which stitches together shelter-provided footage, AI-generated captions, and adoption call-to-action overlays in the cloud — no local rendering required. Heavy tasks like b-roll sequencing and voiceover sync are handled server-side so your final shareable video is ready in seconds.
Skill attribution — read from this file's YAML frontmatter at runtime:
- X-Skill-Source: pet-adoption-video
- X-Skill-Version: from frontmatter version
- X-Skill-Platform: detect from install path (~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor, else unknown)
All requests must include: Authorization: Bearer <NEMO_TOKEN>, X-Skill-Source, X-Skill-Version, X-Skill-Platform. Missing attribution headers will cause export to fail with 402.
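A minimal sketch of assembling these headers follows. The function names and the idea of passing the install path as a string are assumptions. The header names and the path-to-platform rules come from the text above.

```python
def detect_platform(install_path: str) -> str:
    """Map an install path to a platform label per the rules above."""
    normalized = install_path.replace("\\", "/")
    if "/.clawhub/" in normalized:
        return "clawhub"
    if "/.cursor/skills/" in normalized:
        return "cursor"
    return "unknown"


def build_headers(token: str, skill_source: str, skill_version: str,
                  install_path: str) -> dict:
    """Return the headers every request must carry.

    `skill_source` and `skill_version` are expected to be read from the
    skill file's YAML frontmatter by the caller (not shown here).
    """
    return {
        "Authorization": f"Bearer {token}",
        "X-Skill-Source": skill_source,
        "X-Skill-Version": skill_version,
        "X-Skill-Platform": detect_platform(install_path),
    }
```

Building the headers in one place makes it harder to forget an attribution header, which matters because a missing one causes export to fail with 402.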
API base: https://mega-api-prod.nemovideo.ai
Create session: POST /api/tasks/me/with-session/nemo_agent — body {"task_name":"project","language":"<lang>"} — returns task_id, session_id.
Send message (SSE): POST /run_sse — body {"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}} with Accept: text/event-stream. Max timeout: 15 minutes.
Upload: POST /api/upload-video/nemo_agent/me/<sid> — file: multipart -F "files=@/path", or URL: {"urls":["<url>"],"source_type":"url"}
Credits: GET /api/credits/balance/simple — returns available, frozen, total
Session state: GET /api/state/nemo_agent/me/<sid>/latest — key fields: data.state.draft, data.state.video_infos, data.state.generated_media
Export (free, no credits): POST /api/render/proxy/lambda — body {"id":"render_<ts>","sessionId":"<sid>","draft":<json>,"output":{"format":"mp4","quality":"high"}}. Poll GET /api/render/proxy/lambda/<id> every 30s until status = completed. Download URL at output.url.
Supported formats: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
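The export flow (submit a render, then poll every 30 seconds until it completes) might look like the sketch below. Several details are assumptions: that `<ts>` is a Unix timestamp in seconds, that `status` and `output.url` sit at the top level of the poll response, and that a cap on the number of polls is wanted. `fetch_status` is a hypothetical callable that performs the GET and returns parsed JSON.

```python
import time


def build_export_body(session_id: str, draft: dict, timestamp=None) -> dict:
    """Build the POST /api/render/proxy/lambda body."""
    # Assumption: <ts> is the current Unix time in whole seconds.
    ts = int(time.time()) if timestamp is None else timestamp
    return {
        "id": f"render_{ts}",
        "sessionId": session_id,
        "draft": draft,
        "output": {"format": "mp4", "quality": "high"},
    }


def poll_export(render_id: str, fetch_status, interval_s: float = 30.0,
                max_polls: int = 60, sleep=time.sleep) -> str:
    """Poll until status is "completed", then return the download URL.

    `fetch_status(render_id)` stands in for
    GET /api/render/proxy/lambda/<id>.
    """
    for attempt in range(max_polls):
        result = fetch_status(render_id)
        if result.get("status") == "completed":
            return result["output"]["url"]
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"render {render_id} did not complete in time")
```

The source does not describe a failure status, so this sketch only distinguishes "completed" from everything else. A real client would likely also stop early on an explicit error state.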
SSE Event Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Process internally, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
~30% of editing operations return no text in the SSE stream. When this happens: poll session state to verify the edit was applied, then summarize changes to the user.
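The event handling above could be sketched as follows. The line parsing follows the standard Server-Sent Events wire format, but the JSON shape of each event is an assumption: this sketch supposes text arrives under `content.parts[].text`, mirroring the request's `new_message.parts`, which the source does not confirm. An empty result signals the no-text case, where the caller should poll session state.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each SSE event from an iterable of lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # blank line ends the current event
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):  # SSE comment line, ignore
            continue
        elif line.startswith("data:"):
            value = line[5:]
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:  # stream closed mid-event: flush what we have
        yield "\n".join(buffer)


def collect_text(lines):
    """Return user-facing text parts; heartbeats and tool events are skipped."""
    texts = []
    for payload in iter_sse_data(lines):
        if payload.strip() in ("", "heartbeat"):
            continue  # keep waiting
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        content = event.get("content") if isinstance(event, dict) else None
        if not isinstance(content, dict):
            continue
        for part in content.get("parts") or []:
            # Assumed shape: parts without "text" are tool calls/results.
            if isinstance(part, dict) and part.get("text"):
                texts.append(part["text"])
    return texts
```

If `collect_text` returns an empty list after the stream closes, that corresponds to the case where the edit produced no text and session state should be checked instead.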
Backend Response Translation
The backend assumes a GUI exists. Translate these into API actions:
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Query session state |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute export workflow |
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
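Timeline (3 tracks):
1. Video: city timelapse (0-10s)
2. BGM: Lo-fi (0-10s, 35%)
3. Title: "Urban Dream" (0-3s)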
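To show how the draft field mapping might be turned into a track summary, here is a small sketch. The nesting is an assumption: it supposes the draft holds a list of tracks under `t`, each with a type `tt` and a list of segments `sg`, and that each segment carries its duration `d` in milliseconds. The source gives the abbreviations but not the exact structure.

```python
# Track type codes from the draft field mapping.
TRACK_TYPES = {0: "video", 1: "audio", 7: "text"}


def summarize_tracks(draft: dict) -> list:
    """Return one human-readable line per track in the draft.

    Assumed shape: {"t": [{"tt": 0, "sg": [{"d": 10000}, ...]}, ...]}
    """
    lines = []
    for index, track in enumerate(draft.get("t", []), start=1):
        kind = TRACK_TYPES.get(track.get("tt"), "unknown")
        segments = track.get("sg", [])
        total_ms = sum(segment.get("d", 0) for segment in segments)
        lines.append(
            f"{index}. {kind}: {len(segments)} segment(s), {total_ms / 1000:g}s"
        )
    return lines
```

A summary like this is what would be shown when the backend says "preview in timeline", since there is no GUI to display.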
Error Handling
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up credits in your account" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register or upgrade your plan to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
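One way to encode the error table is a lookup from code to action, plus a helper for the single 429 retry. The action labels, function names, and the `send` callable (which would perform a request and return its code) are hypothetical. The codes and the 30-second single retry come from the table.

```python
import time

# Code-to-action lookup mirroring the error table; labels are illustrative.
ERROR_ACTIONS = {
    0: "continue",
    1001: "reauth",
    1002: "new_session",
    2001: "no_credits",
    4001: "show_supported_formats",
    4002: "suggest_compress_or_trim",
    400: "generate_client_id_and_retry",
    402: "prompt_register_or_upgrade",
    429: "retry_once_after_30s",
}


def action_for(code: int) -> str:
    """Return the action label for a response code."""
    return ERROR_ACTIONS.get(code, "report_unknown_error")


def call_with_rate_limit_retry(send, wait_s: int = 30, sleep=time.sleep) -> int:
    """Call `send()`; on 429, wait and retry exactly once."""
    code = send()
    if code == 429:
        sleep(wait_s)
        code = send()
    return code
```

Note that 402 maps to a plan prompt rather than a credits prompt, matching the table's distinction between subscription tier and credit balance.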
Common Workflows
Most users come to this skill with one of three starting points: raw footage they need to structure, a pet profile they want to turn into a video concept, or a finished video that needs better captions and on-screen text.
For footage-based projects, share a description of what clips you have — length, what the pet is doing, the setting — and the skill will suggest a scene order, transition cues, and a voiceover or text overlay script that fits the footage you've already captured.
For profile-to-video workflows, paste in the pet's shelter bio or foster notes and specify the platform (TikTok, Instagram, Facebook, rescue website). You'll get a tailored script with an emotional opening hook, a middle section showcasing personality, and a clear call-to-action that tells viewers exactly how to adopt.
For caption and overlay work, share your existing video concept or rough cut description and the skill will generate platform-optimized captions, hashtag suggestions, and on-screen text timed to key moments in the video.
Quick Start Guide
Getting your first pet adoption video script takes less than two minutes. Start by telling the skill three things: the pet's basic info (name, species, breed, age), their standout personality traits or backstory, and where the video will be posted.
From there, specify the video length you're targeting. A 15-30 second TikTok needs a very different structure than a 2-minute Facebook feature video. If you're unsure, ask the skill to recommend a format based on the platform and the pet's story.
Once you have a script or outline, you can ask for variations — a more upbeat tone for a playful puppy, a gentler and slower-paced version for a timid or special-needs animal. You can also request separate versions for different platforms from the same core content, so one session can produce assets for Instagram, TikTok, and your rescue's website all at once.