0. First Contact
When the user opens this skill or sends their first message, greet them immediately:
🎶 Hey! I'm ready to help you with arena-ai. Send me a video file or just tell me what you need!
Try saying:
- "add effects to this clip"
- "help me create a short video"
- "edit my video"
IMPORTANT: Always greet the user proactively on first contact, and let them know you're setting up while the connection is established.
Auto-Setup
When the user first interacts, set up the connection:
1. Check token: If NEMO_TOKEN env var is set, use it. Otherwise:
2. Read or generate Client-ID:
   - Read ~/.config/nemovideo/client_id if it exists
   - Otherwise generate a UUID, save it to ~/.config/nemovideo/client_id
3. Acquire anonymous token:
   curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
   Store the returned token as NEMO_TOKEN for this session. You get 100 free credits.
4. Create a session (§3.0) so you're ready to work immediately.
Let the user know briefly: "Setting things up… ready!" then proceed with their request.
Where Competitive Footage Meets Conversational Precision
Arena-ai was designed for one specific reality: event and sports video is dense, repetitive, and time-consuming to review manually. A two-hour match recording might contain thirty seconds of genuinely shareable content — and finding it traditionally means scrubbing through every minute yourself. Arena-ai changes that by letting you describe what you want in plain language and returning exactly those moments, trimmed and ready.
The skill connects to ClawHub's OpenClaw agent, which interprets your natural language requests and maps them against the video's temporal structure. Whether you're asking for 'every goal attempt in the second half' or 'crowd reactions after scoring plays,' the OpenClaw agent translates intent into precise edit instructions — no timeline manipulation required on your end.
Under the hood, arena-ai processes motion vectors, audio spikes, and scene composition data to build a structural map of your footage before you ask a single question. This pre-analysis means responses are fast and context-aware. Coaches building film sessions, esports teams reviewing tournament VODs, and live event producers cutting recap content all work within the same conversational loop — describe, refine, export.
Environment Variables
| Variable | Required | Default |
|---|---|---|
| NEMO_TOKEN | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| NEMO_API_URL | No | https://mega-api-prod.nemovideo.ai |
| NEMO_WEB_URL | No | https://nemovideo.com |
| NEMO_CLIENT_ID | No | Auto-generated UUID, persisted to ~/.config/nemovideo/client_id (UUID only, no secrets) |
| SKILL_SOURCE | No | Auto-detected from install path, fallback unknown |
If NEMO_TOKEN is not set, get one (requires X-Client-Id header):
# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
  CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
  # Use && so the file is written only after the directory exists
  mkdir -p ~/.config/nemovideo && echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}
Save token as NEMO_TOKEN, CLIENT_ID as NEMO_CLIENT_ID. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via Settings → API Tokens on nemovideo.com. If your token expires, request a new one with the same X-Client-Id.
Local persistence: This skill writes ~/.config/nemovideo/client_id to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.
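As a minimal sketch of the "save token as NEMO_TOKEN" step, the helper below pulls the token field out of the anonymous-token response with sed. The function name extract_token is hypothetical, and the pattern assumes the compact response shape shown in the example response; a real JSON parser would be more robust where one is available.

```shell
# Hypothetical helper: print the value of the "token" field in a JSON string.
# Assumes compact JSON and a token value without escaped quotes.
extract_token() {
  printf '%s\n' "$1" | sed -n 's/.*"token":"\([^"]*\)".*/\1/p'
}

# Usage (not run here because it needs network access):
#   RESPONSE=$(curl -s -X POST "$API/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID")
#   NEMO_TOKEN=$(extract_token "$RESPONSE")
```

If the response carries no token field, the helper prints nothing, so an empty result can be treated as a failed request.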
2. Routing Requests to the Correct Endpoint
Use the table below to determine which endpoint handles each type of user request.
| User says... | Action | Skip SSE |
|---|---|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" | → §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
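The routing table can be sketched as a single case statement. The function name route_request and the returned labels are hypothetical, and plain substring matching is an assumption: it is cruder than real intent detection (a request such as "add a status bar" would be routed to the state check), so treat this as an illustration of the precedence order only. The "user sends file" case is not text-based and is left out.

```shell
# Hypothetical router: map a user message to the section that handles it.
route_request() {
  # Lowercase ASCII letters so keyword matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *export*|*导出*|*download*|*"send me the video"*) echo "export" ;;   # §3.5
    *credits*|*积分*|*balance*|*余额*)               echo "credits" ;;  # §3.3
    *status*|*状态*|*"show tracks"*)                  echo "state" ;;    # §3.4
    *upload*|*上传*)                                  echo "upload" ;;   # §3.2
    *)                                                echo "sse" ;;      # §3.1
  esac
}
```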
3. Primary Workflow Sequences
$API = ${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}, $TOKEN = ${NEMO_TOKEN}, $WEB = ${NEMO_WEB_URL:-https://nemovideo.com}.
Skill attribution variables — do NOT hardcode:
- $SKILL_NAME = this file's YAML frontmatter name
- $SKILL_VERSION = this file's YAML frontmatter version
- $SKILL_SOURCE = detect at runtime using the first match:
  1. Env SKILL_SOURCE if set
  2. If this file's path contains a known platform directory (e.g. ~/.clawhub/ → clawhub, ~/.cursor/skills/ → cursor)
  3. Fallback: unknown
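The first-match detection order for the skill source might look like the sketch below. The function name detect_skill_source is hypothetical, and only the two platform directories named in the list are covered; any other platforms would need their own patterns.

```shell
# Hypothetical helper: resolve the skill source for a given skill file path.
detect_skill_source() {
  # 1. An explicitly set environment variable wins.
  if [ -n "${SKILL_SOURCE:-}" ]; then
    echo "$SKILL_SOURCE"
    return
  fi
  # 2. Otherwise look for a known platform directory in the path.
  case "$1" in
    */.clawhub/*)       echo "clawhub" ;;
    */.cursor/skills/*) echo "cursor" ;;
    # 3. Fallback when nothing matches.
    *)                  echo "unknown" ;;
  esac
}
```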
CRITICAL: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.
X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
3.0 Initializing a Session
curl -s -X POST "$API/api/tasks/me/with-session/nemo_agent" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
-d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}
Before any work can begin, a session must be established with the backend. Every subsequent action within that conversation is tied to the session identifier returned at this step.
Open in browser: After creating a session, give the user a link to view/edit the task in NemoVideo:
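$WEB/workspace/claim?task={task_id}&session={session_id}&skill_name=$SKILL_NAME&skill_version=$SKILL_VERSION&skill_source=$SKILL_SOURCE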
3.1 Delivering Messages Over SSE
curl -s -X POST "$API/run_sse" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
-d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'
All conversational messages are streamed to the client using Server-Sent Events, keeping the connection alive throughout processing.
SSE Handling
| Event | Action |
|---|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result | Wait silently, don't forward |
| heartbeat / empty data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.
Timeout: 10 min heartbeats-only → assume timeout. Never re-send during generation (duplicates + double-charge).
Ignore trailing "I encountered a temporary issue" if prior responses were normal.
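The heartbeat and timeout rules above can be sketched as two small helpers. Both function names are hypothetical. How the backend actually marks a heartbeat on the wire is not stated in this document, so the patterns below (an SSE comment line, an "event: heartbeat" line, or an empty data: field) are assumptions; the 120-second and 600-second thresholds come from the text.

```shell
# Hypothetical classifier for one raw line of the SSE stream.
classify_sse_line() {
  case "$1" in
    "")                                   echo "boundary" ;;   # blank line ends an event
    :*)                                   echo "heartbeat" ;;  # SSE comment line
    "event: heartbeat"|"event:heartbeat") echo "heartbeat" ;;  # assumed heartbeat marker
    "data:"|"data: ")                     echo "heartbeat" ;;  # empty data: field
    data:*)                               echo "data" ;;       # payload to inspect further
    *)                                    echo "other" ;;      # id:, retry:, other events
  esac
}

# Decide what to do after $1 seconds of heartbeat-only traffic,
# where $2 is the number of seconds since the last progress notice.
heartbeat_action() {
  if [ "$1" -ge 600 ]; then
    echo "timeout"   # 10 minutes of heartbeats only
  elif [ "$2" -ge 120 ]; then
    echo "notify"    # time for another "Still working..." message
  else
    echo "wait"
  fi
}
```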
Silent Response Fallback (CRITICAL)
Approximately 30% of editing operations complete without returning any text in the SSE stream. When no text content is received, do not treat this as a failure. Instead: (1) check the task status endpoint to confirm completion, (2) retrieve the output asset URL directly, (3) present the result to the user as a successful edit, and (4) prompt the user for their next action.
Two-stage generation: When a raw video asset is produced, the backend automatically initiates a second processing stage that layers in background music and generates a title sequence. Treat these as two distinct phases: Phase 1 delivers the unprocessed video, and Phase 2 delivers the fully composed output. Wait for Phase 2 completion before surfacing the final result to the user.
3.2 Handling File Uploads
File upload: INLINECODE39
URL upload: INLINECODE40
Use me in the path; backend resolves user from token.
Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
Video, image, and audio file types are accepted through the upload endpoint, enabling users to supply their own source media for editing.
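A client-side pre-check against the supported format list could look like the sketch below. The function name is_supported_upload is hypothetical, and checking the file extension only is an assumption; the backend may inspect the actual content and still reject a file that passes this check.

```shell
# Hypothetical check: succeed if the file name ends in a supported extension.
is_supported_upload() {
  # Names without a dot have no extension to check.
  case "$1" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Take the text after the last dot and lowercase it.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp4|mov|avi|webm|mkv|jpg|png|gif|webp|mp3|wav|m4a|aac) return 0 ;;
    *) return 1 ;;
  esac
}
```

Running this before an upload lets the skill show the supported formats immediately instead of waiting for a 4001 response.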
3.3 Checking Available Credits
curl -s "$API/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}
Query the credits endpoint before initiating any generation task to confirm the user has a sufficient balance.
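A balance pre-check might be sketched as follows. Both function names are hypothetical, the sed pattern assumes a compact response with an integer available field, and the cost passed as the second argument is an assumption because this document does not state per-operation prices.

```shell
# Hypothetical helper: print the "available" value from a balance response.
available_credits() {
  printf '%s\n' "$1" | sed -n 's/.*"available":\([0-9][0-9]*\).*/\1/p'
}

# Succeed if the balance in $1 (JSON) covers an estimated cost of $2 credits.
has_enough_credits() {
  avail=$(available_credits "$1")
  [ -n "$avail" ] && [ "$avail" -ge "$2" ]
}
```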
3.4 Polling Task Status
curl -s "$API/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
Use me for user in path; backend resolves from token.
Key fields: data.state.draft, data.state.video_infos, data.state.canvas_config, data.state.generated_media.
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Draft ready for export when draft.t exists with at least one track with non-empty sg.
Track summary format:
CODEBLOCK7
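The draft-readiness rule in this section (a t array with at least one track whose sg is non-empty) can be approximated with grep, as in the sketch below. The function name draft_ready is hypothetical, and the patterns are a crude heuristic that assumes compact JSON with no whitespace between tokens; a proper JSON parser would be the more reliable choice.

```shell
# Crude heuristic: succeed if the draft JSON has a "t" array and at least
# one "sg" array whose first character after "[" is not "]".
draft_ready() {
  printf '%s\n' "$1" | grep -q '"t":\[' &&
    printf '%s\n' "$1" | grep -q '"sg":\[[^]]'
}
```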
3.5 Exporting and Delivering the Final Asset
Export does NOT cost credits. Only generation/editing consumes credits.
The delivery sequence is: (a) call the export endpoint with the completed task ID, (b) poll until the export job reaches a terminal state, (c) retrieve the signed download URL from the response, (d) present the URL to the user with a clear call to action, and (e) log the export event to the session record.
b) Submit: INLINECODE52
Note: sessionId is camelCase (exception). On failure → new id, retry once.
c) Poll (every 30s, max 10 polls): INLINECODE55
Status at top-level status: pending → processing → completed / failed. Download URL at output.url.
d) Download from output.url → send to user. Fallback: $API/api/render/proxy/<id>/download.
e) When delivering the video, always also give the task detail link: INLINECODE60
Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + task detail link.
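The polling step (every 30s, at most 10 polls, stopping on completed or failed) can be sketched as a generic loop. The function name poll_export is hypothetical, and the status-fetching command is passed in as an argument rather than hardcoded, so the sketch makes no assumption about the render endpoint itself.

```shell
# Hypothetical poller.
#   $1: command that prints the current status
#       (pending, processing, completed, or failed)
#   $2: delay in seconds between polls (30 in the text)
#   $3: maximum number of polls (10 in the text)
poll_export() {
  fetch=$1
  delay=$2
  max=$3
  n=0
  while [ "$n" -lt "$max" ]; do
    st=$($fetch)
    case "$st" in
      completed|failed)
        echo "$st"   # terminal state reached
        return 0
        ;;
    esac
    n=$((n + 1))
    sleep "$delay"
  done
  echo "timeout"     # poll budget exhausted without a terminal state
  return 1
}
```

A caller would pass a function that queries the render status and prints the top-level status field, for example `poll_export fetch_render_status 30 10`, where fetch_render_status is likewise a hypothetical name.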
3.6 Recovering from an SSE Disconnection
If the SSE stream drops unexpectedly, follow these five steps to recover gracefully:
1. Detect the disconnection event and immediately notify the user that the connection was interrupted.
2. Wait a minimum of three seconds before attempting to reconnect, to avoid thundering-herd conditions.
3. Re-establish the SSE connection using the original session ID and the last received event ID in the reconnection headers.
4. Poll the task status endpoint once to reconcile any events that may have been missed during the gap.
5. Resume normal streaming behavior and inform the user that the session has been restored.
4. Translating GUI Elements for the Backend
The backend operates under the assumption that all interactions originate from a graphical interface, so AI-layer instructions referencing UI components must be converted to their API equivalents before being forwarded.
| Backend says | You do |
|---|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" | Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |
Keep content descriptions. Strip GUI actions.
5. Recommended Interaction Patterns
• Confirm intent before consuming credits — summarize what the edit will do and ask for user approval before dispatching any generation request.
• Surface progress incrementally — relay SSE progress events to the user in plain language so they understand the task is actively running.
• Handle ambiguity with a single clarifying question — when a user request could be interpreted multiple ways, ask one focused question rather than presenting a long list of options.
• Offer a next step after every completed output — once an asset is delivered, suggest a logical follow-on action such as exporting, refining, or sharing the result.
• Preserve context across turns — retain the session ID, task IDs, and asset URLs throughout the conversation so the user never has to repeat information.
6. Known Constraints and Limitations
• Maximum source video duration accepted by the backend is capped; files exceeding this limit will be rejected at upload time.
• Concurrent generation tasks per session are restricted; queue additional requests rather than firing them in parallel.
• The SSE stream does not guarantee ordered delivery under high-latency conditions; always reconcile final state via the status endpoint.
• Exported assets are available via signed URL for a limited window; advise users to download promptly before the link expires.
• Credit balances are read at request time and are not locked; a balance that appears sufficient may be consumed by another client before the task begins.
7. Error Handling Reference
The table below maps HTTP status codes and backend error identifiers to their recommended recovery actions.
| Code | Meaning | Action |
|---|---|---|
| 0 | Success | Continue |
| 1001 | Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with ?bind=<id> (get <id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
Common: no video → generate first; render fail → retry new id; SSE timeout → §3.6; silent edit → §3.1 fallback.
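The error table can be turned into a small dispatch helper, sketched below. The function name recovery_action and the returned labels are hypothetical; only the codes and their meanings come from the table.

```shell
# Hypothetical mapper from a response code to a short recovery label.
recovery_action() {
  case "$1" in
    0)    echo "continue" ;;
    1001) echo "reauth" ;;                # bad or expired token
    1002) echo "new_session" ;;           # session not found, see §3.0
    2001) echo "no_credits" ;;            # show registration or top-up message
    4001) echo "show_formats" ;;          # unsupported file
    4002) echo "suggest_compress" ;;      # file too large
    400)  echo "generate_client_id" ;;    # missing X-Client-Id
    402)  echo "register_for_export" ;;   # subscription tier issue, not credits
    429)  echo "retry_once_after_30s" ;;  # rate limit
    *)    echo "unknown" ;;
  esac
}
```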
8. API Version and Required Token Scopes
Always verify that the API version in use matches the version this skill was written against before making any calls. Token scopes must include read and write permissions for sessions, tasks, and assets; a token missing any of these scopes will produce 403 responses on the affected endpoints. If a version mismatch is detected, surface a warning to the operator rather than proceeding silently.