0. First Contact
When the user opens this skill or sends their first message, greet them immediately:
⚡ Ready to qwen ai video generator! Just send me a video or describe your project.
Try saying:
- - "help me create a short video"
- "add effects to this clip"
- "edit my video"
IMPORTANT: Always greet the user proactively on first contact. Let them know you're setting up while connecting. Always greet the user proactively on first contact.
Auto-Setup
When the user first interacts, set up the connection:
- 1. Check token: If
NEMO_TOKEN env var is set, use it. Otherwise: - Read or generate Client-ID:
- Read
~/.config/nemovideo/client_id if it exists
- Otherwise generate a UUID, save it to
~/.config/nemovideo/client_id
- 3. Acquire anonymous token:
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
Store the returned
token as
NEMO_TOKEN for this session. You get 100 free credits.
- 4. Create a session (§3.0) so you're ready to work immediately.
Let the user know briefly: "Setting things up… ready!" then proceed with their request.
Conversational Video Creation Powered by Qwen Intelligence
The qwen-ai-video-generator skill connects ClawHub's OpenClaw agent framework to Qwen's multimodal model, enabling a fundamentally different approach to video creation. Instead of dragging clips on a timeline, you describe what you want — scenes, transitions, pacing, narration tone — and the agent interprets your intent and structures the output accordingly. This means less time wrestling with editing software and more time refining your actual message.
The OpenClaw agent acts as the orchestration layer, translating your conversational prompts into structured generation tasks that Qwen processes frame by frame. Whether you're building a product walkthrough, an educational explainer, or a short-form social clip, the agent maintains context across your session so follow-up instructions like 'make the intro shorter' or 'add a text overlay at the 10-second mark' are handled without starting over.
Under the hood, the skill handles encoding and format negotiation automatically, so your final output lands in whichever container format fits your platform — mp4 for broad compatibility, mov for professional pipelines, webm for web delivery, or avi and mkv for archival and editing workflows. No manual transcoding required.
Environment Variables
| Variable | Required | Default |
|---|
| INLINECODE5 | No | Auto-generated (100 free credits, expires in 7 days, revocable via Settings → API Tokens) |
| INLINECODE6 |
No |
https://mega-api-prod.nemovideo.ai |
|
NEMO_WEB_URL | No |
https://nemovideo.com |
|
NEMO_CLIENT_ID | No | Auto-generated UUID, persisted to
~/.config/nemovideo/client_id (UUID only, no secrets) |
|
SKILL_SOURCE | No | Auto-detected from install path, fallback
unknown |
If NEMO_TOKEN is not set, get one (requires X-Client-Id header):
# Generate or read persisted Client-Id
CLIENT_ID="${NEMO_CLIENT_ID:-$(cat ~/.config/nemovideo/client_id 2>/dev/null)}"
if [ -z "$CLIENT_ID" ]; then
CLIENT_ID=$(uuidgen 2>/dev/null || echo "client-$(date +%s)-$RANDOM")
mkdir -p ~/.config/nemovideo & echo "$CLIENT_ID" > ~/.config/nemovideo/client_id
fi
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token" -H "X-Client-Id: $CLIENT_ID"
# → {"code":0,"data":{"token":"nmv_usr_xxx","credits":100,...}}
Save
token as
NEMO_TOKEN,
CLIENT_ID as
NEMO_CLIENT_ID. Anonymous: 1 token per client per 7 days; token expires in 7 days and can be revoked at any time via
Settings → API Tokens on nemovideo.com. If your token expires, request a new one with the same
X-Client-Id.
Local persistence: This skill writes ~/.config/nemovideo/client_id to persist the Client-Id across sessions. This avoids generating a new ID on every request, which would hit the per-IP rate limit quickly (default 10 tokens per 7 days per IP). The file contains only a UUID — no credentials are stored locally.
2. Routing Incoming Requests to the Correct Endpoint
Use the table below to determine which API endpoint should handle each type of user request.
| User says... | Action | Skip SSE |
|---|
| "export" / "导出" / "download" / "send me the video" | → §3.5 Export | ✅ |
| "credits" / "积分" / "balance" / "余额" |
→ §3.3 Credits | ✅ |
| "status" / "状态" / "show tracks" | → §3.4 State | ✅ |
| "upload" / "上传" / user sends file | → §3.2 Upload | ✅ |
| Everything else (generate, edit, add BGM…) | → §3.1 SSE | ❌ |
3. Primary Workflow Sequences
$API = ${NEMO_API_URL:-https://mega-api-prod.nemovideo.ai}, $TOKEN = ${NEMO_TOKEN}, $WEB = ${NEMO_WEB_URL:-https://nemovideo.com}.
Skill attribution variables — do NOT hardcode:
- -
$SKILL_NAME = this file's YAML frontmatter INLINECODE26 - INLINECODE27 = this file's YAML frontmatter INLINECODE28
- INLINECODE29 = detect at runtime using the first match:
1. Env
SKILL_SOURCE if set
2. If this file's path contains a known platform directory (e.g.
~/.clawhub/ →
clawhub,
~/.cursor/skills/ →
cursor)
3. Fallback: INLINECODE35
CRITICAL: ALL API requests (including render/export/upload/state/credits) MUST include these headers. Missing them will cause export to fail with 402.
CODEBLOCK2
3.0 Initializing a New Session
curl -s -X POST "https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" \
-d '{"task_name":"project","language":"<lang>"}'
# → {"code":0,"data":{"task_id":"...","session_id":"..."}}
Before any interaction can begin, a session must be established with the backend. Store the returned session identifier, as every subsequent request within that conversation depends on it.
Open in browser: After creating a session, give the user a link to view/edit the task in NemoVideo:
INLINECODE36
3.1 Delivering Messages Over an SSE Connection
curl -s -X POST "https://mega-api-prod.nemovideo.ai/run_sse" \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-H "Accept: text/event-stream" -H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE" --max-time 900 \
-d '{"app_name":"nemo_agent","user_id":"me","session_id":"<sid>","new_message":{"parts":[{"text":"<msg>"}]}}'
All conversational messages are transmitted to the backend and responses are received through a persistent Server-Sent Events stream.
SSE Handling
| Event | Action |
|---|
| Text response | Apply GUI translation (§4), present to user |
| Tool call/result |
Wait silently, don't forward |
|
heartbeat / empty
data: | Keep waiting. Every 2 min: "⏳ Still working..." |
| Stream closes | Process final response |
Typical durations: text 5-15s, video generation 100-300s, editing 10-30s.
Timeout: 10 min heartbeats-only → assume timeout. Never re-send during generation (duplicates + double-charge).
Ignore trailing "I encountered a temporary issue" if prior responses were normal.
Silent Response Fallback (CRITICAL)
Approximately 30% of editing operations return a completed result without any accompanying text in the SSE stream. When no text content is detected in the response, do not treat this as an error or prompt the user to retry. Instead: (1) check the task status endpoint to confirm completion, (2) retrieve the output asset directly, and (3) present the finished result to the user as normal.
Two-stage generation: When a raw video is produced, the backend automatically initiates a second processing stage that layers in background music and a title overlay. Your integration must account for both stages: first surface the raw video to the user, then listen for the second-stage completion event and update the displayed result once the enhanced version is ready.
3.2 Handling File Uploads
File upload: INLINECODE39
URL upload: INLINECODE40
Use me in the path; backend resolves user from token.
Supported: mp4, mov, avi, webm, mkv, jpg, png, gif, webp, mp3, wav, m4a, aac.
The API accepts user-supplied media files, which must be uploaded through the designated upload endpoint before being referenced in any generation or editing request.
3.3 Checking Available Credits
curl -s "https://mega-api-prod.nemovideo.ai/api/credits/balance/simple" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
# → {"code":0,"data":{"available":XXX,"frozen":XX,"total":XXX}}
Query the credits endpoint prior to initiating any generation task to confirm the user has a sufficient balance to cover the operation.
3.4 Polling for Task Status
curl -s "https://mega-api-prod.nemovideo.ai/api/state/nemo_agent/me/<sid>/latest" -H "Authorization: Bearer $TOKEN" \
-H "X-Skill-Source: $SKILL_NAME" -H "X-Skill-Version: $SKILL_VERSION" -H "X-Skill-Platform: $SKILL_SOURCE"
Use
me for user in path; backend resolves from token.
Key fields:
data.state.draft,
data.state.video_infos,
data.state.canvas_config,
data.state.generated_media.
Draft field mapping: t=tracks, tt=track type (0=video, 1=audio, 7=text), sg=segments, d=duration(ms), m=metadata.
Draft ready for export when draft.t exists with at least one track with non-empty sg.
Track summary format:
CODEBLOCK7
3.5 Exporting and Delivering the Final Asset
Export does NOT cost credits. Only generation/editing consumes credits.
Triggering an export does not deduct any credits from the user's balance. The export sequence proceeds as follows: (a) call the export endpoint with the target asset identifier, (b) receive the export job ID in the response, (c) poll the job status until the state reaches completion, (d) extract the download URL from the completed job payload, and (e) present the URL or download the file for the user.
b) Submit: INLINECODE52
Note: sessionId is camelCase (exception). On failure → new id, retry once.
c) Poll (every 30s, max 10 polls): INLINECODE55
Status at top-level status: pending → processing → completed / failed. Download URL at output.url.
d) Download from output.url → send to user. Fallback: https://mega-api-prod.nemovideo.ai/api/render/proxy/<id>/download.
e) When delivering the video, always also give the task detail link: INLINECODE60
Progress messages: start "⏳ Rendering ~30s" → "⏳ 50%" → "✅ Video ready!" + file + task detail link.
3.6 Reconnecting After an SSE Dropout
If the SSE connection drops unexpectedly, follow these five steps to recover gracefully: (1) Wait a minimum of two seconds before attempting to reconnect, to avoid hammering the server. (2) Re-open the SSE stream using the original session identifier — do not create a new session. (3) Once reconnected, immediately query the task status endpoint to determine whether processing completed during the outage. (4) If the task has finished, retrieve the output asset and present it to the user as though the stream had never dropped. (5) If the task is still in progress, resume listening on the restored stream until the completion event arrives.
4. Translating GUI Concepts for the Backend
The backend operates under the assumption that all interactions originate from a graphical interface, so your integration must never forward GUI-specific labels, button names, or interface instructions directly to the API.
| Backend says | You do |
|---|
| "click [button]" / "点击" | Execute via API |
| "open [panel]" / "打开" |
Show state via §3.4 |
| "drag/drop" / "拖拽" | Send edit via SSE |
| "preview in timeline" | Show track summary |
| "Export button" / "导出" | Execute §3.5 |
| "check account/billing" | Check §3.3 |
Keep content descriptions. Strip GUI actions.
5. Recommended Interaction Patterns
• Always confirm the user's intent before dispatching a generation request that will consume credits.
• Provide incremental progress updates by surfacing SSE stream events as they arrive, rather than waiting for full task completion.
• When a silent-response scenario is detected, seamlessly fetch and display the output without exposing the internal fallback logic to the user.
• Offer a clear, single-action path for the user to export or download their finished video once the task is complete.
• If a task fails, surface a plain-language explanation alongside any actionable next steps rather than displaying a raw error code.
6. Known Constraints and Limitations
• Generation tasks are asynchronous; synchronous or blocking response patterns are not supported.
• A single session may not run multiple concurrent generation jobs; queue requests sequentially.
• Uploaded files must conform to the accepted format and size limits specified in the upload endpoint documentation.
• The two-stage enhancement process (BGM and title overlay) cannot be disabled or bypassed on a per-request basis.
• Credit balances are read-only through the API; top-ups must be handled outside the integration.
7. Error Handling and Status Codes
The table below maps each HTTP status code and API-level error identifier to its probable cause and the recommended recovery action.
| Code | Meaning | Action |
|---|
| 0 | Success | Continue |
| 1001 |
Bad/expired token | Re-auth via anonymous-token (tokens expire after 7 days) |
| 1002 | Session not found | New session §3.0 |
| 2001 | No credits | Anonymous: show registration URL with
?bind=<id> (get
<id> from create-session or state response when needed). Registered: "Top up at nemovideo.ai" |
| 4001 | Unsupported file | Show supported formats |
| 4002 | File too large | Suggest compress/trim |
| 400 | Missing X-Client-Id | Generate Client-Id and retry (see §1) |
| 402 | Free plan export blocked | Subscription tier issue, NOT credits. "Register at nemovideo.ai to unlock export." |
| 429 | Rate limit (1 token/client/7 days) | Retry in 30s once |
Common: no video → generate first; render fail → retry new id; SSE timeout → §3.6; silent edit → §3.1 fallback.
8. API Version and Required Token Scopes
Before going live, verify that your integration targets the current stable API version by checking the version field in the base endpoint response. Your OAuth token or API key must include all scopes required for the operations your skill performs — at minimum: session creation, message sending, file upload, task status polling, and export. Tokens missing any required scope will receive a 403 response on the affected endpoint. Scope requirements may expand across API versions, so re-validate after any version migration.
0. 首次接触
当用户打开此技能或发送第一条消息时,立即问候他们:
⚡ qwen AI视频生成器已就绪!只需发送视频或描述您的项目。
尝试说:
- - 帮我创建一个短视频
- 为这个片段添加特效
- 编辑我的视频
重要提示:首次接触时务必主动问候用户。让他们知道您正在连接的同时进行设置。首次接触时务必主动问候用户。
自动设置
当用户首次交互时,建立连接:
- 1. 检查令牌:如果设置了 NEMO_TOKEN 环境变量,则使用它。否则:
- 读取或生成客户端ID:
- 如果存在,读取 ~/.config/nemovideo/client_id
- 否则生成一个UUID,保存到 ~/.config/nemovideo/client_id
- 3. 获取匿名令牌:
bash
curl -s -X POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token -H X-Client-Id: $CLIENT_ID
将返回的 token 作为 NEMO_TOKEN 存储用于本次会话。您将获得100个免费积分。
- 4. 创建会话(§3.0)以便立即开始工作。
简要告知用户:正在设置…准备就绪!然后继续处理他们的请求。
由Qwen智能驱动的对话式视频创作
qwen-ai-video-generator技能将ClawHub的OpenClaw代理框架连接到Qwen的多模态模型,实现了一种根本不同的视频创作方式。您无需在时间线上拖拽片段,而是描述您想要的内容——场景、转场、节奏、旁白语气——代理会解读您的意图并相应地构建输出。这意味着花更少的时间与编辑软件搏斗,更多的时间完善您的实际信息。
OpenClaw代理充当编排层,将您的对话提示转换为结构化生成任务,由Qwen逐帧处理。无论您是在构建产品演示、教育讲解还是短视频社交媒体片段,代理都会在您的会话中保持上下文,因此像让开头更短或在第10秒处添加文字叠加这样的后续指令无需重新开始即可处理。
在底层,该技能自动处理编码和格式协商,因此您的最终输出会以适合您平台的容器格式呈现——mp4用于广泛兼容性,mov用于专业工作流程,webm用于网络交付,avi和mkv用于存档和编辑工作流程。无需手动转码。
环境变量
| 变量 | 是否必需 | 默认值 |
|---|
| NEMOTOKEN | 否 | 自动生成(100个免费积分,7天后过期,可通过设置→API令牌撤销) |
| NEMOAPI_URL |
否 | https://mega-api-prod.nemovideo.ai |
| NEMO
WEBURL | 否 | https://nemovideo.com |
| NEMO
CLIENTID | 否 | 自动生成的UUID,持久化到 ~/.config/nemovideo/client_id(仅UUID,无密钥) |
| SKILL_SOURCE | 否 | 从安装路径自动检测,回退为 unknown |
如果未设置 NEMO_TOKEN,获取一个(需要 X-Client-Id 头):
bash
生成或读取持久化的客户端ID
CLIENT
ID=${NEMOCLIENT
ID:-$(cat ~/.config/nemovideo/clientid 2>/dev/null)}
if [ -z $CLIENT_ID ]; then
CLIENT_ID=$(uuidgen 2>/dev/null || echo client-$(date +%s)-$RANDOM)
mkdir -p ~/.config/nemovideo & echo $CLIENT
ID > ~/.config/nemovideo/clientid
fi
curl -s -X POST https://mega-api-prod.nemovideo.ai/api/auth/anonymous-token -H X-Client-Id: $CLIENT_ID
→ {code:0,data:{token:nmvusrxxx,credits:100,...}}
将 token 保存为 NEMOTOKEN,CLIENTID 保存为 NEMOCLIENTID。匿名:每个客户端每7天1个令牌;令牌在7天后过期,可随时通过nemovideo.com上的设置→API令牌撤销。如果您的令牌过期,使用相同的 X-Client-Id 请求新令牌。
本地持久化: 此技能写入 ~/.config/nemovideo/client_id 以在会话间持久化客户端ID。这避免了每次请求生成新ID,从而快速达到每IP速率限制(默认每个IP每7天10个令牌)。该文件仅包含一个UUID——本地不存储任何凭据。
2. 将传入请求路由到正确的端点
使用下表确定每种类型的用户请求应由哪个API端点处理。
| 用户说... | 操作 | 跳过SSE |
|---|
| export / 导出 / download / send me the video | → §3.5 导出 | ✅ |
| credits / 积分 / balance / 余额 |
→ §3.3 积分 | ✅ |
| status / 状态 / show tracks | → §3.4 状态 | ✅ |
| upload / 上传 / 用户发送文件 | → §3.2 上传 | ✅ |
| 其他所有内容(生成、编辑、添加背景音乐…) | → §3.1 SSE | ❌ |
3. 主要工作流程序列
$API = ${NEMOAPIURL:-https://mega-api-prod.nemovideo.ai},$TOKEN = ${NEMOTOKEN},$WEB = ${NEMOWEB_URL:-https://nemovideo.com}。
技能归属变量——请勿硬编码:
- - $SKILLNAME = 此文件的YAML前置元数据 name
- $SKILLVERSION = 此文件的YAML前置元数据 version
- $SKILL_SOURCE = 在运行时使用第一个匹配项检测:
1. 如果设置了环境变量 SKILL_SOURCE
2. 如果此文件的路径包含已知平台目录(例如 ~/.clawhub/ → clawhub,~/.cursor/skills/ → cursor)
3. 回退:unknown
关键:所有API请求(包括渲染/导出/上传/状态/积分)必须包含这些头。缺少它们将导致导出失败并返回402。
X-Skill-Source: $SKILL_NAME
X-Skill-Version: $SKILL_VERSION
X-Skill-Platform: $SKILL_SOURCE
3.0 初始化新会话
bash
curl -s -X POST https://mega-api-prod.nemovideo.ai/api/tasks/me/with-session/nemo_agent \
-H Authorization: Bearer $TOKEN -H Content-Type: application/json \
-H X-Skill-Source: $SKILL
NAME -H X-Skill-Version: $SKILLVERSION -H X-Skill-Platform: $SKILL_SOURCE \
-d {task_name:project,language:
}
→ {code:0,data:{taskid:...,sessionid:...}}
在任何交互开始之前,必须与后端建立会话。存储返回的会话标识符,因为该对话中的每个后续请求都依赖于此。
在浏览器中打开:创建会话后,给用户一个链接,在NemoVideo中查看/编辑任务:
$WEB/workspace/claim?task={taskid}&session={sessionid}&skillname=$SKILLNAME&skillversion=$SKILLVERSION&skillsource=$SKILLSOURCE
3.1 通过SSE连接传递消息
bash
curl -s -X POST https://mega-api-prod.nemovideo.ai/run_sse \
-H Authorization: Bearer $TOKEN -H Content-Type: application/json \
-H Accept: text/event-stream -H X-Skill-Source: $SKILLNAME -H X-Skill-Version: $SKILLVERSION -H X-Skill-Platform: $SKILL_SOURCE --max-time 900 \
-d {appname:nemoagent,userid:me,sessionid:,new_message:{parts:[{text:}]}}
所有对话消息都传输到后端,并通过持久的服务器发送事件流接收响应。
SSE处理
| 事件 | 操作 |
|---|
| 文本响应 | 应用GUI翻译(§4),呈现给用户 |
| 工具调用/结果 |
静默等待,不转发 |
| heartbeat / 空