Video Generate Skill
This skill generates videos using Doubao Seedance 1.0/1.5 models.
Trigger Conditions
- User wants to generate videos from text descriptions
- User wants to create videos based on images (first/last frame)
- User wants to create videos with reference materials (images, videos, audio)
- User asks for video generation capabilities
Usage
Environment Variables
Before using this skill, ensure the following environment variables are set:
- `ARK_API_KEY` or `MODEL_VIDEO_API_KEY` or `MODEL_AGENT_API_KEY`: API key for the video generation service
- `MODEL_VIDEO_API_BASE`: API base URL (optional, has default)
- `MODEL_VIDEO_NAME`: Model name (optional, has default)
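The key lookup described above can be sketched as follows. This is a minimal illustration, not the script's actual logic: `resolve_api_key` is a hypothetical helper, and the order in which the three variables are checked is an assumption.

```python
import os
from typing import Mapping, Optional

# Variable names come from the list above; the lookup order is an assumption.
API_KEY_VARS = ("ARK_API_KEY", "MODEL_VIDEO_API_KEY", "MODEL_AGENT_API_KEY")


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty API key found (hypothetical helper)."""
    env = os.environ if env is None else env
    for name in API_KEY_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    # Mirrors the error text quoted in the Error Handling section.
    raise PermissionError(
        " or ".join(API_KEY_VARS) + " not found in environment variables"
    )
```

Passing a plain dict instead of `os.environ` makes the helper easy to check without touching the real environment.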
Function Signature
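```python
async def video_generate(
    params: list,
    batch_size: int = 10,
    max_wait_seconds: int = 1200,
    model_name: str = None,
) -> Dict:
```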
Parameters
params (list[dict])
A list of video generation requests. Each item is a dict with the following fields:
Required per item:
- `video_name` (str): Name/identifier of the output video file
- `prompt` (str): Text describing the video to generate. Supports Chinese and English.
Optional per item - Input Materials:
- `first_frame` (str): URL for the first frame image
- `last_frame` (str): URL for the last frame image
- `reference_images` (list[str]): 1-4 reference image URLs for style/content guidance
- `reference_videos` (list[str]): 0-3 reference video URLs (mp4/mov, 2-15s each, total ≤15s)
- `reference_audios` (list[str]): 0-3 reference audio URLs (mp3/wav, 2-15s each, total ≤15s)
Optional per item - Video Output Parameters:
- `ratio` (str): Aspect ratio. Options: "16:9" (default), "9:16", "4:3", "3:4", "1:1", "2:1", "21:9", "adaptive"
- `duration` (int): Video length in seconds. Range: 2-12s depending on model
- `resolution` (str): Video resolution. Options: "480p", "720p", "1080p"
- `frames` (int): Total frame count. Must be in [29, 289] and follow the format 25 + 4n
- `camera_fixed` (bool): Lock camera movement. Default: false
- `seed` (int): Random seed for reproducibility. Range: [-1, 2^32-1]
- `watermark` (bool): Whether to add watermark. Default: false
- `generate_audio` (bool): Whether to generate audio. Only Seedance 1.5 supports this
- `tools` (list[dict]): Tool configuration, e.g., `[{"type": "web_search"}]`
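The constraints listed above can be checked before a request is submitted. The sketch below is illustrative only: `validate_item` is a hypothetical helper, it covers a subset of the fields, and the real script may validate differently or not at all.

```python
VALID_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "2:1", "21:9", "adaptive"}
VALID_RESOLUTIONS = {"480p", "720p", "1080p"}


def validate_item(item: dict) -> list:
    """Return a list of problems found in one request dict (hypothetical helper)."""
    problems = []
    for key in ("video_name", "prompt"):
        if not item.get(key):
            problems.append(f"missing required field: {key}")
    if "ratio" in item and item["ratio"] not in VALID_RATIOS:
        problems.append(f"unsupported ratio: {item['ratio']}")
    if "resolution" in item and item["resolution"] not in VALID_RESOLUTIONS:
        problems.append(f"unsupported resolution: {item['resolution']}")
    if "frames" in item:
        frames = item["frames"]
        # 25 + 4n inside [29, 289] means n runs from 1 to 66.
        if not (29 <= frames <= 289 and (frames - 25) % 4 == 0):
            problems.append(f"invalid frames: {frames}")
    if "reference_images" in item and not 1 <= len(item["reference_images"]) <= 4:
        problems.append("reference_images must contain 1-4 URLs")
    if "seed" in item and not -1 <= item["seed"] <= 2**32 - 1:
        problems.append(f"seed out of range: {item['seed']}")
    return problems
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in a batch at once.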
Input Modes
- Text-to-Video: Provide only `prompt`, with no images or videos
- First Frame Guidance: Provide `first_frame` as the starting image
- First + Last Frame Guidance: Provide both `first_frame` and `last_frame` to generate a transition video
- Reference Images: Provide `reference_images` for style/content guidance
- Multimodal Reference: Combine `reference_images`, `reference_videos`, and `reference_audios`
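One way to see how the five modes relate is to derive the mode from the fields present in a request. The sketch below is an assumption-laden illustration: `detect_input_mode` and its mode labels are hypothetical, and the precedence (references before frames) is a guess rather than documented behaviour. The audio-only check follows the Notes section.

```python
def detect_input_mode(item: dict) -> str:
    """Classify one request dict into an input mode (hypothetical helper)."""
    images = item.get("reference_images")
    videos = item.get("reference_videos")
    audios = item.get("reference_audios")
    if videos or audios:
        # The Notes state that multimodal input needs at least one image or video.
        if not (images or videos):
            raise ValueError("audio-only references are not supported")
        return "multimodal_reference"
    if images:
        return "reference_images"
    if item.get("first_frame") and item.get("last_frame"):
        return "first_last_frame"
    if item.get("first_frame"):
        return "first_frame"
    if item.get("last_frame"):
        # Assumption: a last frame without a first frame is not a listed mode.
        raise ValueError("last_frame requires first_frame")
    return "text_to_video"
```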
Return Value
Script Return Info
The video_generate.py script returns the following information:
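```python
{
    "status": "success | partial_success | error",
    "success_list": [{"video_name": "video_url"}],
    "error_list": ["video_name"],
    "error_details": [{"video_name": "...", "error": {...}}],
    "pending_list": [{"video_name": "...", "task_id": "cgt-xxx", ...}]
}
```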
Based on the script return info, the final response returned to the user consists of a description of the video generation task and the video URL(s). You may download the video from the URL, but the video URL should still be provided to the user for viewing and downloading.
Note: the URL is the `url` value in the `success_list` of the script return info.
The URL must be returned in two ways, as listed under Final Return Info below.
Final Return Info
Return the following information:
1. File: return both the file itself (if you have a way to send the video file) and its local path, for example:
/root/.openclaw/workspace/skills/video-generate/xxx.mp4
2. URL list: after generation, present the list of video URLs in Markdown format, for example:
CODEBLOCK2
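Turning the script return info into a Markdown list can be sketched as below. This assumes each `success_list` entry maps a video name to its URL; `format_video_links` is a hypothetical helper, and the exact Markdown layout is an illustration rather than a required format.

```python
def format_video_links(result: dict) -> str:
    """Render successful videos as a Markdown bullet list (hypothetical helper)."""
    lines = []
    for entry in result.get("success_list", []):
        # Assumption: each entry is a {video_name: video_url} mapping.
        for name, url in entry.items():
            lines.append(f"- [{name}]({url})")
    return "\n".join(lines)


example = {
    "status": "success",
    "success_list": [{"cat_park": "https://example.com/cat_park.mp4"}],
}
print(format_video_links(example))
```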
Code Implementation
See scripts/video_generate.py for the full implementation.
Example Usage
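```bash
# Text-to-video
python scripts/video_generate.py -p "A kitten rides a skateboard through the park" -n cat_park -r 16:9 -d 5 --resolution 720p

# First frame guidance
python scripts/video_generate.py -p "The kitten jumps up" -n cat_jump -f https://example.com/cat.png -r adaptive -d 5

# First + last frame guidance
python scripts/video_generate.py -p "Smooth transition animation" -n transition \
  -f https://example.com/start.png \
  -l https://example.com/end.png \
  -d 6

# Reference images (style/content guidance)
python scripts/video_generate.py -p "The boy wearing glasses from [图1] and the corgi from [图2] sit on the lawn" -n styled \
  --ref-images https://example.com/boy.png https://example.com/dog.png \
  -r 16:9 -d 5

# Multimodal reference (video + audio)
python scripts/video_generate.py -p "Replace the person in the video with the boy from [图1]" -n multimodal \
  --ref-images https://example.com/boy.png \
  --ref-videos https://example.com/source.mp4 \
  --ref-audios https://example.com/voice.wav \
  -d 5

# Generate audio (Seedance 1.5 only)
python scripts/video_generate.py -p "A girl holds a fox; wind and rustling leaves can be heard" -n with_audio \
  -f https://example.com/girl_fox.png \
  --generate-audio \
  -m doubao-seedance-1-5-pro-251215 \
  -d 6 --resolution 1080p

# Query task status
python scripts/video_generate.py -q cgt-20260222165751-wsnw8

# Use a specific model
python scripts/video_generate.py -p "A futuristic city" -m doubao-seedance-1-5-pro-251215

# No watermark
python scripts/video_generate.py -p "A beautiful landscape" --no-watermark
```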
Command Line Options
| Option | Short | Description |
|---|---|---|
| `--prompt` | `-p` | Text description of the video (required) |
| `--name` | `-n` | Video name identifier (default: video) |
| `--model` | `-m` | Model name (default: doubao-seedance-1-0-pro-250528) |
| `--ratio` | `-r` | Aspect ratio (default: 16:9) |
| `--duration` | `-d` | Video duration in seconds (2-12) |
| `--resolution` | | Video resolution: 480p, 720p, 1080p |
| `--first-frame` | `-f` | First frame image URL |
| `--last-frame` | `-l` | Last frame image URL |
| `--ref-images` | | Reference image URLs (space-separated, 1-4 images) |
| `--ref-videos` | | Reference video URLs (space-separated, 0-3 videos) |
| `--ref-audios` | | Reference audio URLs (space-separated, 0-3 audios) |
| `--generate-audio` | | Generate audio (Seedance 1.5 only) |
| `--seed` | | Random seed for reproducibility |
| `--no-watermark` | | Disable watermark |
| `--timeout` | `-t` | Max wait time in seconds (default: 1200) |
| `--query-task` | `-q` | Query task status by task_id |
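When the script is driven from another program, the options above can be assembled from a request dict. The sketch below is a rough illustration: `build_cli_args` is a hypothetical helper, and the mapping from request fields to flags is an assumption inferred from the matching names, not something the script documents.

```python
import shlex

# Assumed mapping from request fields to the single-value flags in the table.
FLAG_FOR_FIELD = {
    "prompt": "--prompt",
    "video_name": "--name",
    "ratio": "--ratio",
    "duration": "--duration",
    "resolution": "--resolution",
    "first_frame": "--first-frame",
    "last_frame": "--last-frame",
    "seed": "--seed",
}
# Flags that accept several space-separated URLs.
LIST_FLAG_FOR_FIELD = {
    "reference_images": "--ref-images",
    "reference_videos": "--ref-videos",
    "reference_audios": "--ref-audios",
}


def build_cli_args(item: dict, script: str = "scripts/video_generate.py") -> list:
    """Build an argument list for the script (hypothetical helper)."""
    args = ["python", script]
    for field, flag in FLAG_FOR_FIELD.items():
        if field in item:
            args += [flag, str(item[field])]
    for field, flag in LIST_FLAG_FOR_FIELD.items():
        if item.get(field):
            args += [flag, *item[field]]
    if item.get("generate_audio"):
        args.append("--generate-audio")
    return args


args = build_cli_args({"prompt": "A cat", "video_name": "cat", "duration": 5})
print(shlex.join(args))  # shell-quoted form, suitable for logging
```

Keeping the arguments as a list avoids shell-quoting problems when the list is passed to `subprocess.run`.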
Model Fallback
If you encounter a model-related error (like ModelNotOpen), you can downgrade to these models:
- INLINECODE48
- INLINECODE49
Error Handling
- If the script raises the error "PermissionError: ARK_API_KEY or MODEL_VIDEO_API_KEY or MODEL_AGENT_API_KEY not found in environment variables", inform the user that they need to provide the ARK_API_KEY, MODEL_VIDEO_API_KEY, or MODEL_AGENT_API_KEY environment variable. Write the provided value to the environment variable file in the workspace; if the file already exists, append the entry to the end. Ensure the environment variable format is correct, make the variable take effect, and retry the video generation task that just failed.
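The append-and-activate step can be sketched as follows. The source does not name the environment variable file or its format, so the `KEY=value` line format and the `append_env_var` helper are assumptions; setting `os.environ` only affects the current Python process.

```python
import os
from pathlib import Path


def append_env_var(env_file: str, name: str, value: str) -> None:
    """Append NAME=value to an env file and export it (hypothetical helper)."""
    path = Path(env_file)
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    # Start on a fresh line if the file does not already end with a newline.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(f"{prefix}{name}={value}\n")
    # Make the variable effective for the current process before retrying.
    os.environ[name] = value
```

After this call, the failed generation task can be retried in the same process.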
Notes
- Keep the prompt concise (recommended ≤ 500 characters)
- For first/last frame images, ensure their aspect ratio matches the chosen ratio
- Reference images: 1-4 images, formats: jpeg/png/webp/bmp/tiff/gif
- Reference videos: 0-3 videos, formats: mp4/mov, total duration ≤ 15s
- Reference audios: 0-3 audios, formats: mp3/wav, total duration ≤ 15s
- Multimodal requires at least one image or video (audio-only not supported)
- Audio generation is only supported by Seedance 1.5 pro
- If polling times out, use `--query-task` with the returned task_id
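Video Generate Skill
This skill generates videos using Doubao Seedance 1.0/1.5 models.
Trigger Conditions
- User wants to generate videos from text descriptions
- User wants to create videos based on images (first/last frame)
- User wants to create videos with reference materials (images, videos, audio)
- User asks about video generation capabilities
Usage
Environment Variables
Before using this skill, ensure the following environment variables are set:
- `ARK_API_KEY` or `MODEL_VIDEO_API_KEY` or `MODEL_AGENT_API_KEY`: API key for the video generation service
- `MODEL_VIDEO_API_BASE`: API base URL (optional, has default)
- `MODEL_VIDEO_NAME`: Model name (optional, has default)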
Function Signature
```python
async def video_generate(
    params: list,
    batch_size: int = 10,
    max_wait_seconds: int = 1200,
    model_name: str = None,
) -> Dict:
```
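Parameters
params (list[dict])
A list of video generation requests. Each item is a dict with the following fields:
Required per item:
- `video_name` (str): Name/identifier of the output video file
- `prompt` (str): Text describing the video to generate. Supports Chinese and English.
Optional per item - Input Materials:
- `first_frame` (str): URL for the first frame image
- `last_frame` (str): URL for the last frame image
- `reference_images` (list[str]): 1-4 reference image URLs for style/content guidance
- `reference_videos` (list[str]): 0-3 reference video URLs (mp4/mov, 2-15s each, total ≤15s)
- `reference_audios` (list[str]): 0-3 reference audio URLs (mp3/wav, 2-15s each, total ≤15s)
Optional per item - Video Output Parameters:
- `ratio` (str): Aspect ratio. Options: "16:9" (default), "9:16", "4:3", "3:4", "1:1", "2:1", "21:9", "adaptive"
- `duration` (int): Video length in seconds. Range: 2-12s depending on model
- `resolution` (str): Video resolution. Options: "480p", "720p", "1080p"
- `frames` (int): Total frame count. Must be in [29, 289] and follow the format 25 + 4n
- `camera_fixed` (bool): Lock camera movement. Default: false
- `seed` (int): Random seed for reproducibility. Range: [-1, 2^32-1]
- `watermark` (bool): Whether to add watermark. Default: false
- `generate_audio` (bool): Whether to generate audio. Only Seedance 1.5 supports this
- `tools` (list[dict]): Tool configuration, e.g., `[{"type": "web_search"}]`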
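Input Modes
- Text-to-Video: Provide only `prompt`, with no images or videos
- First Frame Guidance: Provide `first_frame` as the starting image
- First + Last Frame Guidance: Provide both the first and last frames to generate a transition video
- Reference Images: Provide `reference_images` for style/content guidance
- Multimodal Reference: Combine `reference_images`, `reference_videos`, and `reference_audios`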
Return Value
Script Return Info
The video_generate.py script returns the following information:
```python
{
    "status": "success | partial_success | error",
    "success_list": [{"video_name": "video_url"}],
    "error_list": ["video_name"],
    "error_details": [{"video_name": "...", "error": {...}}],
    "pending_list": [{"video_name": "...", "task_id": "cgt-xxx", ...}]
}
```
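Based on the script return info, the final response returned to the user consists of a description of the video generation task and the video URL(s). You may download the video from the URL, but the video URL should still be provided to the user for viewing and downloading.
Note: the URL is the `url` value in the `success_list` of the script return info.
The URL must be returned in two ways, as listed under Final Return Info below.
Final Return Info
Return the following information:
1. File: return both the file itself (if you have a way to send the video file) and its local path, for example:
/root/.openclaw/workspace/skills/video-generate/xxx.mp4
2. URL list: after generation, present the list of video URLs in Markdown format, for example: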
Code Implementation
See scripts/video_generate.py for the full implementation.
Example Usage
```bash
# Text-to-video
python scripts/video_generate.py -p "A kitten rides a skateboard through the park" -n cat_park -r 16:9 -d 5 --resolution 720p

# First frame guidance
python scripts/video_generate.py -p "The kitten jumps up" -n cat_jump -f https://example.com/cat.png -r adaptive -d 5

# First + last frame guidance
python scripts/video_generate.py -p "Smooth transition animation" -n transition \
  -f https://example.com/start.png \
  -l https://example.com/end.png \
  -d 6

# Reference images (style/content guidance)
python scripts/video_generate.py -p "The boy wearing glasses from [图1] and the corgi from [图2] sit on the lawn" -n styled \
  --ref-images https://example.com/boy.png https://example.com/dog.png \
  -r 16:9 -d 5

# Multimodal reference (video + audio)
python scripts/video_generate.py -p "Replace the person in the video with the boy from [图1]" -n multimodal \
  --ref-images https://example.com/boy.png \
  --ref-videos https://example.com/source.mp4 \
  --ref-audios https://example.com/voice.wav \
  -d 5

# Generate audio (Seedance 1.5 only)
python scripts/video_generate.py -p "A girl holds a fox; wind and rustling leaves can be heard" -n with_audio \
  -f https://example.com/girl_fox.png \
  --generate-audio \
  -m doubao-seedance-1-5-pro-251215 \
  -d 6 --resolution 1080p

# Query task status
python scripts/video_generate.py -q cgt-20260222165751-wsnw8

# Use a specific model
python scripts/video_generate.py -p "A futuristic city" -m doubao-seedance-1-5-pro-251215

# No watermark
python scripts/video_generate.py -p "A beautiful landscape" --no-watermark
```
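Command Line Options
| Option | Short | Description |
|---|---|---|
| `--prompt` | `-p` | Text description of the video (required) |
| `--name` | `-n` | Video name identifier (default: video) |
| `--model` | `-m` | Model name (default: doubao-seedance-1-0-pro-250528) |
| `--ratio` | `-r` | Aspect ratio (default: 16:9) |
| `--duration` | `-d` | Video duration in seconds (2-12) |
| `--resolution` | | Video resolution: 480p, 720p, 1080p |
| `--first-frame` | `-f` | First frame image URL |
| `--last-frame` | `-l` | Last frame image URL |
| `--ref-images` | | Reference image URLs (space-separated, 1-4 images) |
| `--ref-videos` | | Reference video URLs (space-separated, 0-3 videos) |
| `--ref-audios` | | Reference audio URLs (space-separated, 0-3 audios) |
| `--generate-audio` | | Generate audio (Seedance 1.5 only) |
| `--seed` | | Random seed for reproducibility |
| `--no-watermark` | | Disable the video watermark |
| `--timeout` | `-t` | Max wait time in seconds (default: 1200) |
| `--query-task` | `-q` | Query task status by task_id |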
Model Fallback
If you encounter a model-related error (such as ModelNotOpen), you can downgrade