Category: provider
Model Studio EMO
Validation
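```bash
mkdir -p output/aliyun-emo
python -m py_compile skills/ai/video/aliyun-emo/scripts/prepare_emo_request.py && echo py_compile_ok > output/aliyun-emo/validate.txt
```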
Pass criteria: command exits 0 and output/aliyun-emo/validate.txt is generated.
Output And Evidence
- Save normalized request payloads, detection boxes, and task polling snapshots under output/aliyun-emo/.
- Record the chosen style_level and the exact face_bbox / ext_bbox.
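A minimal sketch of how this evidence might be captured, assuming each artifact is stored as one JSON file. The helper name save_snapshot and the file-naming scheme are hypothetical and are not part of the skill's scripts.

```python
import json
import time
from pathlib import Path


def save_snapshot(kind: str, payload: dict, out_dir: str = "output/aliyun-emo") -> Path:
    """Write one evidence artifact (request, detection result, or polling snapshot) as JSON."""
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <kind>-<nanosecond timestamp>.json keeps snapshots ordered.
    path = target_dir / f"{kind}-{time.time_ns()}.json"
    path.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

Calling save_snapshot("detect", {...}) after detection and save_snapshot("poll", {...}) after each status check would leave an ordered trail of files that includes the style_level and both bounding boxes.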
Use EMO when the input is a portrait image and speech audio, and you need a non-Wan expressive talking-head result.
Critical model names
Use these exact model strings:
- emo-v1-detect
- emo-v1
Selection guidance:
- Run image detection first to obtain face_bbox and ext_bbox.
- Use emo-v1 only after detection succeeds.
Prerequisites
- Available in the China mainland (Beijing) region only.
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
- Input files must be public HTTP/HTTPS URLs.
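The prerequisites can be checked before any request is prepared. The sketch below is illustrative only: the function names are hypothetical, it reads DASHSCOPE_API_KEY from the environment alone (the layout of ~/.alibabacloud/credentials is not described here, so that file is not parsed), and it can verify only that a URL uses HTTP or HTTPS, not that it is publicly reachable.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return DASHSCOPE_API_KEY from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; export it or add dashscope_api_key "
            "to ~/.alibabacloud/credentials"
        )
    return key


def require_http_url(url: str) -> str:
    """Reject anything that is not an http(s) URL with a host (local paths, file://, etc.)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"input must be a public HTTP/HTTPS URL, got: {url!r}")
    return url
```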
Normalized interface (video.emo)
Detect Request
- model (string, optional): default emo-v1-detect
- image_url (string, required)
Generate Request
- model (string, optional): default emo-v1
- image_url (string, required)
- audio_url (string, required)
- face_bbox (array of integers, required)
- ext_bbox (array of integers, required)
- style_level (string, optional): normal, calm, or active
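As an illustration of the generate request shape, the sketch below assembles a payload with the fields model, image_url, audio_url, face_bbox, ext_bbox, and style_level. The function name build_generate_request is hypothetical, and the check that each box holds exactly four integers is an assumption drawn from the Quick start values rather than a documented rule.

```python
from typing import Optional, Sequence

ALLOWED_STYLE_LEVELS = ("normal", "calm", "active")


def build_generate_request(
    image_url: str,
    audio_url: str,
    face_bbox: Sequence[int],
    ext_bbox: Sequence[int],
    style_level: Optional[str] = None,
    model: str = "emo-v1",
) -> dict:
    """Assemble a normalized video.emo generate request as a plain dict."""
    for name, box in (("face_bbox", face_bbox), ("ext_bbox", ext_bbox)):
        # Assumption: each box is four integers, as in the Quick start example.
        if len(box) != 4 or not all(isinstance(v, int) for v in box):
            raise ValueError(f"{name} must be four integers from the detection output")
    request = {
        "model": model,
        "image_url": image_url,
        "audio_url": audio_url,
        "face_bbox": list(face_bbox),
        "ext_bbox": list(ext_bbox),
    }
    if style_level is not None:
        if style_level not in ALLOWED_STYLE_LEVELS:
            raise ValueError(f"style_level must be one of {ALLOWED_STYLE_LEVELS}")
        request["style_level"] = style_level
    return request
```

The optional style_level is left out of the payload when it is not supplied, since no default value is stated for it.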
Response
- task_id (string)
- task_status (string)
- video_url (string, when finished)
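Because video_url appears only once the task has finished, a caller typically polls on task_id until task_status reaches a terminal value. The loop below is a sketch under stated assumptions: fetch_status is a hypothetical callable supplied by the caller (no HTTP endpoint is assumed), and the status strings SUCCEEDED, FAILED, CANCELED, and UNKNOWN are placeholders to be checked against the actual API responses.

```python
import time
from typing import Callable

# Assumption: illustrative terminal statuses; confirm against real task_status values.
TERMINAL_OK = {"SUCCEEDED"}
TERMINAL_FAIL = {"FAILED", "CANCELED", "UNKNOWN"}


def poll_task(
    task_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task finishes; return the final response containing video_url."""
    for _ in range(max_polls):
        snapshot = fetch_status(task_id)  # expected keys: task_id, task_status, video_url
        status = snapshot.get("task_status")
        if status in TERMINAL_OK:
            if not snapshot.get("video_url"):
                raise RuntimeError(f"task {task_id} finished without a video_url")
            return snapshot
        if status in TERMINAL_FAIL:
            raise RuntimeError(f"task {task_id} ended with status {status}")
        sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Passing sleep as a parameter keeps the loop testable without real delays.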
Quick start
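```bash
python skills/ai/video/aliyun-emo/scripts/prepare_emo_request.py \
  --image-url https://example.com/portrait.png \
  --audio-url https://example.com/speech.mp3 \
  --face-bbox 302,286,610,593 \
  --ext-bbox 71,9,840,778 \
  --style-level active
```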
Operational guidance
- Do not invent face_bbox or ext_bbox; use the detection API output.
- The ext_bbox aspect ratio determines the output format: 1:1 yields 512x512, 3:4 yields 512x704.
- Keep the input portrait clear and front-facing for better expression quality.
- EMO is portrait-focused; for full-scene human videos use other skills instead.
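The relationship between the ext_bbox aspect ratio and the output size can be sketched as a small lookup. This assumes ext_bbox is laid out as [x1, y1, x2, y2] in pixels, which matches the Quick start values (71,9,840,778 spans 769 by 769) but is not stated explicitly; the function name and the tolerance are hypothetical.

```python
from typing import Sequence, Tuple


def expected_resolution(ext_bbox: Sequence[int], tolerance: float = 0.02) -> Tuple[int, int]:
    """Map the ext_bbox width:height ratio to the documented output size."""
    # Assumption: ext_bbox is [x1, y1, x2, y2] in pixels.
    x1, y1, x2, y2 = ext_bbox
    width, height = x2 - x1, y2 - y1
    if width <= 0 or height <= 0:
        raise ValueError("ext_bbox must have positive width and height")
    ratio = width / height
    if abs(ratio - 1.0) <= tolerance:
        return (512, 512)  # 1:1
    if abs(ratio - 0.75) <= tolerance:
        return (512, 704)  # 3:4
    raise ValueError(f"ext_bbox ratio {ratio:.3f} is neither 1:1 nor 3:4")
```

A check like this is meant only to predict the output size from the box the detection API returned, never to adjust or fabricate that box.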
Output location
- Default output: output/aliyun-emo/request.json
- Override base dir with OUTPUT_DIR.
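One plausible reading of the override, shown as a sketch: OUTPUT_DIR replaces the leading output directory while the aliyun-emo subfolder and the request.json file name stay fixed. That interpretation and the function name request_output_path are assumptions; the script itself may resolve the path differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def request_output_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve where request.json would be written, honoring OUTPUT_DIR if set."""
    env = os.environ if env is None else env
    # Assumption: OUTPUT_DIR swaps out only the base "output" directory.
    base = env.get("OUTPUT_DIR") or "output"
    return Path(base) / "aliyun-emo" / "request.json"
```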
Skill name: aliyun-emo
Detailed description:
Category: provider
Model Studio EMO
Validation
```bash
mkdir -p output/aliyun-emo
python -m py_compile skills/ai/video/aliyun-emo/scripts/prepare_emo_request.py && echo py_compile_ok > output/aliyun-emo/validate.txt
```
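Pass criteria: command exits 0 and output/aliyun-emo/validate.txt is generated.
Output And Evidence
- Save normalized request payloads, detection boxes, and task polling snapshots under output/aliyun-emo/.
- Record the chosen style_level and the exact face_bbox / ext_bbox.
Use EMO when the input is a portrait image and speech audio, and you need a non-Wan expressive talking-head result.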
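Critical model names
Use these exact model strings:
- emo-v1-detect
- emo-v1
Selection guidance:
- Run image detection first to obtain face_bbox and ext_bbox.
- Use emo-v1 only after detection succeeds.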
Prerequisites
- Available in the China mainland (Beijing) region only.
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
- Input files must be public HTTP/HTTPS URLs.
Normalized interface (video.emo)
Detect Request
- model (string, optional): default emo-v1-detect
- image_url (string, required)
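Generate Request
- model (string, optional): default emo-v1
- image_url (string, required)
- audio_url (string, required)
- face_bbox (array of integers, required)
- ext_bbox (array of integers, required)
- style_level (string, optional): normal, calm, or active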
Response
- task_id (string)
- task_status (string)
- video_url (string, returned when the task finishes)
Quick start
```bash
python skills/ai/video/aliyun-emo/scripts/prepare_emo_request.py \
  --image-url https://example.com/portrait.png \
  --audio-url https://example.com/speech.mp3 \
  --face-bbox 302,286,610,593 \
  --ext-bbox 71,9,840,778 \
  --style-level active
```
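Operational guidance
- Do not invent face_bbox or ext_bbox; use the detection API output.
- The ext_bbox aspect ratio determines the output format: 1:1 yields 512x512, 3:4 yields 512x704.
- Keep the input portrait clear and front-facing for better expression quality.
- EMO is portrait-focused; for full-scene human videos use other skills instead.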
Output location
- Default output: output/aliyun-emo/request.json
- Override base dir with OUTPUT_DIR.