Category: provider
Model Studio CosyVoice Voice Design
Use the CosyVoice voice enrollment API to create designed voices from a natural-language voice description.
Critical model names
Use model="voice-enrollment" and one of these target_model values:
- cosyvoice-v3.5-plus
- cosyvoice-v3.5-flash
- cosyvoice-v3-plus
- cosyvoice-v3-flash
Recommended default in this repo:
- target_model=cosyvoice-v3.5-plus
Region and compatibility
- cosyvoice-v3.5-plus and cosyvoice-v3.5-flash are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), cosyvoice-v3-plus and cosyvoice-v3-flash do not support voice cloning or voice design.
- The target_model must match the model used later for speech synthesis.
Endpoint
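- Domestic: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
- International: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization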
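The region rules and endpoints above can be combined into one small lookup. The following is a minimal sketch, not the repo's implementation: the region labels "cn" and "intl" and the function name resolve_endpoint are hypothetical, chosen only for illustration.

```python
# Endpoints as listed in this document; the "cn"/"intl" keys are hypothetical labels.
ENDPOINTS = {
    "cn": "https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization",
    "intl": "https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization",
}

# Models the region notes restrict to the China mainland deployment.
MAINLAND_ONLY = {"cosyvoice-v3.5-plus", "cosyvoice-v3.5-flash"}
# Models that lack voice clone/design support in the international deployment.
NO_DESIGN_INTL = {"cosyvoice-v3-plus", "cosyvoice-v3-flash"}


def resolve_endpoint(region: str, target_model: str) -> str:
    """Return the customization endpoint, or raise if the pairing is unsupported."""
    if region not in ENDPOINTS:
        raise ValueError(f"unknown region: {region!r}")
    if region == "intl" and target_model in MAINLAND_ONLY:
        raise ValueError(
            f"{target_model} is only available in the China mainland deployment"
        )
    if region == "intl" and target_model in NO_DESIGN_INTL:
        raise ValueError(
            f"{target_model} does not support voice design in the international deployment"
        )
    return ENDPOINTS[region]


if __name__ == "__main__":
    print(resolve_endpoint("cn", "cosyvoice-v3.5-plus"))
```

Failing early on an unsupported pairing keeps the mismatch from surfacing later as an opaque API error.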
Prerequisites
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
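The lookup order described above (environment first, then the credentials file) might be sketched as follows. The layout of ~/.alibabacloud/credentials is not specified here, so this sketch assumes INI-style "name = value" lines; load_api_key is a hypothetical helper, not part of the repo.

```python
import os
from pathlib import Path
from typing import Optional

DEFAULT_CREDENTIALS = Path.home() / ".alibabacloud" / "credentials"


def load_api_key(credentials_path: Path = DEFAULT_CREDENTIALS) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key.strip()
    if not credentials_path.is_file():
        return None
    # Assumption: the file holds INI-style "name = value" lines.
    for raw in credentials_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue  # skip blanks, comments, and section headers
        name, sep, value = line.partition("=")
        if sep and name.strip() == "dashscope_api_key":
            return value.strip() or None
    return None
```

Returning None instead of raising lets the caller decide how to report a missing key.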
Normalized interface (cosyvoice.voice_design)
Request
- model (string, optional): fixed to voice-enrollment
- target_model (string, optional): default cosyvoice-v3.5-plus
- prefix (string, required): letters/digits only, max 10 chars
- voice_prompt (string, required): max 500 chars, Chinese or English only
- preview_text (string, required): max 200 chars, Chinese or English
- language_hints (array[string], optional): zh or en, and should match preview_text
- sample_rate (int, optional): e.g. 24000
- response_format (string, optional): e.g. wav
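The field constraints above can be checked before any request is sent. This is a minimal sketch under stated assumptions: build_request is a hypothetical helper, "letters/digits" is read as ASCII letters and digits, and lengths are counted in Python characters, which may differ from how the service counts them.

```python
import re
from typing import Any, Dict, List, Optional

ALLOWED_TARGET_MODELS = {
    "cosyvoice-v3.5-plus",
    "cosyvoice-v3.5-flash",
    "cosyvoice-v3-plus",
    "cosyvoice-v3-flash",
}
ALLOWED_LANGUAGE_HINTS = {"zh", "en"}


def build_request(
    prefix: str,
    voice_prompt: str,
    preview_text: str,
    target_model: str = "cosyvoice-v3.5-plus",
    language_hints: Optional[List[str]] = None,
    sample_rate: Optional[int] = None,
    response_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the inputs and return a normalized request dict."""
    errors = []
    if target_model not in ALLOWED_TARGET_MODELS:
        errors.append(f"unsupported target_model: {target_model}")
    # Assumption: "letters/digits only" means ASCII letters and digits.
    if not re.fullmatch(r"[A-Za-z0-9]{1,10}", prefix):
        errors.append("prefix must be 1-10 letters or digits")
    if not 1 <= len(voice_prompt) <= 500:
        errors.append("voice_prompt must be 1-500 characters")
    if not 1 <= len(preview_text) <= 200:
        errors.append("preview_text must be 1-200 characters")
    for hint in language_hints or []:
        if hint not in ALLOWED_LANGUAGE_HINTS:
            errors.append(f"unsupported language hint: {hint}")
    if errors:
        raise ValueError("; ".join(errors))

    request: Dict[str, Any] = {
        "model": "voice-enrollment",  # fixed value
        "target_model": target_model,
        "prefix": prefix,
        "voice_prompt": voice_prompt,
        "preview_text": preview_text,
    }
    # Optional fields are included only when the caller supplies them.
    if language_hints:
        request["language_hints"] = list(language_hints)
    if sample_rate is not None:
        request["sample_rate"] = sample_rate
    if response_format is not None:
        request["response_format"] = response_format
    return request
```

Collecting every violation before raising reports all problems in one pass instead of one per run.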
Response
- voice_id (string)
- request_id (string)
- status (string, optional)
Operational guidance
- Keep voice_prompt concrete: timbre, age range, pace, emotion, articulation, and scenario.
- If language_hints is used, it should match the language of preview_text.
- Designed voice names include a -vd- marker in the generated backend naming convention.
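The language_hints guidance above can be approximated with a quick local check. This sketch is only a heuristic and makes assumptions: it treats any character in the basic CJK Unified Ideographs block as evidence of Chinese and everything else as English. Both function names are hypothetical.

```python
from typing import List, Optional


def guess_language(text: str) -> str:
    """Guess "zh" if the text contains a basic CJK ideograph, else "en"."""
    # Heuristic only: checks the U+4E00..U+9FFF block and nothing else.
    return "zh" if any("\u4e00" <= ch <= "\u9fff" for ch in text) else "en"


def hints_match_preview(language_hints: Optional[List[str]], preview_text: str) -> bool:
    """Return True when no hints are given or the guessed language is among them."""
    if not language_hints:
        return True
    return guess_language(preview_text) in language_hints
```

A mismatch flagged here is a prompt to double-check the request, not proof that the service would reject it.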
Local helper script
Prepare a normalized request JSON:
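```bash
# Sample values are Chinese (a calm, middle-aged male news announcer), hence --language-hint zh.
python skills/ai/audio/aliyun-cosyvoice-voice-design/scripts/prepare_cosyvoice_design_request.py \
  --target-model cosyvoice-v3.5-plus \
  --prefix announcer \
  --voice-prompt "沉稳的中年男性播音员,低沉有磁性,语速平稳,吐字清晰。" \
  --preview-text "各位听众朋友,大家好,欢迎收听晚间新闻。" \
  --language-hint zh
```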
Validation
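```bash
# Stop at the first failure so the exit code reflects a compile error.
set -e
mkdir -p output/aliyun-cosyvoice-voice-design
for f in skills/ai/audio/aliyun-cosyvoice-voice-design/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/aliyun-cosyvoice-voice-design/validate.txt
```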
Pass criteria: command exits 0 and output/aliyun-cosyvoice-voice-design/validate.txt is generated.
Output And Evidence
- Save artifacts, command outputs, and API response summaries under output/aliyun-cosyvoice-voice-design/.
- Include target_model, prefix, voice_prompt, and preview_text in the evidence file.
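One way to capture the evidence described above is a small JSON writer. This is a sketch under assumptions: the file name evidence.json and the helper write_evidence are hypothetical, and the response is taken to be a dict using the normalized response field names.

```python
import json
from pathlib import Path
from typing import Any, Dict

EVIDENCE_DIR = Path("output/aliyun-cosyvoice-voice-design")
EVIDENCE_FIELDS = ("target_model", "prefix", "voice_prompt", "preview_text")


def write_evidence(
    request: Dict[str, Any],
    response: Dict[str, Any],
    out_dir: Path = EVIDENCE_DIR,
) -> Path:
    """Write the required request fields and a response summary as JSON."""
    missing = [name for name in EVIDENCE_FIELDS if name not in request]
    if missing:
        raise ValueError(f"request is missing evidence fields: {missing}")
    out_dir.mkdir(parents=True, exist_ok=True)
    evidence: Dict[str, Any] = {name: request[name] for name in EVIDENCE_FIELDS}
    # Keep only the normalized response fields in the summary.
    evidence["response_summary"] = {
        key: response.get(key) for key in ("voice_id", "request_id", "status")
    }
    path = out_dir / "evidence.json"  # hypothetical file name
    # ensure_ascii=False keeps Chinese prompts readable in the file.
    path.write_text(json.dumps(evidence, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```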
References
- references/api_reference.md
- references/sources.md
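Skill name: aliyun-cosyvoice-voice-design
Detailed description:
Category: provider
Model Studio CosyVoice Voice Design
Use the CosyVoice voice enrollment API to create designed voices from a natural-language voice description.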
Critical model names
Use model=voice-enrollment and one of these target_model values:
- cosyvoice-v3.5-plus
- cosyvoice-v3.5-flash
- cosyvoice-v3-plus
- cosyvoice-v3-flash
Recommended default in this repo:
- target_model=cosyvoice-v3.5-plus
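Region and compatibility
- cosyvoice-v3.5-plus and cosyvoice-v3.5-flash are available only in China mainland deployment mode (Beijing endpoint).
- In international deployment mode (Singapore endpoint), cosyvoice-v3-plus and cosyvoice-v3-flash do not support voice cloning or voice design.
- The target_model must match the model used later for speech synthesis.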
Endpoint
- Domestic: https://dashscope.aliyuncs.com/api/v1/services/audio/tts/customization
- International: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/tts/customization
Prerequisites
- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
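Normalized interface (cosyvoice.voice_design)
Request
- model (string, optional): fixed to voice-enrollment
- target_model (string, optional): default cosyvoice-v3.5-plus
- prefix (string, required): letters/digits only, max 10 chars
- voice_prompt (string, required): max 500 chars, Chinese or English only
- preview_text (string, required): max 200 chars, Chinese or English
- language_hints (array[string], optional): zh or en, and should match preview_text
- sample_rate (int, optional): e.g. 24000
- response_format (string, optional): e.g. wav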
Response
- voice_id (string)
- request_id (string)
- status (string, optional)
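Operational guidance
- Keep voice_prompt concrete: timbre, age range, pace, emotion, articulation, and scenario.
- If language_hints is used, it should match the language of preview_text.
- Designed voice names include a -vd- marker in the generated backend naming convention.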
Local helper script
Prepare a normalized request JSON:
```bash
# Sample values are Chinese (a calm, middle-aged male news announcer), hence --language-hint zh.
python skills/ai/audio/aliyun-cosyvoice-voice-design/scripts/prepare_cosyvoice_design_request.py \
  --target-model cosyvoice-v3.5-plus \
  --prefix announcer \
  --voice-prompt "沉稳的中年男性播音员,低沉有磁性,语速平稳,吐字清晰。" \
  --preview-text "各位听众朋友,大家好,欢迎收听晚间新闻。" \
  --language-hint zh
```
Validation
```bash
# Stop at the first failure so the exit code reflects a compile error.
set -e
mkdir -p output/aliyun-cosyvoice-voice-design
for f in skills/ai/audio/aliyun-cosyvoice-voice-design/scripts/*.py; do
  python3 -m py_compile "$f"
done
echo "py_compile_ok" > output/aliyun-cosyvoice-voice-design/validate.txt
```
Pass criteria: the command exits with code 0 and output/aliyun-cosyvoice-voice-design/validate.txt is generated.
Output And Evidence
- Save artifacts, command outputs, and API response summaries under output/aliyun-cosyvoice-voice-design/.
- Include target_model, prefix, voice_prompt, and preview_text in the evidence file.
References
- references/api_reference.md
- references/sources.md