Category: provider
Model Studio Qwen TTS Voice Design
Use voice design models to create controllable synthetic voices from natural language descriptions.
Critical model names
Use one of these exact model strings:
- qwen3-tts-vd-2026-01-26
- qwen3-tts-vd-realtime-2026-01-15
Prerequisites
- Install the SDK in a virtual environment:

    python3 -m venv .venv
    . .venv/bin/activate
    python -m pip install dashscope

- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials.
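The two credential sources above can be checked in code before any request is sent. The following is a minimal sketch, not the SDK's own lookup logic: the helper name `resolve_api_key` is hypothetical, checking the environment first is an assumed precedence, and the credentials file is assumed to be INI-formatted with `dashscope_api_key` stored under some section.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[Path] = None) -> Optional[str]:
    """Return the API key from the environment, else from the credentials file."""
    # 1) Environment variable (assumed to take precedence).
    key = os.environ.get("DASHSCOPE_API_KEY", "").strip()
    if key:
        return key

    # 2) Fall back to the credentials file (assumed INI format).
    path = credentials_path or Path.home() / ".alibabacloud" / "credentials"
    if not path.is_file():
        return None
    # interpolation=None so a literal "%" in a key is not treated as a template.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(path, encoding="utf-8")
    except configparser.Error:
        return None
    # Assumption: the section name is unknown, so scan every section.
    for section in parser.sections():
        value = parser.get(section, "dashscope_api_key", fallback="").strip()
        if value:
            return value
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing key is fatal or should trigger a prompt.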
Normalized interface (tts.voice_design)
Request
- voice_prompt (string, required): target voice description
- text (string, required)
- stream (bool, optional)
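The request fields listed above could be modeled as a small validated container. This is a sketch only: the class name `VoiceDesignRequest` is hypothetical, the `False` default for `stream` is an assumption, and the JSON layout is one plausible normalized form rather than the helper script's confirmed format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VoiceDesignRequest:
    """Hypothetical container mirroring the tts.voice_design request fields."""

    voice_prompt: str      # required: target voice description
    text: str              # required
    stream: bool = False   # optional; the False default is an assumption

    def __post_init__(self) -> None:
        # Reject missing or blank required fields early.
        for name in ("voice_prompt", "text"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} is required and must be a non-empty string")
        if not isinstance(self.stream, bool):
            raise TypeError("stream must be a bool")

    def to_json(self) -> str:
        # ensure_ascii=False keeps non-ASCII prompt text readable in the output.
        return json.dumps(asdict(self), ensure_ascii=False)


request = VoiceDesignRequest(
    voice_prompt="A warm female host voice, clear articulation, moderate pace",
    text="This is a voice design demo",
)
print(request.to_json())
```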
Response
- audio_url (string) or streaming PCM chunks
- voice_id (string)
- request_id (string)
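A response check along these lines can catch schema drift early. This is a hedged sketch: the function name `validate_response` is hypothetical, and it assumes that `voice_id` and `request_id` accompany every response while `audio_url` is only expected when audio is not streamed as PCM chunks. The interface description above does not state which fields are mandatory.

```python
from typing import Any, Dict, List


def validate_response(payload: Dict[str, Any], streaming: bool = False) -> List[str]:
    """Return a list of schema problems; an empty list means the payload passed."""
    problems: List[str] = []
    # Assumption: these two fields are present on every response.
    expected = ["voice_id", "request_id"]
    # Assumption: audio_url is only expected in non-streaming mode.
    if not streaming:
        expected.insert(0, "audio_url")
    for field in expected:
        value = payload.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
        elif not isinstance(value, str):
            problems.append(f"{field} must be a string, got {type(value).__name__}")
    return problems
```

Collecting all problems rather than failing on the first one makes the result more useful when saved alongside other evidence.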
Operational guidance
- Write voice prompts with explicit tone, pace, emotion, and timbre constraints.
- Build a reusable voice prompt library for product consistency.
- Validate the generated voice on short utterances before synthesizing long scripts.
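One way to follow this guidance is to keep the four constraints as structured fields and render the prompt from them. The sketch below is illustrative only: `VoiceSpec`, `VOICE_LIBRARY`, and the sample entries are hypothetical, and the rendered sentence format is an assumption, not a format the models require.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class VoiceSpec:
    """Hypothetical record holding the four constraints named above."""

    tone: str
    pace: str
    emotion: str
    timbre: str

    def to_prompt(self) -> str:
        # A fixed field order keeps prompts comparable across library entries.
        return (
            f"Tone: {self.tone}. Pace: {self.pace}. "
            f"Emotion: {self.emotion}. Timbre: {self.timbre}."
        )


# A small reusable library keyed by product role (entries are illustrative).
VOICE_LIBRARY: Dict[str, VoiceSpec] = {
    "host": VoiceSpec("warm and welcoming", "moderate", "calm", "clear female voice"),
    "alert": VoiceSpec("firm", "brisk", "neutral", "low male voice"),
}

# Short utterance to try first, before committing a voice to a long script.
SMOKE_TEXT = "This is a voice design demo."

for name, spec in VOICE_LIBRARY.items():
    print(f"{name}: {spec.to_prompt()} | test text: {SMOKE_TEXT}")
```

Storing specs rather than free-form strings makes it easier to change one attribute, such as pace, without rewriting the whole prompt.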
Local helper script
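Prepare a normalized request JSON and validate the response schema:

    .venv/bin/python skills/ai/audio/aliyun-qwen-tts-voice-design/scripts/prepare_voice_design_request.py \
      --voice-prompt "A warm female host voice, clear articulation, moderate pace" \
      --text "This is a voice design demo"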
Output location
- Default output: output/ai-audio-tts-voice-design/audio/
- Override the base directory with OUTPUT_DIR.
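The override could be resolved as in the sketch below, using the default path output/ai-audio-tts-voice-design/audio/. It assumes that OUTPUT_DIR replaces only the leading `output` component and that the rest of the path stays fixed; the text above does not spell this out, and the helper name `resolve_output_dir` is hypothetical.

```python
import os
from pathlib import Path

# Default path split into an assumed base and a fixed suffix.
DEFAULT_BASE = "output"
SUFFIX = Path("ai-audio-tts-voice-design") / "audio"


def resolve_output_dir(create: bool = False) -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR when it is set."""
    # An empty OUTPUT_DIR is treated the same as an unset one.
    base = os.environ.get("OUTPUT_DIR") or DEFAULT_BASE
    out = Path(base) / SUFFIX
    if create:
        out.mkdir(parents=True, exist_ok=True)
    return out
```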
Validation
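    mkdir -p output/aliyun-qwen-tts-voice-design
    for f in skills/ai/audio/aliyun-qwen-tts-voice-design/scripts/*.py; do
      python3 -m py_compile "$f"
    done
    echo "py_compile_ok" > output/aliyun-qwen-tts-voice-design/validate.txt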
Pass criteria: command exits 0 and output/aliyun-qwen-tts-voice-design/validate.txt is generated.
Output and evidence
- Save artifacts, command outputs, and API response summaries under output/aliyun-qwen-tts-voice-design/.
- Include key parameters (region/resource id/time range) in evidence files for reproducibility.
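An evidence record along these lines could be written after each run. This is a minimal sketch under stated assumptions: the helper name `write_evidence`, the JSON field names, and the one-file-per-run layout are all hypothetical choices, not a format the skill prescribes.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict

EVIDENCE_DIR = Path("output/aliyun-qwen-tts-voice-design")


def write_evidence(
    name: str,
    params: Dict[str, Any],
    summary: Dict[str, Any],
    evidence_dir: Path = EVIDENCE_DIR,
) -> Path:
    """Write one JSON evidence record and return its path."""
    evidence_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "parameters": params,          # e.g. region, resource id, time range
        "response_summary": summary,   # e.g. request_id, voice_id
    }
    path = evidence_dir / f"{name}.json"
    path.write_text(json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

Recording a UTC timestamp next to the parameters makes it possible to line a record up with service-side logs later.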
Workflow
1) Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
2) Run one minimal read-only query first to verify connectivity and permissions.
3) Execute the target operation with explicit parameters and bounded scope.
4) Verify results and save output/evidence files.