Category: provider

Model Studio Qwen TTS

Validation

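```bash
mkdir -p output/aliyun-qwen-tts
python -m py_compile skills/ai/audio/aliyun-qwen-tts/scripts/generate_tts.py && echo py_compile_ok > output/aliyun-qwen-tts/validate.txt
```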

Pass criteria: command exits 0 and output/aliyun-qwen-tts/validate.txt is generated.
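The same pass criteria can also be checked from Python. The following is a minimal sketch using the standard-library `py_compile` module; the function name `validate_script` and its `out_dir` parameter are hypothetical and not part of the skill.

```python
import py_compile
from pathlib import Path


def validate_script(script_path: str, out_dir: str = "output/aliyun-qwen-tts") -> Path:
    """Byte-compile script_path and write a marker file on success.

    Raises py_compile.PyCompileError on a syntax error, in which case
    no marker file is written.
    """
    py_compile.compile(script_path, doraise=True)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    marker = out / "validate.txt"
    marker.write_text("py_compile_ok\n", encoding="utf-8")
    return marker
```

Because the marker is written only after compilation succeeds, the presence of validate.txt carries the same meaning as in the shell command.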

Output And Evidence

- Save generated audio links, sample audio files, and request payloads to output/aliyun-qwen-tts/.
- Keep one validation log per execution.
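One way to keep a record per execution is to write a small JSON file for each run. This is only a sketch: the helper name `save_evidence`, the `run-<timestamp>.json` file naming, and the record layout are assumptions, not something the skill prescribes.

```python
import json
import time
from pathlib import Path
from typing import Optional


def save_evidence(payload: dict, audio_url: Optional[str],
                  out_dir: str = "output/aliyun-qwen-tts") -> Path:
    """Write one JSON evidence file per execution and return its path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Never persist credentials alongside the evidence.
    safe_payload = {k: v for k, v in payload.items() if k != "api_key"}
    record = {"request": safe_payload, "audio_url": audio_url}
    # Nanosecond timestamp keeps file names unique across runs (hypothetical scheme).
    path = out / f"run-{time.time_ns()}.json"
    path.write_text(json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```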

Critical model names

Use one of the recommended models:

- qwen3-tts-flash
- qwen3-tts-instruct-flash
- qwen3-tts-instruct-flash-2026-01-26

Prerequisites

- Install the SDK (a venv is recommended to avoid PEP 668 restrictions):

```bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
```

- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials (env takes precedence).
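The precedence rule (environment first, credentials file second) can be sketched as follows. This assumes the credentials file is INI-formatted with the key under a `[default]` section; the function name `resolve_api_key` and its `credentials_path` parameter are hypothetical.

```python
import configparser
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(credentials_path: Optional[str] = None) -> Optional[str]:
    """Return the DashScope API key, preferring the environment variable."""
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key
    path = (Path(credentials_path) if credentials_path
            else Path.home() / ".alibabacloud" / "credentials")
    if not path.is_file():
        return None
    parser = configparser.ConfigParser()
    parser.read(path, encoding="utf-8")
    # Assumed layout: [default] section containing dashscope_api_key.
    return parser.get("default", "dashscope_api_key", fallback=None)
```

Returning `None` rather than raising leaves it to the caller to decide how to report a missing key.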

Normalized interface (tts.generate)

Request

- text (string, required)
- voice (string, required)
- language_type (string, optional; default Auto)
- instruction (string, optional; recommended for instruct models)
- stream (bool, optional; default false)

Response

- audio_url (string, when stream=false)
- audio_base64_pcm (string, when stream=true)
- sample_rate (int, 24000)
- format (string, wav or pcm depending on mode)
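The request and response fields listed above could be modeled with dataclasses, as in the sketch below. The class names `TtsRequest` and `TtsResponse` and the validation rules are illustrative assumptions; only the field names, types, and defaults come from the interface description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TtsRequest:
    text: str                          # required
    voice: str                         # required
    language_type: str = "Auto"        # optional
    instruction: Optional[str] = None  # optional; recommended for instruct models
    stream: bool = False               # optional

    def __post_init__(self) -> None:
        if not self.text:
            raise ValueError("text is required")
        if not self.voice:
            raise ValueError("voice is required")


@dataclass(frozen=True)
class TtsResponse:
    format: str                              # "wav" or "pcm" depending on mode
    audio_url: Optional[str] = None          # set when stream=false
    audio_base64_pcm: Optional[str] = None   # set when stream=true
    sample_rate: int = 24000
```

For example, `TtsRequest(text="Hello", voice="Cherry")` picks up the documented defaults for `language_type` and `stream`.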

Quick start (Python + DashScope SDK)

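```python
import os
import dashscope

# Prefer the environment variable for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].
# Beijing region; for the Singapore region use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    instruction="Warm and calm tone, slightly slower pace.",
    stream=False,
)

audio_url = response.output.audio.url
print(audio_url)
```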

Streaming notes

- stream=True returns Base64-encoded PCM chunks at 24 kHz.
- Decode the chunks and play them, or concatenate them into a PCM buffer.
- When the stream ends, the response contains finish_reason == "stop".
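Decoding and concatenating the chunks might look like the sketch below, which also wraps the PCM buffer in a WAV container so that ordinary players can open it. The 16-bit mono sample layout is an assumption (the notes above only state the 24 kHz rate), and `pcm_chunks_to_wav` is a hypothetical helper name.

```python
import base64
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(chunks_b64: Iterable[str], sample_rate: int = 24000) -> bytes:
    """Decode Base64 PCM chunks, concatenate them, and return WAV bytes.

    Assumes 16-bit mono PCM; adjust the channel count and sample width
    if the service returns a different layout.
    """
    pcm = b"".join(base64.b64decode(chunk) for chunk in chunks_b64)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # mono (assumption)
        wav.setsampwidth(2)      # 16-bit samples (assumption)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

In a real stream, the Base64 strings would be collected from each response chunk until the one reporting `finish_reason == "stop"`.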

Operational guidance

- Keep requests concise; split long text into multiple calls if you hit size or timeout errors.
- Use a language_type consistent with the text to improve pronunciation.
- Use instruction only when you need explicit style/tone control.
- Cache by (text, voice, language_type) to avoid repeat costs.
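Two of the points above, caching by (text, voice, language_type) and splitting long text, can be sketched with the standard library. The helper names and the `max_chars=300` default are hypothetical; the service's actual size limit is not stated here.

```python
import hashlib
import re
from typing import List


def cache_key(text: str, voice: str, language_type: str = "Auto") -> str:
    """Stable key derived from (text, voice, language_type)."""
    raw = "\x1f".join((text, voice, language_type)).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def split_text(text: str, max_chars: int = 300) -> List[str]:
    """Split text at sentence boundaries into pieces of at most max_chars.

    A single sentence longer than max_chars is hard-wrapped.
    """
    # Western punctuation needs trailing whitespace; CJK punctuation does not.
    sentences = re.split(r"(?<=[.!?])\s+|(?<=[。！？])", text.strip())
    pieces: List[str] = []
    current = ""
    for sentence in filter(None, sentences):
        while len(sentence) > max_chars:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces
```

Each piece returned by `split_text` can be sent as its own call, and `cache_key` can index a local store of previously generated audio.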

Output location

- Default output: output/aliyun-qwen-tts/audio/
- Override the base dir with OUTPUT_DIR.
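A possible reading of the override is sketched below. It assumes OUTPUT_DIR replaces only the leading `output` component of the default path; the source does not spell this out, and `audio_output_dir` is a hypothetical helper name.

```python
import os
from pathlib import Path


def audio_output_dir() -> Path:
    """Return the audio output directory, honoring OUTPUT_DIR if set.

    Assumption: OUTPUT_DIR replaces the leading "output" component only.
    """
    base = Path(os.environ.get("OUTPUT_DIR") or "output")
    return base / "aliyun-qwen-tts" / "audio"
```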

Workflow

1) Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
2) Run one minimal read-only query first to verify connectivity and permissions.
3) Execute the target operation with explicit parameters and bounded scope.
4) Verify results and save output/evidence files.

References

- references/api_reference.md for parameter mapping and a streaming example.
- Realtime mode is provided by skills/ai/audio/aliyun-qwen-tts-realtime/.
- Voice cloning and voice design are provided by skills/ai/audio/aliyun-qwen-tts-voice-clone/ and skills/ai/audio/aliyun-qwen-tts-voice-design/.
- Source list: references/sources.md

Skill name: aliyun-qwen-tts
Detailed description:
Category: provider

Model Studio Qwen TTS

Validation

```bash
mkdir -p output/aliyun-qwen-tts
python -m py_compile skills/ai/audio/aliyun-qwen-tts/scripts/generate_tts.py && echo py_compile_ok > output/aliyun-qwen-tts/validate.txt
```

Pass criteria: the command exits with code 0 and output/aliyun-qwen-tts/validate.txt is generated.

Output And Evidence

- Save generated audio links, sample audio files, and request payloads to output/aliyun-qwen-tts/.
- Keep one validation log per execution.

Critical model names

Use one of the recommended models:

- qwen3-tts-flash
- qwen3-tts-instruct-flash
- qwen3-tts-instruct-flash-2026-01-26

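Prerequisites

- Install the SDK (a venv is recommended to avoid PEP 668 restrictions):

```bash
python3 -m venv .venv
. .venv/bin/activate
python -m pip install dashscope
```

- Set DASHSCOPE_API_KEY in your environment, or add dashscope_api_key to ~/.alibabacloud/credentials (the environment variable takes precedence).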

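Normalized interface (tts.generate)

Request

- text (string, required)
- voice (string, required)
- language_type (string, optional; default Auto)
- instruction (string, optional; recommended for instruct models)
- stream (bool, optional; default false)

Response

- audio_url (string, when stream=false)
- audio_base64_pcm (string, when stream=true)
- sample_rate (int, 24000)
- format (string, wav or pcm depending on mode)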

Quick start (Python + DashScope SDK)

```python
import os
import dashscope

# Prefer the environment variable for auth: export DASHSCOPE_API_KEY=...
# Or use ~/.alibabacloud/credentials with dashscope_api_key under [default].
# Beijing region; for the Singapore region use: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = "https://dashscope.aliyuncs.com/api/v1"

text = "Hello, this is a short voice line."
response = dashscope.MultiModalConversation.call(
    model="qwen3-tts-instruct-flash",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    text=text,
    voice="Cherry",
    language_type="English",
    instruction="Warm and calm tone, slightly slower pace.",
    stream=False,
)

audio_url = response.output.audio.url
print(audio_url)
```

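Streaming notes

- stream=True returns Base64-encoded PCM chunks at 24 kHz.
- Decode the chunks and play them, or concatenate them into a PCM buffer.
- When the stream ends, the response contains finish_reason == "stop".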

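Operational guidance

- Keep requests concise; split long text into multiple calls if you hit size or timeout errors.
- Use a language_type consistent with the text to improve pronunciation.
- Use instruction only when you need explicit style/tone control.
- Cache by (text, voice, language_type) to avoid repeat costs.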

Output location

- Default output: output/aliyun-qwen-tts/audio/
- Override the base dir with OUTPUT_DIR.

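Workflow

1) Confirm user intent, region, identifiers, and whether the operation is read-only or mutating.
2) Run one minimal read-only query first to verify connectivity and permissions.
3) Execute the target operation with explicit parameters and bounded scope.
4) Verify results and save output/evidence files.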

References

- references/api_reference.md for parameter mapping and a streaming example.
- Realtime mode is provided by skills/ai/audio/aliyun-qwen-tts-realtime/.
- Voice cloning and voice design are provided by skills/ai/audio/aliyun-qwen-tts-voice-clone/ and skills/ai/audio/aliyun-qwen-tts-voice-design/.
- Source list: references/sources.md
