Agent setup: If your agent doesn't auto-load skills (e.g. Claude Code),
see agent-compatibility.md once per session.

Qwen Model Selector (Advisor)

This skill operates in two modes:

1. Interactive advisory: asks diagnostic questions to recommend the right model (see Diagnostic Flow).
2. Cross-skill resolution: provides a fast-path model lookup for execution skills that need a model decision without user interaction (see Cross-Skill Model Resolution).

Do not fabricate model names — only recommend models listed in this skill.
This skill is part of qwencloud/qwencloud-ai.

Skill directory

Use this skill's reference files for data and learning. Load on demand — do not fetch external URLs unless the user
explicitly asks for latest data.

| Location | Purpose |
| --- | --- |
| references/pricing.md | Pricing overview: model categories, billing units, and link to official pricing page |
| references/model-list.md | Model catalog (point-in-time snapshot) |
| references/sources.md | Official documentation URLs (manual lookup only) |
| references/agent-compatibility.md | Agent self-check: register skills in project config for agents that don't auto-load |

Security

NEVER output any API key or credential in plaintext. Always use variable references ($DASHSCOPE_API_KEY in shell,
os.environ["QWEN_API_KEY"] in Python). Any check or detection of credentials must be non-plaintext: report
only status (e.g. "set" / "not set", "valid" / "invalid"), never the value. Never display contents of .env or config
files that may contain secrets.
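
A minimal sketch of a status-only credential check, assuming the key lives in one of the environment variables named above. The helper name `credential_status` is hypothetical and not part of this skill; the point is that only "set" / "not set" ever leaves the function.

```python
import os


def credential_status(var_name: str) -> str:
    """Report whether an environment variable is set, never its value."""
    value = os.environ.get(var_name, "")
    # Only the status string is returned; the secret is never printed or logged.
    return "set" if value.strip() else "not set"
```

Calling `credential_status("DASHSCOPE_API_KEY")` yields a status string that is safe to show to the user.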

Coding Plan Models

Users with a Coding Plan subscription have access to a
limited set of models through their coding tools only:

| Model | Context | Thinking |
| --- | --- | --- |
| qwen3.5-plus | 1M | Yes (budget: 81,920) |
| kimi-k2.5 | 256K | Yes (budget: 81,920) |
| glm-5 | 198K | Yes (budget: 32,768) |
| MiniMax-M2.5 | 192K | Yes (budget: 32,768) |
| qwen3-max-2026-01-23 | 256K | Yes (budget: 81,920) |
| qwen3-coder-next | 256K | No |
| qwen3-coder-plus | 1M | No |
| glm-4.7 | 198K | Yes (budget: 32,768) |

Coding Plan does not include image, video, TTS, or specialized vision models. When recommending models, note if the
user's chosen model falls outside this list and they are using a Coding Plan key (sk-sp-...). If qwencloud-ops-auth is
installed, see its references/codingplan.md for the full model list and error codes.
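
The Coding Plan check described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the model set is copied from the table in this section, and the key is inspected by prefix only and never echoed back.

```python
from typing import Optional

# Models listed in the Coding Plan table of this skill (point-in-time snapshot).
CODING_PLAN_MODELS = {
    "qwen3.5-plus", "kimi-k2.5", "glm-5", "MiniMax-M2.5",
    "qwen3-max-2026-01-23", "qwen3-coder-next", "qwen3-coder-plus", "glm-4.7",
}


def is_coding_plan_key(api_key: str) -> bool:
    """Coding Plan keys are identified by the sk-sp- prefix."""
    return api_key.startswith("sk-sp-")


def coding_plan_warning(model: str, api_key: str) -> Optional[str]:
    """Return a warning when a Coding Plan key is paired with a model outside the list."""
    if is_coding_plan_key(api_key) and model not in CODING_PLAN_MODELS:
        # The message names the model only; the key value is never included.
        return f"Model '{model}' is not in the Coding Plan list for this key type."
    return None
```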

Diagnostic Flow

Ask the user (in order):

1. Content type? text / image / video / audio / vision
2. Primary task? generation / understanding / coding / reasoning / translation
3. Priority? quality vs speed vs cost
4. Input size? short / medium / long context
5. Structured output? JSON / function calling needed?
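
One way to keep the questions in order is to track which answers are still missing. The sketch below is an assumption about how an agent might do that; the dictionary keys are hypothetical labels, and the prompts restate the list above.

```python
from typing import Dict, Optional

# Keys are hypothetical labels; prompts restate the diagnostic questions in order.
QUESTIONS = [
    ("content_type", "Content type? text / image / video / audio / vision"),
    ("primary_task", "Primary task? generation / understanding / coding / reasoning / translation"),
    ("priority", "Priority? quality vs speed vs cost"),
    ("input_size", "Input size? short / medium / long context"),
    ("structured_output", "Structured output? JSON / function calling needed?"),
]


def next_question(answers: Dict[str, str]) -> Optional[str]:
    """Return the first unanswered prompt in order, or None when all are answered."""
    for key, prompt in QUESTIONS:
        if not answers.get(key):
            return prompt
    return None
```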

Cross-Skill Model Resolution

When an execution skill needs to choose a model, evaluate across three dimensions: Requirement → Scenario →
Pricing. If the user explicitly specified a model, use it as given — but still verify availability; if
restricted, warn the user and suggest an alternative.

Dimension 1 · Requirement (select)

Match task capability to the right model. Use when the user's need points to a specialized model, or when the task is
ambiguous and you need to compare capabilities.

| Signal | Keywords | Model |
| --- | --- | --- |
| Reasoning | "think step by step", "reason", "analyze" | qwq-plus (text) · qvq-max (vision) |
| Coding | | |

Dimension 2 · Scenario (tune)

Adjust model tier based on how the model will be used.

| Pattern | Signals | Guidance |
| --- | --- | --- |
| Interactive / real-time | "chat", "real-time", "interactive" | Prefer flash/turbo variants; enable streaming |
| Batch / offline | | |

Dimension 3 · Pricing (optimize)

Given the candidates from dimensions 1–2, compare costs and apply modifiers.

- Pricing reference: pricing.md. For the latest rates, check the official pricing page.
- Free quota: Some models offer a limited free quota after activation. However, quotas may have been consumed, expired, or changed. Never assume remaining free quota; always present the paid unit price.
- Batch API: 50% off both input and output tokens for non-realtime workloads.
- Context cache: Input token discount for repeated/templated contexts.
- Tiered pricing: Some models charge more per token as input length increases; check pricing tables for breakpoints.
- When cost is the user's primary concern, explicitly recommend the cheapest viable model and cite the price.

Default

No signals detected, clear task → use the Canonical Default for the domain.

| Domain | Default | Quality | Speed | Cost |
| --- | --- | --- | --- | --- |
| text.chat | qwen3.5-plus | qwen3-max | qwen3.5-flash | qwen-turbo |
| vision.analyze | | | | |

Degradation: If this skill is not loaded or not available, each execution skill falls back to its own built-in
default. This protocol is purely additive — it enhances model selection but never blocks execution.
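
A minimal sketch of the resolution order described in this section, assuming a lookup keyed by domain and priority. The function `resolve_model` and its parameters are hypothetical; only the `text.chat` row is taken from the table above, and other domains deliberately fall through to the execution skill's own default.

```python
from typing import Optional

# Only the text.chat row is recoverable from the Default table; other domains are omitted.
CANONICAL = {
    "text.chat": {
        "default": "qwen3.5-plus",
        "quality": "qwen3-max",
        "speed": "qwen3.5-flash",
        "cost": "qwen-turbo",
    },
}


def resolve_model(
    domain: str,
    priority: Optional[str] = None,
    user_model: Optional[str] = None,
    builtin_default: Optional[str] = None,
) -> Optional[str]:
    """Pick a model: explicit user choice, then the table, then the skill's built-in default."""
    if user_model:
        # Used as given; the caller is still expected to verify availability.
        return user_model
    row = CANONICAL.get(domain)
    if row is None:
        # Degradation path: never block execution, fall back to the built-in default.
        return builtin_default
    return row.get(priority or "default", row["default"])
```

An unknown priority falls back to the domain default rather than raising, which matches the "purely additive" intent of the protocol.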

Model Recommendation Matrix

Text Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| General chat/assistant | qwen3.5-plus | Best balance of quality, speed, cost. Also accepts image/video input (multimodal). Thinking enabled by default. |
| Fast responses, low cost | | |

Image Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Best quality text-to-image | wan2.6-t2i | Latest model, sync support |
| Image editing / style transfer (1–4 refs) | | |

Video Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Quick video creation | wan2.6-i2v-flash | Fast, multi-shot narrative |
| High quality | | |

Audio Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Highest quality | INLINECODE10 | Best naturalness, emotional expression, professional scenarios |
| High quality + speed | | |

Vision Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Best accuracy | qwen3-vl-plus | Highest vision understanding. Thinking mode supported. 256K context. |
| Fast analysis | qwen3-vl-flash | Quick image understanding. Thinking mode supported. |
| Unified text+vision | qwen3.5-plus | Multimodal (text + image + video). Surpasses qwen3-vl series on many benchmarks. Use when both text quality and vision matter. |

Omni Models

| Use Case | Recommended | Why |
| --- | --- | --- |
| Voice + vision chat | qwen3-omni-flash | Text/image/audio/video → text or speech. 49 voices, 10 languages. Thinking supported. |
| Real-time voice | qwen3-omni-flash-realtime | Streaming audio input + built-in VAD. 49 voices. |

Pricing Guidance

- Default pricing: pricing.md (International, USD). For the latest rates, check the official pricing page.
- Latest prices: When the user explicitly asks for exact/latest pricing, see sources.md for official URLs.
- Cost formula: Cost = Tokens ÷ 1,000,000 × Unit price, where the unit price is quoted per 1M tokens. 1K Chinese chars ≈ 1,200-1,500 tokens.
- Free quota: Some models offer a limited free quota after activation, but quotas may have been consumed, expired, or changed without notice. Always present the paid unit price first. Mention free quota only as something the user should verify in their QwenCloud console.
- Cost tips:
  - Use Batch calling for 50% off in non-realtime scenarios
  - Enable context cache for repeated contexts
  - Use flash/turbo series for non-critical tasks
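
The cost formula and the Batch discount above can be sketched as below. The helper names are hypothetical, and the sketch contains no price data: the unit price is a parameter the caller must take from references/pricing.md or the official pricing page. Input and output tokens are usually priced separately, so the function is meant to be called once per token class.

```python
from typing import Tuple


def estimate_cost(tokens: int, unit_price_per_million: float, batch: bool = False) -> float:
    """Cost = Tokens / 1,000,000 x Unit price; Batch calling halves the result."""
    if tokens < 0 or unit_price_per_million < 0:
        raise ValueError("tokens and unit price must be non-negative")
    cost = tokens / 1_000_000 * unit_price_per_million
    return cost * 0.5 if batch else cost


def chinese_chars_to_token_range(chars: int) -> Tuple[int, int]:
    """Rough token range for Chinese text, using 1K chars ~ 1,200-1,500 tokens."""
    return (chars * 1200 // 1000, chars * 1500 // 1000)
```

Because the character-to-token ratio is a range, an estimate built on it should be reported as a range too, with the ratio stated as an assumption.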

Cost Estimation Disclaimer (MANDATORY)

🚨 CRITICAL, NO EXCEPTIONS: NEVER fabricate, invent, or guess any price figure. If you do not have a
confirmed price from references/pricing.md or the official pricing page, you MUST NOT output any number.
Instead, direct the user to the official pricing page.
Outputting a made-up price is a critical failure, worse than saying "I don't know."

When responding to any cost-related query — including but not limited to price evaluation, usage estimation, budget
forecasting, or cost comparison — you MUST append a professional disclaimer. This applies regardless of language or
response format.

Required disclaimer (Chinese response):

⚠️ 费用说明：以上费用为基于官方公示单价的预估价格，仅供参考。实际费用受 Token
消耗量、上下文长度阶梯定价、Batch/缓存折扣及计费策略调整等因素影响，请以QwenCloud控制台的实际账单为准。部分模型可能提供限时免费额度，但免费额度的可用性、额度量及有效期随时可能调整，请在控制台确认您的账户是否仍有剩余额度，切勿假设本次调用免费。最新定价详见模型定价页。

Required disclaimer (English response):

⚠️ Pricing Notice: The cost figures above are estimates calculated from officially published unit prices and
are provided for reference only. Actual charges depend on token consumption, tiered context-length pricing,
Batch/cache discounts, and billing policy updates. Some models may offer a time-limited free quota, but
quota availability, amounts, and validity periods are subject to change — do not assume this call is free. Please
verify your remaining quota in
the QwenCloud console and refer to the actual
bill for definitive costs. See Model Pricing for
the latest rates.

Rules:

- The disclaimer must appear at the end of every cost-related response, clearly separated from the main content.
- When the estimate involves assumptions (e.g., average tokens per character, assumed context length tier), explicitly state each assumption used in the calculation.
- Never present estimated costs as exact or guaranteed amounts. Use hedging language such as "approximately", "estimated at", "roughly" (or Chinese equivalents "约", "预估", "约合") throughout the cost breakdown.
- Never tell the user a call will be free or cost $0/¥0. Even if a free quota exists, the user may have already consumed it. Always present the paid price and note that a free quota may apply, subject to the user verifying in their console.
- If pricing data is unavailable or uncertain, say so explicitly and link to the official pricing page. Never fill the gap with a guess.
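
A minimal sketch of the first rule (disclaimer at the end, clearly separated), assuming the two required disclaimer texts are passed in verbatim by the caller. The function names are hypothetical, and choosing the Chinese text whenever the response contains CJK ideographs is a crude heuristic, not something this skill specifies.

```python
import re


def pick_disclaimer(response: str, disclaimer_en: str, disclaimer_zh: str) -> str:
    """Use the Chinese disclaimer if the response contains CJK ideographs (heuristic)."""
    return disclaimer_zh if re.search(r"[\u4e00-\u9fff]", response) else disclaimer_en


def append_disclaimer(response: str, disclaimer_en: str, disclaimer_zh: str) -> str:
    """Append the matching disclaimer after a blank line; do nothing if already present."""
    disclaimer = pick_disclaimer(response, disclaimer_en, disclaimer_zh)
    body = response.rstrip()
    if body.endswith(disclaimer):
        return body
    return f"{body}\n\n{disclaimer}"
```

The call is idempotent, so running it twice on the same response does not duplicate the notice.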

Available Models

All standard text, vision, image, video, audio, and coding models are available. Some models offer free
quota (verify in console).

- Text: qwen3-max, qwen3.5-plus, qwen3.5-flash, qwen-turbo, qwq-plus, qwen3-coder-next/plus/flash, qwen-plus-character, qwen-plus-character-ja, qwen-flash-character
- Vision: qwen3-vl-plus, qwen3-vl-flash, qvq-max, qwen-vl-ocr, qwen-vl-max, qwen-vl-plus
- Omni: qwen3-omni-flash (+ realtime), qwen-omni-turbo (+ realtime)
- Image generation (text-to-image): wan2.6-t2i, wan2.5-t2i-preview, wan2.2-t2i-flash, z-image-turbo
- Image editing (requires reference images): wan2.6-image, wan2.5-i2i-preview
- Video generation: wan2.6 series (t2v, i2v, i2v-flash, r2v, r2v-flash), wan2.5/2.2 series, vace
- TTS: qwen3-tts-flash, qwen3-tts-instruct-flash, cosyvoice-v3 series
- ASR: qwen3-asr-flash, fun-asr
- Embedding/Rerank: text-embedding-v4, qwen3-rerank
- Translation: qwen-mt-plus/flash/lite/turbo

⚠️ Important: The model list above is a point-in-time snapshot and may be outdated. Model availability
changes frequently. Always check the official model list
for the authoritative, up-to-date catalog before making model decisions.
See model-list.md for a more detailed local reference.

Thinking Mode

Several models support hybrid thinking/non-thinking modes:

| Model | Thinking Default | Notes |
| --- | --- | --- |
| qwen3.5-plus | On | Thinking enabled by default. Use `enable_thinking: false` to disable. |
| qwen3.5-flash | | |

Guidance: Do not enable thinking by default for simple or conversational tasks — it increases latency and output
token cost. Enable only when the user explicitly asks for deep reasoning or the task requires multi-step analysis.
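
The guidance above can be sketched as a request builder that sets `enable_thinking` explicitly instead of relying on the model default. This is an assumption-laden illustration: `build_request` is a hypothetical helper, and where `enable_thinking` belongs in a real request (top level or a nested options field) should be confirmed against the official API documentation.

```python
from typing import Any, Dict, List


def build_request(
    model: str,
    messages: List[Dict[str, str]],
    needs_deep_reasoning: bool = False,
) -> Dict[str, Any]:
    """Build a chat payload with thinking off unless deep reasoning is requested."""
    return {
        "model": model,
        "messages": messages,
        # Assumed placement of the flag; thinking adds latency and output tokens.
        "enable_thinking": bool(needs_deep_reasoning),
    }
```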

Anti-Patterns

- Only recommend models listed in this skill; never fabricate model names.
- When unsure, use qwen3.5-plus as a safe default for text tasks.
- 🚨 NEVER invent or guess any price figure. Only use pricing from references/pricing.md or the official pricing page. If the data is not available, say so and link to the official page. Fabricating a price is a critical failure.
- Always cite data source when providing pricing info.
- Default currency is USD. Check the official pricing page for the latest rates. Writing in Chinese does NOT imply CNY.
- All cost estimates must include the mandatory disclaimer and use hedging language ("approximately", "estimated"). Omitting the disclaimer is a critical violation.
- Never assume free quota is available; free quotas may have been consumed, expired, or removed. Never tell the user a call will cost $0/¥0. Always present the paid unit price first; mention free quota only as a possibility the user should verify in their console.
- Do NOT proactively fetch URLs or trigger web searches. Only access online sources when the user explicitly asks for "latest" data AND reference files cannot answer.
- Never output API keys in plaintext; see the Security section above.

Online Information Sources

Reference files in this skill are point-in-time snapshots. For official documentation URLs,
see sources.md.

Do NOT proactively fetch or search online. Only access online sources when ALL of these conditions are met:

1. The user explicitly asks for "latest", "real-time", or "current" data.
2. The reference files in this skill cannot answer the question.
3. The user confirms they want an online lookup.
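
The three conditions above combine with a logical AND, which can be sketched as a single gate. The function name `may_go_online` is hypothetical.

```python
def may_go_online(
    asked_for_latest: bool,
    references_can_answer: bool,
    user_confirmed: bool,
) -> bool:
    """Allow an online lookup only when all three conditions hold."""
    return asked_for_latest and not references_can_answer and user_confirmed
```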

Update Check (MANDATORY Post-Execution)

When the user asks to check for updates (e.g. "check for updates", "check version", "is there a new version", "update
skills"):

1. Find qwencloud-update-check: Look for qwencloud-update-check/SKILL.md in sibling skill directories.
2. If found, run: python3 <qwencloud-update-check-dir>/scripts/check_update.py --print-response and report the result. Use --force if the user asks to force-check.
3. If not found, you MUST present the following options and wait for the user's response:

The update-check skill is not installed. It keeps you informed about new qwencloud/qwencloud-ai features and bug fixes.
1) Install now — I'll run the install command for you
2) Skip for now — do nothing this time
3) Never remind me — permanently suppress this prompt

After the user responds:

- User chooses 1 → run: npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y, then run: python3 <this-skill-scripts-dir>/gossamer.py --dismiss
- User chooses 2 → run: python3 <this-skill-scripts-dir>/gossamer.py --dismiss, then continue normally
- User chooses 3 → run: python3 <this-skill-scripts-dir>/gossamer.py --never-install, then continue normally
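
The choice handling above can be sketched as a mapping from the user's answer to the commands to run. The function `commands_for_choice` is hypothetical; the command strings are copied from this section, and `scripts_dir` stands in for `<this-skill-scripts-dir>`. The sketch only builds the command strings and does not execute them.

```python
from typing import List

INSTALL_CMD = "npx skills add QwenCloud/qwencloud-ai --skill qwencloud-update-check -y"


def commands_for_choice(choice: int, scripts_dir: str) -> List[str]:
    """Return the shell commands for option 1, 2, or 3, in the order they should run."""
    script = f"python3 {scripts_dir}/gossamer.py"
    if choice == 1:
        return [INSTALL_CMD, f"{script} --dismiss"]
    if choice == 2:
        return [f"{script} --dismiss"]
    if choice == 3:
        return [f"{script} --never-install"]
    raise ValueError("choice must be 1, 2, or 3")
```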

References

- pricing.md: Pricing overview (model categories, billing units, and link to official pricing page)
- model-list.md: Model catalog (2026-03 snapshot; check official model list for latest)
- sources.md: Official documentation URLs (for manual lookup only)

