When to Use
User wants to work with MiniMax as a real multimodal platform, not as a vague brand mention. Agent handles model routing, API selection, compatible SDK caveats, speech generation, queued media jobs, MCP boundaries, and production-safe retry patterns.
Use this when the blocker is operational: wrong interface, wrong model tier, ignored parameters, broken polling loop, unsafe media upload, or poor routing across text, speech, video, and music tasks.
Architecture
Memory lives in ~/minimax/. If ~/minimax/ does not exist, run setup.md. See memory-template.md for structure.
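    ~/minimax/
    |-- memory.md            # durable context, activation boundaries, and approved defaults
    |-- routing.md           # model and interface choices that worked in practice
    |-- text-defaults.md     # text model pins, SDK compatibility notes, and parsing rules
    |-- speech-defaults.md   # voices, formats, latency, and consent-sensitive voice notes
    |-- media-jobs.md        # async video or music job patterns, polling, and output handling
    |-- mcp-notes.md         # approved MCP hosts, scopes, and rejection reasons
    `-- incidents.md         # rate limits, failed jobs, bad prompts, and recovery notes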
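
The existence check described above can be sketched roughly as follows. This is a minimal illustration only: the function name `memory_status` and the strings it returns are hypothetical and not part of the skill itself.

```python
from pathlib import Path


def memory_status(root: Path = Path.home() / "minimax") -> str:
    """Report which step applies: run setup first, or load existing memory."""
    if not root.is_dir():
        # No memory directory yet, so the setup guide comes first.
        return "run setup.md"
    return "load memory"
```

Passing the directory as an argument keeps the check easy to exercise against a temporary path instead of the real home directory.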
Quick Reference
Load only the file needed for the current blocker.
| Topic | File |
|---|---|
| Setup guide | setup.md |
| Memory template | memory-template.md |
| Model selection and routing | model-routing.md |
| Native, Anthropic-compatible, and OpenAI-compatible text flows | text-interfaces.md |
| Speech generation and audio delivery | speech-workflows.md |
| Video, music, and async media jobs | media-generation.md |
| MCP boundaries and orchestration choices | mcp-and-orchestration.md |
| Failure recovery and debugging | troubleshooting.md |
Requirements
- MINIMAX_API_KEY for direct MiniMax API usage.
- A client surface of choice: raw HTTP, an approved SDK, or an existing Anthropic-compatible or OpenAI-compatible integration.
- Explicit user approval before uploading private media, cloning or imitating a real person's voice, enabling remote MCP servers, or launching long-running paid generation jobs.
- Current model names, compatibility limits, and endpoint behavior must be verified against official MiniMax docs when the task depends on exact product surface.
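
A minimal sketch of failing early when the key is missing, so a misconfigured environment surfaces before any request is built. The helper names are hypothetical, and the bearer-token header shape is an assumption that should be confirmed against the official MiniMax docs.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Read MINIMAX_API_KEY and fail with a clear message if it is absent."""
    key = env.get("MINIMAX_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MINIMAX_API_KEY is not set; direct MiniMax API calls will fail")
    return key


def auth_headers(key: str) -> dict:
    # Assumption: bearer-token authorization; verify the exact header in the docs.
    return {"Authorization": f"Bearer {key}", "Content-Type": "application/json"}
```

Checking the key once at startup turns a late 401 into an immediate, readable configuration error.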
Operating Coverage
This skill treats MiniMax as an execution platform, not as a one-line provider swap. It covers:
- text generation through native MiniMax APIs and compatible SDK interfaces
- model routing across current text families such as MiniMax-M2.5, MiniMax-M2.5-highspeed, MiniMax-M2.1, MiniMax-M2.1-highspeed, and MiniMax-M2
- speech generation with synchronous HTTP and lower-latency endpoint choices
- queued media workflows for video and music where submit, poll, and fetch are separate phases
- MCP-aware workflows where tool access, host trust, and data scope must be explicit
- debugging around ignored parameters, malformed payloads, long queue times, rate limits, and output reproducibility
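
For the rate-limit case in particular, one possible retry posture is sketched below. It assumes standard HTTP semantics (429 and 5xx are worth retrying, 401 and other 4xx are not); the `send` callable and the status set are illustrative and not taken from MiniMax documentation.

```python
import random
import time

# Assumption: standard HTTP semantics, not MiniMax-specific error codes.
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() -> (status, body); retry only on retryable statuses."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status < 400:
            return body
        if status not in RETRYABLE or attempt == max_attempts:
            # Auth and payload errors will not fix themselves; surface them at once.
            raise RuntimeError(f"request failed with status {status} after {attempt} attempt(s)")
        # Exponential backoff with a little jitter to avoid synchronized retries.
        sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.25))
```

Injecting `sleep` keeps the loop testable without real delays.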
Data Storage
Keep only durable MiniMax operating context in ~/minimax/:
- which modalities the user actually uses: text, speech, video, music, or MCP-backed flows
- approved models, speed tiers, and compatibility interfaces that worked for real tasks
- output defaults such as JSON parsing rules, audio formats, polling intervals, and retry posture
- media safety rules, consent requirements, and budget boundaries the user explicitly approved
- repeated failures such as 401s, ignored params, queue stalls, or bad prompt templates
Core Rules
1. Lock the Modality and Deliverable First
- Start by naming the actual output: structured text, chat reply, narration audio, short video, song draft, or tool-augmented workflow.
- MiniMax is not one surface. The wrong modality choice creates wrong endpoints, wrong latency expectations, and wrong retry logic.
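
One way to make this rule mechanical is to refuse to proceed until the deliverable maps to a single modality. The sketch below uses the deliverable names listed above; the enum and the `lock_modality` helper are hypothetical.

```python
from enum import Enum


class Modality(Enum):
    TEXT = "text"
    SPEECH = "speech"
    VIDEO = "video"
    MUSIC = "music"
    MCP = "mcp"


# Deliverable names come from the rule above; the mapping itself is illustrative.
DELIVERABLES = {
    "structured text": Modality.TEXT,
    "chat reply": Modality.TEXT,
    "narration audio": Modality.SPEECH,
    "short video": Modality.VIDEO,
    "song draft": Modality.MUSIC,
    "tool-augmented workflow": Modality.MCP,
}


def lock_modality(deliverable: str) -> Modality:
    """Map a named deliverable to one modality, or refuse to guess."""
    try:
        return DELIVERABLES[deliverable.strip().lower()]
    except KeyError:
        raise ValueError(f"name the deliverable first; unknown: {deliverable!r}") from None
```

A vague request such as "use MiniMax" raises instead of silently defaulting to text.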
2. Choose Native Versus Compatible APIs Deliberately
- Use native MiniMax APIs when you need MiniMax-specific features or exact behavior.
- Use Anthropic-compatible or OpenAI-compatible interfaces only when the surrounding app already depends on those SDKs and the supported subset is good enough.
- Treat compatibility layers as narrower surfaces, not as feature-complete copies.
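
The decision above can be written down as a small function. This is a sketch of the rule as stated, with hypothetical argument names; it does not encode which features each compatibility layer actually supports.

```python
def choose_interface(existing_sdk, needs_minimax_features, subset_is_enough):
    """Pick 'native' or a compatible interface.

    existing_sdk is None, "anthropic", or "openai" (what the app already uses).
    """
    if needs_minimax_features or existing_sdk is None:
        # MiniMax-specific features or no SDK lock-in: go native.
        return "native"
    if existing_sdk not in ("anthropic", "openai"):
        raise ValueError(f"unknown SDK: {existing_sdk!r}")
    # Compatible layers are narrower, so they must be explicitly judged good enough.
    return f"{existing_sdk}-compatible" if subset_is_enough else "native"
```

Native is the default outcome; a compatible interface is chosen only when both conditions for it hold.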
3. Pin the Exact Model Family and Speed Tier
- Choose quality-first, speed-first, or fallback models explicitly instead of saying "use MiniMax."
- Current text routing should start with MiniMax-M2.5 or MiniMax-M2.5-highspeed, then step down only if latency, cost, or compatibility requires it.
- Re-check live docs before shipping hardcoded model lists because MiniMax updates its public surface frequently.
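
A pinned routing chain might look like the sketch below. The model names are the ones listed in this document; the fallback order and the `pick_model` helper are assumptions, and the lists should be re-checked against live docs before shipping.

```python
# Model names as listed in this document; the step-down order is an assumption.
QUALITY_FIRST = ["MiniMax-M2.5", "MiniMax-M2.1"]
SPEED_FIRST = ["MiniMax-M2.5-highspeed", "MiniMax-M2.1-highspeed"]


def pick_model(priority: str, unavailable=frozenset()) -> str:
    """Return the first pinned model for the priority that is not ruled out."""
    chains = {"quality": QUALITY_FIRST, "speed": SPEED_FIRST}
    if priority not in chains:
        raise ValueError("priority must be 'quality' or 'speed'")
    for name in chains[priority]:
        if name not in unavailable:
            return name
    raise LookupError(f"no pinned {priority}-first model is available")
```

Because the chain is explicit, the model that actually ran can be recorded by name rather than as "MiniMax".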
4. Separate Sync From Async Media Work
- Synchronous text and speech flows can often return in one request.
- Video and music generation usually need submit, poll, timeout, and fetch logic.
- Do not design a blocking one-shot workflow for media jobs that are inherently queued.
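
The submit, poll, timeout, and fetch phases can be kept separate with a loop like the one below. It is a sketch under stated assumptions: `submit`, `get_status`, and `fetch` are caller-supplied callables wrapping whatever requests the job needs, and the status strings "succeeded" and "failed" are placeholders rather than documented MiniMax values.

```python
import time


def run_media_job(submit, get_status, fetch, poll_interval=5.0, timeout=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Submit a queued job, poll until it finishes, then fetch the output."""
    job_id = submit()  # phase 1: submit and keep the job id
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)  # phase 2: poll
        if status == "succeeded":
            return fetch(job_id)  # phase 3: fetch the finished output
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if clock() >= deadline:
            # Give up on waiting, not on the job: the id can be polled again later.
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(poll_interval)
```

Raising on timeout with the job id in the message leaves a recovery path, which a single blocking request does not.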
5. Validate Media Rights, Inputs, and Formats Before Generation
- Confirm the user has rights to upload or transform any voice, lyrics, reference media, or branded assets.
- Validate format, duration, language, and output expectations before generating.
- Bad asset assumptions waste spend faster than bad prompts.
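
A pre-flight check along these lines can catch bad assets before any paid generation starts. The allowed extensions and duration limit below are placeholder values, not MiniMax limits, and the `MediaInput` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MediaInput:
    path: str
    duration_s: float
    rights_confirmed: bool


def validate_media(item, allowed_exts=(".wav", ".mp3"), max_duration_s=300.0):
    """Return a list of problems; an empty list means the asset may proceed."""
    problems = []
    if not item.rights_confirmed:
        problems.append("rights or consent not confirmed")
    if not item.path.lower().endswith(tuple(allowed_exts)):
        problems.append(f"unsupported format: {item.path}")
    if not 0 < item.duration_s <= max_duration_s:
        problems.append(f"duration {item.duration_s}s outside (0, {max_duration_s}]")
    return problems
```

Returning every problem at once avoids a fix-one-retry-one loop on the same asset.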
6. Make Cost and Trust Boundaries Explicit
- Multimodal runs can send prompts, media, and metadata off machine and can accumulate cost quickly.
- State which endpoint will receive which payload, and stop before remote MCP or large media uploads unless the user approved that path.
- Never normalize remote execution just because the API supports it.
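
The "state the payload, then stop without approval" rule could be enforced with a gate like this sketch. The approval labels `remote_mcp` and `large_upload` and the 25 MiB threshold are hypothetical choices for illustration.

```python
def plan_send(host, payload_kinds, approved, remote_mcp=False, upload_bytes=0,
              large_upload_threshold=25 * 1024 * 1024):
    """Describe what will be sent where, or stop if an approval is missing."""
    needed = set()
    if remote_mcp:
        needed.add("remote_mcp")
    if upload_bytes >= large_upload_threshold:
        needed.add("large_upload")
    missing = sorted(needed - set(approved))
    if missing:
        # Stop here: the user has not approved this path.
        raise PermissionError(f"stop: user approval missing for {', '.join(missing)}")
    return f"{host} will receive: {', '.join(payload_kinds)}"
```

The returned sentence is meant to be shown to the user before the request is made.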
7. Finish With a Reproducible Recipe
- A successful MiniMax run ends with the exact model, interface, key parameters, asset inputs, and polling behavior recorded clearly enough to rerun.
- If the output is fragile, capture the narrowest reproducible payload before changing prompts or models again.
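
A recipe record covering the fields named above might be captured as follows. The `RunRecipe` type and its field names are illustrative; the point is only that every item in the rule has a slot.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class RunRecipe:
    model: str
    interface: str
    parameters: dict = field(default_factory=dict)
    asset_inputs: list = field(default_factory=list)
    poll_interval_s: Optional[float] = None  # None for synchronous runs
    timeout_s: Optional[float] = None


def to_json(recipe: RunRecipe) -> str:
    """Serialize with sorted keys so two recipes diff cleanly."""
    return json.dumps(asdict(recipe), indent=2, sort_keys=True)
```

Sorted keys make it easy to compare a working payload against a fragile one before changing prompts or models.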
MiniMax Traps
- Treating every MiniMax feature as available through every SDK shim -> parameters get ignored and debugging starts from a false premise.
- Saying "use the MiniMax model" without pinning family or speed tier -> latency, quality, and cost drift across runs.
- Building media flows as one request and one response -> queued jobs hang or fail without usable recovery.
- Uploading sensitive media before clarifying rights or consent -> the technical workflow succeeds but the usage is unsafe.
- Assuming text defaults work for speech, video, or music -> prompts, payload shape, and validation rules diverge quickly.
- Blaming the model before checking payload schema, queue state, or output fetch logic -> operational bugs get mislabeled as generation quality problems.
- Letting MCP servers touch broad data without host review -> tool convenience becomes a trust leak.
External Endpoints
Only these endpoint categories are allowed unless the user explicitly approves more:
| Endpoint | Data Sent | Purpose |
|---|---|---|
| https://api.minimax.io | prompts, approved media inputs, generation parameters, and polling requests | Native MiniMax text, speech, media, and related API workflows |
| https://api-uw.minimax.io | approved speech payloads and generation parameters | Optional lower-TTFA speech endpoint when the user wants faster first audio |
| https://platform.minimax.io/docs | doc queries only | Verify current models, compatibility notes, and API behavior |
| https://{user-approved-mcp-host} | request payloads required by the approved MCP server | Optional MCP tool access beyond the local machine |
No other data is sent externally unless the user explicitly approves additional hosts or provider routes.
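
The allowlist above can be checked before any outbound request. This is a minimal sketch: the hosts are the ones in the table, while the function name and the way approved MCP hosts are passed in are assumptions.

```python
from urllib.parse import urlsplit

# Hosts taken from the endpoint table above.
API_HOSTS = {"api.minimax.io", "api-uw.minimax.io"}
DOCS_HOST = "platform.minimax.io"


def is_allowed(url: str, approved_mcp_hosts=frozenset()) -> bool:
    """Allow only HTTPS requests to listed hosts or user-approved MCP hosts."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.scheme != "https" or host is None:
        return False
    if host == DOCS_HOST:
        # Only the documentation path is listed for this host.
        return parts.path == "/docs" or parts.path.startswith("/docs/")
    return host in API_HOSTS or host in set(approved_mcp_hosts)
```

Matching on the exact hostname, rather than a substring, keeps lookalike domains such as `api.minimax.io.example` out.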
Security & Privacy
Data that leaves your machine:
- prompts and parameters sent to MiniMax API endpoints
- approved media assets or reference files only for the generation workflow the user requested
- optional MCP payloads only for user-approved MCP hosts
- optional documentation lookups against official MiniMax docs
Data that stays local:
- durable operating notes under ~/minimax/
- local prompt drafts, routing choices, and incident notes unless the user exports them
- any rejected or unused assets that never get uploaded
This skill does NOT:
- treat compatible SDKs as exact feature matches without verification
- upload private media, voice references, or lyrics without explicit user intent
- enable remote MCP or broad tool access without explicit approval
- claim that every MiniMax modality is synchronous or instantly available
- modify its own skill files
Trust
By using this skill, prompts and approved media may be sent to MiniMax services, plus any optional user-approved MCP hosts.
Only install if you trust those services with that data.
Scope
This skill ONLY:
- helps operate MiniMax text, speech, video, music, and MCP-related workflows safely
- routes tasks to the right model family, interface, and job pattern
- keeps durable notes for approved defaults, budget boundaries, and recurring failures
This skill NEVER:
- treats MiniMax as a generic provider drop-in without checking interface limits
- suggests voice imitation or media transformation without rights and consent checks
- blurs the line between local orchestration and remote MCP execution
- promises that queued media jobs behave like low-latency text calls
Related Skills
Install with clawhub install <slug> if the user confirms:
- ai - Compare MiniMax against other model providers before locking the stack.
- INLINECODE24 - Reuse structured HTTP, retry, and payload-debugging patterns around the MiniMax APIs.
- INLINECODE25 - Choose the right model family and fallback chain for quality, latency, and cost.
- INLINECODE26 - Extend MiniMax video work into broader multi-provider video routing.
- INLINECODE27 - Strengthen prompt and arrangement decisions when the task is specifically music-first.
Feedback
- If useful: INLINECODE28
- Stay updated: INLINECODE29