ElevenLabs API Skill (Advanced)
Purpose
Provide a production-oriented guide for using ElevenLabs APIs via direct HTTPS (no SDK requirement), with clear auth, safety, and workflow guidance.
Best fit
- You need text-to-speech or speech-to-speech conversion.
- You want realtime speech-to-text with low latency.
- You prefer direct HTTP calls with predictable outputs.
Not a fit
- You require a full SDK integration with helper utilities.
- You need full conversational agents beyond audio I/O.
Quick orientation
- Read references/elevenlabs-authentication.md for API keys and single-use tokens.
- Read references/elevenlabs-text-to-speech.md for TTS endpoints and payloads.
- Read references/elevenlabs-speech-to-speech.md for voice conversion.
- Read references/elevenlabs-speech-to-text-realtime.md for the realtime STT WebSocket.
- Read references/elevenlabs-text-to-dialogue.md for multi-voice dialogue output.
- Read references/elevenlabs-voices-models.md for voice IDs and model discovery.
- Read references/elevenlabs-safety-and-privacy.md for zero-retention and safety rules.
Required inputs
- API key (sent in the xi-api-key header), or a single-use token when needed.
- Voice IDs and model IDs for your target use case.
- Output format choice (audio codec/sample rate/bitrate).
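The inputs above map onto a single HTTPS request. The following is a minimal sketch that only builds the request and does not send it. The base URL, the `/v1/text-to-speech/{voice_id}` path, the `output_format` query parameter, and the `mp3_44100_128` value are assumptions to confirm against references/elevenlabs-text-to-speech.md; `build_tts_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL; confirm against the reference files before use.
API_BASE = "https://api.elevenlabs.io"


def build_tts_request(api_key, voice_id, model_id, text,
                      output_format="mp3_44100_128"):
    """Build (but do not send) a text-to-speech request.

    The endpoint path and the output_format value are assumptions.
    """
    query = urllib.parse.urlencode({"output_format": output_format})
    safe_voice_id = urllib.parse.quote(voice_id, safe="")
    url = f"{API_BASE}/v1/text-to-speech/{safe_voice_id}?{query}"
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,  # the API key travels in this header
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the request easy to inspect and test without network access.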
Expected output
- A clear workflow plan, endpoint checklist, and operational guardrails.
Operational notes
- Keep a strict allowlist for downstream destinations of audio output.
- Cache voice IDs and model IDs server-side.
- Keep payloads small, and retry with exponential backoff when the API throttles requests.
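Two of the notes above, the destination allowlist and retry with backoff, can be sketched as follows. This is one possible shape, not a prescribed implementation: the host names in `ALLOWED_HOSTS` are placeholders, both function names are hypothetical, and treating HTTP 429 as the throttling signal is an assumption to verify against the API's actual responses.

```python
import random
import time
from urllib.parse import urlsplit

# Placeholder hosts; replace with the destinations you actually trust.
ALLOWED_HOSTS = frozenset({"cdn.example.com", "storage.example.com"})


def is_allowed_destination(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for HTTPS URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call send() until it stops returning HTTP 429 or attempts run out.

    send is any zero-argument callable returning (status_code, body).
    The delay doubles after each throttled attempt, capped at max_delay,
    with random jitter added so clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 429 or attempt == max_attempts:
            return status, body
        delay = min(max_delay, base_delay * 2 ** (attempt - 1))
        sleep(delay + random.uniform(0, delay / 2))
```

Injecting `sleep` as a parameter keeps the retry loop testable without real waiting.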
Security notes
- Never log API keys or tokens.
- Use single-use tokens for client-side access.
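As a safety net for the first rule, a logging filter can mask known secrets before any handler sees them. This is a minimal sketch using Python's standard `logging` module; `RedactSecrets` is a hypothetical class name, and it only catches secrets that appear verbatim in the formatted message, so it complements rather than replaces keeping keys out of log calls.

```python
import logging


class RedactSecrets(logging.Filter):
    """Replace known secret values in log messages with a placeholder."""

    def __init__(self, secrets):
        super().__init__()
        # Skip empty strings so str.replace never matches everywhere.
        self._secrets = [s for s in secrets if s]

    def filter(self, record):
        # Format the message first so secrets passed as args are caught too.
        message = record.getMessage()
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg, record.args = message, None
        return True  # keep the record, now redacted
```

Attaching it with `handler.addFilter(RedactSecrets([api_key]))` applies the redaction to every record that passes through that handler.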