ElevenLabs Voice Personas v2.1
A comprehensive voice synthesis toolkit built on the ElevenLabs API.
🚀 First Run - Setup Wizard
When you first use this skill (no config.json exists), run the interactive setup wizard:
    python3 scripts/setup.py
The wizard will guide you through:
- API Key - Enter your ElevenLabs API key (required)
- Default Voice - Choose from popular voices (Rachel, Adam, Bella, etc.)
- Language - Set your preferred language (32 supported)
- Audio Quality - Standard or high quality output
- Cost Tracking - Enable usage and cost monitoring
- Budget Limit - Optional monthly spending cap
🔒 Privacy: Your API key is stored locally in config.json only. It never leaves your machine and is automatically excluded from git via .gitignore.
To reconfigure at any time, simply run the setup wizard again.
✨ Features
- 18 Voice Personas - Carefully curated voices for different use cases
- 32 Languages - Multi-language synthesis with the multilingual v2 model
- Streaming Mode - Real-time audio output as it generates
- Sound Effects (SFX) - AI-generated sound effects from text prompts
- Batch Processing - Process multiple texts in one go
- Cost Tracking - Monitor character usage and estimated costs
- Voice Design - Create custom voices from descriptions
- Pronunciation Dictionary - Custom word pronunciation rules
- OpenClaw Integration - Works with OpenClaw's built-in TTS
🎙 Available Voices
| Voice | Accent | Gender | Persona | Best For |
|---|---|---|---|---|
| rachel | 🇺🇸 US | female | warm | Conversations, tutorials |
| adam | 🇺🇸 US | male | narrator | Documentaries, audiobooks |
| bella | 🇺🇸 US | female | professional | Business, presentations |
| brian | 🇺🇸 US | male | comforting | Meditation, calm content |
| george | 🇬🇧 UK | male | storyteller | Audiobooks, storytelling |
| alice | 🇬🇧 UK | female | educator | Tutorials, explanations |
| callum | 🇺🇸 US | male | trickster | Playful, gaming |
| charlie | 🇦🇺 AU | male | energetic | Sports, motivation |
| jessica | 🇺🇸 US | female | playful | Social media, casual |
| lily | 🇬🇧 UK | female | actress | Drama, elegant content |
| matilda | 🇺🇸 US | female | professional | Corporate, news |
| river | 🇺🇸 US | neutral | neutral | Inclusive, informative |
| roger | 🇺🇸 US | male | casual | Podcasts, relaxed |
| daniel | 🇬🇧 UK | male | broadcaster | News, announcements |
| eric | 🇺🇸 US | male | trustworthy | Business, corporate |
| chris | 🇺🇸 US | male | friendly | Tutorials, approachable |
| will | 🇺🇸 US | male | optimist | Motivation, uplifting |
| liam | 🇺🇸 US | male | social | YouTube, social media |
🎯 Quick Presets
- default → rachel (warm, friendly)
- narrator → adam (documentaries)
- professional → matilda (corporate)
- storyteller → george (audiobooks)
- educator → alice (tutorials)
- calm → brian (meditation)
- energetic → liam (social media)
- trustworthy → eric (business)
- neutral → river (inclusive)
- british → george
- australian → charlie
- broadcaster → daniel (news)
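As a rough illustration of how a preset name could be turned into a voice name, here is a minimal sketch. The mapping is copied from the preset list in this document; the `resolve_voice` helper and its fallback behavior are assumptions made for the example, not the actual logic of scripts/tts.py.

```python
# Preset -> voice mapping, taken from the Quick Presets list.
PRESETS = {
    "default": "rachel",
    "narrator": "adam",
    "professional": "matilda",
    "storyteller": "george",
    "educator": "alice",
    "calm": "brian",
    "energetic": "liam",
    "trustworthy": "eric",
    "neutral": "river",
    "british": "george",
    "australian": "charlie",
    "broadcaster": "daniel",
}


def resolve_voice(name: str) -> str:
    """Hypothetical helper: map a preset to its voice.

    Names that are not presets are returned unchanged (lowercased),
    on the assumption that they are already voice names.
    """
    key = name.strip().lower()
    return PRESETS.get(key, key)


if __name__ == "__main__":
    print(resolve_voice("broadcaster"))
    print(resolve_voice("Rachel"))
```

With a lookup like this, `--voice broadcaster` and `--voice daniel` would end up selecting the same voice.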
🌍 Supported Languages (32)
The multilingual v2 model supports these languages:
| Code | Language | Code | Language |
|---|---|---|---|
| en | English | pl | Polish |
| de | German | nl | Dutch |
| es | Spanish | sv | Swedish |
| fr | French | da | Danish |
| it | Italian | fi | Finnish |
| pt | Portuguese | no | Norwegian |
| ru | Russian | tr | Turkish |
| uk | Ukrainian | cs | Czech |
| ja | Japanese | sk | Slovak |
| ko | Korean | hu | Hungarian |
| zh | Chinese | ro | Romanian |
| ar | Arabic | bg | Bulgarian |
| hi | Hindi | hr | Croatian |
| ta | Tamil | el | Greek |
| id | Indonesian | ms | Malay |
| vi | Vietnamese | th | Thai |
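    # Synthesize in German
    python3 tts.py --text "Guten Tag!" --voice rachel --lang de

    # Synthesize in French
    python3 tts.py --text "Bonjour le monde!" --voice adam --lang fr

    # List all languages
    python3 tts.py --languages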
💻 CLI Usage
Basic Text-to-Speech
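    # List all voices
    python3 scripts/tts.py --list

    # Generate speech
    python3 scripts/tts.py --text "Hello world" --voice rachel --output hello.mp3

    # Use a preset
    python3 scripts/tts.py --text "Breaking news..." --voice broadcaster --output news.mp3

    # Multi-language
    python3 scripts/tts.py --text "Bonjour!" --voice rachel --lang fr --output french.mp3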
Streaming Mode
Generate audio with real-time streaming (good for long texts):
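    # Stream audio in real time
    python3 scripts/tts.py --text "This is a long story..." --voice adam --stream

    # Stream to a custom output file
    python3 scripts/tts.py --text "Chapter one..." --voice george --stream --output chapter1.mp3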
Batch Processing
Process multiple texts from a file:
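    # From a newline-separated text file
    python3 scripts/tts.py --batch texts.txt --voice rachel --output-dir ./audio

    # From a JSON file
    python3 scripts/tts.py --batch batch.json --output-dir ./output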
JSON batch format:
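    [
      {"text": "First line", "voice": "rachel", "output": "line1.mp3"},
      {"text": "Second line", "voice": "adam", "output": "line2.mp3"},
      {"text": "Third line"}
    ]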
Simple text format (one per line):
    Hello, this is the first sentence.
    This is the second sentence.
    This is the third sentence.
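To make the two batch formats concrete, the sketch below normalizes either kind of file into a list of jobs. It is only an illustration: the `load_batch` name, the default voice, and the `batch_NNN.mp3` naming for entries without an `output` field are assumptions, and scripts/tts.py may handle these cases differently.

```python
import json
from pathlib import Path


def load_batch(path, default_voice="rachel"):
    """Hypothetical loader returning a list of {"text", "voice", "output"} dicts."""
    p = Path(path)
    raw = p.read_text(encoding="utf-8")

    if p.suffix.lower() == ".json":
        # JSON format: a list of objects; only "text" is required.
        entries = json.loads(raw)
    else:
        # Simple text format: one text per line, blank lines skipped.
        entries = [{"text": line.strip()} for line in raw.splitlines() if line.strip()]

    jobs = []
    for index, entry in enumerate(entries, start=1):
        jobs.append({
            "text": entry["text"],
            "voice": entry.get("voice", default_voice),        # assumed fallback
            "output": entry.get("output", f"batch_{index:03d}.mp3"),  # assumed naming
        })
    return jobs
```

Normalizing both formats into the same job shape keeps the synthesis loop identical regardless of which input file was used.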
Usage Statistics
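    # Show usage statistics and cost estimates
    python3 scripts/tts.py --stats

    # Reset statistics
    python3 scripts/tts.py --reset-stats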
🎵 Sound Effects (SFX)
Generate AI-powered sound effects from text descriptions:
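    # Generate a sound effect
    python3 scripts/sfx.py --prompt "Thunder rumbling in the distance"

    # Specify a duration (0.5-22 seconds)
    python3 scripts/sfx.py --prompt "Cat meowing" --duration 3 --output cat.mp3

    # Adjust prompt influence (0.0-1.0)
    python3 scripts/sfx.py --prompt "Footsteps on gravel" --influence 0.5

    # Batch-generate sound effects
    python3 scripts/sfx.py --batch sounds.json --output-dir ./sfx

    # Show example prompts
    python3 scripts/sfx.py --examples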
Example prompts:
- "Thunder rumbling in the distance"
- "Cat purring contentedly"
- "Typing on a mechanical keyboard"
- "Spaceship engine humming"
- "Coffee shop background chatter"
🎨 Voice Design
Create custom voices from text descriptions:
CODEBLOCK9
Voice Design Options:
| Option | Values |
|---|---|
| Gender | male, female, neutral |
| Age | young, middle_aged, old |
| Accent | american, british, african, australian, indian, latin, middle_eastern, scandinavian, eastern_european |
| Accent Strength | 0.3-2.0 (subtle to strong) |
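The option table can be read as a set of constraints. The following sketch checks a set of voice design options against them before any request is made. The `validate_design` function is hypothetical, and writing the accent identifiers with underscores (`middle_eastern`, `eastern_european`) is an assumption based on the `middle_aged` spelling; voice-design.py may validate differently or not at all.

```python
GENDERS = {"male", "female", "neutral"}
AGES = {"young", "middle_aged", "old"}
ACCENTS = {
    "american", "british", "african", "australian", "indian",
    "latin", "middle_eastern", "scandinavian", "eastern_european",
}
ACCENT_STRENGTH_RANGE = (0.3, 2.0)  # subtle to strong


def validate_design(gender, age, accent, accent_strength):
    """Hypothetical check: return a list of problems (empty if all options are valid)."""
    problems = []
    if gender not in GENDERS:
        problems.append(f"unknown gender: {gender!r}")
    if age not in AGES:
        problems.append(f"unknown age: {age!r}")
    if accent not in ACCENTS:
        problems.append(f"unknown accent: {accent!r}")
    low, high = ACCENT_STRENGTH_RANGE
    if not low <= accent_strength <= high:
        problems.append(f"accent strength {accent_strength} outside {low}-{high}")
    return problems


if __name__ == "__main__":
    print(validate_design("female", "young", "british", 1.0))
    print(validate_design("robot", "young", "british", 5.0))
```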
📖 Pronunciation Dictionary
Customize how words are pronounced:
Edit pronunciations.json:
CODEBLOCK10
Usage:
    # Pronunciations are applied automatically
    python3 scripts/tts.py --text "The OpenClaw API is great" --voice rachel

    # Disable pronunciations
    python3 scripts/tts.py --text "The API is great" --voice rachel --no-pronunciations
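One way a pronunciation dictionary can be applied is as a whole-word substitution pass over the text before synthesis. The sketch below assumes the rules are a flat mapping from a word to its respelling; the real layout of pronunciations.json is not shown here, and both the `apply_pronunciations` name and the sample rules are made up for illustration.

```python
import re


def apply_pronunciations(text, rules):
    """Hypothetical helper: replace whole-word, case-insensitive matches with respellings."""
    if not rules:
        return text
    # Longer words first so that overlapping entries prefer the longest match.
    words = sorted(rules, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in words) + r")\b",
        re.IGNORECASE,
    )
    lookup = {word.lower(): respelling for word, respelling in rules.items()}
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


if __name__ == "__main__":
    sample_rules = {"API": "A P I", "OpenClaw": "Open Claw"}  # invented sample rules
    print(apply_pronunciations("The OpenClaw API is great", sample_rules))
```

The word-boundary anchors keep a rule for "API" from touching words such as "rapid" that merely contain the same letters.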
💰 Cost Tracking
The skill tracks your character usage and estimates costs:
CODEBLOCK12
Output:
📊 ElevenLabs Usage Statistics
Total Characters: 15,230
Total Requests: 42
Since: 2024-01-15
💰 Estimated Costs:
Starter $4.57 ($0.30/1k chars)
Creator $3.66 ($0.24/1k chars)
Pro $2.74 ($0.18/1k chars)
Scale $1.68 ($0.11/1k chars)
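The estimates in this sample output follow from multiplying the character count by each plan's per-1k rate. The short sketch below reproduces that arithmetic using the rates printed above; the `estimate_costs` function is hypothetical, and the rates are only those shown in the sample, not a statement of current ElevenLabs pricing.

```python
# Per-1,000-character rates as printed in the sample output.
RATES_PER_1K = {"Starter": 0.30, "Creator": 0.24, "Pro": 0.18, "Scale": 0.11}


def estimate_costs(total_chars):
    """Hypothetical helper: estimated cost per plan, rounded to cents."""
    return {
        plan: round(total_chars / 1000 * rate, 2)
        for plan, rate in RATES_PER_1K.items()
    }


if __name__ == "__main__":
    for plan, cost in estimate_costs(15_230).items():
        print(f"{plan:<8} ${cost:.2f}")
```

For 15,230 characters this gives 15.23 × 0.30 ≈ 4.57 for Starter, and likewise 3.66, 2.74, and 1.68 for the other plans, matching the figures above.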
🤖 OpenClaw TTS Integration
Using with OpenClaw's Built-in TTS
OpenClaw has built-in TTS support that can use ElevenLabs. Configure in ~/.openclaw/openclaw.json:
CODEBLOCK14
Triggering TTS in Chat
In OpenClaw conversations:
- Use /tts on to enable automatic TTS
- Use the tts tool directly for one-off speech
- Request "read this aloud" or "speak this"
Using Skill Scripts from OpenClaw
CODEBLOCK15
⚙ Configuration
The scripts look for the API key in this order:
1. ELEVEN_API_KEY or ELEVENLABS_API_KEY environment variable
2. Skill-local .env file (in the skill directory)
Create a .env file:
CODEBLOCK16
Note: The skill no longer reads from ~/.openclaw/openclaw.json. Use environment variables or the skill-local .env file.
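The lookup order described in this section could be implemented roughly as follows. This is a sketch under assumptions: the `find_api_key` and `parse_env_file` helpers are hypothetical, and the .env file is assumed to hold `NAME=value` lines using the same two variable names.

```python
import os
from pathlib import Path

KEY_NAMES = ("ELEVEN_API_KEY", "ELEVENLABS_API_KEY")


def parse_env_file(path):
    """Read NAME=value pairs from a .env file; a missing file yields an empty dict."""
    values = {}
    p = Path(path)
    if not p.is_file():
        return values
    for line in p.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip().strip("'\"")
    return values


def find_api_key(skill_dir=".", environ=None):
    """Hypothetical lookup: environment variables first, then the skill-local .env file."""
    env = os.environ if environ is None else environ
    for name in KEY_NAMES:
        if env.get(name):
            return env[name]
    file_values = parse_env_file(Path(skill_dir) / ".env")
    for name in KEY_NAMES:
        if file_values.get(name):
            return file_values[name]
    return None  # caller decides how to report a missing key
```

Checking the environment first means a key exported in the shell overrides whatever is stored in the skill directory.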
🎛 Voice Settings
Each voice has tuned settings for optimal output:
| Setting | Range | Description |
|---|---|---|
| stability | 0.0-1.0 | Higher = consistent, lower = expressive |
| similarity_boost | 0.0-1.0 | How closely to match original voice |
| style | 0.0-1.0 | Exaggeration of speaking style |
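Since all three settings share the 0.0-1.0 range, custom values can be clamped into range before use. The sketch below shows one way to do that; the `clamp_settings` helper is an assumption for illustration and is not described as part of the skill's scripts.

```python
def clamp_settings(stability, similarity_boost, style):
    """Hypothetical helper: force each setting into the documented 0.0-1.0 range."""

    def clamp(value):
        return max(0.0, min(1.0, float(value)))

    return {
        "stability": clamp(stability),
        "similarity_boost": clamp(similarity_boost),
        "style": clamp(style),
    }


if __name__ == "__main__":
    print(clamp_settings(1.4, -0.2, 0.5))
```

Clamping silently corrects out-of-range input; rejecting it with an error would be an equally reasonable choice.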
📝 Triggers
- "use {voice_name} voice"
- "speak as {persona}"
- "list voices"
- "voice settings"
- "generate sound effect"
- "design a voice"
📁 Files
CODEBLOCK17
📋 Changelog
v2.1.0
- Added interactive setup wizard (scripts/setup.py)
- Onboarding guides through API key, voice, language, quality, and budget settings
- Config stored locally in config.json (added to .gitignore)
- Professional, privacy-focused setup experience
v2.0.0
- Added 32 language support with --lang parameter
- Added streaming mode with --stream flag
- Added sound effects generation (sfx.py)
- Added batch processing with --batch flag
- Added cost tracking with --stats flag
- Added voice design tool (voice-design.py)
- Added pronunciation dictionary support
- Added OpenClaw TTS integration documentation
- Improved error handling and progress output
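ElevenLabs Voice Personas v2.1
A comprehensive voice synthesis toolkit built on the ElevenLabs API.
🚀 First Run - Setup Wizard
When you first use this skill (no config.json exists), run the interactive setup wizard:
    python3 scripts/setup.py
The wizard will guide you through:
- API Key - Enter your ElevenLabs API key (required)
- Default Voice - Choose from popular voices (Rachel, Adam, Bella, etc.)
- Language - Set your preferred language (32 supported)
- Audio Quality - Standard or high quality output
- Cost Tracking - Enable usage and cost monitoring
- Budget Limit - Optional monthly spending cap
🔒 Privacy: Your API key is stored locally in config.json only. It never leaves your machine and is automatically excluded from git via .gitignore.
To reconfigure at any time, run the setup wizard again.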
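✨ Features
- 18 Voice Personas - Carefully curated voices for different use cases
- 32 Languages - Multi-language synthesis with the multilingual v2 model
- Streaming Mode - Real-time audio output as it generates
- Sound Effects (SFX) - AI-generated sound effects from text prompts
- Batch Processing - Process multiple texts in one go
- Cost Tracking - Monitor character usage and estimated costs
- Voice Design - Create custom voices from descriptions
- Pronunciation Dictionary - Custom word pronunciation rules
- OpenClaw Integration - Works with OpenClaw's built-in TTS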
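🎙 Available Voices
| Voice | Accent | Gender | Persona | Best For |
|---|---|---|---|---|
| rachel | 🇺🇸 US | female | warm | Conversations, tutorials |
| adam | 🇺🇸 US | male | narrator | Documentaries, audiobooks |
| bella | 🇺🇸 US | female | professional | Business, presentations |
| brian | 🇺🇸 US | male | comforting | Meditation, calm content |
| george | 🇬🇧 UK | male | storyteller | Audiobooks, storytelling |
| alice | 🇬🇧 UK | female | educator | Tutorials, explanations |
| callum | 🇺🇸 US | male | trickster | Playful, gaming |
| charlie | 🇦🇺 AU | male | energetic | Sports, motivation |
| jessica | 🇺🇸 US | female | playful | Social media, casual |
| lily | 🇬🇧 UK | female | actress | Drama, elegant content |
| matilda | 🇺🇸 US | female | professional | Corporate, news |
| river | 🇺🇸 US | neutral | neutral | Inclusive, informative |
| roger | 🇺🇸 US | male | casual | Podcasts, relaxed |
| daniel | 🇬🇧 UK | male | broadcaster | News, announcements |
| eric | 🇺🇸 US | male | trustworthy | Business, corporate |
| chris | 🇺🇸 US | male | friendly | Tutorials, approachable |
| will | 🇺🇸 US | male | optimist | Motivation, uplifting |
| liam | 🇺🇸 US | male | social | YouTube, social media |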
🎯 Quick Presets
- default → rachel (warm, friendly)
- narrator → adam (documentaries)
- professional → matilda (corporate)
- storyteller → george (audiobooks)
- educator → alice (tutorials)
- calm → brian (meditation)
- energetic → liam (social media)
- trustworthy → eric (business)
- neutral → river (inclusive)
- british → george
- australian → charlie
- broadcaster → daniel (news)
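🌍 Supported Languages (32)
The multilingual v2 model supports these languages:
| Code | Language | Code | Language |
|---|---|---|---|
| en | English | pl | Polish |
| de | German | nl | Dutch |
| es | Spanish | sv | Swedish |
| fr | French | da | Danish |
| it | Italian | fi | Finnish |
| pt | Portuguese | no | Norwegian |
| ru | Russian | tr | Turkish |
| uk | Ukrainian | cs | Czech |
| ja | Japanese | sk | Slovak |
| ko | Korean | hu | Hungarian |
| zh | Chinese | ro | Romanian |
| ar | Arabic | bg | Bulgarian |
| hi | Hindi | hr | Croatian |
| ta | Tamil | el | Greek |
| id | Indonesian | ms | Malay |
| vi | Vietnamese | th | Thai |
    # Synthesize in German
    python3 tts.py --text "Guten Tag!" --voice rachel --lang de

    # Synthesize in French
    python3 tts.py --text "Bonjour le monde!" --voice adam --lang fr

    # List all languages
    python3 tts.py --languages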
💻 CLI Usage
Basic Text-to-Speech
    # List all voices
    python3 scripts/tts.py --list

    # Generate speech
    python3 scripts/tts.py --text "Hello world" --voice rachel --output hello.mp3

    # Use a preset
    python3 scripts/tts.py --text "Breaking news..." --voice broadcaster --output news.mp3

    # Multi-language
    python3 scripts/tts.py --text "Bonjour!" --voice rachel --lang fr --output french.mp3
Streaming Mode
Generate audio with real-time streaming (good for long texts):
    # Stream audio in real time
    python3 scripts/tts.py --text "This is a long story..." --voice adam --stream

    # Stream to a custom output file
    python3 scripts/tts.py --text "Chapter one..." --voice george --stream --output chapter1.mp3
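Batch Processing
Process multiple texts from a file:
    # From a newline-separated text file
    python3 scripts/tts.py --batch texts.txt --voice rachel --output-dir ./audio

    # From a JSON file
    python3 scripts/tts.py --batch batch.json --output-dir ./output
JSON batch format:
    [
      {"text": "First line", "voice": "rachel", "output": "line1.mp3"},
      {"text": "Second line", "voice": "adam", "output": "line2.mp3"},
      {"text": "Third line"}
    ]
Simple text format (one per line):
    Hello, this is the first sentence.
    This is the second sentence.
    This is the third sentence.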
Usage Statistics
    # Show usage statistics and cost estimates
    python3 scripts/tts.py --stats

    # Reset statistics
    python3 scripts/tts.py --reset-stats
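🎵 Sound Effects (SFX)
Generate AI-powered sound effects from text descriptions:
    # Generate a sound effect
    python3 scripts/sfx.py --prompt "Thunder rumbling in the distance"

    # Specify a duration (0.5-22 seconds)
    python3 scripts/sfx.py --prompt "Cat meowing" --duration 3 --output cat.mp3

    # Adjust prompt influence (0.0-1.0)
    python3 scripts/sfx.py --prompt "Footsteps on gravel" --influence 0.5

    # Batch-generate sound effects
    python3 scripts/sfx.py --batch sounds.json --output-dir ./sfx

    # Show example prompts
    python3 scripts/sfx.py --examples
Example prompts:
- "Thunder rumbling in the distance"
- "Cat purring contentedly"
- "Typing on a mechanical keyboard"
- "Spaceship engine humming"
- "Coffee shop background chatter"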
🎨 Voice Design
Create custom voices from text descriptions:
    # Basic voice design
    python3 scripts/voice-design.py --gender female