iMessage Voice Reply

Generate and send native iMessage voice messages using local Kokoro TTS. Voice messages appear as inline playable bubbles with waveforms — identical to voice messages recorded in Messages.app.

How It Works

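    Your text reply → Kokoro TTS (local) → afconvert (native Apple encoder) → CAF/Opus → BlueBubbles → iMessage voice bubble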

Setup

    bash ${baseDir}/scripts/setup.sh

Installs: kokoro-onnx, soundfile, numpy. Downloads Kokoro models (~136MB) to ~/.cache/kokoro-onnx/.

Requires: BlueBubbles channel configured in OpenClaw (channels.bluebubbles).

Generating and Sending a Voice Reply

Step 1: Generate audio

Write the response text to a temp file, then pass it via --text-file to avoid shell injection:

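    echo "Your response text here" > /tmp/voice_text.txt
    ${baseDir}/.venv/bin/python ${baseDir}/scripts/generate_voice_reply.py --text-file /tmp/voice_text.txt --output /tmp/voice_reply.caf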

Alternatively, pass text directly (ensure proper shell escaping):

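    ${baseDir}/.venv/bin/python ${baseDir}/scripts/generate_voice_reply.py --text "Your response text here" --output /tmp/voice_reply.caf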

Options:

- --voice af_heart: Kokoro voice (default: af_heart)
- --speed 1.15: Playback speed (default: 1.15)
- --lang en-us: Language code (default: en-us)
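The invocation and options above can also be assembled programmatically. The following is a minimal sketch, not the skill's own code: the helper name `build_voice_command` and the example base directory are hypothetical, while the script path, flags, and defaults are taken from the text above.

```python
from pathlib import PurePosixPath


def build_voice_command(base_dir, text_file, output,
                        voice="af_heart", speed=1.15, lang="en-us"):
    """Return the argv list for generate_voice_reply.py (hypothetical helper).

    The list form is meant for subprocess.run without shell=True, so no
    element is ever re-parsed by a shell.
    """
    base = PurePosixPath(base_dir)
    return [
        str(base / ".venv" / "bin" / "python"),
        str(base / "scripts" / "generate_voice_reply.py"),
        "--text-file", str(text_file),
        "--output", str(output),
        "--voice", voice,
        "--speed", str(speed),
        "--lang", lang,
    ]


if __name__ == "__main__":
    # "/opt/skill" stands in for ${baseDir}; it is an assumed example path.
    print(build_voice_command("/opt/skill",
                              "/tmp/voice_text.txt",
                              "/tmp/voice_reply.caf"))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would run the generator with the defaults listed above.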

Security note: The Python script uses argparse and subprocess.run with list arguments (no shell=True). Input is handled safely within the script. When calling from a shell, prefer --text-file for untrusted input to avoid shell metacharacter issues.
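To illustrate the security note, here is a small sketch of the two safe patterns it describes: writing untrusted text to a temporary file for --text-file, and passing text as a single list element so no shell ever parses it. The function names are hypothetical, and the child process is a stand-in Python one-liner rather than the real generator script.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def write_text_file(text: str) -> Path:
    """Write untrusted text to a temp file suitable for --text-file."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(text)
    return Path(handle.name)


def echo_via_list_args(text: str) -> str:
    """Pass text as one argv element and return what the child received.

    Because the arguments are a list and shell=True is not used,
    characters such as $, ;, and backticks arrive unchanged.
    """
    result = subprocess.run(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.argv[1])", text],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    tricky = 'Hi $(whoami); echo "x" && `date`'
    print(echo_via_list_args(tricky) == tricky)
    print(write_text_file(tricky))
```

The same text typed unquoted on a shell command line would be subject to expansion, which is why the file-based route is preferred for untrusted input.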

Step 2: Send via BlueBubbles

Use the message tool:

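    {
      "action": "sendAttachment",
      "channel": "bluebubbles",
      "target": "+1XXXXXXXXXX",
      "path": "/tmp/voice_reply.caf",
      "filename": "Audio Message.caf",
      "contentType": "audio/x-caf",
      "asVoice": true
    }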

Critical parameters for native voice bubble:

- filename must be "Audio Message.caf"
- contentType must be audio/x-caf
- asVoice must be true

All three are required for iMessage to render the message as an inline voice bubble with waveform instead of a file attachment.
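A pre-send check can catch a missing parameter before the message degrades into a file attachment. The sketch below is illustrative only: `build_voice_payload` and `missing_voice_fields` are hypothetical helpers, and the payload shape simply mirrors the fields named in this section rather than any confirmed BlueBubbles schema.

```python
import json

# The three values this document says are required for a native voice bubble.
REQUIRED_VOICE_FIELDS = {
    "filename": "Audio Message.caf",
    "contentType": "audio/x-caf",
    "asVoice": True,
}


def build_voice_payload(target: str, path: str) -> dict:
    """Assemble the message-tool arguments for a voice attachment."""
    payload = {
        "action": "sendAttachment",
        "channel": "bluebubbles",
        "target": target,
        "path": path,
    }
    payload.update(REQUIRED_VOICE_FIELDS)
    return payload


def missing_voice_fields(payload: dict) -> list:
    """Return the required fields that are absent or have the wrong value."""
    return [key for key, value in REQUIRED_VOICE_FIELDS.items()
            if payload.get(key) != value]


if __name__ == "__main__":
    payload = build_voice_payload("+1XXXXXXXXXX", "/tmp/voice_reply.caf")
    print(json.dumps(payload, indent=2))
    print(missing_voice_fields(payload))
```

An empty list from `missing_voice_fields` means all three critical parameters are set as described above.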

Voice Options

Language	Female	Male
English	af_heart ⭐	am_puck
Spanish

When to Reply with Voice

Reply with a voice message when:

- The user sent you a voice message (voice-for-voice)
- The user explicitly asks for an audio/voice response

Always include a text reply alongside the voice message for accessibility.
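The reply policy above can be expressed as a small decision helper. This is a sketch under the assumption that the caller already knows whether the incoming message was a voice message and whether the user asked for audio; both function names are hypothetical.

```python
def should_reply_with_voice(incoming_is_voice: bool,
                            user_asked_for_audio: bool) -> bool:
    """Voice-for-voice, or voice on explicit request."""
    return incoming_is_voice or user_asked_for_audio


def plan_reply(incoming_is_voice: bool, user_asked_for_audio: bool) -> list:
    """Return the reply parts to send; text is always included."""
    parts = ["text"]
    if should_reply_with_voice(incoming_is_voice, user_asked_for_audio):
        parts.append("voice")
    return parts


if __name__ == "__main__":
    print(plan_reply(incoming_is_voice=True, user_asked_for_audio=False))
    print(plan_reply(incoming_is_voice=False, user_asked_for_audio=False))
```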

Audio Format

- macOS: CAF container, Opus codec, 48kHz mono, 32kbps — encoded by Apple's native afconvert. Identical to what Messages.app produces.
- Fallback: MP3 via ffmpeg (works but may not render as native voice bubble on all iMessage versions).
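The choice between the native encoder and the fallback might look like the sketch below. It is an assumption about how such a selection could be made, not the script's actual logic: `choose_encoder` is a hypothetical function, and the `platform` and `which` parameters exist only so the behavior can be exercised without macOS.

```python
import shutil
import sys


def choose_encoder(platform=sys.platform, which=shutil.which):
    """Pick (encoder, file extension) following the format notes above.

    Prefers afconvert on macOS (CAF/Opus); otherwise falls back to
    ffmpeg (MP3). Returns None when neither tool is on PATH.
    """
    if platform == "darwin" and which("afconvert"):
        return ("afconvert", ".caf")
    if which("ffmpeg"):
        return ("ffmpeg", ".mp3")
    return None


if __name__ == "__main__":
    print(choose_encoder())
```

Only the CAF result is expected to render as a native voice bubble, so a caller may want to warn when the fallback is selected.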

Cost

$0. Kokoro TTS runs entirely locally. No API calls for voice generation.

Troubleshooting

Voice message shows as file attachment — Ensure all three parameters are set: filename="Audio Message.caf", contentType="audio/x-caf", asVoice=true.

First word clipped — The script prepends 150ms silence automatically. If still clipped, increase the silence pad in the script.
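For reference, a silence pad of this kind amounts to prepending zero-valued samples. The sketch below shows the arithmetic with plain Python lists; it is not the script's implementation, the function name is hypothetical, and the 24000 Hz rate in the example is an assumed value for illustration.

```python
def prepend_silence(samples, sample_rate, pad_ms=150):
    """Return samples with pad_ms of leading silence (zeros)."""
    pad_samples = sample_rate * pad_ms // 1000
    return [0.0] * pad_samples + list(samples)


if __name__ == "__main__":
    # Assumed example: two samples at 24000 Hz with the default 150 ms pad.
    padded = prepend_silence([0.5, -0.5], 24000)
    print(len(padded))
```

Raising `pad_ms` (for example to 250) is the equivalent of increasing the silence pad mentioned above.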

Kokoro model not found — Run bash ${baseDir}/scripts/setup.sh.

afconvert not found — Only available on macOS. Script falls back to ffmpeg/MP3 on Linux.
