Voice Translate

Use this skill for two closely related modes:

1. Chat-native mode: the user sends audio or a voice note in OpenClaw; return transcript text, translation text, and translated audio.
2. Local pipeline mode: run a deterministic file-based pipeline that writes transcript, translation, wav, and metadata artifacts.

Default to an LLM-assisted translation workflow: let the current agent produce the translation, then save it to a file when using the local pipeline, or return it directly in the current agent turn when responding in chat.

Workflow

A. Chat-native mode

Use this when an inbound message already contains an audio transcript from OpenClaw media understanding, or when the user asks you to process a voice message conversationally.

1. Detect that the user sent audio or that the request is for voice translation.
2. Obtain or confirm the transcript text.
3. Translate with the current model.
4. Send the transcript text to the user.
5. Send the translated text to the user.
6. Synthesize the translated text as audio:
   - prefer the OpenClaw tts tool when you need an immediate chat reply with audio
   - prefer Piper when you need a local wav artifact
7. Keep the output order stable: transcript first, translation second, audio last.
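
The ordering rule in the steps above can be sketched in a few lines of Python. This is a minimal illustration only: Reply, build_replies, and the translate and synthesize callables are hypothetical names, standing in for the current model and for whichever TTS tool is used. It does not describe how OpenClaw delivers messages.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Reply:
    kind: str        # "transcript", "translation", or "audio"
    payload: object  # text for the first two kinds, audio bytes for the last


def build_replies(
    transcript: str,
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> List[Reply]:
    """Return chat replies in the fixed order: transcript, translation, audio."""
    translation = translate(transcript)
    audio = synthesize(translation)
    return [
        Reply("transcript", transcript),
        Reply("translation", translation),
        Reply("audio", audio),
    ]


if __name__ == "__main__":
    # Stand-in callables; a real run would call the current model and a TTS tool.
    replies = build_replies("你好", lambda text: "Hello", lambda text: b"\x00\x00")
    print([reply.kind for reply in replies])
```

Building the full list before sending anything makes it harder to emit audio ahead of the text replies by accident.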

B. Local pipeline mode

1. Confirm input/output expectations: source language, target language, output directory, and whether the run should be real or mock.
2. Choose backends:
   - faster-whisper for real transcription, mock for pipeline testing.
   - llm as the default translation path when an agent/model is available.
   - service only when unattended HTTP translation is preferable.
   - manual only as a fallback.
   - piper for real TTS, mock for dry-run testing.
3. Run transcription.
4. If using the default llm path, read the transcript and translate it with the current model. Save the translated text to a file.
5. Run synthesis/output writing with --translation-file.
6. Inspect outputs:
   - 01_transcript.txt
   - 02_translation.txt
   - 03_translation.wav
   - result.json
7. If the user wants chat updates during processing, pass notifier commands with --transcript-command, --translation-command, and --audio-command.
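
The "inspect outputs" step could be automated with a small check like the one below. It is a sketch under assumptions: missing_artifacts is a hypothetical helper, not part of the skill's scripts, and treating an empty file as missing is a choice made here rather than documented pipeline behavior. Only the four file names come from the list above.

```python
from pathlib import Path
from typing import List

# File names taken from the output list above.
EXPECTED_ARTIFACTS = (
    "01_transcript.txt",
    "02_translation.txt",
    "03_translation.wav",
    "result.json",
)


def missing_artifacts(output_dir: str) -> List[str]:
    """Return the expected artifact names that are absent or empty in output_dir."""
    root = Path(output_dir)
    missing = []
    for name in EXPECTED_ARTIFACTS:
        path = root / name
        # Assumption: a zero-byte file counts as a failed stage.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_artifacts("./outputs/mock-run")
    print("all artifacts present" if not problems else f"missing: {problems}")
```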

Preferred execution patterns

Default LLM-assisted path

Use this when the agent handling the task can translate the transcript itself.

1. Run the pipeline once transcription is available, or run the full command after preparing translation.txt.
2. Save the model-produced translation to a file.
3. Invoke:

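```bash
bash scripts/run_voice_translate_llm.sh \
  /path/to/input.m4a \
  ./outputs/llm-run \
  zh \
  en \
  /path/to/en_US-lessac-medium.onnx \
  ./translation.txt \
  --whisper-model small \
  --transcribe-backend faster-whisper \
  --tts-backend piper
```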

Read references/llm-translation-pattern.md when you need the exact orchestration pattern or a reusable translation prompt.
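
An agent driving this path from Python might save the translation and assemble the wrapper call roughly as follows. This is a sketch, not the skill's own tooling: save_translation and build_llm_command are hypothetical helpers, and the positional order (input, output directory, source language, target language, Piper model, translation file) is assumed from the wrapper invocation in this document.

```python
from pathlib import Path
from typing import List, Sequence


def save_translation(text: str, path: str) -> Path:
    """Write the model-produced translation as UTF-8 with a single trailing newline."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text.strip() + "\n", encoding="utf-8")
    return target


def build_llm_command(
    input_path: str,
    output_dir: str,
    source_lang: str,
    target_lang: str,
    piper_model: str,
    translation_file: str,
    extra_args: Sequence[str] = (),
) -> List[str]:
    """Assemble the argv for the wrapper script (positional order is an assumption)."""
    return [
        "bash",
        "scripts/run_voice_translate_llm.sh",
        input_path,
        output_dir,
        source_lang,
        target_lang,
        piper_model,
        translation_file,
        *extra_args,
    ]


if __name__ == "__main__":
    translation_path = save_translation("Hello, world.", "./translation.txt")
    command = build_llm_command(
        "/path/to/input.m4a", "./outputs/llm-run", "zh", "en",
        "/path/to/en_US-lessac-medium.onnx", str(translation_path),
        ["--tts-backend", "piper"],
    )
    # Pass `command` to subprocess.run(command, check=True) to execute it.
    print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.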

Mock end-to-end validation

Use this first when you need to validate the pipeline structure without model/runtime dependencies.

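```bash
python3 scripts/run_voice_translate.py \
  --input references/examples/mock-input.txt \
  --output-dir ./outputs/mock-run \
  --source-lang zh \
  --target-lang en \
  --transcribe-backend mock \
  --translation-file ./translated.txt \
  --translation-backend llm \
  --no-interactive-translate \
  --tts-backend mock \
  --piper-model ./dummy.onnx
```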

Notes:

- mock transcription reads plain text from the input file.
- mock TTS writes a silent wav file.
- --piper-model is still required by the current CLI shape even when using mock TTS; use any placeholder path.
- llm mode currently means the translation must already exist in --translation-file.
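
For readers unsure what a silent wav placeholder involves, the sketch below writes one with Python's standard wave module. It illustrates the idea only and is not the pipeline's mock TTS code: write_silent_wav, the one-second duration, and the 16 kHz mono 16-bit format are all assumptions.

```python
import wave


def write_silent_wav(path: str, seconds: float = 1.0, sample_rate: int = 16000) -> int:
    """Write mono 16-bit PCM silence to path and return the number of frames written."""
    frame_count = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)        # mono
        wav_file.setsampwidth(2)        # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        # Each silent 16-bit sample is two zero bytes.
        wav_file.writeframes(b"\x00\x00" * frame_count)
    return frame_count


if __name__ == "__main__":
    frames = write_silent_wav("./silent.wav", seconds=0.5)
    print(f"wrote {frames} silent frames")
```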

Service fallback

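```bash
python3 scripts/run_voice_translate.py \
  --input /path/to/input.m4a \
  --output-dir ./outputs/service-run \
  --source-lang zh \
  --target-lang en \
  --whisper-model small \
  --transcribe-backend faster-whisper \
  --translation-backend service \
  --translation-service-url http://127.0.0.1:8000/translate \
  --tts-backend piper \
  --piper-model /path/to/en_US-lessac-medium.onnx
```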

Resources

scripts/

- run_voice_translate.py: primary entrypoint.
- run_voice_translate_llm.sh: thin wrapper for the default LLM-assisted path.
- voice_translate_app/: pipeline modules.
- send_text.py: wrap stage text and forward it via a shell command.
- send_audio.py: forward generated audio via a shell command.
- mock_text_sender.py, mock_audio_sender.py: local smoke-test helpers.
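
Forwarding stage text through a shell command, as the text sender is described as doing, might look like the sketch below. How the real script hands text to the command is not stated here, so piping a labeled message to the command's standard input is an assumption, and forward_text is a hypothetical name.

```python
import subprocess


def forward_text(stage: str, text: str, command: str) -> int:
    """Pipe a labeled stage message to a shell command's stdin; return its exit code."""
    message = f"[{stage}]\n{text}\n"
    completed = subprocess.run(
        command,
        shell=True,        # the notifier is given as a shell command string
        input=message,
        encoding="utf-8",  # send text as UTF-8 regardless of the locale
        check=False,       # report failures through the return code instead of raising
    )
    return completed.returncode


if __name__ == "__main__":
    # "cat" simply echoes the message; a real notifier would post it to a chat.
    exit_code = forward_text("transcript", "你好", "cat")
    print(f"notifier exited with {exit_code}")
```

Only pass trusted command strings: shell=True runs the string through the shell.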

references/

- Read references/runtime-notes.md for dependency/setup details, backend behavior, and integration constraints.
- Read references/llm-translation-pattern.md when the surrounding agent should perform translation with its own model.
- Read references/openclaw-chat-mode.md when implementing or following the conversational flow: receive voice, output transcript text, output translation text, then output translated audio.

Editing guidance

- Keep SKILL.md procedural and short.
- Put environment- or backend-specific detail in references.
- Treat llm as the preferred translation path for agent-driven workflows.
- In chat-native mode, preserve the user-visible ordering: transcript text, translation text, then audio.
- Prefer OpenClaw tts for immediate conversational audio replies; prefer Piper for local wav artifacts and offline pipelines.
- If the user wants tighter OpenClaw integration, add an attachment-aware outer workflow or hook instead of rewriting ASR/TTS first.
- Preserve the current file contract unless the user asks to change it: transcript, translation, wav, metadata JSON.
