doubao-asr (豆包语音转写 / Doubao speech transcription)

Transcribe recorded audio files to text via Doubao Seed-ASR 2.0 (豆包录音文件识别模型2.0) from ByteDance/Volcengine. Best-in-class Chinese speech recognition with speaker diarization. Use this skill whenever the user wants to: convert audio/recording to text, transcribe a meeting recording or voice memo, identify who said what in a recording (说话人分离), transcribe m4a/mp3/wav/ogg/flac files, or mentions 录音转文字/豆包/火山引擎/Volcengine/Doubao ASR. Also use when the user has an audio file and needs a transcript, even

Author: admin | Source: ClawHub
Version: v0.18.3 | Security check: passed | Downloads: 1,221 | Price: free | Favorites: 4

Doubao ASR / 豆包语音转写

Transcribe audio files via the ByteDance Volcengine Seed-ASR 2.0 Standard (豆包录音文件识别模型2.0-标准版) API. Best-in-class accuracy for Chinese (Mandarin, Cantonese, Sichuan dialect, etc.), with support for 13+ languages.

调用字节跳动火山引擎豆包录音文件识别模型2.0-标准版(Seed-ASR 2.0 Standard)转写音频文件。中文识别(普通话、粤语、四川话等方言)准确率业界领先,支持 13+ 种语言。

Sending audio to OpenClaw

Currently, audio files can be sent to OpenClaw via Discord or WhatsApp. Send the audio file in a chat message and ask the bot to transcribe it.

目前可通过 Discord 或 WhatsApp 向 OpenClaw 发送音频文件,发送后让 bot 转写即可。

Note: Direct voice recording in the OpenClaw web UI is not yet supported. Use a messaging app to send pre-recorded audio files.
提示:OpenClaw 网页端暂不支持直接录音,请通过即时通讯应用发送预录制的音频文件。

Quick start

bash
python3 {baseDir}/scripts/transcribe.py /path/to/audio.m4a

Defaults:

  • Model: Seed-ASR 2.0 Standard / 豆包录音文件识别模型2.0-标准版
  • Speaker diarization: enabled / 说话人分离:默认开启
  • Output: stdout (transcript text with speaker labels / 带说话人标签的转写文本)

Useful flags

bash
python3 {baseDir}/scripts/transcribe.py /path/to/audio.m4a --out /tmp/transcript.txt
python3 {baseDir}/scripts/transcribe.py /path/to/audio.mp3 --format mp3
python3 {baseDir}/scripts/transcribe.py /path/to/audio.m4a --json --out /tmp/result.json
python3 {baseDir}/scripts/transcribe.py /path/to/audio.m4a --no-speakers # disable speaker diarization / 关闭说话人分离
python3 {baseDir}/scripts/transcribe.py https://example.com/audio.mp3 # direct URL (skip upload)
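
If the script is driven from another Python program, the flags above can be assembled programmatically. The following is only a sketch: `build_transcribe_command` and `run_transcribe` are hypothetical helper names, the script path is whatever `{baseDir}/scripts/transcribe.py` resolves to on your machine, and only the flags listed above are used.

```python
import subprocess
import sys
from typing import List, Optional


def build_transcribe_command(
    script: str,
    audio: str,
    out: Optional[str] = None,
    audio_format: Optional[str] = None,
    as_json: bool = False,
    speakers: bool = True,
) -> List[str]:
    """Assemble an argv list for transcribe.py from the documented flags.

    `audio` may be a local path or a direct URL, as in the examples above.
    """
    cmd = [sys.executable, script, audio]
    if audio_format:
        cmd += ["--format", audio_format]
    if as_json:
        cmd.append("--json")
    if not speakers:
        cmd.append("--no-speakers")  # disable speaker diarization
    if out:
        cmd += ["--out", out]
    return cmd


def run_transcribe(cmd: List[str]) -> str:
    """Run the command and return its stdout.

    Without --out, the transcript is printed to stdout (see Defaults).
    Raises subprocess.CalledProcessError on a non-zero exit code.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing an argv list rather than a shell string avoids quoting problems with file names that contain spaces.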

How it works

The Doubao API accepts audio via URL (not direct file upload). The script:

  1. Uploads audio to Volcengine TOS (object storage) via presigned URL; audio stays within Volcengine infrastructure, with no third-party services involved
  2. Submits transcription task to Seed-ASR 2.0
  3. Polls until complete (typically 1-3 minutes for a 10-min audio)
  4. Returns transcript text
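
Step 3 above is a submit-then-poll pattern. The sketch below shows only the polling loop in generic form; it is not the script's actual code. `poll_until_done` is a hypothetical helper, and the `check` callable stands in for whatever request queries the task status (the real API call is not shown here). The interval and timeout defaults are illustrative assumptions.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check: Callable[[], Optional[str]],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call `check` repeatedly until it returns a transcript.

    `check` must return None while the task is still running and the
    transcript text once it has finished. `sleep` and `clock` are
    injectable so the loop can be tested without real waiting.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"transcription not finished after {timeout} seconds")
        sleep(interval)
```

With a 10-minute recording typically taking 1-3 minutes, a timeout of several minutes and an interval of a few seconds keeps the number of status requests small.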

Privacy: By default, audio is uploaded to your own Volcengine TOS bucket via presigned URL. No data is sent to third-party services.

You can also pass a direct audio URL as the argument to skip upload entirely:

bash
python3 {baseDir}/scripts/transcribe.py https://your-bucket.tos.volces.com/audio.m4a
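
One plausible way to tell a direct URL from a local file is to inspect the argument's scheme. This is a sketch of that idea, not necessarily how `transcribe.py` decides; `is_remote_audio` is a hypothetical name.

```python
from urllib.parse import urlparse


def is_remote_audio(arg: str) -> bool:
    """Return True if `arg` looks like an http(s) URL, so upload can be skipped."""
    parsed = urlparse(arg)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

Checking the scheme explicitly, rather than just looking for a colon, keeps Windows paths such as `C:\audio\a.m4a` from being mistaken for URLs.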

Dependencies

  • Python 3.9+
  • requests: pip install requests
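
A quick pre-flight check for these two dependencies might look like the sketch below. `missing_requirements` is a hypothetical helper; it only reports problems and installs nothing.

```python
import importlib.util
import sys
from typing import List, Sequence, Tuple


def missing_requirements(
    min_python: Tuple[int, int] = (3, 9),
    modules: Sequence[str] = ("requests",),
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name} (try: pip install {name})")
    return problems
```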

Credentials

You need 4 environment variables. Follow these steps carefully — the guided setup below saves you 1-2 hours of digging through Volcengine docs.

你需要设置 4 个环境变量。按以下步骤操作——这份引导能帮你节省 1-2 小时翻文档踩坑的时间。

Step 1: Doubao ASR API Key / 第一步:豆包 ASR API Key

  1. 打开 https://console.volcengine.com/speech/new/(确认进入的是新版「豆包语音」控制台)
  2. 左侧菜单 →「语音识别」
  3. 点击「开通模型」,开通「录音文件识别2.0」
  4. 点击页面右上角「API 调用」
  5. 在 Step 1「获取 API Key」中,点击创建 API Key
  6. 复制生成的 UUID 格式 Key

  1. Open https://console.volcengine.com/speech/new/ (make sure you are in the new Doubao Speech console)
  2. Left sidebar → Speech Recognition
  3. Click Activate Model, activate Audio File Recognition 2.0
  4. Click API Call button at the top-right of the page
  5. In Step 1 Get API Key, click to create an API Key
  6. Copy the generated UUID-format key (e.g. 57e620a4-179c-4b3d-bd8d-990bd1f9a1e2)

bash
export VOLCENGINE_API_KEY=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

Step 2: IAM Access Key / 第二步:创建 IAM 子用户和访问密钥

  1. 打开 https://console.volcengine.com/iam/usermanage
  2. 点「新建用户」,填写用户名(如 doubao-asr)
  3. 访问方式确保勾选「编程访问」和「允许用户管理自己的API密钥」,其他选项保持默认即可
  4. 点击确定,创建成功后页面会显示 Access Key ID(以 AKLT 开头)和 Secret Access Key,复制保存

提示:这一步不需要添加任何 IAM 权限策略。权限将在 Step 3 通过 TOS 桶策略授予(仅限单桶读写)。
如需再次查看密钥,进入用户列表 → 点击子用户名 → 切换到「密钥」tab。


  1. Open https://console.volcengine.com/iam/usermanage
  2. Click Create User, enter username (e.g. doubao-asr)
  3. Make sure Programmatic Access and Allow user to manage own API keys are checked. Leave all other options as default
  4. Click confirm. The success page shows Access Key ID (starts with AKLT) and Secret Access Key — copy both

Note: No IAM permission policies needed here — access will be granted via TOS bucket policy in Step 3 (single-bucket read/write only).
Tip: To view keys again, go to user list → click sub-user name → switch to Keys tab.

bash
export VOLCENGINE_ACCESS_KEY_ID=AKLTxxxx...
export VOLCENGINE_SECRET_ACCESS_KEY=xxxx...
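
Before running a transcription, it can help to sanity-check the values exported so far. The sketch below relies only on the two format hints given in the steps above (the API Key is UUID-formatted, the Access Key ID starts with AKLT). `find_credential_problems` is a hypothetical helper, and the variable names are passed in by the caller, so use exactly the names from your own export lines.

```python
import re
from typing import List, Mapping

# 8-4-4-4-12 hexadecimal groups, e.g. 57e620a4-179c-4b3d-bd8d-990bd1f9a1e2
_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def find_credential_problems(
    env: Mapping[str, str],
    api_key_name: str,
    access_key_id_name: str,
    secret_key_name: str,
) -> List[str]:
    """Return a list of problems found in `env`; an empty list means none."""
    problems = []
    api_key = env.get(api_key_name, "").strip()
    if not _UUID_RE.fullmatch(api_key):
        problems.append(f"{api_key_name} is missing or not UUID-formatted")
    access_key_id = env.get(access_key_id_name, "").strip()
    if not access_key_id.startswith("AKLT"):
        problems.append(f"{access_key_id_name} is missing or does not start with AKLT")
    if not env.get(secret_key_name, "").strip():
        problems.append(f"{secret_key_name} is missing or empty")
    return problems
```

In practice `env` would be `os.environ`; accepting any mapping keeps the check easy to test with a plain dict.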

Step 3: TOS Bucket / 第三步:开通并创建 TOS 存储桶

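豆包 API 要求音频通过 URL 访问。TOS 对象存储提供安全的临时上传,数据留在火山引擎内部。
The Doubao API requires audio to be reachable via URL. TOS object storage provides secure temporary uploads, and the data stays inside Volcengine.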

  1. 打开 https://console.volcengine.com/tos
  2. 首次进入会看到「开通对象存储」引导页,点击确认开通
  3. 开通后如果页面没有自动跳转到管理控制台,请手动重新访问 https://console.volcengine.com/tos 进入
  4. 在左侧菜单栏找到「桶列表」。如果看不到已创建的桶,检查页面顶部的项目选择器,切换到创建桶时所用的项目
  5. 点击「创建桶」,输入桶名称,根据服务器位置选择区域(见下方表格)
  6. 创建完成后,点击桶名称进入桶控制面板
  7. 左侧导航栏 →「权限管理」→「存储桶授权策略管理」→「创建策略」
  8. 选择「文件夹读写」模板 → 下一步 → 授权用户选择「当前主账号」→ 资源范围选择「所有对象」→ 确定
  9. 回到桶列表,复制桶名称

  1. Open https://console.volcengine.com/tos
  2. First-time users will see an Activate Object Storage page — click to activate
  3. If the page does not auto-redirect after activation, manually re-visit https://console.volcengine.com/tos
  4. In the left sidebar, find Bucket List. If you don't see your bucket, check the project selector at the top of the page and switch to the project used when the bucket was created
  5. Click Create Bucket, enter a bucket name and choose region based on server location (see table below)
  6. After creation, click the bucket name to enter bucket dashboard
  7. Left sidebar → Permission Management → Bucket Authorization Policy → Create Policy
  8. Select Folder Read/Write template → Next → Authorized user: Current main account → Resource scope: All objects → Confirm
  9. Go back to bucket list, copy the bucket name

Region selection / 区域选择:

| Server location / 服务器位置 | Recommended TOS region / 推荐 TOS 区域 | Region code |
| --- | --- | --- |
| China mainland / 中国内地 | cn-beijing, cn-shanghai, cn-guangzhou | cn-beijing |
| Hong Kong / 香港 | cn-hongkong | cn-hongkong |
| Southeast Asia / 东南亚 | ap-southeast-1 (Singapore) | ap-southeast-1 |
| US, Europe, other overseas / 美国、欧洲等海外 |
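
The complete rows of the table above can be captured as a small lookup. In this sketch the location labels (`china-mainland` and so on) are made-up keys for illustration, and the overseas row is left out because its region is cut off in the table.

```python
# Region codes copied from the table above; the keys are hypothetical labels.
REGION_BY_LOCATION = {
    "china-mainland": "cn-beijing",
    "hong-kong": "cn-hongkong",
    "southeast-asia": "ap-southeast-1",
}


def region_for(location: str) -> str:
    """Return the TOS region code for a server location label."""
    key = location.strip().lower()
    if key not in REGION_BY_LOCATION:
        known = ", ".join(sorted(REGION_BY_LOCATION))
        raise KeyError(f"no region listed for {location!r}; known locations: {known}")
    return REGION_BY_LOCATION[key]
```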

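Tags

skill ai

Install via chat

This skill can be installed through chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the doubao-asr-1776305113 skill

Option 2: Set SkillHub as the preferred skill source

Set SkillHub as my preferred skill installation source, then help me install the doubao-asr-1776305113 skill

Install via command line

skillhub install doubao-asr-1776305113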

Download

doubao-asr (豆包语音转写) v0.18.3 (free)

File size: 20.05 KB | Published: 2026-4-16 18:40

v0.18.3 (latest), 2026-4-16 18:40
- Expanded and clarified the skill description to better detail use cases, keywords, and when to use the skill (including speaker diarization and various audio file types).
- Updated Chinese and English instructions for accuracy, emphasizing usage even when "transcribe" is not explicitly mentioned.
- No behavioral or code changes—documentation and usage trigger improvements only.
