SenseAudio 会议助手 / SenseAudio Meeting Assistant
使用这个 skill 来构建基于 SenseAudio ASR 的会议类工作流。
Use this skill to build meeting-focused workflows on top of SenseAudio ASR.
先读哪些文件 / Read First
- -
references/realtime-workflow.md - 实时会议的 WebSocket 转写流程 / realtime WebSocket transcription flow for live meetings - INLINECODE1 - 录音文件的 HTTP 转写、说话人分离、时间戳与翻译 / HTTP transcription with diarization, timestamps, sentiment, and translation
- INLINECODE2 - 如何把转写结果整理成会议纪要、决策与行动项 / how to turn transcripts into structured meeting notes, decisions, and action items
- INLINECODE3 - 常见会议助手场景的最小实现示例 / minimal examples for common meeting assistant tasks
按用户需求只加载必要的参考文件,不要一次性全读。
Load only the reference file(s) needed for the user's request.
适用场景 / When To Use
当用户需要以下能力时,使用这个 skill:
Use this skill when the user wants any of the following:
- - 实时会议字幕或实时转写 / live meeting captions or realtime transcripts
- 带说话人区分的会议记录 / speaker-separated meeting records
- 双语会议输出或会中翻译 / bilingual or translated meeting output
- 会议摘要、决策提取、行动项整理 / meeting summaries, decisions, and action items
- 基于会议音频或识别历史构建可搜索归档 / searchable archives built from meeting audio or ASR history
- 将 SenseAudio ASR 与 LLM 摘要能力结合的会议助手 / a meeting copilot combining SenseAudio ASR with LLM summarization
能力路由 / Capability Routing
优先选择满足需求的最小实现。
Choose the smallest implementation that satisfies the request.
- 1. 实时会议助手 / Live meeting assistant → 读取 INLINECODE4
- 录音会议处理 / Recorded meeting processing → 读取 INLINECODE5
- 音质预检 / Audio quality triage → 先走音频质量分析接口
- 历史记录查询 / Historical transcript lookup → 使用识别记录查询接口
- 纪要与行动项整理 / Structured notes and actions → 再读取 INLINECODE6
默认工作流 / Default Workflow
- 1. 先判断会议模式 / Pick meeting mode:
- - 直播中 / 开会中 → 使用 WebSocket 实时识别 / live meeting → WebSocket realtime ASR
- 已有录音文件 → 使用 HTTP 文件转写 / uploaded recording → HTTP transcription
- 2. 再选择识别模型 / Choose recognition settings:
- -
sense-asr-pro:默认首选,适合会议场景,支持说话人分离、时间戳、情感与翻译 / best default for meetings needing diarization, timestamps, sentiment, and translation - INLINECODE8 :适合需要翻译但对会议增强能力要求较轻的场景 / useful when translation is needed but meeting constraints are lighter
- INLINECODE9 :适合实时 WebSocket 或更偏翻译理解的场景 / suitable for realtime WebSocket or translation-heavy semantic cleanup
- 3. 先构造最小可用请求 / Build a minimal valid request first:
- - 加上 INLINECODE10
- 先只保留必填参数 / keep only required fields initially
- 只有在确实需要时再开启说话人分离、时间戳、翻译、情感分析 / add diarization, timestamps, translation, or sentiment only when needed
- 4. 统一转写结果结构 / Normalize transcript output:
- - 保留说话人、时间戳、翻译文本、分段边界 / preserve speaker, timestamps, translation, and segment boundaries
- 先把 ASR 原始结果归一化,再进入纪要或行动项生成 / normalize ASR output before summarization
- 5. 最后生成会议产物 / Generate meeting artifacts:
- - 会议摘要 / summary
- 关键决策 / decisions
- 带负责人和时间定位的行动项 / action items with owner and timestamp when possible
- 可导出的 Markdown 或 JSON / export-ready Markdown or JSON
实现规则 / Implementation Rules
- - 严禁硬编码密钥,统一从环境变量
SENSEAUDIO_API_KEY 读取。Do not hardcode secrets; read SENSEAUDIO_API_KEY from environment variables. - 实时会议流式识别必须满足
pcm、16000Hz、单声道、16-bit little-endian。For realtime streaming, enforce pcm, 16000Hz, mono, signed 16-bit little-endian. - 录音会议优先使用
response_format=verbose_json,保留会议所需元数据。For recorded meetings, prefer response_format=verbose_json. - 说话人分离只在支持的模型上使用,
max_speakers 仅适用于 sense-asr-pro。Use diarization only with supported models; max_speakers is only for sense-asr-pro. - 使用
sense-asr-deepthink 时不要传 language,只使用 target_language。Do not send language with sense-asr-deepthink; use target_language only. - 会议纪要生成属于第二阶段,必须建立在结构化转写结果之上。Treat note generation as a second stage after transcript normalization.
- 如果用户要生产级代码,输出中要包含超时、重试、日志与失败处理。Include timeout, retry, logging, and failure handling for production code.
输出检查清单 / Output Checklist
当你给出实现方案时,至少包含:
When producing implementation output, include:
- 1. 选用的接口与模型 / chosen endpoint and model
- 最小可运行请求示例 / minimal request example
- 转写结果解析方式 / transcript parsing strategy
- 纪要 / 行动项生成方式(如用户需要) / note and action-item generation strategy if requested
- 会议场景下的限制说明,例如音频格式、说话人分离限制、翻译行为 / meeting-specific caveats such as audio format, diarization limits, or translation behavior
常见坑点 / Common Pitfalls
- - 没有说话人字段 / Missing diarization fields:确认已开启
enable_speaker_diarization=true,并使用 sense-asr 或 INLINECODE31 - 没有字级时间戳 / No word timestamps:确认设置了 INLINECODE32
- 实时识别报错 / Realtime failures:检查发送的是原始 PCM 数据,而不是
wav/mp3 容器字节 - DeepThink 参数报错 / DeepThink errors:移除不支持的
language 或其他不兼容参数 - 纪要质量差 / Weak meeting notes:优先基于结构化
segments 总结,不要只对整段纯文本做摘要
SenseAudio 会议助手 / SenseAudio Meeting Assistant
使用这个技能来构建基于 SenseAudio ASR 的会议类工作流。
Use this skill to build meeting-focused workflows on top of SenseAudio ASR.
先读哪些文件 / Read First
- - references/realtime-workflow.md - 实时会议的 WebSocket 转写流程 / realtime WebSocket transcription flow for live meetings
- references/offline-meeting-transcription.md - 录音文件的 HTTP 转写、说话人分离、时间戳与翻译 / HTTP transcription with diarization, timestamps, sentiment, and translation
- references/note-generation.md - 如何把转写结果整理成会议纪要、决策与行动项 / how to turn transcripts into structured meeting notes, decisions, and action items
- references/examples.md - 常见会议助手场景的最小实现示例 / minimal examples for common meeting assistant tasks
按用户需求只加载必要的参考文件,不要一次性全读。
Load only the reference file(s) needed for the users request.
适用场景 / When To Use
当用户需要以下能力时,使用这个技能:
Use this skill when the user wants any of the following:
- - 实时会议字幕或实时转写 / live meeting captions or realtime transcripts
- 带说话人区分的会议记录 / speaker-separated meeting records
- 双语会议输出或会中翻译 / bilingual or translated meeting output
- 会议摘要、决策提取、行动项整理 / meeting summaries, decisions, and action items
- 基于会议音频或识别历史构建可搜索归档 / searchable archives built from meeting audio or ASR history
- 将 SenseAudio ASR 与 LLM 摘要能力结合的会议助手 / a meeting copilot combining SenseAudio ASR with LLM summarization
能力路由 / Capability Routing
优先选择满足需求的最小实现。
Choose the smallest implementation that satisfies the request.
- 1. 实时会议助手 / Live meeting assistant → 读取 references/realtime-workflow.md
- 录音会议处理 / Recorded meeting processing → 读取 references/offline-meeting-transcription.md
- 音质预检 / Audio quality triage → 先走音频质量分析接口
- 历史记录查询 / Historical transcript lookup → 使用识别记录查询接口
- 纪要与行动项整理 / Structured notes and actions → 再读取 references/note-generation.md
默认工作流 / Default Workflow
- 1. 先判断会议模式 / Pick meeting mode:
- - 直播中 / 开会中 → 使用 WebSocket 实时识别 / live meeting → WebSocket realtime ASR
- 已有录音文件 → 使用 HTTP 文件转写 / uploaded recording → HTTP transcription
- 2. 再选择识别模型 / Choose recognition settings:
- - sense-asr-pro:默认首选,适合会议场景,支持说话人分离、时间戳、情感与翻译 / best default for meetings needing diarization, timestamps, sentiment, and translation
- sense-asr:适合需要翻译但对会议增强能力要求较轻的场景 / useful when translation is needed but meeting constraints are lighter
- sense-asr-deepthink:适合实时 WebSocket 或更偏翻译理解的场景 / suitable for realtime WebSocket or translation-heavy semantic cleanup
- 3. 先构造最小可用请求 / Build a minimal valid request first:
- - 加上 Authorization: Bearer APIKEY>
- 先只保留必填参数 / keep only required fields initially
- 只有在确实需要时再开启说话人分离、时间戳、翻译、情感分析 / add diarization, timestamps, translation, or sentiment only when needed
- 4. 统一转写结果结构 / Normalize transcript output:
- - 保留说话人、时间戳、翻译文本、分段边界 / preserve speaker, timestamps, translation, and segment boundaries
- 先把 ASR 原始结果归一化,再进入纪要或行动项生成 / normalize ASR output before summarization
- 5. 最后生成会议产物 / Generate meeting artifacts:
- - 会议摘要 / summary
- 关键决策 / decisions
- 带负责人和时间定位的行动项 / action items with owner and timestamp when possible
- 可导出的 Markdown 或 JSON / export-ready Markdown or JSON
实现规则 / Implementation Rules
- - 严禁硬编码密钥,统一从环境变量 SENSEAUDIOAPIKEY 读取。Do not hardcode secrets; read SENSEAUDIOAPIKEY from environment variables.
- 实时会议流式识别必须满足 pcm、16000Hz、单声道、16-bit little-endian。For realtime streaming, enforce pcm, 16000Hz, mono, signed 16-bit little-endian.
- 录音会议优先使用 responseformat=verbosejson,保留会议所需元数据。For recorded meetings, prefer responseformat=verbosejson.
- 说话人分离只在支持的模型上使用,maxspeakers 仅适用于 sense-asr-pro。Use diarization only with supported models; maxspeakers is only for sense-asr-pro.
- 使用 sense-asr-deepthink 时不要传 language,只使用 targetlanguage。Do not send language with sense-asr-deepthink; use targetlanguage only.
- 会议纪要生成属于第二阶段,必须建立在结构化转写结果之上。Treat note generation as a second stage after transcript normalization.
- 如果用户要生产级代码,输出中要包含超时、重试、日志与失败处理。Include timeout, retry, logging, and failure handling for production code.
输出检查清单 / Output Checklist
当你给出实现方案时,至少包含:
When producing implementation output, include:
- 1. 选用的接口与模型 / chosen endpoint and model
- 最小可运行请求示例 / minimal request example
- 转写结果解析方式 / transcript parsing strategy
- 纪要 / 行动项生成方式(如用户需要) / note and action-item generation strategy if requested
- 会议场景下的限制说明,例如音频格式、说话人分离限制、翻译行为 / meeting-specific caveats such as audio format, diarization limits, or translation behavior
常见坑点 / Common Pitfalls
- - 没有说话人字段 / Missing diarization fields:确认已开启 enablespeakerdiarization=true,并使用 sense-asr 或 sense-asr-pro
- 没有字级时间戳 / No word timestamps:确认设置了 timestamp_granularities[]=word
- 实时识别报错 / Realtime failures:检查发送的是原始 PCM 数据,而不是 wav/mp3 容器字节
- DeepThink 参数报错 / DeepThink errors:移除不支持的 language 或其他不兼容参数
- 纪要质量差 / Weak meeting notes:优先基于结构化 segments 总结,不要只对整段纯文本做摘要