Meeting To Text

Use this skill when the job is a local file-to-transcript workflow.

Do not use this skill if the user only wants audio extraction, a meeting summary, environment setup, or an explanation of the models.

Inputs To Collect

Always collect:

- one local source file path
- one output target path

Output target rules:

- If the target ends with .txt, write exactly to that file.
- Otherwise treat it as a directory and write <source-stem>_transcript.txt inside it.
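The output target rules above can be sketched as a small helper. This is an illustration only, not the bundled script's code: the function name `resolve_output_path` is hypothetical, and the case-sensitive `.txt` check is an assumption.

```python
from pathlib import Path


def resolve_output_path(source: str, target: str) -> Path:
    """Return the transcript path implied by the output target rules.

    Hypothetical helper: the real entrypoint may resolve paths differently.
    """
    # Rule 1: a target ending in .txt is used exactly as given.
    # Assumption: the check is case-sensitive.
    if target.endswith(".txt"):
        return Path(target)
    # Rule 2: anything else is treated as a directory that receives
    # <source-stem>_transcript.txt.
    return Path(target) / f"{Path(source).stem}_transcript.txt"
```

For example, a source named standup.mp4 with the target out would resolve to out/standup_transcript.txt.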

Supported source types:

- Video: .mp4, .mkv, .mov, .avi, .webm
- Audio: .wav, .mp3, .m4a, .aac, .flac, .ogg
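A pre-flight check for the source type might look like the sketch below. The extension sets follow the supported types for this skill. The name `classify_source` and the case-insensitive matching are assumptions made for illustration, and the bundled script performs its own validation.

```python
from pathlib import Path

VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi", ".webm"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".aac", ".flac", ".ogg"}


def classify_source(path: str) -> str:
    """Return "video" or "audio" for a supported file, else raise ValueError.

    Hypothetical helper; matching is case-insensitive by assumption.
    """
    suffix = Path(path).suffix.lower()
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    raise ValueError(f"unsupported source type: {suffix or '(no extension)'}")
```

The returned value is also what the success report needs when it states whether the source was audio or video.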

Runtime

Read references/runtime_paths.md before running the script.

Run the bundled entrypoint with the local ASR environment:

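    & <CONDA_ENV_PYTHON_PATH> C:\path\to\your\meeting-to-text\scripts\meeting_to_text.py --input <PATH> --output <TARGET>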

If you need a stable temp location, add:

    --work-dir <WORKSPACE_TEMP_PATH>
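If the entrypoint is launched from Python rather than typed into PowerShell, the argument list could be assembled as sketched here. Only the flags --input, --output, and --work-dir come from the command template. The function name `build_command` and its parameters are hypothetical, and the interpreter and script paths must be taken from references/runtime_paths.md.

```python
from typing import List, Optional


def build_command(
    python_exe: str,
    script_path: str,
    source: str,
    target: str,
    work_dir: Optional[str] = None,
) -> List[str]:
    """Build the argument list for the bundled entrypoint (illustrative only)."""
    command = [python_exe, script_path, "--input", source, "--output", target]
    # --work-dir is optional; add it only when a stable temp location is needed.
    if work_dir is not None:
        command += ["--work-dir", work_dir]
    return command
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True)`, which avoids shell quoting problems with paths that contain spaces.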

Result Handling

The script may print library noise before the final machine-readable result.

Always treat the last non-empty stdout line as the JSON result object.
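One way to apply this rule to captured stdout is shown below. It is a minimal sketch: the name `parse_result` is hypothetical, and it assumes the final line is a complete JSON object on a single line.

```python
import json


def parse_result(stdout: str) -> dict:
    """Parse the last non-empty stdout line as the JSON result object."""
    # Drop blank lines so trailing newlines do not hide the result.
    lines = [line for line in stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("stdout contained no non-empty lines")
    # Everything before the last line is treated as library noise.
    return json.loads(lines[-1])
```

If the last line is not valid JSON, `json.loads` raises `json.JSONDecodeError`, which should be handled as a failed run rather than ignored.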

Interpret results this way:

- Exit code 0 with status: success: transcript file was created with no warnings.
- Exit code 0 with status: warning: transcript file was created, but you must report the warnings and any skipped segments.
- Non-zero exit code or status: error: do not claim success; surface the warning list and the intended output path.
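The three cases above can be expressed as a small decision function. This is a sketch under assumptions: the name `classify_outcome` is hypothetical, and treating an unrecognized or missing status as an error is a conservative choice made here, not documented behavior.

```python
def classify_outcome(exit_code: int, result: dict) -> str:
    """Map an exit code and parsed result to "success", "warning", or "error"."""
    status = result.get("status")
    # A non-zero exit code or an explicit error status is never a success.
    if exit_code != 0 or status == "error":
        return "error"
    if status == "warning":
        return "warning"
    if status == "success":
        return "success"
    # Assumption: anything unexpected is handled as an error.
    return "error"
```

Checking the exit code first means a run that exits non-zero is reported as a failure even if its last line claims success.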

Important fields in the final JSON:

- output_path: final transcript file path
- speaker_count: number of detected 说话人N (Speaker N) labels in the written transcript
- segment_count: normalized diarization segments sent into transcription
- transcribed_segment_count: segments that produced text
- skipped_segment_count: dropped or failed segments
- failed_segments: segment-level failures with start, end, and reason
- warnings: run-level warnings, such as only one speaker being detected

Behavior Guarantees

The entrypoint already enforces the workflow. Do not rewrite the pipeline ad hoc in the conversation.

The script will:

- normalize audio with FFmpeg instead of renaming extensions
- use local SenseVoiceSmall for ASR
- use local 3D-Speaker embeddings plus clustering for diarization
- write a plain text transcript with timestamps and 说话人N speaker labels
- stop on diarization failure instead of silently emitting a non-speaker-separated transcript

Report Back To The User

On success, report:

- the final transcript path
- whether the source was audio or video
- the detected speaker count
- any warnings that matter for review
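A success report covering these four items could be assembled from the parsed result as sketched below. The function name `build_success_report` and the report wording are hypothetical, and the keys output_path, speaker_count, and warnings are assumed to match the field names in the final JSON.

```python
def build_success_report(result: dict, source_kind: str) -> str:
    """Format the success items as plain text (illustrative only)."""
    lines = [
        f"Transcript: {result['output_path']}",
        f"Source type: {source_kind}",
        f"Speakers detected: {result['speaker_count']}",
    ]
    # Assumption: warnings is a list of strings and may be absent.
    for warning in result.get("warnings", []):
        lines.append(f"Warning: {warning}")
    return "\n".join(lines)
```

This version lists every warning. Deciding which ones matter for review is left to whoever writes the report.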

On failure, report:

- the exit code category
- the warning message from the JSON result
- whether the failure happened during validation, media normalization, diarization, transcription, or output writing

References

Read these only when needed:

- references/runtime_paths.md: fixed local paths and command template
- references/troubleshooting.md: common runtime issues and how to interpret them
