VoiceTrust
VoiceTrust answers one question: is this audio likely spoken by the enrolled owner?
Normal use:
- run STT for content
- run VoiceTrust for owner verification
- merge both before replying
Do not use this skill to define machine-specific commands.
Local routing and machine policy belong elsewhere.
Runtime note
This skill bundle is lightweight:
- source code and setup docs are included
- large model files are not bundled
- owner enrollment data is local runtime state and must not be published
If model assets are missing, read references/quickstart.md.
Output fields
VoiceTrust results may include:
- speaker_match
- audio_quality
- overall_trust
- confidence
- identity_score
- trust_label
- decision
- decision_reasons
- speaker_id
- speech_duration
- speech_ratio
- vad_status
- failure_reason
- raw_scores.speaker_similarity
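As a rough illustration, the fields above could be typed as follows. This is a sketch, not the skill's actual schema: the class names RawScores and VoiceTrustResult and every value type are assumptions, and all keys are optional because results only "may include" them.

```python
from typing import List, Optional, TypedDict


class RawScores(TypedDict, total=False):
    # Assumed to be a float; the source only names the field.
    speaker_similarity: float


class VoiceTrustResult(TypedDict, total=False):
    # total=False: any field may be absent from a given result.
    speaker_match: float
    audio_quality: float
    overall_trust: float
    confidence: float
    identity_score: float
    trust_label: str              # "high", "medium", or "low"
    decision: str                 # e.g. "allow_command"
    decision_reasons: List[str]
    speaker_id: str
    speech_duration: float
    speech_ratio: float
    vad_status: str               # e.g. "ok"
    failure_reason: Optional[str]
    raw_scores: RawScores


# Hypothetical partial result, for illustration only.
example: VoiceTrustResult = {"trust_label": "high", "confidence": 91.0}
```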
How to use the result
Use trust_label for concise rendering.
Use decision for command gating.
Do not treat audio quality alone as owner identity evidence.
Trust label
- high: identity_score >= 85 and confidence >= 80 and no failure
- medium: identity_score >= 72 and confidence >= 68 and no failure
- low: everything else
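The label thresholds can be sketched as a small function. The function name and signature are hypothetical, and "no failure" is assumed here to mean that failure_reason is null; the skill's own code may define it differently.

```python
from typing import Optional


def trust_label(identity_score: float, confidence: float,
                failure_reason: Optional[str] = None) -> str:
    """Map scores to "high", "medium", or "low" using the listed thresholds."""
    if failure_reason is not None:
        # Assumption: any reported failure falls into "everything else".
        return "low"
    if identity_score >= 85 and confidence >= 80:
        return "high"
    if identity_score >= 72 and confidence >= 68:
        return "medium"
    return "low"
```

Note that a strong identity_score with weak confidence (for example 90 and 79) lands in medium, not high, because both thresholds must hold.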
Common downgrade signals:
- vad_status != "ok"
- speech_duration < 2.5
- speech_ratio < 0.45
- speaker_match < 70
- failure_reason != null
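One way to surface these signals is to collect them into a list of short strings. This is a minimal sketch: the function name is hypothetical, the result is assumed to be a plain mapping, and fields missing from the result are skipped rather than flagged, which is a design choice made here and not documented behavior.

```python
from typing import Any, List, Mapping

# Lower bounds taken from the downgrade list above.
MINIMUMS = (("speech_duration", 2.5), ("speech_ratio", 0.45), ("speaker_match", 70))


def downgrade_signals(result: Mapping[str, Any]) -> List[str]:
    """Return the downgrade signals present in a result, in a fixed order."""
    signals: List[str] = []
    vad = result.get("vad_status")
    if vad is not None and vad != "ok":
        signals.append("vad_status != ok")
    for field, minimum in MINIMUMS:
        value = result.get(field)
        if value is not None and value < minimum:
            signals.append(f"{field} < {minimum}")
    if result.get("failure_reason") is not None:
        signals.append("failure_reason != null")
    return signals
```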
Command gating
For voice command execution:
- use the normal path when speech_duration >= 3.0
- allow a short voice sample only when all of the following are true:
  - speech_duration >= 1.2
  - speaker_match >= 85
  - confidence >= 85
- in all cases, command execution still requires:
  - speaker_match >= 78
  - confidence >= 80
  - identity_score >= 82
  - vad_status == "ok"
  - failure_reason == null
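The gating rules combine a duration path (normal or short sample) with a baseline that always applies. The sketch below restates them as one predicate for illustration only; the function name is hypothetical, a missing score is assumed to fail closed, and in practice the decision field returned by VoiceTrust remains the value to gate on.

```python
from typing import Any, Mapping

REQUIRED = ("speech_duration", "speaker_match", "confidence", "identity_score")


def passes_command_gate(r: Mapping[str, Any]) -> bool:
    """Return True only if the sample satisfies the gating rules listed above."""
    if any(r.get(key) is None for key in REQUIRED):
        return False  # assumption: a missing score fails closed

    normal_path = r["speech_duration"] >= 3.0
    short_path = (
        r["speech_duration"] >= 1.2
        and r["speaker_match"] >= 85
        and r["confidence"] >= 85
    )
    baseline = (
        r["speaker_match"] >= 78
        and r["confidence"] >= 80
        and r["identity_score"] >= 82
        and r.get("vad_status") == "ok"
        and r.get("failure_reason") is None
    )
    return (normal_path or short_path) and baseline
```

A 1.5-second sample with speaker_match 80 fails here even though it clears the baseline, because the short path demands the stricter 85 thresholds.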
Interpretation:
- decision == "allow_command" means command execution may proceed
- decision != "allow_command" means do not execute commands from this sample
- non-command voice content may still be handled normally
- music / non-speech / non-command audio should not enter the command path
CLI example:

    uv run --python .venv/bin/python ../scripts/demo.py \
      --audio /path/to/sample.ogg \
      --speaker owner \
      --json
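If the demo script is driven from Python instead of a shell, a wrapper along these lines may help. It assumes that running ../scripts/demo.py with --audio, --speaker, and --json prints a single JSON object to stdout, which this document does not confirm; the helper names and the timeout are hypothetical.

```python
import json
import subprocess
from typing import Any, Dict, List


def build_demo_command(audio_path: str, speaker: str = "owner") -> List[str]:
    """Build the argument list for the demo.py invocation with JSON output."""
    return [
        "uv", "run", "--python", ".venv/bin/python", "../scripts/demo.py",
        "--audio", audio_path,
        "--speaker", speaker,
        "--json",
    ]


def parse_demo_output(stdout: str) -> Dict[str, Any]:
    """Parse stdout, assuming it holds exactly one JSON object."""
    result = json.loads(stdout)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object from demo.py")
    return result


def run_voicetrust(audio_path: str, speaker: str = "owner",
                   timeout: float = 120.0) -> Dict[str, Any]:
    """Run the demo script and return its parsed result."""
    proc = subprocess.run(
        build_demo_command(audio_path, speaker),
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_demo_output(proc.stdout)
```

Passing the arguments as a list avoids shell quoting problems with audio paths that contain spaces.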
Human rendering
Preferred compact format:
- Voice trust: high / medium / low
- Details: match - trust - confidence - identity - quality
- if relevant: Decision: allow command / reject command
If degraded, say why briefly using decision_reasons.
Do not over-claim certainty.
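A compact renderer might look like the sketch below. The exact line wording ("Voice trust:", "Details:", "Decision:", "Reason:"), the mapping from detail names to result fields, and the two-reason cap are all assumptions made for illustration.

```python
from typing import Any, Mapping

# Assumed mapping from short display names to result fields.
DETAIL_FIELDS = (
    ("match", "speaker_match"),
    ("trust", "overall_trust"),
    ("confidence", "confidence"),
    ("identity", "identity_score"),
    ("quality", "audio_quality"),
)


def render_trust(r: Mapping[str, Any], show_decision: bool = False) -> str:
    """Render a result as a few short lines without overstating certainty."""
    label = r.get("trust_label", "unavailable")
    lines = [f"Voice trust: {label}"]
    details = [f"{name} {r[key]:.0f}" for name, key in DETAIL_FIELDS
               if r.get(key) is not None]
    if details:
        lines.append("Details: " + " - ".join(details))
    if show_decision and "decision" in r:
        allowed = r["decision"] == "allow_command"
        lines.append("Decision: " + ("allow command" if allowed else "reject command"))
    reasons = r.get("decision_reasons") or []
    if label != "high" and reasons:
        # Keep the explanation brief: at most two reasons.
        lines.append("Reason: " + "; ".join(reasons[:2]))
    return "\n".join(lines)
```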
Failure handling
- If STT succeeds and VoiceTrust fails: keep transcript, report trust as unavailable or inconclusive.
- If VoiceTrust succeeds and STT fails: keep trust result, report transcription failure.
- If both fail: say the audio could not be processed reliably.
- If decision != "allow_command", do not execute voice commands.
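The failure cases can be sketched as a single merge step. This is an illustration under stated assumptions: the function name and the shape of the returned dictionary are hypothetical, None stands for a stage that failed, and a command is additionally withheld when there is no transcript to act on.

```python
from typing import Any, Dict, List, Mapping, Optional


def merge_results(transcript: Optional[str],
                  trust: Optional[Mapping[str, Any]]) -> Dict[str, Any]:
    """Combine STT and VoiceTrust outcomes; None means that stage failed."""
    if transcript is None and trust is None:
        return {
            "ok": False,
            "message": "The audio could not be processed reliably.",
            "execute_command": False,
        }

    notes: List[str] = []
    if trust is None:
        notes.append("voice trust unavailable")
    if transcript is None:
        notes.append("transcription failed")

    # Execute only on an explicit allow_command decision with a transcript.
    allowed = (
        trust is not None
        and transcript is not None
        and trust.get("decision") == "allow_command"
    )
    return {
        "ok": True,
        "transcript": transcript,
        "trust_label": trust.get("trust_label") if trust is not None else None,
        "execute_command": allowed,
        "notes": notes,
    }
```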
First-time setup
For first-time setup, local installation, enrollment, or bootstrap, read:
- references/quickstart.md