UNITH Digital Humans Skill

Create, configure, update, and deploy AI-powered Digital Human avatars using the UNITH API.

Quick Overview

UNITH digital humans are AI avatars that can speak, converse, and interact with users. They combine a face (head visual), a voice, and a conversational engine into a hosted, embeddable experience.

Base API URL: https://platform-api.unith.ai
Docs: https://docs.unith.ai

Prerequisites

The user must supply the following credentials (stored as environment variables):

| Variable | Description | How to obtain |
|---|---|---|
| UNITH_EMAIL | Account email | Register at https://unith.ai |
| UNITH_SECRET_KEY | Non-expiring secret key | UNITH dashboard → Manage Account → "Secret Key" section → Generate |

⚠️ The secret key is displayed only once. If lost, the user must delete and regenerate it.
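Since every script depends on these credentials, a pre-flight check can fail fast with a clear message. The following is a minimal sketch, assuming the variables are named UNITH_EMAIL and UNITH_SECRET_KEY and are exported; the helper name require_env is hypothetical and not part of the UNITH scripts.

```bash
# Hypothetical helper (not part of the UNITH scripts): verify that each
# named variable is exported to the environment and non-empty.
require_env() {
  local name missing=0
  for name in "$@"; do
    # printenv prints nothing for a variable that is unset or not exported.
    if [ -z "$(printenv "$name")" ]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One possible use is `require_env UNITH_EMAIL UNITH_SECRET_KEY && source scripts/auth.sh`, so that authentication is only attempted once both values are present.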

Authentication

All API calls require a Bearer token (valid 7 days). Use the auth script:

    source scripts/auth.sh

This validates credentials, retries on network errors, and exports UNITH_TOKEN. On failure, it prints specific guidance (wrong key, expired token, etc.).
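The token is valid for 7 days, and the Scripts section mentions 6-day token caching with a default cache file of /tmp/.unith_token_cache. How scripts/auth.sh implements that cache is not shown here, so the sketch below is only one plausible freshness check. The function name token_cache_is_fresh and the variable name UNITH_TOKEN_CACHE are assumptions.

```bash
# Sketch only: not the actual logic in scripts/auth.sh.
# Succeeds when the cache file exists, is non-empty, and was modified
# less than 6 days ago, which stays inside the 7-day token lifetime.
token_cache_is_fresh() {
  # An explicitly empty UNITH_TOKEN_CACHE disables caching, so use "-"
  # rather than ":-" when falling back to the default path.
  local cache_file="${1:-${UNITH_TOKEN_CACHE-/tmp/.unith_token_cache}}"
  [ -s "$cache_file" ] || return 1
  # find prints the path only when the file is younger than 6 days.
  [ -n "$(find "$cache_file" -mtime -6 2>/dev/null)" ]
}
```

A caller could re-authenticate only when `token_cache_is_fresh` fails, which avoids a login request on every script run.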

Workflow: Creating a Digital Human

Step 1: Choose an Operating Mode

Ask the user what they want the digital human to do. Map their answer to one of 5 modes:

| Mode | operationMode value | Use case | Output |
|---|---|---|---|
| Text-to-Video | ttt | Generate an MP4 video of the avatar speaking provided text | MP4 file |
| Open Dialogue | | | |

Complexity spectrum (simple → sophisticated):

- Simplest: ttt. Just text in, video out. No knowledge base needed.
- Standard: oc. Conversational with a system prompt. Good for general assistants.
- Knowledge-grounded: doc_qa. Upload documents, and the avatar answers from them. Best for support/FAQ.
- Workflow-driven: voiceflow. Structured conversation paths. Requires a Voiceflow account.
- Most flexible: plugin. Bring your own conversational engine. Maximum control.
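Each mode needs different inputs from the user. The sketch below collects the mode-specific inputs named in this document into one lookup. The function name mode_requirements is hypothetical, and the output strings are informal labels rather than API field names.

```bash
# Hypothetical lookup: print the mode-specific inputs to collect before
# building a payload, based on the descriptions in this document.
mode_requirements() {
  case "$1" in
    ttt)       echo "text to speak" ;;
    oc)        echo "system prompt" ;;
    doc_qa)    echo "system prompt, knowledge documents" ;;
    voiceflow) echo "Voiceflow API key" ;;
    plugin)    echo "plugin URL" ;;
    *)         echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

For example, `mode_requirements doc_qa` prints `system prompt, knowledge documents`, and an unrecognized mode returns a non-zero status.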

Step 2: List Available Faces

    bash scripts/list-resources.sh faces

Each face has an id (used as headVisualId in creation). Faces can be:

- Public: Available to all organizations
- Private: Available only to the user's organization
- Custom (BYOF): User uploads a video of a real person (currently managed by UNITH)

Present the available faces to the user and let them choose.

Step 3: List Available Voices

    bash scripts/list-resources.sh voices

Voices come from providers: elevenlabs, azure, audiostack. Present options to the user. Voices have performance rankings — faster voices are better for real-time conversation.

Step 4: Create the Digital Human

Build a JSON payload file (see references/api-payloads.md for the schema per mode), then:

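    bash scripts/create-head.sh payload.json --dry-run  # validate first
    bash scripts/create-head.sh payload.json            # create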

The script validates required fields, checks mode-specific requirements, retries on server errors, and prints the publicUrl on success.
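To make the payload step concrete, here is a rough sketch that writes a small payload file. It is illustrative only: headVisualId and operationMode are named in this document, using ttsVoice as a creation field is an assumption based on the update examples, and the authoritative per-mode schema is in references/api-payloads.md. The helper name write_payload is hypothetical.

```bash
# Illustrative only: see references/api-payloads.md for the real schema.
# Values are inserted verbatim, so they must not contain double quotes
# or backslashes.
write_payload() {
  local out_file="$1" face_id="$2" mode="$3" voice="$4"
  cat > "$out_file" <<EOF
{
  "headVisualId": "${face_id}",
  "operationMode": "${mode}",
  "ttsVoice": "${voice}"
}
EOF
}
```

A typical sequence would be to call `write_payload payload.json` with the face ID from Step 2, the chosen mode, and a voice from Step 3, then run the create script with --dry-run before creating for real.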

Step 5 (doc_qa only): Upload Knowledge Document

For doc_qa mode, the digital human needs a knowledge document:

    bash scripts/upload-document.sh /path/to/document.pdf

The script checks file existence/size, uses a longer timeout for uploads, and provides guidance on next steps.
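The existence and size checks described above might look roughly like the following. This is a sketch, not the code in scripts/upload-document.sh; the function name check_document is hypothetical, and the 10 MiB default is a placeholder rather than a documented UNITH limit.

```bash
# Sketch of a pre-upload check. The size cap is a placeholder value,
# not a limit documented by UNITH.
check_document() {
  local path="$1" max_bytes="${2:-10485760}" size
  if [ ! -f "$path" ]; then
    echo "File not found: $path" >&2
    return 1
  fi
  # Arithmetic expansion strips the padding some wc implementations add.
  size=$(( $(wc -c < "$path") ))
  if [ "$size" -eq 0 ]; then
    echo "File is empty: $path" >&2
    return 1
  fi
  if [ "$size" -gt "$max_bytes" ]; then
    echo "File is $size bytes, over the $max_bytes byte cap: $path" >&2
    return 1
  fi
  return 0
}
```

Running this kind of check before the upload gives the user an immediate, local error instead of a failed request after a long timeout.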

Step 6: Test and Iterate

The digital human is live at the publicUrl from Step 4. The user should:

1. Visit the URL and test the conversation
2. Update configuration as needed (see below)

Updating a Digital Human

Use the update script to modify any parameter except the face (changing face requires creating a new head):

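    bash scripts/update-head.sh updates.json                                   # from a JSON file
    bash scripts/update-head.sh --field ttsVoice=rachel                        # single field
    bash scripts/update-head.sh --field ttsVoice=rachel --field greetings=Hi!  # multiple fields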
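The --field form implies that key=value pairs are turned into a JSON body. How scripts/update-head.sh does this is not shown, so the following is only a sketch of one way to do it. The function name fields_to_json is hypothetical, every value is treated as a string, and multi-line values are not handled.

```bash
# Sketch: convert key=value arguments into a flat JSON object.
# All values are emitted as JSON strings; keys are assumed to be plain
# identifiers that need no escaping.
fields_to_json() {
  local out="{" sep="" pair key value
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    # Escape backslashes first, then double quotes.
    value=$(printf '%s' "$value" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    out="$out$sep\"$key\": \"$value\""
    sep=", "
  done
  printf '%s}\n' "$out"
}
```

For instance, `fields_to_json ttsVoice=rachel 'greetings=Hi!'` prints `{"ttsVoice": "rachel", "greetings": "Hi!"}`.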

Listing Existing Digital Humans

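    bash scripts/list-resources.sh heads  # list all
    bash scripts/list-resources.sh head   # get details for a single head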

Deleting a Digital Human

    bash scripts/delete-head.sh --confirm  # always use --confirm in automated/agent contexts

This permanently removes the digital human and cannot be undone.

Agent note: Always pass --confirm when calling this script. Without it, the script prompts for interactive input and will hang.
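As an extra safeguard against hangs in automation, a command can be run with standard input attached to /dev/null, so that any prompt sees end-of-file immediately instead of waiting. This is a general shell technique, not a feature of the UNITH scripts, and the wrapper name run_noninteractive is hypothetical. How delete-head.sh reacts to end-of-file at its prompt is not documented here, so --confirm should still be passed.

```bash
# Run any command with stdin attached to /dev/null so that an
# interactive prompt reads end-of-file instead of blocking.
run_noninteractive() {
  "$@" < /dev/null
}
```

An agent could call `run_noninteractive bash scripts/delete-head.sh --confirm`, keeping the flag and using the wrapper only as a backstop.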

Embedding

Digital humans can be embedded in websites/apps. See references/embedding.md for code snippets and configuration options.

Scripts

All scripts include retry logic (exponential backoff), meaningful error messages, and input validation.

| Script | Purpose |
|---|---|
| scripts/utils.sh | Shared utilities: retry wrapper, colored logging, error parsing |
| scripts/auth.sh | Authenticate and export UNITH_TOKEN (with 6-day token caching) |
| scripts/list-resources.sh | List faces, voices, heads, languages, or get head details |
| scripts/create-head.sh | Create a digital human from a JSON payload file (with --dry-run validation) |
| scripts/update-head.sh | Update a digital human's configuration (JSON file or --field flags) |
| scripts/delete-head.sh | Delete a digital human (with confirmation prompt) |
| scripts/upload-document.sh | Upload knowledge document to a doc_qa head |

Configuration via environment variables:

- UNITH_MAX_RETRIES: max retry attempts (default: 3)
- UNITH_RETRY_DELAY: initial delay between retries in seconds (default: 2, doubles each retry)
- UNITH_CURL_TIMEOUT: curl timeout in seconds (default: 30, 120 for uploads)
- UNITH_CONNECT_TIMEOUT: connection timeout in seconds (default: 10)
- UNITH_TOKEN_CACHE: token cache file path (default: /tmp/.unith_token_cache, set empty to disable)
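The retry settings above describe exponential backoff. The sketch below shows the general pattern using the documented defaults. It is not the retry wrapper from the UNITH scripts, the function name retry is hypothetical, the variable name UNITH_RETRY_DELAY is assumed, and treating UNITH_MAX_RETRIES as the total number of attempts is also an assumption.

```bash
# Sketch of retry with exponential backoff; not the UNITH implementation.
# Assumes UNITH_MAX_RETRIES counts total attempts.
retry() {
  local max_attempts="${UNITH_MAX_RETRIES:-3}"
  local delay="${UNITH_RETRY_DELAY:-2}"
  local attempt=1
  # Re-run the command until it succeeds or attempts run out.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))       # double the wait before the next attempt
    attempt=$((attempt + 1))
  done
}
```

With the defaults, a command that keeps failing is tried three times, waiting 2 seconds and then 4 seconds between attempts.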

Detailed API Reference

For full payload schemas, configuration parameters, and mode-specific details:

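    Read references/api-payloads.md    # full request/response schemas per mode
    Read references/configuration.md   # all configurable parameters
    Read references/embedding.md       # embedding code and options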

Common Patterns

- "I want a quick video of someone saying X" → ttt mode, minimal config
- "I want a customer support avatar" → doc_qa mode with knowledge docs
- "I want an AI sales rep" → oc mode with a sales personality prompt
- "I want to connect my own LLM" → plugin mode with webhook URL
- "I want a guided onboarding flow" → voiceflow mode with Voiceflow API key

Information to Collect from the User

Before creating, ask for:

1. Purpose / use case → determines operating mode
2. Face preference → list available faces for selection
3. Voice preference → language, accent, gender, speed priority
4. Alias → display name for the digital human
5. Language → speech recognition and UI language (e.g., en-US, es-ES)
6. Greeting message → initial message the avatar says
7. System prompt (for oc/doc_qa) → personality and behavior instructions
8. Knowledge documents (for doc_qa) → files to upload
9. Voiceflow API key (for voiceflow) → from their Voiceflow account
10. Plugin URL (for plugin) → webhook endpoint for their custom engine
