AI Avatar & Talking Head Videos
Create AI avatars and talking head videos via inference.sh CLI.

Quick Start
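bash
curl -fsSL https://cli.inference.sh | sh && infsh login
# Create an avatar video from an image + audio
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'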
Install note: The install script only detects your OS/architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. No elevated permissions or background processes. Manual install & verification available.
Available Models
| Model | App ID | Best For |
|---|---|---|
| OmniHuman 1.5 | bytedance/omnihuman-1-5 | Multi-character, best quality |
| OmniHuman 1.0 | bytedance/omnihuman-1-0 | Single character |
| Fabric 1.0 | falai/fabric-1-0 | Image talks with lipsync |
| PixVerse Lipsync | falai/pixverse-lipsync | Highly realistic |
Search Avatar Apps
bash
infsh app list --search omnihuman
infsh app list --search lipsync
infsh app list --search fabric
Examples
OmniHuman 1.5 (Multi-Character)
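bash
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'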
Supports specifying which character to drive in multi-person images.
Fabric 1.0 (Image Talks)
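bash
infsh app run falai/fabric-1-0 --input '{
  "image_url": "https://face.jpg",
  "audio_url": "https://audio.mp3"
}'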
PixVerse Lipsync
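bash
infsh app run falai/pixverse-lipsync --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'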
Generates highly realistic lipsync from any audio.
Full Workflow: TTS + Avatar
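bash
# 1. Generate speech from text
infsh app run infsh/kokoro-tts --input '{
  "text": "Welcome to our product demo. Today I will show you..."
}' > speech.json
# 2. Create an avatar video with the generated speech
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://presenter-photo.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'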
Full Workflow: Dub Video in Another Language
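bash
# 1. Transcribe the original video
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://video.mp4"}' > transcript.json
# 2. Translate the text (manually or with an LLM)
# 3. Generate speech in the new language
infsh app run infsh/kokoro-tts --input '{"text": "<translated-text>"}' > new_speech.json
# 4. Lipsync the original video to the new audio
infsh app run infsh/latentsync-1-6 --input '{
  "video_url": "https://original-video.mp4",
  "audio_url": "<new-audio-url>"
}'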
Use Cases
- Marketing: Product demos with AI presenter
- Education: Course videos, explainers
- Localization: Dub content in multiple languages
- Social Media: Consistent virtual influencer
- Corporate: Training videos, announcements
Tips
- Use high-quality portrait photos (front-facing, good lighting)
- Audio should be clear with minimal background noise
- OmniHuman 1.5 supports multiple people in one image
- LatentSync is best for syncing existing videos to new audio
Related Skills
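bash
# Full platform skill (150+ apps)
npx skills add inference-sh/skills@inference-sh
# Text-to-speech (generate audio for avatars)
npx skills add inference-sh/skills@text-to-speech
# Speech-to-text (transcribe for dubbing)
npx skills add inference-sh/skills@speech-to-text
# Video generation
npx skills add inference-sh/skills@ai-video-generation
# Image generation (create avatar images)
npx skills add inference-sh/skills@ai-image-generation
Browse all video apps: infsh app list --category video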
Documentation
AI Avatar & Talking Head Videos
Create AI avatars and talking head videos via inference.sh CLI.

Quick Start
bash
curl -fsSL https://cli.inference.sh | sh && infsh login
# Create an avatar video from an image + audio
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
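Install note: The install script only detects your OS/architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. No elevated permissions or background processes. Manual install & verification available.
Available Models
| Model | App ID | Best For |
|---|---|---|
| OmniHuman 1.5 | bytedance/omnihuman-1-5 | Multi-character, best quality |
| OmniHuman 1.0 | bytedance/omnihuman-1-0 | Single character |
| Fabric 1.0 | falai/fabric-1-0 | Image talks with lipsync |
| PixVerse Lipsync | falai/pixverse-lipsync | Highly realistic |
Search Avatar Apps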
bash
infsh app list --search omnihuman
infsh app list --search lipsync
infsh app list --search fabric
Examples
OmniHuman 1.5 (Multi-Character)
bash
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
Supports specifying which character to drive in multi-person images.
Fabric 1.0 (Image Talks)
bash
infsh app run falai/fabric-1-0 --input '{
  "image_url": "https://face.jpg",
  "audio_url": "https://audio.mp3"
}'
PixVerse Lipsync
bash
infsh app run falai/pixverse-lipsync --input '{
  "image_url": "https://portrait.jpg",
  "audio_url": "https://speech.mp3"
}'
Generates highly realistic lipsync from any audio.
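The three examples above pass the same two fields, image_url and audio_url. When those URLs live in shell variables, writing the JSON by hand invites quoting mistakes. The following is a minimal sketch of a payload builder; build_avatar_input is a hypothetical helper name, not part of the infsh CLI, and it assumes the URLs contain no double quotes or backslashes.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of infsh): prints the JSON payload
# used by the avatar examples above.
# Assumption: neither URL contains a double quote or a backslash.
build_avatar_input() {
  local image_url="$1" audio_url="$2"
  printf '{"image_url": "%s", "audio_url": "%s"}\n' "$image_url" "$audio_url"
}

# Usage (requires the infsh CLI and a logged-in session):
#   infsh app run bytedance/omnihuman-1-5 \
#     --input "$(build_avatar_input "$IMAGE_URL" "$AUDIO_URL")"
```

Swapping the app ID for falai/fabric-1-0 or falai/pixverse-lipsync should reuse the same payload, since the examples above show identical fields for all three apps.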
Full Workflow: TTS + Avatar
bash
# 1. Generate speech from text
infsh app run infsh/kokoro-tts --input '{
  "text": "Welcome to our product demo. Today I will show you..."
}' > speech.json
# 2. Create an avatar video with the generated speech
infsh app run bytedance/omnihuman-1-5 --input '{
  "image_url": "https://presenter-photo.jpg",
  "audio_url": "<audio-url-from-step-1>"
}'
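Step 2 needs the audio URL produced by step 1, but the output schema of infsh/kokoro-tts is not shown here. The sketch below therefore assumes nothing about field names and simply pulls the first http(s) URL out of the saved file. first_url is a hypothetical helper, and it assumes the JSON does not escape slashes (as in `https:\/\/`).

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first http(s) URL found in a file.
# Assumptions: the file is the JSON saved in step 1, and URLs are
# written with plain slashes, ending at a quote or whitespace.
first_url() {
  grep -Eo 'https?://[^"[:space:]]+' "$1" | head -n 1
}

# Usage (step 2 of the workflow above):
#   audio_url="$(first_url speech.json)"
#   infsh app run bytedance/omnihuman-1-5 --input \
#     "{\"image_url\": \"https://presenter-photo.jpg\", \"audio_url\": \"$audio_url\"}"
```

If the output turns out to contain several URLs, selecting the right field with a JSON-aware tool would be more robust than taking the first match.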
Full Workflow: Dub Video in Another Language
bash
# 1. Transcribe the original video
infsh app run infsh/fast-whisper-large-v3 --input '{"audio_url": "https://video.mp4"}' > transcript.json
# 2. Translate the text (manually or with an LLM)
# 3. Generate speech in the new language
infsh app run infsh/kokoro-tts --input '{"text": "<translated-text>"}' > new_speech.json
# 4. Lipsync the original video to the new audio
infsh app run infsh/latentsync-1-6 --input '{
  "video_url": "https://original-video.mp4",
  "audio_url": "<new-audio-url>"
}'
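In step 3 the translated text is embedded in a JSON payload, so any double quote, backslash, or line break in the translation would produce invalid JSON. A minimal sketch of the escaping, assuming those three cases are the only ones that occur; json_string and tts_input are hypothetical helpers, and other control characters such as tabs are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: wrap text in a JSON string literal.
# Handles backslash, double quote, and newline only.
json_string() {
  local s="$1"
  s=${s//\\/\\\\}     # backslash    -> \\
  s=${s//\"/\\\"}     # double quote -> \"
  s=${s//$'\n'/\\n}   # newline      -> \n
  printf '"%s"' "$s"
}

# Hypothetical helper: build the payload with the "text" field
# used in step 3 above.
tts_input() {
  printf '{"text": %s}\n' "$(json_string "$1")"
}

# Usage (step 3 of the workflow above):
#   infsh app run infsh/kokoro-tts --input "$(tts_input "$translated")" > new_speech.json
```

The backslash replacement runs first so that the backslashes introduced by the later two replacements are not escaped a second time.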
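Use Cases
- Marketing: Product demos with AI presenter
- Education: Course videos, explainers
- Localization: Dub content in multiple languages
- Social Media: Consistent virtual influencer
- Corporate: Training videos, announcements
Tips
- Use high-quality portrait photos (front-facing, good lighting)
- Audio should be clear with minimal background noise
- OmniHuman 1.5 supports multiple people in one image
- LatentSync is best for syncing existing videos to new audio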
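The last two tips amount to a simple routing rule: existing video goes to LatentSync, a still image goes to an avatar model. One way to sketch that rule is to dispatch on the file extension of the source URL. pick_avatar_app is a hypothetical helper, and both the extension list and the choice of OmniHuman 1.5 as the image default are assumptions for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose an app ID from the source media URL.
# Assumption: .mp4/.mov/.webm mean "existing video"; anything else
# is treated as a still image.
pick_avatar_app() {
  local url="${1%%[?]*}"   # drop any query string
  url="$(printf '%s' "$url" | tr '[:upper:]' '[:lower:]')"
  case "$url" in
    *.mp4|*.mov|*.webm) echo "infsh/latentsync-1-6" ;;
    *)                  echo "bytedance/omnihuman-1-5" ;;
  esac
}

# Usage:
#   app="$(pick_avatar_app "$SOURCE_URL")"
```

Note that the two apps take different input fields in the examples above (video_url for LatentSync, image_url for OmniHuman), so the payload has to be built to match the chosen app.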
Related Skills
bash
# Full platform skill (150+ apps)
npx skills add inference-sh/skills@inference-sh
# Text-to-speech (generate audio for avatars)
npx skills add inference-sh/skills@text-to-speech
# Speech-to-text (transcribe for dubbing)
npx skills add inference-sh/skills@speech-to-text
# Video generation
npx skills add inference-sh/skills@ai-video-generation
# Image generation (create avatar images)
npx skills add inference-sh/skills@ai-image-generation
Browse all video apps: infsh app list --category video
Documentation