Avatar Video Generation Skill
This skill generates videos with Flyworks (a.k.a. HiFly 飞影数字人) digital humans. Available features:
1. Public Avatar Video: Create a video from text or audio using pre-made, highly realistic avatars.
2. Talking Photo: Create a "talking photo" video from a single image and text or audio.
3. Voice Cloning: Clone a voice from an audio sample for use in TTS.
For detailed documentation, see the references/ folder.
API Token & Limitations
This skill works with a default free-tier token, but it has limitations:
- Watermark: Generated videos carry a watermark.
- Duration limit: Videos are limited to 30 seconds.
To remove limitations:
1. Register at hifly.cc or flyworks.ai.
2. Get your API key from User Settings.
3. Set the environment variable: `export HIFLY_API_TOKEN="your_token_here"`
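A wrapper script may want to know up front whether the free-tier limits apply. The following is a minimal sketch, assuming the variable is named HIFLY_API_TOKEN (reconstructed from this setup step) and that an unset variable means the default free-tier token is used; `resolve_token` and `describe_limits` are hypothetical helpers, not part of the skill.

```python
import os
from typing import Mapping, Optional

# Assumed variable name, reconstructed from the setup step above.
TOKEN_ENV_VAR = "HIFLY_API_TOKEN"


def resolve_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the user's API token, or None when the free-tier default applies."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    return token or None


def describe_limits(token: Optional[str]) -> str:
    """Summarize which limits to expect, based on the section above."""
    if token is None:
        return "free tier: watermark, videos limited to 30 seconds"
    return "personal token: free-tier limits should not apply"


if __name__ == "__main__":
    print(describe_limits(resolve_token()))
```

Passing the environment as a parameter keeps the lookup easy to exercise without touching the real process environment.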
Tools
scripts/hifly_client.py
The main entry point for all operations.
Usage
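```bash
# List available public avatars
python scripts/hifly_client.py list_public_avatars

# List available public voices
python scripts/hifly_client.py list_public_voices

# Create a video with a public avatar (TTS)
python scripts/hifly_client.py create_video --type tts --text "Hello world" --avatar "avatar_id_or_alias" --voice "voice_id_or_alias"

# Create a video with a public avatar (audio URL or file)
python scripts/hifly_client.py create_video --audio "https://... or path/to/audio.mp3" --avatar "avatar_id_or_alias"

# Create a talking photo from the bundled assets
python scripts/hifly_client.py create_talking_photo --image assets/avatar.png --title "Bundled Avatar"

# Clone a voice from the bundled assets
python scripts/hifly_client.py clone_voice --audio assets/voice.MP3 --title "Bundled Voice"

# Check the status of a generation task
python scripts/hifly_client.py check_task --id "TASK_ID"

# Manage local aliases (saved in memory.json)
python scripts/hifly_client.py manage_memory add my_avatar "av_12345"
python scripts/hifly_client.py manage_memory list
```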
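When the client is driven from another Python program rather than typed at a shell, building the argument list explicitly avoids quoting mistakes in the text. This is a sketch only: the subcommand and flag names (`create_video`, `check_task`, `--type`, `--text`, `--avatar`, `--voice`, `--id`) are taken from this document, while the helper functions themselves are hypothetical.

```python
import shlex
import sys
from typing import List

CLIENT = "scripts/hifly_client.py"


def build_create_video_cmd(text: str, avatar: str, voice: str) -> List[str]:
    """Build the argv for a TTS create_video call."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return [sys.executable, CLIENT, "create_video", "--type", "tts",
            "--text", text, "--avatar", avatar, "--voice", voice]


def build_check_task_cmd(task_id: str) -> List[str]:
    """Build the argv for a check_task call."""
    return [sys.executable, CLIENT, "check_task", "--id", task_id]


if __name__ == "__main__":
    cmd = build_create_video_cmd("Hello world", "avatar_id_or_alias", "voice_id_or_alias")
    # Print a copy-pasteable shell line; to execute it, pass the list
    # to subprocess.run(cmd, check=True) instead.
    print(shlex.join(cmd))
```

Keeping the command as a list means text containing spaces or quotes is passed through as a single argument.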
Examples
1. Create a simple greeting video
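```bash
# First, find a voice and an avatar
python scripts/hifly_client.py list_public_avatars
python scripts/hifly_client.py list_public_voices

# Generate
python scripts/hifly_client.py create_video --type tts --text "Welcome to our service." --avatar "av_public_01" --voice "voice_public_01"
```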
2. Use a custom talking photo
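```bash
# Create an avatar from an image URL
python scripts/hifly_client.py create_talking_photo --image "https://mysite.com/photo.jpg" --title "CEO Photo"
# The output provides an avatar ID, e.g. av_custom_99

# Save it to memory
python scripts/hifly_client.py manage_memory add ceo av_custom_99

# Generate a video with the new avatar
python scripts/hifly_client.py create_video --type tts --text "Here is the quarterly report." --avatar ceo --voice "voice_public_01"
```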
Agent Behavior Guidelines
When assisting users with video generation, follow these guidelines:
Voice Selection Required
Video generation requires both text AND a voice. If the user provides text but no voice:
1. Check local memory first: run `manage_memory list` to see whether the user has saved any voice aliases.
2. Ask the user to choose:
   - "I see you want to create a video with the text '[text]'. Which voice would you like to use?"
   - If they have saved voices: "You have these saved voices: [list]. Or would you prefer a public voice?"
   - If there are no saved voices: "Would you like to use a public voice, or clone your own voice from an audio sample first?"
3. Help them select:
   - To see public voices: `list_public_voices`
   - To clone a voice: `clone_voice --audio [file] --title [name]`
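The decision above is small enough to express as a pure function. The sketch below is one possible reading of the guideline, not the skill's actual logic; `voice_prompt` is a hypothetical helper, and `saved_voices` stands for whatever aliases `manage_memory list` reports.

```python
from typing import List, Optional


def voice_prompt(text: str, voice: Optional[str], saved_voices: List[str]) -> Optional[str]:
    """Return the question to ask the user, or None when a voice is already chosen."""
    if voice:
        return None
    question = (f"I see you want to create a video with the text '{text}'. "
                "Which voice would you like to use?")
    if saved_voices:
        follow_up = (f"You have these saved voices: {', '.join(saved_voices)}. "
                     "Or would you prefer a public voice?")
    else:
        follow_up = ("Would you like to use a public voice, or clone your own "
                     "voice from an audio sample first?")
    return f"{question} {follow_up}"


if __name__ == "__main__":
    print(voice_prompt("this is my AI twin", None, []))
```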
Complete Workflow Example
For a prompt like "Create a talking photo video from my photo saying 'this is my AI twin'":
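1. Ask: "Which voice would you like for your AI twin? You can use a public voice or clone your own."
2. If they want to clone: help them with `clone_voice`.
3. Create the talking photo with both text and voice:

```bash
python scripts/hifly_client.py create_talking_photo \
  --image user_photo.jpg \
  --text "this is my AI twin" \
  --voice SELECTED_VOICE_ID \
  --title "My AI Twin"
```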
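The workflow only reaches the final step once a voice is known, and that precondition can be enforced in code. A minimal sketch, assuming the `create_talking_photo` flags named in this document (`--image`, `--text`, `--voice`, `--title`); the guard function is hypothetical.

```python
import sys
from typing import List, Optional


def build_talking_photo_cmd(image: str, text: str, voice: Optional[str], title: str) -> List[str]:
    """Build the create_talking_photo argv, refusing to proceed without a voice."""
    if not voice:
        raise ValueError("no voice selected: ask the user to pick or clone one first")
    return [sys.executable, "scripts/hifly_client.py", "create_talking_photo",
            "--image", image, "--text", text, "--voice", voice, "--title", title]


if __name__ == "__main__":
    print(build_talking_photo_cmd("user_photo.jpg", "this is my AI twin",
                                  "SELECTED_VOICE_ID", "My AI Twin"))
```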
Saving for Later
After creating avatars or cloning voices, offer to save them:
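```bash
python scripts/hifly_client.py manage_memory add my_avatar AVATAR_ID --kind avatar
python scripts/hifly_client.py manage_memory add my_voice VOICE_ID --kind voice
```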
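To illustrate what saving an alias buys the user, here is a sketch of alias resolution. The layout of memory.json is not described in this document, so the nested dictionary below (kind, then alias, then ID) is purely an assumption, and `add_alias` and `resolve` are hypothetical names.

```python
import json
from typing import Dict

# Hypothetical in-memory layout: {"avatar": {"ceo": "av_custom_99"}, "voice": {...}}
Memory = Dict[str, Dict[str, str]]


def add_alias(memory: Memory, alias: str, item_id: str, kind: str) -> None:
    """Store alias -> item_id under the given kind ("avatar" or "voice")."""
    memory.setdefault(kind, {})[alias] = item_id


def resolve(memory: Memory, value: str, kind: str) -> str:
    """Return the ID saved for an alias, or the value unchanged if it is not an alias."""
    return memory.get(kind, {}).get(value, value)


if __name__ == "__main__":
    memory: Memory = {}
    add_alias(memory, "my_avatar", "AVATAR_ID", "avatar")
    add_alias(memory, "my_voice", "VOICE_ID", "voice")
    print(json.dumps(memory, indent=2))
```

Falling back to the raw value lets the same argument accept either an alias or a literal ID.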
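Avatar Video Generation Skill
This skill generates videos with Flyworks (a.k.a. HiFly 飞影数字人) digital humans. Available features:
1. Public Avatar Video: Create a video from text or audio using pre-made, highly realistic avatars.
2. Talking Photo: Create a talking photo video from a single image and text or audio.
3. Voice Cloning: Clone a voice from an audio sample for use in TTS.
For detailed documentation, see the references/ folder.
API Token & Limitations
This skill works with a default free-tier token, which has the following limitations:
- Watermark: Generated videos carry a watermark.
- Duration limit: Videos are limited to 30 seconds.
To remove the limitations:
1. Register at hifly.cc or flyworks.ai.
2. Get your API key from User Settings.
3. Set the environment variable: `export HIFLY_API_TOKEN="your_token_here"`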
Tools
scripts/hifly_client.py
The main entry point for all operations.
Usage
```bash
# List available public avatars
python scripts/hifly_client.py list_public_avatars

# List available public voices
python scripts/hifly_client.py list_public_voices

# Create a video with a public avatar (TTS)
python scripts/hifly_client.py create_video --type tts --text "Hello world" --avatar "avatar_id_or_alias" --voice "voice_id_or_alias"

# Create a video with a public avatar (audio URL or file)
python scripts/hifly_client.py create_video --audio "https://... or path/to/audio.mp3" --avatar "avatar_id_or_alias"

# Create a talking photo from the bundled assets
python scripts/hifly_client.py create_talking_photo --image assets/avatar.png --title "Bundled Avatar"

# Clone a voice from the bundled assets
python scripts/hifly_client.py clone_voice --audio assets/voice.MP3 --title "Bundled Voice"

# Check the status of a generation task
python scripts/hifly_client.py check_task --id "TASK_ID"

# Manage local aliases (saved in memory.json)
python scripts/hifly_client.py manage_memory add my_avatar "av_12345"
python scripts/hifly_client.py manage_memory list
```
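Because generation runs as a task whose status is checked with `check_task`, a caller will usually poll until the task settles. The sketch below shows a generic polling loop only; the status strings ("done", "failed") and the `get_status` callable are assumptions, since the output format of `check_task` is not described here.

```python
import time
from typing import Callable, Collection


def wait_for_task(get_status: Callable[[], str],
                  done_states: Collection[str] = ("done", "failed"),
                  interval_s: float = 5.0,
                  max_attempts: int = 60,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Call get_status until it returns a terminal state or attempts run out."""
    for attempt in range(max_attempts):
        status = get_status()
        if status in done_states:
            return status
        # Wait between checks, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task still pending after {max_attempts} checks")


if __name__ == "__main__":
    # Stand-in status source; a real one would run check_task and parse its output.
    fake = iter(["pending", "processing", "done"])
    print(wait_for_task(lambda: next(fake), interval_s=0.0))
```

Injecting `sleep` and `get_status` keeps the loop independent of how the status is actually obtained.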
Examples
1. Create a simple greeting video
```bash
# First, find a voice and an avatar
python scripts/hifly_client.py list_public_avatars
python scripts/hifly_client.py list_public_voices

# Generate
python scripts/hifly_client.py create_video --type tts --text "Welcome to our service." --avatar "av_public_01" --voice "voice_public_01"
```
2. Use a custom talking photo
```bash
# Create an avatar from an image URL
python scripts/hifly_client.py create_talking_photo --image "https://mysite.com/photo.jpg" --title "CEO Photo"
# The output provides an avatar ID, e.g. av_custom_99

# Save it to memory
python scripts/hifly_client.py manage_memory add ceo av_custom_99

# Generate a video with the new avatar
python scripts/hifly_client.py create_video --type tts --text "Here is the quarterly report." --avatar ceo --voice "voice_public_01"
```
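Agent Behavior Guidelines
When assisting users with video generation, follow these guidelines:
Voice Selection Required
Video generation requires both text and a voice. If the user provides text but no voice:
1. Check local memory first: run `manage_memory list` to see whether the user has saved any voice aliases.
2. Ask the user to choose:
   - "I see you want to create a video with the text '[text]'. Which voice would you like to use?"
   - If they have saved voices: "You have these saved voices: [list]. Or would you prefer a public voice?"
   - If there are no saved voices: "Would you like to use a public voice, or clone your own voice from an audio sample first?"
3. Help them select:
   - To see public voices: `list_public_voices`
   - To clone a voice: `clone_voice --audio [file] --title [name]`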
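Complete Workflow Example
For a prompt like "Create a talking photo video from my photo saying 'this is my AI twin'":
1. Ask: "Which voice would you like for your AI twin? You can use a public voice or clone your own."
2. If they want to clone: help them with `clone_voice`.
3. Create the talking photo with both text and voice:
```bash
python scripts/hifly_client.py create_talking_photo \
  --image user_photo.jpg \
  --text "this is my AI twin" \
  --voice SELECTED_VOICE_ID \
  --title "My AI Twin"
```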
Saving for Later
After creating avatars or cloning voices, offer to save them:
```bash
python scripts/hifly_client.py manage_memory add my_avatar AVATAR_ID --kind avatar
python scripts/hifly_client.py manage_memory add my_voice VOICE_ID --kind voice
```