Vivago AI Skill
Integration with Vivago AI (智小象) platform for AI-powered image and video generation.
Supported Features
Image Generation
- Text to Image (txt2img): Generate images from text descriptions
- Image to Image (img2img): Transform existing images based on prompts, including style transfer, image editing, and multi-image fusion
Video Generation
- Text to Video (txt2vid): Generate videos from text descriptions
- Image to Video (img2vid): Generate videos from static images
- Keyframe to Video (keyframe_to_video): Generate transition videos from start and end keyframes
- Video Templates (template_to_video): 181 pre-defined video effects
- Supports multiple model versions (v3Pro, v3L, kling-video-o1)
Additional Features
- Image upload to Vivago storage
- Batch generation (up to 4 images)
- Multiple aspect ratios (1:1, 4:3, 3:4, 16:9, 9:16)
- Automatic retry with polling
Architecture
Core Modules
CODEBLOCK0
Code Quality
- Type Safety: Complete type annotations and enums
- Exception Handling: Structured exception hierarchy
- CI/CD: GitHub Actions for automated testing
- Modular Config: Split configuration files for maintainability
Setup
Prerequisites
Before using this skill, you need to obtain a Vivago.ai API Token:
Step 1: Login to Vivago.ai
1. Visit https://vivago.ai/ and log in to your account
2. Check your remaining credits and consider subscribing to a suitable plan if needed
Step 2: Obtain Your Token
1. After logging in, visit https://vivago.ai/prod-api/user/token
2. The page will return your API Token (in JWT format)
3. Copy this Token for configuration
Security Note: The Token is your credential for accessing the API. Please keep it secure and do not share it with others.
Environment Variables
Security Note: For secure deployments and AI Agents, the system requires the token to be passed strictly via the HIDREAM_AUTHORIZATION environment variable.
Export it securely in your current session:
CODEBLOCK1
Note: STORAGE_AK and STORAGE_SK have been deprecated and removed. Image upload now uses secure pre-signed URLs provided by the Vivago API.
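As a minimal sketch of how a script might read the token, assuming nothing beyond the HIDREAM_AUTHORIZATION variable named above. The helper name `load_token` is hypothetical and is not part of the skill's API:

```python
import os


def load_token(env=None):
    """Return the API token from HIDREAM_AUTHORIZATION, or raise if it is missing.

    `load_token` is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    token = env.get("HIDREAM_AUTHORIZATION", "").strip()
    if not token:
        raise RuntimeError(
            "HIDREAM_AUTHORIZATION is not set; export it before running the scripts"
        )
    return token
```

Failing early with a clear message avoids sending unauthenticated requests and keeps the token out of command-line arguments and config files.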
File Output Configuration
Important: By default, all generated resources (JSON results, downloaded images, and videos) will be output to the assets/ directory within the current working folder. Ensure this directory exists or the system has permission to create it.
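If you want to make sure the output directory exists before running a script, something like the following would work. This is a sketch using only the standard library; `ensure_output_dir` is a hypothetical helper, and the skill's own scripts may already create the directory themselves:

```python
from pathlib import Path


def ensure_output_dir(base=None, name="assets"):
    """Create <base>/<name> if needed and return its path.

    `base` defaults to the current working directory, matching the
    default output location described above.
    """
    root = Path.cwd() if base is None else Path(base)
    out_dir = root / name
    # exist_ok=True makes the call safe to repeat.
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir
```

A permission problem surfaces here as a `PermissionError` before any generation credits are spent.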
Installation
CODEBLOCK2
Usage
Python API
CODEBLOCK3
Error Handling
CODEBLOCK4
Command Line (Best for AI Agents)
For AI Agents: The easiest way to use this skill is through the provided CLI scripts. They automatically handle API communication, polling, and result parsing. By default, they use HiDream's native models.
Text to Image:
python3 scripts/txt2img.py \
--prompt "a futuristic city" \
--wh-ratio 16:9 \
--batch-size 2 \
--output ./assets/results.json
Note: This defaults to the hidream-txt2img model.
Text to Video:
python3 scripts/txt2vid.py \
--prompt "a cybernetic dragon flying over a futuristic city" \
--wh-ratio 16:9 \
--duration 5 \
--output ./assets/video_results.json
Note: This defaults to the v3Pro model.
Image to Video:
CODEBLOCK7
API Reference
Enums
CODEBLOCK8
Models
| Feature | Available Versions | Default |
|---|---|---|
| Text to Image | v3L (HiDream), kling-image-o1 | v3L (via port hidream-txt2img) |
| Image to Video | v3Pro, v3L, kling-video-o1 | v3Pro |
| Keyframe to Video | v3Pro, v3L | v3Pro |
Note for AI Agents: By default, all CLI tools (txt2img.py, txt2vid.py) are pre-configured to use HiDream's native models (hidream-txt2img for images, v3Pro for videos). You don't need to specify the model unless explicitly requested by the user.
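The "use the default unless the user asks otherwise" rule can be sketched as a small lookup. The dictionary keys and the `resolve_model` function are illustrative names, not identifiers from the skill; the model strings are taken from the table and CLI notes in this document:

```python
# Defaults as stated in this document; the keys are illustrative labels.
DEFAULT_MODELS = {
    "text_to_image": "hidream-txt2img",
    "text_to_video": "v3Pro",
    "image_to_video": "v3Pro",
    "keyframe_to_video": "v3Pro",
}

# Versions listed in the Models table for the video features.
AVAILABLE_MODELS = {
    "image_to_video": {"v3Pro", "v3L", "kling-video-o1"},
    "keyframe_to_video": {"v3Pro", "v3L"},
}


def resolve_model(feature, requested=None):
    """Return the requested model if it is listed, else the feature's default."""
    if requested is None:
        return DEFAULT_MODELS[feature]
    allowed = AVAILABLE_MODELS.get(feature)
    if allowed is not None and requested not in allowed:
        raise ValueError(f"{requested!r} is not listed for {feature}")
    return requested
```

For example, `resolve_model("image_to_video")` yields `"v3Pro"`, while an explicit `requested="v3L"` is passed through unchanged.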
Aspect Ratios
- 1:1 - Square
- 4:3 - Standard
- 3:4 - Portrait
- 16:9 - Widescreen
- 9:16 - Mobile/Vertical
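The five supported ratios (1:1, 4:3, 3:4, 16:9, 9:16) can be validated before a request is sent. The following is a sketch with hypothetical helper names; the skill itself exposes ratios through its `AspectRatio` enum rather than these functions:

```python
SUPPORTED_RATIOS = ("1:1", "4:3", "3:4", "16:9", "9:16")


def parse_ratio(value):
    """Validate a 'W:H' string and return it as an integer pair."""
    if value not in SUPPORTED_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {value!r}")
    width, height = (int(part) for part in value.split(":"))
    return width, height


def orientation(value):
    """Classify a supported ratio as square, landscape, or portrait."""
    width, height = parse_ratio(value)
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"
```

Rejecting an unsupported value such as `"21:9"` locally is cheaper than discovering the problem after a task has been submitted.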
Task Status Codes
CODEBLOCK9
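To illustrate how status codes drive polling, here is a self-contained sketch. It assumes the numeric values 0 to 4 for PENDING, COMPLETED, PROCESSING, FAILED, and REJECTED, and defines its own stand-in `TaskStatus` rather than importing the skill's enum. `poll_until_done` and its parameters are hypothetical and do not describe the client's real polling code:

```python
import time
from enum import IntEnum


class TaskStatus(IntEnum):
    """Stand-in mirror of the documented status codes (assumed values)."""
    PENDING = 0
    COMPLETED = 1
    PROCESSING = 2
    FAILED = 3
    REJECTED = 4


# Statuses after which polling should stop.
TERMINAL = {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.REJECTED}


def poll_until_done(fetch_status, interval=2.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or attempts run out."""
    for _ in range(max_attempts):
        status = TaskStatus(fetch_status())
        if status in TERMINAL:
            return status
        sleep(interval)
    raise TimeoutError(f"task still running after {max_attempts} polls")
```

Note that COMPLETED is 1 and PROCESSING is 2, so a check such as `status >= 1` would wrongly treat a running task as finished; comparing against an explicit set of terminal states avoids that.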
File Structure
CODEBLOCK10
Important Notes
Feishu Channel Messaging Guidelines
When sending generated content through Feishu (飞书) channel:
| Content Type | Send Method | Example |
|---|---|---|
| Images | ✅ Direct file upload | Attach image file directly |
| Videos | ❌ Must send as link | https://media.vivago.ai/{video_uuid} |
⚠️ Critical: Videos CANNOT be sent as file attachments in Feishu. Always construct and send the direct media URL:
CODEBLOCK11
Why: Feishu does not support playable video attachments. Sending video files directly will result in delivery failure or unplayable content.
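A small helper can build the link from a video UUID. This is a sketch that assumes the URL pattern `https://media.vivago.ai/{video_uuid}` shown in the table above; `video_link` is a hypothetical name and the validation rule is an illustrative choice:

```python
import re

MEDIA_BASE_URL = "https://media.vivago.ai"


def video_link(video_uuid):
    """Build the shareable media URL for a generated video."""
    video_uuid = video_uuid.strip()
    # Reject empty values and anything that would alter the URL path.
    if not video_uuid or re.search(r"[\s/]", video_uuid):
        raise ValueError(
            "video_uuid must be a non-empty identifier without spaces or slashes"
        )
    return f"{MEDIA_BASE_URL}/{video_uuid}"
```

The resulting string is what should be placed in the Feishu message body in place of a file attachment.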
Image Download
Images can be downloaded using the correct URL format:
CODEBLOCK12
Example:
CODEBLOCK13
Sending via Feishu:
CODEBLOCK14
Asynchronous Processing
- API calls are asynchronous with automatic polling
- Images are automatically resized to max 1024px on longest side before upload
- Video generation supports 5 or 10 second durations
- Batch size for images: 1-4, for videos: 1
- All API calls include automatic retry logic
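The limits in this list can be expressed as simple checks. The sketch below shows the arithmetic behind "max 1024px on longest side" and the batch and duration limits; both function names are hypothetical, and the client's actual resizing code may round differently:

```python
def fit_within(width, height, max_side=1024):
    """Scale (width, height) down so the longest side is at most max_side.

    Images that already fit are returned unchanged; aspect ratio is kept.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


def validate_request(kind, batch_size=1, duration=None):
    """Check batch size and duration against the documented limits."""
    if kind == "image":
        if not 1 <= batch_size <= 4:
            raise ValueError("image batch_size must be between 1 and 4")
    elif kind == "video":
        if batch_size != 1:
            raise ValueError("video batch_size must be 1")
        if duration not in (5, 10):
            raise ValueError("video duration must be 5 or 10 seconds")
    else:
        raise ValueError(f"unknown kind: {kind!r}")
```

For instance, a 4000x3000 photo is reduced to 1024x768 before upload, while an 800x600 image is left as it is.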
Error Handling
The client handles common errors:
- Network timeouts (with retry)
- Rate limiting (with exponential backoff)
- Invalid parameters (validation before API call)
- Task failures (structured exceptions)
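Retry with exponential backoff, as mentioned in the list above, generally looks like the following. This is a generic sketch, not the client's implementation: the function name, the default delays, the jitter factor, and the choice of retryable exceptions are all assumptions:

```python
import random
import time


def retry_with_backoff(call, retries=4, base_delay=1.0, max_delay=30.0,
                       retry_on=(TimeoutError, ConnectionError),
                       sleep=time.sleep):
    """Run call(), retrying on selected errors with doubling delays."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # Out of retries: surface the last error.
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 10% jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The jitter spreads out retries from concurrent callers so they do not hit a rate-limited endpoint at the same instant.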
Exception Hierarchy
CODEBLOCK15
Video Templates Reference
The following 181 video templates are available via template_to_video():
Quick Categories
| Category | Count | Example Templates |
|---|---|---|
| Style Transfer | 20+ | ghibli, 1930s-2000s vintage styles |
| Harry Potter | 4 | magic_reveal_ravenclaw, gryffindor, hufflepuff, slytherin |
| Wings/Fantasy | 10+ | angel_wings, phoenix_wings, crystal_wings, fire_wings |
| Superheroes | 5+ | iron_man, cat_woman, ghost_rider |
| Dance | 10+ | apt, dadada, dance, limbo_dance |
| Effects | 15+ | ash_out, metallic_liquid, flash_flood |
| Thanksgiving | 10+ | turkey_chasing, autumn_feast, gratitude_photo |
| Comics/Cartoon | 8+ | gta_star, anime_figure, bring_comics_to_life |
| Products | 8+ | glasses_display, music_box, food_product_display |
| Scenes | 20+ | romantic_kiss, graduation, starship_chef |
Popular Templates
| Template ID | Description |
|---|---|
| INLINECODE24 / INLINECODE25 | Studio Ghibli animation style |
| magic_reveal_ravenclaw | Harry Potter Ravenclaw transformation |
| magic_reveal_gryffindor | Harry Potter Gryffindor transformation |
| magic_reveal_hufflepuff | Harry Potter Hufflepuff transformation |
| magic_reveal_slytherin | Harry Potter Slytherin transformation |
| iron_man | Iron Man armor assembly |
| angel_wings / phoenix_wings / crystal_wings / fire_wings | Wing transformations |
| cat_woman | Cat Woman style |
| ghost_rider | Ghost Rider flaming skull |
| joker | Joker villain style |
| mermaid | Mermaid underwater scene |
| snow_white | Snow White princess |
| barbie | Barbie princess transformation |
| me_in_hand | Miniature figure in hand |
| music_box | Rotating figure on music box |
| anime_figure | Transform into anime figure |
| gta_star | GTA game style transformation |
| apt / dadada / dance | Dance templates |
| ash_out | Disintegrate into ashes |
| eye_of_the_storm | Thunder god awakening |
| metallic_liquid | Metal mask transformation |
| flash_flood | Water/flood effect |
| turkey_chasing / turkey_away / turkey_giant | Thanksgiving turkey scenes |
| autumn_feast / autumn_stroll | Autumn scenes |
| renovation_of_old_photos | Colorize B&W photos |
| graduation | Graduation ceremony |
| glasses / glasses_display | Glasses/eyewear showcase |
| bikini / sexy_man / sexy_pants | Fashion/beach |
| romantic_kiss / boyfriends_rose / girlfriends_rose | Romantic scenes |
| ai_archaeologist / starship_chef / cyber_cooker | Sci-fi characters |
| jungle_reign / panther_queen / roar_of_the_dustlands / tiger_snuggle | Animal companions |
| instant_sadness / headphone_vibe / relax | Emotion/reaction |
| frost_alert | Cold/freeze effect |
| bald_me | Bald transformation |
| boom_hair / curl_pop / long_hair | Hair transformations |
| muscles | Muscle transformation |
| face_punch / gun_point | Action effects |
| static_shot / tracking_shot / orbit_shot / push_in / zoom_out / handheld_shot | Camera movements |
| earth_zoom_in / earth_zoom_out | Earth zoom effects |
View All Templates
CODEBLOCK16
Usage Example
CODEBLOCK17
Changelog
v0.9.0 (2026-03-09)
- ✅ Code review complete (P0-P3)
- ✅ Added GitHub Actions CI
- ✅ Added type safety module (enums.py)
- ✅ Added structured exceptions (exceptions.py)
- ✅ Split configuration into modular files
- ✅ Archived redundant code and tests
- ✅ Pinned dependency versions
v0.8.2 (2026-03-08)
- ✅ Template testing: 44 templates, 40 passed (90.9%)
- ✅ Fixed metallic_liquid naming issue
- ✅ Marked long_hair as deprecated
v0.8.0 (2026-03-07)
- ✅ Completed Tier 1-4 testing
- ✅ Established smart test optimization system
Vivago AI Skill
Integration with the Vivago AI (智小象) platform for AI-powered image and video generation.
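Supported Features
Image Generation
- Text to Image (txt2img): Generate images from text descriptions
- Image to Image (img2img): Transform existing images based on prompts, including style transfer, image editing, and multi-image fusion
Video Generation
- Text to Video (txt2vid): Generate videos from text descriptions
- Image to Video (img2vid): Generate videos from static images
- Keyframe to Video (keyframe_to_video): Generate transition videos from start and end keyframes
- Video Templates (template_to_video): 181 pre-defined video effects
- Supports multiple model versions (v3Pro, v3L, kling-video-o1)
Additional Features
- Image upload to Vivago storage
- Batch generation (up to 4 images)
- Multiple aspect ratios (1:1, 4:3, 3:4, 16:9, 9:16)
- Automatic retry with polling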
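Architecture
Core Modules
scripts/
├── vivago_client.py # Main API client
├── template_manager.py # Template management
├── config_loader.py # Configuration loading
├── enums.py # Type enums (TaskStatus, AspectRatio, etc.)
├── exceptions.py # Structured exceptions
└── config/ # Modular configuration files
Code Quality
- Type Safety: Complete type annotations and enums
- Exception Handling: Structured exception hierarchy
- CI/CD: GitHub Actions for automated testing
- Modular Config: Split configuration files for maintainability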
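Setup
Prerequisites
Before using this skill, you need to obtain a Vivago.ai API Token:
Step 1: Log in to Vivago.ai
1. Visit https://vivago.ai/ and log in to your account
2. Check your remaining credits and consider subscribing to a suitable plan if needed
Step 2: Obtain Your Token
1. After logging in, visit https://vivago.ai/prod-api/user/token
2. The page will return your API Token (in JWT format)
3. Copy this Token for configuration
Security Note: The Token is your credential for accessing the API. Keep it secure and do not share it with others.
Environment Variables
Security Note: For secure deployments and AI Agents, the system requires the token to be passed strictly via the HIDREAM_AUTHORIZATION environment variable.
Export it securely in your current session:
export HIDREAM_AUTHORIZATION="your_vivago_api_token"
Note: STORAGE_AK and STORAGE_SK have been deprecated and removed. Image upload uses secure pre-signed URLs provided by the Vivago API.
File Output Configuration
Important: By default, all generated resources (JSON results, downloaded images, and videos) are written to the assets/ directory within the current working folder. Ensure this directory exists or that the system has permission to create it.
Installation
pip install -r requirements.txt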
Usage
Python API
from scripts import create_client, VivagoClient
from scripts.enums import AspectRatio, PortName, TaskStatus
from scripts.exceptions import TaskFailedError, TaskTimeoutError

# Create the client
client = create_client()

# Text to image
results = client.text_to_image(
    prompt="a beautiful sunset over mountains",
    port=PortName.KLING_IMAGE,  # or PortName.NANO_BANANA
    wh_ratio=AspectRatio.RATIO_16_9,
    batch_size=2
)

# Image to video (using a local image)
results = client.image_to_video(
    prompt="camera slowly zooming out",
    image_uuid=client.upload_image("/path/to/image.jpg"),
    port=PortName.V3PRO,
    wh_ratio=AspectRatio.RATIO_16_9,
    duration=5
)

# Keyframe to video (using start and end images)
results = client.keyframe_to_video(
    prompt="smooth transition from start to end",
    start_image_uuid=client.upload_image("/path/to/start.jpg"),
    end_image_uuid=client.upload_image("/path/to/end.jpg"),
    port=PortName.V3PRO,
    wh_ratio=AspectRatio.RATIO_16_9,
    duration=5
)

# Video template: apply a pre-defined effect
results = client.template_to_video(
    image_uuid=client.upload_image("/path/to/image.jpg"),
    template="ghibli",  # see the available templates listed in this document
    wh_ratio=AspectRatio.RATIO_9_16
)
Error Handling
from scripts.exceptions import (
    TaskFailedError,
    TaskRejectedError,
    TaskTimeoutError,
    InvalidPortError
)

try:
    results = client.image_to_video(...)
except TaskFailedError as e:
    print(f"Task failed: {e.task_id}")
except TaskRejectedError as e:
    print(f"Content rejected: {e.reason}")
except TaskTimeoutError as e:
    print(f"Timeout after {e.timeout_seconds}s")
except InvalidPortError as e:
    print(f"Invalid port: {e.port}, available: {e.available}")
Command Line (Best for AI Agents)
For AI Agents: The easiest way to use this skill is through the provided CLI scripts. They automatically handle API communication, polling, and result parsing. By default, they use HiDream's native models.
Text to Image:
python3 scripts/txt2img.py \
--prompt "a futuristic city" \
--wh-ratio 16:9 \
--batch-size 2 \
--output ./assets/results.json
Note: This defaults to the hidream-txt2img model.
Text to Video:
python3 scripts/txt2vid.py \
--prompt "a cybernetic dragon flying over a futuristic city" \
--wh-ratio 16:9 \
--duration 5 \
--output ./assets/video_results.json
Note: This defaults to the v3Pro model.
Image to Video:
python3 scripts/img2video.py \
--prompt "slow motion falling leaves" \
--image ./assets/source_image.jpg \
--duration 5 \
--output ./assets/video.json
API Reference
Enums
from scripts.enums import (
    TaskStatus,    # PENDING, COMPLETED, PROCESSING, FAILED, REJECTED
    AspectRatio,   # RATIO_1_1, RATIO_4_3, RATIO_16_9, etc.
    PortCategory,  # TEXT_TO_IMAGE, IMAGE_TO_VIDEO, etc.
    PortName       # KLING_IMAGE, V3PRO, NANO_BANANA, etc.
)
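Models
| Feature | Available Versions | Default |
|---|---|---|
| Text to Image | v3L (HiDream), kling-image-o1 | v3L (via port hidream-txt2img) |
| Image to Video | v3Pro, v3L, kling-video-o1 | v3Pro |
| Keyframe to Video | v3Pro, v3L | v3Pro |
Note for AI Agents: By default, all CLI tools (txt2img.py, txt2vid.py) are pre-configured to use HiDream's native models (hidream-txt2img for images, v3Pro for videos). You don't need to specify the model unless the user explicitly requests it.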
Aspect Ratios
- 1:1 - Square
- 4:3 - Standard
- 3:4 - Portrait
- 16:9 - Widescreen
- 9:16 - Mobile/Vertical
Task Status Codes
from scripts.enums import TaskStatus

TaskStatus.PENDING     # 0 - Pending
TaskStatus.COMPLETED   # 1 - Completed
TaskStatus.PROCESSING  # 2 - Processing
TaskStatus.FAILED      # 3 - Failed
TaskStatus.REJECTED    # 4 - Rejected (content moderation)
File Structure
vivago-ai-skill/
├── scripts/
│ ├── __init__.py # Package exports
│ ├── vivago_client.py # Core API client
│