AI Video Remix Skill

This is an instruction-only skill — it provides guidance and reference documentation for the AI Video Remix CLI tool. The runtime source code lives in the GitHub repository and must be cloned separately (see Quick Start below).

Generate styled video compositions from a local ShotAI video library using natural language.

Important: Video Library Requirement

This skill can only search and use videos that have been imported into ShotAI. Videos simply stored on your hard drive are not searchable — they must be added to a ShotAI collection and fully indexed first.

Before using this skill, make sure you have:

1. Opened ShotAI and created a collection
2. Added your video files or folders to the collection
3. Waited for indexing to complete (shot detection + semantic analysis; progress is shown in ShotAI)

If the search returns no results or low-quality matches, the most common reason is that the relevant videos have not been imported into ShotAI yet.

Prerequisites

See references/setup.md for full installation instructions, including:

- ShotAI download and setup
- ffmpeg installation
- yt-dlp installation (for auto music)
- Node.js dependencies

Quick Start

Note: This skill does not bundle runtime code. Clone the source repository first.

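    git clone https://github.com/abu-ShotAI/ai-video-remix.git
    cd ai-video-remix
    npm install
    cp .env.example .env   # fill in SHOTAI_URL, SHOTAI_TOKEN, and optionally AGENT_PROVIDER
    npx tsx src/skill/cli.ts "Make me a travel remix"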

Pipeline (8 steps)

1. Agent: parseIntent - LLM extracts theme, selects composition, optionally overrides music style
2. Agent: refineQueries - LLM rewrites per-slot search terms to match library content
3. ShotAI: pickShots - Semantic search per slot via local ShotAI MCP server (localhost only), best shot selected
4. Music: resolveMusic - Uses local MP3 via --bgm (recommended), or optionally downloads from YouTube via yt-dlp
5. ffmpeg: extractClip - Each shot trimmed to an independent .mp4 clip file (local processing only)
6. Agent: annotateClips - LLM assigns per-clip visual effect params (tone, dramatic, kenBurns, caption)
7. File Server - Localhost-only HTTP server (127.0.0.1) serves clips to the Remotion renderer on the same machine
8. Remotion: render - Composition rendered to final MP4
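
The eight steps above form a linear chain in which each stage consumes the previous stage's output. The following is a minimal sketch of that shape only: the stage names come from the list, while the `Stage` type, `PipelineContext`, `runPipeline`, and the `serveClips` label for the file server step are hypothetical and are not taken from the repository.

```typescript
// Hypothetical sketch: stage names follow the list above, everything else is assumed.
type PipelineContext = { request: string; log: string[] };
type Stage = {
  owner: string;
  name: string;
  run: (ctx: PipelineContext) => PipelineContext;
};

// Placeholder stage body that only records that the stage ran.
const stub = (owner: string, name: string): Stage => ({
  owner,
  name,
  run: (ctx) => ({ ...ctx, log: [...ctx.log, `${owner}: ${name}`] }),
});

const stages: Stage[] = [
  stub("Agent", "parseIntent"),
  stub("Agent", "refineQueries"),
  stub("ShotAI", "pickShots"),
  stub("Music", "resolveMusic"),
  stub("ffmpeg", "extractClip"),
  stub("Agent", "annotateClips"),
  stub("File Server", "serveClips"), // assumed name for the file server step
  stub("Remotion", "render"),
];

// Runs stages in order; each stage receives the previous stage's output.
function runPipeline(request: string, chain: Stage[] = stages): PipelineContext {
  return chain.reduce<PipelineContext>(
    (ctx, stage) => stage.run(ctx),
    { request, log: [] },
  );
}
```

Keeping every stage behind the same signature makes it straightforward to skip or replace one, which is what heuristic mode does for the LLM-backed steps.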

CLI Usage

After cloning the repository and running npm install:

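    npx tsx src/skill/cli.ts <request> [options]

    Options:
      --composition      Override the composition (skips LLM selection)
      --bgm <path>       Local MP3 path (skips YouTube search)
      --output <dir>     Output directory (default: ./output)
      --lang             Output language: zh Chinese (default) / en English
                         Affects: video title, per-clip captions and location labels, attribution line
      --probe            Scan the library first, then let the LLM plan slots from actual content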

Compositions

ID	Label	Best For
CyberpunkCity	赛博朋克夜景 (Cyberpunk Nightscape)	Neon city, night scenes, sci-fi
TravelVlog

Modes

Standard mode (default): LLM picks composition + generates search queries from registry templates.

Probe mode (--probe): Scans library videos first (names, shot samples, mood/scene tags), then LLM generates custom slots tailored to what actually exists.

Choose probe mode when: library content is unknown, user wants "best of my library", or standard slots return low-quality shots.
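
The decision rule above is small enough to write down directly. This is a sketch under assumptions: `ModeHints`, `chooseMode`, and `cliArgs` are hypothetical helpers, and only the `--probe` flag itself comes from this document.

```typescript
type Mode = "standard" | "probe";

// Hypothetical inputs describing the three situations listed above.
interface ModeHints {
  libraryContentKnown: boolean;
  wantsBestOfLibrary: boolean;
  standardSlotsLowQuality: boolean;
}

// Probe mode is chosen when any one of the three conditions holds.
function chooseMode(hints: ModeHints): Mode {
  const needsProbe =
    !hints.libraryContentKnown ||
    hints.wantsBestOfLibrary ||
    hints.standardSlotsLowQuality;
  return needsProbe ? "probe" : "standard";
}

// Builds CLI arguments; --probe is the only flag used here.
function cliArgs(request: string, mode: Mode): string[] {
  return mode === "probe" ? [request, "--probe"] : [request];
}
```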

Environment Variables

See references/config.md for all environment variables and LLM provider setup.

Troubleshooting & Quality Tuning

See references/tuning.md for solutions to:

- Clip boundary flicker / 1–2 frame flash at cuts
- Red flash artifact in CyberpunkCity (GlitchFlicker on short clips)
- Low-quality or off-topic shots
- Music download failures

Recommended .env defaults for best quality:
    MIN_SCORE=0.5   # filter out short/low-quality shots
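
To illustrate how a score threshold such as `MIN_SCORE` could affect shot selection, here is a hedged sketch. It assumes the value is a lower bound on a per-shot relevance score between 0 and 1, which this document does not confirm. The `ShotCandidate` shape and both helper functions are hypothetical.

```typescript
// Hypothetical shot shape; the real ShotAI result fields may differ.
interface ShotCandidate {
  id: string;
  score: number;
}

// Parses MIN_SCORE from an env-like map, defaulting to 0.5 when unset or invalid.
function readMinScore(env: Record<string, string | undefined>): number {
  const parsed = Number.parseFloat(env.MIN_SCORE ?? "");
  return Number.isFinite(parsed) ? parsed : 0.5;
}

// Keeps shots at or above the threshold and returns the highest-scoring one, if any.
function pickBestShot(
  candidates: ShotCandidate[],
  minScore: number,
): ShotCandidate | undefined {
  return candidates
    .filter((c) => c.score >= minScore)
    .sort((a, b) => b.score - a.score)[0];
}
```

Returning `undefined` when nothing clears the threshold leaves the caller free to report an empty slot instead of using a weak match.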

Writing ShotAI Search Queries

ShotAI uses semantic search powered by AI-generated tags and embedding vectors. Query quality is the single biggest factor in shot relevance — invest time here.

Query construction rules

Always write full sentences or rich phrases, never bare keywords.

The search engine understands semantic similarity ("ocean" matches "sea", "waves", "shoreline"), so richer context produces better recall.

Quality	Example	When to use
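⭐ Detailed description	A white seagull with outstretched wings gliding smoothly over a calm blue sea, golden sunset reflecting on the waves	Best precision; use for hero shots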
⭐ Full sentence

What to include in a query

Describe the visual content of the ideal shot across these dimensions:

- Subject: what/who is in frame (a lone hiker, city traffic at night, athlete celebrating)
- Action: what is happening (walking slowly through fog, speeding through intersection, jumping with arms raised)
- Environment: location, setting, time of day (rain-soaked Tokyo street, mountain meadow at golden hour, empty stadium under floodlights)
- Mood / atmosphere: emotional tone (melancholic, tense, euphoric, serene)
- Camera feel: implied movement or framing (wide establishing shot, tight close-up, slow pan, handheld shaky)

Not all dimensions are needed every time — include whichever are most distinctive for the shot you want.
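
One way to keep queries rich and consistent is to assemble them from the dimensions above. The helper below is a sketch, not part of the tool: `ShotDescription` and `buildQuery` are hypothetical names, and the joining format is just one reasonable choice.

```typescript
// The five dimensions from the list above; all are optional.
interface ShotDescription {
  subject?: string;
  action?: string;
  environment?: string;
  mood?: string;
  camera?: string;
}

// Joins whichever dimensions are present into one rich phrase.
function buildQuery(d: ShotDescription): string {
  // Subject and action read best as a single clause.
  const core = [d.subject, d.action].filter(Boolean).join(" ");
  const parts = [
    core,
    d.environment,
    d.mood ? `${d.mood} mood` : undefined,
    d.camera,
  ];
  // Drop missing or empty dimensions before joining.
  return parts.filter((p): p is string => Boolean(p)).join(", ");
}

const heroQuery = buildQuery({
  subject: "a lone hiker",
  action: "walking slowly through fog",
  environment: "mountain meadow at golden hour",
  camera: "wide establishing shot",
});
```

Here `heroQuery` combines four of the five dimensions and omits mood, in line with the guidance that not every dimension is needed.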

The refineQueries step

When the agent runs refineQueries, it rewrites the composition's default slot queries to better match the user's actual library. Apply these principles:

1. Start from the slot's semantic intent: what emotional or narrative role does this shot play in the composition?
2. Incorporate any context from the user's request: location names, event names, specific subjects mentioned
3. Expand synonyms: if the slot says "water", try "river flowing through forest" or "lake reflecting mountains" based on what the library likely contains
4. Avoid negations: "not indoors" does not work; instead describe the positive version ("outdoor daylight scene")
5. One query per slot: make it specific rather than trying to cover multiple scenarios
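
Two of these principles (no bare keywords, no negations) can be checked mechanically before a query is sent. The sketch below is a heuristic under assumptions: the negation word list and the three-word minimum are illustrative choices, and `lintQuery` is a hypothetical helper, not part of the CLI.

```typescript
// Heuristic only: the word list and the minimum length are assumptions.
const NEGATION_PATTERN = /\b(not|no|without|never)\b/i;
const MIN_WORDS = 3;

// Returns a list of problems; an empty list means the query passed both checks.
function lintQuery(query: string): string[] {
  const problems: string[] = [];
  const words = query.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length < MIN_WORDS) {
    problems.push("bare keyword: write a full sentence or rich phrase");
  }
  if (NEGATION_PATTERN.test(query)) {
    problems.push("negation: describe the positive version instead");
  }
  return problems;
}
```

For example, `lintQuery("not indoors")` reports both problems, while `lintQuery("outdoor daylight scene")` reports none.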

Examples: slot query → refined query

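    Slot default:  "city night scene"
    User request:  "Make me a Tokyo travel remix"
    Refined:       "neon-lit Tokyo street at night, pedestrians crossing beneath glowing signs, rain reflecting on the pavement"

    Slot default:  "nature landscape"
    User request:  "My Patagonia trip last month"
    Refined:       "dramatic Patagonia mountain landscape, snow-capped peaks under storm clouds, vast wilderness"

    Slot default:  "athlete in action"
    User request:  "Basketball highlights from the last game"
    Refined:       "basketball player driving to the hoop, explosive movement, blurred crowd in the background"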

Adding a New Composition

See references/composition-guide.md to add a new Remotion composition to the registry.

Safety and Fallback

Network & credential scope

- All credentials stay local. SHOTAI_TOKEN is sent only to the local ShotAI MCP server (127.0.0.1). LLM API keys (if configured) are sent only to their respective provider endpoints, never to ShotAI, YouTube, or any other service.
- The clip file server binds to 127.0.0.1 only (default port 8080). It is not accessible from other machines on the network. It serves temporary clip files to the Remotion renderer running on the same machine and shuts down after rendering completes.
- yt-dlp is optional. Use --bgm /path/to/local.mp3 to skip all YouTube network access. When yt-dlp is used, it only downloads a single background music track; no other data is sent to YouTube.
- LLM access is optional. Set AGENT_PROVIDER=none to run in heuristic mode with zero external network calls (aside from the local ShotAI MCP server).
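
The network scope described above can be summarized as a function of two settings. This sketch restates the rules and does not show how the tool implements them. `RunOptions` and `externalTargets` are hypothetical names, and it assumes a run without --bgm may fall back to yt-dlp.

```typescript
// Hypothetical run options: bgmPath mirrors --bgm, agentProvider mirrors AGENT_PROVIDER.
interface RunOptions {
  bgmPath?: string;
  agentProvider: string;
}

// Lists the non-local services a run may contact under the rules above.
// The local ShotAI MCP server and the clip file server (both 127.0.0.1) are excluded.
function externalTargets(opts: RunOptions): string[] {
  const targets: string[] = [];
  if (opts.agentProvider !== "none") {
    targets.push(`LLM provider (${opts.agentProvider})`);
  }
  if (!opts.bgmPath) {
    targets.push("YouTube via yt-dlp");
  }
  return targets;
}
```

With `--bgm` set and `AGENT_PROVIDER=none`, the list is empty, which corresponds to the fully local configuration.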

Error handling

- If SHOTAI_URL or SHOTAI_TOKEN is unset, display a warning: "ShotAI MCP server is not configured. Set SHOTAI_URL and SHOTAI_TOKEN in your .env file. Download ShotAI at https://www.shotai.io."
- If the ShotAI MCP server returns an error (connection refused, HTTP 4xx/5xx), display the error message and stop; do not fabricate shot results.
- Never fabricate video file paths, shot timestamps, or similarity scores.
- If music download fails (yt-dlp error or network unreachable), suggest using --bgm <local.mp3> to provide a local audio file instead.
- If Remotion render fails, display the error output and suggest checking the Node.js version (18+) and that all clip files were extracted successfully.
- If the LLM provider is unreachable, fall back to heuristic mode: use composition default queries directly without refinement, and skip annotateClips (use composition default effect params).
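
Two of the rules above, the configuration warning and the heuristic fallback, can be sketched as pure functions. The warning text and variable names come from this section. The `Env` and `AgentPlan` types, the function names, and the choice to treat an unset `AGENT_PROVIDER` as "LLM enabled" are assumptions made for illustration.

```typescript
type Env = Record<string, string | undefined>;

const CONFIG_WARNING =
  "ShotAI MCP server is not configured. Set SHOTAI_URL and SHOTAI_TOKEN in your .env file. " +
  "Download ShotAI at https://www.shotai.io.";

// Returns the warning text when either variable is missing, otherwise null.
function checkShotAiConfig(env: Env): string | null {
  return env.SHOTAI_URL && env.SHOTAI_TOKEN ? null : CONFIG_WARNING;
}

// Hypothetical plan describing which LLM-backed steps run.
interface AgentPlan {
  mode: "llm" | "heuristic";
  refineQueries: boolean;
  annotateClips: boolean;
}

// Falls back to heuristic mode when the provider is disabled or unreachable.
function planAgentSteps(env: Env, providerReachable: boolean): AgentPlan {
  const useLlm = env.AGENT_PROVIDER !== "none" && providerReachable;
  return useLlm
    ? { mode: "llm", refineQueries: true, annotateClips: true }
    : { mode: "heuristic", refineQueries: false, annotateClips: false };
}
```

In heuristic mode both flags are false, so the composition's default queries and default effect params are used unchanged.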

License

MIT-0 — Free to use, modify, and redistribute. No attribution required.
See https://spdx.org/licenses/MIT-0.html

