Explainer Video Guide

Create explainer videos from script to final cut via the inference.sh CLI.

Quick Start

CODEBLOCK0

Install note: The install script only detects your OS/architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. No elevated permissions or background processes. Manual install & verification available.

Script Formulas

Problem-Agitate-Solve (PAS) — 60 seconds

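Section	Duration	Content	Word Count
Problem	10s	State the pain point the viewer has	~25 words
Agitate	10s	Show why it's worse than they think	~25 words
Solution	15s	Introduce your product/idea	~35 words
How It Works	20s	Show 3 key steps or features	~50 words
CTA	5s	One clear next action	~12 words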

Before-After-Bridge (BAB) — 90 seconds

Section	Duration	Content
Before	15s	Show the current frustrating state
After	15s	Show the ideal outcome
Bridge	40s	Explain how your product gets them there
Social Proof	10s	Quick stat or testimonial
CTA	10s	Clear next step

Feature Spotlight — 30 seconds (social)

Section	Duration	Content
Hook	3s	Surprising fact or question
Feature	15s	Show one feature solving one problem
Result	7s	The outcome/benefit
CTA	5s	Try it / Learn more
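
Before recording, it can help to confirm that a formula's section durations add up to the runtime in its heading. The sketch below is illustrative only: `check_runtime` is a hypothetical helper, not part of the inference.sh CLI, and it takes the target runtime followed by each section's duration in seconds.

```bash
# Sum section durations (in seconds) and compare against a target runtime.
# check_runtime is a hypothetical helper name, not an infsh command.
check_runtime() {
  target=$1
  shift
  total=0
  for d in "$@"; do
    total=$(( total + d ))
  done
  if [ "$total" -eq "$target" ]; then
    echo "ok: ${total}s"
  else
    echo "mismatch: ${total}s (target ${target}s)"
  fi
}

check_runtime 30 3 15 7 5          # Feature Spotlight: prints "ok: 30s"
check_runtime 90 15 15 40 10 10    # Before-After-Bridge: prints "ok: 90s"
```

If a section is lengthened during editing, rerunning the check shows how far the total has drifted from the target.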

Pacing Rules

Content Type	Words Per Minute	Notes
Standard narration	150 wpm	Conversational pace
Complex/technical

Key rule: 1 scene per key message. Don't pack multiple ideas into one visual.
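
The 150 wpm pace for standard narration converts directly into a word budget per section: words = seconds x wpm / 60. A minimal sketch, assuming integer arithmetic is precise enough for script planning; `words_for_seconds` is a hypothetical helper name, and the optional second argument lets you substitute a different pace.

```bash
# Word budget for a narration segment at a given speaking rate.
# Integer arithmetic, so fractional words are rounded down.
# words_for_seconds is a hypothetical helper name, not an infsh command.
words_for_seconds() {
  seconds=$1
  wpm=${2:-150}   # default: the guide's standard narration pace
  echo $(( seconds * wpm / 60 ))
}

words_for_seconds 10    # prints 25
words_for_seconds 20    # prints 50
```

A 10-second section at 150 wpm therefore leaves room for about 25 words, which matches the PAS word counts.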

Scene Duration Guidelines

- Establishing shot: 3-5 seconds
- Feature demonstration: 5-8 seconds
- Text/stat on screen: 3-4 seconds (must be readable)
- Transition: 0.5-1 second
- CTA screen: 3-5 seconds

Visual Production

Scene Types

CODEBLOCK1

Image-to-Video for Scenes

CODEBLOCK2

Voiceover Production

Script Writing Tips

- Short sentences. Max 15 words per sentence.
- Active voice. "You can track your data" not "Your data can be tracked."
- Conversational tone. Read it aloud; if it sounds stiff, rewrite.
- One idea per sentence. One sentence per visual beat.

Generating Voiceover

CODEBLOCK3

Pacing Control in TTS

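Technique	Effect	Example
Period .	Medium pause	"This changes everything. Here's how."
Ellipsis ...	Long pause (dramatic)	"And the result... incredible."
Comma ,	Short pause	"Fast, simple, powerful."
Exclamation mark !	Emphasis/energy	"Start building today!"
Question mark ?	Rising intonation	"What if there was a better way?"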

Music & Audio

Background Music Guidelines

- Volume: 20-30% under narration (duck 6-12dB when voice plays)
- Style: match the brand tone (corporate = ambient electronic, startup = upbeat indie)
- Structure: intro swell (first 3s) -> subtle loop under narration -> swell at CTA
- No vocals: instrumental only under narration
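
Most editing tools take a linear gain multiplier instead of decibels, so the 6-12dB ducking range is easier to apply once converted with gain = 10^(-dB/20). The sketch below assumes `awk` is available; `duck_gain` is a hypothetical helper name and is not part of the inference.sh CLI.

```bash
# Convert a ducking amount in dB to a linear amplitude multiplier:
# gain = 10^(-dB/20). duck_gain is a hypothetical helper name.
duck_gain() {
  awk -v db="$1" 'BEGIN { printf "%.3f\n", 10 ^ (-db / 20) }'
}

duck_gain 6     # prints 0.501
duck_gain 12    # prints 0.251
```

Ducking by 6dB roughly halves the music's amplitude and 12dB roughly quarters it, which gives a starting point to adjust by ear.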

CODEBLOCK4

Assembly Pipeline

Full Production Workflow

CODEBLOCK5

Video Length by Format

Format	Length	Platform
Social teaser	15-30s	TikTok, Instagram Reels, YouTube Shorts
Product demo

Transition Types

Transition	When to Use	Effect
Cut	Default between related scenes	Clean, professional
Dissolve/Crossfade

Common Mistakes

Mistake	Problem	Fix
Script too wordy	Voiceover rushed, viewer overwhelmed	Cut to 150 wpm max
No hook in first 3s

Related Skills

CODEBLOCK6

Browse all apps: INLINECODE6

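Explainer Video Guide

Create explainer videos from script to final cut via the inference.sh CLI.

Quick Start

bash
curl -fsSL https://cli.inference.sh | sh && infsh login

# Generate a scene for an explainer video
infsh app run google/veo-3-1-fast --input '{
  "prompt": "Clean motion graphics style animation, abstract data flowing between connected nodes, blue and white color scheme, professional corporate aesthetic, smooth transitions"
}'

Install note: The install script only detects your OS/architecture, downloads the matching binary from dist.inference.sh, and verifies its SHA-256 checksum. No elevated permissions or background processes. Manual install & verification available.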

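Script Formulas

Problem-Agitate-Solve (PAS) - 60 seconds

Section	Duration	Content	Word Count
Problem	10s	State the pain point the viewer has	~25 words
Agitate	10s	Show why it's worse than they think	~25 words
Solution	15s	Introduce your product/idea	~35 words
How It Works	20s	Show 3 key steps or features	~50 words
CTA	5s	One clear next action	~12 words

Before-After-Bridge (BAB) - 90 seconds

Section	Duration	Content
Before	15s	Show the current frustrating state
After	15s	Show the ideal outcome
Bridge	40s	Explain how your product gets them there
Social Proof	10s	Quick stat or testimonial
CTA	10s	Clear next step

Feature Spotlight - 30 seconds (social)

Section	Duration	Content
Hook	3s	Surprising fact or question
Feature	15s	Show one feature solving one problem
Result	7s	The outcome/benefit
CTA	5s	Try it / Learn more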

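Pacing Rules

Content Type	Words Per Minute	Notes
Standard narration	150 wpm	Conversational pace
Complex/technical

Key rule: 1 scene per key message. Don't pack multiple ideas into one visual.

Scene Duration Guidelines

- Establishing shot: 3-5 seconds
- Feature demonstration: 5-8 seconds
- Text/stat on screen: 3-4 seconds (must be readable)
- Transition: 0.5-1 second
- CTA screen: 3-5 seconds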

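Visual Production

Scene Types

bash

# Product in context
infsh app run google/veo-3-1-fast --input '{
  "prompt": "Clean product demo video, hands typing on a laptop showing a dashboard interface, bright modern office, soft natural lighting, professional"
}'

# Abstract concept visualization
infsh app run bytedance/seedance-1-5-pro --input '{
  "prompt": "Abstract motion graphics, colorful data streams connecting floating geometric shapes, smooth fluid animation, dark background with glowing elements, tech aesthetic"
}'

# Lifestyle/outcome shot
infsh app run google/veo-3-1-fast --input '{
  "prompt": "Happy person leaning back on a sofa using a laptop, smiling at the screen, bright airy living room, warm afternoon light, satisfied customer feeling, lifestyle commercial style"
}'

# Before/after comparison
infsh app run falai/flux-dev-lora --input '{
  "prompt": "Split-screen comparison, left side a cluttered desk piled with papers and stress, right side a clean organized minimalist workspace, dramatic difference, clean design"
}'

Image-to-Video for Scenes

bash

# Generate a still frame first
infsh app run falai/flux-dev-lora --input '{
  "prompt": "Professional workspace with glowing holographic interface, futuristic but clean, blue ambient lighting"
}'

# Animate it
infsh app run falai/wan-2-5-i2v --input '{
  "prompt": "Gentle camera push-in, holographic elements subtly floating and rotating, soft ambient light shifts",
  "image": "path/to/workspace-still.png"
}'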

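Voiceover Production

Script Writing Tips

- Short sentences. Max 15 words per sentence.
- Active voice. "You can track your data" not "Your data can be tracked."
- Conversational tone. Read it aloud; if it sounds stiff, rewrite.
- One idea per sentence. One sentence per visual beat.

Generating Voiceover

bash

# Professional narration with Dia TTS
infsh app run falai/dia-tts --input '{
  "prompt": "[S1] Tired of spending hours on reports nobody reads? There is a better way. Meet DataFlow. It turns your raw data into visual stories in seconds... Just connect your data source, pick a template, and share. Try DataFlow free today."
}'

Pacing Control in TTS

Technique	Effect	Example
Period .	Medium pause	"This changes everything. Here's how."
Ellipsis ...	Long pause (dramatic)	"And the result... incredible."
Comma ,	Short pause	"Fast, simple, powerful."
Exclamation mark !	Emphasis/energy	"Start building today!"
Question mark ?	Rising intonation	"What if there was a better way?"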

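Music & Audio

Background Music Guidelines

- Volume: 20-30% under narration (duck 6-12dB when voice plays)
- Style: match the brand tone (corporate = ambient electronic, startup = upbeat indie)
- Structure: intro swell (first 3s) -> subtle loop under narration -> swell at CTA
- No vocals: instrumental only under narration

bash

# Generate background music
infsh app run --input '{
  "prompt": "Upbeat corporate background music, modern electronic, 90 BPM, positive and professional, no vocals, suitable for a product explainer video"
}'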

Assembly Pipeline

Full Production Workflow

bash

# 1. Generate voiceover
infsh app run falai/dia-tts --input '{
  "prompt": "[S1] Your script here..."
}'

# 2. Generate scene visuals (in parallel)
infsh app run google/veo-3-1-fast --input '{"prompt": "scene 1 description"}' --no-wait
infsh app run google/veo-3-1-fast --input '{"prompt": "scene 2 description"}' --no-wait
infsh app run google/veo-3-1-fast --input '{"prompt": "scene 3 description"}' --no-wait

# 3. Merge scenes into a sequence
infsh app run infsh/media-merger --input '{
  "media": ["scene1.mp4", "scene2.mp4", "scene3.mp4"]
}'

# 4. Add voiceover to the video
infsh app run infsh/video-audio-merger --input '{
  "video": "merged-scenes.mp4",
  "audio": "voiceover.mp3"
}'

# 5. Add captions
infsh app run infsh/caption-videos --input '{
  "video": "final-with-audio.mp4",
  "caption_file": "captions.srt"
}'
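
The captions step takes a captions.srt file. As a rough sketch of how one could be written by hand, the helpers below print cues in the standard SubRip layout (cue index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, caption text, blank line). `srt_time` and `srt_cue` are hypothetical helper names, the timings are made-up examples, and whether the captioning app expects exactly this layout is an assumption to verify.

```bash
# Format whole seconds as an SRT timestamp (HH:MM:SS,mmm).
# srt_time and srt_cue are hypothetical helpers, not infsh commands.
srt_time() {
  printf '%02d:%02d:%02d,000' $(( $1 / 3600 )) $(( $1 % 3600 / 60 )) $(( $1 % 60 ))
}

# Print one SRT cue: index, start second, end second, caption text.
srt_cue() {
  printf '%d\n%s --> %s\n%s\n\n' "$1" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}

# Example cues with illustrative timings.
srt_cue 1 0 4 "Tired of spending hours on reports nobody reads?"
srt_cue 2 4 6 "There's a better way."
```

Redirecting the output to captions.srt produces a file for step 5. Cue timings should follow the actual voiceover, not the planned section lengths.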

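Video Length by Format

Format	Length	Platform
Social teaser	15-30s	TikTok, Instagram Reels, YouTube Shorts
Product demo

Transition Types

Transition	When to Use	Effect
Cut	Default between related scenes	Clean, professional
Dissolve/Crossfade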
