Story to Prompts
One-shot conversion from story/scene to text-to-image prompts. No interactive confirmation — output the final result directly.
Output Language
Detect language from user input:
- Chinese input → primary prompt in Chinese, secondary in English
- English input → primary prompt in English, secondary in Chinese
- Explicit language override (e.g. "output in English", "用中文输出") → follow user instruction
- All structural text (titles, character sheets, scene descriptions) matches the primary language
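The detection rules above can be sketched as a small helper. This is a minimal sketch, not part of the skill itself: the function name `pick_languages`, the override phrase table, and the "any Chinese character means Chinese input" heuristic are all assumptions made for illustration.

```python
from __future__ import annotations

import re

# Hypothetical override phrases; the spec only gives these two as examples.
_OVERRIDES = {
    "en": ("output in english",),
    "zh": ("用中文输出",),
}

# CJK Unified Ideographs (basic block); assumed to be enough to spot Chinese input.
_CJK = re.compile(r"[\u4e00-\u9fff]")


def pick_languages(user_input: str) -> tuple[str, str]:
    """Return (primary, secondary) language codes for the generated prompts."""
    lowered = user_input.lower()
    for lang, phrases in _OVERRIDES.items():
        if any(phrase in lowered for phrase in phrases):
            primary = lang  # an explicit instruction beats detection
            break
    else:
        # Assumption: any Chinese character marks the input as Chinese.
        primary = "zh" if _CJK.search(user_input) else "en"
    secondary = "en" if primary == "zh" else "zh"
    return primary, secondary
```

For example, `pick_languages("一个女孩在雨中奔跑, output in English")` puts English first because the override is checked before character detection.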
Entry Point
Determine mode based on user input:
- Multi-scene mode: Input contains multiple events/plot points, or user explicitly requests N images
- Single-scene mode: Input describes only one scene/画面, or user asks for a prompt for "one scene"
Split Strategy (Multi-scene Mode)
Priority for determining image count and split:
1. User specifies count (e.g. "4 images", "拆成6张") → use directly
2. User does not specify → split by spatiotemporal boundaries:
   - Identify distinct time-space units (location change, time jump)
   - Each independent time-space unit = one image
   - Within the same time-space unit, if multiple key actions exist, split into 2-3 images with different shot types
3. Default range: 3-6 images unless the story is extremely simple or very long
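One way to read the split priority as code is sketched below. The `TimeSpaceUnit` structure, the count regex, and the test for "extremely simple or very long" (at most one unit, or more than six) are assumptions made here; the spec leaves those judgments to the model.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class TimeSpaceUnit:
    """One location/time segment of the story (hypothetical structure)."""
    label: str
    key_actions: int = 1  # number of key actions inside this unit


def requested_count(user_input: str) -> int | None:
    """Find an explicit count such as '4 images' or '拆成6张'."""
    match = re.search(r"(\d+)\s*(?:images?|张)", user_input, re.IGNORECASE)
    return int(match.group(1)) if match else None


def image_count(user_input: str, units: list[TimeSpaceUnit]) -> int:
    explicit = requested_count(user_input)
    if explicit is not None:
        return explicit  # priority 1: the user's number is used directly
    # Priority 2: one image per unit, up to 3 when a unit holds several key actions.
    total = sum(min(max(unit.key_actions, 1), 3) for unit in units)
    # Priority 3 (assumed reading): very short or very long stories
    # are allowed to leave the default 3-6 range.
    if len(units) <= 1 or len(units) > 6:
        return total
    return max(3, min(total, 6))
```

A story with three units, one of which has three key actions, yields five images under these assumptions.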
Workflow (Multi-scene Mode)
Complete all steps in one pass. Output final result only.
Step 1: Extract Story Baseline
Determine internally (do not output separately):
- Story core (one sentence)
- Character fixed features (age, hair, clothing, signature accessories)
- Unified visual style
- Color palette
- Lighting style
Step 2: Structure Split
Determine N images, assign for each:
- Shot type (refer to the narrative rhythm template in references/shot-types.md; adjacent images must differ)
- Camera angle
- Narrative function (establishing / progression / climax / resolution)
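The per-image assignment can be modeled as a small record plus a check for the "adjacent images must differ" rule. This is an illustrative sketch: `ImagePlan`, `NarrativeFunction`, and `adjacent_repeats` are names introduced here, and the shot type strings are placeholders rather than the vocabulary of references/shot-types.md.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class NarrativeFunction(Enum):
    ESTABLISHING = "establishing"
    PROGRESSION = "progression"
    CLIMAX = "climax"
    RESOLUTION = "resolution"


@dataclass(frozen=True)
class ImagePlan:
    shot_type: str     # placeholder values; real names live in references/shot-types.md
    camera_angle: str
    function: NarrativeFunction


def adjacent_repeats(plans: list[ImagePlan]) -> list[int]:
    """Return 1-based indices of images whose shot type repeats the previous image's."""
    return [
        i + 1
        for i in range(1, len(plans))
        if plans[i].shot_type == plans[i - 1].shot_type
    ]
```

An empty result means the plan satisfies the adjacency rule; any returned index marks an image whose shot type should be changed.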
Step 3: Generate Prompt per Image
Requirements for each prompt:
- Repeat character fixed features in every prompt (consistency)
- Vary viewpoint, composition, posture across images (diversity)
- Only include characters/objects mentioned in the current scene (appearance rule)
- Include negative prompt (anti-failure)
- Follow the writing spec below
Step 4: Score and Optimize
Self-evaluate each prompt on 10 dimensions and optimize:
Structure Completeness (40 pts)
1. Core intent clarity (10): Is the goal unambiguous?
2. Subject and hierarchy (10): Is the main subject clear with size ratio?
3. Composition and ratio constraints (10): Aspect ratio, viewpoint, composition technique?
4. Style anchor clarity (10): Specific style/medium specified?
Generation Quality Control (40 pts)
5. Motif unity (10): Do visual details serve a unified theme?
6. Material and lighting description (10): Specific material and light logic?
7. Constraints and negative prompts (10): Anti-failure constraints present?
8. Text-image integration (10): Text layout handled or explicitly absent?
Productization and Reusability (20 pts)
9. Parameterization (10): Easy to adjust and reuse?
10. Failure anticipation (10): Common AI errors preemptively blocked?
Logic check per prompt: character consistency, scene continuity, physics plausibility, style coherence. Fix contradictions if found.
Target: each prompt ≥ 80 points (High Quality). If below, self-optimize and output the improved version.
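The arithmetic behind the rubric is simple: ten dimensions worth 10 points each, summed to a score out of 100 and compared against the 80-point target. The sketch below assumes hypothetical dimension keys and helper names; only the weights and the threshold come from the rubric.

```python
from __future__ import annotations

# Hypothetical keys for the ten rubric dimensions, in rubric order.
DIMENSIONS = (
    "core_intent", "subject_hierarchy", "composition_ratio", "style_anchor",   # structure (40)
    "motif_unity", "material_lighting", "constraints_negative", "text_image",  # quality control (40)
    "parameterization", "failure_anticipation",                                # reusability (20)
)
TARGET = 80  # the "High Quality" threshold


def total_score(scores: dict[str, int]) -> int:
    """Sum the ten 0-10 dimension scores into a total out of 100."""
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in DIMENSIONS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return sum(scores[name] for name in DIMENSIONS)


def needs_optimization(scores: dict[str, int]) -> bool:
    """True when the prompt falls below the target and should be revised."""
    return total_score(scores) < TARGET


def weakest(scores: dict[str, int], n: int = 3) -> list[str]:
    """The n lowest-scoring dimensions, a natural place to start revising."""
    return sorted(DIMENSIONS, key=lambda name: scores[name])[:n]
```

A prompt scoring 8 on every dimension lands exactly on the target and passes; dropping a single dimension to 7 triggers another optimization pass.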
Workflow (Single-scene Mode)
Simpler, one pass:
1. Extract character features and visual style from the scene
2. Determine optimal shot type and composition
3. Generate prompt (same requirements as Steps 3-4 above)
4. Output
Output Format
Primary language marked ★, secondary marked ☆:
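Image N | [Shot type] | [Narrative function]
Scene description: [Detailed description in the primary language]
Text-to-image prompt ★ ([primary language]):
[Complete, detailed prompt, ready to copy-paste]
Text-to-image prompt ☆ ([secondary language]):
[Complete prompt adapted to target-language conventions]
Negative prompt: [negative keywords]
Score: [X]/100 | Grade: [Production-grade / High Quality / Usable]
Strength: [one sentence]
Improvement: [one sentence, if applicable]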
Prompt Writing Spec
Structure (by priority):
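[Style] + [Shot type + composition + camera angle] + [Subject + fixed features] + [Action/expression] + [Environment/background] + [Lighting/atmosphere] + [Material/texture] + [Quality tags] + [Negative prompt]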
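A prompt can be assembled by joining components in priority order: style, shot and composition, subject with fixed features, action, environment, lighting, material, quality tags, with the negative prompt last. The sketch below is one possible implementation; the key names, the comma separator, and returning the negative prompt as a separate string are assumptions, not requirements of the spec.

```python
from __future__ import annotations

# Positive components in priority order (hypothetical key names).
ORDER = (
    "style", "shot", "subject", "action",
    "environment", "lighting", "material", "quality",
)


def build_prompt(parts: dict[str, str]) -> tuple[str, str]:
    """Join non-empty components in priority order; return (prompt, negative prompt)."""
    positive = ", ".join(
        parts[key].strip() for key in ORDER if parts.get(key, "").strip()
    )
    negative = parts.get("negative", "").strip()
    return positive, negative
```

Keeping the order in one tuple makes the priority easy to adjust, and empty components are skipped rather than leaving stray commas.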
Bilingual output rules:
- Primary language prompt: complete and detailed, ready to copy-paste
- Secondary language prompt: equally complete, adapted to target language prompt conventions (not a literal translation)
Consistency rules:
- Character fixed features (age, hair, clothing) must be explicitly repeated in every prompt
- Style, color palette, lighting baseline must carry through all images
- Key props appearance must remain consistent
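The repetition rule lends itself to a mechanical check. The helper below is a rough sketch that uses case-insensitive substring matching, which is a simplification: a paraphrased feature ("crimson scarf" for "red scarf") would be reported as missing. The function name and report shape are hypothetical.

```python
from __future__ import annotations


def missing_features(
    prompts: list[str], fixed_features: list[str]
) -> dict[int, list[str]]:
    """Map 1-based image index to the fixed features absent from that prompt."""
    report: dict[int, list[str]] = {}
    for index, prompt in enumerate(prompts, start=1):
        text = prompt.lower()
        absent = [feature for feature in fixed_features if feature.lower() not in text]
        if absent:
            report[index] = absent
    return report
```

An empty report means every prompt repeats every fixed feature; otherwise the keys point at the images to fix.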
Diversity rules:
- Adjacent images use different shot types
- Encourage different composition techniques
- Character posture, expression, position may vary
- Lighting intensity may be adjusted, style remains constant
Reference Files
Read on demand:
- references/shot-types.md: shot types, camera angles, narrative rhythm templates
- references/composition-patterns.md: 12 composition patterns with prompt fragments
- references/style-params.md: 30+ style parameters (keywords, quality tags, avoid list, lighting)