VTL Image Analysis
Use this skill whenever a user asks to analyze, diagnose, or improve a
generated image's composition. Also invoke it proactively after image
generation if the user has requested better compositional quality.
When to Use
- User says "analyze this image", "why does this look generic/flat/boring"
- User asks to improve a generated image's composition
- After generating an image with openai-image-gen or similar skills
- User asks why their prompts aren't producing interesting layouts
Step 1 — Measure
Run the probe script on the image:
bash
python3 scripts/vtl_probe.py <image_path>
This returns JSON. Example:
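json
{
  "valid": true,
  "mask_status": "PASS",
  "delta_x": -0.027,
  "delta_y": 0.008,
  "r_v": 0.875,
  "rho_r": 12.4,
  "dRC": 0.40,
  "dRC_label": "mass-dominant",
  "k_var": 1.12,
  "infl_density": 0.16,
  "flags": ["CENTER_LOCK"]
}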
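As a rough illustration of consuming the probe output, the sketch below parses the JSON text and checks that the two fields the refusal gate reads are present. The function name `parse_probe_output` and the choice of required keys are assumptions made for this example; the probe script may emit more fields or name them differently.

```python
import json

# Keys the refusal gate reads. This set is an assumption for illustration.
REQUIRED_KEYS = {"valid", "mask_status"}


def parse_probe_output(text: str) -> dict:
    """Parse the probe's JSON text and verify the gate fields are present."""
    metrics = json.loads(text)
    if not isinstance(metrics, dict):
        raise ValueError("probe output must be a JSON object")
    missing = REQUIRED_KEYS - metrics.keys()
    if missing:
        raise ValueError(f"probe output is missing keys: {sorted(missing)}")
    return metrics


sample = '{"valid": true, "mask_status": "PASS", "delta_x": -0.027, "flags": ["CENTER_LOCK"]}'
metrics = parse_probe_output(sample)
print(metrics["mask_status"], metrics["flags"])
```

Failing early on a malformed payload keeps a broken measurement from reaching the reporting steps.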
HARD STOP — Refusal Gate
Before reporting any results, check valid and mask_status.
If valid is false OR mask_status is "FAIL", respond with:
"VTL measurement failed: [error message]. The image does not have sufficient
structural signal for reliable compositional analysis. Try a different image
or one with more defined edges and contrast."
Stop here. Do not report coordinates. Do not generate re-prompts.
If mask_status is "WARN", state this caveat first:
"VTL measurement returned low-confidence results (sparse structural signal).
Coordinates are reported but treat them as indicative, not definitive."
Then continue with the caveat attached to all outputs.
This refusal is non-negotiable. Fabricating a compositional reading from a
failed measurement produces a false diagnosis. The framework is deterministic
by design: an uncertain measurement is reported as uncertain, not smoothed over.
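The gate above can be sketched as a small function. This is a minimal sketch, not the skill's own implementation: the function name `refusal_gate`, the `"error"` key used for the error message, and the choice to treat missing fields as a failure are all assumptions.

```python
from typing import Optional, Tuple

FAIL_TEMPLATE = (
    "VTL measurement failed: {error}. The image does not have sufficient "
    "structural signal for reliable compositional analysis. Try a different "
    "image or one with more defined edges and contrast."
)
WARN_MESSAGE = (
    "VTL measurement returned low-confidence results (sparse structural "
    "signal). Coordinates are reported but treat them as indicative, not "
    "definitive."
)


def refusal_gate(metrics: dict) -> Tuple[str, Optional[str]]:
    """Return (decision, message); decision is "stop", "warn", or "ok"."""
    # Missing fields are treated as a failed measurement (an assumption).
    valid = metrics.get("valid", False)
    status = metrics.get("mask_status", "FAIL")
    if not valid or status == "FAIL":
        # The "error" key is hypothetical; the probe may name it differently.
        error = metrics.get("error", "unknown error")
        return "stop", FAIL_TEMPLATE.format(error=error)
    if status == "WARN":
        return "warn", WARN_MESSAGE
    return "ok", None
```

On "stop" nothing else is reported; on "warn" the message travels with every later output.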
Step 2 — Report Coordinates
Report the five coordinates plainly:
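VTL Analysis
────────────────────────────────
Position  Δx={delta_x}  Δy={delta_y}
Void      rᵥ={r_v}
Density   ρᵣ={rho_r}
Radial    dRC={dRC} [{dRC_label}]
Tension   k_var={k_var}
Flags: {flags or NONE}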
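One way to fill in the coordinate report from the probe's JSON is sketched below. The function name `format_report` and the English row labels are assumptions for illustration; the field names follow the example probe output.

```python
def format_report(metrics: dict) -> str:
    """Render the five coordinates and the flags as a plain-text report."""
    flags = metrics.get("flags") or []
    lines = [
        "VTL Analysis",
        "─" * 32,
        f"Position  Δx={metrics['delta_x']}  Δy={metrics['delta_y']}",
        f"Void      rᵥ={metrics['r_v']}",
        f"Density   ρᵣ={metrics['rho_r']}",
        f"Radial    dRC={metrics['dRC']} [{metrics['dRC_label']}]",
        f"Tension   k_var={metrics['k_var']}",
        # An empty flag list is reported as NONE.
        "Flags: " + (", ".join(flags) if flags else "NONE"),
    ]
    return "\n".join(lines)


sample = {
    "delta_x": -0.027, "delta_y": 0.008, "r_v": 0.875, "rho_r": 12.4,
    "dRC": 0.40, "dRC_label": "mass-dominant", "k_var": 1.12,
    "flags": ["CENTER_LOCK"],
}
print(format_report(sample))
```

The values are printed as measured, with no rounding or interpretation added.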
Step 3 — Generate Re-Prompt (if flags present)
Run the regen script with the user's original prompt and the metrics output:
bash
python3 scripts/vtl_regen.py \
  --prompt "user's original prompt" \
  --metrics \
  --out prompts.json
This selects operators from operators.yaml based on which flags fired and
returns up to 3 prompt variants. Report the selected variant as the primary
recommendation and offer the alternatives.
If no flags fired, report: "No default-mode patterns detected. Coordinates are
within normal range."
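The reporting rule for this step might look like the sketch below. The structure of the regen output is not specified here, so the variant shape `{"prompt": str, "selected": bool}` and the function name `summarize_reprompt` are hypothetical and used only to illustrate the logic.

```python
from typing import List

NO_FLAGS_MESSAGE = (
    "No default-mode patterns detected. Coordinates are within normal range."
)


def summarize_reprompt(flags: List[str], variants: List[dict]) -> str:
    """Build the Step 3 message from the fired flags and the regen variants.

    Each variant is assumed to look like {"prompt": str, "selected": bool};
    the real prompts.json schema may differ.
    """
    if not flags:
        return NO_FLAGS_MESSAGE
    variants = variants[:3]  # the regen script returns at most 3 variants
    primary = next((v for v in variants if v.get("selected")), None)
    if primary is None:
        raise ValueError("no variant is marked as selected")
    alternatives = [v for v in variants if v is not primary]
    lines = [f"Recommended re-prompt: {primary['prompt']}"]
    lines += [f"Alternative: {v['prompt']}" for v in alternatives]
    return "\n".join(lines)
```

The prompt text is passed through unchanged, so the wording stays whatever the regen script produced.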
Operator Logic
Operators live in operators.yaml. They are rule-based — triggers are evaluated
deterministically against the metric values. The AI does not invent or modify
operators. If a trigger fires, the patch is applied. If not, it isn't.
Do not override operator logic. Do not substitute your own re-prompt language
for what the operator specifies. The operators are the prescription layer:
they belong to whoever maintains operators.yaml, not to the AI's improvisation.
If the user wants to modify re-prompt behavior, direct them to edit operators.yaml.
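To make "rule-based and deterministic" concrete, here is a minimal sketch of flag-triggered patch selection. The operator records, their field names (`name`, `when_flag`, `patch`), the patch wording, and `EXAMPLE_FLAG` are all hypothetical; the real schema lives in operators.yaml and may differ.

```python
from typing import Dict, List

# Hypothetical operator records; the real operators.yaml schema may differ.
OPERATORS: List[Dict[str, str]] = [
    {"name": "offset_subject", "when_flag": "CENTER_LOCK",
     "patch": "subject placed off-center"},
    {"name": "example_other", "when_flag": "EXAMPLE_FLAG",
     "patch": "example patch text"},
]


def select_patches(flags: List[str], operators: List[Dict[str, str]]) -> List[str]:
    """Return patches whose trigger flag fired, in file order (deterministic)."""
    fired = set(flags)
    return [op["patch"] for op in operators if op["when_flag"] in fired]


def apply_patches(prompt: str, patches: List[str]) -> str:
    """Append patches verbatim; with no patches the prompt is unchanged."""
    return ", ".join([prompt, *patches])


print(apply_patches("a red barn", select_patches(["CENTER_LOCK"], OPERATORS)))
```

The same flags always yield the same patches in the same order, and nothing is added that the operator file does not specify.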
Notes
- Metrics describe compositional coordinates, not quality. CENTER_LOCK is not
"bad"; it is a signal that the model defaulted. A portrait photographer
choosing center composition is authorship. An AI centering every image
regardless of content is acting on its prior. VTL measures the difference.
- dRC requires radial eligibility. If the mass centroid is very close to the
frame center, dRC is labeled "dual-center"; report the label and do not
interpret the number.
- Full metric definitions: references/vtl-metrics.md
- Full framework: https://github.com/rusparrish/Visual-Thinking-Lens
- Author: Russell Parrish — https://artistinfluencer.com
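The dual-center note can be sketched as a small formatting rule. The function name `describe_drc` and the output strings are assumptions; only the `dRC` and `dRC_label` fields and the "dual-center" label come from the text above.

```python
def describe_drc(metrics: dict) -> str:
    """Format the radial coordinate, reporting only the label when dual-center."""
    label = metrics.get("dRC_label")
    if label == "dual-center":
        # Radial eligibility is not met, so the number is not interpreted.
        return "dRC: dual-center"
    return f"dRC={metrics['dRC']} [{label}]"


print(describe_drc({"dRC": 0.40, "dRC_label": "mass-dominant"}))
print(describe_drc({"dRC": 0.02, "dRC_label": "dual-center"}))
```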