Setup
On first use, read setup.md.
When to Use
User needs AI-generated visuals, edits, or consistent image sets.
Use this skill to pick the right model, write stronger prompts, and avoid outdated model choices.
Architecture
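User preferences persist in ~/image-generation/. See memory-template.md for setup.
~/image-generation/
├── memory.md    # Preferred providers, project context, recipes that worked
└── history.md   # Optional generation log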
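As a minimal sketch of how that layout could be prepared, the helper below creates the directory and an empty memory.md, and leaves history.md uncreated because it is optional. The function name `ensure_memory_dir` is hypothetical, and the actual file contents are defined by memory-template.md, not by this example.

```python
from pathlib import Path


def ensure_memory_dir(base: Path) -> dict[str, Path]:
    """Create the memory directory and return the paths of its two files.

    Hypothetical helper: it only prepares the layout, it does not write
    the template content described in memory-template.md.
    """
    base.mkdir(parents=True, exist_ok=True)
    paths = {"memory": base / "memory.md", "history": base / "history.md"}
    # memory.md is created eagerly; history.md is optional, so it is left alone.
    paths["memory"].touch(exist_ok=True)
    return paths
```

A typical call would be `ensure_memory_dir(Path.home() / "image-generation")`.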
Quick Reference
| Topic | File |
|---|---|
| Initial setup | `setup.md` |
| Memory template | `memory-template.md` |
| Migration guide | `migration.md` |
| Benchmark snapshots | `benchmarks-2026.md` |
| Prompt techniques | `prompting.md` |
| API handling | `api-patterns.md` |
| GPT Image (OpenAI) | `gpt-image.md` |
| Gemini and Imagen (Google) | `gemini.md` |
| FLUX (Black Forest Labs) | `flux.md` |
| Midjourney | `midjourney.md` |
| Leonardo | `leonardo.md` |
| Ideogram | `ideogram.md` |
| Replicate | `replicate.md` |
| Stable Diffusion | `stable-diffusion.md` |
Core Rules
1. Resolve aliases to official model IDs first
Community names shift quickly. Before calling an API, map the nickname to the provider model ID.
| Community label | Official model ID to try first | Notes |
|---|---|---|
| Nano Banana | `gemini-2.5-flash-image-preview` | Common nickname, not an official Google model ID |
| Nano Banana 2 / Pro | Verify provider docs | Usually a provider preset over Gemini image models |
| GPT Image 1.5 | `gpt-image-1.5` | Current OpenAI high-tier image model |
| GPT Image mini / iMini | `gpt-image-1-mini` | Budget/faster OpenAI variant |
| FLUX 2 Pro / Max | `flux-pro` / `flux-ultra` | Many platforms rename these SKUs |
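The alias table can be sketched as a small lookup, shown here as an illustration and not as a definitive resolver. The function name `resolve_model_id`, the lowercase key normalization, and the pairing of "FLUX 2 Pro" with `flux-pro` and "FLUX 2 Max" with `flux-ultra` are assumptions; the table only lists the two IDs together.

```python
# Community label (lowercased) -> official model ID to try first.
ALIASES = {
    "nano banana": "gemini-2.5-flash-image-preview",
    "gpt image 1.5": "gpt-image-1.5",
    "gpt image mini": "gpt-image-1-mini",
    "imini": "gpt-image-1-mini",
    # Assumed pairing; many platforms rename these SKUs.
    "flux 2 pro": "flux-pro",
    "flux 2 max": "flux-ultra",
}

# Labels that have no fixed ID and must be checked against provider docs.
NEEDS_VERIFICATION = {"nano banana 2", "nano banana pro"}


def resolve_model_id(label: str) -> str:
    """Map a community nickname to a model ID before any API call."""
    key = " ".join(label.lower().split())  # normalize case and whitespace
    if key in NEEDS_VERIFICATION:
        raise LookupError(f"{label!r} has no fixed model ID; verify provider docs")
    # Unknown labels pass through unchanged, on the assumption that they
    # are already official IDs.
    return ALIASES.get(key, label)
```

Raising for the "verify provider docs" rows keeps a nickname from reaching the API as if it were an ID.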
2. Pick models by task, not by hype
| Task | First choice | Backup |
|---|---|---|
| Exact text in image | `gpt-image-1.5` | Ideogram |
| Multi-turn edits | `gemini-2.5-flash-image-preview` | `flux-kontext-pro` |
| Photoreal hero shots | `imagen-4.0-ultra-generate-001` | `flux-ultra` |
| Fast low-cost drafts | `gpt-image-1-mini` | `imagen-4.0-fast-generate-001` |
| Character/product consistency | `flux-kontext-max` | `gpt-image-1.5` with references |
| Local no-API workflows | `flux-schnell` | SDXL |
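One way to make the task table executable is a dictionary keyed by task, sketched below. The task keys such as `"exact_text"` and the function name `pick_model` are hypothetical; the model names are copied from the table.

```python
from typing import NamedTuple


class Choice(NamedTuple):
    first: str
    backup: str


# Hypothetical task keys; model names come from the task table.
TASK_MODELS = {
    "exact_text": Choice("gpt-image-1.5", "Ideogram"),
    "multi_turn_edits": Choice("gemini-2.5-flash-image-preview", "flux-kontext-pro"),
    "photoreal_hero": Choice("imagen-4.0-ultra-generate-001", "flux-ultra"),
    "cheap_drafts": Choice("gpt-image-1-mini", "imagen-4.0-fast-generate-001"),
    # The backup is meant to be used with reference images.
    "consistency": Choice("flux-kontext-max", "gpt-image-1.5"),
    "local": Choice("flux-schnell", "SDXL"),
}


def pick_model(task: str, unavailable: frozenset[str] = frozenset()) -> str:
    """Return the first choice for a task, or the backup if it is unavailable."""
    for model in TASK_MODELS[task]:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for task {task!r}")
```

For example, `pick_model("exact_text", frozenset({"gpt-image-1.5"}))` falls through to the backup entry.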
3. Use benchmark tables as dated snapshots
Benchmarks drift weekly. Use benchmarks-2026.md as a starting point, then recheck current rankings when quality is critical.
4. Draft cheap, finish expensive
Start with 1-4 low-cost drafts, pick one, then upscale or rerender only the winner.
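The draft-then-finish flow might look like the sketch below. The three callables are hypothetical stand-ins: `draft` for a call to a low-cost model, `finish` for the upscale or rerender step, and `score` for whatever selects the winner (in practice this is often the user picking by eye).

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")


def draft_then_finish(
    prompt: str,
    draft: Callable[[str, int], Image],
    finish: Callable[[str, Image], Image],
    score: Callable[[Image], float],
    n_drafts: int = 4,
) -> Image:
    """Generate 1-4 cheap drafts, then spend the expensive step on one winner."""
    if not 1 <= n_drafts <= 4:
        raise ValueError("n_drafts must be between 1 and 4")
    # Each draft gets a different seed so the candidates differ.
    drafts = [draft(prompt, seed) for seed in range(n_drafts)]
    winner = max(drafts, key=score)
    # The expensive call happens exactly once.
    return finish(prompt, winner)
```

Keeping the expensive call outside the loop is the point of the rule: cost grows with the number of drafts only at the cheap tier.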
5. Keep a fallback chain
If the preferred model is unavailable, fall back by tier:
1) same provider, lower tier; 2) cross-provider equivalent; 3) local/open model.
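A fallback chain of this kind can be sketched as an ordered list that is tried until one call succeeds. The exception class `ModelUnavailable`, the `call` parameter, and the specific models in `EXAMPLE_CHAIN` are assumptions for illustration; which model counts as a cross-provider equivalent depends on the task.

```python
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Hypothetical error a provider wrapper raises when a model cannot be used."""


# Illustrative order: preferred, same provider lower tier,
# cross-provider equivalent, local/open model.
EXAMPLE_CHAIN = [
    "gpt-image-1.5",
    "gpt-image-1-mini",
    "imagen-4.0-fast-generate-001",
    "flux-schnell",
]


def generate_with_fallback(
    prompt: str,
    chain: Sequence[str],
    call: Callable[[str, str], bytes],
) -> tuple[str, bytes]:
    """Try each model in order; return the model that worked and its output."""
    errors = []
    for model_id in chain:
        try:
            return model_id, call(model_id, prompt)
        except ModelUnavailable as exc:
            errors.append(f"{model_id}: {exc}")
    raise RuntimeError("all models in the fallback chain failed: " + "; ".join(errors))
```

Returning the model ID alongside the result lets the caller record which tier actually produced the image.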
6. Treat DALL-E as legacy
OpenAI lists DALL-E 2/3 as legacy. Do not use them as default for new projects.
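A small guard can flag legacy choices before a request is built. This is a sketch only: the function name `check_not_legacy` is hypothetical, and it assumes legacy IDs can be recognized by the `dall-e-` prefix.

```python
import warnings

# Assumption: legacy OpenAI image model IDs start with this prefix.
LEGACY_PREFIXES = ("dall-e-",)


def check_not_legacy(model_id: str) -> str:
    """Warn when a legacy model is selected; return the ID unchanged."""
    if model_id.lower().startswith(LEGACY_PREFIXES):
        warnings.warn(
            f"{model_id} is a legacy model; prefer a current image model for new projects",
            DeprecationWarning,
            stacklevel=2,
        )
    return model_id
```

The check warns and does not raise, since the rule is about defaults and existing projects may still depend on a legacy model.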
Common Traps
- Using vendor nicknames as model IDs -> API errors and wasted retries
- Assuming "Nano Banana Pro" or "FLUX 2" are universal IDs -> provider mismatch
- Copying old DALL-E prompt habits -> weaker output vs modern GPT/Gemini image models
- Comparing text-to-image and image-editing scores as if they were the same benchmark
- Optimizing every draft at max quality -> cost spikes without quality gain
Security & Privacy
Data that leaves your machine:
- Prompt text
- Reference images when editing or style matching
Data that stays local:
- Provider preferences in `~/image-generation/memory.md`
- Optional local history file
This skill does NOT:
- Store API keys
- Upload files outside chosen provider requests
- Persist generated images unless user asks to save them
External Endpoints
| Provider | Endpoint | Data Sent | Purpose |
|---|---|---|---|
| OpenAI | `api.openai.com` | Prompt text, optional input images | GPT Image generation/editing |
| Google Gemini API | `generativelanguage.googleapis.com` | Prompt text, optional input images | Gemini image generation/editing |
| Google Vertex AI | `aiplatform.googleapis.com` | Prompt text, optional input images | Imagen 4 generation |
| Black Forest Labs | `api.bfl.ai` | Prompt text, optional input images | FLUX generation/editing |
| Replicate | `api.replicate.com` | Prompt text, optional input images | Hosted third-party image models |
| Midjourney | `discord.com` | Prompt text | Midjourney generation via Discord workflows |
| Leonardo | `cloud.leonardo.ai` | Prompt text, optional input images | Leonardo generation/editing |
| Ideogram | `api.ideogram.ai` | Prompt text | Typography-focused image generation |
No other data is sent externally.
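To make that guarantee checkable, outgoing URLs could be compared against an allowlist built from the endpoint table, as in the sketch below. The function name `is_allowed_endpoint`, the exact-match policy, and the HTTPS requirement are assumptions; regional or otherwise prefixed hostnames would need to be added explicitly.

```python
from urllib.parse import urlsplit

# Hosts copied from the External Endpoints table.
ALLOWED_HOSTS = frozenset({
    "api.openai.com",
    "generativelanguage.googleapis.com",
    "aiplatform.googleapis.com",
    "api.bfl.ai",
    "api.replicate.com",
    "discord.com",
    "cloud.leonardo.ai",
    "api.ideogram.ai",
})


def is_allowed_endpoint(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    # hostname is lowercased by urlsplit and excludes any port or credentials.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Exact matching is deliberately strict: a lookalike such as `api.openai.com.evil.example` is rejected because its full hostname is not in the set.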
Migration
If upgrading from a previous version, read migration.md before updating local memory structure.
Trust
This skill may send prompts and reference images to third-party AI providers.
Only install if you trust those providers with your content.
Related Skills
Install with `clawhub install <slug>` if user confirms:
- `image-edit` - Specialized inpainting, outpainting, and mask workflows
- `video-generation` - Convert image concepts into video pipelines
- `colors` - Build palettes for visual consistency across assets
- `ffmpeg` - Post-process image sequences and exports
Feedback
- If useful: `clawhub star image-generation`
- Stay updated: `clawhub sync`
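Setup
On first use, read setup.md.
When to Use
The user needs AI-generated visuals, edits, or consistent image sets.
Use this skill to choose the right model, write more effective prompts, and avoid outdated models.
Architecture
User preferences are stored under ~/image-generation/. See memory-template.md for setup.
~/image-generation/
├── memory.md    # Preferred providers, project context, recipes that worked
└── history.md   # Optional generation log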
Quick Reference
| Topic | File |
|---|---|
| Initial setup | `setup.md` |
| Memory template | `memory-template.md` |
| Migration guide | `migration.md` |
| Benchmark snapshots | `benchmarks-2026.md` |
| Prompt techniques | `prompting.md` |
| API handling | `api-patterns.md` |
| GPT Image (OpenAI) | `gpt-image.md` |
| Gemini and Imagen (Google) | `gemini.md` |
| FLUX (Black Forest Labs) | `flux.md` |
| Midjourney | `midjourney.md` |
| Leonardo | `leonardo.md` |
| Ideogram | `ideogram.md` |
| Replicate | `replicate.md` |
| Stable Diffusion | `stable-diffusion.md` |
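Core Rules
1. Resolve aliases to official model IDs first
Community names change quickly. Before calling an API, map the nickname to the provider's model ID.
| Community label | Official model ID to try first | Notes |
|---|---|---|
| Nano Banana | `gemini-2.5-flash-image-preview` | Common nickname, not an official Google model ID |
| Nano Banana 2 / Pro | Verify provider docs | Usually a provider preset over Gemini image models |
| GPT Image 1.5 | `gpt-image-1.5` | Current OpenAI high-tier image model |
| GPT Image mini / iMini | `gpt-image-1-mini` | Budget/faster OpenAI variant |
| FLUX 2 Pro / Max | `flux-pro` / `flux-ultra` | Many platforms rename these SKUs |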
2. Pick models by task, not by hype
| Task | First choice | Backup |
|---|---|---|
| Exact text in image | `gpt-image-1.5` | Ideogram |
| Multi-turn edits | `gemini-2.5-flash-image-preview` | `flux-kontext-pro` |
| Photoreal hero shots | `imagen-4.0-ultra-generate-001` | `flux-ultra` |
| Fast low-cost drafts | `gpt-image-1-mini` | `imagen-4.0-fast-generate-001` |
| Character/product consistency | `flux-kontext-max` | `gpt-image-1.5` with reference images |
| Local no-API workflows | `flux-schnell` | SDXL |
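3. Treat benchmark tables as dated snapshots
Benchmarks change weekly. Use benchmarks-2026.md as a starting point, and recheck current rankings when quality requirements are strict.
4. Draft cheap, finish at high quality
Generate 1-4 low-cost drafts first, choose one, then upscale or rerender only the selected draft.
5. Keep a fallback chain
If the preferred model is unavailable, fall back by tier:
1) same provider, lower tier; 2) cross-provider equivalent; 3) local/open-source model.
6. Treat DALL-E as legacy
OpenAI lists DALL-E 2/3 as legacy. Do not use them as the default for new projects.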
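Common Traps
- Using vendor nicknames as model IDs -> API errors and wasted retries
- Assuming Nano Banana Pro or FLUX 2 are universal IDs -> provider mismatch
- Carrying over old DALL-E prompt habits -> weaker output than modern GPT/Gemini image models
- Comparing text-to-image and image-editing scores as if they were the same benchmark
- Pushing every draft to maximum quality -> costs spike with no quality gain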
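Security & Privacy
Data that leaves your device:
- Prompt text
- Reference images when editing or style matching
Data that stays local:
- Provider preferences in `~/image-generation/memory.md`
- Optional local history file
This skill does NOT:
- Store API keys
- Upload files outside the chosen provider's requests
- Persist generated images unless the user asks to save them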
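External Endpoints
| Provider | Endpoint | Data Sent | Purpose |
|---|---|---|---|
| OpenAI | `api.openai.com` | Prompt text, optional input images | GPT Image generation/editing |
| Google Gemini API | `generativelanguage.googleapis.com` | Prompt text, optional input images | Gemini image generation/editing |
| Google Vertex AI | `aiplatform.googleapis.com` | Prompt text, optional input images | Imagen 4 generation |
| Black Forest Labs | `api.bfl.ai` | Prompt text, optional input images | FLUX generation/editing |
| Replicate | `api.replicate.com` | Prompt text, optional input images | Hosted third-party image models |
| Midjourney | `discord.com` | Prompt text | Midjourney generation via Discord workflows |
| Leonardo | `cloud.leonardo.ai` | Prompt text, optional input images | Leonardo generation/editing |
| Ideogram | `api.ideogram.ai` | Prompt text | Typography-focused image generation |
No other data is sent externally.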
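Migration
If upgrading from a previous version, read migration.md before updating the local memory structure.
Trust
This skill may send prompts and reference images to third-party AI providers.
Only install it if you trust those providers with your content.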
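Related Skills
Install with `clawhub install <slug>` if the user confirms:
- `image-edit` - Specialized inpainting, outpainting, and mask workflows
- `video-generation` - Convert image concepts into video pipelines
- `colors` - Build palettes for visual consistency across assets
- `ffmpeg` - Post-process image sequences and exports
Feedback
- If useful: `clawhub star image-generation`
- Stay updated: `clawhub sync`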