Stella Selfie

Generate persona-consistent selfie images using Google Gemini or fal (xAI Grok Imagine) and send them to messaging channels via OpenClaw. Supports multi-reference avatar blending for strong character consistency.

When to Use

- User says "send a pic", "send me a photo", "send a selfie", "发张照片", "发自拍"
- User says "show me what you look like...", "send a pic of you...", "展示你在..."
- User describes a scene: "send a pic wearing...", "send a pic at...", "穿着...发张图"
- User wants the agent to appear in a specific outfit, location, or situation

Prompt Modes

Mode 1: Mirror Selfie (default)

Best for: outfit showcases, full-body shots, fashion content

    A mirror selfie of this person, [user context], showing full-body reflection.

Mode 2: Direct Selfie

Best for: close-up portraits, location shots, emotional expressions

    A selfie of this person, [user context], looking directly at the camera.

Mode 3: Third-Person Photo

Best for: non-selfie viewpoints, including explicit third-person requests and scenes that should not read as a selfie

    A natural third-person photo of this person, [user context], natural composition, not a selfie.

Mode Selection Logic

| Signal | Auto-Select Mode |
|---|---|
| Strong user keywords: outfit, wearing, clothes, dress, suit, fashion | `mirror` |
| Strong user keywords: full-body, mirror, reflection, pose, show the look | `mirror` |

Default policy:

- Interpret explicit user requirements first: camera style, outfit emphasis, body framing, scene, pose, and expression.
- Use mirror by default for outfit / full-body / self-presentation requests, even if the user did not explicitly mention a mirror.
- Use direct by default for selfie requests focused on face, emotion, immediacy, or in-the-moment presence.
- Use third_person only when the user explicitly asks for a non-selfie style or clearly describes a shot that should not read as a selfie.

Default mode when no keywords match and timeline is unavailable: `mirror`
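The selection policy above can be sketched as a small keyword matcher. This is a minimal illustration, not Stella's actual implementation: the mirror keywords are taken from the table, while `THIRD_PERSON_HINTS` and `DIRECT_HINTS` are hypothetical lists invented for the example, since this document does not enumerate keywords for those two modes. Matching is plain substring search, which is deliberately naive.

```typescript
type Mode = "mirror" | "direct" | "third_person";

// Mirror keywords come from the Mode Selection Logic table.
const MIRROR_KEYWORDS = [
  "outfit", "wearing", "clothes", "dress", "suit", "fashion",
  "full-body", "mirror", "reflection", "pose", "show the look",
];

// ASSUMPTION: illustrative hints only; the real keyword lists for these
// two modes are not given in this document.
const THIRD_PERSON_HINTS = ["third-person", "not a selfie", "candid photo"];
const DIRECT_HINTS = ["close-up", "face", "smile", "expression"];

function includesAny(text: string, keywords: string[]): boolean {
  return keywords.some((k) => text.includes(k));
}

function selectMode(message: string, explicitMode?: Mode): Mode {
  // An explicitly requested mode always wins.
  if (explicitMode) return explicitMode;
  const text = message.toLowerCase();
  // Explicit non-selfie requests are checked before outfit keywords.
  if (includesAny(text, THIRD_PERSON_HINTS)) return "third_person";
  if (includesAny(text, MIRROR_KEYWORDS)) return "mirror";
  if (includesAny(text, DIRECT_HINTS)) return "direct";
  // Documented default when nothing matches.
  return "mirror";
}
```

A production matcher would likely need word-boundary handling so that, for example, "suit" does not match inside "suitable".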

Resolution Keywords

| User says | Resolution |
|---|---|
| (default) | `1K` |
| 2k, 2048, medium res, 中等分辨率 | `2K` |
| 4k, high res, ultra, 超清, 高分辨率 | `4K` |
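The keyword table maps directly onto a lookup function. The sketch below assumes case-insensitive substring matching and checks the 4K keywords first when a message contains keywords for both resolutions; both choices are assumptions, and `resolveResolution` is a hypothetical name.

```typescript
type Resolution = "1K" | "2K" | "4K";

// Keyword lists copied from the Resolution Keywords table.
// ASSUMPTION: 4K is checked before 2K when both match.
const RESOLUTION_KEYWORDS: Array<[Resolution, string[]]> = [
  ["4K", ["4k", "high res", "ultra", "超清", "高分辨率"]],
  ["2K", ["2k", "2048", "medium res", "中等分辨率"]],
];

function resolveResolution(message: string): Resolution {
  const text = message.toLowerCase();
  for (const [resolution, keywords] of RESOLUTION_KEYWORDS) {
    if (keywords.some((k) => text.includes(k))) return resolution;
  }
  // Documented default when no keyword is present.
  return "1K";
}
```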

Step-by-Step Instructions

Step 1: Collect User Input

Determine from the user's message:

- Explicit context (optional): scene, outfit, location, activity; detect from keywords
- Mode (optional): mirror, direct, or third_person; auto-detect from explicit user intent if not specified
- Target channel: Where to send (e.g., #general, @username, channel ID)
- Channel provider (optional): Which platform (discord, telegram, whatsapp, slack)
- Resolution (optional): 1K / 2K / 4K; default 1K
- Count (optional): How many images; default 1, only increase if explicitly requested
- Has explicit scene?: Does the request contain any specific scene/outfit/location/activity keywords?
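One way to hold the collected fields is a typed record with the documented defaults applied in one place. The interface and function names here are hypothetical; only the fields, their optionality, and the defaults (resolution 1K, count 1) come from the list above.

```typescript
// Hypothetical shape for the inputs collected in Step 1.
interface SelfieRequest {
  context?: string;
  mode?: "mirror" | "direct" | "third_person";
  target: string;
  channel?: "discord" | "telegram" | "whatsapp" | "slack";
  resolution: "1K" | "2K" | "4K";
  count: number;
  hasExplicitScene: boolean;
}

function withDefaults(
  partial: Partial<SelfieRequest> & { target: string },
): SelfieRequest {
  const context = partial.context?.trim();
  return {
    ...partial,
    // Treat whitespace-only context as absent.
    context: context || undefined,
    resolution: partial.resolution ?? "1K",
    count: partial.count ?? 1,
    // ASSUMPTION: any non-empty context counts as an explicit scene
    // unless the caller decided otherwise.
    hasExplicitScene: partial.hasExplicitScene ?? Boolean(context),
  };
}
```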

Step 2: Enrich with Timeline Context Or Recent Scene Recall

`timeline_resolve` is an optional enhancement, not a prerequisite.

- If timeline_resolve is unavailable in the current environment, skip this step and proceed with Stella's default behavior.
- If the request is a current-state Sparse prompt (for example "发张自拍", "发张照片", "想看看你", "send a selfie", "send a photo", "show me what you look like") and timeline_resolve is available, load and follow references/timeline-integration.md.
- If the current request clearly refers back to a single recently resolved timeline scene in the current conversation, load and follow references/timeline-integration.md even if the photo request itself is not Sparse.
- If the user already provided a clear standalone scene, outfit, location, activity, or camera requirement and it is not a callback to a recently resolved timeline scene, do not use timeline enhancement. Follow the default policy directly.
- When you do call timeline_resolve, do not freely rewrite the request into output-slot questions. Use the fixed query rules in references/timeline-integration.md.
- Only enable Nano Banana real-world grounding when the prompt can explicitly include a concrete city plus an exact local date/time anchor from timeline data. If those anchors are missing, do not claim real-world synchronization.
- If timeline returns fact.status === "empty", is missing result.consumption, or any error occurs, immediately fall back to Step 3 without mentioning timeline failure to the user.

Never block image generation on timeline availability. Timeline enrichment is best-effort and should only be used for current-state Sparse prompts or explicit callbacks to a recently resolved timeline scene.
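The gating and fallback rules in this step can be summarized in code. The sketch below is an illustration under assumptions: the `TimelineResult` shape (with `fact` and `result` as sibling fields) is inferred from the field names mentioned above, and all function names are hypothetical. The actual query rules live in references/timeline-integration.md and are not reproduced here.

```typescript
// ASSUMPTION: response shape inferred from the fields named in this step.
interface TimelineResult {
  fact?: { status?: string };
  result?: { consumption?: unknown };
}

// True when Step 2 should attempt timeline enrichment at all.
function shouldUseTimeline(opts: {
  timelineAvailable: boolean;
  isSparse: boolean;
  isRecentSceneCallback: boolean;
}): boolean {
  if (!opts.timelineAvailable) return false;
  return opts.isSparse || opts.isRecentSceneCallback;
}

// True when a timeline response is usable for prompt enrichment.
function isUsableTimeline(res: TimelineResult | null | undefined): boolean {
  if (!res) return false;
  if (res.fact?.status === "empty") return false;
  return res.result?.consumption != null;
}

// Best-effort wrapper: any error or unusable response yields null,
// so the caller silently falls back to Step 3.
async function enrich(
  resolve: () => Promise<TimelineResult>,
): Promise<TimelineResult | null> {
  try {
    const res = await resolve();
    return isUsableTimeline(res) ? res : null;
  } catch {
    return null;
  }
}
```

Returning `null` rather than throwing keeps image generation from ever blocking on timeline availability.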

Step 3: Assemble Prompt

Select mode from the default policy first.

If the request is Sparse, and you loaded references/timeline-integration.md and obtained usable timeline context, apply its Sparse-only merge and prompt rules.

When that timeline enrichment includes outdoor real-world grounding, keep the grounding clause as a separate strong instruction sentence rather than a soft atmosphere phrase like Make it feel like....

Otherwise, use the user's explicit context directly and keep Stella's original fallback behavior:

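    [mirror] A mirror selfie of this person, [user's explicit context, if any], showing full-body reflection.
    [direct] A selfie of this person, [user's explicit context, if any], looking directly at the camera.
    [third_person] A natural third-person photo of this person, [user's explicit context, if any], natural composition, not a selfie.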
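The fallback assembly can be sketched as a template lookup with an optional context slot. The three-mode structure comes from this document; the exact English template wording and the function name `assemblePrompt` are assumptions made for illustration.

```typescript
type Mode = "mirror" | "direct" | "third_person";

// ASSUMPTION: template wording is illustrative. Each template takes a
// pre-formatted context clause that is either empty or starts with ", ".
const TEMPLATES: Record<Mode, (clause: string) => string> = {
  mirror: (c) => `A mirror selfie of this person${c}, showing full-body reflection.`,
  direct: (c) => `A selfie of this person${c}, looking directly at the camera.`,
  third_person: (c) =>
    `A natural third-person photo of this person${c}, natural composition, not a selfie.`,
};

function assemblePrompt(mode: Mode, userContext?: string): string {
  const trimmed = userContext?.trim() ?? "";
  // Drop the context slot entirely when the user gave no explicit context.
  const clause = trimmed ? `, ${trimmed}` : "";
  return TEMPLATES[mode](clause);
}
```

Omitting the slot when there is no context avoids emitting a dangling comma pair in the final prompt.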

Step 4: Generate Image

Run the Stella script:

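    node {baseDir}/dist/scripts/skill.js \
      --prompt "<assembled prompt>" \
      --target "<target channel>" \
      --channel "<channel provider>" \
      --caption "<caption>" \
      --resolution <1K|2K|4K> \
      --count <count>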
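When the script is launched programmatically, building the argument list as an array avoids shell-quoting problems with prompts that contain spaces or quotes. The sketch below uses the script's flag names (`--prompt`, `--target`, `--channel`, `--caption`, `--resolution`, `--count`); `buildArgs` and `RunOptions` are hypothetical, and it is an assumption that the script tolerates `--channel` and `--caption` being omitted.

```typescript
interface RunOptions {
  prompt: string;
  target: string;
  channel?: string;
  caption?: string;
  resolution?: "1K" | "2K" | "4K";
  count?: number;
}

function buildArgs(scriptPath: string, opts: RunOptions): string[] {
  const args = [scriptPath, "--prompt", opts.prompt, "--target", opts.target];
  // ASSUMPTION: optional flags may be left out when not provided.
  if (opts.channel) args.push("--channel", opts.channel);
  if (opts.caption) args.push("--caption", opts.caption);
  // Documented defaults: 1K resolution, one image.
  args.push("--resolution", opts.resolution ?? "1K");
  args.push("--count", String(opts.count ?? 1));
  return args;
}
```

The resulting array can be passed to Node's `child_process.execFile("node", args, ...)`, which does not invoke a shell.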

Step 5: Confirm Result

After the script completes, confirm to the user:

- Image was generated successfully
- Image was sent to the target channel
- If any error occurred, send a concise actionable failure message

Environment Variables

Stella supports multiple providers and a gateway-backed send path, so its sensitive runtime environment variables
are explicitly declared in metadata.openclaw.requires.env for OpenClaw's env-injection allowlist.
The skill also sets metadata.openclaw.always: true, so these declarations do not become hard load-time gates.
Actual credential validation remains runtime-driven inside skill.js, based on the selected provider.

| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Required (if Provider=gemini) | Google Gemini API key |
| `FAL_KEY` | Required (if Provider=fal) | fal API key |

Credential requirements are provider-specific:

- Default Provider=gemini: requires `GEMINI_API_KEY`
- Provider=fal: requires `FAL_KEY`
- Provider=laozhang: requires `LAOZHANG_API_KEY`
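Runtime-driven credential validation can be pictured as a lookup from the selected provider to the single variable it needs (`GEMINI_API_KEY`, `FAL_KEY`, or `LAOZHANG_API_KEY`). This is a sketch, not the logic inside skill.js; `missingCredential` is a hypothetical helper that takes the environment as a plain record so it can be exercised without touching the real process environment.

```typescript
type Provider = "gemini" | "fal" | "laozhang";

// One credential per provider, per the list above.
const REQUIRED_KEY: Record<Provider, string> = {
  gemini: "GEMINI_API_KEY",
  fal: "FAL_KEY",
  laozhang: "LAOZHANG_API_KEY",
};

// Returns the name of the missing variable, or null when the
// selected provider's credential is present and non-blank.
function missingCredential(
  env: Record<string, string | undefined>,
  provider: Provider = "gemini",
): string | null {
  const name = REQUIRED_KEY[provider];
  const value = env[name];
  return value && value.trim() !== "" ? null : name;
}
```

Only the selected provider's key is checked, so unused providers never cause a failure.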

Media File Handling (Gemini)

When Provider=gemini, Stella writes generated files to:

- `~/.openclaw/workspace/stella-selfie/`

After successful send, Stella deletes the local file immediately. If send fails, the file is kept for debugging.
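The cleanup rule (delete on success, keep on failure) is easy to get backwards, so a small sketch may help. The `send` and `remove` callbacks are injected placeholders, not Stella's real functions; in practice they would wrap the delivery call and a file deletion such as `fs.promises.unlink`.

```typescript
// Sends a generated file, then deletes it only if the send succeeded.
// Returns true when the file was sent and removed, false when it was kept.
async function sendThenCleanup(
  filePath: string,
  send: (path: string) => Promise<void>,
  remove: (path: string) => Promise<void>,
): Promise<boolean> {
  try {
    await send(filePath);
  } catch {
    // Send failed: keep the local file for debugging.
    return false;
  }
  await remove(filePath);
  return true;
}
```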

Skill Environment Options

Configure in your OpenClaw openclaw.json under skills.entries.stella-selfie.env:

| Option | Default | Description |
|---|---|---|
| `Provider` | `gemini` | Image provider: `gemini`, `fal`, or `laozhang` |

Note for Provider=fal users: fal's image editing API only accepts HTTP/HTTPS image URLs. Local file paths (from Avatar / AvatarsDir) are not supported. Configure AvatarsURLs in IDENTITY.md with public URLs of your reference images to enable image editing with fal.

Note for Provider=laozhang users: laozhang.ai uses the Google-native Gemini API format (gemini-3-pro-image-preview). It requires local reference images from Avatar / AvatarsDir and does not use AvatarsURLs. Supports 1K/2K/4K resolution and 10 aspect ratios. Get your API key at api.laozhang.ai — remember to configure a billing mode in the token settings before use.
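The two notes above imply that the reference-image source depends on the provider: fal takes public URLs, while gemini and laozhang take local files. A minimal sketch of that branching follows. The `Identity` shape and function names are hypothetical, and treating `AvatarsURLs` as a raw comma-separated string is an assumption based on how the field is described elsewhere in this document.

```typescript
type Provider = "gemini" | "fal" | "laozhang";

// Hypothetical in-memory view of the IDENTITY.md fields.
interface Identity {
  avatar?: string;
  avatarsDir?: string;
  avatarsURLs?: string; // comma-separated, as written in IDENTITY.md
}

type ReferenceSource =
  | { kind: "urls"; urls: string[] }
  | { kind: "local"; avatar?: string; avatarsDir?: string };

function referenceSource(provider: Provider, identity: Identity): ReferenceSource {
  if (provider === "fal") {
    // fal only accepts HTTP/HTTPS URLs; anything else is dropped.
    const urls = (identity.avatarsURLs ?? "")
      .split(",")
      .map((u) => u.trim())
      .filter((u) => /^https?:\/\//i.test(u));
    return { kind: "urls", urls };
  }
  // gemini and laozhang read local files and ignore AvatarsURLs.
  return { kind: "local", avatar: identity.avatar, avatarsDir: identity.avatarsDir };
}
```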

Delivery Path

- Stella sends via openclaw message send.
- Delivery auth and routing are handled by the local OpenClaw installation, not by skill-level gateway tokens.

External Endpoints And Data Flow

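| Endpoint / path | When used | Data sent |
|---|---|---|
| Google Gemini API | `Provider=gemini` | Prompt text and selected local reference images from `Avatar` / `AvatarsDir` |
| fal API | `Provider=fal` | Prompt text and public HTTP/HTTPS reference image URLs from `AvatarsURLs` |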

Security And Privacy

- Stella reads ~/.openclaw/workspace/IDENTITY.md and local avatar files to build reference context.
- Under Provider=gemini, selected local avatar images are uploaded to Gemini as part of normal image generation.
- Under Provider=fal, only public http/https avatar URLs are sent; local avatar files are not uploaded to fal directly.
- Under Provider=laozhang, local avatar files from Avatar / AvatarsDir are base64-encoded and uploaded to laozhang.ai.
- Generated files (Gemini and laozhang) are written to ~/.openclaw/workspace/stella-selfie/ and deleted after successful send.

User Configuration

Before using this skill, you must configure your OpenClaw workspace. See templates/SOUL.fragment.md for the recommended capability snippet to add to your SOUL.md.

Required: IDENTITY.md

Add the following fields to ~/.openclaw/workspace/IDENTITY.md:

CODEBLOCK5

- Avatar: Path to your primary reference image (relative to workspace root)
- AvatarsDir: Directory containing multiple reference photos of the same character (different styles, scenes, outfits)
- AvatarsURLs: Comma-separated public URLs of reference images; required for Provider=fal (local files are not supported by fal's API)

Required: avatars/ Directory

Place your reference photos in ~/.openclaw/workspace/avatars/:

- Use jpg, jpeg, png, or webp format
- All photos should be of the same character
- Different styles, scenes, outfits, and expressions work best
- Images are selected by creation time (newest first)
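The selection rules above (allowed formats, newest first) can be sketched as a pure function over directory entries. `AvatarFile`, `selectReferenceImages`, and the `limit` parameter are hypothetical: this document does not say how many references Stella blends, and case-insensitive extension matching is an assumption. In practice `createdMs` could come from a file's creation timestamp, such as `birthtimeMs` from Node's `fs.stat`.

```typescript
// Hypothetical directory entry: file name plus creation time in ms.
interface AvatarFile {
  name: string;
  createdMs: number;
}

const ALLOWED_EXTENSIONS = new Set(["jpg", "jpeg", "png", "webp"]);

function selectReferenceImages(files: AvatarFile[], limit: number): string[] {
  return files
    .filter((f) => {
      // Require a real extension; compare case-insensitively (assumption).
      const ext = f.name.split(".").pop()?.toLowerCase() ?? "";
      return f.name.includes(".") && ALLOWED_EXTENSIONS.has(ext);
    })
    // Newest first, per the rule above.
    .sort((a, b) => b.createdMs - a.createdMs)
    .slice(0, limit)
    .map((f) => f.name);
}
```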

Required: SOUL.md

Add the Stella capability block to ~/.openclaw/workspace/SOUL.md. See README.md ("4. SOUL.md") for the copy/paste snippet.

Installation

CODEBLOCK6

After installation, complete the configuration steps above before using the skill.
