External KI Integration
Use external AI services via browser automation (ChatGPT, Claude, web‑based LLMs) and APIs (Hugging Face Inference, OpenAI‑compatible endpoints) to augment your capabilities.
When to use this skill
- - You need to consult an external AI model (ChatGPT, Claude, Gemini, etc.) for reasoning, analysis, or generation tasks.
- The user has granted access to their chat interfaces (e.g., via Chrome Relay attached tab).
- You want to use Hugging Face Inference API (if token provided) for model inference.
- You need to interact with a free AI demo or Space via browser automation.
- The task benefits from a second opinion or specialized model (coding, creative writing, summarization).
Requirements
- 1. Browser automation – the
browser tool with profile="chrome" (user must have attached a tab to OpenClaw Browser Relay). - External AI accounts – user must be logged into the target service (ChatGPT, Claude, etc.) in the attached Chrome tab.
- Hugging Face token (optional) – for Inference API access, stored in
~/.openclaw/openclaw.json or provided as environment variable. - Other API keys (optional) – e.g., OpenAI, Anthropic, if user provides them.
Setup
Chrome Relay Attachment
The user must click the OpenClaw Browser Relay toolbar icon on the desired tab (badge ON). Verify attachment:
CODEBLOCK0
Or via browser tool: browser(action=status, profile="chrome").
Hugging Face Token
If token already stored in config, it will be used automatically. Otherwise, ask user to provide it.
Environment Variables (optional)
For API‑based access, you may set:
CODEBLOCK1
Browser Automation for Web UIs
General Pattern
- 1. Navigate to the service URL (e.g.,
https://chat.openai.com, https://claude.ai, https://gemini.google.com). - Wait for page load, snapshot with
refs="aria" to locate UI elements. - Find input area (role="textbox", role="textbox" with name "Message", etc.).
- Type your query using
act with ref or selector. - Click send/submit button (role="button", name="Send").
- Wait for response (poll for new text elements, detect loading indicator disappearance).
- Extract response from the output container (role="article", class "markdown", etc.).
- Return the extracted text.
Example: ChatGPT via Chrome Relay
CODEBLOCK2
Adaptation Notes
- - UI changes frequently: Use
refs="aria" for stable references (aria‑role, aria‑name). Fall back to selector with CSS classes if needed. - Rate limiting: Be gentle; wait 2–5 seconds between interactions.
- Session persistence: The attached tab retains login state; you can continue conversation in same chat.
API Integration
Hugging Face Inference API
See the dedicated
Hugging Face skill for detailed usage.
OpenAI‑compatible endpoints
If user provides an API key, you can call models via
curl or
exec:
CODEBLOCK3
Anthropic Claude
CODEBLOCK4
Cost & Safety
Browser Automation (free)
- - No direct cost, but uses user's existing subscription (if any).
- Respect rate limits; do not spam requests.
- Do not expose user credentials; rely on attached logged‑in session.
API Usage (paid)
- - Hugging Face Inference: Track estimated costs via
system/logs/hf-costs.log. Stay within monthly budget (e.g., 33€). Notify user at 50% threshold. - OpenAI/Anthropic: If user provides API key, assume they accept associated costs. Still estimate token usage and log if possible.
- General rule: Prefer browser automation for free services; use paid APIs only when explicitly authorized and task justifies cost.
Safety
- - No sensitive data: Avoid sending personal, financial, or confidential information to external services unless user explicitly approves.
- Compliance: Follow external service terms of service.
- Fallback: If external service fails, continue with internal reasoning; do not block task completion.
Integration with OpenClaw Skills
This skill complements:
- - Hugging Face skill – for dedicated Hugging Face API/Spaces.
- Browser automation patterns – for generic web interaction.
- Multi‑model orchestration – for delegating sub‑tasks to external models.
Add this skill to skills/index.md:
CODEBLOCK5
Example Workflow
- 1. Task: Need to generate a complex code snippet.
- Check: User has ChatGPT tab attached via Chrome Relay.
- Open ChatGPT, snapshot, locate input.
- Type: "Write a Python function that validates email addresses with regex and DNS MX check."
- Click Send.
- Wait for response, extract code.
- Return code to user, optionally refine via follow‑up.
- Log the interaction in memory (pattern learned).
Troubleshooting
- - Tab not attached: Ask user to click Browser Relay icon on the target tab.
- UI changes: Update refs/selectors based on snapshot.
- Rate limits: Wait longer between requests.
- API errors: Check token permissions, budget, network.
References
外部KI集成
通过浏览器自动化(ChatGPT、Claude、基于网络的LLM)和API(Hugging Face推理、兼容OpenAI的端点)使用外部AI服务,以增强您的能力。
何时使用此技能
- - 您需要咨询外部AI模型(ChatGPT、Claude、Gemini等)进行推理、分析或生成任务。
- 用户已授予对其聊天界面的访问权限(例如,通过Chrome Relay附加标签页)。
- 您想使用Hugging Face推理API(如果提供了令牌)进行模型推理。
- 您需要通过浏览器自动化与免费的AI演示或Space进行交互。
- 任务受益于第二意见或专业模型(编码、创意写作、摘要)。
要求
- 1. 浏览器自动化 – 使用profile=chrome的browser工具(用户必须已将标签页附加到OpenClaw Browser Relay)。
- 外部AI账户 – 用户必须在附加的Chrome标签页中登录目标服务(ChatGPT、Claude等)。
- Hugging Face令牌(可选) – 用于推理API访问,存储在~/.openclaw/openclaw.json中或作为环境变量提供。
- 其他API密钥(可选) – 例如,如果用户提供,则为OpenAI、Anthropic。
设置
Chrome Relay附加
用户必须在所需标签页上点击OpenClaw Browser Relay工具栏图标(徽章开启)。验证附加状态:
bash
openclaw browser status
或通过browser工具:browser(action=status, profile=chrome)。
Hugging Face令牌
如果令牌已存储在配置中,将自动使用。否则,要求用户提供。
环境变量(可选)
对于基于API的访问,您可以设置:
bash
export OPENAIAPIKEY=sk-...
export ANTHROPICAPIKEY=sk-ant-...
export HFTOKEN=hf...
网页UI的浏览器自动化
通用模式
- 1. 导航到服务URL(例如https://chat.openai.com、https://claude.ai、https://gemini.google.com)。
- 等待页面加载,使用refs=aria进行快照以定位UI元素。
- 查找输入区域(role=textbox、role=textbox且name=Message等)。
- 输入您的查询,使用带有ref或selector的act。
- 点击发送/提交按钮(role=button、name=Send)。
- 等待响应(轮询新文本元素,检测加载指示器消失)。
- 提取响应从输出容器中(role=article、class markdown等)。
- 返回提取的文本。
示例:通过Chrome Relay使用ChatGPT
javascript
// 1. 导航
browser(action=open, profile=chrome, targetUrl=https://chat.openai.com);
// 2. 加载后快照
const snap = browser(action=snapshot, profile=chrome, refs=aria, interactive=true);
// 3. 查找文本框(根据快照调整ref)
browser(action=act, profile=chrome, request={ kind: type, ref: textbox:Message, text: 您的查询内容 });
// 4. 点击发送按钮
browser(action=act, profile=chrome, request={ kind: click, ref: button:Send });
// 5. 等待响应(轮询直到出现新文本)
// 6. 提取响应
适配说明
- - UI频繁变化:使用refs=aria获取稳定引用(aria‑role、aria‑name)。必要时回退到使用CSS类的selector。
- 速率限制:谨慎操作;交互之间等待2-5秒。
- 会话持久性:附加的标签页保留登录状态;您可以在同一聊天中继续对话。
API集成
Hugging Face推理API
详细用法请参见专门的
Hugging Face技能。
兼容OpenAI的端点
如果用户提供API密钥,您可以通过curl或exec调用模型:
bash
curl -s -X POST https://api.openai.com/v1/chat/completions \
-H Authorization: Bearer $OPENAIAPIKEY \
-H Content-Type: application/json \
-d {
model: gpt-4,
messages: [{role: user, content: Hello}]
}
Anthropic Claude
bash
curl -s -X POST https://api.anthropic.com/v1/messages \
-H x-api-key: $ANTHROPIC
APIKEY \
-H anthropic-version: 2023-06-01 \
-H Content-Type: application/json \
-d {
model: claude-3-opus-20240229,
max_tokens: 1024,
messages: [{role: user, content: Hello}]
}
成本与安全
浏览器自动化(免费)
- - 无直接成本,但使用用户现有的订阅(如有)。
- 遵守速率限制;不要发送垃圾请求。
- 不要暴露用户凭据;依赖附加的已登录会话。
API使用(付费)
- - Hugging Face推理:通过system/logs/hf-costs.log跟踪估计成本。保持在月度预算内(例如33€)。在达到50%阈值时通知用户。
- OpenAI/Anthropic:如果用户提供API密钥,假设他们接受相关成本。仍估计令牌使用量并尽可能记录。
- 通用规则:优先使用免费服务的浏览器自动化;仅在明确授权且任务合理时使用付费API。
安全
- - 无敏感数据:除非用户明确批准,否则避免向外部服务发送个人、财务或机密信息。
- 合规性:遵守外部服务的使用条款。
- 回退:如果外部服务失败,继续使用内部推理;不要阻塞任务完成。
与OpenClaw技能的集成
此技能补充:
- - Hugging Face技能 – 用于专门的Hugging Face API/Spaces。
- 浏览器自动化模式 – 用于通用网页交互。
- 多模型编排 – 用于将子任务委派给外部模型。
将此技能添加到skills/index.md:
| 外部KI集成 | skills/external‑ki‑integration/SKILL.md |
示例工作流程
- 1. 任务:需要生成一个复杂的代码片段。
- 检查:用户已通过Chrome Relay附加了ChatGPT标签页。
- 打开ChatGPT,快照,定位输入。
- 输入:编写一个Python函数,使用正则表达式和DNS MX检查验证电子邮件地址。
- 点击发送。
- 等待响应,提取代码。
- 返回代码给用户,可选地通过后续对话进行优化。
- 记录交互到内存中(学习到的模式)。
故障排除
- - 标签页未附加:要求用户在目标标签页上点击Browser Relay图标。
- UI变化:根据快照更新ref/选择器。
- 速率限制:在请求之间等待更长时间。
- API错误:检查令牌权限、预算、网络。
参考