External AI Integration Skill
This skill provides patterns for using external AI models as tools that the assistant can call on demand. It extends existing browser‑automation and API‑integration skills, enabling the assistant to:
- Automate interactions with ChatGPT, Claude, Gemini, or other web‑based LLMs via Chrome Relay (browser automation).
- Call the Hugging Face Inference API for models hosted on Hugging Face (text‑generation, summarization, translation, etc.).
- Integrate external reasoning into the assistant's own workflow—e.g., asking ChatGPT for a second opinion, using Claude for detailed analysis, or leveraging Hugging Face for domain‑specific tasks.
- Avoid spawning isolated sub‑agents by treating external models as tools, keeping control and context within the main assistant session.
When to use
- You need additional reasoning power, a different model's perspective, or a specialized model (e.g., code generation, translation) that your primary model lacks.
- The task benefits from a second opinion or parallel evaluation (e.g., reviewing code, analyzing strategy).
- You want to use a model with a larger context window, better coding ability, or specific domain knowledge (Claude, ChatGPT, Hugging Face models).
- You are asked to “integrate external AI via browser” or “use ChatGPT/Claude as a tool”.
- You need to call Hugging Face Inference API for a specific model (e.g., summarization, sentiment analysis) and incorporate the result into your response.
Core patterns
1. Browser Automation (Chrome Relay) for Web‑Based LLMs
Use Chrome Relay to automate interactions with ChatGPT, Claude, Gemini, or any other web‑based LLM that requires a browser interface.
Prerequisites:
- Chrome Relay extension installed and a tab attached (user must click the OpenClaw Browser Relay toolbar icon).
- The target LLM website (e.g., `chatgpt.com`, `claude.ai`) already logged in (session cookies present).
- Basic familiarity with the browser automation playbook (`memory/patterns/playbooks.md` – “Browser Automation (Chrome Relay)”).
Steps:
1. Attach to the Chrome Relay profile (`profile="chrome"`).
2. Navigate to the target LLM (or reuse an already‑open tab).
3. Take a snapshot to locate the input field and send button (use `refs="aria"` for stable references).
4. Type the prompt into the input field and submit (click the send button or press Enter).
5. Wait for the response (poll for a new element, detect typing indicators, or use a fixed timeout).
6. Extract the response text from the appropriate DOM element.
7. Return the response to the assistant's workflow.
Example workflow:
CODEBLOCK0
Key considerations:
- Session persistence: The attached tab must stay logged in; avoid actions that log out.
- Rate limits: Be aware of the LLM's rate limits and usage policies.
- Error handling: Detect captchas, “network error” messages, or “try again” buttons and fall back gracefully.
- Multi‑turn conversations: Maintain conversation context by keeping the same tab and not refreshing.
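The "wait for the response" step is usually the least reliable part of this pattern: a fixed sleep either wastes time or cuts off a long answer. One alternative is to poll until the extracted text stops changing. The sketch below assumes a hypothetical `read_text` callable that takes a fresh snapshot and returns the text of the last message bubble; the function name, parameters, and defaults are illustrative and not part of Chrome Relay.

```python
import time
from typing import Callable, Optional


def wait_for_stable_text(
    read_text: Callable[[], str],
    timeout: float = 30.0,
    interval: float = 0.5,
    stable_reads: int = 3,
) -> Optional[str]:
    """Poll read_text() until it returns the same non-empty text
    `stable_reads` times in a row. Returns None if the timeout expires."""
    deadline = time.monotonic() + timeout
    last = None
    same = 0
    while time.monotonic() < deadline:
        text = read_text()  # hypothetical: snapshot + extract last message
        if text and text == last:
            same += 1
            if same >= stable_reads:
                return text
        else:
            # Text changed (still streaming) or is empty: restart the count.
            last = text
            same = 1 if text else 0
        time.sleep(interval)
    return None
```

A `None` result can be treated like any other external failure and routed to the fallback path.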
2. Hugging Face Inference API Integration
For models hosted on Hugging Face Spaces or the Inference API, you can call them directly via HTTP requests.
Prerequisites:
- Hugging Face API token (stored in 1Password or environment variable).
- Model identifier (e.g., `"gpt2"`, `"google/flan-t5-large"`, `"microsoft/DialoGPT-medium"`).
- Knowledge of the model's expected input/output format.
Steps:
1. Retrieve the API token (use the 1Password skill or read from `~/.huggingface/token`).
2. Construct the request (URL, headers, JSON payload).
3. Send the request via `curl`, or via `exec` with the Python `requests` module.
4. Parse the response and extract the generated text.
5. Handle errors (rate limits, model loading, invalid token).
Example script (using curl):
CODEBLOCK1
Example Python function (using requests):
CODEBLOCK2
Key considerations:
- Cost: Inference API may have costs; monitor usage.
- Model readiness: Some models need to be loaded before they can answer; include `{"options":{"wait_for_model":true}}` in the request payload so the call waits instead of failing.
- Output format: Response structure varies by model; inspect with a test call first.
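Because the response structure varies by model, it can help to centralize extraction in one place. The sketch below assumes the commonly seen shapes: a list of objects carrying `generated_text`, `translation_text`, or `summary_text`, and an object with an `error` field on failure. Confirm the actual shape with a test call for the model in use; `extract_text` and `TEXT_KEYS` are illustrative names.

```python
from typing import Any

# Keys assumed to hold the output text for common task types.
TEXT_KEYS = ("generated_text", "translation_text", "summary_text")


def extract_text(response: Any) -> str:
    """Pull the output text from a parsed Inference API JSON response."""
    if isinstance(response, dict) and "error" in response:
        raise RuntimeError(f"Inference API error: {response['error']}")
    items = response if isinstance(response, list) else [response]
    for item in items:
        if isinstance(item, dict):
            for key in TEXT_KEYS:
                if key in item:
                    return item[key]
    raise ValueError(f"Unrecognized response shape: {response!r}")
```

Raising on an unrecognized shape, rather than returning an empty string, keeps a silent format change from being mistaken for a valid empty answer.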
3. Orchestrating External AI as a Tool
Instead of spawning a sub‑agent, the assistant calls external AI within its own reasoning flow.
Pattern:
1. Determine need: Decide which external model is appropriate (ChatGPT for creative tasks, Claude for analysis, Hugging Face for specialized models).
2. Prepare the prompt: Format the prompt with clear instructions, context, and expected output format.
3. Call the tool: Use browser automation for web‑based LLMs or an API call for Hugging Face.
4. Integrate the result: Parse, validate, and incorporate the external response into your own answer.
5. Fallback: If the external call fails, continue with your own reasoning or try an alternative.
Example decision logic:
CODEBLOCK3
4. Prompt Engineering for External Models
External models may require different prompting styles than the assistant's native model.
- ChatGPT/Claude: Use conversational style, system prompts, and markdown formatting.
- Hugging Face models: Follow the model's expected input format (e.g., `"Translate English to German: ..."` for T5).
- Include context: Provide necessary background, constraints, and examples in the prompt.
- Specify output format: Ask for JSON, bullet points, code blocks, etc.
Example prompt for code review:
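````text
You are an expert software engineer reviewing the following code snippet. Please:
1. Identify potential bugs or security issues.
2. Suggest performance improvements.
3. Comment on code style and readability.
4. Output your review as a JSON with keys `bugs`, `performance`, `style`.

Code:
```python
def calculate_average(numbers):
    total = 0
    for n in numbers:
        total += n
    return total / len(numbers)
```
````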
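When a prompt asks for JSON, the reply from a chat model often arrives wrapped in prose or a code fence, so the result should be validated before it is used. A minimal sketch, assuming the review prompt above with its `bugs`, `performance`, and `style` keys; `parse_review` is an illustrative helper, not an existing function in this skill.

```python
import json
import re

REQUIRED_KEYS = {"bugs", "performance", "style"}


def parse_review(raw: str) -> dict:
    """Extract and validate the JSON review object from a model reply."""
    # Take the span from the first "{" to the last "}" so surrounding
    # prose or code-fence markers are ignored.
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    review = json.loads(match.group(0))  # raises ValueError on bad JSON
    missing = REQUIRED_KEYS - set(review)
    if missing:
        raise ValueError(f"review is missing keys: {sorted(missing)}")
    return review
```

Both failure modes raise `ValueError` (`json.JSONDecodeError` is a subclass), so a caller can catch one exception type and fall back to its own review.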
5. Error Handling and Fallbacks
External services can fail; plan for graceful degradation.
- Browser automation failures: Captchas, login required, network errors. Fallback: try Hugging Face API or continue without external help.
- API failures: Rate limits, model not found, token invalid. Fallback: use a different model or skip external step.
- Timeouts: Set reasonable timeouts (e.g., 30 seconds for browser automation, 10 seconds for API). Fallback: proceed with assistant's own reasoning.
- Log failures: Record external AI failures in `memory/YYYY‑MM‑DD.md` with tag `external‑ai‑failure` for later analysis.
Example fallback structure:
CODEBLOCK6
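One way to express graceful degradation is an ordered list of providers tried in turn, where `None` means "proceed with the assistant's own reasoning". The sketch below is an assumption about how such a chain might look: `call_with_fallbacks` and the provider callables are hypothetical, each callable is expected to enforce its own timeout, and the warning goes to a standard logger rather than to the memory file.

```python
import logging
from typing import Callable, Optional, Sequence, Tuple

logger = logging.getLogger("external_ai")


def call_with_fallbacks(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Optional[str]:
    """Try each (name, callable) in order and return the first non-empty
    result. Returns None when every provider fails."""
    for name, call in providers:
        try:
            result = call(prompt)
            if result:
                return result
            logger.warning("external-ai-failure: %s returned an empty response", name)
        except Exception as exc:  # captcha, timeout, HTTP error, ...
            logger.warning("external-ai-failure: %s failed: %s", name, exc)
    return None
```

A call such as `call_with_fallbacks(prompt, [("claude", ask_claude), ("hf", ask_hf)])` would try the browser route first and the API second, with `ask_claude` and `ask_hf` standing in for whatever wrappers the assistant has defined.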
Examples
Example 1: Code Review with Claude
Scenario: The assistant is asked to review a complex React component. It uses Claude (via Chrome Relay) for a detailed second opinion.
Steps:
1. Assistant prepares a prompt with the component code and review instructions.
2. Calls `ask_claude(prompt)` using browser automation.
3. Claude returns a structured review.
4. Assistant incorporates Claude's feedback into its final answer.
Example 2: Translation via Hugging Face
Scenario: User provides a paragraph in English and asks for a German translation. Assistant calls Hugging Face translation model.
Steps:
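1. Assistant prepares the input. T5‑style models expect a task prefix such as `"Translate English to German: <text>"`; a dedicated translation model such as `Helsinki-NLP/opus-mt-en-de` takes the raw text.
2. Calls `hf_inference("Helsinki-NLP/opus-mt-en-de", text)`.
3. Parses the generated text.
4. Returns translation to user.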
Example 3: Creative Brainstorming with ChatGPT
Scenario: User needs ideas for a blog post title. Assistant uses ChatGPT to generate 10 options.
Steps:
1. Assistant navigates to ChatGPT tab, inputs “Generate 10 catchy blog post titles about AI assistants”.
2. Waits for response, extracts list.
3. Presents the list to user, adding its own commentary.
Example 4: Combined Analysis (Assistant + External)
Scenario: User asks for a strategic analysis of a business decision. Assistant uses its own reasoning, then asks ChatGPT for potential blind spots.
Steps:
1. Assistant produces its own analysis.
2. Assistant prompts ChatGPT: “What are potential blind spots in the following analysis?”, followed by the analysis text.
3. Integrates ChatGPT's blind‑spot list into final answer.
Anti‑Patterns
- Over‑reliance on external AI: Using external models for trivial tasks increases latency and dependency. Use only when value added justifies the cost/risk.
- Ignoring context size: Web‑based LLMs have context limits; sending huge contexts may truncate or fail. Summarize or chunk appropriately.
- Exposing secrets: Never paste API tokens, passwords, or sensitive data into external AI prompts (especially web‑based). Use 1Password for tokens.
- Assuming correctness: External AI can be wrong, biased, or hallucinate. Always validate critical outputs.
- Breaking conversation flow: Browser automation that logs out or loses the tab breaks future calls. Keep session alive and avoid destructive actions.
- Cost unawareness: Hugging Face Inference API may incur costs; monitor usage and set budgets.
- Neglecting fallbacks: Not planning for external AI failure leaves the assistant stuck. Always have a fallback path.
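For the context-size anti-pattern, chunking can be as simple as packing paragraphs up to a size budget. The sketch below uses character count as a rough stand-in for tokens, which is an assumption; the real limit depends on the target model's tokenizer. `chunk_text` and its default budget are illustrative.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 8000) -> List[str]:
    """Split text into chunks of at most max_chars characters,
    breaking on blank-line paragraph boundaries where possible."""
    chunks: List[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single paragraph over the budget is hard-split.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own turn, or summarized first and the summaries combined.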
Related Patterns
- Browser Automation (Chrome Relay) playbook – detailed steps for Chrome Relay automation.
- Hugging Face skill – using Hugging Face Hub, Spaces, and Inference API with budget management.
- 1Password skill – retrieving API tokens securely.
- API‑Tool Integration skill – general patterns for calling external APIs.
- Error Recovery Automation skill – handling failures in external services.
- Health Monitoring skill – monitoring external service availability.
References
- `docs/browser-automation.md` – Chrome Relay setup and commands.
- INLINECODE20 – Hugging Face API usage.
- INLINECODE21 – retrieving secrets.
- INLINECODE22 – Browser Automation playbook.
- INLINECODE23 (this skill's core implementation).
- INLINECODE24 (orchestration playbook).
Skill Integration
When a task would benefit from external AI reasoning, read this skill to decide which model to use and how to call it. Store successful patterns in memory/patterns/tools.md. Update pending.md if external AI fails repeatedly and needs manual configuration.
This skill increases autonomy by expanding the assistant's toolset with external AI models, allowing it to tackle a wider range of tasks without spawning sub‑agents while keeping control over the workflow.
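External AI Integration Skill
This skill provides patterns for using external AI models as tools that the assistant can call on demand. It extends the existing browser-automation and API-integration skills, enabling the assistant to:
- Automate interactions: work with ChatGPT, Claude, Gemini, or other web-based LLMs via Chrome Relay (browser automation).
- Call the Hugging Face Inference API: for models hosted on Hugging Face Spaces (text generation, summarization, translation, etc.).
- Integrate external reasoning into the assistant's own workflow, for example asking ChatGPT for a second opinion, using Claude for detailed analysis, or using Hugging Face for domain-specific tasks.
- Avoid spawning isolated sub-agents: treating external models as tools keeps control and context within the main assistant session.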
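When to use
- You need additional reasoning power, a different model's perspective, or a specialized model (e.g., code generation, translation) that the primary model lacks.
- The task benefits from a second opinion or parallel evaluation (e.g., reviewing code, analyzing strategy).
- You want to use a model with a larger context window, stronger coding ability, or specific domain knowledge (Claude, ChatGPT, Hugging Face models).
- You are asked to “integrate external AI via browser” or “use ChatGPT/Claude as a tool”.
- You need to call the Hugging Face Inference API for a specific model (e.g., summarization, sentiment analysis) and incorporate the result into your response.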
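Core patterns
1. Browser Automation (Chrome Relay) for Web-Based LLMs
Use Chrome Relay to automate interactions with ChatGPT, Claude, Gemini, or any other web-based LLM that requires a browser interface.
Prerequisites:
- Chrome Relay extension installed and a tab attached (user must click the OpenClaw Browser Relay toolbar icon).
- The target LLM website (e.g., `chatgpt.com`, `claude.ai`) already logged in (session cookies present).
- Basic familiarity with the browser automation playbook (`memory/patterns/playbooks.md` – “Browser Automation (Chrome Relay)”).
Steps:
1. Attach to the Chrome Relay profile (`profile="chrome"`).
2. Navigate to the target LLM (or reuse an already-open tab).
3. Take a snapshot to locate the input field and send button (use `refs="aria"` for stable references).
4. Type the prompt into the input field and submit (click the send button or press Enter).
5. Wait for the response (poll for a new element, detect typing indicators, or use a fixed timeout).
6. Extract the response text from the appropriate DOM element.
7. Return the response to the assistant's workflow.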
Example workflow:
```python
# This is a conceptual example; the actual implementation uses browser tool calls.
import time


def ask_chatgpt(prompt):
    # 1. Make sure Chrome Relay is attached
    browser(action="open", profile="chrome", targetUrl="https://chatgpt.com")
    # 2. Snapshot to get references
    snap = browser(action="snapshot", refs="aria")
    # 3. Find the input field (aria role=textbox) and the send button
    input_ref = snap.find_element(role="textbox", name="Message")
    send_ref = snap.find_element(role="button", name="Send")
    # 4. Type the prompt and click send
    browser(action="act", request={"kind": "type", "ref": input_ref, "text": prompt})
    browser(action="act", request={"kind": "click", "ref": send_ref})
    # 5. Wait for the response (simplified)
    time.sleep(10)
    # 6. Snapshot again and extract the response from the last message bubble
    snap2 = browser(action="snapshot", refs="aria")
    response_element = snap2.find_last_message()
    return response_element.text
```
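Key considerations:
- Session persistence: The attached tab must stay logged in; avoid actions that log out.
- Rate limits: Be aware of the LLM's rate limits and usage policies.
- Error handling: Detect captchas, “network error” messages, or “try again” buttons and fall back gracefully.
- Multi-turn conversations: Maintain conversation context by keeping the same tab and not refreshing.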
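2. Hugging Face Inference API Integration
For models hosted on Hugging Face Spaces or the Inference API, you can call them directly via HTTP requests.
Prerequisites:
- Hugging Face API token (stored in 1Password or an environment variable).
- Model identifier (e.g., `"gpt2"`, `"google/flan-t5-large"`, `"microsoft/DialoGPT-medium"`).
- Knowledge of the model's expected input/output format.
Steps:
1. Retrieve the API token (use the 1Password skill or read from `~/.huggingface/token`).
2. Construct the request (URL, headers, JSON payload).
3. Send the request via `curl`, or via `exec` with the Python `requests` module.
4. Parse the response and extract the generated text.
5. Handle errors (rate limits, model loading, invalid token).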
Example script (using curl):
```bash
#!/bin/bash
set -e
MODEL="google/flan-t5-large"
PROMPT="Translate English to German: How are you?"
# Read the token from 1Password via the op CLI
API_TOKEN=$(op read "op://Personal/HuggingFace/api_token")
curl -s "https://api-inference.huggingface.co/models/$MODEL" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{\"inputs\": \"$PROMPT\"}" | jq -r '.[0].generated_text'
```
Example Python function (using requests):
```python
import os

import requests


def hf_inference(model, inputs, parameters=None):
    api_token = os.getenv("HF_TOKEN")  # or retrieve it via 1Password
    url = f"https://api-inference.huggingface.co/models/{model}"
    headers = {"Authorization": f"Bearer {api_token}"}
    payload = {"inputs": inputs}
    if parameters:
        # Extra top-level fields, e.g. {"options": {"wait_for_model": True}}
        payload.update(parameters)
    resp = requests.post(url, headers=headers, json=payload)
    resp.raise_for_status()  # raise on 4xx/5xx (rate limit, bad token, ...)
    return resp.json()
```
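Key considerations:
- Cost: The Inference API may incur costs; monitor usage.
- Model readiness: Some models need to be loaded first; include `{"options":{"wait_for_model":true}}` in the parameters.
- Output format: Response structure varies by model; inspect it with a test call first.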
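3. Orchestrating External AI as a Tool
Instead of spawning a sub-agent, the assistant calls external AI within its own reasoning flow.
Pattern:
1. Determine need: Decide which external model to use (ChatGPT for creative tasks, Claude for analysis, Hugging Face for specialized models).
2. Prepare the prompt: Format the prompt with clear instructions, context, and expected output format.
3. Call the tool: Use browser automation for web-based LLMs or an API call for Hugging Face.
4. Integrate the result: Parse, validate, and incorporate the external response into your answer.
5. Fallback: If the external call fails, continue with your own reasoning or try an alternative.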
python
def externalaiassist(task_type, prompt):
if tasktype == codereview:
# 通过浏览器自动化使用Claude
return ask_claude(prompt)
elif task_type == translation:
# 使用Hugging Face翻译模型
return hf_inference(Helsinki-NLP/opus-mt-en-de, prompt)
elif tasktype == creativewriting:
# 通过浏览器自动化使用ChatGPT
return ask_chatgpt(prompt)
else:
raise ValueError(fNo external AI configured for {task_type})
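4. Prompt Engineering for External Models
External models may require different prompting styles than the assistant's native model.
- ChatGPT/Claude: Use conversational style, system prompts, and Markdown formatting.
- Hugging Face models: Follow the model's expected input format (e.g., `"Translate English to German: ..."` for T5).
- Include context: Provide necessary background, constraints, and examples in the prompt.
- Specify output format: Ask for JSON, bullet points, code blocks, etc.
Example prompt for code review: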
````text
You are an expert software engineer reviewing the following code snippet. Please:
1. Identify potential bugs or security issues.
2. Suggest performance improvements.
3. Comment on code style and readability.
4. Output your review as a JSON with keys `bugs`, `performance`, `style`.

Code:
```python
def calculate_average(numbers):
    total = 0
    for n in numbers:
        total += n
    return total / len(numbers)
```
````
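5. Error Handling and Fallbacks
External services can fail; plan for graceful degradation.
- Browser automation failures: Captchas, login required, network errors. Fallback: try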