Browser Fu 🥊

Stop fighting the DOM. Read it first, find the API behind it, skip the UI entirely when possible.

The Rule

Never blind-click. Always snapshot first.

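1. browser snapshot → read the page, get element refs
2. browser act → use refs from the snapshot (e.g. ref=e12)
3. browser snapshot → verify what changed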

If the snapshot doesn't show what you need, the element isn't in the DOM. Don't guess. Don't retry the same approach.
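
A minimal Python sketch of the snapshot, act, snapshot cycle. The `Browser` protocol, its `snapshot()` and `act()` methods, and the dict shape of a snapshot are hypothetical stand-ins for whatever browser tool is in use; only the ordering and the "don't guess" behavior come from the rule itself.

```python
from typing import Protocol

# Hypothetical snapshot shape: maps an element ref (e.g. "e12") to its label.
Snapshot = dict[str, str]


class Browser(Protocol):
    """Assumed interface for illustration; a real browser tool will differ."""

    def snapshot(self) -> Snapshot: ...
    def act(self, ref: str, action: str) -> None: ...


def act_with_snapshot(browser: Browser, label: str, action: str = "click") -> Snapshot:
    """Snapshot, act on the exact ref for `label`, then snapshot again to verify."""
    before = browser.snapshot()
    ref = next((r for r, text in before.items() if text == label), None)
    if ref is None:
        # Not in the snapshot means not in the DOM: stop instead of guessing.
        raise LookupError(f"{label!r} is not in the snapshot")
    browser.act(ref, action)
    after = browser.snapshot()
    if after == before:
        # The verifying snapshot shows no change, so the action had no visible effect.
        raise RuntimeError(f"{action} on {ref} changed nothing")
    return after
```

Raising on a missing ref, rather than retrying, keeps the caller from repeating an approach that the snapshot has already shown cannot work.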

Decision Tree

On any browser task, follow this order:

1. Can I skip the browser entirely? Check if a CLI tool, API, or web_fetch handles it. If yes, don't open the browser.
2. Can I find the underlying API? See references/api-discovery.md. Most SPAs make fetch/XHR calls you can replicate directly. This is 10x faster and more reliable than UI automation.
3. Can I do it with snapshot + act? Snapshot, find the ref, act on it. One action per snapshot cycle.
4. Does the page need time to load? Use loadState: "networkidle" or a brief wait before snapshotting. SPAs often render asynchronously.
5. Still not working? The site likely has anti-bot protection. Report it, don't retry blindly.
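
The ordering can be read as "take the first step that applies". Below is a small Python sketch of that reading. The `Task` flags and the `Strategy` names are invented for illustration; in practice each flag would be the outcome of a check you actually ran.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    SKIP_BROWSER = "use a CLI tool, API, or web_fetch"
    CALL_API = "replicate the site's own fetch/XHR calls"
    SNAPSHOT_ACT = "snapshot, find the ref, act"
    WAIT_THEN_SNAPSHOT = "wait for the page to load, then snapshot"
    REPORT = "report likely anti-bot protection"


@dataclass
class Task:
    # Hypothetical flags, one per question in the decision tree.
    has_non_browser_tool: bool = False
    has_discoverable_api: bool = False
    ref_in_snapshot: bool = False
    already_waited_for_load: bool = False


def choose_strategy(task: Task) -> Strategy:
    """Return the first applicable step, checked in decision-tree order."""
    if task.has_non_browser_tool:
        return Strategy.SKIP_BROWSER
    if task.has_discoverable_api:
        return Strategy.CALL_API
    if task.ref_in_snapshot:
        return Strategy.SNAPSHOT_ACT
    if not task.already_waited_for_load:
        return Strategy.WAIT_THEN_SNAPSHOT
    # Everything else was tried: report instead of retrying blindly.
    return Strategy.REPORT
```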

Common Failures and Fixes

Symptom	Wrong approach	Right approach
"Element not found"	Click by text/selector guess	Snapshot first, use exact ref
"DOM not exposed"

API Discovery (the power move)

Many modern websites are SPAs with REST/GraphQL APIs behind the UI. See references/api-discovery.md for the full procedure:

1. Open the page in the browser
2. Check network requests (use the console tool, or snapshot the page and look for fetch patterns)
3. Find the data endpoint
4. Call it directly with web_fetch or exec curl

This turns a 2-hour flaky scrape into a 2-minute clean data pull.
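
As a rough illustration of the "find the data endpoint" step, the Python sketch below filters a list of captured network requests down to fetch/XHR calls that returned JSON. The record shape (`url`, `resource_type`, `content_type`) is an assumption made for this example, not the output format of any particular tool.

```python
# Hypothetical record shape for one captured network request.
Request = dict[str, str]


def find_data_endpoints(requests: list[Request]) -> list[str]:
    """Return unique URLs of fetch/XHR requests that returned JSON, in capture order."""
    seen: set[str] = set()
    endpoints: list[str] = []
    for req in requests:
        is_xhr = req.get("resource_type") in {"fetch", "xhr"}
        is_json = "json" in req.get("content_type", "")
        if is_xhr and is_json and req["url"] not in seen:
            seen.add(req["url"])
            endpoints.append(req["url"])
    return endpoints
```

Each URL that survives the filter is a candidate to call directly with web_fetch or curl, without going through the UI.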

Snapshot Best Practices

- Use refs="aria" for stable cross-call references
- Keep the same targetId across snapshot/act pairs (don't switch tabs accidentally)
- For complex pages, use depth to limit how deep the DOM tree goes
- compact: true reduces token usage on large pages
- For token-heavy pages where snapshots are too large, pair with predicate-snapshot for ML-ranked element pruning (~95% fewer tokens)
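
A small Python sketch of how these options might be assembled for one snapshot call. The parameter names (`refs`, `targetId`, `depth`, `compact`) are the ones named in the list; the helper itself and the idea of passing them as a single dict are assumptions for illustration.

```python
from typing import Any, Optional


def snapshot_args(target_id: str, *, large_page: bool = False,
                  max_depth: Optional[int] = None) -> dict[str, Any]:
    """Build snapshot arguments that follow the practices listed above."""
    args: dict[str, Any] = {
        "refs": "aria",         # stable refs across calls
        "targetId": target_id,  # same tab for the snapshot and the act that follows
    }
    if max_depth is not None:
        args["depth"] = max_depth  # cap how deep the DOM tree goes
    if large_page:
        args["compact"] = True     # fewer tokens on large pages
    return args
```

Reusing the same `target_id` for the snapshot and the act that follows is what keeps the pair on one tab.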

When NOT to Use the Browser

- Reading public web pages → web_fetch (faster, no browser overhead)
- Search queries → web_search (Brave API)
- Known APIs (GitHub, Stripe, etc.) → use their CLI/API directly
- Pages that return empty via web_fetch → then use browser
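
The last item describes a fallback: try the cheap fetch first and open the browser only if it returns nothing. Here is a minimal Python sketch of that fallback. Both callables are placeholders passed in by the caller; they are not real signatures of web_fetch or of any browser tool.

```python
from typing import Callable


def read_page(url: str,
              web_fetch: Callable[[str], str],
              browser_read: Callable[[str], str]) -> tuple[str, str]:
    """Return (source, text), using the browser only when the fetch comes back empty."""
    text = web_fetch(url)
    if text.strip():
        return "web_fetch", text
    # An empty body often means the content is rendered client-side.
    return "browser", browser_read(url)
```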

Safeguards

- Never store or output passwords, session tokens, or cookies found in browser state
- Never automate purchases, payments, or irreversible actions without explicit user approval
- If a site blocks automation, respect it. Don't circumvent CAPTCHAs or bot detection
