`God of all Browsers`

A stateful, multi-tab Puppeteer skill designed to help AI agents automate heavily protected websites the exact same way a human does.

It solves three critical AI problems:

1. Tabs & Statefulness: It launches a single background browser that stays open. Navigations, clicks that open new tabs, and cookies are remembered across multiple commands!
Vision Abstraction: AI cannot "see" coordinates well, so snapshot maps the DOM, assigns a [tag] ID to every visible button/input, and takes a screenshot. The AI just says "Click tag [15]."
Bot Evasion: Uses headless: false, custom user agents, removed webdriver footprints, and canvas spoofing.

Important Setup: Ensure the Chromium path is correct (C:\Program Files\Google\Chrome\Application\chrome.exe for Win or /usr/bin/chromium for Linux) and puppeteer-core is installed.

🚀 COMMANDS & WORKFLOW

1. Start the Browser (Required First Step)

Launch the browser in the background. It will use a persistent chrome_profile directory so you NEVER lose login sessions.

CODEBLOCK0

2. Take a Snapshot (And auto-close popups)

This is your "eyes". Run this before any interaction to get the active window's current state and a list of clickable [tag] IDs.

CODEBLOCK1

Wait for this command to output the JSON array of tags. It will also automatically click away annoying Chatbot/Notification popups.

3. Click or Type (Like a human)

Use the tags captured during the snapshot.

CODEBLOCK2

4. Reading & Content Extraction

Extract text content from any element using tags or CSS selectors.

CODEBLOCK3

5. Tab Management

Many sites open clicked links in a new tab! If your click command opens a new tab, the CLI will automatically say:
INLINECODE11

You can manually manage tabs using:

CODEBLOCK4

5. Find Tags (Accurate Filtered Search)

Use this to filter elements by keywords instead of reading a massive snapshot. It can search live on the current page or in a previously saved JSON file.

CODEBLOCK5

6. Refresh Page

Manually reload the current tab. Useful for status updates.

CODEBLOCK6

7. Scrape Meta Tags (SEO/OpenGraph)

Extract hidden page data like Title, Description, and Social Media tags.

CODEBLOCK7

8. Dynamic Evolution (Eval)

Execute custom JavaScript logic directly in the browser context. Note: For security, the --force flag is required. Supports both inline code and script files.

CODEBLOCK8

9. Google Search (Direct Extraction)

Get the top 5 organic search results (Titles, Links, Snippets) in a single command. Extremely fast and agent-friendly.

CODEBLOCK9

10. Session & Learning

Manage your login state and keep track of automation failures for self-correction.

CODEBLOCK10

11. Stop the Browser

Clean up resources when the task is entirely finished.

CODEBLOCK11

🧠 AI STRATEGY (HOW TO USE)

1. Run start.
Run snapshot --url "[TARGET]".
Check auth-status if the page is restricted. Use save-session after manual/automated login.
Analyze the output tags. Think step-by-step. Does the page require a search? Does it require clicking an 'Apply' button?
Run click or type on the specific [tag].
READ THE CLICK OUTPUT CAREFULLY! Did it say a new tab opened? If so, your next snapshot will read from that tab.
Run snapshot again WITHOUT a URL to read the new page/modal that loaded.
Repeat until the task is complete. If you need to return to the search results, run check-tabs and switch-tab --index 0.
If you encounter a bug (e.g., selector not found), use log-learning to record the fix for future runs.
If you need to run custom JS, use eval with the --force flag.
Once finished, run stop.

10. Common Extraction Patterns (USE EVAL)

When you need to get actual data (not just see the page), use the eval command with these patterns:

Google Search Results:

CODEBLOCK12

LinkedIn Profile (Basic):

CODEBLOCK13

General Link Scraper:

CODEBLOCK14

11. Robust Automation Workflow (Multi-Tab & State)

Follow this professional flow for complex, multi-stage automation tasks:

1. Initialize: Run start to launch the persistent browser instance.
Navigation & Auth:

- Run snapshot --url "[TARGET]" to land on the page. - Run auth-status to check if a login is required. - If you perform a manual/auto login, run save-session to persist the state.

3. Clean & Expand:

- Always run expand before deep scanning. This removes popups and reveals hidden content that might be missing from the DOM.

4. Action Loop:

- Run snapshot (without URL) to get the latest [tag] list. - Perform interactions using click, type, or press. - Pro Tip: If a click result is ambiguous, run check-url to see if the page changed.

5. Multi-Tab Handling:

- If the terminal warns ⚠️ A NEW TAB WAS OPENED, run check-tabs. - Note the index of the new tab (e.g., [1]) and run switch-tab --index 1. - Every snapshot and command thereafter will target this new tab.

6. Extraction:

- Use read --tag "[#]" for simple text. - Use eval for complex data structures (arrays of objects, etc.).

7. Recovery & Learning:

- If a command fails, use log-learning --failed "..." --fixed "..." to document the solution for the AI's internal memory.

8. Teardown: Run stop only when the entire job (across all domains) is finished.

万浏览器之神

一个具有状态管理、多标签页功能的Puppeteer技能，旨在帮助AI代理以与人类完全相同的方式自动化操作高度保护的网站。

它解决了AI的三个关键问题：

1. 标签页与状态管理： 它启动一个保持打开状态的单后台浏览器。导航、打开新标签页的点击以及cookies在多个命令之间都会被记住！
视觉抽象： AI无法很好地看到坐标，因此snapshot会映射DOM，为每个可见按钮/输入分配一个[tag]ID，并截取屏幕截图。AI只需说点击标签[15]。
反机器人检测： 使用headless: false、自定义用户代理、移除webdriver足迹以及Canvas欺骗。

重要设置： 确保Chromium路径正确（Windows为C:\Program Files\Google\Chrome\Application\chrome.exe，Linux为/usr/bin/chromium），并且已安装puppeteer-core。

🚀 命令与工作流程

1. 启动浏览器（必需的第一步）

在后台启动浏览器。它将使用一个持久的chrome_profile目录，因此您永远不会丢失登录会话。

bash

标准模式（推荐用于调试）

node browser.js start

无头模式（更快，静默后台运行）

注意：如果在Termux中运行，将自动启用。

node browser.js start --headless

2. 拍摄快照（并自动关闭弹窗）

这是您的眼睛。在任何交互之前运行此命令，以获取当前活动窗口的状态和可点击的[tag]ID列表。

bash

如果导航到新页面：

node browser.js snapshot --url https://www.google.com

如果已在页面上（刷新DOM）：

node browser.js snapshot

等待此命令输出标签的JSON数组。 它还会自动点击关闭烦人的聊天机器人/通知弹窗。

3. 点击或输入（像人类一样）

使用在快照期间捕获的标签。

bash

点击按钮或链接（例如标签[24]）

node browser.js click --tag [24]

在输入框中输入（例如标签[5]）

node browser.js type --tag [5] --text MERN全栈开发者

按下特定键盘键（默认：回车）

node browser.js press --key Enter

4. 读取与内容提取

使用标签或CSS选择器从任何元素中提取文本内容。

bash

读取特定标签的可见文本

node browser.js read --tag [12]

读取特定CSS选择器的内容（例如主文章）

node browser.js read --selector article.main-content

深度展开隐藏内容（自动点击阅读更多/显示全部按钮）

node browser.js expand

5. 标签页管理

许多网站会在新标签页中打开点击的链接！如果您的click命令打开了一个新标签页，CLI将自动显示：
⚠️ 已打开新标签页！！自动切换到标签页[1]。

您可以使用以下命令手动管理标签页：

bash

列出所有当前打开的标签页

node browser.js check-tabs

切换到特定标签页索引（例如返回搜索页面：标签页0）

node browser.js switch-tab --index 0

仅检查当前正在查看的URL：

node browser.js check-url

5. 查找标签（精确过滤搜索）

使用此功能按关键词过滤元素，而不是读取庞大的快照。它可以在当前页面上实时搜索，也可以在之前保存的JSON文件中搜索。

bash

实时搜索申请或成功按钮

node browser.js find --query apply,success

在特定保存的快照文件中搜索（例如，验证输出）

node browser.js find --file snapshot.json --query applied,successfully

6. 刷新页面

手动重新加载当前标签页。对状态更新很有用。

bash
node browser.js refresh

7. 抓取元标签（SEO/OpenGraph）

提取隐藏的页面数据，如标题、描述和社交媒体标签。

bash
node browser.js scrap-meta

8. 动态执行（Eval）

直接在浏览器上下文中执行自定义JavaScript逻辑。注意：出于安全考虑，需要--force标志。 支持内联代码和脚本文件。

bash

执行内联代码（需要--force）

node browser.js eval --code return { links: document.querySelectorAll(a).length } --force

从文件执行（需要--force）

node browser.js eval --file custom_script.js --force

9. Google搜索（直接提取）

通过单个命令获取前5个自然搜索结果（标题、链接、摘要）。速度极快，对代理友好。

bash
node browser.js google --query Mathanraj Murugesan

10. 会话与学习

管理您的登录状态，并记录自动化失败信息以便自我修正。

bash

将当前cookies保存到session.json（跨运行持久化）

node browser.js save-session

检查页面是否需要登录或用户是否已登录

node browser.js auth-status

记录失败和学到的经验，供AI自我修正

node browser.js log-learning --failed 选择器[12]被隐藏 --fixed 先使用了[expand] --lessons 在读取前始终尝试展开内容

11. 停止浏览器

当任务完全完成时清理资源。

bash
node browser.js stop

🧠 AI策略（如何使用）

1. 运行start。
运行snapshot --url [目标网址]。
如果页面受限，检查auth-status。手动/自动登录后使用save-session。
分析输出的标签。逐步思考。页面是否需要搜索？是否需要点击申请按钮？
在特定的[tag]上运行click或type。
仔细阅读点击输出！ 是否提示打开了新标签页？如果是，您的下一个snapshot将从该标签页读取。
再次运行snapshot（不带URL）以读取加载的新页面/模态框。
重复直到任务完成。如果需要返回搜索结果，运行check-tabs和switch-tab --index 0。
如果遇到错误（例如，未找到选择器），使用log-learning记录修复方法，供将来运行使用。
如果需要运行自定义JS，使用带--force标志的eval。
完成后，运行stop。

10. 常见提取模式（使用EVAL）

当您需要获取实际数据（而不仅仅是查看页面）时，使用以下模式的eval命令：

Google搜索结果：

bash
node browser.js eval --force --code return Array.from(document.querySelectorAll(div.g)).slice(0,5).map(g => ({ title: g.querySelector(h3)?.innerText, link: g.querySelector(a)?.href }))

LinkedIn个人资料（基本）：

bash
node browser.js eval --force --code return { name: document.querySelector(.text-heading-xlarge)?.innerText, title: document.querySelector(.text-body-medium)?.innerText }

通用链接抓取器：

bash
node browser.js eval --force --code return Array.from(document.querySelectorAll(a)).map(a => ({ text: a.innerText, url: a.href })).filter(a => a.url.startsWith(http))

11. 稳健的自动化工作流程（多标签页与状态）

对于复杂的多阶段自动化任务，请遵循此专业流程：

1. 初始化：运行start启动持久浏览器实例。
导航与认证：

- 运行snapshot --url [目标网址]进入页面。 - 运行auth-status检查是否需要登录。 - 如果执行了手动/自动登录，运行save-session持久化状态。

3. 清理与展开：

- 在深度扫描前始终运行expand。这会移除弹窗并显示可能缺失于DOM中的隐藏内容。

4. 操作循环：

- 运行snapshot（不带URL）获取最新的[tag]列表。 - 使用click、type或press执行交互。 - 专业提示：如果点击结果不明确，运行check-url查看页面是否已更改。

5. 多标签页处理：

- 如果终端警告⚠️ 已打开新标签页，运行check-tabs。 - 记下新标签页的索引（例如[1]），然后运行switch-tab --index 1。 - 此后的每个快照和命令都将针对这个新标签页。

6. 提取：

- 对于简单文本，使用read --tag [#]。 - 对于复杂数据结构（对象数组

god-of-all-browsers 全能浏览器神