Ghost Browser
Automated Chrome browser for AI agent web tasks. Powered by nodriver for reliable browser control. Every command is designed to minimize token usage and maximize accuracy.
Use for: web automation, screenshots, page reading, form filling, scraping, cookie/session management, and persistent browser profiles.
How to Browse — Follow This Workflow
ALWAYS use this workflow. Never use raw content (HTML) or CSS-selector click/type as your first choice.
Step 1: Navigate and understand the page
```bash
ghost-browser navigate https://example.com
ghost-browser wait-ready
ghost-browser page-summary
```
`page-summary` returns: page title, URL, element counts (links/buttons/inputs/forms), whether there's a login form, and a short text preview. Costs ~10 tokens.
Step 2: See what you can interact with
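```bash
ghost-browser elements              # numbered list of all interactive elements
ghost-browser elements --form-only  # form inputs only (for login/signup/search)
```
Output:
```
[0] link Home → /
[1] link Products → /products
[2] button Sign In
[3] input[email] Email address
[4] input[password] Password
[5] submit Log In
```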
Step 3: Interact by visible text (NOT CSS selectors)
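```bash
ghost-browser interact click "Sign In"
ghost-browser interact type "Email" --type-text "user@example.com"
ghost-browser interact type "Password" --type-text "secret123"
# Or fill the whole form at once
ghost-browser fill-form '{"email":"user@example.com","password":"secret123"}' --submit
```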
Step 4: Read page content as markdown
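```bash
ghost-browser readable                    # full page as clean markdown
ghost-browser readable --max-length 5000  # cap the length to save tokens
```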
Never use content — it returns raw HTML which wastes thousands of tokens. Use readable instead.
Step 5: Wait for dynamic pages
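```bash
ghost-browser wait-ready               # wait for network idle + stable DOM
ghost-browser wait-ready --timeout 10  # custom timeout (seconds)
```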
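Steps 1 and 5 are usually run back to back, so they can be wrapped in one helper. This is a minimal sketch rather than part of the skill: the function name `gb_open` and the `GHOST_BROWSER` override variable are hypothetical, and the 10-second default timeout is an arbitrary choice.

```bash
# Hypothetical override: point GHOST_BROWSER at a stub (e.g. echo) for dry runs.
GHOST_BROWSER="${GHOST_BROWSER:-ghost-browser}"

# Usage: gb_open <url> [timeout-seconds]
# Navigates, waits for the page to settle, then prints the cheap page summary.
gb_open() {
  local url="$1" timeout="${2:-10}"
  "$GHOST_BROWSER" navigate "$url" &&
    "$GHOST_BROWSER" wait-ready --timeout "$timeout" &&
    "$GHOST_BROWSER" page-summary
}
```

Chaining with `&&` stops at the first failing command, so a failed navigation does not produce a summary of the previous page.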
Complete Login Example
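```bash
ghost-browser start
ghost-browser navigate https://mysite.com/login
ghost-browser wait-ready
ghost-browser elements --form-only
ghost-browser fill-form '{"email":"me@example.com","password":"mypass"}' --submit
ghost-browser wait-ready
ghost-browser page-summary         # verify the login succeeded
ghost-browser session save mysite  # save auth state for later
```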
Restore a Previous Session
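```bash
ghost-browser start --profile mysite
ghost-browser session load mysite
ghost-browser navigate https://mysite.com/dashboard
ghost-browser page-summary
```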
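The login and restore flows can be combined so that a saved session is reused when one exists and a fresh login happens otherwise. The sketch below is an assumption-laden illustration, not part of the skill: `gb_login_or_restore`, `GHOST_BROWSER`, and `GHOST_STATE_DIR` are hypothetical names, the browser is assumed to be started already, and the file check relies on the `state/sessions/<name>.json` layout described under Session Management.

```bash
# Hypothetical overrides: GHOST_BROWSER can be a stub for dry runs,
# GHOST_STATE_DIR points at the skill's state/ directory.
GHOST_BROWSER="${GHOST_BROWSER:-ghost-browser}"
GHOST_STATE_DIR="${GHOST_STATE_DIR:-state}"

# Usage: gb_login_or_restore <session-name> <login-url> <form-json>
gb_login_or_restore() {
  local name="$1" url="$2" form_json="$3"
  if [ -f "$GHOST_STATE_DIR/sessions/$name.json" ]; then
    # A saved session exists: restore cookies and storage instead of logging in.
    "$GHOST_BROWSER" session load "$name"
  else
    # No saved session: log in through the form, then save the auth state.
    "$GHOST_BROWSER" navigate "$url" &&
      "$GHOST_BROWSER" wait-ready &&
      "$GHOST_BROWSER" fill-form "$form_json" --submit &&
      "$GHOST_BROWSER" wait-ready &&
      "$GHOST_BROWSER" session save "$name"
  fi
}
```

A saved session can expire on the server side, so it is still worth running `page-summary` afterwards to confirm that the page is not showing a login form.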
Command Reference
Preferred Commands (use these first)
| Command | What it does | Token cost |
|---|---|---|
| `page-summary` | Page overview: title, URL, element counts, flags | ~10 |
| `elements` | Numbered list of buttons, links, inputs | ~50-200 |
| `elements --form-only` | Just form inputs | ~10-50 |
| `readable` | Full page as clean markdown | ~500-10000 |
| `readable --max-length N` | Page markdown capped at N chars | controlled |
| `interact click "text"` | Click by visible text | action |
| `interact type "label" --type-text "value"` | Type by label/placeholder text | action |
| `fill-form '{"field":"value"}' --submit` | Fill and submit a form | action |
| `hover "text" --by-text` | Hover by visible text | action |
| `wait-ready` | Wait for page to finish loading | ~5 |
| `session save <name>` | Save cookies + localStorage + sessionStorage | ~10 |
| `session load <name>` | Restore full auth state | ~10 |
Lifecycle
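```bash
ghost-browser start                             # start the browser daemon
ghost-browser start --headless                  # run without a visible window
ghost-browser start --profile work              # use a named profile (persistent data)
ghost-browser start --extension /path/ext       # load an unpacked Chrome extension
ghost-browser start --proxy socks5://host:port  # use a proxy
ghost-browser stop                              # graceful shutdown
ghost-browser status                            # check whether the daemon is running
ghost-browser status --json                     # machine-readable status
ghost-browser health                            # quick health check
```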
Navigation & Tabs
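```bash
ghost-browser navigate <url>              # navigate the current tab (or reuse a matching tab)
ghost-browser navigate <url> --force-new  # always open a new tab
ghost-browser tabs                        # list open tabs with their IDs
ghost-browser tabs --json                 # machine-readable tab list
ghost-browser activate-tab <id>           # switch to a tab by ID
ghost-browser close-tab <id>              # close a tab by ID
ghost-browser wait-ready                  # wait for the page to load fully
ghost-browser wait-ready --timeout 10     # custom timeout (seconds)
```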
After navigate, all commands automatically target the navigated tab.
Page Understanding
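```bash
ghost-browser page-summary                # quick overview (~10 tokens)
ghost-browser elements                    # all interactive elements (numbered)
ghost-browser elements --form-only        # form inputs only
ghost-browser elements --limit 50         # cap the list at 50 elements
ghost-browser readable                    # full page as markdown
ghost-browser readable --max-length 5000  # cap the output length
ghost-browser content                     # raw HTML (avoid; use readable instead)
```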
Text-Based Interaction (preferred)
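```bash
ghost-browser interact click "Sign In"                              # click by button/link text
ghost-browser interact type "Email" --type-text "user@example.com"  # type by label
ghost-browser interact click "Products" --index 1                   # if several match, click the second
ghost-browser fill-form '{"email":"a@b.com","password":"x"}' --submit
ghost-browser hover "Menu" --by-text                                # hover by visible text
```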
CSS Selector Interaction (fallback)
Use these only when text-based interaction fails.
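```bash
ghost-browser click "button.submit"          # click by CSS selector
ghost-browser type "input#email" "a@b.com"   # type by CSS selector
ghost-browser find "h1"                      # find elements by selector
ghost-browser hover ".dropdown"              # hover by CSS selector
ghost-browser wait ".loaded" --timeout 10    # wait for an element to appear
ghost-browser scroll --down                  # scroll down
ghost-browser scroll --up                    # scroll up
ghost-browser scroll --to 500                # scroll to a Y position
ghost-browser eval "document.title"          # run arbitrary JavaScript
```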
Cookies & Storage
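```bash
ghost-browser cookies                                       # list all cookies
ghost-browser cookies --domain example.com                  # filter by domain
ghost-browser set-cookie name value                         # set a cookie
ghost-browser set-cookie name value --domain .example.com   # set a cookie with a domain
ghost-browser clear-cookies                                 # clear all cookies
ghost-browser clear-cookies --domain example.com            # clear cookies for one domain
ghost-browser save-cookies --file cookies.json              # export cookies as JSON
ghost-browser load-cookies cookies.json                     # import cookies from JSON
ghost-browser storage list                                  # list localStorage entries
ghost-browser storage list --session                        # list sessionStorage entries
ghost-browser storage get <key>                             # get a value
ghost-browser storage set <key> <value>                     # set a value
ghost-browser storage delete <key>                          # delete a key
ghost-browser storage clear                                 # clear all localStorage
```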
Session Management
CODEBLOCK14
Sessions are saved locally under state/sessions/<name>.json. They contain cookies and storage data needed to restore an authenticated state. Delete session files when no longer needed.
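Deleting stale session files can be scripted. The following is a sketch under stated assumptions: `gb_prune_sessions` and `GHOST_STATE_DIR` are hypothetical names, the 30-day default is arbitrary, and the script assumes it runs from the skill directory so that `state/sessions/` resolves. It relies on the `-maxdepth` and `-delete` options found in GNU and BSD `find`.

```bash
# Hypothetical override for the location of the skill's state/ directory.
GHOST_STATE_DIR="${GHOST_STATE_DIR:-state}"

# Usage: gb_prune_sessions [days]   (default: 30)
# Deletes saved session files not modified within the given number of days
# and prints the path of each file it removes.
gb_prune_sessions() {
  local days="${1:-30}"
  # Nothing to prune if no sessions have been saved yet.
  [ -d "$GHOST_STATE_DIR/sessions" ] || return 0
  # -mtime +N matches files last modified more than N days ago.
  find "$GHOST_STATE_DIR/sessions" -maxdepth 1 -type f -name '*.json' \
    -mtime "+$days" -print -delete
}
```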
Files & Media
CODEBLOCK15
Debugging
CODEBLOCK16
Network and console logging run automatically in the background.
Window
CODEBLOCK17
Profile Management
Profiles persist browser data (history, cookies, extensions) across sessions.
CODEBLOCK18
Extension Management
CODEBLOCK19
Challenge Handling
CODEBLOCK20
Challenges are also detected and handled automatically in the background.
Decision Guide
| I want to... | Use this |
|---|---|
| Understand what's on a page | `page-summary` then `elements` if needed |
| Read page text content | `readable` (NOT `content`) |
| Click a button | `interact click "Button Text"` |
| Fill a login form | `fill-form '{"email":"...","password":"..."}' --submit` |
| Type into a specific field | `interact type "Field Label" --type-text "value"` |
| Wait for page to load | `wait-ready` |
| See what I can click/type | `elements` or `elements --form-only` |
| Save login for later | `session save sitename` |
| Restore a saved login | `session load sitename` |
| Debug a failing request | `network-log --filter domain.com` |
| Check for JS errors | `console-log --level error` |
| Take a screenshot | `screenshot --output ./page.png` |
| Run custom JavaScript | `eval "your code here"` |
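When a page misbehaves, the three debugging rows of the guide above are often wanted together. The helper below is a sketch, not a command the skill provides: `gb_debug_snapshot`, `GHOST_BROWSER`, and the output file names are hypothetical, and it assumes the log commands write their results to standard output.

```bash
# Hypothetical override: point GHOST_BROWSER at a stub for dry runs.
GHOST_BROWSER="${GHOST_BROWSER:-ghost-browser}"

# Usage: gb_debug_snapshot <domain> [output-dir]
# Collects JS errors, network requests for one domain, and a screenshot.
gb_debug_snapshot() {
  local domain="$1" out_dir="${2:-.}"
  mkdir -p "$out_dir" || return 1
  # The commands come from the guide above; only the file names are made up.
  "$GHOST_BROWSER" console-log --level error > "$out_dir/console-errors.txt"
  "$GHOST_BROWSER" network-log --filter "$domain" > "$out_dir/network.txt"
  "$GHOST_BROWSER" screenshot --output "$out_dir/page.png"
}
```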
JSON Output
All commands support --json for machine-readable output:
CODEBLOCK21
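Because the JSON field names are not documented here, a script that consumes `--json` output can at least check that the output parses before using it. This is a sketch under assumptions: `gb_json` and `GHOST_BROWSER` are hypothetical names, and validation uses `python3 -m json.tool` from the Python standard library, which is assumed to be on the `PATH`.

```bash
# Hypothetical override: point GHOST_BROWSER at a stub for dry runs.
GHOST_BROWSER="${GHOST_BROWSER:-ghost-browser}"

# Usage: gb_json <command> [args...]
# Runs the command with --json appended, prints its output on success,
# and fails if the command fails or the output is not valid JSON.
gb_json() {
  local out
  out="$("$GHOST_BROWSER" "$@" --json)" || return 1
  # json.tool exits non-zero on malformed input, so it doubles as a validator.
  printf '%s\n' "$out" | python3 -m json.tool > /dev/null || return 1
  printf '%s\n' "$out"
}
```

For example, `gb_json status` or `gb_json tabs` would print the raw JSON only if it is well formed.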
Installation
After installing the skill, run the setup script:
CODEBLOCK22
This creates a Python virtual environment and installs dependencies automatically.
Requirements
- Python 3.8+
- Google Chrome installed on the system
- nodriver (installed automatically by `setup.sh`)
Data & Privacy
This skill stores the following data locally under its state/ directory:
| Data | Location | Contains |
|---|---|---|
| Browser state | INLINECODE39 | Port numbers, process ID |
| Logs | `state/browser.log` | Daemon debug logs |
| Profiles | `state/profiles/` | Chrome user data (history, cookies) |
| Sessions | `state/sessions/` | Saved auth state (cookies, localStorage) |
To clean up: delete the state/ directory to remove all persistent data. Use ghost-browser profile delete <name> to remove individual profiles.
Session and cookie files may contain authentication tokens. Handle them carefully and delete when no longer needed.
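One way to handle these files carefully on a shared machine is to restrict them to the owning user. The sketch below is a suggestion rather than something the skill does itself: `gb_lock_down_state` and `GHOST_STATE_DIR` are hypothetical names, and it assumes a Unix-like system and that it runs from the skill directory.

```bash
# Hypothetical override for the location of the skill's state/ directory.
GHOST_STATE_DIR="${GHOST_STATE_DIR:-state}"

# Usage: gb_lock_down_state
# Makes state/ and state/sessions/ owner-only and session files owner read/write.
gb_lock_down_state() {
  [ -d "$GHOST_STATE_DIR" ] || return 0
  # Only the owner may enter or list the state directory.
  chmod 700 "$GHOST_STATE_DIR"
  if [ -d "$GHOST_STATE_DIR/sessions" ]; then
    chmod 700 "$GHOST_STATE_DIR/sessions"
    # Session files may hold auth tokens: owner read/write only.
    find "$GHOST_STATE_DIR/sessions" -type f -name '*.json' -exec chmod 600 {} +
  fi
}
```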
Security
- Browser control server binds to 127.0.0.1 only (localhost, not network-accessible)
- The skill does not modify any files outside its own directory
- No environment variables or external credentials are required
- All persistent data is stored under the skill's `state/` directory
Capability Disclosure
This skill is a full browser automation tool. By design, it includes capabilities that security scanners may flag:
- `eval`: Executes arbitrary JavaScript in the browser context. This is standard for any browser automation tool (equivalent to Playwright's `page.evaluate()` or Puppeteer's `page.evaluate()`).
- `upload` / `download`: Reads local files for upload and saves downloaded files to disk. Required for any browser that handles file inputs or downloads.
- `session save/load`: Persists cookies, localStorage, and sessionStorage to JSON files under `state/sessions/`. These files may contain authentication tokens. Delete sessions you no longer need.
- `install-extension` / `load-extension`: Loads Chrome extensions programmatically. On macOS, extension installation from .crx files may use osascript for Chrome Web Store installs.
- Anti-detection: Uses nodriver (instead of Playwright/Puppeteer) to avoid setting `navigator.webdriver=true`. Blocks detectable CDP domains (`Runtime.enable`, `Console.enable`). Patches mouse event coordinates to avoid synthetic click detection. These are stealth features for bypassing bot detection on websites, not evasion of security tools on your machine.
None of these capabilities access data outside the browser or the skill's own state/ directory. The skill does not phone home, collect telemetry, or transmit data to any third party.
Ghost Browser
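Automated Chrome browser for AI agent web tasks. Powered by nodriver for reliable browser control. Every command is designed to minimize token usage and maximize accuracy.
Use for: web automation, screenshots, page reading, form filling, scraping, cookie/session management, and persistent browser profiles.
How to Browse: Follow This Workflow
ALWAYS use this workflow. Never use raw `content` (HTML) or CSS-selector click/type as your first choice.
Step 1: Navigate and understand the page
```bash
ghost-browser navigate https://example.com
ghost-browser wait-ready
ghost-browser page-summary
```
`page-summary` returns: page title, URL, element counts (links/buttons/inputs/forms), whether there's a login form, and a short text preview. Costs ~10 tokens.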
Step 2: See what you can interact with
```bash
ghost-browser elements              # numbered list of all interactive elements
ghost-browser elements --form-only  # form inputs only (for login/signup/search)
```
Output:
```
[0] link Home → /
[1] link Products → /products
[2] button Sign In
[3] input[email] Email address
[4] input[password] Password
[5] submit Log In
```
Step 3: Interact by visible text (NOT CSS selectors)
```bash
ghost-browser interact click "Sign In"
ghost-browser interact type "Email" --type-text "user@example.com"
ghost-browser interact type "Password" --type-text "secret123"
# Or fill the whole form at once
ghost-browser fill-form '{"email":"user@example.com","password":"secret123"}' --submit
```
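Step 4: Read page content as markdown
```bash
ghost-browser readable                    # full page as clean markdown
ghost-browser readable --max-length 5000  # cap the length to save tokens
```
Never use `content`: it returns raw HTML, which wastes thousands of tokens. Use `readable` instead.
Step 5: Wait for dynamic pages
```bash
ghost-browser wait-ready               # wait for network idle + stable DOM
ghost-browser wait-ready --timeout 10  # custom timeout (seconds)
```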
Complete Login Example
```bash
ghost-browser start
ghost-browser navigate https://mysite.com/login
ghost-browser wait-ready
ghost-browser elements --form-only
ghost-browser fill-form '{"email":"me@example.com","password":"mypass"}' --submit
ghost-browser wait-ready
ghost-browser page-summary         # verify the login succeeded
ghost-browser session save mysite  # save auth state for later
```
Restore a Previous Session
```bash
ghost-browser start --profile mysite
ghost-browser session load mysite
ghost-browser navigate https://mysite.com/dashboard
ghost-browser page-summary
```
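Command Reference
Preferred Commands (use these first)
| Command | What it does | Token cost |
|---|---|---|
| `page-summary` | Page overview: title, URL, element counts, flags | ~10 |
| `elements` | Numbered list of buttons, links, inputs | ~50-200 |
| `elements --form-only` | Just form inputs | ~10-50 |
| `readable` | Full page as clean markdown | ~500-10000 |
| `readable --max-length N` | Page markdown capped at N chars | controlled |
| `interact click "text"` | Click by visible text | action |
| `interact type "label" --type-text "value"` | Type by label/placeholder text | action |
| `fill-form '{"field":"value"}' --submit` | Fill and submit a form | action |
| `hover "text" --by-text` | Hover by visible text | action |
| `wait-ready` | Wait for page to finish loading | ~5 |
| `session save <name>` | Save cookies + localStorage + sessionStorage | ~10 |
| `session load <name>` | Restore full auth state | ~10 |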
Lifecycle
```bash
ghost-browser start                             # start the browser daemon
ghost-browser start --headless                  # run without a visible window
ghost-browser start --profile work              # use a named profile (persistent data)
ghost-browser start --extension /path/ext       # load an unpacked Chrome extension
ghost-browser start --proxy socks5://host:port  # use a proxy
ghost-browser stop                              # graceful shutdown
ghost-browser status                            # check whether the daemon is running
ghost-browser status --json                     # machine-readable status
ghost-browser health                            # quick health check
```
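Navigation & Tabs
```bash
ghost-browser navigate <url>              # navigate the current tab (or reuse a matching tab)
ghost-browser navigate <url> --force-new  # always open a new tab
ghost-browser tabs                        # list open tabs with their IDs
ghost-browser tabs --json                 # machine-readable tab list
ghost-browser activate-tab <id>           # switch to a tab by ID
ghost-browser close-tab <id>              # close a tab by ID
ghost-browser wait-ready                  # wait for the page to load fully
ghost-browser wait-ready --timeout 10     # custom timeout (seconds)
```
After `navigate`, all commands automatically target the navigated tab.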
Page Understanding
```bash
ghost-browser page-summary                # quick overview (~10 tokens)
ghost-browser elements                    # all interactive elements (numbered)
ghost-browser elements --form-only        # form inputs only
ghost-browser elements --limit 50         # cap the list at 50 elements
ghost-browser readable                    # full page as markdown
ghost-browser readable --max-length 5000  # cap the output length
ghost-browser content                     # raw HTML (avoid; use readable instead)
```
Text-Based Interaction (preferred)
```bash
ghost-browser interact click "Sign In"                              # click by button/link text
ghost-browser interact type "Email" --type-text "user@example.com"  # type by label
ghost-browser interact click "Products" --index 1                   # if several match, click the second
ghost-browser fill-form '{"email":"a@b.com","password":"x"}' --submit
ghost-browser hover "Menu" --by-text                                # hover by visible text
```
CSS Selector Interaction (fallback)
Use these only when text-based interaction fails.
```bash
ghost-browser click "button.submit"          # click by CSS selector
ghost-browser type "input#email" "a@b.com"   # type by CSS selector
ghost-browser find "h1"                      # find elements by selector
ghost-browser hover ".dropdown"              # hover by CSS selector
ghost-browser wait ".loaded" --timeout 10    # wait for an element to appear
ghost-browser scroll --down                  # scroll down
ghost-browser scroll --up                    # scroll up
ghost-browser scroll --to 500                # scroll to a Y position
ghost-browser eval "document.title"          # run arbitrary JavaScript
```
Cookies & Storage
```bash
ghost-browser cookies                                       # list all cookies
ghost-browser cookies --domain example.com                  # filter by domain
ghost-browser set-cookie name value                         # set a cookie
ghost-browser set-cookie name value --domain .example.com   # set a cookie with a domain
ghost-browser clear-cookies                                 # clear all cookies
ghost-browser clear-cookies --domain example.com            # clear cookies for one domain
ghost-browser save-cookies --file cookies.json              # export cookies as JSON
ghost-browser load-cookies cookies.json                     # import cookies from JSON
ghost-browser storage list                                  # list localStorage entries
ghost-browser storage list --session                        # list sessionStorage entries
ghost-browser storage get <key>                             # get a value
ghost-browser storage set <key> <value>                     # set a value
ghost-browser storage delete <key>                          # delete a key
ghost-browser storage clear                                 # clear all localStorage
```
Session Management
```bash
ghost-browser session save <name>  # save cookies + localStorage + sessionStorage
ghost-browser session load <name>  # restore full auth state
```