stealthy-auto-browse

Stealth browser in Docker. Camoufox (custom Firefox) — zero CDP signals. OS-level mouse/keyboard via PyAutoGUI — undetectable. Passes Cloudflare, DataDome, PerimeterX, Akamai.

For installation, configuration, and container setup, see references/setup.md.

When To Use

- Site has bot detection (Cloudflare, CAPTCHAs, DataDome)
Another browser skill is getting 403s or blocked responses
You need a logged-in session that won't get banned

When NOT To Use

- No bot protection — use curl or INLINECODE1
Only need static HTML — use INLINECODE2

Setup

The API should already be running. Set the base URL:

CODEBLOCK0

Verify: curl $STEALTHY_AUTO_BROWSE_URL/health returns ok.

HTTP API

All commands: POST $STEALTHY_AUTO_BROWSE_URL/ with JSON body {"action": "name", ...params}.

If AUTH_TOKEN is set on the server, include it on every request (except /health):

CODEBLOCK1

Or pass it as a query param: ?auth_token=<key> (useful for MCP clients that can't set headers).

In single-instance mode, requests are serialized automatically — only one runs at a time, the rest queue up.

Every response:

CODEBLOCK2

Two Input Modes

System Input — Undetectable

Uses PyAutoGUI for real OS-level events. The browser has no idea it's automated.

- system_click — move mouse with human-like curve, then click (viewport x,y coords)
INLINECODE11 — move mouse without clicking (hover menus, tooltips)
INLINECODE12 — click at position or current location (no smooth movement)
INLINECODE13 — type text character-by-character with randomized delays
INLINECODE14 — press a key or combo (enter, tab, ctrl+a)
INLINECODE18 — mouse wheel scroll (negative = down)

Get viewport coordinates from get_interactive_elements.

Playwright Input — Detectable But Convenient

Uses Playwright's DOM events. Faster, uses CSS selectors/XPath, but detectable.

- click — click by selector
INLINECODE21 — set input value instantly
INLINECODE22 — type into element character-by-character

Which To Use

- Bot detection? System input. Always.
No detection? Playwright input is fine.
Fill forms stealthily? system_click to focus, then system_type.

Typical Workflow

1. goto → load the page
INLINECODE26 → read what's on the page
INLINECODE27 → find buttons/inputs with x,y coordinates
INLINECODE28 / system_type / send_key → interact
INLINECODE31 / wait_for_text → wait for results
INLINECODE33 → verify

Actions Reference

Navigation

CODEBLOCK3

INLINECODE34: "domcontentloaded" (default), "load", "networkidle".
referer: set HTTP Referer header (for sites that check referrer).

Response: INLINECODE39

System Input (Undetectable)

CODEBLOCK4

Playwright Input (Detectable)

CODEBLOCK5

Page Inspection

CODEBLOCK6

INLINECODE40 returns all buttons, links, inputs with x, y, w, h, text, selector, visible. Pass x, y directly to system_click.

INLINECODE51 returns visible page text (truncated to 10,000 chars). Call this first after navigating.

Screenshots

CODEBLOCK7

Resize params: whLargest=512 (recommended), width=800, height=300, width=400&height=400.

Via action (for script mode — returns base64 with output_id):

CODEBLOCK8

Wait Conditions

Use these instead of sleep.

CODEBLOCK9

INLINECODE58: "visible" (default), "hidden", "attached", "detached".

Tabs

CODEBLOCK10

Dialogs

Call handle_dialog BEFORE the action that triggers the dialog. Dialogs are auto-accepted by default.

CODEBLOCK11

Cookies

CODEBLOCK12

Storage

CODEBLOCK13

INLINECODE64: "local" (default) or "session".

Downloads & Uploads

CODEBLOCK14

Network Logging

CODEBLOCK15

Console Logging

Capture console.log, console.error, console.warn, etc. Each entry has type, text, location, timestamp.

CODEBLOCK16

Scrolling

CODEBLOCK17

INLINECODE74 uses JS (fast). scroll_to_bottom_humanized uses OS-level mouse wheel (undetectable).

Display

CODEBLOCK18

Call calibrate after fullscreen changes.

Multi-Step Scripts

Run multiple actions as one atomic request. Steps with output_id collect results.

CODEBLOCK19

Also accepts "yaml": "..." with the same YAML format used in script mode.

INLINECODE79: "stop" (default) or "continue".

Utility

CODEBLOCK20

State Endpoints (GET)

CODEBLOCK21

MCP Server

The browser exposes all actions as MCP tools via Streamable HTTP at /mcp/ on the same port as the HTTP API.

CODEBLOCK22

Connect any MCP-compatible client to that URL. All actions from the HTTP API are available as tools — goto, screenshot, system_click, system_type, eval_js, get_text, get_cookies, run_script (multi-step), browser_action (generic fallback for everything else), and more.

If AUTH_TOKEN is set, connect to http://localhost:8080/mcp/?auth_token=<key>.

Works in both standalone and cluster mode — HAProxy routes MCP traffic with the same sticky sessions.

Cluster Mode

Run multiple browser instances behind HAProxy with a request queue, sticky sessions, and Redis cookie sync. For setup see references/setup.md.

Entry point is http://localhost:8080 — same API. HAProxy queues requests when all instances are busy instead of returning errors.

Sticky sessions: HAProxy sets an INSTANCEID cookie. Send it back on subsequent requests to keep routing to the same browser instance. All browser state (tabs, DOM, JS, local storage) lives on that specific container — only cookies sync via Redis.

Redis cookie sync: Cookies set on any instance propagate to all others instantly via PubSub. Log in once, the whole fleet is authenticated.

Script Mode

Pipe a YAML script via stdin, get JSON results on stdout, container exits. No HTTP server.

CODEBLOCK23

Script Format

CODEBLOCK24

Output

CODEBLOCK25

- output_id on any step collects its result into outputs. Screenshots become base64 data URIs.
${env.VAR_NAME} substitutes environment variables.
on_error: continue keeps going past failures. stop (default) halts.
All HTTP API actions work as script steps.
Logs go to stderr, stdout is clean JSON.
Exit code 0 on success, 1 on failure.

Example: Screenshot + Extract

CODEBLOCK26

Page Loaders (URL-Triggered Automation)

Mount YAML files to /loaders. When goto hits a matching URL, the loader's steps execute instead of normal navigation. Works in both API and script mode.

CODEBLOCK27

In script mode:

CODEBLOCK28

Loader Format

CODEBLOCK29

Match fields are optional but at least one is required. All specified fields must match.

Example Scripts

Web Search (`scripts/websearch.py`)

Multi-engine parallel web search using the browser API. Searches Brave, Google, and Bing, extracts structured results (title, URL, snippet) and AI overviews when available.

CODEBLOCK30

Output is JSON: INLINECODE104

Env vars: STEALTHY_AUTO_BROWSE_URL, WEBSEARCH_ENGINES (default: brave,google,bing), AUTH_TOKEN, USER_AGENT.

In cluster mode, each engine gets its own browser instance for true parallelism. In single mode, requests serialize via the request lock.

Tips

1. Always get_interactive_elements before clicking — don't guess coordinates
System input for stealth — system_click, system_type, INLINECODE113
get_text first, screenshots second — text is faster and smaller
Match TZ to IP location — timezone mismatch is a detection signal
Resize screenshots with ?whLargest=512 — full resolution is huge
Wait conditions over sleep — wait_for_element, wait_for_text, INLINECODE118
handle_dialog BEFORE the trigger — dialogs are auto-accepted otherwise
calibrate after fullscreen — coordinate mapping shifts

stealthy-auto-browse

Docker中的隐身浏览器。Camoufox（定制Firefox）——零CDP信号。通过PyAutoGUI实现操作系统级鼠标/键盘操作——无法检测。可绕过Cloudflare、DataDome、PerimeterX、Akamai。

安装、配置和容器设置请参见references/setup.md。

使用场景

- 网站有机器人检测（Cloudflare、验证码、DataDome）
其他浏览器技能返回403或被拦截
需要保持登录会话不被封禁

不使用场景

- 无机器人防护——使用curl或WebFetch
仅需静态HTML——使用curl

设置

API应已运行。设置基础URL：

bash
export STEALTHYAUTOBROWSE_URL=http://localhost:8080

验证： curl $STEALTHYAUTOBROWSE_URL/health 返回 ok。

HTTP API

所有命令：POST $STEALTHYAUTOBROWSE_URL/，JSON主体 {action: name, ...params}。

如果服务器设置了 AUTH_TOKEN，每个请求（/health 除外）都需要包含：

Authorization: Bearer

或作为查询参数传递：?auth_token=（适用于无法设置标头的MCP客户端）。

在单实例模式下，请求自动序列化——一次只运行一个，其余排队。

每个响应：

json
{
success: true,
timestamp: 1234567890.123,
data: { ... },
error: 仅当success为false时存在
}

两种输入模式

系统输入——无法检测

使用PyAutoGUI实现真实操作系统级事件。浏览器不知道被自动化。

- systemclick —— 以类人曲线移动鼠标，然后点击（视口x,y坐标）
mousemove —— 移动鼠标但不点击（悬停菜单、工具提示）
mouseclick —— 在指定位置或当前位置点击（无平滑移动）
systemtype —— 以随机延迟逐字符输入文本
send_key —— 按下按键或组合键（enter、tab、ctrl+a）
scroll —— 鼠标滚轮滚动（负数=向下）

从 getinteractiveelements 获取视口坐标。

Playwright输入——可检测但方便

使用Playwright的DOM事件。更快，使用CSS选择器/XPath，但可检测。

- click —— 按选择器点击
fill —— 立即设置输入值
type —— 逐字符输入到元素中

使用哪种

- 有机器人检测？ 始终使用系统输入。
无检测？ Playwright输入即可。
隐身填写表单？ 使用 systemclick 聚焦，然后 systemtype。

典型工作流程

1. goto → 加载页面
gettext → 读取页面内容
getinteractiveelements → 查找带x,y坐标的按钮/输入框
systemclick / systemtype / sendkey → 交互
waitforelement / waitfortext → 等待结果
get_text → 验证

操作参考

系统输入（无法检测）

json
{action: system_click, x: 500, y: 300}
{action: system_click, x: 500, y: 300, duration: 0.5}
{action: mouse_move, x: 500, y: 300}
{action: mouse_click, x: 500, y: 300}
{action: mouse_click}
{action: system_type, text: hello world, interval: 0.08}
{action: send_key, key: enter}
{action: send_key, key: ctrl+a}
{action: scroll, amount: -3}
{action: scroll, amount: -3, x: 500, y: 300}

Playwright输入（可检测）

json
{action: click, selector: #submit-btn}
{action: click, selector: xpath=//button[@id=submit]}
{action: fill, selector: input[name=email], value: user@example.com}
{action: type, selector: #search, text: query, delay: 0.05}

页面检查

json
{action: getinteractiveelements}
{action: getinteractiveelements, visible_only: true}
{action: get_text}
{action: get_html}
{action: eval, expression: document.title}
{action: eval, expression: document.querySelectorAll(a).length}

getinteractiveelements 返回所有按钮、链接、输入框，包含 x、y、w、h、text、selector、visible。直接将 x、y 传递给 system_click。

get_text 返回可见页面文本（截断至10,000字符）。导航后首先调用此方法。

截图

bash

浏览器视口

curl -s $STEALTHYAUTOBROWSE_URL/screenshot/browser?whLargest=512 -o screenshot.png

全桌面

curl -s $STEALTHYAUTOBROWSE_URL/screenshot/desktop?whLargest=512 -o desktop.png

调整大小参数：whLargest=512（推荐）、width=800、height=300、width=400&height=400。

通过操作（用于脚本模式——返回带 output_id 的base64）：

json
{action: save_screenshot}
{action: save_screenshot, type: desktop}
{action: savescreenshot, outputid: my_screenshot, whLargest: 512}
{action: save_screenshot, path: /output/page.png}

等待条件

使用这些代替 sleep。

json
{action: waitforelement, selector: #results, state: visible, timeout: 10}
{action: waitfortext, text: Search results, timeout: 10}
{action: waitforurl, url: /dashboard, timeout: 10}
{action: waitfornetwork_idle, timeout: 30}

state：visible（默认）、hidden、attached、detached。

标签页

json
{action: list_tabs}
{action: new_tab, url: https://example.com}
{action: switch_tab, index: 0}
{action: close_tab, index: 1}

对话框

在触发对话框的操作之前调用 handle_dialog。默认情况下对话框会自动接受。

json
{action: handle_dialog, accept: true}
{action: handle_dialog, accept: false}
{action: handle_dialog, accept: true, text: prompt response}
{action: getlastdialog}

Cookie

json
{action: get_cookies}
{action: get_cookies, urls: [https://example.com]}
{action: set_cookie, name: session, value: abc, url: https://example.com}
{action: delete_cookies}

存储

json
{action: get_storage, type: local}
{action: set_storage, type: local, key: theme, value: dark}
{action: clear_storage, type: local}

type：local（默认）或 session

stealthy-auto-browse隐身自动浏览