stealthy-auto-browse
Stealth browser in Docker. Camoufox (custom Firefox) — zero CDP signals. OS-level mouse/keyboard via PyAutoGUI — undetectable. Passes Cloudflare, DataDome, PerimeterX, Akamai.
For installation, configuration, and container setup, see references/setup.md.
When To Use
- Site has bot detection (Cloudflare, CAPTCHAs, DataDome)
- Another browser skill is getting 403s or blocked responses
- You need a logged-in session that won't get banned
When NOT To Use
- No bot protection: use curl or WebFetch
- Only need static HTML: use curl
Setup
The API should already be running. Set the base URL:
    export STEALTHY_AUTO_BROWSE_URL=http://localhost:8080
Verify: curl $STEALTHY_AUTO_BROWSE_URL/health returns ok.
HTTP API
All commands: POST $STEALTHY_AUTO_BROWSE_URL/ with JSON body {"action": "name", ...params}.
If AUTH_TOKEN is set on the server, include it on every request (except /health):
    Authorization: Bearer <key>
Or pass it as a query param: ?auth_token=<key> (useful for MCP clients that can't set headers).
In single-instance mode, requests are serialized automatically — only one runs at a time, the rest queue up.
Every response:
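    {
      "success": true,
      "timestamp": 1234567890.123,
      "data": { ... },
      "error": "only present when success is false"
    }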
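The request and response conventions above can be sketched in Python. This is a minimal sketch, not the project's client: `build_request` and `unwrap` are hypothetical helper names, and sending a `Content-Type: application/json` header is an assumption. The request is only built here, not sent.

```python
import json
import urllib.request


def build_request(base_url, action, token=None, **params):
    """Build (but do not send) a POST request for one API action."""
    body = json.dumps({"action": action, **params}).encode("utf-8")
    # Assumption: the server expects a JSON content type.
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when AUTH_TOKEN is set on the server.
        headers["Authorization"] = f"Bearer {token}"
    url = base_url.rstrip("/") + "/"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def unwrap(response_text):
    """Return the `data` field, or raise if `success` is false."""
    envelope = json.loads(response_text)
    if not envelope.get("success"):
        raise RuntimeError(envelope.get("error", "unknown error"))
    return envelope.get("data")
```

To actually send a request, pass the result of `build_request` to `urllib.request.urlopen` and feed the decoded body to `unwrap`.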
Two Input Modes
System Input — Undetectable
Uses PyAutoGUI for real OS-level events. The browser has no idea it's automated.
- system_click: move mouse with human-like curve, then click (viewport x,y coords)
- mouse_move: move mouse without clicking (hover menus, tooltips)
- mouse_click: click at position or current location (no smooth movement)
- system_type: type text character-by-character with randomized delays
- send_key: press a key or combo (enter, tab, ctrl+a)
- scroll: mouse wheel scroll (negative = down)
Get viewport coordinates from get_interactive_elements.
Playwright Input — Detectable But Convenient
Uses Playwright's DOM events. Faster, uses CSS selectors/XPath, but detectable.
- click: click by selector
- fill: set input value instantly
- type: type into element character-by-character
Which To Use
- Bot detection? System input. Always.
- No detection? Playwright input is fine.
- Fill forms stealthily? system_click to focus, then system_type.
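The choice between the two input modes can be written as a small payload builder. This is only a sketch: `click_payload` is a hypothetical helper, and `element` is assumed to be one entry from get_interactive_elements carrying `x`, `y`, and `selector`.

```python
def click_payload(element, bot_detection=True):
    """Pick the click action for an element based on whether the site detects bots."""
    if bot_detection:
        # OS-level click at viewport coordinates.
        return {"action": "system_click", "x": element["x"], "y": element["y"]}
    # DOM-level click by selector: faster, but detectable.
    return {"action": "click", "selector": element["selector"]}
```

Defaulting `bot_detection` to `True` mirrors the advice above: the stealthy path is the safe default, and the Playwright path is an explicit opt-in.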
Typical Workflow
1. goto → load the page
2. get_text → read what's on the page
3. get_interactive_elements → find buttons/inputs with x,y coordinates
4. system_click / system_type / send_key → interact
5. wait_for_element / wait_for_text → wait for results
6. get_text → verify
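The workflow above might look like this in Python. It is a sketch under stated assumptions: `search_site` is a hypothetical function, `send` is any callable that posts one action and returns its `data`, and get_interactive_elements is assumed to return a list of dicts with `selector`, `x`, `y`, and `visible`. The real response shape may differ.

```python
def search_site(send, url, query, expect):
    """Run the documented workflow: load, read, locate, interact, wait, verify."""
    send({"action": "goto", "url": url})
    send({"action": "get_text"})  # read the page before interacting
    elements = send({"action": "get_interactive_elements"})
    # Assumption: pick the first visible element whose selector mentions "input".
    box = next(e for e in elements if e["visible"] and "input" in e["selector"])
    send({"action": "system_click", "x": box["x"], "y": box["y"]})
    send({"action": "system_type", "text": query})
    send({"action": "send_key", "key": "enter"})
    send({"action": "wait_for_text", "text": expect, "timeout": 10})
    return send({"action": "get_text"})  # verify the result
```

Injecting `send` keeps the sequence independent of the transport, so the same steps can be exercised against a stub before pointing them at a live instance.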
Actions Reference
Navigation
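    {"action": "goto", "url": "https://example.com"}
    {"action": "goto", "url": "https://example.com", "wait_until": "networkidle"}
    {"action": "goto", "url": "https://example.com", "referer": "https://google.com/search?q=stuff"}
    {"action": "refresh"}
    {"action": "refresh", "wait_until": "networkidle"}
wait_until: "domcontentloaded" (default), "load", "networkidle".
referer: set HTTP Referer header (for sites that check referrer).
Response: {"url": "...", "title": "..."}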
System Input (Undetectable)
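    {"action": "system_click", "x": 500, "y": 300}
    {"action": "system_click", "x": 500, "y": 300, "duration": 0.5}
    {"action": "mouse_move", "x": 500, "y": 300}
    {"action": "mouse_click", "x": 500, "y": 300}
    {"action": "mouse_click"}
    {"action": "system_type", "text": "hello world", "interval": 0.08}
    {"action": "send_key", "key": "enter"}
    {"action": "send_key", "key": "ctrl+a"}
    {"action": "scroll", "amount": -3}
    {"action": "scroll", "amount": -3, "x": 500, "y": 300}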
Playwright Input (Detectable)
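    {"action": "click", "selector": "#submit-btn"}
    {"action": "click", "selector": "xpath=//button[@id='submit']"}
    {"action": "fill", "selector": "input[name=email]", "value": "user@example.com"}
    {"action": "type", "selector": "#search", "text": "query", "delay": 0.05}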
Page Inspection
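    {"action": "get_interactive_elements"}
    {"action": "get_interactive_elements", "visible_only": true}
    {"action": "get_text"}
    {"action": "get_html"}
    {"action": "eval", "expression": "document.title"}
    {"action": "eval", "expression": "document.querySelectorAll('a').length"}
get_interactive_elements returns all buttons, links, inputs with x, y, w, h, text, selector, visible. Pass x, y directly to system_click.
get_text returns visible page text (truncated to 10,000 chars). Call this first after navigating.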
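As an illustration of using the element list, the sketch below looks up a visible element by its text. `find_clickable` is a hypothetical helper, and the input is assumed to be a list of dicts with the fields named above.

```python
def find_clickable(elements, needle):
    """Return the first visible element whose text contains `needle`, else None."""
    needle = needle.lower()
    for el in elements:
        # Skip hidden elements: clicking their coordinates would hit something else.
        if el.get("visible") and needle in (el.get("text") or "").lower():
            return el
    return None
```

The returned element's `x` and `y` can then be passed unchanged to system_click, as described above.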
Screenshots
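    # Browser viewport
    curl -s "$STEALTHY_AUTO_BROWSE_URL/screenshot/browser?whLargest=512" -o screenshot.png
    # Full desktop
    curl -s "$STEALTHY_AUTO_BROWSE_URL/screenshot/desktop?whLargest=512" -o desktop.png
Resize params: whLargest=512 (recommended), width=800, height=300, width=400&height=400.
Via action (for script mode; returns base64 with output_id):
    {"action": "save_screenshot"}
    {"action": "save_screenshot", "type": "desktop"}
    {"action": "save_screenshot", "output_id": "my_screenshot", "whLargest": 512}
    {"action": "save_screenshot", "path": "/output/page.png"}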
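For the GET screenshot endpoints, the resize parameters can be assembled with the standard library. This is a sketch: `screenshot_url` is a hypothetical helper, and rejecting parameters other than whLargest, width, and height is this example's choice, not documented server behavior.

```python
from urllib.parse import urlencode

ALLOWED_RESIZE = {"whLargest", "width", "height"}


def screenshot_url(base_url, target="browser", **resize):
    """Build a /screenshot/<target> URL with optional resize parameters."""
    if target not in ("browser", "desktop"):
        raise ValueError(f"unknown screenshot target: {target}")
    unknown = set(resize) - ALLOWED_RESIZE
    if unknown:
        raise ValueError(f"unknown resize params: {sorted(unknown)}")
    url = f"{base_url.rstrip('/')}/screenshot/{target}"
    query = urlencode(resize)  # keeps keyword order
    return f"{url}?{query}" if query else url
```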
Wait Conditions
Use these instead of sleep.
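    {"action": "wait_for_element", "selector": "#results", "state": "visible", "timeout": 10}
    {"action": "wait_for_text", "text": "Search results", "timeout": 10}
    {"action": "wait_for_url", "url": "/dashboard", "timeout": 10}
    {"action": "wait_for_network_idle", "timeout": 30}
state: "visible" (default), "hidden", "attached", "detached".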
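A wait payload can be validated client-side before it is sent. The sketch below assumes the action is named wait_for_element and takes `selector`, `state`, and `timeout`; the function name and the default timeout of 10 are choices made for this example.

```python
VALID_STATES = {"visible", "hidden", "attached", "detached"}


def wait_for_element(selector, state="visible", timeout=10):
    """Build a wait_for_element payload, rejecting states the API does not list."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {sorted(VALID_STATES)}")
    return {
        "action": "wait_for_element",
        "selector": selector,
        "state": state,
        "timeout": timeout,
    }
```

Catching a misspelled state locally avoids a round trip that would only fail after the request has queued.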
Tabs
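    {"action": "list_tabs"}
    {"action": "new_tab", "url": "https://example.com"}
    {"action": "switch_tab", "index": 0}
    {"action": "close_tab", "index": 1}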
Dialogs
Call handle_dialog BEFORE the action that triggers the dialog. Dialogs are auto-accepted by default.
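    {"action": "handle_dialog", "accept": true}
    {"action": "handle_dialog", "accept": false}
    {"action": "handle_dialog", "accept": true, "text": "prompt response"}
    {"action": "get_last_dialog"}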
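The ordering rule is easy to get backwards, so here is a sketch that wraps a triggering action with its dialog handler. `with_dialog` is a hypothetical helper; the action names handle_dialog and get_last_dialog and their parameters are assumed to match the API.

```python
def with_dialog(trigger, accept=True, text=None):
    """Return steps that arm the dialog handler BEFORE running the trigger."""
    handler = {"action": "handle_dialog", "accept": accept}
    if text is not None:
        handler["text"] = text  # response for prompt() dialogs
    # Order matters: handler first, then the action that opens the dialog,
    # then a read-back of what the dialog said.
    return [handler, trigger, {"action": "get_last_dialog"}]
```

For example, `with_dialog({"action": "system_click", "x": 500, "y": 300}, accept=False)` dismisses a confirm dialog opened by that click instead of relying on the auto-accept default.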
Cookies
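    {"action": "get_cookies"}
    {"action": "get_cookies", "urls": ["https://example.com"]}
    {"action": "set_cookie", "name": "session", "value": "abc", "url": "https://example.com"}
    {"action": "delete_cookies"}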
Storage
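    {"action": "get_storage", "type": "local"}
    {"action": "set_storage", "type": "local", "key": "theme", "value": "dark"}
    {"action": "clear_storage", "type": "local"}
type: "local" (default) or "session".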
Downloads & Uploads
CODEBLOCK14
Network Logging
CODEBLOCK15
Console Logging
Capture console.log, console.error, console.warn, etc. Each entry has type, text, location, timestamp.
CODEBLOCK16
Scrolling
CODEBLOCK17
INLINECODE74 uses JS (fast). scroll_to_bottom_humanized uses OS-level mouse wheel (undetectable).
Display
CODEBLOCK18
Call calibrate after fullscreen changes.
Multi-Step Scripts
Run multiple actions as one atomic request. Steps with output_id collect results.
CODEBLOCK19
Also accepts "yaml": "..." with the same YAML format used in script mode.
on_error: "stop" (default) or "continue".
Utility
CODEBLOCK20
State Endpoints (GET)
CODEBLOCK21
MCP Server
The browser exposes all actions as MCP tools via Streamable HTTP at /mcp/ on the same port as the HTTP API.
CODEBLOCK22
Connect any MCP-compatible client to that URL. All actions from the HTTP API are available as tools — goto, screenshot, system_click, system_type, eval_js, get_text, get_cookies, run_script (multi-step), browser_action (generic fallback for everything else), and more.
If AUTH_TOKEN is set, connect to http://localhost:8080/mcp/?auth_token=<key>.
Works in both standalone and cluster mode — HAProxy routes MCP traffic with the same sticky sessions.
Cluster Mode
Run multiple browser instances behind HAProxy with a request queue, sticky sessions, and Redis cookie sync. For setup see references/setup.md.
Entry point is http://localhost:8080 — same API. HAProxy queues requests when all instances are busy instead of returning errors.
Sticky sessions: HAProxy sets an INSTANCEID cookie. Send it back on subsequent requests to keep routing to the same browser instance. All browser state (tabs, DOM, JS, local storage) lives on that specific container — only cookies sync via Redis.
Redis cookie sync: Cookies set on any instance propagate to all others instantly via PubSub. Log in once, the whole fleet is authenticated.
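On the client side, sticky sessions only require that the INSTANCEID cookie is stored and replayed. A minimal sketch with the standard library follows; `make_sticky_opener` is a hypothetical helper, and nothing is sent until the opener is used.

```python
import http.cookiejar
import urllib.request


def make_sticky_opener():
    """Return an opener that stores and resends cookies such as INSTANCEID."""
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor saves Set-Cookie values and replays them on later requests.
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar
```

Reusing one opener for every request in a task keeps routing on the same browser instance. A fresh opener starts without the cookie, so HAProxy is free to choose an instance for it.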
Script Mode
Pipe a YAML script via stdin, get JSON results on stdout, container exits. No HTTP server.
CODEBLOCK23
Script Format
CODEBLOCK24
Output
CODEBLOCK25
- output_id on any step collects its result into outputs. Screenshots become base64 data URIs.
- ${env.VAR_NAME} substitutes environment variables.
- on_error: continue keeps going past failures. stop (default) halts.
- All HTTP API actions work as script steps.
- Logs go to stderr, stdout is clean JSON.
- Exit code 0 on success, 1 on failure.
Example: Screenshot + Extract
CODEBLOCK26
Page Loaders (URL-Triggered Automation)
Mount YAML files to /loaders. When goto hits a matching URL, the loader's steps execute instead of normal navigation. Works in both API and script mode.
CODEBLOCK27
In script mode:
CODEBLOCK28
Loader Format
CODEBLOCK29
Match fields are optional but at least one is required. All specified fields must match.
Example Scripts
Web Search (scripts/websearch.py)
Multi-engine parallel web search using the browser API. Searches Brave, Google, and Bing, extracts structured results (title, URL, snippet) and AI overviews when available.
CODEBLOCK30
Output is JSON: INLINECODE104
Env vars: STEALTHY_AUTO_BROWSE_URL, WEBSEARCH_ENGINES (default: brave,google,bing), AUTH_TOKEN, USER_AGENT.
In cluster mode, each engine gets its own browser instance for true parallelism. In single mode, requests serialize via the request lock.
Tips
1. Always get_interactive_elements before clicking: don't guess coordinates
2. System input for stealth: system_click, system_type, INLINECODE113
3. get_text first, screenshots second: text is faster and smaller
4. Match TZ to IP location: timezone mismatch is a detection signal
5. Resize screenshots with ?whLargest=512: full resolution is huge
6. Wait conditions over sleep: wait_for_element, wait_for_text, INLINECODE118
7. handle_dialog BEFORE the trigger: dialogs are auto-accepted otherwise
8. calibrate after fullscreen: coordinate mapping shifts
stealthy-auto-browse
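Stealth browser in Docker. Camoufox (custom Firefox), zero CDP signals. OS-level mouse/keyboard via PyAutoGUI, undetectable. Passes Cloudflare, DataDome, PerimeterX, Akamai.
For installation, configuration, and container setup, see references/setup.md.
When To Use
- Site has bot detection (Cloudflare, CAPTCHAs, DataDome)
- Another browser skill is getting 403s or blocked responses
- You need a logged-in session that won't get banned
When NOT To Use
- No bot protection: use curl or WebFetch
- Only need static HTML: use curl
Setup
The API should already be running. Set the base URL:
    export STEALTHY_AUTO_BROWSE_URL=http://localhost:8080
Verify: curl $STEALTHY_AUTO_BROWSE_URL/health returns ok.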
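HTTP API
All commands: POST $STEALTHY_AUTO_BROWSE_URL/ with JSON body {"action": "name", ...params}.
If AUTH_TOKEN is set on the server, include it on every request (except /health):
    Authorization: Bearer <key>
Or pass it as a query param: ?auth_token=<key> (useful for MCP clients that can't set headers).
In single-instance mode, requests are serialized automatically: only one runs at a time, the rest queue up.
Every response:
    {
      "success": true,
      "timestamp": 1234567890.123,
      "data": { ... },
      "error": "only present when success is false"
    }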
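Two Input Modes
System Input: Undetectable
Uses PyAutoGUI for real OS-level events. The browser has no idea it's automated.
- system_click: move mouse with human-like curve, then click (viewport x,y coords)
- mouse_move: move mouse without clicking (hover menus, tooltips)
- mouse_click: click at position or current location (no smooth movement)
- system_type: type text character-by-character with randomized delays
- send_key: press a key or combo (enter, tab, ctrl+a)
- scroll: mouse wheel scroll (negative = down)
Get viewport coordinates from get_interactive_elements.
Playwright Input: Detectable But Convenient
Uses Playwright's DOM events. Faster, uses CSS selectors/XPath, but detectable.
- click: click by selector
- fill: set input value instantly
- type: type into element character-by-character
Which To Use
- Bot detection? System input. Always.
- No detection? Playwright input is fine.
- Fill forms stealthily? system_click to focus, then system_type.
Typical Workflow
1. goto → load the page
2. get_text → read what's on the page
3. get_interactive_elements → find buttons/inputs with x,y coordinates
4. system_click / system_type / send_key → interact
5. wait_for_element / wait_for_text → wait for results
6. get_text → verify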
Actions Reference
Navigation
    {"action": "goto", "url": "https://example.com"}
    {"action": "goto", "url": "https://example.com", "wait_until": "networkidle"}
    {"action": "goto", "url": "https://example.com", "referer": "https://google.com/search?q=stuff"}
    {"action": "refresh"}
    {"action": "refresh", "wait_until": "networkidle"}
wait_until: "domcontentloaded" (default), "load", "networkidle".
referer: set HTTP Referer header (for sites that check referrer).
Response: {"url": "...", "title": "..."}
System Input (Undetectable)
    {"action": "system_click", "x": 500, "y": 300}
    {"action": "system_click", "x": 500, "y": 300, "duration": 0.5}
    {"action": "mouse_move", "x": 500, "y": 300}
    {"action": "mouse_click", "x": 500, "y": 300}
    {"action": "mouse_click"}
    {"action": "system_type", "text": "hello world", "interval": 0.08}
    {"action": "send_key", "key": "enter"}
    {"action": "send_key", "key": "ctrl+a"}
    {"action": "scroll", "amount": -3}
    {"action": "scroll", "amount": -3, "x": 500, "y": 300}
Playwright Input (Detectable)
    {"action": "click", "selector": "#submit-btn"}
    {"action": "click", "selector": "xpath=//button[@id='submit']"}
    {"action": "fill", "selector": "input[name=email]", "value": "user@example.com"}
    {"action": "type", "selector": "#search", "text": "query", "delay": 0.05}
Page Inspection
    {"action": "get_interactive_elements"}
    {"action": "get_interactive_elements", "visible_only": true}
    {"action": "get_text"}
    {"action": "get_html"}
    {"action": "eval", "expression": "document.title"}
    {"action": "eval", "expression": "document.querySelectorAll('a').length"}
get_interactive_elements returns all buttons, links, inputs with x, y, w, h, text, selector, visible. Pass x, y directly to system_click.
get_text returns visible page text (truncated to 10,000 chars). Call this first after navigating.
Screenshots
    # Browser viewport
    curl -s "$STEALTHY_AUTO_BROWSE_URL/screenshot/browser?whLargest=512" -o screenshot.png
    # Full desktop
    curl -s "$STEALTHY_AUTO_BROWSE_URL/screenshot/desktop?whLargest=512" -o desktop.png
Resize params: whLargest=512 (recommended), width=800, height=300, width=400&height=400.
Via action (for script mode; returns base64 with output_id):
    {"action": "save_screenshot"}
    {"action": "save_screenshot", "type": "desktop"}
    {"action": "save_screenshot", "output_id": "my_screenshot", "whLargest": 512}
    {"action": "save_screenshot", "path": "/output/page.png"}
Wait Conditions
Use these instead of sleep.
    {"action": "wait_for_element", "selector": "#results", "state": "visible", "timeout": 10}
    {"action": "wait_for_text", "text": "Search results", "timeout": 10}
    {"action": "wait_for_url", "url": "/dashboard", "timeout": 10}
    {"action": "wait_for_network_idle", "timeout": 30}
state: "visible" (default), "hidden", "attached", "detached".
Tabs
    {"action": "list_tabs"}
    {"action": "new_tab", "url": "https://example.com"}
    {"action": "switch_tab", "index": 0}
    {"action": "close_tab", "index": 1}
Dialogs
Call handle_dialog BEFORE the action that triggers the dialog. Dialogs are auto-accepted by default.
    {"action": "handle_dialog", "accept": true}
    {"action": "handle_dialog", "accept": false}
    {"action": "handle_dialog", "accept": true, "text": "prompt response"}
    {"action": "get_last_dialog"}
Cookies
    {"action": "get_cookies"}
    {"action": "get_cookies", "urls": ["https://example.com"]}
    {"action": "set_cookie", "name": "session", "value": "abc", "url": "https://example.com"}
    {"action": "delete_cookies"}
Storage
    {"action": "get_storage", "type": "local"}
    {"action": "set_storage", "type": "local", "key": "theme", "value": "dark"}
    {"action": "clear_storage", "type": "local"}
type: "local" (default) or "session".