Camoufox Stealth Browser 🦊
C++ level anti-bot evasion using Camoufox — a custom Firefox fork with stealth patches compiled into the browser itself, not bolted on via JavaScript.
Why Camoufox > Chrome-based Solutions
| Approach | Patch Level | Notes |
|---|---|---|
| Camoufox (this skill) | C++ compiled patches | Undetectable fingerprints baked into browser |
| undetected-chromedriver | JS runtime patches | Can be detected by timing analysis |
| puppeteer-stealth | JS injection | Patches applied after page load = detectable |
| playwright-stealth | JS injection | Same limitations |
Camoufox patches Firefox at the source code level — WebGL, Canvas, AudioContext fingerprints are genuinely spoofed, not masked by JavaScript overrides that anti-bot systems can detect.
Key Advantages
1. C++ Level Stealth: Fingerprint spoofing is compiled into the browser, not added through JS hacks
2. Container Isolation: Runs in distrobox, keeping your host system clean
3. Dual-Tool Approach: Camoufox for browser sessions, curl_cffi for API-only requests (no browser overhead)
4. Firefox-Based: Less commonly fingerprinted than Chrome, which most bots use
When to Use
- Standard Playwright/Selenium gets blocked
- Site shows Cloudflare challenge or "checking your browser"
- Need to scrape Airbnb, Yelp, or similar protected sites
- puppeteer-stealth or undetected-chromedriver stopped working
- You need actual stealth, not JS band-aids
Tool Selection
| Tool | Level | Best For |
|---|---|---|
| Camoufox | C++ patches | All protected sites - Cloudflare, Datadome, Yelp, Airbnb |
| curl_cffi | TLS spoofing | API endpoints only - no JS needed, very fast |
Quick Start
All scripts run in pybox distrobox for isolation.
⚠️ Use python3.14 explicitly - pybox may have multiple Python versions with different packages installed.
1. Setup (First Time)
CODEBLOCK0
2. Fetch a Protected Page
Browser (Camoufox):
CODEBLOCK1
API only (curl_cffi):
CODEBLOCK2
Architecture
CODEBLOCK3
Tool Details
Camoufox
- What: Custom Firefox build with C++ level stealth patches
- Pros: Best fingerprint evasion, passes Turnstile automatically
- Cons: ~700MB download, Firefox-based
- Best for: All protected sites - Cloudflare, Datadome, Yelp, Airbnb
curl_cffi
- What: Python HTTP client with browser TLS fingerprint spoofing
- Pros: No browser overhead, very fast
- Cons: No JS execution, API endpoints only
- Best for: Known API endpoints, mobile app reverse engineering
Critical: Proxy Requirements
Datacenter IPs (AWS, DigitalOcean) = INSTANT BLOCK on Airbnb/Yelp
You MUST use residential or mobile proxies:
CODEBLOCK4
See references/proxy-setup.md for proxy configuration.
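As a rough illustration of how a proxy URL with embedded credentials might be handed to a Playwright-style browser API, the sketch below splits it into separate `server`, `username`, and `password` fields. The function name `proxy_settings` is hypothetical, and whether the skill's scripts accept this shape is an assumption; references/proxy-setup.md remains the authoritative source.

```python
from urllib.parse import unquote, urlsplit


def proxy_settings(proxy_url: str) -> dict:
    """Split a proxy URL into Playwright-style proxy fields (hypothetical helper)."""
    parts = urlsplit(proxy_url)
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError(f"proxy URL needs scheme, host and port: {proxy_url!r}")

    # Credentials are kept out of the server string and passed separately.
    settings = {"server": f"{parts.scheme}://{parts.hostname}:{parts.port}"}
    if parts.username:
        settings["username"] = unquote(parts.username)
    if parts.password:
        settings["password"] = unquote(parts.password)
    return settings


if __name__ == "__main__":
    print(proxy_settings("http://user:pass@residential-proxy.example.com:8080"))
```

Keeping credentials in their own fields also makes it easier to avoid logging them by accident alongside the server address.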
Behavioral Tips
Sites like Airbnb/Yelp use behavioral analysis. To avoid detection:
1. Warm up: Don't hit the target URL directly. Visit the homepage first, scroll, and click around.
2. Mouse movements: Inject random mouse movements (Camoufox handles this).
3. Timing: Add random delays (2-5s between actions), not fixed intervals.
4. Session stickiness: Use the same proxy IP for 10-30 min sessions; don't rotate on every request.
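The timing and session-stickiness tips above can be sketched in a few lines of standard-library Python. Both `human_delay` and `StickyProxy` are hypothetical names used for illustration; they are not part of Camoufox or of this skill's scripts, and the 20-minute default window is just one value inside the 10-30 minute range mentioned above.

```python
import random
import time


def human_delay(min_s: float = 2.0, max_s: float = 5.0) -> float:
    """Sleep for a random interval (not a fixed one) and return the duration."""
    delay = random.uniform(min_s, max_s)
    time.sleep(delay)
    return delay


class StickyProxy:
    """Hand out the same proxy for a fixed window, then pick a new one."""

    def __init__(self, proxies, window_s: float = 20 * 60, clock=time.monotonic):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._proxies = list(proxies)
        self._window_s = window_s
        self._clock = clock  # injectable so the rotation logic can be tested
        self._current = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Rotate only when the window has elapsed, never on every request.
        if self._current is None or now >= self._expires_at:
            self._current = random.choice(self._proxies)
            self._expires_at = now + self._window_s
        return self._current
```

A caller would typically fetch one proxy from `StickyProxy.get()` when opening a session and call `human_delay()` between page actions.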
Headless Mode Warning
⚠️ Old --headless flag is DETECTED. Options:
1. New Headless: Use `headless="new"` (Chrome 109+, Chromium-based tools only)
2. Xvfb: Run headed browser in virtual display
3. Headed: Just run headed if you can (most reliable)
CODEBLOCK5
Troubleshooting
| Problem | Solution |
|---|---|
| "Access Denied" immediately | Use residential proxy |
| Cloudflare challenge loops | Try Camoufox instead of Nodriver |
| Browser crashes in pybox | Install missing deps: `sudo dnf install gtk3 libXt` |
| TLS fingerprint blocked | Use curl_cffi with `impersonate="chrome120"` |
| Turnstile checkbox appears | Add mouse movement, increase wait time |
| `ModuleNotFoundError: camoufox` | Use `python3.14`, not `python` or `python3` |
| `greenlet` segfault (exit 139) | Python version mismatch - use `python3.14` explicitly |
| `libstdc++.so.6` errors | NixOS lib path issue - use `python3.14` in pybox |
Python Version Issues (NixOS/pybox)
The pybox container may have multiple Python versions with separate site-packages:
CODEBLOCK6
If you get segfaults or import errors, always use python3.14 explicitly.
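One way to confirm which interpreter actually has the package is to ask the interpreter itself. The sketch below uses only the standard library; `describe_module` is a hypothetical helper name, and the script would be run once per candidate interpreter (for example under `python3.14` and again under `python3`) to compare the results.

```python
import importlib.util
import sys


def describe_module(name: str) -> str:
    """Report whether a top-level module is importable from this interpreter."""
    version = ".".join(map(str, sys.version_info[:3]))
    spec = importlib.util.find_spec(name)  # returns None if the module is absent
    if spec is None:
        return f"{name}: not importable from {sys.executable} (Python {version})"
    return f"{name}: found at {spec.origin} (Python {version})"


if __name__ == "__main__":
    print(describe_module("camoufox"))
```

Because `find_spec` does not execute the module, this check does not trigger the segfaults that a mismatched native extension can cause on import.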
Examples
Scrape Airbnb Listing
CODEBLOCK7
Scrape Yelp Business
CODEBLOCK8
API Scraping with TLS Spoofing
CODEBLOCK9
Session Management
Persistent sessions allow reusing authenticated state across runs without re-logging in.
Quick Start
CODEBLOCK10
Flags
| Flag | Description |
|---|---|
| INLINECODE18 | Named profile for session storage (required) |
| INLINECODE19 | Interactive login mode - opens headed browser |
| `--headless` | Use saved session in headless mode |
| `--status` | Check if session appears valid |
| `--export-cookies FILE` | Export cookies to JSON for backup |
| `--import-cookies FILE` | Import cookies from JSON file |
Storage
- Location: INLINECODE24
- Permissions: Directory `700`, files INLINECODE26
- Profile names: Letters, numbers, `_`, `-` only (1-63 chars)
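The naming and permission rules above can be expressed as a small check. This is a sketch rather than the script's actual code: `is_valid_profile_name` and `ensure_profile_dir` are hypothetical names, and `base_dir` is a caller-supplied placeholder because the real storage location is not shown here.

```python
import os
import re
from pathlib import Path

# Letters, numbers, "_" and "-" only, 1 to 63 characters.
_PROFILE_NAME = re.compile(r"[A-Za-z0-9_-]{1,63}")


def is_valid_profile_name(name: str) -> bool:
    """Return True if the name follows the profile naming rule."""
    return _PROFILE_NAME.fullmatch(name) is not None


def ensure_profile_dir(base_dir, name: str) -> Path:
    """Create the profile directory with owner-only (700) permissions."""
    if not is_valid_profile_name(name):
        raise ValueError(f"invalid profile name: {name!r}")
    path = Path(base_dir) / name
    path.mkdir(parents=True, exist_ok=True)
    os.chmod(path, 0o700)  # set explicitly so the umask cannot loosen it
    return path
```

Rejecting anything outside the allowed character set also rules out path separators, so a profile name cannot escape the base directory.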
Cookie Handling
- Save: All cookies from all domains stored in browser profile
- Restore: Only cookies matching target URL domain are used
- SSO: If redirected to Google/auth domain, re-authenticate once and profile updates
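To make the restore rule concrete, the sketch below filters a cookie list down to entries whose domain matches the target URL. The exact matching rule the script uses is not stated, so this assumes the usual convention (exact host or parent-domain suffix); `cookie_matches_url` and `cookies_for_url` are hypothetical names.

```python
from urllib.parse import urlsplit


def cookie_matches_url(cookie_domain: str, url: str) -> bool:
    """True if the cookie domain equals the URL host or is a parent domain of it."""
    host = (urlsplit(url).hostname or "").lower()
    domain = cookie_domain.lstrip(".").lower()
    if not host or not domain:
        return False
    # The "." prefix stops "notairbnb.com" from matching "airbnb.com".
    return host == domain or host.endswith("." + domain)


def cookies_for_url(cookies: list, url: str) -> list:
    """Keep only the cookie dicts whose "domain" field matches the target URL."""
    return [c for c in cookies if cookie_matches_url(c.get("domain", ""), url)]
```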
Login Wall Detection
The script detects session expiry using multiple signals:
1. HTTP status: 401, 403
2. URL patterns: `/login`, `/signin`, INLINECODE31
3. Title patterns: "login", "sign in", etc.
4. Content keywords: "captcha", "verify", "authenticate"
5. Form detection: Password input fields
If detected during --headless mode, you'll see:
CODEBLOCK11
Re-run with --login to refresh the session.
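The detection signals listed above could be collected roughly as follows. This is a minimal sketch, not the script's implementation: `login_wall_signals` is a hypothetical name, only the patterns spelled out in this section are included, and how the real script weighs or combines the signals is not known.

```python
import re
from urllib.parse import urlsplit

LOGIN_STATUS = {401, 403}
LOGIN_PATHS = ("/login", "/signin")
TITLE_HINTS = ("login", "sign in")
CONTENT_HINTS = ("captcha", "verify", "authenticate")
PASSWORD_INPUT = re.compile(r"<input[^>]*type\s*=\s*[\"']?password", re.IGNORECASE)


def login_wall_signals(status: int, url: str, title: str, html: str) -> list[str]:
    """Return the names of the signals that suggest the session has expired."""
    signals = []
    if status in LOGIN_STATUS:
        signals.append("http_status")
    path = urlsplit(url).path.lower()
    if any(p in path for p in LOGIN_PATHS):
        signals.append("url_pattern")
    if any(h in title.lower() for h in TITLE_HINTS):
        signals.append("title")
    if any(k in html.lower() for k in CONTENT_HINTS):
        signals.append("content_keyword")
    if PASSWORD_INPUT.search(html):
        signals.append("password_form")
    return signals
```

Returning the list of matched signals, rather than a single boolean, lets the caller decide how many signals count as a login wall; a keyword such as "verify" alone is likely to produce false positives.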
Remote Login (SSH)
Since --login requires a visible browser, you need display forwarding:
X11 Forwarding (Preferred):
CODEBLOCK12
VNC Alternative:
CODEBLOCK13
Security Notes
⚠️ Cookies are credentials. Treat profile directories like passwords:
- Profile dirs have `chmod 700` (owner only)
- Cookie exports have INLINECODE36
- Don't share profiles or exported cookies over insecure channels
- Consider encrypting backups
Limitations
| Limitation | Reason / Workaround |
|---|---|
| localStorage/sessionStorage not exported | Use the browser profile instead (handled automatically) |
| IndexedDB not portable | Stored in browser profile, not cookie export |
| No parallel profile access | No file locking in v1; use one process per profile |
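Since v1 has no file locking, a wrapper script could enforce the one-process-per-profile rule itself. The sketch below is one possible approach using an advisory `flock` on POSIX systems; it is not part of the skill, and the `profile_lock` name and the `.lock` filename are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def profile_lock(profile_dir):
    """Hold an exclusive advisory lock on a profile directory (POSIX only)."""
    lock_path = Path(profile_dir) / ".lock"
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # Non-blocking: fail immediately if another holder exists.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise RuntimeError(f"profile already in use: {profile_dir}") from None
        yield lock_path
    finally:
        os.close(fd)  # closing the descriptor also releases the lock
```

A second process entering `profile_lock` for the same directory would get a `RuntimeError` instead of silently sharing the profile.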
References
Camoufox Stealth Browser 🦊
C++ level anti-bot evasion using Camoufox, a custom Firefox fork with stealth patches compiled into the browser itself rather than bolted on via JavaScript.
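Why Camoufox > Chrome-based Solutions
| Approach | Patch Level | Notes |
|---|---|---|
| Camoufox (this skill) | C++ compiled patches | Undetectable fingerprints baked into browser |
| undetected-chromedriver | JS runtime patches | Can be detected by timing analysis |
| puppeteer-stealth | JS injection | Patches applied after page load = detectable |
| playwright-stealth | JS injection | Same limitations |
Camoufox patches Firefox at the source code level: WebGL, Canvas, and AudioContext fingerprints are genuinely spoofed, not masked by JavaScript overrides that anti-bot systems can detect.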
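Key Advantages
1. C++ Level Stealth: Fingerprint spoofing is compiled into the browser, not added through JS hacks
2. Container Isolation: Runs in distrobox, keeping your host system clean
3. Dual-Tool Approach: Camoufox for browser sessions, curl_cffi for API-only requests (no browser overhead)
4. Firefox-Based: Less commonly fingerprinted than Chrome, which most bots use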
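When to Use
- Standard Playwright/Selenium gets blocked
- Site shows Cloudflare challenge or "checking your browser"
- Need to scrape Airbnb, Yelp, or similar protected sites
- puppeteer-stealth or undetected-chromedriver stopped working
- You need actual stealth, not JS band-aids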
Tool Selection
| Tool | Level | Best For |
|---|---|---|
| Camoufox | C++ patches | All protected sites - Cloudflare, Datadome, Yelp, Airbnb |
| curl_cffi | TLS spoofing | API endpoints only - no JS needed, very fast |
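Quick Start
All scripts run in pybox distrobox for isolation.
⚠️ Use python3.14 explicitly - pybox may have multiple Python versions with different packages installed.
1. Setup (First Time)
```bash
# Install tools in pybox (using python3.14)
distrobox-enter pybox -- python3.14 -m pip install camoufox curl_cffi
# The Camoufox browser downloads automatically on first run (~700MB Firefox fork)
```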
2. Fetch a Protected Page
Browser (Camoufox):
```bash
distrobox-enter pybox -- python3.14 scripts/camoufox-fetch.py https://example.com --headless
```
API only (curl_cffi):
```bash
distrobox-enter pybox -- python3.14 scripts/curl-api.py https://api.example.com/endpoint
```
Architecture
```
┌─────────────────────────────────────────────────────────┐
│                     OpenClaw Agent                      │
├─────────────────────────────────────────────────────────┤
│  distrobox-enter pybox -- python3.14 scripts/xxx.py     │
├─────────────────────────────────────────────────────────┤
│                     pybox Container                     │
│         ┌─────────────┐  ┌─────────────┐                │
│         │  Camoufox   │  │  curl_cffi  │                │
│         │  (Firefox)  │  │ (TLS spoof) │                │
│         └─────────────┘  └─────────────┘                │
└─────────────────────────────────────────────────────────┘
```
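Tool Details
Camoufox
- What: Custom Firefox build with C++ level stealth patches
- Pros: Best fingerprint evasion, passes Turnstile automatically
- Cons: ~700MB download, Firefox-based
- Best for: All protected sites - Cloudflare, Datadome, Yelp, Airbnb
curl_cffi
- What: Python HTTP client with browser TLS fingerprint spoofing
- Pros: No browser overhead, very fast
- Cons: No JS execution, API endpoints only
- Best for: Known API endpoints, mobile app reverse engineering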
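Critical: Proxy Requirements
Datacenter IPs (AWS, DigitalOcean) = INSTANT BLOCK on Airbnb/Yelp
You MUST use residential or mobile proxies:
```python
# Example proxy configuration
proxy = "http://user:pass@residential-proxy.example.com:8080"
```
See references/proxy-setup.md for proxy configuration.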
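Behavioral Tips
Sites like Airbnb/Yelp use behavioral analysis. To avoid detection:
1. Warm up: Don't hit the target URL directly. Visit the homepage first, scroll, and click around.
2. Mouse movements: Inject random mouse movements (Camoufox handles this).
3. Timing: Add random delays (2-5s between actions), not fixed intervals.
4. Session stickiness: Use the same proxy IP for 10-30 min sessions; don't rotate on every request.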
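Headless Mode Warning
⚠️ Old --headless flag is DETECTED. Options:
1. New Headless: Use `headless="new"` (Chrome 109+, Chromium-based tools only)
2. Xvfb: Run headed browser in virtual display
3. Headed: Just run headed if you can (most reliable)
```bash
# Xvfb approach (Linux)
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99
python3.14 scripts/camoufox-fetch.py https://example.com
```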
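Troubleshooting
| Problem | Solution |
|---|---|
| "Access Denied" immediately | Use residential proxy |
| Cloudflare challenge loops | Try Camoufox instead of Nodriver |
| Browser crashes in pybox | Install missing deps: `sudo dnf install gtk3 libXt` |
| TLS fingerprint blocked | Use curl_cffi with `impersonate="chrome120"` |
| Turnstile checkbox appears | Add mouse movement, increase wait time |
| `ModuleNotFoundError: camoufox` | Use `python3.14`, not `python` or `python3` |
| `greenlet` segfault (exit 139) | Python version mismatch - use `python3.14` explicitly |
| `libstdc++.so.6` errors | NixOS lib path issue - use `python3.14` in pybox |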
Python Version Issues (NixOS/pybox)
The pybox container may have multiple Python versions with separate site-packages:
```bash
# Check which Python has camoufox
distrobox-enter pybox -- python3.14 -c "import camoufox; print('OK')"
# Wrong (may use a different Python)
distrobox-enter pybox -- python3 scripts/camoufox-session.py ...
# Right (explicit version)
distrobox-enter pybox -- python3.14 scripts/camoufox-session.py ...
```
If you get segfaults or import errors, always use python3.14 explicitly.
Examples
Scrape Airbnb Listing
```bash
distrobox-enter pybox -- python3.14 scripts/camoufox-fetch.py \
  https://www.airbnb.com/rooms/12345 \
  --headless --wait 10 \
  --screenshot airbnb.png
```
Scrape Yelp Business
```bash
distrobox-enter pybox -- python3.14 scripts/camoufox-fetch.py \
  https://www.yelp.com/biz/some-restaurant \
  --headless --wait 8 \
  --output yelp.html
```
API Scraping with TLS Spoofing
```bash
# URL and header value are quoted so the shell does not interpret & or braces
distrobox-enter pybox -- python3.14 scripts/curl-api.py \
  "https://api.yelp.com/v3/businesses/search?term=coffee&location=SF" \
  --headers '{"Authorization": "Bearer xxx"}'
```
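Session Management
Persistent sessions allow reusing authenticated state across runs without re-logging in.
Quick Start
```bash
# 1. Interactive login (headed browser opens)
distrobox-enter pybox -- python3.14 scripts/camoufox-session.py \
  --profile airbnb --login https://www.airbnb.com/account-settings
# Complete login in the browser, then press Enter
```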