headful-browser-vnc
Overview
headful-browser-vnc provides a controlled, auditable, headful Chromium browsing environment on a server for cases where full browser rendering and occasional human interaction are required. It combines Xvfb, a window manager, and x11vnc (optionally noVNC) to present the server-side browser UI to an operator, and it offers programmatic integration points (Chrome remote debugging / CDP) for automated capture of cookies, rendered HTML, and screenshots.
Primary objective
- Enable reliable, repeatable automation workflows that can escalate to a human operator for tasks automation alone cannot solve (CAPTCHA solving, challenge pages, multi-factor authentication, manual login flows). The skill is intended to: (1) run a headful Chrome instance in an isolated profile; (2) expose the server browser to an operator through a secured UI (VNC/noVNC) so a human can intervene; (3) preserve the resulting browser session artifacts (cookies, outerHTML, screenshots) and export them back into automated pipelines for continued processing.
Key capabilities
- Headful browser execution: launch Google Chrome / Chromium with an isolated user data directory and configurable flags (proxy, remote-debugging-port, extra args) on a server X display provided by Xvfb.
- Operator UI: present the running browser to an operator via x11vnc (or an optional noVNC web proxy). Operators connect over SSH-forwarded ports or token-gated noVNC to perform manual actions (solve CAPTCHAs, authenticate), then signal the system to capture artifacts.
- Programmatic capture: export rendered outerHTML, full-page screenshots, and cookies using Chrome CDP (Playwright/Puppeteer compatible). Exports are intended for downstream automated analysis or storage.
- Safe restart and recovery: helpers restart Chrome when flags change, and require explicit user confirmation for any action that may terminate existing browser instances.
- Artifact hygiene: captured artifacts are written to an artifacts directory with restrictive file permissions. The skill logs actions and writes diagnostic artifacts to facilitate debugging and comparison-based analysis.
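The operator connection over an SSH-forwarded port can be sketched as follows. This is a minimal illustration, not part of the skill: the helper name ssh_tunnel_cmd and the host operator@server.example are hypothetical, and 5901 is simply the example port used elsewhere in this document. The function only prints the command so it can be reviewed before use.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the skill): print the ssh command that
# forwards a local port to a loopback-bound VNC port on the server.
# Nothing is executed here; the operator runs the printed command.
ssh_tunnel_cmd() {
  local target="$1"               # e.g. operator@server.example (assumed)
  local remote_port="${2:-5901}"  # VNC port on the server
  local local_port="${3:-$remote_port}"
  # -N: do not run a remote command; -L: local port forwarding.
  printf 'ssh -N -L %s:127.0.0.1:%s %s\n' "$local_port" "$remote_port" "$target"
}
```

With the tunnel open, a VNC viewer on the operator's machine would connect to 127.0.0.1 on the chosen local port.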
Security, privacy, and operational notes
- VNC security: the skill creates per-session passfiles (rfbauth) when possible; noVNC should be bound to loopback or token-gated. Do not expose VNC/noVNC endpoints to the public internet without additional access controls. Store passfiles with mode 600.
- Sensitive artifacts: cookies and rendered page artifacts are sensitive. They are stored under the skill's out/ directory with restrictive permissions; users are responsible for secure storage and timely cleanup.
- Privileged operations: installing system packages and enabling systemd units require sudo and explicit user consent. The scripts will not perform privileged actions automatically unless the operator explicitly enables the auto-install path (see below).
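The mode-600 requirement for passfiles can be checked before a session starts. The sketch below is an assumption about how such a check might look; the function name passfile_mode_ok is hypothetical and the skill's scripts may verify permissions differently.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the passfile exists and its permission
# bits are exactly 600. Tries GNU `stat -c` first, then BSD `stat -f`.
passfile_mode_ok() {
  local file="$1" mode
  [ -f "$file" ] || return 1
  mode="$(stat -c '%a' "$file" 2>/dev/null || stat -f '%Lp' "$file")"
  [ "$mode" = "600" ]
}
```

A caller could refuse to start x11vnc when this returns non-zero, or run chmod 600 on the file and re-check.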
Supported VNC implementations
The skill supports multiple VNC backends; behavior is controlled via the skill-local .env file (VNC_IMPLEMENTATION): auto (default), tigervnc, tightvnc, realvnc. When possible the skill prefers non-interactive rfbauth generation via vncpasswd; when unavailable it prompts the operator and documents fallback behavior.
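As an illustration of the accepted values, a script could normalise VNC_IMPLEMENTATION before using it. The helper name below is hypothetical, and treating an empty value as auto is an assumption based on auto being the documented default.

```shell
#!/usr/bin/env bash
# Hypothetical validator: accept only the documented backends and treat an
# unset or empty value as "auto" (the documented default).
normalize_vnc_implementation() {
  local value="${1:-auto}"
  case "$value" in
    auto|tigervnc|tightvnc|realvnc)
      printf '%s\n' "$value"
      ;;
    *)
      printf 'unsupported VNC_IMPLEMENTATION: %s\n' "$value" >&2
      return 1
      ;;
  esac
}
```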
Runtime files & usage
This section documents each file included in the skill release branch, its purpose, usage examples, preconditions, and relations to other files. All script examples assume you run them from the workspace root unless noted.
skills/headful-browser-vnc/scripts/setup.sh
- Purpose: interactive, documented installer and validator for the skill's runtime dependencies (Xvfb, x11vnc, Chrome/Chromium, node, tooling). Primarily guidance-only: prints distro-aware commands; only runs package-manager operations when explicitly allowed.
- Usage:
  ./skills/headful-browser-vnc/scripts/setup.sh [--check-only] [--auto-install] [--yes] [--set-password]
- Preconditions: network access to package repositories; sudo available for host installs (not required inside containers when running as root). When running inside a container, auto-installs require either root or CONTAINER_AUTO_OK=true.
- Relations: updates/creates skills/headful-browser-vnc/.env; generates VNC passfiles via vncpasswd when available; consults templates/ for service unit guidance.
skills/headful-browser-vnc/scripts/start_vnc.sh
- Purpose: start an Xvfb display, optional window manager, and x11vnc to expose the display. Emits a one-line summary with the VNC port and display id.
- Usage:
  skills/headful-browser-vnc/scripts/start_vnc.sh <session_id> [--display=:99] [--resolution=1366x768] [--port=5901]
- Preconditions: Xvfb and x11vnc installed (or available in the Docker image); VNC_PASSFILE has permission 600 if provided.
- Relations: start_vnc.sh creates the DISPLAY and user-data directory used by start_chrome_debug.sh; stop_vnc.sh undoes the session.
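For orientation, the kind of Xvfb and x11vnc invocations such a script typically wraps can be sketched as below. This is not the content of start_vnc.sh: the helper names are hypothetical, the 24-bit colour depth is an assumption, and the functions only print commands for review. The flags themselves (-screen, -nolisten, -display, -rfbport, -localhost, -forever, -shared, -rfbauth) are standard Xvfb and x11vnc options.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that print (never run) the display and VNC commands.
# Output is for review only; paths containing spaces would need quoting.

xvfb_cmd() {
  local display="${1:-:99}" resolution="${2:-1366x768}"
  # Xvfb expects WIDTHxHEIGHTxDEPTH; depth 24 is an assumed default.
  printf 'Xvfb %s -screen 0 %sx24 -nolisten tcp\n' "$display" "$resolution"
}

x11vnc_cmd() {
  local display="${1:-:99}" port="${2:-5901}" passfile="${3:-}"
  # -localhost keeps the VNC port on loopback so it is reached via SSH forwarding.
  local cmd="x11vnc -display $display -rfbport $port -localhost -forever -shared"
  if [ -n "$passfile" ]; then
    cmd="$cmd -rfbauth $passfile"
  fi
  printf '%s\n' "$cmd"
}
```

Binding x11vnc to loopback matches the security notes above: the port is only reachable through a tunnel or a token-gated proxy.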
skills/headful-browser-vnc/scripts/stop_vnc.sh
- Purpose: stop and clean up a running VNC/Xvfb session previously started by start_vnc.sh.
- Usage:
  skills/headful-browser-vnc/scripts/stop_vnc.sh <session_id> [--display=:99]
- Preconditions: session id matches an active session created by start_vnc.sh.
skills/headful-browser-vnc/scripts/start_chrome_debug.sh
- Purpose: launch a headful Chrome/Chromium instance attached to a session DISPLAY with a dedicated user-data dir and remote-debugging port for CDP access.
- Usage:
  skills/headful-browser-vnc/scripts/start_chrome_debug.sh <session_id> [--proxy=http://...] [--remote-debug-port=9222]
- Preconditions: Chrome/Chromium binary available and readable; DISPLAY is set (start_vnc.sh must have run); user-data directory writable by the process owner.
- Relations: other scripts (export_page.sh, export_cookies.sh) connect to the remote-debugging port started by this script.
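The Chrome flags involved can be illustrated with a small command builder. This is a sketch under assumptions, not the script's actual logic: chrome_cmd is a hypothetical name, the binary and profile paths are supplied by the caller, and the --no-first-run and --no-default-browser-check flags are additions commonly used for unattended profiles.

```shell
#!/usr/bin/env bash
# Hypothetical builder: print (never run) a headful Chrome command line with an
# isolated profile, a remote-debugging port, and an optional proxy.
chrome_cmd() {
  local bin="$1" profile_dir="$2" port="${3:-9222}" proxy="${4:-}"
  local cmd="$bin --user-data-dir=$profile_dir --remote-debugging-port=$port"
  cmd="$cmd --no-first-run --no-default-browser-check"
  if [ -n "$proxy" ]; then
    cmd="$cmd --proxy-server=$proxy"
  fi
  printf '%s\n' "$cmd"
}
```

The printed command would be run with DISPLAY exported to the session's X display (for example :99) so that the window appears inside the Xvfb screen the operator sees.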
skills/headful-browser-vnc/scripts/export_page.sh
- Purpose: instruct Chrome to load a URL and export rendered outerHTML and a full-page screenshot for later analysis.
- Usage:
  skills/headful-browser-vnc/scripts/export_page.sh <session_id> <url> [--devtools-port=9222]
- Preconditions: headful Chrome with remote-debugging port active and reachable (start_chrome_debug.sh).
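Whether the remote-debugging port is "active and reachable" can be probed through Chrome's DevTools HTTP endpoint /json/version, which returns a JSON document containing a webSocketDebuggerUrl field. The helpers below are a hypothetical sketch of such a probe; the function names are not part of the skill, and the sed-based extraction is a deliberately simple stand-in for a real JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical probe helpers for the Chrome DevTools HTTP endpoint.

# Read /json/version JSON on stdin and print the webSocketDebuggerUrl value.
extract_ws_url() {
  tr -d '\n' |
    sed -n 's/.*"webSocketDebuggerUrl"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print the browser WebSocket URL for a local DevTools port, or fail if the
# port is unreachable or the field is missing.
devtools_ws_url() {
  local port="${1:-9222}" url
  url="$(curl -fsS --max-time 5 "http://127.0.0.1:$port/json/version" | extract_ws_url)"
  [ -n "$url" ] && printf '%s\n' "$url"
}
```

A CDP client such as Playwright or Puppeteer can attach to the printed WebSocket URL rather than launching a new browser.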
skills/headful-browser-vnc/scripts/export_cookies.sh
- Purpose: export cookies from a running Chrome instance via the Chrome DevTools Protocol.
- Usage:
  skills/headful-browser-vnc/scripts/export_cookies.sh <session_id> [--devtools-port=9222]
General Notes on Export Helpers
- Note: these export helpers (export_page.sh, export_cookies.sh) are designed to be idempotent and safe to run after manual operator interventions. They place artifacts into out/<session_id>/ with restrictive permissions.
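One way to obtain an owner-only out/<session_id>/ directory idempotently is sketched below. The function name, the 700 mode, and the rejection of session ids containing a slash are assumptions for illustration; the export helpers may implement this differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create <base>/<session_id>/ readable only by the owner
# and print its path. Safe to call repeatedly for the same session.
prepare_artifact_dir() {
  local base="$1" session_id="$2" dir
  # Reject empty ids and anything that could escape the base directory.
  case "$session_id" in
    ''|*/*|.|..) return 1 ;;
  esac
  dir="$base/$session_id"
  # The subshell keeps the umask change local to this call.
  ( umask 077 && mkdir -p "$dir" ) || return 1
  # Also tightens a directory that already existed with looser permissions.
  chmod 700 "$dir" || return 1
  printf '%s\n' "$dir"
}
```

Files written afterwards under a umask of 077 inherit owner-only permissions as well.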
Combined Script Usage Flow
- Flow: start_vnc.sh <id> → start_chrome_debug.sh <id> → operator attaches via VNC → operator interacts → export_page.sh / export_cookies.sh → stop_vnc.sh <id>
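The flow above can be sketched as a small driver. This is an illustrative wrapper, not a script shipped with the skill: run_step, run_session, DRY_RUN, and SCRIPTS_DIR are hypothetical names, the operator prompt wording is invented, and only the documented positional arguments are passed. With DRY_RUN=1 (the default here) each step is printed instead of executed.

```shell
#!/usr/bin/env bash
# Hypothetical driver for the documented session flow.
SCRIPTS_DIR="${SCRIPTS_DIR:-skills/headful-browser-vnc/scripts}"

# Print the step in dry-run mode, otherwise execute it.
run_step() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

run_session() {
  local id="$1" url="$2" reply status=0
  run_step "$SCRIPTS_DIR/start_vnc.sh" "$id" || return 1
  if run_step "$SCRIPTS_DIR/start_chrome_debug.sh" "$id"; then
    if [ "${DRY_RUN:-1}" != "1" ]; then
      # Hand-off point: the operator works in the VNC session, then confirms.
      printf 'Attach via VNC, finish the manual step, then press Enter: ' >&2
      read -r reply
    fi
    run_step "$SCRIPTS_DIR/export_page.sh" "$id" "$url" || status=1
    run_step "$SCRIPTS_DIR/export_cookies.sh" "$id" || status=1
  else
    status=1
  fi
  # Always attempt cleanup so a failed launch or export does not leak a session.
  run_step "$SCRIPTS_DIR/stop_vnc.sh" "$id" || status=1
  return "$status"
}
```

Calling run_session demo https://example.com in dry-run mode lists the five script invocations in order, which is a cheap way to review the sequence before running it for real with DRY_RUN=0.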
skills/headful-browser-vnc/docker/
- Purpose: reference docker/ directory containing the entrypoint and docker-compose.yml, plus a Dockerfile embedded in README.docker.md for reproducible builds. ClawHub does not accept Dockerfile uploads, so the Dockerfile content is included in skills/headful-browser-vnc/README.docker.md as a code block.
- Recommendation: for reproducibility and security, build dependencies into the image (Dockerfile) rather than relying on runtime package installs inside a running container. If you must allow in-container auto-installs, see the setup.sh gating (CONTAINER_AUTO_OK and container_auto_allowed()).
skills/headful-browser-vnc/templates/
- Purpose: Jinja2-style templates for systemd unit files (x11vnc/noVNC). Use them as references; deploying them on a host requires sudo and careful service permissions.
skills/headful-browser-vnc/tests/smoke_test.sh
- Purpose: non-privileged smoke test that exercises the start → launch → export → cleanup sequence to validate runtime behavior in CI-friendly environments.
- Usage:
CODEBLOCK6
Configuration (.env)
Place a skill-local .env file at skills/headful-browser-vnc/.env (chmod 600) to persist runtime defaults. Key fields:
- VNC_PASSFILE: path to passfile (e.g. /home/user/.vnc/passwd or ./vnc_passwd)
- INLINECODE79: optional explicit TCP port to bind x11vnc (if omitted the script will report the actual port in use)
- VNC_IMPLEMENTATION: auto|tigervnc|tightvnc|realvnc
- INLINECODE86: X display (default :99)
- INLINECODE88: screen resolution (default 1366x768)
- INLINECODE90: Chrome remote debugging port (default 9222)
- INLINECODE92 / HTTP_PROXY / HTTPS_PROXY: optional proxy settings
Install and dependencies
- setup.sh contains interactive guidance and optional prompts for installing Chrome, node, Playwright, and VNC helper tools. The installer will not run sudo operations without explicit consent.
- Programmatic export paths prefer Node + Playwright; a Python fallback is available but optional.
Auto-install behaviour and safety (enforced)
The installer supports an optional auto-install path but it is gated behind explicit confirmations and container-aware checks:
- CLI flag: --auto-install (sets AUTO_INSTALL=true for the run).
- Runtime confirmation: when the installer proposes a distro-specific command it will first ask for a y/N confirmation, then print the exact command and require the operator to type the full word yes (not y). Only if both confirmations are provided will the installer execute the command.
- Container extra gate: inside a detected container (/.dockerenv exists or /proc/1/cgroup contains docker/containerd/kubepods), automatic package-manager operations will only run if the process is root (id -u == 0) or if CONTAINER_AUTO_OK=true.
Default behaviour remains conservative: without --auto-install the installer only prints distro-appropriate commands and will not run package-manager commands automatically.
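The gating rules above can be sketched as follows. This is an illustration of the described behaviour, not the code in setup.sh: container_auto_allowed is named in this document, but its body here is an assumption, and in_container, confirm_twice, the overridable path variables, and the prompt wording are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the auto-install gates. The two path variables are
# overridable only so the logic can be exercised without a real container.
DOCKERENV_PATH="${DOCKERENV_PATH:-/.dockerenv}"
CGROUP_PATH="${CGROUP_PATH:-/proc/1/cgroup}"

# Container detection as described: marker file or cgroup keywords.
in_container() {
  [ -e "$DOCKERENV_PATH" ] && return 0
  [ -r "$CGROUP_PATH" ] && grep -Eq 'docker|containerd|kubepods' "$CGROUP_PATH"
}

# Extra gate inside containers: root, or an explicit CONTAINER_AUTO_OK=true.
container_auto_allowed() {
  in_container || return 0
  [ "$(id -u)" -eq 0 ] && return 0
  [ "${CONTAINER_AUTO_OK:-false}" = "true" ]
}

# Two-step confirmation: y/N first, then the literal word "yes".
confirm_twice() {
  local cmd="$1" answer
  printf 'Run a package-manager command? [y/N] ' >&2
  read -r answer || return 1
  case "$answer" in
    y|Y) ;;
    *) return 1 ;;
  esac
  printf 'About to run: %s\nType "yes" to proceed: ' "$cmd" >&2
  read -r answer || return 1
  [ "$answer" = "yes" ]
}
```

An installer built this way would execute a command only when AUTO_INSTALL is true, container_auto_allowed succeeds, and confirm_twice returns zero; any other outcome falls back to printing the command.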
Templates and integration
- templates/x11vnc.service.j2: systemd unit template for persistent sessions (requires sudo to install)
- INLINECODE110: noVNC service template
Integration guidance for maintainers
- Use Chrome CDP (DevTools) for deterministic exports. Prefer attaching to an already-running headful Chrome instance over launching short-lived headless instances when reproducing a previously observed UI state.
- Persist session artifacts and index them (timestamp, URL, session id, VNC port, devtools port) so comparison automation can operate on operator-validated examples.
- When embedding into automated pipelines, clearly separate automated actions from operator interventions; require explicit human confirmation for destructive actions (Chrome restarts, service reconfiguration).
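The indexing suggestion can be illustrated with an append-only, tab-separated index. The file layout, field order, and function name below are assumptions chosen for the sketch; the skill does not prescribe an index format.

```shell
#!/usr/bin/env bash
# Hypothetical indexer: append one tab-separated record per captured artifact
# with the fields timestamp, URL, session id, VNC port, DevTools port.
index_artifact() {
  local index_file="$1" url="$2" session_id="$3" vnc_port="$4" devtools_port="$5"
  local ts
  ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
  # The subshell keeps the umask local; a new index file is created owner-only.
  ( umask 077 && printf '%s\t%s\t%s\t%s\t%s\n' \
      "$ts" "$url" "$session_id" "$vnc_port" "$devtools_port" >> "$index_file" )
}
```

Comparison tooling could then select records by session id or URL with cut or awk without parsing the artifacts themselves.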
Testing
A non-privileged smoke test is provided at skills/headful-browser-vnc/tests/smoke_test.sh. It performs a basic start → launch → export → cleanup sequence and is useful for CI verification.
Support and contribution
The skill is maintained in this workspace. When contributing changes, follow the repository conventions: create backups before modifying scripts, run bash -n for syntax validation, and preserve audit logs and artifacts.
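The backup and syntax-check conventions can be sketched as two small helpers. The names backup_script and check_syntax and the timestamped .bak suffix are hypothetical; the repository may use a different backup scheme.

```shell
#!/usr/bin/env bash
# Hypothetical contributor helpers for the conventions described above.

# Copy a script to a timestamped backup next to it and print the backup path.
backup_script() {
  local script="$1" backup
  [ -f "$script" ] || return 1
  backup="$script.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$script" "$backup" || return 1
  printf '%s\n' "$backup"
}

# Parse the script without executing it; non-zero means a syntax error.
check_syntax() {
  bash -n "$1"
}
```

A typical sequence would be backup_script before editing and check_syntax after editing, restoring from the printed backup path if the check fails.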
License
Include an appropriate LICENSE file when publishing (e.g., MIT). Update author/maintainer fields in SKILL.md prior to external publication.
Example: safe auto-install run
To allow the installer to perform distro package manager actions automatically, run with explicit --auto-install and be prepared to type the full confirmation word. Example:
CODEBLOCK7
The script will: (1) detect your distro and print the exact command it plans to run; (2) ask for a normal y/N confirmation; (3) print the command and require you to type the exact word yes to proceed; (4) only then execute the command.
If you prefer to always review and run commands manually, omit --auto-install and the script will only print distro-appropriate commands for you to run yourself.
Short example run transcript (what prompts look like)
Below is a short, representative transcript demonstrating the installer flow when a few components are missing and the operator chooses the manual path (no auto-install). Prompts shown are exact prompts produced by the current setup.sh.
CODEBLOCK8