Lightweight CDP browser control for AI agents. Token-efficient alternative to the built-in browser tool — 3-10x fewer tokens per interaction. Use when browsing websites, clicking elements, filling forms, uploading files, or extracting page content. Requires a Chrome/Chromium browser running with --remote-debugging-port (OpenClaw browser works out of the box). Signed-in sessions carry over automatically.
Lightweight CLI that talks to Chrome via CDP (Chrome DevTools Protocol). Returns minimal, indexed output that agents can act on immediately — no accessibility tree parsing, no ref hunting.
Setup
CODEBLOCK0
The tool connects to http://127.0.0.1:18800 by default. Override with CDP_URL env var.
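As a rough illustration of the endpoint resolution described above, the following Python sketch reads `CDP_URL` and falls back to the documented default. The helper names `resolve_cdp_url` and `browser_version` are hypothetical and not part of the tool; `/json/version` is the standard HTTP endpoint that Chrome exposes when remote debugging is enabled.

```python
import json
import os
import urllib.request

DEFAULT_CDP_URL = "http://127.0.0.1:18800"  # default stated in this document


def resolve_cdp_url(env=None):
    """Return the CDP base URL: CDP_URL if set and non-empty, else the default."""
    env = os.environ if env is None else env
    return (env.get("CDP_URL") or DEFAULT_CDP_URL).rstrip("/")


def browser_version(base_url, timeout=2.0):
    """Fetch /json/version, which describes the browser behind the endpoint."""
    with urllib.request.urlopen(f"{base_url}/json/version", timeout=timeout) as resp:
        return json.load(resp)
```

Calling `browser_version(resolve_cdp_url())` is a quick way to check that a browser is actually listening before running any commands; it raises an `OSError` subclass if nothing answers.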
Alias setup (optional)
CODEBLOCK1
Commands
CODEBLOCK2
How it works
INLINECODE2 scans the page for all interactive elements (links, buttons, inputs, selects, etc.) — including those inside shadow DOM (web components). This means sites like Reddit, GitHub, and other modern SPAs that use shadow DOM are fully supported. The scan recursively pierces all shadow roots.
Returns a compact numbered list:
CODEBLOCK3
Then click 3 or type 2 search query — immediately actionable, no interpretation needed.
Auto-indexing: click and type index the page automatically if it has not been indexed yet, so you can skip calling elements and go straight to click/type after open. Call elements explicitly when you need to see what's on the page.
After navigation or AJAX changes: Elements get re-indexed automatically on next click/type if stamps are stale. For manual re-index, call elements again.
Real mouse events: click uses CDP Input.dispatchMouseEvent (mousePressed + mouseReleased) instead of JS .click(). This triggers React/Vue/Angular synthetic event handlers that ignore plain .click() calls, so it works reliably on SPAs like Instagram, GitHub, and LinkedIn.
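To make the press/release pair concrete, here is a minimal Python sketch that builds the two CDP messages for one click. It only constructs the JSON payloads; sending them over the page's WebSocket is left out, and the function name and the sample coordinates are illustrative assumptions rather than the tool's own code. The method name and the `type`, `x`, `y`, `button`, and `clickCount` parameters come from the CDP Input domain.

```python
import json
from itertools import count

_ids = count(1)  # each CDP message needs its own integer id


def mouse_click_messages(x, y, button="left"):
    """Build the mousePressed + mouseReleased pair for one click at (x, y)."""
    def event(kind):
        return {
            "id": next(_ids),
            "method": "Input.dispatchMouseEvent",
            "params": {"type": kind, "x": x, "y": y, "button": button, "clickCount": 1},
        }
    return [event("mousePressed"), event("mouseReleased")]


# Hypothetical viewport coordinates, e.g. the center of an indexed element.
for msg in mouse_click_messages(120, 48):
    print(json.dumps(msg))
```

Because the browser receives these as real input events, the page sees the same mousedown/mouseup/click sequence that a user click would produce.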
File uploads
INLINECODE19 uses CDP's DOM.setFileInputFiles to inject files directly into hidden <input type="file"> elements — no OS file picker dialog. Works with Instagram, Twitter, any site with file uploads.
CODEBLOCK4
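For readers curious what the underlying CDP call looks like, this Python sketch builds a `DOM.setFileInputFiles` request. The helper name, the message id, and the node id are hypothetical; how the tool locates the file input node is not described here, so the sketch assumes a `backendNodeId` is already known. Paths are made absolute as a precaution, since the browser process resolves them.

```python
import os


def set_file_input_message(message_id, backend_node_id, paths):
    """Build a DOM.setFileInputFiles request for an <input type="file"> node."""
    return {
        "id": message_id,
        "method": "DOM.setFileInputFiles",
        "params": {
            # Absolute paths avoid depending on the browser's working directory.
            "files": [os.path.abspath(p) for p in paths],
            "backendNodeId": backend_node_id,
        },
    }
```

The request sets the input's file list directly, which is why no OS file picker ever opens.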
Token efficiency
| Approach | Tokens per interaction | Notes |
|---|---|---|
| bjs | ~50-200 | Indexed list, 1-line responses |
| browser tool (snapshot) | ~2,000-5,000 | Full accessibility tree |
| browser tool + thinking | ~3,000-8,000 | Plus reasoning to find refs |
Over a 10-step flow: ~1,500 tokens (bjs) vs ~30,000-80,000 (browser tool).
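The 10-step totals follow from scaling the per-interaction ranges. A small Python sketch of that arithmetic, using only the figures quoted in this section (the names `PER_STEP` and `flow_cost` are illustrative):

```python
STEPS = 10

# (low, high) tokens per interaction, as quoted in this section
PER_STEP = {
    "bjs": (50, 200),
    "browser tool (snapshot)": (2_000, 5_000),
    "browser tool + thinking": (3_000, 8_000),
}


def flow_cost(per_step, steps=STEPS):
    """Scale a per-interaction (low, high) range to a whole flow."""
    low, high = per_step
    return low * steps, high * steps


for name, per_step in PER_STEP.items():
    low, high = flow_cost(per_step)
    print(f"{name}: {low:,}-{high:,} tokens")
```

The bjs range works out to 500-2,000 tokens, which is where the ~1,500 estimate sits, and the 30,000-80,000 figure corresponds to the "browser tool + thinking" row.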
Typical flow
CODEBLOCK5
Shadow DOM support
bjs automatically pierces shadow DOM boundaries. Sites built with web components (Reddit, GitHub, etc.) work out of the box — elements, click, type, and text all recurse into shadow roots. No special flags needed.
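The recursion can be pictured with a toy model. The Python sketch below is not the tool's implementation (which runs against a live DOM); it uses a hypothetical `Node` class whose `shadow_root` field stands in for `element.shadowRoot`, and a made-up set of interactive tags, to show how a depth-first walk reaches elements nested inside several shadow roots.

```python
from dataclasses import dataclass, field
from typing import List, Optional

INTERACTIVE = {"a", "button", "input", "select", "textarea"}  # assumed tag set


@dataclass
class Node:
    tag: str
    label: str = ""
    children: List["Node"] = field(default_factory=list)
    shadow_root: Optional["Node"] = None  # models element.shadowRoot


def collect_interactive(node, found=None):
    """Depth-first walk that also descends into every shadow root."""
    found = [] if found is None else found
    if node.tag in INTERACTIVE:
        found.append(node)
    if node.shadow_root is not None:
        collect_interactive(node.shadow_root, found)
    for child in node.children:
        collect_interactive(child, found)
    return found


# A page with a button two shadow roots deep.
page = Node("body", children=[
    Node("a", "Home"),
    Node("search-box", shadow_root=Node("#shadow-root", children=[
        Node("input", "Search"),
        Node("fancy-button", shadow_root=Node("#shadow-root", children=[
            Node("button", "Go"),
        ])),
    ])),
])

for index, el in enumerate(collect_interactive(page), start=1):
    print(index, el.tag, el.label)
```

A plain `querySelectorAll` on the document would find only the top-level link in a page like this, which is why the walk has to enter each shadow root explicitly.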
Coordinate commands (iframes, captchas, overlays)
When you can't use click by index, for example because the target is inside a cross-origin iframe (captcha checkbox, payment form, OAuth widget), use the coordinate-based commands. They dispatch real CDP Input events at page coordinates through the browser's input pipeline, so they bypass all DOM boundaries.
Workflow for clicking inside an iframe:
CODEBLOCK6
INLINECODE27 returns the iframe's position on the page. Add offsets to target specific elements inside it (e.g. a checkbox is typically near the left side).
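The offset calculation is simple addition, sketched below in Python. The `Rect` fields, the sample rectangle, and the offsets are all assumptions for illustration; the actual output format of the command is not shown in this section, so adapt the field names to whatever it prints.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float       # left edge of the iframe, in page coordinates
    y: float       # top edge of the iframe
    width: float
    height: float


def point_in_iframe(rect, dx, dy):
    """Translate an offset inside the iframe into page coordinates."""
    if not (0 <= dx <= rect.width and 0 <= dy <= rect.height):
        raise ValueError("offset falls outside the iframe")
    return rect.x + dx, rect.y + dy


# Hypothetical rectangle for a captcha iframe; real values come from the tool.
captcha = Rect(x=300, y=420, width=304, height=78)

# Aim near the left side, vertically centered, where a checkbox usually sits.
x, y = point_in_iframe(captcha, dx=30, dy=captcha.height / 2)
print(x, y)
```

The bounds check is a cheap guard against clicking outside the frame when the page layout has shifted since the rectangle was read.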
Other uses:
- hover-xy — trigger hover menus, tooltips that need mouse position