Browserless Agent 🌐

A comprehensive web automation skill for OpenClaw that provides 30+ browser actions including navigation, data extraction, form filling, screenshot capture, PDF generation, file handling, and advanced web scraping capabilities.

🚀 Features

- Navigation: Full control over page navigation, redirects, and history
- Data Extraction: Get text, attributes, HTML, computed styles, and structured data
- Form Automation: Type text, click buttons, select options, upload files
- Visual Capture: Screenshots (full page, element-only, viewport)
- Content Generation: Save pages as PDF with custom options
- Advanced Interactions: Hover, drag-drop, keyboard shortcuts, scrolling
- Multi-tab Support: Manage multiple pages and windows
- Network Control: Intercept requests, modify headers, block resources
- Storage Access: Manage cookies, localStorage, sessionStorage
- Dynamic Content: Wait for selectors, network idle, custom conditions
- iFrames: Interact with nested frame content
- Browser State: Emulate devices, set geolocation, handle dialogs

🔧 Configuration

This skill requires the BROWSERLESS_URL environment variable to be configured in OpenClaw.
Optionally, you can also set BROWSERLESS_TOKEN for authentication.

To set it up:

1. Open OpenClaw settings
2. Navigate to Skills → browserless-agent
3. Enter your Browserless base URL in the API Key field
4. (Optional) Add BROWSERLESS_TOKEN in the env section for token authentication

Configuration Examples:

Cloud Service (with token):

BROWSERLESS_URL=wss://chrome.browserless.io
BROWSERLESS_TOKEN=your-token-here

Local Service (no token):

BROWSERLESS_URL=ws://localhost:3000

Custom Endpoint:

BROWSERLESS_URL=wss://your-host.com/playwright/chromium
BROWSERLESS_TOKEN=optional-token

The skill will automatically:

- Add /playwright/chromium if no endpoint path is specified
- Append the token as a query parameter if BROWSERLESS_TOKEN is set
- Work with or without an authentication token
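
The URL handling described in this list can be sketched in Python. This is an illustration under stated assumptions, not the skill's actual code: the helper name `build_ws_endpoint` is hypothetical, and the query parameter name `token` is assumed, since this README only says the token is appended as a query parameter.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

DEFAULT_PATH = "/playwright/chromium"  # default endpoint named in this README


def build_ws_endpoint(base_url: str, token: Optional[str] = None) -> str:
    """Return a WebSocket URL with a default path and an optional token."""
    parts = urlsplit(base_url)
    # Treat an empty path or a bare "/" as "endpoint not specified".
    path = parts.path if parts.path not in ("", "/") else DEFAULT_PATH
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Assumption: the query parameter is literally named "token".
    if token and not any(key == "token" for key, _ in query):
        query.append(("token", token))
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(query), parts.fragment)
    )


print(build_ws_endpoint("ws://localhost:3000"))
print(build_ws_endpoint("wss://chrome.browserless.io", "your-token-here"))
```

A URL that already carries a path, such as a custom endpoint, passes through with only the token added.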

Get your Browserless service at: browserless.io

📖 Available Actions

Navigation & Page Control

`navigate`

Navigate to a URL.

{"url": "https://example.com"}

`go_back`

Navigate to the previous page in history.

{}

`go_forward`

Navigate to the next page in history.

{}

`reload`

Reload the current page.

{"hard": false}

`wait_for_load`

Wait for the page to finish loading.

{"timeout": 30000}

Data Extraction

`get_text`

Extract text content from element(s).

{"selector": "h1", "all": false}

`get_attribute`

Get an attribute value from element(s).

{"selector": "img", "attribute": "src", "all": false}

`get_html`

Get the inner or outer HTML of element(s).

{"selector": "article", "outer": false, "all": false}

`get_value`

Get the input value from form element(s).

{"selector": "input[name=email]"}

`get_style`

Get a computed CSS style property.

{"selector": ".box", "property": "background-color"}

`get_multiple`

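Extract multiple pieces of data at once.

{
  "extractions": [
    {"name": "title", "selector": "h1", "type": "text"},
    {"name": "image", "selector": "img", "type": "attribute", "attribute": "src"},
    {"name": "price", "selector": ".price", "type": "text"}
  ]
}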
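
A `get_multiple` request can be assembled programmatically when the list of fields is long. The sketch below assumes the parameter format is an `extractions` list whose entries carry `name`, `selector`, and `type`, plus an `attribute` field for attribute extractions; the helper `extraction` is hypothetical and not part of the skill.

```python
from typing import Optional


def extraction(name: str, selector: str, attribute: Optional[str] = None) -> dict:
    """Build one get_multiple entry: text by default, attribute if one is named."""
    if attribute is None:
        return {"name": name, "selector": selector, "type": "text"}
    return {
        "name": name,
        "selector": selector,
        "type": "attribute",
        "attribute": attribute,
    }


# Parameter object for a single get_multiple call (assumed shape).
payload = {
    "extractions": [
        extraction("title", "h1"),
        extraction("image", "img", attribute="src"),
        extraction("price", ".price"),
    ]
}
print(payload)
```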

Interaction & Input

`type_text`

Type text into an element.

{"selector": "input[type=search]", "text": "hello world", "delay": 0, "clear": true}

`click`

Click on an element.

{"selector": "button.submit", "force": false, "delay": 0}

`double_click`

Double-click on an element.

{"selector": ".item"}

`right_click`

Right-click (context menu) on an element.

{"selector": ".context-target"}

`hover`

Move the mouse over an element.

{"selector": ".menu-item"}

`focus`

Focus on an element.

{"selector": "input"}

`select_option`

Select option(s) in a dropdown.

{"selector": "select", "values": ["option1", "option2"]}

`check`

Check a checkbox or radio button.

{"selector": "input[type=checkbox]"}

`uncheck`

Uncheck a checkbox.

{"selector": "input[type=checkbox]"}

`upload_file`

Upload file(s) to a file input.

{"selector": "input[type=file]", "files": ["path/to/file.pdf"]}

`press_key`

Press keyboard key(s).

{"key": "Enter"}

Common keys: Enter, Tab, Escape, ArrowDown, Control+A, etc.

`keyboard_type`

Type text with the keyboard (supports shortcuts).

{"text": "Hello World"}

Scrolling & Position

`scroll_to`

Scroll to a specific position.

{"x": 0, "y": 500}

`scroll_into_view`

Scroll an element into the viewport.

{"selector": ".footer"}

`scroll_to_bottom`

Scroll to the bottom of the page.

{}

`scroll_to_top`

Scroll to the top of the page.

{}

Visual & Capture

`screenshot`

Take a screenshot of the page or an element.

{
  "path": "screenshot.png",
  "full_page": true,
  "selector": null,
  "quality": 90,
  "type": "png"
}

`pdf`

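Generate a PDF from the current page.

{
  "path": "page.pdf",
  "format": "A4",
  "landscape": false,
  "margin": {"top": "1cm", "right": "1cm", "bottom": "1cm", "left": "1cm"},
  "print_background": true
}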

Evaluation & Execution

`evaluate`

Execute JavaScript in the page context.

{"expression": "document.title"}

`evaluate_function`

Execute a JavaScript function with arguments.

{
  "function": "(x, y) => x + y",
  "args": [5, 10]
}

Waiting & Timing

`wait_for_selector`

Wait for element to appear.

{"selector": ".dynamic-content", "timeout": 10000, "state": "visible"}

States: visible, hidden, attached, detached

`wait_for_timeout`

Wait for the specified number of milliseconds.

{"timeout": 2000}

`wait_for_function`

Wait for a JavaScript expression to return a truthy value.

{
  "expression": "() => document.readyState === 'complete'",
  "timeout": 10000
}

`wait_for_navigation`

Wait for navigation to complete.

{"timeout": 30000, "wait_until": "networkidle"}

wait_until options: load, domcontentloaded, networkidle
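
A caller might validate `wait_until` before sending the request. The sketch below is a hypothetical helper (`wait_for_navigation_params` is not part of the skill); it builds only the parameter object and rejects values outside the three options listed here.

```python
WAIT_UNTIL_OPTIONS = ("load", "domcontentloaded", "networkidle")


def wait_for_navigation_params(wait_until: str, timeout_ms: int = 30000) -> dict:
    """Build the parameter object for wait_for_navigation, validating inputs."""
    if wait_until not in WAIT_UNTIL_OPTIONS:
        raise ValueError(
            f"wait_until must be one of {WAIT_UNTIL_OPTIONS}, got {wait_until!r}"
        )
    if timeout_ms <= 0:
        raise ValueError("timeout_ms must be positive")
    return {"timeout": timeout_ms, "wait_until": wait_until}


print(wait_for_navigation_params("networkidle"))
```

Failing early on a misspelled option is cheaper than discovering it after a 30-second timeout.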

Element State Checking

`is_visible`

Check if an element is visible.

{"selector": ".modal"}

`is_enabled`

Check if an element is enabled.

{"selector": "button"}

`is_checked`

Check if a checkbox or radio button is checked.

{"selector": "input[type=checkbox]"}

`element_exists`

Check if an element exists in the DOM.

{"selector": ".optional-element"}

`element_count`

Count the elements matching a selector.

{"selector": ".list-item"}

Storage & Cookies

`get_cookies`

Get all cookies or a specific cookie.

{"name": "session_id"}

`set_cookie`

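Set a cookie.

{
  "name": "user_preference",
  "value": "dark_mode",
  "domain": "example.com",
  "path": "/",
  "expires": 1735689600,
  "httpOnly": false,
  "secure": true,
  "sameSite": "Lax"
}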

`delete_cookies`

Delete cookies.

{"name": "session_id"}

Omit name to delete all cookies.

`get_local_storage`

Get a localStorage item.

{"key": "user_data"}

`set_local_storage`

Set localStorage item. CODEBLOCK47

`clear_local_storage`

Clear all localStorage. CODEBLOCK48

Network & Requests

`set_extra_headers`

Set extra HTTP headers for all requests. CODEBLOCK49

`block_resources`

Block specific resource types.

{"types": ["image", "stylesheet", "font"]}

Types: document, stylesheet, image, media, font, script, xhr, fetch, other
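
This README does not say how the skill implements blocking. With Playwright, one common approach is request routing: a handler inspects each request's resource type and aborts the unwanted ones. The sketch below shows that pattern with a hypothetical helper `make_blocker`; it is an illustration of the technique, not the skill's code.

```python
from typing import Callable, Iterable


def make_blocker(blocked_types: Iterable[str]) -> Callable:
    """Return a route handler that aborts requests of the given resource types."""
    blocked = frozenset(blocked_types)

    def handle(route) -> None:
        # `route` is expected to behave like a Playwright Route object.
        if route.request.resource_type in blocked:
            route.abort()
        else:
            route.continue_()

    return handle


# With Playwright's sync API this handler would be registered as:
#   page.route("**/*", make_blocker(["image", "stylesheet", "font"]))
handler = make_blocker(["image", "stylesheet", "font"])
```

Blocking images, stylesheets, and fonts usually speeds up text-only scraping, at the cost of pages that no longer render as a user would see them.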

`get_page_info`

Get comprehensive page information.

{}

Returns: title, url, html (optional), viewport size, etc.

iFrame Handling

`get_frame_text`

Get text from element inside iframe. CODEBLOCK52

`click_in_frame`

Click element inside iframe. CODEBLOCK53

Multi-Page/Tab

`new_page`

Open a new page/tab. CODEBLOCK54

`close_page`

Close a specific page. CODEBLOCK55

`switch_page`

Switch to a different page. CODEBLOCK56

`list_pages`

List all open pages. CODEBLOCK57

Browser Context

`set_viewport`

Set viewport size. CODEBLOCK58

`emulate_device`

Emulate mobile device.

{"device": "iPhone 12"}

Common devices: iPhone 12, iPad Pro, Galaxy S21, Pixel 5

`set_geolocation`

Set geolocation. CODEBLOCK60

`set_user_agent`

Set custom user agent. CODEBLOCK61

Advanced Automation

`drag_and_drop`

Drag element and drop on target. CODEBLOCK62

`fill_form`

Fill multiple form fields at once. CODEBLOCK63

`extract_table`

Extract data from HTML table. CODEBLOCK64

`extract_links`

Extract all links from page. CODEBLOCK65

`handle_dialog`

Set how to handle JavaScript dialogs (alert/confirm/prompt).

{
  "action": "accept",
  "text": "Optional prompt response"
}

Actions: accept, dismiss

💡 Usage Examples

Example 1: Web Scraping

CODEBLOCK67

Example 2: Form Automation

CODEBLOCK68

Example 3: Screenshot with Element Highlight

CODEBLOCK69

Example 4: PDF Generation

CODEBLOCK70

🎯 OpenClaw Integration

Once the skill is configured, the OpenClaw agent invokes these actions automatically in response to user requests. Examples:

User: "Take a screenshot of example.com"
Agent: Executes screenshot action with the URL

User: "What's the title of wikipedia.org?"
Agent: Navigates to Wikipedia and extracts text from the title element

User: "Search for 'Python' on Google and get the first result link"
Agent: Navigates to Google, types in search, clicks search, extracts first result
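
The Wikipedia example above breaks down into two actions. The sketch below writes that plan as (action, parameters) pairs; the pairing format and the helper `plan_title_lookup` are illustrative assumptions, since this README does not document how OpenClaw passes actions to the skill.

```python
import json
from typing import List, Tuple


def plan_title_lookup(url: str) -> List[Tuple[str, dict]]:
    """Return (action, params) pairs for 'navigate, then read the title'."""
    return [
        ("navigate", {"url": url}),
        # Assumption: get_text accepts the <title> element as a selector.
        ("get_text", {"selector": "title", "all": False}),
    ]


for action, params in plan_title_lookup("https://wikipedia.org"):
    print(action, json.dumps(params))
```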

🔒 Security Notes

- Connections to wss:// endpoints use WebSocket over TLS; a plain ws:// endpoint, such as a local instance, is unencrypted
- Credentials are never logged or exposed in responses
- All browser actions are isolated in the Browserless container
- No local browser installation required

🐛 Troubleshooting

Connection fails:

- Verify that BROWSERLESS_URL is correct
- Check that the token is valid and not expired
- Ensure the network allows WebSocket connections

Timeout errors:

- Increase timeout values for slow-loading pages
- Use wait_for_selector before interacting with dynamic content
- Consider using wait_until: "networkidle" for AJAX-heavy sites
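
The second tip, waiting before interacting, can be expressed as a pair of actions. The sketch below uses the `wait_for_selector` parameters documented earlier; the helper `click_after_wait` and the (action, parameters) pairing are hypothetical conveniences, not part of the skill.

```python
from typing import List, Tuple


def click_after_wait(selector: str, timeout_ms: int = 10000) -> List[Tuple[str, dict]]:
    """Return a wait_for_selector step followed by a click on the same element."""
    return [
        (
            "wait_for_selector",
            {"selector": selector, "timeout": timeout_ms, "state": "visible"},
        ),
        ("click", {"selector": selector}),
    ]


# A slow page might need a longer timeout than the 10-second default used here.
for action, params in click_after_wait(".dynamic-content", timeout_ms=20000):
    print(action, params)
```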

Element not found:

- Verify the selector using browser DevTools
- Wait for the element to load with `wait_for_selector`
- Check if the element is inside an iframe

📚 Resources
