Smooth Browser

Smooth CLI is a browser for AI agents to interact with websites, authenticate, scrape data, and perform complex web-based tasks using natural language.

Prerequisites

Assume the Smooth CLI is already installed. If not, you can install it by running:

pip install smooth-py

Assume an API key is already configured. If you encounter authentication errors, configure it with:

smooth config --api-key <api-key>

To verify the configuration:

smooth config --show

Get an API key at https://app.smooth.sh

If the account is out of credits, ask the user to upgrade their plan at https://app.smooth.sh
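
As a minimal sketch of the prerequisite check described above, the snippet below tests whether a `smooth` executable is on the PATH before any other command is attempted. The helper name `have_cmd` is hypothetical and not part of the CLI; the snippet only reports what to do and installs nothing.

```shell
# Return 0 if the named command is available on PATH, non-zero otherwise.
# have_cmd is an illustrative helper, not something the Smooth CLI provides.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report the next step without installing anything.
if have_cmd smooth; then
  echo "smooth CLI found"
else
  echo "smooth CLI missing: run 'pip install smooth-py'"
fi
```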

Basic Workflow

1. Create a Profile (Optional)

Profiles are useful to persist cookies, login sessions, and browser state between sessions.

smooth create-profile --profile-id my-profile

List existing profiles:

smooth list-profiles

2. Start a Browser Session

smooth start-session --profile-id my-profile --url https://example.com

Options:

- --profile-id - Use a specific profile (optional; an anonymous session is created if omitted)
- --url - Initial URL to navigate to (optional)
- --files - Comma-separated file IDs to make available in the session (optional)
- --device mobile|desktop - Device type (default: mobile)
- --profile-read-only - Load the profile without saving changes
- --allowed-urls - Comma-separated URL patterns that restrict access to certain URLs only (e.g., "https://example.com/,https://api.example.com/")
- --no-proxy - Disable the default proxy (see note below)

Important: Save the session ID from the output - you'll need it for all subsequent commands.

Proxy behavior: By default, the CLI automatically configures a built-in proxy for the browser session. If a website blocks the proxy or you need direct connections, disable it with --no-proxy.
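
Because most of the options above are optional, it can help to assemble the command from whichever settings are present. The sketch below only prints the resulting command line for inspection; `build_start_cmd` and the variables `PROFILE_ID`, `START_URL` and `DISABLE_PROXY` are hypothetical names chosen for illustration.

```shell
# Print a start-session command built from optional settings.
# PROFILE_ID, START_URL and DISABLE_PROXY are hypothetical variable names.
build_start_cmd() {
  set -- smooth start-session
  if [ -n "${PROFILE_ID:-}" ]; then set -- "$@" --profile-id "$PROFILE_ID"; fi
  if [ -n "${START_URL:-}" ]; then set -- "$@" --url "$START_URL"; fi
  if [ "${DISABLE_PROXY:-0}" = "1" ]; then set -- "$@" --no-proxy; fi
  printf '%s\n' "$*"
}

PROFILE_ID=my-profile
START_URL=https://example.com
build_start_cmd
```

Setting `DISABLE_PROXY=1` would append `--no-proxy`, which matches the case where a website blocks the built-in proxy.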

3. Run Tasks in the Session

Execute tasks using natural language:

smooth run -- <session-id> "Go to the LocalLLM subreddit and find the top 3 posts"

With structured output (for tasks requiring interaction):
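
smooth run -- <session-id> "Search for wireless headphones, filter by 4+ stars, sort by price, and extract the top 3 results" \
  --url https://shop.example.com \
  --response-model '{"type":"array","items":{"type":"object","properties":{"product":{"type":"string","description":"Name of the product being described."},"sentiment":{"type":"string","enum":["positive","negative","neutral"],"description":"Overall sentiment about the product."}},"required":["product","sentiment"]}}'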

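With metadata (a JSON object with variables for the task):

smooth run -- <session-id> "Fill out the form with user information" \
  --metadata '{"email":"user@example.com","name":"John Doe"}'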

Options:

- --url - Navigate to this URL before running the task
- --metadata - JSON object with variables for the task
- --response-model - JSON schema for structured output
- --max-steps - Maximum agent steps (default: 32)
- --json - Output results as JSON
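
JSON values for `--metadata` and `--response-model` contain quotes and braces that the shell would otherwise interpret. One way to keep them intact, sketched here under the assumption of a POSIX shell, is a quoted heredoc. The schema content and task text are illustrative only, and the command is printed rather than executed.

```shell
# Store a JSON Schema via a quoted heredoc so the shell leaves
# quotes, braces and dollar signs untouched. The schema is illustrative.
SCHEMA=$(cat <<'EOF'
{"type":"object","properties":{"title":{"type":"string"}},"required":["title"]}
EOF
)

# Print the command that would be run; <session-id> is a placeholder.
printf 'smooth run -- <session-id> "Get the page title" --response-model %s\n' "'$SCHEMA'"
```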

Notes:
Give tasks at the right level of abstraction: not too prescriptive (e.g. single-step actions) and not too broad or vague.

Good tasks:

- "Search on LinkedIn for people working as SDEs at Amazon, and return 5 profile URLs"
- "Find the price of an iPhone 17 on Amazon"

Bad tasks:

- "Click search" -> too prescriptive!
- "Load google.com, write 'restaurants near me', click search, wait for the page to load, extract the top 5 results, and return them." -> too prescriptive! You can say "search restaurants near me on google and return the top 5 results"
- "Find software engineers that would be a good fit for our company" -> too broad! YOU need to plan how to achieve the goal and run well-defined tasks that compose into the given goal

IMPORTANT: Smooth is powered by an intelligent agent. DO NOT over-control it; give it well-defined, goal-oriented tasks instead of steps.

4. Close the Session

You must close the session when you're done.

smooth close-session -- <session-id>

Important: Wait 5 seconds after closing to ensure cookies and state are saved to the profile if you need it for another session.
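
The close-then-wait rule can be wrapped in a small helper so the pause is never forgotten. This is a sketch only: `close_and_wait`, `SMOOTH_BIN` and `CLOSE_WAIT` are hypothetical names, with the two variables added so the helper can be dry-run without the real CLI.

```shell
# Close a session, then pause so profile state can be written.
# SMOOTH_BIN and CLOSE_WAIT are hypothetical overrides for dry runs.
close_and_wait() {  # usage: close_and_wait <session-id>
  "${SMOOTH_BIN:-smooth}" close-session -- "$1" || return 1
  sleep "${CLOSE_WAIT:-5}"
}
```

For a dry run, setting `SMOOTH_BIN=echo` and `CLOSE_WAIT=0` prints the command instead of running it.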

Common Use Cases

Authentication & Persistent Sessions

Create a profile for a specific website:
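
# Create profile
smooth create-profile --profile-id github-account

# Start session
smooth start-session --profile-id github-account --url https://github.com/login

# Get live view to authenticate manually
smooth live-view -- <session-id>
# Give the URL to the user so they can open it in a browser and log in

# When the user confirms the login, close the session to save the profile data
smooth close-session -- <session-id>
# Save the profile-id for later reuse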

Reuse authenticated profile:
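
# Next time, just start a session with the same profile
smooth start-session --profile-id github-account
smooth run -- <session-id> "Create a new issue in my repo 'my-project'"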

Keep profiles organized: Save to memory which profiles authenticate to which services so you can reuse them efficiently in the future.
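
If no other memory mechanism is available, a plain-text registry is one possible way to track which profile authenticates to which service. The file path and the helper names `remember_profile` and `profile_for` below are assumptions for illustration, not features of the CLI.

```shell
# Tab-separated registry: "<profile-id><TAB><service>" per line.
REGISTRY="${REGISTRY:-./smooth-profiles.tsv}"   # hypothetical file path

remember_profile() {  # usage: remember_profile <profile-id> <service>
  printf '%s\t%s\n' "$1" "$2" >> "$REGISTRY"
}

profile_for() {       # usage: profile_for <service>
  # Print the first profile recorded for the given service.
  awk -F '\t' -v s="$1" '$2 == s { print $1; exit }' "$REGISTRY"
}
```

After `remember_profile github-account github.com`, a later `profile_for github.com` prints the profile ID to pass to `--profile-id`.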

Sequential Tasks on Same Browser

Execute multiple tasks in sequence without closing the session:

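SESSION_ID=$(smooth start-session --profile-id my-profile --json | jq -r .session_id)

# Task 1: Login
smooth run "$SESSION_ID" "Log in to the website with the given credentials"

# Task 2: First action
smooth run "$SESSION_ID" "Find the settings and change the notification preferences to email only"

# Task 3: Second action
smooth run "$SESSION_ID" "Find the billing section and give me the URL of the latest invoice"

smooth close-session "$SESSION_ID"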

Important: run preserves the browser state (cookies, URL, page content) but not the browser agent's memory. If you need to carry information from one task to the next, you should pass it explicitly in the prompt.

Example - Passing context between tasks:
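
# Task 1: Get information
RESULT=$(smooth run "$SESSION_ID" "Find the product name on this page" --json | jq -r .output)

# Task 2: Use information from Task 1
smooth run "$SESSION_ID" "Consider the product with name '$RESULT'. Now find 3 similar products offered by this online store."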
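
Since the agent does not remember earlier tasks, a follow-up prompt built from an empty result would silently lose the context. The sketch below guards against that case before composing the next prompt; `next_prompt` is a hypothetical helper name.

```shell
# Build the follow-up prompt from the previous task's output,
# refusing to continue when that output is empty.
next_prompt() {  # usage: next_prompt <previous-result>
  if [ -z "$1" ]; then
    echo "previous task returned no output" >&2
    return 1
  fi
  printf 'Consider the product with name "%s". Now find 3 similar products offered by this online store.\n' "$1"
}
```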

Notes:

- The run command is blocking. If you need to carry out multiple tasks at the same time, you MUST use subagents (Task tool).
- All tasks use the current tab; you cannot request to run tasks in a new tab. If you need to preserve the current tab's state, you can open a new session.
- Each session can run only one task at a time. To run tasks simultaneously, use subagents with one session each.
- The maximum number of concurrent sessions depends on the user plan.
- If useful, remind the user that they can upgrade the plan to give you more concurrent sessions.

Web Scraping with Structured Output

Option 1: Using run with structured output:

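smooth start-session --url https://news.ycombinator.com
smooth run -- <session-id> "Extract the top 10 posts" \
  --response-model '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }'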

Option 2: Using extract for direct data extraction:

The extract command is more efficient for pure data extraction as it doesn't use agent steps.

It's like a smart fetch that can extract structured data from dynamically rendered websites:

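smooth start-session
smooth extract -- <session-id> \
  --url https://news.ycombinator.com \
  --schema '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }' \
  --prompt "Extract the top 10 posts"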

When to use each:

- Use extract when you're on the right page or know the right URL and just need to pull structured data
- Use run when you need the agent to navigate, interact, or perform complex actions before extracting

Working with Files

Upload files for use in sessions:

Files must be uploaded before starting a session, then passed to the session via file IDs:

CODEBLOCK16

Upload multiple files:
CODEBLOCK17
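
The `--files` option of start-session expects a comma-separated list of file IDs, so several uploaded IDs have to be joined before the session starts. A minimal sketch of that joining step follows; `join_by_comma` is a hypothetical helper and the IDs shown in the usage note are placeholders.

```shell
# Join all arguments with commas, e.g. to build the value for --files.
join_by_comma() {
  out=""
  for id in "$@"; do
    # Prepend a comma only when out already holds an earlier item.
    out="${out:+$out,}$id"
  done
  printf '%s\n' "$out"
}
```

For example, `join_by_comma file-1 file-2` prints `file-1,file-2`. The same helper would also work for the comma-separated patterns of `--allowed-urls`.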

Download files from session:

smooth run -- <session-id> "Download the monthly report PDF" --url
smooth close-session -- <session-id>

# After session closes, get download URL
smooth downloads -- <session-id>
# Visit the URL to download files

Live View & Manual Intervention

When automation needs human input (CAPTCHA, 2FA, complex authentication):

CODEBLOCK19

Direct Browser Actions

Extract data from current page:

CODEBLOCK20

Navigate to URL then extract:

CODEBLOCK21

Execute JavaScript in the browser:

CODEBLOCK22

Profile Management

List all profiles:

smooth list-profiles

Delete a profile:
CODEBLOCK24

When to use profiles:

- ✅ Websites requiring authentication
- ✅ Maintaining session state across multiple task runs
- ✅ Avoiding repeated logins
- ✅ Preserving cookies and local storage

When to skip profiles:

- Public websites that don't require authentication
- One-off scraping tasks
- Testing scenarios

File Management

Upload files:

smooth upload-file <path>

Delete files:

smooth delete-file <file-id>

Best Practices

1. Always save session IDs - You'll need them for subsequent commands
2. Use profiles for authenticated sessions - Track which profile is for which website
3. Wait 5 seconds after closing sessions - Ensures state is properly saved
4. Use descriptive profile IDs - e.g., "linkedin-personal", "twitter-company"
5. Close sessions when done - Graceful close (default) ensures proper cleanup
6. Use structured output for data extraction - Provides clean, typed results
7. Run sequential tasks in the same session - Keep the session continuous when steps rely on previous work
8. Use subagents with one session each for independent tasks - Run tasks in parallel to speed up work
9. Coordinate resources - When working with subagents, create and assign ONE session to each subagent yourself rather than having the subagents create their own
10. Do not add URL query parameters to URLs (e.g. avoid ?filter=xyz) - Start at the base URL and let the agent navigate the UI to apply filters
11. Smooth is powered by an intelligent agent - Give it tasks, not individual steps
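
To follow the advice about query parameters when a URL comes from elsewhere, the query string can be dropped before the URL is passed to `--url`. The sketch below uses POSIX parameter expansion; `strip_query` is a hypothetical helper name.

```shell
# Remove everything from the first "?" onward, leaving the base URL.
strip_query() {  # usage: strip_query <url>
  printf '%s\n' "${1%%\?*}"
}
```

For example, `strip_query 'https://example.com/products?filter=xyz'` prints `https://example.com/products`, and a URL without a query string is returned unchanged.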

Troubleshooting

"Session not found" - The session may have timed out or been closed. Start a new one.

"Profile not found" - Check smooth list-profiles to see available profiles.

CAPTCHA or authentication issues - Use smooth live-view -- <session-id> to let the user manually intervene.

Task timeout - Increase --max-steps or break the task into smaller steps.
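
One possible way to apply the step-limit advice is to retry a failed task once with a larger budget. This sketch assumes the CLI exits non-zero when a task fails, which the text above does not state. The value 64 and the names `run_with_retry` and `SMOOTH_BIN` are likewise illustrative.

```shell
# Retry a task once with a larger step budget if the first attempt fails.
# Assumption: the CLI signals failure with a non-zero exit status.
run_with_retry() {  # usage: run_with_retry <session-id> <task>
  "${SMOOTH_BIN:-smooth}" run -- "$1" "$2" --max-steps 32 && return 0
  # Second attempt with double the default budget (illustrative value).
  "${SMOOTH_BIN:-smooth}" run -- "$1" "$2" --max-steps 64
}
```

If the second attempt also fails, splitting the task into smaller ones is likely the better fix.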

Command Reference

Profile Commands

- smooth create-profile [--profile-id ID] - Create a new profile
- smooth list-profiles - List all profiles
- INLINECODE25 - Delete a profile

File Commands

- smooth upload-file <path> [--name NAME] [--purpose PURPOSE] - Upload a file
- smooth delete-file <file-id> - Delete an uploaded file

Session Commands

- smooth start-session [OPTIONS] - Start a browser session
- smooth close-session -- <session-id> - Close a session
- smooth run -- <session-id> "<task>" [OPTIONS] - Run a task
- smooth extract -- <session-id> [OPTIONS] - Extract structured data
- INLINECODE32 - Execute JavaScript
- smooth live-view -- <session-id> - Get interactive live URL
- INLINECODE34 - Get recording URL
- smooth downloads -- <session-id> - Get downloads URL

All commands support --json flag for JSON output.

Smooth Browser

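Smooth CLI is a browser for AI agents to interact with websites, authenticate, scrape data, and perform complex web-based tasks using natural language.

Prerequisites

Assume the Smooth CLI is already installed. If not, you can install it by running:

pip install smooth-py

Assume an API key is already configured. If you encounter authentication errors, configure it with:

smooth config --api-key <api-key>

To verify the configuration:

smooth config --show

Get an API key at https://app.smooth.sh

If the account is out of credits, ask the user to upgrade their plan at https://app.smooth.sh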

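Basic Workflow

1. Create a Profile (Optional)

Profiles are useful to persist cookies, login sessions, and browser state between sessions.

smooth create-profile --profile-id my-profile

List existing profiles:

smooth list-profiles

2. Start a Browser Session

smooth start-session --profile-id my-profile --url https://example.com

Options:

- --profile-id - Use a specific profile (optional; an anonymous session is created if omitted)
- --url - Initial URL to navigate to (optional)
- --files - Comma-separated file IDs to make available in the session (optional)
- --device mobile|desktop - Device type (default: mobile)
- --profile-read-only - Load the profile without saving changes
- --allowed-urls - Comma-separated URL patterns that restrict access to certain URLs only (e.g., https://example.com/,https://api.example.com/)
- --no-proxy - Disable the default proxy (see note below)

Important: Save the session ID from the output; you'll need it for all subsequent commands.

Proxy behavior: By default, the CLI automatically configures a built-in proxy for the browser session. If a website blocks the proxy or you need direct connections, disable it with --no-proxy.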

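3. Run Tasks in the Session

Execute tasks using natural language:

smooth run -- <session-id> "Go to the LocalLLM subreddit and find the top 3 posts"

With structured output (for tasks requiring interaction):

smooth run -- <session-id> "Search for wireless headphones, filter by 4+ stars, sort by price, and extract the top 3 results" \
  --url https://shop.example.com \
  --response-model '{"type":"array","items":{"type":"object","properties":{"product":{"type":"string","description":"Name of the product being described."},"sentiment":{"type":"string","enum":["positive","negative","neutral"],"description":"Overall sentiment about the product."}},"required":["product","sentiment"]}}'

With metadata (a JSON object with variables for the task):

smooth run -- <session-id> "Fill out the form with user information" \
  --metadata '{"email":"user@example.com","name":"John Doe"}'

Options:

- --url - Navigate to this URL before running the task
- --metadata - JSON object with variables for the task
- --response-model - JSON schema for structured output
- --max-steps - Maximum agent steps (default: 32)
- --json - Output results as JSON

Notes:
Give tasks at the right level of abstraction: not too prescriptive (e.g. single-step actions) and not too broad or vague.

Good tasks:

- "Search on LinkedIn for people working as SDEs at Amazon, and return 5 profile URLs"
- "Find the price of an iPhone 17 on Amazon"

Bad tasks:

- "Click search" -> too prescriptive!
- "Load google.com, write 'restaurants near me', click search, wait for the page to load, extract the top 5 results, and return them." -> too prescriptive! You can say "search restaurants near me on google and return the top 5 results"
- "Find software engineers that would be a good fit for our company" -> too broad! You need to plan how to achieve the goal and run well-defined tasks that compose into the given goal

IMPORTANT: Smooth is powered by an intelligent agent. Do not over-control it; give it well-defined, goal-oriented tasks instead of steps.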

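4. Close the Session

You must close the session when you're done.

smooth close-session -- <session-id>

Important: Wait 5 seconds after closing to ensure cookies and state are saved to the profile if you need it for another session.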

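Common Use Cases

Authentication & Persistent Sessions

Create a profile for a specific website:

# Create profile
smooth create-profile --profile-id github-account

# Start session
smooth start-session --profile-id github-account --url https://github.com/login

# Get live view to authenticate manually
smooth live-view -- <session-id>
# Give the URL to the user so they can open it in a browser and log in

# When the user confirms the login, close the session to save the profile data
smooth close-session -- <session-id>
# Save the profile-id for later reuse

Reuse authenticated profile:

# Next time, just start a session with the same profile
smooth start-session --profile-id github-account
smooth run -- <session-id> "Create a new issue in my repo 'my-project'"

Keep profiles organized: Save to memory which profiles authenticate to which services so you can reuse them efficiently in the future.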

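Sequential Tasks on Same Browser

Execute multiple tasks in sequence without closing the session:

SESSION_ID=$(smooth start-session --profile-id my-profile --json | jq -r .session_id)

# Task 1: Login
smooth run "$SESSION_ID" "Log in to the website with the given credentials"

# Task 2: First action
smooth run "$SESSION_ID" "Find the settings and change the notification preferences to email only"

# Task 3: Second action
smooth run "$SESSION_ID" "Find the billing section and give me the URL of the latest invoice"

smooth close-session "$SESSION_ID"

Important: run preserves the browser state (cookies, URL, page content) but not the browser agent's memory. If you need to carry information from one task to the next, you should pass it explicitly in the prompt.

Example - Passing context between tasks:

# Task 1: Get information
RESULT=$(smooth run "$SESSION_ID" "Find the product name on this page" --json | jq -r .output)

# Task 2: Use information from Task 1
smooth run "$SESSION_ID" "Consider the product with name '$RESULT'. Now find 3 similar products offered by this online store."

Notes:

- The run command is blocking. If you need to carry out multiple tasks at the same time, you MUST use subagents (Task tool).
- All tasks use the current tab; you cannot request to run tasks in a new tab. If you need to preserve the current tab's state, you can open a new session.
- Each session can run only one task at a time. To run tasks simultaneously, use subagents with one session each.
- The maximum number of concurrent sessions depends on the user plan.
- If useful, remind the user that they can upgrade the plan to give you more concurrent sessions.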

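Web Scraping with Structured Output

Option 1: Using run with structured output:

smooth start-session --url https://news.ycombinator.com
smooth run -- <session-id> "Extract the top 10 posts" \
  --response-model '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }'

Option 2: Using extract for direct data extraction:

The extract command is more efficient for pure data extraction as it doesn't use agent steps.

It's like a smart fetch that can extract structured data from dynamically rendered websites:

smooth start-session
smooth extract -- <session-id> \
  --url https://news.ycombinator.com \
  --schema '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }' \
  --prompt "Extract the top 10 posts"

When to use each:

- Use extract when you're on the right page or know the right URL and just need to pull structured data
- Use run when you need the agent to navigate, interact, or perform complex actions before extracting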

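Working with Files

Upload files for use in sessions:

Files must be uploaded before starting a session, then passed to the session via file IDs:

# Step 1: Upload files
FILE_ID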
