Smooth Browser
Smooth CLI is a browser for AI agents to interact with websites, authenticate, scrape data, and perform complex web-based tasks using natural language.
Prerequisites
Assume the Smooth CLI is already installed. If it is not, install it by running:
pip install smooth-py
Assume an API key is already configured. If you encounter authentication errors, configure it with:
smooth config --api-key <api-key>
To verify the configuration:
smooth config --show
Get an API key at https://app.smooth.sh
If the account is out of credits, ask the user to upgrade their plan at https://app.smooth.sh
Basic Workflow
1. Create a Profile (Optional)
Profiles are useful to persist cookies, login sessions, and browser state between sessions.
smooth create-profile --profile-id my-profile
List existing profiles:
smooth list-profiles
2. Start a Browser Session
smooth start-session --profile-id my-profile --url https://example.com
Options:
- --profile-id - Use a specific profile (optional; an anonymous session is created if not provided)
- --url - Initial URL to navigate to (optional)
- --files - Comma-separated file IDs to make available in the session (optional)
- --device mobile|desktop - Device type (default: mobile)
- --profile-read-only - Load profile without saving changes
- --allowed-urls - Comma-separated URL patterns to restrict access to certain URLs only (e.g., "https://example.com/,https://api.example.com/")
- --no-proxy - Disable the default proxy (see note below)
Important: Save the session ID from the output - you'll need it for all subsequent commands.
Proxy behavior: By default, the CLI automatically configures a built-in proxy for the browser session. If a website blocks the proxy or you need direct connections, disable it with --no-proxy.
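Because --allowed-urls takes a comma-separated list, it can help to keep the patterns as separate arguments and join them just before starting the session. The following is a minimal sketch, assuming a POSIX shell; join_with_commas is a hypothetical helper (not part of the CLI), the patterns are the illustrative ones from the option description, and the smooth call only runs when the CLI is on the PATH.

```shell
# Hypothetical helper: join all arguments with commas.
# The body runs in a subshell, so changing IFS does not leak out.
join_with_commas() (
  IFS=,
  printf '%s\n' "$*"   # "$*" joins arguments on the first character of IFS
)

ALLOWED_URLS=$(join_with_commas "https://example.com/" "https://api.example.com/")
echo "$ALLOWED_URLS"

# Start a restricted session without the built-in proxy, if the CLI is installed.
if command -v smooth >/dev/null 2>&1; then
  smooth start-session --allowed-urls "$ALLOWED_URLS" --no-proxy
fi
```

Drop --no-proxy if the default proxy works for the target site.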
3. Run Tasks in the Session
Execute tasks using natural language:
smooth run -- <session-id> "Go to the LocalLLM subreddit and find the top 3 posts"
With structured output (for tasks requiring interaction):
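smooth run -- <session-id> "Search for wireless headphones, filter by 4+ stars, sort by price, and extract the top 3 results" \
  --url https://shop.example.com \
  --response-model '{"type":"array","items":{"type":"object","properties":{"product":{"type":"string","description":"The name of the product being described."},"sentiment":{"type":"string","enum":["positive","negative","neutral"],"description":"The overall sentiment about the product."}},"required":["product","sentiment"]}}'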
With metadata (the agent will be):
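smooth run -- <session-id> "Fill out the form with user information" \
  --metadata '{"email":"user@example.com","name":"John Doe"}'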
Options:
- --url - Navigate to this URL before running the task
- --metadata - JSON object with variables for the task
- --response-model - JSON schema for structured output
- --max-steps - Maximum agent steps (default: 32)
- --json - Output results as JSON
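Both --metadata and --response-model take JSON, so the double quotes have to survive the shell. As a rough sketch, assuming a POSIX shell, a quoted heredoc keeps the JSON verbatim. The SESSION_ID variable, the "submitted" field, and the task wording are illustrative assumptions, and the command only runs when smooth is installed and SESSION_ID is set.

```shell
# A quoted heredoc delimiter ('JSON') disables expansion, so quotes are kept as-is.
METADATA=$(cat <<'JSON'
{"email": "user@example.com", "name": "John Doe"}
JSON
)

# Hypothetical schema for this sketch: a single boolean result field.
RESPONSE_MODEL=$(cat <<'JSON'
{"type": "object", "properties": {"submitted": {"type": "boolean"}}, "required": ["submitted"]}
JSON
)

echo "$METADATA"

# SESSION_ID is assumed to hold the ID printed by start-session.
if command -v smooth >/dev/null 2>&1 && [ -n "${SESSION_ID:-}" ]; then
  smooth run -- "$SESSION_ID" "Fill out the form with user information" \
    --metadata "$METADATA" \
    --response-model "$RESPONSE_MODEL"
fi
```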
Notes:
Give tasks at the right level of abstraction: not too prescriptive (e.g., single-step actions) and not too broad or vague.
Good tasks:
- "Search on LinkedIn for people working as SDEs at Amazon, and return 5 profile URLs"
- "Find the price of an iPhone 17 on Amazon"
Bad tasks:
- "Click search" -> too prescriptive!
- "Load google.com, write 'restaurants near me', click search, wait for the page to load, extract the top 5 results, and return them." -> too prescriptive! You can say "search restaurants near me on google and return the top 5 results"
- "Find software engineers that would be a good fit for our company" -> too broad! YOU need to plan how to achieve the goal and run well-defined tasks that compose into the given goal
IMPORTANT: Smooth is powered by an intelligent agent. DO NOT over-control it; give it well-defined, goal-oriented tasks instead of steps.
4. Close the Session
You must close the session when you're done.
smooth close-session -- <session-id>
Important: Wait 5 seconds after closing to ensure cookies and state are saved to the profile if you need it for another session.
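One way to make the close-then-wait step hard to forget is to wrap it in a small shell function. This is only a sketch: close_and_wait and the SMOOTH_CLOSE_WAIT variable are hypothetical names introduced here, not part of the CLI, and the 5-second default simply mirrors the guidance above.

```shell
# Hypothetical wrapper: close a session, then pause so profile state can be saved.
# SMOOTH_CLOSE_WAIT (seconds) is an assumption of this sketch; it defaults to 5.
close_and_wait() {
  smooth close-session -- "$1" || return 1
  sleep "${SMOOTH_CLOSE_WAIT:-5}"
}
```

Call it as close_and_wait "$SESSION_ID" before starting another session on the same profile.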
Common Use Cases
Authentication & Persistent Sessions
Create a profile for a specific website:
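# Create profile
smooth create-profile --profile-id github-account
# Start session
smooth start-session --profile-id github-account --url https://github.com/login
# Get live view to authenticate manually
smooth live-view -- <session-id>
# Give the URL to the user so they can open it in a browser and log in
# Once the user confirms the login, close the session to save the profile data
smooth close-session -- <session-id>
# Save the profile-id for later reuse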
Reuse authenticated profile:
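# Next time, just start a session with the same profile
smooth start-session --profile-id github-account
smooth run -- <session-id> "Create a new issue in my repo 'my-project'"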
Keep profiles organized: Save to memory which profiles authenticate to which services so you can reuse them efficiently in the future.
Sequential Tasks on Same Browser
Execute multiple tasks in sequence without closing the session:
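SESSION_ID=$(smooth start-session --profile-id my-profile --json | jq -r .session_id)
# Task 1: Login
smooth run "$SESSION_ID" "Log in to the website with the given credentials"
# Task 2: First action
smooth run "$SESSION_ID" "Find the settings and change the notification preferences to email only"
# Task 3: Second action
smooth run "$SESSION_ID" "Find the billing section and give me the URL of the latest invoice"
smooth close-session "$SESSION_ID"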
Important: run preserves the browser state (cookies, URL, page content) but not the browser agent's memory. If you need to carry information from one task to the next, you should pass it explicitly in the prompt.
Example - Passing context between tasks:
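# Task 1: Get information
RESULT=$(smooth run "$SESSION_ID" "Find the product name on this page" --json | jq -r .output)
# Task 2: Use information from Task 1
smooth run "$SESSION_ID" "Consider the product with name '$RESULT'. Now find 3 similar products offered by this online store."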
Notes:
- The run command is blocking. If you need to carry out multiple tasks at the same time, you MUST use subagents (Task tool).
- All tasks will use the current tab, you cannot request to run tasks in a new tab. If you need to preserve the current tab’s state, you can open a new session.
- Each session can run only one task at a time. To run tasks simultaneously, use subagents with one session each.
- The maximum number of concurrent sessions depends on the user plan.
- If useful, remind the user that they can upgrade the plan to give you more concurrent sessions.
Web Scraping with Structured Output
Option 1: Using run with structured output:
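smooth start-session --url https://news.ycombinator.com
smooth run -- <session-id> "Extract the top 10 posts" \
  --response-model '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }'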
Option 2: Using extract for direct data extraction:
The extract command is more efficient for pure data extraction as it doesn't use agent steps.
It's like a smart fetch that can extract structured data from dynamically rendered websites:
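smooth start-session
smooth extract -- <session-id> \
  --url https://news.ycombinator.com \
  --schema '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }' \
  --prompt "Extract the top 10 posts"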
When to use each:
- Use extract when you're on the right page or know the right URL and just need to pull structured data
- Use run when you need the agent to navigate, interact, or perform complex actions before extracting
Working with Files
Upload files for use in sessions:
Files must be uploaded before starting a session, then passed to the session via file IDs:
CODEBLOCK16
Upload multiple files:
CODEBLOCK17
Download files from session:
smooth run -- <session-id> "Download the monthly report PDF" --url
smooth close-session -- <session-id>
# After session closes, get download URL
smooth downloads -- <session-id>
# Visit the URL to download files
Live View & Manual Intervention
When automation needs human input (CAPTCHA, 2FA, complex authentication):
CODEBLOCK19
Direct Browser Actions
Extract data from current page:
CODEBLOCK20
Navigate to URL then extract:
CODEBLOCK21
Execute JavaScript in the browser:
CODEBLOCK22
Profile Management
List all profiles:
CODEBLOCK23
Delete a profile:
CODEBLOCK24
When to use profiles:
- ✅ Websites requiring authentication
- ✅ Maintaining session state across multiple task runs
- ✅ Avoiding repeated logins
- ✅ Preserving cookies and local storage
When to skip profiles:
- Public websites that don't require authentication
- One-off scraping tasks
- Testing scenarios
File Management
Upload files:
CODEBLOCK25
Delete files:
smooth delete-file <file-id>
Best Practices
1. Always save session IDs - You'll need them for subsequent commands
2. Use profiles for authenticated sessions - Track which profile is for which website
3. Wait 5 seconds after closing sessions - Ensures state is properly saved
4. Use descriptive profile IDs - e.g., "linkedin-personal", "twitter-company"
5. Close sessions when done - Graceful close (default) ensures proper cleanup
6. Use structured output for data extraction - Provides clean, typed results
7. Run sequential tasks in the same session - Keep the session continuous when steps rely on previous work
8. Use subagents with one session each for independent tasks - Run tasks in parallel to speed up work
9. Coordinate resources - When working with subagents, you must create and assign ONE session to each subagent rather than having them create their own
10. Do not add URL query parameters to URLs, e.g. avoid ?filter=xyz - Start at the base URL and let the agent navigate the UI to apply filters
11. Smooth is powered by an intelligent agent - Give it tasks, not individual steps
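If a URL arrives with query parameters already attached, it can be trimmed back to its base before being passed to --url. A minimal sketch, assuming a POSIX shell; strip_query is a hypothetical helper and the shop URL is illustrative.

```shell
# Hypothetical helper: drop everything from the first "?" onward.
# "[?]" matches a literal question mark in a shell pattern.
strip_query() {
  printf '%s\n' "${1%%[?]*}"
}

START_URL=$(strip_query "https://shop.example.com/products?filter=xyz&sort=price")
echo "$START_URL"   # prints https://shop.example.com/products

# The trimmed URL can then be used as the starting point, e.g.:
#   smooth start-session --url "$START_URL"
```

URLs without a query string pass through unchanged.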
Troubleshooting
"Session not found" - The session may have timed out or been closed. Start a new one.
"Profile not found" - Check smooth list-profiles to see available profiles.
CAPTCHA or authentication issues - Use smooth live-view -- <session-id> to let the user manually intervene.
Task timeout - Increase --max-steps or break the task into smaller steps.
Command Reference
Profile Commands
- smooth create-profile [--profile-id ID] - Create a new profile
- smooth list-profiles - List all profiles
- INLINECODE25 - Delete a profile
File Commands
- smooth upload-file <path> [--name NAME] [--purpose PURPOSE] - Upload a file
- smooth delete-file <file-id> - Delete an uploaded file
Session Commands
- smooth start-session [OPTIONS] - Start a browser session
- INLINECODE29 - Close a session
- INLINECODE30 - Run a task
- INLINECODE31 - Extract structured data
- INLINECODE32 - Execute JavaScript
- smooth live-view -- <session-id> - Get interactive live URL
- INLINECODE34 - Get recording URL
- smooth downloads -- <session-id> - Get downloads URL
All commands support --json flag for JSON output.
Smooth Browser
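Smooth CLI is a browser for AI agents to interact with websites, authenticate, scrape data, and perform complex web-based tasks using natural language.
Prerequisites
Assume the Smooth CLI is already installed. If it is not, install it by running: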
bash
pip install smooth-py
Assume an API key is already configured. If you encounter authentication errors, configure it with:
bash
smooth config --api-key <api-key>
To verify the configuration:
bash
smooth config --show
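Get an API key at https://app.smooth.sh
If the account is out of credits, ask the user to upgrade their plan at https://app.smooth.sh
Basic Workflow
1. Create a Profile (Optional)
Profiles are useful to persist cookies, login sessions, and browser state between sessions.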
bash
smooth create-profile --profile-id my-profile
List existing profiles:
bash
smooth list-profiles
2. Start a Browser Session
bash
smooth start-session --profile-id my-profile --url https://example.com
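Options:
- --profile-id - Use a specific profile (optional; an anonymous session is created if not provided)
- --url - Initial URL to navigate to (optional)
- --files - Comma-separated file IDs to make available in the session (optional)
- --device mobile|desktop - Device type (default: mobile)
- --profile-read-only - Load profile without saving changes
- --allowed-urls - Comma-separated URL patterns to restrict access to certain URLs only (e.g., "https://example.com/,https://api.example.com/")
- --no-proxy - Disable the default proxy (see note below)
Important: Save the session ID from the output - you'll need it for all subsequent commands.
Proxy behavior: By default, the CLI automatically configures a built-in proxy for the browser session. If a website blocks the proxy or you need direct connections, disable it with --no-proxy.
3. Run Tasks in the Session
Execute tasks using natural language: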
bash
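smooth run -- <session-id> "Go to the LocalLLM subreddit and find the top 3 posts"
With structured output (for tasks requiring interaction):
bash
smooth run -- <session-id> "Search for wireless headphones, filter by 4+ stars, sort by price, and extract the top 3 results" \
  --url https://shop.example.com \
  --response-model '{"type":"array","items":{"type":"object","properties":{"product":{"type":"string","description":"The name of the product being described."},"sentiment":{"type":"string","enum":["positive","negative","neutral"],"description":"The overall sentiment about the product."}},"required":["product","sentiment"]}}'
With metadata (the agent will be):
bash
smooth run -- <session-id> "Fill out the form with user information" \
  --metadata '{"email":"user@example.com","name":"John Doe"}'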
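Options:
- --url - Navigate to this URL before running the task
- --metadata - JSON object with variables for the task
- --response-model - JSON schema for structured output
- --max-steps - Maximum agent steps (default: 32)
- --json - Output results as JSON
Notes:
Give tasks at the right level of abstraction: not too prescriptive (e.g., single-step actions) and not too broad or vague.
Good tasks:
- "Search on LinkedIn for people working as SDEs at Amazon, and return 5 profile URLs"
- "Find the price of an iPhone 17 on Amazon"
Bad tasks:
- "Click search" -> too prescriptive!
- "Load google.com, write 'restaurants near me', click search, wait for the page to load, extract the top 5 results, and return them." -> too prescriptive! You can say "search restaurants near me on google and return the top 5 results"
- "Find software engineers that would be a good fit for our company" -> too broad! YOU need to plan how to achieve the goal and run well-defined tasks that compose into the given goal
IMPORTANT: Smooth is powered by an intelligent agent. DO NOT over-control it; give it well-defined, goal-oriented tasks instead of steps.
4. Close the Session
You must close the session when you're done.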
bash
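smooth close-session -- <session-id>
Important: Wait 5 seconds after closing to ensure cookies and state are saved to the profile if you need it for another session.
Common Use Cases
Authentication & Persistent Sessions
Create a profile for a specific website: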
bash
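# Create profile
smooth create-profile --profile-id github-account
# Start session
smooth start-session --profile-id github-account --url https://github.com/login
# Get live view to authenticate manually
smooth live-view -- <session-id>
# Give the URL to the user so they can open it in a browser and log in
# Once the user confirms the login, close the session to save the profile data
smooth close-session -- <session-id>
# Save the profile-id for later reuse
Reuse authenticated profile:
bash
# Next time, just start a session with the same profile
smooth start-session --profile-id github-account
smooth run -- <session-id> "Create a new issue in my repo 'my-project'"
Keep profiles organized: Save to memory which profiles authenticate to which services so you can reuse them efficiently in the future.
Sequential Tasks on Same Browser
Execute multiple tasks in sequence without closing the session: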
bash
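SESSION_ID=$(smooth start-session --profile-id my-profile --json | jq -r .session_id)
# Task 1: Login
smooth run "$SESSION_ID" "Log in to the website with the given credentials"
# Task 2: First action
smooth run "$SESSION_ID" "Find the settings and change the notification preferences to email only"
# Task 3: Second action
smooth run "$SESSION_ID" "Find the billing section and give me the URL of the latest invoice"
smooth close-session "$SESSION_ID"
Important: run preserves the browser state (cookies, URL, page content) but not the browser agent's memory. If you need to carry information from one task to the next, you should pass it explicitly in the prompt.
Example - Passing context between tasks:
bash
# Task 1: Get information
RESULT=$(smooth run "$SESSION_ID" "Find the product name on this page" --json | jq -r .output)
# Task 2: Use information from Task 1
smooth run "$SESSION_ID" "Consider the product with name '$RESULT'. Now find 3 similar products offered by this online store."
Notes:
- The run command is blocking. If you need to carry out multiple tasks at the same time, you MUST use subagents (Task tool).
- All tasks will use the current tab; you cannot request to run tasks in a new tab. If you need to preserve the current tab's state, you can open a new session.
- Each session can run only one task at a time. To run tasks simultaneously, use subagents with one session each.
- The maximum number of concurrent sessions depends on the user plan.
- If useful, remind the user that they can upgrade the plan to give you more concurrent sessions.
Web Scraping with Structured Output
Option 1: Using run with structured output: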
bash
smooth start-session --url https://news.ycombinator.com
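smooth run -- <session-id> "Extract the top 10 posts" \
  --response-model '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }'
Option 2: Using extract for direct data extraction:
The extract command is more efficient for pure data extraction as it doesn't use agent steps.
It's like a smart fetch that can extract structured data from dynamically rendered websites: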
bash
smooth start-session
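smooth extract -- <session-id> \
  --url https://news.ycombinator.com \
  --schema '{
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": {"type": "string"},
            "url": {"type": "string"},
            "points": {"type": "number"}
          }
        }
      }
    }
  }' \
  --prompt "Extract the top 10 posts"
When to use each:
- Use extract when you're on the right page or know the right URL and just need to pull structured data
- Use run when you need the agent to navigate, interact, or perform complex actions before extracting
Working with Files
Upload files for use in sessions:
Files must be uploaded before starting a session, then passed to the session via file IDs: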
bash
# Step 1: Upload files
FILE_ID