MrScraper

Run AI-powered, unblockable web scraping and data extraction with natural language via the MrScraper API.

Actions

This skill supports:

- Opening blocked pages through the unblocker (stealth browser + IP rotation)
- Starting AI scraper runs from natural-language instructions
- Rerunning existing scraper configurations on one or multiple URLs
- Running manual workflow-based reruns
- Fetching paginated results and detailed results by ID

This skill is API-only and does not depend on bundled local scripts.

Base URLs

- Unblocker API: `https://api.mrscraper.com`
- Platform API: `https://api.app.mrscraper.com`

Authentication

Unblocker API auth

Use query-parameter auth on the unblocker endpoint:

- `token=<MRSCRAPER_API_TOKEN>`

Platform API auth

Use header-based auth on platform endpoints:

```http
x-api-token: <MRSCRAPER_API_TOKEN>
accept: application/json
content-type: application/json
```

How to get `MRSCRAPER_API_TOKEN`?

An API token lets your applications securely interact with MrScraper APIs and rerun scrapers created in the dashboard.

Follow these steps in the dashboard:

1. Click your User Profile at the top-right corner.
2. Select API Tokens.
3. Click New Token.
4. Enter a name and set an expiration date.
5. Click Create.
6. Copy the new token and store it securely as `MRSCRAPER_API_TOKEN`.
7. Use it in requests through the `x-api-token` header.

Security rules:

- Never expose tokens in client-side code (browser/mobile app bundles).
- Store tokens in environment variables or server-side secret managers.
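
As an illustration of the security rules above, the following minimal sketch reads the token from an environment variable and builds the Platform API headers. The helper names `platform_headers` and `redact` are hypothetical and not part of any MrScraper SDK; the header names are the ones this document lists.

```python
import os


def platform_headers(env=os.environ):
    """Build Platform API headers from the MRSCRAPER_API_TOKEN environment variable."""
    token = env.get("MRSCRAPER_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("MRSCRAPER_API_TOKEN is not set")
    return {
        "x-api-token": token,
        "accept": "application/json",
        "content-type": "application/json",
    }


def redact(token):
    """Mask a token for log output, keeping at most the last four characters."""
    if len(token) <= 4:
        return "*" * len(token)
    return "*" * (len(token) - 4) + token[-4:]
```

Passing `redact(token)` to loggers instead of the raw value is one way to keep tokens out of logs and output.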

Notes from the auth docs:

- The API key works for all V3 Platform endpoints.
- The same key can be used for endpoints on `sync.scraper.mrscraper.com`.
- For access to endpoints on other hosts, contact support@mrscraper.com.

Install and Runtime

- No local install step is required by this skill document.
- No bundled `scripts/` directory is required.
- Calls are direct HTTPS requests to the two base URLs above.

Data and Scope

- Data is sent only to `api.app.mrscraper.com` and `api.mrscraper.com`.
- Responses may contain extracted page content and scrape metadata.
- This skill does not define hidden persistence or background jobs.
- Never expose tokens in logs, commits, or output.

Endpoints

1. Unblocker

- Method: `GET`
- URL: `https://api.mrscraper.com`
- Auth: `token` query parameter

Opens a target URL through stealth browsing and IP rotation, then returns HTML. Use this when direct access is blocked by captcha or anti-bot protections.

Query parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `token` | string | Yes | — | Unblocker token (`MRSCRAPER_API_TOKEN`) |
| `url` | … | … | … | … |

Request example:

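```bash
curl --location "https://api.mrscraper.com?token=<MRSCRAPER_API_TOKEN>&timeout=120&geoCode=SG&url=https%3A%2F%2Fwww.lazada.sg%2Fproducts%2Fpdp-i111650098-s23209659764.html&blockResources=false"
```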

Response example:

CODEBLOCK2

Notes:

- Prefer an explicit `geoCode` and practical timeouts for repeatable behavior.
- Only pass cookies when session-specific content is required.
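
The unblocker call can be sketched in Python as a URL builder that percent-encodes the target URL. This is a minimal sketch, not an official client: the function name `build_unblocker_url` is hypothetical, the parameter names (`token`, `timeout`, `geoCode`, `url`, `blockResources`) are the ones used in this document's curl request, and the `timeout=120` value is taken from that example rather than from a documented default.

```python
from urllib.parse import urlencode

UNBLOCKER_BASE = "https://api.mrscraper.com"


def build_unblocker_url(token, target_url, geo_code, timeout=120, block_resources=False):
    """Return the full unblocker GET URL with all query parameters encoded."""
    params = {
        "token": token,
        "timeout": timeout,  # example value, not a documented default
        "geoCode": geo_code,
        "url": target_url,  # urlencode percent-encodes ':' '/' '?' '=' here
        "blockResources": "true" if block_resources else "false",
    }
    return f"{UNBLOCKER_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder token only; never print or log a real token.
    print(build_unblocker_url("TOKEN", "https://example.com/a?b=1", "SG"))
```

Encoding the target URL matters because an unencoded `&` or `?` in it would otherwise be read as part of the unblocker's own query string.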

2. Create AI Scraper

- Method: `POST`
- Host: `https://api.app.mrscraper.com`
- Path: `/api/v1/scrapers-ai`
- Auth: `x-api-token`

Create a new AI scraper run from natural-language instructions.

Payload parameters (for `agent`: `general` or `agent`: `listing`):

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `url` | string | Yes | — | Target URL |
| `message` | … | … | … | … |

Payload parameters (for `agent`: `map`):

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `url` | string | Yes | — | Target URL |
| `agent` | string | No | `map` | The AI agent type to use for scraping (in this case, `map`) |
| `maxDepth` | number | No | 2 | Maximum depth level for crawling links from the starting URL. 0 = only the starting URL, 1 = starting URL plus direct links |
| `maxPages` | number | No | 50 | Maximum number of pages to scrape during the crawling process |
| `limit` | number | No | 1000 | Maximum number of data records to extract across all pages. Scraping stops when this limit is reached |
| `includePatterns` | string | No | `""` | Regex patterns to include (separate multiple with `\|\|`) |
| `excludePatterns` | string | No | `""` | Regex patterns to exclude (separate multiple with `\|\|`) |

Request example:

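```bash
curl -X POST "https://api.app.mrscraper.com/api/v1/scrapers-ai" \
  -H "x-api-token: <MRSCRAPER_API_TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    "message": "Extract title, price, stock availability, and rating",
    "agent": "general"
  }'
```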

Response example:

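```json
{
  "id": "497f6eca-6276-4993-bfeb-53cbbbba6f08",
  "createdAt": "2019-08-24T14:15:22Z",
  "createdById": "e13e432a-5323-4484-a91d-b5969bc564d9",
  "updatedAt": "2019-08-24T14:15:22Z",
  "updatedById": "d8bc6076-4141-4a88-80b9-0eb31643066f",
  "deletedAt": "2019-08-24T14:15:22Z",
  "deletedById": "8ef578ad-7f1e-4656-b48b-b1b4a9aaa1cb",
  "userId": "2c4a230c-5085-4924-a3e1-25fb4fc5965b",
  "scraperId": "6695bf87-aaa6-46b0-b1ee-88586b222b0b",
  "type": "AI",
  "url": "http://example.com",
  "status": "Completed",
  "error": "string",
  "tokenUsage": 0,
  "runtime": 0,
  "data": {}, // main scraped data
  "htmlPath": "string",
  "recordingPath": "string",
  "screenshotPath": "string",
  "dataPath": "string"
}
```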

Notes:

- Choose the agent type deliberately; each agent is specialized for a particular use case.
- Use `general` for most standard web scraping tasks. It is the go-to agent when the user does not specify one or the connected LLM is not confident about the type of page. It is used mostly for product pages but handles other page types well.
- Use `listing` for listing pages such as product listings or job listings. Choose it when the connected LLM can confidently identify the given URL as a listing page.
- Use `map` to crawl a website and collect its subdomains or subpages. Choose it when the user specifies that the given URL is a website rather than a specific page.
- The `map` agent accepts extra arguments that control crawling:
  - `maxDepth`: use lower values (1–2) for focused scraping; a maximum of 3 is recommended.
  - `maxPages`: limits total pages regardless of depth.
  - `limit`: caps total records extracted.
  - `includePatterns` / `excludePatterns`: regex patterns separated by `||` that specify which URLs to crawl or skip, e.g. `*/products/*||*/blog/*` or `*/cart/*||*.pdf`.
- If `includePatterns` is an empty string, all URLs are included. If `excludePatterns` is an empty string, no URLs are excluded.
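
The agent choice and the `map` arguments described above can be sketched as two small helpers. This is an illustrative sketch under stated assumptions: the function names and the boolean inputs to `choose_agent` are hypothetical, while the field names and defaults come from the `map` parameter table.

```python
import json


def choose_agent(is_whole_site=False, is_listing=False):
    """Pick an agent name following the notes above; falls back to 'general'."""
    if is_whole_site:
        return "map"
    if is_listing:
        return "listing"
    return "general"


def _join_patterns(patterns):
    """Join one or more patterns into the '||'-separated string the API expects."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return "||".join(patterns)


def build_map_payload(url, max_depth=2, max_pages=50, limit=1000,
                      include_patterns=(), exclude_patterns=()):
    """Assemble a request body for the map agent using the documented defaults."""
    if not url:
        raise ValueError("url is required")
    if min(max_depth, max_pages, limit) < 0:
        raise ValueError("maxDepth, maxPages, and limit must be non-negative")
    return {
        "url": url,
        "agent": "map",
        "maxDepth": max_depth,
        "maxPages": max_pages,
        "limit": limit,
        # An empty string means "include everything" / "exclude nothing".
        "includePatterns": _join_patterns(include_patterns),
        "excludePatterns": _join_patterns(exclude_patterns),
    }


if __name__ == "__main__":
    body = build_map_payload("https://example.com",
                             include_patterns=["*/products/*", "*/blog/*"])
    print(json.dumps(body, indent=2))
```

Keeping the patterns as a Python list until the last moment avoids hand-assembling the `||` separator in calling code.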

3. Rerun AI Scraper

- Method: INLINECODE77
- Host: INLINECODE78
- Path: INLINECODE79
- Auth: INLINECODE80

Reruns an existing scraper configuration on a new URL.

Payload parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| INLINECODE81 | INLINECODE82 | Yes | — | Scraper ID retrieved from the created AI scraper |
| INLINECODE83 | string | Yes | — | Target URL |

Optional payload parameters for `map` agent:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `maxDepth` | number | No | 2 | Crawl depth |
| `maxPages` | number | No | 50 | Maximum pages to crawl |
| `limit` | number | No | 1000 | Result limit |
| `includePatterns` | string | No | `""` | Regex patterns to include (separate multiple with `\|\|`) |
| `excludePatterns` | string | No | `""` | Regex patterns to exclude (separate multiple with `\|\|`) |

Request example:

CODEBLOCK5

Response example:

CODEBLOCK6

4. Bulk Rerun AI Scraper

- Method: INLINECODE93
- Host: INLINECODE94
- Path: INLINECODE95
- Auth: INLINECODE96

Runs one scraper configuration over multiple URLs.

Payload parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| INLINECODE97 | INLINECODE98 | Yes | — | Existing AI scraper configuration ID |
| INLINECODE99 | array[string] | Yes | — | Target URLs to run |

Request example:

CODEBLOCK7

Response example:

CODEBLOCK8
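
Before a bulk rerun, the URL list can be checked and de-duplicated locally. The sketch below is illustrative only: `build_bulk_payload` is a hypothetical helper, and the default key names `scraperId` and `urls` are assumptions, since the field names in the table above did not survive in this copy. Pass the real names via `id_key` and `urls_key`.

```python
from urllib.parse import urlparse


def build_bulk_payload(scraper_id, urls, id_key="scraperId", urls_key="urls"):
    """Validate inputs and assemble a bulk rerun body.

    id_key and urls_key default to ASSUMED field names; override them
    with the names from the official API reference.
    """
    if not scraper_id:
        raise ValueError("scraper ID is required")
    unique_urls = list(dict.fromkeys(urls))  # drop duplicates, keep order
    if not unique_urls:
        raise ValueError("at least one target URL is required")
    for url in unique_urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return {id_key: scraper_id, urls_key: unique_urls}
```

Rejecting malformed URLs before the request avoids spending a bulk run on entries that cannot succeed.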

5. Rerun Manual Scraper

- Method: INLINECODE101
- Host: INLINECODE102
- Path: INLINECODE103
- Auth: INLINECODE104

Executes a rerun using a manual browser workflow.

Creating a Manual Scraper

Before calling the manual rerun endpoint, you need to create and save a manual scraper from the dashboard. Follow these steps:

1. Open the MrScraper dashboard and go to Scraper.
2. Click New Manual Scraper +.
3. Enter your target URL.
4. Add workflow steps that match your site's behavior (e.g., Input, Click, Delay, Extract, Inject JavaScript).
5. Configure pagination if needed (using options like Query Pagination, Directory Pagination, or Next Page Link).
6. Test and save the scraper, then copy its `scraperId` to use in API reruns.

Payload parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| INLINECODE117 | INLINECODE118 | Yes | — | ID of the manual scraper to rerun |
| INLINECODE119 | … | … | … | … |

Request example:

CODEBLOCK9

Response example:

CODEBLOCK10

6. Bulk Rerun Manual Scraper

- Method: INLINECODE123
- Host: INLINECODE124
- Path: INLINECODE125
- Auth: INLINECODE126

Runs one scraper configuration over multiple URLs.

Payload parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| INLINECODE127 | INLINECODE128 | Yes | — | Existing manual scraper configuration ID |
| INLINECODE129 | array[string] | Yes | — | Target URLs to run |

Request example:

CODEBLOCK11

Response example:

CODEBLOCK12

7. Fetch Results

- Method: INLINECODE131
- Host: INLINECODE132
- Path: INLINECODE133
- Auth: INLINECODE134

Returns paginated scrape results.

Query parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| INLINECODE135 | string | Yes | INLINECODE136 | Sort column |
| INLINECODE137 | … | … | … | … |

Notes:

- `sortField` options: `createdAt`, `updatedAt`, `id`, `type`, `url`, `status`, `error`, `tokenUsage`, INLINECODE155
- INLINECODE156 options: `ASC`, INLINECODE158
- INLINECODE159 options: `createdAt`, INLINECODE161

Request example:

CODEBLOCK13

Response example:

CODEBLOCK14

8. Fetch Detailed Result by ID

- Method: INLINECODE162
- Host: INLINECODE163
- Path: INLINECODE164
- Auth: INLINECODE165

Returns one detailed result object for a specific result ID.

Query parameters:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| INLINECODE166 | INLINECODE167 | Yes | — | Result ID |

Request example:

CODEBLOCK15

Response example:

CODEBLOCK16

Errors

Standard platform API errors:

| Status | Meaning |
| --- | --- |
| INLINECODE168 | Invalid request payload |
| INLINECODE169 | Missing or invalid API token |
| 404 | Scraper or result not found |
| 429 | Rate limit exceeded |
| 500 | Internal scraper error |

Error format:

CODEBLOCK17

Operating Rules

- Validate required fields before every call.
- Use pagination for large result sets.
- Retry on 429 with exponential backoff.
- Never expose credentials in outputs.
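
The retry rule above can be sketched as a small wrapper. This is a minimal illustration under assumptions: `call_with_backoff` is a hypothetical helper, and `send` stands for any caller-supplied function that performs one request and returns `(status_code, body)`. The delay doubles after each 429 response; other statuses are returned immediately.

```python
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    Waits base_delay * 2**attempt seconds between attempts (1s, 2s, 4s, ...).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status, body  # still rate limited after the final attempt


if __name__ == "__main__":
    # Simulated responses: two rate-limit replies, then success.
    replies = iter([(429, ""), (429, ""), (200, "ok")])
    print(call_with_backoff(lambda: next(replies), base_delay=0.01))
```

Adding a small random jitter to each delay is a common refinement when several clients share one rate limit; it is left out here to keep the behavior deterministic.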
