Apify Scraper Skill
Use this skill when you need to scrape content from sites that block bots — Twitter/X threads, Reddit, LinkedIn, YouTube comments, Google SERP, Amazon, Product Hunt, etc.
When to Use
- A Twitter/X URL is shared and you need the full thread (not just the first tweet)
- You need Reddit thread content without the expensive API
- LinkedIn company/profile data is needed
- YouTube comments or video metadata beyond what the API gives
- Google search results programmatically
- Any site that blocks standard web_fetch
Do NOT use for: sites accessible via normal web_fetch or web_search. Apify costs credits, so use it only when needed.
Setup
- API Key: op://OpenClaw/Apify API Credentials/credential (also in gateway plist as APIFY_API_KEY)
- Dashboard: https://console.apify.com (account: redditech)
- Plan: FREE ($5/mo credit)
- Script: python3 scripts/apify-run.py
Running an Actor
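python3 /Users/loki/.openclaw/workspace/scripts/apify-run.py \
  apidojo/tweet-scraper \
  '{"twitterHandles": ["solanamobile"], "maxItems": 50}'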
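Hand-quoting JSON on a shell command line is easy to get wrong, so it can help to build the invocation programmatically. The sketch below assumes the script takes two positional arguments, the actor ID and the actor input as a JSON string. build_command and run_actor are hypothetical helper names, and the idea that results arrive on stdout is an assumption, not something this document confirms.

```python
import json
import subprocess

# Script path as given in this document; adjust if your workspace differs.
SCRIPT = "scripts/apify-run.py"


def build_command(actor_id, actor_input, script=SCRIPT):
    """Return the argv list for one actor run.

    Assumes the script accepts two positional arguments:
    the actor ID and the actor input encoded as a JSON string.
    """
    return ["python3", script, actor_id, json.dumps(actor_input)]


def run_actor(actor_id, actor_input, timeout_s=300):
    """Run the script and return its stdout (hypothetical wrapper).

    Passing argv as a list avoids shell quoting problems entirely.
    """
    result = subprocess.run(
        build_command(actor_id, actor_input),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout_s,
    )
    return result.stdout


if __name__ == "__main__":
    cmd = build_command(
        "apidojo/tweet-scraper",
        {"twitterHandles": ["solanamobile"], "maxItems": 50},
    )
    print(cmd)
```

Because json.dumps produces the input string, the quotes around keys and string values cannot be dropped by accident.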
Key Actors
Twitter/X
- apidojo/tweet-scraper: $0.40/1K tweets. Full thread support via conversationIds. Advanced search syntax.
{"conversationIds": ["2034675043033375103"], "maxItems": 50}
or by handle:
{"twitterHandles": ["solanamobile"], "maxItems": 20}
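When a Twitter/X URL is shared, the conversationIds input has to be derived from it. A minimal sketch, assuming the shared URL points at the first tweet of the thread, so that its status ID doubles as the conversation ID; for a URL that points at a reply this assumption may not hold. tweet_id_from_url and thread_input are hypothetical helpers.

```python
import re

# Matches the numeric ID in URLs such as https://x.com/<user>/status/<id>
_STATUS_RE = re.compile(
    r"^https?://(?:www\.|mobile\.)?(?:twitter|x)\.com/[^/]+/status/(\d+)"
)


def tweet_id_from_url(url):
    """Return the numeric status ID from a tweet URL, or None if it does not match."""
    match = _STATUS_RE.match(url.strip())
    return match.group(1) if match else None


def thread_input(url, max_items=50):
    """Build a conversationIds input for apidojo/tweet-scraper.

    Assumption: the shared URL points at the first tweet of the thread,
    so its status ID is also the conversation ID.
    """
    tweet_id = tweet_id_from_url(url)
    if tweet_id is None:
        raise ValueError(f"not a tweet URL: {url!r}")
    return {"conversationIds": [tweet_id], "maxItems": max_items}
```

The ID is kept as a string, matching the input shape used in this section.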
Reddit
- trudax/reddit-scraper-lite: Free tier friendly. Fetches threads + comments.
{"startUrls": [{"url": "https://reddit.com/r/solana/comments/..."}], "maxItems": 100}
YouTube
- streamers/youtube-scraper: Comments + metadata.
{"startUrls": [{"url": "https://youtube.com/watch?v=..."}], "maxComments": 200}
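The Reddit and YouTube actors listed here both take their targets as a startUrls list of {"url": ...} objects. A small hypothetical helper, start_urls_input, can build that shape. The limit names it passes through (maxItems, maxComments) are the ones used in this skill; whether a given actor honours others is not checked here.

```python
def start_urls_input(urls, **limits):
    """Build an input dict of the form {"startUrls": [{"url": ...}], ...}.

    Extra keyword arguments (for example maxItems or maxComments) are
    copied through unchanged; which limits an actor honours depends on
    that actor's input schema.
    """
    if isinstance(urls, str):
        urls = [urls]
    payload = {"startUrls": [{"url": u} for u in urls]}
    payload.update(limits)
    return payload


if __name__ == "__main__":
    print(start_urls_input("https://youtube.com/watch?v=...", maxComments=200))
```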
Google SERP
- apify/google-search-scraper: Search results as structured data.
{"queries": "solana mobile grants", "maxPagesPerQuery": 1}
LinkedIn
- anchor/linkedin-profile-scraper: ⚠️ ToS risk. Use sparingly for research only.
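Since Apify costs credits, one way to keep usage deliberate is to route only known-blocked domains to an actor and send everything else to the normal fetch tools. The mapping below is a sketch built from the actors in this section. pick_actor and ACTOR_BY_DOMAIN are hypothetical names, and Google SERP is left out because it takes a query rather than a URL.

```python
from urllib.parse import urlparse

# Domain -> actor, using only the actors listed in this skill.
ACTOR_BY_DOMAIN = {
    "twitter.com": "apidojo/tweet-scraper",
    "x.com": "apidojo/tweet-scraper",
    "reddit.com": "trudax/reddit-scraper-lite",
    "youtube.com": "streamers/youtube-scraper",
    "linkedin.com": "anchor/linkedin-profile-scraper",  # ToS risk: use sparingly
}


def pick_actor(url):
    """Return the actor ID for a URL, or None when no listed actor applies.

    None means: fall back to the normal fetch tools rather than spend credits.
    """
    host = (urlparse(url).hostname or "").lower()
    for domain, actor in ACTOR_BY_DOMAIN.items():
        # Match the bare domain or any subdomain (www., old., m., ...).
        if host == domain or host.endswith("." + domain):
            return actor
    return None
```

Matching on the dot-prefixed suffix keeps unrelated hosts that merely end in the same letters from being routed to an actor.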
Pricing Reference
- 1 CU (compute unit) = 1 GB RAM × 1 hour
- Free tier: $5/mo (~16.7 CU)
- Tweet scraping: ~0.035–0.04 CU/1K tweets (~$0.01/1K on free tier)
- Some actors charge flat per-result: $0.25–$0.40/1K tweets
- Check usage: https://console.apify.com/billing
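To make the figures above concrete, the arithmetic can be sketched as follows. The per-CU rate is not stated directly; it is inferred here from $5 buying roughly 16.7 CU, which works out to about $0.30 per CU. Function names are illustrative.

```python
FREE_CREDIT_USD = 5.00  # monthly free-tier credit
FREE_TIER_CU = 16.7     # approximate CU that credit buys (figure from this section)
USD_PER_CU = FREE_CREDIT_USD / FREE_TIER_CU  # inferred rate, about $0.30 per CU


def compute_units(memory_gb, runtime_seconds):
    """1 CU = 1 GB RAM x 1 hour."""
    return memory_gb * runtime_seconds / 3600


def cu_cost_usd(cu):
    """Dollar cost of a number of compute units at the inferred free-tier rate."""
    return cu * USD_PER_CU


def flat_cost_usd(results, usd_per_thousand):
    """Cost for actors that bill a flat price per 1,000 results."""
    return results / 1000 * usd_per_thousand


if __name__ == "__main__":
    print(f"rate: ${USD_PER_CU:.3f}/CU")
    print(f"1K tweets at 0.035 CU: ${cu_cost_usd(0.035):.4f}")
    print(f"1K tweets at 0.04 CU:  ${cu_cost_usd(0.04):.4f}")
    print(f"50 tweets at $0.40/1K: ${flat_cost_usd(50, 0.40):.2f}")
```

At the inferred rate, 0.035 to 0.04 CU per 1K tweets comes to roughly $0.010 to $0.012, which matches the ~$0.01/1K figure, while a flat $0.40/1K actor would charge $0.02 for a 50-tweet thread.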
Notes
- Results are returned as a dataset; the script polls until the run completes
- Timeout: 5 minutes default (most actors finish in 30–60s)
- If an actor breaks (community-maintained), check Apify Store for alternatives
- MCP integration pending: an Apify MCP server exists, but openclaw.json doesn't support the mcpServers key yet (schema validation rejects it). Use this script approach instead.
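The poll-until-complete behaviour with a timeout can be sketched like this. It is not the actual contents of apify-run.py: get_status stands in for whatever HTTP call the script makes, and the status strings are assumed to follow Apify's run statuses (SUCCEEDED, FAILED, ABORTED, TIMED-OUT).

```python
import time

# Status names are assumptions modelled on Apify's run statuses.
TERMINAL_OK = {"SUCCEEDED"}
TERMINAL_FAILED = {"FAILED", "ABORTED", "TIMED-OUT"}


def wait_for_run(get_status, timeout_s=300.0, interval_s=2.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until the run finishes or timeout_s elapses.

    get_status is any zero-argument callable returning the current status
    string; in a real script it would wrap an HTTP call to the Apify API.
    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_OK:
            return status
        if status in TERMINAL_FAILED:
            raise RuntimeError(f"actor run ended with status {status}")
        if clock() >= deadline:
            raise TimeoutError(f"run still {status} after {timeout_s:.0f}s")
        sleep(interval_s)
```

The default timeout_s of 300 mirrors the 5-minute default mentioned in the notes; the 2-second poll interval is an arbitrary choice for illustration.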
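Apify Scraper Skill
Use this skill when you need to scrape content from sites that block bots: Twitter/X threads, Reddit, LinkedIn, YouTube comments, Google search results, Amazon, Product Hunt, and so on.
When to Use
- A Twitter/X link is shared and you need the full thread (not just the first tweet)
- You need Reddit thread content but don't want to use the expensive API
- You need LinkedIn company/profile data
- You need YouTube comments or video metadata that the API cannot provide
- You need Google search results programmatically
- Any site that blocks standard web_fetch
Do not use for: sites accessible via normal web_fetch or web_search. Apify consumes credits, so use it only when necessary.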
Setup
- API Key: op://OpenClaw/Apify API Credentials/credential (also in gateway plist as APIFY_API_KEY)
- Dashboard: https://console.apify.com (account: redditech)
- Plan: Free ($5/mo credit)
- Script: python3 scripts/apify-run.py
Running an Actor
python3 /Users/loki/.openclaw/workspace/scripts/apify-run.py \
  apidojo/tweet-scraper \
  '{"twitterHandles": ["solanamobile"], "maxItems": 50}'
Key Actors
Twitter/X
- apidojo/tweet-scraper: $0.40 per 1,000 tweets. Full thread support via conversationIds. Advanced search syntax.
{"conversationIds": ["2034675043033375103"], "maxItems": 50}
or by handle:
{"twitterHandles": ["solanamobile"], "maxItems": 20}
Reddit
- trudax/reddit-scraper-lite: Free tier friendly. Fetches threads + comments.
{"startUrls": [{"url": "https://reddit.com/r/solana/comments/..."}], "maxItems": 100}
YouTube
- streamers/youtube-scraper: Comments + metadata.
{"startUrls": [{"url": "https://youtube.com/watch?v=..."}], "maxComments": 200}
Google SERP
- apify/google-search-scraper: Search results as structured data.
{"queries": "solana mobile grants", "maxPagesPerQuery": 1}
LinkedIn
- anchor/linkedin-profile-scraper: ⚠️ Terms-of-service risk. Research use only; use sparingly.
Pricing Reference
- 1 CU (compute unit) = 1 GB RAM × 1 hour
- Free tier: $5/mo (~16.7 CU)
- Tweet scraping: ~0.035–0.04 CU per 1,000 tweets (~$0.01 per 1,000 on the free tier)
- Some actors charge per result: $0.25–$0.40 per 1,000 tweets
- Check usage: https://console.apify.com/billing
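Notes
- Results are returned as a dataset; the script polls until the run completes
- Timeout: 5 minutes by default (most actors finish in 30–60 seconds)
- If an actor breaks (actors are community-maintained), check the Apify Store for alternatives
- MCP integration pending: an Apify MCP server exists, but openclaw.json doesn't support the mcpServers key yet (schema validation rejects it). Use this script approach instead.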