finviz-crawler
Why This Skill?
📰 Your own financial news database — most finance skills just wrap an API for one-shot queries. This skill runs continuously, building a local archive of every headline and article from Finviz. Query your history anytime — no API limits, no missing data.
🆓 No API key, no subscription — scrapes finviz.com directly using Crawl4AI + RSS. Bloomberg, Reuters, Yahoo Finance, CNBC articles extracted automatically. Zero cost.
🤖 Built for AI summarization — the query tool outputs clean text/JSON optimized for LLM digests. Pair with an OpenClaw cron job for automated morning briefings, evening wrap-ups, or weekly investment summaries.
💾 Auto-cleanup — configurable expiry automatically deletes old articles from both the database and disk. Set --expiry-days 30 to keep a month of history, or 0 to keep everything forever.
🔄 Daemon architecture — runs as a background service that starts/stops with OpenClaw. No manual intervention needed after setup. Works with systemd (Linux) and launchd (macOS).
Install
```bash
python3 scripts/install.py
```
Works on macOS, Linux, and Windows. Installs Python packages (crawl4ai, feedparser), sets up Playwright browsers, creates data directories, and verifies everything.
Manual install
```bash
pip install crawl4ai feedparser
crawl4ai-setup  # or: python -m playwright install chromium
```
Usage
Run the crawler
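```bash
# Default: ~/workspace/finviz/, 7-day expiry
python3 scripts/finviz_crawler.py

# Custom paths and settings
python3 scripts/finviz_crawler.py --db /path/to/finviz.db --articles-dir /path/to/articles/

# Keep 30 days of articles
python3 scripts/finviz_crawler.py --expiry-days 30

# Never auto-delete (keep everything)
python3 scripts/finviz_crawler.py --expiry-days 0

# Custom crawl interval (default: 300 seconds)
python3 scripts/finviz_crawler.py --sleep 600
```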
Query articles
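```bash
# Headlines from the last 24 hours
python3 scripts/finviz_query.py --hours 24

# Titles only (compact, suited to LLM summarization)
python3 scripts/finviz_query.py --hours 12 --titles-only

# Include full article content
python3 scripts/finviz_query.py --hours 12 --with-content

# List downloaded articles with their content status
python3 scripts/finviz_query.py --list-articles --hours 24

# Database statistics
python3 scripts/finviz_query.py --stats
```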
Manage tickers
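```bash
# List all tracked tickers
python3 scripts/finviz_query.py --list-tickers

# Add a single ticker (keywords are auto-generated from the symbol)
python3 scripts/finviz_query.py --add-ticker NVDA

# Add a ticker with custom keywords (quote specs that contain spaces)
python3 scripts/finviz_query.py --add-ticker "NVDA:nvidia,jensen huang"

# Add several tickers at once
python3 scripts/finviz_query.py --add-ticker NVDA TSLA AAPL
python3 scripts/finviz_query.py --add-ticker NVDA:nvidia,jensen "TSLA:tesla,elon musk"

# Remove several tickers at once
python3 scripts/finviz_query.py --remove-ticker NVDA TSLA

# Custom database path
python3 scripts/finviz_query.py --list-tickers --db /path/to/finviz.db
```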
Tickers are stored in the tickers table inside finviz.db alongside articles. The crawler reads this table each cycle to know which ticker pages to scrape.
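To make the ticker format concrete, here is a minimal sketch of how a `SYMBOL:keyword,keyword` spec could be parsed and stored in SQLite. The function names, the two-column `tickers` schema, and the fallback of using the symbol as its own keyword are all assumptions for illustration; the real schema in finviz.db may differ.

```python
import sqlite3


def parse_ticker_spec(spec: str) -> tuple[str, list[str]]:
    """Split 'NVDA:nvidia,jensen huang' into a symbol and a keyword list."""
    symbol, _, raw_keywords = spec.partition(":")
    symbol = symbol.strip().upper()
    if not symbol:
        raise ValueError(f"empty ticker symbol in {spec!r}")
    keywords = [k.strip().lower() for k in raw_keywords.split(",") if k.strip()]
    # Assumption: with no explicit keywords, fall back to the symbol itself.
    if not keywords:
        keywords = [symbol.lower()]
    return symbol, keywords


def upsert_ticker(conn: sqlite3.Connection, spec: str) -> None:
    """Insert or update one ticker row (hypothetical schema)."""
    symbol, keywords = parse_ticker_spec(spec)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tickers ("
        "symbol TEXT PRIMARY KEY, keywords TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO tickers (symbol, keywords) VALUES (?, ?)",
        (symbol, ",".join(keywords)),
    )
    conn.commit()


def tracked_symbols(conn: sqlite3.Connection) -> list[str]:
    """Return the symbols a crawl cycle would iterate over."""
    rows = conn.execute("SELECT symbol FROM tickers ORDER BY symbol")
    return [row[0] for row in rows]
```

Keeping the symbol as the primary key means re-adding a ticker with new keywords overwrites the old row instead of creating a duplicate.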
Configuration
| Setting | CLI flag | Env var | Default |
|---|---|---|---|
| Database path | `--db` | - | `~/workspace/finviz/finviz.db` |
| Articles directory | `--articles-dir` | - | `~/workspace/finviz/articles/` |
| Crawl interval | `--sleep` | - | 300 (5 min) |
| Article expiry | `--expiry-days` | `FINVIZ_EXPIRY_DAYS` | 7 days |
| Timezone | - | `FINVIZ_TZ` or `TZ` | System default |
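The table lists a flag and an environment variable for the expiry setting but does not say which one wins. The sketch below assumes the common precedence of CLI flag, then environment variable, then default, and that `FINVIZ_TZ` is checked before `TZ`. The function names are hypothetical and this is not taken from the crawler's source.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def resolve_expiry_days(
    argv: Sequence[str], env: Mapping[str, str] = os.environ
) -> int:
    """Resolve expiry days: --expiry-days, then FINVIZ_EXPIRY_DAYS, then 7."""
    parser = argparse.ArgumentParser()
    # default=None lets us tell "flag not given" apart from "--expiry-days 0".
    parser.add_argument("--expiry-days", type=int, default=None)
    args = parser.parse_args(list(argv))
    if args.expiry_days is not None:
        return args.expiry_days
    if env.get("FINVIZ_EXPIRY_DAYS"):
        return int(env["FINVIZ_EXPIRY_DAYS"])
    return 7  # documented default


def resolve_timezone(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return FINVIZ_TZ, else TZ, else None (meaning the system default)."""
    return env.get("FINVIZ_TZ") or env.get("TZ")
```

Using `None` as the argparse default matters here because `0` is a meaningful value (keep everything) and must not be mistaken for "unset".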
💬 Chat Commands (OpenClaw Agent)
When this skill is installed, the agent recognizes /finviz as a shortcut:
| Command | Action |
|---|---|
| `/finviz list` | Show tracked tickers |
| `/finviz add NVDA, TSLA` | Add tickers to track |
| `/finviz remove NVDA` | Remove a ticker |
| `/finviz stats` | Show article/ticker counts |
| `/finviz help` | Show available commands |
The agent runs these via the finviz_query.py CLI internally.
📱 PrivateApp Dashboard
A companion mobile dashboard is available in PrivateApp — a personal PWA dashboard for your home server.
The Finviz app provides:
- Headlines browser with time-range filters (12h / 24h / Week)
- Ticker-specific news filtering
- LLM-powered summaries on demand
Install PrivateApp, and the Finviz dashboard is built-in — no extra setup needed.
Architecture
Crawler daemon (finviz_crawler.py):
- Crawls finviz.com/news.ashx headlines every 5 minutes
- Fetches article content via Crawl4AI (Playwright) or RSS (paywalled sites)
- Bot/paywall detection rejects garbage content
- Per-domain rate limiting, user-agent rotation
- Deduplicates via SHA-256 title hash
- Auto-expires old articles (configurable)
- Clean shutdown on SIGTERM/SIGINT
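As an illustration of the deduplication step in the list above, the following sketch hashes a headline with SHA-256 and skips titles it has already seen. The normalization (lowercasing and collapsing whitespace) and the in-memory set are assumptions; the crawler may hash the raw title and keep its hashes in finviz.db.

```python
import hashlib


def title_hash(title: str) -> str:
    """Return the SHA-256 hex digest of a normalized headline."""
    # Assumption: case and spacing differences should not create new entries.
    normalized = " ".join(title.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SeenTitles:
    """Tracks headline hashes so each article is stored only once."""

    def __init__(self) -> None:
        self._hashes: set[str] = set()

    def add_if_new(self, title: str) -> bool:
        """Record the title and return True only the first time it appears."""
        digest = title_hash(title)
        if digest in self._hashes:
            return False
        self._hashes.add(digest)
        return True
```

A crawl cycle would call `add_if_new` for each scraped headline and fetch the article body only when it returns `True`.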
Query tool (finviz_query.py):
- Read-only SQLite queries (no HTTP, stdlib only)
- Filter by time window, export titles or full content
- Designed for LLM summarization pipelines
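The query tool's two properties, read-only access and time-window filtering, can be sketched with the standard library alone. The `articles` table, its `title` and `published_at` columns (ISO 8601 UTC strings), and both function names are assumptions made for this example, not the actual finviz.db schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open the database through a file: URI with mode=ro so writes fail."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def recent_titles(
    conn: sqlite3.Connection, hours: int, now: Optional[datetime] = None
) -> list[str]:
    """Return titles published within the last `hours`, newest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=hours)).isoformat()
    # ISO 8601 strings in one timezone sort chronologically as plain text.
    rows = conn.execute(
        "SELECT title FROM articles WHERE published_at >= ? "
        "ORDER BY published_at DESC",
        (cutoff,),
    )
    return [row[0] for row in rows]
```

Opening with `mode=ro` lets a reader run while the crawler daemon owns the writes, since an accidental `INSERT` or `DELETE` from the reader raises `sqlite3.OperationalError` instead of touching the archive.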
Run as a service (optional)
systemd (Linux)
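```ini
[Unit]
Description=Finviz news crawler

[Service]
ExecStart=python3 /path/to/scripts/finviz_crawler.py --expiry-days 30
Restart=on-failure
RestartSec=30

[Install]
WantedBy=default.target
```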
launchd (macOS)
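```xml
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.finviz.crawler</string>
  <key>ProgramArguments</key>
  <array>
    <string>python3</string>
    <string>/path/to/scripts/finviz_crawler.py</string>
    <string>--expiry-days</string>
    <string>30</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
```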
Data layout
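```
~/workspace/finviz/
├── finviz.db          # SQLite: articles + tickers (single database)
├── articles/          # Full article content, stored as .md files
│   ├── market/        # General market headlines
│   ├── nvda/          # Per-ticker articles
│   └── tsla/
└── summaries/         # LLM summary cache (.json)
```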
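Given this layout, the on-disk half of auto-expiry can be pictured as a sweep over the `.md` files under `articles/`. The sketch below is an assumption-laden illustration: it uses file modification time as the article's age, the name `expire_old_files` is hypothetical, and it leaves out the matching database cleanup that the crawler also performs.

```python
import os
import time
from pathlib import Path
from typing import Optional


def expire_old_files(
    articles_dir: Path, expiry_days: int, now: Optional[float] = None
) -> list[Path]:
    """Delete .md files older than expiry_days; 0 means keep everything."""
    if expiry_days <= 0:
        return []
    now = time.time() if now is None else now
    cutoff = now - expiry_days * 86400  # seconds per day
    removed = []
    # Materialize the listing first so deletions do not disturb the walk.
    for path in sorted(Path(articles_dir).rglob("*.md")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Treating `0` as "never delete" mirrors the documented behavior of `--expiry-days 0`.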
Cron integration
Pair with an OpenClaw cron job for automated digests:
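```
Schedule: 0 6 * * * (every day at 6 AM)
Task: Query last 24h → LLM summarize → deliver to Matrix/Telegram/Discord
```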
finviz-crawler
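Why This Skill?
📰 Your own financial news database: most finance skills just wrap an API for one-shot queries. This skill runs continuously and builds a local archive of every headline and article from Finviz. Query your history anytime, with no API limits and no missing data.
🆓 No API key, no subscription: scrapes finviz.com directly using Crawl4AI + RSS. Articles from Bloomberg, Reuters, Yahoo Finance, and CNBC are extracted automatically. Zero cost.
🤖 Built for AI summarization: the query tool outputs clean text/JSON optimized for LLM digests. Pair it with an OpenClaw cron job for automated morning briefings, evening wrap-ups, or weekly investment summaries.
💾 Auto-cleanup: a configurable expiry policy automatically deletes old articles from both the database and disk. Set --expiry-days 30 to keep a month of history, or 0 to keep everything forever.
🔄 Daemon architecture: runs as a background service that starts and stops with OpenClaw. No manual intervention is needed after setup. Works with systemd (Linux) and launchd (macOS).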
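Install
```bash
python3 scripts/install.py
```
Works on macOS, Linux, and Windows. Installs the Python packages (crawl4ai, feedparser), sets up Playwright browsers, creates the data directories, and verifies the setup.
Manual install
```bash
pip install crawl4ai feedparser
crawl4ai-setup  # or: python -m playwright install chromium
```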
Usage
Run the crawler
```bash
# Default: ~/workspace/finviz/, 7-day expiry
python3 scripts/finviz_crawler.py

# Custom paths and settings
python3 scripts/finviz_crawler.py --db /path/to/finviz.db --articles-dir /path/to/articles/

# Keep 30 days of articles
python3 scripts/finviz_crawler.py --expiry-days 30

# Never auto-delete (keep everything)
python3 scripts/finviz_crawler.py --expiry-days 0

# Custom crawl interval (default: 300 seconds)
python3 scripts/finviz_crawler.py --sleep 600
```
Query articles
```bash
# Headlines from the last 24 hours
python3 scripts/finviz_query.py --hours 24

# Titles only (compact, suited to LLM summarization)
python3 scripts/finviz_query.py --hours 12 --titles-only

# Include full article content
python3 scripts/finviz_query.py --hours 12 --with-content

# List downloaded articles with their content status
python3 scripts/finviz_query.py --list-articles --hours 24

# Database statistics
python3 scripts/finviz_query.py --stats
```
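Manage tickers
```bash
# List all tracked tickers
python3 scripts/finviz_query.py --list-tickers

# Add a single ticker (keywords are auto-generated from the symbol)
python3 scripts/finviz_query.py --add-ticker NVDA

# Add a ticker with custom keywords (quote specs that contain spaces)
python3 scripts/finviz_query.py --add-ticker "NVDA:nvidia,jensen huang"

# Add several tickers at once
python3 scripts/finviz_query.py --add-ticker NVDA TSLA AAPL
python3 scripts/finviz_query.py --add-ticker NVDA:nvidia,jensen "TSLA:tesla,elon musk"

# Remove several tickers at once
python3 scripts/finviz_query.py --remove-ticker NVDA TSLA

# Custom database path
python3 scripts/finviz_query.py --list-tickers --db /path/to/finviz.db
```
Tickers are stored in the tickers table inside finviz.db, alongside the articles. The crawler reads this table on every cycle to decide which ticker pages to scrape.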
Configuration
| Setting | CLI flag | Env var | Default |
|---|---|---|---|
| Database path | `--db` | - | `~/workspace/finviz/finviz.db` |
| Articles directory | `--articles-dir` | - | `~/workspace/finviz/articles/` |
| Crawl interval | `--sleep` | - | 300 (5 min) |
| Article expiry | `--expiry-days` | `FINVIZ_EXPIRY_DAYS` | 7 days |
| Timezone | - | `FINVIZ_TZ` or `TZ` | System default |
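💬 Chat Commands (OpenClaw Agent)
When this skill is installed, the agent recognizes /finviz as a shortcut:
| Command | Action |
|---|---|
| `/finviz list` | Show tracked tickers |
| `/finviz add NVDA, TSLA` | Add tickers to track |
| `/finviz remove NVDA` | Remove a ticker |
| `/finviz stats` | Show article/ticker counts |
| `/finviz help` | Show available commands |
The agent runs these commands by calling the finviz_query.py CLI internally.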
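📱 PrivateApp Dashboard
A companion mobile dashboard is available in PrivateApp, a personal PWA dashboard for your home server.
The Finviz app provides:
- Headlines browser with time-range filters (12h / 24h / Week)
- Ticker-specific news filtering
- LLM-powered summaries on demand
Once PrivateApp is installed, the Finviz dashboard is built in, with no extra setup needed.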
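Architecture
Crawler daemon (finviz_crawler.py):
- Crawls finviz.com/news.ashx headlines every 5 minutes
- Fetches article content via Crawl4AI (Playwright) or RSS (paywalled sites)
- Bot/paywall detection rejects garbage content
- Per-domain rate limiting, user-agent rotation
- Deduplicates via SHA-256 title hash
- Auto-expires old articles (configurable)
- Clean shutdown on SIGTERM/SIGINT
Query tool (finviz_query.py):
- Read-only SQLite queries (no HTTP, stdlib only)
- Filter by time window, export titles or full content
- Designed for LLM summarization pipelines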
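Run as a service (optional)
systemd (Linux)
```ini
[Unit]
Description=Finviz news crawler

[Service]
ExecStart=python3 /path/to/scripts/finviz_crawler.py --expiry-days 30
Restart=on-failure
RestartSec=30

[Install]
WantedBy=default.target
```
launchd (macOS)
```xml
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>com.finviz.crawler</string>
  <key>ProgramArguments</key>
  <array>
    <string>python3</string>
    <string>/path/to/scripts/finviz_crawler.py</string>
    <string>--expiry-days</string>
    <string>30</string>
  </array>
  <key>RunAtLoad</key>
  <true/>
  <key>KeepAlive</key>
  <true/>
</dict>
</plist>
```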
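Data layout
```
~/workspace/finviz/
├── finviz.db          # SQLite: articles + tickers (single database)
├── articles/          # Full article content, stored as .md files
│   ├── market/        # General market headlines
│   ├── nvda/          # Per-ticker articles
│   └── tsla/
└── summaries/         # LLM summary cache (.json)
```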
Cron integration
Pair with an OpenClaw cron job for automated digests:
```
Schedule: 0 6 * * * (every day at 6 AM)
Task: Query last 24h → LLM summarize → deliver to Matrix/Telegram/Discord
```