AI Daily News Skill
Automatically collects AI news from multiple sources and reports it, falling back to browser scraping when a primary method fails.
Quick Start
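```bash
# Install dependencies
pip install -r references/requirements.txt
playwright install chromium
# Configure
python scripts/setup_config.py
# Run collection
python scripts/collect_ai_news.py
# Generate and push the report
python scripts/push_to_feishu.py
```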
Supported Data Sources
| Source | Primary Method | Fallback Method |
|---|---|---|
| arXiv Papers | RSS API | Playwright browser |
| Hugging Face Papers | RSS Feed | Playwright browser |
| Product Hunt | RSS Feed | Playwright browser |
| YouTube AI Creators | yt-dlp | Playwright browser |
| PaperWeekly | RSS | requests |
| Custom RSS | feedparser | requests |
Configuration
Edit references/config.example.json or run setup_config.py:
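```json
{
  "feishu": {
    "webhook_url": "https://open.feishu.cn/open-apis/bot/v2/hook/xxx",
    "chat_id": "oc_xxx"
  },
  "sources": {
    "arxiv": {"enabled": true, "categories": ["cs.CL", "cs.LG", "cs.AI"]},
    "youtube": {
      "enabled": true,
      "creators": ["andrew_ng", "matt_wolfe", "ai_explained", "greg_isenberg"]
    },
    "paperweekly": {"enabled": true, "rss_url": ""}
  }
}
```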
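As a rough illustration of how a script might consume this file, the sketch below reads a JSON config with a top-level sources object and returns the names of the sources whose enabled flag is true. The helper names enabled_sources and load_config are assumptions made for this example, not the project's actual code.

```python
import json
from pathlib import Path


def enabled_sources(config: dict) -> list[str]:
    """Return the names of sources marked as enabled, in config order."""
    sources = config.get("sources", {})
    return [name for name, opts in sources.items() if opts.get("enabled", False)]


def load_config(path: str) -> dict:
    """Read a JSON config file from disk (hypothetical helper)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Inline sample with the same shape as the config described above.
    sample = {
        "sources": {
            "arxiv": {"enabled": True, "categories": ["cs.CL", "cs.LG", "cs.AI"]},
            "youtube": {"enabled": False, "creators": ["andrew_ng"]},
        }
    }
    print(enabled_sources(sample))
```

A source with no enabled key is treated as disabled here, which is a design choice of the sketch rather than documented behavior.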
YouTube Creators
Available creator keys:
- andrew_ng - Andrew Ng (DeepLearning.AI)
- matt_wolfe - Matt Wolfe
- ai_explained - AI Explained
- ai_with_oliver - AI with Oliver
- greg_isenberg - Greg Isenberg
Scripts Overview
| Script | Purpose |
|---|---|
| collect_ai_news.py | Main collector with fallback logic |
| youtube_collector.py | YouTube video collection |
| rss_collector.py | RSS feed collection |
| browser_fallback.py | Browser-based fallback scraping |
| push_to_feishu.py | Report generation and Feishu push |
| daily_scheduler.py | Scheduled task runner |
| setup_config.py | Interactive configuration setup |
Fallback Mechanism
When primary methods (RSS/API/yt-dlp) fail:
1. Automatically retries with browser-based scraping
2. Uses Playwright for JavaScript-rendered pages
3. Integrates seamlessly: the output format is the same
4. Logs fallback usage for monitoring
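The behavior described above can be sketched as a small wrapper that tries the primary method, logs any failure, and then calls the fallback so both paths return items in the same shape. The function name collect_with_fallback, the logger name, and the item dictionaries are assumptions for illustration; the real collectors (RSS, yt-dlp, Playwright) are replaced by plain callables.

```python
import logging
from typing import Callable

logger = logging.getLogger("ai_daily_news")  # hypothetical logger name

Item = dict  # one collected entry, e.g. {"title": ..., "url": ...}


def collect_with_fallback(
    source: str,
    primary: Callable[[], list[Item]],
    fallback: Callable[[], list[Item]],
) -> list[Item]:
    """Try the primary collector; on any error, log it and use the fallback."""
    try:
        return primary()
    except Exception as exc:  # broad on purpose: any primary failure triggers the fallback
        logger.warning("%s: primary method failed (%s); using fallback", source, exc)
        return fallback()


if __name__ == "__main__":
    def broken_rss() -> list[Item]:
        raise TimeoutError("feed timed out")

    def browser_scrape() -> list[Item]:
        # Stand-in for a Playwright-based scraper.
        return [{"title": "Example paper", "url": "https://example.com"}]

    print(collect_with_fallback("arxiv", broken_rss, browser_scrape))
```

Because the fallback returns the same item structure as the primary method, downstream report code does not need to know which path produced the data.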
Report Format
Generated reports include:
- 📚 arXiv papers with abstracts
- 🚀 Product Hunt AI products
- 🤗 Hugging Face papers
- 📺 YouTube video summaries
- 📰 PaperWeekly interpretations
- 📊 Source statistics
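To make the layout concrete, here is a minimal sketch of how such a report could be assembled as plain text from items grouped by source. The source keys, the item fields title and url, and the function render_report are assumptions for this example; the actual output of push_to_feishu.py may differ.

```python
# Section headings follow the report contents listed above;
# the keys on the left are hypothetical source identifiers.
SECTION_TITLES = {
    "arxiv": "📚 arXiv papers",
    "product_hunt": "🚀 Product Hunt AI products",
    "huggingface": "🤗 Hugging Face papers",
    "youtube": "📺 YouTube video summaries",
    "paperweekly": "📰 PaperWeekly interpretations",
}


def render_report(items_by_source: dict[str, list[dict]]) -> str:
    """Build a text report: one section per non-empty source, then per-source counts."""
    lines = []
    for key, title in SECTION_TITLES.items():
        items = items_by_source.get(key, [])
        if not items:
            continue  # skip sections with nothing to show
        lines.append(title)
        for item in items:
            lines.append(f"- {item['title']} ({item['url']})")
    lines.append("📊 Source statistics")
    for key in SECTION_TITLES:
        lines.append(f"- {key}: {len(items_by_source.get(key, []))}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = {"arxiv": [{"title": "Example paper", "url": "https://example.com"}]}
    print(render_report(demo))
```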
Troubleshooting
arXiv returns 0 papers: Check the days_back parameter or your network connection
YouTube fails: Ensure yt-dlp is installed; the Playwright fallback is available
RSS timeouts: The browser fallback will attempt direct requests
Feishu push fails: Verify the webhook URL and chat_id in the config
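For the first item, one possible cause of an empty result is a look-back window that is too narrow. The sketch below assumes days_back means "keep entries published within the last N days" and that each entry carries a timezone-aware published datetime; both are assumptions, since the parameter is not defined here, and filter_recent is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def filter_recent(entries: list, days_back: int, now: Optional[datetime] = None) -> list:
    """Keep entries whose 'published' datetime falls within the last days_back days."""
    # Assumption: days_back is a look-back window measured in days from "now".
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days_back)
    return [entry for entry in entries if entry["published"] >= cutoff]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    entries = [
        {"title": "Recent paper", "published": datetime(2024, 1, 9, 12, tzinfo=timezone.utc)},
        {"title": "Older paper", "published": datetime(2024, 1, 5, tzinfo=timezone.utc)},
    ]
    # A window of 0 days excludes everything; widening it brings entries back.
    for window in (0, 1, 7):
        print(window, [e["title"] for e in filter_recent(entries, window, now=now)])
```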
Advanced: Adding Custom Sources
1. Add an RSS feed to the rss section in config
2. Or implement a new collector in scripts/
3. Register it in collect_ai_news.py
4. Add a fallback method in browser_fallback.py
See references/DEVELOPMENT.md for detailed extension guide.
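As one possible shape for the implement, register, and fallback steps, the sketch below uses two small registries filled by decorators. Everything in it (the registry names, the decorators collector and fallback_for, and the my_blog source) is hypothetical; the project's real registration mechanism is described in references/DEVELOPMENT.md and may look different.

```python
from typing import Callable

Collector = Callable[[], list]

# Hypothetical registries; the real project may organize this differently.
PRIMARY: dict[str, Collector] = {}
FALLBACK: dict[str, Collector] = {}


def collector(name: str) -> Callable[[Collector], Collector]:
    """Decorator that registers a primary collector under a source name."""
    def wrap(func: Collector) -> Collector:
        PRIMARY[name] = func
        return func
    return wrap


def fallback_for(name: str) -> Callable[[Collector], Collector]:
    """Decorator that registers the fallback used when the primary fails."""
    def wrap(func: Collector) -> Collector:
        FALLBACK[name] = func
        return func
    return wrap


def missing_fallbacks() -> list[str]:
    """List registered sources that have no fallback yet."""
    return sorted(name for name in PRIMARY if name not in FALLBACK)


@collector("my_blog")
def collect_my_blog() -> list:
    # Placeholder body: a real collector would fetch and parse the source here.
    return [{"title": "Example post", "url": "https://example.com/post"}]
```

A check such as missing_fallbacks() makes it easy to spot a newly added source that still lacks its fallback method.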
AI Daily News Skill
Automatically collects AI news from multiple sources and reports it, with browser scraping as a fallback.
Quick Start
```bash
# Install dependencies
pip install -r references/requirements.txt
playwright install chromium
# Configure
python scripts/setup_config.py
# Run collection
python scripts/collect_ai_news.py
# Generate and push the report
python scripts/push_to_feishu.py
```
Supported Data Sources
| Source | Primary Method | Fallback Method |
|---|---|---|
| arXiv Papers | RSS API | Playwright browser |
| Hugging Face Papers | RSS Feed | Playwright browser |
| Product Hunt | RSS Feed | Playwright browser |
| YouTube AI Creators | yt-dlp | Playwright browser |
| PaperWeekly | RSS | requests |
| Custom RSS | feedparser | requests |
Configuration
Edit references/config.example.json or run setup_config.py:
```json
{
  "feishu": {
    "webhook_url": "https://open.feishu.cn/open-apis/bot/v2/hook/xxx",
    "chat_id": "oc_xxx"
  },
  "sources": {
    "arxiv": {"enabled": true, "categories": ["cs.CL", "cs.LG", "cs.AI"]},
    "youtube": {
      "enabled": true,
      "creators": ["andrew_ng", "matt_wolfe", "ai_explained", "greg_isenberg"]
    },
    "paperweekly": {"enabled": true, "rss_url": ""}
  }
}
```
YouTube Creators
Available creator keys:
- andrew_ng - Andrew Ng (DeepLearning.AI)
- matt_wolfe - Matt Wolfe
- ai_explained - AI Explained
- ai_with_oliver - AI with Oliver
- greg_isenberg - Greg Isenberg
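Scripts Overview
| Script | Purpose |
|---|---|
| collect_ai_news.py | Main collector with fallback logic |
| youtube_collector.py | YouTube video collection |
| rss_collector.py | RSS feed collection |
| browser_fallback.py | Browser-based fallback scraping |
| push_to_feishu.py | Report generation and Feishu push |
| daily_scheduler.py | Scheduled task runner |
| setup_config.py | Interactive configuration setup |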
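Fallback Mechanism
When primary methods (RSS/API/yt-dlp) fail:
1. Automatically retries with browser-based scraping
2. Uses Playwright for JavaScript-rendered pages
3. Integrates seamlessly: the output format is the same
4. Logs fallback usage for monitoring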
Report Format
Generated reports include:
- 📚 arXiv papers with abstracts
- 🚀 Product Hunt AI products
- 🤗 Hugging Face papers
- 📺 YouTube video summaries
- 📰 PaperWeekly interpretations
- 📊 Source statistics
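Troubleshooting
arXiv returns 0 papers: Check the days_back parameter or your network connection
YouTube fails: Ensure yt-dlp is installed; the Playwright fallback is available
RSS timeouts: The browser fallback will attempt direct requests
Feishu push fails: Verify the webhook URL and chat_id in the config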
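Advanced: Adding Custom Sources
1. Add an RSS feed to the rss section in config
2. Or implement a new collector in scripts/
3. Register it in collect_ai_news.py
4. Add a fallback method in browser_fallback.py
See references/DEVELOPMENT.md for a detailed extension guide.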