AI Paper Survey Skill

Structured, multi-phase paper survey workflow for AI research.

When to Use

- "Survey recent papers in [topic]"
- "What's new in agent/LLM/multimodal research?"
- "Find the most important papers from the last N months"
- "Do a literature review on [topic]"
- "Track progress in [research area]"

Prerequisites

- alphaXiv MCP server must be connected (provides embedding_similarity_search, full_text_papers_search, get_paper_content)
- paper-impact-analyzer skill installed (for impact assessment)
- Research keywords file (optional): a Markdown file listing the user's research interests and keywords

Workflow: 5-Phase Pipeline

Phase 0: Load Research Context

1. Check whether a research keywords file exists. Look in the current working directory for files matching these patterns:

   - 研究关键词*.md
   - research-keywords*.md
   - research-interests*.md

2. If found, read it and extract:

   - Theme list: the major research themes (e.g., "RL optimization", "Agent & Tool Calling")
   - Keywords: specific terms to search for (e.g., "GRPO", "Nested Learning", "VLA")
   - Models of interest: specific model names (e.g., "DeepSeek V4", "Qwen3.5")

3. If no keywords file, ask the user for:

   - Research topics (1-5 topics)
   - Time range (default: last 3 months)
   - Any specific papers or authors to track

4. Determine the time range (default: the 3 months ending today).

5. Generate search queries using the template below. For each user theme T, generate:

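       Semantic query: {T} fundamental advances, paradigm shift, redefining {T}, {year}
       Keyword query:  {T-specific keywords} {year range}
       Contrast query: alternatives to the current {T} paradigm, beyond {T}, {year}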
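Steps 1, 4, and 5 could be sketched in Python roughly as follows. This is a minimal sketch, not the skill's actual implementation: the function names, the approximation of a month as 30 days, and the exact wording of the generated queries are assumptions made for illustration.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# File name patterns listed in step 1.
KEYWORD_FILE_PATTERNS = ("研究关键词*.md", "research-keywords*.md", "research-interests*.md")


def find_keywords_file(directory: Path) -> Optional[Path]:
    """Return the first file matching a known pattern, or None if there is none."""
    for pattern in KEYWORD_FILE_PATTERNS:
        matches = sorted(directory.glob(pattern))
        if matches:
            return matches[0]
    return None


def default_time_range(today: date, months: int = 3) -> Tuple[date, date]:
    """Approximate "the last N months" as N * 30 days (an assumption)."""
    return today - timedelta(days=30 * months), today


def build_queries(theme: str, keywords: List[str], start: date, end: date) -> Dict[str, str]:
    """Build one semantic, one keyword, and one contrast query for a theme.

    The wording of each query is illustrative only.
    """
    years = str(end.year) if start.year == end.year else f"{start.year}-{end.year}"
    return {
        "semantic": f"{theme} fundamental advances, paradigm shift, redefining {theme}, {years}",
        "keyword": f"{' '.join(keywords)} {years}",
        "contrast": f"alternatives to the current {theme} paradigm, beyond {theme}, {years}",
    }
```

If `find_keywords_file` returns `None`, the workflow falls back to step 3 and asks the user for topics and a time range.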

Phase 1: Broad Search (Parallel)

Execute search queries in parallel using alphaXiv MCP tools:

- Use embedding_similarity_search for semantic queries (captures conceptual matches)
- Use full_text_papers_search for keyword queries (captures exact term matches)

Rules:

- Launch 4-6 parallel searches covering different themes
- Each search returns up to 15 results
- Collect all results into a candidate pool
- Deduplicate by arXiv ID
- Filter by publication date (must be within the specified time range)

Expected output: 30-60 unique candidate papers with titles and abstracts.
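The pooling, deduplication, and date-filtering rules above might look like the sketch below. The shape of a search result (a dict with `arxiv_id` and an ISO-formatted `published` field) is an assumption; the actual response format of the alphaXiv MCP tools is not described here and may differ.

```python
import re
from datetime import date
from typing import Dict, Iterable, List


def normalize_arxiv_id(arxiv_id: str) -> str:
    """Drop a trailing version suffix so '2501.00001v2' matches '2501.00001'."""
    return re.sub(r"v\d+$", "", arxiv_id.strip())


def build_candidate_pool(
    result_lists: Iterable[List[dict]], start: date, end: date
) -> List[dict]:
    """Merge several result lists, keeping the first copy of each in-range paper."""
    pool: Dict[str, dict] = {}
    for results in result_lists:
        for paper in results:
            key = normalize_arxiv_id(paper["arxiv_id"])
            published = date.fromisoformat(paper["published"])  # assumed "YYYY-MM-DD"
            if start <= published <= end and key not in pool:
                pool[key] = paper
    return list(pool.values())
```

Stripping the version suffix before comparing IDs is a design choice of this sketch, so that two versions of the same paper count as one candidate.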

Phase 2: Initial Screening (LLM Judgment)

For each candidate paper, classify by the user's framework. Default framework (3-tier):

- Tier 1 (Essence): "What IS X?" Redefines the problem itself by asking fundamental questions about the nature of learning, reasoning, action, perception, and so on. These papers have lasting impact because they challenge assumptions.
- Tier 2 (Engineering): "How to do X better?" Optimizes within existing frameworks. Valuable, but does not change paradigms. Examples: better MoE routing, improved training recipes, new benchmarks.
- Tier 3 (Patch): "How to mitigate this symptom?" Short-term fixes. Examples: inference token pruning, fine-tuning tricks, quantization improvements.

Rules:

- Use ONLY title + abstract for screening (don't read full papers yet)
- Be selective: aim for 8-12 papers across all tiers
- Tier 1 should have 3-5 papers max
- Apply the user's specific keywords to boost relevance

Expected output: Classified paper list with tier assignments.
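The tier assignment itself is an LLM judgment, but the numeric rules above can be enforced afterwards in code. The sketch below assumes each classified paper is a dict with `title`, `abstract`, and an integer `tier`; ranking by a simple count of keyword matches, and filling the list tier by tier, are illustrative choices rather than part of the skill.

```python
from typing import List


def keyword_hits(paper: dict, keywords: List[str]) -> int:
    """Count how many user keywords appear in the title or abstract."""
    text = f"{paper['title']} {paper['abstract']}".lower()
    return sum(1 for kw in keywords if kw.lower() in text)


def select_papers(
    classified: List[dict], keywords: List[str], tier1_cap: int = 5, total_cap: int = 12
) -> List[dict]:
    """Keep at most `tier1_cap` Tier 1 papers and `total_cap` papers overall."""
    # Lower tiers first; within a tier, more keyword matches first.
    ranked = sorted(classified, key=lambda p: (p["tier"], -keyword_hits(p, keywords)))
    selected: List[dict] = []
    tier1_count = 0
    for paper in ranked:
        if len(selected) >= total_cap:
            break
        if paper["tier"] == 1:
            if tier1_count >= tier1_cap:
                continue
            tier1_count += 1
        selected.append(paper)
    return selected
```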

Phase 3: Deep Reading (Parallel, Top Candidates Only)

For Tier 1 papers and the top Tier 2 papers (4-6 papers max in total), use get_paper_content to retrieve the full analysis.

After reading each paper, immediately extract and cache:

- Core contribution (1 sentence)
- Method keywords (3-5 terms)
- Best experimental result (1-2 numbers)
- Open-source links (GitHub URL if any)
- Venue acceptance status
- Key limitation

Discard the raw full-text analysis after extraction to conserve space in the context window.
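One way to hold the cached fields is a small record type, as sketched below. The class name, field names, and the `extract` callback (which stands in for the LLM extraction step) are hypothetical; only the list of fields comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PaperSummary:
    arxiv_id: str
    core_contribution: str        # 1 sentence
    method_keywords: List[str]    # 3-5 terms
    best_results: List[str]       # 1-2 numbers, kept as text with their units
    venue_status: str
    key_limitation: str
    open_source_url: Optional[str] = None  # GitHub URL, if any


def summarize_and_discard(
    raw_analysis: str,
    extract: Callable[[str], PaperSummary],
    cache: Dict[str, PaperSummary],
) -> PaperSummary:
    """Cache the extracted summary; the raw text is deliberately not stored."""
    summary = extract(raw_analysis)
    cache[summary.arxiv_id] = summary
    return summary
```

Keeping only the compact record means later phases work from a few short fields per paper instead of the full text.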

Phase 4: Impact Assessment

For each paper in the deep reading set, run the paper-impact-analyzer:

    python path/to/paper-impact-analyzer/scripts/analyze.py {arxiv_id_1} {arxiv_id_2} ...

Merge impact data with the content analysis from Phase 3.
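The invocation and the merge step could be sketched as follows. The output format of paper-impact-analyzer is not described here, so the sketch assumes the impact data has already been parsed into a dict keyed by arXiv ID; both function names are hypothetical.

```python
import sys
from typing import Dict, List


def analyzer_command(script_path: str, arxiv_ids: List[str]) -> List[str]:
    """Build the argument list for the analyzer script (suitable for subprocess.run)."""
    return [sys.executable, script_path, *arxiv_ids]


def merge_impact(summaries: Dict[str, dict], impact: Dict[str, dict]) -> Dict[str, dict]:
    """Attach impact data to each Phase 3 summary; papers without data get None."""
    return {
        arxiv_id: {**summary, "impact": impact.get(arxiv_id)}
        for arxiv_id, summary in summaries.items()
    }
```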

Phase 5: Synthesize Report

Generate a structured Markdown report with the following sections:

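    # {Topic} Paper Survey - {Date Range}

    Survey date: {today}
    Scope: {topics covered}
    Papers screened: {N candidates} → {M selected}

    ## Classification Framework

    {Describe the tier system used}

    ## Tier 1 (Essence): Redefining the Problem

    ### Paper 1: {Title}

    - Essence question: What fundamental assumption does this challenge?
    - Core contribution: {one sentence}
    - Key result: {best numbers}
    - Impact: {analyzer score} | {venue} | {GitHub stars}
    - Links: arXiv | GitHub

    {... repeat for each Tier 1 paper}

    ## Tier 2 (Engineering): Doing It Better

    | Paper | Contribution | Impact | Links |
    |-------|--------------|--------|-------|
    {table rows}

    ## Tier 3 (Patch): Symptom Mitigation

    | Paper | Fix | Links |
    |-------|-----|-------|
    {table rows}

    ## Top 3 Recommended Papers

    ## Trends & Observations

    {2-3 paragraphs on emerging patterns}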

Save the report to {working_directory}/{topic}-paper-survey-{date}.md.
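The file name pattern can be filled in as sketched below. Turning the topic into a lowercase, hyphen-separated slug and writing the date in ISO format (YYYY-MM-DD) are assumptions; the text above specifies only the overall pattern.

```python
import re
from datetime import date
from pathlib import Path


def report_path(working_directory: Path, topic: str, on: date) -> Path:
    """Build {working_directory}/{topic}-paper-survey-{date}.md."""
    # Collapse every run of non-word characters into a single hyphen.
    slug = re.sub(r"[^\w]+", "-", topic.strip().lower()).strip("-")
    return working_directory / f"{slug}-paper-survey-{on.isoformat()}.md"


def save_report(markdown: str, working_directory: Path, topic: str, on: date) -> Path:
    """Write the report as UTF-8 and return the path it was saved to."""
    path = report_path(working_directory, topic, on)
    path.write_text(markdown, encoding="utf-8")
    return path
```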

Configuration

Custom Classification Framework

Users can override the default 3-tier framework by specifying their own in the keywords file. The skill will use whatever framework the user provides.

Search Depth Control

| Level | Searches | Deep reads | Best for |
|-------|----------|------------|----------|
| Quick | 4 | 2-3 | Weekly check-in |
| Standard | 6 | 4-6 | Monthly review |
| Thorough | 8-10 | 6-8 | Quarterly survey |

Default: Standard.
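The depth levels could be encoded as a small lookup table, as in the sketch below. The names `DepthLevel`, `DEPTH_LEVELS`, and `resolve_depth` are hypothetical; the numbers are the search and deep-read counts given above, stored as (min, max) ranges.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DepthLevel:
    searches: Tuple[int, int]    # (min, max) number of parallel searches
    deep_reads: Tuple[int, int]  # (min, max) number of papers read in full
    best_for: str


DEPTH_LEVELS: Dict[str, DepthLevel] = {
    "quick": DepthLevel((4, 4), (2, 3), "Weekly check-in"),
    "standard": DepthLevel((6, 6), (4, 6), "Monthly review"),
    "thorough": DepthLevel((8, 10), (6, 8), "Quarterly survey"),
}


def resolve_depth(level: Optional[str] = None) -> DepthLevel:
    """Look up a level by name (case-insensitive); default to Standard."""
    return DEPTH_LEVELS[(level or "standard").lower()]
```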

Example Usage

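    Survey papers from the last 3 months in my research area

    Quick survey: what's new in LLM reasoning and agent tool calling since January?

    Do a thorough literature review of reinforcement learning training methods for large language models, classified by innovation tier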
