Paper Compare
Compare academic papers side-by-side with structured tables and detailed narrative analysis.
The Paper Comparison Reasoning Framework
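┌─────────────────────────────────────────────────────────────┐
│ Paper Comparison Thinking Flow                              │
├─────────────────────────────────────────────────────────────┤
│ 1. Interpret  → Which papers? What is the comparison goal?  │
│ 2. Retrieve   → Get metadata, abstract, full text           │
│ 3. Analyze    → Extract information on 10 dimensions        │
│ 4. Synthesize → Build narrative, find gaps, assess quality  │
│ 5. Validate   → Check completeness, deliver                 │
└─────────────────────────────────────────────────────────────┘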
Decision Tree: Input Processing
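User input
│
├── 1 paper ──→ Single-paper summary
│   └── Skip comparison, show full summary
│
├── 2-5 papers ──→ Full comparison
│   └── Proceed through all 10 dimensions
│
├── >5 papers ──→ Ask to narrow down
│   └── "Please narrow to 2-5 papers for a meaningful comparison"
│
├── DOI ──→ Fetch via crossref/semantic scholar
│   └── https://api.crossref.org/works/{doi}
│
├── URL ──→ Fetch via web_fetch
│   └── Extract title, authors, abstract
│
├── Search query ──→ Search first
│   └── Use web_search, show top 3 results, confirm before proceeding
│
└── PDF file ──→ Extract text first
    └── Use pdf skill, then extract metadata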
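The count-based routing (one paper gets a summary, two to five get a full comparison, more than five triggers a request to narrow down) can be sketched as a small function. The function name and the returned labels are hypothetical, chosen only for illustration.

```python
def route_by_paper_count(n_papers: int) -> str:
    """Map the number of papers to the action this skill takes.

    The returned labels are illustrative names, not part of the skill itself.
    """
    if n_papers < 1:
        raise ValueError("at least one paper is required")
    if n_papers == 1:
        return "single_summary"      # skip comparison, show full summary
    if n_papers <= 5:
        return "full_comparison"     # proceed through all 10 dimensions
    return "ask_to_narrow"           # more than 5: ask the user for 2-5 papers
```

For example, `route_by_paper_count(3)` returns `"full_comparison"`.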
Decision Tree: Comparison Angle
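What is the comparison about?
│
├── Same topic, different methods ──→
│   └── Focus: methodology differences, results comparison
│
├── Same method, different domains ──→
│   └── Focus: adaptability across domains, performance
│
├── Evolution over time ──→
│   └── Focus: improvements, what changed, SOTA progress
│
├── Competing approaches ──→
│   └── Focus: trade-offs, when to choose which
│
└── Complementary papers ──→
    └── Focus: how to combine them, which gaps each fills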
Self-Check: After Identifying Angle
- [ ] Does my analysis focus on the right aspects?
- [ ] Will this help the user make a decision?
Step 1: Interpret the Request
What to Clarify
| Question | Why It Matters |
|---|---|
| Which papers? | Need exact references |
| What goal? | Learning? Research? Writing? |
| What comparison angle? | Focus analysis appropriately |
Self-Check: Before Starting
- [ ] Do I have all paper references?
- [ ] Do I understand what user wants to learn?
- [ ] Is the number of papers appropriate (1-5)?
- [ ] What's the comparison angle?
Step 2: Retrieve Papers
Retrieval Strategy
| Input Type | Method | Source |
|---|---|---|
| DOI | API | crossref, semantic scholar |
| URL | web_fetch | arXiv, IEEE, PubMed |
| Search | web_search → web_fetch | Find, then confirm |
| PDF | pdf skill | Extract text |
| History | memory_search | Prior comparisons |
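As a rough illustration of how an input string might be matched to a row of the retrieval table, the sketch below classifies input as a DOI, URL, PDF, or search query. The function names, the returned labels, and the DOI regular expression are assumptions made for this example; the Crossref works endpoint is the public REST API route.

```python
import re
from urllib.parse import quote

# Common DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is a heuristic for illustration, not a full DOI validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def classify_input(text: str) -> str:
    """Return 'url', 'pdf', 'doi', or 'search' for a user-supplied string."""
    s = text.strip()
    if s.lower().startswith(("http://", "https://")):
        return "url"        # fetch with web_fetch
    if s.lower().endswith(".pdf"):
        return "pdf"        # extract text with the pdf skill
    if DOI_PATTERN.match(s):
        return "doi"        # look up via crossref / semantic scholar
    return "search"         # web_search first, then confirm with the user


def crossref_url(doi: str) -> str:
    """Build the Crossref metadata URL for a DOI (slashes are kept as-is)."""
    return f"https://api.crossref.org/works/{quote(doi)}"
```

Anything that is not recognizably a URL, PDF file name, or DOI falls through to the search path, which matches the "find, then confirm" rule in the table.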
Quality Priority
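Must have:
├── Title
├── Authors
├── Year
├── Venue
├── Abstract (for methodology and results)
Nice to have:
├── Full text (for limitations)
├── Code/data links
├── Citation count (see below)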
Citation Count
Use Semantic Scholar API:
https://api.semanticscholar.org/graph/v1/paper/{doi}?fields=citationCount
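A minimal sketch of the citation lookup using only the Python standard library. It assumes the paper is identified by DOI and passes it with the `DOI:` prefix that the Semantic Scholar Graph API accepts for DOI identifiers; the function names are hypothetical, and rate limits, API keys, and error handling are left out.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.semanticscholar.org/graph/v1/paper"


def citation_url(doi: str) -> str:
    """Build the request URL, asking only for the citationCount field."""
    paper_id = urllib.parse.quote(f"DOI:{doi}", safe=":/")
    return f"{API_ROOT}/{paper_id}?fields=citationCount"


def parse_citation_count(body: str):
    """Return citationCount from a JSON response body, or None if absent."""
    return json.loads(body).get("citationCount")


def fetch_citation_count(doi: str, timeout: float = 10.0):
    """Perform the HTTP request (needs network access)."""
    with urllib.request.urlopen(citation_url(doi), timeout=timeout) as resp:
        return parse_citation_count(resp.read().decode("utf-8"))
```

Splitting URL construction and response parsing from the network call keeps the first two testable offline. When the count comes back as `None`, record it as "Not specified" rather than guessing.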
Self-Check: After Retrieval
- [ ] Did I get the abstract?
- [ ] Can I determine the methodology?
- [ ] Are there any papers with missing critical info?
- [ ] Did I get citation counts?
Step 3: Analyze (10 Dimensions)
Core Dimensions (Always Include)
| # | Dimension | What to Extract |
|---|---|---|
| 1 | Title | Full title |
| 2 | Authors | All authors, first author highlighted |
| 3 | Year | Publication year |
| 4 | Venue | Journal/Conference |
| 5 | Research Question | What problem do they solve? |
| 6 | Methodology | Approach, techniques used |
| 7 | Dataset | What data did they use? |
| 8 | Results | Key findings with numbers |
| 9 | Limitations | What do they acknowledge? |
| 10 | Code & Data | Links to artifacts? |
Decision: What If Missing?
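Missing dimension:
│
├── Abstract missing ──→ Note that methodology cannot be analyzed
│
├── Results missing ──→ Note "no results in metadata"
│
├── Limitations missing ──→ Note "not stated" (do not infer)
│
└── Dataset unclear ──→ Note "not explicitly stated"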
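The rule of stating "Not specified" instead of inferring missing information can be enforced mechanically. The sketch below assumes the extracted dimensions are held in a plain dictionary keyed by dimension name; the function name and that representation are illustrative choices.

```python
DIMENSIONS = [
    "Title", "Authors", "Year", "Venue", "Research Question",
    "Methodology", "Dataset", "Results", "Limitations", "Code & Data",
]
NOT_SPECIFIED = "Not specified"


def fill_missing(extracted: dict) -> dict:
    """Return all 10 dimensions, marking absent or blank ones explicitly."""
    filled = {}
    for dim in DIMENSIONS:
        value = extracted.get(dim)
        if value is None or not str(value).strip():
            filled[dim] = NOT_SPECIFIED   # never infer a value
        else:
            filled[dim] = value
    return filled
```

Every paper then carries the same ten keys, so later table-building steps never hit a missing entry.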
Step 4: Synthesize
Quality Scoring
Evaluate each paper:
| Factor | Score | Notes |
|---|---|---|
| Venue Quality | | |
| - Top-tier (NeurIPS, ICML, ICLR, Nature, Science) | ⭐⭐⭐ | |
| - Good (AAAI, IJCAI, CVPR, EMNLP, IEEE) | ⭐⭐ | |
| - Other | ⭐ | |
| Citations | | |
| - 100+ | ⭐⭐⭐ | Highly cited |
| - 10-100 | ⭐⭐ | Well-known |
| - <10 | ⭐ | Recent or niche |
| Code Available | | |
| - Yes, official | ⭐⭐⭐ | |
| - Yes, community | ⭐⭐ | |
| - No | ⭐ | |
| Data Available | | |
| - Yes | ⭐⭐⭐ | |
| - No | ⭐ | |
Overall Quality: Sum stars (higher = more established)
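The star sum can be computed as in the sketch below. Two assumptions are made here: venue names are matched exactly against the lists in the table, and a count of exactly 100 is treated as "100+" because the table's ranges overlap at that value. Function names and the `"official"`/`"community"` labels are illustrative.

```python
TOP_TIER_VENUES = {"NeurIPS", "ICML", "ICLR", "Nature", "Science"}
GOOD_VENUES = {"AAAI", "IJCAI", "CVPR", "EMNLP", "IEEE"}


def venue_stars(venue: str) -> int:
    if venue in TOP_TIER_VENUES:
        return 3
    if venue in GOOD_VENUES:
        return 2
    return 1


def citation_stars(count: int) -> int:
    if count >= 100:      # assumption: exactly 100 counts as "100+"
        return 3
    if count >= 10:
        return 2
    return 1


def code_stars(availability) -> int:
    # "official" or "community"; anything else means no code is available
    return {"official": 3, "community": 2}.get(availability, 1)


def data_stars(available: bool) -> int:
    return 3 if available else 1


def quality_score(venue: str, citations: int, code, data_available: bool) -> int:
    """Sum the four factors; the result ranges from 4 to 12."""
    return (venue_stars(venue) + citation_stars(citations)
            + code_stars(code) + data_stars(data_available))
```

For instance, a CVPR paper with 100 citations, community code, and no released data scores 2 + 3 + 2 + 1 = 8.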
Comparison Table Structure
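| Dimension | Paper A | Paper B | ... |
|---|---|---|---|
| Title | ... | ... | ... |
| Authors | ... | ... | ... |
| Year | ... | ... | ... |
| Venue | ... | ... | ... |
| Research Question | ... | ... | ... |
| Methodology | ... | ... | ... |
| Dataset | ... | ... | ... |
| Results | ... | ... | ... |
| Limitations | ... | ... | ... |
| Code & Data | ... | ... | ... |
| Quality Score | [⭐⭐⭐] | [⭐⭐] | ... |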
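A comparison table with one row per dimension and one column per paper could be generated with a small helper like the one below. It is a sketch under the assumption that each paper is a dictionary of dimension values; the function name and the "Dimension" header label are illustrative.

```python
def build_comparison_table(papers: dict, rows: list) -> str:
    """Render a markdown table: one column per paper, one row per dimension.

    `papers` maps a column label (e.g. "Paper A") to a dict of values.
    Dimensions a paper lacks are shown as "Not specified".
    """
    labels = list(papers)
    lines = [
        "| Dimension | " + " | ".join(labels) + " |",
        "|---|" + "---|" * len(labels),
    ]
    for row in rows:
        cells = []
        for label in labels:
            value = str(papers[label].get(row, "Not specified"))
            cells.append(value.replace("|", "\\|"))  # keep pipes from breaking cells
        lines.append(f"| {row} | " + " | ".join(cells) + " |")
    return "\n".join(lines)
```

Usage: `build_comparison_table({"Paper A": {"Year": "2017"}, "Paper B": {}}, ["Year"])` yields a three-line table whose last row is `| Year | 2017 | Not specified |`.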
Narrative Synthesis Template
Structure:
## Overview
[What problem each paper addresses - high-level]
[Comparison angle: what are we comparing?]
## Methodology Comparison
[Compare techniques - are they compression-based? architecture-based?
What's the key algorithmic difference?
How does the comparison angle affect this?]
## Results Analysis
[Quantitative results - specific numbers, metrics
Performance comparison - trade-offs mentioned
Which paper wins on what?]
## Limitations
[What each paper acknowledges - be honest about gaps]
[What's NOT covered that might matter]
## Research Gaps
[What's MISSING across ALL papers]
[What's not yet explored]
[Potential future directions]
## Quality Assessment
[Paper A: ⭐⭐⭐ - Why]
[Paper B: ⭐⭐ - Why]
[Note any concerns]
Step 5: Structured Verdict
Decision Matrix
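| If you need... | Choose | Why |
|---|---|---|
| [Best performance] | Paper [X] | [reason] |
| [Easiest to implement] | Paper [X] | [reason] |
| [Newest approach] | Paper [X] | [reason] |
| [Most cited/most reliable] | Paper [X] | [reason] |
| [Code available] | Paper [X] | [reason] |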
Final Recommendation
CODEBLOCK9
Self-Check: Before Delivering
- [ ] Did I answer the user's original question?
- [ ] Did I identify the comparison angle?
- [ ] Are all 10 dimensions covered?
- [ ] Is quality scored?
- [ ] Is verdict actionable?
Step 6: Validate & Deliver
For Single Paper (1 only)
Output:
CODEBLOCK10
For Comparison (2-5 papers)
Deliver:
1. Comparison Angle: What we're comparing and why
2. Comparison Table: All 10 dimensions + quality
3. Narrative Summary: 6-section synthesis
4. Quality Assessment: Scored factors
5. Structured Verdict: Decision matrix + recommendation
Edge Cases to Note
| Situation | How to Handle |
|---|---|
| Different fields | Warn: "Comparing CS vs Biology papers" |
| Very different years | Note: "2010 vs 2024: comparison may be unfair" |
| Preprint | Note: "Preprint, not peer-reviewed" |
| Conflicting results | Note: "Paper A claims X, Paper B claims Y" |
Error Handling
If Retrieval Fails
CODEBLOCK11
History (Persistence)
Save After Comparison
CODEBLOCK12
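A minimal sketch of persisting a comparison, assuming the history file `memory/paper-compare-history.json` holds a JSON list of entries. The entry fields (`timestamp`, `papers`, `angle`, `verdict`) and the function names are hypothetical; adapt them to whatever record format the skill actually stores.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

HISTORY_PATH = Path("memory/paper-compare-history.json")


def load_history(path: Path = HISTORY_PATH) -> list:
    """Return prior comparisons, or an empty list if no history exists yet."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def save_comparison(papers, angle: str, verdict: str,
                    path: Path = HISTORY_PATH) -> dict:
    """Append one comparison record to the history file and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "papers": list(papers),      # e.g. DOIs or URLs
        "angle": angle,
        "verdict": verdict,
    }
    history = load_history(path)
    history.append(entry)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        json.dump(history, fh, ensure_ascii=False, indent=2)
    return entry
```

Rewriting the whole list on each save is simple and adequate for a short history; a larger log would be better served by an append-only format such as JSON Lines.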
Load History
- Read memory/paper-compare-history.json if it exists
- Use memory_search to find prior comparisons
Dependencies
| Skill | Use For |
|---|---|
| pdf | Extract text from uploaded PDFs |
| web_search | Find papers by query |
| web_fetch | Get paper content from URLs |
Quick Reference
| Input | Action |
|---|---|
| 1 DOI | Single summary |
| 2 DOIs | Full comparison |
| arXiv URL | Fetch abstract |
| "search for X" | Search → confirm → proceed |
| Upload PDF | Extract → analyze |
Summary Checklist
- [ ] Identify comparison angle
- [ ] Retrieve all papers (metadata + abstract)
- [ ] Extract 10 dimensions
- [ ] Score quality (venue, citations, code, data)
- [ ] Build comparison table
- [ ] Write narrative summary
- [ ] Create structured verdict
- [ ] Save to history
Notes
- Always confirm before proceeding with search results
- Keep comparisons focused: 2-5 papers max
- Don't infer missing information — state "Not specified"
- Save to history for future reference
- Quality scoring helps users make informed decisions
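Paper Compare
Compare academic papers side by side using structured tables and detailed narrative analysis.
The Paper Comparison Reasoning Framework
┌─────────────────────────────────────────────────────────────┐
│ Paper Comparison Thinking Flow                              │
├─────────────────────────────────────────────────────────────┤
│ 1. Interpret  → Which papers? What is the comparison goal?  │
│ 2. Retrieve   → Get metadata, abstract, full text           │
│ 3. Analyze    → Extract information on 10 dimensions        │
│ 4. Synthesize → Build narrative, find gaps, assess quality  │
│ 5. Validate   → Check completeness, deliver                 │
└─────────────────────────────────────────────────────────────┘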
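Decision Tree: Input Processing
User input
│
├── 1 paper ──→ Single-paper summary
│   └── Skip comparison, show full summary
│
├── 2-5 papers ──→ Full comparison
│   └── Proceed through all 10 dimensions
│
├── >5 papers ──→ Ask to narrow down
│   └── "Please narrow to 2-5 papers for a meaningful comparison"
│
├── DOI ──→ Fetch via crossref/semantic scholar
│   └── https://api.crossref.org/works/{doi}
│
├── URL ──→ Fetch via web_fetch
│   └── Extract title, authors, abstract
│
├── Search query ──→ Search first
│   └── Use web_search, show top 3 results, confirm before proceeding
│
└── PDF file ──→ Extract text first
    └── Use pdf skill, then extract metadata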
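Decision Tree: Comparison Angle
What is the comparison about?
│
├── Same topic, different methods ──→
│   └── Focus: methodology differences, results comparison
│
├── Same method, different domains ──→
│   └── Focus: adaptability across domains, performance
│
├── Evolution over time ──→
│   └── Focus: improvements, what changed, SOTA progress
│
├── Competing approaches ──→
│   └── Focus: trade-offs, when to choose which
│
└── Complementary papers ──→
    └── Focus: how to combine them, which gaps each fills
Self-Check: After Identifying Angle
- [ ] Does my analysis focus on the right aspects?
- [ ] Will this help the user make a decision?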
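Step 1: Interpret the Request
What to Clarify
| Question | Why It Matters |
|---|---|
| Which papers? | Need exact references |
| What goal? | Learning? Research? Writing? |
| What comparison angle? | Focus analysis appropriately |
Self-Check: Before Starting
- [ ] Do I have all paper references?
- [ ] Do I understand what the user wants to learn?
- [ ] Is the number of papers appropriate (1-5)?
- [ ] What's the comparison angle?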
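Step 2: Retrieve Papers
Retrieval Strategy
| Input Type | Method | Source |
|---|---|---|
| DOI | API | crossref, semantic scholar |
| URL | web_fetch | arXiv, IEEE, PubMed |
| Search | web_search → web_fetch | Find, then confirm |
| PDF | pdf skill | Extract text |
| History | memory_search | Prior comparisons |
Quality Priority
Must have:
├── Title
├── Authors
├── Year
├── Venue
├── Abstract (for methodology and results)
Nice to have:
├── Full text (for limitations)
├── Code/data links
├── Citation count (see below)
Citation Count
Use the Semantic Scholar API:
https://api.semanticscholar.org/graph/v1/paper/{doi}?fields=citationCount
Self-Check: After Retrieval
- [ ] Did I get the abstract?
- [ ] Can I determine the methodology?
- [ ] Are there any papers with missing critical info?
- [ ] Did I get citation counts?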
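Step 3: Analyze (10 Dimensions)
Core Dimensions (Always Include)
| # | Dimension | What to Extract |
|---|---|---|
| 1 | Title | Full title |
| 2 | Authors | All authors, first author highlighted |
| 3 | Year | Publication year |
| 4 | Venue | Journal/Conference |
| 5 | Research Question | What problem do they solve? |
| 6 | Methodology | Approach, techniques used |
| 7 | Dataset | What data did they use? |
| 8 | Results | Key findings with numbers |
| 9 | Limitations | What do they acknowledge? |
| 10 | Code & Data | Links to artifacts? |
Decision: What If Missing?
Missing dimension:
│
├── Abstract missing ──→ Note that methodology cannot be analyzed
│
├── Results missing ──→ Note "no results in metadata"
│
├── Limitations missing ──→ Note "not stated" (do not infer)
│
└── Dataset unclear ──→ Note "not explicitly stated"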
Step 4: Synthesize
Quality Scoring
Evaluate each paper:
| Factor | Score | Notes |
|---|---|---|
| Venue Quality | | |
| - Top-tier (NeurIPS, ICML, ICLR, Nature, Science) | ⭐⭐⭐ | |
| - Good (AAAI, IJCAI, CVPR, EMNLP, IEEE) | ⭐⭐ | |
| - Other | ⭐ | |
| Citations | | |
| - 100+ | ⭐⭐⭐ | Highly cited |
| - 10-100 | ⭐⭐ | Well-known |
| - <10 | ⭐ | Recent or niche |
| Code Available | | |
| - Yes, official | ⭐⭐⭐ | |
| - Yes, community | ⭐⭐ | |
| - No | ⭐ | |
| Data Available | | |
| - Yes | ⭐⭐⭐ | |
| - No | ⭐ | |
Overall Quality: Sum stars (higher = more established)
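Comparison Table Structure
| Dimension | Paper A | Paper B | ... |
|---|---|---|---|
| Title | ... | ... | ... |
| Authors | ... | ... | ... |
| Year | ... | ... | ... |
| Venue | ... | ... | ... |
| Research Question | ... | ... | ... |
| Methodology | ... | ... | ... |
| Dataset | ... | ... | ... |
| Results | ... | ... | ... |
| Limitations | ... | ... | ... |
| Code & Data | ... | ... | ... |
| Quality Score | [⭐⭐⭐] | [⭐⭐] | ... |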
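Narrative Synthesis Template
Structure:
## Overview
[What problem each paper addresses - high-level]
[Comparison angle: what are we comparing?]
## Methodology Comparison
[Compare techniques - are they compression-based? architecture-based?
What's the key algorithmic difference?
How does the comparison angle affect this?]
## Results Analysis
[Quantitative results - specific numbers, metrics
Performance comparison - trade-offs mentioned
Which paper wins on what?]
## Limitations
[What each paper acknowledges - be honest about gaps]
[What's NOT covered that might matter]
## Research Gaps
[What's MISSING across ALL papers]
[What's not yet explored]
[Potential future directions]
## Quality Assessment
[Paper A: ⭐⭐⭐ - Why]
[Paper B: ⭐⭐ - Why]
[Note any concerns]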
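Step 5: Structured Verdict
Decision Matrix
| If you need... | Choose | Why |
|---|---|---|
| [Best performance] | Paper [X] | [reason] |
| [Easiest to implement] | Paper [X] | [reason] |
| [Newest approach] | Paper [X] | [reason] |
| [Most cited/most reliable] | Paper [X] | [reason] |
| [Code available] | Paper [X] | [reason] |
Final Recommendation
## Verdict
For [user goal]:
- Best overall: [Paper X] - [key reason]
- Best for implementation: [Paper Y] - [key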