Retraction Watcher
A specialized skill for identifying retracted, corrected, or questionable papers in academic reference lists before they compromise research integrity.
When to Use
- Use this skill when the task requires automatically scanning document reference lists and checking them against retraction databases.
- Use this skill for evidence insight tasks that require explicit assumptions, bounded scope, and a reproducible output format.
- Use this skill when you need a documented fallback path for missing inputs, execution errors, or partial evidence.
Key Features
- Scope-focused workflow aligned to: automatically scanning document reference lists and checking them against retraction databases.
- Packaged executable path(s): scripts/main.py.
- Reference material available in references/ for task-specific guidance.
- Structured execution path designed to keep outputs consistent and reviewable.
Dependencies
See Prerequisites below for related details.
- Python: 3.10+. Repository baseline for current packaged skills.
- dataclasses: version unspecified. Declared in requirements.txt.
- pypdf2: version unspecified. Declared in requirements.txt.
Example Usage
bash
python -m py_compile scripts/main.py
## Audit-Ready Commands
Use these concrete commands for validation. They are intentionally self-contained and avoid placeholder paths.
bash
python -m py_compile scripts/main.py
python scripts/main.py --help
## Workflow
1. Confirm the user objective, required inputs, and non-negotiable constraints before doing detailed work.
2. Validate that the request matches the documented scope and stop early if the task would require unsupported assumptions.
3. Use the packaged script path or the documented reasoning path with only the inputs that are actually available.
4. Return a structured result that separates assumptions, deliverables, risks, and unresolved items.
5. If execution fails or inputs are incomplete, switch to the fallback path and state exactly what blocked full completion.
## Purpose
Academic misconduct and errors can lead to paper retractions. Citing retracted work undermines research credibility. This skill:
- Scans reference lists from manuscripts, papers, or bibliographies
- Cross-checks citations against Retraction Watch and other retraction databases
- Identifies papers with retraction notices, expressions of concern, or corrections
- Provides detailed reports with retraction reasons and dates
## Trigger Conditions
Activate this skill when:
1. User provides a document with references and asks to check for retractions
2. User explicitly requests "check my references" or "scan for retracted papers"
3. User submits a bibliography or reference list for verification
4. Pre-submission manuscript review is requested
5. User wants to verify citation integrity
## Input Format
Accepted inputs:
- PDF files (manuscripts, papers, theses)
- Plain text files (.txt, .bib, .ris)
- Raw text containing reference lists
- URLs to papers or reference lists
- Clipboard content with citations
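The accepted inputs above could be routed with a small classifier. This is a minimal sketch, not the logic in scripts/main.py: the function name `classify_input` and the category labels are hypothetical, and only the file extensions come from the list above.

```python
from pathlib import Path
from urllib.parse import urlparse

# Extensions come from the accepted-input list; the kind labels are hypothetical.
FILE_KINDS = {".pdf": "pdf", ".txt": "text", ".bib": "bibtex", ".ris": "ris"}


def classify_input(value: str) -> str:
    """Return 'url', one of the FILE_KINDS labels, or 'raw_text'."""
    stripped = value.strip()
    parsed = urlparse(stripped)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    # Only single-line values are treated as candidate file paths.
    if "\n" not in stripped:
        kind = FILE_KINDS.get(Path(stripped).suffix.lower())
        if kind:
            return kind
    # Anything else (pasted references, clipboard content) is raw text.
    return "raw_text"
```

Clipboard content and pasted reference lists both fall through to `raw_text`, so the caller needs only one parsing path for free-form text.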
## Output Format
### Report Header
🔍 RETRACTION WATCH REPORT
Documents Scanned: [N]
References Found: [N]
Check Date: [YYYY-MM-DD]
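As an illustration, the header fields above could be rendered as follows. This is a sketch only: `render_header` is a hypothetical helper, not a function confirmed to exist in scripts/main.py.

```python
from datetime import date


def render_header(documents_scanned: int, references_found: int, check_date: date) -> str:
    """Build the four header lines in the layout shown above."""
    lines = [
        "🔍 RETRACTION WATCH REPORT",
        f"Documents Scanned: {documents_scanned}",
        f"References Found: {references_found}",
        # date.isoformat() yields YYYY-MM-DD, matching the template.
        f"Check Date: {check_date.isoformat()}",
    ]
    return "\n".join(lines)
```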
Data Sources
- Retraction Watch Database: https://retractionwatch.com/
- Crossref API: https://api.crossref.org/
- PubMed E-utilities: https://www.ncbi.nlm.nih.gov/home/develop/api/
- Open Retractions: https://openretractions.com/
References
See references/ for:
- citation-formats.md: Supported citation format specifications
- api-documentation.md: Database API reference and rate limits
- example-reports/: Sample output reports for testing
Author: AI Assistant
Version: 1.0
Last Updated: 2026-02-06
Status: Ready for use
Requires: Internet connection for database lookups
Risk Assessment
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python scripts with tools | High |
| Network Access | External API calls | High |
| File System Access | Read/write data | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Data handled securely | Medium |
Security Checklist
- [ ] No hardcoded credentials or API keys
- [ ] No unauthorized file system access (../)
- [ ] Output does not expose sensitive information
- [ ] Prompt injection protections in place
- [ ] API requests use HTTPS only
- [ ] Input validated against allowed patterns
- [ ] API timeout and retry mechanisms implemented
- [ ] Output directory restricted to workspace
- [ ] Script execution in sandboxed environment
- [ ] Error messages sanitized (no internal paths exposed)
- [ ] Dependencies audited
- [ ] No exposure of internal service architecture
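Two checklist items above (no `../` traversal, output restricted to the workspace) can be enforced with one path check. The sketch below assumes a hypothetical helper `resolve_in_workspace`; it is not taken from scripts/main.py.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, requested: str) -> Path:
    """Resolve `requested` under `workspace`, rejecting paths that escape it."""
    root = Path(workspace).resolve()
    # Joining an absolute `requested` replaces the root, so the check below
    # also rejects absolute paths outside the workspace.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"path escapes workspace: {requested}")
    return candidate
```

The error message echoes only the user-supplied value, not the resolved internal path, in line with the sanitized-errors item in the checklist.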
Prerequisites
Evaluation Criteria
Success Metrics
- [ ] Successfully executes main functionality
- [ ] Output meets quality standards
- [ ] Handles edge cases gracefully
- [ ] Performance is acceptable
Test Cases
1. Basic Functionality: Standard input → Expected output
2. Edge Case: Invalid input → Graceful error handling
3. Performance: Large dataset → Acceptable processing time
Lifecycle Status
- Current Stage: Draft
- Next Review Date: 2026-03-06
- Known Issues: None
- Planned Improvements:
  - Performance optimization
  - Additional feature support
Output Requirements
Every final response should make these items explicit when they are relevant:
- Objective or requested deliverable
- Inputs used and assumptions introduced
- Workflow or decision path
- Core result, recommendation, or artifact
- Constraints, risks, caveats, or validation needs
- Unresolved items and next-step checks
Error Handling
- If required inputs are missing, state exactly which fields are missing and request only the minimum additional information.
- If the task goes outside the documented scope, stop instead of guessing or silently widening the assignment.
- If scripts/main.py fails, report the failure point, summarize what still can be completed safely, and provide a manual fallback.
- Do not fabricate files, citations, data, search results, or execution outcomes.
Input Validation
This skill accepts requests that match the documented purpose of retraction-watcher and include enough context to complete the workflow safely.
Do not continue the workflow when the request is out of scope, missing a critical input, or would require unsupported assumptions. Instead respond:
retraction-watcher only handles its documented workflow. Please provide the missing required inputs or switch to a more suitable skill.
Response Template
Use the following fixed structure for non-trivial requests:
1. Objective
2. Inputs Received
3. Assumptions
4. Workflow
5. Deliverable
6. Risks and Limits
7. Next Checks
If the request is simple, you may compress the structure, but still keep assumptions and limits explicit when they affect correctness.
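## Implementation Details
See ## Workflow above for related details.
- Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
- Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
- Primary implementation surface: scripts/main.py.
- Reference guidance: references/ contains supporting rules, prompts, or checklists.
- Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
- Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.
## Quick Check
Use this command to verify that the packaged script entry point can be parsed before deeper execution.
bash
python -m py_compile scripts/main.py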
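### Status Categories
🔴 RETRACTED - Paper has been formally retracted
- Reason for retraction
- Date of retraction
- Original DOI/PMID
- Recommended action: remove the citation
🟡 EXPRESSION OF CONCERN - Journal has raised concerns
- Nature of the concern
- Date issued
- Recommended action: verify the current status and consider alternative sources
🟠 CORRECTED - Paper has a published correction or erratum
- Correction details
- Date of correction
- Recommended action: check whether the correction affects the cited claim
🟢 CLEAR - No retraction issues found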
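The four status categories and their recommended actions map naturally onto an enumeration. This is a minimal sketch under assumptions: the names `Status`, `RECOMMENDED_ACTION`, and `describe` are hypothetical, and the "No action needed" text for the clear status is an assumption, since the source lists no action for it.

```python
from enum import Enum


class Status(Enum):
    RETRACTED = "Retracted"
    CONCERN = "Expression of Concern"
    CORRECTED = "Corrected"
    CLEAR = "Clear"


# Actions for the first three statuses follow the categories above.
RECOMMENDED_ACTION = {
    Status.RETRACTED: "Remove the citation",
    Status.CONCERN: "Verify current status; consider alternative sources",
    Status.CORRECTED: "Check whether the correction affects the cited claim",
    Status.CLEAR: "No action needed",  # assumption: not stated in the source
}


def describe(ref_number: int, status: Status) -> str:
    """Format one report line, keeping the original reference number."""
    return f"[{ref_number}] {status.value}: {RECOMMENDED_ACTION[status]}"
```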
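## Technical Approach
### Citation Parsing Strategy
1. Format detection: identify the citation style (APA, MLA, Vancouver, Chicago, etc.)
2. Field extraction: parse DOI, PMID, title, authors, journal, and year
3. Identifier resolution: normalize DOIs (strip prefixes, validate format)
4. Title matching: extract the article title for fuzzy matching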
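### Database Checks
1. Retraction Watch Database - primary source for retraction data
2. Crossref API - retraction metadata via "update-type: retraction"
3. PubMed API - retraction notices via publication-type filters
4. Open Retractions - aggregated retraction data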
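To make the Crossref and PubMed checks concrete, the sketch below only builds query URLs and sends no requests. It rests on assumptions that should be verified against the current API documentation: that the Crossref `/works` endpoint accepts the `updates` and `update-type` filters combined as shown, and that the PubMed ESearch term syntax with the "Retracted Publication" publication type fits this use. Both function names are hypothetical.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"
PUBMED_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def crossref_retraction_query(doi: str) -> str:
    """URL asking Crossref for retraction-type updates that target `doi` (assumed filters)."""
    params = {"filter": f"updates:{doi},update-type:retraction", "rows": "5"}
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


def pubmed_retraction_query(title: str) -> str:
    """URL searching PubMed for records with this title tagged as retracted."""
    term = f'{title}[Title] AND "Retracted Publication"[Publication Type]'
    params = {"db": "pubmed", "term": term, "retmode": "json"}
    return f"{PUBMED_ESEARCH}?{urlencode(params)}"
```

Keeping URL construction separate from the HTTP call makes the queries easy to log and review, and lets the network layer add timeouts, retries, and rate limiting in one place.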
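### Matching Algorithm
- Exact match: DOI/PMID exact match (highest confidence)
- Title match: normalized title comparison (similarity threshold of 90% or higher)
- Author + year: secondary verification for fuzzy matches
- Fuzzy match: handles minor title variations and typos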
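## Difficulty Level
Medium-High - requires:
- Robust citation parsing across multiple formats
- API integration with retraction databases
- Handling of partial or incomplete citation data
- Fuzzy matching for title-based lookups
- Rate limiting and caching of API calls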
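The last requirement, rate limiting and caching, could be handled by a thin wrapper around whatever function performs the lookup. The class below is a hypothetical sketch: `ThrottledCache`, its parameters, and the in-memory dict cache are assumptions, and the injectable `clock` and `sleep` exist only to keep it testable.

```python
import time
from typing import Callable


class ThrottledCache:
    """Cache lookup results and keep a minimum interval between real calls."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._cache: dict[str, dict] = {}
        self._last_call: float | None = None

    def get(self, key: str) -> dict:
        if key in self._cache:
            return self._cache[key]  # cache hit: no API call, no delay
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        result = self._fetch(key)
        self._cache[key] = result
        return result
```

Reference lists often cite the same paper more than once, so the cache saves duplicate lookups as well as helping repeat runs within one process.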
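## Quality Criteria
A successful scan must:
- [ ] Correctly parse more than 90% of citations in standard formats
- [ ] Achieve a false-positive rate below 1% in retraction detection
- [ ] Provide an actionable recommendation for each flagged citation
- [ ] Handle a missing DOI/PMID via a title-matching fallback
- [ ] Complete checks in reasonable time (50 references in under 30 seconds)
- [ ] Preserve reference numbering for easy identification
## Limitations
- Requires an internet connection for database lookups
- Free API tiers may be rate limited
- Very recent retractions (< 48 hours) may not yet be indexed
- Title-only matching may produce false positives for similar titles
- Coverage of non-English papers may be limited
- Preprint citations (arXiv, bioRxiv) usually do not track retraction status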
Check a PDF manuscript:
python scripts/main.py --input manuscript.pdf --format detailed
Check a BibTeX file:
python scripts/main.py --input references.bib --output report.txt
Check raw text:
python scripts/main.py --text "[paste references here]"
Quick check with summary only:
python scripts/main.py --input paper.pdf --format summary
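The commands above imply a small command-line surface. The `argparse` sketch below mirrors the four flags that appear in them (`--input`, `--text`, `--format`, `--output`). It does not reproduce scripts/main.py: the mutual exclusion of `--input` and `--text`, the `detailed` default, and the help strings are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the example commands."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Scan a reference list for retracted or corrected papers.",
    )
    # Assumption: exactly one source of references is supplied per run.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="path to a PDF, .txt, .bib, or .ris file")
    source.add_argument("--text", help="raw reference text to scan")
    parser.add_argument(
        "--format",
        choices=("detailed", "summary"),
        default="detailed",  # assumed default
        help="report verbosity",
    )
    parser.add_argument("--output", help="write the report to this file instead of stdout")
    return parser
```

The authoritative option list is whatever `python scripts/main.py --help` prints.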