Phantom Limb
"The most dangerous dependency is the one that used to exist."
What It Does
Phantom Limb detects ghost references — code that reaches for things that aren't there anymore. Not broken imports (your linter catches those). The subtle kind: environment variables nobody sets, config keys that were renamed three sprints ago, API endpoints that were deprecated but never removed from the client, file paths that point to directories that exist only on the original developer's machine.
Every codebase accumulates phantoms. They don't cause errors — they cause mystery. They're the reason a feature "works everywhere except production." They're the reason onboarding takes two weeks instead of two days.
Why This Exists
Static analysis catches what's wrong. Linters catch what's ugly. Phantom Limb catches what's missing — the negative space between your code and reality.
| Traditional Tools Find | Phantom Limb Finds |
|---|---|
| Broken imports | Imports that resolve but reference dead code paths |
| Syntax errors | Semantically valid references to deleted concepts |
| Unused variables | Used variables that reference phantom state |
| Missing files | Files that exist but contain assumptions from a previous architecture |
| Type mismatches | Types that match but describe something that no longer exists |
The Six Classes of Phantoms
1. Environmental Phantoms
References to environment variables, config files, or system state that no process ever sets.
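```javascript
// This worked when we ran Redis locally
const cache = process.env.REDIS_URL || "redis://localhost:6379";
// Redis was replaced by Memcached 8 months ago.
// Nobody removed this. The fallback runs silently. Against nothing.
```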
Detection method: Cross-reference every process.env, os.environ, ENV[] read against actual .env, .env.example, CI/CD configs, and deployment manifests.
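The cross-referencing step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Phantom Limb's actual implementation: the function names (`env_reads`, `env_definitions`, `environmental_phantoms`) are hypothetical, the regex covers only the three read styles named above, and definitions are taken from `.env`-style text only. A fuller version would also have to parse CI/CD configs and deployment manifests.

```python
import re

NAME = r"([A-Za-z_][A-Za-z0-9_]*)"
QUOTE = r"[\"']"

# The three read styles named above: process.env.X, os.environ["X"] or
# os.environ.get("X"), and ENV["X"].
ENV_READ = re.compile(
    rf"process\.env\.{NAME}"
    rf"|os\.environ(?:\.get\(|\[)\s*{QUOTE}{NAME}{QUOTE}"
    rf"|ENV\[\s*{QUOTE}{NAME}{QUOTE}"
)


def env_reads(source: str) -> set[str]:
    """Return every environment variable name that the source text reads."""
    return {next(g for g in m.groups() if g) for m in ENV_READ.finditer(source)}


def env_definitions(dotenv_text: str) -> set[str]:
    """Return the names assigned in .env-style text (KEY=value lines)."""
    names = set()
    for line in dotenv_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            names.add(line.split("=", 1)[0].removeprefix("export ").strip())
    return names


def environmental_phantoms(
    sources: dict[str, str], definitions: set[str]
) -> dict[str, list[str]]:
    """Map each variable that is read but never defined to the files reading it."""
    phantoms: dict[str, list[str]] = {}
    for path, text in sorted(sources.items()):
        for name in sorted(env_reads(text) - definitions):
            phantoms.setdefault(name, []).append(path)
    return phantoms
```

Given a `sources` mapping of file path to file contents and the set returned by `env_definitions`, the result lists each unset variable once, with every file that still reads it.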
2. Referential Phantoms
Code that references functions, classes, or modules that were moved, renamed, or deleted — but the reference still "works" because a shim, re-export, or fallback catches it.
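```python
# utils.py re-exports calculate_tax for backward compatibility
# Nobody imports calculate_tax from the original location anymore
# But nobody removed the re-export either
# And the original calculate_tax was rewritten. The re-export points to the old version.
from legacy.tax import calculate_tax  # pragma: no cover
```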
Detection method: Trace every import chain to its terminal definition. Flag chains longer than 2 hops. Flag anything with "legacy", "compat", "old", or "deprecated" in the path that has no deprecation deadline.
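As a rough sketch of the chain-tracing rule, assume the re-export graph has already been extracted into a dict that maps each `module:symbol` to the `module:symbol` it re-exports. That representation, the `deadlines` set of modules with a recorded deprecation deadline, and both function names are assumptions made for illustration.

```python
import re

SUSPECT_WORDS = {"legacy", "compat", "old", "deprecated"}


def resolve_chain(symbol: str, reexports: dict[str, str]) -> list[str]:
    """Follow re-exports from `symbol` until reaching a terminal definition."""
    chain = [symbol]
    while chain[-1] in reexports:
        target = reexports[chain[-1]]
        if target in chain:  # cyclic re-export: stop instead of looping forever
            break
        chain.append(target)
    return chain


def referential_flags(
    symbol: str,
    reexports: dict[str, str],
    deadlines: frozenset[str] = frozenset(),
    max_hops: int = 2,
) -> list[str]:
    """Return human-readable flags for one imported symbol."""
    chain = resolve_chain(symbol, reexports)
    flags = []
    hops = len(chain) - 1
    if hops > max_hops:
        flags.append(f"{hops} hops to terminal definition {chain[-1]}")
    for step in chain:
        module = step.split(":", 1)[0]
        # Compare whole path segments so "old" does not match "threshold".
        words = SUSPECT_WORDS & set(re.split(r"[._/]", module.lower()))
        if words and module not in deadlines:
            flags.append(
                f"{module} looks like a shim ({', '.join(sorted(words))}) with no deadline"
            )
    return flags
```

Matching on whole path segments rather than substrings is a deliberate choice here: it trades a few missed shims (for example `oldutils`) for far fewer false positives.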
3. Temporal Phantoms
Code that depends on timing, ordering, or sequencing that was true under a previous architecture but is no longer guaranteed.
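```javascript
// This worked when auth was synchronous middleware
// After the async rewrite, user might not be populated yet
app.get('/dashboard', (req, res) => {
  const name = req.user.displayName; // Sometimes undefined. Sometimes not.
});
```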
Detection method: Map all implicit ordering assumptions. Flag any data access that assumes a prior middleware/hook/lifecycle event has already completed without explicit await/guard.
4. Contractual Phantoms
API contracts, database schemas, or wire formats that the code expects but the other side no longer honors.
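```python
# payments API v2 removed the discount_code field
# Our code still sends it. The API silently ignores it.
# Nobody knows discounts have been broken for 3 months.
payload = {
    "amount": total,
    "discount_code": user.discount,  # Phantom. Silently ignored.
}
```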
Detection method: Compare every outbound payload construction against the latest API schema/docs. Compare every database query against the current schema. Flag fields that are constructed but never consumed.
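The payload comparison might look like the sketch below. It assumes the current API schema has been reduced to a nested dict of declared field names (leaf fields map to `None`). That shape and the name `sent_but_ignored` are illustrative assumptions, not a format Phantom Limb defines.

```python
def sent_but_ignored(payload: dict, schema: dict, prefix: str = "") -> list[str]:
    """Return dotted paths of payload fields that the schema does not declare.

    `schema` maps each declared field name to None (leaf) or to a nested
    dict of the same shape (object).
    """
    phantoms = []
    for key, value in payload.items():
        path = f"{prefix}{key}"
        if key not in schema:
            phantoms.append(path)
        elif isinstance(value, dict) and isinstance(schema[key], dict):
            # Recurse into nested objects so deep phantom fields are found too.
            phantoms.extend(sent_but_ignored(value, schema[key], path + "."))
    return phantoms


# Hypothetical v2 schema: discount_code is no longer declared.
payments_v2 = {"amount": None, "customer": {"id": None}}
payload = {"amount": 100, "discount_code": "SAVE10", "customer": {"id": 7, "fax": "n/a"}}
print(sent_but_ignored(payload, payments_v2))  # ['discount_code', 'customer.fax']
```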
5. Intentional Phantoms
Comments, TODOs, and documentation that describe behavior the code no longer exhibits. The specification has become a ghost story.
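```java
/**
 * Retries up to 3 times with exponential backoff.
 * Falls back to cache on failure.
 */
// Retry logic was removed in PR #847. Cache fallback was never implemented.
public Response fetchData() {
    return client.get(url); // One attempt. No retry. No fallback.
}
```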
Detection method: Parse doc comments and compare claimed behavior against actual implementation. Flag docstrings that mention patterns (retry, cache, fallback, queue, batch) that don't appear in the method body.
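For Python sources, a first approximation of this check fits in the standard `ast` module. The sketch below is a keyword heuristic only: it matches word stems (an assumption, so that "Retries" counts as "retry") and compares the docstring against the unparsed function body. The name `intentional_phantoms` is hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

# Pattern name -> stem searched for in both docstring and body.
PATTERN_STEMS = {
    "retry": "retr",
    "cache": "cach",
    "fallback": "fallback",
    "queue": "queu",
    "batch": "batch",
}


def intentional_phantoms(source: str) -> dict[str, list[str]]:
    """Map function names to patterns their docstring claims but their body lacks."""
    findings: dict[str, list[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        doc = ast.get_docstring(node)
        if not doc:
            continue
        # node.body[0] is the docstring expression itself, so skip it.
        body = "\n".join(ast.unparse(stmt) for stmt in node.body[1:]).lower()
        missing = [
            name
            for name, stem in PATTERN_STEMS.items()
            if stem in doc.lower() and stem not in body
        ]
        if missing:
            findings[node.name] = missing
    return findings
```

A function whose docstring promises retries and a cache but whose body is a single `return client.get(url)` would be reported with `["retry", "cache"]`. The heuristic cannot tell whether a called helper implements the behavior, so hits are candidates for review rather than verdicts.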
6. Identity Phantoms
Variables, functions, or modules whose names describe something they no longer do. The name is a phantom of their original purpose.
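```go
// This is a temporary cache. Three years ago.
func getTempCache() *PermanentStore {
    return &PermanentStore{ttl: 0} // TTL of zero = lives forever
}
```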
Detection method: Semantic analysis of identifier names vs. their actual behavior. Flag contradictions between name semantics and implementation semantics (e.g., temp + no expiry, async + synchronous execution, safe + no error handling).
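Two of the contradictions listed above (async + synchronous execution, safe + no error handling) can be approximated for Python code with a purely syntactic check. This is a deliberately crude sketch: `name_tokens` and `identity_phantoms` are hypothetical names, and a plain `def` that returns an awaitable, or a "safe" function that delegates error handling to a callee, would be flagged incorrectly.

```python
import ast
import re


def name_tokens(name: str) -> set[str]:
    """Split snake_case and camelCase identifiers into lowercase words."""
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z0-9]*", name)}


def identity_phantoms(source: str) -> list[tuple[str, str]]:
    """Return (function name, contradiction) pairs found in the source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        tokens = name_tokens(node.name)
        # "async" in the name, but the function is a plain synchronous def.
        if "async" in tokens and isinstance(node, ast.FunctionDef):
            findings.append((node.name, "named async but defined as a plain def"))
        # "safe" in the name, but no try statement anywhere in the body.
        if "safe" in tokens and not any(isinstance(n, ast.Try) for n in ast.walk(node)):
            findings.append((node.name, "named safe but contains no try/except"))
    return findings
```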
How It Works
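```
PHASE 1: EXCAVATION
├── Scan all source files for external references
├── Build a reference graph (what points to what)
├── Map all environment reads, config lookups, API calls
└── Catalog all import chains and their terminal definitions

PHASE 2: REALITY CHECK
├── Cross-reference against actual environment state
├── Compare API contracts against current schemas
├── Trace import chains to detect phantom re-exports
└── Compare documentation claims against implementation

PHASE 3: PHANTOM CLASSIFICATION
├── Classify each phantom by type (1-6 above)
├── Score severity (silent failure vs. loud failure vs. latent)
├── Estimate blast radius (how many code paths are affected)
└── Calculate haunting duration (how long this has been a phantom)

PHASE 4: EXORCISM REPORT
├── Prioritized list of phantoms ranked by severity × blast radius
├── For each phantom: what it references, what actually exists, and what to do
├── Quick-fix suggestions for each class
└── Dependency reality map (what your code thinks exists vs. what actually exists)
```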
Severity Scoring
| Severity | Description | Example |
|---|---|---|
| Critical | Phantom causes silent data loss or corruption | API field silently ignored, data never saved |
| High | Phantom causes intermittent failures | Temporal phantom, race condition with ghost state |
| Medium | Phantom causes confusion but no runtime errors | Identity phantom, misleading names |
| Low | Phantom is inert but adds cognitive load | Dead re-exports, orphaned configs |
| Vestigial | Phantom is harmless but indicates architectural rot | TODO comments from 2+ years ago |
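One possible way to turn these levels into an ordering is to weight each level and multiply by blast radius (the number of affected code paths). The numeric weights below are assumptions chosen for illustration; the document defines the five levels but assigns them no numbers. `Phantom` and `rank` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed weights: only the ordering of the levels comes from the table above.
SEVERITY_WEIGHT = {"critical": 5, "high": 4, "medium": 3, "low": 2, "vestigial": 1}


@dataclass(frozen=True)
class Phantom:
    kind: str          # one of the six classes, e.g. "temporal"
    summary: str
    severity: str      # key into SEVERITY_WEIGHT
    blast_radius: int  # number of affected code paths

    @property
    def priority(self) -> int:
        return SEVERITY_WEIGHT[self.severity] * self.blast_radius


def rank(phantoms: list[Phantom]) -> list[Phantom]:
    """Order phantoms by severity weight times blast radius, highest first."""
    return sorted(phantoms, key=lambda p: p.priority, reverse=True)


found = [
    Phantom("environmental", "REDIS_URL never set", "critical", 4),    # 5 * 4 = 20
    Phantom("temporal", "req.user read before auth", "high", 6),       # 4 * 6 = 24
    Phantom("identity", "getTempCache never expires", "medium", 1),    # 3 * 1 = 3
]
print([p.kind for p in rank(found)])  # ['temporal', 'environmental', 'identity']
```

With a plain product, a wide-reaching High phantom can outrank a narrow Critical one, as in the example. Whether that is desirable depends on the team; steeper weights would keep Critical findings on top.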
Output Format
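```
╔══════════════════════════════════════════════════════════════╗
║                      PHANTOM LIMB SCAN                       ║
║                     12 phantoms detected                     ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  CRITICAL (2)                                                ║
║  ├── [CONTRACTUAL] POST /api/payments sends discount_code    ║
║  │   → Field removed in API v2 (2024-11-03)                  ║
║  │   → 3 months of silent discount failures                  ║
║  │   → Fix: remove field from payload builder                ║
║  │                                                           ║
║  ├── [ENVIRONMENTAL] REDIS_URL referenced in 4 files         ║
║  │   → No process sets this variable                         ║
║  │   → Fallback to localhost:6379 connects to nothing        ║
║  │   → Fix: remove Redis reference, use Memcached client     ║
║  │                                                           ║
║  HIGH (3)                                                    ║
║  ├── [TEMPORAL] req.user accessed before auth middleware     ║
║  │   ...                                                     ║
╚══════════════════════════════════════════════════════════════╝
```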
Integration
Invoke when:
- Onboarding a new developer (show them where the ghosts live)
- After a major refactor (find what the refactor left behind)
- Before a production deploy (catch phantoms before users do)
- During architecture review (map the gap between intent and reality)
Why It Matters
Every codebase has a phantom architecture — the system it thinks it is, layered on top of the system it actually is. The gap between these two architectures is where bugs hide, onboarding stalls, and technical debt compounds silently.
Phantom Limb doesn't find bugs. It finds the conditions that make bugs inevitable.
Zero external dependencies. Zero API calls. Pure structural analysis.
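Phantom Limb
"The most dangerous dependency is the one that used to exist."
What It Does
Phantom Limb detects ghost references: code that reaches for things that aren't there anymore. Not broken imports (your linter catches those). The subtle kind: environment variables nobody sets, config keys that were renamed three sprints ago, API endpoints that were deprecated but never removed from the client, file paths that point to directories that exist only on the original developer's machine.
Every codebase accumulates phantoms. They don't cause errors; they cause mystery. They're the reason a feature "works everywhere except production." They're the reason onboarding takes two weeks instead of two days.
Why This Exists
Static analysis catches what's wrong. Linters catch what's ugly. Phantom Limb catches what's missing: the negative space between your code and reality.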
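| Traditional Tools Find | Phantom Limb Finds |
|---|---|
| Broken imports | Imports that resolve but reference dead code paths |
| Syntax errors | Semantically valid references to deleted concepts |
| Unused variables | Used variables that reference phantom state |
| Missing files | Files that exist but contain assumptions from a previous architecture |
| Type mismatches | Types that match but describe something that no longer exists |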
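The Six Classes of Phantoms
1. Environmental Phantoms
References to environment variables, config files, or system state that no process ever sets.
```javascript
// This worked when we ran Redis locally
const cache = process.env.REDIS_URL || "redis://localhost:6379";
// Redis was replaced by Memcached 8 months ago.
// Nobody removed this. The fallback runs silently. Against nothing.
```
Detection method: Cross-reference every process.env, os.environ, ENV[] read against actual .env, .env.example, CI/CD configs, and deployment manifests.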
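2. Referential Phantoms
Code that references functions, classes, or modules that were moved, renamed, or deleted, where the reference still "works" because a shim, re-export, or fallback catches it.
```python
# utils.py re-exports calculate_tax for backward compatibility
# Nobody imports calculate_tax from the original location anymore
# But nobody removed the re-export either
# And the original calculate_tax was rewritten. The re-export points to the old version.
from legacy.tax import calculate_tax  # pragma: no cover
```
Detection method: Trace every import chain to its terminal definition. Flag chains longer than 2 hops. Flag anything with "legacy", "compat", "old", or "deprecated" in the path that has no deprecation deadline.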
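3. Temporal Phantoms
Code that depends on timing, ordering, or sequencing that was true under a previous architecture but is no longer guaranteed.
```javascript
// This worked when auth was synchronous middleware
// After the async rewrite, user might not be populated yet
app.get('/dashboard', (req, res) => {
  const name = req.user.displayName; // Sometimes undefined. Sometimes not.
});
```
Detection method: Map all implicit ordering assumptions. Flag any data access that assumes a prior middleware/hook/lifecycle event has already completed without explicit await/guard.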
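4. Contractual Phantoms
API contracts, database schemas, or wire formats that the code expects but the other side no longer honors.
```python
# payments API v2 removed the discount_code field
# Our code still sends it. The API silently ignores it.
# Nobody knows discounts have been broken for 3 months.
payload = {
    "amount": total,
    "discount_code": user.discount,  # Phantom. Silently ignored.
}
```
Detection method: Compare every outbound payload construction against the latest API schema/docs. Compare every database query against the current schema. Flag fields that are constructed but never consumed.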
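5. Intentional Phantoms
Comments, TODOs, and documentation that describe behavior the code no longer exhibits. The specification has become a ghost story.
```java
/**
 * Retries up to 3 times with exponential backoff.
 * Falls back to cache on failure.
 */
// Retry logic was removed in PR #847. Cache fallback was never implemented.
public Response fetchData() {
    return client.get(url); // One attempt. No retry. No fallback.
}
```
Detection method: Parse doc comments and compare claimed behavior against actual implementation. Flag docstrings that mention patterns (retry, cache, fallback, queue, batch) that don't appear in the method body.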
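6. Identity Phantoms
Variables, functions, or modules whose names describe something they no longer do. The name is a phantom of their original purpose.
```go
// This is a temporary cache. Three years ago.
func getTempCache() *PermanentStore {
    return &PermanentStore{ttl: 0} // TTL of zero = lives forever
}
```
Detection method: Semantic analysis of identifier names vs. their actual behavior. Flag contradictions between name semantics and implementation semantics (e.g., temp + no expiry, async + synchronous execution, safe + no error handling).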
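How It Works
```
PHASE 1: EXCAVATION
├── Scan all source files for external references
├── Build a reference graph (what points to what)
├── Map all environment reads, config lookups, API calls
└── Catalog all import chains and their terminal definitions

PHASE 2: REALITY CHECK
├── Cross-reference against actual environment state
├── Compare API contracts against current schemas
├── Trace import chains to detect phantom re-exports
└── Compare documentation claims against implementation

PHASE 3: PHANTOM CLASSIFICATION
├── Classify each phantom by type (1-6 above)
├── Score severity (silent failure vs. loud failure vs. latent)
├── Estimate blast radius (how many code paths are affected)
└── Calculate haunting duration (how long this has been a phantom)

PHASE 4: EXORCISM REPORT
├── Prioritized list of phantoms ranked by severity × blast radius
├── For each phantom: what it references, what actually exists, and what to do
├── Quick-fix suggestions for each class
└── Dependency reality map (what your code thinks exists vs. what actually exists)
```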
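Severity Scoring
| Severity | Description | Example |
|---|---|---|
| Critical | Phantom causes silent data loss or corruption | API field silently ignored, data never saved |
| High | Phantom causes intermittent failures | Temporal phantom, race condition with ghost state |
| Medium | Phantom causes confusion but no runtime errors | Identity phantom, misleading names |
| Low | Phantom is inert but adds cognitive load | Dead re-exports, orphaned configs |
| Vestigial | Phantom is harmless but indicates architectural rot | TODO comments from 2+ years ago |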
Output Format
```
╔══════════════════════════════════════════════════════════════╗
║                      PHANTOM LIMB SCAN                       ║
║                     12 phantoms detected                     ║
╠══════════════════════════════════════════════════════════════╣
║                                                              ║
║  CRITICAL (2)                                                ║
║  ├── [CONTRACTUAL] POST /api/payments sends discount_code    ║
║  │   → Field removed in API v2 (2024-11-03)                  ║
║  │   → 3 months of silent discount failures                  ║
║  │   → Fix: remove field from payload builder                ║
║  │                                                           ║
║  ├── [ENVIRONMENTAL] REDIS_URL referenced in 4 files         ║
║  │   → No process sets this variable                         ║
║  │   → Fallback to localhost:6379 connects to nothing        ║
║  │   → Fix: remove Redis reference, use Memcached client     ║
║  │                                                           ║
║  HIGH (3)                                                    ║
║  ├── [TEMPORAL] req.user accessed before auth middleware     ║
║  │   ...                                                     ║
╚══════════════════════════════════════════════════════════════╝
```
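Integration
Invoke when:
- Onboarding a new developer (show them where the ghosts live)
- After a major refactor (find what the refactor left behind)
- Before a production deploy (catch phantoms before users do)
- During architecture review (map the gap between intent and reality)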
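Why It Matters
Every codebase has a phantom architecture: the system it thinks it is, layered on top of the system it actually is. The gap between these two architectures is where bugs hide, onboarding stalls, and technical debt compounds silently.
Phantom Limb doesn't find bugs. It finds the conditions that make bugs inevitable.
Zero external dependencies. Zero API calls. Pure structural analysis.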