A Skill Passes Audit in Gen 1. By Gen 5, It Has Network Access. Nobody Noticed.
Helps detect silent mutations in AI skills as they propagate through inheritance chains, catching drift that static analysis of the original version would miss.
Problem
Skill A is published and audited: clean. Agent B inherits skill A, makes a small tweak — adds a convenience function. Agent C inherits from B, adds error handling that happens to include an HTTP retry mechanism. Agent D inherits from C, and now has a skill with network access that the original audit never saw.
Each individual change is small and reasonable. But the cumulative drift transforms a file-reading utility into something that can send data over the network. The original "verified safe" badge still applies in the marketplace — because technically it's the same skill lineage.
This is evolutionary drift: small, individually benign mutations that accumulate into a fundamentally different organism. In biology, this is how species diverge. In agent ecosystems, this is how safe skills become unsafe ones without anyone raising a flag.
What This Checks
This detector traces skill lineage and computes semantic drift:
- Lineage reconstruction: Given a skill, trace its inheritance chain back to the original published version, mapping each fork point and modification.
- Per-generation diff: For each generation, compute a structured diff covering new capabilities added, permissions changed, and external dependencies introduced.
- Capability drift score: Aggregate the diffs across generations into a single drift metric. A skill that gained network access over 3 generations scores higher than one where only comments changed.
- Mutation classification: Categorize each change as cosmetic (formatting, comments), functional (new logic), capability-expanding (new permissions, new external calls), or safety-reducing (removed checks, weakened validation).
- Drift alert thresholds: Flag lineages whose cumulative drift exceeds the scope of the original audit, for example: "This skill has drifted 73% from the audited version."
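The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the detector's actual implementation: the `SkillVersion` record, the `registry` mapping, the capability and check names, and the `WEIGHTS` values are all assumptions made for the example. It reconstructs a lineage by following parent pointers, diffs the declared capability and check sets of adjacent generations, and sums hypothetical weights into a score capped at 100. Cosmetic and functional changes would need source-level analysis, so this set-based diff does not emit them.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple

# Hypothetical severity weights per mutation class. The classes come from the
# list above; the numbers are assumptions chosen for illustration.
WEIGHTS = {
    "cosmetic": 0,
    "functional": 5,
    "capability-expanding": 25,
    "safety-reducing": 20,
}


@dataclass(frozen=True)
class SkillVersion:
    slug: str
    parent: Optional[str]          # slug this version was forked from, if any
    capabilities: FrozenSet[str]   # declared capabilities (assumed metadata)
    checks: FrozenSet[str] = frozenset()  # declared safety checks (assumed)


def reconstruct_lineage(registry: Dict[str, SkillVersion], slug: str) -> List[SkillVersion]:
    """Follow parent pointers back to the root; return versions oldest-first."""
    chain: List[SkillVersion] = []
    seen = set()
    current: Optional[str] = slug
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in lineage at {current!r}")
        if current not in registry:
            # Fork metadata is missing, so the chain cannot be recovered.
            raise KeyError(f"no fork metadata for {current!r}")
        seen.add(current)
        version = registry[current]
        chain.append(version)
        current = version.parent
    chain.reverse()
    return chain


def diff_generation(old: SkillVersion, new: SkillVersion) -> List[Tuple[str, str]]:
    """Return (change, mutation class) pairs between two adjacent generations."""
    mutations = []
    for cap in sorted(new.capabilities - old.capabilities):
        mutations.append((f"+{cap}", "capability-expanding"))
    for check in sorted(old.checks - new.checks):
        mutations.append((f"-{check}", "safety-reducing"))
    return mutations


def drift_score(chain: List[SkillVersion]) -> int:
    """Sum the weights of all mutations along the chain, capped at 100."""
    total = 0
    for old, new in zip(chain, chain[1:]):
        total += sum(WEIGHTS[kind] for _, kind in diff_generation(old, new))
    return min(100, total)


registry = {
    "gen1": SkillVersion("gen1", None, frozenset({"file_read"}),
                         frozenset({"input_length_check"})),
    "gen2": SkillVersion("gen2", "gen1", frozenset({"file_read", "http_requests"}),
                         frozenset({"input_length_check"})),
    "gen3": SkillVersion("gen3", "gen2", frozenset({"file_read", "http_requests"})),
}
chain = reconstruct_lineage(registry, "gen3")
print([v.slug for v in chain])  # ['gen1', 'gen2', 'gen3']
print(drift_score(chain))       # 45
```

Raising on a missing registry entry, rather than returning a partial chain, keeps a broken lineage from being scored as if it were complete.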
How to Use
Input: Provide one of:
- A skill slug or identifier to trace its full lineage
- Two versions of a skill to compute drift between them
- A marketplace inheritance chain URL
Output: A drift analysis report containing:
- Lineage tree with generation markers
- Per-generation diff summary
- Capability drift score (0-100)
- Mutation classification breakdown
- Re-audit recommendation: YES / WATCH / NO
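One way the final recommendation could be derived from the score is a pair of thresholds plus a scope check. The sketch below is illustrative only: the function name, the threshold values (25 and 50), and the rule that any capability outside the audited scope forces a re-audit are assumptions, not documented behavior of the detector.

```python
def recommend_reaudit(score, audited_scope, current_scope,
                      watch_at=25, reaudit_at=50):
    """Map a 0-100 drift score and two capability sets to YES / WATCH / NO.

    The thresholds are hypothetical defaults chosen for this example.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be in the range 0-100")
    # Capabilities the current version has that the audit never covered.
    unaudited = set(current_scope) - set(audited_scope)
    if unaudited or score >= reaudit_at:
        return "YES"
    if score >= watch_at:
        return "WATCH"
    return "NO"


audited = {"file_read", "string_transform"}
print(recommend_reaudit(78, audited, audited | {"http_requests"}))  # YES
print(recommend_reaudit(30, audited, audited))                      # WATCH
print(recommend_reaudit(5, audited, audited))                       # NO
```

Checking scope separately from the score means a single new capability triggers a re-audit even when the aggregate score is still low.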
Example
Input: Check drift for data-sanitizer skill (currently at generation 5)
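🧬 EVOLUTION DRIFT REPORT: RE-AUDIT RECOMMENDED
Lineage: data-sanitizer
  Gen 1: @securitylab original (audited ✅ 2025-03-15)
  Gen 2: @toolsmith fork, added CSV support
  Gen 3: @agent-builder fork, added retry logic with HTTP fallback
  Gen 4: @pipeline-dev fork, added remote schema fetch
  Gen 5: @data-team fork, current marketplace version
Per-generation capability changes:
  Gen 1→2: +csv_parsing (functional, low risk)
  Gen 2→3: +http_requests (capability-expanding, medium risk)
    Added a retry mechanism that makes outbound HTTP calls
  Gen 3→4: +remote_fetch (capability-expanding, high risk)
    Fetches validation schemas from external URLs
  Gen 4→5: -input_length_check (safety-reducing, medium risk)
    Removed input size validation, citing performance
Capability drift score: 78/100 (significant)
Mutation breakdown:
  Cosmetic: 12 changes
  Functional: 8 changes
  Capability-expanding: 2 changes ⚠️
  Safety-reducing: 1 change ⚠️
Original audit scope: file reading, string transforms
Current actual scope: file reading, string transforms, HTTP requests, remote fetch, unbounded input
Verdict: RE-AUDIT RECOMMENDED
The current version has capabilities that did not exist at the original audit (network access, remote fetch).
The generation 1 verified badge does not cover generation 5 behavior.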
Related Tools
- blast-radius-estimator: once drift is detected, use it to estimate how many agents are running the drifted version
- trust-decay-monitor: tracks time-based decay of audit validity, whereas evolution-drift-detector tracks content-based decay across inheritance
- hollow-validation-checker: checks whether validation tests are substantive; drifted skills may pass original tests that no longer cover their current capabilities
- supply-chain-poison-detector: detects deliberately poisoned skills, whereas drift detection catches unintentional accumulation of risk
Limitations
Lineage reconstruction depends on marketplace metadata quality — if fork relationships are not tracked, the full chain may not be recoverable. Capability drift scoring uses heuristic classification of changes, and some mutations may be miscategorized (e.g., a "functional" change that implicitly expands capabilities). The detector analyzes what changed, not whether changes are malicious — a high drift score means re-audit is warranted, not that the skill is compromised. Skills with obfuscated or dynamically generated code may resist diff analysis. This tool helps identify where audits have gone stale — it does not replace human security review.
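As a rough illustration of the dynamic-code limitation, a report could at least flag generations whose source builds or loads code at runtime, so a reviewer knows the diff for that generation is less trustworthy. The sketch below assumes the skill is written in Python and uses the standard `ast` module; the `DYNAMIC_CALLS` list and the `dynamic_constructs` helper are hypothetical, and the check only sees direct calls by bare name (it would miss, for example, `importlib.import_module` or an aliased `exec`).

```python
import ast

# Hypothetical list of built-in call names treated as signs of dynamic code.
DYNAMIC_CALLS = {"eval", "exec", "compile", "__import__"}


def dynamic_constructs(source):
    """Return sorted (line, name) pairs for direct calls to DYNAMIC_CALLS."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # Only plain-name calls such as exec(...) are matched here.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DYNAMIC_CALLS:
                found.append((node.lineno, node.func.id))
    return sorted(found)


sample = "import os\npayload = 'print(1)'\nexec(payload)\n"
print(dynamic_constructs(sample))  # [(3, 'exec')]
```

An empty result does not mean the code is free of dynamic behavior; it only means none of the listed names were called directly.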
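Skill name: evolution-drift-detector
Detailed description:
A skill that passed audit in generation 1 has network access by generation 5, and nobody noticed.
Helps detect silent mutations in AI skills as they propagate through inheritance chains, catching drift that static analysis of the original version would miss.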
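Problem
Skill A is published and audited: clean. Agent B inherits skill A and makes a small tweak, adding a convenience function. Agent C inherits from B and adds error handling that happens to include an HTTP retry mechanism. Agent D inherits from C and now has a skill with network access that the original audit never saw.
Each individual change is small and reasonable. But the cumulative drift turns a file-reading utility into something that can send data over the network. The marketplace still shows the original "verified safe" badge, because technically it belongs to the same skill lineage.
This is evolutionary drift: small, individually benign mutations that accumulate into a fundamentally different organism. In biology, this is how species diverge. In agent ecosystems, this is how safe skills become unsafe ones without anyone raising a flag.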
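What This Checks
This detector traces skill lineage and computes semantic drift:
- Lineage reconstruction: Given a skill, trace its inheritance chain back to the original published version, mapping each fork point and modification.
- Per-generation diff: For each generation, compute a structured diff covering new capabilities added, permissions changed, and external dependencies introduced.
- Capability drift score: Aggregate the diffs across generations into a single drift metric. A skill that gained network access over three generations scores higher than one where only comments changed.
- Mutation classification: Categorize each change as cosmetic (formatting, comments), functional (new logic), capability-expanding (new permissions, new external calls), or safety-reducing (removed checks, weakened validation).
- Drift alert thresholds: Flag lineages whose cumulative drift exceeds the scope of the original audit, for example: "This skill has drifted 73% from the audited version."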
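How to Use
Input: Provide one of:
- A skill slug or ID to trace its full lineage
- Two versions of a skill to compute drift between them
- A marketplace inheritance chain URL
Output: A drift analysis report containing:
- Lineage tree with generation markers
- Per-generation diff summary
- Capability drift score (0-100)
- Mutation classification breakdown
- Re-audit recommendation: YES / WATCH / NO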
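Example
Input: Check drift for data-sanitizer skill (currently at generation 5)
🧬 EVOLUTION DRIFT REPORT: RE-AUDIT RECOMMENDED
Lineage: data-sanitizer
  Gen 1: @securitylab original (audited ✅ 2025-03-15)
  Gen 2: @toolsmith fork, added CSV support
  Gen 3: @agent-builder fork, added retry logic with HTTP fallback
  Gen 4: @pipeline-dev fork, added remote schema fetch
  Gen 5: @data-team fork, current marketplace version
Per-generation capability changes:
  Gen 1→2: +csv_parsing (functional, low risk)
  Gen 2→3: +http_requests (capability-expanding, medium risk)
    Added a retry mechanism that makes outbound HTTP calls
  Gen 3→4: +remote_fetch (capability-expanding, high risk)
    Fetches validation schemas from external URLs
  Gen 4→5: -input_length_check (safety-reducing, medium risk)
    Removed input size validation, citing performance
Capability drift score: 78/100 (significant)
Mutation breakdown:
  Cosmetic: 12 changes
  Functional: 8 changes
  Capability-expanding: 2 changes ⚠️
  Safety-reducing: 1 change ⚠️
Original audit scope: file reading, string transforms
Current actual scope: file reading, string transforms, HTTP requests, remote fetch, unbounded input
Verdict: RE-AUDIT RECOMMENDED
The current version has capabilities that did not exist at the original audit (network access, remote fetch).
The generation 1 verified badge does not cover generation 5 behavior.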
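Related Tools
- blast-radius-estimator: once drift is detected, use it to estimate how many agents are running the drifted version
- trust-decay-monitor: tracks time-based decay of audit validity, whereas evolution-drift-detector tracks content-based decay across inheritance
- hollow-validation-checker: checks whether validation tests are substantive; drifted skills may pass original tests that no longer cover their current capabilities
- supply-chain-poison-detector: detects deliberately poisoned skills, whereas drift detection catches unintentional accumulation of risk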
Limitations
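Lineage reconstruction depends on the quality of marketplace metadata: if fork relationships are not tracked, the full chain may not be recoverable. Capability drift scoring uses heuristic classification of changes, and some mutations may be miscategorized (for example, a "functional" change that implicitly expands capabilities). The detector analyzes what changed, not whether the changes are malicious, so a high drift score means a re-audit is warranted, not that the skill is compromised. Skills with obfuscated or dynamically generated code may resist diff analysis. This tool helps identify where audits have gone stale; it does not replace human security review.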