Your Trust Score Is Real. The Signals Behind It Are Manufactured.
Helps identify when a skill's trust reputation is built on coordinated
social manipulation rather than genuine community validation.
Problem
Trust in agent marketplaces flows through social signals: upvotes, downloads,
comments, and follow counts. These signals are valuable precisely because they
aggregate distributed judgment — when thousands of independent users find a
skill useful and safe, their collective assessment carries real information.
The assumption of independence is the attack surface. A coordinated network
of accounts can manufacture the appearance of distributed consensus. A skill
with 500 upvotes from a bot network looks identical to a skill with 500
upvotes from 500 independent developers. The marketplace's reputation system
cannot distinguish manufactured trust from earned trust — and neither can
most agents that rely on reputation as a trust signal.
Social trust manipulation is the third pillar of the trust attack surface,
alongside technical attacks (code injection) and structural attacks (supply
chain compromise). It is the most scalable: a well-constructed sockpuppet
network can manufacture trust faster than any code-level auditing can catch
it, and the manufactured trust persists long after the network is dismantled.
Legitimate skills earn trust gradually, from a diverse user base, with
engagement patterns that correlate with actual skill utility. Manipulated
skills earn trust in coordinated bursts, from accounts with suspicious
creation patterns, with engagement that does not correlate with usage or
outcomes.
What This Detects
This detector examines social trust integrity across five dimensions:
1. Engagement velocity anomalies: Does the skill's vote/download
   trajectory show natural growth curves or coordinated burst patterns?
   Organic trust accumulates gradually; manufactured trust arrives in
   synchronized bursts that are statistically distinguishable from
   random arrival processes.
2. Account cohort analysis: Do the skill's early upvoters share
   creation dates, activity patterns, or cross-voting behavior that suggests
   coordinated rather than independent operation? Sockpuppet networks leave
   structural fingerprints in how accounts relate to each other.
3. Engagement-to-utility correlation: Do social signals correlate with
   actual skill usage metrics? High upvotes on skills with low actual
   install rates, or high engagement from users who only interact with one
   publisher's skills, are signals of manufactured rather than genuine trust.
4. Cross-publisher coordination: Do multiple publishers in a marketplace
   show correlated voting patterns, where their respective supporter networks
   upvote each other's skills at rates that exceed a random baseline?
   Coordinated mutual-support networks amplify manufactured trust across
   multiple accounts simultaneously.
5. Review authenticity signals: Do comments and reviews on the skill
   show the linguistic diversity and specificity expected from independent
   users, or do they share vocabulary, complaint patterns, or phrasing
   that suggests template-generated or coordinated content?
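Two of these checks reduce to simple statistics, and a minimal sketch may make them concrete. It assumes vote times are available as hours since publication and voters as account identifiers. The function names and the threshold of 3.0 are illustrative assumptions, not part of this tool. The first check uses the variance-to-mean ratio of hourly vote counts, which stays near 1 for a random (Poisson) arrival process and grows large when votes cluster. The second measures what share of one skill's upvoters also upvoted another skill.

```python
from collections import Counter
from statistics import mean, pvariance


def hourly_counts(timestamps_hours, window_hours):
    """Bucket vote times (hours since publication) into whole-hour bins."""
    bins = Counter(int(t) for t in timestamps_hours if 0 <= t < window_hours)
    return [bins.get(h, 0) for h in range(window_hours)]


def dispersion_index(counts):
    """Variance-to-mean ratio of the counts; near 1 for Poisson arrivals."""
    m = mean(counts)
    return pvariance(counts) / m if m > 0 else 0.0


def looks_bursty(timestamps_hours, window_hours=72, threshold=3.0):
    """Flag over-dispersed arrivals. The threshold is an uncalibrated assumption."""
    counts = hourly_counts(timestamps_hours, window_hours)
    return dispersion_index(counts) > threshold


def overlap_share(voters_a, voters_b):
    """Share of A's upvoters who also upvoted B (not symmetric)."""
    a = set(voters_a)
    return len(a & set(voters_b)) / len(a) if a else 0.0


# One vote per hour for 72 hours versus 72 votes inside a single hour.
steady = [h + 0.5 for h in range(72)]
burst = [10 + i / 100 for i in range(72)]
print(looks_bursty(steady), looks_bursty(burst))
```

A high dispersion index only says the arrivals are clustered; it cannot say why, so a legitimate launch spike would trip the same flag.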
How to Use
Input: Provide one of:
- A skill identifier to assess the authenticity of its trust signals
- A publisher account to analyze for coordinated network membership
- A set of skills to assess for cross-publisher coordination patterns
Output: A manipulation detection report containing:
- Engagement velocity analysis (organic vs. burst pattern)
- Account cohort fingerprint assessment
- Engagement-to-utility correlation score
- Cross-publisher coordination indicators
- Review authenticity assessment
- Manipulation verdict: AUTHENTIC / SUSPICIOUS / COORDINATED / MANUFACTURED
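One way to picture the report is as a small typed record. The sketch below is an assumption about shape only: the class and field names are hypothetical, and the source does not say how the verdict is derived from the individual findings, so the sketch stores the verdict rather than computing it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    """The four verdict labels listed in the report description."""
    AUTHENTIC = "AUTHENTIC"
    SUSPICIOUS = "SUSPICIOUS"
    COORDINATED = "COORDINATED"
    MANUFACTURED = "MANUFACTURED"


@dataclass
class DimensionFinding:
    """Result for one dimension, e.g. engagement velocity (hypothetical shape)."""
    name: str
    flagged: bool
    evidence: list[str] = field(default_factory=list)


@dataclass
class ManipulationReport:
    """Container for the per-dimension findings and the overall verdict."""
    target: str
    findings: list[DimensionFinding]
    verdict: Verdict

    def flagged_dimensions(self) -> list[str]:
        """Names of the dimensions that raised a warning."""
        return [f.name for f in self.findings if f.flagged]


report = ManipulationReport(
    target="example-skill",  # placeholder identifier
    findings=[
        DimensionFinding("engagement_velocity", True, ["burst in first 72 hours"]),
        DimensionFinding("review_authenticity", False),
    ],
    verdict=Verdict.SUSPICIOUS,
)
print(report.flagged_dimensions())
```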
Example
Input: Assess social trust integrity for ai-assistant-toolkit publisher
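    🎭 SOCIAL TRUST MANIPULATION ASSESSMENT
    Publisher: ai-assistant-toolkit
    Skills assessed: 4 (productivity-suite, auto-responder, data-fetcher, doc-reader)
    Audit timestamp: 2025-09-05T12:00:00Z

    Engagement velocity:
      productivity-suite: 0 → 847 upvotes within 72 hours of publication ⚠️
      auto-responder: 0 → 623 upvotes within 48 hours of publication ⚠️
      data-fetcher: 0 → 412 upvotes within 60 hours of publication ⚠️
      Organic baseline for comparable skills: 15-40 upvotes in the first 72 hours
      → Burst pattern detected across all 4 skills

    Account cohort analysis:
      First 200 upvoters of productivity-suite:
        Accounts created within 30 days: 156/200 (78%) ⚠️
        Cross-voting with auto-responder upvoters: 143/200 (71.5%) ⚠️
        Accounts with no other skill interactions: 168/200 (84%) ⚠️
      → Sockpuppet cohort fingerprint detected

    Engagement-to-utility correlation:
      productivity-suite: 847 upvotes, 23 installs (ratio: 36.8:1) ⚠️
      auto-responder: 623 upvotes, 18 installs (ratio: 34.6:1) ⚠️
      Organic baseline ratio for comparable skills: 2:1 to 8:1
      → Upvote-to-install ratio 4-18x above organic baseline

    Cross-publisher coordination:
      ai-assistant-toolkit's upvoter network also upvoted:
        fastcoder-pro (different publisher): 89% overlap ⚠️
        quick-deploy-kit (different publisher): 76% overlap ⚠️
      → Mutual-support network detected across 3 publishers

    Review authenticity:
      First 20 comments analyzed:
        Unique vocabulary: 34 words (low for 20 comments) ⚠️
        Specificity: generic praise, no feature-specific feedback
        Phrasing pattern: "absolutely essential", "game changer" × 7 comments

    Manipulation verdict: MANUFACTURED
    All four skills show coordinated burst voting, sockpuppet cohort
    fingerprints, upvote-to-install ratios far above the organic baseline, and
    membership in a cross-publisher mutual-support network. The trust signals
    for this publisher's skills do not represent independent community
    validation.

    Recommended actions:
      1. Treat the trust score as unverified until a platform investigation completes
      2. Evaluate the skills on technical merit only, ignoring social signals
      3. Report the coordination pattern to marketplace administrators
      4. Flag fastcoder-pro and quick-deploy-kit as members of the same network
      5. Run a technical audit (supply chain, permission creep) before installing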
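The upvote-to-install figures in this example can be checked by hand. The sketch below uses the example's reported counts for productivity-suite and auto-responder (847 upvotes against 23 installs, and 623 against 18) and its stated organic baseline band of 2:1 to 8:1. The function names are hypothetical, and treating zero installs as an infinite ratio is an assumption made here for illustration.

```python
def upvote_install_ratio(upvotes, installs):
    """Upvotes per install; infinite when nothing was installed (assumption)."""
    return upvotes / installs if installs else float("inf")


def baseline_multiples(ratio, baseline_low=2.0, baseline_high=8.0):
    """How many times the ratio exceeds the top and bottom of the organic band."""
    return ratio / baseline_high, ratio / baseline_low


for name, upvotes, installs in [
    ("productivity-suite", 847, 23),
    ("auto-responder", 623, 18),
]:
    ratio = upvote_install_ratio(upvotes, installs)
    vs_top, vs_bottom = baseline_multiples(ratio)
    print(f"{name}: {ratio:.1f}:1, {vs_top:.1f}x to {vs_bottom:.1f}x baseline")
```

Both skills land between roughly 4x and 18x the organic band, which is where the "4-18x above organic baseline" line in the report comes from.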
Related Tools
- clone-farm-detector: Detects content-level cloning for reputation gaming;
  social-trust-manipulation-detector catches social-level gaming that can occur
  even with original, non-cloned content
- publisher-identity-verifier: Verifies publisher identity integrity;
  sockpuppet networks may pose as multiple independent publishers while being
  controlled by a single actor
- trust-velocity-calculator: Quantifies trust decay from update velocity;
  manufactured trust does not decay the same way as earned trust and creates
  distorted velocity measurements
- blast-radius-estimator: Estimates propagation impact if a skill is
  compromised; skills with manufactured trust may have artificially high install
  counts that misrepresent actual blast radius
Limitations
Social trust manipulation detection depends on access to engagement metadata
(account creation dates, cross-voting patterns, install counts) that many
marketplaces do not expose through public APIs. Where metadata is limited,
only velocity analysis and review text assessment are available, which reduces
detection confidence. Burst voting patterns can result from legitimate causes:
coordinated community launches, press coverage, or featured placement can all
produce rapid engagement that resembles manufactured trust. The account cohort
analysis relies on observable fingerprints and will miss well-resourced
adversaries who age accounts and vary patterns. This tool identifies social
trust signals that warrant investigation; it does not confirm manipulation.
Confirmation requires platform-level data that only marketplace operators
can access.
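The effect of limited metadata on coverage can be sketched as a lookup from each dimension to the data it needs. The field names below are hypothetical, and the mapping is an assumption based on the metadata named above (account creation dates, cross-voting patterns, install counts); real marketplaces will expose different fields.

```python
# Hypothetical metadata field names; assumed for illustration only.
REQUIRED_METADATA = {
    "engagement_velocity": {"vote_timestamps"},
    "account_cohort": {"account_creation_dates", "cross_voting"},
    "engagement_to_utility": {"install_counts"},
    "cross_publisher_coordination": {"cross_voting"},
    "review_authenticity": {"review_text"},
}


def runnable_dimensions(available):
    """Dimensions whose required metadata is fully present."""
    available = set(available)
    return [name for name, needs in REQUIRED_METADATA.items() if needs <= available]


def coverage(available):
    """Fraction of the five dimensions that can be assessed."""
    return len(runnable_dimensions(available)) / len(REQUIRED_METADATA)


public_only = {"vote_timestamps", "review_text"}
print(runnable_dimensions(public_only), coverage(public_only))
```

With only vote timestamps and review text exposed, two of the five dimensions can run, which matches the reduced-confidence case described above.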