yidun-skill-sec ⚡
Hybrid local-cloud security scanner for third-party code packages. Scans fast, scores precisely, fails safely.
Security Disclosure
This skill uploads non-sensitive metadata (file hashes, behavior tag names, and extracted code snippets that triggered detections) to a Yidun threat intelligence endpoint for analysis. The following data is explicitly not uploaded: full source code, user credentials, environment variables, or any personal data.
The cloud endpoint (as.dun.163.com) is operated by NetEase Yidun, a licensed cybersecurity service provider. Cloud analysis is enabled by default and strongly recommended. It can be explicitly disabled by the user if network access is restricted or not desired.
What It Does
YidunClawSec fingerprints a code package, runs behavioral analysis locally, and consults cloud threat intelligence to produce a quantified safety score. It catches malware, data leaks, privilege abuse, and obfuscation — before anything gets installed.
How It Works — Four Phases
CODEBLOCK0
Phase 0: Source Vetting
Before downloading or scanning any code, YidunClawSec evaluates where the package comes from. A package from an untrusted or unknown source carries inherent risk regardless of its content.
0.1 Source Tags
| Tag | What It Catches | Severity Boost |
|---|---|---|
| `SRC_UNKNOWN_REGISTRY` | Package originates from an unrecognized or unofficial registry | +20 |
| `SRC_BLACKLISTED_DOMAIN` | Install URL or declared homepage matches a known malicious domain/IP | +40 |
| `SRC_UNTRUSTED_AUTHOR` | Publisher account is new (<30 days), unverified, or has prior malicious packages | +15 |
Hard Rule: Any SRC_BLACKLISTED_DOMAIN hit forces the verdict to CRITICAL immediately — scanning halts and the package is blocked without further analysis.
0.2 Registry Allowlist
The following registries are considered trusted by default:
| Registry | Protocol |
|---|---|
| ClawHub (`clawhub.com`) | HTTPS + signed manifest |
| npm (`registry.npmjs.org`) | HTTPS |
| PyPI (`pypi.org`) | HTTPS |
| GitHub Releases (`github.com/*/releases`) | HTTPS |
| Custom allowlist via `YIDUN_SKILL_SEC_TRUSTED_REGISTRIES` | Configurable (registry only) |
Packages installed directly from a raw URL, a private server, or an unknown host are tagged SRC_UNKNOWN_REGISTRY unless the host is on the allowlist.
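The allowlist check can be sketched as follows. This is a minimal illustration, not the scanner's actual code: the function name `source_tags` is hypothetical, the custom allowlist variable is assumed to hold comma-separated hostnames (the source does not specify its format), and the GitHub rule is interpreted loosely as "path contains `/releases`".

```python
import os
from urllib.parse import urlparse

# Default trusted hosts, taken from the registry allowlist table.
DEFAULT_TRUSTED = {"clawhub.com", "registry.npmjs.org", "pypi.org"}

def source_tags(install_url, env=os.environ):
    """Return ["SRC_UNKNOWN_REGISTRY"] when the URL's host is not trusted."""
    parsed = urlparse(install_url)
    host = (parsed.hostname or "").lower()
    # ASSUMPTION: the variable is a comma-separated list of hostnames.
    extra = env.get("YIDUN_SKILL_SEC_TRUSTED_REGISTRIES", "")
    trusted = DEFAULT_TRUSTED | {h.strip().lower() for h in extra.split(",") if h.strip()}
    if host in trusted:
        return []
    # ASSUMPTION: github.com is trusted only for release assets.
    if host == "github.com" and "/releases" in parsed.path:
        return []
    return ["SRC_UNKNOWN_REGISTRY"]
```

Passing `env` explicitly keeps the function easy to exercise without touching the real environment.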
0.3 Author / Publisher Trust
For supported registries (npm, PyPI, ClawHub), the scanner checks the publishing account's trust profile:
| Signal | Penalizes When |
|---|---|
| Account age | < 30 days old |
| Verification status | Unverified / no 2FA |
| Prior packages | Any previously removed for malware |
| Ownership match | Author field in package metadata ≠ registry profile name |
CODEBLOCK1
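As a rough sketch of how the four trust signals might be evaluated, the snippet below checks each row of the signal table against a publisher profile. The `AuthorProfile` class, its field names, and the rule that any single penalized signal is enough to apply `SRC_UNTRUSTED_AUTHOR` are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class AuthorProfile:
    account_age_days: int
    verified: bool               # verified account with 2FA
    prior_removals: int          # packages previously removed for malware
    name_matches_metadata: bool  # hypothetical field for the ownership check

def author_penalties(p):
    """List the trust signals that penalize this publisher."""
    hits = []
    if p.account_age_days < 30:
        hits.append("account_age")
    if not p.verified:
        hits.append("verification_status")
    if p.prior_removals > 0:
        hits.append("prior_packages")
    if not p.name_matches_metadata:
        hits.append("ownership_match")
    return hits

def is_untrusted_author(p):
    # ASSUMPTION: one penalized signal is enough to apply the tag.
    return bool(author_penalties(p))
```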
0.4 Source Metadata in Cloud Request
Source vetting results are included in the cloud request as source_meta:
CODEBLOCK2
Phase 1: Fingerprint
Before anything else, build a complete inventory of the package.
Actions performed:
1. List every file in the package
2. Compute an MD5 hash per file via `openssl dgst -md5`
3. Derive a composite package fingerprint (a hash over the sorted list of per-file hashes)
4. Extract metadata: name, version, author, declared dependencies
Output: A fingerprint manifest used for cache lookups and audit trail.
CODEBLOCK3
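The same fingerprinting steps can be sketched in Python with `hashlib`. This is an illustration only: the function name `fingerprint` and the way per-file hashes are serialized before the composite hash are assumptions, so the composite value will not necessarily match what the shell pipeline produces.

```python
import hashlib
from pathlib import Path

def fingerprint(root):
    """Hash every file under root and derive a composite fingerprint."""
    base = Path(root)
    files = {}
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        rel = path.relative_to(base).as_posix()
        files[rel] = hashlib.md5(path.read_bytes()).hexdigest()
    # Composite: MD5 over the sorted per-file hashes, one per line.
    # ASSUMPTION: the real tool's exact serialization is not specified.
    joined = "\n".join(sorted(files.values())).encode()
    return {"files": files, "composite": hashlib.md5(joined).hexdigest()}
```

Sorting the hashes before combining them makes the composite independent of directory traversal order, which is what lets it serve as a cache key.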
Phase 2: Behavioral Scan
A static analysis pass that classifies every file by its observable behaviors. No code is executed — only pattern matching and structural inspection.
2.1 Behavior Categories
Each detected behavior is tagged into one of these categories:
| Tag | What It Catches | Severity Boost |
|---|---|---|
| `NET_OUTBOUND` | HTTP/HTTPS calls, socket connections, DNS lookups | +15 |
| `NET_IP_RAW` | Connections to raw IPs instead of hostnames | +25 |
| `FS_READ_SENSITIVE` | Reads from `~/.ssh`, `~/.gnupg`, `~/.aws`, `~/.config/gh` | +30 |
| `FS_WRITE_SYSTEM` | Writes outside the project workspace | +20 |
| `EXEC_DYNAMIC` | `eval()`, `exec()`, `Function()`, backtick interpolation | +25 |
| `EXEC_SHELL` | Spawns shell subprocesses | +10 |
| `ENCODE_DECODE` | Base64/hex encode-decode chains (potential obfuscation) | +20 |
| `CRED_HARVEST` | Reads tokens, passwords, API keys from env or files | +35 |
| `PRIV_ESCALATION` | `sudo`, `chmod 777`, `setuid` patterns | +30 |
| `OBFUSCATED` | Minified/packed code, non-readable variable names | +15 |
| `AGENT_MEMORY` | Accesses agent memory files (identity, preferences, context) | +25 |
| `PKG_INSTALL` | Installs unlisted system packages or dependencies | +20 |
| `COOKIE_SESSION` | Reads browser cookies, localStorage, session tokens | +25 |
| `BYPASS_SAFETY` | Uses flags that skip security checks: `--no-verify`, `--force`, `--allow-root`, `--skip-ssl` | +20 |
| `DESTRUCTIVE_OP` | Irreversible destructive operations: `rm -rf`, `git reset --hard`, `DROP TABLE`, `mkfs`, `dd if=` | +25 |
| `PROMPT_INJECT` | Embeds natural language directives targeting the AI agent, attempting to override its rules, bypass constraints, or assume an unrestricted persona | +35 |
2.2 How Severity Scores Work
- Start at 100 (fully safe)
- Each behavior tag subtracts its severity boost from the score
- Multiple tags stack, but the score floors at 0
- A single `CRED_HARVEST` or `PRIV_ESCALATION` tag triggers an immediate escalation: the package is flagged regardless of total score
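The deduction rules can be illustrated with a small function. This is a sketch under one stated assumption: each tag is deducted once no matter how many times it matched, which the source does not spell out. The name `local_score` is hypothetical; the weights are copied from the behavior category table.

```python
SEVERITY = {
    "NET_OUTBOUND": 15, "NET_IP_RAW": 25, "FS_READ_SENSITIVE": 30,
    "FS_WRITE_SYSTEM": 20, "EXEC_DYNAMIC": 25, "EXEC_SHELL": 10,
    "ENCODE_DECODE": 20, "CRED_HARVEST": 35, "PRIV_ESCALATION": 30,
    "OBFUSCATED": 15, "AGENT_MEMORY": 25, "PKG_INSTALL": 20,
    "COOKIE_SESSION": 25, "BYPASS_SAFETY": 20, "DESTRUCTIVE_OP": 25,
    "PROMPT_INJECT": 35,
}
ESCALATE = {"CRED_HARVEST", "PRIV_ESCALATION"}

def local_score(tags):
    """Return (score, escalated) for a collection of detected tags."""
    unique = set(tags)
    # ASSUMPTION: a tag is deducted once, however often it matched.
    score = max(0, 100 - sum(SEVERITY[t] for t in unique))
    return score, bool(unique & ESCALATE)
```

For instance, a package tagged `EXEC_SHELL` and `NET_OUTBOUND` loses 10 + 15 points, while a package with several high-severity tags bottoms out at 0 and is also marked as escalated.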
2.3 Pattern Matching Rules
The scanner matches against concrete code patterns:
CODEBLOCK4
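To show how line-level pattern matching of this kind might work, here is a sketch that applies the `NET_OUTBOUND` and `NET_IP_RAW` expressions from this document's rule list. The `scan_text` function, the `tag` key name, and the simplification of skipping any line that mentions a loopback target are assumptions for illustration.

```python
import re

# Expressions as given in the rule list for these two tags.
NET_OUTBOUND = re.compile(r"curl|wget|fetch|http\.get|requests\.(get|post)|axios|urllib")
NET_IP_RAW = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")
LOOPBACK = re.compile(r"localhost|127\.0\.0\.1|::1")

def scan_text(filename, text):
    """Return one finding per (line, tag) match; no code is executed."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if LOOPBACK.search(line):
            continue  # simplification: ignore lines targeting loopback
        for tag, pattern in (("NET_OUTBOUND", NET_OUTBOUND), ("NET_IP_RAW", NET_IP_RAW)):
            if pattern.search(line):
                findings.append({"tag": tag, "file": filename,
                                 "line": lineno, "context": line.strip()})
    return findings
```

A real scanner would need context checks (for example, confirming that the IP literal sits inside a URL or connection call) to keep false positives down.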
Phase 3: Cloud Intelligence
When cloud is enabled (default), yidun-skill-sec consults the remote threat intelligence service. If the user has set YIDUN_SKILL_SEC_CLOUD=false, this phase is skipped entirely and scoring uses offline weights. If the cloud call times out (10s), the scanner automatically downgrades to local-only mode and notifies the user.
3.1 What Gets Sent
The fingerprint manifest, behavior tags, and extracted evidence artifacts are uploaded. Evidence includes the specific URLs, shell commands, and credential access paths that triggered each tag — enabling the cloud to perform real content-level threat analysis.
Evidence redaction rules — before upload, the scanner applies the following sanitization:
- Environment variable values are replaced with `[REDACTED]` (only the variable name is sent)
- File content from sensitive paths (`~/.ssh`, `~/.aws`, `~/.env`) is never included; only the path and access pattern are sent
- The `context` field is truncated to the single matched line; multi-line context is not collected
- Full source code is NOT sent, only the lines that triggered a detection tag
These rules ensure that no secrets, credentials, or private data leave the local machine.
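A possible shape for the redaction step is sketched below. How the scanner actually recognizes environment variable values is not described in the source, so the `NAME=value` regular expression, and the helper names `redact_context` and `is_sensitive_path`, are assumptions.

```python
import re

SENSITIVE_PREFIXES = ("~/.ssh", "~/.aws", "~/.env")
# ASSUMPTION: env assignments look like NAME=value with an upper-case name.
ENV_ASSIGN = re.compile(r"\b([A-Z_][A-Z0-9_]*)=(\S+)")

def redact_context(context):
    """Keep only the first line and mask env variable values in it."""
    first_line = context.splitlines()[0] if context else ""
    return ENV_ASSIGN.sub(r"\1=[REDACTED]", first_line)

def is_sensitive_path(path):
    """True when file content from this path must never be uploaded."""
    return path.startswith(SENSITIVE_PREFIXES)
```

Only the variable name survives redaction, so `export API_KEY=abc123` would be uploaded as `export API_KEY=[REDACTED]`.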
CODEBLOCK5
Evidence Field Specification
| Field | Type | Description |
|---|---|---|
| INLINECODE57 | array | Full URLs that triggered `NET_OUTBOUND` / `NET_IP_RAW` tags |
| INLINECODE60 | array | Command snippets that triggered `EXEC_SHELL` / `EXEC_DYNAMIC` / `PRIV_ESCALATION` tags |
| `evidence.credential_accesses` | array | Credential access expressions or paths that triggered `CRED_HARVEST` / `FS_READ_SENSITIVE` tags |
| `evidence.obfuscation_samples` | array | Encoding call snippets that triggered `ENCODE_DECODE` / `OBFUSCATED` tags |
Each evidence record has the following structure:
| Sub-field | Description |
|---|---|
| INLINECODE70 | The behavior tag that was triggered |
| INLINECODE71 | Raw extracted value (URL / command / path) |
| `file` | Source file path where the pattern was found |
| `line` | Line number of the match |
| `context` | Full content of the matched line (single line only, no surrounding context) |
3.2 What Happens Server-Side
CODEBLOCK6
3.3 Response Format
CODEBLOCK7
| Field | Type | Meaning |
|---|---|---|
| `request_id` | string | UUID v4 echoed from the request; use for tracing and audit logs |
| INLINECODE76 | bool | Was the fingerprint already in the database? |
| `confidence_score` | int | 0–100, higher means safer |
| `labels` | string[] | Detected threat categories |
| `verdict` | enum | `PASS` / `REVIEW` / `BLOCK` |
| `recommendation` | string | Human-readable summary of the verdict |
| `deductions` | array | Per-tag score deduction breakdown from cloud analysis |
request_id generation: Client must generate a UUID v4 before each request and include it in the body. The server echoes the same value in the response for end-to-end tracing.
CODEBLOCK8
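Generating and checking the `request_id` could look like the sketch below, which uses Python's standard `uuid` module. The helper names and the shape of the rest of the request body are hypothetical; only the `request_id` field and its UUID v4 format come from the text.

```python
import json
import uuid

def new_request_body(payload):
    """Attach a fresh UUID v4 request_id to a copy of the payload."""
    body = dict(payload)
    body["request_id"] = str(uuid.uuid4())
    return json.dumps(body)

def response_matches(request_body, response):
    """Check that the server echoed the request_id we sent."""
    sent = json.loads(request_body)["request_id"]
    return sent == response.get("request_id")
```

Rejecting a response whose `request_id` differs from the one sent is a cheap guard against mixing up concurrent scans in audit logs.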
deductions item fields:
| Sub-field | Type | Meaning |
|---|---|---|
| INLINECODE87 | string | Behavior tag that triggered this deduction |
| INLINECODE88 | string | Cloud analysis explanation for why this tag was penalized |
| `evidence` | string | The specific URL / command / snippet that was matched |
| `score_impact` | int | Points deducted from `confidence_score` for this tag |
| `severity` | enum | `low` / `medium` / `high` / `critical` |
3.4 Timeout Fallback
When cloud is enabled but the network call fails:
1. `curl` times out after 10 seconds
2. Scanner falls back to local-only mode automatically
3. All scores shift -10 points (conservative bias)
4. Report shows INLINECODE98
5. Any score below 60 requires user confirmation before install
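The fallback steps can be expressed as a small function. This is a sketch: the name `timeout_fallback` and the returned keys are hypothetical, and it assumes the "below 60" check applies to the score after the 10-point penalty.

```python
def timeout_fallback(local_score):
    """Apply the local-only downgrade after a failed cloud call."""
    adjusted = max(0, local_score - 10)  # conservative bias
    return {
        "mode": "local-only",
        "score": adjusted,
        "notify_user": True,  # the fallback must never be silent
        # ASSUMPTION: the threshold is checked after the penalty.
        "needs_confirmation": adjusted < 60,
    }
```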
Producing the Verdict
The final threat score combines local scan + cloud intel (when available):
Score Composition
| Signal | Normal Weight | Offline Weight |
|---|---|---|
| Source vetting score | 15% | 20% |
| Behavioral scan score | 40% | 55% |
| Cloud confidence score | 30% | n/a |
| Privilege surface area | 15% | 25% |
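The weighting can be checked with a few lines of arithmetic. The sketch below assumes the result is rounded to the nearest integer and that a missing cloud score selects the offline weights; `final_score` is a hypothetical name.

```python
NORMAL = {"source": 0.15, "behavior": 0.40, "cloud": 0.30, "privilege": 0.15}
OFFLINE = {"source": 0.20, "behavior": 0.55, "privilege": 0.25}

def final_score(source, behavior, privilege, cloud=None):
    """Weighted 0-100 score; cloud=None selects the offline weights."""
    weights = OFFLINE if cloud is None else NORMAL
    parts = {"source": source, "behavior": behavior,
             "privilege": privilege, "cloud": cloud}
    # ASSUMPTION: the combined score is rounded to the nearest integer.
    return round(sum(w * parts[k] for k, w in weights.items()))
```

With source 100, behavior 55, cloud 48, and privilege 40, the normal weights give 15 + 22 + 14.4 + 6 = 57.4, which rounds to 57.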
Threat Levels
| Score | Level | Action |
|---|---|---|
| 80–100 | 🟢 CLEAR | Install normally |
| 60–79 | 🟢 MINOR | Install with awareness |
| 40–59 | 🟡 ELEVATED | User review before install |
| 20–39 | 🔴 SEVERE | Requires explicit user consent |
| 0–19 | ⛔ CRITICAL | Blocked, do not install |
Hard rules (override score):
- Any `CRED_HARVEST` tag → floor to SEVERE
- Any `PRIV_ESCALATION` tag → floor to SEVERE
- Both present → force CRITICAL
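Mapping a score to a threat level and then applying the hard rules might be sketched as follows. The function name `threat_level` is hypothetical, scores are assumed to lie in 0–100, and "floor to SEVERE" is read as "never report anything milder than SEVERE".

```python
LEVELS = [(80, "CLEAR"), (60, "MINOR"), (40, "ELEVATED"), (20, "SEVERE"), (0, "CRITICAL")]
ORDER = ["CLEAR", "MINOR", "ELEVATED", "SEVERE", "CRITICAL"]  # mild -> severe

def threat_level(score, tags=()):
    """Map a 0-100 score to a level, then apply the hard rules."""
    level = next(name for floor, name in LEVELS if score >= floor)
    tags = set(tags)
    if "SRC_BLACKLISTED_DOMAIN" in tags:
        return "CRITICAL"
    if {"CRED_HARVEST", "PRIV_ESCALATION"} <= tags:
        return "CRITICAL"
    if tags & {"CRED_HARVEST", "PRIV_ESCALATION"}:
        # Keep whichever is more severe: the scored level or SEVERE.
        level = max(level, "SEVERE", key=ORDER.index)
    return level
```

Note that the hard rules only ever make the result more severe; a package already scored CRITICAL stays CRITICAL.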
Report Output
⚡ YIDUN-SKILL-SEC Scan Report
INLINECODE101 · v[version] · [source] · by [author] · INLINECODE105
Phase 0 · Source Vetting
| Check | Result |
|---|---|
| Registry | [name] → ✅ trusted / ⚠️ unknown / N/A |
| Domain | [host] → ✅ clean / ❌ blacklisted |
| Author | [name] → ✅ verified / ⚠️ unverified |
| Source Score | [xx]/100 · Tags: [tags or none] |
Phase 1 · Fingerprint
INLINECODE107 files · MD5 [hash...] · INLINECODE109
Phase 2 · Behavioral Scan
| Tag | Location | Deduction |
|---|---|---|
| INLINECODE110 | [file:line] | -[N] |
| INLINECODE111 | [file:line] | -[N] |
Local score [xx]/100 · If no findings: ✅ No suspicious behaviors detected
Phase 3 · Cloud Intel
| Item | Result |
|---|---|
| Mode | [cloud / local-only / mock] |
| Cache | [hit safe / hit threat / miss] |
| Cloud Score | [xx]/100 · Labels: [list or none] |
Privilege Surface · Network: [domains] · FS: [paths] · Shell: [cmds] · Creds: [yes/no]
🎯 Score: [xx]/100 · [🟢 CLEAR / 🟢 MINOR / 🟡 ELEVATED / 🔴 SEVERE / ⛔ CRITICAL]
[✅ Allow / ⚠️ Requires confirmation / ❌ Blocked]
⚠️ [hard rule trigger or key observation, omit if none]
Usage Example
User: "Install data-processor from ClawHub"
Agent workflow:
0. Source vetting
→ Registry: clawhub.com ✅ Domain: clean ✅ Author: verified ✅
→ Source score: 100/100
1. Download to temp directory
$ mkdir -p /tmp/yds-scan && clawhub install data-processor --dir /tmp/yds-scan
2. Fingerprint
$ find /tmp/yds-scan -type f -exec openssl dgst -md5 {} \;
→ 4 files, composite: 7f3a...
3. Behavioral scan
→ NET_OUTBOUND detected in fetch.py:12 (api.dataproc.io)
→ FS_WRITE_SYSTEM detected in setup.sh:8 (/usr/local/bin)
→ Local score: 55/100
4. Cloud intel query
→ Cache miss → deep analysis → confidence 48/100
→ Labels: [NET_OUTBOUND, FS_WRITE_SYSTEM]
5. Final score: 15% × 100 + 40% × 55 + 30% × 48 + 15% × 40 = 15 + 22 + 14.4 + 6 = 57
→ Level: ELEVATED
→ Verdict: ⚠️ Review — network calls + system writes need justification
More Scenarios
Clean Package
CODEBLOCK10
Obfuscation + Credential Access
CODEBLOCK11
Offline Scan
Package: log-rotator v3.0.0
Behaviors: FS_WRITE_SYSTEM, EXEC_SHELL
Cloud: unavailable → local-only mode (-10 penalty)
Local score: 60 - 10 = 50
Final: 🟡 ELEVATED (50) → ⚠️ Review
Cloud Call Policy
Cloud intelligence is enabled by default. The user can explicitly disable it by setting:
CODEBLOCK13
When disabled, the scanner runs in local-only mode with adjusted score weights and notifies the user that cloud verification was skipped.
| Mode | Trigger | Behavior |
|---|---|---|
| Cloud ON | Default / INLINECODE117 | Full 4-phase scan, domain blacklist checked server-side |
| Cloud OFF | `YIDUN_SKILL_SEC_CLOUD=false` | Local-only scan, domain blacklist skipped, score weights adjusted |
| Timeout fallback | Cloud ON but `curl` times out (10s) | Automatic downgrade to local-only, -10 score penalty, user notified |
Rules that always apply regardless of mode:
- ❌ Do NOT self-judge that cloud is "unnecessary" and skip it without the user setting `YIDUN_SKILL_SEC_CLOUD=false`
- ❌ Do NOT silently fall back to local-only on timeout without notifying the user
- ❌ Do NOT treat any skill type as exempt from cloud verification when cloud is enabled
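The three modes can be resolved from the environment with a sketch like the one below. It assumes that only the literal value `false` (case-insensitive) disables cloud analysis; the function name `cloud_mode` and the returned mode strings are hypothetical.

```python
import os

def cloud_mode(env=os.environ, timed_out=False):
    """Resolve the scan mode from the environment and the call outcome."""
    raw = env.get("YIDUN_SKILL_SEC_CLOUD", "true")
    # ASSUMPTION: only the literal "false" disables cloud analysis.
    if raw.strip().lower() == "false":
        return "cloud-off"
    return "timeout-fallback" if timed_out else "cloud-on"
```

Because the default is "cloud-on", the only way to skip the cloud phase is an explicit user setting, which matches the first rule in the list.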
Requirements
| Tool | Purpose |
|---|---|
| `curl` | Cloud API communication |
| INLINECODE122 | Parse JSON responses |
| `openssl` | File hash computation |
Scan first, install later. ⚡
Author: Yidun Security Team
License: MIT
yidun-skill-sec ⚡
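Hybrid local-cloud security scanner for third-party code packages. Scans fast, scores precisely, fails safely.
Security Disclosure
This skill uploads non-sensitive metadata (file hashes, behavior tag names, and extracted code snippets that triggered detections) to a Yidun threat intelligence endpoint for analysis. The following data is explicitly not uploaded: full source code, user credentials, environment variables, or any personal data.
The cloud endpoint (as.dun.163.com) is operated by NetEase Yidun, a licensed cybersecurity service provider. Cloud analysis is enabled by default and strongly recommended. The user can explicitly disable it if network access is restricted or not desired.
What It Does
YidunClawSec fingerprints a code package, runs behavioral analysis locally, and queries cloud threat intelligence to produce a quantified safety score. It catches malware, data leaks, privilege abuse, and obfuscated code before anything gets installed.
How It Works: Four Phases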
```
┌────────────────┐     ┌──────────────┐     ┌──────────────────┐     ┌────────────────┐
│ Source Vetting │────▶│ Fingerprint  │────▶│ Behavioral Scan  │────▶│ Cloud Intel    │
│                │     │ hash + meta  │     │ static analysis  │     │ (default: ON)  │
└────────────────┘     └──────────────┘     └──────────────────┘     └────────────────┘
        │                     │                      │                       │
        └─────────────────────┴──────────────────────┴───────────────────────┘
                                          ▼
                                ┌───────────────────┐
                                │  Threat Verdict   │
                                │  score + labels   │
                                └───────────────────┘
```
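Phase 0: Source Vetting
Before downloading or scanning any code, YidunClawSec evaluates where the package comes from. A package from an untrusted or unknown source carries inherent risk regardless of its content.
0.1 Source Tags
| Tag | What It Catches | Severity Boost |
|---|---|---|
| `SRC_UNKNOWN_REGISTRY` | Package originates from an unrecognized or unofficial registry | +20 |
| `SRC_BLACKLISTED_DOMAIN` | Install URL or declared homepage matches a known malicious domain/IP | +40 |
| `SRC_UNTRUSTED_AUTHOR` | Publisher account is new (<30 days), unverified, or has prior malicious packages | +15 |
Hard Rule: Any `SRC_BLACKLISTED_DOMAIN` hit forces the verdict to CRITICAL immediately. Scanning halts and the package is blocked without further analysis.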
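0.2 Registry Allowlist
The following registries are considered trusted by default:
| Registry | Protocol |
|---|---|
| ClawHub (`clawhub.com`) | HTTPS + signed manifest |
| npm (`registry.npmjs.org`) | HTTPS |
| PyPI (`pypi.org`) | HTTPS |
| GitHub Releases (`github.com/*/releases`) | HTTPS |
| Custom allowlist via `YIDUN_SKILL_SEC_TRUSTED_REGISTRIES` | Configurable (registry only) |
Packages installed directly from a raw URL, a private server, or an unknown host are tagged `SRC_UNKNOWN_REGISTRY` unless the host is on the allowlist.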
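0.3 Author / Publisher Trust
For supported registries (npm, PyPI, ClawHub), the scanner checks the publishing account's trust profile:
| Signal | Penalizes When |
|---|---|
| Account age | < 30 days old |
| Verification status | Unverified / no 2FA |
| Prior packages | Any previously removed for malware |
| Ownership match | Author field in package metadata ≠ registry profile name |
```bash
# Source vetting output example
Source Vetting
  Registry: clawhub.com → ✅ trusted
  Domain: clawhub.com → ✅ not blacklisted
  Author: some-author (verified, age: 2 years 3 months) → ✅ trusted
  Source score: 100/100  Tags: none
```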
0.4 Source Metadata in Cloud Request
Source vetting results are included in the cloud request as `source_meta`:
```json
"source_meta": {
  "registry": "clawhub.com",
  "install_url": "https://clawhub.com/packages/data-processor-1.2.3.tar.gz",
  "author_verified": true,
  "author_account_age_days": 823,
  "prior_removals": 0,
  "tags": []
}
```
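Phase 1: Fingerprint
Before any further analysis, build a complete inventory of the package.
Actions performed:
1. List every file in the package
2. Compute an MD5 hash per file via `openssl dgst -md5`
3. Derive a composite package fingerprint (a hash over the sorted list of per-file hashes)
4. Extract metadata: name, version, author, declared dependencies
Output: A fingerprint manifest used for cache lookups and audit trail.
```bash
# Example: compute file hashes
find /tmp/pkg -type f -exec openssl dgst -md5 {} \;
# Example: composite fingerprint
find /tmp/pkg -type f -exec openssl dgst -md5 {} \; | sort | openssl dgst -md5
```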
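Phase 2: Behavioral Scan
A static analysis pass that classifies every file by its observable behaviors. No code is executed; the pass relies only on pattern matching and structural inspection.
2.1 Behavior Categories
Each detected behavior is tagged into one of these categories:
| Tag | What It Catches | Severity Boost |
|---|---|---|
| `NET_OUTBOUND` | HTTP/HTTPS calls, socket connections, DNS lookups | +15 |
| `NET_IP_RAW` | Connections to raw IPs instead of hostnames | +25 |
| `FS_READ_SENSITIVE` | Reads from `~/.ssh`, `~/.gnupg`, `~/.aws`, `~/.config/gh` | +30 |
| `FS_WRITE_SYSTEM` | Writes outside the project workspace | +20 |
| `EXEC_DYNAMIC` | `eval()`, `exec()`, `Function()`, backtick interpolation | +25 |
| `EXEC_SHELL` | Spawns shell subprocesses | +10 |
| `ENCODE_DECODE` | Base64/hex encode-decode chains (potential obfuscation) | +20 |
| `CRED_HARVEST` | Reads tokens, passwords, API keys from env or files | +35 |
| `PRIV_ESCALATION` | `sudo`, `chmod 777`, `setuid` patterns | +30 |
| `OBFUSCATED` | Minified/packed code, non-readable variable names | +15 |
| `AGENT_MEMORY` | Accesses agent memory files (identity, preferences, context) | +25 |
| `PKG_INSTALL` | Installs unlisted system packages or dependencies | +20 |
| `COOKIE_SESSION` | Reads browser cookies, localStorage, session tokens | +25 |
| `BYPASS_SAFETY` | Uses flags that skip security checks: `--no-verify`, `--force`, `--allow-root`, `--skip-ssl` | +20 |
| `DESTRUCTIVE_OP` | Irreversible destructive operations: `rm -rf`, `git reset --hard`, `DROP TABLE`, `mkfs`, `dd if=` | +25 |
| `PROMPT_INJECT` | Embeds natural language directives targeting the AI agent, attempting to override its rules, bypass constraints, or assume an unrestricted persona | +35 |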
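2.2 How Severity Scores Work
- Start at 100 (fully safe)
- Each behavior tag subtracts its severity boost from the score
- Multiple tags stack, but the score floors at 0
- A single `CRED_HARVEST` or `PRIV_ESCALATION` tag triggers an immediate escalation: the package is flagged regardless of total score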
2.3 Pattern Matching Rules
The scanner matches against concrete code patterns:
NET_OUTBOUND:
curl|wget|fetch|http\.get|requests\.(get|post)|axios|urllib
+ target is not localhost/127.0.0.1/::1
NET_IP_RAW:
\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b in a URL/connection context
FS_READ_SENSITIVE:
cat|read|open.*\.(ssh|gnupg|aws|config/gh|