OpenClaw Bastion
Runtime prompt injection defense for agent workspaces. While other tools watch workspace identity files, Bastion protects the input/output boundary — the files being read by the agent, web content, API responses, and user-supplied documents.
Why This Matters
Agents process content from many sources: local files, API responses, web pages, user uploads. Any of these can contain prompt injection attacks — hidden instructions that manipulate agent behavior. Bastion scans this content before the agent acts on it.
Commands
Scan for Injections
Scan files or directories for prompt injection patterns. Detects instruction overrides, system prompt markers, hidden Unicode, markdown exfiltration, HTML injection, shell injection, encoded payloads, delimiter confusion, multi-turn manipulation, and dangerous commands.
If no target is specified, scans the entire workspace.
```bash
python3 {baseDir}/scripts/bastion.py scan
```
Scan a specific file or directory:
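```bash
python3 {baseDir}/scripts/bastion.py scan path/to/file.md
python3 {baseDir}/scripts/bastion.py scan path/to/directory/
```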
Quick File Check
Fast single-file injection check. Same detection patterns as scan, targeted to one file.
```bash
python3 {baseDir}/scripts/bastion.py check path/to/file.md
```
Boundary Analysis
Analyze content boundary safety across the workspace. Identifies:
- Agent instruction files that mix trusted and untrusted content
- Writable instruction files (attack surface for compromised skills)
- Blast radius assessment for each critical file
```bash
python3 {baseDir}/scripts/bastion.py boundaries
```
Command Allowlist
Display the current command allowlist and blocklist policy. Creates a default .bastion-policy.json if none exists.
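```bash
python3 {baseDir}/scripts/bastion.py allowlist
python3 {baseDir}/scripts/bastion.py allowlist --show
```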
The policy file defines which commands are considered safe and which patterns are blocked. Edit the JSON file directly to customize. Bastion Pro enforces this policy at runtime via hooks.
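As a rough illustration of how an allowlist and blocklist policy can be evaluated, here is a minimal sketch. The key names `allowed_commands` and `blocked_patterns`, the example entries, and the `evaluate_command` helper are all hypothetical; the actual schema of .bastion-policy.json may differ, so check the generated file before relying on any of this.

```python
import re
import shlex

# Hypothetical policy layout for illustration only; the real
# .bastion-policy.json schema is not documented here and may differ.
EXAMPLE_POLICY = {
    "allowed_commands": ["git", "ls", "cat", "python3"],
    "blocked_patterns": [
        r"rm\s+-rf\s+/",
        r"(curl|wget)\b[^|]*\|\s*(ba)?sh\b",
    ],
}


def evaluate_command(command, policy):
    """Return 'blocked', 'allowed', or 'unknown' for a shell command string."""
    # Blocked patterns win over the allowlist, so they are checked first.
    for pattern in policy["blocked_patterns"]:
        if re.search(pattern, command):
            return "blocked"
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are treated as unknown.
        return "unknown"
    if tokens and tokens[0] in policy["allowed_commands"]:
        return "allowed"
    return "unknown"
```

Checking blocked patterns before the allowlist means a command that starts with an allowed program but pipes into a shell is still rejected.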
Status
Quick summary of workspace injection defense posture: files scanned, findings by severity, boundary safety, and overall posture rating.
```bash
python3 {baseDir}/scripts/bastion.py status
```
Workspace Auto-Detection
If --workspace is omitted, the script tries:
1. `OPENCLAW_WORKSPACE` environment variable
2. Current directory (if `AGENTS.md` exists)
3. `~/.openclaw/workspace` (default)
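The lookup order above can be sketched as a small resolver. This is an illustration under assumptions, not the script's actual code: the function name `resolve_workspace` and its parameters are hypothetical, and the `env` and `cwd` arguments exist only to make the logic easy to test.

```python
import os
from pathlib import Path


def resolve_workspace(cli_value=None, env=None, cwd=None):
    """Pick a workspace directory using the documented fallback order."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # An explicit --workspace value takes precedence over auto-detection.
    if cli_value:
        return Path(cli_value).expanduser()

    # 1. OPENCLAW_WORKSPACE environment variable
    env_value = env.get("OPENCLAW_WORKSPACE")
    if env_value:
        return Path(env_value).expanduser()

    # 2. Current directory, but only if it contains AGENTS.md
    if (cwd / "AGENTS.md").exists():
        return cwd

    # 3. Default location
    return Path.home() / ".openclaw" / "workspace"
```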
What Gets Detected
| Category | Patterns | Severity |
|---|---|---|
| Instruction override | "ignore previous", "disregard above", "you are now", "new system prompt", "forget your instructions", "override safety", "act as if no restrictions", "entering developer mode" | CRITICAL |
| System prompt markers | `<system>`, `[SYSTEM]`, `<<SYS>>`, `<\|im_start\|>system`, `[INST]`, `### System:` | CRITICAL |
| Hidden instructions | Multi-turn manipulation ("in your next response, you must"), stealth patterns ("do not tell the user") | CRITICAL |
| HTML injection | `<script>`, `<iframe>`, `<img onerror=>`, hidden divs, `<svg onload=>` | CRITICAL |
| Markdown exfiltration | Image tags with encoded data in URLs | CRITICAL |
| Dangerous commands | `curl \| bash`, `wget \| sh`, `rm -rf /`, fork bombs | CRITICAL |
| Unicode tricks | Zero-width characters, RTL overrides, invisible formatting | WARNING |
| Homoglyph substitution | Cyrillic/Latin lookalikes mixed into ASCII text | WARNING |
| Base64 payloads | Large encoded blobs outside code blocks | WARNING |
| Shell injection | `$(command)` subshell execution outside code blocks | WARNING |
| Delimiter confusion | Fake code block boundaries with injection content | WARNING |
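To make a few rows of the table concrete, the following sketch flags some instruction-override phrases, some system prompt markers, and invisible Unicode format characters. It is a simplified illustration, not Bastion's actual detection logic: the regular expressions cover only a subset of the listed patterns, and the names `CRITICAL_PATTERNS` and `find_injections` are hypothetical.

```python
import re
import unicodedata

# A small subset of the patterns from the table, chosen for illustration.
CRITICAL_PATTERNS = {
    "instruction_override": re.compile(
        r"ignore previous|disregard above|you are now|new system prompt"
        r"|forget your instructions",
        re.IGNORECASE,
    ),
    "system_prompt_marker": re.compile(r"<system>|\[SYSTEM\]|<<SYS>>|\[INST\]"),
}


def find_injections(text):
    """Return a list of (line_number, category, severity) tuples."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in CRITICAL_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name, "CRITICAL"))
        # Unicode category "Cf" (format) includes zero-width characters
        # and bidirectional override controls.
        if any(unicodedata.category(ch) == "Cf" for ch in line):
            findings.append((lineno, "unicode_tricks", "WARNING"))
    return findings
```

Each finding keeps its line number so a report can point the reviewer at the exact location.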
Context-Aware Scanning
- Patterns inside fenced code blocks (delimited by triple backticks) are skipped to avoid false positives
- Per-file risk scoring based on finding count and severity
- Self-exclusion: Bastion skips its own skill files (which describe injection patterns)
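The first two points can be sketched as follows. This is an assumption-laden illustration: the fence tracking is deliberately naive (it toggles on any line that starts with three backticks), the severity weights are invented for the example, and `lines_outside_fences` and `risk_score` are hypothetical names rather than Bastion functions.

```python
FENCE = "`" * 3  # the triple-backtick marker that opens or closes a fenced block

# Assumed weights for illustration; Bastion's real scoring is not documented here.
SEVERITY_WEIGHTS = {"CRITICAL": 10, "WARNING": 3}


def lines_outside_fences(text):
    """Yield (line_number, line) for lines that are not inside a fenced code block."""
    in_fence = False
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            # Fence lines toggle the state and are never scanned themselves.
            in_fence = not in_fence
            continue
        if not in_fence:
            yield lineno, line


def risk_score(findings):
    """Sum severity weights over (line_number, severity) findings for one file."""
    return sum(SEVERITY_WEIGHTS.get(severity, 0) for _, severity in findings)
```

A scanner built this way would only run its pattern checks on the lines yielded by `lines_outside_fences`, then pass the resulting findings to `risk_score`.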
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Clean, no issues |
| 1 | Warnings detected (review recommended) |
| 2 | Critical findings (action needed) |
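The mapping from findings to exit codes, and one way a calling script might react to them, can be sketched like this. Both helper names are hypothetical, and the `fail_on_warnings` switch is an assumption about how a caller might want to treat exit code 1, not a documented Bastion option.

```python
def exit_code_for(severities):
    """Map a collection of finding severities to the documented exit codes."""
    if "CRITICAL" in severities:
        return 2  # critical findings, action needed
    if "WARNING" in severities:
        return 1  # warnings detected, review recommended
    return 0      # clean


def should_block(exit_code, fail_on_warnings=False):
    """Decide whether a calling pipeline should stop for a given exit code."""
    if exit_code >= 2:
        return True
    return fail_on_warnings and exit_code == 1
```

In a CI job, the same decision can be made by reading the process return code after running the scan command.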
No External Dependencies
Python standard library only. No pip install. No network calls. Everything runs locally.
Cross-Platform
Works with OpenClaw, Claude Code, Cursor, and any tool using the Agent Skills specification.
OpenClaw Bastion
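Runtime prompt injection defense for agent workspaces. While other tools monitor workspace identity files, Bastion protects the input/output boundary: the files the agent reads, web content, API responses, and user-supplied documents.
Why This Matters
Agents process content from many sources: local files, API responses, web pages, user uploads. Any of these can contain prompt injection attacks, meaning hidden instructions that manipulate agent behavior. Bastion scans this content before the agent acts on it.
Commands
Scan for Injections
Scan files or directories for prompt injection patterns. Detects instruction overrides, system prompt markers, hidden Unicode, Markdown exfiltration, HTML injection, shell injection, encoded payloads, delimiter confusion, multi-turn manipulation, and dangerous commands.
If no target is specified, scans the entire workspace.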
```bash
python3 {baseDir}/scripts/bastion.py scan
```
Scan a specific file or directory:
```bash
python3 {baseDir}/scripts/bastion.py scan path/to/file.md
python3 {baseDir}/scripts/bastion.py scan path/to/directory/
```
Quick File Check
Fast single-file injection check. Same detection patterns as scan, targeted to one file.
```bash
python3 {baseDir}/scripts/bastion.py check path/to/file.md
```
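Boundary Analysis
Analyze content boundary safety across the workspace. Identifies:
- Agent instruction files that mix trusted and untrusted content
- Writable instruction files (attack surface for compromised skills)
- Blast radius assessment for each critical file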
```bash
python3 {baseDir}/scripts/bastion.py boundaries
```
Command Allowlist
Display the current command allowlist and blocklist policy. Creates a default .bastion-policy.json if none exists.
```bash
python3 {baseDir}/scripts/bastion.py allowlist
python3 {baseDir}/scripts/bastion.py allowlist --show
```
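The policy file defines which commands are considered safe and which patterns are blocked. Edit the JSON file directly to customize. Bastion Pro enforces this policy at runtime via hooks.
Status
Quick summary of workspace injection defense posture: files scanned, findings by severity, boundary safety, and overall posture rating.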
```bash
python3 {baseDir}/scripts/bastion.py status
```
Workspace Auto-Detection
If --workspace is omitted, the script tries, in order:
1. `OPENCLAW_WORKSPACE` environment variable
2. Current directory (if `AGENTS.md` exists)
3. `~/.openclaw/workspace` (default)
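What Gets Detected
| Category | Patterns | Severity |
|---|---|---|
| Instruction override | "ignore previous", "disregard above", "you are now", "new system prompt", "forget your instructions", "override safety", "act as if no restrictions", "entering developer mode" | CRITICAL |
| System prompt markers | `<system>`, `[SYSTEM]`, `<<SYS>>`, `<\|im_start\|>system`, `[INST]`, `### System:` | CRITICAL |
| Hidden instructions | Multi-turn manipulation ("in your next response, you must"), stealth patterns ("do not tell the user") | CRITICAL |
| HTML injection |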