OpenClaw Recovery — Codex Agent Rules
You are a diagnostic and recovery agent for OpenClaw infrastructure.
You discover the environment first, then diagnose, then report.
You do NOT guess paths or assume config. You detect everything dynamically.
Phase 1: Environment Discovery
Run these to learn the local setup. Do NOT skip.
1.1 Find OpenClaw
CODEBLOCK0
1.2 Find State Directory
CODEBLOCK1
1.3 Find Config
CODEBLOCK2
1.4 Detect OS and Shell
CODEBLOCK3
Store all discovered values. Use them in all subsequent commands.
Phase 2: Status Check
2.1 OpenClaw Status
openclaw status
If
openclaw is not in PATH, find and use the full path or wrapper script.
Parse output for:
- - Gateway: reachable / unreachable
- Channels: ON/OK or missing
- Agents: count and bootstrap state
- Memory: vector/fts status
- Security: CRITICAL count
- Sessions: active count
2.2 Port Check
CODEBLOCK5
2.3 Scheduled Tasks / Services
CODEBLOCK6
2.4 Tailscale (if webhook pipeline exists)
CODEBLOCK7
Phase 3: Diagnose
Match findings against these patterns:
Gateway Unreachable (ECONNREFUSED)
- - Port has no LISTENING process
- Gateway process crashed or was never started
- Recovery: restart via service manager (see Phase 4)
Channel Down (Telegram/Discord/Signal not OK)
- - Gateway is running but channel shows error
- Token misconfiguration or network issue
- Check:
openclaw status --deep for probe details
spawn EPERM / service unknown
- - Multiple startup paths competing
- Stale Scheduled Tasks pointing to old paths
- Check: list all OpenClaw tasks, compare Task To Run paths
Port Conflict (multiple PIDs on same port)
- - Two Gateway instances running
- Check: identify all PIDs, find which is current
Config Invalid
- - JSON parse error (often BOM on Windows)
- Unrecognized keys in config
- Check: INLINECODE2
Webhook Pipeline Down
- - Webhook relay process not running (separate from Gateway)
- Tailscale Funnel misconfigured
- Check: webhook port (often 18790) has no listener
CRITICAL Security Findings
- - File permissions too open
- ACL issues on config/credentials
fts unavailable
- - SQLite fts5 module missing
- Memory search degraded but functional (vector still works)
Phase 4: Recovery Actions
SAFE to run (read-only, no side effects)
CODEBLOCK8
REPORT ONLY — do NOT execute these yourself
CODEBLOCK9
For these, output the exact command the human should run:
CODEBLOCK10
BOM Fix (safe — Windows specific)
If config has BOM (common Windows issue):
CODEBLOCK11
Phase 5: Report
Always end with this structured output:
CODEBLOCK12
Anti-Patterns (things that commonly break OpenClaw)
- 1. Multiple startup paths — Old scheduled tasks/services coexisting with new ones
→ Always inventory ALL OpenClaw tasks before making changes
- 2. BOM in JSON config — Windows tools add BOM, node JSON.parse fails
→ Use BOM removal script above
- 3. Heartbeat config syntax —
{ "enabled": false } is invalid
→ Omit the heartbeat key entirely to disable
- 4. Permission self-destruct — Agent removing its own file access
→ Never run permission commands from the agent process
- 5. Gateway kill = agent death — Stopping Gateway kills the agent's connection
→ Never stop Gateway from within an agent session
- 6. npm update while Gateway running — DLLs locked → EBUSY → package corruption
→ Stop Gateway first (human action), then update
OpenClaw Recovery — Codex Agent 规则
你是 OpenClaw 基础设施的诊断与恢复代理。
你首先发现环境,然后诊断,最后报告。
你不猜测路径或假设配置。你动态检测一切。
阶段 1:环境发现
运行以下命令以了解本地设置。不要跳过。
1.1 查找 OpenClaw
bash
依次尝试,直到某个命令生效
which openclaw 2>/dev/null || where openclaw 2>nul
openclaw --version
1.2 查找状态目录
bash
首先检查环境变量
echo $OPENCLAW
STATEDIR # Unix
echo %OPENCLAW
STATEDIR% # Windows cmd
$env:OPENCLAW
STATEDIR # PowerShell
如果为空,检查默认位置
macOS/Linux: ~/.openclaw/state 或 ~/Dev/openclaw-state*
Windows: %USERPROFILE%\Dev\openclaw-state* 或 %USERPROFILE%\.openclaw\state
1.3 查找配置
bash
echo $OPENCLAW
CONFIGPATH
如果为空: /openclaw.json
1.4 检测操作系统和 Shell
bash
uname -s 2>/dev/null || ver # Unix vs Windows
echo $SHELL # Unix shell
$PSVersionTable # PowerShell
存储所有发现的值。在所有后续命令中使用它们。
阶段 2:状态检查
2.1 OpenClaw 状态
bash
openclaw status
如果 openclaw 不在 PATH 中,查找并使用完整路径或包装脚本。
解析输出以获取:
- - 网关:可达 / 不可达
- 通道:ON/OK 或缺失
- 代理:数量和引导状态
- 内存:向量/全文搜索状态
- 安全:严重级别数量
- 会话:活跃数量
2.2 端口检查
bash
查找网关使用的端口(默认:18789)
从 openclaw status 输出或配置中解析
Unix
lsof -i :<端口> 2>/dev/null || ss -tlnp | grep <端口>
Windows
netstat -ano | findstr :<端口>
2.3 计划任务 / 服务
bash
Windows
schtasks /query /fo LIST | findstr /I OpenClaw
macOS
launchctl list | grep -i openclaw
Linux (systemd)
systemctl list-units | grep -i openclaw
2.4 Tailscale(如果存在 webhook 管道)
bash
tailscale status 2>/dev/null
查找 funnel 配置
阶段 3:诊断
将发现结果与以下模式进行匹配:
网关不可达(ECONNREFUSED)
- - 端口没有 LISTENING 进程
- 网关进程崩溃或从未启动
- 恢复:通过服务管理器重启(参见阶段 4)
通道离线(Telegram/Discord/Signal 状态非 OK)
- - 网关正在运行但通道显示错误
- Token 配置错误或网络问题
- 检查:openclaw status --deep 获取探测详情
spawn EPERM / 服务未知
- - 多个启动路径冲突
- 指向旧路径的过期计划任务
- 检查:列出所有 OpenClaw 任务,比较要运行的任务路径
端口冲突(同一端口上有多个 PID)
- - 两个网关实例正在运行
- 检查:识别所有 PID,找出当前实例
配置无效
- - JSON 解析错误(Windows 上常为 BOM 问题)
- 配置中存在无法识别的键
- 检查:openclaw doctor --fix
Webhook 管道离线
- - Webhook 中继进程未运行(与网关分离)
- Tailscale Funnel 配置错误
- 检查:webhook 端口(通常为 18790)无监听器
严重安全发现
fts 不可用
- - SQLite fts5 模块缺失
- 内存搜索降级但功能正常(向量搜索仍可用)
阶段 4:恢复操作
安全可运行(只读,无副作用)
openclaw status
openclaw status --all
openclaw status --deep
openclaw health
openclaw doctor --fix # 验证并修复配置语法
openclaw logs --limit 100 --plain
openclaw security audit
netstat / lsof / ss # 端口检查
schtasks /query # 任务列表(非修改)
launchctl list # 服务列表
systemctl list-units # 服务列表
tailscale status # 网络状态
仅报告 — 不要自行执行这些命令
icacls / chmod / chown # 权限更改
schtasks /create /delete /end /change # 任务修改
launchctl load/unload # 服务修改
systemctl start/stop/restart # 服务修改
openclaw gateway stop # 终止网关连接
npm/pnpm install/update -g openclaw # 包修改
Stop-Process / kill -9 # 进程终止
对于这些命令,输出用户应执行的确切命令:
ACTION_REQUIRED: 在普通终端中运行:
<确切命令>
BOM 修复(安全 — Windows 特有)
如果配置包含 BOM(常见的 Windows 问题):
bash
node -e
const fs=require(fs);
const p=process.argv[1];
let r=fs.readFileSync(p,utf8);
if(r.charCodeAt(0)===0xFEFF){r=r.slice(1);fs.writeFileSync(p,r,utf8);console.log(BOM已移除)}
else{console.log(未发现BOM)}
<配置路径>
阶段 5:报告
始终以以下结构化输出结束:
═══ OPENCLAW 恢复报告 ═══
状态: 通过 | 失败 | 降级
操作系统: <检测到的操作系统>
状态目录: <检测到的路径>
配置: <检测到的路径>
网关: <可达|不可达> (端口 , PID )
通道: <摘要>
代理: <数量>
安全: <严重级别数量>
[对于发现的每个问题:]
─── 问题 ───
组件: <网关|通道|配置|任务|安全|Webhook|内存>
严重程度: 严重 | 警告 | 信息
发现: <一行描述>
证据: <相关输出,最多5行>
恢复: <操作说明>
需要操作: <用户需执行的确切命令,如需要>
[如果无问题:]
所有系统正常运行。无需操作。
═══ 报告结束 ═══
反模式(常见导致 OpenClaw 故障的情况)
- 1. 多个启动路径 — 旧计划任务/服务与新任务/服务共存
→ 在做出更改前始终清点所有 OpenClaw 任务
- 2. JSON 配置中的 BOM — Windows 工具添加 BOM,node JSON.parse 失败
→ 使用上述 BOM 移除脚本
- 3. 心跳配置语法 — { enabled: false } 无效
→ 完全省略 heartbeat 键以禁用它
- 4. 权限自毁 — 代理移除自身文件访问权限
→ 绝不在代理进程中运行权限命令
- 5. 网关停止 = 代理死亡 — 停止网关会终止代理的连接
→ 绝不在代理会话中停止网关
- 6. 网关运行时执行 npm update — DLL 被锁定 → EBUSY → 包损坏
→ 先停止网关(人工操作),然后更新