paperclip-resilience

Production-grade resilience for AI agents running on Paperclip, orchestrated through OpenClaw.

The Problem

Paperclip agents die silently when providers hit rate limits, sessions crash on gateway restarts, and failed runs leave agents stuck in an error state with no recovery path. If you're running agents overnight or in parallel, you need automated recovery, not manual babysitting.

What's Included

| Module | File | Purpose |
| --- | --- | --- |
| Spawn with Fallback | `src/spawn-with-fallback.js` | Wraps `openclaw session spawn` with automatic provider failover. If your primary model returns a 429, it tries the configured fallback. |
| Model Rotation | `src/model-rotation.js` | Tracks fix attempts per PR/task and rotates through models + thinking levels after repeated failures. |
| Run Recovery | `src/run-recovery.js` | Detects failed Paperclip heartbeat runs (gateway errors, timeouts, 429s) and re-invokes agents with model fallback. |
| Blocker Routing | `src/blocker-routing.js` | Scans agent session transcripts for blocked/stuck signals and routes them to configurable destinations (file, stdout, webhook). |
| Task Injection | `src/task-injection.js` | Enriches spawn task descriptions with issue tracking metadata, PR requirements, and UX design checklists before agent execution. |
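
To make the Blocker Routing row more concrete, here is a minimal sketch of scanning a transcript for blocked/stuck signals and handing the matches to a destination. The names `BLOCKER_SIGNALS`, `findBlockers`, and `routeBlockers`, the signal list, and the destination shape are all assumptions for illustration; the README does not show what `src/blocker-routing.js` actually does internally.

```javascript
// Hypothetical signal list; the real module's patterns may differ.
const BLOCKER_SIGNALS = [/\bblocked\b/i, /\bstuck\b/i];

// Return every transcript line that matches at least one signal.
function findBlockers(transcript, signals = BLOCKER_SIGNALS) {
  return transcript
    .split('\n')
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    .filter(({ text }) => signals.some((re) => re.test(text)));
}

// Route findings to a destination. Only stdout and a caller-supplied
// callback are sketched here; file and webhook destinations are omitted.
function routeBlockers(blockers, destination = { type: 'stdout' }) {
  for (const blocker of blockers) {
    const message = `[blocker] line ${blocker.line}: ${blocker.text}`;
    if (destination.type === 'callback') destination.handler(message);
    else console.log(message);
  }
  return blockers.length;
}
```

Keeping detection and routing as separate functions means a new destination can be added without touching the matching logic.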

Quick Start

1. Install

```bash
clawhub install paperclip-resilience
```

2. Configure

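```bash
cd skills/paperclip-resilience
cp config.example.json config.json
# Edit config.json with your model aliases and fallback pairs
```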

3. Use Spawn with Fallback

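```bash
# CLI
node skills/paperclip-resilience/src/spawn-with-fallback.js \
  --model sonnet --task "Fix the login bug" --mode run

# Dry run to see what would happen
node skills/paperclip-resilience/src/spawn-with-fallback.js \
  --model opus --task "Refactor auth" --dry-run
```

```javascript
// Programmatic (wrapped in an async function: CommonJS has no top-level await)
const { spawnWithFallback, loadConfig } = require('./skills/paperclip-resilience/src/spawn-with-fallback');
async function main() {
  const config = loadConfig('./my-config.json');
  const result = await spawnWithFallback({ model: 'sonnet', task: 'Fix the bug', config });
  console.log(result);
}
main().catch(console.error);
```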
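
For readers who want a mental model of provider failover, the following is a rough sketch of the control flow, not the module's actual implementation. `trySpawnChain`, the injected `spawn` function, the `{ ok, output }` result shape, and the `isFailure` predicate are all hypothetical stand-ins; the real `spawnWithFallback` may be structured differently.

```javascript
// `spawn` is an injected async function standing in for the real
// `openclaw session spawn` call; it is assumed to resolve to { ok, output }.
async function trySpawnChain(models, task, spawn, isFailure) {
  const attempts = [];
  for (const model of models) {
    let result;
    try {
      result = await spawn({ model, task });
    } catch (err) {
      // Treat a thrown error like a failed attempt with the message as output.
      result = { ok: false, output: String(err && err.message ? err.message : err) };
    }
    attempts.push({ model, ok: result.ok });
    // Succeeded: stop here.
    if (result.ok) return { model, output: result.output, attempts };
    // Failed for a reason a different provider would not fix: stop as well.
    if (!isFailure(result.output)) break;
  }
  return { model: null, output: null, attempts };
}
```

The `attempts` array is returned so a caller can log which providers were tried and in what order.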

4. Set Up Run Recovery (Cron)

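Add run recovery to your OpenClaw cron schedule to auto-recover failed runs. Start with a dry run to confirm what it would do:

```bash
node skills/paperclip-resilience/src/run-recovery.js --dry-run --verbose
```

Once verified, schedule it:

```
*/15 * * * * node skills/paperclip-resilience/src/run-recovery.js
```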
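
As an illustration of what a recovery pass might decide, the sketch below classifies failed runs by error text and plans a re-invocation for the recoverable ones. The run record shape (`{ id, status, error }`), the `classifyRun` and `planRecovery` names, and the regexes are assumptions made for this example; they are not taken from `src/run-recovery.js`.

```javascript
// Hypothetical classification rules for the failure kinds this README
// mentions: 429s, timeouts, and gateway errors.
const RECOVERABLE = [
  { reason: 'rate_limit', test: /\b429\b|rate[\s_-]?limit/i },
  { reason: 'timeout', test: /timed?\s?out|timeout/i },
  { reason: 'gateway', test: /gateway|\b50[234]\b/i },
];

// Returns null for runs that did not fail, a reason string for recognised
// failures, and 'unknown' for failures that match no rule.
function classifyRun(run) {
  if (run.status !== 'failed') return null;
  const match = RECOVERABLE.find(({ test }) => test.test(run.error || ''));
  return match ? match.reason : 'unknown';
}

// Build the list of runs to re-invoke; with dryRun the action is only reported.
function planRecovery(runs, { dryRun = false } = {}) {
  return runs
    .map((run) => ({ id: run.id, reason: classifyRun(run) }))
    .filter(({ reason }) => reason && reason !== 'unknown')
    .map((item) => ({ ...item, action: dryRun ? 'would-reinvoke' : 'reinvoke' }));
}
```

Leaving unrecognised failures out of the plan is a deliberately conservative choice in this sketch: an agent that failed for an unknown reason is probably better surfaced to a human than retried blindly.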

5. Model Rotation for PR Fixes

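```bash
# Check whether a PR needs model rotation
node skills/paperclip-resilience/src/model-rotation.js check --pr 42 --repo owner/repo

# Record an attempt
node skills/paperclip-resilience/src/model-rotation.js record --pr 42 --repo owner/repo --model anthropic/claude-sonnet-4-6
```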
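
The rotation idea (count attempts per PR, escalate after repeated failures) can be sketched as follows. The escalation ladder, the `low`/`high` thinking-level names, the threshold of two attempts per step, and the in-memory `Map` are assumptions for illustration only; the real module presumably persists state and reads its ladder from config.

```javascript
// Hypothetical escalation ladder; real entries would come from config.json.
const LADDER = [
  { model: 'anthropic/claude-sonnet-4-6', thinking: 'low' },
  { model: 'anthropic/claude-sonnet-4-6', thinking: 'high' },
  { model: 'openai-codex/gpt-5.3-codex', thinking: 'high' },
];
const ATTEMPTS_PER_STEP = 2; // assumed threshold

const attempts = new Map(); // "owner/repo#pr" -> number of recorded attempts

function recordAttempt(repo, pr) {
  const key = `${repo}#${pr}`;
  attempts.set(key, (attempts.get(key) || 0) + 1);
  return attempts.get(key);
}

// Pick the ladder step for the next attempt, staying on the last step
// once the ladder is exhausted.
function nextStep(repo, pr) {
  const count = attempts.get(`${repo}#${pr}`) || 0;
  const index = Math.min(Math.floor(count / ATTEMPTS_PER_STEP), LADDER.length - 1);
  return { ...LADDER[index], rotated: index > 0 };
}
```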

Configuration

All modules read from config.json in the skill directory, with sensible defaults if no config is provided.

See config.example.json for the full documented schema, and config.schema.json for validation.

Key Configuration Sections

aliases — Map short model names to full provider/model strings:
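```json
{
  "aliases": {
    "sonnet": "anthropic/claude-sonnet-4-6",
    "opus": "anthropic/claude-opus-4-6",
    "codex": "openai-codex/gpt-5.3-codex"
  }
}
```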

fallbacks — Define provider failover pairs:
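```json
{
  "fallbacks": {
    "anthropic/claude-sonnet-4-6": "openai-codex/gpt-5.3-codex",
    "openai-codex/gpt-5.3-codex": "anthropic/claude-sonnet-4-6"
  }
}
```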
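
One way to picture how `aliases` and `fallbacks` combine is to expand the alias first and then follow the fallback map until it repeats. This is a sketch under that assumption; `resolveModel` and `fallbackChain` are hypothetical helper names, and the config values are the example values used elsewhere in this README.

```javascript
const config = {
  aliases: { sonnet: 'anthropic/claude-sonnet-4-6', codex: 'openai-codex/gpt-5.3-codex' },
  fallbacks: {
    'anthropic/claude-sonnet-4-6': 'openai-codex/gpt-5.3-codex',
    'openai-codex/gpt-5.3-codex': 'anthropic/claude-sonnet-4-6',
  },
};

// Expand a short alias; full model strings pass through unchanged.
function resolveModel(name, cfg) {
  return Object.hasOwn(cfg.aliases, name) ? cfg.aliases[name] : name;
}

// Build the ordered list of models to try, stopping at the first repeat
// so that mutually referencing pairs do not loop forever.
function fallbackChain(name, cfg) {
  const chain = [];
  let current = resolveModel(name, cfg);
  while (current && !chain.includes(current)) {
    chain.push(current);
    current = Object.hasOwn(cfg.fallbacks, current) ? cfg.fallbacks[current] : undefined;
  }
  return chain;
}
```

The repeat check matters because the example `fallbacks` map points each model at the other.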

failurePatterns — Regex patterns that trigger fallback:
```json
{
  "failurePatterns": {
    "patterns": ["credits", "quota", "402", "rate[\\s_-]?limit"]
  }
}
```
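
Since `failurePatterns` are user-supplied regexes, they are a natural place to cap input and tolerate bad entries. The sketch below shows one possible approach; the caps (`MAX_PATTERNS`, `MAX_PATTERN_LENGTH`), the case-insensitive flag, and the function names are assumptions, not values taken from the skill's source.

```javascript
const MAX_PATTERNS = 20; // assumed cap on how many patterns are honoured
const MAX_PATTERN_LENGTH = 200; // assumed cap on a single pattern's length

// Compile the configured pattern strings, skipping anything oversized or invalid.
function compileFailurePatterns(patterns) {
  const compiled = [];
  for (const source of patterns.slice(0, MAX_PATTERNS)) {
    if (typeof source !== 'string' || source.length === 0) continue;
    if (source.length > MAX_PATTERN_LENGTH) continue;
    try {
      compiled.push(new RegExp(source, 'i')); // case-insensitivity is an assumption
    } catch {
      // Invalid regex: drop it rather than failing the whole config.
    }
  }
  return compiled;
}

// True when the spawn output matches any failure pattern.
function shouldFallback(output, compiled) {
  return compiled.some((re) => re.test(output));
}
```

Length and count caps reduce exposure to pathological patterns but do not eliminate ReDoS on their own; a short pattern can still backtrack badly.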

Architecture

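```
┌──────────────────┐     ┌──────────────────────┐
│  Task Injection  │────▶│ Spawn with Fallback  │
│  (enrich tasks)  │     │ (provider retry)     │
└──────────────────┘     └──────────┬───────────┘
                                    │
                                    ▼
                         ┌──────────────────────┐
                         │ Paperclip Agent      │
                         │ (heartbeat runs)     │
                         └──────────┬───────────┘
                                    │
                         ┌──────────┴───────────┐
                         │                      │
                         ▼                      ▼
                 ┌────────────────┐   ┌──────────────────┐
                 │  Run Recovery  │   │ Blocker Routing  │
                 │ (detect + wake)│   │ (escalate stuck) │
                 └────────────────┘   └──────────────────┘
                         │
                         ▼
                 ┌────────────────┐
                 │ Model Rotation │
                 │(escalate model)│
                 └────────────────┘
```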

Requirements

- OpenClaw (for session spawning and agent management)
- Paperclip (for heartbeat run monitoring and agent lifecycle)
- Node.js 18+
- At least two LLM provider API keys configured (for fallback to work)

Security

This skill was security-reviewed for ClawHub publication in SUP-453. The code paths that accept user-controlled input now enforce validation up front and fail closed.

Hardened Surfaces

| Surface | Protection |
| --- | --- |
| Model names | Character allowlist with support for provider suffixes like `:free`; rejects empty path segments and `.` / `..` traversal segments |
| Task files (`@file`) | Blocks explicit `../`, canonicalizes symlinks with `realpath`, rejects system paths like `/etc/` and `/usr/`, requires a regular file |
| Task payloads | 1MB max size limit for inline and file-backed task content |
| Spawn mode + labels | Allowlist validation for mode (`run`, `session`) and safe-character validation for labels |
| Failure regex config | Caps pattern count/length and drops invalid regexes to reduce ReDoS risk |
| Paperclip issue metadata | Sanitizes API strings, constrains issue identifier extraction, normalizes priority values |
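
As an example of what the model-name protection could look like, here is a small validator that follows the rules listed for that surface: a character allowlist, an optional provider suffix such as `:free`, and rejection of empty, `.`, and `..` segments. The exact allowlist and the name `isValidModelName` are assumptions; the skill's real character set may be narrower or wider.

```javascript
// Assumed allowlist: letters, digits, dot, underscore, and hyphen per path
// segment, with an optional ":suffix" at the end (for example ":free").
const SEGMENT = /^[A-Za-z0-9._-]+$/;
const SUFFIX = /^[A-Za-z0-9_-]+$/;

function isValidModelName(name) {
  if (typeof name !== 'string' || name.length === 0) return false;
  const parts = name.split(':');
  if (parts.length > 2) return false; // at most one suffix
  const [modelPath, suffix] = parts;
  if (suffix !== undefined && !SUFFIX.test(suffix)) return false;
  // Every path segment must be non-empty, allowlisted, and not "." or "..".
  return modelPath.split('/').every(
    (segment) => SEGMENT.test(segment) && segment !== '.' && segment !== '..'
  );
}
```

Validating by allowlist rather than by blocklist means anything unexpected (spaces, shell metacharacters, control characters) is rejected by default, which matches the fail-closed behavior described above.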

Security Boundaries

- Process execution: uses `execFile`, not shell execution
- Dynamic code execution: none (`eval` / `Function` not used)
- Credentials: read from environment or external auth files; not embedded in the skill
- File access: limited to explicitly requested files, with traversal and symlink tunnel protections
- Dependencies: zero external runtime dependencies in this package
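
The file-access boundary and the task-file rules under Hardened Surfaces can be illustrated with a short reader that applies the checks in order: reject `..`, resolve symlinks, reject system paths, require a regular file, and enforce the 1MB limit. `readTaskFile` is a hypothetical name, and the two-entry system-path list simply mirrors the examples this README gives; a real list is likely longer.

```javascript
const fs = require('node:fs');
const path = require('node:path');

const MAX_TASK_BYTES = 1024 * 1024; // the 1MB task payload limit
const SYSTEM_PREFIXES = ['/etc/', '/usr/']; // only the two paths this README names

function readTaskFile(requested) {
  // Reject explicit parent-directory segments before touching the filesystem.
  if (requested.split(/[\\/]/).includes('..')) {
    throw new Error('task file path must not contain ".."');
  }
  // Resolve symlinks so the checks below apply to the real target.
  const real = fs.realpathSync(path.resolve(requested));
  if (SYSTEM_PREFIXES.some((prefix) => real.startsWith(prefix))) {
    throw new Error('task file resolves to a system path');
  }
  const stat = fs.statSync(real);
  if (!stat.isFile()) throw new Error('task file must be a regular file');
  if (stat.size > MAX_TASK_BYTES) throw new Error('task file exceeds 1MB');
  return fs.readFileSync(real, 'utf8');
}
```

Checking the path after `realpathSync` is what stops a symlink inside an allowed directory from pointing at a protected location.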

Verification

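```bash
# Functional coverage
node skills/paperclip-resilience/tests/test-spawn-with-fallback.js

# Full security suite
node skills/paperclip-resilience/tests/test-security.js

# Quick smoke test
node skills/paperclip-resilience/tests/test-security-quick.js
```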

Audit Record

- Last audit: 2026-03-27
- Tracking issue: SUP-453
- Status: ✅ Approved for ClawHub publication
- Details: see SECURITY-AUDIT-REPORT.md

Related Paperclip Issues

These are the upstream gaps this skill works around:

- #276: Auto-requeue agent on failure
- #1845: No crash-recovery wakeup after restart
- #1861: Agent death on 429 with no model fallback

License

MIT

paperclip-resilience

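Production-grade resilience for AI agents running on Paperclip, orchestrated through OpenClaw.

The Problem

Paperclip agents die silently when providers hit rate limits, sessions crash on gateway restarts, and failed runs leave agents stuck in an error state with no way to recover. If you need agents to run overnight or in parallel, you need automated recovery, not manual babysitting.

What's Included

| Module | File | Purpose |
| --- | --- | --- |
| Spawn with Fallback | `src/spawn-with-fallback.js` | Wraps `openclaw session spawn` with automatic provider failover. If the primary model returns a 429, it tries the configured fallback model. |
| Model Rotation | `src/model-rotation.js` | Tracks fix attempts per PR/task and rotates through models + thinking levels after repeated failures. |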

Quick Start

1. Install

```bash
clawhub install paperclip-resilience
```

2. Configure

```bash
cd skills/paperclip-resilience
cp config.example.json config.json
# Edit config.json with your model aliases and fallback pairs
```

3. Use Spawn with Fallback

```bash
# CLI
node skills/paperclip-resilience/src/spawn-with-fallback.js \
  --model sonnet --task "Fix the login bug" --mode run

# Dry run to see what would happen
node skills/paperclip-resilience/src/spawn-with-fallback.js \
  --model opus --task "Refactor auth" --dry-run
```

```javascript
// Programmatic (wrapped in an async function: CommonJS has no top-level await)
const { spawnWithFallback, loadConfig } = require('./skills/paperclip-resilience/src/spawn-with-fallback');
async function main() {
  const config = loadConfig('./my-config.json');
  const result = await spawnWithFallback({ model: 'sonnet', task: 'Fix the bug', config });
  console.log(result);
}
main().catch(console.error);
```

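4. Set Up Run Recovery (Cron)

Add run recovery to your OpenClaw cron schedule to auto-recover failed runs. Start with a dry run:

```bash
node skills/paperclip-resilience/src/run-recovery.js --dry-run --verbose
```

Once verified, schedule it:

```
*/15 * * * * node skills/paperclip-resilience/src/run-recovery.js
```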

5. Model Rotation for PR Fixes

```bash
# Check whether a PR needs model rotation
node skills/paperclip-resilience/src/model-rotation.js check --pr 42 --repo owner/repo

# Record an attempt
node skills/paperclip-resilience/src/model-rotation.js record --pr 42 --repo owner/repo --model anthropic/claude-sonnet-4-6
```

Configuration

All modules read from `config.json` in the skill directory and fall back to sensible defaults if no config is provided.

See `config.example.json` for the full documented schema, and `config.schema.json` for validation.

Key Configuration Sections

`aliases`: map short model names to full provider/model strings:

```json
{
  "aliases": {
    "sonnet": "anthropic/claude-sonnet-4-6",
    "opus": "anthropic/claude-opus-4-6",
    "codex": "openai-codex/gpt-5.3-codex"
  }
}
```

`fallbacks`: define provider failover pairs:

```json
{
  "fallbacks": {
    "anthropic/claude-sonnet-4-6": "openai-codex/gpt-5.3-codex",
    "openai-codex/gpt-5.3-codex": "anthropic/claude-sonnet-4-6"
  }
}
```

`failurePatterns`: regex patterns that trigger fallback:

```json
{
  "failurePatterns": {
    "patterns": ["credits", "quota", "402", "rate[\\s_-]?limit"]
  }
}
```

Architecture

```
┌──────────────────┐     ┌──────────────────────┐
│  Task Injection  │────▶│ Spawn with Fallback  │
│  (enrich tasks)  │     │ (provider retry)     │
└──────────────────┘     └──────────┬───────────┘
                                    │
                                    ▼
                         ┌──────────────────────┐
                         │ Paperclip Agent      │
                         │ (heartbeat runs)     │
                         └──────────┬───────────┘
                                    │
                         ┌──────────┴───────────┐
                         │                      │
                         ▼                      ▼
                 ┌────────────────┐   ┌──────────────────┐
                 │  Run Recovery  │   │ Blocker Routing  │
                 │ (detect + wake)│   │ (escalate stuck) │
                 └────────────────┘   └──────────────────┘
                         │
                         ▼
                 ┌────────────────┐
                 │ Model Rotation │
                 │(escalate model)│
                 └────────────────┘
```

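Requirements

- OpenClaw (for session spawning and agent management)
- Paperclip (for heartbeat run monitoring and agent lifecycle)
- Node.js 18+
- At least two LLM provider API keys configured (so that fallback can work)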

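Security

This skill was security-reviewed for ClawHub publication under SUP-453. The code paths that accept user input now enforce validation up front and fail closed.

Hardened Surfaces

| Surface | Protection |
| --- | --- |
| Model names | Character allowlist with support for provider suffixes like `:free`; rejects empty path segments and `.` / `..` traversal segments |
| Task files (`@file`) | Blocks explicit `../`, canonicalizes symlinks with `realpath`, rejects system paths like `/etc/` and `/usr/`, requires a regular file |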

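Security Boundaries

- Process execution: uses `execFile`, not shell execution
- Dynamic code execution: none (`eval` / `Function` not used)
- Credentials: read from environment variables or external auth files; not embedded in the skill
- File access: limited to explicitly requested files, with traversal and symlink tunnel protections
- Dependencies: zero external runtime dependencies in this package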

Verification

```bash
# Functional coverage
node skills/paperclip-resilience/tests/test-spawn-with-fallback.js

# Full security suite
node skills/paperclip-resilience/tests/test-security.js

# Quick smoke test
node skills/paperclip-resilience/tests/test-security-quick.js
```

Audit Record

- Last audit: 2026-03-27
- Tracking issue: SUP-453
- Status: ✅ Approved for ClawHub publication
- Details: see SECURITY-AUDIT-REPORT.md

License

MIT
