Checkmate

A deterministic Python loop (scripts/run.py) calls an LLM for worker and judge roles.
Nothing leaves until it passes — and you stay in control at every checkpoint.

Requirements

- OpenClaw platform CLI (openclaw): must be available in PATH. Used for:
  - openclaw gateway call sessions.list: resolve the session UUID for turn injection
  - openclaw agent --session-id <UUID>: inject checkpoint messages into the live session
  - openclaw message send: fallback channel delivery (e.g. Telegram, Signal)
- Python 3: run.py is pure stdlib; no pip packages required
- No separate API keys or env vars needed: calls route through the gateway's existing OAuth

Security & Privilege Model

⚠️ This is a high-privilege skill. Read before using in batch/automated mode.

Spawned workers and judges inherit full host-agent runtime, including:

- exec (arbitrary shell commands)
- websearch, webfetch
- All installed skills (including those with OAuth-bound credentials such as Gmail, Drive, etc.)
- sessions_spawn (workers can spawn further sub-agents)

This means the task description you provide directly controls what the worker does — treat it like code you're about to run, not a message you're about to send.

Batch mode (--no-interactive) removes all human gates. In interactive mode (default), you approve criteria and each checkpoint before the loop continues. In batch mode, criteria are auto-approved and the loop runs to completion autonomously — only use this for tasks and environments you fully trust.

User-input bridging writes arbitrary content to disk. When you reply to a checkpoint, the main agent writes your reply verbatim to user-input.md in the workspace. The orchestrator reads it and acts on it. Don't relay untrusted third-party content as checkpoint replies.

When to Use

Use checkmate when correctness matters more than speed — when "good enough on the first try" isn't acceptable.

Good fits:

- Code that must pass tests or meet a spec
- Docs or reports that must hit a defined quality bar
- Research that must be thorough and cover specific ground
- Any task where you'd otherwise iterate manually until satisfied

Trigger phrases (say any of these):

- checkmate: TASK
- keep iterating until it passes
- don't stop until done
- until it passes
- quality loop: TASK
- iterate until satisfied
- judge and retry
- keep going until done

Architecture

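scripts/run.py (deterministic Python while loop, the orchestrator)
├─ Intake loop [up to max_intake_iter, default 5]:
│   ├─ Draft criteria (intake prompt + task + refinement feedback)
│   ├─ ⏸ USER REVIEW: show draft → wait for approval or feedback
│   │     Approved? → lock criteria.md
│   │     Feedback? → refine, next intake iteration
│   └─ (non-interactive: criteria-judge gates instead of the user)
│
├─ ⏸ PRE-START GATE: show final task + criteria → user confirms start
│     (editing the task or cancelling is supported here)
│
└─ Main loop [up to max_iter, default 10]:
    ├─ Worker: spawn agent session → iter-N/output.md
    │     (full runtime: exec, web_search, all skills, OAuth auth)
    ├─ Judge: spawn agent session → iter-N/verdict.md
    ├─ PASS? → write final-output.md, notify user, exit
    └─ FAIL? → extract gaps → ⏸ CHECKPOINT: show user score + gaps
          continue? → next iteration (with judge gaps)
          redirect: X → next iteration (with user direction)
          stop? → end loop, take best result so far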

Interactive mode (default): user approves criteria, confirms pre-start, and reviews each FAIL checkpoint.
Batch mode (--no-interactive): fully autonomous; criteria-judge gates intake, no checkpoints.
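The worker/judge cycle behind these two modes can be sketched in a few lines of Python. This is a minimal illustration, not the code in run.py: the function name `run_main_loop` and the `worker`/`judge` callables are hypothetical stand-ins for the spawned agent sessions, and the sketch returns the last output rather than tracking a "best so far" result.

```python
from typing import Callable, Optional, Tuple


def run_main_loop(
    task: str,
    criteria: str,
    worker: Callable[[str, str, str, int], str],
    judge: Callable[[str, str], Tuple[bool, str]],
    max_iter: int = 10,
) -> Tuple[bool, Optional[str], int]:
    """Run worker -> judge until the judge passes or max_iter is reached.

    `worker` and `judge` are hypothetical callables standing in for agent
    sessions; the judge returns (passed, gaps).
    """
    feedback = ""  # gaps reported by the judge on the previous iteration
    output: Optional[str] = None
    for iteration in range(1, max_iter + 1):
        output = worker(task, criteria, feedback, iteration)
        passed, gaps = judge(output, criteria)
        if passed:
            return True, output, iteration
        # In interactive mode a user checkpoint could replace these gaps
        # with a redirect; this sketch simply forwards the judge's gaps.
        feedback = gaps
    return False, output, max_iter
```

The loop itself stays deterministic: only the two callables involve an LLM, which is what lets the orchestrator enforce the iteration cap and the exit condition regardless of what the model produces.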

User Input Bridge

When the orchestrator needs user input, it:

1. Writes workspace/pending-input.json (kind + workspace path)
2. Sends a notification via --recipient and --channel
3. Polls workspace/user-input.md every 5s (up to --checkpoint-timeout minutes)

The main agent acts as the bridge: when pending-input.json exists and the user replies, the agent writes their response to user-input.md. The orchestrator picks it up automatically.
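The orchestrator side of this handshake is a plain file poll. The sketch below shows one way it could look, assuming the reply file is consumed (deleted) once read; the function name `wait_for_user_input` and its parameters are hypothetical, and the timeout is taken in seconds, so a caller would convert --checkpoint-timeout from minutes.

```python
import time
from pathlib import Path
from typing import Optional


def wait_for_user_input(
    workspace: Path, timeout_s: float, poll_s: float = 5.0
) -> Optional[str]:
    """Poll workspace/user-input.md until it appears or the timeout expires.

    Returns the reply text, or None on timeout. Hypothetical helper.
    """
    reply_file = workspace / "user-input.md"
    deadline = time.monotonic() + timeout_s
    while True:
        if reply_file.exists():
            reply = reply_file.read_text(encoding="utf-8").strip()
            reply_file.unlink()  # consume the reply so it is not read twice
            return reply
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted while the loop is waiting.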

Each agent session is spawned via:

openclaw agent --session-id <isolated-id> --message <prompt> --timeout <N> --json

Routes through the gateway WebSocket using existing OAuth — no separate API key.
Workers get full agent runtime: exec, websearch, webfetch, all skills, sessions_spawn.
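For illustration, the command line above can be assembled as an argument list in Python. This is a sketch under assumptions: `build_agent_command` is a hypothetical helper, generating the isolated session id with `uuid.uuid4()` is a guess at what "isolated" means, and the unit of --timeout is not stated in this document, so the value is passed through unchanged.

```python
import uuid
from typing import List, Optional


def build_agent_command(
    prompt: str, timeout: int, session_id: Optional[str] = None
) -> List[str]:
    """Build the argv for one agent session (hypothetical helper).

    Only the flags shown in this document are used.
    """
    # Assumption: a fresh random UUID is an acceptable isolated session id.
    session_id = session_id or str(uuid.uuid4())
    return [
        "openclaw", "agent",
        "--session-id", session_id,
        "--message", prompt,
        "--timeout", str(timeout),
        "--json",
    ]
```

Passing the list to `subprocess.run(cmd, capture_output=True, text=True)` avoids shell quoting problems with multi-line prompts, which matters here because the prompt is effectively the code the worker will act on.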

Your Job (main agent)

When checkmate is triggered:

1. Get your session UUID (for direct agent-turn injection):

   openclaw gateway call sessions.list --params '{"limit":1}' --json \
     | python3 -c "import json,sys; s=json.load(sys.stdin)['sessions'][0]; print(s['sessionId'])"

Also note your --recipient (channel user/chat ID) and --channel as fallback.

2. Create workspace:

   bash <skill-path>/scripts/workspace.sh /tmp "TASK"

Prints the workspace path. Write the full task to workspace/task.md if needed.

3. Run the orchestrator (background exec):

   python3 <skill-path>/scripts/run.py \
     --workspace /tmp/checkmate-TIMESTAMP \
     --task "FULL TASK DESCRIPTION" \
     --max-iter 10 \
     --session-uuid YOUR_SESSION_UUID \
     --recipient YOUR_RECIPIENT_ID \
     --channel <your-channel>

Use exec with background=true. This runs for as long as needed. Add --no-interactive for fully autonomous runs (no user checkpoints).

4. Tell the user checkmate is running, what it's working on, and that they'll receive criteria drafts and checkpoint messages via your configured channel to review and approve.

5. Bridge user replies: when the user responds to a checkpoint message, check that pending-input.json exists, then write their response to workspace/user-input.md.

Bridging User Input

When a checkpoint message arrives (the orchestrator sent the user a criteria/approval/checkpoint request), bridge their reply:

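# Find the active pending input
cat <workspace-parent>/checkmate-*/pending-input.json 2>/dev/null

# Route the user's reply
echo "USER REPLY HERE" > /path/to/workspace/user-input.md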

The orchestrator polls for this file every 5 seconds. Once written, it resumes automatically and deletes the file.
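The agent-side half of the bridge can also be written in Python. The helper below is a hypothetical sketch: it refuses to write unless pending-input.json exists, and it writes through a temporary file followed by a rename, which is an added precaution (not something this document specifies) so that the polling orchestrator never reads a half-written reply.

```python
from pathlib import Path


def bridge_reply(workspace: Path, reply: str) -> bool:
    """Write the user's reply to user-input.md if input is pending.

    Returns False when no pending-input.json exists. Hypothetical helper.
    """
    if not (workspace / "pending-input.json").exists():
        return False
    tmp = workspace / "user-input.md.tmp"  # temp name is an assumption
    tmp.write_text(reply, encoding="utf-8")
    tmp.replace(workspace / "user-input.md")  # rename into place in one step
    return True
```

The reply is written verbatim, so the warning about relaying untrusted third-party content applies to whatever is passed as `reply`.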

Accepted replies at each gate:

Gate	Continue	Redirect	Cancel
Criteria review	"ok", "approve", "lgtm"	any feedback text	—
Pre-start
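As a rough sketch of how the criteria-review row could be interpreted, the function below maps a reply to an action. The name `classify_criteria_reply` is hypothetical, and trimming whitespace and ignoring case are assumptions; the document only lists the accepted words.

```python
from typing import Tuple

# Approval words listed for the criteria-review gate.
APPROVALS = {"ok", "approve", "lgtm"}


def classify_criteria_reply(reply: str) -> Tuple[str, str]:
    """Return ("continue", "") for an approval, else ("redirect", feedback)."""
    text = reply.strip()
    if text.lower() in APPROVALS:  # case-insensitive match is an assumption
        return ("continue", "")
    # Any other text is treated as feedback for the next intake iteration.
    return ("redirect", text)
```

The criteria-review gate has no cancel option, so the sketch has only two outcomes.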

Parameters

Flag	Default	Notes
--max-intake-iter	5	Intake criteria refinement iterations
--max-iter	10

Workspace layout

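memory/checkmate-YYYYMMDD-HHMMSS/
├── task.md       # task description (user may edit before start)
├── criteria.md
└── …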

Resume

If the script is interrupted, just re-run it with the same --workspace. It reads state.json and skips completed steps. Locked criteria.md is reused; completed iter-N/output.md files are not re-run.
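The "skip completed iterations" part of resume can be approximated by checking which output files already exist. This is a hedged sketch: `first_pending_iteration` is a hypothetical name, the directory naming `iter-1`, `iter-2`, ... is an assumption about how iter-N is numbered, and the contents of state.json are not modeled because its schema is not described here.

```python
from pathlib import Path
from typing import Optional


def first_pending_iteration(workspace: Path, max_iter: int = 10) -> Optional[int]:
    """Return the first iteration whose output.md is missing.

    Returns None when every iteration up to max_iter already has output.
    Assumes directories are named iter-1, iter-2, ... (unpadded).
    """
    for n in range(1, max_iter + 1):
        if not (workspace / f"iter-{n}" / "output.md").exists():
            return n
    return None
```

A resumed run would start its main loop at the returned iteration and reuse criteria.md if it is already locked.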

Prompts

Active prompts called by run.py:

- prompts/intake.md — converts task → criteria draft
- criteria-judge prompt: evaluates criteria quality (APPROVED / NEEDSWORK); used in non-interactive mode
- worker prompt (variables: TASK, CRITERIA, FEEDBACK, ITERATION, MAXITER, OUTPUT_PATH)
- judge prompt: evaluates output against criteria (PASS / FAIL)

Reference only (not called by run.py):

- prompts/orchestrator.md — architecture documentation explaining the design rationale
