🦈 The Shark Pattern

A shark that stops swimming dies. An agent that waits for tools wastes compute.

Works with: Claude Code · Codex · Gemini CLI · Cursor · Windsurf · Aider · OpenClaw · any LLM agent

When to Use This Skill

Trigger this skill when the user says:

- "use the shark pattern"
- "non-blocking agent"
- "never wait for tools"
- "spawn background workers"
- "parallel subagents"
- "keep the main agent moving"
- or when you notice you're about to block on a slow tool (web fetch, SSH, build, test run, API call)

The Rule

Every LLM turn must complete in under 30 seconds.

If any operation would take longer:

1. Spawn a remora (sessions_spawn with mode: "run")
2. Continue reasoning immediately
3. Incorporate remora results when they arrive

You are never in I/O wait. You are always reasoning about something.

Lifecycle

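┌─────────────┐
│  DECOMPOSE  │  Split the task into N independent subtasks
└──────┬──────┘
       │  spawn N remoras (+ 1 pilot fish when the first one finishes early)
       ▼
┌─────────────┐
│    SPAWN    │  sessions_spawn × N, all in parallel, record session IDs
└──────┬──────┘
       │  main agent keeps reasoning (never waits)
       ▼
┌─────────────┐   timeout / crash
│   MONITOR   │ ──────────────────► mark ⏱ / ❌ (partial results are still useful)
└──────┬──────┘
       │  all done or deadline reached
       ▼
┌─────────────┐
│  AGGREGATE  │  Collect results, note failures, merge the pilot fish draft
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   REPORT    │  One coherent response, with the failure count noted
└─────────────┘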

No nested remoras. If you are running as a remora, do all of your work inline: remoras cannot spawn their own remoras. Only the main shark spawns.

The Pattern

Bad (Ralph-style blocking):

think → call slow tool → wait 60s → think → call slow tool → wait 45s → ...

Good (Shark-style non-blocking):

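think → spawn remora(slow tool) → think about something else
      → spawn remora(another tool) → synthesise partial results
      → receive remora result → incorporate → keep swimming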

Implementation

When applying the Shark Pattern, structure your work like this:

1. Identify blocking operations

Before calling any tool, ask: "Will this take more than 20-30 seconds?"

Slow tools (always spawn):

- Web searches / page fetches
- SSH commands on remote machines
- Build / test / CI runs
- File system scans over large directories
- API calls with unknown latency
- LLM inference calls (coding agents)

Fast tools (run inline, never spawn):

- Reading local files
- Simple calculations
- String manipulation
- Memory lookups

2. Spawn remoras

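sessions_spawn({
  task: "Do the slow thing and return the result",
  mode: "run",
  runtime: "subagent",
  streamTo: "parent"   // optional: stream output back
})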

Spawn multiple remoras in parallel when possible — don't serialize unless there's a data dependency.

3. Keep the main fin moving

After spawning, immediately continue:

- Plan the next step
- Work on a different part of the task
- Summarize what you know so far
- Prepare to incorporate results

4. Incorporate results

When remora results arrive, weave them in and continue. Never re-do work a remora already completed.

If your runtime keeps subagents alive after completion, close them once you've incorporated their result. In Codex that means: wait for the remora, use its output, then close_agent(id) unless you intentionally plan to reuse that same agent.
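
The four steps above can be sketched outside any agent runtime. The following is a minimal Python illustration using asyncio, not the OpenClaw or Codex API: remora and shark are hypothetical names, and the "slow tool" is simulated with a sleep. It shows the shape of the pattern: spawn everything at once, do useful work in the meantime, then fold in results and treat a timeout as a noted gap.

```python
import asyncio
from typing import Dict


async def remora(name: str, seconds: float) -> str:
    """Hypothetical stand-in for a slow tool call (sleeps instead of doing I/O)."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def shark(jobs: Dict[str, float], timeout: float) -> Dict[str, str]:
    # Step 2: spawn every remora at once; creating a task does not block.
    tasks = {
        name: asyncio.create_task(asyncio.wait_for(remora(name, secs), timeout))
        for name, secs in jobs.items()
    }

    # Step 3: keep the main fin moving while the remoras run.
    results = {"plan": "outline drafted while waiting"}

    # Step 4: incorporate results; a timeout is recorded instead of retried.
    for name, task in tasks.items():
        try:
            results[name] = await task
        except asyncio.TimeoutError:
            results[name] = f"{name}: timed out"
    return results


if __name__ == "__main__":
    print(asyncio.run(shark({"search": 0.05, "ssh": 0.5}, timeout=0.2)))
```

Because the tasks run concurrently, the total wait is bounded by the slowest remora (or the timeout), not by the sum of all of them.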

Timing Budget

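Operation	Budget	Action
File read	< 2s	Inline
Web search	5-30s	Spawn
SSH command	10-120s	Spawn
Build / test	30-300s	Spawn
Coding agent	60-600s	Spawn
Memory search	< 3s	Inline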

Example: Multi-Step Research Task

Without Shark (blocking):
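1. Search web for X [wait 15s]
2. Search web for Y [wait 12s]
3. Fetch page Z [wait 8s]
4. SSH check server [wait 30s]

Total: ~65s blocked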

With Shark (non-blocking):
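1. Spawn: search X [0s - spawned]
2. Spawn: search Y [0s - spawned]
3. Spawn: fetch Z [0s - spawned]
4. Spawn: SSH check [0s - spawned]
5. Plan the synthesis while waiting [15s of actual thinking]
6. All results arrive → synthesise

Total: ~15s thinking + max(tool times) in parallel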

Output Format

Announce on start

🦈 Shark mode — spawning [N] remoras for [tasks], continuing...

Progress bar (chat-friendly, Unicode only — no images needed)

Use this format after each remora or pilot fish completes. Works in Telegram, Discord, Signal, iMessage — anywhere.

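🦈 3 remoras · 1 pilot fish

◉ [A] task name ████████████ ✅ 9s
◉ [B] task name ████████████ ✅ 33s
○ [C] task name ░░░░░░░░░░░░ pending
◈ [P] pilot fish ██████░░░░░░ ~14s left

↳ continuing...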

Symbols:

- ◉ = remora (completed)
- ○ = remora (pending)
- ⊙ = remora (running)
- ◈ = pilot fish (time-bounded)
- ████████████ = done bar (12 blocks)
- ██████░░░░░░ = partial (filled = elapsed / total budget)
- ░░░░░░░░░░░░ = not started

Progress fill: filled = round(elapsed / timeout * 12) blocks of █, remainder ░

Only post an update when something changes (remora completes or pilot fish starts/ends). Don't spam — one update per event.
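
The fill rule can be written as a small helper. This is a sketch in Python rather than part of the skill itself; fill_bar and status_line are hypothetical names, and clamping overruns to a full bar is an assumption the text does not spell out.

```python
BAR_WIDTH = 12  # the bar is 12 blocks wide


def fill_bar(elapsed: float, timeout: float) -> str:
    """Return a 12-block bar with filled = round(elapsed / timeout * 12)."""
    if timeout <= 0:
        return "░" * BAR_WIDTH
    # Assumption: clamp so an overrun still renders as a full bar.
    ratio = min(max(elapsed / timeout, 0.0), 1.0)
    # Note: Python's round() sends exact halves to the nearest even integer.
    filled = round(ratio * BAR_WIDTH)
    return "█" * filled + "░" * (BAR_WIDTH - filled)


def status_line(symbol: str, label: str, name: str,
                elapsed: float, timeout: float, note: str) -> str:
    """Format one line of a progress update for a remora or pilot fish."""
    return f"{symbol} [{label}] {name} {fill_bar(elapsed, timeout)} {note}"


if __name__ == "__main__":
    print(status_line("◈", "P", "pilot fish", 14, 28, "~14s left"))
```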

Final synthesis

After all remoras done:

🦈 All fins in — synthesising [N] results + pilot draft

Then deliver the report.

The Pilot Fish Sub-Pattern

Pilot fish swim alongside sharks doing prep work. When you have idle time, use it.

When one remora returns early and others are still running:

1. Spawn a pilot fish: a time-bounded analysis sub-agent
2. Give it only the partial results so far + a hard timeout equal to the estimated remaining wait
3. Let it pre-validate, pre-analyse, find patterns, draft conclusions
4. Kill it (or let it self-terminate) when the last primary remora completes
5. Incorporate whatever the pilot fish produced into the final synthesis

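remora A ──────► result (early)
remora B ────────────────────────────► result
remora C ──────────────────────────────────► result

Main thread: spawn A, B, C
A completes → spawn pilot fish(A's result, timeout = estimated remaining time)
Pilot fish: pre-analyse A, draft a partial report, validate data...
B completes → pilot fish still running, pass B's result in (or kill and reuse)
C completes → kill pilot fish, synthesise A + B + C + pilot fish draft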

Pilot Fish Rules

- Always time-bounded: pass runTimeoutSeconds equal to the estimated remaining wait
- Never blocks: spawned async, main agent continues
- Opportunistic: if it finishes early, that's a bonus; if killed mid-run, partial output is still useful
- One at a time: don't stack pilot fish on pilot fish
- Task: pre-validate data, find gaps, draft structure, flag anomalies, prepare questions
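
To make the rules concrete, here is a rough Python asyncio sketch of the sub-pattern. It is an illustration only: remora, pilot_fish, and run_fleet are hypothetical names, the work is simulated with sleeps, and the pilot fish simply keeps "drafting" until it is killed or its own timeout fires.

```python
import asyncio
from typing import Dict, List, Optional, Tuple


async def remora(name: str, seconds: float) -> str:
    """Hypothetical primary remora; sleeps instead of doing real work."""
    await asyncio.sleep(seconds)
    return f"{name} result"


async def pilot_fish(partial: List[str], notes: List[str]) -> None:
    """Hypothetical pre-analysis: note each partial result, then idle until killed."""
    for item in partial:
        notes.append(f"pre-analysed {item}")
    while True:
        await asyncio.sleep(0.01)


async def run_fleet(jobs: Dict[str, float], pilot_timeout: float) -> Tuple[List[str], List[str]]:
    pending = {asyncio.create_task(remora(n, s)) for n, s in jobs.items()}
    results: List[str] = []
    notes: List[str] = []
    pilot: Optional[asyncio.Task] = None
    while pending:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        results.extend(sorted(t.result() for t in done))
        if pending and pilot is None:
            # One pilot fish only, time-bounded, fed just the partial results.
            pilot = asyncio.create_task(
                asyncio.wait_for(pilot_fish(list(results), notes), pilot_timeout)
            )
    if pilot is not None:
        pilot.cancel()  # the last remora is in: kill the pilot fish
        await asyncio.gather(pilot, return_exceptions=True)
    return results, notes


if __name__ == "__main__":
    print(asyncio.run(run_fleet({"A": 0.01, "B": 0.05, "C": 0.1}, pilot_timeout=1.0)))
```

Whatever the pilot fish appended to notes before it was cancelled remains available for the final synthesis, which mirrors the "partial output is still useful" rule.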

Example

CODEBLOCK8

Decision Tree — When to Spawn

Before every tool call, ask: "Will this take more than 10 seconds?"

CODEBLOCK9

Always spawn: web search/fetch, SSH, build/test, coding agents, CI triggers, API calls with unknown latency
Always inline: file read, memory lookup, string ops, math, local config reads
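
One possible reading of this decision as code, in Python. The operation labels and the should_spawn helper are hypothetical, and treating an unknown estimate as slow is an assumption based on the "unknown latency" entry in the always-spawn list.

```python
from typing import Optional

# Labels below are illustrative names for the categories listed above.
ALWAYS_SPAWN = {"web_search", "web_fetch", "ssh", "build", "test",
                "coding_agent", "ci_trigger", "api_call"}
ALWAYS_INLINE = {"file_read", "memory_lookup", "string_op", "math", "config_read"}


def should_spawn(kind: str, estimated_seconds: Optional[float] = None,
                 threshold: float = 10.0) -> bool:
    """Return True if the operation should go to a remora instead of running inline."""
    if kind in ALWAYS_SPAWN:
        return True
    if kind in ALWAYS_INLINE:
        return False
    # Assumption: with no estimate, latency is unknown, so play it safe and spawn.
    if estimated_seconds is None:
        return True
    return estimated_seconds > threshold


if __name__ == "__main__":
    print(should_spawn("ssh"), should_spawn("file_read"), should_spawn("db_query", 4))
```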

Error Handling

Remoras will fail, time out, or return garbage. Plan for it.

Remora timeout

◉ [A] task    ████████████ ⏱ 30s [timeout]

- Treat as partial result: use whatever was returned
- Do not re-spawn the same task (wastes time, likely to time out again)
- Note the gap in synthesis: "A timed out — data may be incomplete"
- If A's result is critical, spawn a smaller-scoped follow-up remora

Remora crash / error

◉ [A] task    ████████████ ❌ [error: connection refused]

- Log the error inline in the progress bar
- Continue synthesis without that result
- Mention the failure in the final report
- Optionally file an issue / alert if it's infrastructure
- If the runtime still shows the remora as open after completion or error, clean it up immediately. In Codex, close completed remoras with close_agent(id) once their output is delivered.

Partial results (most common)

- Partial results are often the most useful: a remora that timed out at 28s has 28s of work in it
- Always check if partial output is usable before discarding
- Progress bar: ⏱ = timeout with partial, ❌ = hard error with nothing

>50% remoras failed

- Degrade gracefully: fall back to sequential for remaining work
- Note in report: "⚠️ degraded mode — N/M remoras failed"

All remoras failed

- Fall back to sequential execution for the most critical task only
- Do not spawn another full fleet: you're likely hitting a systemic issue
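
The two thresholds above (more than half failed, all failed) reduce to a small decision function. This Python sketch uses a hypothetical fleet_mode helper and made-up mode labels; it only encodes the thresholds stated here.

```python
def fleet_mode(failed: int, total: int) -> str:
    """Pick how to continue after a fleet finishes, based on how many remoras failed."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    if failed == total:
        # Everything failed: run only the most critical task, sequentially.
        return "critical-only"
    if failed / total > 0.5:
        # More than half failed: finish the remaining work sequentially.
        return "degraded"
    return "normal"


if __name__ == "__main__":
    for failed in range(5):
        print(failed, fleet_mode(failed, 4))
```

Note that exactly half failing still counts as normal, since the rule is ">50%".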

Forgetting to spawn the pilot fish (most common mistake)

- You finished a fast inline task, a remora is still running, and you just... wait
- Symptom: main agent idle, no pilot fish, time wasted
- Fix: always ask after any remora completes early: "what can I pre-draft right now?"
- Even if you have nothing obvious, draft the output structure, prepare questions, or outline the synthesis

Pilot fish killed mid-run

- Normal and expected: whatever it produced is still useful
- Incorporate partial pilot fish output into synthesis
- Don't wait for it or re-spawn it

Terminology

- Remora = a sessions_spawn call with mode: "run", runtime: "subagent", and runTimeoutSeconds set. A remora is specifically a timed sub-agent; untimed subagents are not remoras.
- Pilot fish = a remora spawned after another remora completes, with a short timeout sized to the estimated remaining wait. Purpose: pre-analysis only, never primary work.
- Fleet = the full set of remoras spawned for one task
- Fin moving = the main agent is doing useful work (not waiting)
- No nested remoras = remoras always execute inline; only the main shark spawns

`runTimeoutSeconds` — confirmed real

Verified against the OpenClaw source: runTimeoutSeconds: z.number().int().min(0).optional(), which maps to the subagent wait timeout. Use it: it hard-kills the sub-agent process after N seconds and returns whatever partial output exists.

Pilot Fish Sizing Formula

CODEBLOCK12

- estimatedRemaining = how long you think the slowest remaining remora will take
- Cap at 25s so the pilot fish always finishes before the main synthesis turn
- If you don't know: use 20s as default

Example: slowest remaining remora estimated at 30s → pilot fish timeout = min(24, 25) = 24s
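
A sketch of the sizing rule in Python. The 0.8 safety factor is an assumption inferred from the worked example (30s estimated gives 24s); the 25s cap and 20s default come from the bullets above. pilot_timeout is a hypothetical helper name.

```python
import math
from typing import Optional

SAFETY_FACTOR = 0.8   # assumption: inferred from the 30s -> 24s example
CAP_SECONDS = 25      # cap from the rule above
DEFAULT_SECONDS = 20  # default when no estimate is available


def pilot_timeout(estimated_remaining: Optional[float]) -> int:
    """Return a whole-second timeout for a pilot fish."""
    if estimated_remaining is None:
        return DEFAULT_SECONDS
    scaled = math.floor(estimated_remaining * SAFETY_FACTOR)
    return max(0, min(scaled, CAP_SECONDS))


if __name__ == "__main__":
    print(pilot_timeout(30), pilot_timeout(60), pilot_timeout(None))
```

Returning an integer keeps the value compatible with a schema that only accepts whole seconds.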

Hard Limits

- Never use yieldMs > 30000 in exec calls: this holds the main turn hostage
- Never call process(action=poll, timeout > 20000) in the main session, for the same reason
- Never add sleep or wait loops in the main thread
- Always set runTimeoutSeconds on remoras: unbounded sub-agents are not remoras
- Always clean up completed remoras: if your runtime requires explicit teardown, do it right after incorporating the result
- Max 8 concurrent remoras: beyond this, context overhead exceeds the gain
- Never stack pilot fish: one at a time, no pilot fish spawning pilot fish
- Spawn tasks ≤ 3 sentences: longer task descriptions need decomposition first
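
Some of these limits can be checked mechanically before a spawn. The Python sketch below is a hypothetical pre-flight check (check_spawn and count_sentences are illustrative names), and the sentence counter is a rough heuristic rather than a real tokenizer.

```python
import re
from typing import List, Optional

MAX_CONCURRENT = 8
MAX_TASK_SENTENCES = 3


def count_sentences(text: str) -> int:
    """Rough count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p])


def check_spawn(task: str, run_timeout_seconds: Optional[int], running: int) -> List[str]:
    """Return the list of hard-limit violations for a planned remora (empty = ok)."""
    problems = []
    if run_timeout_seconds is None:
        problems.append("runTimeoutSeconds is not set")
    if running >= MAX_CONCURRENT:
        problems.append("already at 8 concurrent remoras")
    if count_sentences(task) > MAX_TASK_SENTENCES:
        problems.append("task is longer than 3 sentences; decompose it first")
    return problems


if __name__ == "__main__":
    print(check_spawn("Fetch the page. Summarise it.", 30, running=2))
```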

Enforcing the 30-Second Timeout

The 30s cap isn't just a guideline — here's how to actually enforce it per runtime.

OpenClaw subagents

sessions_spawn({
  task: "...",
  mode: "run",
  runtime: "subagent",
  runTimeoutSeconds: 30   // hard kill after 30s — agent gets SIGTERM
})

runTimeoutSeconds is enforced by the OpenClaw runtime — the sub-agent process is killed if it exceeds it. Partial output is still returned.

exec calls (shell, SSH, scripts)

exec({
  command: "some-slow-command",
  timeout: 30,        // hard kill in seconds
  background: true,   // don't block the main agent turn
  yieldMs: 500        // poll back quickly to check
})

timeout kills the process. background: true means the main agent doesn't wait — it gets a session handle and can check back with process(poll).
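
The exec tool is only callable from inside the agent runtime, so as an illustration of the same timeout semantics (kill on deadline, keep the partial output), here is a minimal Python sketch using the standard subprocess module. run_bounded is a hypothetical helper, not part of OpenClaw, and unlike exec with background: true it blocks its caller for up to timeout_s seconds.

```python
import subprocess
import sys
from typing import List, Tuple


def run_bounded(cmd: List[str], timeout_s: float) -> Tuple[str, bool]:
    """Run cmd, hard-kill it after timeout_s seconds, return (output, timed_out)."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return out, False
    except subprocess.TimeoutExpired:
        proc.kill()                  # hard kill once the deadline passes
        out, _ = proc.communicate()  # collect whatever was written before the kill
        return out, True


if __name__ == "__main__":
    slow = [sys.executable, "-c",
            "import time; print('partial', flush=True); time.sleep(60)"]
    print(run_bounded(slow, timeout_s=1.0))
```

The second communicate() call after the kill is what preserves the partial output, which the caller can then treat as a partial result.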

Gemini CLI via exec

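timeout 30 gemini -p "task here"
# or on Windows (PowerShell). Start-Process has no -Timeout parameter,
# so keep the process handle, wait up to 30s, then stop it yourself:
$p = Start-Process gemini -ArgumentList '-p "task"' -PassThru
if (-not $p.WaitForExit(30000)) { Stop-Process -Id $p.Id -Force }

Wrap the CLI invocation in an OS-level timeout: the timeout command (GNU coreutils) on Linux, or Start-Process -PassThru plus WaitForExit on Windows.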

Pilot fish — always use `runTimeoutSeconds`

sessions_spawn({
  task: "pre-analyse partial results, draft structure, flag gaps",
  mode: "run",
  runTimeoutSeconds: Math.floor(estimatedRemainingMs / 1000),  // whole seconds (the schema requires an integer); die before the last remora
})

Set it to slightly less than your estimated remaining wait — so the pilot fish always finishes before you need to synthesise.

What happens when timeout fires

- Sub-agent/process is killed
- Whatever output was produced so far is returned
- Main agent treats it as a partial result, still useful for synthesis
- Log: [timeout] in the progress bar instead of INLINECODE36

CODEBLOCK17

The LLM turn itself

You can't hard-kill an LLM mid-turn, but you can:

1. Keep prompts tight: don't ask for exhaustive analysis in one turn
2. Use thinking: "none" for fast sub-tasks that don't need deep reasoning
3. Break large tasks into smaller shark-able chunks upfront

Rule of thumb: if a task description is >3 sentences, it probably needs to be split into remoras.

Compatibility — Claude, Codex, Gemini CLI

The Shark Pattern is runtime-agnostic. Remoras can be any agent type.

OpenClaw (Claude / Sonnet / Opus)

CODEBLOCK18

Codex

CODEBLOCK19

Codex-specific lifecycle:

- Spawn with spawn_agent(...) or the runtime-equivalent remora launcher
- Check completion with INLINECODE39
- If you want to reuse the same remora, send more work with INLINECODE40
- Otherwise, once the remora has completed and you've incorporated its result, call close_agent(id) so the agent does not linger in the session

Gemini CLI

Gemini CLI is a local process — spawn via exec with a timeout:

exec({
  command: "gemini -p \"task description here\"",
  timeout: 30,            // hard cap in seconds
  background: true,       // don't block main agent
  yieldMs: 500            // check back quickly
})

For Gemini sub-tasks, use exec with timeout + background: true rather than sessions_spawn. Treat the process handle the same way — continue working, collect output when it lands.

Mixed fleets

You can mix runtimes in the same shark run: CODEBLOCK21

Which to use when

Task type	Best runtime
Code generation / editing	Codex
Web search / summarise

shark-exec Sub-Skill

For slow shell commands (>5s), use the shark-exec companion skill:

- Located at shark-exec/SKILL.md in this repo
- Wraps any exec call in background + cron poller
- Guarantees main turn completes in <30s even for 10-minute commands
- Use it instead of inline exec whenever the command might block

Loop Enforcement (Ralph-style)

The 30-second rule is best enforced at the shell level, not inside a turn.

Use shark.sh (or shark.ps1 on Windows) to run Claude in a bounded loop:

CODEBLOCK22

Each iteration:

1. Builds a fresh prompt: skill context + task + current state
2. Runs claude --print with a hard timeout 25s shell wrapper
3. If Claude times out → loop continues (it's expected: the shark pattern means short turns)
4. If Claude writes .shark-done → loop exits

This is identical to the Ralph Loop pattern, but with the Shark Pattern as the prompt — Claude spawns remoras for slow work, keeps each turn under 25s, and the shell loop enforces the hard cut.
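
The control flow of that loop can be sketched in Python. This is not the contents of shark.sh or shark.ps1; shark_loop is a hypothetical function, and run_turn stands in for "run one bounded claude --print turn", raising TimeoutError when the turn is cut off.

```python
import tempfile
from pathlib import Path
from typing import Callable


def shark_loop(run_turn: Callable[[int], None], workdir: Path, max_loops: int = 50) -> int:
    """Call run_turn until .shark-done appears; return loops used, or -1 if none left."""
    done_file = workdir / ".shark-done"
    for loop in range(1, max_loops + 1):
        try:
            run_turn(loop)  # hypothetical: one turn under a hard timeout
        except TimeoutError:
            pass            # a timed-out turn is expected; just go around again
        if done_file.exists():
            return loop
    return -1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)

        def fake_turn(loop: int) -> None:
            if loop == 2:
                raise TimeoutError("turn exceeded its budget")
            if loop == 3:
                (work / ".shark-done").write_text("TASK_COMPLETE\ndemo finished\n")

        print(shark_loop(fake_turn, work))
```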

When to use the loop vs direct claude

Use case	Approach
Single fast task (<30s total)	INLINECODE53 directly
Multi-step task, slow tools

Environment variables

Variable	Default	Description
SHARK_MAX_LOOPS	50	Maximum iterations before giving up
SHARK_LOOP_TIMEOUT	25	Per-turn timeout in seconds (hard kill)

Completion protocol

When Claude determines the task is done, it writes to .shark-done:

TASK_COMPLETE
<brief summary of what was accomplished>

The loop detects this file and exits cleanly.

Commands

When the user invokes these commands, follow the instructions for each.

`/shark <task>`

Apply the Shark Pattern to the given task. Decompose, spawn remoras for slow ops, keep the main fin moving. Follow all rules in this SKILL.md.

`/shark-loop <task> [--max-loops N] [--timeout S]`

Run the external shark loop enforcer. Execute:

$env:SHARK_MAX_LOOPS = "<N>"
$env:SHARK_LOOP_TIMEOUT = "<S>"
powershell.exe -ExecutionPolicy Bypass -File "<skill_dir>/shark.ps1" "<task>"

Defaults: --max-loops 50, --timeout 25. On Linux/Mac use shark.sh instead.

`/shark-status`

Check current shark state:

1. Read <skill_dir>/shark-exec/state/pending.json and report active background jobs (label, command, elapsed time, whether overdue past maxSeconds)
2. If .shark-done exists, show its contents
3. If SHARK_LOG.md exists, show the last 10 lines
4. If nothing exists, report "No active shark jobs."

`/shark-clean`

Remove shark state files: .shark-done, SHARK_LOG.md, shark-exec/state/pending.json. Report what was cleaned.

`/shark-autotune`

Analyse timing history and recommend optimal settings.

1. Read <skill_dir>/state/timings.jsonl — each line is:

CODEBLOCK25

2. If no data, report "No timing data yet. Run tasks with /shark first."

3. Compute and report:

- Total runs (unique task_hash values) and total loops
- Median turn time (p50) and p95 turn time
- Timeout rate: % of turns with result "timeout"
- Loops to completion: median and max (count loops per task_hash that has a "done" entry)
- Wasted headroom: sum of (timeout - elapsed) over turns with result "ok"
- Optimal timeout: p95 turn time + 3s buffer, rounded up to nearest 5s
- Optimal max_loops: p95 loops-to-completion + 2

4. Show recommendations:

CODEBLOCK26

5. If timeout rate > 30%: "Consider breaking tasks into smaller steps."
If median turn time < 5s: "Most turns complete fast. Consider lowering timeout."
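
The computation in step 3 can be sketched as follows in Python. The task_hash and result fields are named in the text above; the elapsed and timeout key names are assumptions, as is the nearest-rank definition of a percentile. autotune and percentile are hypothetical helper names.

```python
import json
import math
from collections import defaultdict
from statistics import median
from typing import Iterable, List, Optional


def percentile(values: List[float], p: float) -> float:
    """Nearest-rank percentile (an assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def autotune(lines: Iterable[str]) -> Optional[dict]:
    rows = [json.loads(line) for line in lines if line.strip()]
    if not rows:
        return None  # caller reports "No timing data yet."
    elapsed = [r["elapsed"] for r in rows]  # assumed key name
    loops = defaultdict(int)
    finished = set()
    for r in rows:
        loops[r["task_hash"]] += 1
        if r["result"] == "done":
            finished.add(r["task_hash"])
    per_run = [loops[h] for h in finished]
    p95 = percentile(elapsed, 95)
    return {
        "runs": len(loops),
        "loops": len(rows),
        "p50": median(elapsed),
        "p95": p95,
        "timeout_rate": sum(r["result"] == "timeout" for r in rows) / len(rows),
        # "timeout" is an assumed key name for the configured per-turn limit.
        "wasted_headroom": sum(r["timeout"] - r["elapsed"] for r in rows if r["result"] == "ok"),
        "optimal_timeout": math.ceil((p95 + 3) / 5) * 5,
        "optimal_max_loops": percentile(per_run, 95) + 2 if per_run else None,
    }
```

With only a handful of turns, p95 is effectively the slowest turn, so the recommendation becomes meaningful only after a reasonable amount of history has accumulated.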

Timing Instrumentation

Both shark.sh and shark.ps1 automatically record per-loop timings to state/timings.jsonl. Each entry includes:

- ts: Unix timestamp
- INLINECODE79: loop iteration number
- INLINECODE80: actual wall-clock seconds for this turn
- INLINECODE81: configured timeout for this run
- result: "ok" (completed), "timeout" (hit limit), "done" (task finished)
- task_hash: 8-char hash correlating loops within a single run

Use /shark-autotune to analyse this data and tune your settings.

References

- Ralph Loop (sequential baseline): ghuntley.com/ralph/
- OpenClaw sessions_spawn docs: spawn with mode: "run", INLINECODE89
- Gemini CLI: INLINECODE90
- The name: sharks use ram ventilation, so they literally die if they stop moving

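🦈 The Shark Pattern (鲨鱼模式)

A shark that stops swimming dies. An agent that waits for tools wastes compute.

Works with: Claude Code · Codex · Gemini CLI · Cursor · Windsurf · Aider · OpenClaw · any LLM agent

When to Use This Skill

Trigger this skill when the user says any of the following:

- "使用鲨鱼模式" (use the shark pattern)
- "非阻塞智能体" (non-blocking agent)
- "永远不要等待工具" (never wait for tools)
- "生成后台工作进程" (spawn background workers)
- "并行子智能体" (parallel subagents)
- "保持主智能体持续运行" (keep the main agent moving)
- or when you notice you're about to block on a slow tool (web fetch, SSH, build, test run, API call)

The Rule

Every LLM turn must complete in under 30 seconds.

If any operation would take longer:

1. Spawn a remora (sessions_spawn with mode: "run")
2. Continue reasoning immediately
3. Incorporate remora results when they arrive

You are never in I/O wait. You are always reasoning about something.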

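Lifecycle

┌─────────────┐
│  DECOMPOSE  │  Split the task into N independent subtasks
└──────┬──────┘
       │  spawn N remoras (+ 1 pilot fish when the first one finishes early)
       ▼
┌─────────────┐
│    SPAWN    │  sessions_spawn × N, all in parallel, record session IDs
└──────┬──────┘
       │  main agent keeps reasoning (never waits)
       ▼
┌─────────────┐   timeout / crash
│   MONITOR   │ ──────────────────► mark ⏱ / ❌ (partial results are still useful)
└──────┬──────┘
       │  all done or deadline reached
       ▼
┌─────────────┐
│  AGGREGATE  │  Collect results, note failures, merge the pilot fish draft
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   REPORT    │  One coherent response, with the failure count noted
└─────────────┘

No nested remoras. If you are running as a remora, do all of your work inline: remoras cannot spawn their own remoras. Only the main shark spawns.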

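The Pattern

Bad (Ralph-style blocking):

think → call slow tool → wait 60s → think → call slow tool → wait 45s → ...

Good (Shark-style non-blocking):

think → spawn remora(slow tool) → think about something else
      → spawn remora(another tool) → synthesise partial results
      → receive remora result → incorporate → keep swimming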

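Implementation

When applying the Shark Pattern, structure your work like this:

1. Identify blocking operations

Before calling any tool, ask: "Will this take more than 20-30 seconds?"

Slow tools (always spawn):

- Web searches / page fetches
- SSH commands on remote machines
- Build / test / CI runs
- File system scans over large directories
- API calls with unknown latency
- LLM inference calls (coding agents)

Fast tools (run inline, never spawn):

- Reading local files
- Simple calculations
- String manipulation
- Memory lookups

2. Spawn remoras

sessions_spawn({
  task: "Do the slow thing and return the result",
  mode: "run",
  runtime: "subagent",
  streamTo: "parent"   // optional: stream output back
})

Spawn multiple remoras in parallel when possible; don't serialize unless there's a data dependency.

3. Keep the main fin moving

After spawning, immediately continue:

- Plan the next step
- Work on a different part of the task
- Summarize what you know so far
- Prepare to incorporate results

4. Incorporate results

When remora results arrive, weave them in and continue. Never re-do work a remora already completed.

If your runtime keeps subagents alive after completion, close them once you've incorporated their result. In Codex that means: wait for the remora, use its output, then call close_agent(id) unless you intentionally plan to reuse that same agent.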

Timing Budget

Operation	Budget	Action
File read	< 2s	Inline
Web search	5-30s	Spawn
SSH command	10-120s	Spawn
Build / test	30-300s	Spawn
Coding agent	60-600s	Spawn
Memory search	< 3s	Inline

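Example: Multi-Step Research Task

Without Shark (blocking):

1. Search web for X [wait 15s]
2. Search web for Y [wait 12s]
3. Fetch page Z [wait 8s]
4. SSH check server [wait 30s]

Total: ~65s blocked

With Shark (non-blocking):

1. Spawn: search X [0s - spawned]
2. Spawn: search Y [0s - spawned]
3. Spawn: fetch Z [0s - spawned]
4. Spawn: SSH check [0s - spawned]
5. Plan the synthesis while waiting [15s of actual thinking]
6. All results arrive → synthesise

Total: ~15s thinking + max(tool times) in parallel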

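Output Format

Announce on start

🦈 Shark mode: spawning [N] remoras for [tasks], continuing...

Progress bar (chat-friendly, Unicode only, no images needed)

Use this format after each remora or pilot fish completes. Works in Telegram, Discord, Signal, iMessage, anywhere.

🦈 3 remoras · 1 pilot fish

◉ [A] task name ████████████ ✅ 9s
◉ [B] task name ████████████ ✅ 33s
○ [C] task name ░░░░░░░░░░░░ pending
◈ [P] pilot fish ██████░░░░░░ ~14s left

↳ continuing...

Symbols:

- ◉ = remora (completed)
- ○ = remora (pending)
- ⊙ = remora (running)
- ◈ = pilot fish (time-bounded)
- ████████████ = done bar (12 blocks)
- ██████░░░░░░ = partial (filled = elapsed / total budget)
- ░░░░░░░░░░░░ = not started

Progress fill: filled = round(elapsed / timeout * 12) blocks of █, remainder ░

Only post an update when something changes (remora completes or pilot fish starts/ends). Don't spam: one update per event.

Final synthesis

After all remoras are done:

🦈 All fins in: synthesising [N] results + pilot draft

Then deliver the report.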

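The Pilot Fish Sub-Pattern

Pilot fish swim alongside sharks doing prep work. When you have idle time, use it.

When one remora returns early and others are still running:

1. Spawn a pilot fish: a time-bounded analysis sub-agent
2. Give it only the partial results so far + a hard timeout equal to the estimated remaining wait
3. Let it pre-validate, pre-analyse, find patterns, draft conclusions
4. Kill it (or let it self-terminate) when the last primary remora completes
5. Incorporate whatever the pilot fish produced into the final synthesis

remora A ──────► result (early)
remora B ────────────────────────────► result
remora C ──────────────────────────────────► result

Main thread: spawn A, B, C
A completes → spawn pilot fish(A's result, timeout = estimated remaining time)
Pilot fish: pre-analyse A, draft a partial report, validate data...
B completes → pilot fish still running, pass B's result in (or kill and reuse)
C completes → kill pilot fish, synthesise A + B + C + pilot fish draft

Pilot Fish Rules

- Always time-bounded: pass runTimeoutSeconds equal to the estimated remaining wait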
