Litmus — Parallel Autonomous ML Research Agents

Litmus spawns multiple OpenClaw subagents that experiment on your GPU overnight. Each runs on its
own git branch in a shared lab repository — every experiment is a commit, agents can read
each other's code, cherry-pick breakthroughs, and build on the global best at any time.

Validated techniques accumulate in a Skills library (~/.litmus/shared/skills/). A
Synthesizer runs at 04:00 to distill collective knowledge into skills and write a research
agenda for the next day. A Director runs every 2 hours to steer workers, trigger Compass
Resets on stagnation, and orchestrate cross-agent knowledge transfer.
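As an illustration of how stagnation might be detected before a Compass Reset, here is a minimal Python sketch. The function name `is_stagnant`, the window of 5 experiments, and the 0.001 threshold are assumptions made for this example, not values taken from Litmus. It also assumes that lower val_bpb is better.

```python
from typing import Sequence


def is_stagnant(val_bpb_history: Sequence[float],
                window: int = 5,
                min_gain: float = 0.001) -> bool:
    """Return True when the last `window` experiments failed to beat the
    earlier best val_bpb by at least `min_gain` (lower is assumed better)."""
    if len(val_bpb_history) <= window:
        return False  # not enough history to judge
    earlier_best = min(val_bpb_history[:-window])
    recent_best = min(val_bpb_history[-window:])
    return earlier_best - recent_best < min_gain
```

A Director-like loop could call this on each worker's history and only consider a pivot when it returns True.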

What makes it more than autoresearch:

- Git worktrees: agents share one repo, each on its own branch, giving full experiment history, cherry-picking, and cross-agent code inspection via git -C ~/.litmus/repo log --all
- Skills library: validated techniques persist and compound, so agents don't re-discover wins
- Synthesizer: distills all overnight notes into reusable skills and a research agenda
- Compass Reset: the Director detects stagnation and forces structured pivots using the skills gap
- Two-phase experiment budget: a quick 90-second check before committing to a full run
- Structured attempt records: one JSON file per experiment in shared/attempts/ for rich analytics
- Leisure mode (03:00–06:00): workers read papers, write moonshot hypotheses, and identify gaps
- Morning digest: a research narrative delivered to your chat at 08:00
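The two-phase experiment budget in the list above can be sketched roughly as follows. Only the 90-second quick check comes from the description. The `train(seconds)` callable, the `quick_baseline` value (the val_bpb the current best configuration reaches within the quick check), the tolerance, and the status strings are all hypothetical names chosen for this example.

```python
from typing import Callable, Dict, Union


def run_two_phase(train: Callable[[int], float],
                  quick_baseline: float,
                  full_seconds: int,
                  quick_seconds: int = 90,
                  tolerance: float = 0.02) -> Dict[str, Union[str, float, int]]:
    """Run a short probe first; spend the full budget only if the probe
    is not clearly worse than the baseline (lower val_bpb assumed better)."""
    quick_bpb = train(quick_seconds)
    if quick_bpb > quick_baseline + tolerance:
        # Probe is clearly worse: stop here and save the full budget.
        return {"status": "rejected_quick", "val_bpb": quick_bpb,
                "seconds_used": quick_seconds}
    full_bpb = train(full_seconds)
    return {"status": "full_run", "val_bpb": full_bpb,
            "seconds_used": quick_seconds + full_seconds}
```

The point of the design is that a bad idea costs about 90 seconds of GPU time rather than a full run.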

Everything is a native OpenClaw subagent. No external processes, no PID files.

First-Time Setup

Recommended — ask your OpenClaw agent (runs a guided onboarding conversation):

"Install https://clawhub.ai/kuberwastaken/litmus and set it up for my machine"

Full onboarding instructions: {baseDir}/references/onboarding.md — read that file first.

Or manually:
```bash
git clone https://github.com/kuberwastaken/litmus ~/.litmus
bash ~/.litmus/scripts/setup.sh
```

The setup script clones Karpathy's training harness, builds the shared lab git repo at ~/.litmus/repo/,
installs Python deps via uv, and downloads ~1 GB of training data. Wait for it to finish before continuing.

Starting Research

1 — Prepare workspaces (creates git worktrees)

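```bash
bash {baseDir}/scripts/prepare-agents.sh --agents 4 --templates architecture,optimizer,general,general
```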

Creates git worktrees under ~/.litmus/agents/, each on its own branch in ~/.litmus/repo/.
The shared lab git repo means every agent's experiments are immediately visible to all others:
```bash
git -C ~/.litmus/repo log --all --oneline --graph
```
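To make the one-worktree-per-agent layout concrete, here is a small Python sketch that builds the `git worktree add` commands such a layout would need. It is not the prepare-agents.sh implementation. The function name and the `agent/` branch prefix are assumptions for illustration, and the paths must be absolute (expand `~` before use).

```python
from pathlib import PurePosixPath
from typing import List, Sequence


def worktree_commands(repo: str,
                      agents_dir: str,
                      agent_names: Sequence[str],
                      branch_prefix: str = "agent/") -> List[List[str]]:
    """Build one `git worktree add -b <branch> <path>` command per agent.
    The commands are returned, not executed."""
    commands = []
    for name in agent_names:
        worktree_path = PurePosixPath(agents_dir) / name
        commands.append([
            "git", "-C", str(repo), "worktree", "add",
            "-b", f"{branch_prefix}{name}",  # hypothetical branch naming
            str(worktree_path),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Because all worktrees belong to one repository, `git log --all` in that repository shows every agent's branch.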

2 — Spawn research subagents

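```
sessions_spawn
  task: "Read program.md in the current directory and run the research loop forever."
  runtime: "subagent"
  mode: "session"
  agentId: "litmus-worker-arch-1"
  cwd: "~/.litmus/agents/arch-1"
```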

Repeat for each agent, then:
```
sessions_yield  message: "Research agents are running. I'll notify you when there are new findings."
```

Templates: architecture · optimizer · regularization · general
Full template details: {baseDir}/references/templates/

3 — Start the Director Layer

```bash
bash {baseDir}/scripts/setup-cron.sh --timezone Your/Timezone
```

Registers 6 cron jobs:

| Cron | Default schedule | Role |
|---|---|---|
| litmus-director | Every 2h during research hours | Reviews results, steers workers, Compass Reset on stagnation |
| litmus-leisure | | |

All times are configurable during onboarding — the setup agent pitches defaults and asks what you'd like to change. Common presets: night owl (01:00/02:00/04:00/07:00), early bird (23:00/00:30/02:00/05:30), intensive (1h director). Pass custom times to scripts/setup-cron.sh with --leisure-start, --synthesizer-time, --dawn-time, --digest-time, --director-hours, --watchdog-minutes.
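For scripted setups, the flags listed above can be assembled programmatically. The sketch below only builds an argument list from the flag names in this document (plus `--timezone`, as used when starting the Director Layer). The function name, the keyword names, and the value formats ("HH:MM" strings, integer hours) are assumptions, since the text does not specify them.

```python
from typing import List

# Keyword name -> flag, using only flags named in the text above.
FLAG_FOR = {
    "leisure_start": "--leisure-start",
    "synthesizer_time": "--synthesizer-time",
    "dawn_time": "--dawn-time",
    "digest_time": "--digest-time",
    "director_hours": "--director-hours",
    "watchdog_minutes": "--watchdog-minutes",
}


def setup_cron_command(script: str, timezone: str, **overrides) -> List[str]:
    """Return the argv for setup-cron.sh with any custom times appended."""
    unknown = sorted(set(overrides) - set(FLAG_FOR))
    if unknown:
        raise ValueError(f"unsupported options: {', '.join(unknown)}")
    command = ["bash", str(script), "--timezone", timezone]
    for key, flag in FLAG_FOR.items():  # fixed order keeps output reproducible
        if key in overrides:
            command += [flag, str(overrides[key])]
    return command
```

For example, `setup_cron_command("scripts/setup-cron.sh", "Europe/Berlin", director_hours=1)` would correspond to the "intensive" preset, assuming the flag takes a plain number of hours.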

Managing Agents

Status (experiment counts, best val_bpb, git tree):
```bash
bash {baseDir}/scripts/status.sh
```

Leaderboard (cross-agent, from shared/attempts/ JSON):
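```bash
bash {baseDir}/scripts/results.sh --top 10
bash {baseDir}/scripts/results.sh --agent arch-1   # single agent
```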
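Since the leaderboard is derived from the JSON files in shared/attempts/, a cross-agent ranking can be sketched in a few lines of Python. This is an illustration rather than what results.sh does internally. It assumes each file holds one JSON object with a numeric val_bpb field and that lower val_bpb is better; the function name `leaderboard` is hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, List


def leaderboard(attempts_dir, top: int = 10) -> List[Dict]:
    """Rank attempt records by val_bpb (ascending), skipping unreadable
    files and records without a numeric score."""
    rows = []
    for path in sorted(Path(attempts_dir).glob("*.json")):
        try:
            record = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # half-written or corrupt file: ignore it
        if not isinstance(record, dict):
            continue
        score = record.get("val_bpb")
        if isinstance(score, (int, float)) and not isinstance(score, bool):
            rows.append(record)
    rows.sort(key=lambda r: r["val_bpb"])
    return rows[:top]
```

Skipping records without a score keeps crashed or in-progress experiments from breaking the ranking.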

Full lab git history (all agents' experiments as a tree):
```bash
git -C ~/.litmus/repo log --all --oneline --graph
```

Inspect any experiment:
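```bash
git -C ~/.litmus/repo show <hash>            # what changed
cat ~/.litmus/shared/attempts/<hash>.json    # metrics
```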

Steer (redirect mid-run, no restart):
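```
subagents action: "steer"  target: "litmus-worker-arch-1"
  message: "Stop optimizing depth. Check out the best commit from opt-2 and combine its learning rate with DEPTH=10."
```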

Stop:

subagents action: "kill"  target: "all"

What Agents Write Overnight

| Path | Contents |
|---|---|
| ~/.litmus/shared/attempts/<hash>.json | Structured record for every experiment (agent, val_bpb, status, title) |
| ~/.litmus/shared/skills/<name>.md | |
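Because several agents read the shared attempts directory while others write to it, a record writer would plausibly want atomic writes. The sketch below shows one way to do that in Python. The four fields are the ones listed in the table above, while the `Attempt` class, the `write_attempt` function, and the "keep" status value used in the usage note are hypothetical.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Attempt:
    agent: str
    val_bpb: float
    status: str  # the status vocabulary is not specified in the text
    title: str


def write_attempt(attempts_dir, commit_hash: str, attempt: Attempt) -> Path:
    """Write <commit_hash>.json via a temp file plus rename, so a
    concurrent reader never sees a partially written record."""
    target_dir = Path(attempts_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{commit_hash}.json"
    fd, tmp_name = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(asdict(attempt), handle, indent=2)
    os.replace(tmp_name, target)  # atomic when both paths share a filesystem
    return target
```

Usage would look like `write_attempt("/path/to/attempts", "abc123", Attempt("arch-1", 1.042, "keep", "wider MLP"))`.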

Reference Files

- {baseDir}/references/onboarding.md: first-time setup conversation
- {baseDir}/references/program.md: worker agent loop (git-aware, skills-reading, two-phase budget)
- {baseDir}/references/director.md: Director cron (Compass Reset, cross-pollination)
- {baseDir}/references/leisure.md: Leisure mode (paper reading, structured notes, skill extraction)
- {baseDir}/references/synthesizer.md: Synthesizer cron (knowledge distillation, skills library)
- {baseDir}/references/dawn.md: Dawn cron (wake workers, queue experiments)
- {baseDir}/references/watchdog.md: Watchdog cron (liveness, escape mode)
- {baseDir}/references/digest.md: Morning digest (research narrative)
- {baseDir}/references/templates/: Research focus templates
- {baseDir}/references/clawrxiv.md: ClawRxiv integration (optional auto-publishing)

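Litmus: Parallel Autonomous ML Research Agents

Litmus spawns multiple OpenClaw subagents that run experiments on your GPU overnight. Each subagent has its own git branch in a shared lab repository: every experiment is a commit, and agents can read each other's code, cherry-pick breakthroughs, and build on the global best at any time.

Validated techniques accumulate in a Skills library (~/.litmus/shared/skills/). A Synthesizer runs at 04:00 to distill collective knowledge into skills and write a research agenda for the next day. A Director runs every 2 hours to steer workers, trigger Compass Resets on stagnation, and orchestrate cross-agent knowledge transfer.

What makes it more than autoresearch:

- Git worktrees: agents share one repo, each on its own branch, giving full experiment history, cherry-picking, and cross-agent code inspection via git -C ~/.litmus/repo log --all
- Skills library: validated techniques persist and compound, so agents don't re-discover wins
- Synthesizer: distills all overnight notes into reusable skills and a research agenda
- Compass Reset: the Director detects stagnation and forces structured pivots using the skills gap
- Two-phase experiment budget: a quick 90-second check before committing to a full run
- Structured attempt records: one JSON file per experiment in shared/attempts/ for rich analytics
- Leisure mode (03:00–06:00): workers read papers, write moonshot hypotheses, and identify gaps
- Morning digest: a research narrative delivered to your chat at 08:00

Everything is a native OpenClaw subagent. No external processes, no PID files.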

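First-Time Setup

Recommended: ask your OpenClaw agent (runs a guided onboarding conversation):

"Install https://clawhub.ai/kuberwastaken/litmus and set it up for my machine"

Full onboarding instructions: {baseDir}/references/onboarding.md. Read that file first.

Or install manually:

```bash
git clone https://github.com/kuberwastaken/litmus ~/.litmus
bash ~/.litmus/scripts/setup.sh
```

The setup script clones Karpathy's training harness, builds the shared lab git repo at ~/.litmus/repo/, installs Python deps via uv, and downloads ~1 GB of training data. Wait for it to finish.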

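Starting Research

1. Prepare workspaces (creates git worktrees)

```bash
bash {baseDir}/scripts/prepare-agents.sh --agents 4 --templates architecture,optimizer,general,general
```

Creates git worktrees under ~/.litmus/agents/, each on its own branch in ~/.litmus/repo/. The shared lab git repo means every agent's experiments are immediately visible to all others:

```bash
git -C ~/.litmus/repo log --all --oneline --graph
```

2. Spawn research subagents

```
sessions_spawn
  task: "Read program.md in the current directory and run the research loop forever."
  runtime: "subagent"
  mode: "session"
  agentId: "litmus-worker-arch-1"
  cwd: "~/.litmus/agents/arch-1"
```

Repeat for each agent, then:

```
sessions_yield  message: "Research agents are running. I'll notify you when there are new findings."
```

Templates: architecture · optimizer · regularization · general
Full template details: {baseDir}/references/templates/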

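3. Start the Director Layer

```bash
bash {baseDir}/scripts/setup-cron.sh --timezone Your/Timezone
```

| Cron | Default schedule | Role |
|---|---|---|
| litmus-director | Every 2h during research hours | Reviews results, steers workers, Compass Reset on stagnation |
| litmus-leisure | | |

All times are configurable during onboarding: the setup agent pitches defaults and asks what you'd like to change. Common presets: night owl (01:00/02:00/04:00/07:00), early bird (23:00/00:30/02:00/05:30), intensive (1h director). Pass custom times to scripts/setup-cron.sh with --leisure-start, --synthesizer-time, --dawn-time, --digest-time, --director-hours, --watchdog-minutes.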

Managing Agents

Status (experiment counts, best val_bpb, git tree):

```bash
bash {baseDir}/scripts/status.sh
```

Leaderboard (cross-agent, from shared/attempts/ JSON):

```bash
bash {baseDir}/scripts/results.sh --top 10
bash {baseDir}/scripts/results.sh --agent arch-1   # single agent
```

Full lab git history (all agents' experiments as a tree):

```bash
git -C ~/.litmus/repo log --all --oneline --graph
```

Inspect any experiment:

```bash
git -C ~/.litmus/repo show <hash>            # what changed
cat ~/.litmus/shared/attempts/<hash>.json    # metrics
```

Steer (redirect mid-run, no restart):

```
subagents action: "steer"  target: "litmus-worker-arch-1"
  message: "Stop optimizing depth. Check out the best commit from opt-2 and combine its learning rate with DEPTH=10."
```

Stop:

subagents action: "kill"  target: "all"

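What Agents Write Overnight

| Path | Contents |
|---|---|
| ~/.litmus/shared/attempts/<hash>.json | Structured record for every experiment (agent, val_bpb, status, title) |
| ~/.litmus/shared/skills/<name>.md | |

Reference Files

- {baseDir}/references/onboarding.md: first-time setup conversation
- {baseDir}/references/program.md: worker agent loop (git-aware, skills-reading, two-phase budget)
- {baseDir}/references/director.md: Director cron (Compass Reset, cross-pollination)
- {baseDir}/references/leisure.md: Leisure mode (paper reading, structured notes, skill extraction)
- {baseDir}/references/synthesizer.md: Synthesizer cron (knowledge distillation, skills library)
- {baseDir}/references/dawn.md: Dawn cron (wake workers, queue experiments)
- {baseDir}/references/watchdog.md: Watchdog cron (liveness, escape mode)
- {baseDir}/references/digest.md: Morning digest (research narrative)
- {baseDir}/references/templates/: Research focus templates
- {baseDir}/references/clawrxiv.md: ClawRxiv integration (optional auto-publishing)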
