Litmus — Parallel Autonomous ML Research Agents
Litmus spawns multiple OpenClaw subagents that experiment on your GPU overnight. Each runs on its
own git branch in a shared lab repository — every experiment is a commit, agents can read
each other's code, cherry-pick breakthroughs, and build on the global best at any time.
Validated techniques accumulate in a Skills library (~/.litmus/shared/skills/). A
Synthesizer runs at 04:00 to distill collective knowledge into skills and write a research
agenda for the next day. A Director runs every 2 hours to steer workers, trigger Compass
Resets on stagnation, and orchestrate cross-agent knowledge transfer.
What makes it more than autoresearch:
- Git worktrees: agents share one repo, each on its own branch, giving full experiment history, cherry-pick, and cross-agent code inspection via `git -C ~/.litmus/repo log --all`
- Skills library: validated techniques persist and compound, so agents don't re-discover wins
- Synthesizer: distills all overnight notes into reusable skills and a research agenda
- Compass Reset: Director detects stagnation and forces structured pivots using the skills gap
- Two-phase experiment budget: quick 90-second check before committing to a full run
- Structured attempt records: JSON per experiment in `shared/attempts/` for rich analytics
- Leisure mode (03:00–06:00): workers read papers, write moonshot hypotheses, identify gaps
- Morning digest: research narrative delivered to your chat at 08:00
Everything is a native OpenClaw subagent. No external processes, no PID files.
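The two-phase experiment budget listed above can be sketched in a few lines of shell. This is a minimal illustration, not Litmus's actual worker loop: the function name `two_phase_run`, the `QUICK_SECONDS` variable, and the rule that the quick check must exit successfully within its time limit are all assumptions. It relies on the `timeout` utility from GNU coreutils.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a two-phase experiment budget.
# Assumption: the quick check passes only if it exits 0 within QUICK_SECONDS.
QUICK_SECONDS="${QUICK_SECONDS:-90}"

two_phase_run() {
  local quick_cmd="$1" full_cmd="$2"
  # Phase 1: cheap sanity check under a hard time limit.
  if timeout "$QUICK_SECONDS" bash -c "$quick_cmd"; then
    # Phase 2: only now spend the full training budget.
    bash -c "$full_cmd"
  else
    echo "quick check failed or timed out; skipping full run" >&2
    return 1
  fi
}
```

The two arguments are placeholders for whatever commands an agent would run. For instance, `two_phase_run 'true' 'echo full'` runs the second command only because the first one succeeds.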
First-Time Setup
Recommended — ask your OpenClaw agent (runs a guided onboarding conversation):
"Install https://clawhub.ai/kuberwastaken/litmus and set it up for my machine"
Full onboarding instructions: {baseDir}/references/onboarding.md — read that file first.
Or manually:
```bash
git clone https://github.com/kuberwastaken/litmus ~/.litmus
bash ~/.litmus/scripts/setup.sh
```
Clones Karpathy's training harness, builds the shared lab git repo at ~/.litmus/repo/,
installs Python deps via uv, downloads ~1 GB of training data. Wait for it to finish.
Starting Research
1 — Prepare workspaces (creates git worktrees)
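```bash
bash {baseDir}/scripts/prepare-agents.sh --agents 4 --templates architecture,optimizer,general,general
```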
Creates git worktrees under ~/.litmus/agents/, each on its own branch in ~/.litmus/repo/.
The shared lab git repo means every agent's experiments are immediately visible to all others:
```bash
git -C ~/.litmus/repo log --all --oneline --graph
```
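As a rough illustration of the layout described above, the sketch below builds a throwaway lab repo with one worktree per agent. It is not the contents of `prepare-agents.sh`: the agent names, the `agent/<name>` branch naming scheme, and the temporary directory are assumptions made for the example. Only standard git commands are used.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway sandbox so the sketch never touches a real ~/.litmus.
LAB="$(mktemp -d)"
git init -q "$LAB/repo"
git -C "$LAB/repo" -c user.name=lab -c user.email=lab@example.com \
  commit -q --allow-empty -m "baseline"

# One worktree per agent, each on its own branch (names are hypothetical).
mkdir -p "$LAB/agents"
for agent in arch-1 opt-1; do
  git -C "$LAB/repo" worktree add -q -b "agent/$agent" "$LAB/agents/$agent"
done

# An experiment is just a commit on the agent's branch.
git -C "$LAB/agents/arch-1" -c user.name=lab -c user.email=lab@example.com \
  commit -q --allow-empty -m "exp: try deeper model"

# Every branch is visible from the shared repo.
git -C "$LAB/repo" log --all --oneline --graph
```

Because all worktrees share one object store, a commit made in one agent's directory shows up in `log --all` from any other worktree without a fetch or push.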
2 — Spawn research subagents
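```
sessions_spawn
task: "Read program.md in the current directory and run the research loop forever."
runtime: "subagent"
mode: "session"
agentId: "litmus-worker-arch-1"
cwd: "~/.litmus/agents/arch-1"
```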
Repeat for each agent, then:
```
sessions_yield message: "Research agents are running. I'll notify you when there are new findings."
```
Templates: architecture · optimizer · regularization · general
Full template details: `{baseDir}/references/templates/`
3 — Start the Director Layer
```bash
bash {baseDir}/scripts/setup-cron.sh --timezone Your/Timezone
```
Registers 6 cron jobs:
| Cron | Default schedule | Role |
|---|---|---|
| `litmus-director` | Every 2h during research hours | Reviews results, steers workers, Compass Reset on stagnation |
| `litmus-leisure` | 03:00 daily | Switches workers to paper-reading / creative thinking mode |
| `litmus-synthesizer` | 04:00 daily | Distills notes into skills library, writes research agenda |
| `litmus-dawn` | 06:00 daily | Wakes workers, queues synthesizer's priority experiments |
| `litmus-watchdog` | Every 30 min | Liveness check, escape mode on zero improvements |
| `litmus-digest` | 08:00 daily | Morning research narrative delivered to your chat |
All times are configurable during onboarding — the setup agent pitches defaults and asks what you'd like to change. Common presets: night owl (01:00/02:00/04:00/07:00), early bird (23:00/00:30/02:00/05:30), intensive (1h director). Pass custom times to scripts/setup-cron.sh with --leisure-start, --synthesizer-time, --dawn-time, --digest-time, --director-hours, --watchdog-minutes.
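For example, the night owl preset might map onto the flags above as follows. The pairing of the four times with `--leisure-start`, `--synthesizer-time`, `--dawn-time`, and `--digest-time` (in that order), the HH:MM value format, the timezone value, and the `BASE_DIR` variable are assumptions; check `setup-cron.sh` for the accepted syntax. The sketch only prints the command so it can be reviewed before running.

```bash
#!/usr/bin/env bash
# Dry run: assemble a setup-cron.sh invocation for the "night owl" preset.
# Assumed mapping: leisure 01:00, synthesizer 02:00, dawn 04:00, digest 07:00.
BASE_DIR="${BASE_DIR:-$HOME/.litmus}"   # stand-in for {baseDir}; an assumption

night_owl_cmd() {
  printf 'bash %s/scripts/setup-cron.sh' "$BASE_DIR"
  # Each remaining argument is appended with a leading space.
  printf ' %s' \
    --timezone "${1:-UTC}" \
    --leisure-start 01:00 \
    --synthesizer-time 02:00 \
    --dawn-time 04:00 \
    --digest-time 07:00
  printf '\n'
}

night_owl_cmd "Europe/Berlin"
```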
Managing Agents
Status (experiment counts, best val_bpb, git tree):
```bash
bash {baseDir}/scripts/status.sh
```
Leaderboard (cross-agent, from shared/attempts/ JSON):
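```bash
bash {baseDir}/scripts/results.sh --top 10
bash {baseDir}/scripts/results.sh --agent arch-1   # single agent
```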
Full lab git history (all agents' experiments as a tree):
```bash
git -C ~/.litmus/repo log --all --oneline --graph
```
Inspect any experiment:
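```bash
git -C ~/.litmus/repo show <hash>            # see what changed
cat ~/.litmus/shared/attempts/<hash>.json    # see the metrics
```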
Steer (redirect mid-run, no restart):
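```
subagents action: "steer" target: "litmus-worker-arch-1"
message: "Stop optimizing depth. Check out opt-2's best commit and combine its learning rate with DEPTH=10."
```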
Stop:
subagents action: "kill" target: "all"
What Agents Write Overnight
| Path | Contents |
|---|---|
| `~/.litmus/shared/attempts/<hash>.json` | Structured record for every experiment (agent, val_bpb, status, title) |
| `~/.litmus/shared/skills/<name>.md` | Validated reusable techniques with YAML frontmatter |
| `~/.litmus/shared/notes/discoveries/` | Per-improvement discovery notes |
| `~/.litmus/shared/notes/anomalies/` | Unexpected result notes |
| `~/.litmus/shared/notes/moonshots/` | Speculative hypotheses from leisure |
| `~/.litmus/shared/notes/synthesis/` | Synthesizer's research agenda and combination matrix |
| `~/.litmus/shared/discoveries.md` | Cross-agent knowledge base (flat, for quick reading) |
| `~/.litmus/shared/midnight-reflections.md` | Leisure agent's nightly narrative |
| `~/.litmus/repo/` (git) | All experiment commits across all agents on their branches |
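A quick way to see how much was written overnight is to count the files in each directory from the table. The sketch below is a convenience under stated assumptions, not a Litmus script: the function name `overnight_summary` and the `LITMUS_HOME` variable are hypothetical, and it simply counts regular files without reading them.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count files in the shared directories listed above.
LITMUS_HOME="${LITMUS_HOME:-$HOME/.litmus}"

overnight_summary() {
  local dir path count
  for dir in attempts skills notes/discoveries notes/anomalies \
             notes/moonshots notes/synthesis; do
    path="$LITMUS_HOME/shared/$dir"
    count=0
    # Missing directories are reported as zero rather than as an error.
    if [ -d "$path" ]; then
      count=$(find "$path" -type f | wc -l | tr -d ' ')
    fi
    printf '%-20s %s\n' "$dir" "$count"
  done
}

overnight_summary
```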
Reference Files
- `{baseDir}/references/onboarding.md`: first-time setup conversation
- `{baseDir}/references/program.md`: worker agent loop (git-aware, skills-reading, two-phase budget)
- `{baseDir}/references/director.md`: Director cron (Compass Reset, cross-pollination)
- `{baseDir}/references/leisure.md`: Leisure mode (paper reading, structured notes, skill extraction)
- `{baseDir}/references/synthesizer.md`: Synthesizer cron (knowledge distillation, skills library)
- `{baseDir}/references/dawn.md`: Dawn cron (wake workers, queue experiments)
- `{baseDir}/references/watchdog.md`: Watchdog cron (liveness, escape mode)
- `{baseDir}/references/digest.md`: Morning digest (research narrative)
- `{baseDir}/references/templates/`: Research focus templates
- `{baseDir}/references/clawrxiv.md`: ClawRxiv integration (optional auto-publishing)
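Litmus: Parallel Autonomous ML Research Agents
Litmus spawns multiple OpenClaw subagents that run experiments on your GPU overnight. Each subagent has its own git branch in a shared lab repository: every experiment is a commit, and agents can read each other's code, cherry-pick breakthroughs, and build on the global best at any time.
Validated techniques accumulate in a Skills library (~/.litmus/shared/skills/). A Synthesizer runs at 04:00 to distill collective knowledge into skills and write a research agenda for the next day. A Director runs every 2 hours to steer workers, trigger Compass Resets on stagnation, and orchestrate cross-agent knowledge transfer.
What makes it more than autoresearch:
- Git worktrees: agents share one repo, each on its own branch, giving full experiment history, cherry-pick, and cross-agent code inspection via `git -C ~/.litmus/repo log --all`
- Skills library: validated techniques persist and compound, so agents don't re-discover wins
- Synthesizer: distills all overnight notes into reusable skills and a research agenda
- Compass Reset: when the Director detects stagnation, it forces structured pivots using the skills gap
- Two-phase experiment budget: quick 90-second check before committing to a full run
- Structured attempt records: JSON per experiment in `shared/attempts/` for rich analytics
- Leisure mode (03:00–06:00): workers read papers, write moonshot hypotheses, identify gaps
- Morning digest: research narrative delivered to your chat at 08:00
Everything is a native OpenClaw subagent. No external processes, no PID files.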
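First-Time Setup
Recommended: ask your OpenClaw agent (runs a guided onboarding conversation):
"Install https://clawhub.ai/kuberwastaken/litmus and set it up for my machine"
Full onboarding instructions: {baseDir}/references/onboarding.md. Read that file first.
Or install manually:
```bash
git clone https://github.com/kuberwastaken/litmus ~/.litmus
bash ~/.litmus/scripts/setup.sh
```
Clones Karpathy's training harness, builds the shared lab git repo at ~/.litmus/repo/, installs Python deps via uv, and downloads about 1 GB of training data. Wait for it to finish.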
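Starting Research
1 — Prepare workspaces (creates git worktrees)
```bash
bash {baseDir}/scripts/prepare-agents.sh --agents 4 --templates architecture,optimizer,general,general
```
Creates git worktrees under ~/.litmus/agents/, each on its own branch in ~/.litmus/repo/. The shared lab git repo means every agent's experiments are immediately visible to all others:
```bash
git -C ~/.litmus/repo log --all --oneline --graph
```
2 — Spawn research subagents
```
sessions_spawn
task: "Read program.md in the current directory and run the research loop forever."
runtime: "subagent"
mode: "session"
agentId: "litmus-worker-arch-1"
cwd: "~/.litmus/agents/arch-1"
```
Repeat for each agent, then:
```
sessions_yield message: "Research agents are running. I'll notify you when there are new findings."
```
Templates: architecture · optimizer · regularization · general
Full template details: {baseDir}/references/templates/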
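3 — Start the Director Layer
```bash
bash {baseDir}/scripts/setup-cron.sh --timezone Your/Timezone
```
Registers 6 cron jobs:
| Cron | Default schedule | Role |
|---|---|---|
| litmus-director | Every 2h during research hours | Reviews results, steers workers, Compass Reset on stagnation |
| litmus-leisure | 03:00 daily | Switches workers to paper-reading / creative thinking mode |
| litmus-synthesizer | 04:00 daily | Distills notes into skills library, writes research agenda |
| litmus-dawn | 06:00 daily | Wakes workers, queues synthesizer's priority experiments |
| litmus-watchdog | Every 30 min | Liveness check, escape mode on zero improvements |
| litmus-digest | 08:00 daily | Morning research narrative delivered to your chat |
All times are configurable during onboarding: the setup agent proposes defaults and asks what you'd like to change. Common presets: night owl (01:00/02:00/04:00/07:00), early bird (23:00/00:30/02:00/05:30), intensive (1h director). Pass custom times to scripts/setup-cron.sh with --leisure-start, --synthesizer-time, --dawn-time, --digest-time, --director-hours, --watchdog-minutes.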
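Managing Agents
Status (experiment counts, best val_bpb, git tree):
```bash
bash {baseDir}/scripts/status.sh
```
Leaderboard (cross-agent, from shared/attempts/ JSON):
```bash
bash {baseDir}/scripts/results.sh --top 10
bash {baseDir}/scripts/results.sh --agent arch-1   # single agent
```
Full lab git history (all agents' experiments as a tree):
```bash
git -C ~/.litmus/repo log --all --oneline --graph
```
Inspect any experiment:
```bash
git -C ~/.litmus/repo show <hash>            # see what changed
cat ~/.litmus/shared/attempts/<hash>.json    # see the metrics
```
Steer (redirect mid-run, no restart):
```
subagents action: "steer" target: "litmus-worker-arch-1"
message: "Stop optimizing depth. Check out opt-2's best commit and combine its learning rate with DEPTH=10."
```
Stop:
```
subagents action: "kill" target: "all"
```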
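What Agents Write Overnight
| Path | Contents |
|---|---|
| ~/.litmus/shared/attempts/<hash>.json | Structured record for every experiment (agent, val_bpb, status, title) |
| ~/.litmus/shared/skills/<name>.md | Validated reusable techniques with YAML frontmatter |
| ~/.litmus/shared/notes/discoveries/ | Per-improvement discovery notes |
| ~/.litmus/shared/notes/anomalies/ | Unexpected result notes |
| ~/.litmus/shared/notes/moonshots/ | Speculative hypotheses from leisure mode |
| ~/.litmus/shared/notes/synthesis/ | Synthesizer's research agenda and combination matrix |
| ~/.litmus/shared/discoveries.md | Cross-agent knowledge base (flat, for quick reading) |
| ~/.litmus/shared/midnight-reflections.md | Leisure agent's nightly narrative |
| ~/.litmus/repo/ (git) | All experiment commits across all agents on their branches |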
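Reference Files
- {baseDir}/references/onboarding.md: first-time setup conversation
- {baseDir}/references/program.md: worker agent loop (git-aware, skills-reading, two-phase budget)
- {baseDir}/references/director.md: Director cron (Compass Reset, cross-pollination)
- {baseDir}/references/leisure.md: Leisure mode (paper reading, structured notes, skill extraction)
- {baseDir}/references/synthesizer.md: Synthesizer cron (knowledge distillation, skills library)
- {baseDir}/references/dawn.md: Dawn cron (wake workers, queue experiments)
- {baseDir}/references/watchdog.md: Watchdog cron (liveness check, escape mode)
- {baseDir}/references/digest.md: Morning digest (research narrative)
- {baseDir}/references/templates/: Research focus templates
- {baseDir}/references/clawrxiv.md: ClawRxiv integration (optional auto-publishing)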