FlowForge
Autonomous spec → plan → code → QA pipeline powered by Claude Code.
All heavy computation runs through Claude Code (Max subscription). OpenClaw only orchestrates.
Architecture
    Flo (minimal tokens) → shell pipeline → Claude Code (all heavy lifting)
    ↓
    Auto-rotates accounts on rate limit
Workflow Types
Classify the task before planning — each type has a different phase structure:
| Type | When | Phase Order |
|---|---|---|
| `feature` | New capability | Backend → Worker → Frontend → Integration |
| `refactor` | Restructure existing code | Add New → Migrate → Remove Old → Cleanup |
| `investigation` | Bug hunt | Reproduce → Investigate → Fix → Harden |
| `migration` | Move data/infra | Prepare → Test → Execute → Cleanup |
| `simple` | Single-file change | Just subtasks, no phases |
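The table can be read as a lookup from workflow type to phase order. Below is a minimal sketch of that lookup in shell. The helper name `phases_for_type` is hypothetical and is not one of FlowForge's scripts; only the type names and phases come from the table.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the phase order for a workflow type.
# Type names and phases are taken from the table; the function itself is illustrative.
phases_for_type() {
  case "$1" in
    feature)       echo "Backend Worker Frontend Integration" ;;
    refactor)      echo "Add-New Migrate Remove-Old Cleanup" ;;
    investigation) echo "Reproduce Investigate Fix Harden" ;;
    migration)     echo "Prepare Test Execute Cleanup" ;;
    simple)        echo "" ;;  # no phases, just subtasks
    *)             echo "unknown workflow type: $1" >&2; return 1 ;;
  esac
}
```

Calling `phases_for_type refactor` prints the four refactor phases in order. An unrecognized type returns a non-zero status, so a caller can stop before planning.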
Steps
1. Set up workspace

    bash ~/clawd/skills/flowforge/scripts/init_forge.sh <task-description> <repo-path>

Creates ~/.forge/<timestamp>/ with task.md.
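For orientation, here is a rough sketch of what workspace setup amounts to. It is not the real `init_forge.sh`, whose contents are not shown here. The function name `init_workspace`, the `FORGE_ROOT` override, and the timestamp format are all assumptions, and the sketch ignores the repo path argument.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; the real init_forge.sh may differ.
# FORGE_ROOT and the timestamp format are assumptions made for this example.
init_workspace() {
  local task="$1"
  local root="${FORGE_ROOT:-$HOME/.forge}"
  local ws
  ws="$root/$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$ws" || return 1
  printf '%s\n' "$task" > "$ws/task.md"   # the task description seeds the pipeline
  echo "$ws"                              # print the workspace path for later steps
}
```

A caller would capture the printed path, for example `ws="$(init_workspace "Add retry logic")"`, and pass it to the later steps.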
2. Clarification checkpoint (required before spec)
Before running the pipeline, ask 2–4 targeted questions to resolve ambiguity. Do not ask for information already in task.md. Focus on:
- Scope edge cases: "Does this include X, or is that a separate shape?"
- Constraints: "Any files that are frozen / must not be touched?"
- Integration points: "Which existing module owns this responsibility?"
- Success definition: "What does passing look like: a test, a manual check, a metric?"
Present questions in a numbered list. Wait for answers before proceeding. If the task is unambiguous (e.g., a single-file fix from a clear issue), skip this step and note "No clarification needed."
Save answers to ~/.forge/<timestamp>/clarifications.md for reference during spec + plan phases.
3. Run the pipeline

    bash ~/clawd/skills/flowforge/scripts/run_forge.sh ~/.forge/<timestamp>/

This chains 4 Claude Code calls:
1. Spec: generates `spec.md` incorporating clarifications (high thinking)
2. Plan: generates `implementation_plan.json` (high thinking)
3. Code: executes each subtask with verification (medium thinking)
4. QA: reviews output, scores against spec (high thinking)
Each step saves output to the workspace directory. Claude Code does ALL the work.
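The chaining described above can be sketched as a sequence of stages that stops at the first failure. This is an illustration, not the contents of `run_forge.sh`. `run_stage` is a hypothetical stub standing in for one Claude Code call, and the log line format is an assumption.

```bash
#!/usr/bin/env bash
# Illustrative sketch: run_stage is a hypothetical stand-in for one Claude Code call.
run_stage() {
  local ws="$1" stage="$2" output="${3:-}"
  # A real stage would invoke Claude Code here; this stub only writes a placeholder.
  if [ -n "$output" ]; then
    echo "placeholder for $stage output" > "$ws/$output"
  fi
  echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $stage complete" >> "$ws/progress.log"
}

run_pipeline() {
  local ws="$1"
  # Each stage runs only if the previous one succeeded.
  run_stage "$ws" spec spec.md &&
  run_stage "$ws" plan implementation_plan.json &&
  run_stage "$ws" code &&
  run_stage "$ws" qa qa_report.md
}
```

Joining the stages with `&&` means a failed spec or plan never reaches the code stage, so the workspace holds only the output of stages that completed.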
4. Monitor
Poll workspace for completion:

    tail -f ~/.forge/<timestamp>/progress.log
    cat ~/.forge/<timestamp>/qa_report.md

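For unattended runs, polling can be wrapped in a loop with a timeout. The sketch below assumes that the appearance of `qa_report.md` marks completion, which the document implies but does not state. The function name `wait_for_qa` and its defaults are hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative sketch: treats the presence of qa_report.md as "done" (an assumption).
wait_for_qa() {
  local ws="$1" timeout="${2:-3600}" interval="${3:-10}"
  local waited=0
  while [ ! -f "$ws/qa_report.md" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$ws/qa_report.md"   # path of the finished report
}
```

Usage might look like `report="$(wait_for_qa ~/.forge/<timestamp>/ 1800 15)" && cat "$report"`.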
Account Rotation
Three Claude Max accounts rotate automatically on rate limit:

    account-1@gmail.com → account-2@gmail.com → account-3@gmail.com → retry

Configure your accounts in ~/.flowforge/accounts.txt (one email per line).
Save credentials per account in ~/.claude/accounts/<email>.json.
Switch accounts with: `bash <skill-dir>/scripts/rotate_account.sh`
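The rotation order is a simple round-robin over the lines of the accounts file. Here is a sketch of just that selection step. The function `next_account` is hypothetical and is not the real `rotate_account.sh`, and it does not touch credentials.

```bash
#!/usr/bin/env bash
# Illustrative sketch: next_account is hypothetical, not the real rotate_account.sh.
# It prints the entry after the current one in the accounts file, wrapping to the first.
next_account() {
  local file="$1" current="$2"
  awk -v cur="$current" '
    NF { accounts[n++] = $0 }          # collect non-empty lines
    END {
      if (n == 0) exit 1               # no accounts configured
      for (i = 0; i < n; i++)
        if (accounts[i] == cur) { print accounts[(i + 1) % n]; exit 0 }
      print accounts[0]                # unknown current account: start at the top
    }
  ' "$file"
}
```

With three accounts configured, calling it with the third account returns the first, which matches the "→ retry" wrap-around in the rotation order.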
GitHub Issues
To pull a task from a GitHub issue:
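
    gh issue view <number> --repo <owner>/<repo> --json title,body | \
      jq -r '"# " + .title + "\n\n" + .body' > ~/.forge/<timestamp>/task.md
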
Then run the pipeline normally.
Output
On completion, workspace contains:
- `clarifications.md`: pre-spec Q&A (scope, constraints, integration points)
- `spec.md`: full specification (incorporates clarifications)
- `implementation_plan.json`: phases + subtasks with status
- `qa_report.md`: QA review and score
- `project-context.md`: session handoff note (decisions made, patterns established, what next session needs to know)
- `progress.log`: timestamped execution log
Optional: Rubric Scoring (200 criteria)
Add the --rubric flag for high-stakes runs. It scores the output against a universal 200-criterion quality rubric after the spec-based QA pass:

    bash ~/clawd/skills/flowforge/scripts/run_forge.sh ~/.forge/<timestamp>/ --rubric

Rubric covers: Architecture (40), Code Quality (40), Testing (40), Error Handling (30), Security (20), Documentation (15), Observability (15).
Verdict thresholds: ≥180 = Ship it | 150–179 = Needs work | <150 = Major rework
Skip --rubric for quick tasks. Use it before shipping to production.
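The verdict thresholds map a total score to one of three outcomes. Here is a small sketch of that mapping, along with a check that the seven category maxima add up to 200. The helper `rubric_verdict` is hypothetical and only restates the thresholds given above.

```bash
#!/usr/bin/env bash
# Illustrative sketch: rubric_verdict is a hypothetical helper, not a FlowForge script.
rubric_verdict() {
  local score="$1"
  if   [ "$score" -ge 180 ]; then echo "Ship it"
  elif [ "$score" -ge 150 ]; then echo "Needs work"
  else                            echo "Major rework"
  fi
}

# Architecture, Code Quality, Testing, Error Handling, Security, Documentation, Observability
rubric_total=$((40 + 40 + 40 + 30 + 20 + 15 + 15))
```

The boundaries are inclusive on the lower end: a score of exactly 180 counts as "Ship it" and exactly 150 as "Needs work".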
Prompts
See references/spec-prompt.md, references/planner-prompt.md, references/qa-prompt.md, references/rubric-prompt.md for the full Claude Code prompts used at each stage.
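FlowForge
Autonomous spec → plan → code → QA pipeline powered by Claude Code.
All heavy computation runs through Claude Code (Max subscription). OpenClaw only orchestrates.
Architecture

    Flo (minimal tokens) → shell pipeline → Claude Code (all heavy lifting)
    ↓
    Auto-rotates accounts on rate limit
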
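Workflow Types
Classify the task before planning. Each type has a different phase structure:
| Type | When | Phase Order |
|---|---|---|
| `feature` | New capability | Backend → Worker → Frontend → Integration |
| `refactor` | Restructure existing code | Add New → Migrate → Remove Old → Cleanup |
| `investigation` | Bug hunt | Reproduce → Investigate → Fix → Harden |
| `migration` | Move data/infra | Prepare → Test → Execute → Cleanup |
| `simple` | Single-file change | Just subtasks, no phases |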
Steps
1. Set up workspace

    bash ~/clawd/skills/flowforge/scripts/init_forge.sh <task-description> <repo-path>

Creates ~/.forge/<timestamp>/ with task.md.
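2. Clarification checkpoint (required before spec)
Before running the pipeline, ask 2–4 targeted questions to resolve ambiguity. Do not ask for information already in task.md. Focus on:
- Scope edge cases: "Does this include X, or is X a separate part?"
- Constraints: "Are any files frozen or off limits?"
- Integration points: "Which existing module owns this responsibility?"
- Success definition: "What counts as passing: a test, a manual check, or a metric?"
Present questions in a numbered list. Wait for answers before proceeding. If the task is unambiguous (e.g., a single-file fix from a clear issue), skip this step and note "No clarification needed."
Save answers to ~/.forge/<timestamp>/clarifications.md for reference during the spec and plan phases.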
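3. Run the pipeline

    bash ~/clawd/skills/flowforge/scripts/run_forge.sh ~/.forge/<timestamp>/

This chains 4 Claude Code calls:
1. Spec: generates `spec.md` incorporating clarifications (high thinking)
2. Plan: generates `implementation_plan.json` (high thinking)
3. Code: executes each subtask with verification (medium thinking)
4. QA: reviews output, scores against spec (high thinking)
Each step saves output to the workspace directory. Claude Code does all the work.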
4. Monitor
Poll the workspace for completion:

    tail -f ~/.forge/<timestamp>/progress.log
    cat ~/.forge/<timestamp>/qa_report.md

Account Rotation
Three Claude Max accounts rotate automatically on rate limit:

    account-1@gmail.com → account-2@gmail.com → account-3@gmail.com → retry

Configure your accounts in ~/.flowforge/accounts.txt (one email per line).
Save credentials per account in ~/.claude/accounts/<email>.json.
Switch accounts with: `bash <skill-dir>/scripts/rotate_account.sh`
GitHub Issues
To pull a task from a GitHub issue:

    gh issue view <number> --repo <owner>/<repo> --json title,body | \
      jq -r '"# " + .title + "\n\n" + .body' > ~/.forge/<timestamp>/task.md

Then run the pipeline normally.
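Output
On completion, the workspace contains:
- `clarifications.md`: pre-spec Q&A (scope, constraints, integration points)
- `spec.md`: full specification (incorporates clarifications)
- `implementation_plan.json`: phases and subtasks with status
- `qa_report.md`: QA review and score
- `project-context.md`: session handoff note (decisions made, patterns established, what the next session needs to know)
- `progress.log`: timestamped execution log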
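Optional: Rubric Scoring (200 criteria)
Add the --rubric flag for high-stakes runs. It scores the output against a universal 200-criterion quality rubric after the spec-based QA pass:

    bash ~/clawd/skills/flowforge/scripts/run_forge.sh ~/.forge/<timestamp>/ --rubric

Rubric covers: Architecture (40), Code Quality (40), Testing (40), Error Handling (30), Security (20), Documentation (15), Observability (15).
Verdict thresholds: ≥180 = Ship it | 150–179 = Needs work | <150 = Major rework
Skip --rubric for quick tasks. Use it before shipping to production.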
Prompts
See references/spec-prompt.md, references/planner-prompt.md, references/qa-prompt.md, references/rubric-prompt.md for the full Claude Code prompts used at each stage.