autooptimise (Automatic Optimisation)

Autonomously optimise any OpenClaw skill using a benchmark-driven experiment loop. The loop scores skill outputs 0-10 across 4 dimensions, identifies the lowest-scoring pattern, proposes a targeted SKILL.md change, re-tests, and keeps or discards the change based on measured improvement. Use when asked to: optimise my [skill] skill, run autooptimise on [skill], benchmark my [skill] skill, or improve my skill overnight.

Author: admin | Source: ClawHub

autooptimise

Autonomous benchmark-driven skill optimisation for OpenClaw. Inspired by Andrej Karpathy's autoresearch — the same modify → test → score → keep/discard loop, applied to agent skill quality instead of GPU training.

Trigger Phrases

- optimise my weather skill
- run autooptimise on [skill name]
- benchmark my [skill name] skill
- improve my skill overnight

Key Files

File	Purpose
benchmark/tasks.json	Test task suite (prompts + expected qualities)
benchmark/scorer.md

How to Run

1. Read runner/run_experiment.md, which contains the full loop instructions.
2. Confirm the target skill with the user if it was not specified.
3. Execute the loop (max 3 iterations).
4. Present proposed changes for human approval. Never auto-apply them.
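
The steps above can be sketched in a few lines of Python. This is only an illustration of the modify, test, score, keep/discard loop under stated assumptions, not the contents of runner/run_experiment.md: the names `run_loop`, `Proposal`, `evaluate` and `propose_change` are hypothetical, and the two callables stand in for the benchmark run and the LLM-driven edit step. The function returns a proposal for review and writes nothing to disk, in keeping with the approval requirement.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_ITERATIONS = 3  # iteration cap stated for v0.1


@dataclass
class Proposal:
    """A candidate SKILL.md text plus the benchmark score it achieved."""
    skill_text: str
    score: float


def run_loop(
    skill_text: str,
    evaluate: Callable[[str], float],      # hypothetical: runs the benchmark, returns a 0-10 score
    propose_change: Callable[[str], str],  # hypothetical: returns a modified SKILL.md text
    max_iterations: int = MAX_ITERATIONS,
) -> Optional[Proposal]:
    """Return the best improved candidate, or None if nothing beat the baseline."""
    best = Proposal(skill_text, evaluate(skill_text))
    baseline = best.score
    # Never run more than the hard cap, whatever the caller asks for.
    for _ in range(min(max_iterations, MAX_ITERATIONS)):
        candidate_text = propose_change(best.skill_text)
        candidate_score = evaluate(candidate_text)
        if candidate_score > best.score:
            best = Proposal(candidate_text, candidate_score)  # keep
        # otherwise the candidate is discarded and the loop continues
    return best if best.score > baseline else None
```

The caller would show the returned proposal to the user as a diff and apply it only after explicit approval.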

Scoring

Use the best available LLM judge model (prefer a strong reasoning model). Score each task 0–10 on:

- Accuracy: correct answer / correct tool called
- Conciseness: no padding, no unnecessary text
- Tool usage: right tool, right parameters
- Formatting: output matches expected format

Full rubric: benchmark/scorer.md
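
As a rough illustration of how judge scores on these four dimensions might be aggregated, the sketch below validates the 0-10 range and picks the dimension with the lowest mean across tasks. The dimension keys, the function names, and the choice of a plain unweighted mean are all assumptions made for the example. The actual rubric lives in benchmark/scorer.md and may weight things differently.

```python
from statistics import mean
from typing import Dict, List

# Assumed keys for the four dimensions listed above.
DIMENSIONS = ("accuracy", "conciseness", "tool_usage", "formatting")


def validate(scores: Dict[str, float]) -> None:
    """Reject score sets that miss a dimension or leave the 0-10 range."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for name in DIMENSIONS:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be within 0-10, got {scores[name]}")


def task_score(scores: Dict[str, float]) -> float:
    """Unweighted mean of the four dimension scores for one task (an assumption)."""
    validate(scores)
    return mean(scores[d] for d in DIMENSIONS)


def weakest_dimension(all_scores: List[Dict[str, float]]) -> str:
    """Dimension with the lowest mean across all tasks."""
    if not all_scores:
        raise ValueError("no scores to aggregate")
    for scores in all_scores:
        validate(scores)
    averages = {d: mean(s[d] for s in all_scores) for d in DIMENSIONS}
    return min(averages, key=averages.get)
```

The weakest dimension is then a natural target for the next proposed SKILL.md change.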

Safety Rules

- Never auto-apply changes. Always present a diff and wait for explicit human approval.
- Never modify benchmark/tasks.json or benchmark/scorer.md during a run.
- Never exceed 3 iterations per run in v0.1.
- Log every action to runner/experiment_log.md.
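
One way these rules could be enforced mechanically is sketched below, using only the Python standard library. It is an assumption about how a runner might guard itself, not a description of the skill's actual implementation, and all four helper names are hypothetical. The benchmark files are hashed before the run and re-checked afterwards, each action is appended to a log file, and the proposed change is rendered as a unified diff for human review.

```python
import difflib
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, Iterable


def snapshot(paths: Iterable[Path]) -> Dict[str, str]:
    """Record a SHA-256 digest for each protected file."""
    return {str(p): hashlib.sha256(p.read_bytes()).hexdigest() for p in paths}


def assert_unchanged(before: Dict[str, str]) -> None:
    """Raise if any protected file differs from its recorded digest."""
    changed = [
        name for name, digest in before.items()
        if hashlib.sha256(Path(name).read_bytes()).hexdigest() != digest
    ]
    if changed:
        raise RuntimeError(f"benchmark files modified during run: {changed}")


def log_action(log_path: Path, message: str) -> None:
    """Append one timestamped entry to the experiment log."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {stamp} {message}\n")


def make_diff(old: str, new: str, name: str = "SKILL.md") -> str:
    """Unified diff of the proposed change, for display only."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    ))
```

A runner would call `snapshot` on benchmark/tasks.json and benchmark/scorer.md at the start, `assert_unchanged` at the end, and show the output of `make_diff` to the user before anything is written.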

