Plugin Creator
Use this skill to build or debug OpenClaw plugins. Prefer the official SDK surface, official documentation, and existing plugin patterns over generic plugin assumptions.
Why plugins exist
Plugins exist to extend OpenClaw without forking the host.
That matters because most user needs are not “change everything.” They are usually one of these:
- teach the agent a new capability
- let the user trigger a deterministic shortcut
- react to an event in the runtime
- package repeatable domain knowledge
The philosophical goal is not to put all custom logic into one plugin. The goal is to place each behavior at the smallest correct boundary so it stays understandable, testable, and portable.
When deciding what to build, start from the user need, not from the mechanism. Ask:
- What exact problem is the user trying to solve?
- Who should initiate the behavior: the user, the agent, or the runtime?
- Does the behavior need judgment, determinism, or passive observation?
- What is the smallest unit that solves the need cleanly?
If you answer those questions first, the plugin shape usually becomes obvious.
Mental model: hook vs tool vs command vs skill
These are different layers. Do not collapse them into “plugin stuff.”
Hooks — react to runtime events
Use a hook when the behavior should happen because something else happened.
- Mental model: an interception point or observer in the runtime lifecycle
- Good for: auditing, rewriting, guardrails, telemetry, prompt shaping, policy enforcement
- Ask for a hook when the real question is: “When X happens, should I observe it, modify it, or block it?”
Tools — give the agent a capability
Use a tool when the agent needs to do something during reasoning.
- Mental model: a callable capability inside the agent's toolbox
- Good for: API calls, deterministic computations, external actions, structured lookups
- Ask for a tool when the real question is: “Should the model be able to choose this action mid-run?”
Slash commands / native commands — give the user a deterministic shortcut
Use a command when the user should be able to explicitly trigger a behavior without relying on model judgment.
- Mental model: a direct entrypoint, not an AI-selected capability
- Good for: status, toggles, admin actions, explicit workflows, manual overrides
- Ask for a command when the real question is: “Should the user be able to force this immediately?”
Skills — package reusable knowledge and workflow
Use a skill when the problem is not “run this one function,” but “help the model reason in a repeatable way.”
- Mental model: a reusable playbook for judgment, workflow, and domain knowledge
- Good for: domain-specific analysis, multi-step procedures, standard operating methods, decomposition guidance
- Ask for a skill when the real question is: “Does the model need better thinking structure, not just a new API?”
Practical decision rule
- If the user explicitly triggers it, start by considering a command.
- If the model should choose it during reasoning, start by considering a tool.
- If it should happen because the runtime reached a lifecycle point, start by considering a hook.
- If the main value is judgment, reusable reasoning, or process guidance, start by considering a skill.
Many good plugins combine more than one layer. The mistake is not combining them. The mistake is combining them without separating responsibilities.
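The decision rule above can be sketched as a small lookup. This is only an illustration: the type and function names are hypothetical and are not part of the OpenClaw SDK.

```typescript
// Hypothetical names for illustration; none of these types come from the OpenClaw SDK.
type Initiator = "user" | "agent" | "runtime" | "knowledge";
type Mechanism = "command" | "tool" | "hook" | "skill";

// Maps who (or what) starts the behavior to the mechanism to consider first.
function firstCandidate(initiator: Initiator): Mechanism {
  switch (initiator) {
    case "user":
      return "command"; // explicit, deterministic trigger
    case "agent":
      return "tool"; // the model chooses it during reasoning
    case "runtime":
      return "hook"; // fires because a lifecycle point was reached
    case "knowledge":
      return "skill"; // the value is reusable reasoning or process guidance
  }
}

// A feature with several triggers yields several candidates, one per responsibility.
function candidatesFor(initiators: Initiator[]): Mechanism[] {
  return Array.from(new Set(initiators.map(firstCandidate)));
}
```

The result is a starting point for the design discussion, not a verdict; a feature that returns more than one mechanism still needs its responsibilities separated.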
Decomposing user needs
When a user says “I want a plugin that does X,” do not immediately design files. Decompose the request.
Step 1: find the real trigger
- User-triggered → likely command
- Agent-triggered → likely tool
- Event-triggered → likely hook
- Knowledge/workflow-triggered → likely skill
Step 2: split the request by responsibility
Most plugin requests contain multiple concerns mixed together:
- invocation: how behavior starts
- decision logic: how behavior decides what to do
- side effects: what external action happens
- state: what must be remembered
- visibility: what the user should see
Split those concerns before coding. A clean plugin often looks like:
1. a thin registration layer in index.ts
2. small implementation modules per responsibility
3. tests that validate each boundary separately
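A minimal sketch of that shape follows. The method names registerTool, registerCommand, and on are the ones this document names, but the argument shapes, the PluginApiSketch interface, the event name, and every helper below are assumptions made for illustration; check the published SDK types for the real signatures.

```typescript
// Stand-in for the plugin API object. The argument shapes are ASSUMPTIONS,
// not the OpenClaw SDK's actual signatures.
interface PluginApiSketch {
  registerTool(tool: { name: string; run: (input: string) => string }): void;
  registerCommand(command: { name: string; handler: () => string }): void;
  on(event: string, listener: (payload: unknown) => void): void;
}

// Implementation modules: plain functions that know nothing about registration.
function countWords(input: string): string {
  const words = input.trim().split(/\s+/).filter((w) => w.length > 0);
  return String(words.length);
}

let observedEvents = 0;
function recordEvent(_payload: unknown): void {
  observedEvents += 1;
}

function statusText(): string {
  return `events observed: ${observedEvents}`;
}

// Thin registration layer: it only wires implementations to the API.
function register(api: PluginApiSketch): void {
  api.registerTool({ name: "word_count", run: countWords });
  api.registerCommand({ name: "status", handler: statusText });
  api.on("example_event", recordEvent); // hypothetical event name
}

// A recording fake lets a test check the registration boundary on its own.
function createFakeApi() {
  const tools = new Map<string, (input: string) => string>();
  const commands = new Map<string, () => string>();
  const listeners = new Map<string, (payload: unknown) => void>();
  const api: PluginApiSketch = {
    registerTool: (tool) => { tools.set(tool.name, tool.run); },
    registerCommand: (command) => { commands.set(command.name, command.handler); },
    on: (event, listener) => { listeners.set(event, listener); },
  };
  return { api, tools, commands, listeners };
}
```

Because the implementations are plain functions, they can be tested without any API object, and the fake covers the registration boundary separately.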
Step 3: choose the smallest correct unit
Prefer:
- one command per clear user intent
- one tool per clear capability
- one hook per lifecycle concern
- one skill per coherent reasoning workflow
Avoid “mega plugins” that mix unrelated behavior just because the code lives in one package.
Step 4: verify all four layers
Every plugin feature should be checked at four layers:
1. Manifest: is the plugin declared correctly?
2. Registration: does the plugin actually register the command/tool/hook/skill?
3. Runtime: can the runtime reach and execute it?
4. Surface: can the user actually observe or trigger it where expected?
This prevents a common failure mode: “the code exists, therefore the feature works.”
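One way to keep the four layers separate in a report is to record evidence per layer and surface the first layer that lacks it. The sketch below assumes hypothetical type and function names; it does not call any OpenClaw API.

```typescript
// The four layers named above, in the order they should be checked.
const LAYERS = ["manifest", "registration", "runtime", "surface"] as const;
type Layer = (typeof LAYERS)[number];

// One evidence entry per layer: whether it passed and how it was checked.
type Evidence = Record<Layer, { passed: boolean; how: string }>;

// Returns the first layer without passing evidence, or null when all four pass.
function firstUnverifiedLayer(evidence: Evidence): Layer | null {
  for (const layer of LAYERS) {
    if (!evidence[layer].passed) return layer;
  }
  return null;
}
```

A feature counts as working only when the function returns null; "the code exists" satisfies none of the four entries by itself.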
Evidence priority
When something is unclear, use this priority order:
1. Public behavior explicitly promised in official docs.
2. Published SDK types, manifest/schema references, and other stable plugin-facing contracts that do not require a full local source checkout.
3. Existing plugin patterns in the OpenClaw repo when the repo source is available, for example extensions/observability-lab/.
4. Project-specific operational experience and known pitfalls.
If layers 3 or 4 conflict with layers 1 or 2, trust layers 1 and 2. Also separate “current repo implementation observations” from “stable public contract” in your write-up.
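The priority order can be expressed as a simple comparison. This is a sketch with hypothetical source labels and function names, intended only to show how a conflict between layers would be resolved.

```typescript
// Sources in priority order, as listed above: index 0 is the most trusted.
const SOURCE_PRIORITY = ["official-docs", "sdk-contract", "repo-pattern", "field-experience"] as const;
type Source = (typeof SOURCE_PRIORITY)[number];

interface Claim {
  source: Source;
  statement: string;
}

// Picks the claim from the highest-priority source; ties keep the earlier claim.
function preferredClaim(claims: Claim[]): Claim | undefined {
  let best: Claim | undefined;
  for (const claim of claims) {
    if (
      best === undefined ||
      SOURCE_PRIORITY.indexOf(claim.source) < SOURCE_PRIORITY.indexOf(best.source)
    ) {
      best = claim;
    }
  }
  return best;
}

// Layers 3 and 4 describe the current implementation, not a stable public contract.
function isStableContract(source: Source): boolean {
  return source === "official-docs" || source === "sdk-contract";
}
```

In a write-up, isStableContract marks which statements can be presented as public contract and which should be labeled as observations of the current repo.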
What to do first
1. Classify the task first.
   - If you are creating or refactoring plugin structure, read references/plugin-layout-and-registration.md first.
   - If you are working on hooks or event observation, read references/hooks-and-events.md first.
   - If the issue is “the plugin seems registered but does not work at runtime”, read references/pitfalls-and-debugging.md first.
   - If you are adding tests, validating packaging, or tightening the dev workflow, read references/testing-and-workflow.md first.
   - If you are not sure which official source to trust first, read references/official-docs.md first.
2. Confirm the plugin boundary before writing code.
   - Decide whether this plugin is a tool, hook, command, skill, service, channel, provider, or a combination.
   - Then split the problem into four layers:
     - whether the manifest declares it
     - whether registration actually happens
     - whether runtime agent / gateway flows can really use it
     - whether the relevant surface actually displays or exposes it
   - Start with the smallest verifiable slice. Do not pile on multiple capabilities at once.
3. Prefer an existing pattern before inventing one.
   - extensions/observability-lab/: best for learning combined tool, typed hook, plugin skill, and slash-command patterns.
   - extensions/open-prose/: useful for learning plugin-shipped skill packaging.
   - extensions/lobster/ and extensions/llm-task/: useful for optional tools via optional: true.
Workflow
1. Choose the location and shape first.
   - When developing inside the OpenClaw repo, prefer extensions/<plugin-id>/.
   - When developing outside the repo, keep the same directory shape and SDK import discipline.
2. Build the smallest valid skeleton first.
   - At minimum, create package.json, openclaw.plugin.json, and index.ts.
   - If plugin code frequently references SDK types, add a local api.ts barrel.
   - If the plugin grows beyond a tiny surface, split command / hook / tool / skill / shared state into separate modules.
3. Add capabilities after the boundary is clear.
   - tools use api.registerTool(...)
   - commands use api.registerCommand(...)
   - typed hooks use api.on(...)
   - lower-level or more generic hook work should consult api.registerHook(...)
   - plugin-shipped skills are declared via the skills field in openclaw.plugin.json
4. Pass the pre-install validation gate before any install step.
   - Run the most direct scoped test first: pnpm test -- extensions/<plugin-id>/ or pnpm test -- extensions/<plugin-id>/index.test.ts
   - When developing inside the OpenClaw repo, run pnpm build at least once.
   - If the touched surface extends beyond the local plugin, add pnpm check and the appropriate broader pnpm test.
   - Only after those pass may you proceed to pnpm openclaw plugins inspect <id>, install, restart, and real-surface verification.
5. Then do post-install and runtime verification.
   - pnpm openclaw plugins inspect <id>
   - install / restart / real conversation-surface verification
   - read session logs or systemPromptReport when needed
6. Any new deliverable package must get a new version.
   - Update the plugin package.json version before repackaging.
   - Every new remote handoff or installable iteration needs a fresh patch version.
   - Always give the remote operator the latest tgz filename, the exact version, and an optional checksum. Do not say “install the package in dist” without naming the file.
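The versioning rule in step 6 can be sketched with two small helpers. The function names are hypothetical, and the sketch assumes the plugin uses a plain MAJOR.MINOR.PATCH version; the tarball name follows the npm pack convention, where a scoped name such as @scope/name becomes scope-name.

```typescript
// Bumps the patch component of a plain MAJOR.MINOR.PATCH version string.
// Versions with suffixes are rejected rather than guessed at.
function bumpPatch(version: string): string {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (match === null) throw new Error(`unsupported version format: ${version}`);
  return `${match[1]}.${match[2]}.${Number(match[3]) + 1}`;
}

// File name that npm pack produces for a package name and version.
function tarballName(packageName: string, version: string): string {
  const flat = packageName.replace(/^@/, "").replace(/\//g, "-");
  return `${flat}-${version}.tgz`;
}

// Builds the handoff text: exact file, exact version, and the checksum when available.
function handoffNote(packageName: string, version: string, checksum?: string): string {
  const lines = [`file: dist/${tarballName(packageName, version)}`, `version: ${version}`];
  if (checksum !== undefined) lines.push(`checksum: ${checksum}`);
  return lines.join("\n");
}
```

Generating the note from the same version string that was written to package.json avoids handing off a stale filename.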
Pre-install validation
If the task includes “hand this to someone to install”, “ship to a remote environment”, “build a tgz”, or “prepare install instructions”, pre-install validation is mandatory. Do not treat openclaw plugins install ... as the first validation step.
Minimum gate for in-repo plugin development:
1. scoped tests pass
2. INLINECODE33 passes
3. the target runtime version is known before compatibility and packaging claims are made
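Recommended order (the commands are those named in Workflow step 4):
1. pnpm test -- extensions/<plugin-id>/
2. pnpm build
3. pnpm check, plus the appropriate broader pnpm test, when the touched surface extends beyond the local plugin
4. pnpm openclaw plugins inspect <id>
5. install, restart, and real-surface verification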
Execution rules:
- pnpm check is not always required for the smallest isolated plugin-local change, but once the touched surface crosses plugin-local boundaries, do not skip it.
- Put plugins inspect before install so you can confirm manifest / registration / diagnostics before debugging a failed install.
- If you are handing off a package, run npm pack --pack-destination dist, then provide the exact latest dist/<package>-<version>.tgz filename, version, and checksum.
- If the target environment is not the current runtime, explicitly verify the target OpenClaw version. Since 2026.3.23, plugin compatibility is resolved against the active runtime version during install, so do not rely on stale constants.
- For correction releases such as 2026.3.23-2, do not reuse an older tgz. Repack and hand off the new deliverable version explicitly.
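The gate can be modeled as a function that lists blockers instead of returning a bare yes or no. This is a sketch under stated assumptions: the GateInput shape and function names are hypothetical, and the results are expected to come from running the commands by hand or in CI.

```typescript
interface GateInput {
  scopedTestsPassed: boolean;
  buildPassed: boolean;
  crossesPluginBoundary: boolean; // touched surface extends beyond the local plugin
  checkPassed: boolean; // result of pnpm check; only consulted when required
  targetRuntimeVersion?: string; // must be known before compatibility claims
}

// Returns the list of blockers; an empty list means the gate is green.
function preInstallBlockers(input: GateInput): string[] {
  const blockers: string[] = [];
  if (!input.scopedTestsPassed) blockers.push("scoped tests have not passed");
  if (!input.buildPassed) blockers.push("build has not passed");
  if (input.crossesPluginBoundary && !input.checkPassed) {
    blockers.push("pnpm check is required because the change crosses plugin-local boundaries");
  }
  if (!input.targetRuntimeVersion) blockers.push("target runtime version is unknown");
  return blockers;
}

// Install may be requested only when nothing blocks it.
function mayInstall(input: GateInput): boolean {
  return preInstallBlockers(input).length === 0;
}
```

Returning the blockers as text keeps the handoff message specific about which step is still missing.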
Non-negotiable constraints
- Import production plugin code only from openclaw/plugin-sdk/<subpath> official surfaces; do not import core src/** paths directly.
- INLINECODE42 must exist, and configSchema must stay strict.
- Skill YAML frontmatter is only for skill metadata; it does not attach tools to the agent.
- A plugin tool being “registered” does not mean it is automatically usable by the current agent; tool policy may still filter it.
- Every conclusion must carry enough local context to stand on its own; do not rely on unstated prior conversation context.
- Keep wording objective; avoid subjective phrasing.
- For any tool-related design, explicitly describe tool availability, triggerability, and determinism limits.
- INLINECODE44, plugin-shipped skills, skill commands, command-dispatch: tool, and native command menu visibility are different mechanisms; do not blur them together.
- Prefer before_model_resolve / before_prompt_build for prompt injection; treat before_agent_start as a compatibility path only.
- If the plugin needs runtime help, prefer api.runtime.* instead of bypassing the SDK into host internals.
- If a plugin is being handed off for remote installation, the deliverable must be the newest tgz with the new version; do not present an older package path, filename, or hash as current.
- Install is not the start of validation; if the pre-install gate is not green, do not ask anyone to run openclaw plugins install ....
- Before handoff, say which step validates manifest, registration, runtime, and surface discoverability. Do not collapse those layers into a vague “already tested”.
Handling abnormal cases
Plugins fail in predictable ways. Treat failure handling as part of the design, not as a cleanup step.
When behavior is ambiguous
- If the same user need could be solved by a hook, a tool, or a command, do not guess.
- Explain the tradeoff in plain language:
  - command = deterministic and user-controlled
  - tool = agent-controlled and flexible
  - hook = passive or interceptive runtime behavior
- Pick the smallest mechanism that preserves the intended user experience.
When config or environment is missing
- Fail clearly, not silently.
- Return actionable errors that say what is missing and where it should be configured.
- Distinguish “plugin loaded” from “plugin usable” — many runtime failures are configuration failures, not registration failures.
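A minimal sketch of an actionable configuration check follows. The key names, the location text, and the function names are hypothetical; the point is that each error names the missing setting and where to set it, and that "loaded" is reported separately from "usable".

```typescript
interface ConfigProblem {
  key: string;
  message: string;
}

// Checks required keys and says what is missing and where it should be configured.
function findConfigProblems(
  config: Record<string, unknown>,
  requiredKeys: string[],
  configLocation: string,
): ConfigProblem[] {
  const problems: ConfigProblem[] = [];
  for (const key of requiredKeys) {
    const value = config[key];
    if (value === undefined || value === null || value === "") {
      problems.push({ key, message: `Missing "${key}". Set it in ${configLocation}.` });
    }
  }
  return problems;
}

// Reports "loaded" and "usable" as separate facts.
function describeStatus(problems: ConfigProblem[]): string {
  if (problems.length === 0) return "plugin loaded and usable";
  return ["plugin loaded but not usable:", ...problems.map((p) => `- ${p.message}`)].join("\n");
}
```

A command or tool can return describeStatus output directly, so a configuration gap is not mistaken for a registration failure.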
When external dependencies fail
- Prefer narrow failures over global breakage.
- Let one failing API call or optional integration degrade one capability, not crash the whole plugin.
- If a capability is optional, model that explicitly in command output, tool errors, or docs.
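One way to keep a failure narrow is to wrap each capability so that a thrown error becomes a structured result for that capability only. The sketch below is synchronous for brevity and uses hypothetical names; real external calls would usually be asynchronous and need the equivalent handling around the awaited call.

```typescript
type CapabilityResult<T> =
  | { ok: true; value: T }
  | { ok: false; capability: string; error: string };

// Runs one capability and converts a thrown error into a structured, local failure.
function runCapability<T>(capability: string, action: () => T): CapabilityResult<T> {
  try {
    return { ok: true, value: action() };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, capability, error };
  }
}
```

A caller can report the failed capability by name in command output or a tool error while the rest of the plugin keeps working.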
When state can drift or disappear
- Assume in-memory state is temporary.
- If state must survive restart, make persistence explicit.
- Validate restored state before trusting it.
- Design for recovery, not just the happy path.
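The points above can be sketched as a restore function that validates persisted data before trusting it. The CounterState shape and its version field are hypothetical examples; how and where the serialized text is stored is left out on purpose.

```typescript
interface CounterState {
  version: 1;
  count: number;
}

const DEFAULT_STATE: CounterState = { version: 1, count: 0 };

// Parses persisted text and validates its shape; falls back to defaults on any problem.
function restoreState(serialized: string | null): { state: CounterState; recovered: boolean } {
  if (serialized === null) return { state: { ...DEFAULT_STATE }, recovered: false };
  try {
    const raw: unknown = JSON.parse(serialized);
    if (typeof raw === "object" && raw !== null) {
      const candidate = raw as { version?: unknown; count?: unknown };
      if (
        candidate.version === 1 &&
        typeof candidate.count === "number" &&
        Number.isInteger(candidate.count) &&
        candidate.count >= 0
      ) {
        return { state: { version: 1, count: candidate.count }, recovered: true };
      }
    }
  } catch {
    // Corrupt JSON falls through to the default state below.
  }
  return { state: { ...DEFAULT_STATE }, recovered: false };
}
```

The recovered flag lets the caller tell the user when state was reset instead of silently starting from zero.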
When debugging unexpected behavior
Use this order:
1. confirm the plugin is loadable
2. confirm the target capability is registered
3. confirm the runtime path can actually reach it
4. confirm the user-facing surface exposes it as expected
5. only then treat it as a deeper logic bug
This order matters because many “logic bugs” are actually loading, policy, or surface-discovery problems.
Design philosophy for users
The job is not just to expose SDK features. The job is to help the user get the behavior they actually want.
That means you should be able to explain, in simple language:
- why this behavior belongs in a hook, tool, command, or skill
- why it is split into these pieces and not fewer or more
- what happens on the happy path
- what happens on the failure path
- what the user can rely on, and what remains probabilistic or policy-gated
If you cannot explain the design simply, the design is probably still too tangled.
Delivery standard
When plugin development or debugging is complete, the output should cover at least:
- plugin shape and directory structure
- key registration points
- which official docs or source entrypoints were used
- which validations were run
- which conclusions belong to manifest / registration / runtime / surface discoverability
- whether any residual risk remains around tool policy, hook semantics, config schema, or installation flow
References
- Layout and registration: references/plugin-layout-and-registration.md
- Hooks and events: references/hooks-and-events.md
- Testing and dev workflow: references/testing-and-workflow.md
- Pitfalls and debugging: references/pitfalls-and-debugging.md
- Official docs entrypoints: references/official-docs.md
- In-repo example map: INLINECODE56