calibre-metadata-apply
A skill for updating metadata of existing Calibre books.
Skill selection contract (strict)
- If the user intent is metadata edit/fix/update, this skill is mandatory.
- If the request mentions an ID together with edit/fix/update intent (e.g. ID1011 タイトル修正 ("fix the title of ID1011"), ID1011 のタイトルを直して ("please fix the title of ID1011")), this skill is mandatory.
- If the request mentions an ID but only for viewing/checking/confirming (e.g. ID1021 を確認して ("check ID1021"), ID1021 の詳細 ("details of ID1021")), do NOT use this skill; route to calibre-catalog-read.
- calibre-catalog-read must not be used for those edit intents.
Use this skill when the user asks any of:
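- "ID指定でタイトル修正" (fix a title by specifying an ID)
- "メタデータ編集" (metadata editing)
- title/authors/series/series_index/tags/publisher/pubdate/languages updates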
Do NOT use this skill for:
- Read-only lookups (e.g. "ID 1021 を確認して" (check ID 1021), "ID 1021 の情報を見せて" (show me the info for ID 1021), "show me book 1021")
- Checking what metadata a book currently has without intent to change it
- Those must use calibre-catalog-read
Requirements
- calibredb must be available on PATH in the runtime environment
- subagent-spawn-command-builder installed (for spawn payload generation)
- pdffonts is optional/recommended for PDF evidence checks
- Reachable Calibre Content server URL
  - http://HOST:PORT/#LIBRARY_ID
  - If LIBRARY_ID is unknown, use #- once to list available IDs on the server.
- --with-library can be omitted only when one of these is configured:
  - env: CALIBRE_WITH_LIBRARY or CALIBRE_LIBRARY_URL or CALIBRE_CONTENT_SERVER_URL
  - optional library id completion: CALIBRE_LIBRARY_ID
- Read the "Calibre Content Server" section of TOOLS.md for the correct --with-library URL.
- Host failover (IP change resilience):
  - Optional env: CALIBRE_SERVER_HOSTS=host1,host2,...
  - Script auto-tries candidates, including the WSL host-side nameserver from /etc/resolv.conf.
- If authentication is enabled, prefer /home/altair/.openclaw/.env:
  - CALIBRE_USERNAME=<user>
  - CALIBRE_PASSWORD=<password>
- Auth scheme policy for this workflow:
  - Non-SSL deployment assumes Digest authentication.
  - Do not pass auth mode arguments such as --auth-mode / --auth-scheme.
- Pass --password-env CALIBRE_PASSWORD (username auto-loads from env)
- You can still override explicitly with --username <user>.
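The environment-based resolution described above can be sketched roughly as follows. This is not the skill's actual code: the function name resolveWithLibrary is hypothetical, and both the precedence among the three URL variables and the way CALIBRE_LIBRARY_ID is appended are assumptions based on the http://HOST:PORT/#LIBRARY_ID format.

```javascript
// Sketch only: the real resolution lives in the skill's scripts and may differ.
// Returns a --with-library URL from env-style settings, or null if nothing is configured.
function resolveWithLibrary(env) {
  // Assumption: the variables are consulted in the order they are listed.
  const base =
    env.CALIBRE_WITH_LIBRARY ||
    env.CALIBRE_LIBRARY_URL ||
    env.CALIBRE_CONTENT_SERVER_URL ||
    null;
  if (!base) return null;

  // A URL that already carries a "#LIBRARY_ID" fragment is left untouched.
  if (base.includes("#")) return base;

  // Optional completion from CALIBRE_LIBRARY_ID (assumed to be appended as "/#ID").
  const libraryId = env.CALIBRE_LIBRARY_ID;
  return libraryId ? `${base.replace(/\/+$/, "")}/#${libraryId}` : base;
}
```

Called as resolveWithLibrary(process.env), a null result would correspond to the case where --with-library cannot be omitted.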
Supported fields
Direct fields (set_metadata --field)
- title
- title_sort
- authors (string with & or array)
- author_sort
- series
- series_index
- tags (string or array)
- publisher
- pubdate (YYYY-MM-DD)
- languages
- comments
Helper fields
- comments_html (OC marker block upsert)
- analysis (auto-generates analysis HTML for comments)
- analysis_tags (adds tags)
- tags_merge (default true)
- tags_remove (remove specific tags after merge)
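How the tag-related helper fields might combine can be sketched as below. The field spellings tags_merge, tags_remove, and analysis_tags are assumed, as are the comma separator for string-form tags and the exact merge order; resolveTags and toTagList are hypothetical names, not functions from the skill.

```javascript
// Accepts a comma-separated string (assumed separator) or an array;
// returns a trimmed array without empty entries.
function toTagList(value) {
  if (value == null) return [];
  const items = Array.isArray(value) ? value : String(value).split(",");
  return items.map((t) => String(t).trim()).filter((t) => t.length > 0);
}

// Hypothetical helper: computes the final tag list for one record.
function resolveTags(currentTags, record) {
  const merge = record.tags_merge !== false; // merging is the default
  const incoming = [...toTagList(record.tags), ...toTagList(record.analysis_tags)];
  const base = merge ? toTagList(currentTags) : [];
  const removed = new Set(toTagList(record.tags_remove));

  // Keep first occurrences only, and apply removals after the merge.
  const seen = new Set();
  const result = [];
  for (const tag of [...base, ...incoming]) {
    if (removed.has(tag) || seen.has(tag)) continue;
    seen.add(tag);
    result.push(tag);
  }
  return result;
}
```

For example, resolveTags(["fiction", "todo"], { tags: "sf, fiction", tags_remove: ["todo"] }) keeps the existing "fiction", adds "sf", and drops "todo".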
Required execution flow
A. Target confirmation (mandatory)
1. Run read-only lookup to narrow candidates
2. Show id,title,authors,series,series_index
3. Get user confirmation for final target IDs
4. Build JSONL using only confirmed IDs
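The last step, building JSONL from confirmed IDs only, could look roughly like this. The record shape (an id plus the fields to change) and the name buildJsonl are assumptions for illustration; the skill's real input schema is not shown here.

```javascript
// Hypothetical helper: serializes only records whose id the user confirmed.
// records: array of objects such as { id: 1011, title: "..." } (assumed shape)
// confirmedIds: array of ids (numbers or numeric strings)
function buildJsonl(records, confirmedIds) {
  const allowed = new Set(confirmedIds.map(Number));
  return records
    .filter((record) => allowed.has(Number(record.id)))
    .map((record) => JSON.stringify(record)) // one JSON object per line
    .join("\n");
}
```

Filtering at serialization time means an unconfirmed candidate cannot reach the apply payload even if it is still present in the working set.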
B. Proposal synthesis (when metadata is missing)
1. Collect evidence from file extraction + web sources
2. Show one merged proposal table with:
   - candidate, source, confidence (high|medium|low)
   - title_sort_candidate, author_sort_candidate
3. Get user decision:
   - approve all
   - approve only: <fields>
   - reject: <fields>
   - edit: <field>=<value>
4. Apply only approved/finalized fields
5. If confidence is low or sources conflict, keep fields empty
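One way to turn a user decision into the set of fields to apply is sketched below. The proposal shape, the name applyDecision, and several interpretations are assumptions: "reject" is read as "approve the rest", "edit" as setting a single field, and low-confidence candidates stay empty unless the user names them explicitly.

```javascript
// proposal: { field: { candidate, source, confidence } }   (assumed shape)
// decision: "approve all" | "approve only: a, b" | "reject: a, b" | "edit: field=value"
function applyDecision(proposal, decision) {
  const text = decision.trim();
  const listAfter = (prefix) =>
    text.slice(prefix.length).split(",").map((s) => s.trim()).filter(Boolean);
  const approved = {};
  const take = (field) => {
    if (proposal[field]) approved[field] = proposal[field].candidate;
  };

  if (text === "approve all") {
    // Low-confidence candidates are kept empty unless named explicitly.
    for (const [field, p] of Object.entries(proposal)) {
      if (p.confidence !== "low") take(field);
    }
  } else if (text.startsWith("approve only:")) {
    listAfter("approve only:").forEach(take);
  } else if (text.startsWith("reject:")) {
    const rejected = new Set(listAfter("reject:"));
    for (const [field, p] of Object.entries(proposal)) {
      if (!rejected.has(field) && p.confidence !== "low") take(field);
    }
  } else if (text.startsWith("edit:")) {
    const body = text.slice("edit:".length);
    const eq = body.indexOf("=");
    if (eq > 0) approved[body.slice(0, eq).trim()] = body.slice(eq + 1).trim();
  }
  // Anything unrecognized approves nothing.
  return approved;
}
```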
C. Apply
1. Run dry-run first (mandatory)
2. Run --apply only after explicit user approval
3. Re-read and report final values
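The dry-run-first rule can be enforced where the command line is assembled. The sketch below assumes that running the script without --apply is the dry-run mode; it uses only flags named in this document, and how the JSONL input is supplied is deliberately left out because it is not shown here. buildApplyArgs is a hypothetical name.

```javascript
// Returns the argument list for `node`, adding --apply only on explicit approval.
function buildApplyArgs({ approved = false, withLibrary = null, username = null } = {}) {
  const args = [
    "skills/calibre-metadata-apply/scripts/calibredb_apply.mjs",
    "--password-env",
    "CALIBRE_PASSWORD",
  ];
  // Only passed when the env-based defaults could not be resolved.
  if (withLibrary) args.push("--with-library", withLibrary);
  if (username) args.push("--username", username);
  // Assumption: omitting --apply yields a dry-run.
  if (approved === true) args.push("--apply");
  return args;
}
```

Requiring a strict boolean true keeps a truthy but unintended value (for example a non-empty string) from triggering a write.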
Analysis worker policy
- Use subagent-spawn-command-builder to generate sessions_spawn payload for heavy candidate generation
  - task is required.
  - Profile should include model/thinking/timeout/cleanup for this workflow.
- Use lightweight subagent model for analysis (avoid main heavy model)
- Keep final decisions + dry-run/apply in main
Data flow disclosure
- Build calibredb set_metadata commands from JSONL.
- Read/write local state files (state/runs.json).
- Subagent execution (optional for heavy candidate generation):
  - Uses sessions_spawn via subagent-spawn-command-builder.
  - Text/metadata sent to subagent can reach model endpoints configured by runtime profile.
- calibredb set_metadata updates metadata on the target Calibre Content server.
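To make the first item concrete, here is a rough sketch of turning one JSONL record into calibredb set_metadata arguments. It is illustrative only and not the skill's implementation: buildSetMetadataArgs is a hypothetical name, the record shape is assumed, and only a few direct fields are covered. It relies on calibredb's --field name:value form, with authors joined by & and tags by commas. In agent/chat execution the wrapper script remains the entry point.

```javascript
// Builds the argument list for `calibredb` from one record (assumed shape:
// { id, title?, authors?, tags?, series?, series_index?, publisher?, pubdate? }).
function buildSetMetadataArgs(record, withLibrary) {
  const args = ["set_metadata"];
  if (withLibrary) args.push("--with-library", withLibrary);

  const joinList = (value, sep) => (Array.isArray(value) ? value.join(sep) : String(value));
  const fields = {
    title: record.title,
    authors: record.authors == null ? undefined : joinList(record.authors, " & "),
    tags: record.tags == null ? undefined : joinList(record.tags, ","),
    series: record.series,
    series_index: record.series_index,
    publisher: record.publisher,
    pubdate: record.pubdate,
  };

  // Empty or missing fields are skipped so they are never overwritten with blanks.
  for (const [name, value] of Object.entries(fields)) {
    if (value === undefined || value === null || value === "") continue;
    args.push("--field", `${name}:${value}`);
  }
  args.push(String(record.id)); // the book id comes last
  return args;
}
```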
Security rules:
- Prefer env-based password (--password-env CALIBRE_PASSWORD) over inline --password.
- If user does not want external model/subagent processing, keep flow local and skip subagent orchestration.
- In agent/chat execution, do not call calibredb directly for edit operations.
  - Always execute node skills/calibre-metadata-apply/scripts/calibredb_apply.mjs.
- Never run calibre-server from this skill.
  - This workflow always targets an already-running Calibre Content server.
Connection bootstrap (mandatory)
- Do not ask the user for --with-library first.
- First, execute using saved defaults (env) with no explicit --with-library.
  - Scripts auto-load .env and resolve CALIBRE_WITH_LIBRARY / CALIBRE_CONTENT_SERVER_URL.
- Ask user for URL only when command output shows unresolved connection, such as:
  - missing --with-library
  - unable to resolve usable --with-library
  - repeated connection failures for all candidates
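The "ask only when unresolved" rule can be expressed as a small check over the command output. This is a sketch under assumptions: the two message strings are taken from the list above, while the way failed and candidate hosts are tracked (two arrays) and the name needsUserUrl are hypothetical.

```javascript
// Returns true only when the output (or the failover result) indicates that
// the connection could not be resolved from saved defaults.
function needsUserUrl(output, failedHosts = [], candidateHosts = []) {
  const markers = ["missing --with-library", "unable to resolve usable --with-library"];
  if (markers.some((marker) => output.includes(marker))) return true;

  // "Repeated connection failures for all candidates": every candidate has failed.
  return candidateHosts.length > 0 && candidateHosts.every((host) => failedHosts.includes(host));
}
```

Any other failure, such as an authentication error, would not prompt for a URL under this check.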
Long-run turn-split policy (library-wide)
For library-wide heavy processing, always use turn-split execution.
Unknown-document recovery flow (M3)
Batch sizing rule:
- Keep each unknown-document batch small enough to show full row-by-row results in chat (no representative sampling).
- If unresolved items remain, stop and wait for explicit user instruction to start the next batch.
User intervention checkpoints (fixed)
1. Light pass (metadata-only)
   - Always run this stage by default (no extra user instruction required)
   - Analyze existing metadata only (no file content read)
   - Present a table to user:
     - current file/title
     - recommended title/metadata
     - confidence/evidence summary
   - Stop and wait for user instruction before any deeper stage
2. On user request: page-1 pass
   - Read only the first page and refine proposals
   - Report delta from light pass
3. If still uncertain: deep pass
   - Read first 5 pages + last 5 pages
   - Add web evidence search
   - Produce finalized proposal with confidence + rationale
4. Approval gate
   - Show detailed findings and request explicit approval before apply
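The page ranges read at each stage can be pinned down with a small helper. The stage names "light", "page1", and "deep" and the function pagesForStage are hypothetical labels for the three passes above; page numbers are 1-based.

```javascript
// Returns the 1-based page numbers a stage is allowed to read.
function pagesForStage(stage, totalPages) {
  if (stage === "light") return []; // metadata only, no file content read
  if (stage === "page1") return totalPages >= 1 ? [1] : [];
  if (stage === "deep") {
    const pages = new Set();
    // First 5 pages.
    for (let p = 1; p <= Math.min(5, totalPages); p++) pages.add(p);
    // Last 5 pages; the Set removes overlap in short documents.
    for (let p = Math.max(1, totalPages - 4); p <= totalPages; p++) pages.add(p);
    return [...pages].sort((a, b) => a - b);
  }
  throw new Error(`unknown stage: ${stage}`);
}
```

For a 12-page document the deep pass reads pages 1 to 5 and 8 to 12; for a 7-page document the two ranges overlap and every page is read once.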
Pending and unsupported handling
- Use pending-review tag for unresolved/hold items.
- If document is unresolved in current flow, do not force metadata guesses.
  - Tag with pending-review and keep for follow-up investigation.
Diff report format (for unknown batch runs)
Return full results (not samples):
- execution summary (target/changed/pending/skipped/error)
- full changed list with id + key before/after fields
- full pending list with id + reason
- full error list with id + error summary
- confidence must be expressed as high|medium|low
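A report in this format could be assembled from per-book result rows as sketched below. The row shape (id, status, before, after, reason, error, confidence) and the name buildDiffReport are assumptions, and the accepted confidence values are taken to be high, medium, and low as in the proposal table.

```javascript
const CONFIDENCE_LEVELS = ["high", "medium", "low"];

// rows: [{ id, status: "changed"|"pending"|"skipped"|"error", ... }] (assumed shape)
function buildDiffReport(rows) {
  const summary = { target: rows.length, changed: 0, pending: 0, skipped: 0, error: 0 };
  const changed = [];
  const pending = [];
  const errors = [];

  for (const row of rows) {
    switch (row.status) {
      case "changed":
        if (!CONFIDENCE_LEVELS.includes(row.confidence)) {
          throw new Error(`id ${row.id}: confidence must be high|medium|low`);
        }
        changed.push({ id: row.id, before: row.before, after: row.after, confidence: row.confidence });
        break;
      case "pending":
        pending.push({ id: row.id, reason: row.reason });
        break;
      case "error":
        errors.push({ id: row.id, error: row.error });
        break;
      case "skipped":
        break;
      default:
        throw new Error(`id ${row.id}: unknown status ${row.status}`);
    }
    summary[row.status] += 1;
  }
  // Full lists are returned, never samples.
  return { summary, changed, pending, errors };
}
```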
Runtime artifact policy
- Keep run-state and temporary artifacts only while a run is active.
- On successful completion, remove per-run state/artifacts.
- On failure, keep minimal artifacts only for retry/debug, then clean up after resolution.
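The cleanup rules above might be applied to an in-memory copy of the run state like this. The state shape (an object keyed by run_id holding session_key and task), the "failed" status value, and the name finishRun are assumptions; the real state file layout is not shown in this document.

```javascript
// Returns a new state object with the finished run cleaned up.
// outcome: { ok: true } on success, { ok: false, error } on failure.
function finishRun(state, runId, outcome) {
  const next = { ...state };
  if (!(runId in next)) return next;

  if (outcome.ok) {
    // Success: remove per-run state entirely.
    delete next[runId];
  } else {
    // Failure: keep only what is needed for retry/debug.
    const { session_key, task } = next[runId];
    next[runId] = { session_key, task, status: "failed", error: String(outcome.error) };
  }
  return next;
}
```

Returning a copy rather than mutating the input keeps the previous state available until the new one has been written successfully.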
Internal orchestration (recommended)
- Use lightweight subagent for all analysis stages
- Keep apply decisions in main session
- Persist run state for each stage in state/runs.json
Turn 1 (start)
1. Main defines scope
2. Main generates spawn payload via subagent-spawn-command-builder (profile example: calibre-meta), then calls sessions_spawn
3. Save run_id/session_key/task via INLINECODE92
4. Immediately tell the user this is a subagent job and state the execution model used for analysis
5. Reply with "analysis started" and keep normal chat responsive
Turn 2 (completion)
1. Receive subagent completion notice
2. Save result JSON
3. Complete state handling via INLINECODE93
4. Return summarized proposal (apply only when needed)
Run state file:
PDF extraction policy
1. Try ebook-convert first
2. If empty/failed, fallback to INLINECODE96
3. If both fail, switch to web-evidence-first mode
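The fallback order above follows a common "try each extractor, then give up on file text" pattern, sketched here in a synchronous form. The extractor objects ({ name, run }), the result shape, and the name extractWithFallback are hypothetical; real extractors would invoke external tools and would likely be asynchronous.

```javascript
// extractors: ordered list of { name, run }, where run() returns extracted text
// or throws. Empty or whitespace-only text counts as a failure.
function extractWithFallback(extractors) {
  for (const { name, run } of extractors) {
    try {
      const text = run();
      if (typeof text === "string" && text.trim().length > 0) {
        return { mode: "file-text", extractor: name, text };
      }
    } catch {
      // A thrown error is treated like empty output: try the next extractor.
    }
  }
  // Every extractor failed: rely on web evidence first.
  return { mode: "web-evidence-first", extractor: null, text: "" };
}
```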
Sort reading policy
- Use user-configured reading_script for Japanese/non-Latin sort fields
  - katakana / hiragana / latin
- Ask once on first use, then reuse for the session
- Default policy is full reading (no truncation)
- Read the "Calibre Content Server" section of TOOLS.md for the configured reading_script value; pass it as a CLI argument when needed.
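For a reading that is already written in kana, switching between the katakana and hiragana options is a fixed Unicode offset, as the sketch below shows. This assumes the reading itself has already been obtained elsewhere; toReadingScript is a hypothetical name, and the latin option is not covered because it needs a romanization step that is outside this sketch.

```javascript
// Converts a kana reading to the requested script without truncating it.
function toReadingScript(reading, script) {
  // Shift code points in [from, to] by delta; everything else passes through.
  const shift = (text, from, to, delta) =>
    Array.from(text, (ch) => {
      const code = ch.codePointAt(0);
      return code >= from && code <= to ? String.fromCodePoint(code + delta) : ch;
    }).join("");

  // Hiragana U+3041..U+3096 and katakana U+30A1..U+30F6 differ by 0x60.
  if (script === "katakana") return shift(reading, 0x3041, 0x3096, 0x60);
  if (script === "hiragana") return shift(reading, 0x30a1, 0x30f6, -0x60);
  if (script === "latin") throw new Error("latin requires a romanization step not sketched here");
  throw new Error(`unknown reading_script: ${script}`);
}
```

Characters outside the two kana ranges, such as the prolonged sound mark or Latin letters, are left unchanged.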
Usage
Dry-run:
CODEBLOCK0
Dry-run (when default library is preconfigured via env/config):
CODEBLOCK1
Apply:
CODEBLOCK2
Do not
- Do not run direct --apply using ambiguous title matches only
- Do not include unconfirmed IDs in apply payload
- Do not auto-fill low-confidence candidates without explicit confirmation
- Do not start a local server with guessed path like INLINECODE103