calibre-metadata-apply

A skill for updating metadata of existing Calibre books.

Skill selection contract (strict)

- If the user intent is to edit, fix, or update metadata, this skill is mandatory.
- If the request mentions an ID together with edit/fix/update intent (e.g. "ID1011 タイトル修正" [fix the title of ID1011], "ID1011 のタイトルを直して" [please fix the title of ID1011]), this skill is mandatory.
- If the request mentions an ID only for viewing/checking/confirming (e.g. "ID1021 を確認して" [check ID1021], "ID1021 の詳細" [details of ID1021]), do NOT use this skill; route to calibre-catalog-read.
- `calibre-catalog-read` must not be used for those edit intents.

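Use this skill when the user asks for any of:

- "ID指定でタイトル修正" (fix a title by specifying an ID)
- "メタデータ編集" (edit metadata)
- `title/authors/series/series_index/tags/publisher/pubdate/languages` updates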

Do NOT use this skill for:

- Read-only lookups (e.g. "ID 1021 を確認して", "ID 1021 の情報を見せて", "show me book 1021")
- Checking what metadata a book currently has without intent to change it
- Those requests must use `calibre-catalog-read`
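
The routing contract above can be sketched as a small classifier. This is only an illustration: the function name `selectSkill` and the keyword pattern are assumptions, not part of either skill, and a real router would likely rely on the model's intent detection rather than a regex.

```javascript
// Hypothetical sketch: EDIT_PATTERN and selectSkill are illustrative names.
// The keywords cover the example phrases in this document plus a few
// English verbs; they are not an exhaustive intent detector.
const EDIT_PATTERN = /(修正|直して|編集|更新|\b(edit|fix|update|change|rename)\b)/i;

function selectSkill(request) {
  // Edit intent always wins, even when the request also mentions an ID.
  if (EDIT_PATTERN.test(request)) return "calibre-metadata-apply";
  // Viewing, checking, and confirming are read-only and go to the reader skill.
  return "calibre-catalog-read";
}
```

For example, `selectSkill("ID1011 のタイトルを直して")` routes to this skill, while `selectSkill("show me book 1021")` routes to calibre-catalog-read.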

Requirements

- calibredb must be available on PATH in the runtime environment
- `subagent-spawn-command-builder` installed (for spawn payload generation)
- `pdffonts` is optional but recommended for PDF evidence checks
- Reachable Calibre Content server URL

  - http://HOST:PORT/#LIBRARY_ID
  - If LIBRARY_ID is unknown, use #- once to list available IDs on the server.
- --with-library can be omitted only when one of these is configured:
  - env: CALIBRE_WITH_LIBRARY or CALIBRE_LIBRARY_URL or CALIBRE_CONTENT_SERVER_URL
  - optional library id completion: CALIBRE_LIBRARY_ID
- Read the "Calibre Content Server" section of TOOLS.md for the correct --with-library URL.
- Host failover (IP change resilience):
  - Optional env: CALIBRE_SERVER_HOSTS=host1,host2,...
  - Script auto-tries candidates, including WSL host-side nameserver from /etc/resolv.conf.
- If authentication is enabled, prefer /home/altair/.openclaw/.env:
  - CALIBRE_USERNAME=<user>
  - CALIBRE_PASSWORD=<password>
- Auth scheme policy for this workflow:
  - Non-SSL deployment assumes Digest authentication.
  - Do not pass auth mode arguments such as --auth-mode / --auth-scheme.
- Pass --password-env CALIBRE_PASSWORD (username auto-loads from env)
- You can still override explicitly with --username <user>.
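
To make the environment-based resolution concrete, here is a minimal sketch of how a library URL and a host candidate list could be derived. The function names, the precedence among the three URL variables, and the way CALIBRE_LIBRARY_ID is appended are assumptions for illustration; the bundled script may resolve them differently.

```javascript
// Sketch only: the precedence order below is an assumption.
function resolveWithLibrary(env) {
  const base =
    env.CALIBRE_WITH_LIBRARY ||
    env.CALIBRE_LIBRARY_URL ||
    env.CALIBRE_CONTENT_SERVER_URL;
  if (!base) return null; // nothing configured: --with-library cannot be omitted
  // Complete the library id only when the URL carries no #fragment yet.
  if (!base.includes("#") && env.CALIBRE_LIBRARY_ID) {
    return `${base.replace(/\/+$/, "")}/#${env.CALIBRE_LIBRARY_ID}`;
  }
  return base;
}

// Collect failover candidates from CALIBRE_SERVER_HOSTS plus any
// nameserver entries found in the text of /etc/resolv.conf.
function hostCandidates(env, resolvConfText = "") {
  const fromEnv = (env.CALIBRE_SERVER_HOSTS || "")
    .split(",")
    .map((h) => h.trim())
    .filter(Boolean);
  const fromResolv = [...resolvConfText.matchAll(/^nameserver\s+(\S+)/gm)].map(
    (m) => m[1]
  );
  // De-duplicate while preserving order (env hosts are tried first).
  return [...new Set([...fromEnv, ...fromResolv])];
}
```

Passing `process.env` and the contents of /etc/resolv.conf as arguments keeps both functions free of side effects, so they are easy to check in isolation.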

Supported fields

Direct fields (`set_metadata --field`)

- `title`
- `title_sort`
- `authors` (string with `&` or array)
- `author_sort`
- `series`
- `series_index`
- `tags` (string or array)
- `publisher`
- `pubdate` (`YYYY-MM-DD`)
- `languages`
- `comments`

Helper fields

- `comments_html` (OC marker block upsert)
- `analysis` (auto-generates analysis HTML for comments)
- `analysis_tags` (adds tags)
- `tags_merge` (default `true`)
- `tags_remove` (remove specific tags after merge)
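
The interaction between `tags`, `analysis_tags`, `tags_merge`, and `tags_remove` is easier to see in code. The sketch below is an interpretation of the field descriptions above, not the script's actual implementation: the function names are hypothetical, and splitting a tag string on commas is an assumption based on how Calibre usually represents tag lists.

```javascript
// Hypothetical helpers illustrating the field semantics described above.

// authors may be an array or a single string joined with "&".
function normalizeAuthors(authors) {
  const list = Array.isArray(authors) ? authors : String(authors).split("&");
  return list.map((a) => a.trim()).filter(Boolean).join(" & ");
}

// Compute the final tag list for one record.
function resolveTags(existingTags, record) {
  const incoming = Array.isArray(record.tags)
    ? record.tags
    : String(record.tags ?? "").split(","); // assumption: comma-separated string
  const extra = record.analysis_tags ?? [];
  const merge = record.tags_merge ?? true; // documented default is true
  const base = merge ? existingTags : [];
  const removed = new Set(record.tags_remove ?? []);
  const all = [...base, ...incoming, ...extra]
    .map((t) => t.trim())
    .filter(Boolean);
  // De-duplicate first, then drop tags listed in tags_remove (applied after merge).
  return [...new Set(all)].filter((t) => !removed.has(t));
}
```

With `tags_merge` left at its default, existing tags survive unless they are named in `tags_remove`; setting it to `false` replaces the tag list outright.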

Required execution flow

A. Target confirmation (mandatory)

1. Run a read-only lookup to narrow candidates
2. Show `id,title,authors,series,series_index`
3. Get user confirmation for the final target IDs
4. Build JSONL using only confirmed IDs
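
Step 4 can be enforced mechanically. The following sketch filters proposed updates down to the IDs the user confirmed before serializing them as JSONL. The record shape (`id` plus field keys) and the function name `buildJsonl` are assumptions; check the script's expected schema before relying on it.

```javascript
// Sketch: keep only records whose id was explicitly confirmed by the user.
function buildJsonl(updates, confirmedIds) {
  const confirmed = new Set(confirmedIds.map(Number));
  const kept = updates.filter((u) => confirmed.has(Number(u.id)));
  const dropped = updates
    .filter((u) => !confirmed.has(Number(u.id)))
    .map((u) => u.id);
  return {
    jsonl: kept.map((u) => JSON.stringify(u)).join("\n"), // one JSON object per line
    dropped, // report these back so nothing is silently discarded
  };
}
```

Returning the dropped IDs alongside the payload makes it easy to tell the user which candidates were left out.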

B. Proposal synthesis (when metadata is missing)

1. Collect evidence from file extraction + web sources
2. Show one merged proposal table with:
   - candidate, source, confidence (high|medium|low)
   - title_sort_candidate, author_sort_candidate
3. Get user decision:
   - approve all
   - approve only: <fields>
   - reject: <fields>
   - edit: <field>=<value>
4. Apply only approved/finalized fields
5. If confidence is low or sources conflict, keep fields empty
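
One way to turn a user decision into the final field set is sketched below. Several behaviors here are assumptions rather than documented rules: `edit` finalizes only the edited field, `reject` approves everything else, explicitly named fields count as confirmation even at low confidence, and `approve all` still skips low-confidence candidates. The proposal shape and the name `finalizeFields` are hypothetical.

```javascript
// proposal: { fieldName: { candidate, source, confidence } }
// decision: one of the four decision strings listed above.
function finalizeFields(proposal, decision) {
  const text = decision.trim();
  const names = (s) => s.split(",").map((f) => f.trim()).filter(Boolean);
  const result = {};
  let match;

  if ((match = /^edit:\s*([^=]+)=(.*)$/i.exec(text))) {
    // A user-supplied value is final regardless of candidate confidence.
    result[match[1].trim()] = match[2].trim();
    return result;
  }
  if ((match = /^approve only:\s*(.+)$/i.exec(text))) {
    // Explicitly named fields are treated as explicit confirmation.
    for (const f of names(match[1])) {
      if (proposal[f]) result[f] = proposal[f].candidate;
    }
    return result;
  }

  let rejected = [];
  if ((match = /^reject:\s*(.+)$/i.exec(text))) {
    rejected = names(match[1]);
  } else if (!/^approve all$/i.test(text)) {
    throw new Error(`unrecognized decision: ${text}`);
  }
  for (const [field, p] of Object.entries(proposal)) {
    // Low-confidence candidates stay empty unless confirmed by name.
    if (rejected.includes(field) || p.confidence === "low") continue;
    result[field] = p.candidate;
  }
  return result;
}
```

Throwing on an unrecognized decision keeps the flow from applying anything when the user's answer is ambiguous.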

C. Apply

1. Run dry-run first (mandatory)
2. Run --apply only after explicit user approval
3. Re-read and report final values
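
The dry-run gate can be expressed as a guard around the `--apply` flag. This sketch assumes that invoking the script without `--apply` performs a dry run, which the flow above implies but does not state outright; `planInvocation` and its option names are illustrative.

```javascript
// Sketch: --apply is only appended when both preconditions hold.
function planInvocation(baseArgs, { dryRunCompleted, userApproved }) {
  const script = "skills/calibre-metadata-apply/scripts/calibredb_apply.mjs";
  const apply = Boolean(dryRunCompleted && userApproved);
  return {
    command: "node",
    args: [script, ...baseArgs, ...(apply ? ["--apply"] : [])],
    mode: apply ? "apply" : "dry-run", // assumption: no --apply means dry-run
  };
}
```

Centralizing the check means no code path can add `--apply` without a completed dry run and an explicit approval.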

Analysis worker policy

- Use subagent-spawn-command-builder to generate the sessions_spawn payload for heavy candidate generation
  - task is required.
  - Profile should include model/thinking/timeout/cleanup for this workflow.
- Use a lightweight subagent model for analysis (avoid the main heavy model)
- Keep final decisions + dry-run/apply in main

Data flow disclosure

- Local execution:
  - Build calibredb set_metadata commands from JSONL.
  - Read/write local state files (state/runs.json).
- Subagent execution (optional for heavy candidate generation):
  - Uses sessions_spawn via subagent-spawn-command-builder.
  - Text/metadata sent to the subagent can reach model endpoints configured by the runtime profile.
- Remote write:
  - calibredb set_metadata updates metadata on the target Calibre Content server.

Security rules:

- Prefer env-based password (--password-env CALIBRE_PASSWORD) over inline --password.
- If the user does not want external model/subagent processing, keep the flow local and skip subagent orchestration.
- In agent/chat execution, do not call calibredb directly for edit operations.
  - Always execute node skills/calibre-metadata-apply/scripts/calibredb_apply.mjs.
- Never run calibre-server from this skill.
  - This workflow always targets an already-running Calibre Content server.

Connection bootstrap (mandatory)

- Do not ask the user for --with-library first.
- First, execute using saved defaults (env) with no explicit --with-library.
  - Scripts auto-load .env and resolve CALIBRE_WITH_LIBRARY / CALIBRE_CONTENT_SERVER_URL.
- Ask the user for a URL only when command output shows an unresolved connection, such as:
  - missing --with-library
  - unable to resolve usable --with-library
  - repeated connection failures for all candidates
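
The decision to ask for a URL can be reduced to a check on the command output. The marker strings below are taken from the list above; the `attempts` structure (one entry per host candidate) and the function name are assumptions about how a caller might track failures.

```javascript
// Markers quoted from the bootstrap rules above.
const UNRESOLVED_MARKERS = [
  "missing --with-library",
  "unable to resolve usable --with-library",
];

// attempts: hypothetical [{ host, ok }] list, one entry per candidate tried.
function shouldAskForUrl(output, attempts = []) {
  const text = String(output).toLowerCase();
  if (UNRESOLVED_MARKERS.some((m) => text.includes(m))) return true;
  // Ask only when every candidate failed, not on a single failure.
  return attempts.length > 0 && attempts.every((a) => !a.ok);
}
```

Any other output, including ordinary errors unrelated to connection resolution, leaves the saved defaults in place.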

Long-run turn-split policy (library-wide)

For library-wide heavy processing, always use turn-split execution.

Unknown-document recovery flow (M3)

Batch sizing rule:

- Keep each unknown-document batch small enough to show full row-by-row results in chat (no representative sampling).
- If unresolved items remain, stop and wait for explicit user instruction to start the next batch.

User intervention checkpoints (fixed)

1. Light pass (metadata-only)
   - Always run this stage by default (no extra user instruction required)
   - Analyze existing metadata only (no file content read)
   - Present a table to the user:
     - current file/title
     - recommended title/metadata
     - confidence/evidence summary
   - Stop and wait for user instruction before any deeper stage
2. On user request: page-1 pass
   - Read only the first page and refine proposals
   - Report delta from light pass
3. If still uncertain: deep pass
   - Read first 5 pages + last 5 pages
   - Add web evidence search
   - Produce finalized proposal with confidence + rationale
4. Approval gate
   - Show detailed findings and request explicit approval before apply
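
The checkpoints above behave like a small state machine. The sketch below encodes one reading of them: the light pass always runs, every deeper pass needs a user request, and a resolved item goes straight to the approval gate. The stage names, the `"wait"` pseudo-stage, and the option names are all illustrative assumptions.

```javascript
// Returns the next stage, or "wait" when the flow must pause for the user.
function nextStage(current, { userRequestedDeeper = false, stillUncertain = false } = {}) {
  switch (current) {
    case null:
      return "light"; // always runs by default
    case "light":
      return userRequestedDeeper ? "page1" : "wait";
    case "page1":
      if (!stillUncertain) return "approval";
      return userRequestedDeeper ? "deep" : "wait";
    case "deep":
      return "approval"; // deep pass produces the finalized proposal
    default:
      return "wait"; // approval itself is resolved by the user, not by code
  }
}
```

Because the function never returns a deeper stage without `userRequestedDeeper`, the "stop and wait" rule cannot be skipped by accident.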

Pending and unsupported handling

- Use the pending-review tag for unresolved/hold items.
- If a document is unresolved in the current flow, do not force metadata guesses.
  - Tag with pending-review and keep for follow-up investigation.

Diff report format (for unknown batch runs)

Return full results (not samples):

- execution summary (target/changed/pending/skipped/error)
- full changed list with id + key before/after fields
- full pending list with id + reason
- full error list with id + error summary
- confidence must be expressed as INLINECODE86
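
A report with this structure can be assembled from per-item results as follows. The input shape (`status`, `before`, `after`, `reason`, `error`) and the function name are assumptions chosen to match the list above; nothing is sampled or truncated.

```javascript
// results: hypothetical [{ id, status, before?, after?, reason?, error? }]
// where status is one of "changed" | "pending" | "skipped" | "error".
function buildDiffReport(results) {
  const by = (status) => results.filter((r) => r.status === status);
  const changed = by("changed");
  const pending = by("pending");
  const skipped = by("skipped");
  const errors = by("error");
  return {
    summary: {
      target: results.length,
      changed: changed.length,
      pending: pending.length,
      skipped: skipped.length,
      error: errors.length,
    },
    // Full lists, never representative samples.
    changed: changed.map(({ id, before, after }) => ({ id, before, after })),
    pending: pending.map(({ id, reason }) => ({ id, reason })),
    errors: errors.map(({ id, error }) => ({ id, error })),
  };
}
```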

Runtime artifact policy

- Keep run-state and temporary artifacts only while a run is active.
- On successful completion, remove per-run state/artifacts.
- On failure, keep minimal artifacts only for retry/debug, then clean up after resolution.

Internal orchestration (recommended)

- Use a lightweight subagent for all analysis stages
- Keep apply decisions in the main session
- Persist run state for each stage in INLINECODE87

Turn 1 (start)

1. Main defines scope
2. Main generates spawn payload via subagent-spawn-command-builder (profile example: calibre-meta), then calls INLINECODE90
3. Save run_id/session_key/task via INLINECODE92
4. Immediately tell the user this is a subagent job and state the execution model used for analysis
5. Reply with "analysis started" and keep normal chat responsive

Turn 2 (completion)

1. Receive subagent completion notice
2. Save result JSON
3. Complete state handling via INLINECODE93
4. Return summarized proposal (apply only when needed)

Run state file:

- INLINECODE94
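
The two-turn lifecycle, combined with the runtime artifact policy, might be modeled like this. The state layout (an object keyed by `run_id`) and the function names are assumptions made for the sketch; the real state file format is whatever the skill's scripts write.

```javascript
// Turn 1: record the run so the completion notice can be matched later.
function startRun(state, { run_id, session_key, task }) {
  return { ...state, [run_id]: { run_id, session_key, task, status: "running" } };
}

// Turn 2: on success drop the per-run entry, on failure keep a minimal
// record for retry/debug (to be cleaned up after resolution).
function finishRun(state, run_id, { ok, error = null }) {
  const { [run_id]: run, ...rest } = state;
  if (!run) throw new Error(`unknown run: ${run_id}`);
  if (ok) return rest;
  return { ...rest, [run_id]: { run_id, task: run.task, status: "failed", error } };
}
```

Both functions return a new state object instead of mutating the input, so a caller can persist the result in one write.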

PDF extraction policy

1. Try ebook-convert first
2. If empty/failed, fallback to INLINECODE96
3. If both fail, switch to web-evidence-first mode
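
The fallback chain can be sketched with injected extractor functions, which keeps the ordering logic separate from how each tool is invoked. The function name, the `[name, fn]` pair format, and the label of the second extractor are hypothetical; in practice each function would wrap a child process call.

```javascript
// extractors: ordered [name, fn] pairs; fn(path) returns text or throws.
function extractPdfText(path, extractors) {
  for (const [name, run] of extractors) {
    try {
      const text = run(path);
      // An empty or whitespace-only result counts as a failure.
      if (text && text.trim()) return { mode: name, text };
    } catch {
      // Extraction error: fall through to the next extractor.
    }
  }
  // Nothing produced text: rely on web evidence first.
  return { mode: "web-evidence-first", text: "" };
}
```

A caller would pass the ebook-convert wrapper first and the fallback tool second, then branch on the returned `mode`.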

Sort reading policy

- Use the user-configured reading_script for Japanese/non-Latin sort fields
  - katakana / hiragana / latin
- Ask once on first use, then reuse for the session
- Default policy is full reading (no truncation)
- Read the "Calibre Content Server" section of TOOLS.md for the configured reading_script value; pass it as a CLI argument when needed.
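
For the kana options, converting a reading between scripts is a fixed code point shift, as the sketch below shows. It assumes the reading is already available in kana (deriving a reading from kanji needs a dictionary and is out of scope), and it leaves `latin` untouched because romanization is not attempted here. The function names are illustrative.

```javascript
// Hiragana (U+3041..U+3096) and katakana (U+30A1..U+30F6) differ by 0x60.
function toKatakana(s) {
  return s.replace(/[\u3041-\u3096]/g, (c) =>
    String.fromCharCode(c.charCodeAt(0) + 0x60)
  );
}

function toHiragana(s) {
  return s.replace(/[\u30A1-\u30F6]/g, (c) =>
    String.fromCharCode(c.charCodeAt(0) - 0x60)
  );
}

// Applies the configured reading_script to a full reading (no truncation).
function applyReadingScript(reading, script) {
  switch (script) {
    case "katakana":
      return toKatakana(reading);
    case "hiragana":
      return toHiragana(reading);
    case "latin":
      return reading; // assumption: caller supplies an already romanized reading
    default:
      throw new Error(`unsupported reading_script: ${script}`);
  }
}
```

For example, `applyReadingScript("なつめ そうせき", "katakana")` yields `"ナツメ ソウセキ"`.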

Usage

Dry-run:

CODEBLOCK0

Dry-run (when default library is preconfigured via env/config):

CODEBLOCK1

Apply:

CODEBLOCK2

Do not

- Do not run --apply directly based on ambiguous title matches alone
- Do not include unconfirmed IDs in the apply payload
- Do not auto-fill low-confidence candidates without explicit confirmation
- Do not start a local server with a guessed path like INLINECODE103
