epub2md CLI
Use this skill to operate on local EPUB files with epub2md instead of hand-rolling parsing logic.
Bundled script
For any conversion that writes files, use the bundled wrapper script:
/home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py
The wrapper exists to enforce a stable workspace layout and avoid epub2md 1.6.2 merge-path quirks.
One more epub2md 1.6.2 quirk: source filenames containing glob metacharacters such as [ ] ? or * can still be treated as patterns by epub2md, even when the shell path was quoted correctly. When that happens, keep the original EPUB in inputs/, but stage a temporary safe basename such as book.epub before calling raw epub2md, then copy the generated book/ directory into the expected workspace output folder.
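The check that decides whether staging is needed can be sketched in a few lines of Python. This is an illustrative helper, not part of the bundled wrapper: the names `has_glob_chars` and `safe_basename` are hypothetical, and the character set is taken directly from the quirk described above.

```python
from pathlib import Path

# Characters this document says epub2md 1.6.2 may treat as glob syntax.
GLOB_CHARS = set("[]?*")


def has_glob_chars(epub_path: str) -> bool:
    """Return True if the file's basename contains a glob metacharacter."""
    return any(ch in GLOB_CHARS for ch in Path(epub_path).name)


def safe_basename(epub_path: str) -> str:
    """Return the name to stage under: book.epub if the original is risky."""
    return "book.epub" if has_glob_chars(epub_path) else Path(epub_path).name
```

Only the basename is inspected here, because the quirk is described in terms of the source filename; whether glob characters in parent directories also cause trouble is not covered by this document.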
What epub2md is good at
- Inspecting book metadata with --info
- Inspecting table of contents and nesting with --structure
- Listing sections/chapters with --sections
- Converting an EPUB into chapter-by-chapter Markdown files
- Merging an EPUB into one Markdown file, preferably with an explicit output name
- Merging an existing Markdown directory with plain --merge
- Downloading remote images referenced by the EPUB with --localize
- Unzipping EPUB contents for inspection with --unzip
- Batch conversion with quoted glob patterns like 'books/*.epub'
Prerequisites
Check the command first:

    command -v epub2md

If it is missing and the environment allows installs, install it with npm:

    npm install -g epub2md

If --localize is needed, make sure Node.js is at least 18.0.0.
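The same prerequisite checks can be scripted. The sketch below is one possible way to do it in Python, assuming `node --version` prints a string such as `v18.17.0`; the function names are hypothetical and not part of epub2md or the wrapper.

```python
import shutil
import subprocess

# Minimum Node.js version this document requires for --localize.
MIN_NODE_FOR_LOCALIZE = (18, 0, 0)


def parse_node_version(text: str) -> tuple:
    """Parse output such as 'v18.17.0' into (18, 17, 0)."""
    core = text.strip().lstrip("v").split("-")[0]  # drop any pre-release suffix
    parts = (core.split(".") + ["0", "0"])[:3]     # pad short versions like 'v18'
    return tuple(int(p) for p in parts)


def node_supports_localize(version_text: str) -> bool:
    """Return True if the reported Node.js version is at least 18.0.0."""
    return parse_node_version(version_text) >= MIN_NODE_FOR_LOCALIZE


def check_prerequisites() -> dict:
    """Report whether epub2md is on PATH and whether Node.js is new enough."""
    node = shutil.which("node")
    node_ok = False
    if node:
        result = subprocess.run(
            [node, "--version"], capture_output=True, text=True, check=False
        )
        try:
            node_ok = node_supports_localize(result.stdout)
        except ValueError:
            node_ok = False  # unexpected version output; treat as unsupported
    return {
        "epub2md_found": shutil.which("epub2md") is not None,
        "node_ok_for_localize": node_ok,
    }
```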
Workspace layout
By default, write conversion jobs to:
/home/admin1/.agents/skills/epub2md-cli-workspace/{bookname}
Within that directory:
- inputs/ contains the original EPUB file
- outputs/ contains conversion results
- outputs/split/ contains chapter-by-chapter Markdown output
- outputs/merge/ contains merged Markdown output and related assets
- outputs/inspect/ contains saved inspection results when the user explicitly asks for inspection output
Do not write conversion output next to the user's source EPUB unless they explicitly ask for a different layout.
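As a rough illustration of that layout, the directories for one book can be derived from a workspace root and a book name. This is a sketch only: `workspace_paths` is a hypothetical helper, and how the wrapper actually derives `{bookname}` from the EPUB is not specified here, so the name is passed in explicitly.

```python
from pathlib import Path


def workspace_paths(workspace_root: str, bookname: str) -> dict:
    """Map each role in the layout to its directory for one book."""
    book_dir = Path(workspace_root) / bookname
    outputs = book_dir / "outputs"
    return {
        "inputs": book_dir / "inputs",    # original EPUB
        "outputs": outputs,               # all conversion results
        "split": outputs / "split",       # chapter-by-chapter Markdown
        "merge": outputs / "merge",       # merged Markdown and assets
        "inspect": outputs / "inspect",   # saved inspection results
    }
```

Calling `workspace_paths("/home/admin1/.agents/skills/epub2md-cli-workspace", "foo")` would give the five paths for a book named `foo` without creating anything on disk.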
Working style
1. Start from the user's real goal, not from the default conversion.
   - If they only want metadata or structure, do not convert the book.
   - Inspection is opt-in only. Do not run it unless the user explicitly asked to inspect the EPUB.
   - If they want one final Markdown file from an EPUB, use merge mode.
   - If they want chapter files, use split mode.
2. Confirm the source path before running commands.
   - Prefer exact paths the user already gave you.
   - If they referred to "that epub in this folder", discover it with shell tools such as rg --files -g '*.epub'.
   - Quote paths and glob patterns when they contain spaces or wildcard characters.
   - Quoting is necessary but not sufficient when the EPUB basename itself contains [ ] ? or *; epub2md may still interpret the basename as a glob internally.
3. If the user asked for conversion but did not specify the output shape, ask one short clarifying question before writing files.
   - Ask whether they want 多个文件 (multiple files), 只要 merge 文件 (the merged file only), or 都转换 (both).
   - Do not guess between split and merge when the user clearly cares about the output form.
4. Use the wrapper script for any file-writing conversion task.
   - Do not call epub2md directly for split/merge/both output jobs unless the user explicitly asked for raw CLI invocation.
   - The wrapper copies the original EPUB into inputs/ and writes results into outputs/.
   - The wrapper always uses a safe merge invocation and should not produce a stray home/ directory.
   - If conversion still fails with No files found matching pattern: or a wrapper FileNotFoundError caused by a glob-like source basename, fall back to safe-basename staging: copy the EPUB into a temp directory as book.epub, run raw epub2md there, then copy the generated book/ directory back into outputs/merge/ or outputs/split/.
5. After the command finishes, report concrete outputs.
   - Say which command you ran.
   - Say where the generated files were written.
   - Mention any limitations, such as remote images not being localized unless --localize was used.
6. Treat --sections as an inspection tool, not a default user-facing output.
   - In epub2md 1.6.2, --sections prints very large objects with raw HTML.
   - Use it when you need deep inspection.
   - Summarize the findings for the user unless they explicitly asked for the raw dump.
Command selection
Inspect only
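    epub2md --info /path/to/book.epub
    epub2md --structure /path/to/book.epub
    epub2md --sections /path/to/book.epub
    epub2md --unzip /path/to/book.epub

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode inspect

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode inspect \
      --inspect-actions info structure sections
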
Use these only when the user explicitly wants to understand the book before deciding how to export it.
Convert into chapter Markdown files
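    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode split

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode split \
      --autocorrect
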
Use --autocorrect when the user explicitly wants spacing or punctuation cleanup in the Markdown output.
Merge into one Markdown file
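    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode merge

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode merge \
      --merge-name custom-name.md
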
Use the custom merged filename form when the user asks for a specific final filename.
Convert both split and merge outputs
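    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode both
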
Localize remote images
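    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode split \
      --localize

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode merge \
      --merge-name custom-name.md \
      --localize
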
Use this only when the user wants remote images downloaded into the output folder. Mention the Node.js >=18 requirement if needed.
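The wrapper invocations in this section all follow one pattern, which can be sketched as a small argument builder. This is a hypothetical helper rather than anything shipped with the skill; it uses only the wrapper flags that appear in this document (`--input`, `--mode`, `--merge-name`, `--autocorrect`, `--localize`) and does not verify flag combinations the document never shows.

```python
# Wrapper path as given in this document.
WRAPPER = "/home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py"

# Modes that appear in this document's wrapper examples.
KNOWN_MODES = {"split", "merge", "both", "inspect"}


def build_wrapper_command(epub_path, mode, merge_name=None,
                          autocorrect=False, localize=False):
    """Build the argv list for one wrapper run without executing it."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python3", WRAPPER, "--input", epub_path, "--mode", mode]
    if merge_name:
        cmd += ["--merge-name", merge_name]
    if autocorrect:
        cmd.append("--autocorrect")
    if localize:
        cmd.append("--localize")
    return cmd
```

Returning a list instead of a shell string means the result can be passed to `subprocess.run` directly, so paths with spaces need no extra quoting at that layer.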
Batch conversion
    epub2md 'books/*.epub'
    epub2md --merge 'books/*.epub'

Keep the glob quoted so epub2md receives the pattern directly.
This is a raw CLI fallback for explicit bulk-processing requests. It does not use the per-book workspace layout above. Prefer the bundled wrapper for normal single-book conversion jobs.
Fallback for glob-like EPUB filenames
Use this only when the source basename itself contains [ ] ? or * and the normal wrapper flow fails.
CODEBLOCK8
This preserves the normal workspace layout while avoiding epub2md's internal glob matching on the original basename.
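For readers who prefer Python to shell, the staging steps can also be sketched as follows. This is an assumption-laden illustration, not the wrapper's implementation: `convert_via_safe_basename` and `run_epub2md` are hypothetical names, the caller chooses the exact destination directory, and the only epub2md behavior relied on is what this document states (running it on book.epub produces a book/ directory next to it).

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_epub2md(args, cwd):
    """Run the raw epub2md CLI inside cwd; raises if it exits non-zero."""
    subprocess.run(["epub2md", *args], cwd=cwd, check=True)


def convert_via_safe_basename(src, dest_dir, extra_args=(), runner=run_epub2md):
    """Stage src as book.epub, convert it, and copy book/ into dest_dir."""
    src, dest_dir = Path(src), Path(dest_dir)
    with tempfile.TemporaryDirectory() as tmp:
        # A plain basename avoids epub2md's internal glob matching.
        shutil.copy2(src, Path(tmp) / "book.epub")
        runner([*extra_args, "book.epub"], tmp)
        generated = Path(tmp) / "book"
        if not generated.is_dir():
            raise FileNotFoundError(f"epub2md did not create {generated}")
        # Copy the contents of book/ into the workspace output directory.
        shutil.copytree(generated, dest_dir, dirs_exist_ok=True)
    return dest_dir
```

The `runner` parameter exists so the flow can be exercised without epub2md installed; in normal use the default is left in place.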
Expected outputs
Use these expectations when explaining results:
- The original EPUB is copied to INLINECODE61
- Split output is written under INLINECODE62
- Merge output is written under INLINECODE63
- Inspection output is written under .../{bookname}/outputs/inspect/ only when the user asked for inspection
- Chapter Markdown files are typically written as numbered Markdown files
- Many chapter files are named like NNN-title.md, but fallback names such as NNN-partXXXX.md can also appear
- Images are stored under an images/ subdirectory inside the relevant output folder
- Merge mode defaults to {bookname}-merged.md unless the user asked for another filename
Do not claim remote images were downloaded unless --localize was part of the command.
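When reporting results, it can help to group the generated files by the naming patterns listed above. The sketch below is a hypothetical helper; the regular expressions are assumptions that read NNN as a run of digits and XXXX in the fallback form as digits too, which this document does not confirm.

```python
import re
from pathlib import PurePosixPath

# Assumed patterns: "NNN-title.md" and the fallback "NNN-partXXXX.md".
NUMBERED_RE = re.compile(r"^\d+-(?P<rest>.+)\.md$")
FALLBACK_RE = re.compile(r"^part\d+$")


def classify_outputs(relative_paths):
    """Group output-relative POSIX paths by the kind of file they look like."""
    groups = {"chapters": [], "fallback_named": [], "other_markdown": [], "images": []}
    for rel in sorted(relative_paths):
        path = PurePosixPath(rel)
        if "images" in path.parts[:-1]:
            groups["images"].append(rel)
            continue
        match = NUMBERED_RE.match(path.name)
        if match:
            is_fallback = FALLBACK_RE.match(match.group("rest"))
            groups["fallback_named" if is_fallback else "chapters"].append(rel)
        elif path.suffix == ".md":
            groups["other_markdown"].append(rel)
    return groups
```

A merged file such as foo-merged.md lands in `other_markdown` because it has no numeric prefix.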
Known quirks
These were validated against epub2md 1.6.2:
- --sections is often too verbose to show directly because it includes raw HTML payloads
- Plain --merge on an absolute EPUB path can generate an awkward nested path and keep the .epub extension on the merged Markdown file
- The bundled wrapper avoids that behavior by never relying on plain --merge for EPUB merge jobs
- EPUB basenames containing [ ] ? or * can be interpreted as glob patterns by epub2md itself, even when shell quoting is correct
- The common symptom is No files found matching pattern: from epub2md, or a wrapper FileNotFoundError because epub2md created book/ while the wrapper expected a directory named after the original book
- The practical workaround is to stage the source EPUB to a safe basename such as book.epub, run epub2md from that temp directory, then copy the generated book/ contents back into the expected workspace output directory
Response format
Keep the response practical and concrete:
1. State the action you took
2. Show the exact epub2md command you ran
3. List the output path or paths created
4. Mention blockers or caveats, if any
Examples
Example 1: convert to chapter files
User intent: "把 ~/Books/foo.epub 转成多个 markdown 章节。"
Good command:
CODEBLOCK9
Example 2: merge and localize
User intent: "把 ~/Books/foo.epub 合成一个 markdown,并把远程图片下载到本地。"
Good command:
CODEBLOCK10
Example 3: user did not specify split or merge
User intent: "把这个 epub 转成 markdown。"
Good response:
CODEBLOCK11
Example 4: inspect only
User intent: "先帮我看看这本书的目录结构,不要导出正文。"
Good command:
CODEBLOCK12
epub2md CLI
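Use this skill to operate on local EPUB files with epub2md instead of hand-rolling parsing logic.
Bundled script
For any conversion that writes files, use the bundled wrapper script:
/home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py
The wrapper exists to enforce a stable workspace layout and avoid epub2md 1.6.2 merge-path quirks.
One more epub2md 1.6.2 quirk: source filenames containing glob metacharacters such as [ ] ? or * can still be treated as patterns by epub2md, even when the shell path was quoted correctly. When that happens, keep the original EPUB in inputs/, but stage a temporary safe basename such as book.epub before calling raw epub2md, then copy the generated book/ directory into the expected workspace output folder.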
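What epub2md is good at
- Inspecting book metadata with --info
- Inspecting table of contents and nesting with --structure
- Listing sections/chapters with --sections
- Converting an EPUB into chapter-by-chapter Markdown files
- Merging an EPUB into one Markdown file, preferably with an explicit output name
- Merging an existing Markdown directory with plain --merge
- Downloading remote images referenced by the EPUB with --localize
- Unzipping EPUB contents for inspection with --unzip
- Batch conversion with quoted glob patterns like 'books/*.epub'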
Prerequisites
Check the command first:

    command -v epub2md

If it is missing and the environment allows installs, install it with npm:

    npm install -g epub2md

If --localize is needed, make sure Node.js is at least 18.0.0.
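Workspace layout
By default, write conversion jobs to:
/home/admin1/.agents/skills/epub2md-cli-workspace/{bookname}
Within that directory:
- inputs/ contains the original EPUB file
- outputs/ contains conversion results
- outputs/split/ contains chapter-by-chapter Markdown output
- outputs/merge/ contains merged Markdown output and related assets
- outputs/inspect/ contains saved inspection results when the user explicitly asks for inspection output
Do not write conversion output next to the user's source EPUB unless they explicitly ask for a different layout.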
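Working style
1. Start from the user's real goal, not from the default conversion.
   - If they only want metadata or structure, do not convert the book.
   - Inspection is opt-in only. Do not run it unless the user explicitly asked to inspect the EPUB.
   - If they want one final Markdown file from an EPUB, use merge mode.
   - If they want chapter files, use split mode.
2. Confirm the source path before running commands.
   - Prefer exact paths the user already gave you.
   - If they referred to "that epub in this folder", discover it with shell tools such as rg --files -g '*.epub'.
   - Quote paths and glob patterns when they contain spaces or wildcard characters.
   - Quoting is necessary but not sufficient when the EPUB basename itself contains [ ] ? or *; epub2md may still interpret the basename as a glob internally.
3. If the user asked for conversion but did not specify the output shape, ask one short clarifying question before writing files.
   - Ask whether they want 多个文件 (multiple files), 只要 merge 文件 (the merged file only), or 都转换 (both).
   - Do not guess between split and merge when the user clearly cares about the output form.
4. Use the wrapper script for any file-writing conversion task.
   - Do not call epub2md directly for split/merge/both output jobs unless the user explicitly asked for raw CLI invocation.
   - The wrapper copies the original EPUB into inputs/ and writes results into outputs/.
   - The wrapper always uses a safe merge invocation and should not produce a stray home/ directory.
   - If conversion still fails with No files found matching pattern: or a wrapper FileNotFoundError caused by a glob-like source basename, fall back to safe-basename staging: copy the EPUB into a temp directory as book.epub, run raw epub2md there, then copy the generated book/ directory back into outputs/merge/ or outputs/split/.
5. After the command finishes, report concrete outputs.
   - Say which command you ran.
   - Say where the generated files were written.
   - Mention any limitations, such as remote images not being localized unless --localize was used.
6. Treat --sections as an inspection tool, not a default user-facing output.
   - In epub2md 1.6.2, --sections prints very large objects with raw HTML.
   - Use it when you need deep inspection.
   - Summarize the findings for the user unless they explicitly asked for the raw dump.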
Command selection
Inspect only

    epub2md --info /path/to/book.epub
    epub2md --structure /path/to/book.epub
    epub2md --sections /path/to/book.epub
    epub2md --unzip /path/to/book.epub

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode inspect

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode inspect \
      --inspect-actions info structure sections

Use these only when the user explicitly wants to understand the book before deciding how to export it.
Convert into chapter Markdown files

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode split

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode split \
      --autocorrect

Use --autocorrect when the user explicitly wants spacing or punctuation cleanup in the Markdown output.
Merge into one Markdown file

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode merge

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode merge \
      --merge-name custom-name.md

Use the custom merged filename form when the user asks for a specific final filename.
Convert both split and merge outputs

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode both

Localize remote images

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode split \
      --localize

    python3 /home/admin1/.agents/skills/epub2md-cli/scripts/run_epub2md.py \
      --input /path/to/book.epub \
      --mode merge \
      --merge-name custom-name.md \
      --localize

Use this only when the user wants remote images downloaded into the output folder. Mention the Node.js >=18 requirement if needed.
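Batch conversion

    epub2md 'books/*.epub'
    epub2md --merge 'books/*.epub'

Keep the glob quoted so epub2md receives the pattern directly.
This is a raw CLI fallback for explicit bulk-processing requests. It does not use the per-book workspace layout above. Prefer the bundled wrapper for normal single-book conversion jobs.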
Fallback for glob-like EPUB filenames
Use this only when the source basename itself contains [ ] ? or * and the normal wrapper flow fails.

    src='/path/to/Book [Annotated].epub'
    book_name=Book [Annotated