PDF to PPT Conversion Skill
This skill converts PDF files into PowerPoint presentations by first rendering each PDF page as a high-quality image, then assembling those images into a PPT where each image fills a slide. This approach works well for PDFs that are primarily visual (e.g., design drawings, presentations scanned to PDF, etc.) and preserves the visual layout faithfully.
When to Use
Use this skill when you need to:
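- Convert a PDF document into a PowerPoint presentation that you can annotate and extend (each page is placed as an image, so the page content itself is not editable text).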
- Preserve the exact visual layout of each PDF page (text, images, vectors) as slide backgrounds.
- Work with PDFs that are not easily editable via direct text extraction (e.g., complex layouts, designer files).
- Batch-convert multiple PDFs to PPT format.
Core Rules
1. Input PDF Requirements
- The PDF should be text-based or vector-based for best results. Scanned PDFs will be converted as images (which is still valid for this skill).
- Ensure the PDF is not password-protected, or provide the password via environment variable if supported.
2. Image Rendering Quality
- The skill renders PDF pages to images using a configurable zoom factor (default 2.0, which yields about 144 DPI from the PDF base resolution of 72 DPI).
- Higher zoom values improve image quality and text readability in the resulting PPT, at the cost of larger files.
- For text-heavy PDFs where you need selectable text in PPT, consider direct PDF→PPT conversion tools (like LibreOffice) instead.
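The relationship between zoom, DPI, and pixel size can be made concrete with a small sketch. The helper names and the `page_0001.png` naming scheme below are illustrative assumptions, not taken from the skill's script; the rendering function uses PyMuPDF's `Matrix` and `get_pixmap` and imports the library lazily so the arithmetic helpers work without it.

```python
from pathlib import Path

PDF_BASE_DPI = 72  # PDF user space is defined as 72 points per inch


def zoom_to_dpi(zoom):
    """Effective DPI of a page rendered at the given zoom factor."""
    return PDF_BASE_DPI * zoom


def rendered_size_px(width_pt, height_pt, zoom):
    """Approximate pixel size of a page (given in points) at this zoom."""
    return round(width_pt * zoom), round(height_pt * zoom)


def render_pages(pdf_path, img_dir, zoom=2.0):
    """Render every page to a PNG file (hypothetical helper, not the skill's code)."""
    import fitz  # PyMuPDF, imported here so the helpers above need no dependency

    out_dir = Path(img_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with fitz.open(pdf_path) as doc:
        matrix = fitz.Matrix(zoom, zoom)  # same scale on both axes
        for index, page in enumerate(doc, start=1):
            pix = page.get_pixmap(matrix=matrix)
            target = out_dir / f"page_{index:04d}.png"  # assumed naming scheme
            pix.save(str(target))
            paths.append(target)
    return paths
```

For a US Letter page (612 x 792 points), the default zoom of 2.0 gives roughly 1224 x 1584 pixels, and zoom 3.0 gives roughly 1836 x 2376.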
3. Output PPT Characteristics
- Each PDF page becomes one slide.
- Slide size defaults to 16:9 (width=13.33in, height=7.5in) but can be customized.
- Images are stretched to fill the slide; aspect ratio is not locked (to ensure full coverage). If aspect ratio preservation is critical, edit the script to center and crop.
- The generated PPT uses a blank layout for each slide, placing the image as a shape that covers the entire slide.
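As a rough illustration of the slide assembly described above, the sketch below places each image on a blank slide with python-pptx and optionally applies a centered crop so that the aspect ratio is preserved. The function names and the `keep_aspect` switch are assumptions for illustration; the skill's script may be organized differently.

```python
def cover_crop_fractions(img_w, img_h, slide_w, slide_h):
    """Return (left, top, right, bottom) crop fractions for a centered crop.

    After cropping by these fractions, the remaining image region has the
    slide's aspect ratio, so stretching it over the slide does not distort it.
    """
    img_aspect = img_w / img_h
    slide_aspect = slide_w / slide_h
    if img_aspect > slide_aspect:
        # Image is relatively wider than the slide: trim left and right.
        excess = 1 - slide_aspect / img_aspect
        return (excess / 2, 0.0, excess / 2, 0.0)
    # Image is relatively taller (or equal): trim top and bottom.
    excess = 1 - img_aspect / slide_aspect
    return (0.0, excess / 2, 0.0, excess / 2)


def build_deck(image_paths, output_path, width_in=13.33, height_in=7.5,
               keep_aspect=False):
    """Assemble one full-slide picture per image (hypothetical helper)."""
    from pptx import Presentation  # python-pptx, imported lazily
    from pptx.util import Inches

    prs = Presentation()
    prs.slide_width = Inches(width_in)
    prs.slide_height = Inches(height_in)
    blank = prs.slide_layouts[6]  # "Blank" layout in the default template
    for path in image_paths:
        slide = prs.slides.add_slide(blank)
        pic = slide.shapes.add_picture(
            str(path), 0, 0, width=prs.slide_width, height=prs.slide_height
        )
        if keep_aspect:
            px_w, px_h = pic.image.size  # pixel dimensions of the source image
            left, top, right, bottom = cover_crop_fractions(
                px_w, px_h, width_in, height_in
            )
            pic.crop_left, pic.crop_top = left, top
            pic.crop_right, pic.crop_bottom = right, bottom
    prs.save(str(output_path))
```

With `keep_aspect=False` the picture is simply stretched, matching the default behavior described above. With `keep_aspect=True` a 2:1 image on a square slide would lose a quarter of its width on each side.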
4. Dependencies
- Python packages: PyMuPDF (`fitz`) and python-pptx.
- These are installed automatically when the skill is first used via the provided install command, or you can install them manually.
Installation
To install this skill from a local path (after copying to your skills folder), you can use:
```bash
clawhub install /path/to/pdf-to-ppt
```
Or if you have cloned the skill repository:
```bash
clawhub install pdf-to-ppt
```
Usage
Basic Conversion
```bash
python3 scripts/pdftoppt.py input.pdf
```
This will:
- Create an `images/` subdirectory next to the input PDF (or use a specified directory).
- Render each PDF page to a PNG image in that folder.
- Generate a PPT file with the same base name as the input PDF and a `.pptx` extension, in the same directory as the input.
Advanced Options
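```bash
python3 scripts/pdftoppt.py input.pdf \
  --img-dir /path/to/images \
  --output output.pptx \
  --zoom 3.0 \
  --slide-width-in 16 \
  --slide-height-in 9 \
  --format png
```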
Arguments:
- `input.pdf`: Path to the source PDF file (required).
- `--img-dir DIR`: Directory to store intermediate images. If not provided, defaults to `<pdf_dir>/images/`.
- `--output PPTX`: Path for the output PPT file. If not provided, defaults to `<pdf_basename>.pptx` in the same directory as the input PDF.
- `--zoom ZOOM`: Zoom factor for rendering (default 2.0). Higher values give a higher DPI.
- `--slide-width-in INCHES`: Width of slides in inches (default 13.33, which gives 16:9 at a 7.5in height).
- `--slide-height-in INCHES`: Height of slides in inches (default 7.5).
- `--format {png,jpg}`: Image format for intermediates (default png).
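To show how the documented flags and their defaults fit together, here is a minimal `argparse` sketch. It mirrors the argument list above but is not the script's actual source; `build_parser` and `resolve_defaults` are hypothetical names.

```python
import argparse
from pathlib import Path


def build_parser():
    """Declare the flags documented above with their stated defaults."""
    parser = argparse.ArgumentParser(
        description="Convert a PDF to an image-based PPTX."
    )
    parser.add_argument("pdf", type=Path, help="path to the source PDF")
    parser.add_argument("--img-dir", type=Path, default=None,
                        help="directory for intermediate images")
    parser.add_argument("--output", type=Path, default=None,
                        help="path of the output .pptx file")
    parser.add_argument("--zoom", type=float, default=2.0,
                        help="rendering zoom factor")
    parser.add_argument("--slide-width-in", type=float, default=13.33,
                        help="slide width in inches")
    parser.add_argument("--slide-height-in", type=float, default=7.5,
                        help="slide height in inches")
    parser.add_argument("--format", choices=["png", "jpg"], default="png",
                        help="intermediate image format")
    return parser


def resolve_defaults(args):
    """Fill in the path defaults that depend on the input PDF location."""
    img_dir = args.img_dir or args.pdf.parent / "images"
    output = args.output or args.pdf.with_suffix(".pptx")
    return img_dir, output
```

Parsing `["docs/report.pdf"]` with this sketch yields a zoom of 2.0, the `png` format, an image directory of `docs/images`, and an output path of `docs/report.pptx`.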
Example: Convert with Custom Settings
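```bash
python3 scripts/pdftoppt.py report.pdf \
  --img-dir ./tmp/imgs \
  --output ./presentation.pptx \
  --zoom 2.5 \
  --format jpg
```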
Workflow Tips
1. Check Image Quality: After conversion, open the generated PPT and review a few slides. If text appears blurry, increase `--zoom`.
2. File Size: Higher zoom and PNG format increase both image and final PPT size. Use JPG for smaller files if slight compression artifacts are acceptable.
3. Post-Processing: The PPT is fully editable in PowerPoint. You can:
   - Add text boxes, shapes, or annotations over the image backgrounds.
   - Replace images with higher-quality versions if needed.
   - Extract individual images via "Save as Picture" if required.
4. Batch Processing: Wrap the script in a loop for multiple PDFs:

```bash
for pdf in *.pdf; do
  python3 scripts/pdftoppt.py "$pdf"
done
```
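Batch conversion can also be driven from Python, which makes it easier to pass shared options and collect failures. This is a sketch under assumptions: the script path is the one used in this document, while `build_command` and `convert_all` are hypothetical helpers. Arguments are passed as a list, so file names containing spaces need no quoting.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = "scripts/pdftoppt.py"  # script path as used elsewhere in this document


def build_command(pdf, extra_args=()):
    """Build the argument list for converting one PDF."""
    return [sys.executable, SCRIPT, str(pdf), *extra_args]


def convert_all(pdf_dir, extra_args=()):
    """Convert every PDF in a directory; return the paths that failed."""
    failed = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        result = subprocess.run(build_command(pdf, extra_args), check=False)
        if result.returncode != 0:
            failed.append(pdf)
    return failed
```

For example, `convert_all(".", ["--zoom", "2.5", "--format", "jpg"])` would run the script once per PDF in the current directory with the same settings.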
Troubleshooting
- ModuleNotFoundError for `fitz` or `pptx`: Ensure dependencies are installed. Run:

  ```bash
  pip install PyMuPDF python-pptx
  ```

  (Add `--break-system-packages` if using system Python in a restricted environment.)
- Empty Images: Verify the PDF is not empty and that PyMuPDF can open it. Try opening the PDF in a viewer to confirm it has content.
- Memory Issues: Very large PDFs with high zoom may consume significant RAM. Process in batches or reduce zoom.
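To judge whether a given zoom is likely to cause memory pressure, a back-of-the-envelope estimate helps. The sketch below assumes an uncompressed RGB pixmap at 3 bytes per pixel (no alpha channel), which is an assumption about the rendering settings rather than something stated by the skill; `page_batches` is a hypothetical helper for splitting a long document into page ranges that can be processed separately.

```python
def estimate_pixmap_bytes(width_pt, height_pt, zoom, bytes_per_pixel=3):
    """Approximate uncompressed size of one rendered page in bytes."""
    width_px = round(width_pt * zoom)
    height_px = round(height_pt * zoom)
    return width_px * height_px * bytes_per_pixel


def page_batches(page_count, batch_size):
    """Split page indices into half-open (start, stop) ranges."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        (start, min(start + batch_size, page_count))
        for start in range(0, page_count, batch_size)
    ]
```

Because pixel count grows with the square of the zoom factor, doubling the zoom roughly quadruples the memory needed per page.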
Related Skills
Consider combining with:
- `pdf-tools`: for extracting text, merging, or splitting PDFs before conversion.
- `powerpoint-pptx`: for best practices on editing and styling the resulting PPT.
- `images`: if you need to process the intermediate images further.
Feedback
If you find this skill useful, consider starring it on ClawHub: `clawhub star pdf-to-ppt`
For issues or enhancements, please refer to the skill’s source repository.
This skill was created to automate PDF → image → PPT conversion for visually rich documents.
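PDF to PPT Skill
This skill converts PDF files into PowerPoint presentations by first rendering each PDF page as a high-quality image and then assembling those images into a PPT in which each image fills one slide. The approach suits PDFs that are primarily visual (for example, design drawings or presentations scanned to PDF) and preserves the visual layout faithfully.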
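When to Use
Use this skill when you need to:
- Convert a PDF document into a PowerPoint presentation that you can annotate and extend.
- Preserve the exact visual layout of each PDF page (text, images, vector graphics) as slide backgrounds.
- Work with PDFs that are not easily editable via direct text extraction (for example, complex layouts or design files).
- Batch-convert multiple PDFs to PPT format.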
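Core Rules
1. Input PDF Requirements
- For best results, the PDF should be text-based or vector-based. Scanned PDFs will be converted as images (which is still valid for this skill).
- Ensure the PDF is not password-protected, or provide the password via environment variable if supported.
2. Image Rendering Quality
- The skill renders PDF pages to images using a configurable zoom factor (default 2.0, which yields about 144 DPI from the PDF base resolution of 72 DPI).
- Higher zoom values improve image quality and text readability in the resulting PPT, at the cost of larger files.
- For text-heavy PDFs where you need selectable text in the PPT, consider direct PDF→PPT conversion tools (such as LibreOffice) instead.
3. Output PPT Characteristics
- Each PDF page becomes one slide.
- Slide size defaults to 16:9 (width 13.33 inches, height 7.5 inches) but can be customized.
- Images are stretched to fill the slide; aspect ratio is not locked (to ensure full coverage). If aspect ratio must be preserved, edit the script to center and crop.
- The generated PPT uses a blank layout for each slide, placing the image as a shape that covers the entire slide.
4. Dependencies
- Python packages: PyMuPDF (`fitz`) and python-pptx.
- These are installed automatically via the provided install command when the skill is first used, or you can install them manually.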
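Installation
To install this skill from a local path (after copying it to your skills folder), you can use:

```bash
clawhub install /path/to/pdf-to-ppt
```

Or, if you have cloned the skill repository:

```bash
clawhub install pdf-to-ppt
```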
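Usage
Basic Conversion

```bash
python3 scripts/pdftoppt.py input.pdf
```

This will:
- Create an `images/` subdirectory next to the input PDF (or use a specified directory).
- Render each PDF page to a PNG image in that folder.
- Generate a PPT file with the same base name as the input PDF and a `.pptx` extension, in the same directory as the input PDF.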
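Advanced Options

```bash
python3 scripts/pdftoppt.py input.pdf \
  --img-dir /path/to/images \
  --output output.pptx \
  --zoom 3.0 \
  --slide-width-in 16 \
  --slide-height-in 9 \
  --format png
```

Arguments:
- `input.pdf`: Path to the source PDF file (required).
- `--img-dir DIR`: Directory to store intermediate images. If not provided, defaults to `<pdf_dir>/images/`.
- `--output PPTX`: Path for the output PPT file. If not provided, defaults to `<pdf_basename>.pptx` in the same directory as the input PDF.
- `--zoom ZOOM`: Zoom factor for rendering (default 2.0). Higher values give a higher DPI.
- `--slide-width-in INCHES`: Width of slides in inches (default 13.33, which gives 16:9 at a 7.5-inch height).
- `--slide-height-in INCHES`: Height of slides in inches (default 7.5).
- `--format {png,jpg}`: Image format for intermediates (default png).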
Example: Convert with Custom Settings

```bash
python3 scripts/pdftoppt.py report.pdf \
  --img-dir ./tmp/imgs \
  --output ./presentation.pptx \
  --zoom 2.5 \
  --format jpg
```
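Workflow Tips
1. Check Image Quality: After conversion, open the generated PPT and review a few slides. If text appears blurry, increase `--zoom`.
2. File Size: Higher zoom and PNG format increase both image and final PPT size. Use JPG for smaller files if slight compression artifacts are acceptable.
3. Post-Processing: The PPT is fully editable in PowerPoint. You can:
   - Add text boxes, shapes, or annotations over the image backgrounds.
   - Replace images with higher-quality versions if needed.
   - Extract individual images via "Save as Picture".
4. Batch Processing: Use a loop to process multiple PDFs:

```bash
for pdf in *.pdf; do
  python3 scripts/pdftoppt.py "$pdf"
done
```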
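Troubleshooting
- ModuleNotFoundError for `fitz` or `pptx`: Ensure dependencies are installed. Run:

  ```bash
  pip install PyMuPDF python-pptx
  ```

  (Add `--break-system-packages` if using system Python in a restricted environment.)
- Empty Images: Verify the PDF is not empty and that PyMuPDF can open it. Try opening the PDF in a viewer to confirm it has content.
- Memory Issues: Very large PDFs with high zoom may consume significant RAM. Process in batches or reduce zoom.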
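Related Skills
Consider combining with:
- `pdf-tools`: for extracting text, merging, or splitting PDFs before conversion.
- `powerpoint-pptx`: for best practices on editing and styling the resulting PPT.
- `images`: if you need to process the intermediate images further.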
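Feedback
If you find this skill useful, consider starring it on ClawHub: `clawhub star pdf-to-ppt`
For issues or enhancements, please refer to the skill's source repository.
This skill was created to automate PDF → image → PPT conversion for visually rich documents.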