All-in-one PDF processing tool. Merge, split, extract, convert PDFs. Supports text extraction, table recognition, PDF-to-image conversion, OCR. Triggers: PDF, pdf.
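This guide covers the full range of PDF processing operations, including conversion to images. For advanced features, see REFERENCE.md.
Working directory: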

    # Core pypdf classes used by the merge, rotate, watermark, and encryption examples
    from pypdf import PdfReader, PdfWriter
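
    # Convert every page of a PDF to a PIL image and report the page count
    from pdf2image import convert_from_path
    import os

    os.makedirs("pdf-all-in-one-workspace", exist_ok=True)  # make sure the workspace exists
    images = convert_from_path("document.pdf")  # placeholder input path
    print(f"Pages converted: {len(images)}")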
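
The pixel size of each converted image follows from the `dpi` argument of `convert_from_path` (pdf2image defaults to 200). As a rough sketch of that arithmetic, assuming the standard 72 points per inch, the hypothetical helper `pixel_size` below estimates the dimensions of a rendered page. It is not part of pdf2image, and the renderer may differ by a pixel because of its own rounding.

```python
POINTS_PER_INCH = 72  # PDF page sizes are expressed in points


def pixel_size(width_pt: float, height_pt: float, dpi: int = 200) -> tuple[int, int]:
    """Estimate the pixel dimensions of a page rendered at the given DPI."""
    return (
        round(width_pt * dpi / POINTS_PER_INCH),
        round(height_pt * dpi / POINTS_PER_INCH),
    )


# US Letter is 612 x 792 points
print(pixel_size(612, 792, dpi=150))  # (1275, 1650)
print(pixel_size(612, 792))           # (1700, 2200)
```

Lower DPI values give smaller files and faster conversion; higher values help when the images are later passed to OCR.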
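
    # Save each page as a high-quality JPEG in the workspace
    from pdf2image import convert_from_path

    images = convert_from_path("document.pdf")  # placeholder input path
    for i, image in enumerate(images):
        image.save(f"pdf-all-in-one-workspace/page_{i+1}.jpg", "JPEG", quality=95)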
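
File names such as `page_1.jpg` and `page_10.jpg` do not sort in page order in most file listings. One possible fix, sketched here with a hypothetical helper `page_image_path` that is not part of the skill, is to zero-pad the page number based on the total page count.

```python
from pathlib import Path


def page_image_path(output_dir: str, page_number: int, total_pages: int, ext: str = "jpg") -> Path:
    """Build a zero-padded file name so pages sort correctly in file listings."""
    digits = len(str(total_pages))
    return Path(output_dir) / f"page_{page_number:0{digits}d}.{ext}"


paths = [page_image_path("pdf-all-in-one-workspace", n, 12) for n in (1, 2, 12)]
print([p.name for p in paths])  # ['page_01.jpg', 'page_02.jpg', 'page_12.jpg']
```

The returned path can be passed straight to `image.save(...)` in the loop that writes the pages.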

    # Merge several PDFs into one file
    writer = PdfWriter()
    for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]:
        reader = PdfReader(pdf_file)
        for page in reader.pages:
            writer.add_page(page)
    with open("merged.pdf", "wb") as output:
        writer.write(output)
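
Splitting is the reverse of merging: pick a subset of pages and write them to a new file. The sketch below covers only the page-selection step, under the assumption that pages are specified as a 1-based string like `"1-3,5"`. The helper name `parse_page_ranges` and the spec format are illustrative, not something pypdf provides.

```python
def parse_page_ranges(spec: str, page_count: int) -> list[int]:
    """Turn a 1-based spec such as "1-3,5" into zero-based page indices."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, _, end_text = part.partition("-")
        start = int(start_text)
        end = int(end_text) if end_text else start
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"range {part!r} is outside 1..{page_count}")
        indices.extend(range(start - 1, end))
    return indices


print(parse_page_ranges("1-3,5", page_count=6))  # [0, 1, 2, 4]
```

Each index can then be used as `reader.pages[i]` and added to a fresh `PdfWriter` with `add_page`, in the same way the merge loop adds pages.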
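
    # Rotate the first page 90 degrees clockwise
    reader = PdfReader("input.pdf")  # placeholder input path
    writer = PdfWriter()
    page = reader.pages[0]
    page.rotate(90)
    writer.add_page(page)
    with open("rotated.pdf", "wb") as output:
        writer.write(output)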
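
pypdf's `rotate` works in clockwise steps of 90 degrees, so angles coming from user input are worth validating first. A minimal sketch, assuming a hypothetical helper `normalize_rotation` that maps any multiple of 90 (including negative, counter-clockwise values) to the equivalent clockwise angle:

```python
def normalize_rotation(angle: int) -> int:
    """Map an angle to 0, 90, 180, or 270 degrees clockwise; reject anything else."""
    if angle % 90 != 0:
        raise ValueError(f"rotation must be a multiple of 90, got {angle}")
    return angle % 360


print(normalize_rotation(-90))  # 270
print(normalize_rotation(450))  # 90
```

The result can be passed directly to `page.rotate(...)`.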

    # Extract text from every page with pdfplumber
    import pdfplumber

    with pdfplumber.open("document.pdf") as pdf:
        for page in pdf.pages:
            text = page.extract_text()
            print(text)

    # Extract every table into a DataFrame and write them all to one Excel file
    import pandas as pd
    import pdfplumber

    with pdfplumber.open("document.pdf") as pdf:
        all_tables = []
        for page in pdf.pages:
            tables = page.extract_tables()
            for table in tables:
                if table:
                    # The first row of each table is used as the header
                    df = pd.DataFrame(table[1:], columns=table[0])
                    all_tables.append(df)

    if all_tables:
        combined_df = pd.concat(all_tables, ignore_index=True)
        combined_df.to_excel("pdf-all-in-one-workspace/extracted_tables.xlsx", index=False)
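
pdfplumber returns each table as a list of rows whose cells are strings or `None`, so a header row taken from `table[0]` can contain empty or repeated names, which tends to cause trouble when the DataFrames are concatenated. The following is one possible way to tidy the header first. `clean_header` and the `column_N` naming are assumptions made for this sketch, and it does not guard against a generated name colliding with an existing one.

```python
def clean_header(header: list) -> list[str]:
    """Replace empty header cells and make duplicate names unique."""
    seen: dict[str, int] = {}
    cleaned = []
    for position, cell in enumerate(header):
        # Empty or None cells get a positional placeholder name
        name = (cell or "").strip() or f"column_{position + 1}"
        count = seen.get(name, 0)
        seen[name] = count + 1
        cleaned.append(name if count == 0 else f"{name}_{count + 1}")
    return cleaned


table = [["Name", None, "Name"], ["Ada", "36", "Lovelace"]]
print(clean_header(table[0]))  # ['Name', 'column_2', 'Name_2']
```

With this in place, the DataFrame could be built as `pd.DataFrame(table[1:], columns=clean_header(table[0]))`.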
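
    # Draw text and a line on a new letter-size PDF with reportlab
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    c = canvas.Canvas("pdf-all-in-one-workspace/hello.pdf", pagesize=letter)
    width, height = letter
    c.drawString(100, height - 100, "Hello World!")
    c.drawString(100, height - 120, "This PDF was created with reportlab")
    c.line(100, height - 140, 400, height - 140)
    c.save()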
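
The `height - 100` expressions appear because the reportlab canvas places its origin at the bottom-left corner of the page, with y increasing upward. A small sketch of that conversion, using a hypothetical helper `y_from_top` and the US Letter height of 792 points:

```python
LETTER_HEIGHT_PT = 792  # height of US Letter in points (11 in x 72)


def y_from_top(distance_from_top: float, page_height: float = LETTER_HEIGHT_PT) -> float:
    """Convert a distance measured down from the top edge to a canvas y coordinate."""
    return page_height - distance_from_top


print(y_from_top(100))  # 692
```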
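
    # Subscripts and superscripts are written with Paragraph markup tags
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph
    styles = getSampleStyleSheet()
    chemical = Paragraph("H<sub>2</sub>O", styles["Normal"])
    squared = Paragraph("x<super>2</super>", styles["Normal"])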
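
    # OCR a scanned PDF: render each page to an image, then run Tesseract on it
    import pytesseract
    from pdf2image import convert_from_path

    images = convert_from_path("scanned.pdf")
    text = ""
    for i, image in enumerate(images):
        text += f"Page {i+1}:\n"
        text += pytesseract.image_to_string(image)
        text += "\n\n"
    print(text)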
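
OCR is much slower than direct text extraction, so it is usually worth running only on pages where extraction yields little or nothing. A minimal heuristic, assuming a hypothetical helper `needs_ocr` and an arbitrary threshold of 20 characters that should be tuned for real documents:

```python
def needs_ocr(extracted_text, min_chars: int = 20) -> bool:
    """Guess whether a page is a scan: little or no text came out of direct extraction."""
    if extracted_text is None:
        return True
    return len(extracted_text.strip()) < min_chars


print(needs_ocr(None))                              # True
print(needs_ocr("  \n"))                            # True
print(needs_ocr("Quarterly report, page 1 of 12"))  # False
```

The argument would typically be the result of `page.extract_text()` from the pdfplumber example.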

    # Stamp the first page of watermark.pdf onto every page of the document
    watermark = PdfReader("watermark.pdf").pages[0]
    reader = PdfReader("document.pdf")
    writer = PdfWriter()
    for page in reader.pages:
        page.merge_page(watermark)
        writer.add_page(page)
    with open("pdf-all-in-one-workspace/watermarked.pdf", "wb") as output:
        writer.write(output)

    # Copy all pages and protect the result with a user and an owner password
    reader = PdfReader("input.pdf")
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)
    writer.encrypt("userpassword", "ownerpassword")
    with open("pdf-all-in-one-workspace/encrypted.pdf", "wb") as output:
        writer.write(output)
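
Hard-coded passwords like the ones above are fine for a demonstration but easy to leak in real scripts. One common alternative, sketched here under the assumption that the passwords live in environment variables with the made-up names `PDF_USER_PASSWORD` and `PDF_OWNER_PASSWORD`, is to load them at run time.

```python
import os


def load_passwords(env=os.environ) -> tuple[str, str]:
    """Read the user and owner passwords from the environment (hypothetical variable names)."""
    user = env.get("PDF_USER_PASSWORD")
    if not user:
        raise RuntimeError("PDF_USER_PASSWORD is not set")
    # Fall back to the user password when no separate owner password is given
    owner = env.get("PDF_OWNER_PASSWORD") or user
    return user, owner


print(load_passwords({"PDF_USER_PASSWORD": "open-me"}))  # ('open-me', 'open-me')
```

The pair can be unpacked into the call as `writer.encrypt(*load_passwords())`.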

| Task | Best tool | Command/code |
|---|---|---|
| PDF to images | pdf2image | `convert_from_path(pdf, dpi=150)` |
| Merge PDFs |

    └── pdf-all-in-one-workspace/
        ├── input/            # Place input PDFs here
        ├── output_images/    # Converted image output
        ├── output_pdfs
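
The workspace layout can be created up front so that later save calls do not fail on a missing folder. This is a minimal sketch that creates only the three subfolders named in the tree. The helper `create_workspace` is hypothetical, and the demonstration runs in a temporary directory.

```python
import tempfile
from pathlib import Path

SUBDIRS = ("input", "output_images", "output_pdfs")


def create_workspace(root: Path) -> Path:
    """Create the workspace folder and its subfolders; safe to call repeatedly."""
    workspace = root / "pdf-all-in-one-workspace"
    for name in SUBDIRS:
        (workspace / name).mkdir(parents=True, exist_ok=True)
    return workspace


# Demonstrated in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    ws = create_workspace(Path(tmp))
    print(sorted(p.name for p in ws.iterdir()))  # ['input', 'output_images', 'output_pdfs']
```

In a real run, `create_workspace(Path.cwd())` would place the folders next to the script.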
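
This skill can be installed through conversation on the following platforms:
Help me install SkillHub and the pdf-all-in-one-1776352264 skill
Set SkillHub as my preferred skill installation source, then help me install the pdf-all-in-one-1776352264 skill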
skillhub install pdf-all-in-one-1776352264
File size: 21.66 KB | Published: 2026-4-17 15:43