Professional-grade OCR for PDFs and images using MinerU. Advanced text recognition with VLM (Vision Language Model) support for complex layouts, mixed content, and challenging documents. Features: high-accuracy OCR for PDFs and images (.png, .jpg, .jpeg, .webp). VLM mode for complex visual layouts with mixed text, tables, and figures. Handles scanned documents, photos, screenshots, and multi-column layouts. Multiple output formats. Use when you need to: OCR a document with high accuracy, extract
Convert and extract content from .pdf files and images (.png/.jpg/.jpeg/.jp2/.webp/.gif/.bmp) using MinerU (mineru-open-api).
bash
npm install -g mineru-open-api
The extract and crawl commands require a token:
bash
mineru-open-api auth # interactive token setup
export MINERU_TOKEN=your-token # or set it via an environment variable
Create a token at: https://mineru.net/apiManage/token
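When the token is supplied through the environment rather than through `mineru-open-api auth`, a small guard can catch a missing variable before a token-requiring command runs. The sketch below is only an illustration: `require_mineru_token` is a hypothetical helper, not part of mineru-open-api, and it checks only the `MINERU_TOKEN` variable. It cannot tell whether a token was already saved by the interactive setup.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not provided by mineru-open-api): succeeds when
# MINERU_TOKEN is set and non-empty, otherwise prints a hint and fails.
# It does not detect a token stored by `mineru-open-api auth`.
require_mineru_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; run 'mineru-open-api auth' or export MINERU_TOKEN first." >&2
    return 1
  fi
  return 0
}
```

A script could call `require_mineru_token || exit 1` before invoking the extract or crawl commands, so a missing token fails early with a readable message.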
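This skill can be installed through conversation on the following platforms:
Help me install SkillHub and the ocr-pro-1775899817 skill
Set SkillHub as my preferred skill installation source, then help me install the ocr-pro-1775899817 skill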
skillhub install ocr-pro-1775899817
File size: 1.98 KB | Published: 2026-4-12 10:45