OCR (Optical Character Recognition) for Word documents (.docx) containing scanned pages or image-embedded content. Uses MinerU to extract text from Word files that have poor or missing text layers.
Features:
- OCR extraction for image-based .docx files
- VLM (Vision Language Model) mode for complex layouts with mixed text and images
- Handles scanned document pages embedded in Word files
- Converts image content to searchable, editable Markdown
Use when you need to: OCR a Word document with scanned
Uses OCR, via MinerU, to extract text from Word (.docx) files that contain scanned pages or embedded image content.
npm install -g mineru-open-api
A token is required:
mineru-open-api auth # interactive token setup
export MINERU_TOKEN=your-token # or set it via an environment variable
Create a token at: https://mineru.net/apiManage/token
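Before running any extraction, it can help to confirm that the install and token steps above took effect. The following is a minimal sketch, not part of the skill itself: the function name `mineru_preflight` is hypothetical, and it only inspects the `MINERU_TOKEN` environment variable. It assumes that a token stored by the interactive `mineru-open-api auth` command is not visible this way, so a "not set" message does not necessarily mean authentication is missing.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (not provided by mineru-open-api).
# Reports whether the CLI is on PATH and whether MINERU_TOKEN is set.

mineru_preflight() {
  local status=0

  # command -v is a standard way to test whether an executable is on PATH.
  if command -v mineru-open-api >/dev/null 2>&1; then
    echo "cli: found"
  else
    echo "cli: missing (install with: npm install -g mineru-open-api)"
    status=1
  fi

  # Only the environment variable is checked here; a token saved by
  # `mineru-open-api auth` is assumed to live elsewhere.
  if [ -n "${MINERU_TOKEN:-}" ]; then
    echo "token: set via MINERU_TOKEN"
  else
    echo "token: MINERU_TOKEN not set (run: mineru-open-api auth, or export MINERU_TOKEN)"
    status=1
  fi

  return "$status"
}
```

Calling `mineru_preflight` in a shell session prints one line per check and returns a non-zero status if either check fails, which makes it usable as a guard at the top of a script.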
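This skill can be installed through conversation on the following platforms:
Help me install SkillHub and the doc-ocr-1775986809 skill
Set SkillHub as my preferred skill installation source, then help me install the doc-ocr-1775986809 skill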
skillhub install doc-ocr-1775986809
File size: 1.92 KB | Published: 2026-4-13 10:04