DocuClaw Skill
DocuClaw provides a sovereign data infrastructure for processing and archiving documents. It uses multimodal LLMs to extract structured information from scans, photos, and emails, storing everything in human-readable, version-controllable Markdown files.
Use Cases
- Expense Management: Extract totals, taxes, and dates from receipts for tax filing.
- Contract Analysis: Monitor expiration dates and renewal clauses in legal documents.
- Sovereign Archival: Maintain a local-first, GDPR/GoBD-compliant archive of all physical and digital mail.
- Unified Querying: Ask questions about your document history without cloud exposure.
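As a rough illustration of the expense and contract use cases, the sketch below reads flat `key: value` metadata from the top of archived Markdown files and answers two questions locally. The field names (`type`, `date`, `total`, `expires`, `title`) and the helper functions are assumptions made for this example; they are not taken from DocuClaw's actual schema or API.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import date
from decimal import Decimal


def parse_front_matter(text: str) -> dict[str, str]:
    """Read flat `key: value` pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def totals_by_year(documents: list[str]) -> dict[int, Decimal]:
    """Sum the (hypothetical) `total` field of receipts, grouped by year."""
    sums: dict[int, Decimal] = defaultdict(Decimal)
    for text in documents:
        meta = parse_front_matter(text)
        if meta.get("type") != "receipt":
            continue
        year = date.fromisoformat(meta["date"]).year
        sums[year] += Decimal(meta["total"])
    return dict(sums)


def expiring_contracts(
    documents: list[str], today: date, within_days: int = 30
) -> list[str]:
    """Return titles of contracts whose (hypothetical) `expires` date is near."""
    hits: list[str] = []
    for text in documents:
        meta = parse_front_matter(text)
        if meta.get("type") != "contract" or "expires" not in meta:
            continue
        remaining = (date.fromisoformat(meta["expires"]) - today).days
        if 0 <= remaining <= within_days:
            hits.append(meta.get("title", "untitled"))
    return hits
```

Amounts are kept as `Decimal` rather than `float` so that sums of currency values do not pick up binary rounding errors. The hand-rolled front matter reader only handles flat pairs; nested YAML would need a real YAML parser.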
Key Features
- 100% Local: Zero cloud dependency. Your private data never leaves your hardware.
- Plug-and-Play Parsers: Extensible architecture for country-specific document formats.
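- AI-Powered: Supports Ollama, OpenAI Vision, or any other multimodal model for intelligent extraction. Only a locally hosted model such as Ollama keeps processing fully offline; OpenAI Vision is a hosted API.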
- Markdown Schema: Normalizes all documents into a universal schema with YAML metadata.
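One way to picture the plug-and-play parser idea is a small registry keyed by country code, where each parser normalizes a locale-specific format into the same output shape. Everything here (the `register_parser` decorator, the `PARSERS` table, the toy regexes, and the output keys) is a hypothetical sketch, not DocuClaw's real extension interface.

```python
from __future__ import annotations

import re
from typing import Callable, Dict

# A parser takes raw extracted text and returns normalized string fields.
Parser = Callable[[str], Dict[str, str]]

# Hypothetical registry: country code -> parser function.
PARSERS: dict[str, Parser] = {}


def register_parser(country: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser under an upper-cased country code."""
    def decorator(func: Parser) -> Parser:
        PARSERS[country.upper()] = func
        return func
    return decorator


@register_parser("DE")
def parse_german_receipt(text: str) -> dict[str, str]:
    """Toy parser: German receipts write amounts like 1.234,56."""
    match = re.search(r"Summe\s*:?\s*([\d.]+,\d{2})", text)
    if match is None:
        raise ValueError("no total found")
    total = match.group(1).replace(".", "").replace(",", ".")
    return {"country": "DE", "total": total}


@register_parser("US")
def parse_us_receipt(text: str) -> dict[str, str]:
    """Toy parser: US receipts write amounts like 1,234.56."""
    match = re.search(r"Total\s*:?\s*\$?([\d,]+\.\d{2})", text)
    if match is None:
        raise ValueError("no total found")
    return {"country": "US", "total": match.group(1).replace(",", "")}


def normalize(country: str, text: str) -> dict[str, str]:
    """Dispatch to the registered parser for the given country code."""
    parser = PARSERS.get(country.upper())
    if parser is None:
        raise KeyError(f"no parser registered for {country!r}")
    return parser(text)
```

With this shape, supporting another country means adding one decorated function; callers keep using `normalize` and always receive the same keys.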
Workflow Example
1. Input: A PDF invoice or a photo of a receipt.
2. Process: Run `docuclaw process` to trigger AI extraction.
3. Archive: Document is saved to your local vault as `YYYY/MM/filename.md`.
4. Action: The extracted data is synced to your calendar or accounting tool.
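The archive step can be sketched as follows: derive the `YYYY/MM/filename.md` path from the document date, then write the extracted fields as YAML-style front matter above the body. The function names and the metadata keys are assumptions for illustration; what `docuclaw process` actually does internally is not described here.

```python
from __future__ import annotations

from datetime import date
from pathlib import Path


def archive_path(doc_date: date, filename: str) -> str:
    """Build the vault-relative path YYYY/MM/<stem>.md for a source file."""
    stem = Path(filename).stem
    return f"{doc_date:%Y}/{doc_date:%m}/{stem}.md"


def render_markdown(meta: dict[str, str], body: str) -> str:
    """Render flat metadata as front matter followed by the document body.

    Values are written verbatim; a real implementation would use a YAML
    library so that special characters are quoted correctly.
    """
    front = "\n".join(f"{key}: {value}" for key, value in meta.items())
    return f"---\n{front}\n---\n\n{body.strip()}\n"


def save_to_vault(
    vault: Path, doc_date: date, filename: str, meta: dict[str, str], body: str
) -> Path:
    """Write the rendered document into the vault, creating folders as needed."""
    target = vault / archive_path(doc_date, filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_markdown(meta, body), encoding="utf-8")
    return target
```

For example, `save_to_vault(Path("vault"), date(2024, 3, 5), "invoice.pdf", {"type": "invoice", "total": "119.00"}, "Invoice text")` would create `vault/2024/03/invoice.md`. Because the result is plain text, the vault can be tracked with any version control system.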
Integration
DocuClaw is designed to work seamlessly with the OpenClaw ecosystem, allowing AI agents to perform RAG (Retrieval-Augmented Generation) over your local document archive.
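To make the RAG idea concrete, here is a minimal retrieval sketch over a vault of Markdown files. It ranks documents by simple keyword overlap, which stands in for the embedding-based search a real agent would more likely use, and then assembles a prompt from the top matches. None of these function names come from DocuClaw or OpenClaw; they are hypothetical.

```python
from __future__ import annotations

import re
from collections import Counter
from pathlib import Path


def load_vault(vault: Path) -> dict[str, str]:
    """Map each vault-relative Markdown path to its text."""
    return {
        p.relative_to(vault).as_posix(): p.read_text(encoding="utf-8")
        for p in sorted(vault.rglob("*.md"))
    }


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, documents: dict[str, str], k: int = 3) -> list[str]:
    """Return up to k document paths ranked by keyword overlap with the query."""
    query_terms = Counter(tokenize(query))
    scored: list[tuple[int, str]] = []
    for path, text in documents.items():
        words = Counter(tokenize(text))
        score = sum(min(n, words[term]) for term, n in query_terms.items())
        if score > 0:
            scored.append((score, path))
    # Highest score first; ties are broken by path for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:k]]


def build_prompt(query: str, documents: dict[str, str], k: int = 3) -> str:
    """Assemble a prompt that contains only the retrieved local documents."""
    context = "\n\n".join(
        f"[{path}]\n{documents[path]}" for path in retrieve(query, documents, k)
    )
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {query}"
```

The prompt returned by `build_prompt` can be passed to whichever model the agent uses. If that model runs locally, the retrieved document text stays on the machine.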