Tree-Graph Hybrid RAG

This skill teaches Claude how to build the database layer of a Tree-Graph Hybrid RAG system. It focuses on the integration seam between PageIndex-style tree output and LightRAG-style graph extraction, both stored in PostgreSQL.

Core Philosophy

- Tree (Macro): Represents the document's native hierarchy. Gives the LLM the structural skeleton (Chapter -> Section).
- Graph (Micro): Represents Entities and Relationships. Gives the LLM cross-document, fine-grained factual connections.
- Fusion: Every node and edge in the Graph is anchored to a specific node_id in the Tree, enabling bidirectional traversal (from graph detail to tree context, or tree context to graph detail).
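
The anchoring idea can be sketched in a few lines of Python. This is only an illustration of the data shape: the class names, field names other than node_id and parent_id, and the sample values are assumptions made for the example, not the contents of schema.sql.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TreeNode:
    node_id: str
    parent_id: Optional[str]  # None marks a root (e.g. a chapter)
    title: str


@dataclass(frozen=True)
class Entity:
    name: str
    node_id: str  # anchor: the tree node where this entity was extracted


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    label: str
    node_id: str  # anchor: the tree node that states this relationship


# Made-up sample data, for illustration only.
nodes = {
    "ch3": TreeNode("ch3", None, "Chapter 3: Financials"),
    "ch3.1": TreeNode("ch3.1", "ch3", "3.1 Revenue"),
}
entities = [Entity("Acme Corp", "ch3.1"), Entity("FY2023 revenue", "ch3.1")]
relationships = [Relationship("Acme Corp", "FY2023 revenue", "reported", "ch3.1")]

# Tree -> graph: index graph elements by their anchor node.
entities_by_node = defaultdict(list)
for entity in entities:
    entities_by_node[entity.node_id].append(entity.name)

# Graph -> tree: follow the anchor, then the parent pointer.
anchor = nodes[relationships[0].node_id]
parent = nodes[anchor.parent_id]
```

Because every graph element carries a node_id, either direction of traversal is a plain key lookup rather than a search through nested JSON.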

Bundled Resources

This skill includes the minimum resources needed to teach Claude the database design and data flow:

- schema.sql: The complete PostgreSQL table definitions required for this architecture.
- ingestioncore.py: Python script demonstrating how to flatten the Tree JSON into Postgres and how to extract graph entities anchored to the tree.
- retrievalcore.py: Python script demonstrating the hybrid retrieval logic (querying the Graph to find Tree node_ids, then extracting the macro context).
- smoketest.py: Minimal no-database smoke test that validates the ingestion and retrieval flow with a fake pool.
- integration-pattern.md: Explains what this skill covers, what it intentionally does not reimplement, and where it should sit in a real service.
- queries.md: Common SQL patterns for loading skeletons, anchoring graph hits, and assembling answer context.

Standard Workflows

1. Indexing Workflow

1. Tree Extraction: Extract headers/TOC. Save the skeleton to nodes and the text to node_contents.
2. Graph Extraction: Pass each node's text from node_contents to an LLM to extract entities and relations.
3. Anchoring: Save entities/relations with their corresponding node_id as a foreign key.
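
The three indexing steps could look roughly like the sketch below. It is not the bundled ingestioncore.py: the input JSON shape (keys node_id, title, summary, text, nodes), the row dictionaries, and the fake_extract stand-in for the LLM call are all assumptions chosen to keep the example runnable without a database or a model.

```python
from typing import Callable, Optional


def flatten_tree(tree: dict, workspace: str, parent_id: Optional[str] = None):
    """Walk a nested tree and return (node_rows, content_rows).

    node_rows hold only the skeleton; content_rows hold the full text.
    """
    node_id = tree["node_id"]
    node_rows = [{
        "workspace": workspace,
        "node_id": node_id,
        "parent_id": parent_id,
        "title": tree["title"],
        "summary": tree.get("summary", ""),
    }]
    content_rows = []
    if tree.get("text"):
        content_rows.append(
            {"workspace": workspace, "node_id": node_id, "content": tree["text"]}
        )
    for child in tree.get("nodes", []):
        child_nodes, child_contents = flatten_tree(child, workspace, node_id)
        node_rows.extend(child_nodes)
        content_rows.extend(child_contents)
    return node_rows, content_rows


def anchor_graph(content_rows, extract: Callable[[str], dict]):
    """Run an extractor over each node's text and stamp results with node_id."""
    entity_rows, relation_rows = [], []
    for row in content_rows:
        result = extract(row["content"])
        anchor = {"workspace": row["workspace"], "node_id": row["node_id"]}
        entity_rows += [{**anchor, "name": name} for name in result["entities"]]
        relation_rows += [
            {**anchor, "source": s, "relation": r, "target": t}
            for s, r, t in result["relations"]
        ]
    return entity_rows, relation_rows


def fake_extract(text: str) -> dict:
    """Hypothetical stand-in for the LLM call; a real one would prompt a model."""
    if "Acme" in text:
        return {"entities": ["Acme"], "relations": [("Acme", "reported", "revenue")]}
    return {"entities": [], "relations": []}


# Made-up sample document in the assumed input shape.
doc = {
    "node_id": "ch3",
    "title": "Chapter 3: Financials",
    "summary": "Financial results",
    "nodes": [
        {"node_id": "ch3.1", "title": "3.1 Revenue",
         "text": "Acme reported revenue growth."},
    ],
}
node_rows, content_rows = flatten_tree(doc, workspace="demo")
entity_rows, relation_rows = anchor_graph(content_rows, fake_extract)
```

The resulting row lists map one-to-one onto INSERT statements for nodes, node_contents, entities, and relationships; the skeleton rows never carry the full text.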

2. Retrieval Workflow

1. Entity/Relation Search: Extract keywords from the user query. Search the entities and relationships tables to find matching factual details.
2. Anchor Resolution: Get the node_ids associated with the matched graph elements.
3. Contextualization (Tree Traversal): Query the nodes table using the node_ids. Traverse up (parent_id) to gather the section titles and summaries.
4. Content Fetch: Retrieve the full text from node_contents only for the required nodes.
5. Synthesis: Feed the LLM a prompt containing:
   - Found Entities & Relations
   - Tree Context (e.g., "This was mentioned in Chapter 3: Financials")
   - Raw Text Chunks
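
Steps 3 to 5 can be sketched in plain Python once the graph hits and the relevant rows are in memory. The function names, the dictionary shapes, and the prompt layout below are illustrative assumptions, not the bundled retrievalcore.py.

```python
from typing import Optional


def ancestor_path(nodes: dict, node_id: Optional[str]) -> list:
    """Return titles from the root down to node_id by following parent_id."""
    path, seen = [], set()
    current = node_id
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental cycles in parent_id
        node = nodes[current]
        path.append(node["title"])
        current = node["parent_id"]
    path.reverse()
    return path


def build_prompt(question: str, hits: list, nodes: dict, contents: dict) -> str:
    """Assemble graph facts, tree context, and raw text for the anchored nodes."""
    facts = [
        f'{hit["source"]} -[{hit["relation"]}]-> {hit["target"]}' for hit in hits
    ]
    context, chunks = [], []
    # dict.fromkeys removes duplicate node_ids while keeping first-seen order.
    for node_id in dict.fromkeys(hit["node_id"] for hit in hits):
        context.append(" > ".join(ancestor_path(nodes, node_id)))
        if node_id in contents:  # fetch text only for the anchored nodes
            chunks.append(contents[node_id])
    return "\n".join([
        "Question: " + question,
        "Facts:", *facts,
        "Tree context:", *context,
        "Text:", *chunks,
    ])


# Made-up sample rows, as they might come back from the nodes,
# node_contents, and relationships tables.
nodes = {
    "ch3": {"title": "Chapter 3: Financials", "parent_id": None},
    "ch3.1": {"title": "3.1 Revenue", "parent_id": "ch3"},
}
contents = {"ch3.1": "Acme reported revenue growth."}
hits = [
    {"source": "Acme", "relation": "reported", "target": "revenue",
     "node_id": "ch3.1"},
]
prompt = build_prompt("What did Acme report?", hits, nodes, contents)
```

In PostgreSQL the upward walk can also be done server-side with a WITH RECURSIVE query over parent_id, which avoids one round trip per ancestor.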

Output Expectations

When this skill is triggered, prefer producing:

1. PostgreSQL DDL or migration SQL
2. Tree-flattening ingestion code
3. Graph anchoring logic tied to node_id
4. Retrieval SQL that starts from graph hits and resolves back to tree context
5. Clear explanation of why this database design is preferable to storing one giant nested JSON blob

Developer Guidelines

- Always enforce bone-meat separation (skeleton in nodes, full text in node_contents): Never store massive text chunks in the nodes or entities tables.
- Always maintain multi-tenancy: Ensure every query filters by workspace.
- When users ask to implement a retrieval function, write SQL queries that join relationships -> nodes -> node_contents to demonstrate the hybrid power.
- Do not build a full product scaffold inside the skill. Keep the focus on database design, ingestion, anchoring, and retrieval patterns.
- Do not rewrite PageIndex or LightRAG in full inside the skill. Reuse their existing pipelines and apply this skill at the integration seam.
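
A workspace-scoped join across relationships -> nodes -> node_contents might look like the sketch below. To stay runnable without a server it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and every column other than workspace, node_id, and parent_id is an assumption rather than the schema.sql definition. With a PostgreSQL driver such as psycopg the placeholders would be %s instead of ?, and ILIKE would give case-insensitive matching.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    workspace TEXT NOT NULL,
    node_id   TEXT NOT NULL,
    parent_id TEXT,
    title     TEXT NOT NULL,
    PRIMARY KEY (workspace, node_id)
);
CREATE TABLE node_contents (
    workspace TEXT NOT NULL,
    node_id   TEXT NOT NULL,
    content   TEXT NOT NULL,
    PRIMARY KEY (workspace, node_id),
    FOREIGN KEY (workspace, node_id) REFERENCES nodes (workspace, node_id)
);
CREATE TABLE relationships (
    workspace TEXT NOT NULL,
    source    TEXT NOT NULL,
    relation  TEXT NOT NULL,
    target    TEXT NOT NULL,
    node_id   TEXT NOT NULL,
    FOREIGN KEY (workspace, node_id) REFERENCES nodes (workspace, node_id)
);
""")

# Made-up rows in two workspaces, to show that the filter isolates tenants.
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", [
    ("demo", "ch3", None, "Chapter 3: Financials"),
    ("demo", "ch3.1", "ch3", "3.1 Revenue"),
    ("other", "ch1", None, "Chapter 1: Overview"),
])
conn.executemany("INSERT INTO node_contents VALUES (?, ?, ?)", [
    ("demo", "ch3.1", "Acme reported revenue growth."),
    ("other", "ch1", "Acme is mentioned in another tenant."),
])
conn.executemany("INSERT INTO relationships VALUES (?, ?, ?, ?, ?)", [
    ("demo", "Acme", "reported", "revenue", "ch3.1"),
    ("other", "Acme", "mentioned_in", "overview", "ch1"),
])

# Start from graph hits, resolve the anchor node, then pull its text.
# The workspace column is part of every join condition and the WHERE clause.
HYBRID_QUERY = """
SELECT r.source, r.relation, r.target, n.title, c.content
FROM relationships AS r
JOIN nodes AS n
  ON n.workspace = r.workspace AND n.node_id = r.node_id
JOIN node_contents AS c
  ON c.workspace = n.workspace AND c.node_id = n.node_id
WHERE r.workspace = ?
  AND (r.source LIKE ? OR r.target LIKE ?)
"""


def hybrid_lookup(workspace: str, keyword: str) -> list:
    """Return (source, relation, target, title, content) rows for one tenant."""
    pattern = f"%{keyword}%"
    return conn.execute(HYBRID_QUERY, (workspace, pattern, pattern)).fetchall()


rows = hybrid_lookup("demo", "Acme")
```

Keeping workspace in the join conditions as well as the WHERE clause means a node_id that happens to exist in two tenants can never cross-match.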

