Obsidian Wiki
Implements Karpathy's LLM Wiki pattern inside an Obsidian vault.
The agent is the compiler; Obsidian is the IDE; the wiki is the codebase.
Vault Resolution
Resolve the vault path at runtime. Store it in $VAULT for all operations.
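Preferred methods (in order):
1. If the user specifies a vault path, use it directly.
2. Read the Obsidian config file to find the open vault:

```bash
python3 -c "
import json, pathlib, os
# Obsidian stores vault config in platform-dependent locations
for p in [
    'Library/Application Support/obsidian/obsidian.json',
    '.config/obsidian/obsidian.json',
    '.var/app/md.obsidian.Obsidian/config/obsidian/obsidian.json',
]:
    f = pathlib.Path.home() / p
    if f.exists():
        for v in json.loads(f.read_text()).get('vaults', {}).values():
            if v.get('open'): print(v['path']); break
        break
"
```

3. Use obsidian-cli print-default --path-only if available.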
The skill and scripts make no assumptions about the vault location. All paths
are relative to $VAULT. Scripts require Bash and python3; they use BSD-compatible
grep/sed/awk and are tested on macOS and Linux.
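The first two resolution steps can also be sketched as a small Python function. This is a minimal sketch, not the skill's own code: the function name `resolve_vault` and the `home` parameter (added so the lookup can be pointed at a test directory) are hypothetical, and the obsidian-cli fallback is left out.

```python
import json
from pathlib import Path
from typing import Optional

# Config locations relative to the home directory (macOS, Linux, Flatpak).
CONFIG_CANDIDATES = [
    "Library/Application Support/obsidian/obsidian.json",
    ".config/obsidian/obsidian.json",
    ".var/app/md.obsidian.Obsidian/config/obsidian/obsidian.json",
]


def resolve_vault(explicit: Optional[str] = None,
                  home: Optional[Path] = None) -> Optional[Path]:
    """Return the vault path: an explicit path wins, else the open vault."""
    if explicit:
        return Path(explicit).expanduser()
    home = home or Path.home()
    for rel in CONFIG_CANDIDATES:
        config = home / rel
        if not config.exists():
            continue
        vaults = json.loads(config.read_text(encoding="utf-8")).get("vaults", {})
        for entry in vaults.values():
            if entry.get("open"):
                return Path(entry["path"])
        return None  # only the first config file found is consulted
    return None
```

A caller would export the result as $VAULT and fall back to asking the user when the function returns `None`.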
Vault Layout
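```
$VAULT/
├── raw/                  # immutable source documents
│   ├── articles/         # web clips, blog posts
│   ├── papers/           # arXiv, IEEE, ACM papers
│   ├── projects/         # project notes, meeting minutes
│   ├── books/            # book chapters, excerpts
│   └── assets/           # images, diagrams (downloaded locally)
├── wiki/                 # LLM-compiled pages (agent-owned)
│   ├── entities/         # people, systems, projects, organizations
│   ├── concepts/         # ideas, patterns, techniques, methods
│   ├── syntheses/        # cross-domain summaries, comparisons
│   ├── sources/          # one summary page per ingested source
│   ├── reports/          # agent-generated dashboards (created in the Maintain workflow)
│   ├── index.md          # auto-generated table of contents
│   └── log.md            # append-only chronological record
├── _meta/
│   ├── schema.md         # wiki conventions (co-evolved with the user)
│   └── taxonomy.md       # canonical tag vocabulary
├── .wiki-meta/           # machine state (not for humans)
│   └── manifest.json     # delta tracking: ingested files + SHA-256 hashes
├── AGENTS.md             # agent instructions for this vault
└── .obsidian/            # Obsidian config (do not touch)
```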
Ownership rules
- raw/: immutable. The agent reads but never modifies.
- wiki/: agent-owned. The agent creates, updates, and maintains all pages.
- _meta/: co-owned. The agent proposes changes; the user approves.
- .wiki-meta/: machine-only. Delta tracking, caches.
- Everything else in the vault is untouched.
Page Format
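```markdown
---
title: <Page Title>
type: entity | concept | synthesis | source | report
tags: [from taxonomy.md]
sources: [raw/path/to/source.md]
created: YYYY-MM-DD
updated: YYYY-MM-DD
confidence: high | medium | low              # optional
status: active | review | stale | archived   # optional
---

# <Page Title>

Content with [[wikilinks]] to other wiki pages.

## Open Questions

## Sources
- raw/path/to/source.md: what this source contributed
```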
Provenance markers
Use Obsidian comment syntax %%...%% (invisible in reading view, visible in edit mode):
- %%from: raw/path/to/source.md%% marks a claim extracted from this source.
- %%inferred%% marks LLM synthesis across multiple sources.
- %%ambiguous: explanation%% marks a point where sources disagree.
Page-level provenance goes in frontmatter sources: field.
Inline provenance is optional, for granular paragraph-level attribution.
Do NOT use ^[...] — that is Obsidian's inline footnote syntax.
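To illustrate how inline markers could be read back out of a page, here is a minimal sketch. The helper name `provenance_markers` and the regular expression are illustrative assumptions, not part of the skill's scripts.

```python
import re

# Matches %%from: <path>%%, %%inferred%%, and %%ambiguous: <explanation>%%.
MARKER = re.compile(r"%%\s*(from|inferred|ambiguous)\s*(?::\s*(.*?))?\s*%%")


def provenance_markers(text: str) -> list[tuple[str, str]]:
    """Return (kind, detail) pairs in document order.

    detail is the source path or explanation, or '' for %%inferred%%.
    """
    return [(m.group(1), m.group(2) or "") for m in MARKER.finditer(text)]
```

Run over a page body, this yields one tuple per marker, which is enough to check that every `from` path also appears in the frontmatter sources: field.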
Wikilink rules
- Obsidian resolves wikilinks by FILENAME only, not by title or aliases.
- Always write links as [[filename|Display Title]]. Example: [[convolutional-neural-network|Convolutional Neural Network]]
- In markdown tables, the | in wikilinks conflicts with table column separators. Prefer bullet lists over tables when cells contain wikilinks. If a table is necessary, use \| inside the wikilink: [[filename\|Title]]. The lint and fix scripts handle this escaping transparently.
- Reference raw sources as plain paths in frontmatter sources: and in ## Sources sections.
- For sections: [[filename#Section|Display]]
- Run scripts/fix-wikilinks.py "$VAULT" after creating pages to auto-rewrite any [[Title]] links to [[filename|Title]] format.
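The rewrite rule can be sketched as follows. This is not the code of scripts/fix-wikilinks.py: the function `rewrite_links`, the `title_to_filename` mapping, and the leading-pipe heuristic for detecting table rows are all assumptions made for illustration.

```python
import re

# Bare [[Title]] links only: no alias (|) and no section (#).
BARE_LINK = re.compile(r"\[\[([^\[\]|#]+?)\]\]")


def rewrite_links(text: str, title_to_filename: dict[str, str]) -> str:
    """Rewrite [[Title]] to [[filename|Title]], escaping the pipe in table rows."""
    out = []
    for line in text.splitlines(keepends=True):
        # Heuristic: a line starting with | is treated as a table row.
        pipe = "\\|" if line.lstrip().startswith("|") else "|"

        def repl(match: re.Match, pipe: str = pipe) -> str:
            title = match.group(1).strip()
            filename = title_to_filename.get(title)
            if filename is None:
                return match.group(0)  # unknown title: leave untouched
            return f"[[{filename}{pipe}{title}]]"

        out.append(BARE_LINK.sub(repl, line))
    return "".join(out)
```

Links that already carry an alias or a section anchor are skipped by the pattern, so running the function twice gives the same result as running it once.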
Workflows
1. Setup
Initialize the vault structure. Create dirs, then:
- Copy references/schema-template.md → _meta/schema.md
- Copy references/taxonomy-template.md → _meta/taxonomy.md
- Copy references/agents-template.md → AGENTS.md
- Customize all three for the vault's domain
- Create wiki/index.md, wiki/log.md
Source of truth: once copied, _meta/ and AGENTS.md are the live instances.
references/ are generic starter templates — they do not stay in sync.
2. Ingest
1. Run scripts/wiki-manifest.sh "$VAULT" diff to see pending sources
2. For each pending file:
   - Read the source, identify entities, concepts, claims, relationships
   - Create a source summary in wiki/sources/
   - Create or update entity/concept pages with [[wikilinks]]
   - Track provenance in frontmatter sources: field
3. Mark ingested: scripts/wiki-manifest.sh "$VAULT" mark
4. Regenerate index: scripts/wiki-index.sh "$VAULT"
5. Append to wiki/log.md:

```markdown
## [YYYY-MM-DD] ingest | <source title>
- Source: raw/<path>
- Pages created: <list>
- Pages updated: <list>
```
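The final logging step could be automated along these lines. This is a hedged sketch: `format_log_entry` and `append_log` are hypothetical helpers, and the English field labels ("Source", "Pages created", "Pages updated") are an assumed rendering of the log entry template.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def format_log_entry(title: str, source: str, created: list[str],
                     updated: list[str], day: Optional[date] = None) -> str:
    """Build one ingest entry in the wiki/log.md format."""
    day = day or date.today()
    return (
        f"## [{day.isoformat()}] ingest | {title}\n"
        f"- Source: {source}\n"
        f"- Pages created: {', '.join(created) or 'none'}\n"
        f"- Pages updated: {', '.join(updated) or 'none'}\n"
    )


def append_log(vault: Path, entry: str) -> None:
    """Append an entry to wiki/log.md; the log is never rewritten."""
    log = vault / "wiki" / "log.md"
    log.parent.mkdir(parents=True, exist_ok=True)
    with log.open("a", encoding="utf-8") as fh:
        fh.write("\n" + entry)
```

Opening the file in append mode keeps the log append-only, which matches its role as a chronological record.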
Source types:
- .md / .txt: read directly with read tool
- .pdf: use pdf tool to extract content (copy to workspace first if the file is outside it, then clean up)
- .epub / .html: convert to text or extract content before ingesting
- URL: use web_fetch to save as .md in raw/articles/, then ingest the .md
- Images in raw/assets/: referenced by pages, not ingested independently (the manifest tracks only ingestable document sources)
3. Query
1. Read wiki/index.md to find relevant pages
2. Read relevant pages, synthesize an answer with [[wikilink]] citations
3. Offer to save valuable answers as new pages in wiki/syntheses/
4. Lint
Run scripts/wiki-lint.sh "$VAULT" for automated checks (frontmatter, broken
wikilinks, orphans, stale pages, tag drift). Then manually review for:
- Contradictions between pages (requires reading, not scriptable)
- Missing pages: concepts mentioned in text but lacking their own page
- Weak cross-references that should be strengthened
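Two of the automated checks, broken wikilinks and orphan pages, can be approximated in a few lines. This sketch is not scripts/wiki-lint.sh: the function `check_links` is hypothetical, and it assumes page filenames are unique across the wiki/ subfolders and that index.md and log.md should not count as inbound links.

```python
import re
from pathlib import Path

# Captures the filename part of a wikilink, stopping at |, \| or #.
LINK_TARGET = re.compile(r"\[\[([^\]|#\\]+)")
GENERATED = {"index", "log"}  # generated pages, ignored for orphan detection


def check_links(vault: Path) -> tuple[list[tuple[str, str]], list[str]]:
    """Return (broken, orphans) for all pages under wiki/.

    broken is a list of (page, missing_target); orphans lists pages
    that no other regular page links to.
    """
    pages = {p.stem: p for p in (vault / "wiki").rglob("*.md")}
    broken, linked = [], set()
    for stem in sorted(pages):
        text = pages[stem].read_text(encoding="utf-8")
        for target in LINK_TARGET.findall(text):
            target = target.strip()
            if target not in pages:
                broken.append((stem, target))
            elif stem not in GENERATED and target != stem:
                linked.add(target)
    orphans = sorted(s for s in pages if s not in linked and s not in GENERATED)
    return broken, orphans
```

Resolution is by filename stem only, mirroring the wikilink rule that Obsidian resolves links by filename.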
5. Maintain
Periodic (heartbeat or manual):
1. Run lint
2. Scan for unlinked mentions of entity/concept names → add [[wikilinks]]
3. Update taxonomy if new tags emerged
4. Populate wiki/reports/ with dashboards (open questions, contradictions, stale)
5. Review stale pages, flag for update or archival
6. Navigate
Use the obsidian skill (if available) for CLI operations (search, open, move/rename with
wikilink refactoring). For bulk reads/writes, use read/write tools directly.
Scripts
All scripts require Bash (BSD grep/sed/awk, python3 for JSON/dates).
Resolve <skill-dir> to the directory containing this SKILL.md.
scripts/wiki-index.sh "$VAULT"
Regenerate wiki/index.md from frontmatter of all wiki pages.
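As a rough illustration of building an index from frontmatter, consider the sketch below. The actual layout produced by wiki-index.sh is not documented here, so the grouping by type, the "# Index" heading, and the helper names are assumptions. The parser handles only flat key: value pairs between --- delimiters.

```python
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat 'key: value' frontmatter between --- lines (no nested YAML)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # unterminated frontmatter is treated as missing


def build_index(vault: Path) -> str:
    """Return index text that lists every wiki page grouped by its type field."""
    by_type: dict[str, list[str]] = {}
    for page in sorted((vault / "wiki").rglob("*.md")):
        if page.name in {"index.md", "log.md"}:
            continue
        meta = read_frontmatter(page.read_text(encoding="utf-8"))
        title = meta.get("title", page.stem)
        entry = f"- [[{page.stem}|{title}]]"
        by_type.setdefault(meta.get("type", "untyped"), []).append(entry)
    parts = ["# Index"]
    for kind in sorted(by_type):
        parts += ["", f"## {kind}", *by_type[kind]]
    return "\n".join(parts) + "\n"
```

Each entry uses the [[filename|Title]] form so the generated index follows the same wikilink rules as hand-written pages.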
scripts/wiki-lint.sh "$VAULT"
Structural health checks: missing frontmatter, broken wikilinks, orphan pages,
stale content, tag drift. Outputs a summary with per-category counts.
scripts/wiki-manifest.sh "$VAULT" <command>
Delta tracking via SHA-256 hashes in .wiki-meta/manifest.json.
CODEBLOCK4
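The idea behind hash-based delta tracking can be sketched as follows. The real schema of manifest.json is not shown in this document, so the flat mapping from relative path to SHA-256 digest used here is an assumption, as are the helper names and the set of ingestable extensions.

```python
import hashlib
import json
from pathlib import Path

# Assumed set of ingestable document types; images are deliberately excluded.
INGESTABLE = {".md", ".txt", ".pdf", ".epub", ".html"}


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large PDFs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_manifest(vault: Path) -> dict[str, str]:
    path = vault / ".wiki-meta" / "manifest.json"
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def pending_sources(vault: Path) -> list[str]:
    """List raw/ documents that are new or whose content hash has changed."""
    known = load_manifest(vault)
    pending = []
    for f in sorted((vault / "raw").rglob("*")):
        if f.is_file() and f.suffix.lower() in INGESTABLE:
            rel = f.relative_to(vault).as_posix()
            if known.get(rel) != sha256_of(f):
                pending.append(rel)
    return pending


def mark_ingested(vault: Path, rel_paths: list[str]) -> None:
    """Record the current hash of each ingested file in the manifest."""
    known = load_manifest(vault)
    for rel in rel_paths:
        known[rel] = sha256_of(vault / rel)
    path = vault / ".wiki-meta" / "manifest.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(known, indent=2, sort_keys=True), encoding="utf-8")
```

Comparing content hashes rather than modification times means a file that is touched but unchanged is not re-ingested, while an edited source shows up as pending again.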
scripts/fix-wikilinks.py "$VAULT" [--dry-run]
Rewrite [[Title]] links to [[filename|Title]] format for Obsidian resolution.
Run after bulk page creation. Use --dry-run to preview without writing.
scripts/extract-book-digests.sh <books-dir> <output-dir>
Extract first 12 pages of each PDF as text via pdftotext. Used for cross-validating
wiki source pages against actual book content.
Tips
- Ingest one source at a time for best quality. Stay involved.
- Good answers → wiki pages. Don't let syntheses disappear into chat history.
- Graph view in Obsidian shows wiki shape: hubs, orphans, clusters.
- Dataview plugin queries frontmatter if installed.
- Git the vault for version history (recommended).
- Web Clipper browser extension gets articles into raw/ fastest.
Obsidian Wiki
Implements Karpathy's LLM Wiki pattern inside an Obsidian vault.
The agent is the compiler; Obsidian is the IDE; the wiki is the codebase.
Vault Resolution
Resolve the vault path at runtime. Store it in $VAULT for all operations.
Preferred methods (in order):
1. If the user specifies a vault path, use it directly.
2. Read the Obsidian config file to find the open vault:

```bash
python3 -c "
import json, pathlib, os
# Obsidian stores vault config in platform-dependent locations
for p in [
    'Library/Application Support/obsidian/obsidian.json',
    '.config/obsidian/obsidian.json',
    '.var/app/md.obsidian.Obsidian/config/obsidian/obsidian.json',
]:
    f = pathlib.Path.home() / p
    if f.exists():
        for v in json.loads(f.read_text()).get('vaults', {}).values():
            if v.get('open'): print(v['path']); break
        break
"
```

3. Use obsidian-cli print-default --path-only if available.
The skill and scripts make no assumptions about the vault location. All paths
are relative to $VAULT. Scripts require Bash and python3; they use BSD-compatible
grep/sed/awk and are tested on macOS and Linux.
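Vault Layout

```
$VAULT/
├── raw/                  # immutable source documents
│   ├── articles/         # web clips, blog posts
│   ├── papers/           # arXiv, IEEE, ACM papers
│   ├── projects/         # project notes, meeting minutes
│   ├── books/            # book chapters, excerpts
│   └── assets/           # images, diagrams (downloaded locally)
├── wiki/                 # LLM-compiled pages (agent-owned)
│   ├── entities/         # people, systems, projects, organizations
│   ├── concepts/         # ideas, patterns, techniques, methods
│   ├── syntheses/        # cross-domain summaries, comparisons
│   ├── sources/          # one summary page per ingested source
│   ├── reports/          # agent-generated dashboards (created in the Maintain workflow)
│   ├── index.md          # auto-generated table of contents
│   └── log.md            # append-only chronological record
├── _meta/
│   ├── schema.md         # wiki conventions (co-evolved with the user)
│   └── taxonomy.md       # canonical tag vocabulary
├── .wiki-meta/           # machine state (not for humans)
│   └── manifest.json     # delta tracking: ingested files + SHA-256 hashes
├── AGENTS.md             # agent instructions for this vault
└── .obsidian/            # Obsidian config (do not touch)
```

Ownership rules
- raw/: immutable. The agent reads but never modifies.
- wiki/: agent-owned. The agent creates, updates, and maintains all pages.
- _meta/: co-owned. The agent proposes changes; the user approves.
- .wiki-meta/: machine-only. Delta tracking, caches.
- Everything else in the vault is untouched.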
Page Format

```markdown
---
title: <Page Title>
type: entity | concept | synthesis | source | report
tags: [from taxonomy.md]
sources: [raw/path/to/source.md]
created: YYYY-MM-DD
updated: YYYY-MM-DD
confidence: high | medium | low              # optional
status: active | review | stale | archived   # optional
---

# <Page Title>

Content with [[wikilinks]] to other wiki pages.

## Open Questions

## Sources
- raw/path/to/source.md: what this source contributed
```
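Provenance markers
Use Obsidian comment syntax %%...%% (invisible in reading view, visible in edit mode):
- %%from: raw/path/to/source.md%% marks a claim extracted from this source.
- %%inferred%% marks LLM synthesis across multiple sources.
- %%ambiguous: explanation%% marks a point where sources disagree.
Page-level provenance goes in frontmatter sources: field.
Inline provenance is optional, for granular paragraph-level attribution.
Do NOT use ^[...]; that is Obsidian's inline footnote syntax.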
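Wikilink rules
- Obsidian resolves wikilinks by FILENAME only, not by title or aliases.
- Always write links as [[filename|Display Title]]. Example: [[convolutional-neural-network|Convolutional Neural Network]]
- In markdown tables, the | in wikilinks conflicts with table column separators. Prefer bullet lists over tables when cells contain wikilinks. If a table is necessary, use \| inside the wikilink: [[filename\|Title]]. The lint and fix scripts handle this escaping transparently.
- Reference raw sources as plain paths in frontmatter sources: and in ## Sources sections.
- For sections: [[filename#Section|Display]]
- Run scripts/fix-wikilinks.py "$VAULT" after creating pages to auto-rewrite any [[Title]] links to [[filename|Title]] format.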
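Workflows
1. Setup
Initialize the vault structure. Create dirs, then:
- Copy references/schema-template.md → _meta/schema.md
- Copy references/taxonomy-template.md → _meta/taxonomy.md
- Copy references/agents-template.md → AGENTS.md
- Customize all three for the vault's domain
- Create wiki/index.md, wiki/log.md
Source of truth: once copied, _meta/ and AGENTS.md are the live instances.
references/ are generic starter templates; they do not stay in sync.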
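2. Ingest
1. Run scripts/wiki-manifest.sh "$VAULT" diff to see pending sources
2. For each pending file:
   - Read the source, identify entities, concepts, claims, relationships
   - Create a source summary in wiki/sources/
   - Create or update entity/concept pages with [[wikilinks]]
   - Track provenance in frontmatter sources: field
3. Mark ingested: scripts/wiki-manifest.sh "$VAULT" mark
4. Regenerate index: scripts/wiki-index.sh "$VAULT"
5. Append to wiki/log.md:

```markdown
## [YYYY-MM-DD] ingest | <source title>
- Source: raw/<path>
- Pages created: <list>
- Pages updated: <list>
```

Source types:
- .md / .txt: read directly with read tool
- .pdf: use pdf tool to extract content (copy to workspace first if the file is outside it, then clean up)
- .epub / .html: convert to text or extract content before ingesting
- URL: use web_fetch to save as .md in raw/articles/, then ingest the .md
- Images in raw/assets/: referenced by pages, not ingested independently (the manifest tracks only ingestable document sources)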
3. Query
1. Read wiki/index.md to find relevant pages
2. Read relevant pages, synthesize an answer with [[wikilink]] citations
3. Offer to save valuable answers as new pages in wiki/syntheses/
4. Lint
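Run scripts/wiki-lint.sh "$VAULT" for automated checks (frontmatter, broken
wikilinks, orphans, stale pages, tag drift). Then manually review for:
- Contradictions between pages (requires reading, not scriptable)
- Missing pages: concepts mentioned in text but lacking their own page
- Weak cross-references that should be strengthened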
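5. Maintain
Periodic (heartbeat or manual):
1. Run lint
2. Scan for unlinked mentions of entity/concept names → add [[wikilinks]]
3. Update taxonomy if new tags emerged
4. Populate wiki/reports/ with dashboards (open questions, contradictions, stale)
5. Review stale pages, flag for update or archival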
6. Navigate
Use the obsidian skill (if available) for CLI operations (search, open