Obsidian Clipper
Universal clipper — saves URLs from any platform to your Obsidian Vault with local media, tags, and wikilinks.
Configuration
On first run, read config.yml from the same directory as this SKILL.md file. If missing, tell the user to copy config.yml.example to config.yml and fill in vault.base_path.
Key paths derived from config:
CODEBLOCK0
URL Router
Match the URL and dispatch to the correct handler:
| URL pattern | Handler |
|---|---|
| `x.com/*` or `twitter.com/*` | X Handler |
| `mp.weixin.qq.com/*` | WeChat Handler |
| `xhslink.com/*` or `xiaohongshu.com/*` | Xiaohongshu Handler |
| `v.douyin.com/*` or `douyin.com/video/*` | Douyin Handler |
| `github.com/{owner}/{repo}` (no deeper path) | GitHub Handler |
| Everything else | Web Handler |
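The routing table can be sketched as a small dispatch function. This is a minimal illustration, not the skill's actual implementation: the function name `route`, the returned handler keys, and the choice to strip a leading `www.` are all assumptions.

```python
import re
from urllib.parse import urlparse


def route(url: str) -> str:
    """Return a handler key for a URL, following the routing table above."""
    # Accept bare URLs such as "xhslink.com/abc" by assuming https.
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):  # assumption: treat www.* like the bare domain
        host = host[4:]
    path = parsed.path

    if host in ("x.com", "twitter.com"):
        return "x"
    if host == "mp.weixin.qq.com":
        return "wechat"
    if host in ("xhslink.com", "xiaohongshu.com"):
        return "xiaohongshu"
    if host == "v.douyin.com" or (host == "douyin.com" and path.startswith("/video/")):
        return "douyin"
    # GitHub: exactly /{owner}/{repo}, optionally with a trailing slash.
    if host == "github.com" and re.fullmatch(r"/[^/]+/[^/]+/?", path):
        return "github"
    return "web"
```

A deeper GitHub path such as `/owner/repo/issues/1` falls through to the web handler, matching the "no deeper path" condition.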
Shared Rules
These apply to ALL handlers:
File naming
- Use the content title as the filename
- Strip `/\:*?"<>|` and all emoji / special Unicode
- Truncate if >60 characters
- If a file with the same name exists, ask the user before overwriting
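The naming rules above can be sketched roughly as follows. The helper name `sanitize_filename` is hypothetical, and "special Unicode" is approximated here as symbol, control, and format characters plus emoji modifiers; the source does not define the exact set.

```python
import re
import unicodedata

FORBIDDEN = set('/\\:*?"<>|')


def sanitize_filename(title: str, max_len: int = 60) -> str:
    """Strip forbidden characters and emoji, then truncate to max_len."""
    kept = []
    for ch in title:
        if ch in FORBIDDEN:
            continue
        category = unicodedata.category(ch)
        code = ord(ch)
        # "So" covers most emoji; "C*" covers control/format characters
        # such as the zero-width joiner used in emoji sequences.
        if category == "So" or category.startswith("C"):
            continue
        # Variation selectors and emoji skin-tone modifiers.
        if 0xFE00 <= code <= 0xFE0F or 0x1F3FB <= code <= 0x1F3FF:
            continue
        kept.append(ch)
    name = re.sub(r"\s+", " ", "".join(kept)).strip()
    return name[:max_len].rstrip()
```

The overwrite check is left to the caller, since it requires asking the user.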
Tags
- Every clipping MUST have the `clipping` tag
- No `.` or spaces in tags (`llms.txt` → `llms-txt`, `Claude Code` → `Claude-Code`)
- Add 2-4 content-based tags automatically
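A minimal sketch of the tag rules, assuming hypothetical helpers `normalize_tag` and `build_tags`; choosing the 2-4 content-based tags is left to the caller.

```python
import re


def normalize_tag(tag: str) -> str:
    """Replace dots and whitespace with hyphens, e.g. 'llms.txt' -> 'llms-txt'."""
    return re.sub(r"[.\s]+", "-", tag.strip()).strip("-")


def build_tags(content_tags):
    """Return the tag list with the mandatory 'clipping' tag first, no duplicates."""
    tags = ["clipping"]
    for raw in content_tags:
        tag = normalize_tag(raw)
        if tag and tag not in tags:
            tags.append(tag)
    return tags
```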
Media download
- Download all images and videos to `ATTACHMENTS/`
- Image naming: `{title-slug}-{n}.{ext}` (slug ≤20 chars, no emoji)
- Video naming: `{title-slug}.mp4` or `{title-slug}-video-{n}.mp4`
- Replace remote URLs with Obsidian wikilinks: `![[filename.ext]]`
"Why clipped" field
- Infer from conversation context if possible
- If context is insufficient, ask the user
Cross-platform linking
- If the content contains a `github.com/{owner}/{repo}` link, auto-trigger the GitHub Handler to create a GitHub note, then add bidirectional wikilinks
X Handler
Saves X (Twitter) posts and articles.
Step 1: Fetch post data
CODEBLOCK1
Extract from URL: x.com/{handle}/status/{id} → handle, tweet_id.
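The URL extraction can be sketched as below. The helper names are hypothetical; the `api.fxtwitter.com/{handle}/status/{tweet_id}` endpoint shape is the one this document uses for the fetch step.

```python
import re

# Matches x.com or twitter.com status URLs, with or without a scheme/subdomain.
STATUS_RE = re.compile(r"(?:^|//|\.)(?:x|twitter)\.com/([^/?#]+)/status/(\d+)")


def parse_status_url(url: str):
    """Return (handle, tweet_id) from an X/Twitter status URL."""
    match = STATUS_RE.search(url)
    if match is None:
        raise ValueError(f"not a status URL: {url}")
    return match.group(1), match.group(2)


def fxtwitter_url(handle: str, tweet_id: str) -> str:
    """Build the fxtwitter API URL for a post."""
    return f"https://api.fxtwitter.com/{handle}/status/{tweet_id}"
```

Tracking parameters such as `?s=46` are ignored because the ID pattern stops at the first non-digit.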
Fields:
- `tweet.author.name` → author name
- `tweet.author.screen_name` → @handle
- `tweet.text` → post body (short posts)
- `tweet.article` → long-form article (if present)
- `tweet.article.title` → article title
- `tweet.article.content.blocks` → structured content (Draft.js format)
- `tweet.created_at` → publish date
- `tweet.likes` / `tweet.retweets` / `tweet.views` → engagement
- `tweet.media` → short post images
- `tweet.article.media_entities` → article inline images
Step 2: Determine content type
Short post (tweet.article is null): use tweet.text, images from tweet.media.photos.
Article (tweet.article exists): parse tweet.article.content.blocks (Draft.js):
| Block type | Markdown |
|---|---|
| `unstyled` | paragraph |
| `header-one` | `# heading` |
| `header-two` | `## heading` |
| `header-three` | `### heading` |
| `unordered-list-item` | `- item` |
| `ordered-list-item` | `1. item` |
| `atomic` | insert image `![[filename]]` |
| `blockquote` | `> quote` |
Inline styles: Bold → **text**, Italic → *text*.
For atomic blocks: entityMap → value.data.mediaItems[].mediaId → match media_entities → media_info.original_img_url.
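The block-type mapping can be sketched as a prefix lookup. This is a simplified illustration with several assumptions: each block is taken to be a dict with `type` and `text` keys, inline styles are ignored, and images for `atomic` blocks are consumed in document order from a hypothetical `image_names` list rather than resolved through `entityMap` as described above.

```python
PREFIX = {
    "unstyled": "",
    "header-one": "# ",
    "header-two": "## ",
    "header-three": "### ",
    "unordered-list-item": "- ",
    "ordered-list-item": "1. ",
    "blockquote": "> ",
}


def blocks_to_markdown(blocks, image_names=None):
    """Convert Draft.js-style blocks to Markdown paragraphs."""
    images = iter(image_names or [])
    lines = []
    for block in blocks:
        block_type = block.get("type", "unstyled")
        if block_type == "atomic":
            # Assumption: the n-th atomic block maps to the n-th downloaded image.
            name = next(images, None)
            if name:
                lines.append(f"![[{name}]]")
            continue
        # Unknown block types fall back to a plain paragraph.
        lines.append(PREFIX.get(block_type, "") + block.get("text", ""))
    return "\n\n".join(lines)
```

Every ordered item is emitted as `1.`; Markdown renderers renumber the list automatically.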
Step 3: Download images
Download all images to ATTACHMENTS/.
Step 4: Generate file
Write to X_DIR/{title}.md:
Short post:
CODEBLOCK2
Article:
CODEBLOCK3
Notes
- fxtwitter API needs no auth but may rate-limit; try vxtwitter.com as fallback
- Title: articles use `article.title`; short posts use `{name}-{first 15 characters of text}`
WeChat Handler
Saves WeChat Official Account (公众号) articles.
Step 1: Fetch article data
CODEBLOCK4
Extract: title, nick_name, create_time, content_noencode, link.
If API returns 204 or fails, fall back to defuddle or WebFetch.
Step 2: Fetch HTML (if rich content needed)
If content has images or complex formatting:
CODEBLOCK5
Extract image URLs and formatted content from HTML.
Step 3: Download images
- Download each image to `ATTACHMENTS/`
- WeChat image URLs: `mmbiz.qpic.cn/mmbiz_png/...` → `.png`, `mmbiz_jpg/...` → `.jpg`
- Replace with `![[filename.jpg]]`
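The extension rule can be sketched as a small lookup on the URL path. The helper name is hypothetical, and falling back to `.jpg` when no `mmbiz_{format}` segment is present is an assumption, not something the source specifies.

```python
import re


def wechat_image_ext(url: str) -> str:
    """Infer the file extension from the mmbiz_{format} path segment."""
    match = re.search(r"mmbiz_([a-z0-9]+)/", url)
    fmt = match.group(1) if match else "jpg"  # assumption: default to .jpg
    return ".jpg" if fmt in ("jpg", "jpeg") else "." + fmt
```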
Step 4: Generate file
Write to WECHAT_DIR/{title}.md:
CODEBLOCK6
Xiaohongshu Handler
Saves 小红书 notes (images + video).
Step 1: Resolve short link
If URL is xhslink.com, follow redirects to get the real URL with full query parameters (especially xsec_token):
CODEBLOCK7
Important: The full query string is required. Bare URLs return empty noteDetailMap.
Step 2: Fetch SSR data
Xiaohongshu is an SPA, but SSR embeds data in window.__INITIAL_STATE__:
CODEBLOCK8
Step 3: Download media
- Images: `curl -L -o` each image to INLINECODE91
- Video (if present): `curl -L -o` to INLINECODE93
Step 4: Generate file
Write to XHS_DIR/{title}.md:
CODEBLOCK9
Notes
- If curl can't reach Xiaohongshu, add INLINECODE95
- INLINECODE96 often contains `[话题]#tag[话题]#` markers; strip them for clean text
Douyin Handler
Saves 抖音 videos. Requires douyin-downloader tool — check config.douyin.enabled first. If disabled, tell the user to install douyin-downloader and enable it in config.
Step 1: Extract and resolve link
From share text like 7.94 复制打开抖音...https://v.douyin.com/xxxxx/ DUL:/..., extract the v.douyin.com short link.
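Pulling the short link out of share text can be sketched with a regular expression. The helper name is hypothetical, and the character class for the link code is an assumption about what short codes look like.

```python
import re

SHORT_LINK_RE = re.compile(r"https?://v\.douyin\.com/[A-Za-z0-9_-]+/?")


def extract_short_link(share_text: str):
    """Return the first v.douyin.com short link in the text, or None."""
    match = SHORT_LINK_RE.search(share_text)
    return match.group(0) if match else None
```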
Resolve to full video ID:
CODEBLOCK10
Construct: INLINECODE101
Step 2: Download with douyin-downloader
1. Edit `{config.douyin.tool_path}/config.yml`: set `link` to the full URL
2. Run: `cd {config.douyin.tool_path} && {config.douyin.python} run.py`
3. Find files in `Downloaded/`: `.mp4`, `_cover.jpg`, `_data.json`
Step 3: Extract metadata
From _data.json:
- `desc` → description (title + hashtags)
- INLINECODE110 → author
- INLINECODE111 → likes
- INLINECODE112 → comments
- INLINECODE113 → shares
- INLINECODE114 → Unix timestamp → convert to YYYY-MM-DD
Step 4: Copy media to vault
Copy video and cover to ATTACHMENTS/.
Step 5: Generate file
Write to DOUYIN_DIR/{title}.md:
CODEBLOCK12
Notes
- Short links MUST be resolved first; douyin-downloader cannot handle them
- If download fails (0% success), cookies may be expired; tell the user to re-run the cookie fetcher: `cd {config.douyin.tool_path} && {config.douyin.python} -m tools.cookie_fetcher --config config.yml`
- `create_time` is a Unix timestamp: INLINECODE118
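The timestamp conversion can be sketched as below. The helper name is hypothetical, and defaulting to UTC is an assumption; for posts published late in the day the local date (for example UTC+8) can differ, so the time zone is a parameter.

```python
from datetime import datetime, timedelta, timezone


def unix_to_date(timestamp, tz=timezone.utc) -> str:
    """Convert a Unix timestamp (seconds) to a YYYY-MM-DD string in tz."""
    return datetime.fromtimestamp(int(timestamp), tz=tz).strftime("%Y-%m-%d")


# Example time zone for China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))
```

For instance, `unix_to_date(1700000000)` gives `2023-11-14`, while `unix_to_date(1700000000, CST)` gives `2023-11-15`.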
GitHub Handler
Saves GitHub repositories.
Step 1: Fetch repo metadata
CODEBLOCK14
Extract: name, full_name, description, stargazers_count, language, license.spdx_id, topics, created_at, html_url, homepage.
Step 2: Fetch README
CODEBLOCK15
Decode base64 content to get README markdown.
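Decoding the README can be sketched as follows, assuming the response is the JSON object returned by GitHub's README/contents endpoint, which carries `content` and `encoding` fields. The helper name is hypothetical.

```python
import base64


def decode_readme(payload: dict) -> str:
    """Decode the base64 'content' field of a GitHub contents response."""
    if payload.get("encoding") != "base64":
        raise ValueError(f"unexpected encoding: {payload.get('encoding')!r}")
    # GitHub wraps the base64 text with newlines; b64decode skips them by default.
    return base64.b64decode(payload["content"]).decode("utf-8")
```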
Step 3: Summarize README
Do NOT copy the full README. Extract:
1. Core features: 3-5 bullet points, one sentence each
2. Quick start: only the most essential command or API snippet
3. Notes: limitations, dependencies, special requirements
Step 4: Generate file
Write to GITHUB_DIR/{repo-name}.md:
CODEBLOCK16
Notes
- Star count as a raw number (no `1.2k` formatting), so Dataview can sort on it
- Use `name`, not `full_name`, for the filename
- If the API returns 403 (rate limit), tell the user to wait
- Strip emoji from filenames
Web Handler
Saves generic web pages. Fallback for URLs that don't match any platform handler.
Step 1: Extract content with defuddle
CODEBLOCK17
Extract: title, author, content (HTML), description, published, domain.
If defuddle is not installed, run npm install -g defuddle-cli first.
Step 2: Handle empty content (SPA fallback)
If content is empty or just <body></body> (client-rendered SPA), try fallback paths in order:
Path A: CDP browser (if config.web.cdp_enabled):
1. Open page: `curl -s "{config.web.cdp_url}/new?url=<URL>"` → get INLINECODE145
2. Scroll: INLINECODE146
3. Extract title: INLINECODE147
4. Extract body: INLINECODE148
5. Extract images: INLINECODE149
6. Extract videos: INLINECODE150
7. Close tab: INLINECODE151
Path B: WebFetch — use the WebFetch tool as fallback.
Path C: Bookmark mode — if URL is a web app/tool/dashboard or all paths fail, create a bookmark note from meta info only.
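The empty-content check and the fallback order can be sketched as below. Both helper names and the path keys are hypothetical. Treating any markup without visible text as empty is a simplification: a page consisting only of images would also be flagged.

```python
import re


def is_empty_content(html) -> bool:
    """True if the extracted HTML has no visible text (e.g. '<body></body>')."""
    if not html:
        return True
    text = re.sub(r"<[^>]+>", "", html)  # drop tags, keep text nodes
    return not text.strip()


def fallback_order(cdp_enabled: bool):
    """Return the fallback paths to try, in order."""
    paths = ["webfetch", "bookmark"]
    if cdp_enabled:
        paths.insert(0, "cdp")
    return paths
```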
Step 3: Download media
Download images and videos to ATTACHMENTS/, replace with ![[filename]].
Step 4: Generate file
Write to WEB_DIR/{title}.md:
Content page:
CODEBLOCK18
Bookmark page (tool/app/SPA):
CODEBLOCK19
Notes
- defuddle does not work on SPAs (React/Vue client-rendered); use the CDP path
- Always close CDP tabs after use
Obsidian Clipper
Universal clipper: saves URLs from any platform to your Obsidian vault with local media, tags, and wikilinks.
Configuration
On first run, read config.yml from the same directory as this SKILL.md file. If it is missing, tell the user to copy config.yml.example to config.yml and fill in vault.base_path.
Key paths derived from config:
```
BASE = config.vault.base_path
ATTACHMENTS = BASE / config.vault.attachments_dir
X_DIR = BASE / config.vault.dirs.x
WECHAT_DIR = BASE / config.vault.dirs.wechat
XHS_DIR = BASE / config.vault.dirs.xiaohongshu
DOUYIN_DIR = BASE / config.vault.dirs.douyin
GITHUB_DIR = BASE / config.vault.dirs.github
WEB_DIR = BASE / config.vault.dirs.web
```
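URL Router
Match the URL and dispatch to the correct handler:
| URL pattern | Handler |
|---|---|
| `x.com/*` or `twitter.com/*` | X Handler |
| `mp.weixin.qq.com/*` | WeChat Handler |
| `xhslink.com/*` or `xiaohongshu.com/*` | Xiaohongshu Handler |
| `v.douyin.com/*` or `douyin.com/video/*` | Douyin Handler |
| `github.com/{owner}/{repo}` (no deeper path) | GitHub Handler |
| Everything else | Web Handler |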
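Shared Rules
These rules apply to all handlers:
File naming
- Use the content title as the filename
- Strip `/\:*?"<>|` and all emoji / special Unicode characters
- Truncate if longer than 60 characters
- If a file with the same name already exists, ask the user before overwriting
Tags
- Every clipping must include the `clipping` tag
- Tags must not contain `.` or spaces (`llms.txt` → `llms-txt`, `Claude Code` → `Claude-Code`)
- Add 2-4 content-based tags automatically
Media download
- Download all images and videos to `ATTACHMENTS/`
- Image naming: `{title-slug}-{n}.{ext}` (slug no longer than 20 characters, no emoji)
- Video naming: `{title-slug}.mp4` or `{title-slug}-video-{n}.mp4`
- Replace remote URLs with Obsidian wikilinks: `![[filename.ext]]`
"Why clipped" field
- Infer from the conversation context where possible
- If the context is insufficient, ask the user
Cross-platform linking
- If the content contains a `github.com/{owner}/{repo}` link, auto-trigger the GitHub Handler to create a GitHub note, then add bidirectional wikilinks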
X Handler
Saves X (Twitter) posts and articles.
Step 1: Fetch post data
```bash
curl -s "https://api.fxtwitter.com/{handle}/status/{tweet_id}"
```
Extract from URL: x.com/{handle}/status/{id} → handle, tweet_id.
Fields:
- `tweet.author.name` → author name
- `tweet.author.screen_name` → @handle
- `tweet.text` → post body (short posts)
- `tweet.article` → long-form article (if present)
- `tweet.article.title` → article title
- `tweet.article.content.blocks` → structured content (Draft.js format)
- `tweet.created_at` → publish date
- `tweet.likes` / `tweet.retweets` / `tweet.views` → engagement
- `tweet.media` → short post images
- `tweet.article.media_entities` → article inline images
Step 2: Determine content type
Short post (tweet.article is null): use tweet.text, images from tweet.media.photos.
Article (tweet.article exists): parse tweet.article.content.blocks (Draft.js format):
| Block type | Markdown |
|---|---|
| `unstyled` | paragraph |
| `header-one` | `# heading` |
| `header-two` | `## heading` |
| `header-three` | `### heading` |
| `unordered-list-item` | `- item` |
| `ordered-list-item` | `1. item` |
| `atomic` | insert image `![[filename]]` |
| `blockquote` | `> quote` |
Inline styles: Bold → `**text**`, Italic → `*text*`.
For atomic blocks: entityMap → value.data.mediaItems[].mediaId → match media_entities → media_info.original_img_url.
Step 3: Download images
Download all images to ATTACHMENTS/.
Step 4: Generate file
Write to X_DIR/{title}.md:
Short post:
```markdown
title: {author name}'s tweet - {first 30 characters}
author: {author name}
handle: @{screen_name}
source: {original URL}
date: {publish date YYYY-MM-DD}
tags:
- clipping
- {auto tags}
{author name}'s tweet
X: @{handle} ({name}) | {date} | {likes} likes · {retweets} retweets · {views} views
{body text}
{images ![[filename]]}
```
Article:
```markdown
title: {article.title}
author: {author name}
handle: @{screen_name}
source: {original URL}
date: {publish date YYYY-MM-DD}
tags:
- clipping
- {auto tags}
{article.title}
X: @{handle} ({name}) | {date} | {likes} likes · {retweets} retweets · {views} views
{parsed Markdown body, including ![[local images]]}
```
Notes
- The fxtwitter API needs no auth but may rate-limit; try vxtwitter.com as a fallback
- Title: articles use `article.title`; short posts use `{name}-{first 15 characters of text}`
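WeChat Handler
Saves WeChat Official Account (公众号) articles.
Step 1: Fetch article data
```bash
curl -s "https://down.mptext.top/api/public/v1/download?url={URL-encoded link}&format=json"
```
Extract: title, nick_name, create_time, content_noencode, link.
If the API returns 204 or fails, fall back to defuddle or WebFetch.
Step 2: Fetch HTML (if rich content is needed)
If the content has images or complex formatting:
```bash
curl -s "https://down.mptext.top/api/public/v1/download?url={URL-encoded link}&format=html"
```
Extract image URLs and formatted content from the HTML.
Step 3: Download images
- Download each image to `ATTACHMENTS/`
- WeChat image URLs: `mmbiz.qpic.cn/mmbiz_png/...` → `.png`, `mmbiz_jpg/...` → `.jpg`
- Replace with `![[filename.jpg]]`
Step 4: Generate file
Write to WECHAT_DIR/{title}.md:
```markdown
title: {article title}
author: {official account name}
source: {original link}
date: {publish date YYYY-MM-DD}
tags:
- clipping
- {auto tags}
{article title}
Official account: {name} | {publish date}
{body, including ![[local images]]}
```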
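Xiaohongshu Handler
Saves 小红书 notes (images + video).
Step 1: Resolve short link
If the URL is xhslink.com, follow redirects to get the real URL and keep the full query parameters (especially xsec_token):
```bash
curl -sL "<short-link>" -o /dev/null -w '%{url_effective}'
```
Important: the full query string is required. Bare URLs return an empty noteDetailMap.
Step 2: Fetch SSR data
Xiaohongshu is an SPA, but SSR embeds the data in `window.__INITIAL_STATE__`: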
```bash
curl -sL "<full-URL-with-params>" \
  -H "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36" \
  -H "Accept: text/html" | python3 -c "
import sys, re, json
html = sys.stdin.read()
m = re.search(r'window\.__INITIAL_STATE__\s*=\s*
```