llms-txt-sniffer: The Smart Document Radar
This skill streamlines documentation ingestion by locating the most AI-optimized version of a site's content.
🧠 Why llms.txt?
It provides a compact, Markdown-based index of a site's key pages, designed so that an LLM can map the site from a single file and spend fewer tokens than it would parsing HTML pages.
🚀 Discovery Strategy (Two-Stage)
Stage 1: Quick Jump Probes (Instructional)
- URL + /llms.txt: Probe {input_url}/llms.txt using curl -I.
- Domain Root: Probe https://{domain}/llms.txt using curl -I.
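The two probes above can be sketched in Python as a rough equivalent of the curl -I checks. The function names candidate_urls and probe are hypothetical and are not part of this skill. Treating only a final HTTP 200 as a hit is an assumption.

```python
import urllib.request
from urllib.parse import urlsplit


def candidate_urls(input_url: str) -> list[str]:
    """Build the Stage 1 probe URLs: the input URL first, then the domain root."""
    parts = urlsplit(input_url)
    # Drop any trailing slash so the result never contains "//llms.txt".
    base = f"{parts.scheme}://{parts.netloc}{parts.path}".rstrip("/")
    root = f"https://{parts.netloc}"
    candidates = [f"{base}/llms.txt", f"{root}/llms.txt"]
    # Remove duplicates while keeping probe order (dicts preserve insertion order).
    return list(dict.fromkeys(candidates))


def probe(url: str, timeout: float = 5.0) -> bool:
    """Send a HEAD request, like curl -I. Assumption: only a 200 counts as found."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and timeouts all derive from OSError.
        return False
```

A caller would try each candidate in order and stop at the first one for which probe returns True. If none succeeds, Stage 2 applies.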
Stage 2: Advanced Sniffing (Tool-based)
If Stage 1 fails, run the companion sniffer script located in this skill's directory:
python3 sniffer.py $ARGUMENTS
📜 Behavioral Rules
- User-Initiated Only: Invoke this skill only when the user explicitly provides a documentation URL. Do not autonomously scan domains.
- Switch to High-Speed Mode: Once an index is found, prioritize its links over manual scraping.
- Index Summary: Always present a brief structure overview.
- Fallback: Use sitemap.xml parser results if llms.txt is missing.