SEO & LLM Rankings
Audit websites for traditional SEO health and AI search visibility (GEO). Generate prioritized reports with actionable fixes and ready-to-use prompts for AI coding agents. Covers Google, Bing, ChatGPT, Perplexity, Claude, Gemini, Copilot, and Google AI Overviews.
Before You Start
Check for product marketing context first:
If .agents/product-marketing-context.md exists, read it before asking questions. Use that context and only ask for information not already covered.
Gather missing context by asking the user:
1. Audit mode -- Live URL scan or local codebase scan?
2. Site URL (if URL mode) -- What URL should be audited?
3. Project path (if codebase mode) -- Where is the project? (default: current workspace)
4. Scope -- Full site audit or specific pages/sections?
5. Site type -- SaaS, e-commerce, blog, local business, portfolio, etc.
6. Primary keywords -- What keywords/topics matter most?
7. Business goal -- Traffic, leads, signups, sales, brand awareness?
8. Known issues -- Any specific concerns or recent traffic drops?
9. Tech stack -- Next.js, WordPress, Shopify, static HTML, etc.
Only ask what you don't already know. If the user gives a URL, use URL mode. If the user says "audit my project" or "scan the codebase," use codebase mode. If ambiguous, ask.
Platform Detection & URL Fetching Strategy
URL mode fetches live pages via HTTP. This often fails because of WAFs (Cloudflare, etc.), corporate firewalls, bot-detection, or restricted shell environments. Before running any URL-mode fetch, detect your platform and use the most reliable method available.
How to detect your platform:
- Cursor -- You have a `WebFetch` tool and a `WebSearch` tool. You may also have @Web mentions available. If you can call `WebFetch`, you are likely in Cursor (Claude Code exposes it too; see the next item).
- Claude Code -- You have a `WebFetch` tool natively. Similar to Cursor but runs in a terminal-based agent.
- Codex (OpenAI) -- You have browser tools or web-fetch MCP tools. Check your available tool list.
- Windsurf -- You have built-in web browsing and URL fetching capability.
- Aider / terminal-only agents -- No built-in web tools. You must rely on shell commands (`curl`, `python3`).
- GitHub Copilot Workspace -- Has web access through its own tool surface.
URL fetching priority chain (try in order, fall back on failure):
| Platform | Primary Fetch Method | Fallback 1 | Fallback 2 |
|---|---|---|---|
| Cursor | `WebFetch` tool (bypasses WAFs/firewalls) | Python script (`scripts/seo_audit.py`) | `curl` with browser UA |
| Claude Code | `WebFetch` tool | Python script | `curl` with browser UA |
| Codex | `WebFetch` / browser tools | Python script | -- |
| Windsurf | Built-in web browsing | Python script | `curl` with browser UA |
| Aider / terminal | Python script | `curl` with browser UA | -- |
| Other | Python script | `curl` with browser UA | -- |
Failure detection: Treat any of these as a fetch failure that should trigger the next fallback:
- HTTP 403, 404, 5xx status codes
- Connection timeout or connection refused
- Empty response body
- SSL/TLS errors
- `curl: command not found` (Windows without curl)
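The failure rules and the priority chain above can be sketched as a small helper. This is only an illustration, not the logic used by `scripts/seo_audit.py`: the names `is_fetch_failure` and `fetch_with_fallbacks` are hypothetical, and each fetcher is assumed to be a callable that returns `(status, body)` or raises on connection, TLS, or missing-tool errors.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical fetcher signature: takes a URL, returns (HTTP status, body text).
Fetcher = Callable[[str], Tuple[int, str]]


def is_fetch_failure(status: Optional[int], body: str, error: Optional[str] = None) -> bool:
    """Return True when a result should trigger the next fallback."""
    if error is not None or status is None:
        # Timeout, refused connection, SSL/TLS error, or a missing curl binary.
        return True
    if status in (403, 404) or 500 <= status <= 599:
        return True
    return not body.strip()  # empty response body


def fetch_with_fallbacks(url: str, fetchers: List[Tuple[str, Fetcher]]):
    """Try each (name, fetcher) in order; return (winner, body, failures)."""
    failures = []
    for name, fetch in fetchers:
        try:
            status, body = fetch(url)
            error = None
        except Exception as exc:  # any exception counts as a fetch failure
            status, body, error = None, "", f"{type(exc).__name__}: {exc}"
        if not is_fetch_failure(status, body, error):
            return name, body, failures
        failures.append(f"{name}: status={status} error={error}")
    # Every method failed: the caller should report this, not skip the check.
    return None, "", failures
```

The returned `failures` list keeps one entry per failed method, so the user can be told which methods were tried and what each one returned.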
If all methods fail, do NOT silently skip the check. Inform the user that the URL could not be reached and suggest:
1. Trying from a different network
2. Pasting the page HTML or file content directly for analysis
3. Switching to codebase mode if they have the source locally
Audit Mode Selection
This skill supports two audit modes:
| Mode | When to Use | How It Works |
|---|---|---|
| URL mode | Site is live / deployed | Fetches pages via HTTP, checks robots.txt, sitemap, llms.txt, load time |
| Codebase mode | Site is local / pre-deployment | Scans project files using Glob, Read, Grep -- checks HTML, layout files, config |
Codebase mode is especially useful for:
- Catching SEO issues before deployment
- Projects not yet live
- When the user wants to scan source code directly
- Auditing meta tags set in framework layout/page files (Next.js, Astro, etc.)
Audit Workflow
Phase 1: Technical SEO Scan
Choose your mode:
URL Mode
Run the audit script and manual checks:
CODEBLOCK0
Manual checks -- use the fetching strategy from "Platform Detection & URL Fetching Strategy" above.
Follow the priority chain for your platform. Examples for each method:
Primary: WebFetch (Cursor, Claude Code, Codex, Windsurf)
Use WebFetch for each resource. This is the most reliable method -- it handles TLS, follows redirects, and bypasses most WAF/bot-detection blocks.
CODEBLOCK1
If WebFetch returns an error for a specific resource (403, 404, 500, or timeout), note the failure and try the next fallback for that resource. Do NOT skip the check silently.
Fallback 1: Python script
CODEBLOCK2
The script has built-in retry logic and User-Agent rotation. If the script reports a fetch failure, it will print a diagnostic message with the status code and suggested next steps.
Fallback 2: curl with browser User-Agent
Use a realistic browser User-Agent to avoid bot detection. Add --retry 2 and --max-time 30 for resilience.
CODEBLOCK3
If all methods fail for a resource, tell the user which resource could not be fetched, the error received, and suggest they paste the content manually or switch to codebase mode.
PageSpeed check (optional, API-based):
CODEBLOCK4
What to check (both modes):
- Title tag exists, 50-60 chars, contains primary keyword
- Meta description exists, 150-160 chars, compelling
- Single H1 per page, contains primary keyword
- Heading hierarchy: H1 > H2 > H3 (no skips)
- HTTPS enabled with valid certificate
- robots.txt exists and allows important pages
- XML sitemap exists and is accessible
- Page loads in < 3 seconds (URL mode only)
- Core Web Vitals: LCP < 2.5s, INP < 200ms, CLS < 0.1 (URL mode only)
- Open Graph and Twitter Card tags present
- Images have alt text
- No broken internal links
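Several of the checks above can be run against raw HTML with the standard library alone. The sketch below is illustrative and is not the implementation in `scripts/seo_audit.py`; `OnPageParser` and `audit_html` are hypothetical names. It covers only title length, meta description length, H1 count, heading skips, and images without an `alt` attribute.

```python
from html.parser import HTMLParser


class OnPageParser(HTMLParser):
    """Collects the few on-page signals needed for the checklist."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None      # None means the tag was never seen
        self.headings = []           # heading levels in document order
        self.images_missing_alt = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.headings.append(int(tag[1]))
        elif tag == "img" and "alt" not in attrs:
            self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_html(html: str) -> list:
    """Return a list of human-readable issues for one page."""
    parser = OnPageParser()
    parser.feed(html)
    issues = []
    title = parser.title.strip()
    if not title:
        issues.append("missing title")
    elif not 50 <= len(title) <= 60:
        issues.append(f"title length {len(title)} outside 50-60")
    if parser.description is None:
        issues.append("missing meta description")
    elif not 150 <= len(parser.description) <= 160:
        issues.append(f"meta description length {len(parser.description)} outside 150-160")
    h1_count = parser.headings.count(1)
    if h1_count != 1:
        issues.append(f"expected one H1, found {h1_count}")
    for prev, cur in zip(parser.headings, parser.headings[1:]):
        if cur > prev + 1:  # e.g. H1 followed directly by H3
            issues.append(f"heading skip H{prev} -> H{cur}")
    if parser.images_missing_alt:
        issues.append(f"{parser.images_missing_alt} image(s) missing alt attribute")
    return issues
```

Keyword presence, link checking, and performance metrics are left out on purpose; they need inputs this sketch does not have.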
Codebase Mode
Scan the project files directly using Glob, Read, and Grep. No live server needed.
Step 1: Discover project structure and framework
CODEBLOCK5
Step 2: Check meta tags in layout/page files
For Next.js (App Router):
CODEBLOCK6
For Next.js (Pages Router):
CODEBLOCK7
For static HTML:
CODEBLOCK8
For Astro/Vue/Svelte:
CODEBLOCK9
Step 3: Check for robots.txt and sitemap
CODEBLOCK10
For Next.js dynamic generation:
CODEBLOCK11
Step 4: Check for llms.txt
CODEBLOCK12
Step 5: Check schema markup in source
CODEBLOCK13
Step 6: Check headings structure
CODEBLOCK14
For React/JSX components:
CODEBLOCK15
Step 7: Check images for alt text
CODEBLOCK16
Step 8: Check Open Graph and social tags
CODEBLOCK17
Codebase mode limitations:
- Cannot check page load time or Core Web Vitals (needs live server)
- Cannot test HTTP redirect chains or SSL
- Cannot detect runtime-only issues (client-side rendering, JS errors)
- Schema injected purely at runtime by plugins won't be visible
Codebase mode advantages:
- Catches issues before deployment
- Can see the actual source code and suggest exact file/line fixes
- Works offline
- Can scan all pages at once instead of one URL at a time
- Fix prompts can reference exact file paths
Phase 2: GEO / AI Visibility Scan
AI Crawler Access -- check robots.txt for these bots (in URL mode, fetch directly; in codebase mode, read the robots.txt file from public/ or project root):
| Bot | Platform | Purpose |
|---|---|---|
| GPTBot | OpenAI | ChatGPT training/knowledge |
| ChatGPT-User | OpenAI | ChatGPT web browsing |
| ClaudeBot | Anthropic | Claude training |
| Claude-Web | Anthropic | Claude web search |
| anthropic-ai | Anthropic | Claude training data |
| PerplexityBot | Perplexity | Real-time search answers |
| Google-Extended | Google | Gemini / AI Overviews |
| Applebot-Extended | Apple | Apple Intelligence |
| Bingbot | Microsoft | Copilot (uses Bing index) |
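One way to test crawler access is to parse the robots.txt text with Python's standard `urllib.robotparser` and ask whether each bot may fetch the site root. This is a minimal sketch, assuming the robots.txt content has already been fetched or read from disk; `blocked_ai_bots` is a hypothetical helper name, and checking only the root URL is a simplification.

```python
from urllib.robotparser import RobotFileParser

# User agents from the table above.
AI_BOTS = [
    "GPTBot", "ChatGPT-User", "ClaudeBot", "Claude-Web", "anthropic-ai",
    "PerplexityBot", "Google-Extended", "Applebot-Extended", "Bingbot",
]


def blocked_ai_bots(robots_txt: str, url: str = "https://example.com/") -> list:
    """Return the bots from AI_BOTS that robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_ai_bots(robots))
```

A non-empty result is the condition that matters most here, since blocked crawlers make every other AI visibility improvement moot.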
llms.txt File:
Check if https://example.com/llms.txt exists. This AI discovery file (introduced 2024, widely adopted 2026) provides structured context to LLMs. Sites with llms.txt saw ~35% increase in AI visibility within 60 days. If missing, recommend creating one -- see references/fix-prompt-templates.md for a ready-to-use prompt.
Schema Markup:
INLINECODE23 and curl cannot reliably detect JS-injected schema (Yoast, RankMath, AIOSEO). Use browser tools or Google Rich Results Test for accurate detection.
Check for: FAQPage (+40% AI visibility), Article, Organization, WebPage, BreadcrumbList, SpeakableSpecification. See references/schema-templates.md.
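For schema that is present in the served HTML, a static pass can list the JSON-LD `@type` values and compare them with the types above. The sketch below is an assumption-laden illustration: `schema_types` is a hypothetical helper, the regex is deliberately simple, and it will not see markup injected at runtime by JavaScript.

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> blocks.
JSON_LD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)

WANTED = ["FAQPage", "Article", "Organization", "WebPage", "BreadcrumbList",
          "SpeakableSpecification"]


def schema_types(html: str) -> set:
    """Collect every @type found in the page's JSON-LD blocks, nested ones included."""
    types = set()
    for raw in JSON_LD_RE.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed block: worth reporting separately in a real audit
        stack = [data]
        while stack:
            node = stack.pop()
            if isinstance(node, dict):
                value = node.get("@type")
                if isinstance(value, str):
                    types.add(value)
                elif isinstance(value, list):
                    types.update(v for v in value if isinstance(v, str))
                stack.extend(node.values())
            elif isinstance(node, list):
                stack.extend(node)
    return types


def missing_schema(html: str) -> list:
    """Return the wanted types that the static HTML does not declare."""
    found = schema_types(html)
    return [t for t in WANTED if t not in found]
```

An empty result from `schema_types` on a WordPress site is not proof that schema is absent; confirm with a tool that renders JavaScript before reporting it as missing.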
AI Citation Scoring:
Score the page across 5 dimensions (see references/ai-citation-scoring.md):
1. Extractability -- Can AI pull a useful answer?
2. Quotability -- Are there statements worth citing?
3. Authority -- Does it signal expertise?
4. Freshness -- Is content current?
5. Entity Clarity -- Can AI identify the entity?
Veto rule: If AI crawlers are blocked in robots.txt, AI visibility score = 0 regardless of content quality.
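The veto rule can be expressed as a guard in front of whatever scoring the reference file defines. The rubric itself lives in references/ai-citation-scoring.md and is not reproduced here, so the sketch below assumes, purely for illustration, that each dimension is scored from 0 to 10 and that the overall score is the plain average; `ai_visibility_score` is a hypothetical name.

```python
DIMENSIONS = ("extractability", "quotability", "authority", "freshness", "entity_clarity")


def ai_visibility_score(scores: dict, crawlers_blocked: bool) -> float:
    """Average the five dimension scores, unless the veto rule applies."""
    if crawlers_blocked:
        # Veto: content quality is irrelevant if AI crawlers cannot read the page.
        return 0.0
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    # Assumed aggregation: unweighted mean on an assumed 0-10 scale.
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)
```

Checking the veto before validating the inputs means a blocked site scores 0 even when the content dimensions were never assessed.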
Phase 3: Content Quality Assessment
GEO Methods Check (see references/geo-methods.md):
- Citations with sources present? (+27-40% visibility)
- Statistics with named sources? (+33-37%)
- Expert quotes with attribution? (+30-43%)
- Answer-first format? (40-60 word answer capsule after H2s)
- No keyword stuffing? (causes -9 to -10%)
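The answer-first check is easy to automate once the page has been split into sections. A minimal sketch, assuming the caller supplies `(heading, first_paragraph)` pairs for each H2 and that words are counted by splitting on whitespace; `capsule_issues` is a hypothetical helper.

```python
def capsule_issues(sections, low: int = 40, high: int = 60) -> list:
    """Return (heading, word_count) for sections whose opening paragraph
    falls outside the low-high word range."""
    issues = []
    for heading, first_paragraph in sections:
        word_count = len(first_paragraph.split())
        if not low <= word_count <= high:
            issues.append((heading, word_count))
    return issues
```

Word count alone does not show that the paragraph answers the heading's question, so flagged and unflagged sections both still need a read-through.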
E-E-A-T Signals:
- Author bios with credentials visible?
- About page with company information?
- Contact information accessible?
- First-hand experience demonstrated?
- Content dated and recently updated?
Content Structure for AI Extraction:
- Clear H2/H3 headings that match questions people ask
- Answer in first sentence after each heading
- Tables for comparisons, ordered lists for processes
- Short paragraphs (2-3 sentences)
- FAQ sections with direct answers
AI Writing Detection:
Check content for AI writing patterns that reduce credibility. See references/ai-writing-detection.md.
Report Format
Generate this report after completing all three phases:
CODEBLOCK18
Fix Prompt Generation
When the audit finds a small number of fixable issues, generate a ready-to-use prompt that the user can paste directly into their AI coding agent (Cursor, Claude Code, Codex, etc.).
When to generate a fix prompt:
- Fewer than ~10 distinct issues found
- Issues are code-level fixes (meta tags, schema, robots.txt, content structure)
- User asks for a prompt to fix things
- Codebase mode: always generate fix prompts with exact file paths and line numbers
How to build the prompt:
1. Read the fix prompt templates from references/fix-prompt-templates.md
2. Select the relevant template(s) based on audit findings
3. Fill in the specific issues, file paths, and current values from the audit
4. Combine into a single `.prompt.md` formatted prompt with:
   - Clear persona (SEO/GEO expert)
   - Specific task listing each issue found
   - Step-by-step fix instructions
   - Validation criteria
Example output (URL mode):
CODEBLOCK19
Example output (codebase mode):
CODEBLOCK20
Content Prompt Generation
When the audit identifies content gaps -- missing pages, thin content that needs rewriting, or opportunities for new pages targeting high-value queries -- generate a ready-to-use prompt that produces SEO + GEO optimized content when pasted into an AI writing/coding agent.
When to generate a content prompt:
- Audit reveals content gaps (missing pages for queries the site should rank for)
- Programmatic SEO opportunities identified (see table below)
- User explicitly asks for help creating new content or pages
- Thin content flagged (word count < 300) that needs a full rewrite rather than a quick fix
- Competitor analysis shows topics the site doesn't cover but should
How to build the prompt:
1. Read the content prompt templates from references/content-prompt-templates.md
2. Select the template matching the page type:
   - Article / Blog Post -- informational content targeting a query
   - Landing / Product Page -- conversion-focused page
   - Glossary / Definition Page -- "What is X?" pages (great for pSEO)
   - Comparison Page -- "X vs Y" structured comparison
   - FAQ / Resource Page -- Q&A aggregation page
   - Location / Persona Page -- localized or audience-specific variant
3. Fill placeholders using audit findings, keyword data, and product-marketing-context (if `.agents/product-marketing-context.md` exists)
4. The template already embeds GEO optimization rules (answer capsules, citation density, quotation slots from references/geo-methods.md), anti-AI-writing constraints (from references/ai-writing-detection.md), and schema markup requirements -- do not remove them
5. Output as a single `.prompt.md` formatted prompt
Example output (article page):
CODEBLOCK21
Example output (comparison page):
CODEBLOCK22
Combining with programmatic SEO: When the audit suggests scale pages (integrations, locations, personas, glossary terms), use the batch pattern in Template 3 or Template 6 from references/content-prompt-templates.md. Provide the list of items (terms, cities, audiences) inside the prompt so the agent generates one page per item using the same structure.
Programmatic SEO Opportunities
For sites that could benefit from pages at scale, suggest opportunities from these playbooks:
| If the site has... | Suggest... |
|---|---|
| Product with integrations | Integration pages (`/integrations/[tool]/`) |
| Multi-segment audience | Persona pages (`/for/[audience]/`) |
| Local presence | Location pages (`/[service]/[city]/`) |
| Competitor landscape | Comparison pages (`/compare/[x]-vs-[y]/`) |
| Industry expertise | Glossary pages (`/glossary/[term]/`) |
| Design/creative product | Template pages (`/templates/[type]/`) |
Only suggest if there's genuine search demand and the site has (or can create) unique data for each page. Use subfolders, not subdomains.
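When a playbook applies, the candidate URL list can be generated from the item list before any content is written, which makes it easier to vet each page for search demand. A small sketch using the comparison pattern from the table; `slugify` and `comparison_paths` are hypothetical helpers, and the product names are made up.

```python
import re
from itertools import combinations


def slugify(text: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def comparison_paths(products: list) -> list:
    """Build one /compare/[x]-vs-[y]/ subfolder path per unordered pair."""
    return [
        f"/compare/{slugify(a)}-vs-{slugify(b)}/"
        for a, b in combinations(products, 2)
    ]


print(comparison_paths(["Acme CRM", "Beta Desk", "Gamma"]))
# ['/compare/acme-crm-vs-beta-desk/', '/compare/acme-crm-vs-gamma/', '/compare/beta-desk-vs-gamma/']
```

The pair count grows quadratically with the product list, so prune the output to pairs people actually search for rather than publishing every combination.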
Platform-Specific Optimization
For detailed ranking factors per platform, see references/platform-ranking-factors.md.
Quick reference:
| Platform | Primary Index | Key Factor | Unique Requirement |
|---|---|---|---|
| Google | Google | Backlinks + E-E-A-T | Core Web Vitals |
| Google AI Overviews | Google | E-E-A-T + Schema | Knowledge Graph |
| ChatGPT | Bing-based | Domain Authority | Content-Answer Fit |
| Perplexity | Own + Google | Semantic Relevance | FAQ Schema, freshness |
| Claude | Brave | Factual Density | Brave Search indexing |
| Copilot | Bing | Bing Index | MS Ecosystem presence |
Validation Tools
| Tool | URL | Purpose |
|---|---|---|
| Google Rich Results Test | search.google.com/test/rich-results | Schema validation (renders JS) |
| Schema.org Validator | validator.schema.org | Schema syntax check |
| PageSpeed Insights | pagespeed.web.dev | Core Web Vitals |
| Google Search Console | search.google.com/search-console | Indexing, errors, performance |
| Bing Webmaster Tools | bing.com/webmasters | Bing indexing |
References
Scripts
- `scripts/seo_audit.py` -- Full SEO + GEO audit (no API required). Checks meta tags, headings, robots.txt, sitemap, llms.txt, AI crawlers, schema, performance.
CODEBLOCK23
Troubleshooting: URL Fetch Failures
URL fetching can fail for many reasons. Use this table to diagnose and recover.
| Symptom | Likely Cause | Fix |
|---|---|---|
| 403 Forbidden | WAF/bot detection (Cloudflare, etc.) | Use `WebFetch` (Cursor/Claude Code) or `curl` with browser UA |
| 404 on known pages | URL path wrong OR CDN blocking non-browser agents | Verify URL in a browser, then try `WebFetch` |
| 429 Too Many Requests | Rate limiting | Wait and retry, or use `WebFetch` |
| 500 Server Error | Server issue or aggressive bot blocking | Retry with backoff (script does this automatically), try `WebFetch` |
| Connection timeout | Firewall blocking outbound requests | Use `WebFetch`, or ask user to paste content |
| Empty response | Bot trap returning empty body | Use `WebFetch`, verify the URL loads in a real browser |
| SSL/TLS error | Certificate issue or corporate MITM proxy | Add `--insecure` to curl (warn user), or use `WebFetch` |
| `curl: command not found` | Windows without curl or restricted env | Use Python script (`scripts/seo_audit.py`) or `WebFetch` |
| Script exits with code 2 | Connection/timeout error (no HTTP status) | Network issue; try `WebFetch` or check VPN/firewall |
| Script exits with code 1 | HTTP error (got a status code back) | Check the status code in output; likely bot detection |
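The two exit codes in the table can drive the fallback decision when the script is run from another program. The sketch below is an illustration only: `interpret_exit_code` and `run_audit` are hypothetical names, and treating exit code 0 as success is an assumption the table does not state.

```python
import subprocess
import sys


def interpret_exit_code(code: int) -> str:
    """Map the audit script's exit code to the next step from the table."""
    if code == 0:
        return "ok"  # assumption: 0 means the audit completed
    if code == 1:
        return "HTTP error: check the status code in the output; likely bot detection"
    if code == 2:
        return "connection or timeout error: try WebFetch or check VPN/firewall"
    return f"unexpected exit code {code}: inspect the script output"


def run_audit(url: str, script: str = "scripts/seo_audit.py") -> str:
    """Run the audit script with the current interpreter and explain the result."""
    result = subprocess.run([sys.executable, script, url])
    return interpret_exit_code(result.returncode)
```

`run_audit` uses `sys.executable` instead of a hard-coded `python3`, which sidesteps the case where `python3` is not on the PATH (common on Windows).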
Platform-specific notes:
- Cursor: Always prefer the `WebFetch` tool. It handles TLS, follows redirects, and bypasses most WAF blocks. If fetching a URL fails with `WebFetch`, do NOT silently skip the check -- inform the user with the error and suggest they paste the content or switch to codebase mode.
- Claude Code: Same as Cursor -- `WebFetch` is available and preferred.
- Codex (OpenAI): Use available browser/web tools. Check your tool list for `WebFetch` or equivalent.
- Windsurf: Use built-in browsing capability. Falls back to shell commands.
- Terminal-only agents (Aider, etc.): Must rely on shell commands. Use the Python script first (it has retry logic and UA rotation), then fall back to curl with a browser User-Agent.
- GitHub Copilot Workspace: Use platform web access tools. Fall back to Python script if unavailable.
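SEO & LLM Rankings
Audit websites for traditional SEO health and AI search visibility (GEO). Generate prioritized reports with actionable fixes and ready-to-use prompts for AI coding agents. Covers Google, Bing, ChatGPT, Perplexity, Claude, Gemini, Copilot, and Google AI Overviews.
Before You Start
Check for product marketing context first:
If .agents/product-marketing-context.md exists, read it before asking questions. Use that context and only ask for information not already covered.
Gather missing context by asking the user:
1. Audit mode -- Live URL scan or local codebase scan?
2. Site URL (if URL mode) -- What URL should be audited?
3. Project path (if codebase mode) -- Where is the project? (default: current workspace)
4. Scope -- Full site audit or specific pages/sections?
5. Site type -- SaaS, e-commerce, blog, local business, portfolio, etc.
6. Primary keywords -- What keywords/topics matter most?
7. Business goal -- Traffic, leads, signups, sales, brand awareness?
8. Known issues -- Any specific concerns or recent traffic drops?
9. Tech stack -- Next.js, WordPress, Shopify, static HTML, etc.
Only ask what you don't already know. If the user gives a URL, use URL mode. If the user says "audit my project" or "scan the codebase," use codebase mode. If ambiguous, ask.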
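Platform Detection & URL Fetching Strategy
URL mode fetches live pages via HTTP. This often fails because of WAFs (Cloudflare, etc.), corporate firewalls, bot detection, or restricted shell environments. Before running any URL-mode fetch, detect your platform and use the most reliable method available.
How to detect your platform:
- Cursor -- You have a `WebFetch` tool and a `WebSearch` tool. You may also have @Web mentions available. If you can call `WebFetch`, you are likely in Cursor (Claude Code exposes it too; see the next item).
- Claude Code -- You have a `WebFetch` tool natively. Similar to Cursor but runs in a terminal-based agent.
- Codex (OpenAI) -- You have browser tools or web-fetch MCP tools. Check your available tool list.
- Windsurf -- You have built-in web browsing and URL fetching capability.
- Aider / terminal-only agents -- No built-in web tools. You must rely on shell commands (`curl`, `python3`).
- GitHub Copilot Workspace -- Has web access through its own tool surface.
URL fetching priority chain (try in order, fall back on failure):
| Platform | Primary Fetch Method | Fallback 1 | Fallback 2 |
|---|---|---|---|
| Cursor | `WebFetch` tool (bypasses WAFs/firewalls) | Python script (`scripts/seo_audit.py`) | `curl` with browser UA |
| Claude Code | `WebFetch` tool | Python script | `curl` with browser UA |
| Codex | `WebFetch` / browser tools | Python script | -- |
| Windsurf | Built-in web browsing | Python script | `curl` with browser UA |
| Aider / terminal | Python script | `curl` with browser UA | -- |
| Other | Python script | `curl` with browser UA | -- |
Failure detection: Treat any of these as a fetch failure that should trigger the next fallback:
- HTTP 403, 404, 5xx status codes
- Connection timeout or connection refused
- Empty response body
- SSL/TLS errors
- `curl: command not found` (Windows without curl)
If all methods fail, do NOT silently skip the check. Inform the user that the URL could not be reached and suggest:
1. Trying from a different network
2. Pasting the page HTML or file content directly for analysis
3. Switching to codebase mode if they have the source locally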
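Audit Mode Selection
This skill supports two audit modes:
| Mode | When to Use | How It Works |
|---|---|---|
| URL mode | Site is live / deployed | Fetches pages via HTTP, checks robots.txt, sitemap, llms.txt, load time |
| Codebase mode | Site is local / pre-deployment | Scans project files using Glob, Read, Grep -- checks HTML, layout files, config |
Codebase mode is especially useful for:
- Catching SEO issues before deployment
- Projects not yet live
- When the user wants to scan source code directly
- Auditing meta tags set in framework layout/page files (Next.js, Astro, etc.)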
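Audit Workflow
Phase 1: Technical SEO Scan
Choose your mode:
URL Mode
Run the audit script and manual checks:
```bash
python3 scripts/seo_audit.py "https://example.com"
python3 scripts/seo_audit.py "https://example.com" --full  # detailed output
```
Manual checks -- use the fetching strategy from "Platform Detection & URL Fetching Strategy" above.
Follow the priority chain for your platform. Examples for each method:
Primary: WebFetch (Cursor, Claude Code, Codex, Windsurf)
Use WebFetch for each resource. This is the most reliable method -- it handles TLS, follows redirects, and bypasses most WAF/bot-detection blocks.
```
WebFetch("https://example.com")             -- main page HTML
WebFetch("https://example.com/robots.txt")  -- robots.txt
WebFetch("https://example.com/sitemap.xml") -- sitemap
WebFetch("https://example.com/llms.txt")    -- llms.txt (AI discovery)
```
If WebFetch returns an error for a specific resource (403, 404, 500, or timeout), note the failure and try the next fallback for that resource. Do NOT skip the check silently.
Fallback 1: Python script
```bash
python3 scripts/seo_audit.py "https://example.com"
python3 scripts/seo_audit.py "https://example.com" --full
```
The script has built-in retry logic and User-Agent rotation. If the script reports a fetch failure, it will print a diagnostic message with the status code and suggested next steps.
Fallback 2: curl with browser User-Agent
Use a realistic browser User-Agent to avoid bot detection. Add --retry 2 and --max-time 30 for resilience.
```bash
# Quote the UA string: it contains spaces and parentheses.
UA="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"
curl -sL -A "$UA" --retry 2 --max-time 30 "https://example.com/robots.txt"
curl -sL -A "$UA" --retry 2 --max-time 30 "https://example.com/sitemap.xml"
curl -sL -A "$UA" --retry 2 --max-time 30 "https://example.com/llms.txt"
curl -sL -A "$UA" --retry 2 --max-time 30 "https://example.com"
# -I requests headers only, for status codes and the redirect chain.
curl -sIL -A "$UA" --retry 2 --max-time 30 "https://example.com"
```
If all methods fail for a resource, tell the user which resource could not be fetched, the error received, and suggest they paste the content manually or switch to codebase mode.
PageSpeed check (optional, API-based):
```bash
# Quote the URL so the shell does not treat & as a background operator.
curl "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://example.com&strategy=mobile"
```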
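What to check (both modes):
- Title tag exists, 50-60 chars, contains primary keyword
- Meta description exists, 150-160 chars, compelling
- Single H1 per page, contains primary keyword
- Heading hierarchy: H1 > H2 > H3 (no skips)
- HTTPS enabled with valid certificate
- robots.txt exists and allows important pages
- XML sitemap exists and is accessible
- Page loads in < 3 seconds (URL mode only)
- Core Web Vitals: LCP < 2.5s, INP < 200ms, CLS < 0.1 (URL mode only)
- Open Graph and Twitter Card tags present
- Images have alt text
- No broken internal links
Codebase Mode
Scan the project files directly using Glob, Read, and Grep. No live server needed.
Step 1: Discover project structure and framework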
Glob: **/*.{html,jsx,tsx,astro,vue,svelte,php}
Glob: **/layout.{tsx,jsx,js,ts}
Glob: **/app/layout.{tsx,jsx}
Glob: **/_app.{tsx,jsx}
Glob: **/index.{html,tsx,jsx}
Glob: **/head.{tsx,jsx}