Search Console SEO Report Generator
Generate professional, chart-rich PDF reports from Google Search Console data. Covers traffic trends, top pages, top keywords, country/device distribution, growth analysis, and actionable SEO recommendations.
Prerequisites
Before running this skill, verify these requirements:
1. Service Account Key File
You need a Google Cloud Service Account JSON key file with access to the Search Console properties. The file looks like:
CODEBLOCK0
Ask the user for the path to their key file. Common locations: ~/Downloads/*.json, project directory.
If the user doesn't have one yet, guide them through:
- 1. Create a Service Account in Google Cloud Console (IAM & Admin > Service Accounts)
- Create a JSON key for it (Keys tab > Add Key > JSON)
- Add the service account email as a user in Search Console (Settings > Users and permissions > Add user, "Restricted" permission is sufficient)
- Enable the "Google Search Console API" in the project's API Library
2. Python Environment
The script requires these packages: pyjwt, cryptography, requests, matplotlib, pandas, reportlab.
Set up a virtual environment to avoid system conflicts:
CODEBLOCK1
IMPORTANT: Always use /tmp/sc-env/bin/python to run scripts, not the system Python.
Timeout warning: Package installation and first matplotlib import can be slow (60-120s). Set bash timeout to 180000ms for these operations.
3. Chinese Font for PDF (macOS)
The PDF uses STHeiti for proper CJK + Latin + symbol rendering. Register it like this:
CODEBLOCK2
CRITICAL font rules:
- - Do NOT use
UnicodeCIDFont('STSong-Light') — it causes English letter spacing to be too narrow and Unicode symbols like • (U+2022) to render as garbage characters (e.g. "煉"). - Always use TrueType fonts registered via
TTFont for proper mixed CJK/Latin rendering. - On non-macOS systems, find an available CJK TTF font:
fc-list :lang=zh file or look for Noto Sans CJK / WenQuanYi.
4. Matplotlib Font for Charts (macOS)
CRITICAL: STHeiti alone does NOT cover Korean glyphs — keywords or country names in Korean will show as hollow rectangles (□). Use Arial Unicode MS instead, which covers CJK + Korean + most other scripts:
CODEBLOCK3
Do NOT try to combine STHeiti + AppleSDGothicNeo via font.sans-serif list — matplotlib uses a single font per render pass and does not do per-glyph fallback, so the list only helps if glyphs exist in the first font.
Step-by-Step Instructions
Step 1: Gather Input from User
Ask for or determine:
- - Key file path: Path to the Service Account JSON key file
- Site URLs: One or more Search Console property URLs (format:
https://www.example.com/) - Date range: Default to last 90 days. The user may request a custom range.
- Output path: Where to save the PDF and data files. Default to project directory.
- Language: Report can be in Chinese (default) or English — match the user's language.
Step 2: Authenticate with Google API
Use JWT-based Service Account authentication. Here is the exact authentication code:
CODEBLOCK4
Error handling: If authentication fails with 403, the API may not be enabled or the service account may not have Search Console access. Tell the user which to check.
Step 3: Fetch Data from Search Console API
Use the Search Analytics API endpoint for each site. The base query function:
CODEBLOCK5
For each site, fetch ALL of the following data (this is the complete list — do not skip any):
| # | Query | Dimensions | row_limit | Purpose |
|---|
| 1 | Daily traffic trend | INLINECODE14 | 25000 | Time series for charts |
| 2 |
Top pages |
["page"] | 50 | Most visited pages |
| 3 | Top queries |
["query"] | 50 | Most searched keywords |
| 4 | Country distribution |
["country"] | 30 | Geographic breakdown |
| 5 | Device distribution |
["device"] | 10 | Desktop/Mobile/Tablet split |
| 6 | Search appearance |
["searchAppearance"] | 20 | Rich result types |
| 7 | Query-page combos |
["query", "page"] | 100 | Which keywords drive which pages |
| 8 | Period comparison (first half) |
["page"] with first-half dates | 500 | Growth analysis |
| 9 | Period comparison (second half) |
["page"] with second-half dates | 500 | Growth analysis |
Period comparison logic: Split the date range in half. For each page URL, compare clicks between the two halves. Categorize pages as:
- - Growing: clicks increased (sort by change descending)
- Declining: clicks decreased (sort by change ascending)
- New: appeared only in the second half
- Lost: appeared only in the first half
Step 4: Calculate Summary Statistics
For each site, compute:
CODEBLOCK6
Step 5: Save Raw Data as JSON
Save all fetched data to sc_detailed_data.json for reproducibility:
CODEBLOCK7
Step 6: Generate Charts with Matplotlib
IMPORTANT: Always set matplotlib.use('Agg') BEFORE importing pyplot (no display server available).
Generate these charts (save as PNG, dpi=150):
Chart 1: Combined Traffic Trend (all sites)
- - 2-row subplot: top = daily clicks, bottom = daily impressions
- One line per site, color-coded
- X-axis: dates formatted as
%m-%d, rotated 45 degrees - Legend in upper-left
Chart 2: Per-site Detail (one per site with enough data)
- - 2-row subplot: top = daily clicks with 7-day moving average, bottom = average position (inverted Y-axis — lower is better)
- Fill area under clicks line with alpha=0.3
CODEBLOCK8
Chart 3: Device Distribution
- - 1-row, N-column pie charts (one per site)
- Show percentage labels
Chart 4: Country Distribution (pie chart per site)
- - Use
figsize=(10, 7) and radius=0.9 — smaller sizes make labels illegible - Place on its own full-width row in the PDF (do NOT put side-by-side with keyword chart)
- In PDF, render at INLINECODE28
- Top 8 countries by clicks
- Map country codes to readable names using this mapping:
CODEBLOCK9
Step 7: Generate PDF Report
Use reportlab with A4 page size. The report has 7 sections:
PDF Structure
CODEBLOCK10
PDF Style Configuration
Use 1.2cm margins (not the default 2cm) to maximize content area. Usable width = 21cm - 2×1.2cm = 18.6cm. Set all chart/table widths to 18.6cm accordingly.
CODEBLOCK11
Table Style Template
Use this consistent style for all data tables:
CODEBLOCK12
Bullet Points
Use Unicode bullet character \u2022 (•) for list items:
CODEBLOCK13
This renders correctly with STHeiti font. Do NOT use other bullet approaches.
Step 8: Generate SEO Recommendations
Analyze the data and generate recommendations following these guidelines:
Analysis Framework
- 1. CTR Analysis: If average CTR < 5%, recommend Title/Description optimization.
- Position Opportunities: Find keywords ranking 5-15 (page 1-2 boundary) — these are low-hanging fruit for optimization.
- Country Focus: Identify the #1 traffic country and recommend localized content.
- Growth Momentum: Sites with click growth > 100% are in "breakout" phase — recommend increasing content investment.
- New Sites: Sites with < 30 days of data need basic SEO foundations (Sitemap submission, internal linking).
- Declining Pages: If many pages have declining clicks, recommend content quality audit.
- Device Split: If mobile > 60%, emphasize mobile optimization and Core Web Vitals.
- Technical SEO: Always recommend hreflang for multi-region sites, 404 fixes, and page speed optimization.
Priority Classification
- - P0 (Do immediately): CTR optimization, fixing unindexed pages
- P1 (Do this month): Content localization, keyword optimization for positions 5-15
- P2 (Plan for next quarter): hreflang implementation, Core Web Vitals, Sitemap tuning
Step 9: Present Results
After generating the PDF:
- 1. Confirm the PDF file path to the user
- Provide a summary in chat covering:
- Report structure (sections and page count)
- Key highlights per site (1-2 sentences each)
- Top 3 priority recommendations
- 3. Mention the raw data JSON file path for further analysis
- Offer next steps (e.g., "Do you want me to analyze a specific page or keyword in more detail?")
Common Errors and Solutions
| Error | Cause | Solution |
|---|
| INLINECODE30 on API call | Service account not added to Search Console | Add the service account email as a user in Search Console settings |
| INLINECODE31 |
API not enabled | Enable it at https://console.cloud.google.com/apis/library/searchconsole.googleapis.com |
| Empty
rows in response | No data for that site/date range | Check if the site URL exactly matches the Search Console property (trailing slash matters!) |
|
jwt.encode error | Missing
cryptography package |
pip install cryptography |
| PDF shows garbled Chinese | Wrong font | Use TTFont with STHeiti, NOT UnicodeCIDFont with STSong-Light |
| Matplotlib timeout on first run | Building font cache | Set bash timeout to 180000ms; this only happens once |
|
MPLCONFIGDIR warning | No write access to
~/.matplotlib | Harmless; matplotlib creates a temp cache automatically |
Example Usage
User: "Help me generate an SEO report for my websites using Search Console"
→ Ask for key file path and site URLs, then run the full pipeline.
User: "Analyze search performance for example.com over the last 90 days and export to PDF"
→ Run with default 90-day range, generate full report.
User: "Compare search traffic between my 3 sites"
→ Run for all 3 sites, emphasize the comparison aspects in the summary table and trends chart.
Layout Rules (Lessons Learned)
Top Pages Table — Use Paragraph for Word Wrap
CRITICAL: The Page column contains long URLs that WILL overflow into adjacent columns if you use plain strings. Always wrap ALL table cells in
Paragraph() objects:
CODEBLOCK14
Country Chart — Keep on Its Own Row
Do NOT put the country pie chart side-by-side with the keyword chart. The pie becomes too small to read. Always put it on a separate row at
width=16*cm.
File Naming — Always Include Date
Output filename must include today's date to avoid overwriting previous reports:
CODEBLOCK15
Per-Site Layout Order (One Page Per Site)
- 1. KPI metrics row (Clicks / Impressions / CTR / Avg. Position)
- Top Keywords chart (full width, 18.6cm)
- Country pie chart (own row, 16cm)
- Top Pages table (with Paragraph cells)
- PageBreak()
Reference Script
See gen_report.py in this skill directory for the complete, production-ready implementation with all fixes applied.
Output Files
| File | Description |
|---|
| INLINECODE41 | Raw API data for all sites (reproducible) |
| INLINECODE42 |
Generated chart images (temp, /tmp/) |
|
project/reports/search_console_report_YYYY-MM-DD.pdf | Final PDF report (date-stamped, never overwritten) |
Search Console SEO 报告生成器
从 Google Search Console 数据生成专业、图表丰富的 PDF 报告。涵盖流量趋势、热门页面、热门关键词、国家/设备分布、增长分析和可执行的 SEO 建议。
前置条件
在运行此技能之前,请验证以下要求:
1. 服务账号密钥文件
您需要一个具有 Search Console 属性访问权限的 Google Cloud 服务账号 JSON 密钥文件。文件格式如下:
json
{
type: service_account,
project_id: ...,
privatekeyid: ...,
private_key: -----BEGIN PRIVATE KEY-----\n...,
client_email: xxx@project.iam.gserviceaccount.com,
token_uri: https://oauth2.googleapis.com/token,
...
}
向用户询问其密钥文件的路径。常见位置:~/Downloads/*.json、项目目录。
如果用户还没有密钥文件,请引导他们完成以下步骤:
- 1. 在 Google Cloud Console 中创建服务账号(IAM 与管理 > 服务账号)
- 为其创建 JSON 密钥(密钥选项卡 > 添加密钥 > JSON)
- 将服务账号邮箱添加为 Search Console 中的用户(设置 > 用户和权限 > 添加用户,受限权限即可)
- 在项目的 API 库中启用Google Search Console API
2. Python 环境
脚本需要以下包:pyjwt、cryptography、requests、matplotlib、pandas、reportlab。
设置虚拟环境以避免系统冲突:
bash
python3 -m venv /tmp/sc-env
/tmp/sc-env/bin/pip install pyjwt cryptography requests matplotlib pandas reportlab
重要:始终使用 /tmp/sc-env/bin/python 运行脚本,而不是系统 Python。
超时警告:包安装和首次 matplotlib 导入可能较慢(60-120 秒)。对于这些操作,请将 bash 超时设置为 180000 毫秒。
3. PDF 中文字体(macOS)
PDF 使用 STHeiti 实现中日韩文字、拉丁字母和符号的正确渲染。按如下方式注册:
python
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
pdfmetrics.registerFont(TTFont(CNFont, /System/Library/Fonts/STHeiti Medium.ttc, subfontIndex=0))
pdfmetrics.registerFont(TTFont(CNFontLight, /System/Library/Fonts/STHeiti Light.ttc, subfontIndex=0))
pdfmetrics.registerFontFamily(CNFont, normal=CNFontLight, bold=CNFont)
关键字体规则:
- - 不要使用 UnicodeCIDFont(STSong-Light)——它会导致英文字母间距过窄,并且 Unicode 符号如 •(U+2022)会渲染为乱码字符(例如煉)。
- 始终使用通过 TTFont 注册的 TrueType 字体,以实现中日韩文字和拉丁字母的正确混合渲染。
- 在非 macOS 系统上,找到可用的中日韩 TrueType 字体:fc-list :lang=zh file 或查找 Noto Sans CJK / WenQuanYi。
4. 图表 Matplotlib 字体(macOS)
关键:仅使用 STHeiti 无法覆盖韩文字形——韩文关键词或国家名称将显示为空心矩形(□)。请改用 Arial Unicode MS,它覆盖中日韩文字、韩文和大多数其他文字:
python
import matplotlib.font_manager as fm
matplotlib.rcParams[font.family] = sans-serif
matplotlib.rcParams[axes.unicode_minus] = False
unicodefont_path = /Library/Fonts/Arial Unicode.ttf
if not os.path.exists(unicodefont_path):
unicodefont_path = /System/Library/Fonts/STHeiti Medium.ttc # 备用
fm.fontManager.addfont(unicodefont_path)
ufname = fm.FontProperties(fname=unicodefontpath).get_name()
matplotlib.rcParams[font.sans-serif] = [_ufname, DejaVu Sans]
不要尝试通过 font.sans-serif 列表组合 STHeiti 和 AppleSDGothicNeo——matplotlib 每次渲染只使用一种字体,不进行逐字形回退,因此该列表仅在第一个字体中存在字形时才有帮助。
分步说明
第 1 步:收集用户输入
询问或确定:
- - 密钥文件路径:服务账号 JSON 密钥文件的路径
- 网站 URL:一个或多个 Search Console 属性 URL(格式:https://www.example.com/)
- 日期范围:默认为最近 90 天。用户可请求自定义范围。
- 输出路径:保存 PDF 和数据文件的位置。默认为项目目录。
- 语言:报告可为中文(默认)或英文——与用户语言匹配。
第 2 步:通过 Google API 进行身份验证
使用基于 JWT 的服务账号身份验证。以下是确切的身份验证代码:
python
import json, time, jwt, requests
def getaccesstoken(key_file):
with open(key_file) as f:
creds = json.load(f)
now = int(time.time())
payload = {
iss: creds[client_email],
scope: https://www.googleapis.com/auth/webmasters.readonly,
aud: creds[token_uri],
iat: now,
exp: now + 3600,
}
signedjwt = jwt.encode(payload, creds[privatekey], algorithm=RS256)
resp = requests.post(creds[token_uri], data={
grant_type: urn:ietf:params:oauth:grant-type:jwt-bearer,
assertion: signed_jwt,
})
resp.raiseforstatus()
return resp.json()[access_token]
错误处理:如果身份验证失败并返回 403,则可能未启用 API 或服务账号没有 Search Console 访问权限。告知用户需要检查哪一项。
第 3 步:从 Search Console API 获取数据
对每个网站使用 Search Analytics API 端点。基本查询函数:
python
import datetime
END_DATE = datetime.date.today() - datetime.timedelta(days=3) # 数据约有 3 天延迟
STARTDATE = ENDDATE - datetime.timedelta(days=89)
def querysc(token, siteurl, dimensions, start=None, end=None, row_limit=100):
查询 Search Console Search Analytics API。
参数:
token: OAuth2 访问令牌
site_url: 完整属性 URL,例如 https://www.example.com/
dimensions: 维度列表。有效值:
- date — 每日细分
- query — 搜索关键词
- page — 页面 URL
- country — ISO 3166-1 alpha-3 国家代码(小写)
- device — DESKTOP、MOBILE、TABLET
- searchAppearance — 富结果类型
可组合:[query, page] 用于关键词-页面矩阵
start: 开始日期(datetime.date)。默认为 START_DATE。
end: 结束日期(datetime.date)。默认为 END_DATE。
row_limit: 最大行数(最大 25000)。
返回:
行字典列表,键为:keys(列表)、clicks、impressions、ctr、position
注意:ctr 为小数(0.05 = 5%),显示时需乘以 100。
url = fhttps://www.googleapis.com/webmasters/v3/sites/{requests.utils.quote(site_url, safe=)}/searchAnalytics/query
body = {
startDate: (start or START_DATE).isoformat(),
endDate: (end or END_DATE).isoformat(),
dimensions: dimensions,
rowLimit: row_limit,
}
resp = requests.post(url, headers={Authorization: fBearer {token}}, json=body)
resp.raiseforstatus()
return resp.json().get(rows, [])
对于每个网站,获取以下所有数据(这是完整列表——不要跳过任何一项):
| # | 查询 | 维度 | row_limit | 目的 |
|---|
| 1 | 每日流量趋势 | [date] | 25000 | 图表的时间序列 |
| 2 |
热门页面 | [page] | 50 | 访问量最大的页面 |
| 3 | 热门查询 | [query] | 50 | 搜索最多的关键词 |
| 4 | 国家分布 | [country] | 30 | 地理细分 |
| 5 | 设备分布