Gaokao English Vocabulary Frequency Grading System
Overview
This skill produces a self-contained interactive HTML webpage (gaokao_english_vocabulary.html) plus an external data file (vocab_data.js) that together form a comprehensive vocabulary study tool for Chinese Gaokao English preparation.
The system classifies 4000+ words and 800+ phrases into 6 frequency levels each (12 levels total), with mastery tags, wrong-word warnings, search, multi-filter, and collapsible sections — all in a responsive dark theme.
Architecture
File Structure
CODEBLOCK0
The data file is separated from HTML to keep the page loadable even with large datasets. The HTML contains all styling and logic inline.
Data Format (vocab_data.js)
Each entry in the VOCAB array follows one of two formats:
Word format:
CODEBLOCK1
Phrase format (flat):
CODEBLOCK2
Level System
Words (6 levels by exam count):
| Level | Name | Range | Color |
|---|
| Lv.1 | 超高频 Super High | ≥40 times | Green #3fb950 |
| Lv.2 |
高频 High | 20-39 times | Blue #58a6ff |
| Lv.3 | 次高频 Sub-High | 10-19 times | Yellow #e3b341 |
| Lv.4 | 中频 Medium | 5-9 times | Red #f85149 |
| Lv.5 | 低频 Low | 2-4 times | Purple #a371f7 |
| Lv.6 | 超低频 Very Low | 0-1 times | Gray #484f58 |
Phrases (6 levels by exam count):
| Level | Name | Range | Color |
|---|
| P1 | 超高频 | ≥20 times | Cyan #38bdf8 |
| P2 |
高频 | 12-19 times | Teal #22d3ee |
| P3 | 次高频 | 8-11 times | Blue #38bdf8 |
| P4 | 中频 | 5-7 times | Light Blue #7dd3fc |
| P5 | 低频 | 3-4 times | Soft Blue #93c5fd |
| P6 | 超低频 | 1-2 times | Slate #94a3b8 |
Mastery Levels
| Tag | Label | Color | CSS Class |
|---|
| 145 | 🔥 145分必掌握 | Green bg | INLINECODE4 |
| must |
✅ 必须掌握 | Yellow bg |
.mmust |
| normal | 📖 一般掌握 | Gray bg |
.mnormal |
| know | 👀 了解意思 | Dark bg |
.mknow |
Key Features
1. Sticky Top Bar
Header, stats, search box, and filter buttons are all in a
position:sticky top bar that stays visible during scroll.
2. Filter Buttons with Frequency Annotations
Each filter button shows the level name and a small
<small> tag with the exam count range (e.g., "20-39次").
3. Sticky Level Headers
Each level section header (e.g., "🟢 Lv.1 超高频") uses
position:sticky so it remains visible while scrolling through cards in that section.
4. Collapsible Sections
Each level section can be expanded/collapsed by clicking its header. Sections start collapsed and expand with a smooth
max-height transition.
5. Card Design
Each vocabulary card displays:
- - Frequency bar (width proportional to exam count)
- Word + part of speech + phonetic
- Meaning(s) with per-meaning count (for words with multiple meanings)
- Mastery badge + exam count badge
- Wrong-word tip (if applicable)
6. Search
Real-time search across word, meaning, and phonetic fields. Supports both word
d[].m and phrase
m formats.
7. Multi-Filter
Filter buttons for: All, 6 word levels, All phrases, 6 phrase levels, 145-essential, Wrong-word.
Page Layout Structure
CODEBLOCK3
Workflow
Generating a New Vocabulary System
- 1. Gather vocabulary data — Collect word lists with:
- Word, part of speech, meaning, exam frequency count
- Mastery level assignment (145 / must / normal / know)
- Wrong-word warnings with tip text (optional)
- 2. Generate
vocab_data.js — Format data as var VOCAB = [...]; following the data format above. Words use d:[{m,c}] array; phrases use flat m and c fields.
- 3. Create the HTML page — Use the template structure from
references/template_structure.md. Key sections:
-
<style> block with all CSS (dark theme, card styles, level colors, responsive breakpoints)
- Top bar with header, stats, search, filter buttons
- Main content area
<div id="mainContent">
-
<script> block with:
getLevel(),
getMaxCount(),
renderCard(),
render(),
filterWord(),
toggleSection(),
setFilter(),
doSearch(), INLINECODE31
- 4. Ensure all 12 levels are non-empty — If any level has zero entries, either add more vocabulary or adjust the count thresholds.
- 5. Validate data integrity — Run Python checks:
- Every entry has
w and
s fields
- Every entry has either
d[] (words) or
m (phrases) for meaning
- Every entry has
v (words) or
i (phrases) for mastery
- No NaN or null values
- 6. Test in browser — Start HTTP server (
python -m http.server 8080) and verify:
- Stats display correct totals
- All filters work
- Search works across word/meaning/phonetic
- Collapsible sections expand/collapse
- Sticky headers and top bar work during scroll
CSS Design Principles
- - Dark theme: Background
#0d1117, cards #161b22, borders INLINECODE41 - Top bar gradient: INLINECODE42
- Card hover: Border color changes to
#58a6ff, slight translateY lift + box-shadow - Responsive grid: INLINECODE44
- Mobile: Single column at INLINECODE45
- Level header sticky: INLINECODE46
- Sticky top bar: INLINECODE47
Customization Points
- - Exam year range: Update the subtitle text (default: "1977—2025年")
- Level thresholds: Adjust in
getLevel() function - Level colors: Update CSS
.level-X .level-header and .level-X .word-bar rules - Mastery labels: Update
getMasteryLabel() function - Signature/署名: Add below subtitle in header section
- Additional filters: Add new cases in
setFilter() and filterWord() switch statements
Resources
references/template_structure.md
Complete HTML template structure with all CSS styles and JavaScript functions. Use as the starting point when generating a new page — copy and customize the thresholds, colors, and data.
scripts/generate_vocab.py
Python script that generates
vocab_data.js from a structured vocabulary source. Demonstrates the data generation workflow including word/phrase classification, mastery assignment, and JS output formatting.