Caching Patterns
A well-placed cache is the cheapest way to buy speed. A misplaced cache is the most expensive way to buy bugs.
Cache Strategies
| Strategy | How It Works | When to Use |
|---|---|---|
| Cache-Aside (Lazy) | App checks cache → miss → reads DB → writes to cache | Default choice; general purpose |
| Read-Through | Cache fetches from DB on miss automatically | ORM-integrated caching, CDN origin fetch |
| Write-Through | Writes go to cache AND DB synchronously | Read-heavy with strong consistency |
| Write-Behind | Writes go to cache, async flush to DB | High write throughput, eventual consistency OK |
| Refresh-Ahead | Cache proactively refreshes before expiry | Predictable access patterns, low-latency critical |
Cache-aside flow:

    App ──► Cache ──► Hit? ──► Return data
                       │
                       ▼ Miss
                 Read DB ──► Store in cache ──► Return data
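The cache-aside row in the strategy table can be sketched in a few lines of Python. This is a minimal illustration, not a production client: the `CacheAside` class, the `load_from_db` callable, and the plain dict standing in for Redis or Memcached are all hypothetical names introduced here.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside: the application owns both the cache lookup and the DB read."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._load_from_db = load_from_db  # hypothetical accessor for the source of truth
        self._cache: Dict[str, str] = {}   # stand-in for a real cache server
        self.db_reads = 0                  # counter kept only to make the behavior visible

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:             # hit: serve from cache
            return self._cache[key]
        self.db_reads += 1                 # miss: read the source of truth
        value = self._load_from_db(key)
        if value is not None:
            self._cache[key] = value       # populate so later readers hit
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying row changes."""
        self._cache.pop(key, None)
```

Reading the same key twice should reach the database once; calling `invalidate` after a write forces the next read back to the source of truth.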
Cache Invalidation
| Method | Consistency | When to Use |
|---|---|---|
| TTL-based | Eventual (up to TTL) | Simple data, acceptable staleness |
| Event-based | Strong (near real-time) | Inventory, profile updates |
| Version-based | Strong | Static assets, API responses, config |
| Tag-based | Strong | CMS content, category-based purging |
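As an illustration of the tag-based row, the sketch below keeps a reverse index from tag to keys so that one purge call removes every entry in a category. `TaggedCache` and `purge_tag` are hypothetical names; real CDNs and cache libraries expose this idea under their own APIs (often called surrogate keys).

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Optional, Set


class TaggedCache:
    """In-memory cache where each entry may carry tags used for bulk purging."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._keys_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def set(self, key: str, value: Any, tags: Iterable[str] = ()) -> None:
        self._values[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)  # reverse index: tag -> keys

    def get(self, key: str) -> Optional[Any]:
        return self._values.get(key)

    def purge_tag(self, tag: str) -> int:
        """Remove every entry carrying `tag`; return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._values:          # may already be gone via another tag
                del self._values[key]
                removed += 1
        return removed
```

The cost of this design is the extra index, which has to be kept in the same store as the values so the two cannot drift apart.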
TTL Guidelines
| Data Type | TTL | Rationale |
|---|---|---|
| Static assets (CSS/JS/images) | 1 year + cache-busting hash | Immutable by filename |
| API config / feature flags | 30–60 seconds | Fast propagation needed |
| User profile data | 5–15 minutes | Tolerable staleness |
| Product catalog | 1–5 minutes | Balance freshness vs load |
| Session data | Match session timeout | Security requirement |
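To make the TTL guidelines concrete, here is a small sketch of a TTL-expiring cache. The `TTL_SECONDS` values are illustrative picks from within the ranges in the table, and the `TTLCache` class with its injectable `clock` parameter is an assumption made so expiry can be tested without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

# Illustrative TTLs in seconds, chosen from within the ranges in the table above.
TTL_SECONDS = {
    "feature_flags": 30,     # 30-60 seconds
    "product_catalog": 60,   # 1-5 minutes
    "user_profile": 300,     # 5-15 minutes
}


class TTLCache:
    """Entries expire a fixed number of seconds after they are written."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:      # expired: drop lazily on read
            del self._entries[key]
            return None
        return value
```

Expiry here is lazy (checked on read), so entries that are never read again stay in memory; a real cache would pair this with a size bound or a background sweep.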
HTTP Caching
Cache-Control Directives
| Directive | Meaning |
|---|---|
| `max-age=N` | Cache for N seconds |
| `s-maxage=N` | Max age for CDN/shared caches (overrides max-age there) |
| `no-cache` | May be stored, but must be revalidated before each reuse |
| `no-store` | Never cache anywhere |
| `must-revalidate` | Once stale, must revalidate |
| `private` | Only the browser can cache, not the CDN |
| `public` | Any cache can store |
| `immutable` | Content will never change (within max-age) |
| `stale-while-revalidate=N` | Serve stale for up to N seconds while fetching fresh |
Common Recipes
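    # Immutable static assets (hashed filenames)
    Cache-Control: public, max-age=31536000, immutable

    # API responses, CDN-cached, background refresh
    Cache-Control: public, s-maxage=60, stale-while-revalidate=300

    # Personalized data, browser only
    Cache-Control: private, max-age=0, must-revalidate
    ETag: "abc123"

    # Never cache (auth tokens, sensitive data)
    Cache-Control: no-store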
Conditional Requests
| Mechanism | Request Header | Response Header | How It Works |
|---|---|---|---|
| ETag | `If-None-Match: "abc"` | `ETag: "abc"` | Hash-based: 304 if match |
| Last-Modified | `If-Modified-Since: <date>` | `Last-Modified: <date>` | Date-based: 304 if unchanged |
Prefer ETag over Last-Modified: ETags detect content changes regardless of timestamp granularity, whereas HTTP dates only resolve to one second.
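The ETag exchange can be sketched framework-free as follows. `make_etag` and `respond` are hypothetical helpers, and deriving the tag from a truncated SHA-256 of the body is one possible choice among many; the sketch also splits `If-None-Match` on commas, which assumes the tags themselves contain none.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    """If-None-Match uses weak comparison, so a W/ prefix is ignored."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); 304 with an empty body when the tag matches."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "private, max-age=0, must-revalidate"}
    if if_none_match is not None:
        candidates = [_strip_weak(t) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""   # client copy is still current
    return 200, headers, body
```

A client that echoes the previous `ETag` back in `If-None-Match` gets a 304 with no body as long as the content is unchanged, and a full 200 once it changes.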
Application Caching
| Solution | Speed | Shared Across Processes | When to Use |
|---|---|---|---|
| In-memory LRU | Fastest | No | Single-process, bounded memory, hot data |
| Redis | Sub-ms (network) | Yes | Production default: TTL, pub/sub, persistence |
| Memcached | Sub-ms (network) | Yes | Simple key-value at extreme scale |
| SQLite | Fast (disk) | No | Embedded apps, edge caching |
Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, hashes, lists, sets, sorted sets | Strings only |
| Persistence | AOF, RDB snapshots | None |
| Pub/Sub | Yes | No |
| Max value size | 512 MB | 1 MB (default) |
| Verdict | Default choice | Pure cache at extreme scale |
Distributed Caching
| Concern | Solution |
|---|---|
| Partitioning | Consistent hashing: minimal reshuffling on node changes |
| Replication | Primary-replica: writes to primary, reads from replicas |
| Failover | Redis Sentinel or Cluster auto-failover |
Rule of thumb: 3 primaries + 3 replicas minimum for production Redis Cluster.
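The "minimal reshuffling" property of consistent hashing is easier to see in code. The sketch below is a generic hash ring with virtual nodes, not how Redis Cluster itself partitions data (Redis Cluster uses fixed hash slots); `HashRing`, `vnodes`, and the node names are hypothetical.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each node owns `vnodes` points to smooth the split."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._ring: List[int] = []        # sorted ring positions
        self._owner: Dict[int, str] = {}  # position -> node
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._ring, point)

    def remove(self, node: str) -> None:
        self._ring = [p for p in self._ring if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's position to the next node point."""
        idx = bisect.bisect(self._ring, _hash(key)) % len(self._ring)
        return self._owner[self._ring[idx]]
```

When a node joins, the only keys that change owner are the ones that move to the new node; every other key stays where it was, which is what keeps most of the cache warm during a resize.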
Cache Eviction Policies
| Policy | How It Works | When to Use |
|---|---|---|
| LRU | Evicts least recently accessed | Default; general purpose |
| LFU | Evicts least frequently accessed | Skewed popularity distributions |
| FIFO | Evicts oldest entry | Simple, time-ordered data |
| TTL | Evicts after fixed duration | Data with known freshness window |
Redis defaults to `noeviction`, which rejects writes once the memory limit is reached. For production caches, set `maxmemory` and set `maxmemory-policy` to `allkeys-lru` or `volatile-lru`.
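For the in-process case, LRU eviction can be sketched with `collections.OrderedDict`. The `LRUCache` class and its `capacity` parameter are illustrative; for caching function results, the standard library's `functools.lru_cache` already covers the common case.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._data:
            return None
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)   # evict the least recently used

    def __len__(self) -> int:
        return len(self._data)
```

The fixed `capacity` is the point: it keeps memory bounded regardless of how many distinct keys the workload touches.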
Caching Layers
    Browser Cache → CDN → Load Balancer → Application Cache → Database Cache → Database
| Layer | What to Cache | Invalidation |
|---|---|---|
| Browser | Static assets, API responses | Versioned URLs, Cache-Control |
| CDN | Static files, public API responses | Purge API, surrogate keys |
| Application | Computed results, DB queries, external API | Event-driven, TTL |
| Database | Query plans, buffer pool, materialized views | ANALYZE, manual refresh |
Cache Stampede Prevention
When a hot key expires, hundreds of requests simultaneously hit the database.
| Technique | How It Works |
|---|---|
| Mutex / Lock | First request locks, fetches, populates; others wait |
| Probabilistic early expiration | Random chance of refreshing before TTL |
| Request coalescing | Deduplicate in-flight requests for same key |
| Stale-while-revalidate | Serve stale, refresh asynchronously |
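The mutex row can be sketched with a per-key lock and a double check inside it. This is an in-process illustration only: `SingleFlightCache` and `loader` are hypothetical names, and across multiple processes the lock would need to live in a shared store instead.

```python
import threading
from typing import Any, Callable, Dict


class SingleFlightCache:
    """On a miss, only one thread per key calls the loader; the rest wait for it."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader                       # hypothetical expensive DB read
        self._cache: Dict[str, Any] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()              # protects the lock table itself

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key: str) -> Any:
        if key in self._cache:                      # fast path: no locking on a hit
            return self._cache[key]
        with self._lock_for(key):
            if key in self._cache:                  # another thread filled it while we waited
                return self._cache[key]
            value = self._loader(key)
            self._cache[key] = value
            return value
```

The second check inside the lock is what prevents the stampede: threads that queued behind the first loader find the value already populated and return without touching the database.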
Cache Warming
| Strategy | When to Use |
|---|---|
| On-deploy warm-up | Predictable key set, latency-sensitive |
| Background job | Reports, dashboards, catalog data |
| Shadow traffic | Cache migration, new infrastructure |
| Priority-based | Limited warm-up time budget |
Cold start impact: A full cache flush can increase DB load 10–100x. Always warm gradually or use stale-while-revalidate.
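Priority-based warming under a time budget might look like the sketch below. The `warm_cache` function, the `(key, priority)` tuple shape, and the injectable `clock` are assumptions made for illustration; how priorities are assigned (for example, from recent access counts) is left to the caller.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple


def warm_cache(
    cache: Dict[str, Any],
    candidates: Iterable[Tuple[str, int]],
    loader: Callable[[str], Any],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Load keys in descending priority until the time budget is spent.

    Returns the keys that were warmed, in the order they were loaded.
    """
    deadline = clock() + budget_seconds
    warmed: List[str] = []
    for key, _priority in sorted(candidates, key=lambda c: c[1], reverse=True):
        if clock() >= deadline:      # budget exhausted: leave the rest to lazy fills
            break
        cache[key] = loader(key)
        warmed.append(key)
    return warmed
```

Keys that miss the budget are simply filled on first access, so the highest-value entries are hot at startup without delaying the deploy indefinitely.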
Monitoring
| Metric | Healthy Range | Action if Unhealthy |
|---|---|---|
| Hit rate | > 90% | Low → cache too small, wrong TTL, bad key design |
| Eviction rate | Near 0 at steady state | High → increase memory or tune policy |
| Latency (p99) | < 1ms (Redis) | High → network issue, large values, hot key |
| Memory usage | < 80% of max | Approaching max → scale up or tune eviction |
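Hit rate is simple to track alongside any cache wrapper. The `CacheStats` dataclass below is a hypothetical helper; the 90% default threshold comes from the table above, and in practice these counters would be exported to whatever metrics system is already in use.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Running hit/miss counters with a derived hit rate."""

    hits: int = 0
    misses: int = 0

    def record(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0   # no traffic yet: report 0.0

    def is_healthy(self, min_hit_rate: float = 0.90) -> bool:
        """True when the hit rate is above the threshold from the table."""
        return self.hit_rate > min_hit_rate
```

Calling `record(True)` on each hit and `record(False)` on each miss in the cache's read path is enough to drive an alert when the rate drops.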
NEVER Do
- NEVER cache without a TTL or invalidation plan: data rots, and every entry needs an expiry path
- NEVER treat cache as durable storage: caches evict, crash, and restart; always fall back to the source of truth
- NEVER cache sensitive data (tokens, PII) without encryption: a cache breach exposes everything in plaintext
- NEVER ignore cache stampede on hot keys: one expired popular key can take down your database
- NEVER use unbounded in-memory caches in production: memory grows until the process is OOM-killed
- NEVER cache mutable data with immutable Cache-Control: browsers will not re-fetch until max-age expires
- NEVER skip monitoring hit/miss rates: you won't know if your cache is helping or hurting