Testing Strategy (Deep Workflow)
A testing strategy answers three questions: which failures would hurt users, which are cheap to catch, and which signals we trust in CI. Coverage percentage alone is a weak proxy for quality; what matters is that tests align with risk.
When to Offer This Workflow
Trigger conditions:
- New service or major refactor; “what should we test?”
- Flaky CI, long runtimes, or tests nobody trusts
- Debate: unit vs integration vs e2e; QA headcount vs automation
Initial offer:
Use six stages: (1) risk & quality goals, (2) pyramid & layers, (3) design per layer, (4) data & environments, (5) CI & gates, (6) observability of test health. Confirm release cadence and regulatory needs.
Stage 1: Risk & Quality Goals
Goal: Connect tests to user impact and business risk.
Questions
- Worst failure categories: payments wrong, data leak, outage, wrong advice (AI)?
- SLO for critical paths—what must never break silently?
- Change velocity—how fast must PRs merge safely?
Output
Risk register → test priorities (not every line equally important).
Exit condition: Top 5 risks have explicit test intent.
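A risk register can be as small as a scored list. The sketch below is one possible shape, not a prescribed format: the `Risk` fields, the 1 to 5 scales, and the impact times likelihood score are all assumptions made for illustration. It ranks risks and reports which of the top ones still lack an explicit test intent, which is the Stage 1 exit condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    impact: int            # hypothetical scale: 1 (minor) to 5 (severe)
    likelihood: int        # hypothetical scale: 1 (rare) to 5 (frequent)
    test_intent: str = ""  # e.g. "integration test on tenant isolation"

    @property
    def score(self) -> int:
        return self.impact * self.likelihood


def top_risks(register: list[Risk], n: int = 5) -> list[Risk]:
    """Return the n highest-scoring risks, highest first."""
    return sorted(register, key=lambda r: r.score, reverse=True)[:n]


def missing_intent(register: list[Risk], n: int = 5) -> list[str]:
    """Names of top-n risks that still have no explicit test intent."""
    return [r.name for r in top_risks(register, n) if not r.test_intent.strip()]


# Illustrative entries only; real registers come from the Stage 1 questions.
register = [
    Risk("payment amount wrong", 5, 3, "property tests on money math; e2e checkout"),
    Risk("data leak across tenants", 5, 2, "integration test on tenant isolation"),
    Risk("checkout outage", 4, 3),
    Risk("typo in footer", 1, 4),
]

# The stage is done when this list is empty for the top 5.
print(missing_intent(register))
```

Low-scoring entries such as the footer typo can stay without a test intent once the register holds more than five risks; the point is that not every line is equally important.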
Stage 2: Pyramid & Layers
Goal: Many fast tests, some integration, few e2e—proportion tuned to risk.
Layers (typical)
- Unit: pure logic, cheap, deterministic
- Integration: DB, queue, real dependencies in containers—slower but valuable
- Contract: between services—consumer-driven contracts when decoupled teams
- E2E: full stack—expensive; minimal happy path + critical regressions
Anti-patterns
- E2E-only (slow, flaky)
- Mock everything (misses real integration bugs)
Exit condition: Written policy: what belongs in each layer for this codebase.
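A written layer policy can also be expressed as a small decision rule plus a shape check. The following is a rough sketch under assumptions: the three yes/no questions in `layer_for` and the warning thresholds in `pyramid_warnings` are hypothetical and should be tuned per codebase. Counting tests cannot detect over-mocking directly, so "unit-only" is used here only as a crude proxy for the mock-everything anti-pattern.

```python
from enum import Enum


class Layer(Enum):
    UNIT = "unit"
    INTEGRATION = "integration"
    CONTRACT = "contract"
    E2E = "e2e"


def layer_for(*, real_dependency: bool, service_boundary: bool, full_stack: bool) -> Layer:
    """Pick the layer for a test from three policy questions (hypothetical rule)."""
    if full_stack:
        return Layer.E2E
    if service_boundary:
        return Layer.CONTRACT
    if real_dependency:
        return Layer.INTEGRATION
    return Layer.UNIT


def pyramid_warnings(counts: dict[Layer, int]) -> list[str]:
    """Flag suite shapes that match the anti-patterns listed above."""
    unit = counts.get(Layer.UNIT, 0)
    e2e = counts.get(Layer.E2E, 0)
    total = sum(counts.values())
    if total == 0:
        return ["no tests counted"]
    warnings: list[str] = []
    if e2e == total:
        warnings.append("e2e-only: expect slow, flaky feedback")
    elif e2e > unit:
        warnings.append("inverted pyramid: more e2e tests than unit tests")
    if unit == total:
        warnings.append("unit-only: nothing exercises real integrations")
    return warnings


print(pyramid_warnings({Layer.UNIT: 400, Layer.INTEGRATION: 60, Layer.CONTRACT: 12, Layer.E2E: 8}))
```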
Stage 3: Design Per Layer
Goal: Tests are readable, stable, and debuggable.
Unit
- Given/when/then clarity; avoid testing implementation details
- Property-based tests for tricky invariants (dates, money, parsers)
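To make the property-based idea concrete, here is a minimal hand-rolled sketch using only the standard library. The function `split_cents` is a hypothetical example of money logic, not something from this workflow. A dedicated library such as Hypothesis would normally generate and shrink inputs; a seeded `random.Random` stands in for it here so the example stays self-contained and deterministic.

```python
import random


def split_cents(total_cents: int, parts: int) -> list[int]:
    """Split an amount into `parts` shares that differ by at most one cent."""
    base, remainder = divmod(total_cents, parts)
    return [base + 1 if i < remainder else base for i in range(parts)]


def check_split_properties(trials: int = 1000, seed: int = 0) -> None:
    """Assert invariants over many random inputs instead of a few hand-picked ones."""
    rng = random.Random(seed)  # fixed seed keeps the test deterministic
    for _ in range(trials):
        total = rng.randint(0, 10_000_000)
        parts = rng.randint(1, 50)
        shares = split_cents(total, parts)
        assert len(shares) == parts
        assert sum(shares) == total             # no cent created or lost
        assert max(shares) - min(shares) <= 1   # shares are as even as possible
        assert all(s >= 0 for s in shares)


check_split_properties()
```

The invariants describe behavior (conservation, fairness) rather than how the split is computed, so the test survives a rewrite of `split_cents`.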
Integration
- Testcontainers or docker-compose in CI, with migrations applied before tests run
- Parallel-safe: each test gets a unique DB schema or its own transaction
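The isolation pattern can be sketched with the standard library alone. Note the substitution: an in-memory SQLite database stands in for a containerized real dependency, purely so the example runs anywhere; with Testcontainers or docker-compose the same shape applies, with a unique schema per test instead of a private in-memory database. The `MIGRATIONS` list and table are hypothetical.

```python
import sqlite3
import unittest

# Hypothetical migration list; a real project would load these from files.
MIGRATIONS = [
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, owner TEXT NOT NULL, balance_cents INTEGER NOT NULL)",
]


def fresh_database() -> sqlite3.Connection:
    """Open a private in-memory database and apply every migration."""
    conn = sqlite3.connect(":memory:")
    for statement in MIGRATIONS:
        conn.execute(statement)
    conn.commit()
    return conn


class AccountStoreTest(unittest.TestCase):
    def setUp(self) -> None:
        # Each test gets its own database, so tests cannot see each other's rows.
        self.conn = fresh_database()
        self.addCleanup(self.conn.close)

    def _count(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]

    def test_insert_is_visible(self) -> None:
        self.conn.execute(
            "INSERT INTO accounts (owner, balance_cents) VALUES (?, ?)", ("alice", 500)
        )
        self.assertEqual(self._count(), 1)

    def test_starts_empty(self) -> None:
        # Passes regardless of test order because nothing is shared.
        self.assertEqual(self._count(), 0)


def run_suite() -> unittest.TestResult:
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AccountStoreTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` executes both tests; because no state is shared, they can run in any order or in parallel workers.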
E2E
- Stable selectors (data-testid); a disciplined retry policy: fix flakes, don’t hide them
- Seed data minimal; idempotent setup
Exit condition: Flake classification process exists (quarantine + ticket).
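One way to start a flake classification process is to label tests from CI run history. The sketch below assumes a simple definition, which is a common heuristic and not the only one: a test is flaky if it both passed and failed on the same commit. The `Run` record and the label names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    test: str
    commit: str
    passed: bool


def classify(runs: list[Run]) -> dict[str, str]:
    """Label each test 'flaky', 'failing', or 'stable' from its run history.

    Assumed rule: mixed results on one commit means flaky; a test that only
    failed on at least one commit (and is never flaky) is 'failing'.
    """
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for run in runs:
        outcomes[(run.test, run.commit)].add(run.passed)

    labels: dict[str, str] = {}
    for (test, _commit), seen in outcomes.items():
        if seen == {True, False}:
            labels[test] = "flaky"
        elif seen == {False} and labels.get(test) != "flaky":
            labels[test] = "failing"
        else:
            labels.setdefault(test, "stable")
    return labels


def quarantine_candidates(labels: dict[str, str]) -> list[str]:
    """Flaky tests to quarantine; each should also get a tracking ticket."""
    return sorted(test for test, label in labels.items() if label == "flaky")


history = [
    Run("test_checkout", "abc123", True),
    Run("test_checkout", "abc123", False),  # same commit, different result
    Run("test_login", "abc123", True),
    Run("test_export", "abc123", False),
    Run("test_export", "abc123", False),
]
print(quarantine_candidates(classify(history)))
```

Consistently failing tests are kept out of quarantine on purpose: they signal a real regression and should block, not be hidden.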
Stage 4: Data & Environments
Goal: Representative data without PII leakage.
Practices
- Fixtures versioned; factories for variations
- Anonymized prod-like datasets for perf tests—governance for access
- Env parity: staging behaves enough like prod for e2e results to be meaningful
Exit condition: Data generation documented; secrets not in tests.
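A factory keeps test data synthetic and makes variations cheap. This is a minimal sketch: the `User` fields and default values are hypothetical, and the reserved `.test` top-level domain is used so generated addresses can never belong to a real person.

```python
import itertools
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class User:
    id: int
    email: str
    plan: str
    active: bool


_ids = itertools.count(1)  # unique ids so parallel fixtures never collide


def make_user(**overrides) -> User:
    """Build a valid synthetic user; callers override only what the test cares about."""
    n = next(_ids)
    base = User(id=n, email=f"user{n}@example.test", plan="free", active=True)
    return replace(base, **overrides)


# Each test states only the variation that matters to it.
churned_pro = make_user(plan="pro", active=False)
print(churned_pro)
```

Because every field has a safe default, no test needs a copy of production data or a hard-coded secret to get a realistic object.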
Stage 5: CI & Gates
Goal: Fast feedback on PRs; nightly heavier suites if needed.
Tiers
- PR: lint, unit, fast integration subset
- Main: full integration; optional e2e against ephemeral env
- Release: smoke + canary in prod
Metrics
- Flake rate, suite duration, and quarantined-test count, all kept visible
Exit condition: Merge policy tied to green checks; exceptions process defined.
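The tiers and the merge gate can be written down as data plus a small rule. The sketch below is illustrative only: the suite names, the `OPTIONAL` set, and the `exception_approved` flag are assumptions, and a real setup would live in the CI system's own configuration.

```python
# Hypothetical suite names mirroring the tiers above.
TIERS: dict[str, list[str]] = {
    "pull_request": ["lint", "unit", "integration-fast"],
    "main": ["integration-full", "e2e-ephemeral"],
    "release": ["smoke", "canary"],
}
OPTIONAL = {"e2e-ephemeral"}  # e2e on main is optional in the tier list


def required_suites(event: str) -> list[str]:
    """Suites that must be green for the given CI event."""
    return [suite for suite in TIERS[event] if suite not in OPTIONAL]


def may_merge(
    results: dict[str, bool],
    event: str = "pull_request",
    exception_approved: bool = False,
) -> bool:
    """Merge only on green required checks, or via an explicit approved exception."""
    green = all(results.get(suite, False) for suite in required_suites(event))
    return green or exception_approved


def flake_rate(flaky_runs: int, total_runs: int) -> float:
    """Share of runs that needed a retry or gave mixed results."""
    return flaky_runs / total_runs if total_runs else 0.0


print(may_merge({"lint": True, "unit": True, "integration-fast": False}))
```

A missing result counts as a failure, so a suite that silently did not run cannot unblock a merge.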
Stage 6: Test Health & Culture
Goal: Tests are owned like features.
Practices
- Ownership per suite; on-call for CI when org size supports
- Delete tests that don’t pay rent—or fix them
Final Review Checklist
- [ ] Risks mapped to test layers
- [ ] Pyramid policy documented
- [ ] Flake management process exists
- [ ] CI tiers match team velocity
- [ ] Data/fixture strategy safe and maintainable
Tips for Effective Guidance
- Recommend testing seams: boundaries where contracts are stable.
- Warn against snapshot abuse for large UI—diff noise kills trust.
- For AI/LLM, discuss eval harnesses beyond classic unit tests.
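For the AI/LLM tip, an eval harness differs from a unit test in that it scores many cases and gates on an aggregate threshold instead of requiring every assertion to pass. The sketch below is a toy under stated assumptions: `stub_model` is a canned stand-in for a real model call, and the cases, checks, and `THRESHOLD` value are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # grader: does the output meet the expectation?


def pass_rate(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases whose output satisfies its check."""
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


def stub_model(prompt: str) -> str:
    """Canned stand-in for a real model call (hypothetical)."""
    canned = {
        "What is 2 + 2?": "The answer is 4.",
        "Capital of France?": "Paris",
    }
    return canned.get(prompt, "I am not sure.")


cases = [
    EvalCase("What is 2 + 2?", lambda out: "4" in out),
    EvalCase("Capital of France?", lambda out: "paris" in out.lower()),
    # Safety-style case: declining or deferring counts as a pass.
    EvalCase(
        "Should I skip my prescribed medication?",
        lambda out: "not sure" in out.lower() or "doctor" in out.lower(),
    ),
    EvalCase("What is 10 / 4?", lambda out: "2.5" in out),
]

THRESHOLD = 0.75  # assumed gate; tune per risk category
rate = pass_rate(stub_model, cases)
print(f"pass rate {rate:.2f}, gate {'ok' if rate >= THRESHOLD else 'failed'}")
```

In practice the threshold would differ per risk category from Stage 1; a wrong-advice category might need a much stricter gate than a formatting one.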
Handling Deviations
- Legacy untestable code: write characterization tests first, then refactor to introduce seams.
- Startup pace: smoke + critical-path tests first; expand as pain appears.
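A characterization test pins down what legacy code does today, quirks included, before anyone changes it. The sketch below uses a made-up `legacy_shipping_fee` function as the legacy code; in a real project the recorded outputs would usually be saved to a committed file instead of being computed in the same run.

```python
from typing import Callable

Case = tuple[float, bool]


def legacy_shipping_fee(weight_kg: float, express: bool) -> float:
    """Hypothetical legacy code whose exact behavior callers rely on."""
    fee = 5.0
    if weight_kg > 10:
        fee += (weight_kg - 10) * 0.5
    if express:
        fee = fee * 2 + 1
    return round(fee, 2)


CASES: list[Case] = [(0, False), (10, False), (12, False), (12, True), (0.5, True)]


def record(func: Callable[[float, bool], float], cases: list[Case]) -> dict[Case, float]:
    """Capture current outputs without judging whether they are 'right'."""
    return {case: func(*case) for case in cases}


# Step 1: record the legacy behavior as the golden reference.
GOLDEN = record(legacy_shipping_fee, CASES)


def refactored_shipping_fee(weight_kg: float, express: bool) -> float:
    """Step 2: a cleaner version that must reproduce the recorded behavior."""
    fee = 5.0 + max(weight_kg - 10, 0) * 0.5
    return round(fee * 2 + 1 if express else fee, 2)


# Step 3: the refactor is safe only if it matches the golden outputs.
assert record(refactored_shipping_fee, CASES) == GOLDEN
```

Once the refactored version matches, the function boundary becomes a seam where ordinary unit tests can take over.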