When to Load
User asks about: analyzing data, finding patterns, understanding metrics, testing hypotheses, cohort analysis, A/B testing, churn analysis, statistical significance.
Core Principle
Analysis without a decision is just arithmetic. Always clarify: What would change if this analysis shows X vs Y?
Methodology First
Before touching data:
1. What decision is this analysis supporting?
2. What would change your mind? (the real question)
3. What data do you actually have vs. what you wish you had?
4. What timeframe is relevant?
Statistical Rigor Checklist
- [ ] Sample size sufficient? (small N = wide confidence intervals)
- [ ] Comparison groups fair? (same time period, similar conditions)
- [ ] Multiple comparisons? (at α = 0.05, 20 tests yield about 1 "significant" result by chance alone)
- [ ] Effect size meaningful? (statistically significant ≠ practically important)
- [ ] Uncertainty quantified? ("12-18% lift" not just "15% lift")
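
Two of the checklist items can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it assumes a normal-approximation interval for a proportion and independent tests, and the function names (`margin_of_error`, `familywise_error_rate`, `bonferroni_alpha`) are hypothetical helpers, not part of any referenced file.

```python
import math
from statistics import NormalDist


def margin_of_error(rate, n, confidence=0.95):
    """Half-width of a normal-approximation CI for a proportion.

    Assumption: n is large enough for the normal approximation to hold.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(rate * (1 - rate) / n)


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_alpha(num_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / num_tests


# The same observed 15% rate is far less certain at small N.
for n in (100, 1_000, 10_000):
    print(f"n={n:>6}: 15% +/- {margin_of_error(0.15, n):.1%}")

# Running 20 uncorrected tests makes a spurious "win" more likely than not.
print(f"P(at least one false positive in 20 tests): {familywise_error_rate(20):.0%}")
print(f"Bonferroni per-test alpha for 20 tests: {bonferroni_alpha(20):.4f}")
```

Bonferroni is the simplest correction and tends to be conservative; it is used here only because it fits in one line.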
Analytical Pitfalls to Catch
| Pitfall | What it looks like | How to avoid |
|---|---|---|
| Simpson's Paradox | Trend reverses when you segment | Always check by key dimensions |
| Survivorship bias | Only analyzing current users | Include churned/failed in dataset |
| Comparing unequal periods | Feb (28d) vs March (31d) | Normalize to per-day or same-length windows |
| p-hacking | Testing until something is "significant" | Pre-register hypotheses or adjust for multiple comparisons |
| Spurious correlation in time series | Both metrics went up, so they are assumed "related" | Check whether the relationship survives controlling for time (e.g. detrending) |
| Aggregating percentages | Averaging percentages directly | Re-calculate from underlying totals |
For detailed examples of each pitfall, see pitfalls.md.
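
As a quick illustration of two rows in the table (Simpson's Paradox and aggregating percentages), the sketch below uses made-up counts for two hypothetical variants, "A" and "B", split by a hypothetical `desktop`/`mobile` segment. None of these numbers come from real data.

```python
# Hypothetical (conversions, visitors) counts per variant and segment.
data = {
    "A": {"desktop": (90, 100), "mobile": (300, 1000)},
    "B": {"desktop": (800, 1000), "mobile": (20, 100)},
}


def pooled_rate(segments):
    """Recompute the overall rate from the underlying totals."""
    conversions = sum(c for c, _ in segments.values())
    visitors = sum(v for _, v in segments.values())
    return conversions / visitors


def mean_of_rates(segments):
    """Naive average of per-segment rates; ignores segment size."""
    rates = [c / v for c, v in segments.values()]
    return sum(rates) / len(rates)


for variant, segments in data.items():
    per_segment = {name: f"{c / v:.0%}" for name, (c, v) in segments.items()}
    print(
        f"{variant}: by segment {per_segment}, "
        f"pooled {pooled_rate(segments):.1%}, "
        f"naive mean {mean_of_rates(segments):.1%}"
    )
```

In these made-up counts, A has the higher rate within each segment while B has the higher pooled rate, because the variants have very different segment mixes. The naive mean of percentages also differs from the pooled rate, which is why rates should be recomputed from totals.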
Approach Selection
| Question type | Approach | Key output |
|---|---|---|
| "Is X different from Y?" | Hypothesis test | p-value + effect size + CI |
| "What predicts Z?" | Regression/correlation | Coefficients + R² + residual check |
| "How do users behave over time?" | Cohort analysis | Retention curves by cohort |
| "Are these groups different?" | Segmentation | Profiles + statistical comparison |
| "What's unusual?" | Anomaly detection | Flagged points + context |
For technique details and when to use each, see techniques.md.
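
For the first row of the table, the three key outputs (p-value, effect size, CI) can be produced together. The following is a minimal sketch, assuming a two-sided two-proportion z-test on conversion counts with large enough samples for the normal approximation. `two_proportion_test` and its input counts are hypothetical, and Cohen's h is one possible effect-size choice among several.

```python
import math
from statistics import NormalDist


def two_proportion_test(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-sided z-test for p_b - p_a, with a Wald CI and Cohen's h.

    Assumptions: independent samples, at least one conversion and one
    non-conversion overall (otherwise the standard error is zero).
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error under the null hypothesis p_a == p_b.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_null = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_null
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)

    # Cohen's h: effect size on the arcsine-transformed scale.
    cohens_h = 2 * math.asin(math.sqrt(p_b)) - 2 * math.asin(math.sqrt(p_a))

    return {"diff": diff, "p_value": p_value, "ci": ci, "cohens_h": cohens_h}


# Made-up counts: 10.0% vs 12.0% conversion with 10,000 users per arm.
result = two_proportion_test(1000, 10_000, 1200, 10_000)
low, high = result["ci"]
print(f"difference: {result['diff']:.1%} (95% CI {low:.1%} to {high:.1%})")
print(f"p-value: {result['p_value']:.2g}, Cohen's h: {result['cohens_h']:.3f}")
```

With these made-up counts the p-value is very small while Cohen's h stays well under 0.2, a reminder that a statistically significant difference can still be a small effect.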
Output Standards
1. Lead with the insight, not the methodology
2. Quantify uncertainty: ranges, not point estimates
3. State limitations: what this analysis can't tell you
4. Recommend next steps: what would strengthen the conclusion
Red Flags to Escalate
- User wants to "prove" a predetermined conclusion
- Sample size too small for reliable inference
- Data quality issues that invalidate analysis
- Confounders that can't be controlled for