Code Review Checklist
Thorough, structured approach to reviewing code. Work through each dimension systematically rather than scanning randomly.
Installation
OpenClaw / Moltbot / Clawbot
CODEBLOCK0
Review Dimensions
| Dimension | Focus | Priority |
|---|
| Security | Vulnerabilities, auth, data exposure | Critical |
| Performance |
Speed, memory, scalability bottlenecks | High |
| Correctness | Logic errors, edge cases, data integrity | High |
| Maintainability | Readability, structure, future-proofing | Medium |
| Testing | Coverage, quality, reliability of tests | Medium |
| Accessibility | WCAG compliance, keyboard nav, screen readers | Medium |
| Documentation | Comments, API docs, changelog entries | Low |
Security Checklist
Review every change for these vulnerabilities:
- - [ ] SQL Injection — All queries use parameterized statements or an ORM; no string concatenation with user input
- [ ] XSS — User-provided content is escaped/sanitized before rendering;
dangerouslySetInnerHTML or equivalent is justified and safe - [ ] CSRF Protection — State-changing requests require valid CSRF tokens; SameSite cookie attributes are set
- [ ] Authentication — Every protected endpoint verifies the user is authenticated before processing
- [ ] Authorization — Resource access is scoped to the requesting user's permissions; no IDOR vulnerabilities
- [ ] Input Validation — All external input (params, headers, body, files) is validated for type, length, format, and range on the server side
- [ ] Secrets Management — No API keys, passwords, tokens, or credentials in source code; secrets come from environment variables or a vault
- [ ] Dependency Safety — New dependencies are from trusted sources, actively maintained, and free of known CVEs
- [ ] Sensitive Data — PII, tokens, and secrets are never logged, included in error messages, or returned in API responses
- [ ] Rate Limiting — Public and auth endpoints have rate limits to prevent brute-force and abuse
- [ ] File Upload Safety — Uploaded files are validated for type and size, stored outside the webroot, and served with safe Content-Type headers
- [ ] HTTP Security Headers — Content-Security-Policy, X-Content-Type-Options, Strict-Transport-Security are set
Performance Checklist
- - [ ] N+1 Queries — Database access patterns are batched or joined; no loops issuing individual queries
- [ ] Unnecessary Re-renders — Components only re-render when their relevant state/props change; memoization is applied where measurable
- [ ] Memory Leaks — Event listeners, subscriptions, timers, and intervals are cleaned up on unmount/disposal
- [ ] Bundle Size — New dependencies are tree-shakeable; large libraries are loaded dynamically; no full-library imports for a single function
- [ ] Lazy Loading — Heavy components, routes, and below-the-fold content use lazy loading / code splitting
- [ ] Caching Strategy — Expensive computations and API responses use appropriate caching (memoization, HTTP cache headers, Redis)
- [ ] Database Indexing — Queries filter/sort on indexed columns; new queries have been checked with EXPLAIN
- [ ] Pagination — List endpoints and queries use pagination or cursor-based fetching; no unbounded SELECT *
- [ ] Async Operations — Long-running tasks are offloaded to background jobs or queues rather than blocking request threads
- [ ] Image & Asset Optimization — Images are properly sized, use modern formats (WebP/AVIF), and leverage CDN delivery
Correctness Checklist
- - [ ] Edge Cases — Empty arrays, empty strings, zero values, negative numbers, and maximum values are handled
- [ ] Null/Undefined Handling — Nullable values are checked before access; optional chaining or guards prevent runtime errors
- [ ] Off-by-One Errors — Loop bounds, array slicing, pagination offsets, and range calculations are verified
- [ ] Race Conditions — Concurrent access to shared state uses locks, transactions, or atomic operations
- [ ] Timezone Handling — Dates are stored in UTC; display conversion happens at the presentation layer
- [ ] Unicode & Encoding — String operations handle multi-byte characters; text encoding is explicit (UTF-8)
- [ ] Integer Overflow / Precision — Arithmetic on large numbers or currency uses appropriate types (BigInt, Decimal)
- [ ] Error Propagation — Errors from async calls and external services are caught and handled; promises are never silently swallowed
- [ ] State Consistency — Multi-step mutations are transactional; partial failures leave the system in a valid state
- [ ] Boundary Validation — Values at the boundaries of valid ranges (min, max, exactly-at-limit) are tested
Maintainability Checklist
- - [ ] Naming Clarity — Variables, functions, and classes have descriptive names that reveal intent
- [ ] Single Responsibility — Each function/class/module does one thing; changes to one concern don't ripple through unrelated code
- [ ] DRY — Duplicated logic is extracted into shared utilities; copy-pasted blocks are consolidated
- [ ] Cyclomatic Complexity — Functions have low branching complexity; deeply nested chains are refactored
- [ ] Error Handling — Errors are caught at appropriate boundaries, logged with context, and surfaced meaningfully
- [ ] Dead Code Removal — Commented-out code, unused imports, unreachable branches, and obsolete feature flags are removed
- [ ] Magic Numbers & Strings — Literal values are extracted into named constants with clear semantics
- [ ] Consistent Patterns — New code follows the conventions already established in the codebase
- [ ] Function Length — Functions are short enough to understand at a glance; long functions are decomposed
- [ ] Dependency Direction — Dependencies point inward (infrastructure to domain); core logic does not import from UI or framework layers
Testing Checklist
- - [ ] Test Coverage — New logic paths have corresponding tests; critical paths have both happy-path and failure-case tests
- [ ] Edge Case Tests — Tests cover boundary values, empty inputs, nulls, and error conditions
- [ ] No Flaky Tests — Tests are deterministic; no reliance on timing, external services, or shared mutable state
- [ ] Test Independence — Each test sets up its own state and tears it down; test order does not affect results
- [ ] Meaningful Assertions — Tests assert on behavior and outcomes, not implementation details
- [ ] Test Readability — Tests follow Arrange-Act-Assert; test names describe the scenario and expected outcome
- [ ] Mocking Discipline — Only external boundaries (network, DB, filesystem) are mocked
- [ ] Regression Tests — Bug fixes include a test that reproduces the original bug and proves it is resolved
Review Process
Work through the code in three passes. Do not try to catch everything in one read.
| Pass | Focus | Time | What to Look For |
|---|
| First | High-level structure | 2-5 min | Architecture fit, file organization, API design, overall approach |
| Second |
Line-by-line detail | Bulk | Logic errors, security issues, performance problems, edge cases |
| Third | Edge cases & hardening | 5 min | Failure modes, concurrency, boundary values, missing tests |
First Pass (2-5 minutes)
- 1. Read the PR description and linked issue
- Scan the file list — does the change scope make sense?
- Check the overall approach — is this the right solution to the problem?
- Verify the change does not introduce architectural drift
Second Pass (bulk of review time)
- 1. Read each file diff top to bottom
- Check every function change against the checklists above
- Verify error handling at every I/O boundary
- Flag anything that makes you pause — trust your instincts
Third Pass (5 minutes)
- 1. Think about what could go wrong in production
- Check for missing tests on the code paths you flagged
- Verify rollback safety — can this change be reverted without data loss?
- Confirm documentation and changelog are updated if needed
Severity Levels
Classify every comment by severity so the author knows what blocks merge.
| Level | Label | Meaning | Blocks Merge? |
|---|
| Critical | INLINECODE1 | Security vulnerability, data loss, or crash in production | Yes |
| Major |
[MAJOR] | Bug, logic error, or significant performance regression | Yes |
| Minor |
[MINOR] | Improvement that would reduce future maintenance cost | No |
| Nitpick |
[NIT] | Style preference, naming suggestion, or trivial cleanup | No |
Always prefix your review comment with the severity label. This removes ambiguity about what matters.
Giving Feedback
Principles
- - Be specific — Point to the exact line and explain the issue, not just "this is wrong"
- Explain why — State the risk or consequence, not just the rule
- Suggest a fix — Offer a concrete alternative or code snippet when possible
- Ask, don't demand — Use questions for subjective points: "What do you think about...?"
- Acknowledge good work — Call out clean solutions, clever optimizations, or thorough tests
- Separate blocking from non-blocking — Use severity labels so the author knows what matters
Example Comments
Bad:
This is wrong. Fix it.
Good:
INLINECODE5 This query interpolates user input directly into the SQL string (line 42), which is vulnerable to SQL injection. Consider using a parameterized query:
CODEBLOCK1
Bad:
Why didn't you add tests?
Good:
INLINECODE6 The new calculateDiscount() function has a few branching paths — could we add tests for the zero-quantity and negative-price edge cases to prevent regressions?
Bad:
I would have done this differently.
Good:
INLINECODE8 This works well. An alternative approach could be extracting the retry logic into a shared withRetry() wrapper — but that's optional and could be a follow-up.
Review Anti-Patterns
Avoid these common traps that waste time and damage team trust:
| Anti-Pattern | Description |
|---|
| Rubber-Stamping | Approving without reading. Creates false confidence and lets bugs through. |
| Bikeshedding |
Spending 30 minutes debating a variable name while ignoring a race condition. |
|
Blocking on Style | Refusing to approve over formatting that a linter should enforce automatically. |
|
Gatekeeping | Requiring your personal preferred approach when the submitted one is correct. |
|
Drive-by Reviews | Leaving one vague comment and disappearing. Commit to following through. |
|
Scope Creep Reviews | Requesting unrelated refactors that should be separate PRs. |
|
Stale Reviews | Letting PRs sit for days. Review within 24 hours or hand off to someone else. |
|
Emotional Language | "This is terrible" or "obviously wrong." Critique the code, not the person. |
NEVER Do
- 1. NEVER approve without reading every changed line — rubber-stamping is worse than no review
- NEVER block a PR solely for style preferences — use a linter; humans review logic
- NEVER leave feedback without a severity level — ambiguity causes wasted cycles
- NEVER request changes without explaining why — "fix this" teaches nothing
- NEVER review more than 400 lines in one sitting — comprehension drops sharply; break large PRs into sessions
- NEVER skip the security checklist — one missed vulnerability outweighs a hundred style nits
- NEVER make it personal — review the code, never the coder; assume good intent
代码审查清单
对代码进行全面、结构化的审查。系统地逐一审视每个维度,而非随机扫描。
安装
OpenClaw / Moltbot / Clawbot
bash
npx clawhub@latest install code-review
审查维度
| 维度 | 关注点 | 优先级 |
|---|
| 安全性 | 漏洞、认证、数据泄露 | 关键 |
| 性能 |
速度、内存、可扩展性瓶颈 | 高 |
| 正确性 | 逻辑错误、边界情况、数据完整性 | 高 |
| 可维护性 | 可读性、结构、前瞻性 | 中 |
| 测试 | 测试覆盖率、质量、可靠性 | 中 |
| 可访问性 | WCAG合规、键盘导航、屏幕阅读器 | 中 |
| 文档 | 注释、API文档、更新日志条目 | 低 |
安全检查清单
针对每次变更检查以下漏洞:
- - [ ] SQL注入 — 所有查询使用参数化语句或ORM;无用户输入字符串拼接
- [ ] 跨站脚本攻击 — 用户提供的内容在渲染前已转义/清理;dangerouslySetInnerHTML或等效用法合理且安全
- [ ] 跨站请求伪造保护 — 改变状态的请求需要有效的CSRF令牌;已设置SameSite Cookie属性
- [ ] 认证 — 每个受保护的端点在处理前验证用户已认证
- [ ] 授权 — 资源访问限定于请求用户的权限;无IDOR漏洞
- [ ] 输入验证 — 所有外部输入(参数、请求头、请求体、文件)在服务端验证类型、长度、格式和范围
- [ ] 密钥管理 — 源代码中无API密钥、密码、令牌或凭据;密钥来自环境变量或密钥库
- [ ] 依赖安全性 — 新依赖来自可信来源,积极维护,且无已知CVE
- [ ] 敏感数据 — 个人身份信息、令牌和密钥绝不记录日志、包含在错误消息中或返回在API响应中
- [ ] 速率限制 — 公开和认证端点有速率限制以防止暴力破解和滥用
- [ ] 文件上传安全 — 上传的文件验证类型和大小,存储在webroot之外,并使用安全的Content-Type头提供
- [ ] HTTP安全头 — 已设置Content-Security-Policy、X-Content-Type-Options、Strict-Transport-Security
性能检查清单
- - [ ] N+1查询 — 数据库访问模式已批处理或连接;无循环发出单个查询
- [ ] 不必要的重渲染 — 组件仅在其相关状态/属性变化时重渲染;在可衡量的地方应用了记忆化
- [ ] 内存泄漏 — 事件监听器、订阅、定时器和间隔在卸载/销毁时已清理
- [ ] 打包体积 — 新依赖可进行树摇;大型库动态加载;不为单个函数导入整个库
- [ ] 懒加载 — 重型组件、路由和首屏以下内容使用懒加载/代码分割
- [ ] 缓存策略 — 昂贵计算和API响应使用适当的缓存(记忆化、HTTP缓存头、Redis)
- [ ] 数据库索引 — 查询在索引列上过滤/排序;新查询已使用EXPLAIN检查
- [ ] 分页 — 列表端点和查询使用分页或基于游标的获取;无无界SELECT *
- [ ] 异步操作 — 长时间运行的任务卸载到后台作业或队列,而非阻塞请求线程
- [ ] 图片与资源优化 — 图片尺寸适当,使用现代格式(WebP/AVIF),并利用CDN分发
正确性检查清单
- - [ ] 边界情况 — 空数组、空字符串、零值、负数和最大值已处理
- [ ] 空值/未定义处理 — 在访问前检查可空值;可选链或守卫防止运行时错误
- [ ] 差一错误 — 循环边界、数组切片、分页偏移量和范围计算已验证
- [ ] 竞态条件 — 对共享状态的并发访问使用锁、事务或原子操作
- [ ] 时区处理 — 日期以UTC存储;显示转换在表示层进行
- [ ] Unicode与编码 — 字符串操作处理多字节字符;文本编码明确(UTF-8)
- [ ] 整数溢出/精度 — 对大数字或货币的算术使用适当类型(BigInt、Decimal)
- [ ] 错误传播 — 异步调用和外部服务的错误被捕获并处理;promise永不静默吞没
- [ ] 状态一致性 — 多步突变是事务性的;部分失败使系统保持在有效状态
- [ ] 边界验证 — 测试有效范围边界上的值(最小值、最大值、恰好达到限制)
可维护性检查清单
- - [ ] 命名清晰度 — 变量、函数和类具有描述性名称,揭示意图
- [ ] 单一职责 — 每个函数/类/模块做一件事;对某一关注点的更改不会波及无关代码
- [ ] 不重复自己 — 重复逻辑提取到共享工具中;复制粘贴的代码块已合并
- [ ] 圈复杂度 — 函数具有低分支复杂度;深度嵌套的链已重构
- [ ] 错误处理 — 错误在适当边界捕获,附带上下文记录日志,并有意义地呈现
- [ ] 死代码移除 — 注释掉的代码、未使用的导入、不可达分支和过时的功能标志已移除
- [ ] 魔法数字与字符串 — 字面值提取为具有清晰语义的命名常量
- [ ] 一致的模式 — 新代码遵循代码库中已建立的约定
- [ ] 函数长度 — 函数足够短,一目了然;长函数已分解
- [ ] 依赖方向 — 依赖指向内部(基础设施到领域);核心逻辑不导入UI或框架层
测试检查清单
- - [ ] 测试覆盖率 — 新逻辑路径有对应测试;关键路径有快乐路径和失败案例测试
- [ ] 边界情况测试 — 测试覆盖边界值、空输入、空值和错误条件
- [ ] 无不稳定测试 — 测试是确定性的;不依赖时序、外部服务或共享可变状态
- [ ] 测试独立性 — 每个测试设置自己的状态并清理;测试顺序不影响结果
- [ ] 有意义的断言 — 测试断言行为和结果,而非实现细节
- [ ] 测试可读性 — 测试遵循Arrange-Act-Assert;测试名称描述场景和预期结果
- [ ] 模拟纪律 — 仅模拟外部边界(网络、数据库、文件系统)
- [ ] 回归测试 — 错误修复包含一个可重现原始错误并证明已解决的测试
审查流程
分三遍审查代码。不要试图在一次阅读中捕捉所有问题。
| 遍次 | 关注点 | 时间 | 查找内容 |
|---|
| 第一遍 | 高层结构 | 2-5分钟 | 架构适配性、文件组织、API设计、整体方法 |
| 第二遍 |
逐行细节 | 主体 | 逻辑错误、安全问题、性能问题、边界情况 |
| 第三遍 | 边界情况与加固 | 5分钟 | 故障模式、并发、边界值、缺失测试 |
第一遍(2-5分钟)
- 1. 阅读PR描述和关联问题
- 扫描文件列表——变更范围是否合理?
- 检查整体方法——这是解决问题的正确方案吗?
- 验证变更不会引入架构漂移
第二遍(审查主体时间)
- 1. 从上到下阅读每个文件差异
- 对照上述清单检查每个函数变更
- 验证每个I/O边界的错误处理
- 标记任何让你停顿的内容——相信你的直觉
第三遍(5分钟)
- 1. 思考生产环境中可能出错的情况
- 检查你标记的代码路径上是否有缺失的测试
- 验证回滚安全性——此变更能否在不丢失数据的情况下回退?
- 确认文档和更新日志在需要时已更新
严重级别
为每条评论分类严重级别,以便作者知道哪些会阻塞合并。
| 级别 | 标签 | 含义 | 阻塞合并? |
|---|
| 关键 | [CRITICAL] | 安全漏洞、数据丢失或生产环境崩溃 | 是 |
| 主要 |
[MAJOR] | 错误、逻辑错误或显著性能回归 | 是 |
| 次要 | [MINOR] | 可降低未来维护成本的改进 | 否 |
| 细节 | [NIT] | 风格偏好、命名建议或琐碎清理 | 否 |
始终在审查评论前加上严重级别标签。这消除了关于什么重要的歧义。
##