Error Handling

Consistent errors reduce support load and on-call pain. Design a taxonomy, stable codes, safe user messaging, and operator visibility—without leaking secrets or stack traces to clients.

When to Offer This Workflow

Trigger conditions:

- Inconsistent HTTP status codes and response bodies
- Retry storms or duplicate side effects from naive retries
- Logs that cannot be tied to user-visible failures

Initial offer:

Use six stages: (1) classify errors, (2) map to transport, (3) user messaging, (4) retries & idempotency, (5) observability, (6) client SDKs & DX. Confirm whether the API is REST, GraphQL, or gRPC, and whether calls follow sync or async patterns.

Stage 1: Classify Errors

Goal: Distinguish validation, authentication, authorization, not found, conflict, rate limit, dependency failure, and internal bugs.

Exit condition: Table or enum of codes with owning team and meaning.
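A minimal sketch of what such an enum could look like, assuming Python. The code names, team names, and the `taxonomy_table` helper are hypothetical placeholders; only the eight error classes come from the goal above.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class ErrorSpec:
    owner: str    # owning team (hypothetical names)
    meaning: str  # one-line definition for docs and runbooks


class ErrorCode(Enum):
    """One member per error class from the Stage 1 goal."""
    VALIDATION_FAILED = ErrorSpec("api-platform", "Request body or parameters are invalid")
    UNAUTHENTICATED = ErrorSpec("identity", "Credentials are missing or invalid")
    PERMISSION_DENIED = ErrorSpec("identity", "Caller is authenticated but not allowed")
    NOT_FOUND = ErrorSpec("api-platform", "Resource does not exist")
    CONFLICT = ErrorSpec("api-platform", "Request conflicts with current resource state")
    RATE_LIMITED = ErrorSpec("edge-gateway", "Caller exceeded its request quota")
    DEPENDENCY_FAILED = ErrorSpec("sre", "An upstream dependency failed or timed out")
    INTERNAL = ErrorSpec("owning-service", "Unexpected bug in our own code")


def taxonomy_table():
    """Return (code, owner, meaning) rows, e.g. for rendering into docs."""
    return [(c.name, c.value.owner, c.value.meaning) for c in ErrorCode]
```

Keeping the owner next to the code means the table and the enum cannot drift apart, since the table is generated from the enum.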

Stage 2: Map to Transport

Goal: Correct HTTP 4xx/5xx; GraphQL errors with extensions; gRPC status codes; optional RFC 7807 Problem Details (since obsoleted by RFC 9457) for JSON APIs.
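For an HTTP JSON API, the mapping might be sketched as follows. The code strings, the `errors.example.com` type URI, and the `problem_details` helper are assumptions for illustration; the specific status chosen per class (for example 400 rather than 422 for validation, 503 rather than 502 for dependency failures) is a team decision, not something the workflow prescribes. The `type`, `title`, `status`, `detail`, and `instance` members follow the Problem Details format, and `code` is an extension member.

```python
from http import HTTPStatus

# Hypothetical stable codes mapped to one HTTP status each.
STATUS_BY_CODE = {
    "validation_failed": HTTPStatus.BAD_REQUEST,
    "unauthenticated": HTTPStatus.UNAUTHORIZED,
    "permission_denied": HTTPStatus.FORBIDDEN,
    "not_found": HTTPStatus.NOT_FOUND,
    "conflict": HTTPStatus.CONFLICT,
    "rate_limited": HTTPStatus.TOO_MANY_REQUESTS,
    "dependency_failed": HTTPStatus.SERVICE_UNAVAILABLE,
    "internal": HTTPStatus.INTERNAL_SERVER_ERROR,
}


def problem_details(code, detail, instance=None):
    """Return (http_status, body) for an application/problem+json response."""
    # Unknown codes fall back to 500 so a missing mapping never yields a 2xx.
    status = STATUS_BY_CODE.get(code, HTTPStatus.INTERNAL_SERVER_ERROR)
    body = {
        "type": f"https://errors.example.com/{code}",  # placeholder URI
        "title": status.phrase,
        "status": int(status),
        "detail": detail,
        "code": code,  # extension member carrying the stable code
    }
    if instance is not None:
        body["instance"] = instance
    return int(status), body
```

A single lookup table like this is what keeps status codes consistent across handlers; GraphQL and gRPC services would need an analogous table keyed on the same codes.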

Stage 3: User Messaging

Goal: Actionable copy for end users; opaque support reference id; no internal hostnames, SQL fragments, or stack traces in client responses.
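One way to keep internal detail out of client responses is to build the client body and the internal record separately, linked only by the support reference. This is a sketch under assumptions: the message table, the code strings, and `client_error_response` are hypothetical, and a random UUID is just one option for an opaque id.

```python
import uuid

# Hypothetical user-facing copy, keyed by stable error code.
USER_MESSAGES = {
    "validation_failed": "Some fields are invalid. Check the highlighted fields and try again.",
    "rate_limited": "You are sending requests too quickly. Wait a moment and try again.",
    "not_found": "We could not find what you were looking for.",
}
GENERIC_MESSAGE = "Something went wrong on our side. Please try again later."


def client_error_response(code, exc):
    """Return (client_body, internal_record) for one failure.

    Only client_body is sent to the caller; internal_record goes to logs.
    """
    support_id = uuid.uuid4().hex  # opaque: reveals nothing about internals
    internal_record = {
        "support_id": support_id,
        "code": code,
        "exception": repr(exc),  # may contain hostnames or SQL; never sent out
    }
    client_body = {
        "code": code,
        "message": USER_MESSAGES.get(code, GENERIC_MESSAGE),
        "support_id": support_id,
    }
    return client_body, internal_record
```

Because the client message is looked up from the code and never derived from the exception text, a leaky exception message cannot reach the user by accident.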

Stage 4: Retries & Idempotency

Goal: Retry only safe or idempotent operations; exponential backoff with jitter; align with idempotency keys on writes.
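The retry policy can be sketched as below, using exponential backoff with "full jitter" (a random delay between zero and the capped exponential). `TransientError`, `backoff_delay`, and `call_with_retries` are hypothetical names, and the `idempotent` flag is supplied by the caller: for a write, it should be true only when the request carries an idempotency key that the server honors.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 503s)."""


def backoff_delay(attempt, base=0.2, cap=10.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(operation, *, idempotent, max_attempts=4, sleep=time.sleep):
    """Run operation(), retrying transient failures only when it is idempotent."""
    attempts = max_attempts if idempotent else 1
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(backoff_delay(attempt))
```

Jitter spreads retries from many clients over time, which is what prevents a synchronized retry storm after a shared outage. Non-idempotent operations get exactly one attempt, so a retry can never duplicate a side effect.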

Stage 5: Observability

Goal: Structured logs with error.code, trace_id, user_id (where allowed); metrics by error class; alerts on error-rate SLO burn.
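A minimal sketch of such a log line using only the standard `logging` and `json` modules. `JsonFormatter`, `log_error`, and the in-process `ERROR_COUNTS` counter are hypothetical; a real service would likely send the count to its metrics client instead.

```python
import json
import logging
from collections import Counter

ERROR_COUNTS = Counter()  # stand-in for a real metrics client


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the fields named above."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "error.code": getattr(record, "error_code", None),
            "trace_id": getattr(record, "trace_id", None),
        }
        # user_id is included only when the caller was allowed to attach it.
        user_id = getattr(record, "user_id", None)
        if user_id is not None:
            payload["user_id"] = user_id
        return json.dumps(payload)


def log_error(logger, message, *, error_code, trace_id, user_id=None):
    """Log one error with structured fields and count it by error code."""
    extra = {"error_code": error_code, "trace_id": trace_id}
    if user_id is not None:
        extra["user_id"] = user_id
    ERROR_COUNTS[error_code] += 1
    logger.error(message, extra=extra)
```

To use it, attach the formatter to a handler, for example `handler.setFormatter(JsonFormatter())`, and call `log_error` wherever an error response is produced. Sharing the same `trace_id` between this log line and the client response is what ties a user-visible failure back to its logs.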

Stage 6: Client SDKs & DX

Goal: Typed errors in SDKs; documented recovery; map codes to user-facing strings in apps consistently.
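In an SDK, typed errors might look like the following sketch. The class names, the code strings, and the response shape (`code`, `message`, `support_id`) are assumptions, not a real API.

```python
class ApiError(Exception):
    """Base class for all errors raised by the (hypothetical) SDK."""
    retryable = False  # documented recovery hint for callers

    def __init__(self, code, message, support_id=None):
        super().__init__(message)
        self.code = code
        self.support_id = support_id


class ValidationError(ApiError):
    """Caller should fix the request; retrying unchanged will not help."""


class NotFoundError(ApiError):
    """The requested resource does not exist."""


class RateLimitedError(ApiError):
    """Caller should back off and retry later."""
    retryable = True


# Stable server codes mapped to SDK exception types.
_ERROR_TYPES = {
    "validation_failed": ValidationError,
    "not_found": NotFoundError,
    "rate_limited": RateLimitedError,
}


def error_from_response(body):
    """Build the most specific exception for a decoded error body."""
    # Unknown codes fall back to ApiError so older SDKs keep working
    # when the server introduces new codes.
    cls = _ERROR_TYPES.get(body.get("code"), ApiError)
    return cls(body.get("code", "unknown"), body.get("message", ""), body.get("support_id"))
```

Application code can then write `except RateLimitedError:` rather than parsing message strings, and map `err.code` to its own user-facing copy in one place.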

Final Review Checklist

- [ ] Taxonomy and ownership defined
- [ ] Transport mapping correct and consistent
- [ ] User-safe messages with correlation ids
- [ ] Retry policy matches idempotency story
- [ ] Logs and metrics wired for ops

Tips for Effective Guidance

- Separate expected validation errors from unexpected 500s in dashboards.
- Pair with idempotency for write paths and queues.

Handling Deviations

- Mobile offline: queue with explicit user-visible sync state.
