When to Use
Use this when designing agent systems, choosing frameworks, implementing memory or tools, specifying agent behavior for teams, or reviewing agent security.
Quick Reference
| Topic | File |
|---|---|
| Architecture patterns & memory | `architecture.md` |
| Framework comparison | `frameworks.md` |
| Use cases by role | `use-cases.md` |
| Implementation patterns & code | `implementation.md` |
| Security boundaries & risks | `security.md` |
| Evaluation & debugging | `evaluation.md` |
Before Building — Decision Checklist
- [ ] Single purpose defined? If you can't state it in one sentence, split it into multiple agents
- [ ] User identified? Internal team, end customer, or another system?
- [ ] Interaction modality? Chat, voice, API, scheduled tasks?
- [ ] Single vs multi-agent? Start simple — only add agents when roles genuinely differ
- [ ] Memory strategy? What persists within session vs across sessions vs forever?
- [ ] Tool access tiers? Which actions are read-only vs write vs destructive?
- [ ] Escalation rules? When MUST a human step in?
- [ ] Cost ceiling? Budget per task, per user, per month?
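One way to make the checklist enforceable is to record the answers in a small spec object and hold off on building until every item has an answer. The sketch below is illustrative only: `AgentSpec`, its field names, and `missing_decisions` are hypothetical and do not come from any framework, and the per-task dollar budget is an assumed unit.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class AgentSpec:
    """Hypothetical record of the checklist answers; None or empty means undecided."""
    purpose: Optional[str] = None        # one-sentence purpose
    user: Optional[str] = None           # internal team, end customer, or another system
    modality: Optional[str] = None       # chat, voice, API, scheduled tasks
    multi_agent: Optional[bool] = None   # False until a single agent is proven insufficient
    memory: dict[str, str] = field(default_factory=dict)      # scope -> what persists
    tool_tiers: dict[str, str] = field(default_factory=dict)  # tool -> read/write/destructive
    escalation_rules: list[str] = field(default_factory=list)
    cost_ceiling_usd: Optional[float] = None  # assumed unit: budget per task


def missing_decisions(spec: AgentSpec) -> list[str]:
    """Return the names of checklist items that are still unanswered."""
    missing = []
    for f in fields(spec):
        value = getattr(spec, f.name)
        # An explicit False (e.g. multi_agent=False) counts as an answer.
        if value is None or value == {} or value == [] or value == "":
            missing.append(f.name)
    return missing


spec = AgentSpec(
    purpose="Answer billing questions for end customers.",
    user="end customer",
    modality="chat",
    multi_agent=False,
)
print(missing_decisions(spec))
```

Here the spec still lacks a memory strategy, tool tiers, escalation rules, and a cost ceiling, so those four names are reported as open decisions.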
Critical Rules
1. Start with one agent: multi-agent setups add coordination overhead. Prove that a single agent is insufficient first.
2. Define escalation triggers: angry users, legal mentions, confidence drops, and repeated failures should all hand off to a human.
3. Separate read from write tools: read tools need less approval than write tools.
4. Log everything: tool calls, decisions, user interactions. You'll need the audit trail.
5. Test adversarially: assume users will try to break or manipulate the agent.
6. Budget by task type: use cheaper models for simple tasks and more expensive models for complex ones.
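The read/write separation, logging, and escalation rules above can be sketched together. This is a minimal illustration under stated assumptions, not a reference implementation: `Tier`, `ToolRegistry`, `audit`, and `should_escalate` are hypothetical names, the in-memory list stands in for a durable log store, and the confidence threshold, failure count, and keyword list are invented for the example.

```python
import json
import time
from enum import Enum
from typing import Any, Callable


class Tier(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


audit_log: list[str] = []  # stand-in for a durable, append-only log store


def audit(event: str, **details: Any) -> None:
    """Append one JSON line per event so the trail can be replayed later."""
    audit_log.append(json.dumps({"ts": time.time(), "event": event, **details}))


class ToolRegistry:
    def __init__(self, approve: Callable[[str, dict], bool]):
        self._tools: dict[str, tuple[Tier, Callable[..., Any]]] = {}
        self._approve = approve  # hypothetical human-approval hook

    def register(self, name: str, tier: Tier, fn: Callable[..., Any]) -> None:
        self._tools[name] = (tier, fn)

    def call(self, name: str, **kwargs: Any) -> Any:
        tier, fn = self._tools[name]
        # Read tools run directly; write and destructive tools need approval first.
        if tier is not Tier.READ and not self._approve(name, kwargs):
            audit("tool_denied", tool=name, tier=tier.value, args=kwargs)
            raise PermissionError(f"{name} requires approval")
        result = fn(**kwargs)
        audit("tool_call", tool=name, tier=tier.value, args=kwargs)
        return result


def should_escalate(confidence: float, failures: int, message: str) -> bool:
    """Illustrative triggers; the thresholds and keywords are assumptions."""
    legal_terms = ("lawyer", "lawsuit", "legal action")
    return (
        confidence < 0.5
        or failures >= 3
        or any(term in message.lower() for term in legal_terms)
    )


registry = ToolRegistry(approve=lambda name, args: False)  # deny all writes in this demo
registry.register("get_order", Tier.READ, lambda order_id: {"id": order_id, "status": "shipped"})
registry.register("refund_order", Tier.WRITE, lambda order_id: "refunded")

print(registry.call("get_order", order_id=42))
try:
    registry.call("refund_order", order_id=42)
except PermissionError as exc:
    print("blocked:", exc)
print(should_escalate(confidence=0.9, failures=0, message="I will call my lawyer"))
```

Routing every call through one registry keeps the approval check and the audit entry in a single place, so a new tool cannot skip either by accident.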
The Agent Loop (Mental Model)
```
Observe → Think → Act → Observe → ...
```
Every agent is this loop. The differences are:
- What it observes (context window, memory, tool results)
- How it thinks (direct, chain-of-thought, planning)
- What it can act on (tools, APIs, communication channels)
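The loop and its three points of variation can be sketched in a few lines. This is a toy illustration, assuming a scripted `think` function in place of a real model call; `run_agent`, `scripted_think`, and the `lookup` tool are hypothetical names made up for the example.

```python
from typing import Callable


def run_agent(
    goal: str,
    think: Callable[[list[str]], tuple[str, str]],
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Run observe -> think -> act until `think` finishes or the step limit is hit."""
    observations = [f"goal: {goal}"]             # what the agent observes
    for _ in range(max_steps):
        action, argument = think(observations)   # how it thinks
        if action == "finish":
            return argument
        result = tools[action](argument)         # what it can act on
        observations.append(f"{action}({argument}) -> {result}")
    return "stopped: step limit reached"


def scripted_think(observations: list[str]) -> tuple[str, str]:
    """Stand-in for a model call: look something up once, then answer."""
    if len(observations) == 1:
        return "lookup", "order 42"
    return "finish", f"Done after seeing: {observations[-1]}"


tools = {"lookup": lambda query: f"{query} is shipped"}
print(run_agent("check order 42", scripted_think, tools))
```

The `max_steps` cap is a design choice worth keeping in any real loop: it bounds cost and stops an agent that never decides to finish.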