Embodied Task Decomposition
This skill decomposes high-level natural language instructions into atomic subtasks that a robot can execute.
When to Use
- User provides an image AND a task instruction
- User asks to "decompose", "break down", or "split" a task
- User wants step-by-step actions for robot execution
Input Format
1. Image: Photo of the physical scene (any environment: kitchen, office, outdoors, etc.)
2. Task Instruction: Natural language description of what to accomplish
Example:
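Task Instruction: Take the toast out of the bread machine on the white table and place it on the plate
Image: [image path or description]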
Output Format
A numbered list of subtasks, each following this format:
{action} {target} {location/optional prepositional phrase} using {left/right/either} gripper
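As a rough illustration of the output format, the sketch below renders subtasks as numbered lines of the form `{action} {target} {location} using {left/right/either} gripper`. The `Subtask` class, the `render_plan` helper, and the example actions are hypothetical names chosen for this sketch; the real action names come from action-bank.md.

```python
from dataclasses import dataclass
from typing import List, Optional

# Gripper values allowed by the output format.
GRIPPERS = ("left", "right", "either")


@dataclass
class Subtask:
    """One atomic subtask (hypothetical helper type, not part of the skill)."""
    action: str
    target: str
    gripper: str
    location: Optional[str] = None  # optional prepositional phrase

    def render(self) -> str:
        """Render the subtask as a single line in the output format."""
        if self.gripper not in GRIPPERS:
            raise ValueError(f"unknown gripper: {self.gripper!r}")
        parts = [self.action, self.target]
        if self.location:
            parts.append(self.location)
        return f"{' '.join(parts)} using {self.gripper} gripper"


def render_plan(subtasks: List[Subtask]) -> str:
    """Number the subtasks from 1 and join them into one block of text."""
    return "\n".join(
        f"{number}. {subtask.render()}"
        for number, subtask in enumerate(subtasks, start=1)
    )


if __name__ == "__main__":
    plan = [
        Subtask("take", "the toast", "right", "from the bread machine"),
        Subtask("place", "the toast", "right", "on the plate"),
    ]
    print(render_plan(plan))
```

Keeping the gripper as a separate field makes it hard to emit a subtask that omits it, which is the most common way a line can drift from the format.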
Process
1. Analyze the image: Identify the objects, surfaces, locations, and tools that are visible
2. Understand the task: What is the goal? What needs to be moved or manipulated?
3. Break into atomic actions: Each subtask = one action from the action bank
4. Specify gripper: Always indicate the left, right, or either gripper
Action Bank
Refer to action-bank.md for the complete list of allowed actions. All subtasks MUST use actions from this bank.
Examples
See examples.md for detailed decomposition examples across different domains.
Important Notes
- Use ONLY actions from the action bank
- Each subtask = one primary action
- Always specify gripper (left/right/either)
- Include target object and location
- Keep subtasks atomic and sequential
- Consider object state changes (e.g., "open bag" before "take fruit")
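A few of these rules can be checked mechanically. The sketch below is one possible validator, assuming each subtask line starts with its action and ends with `using {left/right/either} gripper`. `ACTION_BANK` is a hypothetical stand-in for the contents of action-bank.md, and the function names are invented for this example. Rules such as atomicity and correct ordering of state changes still need human or model judgment.

```python
import re

# Hypothetical subset of the action bank; the real list lives in action-bank.md.
ACTION_BANK = {"open", "take", "place", "insert"}

# First word is the action, the line must end with the gripper phrase.
SUBTASK_RE = re.compile(
    r"^(?P<action>\S+)\s+(?P<rest>.+?)\s+using\s+"
    r"(?P<gripper>left|right|either)\s+gripper$"
)


def validate_subtask(line, action_bank=ACTION_BANK):
    """Return a list of problems found in one subtask line (empty if none)."""
    match = SUBTASK_RE.match(line.strip())
    if match is None:
        return ["line does not end with 'using <left|right|either> gripper'"]
    problems = []
    action = match.group("action")
    if action.lower() not in action_bank:
        problems.append(f"action {action!r} is not in the action bank")
    return problems


def validate_plan(lines, action_bank=ACTION_BANK):
    """Map 1-based subtask numbers to their problems; valid lines are omitted."""
    report = {}
    for number, line in enumerate(lines, start=1):
        problems = validate_subtask(line, action_bank)
        if problems:
            report[number] = problems
    return report


if __name__ == "__main__":
    plan = [
        "open the bag using left gripper",
        "take the fruit from the bag using right gripper",
        "grab the fruit using right gripper",   # action not in the bank
        "place the fruit on the table",         # gripper missing
    ]
    for number, problems in validate_plan(plan).items():
        print(number, problems)
```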
Updating the Action Bank
The agent MAY add new actions to the action bank when needed. To add a new action:
1. Check for duplicates: Search existing actions for similar functionality
2. Verify functional difference: The new action must serve a distinct purpose
3. Add with documentation: Include a description and example usage
Duplicate Check Criteria
A new action is considered a duplicate if it:
- Has the same name as an existing action
- Describes the same physical movement (e.g., "lift" vs "raise")
- Can be used interchangeably with an existing action in all contexts
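The first two criteria lend themselves to a simple automated check; the third (interchangeability in all contexts) generally needs judgment. The sketch below assumes a hand-maintained table of synonym groups. `SYNONYM_GROUPS`, `normalize`, and `find_duplicate` are hypothetical names for this example, and only the "lift"/"raise" pair comes from the text above.

```python
# Each set lists names that describe the same physical movement.
# Hypothetical table: only the lift/raise pair is taken from the criteria above.
SYNONYM_GROUPS = [
    {"lift", "raise"},
]


def normalize(name):
    """Lowercase a name and unify separators so 'Pick-Up' equals 'pick_up'."""
    return name.strip().lower().replace("-", "_").replace(" ", "_")


def find_duplicate(candidate, existing_actions, synonym_groups=SYNONYM_GROUPS):
    """Return the existing action that the candidate duplicates, or None."""
    cand = normalize(candidate)
    existing = {normalize(action): action for action in existing_actions}

    # Criterion 1: same name as an existing action.
    if cand in existing:
        return existing[cand]

    # Criterion 2: same physical movement under a different name.
    for group in synonym_groups:
        if cand in group:
            for name in group:
                if name in existing:
                    return existing[name]
    return None


if __name__ == "__main__":
    bank = ["lift", "place"]
    print(find_duplicate("Raise", bank))   # duplicates "lift"
    print(find_duplicate("insert", bank))  # no duplicate found
```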
Adding a New Action
When adding to action-bank.md, follow this format:
| {new_action} | {description} | {new_action} the object |
Example of adding "insert", which differs from "place" ("place" = put on a surface, "insert" = put into a container):
| insert | Put an object into a container or slot | insert the key into the lock |
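Appending a row can also be scripted. The sketch below assumes action-bank.md stores actions as a three-column table (action, description, example); that layout is an assumption inferred from the "insert" row, since the file itself is not shown here. `format_action_row` and `add_action` are hypothetical helpers, and they work on a string rather than touching the file directly.

```python
def format_action_row(name, description, example):
    """Build one table row of the form '| name | description | example |'."""
    cells = [name.strip(), description.strip(), example.strip()]
    for cell in cells:
        if "|" in cell or "\n" in cell:
            raise ValueError("cells must not contain '|' or newlines")
    return "| " + " | ".join(cells) + " |"


def add_action(bank_text, name, description, example):
    """Return bank_text with a new row appended; reject an existing name."""
    wanted = name.strip().lower()
    for line in bank_text.splitlines():
        # The first cell of each table row is taken to be the action name.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if cells and cells[0].lower() == wanted:
            raise ValueError(f"action {name!r} already exists")
    row = format_action_row(name, description, example)
    return bank_text.rstrip("\n") + "\n" + row + "\n"


if __name__ == "__main__":
    bank = (
        "| Action | Description | Example |\n"
        "|---|---|---|\n"
        "| place | Put an object on a surface | place the cup on the table |\n"
    )
    print(add_action(
        bank,
        "insert",
        "Put an object into a container or slot",
        "insert the key into the lock",
    ))
```

The name check here only catches exact name collisions; the broader duplicate criteria still apply before a row is added.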
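Embodied Task Decomposition
This skill decomposes high-level natural language instructions into atomic subtasks that a robot can execute.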
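When to Use
- User provides an image and a task instruction
- User asks to decompose, break down, or split a task
- User wants step-by-step actions for robot execution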
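Input Format
1. Image: Photo of the physical scene (any environment: kitchen, office, outdoors, etc.)
2. Task Instruction: Natural language description of what to accomplish
Example:
Task Instruction: Take the toast out of the bread machine on the white table and place it on the plate
Image: [image path or description]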
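Output Format
A numbered list of subtasks, each following this format:
{action} {target} {location/optional prepositional phrase} using {left/right/either} gripper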
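Process
1. Analyze the image: Identify the objects, surfaces, locations, and tools that are visible
2. Understand the task: What is the goal? What needs to be moved or manipulated?
3. Break into atomic actions: Each subtask = one action from the action bank
4. Specify gripper: Always indicate the left, right, or either gripper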
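Action Bank
Refer to action-bank.md for the complete list of allowed actions. All subtasks MUST use actions from this bank.
Examples
See examples.md for detailed decomposition examples across different domains.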
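Important Notes
- Use ONLY actions from the action bank
- Each subtask = one primary action
- Always specify gripper (left/right/either)
- Include target object and location
- Keep subtasks atomic and sequential
- Consider object state changes (e.g., open the bag before taking the fruit)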
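Updating the Action Bank
The agent MAY add new actions to the action bank when needed. To add a new action:
1. Check for duplicates: Search existing actions for similar functionality
2. Verify functional difference: The new action must serve a distinct purpose
3. Add with documentation: Include a description and example usage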
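Duplicate Check Criteria
A new action is considered a duplicate if it:
- Has the same name as an existing action
- Describes the same physical movement (e.g., lift vs. raise)
- Can be used interchangeably with an existing action in all contexts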
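Adding a New Action
When adding to action-bank.md, follow this format:
| {new_action} | {description} | {new_action} the object |
Example of adding insert (different from place: place = put on a surface, insert = put into a container):
| insert | Put an object into a container or slot | insert the key into the lock |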