Embodied Task Decomposition

This skill decomposes high-level natural language instructions into atomic subtasks that a robot can execute.

When to Use

- User provides an image AND a task instruction
- User asks to "decompose", "break down", or "split" a task
- User wants step-by-step actions for robot execution

Input Format

1. Image: Photo of the physical scene (any environment: kitchen, office, outdoor, etc.)
2. Task Instruction: Natural language description of what to accomplish

Example:
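    Task Instruction: Take the toast out of the toaster on the white table and place it on the plate
    Image: [image path or description]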

Output Format

A numbered list of subtasks, each in the following format:

    {action} {target} {location/optional prepositional phrase} with {left/right/either} gripper
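
As a rough illustration of this output format, the sketch below models a subtask as a small Python dataclass and renders a plan as a numbered list. The `Subtask` class, its field names, and the "take" and "place" steps are assumptions made for the example; the allowed action names are whatever action-bank.md defines.

```python
from dataclasses import dataclass
from typing import Optional

GRIPPERS = ("left", "right", "either")


@dataclass(frozen=True)
class Subtask:
    """One atomic step (hypothetical structure, not part of the skill)."""
    action: str                      # must come from the action bank
    target: str                      # object being acted on
    gripper: str                     # "left", "right", or "either"
    location: Optional[str] = None   # optional prepositional phrase

    def render(self) -> str:
        """Format the subtask as a single output line."""
        if self.gripper not in GRIPPERS:
            raise ValueError(f"unknown gripper: {self.gripper!r}")
        parts = [self.action, self.target]
        if self.location:
            parts.append(self.location)
        parts.append(f"with {self.gripper} gripper")
        return " ".join(parts)


def render_plan(subtasks):
    """Join subtasks into a numbered list, one subtask per line."""
    return "\n".join(
        f"{i}. {s.render()}" for i, s in enumerate(subtasks, start=1)
    )


# Illustrative plan for the toast example; the action names are assumed.
plan = [
    Subtask("take", "the toast", "right", "from the toaster"),
    Subtask("place", "the toast", "right", "on the plate"),
]
print(render_plan(plan))
```

Keeping the gripper as a required field makes it hard to emit a subtask that omits it.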

Process

1. Analyze the image - Identify the objects, surfaces, locations, and tools visible
2. Understand the task - What is the goal? What needs to be moved or manipulated?
3. Break into atomic actions - Each subtask = one action from the action bank
4. Specify gripper - Always indicate the left, right, or either gripper

Action Bank

Refer to action-bank.md for the complete list of allowed actions. All subtasks MUST use actions from this bank.
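
One way to enforce this rule is to load the action names and reject any subtask whose first word is not among them. The sketch below assumes action-bank.md is a Markdown table with the action name in the first column and a header cell named "Action Name"; the sample rows and their descriptions are illustrative, not the contents of the real bank.

```python
import re

# Illustrative stand-in for action-bank.md (assumed table layout).
SAMPLE_BANK = """
| Action Name | Description | Example Usage |
|---|---|---|
| take | Grasp an object and remove it | take the toast from the toaster |
| place | Put an object on a surface | place the toast on the plate |
"""


def parse_action_bank(markdown_text):
    """Collect action names from the first column of a Markdown table."""
    actions = set()
    for line in markdown_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        name = cells[0].lower()
        # Skip empty cells, the header row, and the |---| separator row.
        if not name or name == "action name" or set(name) <= set("-: "):
            continue
        actions.add(name)
    return actions


def unknown_actions(subtask_lines, allowed):
    """Return the subtask lines whose leading action is not allowed."""
    rejected = []
    for line in subtask_lines:
        # Drop a leading list number such as "2. " before reading the action.
        body = re.sub(r"^\d+\.\s*", "", line.strip())
        words = body.split()
        if not words or words[0].lower() not in allowed:
            rejected.append(line)
    return rejected


allowed = parse_action_bank(SAMPLE_BANK)
print(unknown_actions(
    ["1. take the toast from the toaster with right gripper",
     "2. fling the toast with left gripper"],
    allowed,
))
```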

Examples

See examples.md for detailed decomposition examples across different domains.

Important Notes

- Use ONLY actions from the action bank
- Each subtask = one primary action
- Always specify gripper (left/right/either)
- Include target object and location
- Keep subtasks atomic and sequential
- Consider object state changes (e.g., "open bag" before "take fruit")
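
Two of these notes lend themselves to a mechanical check: every subtask names a gripper, and a container is opened before anything is taken from it. The sketch below is a simplified illustration under stated assumptions: subtasks are plain strings ending in "with {gripper} gripper", "open" and "take" are action names in the bank, and the caller supplies which containers start closed. The `check_plan` function is hypothetical.

```python
import re

# Matches the gripper suffix at the end of a subtask line (assumed wording).
GRIPPER_RE = re.compile(r"\bwith (left|right|either) gripper$")


def check_plan(subtasks, closed_containers):
    """Return (step number, problem) tuples for a list of subtask strings."""
    problems = []
    closed = set(closed_containers)
    for number, raw in enumerate(subtasks, start=1):
        text = raw.strip()
        if not GRIPPER_RE.search(text):
            problems.append((number, "missing gripper"))
        words = text.split()
        action = words[0].lower() if words else ""
        for container in sorted(closed):
            if container not in words:
                continue
            if action == "open":
                # The state change: the container is open from now on.
                closed.discard(container)
            elif action == "take":
                problems.append((number, f"{container} is still closed"))
    return problems


good = ["open the bag with left gripper",
        "take the fruit from the bag with right gripper"]
bad = ["take the fruit from the bag with right gripper",
       "open the bag"]
print(check_plan(good, {"bag"}))
print(check_plan(bad, {"bag"}))
```

The check only tracks one kind of state (open or closed) and matches containers by exact word, so it is a starting point rather than a full precondition model.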

Updating the Action Bank

The agent MAY add new actions to the action bank when needed. To add a new action:

1. Check for duplicates - Search existing actions for similar functionality
2. Verify functional difference - New action must serve a distinct purpose
3. Add with documentation - Include description and example usage

Duplicate Check Criteria

A new action is considered a duplicate if it:

- Has the same name as an existing action
- Describes the same physical movement (e.g., "lift" vs "raise")
- Can be used interchangeably with an existing action in all contexts
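
The first two criteria can be approximated in code; the third (interchangeability in all contexts) still needs judgment. The sketch below assumes a hand-maintained list of synonym groups, seeded here with the "lift" / "raise" pair from the criteria above. `SYNONYM_GROUPS` and `find_duplicate` are hypothetical names, not part of the skill.

```python
# Groups of action names that describe the same physical movement.
# Hypothetical and hand-maintained; only lift/raise comes from the text.
SYNONYM_GROUPS = [{"lift", "raise"}]


def find_duplicate(candidate, existing_actions, synonym_groups=SYNONYM_GROUPS):
    """Return the existing action that the candidate duplicates, or None."""
    name = candidate.strip().lower()
    existing = {action.strip().lower() for action in existing_actions}
    # Criterion 1: same name as an existing action.
    if name in existing:
        return name
    # Criterion 2: same physical movement under a different name.
    for group in synonym_groups:
        if name in group:
            matches = sorted(group & existing)
            if matches:
                return matches[0]
    return None


bank = ["lift", "place"]
print(find_duplicate("raise", bank))
print(find_duplicate("insert", bank))
```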

Adding a New Action

When adding to action-bank.md, follow this format:

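    | Action Name | Description | Example Usage |
    |---|---|---|
    | newaction | What it does | newaction the object |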

Example of adding "insert" (different from "place" - "place" = put on surface, "insert" = put into container):

    | insert | Put an object into a container or slot | insert the key into the lock |
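
If the bank is edited programmatically, the row can be built and appended with a small helper. This is a minimal sketch assuming the three-column table layout shown above; `format_action_row` and `append_action` are hypothetical helpers, and a duplicate check should still run before anything is appended.

```python
def format_action_row(name, description, example):
    """Build one Markdown table row for the action bank."""
    cells = [name.strip(), description.strip(), example.strip()]
    if not all(cells):
        raise ValueError("name, description, and example are all required")
    if any("|" in cell for cell in cells):
        raise ValueError("cells must not contain the '|' character")
    return "| " + " | ".join(cells) + " |"


def append_action(bank_text, name, description, example):
    """Return the bank text with the new row added as the last line."""
    row = format_action_row(name, description, example)
    return bank_text.rstrip("\n") + "\n" + row + "\n"


row = format_action_row(
    "insert",
    "Put an object into a container or slot",
    "insert the key into the lock",
)
print(row)
```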

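Embodied Task Decomposition

This skill decomposes high-level natural language instructions into atomic subtasks that a robot can execute.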

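When to Use

- The user provides an image and a task instruction
- The user asks to decompose, break down, or split a task
- The user wants step-by-step actions for robot execution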

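Input Format

1. Image: Photo of the physical scene (any environment: kitchen, office, outdoors, etc.)
2. Task Instruction: Natural language description of what to accomplish

Example:

    Task Instruction: Take the toast out of the toaster on the white table and place it on the plate
    Image: [image path or description]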

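Output Format

A numbered list of subtasks, each in the following format:

    {action} {target} {location/optional prepositional phrase} with {left/right/either} gripper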

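Process

1. Analyze the image - Identify the visible objects, surfaces, locations, and tools
2. Understand the task - What is the goal? What needs to be moved or manipulated?
3. Break into atomic actions - Each subtask = one action from the action bank
4. Specify gripper - Always indicate the left, right, or either gripper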

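Action Bank

Refer to action-bank.md for the complete list of allowed actions. All subtasks must use actions from this bank.

Examples

See examples.md for detailed decomposition examples across different domains.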

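Important Notes

- Use only actions from the action bank
- Each subtask = one primary action
- Always specify the gripper (left/right/either)
- Include the target object and location
- Keep subtasks atomic and sequential
- Consider object state changes (e.g., open the bag before taking the fruit)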

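Updating the Action Bank

The agent may add new actions to the action bank when needed. To add a new action:

1. Check for duplicates - Search existing actions for similar functionality
2. Verify functional difference - The new action must serve a distinct purpose
3. Add with documentation - Include a description and example usage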

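Duplicate Check Criteria

A new action is considered a duplicate if it:

- Has the same name as an existing action
- Describes the same physical movement (e.g., lift vs raise)
- Can be used interchangeably with an existing action in all contexts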

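Adding a New Action

When adding to action-bank.md, follow this format:

    | Action Name | Description | Example Usage |
    |---|---|---|
    | newaction | What it does | newaction the object |

Example of adding insert (different from place - place = put on a surface, insert = put into a container):

    | insert | Put an object into a container or slot | insert the key into the lock |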
