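Skill name: agent-device
Detailed description:
Mobile Automation with agent-device
For exploration, use snapshot refs. For deterministic replay, use selectors.
Start Here (Read This First)
Use this skill as a router, not a full manual.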
1. Pick one mode:
   - Normal interaction flow
   - Debug/crash flow
   - Replay maintenance flow
2. Run one canonical flow below.
3. Open references only if blocked.
Decision Map
- No target context yet: devices -> pick target -> open
- Normal UI task: open -> snapshot -i -> press/fill -> diff snapshot -i -> close
- Debug/crash: open <app> -> logs clear --restart -> reproduce -> logs path -> targeted grep
- Replay drift: replay -u <path> -> verify updated selectors
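The decision map can be encoded as a small lookup so that a wrapper script prints the same sequence for a given situation every time. This is only a sketch: the function name `flow_for` and its mode keywords are hypothetical and are not part of agent-device.

```shell
# Print the command sequence from the decision map for a given situation.
# flow_for and its mode keywords are illustrative names, not agent-device features.
flow_for() {
  case "$1" in
    no-target) echo "devices -> pick target -> open" ;;
    ui)        echo "open -> snapshot -i -> press/fill -> diff snapshot -i -> close" ;;
    debug)     echo "open <app> -> logs clear --restart -> reproduce -> logs path -> targeted grep" ;;
    drift)     echo "replay -u <path> -> verify updated selectors" ;;
    *)         echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

For example, `flow_for debug` prints the debug/crash sequence, and an unrecognized keyword returns a non-zero status.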
Canonical Flows
1) Normal Interaction Flow
```bash
agent-device open Settings --platform ios
agent-device snapshot -i
agent-device press @e3
agent-device diff snapshot -i
agent-device fill @e5 test
agent-device close
```
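When the normal flow is scripted, one practical concern is that a failed step should not leave the session open. The sketch below wraps the same sequence and always runs close. The helper name `normal_flow`, its arguments, and the `AD` override variable are assumptions made for illustration; the refs are passed in by the caller and go stale after screen transitions.

```shell
AD="${AD:-agent-device}"   # left unquoted below so it may hold a multi-word command

# Run the normal interaction flow and always close the session, even on failure.
# normal_flow and its arguments are illustrative, not part of agent-device.
normal_flow() {
  app="$1"; platform="$2"; tap_ref="$3"; fill_ref="$4"; text="$5"
  $AD open "$app" --platform "$platform" || return 1
  rc=0
  {
    $AD snapshot -i &&
    $AD press "$tap_ref" &&
    $AD diff snapshot -i &&
    $AD fill "$fill_ref" "$text"
  } || rc=$?
  $AD close            # runs whether or not a step above failed
  return "$rc"
}
```

A call such as `normal_flow Settings ios @e3 @e5 test` mirrors the flow above and returns the status of the first failing step.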
2) Debug/Crash Flow
```bash
agent-device open MyApp --platform ios
agent-device logs clear --restart
agent-device logs path
```
Logging is off by default. Enable it only for debugging windows.
logs clear --restart requires an active app session (run open <app> first).
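The "targeted grep" step of the debug flow might look like the sketch below. It assumes that `logs path` prints a single log file path on stdout, which is not confirmed here; `grep_session_log` and the `AD` override variable are hypothetical names.

```shell
AD="${AD:-agent-device}"

# Search the session log for a pattern after reproducing the issue.
# Assumes `logs path` prints one file path on stdout (an assumption).
grep_session_log() {
  pattern="$1"
  log_file="$($AD logs path)" || return 1
  [ -f "$log_file" ] || { echo "no log file at: $log_file" >&2; return 1; }
  # Case-insensitive extended regex, with line numbers; keep only the last 50 hits.
  grep -n -i -E "$pattern" "$log_file" | tail -n 50
}
```

After reproducing a crash, something like `grep_session_log 'fatal|exception'` keeps the output small instead of dumping the whole log.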
3) Replay Maintenance Flow
```bash
agent-device replay -u ./session.ad
```
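To verify the updated selectors, it can help to diff the script before and after the update. The sketch below assumes that `replay -u` rewrites the script in place, which should be checked against your installed version; `replay_update_and_diff` and the `AD` override variable are hypothetical names.

```shell
AD="${AD:-agent-device}"

# Update a replay script and show what changed so the selectors can be reviewed.
# Assumes `replay -u` rewrites the script in place (an assumption).
replay_update_and_diff() {
  script="$1"
  backup="$script.bak"
  cp "$script" "$backup" || return 1
  $AD replay -u "$script" || return 1
  # diff exits 1 when the files differ; that is the expected case, so mask it.
  diff -u "$backup" "$script" || true
}
```

Running `replay_update_and_diff ./session.ad` leaves a `session.ad.bak` copy next to the script for rollback.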
Command Skeleton (Minimal)
Session and navigation
```bash
agent-device devices
agent-device open [app|url] [url]
agent-device open [app] --relaunch
agent-device close [app]
agent-device session list
```
Use boot only as a fallback when open cannot find or connect to a ready target.
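The boot-as-fallback rule could be sketched as follows: try open first, boot only if that fails, then retry open once. Passing only `--platform` to `boot` is an assumption here (check `boot --help`); `open_with_boot_fallback` and the `AD` override variable are hypothetical names.

```shell
AD="${AD:-agent-device}"

# Try open first; boot only if open fails, then retry open once.
# The arguments given to `boot` are an assumption; verify them for your version.
open_with_boot_fallback() {
  app="$1"; platform="$2"
  $AD open "$app" --platform "$platform" && return 0
  echo "open failed; falling back to boot" >&2
  $AD boot --platform "$platform" || return 1
  $AD open "$app" --platform "$platform"
}
```

Keeping boot behind a failed open avoids paying the boot cost when a ready target already exists.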
Snapshot and targeting
```bash
agent-device snapshot -i
agent-device diff snapshot -i
agent-device find "Sign In" click
agent-device press @e1
agent-device fill @e2 text
agent-device is visible id=anchor
```
press is the canonical tap command; click is an alias.
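A selector assertion such as `is visible id=anchor` can serve as a gate before interacting. The polling sketch below assumes that `is visible` exits non-zero when the element is absent, which is not confirmed here; `wait_visible`, its attempt and delay parameters, and the `AD` override variable are hypothetical.

```shell
AD="${AD:-agent-device}"

# Poll `is visible <selector>` until it succeeds or the attempts run out.
# Assumes a non-zero exit status when the element is absent (an assumption).
wait_visible() {
  selector="$1"; attempts="${2:-5}"; delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    $AD is visible "$selector" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_visible id=anchor 5 1 && agent-device press @e1` only taps once the anchor element has appeared.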
Utilities
```bash
agent-device appstate
agent-device get text @e1
agent-device screenshot out.png
agent-device trace start
agent-device trace stop ./trace.log
```
Batch (when the sequence is already known)
```bash
agent-device batch --steps-file /tmp/batch-steps.json --json
```
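Guardrails (High Value Only)
- Re-snapshot after UI mutations (navigation/modal/list changes).
- Prefer snapshot -i; use scope/depth only when needed.
- Use refs for discovery, selectors for replay/assertions.
- Use fill for clear-then-type semantics; use type for focused append typing.
- iOS appstate is session-scoped; Android appstate is live foreground state.
- iOS settings helpers are simulator-only; use faceid match|nonmatch|enroll|unenroll.
- If using --save-script, prefer explicit path syntax (--save-script=flow.ad or ./flow.ad).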
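The first guardrail (re-snapshot after UI mutations) can be made hard to forget by pairing every mutating action with a diff snapshot. This is a minimal sketch; `act_then_resnap` and the `AD` override variable are hypothetical names, and it assumes the diff snapshot output is what you read fresh refs from.

```shell
AD="${AD:-agent-device}"

# Perform a UI-mutating action, then immediately take a diff snapshot
# so that later steps do not rely on stale refs.
act_then_resnap() {
  $AD "$@" || return 1
  $AD diff snapshot -i
}
```

Usage would look like `act_then_resnap press @e3` or `act_then_resnap fill @e5 "hello"`; the snapshot is skipped when the action itself fails.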
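Security and Trust Notes
- Prefer a preinstalled agent-device binary over on-demand package execution.
- If install is required, pin an exact version (for example: npx --yes agent-device@<exact-version> --help).
- Signing/provisioning environment variables are optional, sensitive, and only for iOS physical-device setup.
- Logs/artifacts are written under ~/.agent-device; replay scripts write to explicit paths you provide.
- Keep logging off unless debugging, and use least-privilege/isolated environments for autonomous runs.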
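The first two notes can be combined into one resolver that prefers an installed binary and otherwise refuses to run without a pinned version. This is a sketch under assumptions: `resolve_agent_device` and the `AGENT_DEVICE_VERSION` variable are hypothetical names introduced here, not something agent-device reads.

```shell
# Print the command to use: a preinstalled binary if present, else a pinned npx call.
# AGENT_DEVICE_VERSION is a hypothetical variable used only by this sketch.
resolve_agent_device() {
  if command -v agent-device >/dev/null 2>&1; then
    echo "agent-device"
    return 0
  fi
  case "${AGENT_DEVICE_VERSION:-}" in
    ""|latest|"^"*|"~"*)
      # Unset, floating, or range versions are rejected: only exact pins are allowed.
      echo "agent-device not installed and no exact version pinned" >&2
      return 1 ;;
    *)
      echo "npx --yes agent-device@${AGENT_DEVICE_VERSION}" ;;
  esac
}
```

The printed command can then be stored in a variable and reused, so an unpinned `npx agent-device` never runs by accident.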
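Common Mistakes
- Mixing debug flow into normal runs (keep logs off unless debugging).
- Continuing to use stale refs after screen transitions.
- Using URL opens with Android --activity (unsupported combination).
- Treating boot as the default first step instead of a fallback.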
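One of these mistakes, combining a URL open with `--activity`, is easy to catch before the command runs. The pre-flight check below is a sketch: `check_open_args` is a hypothetical helper, and treating any argument containing `://` as a URL is a simplifying assumption.

```shell
# Reject `open` argument lists that combine a URL target with --activity.
# check_open_args is an illustrative pre-flight check, not an agent-device command.
check_open_args() {
  has_url=0; has_activity=0
  for arg in "$@"; do
    case "$arg" in
      --activity|--activity=*) has_activity=1 ;;
      *://*)                   has_url=1 ;;   # simplification: scheme:// means URL
    esac
  done
  if [ "$has_url" -eq 1 ] && [ "$has_activity" -eq 1 ]; then
    echo "URL opens cannot be combined with --activity" >&2
    return 1
  fi
}
```

A wrapper could call `check_open_args "$@"` before `agent-device open "$@"` and stop early on the unsupported combination.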
References