Droidrun Agent
Provides two async clients (PortalHTTPClient and PortalWSClient), a configuration helper (PortalConfig), and a built-in MCP server for communicating with Android devices running DroidRun Portal. All client methods are async and support async with context managers.
Installation
CODEBLOCK0
PortalHTTPClient
Communicates with Portal's HTTP server (default port 8080) using Bearer token authentication.
CODEBLOCK1
Query Methods (GET)
| Signature | Return Type | Description |
|---|---|---|
| `ping()` | `dict` | Health check, no auth required |
| `get_a11y_tree()` | `dict` | Simplified accessibility tree |
| `get_a11y_tree_full(*, filter: bool = True)` | `dict` | Full accessibility tree; `filter=False` keeps small elements |
| `get_state()` | `dict` | Simplified UI state |
| `get_state_full(*, filter: bool = True)` | `dict` | Full UI state (`a11y_tree` + `phone_state`); `filter=False` keeps small elements |
| `get_phone_state()` | `dict` | Phone state info (current app, activity, keyboard status, etc.) |
| `get_version()` | `str` | Portal app version string |
| `get_packages()` | `list[dict]` | List of launchable apps, each containing `packageName`, `label`, etc. |
| `take_screenshot(*, hide_overlay: bool = True)` | `bytes` | Device screenshot as PNG bytes; `hide_overlay=False` shows the overlay |
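Since `take_screenshot` returns raw PNG bytes, a caller may want to sanity-check the payload before saving it. The sketch below relies only on the PNG file layout (an 8-byte signature followed by the IHDR chunk). The helper name `png_size` is hypothetical and is not part of droidrun-agent.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from PNG bytes, or raise ValueError.

    Hypothetical helper: it inspects only the signature and the IHDR
    chunk, which the PNG format requires to be the first chunk.
    """
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG payload")
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

With a connected client this could be called as `width, height = png_size(await client.take_screenshot())`.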
Action Methods (POST)
| Signature | Return Type | Description |
|---|---|---|
| `tap(x: int, y: int)` | `dict` | Tap screen coordinates |
| `swipe(start_x: int, start_y: int, end_x: int, end_y: int, duration: int \| None = None)` | `dict` | Swipe gesture; `duration` is an optional duration in milliseconds |
| `global_action(action: int)` | `dict` | Execute Android accessibility global action (1=Back, 2=Home, 3=Recents) |
| `start_app(package: str, activity: str \| None = None, stop_before_launch: bool = False)` | `dict` | Launch an app |
| `stop_app(package: str)` | `dict` | Best-effort stop of an app |
| `input_text(text: str, clear: bool = True)` | `dict` | Input text (auto base64-encoded); `clear=True` clears the field first |
| `clear_input()` | `dict` | Clear the focused input field |
| `press_key(key_code: int)` | `dict` | Send Android key code (e.g. 66=Enter, 3=Home, 4=Back) |
| `set_overlay_offset(offset: int)` | `dict` | Set overlay vertical offset in pixels |
| `set_socket_port(port: int)` | `dict` | Update the HTTP server port |
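The integer arguments to `global_action` and `press_key` are easy to mix up (3 is Recents for one and Home for the other). One option is to name them with `IntEnum`. The class names `GlobalAction` and `KeyCode` below are hypothetical conveniences, not part of the package; the values are the ones listed in the table above.

```python
from enum import IntEnum


class GlobalAction(IntEnum):
    """Values accepted by global_action(), as listed in the table."""
    BACK = 1
    HOME = 2
    RECENTS = 3


class KeyCode(IntEnum):
    """A few Android key codes accepted by press_key()."""
    HOME = 3
    BACK = 4
    ENTER = 66


# IntEnum members are ints, so they can be passed wherever an int is
# expected, e.g. `await client.press_key(KeyCode.ENTER)`.
```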
PortalWSClient
Communicates with Portal's WebSocket server (default port 8081) using JSON-RPC style messages. Automatically reconnects when a method is called on a broken connection.
CODEBLOCK2
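The text above says only that the messages are "JSON-RPC style"; the exact wire format is not documented here. As a rough illustration of how such a protocol pairs requests with responses, the sketch below assumes the JSON-RPC field names `id`, `method`, `params`, `result`, and `error`. Both function names and the field layout are assumptions, and Portal's real format may differ.

```python
import itertools
import json
from typing import Any

_ids = itertools.count(1)


def build_request(method: str, **params: Any) -> tuple[int, str]:
    """Serialize a request and return (id, json_text).

    Field names are an assumption modeled on JSON-RPC.
    """
    request_id = next(_ids)
    message = {"id": request_id, "method": method, "params": params}
    return request_id, json.dumps(message)


def parse_response(raw: str, expected_id: int) -> Any:
    """Return the result for expected_id, or raise on mismatch or error."""
    message = json.loads(raw)
    if message.get("id") != expected_id:
        raise ValueError(f"unexpected response id: {message.get('id')!r}")
    if "error" in message:
        raise RuntimeError(str(message["error"]))
    return message.get("result")
```

Matching on `id` is what lets a client keep several calls in flight on one socket and still route each reply to its caller.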
Methods
Supports all action methods from PortalHTTPClient (tap, swipe, global_action, start_app, stop_app, input_text, clear_input, press_key, set_overlay_offset, set_socket_port, take_screenshot) with identical signatures.
Query methods:
| Signature | Return Type | Description |
|---|---|---|
| `get_packages()` | `Any` | List of launchable packages |
| `get_state(*, filter: bool = True)` | `Any` | Full state; `filter=False` keeps small elements |
| `get_version()` | `Any` | Portal version string |
| `get_time()` | `Any` | Device Unix timestamp in milliseconds |
| `install(urls: list[str], hide_overlay: bool = True)` | `Any` | Install APK(s) from URL(s), supports split APKs (WebSocket only) |
WebSocket screenshots automatically parse binary frames and return PNG bytes directly.
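`get_time()` is described as returning a Unix timestamp in milliseconds. A small conversion sketch, assuming the returned value is an `int` (the declared return type is `Any`); the helper name `device_time` is hypothetical:

```python
from datetime import datetime, timezone


def device_time(ms: int) -> datetime:
    """Convert a Unix timestamp in milliseconds to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
```

For example, `device_time(await ws.get_time())` would give the device clock as a timezone-aware value that can be compared with `datetime.now(timezone.utc)`.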
Exceptions
All exceptions inherit from PortalError:
| Exception | Trigger |
|---|---|
| `PortalError` | Base exception |
| `PortalConnectionError` | Cannot connect to Portal server |
| `PortalAuthError` | Invalid or missing token (HTTP 401/403) |
| `PortalTimeoutError` | Request timed out |
| `PortalResponseError` | Server returned unexpected status or error |
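One way to use this hierarchy is to retry only the failures that are likely to be transient (connection and timeout errors) and let authentication and response errors propagate. The sketch below defines local stand-in classes with the same names so that it runs without the package installed; in real code the exceptions would come from droidrun-agent itself. The `with_retry` helper is hypothetical.

```python
import asyncio


# Stand-ins mirroring the hierarchy in the table above, so the sketch
# runs without the package installed.
class PortalError(Exception): ...
class PortalConnectionError(PortalError): ...
class PortalAuthError(PortalError): ...
class PortalTimeoutError(PortalError): ...
class PortalResponseError(PortalError): ...


async def with_retry(call, attempts: int = 3, delay: float = 0.0):
    """Retry transient failures; let auth and response errors propagate."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except (PortalConnectionError, PortalTimeoutError):
            if attempt == attempts:
                raise
            await asyncio.sleep(delay)
```

A call site might look like `await with_retry(lambda: client.tap(500, 800), delay=0.5)`.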
Full Usage Example
CODEBLOCK3
PortalConfig
Helper dataclass for managing connection settings. Supports direct construction or loading from environment variables.
CODEBLOCK4
| Field | Type | Default | Description |
|---|---|---|---|
| `base_url` | `str` | (required) | Portal HTTP or WebSocket base URL |
| `token` | `str` | (required) | Bearer authentication token |
| `timeout` | `float` | `10.0` | Request timeout in seconds |
| `transport` | `str` | `"http"` | `"http"` or `"ws"` |
Environment variables for from_env(): PORTAL_BASE_URL, PORTAL_TOKEN, PORTAL_TIMEOUT, PORTAL_TRANSPORT.
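To make the mapping from environment variables to fields concrete, here is an illustrative stand-in for the dataclass. It is not the library's implementation: the class name `ConfigSketch`, the transport validation, and the behavior on missing variables are assumptions. Only the field names, defaults, and variable names come from the table and list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass
class ConfigSketch:
    """Illustrative stand-in for PortalConfig, not the library's code."""
    base_url: str
    token: str
    timeout: float = 10.0
    transport: str = "http"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "ConfigSketch":
        transport = env.get("PORTAL_TRANSPORT", "http")
        if transport not in ("http", "ws"):
            raise ValueError(f"unsupported transport: {transport!r}")
        return cls(
            base_url=env["PORTAL_BASE_URL"],  # KeyError if unset
            token=env["PORTAL_TOKEN"],  # KeyError if unset
            timeout=float(env.get("PORTAL_TIMEOUT", "10.0")),
            transport=transport,
        )
```

Passing the mapping as a parameter keeps the loader easy to exercise in tests without touching the real process environment.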
MCP Server
A built-in MCP (Model Context Protocol) server exposes all Portal operations as tools for AI agent integration. Requires the mcp optional dependency (pip install droidrun-agent[mcp]).
Starting the server
CODEBLOCK5
The server reads PORTAL_BASE_URL, PORTAL_TOKEN, PORTAL_TIMEOUT, and PORTAL_TRANSPORT from environment variables and communicates over stdio.
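Because the server speaks stdio and takes its settings from the environment, an MCP client typically needs a command, its arguments, and an `env` block. The sketch below generates such an entry using the `mcpServers` layout that several MCP clients accept. Whether a particular client uses this layout is an assumption to check against that client's documentation, and the function name `mcp_server_entry` is hypothetical.

```python
import json


def mcp_server_entry(base_url: str, token: str, transport: str = "http") -> dict:
    """Build a client config entry that launches the server over stdio."""
    return {
        "mcpServers": {
            "droidrun-agent": {
                "command": "droidrun-agent",
                "args": ["--mcp"],
                "env": {
                    "PORTAL_BASE_URL": base_url,
                    "PORTAL_TOKEN": token,
                    "PORTAL_TRANSPORT": transport,
                },
            }
        }
    }


print(json.dumps(mcp_server_entry("http://192.168.1.100:8080", "YOUR_TOKEN"), indent=2))
```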
MCP Tools
| Tool | Description |
|---|---|
| INLINECODE102 | Health check (HTTP only) |
| INLINECODE103 | Tap screen coordinates |
| `portal_swipe` | Swipe gesture |
| `portal_screenshot` | Take screenshot, returns PNG image |
| `portal_get_state` | Get simplified UI state |
| `portal_get_state_full` | Get full UI state (a11y tree + phone state) |
| `portal_get_a11y_tree` | Get simplified accessibility tree (HTTP only) |
| `portal_get_a11y_tree_full` | Get full accessibility tree (HTTP only) |
| `portal_get_phone_state` | Get phone state info (HTTP only) |
| `portal_get_version` | Get Portal app version |
| `portal_get_packages` | List launchable packages |
| `portal_global_action` | Execute accessibility global action (1=Back, 2=Home, 3=Recents) |
| `portal_start_app` | Launch an app by package name |
| `portal_stop_app` | Stop an app |
| `portal_input_text` | Input text into focused field |
| `portal_clear_input` | Clear focused input field |
| `portal_press_key` | Send Android key code (66=Enter, 3=Home, 4=Back) |
| `portal_set_overlay_offset` | Set overlay vertical offset |
| `portal_get_time` | Get device timestamp (WebSocket only) |
| `portal_install` | Install APK(s) from URL(s) (WebSocket only) |
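Several tools in the table are marked as HTTP only or WebSocket only, so which tools are usable depends on `PORTAL_TRANSPORT`. The lookup below encodes those restrictions for the tools that are named in the table. It is a hypothetical helper for callers, not something the server exposes.

```python
# Transport restrictions copied from the table above (named tools only).
HTTP_ONLY = {
    "portal_get_a11y_tree",
    "portal_get_a11y_tree_full",
    "portal_get_phone_state",
}
WS_ONLY = {"portal_get_time", "portal_install"}


def is_available(tool: str, transport: str) -> bool:
    """Return True if the tool can be used with the given transport."""
    if transport not in ("http", "ws"):
        raise ValueError(f"unknown transport: {transport!r}")
    if tool in HTTP_ONLY:
        return transport == "http"
    if tool in WS_ONLY:
        return transport == "ws"
    return True
```

An agent that needs `portal_install`, for instance, would have to run the server with the `ws` transport.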
openclaw integration
Register as an openclaw MCP skill:
CODEBLOCK6
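Skill name: droidrun-agent
Detailed description:
Droidrun Agent
Provides two async clients (PortalHTTPClient and PortalWSClient), a configuration helper (PortalConfig), and a built-in MCP server for communicating with Android devices running DroidRun Portal. All client methods are async, and both clients can be used as async with context managers.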
Installation
```bash
cd droidrun-agent && uv sync              # core only
cd droidrun-agent && uv sync --extra mcp  # with MCP server support
```
PortalHTTPClient
Communicates with Portal's HTTP server (default port 8080) using Bearer token authentication.
```python
import asyncio

from droidrun_agent import PortalHTTPClient


async def main():
    async with PortalHTTPClient(base_url="http://192.168.1.100:8080", token="YOUR_TOKEN") as client:
        await client.ping()
        state = await client.get_state_full()
        await client.tap(200, 400)
        png = await client.take_screenshot()


asyncio.run(main())
```
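Query Methods (GET)
| Signature | Return Type | Description |
|---|---|---|
| `ping()` | `dict` | Health check, no auth required |
| `get_a11y_tree()` | `dict` | Simplified accessibility tree |
| `get_a11y_tree_full(*, filter: bool = True)` | `dict` | Full accessibility tree; `filter=False` keeps small elements |
| `get_state()` | `dict` | Simplified UI state |
| `get_state_full(*, filter: bool = True)` | `dict` | Full UI state (`a11y_tree` + `phone_state`); `filter=False` keeps small elements |
| `get_phone_state()` | `dict` | Phone state info (current app, activity, keyboard status, etc.) |
| `get_version()` | `str` | Portal app version string |
| `get_packages()` | `list[dict]` | List of launchable apps, each containing `packageName`, `label`, etc. |
| `take_screenshot(*, hide_overlay: bool = True)` | `bytes` | Device screenshot as PNG bytes; `hide_overlay=False` shows the overlay |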
Action Methods (POST)
| Signature | Return Type | Description |
|---|---|---|
| `tap(x: int, y: int)` | `dict` | Tap screen coordinates |
| `swipe(start_x: int, start_y: int, end_x: int, end_y: int, duration: int \| None = None)` | `dict` | Swipe gesture; `duration` is an optional duration in milliseconds |
| `global_action(action: int)` | `dict` | Execute Android accessibility global action (1=Back, 2=Home, 3=Recents) |
| `start_app(package: str, activity: str \| None = None, stop_before_launch: bool = False)` | `dict` | Launch an app |
| `stop_app(package: str)` | `dict` | Best-effort stop of an app |
| `input_text(text: str, clear: bool = True)` | `dict` | Input text (auto base64-encoded); `clear=True` clears the field first |
| `clear_input()` | `dict` | Clear the focused input field |
| `press_key(key_code: int)` | `dict` | Send Android key code (e.g. 66=Enter, 3=Home, 4=Back) |
| `set_overlay_offset(offset: int)` | `dict` | Set overlay vertical offset in pixels |
| `set_socket_port(port: int)` | `dict` | Update the HTTP server port |
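The table notes that `input_text` base64-encodes its argument automatically, so callers pass plain text. To show what that step amounts to, the sketch below assumes UTF-8 bytes and the standard base64 alphabet; the exact encoding the client uses is not documented here, and both helper names are hypothetical.

```python
import base64


def encode_text(text: str) -> str:
    """Encode text as base64 over UTF-8 bytes (assumed encoding)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(payload: str) -> str:
    """Inverse of encode_text, handy when inspecting logged requests."""
    return base64.b64decode(payload).decode("utf-8")
```

Encoding the text this way keeps non-ASCII input and special characters intact in transit.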
PortalWSClient
Communicates with Portal's WebSocket server (default port 8081) using JSON-RPC style messages. Automatically reconnects when a method is called on a broken connection.
```python
import asyncio

from droidrun_agent import PortalWSClient


async def main():
    async with PortalWSClient("ws://192.168.1.100:8081", token="YOUR_TOKEN") as ws:
        await ws.tap(200, 400)
        state = await ws.get_state()
        png = await ws.take_screenshot()
        time_ms = await ws.get_time()


asyncio.run(main())
```
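Methods
Supports all action methods from PortalHTTPClient (tap, swipe, global_action, start_app, stop_app, input_text, clear_input, press_key, set_overlay_offset, set_socket_port, take_screenshot) with identical signatures.
Query methods:
| Signature | Return Type | Description |
|---|---|---|
| `get_packages()` | `Any` | List of launchable packages |
| `get_state(*, filter: bool = True)` | `Any` | Full state; `filter=False` keeps small elements |
| `get_version()` | `Any` | Portal version string |
| `get_time()` | `Any` | Device Unix timestamp in milliseconds |
| `install(urls: list[str], hide_overlay: bool = True)` | `Any` | Install APK(s) from URL(s), supports split APKs (WebSocket only) |
WebSocket screenshots automatically parse binary frames and return PNG bytes directly.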
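Exceptions
All exceptions inherit from PortalError:
| Exception | Trigger |
|---|---|
| `PortalError` | Base exception |
| `PortalConnectionError` | Cannot connect to Portal server |
| `PortalAuthError` | Invalid or missing token (HTTP 401/403) |
| `PortalTimeoutError` | Request timed out |
| `PortalResponseError` | Server returned unexpected status or error |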
Full Usage Example
```python
import asyncio

from droidrun_agent import PortalHTTPClient, PortalWSClient


async def demo_http():
    async with PortalHTTPClient("http://localhost:8080", token="YOUR_TOKEN") as client:
        print(await client.ping())
        print("Version:", await client.get_version())
        print("Packages:", len(await client.get_packages()))
        await client.tap(500, 800)
        await client.swipe(500, 1500, 500, 500, duration=300)
        await client.input_text("Hello World")
        await client.press_key(66)  # Enter
        state = await client.get_state_full()
        png = await client.take_screenshot()
        print(f"Screenshot: {len(png)} bytes")


async def demo_ws():
    async with PortalWSClient("ws://localhost:8081", token="YOUR_TOKEN") as ws:
        print("Version:", await ws.get_version())
        print("Time:", await ws.get_time())
        await ws.tap(500, 800)
        await ws.start_app("com.android.settings")
        png = await ws.take_screenshot()
        print(f"Screenshot: {len(png)} bytes")


asyncio.run(demo_http())
asyncio.run(demo_ws())
```
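PortalConfig
Helper dataclass for managing connection settings. Supports direct construction or loading from environment variables.
```python
from droidrun_agent import PortalConfig

# Direct construction
config = PortalConfig(base_url="http://192.168.1.100:8080", token="YOUR_TOKEN")
client = config.create_client()

# Load from environment variables
config = PortalConfig.from_env()
client = config.create_client()
```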
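| Field | Type | Default | Description |
|---|---|---|---|
| `base_url` | `str` | (required) | Portal HTTP or WebSocket base URL |
| `token` | `str` | (required) | Bearer authentication token |
| `timeout` | `float` | `10.0` | Request timeout in seconds |
| `transport` | `str` | `"http"` | `"http"` or `"ws"` |
Environment variables for from_env(): PORTAL_BASE_URL, PORTAL_TOKEN, PORTAL_TIMEOUT, PORTAL_TRANSPORT.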
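MCP Server
A built-in MCP (Model Context Protocol) server exposes all Portal operations as tools for AI agent integration. Requires the mcp optional dependency (pip install droidrun-agent[mcp]).
Starting the server
```bash
# Via the CLI entry point
droidrun-agent --mcp

# Or as a Python module
python -m droidrun_agent --mcp
```
The server reads PORTAL_BASE_URL, PORTAL_TOKEN, PORTAL_TIMEOUT, and PORTAL_TRANSPORT from environment variables and communicates over stdio.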
MCP Tools
| Tool | Description |
|------|-------------|