Mac Control
Automate Mac UI interactions using cliclick (mouse/keyboard) and system tools.
Tools
- cliclick: /opt/homebrew/bin/cliclick - mouse/keyboard control
- screencapture: Built-in screenshot tool
- magick: ImageMagick for image analysis
- osascript: AppleScript for window info
Coordinate System (Eason's Mac Mini)
Current setup: 1920x1080 display, 1:1 scaling (no conversion needed!)
- Screenshot coords = cliclick coords
- If the screenshot shows an element at (800, 500), click at (800, 500)
For Retina Displays (2x)
If screenshot is 2x the logical resolution:
CODEBLOCK0
Calibration Script
Run to verify your scale factor:
CODEBLOCK1
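As a rough illustration of what a scale check involves, the sketch below derives the scale factor from a screenshot's pixel width and the display's logical width, then converts screenshot coordinates to cliclick coordinates. The function names `scale_factor` and `to_click_coords` are hypothetical and are not taken from calibrate-cursor.sh; the widths are example values.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of calibrate-cursor.sh.

# scale_factor PIXEL_WIDTH LOGICAL_WIDTH
# Prints the integer scale (1 for a 1:1 display, 2 for a 2x Retina display).
scale_factor() {
  local pixel_width=$1 logical_width=$2
  echo $(( pixel_width / logical_width ))
}

# to_click_coords X Y SCALE
# Converts screenshot pixel coordinates to cliclick coordinates, printed as "x,y".
to_click_coords() {
  local x=$1 y=$2 scale=$3
  echo "$(( x / scale )),$(( y / scale ))"
}

# Example: a 3840px-wide screenshot of a display that is 1920 wide logically.
scale=$(scale_factor 3840 1920)
to_click_coords 1600 1000 "$scale"   # prints 800,500
```

On macOS the screenshot's pixel width can be read with `sips -g pixelWidth /tmp/screenshot.png`; where the logical width comes from is left as an assumption here. On the 1920x1080 setup described above the scale is 1 and the coordinates pass through unchanged.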
cliclick Commands
CODEBLOCK2
Screenshots
CODEBLOCK3
Workflow: Screenshot → Analyze → Click
Best practice for reliable clicking:
1. Take a screenshot
CODEBLOCK4
2. View the screenshot (Read tool) to find the target coordinates
3. Click at those coordinates (1:1 on 1920x1080)
CODEBLOCK5
4. Verify by taking another screenshot
Example: Click a button
CODEBLOCK6
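The screenshot, click, screenshot sequence can be wrapped in one function. This is a minimal sketch, not one of the helper scripts: `run`, `click_with_verify`, and the `DRY_RUN` switch are hypothetical names, and the 0.5 second pause before the second screenshot is an assumed value. With `DRY_RUN=1` the commands are only printed, which makes it possible to review them before anything is clicked.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper for illustration; paths follow the ones used in this document.
CLICLICK=/opt/homebrew/bin/cliclick

# run CMD [ARGS...]
# Executes the command, or just prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# click_with_verify X Y
# Takes a "before" screenshot, clicks, waits briefly, takes an "after" screenshot.
click_with_verify() {
  local x=$1 y=$2
  run /usr/sbin/screencapture -x /tmp/before.png
  run "$CLICLICK" "c:$x,$y"
  run sleep 0.5   # assumed settle time for the UI to update
  run /usr/sbin/screencapture -x /tmp/after.png
}

# Preview the commands without touching the mouse.
DRY_RUN=1 click_with_verify 850 450
```

Comparing /tmp/before.png and /tmp/after.png afterwards is still a manual step (Read tool), as in the workflow above.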
Window Bounds
CODEBLOCK7
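Window bounds are useful as a sanity check before clicking. The sketch below tests whether a target point lies inside bounds given in the `left, top, right, bottom` form that AppleScript returns for a window. `point_in_bounds` is a hypothetical helper name, and the bounds string is an example value for a full-screen Chrome window on the 1920x1080 display.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration.

# point_in_bounds X Y "LEFT, TOP, RIGHT, BOTTOM"
# Returns 0 if (X, Y) lies inside the bounds, 1 otherwise.
point_in_bounds() {
  local x=$1 y=$2 left top right bottom
  # Split on commas and spaces, matching AppleScript's "0, 38, 1920, 1080" output.
  IFS=', ' read -r left top right bottom <<< "$3"
  [ "$x" -ge "$left" ] && [ "$x" -lt "$right" ] &&
    [ "$y" -ge "$top" ] && [ "$y" -lt "$bottom" ]
}

if point_in_bounds 850 450 "0, 38, 1920, 1080"; then
  echo "inside"    # this branch runs for the example values
else
  echo "outside"
fi
```

A point above the window's top edge, such as (850, 20) with these bounds, falls in the menu bar area and is reported as outside.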
Common Patterns
Chrome Extension Icon (Browser Relay)
Use AppleScript to find exact button position:
CODEBLOCK8
Clicking by Color Detection
If you need to find a specific colored element:
CODEBLOCK9
Dialog Button Click
1. Screenshot the dialog
2. Find the button coordinates visually
3. Click (no scaling on 1920x1080)
CODEBLOCK10
Type in Text Field
CODEBLOCK11
Helper Scripts
Located in /Users/eason/clawd/scripts/:
- calibrate-cursor.sh - Calibrate coordinate scaling
- click-at-visual.sh - Click at screenshot coordinates
- get-cursor-pos.sh - Get current cursor position
- attach-browser-relay.sh - Auto-click Browser Relay extension
Keyboard Navigation (When Clicks Fail)
Google OAuth and protected pages block synthetic mouse clicks! Use keyboard navigation:
CODEBLOCK12
When to use keyboard instead of mouse:
- Google OAuth / login pages (anti-automation protection)
- Popup dialogs with focus trapping
- When mouse clicks consistently fail after verification
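When the number of Tab presses varies from page to page, the AppleScript can be generated rather than written by hand. The following is a sketch under that assumption: `tab_then_return_script` is a hypothetical helper that prints a System Events script pressing Tab a given number of times and then Return, using the 0.15 second delay as an example pause between keystrokes.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration.

# tab_then_return_script COUNT
# Prints an AppleScript that presses Tab COUNT times, then Return.
tab_then_return_script() {
  local count=$1 i
  echo 'tell application "System Events"'
  for (( i = 0; i < count; i++ )); do
    echo '  keystroke tab'
    echo '  delay 0.15'
  done
  echo '  keystroke return'
  echo 'end tell'
}

# Print the script for three Tab presses followed by Return.
tab_then_return_script 3
```

To run the generated script instead of printing it, pass it to osascript: `osascript -e "$(tab_then_return_script 3)"`. The right Tab count depends on the page's focus order, so check it with a screenshot first.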
Chrome Browser Relay & Multiple Windows
Problem: Browser Relay may list tabs from multiple Chrome windows, causing snapshot to fail on the desired tab.
Solution:
1. Close extra Chrome windows before automation
2. Or ensure only the target window has the relay attached
Check tabs visible to relay:
CODEBLOCK13
If target tab missing from list → wrong window attached.
Verify single window:
CODEBLOCK14
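The window count can also act as a guard that stops an automation run early. A minimal sketch, assuming the count has already been obtained from AppleScript; `require_single_window` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical guard for illustration.

# require_single_window COUNT
# Succeeds only when exactly one Chrome window is open.
require_single_window() {
  local count=$1
  if [ "$count" -eq 1 ]; then
    echo "ok: 1 Chrome window"
  else
    echo "abort: expected 1 Chrome window, found $count" >&2
    return 1
  fi
}

# Example with a fixed count; see the note below for reading the live value.
require_single_window 1
```

In practice the argument would come from the AppleScript query, for example `require_single_window "$(osascript -e 'tell application "Google Chrome" to return count of windows')" || exit 1`.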
Verify-Before-Click Workflow
Critical: Always verify coordinates BEFORE clicking important buttons.
CODEBLOCK15
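The "check the cursor is on target" step can be approximated numerically as well as visually. The sketch below compares a cursor position against the target within a pixel tolerance. `cursor_near` is a hypothetical helper, the default tolerance of 3 pixels is an arbitrary choice, and the position is assumed to arrive as an `x,y` string, which is the form `cliclick p` is assumed to print.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration.

# cursor_near "X,Y" TARGET_X TARGET_Y [TOLERANCE]
# Returns 0 if the position is within TOLERANCE pixels of the target on both axes.
cursor_near() {
  local pos=$1 tx=$2 ty=$3 tol=${4:-3} cx cy dx dy
  IFS=',' read -r cx cy <<< "$pos"
  dx=$(( cx - tx ))
  dy=$(( cy - ty ))
  # ${dx#-} strips a leading minus sign, giving the absolute value.
  [ "${dx#-}" -le "$tol" ] && [ "${dy#-}" -le "$tol" ]
}

if cursor_near "851,449" 850 450; then
  echo "cursor on target"   # this branch runs: both offsets are 1 pixel
else
  echo "cursor off target"
fi
```

A possible use is `cursor_near "$(/opt/homebrew/bin/cliclick p)" 850 450 && /opt/homebrew/bin/cliclick c:850,450`, so the click happens only after the move has been confirmed.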
Troubleshooting
Click lands wrong: Verify scale factor with calibration script
cliclick m: doesn't move cursor visually: Use c: (click) instead, or check with cliclick p to confirm position changed
Permission denied: System Settings → Privacy & Security → Accessibility → Add INLINECODE9
Window not found: Check exact app name:
CODEBLOCK16
Clicks ignored on OAuth/protected pages: These pages block synthetic events. Use keyboard navigation (Tab + Enter) instead.
pyautogui vs cliclick coordinates differ: Stick with cliclick for consistency. pyautogui may have different coordinate mapping.
Quartz CGEvent clicks don't work: Some pages (Google OAuth) block low-level mouse events too. Keyboard is the only reliable method.
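Mac Control
Automate Mac UI interactions using cliclick (mouse/keyboard) and system tools.
Tools
- cliclick: /opt/homebrew/bin/cliclick - mouse/keyboard control
- screencapture: Built-in screenshot tool
- magick: ImageMagick for image analysis
- osascript: AppleScript for window info
Coordinate System (Eason's Mac Mini)
Current setup: 1920x1080 display, 1:1 scaling (no conversion needed!)
- Screenshot coords = cliclick coords
- If the screenshot shows an element at (800, 500), click at (800, 500)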
For Retina Displays (2x)
If the screenshot is 2x the logical resolution:
```bash
# Convert: cliclick_coords = screenshot_coords / 2
cliclick c:$((screenshot_x / 2)),$((screenshot_y / 2))
```
Calibration Script
Run to verify your scale factor:
```bash
/Users/eason/clawd/scripts/calibrate-cursor.sh
```
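cliclick Commands
```bash
# Click at coordinates
/opt/homebrew/bin/cliclick c:500,300
# Move the mouse (no click) - note: may not visually update the cursor
/opt/homebrew/bin/cliclick m:500,300
# Double-click
/opt/homebrew/bin/cliclick dc:500,300
# Right-click
/opt/homebrew/bin/cliclick rc:500,300
# Click and drag
/opt/homebrew/bin/cliclick dd:100,100 du:200,200
# Type text (quoted so the space stays inside one argument)
/opt/homebrew/bin/cliclick t:'hello world'
# Press a key (return, escape, tab, etc.)
/opt/homebrew/bin/cliclick kp:return
/opt/homebrew/bin/cliclick kp:escape
# Key press with a modifier (cmd+w closes the window)
/opt/homebrew/bin/cliclick kd:cmd t:w ku:cmd
# Get the current mouse position
/opt/homebrew/bin/cliclick p
# Add a wait around actions (milliseconds)
/opt/homebrew/bin/cliclick -w 100 c:500,300
```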
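Screenshots
```bash
# Full screen (silent)
/usr/sbin/screencapture -x /tmp/screenshot.png
# With cursor (may not work with custom cursor colors)
/usr/sbin/screencapture -C -x /tmp/screenshot.png
# Interactive region selection
screencapture -i region.png
# Delayed screenshot
screencapture -T 3 -x delayed.png  # 3 second delay
```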
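Workflow: Screenshot → Analyze → Click
Best practice for reliable clicking:
1. Take a screenshot
```bash
/usr/sbin/screencapture -x /tmp/screen.png
```
2. View the screenshot (Read tool) to find the target coordinates
3. Click at those coordinates (1:1 on 1920x1080)
```bash
/opt/homebrew/bin/cliclick c:X,Y
```
4. Verify by taking another screenshot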
Example: Click a button
```bash
# 1. Take a screenshot
/usr/sbin/screencapture -x /tmp/before.png
# 2. View the image and find the button at (850, 450)
#    (use the Read tool on /tmp/before.png)
# 3. Click
/opt/homebrew/bin/cliclick c:850,450
# 4. Verify
/usr/sbin/screencapture -x /tmp/after.png
```
Window Bounds
```bash
# Get the Chrome window bounds
osascript -e 'tell application "Google Chrome" to get bounds of front window'
# Returns: 0, 38, 1920, 1080 (left, top, right, bottom)
```
Common Patterns
Chrome Extension Icon (Browser Relay)
Use AppleScript to find the exact button position:
```bash
# Find the Clawdbot extension button position
osascript -e '
tell application "System Events"
  tell process "Google Chrome"
    set toolbarGroup to group 2 of group 3 of toolbar 1 of group 1 of group 1 of group 1 of group 1 of group 1 of window 1
    set allButtons to every pop up button of toolbarGroup
    repeat with btn in allButtons
      if description of btn contains "Clawdbot" then
        return (position of btn) & (size of btn)
      end if
    end repeat
  end tell
end tell'
# Output: 1755, 71, 34, 34 (x, y, width, height)
# Click the button center:
#   center_x = x + width/2  = 1755 + 17 = 1772
#   center_y = y + height/2 = 71 + 17 = 88
/opt/homebrew/bin/cliclick c:1772,88
```
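The center arithmetic above can be done by a small function so that the AppleScript output feeds straight into cliclick. This is a sketch only: `rect_center` is a hypothetical helper, and it assumes the position and size arrive as one `x, y, width, height` string, as in the output shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration.

# rect_center "X, Y, WIDTH, HEIGHT"
# Prints the center of the rectangle as "x,y", ready for a cliclick c: argument.
rect_center() {
  local x y w h
  # Split on commas and spaces, matching the "1755, 71, 34, 34" output format.
  IFS=', ' read -r x y w h <<< "$1"
  echo "$(( x + w / 2 )),$(( y + h / 2 ))"
}

rect_center "1755, 71, 34, 34"   # prints 1772,88
```

With the AppleScript output captured in a variable such as `$rect`, the click becomes `/opt/homebrew/bin/cliclick "c:$(rect_center "$rect")"`.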
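Clicking by Color Detection
If you need to find a specific colored element:
```bash
# Find red (#FF0000) pixels in the screenshot
magick /tmp/screen.png txt:- | grep '#FF0000' | head -5
# Compute the center of the colored region (prints nothing if no pixel matches)
magick /tmp/screen.png txt:- | grep '#FF0000' | awk -F'[,:]' '
  BEGIN { sx = 0; sy = 0; c = 0 }
  { sx += $1; sy += $2; c++ }
  END { if (c > 0) printf "Center: (%d, %d)\n", sx / c, sy / c }'
```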
Dialog Button Click
1. Screenshot the dialog
2. Find the button coordinates visually
3. Click (no scaling on 1920x1080)
```bash
# Example: click the OK button at (960, 540)
/opt/homebrew/bin/cliclick c:960,540
```
Type in Text Field
```bash
# Click to focus, then type
/opt/homebrew/bin/cliclick c:500,300
sleep 0.2
/opt/homebrew/bin/cliclick t:'Hello world'
/opt/homebrew/bin/cliclick kp:return
```
Helper Scripts
Located in /Users/eason/clawd/scripts/:
- calibrate-cursor.sh - Calibrate coordinate scaling
- click-at-visual.sh - Click at screenshot coordinates
- get-cursor-pos.sh - Get current cursor position
- attach-browser-relay.sh - Auto-click Browser Relay extension
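Keyboard Navigation (When Clicks Fail)
Google OAuth and protected pages block synthetic mouse clicks! Use keyboard navigation:
```bash
# Tab to move between elements
osascript -e 'tell application "System Events" to keystroke tab'
# Shift+Tab to move backward
osascript -e 'tell application "System Events" to key code 48 using shift down'
# Return to activate the focused element
osascript -e 'tell application "System Events" to keystroke return'
# Full workflow: Tab 3 times, then Return
osascript -e '
tell application "System Events"
  keystroke tab
  delay 0.15
  keystroke tab
  delay 0.15
  keystroke tab
  delay 0.15
  keystroke return
end tell'
```
When to use keyboard instead of mouse:
- Google OAuth / login pages (anti-automation protection)
- Popup dialogs with focus trapping
- When mouse clicks consistently fail after verification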
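Chrome Browser Relay & Multiple Windows
Problem: Browser Relay may list tabs from multiple Chrome windows, causing snapshot to fail on the desired tab.
Solution:
1. Close extra Chrome windows before automation
2. Or ensure only the target window has the relay attached
Check tabs visible to relay:
```bash
# In agent code
browser action=tabs profile=chrome
```
If the target tab is missing from the list → the wrong window is attached.
Verify single window:
```bash
osascript -e 'tell application "Google Chrome" to return count of windows'
```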
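Verify-Before-Click Workflow
Critical: Always verify coordinates BEFORE clicking important buttons.
```bash
# 1. Take a screenshot
osascript -e 'do shell script "/usr/sbin/screencapture -x /tmp/before.png"'
# 2. View the screenshot (Read tool) and note the target position
# 3. Move the mouse to verify the position (optional); replace X and Y with the target
python3 -c 'import pyautogui; pyautogui.moveTo(X, Y)'
osascript -e 'do shell script "/usr/sbin/screencapture -C -x /tmp/verify.png"'
# 4. Check that the cursor is on the target, then click
```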