CLI AI Proxy
Local OpenAI-compatible proxy that bridges Gemini CLI and Claude Code to a unified REST API. Requests go through installed CLI tools — no direct API calls, no API key management.
When to Use
✅ User asks to start/stop/check the AI proxy
✅ User wants to route requests through Gemini CLI or Claude Code
✅ User asks about available models or proxy health
✅ User wants to configure OpenClaw to use the proxy
✅ Troubleshooting proxy connectivity or CLI issues
❌ Direct API calls to OpenAI/Anthropic/Google (this proxy only uses CLI tools)
❌ Managing API keys (CLIs handle their own authentication)
Quick Reference
| Action | Command |
|---|
| Start proxy | INLINECODE0 |
| Stop proxy |
{baseDir}/scripts/stop.sh |
| Check status |
{baseDir}/scripts/status.sh |
| Health check |
{baseDir}/scripts/health.sh |
| Configure OpenClaw |
{baseDir}/scripts/configure-provider.sh |
| Full install |
{baseDir}/scripts/install.sh |
Proxy Lifecycle
Starting
CODEBLOCK0
Starts the proxy on 127.0.0.1:9090 (default). The proxy listens for OpenAI-compatible requests and routes them to the appropriate CLI tool.
Before starting, verify at least one CLI is available:
- -
gemini --version (Gemini CLI) - INLINECODE8 (Claude Code)
Checking Status
CODEBLOCK1
Shows: running/stopped, PID, health endpoint data, available CLI providers, concurrency stats.
Stopping
CODEBLOCK2
Gracefully shuts down the proxy: stops accepting connections, kills active CLI subprocesses, cleans up.
Available Models
| Model ID | Provider | Backend Model |
|---|
| INLINECODE9 | Gemini CLI | gemini-2.5-flash |
| INLINECODE10 |
Gemini CLI | gemini-2.5-pro |
|
claude | Claude Code | sonnet |
|
claude-opus | Claude Code | opus |
When OpenClaw is configured, use as cli-ai-proxy/gemini, cli-ai-proxy/claude, etc.
OpenClaw Integration
To configure OpenClaw to route through the proxy:
CODEBLOCK3
This automatically:
- 1. Adds
cli-ai-proxy as a provider in INLINECODE16 - Registers all proxy models in the agent defaults
- Creates a backup of the original config
After configuring, set the default model in openclaw.json:
CODEBLOCK4
API Endpoints
The proxy exposes:
- -
POST /v1/chat/completions — Chat completions (streaming + non-streaming) - INLINECODE19 — List available models
- INLINECODE20 — Health check with provider status and concurrency info
Default base URL: INLINECODE21
For full API details see references/api.md.
Image Support
The proxy supports images in messages. When a request contains image_url content parts:
- 1. Images are saved to temporary files
- The prompt instructs the CLI to read the image via its built-in file tools
- Temp files are automatically cleaned up after each request
Supports both base64 data URLs and remote image URLs.
Configuration
Config file: config.yaml in the proxy installation directory.
Key settings:
- -
server.port — Listen port (default: 9090) - INLINECODE25 — Max concurrent CLI processes (default: 5)
- INLINECODE26 — CLI process timeout in ms (default: 300000)
- INLINECODE27 — Default model when none specified
For full configuration options see references/configuration.md.
Troubleshooting
Proxy won't start
- 1. Check if port 9090 is already in use: INLINECODE28
- Verify Node.js is available: INLINECODE29
- Check logs: read the proxy.log file in the installation directory
CLI not available
- 1. Verify CLI is installed and in PATH:
which gemini or INLINECODE31 - Check CLI auth:
gemini --version or INLINECODE33 - The proxy health endpoint shows which CLIs are available
429 Too Many Requests
The concurrency limit has been reached. Either:
- - Wait for current requests to complete
- Increase
concurrency.max in config.yaml
Timeout errors (504)
The CLI process took too long. Either:
- - Increase
timeout in config.yaml - Check if the CLI is hanging (auth issues, network)
For more troubleshooting see references/troubleshooting.md.
CLI AI 代理
本地兼容 OpenAI 的代理,将 Gemini CLI 和 Claude Code 桥接到统一的 REST API。请求通过已安装的 CLI 工具转发——无需直接调用 API,无需管理 API 密钥。
使用场景
✅ 用户要求启动/停止/检查 AI 代理
✅ 用户希望通过 Gemini CLI 或 Claude Code 路由请求
✅ 用户询问可用模型或代理健康状态
✅ 用户想要配置 OpenClaw 使用代理
✅ 排查代理连接或 CLI 问题
❌ 直接调用 OpenAI/Anthropic/Google 的 API(此代理仅使用 CLI 工具)
❌ 管理 API 密钥(CLI 自行处理身份验证)
快速参考
| 操作 | 命令 |
|---|
| 启动代理 | {baseDir}/scripts/start.sh |
| 停止代理 |
{baseDir}/scripts/stop.sh |
| 检查状态 | {baseDir}/scripts/status.sh |
| 健康检查 | {baseDir}/scripts/health.sh |
| 配置 OpenClaw | {baseDir}/scripts/configure-provider.sh |
| 完整安装 | {baseDir}/scripts/install.sh |
代理生命周期
启动
bash
{baseDir}/scripts/start.sh
在 127.0.0.1:9090(默认)上启动代理。代理监听兼容 OpenAI 的请求,并将其路由到相应的 CLI 工具。
启动前,请确认至少有一个 CLI 可用:
- - gemini --version(Gemini CLI)
- claude --version(Claude Code)
检查状态
bash
{baseDir}/scripts/status.sh
显示:运行/停止状态、PID、健康端点数据、可用 CLI 提供商、并发统计。
停止
bash
{baseDir}/scripts/stop.sh
优雅关闭代理:停止接受连接,终止活动的 CLI 子进程,清理资源。
可用模型
| 模型 ID | 提供商 | 后端模型 |
|---|
| gemini | Gemini CLI | gemini-2.5-flash |
| gemini-pro |
Gemini CLI | gemini-2.5-pro |
| claude | Claude Code | sonnet |
| claude-opus | Claude Code | opus |
配置 OpenClaw 后,使用 cli-ai-proxy/gemini、cli-ai-proxy/claude 等格式。
OpenClaw 集成
配置 OpenClaw 通过代理路由请求:
bash
{baseDir}/scripts/configure-provider.sh
此操作会自动:
- 1. 在 ~/.openclaw/openclaw.json 中添加 cli-ai-proxy 作为提供商
- 将所有代理模型注册到代理默认设置中
- 创建原始配置的备份
配置完成后,在 openclaw.json 中设置默认模型:
json
{ agents: { defaults: { model: { primary: cli-ai-proxy/gemini } } } }
API 端点
代理暴露以下端点:
- - POST /v1/chat/completions — 聊天补全(流式 + 非流式)
- GET /v1/models — 列出可用模型
- GET /health — 健康检查,包含提供商状态和并发信息
默认基础 URL:http://127.0.0.1:9090/v1
完整 API 详情请参见 references/api.md。
图片支持
代理支持消息中的图片。当请求包含 image_url 内容部分时:
- 1. 图片保存到临时文件
- 提示指令要求 CLI 通过其内置文件工具读取图片
- 每次请求后自动清理临时文件
支持 base64 数据 URL 和远程图片 URL。
配置
配置文件:代理安装目录中的 config.yaml。
关键设置:
- - server.port — 监听端口(默认:9090)
- concurrency.max — 最大并发 CLI 进程数(默认:5)
- timeout — CLI 进程超时时间(毫秒,默认:300000)
- defaultModel — 未指定时的默认模型
完整配置选项请参见 references/configuration.md。
故障排查
代理无法启动
- 1. 检查端口 9090 是否已被占用:lsof -i :9090
- 确认 Node.js 可用:node --version
- 检查日志:读取安装目录中的 proxy.log 文件
CLI 不可用
- 1. 确认 CLI 已安装并在 PATH 中:which gemini 或 which claude
- 检查 CLI 认证:gemini --version 或 claude --version
- 代理健康端点显示哪些 CLI 可用
429 请求过多
已达到并发限制。可以:
- - 等待当前请求完成
- 增加 config.yaml 中的 concurrency.max
超时错误(504)
CLI 进程耗时过长。可以:
- - 增加 config.yaml 中的 timeout
- 检查 CLI 是否挂起(认证问题、网络问题)
更多故障排查请参见 references/troubleshooting.md。