Ops Deck Lite — Code Search + Prompt Library
Two services that cut the time an AI agent spends locating code and rewriting prompts: semantic code search and a categorized prompt library. Lightweight (~200MB RAM for both services, not counting Ollama), local-only, zero cloud costs.
For the full operational stack (agent intel, social pipeline, dev journal, monitoring), see ops-deck.
What You Get
1. Semantic Code Search (:5204)
Search your entire codebase by meaning, not just text matching. Ask "authentication middleware" and find the actual auth code even if it's called verifyToken or checkSession.
- Hybrid search: vector similarity + keyword matching
- Local embeddings: qwen3-embedding:8b via Ollama (free, private)
- Code summaries: each chunk gets a natural language summary for better semantic matching
- Fast: <100ms search across 96K+ code chunks
- Nightly re-index: cron at 4am keeps the index fresh
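```bash
# Search
curl -s -X POST http://localhost:5204/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "database connection pooling", "mode": "hybrid", "limit": 10}'

# Health check
curl -s http://localhost:5204/api/health

# Re-index (with summaries)
curl -X POST "http://localhost:5204/api/index?summarize=true"

# Filter by project
curl -s -X POST http://localhost:5204/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "error handling", "mode": "hybrid", "project": "my-api", "limit": 5}'
```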
Modes:
- `hybrid` (default, best): combines vector similarity with text matching
- `code`: raw code matching only
- `summary`: search against natural language summaries
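To make the hybrid mode more concrete, here is a minimal Python sketch of how a vector score and a keyword score might be blended into one ranking value. The function names, the 0.7 weighting (`alpha`), and the token-overlap keyword measure are illustrative assumptions; the actual server may rank results differently.

```python
from __future__ import annotations

import math
import re


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear as whole tokens in the text."""
    terms = set(re.findall(r"\w+", query.lower()))
    if not terms:
        return 0.0
    tokens = set(re.findall(r"\w+", text.lower()))
    return len(terms & tokens) / len(terms)


def hybrid_score(
    query: str,
    query_vec: list[float],
    chunk_text: str,
    chunk_vec: list[float],
    alpha: float = 0.7,  # assumed weighting, not taken from the real service
) -> float:
    """Weighted blend: alpha * vector similarity + (1 - alpha) * keyword overlap."""
    vector_part = cosine_similarity(query_vec, chunk_vec)
    keyword_part = keyword_score(query, chunk_text)
    return alpha * vector_part + (1 - alpha) * keyword_part
```

With a blend like this, a chunk named `verifyToken` can still rank highly for "authentication middleware" through its vector score alone, while exact term hits add a boost on top.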
2. Prompt Library (:5202)
Categorized, searchable prompt templates. Stop writing the same prompts from scratch every session.
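```bash
# List all prompts
curl -s http://localhost:5202/api/prompts | python3 -c "
import sys, json
for p in json.load(sys.stdin):
    print(f'{p[\"id\"]}: {p[\"title\"]} [{p[\"category\"]}]')
"

# Get a specific prompt (append the prompt id, per GET /api/prompts/:id)
curl -s http://localhost:5202/api/prompts/

# Create a prompt
curl -s -X POST http://localhost:5202/api/prompts \
  -H "Content-Type: application/json" \
  -d '{"title": "Code Review", "category": "coding", "content": "Review the following code..."}'
```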
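Once the library holds more than a handful of prompts, an agent may prefer to see them grouped by category. The sketch below assumes that `GET /api/prompts` returns a JSON array of objects with `id`, `title`, and `category` fields; the function name is hypothetical and the parsing is kept separate from the HTTP call so it can be tried on any saved response.

```python
from __future__ import annotations

import json
from collections import defaultdict


def group_prompts_by_category(payload: str) -> dict[str, list[str]]:
    """Parse a /api/prompts response body and group "id: title" labels by category.

    Assumes the body is a JSON array of objects carrying id, title and category.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    for prompt in json.loads(payload):
        grouped[prompt["category"]].append(f'{prompt["id"]}: {prompt["title"]}')
    return dict(grouped)
```

The response body can come from `curl -s http://localhost:5202/api/prompts` saved to a file, or from any HTTP client.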
Prerequisites
- Node.js 18+ (for the prompt library)
- Python 3.10+ with FastAPI and uvicorn (for code search)
- Ollama with the qwen3-embedding:8b model
- PM2 for process management
- SQLite (for the code search index; no external DB)
Setup
1. Install dependencies
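```bash
npm install -g pm2
pip install fastapi uvicorn aiofiles

# Ollama embedding model
ollama pull qwen3-embedding:8b
```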
2. Create the Code Search service
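```bash
mkdir -p pipeline/work/code-search
cd pipeline/work/code-search

# The server needs:
# - server.py (FastAPI app)
# - code_index.db (SQLite, auto-created on first index)
# - Ollama running locally for embeddings
```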
Key code search server features:
- Walks your project directories, splits code into chunks
- Generates embeddings via Ollama API (localhost:11434)
- Stores chunks + embeddings + summaries in SQLite
- FastAPI with POST /api/search, GET /api/health, POST /api/index
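The chunk-and-store steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fixed 40-line window, the `chunks` table schema, and the JSON-encoded embedding column are all hypothetical, and the embedding is passed in as a plain list (in the real service it would come from the Ollama API).

```python
from __future__ import annotations

import json
import sqlite3


def split_into_chunks(source: str, max_lines: int = 40) -> list[tuple[int, str]]:
    """Split source text into fixed-size line windows as (start_line, text) pairs."""
    lines = source.splitlines()
    return [
        (start + 1, "\n".join(lines[start:start + max_lines]))
        for start in range(0, len(lines), max_lines)
    ]


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the index database with an assumed single-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS chunks (
               id INTEGER PRIMARY KEY,
               file_path TEXT NOT NULL,
               start_line INTEGER NOT NULL,
               content TEXT NOT NULL,
               summary TEXT,
               embedding TEXT
           )"""
    )
    return conn


def store_chunk(
    conn: sqlite3.Connection,
    file_path: str,
    start_line: int,
    content: str,
    embedding: list[float],
    summary: str | None = None,
) -> int:
    """Insert one chunk with its embedding (stored as JSON text); returns the row id."""
    cur = conn.execute(
        "INSERT INTO chunks (file_path, start_line, content, summary, embedding) "
        "VALUES (?, ?, ?, ?, ?)",
        (file_path, start_line, content, summary, json.dumps(embedding)),
    )
    conn.commit()
    return cur.lastrowid
```

Fixed line windows are the simplest possible splitter; a real indexer might instead split on function or class boundaries so that each chunk is a meaningful unit.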
3. Create the Prompt Library
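```bash
mkdir -p pipeline/work/prompt-library/backend
cd pipeline/work/prompt-library/backend

# Express server with:
# - GET /api/prompts (list all)
# - GET /api/prompts/:id (get one)
# - POST /api/prompts (create)
# - PUT /api/prompts/:id (update)
# - DELETE /api/prompts/:id (delete)
# - SQLite or JSON file storage
```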
4. PM2 config
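```javascript
// ecosystem.config.cjs
module.exports = {
  apps: [
    {
      name: "code-search",
      cwd: "./pipeline/work/code-search",
      script: "server.py",
      interpreter: "python3",
      autorestart: true,
    },
    {
      name: "prompt-library-api",
      cwd: "./pipeline/work/prompt-library/backend",
      script: "server.js",
      autorestart: true,
    },
  ],
};
```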
5. Start and index
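```bash
pm2 start ecosystem.config.cjs
pm2 save

# Initial code index (takes a few minutes depending on codebase size)
curl -X POST "http://localhost:5204/api/index?summarize=true"

# Set up the nightly re-index (4am every day)
(crontab -l 2>/dev/null; echo "0 4 * * * curl -s -X POST 'http://localhost:5204/api/index?summarize=true' > /dev/null") | crontab -
```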
Agent Integration
Add to your AGENTS.md or TOOLS.md:
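```markdown
## Code Search API (USE THIS FIRST)
Before using grep, spawning sub-agents, or reading 10 files: call this API.
curl -s -X POST http://localhost:5204/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "your search", "mode": "hybrid", "limit": 10}'

## Prompt Library
Before writing a prompt from scratch, check whether one already exists:
curl -s http://localhost:5202/api/prompts
```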
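For agents that call the search service from Python rather than through curl, a small standard-library helper may be convenient. This is a sketch: the function names are hypothetical, the body fields (`query`, `mode`, `limit`, `project`) and the three mode names follow the usage described in this document, and the response is assumed to be JSON of unspecified shape.

```python
from __future__ import annotations

import json
import urllib.request

SEARCH_URL = "http://localhost:5204/api/search"
VALID_MODES = {"hybrid", "code", "summary"}


def build_search_request(
    query: str,
    mode: str = "hybrid",
    limit: int = 10,
    project: str | None = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the code search endpoint."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    body = {"query": query, "mode": mode, "limit": limit}
    if project is not None:
        body["project"] = project  # optional per-project filter
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, **options):
    """Send the request and decode the JSON response (requires the service to be running)."""
    request = build_search_request(query, **options)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the helper easy to check without a running server, for example `search("error handling", project="my-api", limit=5)` once the service is up.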
Resource Usage
| Service | RAM | CPU | Disk |
|---|---|---|---|
| Code Search | ~150MB | <1% idle | ~50MB index per 100K chunks |
| Prompt Library | ~50MB | <1% idle | <1MB |
| Ollama (embedding model) | ~4GB | Spikes during indexing | ~4GB model |
Total: ~200MB for the services (Ollama runs independently and is shared with other tools).
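The disk figure scales linearly with chunk count, so a rough estimate for a given codebase is simple arithmetic. The helper below is a hypothetical illustration that only applies the ~50MB per 100K chunks figure quoted above; real index sizes will vary with chunk length and summary size.

```python
def estimate_index_mb(chunk_count: int, mb_per_100k: float = 50.0) -> float:
    """Rough index size in MB, scaling the quoted ~50MB per 100K chunks linearly."""
    return chunk_count * mb_per_100k / 100_000
```

For the 96K-chunk codebase mentioned earlier, this gives roughly 48MB.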
Why Not Just Grep?
Grep finds exact text matches. Code search finds meaning:
| Query | Grep finds | Code Search finds |
|---|---|---|
| "auth middleware" | Files containing "auth middleware" | verifyToken(), checkSession(), requireAuth() |
| "database pooling" | Files containing "database pooling" | createPool(), getConnection(), pg.Pool config |
| "error handling" | Files containing "error handling" | try/catch blocks, error middleware, custom Error classes |
The embeddings understand code semantics. That's the whole point.
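Ops Deck Lite: Code Search + Prompt Library
Two high-impact services that make any AI agent far more efficient: semantic code search and a categorized prompt library. Lightweight (about 200MB of RAM), local-only, zero cloud costs.
For the full operational stack (agent intel, social pipeline, dev journal, monitoring), see ops-deck.
What You Get
1. Semantic Code Search (:5204)
Search your entire codebase by meaning, not just text matching. Search for "authentication middleware" and find the relevant auth code even if it is actually named verifyToken or checkSession.
- Hybrid search: vector similarity + keyword matching
- Local embeddings: qwen3-embedding:8b via Ollama (free, private)
- Code summaries: each chunk gets a natural language summary for better semantic matching
- Fast: <100ms search across 96K+ code chunks
- Nightly re-index: a cron job at 4am keeps the index fresh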
```bash
# Search
curl -s -X POST http://localhost:5204/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "database connection pooling", "mode": "hybrid", "limit": 10}'

# Health check
curl -s http://localhost:5204/api/health

# Re-index (with summaries)
curl -X POST "http://localhost:5204/api/index?summarize=true"

# Filter by project
curl -s -X POST http://localhost:5204/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "error handling", "mode": "hybrid", "project": "my-api", "limit": 5}'
```
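Modes:
- `hybrid` (default, best): combines vector similarity with text matching
- `code`: raw code matching only
- `summary`: search against natural language summaries
2. Prompt Library (:5202)
Categorized, searchable prompt templates. No need to write the same prompts from scratch every session.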
```bash
# List all prompts
curl -s http://localhost:5202/api/prompts | python3 -c "
import sys, json
for p in json.load(sys.stdin):
    print(f'{p[\"id\"]}: {p[\"title\"]} [{p[\"category\"]}]')
"

# Get a specific prompt (append the prompt id, per GET /api/prompts/:id)
curl -s http://localhost:5202/api/prompts/

# Create a prompt
curl -s -X POST http://localhost:5202/api/prompts \
  -H "Content-Type: application/json" \
  -d '{"title": "Code Review", "category": "coding", "content": "Review the following code..."}'
```
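Prerequisites
- Node.js 18+ (for the prompt library)
- Python 3.10+ with FastAPI and uvicorn (for code search)
- Ollama with the qwen3-embedding:8b model
- PM2 for process management
- SQLite (for the code search index; no external database)
Setup
1. Install dependencies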
```bash
npm install -g pm2
pip install fastapi uvicorn aiofiles

# Ollama embedding model
ollama pull qwen3-embedding:8b
```
2. Create the Code Search service
```bash
mkdir -p pipeline/work/code-search
cd pipeline/work/code-search

# The server needs:
# - server.py (FastAPI app)
# - code_index.db (SQLite, auto-created on first index)
# - Ollama running locally for embeddings
```
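Key code search server features:
- Walks your project directories, splits code into chunks
- Generates embeddings via the Ollama API (localhost:11434)
- Stores chunks, embeddings, and summaries in SQLite
- FastAPI with POST /api/search, GET /api/health, POST /api/index
3. Create the Prompt Library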
```bash
mkdir -p pipeline/work/prompt-library/backend
cd pipeline/work/prompt-library/backend

# Express server with:
# - GET /api/prompts (list all)
# - GET /api/prompts/:id (get one)
# - POST /api/prompts (create)
# - PUT /api/prompts/:id (update)
# - DELETE /api/prompts/:id (delete)
# - SQLite or JSON file storage
```
4. PM2 config
```javascript
// ecosystem.config.cjs
module.exports = {
  apps: [
    {
      name: "code-search",
      cwd: "./pipeline/work/code-search",
      script: "server.py",
      interpreter: "python3",
      autorestart: true,
    },
    {
      name: "prompt-library-api",
      cwd: "./pipeline/work/prompt-library/backend",
      script: "server.js",
      autorestart: true,
    },
  ],
};
```
5. Start and index
```bash
pm2 start ecosystem.config.cjs
pm2 save

# Initial code index (takes a few minutes depending on codebase size)
curl -X POST "http://localhost:5204/api/index?summarize=true"

# Set up the nightly re-index (4am every day)
(crontab -l 2>/dev/null; echo "0 4 * * * curl -s -X POST 'http://localhost:5204/api/index?summarize=true' > /dev/null") | crontab -
```
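Agent Integration
Add to your AGENTS.md or TOOLS.md:
```markdown
## Code Search API (USE THIS FIRST)
Before using grep, spawning sub-agents, or reading 10 files: call this API.
curl -s -X POST http://localhost:5204/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "your search", "mode": "hybrid", "limit": 10}'

## Prompt Library
Before writing a prompt from scratch, check whether one already exists:
curl -s http://localhost:5202/api/prompts
```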
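Resource Usage
| Service | RAM | CPU | Disk |
|---|---|---|---|
| Code Search | ~150MB | <1% idle | ~50MB index per 100K chunks |
| Prompt Library | ~50MB | <1% idle | <1MB |
| Ollama (embedding model) | ~4GB | Spikes during indexing | ~4GB model |
Total: ~200MB for the services (Ollama runs independently and is shared with other tools).
Why Not Just Grep?
Grep finds exact text matches. Code search finds meaning:
| Query | Grep finds | Code Search finds |
|---|---|---|
| "auth middleware" | Files containing "auth middleware" | verifyToken(), checkSession(), requireAuth() |
| "database pooling" | Files containing "database pooling" | createPool(), getConnection(), pg.Pool config |
| "error handling" | Files containing "error handling" | try/catch blocks, error middleware, custom Error classes |
The embeddings understand code semantics. That is the core value.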