clawgrep
Semantic and keyword file search with grep-compatible output. It runs fully locally: on first run, it automatically downloads a small ONNX embedding model (~30 MB) from Hugging Face and caches it in the local cache directory. After that, all searches are offline.
Check availability
```bash
clawgrep --version
```
If not found, install from the open-source repository using any of these methods (only one needed):
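```bash
cargo install clawgrep    # Rust (recommended)
npm install -g clawgrep   # Node.js
pip install clawgrep      # Python
```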
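A script that depends on clawgrep can guard on availability before searching. The following is a minimal sketch; the function names `have_tool` and `require_clawgrep` are hypothetical helpers, not part of clawgrep.

```bash
# Return 0 if a command named "$1" (default: clawgrep) is available in this shell.
have_tool() {
  command -v "${1:-clawgrep}" >/dev/null 2>&1
}

# Hypothetical guard: fail early with a clear message instead of hitting
# "command not found" later in the script.
require_clawgrep() {
  if ! have_tool clawgrep; then
    echo "clawgrep not found; install it first" >&2
    return 127
  fi
}
```

Calling `require_clawgrep || exit 1` at the top of a script keeps the failure close to its cause.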
Basic usage
```bash
clawgrep --no-color "query" <path>
```
Always pass --no-color when parsing output programmatically.
Search a workspace
```bash
clawgrep --no-color "previous discussion about the auth flow" ./memory
```
Output format
Grep-compatible, one result per line, ranked by relevance (best first):
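```
$ clawgrep --no-color "previous discussion about the auth flow" ./memory
memory/2025-06-12-auth-design.md:8:Decided to use OAuth2 with PKCE for all client auth.
memory/2025-06-12-auth-design.md:14:Token refresh should be transparent to the user.
memory/2025-06-10-planning.md:3:Auth flow is the top priority for this sprint.
memory/archive/2025-05-session-notes.md:42:Discussed moving auth to a separate service.
memory/archive/2025-05-session-notes.md:87:Need to revisit the token expiry policy.
```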
Each line is file:line:text. Context lines (from -C) use - as the
separator instead: file-line-text.
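For scripts that consume match lines, the three fields can be split with plain shell parameter expansion. This is a sketch under one assumption: the file path contains no `:` character (as with grep, such paths make the format ambiguous). The name `parse_result` is hypothetical, and the sketch handles match lines only, not `-C` context lines.

```bash
# Split one match line of the form file:line:text into tab-separated fields.
# Assumes the file path itself contains no ':'.
parse_result() {
  local line=$1
  local file=${line%%:*}      # everything before the first ':'
  local rest=${line#*:}       # everything after the first ':'
  local lineno=${rest%%:*}    # up to the next ':'
  local text=${rest#*:}       # remainder; may itself contain ':'
  printf '%s\t%s\t%s\n' "$file" "$lineno" "$text"
}
```

Splitting only on the first two colons keeps any colons inside the matched text intact.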
Exit codes
| Code | Meaning |
|---|---|
| 0 | Match found |
| 1 | No match |
| 2 | Error |
Same as grep. Use -q for existence checks without output.
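Because the exit codes follow grep, a caller can branch on them without reading any output. A minimal sketch, assuming only the `-q` and `--no-color` flags described here; `check_match` is a hypothetical helper name.

```bash
# Report whether any match exists, using only the exit code.
# Echoes "found", "none", or "error" and returns clawgrep's status.
check_match() {
  local query=$1 dir=$2 status=0
  clawgrep -q --no-color "$query" "$dir" || status=$?
  case $status in
    0) echo "found" ;;
    1) echo "none" ;;
    *) echo "error" ;;
  esac
  return "$status"
}
```

Capturing the status with `|| status=$?` keeps the helper usable in scripts that run under `set -e`.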
Choosing search mode
Default weights: 70% semantic, 30% keyword.
Concept search (don't know exact wording):
```bash
clawgrep --no-color "decisions about migration strategy" ./memory
```
Exact identifier search (note IDs, tags, serial numbers):
```bash
clawgrep --no-color --keyword-weight 0.8 --semantic-weight 0.2 PROJ-1042 ./memory
```
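The choice between the two modes can be automated with a rough heuristic. The sketch below is an assumption for illustration, not clawgrep behaviour: it treats a single token containing a digit (such as PROJ-1042) as an exact identifier and leaves everything else on the default weights. `weight_flags` is a hypothetical helper name.

```bash
# Print the weight flags to use for a query, or nothing to keep the defaults.
# Heuristic (an assumption, not clawgrep behaviour): a single token that
# contains a digit is treated as an exact identifier.
weight_flags() {
  case $1 in
    *[[:space:]]*) ;;   # multi-word query: keep the default weights
    *[0-9]*) echo "--keyword-weight 0.8 --semantic-weight 0.2" ;;
  esac
}

# Usage (leave $(...) unquoted so the flags split into separate words):
#   q="PROJ-1042"
#   clawgrep --no-color $(weight_flags "$q") "$q" ./memory
```

The heuristic will misclassify some queries, so explicit flags remain the safer option when the query type is known in advance.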
Key flags
| Flag | Purpose |
|---|---|
| -k N | Number of results (default: 5) |
| -C N | Context lines before and after |
| -l | Print only matching filenames |
| -q | Quiet; just set exit code |
| --show-score | Append relevance score |
| --path-boost N | Boost filename matches (>1.0 = higher) |
| --min-score N | Filter low-relevance results (0.0–1.0) |
See CLI reference for all flags.
Best practices
- Always use --no-color when parsing output.
- Keep -k small (3–5) to reduce output. Increase only when needed.
- Check exit codes instead of parsing stdout when possible.
- Let the cache persist: avoid --no-cache unless searching throwaway content. The first run builds the index, and subsequent runs are fast.
- Search the narrowest relevant directory, not the whole filesystem.
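Several of these practices can be bundled into one wrapper. A minimal sketch, assuming `-k` takes the result count as its argument; `search_notes` and its refusal to search `/` are hypothetical illustrations, not clawgrep features.

```bash
# Hypothetical wrapper: plain output, a small result count (default 5),
# and a refusal to search the filesystem root.
search_notes() {
  local query=$1 dir=$2 k=${3:-5}
  if [ "$dir" = "/" ]; then
    echo "refusing to search /; pass a narrower directory" >&2
    return 2
  fi
  clawgrep --no-color -k "$k" "$query" "$dir"
}
```

For example, `search_notes "auth flow" ./memory 3` would request three results from ./memory, and the wrapper passes clawgrep's exit code through unchanged.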
References (advanced, usually not needed)
The information above should be sufficient for normal use. Only load these if
you run into problems or need flags not listed above:
- CLI reference: all flags, config file format, grep compatibility
- Examples: more input/output examples for edge cases
clawgrep
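Semantic and keyword file search with grep-compatible output. It runs fully locally: on first run, it automatically downloads a small ONNX embedding model (~30 MB) from Hugging Face and caches it in the local cache directory. After that, all searches are offline.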
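Check availability
```bash
clawgrep --version
```
If not found, install from the open-source repository using any of these methods (only one needed):
```bash
cargo install clawgrep    # Rust (recommended)
npm install -g clawgrep   # Node.js
pip install clawgrep      # Python
```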
Basic usage
```bash
clawgrep --no-color "query" <path>
```
Always pass --no-color when parsing output programmatically.
Search a workspace
```bash
clawgrep --no-color "previous discussion about the auth flow" ./memory
```
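Output format
Grep-compatible, one result per line, ranked by relevance (best first):
```
$ clawgrep --no-color "previous discussion about the auth flow" ./memory
memory/2025-06-12-auth-design.md:8:Decided to use OAuth2 with PKCE for all client auth.
memory/2025-06-12-auth-design.md:14:Token refresh should be transparent to the user.
memory/2025-06-10-planning.md:3:Auth flow is the top priority for this sprint.
memory/archive/2025-05-session-notes.md:42:Discussed moving auth to a separate service.
memory/archive/2025-05-session-notes.md:87:Need to revisit the token expiry policy.
```
Each line is file:line:text. Context lines (from -C) use - as the separator instead: file-line-text.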
Exit codes
| Code | Meaning |
|---|---|
| 0 | Match found |
| 1 | No match |
| 2 | Error |
Same as grep. Use -q for existence checks without output.
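Choosing search mode
Default weights: 70% semantic, 30% keyword.
Concept search (don't know exact wording):
```bash
clawgrep --no-color "decisions about migration strategy" ./memory
```
Exact identifier search (note IDs, tags, serial numbers):
```bash
clawgrep --no-color --keyword-weight 0.8 --semantic-weight 0.2 PROJ-1042 ./memory
```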
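Key flags
| Flag | Purpose |
|---|---|
| -k N | Number of results (default: 5) |
| -C N | Context lines (N before and after) |
| -l | Print only matching filenames |
| -q | Quiet; just set exit code |
| --show-score | Append relevance score |
| --path-boost N | Boost filename matches (>1.0 = higher) |
| --min-score N | Filter low-relevance results (0.0–1.0) |
See CLI reference for all flags.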
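Best practices
- Always use --no-color when parsing output.
- Keep -k small (3–5) to reduce output. Increase only when needed.
- Check exit codes instead of parsing stdout when possible.
- Let the cache persist: avoid --no-cache unless searching throwaway content. The first run builds the index, and subsequent runs are fast.
- Search the narrowest relevant directory, not the whole filesystem.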
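References (advanced, usually not needed)
The information above should be sufficient for normal use. Only load these if you run into problems or need flags not listed above:
- CLI reference: all flags, config file format, grep compatibility
- Examples: more input/output examples for edge cases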