search-intelligence-skill
Use search-intelligence-skill to give any AI agent the ability to search the entire internet like an expert OSINT analyst, SEO engineer, and security researcher combined. All searches flow through your SearXNG instance — zero API keys, full privacy, 90+ engines.
The skill generates optimized dork queries, selects intelligent multi-step search strategies, translates operators across engines, routes queries to the best SearXNG engines, scores results by multi-signal relevance, and learns from results to refine searches automatically.
Setup (once)
Install the package
CODEBLOCK0
Start a SearXNG instance (if you don't have one)
CODEBLOCK1
Enable JSON API in SearXNG settings
CODEBLOCK2
Initialize in code
CODEBLOCK3
Common Commands
Natural language search (the main interface)
CODEBLOCK4
Control search depth
CODEBLOCK5
Security scanning — exposed files and panels
CODEBLOCK6
Security scanning — vulnerability research
CODEBLOCK7
OSINT investigation — people
CODEBLOCK8
OSINT investigation — domains and companies
CODEBLOCK9
SEO analysis
CODEBLOCK10
Academic research
CODEBLOCK11
Code and developer search
CODEBLOCK12
File hunting
CODEBLOCK13
News search
CODEBLOCK14
Image and video search
CODEBLOCK15
Social media search
CODEBLOCK16
Direct dork query (no intent parsing)
CODEBLOCK17
Preview queries without executing them
CODEBLOCK18
Build a custom dork from parameters
CODEBLOCK19
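As a rough illustration of what building a dork from parameters involves, the sketch below joins optional operator values into one query string. The helper name `compose_dork` and its parameters are hypothetical and are not the signature of `skill.build_dork`; only the widely used Google-style operators `site:`, `filetype:`, `intitle:`, and `inurl:` are assumed.

```python
def _quote(value):
    """Wrap a value in double quotes when it contains whitespace."""
    return f'"{value}"' if any(ch.isspace() for ch in value) else value


def compose_dork(keyword="", site=None, filetype=None, intitle=None,
                 inurl=None, exact=None, exclude=()):
    """Assemble a dork string from optional operator values (hypothetical helper)."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        # Accept both "env" and ".env"
        parts.append(f"filetype:{filetype.lstrip('.')}")
    if intitle:
        parts.append(f"intitle:{_quote(intitle)}")
    if inurl:
        parts.append(f"inurl:{_quote(inurl)}")
    if exact:
        parts.append(f'"{exact}"')
    if keyword:
        parts.append(keyword)
    # Excluded terms are prefixed with a minus sign
    parts.extend(f"-{_quote(term)}" for term in exclude)
    return " ".join(parts)


print(compose_dork("password", site="example.com", filetype=".env"))
# site:example.com filetype:env password
print(compose_dork(site="example.com", intitle="index of", exclude=["demo"]))
# site:example.com intitle:"index of" -demo
```

Keeping the operators in a fixed order makes the generated queries easy to compare and deduplicate before they are sent to an engine.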
Execute a named strategy against a target
CODEBLOCK20
Batch search — multiple queries at once
CODEBLOCK21
Override engine and category selection
CODEBLOCK22
Working with the SearchReport object
CODEBLOCK23
Working with individual SearchResult objects
CODEBLOCK24
AI Agent Integration
Basic tool handler
CODEBLOCK25
With depth control from agent
CODEBLOCK26
Returning structured data to agent
CODEBLOCK27
OpenAI function calling / tool definition
CODEBLOCK28
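For orientation, a tool definition in the OpenAI function-calling format might look like the sketch below. The tool name `web_search`, the descriptions, and the `handle_tool_call` dispatcher are assumptions made for illustration; the depth values are the four named elsewhere in this document. The search backend is passed in as a plain callable so the example runs without a SearXNG instance.

```python
import json

DEPTHS = ("quick", "standard", "deep", "exhaustive")

# Hypothetical tool schema in the OpenAI "function" tool format
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web through a SearXNG instance.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Natural language query"},
                "depth": {"type": "string", "enum": list(DEPTHS)},
            },
            "required": ["query"],
        },
    },
}


def handle_tool_call(arguments_json, search_fn):
    """Decode the model's JSON arguments and call search_fn(query, depth)."""
    args = json.loads(arguments_json)
    depth = args.get("depth", "standard")
    if depth not in DEPTHS:
        depth = "standard"  # fall back rather than fail on an unknown depth
    return search_fn(args["query"], depth)


# Stub backend used here instead of a real search
print(handle_tool_call('{"query": "what is CORS", "depth": "quick"}',
                       lambda q, d: f"[{d}] {q}"))
# [quick] what is CORS
```

In real use the callable would presumably wrap the skill, for example `lambda q, d: skill.search(q, depth=d).to_context()`.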
LangChain tool wrapper
CODEBLOCK29
Context manager for clean resource handling
CODEBLOCK30
Using Individual Components Directly
IntentParser — analyze queries without searching
CODEBLOCK31
DorkGenerator — generate queries without searching
CODEBLOCK32
ResultAnalyzer — score and analyze results
CODEBLOCK33
SearXNGClient — direct API access
CODEBLOCK34
Quick Reference
Search Depths
CODEBLOCK35
Intent Categories (auto-detected)
CODEBLOCK36
Search Strategies (auto-selected by depth + intent)
CODEBLOCK37
Supported SearXNG Engines (90+)
CODEBLOCK38
Dork Operators (auto-translated across engines)
CODEBLOCK39
Dork Template Library
Security dorks available (by subcategory)
CODEBLOCK40
OSINT dorks available (by subcategory)
CODEBLOCK41
SEO dorks available (by subcategory)
CODEBLOCK42
Academic dorks available (by subcategory)
CODEBLOCK43
Code dorks available (by subcategory)
CODEBLOCK44
Advanced Usage
Cross-engine dork translation
CODEBLOCK45
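The idea behind cross-engine translation can be sketched with a small lookup table that rewrites operator names per engine. The table below is illustrative only and is not the skill's actual mapping (which lives in `config.py`); the Yandex entries reflect its `mime:` and `title:` operators, and should be checked against each engine's documentation before use.

```python
import re

# Illustrative operator names only. Engines missing from the table
# keep the Google-style operator unchanged.
OPERATOR_MAP = {
    "google": {},
    "yandex": {"filetype": "mime", "intitle": "title"},
}

# A lowercase word followed by a colon and a non-space character
OPERATOR_RE = re.compile(r"\b([a-z]+):(?=\S)")


def translate_dork(dork, engine):
    """Rewrite operator names in a Google-style dork for the given engine."""
    mapping = OPERATOR_MAP.get(engine, {})
    return OPERATOR_RE.sub(
        lambda m: mapping.get(m.group(1), m.group(1)) + ":", dork
    )


dork = 'site:example.com filetype:pdf intitle:"annual report"'
print(translate_dork(dork, "yandex"))
# site:example.com mime:pdf title:"annual report"
print(translate_dork(dork, "google"))
# site:example.com filetype:pdf intitle:"annual report"
```

This simple version does not skip text inside quoted phrases, so a phrase that itself contains something like `filetype:x` would also be rewritten.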
Result scoring details
CODEBLOCK46
Auto-refinement behavior
CODEBLOCK47
Entity extraction capabilities
CODEBLOCK48
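A minimal sketch of entity extraction, assuming plain regular expressions are enough for a first pass. The patterns and the `extract_entities` helper are hypothetical and deliberately simple; they are not the patterns used by the skill's `IntentParser`.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CVE_RE = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)


def extract_entities(query):
    """Return emails, CVE IDs, and domain-like tokens found in a query."""
    emails = EMAIL_RE.findall(query)
    cves = [c.upper() for c in CVE_RE.findall(query)]
    # Blank out emails first so their domain part is not reported twice
    remainder = EMAIL_RE.sub(" ", query)
    domains = DOMAIN_RE.findall(remainder)
    return {"emails": emails, "cves": cves, "domains": domains}


print(extract_entities(
    "investigate john.doe@example.com and exposed .env files on target.com, "
    "see cve-2024-3094"
))
# {'emails': ['john.doe@example.com'], 'cves': ['CVE-2024-3094'], 'domains': ['target.com']}
```

The domain pattern also matches file names such as `robots.txt`, so a real parser would need a list of known top-level domains or file extensions to tell the two apart.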
Time range detection
CODEBLOCK49
Constraint extraction
CODEBLOCK50
Pagination
CODEBLOCK51
Rate limiting and retries
CODEBLOCK52
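One way to combine a minimum interval between requests with retries is sketched below. It assumes that `rate_limit` means "seconds between requests" and that retries use exponential backoff; neither is confirmed by this document, and the `RateLimitedCaller` class is hypothetical. The clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time


class RateLimitedCaller:
    """Space calls by min_interval seconds and retry failures with backoff."""

    def __init__(self, min_interval=0.5, max_retries=2, backoff=1.0,
                 sleep=time.sleep, clock=time.monotonic):
        self.min_interval = min_interval
        self.max_retries = max_retries
        self.backoff = backoff
        self._sleep = sleep
        self._clock = clock
        self._last_call = None

    def _wait_for_slot(self):
        if self._last_call is not None:
            wait = self._last_call + self.min_interval - self._clock()
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()

    def call(self, func, *args, **kwargs):
        for attempt in range(self.max_retries + 1):
            self._wait_for_slot()
            try:
                return func(*args, **kwargs)
            except Exception:  # narrow this to transport errors in real code
                if attempt == self.max_retries:
                    raise
                # Backoff doubles on every failed attempt
                self._sleep(self.backoff * 2 ** attempt)


# Demo with a fake clock so nothing actually sleeps
now = [0.0]
recorded = []

def fake_sleep(seconds):
    recorded.append(seconds)
    now[0] += seconds

calls = []

def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return "ok"

caller = RateLimitedCaller(sleep=fake_sleep, clock=lambda: now[0])
print(caller.call(flaky), recorded)
# ok [1.0, 2.0]
```

With a 429 response, the same loop could read a `Retry-After` header instead of using a fixed backoff.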
Logging for debugging
CODEBLOCK53
API Methods
| Method | Purpose | Returns |
|---|---|---|
| INLINECODE1 | Full intelligent search pipeline | INLINECODE2 |
| INLINECODE3 | Execute raw dork query directly | `SearchReport` |
| `skill.suggest_queries(query)` | Preview dorks without executing | `list[DorkQuery]` |
| `skill.build_dork(keyword, ...)` | Build custom dork from parameters | `DorkQuery` |
| `skill.execute_strategy(name, target)` | Run named strategy against target | `SearchReport` |
| `skill.search_batch(queries, ...)` | Execute multiple searches | `list[SearchReport]` |
| `skill.health_check()` | Check SearXNG connectivity | `bool` |
| `skill.close()` | Close HTTP client | `None` |
SearchReport Properties
| Property | Type | Description |
|---|---|---|
| INLINECODE17 | INLINECODE18 | Original natural language query |
| INLINECODE19 | `SearchIntent` | Parsed intent with category, entities, keywords |
| `.strategy` | `SearchStrategy` | Strategy that was used (name, steps) |
| `.results` | `list[SearchResult]` | Scored and deduplicated results |
| `.total_found` | `int` | Total results before deduplication |
| `.suggestions` | `list[str]` | Refinement suggestions |
| `.refined_queries` | `list[str]` | Auto-refinement queries used |
| `.errors` | `list[str]` | Errors encountered during search |
| `.timing_seconds` | `float` | Total wall-clock time |
| `.engines_used` | `list[str]` | Engines that returned results |
| `.to_context(max_results)` | `str` | LLM-formatted text output |
| `.top(n)` | `list[SearchResult]` | Top N by relevance score |
SearchResult Properties
| Property | Type | Description |
|---|---|---|
| INLINECODE41 | INLINECODE42 | Result title |
| INLINECODE43 | `str` | Result URL |
| `.snippet` | `str` | Content snippet / description |
| `.engines` | `list[str]` | Which SearXNG engines returned it |
| `.score` | `float` | Raw SearXNG score |
| `.relevance` | `float` | Computed multi-signal relevance (0-10) |
| `.category` | `str` | SearXNG result category |
| `.positions` | `list[int]` | Rank positions across engines |
| `.metadata` | `dict` | Extra fields: publishedDate, thumbnail, img_src |
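To make the `.engines` and `.positions` fields concrete, the sketch below shows one plausible way that hits from several engines could be merged into a single deduplicated result. The `Hit` class, the URL normalization rule, and the sort order are assumptions for illustration; the skill's own deduplication and relevance scoring may work differently.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Hit:
    title: str
    url: str
    engines: list = field(default_factory=list)
    positions: list = field(default_factory=list)


def normalize_url(url):
    """Key for duplicate detection: ignores scheme, 'www.', and a trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    return f"{key}?{parts.query}" if parts.query else key


def merge_hits(raw):
    """raw: iterable of (engine, position, title, url) tuples."""
    merged = {}
    for engine, position, title, url in raw:
        hit = merged.setdefault(normalize_url(url), Hit(title=title, url=url))
        if engine not in hit.engines:
            hit.engines.append(engine)
        hit.positions.append(position)
    # Results confirmed by more engines first, then by best rank
    return sorted(merged.values(),
                  key=lambda h: (-len(h.engines), min(h.positions)))


raw = [
    ("google", 1, "Example", "https://www.example.com/"),
    ("bing", 3, "Example", "http://example.com"),
    ("duckduckgo", 1, "Other", "https://other.org/page"),
]
for hit in merge_hits(raw):
    print(hit.url, hit.engines, hit.positions)
# https://www.example.com/ ['google', 'bing'] [1, 3]
# https://other.org/page ['duckduckgo'] [1]
```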
Troubleshooting
SearXNG not reachable
CODEBLOCK54
CODEBLOCK55
No results returned
CODEBLOCK56
Timeout errors
CODEBLOCK57
Rate limiting (429 errors)
CODEBLOCK58
SSL errors (local development only)
CODEBLOCK59
Wrong intent detected
CODEBLOCK60
Memory usage with large result sets
CODEBLOCK61
How It All Works Together
CODEBLOCK62
Notes
Privacy
- All searches route through YOUR SearXNG instance
- Zero API keys required for any engine
- No data sent to third-party services (except through SearXNG's engine requests)
- SearXNG strips tracking parameters and anonymizes requests
Performance tips
- Reuse the `SearchSkill` instance across searches (connection pooling)
- Use `depth="quick"` for simple lookups, reserve `"deep"` / `"exhaustive"` for research
- Set `auto_refine=False` for speed-critical paths
- Use `skill.suggest_queries()` to preview before executing expensive searches
- Batch independent queries with INLINECODE65
Accuracy tips
- Include specific entities in your query (domains, emails, CVEs, names)
- Use quoted phrases for exact matching: INLINECODE66
- Specify time ranges when freshness matters: INLINECODE67
- Use `depth="deep"` or `"exhaustive"` for comprehensive coverage
- Check `report.suggestions` for refinement ideas
- Check `report.intent` to verify the skill understood your query correctly
Extending the skill
- Add new dork templates in `config.py` → INLINECODE73
- Add new intent signals in `config.py` → INLINECODE75
- Add new engines in `config.py` → INLINECODE77
- Add new operator translations in `config.py` → INLINECODE79
- Add new strategies in `config.py` → INLINECODE81
- Add new subcategory detection in `intent.py` → INLINECODE83
Confirm before sensitive operations
- Security scanning dorks may trigger alerts on target domains
- OSINT queries may involve personal information — use responsibly
- Always validate that the target domain/entity is authorized for testing
- This tool is for legitimate research, authorized security testing, and SEO analysis
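The authorization check in the list above could be enforced in code with a simple allowlist gate, sketched below. The names `AUTHORIZED_TARGETS`, `is_authorized`, and `guarded_scan` are hypothetical and not part of the skill; the scan itself is passed in as a callable.

```python
# Hypothetical allowlist of domains you are permitted to test
AUTHORIZED_TARGETS = frozenset({"example.com"})


def is_authorized(target, allowlist=AUTHORIZED_TARGETS):
    """True if target is an allowlisted domain or one of its subdomains."""
    host = target.strip().lower().rstrip(".")
    return any(host == allowed or host.endswith("." + allowed)
               for allowed in allowlist)


def guarded_scan(target, run_scan, allowlist=AUTHORIZED_TARGETS):
    """Run run_scan(target) only when the target is on the allowlist."""
    if not is_authorized(target, allowlist):
        raise PermissionError(f"{target} is not on the authorized target list")
    return run_scan(target)


print(is_authorized("api.example.com"))   # True
print(is_authorized("notexample.com"))    # False
print(guarded_scan("example.com", lambda t: f"scanning {t}"))
# scanning example.com
```

Matching on a leading dot keeps look-alike domains such as `notexample.com` from passing as subdomains of an allowlisted target.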
search-intelligence-skill
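Use search-intelligence-skill to give any AI agent the ability to search the entire internet like a professional OSINT analyst, SEO engineer, and security researcher. All searches go through your SearXNG instance: no API keys required, full privacy, and 90+ search engines supported.
The skill generates optimized dork queries, selects intelligent multi-step search strategies, translates operators across search engines, routes queries to the best SearXNG engines, scores results by multi-signal relevance, and learns from results to refine searches automatically.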
Setup (once)
Install the package
```bash
# Install from source (recommended)
git clone https://github.com/mouaad-ops/search-intelligence-skill.git
cd search-intelligence-skill
pip install -e .

# Or install directly with pip
pip install search-intelligence-skill  # not yet available
```
Start a SearXNG instance (if you don't have one)
```bash
# Docker (fastest)
docker run -d \
  --name searxng \
  -p 8888:8080 \
  -e SEARXNG_SECRET=your-secret-key \
  searxng/searxng:latest

# Verify that it is running
curl http://localhost:8888/healthz
```
Enable JSON API in SearXNG settings
```yaml
# In searxng/settings.yml: make sure the search formats include json
search:
  formats:
    - html
    - json
```
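Once the `json` format is enabled, the instance can be queried over plain HTTP. The sketch below only builds the request URL for SearXNG's `/search` endpoint using its `q`, `format`, `engines`, `categories`, and `pageno` parameters; the helper name `build_search_url` is hypothetical, and the base URL assumes the Docker port mapping shown in this document.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, engines=None, categories=None, pageno=1):
    """Build a SearXNG /search URL that asks for JSON output."""
    params = {"q": query, "format": "json", "pageno": pageno}
    if engines:
        params["engines"] = ",".join(engines)
    if categories:
        params["categories"] = ",".join(categories)
    return f"{base_url.rstrip('/')}/search?{urlencode(params)}"


url = build_search_url(
    "http://localhost:8888",
    "site:example.com filetype:env",
    engines=["google", "bing"],
)
print(url)
# http://localhost:8888/search?q=site%3Aexample.com+filetype%3Aenv&format=json&pageno=1&engines=google%2Cbing
```

Fetching that URL (for example with `urllib.request.urlopen`) should return a JSON document; if the instance answers with HTTP 403 instead, the `json` format is most likely still disabled in `settings.yml`.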
Initialize in code
```python
from search_intelligence_skill import SearchSkill

# Default: localhost:8888
skill = SearchSkill()

# Custom instance
skill = SearchSkill(
    searxng_url="http://localhost:8888",
    timeout=30.0,
    max_retries=2,
    rate_limit=0.5,
    verify_ssl=True,
    auto_refine=True,
    max_refine_rounds=1,
)

# Verify the connection
if skill.health_check():
    print("✓ SearXNG is reachable")
else:
    print("✗ Cannot reach SearXNG; check the URL and port")
```
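Common Commands
Natural language search (the main interface)
```python
from search_intelligence_skill import SearchSkill

skill = SearchSkill(searxng_url="http://localhost:8888")

# Just describe what you want; the skill handles the rest:
# intent detection, dork generation, engine selection, scoring
report = skill.search("find exposed .env files on example.com")

# Print LLM-friendly formatted output
print(report.to_context())

# Access structured results
for r in report.top(5):
    print(f"[{r.relevance:.1f}] {r.title}")
    print(f"  {r.url}")
    print(f"  {r.snippet[:200]}")
```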
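Control search depth
```python
from search_intelligence_skill import Depth

# Quick: 1-2 queries, single step, fast lookups
report = skill.search("what is CORS", depth="quick")

# Standard: 3-6 queries, multi-engine, a good default
report = skill.search("python async frameworks comparison", depth="standard")

# Deep: 6-12 queries, multi-step strategy, thorough research
report = skill.search("security audit of target.com", depth="deep")

# Exhaustive: 12+ queries, full OSINT chain, comprehensive sweep
report = skill.search("full recon on suspect-domain.com", depth="exhaustive")
```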
Security scanning: exposed files and panels
```python
report = skill.search(
    "find exposed .env files, admin panels, and directory listings on example.com",
    depth="deep",
)

print(f"Intent: {report.intent.category.value}/{report.intent.subcategory}")
# → Intent: security/exposed_files

print(f"Strategy: {report.strategy.name}")
# → Strategy: multi_angle

print(f"Results: {len(report.results)}")
for r in report.top(10):
    print(f"  [{r.relevance:.1f}] {r.title} — {r.url}")
```
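Security scanning: vulnerability research
```python
# CVE research
report = skill.search("CVE-2024-3094 xz backdoor exploit details", depth="deep")

# Technology-specific vulnerabilities
report = skill.search(
    "Apache Struts remote code execution vulnerability 2024",
    depth="standard",
)

# Exposed API endpoints
report = skill.search(
    "find exposed Swagger API documentation on target.com",
    depth="deep",
)

# Git repository exposure
report = skill.search(
    "find exposed .git directories on example.com",
    depth="deep",
)
```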
OSINT investigation: people
```python
# By name
report = skill.search(
    "OSINT investigation on John Doe: social media, emails, profiles",
    depth="deep",
)

# By email
report = skill.search(
    "investigate john.doe@example.com: find all accounts and mentions",
    depth="exhaustive",
)

# By username
report = skill.search(
    "find all accounts for username @johndoe42",
    depth="deep",
)

# By phone number
report = skill.search(
    "look up phone number +1-555-123-4567",
    depth="standard",
)
```
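OSINT investigation: domains and companies
```python
# Domain recon
report = skill.search(
    "full domain recon on target.com: subdomains, DNS, certificates, tech stack",
    depth="exhaustive",
)

# Company investigation
report = skill.search(
    "investigate company Acme Corp: employees, filings, data breaches",
    depth="deep",
)

# IP address lookup
report = skill.search(
    "investigate IP 192.168.1.1: open ports, services, abuse reports",
    depth="standard",
)
```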
SEO analysis
```python
# Site indexing check
report = skill.search(
    "SEO indexing analysis for example.com",
    depth="standard",
)

# Backlink research
report = skill.search(
    "find backlinks pointing to example.com",
    depth="deep",
)

# Competitor analysis
report = skill.search(
    "SEO competitor analysis for example.com: related sites, ranking keywords",
    depth="deep",
)

# Technical SEO audit
report = skill.search(
    "technical SEO check for example.com: sitemap, robots.txt, canonical, hreflang",
    depth="deep",
)
```
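Academic research
```python
# Find papers
report = skill.search(
    "recent research papers on transformer architecture scaling laws 2024",
    depth="standard",
)

# Find datasets
report = skill.search(
    "download sentiment analysis benchmark dataset CSV",
    depth="standard",
)

# Find an author and their work
report = skill.search(
    "publications by author Yann LeCun on deep learning",
    depth="deep",
)
```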
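Code and developer search
```python
# Find repositories
report = skill.search(
    "Python library for PDF text extraction with OCR support",
    depth="standard",
)

# Find packages
report = skill.search(
    "npm package for real-time WebSocket pub/sub",
    depth="standard",
)

# Debug an error
report = skill.search(
    "RuntimeError: CUDA out of memory pytorch solution",
    depth="standard",
)

# Find documentation
report = skill.search(
    "FastAPI dependency injection documentation example",
    depth="quick",
)
```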
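File hunting
```python
# Find specific file types
report = skill.search(
    "machine learning cheat sheet filetype:pdf",
    depth="standard",
)

# Find datasets
report = skill.search(
    "US census data 2023 download CSV",
    depth="standard",
)

# Find configuration files
report = skill.search(
    "docker-compose example for microservices filetype:yaml",
    depth="standard",
)
```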
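News search
```python
# Recent news
report = skill.search(
    "latest news about AI regulation this week",
    depth="standard",
)

# Breaking news
report = skill.search(
    "breaking cybersecurity news today",
    depth="quick",
)

# News analysis
report = skill.search(
    "analysis of the EU AI Act's impact on startups",
    depth="standard",
)
```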
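Image and video search
```python
# Images
report = skill.search(
    "high-resolution photos of the Mars surface from NASA",
    depth="standard",
)

# Videos
report = skill.search(
    "video tutorial on Kubernetes deployment strategies",
    depth="standard",
)
```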
Social media search
```python
# Reddit discussions
report = skill.search(
    "reddit discussion about the best self-hosted Google Photos alternatives",
    depth="standard",
)
```
Forum posts
report = skill.search(
forum