Email Finder
Discover email addresses associated with a domain using multiple methods.
How It Works
1. Website Scraping: Fetches the homepage, /contact, /about, and /team pages and extracts emails via regex
2. Search Dorking: Searches for published emails in directories and search engines
3. Pattern Guessing: If a name is provided, generates common patterns (first@, first.last@, flast@, etc.)
4. DNS Hints: Checks MX/SPF/DMARC records to identify the email provider
5. SMTP Verification: Verifies found and guessed emails using RCPT TO, up to the per-run check limit
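The scraping step can be pictured roughly as follows. This is a minimal sketch, not the script's actual code: the regex, the `extract_emails` name, and the choice to keep only addresses on the target domain are all assumptions made for illustration.

```python
import re

# Deliberately simple pattern (an assumption); real address syntax is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html: str, domain: str) -> list[str]:
    """Return unique, lowercased addresses on `domain` or its subdomains."""
    found = set()
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        host = email.split("@", 1)[1]
        # Keep only addresses that belong to the domain being searched.
        if host == domain or host.endswith("." + domain):
            found.add(email)
    return sorted(found)


page = '<a href="mailto:Info@Example.com">Contact</a> or sales@example.com, not bob@other.org'
print(extract_emails(page, "example.com"))
# → ['info@example.com', 'sales@example.com']
```

A pattern like this only sees addresses that appear as plain text in the page source, which is why obfuscated or JS-rendered emails are missed.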
Dependencies
```bash
pip3 install dnspython
```
Usage
Basic domain search
```bash
python3 scripts/find_emails.py example.com
```
With name for pattern guessing
```bash
python3 scripts/find_emails.py example.com --name "John Smith"
```
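To give a sense of what pattern guessing produces, here is a hedged sketch. Only `first@`, `first.last@`, and `flast@` are named in this document; the remaining patterns and the `guess_patterns` function are illustrative assumptions, and the script's real list may differ.

```python
def guess_patterns(full_name: str, domain: str) -> list[str]:
    """Generate candidate addresses for a person; order is most-documented first."""
    parts = full_name.lower().split()
    if not parts:
        return []
    if len(parts) == 1:
        return [f"{parts[0]}@{domain}"]
    first, last = parts[0], parts[-1]
    local_parts = [
        first,                 # first@      (named in this document)
        f"{first}.{last}",     # first.last@ (named in this document)
        f"{first[0]}{last}",   # flast@      (named in this document)
        f"{first}{last}",      # firstlast@  (assumed)
        f"{first[0]}.{last}",  # f.last@     (assumed)
        last,                  # last@       (assumed)
    ]
    # dict.fromkeys drops duplicates while preserving order.
    return [f"{lp}@{domain}" for lp in dict.fromkeys(local_parts)]


for candidate in guess_patterns("John Smith", "example.com"):
    print(candidate)
```

Every guessed address still needs SMTP verification before it can be trusted, since the generator has no knowledge of which convention the domain actually uses.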
Skip SMTP verification
```bash
python3 scripts/find_emails.py example.com --no-verify
```
Options
- `--name "First Last"`: Enable pattern guessing for a specific person
- `--no-verify`: Skip SMTP verification step
- `--timeout SECONDS`: Connection timeout in seconds (default: 10)
Output
JSON to stdout:
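```json
{
  "domain": "example.com",
  "provider": "Google Workspace",
  "mx": ["aspmx.l.google.com"],
  "spf": "v=spf1 include:_spf.google.com ~all",
  "dmarc": "v=DMARC1; p=reject; rua=mailto:dmarc@example.com",
  "emails_found": 2,
  "emails": [
    {
      "email": "info@example.com",
      "source": "scraped",
      "deliverable": "yes",
      "smtp_detail": "2.1.5 OK"
    },
    {
      "email": "john.smith@example.com",
      "source": "guessed",
      "deliverable": "catch-all",
      "smtp_detail": "2.1.5 OK"
    }
  ]
}
```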
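Because the report is plain JSON, a caller can filter it with a few lines. The sketch below assumes stdout contains only the JSON report; `run_finder` and `usable_emails` are hypothetical helper names, not part of the script.

```python
import json
import subprocess
import sys


def run_finder(domain: str, *extra_args: str) -> dict:
    """Run the script and parse stdout (assumes stdout holds only the JSON report)."""
    completed = subprocess.run(
        [sys.executable, "scripts/find_emails.py", domain, *extra_args],
        capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)


def usable_emails(report: dict, accepted=("yes",)) -> list[str]:
    """Return addresses whose `deliverable` value is in `accepted`."""
    return [
        entry["email"]
        for entry in report.get("emails", [])
        if entry.get("deliverable") in accepted
    ]


# Trimmed copy of the sample report, so the filter can be tried offline.
sample = json.loads("""
{"domain": "example.com", "emails": [
  {"email": "info@example.com", "source": "scraped", "deliverable": "yes"},
  {"email": "john.smith@example.com", "source": "guessed", "deliverable": "catch-all"}
]}
""")
print(usable_emails(sample))
print(usable_emails(sample, accepted=("yes", "catch-all")))
```

Treating only `yes` as usable is the conservative default here; whether to accept `catch-all` results depends on how much uncertainty the caller can tolerate.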
Source values
| Value | Meaning |
|---|---|
| `scraped` | Found on the domain's website |
| `searched` | Found via search/directory lookup |
| `guessed` | Generated from name patterns |
| `dns` | Found in DNS records (DMARC reports, etc.) |
Deliverable values
| Value | Meaning |
|---|---|
| `yes` | Server accepted the recipient |
| INLINECODE8 | Server rejected the recipient (invalid) |
| `catch-all` | Server accepts all addresses |
| `unknown` | Could not determine |
| `not_checked` | Verification was skipped |
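One plausible way to arrive at these values is to compare the RCPT TO reply for the real address with the reply for a random address that should not exist. The sketch below is an assumption about the logic, not the script's code; in particular `REJECTED_LABEL` is a placeholder, and treating every 5xx reply as a rejection is a simplification (some 5xx replies are policy blocks against the sender).

```python
import secrets
from typing import Optional

# Placeholder label for a rejected recipient; the script's actual value may differ.
REJECTED_LABEL = "rejected"


def random_probe(domain: str) -> str:
    """Build an address that almost certainly does not exist, for catch-all detection."""
    return f"{secrets.token_hex(8)}@{domain}"


def classify(rcpt_code: Optional[int], probe_code: Optional[int]) -> str:
    """Map SMTP reply codes for the real and probe addresses to a deliverable value."""
    if rcpt_code is None:
        return "unknown"          # could not connect or got no reply
    if 200 <= rcpt_code < 300:
        if probe_code is not None and 200 <= probe_code < 300:
            return "catch-all"    # server also accepted a nonsense address
        return "yes"
    if 500 <= rcpt_code < 600:
        return REJECTED_LABEL     # permanent failure (simplified)
    return "unknown"              # 4xx: temporary failure or greylisting


print(classify(250, 550))
print(classify(250, 250))
print(classify(451, None))
```

The probe is what makes `catch-all` distinguishable from `yes`: without it, an accepting server tells you nothing about whether the specific inbox exists.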
Rate Limiting
The script includes built-in rate limiting at every stage to protect your IP:
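```bash
# Defaults: 0.5s between page fetches, 2s between SMTP checks, max 15 SMTP checks
python3 scripts/find_emails.py example.com --name "John Smith"

# Conservative settings for sensitive environments
python3 scripts/find_emails.py example.com --scrape-delay 1.0 --smtp-delay 4 --max-smtp-checks 8

# Scrape only, no SMTP (no mail-server risk)
python3 scripts/find_emails.py example.com --no-verify
```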
Options
- `--scrape-delay SECONDS`: Pause between website page fetches (default: 0.5)
- `--smtp-delay SECONDS`: Pause between SMTP verification checks (default: 2.0)
- `--max-smtp-checks N`: Max SMTP verifications per run (default: 15). Remaining emails get `not_checked` status.
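The interaction between the SMTP delay and the per-run cap can be sketched like this. It is an illustration under assumptions: `verify_with_limits` and its parameters are hypothetical names, and the injected `verify` and `sleep` callables stand in for the real SMTP check and `time.sleep`.

```python
import time
from typing import Callable, Iterable


def verify_with_limits(
    emails: Iterable[str],
    verify: Callable[[str], str],
    smtp_delay: float = 2.0,
    max_checks: int = 15,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Verify at most `max_checks` emails, pausing `smtp_delay` seconds between checks."""
    results = {}
    for index, email in enumerate(emails):
        if index >= max_checks:
            results[email] = "not_checked"  # over the cap: no SMTP traffic at all
            continue
        if index > 0:
            sleep(smtp_delay)               # pause between checks, not before the first
        results[email] = verify(email)
    return results


def fake_verify(email: str) -> str:
    """Stand-in for a real RCPT TO check so the demo needs no network."""
    return "unknown"


demo = verify_with_limits(
    ["a@example.com", "b@example.com", "c@example.com"],
    fake_verify,
    max_checks=2,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(demo)
```

With the defaults, a full run of 15 checks spends about 28 seconds in pauses alone (14 gaps of 2 seconds), which is the price of keeping the request rate low.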
Why rate limiting matters
This tool hits both web servers and mail servers. Without rate limiting:
- Web scraping: Aggressive crawling gets your IP blocked by WAFs (Cloudflare, etc.) and makes you look like a bot. Respectful delays avoid this.
- SMTP verification: Mail servers flag IPs making rapid RCPT TO requests. Your IP can get blacklisted, affecting your ability to send real email.
- Residential IPs are fragile: Unlike datacenter IPs, your home/office IP is shared across all your internet activity. Getting it blacklisted affects everything.
Guidelines for agents
| Scenario | Recommended approach |
|---|---|
| Single domain lookup | Defaults are fine |
| Domain + name pattern guessing | Defaults are fine (15 SMTP checks cover all patterns) |
| Multiple domains in sequence | Add a 5-10s pause between domains. Don't run more than 20 domains/day |
| Just need the email provider | Use `--no-verify`: the provider comes from DNS records and no SMTP connections are made |
| Bulk prospecting (50+ domains) | Use a paid service (Hunter.io, Apollo) or spread across multiple days |
Key principle: The script is designed for targeted lookups, not mass scraping. If you need to process hundreds of domains, use a dedicated service with proper IP reputation management.
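For an agent working through a short list of domains, the guidelines above could be applied with a small planner like the one below. This is only a sketch: `plan_days` and `pause_seconds` are hypothetical helpers, and the 20-per-day and 5-10 second figures are taken from the table.

```python
import random


def plan_days(domains: list[str], per_day: int = 20) -> list[list[str]]:
    """Deduplicate domains and split them into per-day batches of at most `per_day`."""
    unique = list(dict.fromkeys(d.strip().lower() for d in domains if d.strip()))
    return [unique[i:i + per_day] for i in range(0, len(unique), per_day)]


def pause_seconds(low: float = 5.0, high: float = 10.0) -> float:
    """Pick a randomized pause between domains within the recommended range."""
    return random.uniform(low, high)


batches = plan_days([f"site{n}.example" for n in range(45)])
print([len(batch) for batch in batches])
# → [20, 20, 5]
```

Each batch would then be run on a separate day, sleeping for `pause_seconds()` between domains. Past roughly 50 domains, a dedicated service remains the better option.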
Limitations
- Website scraping only finds emails that are visible in the page source; it won't find obfuscated or JS-rendered addresses
- Search engines may block automated queries
- SMTP verification requires outbound access to port 25
- Catch-all domains accept every address, so individual inboxes can't be confirmed
- Be respectful: the script adds delays between requests, but don't run it in tight loops