Home Lab AI — Your Spare Machines Are a Cluster
You have machines sitting around your home lab. A mini PC in the closet. A workstation on the desk. Maybe a desktop doing light work. Together, your home lab has more compute than most cloud instances — you just need software that treats them as one home lab system. Works on macOS, Linux, and Windows.
Ollama Herd turns your home lab into a local AI cluster. One home lab endpoint, zero config, four model types.
What your home lab gets
CODEBLOCK0
- - Home lab LLM inference — Llama, Qwen, DeepSeek, Phi, Mistral, Gemma
- Home lab image generation — Stable Diffusion 3, Flux, z-image-turbo
- Home lab speech-to-text — Qwen3-ASR transcription
- Home lab embeddings — nomic-embed-text, mxbai-embed for RAG
All routed to the best available home lab device automatically.
Home Lab Setup (5 minutes)
On every home lab machine:
CODEBLOCK1
Pick one home lab machine as the router:
CODEBLOCK2
On every other home lab machine:
CODEBLOCK3
That's it. Home lab devices discover each other automatically on your local network. No IP addresses, no config files, no Docker, no Kubernetes.
Optional: add home lab image generation
CODEBLOCK4
Use Your Home Lab
Home lab LLM chat
CODEBLOCK5
Home lab image generation
CODEBLOCK6
Home lab transcription
CODEBLOCK7
Home lab knowledge base
CODEBLOCK8
How the Home Lab Routes Requests
The home lab router scores each device on 7 signals and picks the best one:
| Home Lab Signal | What it measures |
|---|
| Thermal state | Is the home lab model already loaded (hot) or needs cold-loading? |
| Memory fit |
Does the home lab device have enough RAM for this model? |
| Queue depth | Is the home lab device already busy with other requests? |
| Wait time | How long has the home lab request been waiting? |
| Role affinity | Big models prefer big home lab machines, small models prefer small ones |
| Availability trend | Is this home lab device reliably available at this time of day? |
| Context fit | Does the loaded context window fit the home lab request? |
You don't manage any of this. The home lab router handles it.
The Home Lab Dashboard
Open http://localhost:11435/dashboard in your browser — your home lab command center:
- - Home Lab Fleet Overview — see every device, loaded models, queue depths, health
- Trends — home lab requests per hour, latency, token throughput over 24h-7d
- Health — 15 automated home lab checks with recommendations
- Recommendations — optimal home lab model mix per device based on your hardware
Recommended Home Lab Models by Device
Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.
| Home Lab Device | RAM | Start with |
|---|
| MacBook Air (8GB) | 8GB | INLINECODE1 , INLINECODE2 |
| Mac Mini (16GB) |
16GB |
phi4,
gemma3:4b,
nomic-embed-text |
| Mac Mini (32GB) | 32GB |
qwen3:14b,
deepseek-r1:14b |
| MacBook Pro (64GB) | 64GB |
qwen3:32b,
codestral,
z-image-turbo |
| Mac Studio (128GB) | 128GB |
llama3.3:70b,
qwen3:72b |
| Mac Studio (256GB) | 256GB |
gpt-oss:120b,
sd3.5-large |
The home lab router's model recommender suggests the optimal mix: GET /dashboard/api/recommendations.
Works with Every Home Lab Tool
The home lab fleet exposes an OpenAI-compatible API. Any tool that works with OpenAI works with your home lab:
| Tool | Home Lab Connection |
|---|
| Open WebUI | Set Ollama URL to INLINECODE16 |
| Aider |
aider --openai-api-base http://homelab-router:11435/v1 |
|
Continue.dev | Base URL:
http://homelab-router:11435/v1 |
|
LangChain |
ChatOpenAI(base_url="http://homelab-router:11435/v1") |
|
CrewAI | Set
OPENAI_API_BASE=http://homelab-router:11435/v1 |
|
Any OpenAI SDK | Base URL:
http://homelab-router:11435/v1, API key: any string |
Full documentation
Contribute
Ollama Herd is open source (MIT) and built by home lab enthusiasts for home lab enthusiasts:
- - Star on GitHub — help other home lab builders find us
- Open an issue — share your home lab setup, report bugs
- PRs welcome — from humans and AI agents.
CLAUDE.md gives full context. - Built by twin brothers in Alaska who run their own home lab fleet.
Home Lab Guardrails
- - No automatic downloads — home lab model pulls require explicit user confirmation. Some models are 70GB+.
- Home lab model deletion requires explicit user confirmation.
- All home lab requests stay local — no data leaves your home network.
- Never delete or modify files in
~/.fleet-manager/ (home lab routing data and logs). - No cloud dependencies — your home lab works offline after initial model downloads.
家庭实验室AI——你的闲置机器就是集群
你的家庭实验室里散落着各种机器。壁橱里的迷你PC。桌上的工作站。或许还有一台做轻量工作的台式机。综合来看,你的家庭实验室拥有比大多数云实例更强的算力——你只需要一款能将它们视为一个家庭实验室系统的软件。支持macOS、Linux和Windows。
Ollama Herd将你的家庭实验室转变为一个本地AI集群。一个家庭实验室端点,零配置,四种模型类型。
你的家庭实验室将获得
设备1 (32GB) ─┐
设备2 (64GB) ├──→ 家庭实验室路由器 (:11435) ←── 你的应用/智能体
设备3 (256GB) ─┘
- - 家庭实验室LLM推理 — Llama、Qwen、DeepSeek、Phi、Mistral、Gemma
- 家庭实验室图像生成 — Stable Diffusion 3、Flux、z-image-turbo
- 家庭实验室语音转文字 — Qwen3-ASR转录
- 家庭实验室嵌入 — nomic-embed-text、mxbai-embed(用于RAG)
所有请求自动路由到最佳可用的家庭实验室设备。
家庭实验室设置(5分钟)
在每台家庭实验室机器上:
bash
pip install ollama-herd # 家庭实验室AI路由器
选择一台家庭实验室机器作为路由器:
bash
herd # 启动家庭实验室路由器
在其他每台家庭实验室机器上:
bash
herd-node # 自动加入家庭实验室集群
就这样。家庭实验室设备会在你的本地网络上自动发现彼此。无需IP地址、无需配置文件、无需Docker、无需Kubernetes。
可选:添加家庭实验室图像生成功能
bash
uv tool install mflux # Flux模型(家庭实验室最快)
uv tool install diffusionkit # Stable Diffusion 3/3.5
使用你的家庭实验室
家庭实验室LLM聊天
python
from openai import OpenAI
家庭实验室推理客户端
homelab
client = OpenAI(baseurl=http://localhost:11435/v1, api_key=not-needed)
homelab
response = homelabclient.chat.completions.create(
model=llama3.3:70b,
messages=[{role: user, content: 如何搭建家庭实验室NAS?}],
stream=True,
)
for chunk in homelab_response:
print(chunk.choices[0].delta.content or , end=)
家庭实验室图像生成
bash
curl -o homelab_output.png http://localhost:11435/api/generate-image \
-H Content-Type: application/json \
-d {model: z-image-turbo, prompt: 一个温馨的家庭实验室,配有服务器和RGB灯光, width: 1024, height: 1024}
家庭实验室转录
bash
curl http://localhost:11435/api/transcribe -F file=@homelab_standup.wav -F model=qwen3-asr
家庭实验室知识库
bash
curl http://localhost:11435/api/embed \
-d {model: nomic-embed-text, input: 家庭实验室网络与AI推理最佳实践}
家庭实验室如何路由请求
家庭实验室路由器根据7个信号对每台设备进行评分,并选择最佳设备:
| 家庭实验室信号 | 测量内容 |
|---|
| 热状态 | 家庭实验室模型是否已加载(热)或需要冷加载? |
| 内存适配 |
家庭实验室设备是否有足够RAM运行此模型? |
| 队列深度 | 家庭实验室设备是否正忙于处理其他请求? |
| 等待时间 | 家庭实验室请求已等待多久? |
| 角色亲和性 | 大模型偏好大机器,小模型偏好小机器 |
| 可用性趋势 | 此家庭实验室设备在一天中的这个时间是否可靠可用? |
| 上下文适配 | 已加载的上下文窗口是否适合家庭实验室请求? |
你无需管理任何这些。家庭实验室路由器会自动处理。
家庭实验室仪表板
在浏览器中打开http://localhost:11435/dashboard——你的家庭实验室指挥中心:
- - 家庭实验室集群概览 — 查看每台设备、已加载模型、队列深度、健康状况
- 趋势 — 家庭实验室每小时请求数、延迟、24小时至7天的令牌吞吐量
- 健康 — 15项自动化家庭实验室检查及建议
- 建议 — 基于你的硬件为每台设备推荐最佳家庭实验室模型组合
按设备推荐的家庭实验室模型
跨平台: 这些是示例配置。任何具有等效RAM的设备(Mac、Linux、Windows)均可使用。集群路由器在所有平台上运行。
| 家庭实验室设备 | RAM | 起始模型 |
|---|
| MacBook Air (8GB) | 8GB | phi4-mini、gemma3:1b |
| Mac Mini (16GB) |
16GB | phi4、gemma3:4b、nomic-embed-text |
| Mac Mini (32GB) | 32GB | qwen3:14b、deepseek-r1:14b |
| MacBook Pro (64GB) | 64GB | qwen3:32b、codestral、z-image-turbo |
| Mac Studio (128GB) | 128GB | llama3.3:70b、qwen3:72b |
| Mac Studio (256GB) | 256GB | gpt-oss:120b、sd3.5-large |
家庭实验室路由器的模型推荐器会建议最佳组合:GET /dashboard/api/recommendations。
与所有家庭实验室工具兼容
家庭实验室集群暴露了兼容OpenAI的API。任何与OpenAI兼容的工具都能与你的家庭实验室配合使用:
| 工具 | 家庭实验室连接方式 |
|---|
| Open WebUI | 将Ollama URL设置为http://homelab-router:11435 |
| Aider |
aider --openai-api-base http://homelab-router:11435/v1 |
|
Continue.dev | 基础URL:http://homelab-router:11435/v1 |
|
LangChain | ChatOpenAI(base_url=http://homelab-router:11435/v1) |
|
CrewAI | 设置OPENAI
APIBASE=http://homelab-router:11435/v1 |
|
任何OpenAI SDK | 基础URL:http://homelab-router:11435/v1,API密钥:任意字符串 |
完整文档
贡献
Ollama Herd是开源(MIT)的,由家庭实验室爱好者为家庭实验室爱好者构建:
- - 在GitHub上点星 — 帮助其他家庭实验室构建者找到我们
- 提交问题 — 分享你的家庭实验室设置、报告错误
- 欢迎PR — 来自人类和AI智能体。CLAUDE.md提供完整上下文。
- 由阿拉斯加的双胞胎兄弟构建,他们自己运营着家庭实验室集群。
家庭实验室安全护栏
- - 无自动下载 — 家庭实验室模型拉取需要明确的用户确认。某些模型超过70GB。
- 家庭实验室模型删除需要明确的用户确认。
- 所有家庭实验室请求保持本地 — 无数据离开你的家庭网络。
- 切勿删除或修改~/.fleet-manager/中的文件(家庭实验室路由数据和日志)。
- 无云依赖 — 初始模型下载后,你的家庭实验室可离线工作。