Mistral & Codestral — Mistral AI Models on Your Local Fleet

Mistral AI's open-source models run locally on your hardware. Mistral Large for frontier reasoning, Mistral-Nemo for efficiency, Codestral for code generation. The fleet router picks the best device for every Mistral request.

Supported Mistral models

Mistral Model	Parameters	Ollama name	Best for
Codestral (by Mistral)	22B	INLINECODE0	Mistral's code specialist — 80+ languages
Mistral Large

Setup Mistral locally

CODEBLOCK0

No Mistral models downloaded during installation. All Mistral model pulls are user-initiated.

Codestral code generation

Codestral is Mistral AI's dedicated coding model — trained on 80+ programming languages with fill-in-the-middle support.

CODEBLOCK1

Codestral via curl

CODEBLOCK2

Mistral Large reasoning

CODEBLOCK3

Mistral-Nemo for efficiency

CODEBLOCK4

Mistral hardware recommendations

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

Mistral Model	Min RAM	Example hardware
INLINECODE5	8GB	Any Mac — lightweight Mistral
INLINECODE6

Monitor Mistral fleet

CODEBLOCK5

Example Mistral fleet response:
CODEBLOCK6

Mistral dashboard at http://localhost:11435/dashboard.

Also available alongside Mistral

Other LLMs (same Mistral-compatible endpoint)

Llama 3.3, Qwen 3.5, DeepSeek-V3, Phi 4, Gemma 3 — route alongside Mistral models.

Image generation

CODEBLOCK7

Speech-to-text

CODEBLOCK8

Embeddings

CODEBLOCK9

Full documentation

Contribute

Ollama Herd is open source (MIT). Run Mistral locally, contribute globally:

- Star on GitHub — help Mistral users find local inference
Open an issue — share your Mistral setup
PRs welcome — CLAUDE.md gives AI agents full context. 444 tests.

Guardrails

- Mistral model downloads require explicit user confirmation — Mistral models range from 4GB to 70GB+.
Mistral model deletion requires explicit user confirmation.
Never delete or modify files in ~/.fleet-manager/.
No Mistral models downloaded automatically — all pulls are user-initiated.

Mistral & Codestral — 本地设备上的Mistral AI模型

Mistral AI的开源模型在您的本地硬件上运行。Mistral Large用于前沿推理，Mistral-Nemo用于高效运行，Codestral用于代码生成。集群路由器为每个Mistral请求选择最佳设备。

支持的Mistral模型

Mistral模型	参数规模	Ollama名称	最佳用途
Codestral (Mistral出品)	22B	codestral	Mistral的代码专家 — 支持80+种编程语言
Mistral Large

本地部署Mistral

bash
pip install ollama-herd # 安装Mistral集群路由器
herd # 启动Mistral兼容路由器
herd-node # 在每个设备上运行 — Mistral请求自动路由

安装过程中不会下载任何Mistral模型。所有Mistral模型拉取均由用户主动发起。

Codestral代码生成

Codestral是Mistral AI的专用编程模型 — 在80+种编程语言上训练，支持中间填充功能。

python
from openai import OpenAI

连接到本地Mistral集群

mistralfleet = OpenAI(baseurl=http://localhost:11435/v1, api_key=not-needed)

使用Mistral的Codestral进行代码生成

codestralresponse = mistralfleet.chat.completions.create( model=codestral, # Mistral的Codestral模型 messages=[{role: user, content: 用Go编写一个基于Redis的速率限制器}], ) print(codestral_response.choices[0].message.content)

通过curl使用Codestral

bash

在本地Mistral集群上使用Codestral进行代码生成

curl http://localhost:11435/v1/chat/completions \
-H Content-Type: application/json \
-d {model: codestral, messages: [{role: user, content: 用Rust实现B树 — Mistral Codestral擅长系统编程}]}

Mistral Large推理

bash

Mistral Large用于复杂推理

curl http://localhost:11435/api/chat -d {
model: mistral-large,
messages: [{role: user, content: 比较Mistral与GPT-4在企业部署中的表现}],
stream: false
}

Mistral-Nemo高效运行

bash

Mistral-Nemo — Mistral AI的最佳质量/规模比

curl http://localhost:11435/api/chat -d {
model: mistral-nemo,
messages: [{role: user, content: 总结这篇Mistral AI技术论文}],
stream: false
}

Mistral硬件推荐

跨平台： 以下为示例配置。任何具有等效内存的设备（Mac、Linux、Windows）均可运行。集群路由器支持所有平台。

Mistral模型	最低内存	示例硬件
mistral:7b	8GB	任意Mac — 轻量级Mistral
mistral-nemo

监控Mistral集群

bash

查看已加载的Mistral模型

curl -s http://localhost:11435/api/ps | python3 -m json.tool

Mistral集群概览

curl -s http://localhost:11435/fleet/status | python3 -m json.tool

Mistral模型性能统计

curl -s http://localhost:11435/dashboard/api/models | python3 -m json.tool

Mistral集群响应示例：
json
{
node_id: Mistral-Server,
models_loaded: [codestral:22b, mistral-nemo:12b],
mistral_inference: active
}

Mistral仪表盘地址：http://localhost:11435/dashboard。

与Mistral同时可用的其他功能

其他大语言模型（同一Mistral兼容端点）

Llama 3.3、Qwen 3.5、DeepSeek-V3、Phi 4、Gemma 3 — 可与Mistral模型一起路由。

图像生成

bash curl http://localhost:11435/api/generate-image \ -d {model: z-image-turbo, prompt: Mistral AI标志重新构想为抽象艺术, width: 512, height: 512}

语音转文字

bash curl http://localhost:11435/api/transcribe -F file=@mistral_meeting.wav -F model=qwen3-asr

嵌入向量

bash curl http://localhost:11435/api/embed \ -d {model: nomic-embed-text, input: Mistral AI开源语言模型Codestral}

完整文档

贡献

Ollama Herd是开源项目（MIT许可）。本地运行Mistral，全球贡献：

- 在GitHub上标星 — 帮助Mistral用户发现本地推理
提交Issue — 分享您的Mistral配置
欢迎提交PR — CLAUDE.md为AI代理提供完整上下文。444个测试用例。

安全护栏

- Mistral模型下载需要用户明确确认 — Mistral模型大小从4GB到70GB+不等。
Mistral模型删除需要用户明确确认。
切勿删除或修改~/.fleet-manager/中的文件。
不会自动下载任何Mistral模型 — 所有拉取均由用户主动发起。

mistral-codestralMistral本地推理