homelab-ai

Home lab AI — turn your spare machines into a local AI home lab cluster. LLM inference, image generation, speech-to-text, and embeddings across macOS, Linux, and Windows devices. Zero-config mDNS discovery, real-time dashboard, 7-signal scoring. No cloud, no Docker, no Kubernetes. The home lab AI setup that just works.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to install the skill:

Install skill "homelab-ai" with this command: npx skills add twinsgeeks/homelab-ai

Home Lab AI — Your Spare Machines Are a Cluster

You have machines sitting around your home lab: a mini PC in the closet, a workstation on the desk, maybe a desktop doing light work. Together they often have more combined memory and compute than a typical cloud instance — you just need software that treats them as one system. Works on macOS, Linux, and Windows.

Ollama Herd turns your home lab into a local AI cluster. One home lab endpoint, zero config, four model types.

What your home lab gets

Device 1 (32GB)    ─┐
Device 2 (64GB)     ├──→  Home Lab Router (:11435)  ←──  Your apps / agents
Device 3 (256GB)   ─┘
  • Home lab LLM inference — Llama, Qwen, DeepSeek, Phi, Mistral, Gemma
  • Home lab image generation — Stable Diffusion 3, Flux, z-image-turbo
  • Home lab speech-to-text — Qwen3-ASR transcription
  • Home lab embeddings — nomic-embed-text, mxbai-embed for RAG

All routed to the best available home lab device automatically.

Home Lab Setup (5 minutes)

On every home lab machine:

pip install ollama-herd    # Home lab AI router

Pick one home lab machine as the router:

herd    # starts the home lab router

On every other home lab machine:

herd-node    # joins the home lab fleet automatically

That's it. Devices discover each other automatically over mDNS on your local network. No IP addresses, no config files, no Docker, no Kubernetes.
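Once the router is up, you can sanity-check the fleet from any machine. A minimal sketch, assuming the router's OpenAI-compatible API includes the standard GET /v1/models listing (the endpoint path is an assumption — any OpenAI-compatible server exposes it, but check the docs):

```python
# Quick post-setup check: list the models the fleet currently serves.
# Assumes the router exposes the standard OpenAI-compatible GET /v1/models.
import json
import urllib.request

ROUTER = "http://localhost:11435"

def fleet_models(base_url: str = ROUTER) -> list[str]:
    """Return the sorted model IDs reported by the fleet router."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        data = json.load(resp)
    return sorted(m["id"] for m in data.get("data", []))
```

If this returns an empty list, the nodes are connected but no models have been pulled yet.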

Optional: add home lab image generation

uv tool install mflux           # Flux models (fastest for home labs)
uv tool install diffusionkit    # Stable Diffusion 3/3.5

Use Your Home Lab

Home lab LLM chat

from openai import OpenAI

# Home lab inference client
homelab_client = OpenAI(base_url="http://localhost:11435/v1", api_key="not-needed")
homelab_response = homelab_client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "How do I set up a home lab NAS?"}],
    stream=True,
)
for chunk in homelab_response:
    print(chunk.choices[0].delta.content or "", end="")

Home lab image generation

curl -o homelab_output.png http://localhost:11435/api/generate-image \
  -H "Content-Type: application/json" \
  -d '{"model": "z-image-turbo", "prompt": "a cozy home lab with servers and RGB lighting", "width": 1024, "height": 1024}'

Home lab transcription

curl http://localhost:11435/api/transcribe -F "file=@homelab_standup.wav" -F "model=qwen3-asr"

Home lab knowledge base

curl http://localhost:11435/api/embed \
  -d '{"model": "nomic-embed-text", "input": "home lab networking and AI inference best practices"}'

How the Home Lab Routes Requests

The home lab router scores each device on 7 signals and picks the best one:

Home Lab Signal      | What it measures
Thermal state        | Is the model already loaded (hot), or does it need cold-loading?
Memory fit           | Does the device have enough RAM for this model?
Queue depth          | Is the device already busy with other requests?
Wait time            | How long has the request been waiting?
Role affinity        | Big models prefer big machines; small models prefer small ones
Availability trend   | Is this device reliably available at this time of day?
Context fit          | Does the loaded context window fit the request?

You don't manage any of this. The home lab router handles it.
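To make the idea concrete, the 7-signal decision can be sketched as a weighted score with memory fit as a hard constraint. This is an illustrative model only — the weights, field names, and scoring formula here are made up, not the actual ollama-herd implementation:

```python
# Illustrative 7-signal scoring sketch. Weights and signal encodings are
# invented for illustration; they are NOT the ollama-herd internals.
from dataclasses import dataclass

@dataclass
class DeviceSnapshot:
    name: str
    model_loaded: bool    # thermal state: hot (True) vs needs cold-load
    free_ram_gb: float    # memory fit inputs
    model_ram_gb: float
    queue_depth: int      # requests already queued on this device
    wait_seconds: float   # how long this request has waited
    role_affinity: float  # 0..1: big models on big machines
    availability: float   # 0..1: historical availability at this hour
    context_fits: bool    # loaded context window covers the request

def score(d: DeviceSnapshot) -> float:
    if d.free_ram_gb < d.model_ram_gb:
        return float("-inf")              # memory fit is a hard constraint
    s = 3.0 if d.model_loaded else 0.0    # hot models skip the cold-load cost
    s -= 1.0 * d.queue_depth              # penalize busy devices
    s += 0.1 * d.wait_seconds             # starvation guard for waiting work
    s += 2.0 * d.role_affinity
    s += 1.0 * d.availability
    s += 0.5 if d.context_fits else 0.0
    return s

def pick_device(devices: list[DeviceSnapshot]) -> DeviceSnapshot:
    """Route the request to the highest-scoring device."""
    return max(devices, key=score)
```

The point of the sketch: a hot 32GB mini can beat a cold 256GB studio for a small model, because avoiding a cold load outweighs raw capacity.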

The Home Lab Dashboard

Open http://localhost:11435/dashboard in your browser — your home lab command center:

  • Home Lab Fleet Overview — see every device, loaded models, queue depths, health
  • Trends — home lab requests per hour, latency, token throughput over 24h-7d
  • Health — 15 automated home lab checks with recommendations
  • Recommendations — optimal home lab model mix per device based on your hardware

Recommended Home Lab Models by Device

Cross-platform: These are example configurations. Any device (Mac, Linux, Windows) with equivalent RAM works. The fleet router runs on all platforms.

Home Lab Device | RAM   | Start with
MacBook Air     | 8GB   | phi4-mini, gemma3:1b
Mac Mini        | 16GB  | phi4, gemma3:4b, nomic-embed-text
Mac Mini        | 32GB  | qwen3:14b, deepseek-r1:14b
MacBook Pro     | 64GB  | qwen3:32b, codestral, z-image-turbo
Mac Studio      | 128GB | llama3.3:70b, qwen3:72b
Mac Studio      | 256GB | gpt-oss:120b, sd3.5-large

The home lab router's model recommender suggests the optimal mix: GET /dashboard/api/recommendations.
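The recommendations endpoint above can be queried programmatically. A small sketch — the path comes from the line above, but the shape of the JSON response is not assumed here:

```python
# Fetch the router's per-device model recommendations. The endpoint path
# is documented above; the response shape is left as an opaque dict.
import json
import urllib.request

def recommendations_url(base_url: str = "http://localhost:11435") -> str:
    return f"{base_url}/dashboard/api/recommendations"

def fleet_recommendations(base_url: str = "http://localhost:11435") -> dict:
    with urllib.request.urlopen(recommendations_url(base_url)) as resp:
        return json.load(resp)
```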

Works with Every Home Lab Tool

The home lab fleet exposes an OpenAI-compatible API. Any tool that works with OpenAI works with your home lab:

Tool           | Home Lab Connection
Open WebUI     | Set the Ollama URL to http://homelab-router:11435
Aider          | aider --openai-api-base http://homelab-router:11435/v1
Continue.dev   | Base URL: http://homelab-router:11435/v1
LangChain      | ChatOpenAI(base_url="http://homelab-router:11435/v1")
CrewAI         | Set OPENAI_API_BASE=http://homelab-router:11435/v1
Any OpenAI SDK | Base URL: http://homelab-router:11435/v1, API key: any string

Full documentation

Contribute

Ollama Herd is open source (MIT) and built by home lab enthusiasts for home lab enthusiasts:

  • Star on GitHub — help other home lab builders find us
  • Open an issue — share your home lab setup, report bugs
  • PRs welcome — from humans and AI agents. CLAUDE.md gives full context.
  • Built by twin brothers in Alaska who run their own home lab fleet.

Home Lab Guardrails

  • No automatic downloads — home lab model pulls require explicit user confirmation. Some models are 70GB+.
  • Home lab model deletion requires explicit user confirmation.
  • All home lab requests stay local — no data leaves your home network.
  • Never delete or modify files in ~/.fleet-manager/ (home lab routing data and logs).
  • No cloud dependencies — your home lab works offline after initial model downloads.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

GPU Cluster Manager

Turn your spare GPUs into one inference endpoint. Auto-discovers machines on your network, routes requests to the best available device, learns when your mac...

Coding

Apple Silicon AI

Apple Silicon AI — run LLMs, image generation, speech-to-text, and embeddings on Mac Studio, Mac Mini, MacBook Pro, and Mac Pro. Turn your Apple Silicon devi...

General

Ollama Ollama Herd

Ollama Ollama Herd — multimodal Ollama model router that herds your Ollama LLMs into one smart Ollama endpoint. Route Ollama Llama, Qwen, DeepSeek, Phi, Mist...

General

Mac Mini AI — Mac Mini Local LLM, Image Gen, STT on Apple Silicon

Mac Mini AI — run LLMs, image generation, speech-to-text, and embeddings on your Mac Mini. M4 (16-32GB) and M4 Pro (24-64GB) configurations make the Mac Mini...
