How Do You AI, Part 1: Local AI Ecosystem
Introduction
In this multi-part series, we’ll build a fully distributed, on-prem AI homelab with Home Assistant integration, covering the software and hardware needed to run small models locally. Disclaimer: this guide is rebuilt and regenerated by AI regularly; later in the series we will build the pipeline that does this. In Part 1, we focus on establishing a two-machine ecosystem:
- Home Assistant Hub (Debian) for smart-home automation
- AI Workload Server (AlmaLinux 9) for GPU-accelerated AI workloads
This separation provides clear boundaries—your day-to-day smart-home services stay rock-solid, while GPU jobs on the beefy server don’t interfere.
Machine Overview
Machine | Role | OS | CPU | RAM | Storage | Accelerators |
---|---|---|---|---|---|---|
Home Assistant | Smart home automation & services | Debian 12 “Bookworm” | AMD Ryzen 5 4650G | 64 GB DDR4 | 128 GB NVMe SSD, 4 × 240 GB SATA SSD (RAID 5) | Coral USB Edge TPU |
AI Server | Containerized AI training & inference | AlmaLinux 9 | AMD Ryzen 9 3900X | 128 GB DDR4 | 2 × 128 GB NVMe (RAID 1), 4 × 480 GB SATA SSD (RAID 5), 4 × 960 GB SATA SSD (RAID 5) | NVIDIA Quadro P4000 |
Why two machines?
- Isolate your production smart-home stack from experimental AI work
- Optimize each OS for its specific workload and driver set
- Scale or replace one side without touching the other
Part A: Home Assistant Hub (Debian)
1. Install Debian 12
Download the Debian 12 netinst ISO from the official site.
Boot from the USB installer and choose:
- Partition scheme: GPT with / (30 GB), swap (4 GB), /home (remaining)
- Hostname: home-hub.local
- User: homeadmin
After first boot, update and install essentials:
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git vim ufw
2. Set Up Home Assistant (Supervised)
Home Assistant Supervised installs Core in Docker alongside Supervisor, enabling add-ons, snapshots, and UI-managed updates.
2.1 Prerequisites
sudo apt update && sudo apt upgrade -y
sudo apt install -y jq wget curl udisks2 libglib2.0-bin network-manager dbus software-properties-common apparmor-utils
sudo systemctl disable ModemManager --now
Ensure Docker CE is installed before running the installer; a minimal install sketch follows below.
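If Docker CE is missing, a minimal sketch of installing it with Docker's convenience script (assumes a fresh Debian 12 host):
# Install Docker CE via Docker's official convenience script, then enable the service
curl -fsSL https://get.docker.com | sudo sh
sudo systemctl enable --now docker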
2.2 Install Supervised
curl -fsSL https://raw.githubusercontent.com/home-assistant/supervised-installer/main/installer.sh \
| bash -s -- -m generic-x86-64
Monitor until Supervisor and Core are running. Access the UI at http://home-hub.local:8123
2.3 Post-Install Configuration
sudo systemctl enable hassio-supervisor.service
sudo ufw allow 8123/tcp
Use the Supervisor UI for snapshots, updates, and backups.
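To confirm the stack is healthy from the shell, the ha CLI that the Supervised setup provides can be queried; a quick sketch:
# Show Supervisor and Core status from the host shell
ha supervisor info
ha core info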
2.4 Install Frigate NVR Add-on
- Add repo: Supervisor → Add-on Store → Repositories → https://github.com/blakeblackshear/frigate-hass-addons
- Install Frigate NVR.
- Configure (Configuration tab):
mqtt:
  host: core-mosquitto
  user: mqtt_user
  password: mqtt_password
detectors:
  cpu1:
    type: cpu
cameras:
  front_door:
    ffmpeg:
      inputs:
        - path: rtsp://user:[email protected]:554/stream
          roles: [detect, record]
    width: 1280
    height: 720
    fps: 5
record:
  enabled: true
  retain_days: 7
  events:
    max_seconds: 300
    pre_capture: 5
    post_capture: 5
- Start and integrate under Configuration → Integrations → Frigate.
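Before pointing Frigate at a camera, it can help to confirm the RTSP stream is reachable from the hub; a quick sketch using ffprobe (from the ffmpeg package), with a placeholder URL you should swap for your camera's actual RTSP address:
sudo apt install -y ffmpeg
# Print stream details; a failure here usually means bad credentials, URL, or network path
ffprobe -v error -rtsp_transport tcp -show_streams "rtsp://user:password@CAMERA_IP:554/stream"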
3. Coral USB Edge TPU Integration
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt update
sudo apt install -y libedgetpu1-std
lsusb | grep -iE 'coral|unichip|google'   # the Coral enumerates as Global Unichip / Google Inc.
Test from a container (PyCoral targets older Python releases than Debian 12 ships, so the inference check is easiest inside a container):
sudo docker run --rm --device /dev/bus/usb ghcr.io/google-coral/edgetpu:latest python3 - << 'EOF'
from pycoral.utils.edgetpu import list_edge_tpus
print(list_edge_tpus())  # should report the attached Edge TPU
EOF
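Once the Coral is detected on the host (and the USB device is passed through to the Frigate add-on), Frigate can offload detection to it. A sketch of the detectors block that would replace the cpu1 detector in the configuration above, per Frigate's Edge TPU documentation:
detectors:
  coral:
    type: edgetpu
    device: usb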
4. Harden & Extend
- UFW:
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp
sudo ufw allow 8123/tcp
sudo ufw enable
- Backups: nightly rsync of /opt/homeassistant/config to NAS (a minimal sketch follows below).
- Add-ons: Mosquitto, Zigbee2MQTT, DuckDNS + Let’s Encrypt.
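A minimal nightly backup sketch, assuming the NAS is mounted at /mnt/nas/backups (a hypothetical path) and the config lives at /opt/homeassistant/config as above:
# /etc/cron.d/ha-backup: sync the Home Assistant config to the NAS at 02:30 every night
30 2 * * * root rsync -a --delete /opt/homeassistant/config/ /mnt/nas/backups/homeassistant/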
Part B: AI Workload Server (AlmaLinux 9)
1. Install AlmaLinux 9
Partition:
/ 50 GB
/var 100 GB # container data
swap 8 GB
/home rest
sudo dnf update -y
sudo dnf install -y wget vim git
2. GPU Support & Podman
sudo dnf install -y \
https://download1.rpmfusion.org/free/el/rpmfusion-free-release-9.noarch.rpm \
https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-9.noarch.rpm
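RPM Fusion's driver packages on Enterprise Linux generally expect EPEL and the CRB repository to be enabled, so it is worth turning those on first (a hedged sketch):
sudo dnf install -y epel-release
sudo dnf config-manager --set-enabled crb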
sudo dnf install -y akmod-nvidia xorg-x11-drv-nvidia-cuda podman podman-docker podman-compose
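Note that nvidia-container-toolkit ships from NVIDIA's own repository rather than RPM Fusion; a sketch of enabling that repo (URL from NVIDIA's container toolkit install guide) and generating a CDI spec so Podman can hand the GPU to containers:
# Enable NVIDIA's container toolkit repo and install the toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf install -y nvidia-container-toolkit
# Generate a CDI spec so containers can request the GPU as nvidia.com/gpu
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml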
sudo systemctl restart podman
Test GPU access:
podman run --rm --device nvidia.com/gpu=all docker.io/nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
3. Install Ollama as Podman Container
Run LLMs locally with NVIDIA GPU acceleration.
- Pull Ollama image:
podman pull docker.io/ollama/ollama:latest
- Run Ollama container:
podman run -d \
--name ollama \
--device nvidia.com/gpu=all \
-p 11434:11434 \
-v /opt/ollama:/root/.ollama \
docker.io/ollama/ollama:latest
(The image's default command already runs ollama serve on port 11434, and models are stored under /root/.ollama inside the container.)
- Verify:
curl http://localhost:11434/v1/models
- Use Ollama CLI:
podman exec -it ollama ollama list
podman exec -it ollama ollama pull llama2
podman exec -it ollama ollama run llama2
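As a quick end-to-end check once a model has been pulled, the native generate endpoint can be called directly (a sketch; assumes the llama2 model pulled above):
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "In one sentence, what is a homelab?",
  "stream": false
}'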
Proceed to VM & container management below.
4. Cockpit & Virtualization with libvirt
4.1 Install
sudo dnf install -y cockpit cockpit-machines libvirt-daemon-kvm qemu-kvm virt-install
sudo systemctl enable --now cockpit.socket libvirtd
4.2 Firewall
sudo firewall-cmd --permanent --add-service=cockpit
sudo firewall-cmd --reload
Access: https://ai-server.local:9090
4.3 VM Creation (GUI)
Cockpit → Virtual Machines → Create VM:
- Name: vm1
- ISO: Debian 12 netinst
- OS variant: generic Linux
- CPUs: 2, RAM: 4 GB, Disk: 20 GB
4.4 VM Creation (CLI)
sudo virt-install --name vm1 --ram 4096 --vcpus 2 \
--disk path=/var/lib/libvirt/images/vm1.img,size=20 \
--cdrom /path/to/debian-12-netinst.iso \
--network network=default --os-variant generic
Connect: sudo virsh console vm1
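If the VM should come back up with the host, libvirt's autostart flag can be set and the VM started right away:
sudo virsh autostart vm1
sudo virsh start vm1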
4.5 VM Setup: Debian, Docker & Portainer
# Debian install via console (LVM layout, user vmuser)
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git vim ufw
# Docker:
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo systemctl enable --now docker
# Portainer:
sudo docker volume create portainer_data
sudo docker run -d -p 9000:9000 --name portainer --restart=unless-stopped -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer-ce
# Firewall:
sudo ufw allow 22/tcp
sudo ufw allow 9000/tcp
sudo ufw allow 7860/tcp   # OpenWebUI
sudo ufw enable
# Deploy OpenWebUI:
cat << 'EOF' > openwebui-docker-compose.yml
services:
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    restart: unless-stopped
    ports:
      - 7860:8080
    volumes:
      - openwebui-data:/app/backend/data
    environment:
      # Point Open WebUI at the Ollama server from Part B; adjust if Ollama runs elsewhere
      - OLLAMA_BASE_URL=http://ai-server.local:11434
volumes:
  openwebui-data:
EOF
sudo docker compose -f openwebui-docker-compose.yml up -d
Access: http://vm1.local:7860
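A quick reachability check from another machine on the LAN (assumes vm1.local resolves; otherwise substitute the VM's IP):
curl -sI http://vm1.local:7860 | head -n 1   # expect an HTTP status line back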
Part C: Integrate Ollama with Home Assistant
Home Assistant now offers native support for integrating with a local Ollama server, enabling you to leverage local Large Language Models (LLMs) for conversational interactions and, optionally, control of Home Assistant entities.
1. Prerequisites
- Ollama Server: Ensure you have an Ollama server running on your network. Ollama is available for macOS, Linux, and Windows; follow the official installation instructions to set it up.
- Network Accessibility: Configure the Ollama server to be accessible over your network, for example at http://ai-server.local:11434.
2. Adding the Ollama Integration to Home Assistant
To integrate Ollama with Home Assistant:
- Navigate to your Home Assistant instance.
- Go to Settings > Devices & Services.
- Click the Add Integration button in the bottom right corner.
- Search for and select Ollama from the list.
- Follow the on-screen instructions to complete the setup.
3. Configuration Options
After adding the integration, you can configure the following options:
- URL: The address of your Ollama server (e.g., http://ai-server.local:11434).
- Model: The model to use, such as mistral or llama2:13b. Models are downloaded automatically during setup.
- Instructions: Custom instructions for how the AI should respond to your requests; this uses Home Assistant templating.
- Control Home Assistant: Enable this option if you want the AI to interact with your Home Assistant entities. This feature is experimental and requires exposing specific entities to the AI.
- Context Window Size: The number of tokens the model can take as input. The default is 8000 tokens; adjust based on your model’s capabilities and system resources.
- Max History Messages: The maximum number of messages to keep for each conversation. Setting this to 0 means no limit.
- Keep Alive: How long (in seconds) the Ollama host should keep the model in memory after receiving a message. The default is -1 (no limit).
4. Controlling Home Assistant with Ollama
The ability for Ollama to control Home Assistant entities is experimental and requires models that support Tools. To enable this feature:
- Ensure your selected model supports Tools.
- During the integration setup, enable the Control Home Assistant option.
- Expose the entities you want the AI to access via the exposed entities page.
Recommendations:
- Use the llama3.1:8b model for better performance.
- Limit the number of exposed entities to fewer than 25 to reduce complexity and potential errors.
- Consider setting up multiple Ollama integrations: one for general conversation without control capabilities and another with control enabled for managing Home Assistant entities.
By following these steps, you can seamlessly integrate Ollama with Home Assistant, enabling advanced conversational interactions and control over your smart home setup.
Conclusion & Next Steps
- Home Assistant Hub: Debian supervised install, Coral TPU, Frigate.
- AI Server: AlmaLinux + Podman, Ollama container, GPU & Cockpit virtualization.
- Network: services firewalled on both hosts (UFW on Debian, firewalld on AlmaLinux).
Part 2 will cover:
- Machine vision models with the Coral Edge TPU
- Text-to-speech and speech-to-text pipelines
- Agent software stack deployment
Stay tuned, and happy homelabbing!