How Do You AI, Part 1: Local AI Ecosystem

Introduction

In this multi-part series, we’ll build a fully distributed, on-prem AI homelab with Home Assistant integration, covering the software and hardware requirements to run small models locally. Disclaimer: this guide is regularly rebuilt and regenerated by AI; later in the series we will build the pipeline used to do that. In Part 1, we focus on establishing a two-machine ecosystem:

  1. Home Assistant Hub (Debian) for smart-home automation
  2. AI Workload Server (AlmaLinux 9) for GPU-accelerated AI workloads

This separation provides clear boundaries—your day-to-day smart-home services stay rock-solid, while GPU jobs on the beefy server don’t interfere.


Machine Overview

Home Assistant Hub
  • Role: Smart-home automation & services
  • OS: Debian 12 “Bookworm”
  • CPU: AMD Ryzen 5 4650G
  • RAM: 64 GB DDR4
  • Storage: 128 GB NVMe SSD; 4 × 240 GB SATA SSD (RAID 5)
  • Accelerators: Coral USB Edge TPU

AI Server
  • Role: Containerized AI training & inference
  • OS: AlmaLinux 9
  • CPU: AMD Ryzen 9 3900X
  • RAM: 128 GB DDR4
  • Storage: 2 × 128 GB NVMe (RAID 1); 4 × 480 GB SATA SSD (RAID 5); 4 × 960 GB SATA SSD (RAID 5)
  • Accelerators: NVIDIA Quadro P4000

Why two machines?

  • Isolate your production smart-home stack from experimental AI work
  • Optimize each OS for its specific workload and driver set
  • Scale or replace one side without touching the other

Part A: Home Assistant Hub (Debian)

1. Install Debian 12

  1. Download the Debian 12 netinst ISO from the official site.

  2. Boot from the USB installer and choose:

    • Partition scheme: GPT with / (30 GB), swap (4 GB), /home (remaining)
    • Hostname: home-hub (reachable as home-hub.local once mDNS is enabled below)
    • User: homeadmin
    
  3. After first boot, update and install essentials:

    sudo apt update && sudo apt upgrade -y
    sudo apt install -y curl git vim ufw
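
The .local names used throughout this guide rely on mDNS, which a minimal Debian install doesn’t provide; a small sketch to enable it (assuming no other mDNS responder is already running on the host):

    sudo apt install -y avahi-daemon
    sudo hostnamectl set-hostname home-hub   # advertised as home-hub.local on the LAN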
    

2. Set Up Home Assistant (Supervised)

Home Assistant Supervised installs Core in Docker alongside Supervisor, enabling add-ons, snapshots, and UI-managed updates.

2.1 Prerequisites

sudo apt update && sudo apt upgrade -y
sudo apt install -y jq wget curl udisks2 libglib2.0-bin network-manager dbus software-properties-common apparmor-utils
sudo systemctl disable ModemManager --now

Docker CE and the Home Assistant OS Agent must be installed before running the supervised installer; it will abort if they are missing.
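
A sketch of those prerequisites (substitute <version> with the latest release from the os-agent GitHub releases page; the filename pattern below follows that project’s convention):

curl -fsSL https://get.docker.com | sudo sh
wget https://github.com/home-assistant/os-agent/releases/download/<version>/os-agent_<version>_linux_x86_64.deb
sudo dpkg -i os-agent_<version>_linux_x86_64.deb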

2.2 Install Supervised

curl -fsSL https://raw.githubusercontent.com/home-assistant/supervised-installer/main/installer.sh \
  | bash -s -- -m generic-x86-64

Monitor until Supervisor and Core are running. Access UI at http://home-hub.local:8123.
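
One way to watch the bring-up (the container names below match a stock Supervised install):

sudo journalctl -fu hassio-supervisor.service   # follow Supervisor logs
sudo docker ps --format '{{.Names}}'            # expect hassio_supervisor, homeassistant, hassio_dns, ...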

2.3 Post-Install Configuration

sudo systemctl enable hassio-supervisor.service
sudo ufw allow 8123/tcp

Use the Supervisor UI for snapshots, updates, and backups.

2.4 Install Frigate NVR Add-on

  1. Add repo: Supervisor → Add-on Store → Repositories → https://github.com/blakeblackshear/frigate-hass-addons.
  2. Install Frigate NVR.
  3. Configure (Configuration tab):
mqtt:
  host: core-mosquitto
  user: mqtt_user
  password: mqtt_password
detectors:
  cpu1:
    type: cpu   # CPU detection to start; swap in the Coral edgetpu detector once the TPU works (section 3)
cameras:
  front_door:
    ffmpeg:
      inputs:
        - path: rtsp://user:[email protected]:554/stream
          roles: [detect, record]
    detect:
      width: 1280
      height: 720
      fps: 5
record:
  enabled: true
  retain:
    days: 7
  events:
    max_seconds: 300
    pre_capture: 5
    post_capture: 5
  4. Start and integrate under Configuration → Integrations → Frigate.
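
Once the integration is loaded, Frigate’s detections surface as binary sensors you can automate on. A minimal sketch (the entity name assumes the integration’s default <camera>_<object>_occupancy naming; adjust to what actually appears in your instance):

cat << 'EOF' >> /usr/share/hassio/homeassistant/automations.yaml
- alias: "Front door person alert"
  trigger:
    - platform: state
      entity_id: binary_sensor.front_door_person_occupancy
      to: "on"
  action:
    - service: notify.notify
      data:
        message: "Person detected at the front door"
EOF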

3. Coral USB Edge TPU Integration

# Coral's runtime and Python bindings come from Google's package repo
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /usr/share/keyrings/coral-edgetpu.gpg
echo "deb [signed-by=/usr/share/keyrings/coral-edgetpu.gpg] https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
sudo apt update
# Note: Coral's packages track older Debian/Python releases; if python3-pycoral won't
# install on Bookworm, the Frigate add-on's bundled runtime still drives the TPU.
sudo apt install -y libedgetpu1-std python3-pycoral
lsusb | grep -iE 'Google|Global Unichip'   # enumerates as Global Unichip before first use, Google after
python3 - << 'EOF'
from pycoral.utils.edgetpu import list_edge_tpus
print(list_edge_tpus())  # expect one entry for the USB accelerator
EOF

Test from a Docker container (the image below is illustrative; any image with the Edge TPU runtime and pycoral installed will do, as long as the USB bus is passed through):

sudo docker run --rm -i --device /dev/bus/usb ghcr.io/google-coral/edgetpu:latest python3 - << 'EOF'
from pycoral.utils.edgetpu import list_edge_tpus
print(list_edge_tpus())
EOF

4. Harden & Extend

  • UFW:
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp
sudo ufw allow 8123/tcp
sudo ufw enable
  • Backups: nightly rsync of the Supervised data directory (/usr/share/hassio by default) to a NAS; a cron sketch follows this list.
  • Add-Ons: Mosquitto, Zigbee2MQTT, DuckDNS+Let’s Encrypt.
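
A cron sketch for the nightly backup, assuming the NAS is mounted at /mnt/nas (a hypothetical mount point):

cat << 'EOF' | sudo tee /etc/cron.d/ha-backup
# Nightly rsync of the Supervised data directory to the NAS at 03:00
0 3 * * * root rsync -a --delete /usr/share/hassio/ /mnt/nas/ha-backup/
EOF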

Part B: AI Workload Server (AlmaLinux 9)

1. Install AlmaLinux 9

Partition:

/      50 GB
/var  100 GB   # container data
swap   8 GB
/home  rest

After first boot, update and install the basics:

sudo dnf update -y
sudo dnf install -y wget vim git
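
The SATA arrays from the overview table can be assembled with mdadm; a sketch assuming the four 480 GB disks enumerate as /dev/sd[b-e] (check lsblk first, and repeat for the 960 GB set):

sudo dnf install -y mdadm
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
sudo mkfs.xfs /dev/md0                                     # format the new array
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm.conf   # persist across reboots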

2. GPU Support & Podman

sudo dnf install -y \
  https://download1.rpmfusion.org/free/el/rpmfusion-free-release-9.noarch.rpm \
  https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-9.noarch.rpm
sudo dnf install -y akmod-nvidia xorg-x11-drv-nvidia-cuda podman podman-docker podman-compose
# nvidia-container-toolkit comes from NVIDIA's own repo, not RPM Fusion
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf install -y nvidia-container-toolkit
# Reboot so the akmod-built kernel module loads, then generate the CDI spec Podman uses for GPU access
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

Test GPU access:

podman run --rm --device nvidia.com/gpu=all docker.io/nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
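
If the test fails, the usual culprit is the NVIDIA kernel module not being built or loaded yet:

lsmod | grep nvidia                          # empty output means the module isn't loaded
sudo akmods --force && sudo dracut --force   # rebuild the akmod and initramfs, then reboot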

3. Install Ollama as Podman Container

Run LLMs locally with NVIDIA GPU acceleration.

  1. Pull the Ollama image:
podman pull docker.io/ollama/ollama:latest
  2. Run the Ollama container (the image's default command already runs serve on 11434; models persist in /root/.ollama, and :Z relabels the volume for SELinux):
podman run -d \
  --name ollama \
  --device nvidia.com/gpu=all \
  -p 11434:11434 \
  -v /opt/ollama:/root/.ollama:Z \
  docker.io/ollama/ollama:latest
  3. Verify:
curl http://localhost:11434/v1/models
  4. Use the Ollama CLI:
podman exec -it ollama ollama list
podman exec -it ollama ollama pull llama2
podman exec -it ollama ollama run llama2
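
Beyond the CLI, the HTTP API is what Home Assistant and OpenWebUI will talk to later; a quick generation test against it:

curl http://localhost:11434/api/generate \
  -d '{"model": "llama2", "prompt": "Why is the sky blue?", "stream": false}'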

Proceed to VM & container management below.

4. Cockpit & Virtualization with libvirt

4.1 Install

sudo dnf install -y cockpit cockpit-machines libvirt-daemon-kvm qemu-kvm virt-install
sudo systemctl enable --now cockpit.socket libvirtd

4.2 Firewall

sudo firewall-cmd --permanent --add-service=cockpit
sudo firewall-cmd --reload

Access: https://ai-server.local:9090.

4.3 VM Creation (GUI)

  • Cockpit → Virtual Machines → Create VM:

    • Name: vm1
    • ISO: Debian 12 netinst
    • OS variant: generic Linux
    • CPUs: 2, RAM: 4 GB, Disk: 20 GB

4.4 VM Creation (CLI)

sudo virt-install --name vm1 --ram 4096 --vcpus 2 \
  --disk path=/var/lib/libvirt/images/vm1.img,size=20 \
  --cdrom /path/to/debian-12-netinst.iso \
  --network network=default --os-variant generic

Connect: sudo virsh console vm1.
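
To bring vm1 back automatically after host reboots:

sudo virsh autostart vm1
sudo virsh dominfo vm1 | grep -i autostart   # confirm the flag took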

4.5 VM Setup: Debian, Docker & Portainer

# Debian install via console (LVM layout, user vmuser)
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl git vim ufw
# Docker:
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo systemctl enable --now docker
# Portainer:
sudo docker volume create portainer_data
sudo docker run -d -p 9000:9000 --name portainer --restart=unless-stopped -v /var/run/docker.sock:/var/run/docker.sock -v portainer_data:/data portainer/portainer-ce
# Firewall:
sudo ufw allow 22/tcp
sudo ufw allow 9000/tcp   # Portainer
sudo ufw allow 7860/tcp   # OpenWebUI
sudo ufw enable
# Deploy OpenWebUI:
cat << 'EOF' > openwebui-docker-compose.yml
services:
  openwebui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: openwebui
    restart: unless-stopped
    ports:
      - 7860:8080   # Open WebUI listens on 8080 inside the container
    volumes:
      - openwebui_data:/app/backend/data
    environment:
      # Point the UI at the Ollama API on the AI server
      - OLLAMA_BASE_URL=http://ai-server.local:11434
volumes:
  openwebui_data:
EOF
sudo docker compose -f openwebui-docker-compose.yml up -d

Access: http://vm1.local:7860.
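
Before pointing OpenWebUI at a model, confirm the VM can reach the Ollama API on the AI server (hostname as configured earlier):

curl -s http://ai-server.local:11434/v1/models   # run from inside vm1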


Part C: Integrate Ollama with Home Assistant

Home Assistant now offers native support for integrating with a local Ollama server, enabling you to leverage local Large Language Models (LLMs) for conversational interactions and, optionally, control of Home Assistant entities.

1. Prerequisites

  • Ollama Server: Ensure you have an Ollama server running on your network. Ollama is available for macOS, Linux, and Windows. Follow the official installation instructions to set it up.

  • Network Accessibility: Configure the Ollama server to be accessible over your network; with the Podman setup above it should be reachable at http://ai-server.local:11434 (see the firewall sketch below).
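
On the AlmaLinux side this means opening the Ollama port in firewalld (the Podman -p flag already publishes it on the host):

sudo firewall-cmd --permanent --add-port=11434/tcp
sudo firewall-cmd --reload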

2. Adding the Ollama Integration to Home Assistant

To integrate Ollama with Home Assistant:

  1. Navigate to your Home Assistant instance.

  2. Go to Settings > Devices & Services.

  3. Click on the Add Integration button in the bottom right corner.

  4. Search for and select Ollama from the list.

  5. Follow the on-screen instructions to complete the setup.

3. Configuration Options

After adding the integration, you can configure the following options:

  • URL: The address of your Ollama server (e.g., http://ai-server.local:11434).

  • Model: Specify the model to use, such as mistral or llama2:13b. Models will be automatically downloaded during setup.

  • Instructions: Provide custom instructions for the AI on how it should respond to your requests. This uses Home Assistant templating.

  • Control Home Assistant: Enable this option if you want the AI to interact with your Home Assistant entities. Note that this feature is experimental and requires exposing specific entities to the AI.

  • Context Window Size: Set the number of tokens the model can take as input. The default is 8000 tokens, but you may adjust this based on your model’s capabilities and system resources.

  • Max History Messages: Define the maximum number of messages to keep for each conversation. Setting this to 0 means no limit.

  • Keep Alive: Determine how long (in seconds) the Ollama host should keep the model in memory after receiving a message. The default is -1 (no limit).

4. Controlling Home Assistant with Ollama

The ability for Ollama to control Home Assistant entities is experimental and requires models that support Tools. To enable this feature:

  1. Ensure your selected model supports Tools.

  2. During the integration setup, enable the Control Home Assistant option.

  3. Expose the entities you want the AI to access via the exposed entities page.

Recommendations:

  • Use the llama3.1:8b model for better performance.

  • Limit the number of exposed entities to fewer than 25 to reduce complexity and potential errors.

  • Consider setting up multiple Ollama integrations: one for general conversation without control capabilities and another with control enabled for managing Home Assistant entities.


By following these steps, you can seamlessly integrate Ollama with Home Assistant, enabling advanced conversational interactions and control over your smart home setup.


Conclusion & Next Steps

  • Home Assistant Hub: Debian supervised install, Coral TPU, Frigate.
  • AI Server: AlmaLinux + Podman, Ollama container, GPU & Cockpit virtualization.
  • Network: firewalled on both hosts (UFW on the hub, firewalld on the AI server).

Part 2 will cover:

  1. Machine-vision models with Coral
  2. Text-to-speech and speech-to-text pipelines
  3. Agent software stack deployment

Stay tuned, and happy homelabbing!