Deploying LocalAI on Ubuntu 24.04

LocalAI is an open-source platform for running Large Language Models locally with an OpenAI-compatible API, so you can swap it in behind existing OpenAI client code without paying per-token or sending data off-server. This guide deploys LocalAI using Docker Compose with Traefik handling automatic HTTPS, persistent model and cache directories, and a working chat-completion test, following self-hosted AI model management practices documented in Vultr Docs.

Set Up the Directory Structure

1. Create the project directories:

mkdir -p ~/localai/{models,cache}
cd ~/localai

models/ holds downloaded model files; cache/ persists between restarts.

2. Create the environment file:

nano .env

DOMAIN=localai.example.com
LETSENCRYPT_EMAIL=admin@example.com

Deploy with Docker Compose

1. Add your user to the Docker group:

sudo usermod -aG docker $USER
newgrp docker

2. Create the Compose manifest:

nano docker-compose.yaml

services:
  traefik:
    image: traefik:v3.6
    container_name: traefik
    restart: unless-stopped
    environment:
      DOCKER_API_VERSION: "1.44"
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--certificatesresolvers.le.acme.httpchallenge=true"
      - "--certificatesresolvers.le.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.le.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt

  localai:
    image: localai/localai:latest-aio-cpu
    container_name: localai
    restart: unless-stopped
    volumes:
      - ./models:/models:cached
      - ./cache:/cache:cached
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
      interval: 1m
      timeout: 20s
      retries: 5
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.localai.rule=Host(`${DOMAIN}`)"
      - "traefik.http.routers.localai.entrypoints=websecure"
      - "traefik.http.routers.localai.tls=true"
      - "traefik.http.routers.localai.tls.certresolver=le"
      - "traefik.http.services.localai.loadbalancer.server.port=8080"

Swap localai/localai:latest-aio-cpu for a GPU variant (latest-aio-gpu-nvidia-cuda-12) if the host has an NVIDIA GPU.

3. Set the models directory permissions and start the stack:

sudo chmod -R 755 ~/localai/models
docker compose up -d
docker compose ps

Verify the API

1. Check readiness:

curl -i https://localai.example.com/readyz

A 200 OK confirms Traefik is routing to LocalAI.

2. List the available models:

curl https://localai.example.com/v1/models

3. Run a chat completion:

curl -X POST https://localai.example.com/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-4",
      "messages": [
        {"role": "user", "content": "Explain what LocalAI does in one sentence."}
      ],
      "max_tokens": 60
    }'

LocalAI returns a response in the OpenAI completion shape.

Access the Dashboard

Open https://localai.example.com in a browser to browse the model gallery, install new models, and run inference from the UI.

Next Steps

LocalAI is running and served securely over HTTPS. From here you can:

Install additional models from the gallery for domain-specific tasks
Point any OpenAI SDK at the LocalAI base URL by changing OPENAI_API_BASE
Run a GPU variant for image generation, embeddings, and faster LLM inference

For the full guide with additional tips, visit the original article on Vultr Docs.

Deploying LocalAI Self-Hosted AI Model Management Platform on Ubuntu 24.04

Set Up the Directory Structure

Deploy with Docker Compose

Verify the API

Access the Dashboard

Next Steps

Comments

The Self-Hosted Stack

More from this blog

Deploying Zabbix Open-Source Monitoring Platform on Ubuntu 24.04

Deploying SeaweedFS, an Open-Source S3 Storage Alternative to MinIO, on Ubuntu 24.04

Deploying Qdrant Open-Source Vector Database for AI Applications on Ubuntu 24.04

Deploying Paperless-ngx Open-Source Document Management System on Ubuntu 24.04

Deploying Overleaf Open-Source LaTeX Collaboration Platform on Ubuntu 24.04

Command Palette

Set Up the Directory Structure

Deploy with Docker Compose

Verify the API

Access the Dashboard

Next Steps

Comments

The Self-Hosted Stack

More from this blog