Deploying LocalAI, a Self-Hosted AI Model Management Platform, on Ubuntu 24.04
Step-by-step guide to deploy LocalAI with Docker Compose and Traefik on Ubuntu 24.04, exposing an OpenAI-compatible API behind automatic HTTPS for self-hosted LLM serving.

LocalAI is an open-source platform for running large language models (LLMs) locally behind an OpenAI-compatible API, so you can swap it in behind existing OpenAI client code without paying per token or sending data off-server. This guide deploys LocalAI with Docker Compose, using Traefik for automatic HTTPS, persistent model and cache directories, and a chat-completion test to confirm the deployment. It follows the self-hosted AI model management practices documented in Vultr Docs.
Set Up the Directory Structure
1. Create the project directories:
mkdir -p ~/localai/{models,cache}
cd ~/localai
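models/ holds downloaded model files and cache/ holds LocalAI's cache. Both are bind-mounted into the container, so their contents persist between restarts.
2. Create the environment file, replacing the example values with your own domain and email address. The domain's DNS record must already point to this server, because the Let's Encrypt HTTP challenge is answered on port 80: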
nano .env
DOMAIN=localai.example.com
LETSENCRYPT_EMAIL=admin@example.com
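Docker Compose substitutes an empty string for any variable that is missing from .env, which tends to surface later as a confusing routing or certificate error. As an optional sanity check, the sketch below reports missing or empty keys. The check_env function is a hypothetical helper written for this guide, not part of Docker or LocalAI.

```shell
# Hypothetical helper: report every required key that is missing from,
# or empty in, an env file. Returns non-zero if any key fails the check.
check_env() {
  env_file="$1"
  shift
  missing=0
  for key in "$@"; do
    # A key counts as set only when something follows the "=" sign.
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "missing or empty: ${key}"
      missing=1
    fi
  done
  return "$missing"
}

# Usage, from ~/localai:
# check_env .env DOMAIN LETSENCRYPT_EMAIL && echo ".env looks complete"
```

The two key names match the variables referenced in the Compose manifest in the next section.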
Deploy with Docker Compose
1. Add your user to the Docker group:
sudo usermod -aG docker $USER
newgrp docker
2. Create the Compose manifest:
nano docker-compose.yaml
services:
  traefik:
    image: traefik:v3.6
    container_name: traefik
    restart: unless-stopped
    environment:
      DOCKER_API_VERSION: "1.44"
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--certificatesresolvers.le.acme.httpchallenge=true"
      - "--certificatesresolvers.le.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.le.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./letsencrypt:/letsencrypt
  localai:
    image: localai/localai:latest-aio-cpu
    container_name: localai
    restart: unless-stopped
    volumes:
      - ./models:/models:cached
      - ./cache:/cache:cached
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
      interval: 1m
      timeout: 20s
      retries: 5
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.localai.rule=Host(`${DOMAIN}`)"
      - "traefik.http.routers.localai.entrypoints=websecure"
      - "traefik.http.routers.localai.tls=true"
      - "traefik.http.routers.localai.tls.certresolver=le"
      - "traefik.http.services.localai.loadbalancer.server.port=8080"
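Swap localai/localai:latest-aio-cpu for a GPU variant such as localai/localai:latest-aio-gpu-nvidia-cuda-12 if the host has an NVIDIA GPU. The container also needs access to the GPU, which requires the NVIDIA Container Toolkit on the host and a GPU device reservation on the localai service.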
3. Set the models directory permissions and start the stack:
sudo chmod -R 755 ~/localai/models
docker compose up -d
docker compose ps
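The healthcheck in the manifest only runs once a minute, and the first start can take a while if the image still has to fetch its models, so docker compose ps may report the container as starting for some time. The sketch below polls the health status until the container reports healthy. Both wait_until and is_healthy are hypothetical helpers written for this guide, and the attempt count and delay are arbitrary starting points.

```shell
# Hypothetical helper: retry a command until it succeeds or the
# attempts run out. Arguments: attempts, delay in seconds, command...
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Succeeds once Docker reports the "localai" container as healthy,
# based on the healthcheck defined in the Compose manifest.
is_healthy() {
  status=$(docker inspect --format '{{.State.Health.Status}}' localai 2>/dev/null)
  [ "$status" = "healthy" ]
}

# Usage: check every 10 seconds, for up to 30 attempts.
# wait_until 30 10 is_healthy && echo "localai is healthy"
```

If the loop gives up, docker compose logs localai is the usual place to look next.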
Verify the API
1. Check readiness:
curl -i https://localai.example.com/readyz
A 200 OK confirms Traefik is routing to LocalAI.
2. List the available models:
curl https://localai.example.com/v1/models
3. Run a chat completion:
curl -X POST https://localai.example.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Explain what LocalAI does in one sentence."}
    ],
    "max_tokens": 60
  }'
LocalAI returns a JSON response in the OpenAI chat-completion format, with the generated text under choices[0].message.content.
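When scripting against the API you usually want only the reply text rather than the whole JSON document. A minimal sketch, assuming jq is installed (sudo apt install jq) and that the response follows the OpenAI chat-completion layout. The extract_reply function is a hypothetical helper for this guide.

```shell
# Hypothetical helper: print the assistant's reply from a
# chat-completion response read on stdin. Requires jq.
extract_reply() {
  jq -r '.choices[0].message.content'
}

# Usage (domain and model name are the ones used in this guide):
# curl -s https://localai.example.com/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello"}]}' \
#   | extract_reply
```

The -r flag makes jq print the raw string instead of a JSON-quoted one, which is what you want when passing the reply to another command.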
Access the Dashboard
Open https://localai.example.com in a browser to browse the model gallery, install new models, and run inference from the UI.
Next Steps
LocalAI is running and served securely over HTTPS. From here you can:
Install additional models from the gallery for domain-specific tasks
Point any OpenAI SDK at the LocalAI base URL by changing OPENAI_API_BASE (named OPENAI_BASE_URL in current releases of the official SDKs)
Run a GPU variant for image generation, embeddings, and faster LLM inference
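To illustrate the SDK option, the sketch below sets the environment variables that OpenAI client libraries read at startup. It assumes a current SDK release that reads OPENAI_BASE_URL, with OPENAI_API_BASE set as well for older clients, and it assumes LocalAI is running as deployed in this guide, where no API key is configured. The key value is a placeholder that exists only because the SDKs refuse to start without one.

```shell
# Base URL of the LocalAI deployment from this guide, including the /v1 prefix.
export OPENAI_BASE_URL="https://localai.example.com/v1"

# Older SDK releases read this name instead, so set both to be safe.
export OPENAI_API_BASE="$OPENAI_BASE_URL"

# Placeholder value (assumption: no API key is configured on LocalAI).
export OPENAI_API_KEY="sk-local-placeholder"

# Optional check that the URL resolves to the models endpoint:
# curl -s "$OPENAI_BASE_URL/models"
```

Any program started from the same shell session picks these values up, so existing client code can often run unchanged.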
For the full guide with additional tips, visit the original article on Vultr Docs.






