Deploy MLflow on Ubuntu 24.04

MLflow is an open-source platform for managing the machine learning lifecycle, including experiment tracking, a model registry, and reproducible runs. This guide deploys MLflow with Docker Compose, using PostgreSQL as the backend store, S3-compatible object storage for artifacts, MLflow's basic-auth app for authentication, and Traefik for automatic HTTPS. It then logs a sample scikit-learn run to confirm that the deployment works. The steps follow the experiment tracking guide published in Vultr Docs.

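Prerequisites: A server with Docker and the Docker Compose plugin installed, a domain name (mlflow.example.com in this guide) whose DNS record points to the server so that Traefik can complete the Let's Encrypt HTTP challenge, and an S3-compatible bucket (e.g. Vultr Object Storage) with its access key, secret key, region, and endpoint URL.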

Set Up the Directory Structure

1. Create the project directory:

mkdir -p ~/mlflow
cd ~/mlflow

2. Create the environment file:

nano .env

DOMAIN=mlflow.example.com
LETSENCRYPT_EMAIL=admin@example.com

POSTGRES_USER=mlflow
POSTGRES_PASSWORD=StrongDatabasePassword123

MLFLOW_AUTH_CONFIG_PATH=/app/basic_auth.ini
MLFLOW_FLASK_SERVER_SECRET_KEY=GENERATED_SECRET_KEY

S3_BUCKET=mlflow-artifacts
S3_ACCESS_KEY=YOUR_ACCESS_KEY
S3_SECRET_KEY=YOUR_SECRET_KEY
S3_REGION=YOUR_REGION
S3_ENDPOINT=https://YOUR_OBJECT_STORAGE_ENDPOINT
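
The GENERATED_SECRET_KEY placeholder above has to be replaced with a random value before the stack starts. One way to produce such a value is sketched below with Python's standard secrets module. The helper name generate_secret_key and the 32-byte length are assumptions made for illustration, not requirements stated by this guide.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a random hex string to use as a server secret (hypothetical helper)."""
    # token_hex reads from the operating system's secure random source and
    # returns two hexadecimal characters per byte.
    return secrets.token_hex(num_bytes)


# Print a line that can be pasted into .env in place of the placeholder.
print(f"MLFLOW_FLASK_SERVER_SECRET_KEY={generate_secret_key()}")
```

Keeping the key stable across restarts is the safer choice, since regenerating it would likely invalidate anything the server signed with the old value.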

3. Create the basic-auth configuration:

nano basic_auth.ini

[mlflow]
default_permission = READ
database_uri = sqlite:///basic_auth.db
admin_username = admin
admin_password = ADMIN_PASSWORD
authorization_function = mlflow.server.auth:authenticate_request_basic_auth
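
Before building the image, it can help to confirm that the file parses and that the placeholder password was replaced. The sketch below uses only the standard configparser module. The check_auth_config helper is hypothetical, and the set of permission names (READ, EDIT, MANAGE, NO_PERMISSIONS) is an assumption taken from MLflow's documented basic-auth levels.

```python
import configparser

# Assumed permission levels of MLflow's basic-auth app.
VALID_PERMISSIONS = {"READ", "EDIT", "MANAGE", "NO_PERMISSIONS"}
# Placeholder value used in this guide.
PLACEHOLDER_PASSWORD = "ADMIN_PASSWORD"


def check_auth_config(ini_text: str) -> list[str]:
    """Return a list of problems found in the basic-auth config text."""
    # Interpolation is disabled so a '%' in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("mlflow"):
        return ["missing [mlflow] section"]
    section = parser["mlflow"]
    problems = []
    if section.get("default_permission", "") not in VALID_PERMISSIONS:
        problems.append("default_permission is not a recognised level")
    if section.get("admin_password", PLACEHOLDER_PASSWORD) == PLACEHOLDER_PASSWORD:
        problems.append("admin_password is unset or still the placeholder")
    return problems


sample = """
[mlflow]
default_permission = READ
database_uri = sqlite:///basic_auth.db
admin_username = admin
admin_password = ADMIN_PASSWORD
"""
for problem in check_auth_config(sample):
    print(problem)
```

To check the real file, pass in the result of reading basic_auth.ini instead of the sample string.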

4. Create a Dockerfile that adds the auth-server extras and Postgres/S3 clients to the official image:

nano Dockerfile

FROM ghcr.io/mlflow/mlflow:v3.10.1

RUN pip install --no-cache-dir psycopg2-binary boto3 'mlflow[auth]'

Deploy with Docker Compose

1. Create the Compose manifest:

nano docker-compose.yml

services:
  traefik:
    image: traefik:v3.6
    container_name: traefik
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.letsencrypt.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./letsencrypt:/letsencrypt"
    restart: unless-stopped

  postgres:
    image: postgres:15
    container_name: mlflow-postgres
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: mlflow
    volumes:
      - ./postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 10s
      retries: 5
    restart: unless-stopped

  mlflow:
    build: .
    container_name: mlflow
    expose:
      - "5000"
    environment:
      AWS_ACCESS_KEY_ID: ${S3_ACCESS_KEY}
      AWS_SECRET_ACCESS_KEY: ${S3_SECRET_KEY}
      AWS_DEFAULT_REGION: ${S3_REGION}
      AWS_S3_FORCE_PATH_STYLE: "true"
      MLFLOW_S3_ENDPOINT_URL: ${S3_ENDPOINT}
      MLFLOW_AUTH_CONFIG_PATH: ${MLFLOW_AUTH_CONFIG_PATH}
      MLFLOW_FLASK_SERVER_SECRET_KEY: ${MLFLOW_FLASK_SERVER_SECRET_KEY}
    volumes:
      - ./basic_auth.ini:/app/basic_auth.ini:ro
      - ./mlflow_auth:/app
    command: >
      mlflow server
      --backend-store-uri "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/mlflow"
      --default-artifact-root "s3://${S3_BUCKET}/"
      --serve-artifacts
      --host 0.0.0.0
      --port 5000
      --allowed-hosts "${DOMAIN},https://${DOMAIN}"
      --app-name basic-auth
    depends_on:
      postgres:
        condition: service_healthy
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.mlflow.rule=Host(`${DOMAIN}`)"
      - "traefik.http.routers.mlflow.entrypoints=websecure"
      - "traefik.http.routers.mlflow.tls.certresolver=letsencrypt"
    restart: unless-stopped
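
The backend store URI in the manifest embeds the database credentials directly, so a password containing characters such as @, : or / would change how the URI is parsed. The sketch below shows how the URI is assembled and how percent-encoding keeps such characters intact. The build_backend_uri helper is hypothetical; the host, port, and database defaults mirror the manifest above.

```python
from urllib.parse import quote


def build_backend_uri(user: str, password: str, host: str = "postgres",
                      port: int = 5432, database: str = "mlflow") -> str:
    """Assemble a PostgreSQL URI with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including '/'.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{database}"


print(build_backend_uri("mlflow", "StrongDatabasePassword123"))
# postgresql://mlflow:StrongDatabasePassword123@postgres:5432/mlflow
print(build_backend_uri("mlflow", "p@ss/word"))
# postgresql://mlflow:p%40ss%2Fword@postgres:5432/mlflow
```

Since Compose substitutes the raw value from .env into the command, the simplest option is probably a password made of letters and digits only, as in the example .env file.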

2. Build and start the stack:

docker compose up -d --build

3. Verify that the services are running and review the logs:

docker compose ps
docker compose logs
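
Once the containers report as running, a quick request to the tracking server can confirm that Traefik is routing traffic and the certificate was issued. The sketch below assumes the server answers on a /health path and that this path is reachable without credentials, which may vary between MLflow versions. Both helper names are hypothetical.

```python
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL from the site's base URL."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, TLS errors, timeouts, and HTTP error codes.
        return False


# Example call, left commented out because it needs the live deployment:
# print(is_healthy("https://mlflow.example.com"))
```

A False result right after startup is not necessarily a failure, because certificate issuance can take a short while; the Traefik entries in the Compose logs show the progress.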

Sign In and Log a Sample Experiment

1. Open https://mlflow.example.com and sign in as admin with the password set in basic_auth.ini.

2. On the workstation, create a virtualenv and install dependencies:

sudo apt install python3-venv -y
python3 -m venv mlflow-env
source mlflow-env/bin/activate
pip install mlflow scikit-learn pandas numpy boto3

3. Save a demo experiment script:

nano mlflow_demo.py

import mlflow
import mlflow.sklearn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.datasets import load_diabetes
import pandas as pd
import numpy as np

# Load the bundled diabetes regression dataset into pandas structures.
diabetes = load_diabetes()
X = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
y = pd.Series(diabetes.target)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Point the client at the deployed server; credentials come from the environment.
mlflow.set_tracking_uri("https://mlflow.example.com")
mlflow.set_experiment("official_demo_experiment")

with mlflow.start_run():
    # Record the hyperparameters for this run.
    alpha, l1_ratio = 0.5, 0.5
    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)

    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    model.fit(X_train, y_train)

    # Evaluate on the held-out split and log the metrics.
    y_pred = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    r2 = r2_score(y_test, y_pred)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)

    # Upload the fitted model to the S3 artifact store.
    mlflow.sklearn.log_model(model, "model")

4. Export credentials and run the script:

export MLFLOW_TRACKING_USERNAME=admin
export MLFLOW_TRACKING_PASSWORD=ADMIN_PASSWORD
export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY
export MLFLOW_S3_ENDPOINT_URL=https://YOUR_OBJECT_STORAGE_ENDPOINT
python3 mlflow_demo.py

Refresh the MLflow UI. The run appears under official_demo_experiment with its logged parameters, metrics, and the persisted model artifact.
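
If the script fails with an authentication or upload error, a missing export is a likely cause. The sketch below lists which of the variables from step 4 are absent in the current shell. The missing_vars helper is hypothetical, and the variable list simply mirrors the exports shown in that step.

```python
import os

# Variables exported in step 4 of this section.
REQUIRED_VARS = (
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "MLFLOW_S3_ENDPOINT_URL",
)


def missing_vars(env=None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Run it in the same shell session as the exports, since variables set with export do not carry over to new terminals.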

Next Steps

MLflow is running with PostgreSQL persistence and S3 artifacts. From here you can:

Register top runs in the Model Registry for staged promotion
Wire mlflow.autolog() into your training code for hands-off tracking
Add team users and per-experiment permissions in the basic-auth database
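
As a starting point for the last item, the sketch below builds an HTTP request that asks the basic-auth app to create a user. It assumes the /api/2.0/mlflow/users/create endpoint described in MLflow's authentication documentation. The helper name, the user alice, and the ALICE_PASSWORD placeholder are hypothetical. Only the standard library is used, and the request is built but not sent.

```python
import base64
import json
import urllib.request


def build_create_user_request(base_url, admin_user, admin_password,
                              new_user, new_password):
    """Build (but do not send) a POST request that creates an MLflow user."""
    payload = json.dumps({"username": new_user, "password": new_password})
    # HTTP basic auth: base64 of "user:password" in the Authorization header.
    credentials = f"{admin_user}:{admin_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/2.0/mlflow/users/create",
        data=payload.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


request = build_create_user_request(
    "https://mlflow.example.com", "admin", "ADMIN_PASSWORD",
    "alice", "ALICE_PASSWORD",
)
print(request.get_method(), request.full_url)

# To send it against the live server, uncomment the lines below:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

MLflow also ships a Python client for the auth app, which may be more convenient than raw HTTP once mlflow is installed on the workstation.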

For the full guide with additional tips, visit the original article on Vultr Docs.
