Deploying MLflow Open-Source Machine Learning Experiment Tracking on Ubuntu 24.04

Step-by-step guide to deploy MLflow with PostgreSQL, S3-compatible object storage, Docker Compose, and Traefik on Ubuntu 24.04 — protected with basic auth and automatic HTTPS.

A Developer Advocate with a focus on improving the developer experience through clear communication, technical enablement, and community engagement.
DevOps Engineer with experience in Kubernetes, automation, cloud infrastructure, and observability. I work in Developer Relations, contribute to technical documentation, and collaborate on engineering-focused projects.

MLflow is an open-source platform for managing the machine learning lifecycle, including experiment tracking, a model registry, and reproducible runs. This guide deploys MLflow with Docker Compose, using a PostgreSQL backend store, S3-compatible artifact storage, basic authentication, and Traefik for automatic HTTPS. It then logs a sample scikit-learn run, following the machine learning experiment tracking practices documented in Vultr Docs.

Prerequisite: An S3-compatible bucket (e.g. Vultr Object Storage) with access key, secret key, region, and endpoint URL.


Set Up the Directory Structure

1. Create the project directory:

mkdir -p ~/mlflow
cd ~/mlflow

2. Create the environment file:

nano .env
DOMAIN=mlflow.example.com
LETSENCRYPT_EMAIL=admin@example.com

POSTGRES_USER=mlflow
POSTGRES_PASSWORD=StrongDatabasePassword123

MLFLOW_AUTH_CONFIG_PATH=/app/basic_auth.ini
MLFLOW_FLASK_SERVER_SECRET_KEY=GENERATED_SECRET_KEY

S3_BUCKET=mlflow-artifacts
S3_ACCESS_KEY=YOUR_ACCESS_KEY
S3_SECRET_KEY=YOUR_SECRET_KEY
S3_REGION=YOUR_REGION
S3_ENDPOINT=https://YOUR_OBJECT_STORAGE_ENDPOINT
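
The GENERATED_SECRET_KEY placeholder above needs a real random value. One way to produce it is Python's standard secrets module. This is a minimal sketch, and the helper name generate_secret is hypothetical rather than part of MLflow:

```python
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return a random hex string (two hex characters per byte)."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env in place of the placeholder.
    print(f"MLFLOW_FLASK_SERVER_SECRET_KEY={generate_secret()}")
```

Hex output contains only the characters 0-9 and a-f, so the same helper can also produce a POSTGRES_PASSWORD that needs no escaping inside a connection URI.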

3. Create the basic-auth configuration:

nano basic_auth.ini
[mlflow]
default_permission = READ
database_uri = sqlite:///basic_auth.db
admin_username = admin
admin_password = ADMIN_PASSWORD
authorization_function = mlflow.server.auth:authenticate_request_basic_auth
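
Before building the image, it can help to confirm that the file parses as INI and that the ADMIN_PASSWORD placeholder was replaced. The following is a sketch that uses only the standard library. The function name check_auth_config and the list of expected keys are assumptions taken from the file shown above, not an official MLflow validator:

```python
import configparser

EXPECTED_KEYS = ("default_permission", "database_uri", "admin_username", "admin_password")


def check_auth_config(text: str) -> list[str]:
    """Return a list of problems found in the basic-auth INI text."""
    # Disable interpolation so a '%' in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("mlflow"):
        return ["missing [mlflow] section"]
    section = parser["mlflow"]
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in section]
    if section.get("admin_password", "") == "ADMIN_PASSWORD":
        problems.append("admin_password is still the placeholder")
    return problems


if __name__ == "__main__":
    with open("basic_auth.ini", encoding="utf-8") as handle:
        print(check_auth_config(handle.read()) or "basic_auth.ini looks complete")
```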

4. Create a Dockerfile that adds the auth-server extras and Postgres/S3 clients to the official image:

nano Dockerfile
FROM ghcr.io/mlflow/mlflow:v3.10.1

RUN pip install --no-cache-dir psycopg2-binary boto3 'mlflow[auth]'

Deploy with Docker Compose

1. Create the Compose manifest:

nano docker-compose.yml
services:
  traefik:
    image: traefik:v3.6
    container_name: traefik
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.letsencrypt.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "./letsencrypt:/letsencrypt"
    restart: unless-stopped

  postgres:
    image: postgres:15
    container_name: mlflow-postgres
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: mlflow
    volumes:
      - ./postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 10s
      retries: 5
    restart: unless-stopped

  mlflow:
    build: .
    container_name: mlflow
    expose:
      - "5000"
    environment:
      AWS_ACCESS_KEY_ID: ${S3_ACCESS_KEY}
      AWS_SECRET_ACCESS_KEY: ${S3_SECRET_KEY}
      AWS_DEFAULT_REGION: ${S3_REGION}
      AWS_S3_FORCE_PATH_STYLE: "true"
      MLFLOW_S3_ENDPOINT_URL: ${S3_ENDPOINT}
      MLFLOW_AUTH_CONFIG_PATH: ${MLFLOW_AUTH_CONFIG_PATH}
      MLFLOW_FLASK_SERVER_SECRET_KEY: ${MLFLOW_FLASK_SERVER_SECRET_KEY}
    volumes:
      - ./basic_auth.ini:/app/basic_auth.ini:ro
      - ./mlflow_auth:/app
    command: >
      mlflow server
      --backend-store-uri "postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/mlflow"
      --default-artifact-root "s3://${S3_BUCKET}/"
      --serve-artifacts
      --host 0.0.0.0
      --port 5000
      --allowed-hosts "${DOMAIN},https://${DOMAIN}"
      --app-name basic-auth
    depends_on:
      postgres:
        condition: service_healthy
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.mlflow.rule=Host(`${DOMAIN}`)"
      - "traefik.http.routers.mlflow.entrypoints=websecure"
      - "traefik.http.routers.mlflow.tls.certresolver=letsencrypt"
    restart: unless-stopped
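
The backend store URI embeds the database user and password directly. If the password contains characters that are reserved in URIs, such as @, :, or /, they must be percent-encoded or the URI will not parse as intended. The sketch below shows the encoding. The helper name backend_store_uri is hypothetical, and the host, port, and database defaults mirror the Compose file above:

```python
from urllib.parse import quote


def backend_store_uri(user: str, password: str, host: str = "postgres",
                      port: int = 5432, database: str = "mlflow") -> str:
    """Build a PostgreSQL URI with percent-encoded credentials."""
    # safe="" also encodes '/', which quote() leaves untouched by default.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    print(backend_store_uri("mlflow", "p@ss:w/rd"))
```

A password made only of letters and digits, like the example in .env, passes through unchanged, so the Compose file works as written in that case.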

2. Build and start the stack:

docker compose up -d --build

3. Verify that the services are running and review the logs:

docker compose ps
docker compose logs
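
Beyond container status, a quick HTTP probe can confirm that Traefik is routing requests to MLflow. The sketch below requests the tracking server's /health endpoint. It assumes the endpoint is reachable without credentials in this setup, and the helper names health_url and is_healthy are hypothetical:

```python
import urllib.request


def health_url(base_url: str) -> str:
    """Append /health to the base URL, avoiding a double slash."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url: str, timeout: float = 10.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, TLS failures, and HTTP error codes.
        return False
```

For example, is_healthy("https://mlflow.example.com") should return True once the certificate has been issued and the mlflow container is running.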

Sign In and Log a Sample Experiment

1. Open https://mlflow.example.com and sign in with the username admin and the password set in basic_auth.ini.

2. On the workstation, create a virtualenv and install dependencies:

sudo apt install python3-venv -y
python3 -m venv mlflow-env
source mlflow-env/bin/activate
pip install mlflow scikit-learn pandas numpy boto3

3. Save a demo experiment script:

nano mlflow_demo.py
import mlflow
import mlflow.sklearn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.datasets import load_diabetes
import pandas as pd
import numpy as np

diabetes = load_diabetes()
X = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
y = pd.Series(diabetes.target)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

mlflow.set_tracking_uri("https://mlflow.example.com")
mlflow.set_experiment("official_demo_experiment")

with mlflow.start_run():
    alpha, l1_ratio = 0.5, 0.5
    mlflow.log_param("alpha", alpha)
    mlflow.log_param("l1_ratio", l1_ratio)

    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, random_state=42)
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    r2 = r2_score(y_test, y_pred)
    mlflow.log_metric("rmse", rmse)
    mlflow.log_metric("r2", r2)

    mlflow.sklearn.log_model(model, "model")

4. Export credentials and run the script:

export MLFLOW_TRACKING_USERNAME=admin
export MLFLOW_TRACKING_PASSWORD=ADMIN_PASSWORD
export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_KEY
export MLFLOW_S3_ENDPOINT_URL=https://YOUR_OBJECT_STORAGE_ENDPOINT
python3 mlflow_demo.py
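
If the script fails with an authentication or artifact upload error, a common cause is a variable that was never exported in the current shell. The following sketch lists any that are missing. The variable names come from the export commands above, and the helper name missing_vars is hypothetical:

```python
import os

REQUIRED_VARS = (
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "MLFLOW_S3_ENDPOINT_URL",
)


def missing_vars(env=None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    print("Missing: " + ", ".join(missing) if missing else "All required variables are set.")
```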

Refresh the MLflow UI. The run appears under official_demo_experiment with its logged parameters, metrics, and the persisted model artifact.
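
The same run can also be read back from the tracking server without the UI. This sketch uses mlflow.search_runs to fetch the most recent run in the experiment. It assumes the credentials exported in the previous step are still set, and the helper names fetch_latest_run and summarize_run are hypothetical:

```python
def summarize_run(row: dict) -> str:
    """Format the run ID and the two metrics logged by the demo script."""
    return f"run {row['run_id']}: rmse={row['metrics.rmse']:.3f}, r2={row['metrics.r2']:.3f}"


def fetch_latest_run(tracking_uri: str, experiment_name: str) -> dict:
    """Return the newest run of an experiment as a plain dictionary."""
    import mlflow  # imported here so the formatter above works without mlflow

    mlflow.set_tracking_uri(tracking_uri)
    # search_runs returns a pandas DataFrame with one row per run.
    runs = mlflow.search_runs(
        experiment_names=[experiment_name],
        order_by=["attributes.start_time DESC"],
        max_results=1,
    )
    if runs.empty:
        raise LookupError(f"no runs found in {experiment_name!r}")
    return runs.iloc[0].to_dict()
```

Calling summarize_run(fetch_latest_run("https://mlflow.example.com", "official_demo_experiment")) prints a one-line summary of the run logged above.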


Next Steps

MLflow is running with PostgreSQL persistence and S3 artifacts. From here you can:

  • Register top runs in the Model Registry for staged promotion

  • Wire mlflow.autolog() into your training code for hands-off tracking

  • Add team users and per-experiment permissions in the basic-auth database
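
As a starting point for the last item, MLflow ships an auth client that can create users and grant per-experiment permissions. The sketch below assumes admin credentials are available through MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD. The wrapper name add_user_with_permission and its arguments are hypothetical:

```python
VALID_PERMISSIONS = ("READ", "EDIT", "MANAGE", "NO_PERMISSIONS")


def add_user_with_permission(tracking_uri: str, username: str, password: str,
                             experiment_id: str, permission: str = "READ") -> None:
    """Create a user, then grant them a permission on one experiment."""
    if permission not in VALID_PERMISSIONS:
        raise ValueError(f"permission must be one of {VALID_PERMISSIONS}")

    from mlflow.server import get_app_client  # requires mlflow to be installed

    client = get_app_client("basic-auth", tracking_uri=tracking_uri)
    client.create_user(username=username, password=password)
    client.create_experiment_permission(
        experiment_id=experiment_id, username=username, permission=permission
    )
```

The experiment ID appears in the MLflow UI URL when an experiment is open. Validating the permission name first avoids creating a user and then failing on the second call.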

For the full guide with additional tips, visit the original article on Vultr Docs.

The Self-Hosted Stack

Part 1 of 50

The Self-Hosted Stack is a developer-focused series exploring open-source tools you can deploy, run, and manage on your own infrastructure. From AI platforms and databases to developer tools, observability stacks, and authentication systems, each guide walks through deploying production-ready open-source software on Vultr cloud infrastructure.
