MLOps Pipeline

MLOps Step 3: ML Model Training

Training a model is not a one-time event — it's a repeatable process that happens every time your data changes, your code improves, or you want to try a different approach. This guide focuses on making training reproducible, tracked, and deployable to Kubernetes at scale.

We'll train a LogisticRegression model to predict employee attrition — whether an employee is likely to leave the company. The training script uses a scikit-learn Pipeline that scales numeric features and feeds them into a logistic regression classifier, and tracks the run with MLflow.

The Training Pipeline

Most training runs follow the same pattern: load data, define the model, train, evaluate, save. The employee attrition model uses a scikit-learn Pipeline to chain a StandardScaler with a LogisticRegression classifier. Because the Pipeline is a single object, preprocessing and model are serialized together. When you load the model at serving time, the fitted scaler comes with it, so you don't need to re-implement the scaling logic separately.
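
The pattern above can be sketched as follows. This is a minimal illustration, not the guide's actual training script: the data is synthetic, and the three feature columns (age, tenure, satisfaction) are assumptions made up for the example. It also round-trips the fitted Pipeline through joblib to show that the scaler and classifier travel together.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the attrition data (assumption: three numeric features).
rng = np.random.default_rng(42)
n = 500
X = np.column_stack([
    rng.normal(40, 10, n),  # hypothetical feature: age
    rng.normal(5, 3, n),    # hypothetical feature: years at company
    rng.normal(3, 1, n),    # hypothetical feature: satisfaction score
])
# Made-up rule: low satisfaction and short tenure raise the chance of leaving.
signal = -1.5 * (X[:, 2] - 3) - 0.3 * (X[:, 1] - 5)
y = (signal + rng.normal(0, 1, n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaling and classification live in one object, so they are fit and saved together.
model = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {test_accuracy:.3f}")

# Save and reload: the reloaded object scales raw inputs on its own.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.joblib")
    joblib.dump(model, path)
    reloaded = joblib.load(path)

same_predictions = bool((reloaded.predict(X_test) == model.predict(X_test)).all())
```

Fixing `random_state` in both the split and the data generation is what makes the run repeatable; with real data, the equivalent is pinning the dataset version alongside the seed.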

MLflow Experiment Tracking

MLflow is one of the most widely used open-source tools for tracking ML experiments. It records every run's parameters and metrics in a queryable backend store and keeps the run's artifacts alongside them, so you can compare runs and reproduce a result later.

MLflow Components

  • Tracking Server: Stores experiment metadata (params, metrics). Commonly backed by PostgreSQL in production.
  • Artifact Store: Stores large files (model binaries, plots). Typically backed by object storage such as S3.
  • Model Registry: Versioned model catalog with stage transitions (Staging → Production). Recent MLflow releases deprecate stages in favor of model version aliases.
  • UI: Web interface for comparing runs, viewing metrics, and downloading artifacts.

Hyperparameter Tuning with Optuna

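Manual hyperparameter search doesn't scale. Optuna is a hyperparameter optimization framework whose default sampler, the Tree-structured Parzen Estimator (TPE), is a form of Bayesian optimization: it uses the results of earlier trials to decide which values to try next. On most search spaces this finds good parameters in far fewer trials than an exhaustive grid search.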

When to Tune Hyperparameters: Hyperparameter tuning has diminishing returns. As a rule of thumb, spend about 80% of your time on data quality and feature engineering and 20% on tuning. A well-tuned model on bad features will usually underperform a default model on good features.

Model Evaluation and Promotion

Before promoting a model to production, you need an automated evaluation gate that checks it meets minimum performance thresholds — and that it's better than the current production model.
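
One way to express such a gate is a small pure function over metric dictionaries. This is a sketch under stated assumptions: the function name `evaluate_gate`, the metric names, and the threshold values are hypothetical, and in a real pipeline the metrics would come from your tracking store, not from literals.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GateResult:
    promote: bool
    reasons: List[str] = field(default_factory=list)


def evaluate_gate(
    candidate: Dict[str, float],
    production: Optional[Dict[str, float]],
    thresholds: Dict[str, float],
    primary_metric: str = "roc_auc",
    min_improvement: float = 0.0,
) -> GateResult:
    """Hypothetical promotion gate: absolute minimums, then a comparison with production."""
    reasons = []

    # Check 1: every thresholded metric must be present and at or above its minimum.
    for name, minimum in thresholds.items():
        value = candidate.get(name)
        if value is None:
            reasons.append(f"{name} missing from candidate metrics")
        elif value < minimum:
            reasons.append(f"{name}={value:.3f} is below the minimum {minimum:.3f}")

    # Check 2: the candidate must strictly beat production on the primary metric.
    if production is not None:
        cand = candidate.get(primary_metric)
        prod = production.get(primary_metric)
        if cand is None or prod is None:
            reasons.append(f"{primary_metric} unavailable for comparison")
        elif cand <= prod or cand - prod < min_improvement:
            reasons.append(
                f"{primary_metric}={cand:.3f} does not beat production {prod:.3f}"
            )

    return GateResult(promote=not reasons, reasons=reasons)


# Illustrative numbers only; they are not results from the attrition model.
thresholds = {"roc_auc": 0.75, "recall": 0.60}
better = evaluate_gate({"roc_auc": 0.85, "recall": 0.70}, {"roc_auc": 0.80}, thresholds)
worse = evaluate_gate({"roc_auc": 0.78, "recall": 0.70}, {"roc_auc": 0.80}, thresholds)
weak = evaluate_gate({"roc_auc": 0.70, "recall": 0.50}, None, thresholds)
print(better.promote, worse.reasons, weak.reasons)
```

Returning the reasons alongside the decision lets a CI job print why a candidate was rejected instead of failing silently.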


Running Training on Kubernetes

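When a model or dataset outgrows your local workstation, run training as a Kubernetes Job. A Job lets you schedule the training container onto GPU or high-memory nodes, retries failed pods up to a configured limit, and can be cleaned up automatically after it finishes (for example with ttlSecondsAfterFinished). Note that a plain Job still runs each pod on a single node; training that must span several machines needs a distributed training setup on top of it.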
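
To keep the examples in one language, the sketch below builds a Job manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts just like YAML. The job name, image, command, and resource sizes are hypothetical, and the `nvidia.com/gpu` resource assumes the cluster runs the NVIDIA device plugin.

```python
import json


def build_training_job(name, image, gpus=1, memory="16Gi", ttl_seconds=3600):
    """Return a batch/v1 Job manifest for a single-pod training run (illustrative values)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,                       # retry a failed pod at most twice
            "ttlSecondsAfterFinished": ttl_seconds,  # delete the Job after it finishes
            "template": {
                "spec": {
                    "restartPolicy": "Never",        # let the Job controller handle retries
                    "containers": [
                        {
                            "name": "trainer",
                            "image": image,
                            "command": ["python", "train.py"],  # hypothetical entrypoint
                            "resources": {
                                "requests": {"cpu": "4", "memory": memory},
                                "limits": {
                                    "memory": memory,
                                    # Assumes the NVIDIA device plugin is installed.
                                    "nvidia.com/gpu": str(gpus),
                                },
                            },
                        }
                    ],
                }
            },
        },
    }


manifest = build_training_job(
    name="attrition-train",                               # hypothetical job name
    image="registry.example.com/attrition-train:latest",  # hypothetical image
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Saving the printed output to a file such as `job.json` and running `kubectl apply -f job.json` would submit it. Generating the manifest from code also makes it easy to stamp each Job with the Git commit or data version that produced it.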
