Foundation

ML Basics For DevOps Engineers

Beginner · 30 min read

You've spent years automating infrastructure, writing pipelines, and making software delivery reliable. Now your team is talking about models, training runs, and feature stores. This guide bridges that gap — not by dumbing things down, but by mapping ML concepts onto things you already understand.

Why ML Is Different From Regular Software

In traditional software development, the behavior of a system is fully determined by the code you write. You can read the code and understand exactly what will happen for any input. In machine learning, the behavior is determined by data + algorithm + training process. The "code" that determines behavior is a trained model artifact — a file of numbers.

This changes everything about how you operate it:

  • Bugs can be invisible. A model can silently degrade without throwing errors. It just starts returning worse predictions.
  • Reproducibility is hard. Two training runs on identical code and data can still produce different models if the random seed or data ordering differs.
  • The artifact is huge. A model might be 500MB or 50GB. You can't store it in git like source code.
  • Deployment isn't enough. You must also monitor for data drift and model decay over time.
💡 Key Insight: Think of an ML model like a compiled binary — but one that was "compiled" from data instead of source code. You need version control for both the code that trained it AND the data it was trained on.

Core ML Concepts You Need to Know

Supervised Learning

The most common type of ML. You have labeled examples (input → known output), and you train a model to predict the output for new inputs. Examples: predicting whether a deployment will fail (classification), forecasting server load (regression).

Features and Labels

A feature is an input variable — one piece of information about a data point. A label is the thing you're trying to predict. If you're predicting customer churn, features might be "days since last login", "number of support tickets", and "contract type". The label is "churned: yes/no".
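In code, features and labels are usually just columns in a table. A minimal sketch of the churn example using pandas — the column names and values are hypothetical:

```python
import pandas as pd

# Hypothetical churn data: one row per customer, values are made up
df = pd.DataFrame({
    "days_since_last_login": [2, 45, 7, 90],                      # feature
    "support_tickets": [0, 5, 1, 8],                              # feature
    "contract_type": ["annual", "monthly", "annual", "monthly"],  # feature
    "churned": [0, 1, 0, 1],                                      # label
})

X = df.drop(columns=["churned"])  # features: the model's inputs
y = df["churned"]                 # label: what the model predicts

print(X.shape, y.tolist())  # (4, 3) [0, 1, 0, 1]
```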

Training, Validation, and Test Sets

You split your data into three buckets:

  • Training set (~70%): The model learns from this data.
  • Validation set (~15%): Used during training to tune hyperparameters and detect overfitting.
  • Test set (~15%): Held out completely, used only for final evaluation. Never touch this during training.
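With scikit-learn, a 70/15/15 split is commonly done with two calls to train_test_split — a sketch using the built-in iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 150 labeled examples

# First split off 30% of the data for validation + test...
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.30, random_state=42
)
# ...then split that 30% in half: ~15% validation, ~15% test
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.50, random_state=42
)

print(len(X_train), len(X_val), len(X_test))  # roughly 70% / 15% / 15% of 150
```

Pinning random_state makes the split itself reproducible, which matters when you compare models across runs.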

Overfitting vs Underfitting

Overfitting means the model memorized the training data but can't generalize to new data. Your training accuracy is great but validation accuracy is bad. Underfitting means the model is too simple to capture the patterns in the data — both training and validation accuracy are bad.
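Both failure modes are easy to demonstrate with a decision tree on the iris dataset, where tree depth stands in for model complexity — a sketch:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Unlimited depth: the tree can memorize every training example
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("deep  train:", deep.score(X_train, y_train))  # 1.0: memorized the data
print("deep  val:  ", deep.score(X_val, y_val))      # lower: the gap is overfitting

# Depth 1: a single split cannot separate three species
stump = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X_train, y_train)
print("stump train:", stump.score(X_train, y_train))  # low on BOTH sets: underfitting
print("stump val:  ", stump.score(X_val, y_val))
```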

Hyperparameters vs Parameters

Parameters are what the model learns (the weights). Hyperparameters are the knobs you set before training: learning rate, number of trees, layer sizes. Tuning hyperparameters is a major part of the MLOps workflow.
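The distinction is visible in any scikit-learn model: hyperparameters go into the constructor, parameters come out of fit(). A sketch:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hyperparameters: knobs YOU set, before training
model = LogisticRegression(C=1.0, max_iter=1000)

# Parameters: weights the MODEL learns, during training
model.fit(X, y)
print(model.coef_.shape)  # (3, 4): one learned weight per (class, feature) pair
```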

The Python ML Ecosystem

Python is the dominant language in ML. Here's what you need to know about the key libraries:

| Library | Purpose | DevOps Analogy |
| --- | --- | --- |
| numpy | N-dimensional array math | The bash of ML — always present, everything uses it |
| pandas | Tabular data manipulation | Like jq but for tables, plus SQL-like operations |
| scikit-learn | Classical ML algorithms | The standard library of ML |
| PyTorch | Deep learning framework | The "engine" — low level, highly flexible |
| MLflow | Experiment tracking & model registry | Like git + Artifactory for models |
| DVC | Data version control | Git LFS but designed for ML datasets |
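A typical environment setup for this stack looks something like the following — a sketch; versions are unpinned here, but in practice you'd pin them in a requirements file:

```shell
# Create an isolated environment so ML dependencies don't leak into the system
python3 -m venv .venv
. .venv/bin/activate

# Install the libraries from the table above
pip install numpy pandas scikit-learn torch mlflow dvc
```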

Training Your First Model

Let's train a real model. We'll use the classic iris dataset to classify flower species — a "hello world" for ML. Don't worry about what iris flowers are; focus on the pattern of how training works.


Run this and you'll see output like `Accuracy: 0.9667`. The model is stored in MLflow's artifact store. You can view the experiment in the MLflow UI by running `mlflow ui` and visiting http://localhost:5000.

💡 MLflow vs Git: Think of `mlflow.start_run()` as a git commit. It captures a snapshot: the hyperparameters (like your commit message), the metrics (like test results), and the model artifact (like the compiled binary). You can always go back and reproduce any run.

ML Workflow vs CI/CD: A Mapping

Your existing intuitions about software delivery map surprisingly well to MLOps:

| CI/CD Concept | ML Equivalent | Notes |
| --- | --- | --- |
| Source code | Training code + dataset | Both must be versioned together |
| Build artifact | Trained model file | Stored in model registry, not git |
| Unit tests | Data validation tests | Check for nulls, schema drift, range violations |
| Integration tests | Model evaluation metrics | Accuracy, F1, AUC must meet thresholds |
| Canary deploy | A/B model testing | Route 5% of traffic to new model |
| Rollback | Champion/challenger swap | Revert to previous model version in registry |
| Monitoring | Prediction monitoring + drift | Watch for statistical drift, not just errors |

The fundamental difference: in CI/CD you ship code. In MLOps you ship code AND data AND the artifact they produce together. A change to the training data can change the model behavior just as much as a code change.
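The "unit tests" row above can start as simply as a function that checks incoming data before training or inference — a sketch with hypothetical column names:

```python
import pandas as pd

def validate(df):
    """Return a list of failures -- the ML analogue of failing unit tests."""
    failures = []
    expected = {"days_since_last_login", "support_tickets", "contract_type"}
    if set(df.columns) != expected:                      # schema drift
        failures.append("schema drift: columns changed")
    if df.isnull().values.any():                         # null check
        failures.append("null values present")
    if "support_tickets" in df and (df["support_tickets"] < 0).any():
        failures.append("range violation: negative ticket count")
    return failures

good = pd.DataFrame({
    "days_since_last_login": [2, 45],
    "support_tickets": [0, 5],
    "contract_type": ["annual", "monthly"],
})
bad = good.copy()
bad.loc[0, "support_tickets"] = -1  # simulate a bug in an upstream pipeline

print(validate(good))  # []
print(validate(bad))   # ['range violation: negative ticket count']
```

Wiring a check like this into CI means a bad data drop fails the pipeline the same way a failing unit test would.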

You're now ready to understand the full MLOps lifecycle. In the next guide, we'll go deeper into exactly what skills you need to bring from DevOps and what you'll need to learn fresh.