Data Drift, Model Decay, and Dataset Versioning
Your model was perfect at launch. Six months later it's noticeably worse. No code changed. No outages occurred. The model just... drifted. This is the fundamental challenge of operating ML systems in production: the world changes, but your model doesn't, until you retrain it.
Types of Drift
Understanding what's drifting is essential for knowing what to do about it.
Feature Drift (Data Drift / Covariate Shift)
The input data distribution has changed. Your model was trained when users skewed 25-35 years old; then the app went viral with teenagers. The `age` feature now has a very different distribution than it did during training. The model still works for the kinds of inputs it has seen before, but it is increasingly asked to extrapolate into new territory.
Label Drift (Concept Drift)
The relationship between inputs and outputs has changed. You trained a churn model during economic growth. During a recession, high-paying customers who previously stayed start churning. The same features now predict a different label. This is the hardest type of drift to detect early because you need ground truth labels, which often come weeks after predictions.
Prediction Drift
The distribution of your model's outputs has changed, even if inputs look similar. More high-confidence predictions, or a shift in the average predicted score. This is often a leading indicator of the above drift types and is easy to monitor continuously.
| Drift Type | What Changes | Detectable Without Labels? | Response |
|---|---|---|---|
| Feature Drift | Input distributions P(X) | Yes | Retrain with new data |
| Label Drift | P(Y\|X) relationship | No (need labels) | Retrain with fresh labels |
| Prediction Drift | Output distributions P(Ŷ) | Yes | Investigate; likely retrain |
| Upstream Data | Schema/format of raw data | Yes (validation) | Fix pipeline |
Statistical Drift Detection
Drift detection is fundamentally a statistical question: are two distributions the same? Several tests are commonly used:
Kolmogorov-Smirnov (KS) Test
The KS test measures the maximum distance between two empirical cumulative distribution functions. It works well for continuous features and doesn't assume a particular distribution shape.
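A minimal sketch of a two-sample KS test with `scipy.stats.ks_2samp`, using synthetic data that mirrors the earlier example of a user base shifting younger (the distributions and the 0.05 significance threshold are illustrative):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# Reference: the feature distribution at training time (e.g. user age).
reference = rng.normal(loc=30, scale=5, size=5000)
# Current: production traffic after the user base shifted younger.
current = rng.normal(loc=22, scale=6, size=5000)

statistic, p_value = ks_2samp(reference, current)

# The statistic is the maximum gap between the two empirical CDFs
# (0 means identical); a tiny p-value means the samples almost
# certainly come from different distributions.
drifted = p_value < 0.05
print(f"KS statistic={statistic:.3f}, p-value={p_value:.2e}, drifted={drifted}")
```

One caveat worth knowing: with large samples the KS test flags even trivial differences as statistically significant, so in practice many teams also put a threshold on the statistic itself rather than relying on the p-value alone.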
Evidently AI: Automated Drift Reports
Evidently AI is an open-source library for generating comprehensive ML monitoring reports. It handles the statistical tests, visualizations, and report generation automatically.
Dataset Versioning for Retraining
When drift is detected and you decide to retrain, you need to know exactly which dataset version to use. DVC (Data Version Control) provides this: it records each dataset's content hash in a small `.dvc` metafile that git tracks, so every dataset version is pinned to a git commit.
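A minimal DVC workflow might look like the following (the file paths and tag names are illustrative):

```shell
# Snapshot the current training data and record it in git.
dvc add data/train.csv             # writes data/train.csv.dvc containing the file's hash
git add data/train.csv.dvc data/.gitignore
git commit -m "Training data for churn model v3"
git tag data-v3                    # human-readable handle for this dataset version

# Later, when drift forces a retrain, reproduce the exact dataset:
git checkout data-v3
dvc checkout                       # restores the matching data/train.csv from the cache/remote
```

Tagging each dataset commit gives retraining jobs a stable name to check out, instead of a raw commit hash.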
Automated Retraining Pipelines
The final piece of the MLOps loop is automating retraining when drift exceeds your threshold: a scheduled job recomputes drift on recent production data and, when the score crosses the threshold, kicks off the training pipeline.
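One way to wire this up is with Airflow's `PythonSensor` and `TriggerDagRunOperator`, sketched below. The DAG ids, file paths, and threshold are illustrative, a separate `model_retraining` DAG is assumed to exist, and the code is written against Airflow 2.x (2.4+ for the `schedule` argument):

```python
from datetime import datetime

import numpy as np
from scipy.stats import ks_2samp

from airflow import DAG
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.sensors.python import PythonSensor

KS_THRESHOLD = 0.1  # illustrative; tune per feature


def drift_detected() -> bool:
    """Compare recent production feature values against the training reference."""
    reference = np.load("/data/reference/age.npy")  # hypothetical paths
    current = np.load("/data/current/age.npy")
    statistic, _ = ks_2samp(reference, current)
    return statistic > KS_THRESHOLD


with DAG(
    dag_id="drift_monitor",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Poke periodically until drift crosses the threshold (or the run times out).
    wait_for_drift = PythonSensor(
        task_id="check_drift",
        python_callable=drift_detected,
        poke_interval=3600,        # re-check hourly within the daily run
        timeout=60 * 60 * 23,
        mode="reschedule",         # free the worker slot between pokes
    )

    # Hand off to the (assumed) training pipeline once drift is confirmed.
    trigger_retrain = TriggerDagRunOperator(
        task_id="trigger_retraining",
        trigger_dag_id="model_retraining",
    )

    wait_for_drift >> trigger_retrain
```

`mode="reschedule"` matters here: the sensor releases its worker slot between checks instead of blocking it for hours, which keeps a fleet of drift monitors cheap to run.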
Congratulations — you've completed the full MLOps learning path. You now understand the complete lifecycle from data ingestion to production monitoring and automated retraining.