Online Learning: Updating ML Models Continuously as New Data Arrives

Learn how online learning works: streaming model updates, stochastic gradient descent, concept drift, and how to build ML systems that adapt in real time in 2026.

Online Learning

Most machine learning workflows follow a static pattern: collect data, train a model, deploy it, and retrain periodically when it degrades. Online learning breaks that pattern. Instead of batch training, the model updates incrementally — processing one example (or a small batch) at a time, continuously incorporating new information.


Why Online Learning Exists

The world changes. A fraud detection model trained on last year’s patterns misses new attack vectors. A recommendation model trained on summer behavior is wrong in December. A price prediction model trained before a market shock is immediately stale.

Retraining from scratch on all historical data is:

  • Computationally expensive (hours or days)
  • Memory intensive (can’t hold all data)
  • Slow to respond to rapid changes

Online learning addresses this by updating the model incrementally as each new observation arrives.


The Core Mechanism

Batch learning:
  All data → Train once → Deploy → Wait for degradation → Retrain

Online learning, for each new example that arrives:
  1. Make a prediction
  2. Observe the true label or feedback
  3. Update the model weights immediately
  4. Repeat for the next example

The model stays current: at any moment it has learned from every labeled example that has arrived so far.
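The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production recipe: the `OnlinePerceptron` class, the `run_stream` helper, and the toy stream are all hypothetical names made up for this example, and the perceptron is just one simple model that happens to support one-example-at-a-time updates.

```python
from typing import Dict, Iterable, Tuple


class OnlinePerceptron:
    """Tiny linear classifier that learns from one example at a time."""

    def __init__(self, lr: float = 1.0) -> None:
        self.lr = lr  # lr=1.0 keeps the toy arithmetic exact
        self.weights: Dict[str, float] = {}
        self.bias = 0.0

    def predict_one(self, x: Dict[str, float]) -> int:
        score = self.bias + sum(self.weights.get(k, 0.0) * v for k, v in x.items())
        return 1 if score > 0 else 0

    def learn_one(self, x: Dict[str, float], y: int) -> None:
        error = y - self.predict_one(x)  # -1, 0, or +1
        if error != 0:  # only update on mistakes
            for k, v in x.items():
                self.weights[k] = self.weights.get(k, 0.0) + self.lr * error * v
            self.bias += self.lr * error


def run_stream(model: OnlinePerceptron,
               stream: Iterable[Tuple[Dict[str, float], int]]) -> float:
    """Predict-then-learn loop; returns the running accuracy."""
    correct = total = 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # 1. make a prediction
        correct += int(y_pred == y)    # 2. observe the true label
        model.learn_one(x, y)          # 3. update the weights immediately
        total += 1                     # 4. move on to the next example
    return correct / total if total else 0.0


# Toy stream (illustrative): the label is 1 when feature "a" exceeds feature "b"
toy_stream = [
    ({"a": float(i % 5), "b": float((i * 3) % 5)}, int(i % 5 > (i * 3) % 5))
    for i in range(200)
]
accuracy = run_stream(OnlinePerceptron(), toy_stream)
print(f"Running accuracy: {accuracy:.2f}")
```

Scoring each prediction before the model learns from that example means the running accuracy is always measured on data the model has not yet seen.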


Stochastic Gradient Descent (Online SGD)

Stochastic gradient descent is the mathematical backbone of online learning. Instead of computing gradients over the full dataset (batch gradient descent), it updates the weights after each example or small batch:

# Mini-batch SGD (PyTorch-style): the de facto standard
# Assumes model, criterion, optimizer, dataloader, and n_epochs are defined elsewhere
for epoch in range(n_epochs):
    for X_batch, y_batch in dataloader:  # process one batch
        optimizer.zero_grad()            # clear gradients from the previous step
        loss = criterion(model(X_batch), y_batch)
        loss.backward()                  # compute gradients
        optimizer.step()                 # update weights

With batch_size=1 and a single pass over the incoming stream (no repeated epochs), this is true online learning. With small batches, it’s a practical compromise between the responsiveness of online updates and the stability of batch training.
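To make the batch_size=1 case concrete, here is a minimal NumPy sketch of single-pass online SGD for linear regression with squared error. The function name, the learning rate, and the synthetic noiseless data are assumptions chosen for illustration; a real stream would be noisy and would typically need a tuned or decaying learning rate.

```python
import numpy as np


def online_sgd_linear_regression(stream, n_features, lr=0.05):
    """Single pass over the stream: one gradient step per example."""
    w = np.zeros(n_features)
    b = 0.0
    for x, y in stream:
        error = (w @ x + b) - y  # prediction error on this one example
        w -= lr * error * x      # gradient of 0.5 * error**2 with respect to w
        b -= lr * error          # gradient of 0.5 * error**2 with respect to b
    return w, b


# Synthetic stream (illustrative): noiseless targets from a known linear rule
rng = np.random.default_rng(0)
true_w, true_b = np.array([2.0, -1.0]), 0.5
X = rng.normal(size=(2000, 2))
y = X @ true_w + true_b

w, b = online_sgd_linear_regression(zip(X, y), n_features=2)
print("learned weights:", w, "learned bias:", b)
```

Each example is used once and then discarded, so memory use stays constant no matter how long the stream runs.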


Concept Drift: The Main Challenge

The biggest enemy of deployed models is concept drift: the relationship between the input features and the target variable changes over time, so the patterns the model learned stop holding.

Types of drift:

  • Sudden drift: Distribution changes abruptly (market crash, new fraud method)
  • Gradual drift: Slow shift over time (seasonal patterns, demographic change)
  • Recurring drift: Cyclical changes (day/night, weekday/weekend)
  • Incremental drift: Continuous small changes

Detection methods:

  • ADWIN (Adaptive Windowing): Detects distribution changes in a stream
  • Page-Hinkley test: Statistical test for abrupt changes
  • DDM/EDDM (Drift Detection Method / Early Drift Detection Method): Monitor the error rate; trigger retraining when it rises significantly
  • Population Stability Index (PSI): Compare feature distributions between windows
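As one example of how such a detector works, below is a minimal from-scratch sketch of the Page-Hinkley test for an upward shift in the mean of a stream (for instance, a rising error rate). The class name, the `delta`, `threshold`, and `min_samples` values, and the synthetic error stream are assumptions for illustration; library implementations differ in details and defaults.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # magnitude of change that is tolerated
        self.threshold = threshold      # alarm level (often written as lambda)
        self.min_samples = min_samples  # warm-up period before alarms are allowed
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.min_cum = 0.0  # smallest cumulative deviation seen so far

    def update(self, value):
        """Consume one value; return True if a change is signalled."""
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cum += value - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.n >= self.min_samples and (self.cum - self.min_cum) > self.threshold


# Synthetic 0/1 error stream (illustrative): 10% errors for 500 steps, then 60%
errors = [int(i % 10 == 0) for i in range(500)] + [int(i % 10 < 6) for i in range(500)]

detector = PageHinkley()
detected_at = None
for t, e in enumerate(errors):
    if detector.update(e):
        detected_at = t
        break
print("Change signalled at step:", detected_at)
```

A larger `threshold` reduces false alarms at the cost of slower detection, so in practice it is tuned against the stream's normal variability.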

Practical Online Learning with River

River is a widely used Python library for online/streaming ML:

from river import datasets, drift, linear_model, metrics, preprocessing

# Example stream (an assumption for illustration): River's bundled Phishing dataset.
# Any iterable of (feature_dict, label) pairs works here.
data_stream = datasets.Phishing()

# Build an incremental pipeline
scaler = preprocessing.StandardScaler()
model = linear_model.LogisticRegression()
metric = metrics.Accuracy()
drift_detector = drift.ADWIN()

for x, y_true in data_stream:
    # Scale features incrementally (online)
    x_scaled = scaler.transform_one(x)
    scaler.learn_one(x)

    # Predict before learning
    y_pred = model.predict_one(x_scaled)
    metric.update(y_true, y_pred)

    # Learn from the new example
    model.learn_one(x_scaled, y_true)

    # Feed the 0/1 error signal to the drift detector
    drift_detector.update(int(y_pred != y_true))
    if drift_detector.drift_detected:
        print(f"Drift detected! Accuracy: {metric.get():.3f}")
        metric = metrics.Accuracy()  # reset metric

Online Learning in Production Systems

Real-world examples where online learning is essential:

Recommendation systems: User preferences evolve. Large platforms such as Netflix, YouTube, and Spotify need recommendations that reflect recent behavior, which makes incremental updates to user representations attractive.

Fraud detection: Fraudsters adapt. Models that only update weekly fall behind attack patterns that change daily.

Ad click prediction: Billions of ad impressions per day; feedback arrives immediately. Batch retraining can’t keep up with fast-changing click rates.

Financial trading: Market microstructure changes; models must adapt continuously.


Online vs. Batch: When to Use Each

Factor               | Online Learning  | Batch Learning
Data velocity        | High (streaming) | Low to medium
Concept drift        | High risk        | Manageable
Memory constraint    | Tight            | Flexible
Training stability   | Lower            | Higher
Debugging complexity | High             | Lower
Industry adoption    | Niche            | Dominant

Recommendation: Start with batch learning and periodic retraining (e.g., daily). Move to online learning only when retraining latency or concept drift creates a measurable business impact. The operational complexity of online systems is significant — stream processing infrastructure, drift monitoring, rollback mechanisms.
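To give a feel for what a rollback mechanism involves, here is a deliberately simple sketch: it keeps a last-known-good copy of the model and restores it when the recent error rate crosses a limit. `RollbackGuard`, its parameters, and the dictionary standing in for a model are hypothetical; a real system would also need persistent checkpoints, alerting, and a policy for when to promote a new checkpoint.

```python
import copy
from collections import deque


class RollbackGuard:
    """Restores a last-known-good model copy when recent errors get too frequent."""

    def __init__(self, model, window=100, max_error_rate=0.3):
        self.model = model                      # live model, updated online
        self.checkpoint = copy.deepcopy(model)  # last-known-good snapshot
        self.errors = deque(maxlen=window)      # recent 0/1 error indicators
        self.max_error_rate = max_error_rate
        self.rollbacks = 0

    def promote(self):
        """Mark the current live model as the new last-known-good snapshot."""
        self.checkpoint = copy.deepcopy(self.model)

    def record(self, was_error):
        """Record one outcome; roll back and return True if the window degrades."""
        self.errors.append(int(was_error))
        if len(self.errors) < self.errors.maxlen:
            return False  # not enough evidence yet
        if sum(self.errors) / len(self.errors) > self.max_error_rate:
            self.model = copy.deepcopy(self.checkpoint)
            self.errors.clear()
            self.rollbacks += 1
            return True
        return False


# Usage with a stand-in "model" (a plain dict, purely illustrative)
guard = RollbackGuard({"w": 1.0}, window=10, max_error_rate=0.3)
guard.model["w"] = 99.0                               # simulate a harmful online update
results = [guard.record(True) for _ in range(10)]     # ten consecutive errors
print("rolled back:", results[-1], "restored weight:", guard.model["w"])
```

The design choice worth noting is that the snapshot is only replaced by an explicit `promote()` call, so a model that is drifting in a bad direction cannot silently overwrite its own safety copy.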


2025–2026: Continual Learning

Academic research increasingly focuses on continual learning: neural networks that learn new tasks without forgetting old ones (the “catastrophic forgetting” problem). Methods like EWC (Elastic Weight Consolidation), progressive networks, and memory replay are making large models adaptable without full retraining. These techniques will likely find their way into production LLM fine-tuning pipelines in the coming years.
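The core idea of EWC can be shown in a few lines: when training on a new task, add a quadratic penalty that pulls each parameter back toward its value after the old task, weighted by how important that parameter was (commonly estimated with the diagonal of the Fisher information). The sketch below assumes the Fisher estimates are already available; the function names and the numbers are illustrative only.

```python
import numpy as np


def ewc_penalty(theta, theta_star, fisher_diag, lam=1.0):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2."""
    return 0.5 * lam * np.sum(fisher_diag * (theta - theta_star) ** 2)


def ewc_penalty_grad(theta, theta_star, fisher_diag, lam=1.0):
    """Gradient of the penalty, to be added to the new task's loss gradient."""
    return lam * fisher_diag * (theta - theta_star)


# Illustrative numbers: the first parameter mattered a lot for the old task,
# the second barely mattered
theta_star = np.array([1.0, -2.0])   # parameters after the old task
fisher_diag = np.array([10.0, 0.1])  # assumed importance estimates
theta = np.array([1.5, -1.0])        # parameters while learning the new task

penalty = ewc_penalty(theta, theta_star, fisher_diag)
grad = ewc_penalty_grad(theta, theta_star, fisher_diag)
print("penalty:", penalty, "gradient:", grad)
```

Moving the important parameter by 0.5 costs far more than moving the unimportant one by 1.0, which is how the penalty steers new learning toward parameters the old task does not depend on.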