Underfitting in Machine Learning: When Your Model Is Too Simple

Understand underfitting — high bias, poor performance on training and test data, and how to fix it by adding complexity, better features, or tuning hyperparameters.

Underfitting

If overfitting is a model that’s too clever — one that’s memorized training details — underfitting is the opposite: a model that’s too simple to even capture the basic patterns. It fails not because it knows too much about training data, but because it hasn’t learned enough.


What Underfitting Looks Like

Training loss: ▼▼▼ → plateau (never gets low)
Validation loss: ▼▼▼ → plateau (close to training loss)

Both training and validation performance are poor. The model can’t even fit the training data well — a sure sign that its capacity or configuration is insufficient.


The Bias-Variance-Complexity Relationship

Model complexity:  Simple ──────────────────────────── Complex
Training error:    High ────────────────────────────── Low
Validation error:  High ─────── sweet spot ─────────── High
                   ↑                                   ↑
                   Underfitting                        Overfitting

Underfitting = high bias: the model makes systematic errors even on training data because it can’t represent the true relationship.
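To make the high-bias picture concrete, here is a minimal sketch using synthetic data (the quadratic target, noise level, and split sizes are assumptions chosen for illustration). It fits a straight line and a quadratic to the same curved data and compares R² on the training and validation portions.

```python
import numpy as np

# Synthetic data with a curved relationship (assumed for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=400)
y = x**2 + rng.normal(scale=0.3, size=400)

x_train, y_train = x[:300], y[:300]
x_val, y_val = x[300:], y[300:]

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

scores = {}
for degree in (1, 2):
    # Least-squares polynomial fit of the given degree
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    scores[degree] = (
        r2(y_train, np.polyval(coeffs, x_train)),
        r2(y_val, np.polyval(coeffs, x_val)),
    )
    print(f"degree={degree}  train R²={scores[degree][0]:.3f}  val R²={scores[degree][1]:.3f}")
```

The straight line scores poorly on both splits because no choice of slope and intercept can represent the curve, while the quadratic fits both well. The gap between the two models, not the gap between train and validation, is what signals bias here.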


Common Causes

Model too simple for the data:

  • Linear regression on data that has a curved relationship
  • A shallow decision tree on a complex multi-class problem
  • Low-degree polynomial when the true function has higher-order terms

Insufficient training:

  • Too few epochs (neural networks)
  • Too aggressive learning rate decay
  • Premature early stopping
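The effect of stopping too soon can be sketched with plain gradient descent on a linear model. The data, learning rate, and epoch counts below are assumptions for illustration, not values from any particular project.

```python
import numpy as np

# Synthetic linear data (assumed for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

def train(n_epochs, lr=0.01):
    """Full-batch gradient descent on mean squared error; returns final training MSE."""
    w = np.zeros(3)
    for _ in range(n_epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return float(np.mean((X @ w - y) ** 2))

mse_short = train(5)    # stopped long before convergence
mse_long = train(500)   # given enough updates to converge
print(f"5 epochs:   training MSE = {mse_short:.3f}")
print(f"500 epochs: training MSE = {mse_long:.3f}")
```

The model class is identical in both runs and is fully capable of fitting the data; only the number of updates differs. The short run underfits purely because the weights have not moved far enough from their initial values.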

Over-regularization:

  • L1/L2 penalty too large → model weights driven toward zero
  • Too much dropout → network can’t retain useful information
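As a rough illustration of over-regularization, the sketch below solves ridge regression in closed form with NumPy (a hand-written helper, not scikit-learn's implementation) and shows how the weight norm and training R² change as the penalty grows. The data and the alpha values are assumptions.

```python
import numpy as np

# Synthetic linear data without an intercept (assumed for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution: (X^T X + alpha * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

results = {}
for alpha in (0.01, 100.0, 10000.0):
    w = ridge_fit(X, y, alpha)
    results[alpha] = (float(np.linalg.norm(w)), float(r2(y, X @ w)))
    print(f"alpha={alpha:>8}  ||w||={results[alpha][0]:.3f}  train R²={results[alpha][1]:.3f}")
```

As alpha increases, the weights shrink toward zero and the fit to the training data degrades, even though the underlying relationship is exactly linear.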

Poor feature set:

  • Important predictors missing from the feature set
  • Features haven’t been transformed to capture the relevant signal

Diagnosing Underfitting

Check Training Performance First

If your model achieves only 55% accuracy on training data for a task where 90% should be achievable, it’s underfitting — regardless of what validation accuracy looks like.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# cross_validate reports scores on the training folds as well as the held-out folds
model = LinearRegression()
cv = cross_validate(model, X_train, y_train, cv=5, scoring='r2', return_train_score=True)
print(f"Train R²: {cv['train_score'].mean():.3f}")  # e.g., 0.42 (underfitting)
print(f"Validation R²: {cv['test_score'].mean():.3f}")  # e.g., 0.40 (similar, so not overfitting)

Residual Analysis (Regression)

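Plot the residuals (actual minus predicted) against the predicted values or against a key feature. Underfitting shows up as systematic patterns in the residuals: the model consistently over- or under-predicts in certain regions instead of scattering randomly around zero.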
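A numeric version of the residual check can be sketched without a plot: fit a straight line to curved data, then average the residuals within a few regions of the input. The quadratic data and the choice of three regions are assumptions for illustration.

```python
import numpy as np

# Curved synthetic data (assumed for illustration)
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 300)
y = x**2 + rng.normal(scale=0.3, size=x.size)

# Underfit on purpose with a straight line
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)

# Average residual in the left, middle, and right thirds of the x range
# (x is already sorted, so splitting by position splits by region)
region_means = [float(chunk.mean()) for chunk in np.array_split(residuals, 3)]
for name, mean in zip(("left", "middle", "right"), region_means):
    print(f"{name:>6} region: mean residual = {mean:+.2f}")
```

A well-specified model would give region means close to zero. Here the line under-predicts at both ends and over-predicts in the middle, which is the kind of structured residual pattern that points to underfitting.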


Fixes for Underfitting

1. Increase Model Complexity

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.tree import DecisionTreeClassifier
import torch.nn as nn

# Decision tree: allow deeper splits
model = DecisionTreeClassifier(max_depth=None)  # Remove depth limit (an unlimited tree can overfit, so keep checking validation scores)

# Linear → Polynomial
poly_model = Pipeline([
    ('poly', PolynomialFeatures(degree=3)),
    ('linear', LinearRegression()),
])

# Neural net (PyTorch): add more layers/neurons
model = nn.Sequential(
    nn.Linear(10, 64),  # was 32
    nn.ReLU(),
    nn.Linear(64, 64),  # added layer
    nn.ReLU(),
    nn.Linear(64, 1),
)

2. Reduce Regularization

from sklearn.linear_model import Ridge
import torch.nn as nn

# If Ridge is too strong, try smaller alpha
ridge = Ridge(alpha=0.01)  # was alpha=100
# Reduce dropout rate (PyTorch)
dropout = nn.Dropout(p=0.1)  # was p=0.5

3. Add More Relevant Features

Underfitting is often a feature problem, not a model problem. Consider:

  • Domain-specific engineered features
  • Interaction terms (feature_a × feature_b)
  • Polynomial transformations
  • Time-based aggregations
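The interaction-term idea from the list above can be sketched as follows. The names `feature_a` and `feature_b` and the data-generating process are hypothetical; the target is constructed to depend on their product so that the effect is easy to see.

```python
import numpy as np

# Hypothetical features whose product drives the target (assumed for illustration)
rng = np.random.default_rng(0)
feature_a = rng.normal(size=500)
feature_b = rng.normal(size=500)
y = feature_a * feature_b + rng.normal(scale=0.1, size=500)

def linear_fit_r2(columns, y):
    """Ordinary least squares with an intercept; returns training R²."""
    X = np.column_stack([np.ones(len(y)), *columns])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_res = np.sum((y - X @ w) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

r2_raw = linear_fit_r2([feature_a, feature_b], y)
r2_interaction = linear_fit_r2([feature_a, feature_b, feature_a * feature_b], y)
print(f"raw features only:     train R² = {r2_raw:.3f}")
print(f"with interaction term: train R² = {r2_interaction:.3f}")
```

The model stays linear in both cases. What changes is that the engineered column gives it access to the signal, which is why a feature fix can resolve underfitting without touching the model class.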

4. Train Longer (Neural Networks)

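# Keras-style API, assuming `model` is a compiled Keras model
from tensorflow.keras.callbacks import EarlyStopping

# Remove premature early stopping or increase patience
early_stop = EarlyStopping(patience=50)  # was patience=5
# Or simply train for more epochs; the callback only takes effect when passed to fit
model.fit(X_train, y_train, epochs=500, validation_split=0.2, callbacks=[early_stop])  # was epochs=20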

A Decision Framework

                     Training: Poor     Training: Good
Validation: Poor     Underfitting       Overfitting
Validation: Good     Can't tell         Just right ✓

Both poor (underfitting):

  • More data won’t help (the model can’t even learn from what it has)
  • More features, more model capacity, less regularization

Training good, validation poor (overfitting):

  • More data, regularization, dropout, early stopping
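The framework can be written down as a small helper. This is only a sketch: the function name `diagnose` and the single `good_threshold` cutoff are hypothetical, and what counts as "good" depends on the task and metric.

```python
def diagnose(train_score, val_score, good_threshold=0.85):
    """Map a (training, validation) score pair onto the four framework cells.

    Scores are assumed to be "higher is better" (accuracy, R², ...).
    good_threshold is a hypothetical, task-specific cutoff.
    """
    train_good = train_score >= good_threshold
    val_good = val_score >= good_threshold
    if not train_good and not val_good:
        return "underfitting"
    if train_good and not val_good:
        return "overfitting"
    if not train_good and val_good:
        return "can't tell"
    return "just right"

print(diagnose(0.55, 0.53))  # both poor
print(diagnose(0.99, 0.70))  # training good, validation poor
print(diagnose(0.92, 0.90))  # both good
```

In practice the threshold would come from a baseline, a known achievable score, or a business requirement rather than a fixed constant.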


The Balance: Finding the Sweet Spot

Neither underfitting nor overfitting is acceptable in production. The goal is a model that:

  1. Performs well enough on training data to confirm it has learned the patterns
  2. Generalizes well enough on validation/test data to confirm it hasn’t over-specialized

This balance is found empirically, through cross-validation, model selection, and careful monitoring of learning curves. There’s no shortcut, but evaluating systematically across several complexity levels is a reliable way to locate it.
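One way to sweep complexity levels is sketched below with a hand-rolled 5-fold cross-validation over polynomial degree. The cubic target, sample size, noise level, and degree range are all assumptions for illustration; with scikit-learn, a helper such as `validation_curve` plays a similar role.

```python
import numpy as np

# Synthetic data from a cubic function (assumed for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=60)
y = x**3 - 3 * x + rng.normal(scale=0.3, size=60)

# Shuffle the indices once and split them into 5 folds
folds = np.array_split(rng.permutation(len(x)), 5)

def cv_mse(degree):
    """Mean training and validation MSE across folds for one polynomial degree."""
    train_errs, val_errs = [], []
    for i, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        coeffs = np.polyfit(x[train_idx], y[train_idx], deg=degree)
        train_errs.append(np.mean((np.polyval(coeffs, x[train_idx]) - y[train_idx]) ** 2))
        val_errs.append(np.mean((np.polyval(coeffs, x[val_idx]) - y[val_idx]) ** 2))
    return float(np.mean(train_errs)), float(np.mean(val_errs))

results = {degree: cv_mse(degree) for degree in range(1, 9)}
best_degree = min(results, key=lambda d: results[d][1])

for degree, (train_mse, val_mse) in results.items():
    print(f"degree={degree}  train MSE={train_mse:.3f}  val MSE={val_mse:.3f}")
print(f"lowest validation MSE at degree {best_degree}")
```

Low degrees show high error on both training and validation folds (underfitting). Training error keeps falling as the degree grows, so the degree is picked by validation error instead.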