Autoencoders: Unsupervised Representation Learning and Data Compression

Learn autoencoders — encoder-decoder architecture, latent space, variational autoencoders, anomaly detection, dimensionality reduction, and denoising applications.

Autoencoders

Autoencoders learn to compress data into a compact representation and then reconstruct it, with no labels required. The bottleneck layer forces the network to keep the features that matter most for reconstruction; detail that does not fit through the bottleneck is lost. This makes autoencoders useful for dimensionality reduction and anomaly detection and, in their variational form, for generative modeling.


Architecture

Input           Encoder         Latent Space    Decoder         Output
[x ∈ ℝ⁷⁸⁴]  →   [128 → 64]  →   [z ∈ ℝ¹⁶]   →   [64 → 128]  →   [x̂ ∈ ℝ⁷⁸⁴]
                                Bottleneck (compressed representation)

Loss = reconstruction error: ||x - x̂||²

The network is trained to minimize reconstruction error. Because the decoder sees only the latent vector z, any information that z does not capture cannot be reconstructed.
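
To make the shapes in the diagram concrete, here is a minimal sketch that traces a batch through a 784 → 128 → 64 → 16 → 64 → 128 → 784 network. The random batch is a stand-in for real data, and the final `Sigmoid` is an assumption (it is not shown in the diagram) that keeps outputs in [0, 1].

```python
import torch
import torch.nn as nn

# Layer sizes taken from the diagram: 784 -> 128 -> 64 -> 16 and back.
encoder = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 16),
)
decoder = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.Sigmoid(),  # assumption: inputs are scaled to [0, 1]
)

x = torch.rand(8, 784)  # hypothetical batch of 8 flattened 28x28 images
z = encoder(x)          # latent codes
x_hat = decoder(z)      # reconstructions

# Squared L2 reconstruction error per sample, averaged over the batch
loss = ((x - x_hat) ** 2).sum(dim=1).mean()

print(z.shape, x_hat.shape)     # torch.Size([8, 16]) torch.Size([8, 784])
print(x.shape[1] / z.shape[1])  # 49.0 (each input is squeezed into 49x fewer numbers)
```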


Basic Autoencoder in PyTorch

import torch
import torch.nn as nn


class Autoencoder(nn.Module):
    # hidden_dims defaults to a tuple to avoid a shared mutable default argument
    def __init__(self, input_dim=784, hidden_dims=(512, 256), latent_dim=32):
        super().__init__()
        # Encoder: compress to latent space
        encoder_layers = []
        prev_dim = input_dim
        for h_dim in hidden_dims:
            encoder_layers.extend([nn.Linear(prev_dim, h_dim), nn.ReLU()])
            prev_dim = h_dim
        encoder_layers.append(nn.Linear(prev_dim, latent_dim))
        self.encoder = nn.Sequential(*encoder_layers)

        # Decoder: reconstruct from latent space (mirror of the encoder)
        decoder_layers = []
        prev_dim = latent_dim
        for h_dim in reversed(hidden_dims):
            decoder_layers.extend([nn.Linear(prev_dim, h_dim), nn.ReLU()])
            prev_dim = h_dim
        # Sigmoid keeps outputs in [0, 1], so inputs should be scaled to [0, 1] too
        decoder_layers.extend([nn.Linear(prev_dim, input_dim), nn.Sigmoid()])
        self.decoder = nn.Sequential(*decoder_layers)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z  # Return reconstruction and latent code


# Train with reconstruction loss
model = Autoencoder(784, [512, 256], 32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# train_loader is assumed to be a DataLoader yielding (images, labels) batches
for epoch in range(50):
    for x_batch, _ in train_loader:  # Labels ignored!
        x_flat = x_batch.view(x_batch.size(0), -1)  # Flatten each image to a vector
        x_recon, z = model(x_flat)
        loss = criterion(x_recon, x_flat)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
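
The training loop above assumes a `train_loader` that the text does not define. The sketch below is one way to run the same loop end to end: it builds a hypothetical loader from synthetic data, trains a deliberately small encoder/decoder pair (smaller than the `Autoencoder` class, to keep the run fast), and then uses the encoder alone for dimensionality reduction. The data, layer sizes, and epoch count are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in data: 256 flattened 28x28 "images" in (0, 1), generated
# from 4 underlying factors so that a small bottleneck can capture them.
factors = torch.randn(256, 4)
X = torch.sigmoid(factors @ torch.randn(4, 784))
y = torch.zeros(256)  # dummy labels, never used by the training loop
train_loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

# Compact autoencoder for the demo
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 8))
decoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 784), nn.Sigmoid())
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
criterion = nn.MSELoss()

epoch_losses = []
for epoch in range(20):
    total = 0.0
    for x_batch, _ in train_loader:  # labels ignored
        x_recon = decoder(encoder(x_batch))
        loss = criterion(x_recon, x_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total += loss.item() * x_batch.size(0)
    epoch_losses.append(total / len(X))  # mean reconstruction loss for this epoch

# Dimensionality reduction: keep only the encoder output
encoder.eval()
with torch.no_grad():
    Z = encoder(X)

print(Z.shape)  # torch.Size([256, 8])
print(f"first epoch loss {epoch_losses[0]:.4f}, last epoch loss {epoch_losses[-1]:.4f}")
```

The latent matrix `Z` can then be passed to a downstream model or plotted, in the same way as the output of any other dimensionality-reduction method.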

Variational Autoencoder (VAE)

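Regular autoencoders learn a deterministic compressed representation. VAEs learn a probabilistic latent space: each input maps to a distribution (a mean and a variance per latent dimension), not a point. A KL-divergence term in the loss pulls these distributions toward a standard normal prior, which is what makes it possible to generate new data by sampling a latent vector from that prior and decoding it.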

import torch
import torch.nn as nn
import torch.nn.functional as F


class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # Mean of distribution
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # Log-variance
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid()
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Reparameterization trick: z = mu + eps * sigma (differentiable sampling)
        std = torch.exp(0.5 * logvar)  # sigma = exp(log(sigma^2) / 2)
        eps = torch.randn_like(std)    # eps ~ N(0, I)
        return mu + eps * std

    def decode(self, z):
        return self.decoder(z)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar


def vae_loss(recon_x, x, mu, logvar, beta=1.0):
    # Reconstruction loss + KL divergence, both summed over the batch.
    # Binary cross-entropy requires recon_x and x to lie in [0, 1].
    recon_loss = F.binary_cross_entropy(recon_x, x, reduction='sum')
    # Closed-form KL between N(mu, sigma^2) and the standard normal prior N(0, I)
    kl_div = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + beta * kl_div
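
As a sketch of the generation step, the snippet below samples latent vectors from the standard normal prior and decodes them. It also checks two values of the KL term by hand. The decoder here is an untrained stand-in with the same layout as `VAE.decoder`, and the layer sizes are hypothetical; with a trained model you would call `model.decode(z)` instead.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

latent_dim, hidden_dim, input_dim = 16, 128, 784  # hypothetical sizes

# Untrained stand-in for a trained VAE decoder (same layout as VAE.decoder)
decoder = nn.Sequential(
    nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
    nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
)


def kl_to_standard_normal(mu, logvar):
    # Same closed form as kl_div in vae_loss
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())


# The KL term vanishes when the posterior already equals the prior N(0, I)
kl_zero = kl_to_standard_normal(torch.zeros(4, latent_dim), torch.zeros(4, latent_dim))
# Shifting every mean to 1 costs 0.5 * mu^2 per dimension: 0.5 * 4 * 16 = 32
kl_shifted = kl_to_standard_normal(torch.ones(4, latent_dim), torch.zeros(4, latent_dim))

# Generation: sample z from the prior and decode it
decoder.eval()
with torch.no_grad():
    z = torch.randn(10, latent_dim)
    samples = decoder(z)

print(kl_shifted.item())  # 32.0
print(samples.shape)      # torch.Size([10, 784])
```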

Anomaly Detection

An autoencoder trained only on normal data tends to reconstruct anomalies poorly, so a high reconstruction error can serve as an anomaly score:

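# Train only on normal samples
model.train()
# ... training loop on normal data only ...

# Detect anomalies at inference.
# X_test is assumed to be a float tensor of flattened inputs, shape (n_samples, input_dim).
model.eval()
with torch.no_grad():
    x_recon, z = model(X_test)
    recon_errors = ((X_test - x_recon) ** 2).mean(dim=1)  # Per-sample MSE

# Samples with high reconstruction error = anomalies.
# Note: a quantile taken on the test errors flags the top 5% by construction;
# a threshold calibrated on held-out normal data avoids fixing that rate.
threshold = recon_errors.quantile(0.95)  # Top 5% = anomalies
anomalies = recon_errors > threshold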
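
One common way to choose the threshold is to calibrate it on held-out normal data, so that the fraction of test samples flagged is not fixed in advance. The sketch below illustrates this with hypothetical, precomputed reconstruction errors; in practice both tensors would come from running the trained model on a validation set of normal samples and on the test set.

```python
import torch

# Hypothetical per-sample reconstruction errors on held-out NORMAL data
val_errors = torch.linspace(0.01, 0.10, steps=100)

# Threshold = 95th percentile of the errors seen on normal data
threshold = torch.quantile(val_errors, 0.95)

# Hypothetical errors on new samples; two of them are far above the normal range
test_errors = torch.tensor([0.02, 0.05, 0.30, 0.08, 0.50])
flags = test_errors > threshold

print(flags.tolist())  # [False, False, True, False, True]
```

With this scheme, roughly 5% of normal samples are expected to be flagged as false positives, while the number of anomalies flagged depends on the data rather than on the quantile.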

Applications

Use Case                     Autoencoder Type
Dimensionality reduction     Standard AE
Anomaly / fraud detection    Standard AE
Image denoising              Denoising AE
Data generation              VAE, Diffusion
Feature learning             Standard AE
Semi-supervised learning     Standard AE + fine-tune
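
The denoising variant in the table changes only the training objective: the network receives a corrupted input and is trained to reconstruct the clean one. The following is a minimal sketch of a single training step, assuming Gaussian noise with a hypothetical standard deviation of 0.2 and a small stand-in model; the noise type and level are choices to tune, not fixed parts of the method.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small stand-in autoencoder (hypothetical sizes)
model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),
    nn.Linear(64, 784), nn.Sigmoid(),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

x_clean = torch.rand(32, 784)  # hypothetical batch of clean inputs in [0, 1]
noise_std = 0.2                # assumed noise level

# Corrupt the input, keeping values in the valid [0, 1] range
x_noisy = (x_clean + noise_std * torch.randn_like(x_clean)).clamp(0.0, 1.0)

# Feed the noisy input, but compute the loss against the CLEAN target
x_recon = model(x_noisy)
loss = criterion(x_recon, x_clean)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```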

Autoencoders are often the simplest path to anomaly detection in tabular and image data — they require no labels and the reconstruction error provides a natural, interpretable anomaly score.