Autoencoders
Autoencoders learn to compress data into a compact representation and then reconstruct it, with no labels required. The bottleneck layer forces the network to keep only the features most useful for reconstruction; everything else is discarded. This makes autoencoders useful for dimensionality reduction and anomaly detection and, in their variational form, for generative modeling.
Architecture
Input          Encoder        Latent Space  Decoder        Output
[x ∈ ℝ⁷⁸⁴]  →  [128 → 64]  →  [z ∈ ℝ¹⁶]  →  [64 → 128]  →  [x̂ ∈ ℝ⁷⁸⁴]
                                  ↑
               Bottleneck (compressed representation)
Loss = reconstruction error: ||x - x̂||²
The network is trained to minimize reconstruction error; any information not captured in the latent vector z is lost.
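As a small worked example of this loss, the sketch below computes the squared reconstruction error for one made-up input and reconstruction (the tensor values are illustrative, not taken from any dataset). It also shows that PyTorch's mean-squared-error loss averages over elements, so it differs from ||x - x̂||² only by a constant factor and has the same minimizer.

```python
import torch
import torch.nn.functional as F

# Illustrative values only: one 4-dimensional input and its reconstruction
x = torch.tensor([[0.0, 0.5, 1.0, 0.25]])
x_hat = torch.tensor([[0.1, 0.5, 0.8, 0.25]])

# Squared L2 reconstruction error per sample: ||x - x_hat||^2
sq_err = ((x - x_hat) ** 2).sum(dim=1)

# mse_loss with the default reduction='mean' averages over all elements,
# so for this single 4-dimensional sample it equals sq_err / 4
mse = F.mse_loss(x_hat, x)

print(sq_err.item(), mse.item())  # approximately 0.05 and 0.0125
```

The constant factor matters only when comparing loss values across input sizes or when balancing the reconstruction term against another term.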
Basic Autoencoder in PyTorch
import torch
import torch.nn as nn


class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, hidden_dims=(512, 256), latent_dim=32):
        super().__init__()

        # Encoder: compress to latent space
        encoder_layers = []
        prev_dim = input_dim
        for h_dim in hidden_dims:
            encoder_layers.extend([nn.Linear(prev_dim, h_dim), nn.ReLU()])
            prev_dim = h_dim
        encoder_layers.append(nn.Linear(prev_dim, latent_dim))
        self.encoder = nn.Sequential(*encoder_layers)

        # Decoder: reconstruct from latent space
        decoder_layers = []
        prev_dim = latent_dim
        for h_dim in reversed(hidden_dims):
            decoder_layers.extend([nn.Linear(prev_dim, h_dim), nn.ReLU()])
            prev_dim = h_dim
        # Sigmoid keeps outputs in [0, 1], so inputs should be scaled to that range
        decoder_layers.extend([nn.Linear(prev_dim, input_dim), nn.Sigmoid()])
        self.decoder = nn.Sequential(*decoder_layers)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z  # Return reconstruction and latent code


# Train with reconstruction loss
model = Autoencoder(784, [512, 256], 32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

# train_loader is assumed to be a DataLoader yielding (inputs, labels) batches
for epoch in range(50):
    for x_batch, _ in train_loader:  # Labels ignored!
        x_flat = x_batch.view(x_batch.size(0), -1)  # Flatten each sample to a vector
        x_recon, z = model(x_flat)
        loss = criterion(x_recon, x_flat)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Variational Autoencoder (VAE)
Regular autoencoders learn a deterministic compressed representation. VAEs learn a probabilistic latent space — each input maps to a distribution, not a point. This enables generation of new data by sampling from the latent distribution.
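To make "a distribution, not a point" concrete, here is a minimal sketch. The values of `mu` and `logvar` are made up to stand in for an encoder's output for a single input, and `toy_decoder` is a hypothetical, untrained stand-in for a trained decoder, so the generated vectors are meaningless. Only the mechanics are the point: one input yields different latent samples on each draw, and new data comes from decoding samples of the standard normal prior.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

latent_dim = 4

# Made-up encoder outputs for ONE input: a mean and a log-variance per latent dim
mu = torch.tensor([[0.5, -1.0, 0.0, 2.0]])
logvar = torch.tensor([[0.0, -2.0, -4.0, 0.5]])
std = torch.exp(0.5 * logvar)

# The same input maps to a distribution, so two draws give two different codes
z1 = mu + std * torch.randn_like(std)
z2 = mu + std * torch.randn_like(std)

# Hypothetical untrained decoder, used only to show the generation step
toy_decoder = nn.Sequential(nn.Linear(latent_dim, 784), nn.Sigmoid())

# Generation: sample z from the prior N(0, I) and decode it
with torch.no_grad():
    z_prior = torch.randn(8, latent_dim)
    samples = toy_decoder(z_prior)

print(z1, z2)
print(samples.shape)  # torch.Size([8, 784])
```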
class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # Mean of distribution
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # Log-variance
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid()
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Reparameterization trick: z = mu + eps * sigma (differentiable sampling)
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def decode(self, z):
        return self.decoder(z)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar


def vae_loss(recon_x, x, mu, logvar, beta=1.0):
    # Reconstruction loss + KL divergence
    recon_loss = nn.functional.binary_cross_entropy(recon_x, x, reduction='sum')
    kl_div = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + beta * kl_div

Anomaly Detection
Autoencoders trained on normal data have high reconstruction error on anomalies:
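# Train only on normal samples
model.train()
# ... training loop on normal data only ...

# Detect anomalies at inference
# X_test is assumed to be a float tensor of flattened samples, shape (N, input_dim)
model.eval()
with torch.no_grad():
    x_recon, z = model(X_test)
    recon_errors = ((X_test - x_recon) ** 2).mean(dim=1)  # One score per sample

# Samples with high reconstruction error = anomalies
# Note: a quantile of the test errors always flags 5% of the test set;
# the threshold can instead be set from errors on held-out normal data
threshold = recon_errors.quantile(0.95)  # Top 5% = anomalies
anomalies = recon_errors > threshold

Applications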
| Use Case | Autoencoder Type |
|---|---|
| Dimensionality reduction | Standard AE |
| Anomaly / fraud detection | Standard AE |
| Image denoising | Denoising AE |
| Data generation | VAE, Diffusion |
| Feature learning | Standard AE |
| Semi-supervised learning | Standard AE + fine-tune |
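
The denoising autoencoder listed in the table differs from the standard one mainly in its training step: the input is corrupted, but the loss is computed against the clean original. The sketch below illustrates that step under stated assumptions: the data is synthetic, and the layer sizes and the `noise_std` value are hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic "clean" data in [0, 1] with low-dimensional structure
# (an assumption, so the toy model has something learnable)
codes = torch.randn(256, 2)
mixing = torch.randn(2, 32)
x_clean = torch.sigmoid(codes @ mixing)

model = nn.Sequential(
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 4),                 # bottleneck
    nn.Linear(4, 16), nn.ReLU(),
    nn.Linear(16, 32), nn.Sigmoid(),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.MSELoss()
noise_std = 0.2  # hypothetical corruption level

losses = []
for step in range(300):
    # Corrupt the input with fresh Gaussian noise on every step
    x_noisy = x_clean + noise_std * torch.randn_like(x_clean)
    x_recon = model(x_noisy)
    # The target is the CLEAN input, not the noisy one
    loss = criterion(x_recon, x_clean)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

Because the network never sees the clean input at its own input layer, it cannot simply copy values through; it has to learn structure that survives the corruption.
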
Autoencoders are often the simplest path to anomaly detection in tabular and image data — they require no labels and the reconstruction error provides a natural, interpretable anomaly score.
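
As an end-to-end illustration of reconstruction error as an anomaly score, here is a minimal sketch on synthetic data. Everything in it is an assumption made for the demo: the "normal" points lie near a 2-D subspace of a 10-D space, the anomalies are isotropic Gaussian points that ignore that structure, and the model is a deliberately tiny linear autoencoder. Unlike a quantile taken over the test set, the threshold here comes from held-out normal data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

dim, latent_dim = 10, 2

# Synthetic "normal" data: points near a 2-D linear subspace of R^10
basis = torch.randn(latent_dim, dim)

def sample_normal(n):
    return torch.randn(n, latent_dim) @ basis + 0.05 * torch.randn(n, dim)

x_train = sample_normal(512)
x_val = sample_normal(256)            # held-out normal data for the threshold
x_normal_test = sample_normal(200)
x_anomalies = 2.0 * torch.randn(200, dim)  # points that ignore the subspace

# Tiny linear autoencoder with a 2-D bottleneck, trained on normal data only
model = nn.Sequential(nn.Linear(dim, latent_dim), nn.Linear(latent_dim, dim))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(1500):
    loss = ((model(x_train) - x_train) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

def recon_error(x):
    # Per-sample mean squared reconstruction error (the anomaly score)
    with torch.no_grad():
        return ((x - model(x)) ** 2).mean(dim=1)

# Threshold: 95th percentile of scores on held-out NORMAL data
threshold = recon_error(x_val).quantile(0.95)

normal_flagged = (recon_error(x_normal_test) > threshold).float().mean().item()
anomaly_flagged = (recon_error(x_anomalies) > threshold).float().mean().item()
print(f"flagged: {normal_flagged:.2f} of normal, {anomaly_flagged:.2f} of anomalies")
```

On real data the separation is rarely this clean, so the quantile used for the threshold is a tuning choice that trades false alarms against missed anomalies.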