Generative Adversarial Networks (GANs)
GANs learn to generate new data that is, ideally, indistinguishable from real data. Two neural networks compete against each other: a generator that creates fake data and a discriminator that tries to tell fake from real. This adversarial training can produce remarkably realistic synthetic outputs.
The Minimax Game
Generator (G): Takes random noise z → creates fake data G(z). Goal: fool the discriminator.
Discriminator (D): Takes real or fake data → outputs probability P(real). Goal: correctly classify real vs. fake.
Minimax objective:
min_G max_D [ E[log D(x)] + E[log(1 - D(G(z)))] ]
- D wants to maximize this (correctly classify both)
- G wants to minimize this (fool D into saying G(z) is real)
At equilibrium: G generates data from the true data distribution, and D can’t do better than random (outputs 0.5 everywhere).
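To make the objective concrete, here is a minimal numeric sketch that estimates the bracketed value from a handful of discriminator outputs. The helper name `gan_value` and the sample probabilities are illustrative assumptions, not part of any library.

```python
import math

def gan_value(d_real, d_fake):
    """Sample estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: D's outputs on real samples (hypothetical values)
    d_fake: D's outputs on generated samples (hypothetical values)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A confident, correct D pushes the value toward its maximum of 0.
confident = gan_value([0.9, 0.9], [0.1, 0.1])      # 2 * log(0.9)

# At equilibrium D outputs 0.5 everywhere, so the value is 2 * log(0.5) = -log(4).
equilibrium = gan_value([0.5, 0.5], [0.5, 0.5])

print(round(confident, 4), round(equilibrium, 4))  # -0.2107 -1.3863
```

The gap between the two numbers is what D gains by classifying well, and what G removes by making its samples harder to tell apart from real ones.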
Simple GAN in PyTorch
```python
import torch
import torch.nn as nn

# Generator: noise → fake data
class Generator(nn.Module):
    def __init__(self, latent_dim=100, output_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(256),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(512),
            nn.Linear(512, output_dim),
            nn.Tanh()  # Output in [-1, 1]; normalize real data to match
        )

    def forward(self, z):
        return self.net(z)

# Discriminator: data → real/fake probability
class Discriminator(nn.Module):
    def __init__(self, input_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(256, 1),
            nn.Sigmoid()  # Probability: real
        )

    def forward(self, x):
        return self.net(x)

latent_dim = 100
G = Generator(latent_dim)
D = Discriminator()

criterion = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
```
Training Loop
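The loop relies on two names that are not defined in this section, `num_epochs` and `train_loader`. As a stand-in for trying the code out, the sketch below builds a loader over random tensors; the dataset size, batch size, and epoch count are arbitrary assumptions, and a real run would use an image dataset (such as flattened 28×28 digits) scaled to [-1, 1] instead.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

num_epochs = 5  # assumption: small value for a quick smoke test

# Placeholder "images": 256 vectors of 784 values, rescaled from [0, 1) to [-1, 1)
# so that they match the range of the generator's Tanh output.
placeholder_images = torch.rand(256, 784) * 2 - 1
placeholder_labels = torch.zeros(256, dtype=torch.long)  # ignored by the GAN loop

train_loader = DataLoader(
    TensorDataset(placeholder_images, placeholder_labels),
    batch_size=64,
    shuffle=True,
)
```

Each batch comes out as a `(data, label)` pair, which is the shape the loop unpacks.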
```python
# Assumes num_epochs and train_loader are defined elsewhere; train_loader should
# yield (data, label) batches with data scaled to [-1, 1] to match the Tanh output.
for epoch in range(num_epochs):
    for real_data, _ in train_loader:
        batch_size = real_data.size(0)
        real_data = real_data.view(batch_size, -1)  # Flatten each sample to a vector

        # Labels
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # ── Train Discriminator ──
        opt_D.zero_grad()
        z = torch.randn(batch_size, latent_dim)
        fake_data = G(z).detach()  # Stop gradients from flowing into G

        loss_D = (criterion(D(real_data), real_labels) +
                  criterion(D(fake_data), fake_labels)) / 2
        loss_D.backward()
        opt_D.step()

        # ── Train Generator ──
        opt_G.zero_grad()
        z = torch.randn(batch_size, latent_dim)
        fake_data = G(z)

        # G wants D to classify its output as real
        loss_G = criterion(D(fake_data), real_labels)
        loss_G.backward()
        opt_G.step()
```
Common Problems
Mode collapse: Generator produces only a few types of outputs, ignoring diversity.
- Fix: Minibatch discrimination, Wasserstein GAN (WGAN), spectral normalization
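Of these fixes, spectral normalization is the smallest code change. The sketch below wraps each linear layer of a discriminator shaped like the one defined earlier with `torch.nn.utils.spectral_norm`; the name `D_sn` and the choice to drop the Dropout layers are assumptions made to keep the example short.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Each wrapped layer has its weight rescaled by an estimate of its largest
# singular value, which limits how sharply D's output can change with its input.
D_sn = nn.Sequential(
    spectral_norm(nn.Linear(784, 512)),
    nn.LeakyReLU(0.2),
    spectral_norm(nn.Linear(512, 256)),
    nn.LeakyReLU(0.2),
    spectral_norm(nn.Linear(256, 1)),
    nn.Sigmoid(),
)

scores = D_sn(torch.randn(16, 784))  # One probability per sample, shape (16, 1)
```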
Training instability: D becomes too strong → G gradients vanish; D too weak → G doesn’t improve.
- Fix: Balance D/G update frequency, use label smoothing (0.9 instead of 1.0 for real labels)
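Label smoothing only changes the targets passed to the loss in the discriminator step. A minimal sketch, assuming one-sided smoothing (real targets become 0.9, fake targets stay at 0); the helper `real_targets` and the sample discriminator outputs are hypothetical.

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()

def real_targets(batch_size, smooth=0.9):
    """Targets for real samples in the discriminator step: 0.9 instead of 1.0."""
    return torch.full((batch_size, 1), smooth)

# Hypothetical discriminator outputs on a batch of four real samples
d_out_real = torch.tensor([[0.99], [0.95], [0.90], [0.60]])

loss_hard = criterion(d_out_real, torch.ones(4, 1))   # Targets of 1.0
loss_smooth = criterion(d_out_real, real_targets(4))  # Targets of 0.9
```

With smoothed targets the loss is lowest when D outputs 0.9 rather than 1.0, so very confident predictions such as 0.99 are penalized instead of rewarded.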
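Vanishing gradients: The generator term log(1 - D(G(z))) in the minimax objective saturates when D confidently rejects fake samples, so G receives almost no gradient. Training G against real labels, as the loop above does, reduces this problem but does not remove it.
- Fix: WGAN replaces the BCE loss with an estimate of the Wasserstein distance, which provides meaningful gradients even when D separates real from fake easily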
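A rough sketch of the WGAN losses with weight clipping follows. The network sizes and the names `critic`, `critic_step`, and `generator_loss` are assumptions for illustration; the RMSprop learning rate and clip value follow the defaults reported for the original WGAN. The discriminator becomes a "critic" with no Sigmoid, so it outputs an unbounded score rather than a probability.

```python
import torch
import torch.nn as nn

# Critic: same role as D, but the final Sigmoid is removed
critic = nn.Sequential(
    nn.Linear(784, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)
opt_C = torch.optim.RMSprop(critic.parameters(), lr=5e-5)

def critic_step(real_data, fake_data, clip_value=0.01):
    """One critic update: raise scores on real data, lower them on fake data."""
    opt_C.zero_grad()
    # The critic maximizes mean(score_real) - mean(score_fake),
    # so we minimize the negative of that difference.
    loss_C = -(critic(real_data).mean() - critic(fake_data.detach()).mean())
    loss_C.backward()
    opt_C.step()
    # Weight clipping: a crude way to keep the critic approximately Lipschitz
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-clip_value, clip_value)
    return loss_C.item()

def generator_loss(fake_data):
    """The generator tries to raise the critic's score on its samples."""
    return -critic(fake_data).mean()
```

In practice the critic is usually updated several times per generator update, which is one way of balancing the two networks.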
DCGAN (Deep Convolutional GAN)
For image generation, use convolutional layers and these architecture guidelines:
- Generator: Transposed convolutions for upsampling, BatchNorm, ReLU (Tanh on output)
- Discriminator: Strided convolutions for downsampling, LeakyReLU (0.2), no MaxPool, no BatchNorm on first layer
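These guidelines can be sketched as a small pair of networks. The version below assumes 32×32 single-channel images and modest channel widths; the class names `DCGenerator` and `DCDiscriminator` and every layer size are illustrative choices, not a reference implementation.

```python
import torch
import torch.nn as nn

class DCGenerator(nn.Module):
    def __init__(self, latent_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            # (latent_dim, 1, 1) -> (128, 4, 4)
            nn.ConvTranspose2d(latent_dim, 128, 4, 1, 0, bias=False),
            nn.BatchNorm2d(128), nn.ReLU(True),
            # (128, 4, 4) -> (64, 8, 8)
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64), nn.ReLU(True),
            # (64, 8, 8) -> (32, 16, 16)
            nn.ConvTranspose2d(64, 32, 4, 2, 1, bias=False),
            nn.BatchNorm2d(32), nn.ReLU(True),
            # (32, 16, 16) -> (1, 32, 32), Tanh on the output
            nn.ConvTranspose2d(32, 1, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

class DCDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # (1, 32, 32) -> (32, 16, 16); no BatchNorm on the first layer
            nn.Conv2d(1, 32, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # (32, 16, 16) -> (64, 8, 8); strided conv instead of MaxPool
            nn.Conv2d(32, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64), nn.LeakyReLU(0.2, inplace=True),
            # (64, 8, 8) -> (128, 4, 4)
            nn.Conv2d(64, 128, 4, 2, 1, bias=False),
            nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            # (128, 4, 4) -> (1, 1, 1), then flatten to one probability per image
            nn.Conv2d(128, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
            nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)

# Noise is shaped (batch, latent_dim, 1, 1) so transposed convolutions can upsample it
z = torch.randn(8, 100, 1, 1)
images = DCGenerator()(z)          # Shape (8, 1, 32, 32), values in [-1, 1]
probs = DCDiscriminator()(images)  # Shape (8, 1), values in [0, 1]
```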
Conditional GAN (cGAN)
Control what the generator produces by conditioning on class labels:
```python
# Both G and D receive class label embedding as additional input
class ConditionalGenerator(nn.Module):
    def __init__(self, latent_dim, num_classes, output_dim):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, num_classes)
        self.net = nn.Sequential(
            nn.Linear(latent_dim + num_classes, 256),
            # ... rest of generator ...
        )

    def forward(self, z, labels):
        label_embed = self.label_embed(labels)
        x = torch.cat([z, label_embed], dim=1)  # Concatenate noise and label embedding
        return self.net(x)
```
GANs have largely been superseded by diffusion models for high-quality image generation (Stable Diffusion, DALL·E, Midjourney), but they remain important for understanding generative modeling and are still used in data augmentation, medical imaging, and controlled generation tasks.