Generative Adversarial Networks (GANs)

GANs learn to generate new samples that resemble the training data closely enough to be hard to tell apart from it. Two neural networks are trained against each other: a generator that creates fake data and a discriminator that tries to tell fake from real. When training succeeds, this adversarial setup can produce highly realistic synthetic outputs.

The Minimax Game

Generator (G): Takes random noise z → creates fake data G(z)
               Goal: fool the discriminator

Discriminator (D): Takes real or fake data → outputs probability P(real)
                   Goal: correctly classify real vs. fake

Minimax objective:
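min_G max_D V(D, G) = E_{x ~ p_data}[log D(x)] + E_{z ~ p_z}[log(1 - D(G(z)))]
where p_data is the real data distribution and p_z is the noise prior (e.g. a standard normal)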

D wants to maximize this (correctly classify both)
G wants to minimize this (fool D into saying G(z) is real)

At equilibrium, G samples from the true data distribution and D can do no better than chance (it outputs 0.5 everywhere).
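
The equilibrium claim can be checked numerically on a toy example. For a fixed generator, the discriminator that maximizes the objective is D*(x) = p_data(x) / (p_data(x) + p_g(x)), where p_g is the distribution of G(z). The sketch below evaluates this on a three-point distribution. The probability values and the helper names `optimal_discriminator` and `value` are made up for illustration.

```python
import math

def optimal_discriminator(p_data, p_g):
    """Pointwise maximizer of V for a fixed generator: p_data / (p_data + p_g)."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

def value(p_data, p_g, d):
    """V(D, G) = E_{x~p_data}[log D(x)] + E_{x~p_g}[log(1 - D(x))] on a finite support."""
    return sum(pd * math.log(dx) + pg * math.log(1.0 - dx)
               for pd, pg, dx in zip(p_data, p_g, d))

p_data = [0.5, 0.3, 0.2]     # toy "real" distribution (illustrative values)
p_g_bad = [0.2, 0.3, 0.5]    # generator that does not match the data
p_g_good = list(p_data)      # generator at equilibrium: p_g == p_data

d_bad = optimal_discriminator(p_data, p_g_bad)
d_good = optimal_discriminator(p_data, p_g_good)

v_bad = value(p_data, p_g_bad, d_bad)
v_good = value(p_data, p_g_good, d_good)

print(d_good)                  # every entry is 0.5
print(v_good, -math.log(4))    # the two numbers agree
print(v_bad > v_good)          # True: a mismatched G leaves D with a higher value
```

When p_g matches p_data, the best D outputs 0.5 everywhere and the objective equals -log 4, which is the lowest value G can force against an optimal D.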

Simple GAN in PyTorch

import torch
import torch.nn as nn

# Generator: noise → fake data
class Generator(nn.Module):
    def __init__(self, latent_dim=100, output_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(256),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(512),
            nn.Linear(512, output_dim),
            nn.Tanh()  # Output in [-1, 1]; normalize real data to match
        )
    def forward(self, z):
        return self.net(z)

# Discriminator: data → real/fake probability
class Discriminator(nn.Module):
    def __init__(self, input_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(256, 1),
            nn.Sigmoid()  # Probability: real
        )
    def forward(self, x):
        return self.net(x)

latent_dim = 100
G = Generator(latent_dim)
D = Discriminator()

criterion = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
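
The training loop in the next section expects `num_epochs` and `train_loader`, which this document does not define. The sketch below shows one way to set them up. It uses random tensors as placeholder data so that it runs without downloading anything. The epoch count, dataset size, and batch size are assumed values, not recommendations.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

num_epochs = 5  # assumed value; tune for your dataset

# Placeholder data: 256 fake "images" with pixel values in [0, 1].
# Replace with a real dataset (e.g. 28x28 grayscale digits).
images = torch.rand(256, 1, 28, 28)
labels = torch.zeros(256, dtype=torch.long)  # unused by the unconditional GAN

# Rescale from [0, 1] to [-1, 1] so real data matches the generator's Tanh range.
images = images * 2.0 - 1.0

# drop_last=True avoids a trailing batch of size 1, which BatchNorm1d
# cannot handle in training mode.
train_loader = DataLoader(TensorDataset(images, labels),
                          batch_size=64, shuffle=True, drop_last=True)

real_batch, _ = next(iter(train_loader))
print(real_batch.shape)                # torch.Size([64, 1, 28, 28])
print(real_batch.view(64, -1).shape)   # torch.Size([64, 784])
```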

Training Loop

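# num_epochs and train_loader are assumed to be defined elsewhere; train_loader
# should yield (data, label) batches with data scaled to [-1, 1] to match Tanh.
for epoch in range(num_epochs):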
    for real_data, _ in train_loader:
        batch_size = real_data.size(0)
        real_data = real_data.view(batch_size, -1)

        # Labels
        real_labels = torch.ones(batch_size, 1)
        fake_labels = torch.zeros(batch_size, 1)

        # ── Train Discriminator ──
        opt_D.zero_grad()
        z = torch.randn(batch_size, latent_dim)
        fake_data = G(z).detach()  # Stop gradients from flowing into G

        loss_D = (criterion(D(real_data), real_labels) +
                  criterion(D(fake_data), fake_labels)) / 2
        loss_D.backward()
        opt_D.step()

        # ── Train Generator ──
        opt_G.zero_grad()
        z = torch.randn(batch_size, latent_dim)
        fake_data = G(z)

        # Non-saturating loss: G maximizes log D(G(z)) by using "real" targets,
        # instead of minimizing log(1 - D(G(z))) from the minimax objective
        loss_G = criterion(D(fake_data), real_labels)
        loss_G.backward()
        opt_G.step()
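
The generator step above labels its fake batch as real, so it minimizes -log D(G(z)) rather than the log(1 - D(G(z))) term from the minimax objective. The sketch below compares the gradients of the two losses with respect to D's pre-sigmoid logit, which makes the reason visible. The helper names are illustrative, and the derivatives are worked out by hand in the docstrings.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating(a):
    """d/da log(1 - sigmoid(a)) = -sigmoid(a)  (original minimax generator term)."""
    return -sigmoid(a)

def grad_non_saturating(a):
    """d/da [-log sigmoid(a)] = -(1 - sigmoid(a))  (BCE against 'real' targets)."""
    return -(1.0 - sigmoid(a))

# a is D's pre-sigmoid logit on a fake sample; very negative means "confidently fake".
for a in (-8.0, -4.0, 0.0):
    print(a, grad_saturating(a), grad_non_saturating(a))
```

When D confidently rejects a sample (very negative logit), the minimax gradient is close to 0 while the non-saturating gradient stays close to -1. This is why the non-saturating form is the usual choice early in training, when G's samples are easy to reject.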

Common Problems

Mode collapse: Generator produces only a few types of outputs, ignoring diversity.

Fix: Minibatch discrimination, Wasserstein GAN (WGAN), spectral normalization
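
Of the fixes listed, spectral normalization is the easiest to show in a few lines. The sketch below wraps each linear layer of an MLP discriminator in `torch.nn.utils.spectral_norm`, which rescales the layer's weight by an estimate of its largest singular value. The class name `SNDiscriminator` is hypothetical. Dropout is left out here for brevity, which is a simplification rather than a requirement.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class SNDiscriminator(nn.Module):
    """Hypothetical variant of the MLP discriminator with spectral normalization."""
    def __init__(self, input_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            spectral_norm(nn.Linear(input_dim, 512)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Linear(512, 256)),
            nn.LeakyReLU(0.2),
            spectral_norm(nn.Linear(256, 1)),
            nn.Sigmoid(),  # probability that the input is real
        )

    def forward(self, x):
        return self.net(x)

D_sn = SNDiscriminator()
scores = D_sn(torch.randn(16, 784))
print(scores.shape)  # torch.Size([16, 1])
```

Recent PyTorch releases also provide `torch.nn.utils.parametrizations.spectral_norm`. Which of the two to prefer depends on the installed version.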

Training instability: If D becomes too strong, G’s gradients vanish; if D is too weak, its feedback carries little signal and G doesn’t improve.

Fix: Balance how often D and G are updated, and use one-sided label smoothing (target 0.9 instead of 1.0 for real labels; fake labels stay at 0)
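
A minimal sketch of label smoothing with `nn.BCELoss` follows. The discriminator outputs are hard-coded stand-in values chosen to show the effect. In the training loop, only the definition of `real_labels` would change.

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()
batch_size = 8

# One-sided smoothing: soften only the "real" targets; fake targets stay at 0.
real_labels = torch.full((batch_size, 1), 0.9)
fake_labels = torch.zeros(batch_size, 1)

# Stand-in discriminator outputs (probabilities), for illustration only.
d_real = torch.full((batch_size, 1), 0.99)  # D is very confident on real data
d_fake = torch.full((batch_size, 1), 0.05)

loss_smoothed = criterion(d_real, real_labels)
loss_hard = criterion(d_real, torch.ones(batch_size, 1))
loss_D = (loss_smoothed + criterion(d_fake, fake_labels)) / 2

print(loss_smoothed.item(), loss_hard.item())
```

With smoothed targets, an output of 0.99 on real data incurs a larger loss than it does with hard targets. This discourages D from becoming overconfident.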

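Vanishing gradients: BCE loss saturates when D is confident, so G receives almost no gradient.

Fix: WGAN replaces the BCE objective with an estimate of the Wasserstein distance, computed by a critic that has no sigmoid output and is kept approximately Lipschitz (by weight clipping or a gradient penalty). This gives G useful gradients even when the critic separates real from fake well.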
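
One critic update in the weight-clipping variant of WGAN might look like the sketch below. It uses random stand-in batches and a small critic defined inline. RMSprop with lr=5e-5 and a clipping bound of 0.01 follow the commonly cited defaults from the original WGAN recipe; treat them as starting points.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Critic: like the discriminator but with NO final Sigmoid; it outputs an unbounded score.
critic = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_C = torch.optim.RMSprop(critic.parameters(), lr=5e-5)
clip_value = 0.01  # weight-clipping bound

real_data = torch.randn(32, 784)  # stand-in batch for illustration
fake_data = torch.randn(32, 784)  # would be G(z).detach() in a real loop

# Critic step: maximize E[C(real)] - E[C(fake)], i.e. minimize its negative.
opt_C.zero_grad()
loss_C = -(critic(real_data).mean() - critic(fake_data).mean())
loss_C.backward()
opt_C.step()

# Crude Lipschitz constraint: clamp every parameter into [-clip_value, clip_value].
with torch.no_grad():
    for p in critic.parameters():
        p.clamp_(-clip_value, clip_value)

# Generator step (not run here): loss_G = -critic(G(z)).mean()
```

In practice the critic is usually updated several times per generator step, and a gradient penalty (WGAN-GP) is often preferred over clipping.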

DCGAN (Deep Convolutional GAN)

For image generation, use convolutional layers and these architecture guidelines:

Generator: Transposed convolutions for upsampling, BatchNorm, ReLU (Tanh on output)
Discriminator: Strided convolutions for downsampling, LeakyReLU (0.2), no MaxPool, no BatchNorm on first layer
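
The guidelines above can be sketched for 32x32 single-channel images as follows. The channel widths, image size, and class names `DCGANGenerator` and `DCGANDiscriminator` are assumptions chosen to keep the example small. They are not the exact configuration from the DCGAN paper.

```python
import torch
import torch.nn as nn

class DCGANGenerator(nn.Module):
    def __init__(self, latent_dim=100, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(latent_dim, 256, 4, 1, 0, bias=False),  # 1x1 -> 4x4
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),         # 4x4 -> 8x8
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),          # 8x8 -> 16x16
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, channels, 4, 2, 1, bias=False),     # 16x16 -> 32x32
            nn.Tanh(),  # output in [-1, 1]
        )

    def forward(self, z):
        # Reshape (N, latent_dim) noise into a (N, latent_dim, 1, 1) feature map.
        return self.net(z.view(z.size(0), -1, 1, 1))

class DCGANDiscriminator(nn.Module):
    def __init__(self, channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 4, 2, 1, bias=False),  # 32 -> 16, no BatchNorm here
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, 2, 1, bias=False),       # 16 -> 8
            nn.BatchNorm2d(128), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 256, 4, 2, 1, bias=False),      # 8 -> 4
            nn.BatchNorm2d(256), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(256, 1, 4, 1, 0, bias=False),        # 4 -> 1
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).view(-1, 1)

G_conv = DCGANGenerator()
D_conv = DCGANDiscriminator()
fake_images = G_conv(torch.randn(8, 100))
print(fake_images.shape)          # torch.Size([8, 1, 32, 32])
print(D_conv(fake_images).shape)  # torch.Size([8, 1])
```

Strided and transposed convolutions do all of the resizing, so there are no pooling layers. To use this with 28x28 data, the images would need to be resized or padded to 32x32.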

Conditional GAN (cGAN)

Control what the generator produces by conditioning on class labels:

# Both G and D receive class label embedding as additional input
class ConditionalGenerator(nn.Module):
    def __init__(self, latent_dim, num_classes, output_dim):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, num_classes)
        self.net = nn.Sequential(
            nn.Linear(latent_dim + num_classes, 256),
            # ... rest of generator ...
        )

    def forward(self, z, labels):
        label_embed = self.label_embed(labels)
        x = torch.cat([z, label_embed], dim=1)
        return self.net(x)
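
The discriminator side is not shown above. A matching sketch, under the same embed-and-concatenate assumption, could look like this. The class name `ConditionalDiscriminator` and the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class ConditionalDiscriminator(nn.Module):
    def __init__(self, input_dim, num_classes):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, num_classes)
        self.net = nn.Sequential(
            nn.Linear(input_dim + num_classes, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),  # probability that (x, label) is a real pair
        )

    def forward(self, x, labels):
        label_embed = self.label_embed(labels)
        return self.net(torch.cat([x, label_embed], dim=1))

num_classes, input_dim = 10, 784
D_cond = ConditionalDiscriminator(input_dim, num_classes)
x = torch.randn(4, input_dim)
labels = torch.randint(0, num_classes, (4,))  # integer class ids, shape (4,)
print(D_cond(x, labels).shape)  # torch.Size([4, 1])
```

During training, D sees real data with its true labels and generated data with the labels that were fed to G. It can therefore penalize samples that look realistic but do not match the requested class.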

GANs have largely been superseded by diffusion models for high-quality image generation (Stable Diffusion, DALL·E, Midjourney), but they remain important for understanding generative modeling and are still used in data augmentation, medical imaging, and controlled generation tasks.