Azure Kubernetes Service: Managed Kubernetes With Deep Azure Integration

Kubernetes is the de facto standard for orchestrating containerised workloads at scale. It handles scheduling, health checks, rolling updates, and service discovery, but running the control plane yourself means managing etcd, API server certificates, and upgrade procedures. Azure Kubernetes Service (AKS) removes that responsibility: Microsoft runs and patches the control plane, which carries no charge in the Free tier (the Standard tier adds a per-cluster hourly fee in exchange for an uptime SLA), and you pay for the agent (worker) nodes.

AKS integrates tightly with the Azure ecosystem: Azure Container Registry for image pulls, Azure Active Directory (now Microsoft Entra ID) for user and workload identity, Azure CNI for VNet-native pod networking, Azure Monitor and Prometheus for observability, and Azure Policy for governance.

Real-World Scenario

A fintech company runs twenty microservices: payment processing, fraud detection, user auth, reporting, and several data pipelines. The services have different scaling profiles: fraud detection needs GPU nodes during ML batch scoring, while the reporting service is nearly idle overnight. AKS node pools let the team define a CPU node pool for most services, a separate GPU pool for ML, and a spot node pool for batch jobs, all within one cluster managed by a single control plane.

Architecture Overview

Azure Active Directory (user + Workload Identity)
              |
     AKS Control Plane (managed by Microsoft)
     [API Server | etcd | Scheduler | Controller Manager]
              |
     +---------+---------+
     |                   |
 System Node Pool    User Node Pool(s)
 (kube-system)       (your workloads)
     |                   |
 [Standard_D4s_v5]   [Standard_D8s_v5]  [Standard_NC6s_v3 GPU]
                          |
                 Azure CNI (pods get VNet IPs)
                          |
          +---------------+---------------+
          |               |               |
   Azure SQL DB    Azure Service Bus   Azure Storage
   (Private EP)   (Private EP)        (Private EP)

Node Pools

A node pool is a group of VMs with identical configuration. AKS requires at least one system pool that runs Kubernetes system pods. Additional user pools host your application workloads.

Key node pool settings:

VM size: Different pools can use different sizes. GPU pools use N-series VMs; burst workloads can use B-series.
Node count / autoscaler: Each pool has a min and max node count. The Cluster Autoscaler adds nodes when pods are unschedulable and removes idle nodes.
OS disk type: Ephemeral OS disks (placed on VM cache or temp disk) give faster node provision times and no separate disk cost.
Taints and labels: Taint a GPU pool with sku=gpu:NoSchedule so only pods that tolerate it land there.
Spot node pools: Use Azure Spot VMs for up to 90% savings on batch and stateless workloads that tolerate eviction.

# Add a spot node pool for batch jobs
az aks nodepool add \
  --resource-group prod-rg \
  --cluster-name my-aks \
  --name batchpool \
  --node-count 0 \
  --min-count 0 \
  --max-count 20 \
  --enable-cluster-autoscaler \
  --priority Spot \
  --spot-max-price -1 \
  --node-vm-size Standard_D4s_v5 \
  --node-taints "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
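A workload only lands on that pool if it tolerates the spot taint. The manifest below is a minimal sketch of a batch Job that does so; the Job name, image, and command are hypothetical placeholders. It assumes AKS labels spot nodes with kubernetes.azure.com/scalesetpriority=spot, which the nodeSelector uses to keep the Job off regular nodes.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report-batch        # hypothetical name
spec:
  backoffLimit: 4                   # retry if a spot eviction kills the pod
  template:
    spec:
      restartPolicy: OnFailure
      # Tolerate the taint set on the spot pool
      tolerations:
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule
      # Pin the Job to spot nodes only (assumes the AKS spot node label)
      nodeSelector:
        kubernetes.azure.com/scalesetpriority: spot
      containers:
        - name: worker
          image: myregistry.azurecr.io/batch-worker:1.0.0   # placeholder image
          command: ["/app/run-batch"]                       # placeholder command
```

The toleration alone only permits scheduling on spot nodes; the nodeSelector is what forces it. Dropping the nodeSelector lets the Job fall back to regular nodes when no spot capacity is available.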

Networking: Kubenet vs. Azure CNI

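AKS supports three main networking options:

Kubenet (legacy; historically the default)
  Pods get private IPs from a separate range
  Pod-to-pod traffic across nodes is forwarded through a route table (UDRs) on the node subnet
  Traffic from pods to other VNet resources is NATed to the node's IP
  Simple setup, but pods are not directly reachable from the VNet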

Azure CNI
  Every pod gets an IP from the VNet subnet
  Direct routing — no NAT between pods and VNet resources
  Typically chosen for:
    - Private Endpoints (pods calling private Azure services)
    - Services that check source IP (firewalls, NSGs)
    - AKS cluster in a subnet with other non-Kubernetes resources

Azure CNI Overlay (newer)
  Pods get IPs from a separate CIDR (not consuming VNet IPs)
  Pod-to-pod routing is handled by the Azure network stack (no route tables to manage); traffic leaving the cluster is SNATed to the node IP
  Supports large pod counts without exhausting VNet IP space

For production workloads with Private Endpoints, Azure CNI or Azure CNI Overlay is the right choice.
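Because classic Azure CNI gives every pod a VNet IP, the subnet has to be sized up front. The worked example below is a rough sketch using the commonly documented sizing rule for this mode: one address per node plus one per pod slot, with one extra node reserved for upgrade surge. The figures are assumptions chosen for illustration: N = 50 nodes and P = 30 pods per node (the usual Azure CNI default for max pods).

```latex
\begin{aligned}
\text{IPs needed} &= (N + 1)\,(P + 1) \\
                  &= (50 + 1)(30 + 1) \\
                  &= 51 \times 31 = 1581
\end{aligned}
```

A /22 subnet (1024 addresses) is too small for this cluster, while a /21 (2048 addresses, minus the 5 that Azure reserves in each subnet) fits. Azure CNI Overlay avoids this calculation because only nodes consume VNet addresses.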

Workload Identity

The old approach to giving pods access to Azure services was to mount a service principal secret as a Kubernetes secret. Workload Identity replaces this with federated identity:

Workload Identity Flow
-----------------------
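1. Create an Azure managed identity (no secret to rotate)
2. Add a federated identity credential to it that trusts the cluster's OIDC issuer and the ServiceAccount's namespace and name
3. Create a Kubernetes ServiceAccount annotated with the identity's client ID
4. Label pods with azure.workload.identity/use: "true"; AKS then mutates them to inject environment variables pointing to a projected token file
5. Pod exchanges the Kubernetes OIDC token for an Azure AD token via federation
6. Pod calls the Azure SDK, whose credential picks up the token automatically
Result: no secrets stored in the cluster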

apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-service-sa
  namespace: payments
  annotations:
    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"
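A pod has to opt in before the token is injected. The Deployment below is a minimal sketch of a workload using the ServiceAccount above; the Deployment name, app label, and image are hypothetical. The two parts that matter are the azure.workload.identity/use label on the pod template and the serviceAccountName reference.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service              # hypothetical name
  namespace: payments
spec:
  replicas: 2
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
        # Opts the pod in to token injection by the workload identity webhook
        azure.workload.identity/use: "true"
    spec:
      # Must match the annotated ServiceAccount shown above
      serviceAccountName: payment-service-sa
      containers:
        - name: app
          image: myregistry.azurecr.io/payment-service:1.0.0   # placeholder image
```

This only works if the managed identity also carries a federated credential whose subject is system:serviceaccount:payments:payment-service-sa. Without it the token exchange is rejected.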

Cluster Autoscaler and KEDA

Cluster Autoscaler adds or removes nodes based on pending pods and node utilisation.

KEDA (Kubernetes Event-Driven Autoscaling) scales deployments to zero and back based on external event sources such as Azure Service Bus queue depth, Blob Storage blob count, or Prometheus metrics. AKS offers KEDA as a managed add-on that can be enabled on an existing cluster:

az aks update \
  --resource-group prod-rg \
  --name my-aks \
  --enable-keda

Combined: KEDA scales pods based on queue depth, Cluster Autoscaler adds nodes when those pods cannot be scheduled, and it removes the idle nodes once the queue drains and KEDA scales the pods back down.
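As an illustration of the queue-depth case, the sketch below pairs a KEDA ScaledObject with a TriggerAuthentication that uses Workload Identity. All names here are hypothetical: the fraud-scorer Deployment, the scoring-requests queue, the Service Bus namespace, and the replica limits. It also assumes the identity KEDA uses has been granted read access to the queue.

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: servicebus-auth              # hypothetical name
  namespace: payments
spec:
  podIdentity:
    provider: azure-workload         # authenticate via Workload Identity, no connection string
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: fraud-scorer-scaler          # hypothetical name
  namespace: payments
spec:
  scaleTargetRef:
    name: fraud-scorer               # hypothetical Deployment to scale
  minReplicaCount: 0                 # scale to zero when the queue is empty
  maxReplicaCount: 30
  triggers:
    - type: azure-servicebus
      metadata:
        namespace: my-servicebus-ns  # placeholder Service Bus namespace
        queueName: scoring-requests  # placeholder queue
        messageCount: "20"           # target messages per replica
      authenticationRef:
        name: servicebus-auth
```

With a target of 20 messages per replica, a backlog of 200 messages asks for roughly 10 replicas. If those do not fit on the current nodes, the pending pods are what prompt Cluster Autoscaler to add capacity.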

Upgrades

AKS releases Kubernetes minor versions and patches on a rolling schedule. Microsoft supports the three most recent generally available minor versions (N, N-1, and N-2). Clusters on an unsupported version are out of support and lose SLA coverage.

# Check available upgrade versions
az aks get-upgrades --resource-group prod-rg --name my-aks

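# Upgrade the control plane and node pools (nodes are cordoned, drained, replaced)
az aks upgrade \
  --resource-group prod-rg \
  --name my-aks \
  --kubernetes-version 1.30.0

# Alternatively, refresh only the node OS image and keep the Kubernetes version
az aks upgrade \
  --resource-group prod-rg \
  --name my-aks \
  --node-image-only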

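Auto-upgrade channels: patch, stable, rapid, node-image. Setting the channel to patch (az aks update --auto-upgrade-channel patch) automatically applies patch releases within the current minor version, so you do not have to schedule upgrade jobs yourself.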

Key Interview Points

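Control plane cost: The AKS control plane is free in the Free tier; the Standard tier charges a per-cluster hourly fee and adds an uptime SLA. In either tier you pay for agent nodes (VMs), load balancers, managed disks, and egress.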
System pool requirement: AKS requires at least one system node pool with at least one node. You cannot delete the last system pool.
Kubenet vs. Azure CNI: Kubenet uses fewer IP addresses but pods are not first-class VNet citizens. Azure CNI gives pods VNet IPs but requires subnet IP planning.
Cluster Autoscaler vs. HPA: HPA (Horizontal Pod Autoscaler) adds pod replicas. Cluster Autoscaler adds nodes. They work together — HPA scales pods, and if nodes fill up, Cluster Autoscaler adds capacity.
RBAC and Azure AD integration: AKS supports native Kubernetes RBAC (Roles and RoleBindings for users, groups, and ServiceAccounts) and Azure AD-integrated RBAC, where bindings reference Azure AD groups and users.
Private cluster: Setting --enable-private-cluster gives the API server a private IP only, so it is reachable from the cluster VNet and from networks connected to it (peering, VPN, ExpressRoute). Useful for regulated environments where the control plane endpoint must not be publicly accessible.
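To make the Cluster Autoscaler vs. HPA point concrete, here is a minimal HPA sketch using the autoscaling/v2 API. The reporting Deployment name, the replica bounds, and the 70% CPU target are hypothetical values chosen for illustration.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: reporting-hpa                # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: reporting                  # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # percent of the containers' CPU requests
```

The utilisation target is measured against CPU requests, so the HPA only works when the pods declare them. When the extra replicas no longer fit on existing nodes, they go Pending and Cluster Autoscaler adds a node.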

Best Practices

Separate system and user node pools so a noisy workload cannot starve Kubernetes system components.
Use Workload Identity instead of service principal secrets for pod-to-Azure-service authentication.
Enable the Azure Policy add-on to enforce pod security standards (no privileged containers, required resource limits) at admission time.
Store container images in Azure Container Registry with geo-replication enabled so AKS nodes in multiple regions can pull images without crossing continents.
Set resource requests and limits on every container — the scheduler cannot make good placement decisions without them.
Use PodDisruptionBudgets on critical workloads to prevent node drain from removing too many replicas simultaneously.
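The last two practices can be sketched together. The manifests below show a Deployment whose container declares requests and limits, plus a PodDisruptionBudget that keeps at least two replicas running during a node drain. The user-auth name, the image, and the resource figures are hypothetical and should be tuned from observed usage.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-auth                    # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-auth
  template:
    metadata:
      labels:
        app: user-auth
    spec:
      containers:
        - name: app
          image: myregistry.azurecr.io/user-auth:1.0.0   # placeholder image
          resources:
            requests:                # what the scheduler reserves on a node
              cpu: 250m
              memory: 256Mi
            limits:                  # hard ceiling enforced at runtime
              cpu: "1"
              memory: 512Mi
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: user-auth-pdb                # hypothetical name
spec:
  minAvailable: 2                    # a drain may evict at most one of the three replicas at a time
  selector:
    matchLabels:
      app: user-auth
```

During an AKS upgrade the drain waits until evicting a pod would still leave two ready replicas. A budget that can never be satisfied (for example minAvailable equal to the replica count) blocks the drain instead of protecting the workload.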