Azure Kubernetes Service: Managed Kubernetes With Deep Azure Integration
Kubernetes is the de-facto standard for orchestrating containerised workloads at scale. It handles scheduling, health checks, rolling updates, and service discovery — but running the control plane yourself means managing etcd, API server certificates, and upgrade procedures. Azure Kubernetes Service (AKS) removes that responsibility: Microsoft runs and patches the control plane at no extra charge, and you pay only for the agent (worker) nodes.
AKS integrates tightly with the Azure ecosystem — Azure Container Registry for image pulls, Azure Active Directory for user and workload identity, Azure CNI for VNet-native pod networking, Azure Monitor and Prometheus for observability, and Azure Policy for governance.
Real-World Scenario
A fintech company runs twenty microservices: payment processing, fraud detection, user auth, reporting, and several data pipelines. Each service has its own scaling profile: fraud detection needs GPU nodes during ML batch scoring, while the reporting service is nearly idle overnight. AKS node pools let the team define a CPU node pool for most services, a separate GPU pool for ML, and a spot node pool for batch jobs, all within one cluster managed by a single control plane.
Architecture Overview
                 Azure Active Directory
               (user + Workload Identity)
                          |
        AKS Control Plane (managed by Microsoft)
     [API Server | etcd | Scheduler | Controller Manager]
                          |
                +---------+---------+
                |                   |
        System Node Pool     User Node Pool(s)
          (kube-system)       (your workloads)
                |                   |
        [Standard_D4s_v5]    [Standard_D8s_v5]
                             [Standard_NC6s_v3 GPU]
                          |
            Azure CNI (pods get VNet IPs)
                          |
          +---------------+---------------+
          |               |               |
     Azure SQL DB  Azure Service Bus  Azure Storage
     (Private EP)    (Private EP)     (Private EP)

Node Pools
A node pool is a group of VMs with identical configuration. AKS requires at least one system pool that runs Kubernetes system pods. Additional user pools host your application workloads.
Key node pool settings:
- VM size: Different pools can use different sizes. GPU pools use N-series VMs; burst workloads can use B-series.
- Node count / autoscaler: Each pool has a min and max node count. The Cluster Autoscaler adds nodes when pods are unschedulable and removes idle nodes.
- OS disk type: Ephemeral OS disks (placed on VM cache or temp disk) give faster node provision times and no separate disk cost.
- Taints and labels: Taint a GPU pool with sku=gpu:NoSchedule so only pods that tolerate it land there.
- Spot node pools: Use Azure Spot VMs for up to 90% savings on batch and stateless workloads that tolerate eviction.
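As an illustration of the taint and label settings above, a pod meant for the GPU pool might look roughly like the sketch below. The pod name, the pool name gpupool, and the image are hypothetical, and the nvidia.com/gpu resource assumes the NVIDIA device plugin is running on the GPU nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fraud-batch-scorer            # hypothetical name
spec:
  # Tolerate the sku=gpu:NoSchedule taint so the scheduler may place
  # this pod on the tainted GPU nodes.
  tolerations:
    - key: sku
      operator: Equal
      value: gpu
      effect: NoSchedule
  # A toleration only permits placement; the node selector is what pins
  # the pod to the GPU pool. AKS labels each node with its pool name.
  nodeSelector:
    kubernetes.azure.com/agentpool: gpupool   # hypothetical pool name
  containers:
    - name: scorer
      image: <ACR_NAME>.azurecr.io/fraud-scorer:1.0.0   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1           # assumes the NVIDIA device plugin is installed
```

Pods without the toleration stay off the GPU nodes, which keeps expensive capacity free for the workloads that need it.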
    # Add a spot node pool for batch jobs
    az aks nodepool add \
      --resource-group prod-rg \
      --cluster-name my-aks \
      --name batchpool \
      --node-count 0 \
      --min-count 0 \
      --max-count 20 \
      --enable-cluster-autoscaler \
      --priority Spot \
      --spot-max-price -1 \
      --node-vm-size Standard_D4s_v5 \
      --node-taints "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"

Networking: Kubenet vs. Azure CNI
AKS supports three primary CNI options:

    Kubenet (default)
      Pods get private IPs from a separate range
      Cross-node pod traffic follows user-defined routes; traffic leaving the cluster is SNATed to the node IP
      Simple setup, but pods are not directly reachable from the VNet

    Azure CNI
      Every pod gets an IP from the VNet subnet
      Direct routing: no NAT between pods and VNet resources
      Typically chosen for:
        - Private Endpoints (pods calling private Azure services)
        - Services that check source IP (firewalls, NSGs)
        - AKS cluster in a subnet with other non-Kubernetes resources

    Azure CNI Overlay (newer)
      Pods get IPs from a separate CIDR (not consuming VNet IPs)
      Pod-to-pod routing is handled by the Azure network stack, with no route tables to maintain
      Traffic leaving the cluster is SNATed to the node IP
      Supports large pod counts without exhausting VNet IP space

For production workloads with Private Endpoints, Azure CNI or Azure CNI Overlay is the right choice.
Workload Identity
The old approach to giving pods access to Azure services was to mount a service principal secret as a Kubernetes secret. Workload Identity replaces this with federated identity:
    Workload Identity Flow
    -----------------------
    1. Create Azure Managed Identity (no secret to rotate)
    2. Create Kubernetes ServiceAccount annotated with the identity client ID,
       and add a federated identity credential on the managed identity that trusts it
    3. AKS mutates pods labelled azure.workload.identity/use: "true" to inject
       environment variables pointing to a token file
    4. Pod exchanges Kubernetes OIDC token for an Azure AD token via federation
    5. Pod calls Azure SDK; credential picks up the token automatically

    No secrets stored in the cluster

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: payment-service-sa
      namespace: payments
      annotations:
        azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"

Cluster Autoscaler and KEDA
Cluster Autoscaler adds or removes nodes based on pending pods and node utilisation.
KEDA (Kubernetes Event-Driven Autoscaling) scales deployments to zero and back based on external event sources — Azure Service Bus queue depth, Blob Storage trigger count, Prometheus metrics, etc. KEDA is an add-on available directly from the AKS add-on API:
    az aks update \
      --resource-group prod-rg \
      --name my-aks \
      --enable-keda

Combined: KEDA scales pods based on queue depth, Cluster Autoscaler adds nodes when those pods cannot be scheduled, and it removes the idle nodes again once the queue drains.
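A rough sketch of the KEDA side of that combination is shown below: a ScaledObject that scales a Deployment on Azure Service Bus queue depth, down to zero when the queue is empty. The Deployment name, queue name, Service Bus namespace, and thresholds are all hypothetical, and the TriggerAuthentication assumes Workload Identity is enabled on the cluster and that the identity KEDA uses can read the queue.

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: servicebus-workload-identity   # hypothetical name
  namespace: payments
spec:
  podIdentity:
    provider: azure-workload           # authenticate via Workload Identity, no connection string
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: fraud-scorer-scaler            # hypothetical name
  namespace: payments
spec:
  scaleTargetRef:
    name: fraud-scorer                 # hypothetical Deployment to scale
  minReplicaCount: 0                   # scale to zero when the queue is empty
  maxReplicaCount: 30
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: scoring-requests    # hypothetical queue
        namespace: my-servicebus-ns    # hypothetical Service Bus namespace
        messageCount: "20"             # target messages per replica
      authenticationRef:
        name: servicebus-workload-identity
```

When the new replicas do not fit on existing nodes they sit in Pending, which is the signal the Cluster Autoscaler reacts to.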
Upgrades
AKS releases Kubernetes minor versions and patches on a rolling schedule. Microsoft supports the latest generally available minor version and the two before it (N-2, three versions in total). Clusters on an unsupported version lose SLA.
    # Check available upgrade versions
    az aks get-upgrades --resource-group prod-rg --name my-aks

    # Upgrade the control plane and node pools to a new Kubernetes version
    # (nodes are cordoned, drained, and replaced in turn)
    az aks upgrade \
      --resource-group prod-rg \
      --name my-aks \
      --kubernetes-version 1.30.0

    # Or upgrade only the node OS image, leaving the Kubernetes version unchanged
    # (--node-image-only cannot be combined with --kubernetes-version)
    az aks upgrade \
      --resource-group prod-rg \
      --name my-aks \
      --node-image-only

Auto-upgrade channels: patch, stable, rapid, node-image. Setting channel=patch automatically applies patch releases without you scheduling upgrade jobs.
Key Interview Points
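- Control plane cost: The AKS control plane is free on the Free tier; the Standard tier adds a per-cluster hourly charge in return for a financially backed uptime SLA. Either way, you pay for agent nodes (VMs), load balancers, managed disks, and egress.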
- System pool requirement: AKS requires at least one system node pool with at least one node. You cannot delete the last system pool.
- Kubenet vs. Azure CNI: Kubenet uses fewer IP addresses but pods are not first-class VNet citizens. Azure CNI gives pods VNet IPs but requires subnet IP planning.
- Cluster Autoscaler vs. HPA: HPA (Horizontal Pod Autoscaler) adds pod replicas. Cluster Autoscaler adds nodes. They work together — HPA scales pods, and if nodes fill up, Cluster Autoscaler adds capacity.
- RBAC and Azure AD integration: AKS supports both Kubernetes RBAC (bindings to ServiceAccounts) and Azure AD-integrated RBAC (bindings to Azure AD groups and users).
- Private cluster: Setting --enable-private-cluster makes the API server reachable only from within the VNet. Useful for regulated environments where the control plane endpoint must not be publicly accessible.
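To make the Cluster Autoscaler vs. HPA point concrete, the pod-level half could be declared with a manifest along these lines. It is only a sketch: the Deployment name, namespace, replica bounds, and the 70% CPU target are hypothetical, and CPU-based scaling assumes the target containers declare CPU requests.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: reporting-hpa              # hypothetical name
  namespace: reporting             # hypothetical namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: reporting                # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU exceeds 70% of requests
```

The HPA only changes the replica count. If the extra replicas cannot be scheduled, the Cluster Autoscaler is what adds the nodes for them.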
Best Practices
- Separate system and user node pools so a noisy workload cannot starve Kubernetes system components.
- Use Workload Identity instead of service principal secrets for pod-to-Azure-service authentication.
- Enable the Azure Policy add-on to enforce pod security standards (no privileged containers, required resource limits) at admission time.
- Store container images in Azure Container Registry with geo-replication enabled so AKS nodes in multiple regions can pull images without crossing continents.
- Set resource requests and limits on every container — the scheduler cannot make good placement decisions without them.
- Use PodDisruptionBudgets on critical workloads to prevent node drain from removing too many replicas simultaneously.
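Several of these practices can be seen together in one manifest. The sketch below assumes the payment-service-sa ServiceAccount from the Workload Identity section; the Deployment name, image, replica count, and resource figures are hypothetical and would need tuning for a real workload.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service                    # hypothetical name
  namespace: payments
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payment-service
  template:
    metadata:
      labels:
        app: payment-service
        azure.workload.identity/use: "true"   # opt this pod in to Workload Identity
    spec:
      serviceAccountName: payment-service-sa  # ServiceAccount annotated with the client ID
      containers:
        - name: payment-service
          image: <ACR_NAME>.azurecr.io/payment-service:1.0.0   # hypothetical image in ACR
          resources:
            requests:                      # what the scheduler reserves on a node
              cpu: 250m
              memory: 256Mi
            limits:                        # hard ceiling enforced at runtime
              cpu: "1"
              memory: 512Mi
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: payment-service-pdb                # hypothetical name
  namespace: payments
spec:
  minAvailable: 2                          # a node drain may evict at most one of the three replicas at a time
  selector:
    matchLabels:
      app: payment-service
```

With three replicas and minAvailable set to 2, a node drain during an upgrade has to wait for a replacement pod to become ready before evicting the next one.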