Microsoft Azure 26 guides · updated 2026

Practical guides to Azure compute, networking, storage, and data services — built for engineers running production workloads on Microsoft's cloud.

Azure Kubernetes Service: Managed Kubernetes With Deep Azure Integration

Kubernetes is the de facto standard for orchestrating containerised workloads at scale. It handles scheduling, health checks, rolling updates, and service discovery, but running the control plane yourself means managing etcd, API server certificates, and upgrade procedures. Azure Kubernetes Service (AKS) removes that responsibility: Microsoft runs and patches the control plane, and on the Free tier you pay only for the agent (worker) nodes. The Standard tier adds a per-cluster fee in exchange for an uptime SLA.

AKS integrates tightly with the Azure ecosystem: Azure Container Registry for image pulls, Microsoft Entra ID (formerly Azure Active Directory) for user and workload identity, Azure CNI for VNet-native pod networking, Azure Monitor and Prometheus for observability, and Azure Policy for governance.


Real-World Scenario

A fintech company runs twenty microservices: payment processing, fraud detection, user auth, reporting, and several data pipelines. Each service has different scaling profiles — fraud detection needs GPU nodes during ML batch scoring, while the reporting service is nearly idle overnight. AKS node pools let the team define a CPU node pool for most services, a separate GPU pool for ML, and a spot node pool for batch jobs — all within one cluster, managed by a single control plane.


Architecture Overview

  Azure Active Directory (user + Workload Identity)
                          |
      AKS Control Plane (managed by Microsoft)
[API Server | etcd | Scheduler | Controller Manager]
                          |
                +---------+---------+
                |                   |
        System Node Pool    User Node Pool(s)
          (kube-system)     (your workloads)
                |                   |
        [Standard_D4s_v5]   [Standard_D8s_v5] [Standard_NC6s_v3 GPU]
                          |
            Azure CNI (pods get VNet IPs)
                          |
          +---------------+---------------+
          |               |               |
    Azure SQL DB  Azure Service Bus   Azure Storage
    (Private EP)    (Private EP)      (Private EP)

Node Pools

A node pool is a group of VMs with identical configuration. AKS requires at least one system pool that runs Kubernetes system pods. Additional user pools host your application workloads.

Key node pool settings:

# Add a spot node pool for batch jobs
az aks nodepool add \
  --resource-group prod-rg \
  --cluster-name my-aks \
  --name batchpool \
  --node-count 0 \
  --min-count 0 \
  --max-count 20 \
  --enable-cluster-autoscaler \
  --priority Spot \
  --spot-max-price -1 \
  --node-vm-size Standard_D4s_v5 \
  --node-taints "kubernetes.azure.com/scalesetpriority=spot:NoSchedule"
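
Pods are scheduled onto the spot pool only if they tolerate its taint. The following is a minimal sketch, assuming a hypothetical batch Job called nightly-report and a placeholder image name (neither comes from this guide). The function only prints the manifest so it can be reviewed first. The node label used in the selector is the one AKS is expected to put on spot nodes; confirm it on your cluster with kubectl get nodes --show-labels.

```shell
#!/usr/bin/env bash
# Print a Job manifest that tolerates the spot taint and selects spot nodes.
# Job name and image are hypothetical placeholders, not taken from the guide.
render_spot_job() {
  local name="${1:-nightly-report}"
  local image="${2:-example.azurecr.io/batch/report:latest}"
  cat <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: ${name}
spec:
  backoffLimit: 4            # retry if a spot eviction kills the pod
  template:
    spec:
      restartPolicy: OnFailure
      nodeSelector:
        kubernetes.azure.com/scalesetpriority: spot
      tolerations:
        - key: kubernetes.azure.com/scalesetpriority
          operator: Equal
          value: spot
          effect: NoSchedule
      containers:
        - name: ${name}
          image: ${image}
EOF
}

render_spot_job
```

To apply it, pipe the output to kubectl: render_spot_job | kubectl apply -f -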

Networking: Kubenet vs. Azure CNI

AKS supports three primary CNI options:

Kubenet (default)
  Pods get private IPs from a separate range
  Traffic from pods to destinations outside the cluster is NATed to the node IP; traffic between nodes relies on a route table
  Simple setup, but pods are not directly reachable from the VNet
Azure CNI
  Every pod gets an IP from the VNet subnet
  Direct routing: no NAT between pods and VNet resources
  Required for:
    - Private Endpoints (pods calling private Azure services)
    - Services that check source IP (firewalls, NSGs)
    - AKS cluster in a subnet with other non-Kubernetes resources
Azure CNI Overlay (newer)
  Pods get IPs from a separate CIDR (not consuming VNet IPs)
  Pod-to-pod routing is handled by the Azure network stack, with no route tables or tunnels to manage; traffic leaving the cluster is NATed to the node IP
  Supports large pod counts without exhausting VNet IP space

For production workloads with Private Endpoints, Azure CNI or Azure CNI Overlay is the right choice.
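
The difference in VNet IP consumption is easy to underestimate, so a quick calculation helps when sizing subnets. This sketch assumes the commonly cited sizing rule for traditional Azure CNI, (nodes + 1) + (nodes + 1) x max pods per node, where the extra node covers the surge node created during upgrades. Treat the formula and the 50-node, 30-pod figures as illustrative assumptions and check them against current Microsoft guidance.

```shell
#!/usr/bin/env bash
# VNet IPs needed for traditional Azure CNI, where every pod takes a VNet IP.
# Assumed formula: (nodes + 1) + (nodes + 1) * max_pods; the +1 is one
# surge node added during upgrades.
azure_cni_ips() {
  local nodes="$1" max_pods="$2"
  echo $(( (nodes + 1) + (nodes + 1) * max_pods ))
}

# With Azure CNI Overlay only the nodes consume VNet IPs; pods draw from
# the separate pod CIDR.
overlay_vnet_ips() {
  local nodes="$1"
  echo $(( nodes + 1 ))
}

echo "Azure CNI, 50 nodes x 30 pods: $(azure_cni_ips 50 30) VNet IPs"
echo "Overlay, 50 nodes: $(overlay_vnet_ips 50) VNet IPs"
```

Under these assumptions the traditional model needs 1581 addresses against 51 for Overlay, which is why Overlay suits clusters with large pod counts.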


Workload Identity

The old approach to giving pods access to Azure services was to mount a service principal secret as a Kubernetes secret. Workload Identity replaces this with federated identity:

Workload Identity Flow
-----------------------
1. Create Azure Managed Identity (no secret to rotate)
2. Create Kubernetes ServiceAccount annotated with the identity client ID
3. Create a federated identity credential on the managed identity that trusts the cluster's OIDC issuer and that ServiceAccount
4. AKS mutates pods labelled azure.workload.identity/use: "true" to inject environment variables pointing to a token file
5. Pod exchanges Kubernetes OIDC token for an Azure AD token via federation
6. Pod calls Azure SDK; credential picks up the token automatically
No secrets stored in the cluster

apiVersion: v1
kind: ServiceAccount
metadata:
  name: payment-service-sa
  namespace: payments
  annotations:
    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"
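
The federation itself is a trust link between the managed identity and the ServiceAccount, keyed on a subject string of the form system:serviceaccount:<namespace>:<name>. The sketch below builds that subject and prints, without running it, an az command that would create the link. The identity name payment-service-identity, the credential name, and the issuer URL are hypothetical placeholders; the cluster is assumed to have its OIDC issuer and workload identity enabled.

```shell
#!/usr/bin/env bash
# Build the federated-credential subject for a Kubernetes ServiceAccount.
fic_subject() {
  local namespace="$1" service_account="$2"
  echo "system:serviceaccount:${namespace}:${service_account}"
}

# Print (not run) the az command linking the managed identity to the
# ServiceAccount. Identity and credential names are hypothetical.
print_fic_command() {
  local issuer_url="$1" namespace="$2" service_account="$3"
  cat <<EOF
az identity federated-credential create \\
  --name ${service_account}-fic \\
  --identity-name payment-service-identity \\
  --resource-group prod-rg \\
  --issuer ${issuer_url} \\
  --subject $(fic_subject "$namespace" "$service_account") \\
  --audiences api://AzureADTokenExchange
EOF
}

# The issuer URL would normally come from:
#   az aks show -g prod-rg -n my-aks --query oidcIssuerProfile.issuerUrl -o tsv
print_fic_command "https://example.invalid/oidc-issuer/" payments payment-service-sa
```

Printing the command rather than executing it keeps the sketch safe to run anywhere and lets you review the subject before creating the credential.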

Cluster Autoscaler and KEDA

Cluster Autoscaler adds or removes nodes based on pending pods and node utilisation.

KEDA (Kubernetes Event-Driven Autoscaling) scales deployments to zero and back based on external event sources such as Azure Service Bus queue depth, Blob Storage blob count, or Prometheus metrics. AKS offers KEDA as a managed add-on that you enable on the cluster:

az aks update \
  --resource-group prod-rg \
  --name my-aks \
  --enable-keda

Combined: KEDA scales pods based on queue depth, and Cluster Autoscaler adds nodes when those pods cannot be scheduled, then removes them once the queue drains and the pods scale back down.
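
A queue-driven scaling rule is expressed as a KEDA ScaledObject. The sketch below prints one for a Service Bus queue; the Deployment name, queue name, Service Bus namespace, and the TriggerAuthentication named servicebus-workload-identity are all hypothetical, and the TriggerAuthentication itself is assumed to exist already.

```shell
#!/usr/bin/env bash
# Print a KEDA ScaledObject that scales a Deployment on Service Bus queue depth.
# Deployment, queue, namespace and TriggerAuthentication names are hypothetical.
render_scaledobject() {
  local deployment="$1" queue="$2" messages_per_replica="${3:-20}"
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ${deployment}-scaler
spec:
  scaleTargetRef:
    name: ${deployment}
  minReplicaCount: 0          # scale to zero when the queue is empty
  maxReplicaCount: 50
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: ${queue}
        namespace: example-servicebus-ns
        messageCount: "${messages_per_replica}"
      authenticationRef:
        name: servicebus-workload-identity
EOF
}

render_scaledobject fraud-scoring scoring-requests 20
```

Here messageCount is the target number of queued messages per replica, so a deeper queue produces more pods, which in turn can trigger Cluster Autoscaler.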


Upgrades

AKS releases Kubernetes minor versions and patches on a rolling schedule. Microsoft supports the latest generally available minor version and the two before it (N-2, three versions total). Clusters running an older version fall outside support and the SLA.
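
The N-2 rule can be checked with simple arithmetic on the minor version. This is an illustrative sketch only: it assumes versions are written as major.minor, that both share the same major version, and that 1.30 is the latest supported minor, which is a made-up reference point for the example.

```shell
#!/usr/bin/env bash
# Return 0 if a cluster's minor version is inside the N-2 support window.
# Versions are given as "major.minor"; only the minor number is compared,
# and both versions are assumed to share the same major version.
in_support_window() {
  local latest="$1" cluster="$2"
  local latest_minor="${latest#*.}" cluster_minor="${cluster#*.}"
  local gap=$(( latest_minor - cluster_minor ))
  [ "$gap" -ge 0 ] && [ "$gap" -le 2 ]
}

for v in 1.30 1.29 1.28 1.27; do
  if in_support_window 1.30 "$v"; then
    echo "$v: supported"
  else
    echo "$v: out of support"
  fi
done
```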

# Check available upgrade versions
az aks get-upgrades --resource-group prod-rg --name my-aks
# Upgrade the control plane and node pools (nodes are cordoned, drained, replaced)
az aks upgrade \
  --resource-group prod-rg \
  --name my-aks \
  --kubernetes-version 1.30.0
# Or upgrade only the node OS image; do not combine with --kubernetes-version
az aks upgrade \
  --resource-group prod-rg \
  --name my-aks \
  --node-image-only

Auto-upgrade channels: patch, stable, rapid, node-image. Setting the channel to patch applies new patch releases of the current minor version automatically, so you do not have to schedule those upgrades yourself.
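
Setting the channel is a single cluster update. The sketch below validates the channel name and prints, without running it, the corresponding command; the helper function is hypothetical, and the accepted list includes none, which is assumed here to be the value that turns auto-upgrade off.

```shell
#!/usr/bin/env bash
# Print (not run) the az command that sets a cluster auto-upgrade channel.
# Rejects channel names outside the list used in this sketch.
print_channel_command() {
  local channel="$1"
  case "$channel" in
    none|patch|stable|rapid|node-image) ;;
    *) echo "unknown channel: $channel" >&2; return 1 ;;
  esac
  echo "az aks update --resource-group prod-rg --name my-aks --auto-upgrade-channel ${channel}"
}

print_channel_command patch
```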


Key Interview Points


Best Practices