Microsoft Azure 26 guides · updated 2026

Practical guides to Azure compute, networking, storage, and data services — built for engineers running production workloads on Microsoft's cloud.

Azure Cosmos DB: Multi-Model Globally Distributed Database With 99.999% SLA

Azure Cosmos DB is Microsoft’s planet-scale, fully managed database service. It replicates data synchronously or asynchronously across any number of Azure regions, serves reads and writes from the region nearest to the client, and guarantees sub-10-millisecond read and write latency at the 99th percentile. The 99.999% availability SLA for multi-region write configurations is among the strongest availability commitments in the cloud database market.

Cosmos DB is not a single database type. It exposes multiple wire-compatible APIs, so teams can keep using familiar drivers and query languages: SQL-style queries over JSON documents, MongoDB BSON drivers, Cassandra CQL, Gremlin for graphs, and a Table API compatible with Azure Table Storage. All of these land in the same underlying ARS (Atom-Record-Sequence) storage engine.


Real-World Scenario

An e-commerce platform operates in North America, Europe, and Asia Pacific. Product catalogue reads happen from all three regions; writes originate from a central operations team. The team configures Cosmos DB with three regional replicas (East US, West Europe, Southeast Asia) using Session consistency. Users in Tokyo read from the Southeast Asia replica, which serves point reads in under 10 ms at the 99th percentile (network transit from Tokyo comes on top of that). Product updates from the operations team write to East US and replicate to all regions within seconds. If East US has an outage and service-managed failover is enabled, Azure promotes another region to accept writes. With a single write region, as here, the 99.999% SLA covers reads only; extending it to writes requires enabling multi-region writes.


Global Distribution Architecture

Cosmos DB Global Distribution
-------------------------------
               [Write Region: East US]
                          |
   Replication (async or sync depending on consistency)
                          |
             +------------+------------+
             |                         |
[Read Region: West Europe]   [Read Region: Southeast Asia]

Client in Tokyo    -> resolves to Southeast Asia endpoint
Client in London   -> resolves to West Europe endpoint
Client in New York -> resolves to East US endpoint

Multi-region writes (multi-master):
    All three regions accept writes simultaneously
    Conflicts resolved by LWW (Last Write Wins) or custom policy

Adding a new region is a portal toggle or CLI command — Cosmos DB provisions the replica and backfills data automatically.
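
The Last Write Wins policy named in the diagram can be illustrated with a small pure-Python simulation. This is a sketch of the idea only, not the service's implementation: it assumes each conflicting version carries a numeric `_ts` system timestamp and that the highest value wins. The `resolve_lww` helper and the sample documents are hypothetical.

```python
from typing import Iterable


def resolve_lww(versions: Iterable[dict], path: str = "_ts") -> dict:
    """Return the version with the highest value at `path` (Last Write Wins)."""
    versions = list(versions)
    if not versions:
        raise ValueError("no versions to resolve")
    # max() keeps the first maximal element, so ties fall back to input order
    # in this sketch; the real service applies its own tie-breaking rules.
    return max(versions, key=lambda doc: doc[path])


# Two regions accepted a write to the same item at nearly the same time.
east_us = {"id": "SKU-1", "price": 19.99, "_ts": 1700000100, "region": "East US"}
west_eu = {"id": "SKU-1", "price": 17.49, "_ts": 1700000105, "region": "West Europe"}

winner = resolve_lww([east_us, west_eu])
print(winner["region"], winner["price"])
```

The losing write is simply discarded under this policy, which is why a custom policy is worth considering when both versions contain changes that must be preserved.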


Consistency Levels

Cosmos DB offers five consistency levels, trading between data freshness and performance/cost:

Strongest   STRONG
    |           All reads guaranteed to see latest write
    |           Write latency: 2x round-trip to farthest replica
    |
    |       BOUNDED STALENESS
    |           Reads lag behind writes by at most K versions or T seconds
    |           Good for globally consistent reads with bounded lag
    |
    |       SESSION (default, most popular)
    |           Within a single client session: reads see own writes
    |           Best balance of consistency and performance
    |
    |       CONSISTENT PREFIX
    |           Reads never see out-of-order writes
    |           No guarantee on how far behind they are
    |
Weakest     EVENTUAL
                No ordering or freshness guarantee
                Highest throughput, lowest cost

Session consistency is the right choice for most OLTP applications. Each client session sees its own writes immediately (read-your-own-writes guarantee), while different clients may briefly see stale data.
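
The read-your-own-writes guarantee can be pictured as a session token that the client carries between requests. The following is a toy model under stated assumptions, not the SDK's mechanism in detail: `Replica`, `Session`, and the integer LSN (log sequence number) token are hypothetical names used to show why the writer never sees stale data while another client might.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A replica that has applied writes up to `lsn`."""
    name: str
    lsn: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, lsn: int, key: str, value: str) -> None:
        self.data[key] = value
        self.lsn = lsn


@dataclass
class Session:
    """Client-side session that remembers the LSN of its latest write."""
    token: int = 0

    def write(self, primary: Replica, key: str, value: str) -> None:
        primary.apply(primary.lsn + 1, key, value)
        self.token = primary.lsn  # the session token advances with each write

    def read(self, replicas: list, key: str):
        # Only replicas that have caught up to the session token may answer.
        for replica in replicas:
            if replica.lsn >= self.token:
                return replica.name, replica.data.get(key)
        raise RuntimeError("no replica has caught up to the session token")


primary = Replica("East US")
lagging = Replica("Southeast Asia")  # has not received the write yet

writer = Session()
writer.write(primary, "ORD-1", "confirmed")

# The writer's session skips the lagging replica and sees its own write.
own_read = writer.read([lagging, primary], "ORD-1")

# A different session has no token yet, so the stale replica may answer it.
other_read = Session().read([lagging, primary], "ORD-1")
print(own_read, other_read)
```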


Request Units (RU/s)

Cosmos DB abstracts compute, memory, and I/O as Request Units. One RU is approximately the cost of a point read (a lookup by id and partition key) of a 1 KB document. Writes cost more (roughly 5-10x a read), and cross-partition queries can cost hundreds of RUs.

RU Cost Examples (approximate)
--------------------------------
Operation                                 Cost
Point read (1 KB document)                1 RU
Point write (1 KB document)               5 RU
Query with index hit (10 results)         10-20 RU
Cross-partition query (1000 results)      100-500 RU
Stored procedure (complex)                50-200 RU
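
These approximate costs are enough for a first capacity estimate. The sketch below multiplies a hypothetical operation mix by the figures from the table; the value of 15 RU for an indexed query is an assumed midpoint of the 10-20 RU range, and `required_rus` is an illustrative helper, not part of any SDK.

```python
# Approximate per-operation costs taken from the table above (in RUs).
RU_COST = {"point_read": 1, "point_write": 5, "indexed_query": 15}


def required_rus(ops_per_second: dict) -> int:
    """Sum the RU/s needed for a mix of operations per second."""
    return sum(RU_COST[op] * rate for op, rate in ops_per_second.items())


# Hypothetical steady-state workload, expressed as operations per second.
workload = {"point_read": 400, "point_write": 50, "indexed_query": 10}
print(required_rus(workload))  # 400*1 + 50*5 + 10*15 = 800
```

Real costs depend on document size, indexing policy, and consistency level, so an estimate like this is best checked against the request charge reported on actual responses.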

Throughput can be provisioned at the database or container level, or set to autoscale. Autoscale moves between 10% and 100% of the configured maximum RU/s automatically, and each hour is billed at the highest RU/s reached during that hour.
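
The autoscale billing rule can be worked through with a few lines of arithmetic. This is a simplified sketch: `billed_rus_per_hour` and the hourly peak figures are hypothetical, and it only models the 10%-to-100% range and the per-hour maximum described above.

```python
def billed_rus_per_hour(max_rus: int, hourly_peaks: list) -> list:
    """Return the RU/s billed for each hour under autoscale.

    Each hour is billed at the highest RU/s reached in that hour, clamped to
    the autoscale range of 10% to 100% of `max_rus`.
    """
    floor = max_rus // 10
    return [min(max(peak, floor), max_rus) for peak in hourly_peaks]


# Hypothetical demand peaks for four hours on a 10,000 RU/s autoscale maximum.
print(billed_rus_per_hour(10_000, [300, 2_500, 9_800, 12_000]))
```

A quiet hour still bills at the 10% floor, and demand above the maximum is throttled rather than billed, which is why the last hour is capped at 10,000.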


Partitioning

Every Cosmos DB container has a partition key. All data with the same partition key value lives in the same logical partition. Physical partitions group multiple logical partitions and are managed automatically.

Container: Orders
Partition Key: /customerId

  Logical Partition "CUST-001"
    {"id": "ORD-A1", "customerId": "CUST-001", "total": 49.99}
    {"id": "ORD-A2", "customerId": "CUST-001", "total": 120.00}
  Logical Partition "CUST-002"
    {"id": "ORD-B1", "customerId": "CUST-002", "total": 75.50}

Physical partitions (managed by Cosmos DB):
  [P1: CUST-001, CUST-005, CUST-009...]
  [P2: CUST-002, CUST-006, CUST-010...]
  ...

Good partition key:
  - High cardinality (many unique values)
  - Evenly distributed writes
  - Appears in most queries (avoids cross-partition fan-out)

A poor partition key choice (e.g., a boolean flag) creates hot partitions where one value receives most of the traffic. The maximum logical partition size is 20 GB.
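
One way to compare candidate partition keys before committing to one is to measure skew on a sample of real documents. The sketch below is an assumption-laden illustration: `hottest_share`, the generated sample, and the `isPremium` field are hypothetical, and item count is used as a rough stand-in for traffic.

```python
from collections import Counter


def hottest_share(items: list, key: str) -> float:
    """Fraction of items that fall into the most populated key value."""
    counts = Counter(item[key] for item in items)
    return max(counts.values()) / len(items)


# Hypothetical sample: 1,000 orders spread over 100 customers,
# with one order in ten placed by a premium customer.
orders = [
    {"customerId": f"CUST-{i % 100:03d}", "isPremium": i % 10 == 0}
    for i in range(1000)
]

print(hottest_share(orders, "customerId"))  # spread over 100 distinct values
print(hottest_share(orders, "isPremium"))   # one of two values dominates
```

A share close to 1 divided by the number of distinct values suggests an even spread; a share near 1.0 signals a hot partition in the making.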


Change Feed

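The change feed is a persistent log of inserts and updates to a Cosmos DB container, ordered by modification time within each logical partition. Every write is appended to the feed and can be read by one or more consumers. Deletes are not captured by default; a common workaround is a soft-delete flag combined with a TTL, so the item is marked as deleted, appears in the feed as an update, and then expires.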

Change Feed Use Cases
----------------------
Materialized views:
  Product updates in ProductCatalog container
    -> Change feed consumer reads changes
    -> Writes denormalised view to SearchIndex container

Event sourcing:
  All order state changes appended to change feed
    -> Multiple microservices consume independently
    -> Each builds its own projection (email service, analytics, inventory)

Cache invalidation:
  Price updates in ProductDB
    -> Change feed consumer detects price change
    -> Invalidates corresponding Redis cache keys

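The change feed is consumed via Azure Functions (CosmosDBTrigger), the change feed processor library in the SDK, or the SDK pull model. Azure Stream Analytics does not read the feed directly, so changes destined for it are usually forwarded through an intermediary such as Event Hubs.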
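
The materialized-view pattern, including the soft-delete workaround, can be sketched without any Azure dependency. The code below simulates a consumer folding a batch of changed documents into a view; `apply_changes`, the `deleted` flag, and the sample batch are hypothetical, and a real consumer would receive the batch from a trigger or processor rather than a list.

```python
def apply_changes(view: dict, changes: list) -> dict:
    """Fold a batch of change feed documents into a denormalised view.

    Assumes each document carries `id`, `name`, `price` and an optional
    `deleted` flag used as the soft-delete marker.
    """
    for doc in changes:
        if doc.get("deleted"):
            # Soft delete: the feed shows an update, so drop the projection.
            view.pop(doc["id"], None)
        else:
            # The feed carries the current version of the item; overwrite.
            view[doc["id"]] = {"name": doc["name"], "price": doc["price"]}
    return view


search_index = {}
batch = [
    {"id": "P1", "name": "Phone X", "price": 999},
    {"id": "P2", "name": "Case", "price": 19},
    {"id": "P1", "name": "Phone X", "price": 949},  # later price update
    {"id": "P2", "name": "Case", "price": 19, "deleted": True, "ttl": 60},
]
apply_changes(search_index, batch)
print(search_index)
```

Because each step overwrites or removes by `id`, replaying the same batch leaves the view unchanged, which matters when a consumer restarts and reprocesses recent changes.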


Working With Cosmos DB (Python, SQL API)

from azure.cosmos import CosmosClient, PartitionKey

endpoint = "https://myaccount.documents.azure.com:443/"
key = "<primary_key>"

client = CosmosClient(endpoint, key)
db = client.create_database_if_not_exists("ecommerce")
container = db.create_container_if_not_exists(
    id="orders",
    partition_key=PartitionKey(path="/customerId"),
    offer_throughput=1000,  # 1000 RU/s provisioned
)

# Insert a document (upsert replaces it if the id already exists)
order = {
    "id": "ORD-5001",
    "customerId": "CUST-042",
    "items": [{"sku": "PHONE-X", "qty": 1, "price": 999}],
    "total": 999,
    "status": "confirmed",
}
container.upsert_item(order)

# Point read by id and partition key (cheapest operation: about 1 RU)
item = container.read_item(item="ORD-5001", partition_key="CUST-042")
print(item["status"])

# Cross-partition query (more expensive: fans out to every partition)
for confirmed in container.query_items(
    query="SELECT * FROM c WHERE c.status = 'confirmed'",
    enable_cross_partition_query=True,
):
    print(confirmed["id"])
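
When the partition key value is known, the fan-out of the cross-partition query above can be avoided by scoping the query to one logical partition. The following is a minimal sketch, assuming a `container` object like the one created earlier; `find_orders` is a hypothetical helper, and it relies on the `parameters` and `partition_key` keyword arguments of `query_items` in the azure-cosmos v4 SDK.

```python
def find_orders(container, customer_id: str, status: str) -> list:
    """Query one customer's orders by status within a single partition."""
    return list(
        container.query_items(
            # Parameterised query: values are passed separately from the text.
            query="SELECT * FROM c WHERE c.status = @status",
            parameters=[{"name": "@status", "value": status}],
            # Scopes the query to one logical partition (no fan-out).
            partition_key=customer_id,
        )
    )
```

A call such as `find_orders(container, "CUST-042", "confirmed")` would then touch only the partition holding that customer's orders.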

Key Interview Points


Best Practices