Azure Table Storage: Schemaless Key-Value Storage at Low Cost for Structured Data
Azure Table Storage is a key-value store for structured, non-relational data. It sits inside a general-purpose storage account alongside blobs and queues, which makes it the cheapest NoSQL option in Azure — there is no separate database server to pay for. You pay per GB stored and per 10,000 operations, with no minimum provisioned capacity.
The trade-off for that cost is limited query capability. Table Storage has no secondary indexes, no joins, no aggregation functions, and no server-side sorting beyond the natural partition-key/row-key ordering. If your access pattern fits the key-based model, it is an excellent low-cost choice. If you need rich queries, global distribution, or high throughput, Cosmos DB is the right next step.
Real-World Scenario
A gaming company stores player achievement records. Each player has a unique ID (the partition key) and each achievement has a unique code (the row key). The query pattern is always: “get all achievements for player X” — a partition query that returns every row in one partition. With 50 million players and 200 million achievement rows, Table Storage handles the load at a fraction of Cosmos DB’s cost because the access pattern is a perfect fit for the partition/row-key model.
Data Model
Table Storage organises data into tables containing entities. An entity is a set of properties. Every entity must have three system properties and can have up to 252 custom properties:
    Entity Structure
    -----------------
    PartitionKey (string, required)  -- groups related entities; determines placement
    RowKey       (string, required)  -- unique within partition; determines sort order
    Timestamp    (datetime, system)  -- last update time, managed by Azure

    + up to 252 custom properties of any supported type:
      String, Int32, Int64, Double, Boolean, DateTime, Binary, Guid

Example table: IoT device readings

    PartitionKey | RowKey               | Temperature | Humidity | Location
    -------------|----------------------|-------------|----------|---------
    device-001   | 2024-06-15T00:00:00Z | 22.4        | 61.2     | Floor-A
    device-001   | 2024-06-15T01:00:00Z | 22.1        | 60.8     | Floor-A
    device-002   | 2024-06-15T00:00:00Z | 19.7        | 55.0     | Floor-B
    device-002   | 2024-06-15T01:00:00Z | 20.2        | 54.6     | Floor-B

With this design, all readings for a device are in one partition, ordered chronologically by row key. A query for “all readings for device-001 between midnight and 2 AM” scans only one partition.
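To make the ordering argument concrete, here is a minimal in-memory sketch. It does not call Azure at all: the `readings` list and the `range_scan` helper are hypothetical stand-ins that mimic how entities sorted by (PartitionKey, RowKey) let a range query touch one contiguous slice. It relies on ISO-8601 timestamps in a fixed format sorting chronologically when compared as strings.

```python
from bisect import bisect_left

# Hypothetical stand-in for one table: rows kept sorted by
# (PartitionKey, RowKey), mirroring the ordering Table Storage maintains.
readings = sorted([
    ("device-002", "2024-06-15T01:00:00Z", 20.2),
    ("device-001", "2024-06-15T01:00:00Z", 22.1),
    ("device-001", "2024-06-15T00:00:00Z", 22.4),
    ("device-002", "2024-06-15T00:00:00Z", 19.7),
])


def range_scan(rows, partition_key, row_key_from, row_key_to):
    """Return rows in one partition with row_key_from <= RowKey < row_key_to."""
    keys = [(pk, rk) for pk, rk, _ in rows]
    lo = bisect_left(keys, (partition_key, row_key_from))
    hi = bisect_left(keys, (partition_key, row_key_to))
    # Only the contiguous slice between the two key positions is read.
    return rows[lo:hi]


midnight_to_2am = range_scan(
    readings, "device-001", "2024-06-15T00:00:00Z", "2024-06-15T02:00:00Z"
)
for pk, rk, temperature in midnight_to_2am:
    print(pk, rk, temperature)
```

The slice never reaches the device-002 rows, which is the property the key design is meant to provide.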
Querying and Its Limitations
Table Storage supports filter expressions using OData syntax on any property, but only PartitionKey and RowKey are indexed. A filter on any other property, including the system Timestamp, forces a scan:
    Fast queries (use partition key):
      PartitionKey eq 'device-001'
      PartitionKey eq 'device-001' and RowKey ge '2024-06-15' and RowKey lt '2024-06-16'

    Slow queries (full table scan -- avoid in production):
      Temperature gt 25        <- no index on Temperature
      Location eq 'Floor-A'    <- no index on Location
      (queries only on non-key properties scan every entity in the table)

Because there are no secondary indexes, query patterns that require filtering on non-key properties are expensive. This forces a design discipline: choose partition and row keys based on your actual query patterns, not on conceptual data modelling.
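A small helper can keep application queries on the fast path by always anchoring them to a partition key. The sketch below only builds the filter string; `odata_quote` and `partition_range_filter` are hypothetical names introduced for illustration, not part of any Azure SDK.

```python
def odata_quote(value: str) -> str:
    """Wrap a string as an OData literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def partition_range_filter(partition_key, row_key_from=None, row_key_to=None):
    """Build a filter that always names a partition, with an optional
    half-open RowKey range [row_key_from, row_key_to)."""
    clauses = [f"PartitionKey eq {odata_quote(partition_key)}"]
    if row_key_from is not None:
        clauses.append(f"RowKey ge {odata_quote(row_key_from)}")
    if row_key_to is not None:
        clauses.append(f"RowKey lt {odata_quote(row_key_to)}")
    return " and ".join(clauses)


print(partition_range_filter("device-001", "2024-06-15", "2024-06-16"))
```

Because the partition key is a required argument, a caller cannot accidentally issue a filter that only names non-key properties. If you use the azure-data-tables package, its `query_entities` method also accepts a `parameters` argument for substituting values into the filter, which may be preferable to hand-built quoting.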
Partition Key Design Patterns
    Pattern: Time-series fan-out
      PartitionKey = deviceId (unique per device)
      RowKey       = inverted timestamp (MaxTicks - ticks, zero-padded to a fixed width)
      Result: all readings for a device in one partition, sorted newest-first

    Pattern: User + resource
      PartitionKey = userId
      RowKey       = resourceId
      Result: all resources for a user in one partition

    Pattern: Hot partition antipattern
      PartitionKey = "all"   <- AVOID: all writes go to one partition
      RowKey       = sequential integer
      Result: throttling at high write rates because Azure cannot distribute load

Hot partitions are the most common Table Storage design mistake. If all writes go to a single partition key, that partition becomes a bottleneck: Table Storage cannot spread the load because it must keep a partition on one storage node for consistency.
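The newest-first row key from the time-series pattern can be sketched as follows. This is an illustration under stated assumptions: it uses .NET-style ticks (100-nanosecond units since 0001-01-01 UTC) with `DateTime.MaxValue.Ticks` as the ceiling, and `MAX_TICKS`, `to_ticks`, and `inverted_row_key` are names made up for this example. Any fixed ceiling larger than every timestamp you will store would work equally well.

```python
from datetime import datetime, timezone

# Assumption: .NET DateTime.MaxValue.Ticks is used as the ceiling.
MAX_TICKS = 3_155_378_975_999_999_999
_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)


def to_ticks(ts: datetime) -> int:
    """Convert a timezone-aware datetime to 100 ns ticks since 0001-01-01 UTC."""
    delta = ts.astimezone(timezone.utc) - _EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10_000_000 + delta.microseconds * 10


def inverted_row_key(ts: datetime) -> str:
    """Row key that sorts newest-first under ascending string order.

    Zero-padding to 19 digits keeps string order equal to numeric order.
    """
    return f"{MAX_TICKS - to_ticks(ts):019d}"


older = datetime(2024, 6, 15, 0, 0, tzinfo=timezone.utc)
newer = datetime(2024, 6, 15, 1, 0, tzinfo=timezone.utc)
print(inverted_row_key(newer) < inverted_row_key(older))
```

The fixed width matters: without zero-padding, keys of different lengths would compare incorrectly as strings.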
Working With Table Storage (Python)
    from datetime import datetime, timezone

    from azure.data.tables import TableServiceClient

    conn_str = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;"
    service = TableServiceClient.from_connection_string(conn_str)
    # create_table_if_not_exists returns a TableClient and does not raise when
    # the table already exists (a plain create_table() call would raise)
    table = service.create_table_if_not_exists("DeviceReadings")

    # Insert or update an entity; the RowKey is a UTC timestamp in the same
    # format as the example table, so string order is chronological order
    entity = {
        "PartitionKey": "device-001",
        "RowKey": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Temperature": 22.4,
        "Humidity": 61.2,
        "Location": "Floor-A",
    }
    table.upsert_entity(entity)

    # Query: all readings for device-001 on 2024-06-15 (single-partition range)
    filter_expr = (
        "PartitionKey eq 'device-001' "
        "and RowKey ge '2024-06-15T00:00:00' and RowKey lt '2024-06-16'"
    )
    for row in table.query_entities(filter_expr):
        print(row["RowKey"], row["Temperature"])

    # Batch upsert (atomic; every entity must share the same PartitionKey)
    batch = []
    for i in range(10):
        batch.append(("upsert", {
            "PartitionKey": "device-002",
            "RowKey": f"reading-{i:04d}",
            "Temperature": 20.0 + i * 0.1,
        }))
    table.submit_transaction(batch)

Table Storage vs. Cosmos DB Table API
Cosmos DB offers a Table API that is wire-compatible with Table Storage SDKs. The same code works against both, but Cosmos DB adds:
    Capability          | Table Storage | Cosmos DB Table API
    --------------------|---------------|---------------------
    Secondary indexes   | No            | Yes (all properties)
    Global distribution | No            | Yes (multi-region)
    SLA                 | 99.9%         | 99.99%
    Throughput model    | Best-effort   | Provisioned RU/s
    Latency guarantee   | No            | < 10 ms SLA
    Max entity size     | 1 MB          | 2 MB
    Automatic indexing  | No            | Yes

If your application outgrows Table Storage (queries are too slow, or you need global replication), the application code needs only a connection string change to target Cosmos DB Table API; no code rewrite is needed. The data itself still has to be copied into a new Cosmos DB account, because the two are separate services.
Key Interview Points
- No server-side sorting: Results are returned sorted by PartitionKey then RowKey (both ascending). You cannot request descending order or sort by other properties. Design row keys accordingly (e.g., reverse timestamp for newest-first).
- Entity group transactions: Batch operations (up to 100 entities) within a single partition are atomic. Cross-partition batches are not supported — a common gotcha when trying to update entities in two partitions atomically.
- 1 MB entity limit: Each entity (with all its properties) cannot exceed 1 MB. For entities with large blobs, store the large content in Blob Storage and put a reference URI in the Table entity.
- Access from Cosmos DB API: If you create a Table Storage table in a GPv2 storage account and later want Cosmos DB features, you must create a new Cosmos DB account and migrate data. They are separate services despite API compatibility.
- Pricing: Table Storage is dramatically cheaper than Cosmos DB for read-heavy workloads with simple key access. At 10 billion entities and 100 million reads per month, Table Storage can cost 10-50x less.
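The entity group transaction limits above (one partition per batch, at most 100 entities) can be enforced before any call is made. The following is a minimal sketch, not an SDK feature: `partition_batches` and `MAX_BATCH` are hypothetical names, and the helper only groups and chunks plain dictionaries.

```python
from collections import defaultdict
from itertools import islice

MAX_BATCH = 100  # per-transaction entity limit described above


def partition_batches(entities, batch_size=MAX_BATCH):
    """Yield lists of ("upsert", entity) operations.

    Each yielded list contains entities from exactly one partition and
    holds at most batch_size operations.
    """
    by_partition = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(entity)
    for rows in by_partition.values():
        it = iter(rows)
        while chunk := list(islice(it, batch_size)):
            yield [("upsert", e) for e in chunk]


entities = [{"PartitionKey": "device-001", "RowKey": f"r-{i:04d}"} for i in range(250)]
entities += [{"PartitionKey": "device-002", "RowKey": f"r-{i:04d}"} for i in range(30)]
batches = list(partition_batches(entities))
print([len(b) for b in batches])
```

Each yielded list has the same shape as the `batch` passed to `table.submit_transaction` in the Python section. Atomicity holds only within each batch: if a later batch fails, earlier ones have already been committed.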
Best Practices
- Design the partition key around your most common query predicate — not around logical grouping that matches a relational schema.
- Avoid large partitions (tens of millions of entities in one partition) as they create hotspots during range scans and slow down delete operations.
- Use the upsert_entity operation instead of separate insert/update calls to simplify code and reduce conditional write overhead.
- Implement retry logic using the Azure SDK’s built-in exponential backoff policy. Table Storage returns 503 Server Busy (or 500 Operation Timeout) when it throttles a burst, and the SDK retries these transparently.
- Review access patterns quarterly; if cross-entity queries are becoming common, evaluate whether migrating to Cosmos DB or Azure SQL would reduce complexity and operational cost despite higher per-unit pricing.