Azure Table Storage: Schemaless Key-Value Storage at Low Cost for Structured Data

Azure Table Storage is a key-value store for structured, non-relational data. It lives inside a general-purpose storage account alongside blobs and queues, so there is no separate database server to pay for, which makes it one of the cheapest NoSQL options in Azure. You pay per GB stored and per 10,000 transactions, with no minimum provisioned capacity.

The trade-off for that cost is limited query capability. Table Storage has no secondary indexes, no joins, no aggregation functions, and no server-side sorting beyond the natural partition-key/row-key ordering. If your access pattern fits the key-based model, it is an excellent low-cost choice. If you need rich queries, global distribution, or high throughput, Cosmos DB is the right next step.

Real-World Scenario

A gaming company stores player achievement records. Each player has a unique ID (the partition key) and each achievement has a unique code (the row key). The query pattern is always: “get all achievements for player X” — a partition query that returns every row in one partition. With 50 million players and 200 million achievement rows, Table Storage handles the load at a fraction of Cosmos DB’s cost because the access pattern is a perfect fit for the partition/row-key model.

Data Model

Table Storage organises data into tables containing entities. An entity is a set of properties. Every entity must have three system properties and can have up to 252 custom properties:

Entity Structure
-----------------
PartitionKey  (string, required) -- groups related entities; determines placement
RowKey        (string, required) -- unique within partition; determines sort order
Timestamp     (datetime, system) -- last update time, managed by Azure

+ up to 252 custom properties of any supported type:
  String, Int32, Int64, Double, Boolean, DateTime, Binary, Guid

Example table: IoT device readings

PartitionKey  | RowKey               | Temperature | Humidity | Location
--------------|----------------------|-------------|----------|----------
device-001    | 2024-06-15T00:00:00Z | 22.4        | 61.2     | Floor-A
device-001    | 2024-06-15T01:00:00Z | 22.1        | 60.8     | Floor-A
device-002    | 2024-06-15T00:00:00Z | 19.7        | 55.0     | Floor-B
device-002    | 2024-06-15T01:00:00Z | 20.2        | 54.6     | Floor-B

With this design, all readings for a device are in one partition, ordered chronologically by row key. A query for “all readings for device-001 between midnight and 2 AM” scans only one partition.
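
The ordering claim can be checked locally without an Azure account. The following is a minimal sketch that models one partition as a sorted list of row keys and selects a time window using string comparison only, which is how RowKey ranges behave. The range_scan helper and the sample keys are illustrative assumptions, not part of any SDK.

```python
from bisect import bisect_left

# Row keys for one partition (device-001), kept in sorted string order,
# mirroring how entities are ordered by RowKey within a partition.
row_keys = sorted([
    "2024-06-15T01:00:00Z",
    "2024-06-14T23:00:00Z",
    "2024-06-15T00:00:00Z",
    "2024-06-15T02:00:00Z",
])

def range_scan(keys, start, end):
    """Return every key k with start <= k < end, by string comparison."""
    lo = bisect_left(keys, start)
    hi = bisect_left(keys, end)
    return keys[lo:hi]

# "All readings between midnight and 2 AM" (upper bound excluded)
window = range_scan(row_keys, "2024-06-15T00:00:00Z", "2024-06-15T02:00:00Z")
print(window)
```

This works only because fixed-width ISO 8601 timestamps in a single time zone sort the same way as strings and as instants. Mixed UTC offsets or varying fractional-second precision would break that property.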

Querying and Its Limitations

Table Storage supports filter expressions using OData syntax, but only filters on PartitionKey and RowKey are served from the index. A filter on any other property, including the system Timestamp, is evaluated by scanning entities:

Fast queries (use partition key):
  PartitionKey eq 'device-001'
  PartitionKey eq 'device-001' and RowKey ge '2024-06-15' and RowKey lt '2024-06-16'

Slow queries (full table scan -- avoid in production):
  Temperature gt 25                          <- no index on Temperature
  Location eq 'Floor-A'                     <- no index on Location
  (queries only on non-key properties scan every entity in the table)

Because there are no secondary indexes, query patterns that require filtering on non-key properties are expensive. This forces a design discipline: choose partition and row keys based on your actual query patterns, not on conceptual data modelling.
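
One way to keep queries on the fast path is to build every filter from the keys. The sketch below assembles the partition-plus-range filter shown earlier; the helper names odata_quote and partition_range_filter are hypothetical, and the only OData rule relied on is that a single quote inside a string literal is escaped by doubling it.

```python
def odata_quote(value: str) -> str:
    """Quote a string literal for an OData filter; single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"

def partition_range_filter(partition_key, row_key_from=None, row_key_to=None):
    """Build a filter that always pins PartitionKey and optionally
    restricts RowKey to the half-open range [row_key_from, row_key_to)."""
    clauses = [f"PartitionKey eq {odata_quote(partition_key)}"]
    if row_key_from is not None:
        clauses.append(f"RowKey ge {odata_quote(row_key_from)}")
    if row_key_to is not None:
        clauses.append(f"RowKey lt {odata_quote(row_key_to)}")
    return " and ".join(clauses)

expr = partition_range_filter("device-001", "2024-06-15", "2024-06-16")
print(expr)
```

Because the helper cannot produce a filter without a PartitionKey clause, a full table scan has to be written deliberately rather than by accident.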

Partition Key Design Patterns

Pattern: Time-series fan-out
  PartitionKey = deviceId (unique per device)
  RowKey       = fixed-width UTC timestamp (oldest first), or inverted ticks
                 (MaxTicks - ticks, zero-padded) when newest-first is needed
  Result: all readings for a device in one partition, in the chosen time order

Pattern: User + resource
  PartitionKey = userId
  RowKey       = resourceId
  Result: all resources for a user in one partition

Pattern: Hot partition antipattern
  PartitionKey = "all"   <- AVOID: all writes go to one partition
  RowKey       = sequential integer
  Result: throttling at high write rates because Azure cannot distribute load

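Hot partitions are the most common Table Storage design mistake. If all writes go to a single partition key, that partition becomes a bottleneck: Table Storage cannot spread the load because it must keep a partition on one storage node for consistency. The documented scalability target for a single partition is up to 2,000 entities per second (for 1 KiB entities), so a design that funnels all traffic through one key hits throttling long before the account-level limits matter.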
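
The inverted-timestamp row key from the time-series pattern can be sketched as follows. This is a minimal illustration, assuming the .NET convention of 100-nanosecond ticks counted from 0001-01-01 and using .NET's DateTime.MaxValue.Ticks as MaxTick; the function names to_ticks and inverted_row_key are hypothetical.

```python
from datetime import datetime, timezone

MAX_TICKS = 3_155_378_975_999_999_999  # .NET DateTime.MaxValue.Ticks
_TICK_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)

def to_ticks(ts: datetime) -> int:
    """100-nanosecond intervals since 0001-01-01T00:00:00Z."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    delta = ts.astimezone(timezone.utc) - _TICK_EPOCH
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 10_000_000 + delta.microseconds * 10

def inverted_row_key(ts: datetime) -> str:
    """Zero-padded (MAX_TICKS - ticks): later timestamps give smaller keys."""
    return f"{MAX_TICKS - to_ticks(ts):019d}"

older = datetime(2024, 6, 15, 0, 0, tzinfo=timezone.utc)
newer = datetime(2024, 6, 15, 1, 0, tzinfo=timezone.utc)
keys = sorted([inverted_row_key(older), inverted_row_key(newer)])
print(keys[0] == inverted_row_key(newer))  # newest reading sorts first
```

The zero-padding matters: row keys are compared as strings, so every key must have the same width for numeric order and string order to agree.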

Working With Table Storage (Python)

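from azure.data.tables import TableServiceClient
from datetime import datetime, timezone

conn_str = "DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;"
service = TableServiceClient.from_connection_string(conn_str)
# create_table_if_not_exists returns a TableClient whether or not the table
# already existed; a plain create_table() raises ResourceExistsError instead.
table = service.create_table_if_not_exists("DeviceReadings")

# Upsert one entity (a plain dict is accepted). The default mode merges
# into an existing entity with the same PartitionKey and RowKey.
entity = {
    "PartitionKey": "device-001",
    # Fixed-width UTC timestamp so row keys sort chronologically as strings
    "RowKey": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "Temperature": 22.4,
    "Humidity": 61.2,
    "Location": "Floor-A",
}
table.upsert_entity(entity)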

# Query: all readings for device-001 on 2024-06-15 (single-partition range scan)
filter_expr = (
    "PartitionKey eq 'device-001' "
    "and RowKey ge '2024-06-15' and RowKey lt '2024-06-16'"
)
for row in table.query_entities(filter_expr):
    print(row["RowKey"], row["Temperature"])

# Batch upsert: atomic, but every entity must share one PartitionKey (max 100 operations)
batch = []
for i in range(10):
    batch.append(("upsert", {
        "PartitionKey": "device-002",
        "RowKey": f"reading-{i:04d}",
        "Temperature": 20.0 + i * 0.1
    }))
table.submit_transaction(batch)
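
Real workloads rarely arrive as ten entities for one partition. The following sketch shows one way to turn an arbitrary list of entities into transaction-sized batches: group by PartitionKey, then split each group into chunks of at most 100 operations. The plan_batches helper is hypothetical and makes no network calls; each returned list has the shape accepted by submit_transaction above.

```python
from collections import defaultdict

MAX_BATCH = 100  # maximum operations in one entity group transaction

def plan_batches(entities, max_batch=MAX_BATCH):
    """Group entities by PartitionKey, then split each group into lists
    of at most max_batch ("upsert", entity) operations."""
    by_partition = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(("upsert", entity))

    batches = []
    for ops in by_partition.values():
        for start in range(0, len(ops), max_batch):
            batches.append(ops[start:start + max_batch])
    return batches

# Sample input: 250 readings for one device, 30 for another
readings = [
    {"PartitionKey": "device-002", "RowKey": f"reading-{i:04d}"} for i in range(250)
] + [
    {"PartitionKey": "device-003", "RowKey": f"reading-{i:04d}"} for i in range(30)
]
batches = plan_batches(readings)
print([len(b) for b in batches])
```

Atomicity applies to each batch separately. If the second chunk for device-002 fails, the first chunk stays committed, so callers that need all-or-nothing behaviour across more than 100 entities have to handle that themselves.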

Table Storage vs. Cosmos DB Table API

Cosmos DB offers a Table API that is wire-compatible with Table Storage SDKs. The same code works against both, but Cosmos DB adds:

Capability              | Table Storage  | Cosmos DB Table API
------------------------|----------------|-----------------------
Secondary indexes       | No             | Yes (all properties)
Global distribution     | No             | Yes (multi-region)
SLA                     | 99.9%          | 99.99%
Throughput model        | Best-effort    | Provisioned RU/s
Latency guarantee       | No             | < 10 ms SLA
Max entity size         | 1 MB           | 2 MB
Automatic indexing      | No             | Yes

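If your application outgrows Table Storage, for example because queries are too slow or you need global replication, the application code can move to the Cosmos DB Table API with little more than a connection string change. The data does not move by itself: you create a Cosmos DB account and copy the entities across, because the two are separate services.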

Key Interview Points

No server-side sorting: Results are returned sorted by PartitionKey then RowKey (both ascending). You cannot request descending order or sort by other properties. Design row keys accordingly (e.g., reverse timestamp for newest-first).
Entity group transactions: Batch operations (up to 100 entities) within a single partition are atomic. Cross-partition batches are not supported — a common gotcha when trying to update entities in two partitions atomically.
1 MB entity limit: Each entity (with all its properties) cannot exceed 1 MB. For entities with large blobs, store the large content in Blob Storage and put a reference URI in the Table entity.
Access from Cosmos DB API: If you create a Table Storage table in a GPv2 storage account and later want Cosmos DB features, you must create a new Cosmos DB account and migrate data. They are separate services despite API compatibility.
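Pricing: Table Storage is dramatically cheaper than Cosmos DB for read-heavy workloads with simple key access. At 10 billion entities and 100 million reads per month, Table Storage can cost 10-50x less, depending on region, redundancy option, and how much Cosmos DB throughput would have to be provisioned.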
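
The blob-reference approach from the 1 MB entity limit point can be sketched as below. This is an illustration under stated assumptions: the upload callable is a hypothetical stand-in for whatever writes to Blob Storage and returns a URI, the property names Payload, PayloadUri, and PayloadSize are invented for the example, and the 64 KiB threshold reflects the per-property cap on binary values.

```python
from typing import Callable

INLINE_LIMIT = 64 * 1024  # a single binary property holds at most 64 KiB

def build_entity(partition_key: str, row_key: str, payload: bytes,
                 upload: Callable[[str, bytes], str]) -> dict:
    """Keep small payloads inline; store large ones elsewhere and keep a URI."""
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        "PayloadSize": len(payload),
    }
    if len(payload) <= INLINE_LIMIT:
        entity["Payload"] = payload
    else:
        blob_name = f"{partition_key}/{row_key}"
        entity["PayloadUri"] = upload(blob_name, payload)
    return entity

def fake_upload(name: str, data: bytes) -> str:
    """Hypothetical uploader used only for this demo; performs no I/O."""
    return f"https://example.invalid/blobs/{name}"

small = build_entity("p", "r1", b"x" * 10, fake_upload)
large = build_entity("p", "r2", b"x" * (INLINE_LIMIT + 1), fake_upload)
print(sorted(small), sorted(large))
```

In a real system the blob should be written before the entity, so that a reader never finds a URI pointing at content that does not exist yet.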

Best Practices

Design the partition key around your most common query predicate — not around logical grouping that matches a relational schema.
Avoid large partitions (tens of millions of entities in one partition) as they create hotspots during range scans and slow down delete operations.
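Use the upsert_entity operation instead of separate insert/update calls when last-writer-wins is acceptable; it needs no prior read and no ETag check. When concurrent writers must not overwrite each other, use update_entity with an ETag condition instead.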
Implement retry logic using the Azure SDK's built-in exponential backoff policy. When a partition or account is throttled, Table Storage returns 503 (Server Busy) or 500 (Operation Timeout) rather than the 429 that Cosmos DB uses, and the SDK retries these transparently.
Review access patterns quarterly; if cross-entity queries are becoming common, evaluate whether migrating to Cosmos DB or Azure SQL would reduce complexity and operational cost despite higher per-unit pricing.
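
To make the retry recommendation concrete, the sketch below shows what capped exponential backoff looks like in isolation. It is not the SDK's implementation, which production code should rely on; the names backoff_delays, call_with_retry, and ServerBusy are hypothetical, and the demo records delays instead of sleeping.

```python
import time

def backoff_delays(attempts, base=1.0, factor=2.0, cap=30.0):
    """Delay before each retry: base * factor**i, capped at `cap` seconds."""
    return [min(cap, base * factor ** i) for i in range(attempts)]

def call_with_retry(operation, is_transient, delays, sleep=time.sleep):
    """Run operation(); on a transient error wait and retry, once per delay."""
    for delay in delays:
        try:
            return operation()
        except Exception as exc:
            if not is_transient(exc):
                raise  # permanent errors are not retried
            sleep(delay)
    return operation()  # last attempt: any error propagates to the caller

class ServerBusy(Exception):
    """Hypothetical stand-in for a throttling response."""

calls = {"n": 0}

def flaky_write():
    """Fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ServerBusy()
    return "ok"

waited = []
result = call_with_retry(
    flaky_write,
    is_transient=lambda exc: isinstance(exc, ServerBusy),
    delays=backoff_delays(4),
    sleep=waited.append,  # record delays instead of sleeping in this demo
)
print(result, waited)
```

Real policies usually add random jitter to each delay so that many throttled clients do not retry in lockstep.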