Cloud  /  AWS

AWS Amazon Web Services 61 guides · updated 2026

Hands-on guides to compute, storage, databases, networking, and serverless on the world's most widely adopted cloud platform.

DynamoDB Streams: Real-Time Change Data Capture for Event-Driven Architectures

Applications often need to react when data changes — send a notification when an order is placed, update a search index when a product price changes, populate a reporting database when a new user registers. The naive approach is to poll the table repeatedly looking for changes. DynamoDB Streams offers something better: a reliable, ordered, time-limited log of every change made to a table.

What Streams Captures

When you enable DynamoDB Streams on a table, every write operation — PutItem, UpdateItem, DeleteItem, and transactional writes — generates a stream record. Stream records are:

Each stream record describes one item-level change. You configure how much data each record contains through the stream view type:

Stream View Types
==================
KEYS_ONLY Only the key attributes (PK and SK) of the changed item
NEW_IMAGE The entire item as it looks after the change
OLD_IMAGE The entire item as it looked before the change
NEW_AND_OLD_IMAGES Both the before and after state of the item
Most useful view types:
├── NEW_IMAGE for building downstream indexes or search
├── OLD_IMAGE for audit logs tracking what was deleted
└── NEW_AND_OLD_IMAGES for detecting what specifically changed

Architecture: Streams + Lambda

The most common pattern is connecting DynamoDB Streams to AWS Lambda. AWS manages the polling — Lambda polls the stream shards on your behalf, batches records together, and invokes your function when records are available. You do not write polling code; you write the handler.

DynamoDB Streams → Lambda Pattern
==================================
DynamoDB Table
│ (every write creates a stream record)
DynamoDB Stream (shards, 24-hour retention)
│ (Lambda polls; AWS manages this connection)
Lambda Function (event source mapping)
├──► Send SNS notification (order confirmation email)
├──► Update Elasticsearch/OpenSearch index (search sync)
├──► Write to Redshift (analytics pipeline)
├──► Update a DynamoDB aggregation table (counter update)
└──► Trigger Step Functions workflow
Lambda invocation batches up to 10,000 records at once
Batch size and parallelization factor are configurable

The event source mapping handles retry behavior: if your Lambda function fails, AWS retries the same batch. Items are processed in order within each shard. This ordering guarantee matters — if you have two updates to the same item, you want to process the first before the second.

Event-Driven Use Cases

Order processing pipeline: an e-commerce table has new orders written when customers check out. A Lambda function reads the stream, sends a confirmation email via SES, decrements inventory counts in another table, and puts a message on SQS for the warehouse fulfillment system. No code in the order write path is responsible for these downstream effects — the stream fans out the event.

Search index synchronization: a product catalog table is the system of record. A Lambda function processes stream records and writes updates to an OpenSearch index. The catalog team writes only to DynamoDB; search stays in sync automatically with under a second of lag.

Audit logging: a sensitive financial table has a Lambda function reading its stream with NEW_AND_OLD_IMAGES view type. Every change — what was there before, what it changed to, and the approximate timestamp — is written to an immutable audit log in S3 or CloudWatch Logs. No developer can delete this trail because it is written by the stream consumer, not the application making the changes.

Cross-table aggregations: a votes table records individual votes in a real-time poll. Rather than counting votes with an expensive Scan every time someone views results, a Lambda function processes the stream and maintains running totals in a separate aggregations table. Reads of the aggregations table are fast key lookups.

Streams vs Polling: Why It Matters

Without streams, reacting to changes requires polling:

Polling Approach (without streams)
===================================
Every 30 seconds:
scan = dynamodb.scan(table, filter="updated_at > last_poll_time")
for item in scan:
process(item)
Problems:
- Expensive (Scan reads entire table)
- Requires a timestamp attribute on every item
- Race conditions if updates happen between polls
- Deleted items are invisible — you can't scan for deletions
- Latency: up to 30 seconds between event and reaction
Streams Approach:
- Lambda invoked within seconds of change
- Captures inserts, updates, and deletes
- No table schema modification required
- Ordered delivery within partition
- Cost: per shard-hour + Lambda invocations

Enabling Streams

Streams is an optional feature, disabled by default. Enable it on an existing table without any downtime:

Terminal window
aws dynamodb update-table \
--table-name Orders \
--stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES

Then create the Lambda event source mapping:

Terminal window
aws lambda create-event-source-mapping \
--function-name OrderProcessor \
--event-source-arn arn:aws:dynamodb:us-east-1:123456789012:table/Orders/stream/2025-06-15T00:00:00.000 \
--starting-position LATEST \
--batch-size 100

Error Handling and Ordering

A shard processes records for a subset of partition keys. If your Lambda function throws an exception, the batch retries until it succeeds or until the records expire from the stream (24 hours). This is important to design around: an unhandled error in a Lambda function can block all subsequent stream records for that shard until the failing batch is processed or expires.

Best practices:

Key Interview Points