Step 3: Storage & Databases
Storage and database questions account for a significant portion of the SAA-C03 exam. The key is knowing which service fits which scenario: not just what each service does, but when the exam expects you to choose it over the alternatives.
S3: Simple Storage Service
S3 is object storage: you store files (objects) inside containers (buckets). No file system, no block device, just HTTP PUT/GET against a key-value store that scales to exabytes.
S3 Storage Classes (Hierarchy by Cost)
S3 Standard: Frequent access. 11 9s durability. 3 AZ replication.
S3 Intelligent-Tiering: Auto-moves objects between tiers based on access.
S3 Standard-IA: Infrequent Access. Lower storage cost, retrieval fee.
S3 One Zone-IA: Like Standard-IA but only 1 AZ. 20% cheaper, less resilient.
S3 Glacier Instant: Archive, millisecond retrieval.
S3 Glacier Flexible: Archive, minutes-to-hours retrieval.
S3 Glacier Deep Archive: Cheapest. 12-hour retrieval. Compliance/long-term backup.
S3 Intelligent-Tiering automatically moves objects that haven't been accessed for 30+ days into cheaper tiers; it's the low-management answer for mixed-access workloads.
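As a study aid, the list above can be condensed into a small lookup. This is only a sketch: the function name and the `access` / `retrieval` labels are made up for illustration, while the returned strings are the storage class identifiers used by the S3 API.

```python
def pick_storage_class(access: str, retrieval: str = "instant",
                       single_az_ok: bool = False) -> str:
    """Map an access pattern to an S3 storage class identifier.

    access:    'frequent' | 'infrequent' | 'archive' | 'unknown'  (hypothetical labels)
    retrieval: 'instant' | 'hours' | 'half-day'  (only used for 'archive')
    """
    if access == "unknown":
        # Mixed or unpredictable access: let S3 move objects between tiers.
        return "INTELLIGENT_TIERING"
    if access == "frequent":
        return "STANDARD"
    if access == "infrequent":
        # One Zone-IA trades resilience (a single AZ) for a lower price.
        return "ONEZONE_IA" if single_az_ok else "STANDARD_IA"
    if access == "archive":
        by_retrieval = {
            "instant": "GLACIER_IR",      # Glacier Instant Retrieval
            "hours": "GLACIER",           # Glacier Flexible Retrieval
            "half-day": "DEEP_ARCHIVE",   # Glacier Deep Archive
        }
        return by_retrieval[retrieval]
    raise ValueError(f"unknown access pattern: {access!r}")


print(pick_storage_class("archive", retrieval="half-day"))
```

The exam rarely asks for the identifier itself; the point is the order of the questions: access frequency first, then resilience, then acceptable retrieval time.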
S3 Lifecycle Policies
Automate transitions and expiration:
Upload → S3 Standard (day 0)
After 30 days → S3 Standard-IA
After 90 days → S3 Glacier Instant Retrieval
After 365 days → S3 Glacier Deep Archive
After 2555 days (7 years) → Delete (expired)
S3 Key Features
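Lifecycle rules are themselves a bucket feature, configured as a JSON document. The schedule shown above could be sketched as the dictionary below, which follows the shape of the lifecycle configuration that S3 accepts (for example through boto3's `put_bucket_lifecycle_configuration`). The rule ID is a hypothetical name, and `storage_class_at` is a helper written only for this illustration, not an AWS API.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-then-expire",      # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},      # empty prefix: applies to every object
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER_IR"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
            "Expiration": {"Days": 2555},  # roughly 7 years
        }
    ]
}


def storage_class_at(age_days, config=lifecycle_configuration):
    """Return the storage class an object would be in at a given age,
    or None once the rule has expired (deleted) it."""
    rule = config["Rules"][0]
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (0, 45, 100, 400, 3000):
    print(age, storage_class_at(age))
```

The helper only simulates the schedule locally; in a real bucket S3 evaluates the rules itself, typically once a day, so transitions are not instantaneous.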
Versioning: Keeps all versions of every object. Once enabled on a bucket, you can only suspend (not disable) it. Protects against accidental deletes and overwrites.
Replication: Cross-Region Replication (CRR) copies objects to a bucket in a different region (compliance, lower latency for remote users). Same-Region Replication (SRR) copies within the same region (log aggregation, live replication between accounts). Requires versioning on both buckets.
S3 Transfer Acceleration: Routes uploads through CloudFront edge locations → AWS backbone → destination bucket. Speeds up large uploads from globally distributed users.
Multipart Upload: Required for objects > 5 GB, recommended for > 100 MB. Enables parallel upload of parts.
Static Website Hosting: S3 can host a static website (HTML/CSS/JS) directly. No EC2 needed. Combine with CloudFront for HTTPS and global edge caching.
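The multipart thresholds above lend themselves to a quick calculation. The sketch below uses the 5 GB and 100 MB figures from the text; the 10,000-part cap and the 5 MiB minimum part size are S3 limits that the text does not mention, and the 64 MiB default part size is an arbitrary choice for illustration.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

SINGLE_PUT_LIMIT = 5 * GIB          # above this, multipart is required
RECOMMENDED_THRESHOLD = 100 * MIB   # above this, multipart is recommended
MAX_PARTS = 10_000                  # S3 limit (not stated in the text)
MIN_PART_SIZE = 5 * MIB             # S3 minimum for every part except the last


def plan_upload(size_bytes, part_size=64 * MIB):
    """Decide between a single PUT and a multipart upload, and count the parts."""
    if size_bytes <= RECOMMENDED_THRESHOLD:
        return {"mode": "single_put", "required": False,
                "part_size": size_bytes, "parts": 1}
    # Grow the part size if needed so the upload fits within MAX_PARTS.
    part_size = max(part_size, MIN_PART_SIZE, math.ceil(size_bytes / MAX_PARTS))
    return {
        "mode": "multipart",
        "required": size_bytes > SINGLE_PUT_LIMIT,
        "part_size": part_size,
        "parts": math.ceil(size_bytes / part_size),
    }


print(plan_upload(1 * GIB))   # multipart recommended: 16 parts of 64 MiB
print(plan_upload(6 * GIB))   # multipart required: 96 parts of 64 MiB
```

Because the parts are independent, they can be uploaded in parallel and a failed part can be retried on its own, which is where the speed-up for large objects comes from.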
S3 Security
Bucket Policy: Resource-based JSON policy. Controls who can access the bucket.
ACLs: Legacy. Avoid unless required for specific cross-account cases.
Block Public Access: Account-level or bucket-level override. Prevents accidental exposure.
Server-Side Encryption:
SSE-S3: AWS manages keys (AES-256). Default since Jan 2023.
SSE-KMS: You manage keys in KMS. Audit trail. Extra cost.
SSE-C: You provide keys per request. AWS never stores your key.
Presigned URLs: Temporarily grant access to a private object. Time-limited (default 1 hour; max 7 days when signed with IAM user credentials, while a URL signed with temporary STS credentials stops working when those credentials expire). The exam loves this for "allow a third party to download a private object without making it public."
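To show what "resource-based JSON policy" looks like in practice, here is a minimal sketch of a bucket policy built as a Python dictionary. It uses a common pattern (deny any request that does not arrive over HTTPS); the bucket name is a placeholder, and the policy is an illustration rather than a recommended baseline for any particular workload.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            # Bucket-level and object-level ARNs are listed separately.
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
            # aws:SecureTransport is "false" when the request is plain HTTP.
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

An explicit Deny always wins over any Allow, which is why guardrails like this one are usually written as Deny statements.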
Block & File Storage: EBS, Instance Store, EFS
EBS: Elastic Block Store
Network-attached block storage for EC2. Persists independently of instance lifecycle (unlike instance store).
gp3 (General Purpose SSD): Default. 3,000 IOPS baseline, up to 16,000 IOPS.
io2 Block Express: Highest performance. Up to 256,000 IOPS. Multi-Attach.
st1 (Throughput HDD): Sequential reads. Big data, log processing. Low cost.
sc1 (Cold HDD): Cheapest. Infrequent access archival.
EBS Multi-Attach: io2 volumes can attach to multiple EC2 instances in the same AZ simultaneously. Requires a cluster-aware file system.
EBS Snapshots: Point-in-time backup stored in S3 (but you access via EBS APIs). Incremental after first snapshot. Can copy snapshots cross-region for DR.
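"Incremental after first snapshot" is easier to picture with a toy model. The sketch below treats a volume as a dictionary of block number to contents; it is a deliberately simplified illustration (it ignores deleted blocks, for instance) and says nothing about how EBS implements snapshots internally.

```python
def take_snapshot(volume, previous_snapshots):
    """Store only the blocks that changed since the earlier snapshots."""
    already_captured = {}
    for snap in previous_snapshots:
        already_captured.update(snap)
    return {block: data for block, data in volume.items()
            if already_captured.get(block) != data}


def restore(snapshots):
    """Rebuild a volume by replaying snapshots from oldest to newest."""
    volume = {}
    for snap in snapshots:
        volume.update(snap)
    return volume


volume = {0: "boot", 1: "logs-v1", 2: "data"}
first = take_snapshot(volume, [])          # full copy: all 3 blocks
volume[1] = "logs-v2"                      # one block changes
second = take_snapshot(volume, [first])    # incremental: only block 1

print(len(first), len(second))             # 3 1
print(restore([first, second]) == volume)  # True
```

The second snapshot is small and fast, yet the chain still restores the full volume, which is the property the exam expects you to know.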
Instance Store
Physically attached NVMe SSDs on the EC2 host. Fastest possible storage (sub-millisecond), but ephemeral: data is lost when the instance stops, terminates, or the underlying host fails. Use for temporary data: cache, buffers, scratch space.
EFS: Elastic File System
Managed NFS file system mountable by multiple EC2 instances simultaneously, even across AZs. This is the key differentiator from EBS (a volume lives in a single AZ).
| EBS | EFS |
|---|---|
| Single EC2 instance | Multiple instances + AZs |
| Single AZ | Regional (multi-AZ) |
| Block storage | File system (NFS) |
| Fixed provisioned size | Scales automatically |
| Faster for single-instance | Better for shared workloads |

EFS Infrequent Access (EFS-IA): Automatically moves files not accessed for 30+ days to a cheaper storage tier. Savings up to 92% vs standard.
RDS: Relational Database Service
RDS manages the database engine for you: patching, backups, storage scaling, and failover. You choose the engine, AWS handles the rest.
Supported Engines
MySQL, PostgreSQL, MariaDB, Oracle, Microsoft SQL Server, and Amazon Aurora.
Multi-AZ vs Read Replicas (Most Tested RDS Topic)
| Multi-AZ | Read Replicas |
|---|---|
| Synchronous replication | Asynchronous replication |
| Same region (different AZ) | Same region, cross-region, or cross-account |
| Standby is NOT accessible | Read replica IS accessible (read traffic) |
| Automatic failover (~60 seconds) | No automatic failover (promote manually) |
| Purpose: HIGH AVAILABILITY | Purpose: READ SCALABILITY + REPORTING |
| DNS endpoint stays the same | Different endpoint per replica |

Exam rule: If the question mentions "failover", "disaster recovery", "availability" → Multi-AZ. If it mentions "read-heavy", "analytics offload", "reporting" → Read Replica.
A Multi-AZ standby always sits in a different AZ of the same region; it never spans regions. For cross-region HA, use Aurora Global Database or RDS cross-region read replicas (then promote in a DR event).
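The practical difference between synchronous and asynchronous replication is what each copy contains at the moment the primary fails. The toy model below is a sketch under simplifying assumptions: the class and method names are invented, and real replication ships log records rather than Python list items.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    primary: list = field(default_factory=list)
    standby: list = field(default_factory=list)   # Multi-AZ standby (synchronous)
    replica: list = field(default_factory=list)   # read replica (asynchronous)
    pending: list = field(default_factory=list)   # changes not yet shipped to the replica

    def write(self, row):
        self.primary.append(row)
        self.standby.append(row)   # sync: applied before the write is acknowledged
        self.pending.append(row)   # async: shipped some time later

    def ship_to_replica(self):
        self.replica.extend(self.pending)
        self.pending.clear()


db = Database()
db.write("order-1")
db.write("order-2")

# If the primary failed right now:
print(db.standby)   # ['order-1', 'order-2']  nothing lost on failover
print(db.replica)   # []                      replica is lagging behind

db.ship_to_replica()
print(db.replica)   # ['order-1', 'order-2']
```

This is why the standby is the failover target and the replica is the reporting target: the standby is never behind, while the replica can serve reads but may return slightly stale data.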
Amazon Aurora
AWS's proprietary database engine, compatible with MySQL and PostgreSQL but architecturally different:
Aurora Storage: distributed across 3 AZs, 6 copies of data, auto-heals
Aurora Compute: up to 15 read replicas (vs 5 for regular RDS)
Aurora Failover: ~30 seconds (vs ~60 for regular RDS Multi-AZ)
Aurora Cost: ~20% more than RDS, but up to 5x the throughput of standard MySQL (AWS's own figure)
Aurora Serverless v2: Scales compute in fine-grained increments based on actual load. Ideal for variable or unpredictable workloads and dev/test environments. Billed by the second for the ACUs (Aurora Capacity Units) in use.
Aurora Global Database: One primary region, up to 5 read-only secondary regions. Replication lag < 1 second. In a regional disaster, promote a secondary to primary in < 1 minute. This is the exam answer for "cross-region DR with sub-second RPO for relational data."
DynamoDB: Serverless NoSQL
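The "6 copies across 3 AZs" storage layout mentioned above is what lets Aurora survive an AZ outage. The sketch below assumes the quorum sizes from AWS's published Aurora design (4 of 6 copies to write, 3 of 6 to read), which the text itself does not state; the function name is made up for illustration.

```python
AZS = 3
COPIES_PER_AZ = 2
WRITE_QUORUM = 4   # assumption: from AWS's published Aurora design
READ_QUORUM = 3    # assumption: from AWS's published Aurora design


def storage_availability(failed_azs=0, extra_failed_copies=0):
    """Report whether the storage layer can still read and write."""
    healthy = COPIES_PER_AZ * (AZS - failed_azs) - extra_failed_copies
    return {
        "healthy_copies": healthy,
        "can_write": healthy >= WRITE_QUORUM,
        "can_read": healthy >= READ_QUORUM,
    }


print(storage_availability())       # 6 healthy: reads and writes
print(storage_availability(1))      # 4 healthy: still reads and writes
print(storage_availability(1, 1))   # 3 healthy: reads only
```

Losing a whole AZ leaves writes available, and losing an AZ plus one more copy still leaves the data readable while the storage layer repairs itself.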
DynamoDB is a fully managed key-value and document database. No servers to manage, no schema migrations, single-digit millisecond performance at any scale.
Data Model
Every item has a partition key (required). Optionally add a sort key (creates a composite primary key enabling range queries).
Table: Orders
Partition Key: CustomerId
Sort Key: OrderDate

| CustomerId | OrderDate | Total | Status |
|---|---|---|---|
| user_123 | 2026-01-15 | $142 | Shipped |
| user_123 | 2026-03-01 | $67 | Pending |
| user_456 | 2026-02-20 | $890 | Delivered |

Capacity Modes
Provisioned: You set Read Capacity Units (RCU) and Write Capacity Units (WCU). Enable auto-scaling to adjust automatically. Best for predictable, sustained traffic.
On-Demand: Pay per request. No capacity planning needed. Best for variable, spiky, or unknown traffic patterns. More expensive per request but no waste.
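Provisioned mode means doing the capacity arithmetic yourself. The sketch below relies on the standard unit definitions, which the text does not spell out: one RCU covers one strongly consistent read per second of an item up to 4 KB (or two eventually consistent reads), and one WCU covers one write per second of an item up to 1 KB. The function names are illustrative.

```python
import math


def read_capacity_units(item_kb, reads_per_sec, strongly_consistent=True):
    """RCUs needed: item size is rounded up to the next 4 KB per read."""
    units_per_read = math.ceil(item_kb / 4)
    total = units_per_read * reads_per_sec
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_kb, writes_per_sec):
    """WCUs needed: item size is rounded up to the next 1 KB per write."""
    return math.ceil(item_kb) * writes_per_sec


print(read_capacity_units(6, 100))                             # 200
print(read_capacity_units(6, 100, strongly_consistent=False))  # 100
print(write_capacity_units(2.5, 10))                           # 30
```

A 6 KB item rounds up to two 4 KB read units, so 100 strongly consistent reads per second need 200 RCU. With On-Demand you skip this sizing step and pay per request instead.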
Secondary Indexes
Global Secondary Index (GSI): Query on any non-key attribute. Has its own partition/sort key. Queries are eventually consistent only. Can be added/removed after table creation.
Local Secondary Index (LSI): Same partition key, different sort key. Strongly consistent queries possible. Must be defined at table creation. Max 10 GB per partition key value.
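To make the key and index mechanics concrete, here is a toy in-memory model of the Orders table from the Data Model section. It is a sketch only: `build_index` and `query` are invented helpers that mimic how a partition key selects a group of items and a sort key orders them, not the DynamoDB API.

```python
from collections import defaultdict

orders = [
    {"CustomerId": "user_123", "OrderDate": "2026-01-15", "Total": 142, "Status": "Shipped"},
    {"CustomerId": "user_123", "OrderDate": "2026-03-01", "Total": 67, "Status": "Pending"},
    {"CustomerId": "user_456", "OrderDate": "2026-02-20", "Total": 890, "Status": "Delivered"},
]


def build_index(items, partition_key, sort_key):
    """Group items by partition key, each group ordered by sort key."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_key]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item[sort_key])
    return index


def query(index, sort_key, pk_value, sk_from=None, sk_to=None):
    """Return one partition's items, optionally limited to a sort-key range."""
    return [
        item for item in index.get(pk_value, [])
        if (sk_from is None or item[sort_key] >= sk_from)
        and (sk_to is None or item[sort_key] <= sk_to)
    ]


table = build_index(orders, "CustomerId", "OrderDate")     # base table
status_gsi = build_index(orders, "Status", "OrderDate")    # GSI: new partition key
total_lsi = build_index(orders, "CustomerId", "Total")     # LSI: same partition key, new sort key

print(query(table, "OrderDate", "user_123", sk_from="2026-02-01"))
print(query(status_gsi, "OrderDate", "Pending"))
print(query(total_lsi, "Total", "user_123"))
```

Every query starts from exactly one partition key value. That is why asking "all Pending orders" needs the GSI, while "this customer's orders sorted by total" only needs an LSI.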
DynamoDB Accelerator (DAX)
In-memory cache specifically for DynamoDB. Reduces read latency from milliseconds to microseconds. API-compatible with DynamoDB: you point the application at the DAX client instead of rewriting queries, so code changes are minimal. Use for read-heavy, latency-sensitive workloads.
DynamoDB Streams
Captures a time-ordered sequence of item modifications. Enables event-driven architectures:
DynamoDB Table → DynamoDB Stream (records kept for 24 hours) → Lambda Function → SNS / SQS / OpenSearch (formerly Elasticsearch)
Common exam use: "Trigger a process whenever an item is added or modified in a DynamoDB table" → DynamoDB Streams + Lambda.
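On the Lambda side, the "Streams + Lambda" pattern comes down to looping over the records in the incoming event. The handler below is a minimal sketch: the event fields (`Records`, `eventName`, `dynamodb.Keys`, `dynamodb.NewImage`) follow the stream record format, while the sample event and the returned summary are invented for illustration. `NewImage` is only present when the stream's view type includes new images.

```python
def handler(event, context):
    """Collect the items that were added or modified in this batch."""
    changes = []
    for record in event.get("Records", []):
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue  # skip REMOVE events
        change = record["dynamodb"]
        changes.append({
            "event": record["eventName"],
            "keys": change["Keys"],
            "new_image": change.get("NewImage", {}),
        })
    # A real function would publish to SNS/SQS or index the item here.
    return {"processed": len(changes), "changes": changes}


sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"CustomerId": {"S": "user_123"}, "OrderDate": {"S": "2026-03-01"}},
                "NewImage": {"CustomerId": {"S": "user_123"}, "OrderDate": {"S": "2026-03-01"},
                             "Status": {"S": "Pending"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"CustomerId": {"S": "user_456"}, "OrderDate": {"S": "2026-02-20"}}},
        },
    ]
}

print(handler(sample_event, None)["processed"])  # 1
```

Attribute values arrive in DynamoDB's typed JSON form (`{"S": "..."}` for strings), so downstream code usually unwraps them before use.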
Storage Decision Matrix for the Exam
| Scenario | Best Answer |
|---|---|
| Store large media files, pay per GB | S3 |
| Single EC2 needs fast persistent disk | EBS gp3 |
| Multiple EC2 instances share a file system | EFS |
| Temporary scratch space, maximum speed | Instance Store |
| Relational DB, automatic failover, same region | RDS Multi-AZ |
| Relational DB, offload analytics queries | RDS Read Replica |
| Relational DB, cross-region DR, sub-second RPO | Aurora Global Database |
| NoSQL, millisecond latency, serverless | DynamoDB |
| DynamoDB latency must be microseconds | DynamoDB + DAX |
| Archive data, rare retrieval, cheapest | S3 Glacier Deep Archive |