SAP-C02 New Solutions Design: Global, Event-Driven Architecture


Step 2: New Solutions Design

Give ten Professional-level candidates the same greenfield requirements and you’ll get ten different architecture diagrams, and the exam is fine with that, because it isn’t grading a single correct diagram. It’s grading whether your choices trace back to the stated constraints: latency budget, consistency requirements, failure tolerance, cost ceiling. This step works through the recurring building blocks you’ll assemble differently depending on which constraint is dominant.


Compute Selection Is a Tradeoff Table, Not a Checklist

At Associate level, “when do I use Lambda vs EC2” has a reasonably short answer. At Professional level, the question arrives buried inside a paragraph of business requirements, and you have to extract it yourself. Frame every compute decision along four axes: control, operational overhead, cost model, and startup latency.

Compute Option     | Control Level           | Ops Overhead | Cold Start Concern  | Best Fit
EC2 (self-managed) | Full OS access          | High         | None                | Licensing-bound software, custom kernels
ECS/EKS on EC2     | Container orchestration | Medium       | Low                 | Existing container investment, need for GPU/specialized instances
Fargate            | None below task         | Low          | Low-medium          | Containerized workloads without capacity planning
Lambda             | None                    | Lowest       | Can matter at scale | Event-driven, spiky, sub-15-minute execution

A pattern that shows up repeatedly in scenario questions: a company migrating from EC2 wants “less operational burden” but also needs GPU instances for an ML inference workload. Fargate doesn’t support arbitrary GPU types the way EC2-backed ECS/EKS does, so the “least operational overhead” answer isn’t always the right one: the constraint (GPU) eliminates it. Always let the hard constraint prune the option list before you optimize for the soft one (operational simplicity).
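The prune-then-optimize order described above can be sketched in a few lines of Python. This is only an illustration: the `supports_gpu` flag, the numeric `ops_overhead` ranks, and the `pick_compute` helper are assumptions made for this sketch that loosely restate the table, not an AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeOption:
    name: str
    ops_overhead: int   # hypothetical rank: lower means less to operate
    supports_gpu: bool  # simplified flag taken from the GPU scenario


OPTIONS = [
    ComputeOption("EC2 (self-managed)", ops_overhead=4, supports_gpu=True),
    ComputeOption("ECS/EKS on EC2", ops_overhead=3, supports_gpu=True),
    ComputeOption("Fargate", ops_overhead=2, supports_gpu=False),
    ComputeOption("Lambda", ops_overhead=1, supports_gpu=False),
]


def pick_compute(options, needs_gpu):
    # Step 1: the hard constraint prunes the option list.
    candidates = [o for o in options if o.supports_gpu or not needs_gpu]
    # Step 2: the soft preference (operational simplicity) ranks the survivors.
    return min(candidates, key=lambda o: o.ops_overhead)


print(pick_compute(OPTIONS, needs_gpu=True).name)   # ECS/EKS on EC2
print(pick_compute(OPTIONS, needs_gpu=False).name)  # Lambda
```

Reversing the two steps (ranking first, filtering second) is the mistake the scenario is built to catch: it would surface Lambda or Fargate before the GPU requirement had a chance to remove them.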


Designing for Global Applications

A single-Region deployment is the default assumption at Associate level. At Professional level, “customers in Tokyo, Frankfurt, and São Paulo all need sub-100ms reads” is a normal opening sentence, and it forces a genuinely different topology.

                    Route 53 (latency-based routing)
                                    │
        ┌───────────────────────────┼───────────────────────────┐
        │                           │                           │
  ┌─────▼─────┐             ┌───────▼──────┐            ┌───────▼────────┐
  │ us-east-1 │             │ eu-central-1 │            │ ap-northeast-1 │
  │    ALB    │             │     ALB      │            │      ALB       │
  │    ECS    │             │     ECS      │            │      ECS       │
  └─────┬─────┘             └───────┬──────┘            └───────┬────────┘
        │                           │                           │
  ┌─────▼───────────────────────────▼───────────────────────────▼────────┐
  │        DynamoDB Global Table (multi-active, last-writer-wins)        │
  └──────────────────────────────────────────────────────────────────────┘

DynamoDB Global Tables replicate a table across Regions with active-active writes: any Region can accept a write, and replication catches the others up within roughly a second under normal conditions. The tradeoff the exam wants you to articulate: this buys you low local write latency everywhere at the cost of eventual consistency and the possibility of conflicting writes resolved by last-writer-wins. If your application cannot tolerate that (financial ledgers, inventory counts that must never go negative), Global Tables is the wrong answer regardless of how attractive the latency profile looks.
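To see why last-writer-wins is a problem for something like an inventory count, here is a toy model in Python. It is not how DynamoDB implements replication internally; the `Version` type, the timestamps, and the Region-name tie-break are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: int        # stock level written by this Region
    timestamp: float  # write time recorded by the accepting Region
    region: str


def last_writer_wins(a, b):
    # Keep the version with the later timestamp.
    # Breaking ties on Region name is an assumption of this sketch.
    return max(a, b, key=lambda v: (v.timestamp, v.region))


initial_stock = 5

# Two Regions read stock=5 and sell concurrently before replication catches up.
tokyo = Version(value=initial_stock - 3, timestamp=100.00, region="ap-northeast-1")
frankfurt = Version(value=initial_stock - 4, timestamp=100.25, region="eu-central-1")

winner = last_writer_wins(tokyo, frankfurt)
units_sold = 3 + 4

print(winner.value)                # 1
print(units_sold > initial_stock)  # True
```

Both Regions converge on a stock level of 1, yet seven units were sold against five in stock. The earlier write is silently discarded rather than merged, which is the behavior a ledger or a must-not-go-negative counter cannot accept.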

Aurora Global Database takes the opposite shape: one primary Region handles all writes, and one or more secondary Regions get physical-layer replication with typically sub-second lag, providing fast local reads everywhere and a disaster-recovery target with a low RPO. It does not give you multi-Region writes. When a scenario says “reporting queries in three Regions, but all order processing happens centrally,” Aurora Global Database is the fit; when it says “each Region needs to accept writes independently,” you’re back to Global Tables or a custom conflict-resolution scheme.
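The two scenario phrasings above reduce to a small decision function. The sketch below encodes only the distinctions made in this section; the function name and parameters are hypothetical, and a real design would weigh many more factors (data model, cost, existing engine).

```python
def pick_global_datastore(needs_multi_region_writes, tolerates_last_writer_wins):
    # Central writes with fast local reads: single-writer replication is enough.
    if not needs_multi_region_writes:
        return "Aurora Global Database"
    # Independent writes per Region, and conflicts may be resolved by timestamp.
    if tolerates_last_writer_wins:
        return "DynamoDB Global Tables"
    # Independent writes, but silently dropped writes are unacceptable.
    return "custom conflict-resolution scheme"


# "Reporting queries in three Regions, all order processing central"
print(pick_global_datastore(False, False))  # Aurora Global Database
# "Each Region needs to accept writes independently"
print(pick_global_datastore(True, True))    # DynamoDB Global Tables
```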

Global Accelerator solves a different layer of the problem entirely: it’s not a data replication tool, it’s a network entry point. It anycasts two static IPs from the AWS global network edge, then routes user traffic over AWS’s backbone rather than the public internet, improving both latency and jitter, and it fails traffic over to a healthy Region automatically if an endpoint group becomes unhealthy. Compare that against CloudFront, which is built for caching content at edge locations. Global Accelerator is for TCP/UDP traffic that isn’t cacheable, like gaming, VoIP, or API traffic that needs consistent low-latency routing rather than content caching.


Event-Driven Architecture at Enterprise Scale

Once you have more than a handful of services, direct service-to-service calls create a dependency graph nobody can reason about. Event-driven design decouples producers from consumers through an intermediary, and the Professional exam expects fluency in choosing the right intermediary for the traffic shape.

Order Service ──event──▶ EventBridge Bus ──rule──▶ Inventory Service
                                │
                                ├──rule──▶ Notification Service (SQS)
                                │
                                └──rule──▶ Analytics Pipeline (Kinesis Firehose)

SNS fans a single message out to multiple subscribers immediately: think alerting, or triggering several independent Lambda functions off one event. SQS buffers work for a single consumer group and lets it process at its own pace, with visibility timeouts and dead-letter queues protecting against poison messages. EventBridge goes further than either: it’s a schema-aware event bus with content-based routing rules, native integration with dozens of AWS services and SaaS partners, and support for archiving and replaying events, which is useful when you stand up a new consumer service six months later and need to backfill it against historical events. Kinesis Data Streams fits when ordering matters and the data arrives as a continuous high-throughput stream rather than as discrete messages: clickstream data, IoT telemetry, anything you’ll process with multiple independent consumer applications, each reading the same ordered stream from its own checkpoint position.
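Content-based routing is easier to picture with a small in-memory model. The sketch below imitates a narrow subset of EventBridge-style pattern matching (leaf lists of allowed values, nested objects); real event patterns support more operators, and the rule names and target labels here are hypothetical.

```python
def matches(pattern, event):
    """Simplified matching: every pattern key must exist in the event,
    leaf lists hold allowed values, and nested dicts are matched recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical rules: target label -> event pattern.
RULES = {
    "inventory-service": {"source": ["orders"], "detail-type": ["OrderPlaced"]},
    "notification-queue": {
        "source": ["orders"],
        "detail-type": ["OrderPlaced", "OrderShipped"],
    },
    "analytics-firehose": {"source": ["orders"]},
}


def route(event):
    # The producer only publishes; the bus decides which targets receive it.
    return [target for target, pattern in RULES.items() if matches(pattern, event)]


event = {
    "source": "orders",
    "detail-type": "OrderShipped",
    "detail": {"orderId": "o-123"},
}
print(route(event))  # ['notification-queue', 'analytics-firehose']
```

Adding a consumer in this model means adding one entry to `RULES`; nothing about the producer or the published event changes.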

A frequent scenario trap: a question describes “many independent teams need to react to the same business event, and new consumers get added regularly without changing the producer.” That phrase, new consumers added without touching the producer, is the signature of EventBridge or SNS, not a direct API call or a tightly coupled SQS queue per consumer.


Microservices Decomposition and Data Ownership

Breaking a monolith into services is as much a data design problem as a compute design problem, and Professional-level questions increasingly hinge on the data side. The rule that trips people up: each microservice owns its data exclusively, and other services never reach directly into that data store. They ask through an API or react to an event.

Monolith:       [ Single App ] ──── [ Single Shared Database ]

Microservices:
  [ Orders Service ]    ── owns ── [ Orders DB ]
  [ Inventory Service ] ── owns ── [ Inventory DB ]
  [ Shipping Service ]  ── owns ── [ Shipping DB ]
        │                                        │
        └──────── events via EventBridge ────────┘

This is why a service mesh or API Gateway alone doesn’t solve microservices decomposition. The exam sometimes offers “put API Gateway in front of the monolith” as a distractor answer for a question that’s actually asking about decomposing the data layer, not just the request routing layer. Fronting a monolith with API Gateway changes nothing about coupling if every “service” still writes to the same tables.

The saga pattern handles transactions that span multiple services, since you can no longer rely on a single database transaction. Orchestration-based sagas (a Step Functions state machine coordinating each step and compensating on failure) are generally easier to reason about and test than choreography-based sagas (each service reacting to the previous service’s event with no central coordinator), and the exam tends to favor Step Functions orchestration as the “well-architected” answer when a workflow has more than three or four steps or requires visible compensating transactions.
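The orchestration idea can be shown with an in-process stand-in for the coordinator. This is a minimal sketch, not a Step Functions definition: `run_saga`, the step names, and the no-op actions are all hypothetical, and a real workflow would call the owning services and persist its state.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps in order.
    On failure, undo the completed steps in reverse order."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Compensate newest-first so later steps are undone before earlier ones.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
        log.append(f"done:{name}")
        completed.append((name, compensate))
    return True, log


def fail_shipping():
    raise RuntimeError("no carrier available")


steps = [
    ("create_order", lambda: None, lambda: None),
    ("reserve_inventory", lambda: None, lambda: None),
    ("book_shipping", fail_shipping, lambda: None),
]

ok, log = run_saga(steps)
print(ok)   # False
print(log)
```

The log makes the compensation order visible in one place, which is the property that tends to make orchestration easier to test than a chain of services reacting to each other's events.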


Elasticity and Scalability Tradeoffs

Scaling isn’t free, and Professional-level design has to reckon with the failure modes of scaling itself, not just the mechanism.

Pattern                                 | Scale-Out Speed | Cost Behavior            | Risk
EC2 Auto Scaling (target tracking)      | Minutes         | Pay for running capacity | Scaling lag during sudden spikes
Application Auto Scaling on ECS/Fargate | Minutes         | Pay for running tasks    | Same lag, smaller blast radius per task
Lambda concurrency                      | Seconds         | Pay per invocation       | Downstream systems (RDS connections) can be overwhelmed by sudden concurrency
DynamoDB on-demand                      | Instant         | Pay per request          | Cost can spike unexpectedly under sustained high load

The recurring exam scenario: Lambda scales so quickly that it saturates a downstream RDS connection pool during a traffic spike, causing errors that look like a database problem but are actually a compute-scaling problem. The fix is RDS Proxy, which pools and multiplexes connections so thousands of concurrent Lambda invocations don’t each open a direct database connection. Recognizing “Lambda + RDS + connection exhaustion” as the setup for “the answer is RDS Proxy” is one of the more reliable pattern-matches on this exam.
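A back-of-the-envelope model shows why the spike hurts and what pooling changes. The numbers and function names below are hypothetical, and the model is deliberately crude: it assumes one connection per concurrent invocation without a proxy, and a fixed cap on database-side connections with one. It does not describe RDS Proxy's actual multiplexing behavior.

```python
def direct_connections(concurrent_invocations):
    # Without a proxy, assume each concurrent invocation holds its own connection.
    return concurrent_invocations


def pooled_connections(concurrent_invocations, pool_size):
    # With a pooling proxy, database-side connections are capped at the pool
    # size; excess requests wait for a free connection instead of opening one.
    return min(concurrent_invocations, pool_size)


MAX_DB_CONNECTIONS = 500  # hypothetical database connection limit
POOL_SIZE = 400           # hypothetical proxy pool, kept below that limit

for concurrency in (100, 1000, 3000):
    direct = direct_connections(concurrency)
    pooled = pooled_connections(concurrency, POOL_SIZE)
    print(concurrency, direct > MAX_DB_CONNECTIONS, pooled > MAX_DB_CONNECTIONS)
# 100 False False
# 1000 True False
# 3000 True False
```

The database limit is exceeded as soon as concurrency passes it in the direct case, while the pooled case stays under the limit at any concurrency, trading connection errors for queueing at the proxy.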


Exam Focus: What Questions Test From This Step