Step 5 — Networking, Cost & Exam Prep

We’ll close out the guide with the two areas that tend to get squeezed in last-minute study — network troubleshooting and cost tooling — and then step back and talk about the exam itself: how the domains are weighted, and where candidates who know the material still lose points.

Diagnosing Connectivity Problems

“I can’t reach the server” is the single most common ops ticket, and it can mean five different things depending on where the request actually fails. Work through it as a layered problem instead of guessing:

Client ──► Route Table ──► NACL ──► Security Group ──► Instance ──► App listening?
                │           │             │
        Wrong/missing   Stateless: needs    Stateful: only
        route to IGW/   BOTH inbound and    needs inbound rule
        NAT/peering     outbound rules      (return traffic
                        explicitly          auto-allowed)

The Security Group vs NACL distinction is worth drilling until it’s automatic: security groups are stateful — if you allow inbound traffic, the response is automatically allowed out, no matching outbound rule needed. NACLs are stateless — an allowed inbound request needs a corresponding outbound rule for the response, or the response gets silently dropped and you’ll spend an hour looking in the wrong place.
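The difference is easier to hold onto once you see it as a toy model. The sketch below is not how AWS implements either filter; the class names, ports, and addresses are made up for illustration, and the stateful model only covers replies to tracked inbound flows. It shows one request and its reply passing through a stateful filter with only an inbound rule, and through a stateless filter with and without an outbound rule for the ephemeral reply port.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    src_port: int
    dst_port: int


class StatefulFilter:
    """Security-group-like toy: remembers allowed inbound flows, lets replies out."""

    def __init__(self, inbound_ports):
        self.inbound_ports = set(inbound_ports)
        self.tracked = set()  # flows that were allowed on the way in

    def allow_inbound(self, pkt):
        if pkt.dst_port in self.inbound_ports:
            self.tracked.add((pkt.src, pkt.src_port, pkt.dst, pkt.dst_port))
            return True
        return False

    def allow_outbound(self, pkt):
        # A reply is a tracked flow with source and destination swapped.
        return (pkt.dst, pkt.dst_port, pkt.src, pkt.src_port) in self.tracked


class StatelessFilter:
    """NACL-like toy: each packet is judged on the rules alone, per direction."""

    def __init__(self, inbound_ports, outbound_port_ranges):
        self.inbound_ports = set(inbound_ports)
        self.outbound_port_ranges = list(outbound_port_ranges)

    def allow_inbound(self, pkt):
        return pkt.dst_port in self.inbound_ports

    def allow_outbound(self, pkt):
        return any(lo <= pkt.dst_port <= hi for lo, hi in self.outbound_port_ranges)


def round_trip(flt, request, reply):
    """True only if both the request and its reply get through the filter."""
    return flt.allow_inbound(request) and flt.allow_outbound(reply)


# Illustrative addresses and ports only.
request = Packet("203.0.113.9", "10.0.1.15", 51234, 443)
reply = Packet("10.0.1.15", "203.0.113.9", 443, 51234)

sg_result = round_trip(StatefulFilter([443]), request, reply)
nacl_broken = round_trip(StatelessFilter([443], []), request, reply)
nacl_fixed = round_trip(StatelessFilter([443], [(1024, 65535)]), request, reply)
```

The stateless filter with no outbound range is the "silently dropped response" case: the request is admitted and the reply never leaves.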

VPC Flow Logs — Seeing What Actually Got Dropped and Where

Flow Logs capture accepted and rejected traffic at the ENI, subnet, or VPC level:

version account-id  eni-id      srcaddr      dstaddr      srcport dstport protocol action
2       123456789012 eni-abc123 10.0.1.15    10.0.2.20    443     51234   6        ACCEPT
2       123456789012 eni-abc123 203.0.113.9  10.0.1.15    22      443     6        REJECT

The action field tells you where to look next. A REJECT record means the traffic reached the ENI and a security group or NACL blocked it; the record alone doesn’t say which of the two, so check both. An ACCEPT record for a connection that still fails suggests looking at the return path or at the instance and application. If there’s no flow log record at all for the attempted connection, traffic likely never reached that ENI in the first place, which usually points you back at routing.
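As a rough illustration of reading these records programmatically, here is a minimal sketch that parses lines in the shortened nine-field layout shown above and keeps only the rejected ones. The field list is an assumption matching this guide's example; the default flow log format carries additional fields (packets, bytes, start, end, log-status), so a real parser would need to follow the format configured on the flow log.

```python
# Field order assumed from the shortened example in this guide,
# not the default flow log format.
FIELDS = ["version", "account_id", "eni_id", "srcaddr", "dstaddr",
          "srcport", "dstport", "protocol", "action"]


def parse_flow_line(line):
    """Split one space-separated record into a dict keyed by field name."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    for key in ("srcport", "dstport", "protocol"):
        record[key] = int(record[key])
    return record


def rejected(lines):
    """Return only the records whose action is REJECT."""
    return [r for r in map(parse_flow_line, lines) if r["action"] == "REJECT"]


sample = [
    "2 123456789012 eni-abc123 10.0.1.15 10.0.2.20 443 51234 6 ACCEPT",
    "2 123456789012 eni-abc123 203.0.113.9 10.0.1.15 22 443 6 REJECT",
]
blocked = rejected(sample)
```

An empty result for a connection you know was attempted is itself a signal: no record at the ENI usually means the traffic never arrived there.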

Reachability Analyzer — Testing the Path Without Generating Real Traffic

Reachability Analyzer takes a source and destination and statically evaluates the entire path (route tables, security groups, NACLs), telling you whether the destination is reachable and, if not, exactly which hop blocks it. This beats tracing route tables and security group rules by hand, especially in a VPC with peering, Transit Gateway attachments, or multiple hops between source and destination.

Source: i-0source (subnet A)
Destination: i-0dest (subnet B), port 443
                │
                ▼
  Path Analysis: BLOCKED
  Hop: Security Group "web-sg" on i-0dest
  Reason: No inbound rule permits port 443 from subnet A's CIDR
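For anyone who wants to script the same check, the sketch below drives an analysis through the EC2 API. It assumes you pass in a boto3 EC2 client (for example `boto3.client("ec2")`) with permission to create and run Network Insights paths; the helper name `analyze_path`, the polling interval, and the resource IDs are illustrative choices, not anything the guide prescribes.

```python
import time


def analyze_path(ec2, source_id, destination_id, port,
                 poll_seconds=2.0, max_polls=30):
    """Run one Reachability Analyzer analysis.

    Returns (path_found, explanations). `ec2` is expected to be a
    boto3 EC2 client supplied by the caller.
    """
    path = ec2.create_network_insights_path(
        Source=source_id,
        Destination=destination_id,
        Protocol="tcp",
        DestinationPort=port,
    )
    path_id = path["NetworkInsightsPath"]["NetworkInsightsPathId"]

    started = ec2.start_network_insights_analysis(NetworkInsightsPathId=path_id)
    analysis_id = started["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]

    # Poll until the analysis leaves the "running" state.
    for _ in range(max_polls):
        result = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )["NetworkInsightsAnalyses"][0]
        if result["Status"] != "running":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("analysis did not finish in time")

    if result["Status"] != "succeeded":
        raise RuntimeError(f"analysis ended with status {result['Status']}")
    return result["NetworkPathFound"], result.get("Explanations", [])
```

A call such as `analyze_path(ec2, "i-0source", "i-0dest", 443)` would return `False` plus a list of explanation entries for the blocked path sketched above. No packets are sent; the result comes from evaluating configuration.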

DNS Resolution Issues

Most VPC DNS confusion traces back to one of three things: the VPC’s enableDnsSupport/enableDnsHostnames settings, a Route 53 Resolver rule misconfigured for hybrid DNS (on-prem to VPC or VPC to VPC via Resolver endpoints), or a private hosted zone not associated with the VPC that’s actually querying it. When on-premises resources can’t resolve VPC private hosted zone records, check the inbound Resolver endpoint first, since it receives queries coming from on-premises. When VPC resources can’t resolve on-premises names, check the outbound endpoint and its forwarding rules. Do both before assuming an application bug.
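The first and third causes can be checked mechanically. The following is a minimal sketch, assuming the caller supplies boto3 `ec2` and `route53` clients; the function name `dns_findings` and the example zone name are hypothetical, and pagination of the hosted zone listing is ignored for brevity.

```python
def dns_findings(ec2, route53, vpc_id, region, expected_zone_name):
    """Return a list of human-readable DNS misconfiguration findings."""
    findings = []

    # Both VPC attributes must be enabled for private hosted zones to resolve.
    for attribute, key in (("enableDnsSupport", "EnableDnsSupport"),
                           ("enableDnsHostnames", "EnableDnsHostnames")):
        response = ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute=attribute)
        if not response[key]["Value"]:
            findings.append(f"{attribute} is disabled on {vpc_id}")

    # The private hosted zone must be associated with the querying VPC.
    zones = route53.list_hosted_zones_by_vpc(VPCId=vpc_id, VPCRegion=region)
    names = {z["Name"].rstrip(".") for z in zones["HostedZoneSummaries"]}
    if expected_zone_name.rstrip(".") not in names:
        findings.append(f"{expected_zone_name} is not associated with {vpc_id}")

    return findings
```

An empty list rules out these two causes and leaves Resolver rules and endpoints as the next place to look.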

Cost Visibility and Optimization Tooling

Cost operations questions test whether you know which tool answers which specific question — they overlap in purpose but not in what they actually surface.

Tool	Answers	Granularity
Cost Explorer	Where has money already gone, and what’s the forecasted trend	Historical + forecast, filterable by tag/service/account
Compute Optimizer	Is this specific resource sized correctly right now	Per-resource rightsizing recommendation with confidence level
Trusted Advisor	Broad checks across cost, security, fault tolerance, performance, limits	Checklist-style, some checks free tier, full set on Business/Enterprise support
AWS Budgets	Am I about to exceed a threshold I care about	Alert-driven, forward-looking

Cost Explorer is the investigative tool — filter spend by service, linked account, or cost allocation tag, and look at trends over time. It’s where you’d go to answer “why did last month’s bill jump” by drilling into which service or account drove it.
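The same drill-down can be done against the Cost Explorer API. The sketch below works on the response shape returned by boto3's `get_cost_and_usage` when called with monthly granularity, the `UnblendedCost` metric, and a group-by on the `SERVICE` dimension. The helper names and the dollar figures in the sample are invented for illustration.

```python
def spend_by_service(period):
    """Map service name -> unblended cost for one ResultsByTime entry."""
    return {
        group["Keys"][0]: float(group["Metrics"]["UnblendedCost"]["Amount"])
        for group in period["Groups"]
    }


def biggest_increases(response, top=3):
    """Compare the last two periods and rank services by cost increase."""
    periods = response["ResultsByTime"]
    if len(periods) < 2:
        raise ValueError("need at least two periods to compare")
    before = spend_by_service(periods[-2])
    after = spend_by_service(periods[-1])
    deltas = {
        service: after.get(service, 0.0) - before.get(service, 0.0)
        for service in set(before) | set(after)
    }
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:top]


def _group(service, amount):
    return {"Keys": [service],
            "Metrics": {"UnblendedCost": {"Amount": amount, "Unit": "USD"}}}


# Made-up figures in the shape get_cost_and_usage returns.
sample_response = {
    "ResultsByTime": [
        {"TimePeriod": {"Start": "2025-01-01", "End": "2025-02-01"},
         "Groups": [_group("Amazon Relational Database Service", "1000.0"),
                    _group("Amazon Simple Storage Service", "200.0")]},
        {"TimePeriod": {"Start": "2025-02-01", "End": "2025-03-01"},
         "Groups": [_group("Amazon Relational Database Service", "1400.0"),
                    _group("Amazon Simple Storage Service", "210.0")]},
    ]
}
top_movers = biggest_increases(sample_response, top=2)
```

Swapping the group-by from the service dimension to a linked account or a cost allocation tag answers the "which account" or "which team" version of the same question.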

Compute Optimizer looks at actual utilization history for EC2 instances, Auto Scaling groups, EBS volumes, Lambda functions, ECS services on Fargate, and RDS DB instances, then recommends a specific rightsizing action with a confidence rating. A db.r5.2xlarge sitting at 8% average CPU utilization for weeks is the textbook case Compute Optimizer is built to catch. By 2026 its recommendations extend well past EC2 into most of the compute surface area, so it’s worth checking across service types, not just instances.

Trusted Advisor runs a standing set of checks across six categories (cost optimization, performance, security, fault tolerance, service limits, and operational excellence) and flags things like idle load balancers, underutilized EBS volumes, and unencrypted S3 buckets in one pass. The free tier gives you a handful of core checks; the full check library opens up with Business or Enterprise support.

Cost Optimization Workflow
────────────────────────────
Cost Explorer     → "RDS spend jumped 40% this quarter"
       │
       ▼
Compute Optimizer → "db.r5.2xlarge running at 8% avg CPU, recommend db.r5.large"
       │
       ▼
Trusted Advisor   → "3 additional idle RDS instances flagged across other accounts"
       │
       ▼
AWS Budgets       → Alert set at 90% of new, lower forecasted spend

Cost allocation tags underpin all of this — Cost Explorer and Budgets can only slice spend by tag if the tags exist consistently and are activated in the Billing console. A rightsizing win on an untagged resource is invisible in reporting even after you’ve fixed it, which is why tagging discipline is treated as an operational responsibility, not an afterthought.
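Tag coverage is simple to audit once you have a resource inventory in hand. This is a minimal sketch over plain dictionaries; the required tag keys, resource IDs, and the inventory shape are hypothetical, and in practice the list would come from whatever inventory source you already use.

```python
# Hypothetical tag keys; substitute the ones your organization activates
# as cost allocation tags.
REQUIRED_TAGS = frozenset({"CostCenter", "Environment"})


def missing_tags(resources, required=REQUIRED_TAGS):
    """Map resource id -> sorted list of required tag keys it lacks."""
    report = {}
    for resource in resources:
        absent = sorted(required - set(resource.get("tags", {})))
        if absent:
            report[resource["id"]] = absent
    return report


inventory = [
    {"id": "db-orders", "tags": {"CostCenter": "1234", "Environment": "prod"}},
    {"id": "db-reports", "tags": {"Environment": "prod"}},
    {"id": "vol-scratch"},
]
untagged = missing_tags(inventory)
```

Anything that shows up in the report is spend that Cost Explorer and Budgets cannot attribute by those tags, no matter how well the resource itself is sized.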

SOA-C03 Exam Domains: A Realistic Breakdown

The breakdown below splits the exam content into six areas, and the weighting tells you where to spend your remaining study time if you’re tight on it:

Domain	Approx. weight	Core territory
Monitoring, Logging & Remediation	~20%	CloudWatch, Logs Insights, EventBridge-driven remediation
Reliability & Business Continuity	~16%	AWS Backup, DR patterns, RTO/RPO, Multi-AZ
Deployment, Provisioning & Automation	~18%	CloudFormation, Systems Manager, patching, deployment patterns
Security & Compliance	~16%	IAM, Config, GuardDuty, Security Hub, encryption
Networking & Content Delivery	~18%	VPC troubleshooting, Route 53, connectivity, CDN basics
Cost & Performance Optimization	~12%	Cost Explorer, Compute Optimizer, Trusted Advisor, rightsizing

Treat these percentages as directional rather than exact — AWS doesn’t publish a fixed formula, and blueprint weighting shifts slightly between exam versions. The practical takeaway: Monitoring, Deployment Automation, and Networking together make up over half the exam, which lines up with what this whole guide has spent most of its pages on.
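The "over half" figure follows directly from the approximate weights in the table. A quick check, using those same directional numbers rather than anything official:

```python
# Approximate weights copied from the table above (percent).
weights = {
    "Monitoring, Logging & Remediation": 20,
    "Reliability & Business Continuity": 16,
    "Deployment, Provisioning & Automation": 18,
    "Security & Compliance": 16,
    "Networking & Content Delivery": 18,
    "Cost & Performance Optimization": 12,
}

focus = [
    "Monitoring, Logging & Remediation",
    "Deployment, Provisioning & Automation",
    "Networking & Content Delivery",
]

total = sum(weights.values())
focus_share = sum(weights[name] for name in focus)
```

Here `focus_share` comes to 56 out of a total of 100, which is where the "over half" claim comes from.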

Common Traps Associate-Level Candidates Fall Into

Assuming CloudWatch sees everything automatically. Memory, disk usage inside the OS — these need the CloudWatch agent. This single gap accounts for a surprising number of missed monitoring questions.
Confusing stateful and stateless filtering. Security groups don’t need outbound rules for return traffic; NACLs do. Mixing these up wrecks otherwise-correct troubleshooting logic.
Picking the most expensive DR pattern by default. If the scenario states a tolerable RTO of several hours, Active/Active or even Warm Standby is over-engineered — and on this exam, over-engineered is still wrong.
Forgetting that INSUFFICIENT_DATA isn’t OK. A missing datapoint is not confirmation that things are healthy — treat it as its own state requiring investigation.
Treating Config and GuardDuty as interchangeable. Config evaluates configuration state against rules; GuardDuty detects behavioral threats from logs. A question about “detecting a compromised credential” wants GuardDuty, not Config.
Skipping the “Replacement: True/False” detail on change sets. It’s an easy thing to gloss over in practice, and exam questions frequently hinge on exactly that field.
Not knowing what Reachability Analyzer actually does versus Flow Logs. Flow Logs show you what already happened; Reachability Analyzer predicts what would happen along a path without generating traffic. Different tools, different moments in the troubleshooting process.
Under-preparing for Logs Insights query reading. You will very likely be shown a query and asked what it returns, or shown a desired result and asked to pick the correct query. Practice reading them, not just writing them.
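For the Logs Insights trap in particular, it helps to practice on a concrete query. The sketch below embeds one as a string and runs it through the CloudWatch Logs API; it assumes the caller supplies a boto3 `logs` client, and the helper name `run_query`, the polling settings, and the log group are illustrative. Read the query top to bottom: keep only events whose message contains ERROR, count them in 5-minute buckets, sort buckets by count, and return the five busiest.

```python
import time

ERRORS_PER_5_MIN = """
filter @message like /ERROR/
| stats count(*) as errors by bin(5m)
| sort errors desc
| limit 5
"""


def run_query(logs, log_group, query, start, end,
              poll_seconds=1.0, max_polls=60):
    """Start a Logs Insights query and return its rows as plain dicts.

    `start` and `end` are epoch seconds. `logs` is expected to be a
    boto3 CloudWatch Logs client supplied by the caller.
    """
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=start,
        endTime=end,
        queryString=query,
    )["queryId"]

    # Poll until the query is no longer scheduled or running.
    for _ in range(max_polls):
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("query did not finish in time")

    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")

    # Each row is a list of {"field": ..., "value": ...} cells.
    return [{cell["field"]: cell["value"] for cell in row}
            for row in response["results"]]
```

Each returned row has two keys for this query, `bin(5m)` and `errors`, which is the kind of detail an exam question about "what does this query return" tends to probe.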

Final Prep Notes

Work through scenario-based practice questions rather than pure recall flashcards — this exam is built around “here’s a broken/inefficient situation, what’s your next operational step,” not “define this service.” If you can explain why a wrong answer is wrong, not just recognize the right one, you’re at the level this certification is actually testing for.

Exam Focus: What Questions Test From This Step

Distinguishing where a connectivity failure occurred: routing, security group, or NACL
Reading Flow Log records and correctly interpreting ACCEPT/REJECT and missing entries
Knowing what Reachability Analyzer checks statically versus what Flow Logs show after the fact
Diagnosing DNS resolution failures involving private hosted zones and Resolver endpoints
Choosing the right cost tool for a given question: Cost Explorer (historical/forecast), Compute Optimizer (rightsizing), Trusted Advisor (broad checks), Budgets (threshold alerts)
Domain weighting awareness — prioritizing Monitoring, Deployment Automation, and Networking in final review
Recognizing and avoiding the recurring traps around stateful/stateless filtering, DR over-engineering, and alarm state misreadings

Written by NPBlue Cloud Team, Cloud & Platform Engineers who run production workloads on AWS daily and write from real deployment experience, not the docs alone.

Reviewed for technical accuracy. Spot an error? Let us know.