Step 4 — Security & Compliance Operations

Security domain questions on this exam rarely ask you to design a security architecture from scratch. They put you in the middle of an already-built environment and ask: this policy isn’t working, this finding just fired, this resource drifted out of compliance — now what do you actually do about it? That operator’s-eye framing is what this step trains for.

IAM Policy Troubleshooting: Finding the Denial

“Access denied” tickets are a daily occurrence in any team running production AWS accounts, and the exam tests whether you know where to actually look. Permissions in AWS are evaluated across several layers, and a denial can originate from any one of them:

Request: user tries to s3:PutObject on bucket "reports"
              │
              ▼
   ┌────────────────────────────────────┐
   │ 1. SCP (Organizations)             │  ── explicit deny anywhere = denied
   ├────────────────────────────────────┤
   │ 2. Permission Boundary             │  ── caps max possible permission
   ├────────────────────────────────────┤
   │ 3. Identity-based policy           │  ── does the user/role allow it?
   ├────────────────────────────────────┤
   │ 4. Resource-based policy           │  ── does the bucket policy allow it?
   ├────────────────────────────────────┤
   │ 5. Session policy (if assumed role)│  ── further narrows the session
   └────────────────────────────────────┘
              │
       Any explicit Deny wins, regardless of Allows elsewhere.
       Default (nothing matches) = implicit deny.
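
The layering above can be modeled in a few lines. This is a simplified sketch for a same-account request, not AWS's actual evaluation engine: each layer is reduced to a single decision string, and the function name `evaluate` and its parameters are illustrative assumptions. It also ignores the special cases where a resource-based policy names a principal directly.

```python
from typing import Optional, Tuple

# Each layer is reduced to one decision for the request being checked:
#   "Allow"   - a statement in that layer allows the action
#   "Deny"    - a statement in that layer explicitly denies it
#   "NoMatch" - the layer exists but says nothing about the action
#   None      - the layer is not in play (no SCP, no boundary, no session policy)


def evaluate(
    scp: Optional[str] = None,
    boundary: Optional[str] = None,
    identity: str = "NoMatch",
    resource: str = "NoMatch",
    session: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (decision, layer_responsible) for a same-account request."""
    layers = {
        "scp": scp,
        "boundary": boundary,
        "identity": identity,
        "resource": resource,
        "session": session,
    }
    # 1. An explicit Deny in any layer ends evaluation immediately.
    for name, decision in layers.items():
        if decision == "Deny":
            return ("ExplicitDeny", name)
    # 2. Guardrail layers only cap permissions: when present, each must allow.
    for name in ("scp", "boundary", "session"):
        if layers[name] is not None and layers[name] != "Allow":
            return ("ImplicitDeny", name)
    # 3. Within one account, an Allow from either policy type is enough.
    if identity == "Allow" or resource == "Allow":
        return ("Allow", None)
    # 4. Nothing granted access: the default is an implicit deny.
    return ("ImplicitDeny", "identity/resource")
```

Calling `evaluate(scp="Allow", boundary="NoMatch", identity="Allow")` reports an implicit deny at the boundary layer, which is the classic "the user policy clearly allows it, so why is it denied?" ticket.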

The practical troubleshooting tool is the IAM Policy Simulator, which lets you test a specific action against a specific principal and see which policy statements matched. That is far faster than reading through five stacked policies looking for the one Deny statement. For a denial that happens in practice but simulates as allowed, check CloudTrail for the actual denied API call: the event records the error code and error message, which for many services names the type of policy behind the denial. The simulator can’t always replicate edge cases involving conditions like source IP or MFA presence.
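
Both checks can be scripted. The sketch below assumes boto3-style clients are passed in: `simulate` wraps the IAM `simulate_principal_policy` call, and `denied_calls` filters events shaped like CloudTrail `lookup_events` output. The function names and the summary fields are illustrative choices, not part of any AWS API.

```python
import json
from typing import Any, Dict, Iterable, List


def simulate(iam_client: Any, principal_arn: str, action: str,
             resource_arn: str) -> List[Dict[str, Any]]:
    """Run one action/resource pair through the policy simulator."""
    response = iam_client.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=[action],
        ResourceArns=[resource_arn],
    )
    return [
        {
            "action": result["EvalActionName"],
            # One of: allowed | explicitDeny | implicitDeny
            "decision": result["EvalDecision"],
            "matched_policies": [
                s["SourcePolicyId"] for s in result.get("MatchedStatements", [])
            ],
        }
        for result in response["EvaluationResults"]
    ]


def denied_calls(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Pick access-denied API calls out of CloudTrail lookup results."""
    denied = []
    for event in events:
        # The full record arrives as a JSON string under "CloudTrailEvent".
        record = json.loads(event["CloudTrailEvent"])
        code = record.get("errorCode", "")
        if "AccessDenied" in code or "UnauthorizedOperation" in code:
            denied.append({
                "time": record.get("eventTime"),
                "call": f"{record.get('eventSource')}:{record.get('eventName')}",
                "principal": record.get("userIdentity", {}).get("arn"),
                "message": record.get("errorMessage"),
            })
    return denied
```

With real clients this would look like `simulate(boto3.client("iam"), user_arn, "s3:PutObject", object_arn)`. One caveat worth keeping in mind: `lookup_events` returns management events only, so object-level S3 calls such as PutObject show up in CloudTrail only if a trail is configured to log data events.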

IAM Access Analyzer solves a different, adjacent problem — it flags resources (S3 buckets, IAM roles, KMS keys, Lambda functions) that are reachable from outside your account or organization, which is how you catch an accidentally public resource before an attacker does rather than after.
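
As a rough illustration, the filter below pulls publicly reachable resources out of findings shaped like the `accessanalyzer` client's `list_findings` output. The helper name `public_findings` is an assumption for this sketch, and the sample data in the usage note is made up.

```python
from typing import Any, Dict, List


def public_findings(findings: List[Dict[str, Any]]) -> List[str]:
    """Return resource ARNs from active findings that allow public access."""
    return sorted(
        finding["resource"]
        for finding in findings
        if finding.get("status") == "ACTIVE" and finding.get("isPublic")
    )
```

Feeding it a list where one S3 bucket finding has `"isPublic": True` and `"status": "ACTIVE"` returns just that bucket's ARN; archived or resolved findings are skipped.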

AWS Config: Continuous Compliance, Not Point-in-Time Audits

Config records configuration changes to your resources over time and evaluates them against rules — think of it as a continuous compliance engine rather than a one-time audit checklist.

Resource Change Event (SG rule modified)
        │
        ▼
Config Rule: "restricted-ssh"
        │
   ┌────┴────┐
   ▼         ▼
COMPLIANT  NON_COMPLIANT ──► EventBridge ──► Automation runbook (remediate)
                                          └─► SNS notification
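
The EventBridge hop in the flow above is driven by an event pattern. The sketch below shows a pattern for Config compliance-change events as a Python dict, plus a tiny matcher that implements only a subset of EventBridge matching (nested keys and exact-value lists) so the pattern can be checked locally. The rule name `restricted-ssh` assumes the managed rule was deployed under its default name.

```python
from typing import Any, Dict

# Pattern for an EventBridge rule: fire only when the restricted-ssh rule
# reports a resource as NON_COMPLIANT.
PATTERN: Dict[str, Any] = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {
        "configRuleName": ["restricted-ssh"],
        "newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]},
    },
}


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Subset of EventBridge matching: nested keys and exact-value lists."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching sub-object.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            # Leaf pattern: the event value must be one of the listed values.
            return False
    return True
```

Serialized with `json.dumps(PATTERN)`, the same dict can serve as the event pattern of a rule whose targets are the Automation runbook and the SNS topic.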

Managed rules cover common checks out of the box (s3-bucket-public-read-prohibited, restricted-ssh, rds-storage-encrypted). Custom rules, backed by a Lambda function, cover organization-specific logic managed rules don’t. Either way, a rule evaluation that comes back NON_COMPLIANT can trigger auto-remediation through an SSM Automation document — the same Automation capability covered in the deployment step, now wired to compliance rather than deployment.
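
A custom rule is mostly a Lambda handler that reads the configuration item and reports a verdict back to Config. The sketch below assumes a change-triggered rule and a hypothetical organization requirement that every evaluated resource carries a `CostCenter` tag; the `config_client` parameter exists only so the handler can be exercised without AWS.

```python
import json
from typing import Any, Dict, Optional

REQUIRED_TAG = "CostCenter"  # hypothetical organization-specific requirement


def evaluate_item(item: Dict[str, Any]) -> str:
    """Decide compliance for one configuration item."""
    if item.get("configurationItemStatus") == "ResourceDeleted":
        return "NOT_APPLICABLE"
    tags = item.get("tags") or {}
    return "COMPLIANT" if REQUIRED_TAG in tags else "NON_COMPLIANT"


def handler(event: Dict[str, Any], context: Any,
            config_client: Optional[Any] = None) -> Dict[str, Any]:
    """Lambda entry point for a change-triggered custom Config rule."""
    if config_client is None:
        import boto3  # available in the Lambda runtime
        config_client = boto3.client("config")

    # Config passes the changed resource as a JSON string.
    item = json.loads(event["invokingEvent"])["configurationItem"]
    evaluation = {
        "ComplianceResourceType": item["resourceType"],
        "ComplianceResourceId": item["resourceId"],
        "ComplianceType": evaluate_item(item),
        "OrderingTimestamp": item["configurationItemCaptureTime"],
    }
    # The result token ties this verdict to the evaluation Config requested.
    config_client.put_evaluations(
        Evaluations=[evaluation], ResultToken=event["resultToken"]
    )
    return evaluation
```

Once the verdict is NON_COMPLIANT, everything downstream (remediation, notification) works the same way as for a managed rule.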

Conformance packs bundle a set of Config rules plus their remediation actions into a single deployable unit, typically mapped to a compliance framework (a CIS benchmark, PCI-DSS, an internal security baseline). Deploying a conformance pack across an organization, either as an organization conformance pack from the management or delegated administrator account or through CloudFormation StackSets, is the standard way to enforce a consistent compliance posture across every account without hand-configuring rules account by account.

Config aggregators roll up compliance data from multiple accounts and regions into a single view — essential once you’re past a handful of accounts, because nobody wants to check compliance dashboards account by account every morning.

GuardDuty and Security Hub: Detection to Triage

GuardDuty is a threat detection service that analyzes CloudTrail management events, VPC Flow Logs, and DNS query logs with no agents to deploy. For supported workloads it can also analyze EKS audit logs and runtime activity; Runtime Monitoring is the exception to the agentless model, since it relies on a GuardDuty security agent. It looks for patterns consistent with compromised credentials, cryptomining, reconnaissance activity, and command-and-control communication.

A GuardDuty finding has a severity score and a structured type string that actually tells you what happened if you know how to read it:

Finding type: UnauthorizedAccess:EC2/SSHBruteForce
                    │              │        │
              Threat category   Resource   Specific behavior
                                  type      observed

Severity: 5.4 (Medium)
Instance: i-0a1b2c3d
Action:  SSH brute force attempts detected from 203.0.113.44
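
Splitting the type string programmatically makes the structure explicit. The sketch below follows the documented `ThreatPurpose:ResourceTypeAffected/ThreatFamilyName.DetectionMechanism!Artifact` layout, where the last two parts are optional. The helper names are illustrative, and the severity bands reflect GuardDuty's documented ranges at the time of writing, so treat them as an assumption to verify.

```python
from typing import Dict, Optional


def parse_finding_type(finding_type: str) -> Dict[str, Optional[str]]:
    """Split a GuardDuty finding type into its named parts."""
    purpose, _, rest = finding_type.partition(":")
    resource, _, behavior = rest.partition("/")
    behavior, _, artifact = behavior.partition("!")
    family, _, mechanism = behavior.partition(".")
    return {
        "threat_purpose": purpose,
        "resource_type": resource,
        "threat_family": family,
        "detection_mechanism": mechanism or None,  # optional suffix
        "artifact": artifact or None,              # optional suffix
    }


def severity_label(score: float) -> str:
    """Map a numeric severity to its band."""
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    return "Low"
```

For the finding above, `parse_finding_type("UnauthorizedAccess:EC2/SSHBruteForce")` yields threat purpose `UnauthorizedAccess`, resource type `EC2`, and threat family `SSHBruteForce`, and `severity_label(5.4)` returns `"Medium"`.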

Security Hub sits a layer above GuardDuty — it aggregates findings from GuardDuty, Inspector, Macie, Config, and third-party tools into one normalized format (ASFF) and provides a security posture score against standards like CIS or the AWS Foundational Security Best Practices standard. Operationally, Security Hub is where you triage volume: sort by severity, group by resource, and route the highest-confidence, highest-severity findings into an incident workflow rather than treating every finding as equally urgent.
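
The triage habit described above (filter, sort by severity, group by resource) can be sketched against findings in ASFF shape. The function name `triage` and the default threshold are assumptions for illustration; the field names (`Severity.Label`, `Workflow.Status`, `Resources[].Id`, `Title`) come from ASFF.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Lower number = more urgent.
SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3, "INFORMATIONAL": 4}


def triage(findings: List[Dict[str, Any]], min_label: str = "HIGH") -> Dict[str, List[str]]:
    """Group untriaged findings at or above min_label by resource, worst first."""
    threshold = SEVERITY_ORDER[min_label]
    selected = [
        f for f in findings
        if f.get("Workflow", {}).get("Status") == "NEW"
        and SEVERITY_ORDER.get(f.get("Severity", {}).get("Label"), 99) <= threshold
    ]
    # Sort so the most severe finding appears first within each resource.
    selected.sort(key=lambda f: SEVERITY_ORDER[f["Severity"]["Label"]])
    grouped: Dict[str, List[str]] = defaultdict(list)
    for finding in selected:
        for resource in finding.get("Resources", []):
            grouped[resource["Id"]].append(finding["Title"])
    return dict(grouped)
```

The result is a per-resource worklist: a resource with several high-severity findings stands out immediately, while low-severity and already-handled findings stay out of the incident queue.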

Service       Primary job                                  Output
GuardDuty     Threat detection from logs, no agents        Individual findings, severity-scored
Security Hub  Aggregation, normalization, posture scoring  Unified dashboard across sources
Config        Continuous configuration compliance          Compliance state per resource, per rule
Inspector     Vulnerability scanning (EC2, ECR images)     CVE findings

A finding routed to EventBridge can trigger an Automation runbook automatically — isolate an instance by swapping its security group, snapshot it for forensics, and notify the security team — turning detection into a documented, repeatable response instead of an ad hoc scramble.
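
The response steps named above might look like the following when written as a script, for example inside a Lambda target or an Automation script step. This is a sketch under assumptions: `ec2` and `sns` are boto3-style clients passed in by the caller, `quarantine_sg` and `topic_arn` are hypothetical identifiers you would supply, and the instance has a single network interface (multi-interface instances need their groups changed per interface).

```python
from typing import Any, Dict, List


def isolate_instance(event: Dict[str, Any], ec2: Any, sns: Any,
                     quarantine_sg: str, topic_arn: str) -> List[str]:
    """Quarantine the instance named in a GuardDuty finding event."""
    detail = event["detail"]
    instance_id = detail["resource"]["instanceDetails"]["instanceId"]

    # 1. Isolate: replace all security groups with the quarantine group.
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg])

    # 2. Preserve evidence: snapshot every attached EBS volume.
    snapshot_ids: List[str] = []
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    for reservation in reservations:
        for instance in reservation["Instances"]:
            for mapping in instance.get("BlockDeviceMappings", []):
                snapshot = ec2.create_snapshot(
                    VolumeId=mapping["Ebs"]["VolumeId"],
                    Description=f"Forensics: {detail['type']} on {instance_id}",
                )
                snapshot_ids.append(snapshot["SnapshotId"])

    # 3. Notify: tell the security team what was done.
    sns.publish(
        TopicArn=topic_arn,
        Subject=f"Isolated {instance_id}",
        Message=(f"Finding {detail['type']} (severity {detail['severity']}). "
                 f"Instance quarantined; snapshots: {', '.join(snapshot_ids) or 'none'}"),
    )
    return snapshot_ids
```

Isolation runs before the snapshot on purpose: cutting network access first limits further damage, and the snapshot still captures disk state for forensics.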

Encryption Key Rotation Operations

KMS customer managed keys (symmetric encryption keys with AWS-generated key material) support automatic rotation. When enabled, AWS rotates the underlying key material on a schedule, yearly by default, while preserving the key ID and ARN, so nothing referencing that key needs to change. Previous key material is retained internally so data encrypted under an older rotation still decrypts correctly.

This is different from manual rotation, where you create a brand-new key and have to update every reference to it yourself — more control, more operational overhead, and a real migration risk if you miss a reference somewhere.

CMK (automatic rotation enabled)
  Year 1: key material v1  ──► encrypts data
  Year 2: key material v2  ──► encrypts new data
                                 (v1 retained, still decrypts old data)
  Key ID/ARN: unchanged throughout — nothing referencing this key breaks
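
Operationally, turning rotation on is a check-then-enable call pair. A minimal sketch, assuming a boto3-style KMS client is passed in and using a hypothetical helper name `ensure_rotation`:

```python
from typing import Any


def ensure_rotation(kms: Any, key_id: str) -> bool:
    """Enable automatic rotation if it is off. Returns True if a change was made."""
    status = kms.get_key_rotation_status(KeyId=key_id)
    if status["KeyRotationEnabled"]:
        return False  # already rotating, nothing to do
    kms.enable_key_rotation(KeyId=key_id)
    return True
```

Because the key ID and ARN do not change, running this against an in-use key requires no change in the applications or services that reference it.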

Secrets Manager rotation is a separate, more operationally visible concern — it rotates the actual secret value (a database password, an API key), typically via a Lambda function that follows the two-version pattern: create a new secret version, update the underlying credential, test it, then mark it current. Native integrations exist for RDS, Redshift, and DocumentDB; anything else needs a custom rotation Lambda. Know the distinction: KMS rotation changes key material transparently, Secrets Manager rotation changes the actual credential and requires the dependent system to pick up the new value.
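
A custom rotation Lambda receives one invocation per step (`createSecret`, `setSecret`, `testSecret`, `finishSecret`). The sketch below shows that dispatch with a boto3-style Secrets Manager client passed in. `set_credential` and `test_credential` are hypothetical callbacks standing in for whatever changes and verifies the password on the target system, and the idempotency checks a production function needs are left out for brevity.

```python
from typing import Any, Callable, Dict


def rotate(event: Dict[str, Any], secrets: Any,
           set_credential: Callable[[str], None],
           test_credential: Callable[[str], None]) -> None:
    """Handle one step of a Secrets Manager rotation."""
    secret_id = event["SecretId"]
    token = event["ClientRequestToken"]  # doubles as the new version ID
    step = event["Step"]

    if step == "createSecret":
        # Generate a new value and stage it as AWSPENDING.
        password = secrets.get_random_password(
            PasswordLength=32, ExcludePunctuation=True
        )["RandomPassword"]
        secrets.put_secret_value(
            SecretId=secret_id, ClientRequestToken=token,
            SecretString=password, VersionStages=["AWSPENDING"],
        )
    elif step in ("setSecret", "testSecret"):
        pending = secrets.get_secret_value(
            SecretId=secret_id, VersionId=token, VersionStage="AWSPENDING"
        )["SecretString"]
        # setSecret updates the underlying credential; testSecret verifies it.
        (set_credential if step == "setSecret" else test_credential)(pending)
    elif step == "finishSecret":
        # Promote the pending version to AWSCURRENT.
        versions = secrets.describe_secret(SecretId=secret_id)["VersionIdsToStages"]
        current = next(
            (vid for vid, stages in versions.items() if "AWSCURRENT" in stages), None
        )
        if current == token:
            return  # already promoted
        kwargs = {"SecretId": secret_id, "VersionStage": "AWSCURRENT",
                  "MoveToVersionId": token}
        if current is not None:
            kwargs["RemoveFromVersionId"] = current
        secrets.update_secret_version_stage(**kwargs)
    else:
        raise ValueError(f"Unknown rotation step: {step}")
```

The dependent application picks up the change by reading the AWSCURRENT version again, which is why cached credentials with no refresh path are the usual cause of post-rotation outages.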

Incident Response Runbooks: An Operator’s View

A mature incident response runbook isn’t a wiki page — it’s closer to an automated decision tree with clear ownership at each branch:

GuardDuty finding: credential compromise suspected
        │
        ▼
 1. Contain  ─── revoke/rotate the compromised credential (IAM, Secrets Manager)
        │
        ▼
 2. Isolate  ─── quarantine affected resource (restrictive SG, network ACL)
        │
        ▼
 3. Investigate ─ CloudTrail for actions taken under compromised identity
        │
        ▼
 4. Eradicate ── remove persistence (rogue IAM users/roles, unexpected keys)
        │
        ▼
 5. Recover  ─── restore from known-good state (AMI, backup) if needed
        │
        ▼
 6. Review   ─── post-incident, feed findings back into Config rules/GuardDuty

The operational habit worth internalizing: containment comes before investigation. Revoke the credential or isolate the resource first, then investigate. Leaving an active compromise running longer just to gather more evidence only widens the damage. CloudTrail, Config’s configuration history, and VPC Flow Logs together reconstruct the timeline once containment is underway.
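
The containment step for a compromised credential can be sketched as two small actions: deactivate a long-lived access key, and deny any role session issued before the moment of containment. Both assume a boto3-style IAM client passed in. The function names and the inline policy name `RevokeOlderSessions` are illustrative choices; the deny-on-`aws:TokenIssueTime` pattern is the same idea the console's "revoke active sessions" action uses.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional


def revoke_sessions_policy(issued_before: datetime) -> str:
    """Build a policy denying everything to sessions issued before a cutoff."""
    cutoff = issued_before.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"DateLessThan": {"aws:TokenIssueTime": cutoff}},
        }],
    })


def contain_access_key(iam: Any, user_name: str, access_key_id: str) -> None:
    """Deactivate (not delete) the key so it stays available as evidence."""
    iam.update_access_key(
        UserName=user_name, AccessKeyId=access_key_id, Status="Inactive"
    )


def contain_role(iam: Any, role_name: str, now: Optional[datetime] = None) -> str:
    """Invalidate existing sessions of a role by attaching an inline deny."""
    cutoff = now or datetime.now(timezone.utc)
    document = revoke_sessions_policy(cutoff)
    iam.put_role_policy(
        RoleName=role_name, PolicyName="RevokeOlderSessions", PolicyDocument=document
    )
    return document
```

Marking the key inactive rather than deleting it keeps the key ID available for the CloudTrail investigation that follows.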

Exam Focus: What Questions Test From This Step

Tracing an IAM denial through SCP, permission boundary, identity policy, resource policy, and session policy layers
Using the Policy Simulator vs reading CloudTrail’s actual denial event
Config managed rules vs custom rules, and how NON_COMPLIANT triggers auto-remediation
Conformance packs as the deployment unit for compliance frameworks across accounts
Reading a GuardDuty finding type string and mapping it to severity and required action
Security Hub’s role as aggregator/normalizer, distinct from GuardDuty as detector
KMS automatic key rotation (transparent, same key ID) vs Secrets Manager rotation (new credential value, needs app awareness)
Incident response sequencing — contain and isolate before deep investigation

Written by NPBlue Cloud Team, Cloud & Platform Engineers who run production workloads on AWS daily and write from real deployment experience, not the docs alone.

Reviewed for technical accuracy.