Step 4: Security & Compliance Operations
Security domain questions on this exam rarely ask you to design a security architecture from scratch. They put you in the middle of an already-built environment and ask: this policy isn't working, this finding just fired, this resource drifted out of compliance. Now what do you actually do about it? That operator's-eye framing is what this step trains for.
IAM Policy Troubleshooting: Finding the Denial
"Access denied" tickets are a daily occurrence in any team running production AWS accounts, and the exam tests whether you know where to actually look. Permissions in AWS are evaluated across several layers, and a denial can originate from any one of them:
Request: user tries to s3:PutObject on bucket "reports"

1. SCP (Organizations): an explicit deny anywhere means denied
2. Permission boundary: caps the maximum possible permission
3. Identity-based policy: does the user/role allow it?
4. Resource-based policy: does the bucket policy allow it?
5. Session policy (if assumed role): further narrows the session

Any explicit Deny wins, regardless of Allows elsewhere. Default (nothing matches) = implicit deny.

The practical troubleshooting tool is the IAM Policy Simulator, which lets you test a specific action against a specific principal and see which policy is responsible for the denial. That is far faster than manually reading through five stacked policies looking for the one Deny statement. For a denial that only happens in practice but simulates as allowed, check CloudTrail for the actual denied API call. The event records the error code and message, which for many services names the type of policy responsible, and it reflects the real request context (conditions like source IP or MFA presence) that the simulator can't always replicate.
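The layered evaluation described above can be sketched in a few lines of Python. This is a deliberately simplified model for a same-account request, not AWS's actual evaluation engine: it assumes each layer has already been reduced to a single verdict, and the names `LayerVerdicts` and `evaluate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Verdict of one policy layer for one request (illustrative constants).
ALLOW, DENY, NO_MATCH = "Allow", "ExplicitDeny", "NoMatch"

@dataclass
class LayerVerdicts:
    identity_policy: str = NO_MATCH
    scp: Optional[str] = None                  # None = layer not present
    permission_boundary: Optional[str] = None
    resource_policy: Optional[str] = None
    session_policy: Optional[str] = None

def evaluate(v: LayerVerdicts) -> Tuple[bool, str]:
    """Return (allowed, reason) under the simplified layered model."""
    layers = [
        ("scp", v.scp),
        ("permission_boundary", v.permission_boundary),
        ("identity_policy", v.identity_policy),
        ("resource_policy", v.resource_policy),
        ("session_policy", v.session_policy),
    ]
    # Rule 1: an explicit Deny in any layer wins outright.
    for name, verdict in layers:
        if verdict == DENY:
            return False, f"explicit deny in {name}"
    # Rule 2: guardrail layers only cap permissions; when present they must allow.
    guardrails = ("scp", "permission_boundary", "session_policy")
    for name, verdict in layers:
        if name in guardrails and verdict is not None and verdict != ALLOW:
            return False, f"implicit deny: {name} does not allow the action"
    # Rule 3: something must actually grant the permission.
    if v.identity_policy == ALLOW or v.resource_policy == ALLOW:
        return True, "allowed"
    return False, "implicit deny: no policy allows the action"
```

Calling `evaluate(LayerVerdicts(identity_policy=ALLOW, scp=DENY))` returns a denial that names the SCP, which is the same question the Policy Simulator answers against real policies.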
IAM Access Analyzer solves a different, adjacent problem: it flags resources (S3 buckets, IAM roles, KMS keys, Lambda functions) that are reachable from outside your account or organization, which is how you catch an accidentally public resource before an attacker does rather than after.
AWS Config: Continuous Compliance, Not Point-in-Time Audits
Config records configuration changes to your resources over time and evaluates them against rules. Think of it as a continuous compliance engine rather than a one-time audit checklist.
Resource change event (SG rule modified)
-> Config rule: "restricted-ssh"
-> result: COMPLIANT or NON_COMPLIANT
-> on NON_COMPLIANT: EventBridge -> Automation runbook (remediate) -> SNS notification

Managed rules cover common checks out of the box (s3-bucket-public-read-prohibited, restricted-ssh, rds-storage-encrypted). Custom rules, backed by a Lambda function, cover organization-specific logic that managed rules don't. Either way, a rule evaluation that comes back NON_COMPLIANT can trigger auto-remediation through an SSM Automation document. This is the same Automation capability covered in the deployment step, now wired to compliance rather than deployment.
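To make the rule-evaluation step concrete, here is a minimal sketch of the decision logic a custom rule in the spirit of restricted-ssh might apply. The dictionary shape (`ingress_rules`, `protocol`, `from_port`, `to_port`, `cidrs`) is a simplified assumption for illustration, not the actual Config configuration item schema, and the function names are hypothetical.

```python
from typing import Any, Dict

# CIDR blocks that mean "anyone on the internet".
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}

def rule_covers_port(rule: Dict[str, Any], port: int) -> bool:
    """True if this ingress rule admits TCP traffic on the given port."""
    protocol = rule.get("protocol")
    if protocol == "-1":          # "-1" conventionally means all protocols/ports
        return True
    if protocol != "tcp":
        return False
    return rule.get("from_port", 0) <= port <= rule.get("to_port", 65535)

def evaluate_restricted_ssh(security_group: Dict[str, Any]) -> str:
    """Return a Config-style compliance string for one security group."""
    for rule in security_group.get("ingress_rules", []):
        open_to_world = bool(OPEN_CIDRS & set(rule.get("cidrs", [])))
        if open_to_world and rule_covers_port(rule, 22):
            return "NON_COMPLIANT"
    return "COMPLIANT"

if __name__ == "__main__":
    sg = {"ingress_rules": [
        {"protocol": "tcp", "from_port": 22, "to_port": 22, "cidrs": ["0.0.0.0/0"]},
    ]}
    print(evaluate_restricted_ssh(sg))
```

In a real custom rule, a Lambda handler would extract the resource configuration from the invoking event and report the result back to Config; only the decision logic is shown here.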
Conformance packs bundle a set of Config rules plus their remediation actions into a single deployable unit, typically mapped to a compliance framework (a CIS benchmark, PCI-DSS, an internal security baseline). Deploying a conformance pack across an organization, through Config's organization conformance packs or CloudFormation StackSets, is the standard way to enforce a consistent compliance posture across every account without hand-configuring rules account by account.
Config aggregators roll up compliance data from multiple accounts and regions into a single view. That is essential once you're past a handful of accounts, because nobody wants to check compliance dashboards account by account every morning.
GuardDuty and Security Hub: Detection to Triage
GuardDuty is a threat detection service that analyzes CloudTrail management events, VPC Flow Logs, DNS query logs, and (for supported workloads) EKS audit logs and runtime activity, with no agents to deploy for the log-based sources. It looks for patterns consistent with compromised credentials, cryptomining, reconnaissance activity, and command-and-control communication.
A GuardDuty finding has a severity score and a structured type string that actually tells you what happened if you know how to read it:
Finding type: UnauthorizedAccess:EC2/SSHBruteForce

- UnauthorizedAccess: threat category
- EC2: resource type
- SSHBruteForce: specific behavior observed
- Severity: 5.4 (Medium)
- Instance: i-0a1b2c3d
- Action: SSH brute force attempts detected from 203.0.113.44

Security Hub sits a layer above GuardDuty. It aggregates findings from GuardDuty, Inspector, Macie, Config, and third-party tools into one normalized format (ASFF) and provides a security posture score against standards like CIS or the AWS Foundational Security Best Practices standard. Operationally, Security Hub is where you triage volume: sort by severity, group by resource, and route the highest-confidence, highest-severity findings into an incident workflow rather than treating every finding as equally urgent.
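Reading the type string and triaging by severity can be sketched as follows. The helper names are hypothetical, the findings are plain dictionaries rather than real GuardDuty or ASFF payloads, and the severity bands (Low below 4.0, Medium below 7.0, High from 7.0) are an assumption consistent with the 5.4 (Medium) example; check the current GuardDuty documentation for the exact bands.

```python
from typing import Dict, List, NamedTuple

class FindingType(NamedTuple):
    threat_category: str
    resource_type: str
    behavior: str

def parse_finding_type(type_string: str) -> FindingType:
    """Split 'Category:Resource/Behavior' into its three parts."""
    category, _, rest = type_string.partition(":")
    resource, _, behavior = rest.partition("/")
    if not (category and resource and behavior):
        raise ValueError(f"unexpected finding type format: {type_string!r}")
    return FindingType(category, resource, behavior)

def severity_label(score: float) -> str:
    """Map a numeric severity to a label (bands are an assumption, see lead-in)."""
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    return "Low"

def triage(findings: List[Dict]) -> List[Dict]:
    """Order findings so the most severe are handled first."""
    return sorted(findings, key=lambda f: f["severity"], reverse=True)

if __name__ == "__main__":
    parsed = parse_finding_type("UnauthorizedAccess:EC2/SSHBruteForce")
    print(parsed, severity_label(5.4))
```

Grouping by resource, the other triage step mentioned above, would be a similar one-liner keyed on the affected instance or role.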
| Service | Primary job | Output |
|---|---|---|
| GuardDuty | Threat detection from logs, no agents | Individual findings, severity-scored |
| Security Hub | Aggregation, normalization, posture scoring | Unified dashboard across sources |
| Config | Continuous configuration compliance | Compliance state per resource, per rule |
| Inspector | Vulnerability scanning (EC2, ECR images) | CVE findings |
A finding routed to EventBridge can trigger an Automation runbook automatically (isolate an instance by swapping its security group, snapshot it for forensics, and notify the security team), turning detection into a documented, repeatable response instead of an ad hoc scramble.
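As a rough illustration of the routing step, the sketch below builds an EventBridge event pattern that selects only higher-severity GuardDuty findings. The threshold of 7.0 and the function names are assumptions, and `matches` is a local stand-in that handles this one pattern only; it is not how EventBridge itself is implemented.

```python
import json
from typing import Any, Dict

def guardduty_severity_pattern(min_severity: float = 7.0) -> Dict[str, Any]:
    """Event pattern: GuardDuty findings at or above a severity threshold."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {"severity": [{"numeric": [">=", min_severity]}]},
    }

def matches(event: Dict[str, Any], min_severity: float = 7.0) -> bool:
    """Local check mirroring the pattern above, useful for unit-testing routing."""
    return (
        event.get("source") == "aws.guardduty"
        and event.get("detail-type") == "GuardDuty Finding"
        and event.get("detail", {}).get("severity", 0) >= min_severity
    )

if __name__ == "__main__":
    # The JSON form is what would be supplied as the rule's event pattern.
    print(json.dumps(guardduty_severity_pattern(), indent=2))
```

The rule's target would then be the Automation runbook that performs the isolation, snapshot, and notification steps; lower-severity findings stay in Security Hub for normal triage.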
Encryption Key Rotation Operations
KMS customer managed keys support automatic rotation, annual by default. When enabled, AWS rotates the underlying key material on that schedule while preserving the key ID and ARN, so nothing referencing that key needs to change. Previous key material is retained internally so data encrypted under an older rotation still decrypts correctly.
This is different from manual rotation, where you create a brand-new key and have to update every reference to it yourself. That means more control, more operational overhead, and a real migration risk if you miss a reference somewhere.
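The behavior of automatic rotation can be modeled with a toy class. This is purely illustrative: the XOR "encryption" is not real cryptography, and `ToyKey` and its fields are hypothetical names, not part of any KMS API. The point is only that the key ID stays fixed while versions of key material accumulate.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class ToyKey:
    key_id: str
    # version number -> key material (a single byte here, for illustration)
    materials: Dict[int, int] = field(default_factory=lambda: {1: 0x5A})
    current_version: int = 1

    def rotate(self) -> None:
        """Add new material; keep every older version so old data still decrypts."""
        self.current_version += 1
        self.materials[self.current_version] = (0x5A + 37 * self.current_version) % 256

    def encrypt(self, plaintext: bytes) -> Tuple[int, bytes]:
        """Encrypt with the current material and record which version was used."""
        m = self.materials[self.current_version]
        return self.current_version, bytes(b ^ m for b in plaintext)

    def decrypt(self, blob: Tuple[int, bytes]) -> bytes:
        """Look up the material version recorded alongside the ciphertext."""
        version, data = blob
        m = self.materials[version]
        return bytes(b ^ m for b in data)

if __name__ == "__main__":
    key = ToyKey("key-1111")
    old_blob = key.encrypt(b"report")
    key.rotate()
    print(key.key_id, key.decrypt(old_blob))  # same key ID, old data still readable
```

Manual rotation, by contrast, would correspond to constructing a second `ToyKey` with a different `key_id` and updating every caller that holds the old one.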
CMK (automatic rotation enabled)

- Year 1: key material v1 -> encrypts data
- Year 2: key material v2 -> encrypts new data (v1 retained, still decrypts old data)
- Key ID/ARN: unchanged throughout, so nothing referencing this key breaks

Secrets Manager rotation is a separate, more operationally visible concern. It rotates the actual secret value (a database password, an API key), typically via a Lambda function that follows the two-version pattern: create a new secret version, update the underlying credential, test it, then mark it current. Native integrations exist for RDS, Redshift, and DocumentDB; anything else needs a custom rotation Lambda. Know the distinction: KMS rotation changes key material transparently, Secrets Manager rotation changes the actual credential and requires the dependent system to pick up the new value.
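The create, update, test, and mark-current sequence can be sketched with an in-memory stand-in. `ToySecret` and `rotate` are hypothetical names; the step names in the comments (createSecret, setSecret, testSecret, finishSecret) and the staging labels (AWSCURRENT, AWSPENDING, AWSPREVIOUS) follow Secrets Manager's rotation conventions, but the label handling here is simplified.

```python
from typing import Callable, Dict

class ToySecret:
    """In-memory stand-in for a secret with versions and staging labels."""
    def __init__(self, initial_value: str) -> None:
        self.versions: Dict[str, str] = {"v1": initial_value}
        self.labels: Dict[str, str] = {"AWSCURRENT": "v1"}

    def current_value(self) -> str:
        return self.versions[self.labels["AWSCURRENT"]]

def rotate(
    secret: ToySecret,
    new_version_id: str,
    new_value: str,
    apply_to_target: Callable[[str], None],
    test_login: Callable[[str], bool],
) -> None:
    # createSecret: stage the new value without touching the current one.
    secret.versions[new_version_id] = new_value
    secret.labels["AWSPENDING"] = new_version_id
    # setSecret: change the credential on the underlying system.
    apply_to_target(new_value)
    # testSecret: prove the new credential works before promoting it.
    if not test_login(new_value):
        raise RuntimeError("pending credential failed its test; AWSCURRENT unchanged")
    # finishSecret: promote the pending version and keep the old one as previous.
    secret.labels["AWSPREVIOUS"] = secret.labels["AWSCURRENT"]
    secret.labels["AWSCURRENT"] = new_version_id
    del secret.labels["AWSPENDING"]

if __name__ == "__main__":
    database = {"password": "old-pw"}          # pretend target system
    secret = ToySecret("old-pw")
    rotate(secret, "v2", "new-pw",
           apply_to_target=lambda pw: database.update(password=pw),
           test_login=lambda pw: database["password"] == pw)
    print(secret.current_value(), secret.labels)
```

Applications that cache the old value keep failing until they re-read the secret, which is the "needs app awareness" half of the distinction.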
Incident Response Runbooks: An Operator's View
A mature incident response runbook isn't a wiki page. It's closer to an automated decision tree with clear ownership at each branch:
Trigger: GuardDuty finding, credential compromise suspected

1. Contain: revoke/rotate the compromised credential (IAM, Secrets Manager)
2. Isolate: quarantine the affected resource (restrictive SG, network ACL)
3. Investigate: CloudTrail for actions taken under the compromised identity
4. Eradicate: remove persistence (rogue IAM users/roles, unexpected keys)
5. Recover: restore from a known-good state (AMI, backup) if needed
6. Review: post-incident, feed findings back into Config rules/GuardDuty

The operational habit worth internalizing: containment comes before investigation. Revoke the credential or isolate the resource first, then investigate. Don't leave an active compromise running longer just to gather more evidence. CloudTrail, Config's configuration history, and VPC Flow Logs together reconstruct the timeline once containment is underway.
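One way to encode that ordering is a small runner that always executes phases in a fixed sequence and refuses to start without containment and isolation handlers. This is a sketch under stated assumptions: `PHASES`, `REQUIRED`, and `run_runbook` are hypothetical names, and the handlers are placeholders for real automation.

```python
from typing import Callable, Dict, List

# Fixed phase order: containment and isolation precede investigation.
PHASES = ["contain", "isolate", "investigate", "eradicate", "recover", "review"]
REQUIRED = {"contain", "isolate"}

def run_runbook(handlers: Dict[str, Callable[[], str]]) -> List[str]:
    """Run handlers in phase order and return a log of what happened."""
    missing = REQUIRED - set(handlers)
    if missing:
        # Fail before doing anything rather than investigating an active compromise.
        raise ValueError(f"runbook is missing required phases: {sorted(missing)}")
    log = []
    for phase in PHASES:
        handler = handlers.get(phase)
        if handler is None:
            log.append(f"{phase}: skipped (no handler)")
        else:
            log.append(f"{phase}: {handler()}")
    return log

if __name__ == "__main__":
    for line in run_runbook({
        "investigate": lambda: "queried CloudTrail for the identity",
        "contain": lambda: "access key deactivated",
        "isolate": lambda: "instance moved to quarantine SG",
    }):
        print(line)
```

The handlers are passed in any order, but the log always shows containment first, which is the sequencing the exam expects you to recognize.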
Exam Focus: What Questions Test From This Step
- Tracing an IAM denial through SCP, permission boundary, identity policy, resource policy, and session policy layers
- Using the Policy Simulator vs reading CloudTrail's actual denial event
- Config managed rules vs custom rules, and how NON_COMPLIANT triggers auto-remediation
- Conformance packs as the deployment unit for compliance frameworks across accounts
- Reading a GuardDuty finding type string and mapping it to severity and required action
- Security Hub's role as aggregator/normalizer, distinct from GuardDuty as detector
- KMS automatic key rotation (transparent, same key ID) vs Secrets Manager rotation (new credential value, needs app awareness)
- Incident response sequencing: contain and isolate before deep investigation