AWS NAT Gateway: Private Subnet Internet Access Without Exposure
Resources in private subnets — application servers, Lambda functions, ECS tasks — often need outbound internet access: to download software updates, call external APIs, reach public S3 endpoints, or communicate with third-party services. But they should not be reachable from the internet.
NAT Gateway solves this. It translates outbound traffic from private instances, replacing their private source IP with the NAT Gateway’s Elastic IP. Response traffic comes back to the NAT Gateway, which translates it back and forwards it to the originating private instance. Unsolicited inbound connections cannot reach private instances through NAT Gateway.
How NAT Gateway Works
Private instance (10.0.11.45) makes an outbound HTTPS request:
    Step 1: Packet leaves 10.0.11.45 with destination api.stripe.com
    Step 2: Private subnet route table: 0.0.0.0/0 → NAT Gateway
    Step 3: NAT Gateway receives packet, replaces source 10.0.11.45 with its Elastic IP (54.123.45.67), records the mapping
    Step 4: NAT Gateway sends packet through Internet Gateway
    Step 5: api.stripe.com responds to 54.123.45.67
    Step 6: NAT Gateway receives response, looks up mapping, forwards to 10.0.11.45

The private instance never needs a public IP. The internet sees only the NAT Gateway’s Elastic IP.
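The translation steps above can be modelled in a few lines of shell. This is only a toy sketch of source NAT, not how AWS implements it. The function names `nat_outbound` and `nat_inbound`, the port numbering that starts at 1024, and the private source port 49152 are all assumptions made for illustration.

```shell
# Toy model of source NAT: one Elastic IP, one mapping per outbound flow.
NAT_EIP="54.123.45.67"
NAT_TABLE=""      # newline-separated entries: "<nat_port> <private_ip:port>"
NEXT_PORT=1024    # assumed starting port, for illustration only

nat_outbound() {
  # $1 = private source "ip:port". Records the mapping and stores the
  # rewritten source address in NAT_RESULT.
  NAT_TABLE="${NAT_TABLE}${NEXT_PORT} $1
"
  NAT_RESULT="${NAT_EIP}:${NEXT_PORT}"
  NEXT_PORT=$((NEXT_PORT + 1))
}

nat_inbound() {
  # $1 = destination port on the Elastic IP. Prints the private target,
  # or DROP when no outbound flow created a mapping for that port.
  target=$(printf '%s' "$NAT_TABLE" | awk -v p="$1" '$1 == p { print $2; exit }')
  echo "${target:-DROP}"
}

nat_outbound "10.0.11.45:49152"
echo "outbound source rewritten to $NAT_RESULT"
echo "reply to port 1024 is forwarded to $(nat_inbound 1024)"
echo "unsolicited packet to port 2000: $(nat_inbound 2000)"
```

The last line is the point of the model: a packet arriving on a port with no recorded mapping has nowhere to go, which is why unsolicited inbound connections never reach the private instance.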
Creating a NAT Gateway
NAT Gateway must live in a public subnet — it needs access to the Internet Gateway to send traffic to the internet.
    # Allocate an Elastic IP for the NAT Gateway
    EIP_ALLOC=$(aws ec2 allocate-address \
      --domain vpc \
      --query 'AllocationId' --output text)

    EIP_ADDR=$(aws ec2 describe-addresses \
      --allocation-ids "$EIP_ALLOC" \
      --query 'Addresses[0].PublicIp' --output text)

    echo "NAT Gateway will use EIP: $EIP_ADDR"

    # Create NAT Gateway in a public subnet
    NAT_ID=$(aws ec2 create-nat-gateway \
      --subnet-id subnet-public-0a1b2c \
      --allocation-id "$EIP_ALLOC" \
      --tag-specifications 'ResourceType=natgateway,Tags=[{Key=Name,Value=nat-1a}]' \
      --query 'NatGateway.NatGatewayId' --output text)

    # Wait for NAT Gateway to be available (2-3 minutes)
    echo "Waiting for NAT Gateway to be available..."
    aws ec2 wait nat-gateway-available --nat-gateway-ids "$NAT_ID"

    echo "NAT Gateway $NAT_ID is ready"

    # Update private subnet route table to use NAT Gateway
    aws ec2 create-route \
      --route-table-id rtb-private-0abc123 \
      --destination-cidr-block 0.0.0.0/0 \
      --nat-gateway-id "$NAT_ID"

After this, instances in the private subnet can initiate outbound connections. To verify:

    # SSH to instance in private subnet (via bastion), then:
    curl -s https://api.ipify.org   # Should return the NAT Gateway's EIP

One NAT Gateway Per AZ
AWS recommends one NAT Gateway per Availability Zone. A single NAT Gateway in one AZ is a hidden cross-AZ dependency:
    Bad design (single NAT Gateway):
      AZ-1a: public-1a (NAT-GW)  ← All private subnets point here
      AZ-1a: private-1a → 0.0.0.0/0 → NAT-GW  (same AZ: fine)
      AZ-1b: private-1b → 0.0.0.0/0 → NAT-GW  (cross-AZ: problem if AZ-1a fails)
      AZ-1c: private-1c → 0.0.0.0/0 → NAT-GW  (cross-AZ: problem if AZ-1a fails)

    Good design (one NAT Gateway per AZ):
      AZ-1a: private-1a → 0.0.0.0/0 → NAT-GW-1a  (same AZ)
      AZ-1b: private-1b → 0.0.0.0/0 → NAT-GW-1b  (same AZ)
      AZ-1c: private-1c → 0.0.0.0/0 → NAT-GW-1c  (same AZ)

With one NAT Gateway per AZ:
- AZ failure only affects that AZ’s private subnet outbound traffic
- Eliminates cross-AZ data transfer costs (cross-AZ traffic is billed at $0.01/GB each way)
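One way to catch the single-gateway layout is to audit which AZ each private subnet's default route lands in. The sketch below assumes you have already collected one line per private subnet in a hypothetical `name subnet_az nat_az` format, for example from `aws ec2 describe-subnets`, `describe-route-tables`, and `describe-nat-gateways` output. The function name and input format are illustrative, not an AWS feature.

```shell
# Flags private subnets whose default route points at a NAT Gateway in a
# different AZ. Input (hypothetical format): "<subnet_name> <subnet_az> <nat_az>"
find_cross_az_routes() {
  awk '$2 != $3 { print $1 ": subnet in " $2 " routes via NAT Gateway in " $3 }'
}

# Sample data matching the single NAT Gateway design.
printf '%s\n' \
  "private-1a us-east-1a us-east-1a" \
  "private-1b us-east-1b us-east-1a" \
  "private-1c us-east-1c us-east-1a" | find_cross_az_routes
```

With the sample data, `private-1b` and `private-1c` are reported. After moving to one NAT Gateway per AZ, the function should print nothing.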
NAT Gateway Pricing
NAT Gateway costs two things:
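Hourly rate: $0.045/hr per NAT Gateway in us-east-1, or about $32/month. Three gateways (one per AZ) come to $0.135/hr, about $97/month, just for the gateway resources.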
Data processing: $0.045 per GB of data processed through the NAT Gateway, in either direction.
For high-throughput workloads, the data processing cost dominates. Lambda functions that call external APIs frequently or ECS tasks downloading large payloads can generate significant NAT Gateway bills.
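To see where the crossover happens, the two charges can be combined into a rough monthly estimate. The sketch below assumes us-east-1 rates of $0.045 per gateway-hour and $0.045 per GB and a 720-hour (30-day) month. Check current pricing for your region before relying on the numbers. The function name `nat_monthly_cost` is made up for this example.

```shell
# Rough monthly NAT Gateway cost in USD.
# $1 = number of NAT Gateways, $2 = total GB processed per month
nat_monthly_cost() {
  awk -v n="$1" -v gb="$2" 'BEGIN {
    hourly = 0.045   # assumed USD per gateway-hour (us-east-1)
    per_gb = 0.045   # assumed USD per GB processed
    hours  = 720     # 30-day month
    printf "%.2f\n", n * hourly * hours + gb * per_gb
  }'
}

nat_monthly_cost 1 0       # one idle gateway
nat_monthly_cost 3 1000    # three gateways, 1 TB processed
nat_monthly_cost 3 10000   # three gateways, 10 TB processed
```

Under these assumptions the fixed cost of three gateways is about $97, and data processing overtakes it once traffic passes roughly 2 TB per month.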
Cost optimisation strategies:
- Use VPC Gateway Endpoints for S3 and DynamoDB (free, bypasses NAT Gateway)
- Use VPC Interface Endpoints for other AWS services (eliminates NAT Gateway cost for AWS API calls)
- Evaluate whether private subnets need internet access at all — isolated subnets for pure internal traffic save NAT Gateway costs
    # Estimate NAT Gateway data processing in CloudWatch
    # (BytesOutToDestination covers outbound traffic; repeat with
    # BytesInFromDestination to see the return direction)
    aws cloudwatch get-metric-statistics \
      --namespace AWS/NATGateway \
      --metric-name BytesOutToDestination \
      --dimensions Name=NatGatewayId,Value="$NAT_ID" \
      --statistics Sum \
      --period 86400 \
      --start-time 2024-01-01T00:00:00Z \
      --end-time 2024-01-08T00:00:00Z \
      --query 'Datapoints[*].[Timestamp,Sum]' \
      --output table

NAT Gateway vs NAT Instance
NAT Instances are EC2 instances running a NAT AMI — a legacy option that predates NAT Gateway. The comparison:
┌────────────────────────────────────────────────────────────────┐
│                  NAT Gateway vs NAT Instance                   │
│                                                                │
│  Attribute         NAT Gateway         NAT Instance            │
│  ────────────────────────────────────────────────────────────  │
│  Management        Fully managed       You manage the EC2      │
│  Availability      Highly available    Single-instance (SPOF)  │
│  Bandwidth         Up to 100 Gbps      Limited by instance     │
│  Scaling           Automatic           Manual or ASG           │
│  Patching          AWS patches         You patch               │
│  Security groups   Not supported       Supported               │
│  Cost              Per hour + per GB   Per instance type       │
│  Port forwarding   No                  Yes                     │
│  Bastion use       No                  Yes                     │
└────────────────────────────────────────────────────────────────┘

Use NAT Gateway for all new deployments. NAT Instances are only relevant if you need port forwarding or are running a very low-bandwidth workload where a t3.nano NAT Instance is cheaper than a NAT Gateway.
Private NAT Gateway
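Standard NAT Gateways are “public”: they have an Elastic IP and route through an IGW. A “private” NAT Gateway has no Elastic IP and never reaches the internet. Instead it routes traffic to other VPCs or to an on-premises network through a transit gateway or virtual private gateway, for example over Direct Connect or Site-to-Site VPN.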
Use case: resources in VPC A need to communicate with resources in VPC B, but the CIDR ranges overlap, preventing direct peering. A private NAT Gateway in a transit VPC translates the source addresses.
    aws ec2 create-nat-gateway \
      --subnet-id subnet-private-transit \
      --connectivity-type private \
      --tag-specifications 'ResourceType=natgateway,Tags=[{Key=Name,Value=private-nat-transit}]'

This is an advanced pattern for complex multi-VPC architectures.
VPC Endpoints: Eliminating NAT Costs for AWS Services
The most impactful NAT Gateway cost reduction is adding VPC endpoints for AWS services your workloads use frequently:
    # S3 Gateway Endpoint (free, no per-GB charge)
    aws ec2 create-vpc-endpoint \
      --vpc-id vpc-0abc123 \
      --service-name com.amazonaws.us-east-1.s3 \
      --route-table-ids rtb-private-1a rtb-private-1b rtb-private-1c

    # ECR Interface Endpoints (needed if pulling Docker images from private subnet)
    aws ec2 create-vpc-endpoint \
      --vpc-id vpc-0abc123 \
      --vpc-endpoint-type Interface \
      --service-name com.amazonaws.us-east-1.ecr.dkr \
      --subnet-ids subnet-private-1a subnet-private-1b \
      --security-group-ids sg-vpc-endpoints

    # After adding S3 gateway endpoint, ECS tasks pulling from ECR
    # no longer route S3 layer traffic through the NAT Gateway

For an ECS Fargate deployment pulling large container images, this can eliminate gigabytes of NAT Gateway processing per deployment.
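To put a number on that, the sketch below estimates what image pulls cost in NAT Gateway data processing when layer traffic still flows through the gateway. The $0.045/GB rate is the figure quoted earlier. The image size, task count, and deployment frequency are hypothetical inputs, and `image_pull_nat_cost` is a name invented for this example.

```shell
# Monthly NAT data-processing cost (USD) of container image pulls.
# $1 = image size in GB, $2 = tasks started per deployment,
# $3 = deployments per month
image_pull_nat_cost() {
  awk -v gb="$1" -v tasks="$2" -v deploys="$3" 'BEGIN {
    per_gb = 0.045   # USD per GB processed by the NAT Gateway
    printf "%.2f\n", gb * tasks * deploys * per_gb
  }'
}

# Hypothetical workload: 2 GB image, 20 tasks, 30 deployments a month
image_pull_nat_cost 2 20 30
```

Since the S3 gateway endpoint itself is free, the whole amount is saved once image layers stop transiting the NAT Gateway.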
Monitoring NAT Gateway
CloudWatch provides metrics for each NAT Gateway:
- ActiveConnectionCount: current active TCP connections
- BytesInFromDestination: bytes received from the internet (inbound to private instances)
- BytesOutToDestination: bytes sent to the internet (outbound from private instances)
- ErrorPortAllocation: failed connection attempts (NAT Gateway exhausting ports)
- PacketsDropCount: packets dropped (possible connection exhaustion)
Port exhaustion (ErrorPortAllocation) occurs when a NAT Gateway runs out of ports to map connections. Each NAT Gateway supports up to 55,000 simultaneous connections to the same destination IP and port. If your application opens many short-lived connections to a single destination (like a Redis SaaS), you may hit this limit.
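A back-of-the-envelope check can show whether a workload is near that limit: concurrent connections are roughly the connection rate multiplied by how long each connection holds its port. The sketch below applies that estimate. The function names and the sample rates are assumptions, and the real hold time depends on your application's connection handling.

```shell
NAT_PORT_LIMIT=55000   # simultaneous connections per unique destination

ports_in_use() {
  # $1 = new connections per second to a single destination
  # $2 = seconds each connection holds its port (lifetime plus any lingering time)
  echo $(( $1 * $2 ))
}

check_port_headroom() {
  used=$(ports_in_use "$1" "$2")
  if [ "$used" -ge "$NAT_PORT_LIMIT" ]; then
    echo "at risk: ~$used concurrent connections (limit $NAT_PORT_LIMIT)"
  else
    echo "ok: ~$used concurrent connections (limit $NAT_PORT_LIMIT)"
  fi
}

check_port_headroom 200 60    # 200 conn/s held for 60 s
check_port_headroom 500 120   # 500 conn/s held for 120 s
```

When the estimate is close to the limit, reusing connections (keep-alive or a connection pool) lowers the rate of new connections and is usually the first thing to try.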
Common Interview Questions
Q: What is the difference between NAT Gateway and Internet Gateway? Internet Gateway connects the VPC to the internet and handles two-way traffic for public subnets (instances with public IPs). NAT Gateway provides one-way outbound internet access for private subnet instances without exposing them to inbound internet connections.
Q: Does NAT Gateway need a public IP? A public NAT Gateway requires an Elastic IP. A private NAT Gateway (for on-premises connectivity without internet) does not need a public IP.
Q: Why would you create a NAT Gateway in each AZ instead of just one? Single-AZ NAT Gateway creates a cross-AZ dependency. If that AZ fails, private subnets in other AZs lose outbound internet. Also, cross-AZ data transfer is billed; keeping NAT Gateway and private subnet in the same AZ avoids that cost.
Q: Can NAT Gateway accept inbound connections from the internet? No. NAT Gateway only handles outbound-initiated flows. It maintains a connection tracking table so response traffic can return to the private instance, but no external system can initiate a connection through NAT Gateway to a private instance.