AWS Amazon Web Services 61 guides · updated 2026

Hands-on guides to compute, storage, databases, networking, and serverless on the world's most widely adopted cloud platform.

EC2 Instance Families: Which Type Fits Your CPU, Memory, Storage, or GPU Workload

Choosing the wrong instance type costs money in two directions. A memory-optimised instance for a CPU-bound application wastes RAM you are paying for. A compute-optimised instance for a caching layer runs out of memory and starts swapping. AWS offers dozens of instance families specifically because different workloads have different bottlenecks.

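The naming convention is consistent: family + generation + attribute suffix + . + size. For example, r7g.4xlarge is memory-optimised (r), seventh generation, Graviton ARM (g), size 4xlarge. Other suffix letters mark further attributes, such as i for Intel, a for AMD, d for local NVMe storage, and n for enhanced networking.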
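The convention is regular enough to split mechanically. The sketch below is a hypothetical helper (`parse_instance_type` is not part of the AWS CLI) that assumes names of the form letters, digits, optional letters, dot, size. Names that break that pattern, such as u-6tb1.112xlarge or hyphenated variants, are rejected.

```shell
# Split an EC2 instance type name into family, generation, attributes, and size.
# Hypothetical helper for illustration only.
parse_instance_type() {
  local parsed
  parsed=$(printf '%s\n' "$1" | sed -nE \
    's/^([a-z]+)([0-9]+)([a-z]*)\.([a-z0-9]+)$/family=\1 generation=\2 attributes=\3 size=\4/p')
  if [ -z "$parsed" ]; then
    echo "unrecognised instance type: $1" >&2
    return 1
  fi
  echo "$parsed"
}

parse_instance_type r7g.4xlarge   # family=r generation=7 attributes=g size=4xlarge
parse_instance_type g4dn.xlarge   # family=g generation=4 attributes=dn size=xlarge
```

An empty attributes field, as in t3.micro, simply means the type carries no suffix letters.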

┌────────────────────────────────────────────────────────────────────────┐
│  EC2 Instance Family Map                                               │
│                                                                        │
│  PRIMARY BOTTLENECK      FAMILY      CURRENT GEN   BEST FOR            │
│  ────────────────────────────────────────────────────────────────────  │
│  Balanced (dev/test)     T-series    t3, t4g       Burstable VMs       │
│  Balanced (prod)         M-series    m6i, m7g      APIs, app servers   │
│  CPU-bound               C-series    c6i, c7g      Encoding, HPC       │
│  Memory-bound            R-series    r6i, r7g      DB buffer pools     │
│  Extreme memory          X-series    x2gd, x2idn   SAP HANA, Redis     │
│  NVMe local storage      I-series    i3, i4i       NoSQL, Elastic      │
│  Spinning HDD storage    D-series    d3            Hadoop HDFS         │
│  GPU training            P-series    p4d, p5       LLM training        │
│  GPU inference           G-series    g5, g4dn      Model serving       │
│  AWS Inferentia          Inf-series  inf2          Low-cost inference  │
└────────────────────────────────────────────────────────────────────────┘

Burstable Instances: T-Series

T instances earn CPU credits while running below a baseline and spend them when bursting above it. One credit equals one vCPU running at 100% for one minute. The t3.small baseline, for example, is 20% per vCPU. An idle web server accumulates credits all night and spends them during the morning traffic peak.

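CPU credit mechanics (t3.micro, 2 vCPUs):
Baseline = 10% per vCPU
Credits earned per hour: 12 (6 per vCPU)
Credits spent per hour at 100% CPU: 60 per vCPU (120 across both vCPUs)
Maximum balance: 288 credits (24 hours of earnings)
A full balance lasts about 2 h 40 min with both vCPUs at 100% (288 / (120 - 12) ≈ 2.7 hours)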
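How long a burst can last follows from simple arithmetic: credits spent per hour minus credits earned per hour, divided into the balance. The sketch below is a hypothetical helper, and the figures in the usage lines (a 288-credit balance and 12 credits earned per hour for a 2-vCPU t3.micro) are assumptions taken from AWS's published T3 credit table, so check them against the current documentation.

```shell
# Minutes until a CPU credit balance is exhausted at a sustained utilisation.
# One credit = one vCPU at 100% for one minute.
# Arguments: balance, credits earned per hour, vCPU count, utilisation percent.
credit_drain_minutes() {
  local balance=$1 earned=$2 vcpus=$3 util=$4
  local spent=$(( vcpus * 60 * util / 100 ))   # credits spent per hour
  local net=$(( spent - earned ))              # net credits lost per hour
  if [ "$net" -le 0 ]; then
    echo "never"                               # at or below baseline: no drain
    return 0
  fi
  echo $(( balance * 60 / net ))
}

credit_drain_minutes 288 12 2 100   # 160 (both vCPUs pinned at 100%)
credit_drain_minutes 288 12 2 10    # never (running exactly at baseline)
```

Integer arithmetic keeps the helper dependency-free; the result is truncated to whole minutes.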

Standard mode: when credits run out, the instance is throttled to baseline. The user notices degraded performance rather than a bill surprise. Standard is the default for T2 only; T3, T3a, and T4g instances launch in unlimited mode unless you specify otherwise.

Unlimited mode: when credits run out, the instance continues at full speed and AWS charges for the surplus CPU at $0.05 per vCPU-hour (the Linux rate for T3). Useful if you occasionally need full CPU but cannot predict when.

# Launch a t3.medium with unlimited CPU credits set explicitly.
# AMI IDs are region-specific; substitute one that exists in your region.
aws ec2 run-instances \
  --instance-type t3.medium \
  --image-id ami-0c02fb55956c7d316 \
  --credit-specification '{"CpuCredits":"unlimited"}'
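To get a feel for what unlimited mode can cost, surplus credits convert to vCPU-hours at 60 credits per vCPU-hour. The sketch below is a rough estimate only: `surplus_charge` is a hypothetical helper, the $0.05 rate is the figure quoted above, and real billing also depends on whether later earned credits pay down the surplus.

```shell
# Rough unlimited-mode surcharge: surplus credits / 60 = vCPU-hours, times the rate.
# Arguments: surplus credits consumed, price per vCPU-hour in dollars.
surplus_charge() {
  awk -v credits="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", credits / 60 * rate }'
}

# 600 surplus credits equal 10 vCPU-hours of excess CPU.
surplus_charge 600 0.05   # 0.50
```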

Good for: dev environments, internal tools, microservices with irregular traffic, small databases. Avoid for: consistently CPU-heavy workloads, where sustained bursting ends up costing more than an equivalent M-series instance.

General Purpose: M-Series

M instances deliver predictable, consistent CPU with no bursting mechanics. The ratio is roughly 1 vCPU to 4 GB RAM, which suits most application server workloads.

Instance       vCPU   RAM      Network bandwidth
m6i.large      2      8 GB     Up to 12.5 Gbps
m6i.xlarge     4      16 GB    Up to 12.5 Gbps
m6i.4xlarge    16     64 GB    Up to 12.5 Gbps
m6i.8xlarge    32     128 GB   12.5 Gbps

For most web and API workloads, the m7g Graviton3 versions cost roughly 15% to 20% less than comparable Intel instances (m6i, m7i) at on-demand rates. If your runtime is Python, Java, Go, or Node.js, migration to Graviton is straightforward.

Use M-series for: application servers, CI/CD build agents, medium-sized relational databases, backend services.

Compute Optimised: C-Series

C instances provide more vCPUs per dollar by giving you a tighter memory ratio — roughly 1 vCPU to 2 GB RAM.

Comparison at xlarge (us-east-1 on-demand Linux prices):
m5.xlarge: 4 vCPU, 16 GB RAM, $0.192/hr
c5.xlarge: 4 vCPU, 8 GB RAM, $0.170/hr
(same vCPU count, half the RAM, about 11% cheaper)
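The saving in a comparison like this is quick to recompute when prices change. The sketch below uses two hypothetical helpers, `pct_cheaper` and `price_per_vcpu`, with the hourly prices from the comparison as inputs; prices vary by region and over time.

```shell
# Percentage saving of an alternative price against a base price.
pct_cheaper() {
  awk -v base="$1" -v alt="$2" 'BEGIN { printf "%.1f\n", (base - alt) / base * 100 }'
}

# Hourly price per vCPU. Arguments: hourly price, vCPU count.
price_per_vcpu() {
  awk -v price="$1" -v vcpus="$2" 'BEGIN { printf "%.4f\n", price / vcpus }'
}

pct_cheaper 0.192 0.170    # 11.5
price_per_vcpu 0.192 4     # 0.0480
price_per_vcpu 0.170 4     # 0.0425
```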

For workloads where RAM is not the constraint, C instances deliver more compute per dollar. Use them for video encoding, scientific computing and HPC, and stateless API work.

# C7g is the Graviton3 compute-optimised family.
# The image ID below is a placeholder; substitute a real arm64 AMI ID.
aws ec2 run-instances \
  --instance-type c7g.2xlarge \
  --image-id ami-arm64-amazon-linux-2023

Memory Optimised: R, X, U-Series

Memory-optimised families provide large amounts of RAM relative to vCPUs. Use them when your working data set needs to fit in memory rather than spilling to disk.

R-series (8 GB per vCPU):
r7g.large: 2 vCPU, 16 GB RAM
r7g.4xlarge: 16 vCPU, 128 GB RAM
r7g.16xlarge: 64 vCPU, 512 GB RAM
X-series (higher RAM density):
x2gd.large: 2 vCPU, 32 GB RAM + 118 GB local NVMe
x2idn.32xlarge: 128 vCPU, 2 TB RAM
U-series (ultra-high memory):
u-6tb1.112xlarge: 448 vCPU, 6 TB RAM
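The RAM-per-vCPU ratios quoted so far (about 2 GB for C, 4 GB for M, 8 GB for R, more for X) can be turned into a first-pass family suggestion. The sketch below is a heuristic with a hypothetical helper name, `family_for_shape`; it looks only at the ratio and ignores storage, network, and GPU needs.

```shell
# Suggest a family from the vCPU count and RAM (whole GiB) a workload needs.
# Thresholds follow the approximate ratios in this guide: C=2, M=4, R=8 GiB per vCPU.
family_for_shape() {
  local vcpus=$1 ram=$2
  if   [ "$ram" -le $(( vcpus * 2 )) ]; then echo "C-series"
  elif [ "$ram" -le $(( vcpus * 4 )) ]; then echo "M-series"
  elif [ "$ram" -le $(( vcpus * 8 )) ]; then echo "R-series"
  else echo "X-series"
  fi
}

family_for_shape 16 32     # C-series
family_for_shape 8 32      # M-series
family_for_shape 16 128    # R-series
family_for_shape 4 64      # X-series
```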

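When you need memory-optimised instances: database buffer pools that must hold the working set, in-memory stores such as Redis, and large in-memory databases such as SAP HANA.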

Storage Optimised: I and D-Series

Storage-optimised instances have local NVMe SSDs (I-series) or high-capacity HDDs (D-series) physically attached to the host machine. The performance ceiling is dramatically higher than EBS for random I/O.

I-series local NVMe performance:
i3.large: 475 GB NVMe ~100,000 random read IOPS
i3.8xlarge: 7.6 TB NVMe ~1,650,000 random read IOPS
i4i.8xlarge: 7.5 TB NVMe ~800,000 random read IOPS
D-series local HDD (dense storage):
d3.8xlarge: 48 TB HDD ~4.6 GiB/s sequential throughput

The critical tradeoff: local storage is not persistent. Data is gone when the instance stops, terminates, or the host hardware fails. Use storage-optimised instances only for data that is replicated or can be rebuilt, such as caches, replicas in a distributed database, or scratch data.

GPU and Accelerated Instances: P, G, Inf Series

P-series (training): The p4d.24xlarge has 8× NVIDIA A100 GPUs connected via NVLink and 400 Gbps networking — the standard for large model training. The p5.48xlarge uses H100 GPUs and is the current generation for transformer model training.

G-series (inference and light training): g4dn.xlarge has one NVIDIA T4 GPU and is the entry point for model inference, video transcoding, and light training. g5 instances use NVIDIA A10G GPUs for more demanding inference workloads.

Inf2 (AWS Inferentia): AWS Inferentia2 chips are custom inference accelerators. Sizes range from inf2.xlarge (1 chip) to inf2.48xlarge (12 chips). For inference workloads where the model supports it, Inferentia2 can deliver better throughput-per-dollar than equivalent G-series instances.

# g4dn.xlarge for ML inference.
# Both image IDs below are placeholders; substitute real AMI IDs.
aws ec2 run-instances \
  --instance-type g4dn.xlarge \
  --image-id ami-deep-learning-ami-cuda

# inf2.xlarge for Inferentia
aws ec2 run-instances \
  --instance-type inf2.xlarge \
  --image-id ami-inf2-compatible

Graviton: ARM at Lower Cost

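AWS builds its own ARM processors under the Graviton brand. A g in the suffix of an instance name marks a Graviton instance: the seventh-generation families (m7g, c7g, r7g) run Graviton3, and the eighth-generation families (m8g, c8g, r8g) run Graviton4.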

Most families have Graviton equivalents: t4g for t3, m7g for m6i, c7g for c6i, r7g for r6i, and x2gd in the X-series.

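Graviton3 instances are typically 15% to 20% cheaper than equivalent Intel instances, and AWS states that Graviton3 uses up to 60% less energy for the same performance. Software running on a common managed runtime (Java on the JVM, Python, Node.js, Ruby) migrates without code changes, and Go only needs rebuilding for arm64. Native code compiled for x86, including native extensions pulled in by those runtimes, needs recompilation.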
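A practical consequence is that Graviton instances need arm64 AMIs, and a mismatched image fails at launch. The sketch below guesses the required architecture from the instance type name. `ami_arch_for` is a hypothetical helper and a heuristic: it treats a g among the letters after the generation digit as Graviton, so it handles g5 (a GPU family on x86) correctly but will misjudge families that do not follow the usual naming pattern.

```shell
# Guess the AMI architecture an instance type needs from its name.
# A "g" in the attribute letters after the generation digit marks Graviton.
# Names that do not match the pattern fall through to x86_64.
ami_arch_for() {
  local attrs
  attrs=$(printf '%s\n' "$1" | sed -nE 's/^[a-z]+[0-9]+([a-z]*)\..*$/\1/p')
  case "$attrs" in
    *g*) echo "arm64" ;;
    *)   echo "x86_64" ;;
  esac
}

ami_arch_for c7g.2xlarge   # arm64
ami_arch_for m6i.large     # x86_64
ami_arch_for g5.xlarge     # x86_64 (g is the family here, not a suffix)
ami_arch_for x2gd.large    # arm64
```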

How Instance Sizes Scale

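Within a family, each step up the size ladder roughly doubles resources. In the burstable t3 family, RAM doubles at every step while the vCPU count stays at 2 until xlarge: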

t3 family:
t3.nano: 2 vCPU, 0.5 GB RAM
t3.micro: 2 vCPU, 1 GB RAM
t3.small: 2 vCPU, 2 GB RAM
t3.medium: 2 vCPU, 4 GB RAM
t3.large: 2 vCPU, 8 GB RAM
t3.xlarge: 4 vCPU, 16 GB RAM
t3.2xlarge: 8 vCPU, 32 GB RAM

For M, C, and R families, both vCPU count and RAM double with each size step through the smaller sizes; larger sizes such as 12xlarge and 24xlarge break the strict doubling. Network bandwidth increases at the larger sizes.
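For M, C, and R families the size name is enough to estimate the shape. The sketch below assumes the common pattern of large = 2 vCPUs, xlarge = 4, and Nxlarge = 4 × N, combined with the approximate RAM ratios from this guide. `estimate_shape` is a hypothetical helper; it does not cover T-series, metal, or medium sizes, and exact figures should come from the AWS instance type tables.

```shell
# Estimate vCPUs and RAM for M, C, and R families from the size name.
# Arguments: family letter (c, m, or r), size (large, xlarge, 2xlarge, ...).
estimate_shape() {
  local family=$1 size=$2 ratio vcpus
  case "$family" in
    c) ratio=2 ;;
    m) ratio=4 ;;
    r) ratio=8 ;;
    *) echo "unsupported family: $family" >&2; return 1 ;;
  esac
  case "$size" in
    large)   vcpus=2 ;;
    xlarge)  vcpus=4 ;;
    *xlarge) vcpus=$(( ${size%xlarge} * 4 )) ;;   # strip "xlarge", keep the multiplier
    *) echo "unsupported size: $size" >&2; return 1 ;;
  esac
  echo "$vcpus vCPU, $(( vcpus * ratio )) GiB"
}

estimate_shape m 4xlarge    # 16 vCPU, 64 GiB
estimate_shape r 16xlarge   # 64 vCPU, 512 GiB
estimate_shape c 2xlarge    # 8 vCPU, 16 GiB
```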

Decision Framework

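  1. Profile first. Check CloudWatch CPU metrics on existing instances, plus memory metrics, which require the CloudWatch agent because EC2 does not publish memory utilisation by default. If CPU averages 15% but memory is at 80%, RAM is the constraint, so move to R-series.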
  2. Try Graviton. For any new workload on Python, Java, or Node.js, start with a Graviton instance unless you have a specific reason not to.
  3. Use current generation. Older generations (m4, c4, r4) are still available but cost more per unit of performance than current (m7, c7, r7).
  4. Right-size after a week. Launch with your best guess, monitor actual utilisation for 7 days, and adjust. AWS Compute Optimizer automates this analysis.
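The profiling step above can be reduced to a small rule of thumb once you have average CPU and memory utilisation in hand. The sketch below is illustrative only: `rightsizing_hint` is a hypothetical helper, and the 40% and 70% thresholds are assumptions chosen for the example, not AWS guidance.

```shell
# Map average CPU and memory utilisation (whole percent) to a right-sizing hint.
# Thresholds of 40% and 70% are illustrative assumptions.
rightsizing_hint() {
  local cpu=$1 mem=$2
  if   [ "$cpu" -lt 40 ] && [ "$mem" -ge 70 ]; then echo "memory-bound: consider R-series"
  elif [ "$cpu" -ge 70 ] && [ "$mem" -lt 40 ]; then echo "CPU-bound: consider C-series"
  elif [ "$cpu" -lt 40 ] && [ "$mem" -lt 40 ]; then echo "over-provisioned: try a smaller size"
  else echo "balanced: stay on M-series"
  fi
}

rightsizing_hint 15 80   # memory-bound: consider R-series
rightsizing_hint 85 30   # CPU-bound: consider C-series
```

AWS Compute Optimizer performs a far more thorough version of this analysis; the helper only makes the reasoning in step 1 explicit.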

Common Interview Questions

Q: When would you choose c5 over m5? When the workload is CPU-bound and RAM is not the limiting factor. C5 gives the same vCPU count as M5 at a tighter memory ratio, making it cheaper per compute unit for encoding, scientific computing, or stateless API work.

Q: What happens to data on an i4i instance when you terminate it? All local NVMe data is destroyed. Storage-optimised instances should only hold data that is replicated elsewhere or can be rebuilt — cache, distributed database replicas, or scratch data.

Q: What is the CPU credit balance on a t3.micro after 24 hours of idle? A t3.micro earns 12 credits per hour (6 per vCPU across 2 vCPUs), so 24 hours yields 288 credits, which is also the maximum balance for a t3.micro. A truly idle instance spends almost nothing, so the balance ends close to the 288-credit cap.

Q: Why might Graviton be faster than Intel in some benchmarks? Each Graviton vCPU is a full physical core, whereas a vCPU on most Intel instances is one hyperthread of a shared core. Graviton3 also has wider SIMD units and higher memory bandwidth than Graviton2, along with improved cryptographic acceleration. It is not universally faster: workloads with x86-specific assembly paths or SIMD tuning may perform better on Intel.