Terraform Data Sources
Data sources allow Terraform to read information from existing infrastructure that Terraform doesn’t manage. While resource blocks create and manage objects, data source blocks fetch read-only information that your configuration can reference.
When to Use Data Sources vs. Resources
| Use Case | Block to Use |
|---|---|
| Create a new EC2 instance | resource "aws_instance" |
| Look up the latest Amazon Linux AMI | data "aws_ami" |
| Create a new VPC | resource "aws_vpc" |
| Reference an existing VPC you didn’t create | data "aws_vpc" |
| Create a secret | resource "aws_secretsmanager_secret" |
| Read an existing secret’s value | data "aws_secretsmanager_secret_version" |
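As a rough sketch of the two VPC rows in the table, the first block below creates and manages a VPC, while the second only reads one that already exists. The CIDR block and the VPC ID `vpc-0123456789abcdef0` are placeholder assumptions, not values from a real account.

```hcl
# Option A: Terraform creates and manages the VPC.
resource "aws_vpc" "managed" {
  cidr_block = "10.20.0.0/16" # placeholder CIDR
}

# Option B: the VPC already exists elsewhere; Terraform only reads it.
data "aws_vpc" "existing" {
  id = "vpc-0123456789abcdef0" # placeholder ID
}

# Both are referenced the same way downstream, apart from the "data." prefix.
output "managed_vpc_id" {
  value = aws_vpc.managed.id
}

output "existing_vpc_cidr" {
  value = data.aws_vpc.existing.cidr_block
}
```

Running `terraform destroy` would remove the VPC from Option A but leave the one in Option B untouched, since a data source never owns the object it reads.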
Data Source Syntax

    data "<PROVIDER_DATA_SOURCE_TYPE>" "<LOCAL_NAME>" {
      # Filter arguments to identify which resource to read
      filter_argument = "value"
    }

    # Reference: data.<type>.<name>.<attribute>

Practical Data Source Examples
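As a minimal illustration of the `data.<type>.<name>.<attribute>` reference form, the sketch below reads the current AWS partition and uses one of its attributes inside a string template. It assumes the AWS provider is already configured; the local value `s3_arn_prefix`, the output name, and the bucket name `example-bucket` are hypothetical.

```hcl
# A data source with no arguments: the partition Terraform is running in
# (for example "aws", "aws-cn", or "aws-us-gov").
data "aws_partition" "current" {}

locals {
  # data.<type>.<name>.<attribute>, used inside a string template
  s3_arn_prefix = "arn:${data.aws_partition.current.partition}:s3:::"
}

output "example_bucket_arn" {
  # "example-bucket" is a hypothetical bucket name
  value = "${local.s3_arn_prefix}example-bucket"
}
```

The same reference form works anywhere an expression is accepted: resource arguments, locals, outputs, and module inputs.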
Find the Latest AWS AMI

    # Latest Amazon Linux 2023
    data "aws_ami" "amazon_linux_2023" {
      most_recent = true
      owners      = ["amazon"]

      filter {
        name   = "name"
        values = ["al2023-ami-*-x86_64"]
      }

      filter {
        name   = "architecture"
        values = ["x86_64"]
      }

      filter {
        name   = "state"
        values = ["available"]
      }
    }

    # Latest Ubuntu 24.04 LTS
    data "aws_ami" "ubuntu_2404" {
      most_recent = true
      owners      = ["099720109477"] # Canonical's AWS account

      filter {
        name   = "name"
        values = ["ubuntu/images/hvm-ssd-gp3/ubuntu-noble-24.04-amd64-server-*"]
      }
    }

    resource "aws_instance" "app" {
      # Resolved on every plan, so always current; note that a newer AMI
      # forces replacement of the instance
      ami           = data.aws_ami.amazon_linux_2023.id
      instance_type = "t3.micro"
    }

Look Up an Existing VPC

    # Find VPC by tag
    data "aws_vpc" "shared" {
      filter {
        name   = "tag:Name"
        values = ["shared-networking-vpc"]
      }
    }

    # Find VPC by CIDR block
    data "aws_vpc" "production" {
      cidr_block = "10.0.0.0/16"
    }

    # Find all private subnets in the shared VPC
    data "aws_subnets" "private" {
      filter {
        name   = "vpc-id"
        values = [data.aws_vpc.shared.id]
      }

      filter {
        name   = "tag:Tier"
        values = ["private"]
      }
    }

    resource "aws_instance" "app" {
      # ami and instance_type omitted for brevity
      subnet_id = data.aws_subnets.private.ids[0]
    }

Read Secrets from AWS Secrets Manager

    data "aws_secretsmanager_secret" "db_password" {
      name = "production/rds/master-password"
    }

    data "aws_secretsmanager_secret_version" "db_password" {
      secret_id = data.aws_secretsmanager_secret.db_password.id
    }

    resource "aws_db_instance" "main" {
      # Other required arguments omitted for brevity
      password = data.aws_secretsmanager_secret_version.db_password.secret_string
    }

AWS Account and Region Information

    # Current caller identity (who Terraform is running as)
    data "aws_caller_identity" "current" {}

    # Current region
    data "aws_region" "current" {}

    # Available AZs in the current region
    data "aws_availability_zones" "available" {
      state = "available"
    }

    # Usage
    resource "aws_s3_bucket" "logs" {
      # On AWS provider v6 and later, prefer .region; .name is deprecated there
      bucket = "logs-${data.aws_caller_identity.current.account_id}-${data.aws_region.current.name}"
    }

    resource "aws_subnet" "public" {
      # Creates one subnet in each of the first 3 AZs
      for_each          = toset(slice(data.aws_availability_zones.available.names, 0, 3))
      availability_zone = each.value
      # vpc_id and cidr_block omitted for brevity
    }

Azure Data Sources

    # Current subscription
    data "azurerm_subscription" "current" {}

    # Existing resource group
    data "azurerm_resource_group" "shared" {
      name = "shared-infrastructure-rg"
    }

    # Existing key vault secret
    data "azurerm_key_vault" "shared" {
      name                = "company-shared-kv"
      resource_group_name = data.azurerm_resource_group.shared.name
    }

    data "azurerm_key_vault_secret" "db_password" {
      name         = "database-master-password"
      key_vault_id = data.azurerm_key_vault.shared.id
    }

    # Client config (current service principal)
    data "azurerm_client_config" "current" {}

GCP Data Sources

    # Current project and billing info
    data "google_project" "current" {}

    data "google_compute_network" "shared_vpc" {
      name    = "shared-vpc"
      project = "networking-project-id"
    }

    data "google_compute_subnetwork" "private" {
      name    = "private-subnet"
      region  = "us-central1"
      project = "networking-project-id"
    }

    # Secret Manager
    data "google_secret_manager_secret_version" "db_password" {
      secret  = "database-password"
      project = var.project_id
    }

Data Sources for Configuration Templates

    # IAM policy document: cleaner than inline JSON
    data "aws_iam_policy_document" "lambda_assume_role" {
      statement {
        effect  = "Allow"
        actions = ["sts:AssumeRole"]

        principals {
          type        = "Service"
          identifiers = ["lambda.amazonaws.com"]
        }
      }
    }

    data "aws_iam_policy_document" "s3_access" {
      statement {
        effect    = "Allow"
        actions   = ["s3:GetObject", "s3:PutObject"]
        resources = ["${aws_s3_bucket.data.arn}/*"]
      }

      statement {
        effect    = "Allow"
        actions   = ["s3:ListBucket"]
        resources = [aws_s3_bucket.data.arn]
      }
    }

    resource "aws_iam_role" "lambda" {
      assume_role_policy = data.aws_iam_policy_document.lambda_assume_role.json
    }

    resource "aws_iam_role_policy" "lambda_s3" {
      role   = aws_iam_role.lambda.id
      policy = data.aws_iam_policy_document.s3_access.json
    }

External Data Source
When you need data that no provider exposes natively, the external data source runs a program and reads a JSON object from its standard output:

    data "external" "git_commit" {
      program = ["bash", "-c", "echo '{\"sha\": \"'$(git rev-parse --short HEAD)'\"}'"]
    }

    resource "aws_s3_object" "version" {
      bucket  = aws_s3_bucket.deploy.id
      key     = "version.json"
      content = jsonencode({ commit = data.external.git_commit.result.sha })
    }

Use the external data source sparingly: the program runs on every plan, so its result depends on the machine running Terraform, which breaks plan determinism and can cause issues in CI/CD (for example, when bash or git is missing on the runner). Prefer native data sources when available.
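One way to avoid the external data source in this particular case is to pass the commit SHA in as an input variable, so the value is fixed by whoever invokes Terraform rather than computed during the plan. This is only a sketch of an alternative, not something the configuration above does; the variable name `git_sha` is hypothetical, and `aws_s3_bucket.deploy` is assumed to be defined elsewhere, as in the example above.

```hcl
variable "git_sha" {
  description = "Short commit SHA, supplied by the caller (hypothetical variable)"
  type        = string
}

resource "aws_s3_object" "version" {
  bucket  = aws_s3_bucket.deploy.id # assumed to exist elsewhere
  key     = "version.json"
  content = jsonencode({ commit = var.git_sha })
}
```

In CI, the value could be supplied with `terraform apply -var "git_sha=$(git rev-parse --short HEAD)"` or through the `TF_VAR_git_sha` environment variable, which keeps the shell invocation outside the plan itself.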