📘 Terraform Data Sources: Reading Dynamic Information from Providers

Imagine you are building infrastructure with Terraform and want to launch an AWS EC2 instance. You need the latest Amazon Linux AMI ID. Instead of hardcoding that ID (which changes with every new image release and differs from region to region), you want Terraform to look it up dynamically.

That’s where Terraform Data Sources come in.

👉 A data source in Terraform allows you to query existing information from a provider (AWS, Azure, GCP, etc.) and use it in your configurations.

Think of data sources as “read-only resources”—they don’t create anything but fetch information to make your infrastructure smarter and more reusable.


🔑 What Are Terraform Data Sources?

  • Definition: Data sources allow Terraform to read information from a provider without creating or modifying resources.
  • Purpose: Fetch dynamic values like the latest AMI ID, an existing VPC, subnet IDs, or Azure VM images.
  • Usage: Avoids hardcoding static values and makes infrastructure more flexible.

📌 General Syntax

data "<PROVIDER>_<DATA_TYPE>" "<NAME>" {
  filter1 = "value1"
  filter2 = "value2"
}

  • data – keyword that declares a data source.
  • <PROVIDER>_<DATA_TYPE> – the provider and the kind of object to look up (for example aws_ami, azurerm_resource_group, google_compute_network).
  • <NAME> – local name used to refer to the data source elsewhere in the configuration, as data.<PROVIDER>_<DATA_TYPE>.<NAME>.<ATTRIBUTE>.
  • arguments/filters – narrow the lookup to the object you are searching for.
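
As a concrete instance of this syntax, here is a minimal sketch, assuming the AWS provider is already configured with credentials and a region. The local name available and the output name az_names are illustrative choices, not fixed names.

```hcl
# Look up the availability zones that are currently usable in the configured region.
data "aws_availability_zones" "available" {
  state = "available"
}

# Reference pattern: data.<PROVIDER>_<DATA_TYPE>.<NAME>.<ATTRIBUTE>
output "az_names" {
  value = data.aws_availability_zones.available.names
}
```

Running terraform plan on this configuration creates nothing; it only reads the zone list and exposes it as an output.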

🖥️ Example 1: AWS Data Source – Fetching Latest AMI

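# Look up the most recent Amazon Linux 2 AMI published by Amazon.
data "aws_ami" "latest_amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

# Launch an instance from whichever AMI the lookup returned.
resource "aws_instance" "example" {
  ami           = data.aws_ami.latest_amazon_linux.id
  instance_type = "t2.micro"
}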

Explanation:

  • data "aws_ami" queries AWS for the latest Amazon Linux 2 AMI.
  • ami attribute in the EC2 instance uses that ID dynamically.
  • No need to update AMI IDs manually.
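
To see which image the lookup actually resolved to, it can help to expose a few attributes as outputs. The sketch below is a variant of the example, not a replacement for it: the two extra filters and the names amazon_linux_2, resolved_ami_id, and resolved_ami_name are assumptions added for illustration.

```hcl
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  # Extra filters (assumed for this sketch) narrow the match further.
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "root-device-type"
    values = ["ebs"]
  }
}

# Surface the result so a plan or apply shows which AMI was selected.
output "resolved_ami_id" {
  value = data.aws_ami.amazon_linux_2.id
}

output "resolved_ami_name" {
  value = data.aws_ami.amazon_linux_2.name
}
```

Because a change to the ami argument replaces an aws_instance, checking these outputs in the plan is a cheap way to notice when a newly published image is about to trigger a rebuild.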

📦 Example 2: Azure Data Source – Fetching Resource Group Info

data "azurerm_resource_group" "example" {
  name = "my-existing-rg"
}

resource "azurerm_storage_account" "example" {
  name                     = "tfstorageacctdemo"
  resource_group_name      = data.azurerm_resource_group.example.name
  location                 = data.azurerm_resource_group.example.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

Explanation:

  • Reads an existing resource group instead of creating a new one.
  • Useful when infrastructure is a mix of manual and automated resources.
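
The same lookup can be driven by an input variable so the configuration is not tied to one resource group. This is a minimal sketch, assuming an azurerm provider block is configured elsewhere; the variable name resource_group_name and the local name existing are hypothetical.

```hcl
# Hypothetical input variable; the name is an assumption for this sketch.
variable "resource_group_name" {
  type        = string
  description = "Name of a resource group that already exists in Azure."
}

# Read the resource group; nothing is created or changed.
data "azurerm_resource_group" "existing" {
  name = var.resource_group_name
}

output "resource_group_id" {
  value = data.azurerm_resource_group.existing.id
}

output "resource_group_location" {
  value = data.azurerm_resource_group.existing.location
}
```

If the named resource group does not exist, the plan fails at the data source, which surfaces the problem before any resource is created.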

🌐 Example 3: GCP Data Source – Fetching Existing Network

data "google_compute_network" "existing_vpc" {
  name = "default"
}

resource "google_compute_instance" "example" {
  name         = "vm-with-data-source"
  machine_type = "e2-micro"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  network_interface {
    network = data.google_compute_network.existing_vpc.name
  }
}

Explanation:

  • Reads an existing default VPC network in GCP.
  • VM attaches to that network dynamically.
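
Beyond its name, the network data source exposes other read-only attributes. The sketch below, which assumes the google provider is configured with a project that has a default network, outputs two of them; the local name default_vpc and the output names are illustrative.

```hcl
data "google_compute_network" "default_vpc" {
  name = "default"
}

# Fully qualified URI of the network, usable wherever a network reference is expected.
output "network_self_link" {
  value = data.google_compute_network.default_vpc.self_link
}

# URIs of the subnetworks that belong to this network.
output "subnetwork_self_links" {
  value = data.google_compute_network.default_vpc.subnetworks_self_links
}
```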

🛠️ 3 More Unique Use-Cases of Data Sources

1. AWS: Using Subnet IDs Dynamically

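data "aws_subnets" "example" {
  filter {
    name   = "tag:Environment"
    values = ["production"]
  }
}

resource "aws_instance" "example" {
  ami           = "ami-0c55b159cbfafe1f0" # placeholder; use an AMI ID valid in your region
  instance_type = "t2.micro"
  # Uses the first matching subnet; this fails if no subnet carries the tag.
  subnet_id = data.aws_subnets.example.ids[0]
}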

🔹 Fetches the IDs of subnets tagged Environment = production and launches the instance in the first one returned.
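
Indexing ids[0] fails with an unhelpful error when no subnet matches. One way to guard against that, sketched here on the assumption of Terraform 1.2 or later (where data blocks accept custom conditions), is a postcondition. The local name production and the error text are illustrative.

```hcl
data "aws_subnets" "production" {
  filter {
    name   = "tag:Environment"
    values = ["production"]
  }

  lifecycle {
    # Stop with a clear message if the lookup returns an empty list.
    postcondition {
      condition     = length(self.ids) > 0
      error_message = "No subnets tagged Environment=production were found."
    }
  }
}

output "first_production_subnet_id" {
  value = data.aws_subnets.production.ids[0]
}
```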


2. Azure: Using Existing Key Vault

data "azurerm_key_vault" "example" {
  name                = "my-key-vault"
  resource_group_name = "rg-demo"
}

output "vault_uri" {
  value = data.azurerm_key_vault.example.vault_uri
}

🔹 Reads an existing Key Vault URI for secrets management.
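
For the secrets-management case, the vault lookup is usually chained into a second data source that reads an individual secret. This is a sketch under assumptions: the secret name db-password is hypothetical, and the identity running Terraform needs permission to read secrets from the vault.

```hcl
data "azurerm_key_vault" "vault" {
  name                = "my-key-vault"
  resource_group_name = "rg-demo"
}

# "db-password" is a hypothetical secret name used only for illustration.
data "azurerm_key_vault_secret" "db_password" {
  name         = "db-password"
  key_vault_id = data.azurerm_key_vault.vault.id
}

output "db_password" {
  value     = data.azurerm_key_vault_secret.db_password.value
  sensitive = true # hides the value in CLI output; it is still written to state
}
```

Marking the output as sensitive only affects what is displayed, so the state file itself should be stored somewhere access-controlled.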


3. GCP: Using Existing Project Info

data "google_project" "current" {}

output "project_number" {
  value = data.google_project.current.number
}

🔹 Retrieves project number for integration with IAM policies.
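
One place the project number shows up in IAM work is the email address of the default Compute Engine service account. The sketch below builds that address and the member string IAM bindings expect; the local and output names are assumptions, and it presumes the default service account still exists in the project.

```hcl
data "google_project" "this" {}

locals {
  # The default Compute Engine service account is named after the project number.
  default_compute_sa = "${data.google_project.this.number}-compute@developer.gserviceaccount.com"
}

# Member string in the form used by IAM binding and member resources.
output "default_compute_sa_member" {
  value = "serviceAccount:${local.default_compute_sa}"
}
```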


🧠 How to Remember for Interview & Exam

  1. Mnemonic: DARE (Data sources Are Read-Only Entities) – Always remember they fetch, not create.

  2. Think of “Google Search” – Instead of storing everything, you search dynamically.

  3. Hands-On Practice – Practice daily by writing a simple Terraform config that uses a data block to look up an AWS AMI ID.

  4. Flashcards – Create cards: data.aws_ami, data.azurerm_resource_group, data.google_compute_network.

  5. Interview Trick – When asked how resources and data sources differ, answer:

    • Resources = Create Infra
    • Data Sources = Read Existing Infra
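
The contrast can be sketched side by side. In this illustrative configuration, which assumes a configured AWS provider and an account that still has its default VPC, the resource block creates a VPC while the data block only reads one. The CIDR range and all local names are assumptions.

```hcl
# Resource: Terraform creates this VPC and manages its lifecycle.
resource "aws_vpc" "managed" {
  cidr_block = "10.0.0.0/16"
}

# Data source: Terraform only reads the account's existing default VPC.
data "aws_vpc" "default" {
  default = true
}

output "managed_vpc_id" {
  value = aws_vpc.managed.id
}

output "default_vpc_cidr" {
  value = data.aws_vpc.default.cidr_block
}
```

Note the reference syntax: a resource is addressed as aws_vpc.managed, while a data source carries the data. prefix. Running terraform destroy removes the managed VPC and leaves the default VPC untouched.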

🚀 Why It’s Important to Learn

  • Dynamic & Flexible: Avoids hardcoding changing values (e.g., AMI IDs).
  • Hybrid Environments: Integrates Terraform with pre-existing resources.
  • Reusable Modules: Makes Terraform modules more portable across environments.
  • Exam/Interview Relevance: Commonly asked scenario: “How do you fetch the latest AMI dynamically?”
  • Industry Demand: Teams prefer automation + dynamic adaptability.

📚 Conclusion

Terraform Data Sources are a powerful mechanism for reading information from providers like AWS, Azure, and GCP.

They don’t create new resources, but they make your configurations dynamic, reusable, and future-proof.

Whether you are fetching the latest AMI ID, reading an existing network, or integrating with an existing resource group, data sources make Terraform smarter.

👉 If Resource Blocks are the builders, then Data Sources are the explorers—they help Terraform understand what’s already there before creating new infrastructure.

By mastering Data Sources, you’ll not only ace interviews and certifications but also build production-grade infrastructure that adapts dynamically to changes.