🧠 dbt Model Dependencies: `ref()`

In data analytics engineering, managing relationships between data models is the backbone of efficient pipelines.

Imagine you’re transforming raw data into insights — one SQL model depends on another. You need a system that automatically knows the order in which to run these models.

That’s where dbt (Data Build Tool) shines with its ref() function, which links models together and builds a dependency graph automatically.

In this guide, you’ll learn:

What model dependencies are in dbt
How ref() works
Three practical coding examples
A visual dependency flow diagram
Memory tips for interviews
Why it’s crucial for scalable data projects

🔍 What Are Model Dependencies in dbt?

In dbt, each model is a SQL file containing a select statement that represents one transformation. Models often build on one another.

For example:

You first clean raw data in one model (stg_orders.sql).
Then you use that cleaned data in another model (customer_orders.sql).

To link them properly, dbt provides the ref() function, which:

✅ Creates a dependency between models.
✅ Ensures that models run in the correct order.
✅ Resolves each reference to the correct database and schema for the current environment.

⚙️ How `ref()` Works

The syntax is simple yet powerful:

select * from {{ ref('model_name') }}

The {{ }} syntax indicates a Jinja expression.
The ref() function tells dbt: “This model depends on another one named 'model_name'.”

When dbt runs, it uses these references to build a Directed Acyclic Graph (DAG): a graph in which each node is a model, each edge is a dependency, and no path loops back on itself.

✅ Example:

select * from {{ ref('stg_orders') }}

This means: “Use the output of stg_orders.sql as input for this model.”
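To make the idea concrete, here is a minimal Python sketch of what happens to a ref() call at compile time. It is not dbt's implementation: the schema names in TARGET_SCHEMAS and the compile_sql helper are assumptions made up for illustration, and real dbt also handles databases, quoting, and aliases.

```python
import re

# Hypothetical environments: these schema names are illustrative assumptions.
TARGET_SCHEMAS = {"dev": "dbt_dev", "prod": "analytics"}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"""\{\{\s*ref\(\s*['"]([^'"]+)['"]\s*\)\s*\}\}""")


def compile_sql(sql: str, target: str) -> str:
    """Replace each ref() call with a schema-qualified table name."""
    schema = TARGET_SCHEMAS[target]
    return REF_PATTERN.sub(lambda m: f"{schema}.{m.group(1)}", sql)


model_sql = "select * from {{ ref('stg_orders') }}"
print(compile_sql(model_sql, "dev"))   # select * from dbt_dev.stg_orders
print(compile_sql(model_sql, "prod"))  # select * from analytics.stg_orders
```

The same model file produces different SQL per environment, which is why ref() is preferred over a hardcoded table name.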

🧭 Model Dependency Flow

Here’s how model dependencies work in dbt:

          ┌──────────────────────┐
          │  raw_orders (source) │
          └──────────────────────┘
                     │
                     ▼
          ┌──────────────────────┐
          │  stg_orders (model)  │  ←  Cleans data
          └──────────────────────┘
                     │
                     ▼
          ┌──────────────────────────────┐
          │  customer_orders (model)     │  ←  Uses ref('stg_orders')
          └──────────────────────────────┘
                     │
                     ▼
          ┌──────────────────────────────┐
          │  sales_summary (model)       │  ←  Uses ref('customer_orders')
          └──────────────────────────────┘

This chain shows that dbt runs models in dependency order: each model in the diagram depends on the one above it, so dbt builds from the top down.

🧩 Example Set 1: Basic Model Dependencies

Let’s start with a simple scenario.

🧮 Example 1: Building a Staging Model

File: models/staging/stg_customers.sql

-- Cleans raw customer data
select
  id as customer_id,
  trim(name) as customer_name,
  lower(email) as email
from {{ source('raw', 'customers') }}

This model pulls raw data from the source and cleans it.

🧮 Example 2: Creating a Core Model that Depends on the Staging Model

File: models/core/customer_orders.sql

-- Aggregates data by customer
select
  c.customer_id,
  c.customer_name,
  count(o.order_id) as total_orders,
  sum(o.amount) as total_spent
from {{ ref('stg_customers') }} as c
join {{ ref('stg_orders') }} as o
  on c.customer_id = o.customer_id
group by 1, 2

Here, dbt knows this model depends on both stg_customers and stg_orders.

✅ dbt automatically builds them first before executing this model.

🧮 Example 3: Final Summary Model

File: models/marts/sales_summary.sql

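-- Final summary for business reports
-- Assumes customer_orders also exposes a region column
-- (not selected in the customer_orders example above)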
select
  region,
  sum(total_spent) as revenue,
  count(distinct customer_id) as customers
from {{ ref('customer_orders') }}
group by region

This final layer uses data from the core model, forming a multi-level dependency chain.

🔗 How dbt Builds the Dependency Graph (DAG)

When you run:

dbt run

dbt automatically:

Scans each model for ref() calls.
Builds a dependency graph.
Executes models in topological order.

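Example build order:

(stg_customers, stg_orders) → customer_orders → sales_summary

The two staging models do not depend on each other, so dbt can build them in either order, or in parallel when multiple threads are configured.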

You can visualize this with:

dbt docs generate
dbt docs serve

Then open the docs site in your browser and view the lineage graph to see how the models connect.
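The three steps above (scan for ref() calls, build a graph, run in topological order) can be sketched in a few lines of Python. This is a simplified illustration, not dbt's parser: the model bodies are trimmed down, the 'orders' source is an assumption, and the find_refs helper is hypothetical.

```python
import re
from graphlib import TopologicalSorter

# Matches ref('model_name') with single or double quotes.
REF_PATTERN = re.compile(r"""ref\(\s*['"]([^'"]+)['"]\s*\)""")

# Trimmed-down model bodies; only the ref() calls matter here.
models = {
    "stg_customers": "select * from {{ source('raw', 'customers') }}",
    "stg_orders": "select * from {{ source('raw', 'orders') }}",
    "customer_orders": (
        "select * from {{ ref('stg_customers') }} c "
        "join {{ ref('stg_orders') }} o on c.customer_id = o.customer_id"
    ),
    "sales_summary": "select * from {{ ref('customer_orders') }}",
}


def find_refs(sql: str) -> set:
    """Step 1: collect the model names passed to ref()."""
    return set(REF_PATTERN.findall(sql))


# Step 2: map each model to the models it depends on.
graph = {name: find_refs(sql) for name, sql in models.items()}

# Step 3: order the models so every dependency comes first.
order = list(TopologicalSorter(graph).static_order())
print(order)
```

The two staging models can appear in either order, but customer_orders always comes after both of them and sales_summary always comes last.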

🧩 Example Set 2: Intermediate Dependency Scenarios

Now, let’s explore more advanced use cases.

🧮 Example 1: Conditional References

{% if target.name == 'prod' %}
  select * from {{ ref('sales_summary') }}
{% else %}
  select * from {{ ref('sales_summary_dev') }}
{% endif %}

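💡 Use Case: This approach lets you reference different models in different environments (like dev or prod). Both sales_summary and sales_summary_dev must exist as models in the project. In most cases you do not need this, because ref() already resolves the same model name to a different schema for each target.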

🧮 Example 2: Multiple Ref() in Joins

select
  p.product_id,
  p.product_name,
  sum(o.quantity) as total_sold
from {{ ref('stg_products') }} p
join {{ ref('stg_orders') }} o
  on p.product_id = o.product_id
group by 1, 2

Here, both models stg_products and stg_orders are dependencies. dbt ensures both are ready before running this.

🧮 Example 3: Ref with CTEs (Common Table Expressions)

with
  customer_data as (
    select * from {{ ref('stg_customers') }}
  ),
  order_data as (
    select * from {{ ref('stg_orders') }}
  )
select
  c.customer_id,
  c.customer_name,
  sum(o.amount) as total_spent
from customer_data c
join order_data o
  on c.customer_id = o.customer_id
group by 1, 2

Using ref() inside CTEs keeps your SQL modular and readable.

🧩 Example Set 3: Advanced Project-Level Dependencies

🧮 Example 1: Cross-Project Reference

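ref() takes a model name, not a folder path. Model names must be unique within a project, so ref('sales_summary') finds the model no matter which folder its file lives in. To reference a model defined in another package or project, use the two-argument form, where the first argument is the package or project name:

select * from {{ ref('finance_project', 'sales_summary') }}

Here finance_project is a hypothetical name used for illustration. It tells dbt to look for sales_summary in that package or project rather than in the current one.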

🧮 Example 2: Macros Using Ref()

You can even call ref() inside custom macros to make dynamic SQL reusable.

{% macro union_models(model_list) %}
  {% for model in model_list %}
    select * from {{ ref(model) }}
    {% if not loop.last %} union all {% endif %}
  {% endfor %}
{% endmacro %}

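Usage (the model names below are hypothetical; every model in the list must expose the same columns in the same order for union all to work):

{{ union_models(['stg_orders_us', 'stg_orders_eu']) }}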
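To see what a macro like this produces, here is a rough Python equivalent that builds the same SQL text. The render_union function, the analytics schema, and the two model names are assumptions for illustration only; dbt does this rendering for you through Jinja.

```python
def render_union(model_list, schema="analytics"):
    """Build the SQL that a union-all macro over ref()'d models would emit."""
    # One select per model, mirroring the body of the for loop.
    selects = [f"select * from {schema}.{name}" for name in model_list]
    # "union all" goes between selects only, like the loop.last check.
    return "\nunion all\n".join(selects)


sql = render_union(["stg_orders_us", "stg_orders_eu"])
print(sql)
```

Because each name goes through ref() in the real macro, dbt also records every model in the list as a dependency of the model that calls the macro.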

🧮 Example 3: Ref() with Incremental Models

You can combine ref() with incremental strategies for large datasets.

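{{ config(materialized='incremental') }}

select *
from {{ ref('stg_transactions') }}
{% if is_incremental() %}
  -- Applied only on incremental runs, when the target table already exists
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}

On the first run, dbt builds the full table. On later runs, the is_incremental() block filters to rows newer than what is already loaded, so builds are faster.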
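The incremental pattern can be tried out locally with Python's built-in sqlite3 module. This is a sketch of the idea, not of dbt's materialization code: the fct_transactions table name, the sample rows, and both helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table stg_transactions (id integer, updated_at text)")
conn.executemany(
    "insert into stg_transactions values (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-02")],
)


def table_exists(conn, name):
    """Return True if a table with this name already exists."""
    row = conn.execute(
        "select 1 from sqlite_master where type = 'table' and name = ?", (name,)
    ).fetchone()
    return row is not None


def run_incremental(conn, target="fct_transactions"):
    """First run: build the full table. Later runs: append only newer rows."""
    if not table_exists(conn, target):
        conn.execute(f"create table {target} as select * from stg_transactions")
    else:
        conn.execute(
            f"insert into {target} "
            f"select * from stg_transactions "
            f"where updated_at > (select max(updated_at) from {target})"
        )
    return conn.execute(f"select count(*) from {target}").fetchone()[0]


first = run_incremental(conn)   # full build: 2 rows
conn.execute("insert into stg_transactions values (3, '2024-01-03')")
second = run_incremental(conn)  # incremental: only row 3 is appended
print(first, second)            # 2 3
```

The table_exists check plays the role that is_incremental() plays in dbt: the filter on max(updated_at) only makes sense once the target table is there.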

🧠 How to Remember dbt Model Dependencies (for Interview/Exam)

To remember how dbt links models, use the mnemonic “R-D-G”:

Letter	Concept	Memory Tip
R	Ref()	“Use `ref()` to reference another model.”
D	Dependencies	“Ref builds the dependency DAG.”
G	Graph (DAG)	“dbt docs renders the graph from your `ref()` calls.”

💡 Interview Flashcards

Question	Answer
What does `ref()` do in dbt?	It links models and defines dependencies.
Why not hardcode table names?	`ref()` resolves the right database and schema for each environment and records the dependency.
How does dbt know model order?	It scans `ref()` calls to build a DAG.
What is a DAG in dbt?	Directed Acyclic Graph showing model dependencies.
Can you use `ref()` in macros?	Yes, to dynamically reference models.
What happens if a referenced model fails?	dbt skips the models downstream of it; unrelated models still run.
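The failure behavior is easier to remember with a small simulation. The sketch below is an assumption-laden model of a run, not dbt's scheduler: simulate_run is a hypothetical helper, and the statuses are simplified to success, error, and skipped.

```python
from graphlib import TopologicalSorter

# Each model maps to the models it ref()s (its upstream dependencies).
graph = {
    "stg_customers": set(),
    "stg_orders": set(),
    "customer_orders": {"stg_customers", "stg_orders"},
    "sales_summary": {"customer_orders"},
}


def simulate_run(graph, failing):
    """Return a status per model: 'success', 'error', or 'skipped'."""
    status = {}
    for model in TopologicalSorter(graph).static_order():
        # A model is skipped if anything upstream did not succeed.
        if any(status[up] != "success" for up in graph[model]):
            status[model] = "skipped"
        elif model in failing:
            status[model] = "error"
        else:
            status[model] = "success"
    return status


result = simulate_run(graph, failing={"stg_orders"})
print(result)
```

With stg_orders failing, stg_customers still succeeds, while customer_orders and sales_summary are skipped because they sit downstream of the failure.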

⚡ Why It’s Important to Learn Model Dependencies

Understanding model dependencies is crucial for professional dbt use.

Benefit	Description
Automation	dbt runs models in the right order automatically
Scalability	Simplifies managing 100s of interdependent models
Clarity	DAG visualizes relationships for better understanding
Maintainability	Easier to debug and modify pipelines
Reusability	`ref()` enables modular design across teams
Environment Agility	The same code resolves to the right schema in dev, test, and prod
Collaboration	Developers can work independently on dependent models

🧱 Best Practices

Practice	Description
Always use `ref()` instead of hardcoding	Ensures portability and dependency tracking
Name models clearly	Helps visualize DAG structure
Avoid circular dependencies	Keep graph acyclic (no loops)
Group models by layer	Example: staging → core → marts
Use macros for repetitive ref logic	Promotes clean, DRY code
Visualize DAG regularly	Detects broken or missing dependencies early

🧩 Common Mistakes & Fixes

Mistake	Problem	Fix
Hardcoded table names	Schema breaks after deployment	Replace with `ref()`
Circular references	dbt stops with a cycle error before building any model	Reorganize dependencies
Model naming inconsistency	DAG confusion	Use naming convention (`stg_`, `core_`, `mart_`)
No DAG visualization	Hidden dependency errors	Run `dbt docs generate` and `dbt docs serve` regularly
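Circular references are worth a quick illustration, since they are the one mistake that makes the graph impossible to order. The sketch below uses Python's graphlib to detect a loop. The model names and the has_cycle helper are hypothetical, and dbt's own error message and detection code will differ.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical models that ref() each other, forming a loop.
cyclic_graph = {
    "model_a": {"model_b"},
    "model_b": {"model_a"},
}

# The same two models with the loop removed.
acyclic_graph = {
    "model_a": set(),
    "model_b": {"model_a"},
}


def has_cycle(graph):
    """Return True if the dependency graph contains a loop."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return True
    return False


print(has_cycle(cyclic_graph))   # True
print(has_cycle(acyclic_graph))  # False
```

The usual fix is the one in the table: move the shared logic into a third, upstream model that both models can ref() without referencing each other.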

🔍 Real-World Use Cases

Industry	Use Case
E-Commerce	Chain of dependencies from raw orders → cleaned → sales summary
Finance	Data lineage tracking for compliance
Marketing	Campaign-level performance aggregation
Healthcare	Layered models for patient, hospital, and outcome analysis
SaaS Analytics	Building metrics dashboards across multiple teams

⚙️ Visualizing Dependencies in dbt

You can view model relationships interactively using the dbt Docs tool.

Commands:

dbt docs generate
dbt docs serve

The output shows a graph of models (DAG) — each node represents a model, and edges represent ref() relationships.

This makes debugging or optimizing pipelines much easier.

🧠 Quick Recap

✅ Model Dependencies define how models rely on one another.
✅ ref() is the key dbt function for linking models.
✅ dbt builds a DAG to execute models in correct order.
✅ You can use ref() in models, macros, and even conditionals.
✅ It ensures consistency, automation, and maintainability.

🚀 Final Thoughts

Mastering model dependencies in dbt through the ref() function is like mastering the foundation of a data city — every building (model) stands on another, and ref() is the blueprint that keeps everything connected.

Whether you’re building a simple two-model project or an enterprise-scale pipeline with 500 models, ref() keeps your transformations organized, traceable, and scalable.

“Without ref(), dbt would just be SQL scripts. With ref(), it becomes an intelligent, interconnected data system.”
