๐Ÿง  dbt Model Dependencies: ref()

In data analytics engineering, managing relationships between data models is the backbone of efficient pipelines.

Imagine youโ€™re transforming raw data into insights โ€” one SQL model depends on another. You need a system that automatically knows the order in which to run these models.

Thatโ€™s where dbt (Data Build Tool) shines with its ref() function, which links models together and builds a dependency graph automatically.

In this guide, youโ€™ll learn:

  • What model dependencies are in dbt
  • How ref() works
  • Three practical coding examples
  • A visual Mer
  • Memory tips for interviews
  • Why itโ€™s crucial for scalable data projects

๐Ÿ” What Are Model Dependencies in dbt?

In dbt, each model is a SQL file that represents a transformation. However, these models often depend on each other.

For example:

  • You first clean raw data in one model (stg_orders.sql).
  • Then you use that cleaned data in another model (customer_orders.sql).

To link them properly, dbt provides the ref() function, which:

โœ… Creates a dependency between models. โœ… Ensures that models run in the correct order. โœ… Automatically updates schema or table references.


โš™๏ธ How ref() Works

The syntax is simple yet powerful:

select * from {{ ref('model_name') }}
  • The {{ }} syntax indicates a Jinja expression.
  • The ref() function tells dbt: โ€œThis model depends on another one named 'model_name'.โ€

When dbt runs, it builds a Directed Acyclic Graph (DAG) โ€” a visual map showing model dependencies.

โœ… Example:

select * from {{ ref('stg_orders') }}

This means: โ€œUse the output of stg_orders.sql as input for this model.โ€


๐Ÿงญ Model Dependency Flow

Hereโ€™s how model dependencies work in dbt:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ raw_orders (source) โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ”‚
โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ stg_orders (model) โ”‚ โ† Cleans data
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ”‚
โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ customer_orders (model) โ”‚ โ† Uses ref('stg_orders')
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ”‚
โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ sales_summary (model) โ”‚ โ† Uses ref('customer_orders')
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

This chain shows that dbt runs models in dependency order โ€” bottom models depend on top models.


๐Ÿงฉ Example Set 1: Basic Model Dependencies

Letโ€™s start with a simple scenario.


๐Ÿงฎ Example 1: Building a Staging Model

File: models/staging/stg_customers.sql

-- Cleans raw customer data
select
id as customer_id,
trim(name) as customer_name,
lower(email) as email
from {{ source('raw', 'customers') }}

This model pulls raw data from the source and cleans it.


๐Ÿงฎ Example 2: Creating a Core Model that Depends on the Staging Model

File: models/core/customer_orders.sql

-- Aggregates data by customer
select
c.customer_id,
c.customer_name,
count(o.order_id) as total_orders,
sum(o.amount) as total_spent
from {{ ref('stg_customers') }} as c
join {{ ref('stg_orders') }} as o
on c.customer_id = o.customer_id
group by 1, 2

Here, dbt knows this model depends on both stg_customers and stg_orders.

โœ… dbt automatically builds them first before executing this model.


๐Ÿงฎ Example 3: Final Summary Model

File: models/marts/sales_summary.sql

-- Final summary for business reports
select
region,
sum(total_spent) as revenue,
count(distinct customer_id) as customers
from {{ ref('customer_orders') }}
group by region

This final layer uses data from the core model, forming a multi-level dependency chain.


๐Ÿ”— How dbt Builds the Dependency Graph (DAG)

When you run:

Terminal window
dbt run

dbt automatically:

  1. Scans each model for ref() calls.
  2. Builds a dependency graph.
  3. Executes models in topological order.

Example build order:

stg_customers โ†’ stg_orders โ†’ customer_orders โ†’ sales_summary

You can visualize this with:

Terminal window
dbt docs generate
dbt docs serve

Then, open the DAG in your browser โ€” a perfect visualization of how models connect.


๐Ÿงฉ Example Set 2: Intermediate Dependency Scenarios

Now, letโ€™s explore more advanced use cases.


๐Ÿงฎ Example 1: Conditional References

{% if target.name == 'prod' %}
select * from {{ ref('sales_summary') }}
{% else %}
select * from {{ ref('sales_summary_dev') }}
{% endif %}

๐Ÿ’ก Use Case: This approach lets you dynamically reference different models for different environments (like dev or prod).


๐Ÿงฎ Example 2: Multiple Ref() in Joins

select
p.product_id,
p.product_name,
sum(o.quantity) as total_sold
from {{ ref('stg_products') }} p
join {{ ref('stg_orders') }} o
on p.product_id = o.product_id
group by 1, 2

Here, both models stg_products and stg_orders are dependencies. dbt ensures both are ready before running this.


๐Ÿงฎ Example 3: Ref with CTEs (Common Table Expressions)

with
customer_data as (
select * from {{ ref('stg_customers') }}
),
order_data as (
select * from {{ ref('stg_orders') }}
)
select
c.customer_id,
c.customer_name,
sum(o.amount) as total_spent
from customer_data c
join order_data o
on c.customer_id = o.customer_id
group by 1, 2

Using ref() inside CTEs keeps your SQL modular and readable.


๐Ÿงฉ Example Set 3: Advanced Project-Level Dependencies


๐Ÿงฎ Example 1: Cross-Project Reference

If you organize models by folders, you can use relative paths in ref().

select * from {{ ref('marts.sales_summary') }}

This tells dbt to look inside the marts folder for the sales_summary.sql model.


๐Ÿงฎ Example 2: Macros Using Ref()

You can even call ref() inside custom macros to make dynamic SQL reusable.

{% macro union_models(model_list) %}
{% for model in model_list %}
select * from {{ ref(model) }}
{% if not loop.last %} union all {% endif %}
{% endfor %}
{% endmacro %}

Usage:

{{ union_models(['stg_orders', 'stg_customers']) }}

๐Ÿงฎ Example 3: Ref() with Incremental Models

You can combine ref() with incremental strategies for large datasets.

{{ config(materialized='incremental') }}
select *
from {{ ref('stg_transactions') }}
where updated_at > (select max(updated_at) from {{ this }})

Here, dbt only processes new records, ensuring faster builds.


๐Ÿง  How to Remember dbt Model Dependencies (for Interview/Exam)

To remember how dbt links models, use the mnemonic โ€œR-D-Gโ€:

LetterConceptMemory Tip
RRef()โ€œUse ref() to reference another model.โ€
DDependenciesโ€œRef builds the dependency DAG.โ€
GGraph (DAG)โ€œdbt visualizes the graph automatically.โ€

๐Ÿ’ก Interview Flashcards

QuestionAnswer
What does ref() do in dbt?It links models and defines dependencies.
Why not hardcode table names?ref() automatically updates schema references.
How does dbt know model order?It scans ref() calls to build a DAG.
What is a DAG in dbt?Directed Acyclic Graph showing model dependencies.
Can you use ref() in macros?Yes, to dynamically reference models.
What happens if a referenced model fails?dbt halts the run for dependent models.

โšก Why Itโ€™s Important to Learn Model Dependencies

Understanding model dependencies is crucial for professional dbt use.

BenefitDescription
Automationdbt runs models in the right order automatically
ScalabilitySimplifies managing 100s of interdependent models
ClarityDAG visualizes relationships for better understanding
MaintainabilityEasier to debug and modify pipelines
Reusabilityref() enables modular design across teams
Environment AgilityAutomatically adapts to schema changes
CollaborationDevelopers can work independently on dependent models

๐Ÿงฑ Best Practices

PracticeDescription
Always use ref() instead of hardcodingEnsures portability and dependency tracking
Name models clearlyHelps visualize DAG structure
Avoid circular dependenciesKeep graph acyclic (no loops)
Group models by layerExample: staging โ†’ core โ†’ marts
Use macros for repetitive ref logicPromotes clean, DRY code
Visualize DAG regularlyDetects broken or missing dependencies early

๐Ÿงฉ Common Mistakes & Fixes

MistakeProblemFix
Hardcoded table namesSchema breaks after deploymentReplace with ref()
Circular referencesdbt run failsReorganize dependencies
Model naming inconsistencyDAG confusionUse naming convention (stg_, core_, mart_)
No DAG visualizationHidden dependency errorsRun dbt docs serve regularly

๐Ÿ” Real-World Use Cases

IndustryUse Case
E-CommerceChain of dependencies from raw orders โ†’ cleaned โ†’ sales summary
FinanceData lineage tracking for compliance
MarketingCampaign-level performance aggregation
HealthcareLayered models for patient, hospital, and outcome analysis
SaaS AnalyticsBuilding metrics dashboards across multiple teams

โš™๏ธ Visualizing Dependencies in dbt

You can view model relationships interactively using the dbt Docs tool.

Commands:

Terminal window
dbt docs generate
dbt docs serve

The output shows a graph of models (DAG) โ€” each node represents a model, and edges represent ref() relationships.

This makes debugging or optimizing pipelines much easier.


๐Ÿง  Quick Recap

โœ… Model Dependencies define how models rely on one another. โœ… ref() is the key dbt function for linking models. โœ… dbt builds a DAG to execute models in correct order. โœ… You can use ref() in models, macros, and even conditionals. โœ… It ensures consistency, automation, and maintainability.


๐Ÿš€ Final Thoughts

Mastering model dependencies in dbt through the ref() function is like mastering the foundation of a data city โ€” every building (model) stands on another, and ref() is the blueprint that keeps everything connected.

Whether youโ€™re building a simple two-model project or an enterprise-scale pipeline with 500 models, ref() keeps your transformations organized, traceable, and scalable.

โ€œWithout ref(), dbt would just be SQL scripts. With ref(), it becomes an intelligent, interconnected data system.โ€