🔄 dbt (Data Build Tool) · 23 guides · updated 2026

Analytics engineering with SQL: models, tests, sources, and Jinja macros that turn raw warehouse tables into trustworthy, documented data products.

dbt Seeds: Static Reference Data That Belongs in Your DAG

Most data in a warehouse comes from source systems: applications, APIs, event streams. But some data does not come from anywhere except a spreadsheet someone maintains manually. Country codes. Fiscal calendar mappings. Product category hierarchies. Internal cost rates. This kind of reference data needs to live in your warehouse too, and dbt seeds are the right way to get it there.


What Is a dbt Seed?

A seed is a CSV file stored inside your dbt project that dbt loads into your warehouse as a table. Once loaded, you can reference it in models exactly the same way you reference any other model, using {{ ref() }}.

Seeds are not meant for large datasets. They are for small, slowly-changing reference tables where the authoritative source is a flat file rather than a live system.

dbt project structure with seeds
---------------------------------
my_project/
├── models/
│   ├── staging/
│   └── marts/
├── seeds/                      <-- your CSV files go here
│   ├── country_codes.csv
│   ├── product_categories.csv
│   └── fiscal_calendar.csv
├── snapshots/
└── dbt_project.yml

When to Use Seeds (and When Not To)

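Seeds are a good fit when the data is small, changes slowly, and has no authoritative source other than a manually maintained flat file.

Seeds are not a good fit when the dataset is large, when it changes frequently, or when it has a live source that an ingestion tool could sync automatically.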


Creating a Seed File

Drop a CSV file into the seeds/ directory. dbt infers column types from the data. For a country code lookup table, seeds/country_codes.csv might look like:

country_code,country_name,region,currency_code
US,United States,North America,USD
GB,United Kingdom,Europe,GBP
DE,Germany,Europe,EUR
JP,Japan,Asia Pacific,JPY
AU,Australia,Asia Pacific,AUD
CA,Canada,North America,CAD
SG,Singapore,Asia Pacific,SGD

Load it into the warehouse:

dbt seed

dbt creates a table in your warehouse using the filename as the table name (country_codes), in the schema defined in your project config.


Configuring Seeds in dbt_project.yml

You can control how seeds behave through dbt_project.yml:

seeds:
  my_project:
    +schema: reference   # custom schema; with dbt's default naming this resolves to <target_schema>_reference
    +quote_columns: false
    country_codes:
      +column_types:
        country_code: varchar(2)
        currency_code: varchar(3)
    fiscal_calendar:
      +schema: finance_reference

For columns where dbt's type inference might be unreliable (such as codes with leading zeros that look like integers), always specify types explicitly.
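
To make the risk concrete, here is a minimal Python sketch of what naive type inference does to a code column. It is not dbt's implementation (dbt delegates CSV parsing to the agate library); the zip_code column, the sample rows, and the naive_infer helper are all hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical seed content: a code column whose values look like integers.
SAMPLE_CSV = """zip_code,city
00501,Holtsville
02134,Boston
"""


def naive_infer(value: str):
    """Hypothetical inference rule: anything that parses as an int becomes an int."""
    try:
        return int(value)
    except ValueError:
        return value


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Inferring a numeric type silently drops the leading zeros.
inferred = [naive_infer(row["zip_code"]) for row in rows]

# Treating the column as text (the effect of an explicit varchar type) keeps them.
as_text = [row["zip_code"] for row in rows]

print(inferred)  # [501, 2134]
print(as_text)   # ['00501', '02134']
```

Once the zeros are gone, joins against the original string codes stop matching, which is why an explicit +column_types entry is the safer default for any code-like column.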


Referencing Seeds in Models

Once loaded, a seed is referenced just like any other model:

-- models/marts/fct_orders_with_region.sql
with orders as (
    select * from {{ ref('stg_orders') }}
),
countries as (
    select * from {{ ref('country_codes') }}
),
enriched as (
    select
        o.order_id,
        o.order_date,
        o.customer_id,
        o.order_amount_usd,
        c.country_name,
        c.region,
        c.currency_code
    from orders o
    left join countries c
        on o.ship_to_country_code = c.country_code
)
select * from enriched

dbt knows that fct_orders_with_region depends on both stg_orders and the country_codes seed, so both are included in the lineage graph.


How Seeds Appear in the DAG

The DAG with seeds included:

[source: raw.orders]         [seed: country_codes]
         |                             |
    [stg_orders]                       |
         |                             |
         +--[fct_orders_with_region]---+
                      |
          [regional_revenue_report]

Seeds show up in dbt docs with the same documentation and lineage tracking as any other node in the project.
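
The ordering implied by that graph can be sketched with Python's standard-library graphlib. This is only an illustration of how a build order falls out of the dependencies, not how dbt itself schedules nodes; the "source:" and "seed:" prefixes in the node names are a labeling convention assumed for this example.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on (its parents in the DAG).
# Node names mirror the diagram; the prefixes are an assumption for readability.
dependencies = {
    "stg_orders": {"source:raw.orders"},
    "fct_orders_with_region": {"stg_orders", "seed:country_codes"},
    "regional_revenue_report": {"fct_orders_with_region"},
}

# static_order() yields each node only after all of its dependencies.
build_order = list(TopologicalSorter(dependencies).static_order())

for position, node in enumerate(build_order, start=1):
    print(position, node)
```

Whatever order the independent nodes come out in, the seed always precedes fct_orders_with_region, which is the property that matters for a model joining against it.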


Adding Tests and Documentation to Seeds

You can document and test seeds with YAML, just like models:

version: 2
seeds:
  - name: country_codes
    description: "ISO 3166-1 alpha-2 country codes with region and currency mapping"
    columns:
      - name: country_code
        description: "Two-letter ISO country code"
        tests:
          - unique
          - not_null
      - name: country_name
        tests:
          - not_null
      - name: region
        tests:
          - not_null
          - accepted_values:
              values:
                - 'North America'
                - 'Europe'
                - 'Asia Pacific'
                - 'Latin America'
                - 'Middle East & Africa'

Run tests against seeds the same way as models:

dbt test --select country_codes
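
For a quick check before the CSV is even committed, the same three rules can be approximated in plain Python. This is a rough sketch, not a replacement for dbt test: the helper names are hypothetical, the sample rows are copied from the country_codes example, and an empty CSV cell is treated as the equivalent of NULL.

```python
import csv
import io

# Sample rows copied from the seeds/country_codes.csv example in this guide.
COUNTRY_CODES_CSV = """country_code,country_name,region,currency_code
US,United States,North America,USD
GB,United Kingdom,Europe,GBP
DE,Germany,Europe,EUR
JP,Japan,Asia Pacific,JPY
"""

ACCEPTED_REGIONS = {
    "North America", "Europe", "Asia Pacific",
    "Latin America", "Middle East & Africa",
}


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def not_null_failures(rows, column):
    """Rows whose value in `column` is empty (standing in for NULL)."""
    return [row for row in rows if not row[column].strip()]


def unique_failures(rows, column):
    """Sorted list of values that appear more than once in `column`."""
    seen, duplicates = set(), set()
    for row in rows:
        value = row[column]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


def accepted_values_failures(rows, column, accepted):
    """Rows with a non-empty value outside `accepted`; empties are left to not_null."""
    return [row for row in rows if row[column] and row[column] not in accepted]


rows = load_rows(COUNTRY_CODES_CSV)
failures = {
    "unique": unique_failures(rows, "country_code"),
    "not_null": not_null_failures(rows, "country_name"),
    "accepted_values": accepted_values_failures(rows, "region", ACCEPTED_REGIONS),
}
print(failures)  # every list is empty for the sample rows
```

Each helper returns the offending rows or values rather than a boolean, loosely mirroring how a dbt test passes when its query returns no rows.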

A Practical Example: Fiscal Calendar Seed

Finance teams often work on non-standard calendars. A fiscal calendar seed solves the problem of mapping dates to fiscal periods without custom code in every model.

seeds/fiscal_calendar.csv:

calendar_date,fiscal_year,fiscal_quarter,fiscal_month,fiscal_week
2025-01-01,FY2025,Q1,M01,W01
2025-01-02,FY2025,Q1,M01,W01
...
2025-03-31,FY2025,Q1,M03,W13
2025-04-01,FY2025,Q2,M04,W14

Load it once, and every model that needs fiscal context can join against it:

with daily_revenue as (
    select
        order_date,
        sum(order_amount_usd) as revenue
    from {{ ref('stg_orders') }}
    group by 1
),
with_fiscal as (
    select
        d.order_date,
        d.revenue,
        fc.fiscal_year,
        fc.fiscal_quarter,
        fc.fiscal_month
    from daily_revenue d
    left join {{ ref('fiscal_calendar') }} fc
        on d.order_date = fc.calendar_date
)
select * from with_fiscal
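
Because this is a left join, any order date missing from the calendar comes through with null fiscal columns. A small pre-load check can catch gaps in the CSV. The sketch below assumes the calendar should contain one row per day between its first and last date; the excerpt is hypothetical, with 2025-01-03 left out on purpose, and missing_dates is an illustrative helper rather than anything dbt provides.

```python
import csv
import io
from datetime import date, timedelta

# Hypothetical excerpt of seeds/fiscal_calendar.csv; 2025-01-03 is deliberately absent.
FISCAL_CALENDAR_CSV = """calendar_date,fiscal_year,fiscal_quarter,fiscal_month,fiscal_week
2025-01-01,FY2025,Q1,M01,W01
2025-01-02,FY2025,Q1,M01,W01
2025-01-04,FY2025,Q1,M01,W01
"""


def missing_dates(csv_text):
    """Return the dates between the first and last calendar_date that have no row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = {date.fromisoformat(row["calendar_date"]) for row in reader}
    if not present:
        return []
    start, end = min(present), max(present)
    all_days = (start + timedelta(days=offset) for offset in range((end - start).days + 1))
    return [day for day in all_days if day not in present]


gaps = missing_dates(FISCAL_CALENDAR_CSV)
print(gaps)  # [datetime.date(2025, 1, 3)]
```

An empty result means every day in the covered range has a fiscal mapping; dates outside the range still need the calendar to be extended.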

Updating Seeds Over Time

When your reference data changes, update the CSV file in your repo and run:

dbt seed --full-refresh

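The --full-refresh flag drops and recreates the table, which is necessary when columns are added, removed, or retyped. Without it, dbt truncates the existing table and reloads every row from the CSV, so added and deleted rows are picked up either way, but a changed column set can cause the load to fail.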

For seeds that update frequently, consider whether a proper ingestion tool would be a better fit than maintaining a CSV manually.


Seeds in CI/CD Pipelines

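In most CI/CD setups, seeds are loaded before the models that depend on them. dbt build already loads seeds along with models, snapshots, and tests, in DAG order, so an explicit dbt seed step is only required when the pipeline runs dbt run and dbt test as separate commands:

dbt deps    # install packages
dbt seed    # load reference CSVs (optional when dbt build follows)
dbt build   # run seeds, models, snapshots, tests in DAG order

Either way, reference tables are current before the models that depend on them run.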


2025-2026 Notes on Seeds

Seeds have remained stable as a feature, but a few patterns have emerged in how teams use them:

Separate schema for seeds: Most teams now configure seeds to land in a dedicated schema (like reference or static) so they are visually distinct from transformed models in the warehouse catalog.

Seeds as a last resort: The dbt community increasingly treats seeds as a last resort rather than a convenience. If data has a live source (even a Google Sheet), tools like Airbyte or custom connectors are preferred because they automate updates. Seeds work best for data that genuinely has no automated source.

dbt packages with shared seeds: Some open-source dbt packages include seed files (like the dbt-date package's holiday calendars). These install via dbt deps and can be referenced with {{ ref('package_name', 'seed_name') }}.


Seeds are a small feature with a clear use case: getting manually maintained reference data into your warehouse, versioned alongside your models, with the same testing and documentation infrastructure you use for everything else. Used correctly, they eliminate the category of "data that lives in a spreadsheet and gets joined in by hand."