❄️ Snowflake Recursive CTEs (Common Table Expressions) – Simplifying Hierarchical Data Queries


In data analytics, not all relationships are flat. Many datasets — such as organization charts, product categories, or directory trees — have a hierarchical structure.

For example:

  • An employee reports to a manager who reports to a director.
  • A category has subcategories that may have further subcategories.
  • A file folder contains subfolders and files recursively.

How do we query such structures using SQL?

Enter Recursive CTEs (Common Table Expressions) — a feature in Snowflake SQL that allows you to traverse hierarchical or self-referential data efficiently.


🧠 What is a Recursive CTE?

A Recursive Common Table Expression (CTE) is a temporary result set that references itself. It’s ideal for exploring parent-child relationships or multi-level hierarchies in data.

🧩 General Syntax

WITH RECURSIVE cte_name AS (
-- Anchor member (base result)
SELECT ... FROM ... WHERE ...
UNION ALL
-- Recursive member (references cte_name)
SELECT ... FROM cte_name JOIN ... ON ...
)
SELECT * FROM cte_name;

⚙️ CTE Components

ComponentDescription
Anchor memberThe initial query that defines the starting point.
Recursive memberReferences the CTE itself to fetch subsequent levels.
Termination conditionThe recursion stops when no more rows are returned.
UNION ALLCombines base and recursive results into a growing result set.

🎨 ** – Recursive CTE Process**

Anchor Query: Level 0

Recursive Step: Level 1

Recursive Step: Level 2

Recursive Step: Level N

No More Rows: Stop Recursion


💡 Why Recursive CTEs Matter

BenefitDescription
🧱 Hierarchical Data HandlingIdeal for org charts, product hierarchies, and bill of materials.
🧠 Readable LogicEasier to understand than complex joins or loops.
PerformanceRuns efficiently on Snowflake’s cloud engine.
🪜 Level TrackingHelps analyze data depth or reporting levels.
🧩 Reusable LogicWorks inside larger SQL transformations.

🧩 Example 1 – Organizational Hierarchy

🧱 Scenario:

You have an employee table with each employee’s manager. You want to build a full hierarchy tree from the CEO downwards.

💾 Step 1: Create Sample Data

CREATE OR REPLACE TABLE employees (
emp_id INT,
emp_name STRING,
manager_id INT
);
INSERT INTO employees VALUES
(1, 'Alice', NULL),
(2, 'Bob', 1),
(3, 'Charlie', 1),
(4, 'David', 2),
(5, 'Eve', 2),
(6, 'Frank', 3);

💡 Step 2: Recursive CTE Query

WITH RECURSIVE org_hierarchy AS (
-- Anchor: CEO (no manager)
SELECT emp_id, emp_name, manager_id, 0 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive: Fetch direct reports
SELECT e.emp_id, e.emp_name, e.manager_id, oh.level + 1
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.emp_id
)
SELECT * FROM org_hierarchy;

Output

emp_idemp_namemanager_idlevel
1AliceNULL0
2Bob11
3Charlie11
4David22
5Eve22
6Frank32

🧠 Explanation

  • Anchor Query: Starts with the CEO (no manager).
  • Recursive Query: Finds employees whose manager_id matches an existing emp_id.
  • Level Counter: Tracks hierarchy depth.

Use Case: Organization charts, reporting structures.


🧩 Example 2 – Product Category Hierarchy

🧱 Scenario:

You have product categories where each category can have a parent category.

💾 Step 1: Create Sample Table

CREATE OR REPLACE TABLE categories (
category_id INT,
category_name STRING,
parent_id INT
);
INSERT INTO categories VALUES
(1, 'Electronics', NULL),
(2, 'Laptops', 1),
(3, 'Mobiles', 1),
(4, 'Gaming Laptops', 2),
(5, 'Android Phones', 3),
(6, 'Accessories', NULL);

💡 Step 2: Recursive Query

WITH RECURSIVE category_tree AS (
-- Anchor: Root categories
SELECT category_id, category_name, parent_id, 0 AS level
FROM categories
WHERE parent_id IS NULL
UNION ALL
-- Recursive: Find subcategories
SELECT c.category_id, c.category_name, c.parent_id, ct.level + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.category_id
)
SELECT * FROM category_tree ORDER BY level;

Output

category_idcategory_nameparent_idlevel
1ElectronicsNULL0
6AccessoriesNULL0
2Laptops11
3Mobiles11
4Gaming Laptops22
5Android Phones32

🧠 Explanation

  • Parent-child mapping: Recursive join expands children categories.
  • Level indicator: Helps visualize tree depth.
  • ORDER BY level: Displays hierarchical flow neatly.

Use Case: E-commerce category trees or product taxonomies.


🧩 Example 3 – Directory Tree Traversal

🧱 Scenario:

You have a file directory system where each folder may have subfolders.

💾 Step 1: Create Sample Table

CREATE OR REPLACE TABLE directories (
dir_id INT,
dir_name STRING,
parent_dir INT
);
INSERT INTO directories VALUES
(1, 'root', NULL),
(2, 'documents', 1),
(3, 'images', 1),
(4, 'work', 2),
(5, 'personal', 2),
(6, 'vacation', 3);

💡 Step 2: Recursive Query

WITH RECURSIVE dir_tree AS (
-- Base directory
SELECT dir_id, dir_name, parent_dir, '/' || dir_name AS full_path
FROM directories
WHERE parent_dir IS NULL
UNION ALL
-- Recursive expansion
SELECT d.dir_id, d.dir_name, d.parent_dir,
dt.full_path || '/' || d.dir_name AS full_path
FROM directories d
JOIN dir_tree dt ON d.parent_dir = dt.dir_id
)
SELECT * FROM dir_tree;

Output

dir_iddir_nameparent_dirfull_path
1rootNULL/root
2documents1/root/documents
3images1/root/images
4work2/root/documents/work
5personal2/root/documents/personal
6vacation3/root/images/vacation

🧠 Explanation

  • Builds folder paths recursively.
  • Each recursive level appends folder names to full_path.
  • Stops when no subfolders remain.

Use Case: File systems, nested folder structures, data lineage tracking.


🎨 ** – Directory Hierarchy**

root

documents

images

work

personal

vacation


🧩 Advantages of Recursive CTEs

AdvantageExplanation
Simplifies Tree TraversalNo need for procedural loops
Dynamic DepthHandles unlimited levels
Readable QueriesLogical separation via anchor and recursion
ReusabilityUsed in multiple data models
Efficient ProcessingOptimized by Snowflake’s query engine

⚙️ Performance Optimization Tips

  1. Use proper WHERE filters to limit recursion depth.
  2. Avoid large recursion chains unless necessary.
  3. Add termination logic to prevent infinite loops.
  4. Test output incrementally (start small).
  5. Combine with caching or materialized views for faster reuse.

🧠 Mnemonic to Remember Recursive CTE Structure

A.R.T. – Anchor, Recursion, Termination

StepDescriptionExample
AAnchor QueryStart with base condition
RRecursive QueryJoin on itself
TTerminationStops automatically when no more matches

💡 Memory Hook: Think of recursive CTEs as “SQL loops for tree data.”


🧩 Common Interview Questions

  1. What is a Recursive CTE in Snowflake?
  2. How does recursion terminate in CTEs?
  3. Can you track levels or paths in Recursive CTEs?
  4. What’s the difference between normal and recursive CTEs?
  5. Give a use case for hierarchical data in Snowflake.

🧠 How to Remember for Exam & Interviews

StepTaskTrick
1Remember keyword WITH RECURSIVE“Recursive CTE starts with R”
2Anchor = base caseThink “root node”
3Recursive = child expansionThink “grow the tree”
4UNION ALLCombines levels
5Stop when no resultsThink “no more branches”

💡 Quick recall:

“Start at the root, connect the branches, stop when the tree ends.”


🧩 Real-World Use Cases

IndustryUse Case
HR / Org ManagementEmployee → Manager → Director hierarchy
E-commerceCategory → Subcategory → Product hierarchy
FinanceAccount rollups or balance consolidation
IT / InfrastructureDirectory paths or network hierarchies
Supply ChainBill of Materials (BOM) relationships

⚙️ Why It’s Important to Learn

  1. Essential for Hierarchical Data Modeling Recursive CTEs are fundamental for representing multi-level data.

  2. Used in Snowflake Certifications Frequently tested in SnowPro Core and Advanced Architect exams.

  3. Common in Real-World Analytics Every industry uses hierarchical relationships — from teams to products.

  4. Improves SQL Thinking Strengthens logical understanding of recursion within SQL.

  5. Prepares You for Big Data Engineering Roles Recursive CTEs are used in ETL, data lineage, and aggregation logic.


🧭 ** – Recursive Expansion Example**

Anchor Row

Recursive Step 1

Recursive Step 2

Recursive Step 3

End of Tree


🧠 Common Mistakes to Avoid

MistakeFix
Forgetting WITH RECURSIVEAlways specify it explicitly
Using UNION instead of UNION ALLCauses missing rows
Missing termination logicCan lead to infinite recursion
Misaligned join conditionVerify parent-child relationship correctly
Not tracking recursion depthAdd a level column for debugging

🧩 Example 4 – Limiting Recursion Depth

WITH RECURSIVE org_hierarchy AS (
SELECT emp_id, emp_name, manager_id, 0 AS level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.emp_id, e.emp_name, e.manager_id, oh.level + 1
FROM employees e
JOIN org_hierarchy oh ON e.manager_id = oh.emp_id
WHERE oh.level < 3
)
SELECT * FROM org_hierarchy;

Purpose: Prevents recursion from going beyond 3 levels.


🧠 Visualization – Recursive CTE Flow

CEO: Alice

Manager: Bob

Manager: Charlie

Employee: David

Employee: Eve

Employee: Frank


🎯 Summary Table

ConceptKey Point
DefinitionRecursive SQL construct for hierarchical data
SyntaxWITH RECURSIVE … UNION ALL …
ComponentsAnchor, Recursive, Termination
ApplicationsOrg charts, product trees, directories
BenefitsReadable, scalable, efficient
Keyword to RememberA.R.T (Anchor, Recursive, Termination)

🧩 Conclusion

The Snowflake Recursive CTE is a powerful tool for querying and visualizing hierarchical data. Whether you’re modeling organizational structures, category trees, or nested directories, Recursive CTEs let you elegantly express recursive logic in a single SQL query — without writing complex scripts or stored procedures.

By mastering this concept, you gain:

  • 🔹 Stronger analytical problem-solving skills
  • 🔹 Clearer understanding of hierarchical data
  • 🔹 An edge in interviews & certifications

Remember:

“Recursive CTEs are the backbone of hierarchical SQL — they turn trees into tables.”