Snowflake
Architecture
- Snowflake Architecture
- Multi-Cluster Architecture
- Shared Disk and MPP Processing
- Virtual Warehouses
- Zero-Copy Cloning
Data Storage & Management
Data Loading & Integration
QueryProcessing & Optimization
- Query Caching
- Query Profile & Optimization
- Materialized Views
- Adaptive Query Execution
- Query Acceleration Service
SQL & Analytics Features
❄️ Snowflake Recursive CTEs (Common Table Expressions) – Simplifying Hierarchical Data Queries
In data analytics, not all relationships are flat. Many datasets — such as organization charts, product categories, or directory trees — have a hierarchical structure.
For example:
- An employee reports to a manager who reports to a director.
- A category has subcategories that may have further subcategories.
- A file folder contains subfolders and files recursively.
How do we query such structures using SQL?
Enter Recursive CTEs (Common Table Expressions) — a feature in Snowflake SQL that allows you to traverse hierarchical or self-referential data efficiently.
🧠 What is a Recursive CTE?
A Recursive Common Table Expression (CTE) is a temporary result set that references itself. It’s ideal for exploring parent-child relationships or multi-level hierarchies in data.
🧩 General Syntax
WITH RECURSIVE cte_name AS ( -- Anchor member (base result) SELECT ... FROM ... WHERE ... UNION ALL -- Recursive member (references cte_name) SELECT ... FROM cte_name JOIN ... ON ...)SELECT * FROM cte_name;
⚙️ CTE Components
Component | Description |
---|---|
Anchor member | The initial query that defines the starting point. |
Recursive member | References the CTE itself to fetch subsequent levels. |
Termination condition | The recursion stops when no more rows are returned. |
UNION ALL | Combines base and recursive results into a growing result set. |
🎨 ** – Recursive CTE Process**
💡 Why Recursive CTEs Matter
Benefit | Description |
---|---|
🧱 Hierarchical Data Handling | Ideal for org charts, product hierarchies, and bill of materials. |
🧠 Readable Logic | Easier to understand than complex joins or loops. |
⚡ Performance | Runs efficiently on Snowflake’s cloud engine. |
🪜 Level Tracking | Helps analyze data depth or reporting levels. |
🧩 Reusable Logic | Works inside larger SQL transformations. |
🧩 Example 1 – Organizational Hierarchy
🧱 Scenario:
You have an employee table with each employee’s manager. You want to build a full hierarchy tree from the CEO downwards.
💾 Step 1: Create Sample Data
CREATE OR REPLACE TABLE employees ( emp_id INT, emp_name STRING, manager_id INT);
INSERT INTO employees VALUES(1, 'Alice', NULL),(2, 'Bob', 1),(3, 'Charlie', 1),(4, 'David', 2),(5, 'Eve', 2),(6, 'Frank', 3);
💡 Step 2: Recursive CTE Query
WITH RECURSIVE org_hierarchy AS ( -- Anchor: CEO (no manager) SELECT emp_id, emp_name, manager_id, 0 AS level FROM employees WHERE manager_id IS NULL
UNION ALL
-- Recursive: Fetch direct reports SELECT e.emp_id, e.emp_name, e.manager_id, oh.level + 1 FROM employees e JOIN org_hierarchy oh ON e.manager_id = oh.emp_id)SELECT * FROM org_hierarchy;
✅ Output
emp_id | emp_name | manager_id | level |
---|---|---|---|
1 | Alice | NULL | 0 |
2 | Bob | 1 | 1 |
3 | Charlie | 1 | 1 |
4 | David | 2 | 2 |
5 | Eve | 2 | 2 |
6 | Frank | 3 | 2 |
🧠 Explanation
- Anchor Query: Starts with the CEO (no manager).
- Recursive Query: Finds employees whose
manager_id
matches an existingemp_id
. - Level Counter: Tracks hierarchy depth.
✅ Use Case: Organization charts, reporting structures.
🧩 Example 2 – Product Category Hierarchy
🧱 Scenario:
You have product categories where each category can have a parent category.
💾 Step 1: Create Sample Table
CREATE OR REPLACE TABLE categories ( category_id INT, category_name STRING, parent_id INT);
INSERT INTO categories VALUES(1, 'Electronics', NULL),(2, 'Laptops', 1),(3, 'Mobiles', 1),(4, 'Gaming Laptops', 2),(5, 'Android Phones', 3),(6, 'Accessories', NULL);
💡 Step 2: Recursive Query
WITH RECURSIVE category_tree AS ( -- Anchor: Root categories SELECT category_id, category_name, parent_id, 0 AS level FROM categories WHERE parent_id IS NULL
UNION ALL
-- Recursive: Find subcategories SELECT c.category_id, c.category_name, c.parent_id, ct.level + 1 FROM categories c JOIN category_tree ct ON c.parent_id = ct.category_id)SELECT * FROM category_tree ORDER BY level;
✅ Output
category_id | category_name | parent_id | level |
---|---|---|---|
1 | Electronics | NULL | 0 |
6 | Accessories | NULL | 0 |
2 | Laptops | 1 | 1 |
3 | Mobiles | 1 | 1 |
4 | Gaming Laptops | 2 | 2 |
5 | Android Phones | 3 | 2 |
🧠 Explanation
- Parent-child mapping: Recursive join expands children categories.
- Level indicator: Helps visualize tree depth.
- ORDER BY level: Displays hierarchical flow neatly.
✅ Use Case: E-commerce category trees or product taxonomies.
🧩 Example 3 – Directory Tree Traversal
🧱 Scenario:
You have a file directory system where each folder may have subfolders.
💾 Step 1: Create Sample Table
CREATE OR REPLACE TABLE directories ( dir_id INT, dir_name STRING, parent_dir INT);
INSERT INTO directories VALUES(1, 'root', NULL),(2, 'documents', 1),(3, 'images', 1),(4, 'work', 2),(5, 'personal', 2),(6, 'vacation', 3);
💡 Step 2: Recursive Query
WITH RECURSIVE dir_tree AS ( -- Base directory SELECT dir_id, dir_name, parent_dir, '/' || dir_name AS full_path FROM directories WHERE parent_dir IS NULL
UNION ALL
-- Recursive expansion SELECT d.dir_id, d.dir_name, d.parent_dir, dt.full_path || '/' || d.dir_name AS full_path FROM directories d JOIN dir_tree dt ON d.parent_dir = dt.dir_id)SELECT * FROM dir_tree;
✅ Output
dir_id | dir_name | parent_dir | full_path |
---|---|---|---|
1 | root | NULL | /root |
2 | documents | 1 | /root/documents |
3 | images | 1 | /root/images |
4 | work | 2 | /root/documents/work |
5 | personal | 2 | /root/documents/personal |
6 | vacation | 3 | /root/images/vacation |
🧠 Explanation
- Builds folder paths recursively.
- Each recursive level appends folder names to
full_path
. - Stops when no subfolders remain.
✅ Use Case: File systems, nested folder structures, data lineage tracking.
🎨 ** – Directory Hierarchy**
🧩 Advantages of Recursive CTEs
Advantage | Explanation |
---|---|
Simplifies Tree Traversal | No need for procedural loops |
Dynamic Depth | Handles unlimited levels |
Readable Queries | Logical separation via anchor and recursion |
Reusability | Used in multiple data models |
Efficient Processing | Optimized by Snowflake’s query engine |
⚙️ Performance Optimization Tips
- Use proper WHERE filters to limit recursion depth.
- Avoid large recursion chains unless necessary.
- Add termination logic to prevent infinite loops.
- Test output incrementally (start small).
- Combine with caching or materialized views for faster reuse.
🧠 Mnemonic to Remember Recursive CTE Structure
A.R.T. – Anchor, Recursion, Termination
Step | Description | Example |
---|---|---|
A | Anchor Query | Start with base condition |
R | Recursive Query | Join on itself |
T | Termination | Stops automatically when no more matches |
💡 Memory Hook: Think of recursive CTEs as “SQL loops for tree data.”
🧩 Common Interview Questions
- What is a Recursive CTE in Snowflake?
- How does recursion terminate in CTEs?
- Can you track levels or paths in Recursive CTEs?
- What’s the difference between normal and recursive CTEs?
- Give a use case for hierarchical data in Snowflake.
🧠 How to Remember for Exam & Interviews
Step | Task | Trick |
---|---|---|
1 | Remember keyword WITH RECURSIVE | “Recursive CTE starts with R” |
2 | Anchor = base case | Think “root node” |
3 | Recursive = child expansion | Think “grow the tree” |
4 | UNION ALL | Combines levels |
5 | Stop when no results | Think “no more branches” |
💡 Quick recall:
“Start at the root, connect the branches, stop when the tree ends.”
🧩 Real-World Use Cases
Industry | Use Case |
---|---|
HR / Org Management | Employee → Manager → Director hierarchy |
E-commerce | Category → Subcategory → Product hierarchy |
Finance | Account rollups or balance consolidation |
IT / Infrastructure | Directory paths or network hierarchies |
Supply Chain | Bill of Materials (BOM) relationships |
⚙️ Why It’s Important to Learn
-
Essential for Hierarchical Data Modeling Recursive CTEs are fundamental for representing multi-level data.
-
Used in Snowflake Certifications Frequently tested in SnowPro Core and Advanced Architect exams.
-
Common in Real-World Analytics Every industry uses hierarchical relationships — from teams to products.
-
Improves SQL Thinking Strengthens logical understanding of recursion within SQL.
-
Prepares You for Big Data Engineering Roles Recursive CTEs are used in ETL, data lineage, and aggregation logic.
🧭 ** – Recursive Expansion Example**
🧠 Common Mistakes to Avoid
Mistake | Fix |
---|---|
Forgetting WITH RECURSIVE | Always specify it explicitly |
Using UNION instead of UNION ALL | Causes missing rows |
Missing termination logic | Can lead to infinite recursion |
Misaligned join condition | Verify parent-child relationship correctly |
Not tracking recursion depth | Add a level column for debugging |
🧩 Example 4 – Limiting Recursion Depth
WITH RECURSIVE org_hierarchy AS ( SELECT emp_id, emp_name, manager_id, 0 AS level FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.emp_id, e.emp_name, e.manager_id, oh.level + 1 FROM employees e JOIN org_hierarchy oh ON e.manager_id = oh.emp_id WHERE oh.level < 3)SELECT * FROM org_hierarchy;
✅ Purpose: Prevents recursion from going beyond 3 levels.
🧠 Visualization – Recursive CTE Flow
🎯 Summary Table
Concept | Key Point |
---|---|
Definition | Recursive SQL construct for hierarchical data |
Syntax | WITH RECURSIVE … UNION ALL … |
Components | Anchor, Recursive, Termination |
Applications | Org charts, product trees, directories |
Benefits | Readable, scalable, efficient |
Keyword to Remember | A.R.T (Anchor, Recursive, Termination) |
🧩 Conclusion
The Snowflake Recursive CTE is a powerful tool for querying and visualizing hierarchical data. Whether you’re modeling organizational structures, category trees, or nested directories, Recursive CTEs let you elegantly express recursive logic in a single SQL query — without writing complex scripts or stored procedures.
By mastering this concept, you gain:
- 🔹 Stronger analytical problem-solving skills
- 🔹 Clearer understanding of hierarchical data
- 🔹 An edge in interviews & certifications
Remember:
“Recursive CTEs are the backbone of hierarchical SQL — they turn trees into tables.”