Snowflake
Architecture
- Snowflake Architecture
- Multi-Cluster Architecture
- Shared Disk and MPP Processing
- Virtual Warehouses
- Zero-Copy Cloning
Data Storage & Management
Data Loading & Integration
QueryProcessing & Optimization
- Query Caching
- Query Profile & Optimization
- Materialized Views
- Adaptive Query Execution
- Query Acceleration Service
SQL & Analytics Features
❄️ Snowflake Query Caching: The Secret to Lightning-Fast Performance
Imagine running a query that scans millions of rows — it takes seconds the first time but finishes in milliseconds the next. That’s Snowflake Query Caching at work.
Snowflake’s architecture is designed to cache query results at multiple levels, ensuring you don’t reprocess data unnecessarily. It’s one of the most powerful features that make Snowflake fast, cost-efficient, and scalable.
🧩 What is Query Caching in Snowflake?
Query caching means storing the results or intermediate data of previously executed queries so that subsequent queries can reuse them. This minimizes computation, reduces warehouse load, and significantly improves performance.
Snowflake automatically manages multiple layers of caching — you don’t have to configure anything manually.
⚙️ Types of Query Caching in Snowflake
Snowflake uses three main caching mechanisms:
Cache Type | Description | Duration |
---|---|---|
Result Cache | Stores complete query results | 24 hours |
Local Disk Cache | Stores micro-partitions on virtual warehouse SSDs | Until warehouse suspension |
Metadata Cache | Stores metadata about tables and micro-partitions | Persistent |
Each layer serves a different purpose and complements the others.
🌐 ** – Snowflake Query Caching Workflow**
🧠 1️⃣ Result Cache
📘 Definition
The Result Cache stores the final output of executed queries for 24 hours. If the same user (with the same role and query text) reruns it, Snowflake retrieves results instantly — without re-executing the query.
💻 Example 1 – Using Result Cache
-- First execution (takes longer)SELECT * FROM sales WHERE region = 'US';
-- Second execution (uses result cache)SELECT * FROM sales WHERE region = 'US';
✅ The second query returns results instantly because the result cache is used.
💻 Example 2 – Cache Validity
If data hasn’t changed, cached results remain valid for 24 hours.
-- Run a querySELECT COUNT(*) FROM orders;
-- After 2 hours, run againSELECT COUNT(*) FROM orders;
✅ As long as no INSERT
, UPDATE
, or DELETE
occurred, Snowflake serves the cached result.
💻 Example 3 – Bypassing Result Cache
Sometimes, you may want to ignore cached results for fresh computation.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
SELECT * FROM customer_activity;
✅ Disables caching for this session.
📘 When to Use
- Frequent queries with no data changes
- Analytical dashboards where same query repeats
- To save compute cost and improve user experience
⚙️ 2️⃣ Local Disk Cache (Warehouse Cache)
📘 Definition
The Warehouse Cache (also called local disk cache) stores data micro-partitions that have been read from cloud storage (like S3). This cache resides on the SSD of the virtual warehouse.
Unlike result cache, it stores raw table data, not the final query result.
💻 Example 1 – First Query (Cache Miss)
SELECT * FROM inventory WHERE category = 'Electronics';
✅ Snowflake fetches data from remote cloud storage, populating the cache.
💻 Example 2 – Second Query (Cache Hit)
SELECT * FROM inventory WHERE category = 'Electronics';
✅ Data is fetched from local disk cache — faster and cheaper.
💻 Example 3 – Warehouse Restart
If the warehouse is suspended or resized, the cache clears.
ALTER WAREHOUSE my_wh SUSPEND;ALTER WAREHOUSE my_wh RESUME;
⚠️ The cache is lost upon suspension.
📘 When to Use
- For repeated queries on the same data set
- ETL jobs where same data partitions are accessed multiple times
- Medium-term caching for active sessions
⚙️ 3️⃣ Metadata Cache
📘 Definition
The Metadata Cache keeps track of table structure, micro-partition statistics, and file locations. This allows Snowflake to prune data efficiently and skip irrelevant partitions — a major performance booster.
💻 Example 1 – Automatic Metadata Usage
SELECT * FROM orders WHERE order_date > '2025-01-01';
✅ Snowflake uses metadata cache to skip partitions before scanning — reducing I/O.
💻 Example 2 – Checking Pruning Efficiency
EXPLAIN USING TEXT SELECT * FROM orders WHERE region = 'APAC';
✅ The explain plan shows how Snowflake prunes unnecessary partitions via cached metadata.
💻 Example 3 – Metadata Refresh
To ensure updated metadata after major schema change:
ALTER TABLE orders REFRESH;
✅ Refreshes metadata cache for that table.
📘 When to Use
- Always active — you don’t control it directly
- Most beneficial for partition pruning
- Helps query optimizer decide the fastest path
🧠 How to Remember Query Caching Levels
Use this memory trick:
“R-L-M” → Result, Local, Metadata (Think: Results Load More efficiently)
🔹 Quick Associations:
Cache Type | Think Of | Memory Cue |
---|---|---|
Result Cache | Query result storage | “Instant Replay” |
Local Cache | Temporary SSD cache | “Short-term memory” |
Metadata Cache | Table stats & partition info | “Index guide” |
🧩 ** – Query Cache Layer Interaction**
🧭 How Caching Improves Performance
Cache Type | Query Time Reduction | Storage Source |
---|---|---|
Result Cache | Up to 99% faster | Cached query results |
Local Cache | 2–5x faster | Warehouse SSD |
Metadata Cache | Efficient data pruning | Metadata repository |
🧩 Caching Rules in Snowflake
-
Result Cache
- Stored for 24 hours
- Invalidated when underlying data changes
- Specific to the same user, role, and query text
-
Local Cache
- Stored until warehouse suspend
- Automatically managed
- Cache differs per warehouse
-
Metadata Cache
- Persistent
- Refreshed automatically on DML/DDL events
💡 Performance Example
Imagine you run this analytical query:
SELECT region, SUM(sales_amount)FROM salesGROUP BY region;
Execution Type | Source | Time Taken |
---|---|---|
First Execution | From Cloud Storage | 4 seconds |
Second Execution | From Result Cache | 50 ms |
After Warehouse Resume | From Local Cache | 1.5 seconds |
✅ The difference shows how cache layers optimize query time dramatically.
🧠 Interview Questions & Answers
Question | Answer |
---|---|
What is Snowflake Query Caching? | Mechanism to store query results and data for faster performance. |
Name the types of caching in Snowflake. | Result Cache, Local Disk Cache, Metadata Cache. |
How long does result cache last? | 24 hours unless underlying data changes. |
What happens when the warehouse is suspended? | Local disk cache is cleared. |
Can I disable caching? | Yes, using ALTER SESSION SET USE_CACHED_RESULT = FALSE . |
Does Snowflake cache semi-structured data? | Yes, caching applies regardless of data type. |
📈 Why Query Caching Matters
Benefit | Description |
---|---|
1. Speed | Reduces query latency drastically. |
2. Cost Efficiency | Minimizes compute usage by reusing results. |
3. Scalability | Handles high concurrency efficiently. |
4. Resource Optimization | Frees warehouse resources for new queries. |
5. Better User Experience | Dashboards and reports load instantly. |
🔍 Real-World Scenarios
Scenario | Cache Used | Description |
---|---|---|
BI dashboard with repeated queries | Result Cache | Instant refresh without reprocessing |
ETL pipeline using same warehouse | Local Cache | Faster repeated loads |
Analytical query with partition filter | Metadata Cache | Prunes irrelevant partitions |
🧩 Common Mistakes to Avoid
Mistake | Impact | Fix |
---|---|---|
Warehouse suspended too often | Cache lost frequently | Use auto-suspend wisely |
Data updated frequently | Invalidates result cache | Use warehouse/local cache |
Using different roles | Result cache not shared | Use same user and role |
Changing query formatting | Cache miss | Keep consistent SQL text |
⚙️ Cache Monitoring
Check cache efficiency using Snowflake Query History:
SELECT query_id, query_text, bytes_scanned, execution_timeFROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORYWHERE start_time > DATEADD('hour', -1, CURRENT_TIMESTAMP());
- Low bytes_scanned = cache used
- Short execution_time = query reused cache
🧠 How to Prepare for Interview or Exam
- Memorize R-L-M (Result, Local, Metadata).
- Practice enabling/disabling caching with
USE_CACHED_RESULT
. - Understand what clears each cache type.
- Remember cache lifetimes (24h, warehouse lifetime, persistent).
- Review the s visually.
💡 Mnemonic: “Real-Life Memory” = Result, Local, Metadata
📊 ** – Query Caching Lifecycle**
🧩 Performance Tuning Recommendations
- Keep warehouses running for frequent queries
- Use materialized views for large result caching
- Schedule refresh windows wisely
- Encourage query standardization for better cache reuse
- Avoid random functions (RAND(), CURRENT_TIMESTAMP) in cacheable queries
📘 Quick Recap
Cache Type | Stores | Duration | Main Benefit |
---|---|---|---|
Result Cache | Final query results | 24 hours | Instant response |
Local Cache | Data micro-partitions | Warehouse lifetime | Reuse data |
Metadata Cache | Table and partition info | Persistent | Query pruning |
🎯 Conclusion
Snowflake Query Caching is one of the most underrated performance boosters in cloud data warehousing. By caching query results, table data, and metadata, Snowflake minimizes redundant computation, saving both time and cost.
For data engineers, analysts, and architects — mastering caching behavior means mastering query efficiency.
💡 Key Takeaway:
“Snowflake doesn’t just store data — it remembers your queries and makes them faster every time.”