SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed
SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog
Back to Blog
2026-02-21
9 min read

Star Schema vs. Snowflake Schema Explained

sqldata-warehouseschema-designanalyticsolap

You've decided to build a data warehouse. Your first major design decision: how do you structure your tables?

Two patterns dominate the field—the star schema and the snowflake schema. They look similar at first glance, but the choice between them affects every query you write, every dashboard you build, and every analyst who joins your team.

Let's break down both schemas, understand when to use each, and get hands-on with SQL examples.

Side-by-side comparison of a star schema and a snowflake schema showing fact tables, dimension tables, and normalized sub-dimensions
Side-by-side comparison of a star schema and a snowflake schema showing fact tables, dimension tables, and normalized sub-dimensions

The Core Idea: Fact Tables and Dimension Tables

Both designs share the same fundamental vocabulary:

  • Fact table: The central table that stores quantitative, measurable events—sales, page views, payments. It's typically long (millions of rows) and narrow (mostly foreign keys and numbers).
  • Dimension table: Descriptive tables that provide context—customers, products, dates, locations. They answer who, what, where, and when.

The difference between star and snowflake schemas lies in how the dimension tables are organized.

The Star Schema

In a star schema, each dimension table is fully denormalized and connects directly to the fact table. Viewed visually, the fact table sits in the center with dimension tables radiating outward—like a star.

Notice that DIM_PRODUCTS stores category, sub_category, and brand all in one flat table—even though a category could itself be described by multiple attributes. This is intentional denormalization.

Advantages:

  • Simple, easy-to-understand structure for analysts.
  • Fewer JOINs per query—usually just fact table + 1 or 2 dimensions.
  • Faster query performance on most OLAP systems.

Disadvantages:

  • Redundant data (the same category string is repeated across many product rows).
  • Harder to maintain if dimension attributes change frequently.

The Snowflake Schema

In a snowflake schema, dimension tables are normalized. Instead of one flat DIM_PRODUCTS table, you split it into multiple related tables: dim_products, dim_categories, dim_brands.

Advantages:

  • Less data redundancy—each category name is stored exactly once.
  • Easier to update: rename a brand in one row, not thousands.
  • Better fits data from normalized OLTP source systems.

Disadvantages:

  • More JOINs per query—analysts must navigate deeper table chains.
  • More complex SQL that's harder for non-technical users to write.
  • Joins can be slower without careful indexing.

Side-by-Side SQL Comparison

Let's answer the same question with both schemas: "What is total revenue by product category in Q1 2025?"

Star Schema query — clean and direct:

SELECT
  p.category,
  SUM(f.total_amount) AS total_revenue
FROM fact_orders f
JOIN dim_products p ON f.product_id = p.product_id
JOIN dim_dates d   ON f.date_id = d.date_id
WHERE d.year = 2025 AND d.quarter = 1
GROUP BY p.category
ORDER BY total_revenue DESC;

Snowflake Schema query — requires an extra JOIN:

SELECT
  c.category_name,
  SUM(f.total_amount) AS total_revenue
FROM fact_orders f
JOIN dim_products p   ON f.product_id = p.product_id
JOIN dim_categories c ON p.category_id = c.category_id
JOIN dim_dates d      ON f.date_id = d.date_id
WHERE d.year = 2025 AND d.quarter = 1
GROUP BY c.category_name
ORDER BY total_revenue DESC;

Same result, but the snowflake version requires the analyst to know that categories live in a separate table. Multiply this across 10 reports and you start to feel the maintenance overhead.

Let's Build a Star Schema

Here's a working star schema you can explore directly:

Interactive SQL
Loading...

Notice how clean the query is: join fact → dimensions, filter, group. This is the star schema's core value proposition—analysts get results with minimal complexity.

Drilling Down: Filtering by Dimension Attributes

One of the most common patterns in OLAP is drilling down. Let's filter by customer segment and see which segments drive the most Technology revenue:

Interactive SQL
Loading...

When to Choose Which Schema

FactorStar SchemaSnowflake Schema
Query simplicity✅ Fewer JOINs❌ More JOINs
Storage efficiency❌ Some redundancy✅ Normalized, less redundancy
Update ease❌ Updates fan out✅ Single-row updates
BI tool friendliness✅ Most tools prefer it⚠️ Requires more configuration
Large dim tables⚠️ Redundancy grows✅ Scales better
Team SQL skillAnalyst-friendlyRequires stronger SQL skills

Choose star schema when:

  • Your analytics team writes ad hoc queries directly.
  • You're using a BI tool like Tableau, Looker, or Power BI.
  • Query performance is the top priority.
  • Dimension tables are relatively small and stable.

Choose snowflake schema when:

  • Your dimension tables are very large and redundancy is costly.
  • Data is loaded directly from normalized OLTP systems.
  • Attributes change frequently and you need clean update paths.
  • Storage cost is a significant concern.

A Practical Middle Ground

In real life, most production data warehouses are hybrids. You might fully denormalize small, stable dimensions (like dim_dates) while keeping a dim_categories table separate from dim_products because categories are managed by a product team that updates them weekly.

The right answer isn't "always star" or "always snowflake"—it's "which tables change together, and how often?"

Conclusion

The star schema and snowflake schema are both proven patterns for structuring analytical data. The star schema wins on simplicity and query speed; the snowflake schema wins on storage efficiency and maintainability.

For most teams starting out, start with a star schema. It's easier to explain, easier to query, and easier to connect to reporting tools. You can always normalize a hot dimension table later if you need to—going the other direction is harder.

Good schema design is mostly about understanding your queries before you design your tables. Sketch out the five most common questions your team will ask, then build the schema that makes those queries simple.

Share this article:

Related Articles

sqlanalytics

Mastering ROLLUP, CUBE, and GROUPING SETS in SQL

Stop running multiple queries for subtotals. Learn how to use advanced GROUP BY extensions to generate powerful reports in a single pass.

Read more
sqlanalytics

Building Histograms and Frequency Distributions in SQL

Learn how to build histograms, bucket data into ranges, and compute frequency distributions directly in SQL without external tools.

Read more
sqlanalytics

SQL for Anomaly Detection: Finding Outliers

Learn how to detect anomalies and statistical outliers in your data using SQL with Z-score, IQR, and moving average methods.

Read more
Previous

SQL for Anomaly Detection: Finding Outliers

Next

Full-Text Search in SQLite with FTS5

Comments

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed