SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed
SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog
Back to Blog
2026-02-28
7 min read

Building Histograms and Frequency Distributions in SQL

sqlanalyticsdata-analysisstatisticsvisualization

You have a column of numbers—order amounts, page load times, user ages, test scores. You want to know: how is this data distributed? Are most values clustered in the middle? Is there a long tail? Are there suspicious gaps?

A histogram answers all of these questions. And you don't need Python, R, or a BI tool to build one. SQL can do it directly—and understanding how SQL builds histograms will make you a better data analyst.

A colorful hand-drawn bar chart showing a frequency distribution histogram with bucket labels on the x-axis and counts on the y-axis
A colorful hand-drawn bar chart showing a frequency distribution histogram with bucket labels on the x-axis and counts on the y-axis

What is a Frequency Distribution?

A frequency distribution counts how many values fall into each distinct category or range. For continuous numeric data, you first group values into buckets (also called bins), then count occurrences per bucket.

For example, instead of knowing that order amounts are "$12.50, $47.80, $13.20, $95.00…", you might want:

BucketCount
$0–$2542
$25–$5038
$50–$10027
$100+8

Let's build this progressively in SQL.

The Sample Dataset

Interactive SQL
Loading...

Method 1: Fixed-Width Buckets with CASE

The simplest approach: define your bucket boundaries manually using a CASE expression.

Interactive SQL
Loading...

The REPLACE(HEX(ZEROBLOB(n)), '00', '▪') trick generates an ASCII bar of length n—a quick way to visualize distribution directly in a query result.

Method 2: Dynamic Buckets with Math

Hard-coding bucket boundaries gets tedious at scale. A better approach: use arithmetic to compute the bucket number from the value itself.

The formula is:

bucket_num = FLOOR((value - min) / bucket_width)
Interactive SQL
Loading...

This produces 10 equal-width buckets automatically regardless of the data range. Change / 10.0 to / 20.0 to get 20 buckets, or parameterize it based on your precision needs.

Note: For databases like PostgreSQL, use the built-in WIDTH_BUCKET(value, min, max, num_buckets) function—it does exactly this in one call. In SQLite, the manual formula above is the standard approach.

Method 3: Frequency Distribution for Categorical Data

Not all frequency distributions involve numbers. Sometimes you want to count occurrences of each distinct text value—a categorical frequency distribution.

Interactive SQL
Loading...

The SUM(COUNT(*)) OVER() window function calculates the grand total across all categories, enabling the percentage column without a subquery.

Adding Cumulative Frequency

A cumulative frequency column shows the running percentage—useful for questions like "what order amount covers 80% of all transactions?"

Interactive SQL
Loading...

From this result you can read that ~70% of orders are under $100, and less than 10% are over $200. This is the kind of insight that shapes pricing, discount strategies, and fraud detection thresholds.

Comparing Distributions Across Segments

One of the most powerful uses of histograms is comparing two groups. Here we split the same data into "low value" vs "high value" orders and compare their bucket distributions side by side using conditional aggregation:

Interactive SQL
Loading...

This side-by-side pivoted histogram is a staple of A/B test reporting and cohort analysis.

Choosing Bucket Width

The hardest part of building a histogram isn't the SQL—it's choosing how wide to make your buckets.

Sturges' rule is a classic starting point: use k = ceil(log₂ n) + 1 buckets for n observations. For 100 rows: ceil(log₂ 100) + 1 = 7 + 1 = 8 buckets.

Scott's rule bases bucket width on the standard deviation: h = 3.5 × σ / n^(1/3). This adapts to spread in the data.

In practice:

  • Too few buckets → hiding variation (one giant bucket that says "most orders are $0–$200").
  • Too many buckets → noise (every bucket has 1–2 rows and the pattern is invisible).
  • A good rule of thumb: aim for 5–15 buckets for exploratory analysis.

Practical Tips

  • Round bucket boundaries to human-friendly numbers ($25 increments, not $21.7 increments) for readability.
  • Always include a $X+ overflow bucket so no rows are silently dropped.
  • Check for gaps: if a bucket has 0 rows, GROUP BY won't show it. Use a numbers table or recursive CTE to fill in zero-count buckets if continuity matters.
  • Log-scale buckets work better for highly skewed data (e.g., revenue where most customers spend < $50 but a few spend $10,000): use doublings ($0–$10, $10–$100, $100–$1000).

Conclusion

Building histograms in SQL requires nothing more than CASE expressions for bucketing, COUNT(*) for frequency, and window functions for percentages. These techniques scale from exploratory one-off analysis to production reporting queries that run every day.

The next time you're asked "how is this data distributed?"—skip the notebook. Open a SQL editor and build the histogram right where the data lives.

Share this article:

Related Articles

sqlstatistics

Calculating Percentiles and Median in SQL

AVG tells you the mean, but what about median and percentiles? Learn how to calculate these essential statistics in SQL using window functions and clever tricks.

Read more
sqlanalytics

SQL for Anomaly Detection: Finding Outliers

Learn how to detect anomalies and statistical outliers in your data using SQL with Z-score, IQR, and moving average methods.

Read more
sqldata-analysis

SQL for Data Analysis: The Ultimate Guide

Move beyond basic SELECTs. Master the core SQL techniques for real-world data analysis: Data Cleaning, Time-Series Analysis, Window Functions, and Cohort Analysis.

Read more
Previous

Full-Text Search in SQLite with FTS5

Comments

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed