SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed
SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog
Back to Blog
2025-12-19
4 min read

Descriptive Statistics in SQL: Beyond Average and Count

sqldata-analysisstatisticsadvanced-sql

Most SQL beginners know how to calculate a total with SUM or an average with AVG. But if you're working in data analysis, finance, or data science, basic averages are rarely enough. Averages can be heavily distorted by outliers, and they don't tell you anything about the distribution of your data.

To truly understand your data, you need Descriptive Statistics. In this guide, we'll go beyond the basics and learn how to calculate median, mode, percentiles, and variance using standard SQL.

1. The Median: The True Middle

The average (mean) is sensitive to extreme values. The Median is the middle value when data is sorted. It's much more robust for things like housing prices or employee salaries.

While some databases have a built-in MEDIAN() function, others require a bit more work.

-- Using percentile_cont (PostgreSQL, SQL Server, etc.)
SELECT 
  PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary
FROM employees;

In SQLite (which we use for our playground), we can calculate it by sorting and picking the middle row:

Interactive SQL
Loading...

Insight: Notice how the average salary is $239,000, but the median is $50,000. The median gives a much better sense of what a "typical" employee earns.

2. The Mode: The Most Frequent Value

The Mode is the value that appears most often in your dataset. This is useful for answering questions like "What is our best-selling product category?" or "Where do most of our users live?"

SELECT category, COUNT(*) as frequency
FROM sales
GROUP BY category
ORDER BY frequency DESC
LIMIT 1;

3. Percentiles: Understanding Distribution

Percentiles (like the 90th or 95th percentile) are essential for performance monitoring (P95 latency) and grading. They tell you the value below which a given percentage of data falls.

Interactive SQL
Loading...

4. Range and Spread

The simplest measure of spread is the Range—the difference between the maximum and minimum values. While simple, it helps you identify the boundaries of your dataset.

SELECT 
  MIN(price) as min_price,
  MAX(price) as max_price,
  MAX(price) - MIN(price) as price_range
FROM products;

5. Standard Deviation and Variance (The Advanced Stuff)

Standard Deviation measures how spread out your numbers are from the average. A low standard deviation means the data is clustered closely around the mean, while a high one means it's widely dispersed.

Most enterprise databases (Postgres, Oracle, SQL Server) provide:

  • STDDEV() or STDEV()
  • VARIANCE() or VAR()

Summary Table

StatisticWhat it tells youSQL Keyword/Pattern
MeanThe arithmetic averageAVG()
MedianThe "middle" valuePERCENTILE_CONT(0.5)
ModeThe most common valueGROUP BY + ORDER BY + LIMIT 1
RangeThe spread (Max - Min)MAX() - MIN()
Std DevHow dispersed data isSTDDEV()

Conclusion

Mastering statistics in SQL allows you to perform complex data analysis directly where the data lives. Instead of exporting large CSVs to Excel or Python, you can generate powerful insights with a few lines of SQL.

Next time you're asked for a "summary report," don't just provide the average. Include the median and the range to tell the full story of your data.

Try it out!

Use the playground snippet above to add more "outlier" salaries and see how the average moves while the median stays stable!

Share this article:

Comments

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed