Mastering SQL GROUP BY: From Basics to Advanced Aggregations

The GROUP BY clause is one of SQL's most powerful features for data analysis. It transforms raw rows into meaningful summaries by grouping related data and calculating aggregates like totals, averages, and counts.

If you've ever needed to answer questions like "How many orders per customer?" or "What's the average salary by department?", you've needed GROUP BY.

How GROUP BY Works

Think of GROUP BY as organizing a messy pile of papers into labeled folders. All papers (rows) with the same label go into the same folder (group), then you can count how many papers are in each folder or calculate statistics about them.

The process:

Scan all rows in the table
Group rows that have the same value in the GROUP BY column(s)
Aggregate each group using functions like COUNT, SUM, AVG, MAX, MIN
Return one row per group

Basic GROUP BY with COUNT

Let's start with the most common use case: counting items in each group.

Visualizing GROUP BY with COUNT:

Visualizing GROUP BY department with COUNT(*)

Source Rows

Sales

Engineering

Sales

Groups

Result

Phase 1: Grouping rows by department...

Try it yourself:

Interactive SQL

Key Insight: Notice how 6 rows become 2 rows—one per department. COUNT(*) counts all rows in each group.

GROUP BY with SUM

Now let's calculate totals for each group.

Visualizing GROUP BY with SUM:

Visualizing GROUP BY department with SUM(salary)

Source Rows

Sales

salary: 60000

Sales

salary: 55000

Engineering

salary: 80000

Engineering

salary: 85000

Engineering

salary: 75000

Sales

salary: 62000

Groups

Result

Phase 1: Grouping rows by department...

Interactive SQL

Pro Tip: You can use multiple aggregate functions in the same query! Combine COUNT, SUM, and AVG to get a comprehensive summary.

The HAVING Clause: Filtering Groups

WHERE filters rows before grouping. HAVING filters groups after aggregation.

Example: Find departments with more than 2 employees

Interactive SQL

Result: Both departments appear because they each have 3 employees. If we changed the condition to > 3, neither would appear.

Grouping by Multiple Columns

You can group by more than one column to create more granular groups.

Interactive SQL

This creates a cross-tabulation: each unique combination of department and salary band gets its own group.

Common Aggregate Functions

Function	Purpose	Example
`COUNT(*)`	Count all rows in group	Total orders per customer
`COUNT(column)`	Count non-NULL values	Customers with email addresses
`SUM(column)`	Add up values	Total revenue per product
`AVG(column)`	Calculate average	Average rating per movie
`MAX(column)`	Find maximum	Highest salary per department
`MIN(column)`	Find minimum	Earliest order date per customer

Common Mistakes to Avoid

❌ Selecting non-grouped columns

-- WRONG: name is not in GROUP BY
SELECT department, name, COUNT(*)
FROM employees
GROUP BY department;

Rule: Every column in SELECT must either be:

In the GROUP BY clause, OR
Inside an aggregate function

✅ Correct version

SELECT department, COUNT(*) as count
FROM employees
GROUP BY department;

Practice Challenges

Try modifying the queries above to:

Find the department with the highest average salary
Count how many employees earn more than $60,000 in each department
List departments where the total salary budget exceeds $150,000

Test Your Skills with Real Interview Questions

Ready to apply your GROUP BY knowledge? Try these real interview questions:

Payment Success Rate by Country (Stripe) - Calculate success rates using conditional aggregation with CASE WHEN and GROUP BY
Top Products by Revenue (Shopify) - Use SUM() with calculated columns and GROUP BY for product analysis
Department Budget Variance (Oracle) - Combine JOINs with GROUP BY and HAVING for financial analysis

These questions test your ability to use GROUP BY with aggregations, HAVING clauses, and conditional logic in production scenarios.

Key Takeaways

GROUP BY collapses rows into groups based on column values
Aggregate functions (COUNT, SUM, AVG, MAX, MIN) calculate one value per group
HAVING filters groups after aggregation (use WHERE to filter rows before)
Every SELECT column must be grouped or aggregated
You can group by multiple columns for finer granularity

Master GROUP BY and you'll unlock powerful data analysis capabilities in SQL!

If you've ever needed to answer questions like "How many orders per customer?" or "What's the average salary by department?", you've needed GROUP BY.

How GROUP BY Works

The process:

Scan all rows in the table
Group rows that have the same value in the GROUP BY column(s)
Aggregate each group using functions like COUNT, SUM, AVG, MAX, MIN
Return one row per group

Basic GROUP BY with COUNT

Let's start with the most common use case: counting items in each group.

Visualizing GROUP BY with COUNT:

Visualizing GROUP BY department with COUNT(*)

Source Rows

Sales

Engineering

Sales

Groups

Result

Phase 1: Grouping rows by department...

Try it yourself:

Interactive SQL

Key Insight: Notice how 6 rows become 2 rows—one per department. COUNT(*) counts all rows in each group.

GROUP BY with SUM

Now let's calculate totals for each group.

Visualizing GROUP BY with SUM:

Visualizing GROUP BY department with SUM(salary)

Source Rows

Sales

salary: 60000

Sales

salary: 55000

Engineering

salary: 80000

Engineering

salary: 85000

Engineering

salary: 75000

Sales

salary: 62000

Groups

Result

Phase 1: Grouping rows by department...

Interactive SQL

Pro Tip: You can use multiple aggregate functions in the same query! Combine COUNT, SUM, and AVG to get a comprehensive summary.

The HAVING Clause: Filtering Groups

WHERE filters rows before grouping. HAVING filters groups after aggregation.

Example: Find departments with more than 2 employees

Interactive SQL

Result: Both departments appear because they each have 3 employees. If we changed the condition to > 3, neither would appear.

Grouping by Multiple Columns

You can group by more than one column to create more granular groups.

Interactive SQL

This creates a cross-tabulation: each unique combination of department and salary band gets its own group.

Common Aggregate Functions

Function	Purpose	Example
`COUNT(*)`	Count all rows in group	Total orders per customer
`COUNT(column)`	Count non-NULL values	Customers with email addresses
`SUM(column)`	Add up values	Total revenue per product
`AVG(column)`	Calculate average	Average rating per movie
`MAX(column)`	Find maximum	Highest salary per department
`MIN(column)`	Find minimum	Earliest order date per customer

Common Mistakes to Avoid

❌ Selecting non-grouped columns

-- WRONG: name is not in GROUP BY
SELECT department, name, COUNT(*)
FROM employees
GROUP BY department;

Rule: Every column in SELECT must either be:

In the GROUP BY clause, OR
Inside an aggregate function

✅ Correct version

SELECT department, COUNT(*) as count
FROM employees
GROUP BY department;

Practice Challenges

Try modifying the queries above to:

Find the department with the highest average salary
Count how many employees earn more than $60,000 in each department
List departments where the total salary budget exceeds $150,000

Test Your Skills with Real Interview Questions

Ready to apply your GROUP BY knowledge? Try these real interview questions:

Payment Success Rate by Country (Stripe) - Calculate success rates using conditional aggregation with CASE WHEN and GROUP BY
Top Products by Revenue (Shopify) - Use SUM() with calculated columns and GROUP BY for product analysis
Department Budget Variance (Oracle) - Combine JOINs with GROUP BY and HAVING for financial analysis

These questions test your ability to use GROUP BY with aggregations, HAVING clauses, and conditional logic in production scenarios.

Key Takeaways

GROUP BY collapses rows into groups based on column values
Aggregate functions (COUNT, SUM, AVG, MAX, MIN) calculate one value per group
HAVING filters groups after aggregation (use WHERE to filter rows before)
Every SELECT column must be grouped or aggregated
You can group by multiple columns for finer granularity

Master GROUP BY and you'll unlock powerful data analysis capabilities in SQL!

Mastering SQL GROUP BY: From Basics to Advanced Aggregations

How GROUP BY Works

Basic GROUP BY with COUNT

Visualizing GROUP BY department with COUNT(*)

GROUP BY with SUM

Visualizing GROUP BY department with SUM(salary)

The HAVING Clause: Filtering Groups

Grouping by Multiple Columns

Common Aggregate Functions

Common Mistakes to Avoid

❌ Selecting non-grouped columns

✅ Correct version

Practice Challenges

Test Your Skills with Real Interview Questions

Key Takeaways

Comments

Mastering SQL GROUP BY: From Basics to Advanced Aggregations

How GROUP BY Works

Basic GROUP BY with COUNT

Visualizing GROUP BY department with COUNT(*)

GROUP BY with SUM

Visualizing GROUP BY department with SUM(salary)

The HAVING Clause: Filtering Groups

Grouping by Multiple Columns

Common Aggregate Functions

Common Mistakes to Avoid

❌ Selecting non-grouped columns

✅ Correct version

Practice Challenges

Test Your Skills with Real Interview Questions

Key Takeaways

Comments