SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed
SQL Playground
TutorialsPlayground
SQL FormatterSQL MinifierSyntax ValidatorJSON to SQLCSV to SQLSQL to JSONER Diagram Generator
Daily ChallengeInterviewsCheat SheetBlog
Back to Blog
2026-01-04
3 min read

Market Basket Analysis in SQL: What Do Customers Buy Together?

sqlanalyticsdata-sciencemarket-basketassociations

"Customers who bought X also bought Y."

We see this everywhere, from Amazon recommendations to Netflix "Because you watched..." lists. This is known as Market Basket Analysis (or Association Rule Mining).

While serious data scientists use algorithms like Apriori or FP-Growth in Python, you can get 90% of the value using simple SQL to find frequent item pairs.

The Core Concept: Self-Joins

To find items bought together, we need to look at the order_items table. Specifically, we want to find two different items that appear in the same order_id.

This requires a Self-Join: joining the table to itself.

SELECT
    a.product_id as product_A,
    b.product_id as product_B,
    COUNT(*) as frequency
FROM order_items a
JOIN order_items b 
    ON a.order_id = b.order_id     -- Same Order
    AND a.product_id < b.product_id -- Different Products (avoid mirrors)
GROUP BY product_A, product_B
ORDER BY frequency DESC;

Why a.product_id < b.product_id? If we just used !=, we would get pairs twice: (Bread, Milk) and (Milk, Bread). Using < forces a specific order, ensuring we count the pair only once.

Calculating Confidence

Knowing that "Bread and Milk" were bought together 500 times is useful. But is that high? If there are 1,000,000 orders, 500 is nothing. If there are 600 orders, it's huge.

We usually calculate Confidence: When a customer buys Product A, how likely are they to buy Product B?

Formula:

Confidence(A -> B) = (Orders containing A and B) / (Orders containing A)

Interactive Playground

Let's find the most popular product pairings in our grocery store data.

Interactive SQL
Loading...

Taking It Further: The "Combo Deal"

Once you identify these pairs (e.g., "Beer and Diapers"), you can take action:

  1. Product Placement: Put them next to each other on the shelf (or web page).
  2. Bundling: Create a "Weekend Warrior Pack" with both items.
  3. Recommendations: "You added Diapers to your cart. Don't forget the Beer!"

Conclusion

You don't always need complex machine learning models to build a recommendation engine. A clever SQL self-join is often enough to uncover the strongest relationships in your data.

Share this article:

Comments

© 2026 SQL Playground. Built for developers.

PrivacyTermsAboutRSS Feed