Your sales dashboard looks fine on average—revenue is up 5% month-over-month. But buried in your data are a few transactions that are wildly off: a $0.01 order, a sudden spike in refunds, a server metric that jumped 10x for one minute.
These are anomalies, and averages hide them. SQL can surface them.

In this post, we'll explore three practical methods to detect outliers directly in SQL: the Z-Score method, the Interquartile Range (IQR) method, and using moving averages to catch time-series spikes.
What is an Anomaly?
An anomaly (or outlier) is a data point that deviates significantly from expected behavior. Anomaly detection matters for:
- Fraud detection: A transaction 100x larger than a user's typical order.
- System monitoring: A server latency spike at 3am.
- Data quality: Accidentally inserted negative prices or future timestamps.
- Business intelligence: An inexplicable dip in conversion rates.
The challenge is defining "significantly different"—and SQL gives us several statistical tools to do exactly that.
The Sample Dataset
Let's create a table of daily sales transactions. Two rows are clearly suspicious: row #7 is a $1,850 spike, and row #10 is a $4.99 near-zero value.
Now let's catch these programmatically—without eyeballing the data.
Method 1: Z-Score
The Z-Score measures how many standard deviations a value is from the mean:
Z = (x − μ) / σ
Where μ is the mean and σ is the standard deviation. A Z-Score above 2 or 3 (or below -2 / -3) is typically flagged as an anomaly.
Notice that only the $1,850 spike is flagged here. Because that outlier is so extreme, it dramatically inflates the standard deviation, which actually protects the $4.99 value from being caught. This is the core weakness of the Z-Score method.
Limitation: A single extreme outlier can inflate σ, making other moderate outliers appear normal.
Method 2: Interquartile Range (IQR)
The IQR method is more robust to extreme values. It defines the normal band as:
- Lower Fence = Q1 − 1.5 × IQR
- Upper Fence = Q3 + 1.5 × IQR
Where IQR = Q3 − Q1. Values outside these fences are outliers.
SQLite doesn't have a built-in PERCENTILE_CONT function, but NTILE(4) gives us a solid approximation:
The IQR method catches both the $1,850 spike and the $4.99 dip because it isn't distorted by either extreme. This makes it the go-to method for skewed financial data.
Method 3: Moving Average Deviation
For time-series data, comparing each value to a rolling window is often the most intuitive approach. A value is anomalous if it deviates significantly from its recent neighborhood:
The window ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING compares each day's sales to the previous 4 days. The January 7th spike stands out sharply against the $118–$142 range of the preceding days.
Which Method Should You Use?
| Method | Best For | Weakness |
|---|---|---|
| Z-Score | Normally distributed data | One extreme outlier distorts $\sigma$ |
| IQR | Skewed data, robust detection | Misses gradual drift over time |
| Moving Average | Time-series, sequential events | Needs sufficient historical data |
In practice, you can combine methods: flag only records that are anomalous by two or more methods out of three. This dramatically reduces false positives.
Practical Tips
- Segment before analyzing: Compute statistics per
product_categoryorregion—not across all data—so you're comparing apples to apples. - Tune thresholds by domain: A 3-sigma rule for financial data; a lower threshold may work better for operational metrics.
- Log, don't delete: When you find an anomaly, write it to a review table rather than silently discarding it.
- Schedule it: Run your anomaly queries daily or hourly as part of your data pipeline. Stale anomaly detection is no anomaly detection.
Conclusion
Anomaly detection in SQL doesn't require machine learning. With Z-Scores, IQR fences, and rolling window comparisons, you can surface outliers directly in your data warehouse using tools you already know.
Start with the IQR method for a robust, distribution-agnostic approach. Layer in moving averages when working with time-series data. Your future self—staring at a fraud alert at 2am—will thank you.