The LIKE operator is great for simple patterns (keeps starting with 'A', ends with 'z'). But what if you need to:
- Find email addresses in a mess of text?
- Validate phone numbers?
- Match strings with "3 digits, a dash, then 4 letters"?
For these power-user tasks, you need Regular Expressions (Regex).
What is Regex?
Regex is a sequence of characters that specifies a search pattern. It's available in almost all programming languages, and yes—in SQL too!
In SQL Boy (and SQLite), we use the REGEXP operator.
Note: Standard SQLite doesn't strictly implement
REGEXPby default, but we've added a custom function to the SQL Boy playground so you can use it right here!
Basic Syntax Cheat Sheet
| Symbol | Description | Example | Matches |
|---|---|---|---|
. | Any single character | h.t | hat, hot, hit |
^ | Start of string | ^Hello | Hello World |
$ | End of string | World$ | Hello World |
[abc] | Any one character in brackets | [bcr]at | bat, cat, rat |
[0-9] | Any digit | User[0-9] | User1, User9 |
* | Zero or more repetitions | go*d | gd, god, good |
+ | One or more repetitions | go+d | god, good |
Interactive Example: Validating Emails
Let's find all invalid email addresses in our users table. A valid email should roughly follow: text @ text . text.
Finding Patterns: Phone Numbers
Imagine you have a text field where users can dump any contact info. You want to extract rows that look like they contain a US phone number (3 digits - 3 digits - 4 digits).
REGEXP vs LIKE
| Feature | LIKE | REGEXP |
|---|---|---|
| Simplicity | High (just % and _) | Low (cryptic syntax) |
| Power | Basic prefix/suffix | Unlimited pattern matching |
| Performance | Fast (can use indexes) | Slower (scans full strings) |
| Portability | Universal | Varies (MySQL REGEXP, Postgres ~, Oracle REGEXP_LIKE) |
Best Practices
- Don't overcomplicate: Use
LIKEif you just need "starts with" or "contains". - Test your patterns: Regex is notorious for "false positives". Test with diverse data.
- Performance: Avoid running complex Regex on millions of rows in real-time. Do it in background jobs or ETL processes.
Conclusion
Regex gives you superpowers for text data. It unlocks the ability to clean messy inputs, validate formats, and extract hidden insights that standard SQL functions can't touch.
Start simple with ^ (starts with) and $ (ends with), and build up to complex validation patterns!