
Test regex in Python: re module, raw strings, and the JS-flavor mismatch

Python's `re` module is a complete regex engine with one persistent gotcha: forgetting the `r` prefix turns half your patterns into Python escape sequences. Once that habit is internalised, the API is straightforward.

Always use raw strings

Python interprets backslashes inside ordinary string literals: "\n" is a newline, "\t" is a tab. Regex syntax is full of backslashes, so without the r prefix you are fighting the language.

```py
import re

# Risky: "\d" only survives because Python does not recognise it as an escape
# (newer versions warn about it); "\b" would silently become a backspace
pattern = "\d{3}"

# Right: r prefix tells Python to leave backslashes alone
pattern = r"\d{3}"

re.findall(r"\d{3}", "Order 123 ships in 7 days")  # ['123']
```

Make raw strings the default for every regex you write. It is one extra character and it eliminates an entire category of bugs.
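A minimal sketch of the failure mode, using \b because Python does recognise it as an escape (backspace). The variable names and the sample text are illustrative only.

```python
import re

plain = "\bcat\b"   # Python converts each \b into a backspace character (\x08)
raw = r"\bcat\b"    # the regex engine receives \b and reads it as a word boundary

text = "the cat sat"

print(len(plain), len(raw))           # 5 7
print(re.search(plain, text))         # None
print(re.search(raw, text).group())   # cat
```

The non-raw pattern is two characters shorter because both backslash pairs were consumed before `re` ever saw them, and it fails silently rather than raising an error.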

The four functions you actually use

re.search looks for the first match anywhere in the string and returns a Match object or None. Use it for "is it in there?".

re.match only matches at the start of the string. Almost never what you want, despite the name.

re.findall returns every non-overlapping match as a list of strings. If the pattern has one capturing group you get that group's text instead of the whole match, and with several groups you get a list of tuples. Quick and useful for extraction.

re.finditer returns an iterator of Match objects with positions and groups. Use it when you care about where each match landed.

A complete extraction example:

```py
import re

text = "Contact hello@aldeacode.com or support@aldeacode.com"
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
# ['hello@aldeacode.com', 'support@aldeacode.com']
```
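The four functions are easiest to tell apart when they run against the same input. The sketch below is illustrative; the sample string and variable names are made up for the comparison.

```python
import re

text = "id 42, then id 7"

first = re.search(r"\d+", text)            # first match anywhere in the string
at_start = re.match(r"\d+", text)          # only tries position 0, so nothing here
numbers = re.findall(r"\d+", text)         # no groups: list of strings
pairs = re.findall(r"(\w+) (\d+)", text)   # two groups: list of tuples
spans = [(m.group(), m.span()) for m in re.finditer(r"\d+", text)]

print(first.group(), first.span())  # 42 (3, 5)
print(at_start)                     # None
print(numbers)                      # ['42', '7']
print(pairs)                        # [('id', '42'), ('id', '7')]
print(spans)                        # [('42', (3, 5)), ('7', (15, 16))]
```

Note that re.match returns None even though the string contains digits, which is the behavior that tends to catch people out.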

Python regex is not JavaScript regex

Most regex testers online (including AldeaCode's) speak the JavaScript flavor. Python's flavor is similar but not identical, and the differences bite when you copy a pattern between them.

Python supports named groups with (?P<name>...). JavaScript adopted (?<name>...) in ES2018. Both work in their own engine, neither works in the other.

Python supports inline flags like (?i) at the start of a pattern. JavaScript has no leading (?i) form; its flags go after the closing slash or into the RegExp constructor.

Python lookbehind requires fixed width on the default engine. The regex module on PyPI lifts that restriction; the stdlib re module does not.

Test in the engine you will deploy in. Cross-engine bugs do not surface until production traffic hits the path.
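Here is the Python side of those three differences as a short sketch. The sample strings are invented, and the exact wording of the error message may vary between Python versions, so it is printed rather than asserted.

```python
import re

# Named group, Python spelling: (?P<name>...)
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2024-05")
print(m.group("year"), m.group("month"))   # 2024 05

# Inline flag at the very start of the pattern
hits = re.findall(r"(?i)python", "Python, PYTHON, python")
print(hits)                                # ['Python', 'PYTHON', 'python']

# A lookbehind containing \d+ has no fixed width, so stdlib re refuses it
try:
    re.compile(r"(?<=v\d+\.)\d+")
    lookbehind_ok = True
except re.error as exc:
    lookbehind_ok = False
    print("re.error:", exc)
```

Pasting any of these three patterns into a JavaScript-flavor tester unchanged is where the mismatch shows up, so expect to translate them by hand.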

When the stdlib re is not enough

For 95% of regex tasks re is fine. For the remaining 5% there is the regex package on PyPI: pip install regex. It is API-compatible with re (same function names, same arguments) but adds:

Variable-width lookbehind. Real Unicode property classes like \p{Greek}. POSIX-style character classes. Recursive patterns. It also offers atomic groups and possessive quantifiers, which the stdlib re only gained in Python 3.11.

If you find yourself fighting the stdlib for any of these, switch the import line to import regex as re and most of your code keeps working. The lift is small, the capability gain is huge.
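A hedged sketch of the variable-width lookbehind case. It assumes the third-party regex package may or may not be installed, so the import is wrapped and the regex half is skipped when it is missing. The pattern, sample text, and helper name stdlib_rejects are illustrative.

```python
import re

try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None  # the sketch still runs, it just skips the regex half

# Lookbehind whose alternatives differ in width ("USD" is 3 chars, "EUR " is 4)
PATTERN = r"(?<=USD|EUR )\d+"
TEXT = "USD100 EUR 250 GBP75"

def stdlib_rejects(pattern):
    """Return True if the stdlib re module refuses to compile the pattern."""
    try:
        re.compile(pattern)
    except re.error:
        return True
    return False

print(stdlib_rejects(PATTERN))   # True on current CPython releases

if regex is not None:
    print(regex.findall(PATTERN, TEXT))   # ['100', '250']
```

Because the function names match, the only line that changes between the two engines is the module you call findall on.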

Working example

```py
import re

text = "Contact hello@aldeacode.com or support@aldeacode.com"

# Find every match
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", text)
print(emails)
# ['hello@aldeacode.com', 'support@aldeacode.com']

# Iterate with positions
for m in re.finditer(r"[\w.+-]+@[\w-]+\.[\w.-]+", text):
    print(m.group(), m.span())

# Compile if you reuse the same pattern
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
EMAIL.findall(text)
```

Just need the result?

When you just want to iterate on a pattern visually with live highlighting before pasting it into your Python file, the browser-based regex tester gives you that loop with zero setup. Keep the JS flavor differences in mind and restore any Python-only syntax when you copy the pattern back.


Frequently asked questions

Should I always re.compile patterns?

Only if you reuse the same pattern many times in a hot loop. Python caches recently used patterns internally, so a one-shot re.findall is no slower than the compiled equivalent.

How do I match across newlines?

Pass re.DOTALL as a flag to make . match newlines too, or use [\s\S] inside the pattern. By default . matches every character except a newline, which surprises people who expect it to match anything.
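A small sketch of both options side by side. The sample string is invented for the illustration.

```python
import re

text = "<b>first\nsecond</b>"

default = re.search(r"<b>(.*)</b>", text)             # . stops at the newline
dotall = re.search(r"<b>(.*)</b>", text, re.DOTALL)   # . now crosses it
charclass = re.search(r"<b>([\s\S]*)</b>", text)      # same effect without a flag

print(default)                   # None
print(repr(dotall.group(1)))     # 'first\nsecond'
print(repr(charclass.group(1)))  # 'first\nsecond'
```

The flag changes every . in the pattern, while [\s\S] lets you choose per position, which can matter when only one part of the pattern should span lines.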

Why is my regex slow on long input?

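Catastrophic backtracking. Patterns with nested quantifiers like (a+)+ explode on certain inputs. Use atomic groups or possessive quantifiers (in the stdlib re since Python 3.11, or via the regex package on older versions), or rewrite the pattern so that nested quantifiers cannot match the same text in more than one way. Switching to non-greedy matches alone does not remove the backtracking.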
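One way to sketch the rewrite, kept to short inputs on purpose so that the nested pattern never gets a chance to blow up. The pattern names are illustrative, and the possessive variant is guarded because it only compiles on Python 3.11 or later.

```python
import re
import sys

NESTED = re.compile(r"^(a+)+$")  # a run of a's can be split between the two + in many ways
FLAT = re.compile(r"^a+$")       # accepts the same strings, with only one way to match

samples = ["a", "aaaa", "aaab", ""]
agree = all(bool(NESTED.match(s)) == bool(FLAT.match(s)) for s in samples)
print(agree)  # True

if sys.version_info >= (3, 11):
    # a++ is possessive: once it has consumed the a's it never gives them back
    POSSESSIVE = re.compile(r"^(?:a++)+$")
    print(bool(POSSESSIVE.match("aaaa")), bool(POSSESSIVE.match("aaab")))  # True False
```

Flattening the pattern is usually the cheapest fix when it is possible. The possessive form is for cases where the nesting has to stay.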