Skip to content
AldeaCode Logo
Regex Tester / PostgreSQL Developer 100% local

Regex in PostgreSQL: ~, regexp_match, and POSIX vs PCRE differences

PostgreSQL has the most complete regex toolbox of any major database, but it speaks POSIX, not PCRE. The shorthand classes you use in JavaScript and Python do not exist here, and that is the single source of every "my regex works in my IDE but not in the query" bug.

The four operators: ~ ~* !~ !~*

For boolean matching, PostgreSQL gives you four operators:

text ~ pattern: case-sensitive match. text ~* pattern: case-insensitive match. text !~ pattern: negated case-sensitive match. text !~* pattern: negated case-insensitive match.

SELECT 'hello' ~ '^h';      -- t
SELECT 'Hello' ~ '^h';      -- f
SELECT 'Hello' ~* '^h';     -- t
SELECT 'Hello' !~ '^h';     -- t

These return a plain boolean and are the right tool for WHERE and CHECK clauses. They do not extract anything. For extraction, you need the function-style siblings.

regexp_match, regexp_matches, regexp_replace, regexp_split_to_array

regexp_match(text, pattern) returns a text[] array of capture groups, or NULL if there is no match. It returns at most one match.

regexp_matches(text, pattern, 'g') returns a setof text[] with all matches. The function name is plural and it expands rows in the FROM clause; that surprises people coming from SELECT-only thinking.

regexp_replace(text, pattern, replacement) substitutes. regexp_replace(text, pattern, replacement, 'g') substitutes globally.

regexp_split_to_array(text, pattern) and regexp_split_to_table(...) split a string by a pattern and return an array or a row set respectively.

```sql -- Extract domain from email SELECT (regexp_match('alice@example.com', '@(.+)$'))[1]; -- → example.com

-- Strip multiple whitespace runs SELECT regexp_replace('hello world', '\\s+', ' ', 'g'); -- → hello world ```

POSIX flavor: \d does not exist

PostgreSQL implements POSIX regular expressions, not PCRE. The single biggest difference: shorthand character classes like \d, \w, \s are not POSIX. They work in JavaScript, Python, Perl, Ruby. They do not work in Postgres.

The POSIX equivalents:

[[:digit:]] for \d. [[:alpha:]] for letters. [[:alnum:]] for word characters that are not underscore. [[:space:]] for whitespace.

Or use plain ranges: [0-9], [a-zA-Z0-9_], [ \t\n].

PostgreSQL does also accept some ARE (Advanced Regular Expression) extensions, including \d and \s since 8.0, but the safe and portable choice is the bracket form. If you ever export the query to another database or to a CHECK constraint that another tool reads, the bracket form survives intact.

Other POSIX gotchas

Lookarounds ((?=...), (?!...)) are not in POSIX BRE/ERE but PostgreSQL supports them via the ARE extension. Test before relying on them in portable code.

The (?i) inline modifier works in Postgres but not in some other POSIX engines; prefer the operator form (~*) when the case sensitivity is fixed for a query.

Anchors are ^ and $. By default they match the start and end of the string, not of a line. Pass the 'n' flag to make them line-aware: regexp_match(text, '^foo', 'n').

Backreferences (\1, \2) work in both pattern and replacement strings. PCRE-style named groups ((?P...)) are not supported; use positional groups.

Working example

sql
-- Validate UK postcode-ish format in a CHECK constraint
ALTER TABLE addresses
  ADD CONSTRAINT postcode_format
  CHECK (postcode ~ '^[A-Z]{1,2}[0-9][A-Z0-9]? [0-9][A-Z]{2}$');

-- Filter rows whose email is well-formed
SELECT *
FROM users
WHERE email ~* '^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$';

-- Strip non-digits from a phone column for indexing
SELECT regexp_replace(phone, '[^[:digit:]]', '', 'g') AS phone_digits
FROM contacts;

-- Split a comma-separated tags column into rows
SELECT id, regexp_split_to_table(tags, ',\s*') AS tag
FROM posts
WHERE tags IS NOT NULL;

Just need the result?

When you are designing a CHECK constraint or debugging why a Postgres regex behaves differently than the same pattern in your IDE, the browser-based regex tester at aldeacode.com lets you iterate on the pattern in PCRE mode and surfaces exactly what the engine sees, so you can translate to POSIX without a round-trip to psql.

Open Regex Tester (JavaScript Flavor) →

Frequently asked questions

Why does my \d pattern return no matches in Postgres?

POSIX regex does not have \d as a shorthand. PostgreSQL accepts it through the ARE extension since 8.0, so it should work, but if your pattern is wrapped in a string with extra backslash escaping (a JDBC param, for example) the literal sent to Postgres might be a single backslash plus 'd'. Switch to [[:digit:]] or [0-9] and the ambiguity disappears.

Should I use regexp_match or regexp_matches?

Singular regexp_match for at most one result. It returns a text[] in a single row, with NULL on no match. Plural regexp_matches expands into multiple rows when you pass the 'g' flag, and you have to LATERAL it into the FROM clause. Use regexp_match in 95% of WHERE clauses; regexp_matches when you genuinely need every occurrence.

How do I do case-insensitive matching efficiently?

The ~* operator is the easy answer. For high-volume queries, build a functional index: CREATE INDEX ON users (LOWER(email)); and rewrite the query as WHERE LOWER(email) = $1. The functional index turns a regex scan into a B-tree lookup. ~* alone forces a sequential scan unless you also have a trigram or other regex-friendly index.