Thread: BUG #15046: non-greedy ignored
The following bug has been logged on the website:
Bug reference: 15046
Logged by: Bob Gailer
Email address: bgailer@gmail.com
PostgreSQL version: 10.1
Operating system: windows 10
Description:
I start psql; enter:
postgres=# select regexp_replace('a(d)s(e)f', '\(.*?\)', '', 'g');
regexp_replace
----------------
asf
(1 row)
Works as expected. Then I add |q to the pattern, and the .*? becomes
greedy!
postgres=# select regexp_replace('a(d)s(e)f', '\(.*?\)|q', '', 'g');
regexp_replace
----------------
af
(1 row)
On Friday, February 2, 2018, PG Bug reporting form <noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
> Bug reference: 15046
> Logged by: Bob Gailer
> Email address: bgailer@gmail.com
> PostgreSQL version: 10.1
> Operating system: windows 10
> Description:
> I start psql; enter:
> postgres=# select regexp_replace('a(d)s(e)f', '\(.*?\)', '', 'g');
> regexp_replace
> ----------------
> asf
> (1 row)
> Works as expected. Then I add |q to the pattern, and the .*? becomes
> greedy!
> postgres=# select regexp_replace('a(d)s(e)f', '\(.*?\)|q', '', 'g');
> regexp_replace
> ----------------
> af
> (1 row)
This seems to be explained by the final greediness rule:
https://www.postgresql.org/docs/10/static/functions-matching.html#POSIX-MATCHING-RULES
An RE consisting of two or more branches connected by the | operator is
always greedy.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Friday, February 2, 2018, PG Bug reporting form <noreply@postgresql.org>
> wrote:
>> Works as expected. Then I add |q to the pattern, and the .*? becomes
>> greedy!
> This seems to be explained by the final greediness rule:
> https://www.postgresql.org/docs/10/static/functions-matching.html#POSIX-MATCHING-RULES
> An RE consisting of two or more branches connected by the | operator is
> always greedy.
Yeah. That subsection also contains some useful advice about how to
control greediness decisions --- in this case, wrapping the whole
thing with (...){1,1}? might do what you want.
The short answer, perhaps, is that non-greedy patterns are not
standardized by POSIX and you shouldn't expect that all regex
engines do them the same way. Ours is definitely different
from Perl's, for example.
regards, tom lane
Thanks! Rtfp, eh?
On Feb 2, 2018 8:48 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > On Friday, February 2, 2018, PG Bug reporting form <noreply@postgresql.org>
> > wrote:
> >> Works as expected. Then I add |q to the pattern, and the .*? becomes
> >> greedy!
> > This seems to be explained by the final greediness rule:
> > https://www.postgresql.org/docs/10/static/functions-matching.html#POSIX-MATCHING-RULES
> > An RE consisting of two or more branches connected by the | operator is
> > always greedy.
> Yeah. That subsection also contains some useful advice about how to
> control greediness decisions --- in this case, wrapping the whole
> thing with (...){1,1}? might do what you want.
> The short answer, perhaps, is that non-greedy patterns are not
> standardized by POSIX and you shouldn't expect that all regex
> engines do them the same way. Ours is definitely different
> from Perl's, for example.
> regards, tom lane
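For readers who want to try the workaround suggested in the thread, here is a minimal sketch. It assumes that wrapping both branches in a group quantified with {1,1}? makes the RE as a whole non-greedy, as the matching-rules section of the documentation describes. The non-capturing (?: ) group is a choice made for this sketch; the thread only writes (...). The thread does not show the result of the second query, so none is stated here.

```sql
-- Two-branch pattern from the report: an RE with top-level | branches is
-- greedy as a whole, so .*? behaves greedily (the thread reports "af").
SELECT regexp_replace('a(d)s(e)f', '\(.*?\)|q', '', 'g');

-- Sketch of the suggested workaround: put all branches inside one group and
-- give that group a non-greedy {1,1}? quantifier, so the RE as a whole is
-- intended to prefer the shortest match. (?: ) is assumed here only to
-- avoid creating a capture group.
SELECT regexp_replace('a(d)s(e)f', '(?:\(.*?\)|q){1,1}?', '', 'g');
```

Both queries can be pasted into psql as-is; comparing the two results shows whether the wrapper changes the greediness on a given server version.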