Thread: regexp_replace grief

regexp_replace grief

From
Armin Resch
Date:
Not sure this is the right list to vent about this but here you go:

I) select regexp_replace('BEFORE.AFTER','(.*)\..*','\1','g') "Substring"
II) select regexp_replace('BEFORE.AFTER','(.*)\\..*','\\1','g') "Substring"

Executing (II) against pg 8.4.4 or 9.0.4 yields 'BEFORE', but in order for 9.1.7 to yield the same one has to execute (I) .. bummer

-ar

Re: regexp_replace grief

From
Craig James
Date:
On Wed, Apr 10, 2013 at 4:59 PM, Armin Resch <reschab@gmail.com> wrote:
Not sure this is the right list to vent about this but here you go:

I) select regexp_replace('BEFORE.AFTER','(.*)\..*','\1','g') "Substring"
II) select regexp_replace('BEFORE.AFTER','(.*)\\..*','\\1','g') "Substring"

Executing (II) against pg 8.4.4 or 9.0.4 yields 'BEFORE', but in order for 9.1.7 to yield the same one has to execute (I) .. bummer

This has nothing to do with regexp's.  It's a change in how '\' is interpreted in any quoted string.  The change came with Postgres 9.x and is documented in the release notes.  It brings Postgres into compliance with the SQL standard.

In Perl, I do something like this:

my $pg_bs_char;
if ($dbh->get_info($GetInfoType{SQL_DBMS_VER}) gt "09.00.00") {
    $pg_bs_char = "\\";         # a single '\' for PG 9.1 and higher
} else {
    $pg_bs_char = "\\\\";       # a double '\\' for PG up to 9.0
}

You can also revert to the old 8.x interpretation; see

  http://www.postgresql.org/docs/9.1/static/runtime-config-compatible.html

Craig
 

-ar

Re: regexp_replace grief

From
Armin Resch
Date:
Thx for clarification, Craig. Your Perl snippet comes in handy, too.
-ar

On Apr 10, 2013, at 8:08 PM, Craig James <cjames@emolecules.com> wrote:

On Wed, Apr 10, 2013 at 4:59 PM, Armin Resch <reschab@gmail.com> wrote:
Not sure this is the right list to vent about this but here you go:

I) select regexp_replace('BEFORE.AFTER','(.*)\..*','\1','g') "Substring"
II) select regexp_replace('BEFORE.AFTER','(.*)\\..*','\\1','g') "Substring"

Executing (II) against pg 8.4.4 or 9.0.4 yields 'BEFORE', but in order for 9.1.7 to yield the same one has to execute (I) .. bummer

This has nothing to do with regexp's.  It's a change in how '\' is interpreted in any quoted string.  The change came with Postgres 9.x and is documented in the release notes.  It brings Postgres into compliance with the SQL standard.

In Perl, I do something like this:

my $pg_bs_char;
if ($dbh->get_info($GetInfoType{SQL_DBMS_VER}) gt "09.00.00") {
    $pg_bs_char = "\\";         # a single '\' for PG 9.1 and higher
} else {
    $pg_bs_char = "\\\\";       # a double '\\' for PG up to 9.0
}

You can also revert to the old 8.x interpretation; see

  http://www.postgresql.org/docs/9.1/static/runtime-config-compatible.html

Craig
 

-ar

Re: regexp_replace grief

From
Albe Laurenz
Date:
Armin Resch wrote:
> Not sure this is the right list to vent about this but here you go:
>
> I) select regexp_replace('BEFORE.AFTER','(.*)\..*','\1','g') "Substring"
> II) select regexp_replace('BEFORE.AFTER','(.*)\\..*','\\1','g') "Substring"
>
> Executing (II) against pg 8.4.4 or 9.0.4 yields 'BEFORE', but in order for 9.1.7 to yield the same one
> has to execute (I) .. bummer

To be immune against different settings of standard_conforming_strings,
use the extended string literal syntax:

select regexp_replace('BEFORE.AFTER',E'(.*)\\..*',E'\\1','g') "Substring";

That will work in all versions.

Yours,
Laurenz Albe


Re: regexp_replace grief

From
Armin Resch
Date:
Thx, Albe. I tested both proposals, and I gravitate now more towards the E'' nomenclature since it avoids something like this:

$pg_bs_char = ( $dbh->{pg_server_version} >= 90100 ) ? "\\" : "\\\\";

Hoping for a long half life of the E'' nomenclature ...

-ar 


On Fri, Apr 12, 2013 at 5:16 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
Armin Resch wrote:
> Not sure this is the right list to vent about this but here you go:
>
> I) select regexp_replace('BEFORE.AFTER','(.*)\..*','\1','g') "Substring"
> II) select regexp_replace('BEFORE.AFTER','(.*)\\..*','\\1','g') "Substring"
>
> Executing (II) against pg 8.4.4 or 9.0.4 yields 'BEFORE', but in order for 9.1.7 to yield the same one
> has to execute (I) .. bummer

To be immune against different settings of standard_conforming_strings,
use the extended string literal syntax:

select regexp_replace('BEFORE.AFTER',E'(.*)\\..*',E'\\1','g') "Substring";

That will work in all versions.

Yours,
Laurenz Albe

Re: regexp_replace grief

From
Alvaro Herrera
Date:
Armin Resch escribió:
> Thx, Albe. I tested both proposals, and I gravitate now more towards the
> E'' nomenclature since it avoids something like this:
>
> $pg_bs_char = ( $dbh->{pg_server_version} >= 90100 ) ? "\\" : "\\\\";
>
> Hoping for a long half life of the E'' nomenclature ...

E'' is very unlikely to ever go away; its main problem is that it's not
standards-conformant.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services