Version 17.6 changed how similar works compared to version 17.5 - Mailing list pgsql-bugs
From | Stephan Springl |
---|---|
Subject | Version 17.6 changed how similar works compared to version 17.5 |
Date | |
Msg-id | 41a37137-f8bb-8fc5-2948-31b528f166dc@bfw-online.de |
Responses | Re: Version 17.6 changed how similar works compared to version 17.5 |
List | pgsql-bugs |
Hello,

version 17.6 changed how similar works compared to version 17.5.

With file f as

cat >f <<END
drop table t;
create table t (p varchar (1));
insert into t values ('_');
select * from t;
select * from t where p similar to '[\_]%';
END

psql -f f gives:

DROP TABLE
CREATE TABLE
INSERT 0 1
 p
---
 _
(1 row)

 p
---
(0 rows)

The expression with similar does not find the row. With version 17.5, the row was found, as expected.

Reverting commit e3ffc3e91d04579240fb54a96f9059b246488dce "Fix conversion of SIMILAR TO regexes for character classes" brings back the previous behavior.

That commit does not account for the first character in a character class being escaped. In this case it skips the closing ']' of the character class. "[\_]%" as similar expression gets translated to "^(?:[\_]%)$" as a regular expression. Version 17.5 generates "^(?:[\_].*)$" as regular expression.

I suggest the fix below. Unfortunately, I am not sure what an escape in a character class of a similar expression should mean, and whether the escape character should always be '\' (as the patch does it) or the escape value given to the similar expression.

Branches REL_18_STABLE and master are affected as well.

Thank you for your great work on postgresql.

Regards,
Stephan

diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index 37ca136acf1..114fb43fd91 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -905,9 +905,41 @@ similar_escape_internal(text *pat_text, text *esc_text)
 		}
 
 		/* fast path */
-		if (afterescape)
+		if (charclass_depth > 0)
 		{
-			if (pchar == '"' && charclass_depth < 1)	/* escape-double-quote? */
+			if (afterescape)
+			{
+				*r++ = '\\';
+				afterescape = false;
+			}
+			*r++ = pchar;
+
+			/*
+			 * Ignore a closing bracket at the start of a character class.
+			 * Such a bracket is taken literally rather than closing the
+			 * class. "charclass_start" is 1 right at the beginning of a
+			 * class and 2 after an initial caret.
+			 */
+			if (pchar == ']' && charclass_start > 2)
+				charclass_depth--;
+			else if (pchar == '[')
+				charclass_depth++;
+
+			/*
+			 * If there is a caret right after the opening bracket, it negates
+			 * the character class, but a following closing bracket should
+			 * still be treated as a normal character. That holds only for
+			 * the first caret, so only the values 1 and 2 mean that closing
+			 * brackets should be taken literally.
+			 */
+			if (pchar == '^')
+				charclass_start++;
+			else
+				charclass_start = 3;	/* definitely past the start */
+		}
+		else if (afterescape)
+		{
+			if (pchar == '"')	/* escape-double-quote? */
 			{
 				/* emit appropriate part separator, per notes above */
 				if (nquotes == 0)
@@ -956,35 +988,6 @@ similar_escape_internal(text *pat_text, text *esc_text)
 			/* SQL escape character; do not send to output */
 			afterescape = true;
 		}
-		else if (charclass_depth > 0)
-		{
-			if (pchar == '\\')
-				*r++ = '\\';
-			*r++ = pchar;
-
-			/*
-			 * Ignore a closing bracket at the start of a character class.
-			 * Such a bracket is taken literally rather than closing the
-			 * class. "charclass_start" is 1 right at the beginning of a
-			 * class and 2 after an initial caret.
-			 */
-			if (pchar == ']' && charclass_start > 2)
-				charclass_depth--;
-			else if (pchar == '[')
-				charclass_depth++;
-
-			/*
-			 * If there is a caret right after the opening bracket, it negates
-			 * the character class, but a following closing bracket should
-			 * still be treated as a normal character. That holds only for
-			 * the first caret, so only the values 1 and 2 mean that closing
-			 * brackets should be taken literally.
-			 */
-			if (pchar == '^')
-				charclass_start++;
-			else
-				charclass_start = 3;	/* definitely past the start */
-		}
 		else if (pchar == '[')
 		{
 			/* start of a character class */
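
The effect of the branch order can be traced with a small standalone model. The following is only a sketch under stated assumptions, not the PostgreSQL source: the names similar_sketch, track_class, class_branch_first and similar_sketch_selfcheck are hypothetical, the escape is a single byte, and multibyte handling, escape-double-quote separators and the escaping of other regex metacharacters are left out. It keeps just the afterescape / charclass_depth / charclass_start bookkeeping that appears in the diff, so that the two translations quoted in the report can be reproduced.

```c
#include <assert.h>
#include <string.h>

/* Bracket bookkeeping as it appears in the diff: a ']' only closes the
 * class once charclass_start has moved past the start (value 3). */
static void
track_class(char pchar, int *depth, int *start)
{
	if (pchar == ']' && *start > 2)
		(*depth)--;
	else if (pchar == '[')
		(*depth)++;

	if (pchar == '^')
		(*start)++;
	else
		*start = 3;				/* definitely past the start */
}

/*
 * Hypothetical, simplified model of the translation loop.
 * class_branch_first = 0: escape handling is tested before the
 *                         character-class branch (order removed by the patch)
 * class_branch_first = 1: character-class branch is tested first
 *                         (order introduced by the patch)
 * "out" must have room for 2 * strlen(pat) + 8 bytes.
 */
static void
similar_sketch(const char *pat, char esc, int class_branch_first, char *out)
{
	char	   *r = out;
	int			afterescape = 0;
	int			charclass_depth = 0;
	int			charclass_start = 0;

	*r++ = '^'; *r++ = '('; *r++ = '?'; *r++ = ':';

	for (const char *p = pat; *p; p++)
	{
		char		pchar = *p;
		int			in_class = (charclass_depth > 0);

		if (in_class && class_branch_first)
		{
			if (afterescape)
			{
				*r++ = '\\';
				afterescape = 0;
			}
			*r++ = pchar;
			track_class(pchar, &charclass_depth, &charclass_start);
		}
		else if (afterescape)
		{
			/* charclass_start is NOT advanced here, so an escaped first
			 * class member leaves the class looking "just opened" */
			*r++ = '\\';
			*r++ = pchar;
			afterescape = 0;
		}
		else if (pchar == esc)
			afterescape = 1;	/* escape itself is not sent to output */
		else if (in_class)
		{
			if (pchar == '\\')
				*r++ = '\\';
			*r++ = pchar;
			track_class(pchar, &charclass_depth, &charclass_start);
		}
		else if (pchar == '[')
		{
			*r++ = pchar;
			charclass_depth++;
			charclass_start = 1;
		}
		else if (pchar == '%')
		{
			*r++ = '.';
			*r++ = '*';
		}
		else if (pchar == '_')
			*r++ = '.';
		else
			*r++ = pchar;		/* other metacharacters omitted in this sketch */
	}

	*r++ = ')'; *r++ = '$'; *r = '\0';
}

/* Checks the sketch against the two regexes quoted in the report. */
static void
similar_sketch_selfcheck(void)
{
	char		buf[64];

	similar_sketch("[\\_]%", '\\', 0, buf);
	assert(strcmp(buf, "^(?:[\\_]%)$") == 0);	/* ']' taken literally */

	similar_sketch("[\\_]%", '\\', 1, buf);
	assert(strcmp(buf, "^(?:[\\_].*)$") == 0);	/* ']' closes the class */
}
```

In this model, the first call leaves charclass_start at 1 after the escaped '_', so the following ']' is treated as a literal member and '%' is never converted. With the class branch tested first, the backslash itself advances charclass_start, the ']' closes the class, and '%' becomes ".*". Note that in the second mode the escape character inside a class is passed through unchanged, which is exactly the open question raised above about what an escape inside a class should mean.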