Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation - Mailing list pgsql-bugs

From Laurenz Albe
Subject Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation
Date
Msg-id 0bbe14779f2c37e5a2fbd514861ebedaf8f53df6.camel@cybertec.at
In response to Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation
List pgsql-bugs
On Tue, 2025-12-02 at 15:53 -0500, Tom Lane wrote:
> > The attached patch v3 turns it into a while loop to avoid
> > the problem.
>
> Looking at the code overall, I wonder if the outer loop doesn't have
> the same issue.  The comments claim that we should be able to handle
> zero-length matches, but if the overall haystack is of length zero,
> we will fail to check for such a match.

If you can find zero-length matches at all, you could find a
zero-length match in a non-empty haystack.  Perhaps the function is
never called with an empty haystack...

> Also, since we have haystack <= haystack_end as a starting condition,
> I think both loops could omit the initial test.  I'd be inclined
> to code them like
>
>     test_ptr = start point;
>     for (;;)
>     {
>         ...
>         if (test_ptr >= haystack_end)
>             break;
>         test_ptr += pg_mblen(test_ptr);
>     }

True.  The attached v4 patch does it like that.
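
As a rough illustration of that loop shape (a sketch only, not the patch itself): the body runs once for every character boundary, including the one at haystack_end, so an empty haystack still gets a single iteration. Here sketch_mblen() is a hypothetical stand-in for pg_mblen() that assumes UTF-8, and count_boundaries() is an invented name that merely counts iterations where the real code would test for a match.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF-8
 * character starting at p, judged from the lead byte only.
 */
static int
sketch_mblen(const char *p)
{
    unsigned char c = (unsigned char) *p;

    if (c < 0x80)
        return 1;
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    if ((c & 0xF8) == 0xF0)
        return 4;
    return 1;               /* invalid lead byte: step over one byte */
}

/*
 * Visit every character boundary from haystack through haystack_end
 * inclusive, and return how many were visited.  Assumes
 * haystack <= haystack_end on entry, as in the quoted discussion.
 */
static size_t
count_boundaries(const char *haystack, const char *haystack_end)
{
    const char *test_ptr = haystack;
    size_t      visited = 0;

    for (;;)
    {
        visited++;          /* placeholder for the match test at test_ptr */

        if (test_ptr >= haystack_end)
            break;
        test_ptr += sketch_mblen(test_ptr);
    }
    return visited;
}
```

With this shape, an empty haystack yields one visit and "abc" yields four (three character starts plus the end position), which is the property the quoted suggestion relies on.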

> On the other hand ... is that comment really right about zero-length
> match being possible?  If it is, the API for this function is in
> need of redesign, because callers that try to find "the next match"
> would go into an infinite loop re-finding the same zero-length
> match over and over.

Right.  I'll see if I can trigger such a case.
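
To make the concern concrete, here is a toy sketch under stated assumptions; it is not the PostgreSQL code. It uses strstr() as the matcher, which reports a zero-length match at the current position when the needle is empty. A caller that resumed at "match start + match length" would never advance in that case, so this hypothetical count_matches() steps forward one byte after a zero-length match (a single-byte simplification) and stops once the match sits at the end of the string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count non-overlapping matches of needle in haystack.
 * Hypothetical helper for illustration; byte-oriented, so the
 * one-byte step is only correct for single-byte encodings.
 */
static size_t
count_matches(const char *haystack, const char *needle)
{
    const char *pos = haystack;
    size_t      nlen = strlen(needle);
    size_t      count = 0;

    for (;;)
    {
        const char *hit = strstr(pos, needle);

        if (hit == NULL)
            break;
        count++;

        if (nlen > 0)
            pos = hit + nlen;   /* normal case: resume after the match */
        else
        {
            /*
             * Zero-length match: resuming at hit + 0 would find the
             * same match forever, so force progress or stop at the end.
             */
            if (*hit == '\0')
                break;
            pos = hit + 1;
        }
    }
    return count;
}
```

Without the else branch, the empty-needle case would spin on the same position, which is the infinite loop described in the quoted text.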

Yours,
Laurenz Albe
