Re: Add CASEFOLD() function. - Mailing list pgsql-hackers

From Robert Treat
Subject Re: Add CASEFOLD() function.
Msg-id CABV9wwM36FAN38jURxw-U5u-FPcbtbcnZGDT6ip_fRRaRFZzZg@mail.gmail.com
In response to Re: Add CASEFOLD() function.  (Ian Lawrence Barwick <barwick@gmail.com>)
List pgsql-hackers
On Thu, Jun 19, 2025 at 11:37 AM Thom Brown <thom@linux.com> wrote:
> On Thu, 19 Jun 2025 at 15:51, Peter Eisentraut <peter@eisentraut.org> wrote:
> > On 19.06.25 06:03, Thom Brown wrote:
> > > Late to the party, but is there an argument for porting this to the
> > > citext type? Or supplementing the extension with an additional type
> > > ("cftext"? *shrug*). It currently uses lower(), so our current
> > > recommendation for dealing with all unicode characters is to use
> > > nondeterministic collations.
> >
> > What is the motivation for wanting a citext variant instead of using
> > nondeterministic collations?
>
> Ease of use, perhaps. It seems easier to use:
>
> column_name cftext
>
> rather than:
>
> CREATE COLLATION case_insensitive_collation (
>     PROVIDER = icu,
>     LOCALE = 'und-u-ks-level2',
>     DETERMINISTIC = FALSE
> );
>
> column_name text COLLATE case_insensitive_collation
>
> But I see the arguments against it. It creates an unnecessary
> dependency on an extension, and if someone wants to ignore both case
> and accents, they may resort to using 2 extensions (citext + unaccent)
> when none are needed. I guess I don't feel strongly about it either
> way.

Don't forget, if you have defined case-insensitive / normalized
collations, you also enable on-the-fly collation-based matching, a la
"SELECT 'Å' = 'A' COLLATE ignore_accent_case;", regardless of the
provided collations (which I think is much more common, certainly in
other databases).
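
As a rough sketch of that on-the-fly matching: this assumes a
PostgreSQL build with ICU support, and the 'und-u-ks-level1' locale
(primary strength only, so both accents and case are ignored) is my
assumption for what ignore_accent_case would be defined as. The
"people" table and its "name" column are purely hypothetical.

```sql
-- Assumption: ICU-enabled PostgreSQL; collation and table names are illustrative.
CREATE COLLATION ignore_accent_case (
    PROVIDER = icu,
    LOCALE = 'und-u-ks-level1',   -- primary strength: ignore accents and case
    DETERMINISTIC = FALSE
);

-- Ad hoc comparison of two literals under the custom collation.
SELECT 'Å' = 'A' COLLATE ignore_accent_case;  -- true

-- Hypothetical table whose column keeps the database default collation.
CREATE TABLE people (name text);
INSERT INTO people VALUES ('Ångström');

-- The COLLATE clause overrides the column's collation for this query only.
SELECT name FROM people
WHERE name = 'angstrom' COLLATE ignore_accent_case;
```

The column definition stays untouched here; only the comparison in
the query picks up the nondeterministic collation.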

Robert Treat
https://xzilla.net


