Re: PostgreSQL 18 GA press release draft - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: PostgreSQL 18 GA press release draft
Date	September 12, 2025 22:13:23
Msg-id	2c1f238473721d5a277ce047f40158536aa1d72d.camel@j-davis.com
In response to	Re: PostgreSQL 18 GA press release draft (Nico Williams <nico@cryptonector.com>)
Responses	Re: PostgreSQL 18 GA press release draft
List	pgsql-hackers

On Fri, 2025-09-12 at 13:21 -0500, Nico Williams wrote:
> On Fri, Sep 12, 2025 at 10:11:59AM -0700, Jeff Davis wrote:
> > The name PG_UNICODE_FAST is meant to convey that it provides full
> > unicode semantics for case mapping and pattern matching, while also
> > being fast because it uses memcmp for comparisons. While the name
> > PG_C_UTF8 is meant to convey that it's closer to what users of the
> > libc "C.UTF-8" locale might expect.
>
> How does one do form-insensitive comparisons?

If you mean case insensitive matching, you can do:

   CASEFOLD(a) = CASEFOLD(b)

in any locale or provider, but it works best with PG_UNICODE_FAST or
ICU, because those handle some nuances better, such as characters that
fold to more than one character. For instance:

   CASEFOLD('ß') = CASEFOLD('SS') AND
   CASEFOLD('ß') = CASEFOLD('ss') AND
   CASEFOLD('ß') = CASEFOLD('ẞ')

are all true in PG_UNICODE_FAST and "en-US-x-icu", but not in libc
collations.
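
For anyone who wants to check the underlying Unicode rule outside the
database, here is a minimal sketch in Python. It assumes that Python's
str.casefold(), which applies Unicode full case folding, is a fair
stand-in for the folding described above; it illustrates the Unicode
behavior only and says nothing about how PostgreSQL implements
CASEFOLD(). The helper name same_ignoring_case is made up for this
example.

```python
def same_ignoring_case(a: str, b: str) -> bool:
    """Compare two strings after Unicode full case folding.

    Hypothetical helper for illustration; it mirrors the shape of
    CASEFOLD(a) = CASEFOLD(b) but is not PostgreSQL code.
    """
    return a.casefold() == b.casefold()


# The same three pairs as in the message above.
examples = [("ß", "SS"), ("ß", "ss"), ("ß", "ẞ")]
results = [same_ignoring_case(a, b) for a, b in examples]

for (a, b), equal in zip(examples, results):
    print(f"{a!r} vs {b!r}: {equal}")
```

All three pairs compare equal here because full case folding maps both
'ß' and 'ẞ' to the two-character string 'ss'.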

ICU also has case-insensitive collations, which offer a bit more
flexibility:

https://www.postgresql.org/docs/current/collation.html#COLLATION-NONDETERMINISTIC

Regards,
    Jeff Davis
