On Thu, 2025-06-05 at 15:07 +0200, Dominique Devienne wrote:
> But isn't the point of the new-in-v17 builtin provider is to be system
> independent???

Yes, a major goal of the builtin provider is complete consistency
across platforms for the entire collation system -- anything affected
by the database default collation or a COLLATE clause, including
comparisons, casing behavior, pattern matching, etc. New major
versions of Postgres may update Unicode, but those updates will never
affect comparisons in the builtin C.UTF-8 locale, and they will affect
other behaviors (like casing) only within the limits of the (rather
strict) Unicode stability policy[1].
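
As a rough illustration of the behaviors described above, the sketch
below applies the predefined pg_c_utf8 collation (the builtin
provider's C.UTF-8 locale in PostgreSQL 17) to a comparison and a
casing call. It assumes a UTF8-encoded database; the literals are
arbitrary examples.

```sql
-- Comparison under the builtin C.UTF-8 locale uses code point order,
-- so uppercase ASCII letters sort before lowercase ones.
SELECT 'B' < 'a' COLLATE pg_c_utf8 AS codepoint_order;

-- Casing under the same collation follows the Unicode version bundled
-- with the server, not the operating system's locale data.
SELECT upper('résumé' COLLATE pg_c_utf8) AS uppercased;
```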

Regarding datcollate and datctype: those affect the LC_COLLATE and
LC_CTYPE environment variables, and Postgres does a setlocale() upon a
new database connection. That only affects libc functions like
strcoll(), so it won't affect the builtin provider or ICU, neither of
which uses strcoll().
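
To see which values are in play for the current database, a query
along these lines can help. It is a minimal sketch that assumes the
PostgreSQL 17 catalog layout, where datlocprovider is 'b' for builtin,
'c' for libc and 'i' for ICU, and datlocale holds the builtin or ICU
locale name.

```sql
-- Show the provider and the libc-level settings for this database.
SELECT datname,
       datlocprovider,  -- 'b' = builtin, 'c' = libc, 'i' = icu
       datlocale,       -- e.g. C.UTF-8 for the builtin provider
       datcollate,      -- value used for LC_COLLATE via setlocale()
       datctype         -- value used for LC_CTYPE via setlocale()
FROM pg_database
WHERE datname = current_database();
```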

You're right to ask why those matter at all, then. It's hard for me to
guarantee that datcollate/datctype won't affect some other part of the
system or an extension (I see that Daniel offered some more details).
I'd like to force LC_COLLATE=C and LC_CTYPE=C, and then there'd be no
question, but I won't promise when that will happen. I'd suggest just
forcing those to "C" in your database.
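
A minimal sketch of that suggestion for a newly created database,
assuming PostgreSQL 17 and UTF8 encoding. The database name appdb is
hypothetical. As far as I know, datcollate and datctype cannot be
changed with ALTER DATABASE, so an existing database would need to be
recreated (for example via dump and restore) to pick up these values.

```sql
-- appdb is a placeholder name, used here only for illustration.
CREATE DATABASE appdb
    TEMPLATE template0        -- required when overriding locale settings
    ENCODING 'UTF8'
    LOCALE_PROVIDER builtin
    BUILTIN_LOCALE 'C.UTF-8'  -- collation behavior comes from the builtin provider
    LC_COLLATE 'C'            -- libc-level settings forced to "C"
    LC_CTYPE 'C';
```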

Regards,
Jeff Davis

[1] https://www.unicode.org/policies/stability_policy.html