Thread: CREATE DATABASE command for non-libc providers

CREATE DATABASE command for non-libc providers

From
Jeff Davis
Date:
From the discussion here:

https://www.postgresql.org/message-id/CAFCRh--rtqbOBpJYFDmPD9kYCYxsxKpLW7LHxYMYhHXa2XoStw@mail.gmail.com

the CREATE DATABASE command has a tendency to throw errors in confusing
ways when using non-libc providers. I have attached a patch 0001 that
fixes a misleading hint, but it's still not great.

When using ICU or the builtin provider, it still requires coming up
with some valid locale name for LC_COLLATE and LC_CTYPE, even though
those have little or no effect. And because LOCALE is the fallback when
LC_COLLATE and/or LC_CTYPE are unspecified, it's confusing to the user
because they aren't even trying to specify a libc locale name at all.

The solution, as I see it, is:

* Force the environment variables LC_COLLATE=C and LC_CTYPE=C
unconditionally, and pg_perm_setlocale() them. This requires closing a
few loose ends, but it should be doable[1]. Even the libc provider uses
the "_l()" functions already, and no longer depends on setlocale().

* When datlocprovider<>'c', force datcollate and datctype to NULL.

* If the user specifies LC_CTYPE or LC_COLLATE to CREATE DATABASE, and
the provider is not libc, then ignore LC_COLLATE/LC_CTYPE and emit a
WARNING, rather than trying to set it based on LOCALE and getting an
error.
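
To illustrate the point about the "_l()" functions, here is a minimal
sketch, assuming a POSIX.1-2008 libc such as glibc. The function names
compare_in_locale() and c_locale_orders_by_bytes() are hypothetical and
are not taken from the PostgreSQL source. The comparison takes an
explicit locale_t, so the process-wide setlocale() state does not affect
the result:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <string.h>

/*
 * Compare two strings under an explicit locale object. The process-wide
 * locale installed by setlocale() is not consulted.
 */
int
compare_in_locale(const char *a, const char *b, locale_t loc)
{
    assert(loc != (locale_t) 0);
    return strcoll_l(a, b, loc);
}

/*
 * Hypothetical demonstration helper: returns 1 if "B" sorts before "a"
 * under an explicitly created "C" collation, 0 if not, and -1 if the
 * locale object could not be created.
 */
int
c_locale_orders_by_bytes(void)
{
    locale_t    c_loc = newlocale(LC_COLLATE_MASK, "C", (locale_t) 0);
    int         result;

    if (c_loc == (locale_t) 0)
        return -1;

    /* Adopt whatever LC_COLLATE the environment names; failure is harmless. */
    setlocale(LC_COLLATE, "");

    /* The explicit locale_t decides the ordering, not the global setting. */
    result = compare_in_locale("B", "a", c_loc) < 0;

    freelocale(c_loc);
    return result;
}
```

In the "C" collation strings compare by byte value, so "B" sorts before
"a" whatever LC_COLLATE the environment asked for.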

Regards,
    Jeff Davis

[1]
https://www.postgresql.org/message-id/cd3517c7-ddb8-454e-9dd5-70e3d84ff6a2%40eisentraut.org


Re: CREATE DATABASE command for non-libc providers

From
"Daniel Verite"
Date:
    Jeff Davis wrote:

> I have attached a patch 0001 that
> fixes a misleading hint, but it's still not great.

+1 for the patch

> When using ICU or the builtin provider, it still requires coming up
> with some valid locale name for LC_COLLATE and LC_CTYPE

No, since the following invocation does work:

 CREATE DATABASE test
   template='template0'
   locale_provider='builtin'
   builtin_locale='C.UTF-8';

Here we let 'locale' (or 'lc_collate'/'lc_ctype', which is the same
thing) default from the template database.

In the discussion you mentioned, the error comes from the OP using
'locale' instead of 'builtin_locale'. At least that's my understanding.
This mistake is not surprising, because when you specify a locale
provider followed by a locale, intuitively you'd expect this locale
to refer to that locale provider. Yet that's not the case, mostly for backward
compatibility reasons.



> * Force the environment variables LC_COLLATE=C and LC_CTYPE=C
> unconditionally, and pg_perm_setlocale() them

Currently that would be a regression for some people, because
when LC_CTYPE=C, the FTS parser produces substandard results with
characters beyond ASCII.
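
As a rough illustration of that kind of degradation (this is not the
FTS parser's code, only a sketch of character classification under
LC_CTYPE=C), assume a single-byte ISO-8859-1 string and two
hypothetical helpers, leading_alpha_len() and
cafes_prefix_under_c_ctype():

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/*
 * Length of the leading run of bytes that isalpha() accepts under the
 * current process-wide LC_CTYPE.
 */
size_t
leading_alpha_len(const char *s)
{
    size_t      n = 0;

    while (s[n] != '\0' && isalpha((unsigned char) s[n]))
        n++;
    return n;
}

/*
 * Classify the ISO-8859-1 bytes of "cafes" with an e-acute (0xE9) in
 * fourth position, after forcing LC_CTYPE to "C".
 */
size_t
cafes_prefix_under_c_ctype(void)
{
    setlocale(LC_CTYPE, "C");
    return leading_alpha_len("caf\xe9s");
}
```

With LC_CTYPE=C only the ASCII letters are alphabetic, so the run stops
after "caf" and the accented letter is treated as a non-letter.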



Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/



Re: CREATE DATABASE command for non-libc providers

From
Jeff Davis
Date:
On Fri, 2025-06-06 at 22:03 +0200, Daniel Verite wrote:
> +1 for the patch

Thank you, committed.

>
> Here we let 'locale' (or 'lc_collate'/'lc_ctype', which is the same
> thing) default from the template database.

Right, in the normal case it's OK, but if anything goes wrong, it gets
fairly confusing.

> > * Force the environment variables LC_COLLATE=C and LC_CTYPE=C
> > unconditionally, and pg_perm_setlocale() them
>
> Currently that would be a regression for some people, because
> when LC_CTYPE=C, the FTS parser produces substandard results with
> characters beyond ASCII.

In the other thread, I posted a patch:

https://www.postgresql.org/message-id/a1396f17f462ee6561820f755caaf2d12eb9fd15.camel%40j-davis.com

With that patch, the callers that rely on datctype (regardless of
datlocprovider) access the locale_t through a global and use the "_l"
variants.
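
The general shape of that pattern can be sketched as follows. This is
not the patch itself; the names database_ctype_locale,
init_database_ctype_locale(), db_isalpha() and db_toupper() are
hypothetical, and a POSIX.1-2008 libc is assumed:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <locale.h>

/* Hypothetical global holding the locale object for the database ctype. */
static locale_t database_ctype_locale = (locale_t) 0;

/*
 * Create the global locale object from a libc locale name.
 * Returns 0 on success, -1 if libc does not know the name.
 */
int
init_database_ctype_locale(const char *ctype_name)
{
    locale_t    loc = newlocale(LC_CTYPE_MASK, ctype_name, (locale_t) 0);

    if (loc == (locale_t) 0)
        return -1;

    /* Release any previously installed object before replacing it. */
    if (database_ctype_locale != (locale_t) 0)
        freelocale(database_ctype_locale);
    database_ctype_locale = loc;
    return 0;
}

/* Classification goes through the global, never through setlocale(). */
int
db_isalpha(unsigned char c)
{
    assert(database_ctype_locale != (locale_t) 0);
    return isalpha_l(c, database_ctype_locale) != 0;
}

/* Case mapping likewise uses the "_l" variant with the global. */
int
db_toupper(unsigned char c)
{
    assert(database_ctype_locale != (locale_t) 0);
    return toupper_l(c, database_ctype_locale);
}
```

Routing every ctype-dependent call through wrappers like these makes
the dependency on datctype easy to find, which is the benefit described
above, while leaving behavior unchanged.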

There should be no behavior change, and we still need to set LC_CTYPE,
so you are right that it's not a solution yet. I think it moves us in
the right direction, though.

If nothing else, we can easily identify the places that have behavior
dependent on datctype, and I could have offered a clearer reply to
the user.

Regards,
    Jeff Davis