On Oct 11, 2025, at 02:28, Jeff Davis <pgsql@j-davis.com> wrote:
On Fri, 2025-10-10 at 12:13 +0800, Chao Li wrote:
Are we assuming that
* if the settings come from command line options, then the user is
intentionally doing that, so we throw an error
* if the settings come from env, then the user might not be aware of
them, so we only issue a warning?
If that’s the case, I’m not fully convinced by this design. Since
initdb is a one-time operation, I think it would be better to require
everything to be explicit.
That would have been ideal a long time ago, but plain "initdb" without
locale options has succeeded for a long time, using information from
the environment. If we make that fail and require the user to specify
the options explicitly, I fear that would be too disruptive to the many
scripts around.
So we need to do something reasonable when the provider is builtin and
LC_CTYPE/LC_COLLATE from the environment are incompatible with UTF-8.
Forcing LC_CTYPE=C and/or LC_COLLATE=C:
  * Only happens if:
    - the provider is builtin;
    - LC_CTYPE/LC_COLLATE come from the environment (i.e.
      --locale/--lc-ctype/--lc-collate are unspecified); and
    - LC_CTYPE/LC_COLLATE are incompatible with UTF-8.
  * Has little practical effect because those settings aren't
    used many places when the provider is builtin or ICU.
so I think a warning is acceptable there.
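
The rule discussed above can be written as a small decision function. This is only an illustrative sketch in Python, not initdb's actual code: the names `Source`, `Action`, and `decide` are hypothetical, and whether a locale is compatible with UTF-8 is passed in as a boolean rather than detected. Providers other than builtin are outside the scope of this thread, so the sketch simply leaves them alone.

```python
from enum import Enum


class Source(Enum):
    """Where LC_CTYPE/LC_COLLATE came from (hypothetical names)."""
    COMMAND_LINE = "command line"   # --locale / --lc-ctype / --lc-collate
    ENVIRONMENT = "environment"     # inherited from the process environment


class Action(Enum):
    KEEP = "keep settings as given"
    WARN_AND_FORCE_C = "warn and force C"
    ERROR = "error"


def decide(provider: str, source: Source, utf8_compatible: bool) -> Action:
    """Sketch of the proposed behavior for the builtin provider."""
    # Nothing special happens unless the provider is builtin and the
    # locale setting is incompatible with UTF-8.
    if provider != "builtin" or utf8_compatible:
        return Action.KEEP
    # An explicit option is assumed to be intentional, so it is an error.
    if source is Source.COMMAND_LINE:
        return Action.ERROR
    # A setting inherited from the environment only triggers a warning,
    # and LC_CTYPE and/or LC_COLLATE is forced to C.
    return Action.WARN_AND_FORCE_C
```

For example, `decide("builtin", Source.ENVIRONMENT, False)` yields `Action.WARN_AND_FORCE_C`, while the same incompatible setting given via `--lc-ctype` yields `Action.ERROR`.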
Thanks for the explanation, that sounds reasonable. In the meantime, my last arguments are:
* If we make that fail, I don’t think it would break existing scripts. The default provider is libc, and you are introducing a new environment variable to set the locale provider, so a plain initdb will not use the builtin provider. The provider might come from PG_TEST_INITDB_EXTRA_OPTS, and I'm OK with test environments only issuing warnings.
* I am thinking out loud. The builtin provider is more performant but has certain limitations. Some production users may want to try the builtin provider for better performance without being aware of those limitations. Their environment contains the LC_CTYPE/LC_COLLATE they actually want to use, and they set the new environment variable to “builtin” for the provider. In this case, a failing “initdb” would make the user clearly aware of the builtin provider's limitations. Otherwise, if the user also ignores the warning messages, the database would be created with an unexpected ctype, which could lead to losses (time, data, etc.).
If neither of those applies, then I am fine with the design.
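
To make the second scenario concrete, here is a rough Python sketch contrasting the two policies for a user who selects the builtin provider through the environment and passes no locale options. Everything in it is hypothetical: `Outcome`, `run_policy`, the `fail_hard` flag, and the message texts are illustrations, not initdb's real behavior or output.

```python
from typing import NamedTuple, Optional


class Outcome(NamedTuple):
    cluster_created: bool
    lc_ctype: Optional[str]   # effective LC_CTYPE, None if initdb failed
    message: str              # hypothetical diagnostic text


def run_policy(env_lc_ctype: str, utf8_compatible: bool,
               fail_hard: bool) -> Outcome:
    """Assumes provider=builtin and LC_CTYPE taken from the environment."""
    if utf8_compatible:
        # No conflict: the environment's setting is used as-is.
        return Outcome(True, env_lc_ctype, "")
    if fail_hard:
        # Stricter alternative: the user must notice the limitation.
        return Outcome(False, None,
                       f"error: LC_CTYPE {env_lc_ctype!r} is incompatible with UTF-8")
    # Warning-only design: the cluster is created, but with LC_CTYPE=C
    # instead of what the environment asked for.
    return Outcome(True, "C",
                   f"warning: LC_CTYPE {env_lc_ctype!r} is incompatible with UTF-8; using C")
```

With `fail_hard=False`, a non-UTF-8 environment locale still produces a cluster, just with `lc_ctype == "C"`; a user who skips the warning would only find that out later. With `fail_hard=True`, no cluster is created and the mismatch cannot be overlooked.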
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/