Thread: BUG #17170: Invalid collation created with provider icu and initdb' locale C
BUG #17170: Invalid collation created with provider icu and initdb' locale C
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17170 Logged by: Andrew Bille Email address: andrewbille@gmail.com PostgreSQL version: 14beta3 Operating system: centos-7 Description: Hi! When using postgres built "--with-icu" I get an invalid collation with the following script: initdb -D db --locale=C createdb testdb psql testdb -c "CREATE COLLATION test (provider=icu, locale='en_US.utf8');" psql testdb -c "SELECT 'a' < 'b' COLLATE test;" ERROR: collation "test" for encoding "SQL_ASCII" does not exist LINE 1: SELECT 'a' < 'b' COLLATE test; ^ or psql testdb -c "DROP COLLATION test;" ERROR: collation "test" for encoding "SQL_ASCII" does not exist Reproduced on REL_10_STABLE..REL_14_STABLE. Thanks!
Re: BUG #17170: Invalid collation created with provider icu and initdb' locale C
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > When using postgres built "--with-icu" I get an invalid collation with the > following script: > initdb -D db --locale=C > createdb testdb > psql testdb -c "CREATE COLLATION test (provider=icu, > locale='en_US.utf8');" > psql testdb -c "SELECT 'a' < 'b' COLLATE test;" > ERROR: collation "test" for encoding "SQL_ASCII" does not exist > LINE 1: SELECT 'a' < 'b' COLLATE test; > ^ That error doesn't seem inappropriate; ICU collations don't work in SQL_ASCII encoding. > psql testdb -c "DROP COLLATION test;" > ERROR: collation "test" for encoding "SQL_ASCII" does not exist That, however, is a bit annoying. If you can create the collation you should be able to drop it. The immediate issue is that DROP COLLATION uses the same lookup rule as other code paths, so it fails to see inapplicable pg_collation entries at all. So one approach to a fix is to relax its lookup rule. But I wonder if we should have prevented the CREATE in the first place, instead. Unfortunately, either approach is likely to still have some edge-case odd behaviors :-(. This whole business of having the database encoding be implicitly part of the lookup rule is just prone to that. regards, tom lane
Re: BUG #17170: Invalid collation created with provider icu and initdb' locale C
From
Tom Lane
Date:
I wrote: > The immediate issue is that DROP COLLATION uses the same lookup > rule as other code paths, so it fails to see inapplicable > pg_collation entries at all. So one approach to a fix is to > relax its lookup rule. But I wonder if we should have prevented > the CREATE in the first place, instead. After a bit of experimentation, I'm inclined to the idea that we should have rejected the CREATE COLLATION in the first place. It's not very useful to create a collation that you're not going to be able to use, and I think it's more user-friendly to say so up front than to leave the user guessing about why subsequent references don't work. There is arguably one use-case that this fix shuts off: manually adding ICU-based collations to template0, in the expectation that they could get cloned into other databases where they'd be useful. But that seems like a mighty thin argument, especially since you could initdb with an ICU-compatible encoding if you want to do that. regards, tom lane diff --git a/src/backend/commands/collationcmds.c b/src/backend/commands/collationcmds.c index 5a2ba56cec..9b8a259eb2 100644 --- a/src/backend/commands/collationcmds.c +++ b/src/backend/commands/collationcmds.c @@ -215,7 +215,19 @@ DefineCollation(ParseState *pstate, List *names, List *parameters, bool if_not_e if (!fromEl) { if (collprovider == COLLPROVIDER_ICU) + { + /* + * While an ICU collation can support various encodings, only + * allow one to be created when the current database's encoding is + * supported. Otherwise we get surprising behaviors like not + * being able to drop the collation. + */ + if (!is_encoding_supported_by_icu(GetDatabaseEncoding())) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("current database's encoding is not supported with this provider"))); collencoding = -1; + } else { collencoding = GetDatabaseEncoding();