pg_upgrade fails to preserve old versions of the predefinedcollations - Mailing list pgsql-hackers
From | Alexander Lakhin |
---|---|
Subject | pg_upgrade fails to preserve old versions of the predefinedcollations |
Date | |
Msg-id | 01adb62f-b114-1237-63ae-b7f030a18389@gmail.com Whole thread Raw |
Responses |
Re: pg_upgrade fails to preserve old versions of the predefined collations
|
List | pgsql-hackers |
Hello hackers, When dealing with an OS upgrade, a some kind of anomaly related to collations was found. Suppose, we have Debian 8 with postgresql 12 installed. Then we create a custom collation: CREATE COLLATION russian (provider=icu, locale='ru_RU'); and SELECT oid, collname, collnamespace, collprovider, collversion FROM pg_collation WHERE collname like 'ru%'; returns 12847 | ru-RU-x-icu | 11 | i | 58.0.0.50 ... 16384 | russian | 2200 | i | 58.0.0.50 Then let's create two tables with text columns and indexes and fill them with some data: CREATE TABLE test_icu_ru (f1 varchar COLLATE "ru-RU-x-icu", i int); INSERT INTO test_icu_ru SELECT chr(x), x FROM generate_series(1, 2000) as y(x); CREATE INDEX ON test_icu_ru (f1); CREATE TABLE test_icu_russian (f1 varchar COLLATE "russian", i int); INSERT INTO test_icu_russian SELECT chr(x), x FROM generate_series(1, 2000) as y(x); CREATE INDEX ON test_icu_russian (f1); Perform two test queries: postgres=# select * from test_icu_ru where f1=chr(821); f1 | i ----+----- ̵ | 821 (1 row) postgres=# select * from test_icu_russian where f1=chr(821); f1 | i ----+----- ̵ | 821 (1 row) postgres=# EXPLAIN select * from test_icu_ru where f1=chr(821); QUERY PLAN -------------------------------------------------------------------------------------- Index Scan using test_icu_ru_f1_idx on test_icu_ru (cost=0.28..8.29 rows=1 width=6) Index Cond: ((f1)::text = '̵'::text) (2 rows) postgres=# EXPLAIN select * from test_icu_russian where f1=chr(821); QUERY PLAN ------------------------------------------------------------------------------------------------ Index Scan using test_icu_russian_f1_idx on test_icu_russian (cost=0.28..8.29 rows=1 width=6) Index Cond: ((f1)::text = '̵'::text) (2 rows) (The indexes are indeed used by the above queries.) Now suppose that the OS is upgraded to Debian 9 (or the pgdata just moved to Debian 9 with the postgresql 12). The same queries return: postgres=# select * from test_icu_ru where f1=chr(821); WARNING: collation "ru-RU-x-icu" has version mismatch DETAIL: The collation in the database was created using version 58.0.0.50, but the operating system provides version 153.64.29. HINT: Rebuild all objects affected by this collation and run ALTER COLLATION pg_catalog."ru-RU-x-icu" REFRESH VERSION, or build PostgreSQL with the right library version. f1 | i ----+--- (0 rows) postgres=# select * from test_icu_russian where f1=chr(821); WARNING: collation "russian" has version mismatch DETAIL: The collation in the database was created using version 58.0.0.50, but the operating system provides version 153.64.29. HINT: Rebuild all objects affected by this collation and run ALTER COLLATION public.russian REFRESH VERSION, or build PostgreSQL with the right library version. f1 | i ----+--- (0 rows) We get no data due to the real collation/sort order change but the warning says what to do. The query presented at https://www.postgresql.org/docs/12/sql-altercollation.html returns: Collation | Object -------------------------+------------------------------------- collation "ru-RU-x-icu" | column f1 of table test_icu_ru collation "ru-RU-x-icu" | index test_icu_ru_f1_idx collation russian | column f1 of table test_icu_russian collation russian | index test_icu_russian_f1_idx So the documented behavior is observed. But after pg_upgrade: pg_createcluster 12 new /usr/lib/postgresql/12/bin/pg_upgrade -b /usr/lib/postgresql/12/bin -B /usr/lib/postgresql/12/bin -d /etc/postgresql/12/main -D /etc/postgresql/12/new In the new cluster the same queries return: postgres=# select * from test_icu_russian where f1=chr(821); WARNING: collation "russian" has version mismatch DETAIL: The collation in the database was created using version 58.0.0.50, but the operating system provides version 153.64.29. HINT: Rebuild all objects affected by this collation and run ALTER COLLATION public.russian REFRESH VERSION, or build PostgreSQL with the right library version. f1 | i ----+--- (0 rows) postgres=# select * from test_icu_ru where f1=chr(821); f1 | i ----+--- (0 rows) (There is no warning for the predefined collation now.) The query presented at https://www.postgresql.org/docs/12/sql-altercollation.html returns: Collation | Object -------------------+------------------------------------- collation russian | column f1 of table test_icu_russian collation russian | index test_icu_russian_f1_idx (2 rows) and SELECT oid, collname, collnamespace, collprovider, collversion FROM pg_collation WHERE collname like 'ru%'; returns oid | collname | collnamespace | collprovider | collversion -------+-------------+---------------+--------------+------------- 12884 | ru-BY-x-icu | 11 | i | 153.64.29 12885 | ru-KG-x-icu | 11 | i | 153.64.29 12886 | ru-KZ-x-icu | 11 | i | 153.64.29 12887 | ru-MD-x-icu | 11 | i | 153.64.29 12888 | ru-RU-x-icu | 11 | i | 153.64.29 12889 | ru-UA-x-icu | 11 | i | 153.64.29 12883 | ru-x-icu | 11 | i | 153.64.29 12329 | ru_RU | 11 | c | 12328 | ru_RU.utf8 | 11 | c | 16402 | russian | 2200 | i | 58.0.0.50 So only the custom collation' version is actual, but predefined ones correspond to a newer libicu, but not to actual data. For all that, REINDEX repairs both indexes: postgres=# REINDEX INDEX test_icu_russian_f1_idx; REINDEX postgres=# select * from test_icu_russian where f1=chr(821); f1 | i ----+----- ̵ | 821 (1 row) postgres=# REINDEX INDEX test_icu_ru_f1_idx; REINDEX postgres=# select * from test_icu_ru where f1=chr(821); f1 | i ----+----- ̵ | 821 (1 row) So for now it seems dangerous to use predefined collations as their old versions are not preserved by pg_upgrade and the user doesn't know which indexes affected by the actual ICU collation changes. Best regards, Alexander
pgsql-hackers by date: