Explicit deterministic COLLATE fails with pattern matching operationson column with non-deterministic collation - Mailing list pgsql-bugs
From | James Lucas |
---|---|
Subject | Explicit deterministic COLLATE fails with pattern matching operationson column with non-deterministic collation |
Date | |
Msg-id | CAAFmbbOvfi=wMM=3qRsPunBSLb8BFREno2oOzSBS=mzfLPKABw@mail.gmail.com Whole thread Raw |
Responses |
Re: Explicit deterministic COLLATE fails with pattern matchingoperations on column with non-deterministic collation
|
List | pgsql-bugs |
Hi all, Wanted to call out what seems like a possible bug in non-deterministic collation handling with pattern matching operators. Per the documentation, non-deterministic collations are not supported with pattern matching operators. Section 9.7 of the PG12 manual recommends "The pattern matching operators of all three kinds do not support nondeterministic collations. If required, apply a different collation to the expression to work around this limitation." However, I'm finding that pattern matching operations fail when a column is declared with a non-deterministic collation, *even if* a different, deterministic collation is explicitly applied to the pattern matching operation. This doesn't seem to be the expected behavior. Example. This is tested on Postgres 12.3, on Centos 8.1.1911 with libicu 60.3. Create a non-deterministic collation. create collation mycollation (provider = icu, locale = 'en-US-ks-level2.utf8', deterministic = false); Create a couple of sample tables: create table ctest (id numeric, t text); create table ctestnd (id numeric, t text collate mycollation); Populate them with some data: insert into ctest values (1,'aAa'); insert into ctest select generate_series(2,100000),'bbb'; insert into ctestnd select id, t from ctest; analyze ctest, ctestnd; Add a few indexes: create index ctest_idx01 on ctest (t); create index ctest_idx02 on ctest (t collate "C"); create index ctestnd_idx01 on ctestnd (t); create index ctestnd_idx02 on ctestnd (t collate "C"); Test on ctest: explain select * from ctest where t = 'aAa' collate "C"; QUERY PLAN -------------------------------------------------------------------------- Index Scan using ctest_idx02 on ctest (cost=0.42..4.44 rows=1 width=10) Index Cond: (t = 'aAa'::text COLLATE "C") COMMENT: Works as expected. explain select * from ctest where t like 'a%'; QUERY PLAN -------------------------------------------------------------------------- Index Scan using ctest_idx02 on ctest (cost=0.42..8.44 rows=1 width=10) Index Cond: ((t >= 'a'::text) AND (t < 'b'::text)) Filter: (t ~~ 'a%'::text) COMMENT: Actually this is very interesting, because even without an explicit COLLATE clause, LIKE still uses the "C" collation index. Not sure if that's intended behavior either? explain select * from ctest where t like 'a%' collate "C"; QUERY PLAN -------------------------------------------------------------------------- Index Scan using ctest_idx02 on ctest (cost=0.42..8.44 rows=1 width=10) Index Cond: ((t >= 'a'::text) AND (t < 'b'::text)) Filter: (t ~~ 'a%'::text COLLATE "C") COMMENT: Uses explicit collation and index as expected. Test on ctestnd: explain select * from ctestnd where t = 'aAa' collate "C"; QUERY PLAN ------------------------------------------------------------------------------ Index Scan using ctestnd_idx02 on ctestnd (cost=0.42..4.44 rows=1 width=10) Index Cond: (t = 'aAa'::text COLLATE "C") COMMENT: Works as expected. explain select * from ctestnd where t like 'a%'; ERROR: nondeterministic collations are not supported for LIKE COMMENT: Fails as expected. explain select * from ctestnd where t like 'a%' collate "C"; ERROR: nondeterministic collations are not supported for LIKE COMMENT: Not expected. It seems like the explicit COLLATE clause is ignored in this case. I've tried different placements for the COLLATE clause, and none seem to work. Is this a bug, or have I missed something? Thanks, James Lucas
pgsql-bugs by date: