Thread: BUG #17350: GIST TRGM Index is broken when combining with combining INCLUDE with a string function (e.g. lower).
BUG #17350: GIST TRGM Index is broken when combining with combining INCLUDE with a string function (e.g. lower).
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17350 Logged by: louis jachiet Email address: louis.jachiet@telecom-paris.fr PostgreSQL version: 13.4 Operating system: Archlinux Description: Hello everyone, I hope this is not a duplicate of a similar bug (I tried to check…). GIST TRGM Index seems to return wrong values when populated with a string function (such as lower) and an include. If I issue the following lines, instead of returning 'foo' and 'bar', postgresql will return two empty lines: CREATE EXTENSION IF NOT EXISTS pg_trgm ; CREATE TABLE t (a VARCHAR(50)); INSERT INTO t VALUES ('foo') ; INSERT INTO t VALUES ('BAR') ; CREATE INDEX test_idx ON t USING gist(lower(a) gist_trgm_ops) INCLUDE (a) ; set enable_seqscan=off ; SELECT lower(a) FROM t ; Obviously for this last SELECT, the index is only useful because of the seqscan=off but if you have a large database and issue a command like: SELECT 1 FROM t WHERE lower(a) LIKE '%o%' ; then the output will be also empty because the index will return empty lines and the recheck condition will fail. If I use a function different than lower (e.g. upper) the problem persists but if remove the function or if remove the include then everything works just fine. It really seems that it is the combination of include + string function + gist trgm that makes the SELECT return empty lines. Thank you for your time! Regards
Re: BUG #17350: GIST TRGM Index is broken when combining with combining INCLUDE with a string function (e.g. lower).
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > If I issue the > following lines, instead of returning 'foo' and 'bar', postgresql will > return two empty lines: > CREATE EXTENSION IF NOT EXISTS pg_trgm ; > CREATE TABLE t (a VARCHAR(50)); > INSERT INTO t VALUES ('foo') ; > INSERT INTO t VALUES ('BAR') ; > CREATE INDEX test_idx ON t USING gist(lower(a) gist_trgm_ops) INCLUDE > (a) ; > set enable_seqscan=off ; > SELECT lower(a) FROM t ; Fascinating! This is actually a very ancient core-planner bug, though you could not observe it with all index types. The GIST AM correctly reports that it can return the value of a, but not the value of lower(a), in an index-only scan. indxpath.c sees that it can still make an index-only scan, reasoning that lower(a) can be recomputed from the returned value of a. But then set_indexonlyscan_references() screws up: it's told to compute lower(a) from a tlist that includes lower(a) and a, and it naturally figures it can just re-use the first output column. Then at runtime you get a NULL result since that's what the index will output for non-returnable columns. We need some mechanism to mark that not all of the index columns are usable at that step. Doesn't seem terribly hard to fix, though. Thanks for the report! regards, tom lane