Re: BUG #17975: Nested Loop Index Scan returning wrong result - Mailing list pgsql-bugs
From | Andres Freund |
---|---|
Subject | Re: BUG #17975: Nested Loop Index Scan returning wrong result |
Date | |
Msg-id | 20230614211231.pudg2y2rz6az2vij@awork3.anarazel.de Whole thread Raw |
In response to | Re: BUG #17975: Nested Loop Index Scan returning wrong result (Andres Freund <andres@anarazel.de>) |
Responses |
Re: BUG #17975: Nested Loop Index Scan returning wrong result
|
List | pgsql-bugs |
Hi, On 2023-06-14 13:36:51 -0700, Andres Freund wrote: > On 2023-06-14 15:17:31 +0000, PG Bug reporting form wrote: > > The following bug has been logged on the website: > > > > Bug reference: 17975 > > Logged by: Tor Erik Linnerud > > Email address: tel@jklm.no > > PostgreSQL version: 15.3 > > Operating system: MacOS 13.4, Linux 5.16 > > Description: > > > > Hi, first let me say thanks for all the hard work that goes into Postgres. > > > > > > I ran into a very specific query + index + data combination that appears to > > return the wrong result. After much trial and error I’ve been able to > > construct a dump to reproduce the problem, when running ANALYZE after the > > import. > > > > 1. Grab the DB dump (13 MB) > > > > curl -L "https://www.dropbox.com/s/k1ai0765gc2k98f/bug5.sql?dl=1" -o > > bug5.sql > > > > 2. Create an empty database, import the dump and analyze: > > > > createdb bug5 && psql -d bug5 -f bug5.sql && psql -d bug5 -c "ANALYZE" > > > > 3. Run queries: > > > > psql -d bug5 -c "set enable_indexscan = 'off'; SELECT a.id FROM a JOIN b ON > > b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'" > > > > 1 row expected, get 1 > > > > psql -d bug5 -c "set enable_indexscan = 'on'; SELECT a.id FROM a JOIN b ON > > b.a_id = a.id JOIN c ON c.id = b.c_id WHERE c.tag = '13880'" > > > > 1 row expected, get 0 > > Uh, huh. I can reproduce that. And it's not just 15, reproduces in old > versions too. > > The plan doesn't look obviously wrong: > ┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐ > │ QUERY PLAN │ > ├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤ > │ Nested Loop (cost=1.26..17.46 rows=1 width=8) (actual time=0.063..0.065 rows=0 loops=1) │ > │ Output: a.id │ > │ Inner Unique: true │ > │ -> Nested Loop (cost=0.84..16.88 rows=1 width=8) (actual time=0.056..0.059 rows=1 loops=1) │ > │ Output: b.a_id │ > │ Inner Unique: true │ > │ -> Index Scan using c_tag_idx on public.c (cost=0.42..8.44 rows=1 width=8) (actual time=0.034..0.036 rows=1loops=1) │ > │ Output: c.id, c.tag │ > │ Index Cond: ((c.tag)::text = '13880'::text) │ > │ -> Index Scan using index_b_c_id on public.b (cost=0.42..8.44 rows=1 width=16) (actual time=0.015..0.016 rows=1loops=1) │ > │ Output: b.id, b.c_id, b.a_id │ > │ Index Cond: (b.c_id = c.id) │ > │ -> Index Only Scan using a_pkey on public.a (cost=0.42..0.59 rows=1 width=8) (actual time=0.002..0.003 rows=0 loops=1) │ > │ Output: a.id │ > │ Index Cond: (a.id = b.a_id) │ > │ Heap Fetches: 0 │ > │ Planning Time: 0.998 ms │ > │ Execution Time: 0.148 ms │ > └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ > (18 rows) > > It's not an IOS issue, it happens without IOS as well. > > > Something seems to be off with the relevant param - it's NULL. Haven't dug > deeper. I think it's a problem with the uniqueness determination / missing a qual / index selection. There are two rows in b with b.c_id = 13880, except that one of them has a NULL a_id: => SELECT * FROM b WHERE c_id = 13880; ┌────────┬───────┬────────┐ │ id │ c_id │ a_id │ ├────────┼───────┼────────┤ │ 326048 │ 13880 │ (null) │ │ 572151 │ 13880 │ 955968 │ └────────┴───────┴────────┘ (2 rows) . The uniqueness information comes from: "index_a_cannot_share_c" UNIQUE, btree (c_id) WHERE a_id IS NOT NULL But note that we aren't using that index, we use "index_b_c_id" btree (c_id) which of course also contains the a_id = NULL row. We either need to force the index that we got the uniqueness information to be used when it is partial, or add the quals from the partial unique index to all other index scans. Greetings, Andres Freund
pgsql-bugs by date: