Thread: BUG #5165: Poor performance with Left-join where right side does not exist
The following bug has been logged online: Bug reference: 5165 Logged by: assaf Email address: assaf_lehr@yahoo.com PostgreSQL version: 8.37 Operating system: linux Description: Poor performance with Left-join where right side does not exist Details: A category | idB | multiple-values B category | idB | multiple-values -------------- select B.idB , A.idB from B left join A on B.idB = A.idB and A.category=B.category where A.idB is null [and A.category=202] limit 10 -------------- I have indexes on all columns and thier permutations. I needed merge-join here and expected brief results ,as it is easy to find not-nulls running on both indexes. My DB is quite big (20M items) and the result was supposed to be 0.5M items. I stopped waiting after 18 hours. There is workaround , select B.idB from B where category=202 and idB not in (select distinct idB from A where category=202); It finishes in good time(10min) , but it`s totally a waste to use subplan here.
"assaf" <assaf_lehr@yahoo.com> writes: > PostgreSQL version: 8.37 > Description: Poor performance with Left-join where right side does > not exist 8.4 might be smarter about this case for you. It's hard to tell for sure with so few details. regards, tom lane
Re: BUG #5165: Poor performance with Left-join where right side does not exist
From
Robert Haas
Date:
On Wed, Nov 4, 2009 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "assaf" <assaf_lehr@yahoo.com> writes: >> PostgreSQL version: 8.37 >> Description: =A0 =A0 =A0 =A0Poor performance with Left-join where right = side does >> not exist > > 8.4 might be smarter about this case for you. =A0It's hard to tell for su= re > with so few details. EXPLAIN output would be a good place to start, and EXPLAIN ANALYZE for the queries that run quickly enough that you can let them run to completion. ...Robert
Re: BUG #5165: Poor performance with Left-join where right side does not exist
From
assaf lehr
Date:
I did not not check in 8.4 and maybe it is solved there, but let me explain it better. My particular case is a Join of two tables where the other table result should be "is null". Using explain, I found out that nested-loop and hash-join merge algrothims are bad for me [both tables are quite big]. The merge-join algorithm looked like a great way to solve the problem , as both the table has the same ordered index , whichis just the thing the merge-join needs. The two tables and the slow join are: A category | idB | multiple-values B category | idB | multiple-values -------------- select B.idB , A.idB from B left join A on B.idB = A.idB and A.category=B.category where A.idB is null [and A.category=202] limit 10 -------------- Mind you that if the last where is switched from "A.idB is null" to any regular check which can be done on the one row (likecolC=5) , the merge join works just fine (and fast). In the "is null" case however, the performance is very poor. I would guess that there is a special case in the postgres source-codefor the "is null" which has a different behavior. It can be, because unlike a single-row condition, is-null needsa different behaviour. assaf ________________________________ From: Robert Haas <robertmhaas@gmail.com> To: Tom Lane <tgl@sss.pgh.pa.us> Cc: assaf <assaf_lehr@yahoo.com>; pgsql-bugs@postgresql.org Sent: Thu, November 5, 2009 3:07:41 AM Subject: Re: [BUGS] BUG #5165: Poor performance with Left-join where right side does not exist On Wed, Nov 4, 2009 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "assaf" <assaf_lehr@yahoo.com> writes: >> PostgreSQL version: 8.37 >> Description: Poor performance with Left-join where right side does >> not exist > > 8.4 might be smarter about this case for you. It's hard to tell for sure > with so few details. EXPLAIN output would be a good place to start, and EXPLAIN ANALYZE for the queries that run quickly enough that you can let them run to completion. ...Robert
Re: BUG #5165: Poor performance with Left-join where right side does not exist
From
"Kevin Grittner"
Date:
assaf lehr <assaf_lehr@yahoo.com> wrote: > From: Robert Haas <robertmhaas@gmail.com> >> EXPLAIN output would be a good place to start, and EXPLAIN ANALYZE >> for the queries that run quickly enough that you can let them run >> to completion. > let me explain it better. > [hand-waving and pseudo-code] People would like to help you, but you're not providing enough information to let them. If you can show the actual tables involved in a slow query, the actual query, and the output you get when you run the query with EXPLAIN ANALYZE in front of the query (or just EXPLAIN if the EXPLAIN ANALYZE can't run to completion in an acceptable amount of time); then people will probably be able to help more. It's also good to include the output of SELECT version(); and information on the OS, hardware, and the contents of your postgresql.conf file (excluding comments). -Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > assaf lehr <assaf_lehr@yahoo.com> wrote: >> let me explain it better. >> [hand-waving and pseudo-code] > People would like to help you, but you're not providing enough > information to let them. Well, the really important point is that this query is using a construction that was specifically improved in 8.4. The advice is still going to be the same: see if it isn't better with 8.4. regards, tom lane
Re: BUG #5165: Poor performance with Left-join where right side does not exist
From
assaf lehr
Date:
ok. thanks, when my project will move to 8.4, I `ll re-do this test. ________________________________ From: Tom Lane <tgl@sss.pgh.pa.us> To: Kevin Grittner <Kevin.Grittner@wicourts.gov> Cc: Robert Haas <robertmhaas@gmail.com>; assaf lehr <assaf_lehr@yahoo.com>; pgsql-bugs@postgresql.org Sent: Thu, November 5, 2009 5:21:28 PM Subject: Re: [BUGS] BUG #5165: Poor performance with Left-join where right side does not exist "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > assaf lehr <assaf_lehr@yahoo.com> wrote: >> let me explain it better. >> [hand-waving and pseudo-code] > People would like to help you, but you're not providing enough > information to let them. Well, the really important point is that this query is using a construction that was specifically improved in 8.4. The advice is still going to be the same: see if it isn't better with 8.4. regards, tom lane