Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?) - Mailing list pgsql-bugs
From | Alexander Steffens |
---|---|
Subject | Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?) |
Date | |
Msg-id | af11f8750712190157p18fceaa1k9d9aeb05fe04d6ae@mail.gmail.com |
In response to | Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?) |
List | pgsql-bugs |
Sorry, I sent you only half of the information.

MS-SQL uses the plan with only nested loops when both tables have fewer than about 100 tuples ("small"). When t1 grows beyond about 1000 tuples ("medium"), it switches to the hash anti-semi-join. When both tables have more than 1000 tuples ("big"), it also adds parallelism (on my 2-CPU machine).

From the big-plan XML file I attached to my last mail, I extracted this plan by cutting off what I thought was too much detail:

<TopExpression>
<Parallelism>
<Sort Distinct="true">
<Parallelism PartitioningType="Hash">
<Hash>
<ComputeScalar>
<Hash>
<Parallelism PartitioningType="Hash">
<TableScan>
<Parallelism PartitioningType="Hash">
<ComputeScalar>
<NestedLoops Optimized="false">
<TableScan>
<Spool>
<TableScan>

It is the third one Gregory got from the image (which was cut off at the top because of my screen resolution). If you look inside the XML (plan_bigdata.sqlplan), I think you can find interesting details.

For me it is clear that the query is not nice. I used it only to provoke the optimizer, to study what can be optimized and how far.

From my POV it is not clear why PostgreSQL runs into the triple-table nested loop, which leads to a cardinality of about 8*10^9, when it could do the t1,t2 nested loop with cardinality 5*10^6 and then a merge anti-semi-join on t1 (#1300), which should be doable in about log(5*10^6)*5*10^6 operations. So there is a gap of nearly a factor of 1000. To me it looks like it cannot move the calculation of the expression (2*(a1+a2)) outside of the "not exists"?

Best regards,
Alexander
PS: I will now let the query run to the end, provided it takes less than 10 hours.

2007/12/19, Tom Lane <tgl@sss.pgh.pa.us>:
> Gregory Stark <stark@enterprisedb.com> writes:
> > "Tom Lane" <tgl@sss.pgh.pa.us> writes:
> >> It's possible that MS-SQL is doing something analogous to the
> >> hashed-subplan approach (hopefully with suitable tweaking for the NULL
> >> case) but even then it's hard to see how it could take only 9 sec.
> >> The cartesian product is too big.
>
> > Fwiw it seems MS-SQL is doing something funny. The three plans posted in the
> > screenshots for the "small", "medium", and "large" cases are:
> > ...
> > Postgres is doing something equivalent to the first plan.
>
> Hmm. I think the second plan is probably equivalent to the
> hashed-subplan behavior that you can get in PG by rewriting the query to
> NOT IN as I illustrated. The third plan looks to be the same thing plus
> some parallelization frammishes.
>
> I'm not clear on what "small/medium/large" means, in particular not on
> which of these corresponds to the OP's report of 9-second execution.
>
> regards, tom lane
>

--
Alexander Steffens
Georgstr. 3
53111 Bonn
+49 228 2661615
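The cost comparison in the message above can be checked with a few lines of arithmetic. This is only a rough sketch using the figures quoted in the mail (8*10^9 rows for the triple nested loop, 5*10^6 rows for the t1,t2 pair). It assumes the merge anti-semi-join is dominated by an n*log2(n) sort of the pair rows; the log base and the variable names are assumptions for illustration, not something stated in the thread.

```python
import math

# Figures quoted in the message; they are estimates, not measurements.
triple_nested_loop_rows = 8e9  # t1 x t2 x t1 nested loop
pair_rows = 5e6                # t1 x t2 nested loop only

# Assumption: the merge anti-semi-join cost is dominated by sorting the
# pair rows, i.e. roughly n * log2(n) comparisons.
merge_cost = pair_rows * math.log2(pair_rows)

# Gap between the two strategies, with and without the log factor.
ratio_with_log = triple_nested_loop_rows / merge_cost
ratio_without_log = triple_nested_loop_rows / pair_rows

print(f"merge cost estimate: {merge_cost:.3g}")
print(f"gap with log2 factor: {ratio_with_log:.0f}x")
print(f"gap without log factor: {ratio_without_log:.0f}x")
```

Under these assumptions the gap comes out at roughly 70x with the log2 factor and 1600x without it, so the "factor of nearly 1000" is best read as an order-of-magnitude estimate.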