Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?) - Mailing list pgsql-bugs

From	Alexander Steffens
Subject	Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?)
Date	December 19, 2007 06:09:22
Msg-id	af11f8750712190157p18fceaa1k9d9aeb05fe04d6ae@mail.gmail.com
In response to	Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?) (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: BUG #3826: Very Slow Execution of examplequery (wrong plan?)
List	pgsql-bugs

Sorry, I sent you only half of the information:
MS-SQL uses the plan with only nested loops when both tables have
fewer than about 100 tuples (="small").

When t1 grows beyond about 1000 tuples (="medium"), it switches to the
hash anti-semi-join; when both tables have more than 1000 tuples
(="big"), it also adds parallelism (on my 2-CPU machine).

From the big-plan XML file I attached to the last mail, I extracted
this plan, cutting off what I thought was too much detail:


<TopExpression>
  <Parallelism>
      <Sort Distinct="true">
          <Parallelism PartitioningType="Hash">
              <Hash>
                  <ComputeScalar>
                      <Hash>
                          <Parallelism PartitioningType="Hash">
                              <TableScan>
                          <Parallelism PartitioningType="Hash">
                              <ComputeScalar>
                                  <NestedLoops Optimized="false">
                                      <TableScan>
                                      <Spool>
                                          <TableScan>



It is the third one Gregory got from the image (which was cut off at
the top because of my screen resolution).

If you look inside the XML (plan_bigdata.sqlplan), I think you can find
interesting details.

For me it's clear that the query is not nice. I used it only to provoke
the optimizer, to study what can be optimized and how far.

From my POV it's not clear why PostgreSQL runs into the triple-table
nested loop, which leads to a cardinality of about 8*10^9, when it
could do the t1,t2 nested loop with cardinality 5*10^6 and then a
merge anti-semi-join on t1 (#1300), which should be doable in about
log(5*10^6)*5*10^6 steps. So there is a gap of nearly a factor of 1000.
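
To make the estimate above concrete, here is a rough back-of-the-envelope
sketch in Python. It uses only the figures quoted in this message
(8*10^9 and 5*10^6); the cost model (n*log(n) for the sort feeding the
merge anti-semi-join) and the choice of logarithm base are assumptions,
and the function names are made up for illustration.

```python
import math

# Approximate figures quoted in the message.
TRIPLE_LOOP_ROWS = 8e9   # rows visited by the three-table nested loop
PAIR_ROWS = 5e6          # rows produced by the t1,t2 nested loop alone


def sort_merge_cost(n, log_base=2):
    """Assumed cost model: n * log(n) comparisons for the sort that
    feeds the merge anti-semi-join (the merge itself is ignored)."""
    return n * math.log(n, log_base)


def gap_factor(log_base=2):
    """Ratio of the nested-loop work to the assumed sort/merge work."""
    return TRIPLE_LOOP_ROWS / sort_merge_cost(PAIR_ROWS, log_base)


if __name__ == "__main__":
    # The message does not say which logarithm base is meant,
    # so print the ratio for the three common choices.
    for name, base in (("log2", 2), ("ln", math.e), ("log10", 10)):
        print(f"{name}: gap factor ~ {gap_factor(base):.0f}")
```

Under these assumptions the ratio depends noticeably on the logarithm
base and stays in the tens to low hundreds; without the log term it is
8*10^9 / 5*10^6 = 1600. So the "factor 1000" is best read as an
order-of-magnitude figure.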

To me it looks like it cannot move the calculation of the
expression (2*(a1+a2)) outside of the "not exists"?

Best regards, Alexander.

PS: I will now let the query run to the end, if it takes less than 10 hours.

2007/12/19, Tom Lane <tgl@sss.pgh.pa.us>:
> Gregory Stark <stark@enterprisedb.com> writes:
> > "Tom Lane" <tgl@sss.pgh.pa.us> writes:
> >> It's possible that MS-SQL is doing something analogous to the
> >> hashed-subplan approach (hopefully with suitable tweaking for the NULL
> >> case) but even then it's hard to see how it could take only 9 sec.
> >> The cartesian product is too big.
>
> > Fwiw it seems MS-SQL is doing something funny. The three plans posted in the
> > screenshots for the "small", "medium", and "large" cases are:
> > ...
> > Postgres is doing something equivalent to the first plan.
>
> Hmm.  I think the second plan is probably equivalent to the
> hashed-subplan behavior that you can get in PG by rewriting the query to
> NOT IN as I illustrated.  The third plan looks to be the same thing plus
> some parallelization frammishes.
>
> I'm not clear on what "small/medium/large" means, in particular not on
> which of these corresponds to the OP's report of 9-second execution.
>
>                         regards, tom lane
>


--
Alexander Steffens
Georgstr. 3
53111 Bonn
+49 228 2661615
