Re: performance regression in 9.2/9.3 - Mailing list pgsql-hackers
From | Linos |
---|---|
Subject | Re: performance regression in 9.2/9.3 |
Date | |
Msg-id | 5395C9FA.6000009@linos.es Whole thread Raw |
In response to | Re: performance regression in 9.2/9.3 (Linos <info@linos.es>) |
Responses |
Re: performance regression in 9.2/9.3
|
List | pgsql-hackers |
On 05/06/14 23:09, Linos wrote: > On 05/06/14 19:39, Tom Lane wrote: >> Merlin Moncure <mmoncure@gmail.com> writes: >>> On Thu, Jun 5, 2014 at 9:54 AM, Linos <info@linos.es> wrote: >>>> What I don't understand is why the statistics have this bad information, all my tests are done on a database just restoredand analyzed. Can I do something to improve the quality of my database statistics and let the planner do better choices?Maybe increase the statistics target of the columns involved? >>> By that I meant row count estimates coming out of the joins are way >>> off. This is pushing the planner into making bad choices. The most >>> pervasive problem I see is that the row count estimate boils down to >>> '1' at some juncture causing the server to favor nestloop/index scan >>> when something like a hash join would likely be more appropriate. >> There's some fairly wacko stuff going on in this example, like why >> is the inner HashAggregate costed so much higher by 9.3 than 8.4, >> when the inputs are basically the same? And why does 9.3 fail to >> suppress the SubqueryScan on "ven", when 8.4 does get rid of it? >> And why is the final output rows estimate so much higher in 9.3? >> That one is actually higher than the product of the two nestloop >> inputs, which looks like possibly a bug. >> >> I think what's happening is that 9.3 is picking what it knows to be a less >> than optimal join method so that it can sort the output by means of the >> ordered scan "Index Scan using referencia_key on modelo mo", and thereby >> avoid an explicit sort of what it thinks would be 42512461 rows. With a >> closer-to-reality estimate there, it would have gone for a plan more >> similar to 8.4's, ie, hash joins and then an explicit sort. >> >> There is a lot going on in this plan that we haven't been told about; for >> instance at least one of the query's tables seems to actually be a view, >> and some other ones appear to be inheritance trees with partitioning >> constraints, and I'm suspicious that some of the aggregates might be >> user-defined functions with higher than normal costs. >> >> I'd like to see a self-contained test case, by which I mean full details >> about the table/view schemas; it's not clear whether the actual data >> is very important here. >> >> regards, tom lane > Query 2 doesn't use any view and you can find the schema here: > http://pastebin.com/Nkv7FwRr > > > Query 1 use 5 views: ticket_cabecera, ticket_linea, reserva_cabecera, reserva_linea and tarifa_proveedor_modelo_precio,I have factored out the four first with the same result as before, you can find the new queryand the new plan here: > > http://pastebin.com/7u2Dkyxp > http://explain.depesz.com/s/2V9d > > Actually the execution time is worse than before. > > About the last view if I change join from tarifa_proveedor_modelo_precio to tarifa_modelo_precio (a table with nearly thesame structure as the view) the query is executed much faster, but I get a similar time changing the (MIN(cab.time_stamp_recepcion)::DATE= ....) to (WHERE cab.time_stamp_recepcion::date = ....) in the "ent" subquery that neverwas a view. > > Anyway I included tarifa_modelo_precio to the query1 schema file for reference and you can find the plan using tarifa_modelo_precioinstead of the view tarifa_proveedor_modelo_precio here: > > http://explain.depesz.com/s/4gV > > query1 schema file: > http://pastebin.com/JpqM87dr > > > Regards, > Miguel Angel. > > > > Hello, Is this information enough? I could try to assemble a complete test case but I have very little time right now because Iam trying to meet a very difficult deadline. I will do ASAP if needed. Regards, Miguel Angel.
pgsql-hackers by date: