Home > mailing lists

Re: Changing optimizations - Mailing list pgsql-general

From	Philip Molter
Subject	Re: Changing optimizations
Date	July 5, 2001 13:53:17
Msg-id	20010705125308.A12723@datafoundry.net Whole thread Raw
In response to	Re: Changing optimizations (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Changing optimizations
List	pgsql-general

Tree view

On Thu, Jul 05, 2001 at 12:05:52PM -0400, Tom Lane wrote:
: Hmm.  Nothing out of the ordinary about the table schemas.  The relevant
: bits of the query seem to be
:
: >         FROM percept p
: >              INNER JOIN perceptType pt ON pt.ptid=p.ptid
: >                AND pt.runinterval IS NOT NULL
:
: >        WHERE p.deleted=0 AND UNIX_TIMESTAMP( p.nextrun )<=NOW() AND
: >              pt.runinterval IS NOT NULL AND p.running=0 AND h.active=1
:
: What seems to be happening is that as you repeatedly VACUUM ANALYZE,
: the statistics shift causing a shift in the estimated number of percept
: rows that match the WHERE clauses.  As that estimate rises, you get a
: change in the selected plan types for the later joins, in a direction
: that isn't favorable if the correct number of rows is small.

Well the rowcount is always small (or should be; less than 15 rows per
return).  p.nextrun constantly adjusts to a time within the next 5
minutes and p.running constantly shifts between one and zero, but the
other fields (deleted, runinterval, active, etc.) stay the same.  Now,
it could be that it's having a problem with the running or nextrun, but
that should only affect approximately 750 of the rows.  Certainly the
query optimizer should never have to worry about the other 8500.
Furthermore, at any moment, the number of rows matching any of those
fields should be about the same.  If it wasn't, I'd see problems
elsewhere.

: But it seems odd that you'd get a factor-of-100 change in that estimate
: if the true underlying data distribution isn't changing much.  Could
: you keep track of the results of these two queries:
:
: select * from pg_class where relname = 'percept';
:
: select attname,attdispersion,s.*
: from pg_statistic s, pg_attribute a, pg_class c
: where starelid = c.oid and attrelid = c.oid and staattnum = attnum
: and relname = 'percept';

Is there anything I should specifically be looking for?  I'd assume
that attdispersion should stay relatively constant if the data
distribution doesn't change much, but I'm not really familiar with how
the system tables factor into the optimizations.

: Another possibility is that what looks to be the same bottom-level join
: plan isn't really the same, but is using different restriction/join
: clauses for some weird reason.  It would be good to look at EXPLAIN
: VERBOSE output not just EXPLAIN output for the two plans, just to rule
: that out.

My brain hurts when I do that (literally thousands of lines of output).

Thanks for the help,
Philip

* Philip Molter
* DataFoundry.net
* http://www.datafoundry.net/
* philip@datafoundry.net

pgsql-general by date:

From: "Chuck Shunk"
Date: 05 July 2001, 12:58:07
Subject: Re: Remote queries

From: Tom Lane
Date: 05 July 2001, 14:24:15
Subject: Re: Changing optimizations

Re: Changing optimizations - Mailing list pgsql-general

Previous

Next