Thread: analyze strangeness

analyze strangeness

From

Tim Allen

Date:

18 July 2001, 03:59:44

We are seeing what seems to me to be very peculiar behaviour. We have a
schema upgrade script that alters the schema of an existing production
database. One of the things we do is create two new indexes. The script
then immediately performs a vacuum analyze.

The problem is (or was) that this analyze didn't seem to work. Queries
performed thereafter would run slowly. Doing another vacuum analyze later
on would fix this, and queries would then perform well.

We have two approaches that fix this. The first was to just sleep for two
seconds between creating the indexes and doing the vacuum analyze. The
second was to perform an explicit checkpoint between index creation and
vacuum analyze. The second approach seems the most sound, the sleep
approach relies too much on coincidence. But both work in our tests so
far.

However, why is this so? Can analyze not work properly unless the data
files have all been fsynced to disk? Does the WAL really stop analyze from
working?

Even stranger, it turns out that doing the checkpoint _after_ the vacuum
analyze also fixes this behaviour, ie queries perform well
immediately. This part is _so_ strange that I'm tempted to just not
believe it ever happened... except that it seems it did.

Any insights? Is this expected behaviour? Can anyone explain why this is
happening? We have a workaround (checkpoint), so we're not too concerned,
but would like to understand what's going on.

Platform is PG7.1.2 on Red Hat Linux 6.2, x86.

Tim

-- 
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/ http://www4.tpg.com.au/users/rita_tim/

Re: analyze strangeness

From

Tom Lane

Date:

18 July 2001, 13:36:24

Tim Allen <tim@proximity.com.au> writes:
> The problem is (or was) that this analyze didn't seem to work. Queries
> performed thereafter would run slowly. Doing another vacuum analyze later
> on would fix this, and queries would then perform well.

This makes no sense to me, either.  Can you put together a
self-contained test case that demonstrates the problem?

One thing that would be useful is to compare the planner statistics
produced by the first and second vacuums.  To see the stats, do

select relname,relpages,reltuples from pg_class where
relname in ('tablename', 'indexname', ...);

(include each index on the table, as well as the table itself) and also

select attname,attdispersion,s.*
from pg_statistic s, pg_attribute a, pg_class c
where starelid = c.oid and attrelid = c.oid and staattnum = attnum
and relname = 'tablename';


> Even stranger, it turns out that doing the checkpoint _after_ the vacuum
> analyze also fixes this behaviour, ie queries perform well
> immediately.

I don't really believe that checkpoint has anything to do with it.
However, if the queries are being done in a different backend than the
one doing the vacuum, is it possible that the other backend is inside an
open transaction and does not see the catalog updates from the
later-starting vacuum transaction?
        regards, tom lane