Thread: analyze strangeness
We are seeing what seems to me to be very peculiar behaviour. We have a schema upgrade script that alters the schema of an existing production database. One of the things we do is create two new indexes. The script then immediately performs a vacuum analyze. The problem is (or was) that this analyze didn't seem to work. Queries performed thereafter would run slowly. Doing another vacuum analyze later on would fix this, and queries would then perform well. We have two approaches that fix this. The first was to just sleep for two seconds between creating the indexes and doing the vacuum analyze. The second was to perform an explicit checkpoint between index creation and vacuum analyze. The second approach seems the most sound, the sleep approach relies too much on coincidence. But both work in our tests so far. However, why is this so? Can analyze not work properly unless the data files have all been fsynced to disk? Does the WAL really stop analyze from working? Even stranger, it turns out that doing the checkpoint _after_ the vacuum analyze also fixes this behaviour, ie queries perform well immediately. This part is _so_ strange that I'm tempted to just not believe it ever happened... except that it seems it did. Any insights? Is this expected behaviour? Can anyone explain why this is happening? We have a workaround (checkpoint), so we're not too concerned, but would like to understand what's going on. Platform is PG7.1.2 on Red Hat Linux 6.2, x86. Tim -- ----------------------------------------------- Tim Allen tim@proximity.com.au Proximity Pty Ltd http://www.proximity.com.au/ http://www4.tpg.com.au/users/rita_tim/
Tim Allen <tim@proximity.com.au> writes: > The problem is (or was) that this analyze didn't seem to work. Queries > performed thereafter would run slowly. Doing another vacuum analyze later > on would fix this, and queries would then perform well. This makes no sense to me, either. Can you put together a self-contained test case that demonstrates the problem? One thing that would be useful is to compare the planner statistics produced by the first and second vacuums. To see the stats, do select relname,relpages,reltuples from pg_class where relname in ('tablename', 'indexname', ...); (include each index on the table, as well as the table itself) and also select attname,attdispersion,s.* from pg_statistic s, pg_attribute a, pg_class c where starelid = c.oid and attrelid = c.oid and staattnum = attnum and relname = 'tablename'; > Even stranger, it turns out that doing the checkpoint _after_ the vacuum > analyze also fixes this behaviour, ie queries perform well > immediately. I don't really believe that checkpoint has anything to do with it. However, if the queries are being done in a different backend than the one doing the vacuum, is it possible that the other backend is inside an open transaction and does not see the catalog updates from the later-starting vacuum transaction? regards, tom lane