Thread: Remaining VACUUM patches
There are two additional patches in the VACUUM code. One is Heikki's patch to recalculate OldestXmin in the vacuum run.

http://groups.google.es/group/pgsql.patches/browse_thread/thread/b2cfc901534d8990/40ba5b2fbb8f5b91
(much nicer than our archives because the whole thread is there, not just month-sized pieces).

That thread ended without any conclusion; it is said that the patch will be reconsidered when Simon Riggs' patch about the WAL flushing bug lands, but I don't know which patch that is. Is it in the patch queue? Was it already applied?

The problem with the patch is that the DBT-2 test showed decreased performance, but that was still under investigation.

What is the status of this?

The other patch was ITAGAKI Takahiro's patch to fix n_dead_tuples in pgstats after VACUUM when there is concurrent update activity. This patch is still on hold, largely because the above patch would make it somewhat obsolete. So I think if we're not going to apply the former, we should apply this one.

http://archives.postgresql.org/pgsql-hackers/2007-02/msg00051.php
http://archives.postgresql.org/pgsql-patches/2007-02/msg00021.php

Comments?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote:
> There are two additional patches in the VACUUM code. One is Heikki's
> patch to recalculate OldestXmin in the vacuum run.
>
> http://groups.google.es/group/pgsql.patches/browse_thread/thread/b2cfc901534d8990/40ba5b2fbb8f5b91
> (much nicer than our archives because the whole thread is there, not
> just month-sized pieces).
>
> That thread ended without any conclusion; it is said that the patch will
> be reconsidered when Simon Riggs' patch about the WAL flushing bug
> lands, but I don't know which patch that is. Is it in the patch queue?
> Was it already applied?

It's in the patch queue, not applied. It's the one with the title "Bug: Buffer cache is not scan resistant":

http://momjian.us/mhonarc/patches/msg00048.html

> The problem with the patch is that the DBT-2 test showed decreased
> performance, but that was still under investigation.
>
> What is the status of this?

The plan is that I'll rerun the DBT-2 test after the above patch is applied. After that we'll decide whether we want the OldestXmin patch or not.

> The other patch was ITAGAKI Takahiro's patch to fix n_dead_tuples in
> pgstats after VACUUM when there is concurrent update activity. This
> patch is still on hold largely because the above patch would cause it to
> be a bit obsolete. So I think if we're not going to apply the former,
> we should apply this one.

I'd like to have the "buffer cache is not scan resistant" patch reviewed first to get the ball rolling on these other patches. The vacuum-related patches are just small tweaks, and they don't conflict with any of the bigger patches in the queue, so there's no reason to rush them.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Alvaro Herrera <alvherre@commandprompt.com> writes:
> The other patch was ITAGAKI Takahiro's patch to fix n_dead_tuples in
> pgstats after VACUUM when there is concurrent update activity. This
> patch is still on hold largely because the above patch would cause it to
> be a bit obsolete.

I objected (and still object) to this patch because it allows n_dead_tuples to drift arbitrarily far away from reality --- a series of vacuums will incrementally update it using probably-inaccurate deltas, and there's nothing to ensure that the result converges rather than diverging. In the real world it will result in n_dead_tuples becoming less accurate, not more so.

There was some discussion about better ways to do it, IIRC, but no new patch has been submitted.

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > The other patch was ITAGAKI Takahiro's patch to fix n_dead_tuples in
> > pgstats after VACUUM when there is concurrent update activity.
>
> I objected (and still object) to this patch because it allows
> n_dead_tuples to drift arbitrarily far away from reality
> There was some discussion about better ways to do it, IIRC, but no new
> patch has been submitted.

I wrote the patch *after* the discussion (and it is still valid with some hunks). It sets n_dead_tuples as follows:

                           | n_dead_tuples
---------------------------+---------------
(1) At the start of vacuum | N
(2) At the end of vacuum   | M (>=N)
(3) After updating stats   | M - N

So if we don't update the table during vacuum, n_dead_tuples will definitely be zero. Even if there are some updates with inaccurate stats during a vacuum, only the errors generated in that vacuum are left. Errors generated before the vacuum are completely cleared, so the formula does not enlarge the inaccuracy.

I've waited for the completion of the "Recalculating OldestXmin in a long-running vacuum" patch, because it changes the accuracy of (3). But without the recalculating patch, I have no plan to modify my n_dead_tuples patch any further.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
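
The M - N rule in the table above can be illustrated with a small Python sketch. This is only a toy model of the arithmetic, not the patch or any PostgreSQL code: the class `DeadTupleCounter` and its methods are hypothetical names invented for this example, and the counter is assumed to be a plain integer that concurrent updates increment.

```python
class DeadTupleCounter:
    """Toy stand-in for a per-table n_dead_tuples statistic (hypothetical)."""

    def __init__(self, n_dead_tuples=0):
        self.n_dead_tuples = n_dead_tuples
        self._at_vacuum_start = None

    def report_dead(self, count):
        # Concurrent UPDATE/DELETE activity reports newly dead tuples.
        self.n_dead_tuples += count

    def vacuum_start(self):
        # (1) Remember N, the counter value when vacuum begins.
        self._at_vacuum_start = self.n_dead_tuples

    def vacuum_end(self):
        # (2) M is the counter value when vacuum finishes (M >= N).
        m = self.n_dead_tuples
        # (3) Keep only what was reported while vacuum was running.
        self.n_dead_tuples = m - self._at_vacuum_start
        self._at_vacuum_start = None
        return self.n_dead_tuples


# No concurrent updates: the counter is reset to zero.
quiet = DeadTupleCounter(n_dead_tuples=500)
quiet.vacuum_start()
quiet.vacuum_end()

# 40 tuples reported dead during vacuum: only those 40 remain counted.
busy = DeadTupleCounter(n_dead_tuples=500)
busy.vacuum_start()
busy.report_dead(40)
busy.vacuum_end()

# A badly wrong starting value does not carry over past the vacuum.
wrong = DeadTupleCounter(n_dead_tuples=9999)
wrong.vacuum_start()
wrong.report_dead(40)
wrong.vacuum_end()
```

In this model the value left after vacuum depends only on what was reported between `vacuum_start()` and `vacuum_end()`, which is the sense in which errors from before the vacuum are cleared. The sketch does not model how accurate those in-vacuum reports are, which is the part the OldestXmin recalculation patch would affect.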