Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: New IndexAM API controlling index vacuum strategies |
Date | |
Msg-id | CAH2-WzkeOSYwC6KNckbhk2b1aNnWum6Yyn0NKP9D-Hq1LGTDPw@mail.gmail.com Whole thread Raw |
In response to | Re: New IndexAM API controlling index vacuum strategies (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: New IndexAM API controlling index vacuum strategies
|
List | pgsql-hackers |
On Wed, Mar 24, 2021 at 6:05 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > I've attached the updated patch set (nothing changed in 0001 and 0002 patch). Attached is v7, which takes the last two patches from your v6 and rebases them on top of my recent work. This includes handling index cleanup consistently in the event of an emergency. We want to do index cleanup when the optimization case works out. It would be arbitrary to not do index cleanup because there was 1 dead tuple instead of 0. Plus index cleanup is actually useful in some index AMs. At the same time, it seems like a bad idea to do cleanup in an emergency case. Note that this includes the case where the new wraparound mechanism kicks in, as well as the case where INDEX_CLEANUP = off. In general INDEX_CLEANUP = off should be 100% equivalent to the emergency mechanism, except that the decision is made dynamically instead of statically. The two patches that you have been working on are combined into one patch in v7 -- the last patch in the series. Maintaining two separate patches there doesn't seem that useful. The main change I've made in v7 is structural. There is a new patch in the series, which is now the first. It adds more variables to the top-level state variable used by VACUUM. We shouldn't have to pass the same "onerel" variable and other similar variables to so many similar functions. Plus we shouldn't rely on global state so much. That makes the code a lot easier to understand. Another change that appears in the first patch concerns parallel VACUUM, and how it is structured. It is hard to know which function concerned parallel VACUUM and which is broader than that right now. It makes it seriously hard to follow at times. So I have consolidated those functions, and given them less generic, more descriptive names. (In general it should be possible to read most of the code in vacuumlazy.c without thinking about parallel VACUUM in particular.) I had many problems with existing function arguments that look like this: IndexBulkDeleteResult **stats // this is a pointer to a pointer to a IndexBulkDeleteResult. Sometimes this exact spelling indicates: 1. "This is one particular index's stats -- this function will have the index AM set the statistics during ambulkdelete() and/or amvacuumcleanup()". But at other times/with other function arguments, it indicates: 2. "Array of stats, once for each of the heap relation's indexes". I found the fact that both 1 and 2 appear together side by side very confusing. It is much clearer with 0001-*, though. It establishes a consistent singular vs plural variable naming convention. It also no longer uses IndexBulkDeleteResult ** args for case 1 -- even the C type system ambiguity is avoided. Any thoughts on my approach to this? Another change in v7: We now stop applying any cost-based delay that may be in force if and when we abandon index vacuuming to finish off the VACUUM operation. Robert thought that that was important, and I agree. I think that it's 100% justified, because this is a true emergency. When the emergency mechanism (though not INDEX_CLEANUP=off) actually kicks in, we now also have a scary WARNING. Since this behavior only occurs when the system is definitely at very real risk of becoming unavailable, we can justify practically any intervention that makes it less likely that the system will become 100% unavailable (except for anything that creates additional risk of data loss). BTW, spotted this compiler warning in v6: /code/postgresql/patch/build/../source/src/backend/access/heap/vacuumlazy.c: In function ‘check_index_vacuum_xid_limit’: /code/postgresql/patch/build/../source/src/backend/access/heap/vacuumlazy.c:2314:6: warning: variable ‘effective_multixact_freeze_max_age’ set but not used [-Wunused-but-set-variable] 2314 | int effective_multixact_freeze_max_age; | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ I think that it's just a leftover chunk of code. The variable in question ('effective_multixact_freeze_max_age') does not appear in v7, in any case. BTW I moved this function into vacuum.c, next to vacuum_set_xid_limits() -- that seemed like a better place for it. But please check this yourself. > Regarding "auto" option, I think it would be a good start to enable > the index vacuum skipping behavior by default instead of adding “auto” > mode. That is, we could skip index vacuuming if INDEX_CLEANUP ON. With > 0003 and 0004 patch, there are two cases where we skip index > vacuuming: the garbage on heap is very concentrated and the table is > at risk of XID wraparound. It seems to make sense to have both > behaviors by default. I agree. Adding a new "auto" option now seems to me to be unnecessary complexity. Besides, switching a boolean reloption to a bool-like enum reloption may have subtle problems. > If we want to have a way to force doing index > vacuuming, we can add “force” option instead of adding “auto” option > and having “on” mode force doing index vacuuming. It's hard to imagine anybody using the "force option". Let's not have one. Let's not change the fact that "INDEX_CLEANUP = on" means "default index vacuuming behavior". Let's just change the index vacuuming behavior. If we get the details wrong, a simple reloption will make little difference. We're already being fairly conservative in terms of the skipping behavior, including with the SKIP_VACUUM_PAGES_RATIO. > Also regarding new GUC parameters, vacuum_skip_index_age and > vacuum_multixact_skip_index_age, those are not autovacuum-dedicated > parameter. VACUUM command also uses those parameters to skip index > vacuuming dynamically. In such an emergency case, it seems appropriate > to me to skip index vacuuming even in VACUUM command. And I don’t add > any reloption for those two parameters. Since those parameters are > unlikely to be changed from the default value, I think it don’t > necessarily need to provide a way for per-table configuration. +1 for all that. We already have a reloption for this behavior, more or less -- it's called INDEX_CLEANUP. The existing autovacuum_freeze_max_age GUC (which is highly related to your new GUCs) is both an autovacuum GUC, and somehow also not an autovacuum GUC at the same time. The apparent contradiction only seems to resolve itself when you consider the perspective of DBAs and the perspective of Postgres hackers separately. *Every* VACUUM GUC is an autovacuum GUC when you know for sure that the relfrozenxid is 1 billion+ XIDs in the past. > + bool skipping; > Can we flip the boolean? I mean to use a positive form such as > "do_vacuum". It seems to be more readable especially for the changes > made in 0003 and 0004 patches. I agree that it's clearer that way around. The code is structured this way in v7. Specifically, there are now both do_index_vacuuming and do_index_cleanup in the per-VACUUM state struct in patch 0002-*. These are a direct replacement for useindex. (Though we only start caring about the do_index_cleanup field in the final patch, 0004-*) Thanks -- Peter Geoghegan
Attachment
pgsql-hackers by date: