Re: New IndexAM API controlling index vacuum strategies - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: New IndexAM API controlling index vacuum strategies |
Date | |
Msg-id | CAH2-WzmkebqPd4MVGuPTOS9bMFvp9MDs5cRTCOsv1rQJ3jCbXw@mail.gmail.com Whole thread Raw |
In response to | Re: New IndexAM API controlling index vacuum strategies (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: New IndexAM API controlling index vacuum strategies
Re: New IndexAM API controlling index vacuum strategies |
List | pgsql-hackers |
On Fri, Jan 29, 2021 at 5:26 PM Peter Geoghegan <pg@bowt.ie> wrote: > It'll be essential to have good instrumentation as we do more > benchmarking. We're probably going to have to make subjective > assessments of benchmark results, based on multiple factors. That will > probably be the only practical way to assess how much better (or > worse) the patch is compared to master. This patch is more about > efficiency and predictability than performance per se. Which is good, > because that's where most of the real world problems actually are. I've been thinking about how to get this patch committed for PostgreSQL 14. This will probably require cutting scope, so that the initial commit is not so ambitious. I think that "incremental VACUUM" could easily take up a lot of my time for Postgres 15, and maybe even Postgres 16. I'm starting to think that the right short term goal should not directly involve bottom-up index deletion. We should instead return to the idea of "unifying" the vacuum_cleanup_index_scale_factor feature with the INDEX_CLEANUP feature, which is kind of where this whole idea started out at. This short term goal is much more than mere refactoring. It is still a whole new user-visible feature. The patch would teach VACUUM to skip doing any real index work within both ambulkdelete() and amvacuumcleanup() in many important cases. Here is a more detailed explanation: Today we can skip all significant work in ambulkdelete() and amvacuumcleanup() when there are zero dead tuples in the table. But why is the threshold *precisely* zero? If we could treat cases that have "practically zero" dead tuples in the same way (or almost the same way) as cases with *exactly* zero dead tuple, that's still a big improvement. And it still sets an important precedent that is crucial for the wider "incremental VACUUM" project: the criteria for triggering index vacuuming becomes truly "fuzzy" for the first time. It is "fuzzy" in the sense that index vacuuming might not happen during VACUUM at all now, even when the user didn't explicitly use VACUUUM's INDEX_CLEANUP option, and even when more than *precisely* zero dead index tuples are involved (though not *much* more than zero, can't be too aggressive). That really is a big change. A recap on vacuum_cleanup_index_scale_factor, just to avoid confusion: The reader should note that this is very different to Masahiko's vacuum_cleanup_index_scale_factor project, which skips *cleanup* in VACUUM (not bulk delete), a question which only comes up when there are definitely zero dead index tuples. The unifying work I'm talking about now implies that we completely avoid scanning indexes during vacuum, even when they are known to have at least a few dead index tuples, and even when VACUUM's INDEX_CLEANUP emergency option is not in use. Which, as I just said, is a big change. Thoughts on triggering criteria for new "unified" design, ~99.9% append-only tables: Actually, in *one* sense the difference between "precisely zero" and "practically zero" here *is* small. But it's still probably going to result in skipping reading indexes during VACUUM in many important cases. Like when you must VACUUM a table that is ~99.9% append-only. In the real world things are rarely in exact discrete categories, even when we imagine that they are. It's too easy to be wrong about one tiny detail -- like one tiny UPDATE from 4 weeks ago, perhaps. Having a tiny amount of "forgiveness" here is actually a *huge* improvement on having precisely zero forgiveness. Small and big. This should help cases that get big surprising spikes due to anti-wraparound vacuums that must vacuum indexes for the first time in ages -- indexes may be vacuumed despite only having a tiny absolute number of dead tuples. I don't think that it's necessary to treat anti-wraparound vacuums as special at all (not in Postgres 14 and probably not ever), because simply considering cases where the table has "practically zero" dead tuples alone should be enough. Vacuuming a 10GB index to delete only 10 tuples simply makes no sense. It doesn't necessarily matter how we end up there, it just shouldn't happen. The ~99.9% append-only table case is likely to be important and common in the real world. We should start there for Postgres 14 because it's easier, that's all. It's not fundamentally different to what happens in workloads involving lots of bottom-up deletion -- it's just simpler, and easier to reason about. Bottom-up deletion is an important piece of the big puzzle here, but some variant of "incremental VACUUM" really would still make sense in a world where bottom-up index deletion does not exist. (In fact, I started thinking about "incremental VACUUM" before bottom-up index deletion, and didn't make any connection between them until work on bottom-up deletion had already progressed significantly.) Here is how the triggering criteria could work: maybe skipping accessing all indexes during VACUUM happens when less than 1% or 10,000 of the items from the table are to be removed by VACUUM -- whichever is greater. Of course this is just the first thing I thought of. It's a starting point for further discussion. My concerns won't be a surprise to you, Masahiko, but I'll list them for the record. The bottom-up index deletion related complexity that I want to avoid dealing with for Postgres 14 is in the following areas (areas that Masahiko's patch dealt with): * No need to teach indexes to do the amvacuumstrategy() stuff in Postgres 14 -- so no need to worry about the exact criteria used within AMs like nbtree to determine whether or not index vacuuming seems appropriate from the "selfish" perspective of one particular index. I'm concerned that factors like bulk DELETEs, that may complicate things for the amvacuumstrategy() routine -- doing something relatively simple based on the recent growth of the index might have downsides. Balancing competing considerations is hard. * No need to change MaxHeapTuplesPerPage for now, since that only really makes sense in cases that heavily involve bottom-up deletion, where we care about the *concentration* of LP_DEAD line pointers in heap pages (and not just the absolute number in the entire table), which is qualitative, not quantitative (somewhat like bottom-up deletion). The change to MaxHeapTuplesPerPage that Masahiko has proposed does make sense -- there are good reasons to increase it. Of course there are also good reasons to not do so. I'm concerned that we won't have time to think through all the possible consequences. * Since "practically zero" dead tuples from a table still isn't very many, the risk of "leaking" many deleted pages due to a known issue with INDEX_CLEANUP in nbtree [1] is much less significant. (FWIW I doubt that skipping index vacuuming is the only way that we can fail to recycle deleted pages anyway -- the FSM is not crash safe, of course, plus I think that _bt_page_recyclable() might be broken in other ways.) In short: we can cut scope and de-risk the patch for Postgres 14 by following this plan, while still avoiding unnecessary index vacuuming within VACUUM in certain important cases. The high-level goal for this patch has always been to recognize that index vacuuming is basically wasted effort in certain cases. Cutting scope here merely means addressing the relatively easy cases first, where simple triggering logic will clearly be effective. I still strongly believe in "incremental VACUUM". What do you think of cutting scope like this for Postgres 14, Masahiko? Sorry to change my mind, but I had to see the prototype to come to this decision. [1] https://www.postgresql.org/message-id/CA+TgmoYD7Xpr1DWEWWXxiw4-WC1NBJf3Rb9D2QGpVYH9ejz9fA@mail.gmail.com -- Peter Geoghegan
pgsql-hackers by date: