Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers
| From | Amit Kapila |
|---|---|
| Subject | Re: [HACKERS] GUC for cleanup indexes threshold. |
| Date | |
| Msg-id | CAA4eK1+geeoXzGe516S=qaU-2+ZHvnZxVwBizXZp8dSFSqei-A@mail.gmail.com |
| In response to | Re: [HACKERS] GUC for cleanup indexes threshold. (Peter Geoghegan <pg@bowt.ie>) |
| Responses | Re: [HACKERS] GUC for cleanup indexes threshold. |
| List | pgsql-hackers |
On Sat, Mar 4, 2017 at 5:59 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Fri, Mar 3, 2017 at 2:41 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> In other words, the number of B-Tree pages that the last VACUUM
>> deleted, and thus made eligible to recycle by the next VACUUM, has no
>> relationship with the number of pages the next VACUUM will itself end
>> up deleting, in general, or how long it will be before that next
>> VACUUM comes, if it comes at all, or anything else that seems at all
>> relevant.
>
> This raises another question, though: Why have this GUC at all? Why
> use *any* threshold that is to be compared against the number of heap
> pages that were processed by VACUUM this time?
>
> B-Tree page deletion isn't really part of the ordinary life cycle of a
> B-Tree index. In order for that to be the case, somebody would have to
> delete large ranges of indexed values (typically hundreds of logically
> contiguous values -- no gaps), without anyone else ever inserting new
> tuples that are in the same range before the next VACUUM. It's very
> unlikely that this would happen again and again in the real world. So,
> even if we never freeze, the number of B-Tree pages that we delete
> when we VACUUM today is generally a useless predictor of how many will
> be deleted by a VACUUM that occurs tomorrow. This is true despite the
> fact that the number of dead heap tuples is probably almost identical
> for each VACUUM (or the number of heap pages that end up being
> processed by VACUUM, if you prefer).
>
> Barring any concerns about crash safety, we can be completely certain
> that any "recycling-orientated B-Tree VACUUM" (a btvacuumcleanup()
> call to btvacuumscan(), which happens because there are no tuples in
> the index to kill) will end up recycling however many pages the last
> VACUUM managed to delete, which is a precisely knowable number (or
> could be made knowable if we stashed that number somewhere, like the
> meta-page). It will typically only take seconds or minutes after the
> VACUUM finishes for its RecentGlobalXmin interlock to stop being a
> problem (that is, for _bt_page_recyclable() to return "true" for any
> pages that that VACUUM deleted). From that point on, those deleted
> pages are "money in the bank" for the FSM. The only reason why we'd
> want to tie "the FSM withdrawing that money" to VACUUM is because that
> might be needed to clean up regular bloat anyway.
>
> The test performed by this patch within lazy_scan_heap(), to determine
> whether we should avoid calling lazy_cleanup_index(), would therefore
> look like this, ideally: Do I want to go to the trouble of scanning
> this index (a cost that is proportionate to the size of the index) in
> order to recycle this number of known-deleted pages (a benefit that is
> proportionate to that number)? (I still think that the important thing
> is that we don't let the number of unclaimed-by-FSM recyclable pages
> grow forever, though.)
>

You are right that we don't want the number of unclaimed-by-FSM
recyclable pages to grow forever, but I think that won't happen with
this patch. As soon as there are more deletions (in the heap), the
pages will be reclaimed by lazy_vacuum_index() in the next vacuum
cycle.

> (Thinks about it some more...)
>
> Unfortunately, I just saw a whole new problem with this patch:
> _bt_page_recyclable() is the one place in the B-Tree AM where we stash
> an XID.
>

Can you be more specific about which code you are referring to here?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
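[Editor's note: for readers following along, the RecentGlobalXmin interlock Peter describes can be sketched in standalone C. This is a simplified model, not the PostgreSQL source: `DeletedPage`, `page_recyclable()`, and `transaction_id_precedes()` are hypothetical stand-ins for nbtree's page opaque data (`opaque->btpo.xact`), `_bt_page_recyclable()`, and `TransactionIdPrecedes()` respectively. The idea modeled is that VACUUM stashes the then-current XID in each page it deletes, and the page only becomes recyclable once that XID precedes RecentGlobalXmin, i.e. once no running transaction could still hold a link to it.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PostgreSQL transaction IDs are 32-bit and compared modulo 2^32;
 * this comparison mirrors that wraparound-aware style. */
typedef uint32_t TransactionId;

static bool
transaction_id_precedes(TransactionId a, TransactionId b)
{
    int32_t diff = (int32_t) (a - b);
    return diff < 0;
}

/* Hypothetical stand-in for a deleted B-Tree page's special area:
 * the XID that was current when VACUUM deleted the page (the value
 * the thread calls "the one place in the B-Tree AM where we stash
 * an XID"). */
typedef struct DeletedPage
{
    bool          is_deleted;
    TransactionId deletion_xid;
} DeletedPage;

/*
 * Sketch of the recyclability test: a deleted page may be reused only
 * after its stashed XID falls behind RecentGlobalXmin, meaning no
 * snapshot that could still follow a link to the page remains.
 */
static bool
page_recyclable(const DeletedPage *page, TransactionId recent_global_xmin)
{
    return page->is_deleted &&
           transaction_id_precedes(page->deletion_xid, recent_global_xmin);
}
```

Once RecentGlobalXmin advances past the stashed XID (typically seconds or minutes after the deleting VACUUM finishes, as the mail notes), the page is "money in the bank" for the FSM.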