Re: Lowering the ever-growing heap->pd_lower - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Lowering the ever-growing heap->pd_lower |
Date | |
Msg-id | 20220408210601.ny6maqqnfdipq7tx@alap3.anarazel.de Whole thread Raw |
In response to | Re: Lowering the ever-growing heap->pd_lower (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Lowering the ever-growing heap->pd_lower
|
List | pgsql-hackers |
Hi, On 2022-04-08 09:17:40 -0400, Robert Haas wrote: > I agree that the value of 291 is pretty much accidental, but it also > seems fairly generous to me. The bigger you make it, the more space > you can waste. I must have missed (or failed to understand) previous > discussions about why raising it would be a good idea. It's not hard to hit scenarios where pages are effectively unusable, because they have close to 291 dead items, without autovacuum triggering (or autovacuum just taking a while). You basically just need updates / deletes to concentrate in a certain range of the table and have indexing that prevents HOT updates. Because the overall percentage of dead tuples is low, no autovacuum is triggered, yet a range of the table contains little but dead items. At which point you basically waste 7k bytes (1164 bytes for dead items IIRC) until a vacuum finally kicks in - way more than what what you'd waste if the number of line items were limited at e.g. 2 x MaxHeapTuplesPerPage This has become a bit more pronounced with vacuum skipping index cleanup when there's "just a few" dead items - if all your updates concentrate in a small region, 2% of the whole relation size isn't actually that small. I wonder if we could reduce the real-world space wastage of the line pointer array, if we changed the the logic about which OffsetNumbers to use during inserts / updates and and made a few tweaks to to pruning. 1) It's kind of OK for heap-only tuples to get a high OffsetNumber - we can reclaim them during pruning once they're dead. They don't leave behind a dead item that's unreclaimable until the next vacuum with an index cleanup pass. 2) Arguably the OffsetNumber of a redirect target can be changed. It might break careless uses of WHERE ctid = ... though (which likely are already broken, just harder to hit). These leads me to a few potential improvements: a) heap_page_prune_prune() should take the number of used items into account when deciding whether to prune. Right now we trigger hot pruning based on the number of items only if PageGetMaxOffsetNumber(page) >= MaxHeapTuplesPerPage. But because it requires a vacuum to reclaim an ItemId used for a root tuple, we should trigger HOT pruning when it might lower which OffsetNumber get used. b) heap_page_prune_prune() should be triggered in more paths. E.g. when inserting / updating, we should prune if it allows us to avoid using a high OffsetNumber. c) What if we left some percentage of ItemIds unused, when looking for the OffsetNumber of a new HOT row version? That'd make it more likely for non-HOT updates and inserts to fit onto the page, without permanently increasing the size of the line pointer array. d) If we think 2) is acceptable, we could move the targets of redirects to make space for new root tuples, without increasing the permanent size of the line pointer array. Crazy? Greetings, Andres Freund
pgsql-hackers by date: