Re: Set visibility map bit after HOT prune - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Set visibility map bit after HOT prune |
Date | |
Msg-id | CA+TgmoYQ1FsDrZ6mB95KDGM+9YKa3QRvY9kMyTHe19qMAmbJqA@mail.gmail.com |
In response to | Re: Set visibility map bit after HOT prune (Pavan Deolasee <pavan.deolasee@gmail.com>) |
Responses |
Re: Set visibility map bit after HOT prune
|
List | pgsql-hackers |
On Thu, Dec 20, 2012 at 11:49 AM, Pavan Deolasee <pavan.deolasee@gmail.com> wrote:
> I wonder if we should add a flag to heap_page_prune and try to do some
> additional work if its being called from lazy vacuum such as setting
> the VM bit and the tuple freeze. IIRC I had put something like that in
> the early patches, but then ripped of for simplicity. May be its time
> to play with that again.

That seems unlikely to be a good trade-off. If VACUUM is going to do
extra stuff, it's better to have that in the vacuum-specific code,
rather than in code that is also traversed from other places.
Otherwise the conditional logic might impose a penalty on people who
aren't taking those branches.

>> IMHO, it's probably fairly hopeless to make a pure pgbench workload
>> show a benefit from index-only scans. A large table under a very
>> heavy, completely random write workload is just about the worst
>> possible case for index-only scans. Index-only scans are a way of
>> avoiding unnecessary visibility checks when the target data hasn't
>> changed recently, not a magic bullet to escape all heap access. If
>> the target data has changed, you're going to have to touch the heap.
>
> Not always. Not clearing the VM bit at HOT update is one such idea we
> discussed. Of course, there are open issues with that, but they are
> not unsolvable. The advantage of not touching heap is just too big to
> ignore.

I don't really agree. Sure, not touching the heap is nice, but mostly
because you avoid pulling pages into shared_buffers that aren't
otherwise needed. IIRC, an index-only scan isn't faster than an index
scan if all the necessary table and index pages are already cached.
Touching already-resident pages just isn't that expensive. And of
course, if a page has recently suffered an insert, update, or delete,
it is more likely to be resident. You can construct access patterns
where this isn't so - e.g. update the page, wait for it to get paged
out, and then SELECT from it with an index-only scan, wait for it to
get paged out again, etc. - but I'm not sure how much of a problem
that is in the real world.

>> And while I agree that we aren't aggressive enough in setting the VM
>> bits right now, I also think it wouldn't be too hard to go too far in
>> the opposite direction: we could easily spend more effort trying to
>> make index-only scans effective than we could ever hope to recoup from
>> the scans themselves.
>
> I agree. I also started having that worry. We are at one extreme right
> now and it might not help to get to the other extreme. Looks like I'm
> coming along the idea of somehow detecting if the scan is happening on
> the result relation of a ModifyTable and avoid setting VM bit in that
> case.

It's unclear to me that that's the right way to slice it. There are
several different sets of concerns here: (1) avoiding setting the
all-visible bit when it'll be cleared again just after, (2) avoiding
slowing down SELECT with hot-pruning, and (3) avoiding slowing down
repeated SELECTs by NOT having the first one do HOT-pruning. And
maybe others. The right thing to do depends on which problems you
think are relatively more important. That question might not even
have one right answer, but even if it does we don't have consensus on
what it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company