Re: [PERFORM] encouraging index-only scans - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: [PERFORM] encouraging index-only scans |
Date | |
Msg-id | 20130907053449.GE626072@alap2.anarazel.de Whole thread Raw |
In response to | Re: [PERFORM] encouraging index-only scans (Bruce Momjian <bruce@momjian.us>) |
Responses |
Re: [PERFORM] encouraging index-only scans
Re: [PERFORM] encouraging index-only scans |
List | pgsql-hackers |
On 2013-09-06 20:29:08 -0400, Bruce Momjian wrote: > On Sat, Sep 7, 2013 at 12:26:23AM +0200, Andres Freund wrote: > > I wonder if we shouldn't trigger most vacuums (not analyze!) via unset > > fsm bits. Perhaps combined with keeping track of RecentGlobalXmin to > > Fsm bits? FSM tracks the free space on each page. How does that > help? Err. I was way too tired when I wrote that. vm bits. > > make sure we're not repeatedly checking for work that cannot yet be > > done. > The idea of using RecentGlobalXmin to see how much _work_ has happened > since the last vacuum is interesting, but it doesn't handle read-only > transactions; I am not sure how they can be tracked. You make a good > point that 5 minutes passing is meaningless --- you really want to know > how many transactions have completed. So, what I was pondering went slightly into a different direction: (lets ignore anti wraparound vacuum for now) Currently we trigger autovacuums by the assumed number of dead tuples. In the course of it's action it usually will find that it cannot remove all dead rows and that it cannot mark everything as all visible. That's because the xmin horizon hasn't advanced far enough. We won't trigger another vacuum after that unless there are further dead tuples in the relation... One trick if we want to overcome that problem and that we do not handle setting all visible nicely for INSERT only workloads would be to trigger vacuum by the amount of pages that are not marked all visible in the vm. The problem there is that repeatedly scanning a relation that's only 50% visible where the rest cannot be marked all visible because of a longrunning pg_dump obivously isn't a good idea. So we need something to notify us when there's work to be done. Using elapsed time seems like a bad idea because it doesn't adapt to changing workloads very well and doesn't work nicely for different relations. What I was thinking of was to keep track of the oldest xids on pages that cannot be marked all visible. I haven't thought about the statistics part much, but what if we binned the space between [RecentGlobalXmin, ->nextXid) into 10 bins and counted the number of pages falling into each bin. Then after the vacuum finished we could compute how far RecentGlobalXmin would have to progress to make another vacuum worthwile by counting the number of pages from the lowest bin upwards and use the bin's upper limit as the triggering xid. Now, we'd definitely need to amend that scheme by something that handles pages that are newly written to, but it seems something like that wouldn't be too hard to implement and would make autovacuum more useful. > Unfortunately, our virtual transactions make that hard to compute. I don't think they pose too much of a complexity. We basically only have to care about PGXACT->xmin here and virtual transactions don't change the handling of that ... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: