Re: Incomplete freezing when truncating a relation during vacuum - Mailing list pgsql-hackers
From: Andres Freund
Subject: Re: Incomplete freezing when truncating a relation during vacuum
Date:
Msg-id: 20131130160058.GB31100@awork2.anarazel.de
In response to: Re: Incomplete freezing when truncating a relation during vacuum (Noah Misch <noah@leadboat.com>)
Responses: Re: Incomplete freezing when truncating a relation during vacuum
           Re: Incomplete freezing when truncating a relation during vacuum
List: pgsql-hackers
Hi Noah,

On 2013-11-30 00:40:06 -0500, Noah Misch wrote:
> > > On Wed, Nov 27, 2013 at 02:14:53PM +0100, Andres Freund wrote:
> > > > With regard to fixing things up, ISTM the best bet is heap_prune_chain()
> > > > so far. That's executed by vacuum and by opportunistic pruning and we
> > > > know we have the appropriate locks there. Looks relatively easy to fix
> > > > up things there. Not sure if there are any possible routes to WAL log
> > > > this but using log_newpage()?
> > > >
> > > > I am really not sure what the best course of action is :(
>
> Based on subsequent thread discussion, the plan you outline sounds
> reasonable.  Here is a sketch of the specific semantics of that fixup.
> If a HEAPTUPLE_LIVE tuple has XIDs older than the current
> relfrozenxid/relminmxid of its relation or newer than
> ReadNewTransactionId()/ReadNextMultiXactId(), freeze those XIDs.  Do
> likewise for HEAPTUPLE_DELETE_IN_PROGRESS, ensuring a proper xmin if
> the in-progress deleter aborts.  Using log_newpage_buffer() seems fine;
> there's no need to optimize performance there.

We'd need to decide what to do with xmax values; they'd likely need to
be treated differently.

The problem with log_newpage_buffer() is that we'd quite possibly issue
one such call per item on a page, and that might become quite
expensive. Logging ~1.5MB per 8k page in the worst case sounds a bit
scary.

> (All the better if we can, with minimal hacks, convince
> heap_freeze_tuple() itself to log the right changes.)

That likely comes too late - we've already pruned the page and might
have made wrong decisions there. Also, heap_freeze_tuple() is run on
both the primary and standbys. I think our xl_heap_freeze format, which
relies on running heap_freeze_tuple() during recovery, is a terrible
idea, but we can't change that right now.

> Time is tight to finalize this, but it would be best to get this into
> next week's release.  That way, the announcement, fix, and mitigating
> code pertaining to this data loss bug all land in the same release.
> If necessary, I think it would be worth delaying the release, or
> issuing a new release a week or two later, to closely align those
> events.  That being said, I'm prepared to review a patch in this area
> over the weekend.

I don't think I currently have the energy/brainpower/time to develop
such a fix in suitable quality till Monday. I've worked pretty hard on
trying to fix the host of multixact data corruption bugs over the last
few days, and developing a solution that I'd be happy to put into such
critical paths is certainly several days' worth of work.

I am not sure it's a good idea to delay the release because of this;
there are so many other critical issues that that seems like a bad
tradeoff. That said, if somebody else is taking the lead, I am
certainly willing to help in detail with review and testing.
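For concreteness, a minimal, untested sketch of the xmin side of the
fixup Noah describes above could look roughly like the following
(fixup_tuple_xmin is a made-up name; xmax/multixact handling and the
actual WAL logging are left out entirely):

#include "postgres.h"

#include "access/htup_details.h"
#include "access/transam.h"
#include "utils/rel.h"

/*
 * Sketch only: freeze xmin of a HEAPTUPLE_LIVE tuple if it is older
 * than the relation's relfrozenxid or newer than the next transaction
 * id, i.e. outside the range that can still be looked up.  Returns
 * true if the tuple was modified, in which case the caller has to
 * mark the buffer dirty and WAL-log the page (e.g. with
 * log_newpage_buffer()).
 */
static bool
fixup_tuple_xmin(Relation rel, HeapTupleHeader tuple)
{
	TransactionId xmin = HeapTupleHeaderGetXmin(tuple);
	TransactionId relfrozenxid = rel->rd_rel->relfrozenxid;
	TransactionId nextxid = ReadNewTransactionId();

	/* already frozen, or one of the special xids */
	if (!TransactionIdIsNormal(xmin))
		return false;

	if (TransactionIdPrecedes(xmin, relfrozenxid) ||
		TransactionIdFollowsOrEquals(xmin, nextxid))
	{
		/* clog state for xmin may already be gone; freeze it */
		HeapTupleHeaderSetXmin(tuple, FrozenTransactionId);
		tuple->t_infomask &= ~HEAP_XMIN_INVALID;
		tuple->t_infomask |= HEAP_XMIN_COMMITTED;
		return true;
	}

	return false;
}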
Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services