Re: heapgetpage() and ->takenDuringRecovery - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: heapgetpage() and ->takenDuringRecovery |
Date | |
Msg-id | 20140303120729.GC23352@awork2.anarazel.de Whole thread Raw |
In response to | Re: heapgetpage() and ->takenDuringRecovery (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: heapgetpage() and ->takenDuringRecovery
|
List | pgsql-hackers |
On 2014-03-03 06:57:00 -0500, Robert Haas wrote: > On Sun, Mar 2, 2014 at 8:39 AM, Andres Freund <andres@2ndquadrant.com> wrote: > > While reading around which references to SnapshotData's members exist, I > > once more came about the following tidbit in heapgetpage(): > > /* > > * If the all-visible flag indicates that all tuples on the page are > > * visible to everyone, we can skip the per-tuple visibility tests. > > * > > * Note: In hot standby, a tuple that's already visible to all > > * transactions in the master might still be invisible to a read-only > > * transaction in the standby. We partly handle this problem by tracking > > * the minimum xmin of visible tuples as the cut-off XID while marking a > > * page all-visible on master and WAL log that along with the visibility > > * map SET operation. In hot standby, we wait for (or abort) all > > * transactions that can potentially may not see one or more tuples on the > > * page. That's how index-only scans work fine in hot standby. A crucial > > * difference between index-only scans and heap scans is that the > > * index-only scan completely relies on the visibility map where as heap > > * scan looks at the page-level PD_ALL_VISIBLE flag. We are not sure if > > * the page-level flag can be trusted in the same way, because it might > > * get propagated somehow without being explicitly WAL-logged, e.g. via a > > * full page write. Until we can prove that beyond doubt, let's check each > > * tuple for visibility the hard way. > > */ > > all_visible = PageIsAllVisible(dp) && !snapshot->takenDuringRecovery; > > > > I don't think this is neccessary >= 9.2. The are two only "interestings" place > > where PD_ALL_VISIBLE is set: > > a) lazy_vacuum_page() where a xl_heap_clean is logged *before* > > PD_ALL_VISIBLE/the vm is touched and that causes recovery > > conflicts. The heap page is locked for cleanup at that point. As the > > logging of xl_heap_clean sets the page's LSN there's no way the page > > can appear on the standby too early. > > b) empty pages in lazy_scan_heap(). If they always were empty, there's > > no need for conflicts. The only other way I can see to end up there > > is a previous heap_page_prune() that repaired fragmentation. But that > > logs a WAL record with conflict information. > > I don't think there's any reason to believe that lazy_scan_heap() can > only hit pages that are empty or have just been defragged. Suppose > that there's a tuple on the page which was recently inserted; the > inserting transaction has committed but there are some backends that > still have older snapshots. The page won't be marked all-visible > because it isn't. Now, eventually those older snapshots will go away, > and sometime after that the relation will get vacuumed again, and > we'll once again look the page. But this time we notice that it is > all-visible, and mark it so. Hm, right, that can happen. How about adding a LSN interlock in visibilitymap_set() for those cases as well, not just for checksums? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: