Thread: Re: [COMMITTERS] pgsql: Add the "snapshot too old" feature
Kevin Grittner wrote: > Add the "snapshot too old" feature > src/backend/access/gin/ginbtree.c | 9 +- > src/backend/access/gin/gindatapage.c | 7 +- > src/backend/access/gin/ginget.c | 22 +- > src/backend/access/gin/gininsert.c | 2 +- I'm wondering about the TestForOldSnapshot call in scanPendingInsert(). Why do we apply it to the metapage buffer (line 1717 in master)? Shouldn't we apply it to the pending-list pages themselves only, if any? (If there are no pending-list pages, surely the age of the snapshot used to read the metapage doesn't matter; and if there are, then the age of the pending-list pages will fail the test.) FWIW I like the "revert" commit, because it easily shows me in what places you considered a snapshot-too-old test and decided not to add one. Bare BufferGetPage calls (the current situation) don't indicate that. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Aug 25, 2016 at 2:56 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Kevin Grittner wrote: >> Add the "snapshot too old" feature > >> src/backend/access/gin/ginbtree.c | 9 +- >> src/backend/access/gin/gindatapage.c | 7 +- >> src/backend/access/gin/ginget.c | 22 +- >> src/backend/access/gin/gininsert.c | 2 +- > > I'm wondering about the TestForOldSnapshot call in scanPendingInsert(). > Why do we apply it to the metapage buffer (line 1717 in master)? > Shouldn't we apply it to the pending-list pages themselves only, if any? > (If there are no pending-list pages, surely the age of the snapshot used > to read the metapage doesn't matter; and if there are, then the age of > the pending-list pages will fail the test.) > > > FWIW I like the "revert" commit, because it easily shows me in what > places you considered a snapshot-too-old test and decided not to add > one. Bare BufferGetPage calls (the current situation) don't indicate that. What about the state after pending-list entries are applied? Would the change you suggest still run across a page with an appropriate LSN? I will go review what I did in the gin AM, but wanted to put that question out there first... -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Aug 25, 2016 at 2:56 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I'm wondering about the TestForOldSnapshot call in scanPendingInsert(). > Why do we apply it to the metapage buffer (line 1717 in master)? If there is any chance that GinPageGetMeta(page)->head could have changed from a valid block number to InvalidBlockNumber or a different pending-list page due to a vacuum freeing pages from the pending-list, the metapage must be checked -- there is no other way to detect that data might have disappeared. > Shouldn't we apply it to the pending-list pages themselves only, if any? If the pending-list pages are never freed, we could drop the metapage test. > (If there are no pending-list pages, surely the age of the snapshot used > to read the metapage doesn't matter; and if there are, then the age of > the pending-list pages will fail the test.) True as long as no pending-list page is ever removed from the pending-list. If we take out the test, it might be worth a comment explaining why it's not needed and that it will be needed if someone modifies the AM to recycle empty pages from the list for reuse. > FWIW I like the "revert" commit, because it easily shows me in what > places you considered a snapshot-too-old test and decided not to add > one. Bare BufferGetPage calls (the current situation) don't indicate that. I'm glad there is some value from having done that little dance. :-) Thanks for looking this over so closely! -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kevin Grittner wrote: > On Thu, Aug 25, 2016 at 2:56 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > > I'm wondering about the TestForOldSnapshot call in scanPendingInsert(). > > Why do we apply it to the metapage buffer (line 1717 in master)? > > If there is any chance that GinPageGetMeta(page)->head could have > changed from a valid block number to InvalidBlockNumber or a > different pending-list page due to a vacuum freeing pages from the > pending-list, the metapage must be checked -- there is no other way > to detect that data might have disappeared. Hmm ... but the disappearing data is moved to regular GIN pages, right? It doesn't just go away. I suppose that the error would be raised as soon as we scan a GIN data page that, because of receiving data from the pending list, has a newer LSN. I don't know GIN in detail but perhaps it's possible that the data is inserted into data pages in lsn A, then removed from the pending list in lsn B (where A < B). If the snapshot's LSN lies in between, a spurious error would be raised. > > FWIW I like the "revert" commit, because it easily shows me in what > > places you considered a snapshot-too-old test and decided not to add > > one. Bare BufferGetPage calls (the current situation) don't indicate that. > > I'm glad there is some value from having done that little dance. :-) I'm sure that was an annoying thing to do, so I agree. > Thanks for looking this over so closely! You're welcome. I'm not actually reviewing this patch specifically, just trying to figure out a different new feature and reading a few AMs code while at it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Aug 25, 2016 at 4:51 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Kevin Grittner wrote: >> On Thu, Aug 25, 2016 at 2:56 PM, Alvaro Herrera >> <alvherre@2ndquadrant.com> wrote: >> >> > I'm wondering about the TestForOldSnapshot call in scanPendingInsert(). >> > Why do we apply it to the metapage buffer (line 1717 in master)? >> >> If there is any chance that GinPageGetMeta(page)->head could have >> changed from a valid block number to InvalidBlockNumber or a >> different pending-list page due to a vacuum freeing pages from the >> pending-list, the metapage must be checked -- there is no other way >> to detect that data might have disappeared. > > Hmm ... but the disappearing data is moved to regular GIN pages, right? > It doesn't just go away. I suppose that the error would be raised as > soon as we scan a GIN data page that, because of receiving data from the > pending list, has a newer LSN. That would cover things as long as data is always moved from the pending list to a data page before it is vacuumed away. If that is true, there is no need to check the metapage *or* the pending list -- but I'm pretty skeptical that this is the case. The real question is whether pages are ever removed from the pending list. > I don't know GIN in detail but perhaps > it's possible that the data is inserted into data pages in lsn A, then > removed from the pending list in lsn B (where A < B). If the snapshot's > LSN lies in between, a spurious error would be raised. The implementation does allow false positives and slightly less aggressive early cleanup than could be achieved -- in order to avoid the extra locking that would be needed to achieve higher precision. Since I expect that the threshold will usually be set to at least a couple hours, these effects should have minimal impact. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company