Re: snapshot too old, configured by time - Mailing list pgsql-hackers

From | Kevin Grittner
---|---
Subject | Re: snapshot too old, configured by time
Msg-id | CACjxUsPosBXyaKFaT08c0J_Zx5bqt1iNg9prKRAn7=xSc9kOEw@mail.gmail.com
In response to | Re: snapshot too old, configured by time (Peter Geoghegan <pg@heroku.com>)
List | pgsql-hackers
On Mon, Apr 4, 2016 at 9:15 PM, Peter Geoghegan <pg@heroku.com> wrote:

> The patch does this:
>
>> --- a/src/backend/access/heap/pruneheap.c
>> +++ b/src/backend/access/heap/pruneheap.c
>> @@ -92,12 +92,21 @@ heap_page_prune_opt(Relation relation, Buffer buffer)
>>   * need to use the horizon that includes slots, otherwise the data-only
>>   * horizon can be used.  Note that the toast relation of user defined
>>   * relations are *not* considered catalog relations.
>> + *
>> + * It is OK to apply the old snapshot limit before acquiring the cleanup
>> + * lock because the worst that can happen is that we are not quite as
>> + * aggressive about the cleanup (by however many transaction IDs are
>> + * consumed between this point and acquiring the lock).  This allows us to
>> + * save significant overhead in the case where the page is found not to be
>> + * prunable.
>>   */
>>  if (IsCatalogRelation(relation) ||
>>      RelationIsAccessibleInLogicalDecoding(relation))
>>      OldestXmin = RecentGlobalXmin;
>>  else
>> -    OldestXmin = RecentGlobalDataXmin;
>> +    OldestXmin =
>> +        TransactionIdLimitedForOldSnapshots(RecentGlobalDataXmin,
>> +                                            relation);
>
> This new intermediary function TransactionIdLimitedForOldSnapshots()
> is called to decide what OldestXmin actually gets to be above, based
> in part on the new GUC old_snapshot_threshold:
>
>> +/*
>> + * TransactionIdLimitedForOldSnapshots
>> + *
>> + * Apply old snapshot limit, if any.  This is intended to be called for page
>> + * pruning and table vacuuming, to allow old_snapshot_threshold to override
>> + * the normal global xmin value.  Actual testing for snapshot too old will be
>> + * based on whether a snapshot timestamp is prior to the threshold timestamp
>> + * set in this function.
>> + */
>> +TransactionId
>> +TransactionIdLimitedForOldSnapshots(TransactionId recentXmin,
>> +                                    Relation relation)
>
> It might not be RecentGlobalDataXmin that is usually returned as
> OldestXmin as it is today, which is exactly the point of this patch:

Right.

> VACUUM can be more aggressive in cleaning up bloat, [...], on the
> theory that we can reliably detect when that causes failures for
> old snapshots, and just raise a "snapshot too old" error.

Right.

> ...because this patch does nothing to advance RecentGlobalXmin (or
> RecentGlobalDataXmin) itself more aggressively.  It does make
> vacuum_set_xid_limits() get a more aggressive cutoff point, but we
> don't see that being passed back down by lazy vacuum here; within
> _bt_page_recyclable(), we rely on the more conservative
> RecentGlobalXmin, which is not subject to any clever optimization in
> the patch.

Right.

> Fortunately, this seems correct, since index scans will always succeed
> in finding a deleted page, per nbtree README notes on
> RecentGlobalXmin.

Right.

> Unfortunately, this does stop recycling from happening early for
> B-Tree pages, even though that's probably safe in principle.  This is
> probably not so bad -- it just needs to be considered when reviewing
> this patch (the same is true of logical decoding's
> RecentGlobalDataXmin; it also doesn't appear in _bt_page_recyclable(),
> and I guess that that was never a problem).  Index relations will not
> get smaller in some important cases, but they will be made less
> bloated by VACUUM in a sense that's still probably very useful.

As I see it, if the long-running transaction(s) have written (and
thus acquired transaction IDs), we can't safely advance the global
xmin values until they complete.  If the long-running transactions
with the old snapshots don't have transaction IDs, the bloat will
be contained.

> Maybe that explains some of what Jeff talked about.
I think he just didn't have autovacuum configured to where it was
being very effective on the tiny tables involved.  See my reply to
Jeff and the graphs from running his test on my system.  I don't
think the lines could get much more flat than what I'm seeing with
the patch.

It may be that this general approach could be made more aggressive
and effective by pushing things in the direction you suggest, but
we are far past the time to consider that sort of change for 9.6.
This patch has been through several rounds of 30-day testing; a
change like you propose would require that those tests be redone.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company