Re: Improving connection scalability: GetSnapshotData() - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Improving connection scalability: GetSnapshotData() |
Date | |
Msg-id | 20200329033522.wplrrv2w7g5bjumc@alap3.anarazel.de |
In response to | Re: Improving connection scalability: GetSnapshotData() (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Improving connection scalability: GetSnapshotData()
|
List | pgsql-hackers |
Hi,

On 2020-03-28 18:39:32 -0700, Peter Geoghegan wrote:
> I have heard quite a few complaints about the scalability of snapshot
> acquisition in Postgres. Generally from very large users that are not
> well represented on the mailing lists, for a variety of reasons. The
> GetSnapshotData() bottleneck is a *huge* problem for us. (As problems
> for Postgres users go, I would probably rank it second behind issues
> with VACUUM.)

Yea, I see it similarly. For busy databases, my experience is that
vacuum is the big problem for write-heavy workloads (or the write
portion), and snapshot scalability the big problem for read-heavy OLTP
workloads.

> This scalability improvement is clearly very significant. There is
> little question that this is a strategically important enhancement for
> the Postgres project in general. I hope that you will ultimately be
> able to commit the patchset before feature freeze.

I've done a fair bit of cleanup, but I'm still fighting with how to
implement old_snapshot_threshold in a good way. It's not hard to get it
back to kind of working, but it requires some changes that go in the
wrong direction.

The problem basically is that the current old_snapshot_threshold
implementation just reduces OldestXmin to whatever is indicated by
old_snapshot_threshold, even if that is not necessary for pruning to do
the specific cleanup that's about to be done. If OldestXmin < threshold,
it'll set shared state that fails all older accesses. But that doesn't
really work well with the approach in the patch of using a lower/upper
boundary for potentially valid xmin horizons.

I think the right approach would be to split
TransactionIdLimitedForOldSnapshots() into separate parts. One that
determines the most aggressive horizon that old_snapshot_threshold
allows, and a separate part that increases the threshold after which
accesses need to error out
(i.e. SetOldSnapshotThresholdTimestamp()). Then we can call
SetOldSnapshotThresholdTimestamp() only for exactly the xids that are
removed, not for the most aggressive interpretation.

Unfortunately I think that basically requires changing
HeapTupleSatisfiesVacuum's signature, to take a more complex parameter
than OldestXmin (to take InvisibleToEveryoneState *), which quickly
increases the size of the patch. I'm currently doing that and seeing how
the result makes me feel about the patch.

Alternatively we can also just be less efficient and call
GetOldestXmin() more aggressively when old_snapshot_threshold is
set. That'd be easier to implement - but seems like an ugly gotcha.

Greetings,

Andres Freund
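
The split described in the message can be illustrated with a toy model. This is only a sketch under loud simplifying assumptions: xids are plain integers with no wraparound handling, the error-out threshold is tracked as an xid rather than a timestamp, and every `Toy`/`toy_` name is hypothetical. None of it is PostgreSQL's actual code or the patch's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy xid type; real TransactionIds wrap around, these do not. */
typedef uint32_t ToyXid;

/* Hypothetical stand-in for the shared "accesses older than this must fail" state. */
typedef struct ToyOldSnapshotState
{
    ToyXid error_threshold_xid;   /* snapshots with xmin below this error out */
} ToyOldSnapshotState;

/*
 * Part 1: pure computation. Returns the most aggressive horizon the
 * threshold would allow, without touching any shared state.
 */
static ToyXid
toy_most_aggressive_horizon(ToyXid oldest_xmin, ToyXid threshold_limit_xid)
{
    return threshold_limit_xid > oldest_xmin ? threshold_limit_xid : oldest_xmin;
}

/*
 * Part 2: side effect. Raises the error-out threshold just far enough to
 * cover one xid that was actually removed.
 */
static void
toy_advance_error_threshold(ToyOldSnapshotState *state, ToyXid removed_xid)
{
    assert(state != NULL);
    if (removed_xid >= state->error_threshold_xid)
        state->error_threshold_xid = removed_xid + 1;
}

/*
 * Pruning decision for one dead tuple. Returns 1 if the tuple may be
 * removed. The shared threshold moves only when the removal actually
 * relied on the more aggressive horizon.
 */
static int
toy_try_prune(ToyOldSnapshotState *state, ToyXid tuple_xmax,
              ToyXid oldest_xmin, ToyXid threshold_limit_xid)
{
    if (tuple_xmax < oldest_xmin)
        return 1;               /* removable anyway; no shared state change */

    if (tuple_xmax < toy_most_aggressive_horizon(oldest_xmin, threshold_limit_xid))
    {
        toy_advance_error_threshold(state, tuple_xmax);
        return 1;
    }

    return 0;                   /* still potentially visible to someone */
}
```

In this model, with an OldestXmin of 100 and a threshold-derived limit of 150, removing a tuple whose xmax is 120 moves the error-out threshold to 121 rather than to 150, so snapshots with xmin between 121 and 149 keep working.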