Re: Improving connection scalability: GetSnapshotData() - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Improving connection scalability: GetSnapshotData()
Date	March 29, 2020 03:35:22
Msg-id	20200329033522.wplrrv2w7g5bjumc@alap3.anarazel.de
In response to	Re: Improving connection scalability: GetSnapshotData() (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: Improving connection scalability: GetSnapshotData()
List	pgsql-hackers

Hi,

On 2020-03-28 18:39:32 -0700, Peter Geoghegan wrote:
> I have heard quite a few complaints about the scalability of snapshot
> acquisition in Postgres. Generally from very large users that are not
> well represented on the mailing lists, for a variety of reasons. The
> GetSnapshotData() bottleneck is a *huge* problem for us. (As problems
> for Postgres users go, I would probably rank it second behind issues
> with VACUUM.)

Yea, I see it similarly. For busy databases, my experience is that
vacuum is the big problem for write heavy workloads (or the write
portion), and snapshot scalability the big problem for read heavy oltp
workloads.

> This scalability improvement is clearly very significant. There is
> little question that this is a strategically important enhancement for
> the Postgres project in general. I hope that you will ultimately be
> able to commit the patchset before feature freeze.

I've done a fair bit of cleanup, but I'm still fighting with how to
implement old_snapshot_threshold in a good way. It's not hard to get it
back to kind of working, but it requires some changes that go into the
wrong direction.

The problem basically is that the current old_snapshot_threshold
implementation just reduces OldestXmin to whatever is indicated by
old_snapshot_threshold, even if not necessary for pruning to do the
specific cleanup that's about to be done. If OldestXmin < threshold,
it'll set shared state that fails all older accesses.  But that doesn't
really work well with the approach in the patch of using a lower/upper
boundary for potentially valid xmin horizons.

I think the right approach would be to split
TransactionIdLimitedForOldSnapshots() into separate parts. One that
determines the most aggressive horizon that old_snapshot_threshold
allows, and a separate part that increases the threshold after which
accesses need to error out
(i.e. SetOldSnapshotThresholdTimestamp()). Then we can call
SetOldSnapshotThresholdTimestamp() only for exactly the xids that are
removed, not for the most aggressive interpretation.
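
A minimal sketch of that split, to make the idea concrete. This is not
the PostgreSQL code: the types are simplified stand-ins, xid wraparound
is ignored, and limited_horizon(), note_xid_removed() and
OldSnapshotShared are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, not the real PostgreSQL definitions. */
typedef uint32_t Xid;        /* plain integer compare, no wraparound handling */
typedef int64_t Timestamp;

/* Hypothetical shared state: accesses by snapshots older than this fail. */
typedef struct OldSnapshotShared
{
    Timestamp threshold_timestamp;
    Xid       threshold_xid;
} OldSnapshotShared;

/*
 * Part 1 (hypothetical): the most aggressive horizon that the setting
 * allows. Pure computation; it does not touch shared state.
 */
static Xid
limited_horizon(Xid oldest_xmin, Xid xid_allowed_by_threshold)
{
    return (xid_allowed_by_threshold > oldest_xmin)
        ? xid_allowed_by_threshold
        : oldest_xmin;
}

/*
 * Part 2 (hypothetical): called for an xid whose tuple is actually being
 * removed. Shared state only advances if the removal needed the limited
 * horizon, i.e. the xid was not already older than the regular horizon.
 * Returns true if the removal relied on the limited horizon.
 */
static bool
note_xid_removed(OldSnapshotShared *shared, Xid oldest_xmin,
                 Xid removed_xid, Timestamp ts)
{
    assert(shared != NULL);

    if (removed_xid < oldest_xmin)
        return false;        /* removable anyway, nobody has to error out */

    if (ts > shared->threshold_timestamp)
        shared->threshold_timestamp = ts;
    if (removed_xid > shared->threshold_xid)
        shared->threshold_xid = removed_xid;
    return true;
}
```

In this sketch pruning would consult limited_horizon() freely, and only
the removals that return true from note_xid_removed() make older
snapshots fail.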

Unfortunately I think that basically requires changing
HeapTupleSatisfiesVacuum's signature, to take a more complex parameter
than OldestXmin (to take InvisibleToEveryoneState *), which quickly
increases the size of the patch.

I'm currently doing that and seeing how the result makes me feel about
the patch.

Alternatively we also can just be less efficient and call
GetOldestXmin() more aggressively when old_snapshot_threshold is
set. That'd be easier to implement - but seems like an ugly gotcha.

Greetings,

Andres Freund
