Re: Proposal for CSN based snapshots - Mailing list pgsql-hackers

From: Andres Freund
Subject: Re: Proposal for CSN based snapshots
Date: 2014-05-30 15:27:21
Msg-id: 20140530152721.GC30516@awork2.anarazel.de
In response to: Re: Proposal for CSN based snapshots (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses: Re: Proposal for CSN based snapshots; Re: Proposal for CSN based snapshots
List: pgsql-hackers
Hi,

On 2014-05-30 17:59:23 +0300, Heikki Linnakangas wrote:
> So, here's a first version of the patch. Still very much WIP.

Cool.

> One thorny issue came up in discussions with other hackers on this in
> PGCon:
>
> When a transaction is committed asynchronously, it becomes visible to
> other backends before the commit WAL record is flushed. With CSN-based
> snapshots, the order that transactions become visible is always based
> on the LSNs of the WAL records. This is a problem when there is a mix
> of synchronous and asynchronous commits:
>
> If transaction A commits synchronously with commit LSN 1, and
> transaction B commits asynchronously with commit LSN 2, B cannot
> become visible before A. And we cannot acknowledge B as committed to
> the client until it's visible to other transactions. That means that B
> will have to wait for A's commit record to be flushed to disk before
> it can return, even though it was an asynchronous commit.
>
> I personally think that's annoying, but we can live with it. The most
> common usage of synchronous_commit=off is to run a lot of transactions
> in that mode, setting it in postgresql.conf. And it wouldn't
> completely defeat the purpose of mixing synchronous and asynchronous
> commits either: an asynchronous commit still only needs to wait for
> any already-logged synchronous commits to be flushed to disk, not the
> commit record of the asynchronous transaction itself.

I have a hard time believing that users won't hate us for such a
regression. It's pretty common to mix both sorts of transactions, and
this will - by my guesstimate - dramatically reduce throughput for the
async backends.

> * Logical decoding is broken. I hacked on it enough that it looks
> roughly sane and it compiles, but didn't spend more time to debug.

I think we can live with it not working for the first few iterations.
I'll look into it once the patch has stabilized a bit.

> * I expanded pg_clog to 64 bits per XID, but people suggested keeping
> pg_clog as is, with two bits per commit, and adding a new SLRU for the
> commit LSNs beside it. Probably will need to do something like that to
> avoid bloating the clog.

It also influences how on-disk compatibility is dealt with. So: how are
you planning to deal with on-disk compatibility?

> * Add some kind of backend-private caching of clog, to make it faster
> to access. The visibility checks are now hitting the clog a lot more
> heavily than before, as you need to check the clog even if the hint
> bits are set, if the XID falls between xmin and xmax of the snapshot.

That'll hurt a lot in concurrent scenarios :/. Have you measured how
'wide' xmax - xmin usually is? I wonder if we could just copy a range of
values from the clog when we start scanning...

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
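
To make the ordering constraint discussed above concrete: in this design a
transaction's CSN is the LSN of its commit record, and a snapshot boils down
to a single LSN captured when the snapshot is taken. The following is a
minimal, self-contained C sketch, illustrative only and not code from
Heikki's patch; every name in it is made up for the example.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t XLogRecPtr;     /* WAL position (LSN) */
    typedef XLogRecPtr CommitSeqNo;  /* here, CSN == commit-record LSN */

    /* A committed XID is visible iff its commit record precedes the
     * snapshot's LSN. */
    static bool
    xid_visible(CommitSeqNo commit_csn, CommitSeqNo snapshot_csn)
    {
        return commit_csn <= snapshot_csn;
    }

    int
    main(void)
    {
        CommitSeqNo a_sync  = 1;  /* A: synchronous commit, commit LSN 1 */
        CommitSeqNo b_async = 2;  /* B: asynchronous commit, commit LSN 2 */

        /* There is no snapshot in which B is visible but A is not. */
        for (CommitSeqNo snap = 0; snap <= 3; snap++)
            printf("snapshot at LSN %llu: A %s, B %s\n",
                   (unsigned long long) snap,
                   xid_visible(a_sync, snap) ? "visible" : "invisible",
                   xid_visible(b_async, snap) ? "visible" : "invisible");
        return 0;
    }

Because every snapshot that sees B also sees A, acknowledging B before A's
commit record is durable would be unsafe: a crash at that point would lose A
during recovery even though clients may already have observed B, and hence,
implicitly, A. That is why the asynchronous commit has to wait for the flush
of any earlier synchronous commit record.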
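On the clog bloat point: pg_clog today stores two bits of commit status per
XID, while the patch as posted stores a 64-bit commit LSN per XID. A quick
back-of-the-envelope calculation over the full 2^32 XID space (my
arithmetic, not figures from the thread) shows why reviewers suggested a
separate SLRU instead:

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
        const uint64_t xid_space = UINT64_C(1) << 32;  /* 2^32 XIDs */

        uint64_t clog_2bit  = xid_space * 2 / 8;   /* today: 2 bits/XID  */
        uint64_t clog_64bit = xid_space * 64 / 8;  /* patch: one LSN/XID */

        printf("clog at 2 bits/XID : %llu GiB\n",
               (unsigned long long) (clog_2bit >> 30));    /* 1 GiB  */
        printf("clog at 64 bits/XID: %llu GiB\n",
               (unsigned long long) (clog_64bit >> 30));   /* 32 GiB */
        return 0;
    }

A 32x worst-case growth of pg_clog. Keeping the existing two-bit clog as is
and putting the commit LSNs in a new SLRU beside it would confine the growth
to a new structure, and would presumably also sidestep part of the on-disk
compatibility question raised above, since nothing already on disk changes
format.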
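Finally, the "copy a range of values from the clog when we start scanning"
idea might look roughly like the sketch below. This is entirely
hypothetical: none of these names exist in PostgreSQL, error handling is
omitted, and shared_clog_read() merely stands in for the real locked
shared-SLRU lookup.

    #include <stdint.h>
    #include <stdlib.h>

    typedef uint32_t TransactionId;
    typedef uint8_t  XidStatus;  /* one byte per XID for clarity; the real
                                  * clog packs four XIDs per byte */

    /* Stand-in for the real shared-memory clog lookup. */
    extern XidStatus shared_clog_read(TransactionId xid);

    typedef struct ClogCache
    {
        TransactionId xmin;    /* first cached XID */
        uint32_t      nxids;   /* width of the cached range, xmax - xmin */
        XidStatus    *status;  /* backend-local copy */
    } ClogCache;

    /* Fill once per snapshot/scan: a single pass over the shared clog. */
    static void
    clog_cache_fill(ClogCache *cache, TransactionId xmin, TransactionId xmax)
    {
        cache->xmin  = xmin;
        cache->nxids = xmax - xmin;   /* XID wraparound ignored here */
        cache->status = malloc(cache->nxids * sizeof(XidStatus));
        for (uint32_t i = 0; i < cache->nxids; i++)
            cache->status[i] = shared_clog_read(xmin + i);
    }

    /* Visibility checks during the scan then hit backend-local memory
     * instead of the shared SLRU. */
    static XidStatus
    clog_cache_lookup(const ClogCache *cache, TransactionId xid)
    {
        if ((TransactionId) (xid - cache->xmin) < cache->nxids)
            return cache->status[xid - cache->xmin];
        return shared_clog_read(xid);  /* outside the cached range */
    }

One reason a one-shot copy like this can be safe for snapshot purposes: an
XID that was still in progress when the snapshot was taken is invisible to
that snapshot whether or not it commits mid-scan, so a stale "in progress"
answer from the cache yields the same visibility result as a fresh lookup.
Whether the upfront copy is cheap enough depends on how wide xmax - xmin
typically is, which is exactly the measurement asked for above.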