Re: Synchronous commit not... synchronous? - Mailing list pgsql-hackers
| From | Daniel Farina |
|---|---|
| Subject | Re: Synchronous commit not... synchronous? |
| Date | |
| Msg-id | CAAZKuFZ+UXdd0Hm_UOu5Z9n+8UUG7ZsaNG3q-S2npKhnfozA4A@mail.gmail.com |
| In response to | Synchronous commit not... synchronous? (Peter van Hardenberg <pvh@pvh.ca>) |
| Responses | Re: Synchronous commit not... synchronous?; Re: Synchronous commit not... synchronous? |
| List | pgsql-hackers |
On Fri, Nov 2, 2012 at 1:06 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> I see why it is implemented this way, but it's also still pretty
>> unsatisfying because it means that with cancellation requests clients
>> are in theory able to commit an unlimited number of transactions,
>> synchronous commit or no.
>
> What evil does this allow the client to perpetrate?

The client can commit against my will by accident in an automated
system whose behavior is at least moderately complex and hard for all
involved to understand completely, and then the client's author
subsequently writes me a desperate or angry support request asking why
data was lost. That is not the best time for me to ask "did you set up
a scheduled task to cancel hanging queries automatically? Because
yeah...there's this...thing."

>> It's probably close enough for most purposes, but what would you think
>> about a "2PC-ish" mode at the physical (rather than logical/PREPARE
>> TRANSACTION) level, whereby the master would insist that its standbys
>> have more data written (or at least received...or at least sent) than
>> it has guaranteed flushed to its own xlog at any point?
>
> Then if they interrupt the commit, the remote has it permanently but
> the local does not.  That would be corruption.

That is a good point. When the server starts up, it could interrogate
its standbys for WAL to apply.

My ideal is to get a similar relationship between a master and its
'local' pg_xlog, except over a socket, and possibly (but entirely
optionally) to a non-Postgres receiver of WAL that may buffer WAL and
then submit it directly to what is typically thought of as the
archives. I have a number of reasons for wanting that, but they can
all be summed up as: block devices are much more prone to failures --
both simple and byzantine -- than memory and network traffic with
enough fidelity checking (such as TLS), and the pain from block
device failures -- in particular, the byzantine ones -- is very high
when they occur. The bar for "reliable" non-volatile storage for me
is set by things like Amazon's S3, and I think a lot of that has to
do with its otherwise relatively impoverished semantics, so I expect
this reliability profile will be, or already has been, duplicated
elsewhere.

In general, this has some relation to remastering issues. In the
future, I'd like to be able to turn off the local pg_xlog entirely,
at my option. This is something I've been very slowly moving forward
on for a while, the first step being a Postgres proxy, currently
underway. The tool support for this kind of facility does not really
exist yet, but I'll catch up some day...

> What the "DETAIL" doesn't make clear about the current system is that
> the commit *will* be replicated to the standby *eventually*, unless
> the master burns down first.  In particular, if any commit after this
> one makes it to the standby, then the interrupted one is guaranteed to
> have made it as well.
>
>> This would be a nice invariant to have when dealing with a large
>> number of systems, allowing for the catching of some tricky bugs:
>> that standbys are always greater-than-or-equal-to the master's
>> XLogPos.
>
> Could you elaborate on that?

Sure. I'd like to sanity-check failovers with as many simple
invariants as I can to catch problems. Losing a cheap-to-confirm
invariant is losing a check, which would be unfortunate when doing
more failovers simultaneously than humans can realistically be
involved with in a short amount of time, since the results of a bug
there are most unpleasant.
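Concretely, the kind of check I have in mind is something like the
sketch below (just a sketch, using only what is already in core: the
pg_stat_replication view and the 9.2-era xlog location functions).
Under today's semantics a positive lag here is normal and transient;
under the proposed mode, any row returned would indicate a bug or a
broken guarantee:

```sql
-- Run on the master: flag any standby whose flushed WAL position is
-- behind the master's current write position.  Under the
-- "standbys >= master" invariant discussed above, this query should
-- never return rows.
SELECT application_name,
       client_addr,
       state,
       flush_location,
       pg_xlog_location_diff(pg_current_xlog_location(),
                             flush_location) AS bytes_behind_master
  FROM pg_stat_replication
 WHERE pg_xlog_location_diff(pg_current_xlog_location(),
                             flush_location) > 0;
```

--
fdr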