Re: Standalone synchronous master - Mailing list pgsql-hackers
| From | Josh Berkus |
|---|---|
| Subject | Re: Standalone synchronous master |
| Date | |
| Msg-id | 52D30351.2040401@agliodbs.com |
| In response to | Re: Standalone synchronous master ("Joshua D. Drake" <jd@commandprompt.com>) |
| Responses | Re: Standalone synchronous master |
| List | pgsql-hackers |
On 01/12/2014 12:35 PM, Stephen Frost wrote:
> * Josh Berkus (josh@agliodbs.com) wrote:
>> You don't want to handle all of those issues the same way as far as sync
>> rep is concerned. For example, if the standby is restarting, you
>> probably want to wait instead of degrading.
>
> *What*?! Certainly not in any kind of OLTP-type system; a system
> restart can easily take minutes. Clearly, you want to resume once the
> standby is back up, which I feel like the people against an auto-degrade
> mode are missing, but holding up a commit until the standby finishes
> rebooting isn't practical.

Well, then that becomes a reason to want better/more configurability. In
the couple of sync rep sites I admin, I *would* want to wait.

>> There's also the issue that this patch, and necessarily any
>> walsender-level auto-degrade, has IMHO no safe way to resume sync
>> replication. This means that any user who has a network or storage blip
>> once a day (again, think AWS) would be constantly in degraded mode, even
>> though both the master and the replica are up and running -- and it will
>> come as a complete surprise to them when they lose the master and
>> discover that they've lost data.
>
> I don't follow this logic at all- why is there no safe way to resume?
> You wait til the slave is caught up fully and then go back to sync mode.
> If that turns out to be an extended problem then an alarm needs to be
> raised, of course.

So, if you have auto-resume, how do you handle the "flaky network" case?
And how would an alarm be raised?

On 01/12/2014 12:51 PM, Kevin Grittner wrote:
> Josh Berkus <josh@agliodbs.com> wrote:
>> I know others have dismissed this idea as too "talky", but from my
>> perspective, the agreement with the client for each synchronous
>> commit is being violated, so each and every synchronous commit
>> should report failure to sync. Also, having a warning on every
>> commit would make it easier to troubleshoot degraded mode for users
>> who have ignored the other warnings we give them.
>
> I agree that every synchronous commit on a master which is configured
> for synchronous replication which returns without persisting the work
> of the transaction on both the (local) primary and a synchronous
> replica should issue a WARNING. That said, the API for some
> connectors (like JDBC) puts the burden on the application or its
> framework to check for warnings each time and do something reasonable
> if found; I fear that a Venn diagram of those shops which would use
> this new feature and those shops that don't rigorously look for and
> reasonably deal with warnings would have significant overlap.

Oh, no question. However, having such a WARNING would help with
interactive troubleshooting once a problem has been identified, and
that's my main reason for wanting it.

Imagine the case where you have auto-degrade and a flaky network. The
user would experience the problem as a performance problem: some commits
take minutes, on-again, off-again. They wouldn't necessarily even LOOK
at the sync rep settings. So the next step is to try walking through a
sample transaction on the command line, at which point the DBA/consultant
gets WARNING messages, which gives an idea where the real problem lies.

(A sketch of the warning-checking burden Kevin describes, seen from the
JDBC side, follows this message.)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
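The WARNING on degraded synchronous commits discussed above is a proposed behavior, not something PostgreSQL emits today. As a minimal sketch of the point about JDBC, this is roughly what explicit warning-checking after a commit looks like; the connection URL, credentials, and `test_table` are placeholders, and exactly which object the driver attaches server-side warnings to (statement vs. connection) is an assumption, so both chains are inspected here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLWarning;
import java.sql.Statement;

public class SyncCommitWarningCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; adjust host, database, and credentials.
        String url = "jdbc:postgresql://localhost:5432/postgres";
        try (Connection conn = DriverManager.getConnection(url, "postgres", "secret")) {
            conn.setAutoCommit(false);

            try (Statement stmt = conn.createStatement()) {
                // 'test_table' is a hypothetical table used only for illustration.
                stmt.executeUpdate("INSERT INTO test_table (val) VALUES (1)");

                conn.commit();

                // JDBC never raises warnings as exceptions; the application has to
                // walk the warning chain itself after each commit. Check both the
                // statement and the connection, since where the driver attaches
                // server warnings is not assumed here.
                reportWarnings(stmt.getWarnings());
                reportWarnings(conn.getWarnings());
                stmt.clearWarnings();
                conn.clearWarnings();
            }
        }
    }

    private static void reportWarnings(SQLWarning w) {
        for (; w != null; w = w.getNextWarning()) {
            // If the proposed "degraded sync rep" WARNING existed, it would surface
            // here; the application could then alert an operator or treat the
            // commit as not yet replicated.
            System.err.println("WARNING after commit: " + w.getMessage());
        }
    }
}
```

This is exactly the per-commit bookkeeping most applications and ORM frameworks never do, which is the overlap Kevin worries about: shops that would enable auto-degrade are often the same shops that never look at the warning chain.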