Re: Standalone synchronous master - Mailing list pgsql-hackers
| From | Josh Berkus |
|---|---|
| Subject | Re: Standalone synchronous master |
| Date | |
| Msg-id | 52D30351.2040401@agliodbs.com |
| In response to | Re: Standalone synchronous master ("Joshua D. Drake" <jd@commandprompt.com>) |
| Responses | Re: Standalone synchronous master |
| List | pgsql-hackers |
On 01/12/2014 12:35 PM, Stephen Frost wrote:
> * Josh Berkus (josh@agliodbs.com) wrote:
>> You don't want to handle all of those issues the same way as far as sync
>> rep is concerned. For example, if the standby is restarting, you
>> probably want to wait instead of degrading.
>
> *What*?! Certainly not in any kind of OLTP-type system; a system
> restart can easily take minutes. Clearly, you want to resume once the
> standby is back up, which I feel like the people against an auto-degrade
> mode are missing, but holding up a commit until the standby finishes
> rebooting isn't practical.

Well, then that becomes a reason to want better/more configurability. In
the couple of sync rep sites I admin, I *would* want to wait.

>> There's also the issue that this patch, and necessarily any
>> walsender-level auto-degrade, has IMHO no safe way to resume sync
>> replication. This means that any user who has a network or storage blip
>> once a day (again, think AWS) would be constantly in degraded mode, even
>> though both the master and the replica are up and running -- and it will
>> come as a complete surprise to them when they lose the master and
>> discover that they've lost data.
>
> I don't follow this logic at all- why is there no safe way to resume?
> You wait til the slave is caught up fully and then go back to sync mode.
> If that turns out to be an extended problem then an alarm needs to be
> raised, of course.

So, if you have auto-resume, how do you handle the "flaky network" case?
And how would an alarm be raised?

On 01/12/2014 12:51 PM, Kevin Grittner wrote:
> Josh Berkus <josh@agliodbs.com> wrote:
>> I know others have dismissed this idea as too "talky", but from my
>> perspective, the agreement with the client for each synchronous
>> commit is being violated, so each and every synchronous commit
>> should report failure to sync. Also, having a warning on every
>> commit would make it easier to troubleshoot degraded mode for users
>> who have ignored the other warnings we give them.
>
> I agree that every synchronous commit on a master which is configured
> for synchronous replication which returns without persisting the work
> of the transaction on both the (local) primary and a synchronous
> replica should issue a WARNING. That said, the API for some
> connectors (like JDBC) puts the burden on the application or its
> framework to check for warnings each time and do something reasonable
> if found; I fear that a Venn diagram of those shops which would use
> this new feature and those shops that don't rigorously look for and
> reasonably deal with warnings would have significant overlap.

Oh, no question. However, having such a WARNING would help with
interactive troubleshooting once a problem has been identified, and
that's my main reason for wanting it.

Imagine the case where you have auto-degrade and a flaky network. The
user would experience the problem as a performance problem: some commits
take minutes, on-again, off-again. They wouldn't necessarily even LOOK
at the sync rep settings. So the next step is to try walking through a
sample transaction on the command line, at which point the DBA/consultant
gets WARNING messages, which gives an idea where the real problem lies.

(A sketch of the warning-checking burden Kevin describes, seen from the
JDBC side, follows this message.)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
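The WARNING on degraded synchronous commits discussed above is a proposed behavior, not something PostgreSQL emits today. As a minimal sketch of the point about JDBC, this is roughly what explicit warning-checking after a commit looks like; the connection URL, credentials, and `test_table` are placeholders, and exactly which object the driver attaches server-side warnings to (statement vs. connection) is an assumption, so both chains are inspected here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLWarning;
import java.sql.Statement;

public class SyncCommitWarningCheck {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; adjust host, database, and credentials.
        String url = "jdbc:postgresql://localhost:5432/postgres";
        try (Connection conn = DriverManager.getConnection(url, "postgres", "secret")) {
            conn.setAutoCommit(false);

            try (Statement stmt = conn.createStatement()) {
                // 'test_table' is a hypothetical table used only for illustration.
                stmt.executeUpdate("INSERT INTO test_table (val) VALUES (1)");

                conn.commit();

                // JDBC never raises warnings as exceptions; the application has to
                // walk the warning chain itself after each commit. Check both the
                // statement and the connection, since where the driver attaches
                // server warnings is not assumed here.
                reportWarnings(stmt.getWarnings());
                reportWarnings(conn.getWarnings());
                stmt.clearWarnings();
                conn.clearWarnings();
            }
        }
    }

    private static void reportWarnings(SQLWarning w) {
        for (; w != null; w = w.getNextWarning()) {
            // If the proposed "degraded sync rep" WARNING existed, it would surface
            // here; the application could then alert an operator or treat the
            // commit as not yet replicated.
            System.err.println("WARNING after commit: " + w.getMessage());
        }
    }
}
```

This is exactly the per-commit bookkeeping most applications and ORM frameworks never do, which is the overlap Kevin worries about: shops that would enable auto-degrade are often the same shops that never look at the warning chain.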