Re: PostgreSQL HA config recommendations - Mailing list pgsql-general
From | Fabio Ugo Venchiarutti |
---|---|
Subject | Re: PostgreSQL HA config recommendations |
Date | |
Msg-id | 5542D0C7.2020003@vuole.me Whole thread Raw |
In response to | Re: PostgreSQL HA config recommendations (William Dunn <dunnwjr@gmail.com>) |
Responses |
Re: PostgreSQL HA config recommendations
|
List | pgsql-general |
Point taken. William is right. My recommendations were unusually pessimistic as I didn't take enough time to assess global+instantaneous data changes visibility requirements on your part. If the cluster is ONLY HA and you don't need to read fresh data off secondary nodes (E.G.: HA+read load balancing), asynchronous is good enough in most cases. On 01/05/15 06:37, William Dunn wrote: > Alex, > Note that you should be weary of suggestions to make your replication > synchronous. Synchronous replication is rarely used for this kind of use > case (Cisco Jabber) where the most complete durability of the standby is > not of the utmost concern (as it would be in a banking application). Not > only will it decrease performance, but since you expect to have only one > local standby it could actually decrease your availability because if > your standby went down no transactions would be able to commit on the > master. See the Synchronous Replication section of the docs for more > details (http://www.postgresql.org/docs/devel/static/warm-standby.html). > > Also note that the suggestion provided by Fabio that you should not have > your application commit more than one transaction per user operation is > only applicable in synchronous replication (though since this is for a > Cisco Jabber, where you neither have control over nor much information > regarding the number of commits sent by the transaction per user > operation, that suggestion is not applicable anyway...). In the case of > asynchronous master-slave replication the typical issue with streaming > replication latency is that you have transactions going to the master > and then the application sends a read only transaction to the slave > before the slave receives the transaction. So long as you don't have the > application consider the user operation completed before all the > transactions are committed I don't think having multiple transactions > would make your replication latency issue any less. > > For example, if you had a calendar application where a user enters > event details and creates an event for the calendar. The application > may be set up to execute 2 transactions, 1) Add the event and > details to the calendar events table and 2) once the event creation > transaction returns add the current user as an attendee for that > event. In this case both transactions would be going against the > master, so how far the slave is behind wouldn't be a factor. Of > course it would be faster overall to send the inserts as a single > database procedure, but that all goes against the master database so > the streaming replication is not a factor in that consideration. > > > *William J. Dunn* > _willjdunn.com <http://willjdunn.com>_ > > *William J. Dunn* > *P* 978-844-4427 | _dunnwjr@gmail.com <mailto:dunnwjr@gmail.com>_ > _dunnw@bu.edu <mailto:dunnw@bu.edu>_ > > On Thu, Apr 30, 2015 at 9:02 AM, Fabio Ugo Venchiarutti <fabio@vuole.me > <mailto:fabio@vuole.me>> wrote: > > > WAN delays can cause problems for any replication system; you just have > > to be aware of that and not push things too hard (or try and violate the > > laws of physics). For example, streaming replication set to be > > synchronous crossing the planet is something you'd probably be rather > > unhappy with. :) > > > In my experience streaming replication fits most use cases due to > inherent its simplicity and robustness, but you might need to adjust > your software design to get the best out of it. > > > More specifically, latency issues can be heavily mitigated by having > application software commit no more than one transaction per user > operation, provided 1 x "master<->sync_slave round trip time" is > acceptable delay when they submit forms or the like. > > It can get much worse if the application server is on a different > geographical node than the DB master. In such case it is > realistically beneficial to batch multiple write operations in a > single STATEMENT instead. > If the replication synchronous slave is on yet another node, the > best case (single statement) scenario would be 2 x round trip time. > This configuration is more common than you might think as some > setups feature remote app servers reading off synchronous slaves at > their own physical location but committing against a master that is > somewhere else. > > > Cheers > > > > > > > > On 30/04/15 11:06, Jim Nasby wrote: > > On 4/29/15 1:13 PM, Alex Gregory wrote: > > I was thinking that I could use Slony but then I read that > it does not > like WAN replication. I have also read about streaming > replication > native to Postgres but was not sure how that would work over > the WAN. > Bucardo seems better for Data Warehousing or multimaster > situations > which this is not. That leaves pgpool ii which seems like > it would > add an extra layer of complexity. > > > WAN delays can cause problems for any replication system; you > just have > to be aware of that and not push things too hard (or try and > violate the > laws of physics). For example, streaming replication set to be > synchronous crossing the planet is something you'd probably be > rather > unhappy with. :) > > I haven't played with Slony in forever, but when I did it loved > to lock > things. That would not play well with high latency. > > I have run londiste between sites within the same city, and that > worked > well. > > Bucardo and pg_pool are both based on the idea of replaying SQL > statements instead of replicating actual data. They have their > uses, but > I personally distrust that idea, especially for DR. > > When it comes down to to there are so many choices I am not > sure if I > need one or a combination of two. Any help you could > provide could > be greatly appreciated. > > > If you want to replicate within a data center then streaming > replication > is pretty nice, and as a bonus you might be able to do > synchronous as > well. The downside to streaming rep is that it's binary, so if > you ever > suffer data corruption you're practically guaranteed that corruption > will end up on the replica. Logical replication like londiste or > Slony > are much more robust against that. You also can't use temporary > tables > with streaming rep, and you have to replicate the details of ALL > activity, including maintenance like VACUUM. In some > environments that > might be slower than logical replication. > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org > <mailto:pgsql-general@postgresql.org>) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > >
pgsql-general by date: