Re: [ADMIN] pg_basebackup blocking all queries with horrible performance - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: [ADMIN] pg_basebackup blocking all queries with horrible performance |
Date | |
Msg-id | CAHGQGwHhUZf=vMhhAp+HBsYEEvxjTk_krL+s7H0n_eiXn4zqDA@mail.gmail.com Whole thread Raw |
In response to | Re: [ADMIN] pg_basebackup blocking all queries with horrible performance (Magnus Hagander <magnus@hagander.net>) |
Responses |
Re: [ADMIN] pg_basebackup blocking all queries with
horrible performance
|
List | pgsql-hackers |
On Mon, Jul 2, 2012 at 4:01 AM, Magnus Hagander <magnus@hagander.net> wrote: > On Sun, Jul 1, 2012 at 7:14 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> >> On Fri, Jun 29, 2012 at 7:22 PM, Magnus Hagander <magnus@hagander.net> wrote: >>> On Wed, Jun 27, 2012 at 7:24 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >>>> On Thu, Jun 21, 2012 at 3:18 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>>>> On Wed, Jun 20, 2012 at 7:18 AM, Magnus Hagander <magnus@hagander.net> wrote: >>>>>>>>> You agreed to add something like NOSYNC option into START_REPLICATION command? >>>>>>>> >>>>>>>> I'm on the fence. I was hoping somebody else would chime in with an >>>>>>>> opinion as well. >>>>>>> >>>>>>> +1 >>>>>> >>>>>> Nobody else with any opinion on this? :( >>>>> >>>>> I don't think we really need a NOSYNC flag at this point. Just not >>>>> setting the flush location in clients that make a point of flushing in >>>>> a timely fashion seems fine. >>>> >>>> Okay, I'm in the minority, so I'm writing the patch that way. WIP >>>> patch attached. >>>> >>>> In the patch, pg_basebackup background process and pg_receivexlog always >>>> return invalid location as flush one, and will never become sync standby even >>>> if their name is in synchronous_standby_names. The timing of their sending >>> >>> That doesn't match with the patch, afaics. The patch always sets the >>> correct write location, which means it can become a remote_write >>> synchronous standby, no? It will only send it back when timeout >>> expires, but it will be sent back. >> >> No. Though correct write location is sent back, they don't become sync standby >> because flush location is always invalid. While flush location is >> invalid, the master >> will never regard the remote server as sync one even if synchronous_commit is >> set to remote_write. > > Oh. I wasn't aware of that part. > > >>> I wonder if that might actually be a more reasonable mode of operation >>> in general: >>> >>> * always send back the write position, at the write interval >>> * always send back the flush position, when we're flushing (meaning >>> when we switch xlog) >>> >>> have an option that makes it possible to: >>> * always send back the write position as soon as it changes (making >>> for a reasonable remote_write sync standby) >>> * actually flush the log after each write instead of end of file >>> (making for a reasonable full sync standby) >>> >>> meaning you'd have something like "pg_receivexlog --sync=write" and >>> "pg_receivexlog --sync=flush" controlling it instead. >> >> Yeah, in this way, pg_receivexlog can become sync even if >> synchronous_commit is on, which seems more useful. But >> I'm thinking that the synchronous pg_receivexlog stuff should >> be postponed to 9.3 because its patch seems to become too >> big to apply at this beta stage. So, in 9.2, to fix the problem, >> what about just applying the simple patch which prevents >> pg_basebackup background process and pg_receivexlog from >> becoming sync standby whatever synchronous_standby_names >> and synchronous_commit are set to? > > Agreed. > > With the addition that we should set the write location, because > that's very useful and per what you said above should be perfectly > safe. > > >>> And deal with the "user put * in synchronous_standby_names and >>> accidentally got pg_receivexlog as the sync standby" by more clearly >>> warning people not to use * for that parameter... Since it's simply >>> dangerous :) >> >> Yep. > > What would be good wording? Something along the line of "Using the * > entry is not recommended since it can lead to unexpected results when > new standbys are added" or something like that? > > >>>> the reply depends on the standby_message_timeout specified in -s option. So >>>> the write position may lag behind the true position. >>>> >>>> pg_receivexlog accepts new option -S (better option character?). If this option >>>> is specified, pg_receivexlog returns true flush position, and can become sync >>>> standby. It sends back the reply to the master each time the write position >>>> changes or the timeout passes. If synchronous_commit is set to remote_write, >>>> synchronous replication to pg_receivexlog would work well. >>> >>> Yeah, I hadn't considered the remote_write mode, but I guess that's >>> why you have to track the current write position across loads, which >>> first confused me. >> >> The patch has to track the current write location to decide whether to send >> back the reply to the master, IOW to know whether the write location >> has changed, IOW to know whether we've already sent the reply about >> the latest write location. > > Yeha, makes perfect sense. > > >>> Looking at some other usecases for this, I wonder if we should also >>> force a status message whenever we switch xlog files, even if we >>> aren't running in sync mode, even if the timeout hasn't expired. I >>> think that would be a reasonable thing to do, since you often want to >>> track things based on files. >> >> You mean that the pg_receivexlog should send back the correct flush >> location whenever it switches xlog files? > > No, I mean just send back a status message. Meaning that without > specifiying the sync modes per above, it would send back the *write* > location. This would be useful for tracking xlog filenames between > master and pg_receivexlog, without extra delay. > >>>> The patch needs more documentation. But I think that it's worth reviewing the >>>> code in advance, so I attached the WIP patch. Comments? Objections? >>> >>> Looking at the code, what exactly prompts the changes to the backend >>> side? That seems unrelated? Are we actually considering picking a >>> standby with InvalidXlogRecPtr as a sync standby today? >>> >>> Isn't it enough to just send the proper write and flush locations from >>> the frontend? >> >> No, unless I'm missing something. >> >> The problem that we should address first is that the master can pick up >> pg_basebackup background process and pg_receivexlog as a sync >> standby even if they always return an invalid write and flush positions. >> Since they don't send any correct write and flush positions, if they are >> accidentally regarded as sync standby, all transactions can get blocked >> infinitely. So the patch needed to change the walsender code so that it >> doesn't pick up the remote server as sync one while its flush position >> is invalid. > > Yeah, that is clearly wrong. I think I missed this behaviour, and got > confused by the fact that the patch was trying to fix two different > things - only one of which I was aware of. > > So yes, per above, let's isolate out this part as one patch and get > that into 9.2, along with the "set the proper write location", but > leave everything else for 9.3. Agreed. The attached patch always sets the correct write location and prevents the remote server sending back invalid flush location from becoming sync standby. Regards, -- Fujii Masao
Attachment
pgsql-hackers by date: