Re: New replication mode: write - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: New replication mode: write |
Date | |
Msg-id | CAHGQGwE+Zxk_yNw0rw8bo__+YzFOvEw3HtCb+8FQL=fzTaPxJA@mail.gmail.com |
In response to | New replication mode: write (Fujii Masao <masao.fujii@gmail.com>) |
Responses |
Re: New replication mode: write
Re: New replication mode: write |
List | pgsql-hackers |
On Fri, Jan 13, 2012 at 7:30 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, Jan 13, 2012 at 9:15 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Fri, Jan 13, 2012 at 7:41 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>>> Thought? Comments?
>>
>> This is almost exactly the same as my patch series
>> "syncrep_queues.v[1,2].patch" earlier this year. Which I know because
>> I was updating that patch myself last night for 9.2. I'm about half
>> way through doing that, since you and I agreed in Ottawa I would do
>> this. Perhaps it is better if we work together?
>
> I think this comment is mostly pointless. We don't have time to work
> together and there's no real reason to. You know what you're doing, so
> I'll leave you to do it.
>
> Please add the Apply mode.

OK, will do.

> In my patch, the reason I avoided doing WRITE mode (which we had
> previously referred to as RECV) was that no fsync of the WAL contents
> takes place. In that case we are applying changes using un-fsynced WAL
> data and in case of crash this would cause a problem.

My patch has not changed the execution order of WAL flush and replay.
WAL records are always replayed after they are flushed by walreceiver.
So, such a problem doesn't happen.

But this means that a transaction might need to wait for a WAL flush
caused by a previous transaction even if WRITE mode is chosen. That limits
the performance gain from WRITE mode, and should be improved later, I think.

> I was going to
> make the WalWriter available during recovery to cater for that. Do you
> not think that is no longer necessary?

That's still necessary to improve the performance in sync rep further,
I think. What I'd like to do (maybe in 9.3dev) after supporting WRITE mode
is:

* Allow WAL records to be replayed before they are flushed to the disk.

* Add a new GUC parameter specifying whether to allow the standby to defer
  WAL flush. If the parameter is false, walreceiver flushes WAL whenever
  it receives WAL (i.e., it's the same as the current behavior). If true,
  walreceiver doesn't flush WAL at all. Instead, walwriter, a backend or
  the startup process does that. Walwriter periodically checks whether
  there is an un-flushed WAL file, and flushes it if one exists. When a
  buffer page is written out, the backend or startup process forces a WAL
  flush up to the buffer's LSN.

If the above GUC parameter is set to true (i.e., walreceiver doesn't
flush WAL at all) and WRITE mode is chosen, a transaction doesn't need to
wait for a WAL flush on the standby at all. Also, WAL would be flushed
less frequently on the standby, which significantly reduces I/O load.
As a result, the performance of sync rep would improve considerably.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
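
The deferred-flush proposal in the message above can be modeled with a small toy simulation. This is only a sketch under stated assumptions, not PostgreSQL code: the `Standby` class, the `defer_wal_flush` name (the message does not name the GUC), the integer LSNs, and the mode strings `"write"` and `"fsync"` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Standby:
    """Toy model of a standby's WAL positions (LSNs are plain integers)."""

    defer_wal_flush: bool  # hypothetical name for the proposed GUC
    write_lsn: int = 0     # WAL received and written, not necessarily fsynced
    flush_lsn: int = 0     # WAL known to be durable on disk
    flush_count: int = 0   # number of simulated fsyncs performed

    def flush_up_to(self, lsn: int) -> None:
        """Simulate an fsync covering WAL up to lsn; no-op if already durable."""
        target = min(lsn, self.write_lsn)
        if target > self.flush_lsn:
            self.flush_lsn = target
            self.flush_count += 1

    def walreceiver_receive(self, end_lsn: int) -> None:
        """walreceiver writes incoming WAL and flushes only if not deferred."""
        self.write_lsn = max(self.write_lsn, end_lsn)
        if not self.defer_wal_flush:
            self.flush_up_to(self.write_lsn)

    def walwriter_tick(self) -> None:
        """Periodic background check: flush WAL that is written but unflushed."""
        self.flush_up_to(self.write_lsn)

    def write_out_buffer(self, page_lsn: int) -> None:
        """Before a data page is written out, force WAL flush up to its LSN."""
        self.flush_up_to(page_lsn)

    def ack_lsn(self, mode: str) -> int:
        """Position a standby would report for the given (assumed) sync mode."""
        return self.write_lsn if mode == "write" else self.flush_lsn
```

In this model, receiving three WAL chunks with `defer_wal_flush=False` performs three flushes, one per receive. With `defer_wal_flush=True` the same three receives perform none, so `ack_lsn("write")` advances immediately while `ack_lsn("fsync")` stays behind until `walwriter_tick()` or `write_out_buffer()` runs.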