Re: Synchronous Log Shipping Replication - Mailing list pgsql-hackers
From: Markus Wanner
Subject: Re: Synchronous Log Shipping Replication
Date:
Msg-id: 48C230AD.1060606@bluegap.ch
In response to: Synchronous Log Shipping Replication ("Fujii Masao" <masao.fujii@gmail.com>)
Responses: Re: Synchronous Log Shipping Replication
List: pgsql-hackers
Hi,

Fujii Masao wrote:
> Pavan re-designed the sync replication based on the prototype
> and I posted that design doc on wiki. Please check it if you
> are interested in it.
> http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects

I've read that wiki page and allow myself to comment from a Postgres-R developer's perspective ;-)

R1: "without ... any negative performance overhead"? For fully synchronous replication, that's clearly not possible. I guess that applies only to async WAL shipping.

NR3: who is supposed to do failure detection and manage automatic failover? How does integration with such an additional tool work?

I got distracted by the SBY and ACT abbreviations. Why abbreviate "standby" or "active" at all? It's not like we don't already have enough three-letter acronyms, but those stand for rather more complex terms than single words.

Standby Bootstrap: "stopping the archiving at the ACT" doesn't prevent overwriting WAL files in pg_xlog. It just stops archiving a WAL file before it gets overwritten - which clearly doesn't solve the problem here.

How is communication done? "Serialization of WAL shipping" had better not mean serialization on the network, i.e. the WAL Sender Process should be able to await acknowledgment of multiple WAL packets in parallel, otherwise the interconnect latency might turn into a bottleneck.

How is communication done? What happens if the link between the active and standby goes down? Or if it's temporarily unavailable for some time?

The IPC mechanism reminds me a lot of what I did for Postgres-R, which also has a central "replication manager" process that receives changesets from multiple backends. I've implemented an internal messaging mechanism based on shared memory and signals, using only Postgres methods. It allows arbitrary processes to send messages to each other by process id.
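Stripped of all queueing, locking and error handling, the pattern boils down to the toy sketch below. It uses plain POSIX calls instead of the Postgres shared-memory and signal wrappers, handles a single message instead of a queue, and every name in it is made up for illustration - it is not the actual Postgres-R code:

```c
/*
 * Toy sketch: pass a message through shared memory and notify the
 * recipient, addressed by pid, with a signal. Error handling omitted.
 */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct
{
    pid_t   sender;
    char    payload[128];
} Message;

static volatile sig_atomic_t got_message = 0;

static void
wakeup(int sig)
{
    (void) sig;
    got_message = 1;            /* just note that something is waiting */
}

int
main(void)
{
    /* Anonymous shared mapping, inherited by the child across fork(). */
    Message    *msg = mmap(NULL, sizeof(Message), PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    sigset_t    block, orig;
    pid_t       child;

    signal(SIGUSR1, wakeup);

    /* Block SIGUSR1 so the wakeup cannot slip in before we wait for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &orig);

    child = fork();
    if (child == 0)
    {
        /* Recipient: sleep until the sender's signal arrives. */
        while (!got_message)
            sigsuspend(&orig);
        printf("process %d got \"%s\" from process %d\n",
               (int) getpid(), msg->payload, (int) msg->sender);
        return 0;
    }

    /* Sender: fill in the message, then wake the recipient by pid. */
    msg->sender = getpid();
    strcpy(msg->payload, "WAL chunk ready");
    kill(child, SIGUSR1);

    waitpid(child, NULL, 0);
    return 0;
}
```

The real implementation of course wraps this in the Postgres shared memory and locking primitives and keeps proper message queues, but the addressing-by-pid and signal-as-wakeup idea is the same.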
Moving the WAL Sender and WAL Receiver processes under the control of the postmaster certainly sounds like a good thing. After all, those are fiddling with Postgres internals.

> This design is too huge. In order to enhance the extensibility
> of postgres, I'd like to divide the sync replication into
> minimum hooks and some plugins and to develop it, respectively.
> Plugins for the sync replication plan to be available at the
> time of 8.4 release.

Hooks again? I bet you all know by now that my excitement for hooks has always been pretty narrow. ;-)

> In my design, WAL sending is achieved as follows by WALSender.
> WALSender is a new process which I introduce.
>
> 1) On COMMIT, backend requests WALSender to send WAL.
> 2) WALSender reads WAL from walbuffers and sends it to the slave.
> 3) WALSender waits for the response from the slave and replies
>    to the backend.
>
> I propose two hooks for WAL sending.
>
> WAL-writing hook
> ----------------
> This hook is for the backend to communicate with WALSender.
> The WAL-writing hook intercepts the write system call in XLogWrite.
> That is, the backend requests WAL sending whenever write is called.
>
> The WAL-writing hook is also available for other uses, e.g.
> software RAID (writing WAL into two files for durability).
>
> Hook for WALSender
> ------------------
> This hook is for introducing WALSender. There are the following
> three ideas of how to introduce WALSender. The required hook
> differs depending on which idea is adopted.
>
> a) Use WALWriter as WALSender
>
> This idea needs a WALWriter hook which intercepts WALWriter
> literally. WALWriter stops the local WAL write and focuses on
> WAL sending. This idea is very simple, but I can't think of
> any use of the WALWriter hook other than WAL sending.
>
> b) Use a new background process as WALSender
>
> This idea needs a background-process hook which enables users
> to define new background processes. I think the design of this
> hook resembles that of the rmgr hook proposed by Simon. I define
> a table like RmgrTable. It's for registering some functions
> (e.g. main function and exit...) for operating a background
> process. The postmaster calls the functions from the table suitably,
> and manages the start and end of the background process. ISTM that
> there are many uses for this hook, e.g. a performance monitoring
> process like statspack.
>
> c) Use one backend as WALSender
>
> In this idea, the slave calls a user-defined function which
> takes charge of WAL sending via SQL, e.g. "SELECT pg_walsender()".
> Compared with the other ideas, it's easy to implement WALSender
> because the postmaster handles the establishment and authentication
> of the connection. But this SQL causes a long transaction which
> prevents vacuum. So this idea needs an idle-state hook which
> executes the plugin before a transaction starts. I can't think of
> any use of this hook other than WAL sending either.
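If I understand the WAL-writing hook described above correctly, it boils down to the usual function-pointer pattern: a plugin-settable pointer that XLogWrite consults right where it hands buffers to write, so a WAL sender can ship the same bytes. A self-contained toy sketch - every name in it (XLogWrite_hook, walsender_ship and the stand-in XLogWrite itself) is invented for illustration, nothing here is existing code:

```c
/*
 * Toy sketch of a WAL-writing hook: a plugin-settable function pointer
 * that gets to see each buffer before the local write. All names are
 * invented for illustration.
 */
#include <stdio.h>
#include <string.h>

/* Hook type: receives the buffer that is about to hit the local WAL file. */
typedef void (*XLogWrite_hook_type) (const char *buf, size_t nbytes);

/* NULL by default; a replication plugin would install its function here. */
static XLogWrite_hook_type XLogWrite_hook = NULL;

/* Stand-in for the real XLogWrite(): let the hook see the data, then
 * perform the local write (to stdout here, to a WAL segment in reality). */
static void
XLogWrite(const char *buf, size_t nbytes)
{
    if (XLogWrite_hook)
        XLogWrite_hook(buf, nbytes);    /* e.g. hand off to the WAL sender */

    fwrite(buf, 1, nbytes, stdout);
}

/* What a synchronous replication plugin might install: ship the bytes to
 * the standby and block until it acknowledges them (faked with a printf). */
static void
walsender_ship(const char *buf, size_t nbytes)
{
    (void) buf;                 /* a real sender would push these bytes onto the wire */
    printf("shipping %zu bytes to the standby, waiting for ack...\n", nbytes);
}

int
main(void)
{
    const char *record = "fake WAL record\n";

    XLogWrite(record, strlen(record));  /* no plugin loaded: plain local write */

    XLogWrite_hook = walsender_ship;    /* plugin registers itself */
    XLogWrite(record, strlen(record));  /* now every write is also shipped */
    return 0;
}
```

Whether the installed function then waits for an acknowledgment per write, or keeps several packets in flight, is exactly the serialization question raised further up.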
The above-cited wiki page sounds like you've already decided for b). I'm unclear on what you want hooks for. If additional processes get integrated into Postgres, those certainly need to be integrated very much like we integrated the other auxiliary processes. I wouldn't call that 'hooking', but YMMV.

Regards

Markus Wanner