Home > mailing lists
Postgres Replication - Mailing list pgsql-hackers

From	Darren Johnson
Subject	Postgres Replication
Date	June 11, 2001 15:48:24
Msg-id	20010611.19464400@j2.us.greatbridge.com Whole thread
Responses	Re: Postgres Replication
List	pgsql-hackers
Tree view
We have been researching replication for several months now, and
I have some opinions to share to the community for feedback,
discussion, and/or participation. Our goal is to get a replication
solution for PostgreSQL that will meet most needs of users
and applications alike (mission impossible theme here :). 

My research work along with others contributors has been collected
and presented here http://www.greatbridge.org/genpage?replication_top
If there is something missing, especially PostgreSQL related
work, I would like to know about it, and my apologies to any
one who got left off the list. This work is ongoing and doesn't
draw a conclusion, which IMHO should be left up to the user,
but I'm offering my opinions to spur discussion and/or feed back
from this list, and try not to offend any one.

Here's my opinion: of the approaches we've surveyed, the most
promising one is the Postgres-R project from the Information and
Communication Systems Group, ETH  in Zurich, Switzerland, originally 
produced by Bettina Kemme, Gustavo Alonso, and others.  Although 
Postgres-R is a synchronous approach, I believe it is the closest to 
the goal mentioned above. Here is an abstract of the advantages.

1) Postgres-R is built on the PostgreSQL-6.4.2 code base.  The 
replication 
functionality is an optional parameter, so there will be insignificant 
overhead for non replication situations. The replication and 
communication
managers are the two new modules added to the PostgreSQL code base.

2) The replication manager's main function is controlling the
replication protocol via a message handling process. It receives
messages from the local and remote backends and forwards write
sets and decision messages via the communication manager to the
other servers. The replication manager controls all the transactions 
running on the local server by keeping track of the states, including 
which protocol phase (read, send, lock, or write) the transaction is
in. The replication manager maintains a two way channel
implemented as buffered sockets to each backend.

3) The main task of the communication manager is to provide simple
socket based interface between the replication manager and the
group communication system (currently Ensemble). The
communication system is a cluster of servers connected via
the communication manager.  The replication manager also maintains
three one-way channels to the communication system: a broadcast
channel to send messages, a total-order channel to receive
totally orders write sets, and a no-order channel to listen for
decision messages from the communication system. Decision
messages can be received at any time where the reception of
totally ordered write sets can be blocked in certain phases.

4) Based on a two phase locking approach, all dead lock situations
are local and detectable by Postgres-R code base, and aborted.

5) The write set messages used to send database changes to other
servers, can use either the SQL statements or the actual tuples
changed. This is a parameter based on number of tuples changed
by a transaction. While sending the tuple changes reduces
overhead in query parse, plan and execution, there is a negative
effect in sending a large write set across the network.

6) Postgres-R uses a synchronous approach that keeps the data on 
all sites consistent and provides serializability. The user does not 
have to bother with conflict resolution, and receives the same 
correctness and consistency of a centralized system.

7) Postgres-R could be part of a good fault-resilient and load 
distribution 
solution.  It is peer-to-peer based and incurs low overhead propagating 
updates to the other cluster members.  All replicated databases locally 
process queries.

8) Compared to other synchronous replication strategies (e.g., standard 
distributed 2-phase-locking + 2-phase-commit), Postgres-R has much 
better performance using 2-phase-locking.


There are some issues that are not currently addressed by
Postgres-R, but some enhancements made to PostgreSQL since the
6.4.2 tree are very favorable to addressing these short comings.

1) The addition of WAL in 7.1 has the information for recovering 
failed/off-line servers, currently all the servers would have to be 
stopped, and a copy would be used to get all the servers synchronized
before starting again. 

2)Being synchronous, Postgres-R would not be a good solution 
for off line/WAN scenarios where asynchronous replication is 
required.  There are some theories on this issue which involve servers
connecting and disconnecting from the cluster.

3)As in any serialized synchronous approach there is  change in the 
flow of execution of a transaction; while most of these changes can 
be solved by calling newly developed functions at certain time points, 
synchronous replica control is tightly coupled with the concurrency 
control.
Hence, especially in PostgreSQL 7.2 some parts of the concurrency control
(MVCC) might have to be adjusted. This can lead to a slightly more 
complicated maintenance than a system that does not change the backend.

4)Partial replication is not addressed. 


Any feedback on this post will be appreciated.

Thanks,

Darren
pgsql-hackers by date:
From: Tom Lane
Date: 11 June 2001, 15:16:21
Subject: Re: OID Wrap
From: Vincent Roberts
Date: 11 June 2001, 16:23:21
Subject: Calling lo_open within user defined C function
Postgres Replication - Mailing list pgsql-hackers

Previous

Next