Thread: Re: QUESTION: Replication
> Seasons Greetings Bruce, > > Sorry to bother you but do you know of anyone working on replication > with PostgreSQL? I saw in the past, pgsnap (PG Snapshot) but I don't > know if anyone's working on replication these days. There was a > thread in the past and I sent them messages but haven't replied yet. > So I'm hoping you would know of someone or point me in the right > direction. We need major work in this area, or at least a plan and an FAQ item. We are getting major questions on this, and I don't know enough even to make an FAQ item telling people their options. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > We need major work in this area, or at least a plan and an FAQ item. > We are getting major questions on this, and I don't know enough even to > make an FAQ item telling people their options. My 2 cents, or 2 ören since I'm a Swede, on this: It is pretty simple to build a replication with pg_dump, transfer, empty replic and reload. But if we want "live replicas" we better base our efforts on a mechanism using WAL-logs to rollforward the replicas. regards, ----------------- Göran Thyni On quiet nights you can hear Windows NT reboot!
Hello Goran, > > We need major work in this area, or at least a plan and an FAQ item. > > We are getting major questions on this, and I don't know enough even to > > make an FAQ item telling people their options. > It is pretty simple to build a replication with pg_dump, transfer, > empty replic and reload. Simple process for simple replication. This would not work optimally for replication through the WAN (or Internet) since it does not do differential replication (and therefore transfering dumps isn't good enough). Then one has to worry about the integrity of the replicated data. Then we need to answer the n-way question (any db server may be a source for the replication). Regards, Neil D. Quiogue STO - dotPH, Inc. "Nothing great was ever achieved without enthusiasm." - Ralph WaldoEmerson
One issue that needs to be handled with replication is the synchronization of serial/sequence values, especially when these are used as primary/foreign keys. Here's part of how I've seen this handled, for what it's worth to anyone who might work on this. I'm sure there are better solutions, but its food for thought. There was the notion of a 'primary' db server and any number of 'secondary' db servers. Replication would occur on the 'secondaries' by serially hot-streaming a log of all successful state-changing queries (creates, drops, inserts, updates, and deletes) to a replayer on each downstream 2ndary server. The replayers would then re-execute those queries on that server. Q: How would you keep track of where in the replay log you were if a server went down, etc.? A: Each secondary dbserver had a table with a single record that noted the filename and offset of the log it was currently processing. The replayer would read the logs and update this table as it processed the log. Q: How were serials/sequences kept in sync? A: A special INSERT command was created called 'SINSERT' (as in "Serial INSERT"). When the primary db server saw this, it knew to log the query with the explicit value of the primary's sequence/serial rather than allow the downstream secondaries to autogenerate it. Q: Did the databases ever get out of sync? A: Yes, occasionally. It was not bulletproof. If things got way out of sync, a backup copy was put out on all servers (with some data loss, which was acceptable). Q: How did it handle transaction effects, serialization level, etc.? A: The db had no transactions, so it didn't handle this at all. This is the hard part of the problem in my view. Cheers, Ed Loehr