Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel - Mailing list pgsql-hackers
From: Greg Stark
Subject: Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel
Date:
Msg-id: CAM-w4HM7huYyeZDzwNwb3Doq8hGVXivUcbB-7SZnaowVp13Zgg@mail.gmail.com
In response to: Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel
List: pgsql-hackers
On Thu, Oct 11, 2012 at 2:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think I've mentioned it before, but in the interest of not being
>> seen to critique the bikeshed only after it's been painted: this
>> design gives up something very important that exists in our current
>> built-in replication solution, namely pipelining.
>
> Isn't there an even more serious problem, namely that this assumes
> *all* transactions are serializable? What happens when they aren't?
> Or even just that the effective commit order is not XID order?

Firstly, I haven't read the code, but I'm confident it doesn't make the
elementary error of assuming commit order == xid order. I assume it's
applying the reassembled transactions in commit order.

I don't think it assumes the transactions are serializable, because it's
only concerned with writes, not reads. The transaction it's replaying may
or may not have been able to view data written by other transactions that
committed earlier, but that doesn't matter when reproducing its effects
using constants. The data this transaction wrote becomes visible all at
once when it commits, or not at all. Even in non-serializable mode,
updates take row locks, and nobody can see or modify the new data until
the transaction commits.

I have to say I was curious about Robert's point as well when I read
Peter's review, especially because this is exactly how the other logical
replication systems I've seen work, and I've always wondered about it in
those systems too. Both MySQL and Oracle reassemble transactions and
don't write anything until they have the whole transaction reassembled.

That has always struck me as a bizarre and obviously bad thing to do,
though. It seems to me it would be better to create sessions (or
autonomous transactions) for each transaction seen in the stream and
issue the DML as it shows up, committing and cleaning each one up when
its commit or abort (or shutdown or startup) record comes along, as
sketched below.

I imagine the reason lies in dealing with locking and ensuring that you
get correct results without deadlocks when multiple transactions try to
update the same record. But it seems to me the locks the source database
originally took should protect you against any such problems. As long as
you can suspend a transaction when it takes a lock that blocks, and keep
processing WAL for other transactions (or an abort for that transaction,
if one happened due to a deadlock or user interruption), you should be
fine.

--
greg
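A minimal sketch of the per-session apply strategy described in the mail
above, in Python rather than anything taken from the patch. The decoded
record layout, the psycopg2 connections, and the apply_stream/DSN names
are assumptions made purely for illustration:

import psycopg2


def apply_stream(changes, dsn="dbname=replica"):
    """Replay decoded changes as they arrive: one replica session per
    source transaction, committed or rolled back only when that
    transaction's commit/abort record shows up in the stream."""
    sessions = {}  # source xid -> open connection on the replica

    for rec in changes:
        xid = rec["xid"]

        if rec["kind"] == "change":
            conn = sessions.get(xid)
            if conn is None:
                # First change for this xid: open a dedicated session.
                conn = sessions[xid] = psycopg2.connect(dsn)
            # Apply the DML immediately. In this synchronous sketch a
            # blocked row lock stalls the whole loop; a real apply
            # process would issue this asynchronously so the other
            # sessions keep going, which is the "suspend a blocked
            # transaction" point made in the mail.
            conn.cursor().execute(rec["sql"], rec.get("params"))

        elif rec["kind"] in ("commit", "abort"):
            conn = sessions.pop(xid, None)
            if conn is not None:
                if rec["kind"] == "commit":
                    conn.commit()
                else:
                    conn.rollback()
                conn.close()

Fed a stream of change records for a given xid followed by its commit
record, this applies each change as soon as it is decoded but makes the
results visible on the replica only at the source transaction's commit,
in contrast to the reassemble-then-apply approach the mail attributes to
MySQL and Oracle.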