Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel - Mailing list pgsql-hackers
From: Greg Stark
Subject: Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel
Date:
Msg-id: CAM-w4HM7huYyeZDzwNwb3Doq8hGVXivUcbB-7SZnaowVp13Zgg@mail.gmail.com
In response to: Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel
List: pgsql-hackers
On Thu, Oct 11, 2012 at 2:40 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think I've mentioned it before, but in the interest of not being
>> seen to critique the bikeshed only after it's been painted: this
>> design gives up something very important that exists in our current
>> built-in replication solution, namely pipelining.
>
> Isn't there an even more serious problem, namely that this assumes
> *all* transactions are serializable? What happens when they aren't?
> Or even just that the effective commit order is not XID order?

Firstly, I haven't read the code, but I'm confident it doesn't make the
elementary error of assuming commit order == xid order. I assume it's
applying the reassembled transactions in commit order.

I don't think it assumes the transactions are serializable, because it's
only concerned with writes, not reads. The transaction it's replaying may
or may not have been able to view data written by other transactions that
committed earlier, but that doesn't matter when reproducing its effects
using constants. The data this transaction wrote becomes visible all at
once when it commits, or not at all. Even in non-serializable mode,
updates take row locks, and nobody can see or modify the new data until
the transaction commits.

I have to say I was curious about Robert's point as well when I read
Peter's review, especially because this is exactly how the other logical
replication systems I've seen work, and I've always wondered about it in
those systems too. Both MySQL and Oracle reassemble transactions and
don't write anything until they have the whole transaction reassembled.

That has always struck me as a bizarre and obviously bad thing to do,
though. It seems to me it would be better to create sessions (or
autonomous transactions) for each transaction seen in the stream and
issue the DML as it shows up, committing and cleaning each one up when
its commit or abort (or shutdown or startup) record comes along, as
sketched below.

I imagine the reason lies in dealing with locking and ensuring that you
get correct results without deadlocks when multiple transactions try to
update the same record. But it seems to me the locks the source database
originally took should protect you against any such problems. As long as
you can suspend a transaction when it takes a lock that blocks, and keep
processing WAL for other transactions (or an abort for that transaction,
if one happened due to a deadlock or user interruption), you should be
fine.

--
greg
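A minimal sketch of the per-session apply strategy described in the mail
above, in Python rather than anything taken from the patch. The decoded
record layout, the psycopg2 connections, and the apply_stream/DSN names
are assumptions made purely for illustration:

import psycopg2


def apply_stream(changes, dsn="dbname=replica"):
    """Replay decoded changes as they arrive: one replica session per
    source transaction, committed or rolled back only when that
    transaction's commit/abort record shows up in the stream."""
    sessions = {}  # source xid -> open connection on the replica

    for rec in changes:
        xid = rec["xid"]

        if rec["kind"] == "change":
            conn = sessions.get(xid)
            if conn is None:
                # First change for this xid: open a dedicated session.
                conn = sessions[xid] = psycopg2.connect(dsn)
            # Apply the DML immediately. In this synchronous sketch a
            # blocked row lock stalls the whole loop; a real apply
            # process would issue this asynchronously so the other
            # sessions keep going, which is the "suspend a blocked
            # transaction" point made in the mail.
            conn.cursor().execute(rec["sql"], rec.get("params"))

        elif rec["kind"] in ("commit", "abort"):
            conn = sessions.pop(xid, None)
            if conn is not None:
                if rec["kind"] == "commit":
                    conn.commit()
                else:
                    conn.rollback()
                conn.close()

Fed a stream of change records for a given xid followed by its commit
record, this applies each change as soon as it is decoded but makes the
results visible on the replica only at the source transaction's commit,
in contrast to the reassemble-then-apply approach the mail attributes to
MySQL and Oracle.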