Re: PATCH: 9.5 replication origins fix for logical decoding - Mailing list pgsql-hackers

From	Craig Ringer
Subject	Re: PATCH: 9.5 replication origins fix for logical decoding
Date	October 16, 2015 03:51:31
Msg-id	CAMsr+YFuS5hrAZRQmZJM9+uuy5mZvsTNL+4twPALw3CjNp_itQ@mail.gmail.com
In response to	Re: PATCH: 9.5 replication origins fix for logical decoding (Andres Freund <andres@anarazel.de>)
Responses	Re: PATCH: 9.5 replication origins fix for logical decoding
List	pgsql-hackers

On 15 October 2015 at 20:55, Andres Freund <andres@anarazel.de> wrote:
> On 2015-10-15 20:52:41 +0800, Craig Ringer wrote:
>> You'll note that the tests fail. When the replication origin is reset
>> and set again with pg_replication_origin_xact_setup mid-xact, the
>> origin identity exposed to the decoding plugin callbacks for all
>> records (including those created before the origin change) is the
>> latter origin, the one active at COMMIT time.
>>
>> Is that the intended behaviour? That the session identifier is
>> determined by what was active at commit time, and only the lsn and
>> timestamp vary throughout the xact? It looks like it from the code.
>
> Uh. Isn't that just because you looked at txn->origin_id instead of the
> change's origin_id?

Yes, it is. I didn't realise that the individual changes had their own
origins, rather than changing the origin in the txn, though I can see
that now that I know to look.

Either I'm confused (likely) or the concept behind allowing this is
critically flawed.

Say some client code does

set session origin=1
begin
set xact lsn=0/123, ts=13:00
do some inserts
set session origin=2
set xact lsn=0/199, ts=14:00
do some more inserts
commit

it seems to be decoded as:

begin origin=2 lsn=0/199
inserts origin=1 lsn=0/199
more inserts origin=2 lsn=0/199
commit origin=2 lsn=0/199

i.e. the begin and commit have the final session origin. Individual
changes have the session origin in effect at the time the change was
created. The last-set origin commit timestamp and origin lsn override
all prior ones; they aren't recorded per-change, only on the commit.
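
A toy model of the behaviour described above may make it easier to
follow. This is only a sketch in Python, not PostgreSQL code: the
Session class, its methods and the decode() helper are made-up names
for illustration, and the model assumes the decoding behaviour is
exactly as observed in the example (per-change origin captured when
the change is made, origin LSN and timestamp captured only at commit).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Session:
    """Hypothetical stand-in for session-level replication origin state."""
    origin_id: int = 0
    origin_lsn: Optional[str] = None
    origin_ts: Optional[str] = None
    changes: List[Tuple[str, int]] = field(default_factory=list)

    def set_session_origin(self, origin_id: int) -> None:
        self.origin_id = origin_id

    def set_xact(self, lsn: str, ts: str) -> None:
        # Later calls overwrite earlier ones; nothing is kept per change.
        self.origin_lsn, self.origin_ts = lsn, ts

    def insert(self, label: str) -> None:
        # Each change remembers the session origin in effect when it was made.
        self.changes.append((label, self.origin_id))

    def commit(self):
        # Transaction-level origin, LSN and timestamp are read at commit time.
        txn = (self.origin_id, self.origin_lsn, self.origin_ts)
        return txn, list(self.changes)


def decode(session: Session) -> List[str]:
    """Render the transaction the way the example shows it being decoded."""
    (origin_id, lsn, _ts), changes = session.commit()
    out = [f"begin origin={origin_id} lsn={lsn}"]
    out += [f"{label} origin={chg} lsn={lsn}" for label, chg in changes]
    out.append(f"commit origin={origin_id} lsn={lsn}")
    return out


s = Session()
s.set_session_origin(1)
s.set_xact("0/123", "13:00")
s.insert("inserts")
s.set_session_origin(2)
s.set_xact("0/199", "14:00")
s.insert("more inserts")
decoded = decode(s)
print("\n".join(decoded))
```

In this model the first batch of inserts keeps origin=1 but is paired
with the LSN that was set for origin 2, which is the mismatch in
question.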

This means you have change records with a change->origin_id that's
from a completely different node, which makes no sense at all alongside
the txn->origin_lsn. That LSN matches the txn->origin_id, which is the
same throughout, but then why even have the change->origin_id?

I find the idea of each change having its own origin node - but not
its own origin LSN - very confusing. For one thing the origin filter
callback can't know about that, and can only filter based on the txn's
origin. I guess that's the output plugin's problem - if it wants to
cope with arbitrary mixed-origin tx's it can't use the origin filter
and has to check each message.
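
The difference between the two filtering approaches can be sketched
like this. Again this is illustrative Python rather than the output
plugin API: txn_filter() and per_change_filter() are hypothetical
helpers, and the set of unwanted origin ids is an assumption made for
the example.

```python
from typing import List, Set, Tuple

Change = Tuple[str, int]  # (description, change-level origin id)


def txn_filter(txn_origin_id: int, unwanted: Set[int]) -> bool:
    """Return True to skip the whole transaction.

    Like the origin filter callback, this sees only the txn-level origin.
    """
    return txn_origin_id in unwanted


def per_change_filter(changes: List[Change], unwanted: Set[int]) -> List[Change]:
    """Keep only the changes whose own origin is not in the unwanted set."""
    return [c for c in changes if c[1] not in unwanted]


# The mixed-origin transaction from the example: commit-time origin is 2.
mixed = [("inserts", 1), ("more inserts", 2)]
txn_origin = 2
unwanted_origins = {1}

skip_whole_txn = txn_filter(txn_origin, unwanted_origins)
kept = per_change_filter(mixed, unwanted_origins)
print(skip_whole_txn, kept)
```

Here the txn-level check lets the transaction through because its
commit-time origin is 2, even though it carries changes from origin 1.
Only the per-change check drops them.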

I really don't see how it makes any sense to allow the origin_id to
change mid-tx. I can see how sending the origin_id for each change
could make sense to allow future support for transaction streaming
where decoding starts before we receive the commit record, but
changing the origin_id within the tx doesn't make any sense.

IMO changing the origin should be disallowed within a tx. Otherwise
there needs to be some way to record the origin lsn and commit
timestamp changing within the tx too.

I was going to just send a patch to disallow changing the origin
mid-tx, but I'm not sure I see a good way to do that since the origin
is a session-level global, not part of the xact info.

Document it as a "don't do that, if you do it you get to keep the pieces"?

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
