Re: PATCH: 9.5 replication origins fix for logical decoding - Mailing list pgsql-hackers
From | Craig Ringer |
---|---|
Subject | Re: PATCH: 9.5 replication origins fix for logical decoding |
Date | |
Msg-id | CAMsr+YFuS5hrAZRQmZJM9+uuy5mZvsTNL+4twPALw3CjNp_itQ@mail.gmail.com |
In response to | Re: PATCH: 9.5 replication origins fix for logical decoding (Andres Freund <andres@anarazel.de>) |
Responses | Re: PATCH: 9.5 replication origins fix for logical decoding |
List | pgsql-hackers |
On 15 October 2015 at 20:55, Andres Freund <andres@anarazel.de> wrote:
> On 2015-10-15 20:52:41 +0800, Craig Ringer wrote:
>> You'll note that the tests fail. When the replication origin is reset
>> and set again with pg_replication_origin_xact_setup mid-xact, the
>> origin identity exposed to the decoding plugin callbacks for all
>> records (including those created before the origin change) is the
>> latter origin, the one active at COMMIT time.
>>
>> Is that the intended behaviour? That the session identifier is
>> determined by what was active at commit time, and only the lsn and
>> timestamp vary throughout the xact? It looks like it from the code.
>
> Uh. Isn't that just because you looked at txn->origin_id instead of the
> change's origin_id?

Yes, it is. I didn't realise that the individual changes had their own
origins, rather than changing the origin in the txn, though I can see
that now that I know to look.

Either I'm confused (likely) or the concept behind allowing this is
critically flawed.

Say some client code does

    set session origin=1
    begin
    set xact lsn=0/123, ts=13:00
    do some inserts
    set session origin=2
    set xact lsn=0/199, ts=14:00
    do some more inserts
    commit

it seems to be decoded as:

    begin origin=2 lsn=0/199
    inserts origin=1 lsn=0/199
    more inserts origin=2 lsn=0/199
    commit origin=2 lsn=0/199

i.e. the begin and commit have the final session origin. Individual
changes have the session origin in effect at the time the change was
created. The last-set origin commit timestamp and origin lsn override
all prior ones; they aren't recorded per-change, only on the commit.

This means you have change records with a change->origin_id that's from
a completely different node, which makes no sense at all with the
txn->origin_lsn. It matches the txn->origin_id, which is the same
throughout, but then why even have the change->origin_id?

I find the idea of each change having its own origin node - but not its
own origin LSN - very confusing. For one thing the origin filter
callback can't know about that, and can only filter based on the txn's
origin. I guess that's the output plugin's problem - if it wants to cope
with arbitrary mixed-origin tx's it can't use the origin filter and has
to check each message.

I really don't see how it makes any sense to allow the origin_id to
change mid-tx. I can see how sending the origin_id for each change could
make sense to allow future support for transaction streaming where
decoding starts before we receive the commit record, but changing the
origin_id within the tx doesn't make any sense.

IMO changing the origin should be disallowed within a tx. Otherwise
there needs to be some way to record the origin lsn and commit timestamp
changing within the tx too.

I was going to just send a patch to disallow changing the origin mid-tx,
but I'm not sure I see a good way to do that since the origin is a
session-level global, not part of the xact info. Document it as a "don't
do that, if you do it you get to keep the pieces"?

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
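The behaviour described in the message can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Session`, `Txn` and `Change` classes and every function below are hypothetical stand-ins written for illustration, not PostgreSQL's actual data structures or API. The model assumes what the message reports: each change keeps the session origin in effect when it was made, while the transaction keeps only the origin id in effect at commit and the last origin LSN and timestamp set.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Change:
    description: str
    origin_id: Optional[int]  # session origin in effect when the change was made


@dataclass
class Txn:
    changes: List[Change] = field(default_factory=list)
    origin_id: Optional[int] = None   # filled in at commit time
    origin_lsn: Optional[str] = None  # last value set in the xact wins
    origin_ts: Optional[str] = None   # likewise


class Session:
    """Toy stand-in for a backend session (hypothetical, not PostgreSQL code)."""

    def __init__(self) -> None:
        self.origin_id: Optional[int] = None
        self.txn: Optional[Txn] = None

    def set_session_origin(self, origin_id: int) -> None:
        self.origin_id = origin_id

    def begin(self) -> None:
        self.txn = Txn()

    def set_xact(self, lsn: str, ts: str) -> None:
        # Each call overwrites the previous one; nothing is kept per change.
        self.txn.origin_lsn = lsn
        self.txn.origin_ts = ts

    def insert(self, description: str) -> None:
        self.txn.changes.append(Change(description, self.origin_id))

    def commit(self) -> Txn:
        txn, self.txn = self.txn, None
        txn.origin_id = self.origin_id  # whatever the session origin is now
        return txn


def decode(txn: Txn) -> List[Tuple[str, Optional[int], Optional[str]]]:
    """Return (event, origin_id, origin_lsn) rows as a plugin would see them."""
    rows = [("begin", txn.origin_id, txn.origin_lsn)]
    rows += [(c.description, c.origin_id, txn.origin_lsn) for c in txn.changes]
    rows.append(("commit", txn.origin_id, txn.origin_lsn))
    return rows


def run_example() -> Txn:
    """Replay the client sequence from the message."""
    s = Session()
    s.set_session_origin(1)
    s.begin()
    s.set_xact("0/123", "13:00")
    s.insert("inserts")
    s.set_session_origin(2)
    s.set_xact("0/199", "14:00")
    s.insert("more inserts")
    return s.commit()


def changes_surviving_txn_filter(txn: Txn, skip_origin: int) -> List[Change]:
    """Txn-level filter: all-or-nothing, decided by txn.origin_id alone."""
    return [] if txn.origin_id == skip_origin else list(txn.changes)


def changes_surviving_per_change_check(txn: Txn, skip_origin: int) -> List[Change]:
    """Per-change check: the only way to handle a mixed-origin transaction."""
    return [c for c in txn.changes if c.origin_id != skip_origin]
```

In this model, `decode(run_example())` yields the four rows listed in the message. Filtering on origin 1 at the transaction level lets both inserts through, because the transaction's origin is 2; only the per-change check drops the origin-1 insert.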