Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAA4eK1LasocmhFeBb0D4ixf_J=pDr1OYdyTnEvbhcYToqA=GMw@mail.gmail.com |
In response to | Re: Skipping logical replication transactions on subscriber side (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: Skipping logical replication transactions on subscriber side |
List | pgsql-hackers |
On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I'll submit the patch tomorrow.
>
> While updating the patch, I realized that skipping a transaction that
> is prepared on the publisher will be tricky a bit;
>
> First of all, since skip-xid is in pg_subscription catalog, we need to
> do a catalog update in a transaction and commit it to disable it. I
> think we need to set origin-lsn and timestamp of the transaction being
> skipped to the transaction that does the catalog update. That is,
> during skipping the (not prepared) transaction, we skip all
> data-modification changes coming from the publisher, do a catalog
> update, and commit the transaction. If we do the catalog update in the
> next transaction after skipping the whole transaction, skip_xid could
> be left in case of a server crash between them.
>

But if we haven't updated origin_lsn/timestamp before the crash, won't
it request the same transaction again from the publisher? If so, it
will be again able to skip it because skip_xid is still not updated.

> Also, we cannot set
> origin-lsn and timestamp to an empty transaction.
>

But won't we update the catalog for skip_xid in that case? Do we see
any advantage of updating the skip_xid in the same transaction vs.
doing it in a separate transaction? If not then probably we can choose
either of those ways and add some comments to indicate the possibility
of doing it another way.

> In prepared transaction cases, I think that when handling a prepare
> message, we need to commit the transaction to update the catalog,
> instead of preparing it. And at the commit prepared and rollback
> prepared time, we skip it since there is not the prepared transaction
> on the subscriber.
>

Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?

> Currently, handling rollback prepared already
> behaves so; it first checks whether we have prepared the transaction
> or not and skip it if haven’t. So I think we need to do that also for
> commit prepared case. With that, this requires protocol changes so
> that the subscriber can get prepare-lsn and prepare-time when handling
> commit prepared.
>
> So I’m writing a separate patch to add prepare-lsn and timestamp to
> commit_prepared message, which will be a building block for skipping
> prepared transactions. Actually, I think it’s beneficial even today;
> we can skip preparing the transaction if it’s an empty transaction.
> Although the comment it’s not a common case, I think that it could
> happen quite often in some cases:
>
> * XXX, We can optimize such that at commit prepared time, we first check
> * whether we have prepared the transaction or not but that doesn't seem
> * worthwhile because such cases shouldn't be common.
> */
>
> For example, if the publisher has multiple subscriptions and there are
> many prepared transactions that modify the particular table subscribed
> by one publisher, many empty transactions are replicated to other
> subscribers.
>

I think this is not clear to me. Why would one have multiple
subscriptions for the same publication? I thought it is possible when
say some publisher doesn't publish any data of prepared transaction
say because the corresponding action is not published or something
like that. I don't deny that someday we want to optimize this case but
it might be better if we don't need to do it along with this patch.

--
With Regards,
Amit Kapila.
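
The crash-recovery argument in the reply above (a crash after skipping but before the catalog update leaves both the origin LSN and skip_xid unchanged, so the transaction is requested again and skipped again) can be illustrated with a toy model. This is a minimal sketch and not PostgreSQL code: the `Subscriber` class, the `apply_stream` function, the `crash_after_skip` flag, and the integer LSNs are all hypothetical names and simplifications introduced for illustration. It assumes that the subscriber re-requests every transaction whose commit LSN is beyond its durable origin LSN, and that a skipped transaction does not advance the origin LSN until skip_xid is cleared.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# (xid, commit_lsn, changes) as sent by a hypothetical publisher
Txn = Tuple[int, int, List[str]]


@dataclass
class Subscriber:
    origin_lsn: int = 0              # durable replication progress (toy integer LSN)
    skip_xid: Optional[int] = None   # durable "skip this xid" setting
    applied: List[str] = field(default_factory=list)


def apply_stream(sub: Subscriber, txns: List[Txn],
                 crash_after_skip: bool = False) -> bool:
    """Apply every transaction beyond sub.origin_lsn.

    Returns False if a simulated crash happened after the changes of the
    skipped transaction were discarded but before skip_xid was cleared.
    """
    for xid, commit_lsn, changes in txns:
        if commit_lsn <= sub.origin_lsn:
            continue  # already durably processed, not re-applied
        if xid == sub.skip_xid:
            # The changes are discarded. Nothing durable has changed yet.
            if crash_after_skip:
                return False  # crash: origin_lsn and skip_xid both unchanged
            # Assumption: clearing skip_xid and advancing the origin happen together.
            sub.skip_xid = None
            sub.origin_lsn = commit_lsn
            continue
        sub.applied.extend(changes)
        sub.origin_lsn = commit_lsn
    return True


txns = [(100, 10, ["a"]), (101, 20, ["conflicting change"]), (102, 30, ["c"])]
sub = Subscriber(skip_xid=101)

apply_stream(sub, txns, crash_after_skip=True)   # crash while skipping xid 101
apply_stream(sub, txns)                          # restart: xid 101 is sent again and skipped again
```

In this model the crash is harmless because the stale skip_xid and the stale origin LSN stay consistent with each other. The model says nothing about the case where the origin LSN advances without skip_xid being cleared, which is the situation the quoted proposal is trying to rule out.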