Thread: Track replica origin progress for Rollback Prepared
While reviewing logical decoding of 2PC xacts work, I noticed that we need $SUBJECT [1]. Commit 1eb6d6527a [2] allowed to track replica origin replay progress for 2PC but it was not complete. It misses to properly track the progress for rollback prepared especially it missed to update the code for recovery. Additionally, we need to allow tracking it on subscriber nodes where wal_level might not be logical. As noted in [1], without this the subscriber might again ask for rollback prepared lsn after restart. Attached patch addresses this problem. Thoughts? [1] - https://www.postgresql.org/message-id/CAA4eK1L3p4z%2B9wtK77MbdpkagR4GS2Y3r1Je7ZEvLQVF9GArfg%40mail.gmail.com [2] - commit 1eb6d6527aae264b3e0b9c95aa70bb7a594ad1cf Author: Simon Riggs <simon@2ndQuadrant.com> Date: Wed Mar 28 17:42:50 2018 +0100 Store 2PC GID in commit/abort WAL recs for logical decoding Store GID of 2PC in commit/abort WAL records when wal_level = logical. This allows logical decoding to send the SAME gid to subscribers across restarts of logical replication. Track relica origin replay progress for 2PC. -- With Regards, Amit Kapila.
Attachment
On Tue, Jan 05, 2021 at 09:35:21AM +0530, Amit Kapila wrote: > As noted in [1], without this the subscriber might again ask for > rollback prepared lsn after restart. > > Attached patch addresses this problem. Is it possible to add some tests in test_decoding? /* dump transaction origin information only for abort prepared */ if ((replorigin_session_origin != InvalidRepOriginId) && - TransactionIdIsValid(twophase_xid) && - XLogLogicalInfoActive()) + TransactionIdIsValid(twophase_xid)) It seems to me that you may want to document as a comment the reason why this gets sent even if (wal_level < logical). -- Michael
Attachment
On Tue, Jan 5, 2021 at 12:32 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Jan 05, 2021 at 09:35:21AM +0530, Amit Kapila wrote: > > As noted in [1], without this the subscriber might again ask for > > rollback prepared lsn after restart. > > > > Attached patch addresses this problem. > > Is it possible to add some tests in test_decoding? > There are already tests [1] in one of the upcoming patches for logical decoding of 2PC which covers this code using which I have found this problem. So, I thought those would be sufficient. I have not checked the feasibility of using test_decoding because I thought adding more using test_decoding will unnecessarily duplicate the tests. > /* dump transaction origin information only for abort prepared */ > if ((replorigin_session_origin != InvalidRepOriginId) && > - TransactionIdIsValid(twophase_xid) && > - XLogLogicalInfoActive()) > + TransactionIdIsValid(twophase_xid)) > It seems to me that you may want to document as a comment the reason > why this gets sent even if (wal_level < logical). > How about something like "Dump transaction origin information only for abort prepared. We need this during recovery to update the replication origin progress."? [1] - See the tests with comments "check that 2PC gets replicated to subscriber then ROLLBACK PREPARED" and "Check that ROLLBACK PREPARED is decoded properly on crash restart (publisher and subscriber crash)" in v33-0007-Support-2PC-txn-subscriber-tests at https://www.postgresql.org/message-id/CAA4eK1L3p4z%2B9wtK77MbdpkagR4GS2Y3r1Je7ZEvLQVF9GArfg%40mail.gmail.com -- With Regards, Amit Kapila.
On Tue, Jan 05, 2021 at 04:24:21PM +0530, Amit Kapila wrote: > There are already tests [1] in one of the upcoming patches for logical > decoding of 2PC which covers this code using which I have found this > problem. So, I thought those would be sufficient. I have not checked > the feasibility of using test_decoding because I thought adding more > using test_decoding will unnecessarily duplicate the tests. Hmm. This stuff does not check after replication origins even if it stresses 2PC, so that looks incomplete when seen from here. > How about something like "Dump transaction origin information only for > abort prepared. We need this during recovery to update the replication > origin progress."? That sounds fine. -- Michael
Attachment
On Wed, Jan 6, 2021 at 5:18 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Jan 05, 2021 at 04:24:21PM +0530, Amit Kapila wrote: > > There are already tests [1] in one of the upcoming patches for logical > > decoding of 2PC which covers this code using which I have found this > > problem. So, I thought those would be sufficient. I have not checked > > the feasibility of using test_decoding because I thought adding more > > using test_decoding will unnecessarily duplicate the tests. > > Hmm. This stuff does not check after replication origins even if it > stresses 2PC, so that looks incomplete when seen from here. > I think it does. Let me try to explain in a bit more detail. Internally, the apply worker uses replication origins to track the progress of apply, see the code near ApplyWorkerMain->replorigin_create. We will store the progress (WAL LSN) for each commit (prepared)/ rollback prepared with this origin. If the server crashes and restarts, we will use the origin's LSN as the start decoding point (the subscriber sends the last LSN to the publisher). The bug here is that after restart the origin was not advanced for rollback prepared which I have fixed with this patch. Now, let us see how the tests mentioned by me cover this code. In the first test (check that 2PC gets replicated to subscriber then ROLLBACK PREPARED), we do below on publisher and wait for it to be applied on the subscriber. BEGIN; INSERT INTO tab_full VALUES (12); PREPARE TRANSACTION 'test_prepared_tab_full'; ROLLBACK PREPARED 'test_prepared_tab_full'; Note that we would have WAL logged the LSN (replication_origin_lsn) corresponding to ROLLBACK PREPARED on the subscriber during apply. Now, in the second test(Check that ROLLBACK PREPARED is decoded properly on crash restart (publisher and subscriber crash)), we prepare a transaction and crash the server. After the restart, because we have not advanced the replication origin in the recovery of Rollback Prepared, the subscriber won't consider that transaction has been applied so it again requests that transaction. Actually speaking, we don't need the second test to reproduce this exact problem, if we would have restarted after the first test the problem would be reproduced but I was consistent getting the problem so with the current way tests are written. However, we can change it slightly to restart after the first test if we want. -- With Regards, Amit Kapila.
On Wed, Jan 6, 2021 at 11:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > Now, let us see how the tests mentioned by me cover this code. In the > first test (check that 2PC gets replicated to subscriber then ROLLBACK > PREPARED), we do below on publisher and wait for it to be applied on > the subscriber. > BEGIN; > INSERT INTO tab_full VALUES (12); > PREPARE TRANSACTION 'test_prepared_tab_full'; > ROLLBACK PREPARED 'test_prepared_tab_full'; Added some prints in this patch, just to verify that the code is hit and applied this patch along with the other patches of two-phase commit shared in [1]. Below are the added logs. +++ b/src/backend/access/transam/twophase.c @@ -1133,9 +1133,9 @@ EndPrepare(GlobalTransaction gxact) gxact->prepare_start_lsn = ProcLastRecPtr; /* - * Mark the prepared transaction as valid. As soon as xact.c marks - * MyProc as not running our XID (which it will do immediately after - * this function returns), others can commit/rollback the xact. + * Mark the prepared transaction as valid. As soon as xact.c marks MyProc + * as not running our XID (which it will do immediately after this + * function returns), others can commit/rollback the xact. * * NB: a side effect of this is to make a dummy ProcArray entry for the * prepared XID. This must happen before we clear the XID from MyProc / @@ -2277,6 +2277,16 @@ RecordTransactionAbortPrepared(TransactionId xid, const char *gid) { XLogRecPtr recptr; + bool replorigin; + + /* + * Are we using the replication origins feature? Or, in other words, are + * we replaying remote actions? + */ + replorigin = (replorigin_session_origin != InvalidRepOriginId && + replorigin_session_origin != DoNotReplicateId); + + elog(LOG,"change replorigin after a rollback prepared"); @@ -2299,6 +2309,14 @@ RecordTransactionAbortPrepared(TransactionId xid, MyXactFlags | XACT_FLAGS_ACQUIREDACCE xid, gid); + if (replorigin) + { + /* Move LSNs forward for this replication origin */ + replorigin_session_advance(replorigin_session_origin_lsn, + XactLastRecEnd); + elog(LOG,"advance replorigin for abort prepared"); + } + +++ b/src/backend/access/transam/xact.c @@ -5720,13 +5720,13 @@ XactLogAbortRecord(TimestampTz abort_time, /* dump transaction origin information only for abort prepared */ if ((replorigin_session_origin != InvalidRepOriginId) && - TransactionIdIsValid(twophase_xid) && - XLogLogicalInfoActive()) + TransactionIdIsValid(twophase_xid)) { xl_xinfo.xinfo |= XACT_XINFO_HAS_ORIGIN; xl_origin.origin_lsn = replorigin_session_origin_lsn; xl_origin.origin_timestamp = replorigin_session_origin_timestamp; + elog(LOG,"transaction origin information for abort prepared"); } And created a pub/sub for table tab: server1: postgres=# create table tab(i int); CREATE TABLE postgres=# CREATE PUBLICATION tap_pub FOR all tables; CREATE PUBLICATION server2: postgres=# create table tab(i int); CREATE TABLE postgres=# CREATE SUBSCRIPTION tap_sub CONNECTION 'host=localhost dbname=postgres' PUBLICATION tap_pub with(two_phase=on); NOTICE: created replication slot "tap_sub" on publisher CREATE SUBSCRIPTION server1: postgres=# begin; BEGIN postgres=*# insert into tab values (3); INSERT 0 1 postgres=*# prepare transaction 'test1'; PREPARE TRANSACTION postgres=# rollback prepared 'test1'; ROLLBACK PREPARED And confirmed in logs on server2: 2021-01-12 04:01:01.292 EST [19896] LOG: logical replication apply worker for subscription "tap_sub" has started 2021-01-12 04:01:01.302 EST [19898] LOG: logical replication table synchronization worker for subscription "tap_sub", table "tab" has started 2021-01-12 04:01:01.343 EST [19898] LOG: logical replication table synchronization worker for subscription "tap_sub", table "tab" has finished 2021-01-12 04:01:49.240 EST [19896] LOG: change replorigin after a rollback prepared 2021-01-12 04:01:49.240 EST [19896] LOG: transaction origin information for abort prepared 2021-01-12 04:01:49.240 EST [19896] LOG: advance replorigin for abort prepared And logs on server1: 2021-01-12 04:01:01.343 EST [19899] STATEMENT: START_REPLICATION SLOT "tap_sub_16389_sync_16384" LOGICAL 0/16054E0 (proto_version '2', publication_names '"tap_pub"') 2021-01-12 04:01:49.235 EST [4195] LOG: change replorigin after a rollback prepared Also restarted server2 and confirmed that the rollback prepared was not re-sent again. regards, Ajin Cherian Fujitsu Australia [1] - https://www.postgresql.org/message-id/CAA4eK1J7TiOQSiDq9MS_4k7zr4Jca7SnO8Vy3L4wtc7t1X9zsQ%40mail.gmail.com
On Tue, Jan 12, 2021 at 3:18 PM Ajin Cherian <itsajin@gmail.com> wrote: > > On Wed, Jan 6, 2021 at 11:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > Now, let us see how the tests mentioned by me cover this code. In the > > first test (check that 2PC gets replicated to subscriber then ROLLBACK > > PREPARED), we do below on publisher and wait for it to be applied on > > the subscriber. > > BEGIN; > > INSERT INTO tab_full VALUES (12); > > PREPARE TRANSACTION 'test_prepared_tab_full'; > > ROLLBACK PREPARED 'test_prepared_tab_full'; > .. > > Also restarted server2 and confirmed that the rollback prepared was > not re-sent again. > Thanks for doing these tests. I think you can put an elog in the below code change as well to show that the recovery code path is also hit: +xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid, + XLogRecPtr lsn, RepOriginId origin_id) { ... + if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN) + { + /* recover apply progress */ + replorigin_advance(origin_id, parsed->origin_lsn, lsn, + false /* backward */, false /* WAL */); + } -- With Regards, Amit Kapila.
On Wed, Jan 13, 2021 at 12:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > Thanks for doing these tests. I think you can put an elog in the below > code change as well to show that the recovery code path is also hit: > > +xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid, > + XLogRecPtr lsn, RepOriginId origin_id) > { > ... > + if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN) > + { > + /* recover apply progress */ > + replorigin_advance(origin_id, parsed->origin_lsn, lsn, > + false /* backward */, false /* WAL */); > + } I added the below log: @@ -2299,6 +2309,14 @@ RecordTransactionAbortPrepared(TransactionId xid, MyXactFlags | XACT_FLAGS_ACQUIREDACCE xid, gid); + if (replorigin) + { + /* Move LSNs forward for this replication origin */ + replorigin_session_advance(replorigin_session_origin_lsn, + XactLastRecEnd); + elog(LOG,"advance replorigin for abort prepared"); + } + /* Always flush, since we're about to remove the 2PC state file */ XLogFlush(recptr); Then on server1: postgres=# begin; BEGIN postgres=*# insert into tab values (5); INSERT 0 1 postgres=*# prepare transaction 'test'; PREPARE TRANSACTION postgres=# rollback prepared 'test'; ROLLBACK PREPARED And I immediately stopped server 2 to prevent checkpoint: pg_ctl stop -m immediate and restarted server 2: I got the following logs: 2021-01-13 02:14:40.297 EST [4851] LOG: listening on IPv6 address "::1", port 54321 2021-01-13 02:14:40.297 EST [4851] LOG: listening on IPv4 address "127.0.0.1", port 54321 2021-01-13 02:14:40.323 EST [4851] LOG: listening on Unix socket "/tmp/.s.PGSQL.54321" 2021-01-13 02:14:40.333 EST [4852] LOG: database system was interrupted; last known up at 2021-01-13 02:04:08 EST 2021-01-13 02:14:40.842 EST [4852] LOG: recovered replication state of node 1 to 0/1606C60 2021-01-13 02:14:40.843 EST [4852] LOG: database system was not properly shut down; automatic recovery in progress 2021-01-13 02:14:40.850 EST [4852] LOG: redo starts at 0/160BFE8 2021-01-13 02:14:40.850 EST [4852] LOG: recover apply progress in red abort <==== the log added for this 2021-01-13 02:14:40.850 EST [4852] CONTEXT: WAL redo at 0/160C0E8 for Transaction/ABORT_PREPARED: 533: 2021-01-13 02:14:20.911933-05 2021-01-13 02:14:40.853 EST [4852] LOG: invalid record length at 0/160C160: wanted 24, got 0 2021-01-13 02:14:40.853 EST [4852] LOG: redo done at 0/160C128 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s 2021-01-13 02:14:40.868 EST [4851] LOG: database system is ready to accept connections 2021-01-13 02:14:40.897 EST [4887] LOG: logical replication apply worker for subscription "tap_sub" has started I see the same logs are seen in the test cases that have been added as part of https://www.postgresql.org/message-id/CAA4eK1L3p4z+9wtK77MbdpkagR4GS2Y3r1Je7ZEvLQVF9GArfg@mail.gmail.com regards, Ajin Cherian Fujitsu Australia .
On Wed, Jan 13, 2021 at 6:28 PM Ajin Cherian <itsajin@gmail.com> wrote: > I added the below log: Small correction, I sent the wrong code change for logging, this was the correct one: @@ -5976,6 +5977,14 @@ xact_redo_abort(xl_xact_parsed_abort *parsed, TransactionId xid) StandbyReleaseLockTree(xid, parsed->nsubxacts, parsed->subxacts); } + if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN) + { + /* recover apply progress */ + replorigin_advance(origin_id, parsed->origin_lsn, lsn, + false /* backward */, false /* WAL */); + elog(LOG,"recover apply progress in red abort"); + } + regards, Ajin Cherian Fujitsu Australia