BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18267
Logged by:          song yutao
Email address:      sytoptimisprime@163.com
PostgreSQL version: 15.5
Operating system:   Linux
Description:

Hi hackers,

I found that when plenty of data is being inserted into a table and the
table is added to a publication (through ALTER PUBLICATION) at the same
time, it is likely that the incremental data will not be synchronized to
the subscriber. Here is my test method:

1. On publisher and subscriber, create a table for the test:

CREATE TABLE tab_1 (a int);

2. Set up logical replication.

On the publisher:

SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
CREATE PUBLICATION tap_pub;

On the subscriber:

CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr'
    PUBLICATION tap_pub
    WITH (enabled = true, create_slot = false, slot_name = 'slot1');

3. Perform inserts:

for (my $i = 1; $i <= 1000; $i++) {
    $node_publisher->safe_psql('postgres',
        "INSERT INTO tab_1 SELECT generate_series(1, 1000)");
}

Each transaction inserts 1000 rows, and there are 1000 transactions in
total.

4. While step 3 is running, add table tab_1 to the publication:

ALTER PUBLICATION tap_pub ADD TABLE tab_1;
ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION;

The root cause of the problem is as follows: pgoutput relies on the
invalidation mechanism to validate publications. When the walsender decodes
an ALTER PUBLICATION transaction, catalog caches are invalidated at once.
Furthermore, since pg_publication_rel is modified, snapshot changes are
added to all transactions currently being decoded. For those other
transactions the catalog caches have already been invalidated, but it is
likely that the queued snapshot changes have not yet been decoded. In the
pgoutput implementation, these transactions query the system catalog
pg_publication_rel to determine whether to publish the changes they
contain. In this case the catalog tuples are not found, because the
snapshot has not been updated yet. As a result, the changes in those
transactions are considered not to be published, and the subsequent data
cannot be synchronized.

I think it is necessary to add invalidations to the other transactions
after adding a snapshot change to them. Therefore, I have submitted a
patch for this bug.
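[Editor's note] The interleaving described in the report can be illustrated
with a small toy model (plain Python; every name in it is invented for
illustration, none of it is PostgreSQL code): the relation cache is
invalidated immediately, but the entry is rebuilt from a snapshot that does
not yet contain the ALTER PUBLICATION row, so the relation still looks
unpublished.

```python
# Toy model of the reported race (illustrative only, NOT PostgreSQL code;
# names like relation_sync_cache merely mirror pgoutput concepts).
# A decoded transaction's change queue receives the new catalog snapshot
# at its tail, while the catalog-cache invalidation takes effect at once.

catalog_v1 = {"tab_1": {"published": False}}   # before ALTER PUBLICATION
catalog_v2 = {"tab_1": {"published": True}}    # after ALTER PUBLICATION

current_snapshot = catalog_v1   # snapshot the decoder currently uses
relation_sync_cache = {}        # toy stand-in for pgoutput's relation cache

def get_rel_entry(rel):
    """Build (or reuse) the cache entry from the *current* snapshot."""
    if rel not in relation_sync_cache:
        relation_sync_cache[rel] = dict(current_snapshot[rel])
    return relation_sync_cache[rel]

# 1. Before the ALTER commits, an INSERT is decoded: not published yet.
assert get_rel_entry("tab_1")["published"] is False

# 2. The ALTER PUBLICATION commit is decoded: caches are invalidated
#    immediately...
relation_sync_cache.pop("tab_1", None)
# ...but the distributed snapshot change still sits at the tail of the
# concurrent transaction's queue and has NOT been applied yet.

# 3. The next INSERT of the concurrent transaction rebuilds the entry from
#    the stale snapshot: the table still looks unpublished, so its changes
#    are skipped -- the reported data loss.
stale = get_rel_entry("tab_1")["published"]

# 4. Only when the queued snapshot change is finally applied does a rebuild
#    see the new catalog contents.
current_snapshot = catalog_v2
relation_sync_cache.pop("tab_1", None)
fresh = get_rel_entry("tab_1")["published"]

print(stale, fresh)  # the rebuild in step 3 saw the old catalog
```

The model only captures the ordering problem the report describes: an
invalidation that fires before the matching snapshot change is applied
causes a rebuild under the old catalog contents.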
Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
From
Amit Kapila
Date:
On Wed, Jan 3, 2024 at 9:51 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> Hi hackers, I found when insert plenty of data into a table, and add the
> table to publication (through Alter Publication) meanwhile, it's likely that
> the incremental data cannot be synchronized to the subscriber. Here is my
> test method:
>
> [...]
>
> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications. When
> walsender decoding an Alter Publication transaction, catalog caches are
> invalidated at once. Furthermore, since pg_publication_rel is modified,
> snapshot changes are added to all transactions currently being decoded. For
> other transactions, catalog caches have been invalidated. However, it is
> likely that the snapshot changes have not yet been decoded. In pgoutput
> implementation, these transactions query the system table pg_publication_rel
> to determine whether to publish changes made in transactions. In this case,
> catalog tuples are not found because snapshot has not been updated. As a
> result, changes in transactions are considered not to be published, and
> subsequent data cannot be synchronized.
>

As per my understanding, we distribute the snapshot to the other in-progress
transactions at the commit time (LSN) of the catalog-modifying transaction,
which in your case means at the commit of "ALTER PUBLICATION tap_pub ADD
TABLE tab_1". So any changes after that point should see the changes in
pg_publication_rel.

> I think it's necessary to add invalidations to other transactions after
> adding a snapshot change to them.
> Therefore, I submitted a patch for this bug.
>

Sorry, I didn't understand your proposal, and I don't see any patch
attached, as you claim in the last sentence.

--
With Regards,
Amit Kapila.
Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
From
Dilip Kumar
Date:
On Wed, Jan 3, 2024 at 9:51 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> Hi hackers, I found when insert plenty of data into a table, and add the
> table to publication (through Alter Publication) meanwhile, it's likely that
> the incremental data cannot be synchronized to the subscriber. Here is my
> test method:
>
> [...]
>
> 4. When performing step 3, add table tab_1 to publication.
> ALTER PUBLICATION tap_pub ADD TABLE tab_1
> ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION

I could not reproduce this issue. Can you tell me exactly which data were
missing for you? When you add a table to the publication and refresh, then
as soon as the subscriber identifies that the table is part of the
publication and receives the first commit containing changes for that
table, it will notice that the table's state is not yet SYNC READY. It will
then start a tablesync worker, and via that it should be able to get all
the previous data for that table.

> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications.
> [...]
>
> I think it's necessary to add invalidations to other transactions after
> adding a snapshot change to them.
> Therefore, I submitted a patch for this bug.

I think you missed attaching the patch, as Amit also pointed out.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
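[Editor's note] The tablesync handoff Dilip describes can be sketched as a
toy state machine. The one-letter states follow the srsubstate codes of
pg_subscription_rel ('i' init, 'd' data being copied, 'f' finished copy,
's' synchronized, 'r' ready); the driver function itself is only an
illustration of the flow, not subscriber code.

```python
# Toy sketch of the tablesync handoff: when the apply worker first sees a
# newly published table whose state is not yet READY, a tablesync worker
# copies the existing data before streamed changes are applied for it.

# srsubstate codes used by pg_subscription_rel:
INIT, DATASYNC, FINISHEDCOPY, SYNCDONE, READY = "i", "d", "f", "s", "r"

def run_tablesync(state):
    """Advance a table's sync state to READY, recording each transition."""
    order = [INIT, DATASYNC, FINISHEDCOPY, SYNCDONE, READY]
    path = [state]
    while state != READY:
        state = order[order.index(state) + 1]
        path.append(state)
    return path

# After ALTER SUBSCRIPTION ... REFRESH PUBLICATION, the new table starts
# in INIT; the apply worker notices it is not READY, and a tablesync
# worker walks it through the copy phases.
path = run_tablesync(INIT)
print("".join(path))  # idfsr
```

This is why, in the absence of the decoding bug, the subscriber would
eventually catch up: the copy phase fetches the rows that existed before
the table became part of the publication.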
RE: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Song,

> Hi hackers, I found when insert plenty of data into a table, and add the
> table to publication (through Alter Publication) meanwhile, it's likely that
> the incremental data cannot be synchronized to the subscriber. Here is my
> test method:

Good catch.

> 1. On publisher and subscriber, create table for test:
> [...]
> 4. When performing step 3, add table tab_1 to publication.
> ALTER PUBLICATION tap_pub ADD TABLE tab_1
> ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION

I could reproduce the failure. PSA the script.

In the script, ALTER PUBLICATION was executed while the initial data sync
was in progress. (The workload is almost the same as what you posted, but
the number of rows is reduced.)

In total, 40000 tuples were inserted on the publisher. However, even after
some time, only 25000 tuples were replicated:

```
publisher=# SELECT count(*) FROM tab_1;
 count
-------
 40000
(1 row)

subscriber=# SELECT count(*) FROM tab_1;
 count
-------
 25000
(1 row)
```

Is it the same failure you saw?

> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications.
> [...]
>
> I think it's necessary to add invalidations to other transactions after
> adding a snapshot change to them.
> Therefore, I submitted a patch for this bug.

I cannot see your attachment, but I found that the patch set proposed in
[1] can solve the issue. After applying 0001 + 0002 + 0003 (which opens
relations with ShareRowExclusiveLock in OpenTableList), the data gap was
removed. Thoughts?

[1]: https://www.postgresql.org/message-id/de52b282-1166-1180-45a2-8d8917ca74c6%40enterprisedb.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachment
Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
From
Dilip Kumar
Date:
On Fri, Jan 5, 2024 at 9:25 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> I could reproduce the failure. PSA the script.
>
> In the script, ALTER PUBLICATION was executed while doing the initial data sync.
> [...]
>
> Is it same failure you saw?

With your attached script I was able to see this gap. I didn't dig deeper,
but from an initial investigation I could see that even after ALTER
PUBLICATION, pgoutput_change continues to see
'relentry->pubactions.pubinsert' as false, even after re-fetching the
relation entry after the invalidation. That shows the invalidation
framework might be working fine, but we are using the older snapshot to
fetch the entry. I did not debug further why it is not getting the updated
snapshot that can see the change in the publication, because I assume
Yutao Song has already analyzed that, as per his first email, so I would
wait for his patch.

> I cannot see your attaching, but I found that proposed patch in [1] can solve
> the issue. After applying 0001 + 0002 + 0003 (open relations as ShareRowExclusiveLock,
> in OpenTableList), the data gap was removed. Thought?

Not sure why 'open relations as ShareRowExclusiveLock' would help in this
case? Have you investigated that?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
RE: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Dilip,

> > I cannot see your attaching, but I found that proposed patch in [1] can solve
> > the issue. After applying 0001 + 0002 + 0003 (open relations as ShareRowExclusiveLock,
> > in OpenTableList), the data gap was removed. Thought?
>
> Not sure why 'open relations as ShareRowExclusiveLock' would help in
> this case? have you investigated that?

Sorry for the confusion; I did not analyze it because I thought Song would
tell us the reason along with his patch. I have just reported what I
tested. I will spend time on it later, based on the requirement.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
RE: Re:Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Song,

> I see that a snapshot change is added to the tail of the txn->changes
> list in SnapBuildDistributeNewCatalogSnapshot. When the DML transaction
> is committed, the corresponding entry in RelationSyncCache has already
> been invalidated. However, the snapshot hasn't been changed at this time
> (the newly added snapshot change is still in the list and hasn't been
> invoked), so, as you say, "pgoutput_change continues to see
> 'relentry->pubactions.pubinsert' as false, even after re-fetching the
> relation entry after the invalidation."
>
> Therefore, there is one solution, as Andres Freund [1] states: if a
> transaction is found to have modified the catalog during logical
> decoding, invalidations are forced to be distributed to all concurrent
> in-progress transactions, just as my patch did.
>
> [1] https://www.postgresql.org/message-id/20231118025445.crhaeeuvoe2g5dv6%40awork3.anarazel.de

Thanks for making a patch, but your patch cannot be built:

```
reorderbuffer.c: In function ‘ReorderBufferDistributeInvalidation’:
../../../../src/include/lib/ilist.h:594:2: error: expected identifier before ‘(’ token
  (AssertVariableIsOfTypeMacro(ptr, dlist_node *), \
  ^
reorderbuffer.c:2648:8: note: in expansion of macro ‘dlist_container’
    txn->dlist_container(ReorderBufferTXN, node, txn_i.cur);
        ^~~~~~~~~~~~~~~
```

Based on your arguments, I revised your patch. Is it the same as your
expectation? It also fixes some coding conventions.

> Although my approach seems to be effective,

I ran my reproducer and confirmed that the issue was fixed.

> it may bring additional overhead to logical decoding under special
> circumstances. I think upgrading the lock mentioned in [1] is also an
> effective solution, and it seems to have less impact. But I don't quite
> understand why it may cause the deadlocks mentioned in [1].

IIUC, no one has pointed out a concrete case of deadlock, but no one could
say that it is safe, either. We must understand that thread more and
decide how to fix this: modify the lock level, or send invalidation
messages to all the reorderbuffer transactions. What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
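[Editor's note] The proposal under discussion (queueing the invalidation
into every concurrent in-progress transaction together with the
distributed snapshot change, so both take effect at the same point while
the change queue is drained) can be sketched as a toy model. This is
illustrative Python only, not the actual reorderbuffer.c patch; all names
are invented.

```python
# Toy sketch of the proposed fix: instead of dropping cache entries
# immediately, queue an "invalidate" action into the in-progress
# transaction's change list right after the distributed snapshot change,
# so the cache is rebuilt only once the new snapshot is in effect.

from collections import deque

catalog = {"snapshot": {"tab_1": {"published": False}}}
cache = {"tab_1": {"published": False}}  # already-built cache entry

# An in-progress transaction with two pending data changes.
txn = deque(["insert tab_1"] * 2)

# ALTER PUBLICATION commits: distribute the new snapshot AND the
# invalidation into the in-progress transaction, in that order.
new_snapshot = {"tab_1": {"published": True}}
txn.append(("new_snapshot", new_snapshot))
txn.append("invalidate tab_1")
txn.append("insert tab_1")  # a later change in the same transaction

published = []  # rows actually sent downstream in this toy model

while txn:
    change = txn.popleft()
    if isinstance(change, tuple) and change[0] == "new_snapshot":
        catalog["snapshot"] = change[1]   # snapshot takes effect here
    elif change == "invalidate tab_1":
        cache.pop("tab_1", None)          # cache dropped only at this point
    elif change == "insert tab_1":
        entry = cache.setdefault(
            "tab_1", dict(catalog["snapshot"]["tab_1"]))
        if entry["published"]:
            published.append(change)

# The two inserts queued before the ALTER are skipped, but the insert
# after it is published: the rebuild happened under the new snapshot.
print(len(published))  # 1
```

Because the invalidation travels through the same queue as the snapshot
change, the stale-rebuild window from the bug report cannot occur in this
model; whether the same ordering also resolves the lock-level question
raised above is exactly what remains to be decided in the thread.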