Thread: pg_createsubscriber: drop pre-existing subscriptions from the converted node
pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Hackers, This is a follow-up thread for pg_createsubscriber [1]. I started a new thread since there is no activity around here. ## Problem Assuming that there is a cascading replication like below: node A --(logical replication)--> node B --(streaming replication)--> node C In this case, subscriptions exist even on node C, but it does not try to connect to node A because the logical replication launcher/worker won't be launched. After the conversion, node C becomes a subscriber for node B, and the subscription toward node A remains. Therefore, another worker that tries to connect to node A will be launched, raising an ERROR [2]. This failure may occur even during the conversion. ## Solution The easiest solution is to drop pre-existing subscriptions from the converted node. To avoid establishing connections during the conversion, slot_name is set to NONE on the primary first, then drop on the standby. The setting will be restored on the primary node. Attached patch implements the idea. Test script is also included, but not sure it should be on the HEAD BTW, I found that LogicalRepInfo.oid won't be used. If needed, I can create another patch to remove the attribute. How do you think? [1]: https://www.postgresql.org/message-id/CAA4eK1J22UEfrqx222h5j9DQ7nxGrTbAa_BC%2B%3DmQXdXs-RCsew%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CANhcyEWvimA1-f6hSrA%3D9qkfR5SonFb56b36M%2B%2BvT%3DLiFj%3D76g%40mail.gmail.com Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/
Attachment
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
vignesh C
Date:
On Fri, 21 Jun 2024 at 16:51, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Hackers, > > This is a follow-up thread for pg_createsubscriber [1]. I started a new thread > since there is no activity around here. > > ## Problem > > Assuming that there is a cascading replication like below: > > node A --(logical replication)--> node B --(streaming replication)--> node C > > In this case, subscriptions exist even on node C, but it does not try to connect > to node A because the logical replication launcher/worker won't be launched. > After the conversion, node C becomes a subscriber for node B, and the subscription > toward node A remains. Therefore, another worker that tries to connect to node A > will be launched, raising an ERROR [2]. This failure may occur even during the > conversion. > > ## Solution > > The easiest solution is to drop pre-existing subscriptions from the converted node. > To avoid establishing connections during the conversion, slot_name is set to NONE > on the primary first, then drop on the standby. The setting will be restored on the > primary node. > Attached patch implements the idea. Test script is also included, but not sure it should > be on the HEAD Few comments: 1) Should we do this only for the enabled subscription, otherwise the disabled subscriptions will be enabled after running pg_createsubscriber: +obtain_and_disable_pre_existing_subscriptions(struct LogicalRepInfo *dbinfo) +{ + PQExpBuffer query = createPQExpBuffer(); + + for (int i = 0; i < num_dbs; i++) + { + PGconn *conn; + PGresult *res; + int ntups; + + /* Connect to publisher */ + conn = connect_database(dbinfo[i].pubconninfo, true); + + appendPQExpBuffer(query, + "SELECT s.subname, s.subslotname FROM pg_catalog.pg_subscription s " + "INNER JOIN pg_catalog.pg_database d ON (s.subdbid = d.oid) " + "WHERE d.datname = '%s'", + dbinfo[i].dbname); + 2) disconnect_database not called here, should the connection be disconnected: +drop_pre_existing_subscriptions(struct LogicalRepInfo *dbinfo) +{ + PQExpBuffer query = createPQExpBuffer(); + + for (int i = 0; i < num_dbs; i++) + { + PGconn *conn; + struct LogicalRepInfo info = dbinfo[i]; + + /* Connect to subscriber */ + conn = connect_database(info.subconninfo, false); + + for (int j = 0; j < info.num_subscriptions; j++) + { + appendPQExpBuffer(query, + "DROP SUBSCRIPTION %s;", info.pre_subnames[j]); + PQexec(conn, query->data); + resetPQExpBuffer(query); + } + } 3) Similarly here too: +static void +enable_subscirptions_on_publisher(struct LogicalRepInfo *dbinfo) +{ + PQExpBuffer query = createPQExpBuffer(); + + for (int i = 0; i < num_dbs; i++) + { + PGconn *conn; + struct LogicalRepInfo info = dbinfo[i]; + + /* Connect to publisher */ + conn = connect_database(info.pubconninfo, false); 4) them should be then here: + /* ...and them enable the subscription */ + appendPQExpBuffer(query, + "ALTER SUBSCRIPTION %s ENABLE", + info.pre_subnames[j]); + PQclear(PQexec(conn, query->data)); + resetPQExpBuffer(query); > BTW, I found that LogicalRepInfo.oid won't be used. If needed, I can create > another patch to remove the attribute. I was able to compile without this, I think this can be removed. Regards, Vignesh
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
Amit Kapila
Date:
On Fri, Jun 21, 2024 at 4:51 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > This is a follow-up thread for pg_createsubscriber [1]. I started a new thread > since there is no activity around here. > > ## Problem > > Assuming that there is a cascading replication like below: > > node A --(logical replication)--> node B --(streaming replication)--> node C > > In this case, subscriptions exist even on node C, but it does not try to connect > to node A because the logical replication launcher/worker won't be launched. > After the conversion, node C becomes a subscriber for node B, and the subscription > toward node A remains. Therefore, another worker that tries to connect to node A > will be launched, raising an ERROR [2]. This failure may occur even during the > conversion. > > ## Solution > > The easiest solution is to drop pre-existing subscriptions from the converted node. > To avoid establishing connections during the conversion, slot_name is set to NONE > on the primary first, then drop on the standby. The setting will be restored on the > primary node. > It seems disabling subscriptions on the primary can make the primary stop functioning for some duration of time. I feel we need some solution where after converting to subscriber, we disable and drop pre-existing subscriptions. One idea could be that we use the list of new subscriptions created by the tool such that any subscription not existing in that list will be dropped. Shouldn't this be an open item for PG17? -- With Regards, Amit Kapila.
RE: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit, Vingesh, Thanks for giving comments! > It seems disabling subscriptions on the primary can make the primary > stop functioning for some duration of time. I feel we need some > solution where after converting to subscriber, we disable and drop > pre-existing subscriptions. One idea could be that we use the list of > new subscriptions created by the tool such that any subscription not > existing in that list will be dropped. Previously I avoided coding like yours, because there is a room that converted node can connect to another publisher. But per off-list discussion, we can skip it by setting max_logical_replication_workers = 0. I refactored with the approach. Note that the GUC is checked at verification phase, so an attribute is added to start_standby_server() to select the workload. Most of comments by Vignesh were invalidated due to the code change, but I hoped I checked your comments were not reproduced. Also, 0001 was created to remove an unused attribute. > Shouldn't this be an open item for PG17? Added this thread to wikipage. Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/
Attachment
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
Amit Kapila
Date:
On Thu, Jun 27, 2024 at 11:47 AM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > > It seems disabling subscriptions on the primary can make the primary > > stop functioning for some duration of time. I feel we need some > > solution where after converting to subscriber, we disable and drop > > pre-existing subscriptions. One idea could be that we use the list of > > new subscriptions created by the tool such that any subscription not > > existing in that list will be dropped. > > Previously I avoided coding like yours, because there is a room that converted > node can connect to another publisher. But per off-list discussion, we can skip > it by setting max_logical_replication_workers = 0. I refactored with the approach. > Note that the GUC is checked at verification phase, so an attribute is added to > start_standby_server() to select the workload. > Thanks, this is a better approach. I have changed a few comments and made some other cosmetic changes. See attached. Euler, Peter E., and others, do you have any comments/suggestions? BTW, why have you created a separate test file for this test? I think we should add a new test to one of the existing tests in 040_pg_createsubscriber. You can create a dummy subscription on node_p and do a test similar to what we are doing in "# Create failover slot to test its removal". -- With Regards, Amit Kapila.
Attachment
RE: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Amit, Thanks for giving comments! PSA new version. > Thanks, this is a better approach. I have changed a few comments and > made some other cosmetic changes. See attached. I checked your attached and LGTM. Based on that, I added some changes like below: - Made dbname be escaped while listing up pre-existing subscriptions Previous version could not pass tests by recent commits. - Skipped dropping subscriptions in dry_run mode I found the issue while poring the test to 040_pg_createsubscriber.pl. - Added info-level output to follow other drop_XXX functions > BTW, why have you created a separate test file for this test? I think > we should add a new test to one of the existing tests in > 040_pg_createsubscriber. I was separated a test file for just confirmation purpose, I've planned to merge. New patch set did that. > You can create a dummy subscription on node_p > and do a test similar to what we are doing in "# Create failover slot > to test its removal". Your approach looks better than mine. I followed the approach. Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/
Attachment
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
Amit Kapila
Date:
On Mon, Jul 1, 2024 at 11:44 AM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > > You can create a dummy subscription on node_p > > and do a test similar to what we are doing in "# Create failover slot > > to test its removal". > > Your approach looks better than mine. I followed the approach. > LGTM. -- With Regards, Amit Kapila.
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
Shlok Kyal
Date:
On Mon, 1 Jul 2024 at 11:44, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Amit, > > Thanks for giving comments! PSA new version. > > > Thanks, this is a better approach. I have changed a few comments and > > made some other cosmetic changes. See attached. > > I checked your attached and LGTM. Based on that, I added some changes > like below: > > - Made dbname be escaped while listing up pre-existing subscriptions > Previous version could not pass tests by recent commits. > - Skipped dropping subscriptions in dry_run mode > I found the issue while poring the test to 040_pg_createsubscriber.pl. > - Added info-level output to follow other drop_XXX functions > > > BTW, why have you created a separate test file for this test? I think > > we should add a new test to one of the existing tests in > > 040_pg_createsubscriber. > > I was separated a test file for just confirmation purpose, I've planned to merge. > New patch set did that. > > > You can create a dummy subscription on node_p > > and do a test similar to what we are doing in "# Create failover slot > > to test its removal". > > Your approach looks better than mine. I followed the approach. Hi Kuroda-san, I tested the patches on linux and windows and I confirm that it successfully fixes the issue [1]. [1]: https://www.postgresql.org/message-id/CANhcyEWvimA1-f6hSrA%3D9qkfR5SonFb56b36M%2B%2BvT%3DLiFj%3D76g%40mail.gmail.com Thanks and Regards, Shlok Kyal
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
From
Amit Kapila
Date:
On Tue, Jul 2, 2024 at 9:57 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Mon, 1 Jul 2024 at 11:44, Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Amit, > > > > Thanks for giving comments! PSA new version. > > > > > Thanks, this is a better approach. I have changed a few comments and > > > made some other cosmetic changes. See attached. > > > > I checked your attached and LGTM. Based on that, I added some changes > > like below: > > > > - Made dbname be escaped while listing up pre-existing subscriptions > > Previous version could not pass tests by recent commits. > > - Skipped dropping subscriptions in dry_run mode > > I found the issue while poring the test to 040_pg_createsubscriber.pl. > > - Added info-level output to follow other drop_XXX functions > > > > > BTW, why have you created a separate test file for this test? I think > > > we should add a new test to one of the existing tests in > > > 040_pg_createsubscriber. > > > > I was separated a test file for just confirmation purpose, I've planned to merge. > > New patch set did that. > > > > > You can create a dummy subscription on node_p > > > and do a test similar to what we are doing in "# Create failover slot > > > to test its removal". > > > > Your approach looks better than mine. I followed the approach. > > Hi Kuroda-san, > > I tested the patches on linux and windows and I confirm that it > successfully fixes the issue [1]. > Thanks for the verification. I have pushed the patch. -- With Regards, Amit Kapila.