RE: logical replication restrictions - Mailing list pgsql-hackers

From kuroda.hayato@fujitsu.com
Subject RE: logical replication restrictions
Msg-id TYAPR01MB5866F9716A18DA0C68A2CDB3F5469@TYAPR01MB5866.jpnprd01.prod.outlook.com
In response to RE: logical replication restrictions  ("kuroda.hayato@fujitsu.com" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
Hi,

Sorry for the noise, but I found another bug.
When 032_apply_delay.pl is modified as follows,
the test always fails, even with my patch applied.

```
# Disable subscription. worker should die immediately.
-$node_subscriber->safe_psql('postgres',
-       "ALTER SUBSCRIPTION tap_sub DISABLE"
+$node_subscriber->safe_psql('postgres', q{
+BEGIN;
+ALTER SUBSCRIPTION tap_sub DISABLE;
+SELECT pg_sleep(1);
+COMMIT;
+}
 );
```

The point of failure is the same as the one I reported previously.

```
...
2022-09-14 12:00:48.891 UTC [11330] 032_apply_delay.pl LOG:  statement: ALTER SUBSCRIPTION tap_sub SET (min_apply_delay=86460000)
2022-09-14 12:00:48.910 UTC [11226] DEBUG:  sending feedback (force 0) to recv 0/1690220, write 0/1690220, flush 0/1690220
2022-09-14 12:00:48.937 UTC [11208] DEBUG:  server process (PID 11328) exited with exit code 0
2022-09-14 12:00:48.950 UTC [11226] DEBUG:  logical replication apply delay: 86459996 ms
2022-09-14 12:00:48.950 UTC [11226] CONTEXT:  processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 12:00:48.979 UTC [11208] DEBUG:  forked new backend, pid=11334 socket=6
2022-09-14 12:00:49.007 UTC [11334] 032_apply_delay.pl LOG:  statement: BEGIN;
2022-09-14 12:00:49.008 UTC [11334] 032_apply_delay.pl LOG:  statement: ALTER SUBSCRIPTION tap_sub DISABLE;
2022-09-14 12:00:49.009 UTC [11334] 032_apply_delay.pl LOG:  statement: SELECT pg_sleep(1);
2022-09-14 12:00:49.009 UTC [11226] DEBUG:  check status of MySubscription
2022-09-14 12:00:49.009 UTC [11226] CONTEXT:  processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 12:00:49.009 UTC [11226] DEBUG:  logical replication apply delay: 86459937 ms
2022-09-14 12:00:49.009 UTC [11226] CONTEXT:  processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
...
```

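I think the cause may be that the woken worker reads catalogs that have not been modified yet:
the ALTER SUBSCRIPTION ... DISABLE has not committed when the worker checks its subscription.
In AlterSubscription(), the backend wakes the apply worker immediately, but the wakeup should
happen at the end of the transaction, as with ApplyLauncherWakeupAtCommit() and AtEOXact_ApplyLauncher().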

```
+                               /*
+                                * If this subscription has been disabled and it has an apply
+                                * delay set, wake up the logical replication worker to finish
+                                * it as soon as possible.
+                                */
+                               if (!opts.enabled && sub->applydelay > 0)
+                                       logicalrep_worker_wakeup(sub->oid, InvalidOid);
+
```
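
To illustrate the ordering problem, here is a minimal standalone sketch. It is a toy model, not PostgreSQL code: every type and function name in it (ToySubscription, toy_worker_wakeup, toy_at_eoxact, and so on) is hypothetical, and the "catalog" is just a struct with a committed value and a pending value. It only assumes what is described above: the worker sees committed data only, and a wakeup sent before COMMIT therefore observes the old state.

```c
#include <stdbool.h>

/* Toy stand-in for one subscription's catalog row (hypothetical). */
typedef struct
{
    bool committed_enabled; /* what other processes, e.g. the worker, see */
    bool pending_enabled;   /* uncommitted change made by this backend */
    bool has_pending;       /* is there an uncommitted change? */
} ToySubscription;

/* Hypothetical per-backend flag: "wake the worker when we commit". */
static bool wakeup_at_commit_requested = false;

/* What the worker observed the last time it was woken. */
static bool worker_last_seen_enabled = true;

/* The woken worker re-reads the catalog; it sees only committed data. */
static void
toy_worker_wakeup(const ToySubscription *sub)
{
    worker_last_seen_enabled = sub->committed_enabled;
}

/* Models ALTER SUBSCRIPTION ... DISABLE inside an open transaction. */
static void
toy_alter_subscription_disable(ToySubscription *sub, bool wake_immediately)
{
    sub->pending_enabled = false;
    sub->has_pending = true;

    if (wake_immediately)
        toy_worker_wakeup(sub);            /* too early: change is not visible */
    else
        wakeup_at_commit_requested = true; /* defer to end of transaction */
}

/* Models an end-of-transaction hook: publish changes, then wake the worker. */
static void
toy_at_eoxact(ToySubscription *sub, bool is_commit)
{
    if (is_commit && sub->has_pending)
        sub->committed_enabled = sub->pending_enabled;
    sub->has_pending = false;

    if (is_commit && wakeup_at_commit_requested)
        toy_worker_wakeup(sub);
    wakeup_at_commit_requested = false;
}

/* Runs BEGIN; ALTER ... DISABLE; COMMIT and reports what the worker saw. */
static bool
toy_worker_sees_enabled_after_commit(bool wake_immediately)
{
    ToySubscription sub = {true, true, false};

    wakeup_at_commit_requested = false;
    worker_last_seen_enabled = true;

    toy_alter_subscription_disable(&sub, wake_immediately);
    toy_at_eoxact(&sub, true);
    return worker_last_seen_enabled;
}
```

In this model, toy_worker_sees_enabled_after_commit(true) returns true (the worker was woken before COMMIT and still believes the subscription is enabled), while toy_worker_sees_enabled_after_commit(false) returns false. On abort the deferred request is simply dropped, so the worker is not woken for a change that never became visible.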

What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



