Re: Skipping logical replication transactions on subscriber side - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Skipping logical replication transactions on subscriber side |
Date | |
Msg-id | CAA4eK1KRMY8CSyjURw6b-tnN4MMd5emmYMUdKY1bQ3msjwELyg@mail.gmail.com Whole thread Raw |
In response to | Re: Skipping logical replication transactions on subscriber side (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Skipping logical replication transactions on subscriber side
|
List | pgsql-hackers |
On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > I've attached updated patches. > Few comments: ============== 1. Is the patch cleaning tablesync error entries except via vacuum? If not, can't we send a message to remove tablesync errors once tablesync is successful (say when we reset skip_xid or when tablesync is finished) or when we drop subscription? I think the same applies to apply worker. I think we may want to track it in some way whether an error has occurred before sending the message but relying completely on a vacuum might be the recipe of bloat. I think in the case of a drop subscription we can simply send the message as that is not a frequent operation. I might be missing something here because in the tests after drop subscription you are expecting the entries from the view to get cleared 2. + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>count</structfield> <type>uint8</type> + </para> + <para> + Number of consecutive times the error occurred + </para></entry> Shall we name this field as error_count as there will be other fields in this view in the future that may not be directly related to the error? 3. + +CREATE VIEW pg_stat_subscription_workers AS + SELECT + e.subid, + s.subname, + e.subrelid, + e.relid, + e.command, + e.xid, + e.count, + e.error_message, + e.last_error_time, + e.stats_reset + FROM (SELECT + oid as subid, + NULL as relid + FROM pg_subscription + UNION ALL + SELECT + srsubid as subid, + srrelid as relid + FROM pg_subscription_rel + WHERE srsubstate <> 'r') sr, + LATERAL pg_stat_get_subscription_worker(sr.subid, sr.relid) e It might be better to use 'w' as an alias instead of 'e' as the information is now not restricted to only errors. 4. +# Test if the error reported on pg_subscription_workers view is expected. The view name is wrong in the above comment 5. +# Check if the view doesn't show any entries after dropping the subscriptions. +$node_subscriber->safe_psql( + 'postgres', + q[ +DROP SUBSCRIPTION tap_sub; +DROP SUBSCRIPTION tap_sub_streaming; +]); +$result = $node_subscriber->safe_psql('postgres', + "SELECT count(1) FROM pg_stat_subscription_workers"); +is($result, q(0), 'no error after dropping subscription'); Don't we need to wait after dropping the subscription and before checking the view as there might be a slight delay in messages to get cleared? 7. +# Create subscriptions. The table sync for test_tab2 on tap_sub will enter to +# infinite error due to violating the unique constraint. +my $appname = 'tap_sub'; +$node_subscriber->safe_psql( + 'postgres', + "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);"); +my $appname_streaming = 'tap_sub_streaming'; +$node_subscriber->safe_psql( + 'postgres', + "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);"); + +$node_publisher->wait_for_catchup($appname); +$node_publisher->wait_for_catchup($appname_streaming); How can we ensure that subscriber would have caught up when one of the tablesync workers is constantly in the error loop? Isn't it possible that the subscriber didn't send the latest lsn feedback till the table sync worker is finished? 8. +# Create subscriptions. The table sync for test_tab2 on tap_sub will enter to +# infinite error due to violating the unique constraint. The second sentence of the comment can be written as: "The table sync for test_tab2 on tap_sub will enter into infinite error loop due to violating the unique constraint." -- With Regards, Amit Kapila.
pgsql-hackers by date: