Re: Disabled logical replication origin session causes primary key errors - Mailing list pgsql-bugs
From | Masahiko Sawada |
---|---|
Subject | Re: Disabled logical replication origin session causes primary key errors |
Date | |
Msg-id | CAD21AoDhBDJcWsmsS4Du8E6Vzk6krxu65sCbpGggBxLHj3-8eA@mail.gmail.com Whole thread Raw |
In response to | RE: Disabled logical replication origin session causes primary key errors ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Responses |
RE: Disabled logical replication origin session causes primary key errors
|
List | pgsql-bugs |
On Tue, Apr 22, 2025 at 4:01 AM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Vignesh, > > > Couple of minor suggestions in the test: > > 1) I felt this is not required for this test as it has been verified many times: > > + > > +# Insert tuples and confirms replication works well > > +$node_publisher->safe_psql('postgres', "INSERT INTO t1 VALUES (1);"); > > + > > +$node_publisher->wait_for_catchup('regress_sub'); > > I added it to ensure remote_lsn to the valid value, but yes it is not mandatory. > > > 2) How about we change: > > +# Confirms the origin can be advanced > > +ok( $node_subscriber->poll_query_until( > > + 'postgres', > > + "SELECT remote_lsn > '$remote_lsn' FROM > > pg_catalog.pg_replication_origin_status os, pg_catalog.pg_subscription > > s WHERE os.external_id = 'pg_' || s.oid AND s.subname = > > 'regress_sub'", > > + 't') > > + or die "Timed out while waiting for replication origin to be > > updated"); > > > > to: > > $node_publisher->wait_for_catchup('regress_sub'); > > > > # Confirms the origin can be advanced > > $result = $node_subscriber->safe_psql('postgres', > > "SELECT remote_lsn > '$remote_lsn' FROM > > pg_catalog.pg_replication_origin_status os, pg_catalog.pg_subscription > > s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub'" > > ); > > is($result, 't', > > 'remote_lsn has advanced for apply worker raising an exception'); > > > > In that way, we need not wait on poll_query_until. > > I intentionally used poll_query_until(), but I have no strong opinions. > Modified. > > PSA new patches. Thank you for updating the patch! Here are some comments: +# The bug was that the replication origin wasn’t updated whe +# apply_error_callback() was called with elevel >= ERROR, and the apply worker +# continued running afterward. I think it would be better to mention the fact that the problem happened when an error was caught for instance by a plpgsql function. How about rewriting it as follows? # The bug was that when an ERROR was caught, for instance by a PL/pgSQL function, # the apply worker reset the replication origin but continued processing # subsequent changes. This behavior resulted in a failure to update the replication # origin during further apply operations. --- +# Define an after-trigger function for the table insert. It can be fired even +# by the apply worker and always raises an exception. This situation allows +# worker continue after apply_error_callback() is called with elevel = ERROR. +$node_subscriber->safe_psql( + 'postgres', q{ +CREATE FUNCTION handle_exception_trigger() +RETURNS TRIGGER AS $$ +BEGIN + BEGIN + -- Raise an exception + RAISE EXCEPTION 'This is a test exception'; + EXCEPTION + WHEN OTHERS THEN + RETURN NEW; + END; + + RETURN NEW; +END; +$$ LANGUAGE plpgsql; +}); + +$node_subscriber->safe_psql( + 'postgres', q{ +CREATE TRIGGER silent_exception_trigger +AFTER INSERT OR UPDATE ON t1 +FOR EACH ROW +EXECUTE FUNCTION handle_exception_trigger(); + +ALTER TABLE t1 ENABLE ALWAYS TRIGGER silent_exception_trigger; +}); How about rewriting the comment as follows? # Create an AFTER INSERT trigger on the table that raises and subsequently # handles an exception. Subsequent insertions will trigger this exception, # causing the apply worker to invoke its error callback with an ERROR. However, # since the error is caught within the trigger, the apply worker will continue # processing changes. And can we execute these queries in one safe_psql() call? --- +# Obtain current remote_lsn value to check its advancement later +my $remote_lsn = $node_subscriber->safe_psql('postgres', + "SELECT remote_lsn FROM pg_catalog.pg_replication_origin_status os, pg_catalog.pg_subscription s WHERE os.external_id = 'pg_' || s.oid AND s.subname = 'regress_sub'" +); It seems to make sense to me to get the remote_lsn value just before executing INSERT after creating the trigger. Is it a conventional way to always use schema-qualified catalogs names in regression tests? Looking at other tests in src/test/subscription, there are only three cases: % git grep pg_catalog src/test/subscription/t src/test/subscription/t/001_rep_changes.pl: FROM pg_catalog.pg_stat_io src/test/subscription/t/020_messages.pl: "SELECT COUNT(*) FROM pg_catalog.pg_replication_slots WHERE slot_name = 'tap_sub' AND active='f'", src/test/subscription/t/029_on_error.pl: "SELECT subenabled = false FROM pg_catalog.pg_subscription WHERE subname = 'sub'" ISTM that we don't necessarily need to make the catalog name schema-qualified. --- We might want to stop both the publisher and the subscriber at the end of the tests. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-bugs by date: