Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | CAHut+PvbbPz1=T4bzY0_GotUK460Eih41Twjt=czJ1z2J8SGEw@mail.gmail.com Whole thread Raw |
In response to | Re: Synchronizing slots from primary to standby (shveta malik <shveta.malik@gmail.com>) |
Responses |
RE: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
Here are some review comments for patch v57-0001. ====== doc/src/sgml/protocol.sgml 1. CREATE_REPLICATION_SLOT ... FAILOVER + <varlistentry> + <term><literal>FAILOVER [ <replaceable class="parameter">boolean</replaceable> ]</literal></term> + <listitem> + <para> + If true, the slot is enabled to be synced to the physical + standbys so that logical replication can be resumed after failover. + </para> + </listitem> + </varlistentry> This syntax says passing the boolean value is optional. So the default needs to the specified here in the docs (like what the TWO_PHASE option does). ~~~ 2. ALTER_REPLICATION_SLOT ... FAILOVER + <variablelist> + <varlistentry> + <term><literal>FAILOVER [ <replaceable class="parameter">boolean</replaceable> ]</literal></term> + <listitem> + <para> + If true, the slot is enabled to be synced to the physical + standbys so that logical replication can be resumed after failover. + </para> + </listitem> + </varlistentry> + </variablelist> This syntax says passing the boolean value is optional. So it needs to be specified here in the docs that not passing a value would be the same as passing the value true. ====== doc/src/sgml/ref/alter_subscription.sgml 3. + If <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link> + is enabled, you can temporarily disable it in order to execute these commands. /in order to/to/ ~~~ 4. + <para> + When altering the + <link linkend="sql-createsubscription-params-with-slot-name"><literal>slot_name</literal></link>, + the <literal>failover</literal> property of the new slot may differ from the + <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link> + parameter specified in the subscription. When creating the slot, + ensure the slot failover property matches the + <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link> + parameter value of the subscription. + </para> 4a. the <literal>failover</literal> property of the new slot may differ Maybe it would be more clear if that said "the failover property value of the named slot...". ~ 4b. In the "failover property matches" part should that failover also be rendered as <literal> like before in the same paragraph? ====== doc/src/sgml/system-views.sgml 5. + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>failover</structfield> <type>bool</type> + </para> + <para> + True if this logical slot is enabled to be synced to the physical + standbys so that logical replication can be resumed from the new primary + after failover. Always false for physical slots. + </para></entry> + </row> /True if this logical slot is enabled.../True if this is a logical slot enabled.../ ====== src/backend/commands/subscriptioncmds.c 6. CreateSubscription + /* + * Even if failover is set, don't create the slot with failover + * enabled. Will enable it once all the tables are synced and + * ready. The intention is that if failover happens at the time of + * table-sync, user should re-launch the subscription instead of + * relying on main slot (if synced) with no table-sync data + * present. When the subscription has no tables, leave failover as + * false to allow ALTER SUBSCRIPTION ... REFRESH PUBLICATION to + * work. + */ + if (opts.failover && !opts.copy_data && tables != NIL) + failover_enabled = true; AFAICT it might be possible for this to set failover_enabled = true if copy_data is false. So failover_enabled would be true when later calling: walrcv_create_slot(wrconn, opts.slot_name, false, twophase_enabled, failover_enabled, CRS_NOEXPORT_SNAPSHOT, NULL); Isn't that contrary to what this comment said: "Even if failover is set, don't create the slot with failover enabled" ~~~ 7. AlterSubscription. case ALTER_SUBSCRIPTION_OPTIONS: + if (!sub->slotname) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cannot set failover for subscription that does not have slot name"))); /for subscription that does not have slot name/for a subscription that does not have a slot name/ ====== .../libpqwalreceiver/libpqwalreceiver.c 8. + if (PQresultStatus(res) != PGRES_COMMAND_OK) + ereport(ERROR, + (errcode(ERRCODE_PROTOCOL_VIOLATION), + errmsg("could not alter replication slot \"%s\"", + slotname))); This used to display the error message like pchomp(PQerrorMessage(conn->streamConn)) but it was removed. Is it OK? ====== src/backend/replication/logical/tablesync.c 9. + if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING) + ereport(LOG, + /* translator: %s is a subscription option */ + (errmsg("logical replication apply worker for subscription \"%s\" will restart so that %s can be enabled", + MySubscription->name, "two_phase"))); + + if (MySubscription->failoverstate == LOGICALREP_FAILOVER_STATE_PENDING) + ereport(LOG, + /* translator: %s is a subscription option */ + (errmsg("logical replication apply worker for subscription \"%s\" will restart so that %s can be enabled", + MySubscription->name, "failover"))); Those errors have multiple %s, so the translator's comment should say "the 2nd %s is a..." ~~~ 10. void -UpdateTwoPhaseState(Oid suboid, char new_state) +EnableTwoPhaseFailoverTriState(Oid suboid, bool enable_twophase, + bool enable_failover) I felt the function name was a bit confusing. Maybe it is simpler to call it like "EnableTriState" or "EnableSubTriState" -- the parameters anyway specify what actual state(s) will be set. ====== src/backend/replication/logical/worker.c 11. + /* Update twophase and/or failover */ + EnableTwoPhaseFailoverTriState(MySubscription->oid, twophase_pending, + failover_pending); + if (twophase_pending) + MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED; + + if (failover_pending) + MySubscription->failoverstate = LOGICALREP_FAILOVER_STATE_ENABLED; Can't you pass the MySubscription as a parameter and then the EnableTwoPhaseFailoverTriState can also set these LOGICALREP_TWOPHASE_STATE_ENABLED/LOGICALREP_FAILOVER_STATE_ENABLED states within the Enable* function? ====== src/backend/replication/repl_gram.y 12. %token K_CREATE_REPLICATION_SLOT %token K_DROP_REPLICATION_SLOT +%token K_ALTER_REPLICATION_SLOT and + create_replication_slot drop_replication_slot + alter_replication_slot identify_system read_replication_slot + timeline_history show upload_manifest and | create_replication_slot | drop_replication_slot + | alter_replication_slot and | K_CREATE_REPLICATION_SLOT { $$ = "create_replication_slot"; } | K_DROP_REPLICATION_SLOT { $$ = "drop_replication_slot"; } + | K_ALTER_REPLICATION_SLOT { $$ = "alter_replication_slot"; } etc. ~ Although it makes no difference IMO it is more natural to code everything in the order: create, alter, drop. ====== src/backend/replication/repl_scanner.l 13. CREATE_REPLICATION_SLOT { return K_CREATE_REPLICATION_SLOT; } DROP_REPLICATION_SLOT { return K_DROP_REPLICATION_SLOT; } +ALTER_REPLICATION_SLOT { return K_ALTER_REPLICATION_SLOT; } and case K_CREATE_REPLICATION_SLOT: case K_DROP_REPLICATION_SLOT: + case K_ALTER_REPLICATION_SLOT: Although it makes no difference IMO it is more natural to code everything in the order: create, alter, drop. ====== src/backend/replication/slot.c 14. + if (SlotIsPhysical(MyReplicationSlot)) + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cannot use %s with a physical replication slot", + "ALTER_REPLICATION_SLOT")); /with a/for a/ ====== src/backend/replication/walsender.c 15. +static void +ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover) +{ + ListCell *lc; + bool failover_given = false; + + /* Parse options */ + foreach(lc, cmd->options) + { + DefElem *defel = (DefElem *) lfirst(lc); AFAIK there are some new-style macros now you can use for this code. e.g. foreach_ptr? See [1]. ~~~ 16. + if (strcmp(defel->defname, "failover") == 0) + { + if (failover_given) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), + errmsg("conflicting or redundant options"))); + failover_given = true; + *failover = defGetBoolean(defel); + } The documented syntax showed that passing the boolean value for the FAILOVER option is not mandatory. Does this code work if the boolean value is not passed? ====== src/bin/psql/tab-complete.c 17. I think "ALTER SUBSCRIPTION ... SET (failover)" is possible, but the ALTER SUBSCRIPTION tab completion code is missing. ====== src/include/nodes/replnodes.h 18. +/* ---------------------- + * ALTER_REPLICATION_SLOT command + * ---------------------- + */ +typedef struct AlterReplicationSlotCmd +{ + NodeTag type; + char *slotname; + List *options; +} AlterReplicationSlotCmd; + + Same as an earlier comment. Although it makes no difference IMO it is more natural to define these structs in the order: CreateReplicationSlotCmd, then AlterReplicationSlotCmd, then DropReplicationSlotCmd. ====== .../t/050_standby_failover_slots_sync.pl 19. + +# Copyright (c) 2023, PostgreSQL Global Development Group + /2023/2024/ ~~~ 20. +# Create another subscription (using the same slot created above) that enables +# failover. +$subscriber1->safe_psql( + 'postgres', qq[ + CREATE TABLE tab_int (a int PRIMARY KEY); + CREATE SUBSCRIPTION regress_mysub1 CONNECTION '$publisher_connstr' PUBLICATION regress_mypub WITH (slot_name = lsub1_slot, copy_data=false, failover = true, create_slot = false); The comment should not say "Create another subscription" because this is the first subscription being created. /another/a/ ~~~ 21. +################################################## +# Test if changing the failover property of a subscription updates the +# corresponding failover property of the slot. +################################################## /Test if/Test that/ ====== src/test/regress/sql/subscription.sql 22. +CREATE SUBSCRIPTION regress_testsub CONNECTION 'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect = false, failover = true); + +\dRs+ This is currently only testing the explicit "failover=true". Maybe you can also test the other kinds work as expected: - explicit "SET (failover=false)" - explicit "SET (failover)" with no value specified ====== [1] https://github.com/postgres/postgres/commit/14dd0f27d7cd56ffae9ecdbe324965073d01a9ff Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: