Re: Excessive number of replication slots for 12->14 logical replication - Mailing list pgsql-bugs
From | Masahiko Sawada |
---|---|
Subject | Re: Excessive number of replication slots for 12->14 logical replication |
Date | |
Msg-id | CAD21AoBGXi9nFsQTvgF-NfqcvoGOA9SrgSRwoEdGevEHvTm53g@mail.gmail.com Whole thread Raw |
In response to | Re: Excessive number of replication slots for 12->14 logical replication (Ajin Cherian <itsajin@gmail.com>) |
Responses |
Re: Excessive number of replication slots for 12->14 logical replication
Re: Excessive number of replication slots for 12->14 logical replication |
List | pgsql-bugs |
On Mon, Aug 1, 2022 at 3:46 PM Ajin Cherian <itsajin@gmail.com> wrote: > > On Mon, Aug 1, 2022 at 11:50 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Here is my review comment for v4-0001. > > > > (Just a nitpick about comments) > > > > ====== > > > > 1. src/backend/replication/logical/tablesync.c - process_syncing_tables_for_sync > > > > There are really just 2 main things that are being done for the cleanup here: > > - Drop the origin tracking > > - Drop the tablesync slot > > > > So, IMO the code will have more clarity by having one bigger comment > > for each of those drops, instead of commenting every step separately. > > > > e.g. > > > > BEFORE > > /* > > * Cleanup the tablesync origin tracking if exists. > > */ > > ReplicationOriginNameForTablesync(MyLogicalRepWorker->subid, > > MyLogicalRepWorker->relid, > > originname, > > sizeof(originname)); > > > > /* > > * Reset the origin session before dropping. > > * > > * This is required to reset the ownership of the slot > > * and allow the origin to be dropped. > > */ > > replorigin_session_reset(); > > > > replorigin_session_origin = InvalidRepOriginId; > > replorigin_session_origin_lsn = InvalidXLogRecPtr; > > replorigin_session_origin_timestamp = 0; > > > > /* > > * There is a chance that the user is concurrently performing > > * refresh for the subscription where we remove the table > > * state and its origin and by this time the origin might be > > * already removed. So passing missing_ok = true. > > */ > > replorigin_drop_by_name(originname, true, false); > > > > /* Cleanup the tablesync slot. */ > > ReplicationSlotNameForTablesync(MyLogicalRepWorker->subid, > > MyLogicalRepWorker->relid, > > syncslotname, > > sizeof(syncslotname)); > > > > /* > > * It is important to give an error if we are unable to drop the slot, > > * otherwise, it won't be dropped till the corresponding subscription > > * is dropped. So passing missing_ok = false. > > */ > > ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false); > > > > > > SUGGESTION > > > > /* > > * Cleanup the tablesync origin tracking. > > * > > * Resetting the origin session removes the ownership of the slot. > > * This is needed to allow the origin to be dropped. > > * > > * Also, there is a chance that the user is concurrently performing > > * refresh for the subscription where we remove the table > > * state and its origin and by this time the origin might be > > * already removed. So passing missing_ok = true. > > */ > > ReplicationOriginNameForTablesync(MyLogicalRepWorker->subid, > > MyLogicalRepWorker->relid, > > originname, > > sizeof(originname)); > > replorigin_session_reset(); > > replorigin_session_origin = InvalidRepOriginId; > > replorigin_session_origin_lsn = InvalidXLogRecPtr; > > replorigin_session_origin_timestamp = 0; > > > > replorigin_drop_by_name(originname, true, false); > > > > /* > > * Cleanup the tablesync slot. > > * > > * It is important to give an error if we are unable to drop the slot, > > * otherwise, it won't be dropped till the corresponding subscription > > * is dropped. So passing missing_ok = false. > > */ > > ReplicationSlotNameForTablesync(MyLogicalRepWorker->subid, > > MyLogicalRepWorker->relid, > > syncslotname, > > sizeof(syncslotname)); > > ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false); > > > Updated. Thank you for working on this. I have a comment and a question: * This has to be done after updating the state because otherwise if * there is an error while doing the database operations we won't be - * able to rollback dropped slot. + * able to rollback dropped slot or origin tracking. I think we can actually roll back dropping the replication origin. So the above comment is true for only replication slots. --- + replorigin_session_reset(); + replorigin_session_origin = InvalidRepOriginId; + replorigin_session_origin_lsn = InvalidXLogRecPtr; + replorigin_session_origin_timestamp = 0; + + replorigin_drop_by_name(originname, true, false); With this change, the committing the ongoing transaction will be done without replication origin. Is this okay? it's probably okay, but since tablesync worker commits other changes with replication origin I'm concerned a bit there might be a corner case. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
pgsql-bugs by date: