Re: Handle infinite recursion in logical replication setup - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Handle infinite recursion in logical replication setup |
Date | |
Msg-id | CAHut+Ptx2gPMmNMdiq+izLhZcv66t80r6n91KXegVc1cgC2qGQ@mail.gmail.com |
In response to | Re: Handle infinite recursion in logical replication setup (Peter Smith <smithpb2250@gmail.com>) |
Responses | Re: Handle infinite recursion in logical replication setup |
List | pgsql-hackers |
On Thu, Apr 7, 2022 at 2:09 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> FYI, here is a test script that is using the current patch (v6) to
> demonstrate a way to share table data between different numbers of
> nodes (up to 5 of them here).
>
> The script starts off with just 2-way sharing (nodes N1, N2),
> then expands to 3-way sharing (nodes N1, N2, N3),
> then 4-way sharing (nodes N1, N2, N3, N4),
> then 5-way sharing (nodes N1, N2, N3, N4, N5).
>
> As an extra complication, for this test, all 5 nodes have different
> initial table data, which gets replicated to the others whenever each
> new node joins the existing share group.
>
> PSA.
>

Hi Vignesh.

I had some problems getting the above test script working. It was OK
up until I tried to join the 5th node (N5) to the existing 4 nodes.
The ERROR manifested itself strangely: it appeared that there was a
unique index violation in the pg_subscription_rel catalog even though,
AFAIK, N5 did not have any entries in it.

e.g.
2022-04-07 09:13:28.361 AEST [24237] ERROR: duplicate key value violates unique constraint "pg_subscription_rel_srrelid_srsubid_index"
2022-04-07 09:13:28.361 AEST [24237] DETAIL: Key (srrelid, srsubid)=(16384, 16393) already exists.
2022-04-07 09:13:28.361 AEST [24237] STATEMENT: create subscription sub51 connection 'port=7651' publication pub1 with (subscribe_local_only=true,copy_data=force);
2022-04-07 09:13:28.380 AEST [24237] ERROR: duplicate key value violates unique constraint "pg_subscription_rel_srrelid_srsubid_index"
2022-04-07 09:13:28.380 AEST [24237] DETAIL: Key (srrelid, srsubid)=(16384, 16394) already exists.
2022-04-07 09:13:28.380 AEST [24237] STATEMENT: create subscription sub52 connection 'port=7652' publication pub2 with (subscribe_local_only=true,copy_data=false);
2022-04-07 09:13:28.405 AEST [24237] ERROR: duplicate key value violates unique constraint "pg_subscription_rel_srrelid_srsubid_index"
2022-04-07 09:13:28.405 AEST [24237] DETAIL: Key (srrelid, srsubid)=(16384, 16395) already exists.
2022-04-07 09:13:28.405 AEST [24237] STATEMENT: create subscription sub53 connection 'port=7653' publication pub3 with (subscribe_local_only=true,copy_data=false);
2022-04-07 09:13:28.425 AEST [24237] ERROR: duplicate key value violates unique constraint "pg_subscription_rel_srrelid_srsubid_index"
2022-04-07 09:13:28.425 AEST [24237] DETAIL: Key (srrelid, srsubid)=(16384, 16396) already exists.
2022-04-07 09:13:28.425 AEST [24237] STATEMENT: create subscription sub54 connection 'port=7654' publication pub4 with (subscribe_local_only=true,copy_data=false);
2022-04-07 09:17:52.472 AEST [25852] ERROR: duplicate key value violates unique constraint "pg_subscription_rel_srrelid_srsubid_index"
2022-04-07 09:17:52.472 AEST [25852] DETAIL: Key (srrelid, srsubid)=(16384, 16397) already exists.
2022-04-07 09:17:52.472 AEST [25852] STATEMENT: create subscription sub51 connection 'port=7651' publication pub1;

~~~

When I debugged this, it seemed like each CREATE SUBSCRIPTION was
trying to make a double-entry, because fetch_tables (whose SQL your
patch v6-0002 modified) was returning the same table 2x.

(gdb) bt
#0 errfinish (filename=0xbc1057 "nbtinsert.c", lineno=671, funcname=0xbc25e0 <__func__.15798> "_bt_check_unique") at elog.c:510
#1 0x0000000000526d83 in _bt_check_unique (rel=0x7f654219c2a0, insertstate=0x7ffd9629ddd0, heapRel=0x7f65421b0e28, checkUnique=UNIQUE_CHECK_YES, is_unique=0x7ffd9629de01, speculativeToken=0x7ffd9629ddcc) at nbtinsert.c:664
#2 0x0000000000526157 in _bt_doinsert (rel=0x7f654219c2a0, itup=0x19ea8e8, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=false, heapRel=0x7f65421b0e28) at nbtinsert.c:208
#3 0x000000000053450e in btinsert (rel=0x7f654219c2a0, values=0x7ffd9629df10, isnull=0x7ffd9629def0, ht_ctid=0x19ea894, heapRel=0x7f65421b0e28, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=false, indexInfo=0x19dea80) at nbtree.c:201
#4 0x00000000005213b6 in index_insert (indexRelation=0x7f654219c2a0, values=0x7ffd9629df10, isnull=0x7ffd9629def0, heap_t_ctid=0x19ea894, heapRelation=0x7f65421b0e28, checkUnique=UNIQUE_CHECK_YES, indexUnchanged=false, indexInfo=0x19dea80) at indexam.c:193
#5 0x00000000005c81d5 in CatalogIndexInsert (indstate=0x19de540, heapTuple=0x19ea890) at indexing.c:158
#6 0x00000000005c8325 in CatalogTupleInsert (heapRel=0x7f65421b0e28, tup=0x19ea890) at indexing.c:231
#7 0x00000000005f0170 in AddSubscriptionRelState (subid=16400, relid=16384, state=105 'i', sublsn=0) at pg_subscription.c:315
#8 0x00000000006d6fa5 in CreateSubscription (pstate=0x1942dc0, stmt=0x191f6a0, isTopLevel=true) at subscriptioncmds.c:767

~~

Aside: All this was happening when I did not have enough logical
replication workers configured. (There were WARNINGs in the logfile
that I had not noticed.) When I fixed the configuration, all these
other problems went away!

~~

So to summarize, I'm not sure whether fetch_tables still has some
potential problem lurking, but I feel that the SQL in that function
needs a closer look to ensure that it can never return the same table
multiple times.

------
Kind Regards,
Peter Smith.
Fujitsu Australia.
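
To make the suspected failure mode concrete, here is a minimal sketch in Python. It is only a toy model, not PostgreSQL code: the set stands in for the unique index on (srrelid, srsubid), and the names add_subscription_rel_states, dedupe_preserving_order and DuplicateKeyError are hypothetical. The OIDs 16384 and 16400 are taken from the backtrace above. It assumes, as the debugging suggested, that the fetched table list names the same relation twice.

```python
# Toy model only (an assumption for illustration, not PostgreSQL source):
# a Python set stands in for the unique index on
# pg_subscription_rel (srrelid, srsubid).


class DuplicateKeyError(Exception):
    """Raised when the same (srrelid, srsubid) key is inserted twice."""


def add_subscription_rel_states(catalog, subid, relids):
    """Insert one (srrelid, srsubid) key per relid; fail on a repeated key."""
    for relid in relids:
        key = (relid, subid)
        if key in catalog:
            raise DuplicateKeyError(
                "Key (srrelid, srsubid)=(%d, %d) already exists." % key
            )
        catalog.add(key)


def dedupe_preserving_order(relids):
    """Drop repeated relids while keeping first-seen order."""
    return list(dict.fromkeys(relids))


# Assumed input: the fetched table list names relation 16384 twice.
fetched = [16384, 16384]

# Inserting the raw list hits the uniqueness check on the second entry.
catalog = set()
failed = False
message = ""
try:
    add_subscription_rel_states(catalog, 16400, fetched)
except DuplicateKeyError as exc:
    failed = True
    message = str(exc)

# Inserting the de-duplicated list succeeds with a single entry.
clean_catalog = set()
add_subscription_rel_states(
    clean_catalog, 16400, dedupe_preserving_order(fetched)
)
```

In this model the second insert of the same key fails in the same way as the logged ERROR, while the de-duplicated list goes through. Whether the real fix belongs in the fetch_tables SQL or in the caller is left open here.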