Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date | |
Msg-id | CAHut+PubGNUrT7j6mDYVHgR6pkBYGbLTPdMJr8_W2_SFOSSOGg@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher (Önder Kalacı <onderkalaci@gmail.com>) |
Responses |
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
List | pgsql-hackers |
Here are my review comments for v37-0001. ====== General - should/must. 1. In my previous review [1] (comment #1) I wrote that only some of the "should" were misleading and gave examples where to change. But I didn't say that *every* usage of that word was wrong, so your global replace of "should" to "must" has modified a couple of places in unexpected ways. Details are in subsequent review comments below -- see #2b, #3, #5. ====== doc/src/sgml/logical-replication.sgml 2. A published table must have a “replica identity” configured in order to be able to replicate UPDATE and DELETE operations, so that appropriate rows to update or delete can be identified on the subscriber side. By default, this is the primary key, if there is one. Another unique index (with certain additional requirements) can also be set to be the replica identity. If the table does not have any suitable key, then it can be set to replica identity “full”, which means the entire row becomes the key. When replica identity “full” is specified, indexes can be used on the subscriber side for searching the rows. Candidate indexes must be btree, non-partial, and have at least one column reference (i.e. cannot consist of only expressions). These restrictions on the non-unique index properties adheres to some of the restrictions that are enforced for primary keys. Internally, we follow a similar approach for supporting index scans within logical replication scope. If there are no such suitable indexes, the search on the subscriber side can be very inefficient, therefore replica identity “full” must only be used as a fallback if no other solution is possible. If a replica identity other than “full” is set on the publisher side, a replica identity comprising the same or fewer columns must also be set on the subscriber side. See REPLICA IDENTITY for details on how to set the replica identity. If a table without a replica identity is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. INSERT operations can proceed regardless of any replica identity. ~~ 2a. My previous review [1] (comment #2) was not fixed quite as suggested. Please change: "adheres to" --> "adhere to" ~~ 2b. should/must This should/must change was OK as it was before, because here it is only advice. Please change this back how it was: "must only be used as a fallback" --> "should only be used as a fallback" ====== src/backend/executor/execReplication.c 3. build_replindex_scan_key /* * Setup a ScanKey for a search in the relation 'rel' for a tuple 'key' that * is setup to match 'rel' (*NOT* idxrel!). * - * Returns whether any column contains NULLs. + * Returns how many columns must be used for the index scan. + * ~ This should/must change does not seem quite right. SUGGESTION (reworded) Returns how many columns to use for the index scan. ~~~ 4. build_replindex_scan_key > > Based on the discussions below, I kept as-is. I really don't want to do unrelated > changes in this patch, as I also got several feedback for not doing it, > Hmm, although this code pre-existed I don’t consider this one as "unrelated changes" because the patch introduced the new "if (!AttributeNumberIsValid(table_attno))" which changed things. As I wrote to Amit yesterday [2] IMO it would be better to do the 'opttype' assignment *after* the potential 'continue' otherwise there is every chance that the assignment is just redundant. And if you move the assignment where it belongs, then you might as well declare the variable in the more appropriate place at the same time – i.e. with 'opfamily' declaration. Anyway, I've given my reason a couple of times now, so if you don't want to change it I won't about it debate anymore. ====== src/backend/replication/logical/relation.c 5. FindUsableIndexForReplicaIdentityFull + * XXX: There are no fundamental problems for supporting non-btree indexes. + * We mostly need to relax the limitations in RelationFindReplTupleByIndex(). + * For partial indexes, the required changes are likely to be larger. If + * none of the tuples satisfy the expression for the index scan, we must + * fall-back to sequential execution, which might not be a good idea in some + * cases. The should/must change (the one in the XXX comment) does not seem quite right. SUGGESTION "we must fall-back to sequential execution" --> "we fallback to sequential execution" ====== .../subscription/t/032_subscribe_use_index.pl FYI, I get TAP test in error (Note - this is when only patch 0001 is appied) t/031_column_list.pl ............... ok t/032_subscribe_use_index.pl ....... 19/? # Failed test 'ensure subscriber has not used index with enable_indexscan=false' # at t/032_subscribe_use_index.pl line 806. # got: '1' # expected: '0' t/032_subscribe_use_index.pl ....... 21/? # Looks like you failed 1 test of 22. t/032_subscribe_use_index.pl ....... Dubious, test returned 1 (wstat 256, 0x100) Failed 1/22 subtests t/100_bugs.pl ...................... ok AFAICT there is a test case that is testing the patch 0002 functionality even when patch 0002 is not applied yet. ------ [1] Replies to my review v35 - https://www.postgresql.org/message-id/CACawEhXnTQyGNCXeQGhN3_%2BGWujhS3MyY27C4sSqRvZ%2B_B7FLg%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAHut%2BPvLvDGFzk4fSaevGY5h2PpAeSZjJjje_7vBdb7ag%3DzswA%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: