Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date | |
Msg-id | CAHut+PtS5XpCM_wUXwJHLBOBVgZ8E=T8JtYsXHUQCzWASUyY2A@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher (Önder Kalacı <onderkalaci@gmail.com>) |
Responses |
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
|
List | pgsql-hackers |
Here are some review comments for v27-0001 (not the tests) ====== Commit Message 1. There is no smart mechanism to pick the index. Instead, we choose the first index that fulfils the requirements mentioned above. ~ 1a. I think this paragraph should immediately follow the earlier one ("With this patch...") which talked about the index requirements. ~ 1b. Slight rewording SUGGESTION If there is more than one index that satisfies these requirements, we just pick the first one. ====== doc/src/sgml/logical-replication.sgml 2. A published table must have a “replica identity” configured in order to be able to replicate UPDATE and DELETE operations, so that appropriate rows to update or delete can be identified on the subscriber side. By default, this is the primary key, if there is one. Another unique index (with certain additional requirements) can also be set to be the replica identity. When replica identity FULL is specified, indexes can be used on the subscriber side for searching the rows. These indexes should be btree, non-partial and have at least one column reference (e.g., should not consist of only expressions). These restrictions on the non-unique index properties are in essence the same restrictions that are enforced for primary keys. Internally, we follow the same approach for supporting index scans within logical replication scope. If there are no such suitable indexes, the search on the subscriber s ide can be very inefficient, therefore replica identity FULL should only be used as a fallback if no other solution is possible. If a replica identity other than “full” is set on the publisher side, a replica identity comprising the same or fewer columns must also be set on the subscriber side. See REPLICA IDENTITY for details on how to set the replica identity. If a table without a replica identity is added to a publication that replicates UPDATE or DELETE operations then subsequent UPDATE or DELETE operations will cause an error on the publisher. INSERT operations can proceed regardless of any replica identity. ~ 2a. IMO the <quote>replica identity</quote> in the first sentence should be changed to be <firstterm>replica identity</firstterm> ~ 2b. Typo: "subscriber s ide" --> "subscriber side" ~ 2c. There is still one remaining "full" in this text. I think ought to be changed to <literal>FULL</literal> to match the others. ====== src/backend/executor/execReplication.c 3. IdxIsRelationIdentityOrPK +/* + * Given a relation and OID of an index, returns true if the + * index is relation's primary key's index or relation's + * replica identity index. + * + * Returns false otherwise. + */ +bool +IdxIsRelationIdentityOrPK(Relation rel, Oid idxoid) +{ + Assert(OidIsValid(idxoid)); + + return RelationGetReplicaIndex(rel) == idxoid || + RelationGetPrimaryKeyIndex(rel) == idxoid; } ~ Since the function name mentions RI (1st) and then PK (2nd), and since the implementation also has the same order, I think the function comment should use the same consistent order when describing what it does. ====== src/backend/replication/logical/relation.c 4. FindUsableIndexForReplicaIdentityFull +/* + * Returns an index oid if there is an index that can be used + * via the apply worker. The index should be btree, non-partial + * and have at least one column reference (e.g., should + * not consist of only expressions). The limitations arise from + * RelationFindReplTupleByIndex(), which is designed to handle + * PK/RI and these limitations are inherent to PK/RI. + * + * There are no fundamental problems for supporting non-btree + * and/or partial indexes. We should mostly relax the limitations + * in RelationFindReplTupleByIndex(). + * + * If no suitable index is found, returns InvalidOid. + * + * Note that this is not a generic function, it expects REPLICA + * IDENTITY FULL for the remote relation. + */ ~ 4a. Minor rewording of 1st sentence. BEFORE Returns an index oid if there is an index that can be used via the apply worker. SUGGESTION Returns the oid of an index that can be used via the apply worker. ~ 4b. + * There are no fundamental problems for supporting non-btree + * and/or partial indexes. We should mostly relax the limitations + * in RelationFindReplTupleByIndex(). I think this paragraph should come later in the comment (just before the Note) and should also have "XXX" prefix to indicate it is some implementation note for future versions. ~~~ 5. GetRelationIdentityOrPK +/* + * Get replica identity index or if it is not defined a primary key. + * + * If neither is defined, returns InvalidOid + */ +Oid +GetRelationIdentityOrPK(Relation rel) +{ + Oid idxoid; + + idxoid = RelationGetReplicaIndex(rel); + + if (!OidIsValid(idxoid)) + idxoid = RelationGetPrimaryKeyIndex(rel); + + return idxoid; +} This is really very similar code to the other new function called IdxIsRelationIdentityOrPK. I wondered if such similar functions could be defined together. ~~~ 6. FindLogicalRepUsableIndex +/* + * Returns an index oid if we can use an index for subscriber. If not, + * returns InvalidOid. + */ SUGGESTION Returns the oid of an index that can be used by a subscriber. Otherwise, returns InvalidOid. ------ Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: