Home > mailing lists
Re: row filtering for logical replication - Mailing list pgsql-hackers

From	Peter Smith
Subject	Re: row filtering for logical replication
Date	February 7, 2022 07:50:37
Msg-id	CAHut+PtTtAG+5T7rGA1FaAKM-JofSBOOxXhVtnCqcY22jH7MyA@mail.gmail.com Whole thread Raw
In response to	RE: row filtering for logical replication ("houzj.fnst@fujitsu.com" <houzj.fnst@fujitsu.com>)
Responses	Re: row filtering for logical replication RE: row filtering for logical replication
List	pgsql-hackers
Tree view
Hi - I did a review of the v77 patches merged with Amit's v77 diff patch [1].

(Maybe this is equivalent to reviewing v78)

Below are my review comments:

======

1. doc/src/sgml/ref/create_publication.sgml - CREATE PUBLICATION

+   The <literal>WHERE</literal> clause allows simple expressions that
don't have
+   user-defined functions, operators, collations, non-immutable built-in
+   functions, or references to system columns.
+  </para>

That seems slightly ambiguous for operators and collations. It's only
the USER-DEFINED ones we don't support.

Perhaps it should be worded like:

"allows simple expressions that don't have user-defined
functions/operators/collations, non-immutable built-in functions..."

or like

"allows simple expressions that don't have user-defined functions,
user-defined operators, user-defined collations, non-immutable
built-in functions..."

~~~

2. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication

+Oid
+GetTopMostAncestorInPublication(Oid puboid, List *ancestors)
+{
+ ListCell   *lc;
+ Oid topmost_relid = InvalidOid;
+
+ /*
+ * Find the "topmost" ancestor that is in this publication.
+ */
+ foreach(lc, ancestors)
+ {
+ Oid ancestor = lfirst_oid(lc);
+ List    *apubids = GetRelationPublications(ancestor);
+ List    *aschemaPubids = NIL;
+
+ if (list_member_oid(apubids, puboid))
+ topmost_relid = ancestor;
+ else
+ {
+ aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
+ if (list_member_oid(aschemaPubids, puboid))
+ topmost_relid = ancestor;
+ }
+
+ list_free(apubids);
+ list_free(aschemaPubids);
+ }
+
+ return topmost_relid;
+}

Wouldn't it be better for the aschemaPubids to be declared and freed
inside the else block?

e.g.

else
{
List *aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));

if (list_member_oid(aschemaPubids, puboid))
topmost_relid = ancestor;

list_free(aschemaPubids);
}

~~~

3. src/backend/commands/publicationcmds.c - contain_invalid_rfcolumn

+ if (pubviaroot && relation->rd_rel->relispartition)
+ {
+ publish_as_relid = GetTopMostAncestorInPublication(pubid, ancestors);
+
+ if (publish_as_relid == InvalidOid)
+ publish_as_relid = relid;
+ }

Consider using the macro code for the InvalidOid check. e.g.

if (!OidIsValid(publish_as_relid)
publish_as_relid = relid;

~~~

4. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (Tests)

+ switch (nodeTag(node))
+ {
+ case T_ArrayExpr:
+ case T_BooleanTest:
+ case T_BoolExpr:
+ case T_CaseExpr:
+ case T_CaseTestExpr:
+ case T_CoalesceExpr:
+ case T_CollateExpr:
+ case T_Const:
+ case T_FuncExpr:
+ case T_MinMaxExpr:
+ case T_NullTest:
+ case T_RelabelType:
+ case T_XmlExpr:
+ return true;
+ default:
+ return false;
+ }

I think there are several missing regression tests.

4a. There is a new message that says "User-defined collations are not
allowed." but I never saw any test case for it.

4b. There is also the RelabelType which seems to have no test case.
Amit previously provided [2] some SQL which would give an unexpected
error, so I guess that should be a new regression test case. e.g.
create table t1(c1 int, c2 varchar(100));
create publication pub1 for table t1 where (c2 < 'john');

~~~

5. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (Simple?)

+/*
+ * Is this a simple Node permitted within a row filter expression?
+ */
+static bool
+IsRowFilterSimpleExpr(Node *node)
+{

A lot has changed in this area recently and I feel that there is
something not quite 100% right with the naming and/or logic in this
expression validation. IMO there are several functions that seem to
depend too much on each other in special ways...

IIUC the "walker" logic now seems to be something like this:
a) Check for special cases of the supported nodes
b) Then check for supported (simple?) nodes (i.e.
IsRowFilterSimpleExpr is now acting as a "catch-all" after the special
case checks)
c) Then check for some unsupported node embedded within a supported
node (i.e. call expr_allowed_in_node)
d) If any of a,b,c was bad then give an error.

To achieve that logic the T_FuncExpr was added to the
"IsRowFilterSimpleExpr". Meanwhile, other nodes like
T_ScalarArrayOpExpr and T_NullIfExpr now are removed from
IsRowFilterSimpleExpr - I don't quite know why these got removed but
perhaps there is implicit knowledge that those node kinds were already
checked by the "walker" before the IsRowFilterSimpleExpr function ever
gets called.

So, although I trust that everything is working OK,  I don't think
IsRowFilterSimpleExpr is really just about simple nodes anymore. It is
harder to see why some supported nodes are in there, and some
supported nodes are not. It seems tightly entwined with the logic of
check_simple_rowfilter_expr_walker; i.e. there seem to be assumptions
about exactly when it will be called and what was checked before and
what will be checked after calling it.

IMO probably all the nodes we are supporting should be in the
IsRowFilterSimpleExpr just for completeness (e.g. put T_NullIfExpr and
T_ScalarArrayOpExpr back in there...), and maybe the function should
be renamed (IsRowFilterAllowedNode?), and probably there need to be
more comments describing the validation logic (e.g. the a/b/c/d logic
I mentioned above).

~~~

6. src/backend/commands/publicationcmds.c - IsRowFilterSimpleExpr (T_List)

(From Amit's patch)

@@ -395,6 +397,7 @@ IsRowFilterSimpleExpr(Node *node)
  case T_NullTest:
  case T_RelabelType:
  case T_XmlExpr:
+ case T_List:
  return true;
  default:
  return false;


The case T_List should be moved to be alphabetical the same as all the
other cases.

~~~

7. src/backend/commands/publicationcmds.c -
contain_mutable_or_ud_functions_checker

+/* check_functions_in_node callback */
+static bool
+contain_mutable_or_ud_functions_checker(Oid func_id, void *context)

"ud" seems a strange name. Maybe better to name this function
"contain_mutable_or_user_functions_checker" ?

~~~

8. src/backend/commands/publicationcmds.c - expr_allowed_in_node (comment)

(From Amit's patch)

@@ -410,6 +413,37 @@ contain_mutable_or_ud_functions_checker(Oid
func_id, void *context)
 }

 /*
+ * Check, if the node contains any unallowed object in node. See
+ * check_simple_rowfilter_expr_walker.
+ *
+ * Returns the error detail meesage in errdetail_msg for unallowed expressions.
+ */
+static bool
+expr_allowed_in_node(Node *node, ParseState *pstate, char **errdetail_msg)

Remove the comma: "Check, if ..." --> "Check if ..."
Typo: "meesage" --> "message"

~~~

9. src/backend/commands/publicationcmds.c - expr_allowed_in_node (else)

(From Amit's patch)

+ if (exprType(node) >= FirstNormalObjectId)
+ *errdetail_msg = _("User-defined types are not allowed.");
+ if (check_functions_in_node(node, contain_mutable_or_ud_functions_checker,
+ (void*) pstate))
+ *errdetail_msg = _("User-defined or built-in mutable functions are
not allowed.");
+ else if (exprCollation(node) >= FirstNormalObjectId)
+ *errdetail_msg = _("User-defined collations are not allowed.");
+ else if (exprInputCollation(node) >= FirstNormalObjectId)
+ *errdetail_msg = _("User-defined collations are not allowed.");

Is that correct - isn't there a missing "else" on the 2nd "if"?

~~~

10. src/backend/commands/publicationcmds.c - expr_allowed_in_node (bool)

(From Amit's patch)

+static bool
+expr_allowed_in_node(Node *node, ParseState *pstate, char **errdetail_msg)

Why is this a boolean function? It can never return false (??)

~~~

11. src/backend/commands/publicationcmds.c -
check_simple_rowfilter_expr_walker (else)

(From Amit's patch)

@@ -500,12 +519,18 @@ check_simple_rowfilter_expr_walker(Node *node,
ParseState *pstate)
  }
  }
  }
- else if (!IsRowFilterSimpleExpr(node))
+ else if (IsRowFilterSimpleExpr(node))
+ {
+ }
+ else
  {
  elog(DEBUG3, "row filter contains an unexpected expression
component: %s", nodeToString(node));
  errdetail_msg = _("Expressions only allow columns, constants,
built-in operators, built-in data types, built-in collations and
immutable built-in functions.");
  }

Why introduce a new code block that does nothing?

~~~

12. src/backend/replication/pgoutput/pgoutput.c - get_rel_sync_entry

+ /*
+ * Initialize the row filter after getting the final publish_as_relid
+ * as we only evaluate the row filter of the relation which we publish
+ * change as.
+ */
+ pgoutput_row_filter_init(data, active_publications, entry);

The comment "which we publish change as" seems strangely worded.

Perhaps it should be:
"... only evaluate the row filter of the relation which being published."

~~~

13. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc (release)

+ /*
+ * Check if all columns referenced in the filter expression are part of
+ * the REPLICA IDENTITY index or not.
+ *
+ * If the publication is FOR ALL TABLES then it means the table has no
+ * row filters and we can skip the validation.
+ */
+ if (!pubform->puballtables &&
+ (pubform->pubupdate || pubform->pubdelete) &&
+ contain_invalid_rfcolumn(pubid, relation, ancestors,
+ pubform->pubviaroot))
+ {
+ if (pubform->pubupdate)
+ pubdesc->rf_valid_for_update = false;
+ if (pubform->pubdelete)
+ pubdesc->rf_valid_for_delete = false;
+ }

  ReleaseSysCache(tup);

This change has the effect of moving the location of the
"ReleaseSysCache(tup);" to much lower in the code but I think there is
no point to move it for the Row Filter patch, so it should be left
where it was before.

~~~

14. src/backend/utils/cache/relcache.c - RelationBuildPublicationDesc
(if refactor)

- if (pubactions->pubinsert && pubactions->pubupdate &&
- pubactions->pubdelete && pubactions->pubtruncate)
+ if (pubdesc->pubactions.pubinsert && pubdesc->pubactions.pubupdate &&
+ pubdesc->pubactions.pubdelete && pubdesc->pubactions.pubtruncate &&
+ !pubdesc->rf_valid_for_update && !pubdesc->rf_valid_for_delete)
  break;

I felt that the "rf_valid_for_update" and "rf_valid_for_delete" should
be checked first in that if condition. It is probably more optimal to
move them because then it can bail out early. All those other
pubaction flags are more likely to be true most of the time (because
that is the default case).

~~~

15. src/bin/psql/describe.c - SQL format

@@ -2898,12 +2902,12 @@ describeOneTableDetails(const char *schemaname,
  else
  {
  printfPQExpBuffer(&buf,
-   "SELECT pubname\n"
+   "SELECT pubname, NULL\n"
    "FROM pg_catalog.pg_publication p\n"
    "JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n"
    "WHERE pr.prrelid = '%s'\n"
    "UNION ALL\n"
-   "SELECT pubname\n"
+   "SELECT pubname, NULL\n"
    "FROM pg_catalog.pg_publication p\n"

I thought it may be better to reformat to put the NULL columns on a
different line for consistent format with the other SQL just above
this one. e.g.

  printfPQExpBuffer(&buf,
    "SELECT pubname\n"
+   " , NULL\n"
...

------
[1] https://www.postgresql.org/message-id/CAA4eK1LApUf%3DagS86KMstoosEBD74GD6%2BPPYGF419kwLw6fvrw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAA4eK1KDtwUcuFHOJ4zCCTEY4%2B_-X3fKTjn%3DkyaZwBeeqRF-oA%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia
pgsql-hackers by date:
From: Peter Eisentraut
Date: 07 February 2022, 07:48:50
Subject: Remove IS_AF_UNIX macro?
From: Dilip Kumar
Date: 07 February 2022, 07:57:13
Subject: Re: [BUG]Update Toast data failure in logical replication
Re: row filtering for logical replication - Mailing list pgsql-hackers

Previous

Next