Re: bogus: logical replication rows/cols combinations - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: bogus: logical replication rows/cols combinations |
Date | |
Msg-id | CAA4eK1KsRLSEU-0Spny7TyEyhnvq8sCHqC9wGu_DeUvNT3BktA@mail.gmail.com |
In response to | Re: bogus: logical replication rows/cols combinations (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: bogus: logical replication rows/cols combinations
Re: bogus: logical replication rows/cols combinations RE: bogus: logical replication rows/cols combinations |
List | pgsql-hackers |
On Tue, Apr 26, 2022 at 4:00 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> On 4/25/22 17:48, Alvaro Herrera wrote:
>
> > The desired result on subscriber is:
> >
> > table uno;
> >  a  │ b │ c
> > ────┼───┼───
> >   1 │ 2 │
> >  -1 │   │ 4
> >
> >
> > Thoughts?
> >
>
> I'm not quite sure which of the two behaviors is more "desirable". In a
> way, it's somewhat similar to publish_as_relid, which is also calculated
> not considering which of the row filters match?
>

Right, or in other words, we check all publications to decide it and
similar is the case for publication actions which are also computed
independently for all publications.

> But maybe you're right and it should behave the way you propose ... the
> example I have in mind is a use case replicating table with two types of
> rows - sensitive and non-sensitive. For sensitive, we replicate only
> some of the columns, for non-sensitive we replicate everything. Which
> could be implemented as two publications
>
>   create publication sensitive_rows
>     for table t (a, b) where (is_sensitive);
>
>   create publication non_sensitive_rows
>     for table t where (not is_sensitive);
>
> But the way it's implemented now, we'll always replicate all columns,
> because the second publication has no column list.
>
> Changing this to behave the way you expect would be quite difficult,
> because at the moment we build a single OR expression from all the row
> filters. We'd have to keep the individual expressions, so that we can
> build a column list for each of them (in order to ignore those that
> don't match).
>
> We'd have to remove various other optimizations - for example we can't
> just discard row filters if we found "no_filter" publication.
>

I don't think that is the right way. We need some way to combine
expressions and I feel the current behavior is sane. I mean to say that
even if there is one publication that has no filter (column/row), we
should publish all rows with all columns.
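To make the combination rule discussed above concrete, here is a minimal Python sketch of it. It is only a model of the behavior described in this thread, not pgoutput code: the `Publication` class, the `combine` and `publish` helpers, and the extra column `c` are hypothetical names introduced for illustration. A column list or row filter of `None` stands for "no list" or "no filter".

```python
from typing import Callable, Optional


class Publication:
    """Hypothetical stand-in for one publication of a single table."""

    def __init__(self, name: str,
                 columns: Optional[set] = None,
                 row_filter: Optional[Callable[[dict], bool]] = None):
        self.name = name
        self.columns = columns        # None means "no column list"
        self.row_filter = row_filter  # None means "no row filter"


def combine(pubs):
    """Combine all publications up front, independently of any given row."""
    # A single publication without a column list means all columns.
    if any(p.columns is None for p in pubs):
        columns = None
    else:
        columns = set().union(*(p.columns for p in pubs))

    # A single publication without a row filter means all rows;
    # otherwise the filters are ORed together.
    if any(p.row_filter is None for p in pubs):
        row_filter = None
    else:
        filters = [p.row_filter for p in pubs]
        row_filter = lambda row: any(f(row) for f in filters)
    return columns, row_filter


def publish(row: dict, pubs):
    """Return the columns sent for `row`, or None if the row is filtered out."""
    columns, row_filter = combine(pubs)
    if row_filter is not None and not row_filter(row):
        return None
    if columns is None:
        return dict(row)
    return {k: v for k, v in row.items() if k in columns}


# The two publications from the quoted example (column "c" is assumed).
sensitive_rows = Publication("sensitive_rows", columns={"a", "b"},
                             row_filter=lambda r: r["is_sensitive"])
non_sensitive_rows = Publication("non_sensitive_rows",
                                 row_filter=lambda r: not r["is_sensitive"])
pubs = [sensitive_rows, non_sensitive_rows]

sensitive_row = {"a": 1, "b": 2, "c": 3, "is_sensitive": True}
# Because non_sensitive_rows has no column list, every column is sent,
# even though this row only matches the filter of sensitive_rows.
print(publish(sensitive_row, pubs))
```

In this model the column list is fixed before any row is examined, which is why the sensitive row goes out with all columns. The per-row behavior proposed earlier in the thread would instead need each publication's filter evaluated separately for every row.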
Now, as mentioned above combining row filters or column lists for all
publications appears to be consistent with what we already do and seems
correct behavior to me.

To me, it appears that the method used to decide whether a particular
table is published or not is also similar to what we do for row filters
or column lists. Even if there is one publication that publishes all
tables, we consider the current table to be published irrespective of
whether other publications have published that table or not.

> Or more
> precisely, we'd have to consider column lists too.
>
> In other words, we'd have to merge pgoutput_column_list_init into
> pgoutput_row_filter_init, and then modify pgoutput_row_filter to
> evaluate the row filters one by one, and build the column list.
>

Hmm, I think even if we want to do something here, we also need to
think about how to achieve similar behavior for initial tablesync which
will be more tricky.

> I can take a stab at it, but it seems strange to not apply the same
> logic to evaluation of publish_as_relid.
>

Yeah, the current behavior seems to be consistent with what we already do.

> I wonder what Amit thinks about
> this, as he wrote the row filter stuff.
>

I feel we can explain a bit more about this in docs. We already have
some explanation of how row filters are combined [1]. We can probably
add a few examples for column lists.

[1] - https://www.postgresql.org/docs/devel/logical-replication-row-filter.html#LOGICAL-REPLICATION-ROW-FILTER-COMBINING

--
With Regards,
Amit Kapila.