Thread: Skipping schema changes in publication
Hi, This feature adds an option to skip changes of all tables in specified schema while creating publication. This feature is helpful for use cases where the user wants to subscribe to all the changes except for the changes present in a few schemas. Ex: CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; OR ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; A new column pnskip is added to table "pg_publication_namespace", to maintain the schemas that the user wants to skip publishing through the publication. Modified the output plugin (pgoutput) to skip publishing the changes if the relation is part of skip schema publication. As a continuation to this, I will work on implementing skipping tables from all tables in schema and skipping tables from all tables publication. Attached patch has the implementation for this. This feature is for the pg16 version. Thoughts? Regards, Vignesh
Attachment
On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote: > > Hi, > > This feature adds an option to skip changes of all tables in specified > schema while creating publication. > This feature is helpful for use cases where the user wants to > subscribe to all the changes except for the changes present in a few > schemas. > Ex: > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; > OR > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; > > A new column pnskip is added to table "pg_publication_namespace", to > maintain the schemas that the user wants to skip publishing through > the publication. Modified the output plugin (pgoutput) to skip > publishing the changes if the relation is part of skip schema > publication. > As a continuation to this, I will work on implementing skipping tables > from all tables in schema and skipping tables from all tables > publication. > > Attached patch has the implementation for this. The patch was not applying on top of HEAD because of the recent commits, attached patch is rebased on top of HEAD. Regards, Vignesh
Attachment
On Sat, Mar 26, 2022 at 7:37 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Hi, > > > > This feature adds an option to skip changes of all tables in specified > > schema while creating publication. > > This feature is helpful for use cases where the user wants to > > subscribe to all the changes except for the changes present in a few > > schemas. > > Ex: > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; > > OR > > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; > > > > A new column pnskip is added to table "pg_publication_namespace", to > > maintain the schemas that the user wants to skip publishing through > > the publication. Modified the output plugin (pgoutput) to skip > > publishing the changes if the relation is part of skip schema > > publication. > > As a continuation to this, I will work on implementing skipping tables > > from all tables in schema and skipping tables from all tables > > publication. > > > > Attached patch has the implementation for this. > > The patch was not applying on top of HEAD because of the recent > commits, attached patch is rebased on top of HEAD. The patch does not apply on top of HEAD because of the recent commit, attached patch is rebased on top of HEAD. I have also included the implementation for skipping a few tables from all tables publication, the 0002 patch has the implementation for the same. This feature is helpful for use cases where the user wants to subscribe to all the changes except for the changes present in a few tables. Ex: CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2; OR ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; Regards, Vignesh
Attachment
On Tue, Apr 12, 2022 at 11:53 AM vignesh C <vignesh21@gmail.com> wrote: > > On Sat, Mar 26, 2022 at 7:37 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Hi, > > > > > > This feature adds an option to skip changes of all tables in specified > > > schema while creating publication. > > > This feature is helpful for use cases where the user wants to > > > subscribe to all the changes except for the changes present in a few > > > schemas. > > > Ex: > > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; > > > OR > > > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; > > > > > > A new column pnskip is added to table "pg_publication_namespace", to > > > maintain the schemas that the user wants to skip publishing through > > > the publication. Modified the output plugin (pgoutput) to skip > > > publishing the changes if the relation is part of skip schema > > > publication. > > > As a continuation to this, I will work on implementing skipping tables > > > from all tables in schema and skipping tables from all tables > > > publication. > > > > > > Attached patch has the implementation for this. > > > > The patch was not applying on top of HEAD because of the recent > > commits, attached patch is rebased on top of HEAD. > > The patch does not apply on top of HEAD because of the recent commit, > attached patch is rebased on top of HEAD. > > I have also included the implementation for skipping a few tables from > all tables publication, the 0002 patch has the implementation for the > same. > This feature is helpful for use cases where the user wants to > subscribe to all the changes except for the changes present in a few > tables. > Ex: > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2; > OR > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; > For the second syntax (Alter Publication ...), isn't it better to avoid using ADD? It looks odd to me because we are not adding anything in publication with this sytax. -- With Regards, Amit Kapila.
On Tue, Apr 12, 2022 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Apr 12, 2022 at 11:53 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Sat, Mar 26, 2022 at 7:37 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Tue, Mar 22, 2022 at 12:38 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Hi, > > > > > > > > This feature adds an option to skip changes of all tables in specified > > > > schema while creating publication. > > > > This feature is helpful for use cases where the user wants to > > > > subscribe to all the changes except for the changes present in a few > > > > schemas. > > > > Ex: > > > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; > > > > OR > > > > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; > > > > > > > > A new column pnskip is added to table "pg_publication_namespace", to > > > > maintain the schemas that the user wants to skip publishing through > > > > the publication. Modified the output plugin (pgoutput) to skip > > > > publishing the changes if the relation is part of skip schema > > > > publication. > > > > As a continuation to this, I will work on implementing skipping tables > > > > from all tables in schema and skipping tables from all tables > > > > publication. > > > > > > > > Attached patch has the implementation for this. > > > > > > The patch was not applying on top of HEAD because of the recent > > > commits, attached patch is rebased on top of HEAD. > > > > The patch does not apply on top of HEAD because of the recent commit, > > attached patch is rebased on top of HEAD. > > > > I have also included the implementation for skipping a few tables from > > all tables publication, the 0002 patch has the implementation for the > > same. > > This feature is helpful for use cases where the user wants to > > subscribe to all the changes except for the changes present in a few > > tables. > > Ex: > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2; > > OR > > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; > > > > For the second syntax (Alter Publication ...), isn't it better to > avoid using ADD? It looks odd to me because we are not adding anything > in publication with this sytax. I was thinking of the scenario where user initially creates the publication for all tables: CREATE PUBLICATION pub1 FOR ALL TABLES; After that user decides to skip few tables ex: t1, t2 ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; I thought of supporting this syntax if incase user decides to add the skipping of a few tables later. Thoughts? Regards, Vignesh
On Tue, Apr 12, 2022 at 4:17 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Apr 12, 2022 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > For the second syntax (Alter Publication ...), isn't it better to > > avoid using ADD? It looks odd to me because we are not adding anything > > in publication with this sytax. > > I was thinking of the scenario where user initially creates the > publication for all tables: > CREATE PUBLICATION pub1 FOR ALL TABLES; > > After that user decides to skip few tables ex: t1, t2 > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; > > I thought of supporting this syntax if incase user decides to add the > skipping of a few tables later. > I understand that part but what I pointed out was that it might be better to avoid using ADD keyword in this syntax like: ALTER PUBLICATION pub1 SKIP TABLE t1,t2; -- With Regards, Amit Kapila.
On Tue, Apr 12, 2022 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Apr 12, 2022 at 4:17 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, Apr 12, 2022 at 12:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > For the second syntax (Alter Publication ...), isn't it better to > > > avoid using ADD? It looks odd to me because we are not adding anything > > > in publication with this sytax. > > > > I was thinking of the scenario where user initially creates the > > publication for all tables: > > CREATE PUBLICATION pub1 FOR ALL TABLES; > > > > After that user decides to skip few tables ex: t1, t2 > > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; > > > > I thought of supporting this syntax if incase user decides to add the > > skipping of a few tables later. > > > > I understand that part but what I pointed out was that it might be > better to avoid using ADD keyword in this syntax like: ALTER > PUBLICATION pub1 SKIP TABLE t1,t2; Currently we are supporting Alter publication using the following syntax: ALTER PUBLICATION pub1 ADD TABLE t1; ALTER PUBLICATION pub1 SET TABLE t1; ALTER PUBLICATION pub1 DROP TABLE T1; ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1; ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1; ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1; I have extended the new syntax in similar lines: ALTER PUBLICATION pub1 ADD SKIP TABLE t1; ALTER PUBLICATION pub1 SET SKIP TABLE t1; ALTER PUBLICATION pub1 DROP SKIP TABLE T1; I did it like this to maintain consistency. But I'm fine doing it either way to keep it simple for the user. Regards, Vignesh
On Wed, Apr 13, 2022 at 8:45 AM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, Apr 12, 2022 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > I understand that part but what I pointed out was that it might be > > better to avoid using ADD keyword in this syntax like: ALTER > > PUBLICATION pub1 SKIP TABLE t1,t2; > > Currently we are supporting Alter publication using the following syntax: > ALTER PUBLICATION pub1 ADD TABLE t1; > ALTER PUBLICATION pub1 SET TABLE t1; > ALTER PUBLICATION pub1 DROP TABLE T1; > ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1; > ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1; > ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1; > > I have extended the new syntax in similar lines: > ALTER PUBLICATION pub1 ADD SKIP TABLE t1; > ALTER PUBLICATION pub1 SET SKIP TABLE t1; > ALTER PUBLICATION pub1 DROP SKIP TABLE T1; > > I did it like this to maintain consistency. > What is the difference between ADD and SET variants? I understand we need some way to remove the SKIP table setting but not sure if DROP is the best alternative. The other ideas could be: To set skip tables: ALTER PUBLICATION pub1 SKIP TABLE t1, t2...; To reset skip tables: ALTER PUBLICATION pub1 SKIP TABLE; /* basically an empty list*/ Yet another way to reset skip tables: ALTER PUBLICATION pub1 RESET SKIP TABLE; /* Here we need to introduce RESET. */ -- With Regards, Amit Kapila.
On Wed, Apr 13, 2022 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Apr 13, 2022 at 8:45 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, Apr 12, 2022 at 4:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > I understand that part but what I pointed out was that it might be > > > better to avoid using ADD keyword in this syntax like: ALTER > > > PUBLICATION pub1 SKIP TABLE t1,t2; > > > > Currently we are supporting Alter publication using the following syntax: > > ALTER PUBLICATION pub1 ADD TABLE t1; > > ALTER PUBLICATION pub1 SET TABLE t1; > > ALTER PUBLICATION pub1 DROP TABLE T1; > > ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1; > > ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1; > > ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1; > > > > I have extended the new syntax in similar lines: > > ALTER PUBLICATION pub1 ADD SKIP TABLE t1; > > ALTER PUBLICATION pub1 SET SKIP TABLE t1; > > ALTER PUBLICATION pub1 DROP SKIP TABLE T1; > > > > I did it like this to maintain consistency. > > > > What is the difference between ADD and SET variants? I understand we > need some way to remove the SKIP table setting but not sure if DROP is > the best alternative. > > The other ideas could be: > To set skip tables: ALTER PUBLICATION pub1 SKIP TABLE t1, t2...; > To reset skip tables: ALTER PUBLICATION pub1 SKIP TABLE; /* basically > an empty list*/ > Yet another way to reset skip tables: ALTER PUBLICATION pub1 RESET > SKIP TABLE; /* Here we need to introduce RESET. */ > When you were talking about SKIP TABLE then I liked the idea of: ALTER ... SET SKIP TABLE; /* empty list to reset the table skips */ ALTER ... SET SKIP TABLE t1,t2; /* non-empty list to replace the table skips */ But when you apply that rule to SKIP ALL TABLES IN SCHEMA, then the reset syntax looks too awkward. ALTER ... SET SKIP ALL TABLES IN SCHEMA; /* empty list to reset the schema skips */ ALTER ... SET SKIP ALL TABLES IN SCHEMA s1,s2; /* non-empty list to replace the schema skips */ ~~~ IMO it might be simpler to do it like: ALTER ... DROP SKIP; /* reset/remove the skip */ ALTER ... SET SKIP TABLE t1,t2; /* non-empty list to replace table skips */ ALTER ... SET SKIP ALL TABLES IS SCHEMA s1,s2; /* non-empty list to replace schema skips */ I don't really think that the ALTER ... SET SKIP empty list should be supported (because reason above) I don't really think that the ALTER ... ADD SKIP should be supported. === More questions - What happens if the skip table or skip schema no longer exists exist? Does that mean error? Maybe there is a dependency on it but OTOH it might be annoying - e.g. to disallow a DROP TABLE when the only dependency was that the user wanted to SKIP it... ------ Kind Regards, Peter Smith. Fujitsu Australia
On Tue, Apr 12, 2022 2:23 PM vignesh C <vignesh21@gmail.com> wrote: > > The patch does not apply on top of HEAD because of the recent commit, > attached patch is rebased on top of HEAD. > Thanks for your patch. Here are some comments for 0001 patch. 1. doc/src/sgml/catalogs.sgml @@ -6438,6 +6438,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l A null value indicates that all columns are published. </para></entry> </row> + + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>pnskip</structfield> <type>bool</type> + </para> + <para> + True if the schema is skip schema + </para></entry> + </row> </tbody> </tgroup> </table> This change is added to pg_publication_rel, I think it should be added to pg_publication_namespace, right? 2. postgres=# alter publication p1 add skip all tables in schema s1,s2; ERROR: schema "s1" is already member of publication "p1" This error message seems odd to me, can we improve it? Something like: schema "s1" is already skipped in publication "p1" 3. create table tbl (a int primary key); create schema s1; create schema s2; create table s1.tbl (a int); create publication p1 for all tables skip all tables in schema s1,s2; postgres=# \dRp+ Publication p1 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root ----------+------------+---------+---------+---------+-----------+---------- postgres | t | t | t | t | t | f Skip tables from schemas: "s1" "s2" postgres=# select * from pg_publication_tables; pubname | schemaname | tablename ---------+------------+----------- p1 | public | tbl p1 | s1 | tbl (2 rows) There shouldn't be a record of s1.tbl, since all tables in schema s1 are skipped. I found that it is caused by the following code: src/backend/catalog/pg_publication.c + foreach(cell, pubschemalist) + { + PublicationSchInfo *pubsch = (PublicationSchInfo *) lfirst(cell); + + skipschemaidlist = lappend_oid(result, pubsch->oid); + } The first argument to append_oid() seems wrong, should it be: skipschemaidlist = lappend_oid(skipschemaidlist, pubsch->oid); 4. src/backend/commands/publicationcmds.c /* * Convert the PublicationObjSpecType list into schema oid list and * PublicationTable list. */ static void ObjectsInPublicationToOids(List *pubobjspec_list, ParseState *pstate, List **rels, List **schemas) Should we modify the comment of ObjectsInPublicationToOids()? "schema oid list" should be "PublicationSchInfo list". Regards, Shi yu
On Tue, Apr 12, 2022 at 2:23 PM vignesh C <vignesh21@gmail.com> wrote: > The patch does not apply on top of HEAD because of the recent commit, > attached patch is rebased on top of HEAD. Thanks for your patches. Here are some comments for v1-0001: 1. I found the patch add the following two new functions in gram.y: preprocess_alltables_pubobj_list, check_skip_in_pubobj_list. These two functions look similar. So could we just add one new function? Besides, do we need the API `location` in new function preprocess_alltables_pubobj_list? It seems that "location" is not used in this new function. In addition, the location of error cursor in the messages seems has a little problem. For example: postgres=# create publication pub for all TABLES skip all tables in schema public, table test; ERROR: only SKIP ALL TABLES IN SCHEMA or SKIP TABLE can be specified with ALL TABLES option LINE 1: create publication pub for all TABLES skip all tables in sch... ^ (The location of error cursor is under 'create') 2. I think maybe there is a minor missing in function preprocess_alltables_pubobj_list and check_skip_in_pubobj_list: We seem to be missing the CURRENT_SCHEMA case. For example(In function preprocess_alltables_pubobj_list) : + /* Only SKIP ALL TABLES IN SCHEMA option supported with ALL TABLES */ + if (pubobj->pubobjtype != PUBLICATIONOBJ_TABLES_IN_SCHEMA || + !pubobj->skip) maybe need to be changed like this: + /* Only SKIP ALL TABLES IN SCHEMA option supported with ALL TABLES */ + if ((pubobj->pubobjtype != PUBLICATIONOBJ_TABLES_IN_SCHEMA && + pubobj->pubobjtype != PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA) && + pubobj->skip) 3. I think maybe there are some minor missing in create_publication.sgml. + [ FOR ALL TABLES [SKIP ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA}] maybe need to be changed to this: + [ FOR ALL TABLES [SKIP ALL TABLES IN SCHEMA { <replaceable class="parameter">schema_name</replaceable> | CURRENT_SCHEMA} [, ... ]] 4. The error message of function CreatePublication. Does the message below need to be modified like the comment? In addition, I think maybe "FOR/SKIP" is better. @@ -835,18 +843,21 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt) - /* FOR ALL TABLES IN SCHEMA requires superuser */ + /* FOR [SKIP] ALL TABLES IN SCHEMA requires superuser */ if (list_length(schemaidlist) > 0 && !superuser()) ereport(ERROR, errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); 5. I think there are some minor missing in tab-complete.c. + Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "SKIP", "ALL", "TABLES", "IN", "SCHEMA")) maybe need to be changed to this: + Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "SKIP", "ALL", "TABLES", "IN", "SCHEMA")) + Matches("CREATE", "PUBLICATION", MatchAny, "SKIP", "FOR", "ALL", "TABLES", "IN", "SCHEMA", MatchAny)) && maybe need to be changed to this: + Matches("CREATE", "PUBLICATION", MatchAny, "FOR", "ALL", "TABLES", "SKIP", "ALL", "TABLES", "IN", "SCHEMA",MatchAny)) && 6. In function get_rel_sync_entry, do we need `if (!publish)` in below code? I think `publish` is always false here, as we delete the check for "pub->alltables". ``` - /* - * If this is a FOR ALL TABLES publication, pick the partition root - * and set the ancestor level accordingly. - */ - if (pub->alltables) - { - ...... - } - if (!publish) ``` Regards, Wang wei
On 12.04.22 08:23, vignesh C wrote: > I have also included the implementation for skipping a few tables from > all tables publication, the 0002 patch has the implementation for the > same. > This feature is helpful for use cases where the user wants to > subscribe to all the changes except for the changes present in a few > tables. > Ex: > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2; > OR > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; We have already allocated the "skip" terminology for skipping transactions, which is a dynamic run-time action. We are also using the term "skip" elsewhere to skip locked rows, which is similarly a run-time action. I think it would be confusing to use the term SKIP for DDL construction. Let's find another term like "omit", "except", etc. I would also think about this in broader terms. For example, sometimes people want features like "all columns except these" in certain places. The syntax for those things should be similar. That said, I'm not sure this feature is worth the trouble. If this is useful, what about "whole database except these schemas"? What about "create this database from this template except these schemas". This could get out of hand. I think we should encourage users to group their object the way they want and not offer these complicated negative selection mechanisms.
On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote:
On 12.04.22 08:23, vignesh C wrote:> I have also included the implementation for skipping a few tables from> all tables publication, the 0002 patch has the implementation for the> same.> This feature is helpful for use cases where the user wants to> subscribe to all the changes except for the changes present in a few> tables.> Ex:> CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2;> OR> ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2;We have already allocated the "skip" terminology for skippingtransactions, which is a dynamic run-time action. We are also using theterm "skip" elsewhere to skip locked rows, which is similarly a run-timeaction. I think it would be confusing to use the term SKIP for DDLconstruction.
I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN
SCHEMA and if I were to suggest a keyword, it would be EXCEPT.
I would also think about this in broader terms. For example, sometimespeople want features like "all columns except these" in certain places.The syntax for those things should be similar.
The questions are:
What kind of issues does it solve?
Do we have a workaround for it?
That said, I'm not sure this feature is worth the trouble. If this isuseful, what about "whole database except these schemas"? What about"create this database from this template except these schemas". Thiscould get out of hand. I think we should encourage users to group theirobject the way they want and not offer these complicated negativeselection mechanisms.
I have the same impression too. We already provide a way to:
* include individual tables;
* include all tables;
* include all tables in a certain schema.
Doesn't it cover the majority of the use cases? We don't need to cover all
possible cases in one DDL command. IMO the current grammar for CREATE
PUBLICATION is already complicated after the ALL TABLES IN SCHEMA. You are
proposing to add "ALL TABLES SKIP ALL TABLES" that sounds repetitive but it is
not; doesn't seem well-thought-out. I'm also concerned about possible gotchas
for this proposal. The first command above suggests that it skips all tables in a
certain schema. What happen if I decide to include a particular table of the
skipped schema (second command)?
ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2;
ALTER PUBLICATION pub1 ADD TABLE s1.foo;
Having said that I'm not wedded to this proposal. Unless someone provides
compelling use cases for this additional syntax, I think we should leave the
publication syntax as is.
On Fri, Apr 15, 2022 at 1:26 AM Euler Taveira <euler@eulerto.com> wrote: > > On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote: > > On 12.04.22 08:23, vignesh C wrote: > > I have also included the implementation for skipping a few tables from > > all tables publication, the 0002 patch has the implementation for the > > same. > > This feature is helpful for use cases where the user wants to > > subscribe to all the changes except for the changes present in a few > > tables. > > Ex: > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2; > > OR > > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; > > We have already allocated the "skip" terminology for skipping > transactions, which is a dynamic run-time action. We are also using the > term "skip" elsewhere to skip locked rows, which is similarly a run-time > action. I think it would be confusing to use the term SKIP for DDL > construction. > > I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN > SCHEMA and if I were to suggest a keyword, it would be EXCEPT. > +1 for EXCEPT. > I would also think about this in broader terms. For example, sometimes > people want features like "all columns except these" in certain places. > The syntax for those things should be similar. > > The questions are: > What kind of issues does it solve? As far as I understand, it is for usability, otherwise, users need to list all required columns' names even if they don't want to hide most of the columns in the table. Consider user doesn't want to publish the 'salary' or other sensitive information of executives/employees but would like to publish all other columns. I feel in such cases it will be a lot of work for the user especially when the table has many columns. I see that Oracle has a similar feature [1]. I think without this it will be difficult for users to use this feature in some cases. > Do we have a workaround for it? > I can't think of any except the user needs to manually input all required columns. Can you think of any other workaround? > That said, I'm not sure this feature is worth the trouble. If this is > useful, what about "whole database except these schemas"? What about > "create this database from this template except these schemas". This > could get out of hand. I think we should encourage users to group their > object the way they want and not offer these complicated negative > selection mechanisms. > > I have the same impression too. We already provide a way to: > > * include individual tables; > * include all tables; > * include all tables in a certain schema. > > Doesn't it cover the majority of the use cases? > Similar to columns, the same applies to tables. Users need to manually add all tables for a database even when she wants to avoid only a handful of tables from the database say because they contain sensitive information or are not required. I think we don't need to cover all possible exceptions but a few where users can avoid some tables would be useful. If not, what kind of alternative do users have except for listing all columns or all tables that are required. [1] - https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20 -- With Regards, Amit Kapila.
On Thu, Apr 14, 2022 at 7:18 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 12.04.22 08:23, vignesh C wrote: > > I have also included the implementation for skipping a few tables from > > all tables publication, the 0002 patch has the implementation for the > > same. > > This feature is helpful for use cases where the user wants to > > subscribe to all the changes except for the changes present in a few > > tables. > > Ex: > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2; > > OR > > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; > > We have already allocated the "skip" terminology for skipping > transactions, which is a dynamic run-time action. We are also using the > term "skip" elsewhere to skip locked rows, which is similarly a run-time > action. I think it would be confusing to use the term SKIP for DDL > construction. > > Let's find another term like "omit", "except", etc. +1 for Except > I would also think about this in broader terms. For example, sometimes > people want features like "all columns except these" in certain places. > The syntax for those things should be similar. > > That said, I'm not sure this feature is worth the trouble. If this is > useful, what about "whole database except these schemas"? What about > "create this database from this template except these schemas". This > could get out of hand. I think we should encourage users to group their > object the way they want and not offer these complicated negative > selection mechanisms. I thought this feature would help when there are many many tables in the database and the user wants only certain confidential tables like credit card information. In this case instead of specifying the whole table list it will be better to specify "ALL TABLES EXCEPT cred_info_tbl". I had seen that mysql also has a similar option replicate-ignore-table to ignore the changes on specific tables as mentioned in [1]. Similar use case exists in pg_dump too. pg_dump has an option exclude-table that will be used for not dumping any tables that are matching the table specified as in [2]. [1] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html [2] - https://www.postgresql.org/docs/devel/app-pgdump.html Regards, Vignesh
On Mon, Apr 18, 2022 at 12:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 15, 2022 at 1:26 AM Euler Taveira <euler@eulerto.com> wrote: > > > > On Thu, Apr 14, 2022, at 10:47 AM, Peter Eisentraut wrote: > > > > On 12.04.22 08:23, vignesh C wrote: > > > I have also included the implementation for skipping a few tables from > > > all tables publication, the 0002 patch has the implementation for the > > > same. > > > This feature is helpful for use cases where the user wants to > > > subscribe to all the changes except for the changes present in a few > > > tables. > > > Ex: > > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP TABLE t1,t2; > > > OR > > > ALTER PUBLICATION pub1 ADD SKIP TABLE t1,t2; > > > > We have already allocated the "skip" terminology for skipping > > transactions, which is a dynamic run-time action. We are also using the > > term "skip" elsewhere to skip locked rows, which is similarly a run-time > > action. I think it would be confusing to use the term SKIP for DDL > > construction. > > > > I didn't like the SKIP choice too. We already have EXCEPT for IMPORT FOREIGN > > SCHEMA and if I were to suggest a keyword, it would be EXCEPT. > > > > +1 for EXCEPT. Updated patch by changing the syntax to use EXCEPT instead of SKIP. Regards, Vignesh
Attachment
On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote: > > Hi, > > This feature adds an option to skip changes of all tables in specified > schema while creating publication. > This feature is helpful for use cases where the user wants to > subscribe to all the changes except for the changes present in a few > schemas. > Ex: > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; > OR > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; > > A new column pnskip is added to table "pg_publication_namespace", to > maintain the schemas that the user wants to skip publishing through > the publication. Modified the output plugin (pgoutput) to skip > publishing the changes if the relation is part of skip schema > publication. > As a continuation to this, I will work on implementing skipping tables > from all tables in schema and skipping tables from all tables > publication. > > Attached patch has the implementation for this. > This feature is for the pg16 version. > Thoughts? The feature seems to be useful especially when there are lots of schemas in a database. However, I don't quite like the syntax. Do we have 'SKIP' identifier in any of the SQL statements in SQL standard? Can we think of adding skip_schema_list as an option, something like below? CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2'); ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset Regards, Bharath Rupireddy.
On Sat, Apr 23, 2022 at 2:09 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Hi, > > > > This feature adds an option to skip changes of all tables in specified > > schema while creating publication. > > This feature is helpful for use cases where the user wants to > > subscribe to all the changes except for the changes present in a few > > schemas. > > Ex: > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; > > OR > > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; > > > > A new column pnskip is added to table "pg_publication_namespace", to > > maintain the schemas that the user wants to skip publishing through > > the publication. Modified the output plugin (pgoutput) to skip > > publishing the changes if the relation is part of skip schema > > publication. > > As a continuation to this, I will work on implementing skipping tables > > from all tables in schema and skipping tables from all tables > > publication. > > > > Attached patch has the implementation for this. > > This feature is for the pg16 version. > > Thoughts? > > The feature seems to be useful especially when there are lots of > schemas in a database. However, I don't quite like the syntax. Do we > have 'SKIP' identifier in any of the SQL statements in SQL standard? > Can we think of adding skip_schema_list as an option, something like > below? > > CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2'); > ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set > ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset > I had been wondering for some time if there was any way to introduce a more flexible pattern matching into PUBLICATION but without bloating the syntax. Maybe your idea to use an option for the "skip" gives a way to do it... For example, if we could use regex (for <schemaname>.<tablename> patterns) for the option value then.... ~~ e.g.1. Exclude certain tables: // do NOT publish any tables of schemas s1,s2 CREATE PUBLICATION foo FOR ALL TABLES (exclude_match = '(s1\..*)|(s2\..*)'); // do NOT publish my secret tables (those called "mysecretXXX") CREATE PUBLICATION foo FOR ALL TABLES (exclude_match = '(.*\.mysecret.*)'); ~~ e.g.2. Only allow certain tables. // ONLY publish my tables (those called "mytableXXX") CREATE PUBLICATION foo FOR ALL TABLES (subset_match = '(.*\.mytable.*)'); // So following is equivalent to FOR ALL TABLES IN SCHEMA s1 CREATE PUBLICATION foo FOR ALL TABLES (subset_match = '(s1\..*)'); ------ Kind Regards, Peter Smith. Fujitsu Australia
On Thursday, April 21, 2022 12:15 PM vignesh C <vignesh21@gmail.com> wrote: > Updated patch by changing the syntax to use EXCEPT instead of SKIP. Hi This is my review comments on the v2 patch. (1) gram.y I think we can make a unified function that merges preprocess_alltables_pubobj_list with check_except_in_pubobj_list. With regard to preprocess_alltables_pubobj_list, we don't use the 2nd argument "location" in this function. (2) create_publication.sgml + <para> + Create a publication that publishes all changes in all the tables except for + the changes of <structname>users</structname> and + <structname>departments</structname> table; This sentence should end ":" not ";". (3) publication.out & publication.sql +-- fail - can't set except table to schema publication +ALTER PUBLICATION testpub_forschema SET EXCEPT TABLE testpub_tbl1; There is one unnecessary space in the comment. Kindly change from "schema publication" to "schema publication". (4) pg_dump.c & describe.c In your first email of this thread, you explained this feature is for PG16. Don't we need additional branch for PG16 ? @@ -6322,6 +6328,21 @@ describePublications(const char *pattern) } } + if (pset.sversion >= 150000) + { @@ -4162,7 +4164,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables) /* Collect all publication membership info. */ if (fout->remoteVersion >= 150000) appendPQExpBufferStr(query, - "SELECT tableoid, oid, prpubid, prrelid, " + "SELECT tableoid, oid, prpubid, prrelid, prexcept," (5) psql-ref.sgml + If <literal>+</literal> is appended to the command name, the tables, + except tables and schemas associated with each publication are shown as + well. I'm not sure if "except tables" is a good description. I suggest "excluded tables". This applies to the entire patch, in case if this is reasonable suggestion. Best Regards, Takamichi Osumi
On Tue, Apr 26, 2022 at 11:32 AM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > > On Thursday, April 21, 2022 12:15 PM vignesh C <vignesh21@gmail.com> wrote: > > Updated patch by changing the syntax to use EXCEPT instead of SKIP. > Hi > > > This is my review comments on the v2 patch. > > (1) gram.y > > I think we can make a unified function that merges > preprocess_alltables_pubobj_list with check_except_in_pubobj_list. > > With regard to preprocess_alltables_pubobj_list, > we don't use the 2nd argument "location" in this function. Removed location and made a unified function. > (2) create_publication.sgml > > + <para> > + Create a publication that publishes all changes in all the tables except for > + the changes of <structname>users</structname> and > + <structname>departments</structname> table; > > This sentence should end ":" not ";". Modified > (3) publication.out & publication.sql > > +-- fail - can't set except table to schema publication > +ALTER PUBLICATION testpub_forschema SET EXCEPT TABLE testpub_tbl1; > > There is one unnecessary space in the comment. > Kindly change from "schema publication" to "schema publication". Modified > (4) pg_dump.c & describe.c > > In your first email of this thread, you explained this feature > is for PG16. Don't we need additional branch for PG16 ? > > @@ -6322,6 +6328,21 @@ describePublications(const char *pattern) > } > } > > + if (pset.sversion >= 150000) > + { > > > @@ -4162,7 +4164,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables) > /* Collect all publication membership info. */ > if (fout->remoteVersion >= 150000) > appendPQExpBufferStr(query, > - "SELECT tableoid, oid, prpubid, prrelid, " > + "SELECT tableoid, oid, prpubid, prrelid, prexcept," > Modified by adding a comment saying "FIXME: 150000 should be changed to 160000 later for PG16." > (5) psql-ref.sgml > > + If <literal>+</literal> is appended to the command name, the tables, > + except tables and schemas associated with each publication are shown as > + well. > > I'm not sure if "except tables" is a good description. > I suggest "excluded tables". This applies to the entire patch, > in case if this is reasonable suggestion. Modified it in most of the places where it was applicable. I felt the usage was ok in a few places. Thanks for the comments, the attached v3 patch has the changes for the same. Regards. Vignesh
Attachment
On Wednesday, April 27, 2022 9:50 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for the comments, the attached v3 patch has the changes for the same. Hi Thank you for updating the patch. Several minor comments on v3. (1) commit message The new syntax allows specifying schemas. For example: CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; OR ALTER PUBLICATION pub1 ADD EXCEPT TABLE t1,t2; We have above sentence, but it looks better to make the description a bit more accurate. Kindly change From : "The new syntax allows specifying schemas" To : "The new syntax allows specifying excluded relations" Also, kindly change "OR" to "or", because this description is not syntax. (2) publication_add_relation @@ -396,6 +400,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri, ObjectIdGetDatum(pubid); values[Anum_pg_publication_rel_prrelid - 1] = ObjectIdGetDatum(relid); + values[Anum_pg_publication_rel_prexcept - 1] = + BoolGetDatum(pri->except); + /* Add qualifications, if available */ It would be better to remove the blank line, because with this change, we'll have two blank lines in a row. (3) pg_dump.h & pg_dump_sort.c @@ -80,6 +80,7 @@ typedef enum DO_REFRESH_MATVIEW, DO_POLICY, DO_PUBLICATION, + DO_PUBLICATION_EXCEPT_REL, DO_PUBLICATION_REL, DO_PUBLICATION_TABLE_IN_SCHEMA, DO_SUBSCRIPTION @@ -90,6 +90,7 @@ enum dbObjectTypePriorities PRIO_FK_CONSTRAINT, PRIO_POLICY, PRIO_PUBLICATION, + PRIO_PUBLICATION_EXCEPT_REL, PRIO_PUBLICATION_REL, PRIO_PUBLICATION_TABLE_IN_SCHEMA, PRIO_SUBSCRIPTION, @@ -144,6 +145,7 @@ static const int dbObjectTypePriority[] = PRIO_REFRESH_MATVIEW, /* DO_REFRESH_MATVIEW */ PRIO_POLICY, /* DO_POLICY */ PRIO_PUBLICATION, /* DO_PUBLICATION */ + PRIO_PUBLICATION_EXCEPT_REL, /* DO_PUBLICATION_EXCEPT_REL */ PRIO_PUBLICATION_REL, /* DO_PUBLICATION_REL */ PRIO_PUBLICATION_TABLE_IN_SCHEMA, /* DO_PUBLICATION_TABLE_IN_SCHEMA */ PRIO_SUBSCRIPTION /* DO_SUBSCRIPTION */ How about having similar order between pg_dump.h and pg_dump_sort.c, like we'll add DO_PUBLICATION_EXCEPT_REL after DO_PUBLICATION_REL in pg_dump.h ? (4) GetAllTablesPublicationRelations + /* + * pg_publication_rel and pg_publication_namespace will only have except + * tables in case of all tables publication, no need to pass except flag + * to get the relations. + */ + List *exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL); + There is one unnecessary space in a comment "...pg_publication_namespace will only have...". Kindly remove it. Then, how about diving the variable declaration and the insertion of the return value of GetPublicationRelations ? That might be aligned with other places in this file. (5) GetTopMostAncestorInPublication @@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level foreach(lc, ancestors) { Oid ancestor = lfirst_oid(lc); - List *apubids = GetRelationPublications(ancestor); + List *apubids = GetRelationPublications(ancestor, false); List *aschemaPubids = NIL; + List *aexceptpubids; level++; @@ -317,7 +319,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level else { aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); - if (list_member_oid(aschemaPubids, puboid)) + aexceptpubids = GetRelationPublications(ancestor, true); + if (list_member_oid(aschemaPubids, puboid) || + (puballtables && !list_member_oid(aexceptpubids, puboid))) { topmost_relid = ancestor; It seems we forgot to call list_free for "aexceptpubids". Best Regards, Takamichi Osumi
On Fri, Apr 22, 2022 at 9:39 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Tue, Mar 22, 2022 at 12:39 PM vignesh C <vignesh21@gmail.com> wrote: > > > > This feature adds an option to skip changes of all tables in specified > > schema while creating publication. > > This feature is helpful for use cases where the user wants to > > subscribe to all the changes except for the changes present in a few > > schemas. > > Ex: > > CREATE PUBLICATION pub1 FOR ALL TABLES SKIP ALL TABLES IN SCHEMA s1,s2; > > OR > > ALTER PUBLICATION pub1 ADD SKIP ALL TABLES IN SCHEMA s1,s2; > > > > The feature seems to be useful especially when there are lots of > schemas in a database. However, I don't quite like the syntax. Do we > have 'SKIP' identifier in any of the SQL statements in SQL standard? > After discussion, it seems EXCEPT is a preferred choice and the same is used in the other existing syntax as well. > Can we think of adding skip_schema_list as an option, something like > below? > > CREATE PUBLICATION foo FOR ALL TABLES (skip_schema_list = 's1, s2'); > ALTER PUBLICATION foo SET (skip_schema_list = 's1, s2'); - to set > ALTER PUBLICATION foo SET (skip_schema_list = ''); - to reset > Yeah, that is also an option but it seems it will be difficult to extend if want to support "all columns except (c1, ..)" for the column list feature. The other thing to decide is for which all objects we want to support EXCEPT clause as it may not be useful for everything as indicated by Peter E. and Euler. We have seen that Oracle supports "all columns except (c1, ..)" [1] and MySQL seems to support for tables [2]. I guess we should restrict ourselves to those two cases for now and then we can extend it later for schemas if required or people agree. Also, we should see the syntax we choose here should be extendable. Another idea that occurred to me today for tables this is as follows: 1. Allow to mention except during create publication ... For All Tables. CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; 2. Allow to Reset it. This new syntax will reset all objects in the publications. Alter Publication ... RESET; 3. Allow to add it to an existing publication Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2]; I think it can be extended in a similar way for schema syntax as well. [1] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html [2] - https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20 -- With Regards, Amit Kapila.
On Thu, Apr 28, 2022 at 4:50 PM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > > On Wednesday, April 27, 2022 9:50 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v3 patch has the changes for the same. > Hi > > Thank you for updating the patch. Several minor comments on v3. > > (1) commit message > > The new syntax allows specifying schemas. For example: > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > OR > ALTER PUBLICATION pub1 ADD EXCEPT TABLE t1,t2; > > We have above sentence, but it looks better > to make the description a bit more accurate. > > Kindly change > From : > "The new syntax allows specifying schemas" > To : > "The new syntax allows specifying excluded relations" > > Also, kindly change "OR" to "or", > because this description is not syntax. Slightly reworded and modified > (2) publication_add_relation > > @@ -396,6 +400,9 @@ publication_add_relation(Oid pubid, PublicationRelInfo *pri, > ObjectIdGetDatum(pubid); > values[Anum_pg_publication_rel_prrelid - 1] = > ObjectIdGetDatum(relid); > + values[Anum_pg_publication_rel_prexcept - 1] = > + BoolGetDatum(pri->except); > + > > /* Add qualifications, if available */ > > It would be better to remove the blank line, > because with this change, we'll have two blank > lines in a row. Modified > (3) pg_dump.h & pg_dump_sort.c > > @@ -80,6 +80,7 @@ typedef enum > DO_REFRESH_MATVIEW, > DO_POLICY, > DO_PUBLICATION, > + DO_PUBLICATION_EXCEPT_REL, > DO_PUBLICATION_REL, > DO_PUBLICATION_TABLE_IN_SCHEMA, > DO_SUBSCRIPTION > > @@ -90,6 +90,7 @@ enum dbObjectTypePriorities > PRIO_FK_CONSTRAINT, > PRIO_POLICY, > PRIO_PUBLICATION, > + PRIO_PUBLICATION_EXCEPT_REL, > PRIO_PUBLICATION_REL, > PRIO_PUBLICATION_TABLE_IN_SCHEMA, > PRIO_SUBSCRIPTION, > @@ -144,6 +145,7 @@ static const int dbObjectTypePriority[] = > PRIO_REFRESH_MATVIEW, /* DO_REFRESH_MATVIEW */ > PRIO_POLICY, /* DO_POLICY */ > PRIO_PUBLICATION, /* DO_PUBLICATION */ > + PRIO_PUBLICATION_EXCEPT_REL, /* DO_PUBLICATION_EXCEPT_REL */ > PRIO_PUBLICATION_REL, /* DO_PUBLICATION_REL */ > PRIO_PUBLICATION_TABLE_IN_SCHEMA, /* DO_PUBLICATION_TABLE_IN_SCHEMA */ > PRIO_SUBSCRIPTION /* DO_SUBSCRIPTION */ > > How about having similar order between > pg_dump.h and pg_dump_sort.c, like > we'll add DO_PUBLICATION_EXCEPT_REL > after DO_PUBLICATION_REL in pg_dump.h ? > Modified > (4) GetAllTablesPublicationRelations > > + /* > + * pg_publication_rel and pg_publication_namespace will only have except > + * tables in case of all tables publication, no need to pass except flag > + * to get the relations. > + */ > + List *exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL); > + > > There is one unnecessary space in a comment > "...pg_publication_namespace will only have...". Kindly remove it. > > Then, how about diving the variable declaration and > the insertion of the return value of GetPublicationRelations ? > That might be aligned with other places in this file. Modified > (5) GetTopMostAncestorInPublication > > > @@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level > foreach(lc, ancestors) > { > Oid ancestor = lfirst_oid(lc); > - List *apubids = GetRelationPublications(ancestor); > + List *apubids = GetRelationPublications(ancestor, false); > List *aschemaPubids = NIL; > + List *aexceptpubids; > > level++; > > @@ -317,7 +319,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level > else > { > aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); > - if (list_member_oid(aschemaPubids, puboid)) > + aexceptpubids = GetRelationPublications(ancestor, true); > + if (list_member_oid(aschemaPubids, puboid) || > + (puballtables && !list_member_oid(aexceptpubids, puboid))) > { > topmost_relid = ancestor; > > It seems we forgot to call list_free for "aexceptpubids". Modified The attached v4 patch has the changes for the same. Regards, Vignesh
Attachment
On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > ... > Another idea that occurred to me today for tables this is as follows: > 1. Allow to mention except during create publication ... For All Tables. > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > 2. Allow to Reset it. This new syntax will reset all objects in the > publications. > Alter Publication ... RESET; > 3. Allow to add it to an existing publication > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2]; > > I think it can be extended in a similar way for schema syntax as well. > Consider if the user does CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT t3,t4; What does it mean? e.g. Is there only one exception list that is modified? Or did the ADD ALL TABLES override all meaning of the original list? e.g. Are we now skipping t1,t2,t3,t4, or are we now only skipping t3,t4? ~~~ Here is a similar example, where the ADD TABLE seems confusing to me when it intersects with a prior EXCEPT e.g. CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT t1,t2; // ok ALTER PUBLICATION pub1 ADD TABLE t1; ??? What does it mean? e.g. Does the explicit ADD TABLE override the original exception list? e.g. Is t1 published now or should that ALTER have caused an error? ~~ It feels like there are too many tricky rules when using EXCEPT with ALTER PUBLICATION. I guess complexities can be described in the documentation but IMO it would be better if the ALTER syntax could be unambiguous in the first place. So perhaps the rules should be more restrictive (e.g. just disallow ALTER ... ADD any table that overlaps the existing EXCEPT list ??) ------ Kind Regards, Peter Smith. Fujitsu Australia.
On Tue, May 3, 2022 at 2:24 PM Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > ... > > Another idea that occurred to me today for tables this is as follows: > > 1. Allow to mention except during create publication ... For All Tables. > > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > > 2. Allow to Reset it. This new syntax will reset all objects in the > > publications. > > Alter Publication ... RESET; > > 3. Allow to add it to an existing publication > > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2]; > > > > I think it can be extended in a similar way for schema syntax as well. > > > > Consider if the user does > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT t3,t4; > > What does it mean? > e.g. Is there only one exception list that is modified? Or did the ADD > ALL TABLES override all meaning of the original list? > e.g. Are we now skipping t1,t2,t3,t4, or are we now only skipping t3,t4? > This won't be allowed. We won't allow changing ALL TABLES publication unless the user first performs RESET. This is the purpose of providing the RESET variant. > ~~~ > > Here is a similar example, where the ADD TABLE seems confusing to me > when it intersects with a prior EXCEPT > e.g. > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT t1,t2; // ok > ALTER PUBLICATION pub1 ADD TABLE t1; ??? > > What does it mean? > e.g. Does the explicit ADD TABLE override the original exception list? > e.g. Is t1 published now or should that ALTER have caused an error? > This won't be allowed either. We don't allow to Add/Drop from All Tables publication unless the user performs a RESET. This is true even today except that we don't have a RESET syntax. > ~~ > > It feels like there are too many tricky rules when using EXCEPT with > ALTER PUBLICATION. I guess complexities can be described in the > documentation but IMO it would be better if the ALTER syntax could be > unambiguous in the first place. > Agreed. > So perhaps the rules should be more > restrictive (e.g. just disallow ALTER ... ADD any table that overlaps > the existing EXCEPT list ??) > I think the current proposal seems to be restrictive enough to avoid any tricky issues. Do you see any other problem? -- With Regards, Amit Kapila.
On 14.04.22 15:47, Peter Eisentraut wrote: > That said, I'm not sure this feature is worth the trouble. If this is > useful, what about "whole database except these schemas"? What about > "create this database from this template except these schemas". This > could get out of hand. I think we should encourage users to group their > object the way they want and not offer these complicated negative > selection mechanisms. Another problem in general with this "all except these" way of specifying things is that you need to track negative dependencies. For example, assume you can't add a table to a publication unless it has a replica identity. Now, if you have a publication p1 that says includes "all tables except t1", you now have to check p1 whenever a new table is created, even though the new table has no direct dependency link with p1. So in more general cases, you would have to check all existing objects to see whether their specification is in conflict with the new object being created. Now publications don't actually work that way, so it's not a real problem right now, but similar things could work like that. So I think it's worth thinking this through a bit.
On Wed, May 4, 2022 at 7:05 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 14.04.22 15:47, Peter Eisentraut wrote: > > That said, I'm not sure this feature is worth the trouble. If this is > > useful, what about "whole database except these schemas"? What about > > "create this database from this template except these schemas". This > > could get out of hand. I think we should encourage users to group their > > object the way they want and not offer these complicated negative > > selection mechanisms. > > Another problem in general with this "all except these" way of > specifying things is that you need to track negative dependencies. > > For example, assume you can't add a table to a publication unless it has > a replica identity. Now, if you have a publication p1 that says > includes "all tables except t1", you now have to check p1 whenever a new > table is created, even though the new table has no direct dependency > link with p1. So in more general cases, you would have to check all > existing objects to see whether their specification is in conflict with > the new object being created. > Yes, I think we should avoid adding such negative dependencies. We have carefully avoided such dependencies during row filter, column list work where we don't try to perform DDL time verification. However, it is not clear to me how this proposal is related to this example or in general about tracking negative dependencies? AFAIR, we currently have such a check while changing persistence of logged table (logged to unlogged, see ATPrepChangePersistence) where we cannot allow changing persistence if that relation is part of some publication. But as per my understanding, this feature shouldn't add any such new dependencies. I agree that we have to ensure that existing checks shouldn't break due to this feature. > Now publications don't actually work that way, so it's not a real > problem right now, but similar things could work like that. So I think > it's worth thinking this through a bit. > This is a good point and I agree that we should be careful to not add some new negative dependencies unless it is really required but I can't see how this proposal will make it more prone to such checks. -- With Regards, Amit Kapila.
On Thu, May 5, 2022 at 9:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, May 4, 2022 at 7:05 PM Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: > > > > On 14.04.22 15:47, Peter Eisentraut wrote: > > > That said, I'm not sure this feature is worth the trouble. If this is > > > useful, what about "whole database except these schemas"? What about > > > "create this database from this template except these schemas". This > > > could get out of hand. I think we should encourage users to group their > > > object the way they want and not offer these complicated negative > > > selection mechanisms. > > > > Another problem in general with this "all except these" way of > > specifying things is that you need to track negative dependencies. > > > > For example, assume you can't add a table to a publication unless it has > > a replica identity. Now, if you have a publication p1 that says > > includes "all tables except t1", you now have to check p1 whenever a new > > table is created, even though the new table has no direct dependency > > link with p1. So in more general cases, you would have to check all > > existing objects to see whether their specification is in conflict with > > the new object being created. > > > > Yes, I think we should avoid adding such negative dependencies. We > have carefully avoided such dependencies during row filter, column > list work where we don't try to perform DDL time verification. > However, it is not clear to me how this proposal is related to this > example or in general about tracking negative dependencies? > I mean to say that even if we have such a restriction, it would apply to "for all tables" or other publications as well. In your example, consider one wants to Alter a table and remove its replica identity, we have to check whether the table is part of any publication similar to what we are doing for relation persistence in ATPrepChangePersistence. > AFAIR, we > currently have such a check while changing persistence of logged table > (logged to unlogged, see ATPrepChangePersistence) where we cannot > allow changing persistence if that relation is part of some > publication. But as per my understanding, this feature shouldn't add > any such new dependencies. I agree that we have to ensure that > existing checks shouldn't break due to this feature. > > > Now publications don't actually work that way, so it's not a real > > problem right now, but similar things could work like that. So I think > > it's worth thinking this through a bit. > > > > This is a good point and I agree that we should be careful to not add > some new negative dependencies unless it is really required but I > can't see how this proposal will make it more prone to such checks. > -- With Regards, Amit Kapila.
On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > ... > > Another idea that occurred to me today for tables this is as follows: > 1. Allow to mention except during create publication ... For All Tables. > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > 2. Allow to Reset it. This new syntax will reset all objects in the > publications. > Alter Publication ... RESET; > 3. Allow to add it to an existing publication > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2]; > > I think it can be extended in a similar way for schema syntax as well. > If the proposed syntax ALTER PUBLICATION ... RESET will reset all the objects in the publication then there still seems simple way to remove only the EXCEPT list but leave everything else intact. IIUC to clear just the EXCEPT list would require a 2 step process - 1) ALTER ... RESET then 2) ALTER ... ADD ALL TABLES again. I was wondering if it might be useful to have a variation that *only* resets the EXCEPT list, but still leaves everything else as-is? So, instead of: ALTER PUBLICATION pubname RESET use a syntax something like: ALTER PUBLICATION pubname RESET {ALL | EXCEPT} or ALTER PUBLICATION pubname RESET [EXCEPT] ------ Kind Regards, Peter Smith. Fujitsu Australia
On Fri, May 6, 2022 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > ... > > > > Another idea that occurred to me today for tables this is as follows: > > 1. Allow to mention except during create publication ... For All Tables. > > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > > 2. Allow to Reset it. This new syntax will reset all objects in the > > publications. > > Alter Publication ... RESET; > > 3. Allow to add it to an existing publication > > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2]; > > > > I think it can be extended in a similar way for schema syntax as well. > > > > If the proposed syntax ALTER PUBLICATION ... RESET will reset all the > objects in the publication then there still seems simple way to remove > only the EXCEPT list but leave everything else intact. IIUC to clear > just the EXCEPT list would require a 2 step process - 1) ALTER ... > RESET then 2) ALTER ... ADD ALL TABLES again. > > I was wondering if it might be useful to have a variation that *only* > resets the EXCEPT list, but still leaves everything else as-is? > > So, instead of: > ALTER PUBLICATION pubname RESET +1 for this syntax as this syntax can be extendable to include options like (except/all/etc) later. Currently we can support this syntax and can be extended later based on the requirements. The new feature will handle the various use cases based on the behavior given below: -- CREATE Publication with EXCEPT TABLE syntax CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; -- ok Alter Publication pub1 RESET; -- All Tables and options are reset similar to creating publication without any publication object and publication option (create publication pub1) \dRp+ pub1 Publication pub2 Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root ---------+------------+---------+---------+---------+-----------+---------- vignesh | f | t | t | t | t | f (1 row) -- Can add except table after reset of publication ALTER PUBLICATION pub1 Add ALL TABLES EXCEPT TABLE t1,t2; -- ok -- Cannot add except table without reset of publication ALTER PUBLICATION pub1 Add EXCEPT TABLE t3,t4; -- not ok, need to be reset Alter Publication pub1 RESET; -- Cannot add table to ALL TABLES Publication ALTER PUBLICATION pub1 Add ALL TABLES EXCEPT TABLE t1,t2, t3, t4, TABLE t5; -- not ok, ALL TABLES Publications does not support including of TABLES Alter Publication pub1 RESET; -- Cannot add table to ALL TABLES Publication ALTER PUBLICATION pub1 Add ALL TABLES TABLE t1,t2; -- not ok, ALL TABLES Publications does not support including of TABLES -- Cannot add ALL TABLES IN SCHEMA to ALL TABLES Publication ALTER PUBLICATION pub1 Add ALL TABLES ALL TABLES IN SCHEMA sch1, sch2; -- not ok, ALL TABLES Publications does not support including of ALL TABLES IN SCHEMA -- Existing syntax should work as it is CREATE PUBLICATION pub1 FOR TABLE t1; ALTER PUBLICATION pub1 ADD TABLE t1; -- ok, existing ALTER should work as it is (ok without reset) ALTER PUBLICATION pub1 ADD ALL TABLES IN SCHEMA sch1; -- ok, existing ALTER should work as it is (ok without reset) ALTER PUBLICATION pub1 DROP TABLE t1; -- ok, existing ALTER should work as it is (ok without reset) ALTER PUBLICATION pub1 DROP ALL TABLES IN SCHEMA sch1; -- ok, existing ALTER should work as it is (ok without reset) ALTER PUBLICATION pub1 SET TABLE t1; -- ok, existing ALTER should work as it is (ok without reset) ALTER PUBLICATION pub1 SET ALL TABLES IN SCHEMA sch1; -- ok, existing ALTER should work as it is (ok without reset) I will modify the patch to handle this. Regards, Vignesh
On Tue, May 10, 2022 at 9:08 AM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, May 6, 2022 at 8:05 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Thu, Apr 28, 2022 at 9:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > ... > > > > > > Another idea that occurred to me today for tables this is as follows: > > > 1. Allow to mention except during create publication ... For All Tables. > > > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > > > 2. Allow to Reset it. This new syntax will reset all objects in the > > > publications. > > > Alter Publication ... RESET; > > > 3. Allow to add it to an existing publication > > > Alter Publication ... Add ALL TABLES [EXCEPT TABLE t1,t2]; > > > > > > I think it can be extended in a similar way for schema syntax as well. > > > > > > > If the proposed syntax ALTER PUBLICATION ... RESET will reset all the > > objects in the publication then there still seems simple way to remove > > only the EXCEPT list but leave everything else intact. IIUC to clear > > just the EXCEPT list would require a 2 step process - 1) ALTER ... > > RESET then 2) ALTER ... ADD ALL TABLES again. > > > > I was wondering if it might be useful to have a variation that *only* > > resets the EXCEPT list, but still leaves everything else as-is? > > > > So, instead of: > > ALTER PUBLICATION pubname RESET > > +1 for this syntax as this syntax can be extendable to include options > like (except/all/etc) later. > Currently we can support this syntax and can be extended later based > on the requirements. The attached patch has the implementation for "ALTER PUBLICATION pubname RESET". This command will reset the publication to default state which includes resetting the publication options, setting ALL TABLES option to false and dropping the relations and schemas that are associated with the publication. Regards, Vignesh
Attachment
On Thu, May 12, 2022 at 2:24 PM vignesh C <vignesh21@gmail.com> wrote: > ... > The attached patch has the implementation for "ALTER PUBLICATION > pubname RESET". This command will reset the publication to default > state which includes resetting the publication options, setting ALL > TABLES option to false and dropping the relations and schemas that are > associated with the publication. > Please see below my review comments for the v1-0001 (RESET) patch ====== 1. Commit message This patch adds a new RESET option to ALTER PUBLICATION which Wording: "RESET option" -> "RESET clause" ~~~ 2. doc/src/sgml/ref/alter_publication.sgml + <para> + The <literal>RESET</literal> clause will reset the publication to default + state which includes resetting the publication options, setting + <literal>ALL TABLES</literal> option to <literal>false</literal> and drop the + relations and schemas that are associated with the publication. </para> 2a. Wording: "to default state" -> "to the default state" 2b. Wording: "and drop the relations..." -> "and dropping all relations..." ~~~ 3. doc/src/sgml/ref/alter_publication.sgml + invoking user to be a superuser. <literal>RESET</literal> of publication + requires invoking user to be a superuser. To alter the owner, you must also Wording: "requires invoking user" -> "requires the invoking user" ~~~ 4. doc/src/sgml/ref/alter_publication.sgml - Example @@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing, sales; <structname>production_publication</structname>: <programlisting> ALTER PUBLICATION production_publication ADD TABLE users, departments, ALL TABLES IN SCHEMA production; +</programlisting></para> + + <para> + Resetting the publication <structname>production_publication</structname>: +<programlisting> +ALTER PUBLICATION production_publication RESET; Wording: "Resetting the publication" -> "Reset the publication" ~~~ 5. src/backend/commands/publicationcmds.c + /* Check and reset the options */ IMO the code can just reset all these options unconditionally. I did not see the point to check for existing option values first. I feel the simpler code outweighs any negligible performance difference in this case. ~~~ 6. src/backend/commands/publicationcmds.c + /* Check and reset the options */ Somehow it seemed a pity having to hardcode all these default values true/false in multiple places; e.g. the same is already hardcoded in the parse_publication_options function. To avoid multiple hard coded bools you could just call the parse_publication_options with an empty options list. That would set the defaults which you can then use: values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert); Alternatively, maybe there should be #defines to use instead of having the scattered hardcoded bool defaults: #define PUBACTION_DEFAULT_INSERT true #define PUBACTION_DEFAULT_UPDATE true etc ~~~ 7. src/include/nodes/parsenodes.h @@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction { AP_AddObjects, /* add objects to publication */ AP_DropObjects, /* remove objects from publication */ - AP_SetObjects /* set list of objects */ + AP_SetObjects, /* set list of objects */ + AP_ReSetPublication /* reset the publication */ } AlterPublicationAction; Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication" ~~~ 8. src/test/regress/sql/publication.sql 8a. +-- Test for RESET PUBLICATION SUGGESTED +-- Tests for ALTER PUBLICATION ... RESET 8b. +-- Verify that 'ALL TABLES' option is reset SUGGESTED: +-- Verify that 'ALL TABLES' flag is reset 8c. +-- Verify that publish option and publish via root option is reset SUGGESTED: +-- Verify that publish options and publish_via_partition_root option are reset 8d. +-- Verify that only superuser can execute RESET publication SUGGESTED +-- Verify that only superuser can reset a publication ------ Kind Regards, Peter Smith. Fujitsu Australia
On Fri, May 13, 2022 at 9:37 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, May 12, 2022 at 2:24 PM vignesh C <vignesh21@gmail.com> wrote: > > > ... > > The attached patch has the implementation for "ALTER PUBLICATION > > pubname RESET". This command will reset the publication to default > > state which includes resetting the publication options, setting ALL > > TABLES option to false and dropping the relations and schemas that are > > associated with the publication. > > > > Please see below my review comments for the v1-0001 (RESET) patch > > ====== > > 1. Commit message > > This patch adds a new RESET option to ALTER PUBLICATION which > > Wording: "RESET option" -> "RESET clause" Modified > ~~~ > > 2. doc/src/sgml/ref/alter_publication.sgml > > + <para> > + The <literal>RESET</literal> clause will reset the publication to default > + state which includes resetting the publication options, setting > + <literal>ALL TABLES</literal> option to <literal>false</literal> > and drop the > + relations and schemas that are associated with the publication. > </para> > > 2a. Wording: "to default state" -> "to the default state" Modified > 2b. Wording: "and drop the relations..." -> "and dropping all relations..." Modified > ~~~ > > 3. doc/src/sgml/ref/alter_publication.sgml > > + invoking user to be a superuser. <literal>RESET</literal> of publication > + requires invoking user to be a superuser. To alter the owner, you must also > > Wording: "requires invoking user" -> "requires the invoking user" Modified > ~~~ > > 4. doc/src/sgml/ref/alter_publication.sgml - Example > > @@ -207,6 +220,12 @@ ALTER PUBLICATION sales_publication ADD ALL > TABLES IN SCHEMA marketing, sales; > <structname>production_publication</structname>: > <programlisting> > ALTER PUBLICATION production_publication ADD TABLE users, > departments, ALL TABLES IN SCHEMA production; > +</programlisting></para> > + > + <para> > + Resetting the publication <structname>production_publication</structname>: > +<programlisting> > +ALTER PUBLICATION production_publication RESET; > > Wording: "Resetting the publication" -> "Reset the publication" Modified > ~~~ > > 5. src/backend/commands/publicationcmds.c > > + /* Check and reset the options */ > > IMO the code can just reset all these options unconditionally. I did > not see the point to check for existing option values first. I feel > the simpler code outweighs any negligible performance difference in > this case. Modified > ~~~ > > 6. src/backend/commands/publicationcmds.c > > + /* Check and reset the options */ > > Somehow it seemed a pity having to hardcode all these default values > true/false in multiple places; e.g. the same is already hardcoded in > the parse_publication_options function. > > To avoid multiple hard coded bools you could just call the > parse_publication_options with an empty options list. That would set > the defaults which you can then use: > values[Anum_pg_publication_pubinsert - 1] = BoolGetDatum(pubactiondefs->insert); > > Alternatively, maybe there should be #defines to use instead of having > the scattered hardcoded bool defaults: > #define PUBACTION_DEFAULT_INSERT true > #define PUBACTION_DEFAULT_UPDATE true > etc I have used #define for default value and used it in both the functions. > ~~~ > > 7. src/include/nodes/parsenodes.h > > @@ -4033,7 +4033,8 @@ typedef enum AlterPublicationAction > { > AP_AddObjects, /* add objects to publication */ > AP_DropObjects, /* remove objects from publication */ > - AP_SetObjects /* set list of objects */ > + AP_SetObjects, /* set list of objects */ > + AP_ReSetPublication /* reset the publication */ > } AlterPublicationAction; > > Unusual case: "AP_ReSetPublication" -> "AP_ResetPublication" Modified > ~~~ > > 8. src/test/regress/sql/publication.sql > > 8a. > +-- Test for RESET PUBLICATION > SUGGESTED > +-- Tests for ALTER PUBLICATION ... RESET Modified > 8b. > +-- Verify that 'ALL TABLES' option is reset > SUGGESTED: > +-- Verify that 'ALL TABLES' flag is reset Modified > 8c. > +-- Verify that publish option and publish via root option is reset > SUGGESTED: > +-- Verify that publish options and publish_via_partition_root option are reset Modified > 8d. > +-- Verify that only superuser can execute RESET publication > SUGGESTED > +-- Verify that only superuser can reset a publication Modified Thanks for the comments, the attached v5 patch has the changes for the same. Also I have made the changes for SKIP Table based on the new syntax, the changes for the same are available in v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch. Regards, Vignesh
Attachment
On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for the comments, the attached v5 patch has the changes for the same. > Also I have made the changes for SKIP Table based on the new syntax, the > changes for the same are available in > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch. Hi, Thank you for updating the patch. I'll share few minor review comments on v5-0001. (1) doc/src/sgml/ref/alter_publication.sgml @@ -73,12 +85,13 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r Adding a table to a publication additionally requires owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal> and <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the - invoking user to be a superuser. To alter the owner, you must also be a - direct or indirect member of the new owning role. The new owner must have - <literal>CREATE</literal> privilege on the database. Also, the new owner - of a <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN - SCHEMA</literal> publication must be a superuser. However, a superuser can - change the ownership of a publication regardless of these restrictions. + invoking user to be a superuser. <literal>RESET</literal> of publication + requires the invoking user to be a superuser. To alter the owner, you must ... I suggest to combine the first part of your change with one existing sentence before your change, to make our description concise. FROM: "The <literal>ADD ALL TABLES IN SCHEMA</literal> and <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the invoking user to be a superuser. <literal>RESET</literal> of publication requires the invoking user to be a superuser." TO: "The <literal>ADD ALL TABLES IN SCHEMA</literal>, <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and <literal>RESET</literal> of publication requires the invoking user to be a superuser." (2) typo +++ b/src/backend/commands/publicationcmds.c @@ -53,6 +53,13 @@ #include "utils/syscache.h" #include "utils/varlena.h" +#define PUB_ATION_INSERT_DEFAULT true +#define PUB_ACTION_UPDATE_DEFAULT true Kindly change FROM: "PUB_ATION_INSERT_DEFAULT" TO: "PUB_ACTION_INSERT_DEFAULT" (3) src/test/regress/expected/publication.out +-- Verify that only superuser can reset a publication +ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2; +SET ROLE regress_publication_user2; +ALTER PUBLICATION testpub_reset RESET; -- fail We have "-- fail" for one case in this patch. On the other hand, isn't better to add "-- ok" (or "-- success") for other successful statements, when we consider the entire tests description consistency ? Best Regards, Takamichi Osumi
On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote: > Thanks for the comments, the attached v5 patch has the changes for the same. > Also I have made the changes for SKIP Table based on the new syntax, the > changes for the same are available in > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch. Hi, Several comments on v5-0002. (1) One unnecessary space before "except_pub_obj_list" syntax definition + except_pub_obj_list: ExceptPublicationObjSpec + { $$ = list_make1($1); } + | except_pub_obj_list ',' ExceptPublicationObjSpec + { $$ = lappend($1, $3); } + | /*EMPTY*/ { $$ = NULL; } + ; + From above part, kindly change FROM: " except_pub_obj_list: ExceptPublicationObjSpec" TO: "except_pub_obj_list: ExceptPublicationObjSpec" (2) doc/src/sgml/ref/create_publication.sgml (2-1) @@ -22,7 +22,7 @@ PostgreSQL documentation <refsynopsisdiv> <synopsis> CREATE PUBLICATION <replaceable class="parameter">name</replaceable> - [ FOR ALL TABLES + [ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]] | FOR <replaceable class="parameter">publication_object</replaceable> [, ... ] ] [ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>][, ... ] ) ] Here I think we need to add two more whitespaces around square brackets. Please change FROM: "[ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]]" TO: "[ FOR ALL TABLES [ EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ] ]" When I check other documentations, I see whitespaces before/after square brackets. (2-2) This whitespace alignment applies to alter_publication.sgml as well. (3) @@ -156,6 +156,24 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable> </listitem> </varlistentry> + + <varlistentry> + <term><literal>EXCEPT TABLE</literal></term> + <listitem> + <para> + Marks the publication as one that excludes replicating changes for the + specified tables. + </para> + + <para> + <literal>EXCEPT TABLE</literal> can be specified only for + <literal>FOR ALL TABLES</literal> publication. It is not supported for + <literal>FOR ALL TABLES IN SCHEMA </literal> publication and + <literal>FOR TABLE</literal> publication. + </para> + </listitem> + </varlistentry> + This EXCEPT TABLE clause is only for FOR ALL TABLES. So, how about extracting the main message from above part and moving it to an exising paragraph below, instead of having one independent paragraph ? <varlistentry> <term><literal>FOR ALL TABLES</literal></term> <listitem> <para> Marks the publication as one that replicates changes for all tables in the database, including tables created in the future. </para> </listitem> </varlistentry> Something like "Marks the publication as one that replicates changes for all tables in the database, including tables created in the future. EXCEPT TABLE indicates excluded tables for the defined publication. " (4) One minor confirmation about the syntax Currently, we allow one way of writing to indicate excluded tables like below. (example) CREATE PUBLICATION mypub FOR ALL TABLES EXCEPT TABLE tab3, tab4, EXCEPT TABLE tab5; This is because we define ExceptPublicationObjSpec with EXCEPT TABLE. Is it OK to have a room to write duplicate "EXCEPT TABLE" clauses ? I think there is no harm in having this, but I'd like to confirm whether this syntax might be better to be adjusted or not. (5) CheckAlterPublication + + if (excepttable && !stmt->for_all_tables) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("publication \"%s\" is not defined as FOR ALL TABLES", + NameStr(pubform->pubname)), + errdetail("except table cannot be added to, dropped from, or set on NON ALL TABLES publications."))); Could you please add a test for this ? Best Regards, Takamichi Osumi
Below are my review comments for v5-0001. There is some overlap with comments recently posted by Osumi-san [1]. (I also have review comments for v5-0002; will post them tomorrow) ====== 1. Commit message This patch adds a new RESET clause to ALTER PUBLICATION which will reset the publication to default state which includes resetting the publication options, setting ALL TABLES option to false and dropping the relations and schemas that are associated with the publication. SUGGEST "to default state" -> "to the default state" "ALL TABLES option" -> "ALL TABLES flag" ~~~ 2. doc/src/sgml/ref/alter_publication.sgml + <para> + The <literal>RESET</literal> clause will reset the publication to the + default state which includes resetting the publication options, setting + <literal>ALL TABLES</literal> option to <literal>false</literal> and + dropping all relations and schemas that are associated with the publication. </para> "ALL TABLES option" -> "ALL TABLES flag" ~~~ 3. doc/src/sgml/ref/alter_publication.sgml + invoking user to be a superuser. <literal>RESET</literal> of publication + requires the invoking user to be a superuser. To alter the owner, you must SUGGESTION To <literal>RESET</literal> a publication requires the invoking user to be a superuser. ~~~ 4. src/backend/commands/publicationcmds.c @@ -53,6 +53,13 @@ #include "utils/syscache.h" #include "utils/varlena.h" +#define PUB_ATION_INSERT_DEFAULT true +#define PUB_ACTION_UPDATE_DEFAULT true +#define PUB_ACTION_DELETE_DEFAULT true +#define PUB_ACTION_TRUNCATE_DEFAULT true +#define PUB_VIA_ROOT_DEFAULT false +#define PUB_ALL_TABLES_DEFAULT false 4a. Typo: "ATION" -> "ACTION" 4b. I think these #defines deserve a 1 line comment. e.g. /* CREATE PUBLICATION default values for flags and options */ 4c. Since the "_DEFAULT" is a common part of all the names, maybe it is tidier if it comes first. e.g. #define PUB_DEFAULT_ACTION_INSERT true #define PUB_DEFAULT_ACTION_UPDATE true #define PUB_DEFAULT_ACTION_DELETE true #define PUB_DEFAULT_ACTION_TRUNCATE true #define PUB_DEFAULT_VIA_ROOT false #define PUB_DEFAULT_ALL_TABLES false ------ [1] https://www.postgresql.org/message-id/TYCPR01MB8373C3120C2B3112001ED6F1EDCF9%40TYCPR01MB8373.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
Below are my review comments for v5-0002. There may be an overlap with comments recently posted by Osumi-san [1]. (I also have review comments for v5-0002; will post them tomorrow) ====== 1. General Is it really necessary to have to say "EXCEPT TABLE" instead of just "EXCEPT". It seems unnecessarily verbose and redundant when you write "FOR ALL TABLES EXCEPT TABLE...". If you want to keep this TABLE keyword (maybe you have plans for other kinds of except?) then IMO perhaps at least it can be the optional default except type. e.g. EXCEPT [TABLE]. ~~~ 2. General (I was unsure whether to even mention this one). I understand the "EXCEPT" is chosen as the user-facing syntax, but it still seems strange when reading the patch to see attribute members and column names called 'except'. I think the problem is that "except" is not a verb, so saying except=t/f just does not make much sense. Sometimes I feel that for all the internal usage (code/comments/catalog) using "skip" and "skip-list" etc would be a much better choice of names. OTOH I can see that having consistency with the outside syntax might also be good. Anyway, please consider - maybe other people feel the same? ~~~ 3. General The ONLY keyword seems supported by the syntax for tables of the except-list (more on this in later comments) but: a) I am not sure if the patch code is accounting for that, and b) There are no test cases using ONLY. ~~~ 4. Commit message A new option "EXCEPT TABLE" in Create/Alter Publication allows one or more tables to be excluded, publisher will exclude sending the data of the excluded tables to the subscriber. SUGGESTION A new "EXCEPT TABLE" clause for CREATE/ALTER PUBLICATION allows one or more tables to be excluded. The publisher will not send the data of excluded tables to the subscriber. ~~ 5. Commit message The new syntax allows specifying exclude relations while creating a publication or exclude relations in alter publication. For example: SUGGESTION The new syntax allows specifying excluded relations when creating or altering a publication. For example: ~~~ 6. Commit message A new column prexcept is added to table "pg_publication_rel", to maintain the relations that the user wants to exclude publishing through the publication. SUGGESTION A new column "prexcept" is added to table "pg_publication_rel", to maintain the relations that the user wants to exclude from the publications. ~~~ 7. Commit message Modified the output plugin (pgoutput) to exclude publishing the changes of the excluded tables. I did not feel it was necessary to say this. It is already said above that the data is not sent, so that seems enough. ~~~ 8. Commit message Updates pg_dump to identify and dump the excluded tables of the publications. Updates the \d family of commands to display excluded tables of the publications and \dRp+ variant will now display associated except tables if any. SUGGESTION pg_dump is updated to identify and dump the excluded tables of the publications. The psql \d family of commands to display excluded tables. e.g. psql \dRp+ variant will now display associated "except tables" if any. ~~~ 9. doc/src/sgml/catalogs.sgml @@ -6426,6 +6426,15 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l if there is no publication qualifying condition.</para></entry> </row> + <row> + <entry role="catalog_table_entry"><para role="column_definition"> + <structfield>prexcept</structfield> <type>bool</type> + </para> + <para> + True if the table must be excluded + </para></entry> + </row> Other descriptions on this page refer to "relation" instead of "table". Probably this should do the same to be consistent. ~~~ 10. doc/src/sgml/logical-replication.sgml @@ -1167,8 +1167,9 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER <para> To add tables to a publication, the user must have ownership rights on the table. To add all tables in schema to a publication, the user must be a - superuser. To create a publication that publishes all tables or all tables in - schema automatically, the user must be a superuser. + superuser. To add all tables to a publication, the user must be a superuser. + To create a publication that publishes all tables or all tables in schema + automatically, the user must be a superuser. </para> It seems like a valid change but how is this related to this EXCEPT patch. Maybe this fix should be patched separately? ~~~ 11. doc/src/sgml/ref/alter_publication.sgml @@ -22,6 +22,7 @@ PostgreSQL documentation <refsynopsisdiv> <synopsis> ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD <replaceable class="parameter">publication_object</replaceable> [, ...] +ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]] The [ONLY] looks misplaced when the syntax is described like this. For example, in practice it is possible to write "EXCEPT TABLE ONLY t1, ONLY t2, t3, ONLY t4" but it doesn't seem that way by looking at these PG DOCS. IMO would be better described like this: [ FOR ALL TABLES [ EXCEPT TABLE exception_object [,...] ]] where exception_object is: [ ONLY ] table_name [ * ] ~~~ 12. doc/src/sgml/ref/alter_publication.sgml @@ -82,8 +83,8 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RESET <para> You must own the publication to use <command>ALTER PUBLICATION</command>. - Adding a table to a publication additionally requires owning that table. - The <literal>ADD ALL TABLES IN SCHEMA</literal> and + Adding a table or excluding a table to a publication additionally requires + owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal> and SUGGESTION Adding a table to or excluding a table from a publication additionally requires owning that table. ~~~ 13. doc/src/sgml/ref/alter_publication.sgml @@ -213,6 +214,14 @@ ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing, sales; </programlisting> </para> + <para> + Alter publication <structname>production_publication</structname> that + publishes all tables except <structname>users</structname> and + <structname>departments</structname> tables: +<programlisting> "that publishes" -> "to publish" ~~~ 14. doc/src/sgml/ref/create_publication.sgml (Same comment about the ONLY syntax as #11) ~~~ 15. doc/src/sgml/ref/create_publication.sgml + <varlistentry> + <term><literal>EXCEPT TABLE</literal></term> + <listitem> + <para> + Marks the publication as one that excludes replicating changes for the + specified tables. + </para> + + <para> + <literal>EXCEPT TABLE</literal> can be specified only for + <literal>FOR ALL TABLES</literal> publication. It is not supported for + <literal>FOR ALL TABLES IN SCHEMA </literal> publication and + <literal>FOR TABLE</literal> publication. + </para> + </listitem> + </varlistentry> IMO you can remove all that "It is not supported for..." sentence. You don't need to spell that out again when it is already clear from the syntax. ~~~ 16. doc/src/sgml/ref/psql-ref.sgml @@ -1868,8 +1868,9 @@ testdb=> If <replaceable class="parameter">pattern</replaceable> is specified, only those publications whose names match the pattern are listed. - If <literal>+</literal> is appended to the command name, the tables and - schemas associated with each publication are shown as well. + If <literal>+</literal> is appended to the command name, the tables, + excluded tables and schemas associated with each publication are shown as + well. </para> Perhaps this is OK just as-is, but OTOH I felt that the change was almost unnecessary because saying it displays "the tables" kind of implies it would also have to account for the "excluded tables" too. ~~~ 17. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication @@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level foreach(lc, ancestors) { Oid ancestor = lfirst_oid(lc); - List *apubids = GetRelationPublications(ancestor); + List *apubids = GetRelationPublications(ancestor, false); List *aschemaPubids = NIL; + List *aexceptpubids = NIL; 17a. I think the var "aschemaPubids" and "aexceptpubids" are only used in the 'else' block so it seems better they can be declared and freed in that block too instead of always. 17b. Also, the camel-case of those variables is inconsistent so may fix that at the same time. ~~~ 18. src/backend/catalog/pg_publication.c - GetRelationPublications @@ -666,7 +673,7 @@ publication_add_schema(Oid pubid, Oid schemaid, bool if_not_exists) /* Gets list of publication oids for a relation */ List * -GetRelationPublications(Oid relid) +GetRelationPublications(Oid relid, bool bexcept) 18a. I felt that "except_flag" is a better name than "bexcept" for this param. 18b. The function comment should be updated to say only relations matching this except_flag are returned in the list. ~~~ 19. src/backend/catalog/pg_publication.c - GetAllTablesPublicationRelations @@ -787,6 +795,15 @@ GetAllTablesPublicationRelations(bool pubviaroot) HeapTuple tuple; List *result = NIL; + /* + * pg_publication_rel and pg_publication_namespace will only have excluded + * tables in case of all tables publication, no need to pass except flag + * to get the relations. + */ + List *exceptpubtablelist; + + exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL); + 19a. I wasn't very sure of the meaning/intent of the comment, but IIUC it seems to be explaining why it is not necessary to use an "except_flag" parameter in this code. Is it necessary/helpful to explain parameters that do NOT exist? 19b. The var name "exceptpubtablelist" seems a bit overkill. (e.g. "excepttablelist" or "exceptlist" etc... are shorter but seem equally informative). ~~~ 20. src/backend/commands/publicationcmds.c - CreatePublication @@ -843,54 +849,52 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt) /* Make the changes visible. */ CommandCounterIncrement(); - /* Associate objects with the publication. */ - if (stmt->for_all_tables) - { - /* Invalidate relcache so that publication info is rebuilt. */ - CacheInvalidateRelcacheAll(); - } - else - { - ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations, - &schemaidlist); + ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations, + &schemaidlist); - /* FOR ALL TABLES IN SCHEMA requires superuser */ - if (list_length(schemaidlist) > 0 && !superuser()) - ereport(ERROR, - errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), - errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); + /* FOR ALL TABLES IN SCHEMA requires superuser */ + if (list_length(schemaidlist) > 0 && !superuser()) + ereport(ERROR, + errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); - if (list_length(relations) > 0) - { - List *rels; + if (list_length(relations) > 0) + { + List *rels; - rels = OpenTableList(relations); - CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist, - PUBLICATIONOBJ_TABLE); + rels = OpenTableList(relations); + CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist, + PUBLICATIONOBJ_TABLE); - TransformPubWhereClauses(rels, pstate->p_sourcetext, - publish_via_partition_root); + TransformPubWhereClauses(rels, pstate->p_sourcetext, + publish_via_partition_root); - CheckPubRelationColumnList(rels, pstate->p_sourcetext, - publish_via_partition_root); + CheckPubRelationColumnList(rels, pstate->p_sourcetext, + publish_via_partition_root); - PublicationAddTables(puboid, rels, true, NULL); - CloseTableList(rels); - } + PublicationAddTables(puboid, rels, true, NULL); + CloseTableList(rels); + } - if (list_length(schemaidlist) > 0) - { - /* - * Schema lock is held until the publication is created to prevent - * concurrent schema deletion. - */ - LockSchemaList(schemaidlist); - PublicationAddSchemas(puboid, schemaidlist, true, NULL); - } + if (list_length(schemaidlist) > 0) + { + /* + * Schema lock is held until the publication is created to prevent + * concurrent schema deletion. + */ + LockSchemaList(schemaidlist); + PublicationAddSchemas(puboid, schemaidlist, true, NULL); } table_close(rel, RowExclusiveLock); + /* Associate objects with the publication. */ + if (stmt->for_all_tables) + { + /* Invalidate relcache so that publication info is rebuilt. */ + CacheInvalidateRelcacheAll(); + } + This function is refactored a lot to not use "if/else" as it did before. But AFAIK (maybe I misunderstood) this refactor doesn't seem to actually have anything to do with the EXCEPT patch. If it really is unrelated maybe it should not be part of this patch. ~~~ 21. src/backend/commands/publicationcmds.c - CheckPublicationDefValues + if (pubform->puballtables) + return false; + + if (!pubform->pubinsert || !pubform->pubupdate || !pubform->pubdelete || + !pubform->pubtruncate || pubform->pubviaroot) + return false; Now you have all the #define for the PUB_DEFAULT_XXX values, perhaps this function should be using them instead of the hardcoded assumptions what the default values are. e.g. if (pubform->puballtables != PUB_DEFAULT_ALL_TABLES) return false; if (pubform->pubinsert != PUB_DEFAULT_ACTION_INSERT) return false; ... etc. ~~~ 22. src/backend/commands/publicationcmds.c - CheckAlterPublication @@ -1442,6 +1516,19 @@ CheckAlterPublication(AlterPublicationStmt *stmt, HeapTuple tup, List *tables, List *schemaidlist) { Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup); + ListCell *lc; + bool nonexcepttable = false; + bool excepttable = false; + + foreach(lc, tables) + { + PublicationTable *pub_table = lfirst_node(PublicationTable, lc); + + if (!pub_table->except) + nonexcepttable = true; + else + excepttable = true; + } 22a. The names are very confusing. e.g. "nonexcepttable" is like a double-negative. SUGGEST: bool has_tables = false; bool has_except_tables = false; 22b. Reverse the "if" condition to be positive instead of negative (remove !) e.g. if (pub_table->except) has_except_table = true; else has_table = true; ~~~ 23. src/backend/commands/publicationcmds.c - CheckAlterPublication @@ -1461,12 +1548,19 @@ CheckAlterPublication(AlterPublicationStmt *stmt, HeapTuple tup, errdetail("Tables from schema cannot be added to, dropped from, or set on FOR ALL TABLES publications."))); /* Check that user is allowed to manipulate the publication tables. */ - if (tables && pubform->puballtables) + if (nonexcepttable && tables && pubform->puballtables) ereport(ERROR, Seems no reason for "tables" to be in the condition since "nonexcepttable" can't be true if "tables" is NIL. ~~~ 24. src/backend/commands/publicationcmds.c - CheckAlterPublication + + if (excepttable && !stmt->for_all_tables) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("publication \"%s\" is not defined as FOR ALL TABLES", + NameStr(pubform->pubname)), + errdetail("except table cannot be added to, dropped from, or set on NON ALL TABLES publications."))); The errdetail message seems over-complex. SUGGESTION "EXCEPT TABLE clause is only allowed for FOR ALL TABLES publications." ~~~ 25. src/backend/commands/publicationcmds.c - AlterPublication @@ -1500,6 +1594,20 @@ AlterPublication(ParseState *pstate, AlterPublicationStmt *stmt) aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, stmt->pubname); + if (stmt->for_all_tables) + { + bool isdefault = CheckPublicationDefValues(tup); + + if (!isdefault) + ereport(ERROR, + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("Setting ALL TABLES requires publication \"%s\" to have default values", + stmt->pubname), + errhint("Either the publication has tables/schemas associated or does not have default publication options or ALL TABLES option is set.")); The errhint message seems over-complex. SUGGESTION "Use ALTER PUBLICATION ... RESET" ~~~ 26. src/bin/pg_dump/pg_dump.c - dumpPublication @@ -3980,8 +3982,34 @@ dumpPublication(Archive *fout, const PublicationInfo *pubinfo) qpubname); if (pubinfo->puballtables) + { + SimplePtrListCell *cell; + bool first = true; appendPQExpBufferStr(query, " FOR ALL TABLES"); + /* Include exception tables if the publication has except tables */ + for (cell = exceptinfo.head; cell; cell = cell->next) + { + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr; + PublicationInfo *relpubinfo = pubrinfo->publication; + TableInfo *tbinfo; + + if (pubinfo == relpubinfo) + { + tbinfo = pubrinfo->pubtable; + + if (first) + { + appendPQExpBufferStr(query, " EXCEPT TABLE ONLY"); + first = false; + } + else + appendPQExpBufferStr(query, ", "); + appendPQExpBuffer(query, " %s", fmtQualifiedDumpable(tbinfo)); + } + } + } + IIUC this usage of ONLY looks incorrect. 26a. Firstly, if you want to hardwire ONLY then shouldn't it apply to every of the except-list table, not just the first one? e.g. "EXCEPT TABLE ONLY t1, ONLY t2, ONLY t3..." 26b. Secondly, is it even correct to unconditionally hardwire the ONLY? How do you know that is how the user wanted it? ~~~ 27. src/bin/pg_dump/pg_dump.c @@ -127,6 +127,8 @@ static SimpleOidList foreign_servers_include_oids = {NULL, NULL}; static SimpleStringList extension_include_patterns = {NULL, NULL}; static SimpleOidList extension_include_oids = {NULL, NULL}; +static SimplePtrList exceptinfo = {NULL, NULL}; + Probably I just did not understand how this logic works, but how does this static work properly if there are multiple publications and 2 different EXCEPT lists? E.g. where is it clearing the "exceptinfo" so that multiple EXCEPT TABLE lists don't become muddled? ~~~ 28. src/bin/pg_dump/pg_dump.c - dumpPublicationTable @@ -4330,8 +4378,11 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo) query = createPQExpBuffer(); - appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY", + appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD ", fmtId(pubinfo->dobj.name)); + + appendPQExpBufferStr(query, "TABLE ONLY"); + That code refactor does not seem necessary for this patch. ~~~ 29. src/bin/pg_dump/pg_dump_sort.c @@ -90,6 +90,7 @@ enum dbObjectTypePriorities PRIO_FK_CONSTRAINT, PRIO_POLICY, PRIO_PUBLICATION, + PRIO_PUBLICATION_EXCEPT_REL, PRIO_PUBLICATION_REL, PRIO_PUBLICATION_TABLE_IN_SCHEMA, PRIO_SUBSCRIPTION, I'm not sure how this enum is used (so perhaps this makes no difference) but judging by the enum comment why did you put the sort priority order PRIO_PUBLICATION_EXCEPT_REL before PRIO_PUBLICATION_REL. Wouldn’t it make more sense the other way around? ~~~ 30. src/bin/psql/describe.c @@ -2950,17 +2950,34 @@ describeOneTableDetails(const char *schemaname, " WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n" " ELSE NULL END) " "FROM pg_catalog.pg_publication p\n" - " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n" - " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n" - "WHERE pr.prrelid = '%s'\n" - "UNION\n" + " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n" + " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n" + "WHERE pr.prrelid = '%s'", + oid, oid, oid); I feel that trailing "\n" ("WHERE pr.prrelid = '%s'\n") should not have been removed. ~~~ 31. src/bin/psql/describe.c + /* FIXME: 150000 should be changed to 160000 later for PG16. */ + if (pset.sversion >= 150000) + appendPQExpBufferStr(&buf, " AND pr.prexcept = 'f'\n"); + + appendPQExpBuffer(&buf, "UNION\n" The "UNION\n" param might be better wrapped onto the next line like it used to be. ~~~ 32. src/bin/psql/describe.c + /* FIXME: 150000 should be changed to 160000 later for PG16. */ + if (pset.sversion >= 150000) + appendPQExpBuffer(&buf, + " AND NOT EXISTS (SELECT 1\n" + " FROM pg_catalog.pg_publication_rel pr\n" + " JOIN pg_catalog.pg_class pc\n" + " ON pr.prrelid = pc.oid\n" + " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n", + oid); The whitespace indents in the SQL seem excessive here. ~~~ 33. src/bin/psql/describe.c - describePublications @@ -6322,6 +6344,22 @@ describePublications(const char *pattern) } } + /* FIXME: 150000 should be changed to 160000 later for PG16. */ + if (pset.sversion >= 150000) + { + /* Get the excluded tables for the specified publication */ + printfPQExpBuffer(&buf, + "SELECT concat(c.relnamespace::regnamespace, '.', c.relname)\n" + "FROM pg_catalog.pg_class c\n" + " JOIN pg_catalog.pg_publication_rel pr ON c.oid = pr.prrelid\n" + "WHERE pr.prpubid = '%s'\n" + " AND pr.prexcept = 't'\n" + "ORDER BY 1", pubid); + if (!addFooterToPublicationDesc(&buf, "Except tables:", + true, &cont)) + goto error_return; + } + I think this code is misplaced. Shouldn't it be if/else and be above the other 150000 check, otherwise when you change this to PG16 it may not work as expected? ~~~ 34. src/bin/psql/describe.c - describePublications + if (!addFooterToPublicationDesc(&buf, "Except tables:", + true, &cont)) + goto error_return; + } Should this be using the _T() macros same as the other prompts for translation? ~~~ 35. src/include/catalog/pg_publication.h I thought the param "bexpect" should be "except_flag". (same comment as #18a) ~~~ 36. src/include/catalog/pg_publication_rel.h @@ -31,6 +31,7 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId) Oid oid; /* oid */ Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */ Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */ + bool prexcept BKI_DEFAULT(f); /* except the relation */ SUGGEST (comment) /* skip the relation */ ~~~ 37. src/include/commands/publicationcmds.h @@ -32,8 +32,8 @@ extern ObjectAddress AlterPublicationOwner(const char *name, Oid newOwnerId); extern void AlterPublicationOwner_oid(Oid pubid, Oid newOwnerId); extern void InvalidatePublicationRels(List *relids); extern bool pub_rf_contains_invalid_column(Oid pubid, Relation relation, - List *ancestors, bool pubviaroot); + List *ancestors, bool pubviaroot, bool alltables); extern bool pub_collist_contains_invalid_column(Oid pubid, Relation relation, - List *ancestors, bool pubviaroot); + List *ancestors, bool pubviaroot, bool alltables); Elsewhere in this patch, a similarly added param is called "puballtables" (not "alltables"). Please check all places and use a consistent param name for all of them. ~~~ 38. src/test/regress/sql/publication.sql There don't seem to be any tests for more than one EXCEPT TABLE (e.g. no list tests?) ~~~ 38. src/test/regress/sql/publication.sql Maybe adjust all the below comments (a-d) to say "EXCEPT TABLES" intead of "except tables" 38a. +-- can't add except table to 'FOR ALL TABLES' publication 38b. +-- can't add except table to 'FOR TABLE' publication 38c. +-- can't add except table to 'FOR ALL TABLES IN SCHEMA' publication 38d. +-- can't add except table when publish_via_partition_root option does not +-- have default value 38e. +-- can't add except table when the publication options does not have default +-- values SUGGESTION can't add EXCEPT TABLE when the publication options are not the default values ~~~ 39. .../t/032_rep_changes_except_table.pl 39a. +# Check the table data does not sync for excluded table +my $result = $node_subscriber->safe_psql('postgres', + "SELECT count(*), min(a), max(a) FROM sch1.tab1"); +is($result, qq(0||), 'check tablesync is excluded for excluded tables'); Maybe the "is" message should say "check there is no initial data copied for the excluded table" ~~~ 40 .../t/032_rep_changes_except_table.pl +# Insert some data into few tables and verify that inserted data is not +# replicated +$node_publisher->safe_psql('postgres', + "INSERT INTO sch1.tab1 VALUES(generate_series(11,20))"); The comment is not quite correct. You are inserting into only one table here - not "few tables". ~~~ 41. .../t/032_rep_changes_except_table.pl +# Alter publication to exclude data changes in public.tab1 and verify that +# subscriber does not get the new table data. "new table data" -> "changed data for this table" ------ [1] https://www.postgresql.org/message-id/TYCPR01MB83737C28187A6E0BADAE98F0EDCF9%40TYCPR01MB8373.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
On Tue, May 17, 2022 at 7:35 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Below are my review comments for v5-0002. > > There may be an overlap with comments recently posted by Osumi-san [1]. > > (I also have review comments for v5-0002; will post them tomorrow) > > ====== > > 1. General > > Is it really necessary to have to say "EXCEPT TABLE" instead of just > "EXCEPT". It seems unnecessarily verbose and redundant when you write > "FOR ALL TABLES EXCEPT TABLE...". > > If you want to keep this TABLE keyword (maybe you have plans for other > kinds of except?) > I don't think there is an immediate plan but one can imagine using EXCEPT SCHEMA. Then for column lists, one may want to use the syntax Create Publication pub1 For Table t1 Except Cols (c1, ..); > then IMO perhaps at least it can be the optional > default except type. e.g. EXCEPT [TABLE]. > Yeah, that might be okay, so, even if we plan to extend this in the future, by default we will consider the list of tables after EXCEPT but if the user mentions EXCEPT SCHEMA or something else then we can use a different object. Is that sound okay? > > 3. General > > The ONLY keyword seems supported by the syntax for tables of the > except-list (more on this in later comments) but: > a) I am not sure if the patch code is accounting for that, and > b) There are no test cases using ONLY. > > ~~~ > Isn't it better to map ONLY with the way it can already be specified in CREATE PUBLICATION? I am not sure what exactly is proposed and what is your suggestion? Can you please explain if it is different from the way we use it for CREATE PUBLICATION? -- With Regards, Amit Kapila.
On Tue, May 17, 2022 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, May 17, 2022 at 7:35 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Below are my review comments for v5-0002. > > > > There may be an overlap with comments recently posted by Osumi-san [1]. > > > > (I also have review comments for v5-0002; will post them tomorrow) > > > > ====== > > > > 1. General > > > > Is it really necessary to have to say "EXCEPT TABLE" instead of just > > "EXCEPT". It seems unnecessarily verbose and redundant when you write > > "FOR ALL TABLES EXCEPT TABLE...". > > > > If you want to keep this TABLE keyword (maybe you have plans for other > > kinds of except?) > > > > I don't think there is an immediate plan but one can imagine using > EXCEPT SCHEMA. Then for column lists, one may want to use the syntax > Create Publication pub1 For Table t1 Except Cols (c1, ..); > > > then IMO perhaps at least it can be the optional > > default except type. e.g. EXCEPT [TABLE]. > > > > Yeah, that might be okay, so, even if we plan to extend this in the > future, by default we will consider the list of tables after EXCEPT > but if the user mentions EXCEPT SCHEMA or something else then we can > use a different object. Is that sound okay? Yes. That is what I meant. > > > > > 3. General > > > > The ONLY keyword seems supported by the syntax for tables of the > > except-list (more on this in later comments) but: > > a) I am not sure if the patch code is accounting for that, and > > b) There are no test cases using ONLY. > > > > ~~~ > > > > Isn't it better to map ONLY with the way it can already be specified > in CREATE PUBLICATION? I am not sure what exactly is proposed and what > is your suggestion? Can you please explain if it is different from the > way we use it for CREATE PUBLICATION? > Yes, I am not proposing anything different to how ONLY already works for published tables. I was only questioning whether the patch behaves correctly when ONLY is specified for the tables of the EXCEPT list. I had some doubt about it because there are a few other review comments I wrote (e.g. in pg_dump.c), and also I did not find any ONLY tests, ------ Kind Regards, Peter Smith. Fujitsu Australia
On Sat, May 14, 2022 9:33 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v5 patch has the changes for the > same. Also I have made the changes for SKIP Table based on the new > syntax, the changes for the same are available in > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch. > Thanks for your patch. Here are some comments on v5-0001 patch. + Oid relid = lfirst_oid(lc); + + prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid, + ObjectIdGetDatum(relid), + ObjectIdGetDatum(pubid)); + if (!OidIsValid(prid)) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("relation \"%s\" is not part of the publication", + RelationGetRelationName(rel)))); I think the relation in the error message should be the one whose oid is "relid", instead of relation "rel". Besides, I think it might be better not to report an error in this case. If "prid" is invalid, just ignore this relation. Because in RESET cases, we want to drop all tables in the publications, and there is no specific table. (If you agree with that, similarly missing_ok should be set to true when calling PublicationDropSchemas().) Regards, Shi yu
On Mon, May 16, 2022 at 8:32 AM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > > On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v5 patch has the changes for the same. > > Also I have made the changes for SKIP Table based on the new syntax, the > > changes for the same are available in > > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch. > Hi, > > > Thank you for updating the patch. > I'll share few minor review comments on v5-0001. > > > (1) doc/src/sgml/ref/alter_publication.sgml > > @@ -73,12 +85,13 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RENAME TO <r > Adding a table to a publication additionally requires owning that table. > The <literal>ADD ALL TABLES IN SCHEMA</literal> and > <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the > - invoking user to be a superuser. To alter the owner, you must also be a > - direct or indirect member of the new owning role. The new owner must have > - <literal>CREATE</literal> privilege on the database. Also, the new owner > - of a <literal>FOR ALL TABLES</literal> or <literal>FOR ALL TABLES IN > - SCHEMA</literal> publication must be a superuser. However, a superuser can > - change the ownership of a publication regardless of these restrictions. > + invoking user to be a superuser. <literal>RESET</literal> of publication > + requires the invoking user to be a superuser. To alter the owner, you must > ... > > > I suggest to combine the first part of your change with one existing sentence > before your change, to make our description concise. > > FROM: > "The <literal>ADD ALL TABLES IN SCHEMA</literal> and > <literal>SET ALL TABLES IN SCHEMA</literal> to a publication requires the > invoking user to be a superuser. <literal>RESET</literal> of publication > requires the invoking user to be a superuser." > > TO: > "The <literal>ADD ALL TABLES IN SCHEMA</literal>, > <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and > <literal>RESET</literal> of publication requires the invoking user to be a superuser." Modified > > (2) typo > > +++ b/src/backend/commands/publicationcmds.c > @@ -53,6 +53,13 @@ > #include "utils/syscache.h" > #include "utils/varlena.h" > > +#define PUB_ATION_INSERT_DEFAULT true > +#define PUB_ACTION_UPDATE_DEFAULT true > > > Kindly change > FROM: > "PUB_ATION_INSERT_DEFAULT" > TO: > "PUB_ACTION_INSERT_DEFAULT" Modified > > (3) src/test/regress/expected/publication.out > > +-- Verify that only superuser can reset a publication > +ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2; > +SET ROLE regress_publication_user2; > +ALTER PUBLICATION testpub_reset RESET; -- fail > > > We have "-- fail" for one case in this patch. > On the other hand, isn't better to add "-- ok" (or "-- success") for > other successful statements, > when we consider the entire tests description consistency ? We generally do not mention success comments for all the success cases as that might be an overkill. I felt it is better to keep it as it is. Thoughts? The attached v6 patch has the changes for the same. Regards, Vignesh
Attachment
On Mon, May 16, 2022 at 2:53 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Below are my review comments for v5-0001. > > There is some overlap with comments recently posted by Osumi-san [1]. > > (I also have review comments for v5-0002; will post them tomorrow) > > ====== > > 1. Commit message > > This patch adds a new RESET clause to ALTER PUBLICATION which will reset > the publication to default state which includes resetting the publication > options, setting ALL TABLES option to false and dropping the relations and > schemas that are associated with the publication. > > SUGGEST > "to default state" -> "to the default state" > "ALL TABLES option" -> "ALL TABLES flag" Modified > ~~~ > > 2. doc/src/sgml/ref/alter_publication.sgml > > + <para> > + The <literal>RESET</literal> clause will reset the publication to the > + default state which includes resetting the publication options, setting > + <literal>ALL TABLES</literal> option to <literal>false</literal> and > + dropping all relations and schemas that are associated with the publication. > </para> > > "ALL TABLES option" -> "ALL TABLES flag" Modified > ~~~ > > 3. doc/src/sgml/ref/alter_publication.sgml > > + invoking user to be a superuser. <literal>RESET</literal> of publication > + requires the invoking user to be a superuser. To alter the owner, you must > > SUGGESTION > To <literal>RESET</literal> a publication requires the invoking user > to be a superuser. I have combined it with the earlier sentence. > ~~~ > > 4. src/backend/commands/publicationcmds.c > > @@ -53,6 +53,13 @@ > #include "utils/syscache.h" > #include "utils/varlena.h" > > +#define PUB_ATION_INSERT_DEFAULT true > +#define PUB_ACTION_UPDATE_DEFAULT true > +#define PUB_ACTION_DELETE_DEFAULT true > +#define PUB_ACTION_TRUNCATE_DEFAULT true > +#define PUB_VIA_ROOT_DEFAULT false > +#define PUB_ALL_TABLES_DEFAULT false > > 4a. > Typo: "ATION" -> "ACTION" Modified > 4b. > I think these #defines deserve a 1 line comment. > e.g. > /* CREATE PUBLICATION default values for flags and options */ Added comment > 4c. > Since the "_DEFAULT" is a common part of all the names, maybe it is > tidier if it comes first. > e.g. > #define PUB_DEFAULT_ACTION_INSERT true > #define PUB_DEFAULT_ACTION_UPDATE true > #define PUB_DEFAULT_ACTION_DELETE true > #define PUB_DEFAULT_ACTION_TRUNCATE true > #define PUB_DEFAULT_VIA_ROOT false > #define PUB_DEFAULT_ALL_TABLES false Modified The v6 patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com Regards, Vignesh
On Wed, May 18, 2022 at 8:30 AM shiy.fnst@fujitsu.com <shiy.fnst@fujitsu.com> wrote: > > On Sat, May 14, 2022 9:33 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the comments, the attached v5 patch has the changes for the > > same. Also I have made the changes for SKIP Table based on the new > > syntax, the changes for the same are available in > > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch. > > > > Thanks for your patch. Here are some comments on v5-0001 patch. > > + Oid relid = lfirst_oid(lc); > + > + prid = GetSysCacheOid2(PUBLICATIONRELMAP, Anum_pg_publication_rel_oid, > + ObjectIdGetDatum(relid), > + ObjectIdGetDatum(pubid)); > + if (!OidIsValid(prid)) > + ereport(ERROR, > + (errcode(ERRCODE_UNDEFINED_OBJECT), > + errmsg("relation \"%s\" is not part of the publication", > + RelationGetRelationName(rel)))); > > I think the relation in the error message should be the one whose oid is > "relid", instead of relation "rel". Modified it > Besides, I think it might be better not to report an error in this case. If > "prid" is invalid, just ignore this relation. Because in RESET cases, we want to > drop all tables in the publications, and there is no specific table. > (If you agree with that, similarly missing_ok should be set to true when calling > PublicationDropSchemas().) Ideally this scenario should not happen, but if it happens I felt we should throw an error in this case. The v6 patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com Regards, Vignesh
On Mon, May 16, 2022 at 2:00 PM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > > On Saturday, May 14, 2022 10:33 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v5 patch has the changes for the same. > > Also I have made the changes for SKIP Table based on the new syntax, the > > changes for the same are available in > > v5-0002-Skip-publishing-the-tables-specified-in-EXCEPT-TA.patch. > Hi, > > > > Several comments on v5-0002. > > (1) One unnecessary space before "except_pub_obj_list" syntax definition > > + except_pub_obj_list: ExceptPublicationObjSpec > + { $$ = list_make1($1); } > + | except_pub_obj_list ',' ExceptPublicationObjSpec > + { $$ = lappend($1, $3); } > + | /*EMPTY*/ { $$ = NULL; } > + ; > + > > From above part, kindly change > FROM: > " except_pub_obj_list: ExceptPublicationObjSpec" > TO: > "except_pub_obj_list: ExceptPublicationObjSpec" > Modified > (2) doc/src/sgml/ref/create_publication.sgml > > (2-1) > > @@ -22,7 +22,7 @@ PostgreSQL documentation > <refsynopsisdiv> > <synopsis> > CREATE PUBLICATION <replaceable class="parameter">name</replaceable> > - [ FOR ALL TABLES > + [ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]] > | FOR <replaceable class="parameter">publication_object</replaceable> [, ... ] ] > [ WITH ( <replaceable class="parameter">publication_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>][, ... ] ) ] > > > Here I think we need to add two more whitespaces around square brackets. > Please change > FROM: > "[ FOR ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ]]" > TO: > "[ FOR ALL TABLES [ EXCEPT TABLE [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [, ... ] ]" > > When I check other documentations, I see whitespaces before/after square brackets. > > (2-2) > This whitespace alignment applies to alter_publication.sgml as well. Modified > (3) > > > @@ -156,6 +156,24 @@ CREATE PUBLICATION <replaceable class="parameter">name</replaceable> > </listitem> > </varlistentry> > > + > + <varlistentry> > + <term><literal>EXCEPT TABLE</literal></term> > + <listitem> > + <para> > + Marks the publication as one that excludes replicating changes for the > + specified tables. > + </para> > + > + <para> > + <literal>EXCEPT TABLE</literal> can be specified only for > + <literal>FOR ALL TABLES</literal> publication. It is not supported for > + <literal>FOR ALL TABLES IN SCHEMA </literal> publication and > + <literal>FOR TABLE</literal> publication. > + </para> > + </listitem> > + </varlistentry> > + > > This EXCEPT TABLE clause is only for FOR ALL TABLES. > So, how about extracting the main message from above part and > moving it to an exising paragraph below, instead of having one independent paragraph ? > > <varlistentry> > <term><literal>FOR ALL TABLES</literal></term> > <listitem> > <para> > Marks the publication as one that replicates changes for all tables in > the database, including tables created in the future. > </para> > </listitem> > </varlistentry> > > Something like > "Marks the publication as one that replicates changes for all tables in > the database, including tables created in the future. EXCEPT TABLE indicates > excluded tables for the defined publication. > " > Modified > (4) One minor confirmation about the syntax > > Currently, we allow one way of writing to indicate excluded tables like below. > > (example) CREATE PUBLICATION mypub FOR ALL TABLES EXCEPT TABLE tab3, tab4, EXCEPT TABLE tab5; > > This is because we define ExceptPublicationObjSpec with EXCEPT TABLE. > Is it OK to have a room to write duplicate "EXCEPT TABLE" clauses ? > I think there is no harm in having this, > but I'd like to confirm whether this syntax might be better to be adjusted or not. Changed it to allow except table only once > > (5) CheckAlterPublication > > + > + if (excepttable && !stmt->for_all_tables) > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("publication \"%s\" is not defined as FOR ALL TABLES", > + NameStr(pubform->pubname)), > + errdetail("except table cannot be added to, dropped from, or set on NON ALL TABLES publications."))); > > Could you please add a test for this ? This code can be removed because of grammar optimization, it will not allow tables without "ALL TABLES". Removed this code The v6 patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com Regards, Vignesh
On Thursday, May 19, 2022 2:45 AM vignesh C <vignesh21@gmail.com> wrote: > On Mon, May 16, 2022 at 8:32 AM osumi.takamichi@fujitsu.com > <osumi.takamichi@fujitsu.com> wrote: > > (3) src/test/regress/expected/publication.out > > > > +-- Verify that only superuser can reset a publication ALTER > > +PUBLICATION testpub_reset OWNER TO regress_publication_user2; SET > > +ROLE regress_publication_user2; ALTER PUBLICATION testpub_reset > > +RESET; -- fail > > > > > > We have "-- fail" for one case in this patch. > > On the other hand, isn't better to add "-- ok" (or "-- success") for > > other successful statements, when we consider the entire tests > > description consistency ? > > We generally do not mention success comments for all the success cases as > that might be an overkill. I felt it is better to keep it as it is. > Thoughts? Thank you for updating the patches ! In terms of this point, I meant to say we add "-- ok" for each successful "ALTER PUBLICATION testpub_reset RESET;" statement. That means, we'll have just three places to add "--ok" and I thought this was not an overkill. *But*, I'm also OK with your idea. Please don't change the comments and keep them as it is like v6. Best Regards, Takamichi Osumi
On Tue, May 17, 2022 at 7:35 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Below are my review comments for v5-0002. > > There may be an overlap with comments recently posted by Osumi-san [1]. > > (I also have review comments for v5-0002; will post them tomorrow) > > ====== > > 1. General > > Is it really necessary to have to say "EXCEPT TABLE" instead of just > "EXCEPT". It seems unnecessarily verbose and redundant when you write > "FOR ALL TABLES EXCEPT TABLE...". > > If you want to keep this TABLE keyword (maybe you have plans for other > kinds of except?) then IMO perhaps at least it can be the optional > default except type. e.g. EXCEPT [TABLE]. I have made TABLE optional. > ~~~ > > 2. General > > (I was unsure whether to even mention this one). > > I understand the "EXCEPT" is chosen as the user-facing syntax, but it > still seems strange when reading the patch to see attribute members > and column names called 'except'. I think the problem is that "except" > is not a verb, so saying except=t/f just does not make much sense. > Sometimes I feel that for all the internal usage > (code/comments/catalog) using "skip" and "skip-list" etc would be a > much better choice of names. OTOH I can see that having consistency > with the outside syntax might also be good. Anyway, please consider - > maybe other people feel the same? Earlier we had discussed whether to use SKIP, but felt SKIP was not appropriate and planned to use except as in [1]. Let's use except unless we find a better alternative. > ~~~ > > 3. General > > The ONLY keyword seems supported by the syntax for tables of the > except-list (more on this in later comments) but: > a) I am not sure if the patch code is accounting for that, and I have kept the behavior similar to FOR TABLE > b) There are no test cases using ONLY. Added tests for the same > ~~~ > > 4. Commit message > > A new option "EXCEPT TABLE" in Create/Alter Publication allows > one or more tables to be excluded, publisher will exclude sending the data > of the excluded tables to the subscriber. > > SUGGESTION > A new "EXCEPT TABLE" clause for CREATE/ALTER PUBLICATION allows one or > more tables to be excluded. The publisher will not send the data of > excluded tables to the subscriber. Modified > ~~ > > 5. Commit message > > The new syntax allows specifying exclude relations while creating a publication > or exclude relations in alter publication. For example: > > SUGGESTION > The new syntax allows specifying excluded relations when creating or > altering a publication. For example: Modified > ~~~ > > 6. Commit message > > A new column prexcept is added to table "pg_publication_rel", to maintain > the relations that the user wants to exclude publishing through the publication. > > SUGGESTION > A new column "prexcept" is added to table "pg_publication_rel", to > maintain the relations that the user wants to exclude from the > publications. Modified > ~~~ > > 7. Commit message > > Modified the output plugin (pgoutput) to exclude publishing the changes of the > excluded tables. > > I did not feel it was necessary to say this. It is already said above > that the data is not sent, so that seems enough. Modified > ~~~ > > 8. Commit message > > Updates pg_dump to identify and dump the excluded tables of the publications. > Updates the \d family of commands to display excluded tables of the > publications and \dRp+ variant will now display associated except tables if any. > > SUGGESTION > pg_dump is updated to identify and dump the excluded tables of the publications. > > The psql \d family of commands to display excluded tables. e.g. psql > \dRp+ variant will now display associated "except tables" if any. Modified > ~~~ > > 9. doc/src/sgml/catalogs.sgml > > @@ -6426,6 +6426,15 @@ SCRAM-SHA-256$<replaceable><iteration > count></replaceable>:<replaceable>&l > if there is no publication qualifying condition.</para></entry> > </row> > > + <row> > + <entry role="catalog_table_entry"><para role="column_definition"> > + <structfield>prexcept</structfield> <type>bool</type> > + </para> > + <para> > + True if the table must be excluded > + </para></entry> > + </row> > > Other descriptions on this page refer to "relation" instead of > "table". Probably this should do the same to be consistent. Modified > ~~~ > > 10. doc/src/sgml/logical-replication.sgml > > @@ -1167,8 +1167,9 @@ CONTEXT: processing remote data for replication > origin "pg_16395" during "INSER > <para> > To add tables to a publication, the user must have ownership rights on the > table. To add all tables in schema to a publication, the user must be a > - superuser. To create a publication that publishes all tables or > all tables in > - schema automatically, the user must be a superuser. > + superuser. To add all tables to a publication, the user must be a superuser. > + To create a publication that publishes all tables or all tables in schema > + automatically, the user must be a superuser. > </para> > > It seems like a valid change but how is this related to this EXCEPT > patch. Maybe this fix should be patched separately? Earlier we were not allowed to add ALL TABLES, while altering publication. This is mentioned in this patch as we suport: ALTER PUBLICATION pubname ADD ALL TABLES syntax. > ~~~ > > 11. doc/src/sgml/ref/alter_publication.sgml > > @@ -22,6 +22,7 @@ PostgreSQL documentation > <refsynopsisdiv> > <synopsis> > ALTER PUBLICATION <replaceable class="parameter">name</replaceable> > ADD <replaceable class="parameter">publication_object</replaceable> [, > ...] > +ALTER PUBLICATION <replaceable class="parameter">name</replaceable> > ADD ALL TABLES [EXCEPT TABLE [ ONLY ] <replaceable > class="parameter">table_name</replaceable> [ * ] [, ... ]] > > The [ONLY] looks misplaced when the syntax is described like this. For > example, in practice it is possible to write "EXCEPT TABLE ONLY t1, > ONLY t2, t3, ONLY t4" but it doesn't seem that way by looking at these > PG DOCS. > > IMO would be better described like this: > > [ FOR ALL TABLES [ EXCEPT TABLE exception_object [,...] ]] > > where exception_object is: > > [ ONLY ] table_name [ * ] Modified > ~~~ > > 12. doc/src/sgml/ref/alter_publication.sgml > > @@ -82,8 +83,8 @@ ALTER PUBLICATION <replaceable > class="parameter">name</replaceable> RESET > > <para> > You must own the publication to use <command>ALTER PUBLICATION</command>. > - Adding a table to a publication additionally requires owning that table. > - The <literal>ADD ALL TABLES IN SCHEMA</literal> and > + Adding a table or excluding a table to a publication additionally requires > + owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal> and > > SUGGESTION > Adding a table to or excluding a table from a publication additionally > requires owning that table. Modified > ~~~ > > 13. doc/src/sgml/ref/alter_publication.sgml > > @@ -213,6 +214,14 @@ ALTER PUBLICATION sales_publication ADD ALL > TABLES IN SCHEMA marketing, sales; > </programlisting> > </para> > > + <para> > + Alter publication <structname>production_publication</structname> that > + publishes all tables except <structname>users</structname> and > + <structname>departments</structname> tables: > +<programlisting> > > "that publishes" -> "to publish" Modified > ~~~ > > 14. doc/src/sgml/ref/create_publication.sgml > > (Same comment about the ONLY syntax as #11) Modified > ~~~ > > 15. doc/src/sgml/ref/create_publication.sgml > > + <varlistentry> > + <term><literal>EXCEPT TABLE</literal></term> > + <listitem> > + <para> > + Marks the publication as one that excludes replicating changes for the > + specified tables. > + </para> > + > + <para> > + <literal>EXCEPT TABLE</literal> can be specified only for > + <literal>FOR ALL TABLES</literal> publication. It is not supported for > + <literal>FOR ALL TABLES IN SCHEMA </literal> publication and > + <literal>FOR TABLE</literal> publication. > + </para> > + </listitem> > + </varlistentry> > > IMO you can remove all that "It is not supported for..." sentence. You > don't need to spell that out again when it is already clear from the > syntax. Modified > ~~~ > > 16. doc/src/sgml/ref/psql-ref.sgml > > @@ -1868,8 +1868,9 @@ testdb=> > If <replaceable class="parameter">pattern</replaceable> is > specified, only those publications whose names match the pattern are > listed. > - If <literal>+</literal> is appended to the command name, the tables and > - schemas associated with each publication are shown as well. > + If <literal>+</literal> is appended to the command name, the tables, > + excluded tables and schemas associated with each publication > are shown as > + well. > </para> > > Perhaps this is OK just as-is, but OTOH I felt that the change was > almost unnecessary because saying it displays "the tables" kind of > implies it would also have to account for the "excluded tables" too. I mentioned it that way so that it is clearer and to avoid confusions to be pointed out by other members later. I felt let's keep it this way. > ~~~ > > 17. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication > > @@ -302,8 +303,9 @@ GetTopMostAncestorInPublication(Oid puboid, List > *ancestors, int *ancestor_level > foreach(lc, ancestors) > { > Oid ancestor = lfirst_oid(lc); > - List *apubids = GetRelationPublications(ancestor); > + List *apubids = GetRelationPublications(ancestor, false); > List *aschemaPubids = NIL; > + List *aexceptpubids = NIL; > > 17a. > I think the var "aschemaPubids" and "aexceptpubids" are only used in > the 'else' block so it seems better they can be declared and freed in > that block too instead of always. Modified > 17b. > Also, the camel-case of those variables is inconsistent so may fix > that at the same time. Modified > ~~~ > > 18. src/backend/catalog/pg_publication.c - GetRelationPublications > > @@ -666,7 +673,7 @@ publication_add_schema(Oid pubid, Oid schemaid, > bool if_not_exists) > > /* Gets list of publication oids for a relation */ > List * > -GetRelationPublications(Oid relid) > +GetRelationPublications(Oid relid, bool bexcept) > > 18a. > I felt that "except_flag" is a better name than "bexcept" for this param. Modified > 18b. > The function comment should be updated to say only relations matching > this except_flag are returned in the list. Modified > ~~~ > > 19. src/backend/catalog/pg_publication.c - GetAllTablesPublicationRelations > > @@ -787,6 +795,15 @@ GetAllTablesPublicationRelations(bool pubviaroot) > HeapTuple tuple; > List *result = NIL; > > + /* > + * pg_publication_rel and pg_publication_namespace will only have excluded > + * tables in case of all tables publication, no need to pass except flag > + * to get the relations. > + */ > + List *exceptpubtablelist; > + > + exceptpubtablelist = GetPublicationRelations(pubid, PUBLICATION_PART_ALL); > + > > 19a. > I wasn't very sure of the meaning/intent of the comment, but IIUC it > seems to be explaining why it is not necessary to use an "except_flag" > parameter in this code. Is it necessary/helpful to explain parameters > that do NOT exist? I have removed it > 19b. > The var name "exceptpubtablelist" seems a bit overkill. (e.g. > "excepttablelist" or "exceptlist" etc... are shorter but seem equally > informative). Modified > ~~~ > > 20. src/backend/commands/publicationcmds.c - CreatePublication > > @@ -843,54 +849,52 @@ CreatePublication(ParseState *pstate, > CreatePublicationStmt *stmt) > /* Make the changes visible. */ > CommandCounterIncrement(); > > - /* Associate objects with the publication. */ > - if (stmt->for_all_tables) > - { > - /* Invalidate relcache so that publication info is rebuilt. */ > - CacheInvalidateRelcacheAll(); > - } > - else > - { > - ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations, > - &schemaidlist); > + ObjectsInPublicationToOids(stmt->pubobjects, pstate, &relations, > + &schemaidlist); > > - /* FOR ALL TABLES IN SCHEMA requires superuser */ > - if (list_length(schemaidlist) > 0 && !superuser()) > - ereport(ERROR, > - errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > - errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); > + /* FOR ALL TABLES IN SCHEMA requires superuser */ > + if (list_length(schemaidlist) > 0 && !superuser()) > + ereport(ERROR, > + errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication")); > > - if (list_length(relations) > 0) > - { > - List *rels; > + if (list_length(relations) > 0) > + { > + List *rels; > > - rels = OpenTableList(relations); > - CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist, > - PUBLICATIONOBJ_TABLE); > + rels = OpenTableList(relations); > + CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist, > + PUBLICATIONOBJ_TABLE); > > - TransformPubWhereClauses(rels, pstate->p_sourcetext, > - publish_via_partition_root); > + TransformPubWhereClauses(rels, pstate->p_sourcetext, > + publish_via_partition_root); > > - CheckPubRelationColumnList(rels, pstate->p_sourcetext, > - publish_via_partition_root); > + CheckPubRelationColumnList(rels, pstate->p_sourcetext, > + publish_via_partition_root); > > - PublicationAddTables(puboid, rels, true, NULL); > - CloseTableList(rels); > - } > + PublicationAddTables(puboid, rels, true, NULL); > + CloseTableList(rels); > + } > > - if (list_length(schemaidlist) > 0) > - { > - /* > - * Schema lock is held until the publication is created to prevent > - * concurrent schema deletion. > - */ > - LockSchemaList(schemaidlist); > - PublicationAddSchemas(puboid, schemaidlist, true, NULL); > - } > + if (list_length(schemaidlist) > 0) > + { > + /* > + * Schema lock is held until the publication is created to prevent > + * concurrent schema deletion. > + */ > + LockSchemaList(schemaidlist); > + PublicationAddSchemas(puboid, schemaidlist, true, NULL); > } > > table_close(rel, RowExclusiveLock); > > + /* Associate objects with the publication. */ > + if (stmt->for_all_tables) > + { > + /* Invalidate relcache so that publication info is rebuilt. */ > + CacheInvalidateRelcacheAll(); > + } > + > > This function is refactored a lot to not use "if/else" as it did > before. But AFAIK (maybe I misunderstood) this refactor doesn't seem > to actually have anything to do with the EXCEPT patch. If it really is > unrelated maybe it should not be part of this patch. Earlier tables cannot be specified with all tables, now except tables can be specified with all tables, except tables should be added to pg_publication_rel, to handle it the code changes are required. > ~~~ > > 21. src/backend/commands/publicationcmds.c - CheckPublicationDefValues > > + if (pubform->puballtables) > + return false; > + > + if (!pubform->pubinsert || !pubform->pubupdate || !pubform->pubdelete || > + !pubform->pubtruncate || pubform->pubviaroot) > + return false; > > Now you have all the #define for the PUB_DEFAULT_XXX values, perhaps > this function should be using them instead of the hardcoded > assumptions what the default values are. > > e.g. > > if (pubform->puballtables != PUB_DEFAULT_ALL_TABLES) return false; > if (pubform->pubinsert != PUB_DEFAULT_ACTION_INSERT) return false; > ... > etc. Modified > ~~~ > > 22. src/backend/commands/publicationcmds.c - CheckAlterPublication > > > @@ -1442,6 +1516,19 @@ CheckAlterPublication(AlterPublicationStmt > *stmt, HeapTuple tup, > List *tables, List *schemaidlist) > { > Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup); > + ListCell *lc; > + bool nonexcepttable = false; > + bool excepttable = false; > + > + foreach(lc, tables) > + { > + PublicationTable *pub_table = lfirst_node(PublicationTable, lc); > + > + if (!pub_table->except) > + nonexcepttable = true; > + else > + excepttable = true; > + } > > 22a. > The names are very confusing. e.g. "nonexcepttable" is like a double-negative. > > SUGGEST: > bool has_tables = false; > bool has_except_tables = false; > > 22b. > Reverse the "if" condition to be positive instead of negative (remove !) > e.g. > if (pub_table->except) > has_except_table = true; > else > has_table = true; This code can be removed because of grammar optimization, it will not allow except table without "ALL TABLES". Removed these changes. > ~~~ > > 23. src/backend/commands/publicationcmds.c - CheckAlterPublication > > @@ -1461,12 +1548,19 @@ CheckAlterPublication(AlterPublicationStmt > *stmt, HeapTuple tup, > errdetail("Tables from schema cannot be added to, dropped from, or > set on FOR ALL TABLES publications."))); > > /* Check that user is allowed to manipulate the publication tables. */ > - if (tables && pubform->puballtables) > + if (nonexcepttable && tables && pubform->puballtables) > ereport(ERROR, > > Seems no reason for "tables" to be in the condition since > "nonexcepttable" can't be true if "tables" is NIL. This code can be removed because of grammar optimization, it will not allow except table without "ALL TABLES". Removed these changes. > ~~~ > > 24. src/backend/commands/publicationcmds.c - CheckAlterPublication > > + > + if (excepttable && !stmt->for_all_tables) > + ereport(ERROR, > + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > + errmsg("publication \"%s\" is not defined as FOR ALL TABLES", > + NameStr(pubform->pubname)), > + errdetail("except table cannot be added to, dropped from, or set on > NON ALL TABLES publications."))); > > The errdetail message seems over-complex. > > SUGGESTION > "EXCEPT TABLE clause is only allowed for FOR ALL TABLES publications." This code can be removed because of grammar optimization, it will not allow except table without "ALL TABLES". Removed this code > ~~~ > > 25. src/backend/commands/publicationcmds.c - AlterPublication > > @@ -1500,6 +1594,20 @@ AlterPublication(ParseState *pstate, > AlterPublicationStmt *stmt) > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, > stmt->pubname); > > + if (stmt->for_all_tables) > + { > + bool isdefault = CheckPublicationDefValues(tup); > + > + if (!isdefault) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > + errmsg("Setting ALL TABLES requires publication \"%s\" to have > default values", > + stmt->pubname), > + errhint("Either the publication has tables/schemas associated or > does not have default publication options or ALL TABLES option is > set.")); > > The errhint message seems over-complex. > > SUGGESTION > "Use ALTER PUBLICATION ... RESET" Modified > ~~~ > > 26. src/bin/pg_dump/pg_dump.c - dumpPublication > > @@ -3980,8 +3982,34 @@ dumpPublication(Archive *fout, const > PublicationInfo *pubinfo) > qpubname); > > if (pubinfo->puballtables) > + { > + SimplePtrListCell *cell; > + bool first = true; > appendPQExpBufferStr(query, " FOR ALL TABLES"); > > + /* Include exception tables if the publication has except tables */ > + for (cell = exceptinfo.head; cell; cell = cell->next) > + { > + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr; > + PublicationInfo *relpubinfo = pubrinfo->publication; > + TableInfo *tbinfo; > + > + if (pubinfo == relpubinfo) > + { > + tbinfo = pubrinfo->pubtable; > + > + if (first) > + { > + appendPQExpBufferStr(query, " EXCEPT TABLE ONLY"); > + first = false; > + } > + else > + appendPQExpBufferStr(query, ", "); > + appendPQExpBuffer(query, " %s", fmtQualifiedDumpable(tbinfo)); > + } > + } > + } > + > > IIUC this usage of ONLY looks incorrect. > > 26a. > Firstly, if you want to hardwire ONLY then shouldn't it apply to every > of the except-list table, not just the first one? e.g. "EXCEPT TABLE > ONLY t1, ONLY t2, ONLY t3..." Modified, included ONLY for all the tables > 26b. > Secondly, is it even correct to unconditionally hardwire the ONLY? How > do you know that is how the user wanted it? The table ONLY selection is handled appropriately while creating publication and stored in pg_publication_rel. When we dump all the parent and child table will be included specifying ONLY will handle both scenarios with and without ONLY. This is the same behavior as in FOR TABLE publication > ~~~ > > 27. src/bin/pg_dump/pg_dump.c > > @@ -127,6 +127,8 @@ static SimpleOidList foreign_servers_include_oids > = {NULL, NULL}; > static SimpleStringList extension_include_patterns = {NULL, NULL}; > static SimpleOidList extension_include_oids = {NULL, NULL}; > > +static SimplePtrList exceptinfo = {NULL, NULL}; > + > > Probably I just did not understand how this logic works, but how does > this static work properly if there are multiple publications and 2 > different EXCEPT lists? E.g. where is it clearing the "exceptinfo" so > that multiple EXCEPT TABLE lists don't become muddled? Currently exceptinfo holds all the exception tables and the corresponding publications. When we dump the publication it will select the appropriate exception tables that correspond to the publication and dump the exception tables associated for this publication. Since this is a special syntax "CREATE PUBLICATION FOR ALL TABLES EXCEPT TABLE tb1 .." all the except tables should be specified in a single statement unlike the other publication objects. > ~~~ > > 28. src/bin/pg_dump/pg_dump.c - dumpPublicationTable > > @@ -4330,8 +4378,11 @@ dumpPublicationTable(Archive *fout, const > PublicationRelInfo *pubrinfo) > > query = createPQExpBuffer(); > > - appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY", > + appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD ", > fmtId(pubinfo->dobj.name)); > + > + appendPQExpBufferStr(query, "TABLE ONLY"); > + > > That code refactor does not seem necessary for this patch. Modified > ~~~ > > 29. src/bin/pg_dump/pg_dump_sort.c > > @@ -90,6 +90,7 @@ enum dbObjectTypePriorities > PRIO_FK_CONSTRAINT, > PRIO_POLICY, > PRIO_PUBLICATION, > + PRIO_PUBLICATION_EXCEPT_REL, > PRIO_PUBLICATION_REL, > PRIO_PUBLICATION_TABLE_IN_SCHEMA, > PRIO_SUBSCRIPTION, > > I'm not sure how this enum is used (so perhaps this makes no > difference) but judging by the enum comment why did you put the sort > priority order PRIO_PUBLICATION_EXCEPT_REL before > PRIO_PUBLICATION_REL. Wouldn’t it make more sense the other way > around? This order does not matter, since the new syntax is like "CREATE PUBLICATION.. FOR ALL TABLES EXCEPT TABLE ....", all the except tables need to be accumulated and handled during dump publication. This code changes take care of accumulating the exception table which will be used later by dump publication > ~~~ > > 30. src/bin/psql/describe.c > > @@ -2950,17 +2950,34 @@ describeOneTableDetails(const char *schemaname, > " WHERE attrelid = pr.prrelid AND attnum = prattrs[s])\n" > " ELSE NULL END) " > "FROM pg_catalog.pg_publication p\n" > - " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n" > - " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n" > - "WHERE pr.prrelid = '%s'\n" > - "UNION\n" > + " JOIN pg_catalog.pg_publication_rel pr ON p.oid = pr.prpubid\n" > + " JOIN pg_catalog.pg_class c ON c.oid = pr.prrelid\n" > + "WHERE pr.prrelid = '%s'", > + oid, oid, oid); > > I feel that trailing "\n" ("WHERE pr.prrelid = '%s'\n") should not > have been removed. Modified > ~~~ > > 31. src/bin/psql/describe.c > > + /* FIXME: 150000 should be changed to 160000 later for PG16. */ > + if (pset.sversion >= 150000) > + appendPQExpBufferStr(&buf, " AND pr.prexcept = 'f'\n"); > + > + appendPQExpBuffer(&buf, "UNION\n" > > The "UNION\n" param might be better wrapped onto the next line like it > used to be. Modified > ~~~ > > 32. src/bin/psql/describe.c > > + /* FIXME: 150000 should be changed to 160000 later for PG16. */ > + if (pset.sversion >= 150000) > + appendPQExpBuffer(&buf, > + " AND NOT EXISTS (SELECT 1\n" > + " FROM pg_catalog.pg_publication_rel pr\n" > + " JOIN pg_catalog.pg_class pc\n" > + " ON pr.prrelid = pc.oid\n" > + " WHERE pr.prrelid = '%s' AND pr.prpubid = p.oid)\n", > + oid); > > The whitespace indents in the SQL seem excessive here. Modified > ~~~ > > 33. src/bin/psql/describe.c - describePublications > > @@ -6322,6 +6344,22 @@ describePublications(const char *pattern) > } > } > > + /* FIXME: 150000 should be changed to 160000 later for PG16. */ > + if (pset.sversion >= 150000) > + { > + /* Get the excluded tables for the specified publication */ > + printfPQExpBuffer(&buf, > + "SELECT concat(c.relnamespace::regnamespace, '.', c.relname)\n" > + "FROM pg_catalog.pg_class c\n" > + " JOIN pg_catalog.pg_publication_rel pr ON c.oid = pr.prrelid\n" > + "WHERE pr.prpubid = '%s'\n" > + " AND pr.prexcept = 't'\n" > + "ORDER BY 1", pubid); > + if (!addFooterToPublicationDesc(&buf, "Except tables:", > + true, &cont)) > + goto error_return; > + } > + > > I think this code is misplaced. Shouldn't it be if/else and be above > the other 150000 check, otherwise when you change this to PG16 it may > not work as expected? I moved this to else. I felt this is applicable only for all tables publication. Just keeping in else is fine. > ~~~ > > 34. src/bin/psql/describe.c - describePublications > > + if (!addFooterToPublicationDesc(&buf, "Except tables:", > + true, &cont)) > + goto error_return; > + } > > Should this be using the _T() macros same as the other prompts for translation? Modified > ~~~ > > 35. src/include/catalog/pg_publication.h > > I thought the param "bexpect" should be "except_flag". > > (same comment as #18a) Modified > ~~~ > > 36. src/include/catalog/pg_publication_rel.h > > @@ -31,6 +31,7 @@ CATALOG(pg_publication_rel,6106,PublicationRelRelationId) > Oid oid; /* oid */ > Oid prpubid BKI_LOOKUP(pg_publication); /* Oid of the publication */ > Oid prrelid BKI_LOOKUP(pg_class); /* Oid of the relation */ > + bool prexcept BKI_DEFAULT(f); /* except the relation */ > > SUGGEST (comment) > /* skip the relation */ Changed it to exclude the relation > ~~~ > > 37. src/include/commands/publicationcmds.h > > @@ -32,8 +32,8 @@ extern ObjectAddress AlterPublicationOwner(const > char *name, Oid newOwnerId); > extern void AlterPublicationOwner_oid(Oid pubid, Oid newOwnerId); > extern void InvalidatePublicationRels(List *relids); > extern bool pub_rf_contains_invalid_column(Oid pubid, Relation relation, > - List *ancestors, bool pubviaroot); > + List *ancestors, bool pubviaroot, bool alltables); > extern bool pub_collist_contains_invalid_column(Oid pubid, Relation relation, > - List *ancestors, bool pubviaroot); > + List *ancestors, bool pubviaroot, bool alltables); > > Elsewhere in this patch, a similarly added param is called > "puballtables" (not "alltables"). Please check all places and use a > consistent param name for all of them. Modified > ~~~ > > 38. src/test/regress/sql/publication.sql > > There don't seem to be any tests for more than one EXCEPT TABLE (e.g. > no list tests?) Modified > ~~~ > > 38. src/test/regress/sql/publication.sql > > Maybe adjust all the below comments (a-d) to say "EXCEPT TABLES" > intead of "except tables" > > 38a. > +-- can't add except table to 'FOR ALL TABLES' publication > > 38b. > +-- can't add except table to 'FOR TABLE' publication > > 38c. > +-- can't add except table to 'FOR ALL TABLES IN SCHEMA' publication > > 38d. > +-- can't add except table when publish_via_partition_root option does not > +-- have default value > > 38e. > +-- can't add except table when the publication options does not have default > +-- values > > SUGGESTION > can't add EXCEPT TABLE when the publication options are not the default values Modified > ~~~ > > 39. .../t/032_rep_changes_except_table.pl > > 39a. > +# Check the table data does not sync for excluded table > +my $result = $node_subscriber->safe_psql('postgres', > + "SELECT count(*), min(a), max(a) FROM sch1.tab1"); > +is($result, qq(0||), 'check tablesync is excluded for excluded tables'); > > Maybe the "is" message should say "check there is no initial data > copied for the excluded table" Modified > ~~~ > > > 40 .../t/032_rep_changes_except_table.pl > > +# Insert some data into few tables and verify that inserted data is not > +# replicated > +$node_publisher->safe_psql('postgres', > + "INSERT INTO sch1.tab1 VALUES(generate_series(11,20))"); > > The comment is not quite correct. You are inserting into only one > table here - not "few tables". Modified > ~~~ > > 41. .../t/032_rep_changes_except_table.pl > > +# Alter publication to exclude data changes in public.tab1 and verify that > +# subscriber does not get the new table data. > > "new table data" -> "changed data for this table" Modified Thanks for the comments, the v6 patch attached at [2] has the changes for the same. [1] - https://www.postgresql.org/message-id/a2004f08-eb2f-b124-115c-f8f18667e585%40enterprisedb.com [2] - https://www.postgresql.org/message-id/CALDaNm0iZZDB300Dez_97S8G6_RW5QpQ8ef6X3wq8tyK-8wnXQ%40mail.gmail.com Regards, Vignesh
Below are my review comments for v6-0001. ====== 1. General. The patch failed 'publication' tests in the make check phase. Please add this work to the commit-fest so that the 'cfbot' can report such errors sooner. ~~~ 2. src/backend/commands/publicationcmds.c - AlterPublicationReset +/* + * Reset the publication. + * + * Reset the publication options, publication relations and publication schemas. + */ +static void +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt, + Relation rel, HeapTuple tup) SUGGESTION (Make the comment similar to the sgml text instead of repeating "publication" 4x !) /* * Reset the publication options, set the ALL TABLES flag to false, and * drop all relations and schemas that are associated with the publication. */ ~~~ 3. src/test/regress/expected/publication.out make check failed. The diff is below: @@ -1716,7 +1716,7 @@ -- Verify that only superuser can reset a publication ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2; SET ROLE regress_publication_user2; -ALTER PUBLICATION testpub_reset RESET; -- fail +ALTER PUBLICATION testpub_reset RESET; -- fail - must be superuser ERROR: must be superuser to RESET publication SET ROLE regress_publication_user; DROP PUBLICATION testpub_reset; ------ Kind Regards, Peter Smith. Fujitsu Australia
FYI, although the v6-0002 patch applied cleanly, I found that the SGML was malformed and so the pg docs build fails. ~~~ e.g. [postgres@CentOS7-x64 sgml]$ make STYLE=website html { \ echo "<!ENTITY version \"15beta1\">"; \ echo "<!ENTITY majorversion \"15\">"; \ } > version.sgml '/usr/bin/perl' ./mk_feature_tables.pl YES ../../../src/backend/catalog/sql_feature_packages.txt ../../../src/backend/catalog/sql_features.txt > features-supported.sgml '/usr/bin/perl' ./mk_feature_tables.pl NO ../../../src/backend/catalog/sql_feature_packages.txt ../../../src/backend/catalog/sql_features.txt > features-unsupported.sgml '/usr/bin/perl' ./generate-errcodes-table.pl ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml /usr/bin/xmllint --path . --noout --valid postgres.sgml ref/create_publication.sgml:171: parser error : Opening and ending tag mismatch: varlistentry line 166 and listitem </listitem> ^ ref/create_publication.sgml:172: parser error : Opening and ending tag mismatch: variablelist line 60 and varlistentry </varlistentry> ^ ref/create_publication.sgml:226: parser error : Opening and ending tag mismatch: refsect1 line 57 and variablelist </variablelist> ^ ... I will work around it locally, but for future patches please check the SGML builds ok before posting. ------ Kind Regards, Peter Smith. Fujitsu Australia
On Fri, May 20, 2022 at 10:19 AM Peter Smith <smithpb2250@gmail.com> wrote: > > FYI, although the v6-0002 patch applied cleanly, I found that the SGML > was malformed and so the pg docs build fails. > > ~~~ > e.g. > > [postgres@CentOS7-x64 sgml]$ make STYLE=website html > { \ > echo "<!ENTITY version \"15beta1\">"; \ > echo "<!ENTITY majorversion \"15\">"; \ > } > version.sgml > '/usr/bin/perl' ./mk_feature_tables.pl YES > ../../../src/backend/catalog/sql_feature_packages.txt > ../../../src/backend/catalog/sql_features.txt > > features-supported.sgml > '/usr/bin/perl' ./mk_feature_tables.pl NO > ../../../src/backend/catalog/sql_feature_packages.txt > ../../../src/backend/catalog/sql_features.txt > > features-unsupported.sgml > '/usr/bin/perl' ./generate-errcodes-table.pl > ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml > '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml > /usr/bin/xmllint --path . --noout --valid postgres.sgml > ref/create_publication.sgml:171: parser error : Opening and ending tag > mismatch: varlistentry line 166 and listitem > </listitem> > ^ > ref/create_publication.sgml:172: parser error : Opening and ending tag > mismatch: variablelist line 60 and varlistentry > </varlistentry> > ^ > ref/create_publication.sgml:226: parser error : Opening and ending tag > mismatch: refsect1 line 57 and variablelist > </variablelist> > ^ > ... > > I will work around it locally, but for future patches please check the > SGML builds ok before posting. FYI, I rewrote the bad SGML fragment like this: <varlistentry> <term><literal>EXCEPT TABLE</literal></term> <listitem> <para> This clause specifies a list of tables to exclude from the publication. It can only be used with <literal>FOR ALL TABLES</literal>. </para> </listitem> </varlistentry> ------ Kind Regards, Peter Smith. Fujitsu Australia
Below are my review comments for v6-0002. ====== 1. Commit message. The psql \d family of commands to display excluded tables. SUGGESTION The psql \d family of commands can now display excluded tables. ~~~ 2. doc/src/sgml/ref/alter_publication.sgml @@ -22,6 +22,7 @@ PostgreSQL documentation <refsynopsisdiv> <synopsis> ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD <replaceable class="parameter">publication_object</replaceable> [, ...] +ALTER PUBLICATION <replaceable class="parameter">name</replaceable> ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ] The "exception_object" font is wrong. Should look the same as "publication_object" ~~~ 3. doc/src/sgml/ref/alter_publication.sgml - Examples @@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL TABLES IN SCHEMA marketing, sales; </programlisting> </para> + <para> + Alter publication <structname>production_publication</structname> to publish + all tables except <structname>users</structname> and + <structname>departments</structname> tables: +<programlisting> +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE users, departments; +</programlisting></para> Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will show TABLE keyword is optional. ~~~ 4. doc/src/sgml/ref/create_publication.sgml An SGML tag error caused building the docs to fail. My fix was previously reported [1]. ~~~ 5. doc/src/sgml/ref/create_publication.sgml @@ -22,7 +22,7 @@ PostgreSQL documentation <refsynopsisdiv> <synopsis> CREATE PUBLICATION <replaceable class="parameter">name</replaceable> - [ FOR ALL TABLES + [ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ] The "exception_object" font is wrong. Should look the same as "publication_object" ~~~ 6. doc/src/sgml/ref/create_publication.sgml - Examples @@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR TABLE users, departments, ALL TABL CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales; </programlisting></para> + <para> + Create a publication that publishes all changes in all the tables except for + the changes of <structname>users</structname> and + <structname>departments</structname> table: +<programlisting> +CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, departments; +</programlisting> + </para> + 6a. Typo: "FOR ALL TABLE" -> "FOR ALL TABLES" 6b. Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will show TABLE keyword is optional. ~~~ 7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication @@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List *ancestors, int *ancestor_level } else { - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); - if (list_member_oid(aschemaPubids, puboid)) + List *aschemapubids = NIL; + List *aexceptpubids = NIL; + + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor)); + aexceptpubids = GetRelationPublications(ancestor, true); + if (list_member_oid(aschemapubids, puboid) || + (puballtables && !list_member_oid(aexceptpubids, puboid))) { You could re-write this as multiple conditions instead of one. That could avoid always assigning the 'aexceptpubids', so it might be a more efficient way to write this logic. ~~~ 8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues +/* + * Check if the publication has default values + * + * Check the following: + * Publication is having default options + * Publication is not associated with relations + * Publication is not associated with schemas + * Publication is not set with "FOR ALL TABLES" + */ +static bool +CheckPublicationDefValues(HeapTuple tup) 8a. Remove the tab. Replace with spaces. 8b. It might be better if this comment order is the same as the logic order. e.g. * Check the following: * Publication is not set with "FOR ALL TABLES" * Publication is having default options * Publication is not associated with schemas * Publication is not associated with relations ~~~ 9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables +/* + * Reset the publication. + * + * Reset the publication options, publication relations and publication schemas. + */ +static void +AlterPublicationSetAllTables(Relation rel, HeapTuple tup) The function comment and the function name do not seem to match here; something looks like a cut/paste error ?? ~~~ 10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables + /* set all tables option */ + values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true); + replaces[Anum_pg_publication_puballtables - 1] = true; SUGGEST (comment) /* set all ALL TABLES flag */ ~~~ 11. src/backend/catalog/pg_publication.c - AlterPublication @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate, AlterPublicationStmt *stmt) aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, stmt->pubname); + if (stmt->for_all_tables) + { + bool isdefault = CheckPublicationDefValues(tup); + + if (!isdefault) + ereport(ERROR, + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("Setting ALL TABLES requires publication \"%s\" to have default values", + stmt->pubname), + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); The errmsg should start with a lowercase letter. ~~~ 12. src/backend/catalog/pg_publication.c - AlterPublication @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate, AlterPublicationStmt *stmt) aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, stmt->pubname); + if (stmt->for_all_tables) + { + bool isdefault = CheckPublicationDefValues(tup); + + if (!isdefault) + ereport(ERROR, + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("Setting ALL TABLES requires publication \"%s\" to have default values", + stmt->pubname), + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); Example test: postgres=# create table t1(a int); CREATE TABLE postgres=# create publication p1 for table t1; CREATE PUBLICATION postgres=# alter publication p1 add all tables except t1; 2022-05-20 14:34:49.301 AEST [21802] ERROR: Setting ALL TABLES requires publication "p1" to have default values 2022-05-20 14:34:49.301 AEST [21802] HINT: Use ALTER PUBLICATION ... RESET to reset the publication 2022-05-20 14:34:49.301 AEST [21802] STATEMENT: alter publication p1 add all tables except t1; ERROR: Setting ALL TABLES requires publication "p1" to have default values HINT: Use ALTER PUBLICATION ... RESET to reset the publication postgres=# alter publication p1 set all tables except t1; That error message does not quite match what the user was doing. Firstly, they were adding the ALL TABLES, not setting it. Secondly, all the values of the publication were already defaults (only there was an existing table t1 in the publication). Maybe some minor changes to the message wording can be a better reflect what the user is doing here. ~~~ 13. src/backend/parser/gram.y @@ -10410,7 +10411,7 @@ AlterOwnerStmt: ALTER AGGREGATE aggregate_with_argtypes OWNER TO RoleSpec * * CREATE PUBLICATION name [WITH options] * - * CREATE PUBLICATION FOR ALL TABLES [WITH options] + * CREATE PUBLICATION FOR ALL TABLES [EXCEPT TABLE table [, ...]] [WITH options] Comment should show the "TABLE" keyword is optional ~~~ 14. src/bin/pg_dump/pg_dump.c - dumpPublicationTable @@ -4332,6 +4380,7 @@ dumpPublicationTable(Archive *fout, const PublicationRelInfo *pubrinfo) appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY", fmtId(pubinfo->dobj.name)); + appendPQExpBuffer(query, " %s", fmtQualifiedDumpable(tbinfo)); This additional whitespace seems unrelated to this patch ~~~ 15. src/include/nodes/parsenodes.h 15a. @@ -3999,6 +3999,7 @@ typedef struct PublicationTable RangeVar *relation; /* relation to be published */ Node *whereClause; /* qualifications */ List *columns; /* List of columns in a publication table */ + bool except; /* except relation */ } PublicationTable; Maybe the comment should be more like similar ones: /* exclude the relation */ 15b. @@ -4007,6 +4008,7 @@ typedef struct PublicationTable typedef enum PublicationObjSpecType { PUBLICATIONOBJ_TABLE, /* A table */ + PUBLICATIONOBJ_EXCEPT_TABLE, /* An Except table */ PUBLICATIONOBJ_TABLES_IN_SCHEMA, /* All tables in schema */ PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA, /* All tables in first element of Maybe the comment should be more like: /* A table to be excluded */ ~~~ 16. src/test/regress/sql/publication.sql I did not see any test cases using EXCEPT when the optional TABLE keyword is omitted. ------ [1] https://www.postgresql.org/message-id/CAHut%2BPtZDfBJ1d%3D3kSexgM5m%2BP_ok8sdsJXKimsXycaMyqXsNA%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Thu, May 19, 2022 at 1:49 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Below are my review comments for v6-0001. > > ====== > > 1. General. > > The patch failed 'publication' tests in the make check phase. > > Please add this work to the commit-fest so that the 'cfbot' can report > such errors sooner. Added commitfest entry > ~~~ > > 2. src/backend/commands/publicationcmds.c - AlterPublicationReset > > +/* > + * Reset the publication. > + * > + * Reset the publication options, publication relations and > publication schemas. > + */ > +static void > +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt, > + Relation rel, HeapTuple tup) > > SUGGESTION (Make the comment similar to the sgml text instead of > repeating "publication" 4x !) > /* > * Reset the publication options, set the ALL TABLES flag to false, and > * drop all relations and schemas that are associated with the publication. > */ Modified > ~~~ > > 3. src/test/regress/expected/publication.out > > make check failed. The diff is below: > > @@ -1716,7 +1716,7 @@ > -- Verify that only superuser can reset a publication > ALTER PUBLICATION testpub_reset OWNER TO regress_publication_user2; > SET ROLE regress_publication_user2; > -ALTER PUBLICATION testpub_reset RESET; -- fail > +ALTER PUBLICATION testpub_reset RESET; -- fail - must be superuser > ERROR: must be superuser to RESET publication > SET ROLE regress_publication_user; > DROP PUBLICATION testpub_reset; It passed for me locally because the change was present in the 002 patch. I have moved the change to 001. The attached v7 patch has the changes for the same. [1] - https://commitfest.postgresql.org/38/3646/ Regards, Vignesh
Attachment
On Fri, May 20, 2022 at 5:49 AM Peter Smith <smithpb2250@gmail.com> wrote: > > FYI, although the v6-0002 patch applied cleanly, I found that the SGML > was malformed and so the pg docs build fails. > > ~~~ > e.g. > > [postgres@CentOS7-x64 sgml]$ make STYLE=website html > { \ > echo "<!ENTITY version \"15beta1\">"; \ > echo "<!ENTITY majorversion \"15\">"; \ > } > version.sgml > '/usr/bin/perl' ./mk_feature_tables.pl YES > ../../../src/backend/catalog/sql_feature_packages.txt > ../../../src/backend/catalog/sql_features.txt > > features-supported.sgml > '/usr/bin/perl' ./mk_feature_tables.pl NO > ../../../src/backend/catalog/sql_feature_packages.txt > ../../../src/backend/catalog/sql_features.txt > > features-unsupported.sgml > '/usr/bin/perl' ./generate-errcodes-table.pl > ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml > '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml > /usr/bin/xmllint --path . --noout --valid postgres.sgml > ref/create_publication.sgml:171: parser error : Opening and ending tag > mismatch: varlistentry line 166 and listitem > </listitem> > ^ > ref/create_publication.sgml:172: parser error : Opening and ending tag > mismatch: variablelist line 60 and varlistentry > </varlistentry> > ^ > ref/create_publication.sgml:226: parser error : Opening and ending tag > mismatch: refsect1 line 57 and variablelist > </variablelist> > ^ > ... > > I will work around it locally, but for future patches please check the > SGML builds ok before posting. Thanks for reporting this, I have made the changes for this. The v7 patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com Regards, Vignesh
On Fri, May 20, 2022 at 11:23 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Below are my review comments for v6-0002. > > ====== > > 1. Commit message. > The psql \d family of commands to display excluded tables. > > SUGGESTION > The psql \d family of commands can now display excluded tables. Modified > ~~~ > > 2. doc/src/sgml/ref/alter_publication.sgml > > @@ -22,6 +22,7 @@ PostgreSQL documentation > <refsynopsisdiv> > <synopsis> > ALTER PUBLICATION <replaceable class="parameter">name</replaceable> > ADD <replaceable class="parameter">publication_object</replaceable> [, > ...] > +ALTER PUBLICATION <replaceable class="parameter">name</replaceable> > ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ] > > The "exception_object" font is wrong. Should look the same as > "publication_object" Modified > ~~~ > > 3. doc/src/sgml/ref/alter_publication.sgml - Examples > > @@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL > TABLES IN SCHEMA marketing, sales; > </programlisting> > </para> > > + <para> > + Alter publication <structname>production_publication</structname> to publish > + all tables except <structname>users</structname> and > + <structname>departments</structname> tables: > +<programlisting> > +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE > users, departments; > +</programlisting></para> > > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will > show TABLE keyword is optional. Modified > ~~~ > > 4. doc/src/sgml/ref/create_publication.sgml > > An SGML tag error caused building the docs to fail. My fix was > previously reported [1]. Modified > ~~~ > > 5. doc/src/sgml/ref/create_publication.sgml > > @@ -22,7 +22,7 @@ PostgreSQL documentation > <refsynopsisdiv> > <synopsis> > CREATE PUBLICATION <replaceable class="parameter">name</replaceable> > - [ FOR ALL TABLES > + [ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ] > > The "exception_object" font is wrong. Should look the same as > "publication_object" Modified > ~~~ > > 6. doc/src/sgml/ref/create_publication.sgml - Examples > > @@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR > TABLE users, departments, ALL TABL > CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales; > </programlisting></para> > > + <para> > + Create a publication that publishes all changes in all the tables except for > + the changes of <structname>users</structname> and > + <structname>departments</structname> table: > +<programlisting> > +CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, departments; > +</programlisting> > + </para> > + > > 6a. > Typo: "FOR ALL TABLE" -> "FOR ALL TABLES" Modified > 6b. > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will > show TABLE keyword is optional. Modified > ~~~ > > 7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication > > @@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List > *ancestors, int *ancestor_level > } > else > { > - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); > - if (list_member_oid(aschemaPubids, puboid)) > + List *aschemapubids = NIL; > + List *aexceptpubids = NIL; > + > + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor)); > + aexceptpubids = GetRelationPublications(ancestor, true); > + if (list_member_oid(aschemapubids, puboid) || > + (puballtables && !list_member_oid(aexceptpubids, puboid))) > { > > You could re-write this as multiple conditions instead of one. That > could avoid always assigning the 'aexceptpubids', so it might be a > more efficient way to write this logic. Modified > ~~~ > > 8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues > > +/* > + * Check if the publication has default values > + * > + * Check the following: > + * Publication is having default options > + * Publication is not associated with relations > + * Publication is not associated with schemas > + * Publication is not set with "FOR ALL TABLES" > + */ > +static bool > +CheckPublicationDefValues(HeapTuple tup) > > 8a. > Remove the tab. Replace with spaces. Modified > 8b. > It might be better if this comment order is the same as the logic order. > e.g. > > * Check the following: > * Publication is not set with "FOR ALL TABLES" > * Publication is having default options > * Publication is not associated with schemas > * Publication is not associated with relations Modified > ~~~ > > 9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables > > +/* > + * Reset the publication. > + * > + * Reset the publication options, publication relations and > publication schemas. > + */ > +static void > +AlterPublicationSetAllTables(Relation rel, HeapTuple tup) > > The function comment and the function name do not seem to match here; > something looks like a cut/paste error ?? Modified > ~~~ > > 10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables > > + /* set all tables option */ > + values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true); > + replaces[Anum_pg_publication_puballtables - 1] = true; > > SUGGEST (comment) > /* set all ALL TABLES flag */ Modified > ~~~ > > 11. src/backend/catalog/pg_publication.c - AlterPublication > > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate, > AlterPublicationStmt *stmt) > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, > stmt->pubname); > > + if (stmt->for_all_tables) > + { > + bool isdefault = CheckPublicationDefValues(tup); > + > + if (!isdefault) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > + errmsg("Setting ALL TABLES requires publication \"%s\" to have > default values", > + stmt->pubname), > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > The errmsg should start with a lowercase letter. Modified > ~~~ > > 12. src/backend/catalog/pg_publication.c - AlterPublication > > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate, > AlterPublicationStmt *stmt) > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, > stmt->pubname); > > + if (stmt->for_all_tables) > + { > + bool isdefault = CheckPublicationDefValues(tup); > + > + if (!isdefault) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > + errmsg("Setting ALL TABLES requires publication \"%s\" to have > default values", > + stmt->pubname), > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > Example test: > > postgres=# create table t1(a int); > CREATE TABLE > postgres=# create publication p1 for table t1; > CREATE PUBLICATION > postgres=# alter publication p1 add all tables except t1; > 2022-05-20 14:34:49.301 AEST [21802] ERROR: Setting ALL TABLES > requires publication "p1" to have default values > 2022-05-20 14:34:49.301 AEST [21802] HINT: Use ALTER PUBLICATION ... > RESET to reset the publication > 2022-05-20 14:34:49.301 AEST [21802] STATEMENT: alter publication p1 > add all tables except t1; > ERROR: Setting ALL TABLES requires publication "p1" to have default values > HINT: Use ALTER PUBLICATION ... RESET to reset the publication > postgres=# alter publication p1 set all tables except t1; > > That error message does not quite match what the user was doing. > Firstly, they were adding the ALL TABLES, not setting it. Secondly, > all the values of the publication were already defaults (only there > was an existing table t1 in the publication). Maybe some minor changes > to the message wording can be a better reflect what the user is doing > here. Modified > ~~~ > > 13. src/backend/parser/gram.y > > @@ -10410,7 +10411,7 @@ AlterOwnerStmt: ALTER AGGREGATE > aggregate_with_argtypes OWNER TO RoleSpec > * > * CREATE PUBLICATION name [WITH options] > * > - * CREATE PUBLICATION FOR ALL TABLES [WITH options] > + * CREATE PUBLICATION FOR ALL TABLES [EXCEPT TABLE table [, ...]] > [WITH options] > > Comment should show the "TABLE" keyword is optional Modified > ~~~ > > 14. src/bin/pg_dump/pg_dump.c - dumpPublicationTable > > @@ -4332,6 +4380,7 @@ dumpPublicationTable(Archive *fout, const > PublicationRelInfo *pubrinfo) > > appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY", > fmtId(pubinfo->dobj.name)); > + > appendPQExpBuffer(query, " %s", > fmtQualifiedDumpable(tbinfo)); > > This additional whitespace seems unrelated to this patch Modified > ~~~ > > 15. src/include/nodes/parsenodes.h > > 15a. > @@ -3999,6 +3999,7 @@ typedef struct PublicationTable > RangeVar *relation; /* relation to be published */ > Node *whereClause; /* qualifications */ > List *columns; /* List of columns in a publication table */ > + bool except; /* except relation */ > } PublicationTable; > > Maybe the comment should be more like similar ones: > /* exclude the relation */ Modified > 15b. > @@ -4007,6 +4008,7 @@ typedef struct PublicationTable > typedef enum PublicationObjSpecType > { > PUBLICATIONOBJ_TABLE, /* A table */ > + PUBLICATIONOBJ_EXCEPT_TABLE, /* An Except table */ > PUBLICATIONOBJ_TABLES_IN_SCHEMA, /* All tables in schema */ > PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA, /* All tables in first element of > > Maybe the comment should be more like: > /* A table to be excluded */ Modified > ~~~ > > 16. src/test/regress/sql/publication.sql > > I did not see any test cases using EXCEPT when the optional TABLE > keyword is omitted. Added a test Thanks for the comments, the v7 patch attached at [1] has the changes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com Regards, Vignesh
On Sat, May 21, 2022 at 11:06 AM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, May 20, 2022 at 11:23 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Below are my review comments for v6-0002. > > > > ====== > > > > 1. Commit message. > > The psql \d family of commands to display excluded tables. > > > > SUGGESTION > > The psql \d family of commands can now display excluded tables. > > Modified > > > ~~~ > > > > 2. doc/src/sgml/ref/alter_publication.sgml > > > > @@ -22,6 +22,7 @@ PostgreSQL documentation > > <refsynopsisdiv> > > <synopsis> > > ALTER PUBLICATION <replaceable class="parameter">name</replaceable> > > ADD <replaceable class="parameter">publication_object</replaceable> [, > > ...] > > +ALTER PUBLICATION <replaceable class="parameter">name</replaceable> > > ADD ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ] > > > > The "exception_object" font is wrong. Should look the same as > > "publication_object" > > Modified > > > ~~~ > > > > 3. doc/src/sgml/ref/alter_publication.sgml - Examples > > > > @@ -214,6 +220,14 @@ ALTER PUBLICATION sales_publication ADD ALL > > TABLES IN SCHEMA marketing, sales; > > </programlisting> > > </para> > > > > + <para> > > + Alter publication <structname>production_publication</structname> to publish > > + all tables except <structname>users</structname> and > > + <structname>departments</structname> tables: > > +<programlisting> > > +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT TABLE > > users, departments; > > +</programlisting></para> > > > > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will > > show TABLE keyword is optional. > > Modified > > > ~~~ > > > > 4. doc/src/sgml/ref/create_publication.sgml > > > > An SGML tag error caused building the docs to fail. My fix was > > previously reported [1]. > > Modified > > > ~~~ > > > > 5. doc/src/sgml/ref/create_publication.sgml > > > > @@ -22,7 +22,7 @@ PostgreSQL documentation > > <refsynopsisdiv> > > <synopsis> > > CREATE PUBLICATION <replaceable class="parameter">name</replaceable> > > - [ FOR ALL TABLES > > + [ FOR ALL TABLES [ EXCEPT [ TABLE ] exception_object [, ... ] ] > > > > The "exception_object" font is wrong. Should look the same as > > "publication_object" > > Modified > > > ~~~ > > > > 6. doc/src/sgml/ref/create_publication.sgml - Examples > > > > @@ -351,6 +366,15 @@ CREATE PUBLICATION production_publication FOR > > TABLE users, departments, ALL TABL > > CREATE PUBLICATION sales_publication FOR ALL TABLES IN SCHEMA marketing, sales; > > </programlisting></para> > > > > + <para> > > + Create a publication that publishes all changes in all the tables except for > > + the changes of <structname>users</structname> and > > + <structname>departments</structname> table: > > +<programlisting> > > +CREATE PUBLICATION mypublication FOR ALL TABLE EXCEPT TABLE users, departments; > > +</programlisting> > > + </para> > > + > > > > 6a. > > Typo: "FOR ALL TABLE" -> "FOR ALL TABLES" > > Modified > > > 6b. > > Consider using "EXCEPT" instead of "EXCEPT TABLE" because that will > > show TABLE keyword is optional. > > Modified > > > ~~~ > > > > 7. src/backend/catalog/pg_publication.c - GetTopMostAncestorInPublication > > > > @@ -316,18 +316,25 @@ GetTopMostAncestorInPublication(Oid puboid, List > > *ancestors, int *ancestor_level > > } > > else > > { > > - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); > > - if (list_member_oid(aschemaPubids, puboid)) > > + List *aschemapubids = NIL; > > + List *aexceptpubids = NIL; > > + > > + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor)); > > + aexceptpubids = GetRelationPublications(ancestor, true); > > + if (list_member_oid(aschemapubids, puboid) || > > + (puballtables && !list_member_oid(aexceptpubids, puboid))) > > { > > > > You could re-write this as multiple conditions instead of one. That > > could avoid always assigning the 'aexceptpubids', so it might be a > > more efficient way to write this logic. > > Modified > > > ~~~ > > > > 8. src/backend/catalog/pg_publication.c - CheckPublicationDefValues > > > > +/* > > + * Check if the publication has default values > > + * > > + * Check the following: > > + * Publication is having default options > > + * Publication is not associated with relations > > + * Publication is not associated with schemas > > + * Publication is not set with "FOR ALL TABLES" > > + */ > > +static bool > > +CheckPublicationDefValues(HeapTuple tup) > > > > 8a. > > Remove the tab. Replace with spaces. > > Modified > > > 8b. > > It might be better if this comment order is the same as the logic order. > > e.g. > > > > * Check the following: > > * Publication is not set with "FOR ALL TABLES" > > * Publication is having default options > > * Publication is not associated with schemas > > * Publication is not associated with relations > > Modified > > > ~~~ > > > > 9. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables > > > > +/* > > + * Reset the publication. > > + * > > + * Reset the publication options, publication relations and > > publication schemas. > > + */ > > +static void > > +AlterPublicationSetAllTables(Relation rel, HeapTuple tup) > > > > The function comment and the function name do not seem to match here; > > something looks like a cut/paste error ?? > > Modified > > > ~~~ > > > > 10. src/backend/catalog/pg_publication.c - AlterPublicationSetAllTables > > > > + /* set all tables option */ > > + values[Anum_pg_publication_puballtables - 1] = BoolGetDatum(true); > > + replaces[Anum_pg_publication_puballtables - 1] = true; > > > > SUGGEST (comment) > > /* set all ALL TABLES flag */ > > Modified > > > ~~~ > > > > 11. src/backend/catalog/pg_publication.c - AlterPublication > > > > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate, > > AlterPublicationStmt *stmt) > > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, > > stmt->pubname); > > > > + if (stmt->for_all_tables) > > + { > > + bool isdefault = CheckPublicationDefValues(tup); > > + > > + if (!isdefault) > > + ereport(ERROR, > > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > > + errmsg("Setting ALL TABLES requires publication \"%s\" to have > > default values", > > + stmt->pubname), > > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > > The errmsg should start with a lowercase letter. > > Modified > > > ~~~ > > > > 12. src/backend/catalog/pg_publication.c - AlterPublication > > > > @@ -1501,6 +1579,20 @@ AlterPublication(ParseState *pstate, > > AlterPublicationStmt *stmt) > > aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_PUBLICATION, > > stmt->pubname); > > > > + if (stmt->for_all_tables) > > + { > > + bool isdefault = CheckPublicationDefValues(tup); > > + > > + if (!isdefault) > > + ereport(ERROR, > > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > > + errmsg("Setting ALL TABLES requires publication \"%s\" to have > > default values", > > + stmt->pubname), > > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > > Example test: > > > > postgres=# create table t1(a int); > > CREATE TABLE > > postgres=# create publication p1 for table t1; > > CREATE PUBLICATION > > postgres=# alter publication p1 add all tables except t1; > > 2022-05-20 14:34:49.301 AEST [21802] ERROR: Setting ALL TABLES > > requires publication "p1" to have default values > > 2022-05-20 14:34:49.301 AEST [21802] HINT: Use ALTER PUBLICATION ... > > RESET to reset the publication > > 2022-05-20 14:34:49.301 AEST [21802] STATEMENT: alter publication p1 > > add all tables except t1; > > ERROR: Setting ALL TABLES requires publication "p1" to have default values > > HINT: Use ALTER PUBLICATION ... RESET to reset the publication > > postgres=# alter publication p1 set all tables except t1; > > > > That error message does not quite match what the user was doing. > > Firstly, they were adding the ALL TABLES, not setting it. Secondly, > > all the values of the publication were already defaults (only there > > was an existing table t1 in the publication). Maybe some minor changes > > to the message wording can be a better reflect what the user is doing > > here. > > Modified > > > ~~~ > > > > 13. src/backend/parser/gram.y > > > > @@ -10410,7 +10411,7 @@ AlterOwnerStmt: ALTER AGGREGATE > > aggregate_with_argtypes OWNER TO RoleSpec > > * > > * CREATE PUBLICATION name [WITH options] > > * > > - * CREATE PUBLICATION FOR ALL TABLES [WITH options] > > + * CREATE PUBLICATION FOR ALL TABLES [EXCEPT TABLE table [, ...]] > > [WITH options] > > > > Comment should show the "TABLE" keyword is optional > > Modified > > > ~~~ > > > > 14. src/bin/pg_dump/pg_dump.c - dumpPublicationTable > > > > @@ -4332,6 +4380,7 @@ dumpPublicationTable(Archive *fout, const > > PublicationRelInfo *pubrinfo) > > > > appendPQExpBuffer(query, "ALTER PUBLICATION %s ADD TABLE ONLY", > > fmtId(pubinfo->dobj.name)); > > + > > appendPQExpBuffer(query, " %s", > > fmtQualifiedDumpable(tbinfo)); > > > > This additional whitespace seems unrelated to this patch > > Modified > > > ~~~ > > > > 15. src/include/nodes/parsenodes.h > > > > 15a. > > @@ -3999,6 +3999,7 @@ typedef struct PublicationTable > > RangeVar *relation; /* relation to be published */ > > Node *whereClause; /* qualifications */ > > List *columns; /* List of columns in a publication table */ > > + bool except; /* except relation */ > > } PublicationTable; > > > > Maybe the comment should be more like similar ones: > > /* exclude the relation */ > > Modified > > > 15b. > > @@ -4007,6 +4008,7 @@ typedef struct PublicationTable > > typedef enum PublicationObjSpecType > > { > > PUBLICATIONOBJ_TABLE, /* A table */ > > + PUBLICATIONOBJ_EXCEPT_TABLE, /* An Except table */ > > PUBLICATIONOBJ_TABLES_IN_SCHEMA, /* All tables in schema */ > > PUBLICATIONOBJ_TABLES_IN_CUR_SCHEMA, /* All tables in first element of > > > > Maybe the comment should be more like: > > /* A table to be excluded */ > > Modified > > > ~~~ > > > > 16. src/test/regress/sql/publication.sql > > > > I did not see any test cases using EXCEPT when the optional TABLE > > keyword is omitted. > > Added a test > > Thanks for the comments, the v7 patch attached at [1] has the changes > for the same. > [1] - https://www.postgresql.org/message-id/CALDaNm3EpX3%2BRu%3DSNaYi%3DUW5ZLE6nNhGRHZ7a8-fXPZ_-gLdxQ%40mail.gmail.com Attached v7 patch which fixes the buildfarm warning for an unused warning in release mode as in [1]. [1] - https://cirrus-ci.com/task/6220288017825792 Regards, Vignesh
Attachment
On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > Attached v7 patch which fixes the buildfarm warning for an unused warning in > release mode as in [1]. Hi, thank you for the patches. I'll share several review comments. For v7-0001. (1) I'll suggest some minor rewording. + <para> + The <literal>RESET</literal> clause will reset the publication to the + default state which includes resetting the publication options, setting + <literal>ALL TABLES</literal> flag to <literal>false</literal> and + dropping all relations and schemas that are associated with the publication. My suggestion is "The RESET clause will reset the publication to the default state. It resets the publication operations, sets ALL TABLES flag to false and drops all relations and schemas associated with the publication." (2) typo and rewording +/* + * Reset the publication. + * + * Reset the publication options, setting ALL TABLES flag to false and drop + * all relations and schemas that are associated with the publication. + */ The "setting" in this sentence should be "set". How about changing like below ? FROM: "Reset the publication options, setting ALL TABLES flag to false and drop all relations and schemas that are associated with the publication." TO: "Reset the publication operations, set ALL TABLES flag to false and drop all relations and schemas associated with the publication." (3) AlterPublicationReset Do we need to call CacheInvalidateRelcacheAll() or InvalidatePublicationRels() at the end of AlterPublicationReset() like AlterPublicationOptions() ? For v7-0002. (4) + if (stmt->for_all_tables) + { + bool isdefault = CheckPublicationDefValues(tup); + + if (!isdefault) + ereport(ERROR, + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("adding ALL TABLES requires the publication to have default publication options,no tables/.... + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); The errmsg string has three messages for user and is a bit long (we have two sentences there connected by 'and'). Can't we make it concise and split it into a couple of lines for code readability ? I'll suggest a change below. FROM: "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLESflag should not be set" TO: "adding ALL TABLES requires the publication defined not for ALL TABLES" "to have default publish actions without any associated tables/schemas" (5) typo <varlistentry> + <term><literal>EXCEPT TABLE</literal></term> + <listitem> + <para> + This clause specifies a list of tables to exclude from the publication. + It can only be used with <literal>FOR ALL TABLES</literal>. + </para> + </listitem> + </varlistentry> + Kindly change FROM: This clause specifies a list of tables to exclude from the publication. TO: This clause specifies a list of tables to be excluded from the publication. or This clause specifies a list of tables excluded from the publication. (6) Minor suggestion for an expression change Marks the publication as one that replicates changes for all tables in - the database, including tables created in the future. + the database, including tables created in the future. If + <literal>EXCEPT TABLE</literal> is specified, then exclude replicating + the changes for the specified tables. I'll suggest a minor rewording. FROM: ...exclude replicating the changes for the specified tables TO: ...exclude replication changes for the specified tables (7) (7-1) +/* + * Check if the publication has default values + * + * Check the following: + * a) Publication is not set with "FOR ALL TABLES" + * b) Publication is having default options + * c) Publication is not associated with schemas + * d) Publication is not associated with relations + */ +static bool +CheckPublicationDefValues(HeapTuple tup) I think this header comment can be improved. FROM: Check the following: TO: Returns true if the publication satisfies all the following conditions: (7-2) b) should be changed as well FROM: Publication is having default options TO: Publication has the default publish operations Best Regards, Takamichi Osumi
Here are some minor review comments for v7-0001. ====== 1. General Probably the commit message and all the PG docs and code comments should be changed to refer to "publication parameters" instead of (currently) "publication options". This is because these things are really called "publication_parameters" in the PG docs [1]. All the following review comments are just examples of this suggestion. ~~~ 2. Commit message "includes resetting the publication options," -> "includes resetting the publication parameters," ~~~ 3. doc/src/sgml/ref/alter_publication.sgml + <para> + The <literal>RESET</literal> clause will reset the publication to the + default state which includes resetting the publication options, setting + <literal>ALL TABLES</literal> flag to <literal>false</literal> and + dropping all relations and schemas that are associated with the publication. </para> "resetting the publication options," -> "resetting the publication parameters," ~~~ 4. src/backend/commands/publicationcmds.c @@ -53,6 +53,14 @@ #include "utils/syscache.h" #include "utils/varlena.h" +/* CREATE PUBLICATION default values for flags and options */ +#define PUB_DEFAULT_ACTION_INSERT true +#define PUB_DEFAULT_ACTION_UPDATE true +#define PUB_DEFAULT_ACTION_DELETE true +#define PUB_DEFAULT_ACTION_TRUNCATE true +#define PUB_DEFAULT_VIA_ROOT false +#define PUB_DEFAULT_ALL_TABLES false "flags and options" -> "flags and publication parameters" ~~~ 5. src/backend/commands/publicationcmds.c +/* + * Reset the publication. + * + * Reset the publication options, setting ALL TABLES flag to false and drop + * all relations and schemas that are associated with the publication. + */ +static void +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt, + Relation rel, HeapTuple tup) "Reset the publication options," -> "Reset the publication parameters," ~~~ 6. src/test/regress/sql/publication.sql +-- Verify that publish options and publish_via_partition_root option are reset +\dRp+ testpub_reset +ALTER PUBLICATION testpub_reset RESET; +\dRp+ testpub_reset SUGGESTION -- Verify that 'publish' and 'publish_via_partition_root' publication parameters are reset ------ [1] https://www.postgresql.org/docs/current/sql-createpublication.html Kind Regards, Peter Smith. Fujitsu Australia
Here are my review comments for patch v7-0002. ====== 1. doc/src/sgml/logical-replication.sgml @@ -1167,8 +1167,9 @@ CONTEXT: processing remote data for replication origin "pg_16395" during "INSER <para> To add tables to a publication, the user must have ownership rights on the table. To add all tables in schema to a publication, the user must be a - superuser. To create a publication that publishes all tables or all tables in - schema automatically, the user must be a superuser. + superuser. To add all tables to a publication, the user must be a superuser. + To create a publication that publishes all tables or all tables in schema + automatically, the user must be a superuser. </para> I felt that maybe this whole paragraph should be rearranged. Put the "create publication" parts before the "alter publication" parts; Re-word the sentences more similarly. I also felt the ALL TABLES and ALL TABLES IN SCHEMA etc should be written uppercase/literal since that is what was meant. SUGGESTION To create a publication using FOR ALL TABLES or FOR ALL TABLES IN SCHEMA, the user must be a superuser. To add ALL TABLES or ALL TABLES IN SCHEMA to a publication, the user must be a superuser. To add tables to a publication, the user must have ownership rights on the table. ~~~ 2. doc/src/sgml/ref/alter_publication.sgml @@ -82,8 +88,8 @@ ALTER PUBLICATION <replaceable class="parameter">name</replaceable> RESET <para> You must own the publication to use <command>ALTER PUBLICATION</command>. - Adding a table to a publication additionally requires owning that table. - The <literal>ADD ALL TABLES IN SCHEMA</literal>, + Adding a table to or excluding a table from a publication additionally + requires owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal>, <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and Isn't this missing some information that says ADD ALL TABLES requires the invoking user to be a superuser? ~~~ 3. doc/src/sgml/ref/alter_publication.sgml - examples + <para> + Alter publication <structname>production_publication</structname> to publish + all tables except <structname>users</structname> and + <structname>departments</structname> tables: +<programlisting> +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT users, departments; +</programlisting></para> + I didn't think it needs to say "tables" 2x (e.g. remove the last "tables") ~~~ 4. doc/src/sgml/ref/create_publication.sgml - examples + <para> + Create a publication that publishes all changes in all the tables except for + the changes of <structname>users</structname> and + <structname>departments</structname> tables: +<programlisting> +CREATE PUBLICATION mypublication FOR ALL TABLES EXCEPT users, departments; +</programlisting> + </para> I didn't think it needs to say "tables" 2x (e.g. remove the last "tables") ~~~ 5. src/backend/catalog/pg_publication.c foreach(lc, ancestors) { Oid ancestor = lfirst_oid(lc); - List *apubids = GetRelationPublications(ancestor); - List *aschemaPubids = NIL; + List *apubids = GetRelationPublications(ancestor, false); + List *aschemapubids = NIL; + List *aexceptpubids = NIL; level++; - if (list_member_oid(apubids, puboid)) + /* check if member of table publications */ + if (!list_member_oid(apubids, puboid)) { - topmost_relid = ancestor; - - if (ancestor_level) - *ancestor_level = level; - } - else - { - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); - if (list_member_oid(aschemaPubids, puboid)) + /* check if member of schema publications */ + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor)); + if (!list_member_oid(aschemapubids, puboid)) { - topmost_relid = ancestor; - - if (ancestor_level) - *ancestor_level = level; + /* + * If the publication is all tables publication and the table + * is not part of exception tables. + */ + if (puballtables) + { + aexceptpubids = GetRelationPublications(ancestor, true); + if (list_member_oid(aexceptpubids, puboid)) + goto next; + } + else + goto next; } } + topmost_relid = ancestor; + + if (ancestor_level) + *ancestor_level = level; + +next: list_free(apubids); - list_free(aschemaPubids); + list_free(aschemapubids); + list_free(aexceptpubids); } I felt those negative (!) conditions and those goto are making this logic hard to understand. Can’t it be simplified more than this? Even just having another bool flag might help make it easier. e.g. Perhaps something a bit like this (but add some comments) foreach(lc, ancestors) { Oid ancestor = lfirst_oid(lc); List *apubids = GetRelationPublications(ancestor); List *aschemaPubids = NIL; List *aexceptpubids = NIL; bool set_top = false; level++; set_top = list_member_oid(apubids, puboid); if (!set_top) { aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); set_top = list_member_oid(aschemaPubids, puboid); if (!set_top && puballtables) { aexceptpubids = GetRelationPublications(ancestor, true); set_top = !list_member_oid(aexceptpubids, puboid); } } if (set_top) { topmost_relid = ancestor; if (ancestor_level) *ancestor_level = level; } list_free(apubids); list_free(aschemapubids); list_free(aexceptpubids); } ------ 6. src/backend/commands/publicationcmds.c - CheckPublicationDefValues +/* + * Check if the publication has default values + * + * Check the following: + * a) Publication is not set with "FOR ALL TABLES" + * b) Publication is having default options + * c) Publication is not associated with schemas + * d) Publication is not associated with relations + */ +static bool +CheckPublicationDefValues(HeapTuple tup) I think Osumi-san already gave a review [1] about this same comment. So I only wanted to add that it should not say "options" here: "default options" -> "default publication parameter values" ~~~ 7. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables +#ifdef USE_ASSERT_CHECKING + Assert(!pubform->puballtables); +#endif Why is this #ifdef needed? Isn't that logic built into the Assert macro already? ~~~ 8. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables + /* set ALL TABLES flag */ Use uppercase 'S' to match other comments. ~~~ 9. src/backend/commands/publicationcmds.c - AlterPublication + if (!isdefault) + ereport(ERROR, + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), + errmsg("adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLES flag should not be set"), + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); IMO this errmsg text is not very good but I think Osumi-san [1] has also given a review comment about the same errmsg. So I only wanted to add that should not say "options" here: "default publication options" -> "default publication parameter values" ~~~ 10. src/backend/parser/gram.y /***************************************************************************** * * ALTER PUBLICATION name SET ( options ) * * ALTER PUBLICATION name ADD pub_obj [, ...] * * ALTER PUBLICATION name DROP pub_obj [, ...] * * ALTER PUBLICATION name SET pub_obj [, ...] * * ALTER PUBLICATION name RESET * * pub_obj is one of: * * TABLE table_name [, ...] * ALL TABLES IN SCHEMA schema_name [, ...] * *****************************************************************************/ - Should the above comment be updated to mention also ADD ALL TABLES ... EXCEPT [TABLE] ... ~~~ 11. src/bin/pg_dump/pg_dump.c - dumpPublication + /* Include exception tables if the publication has except tables */ + for (cell = exceptinfo.head; cell; cell = cell->next) + { + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr; + PublicationInfo *relpubinfo = pubrinfo->publication; + TableInfo *tbinfo; + + if (pubinfo == relpubinfo) I am unsure if that variable 'relpubinfo' is of much use; it is only used one time. ~~~ 12. src/bin/pg_dump/t/002_pg_dump.pl I think there should be more test cases here: E.g.1. EXCEPT TABLE should also test a list of tables E.g.2. EXCEPT with optional TABLE keyword ommitted ~~~ 13. src/bin/psql/describe.c - question about the SQL Since the new 'except' is a boolean column, wouldn't it be more natural if all the SQL was treating it as one? e.g. should the SQL be saying "IS preexpect", "IS NOT prexcept"; instead of comparing preexpect to 't' and 'f' character. ~~~ 14. .../t/032_rep_changes_except_table.pl +# Test replication with publications created using FOR ALL TABLES EXCEPT TABLE +# option. +# Create schemas and tables on publisher "option" -> "clause" ------ [1] https://www.postgresql.org/message-id/TYCPR01MB83730A2F1D6A5303E9C1416AEDD99%40TYCPR01MB8373.jpnprd01.prod.outlook.com Kind Regards, Peter Smith. Fujitsu Australia
On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com <osumi.takamichi@fujitsu.com> wrote: > > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > Attached v7 patch which fixes the buildfarm warning for an unused warning in > > release mode as in [1]. > Hi, thank you for the patches. > > > I'll share several review comments. > > For v7-0001. > > (1) I'll suggest some minor rewording. > > + <para> > + The <literal>RESET</literal> clause will reset the publication to the > + default state which includes resetting the publication options, setting > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > + dropping all relations and schemas that are associated with the publication. > > My suggestion is > "The RESET clause will reset the publication to the > default state. It resets the publication operations, > sets ALL TABLES flag to false and drops all relations > and schemas associated with the publication." I felt the existing looks better. I would prefer to keep it that way. > (2) typo and rewording > > +/* > + * Reset the publication. > + * > + * Reset the publication options, setting ALL TABLES flag to false and drop > + * all relations and schemas that are associated with the publication. > + */ > > The "setting" in this sentence should be "set". > > How about changing like below ? > FROM: > "Reset the publication options, setting ALL TABLES flag to false and drop > all relations and schemas that are associated with the publication." > TO: > "Reset the publication operations, set ALL TABLES flag to false and drop > all relations and schemas associated with the publication." I felt the existing looks better. I would prefer to keep it that way. > (3) AlterPublicationReset > > Do we need to call CacheInvalidateRelcacheAll() or > InvalidatePublicationRels() at the end of > AlterPublicationReset() like AlterPublicationOptions() ? CacheInvalidateRelcacheAll should be called if we change all tables from true to false, else the cache will not be invalidated. Modified > > For v7-0002. > > (4) > > + if (stmt->for_all_tables) > + { > + bool isdefault = CheckPublicationDefValues(tup); > + > + if (!isdefault) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > + errmsg("adding ALL TABLES requires the publication to have default publicationoptions, no tables/.... > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > The errmsg string has three messages for user and is a bit long > (we have two sentences there connected by 'and'). > Can't we make it concise and split it into a couple of lines for code readability ? > > I'll suggest a change below. > FROM: > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALL TABLESflag should not be set" > TO: > "adding ALL TABLES requires the publication defined not for ALL TABLES" > "to have default publish actions without any associated tables/schemas" Added errdetail and split it > (5) typo > > <varlistentry> > + <term><literal>EXCEPT TABLE</literal></term> > + <listitem> > + <para> > + This clause specifies a list of tables to exclude from the publication. > + It can only be used with <literal>FOR ALL TABLES</literal>. > + </para> > + </listitem> > + </varlistentry> > + > > Kindly change > FROM: > This clause specifies a list of tables to exclude from the publication. > TO: > This clause specifies a list of tables to be excluded from the publication. > or > This clause specifies a list of tables excluded from the publication. Modified > (6) Minor suggestion for an expression change > > Marks the publication as one that replicates changes for all tables in > - the database, including tables created in the future. > + the database, including tables created in the future. If > + <literal>EXCEPT TABLE</literal> is specified, then exclude replicating > + the changes for the specified tables. > > > I'll suggest a minor rewording. > FROM: > ...exclude replicating the changes for the specified tables > TO: > ...exclude replication changes for the specified tables I felt the existing is better. > (7) > (7-1) > > +/* > + * Check if the publication has default values > + * > + * Check the following: > + * a) Publication is not set with "FOR ALL TABLES" > + * b) Publication is having default options > + * c) Publication is not associated with schemas > + * d) Publication is not associated with relations > + */ > +static bool > +CheckPublicationDefValues(HeapTuple tup) > > > I think this header comment can be improved. > FROM: > Check the following: > TO: > Returns true if the publication satisfies all the following conditions: Modified > (7-2) > > b) should be changed as well > FROM: > Publication is having default options > TO: > Publication has the default publish operations Changed it to "Publication is having default publication parameter values" Thanks for the comments, the attached v8 patch has the changes for the same. Regards, Vignesh
Attachment
'On Mon, May 30, 2022 at 1:51 PM Peter Smith <smithpb2250@gmail.com> wrote: > > Here are some minor review comments for v7-0001. > > ====== > > 1. General > > Probably the commit message and all the PG docs and code comments > should be changed to refer to "publication parameters" instead of > (currently) "publication options". This is because these things are > really called "publication_parameters" in the PG docs [1]. > > All the following review comments are just examples of this suggestion. Modified > ~~~ > > 2. Commit message > > "includes resetting the publication options," -> "includes resetting > the publication parameters," Modified > ~~~ > > 3. doc/src/sgml/ref/alter_publication.sgml > > + <para> > + The <literal>RESET</literal> clause will reset the publication to the > + default state which includes resetting the publication options, setting > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > + dropping all relations and schemas that are associated with the publication. > </para> > > > "resetting the publication options," -> "resetting the publication parameters," Modified > ~~~ > > 4. src/backend/commands/publicationcmds.c > > @@ -53,6 +53,14 @@ > #include "utils/syscache.h" > #include "utils/varlena.h" > > +/* CREATE PUBLICATION default values for flags and options */ > +#define PUB_DEFAULT_ACTION_INSERT true > +#define PUB_DEFAULT_ACTION_UPDATE true > +#define PUB_DEFAULT_ACTION_DELETE true > +#define PUB_DEFAULT_ACTION_TRUNCATE true > +#define PUB_DEFAULT_VIA_ROOT false > +#define PUB_DEFAULT_ALL_TABLES false > > "flags and options" -> "flags and publication parameters" Modified > ~~~ > > 5. src/backend/commands/publicationcmds.c > > +/* > + * Reset the publication. > + * > + * Reset the publication options, setting ALL TABLES flag to false and drop > + * all relations and schemas that are associated with the publication. > + */ > +static void > +AlterPublicationReset(ParseState *pstate, AlterPublicationStmt *stmt, > + Relation rel, HeapTuple tup) > > "Reset the publication options," -> "Reset the publication parameters," Modified > ~~~ > > 6. src/test/regress/sql/publication.sql > > +-- Verify that publish options and publish_via_partition_root option are reset > +\dRp+ testpub_reset > +ALTER PUBLICATION testpub_reset RESET; > +\dRp+ testpub_reset > > SUGGESTION > -- Verify that 'publish' and 'publish_via_partition_root' publication > parameters are reset Modified, I have split this into two tests as it will help the 0002 patch to add few tests with the existing steps for 'publish' and 'publish_via_partition_root' publication parameter. Thanks for the comments. the v8 patch attached at [1] has the fixes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm0sAU4s1KTLOEWv%3DrYo5dQK6uFTJn_0FKj3XG1Nv4D-qw%40mail.gmail.com Regards, Vignesh
On Tue, May 31, 2022 at 11:51 AM Peter Smith <smithpb2250@gmail.com> wrote: > > Here are my review comments for patch v7-0002. > > ====== > > 1. doc/src/sgml/logical-replication.sgml > > @@ -1167,8 +1167,9 @@ CONTEXT: processing remote data for replication > origin "pg_16395" during "INSER > <para> > To add tables to a publication, the user must have ownership rights on the > table. To add all tables in schema to a publication, the user must be a > - superuser. To create a publication that publishes all tables or > all tables in > - schema automatically, the user must be a superuser. > + superuser. To add all tables to a publication, the user must be a superuser. > + To create a publication that publishes all tables or all tables in schema > + automatically, the user must be a superuser. > </para> > > I felt that maybe this whole paragraph should be rearranged. Put the > "create publication" parts before the "alter publication" parts; > Re-word the sentences more similarly. I also felt the ALL TABLES and > ALL TABLES IN SCHEMA etc should be written uppercase/literal since > that is what was meant. > > SUGGESTION > To create a publication using FOR ALL TABLES or FOR ALL TABLES IN > SCHEMA, the user must be a superuser. To add ALL TABLES or ALL TABLES > IN SCHEMA to a publication, the user must be a superuser. To add > tables to a publication, the user must have ownership rights on the > table. Modified > ~~~ > > 2. doc/src/sgml/ref/alter_publication.sgml > > @@ -82,8 +88,8 @@ ALTER PUBLICATION <replaceable > class="parameter">name</replaceable> RESET > > <para> > You must own the publication to use <command>ALTER PUBLICATION</command>. > - Adding a table to a publication additionally requires owning that table. > - The <literal>ADD ALL TABLES IN SCHEMA</literal>, > + Adding a table to or excluding a table from a publication additionally > + requires owning that table. The <literal>ADD ALL TABLES IN SCHEMA</literal>, > <literal>SET ALL TABLES IN SCHEMA</literal> to a publication and > > Isn't this missing some information that says ADD ALL TABLES requires > the invoking user to be a superuser? Modified > ~~~ > > 3. doc/src/sgml/ref/alter_publication.sgml - examples > > + <para> > + Alter publication <structname>production_publication</structname> to publish > + all tables except <structname>users</structname> and > + <structname>departments</structname> tables: > +<programlisting> > +ALTER PUBLICATION production_publication ADD ALL TABLES EXCEPT users, > departments; > +</programlisting></para> > + > > I didn't think it needs to say "tables" 2x (e.g. remove the last "tables") Modified > ~~~ > > 4. doc/src/sgml/ref/create_publication.sgml - examples > > + <para> > + Create a publication that publishes all changes in all the tables except for > + the changes of <structname>users</structname> and > + <structname>departments</structname> tables: > +<programlisting> > +CREATE PUBLICATION mypublication FOR ALL TABLES EXCEPT users, departments; > +</programlisting> > + </para> > > I didn't think it needs to say "tables" 2x (e.g. remove the last "tables") Modified > ~~~ > > 5. src/backend/catalog/pg_publication.c > > foreach(lc, ancestors) > { > Oid ancestor = lfirst_oid(lc); > - List *apubids = GetRelationPublications(ancestor); > - List *aschemaPubids = NIL; > + List *apubids = GetRelationPublications(ancestor, false); > + List *aschemapubids = NIL; > + List *aexceptpubids = NIL; > > level++; > > - if (list_member_oid(apubids, puboid)) > + /* check if member of table publications */ > + if (!list_member_oid(apubids, puboid)) > { > - topmost_relid = ancestor; > - > - if (ancestor_level) > - *ancestor_level = level; > - } > - else > - { > - aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); > - if (list_member_oid(aschemaPubids, puboid)) > + /* check if member of schema publications */ > + aschemapubids = GetSchemaPublications(get_rel_namespace(ancestor)); > + if (!list_member_oid(aschemapubids, puboid)) > { > - topmost_relid = ancestor; > - > - if (ancestor_level) > - *ancestor_level = level; > + /* > + * If the publication is all tables publication and the table > + * is not part of exception tables. > + */ > + if (puballtables) > + { > + aexceptpubids = GetRelationPublications(ancestor, true); > + if (list_member_oid(aexceptpubids, puboid)) > + goto next; > + } > + else > + goto next; > } > } > > + topmost_relid = ancestor; > + > + if (ancestor_level) > + *ancestor_level = level; > + > +next: > list_free(apubids); > - list_free(aschemaPubids); > + list_free(aschemapubids); > + list_free(aexceptpubids); > } > > > I felt those negative (!) conditions and those goto are making this > logic hard to understand. Can’t it be simplified more than this? Even > just having another bool flag might help make it easier. > > e.g. Perhaps something a bit like this (but add some comments) > > foreach(lc, ancestors) > { > Oid ancestor = lfirst_oid(lc); > List *apubids = GetRelationPublications(ancestor); > List *aschemaPubids = NIL; > List *aexceptpubids = NIL; > bool set_top = false; > level++; > > set_top = list_member_oid(apubids, puboid); > if (!set_top) > { > aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)); > set_top = list_member_oid(aschemaPubids, puboid); > > if (!set_top && puballtables) > { > aexceptpubids = GetRelationPublications(ancestor, true); > set_top = !list_member_oid(aexceptpubids, puboid); > } > } > if (set_top) > { > topmost_relid = ancestor; > > if (ancestor_level) > *ancestor_level = level; > } > > list_free(apubids); > list_free(aschemapubids); > list_free(aexceptpubids); > } Modified > ------ > > 6. src/backend/commands/publicationcmds.c - CheckPublicationDefValues > > +/* > + * Check if the publication has default values > + * > + * Check the following: > + * a) Publication is not set with "FOR ALL TABLES" > + * b) Publication is having default options > + * c) Publication is not associated with schemas > + * d) Publication is not associated with relations > + */ > +static bool > +CheckPublicationDefValues(HeapTuple tup) > > I think Osumi-san already gave a review [1] about this same comment. > > So I only wanted to add that it should not say "options" here: > "default options" -> "default publication parameter values" Modified > ~~~ > > 7. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables > > +#ifdef USE_ASSERT_CHECKING > + Assert(!pubform->puballtables); > +#endif > > Why is this #ifdef needed? Isn't that logic built into the Assert macro already? pubform is used only for assert case. If we don't use it within #ifdef or PG_USED_FOR_ASSERTS_ONLY, it will throw a unused variable error without --enable-cassert like: publicationcmds.c: In function ‘AlterPublicationSetAllTables’: publicationcmds.c:1250:29: error: unused variable ‘pubform’ [-Werror=unused-variable] 1250 | Form_pg_publication pubform = (Form_pg_publication) GETSTRUCT(tup); | ^~~~~~~ cc1: all warnings being treated as errors > ~~~ > > 8. src/backend/commands/publicationcmds.c - AlterPublicationSetAllTables > > + /* set ALL TABLES flag */ > > Use uppercase 'S' to match other comments. Modified > ~~~ > > 9. src/backend/commands/publicationcmds.c - AlterPublication > > + if (!isdefault) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > + errmsg("adding ALL TABLES requires the publication to have default > publication options, no tables/schemas associated and ALL TABLES flag > should not be set"), > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > IMO this errmsg text is not very good but I think Osumi-san [1] has > also given a review comment about the same errmsg. > > So I only wanted to add that should not say "options" here: > "default publication options" -> "default publication parameter values" Modified > ~~~ > > 10. src/backend/parser/gram.y > > /***************************************************************************** > * > * ALTER PUBLICATION name SET ( options ) > * > * ALTER PUBLICATION name ADD pub_obj [, ...] > * > * ALTER PUBLICATION name DROP pub_obj [, ...] > * > * ALTER PUBLICATION name SET pub_obj [, ...] > * > * ALTER PUBLICATION name RESET > * > * pub_obj is one of: > * > * TABLE table_name [, ...] > * ALL TABLES IN SCHEMA schema_name [, ...] > * > *****************************************************************************/ > > - > > Should the above comment be updated to mention also ADD ALL TABLES > ... EXCEPT [TABLE] ... Modified > ~~~ > > 11. src/bin/pg_dump/pg_dump.c - dumpPublication > > + /* Include exception tables if the publication has except tables */ > + for (cell = exceptinfo.head; cell; cell = cell->next) > + { > + PublicationRelInfo *pubrinfo = (PublicationRelInfo *) cell->ptr; > + PublicationInfo *relpubinfo = pubrinfo->publication; > + TableInfo *tbinfo; > + > + if (pubinfo == relpubinfo) > > I am unsure if that variable 'relpubinfo' is of much use; it is only > used one time. Removed relpubinfo > ~~~ > > 12. src/bin/pg_dump/t/002_pg_dump.pl > > I think there should be more test cases here: > > E.g.1. EXCEPT TABLE should also test a list of tables > > E.g.2. EXCEPT with optional TABLE keyword ommitted Added a test for list of tables and modified one of the test to remove TABLE. > ~~~ > > 13. src/bin/psql/describe.c - question about the SQL > > Since the new 'except' is a boolean column, wouldn't it be more > natural if all the SQL was treating it as one? > > e.g. should the SQL be saying "IS preexpect", "IS NOT prexcept"; > instead of comparing preexpect to 't' and 'f' character. modified > ~~~ > > 14. .../t/032_rep_changes_except_table.pl > > +# Test replication with publications created using FOR ALL TABLES EXCEPT TABLE > +# option. > +# Create schemas and tables on publisher > > "option" -> "clause" Modified. Thanks for the comments. The v8 patch attached at [1] has the fixes for the same. [1] - https://www.postgresql.org/message-id/CALDaNm0sAU4s1KTLOEWv%3DrYo5dQK6uFTJn_0FKj3XG1Nv4D-qw%40mail.gmail.com Regards, Vignesh
On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote: > > Thanks for the comments, the attached v8 patch has the changes for the same. > AFAICS, the summary of this proposal is that we want to support exclude of certain objects from publication with two kinds of variants. The first variant is to add support to exclude specific tables from ALL TABLES PUBLICATION. Without this feature, users need to manually add all tables for a database even when she wants to avoid only a handful of tables from the database say because they contain sensitive information or are not required. We have seen that other database like MySQL also provides similar feature [1] (See REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as follows: CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; or ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2; This will allow us to publish all the tables in the current database except t1 and t2. Now, I see that pg_dump has a similar option provided by switch --exclude-table but that allows tables matching patterns which is not the case here. I am not sure if we need a similar variant here. Then users will be allowed to reset the publication by: ALTER PUBLICATION pub1 RESET; This will reset the publication to the default state which includes resetting the publication parameters, setting the ALL TABLES flag to false, and dropping the relations and schemas that are associated with the publication. I don't know if we want to go further with allowing to RESET specific parameters and if so which parameters and what would its syntax be? The second variant is to add support to exclude certain columns of a table while publishing a particular table. Currently, users need to list all required columns' names even if they don't want to hide most of the columns in the table (for example Create Publication pub For Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary' or other sensitive information of executives/employees but would like to publish all other columns. I feel in such cases it will be a lot of work for the user especially when the table has many columns. I see that Oracle has a similar feature [2]. I think without this it will be difficult for users to use this feature in some cases. The patch for this is not proposed but I would imagine syntax for it to be something like "Create Publication pub For Table t1 Except (c3)" and similar variants for Alter Publication. Have I missed anything? Thoughts on the proposal/syntax would be appreciated? [1] - https://dev.mysql.com/doc/refman/5.7/en/change-replication-filter.html [2] - https://docs.oracle.com/en/cloud/paas/goldengate-cloud/gwuad/selecting-columns.html#GUID-9A851C8B-48F7-43DF-8D98-D086BE069E20 -- With Regards, Amit Kapila.
On Wednesday, June 8, 2022 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Thanks for the comments, the attached v8 patch has the changes for the > same. > > > > AFAICS, the summary of this proposal is that we want to support > exclude of certain objects from publication with two kinds of > variants. The first variant is to add support to exclude specific > tables from ALL TABLES PUBLICATION. Without this feature, users need > to manually add all tables for a database even when she wants to avoid > only a handful of tables from the database say because they contain > sensitive information or are not required. We have seen that other > database like MySQL also provides similar feature [1] (See > REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as > follows: > > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > or > ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2; > > This will allow us to publish all the tables in the current database > except t1 and t2. Now, I see that pg_dump has a similar option > provided by switch --exclude-table but that allows tables matching > patterns which is not the case here. I am not sure if we need a > similar variant here. > > Then users will be allowed to reset the publication by: > ALTER PUBLICATION pub1 RESET; > > This will reset the publication to the default state which includes > resetting the publication parameters, setting the ALL TABLES flag to > false, and dropping the relations and schemas that are associated with > the publication. I don't know if we want to go further with allowing > to RESET specific parameters and if so which parameters and what would > its syntax be? > > The second variant is to add support to exclude certain columns of a > table while publishing a particular table. Currently, users need to > list all required columns' names even if they don't want to hide most > of the columns in the table (for example Create Publication pub For > Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary' > or other sensitive information of executives/employees but would like > to publish all other columns. I feel in such cases it will be a lot of > work for the user especially when the table has many columns. I see > that Oracle has a similar feature [2]. I think without this it will be > difficult for users to use this feature in some cases. The patch for > this is not proposed but I would imagine syntax for it to be something > like "Create Publication pub For Table t1 Except (c3)" and similar > variants for Alter Publication. I think the feature to exclude certain columns of a table would be useful. In some production scenarios, we usually do not want to replicate sensitive fields(column) in the table. Although we already can achieve this by specify all replicated columns in the list[1], but that seems a hard work when the table has hundreds of columns. [1] CREATE TABLE test(a int, b int, c int,..., sensitive text); CRAETE PUBLICATION pub FOR TABLE test(a,b,c,...); In addition, it's not easy to maintain the column list like above. Because we sometimes need to add new fields or delete fields due to business needs. Every time we add a column(or delete a column in column list), we need to update the column list. If we support Except: CRAETE PUBLICATION pub FOR TABLE test EXCEPT (sensitive); We don't need to update the column list in most cases. Thanks for "hametan" for providing the use case off-list. Best regards, Hou zj
On Tue, Jun 14, 2022 at 9:10 AM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote: > > On Wednesday, June 8, 2022 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Jun 3, 2022 at 3:37 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Thanks for the comments, the attached v8 patch has the changes for the > > same. > > > > > > > AFAICS, the summary of this proposal is that we want to support > > exclude of certain objects from publication with two kinds of > > variants. The first variant is to add support to exclude specific > > tables from ALL TABLES PUBLICATION. Without this feature, users need > > to manually add all tables for a database even when she wants to avoid > > only a handful of tables from the database say because they contain > > sensitive information or are not required. We have seen that other > > database like MySQL also provides similar feature [1] (See > > REPLICATE_WILD_IGNORE_TABLE). The proposed syntax for this is as > > follows: > > > > CREATE PUBLICATION pub1 FOR ALL TABLES EXCEPT TABLE t1,t2; > > or > > ALTER PUBLICATION pub1 ADD ALL TABLES EXCEPT TABLE t1,t2; > > > > This will allow us to publish all the tables in the current database > > except t1 and t2. Now, I see that pg_dump has a similar option > > provided by switch --exclude-table but that allows tables matching > > patterns which is not the case here. I am not sure if we need a > > similar variant here. > > > > Then users will be allowed to reset the publication by: > > ALTER PUBLICATION pub1 RESET; > > > > This will reset the publication to the default state which includes > > resetting the publication parameters, setting the ALL TABLES flag to > > false, and dropping the relations and schemas that are associated with > > the publication. I don't know if we want to go further with allowing > > to RESET specific parameters and if so which parameters and what would > > its syntax be? > > > > The second variant is to add support to exclude certain columns of a > > table while publishing a particular table. Currently, users need to > > list all required columns' names even if they don't want to hide most > > of the columns in the table (for example Create Publication pub For > > Table t1 (c1, c2)). Consider user doesn't want to publish the 'salary' > > or other sensitive information of executives/employees but would like > > to publish all other columns. I feel in such cases it will be a lot of > > work for the user especially when the table has many columns. I see > > that Oracle has a similar feature [2]. I think without this it will be > > difficult for users to use this feature in some cases. The patch for > > this is not proposed but I would imagine syntax for it to be something > > like "Create Publication pub For Table t1 Except (c3)" and similar > > variants for Alter Publication. > > I think the feature to exclude certain columns of a table would be useful. > > In some production scenarios, we usually do not want to replicate > sensitive fields(column) in the table. Although we already can achieve > this by specify all replicated columns in the list[1], but that seems a > hard work when the table has hundreds of columns. > > [1] > CREATE TABLE test(a int, b int, c int,..., sensitive text); > CRAETE PUBLICATION pub FOR TABLE test(a,b,c,...); > > In addition, it's not easy to maintain the column list like above. Because > we sometimes need to add new fields or delete fields due to business > needs. Every time we add a column(or delete a column in column list), we > need to update the column list. > > If we support Except: > CRAETE PUBLICATION pub FOR TABLE test EXCEPT (sensitive); > > We don't need to update the column list in most cases. > Right, this is a valid point and I think it makes sense for me to support such a feature for column list and also to exclude a particular table(s) from the ALL TABLES publication. Peter E., Euler, and others, do you have any objections to supporting the above-mentioned two cases? -- With Regards, Amit Kapila.
On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com > <osumi.takamichi@fujitsu.com> wrote: > > > > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > Attached v7 patch which fixes the buildfarm warning for an unused warning in > > > release mode as in [1]. > > Hi, thank you for the patches. > > > > > > I'll share several review comments. > > > > For v7-0001. > > > > (1) I'll suggest some minor rewording. > > > > + <para> > > + The <literal>RESET</literal> clause will reset the publication to the > > + default state which includes resetting the publication options, setting > > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > > + dropping all relations and schemas that are associated with the publication. > > > > My suggestion is > > "The RESET clause will reset the publication to the > > default state. It resets the publication operations, > > sets ALL TABLES flag to false and drops all relations > > and schemas associated with the publication." > > I felt the existing looks better. I would prefer to keep it that way. > > > (2) typo and rewording > > > > +/* > > + * Reset the publication. > > + * > > + * Reset the publication options, setting ALL TABLES flag to false and drop > > + * all relations and schemas that are associated with the publication. > > + */ > > > > The "setting" in this sentence should be "set". > > > > How about changing like below ? > > FROM: > > "Reset the publication options, setting ALL TABLES flag to false and drop > > all relations and schemas that are associated with the publication." > > TO: > > "Reset the publication operations, set ALL TABLES flag to false and drop > > all relations and schemas associated with the publication." > > I felt the existing looks better. I would prefer to keep it that way. > > > (3) AlterPublicationReset > > > > Do we need to call CacheInvalidateRelcacheAll() or > > InvalidatePublicationRels() at the end of > > AlterPublicationReset() like AlterPublicationOptions() ? > > CacheInvalidateRelcacheAll should be called if we change all tables > from true to false, else the cache will not be invalidated. Modified > > > > > For v7-0002. > > > > (4) > > > > + if (stmt->for_all_tables) > > + { > > + bool isdefault = CheckPublicationDefValues(tup); > > + > > + if (!isdefault) > > + ereport(ERROR, > > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > > + errmsg("adding ALL TABLES requires the publication to have default publicationoptions, no tables/.... > > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > > > > The errmsg string has three messages for user and is a bit long > > (we have two sentences there connected by 'and'). > > Can't we make it concise and split it into a couple of lines for code readability ? > > > > I'll suggest a change below. > > FROM: > > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALLTABLES flag should not be set" > > TO: > > "adding ALL TABLES requires the publication defined not for ALL TABLES" > > "to have default publish actions without any associated tables/schemas" > > Added errdetail and split it > > > (5) typo > > > > <varlistentry> > > + <term><literal>EXCEPT TABLE</literal></term> > > + <listitem> > > + <para> > > + This clause specifies a list of tables to exclude from the publication. > > + It can only be used with <literal>FOR ALL TABLES</literal>. > > + </para> > > + </listitem> > > + </varlistentry> > > + > > > > Kindly change > > FROM: > > This clause specifies a list of tables to exclude from the publication. > > TO: > > This clause specifies a list of tables to be excluded from the publication. > > or > > This clause specifies a list of tables excluded from the publication. > > Modified > > > (6) Minor suggestion for an expression change > > > > Marks the publication as one that replicates changes for all tables in > > - the database, including tables created in the future. > > + the database, including tables created in the future. If > > + <literal>EXCEPT TABLE</literal> is specified, then exclude replicating > > + the changes for the specified tables. > > > > > > I'll suggest a minor rewording. > > FROM: > > ...exclude replicating the changes for the specified tables > > TO: > > ...exclude replication changes for the specified tables > > I felt the existing is better. > > > (7) > > (7-1) > > > > +/* > > + * Check if the publication has default values > > + * > > + * Check the following: > > + * a) Publication is not set with "FOR ALL TABLES" > > + * b) Publication is having default options > > + * c) Publication is not associated with schemas > > + * d) Publication is not associated with relations > > + */ > > +static bool > > +CheckPublicationDefValues(HeapTuple tup) > > > > > > I think this header comment can be improved. > > FROM: > > Check the following: > > TO: > > Returns true if the publication satisfies all the following conditions: > > Modified > > > (7-2) > > > > b) should be changed as well > > FROM: > > Publication is having default options > > TO: > > Publication has the default publish operations > > Changed it to "Publication is having default publication parameter values" > > Thanks for the comments, the attached v8 patch has the changes for the same. The patch needed to be rebased on top of HEAD because of commit "0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8 version for the changes of the same. Regards, Vignesh
Attachment
On Mon, Aug 8, 2022 at 12:46 PM vignesh C <vignesh21@gmail.com> wrote: > > On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com > > <osumi.takamichi@fujitsu.com> wrote: > > > > > > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Attached v7 patch which fixes the buildfarm warning for an unused warning in > > > > release mode as in [1]. > > > Hi, thank you for the patches. > > > > > > > > > I'll share several review comments. > > > > > > For v7-0001. > > > > > > (1) I'll suggest some minor rewording. > > > > > > + <para> > > > + The <literal>RESET</literal> clause will reset the publication to the > > > + default state which includes resetting the publication options, setting > > > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > > > + dropping all relations and schemas that are associated with the publication. > > > > > > My suggestion is > > > "The RESET clause will reset the publication to the > > > default state. It resets the publication operations, > > > sets ALL TABLES flag to false and drops all relations > > > and schemas associated with the publication." > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > (2) typo and rewording > > > > > > +/* > > > + * Reset the publication. > > > + * > > > + * Reset the publication options, setting ALL TABLES flag to false and drop > > > + * all relations and schemas that are associated with the publication. > > > + */ > > > > > > The "setting" in this sentence should be "set". > > > > > > How about changing like below ? > > > FROM: > > > "Reset the publication options, setting ALL TABLES flag to false and drop > > > all relations and schemas that are associated with the publication." > > > TO: > > > "Reset the publication operations, set ALL TABLES flag to false and drop > > > all relations and schemas associated with the publication." > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > (3) AlterPublicationReset > > > > > > Do we need to call CacheInvalidateRelcacheAll() or > > > InvalidatePublicationRels() at the end of > > > AlterPublicationReset() like AlterPublicationOptions() ? > > > > CacheInvalidateRelcacheAll should be called if we change all tables > > from true to false, else the cache will not be invalidated. Modified > > > > > > > > For v7-0002. > > > > > > (4) > > > > > > + if (stmt->for_all_tables) > > > + { > > > + bool isdefault = CheckPublicationDefValues(tup); > > > + > > > + if (!isdefault) > > > + ereport(ERROR, > > > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > > > + errmsg("adding ALL TABLES requires the publication to have default publicationoptions, no tables/.... > > > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > > > > > > > The errmsg string has three messages for user and is a bit long > > > (we have two sentences there connected by 'and'). > > > Can't we make it concise and split it into a couple of lines for code readability ? > > > > > > I'll suggest a change below. > > > FROM: > > > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated and ALLTABLES flag should not be set" > > > TO: > > > "adding ALL TABLES requires the publication defined not for ALL TABLES" > > > "to have default publish actions without any associated tables/schemas" > > > > Added errdetail and split it > > > > > (5) typo > > > > > > <varlistentry> > > > + <term><literal>EXCEPT TABLE</literal></term> > > > + <listitem> > > > + <para> > > > + This clause specifies a list of tables to exclude from the publication. > > > + It can only be used with <literal>FOR ALL TABLES</literal>. > > > + </para> > > > + </listitem> > > > + </varlistentry> > > > + > > > > > > Kindly change > > > FROM: > > > This clause specifies a list of tables to exclude from the publication. > > > TO: > > > This clause specifies a list of tables to be excluded from the publication. > > > or > > > This clause specifies a list of tables excluded from the publication. > > > > Modified > > > > > (6) Minor suggestion for an expression change > > > > > > Marks the publication as one that replicates changes for all tables in > > > - the database, including tables created in the future. > > > + the database, including tables created in the future. If > > > + <literal>EXCEPT TABLE</literal> is specified, then exclude replicating > > > + the changes for the specified tables. > > > > > > > > > I'll suggest a minor rewording. > > > FROM: > > > ...exclude replicating the changes for the specified tables > > > TO: > > > ...exclude replication changes for the specified tables > > > > I felt the existing is better. > > > > > (7) > > > (7-1) > > > > > > +/* > > > + * Check if the publication has default values > > > + * > > > + * Check the following: > > > + * a) Publication is not set with "FOR ALL TABLES" > > > + * b) Publication is having default options > > > + * c) Publication is not associated with schemas > > > + * d) Publication is not associated with relations > > > + */ > > > +static bool > > > +CheckPublicationDefValues(HeapTuple tup) > > > > > > > > > I think this header comment can be improved. > > > FROM: > > > Check the following: > > > TO: > > > Returns true if the publication satisfies all the following conditions: > > > > Modified > > > > > (7-2) > > > > > > b) should be changed as well > > > FROM: > > > Publication is having default options > > > TO: > > > Publication has the default publish operations > > > > Changed it to "Publication is having default publication parameter values" > > > > Thanks for the comments, the attached v8 patch has the changes for the same. > > The patch needed to be rebased on top of HEAD because of commit > "0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8 > version for the changes of the same. I had missed attaching one of the changes that was present locally. The updated patch has the changes for the same. Regards, Vignesh
Attachment
I spent some time on understanding the proposal and the patch. Here are a few comments wrt the test cases. > +ALTER PUBLICATION testpub_reset ADD TABLE pub_sch1.tbl1; > + > +-- Verify that tables associated with the publication are dropped after RESET > +\dRp+ testpub_reset > +ALTER PUBLICATION testpub_reset RESET; > +\dRp+ testpub_reset > > +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public; > + > +-- Verify that schemas associated with the publication are dropped after RESET > +\dRp+ testpub_reset > +ALTER PUBLICATION testpub_reset RESET; > +\dRp+ testpub_reset The results for the above two cases are the same before and after the reset. Is there any way to verify that? --- > +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES' publication > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1; > + > > +-- Can't add EXCEPT TABLE to 'FOR TABLE' publication > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1; > + > > +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES IN SCHEMA' publication > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1; > + I did not understand the objective of these tests. I think we need to improve the comments. Thanks & Regards, On Mon, Aug 8, 2022 at 2:53 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Aug 8, 2022 at 12:46 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com > > > <osumi.takamichi@fujitsu.com> wrote: > > > > > > > > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > Attached v7 patch which fixes the buildfarm warning for an unused warning in > > > > > release mode as in [1]. > > > > Hi, thank you for the patches. > > > > > > > > > > > > I'll share several review comments. > > > > > > > > For v7-0001. > > > > > > > > (1) I'll suggest some minor rewording. > > > > > > > > + <para> > > > > + The <literal>RESET</literal> clause will reset the publication to the > > > > + default state which includes resetting the publication options, setting > > > > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > > > > + dropping all relations and schemas that are associated with the publication. > > > > > > > > My suggestion is > > > > "The RESET clause will reset the publication to the > > > > default state. It resets the publication operations, > > > > sets ALL TABLES flag to false and drops all relations > > > > and schemas associated with the publication." > > > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > > > (2) typo and rewording > > > > > > > > +/* > > > > + * Reset the publication. > > > > + * > > > > + * Reset the publication options, setting ALL TABLES flag to false and drop > > > > + * all relations and schemas that are associated with the publication. > > > > + */ > > > > > > > > The "setting" in this sentence should be "set". > > > > > > > > How about changing like below ? > > > > FROM: > > > > "Reset the publication options, setting ALL TABLES flag to false and drop > > > > all relations and schemas that are associated with the publication." > > > > TO: > > > > "Reset the publication operations, set ALL TABLES flag to false and drop > > > > all relations and schemas associated with the publication." > > > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > > > (3) AlterPublicationReset > > > > > > > > Do we need to call CacheInvalidateRelcacheAll() or > > > > InvalidatePublicationRels() at the end of > > > > AlterPublicationReset() like AlterPublicationOptions() ? > > > > > > CacheInvalidateRelcacheAll should be called if we change all tables > > > from true to false, else the cache will not be invalidated. Modified > > > > > > > > > > > For v7-0002. > > > > > > > > (4) > > > > > > > > + if (stmt->for_all_tables) > > > > + { > > > > + bool isdefault = CheckPublicationDefValues(tup); > > > > + > > > > + if (!isdefault) > > > > + ereport(ERROR, > > > > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > > > > + errmsg("adding ALL TABLES requires the publication to have default publicationoptions, no tables/.... > > > > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > > > > > > > > > > The errmsg string has three messages for user and is a bit long > > > > (we have two sentences there connected by 'and'). > > > > Can't we make it concise and split it into a couple of lines for code readability ? > > > > > > > > I'll suggest a change below. > > > > FROM: > > > > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated andALL TABLES flag should not be set" > > > > TO: > > > > "adding ALL TABLES requires the publication defined not for ALL TABLES" > > > > "to have default publish actions without any associated tables/schemas" > > > > > > Added errdetail and split it > > > > > > > (5) typo > > > > > > > > <varlistentry> > > > > + <term><literal>EXCEPT TABLE</literal></term> > > > > + <listitem> > > > > + <para> > > > > + This clause specifies a list of tables to exclude from the publication. > > > > + It can only be used with <literal>FOR ALL TABLES</literal>. > > > > + </para> > > > > + </listitem> > > > > + </varlistentry> > > > > + > > > > > > > > Kindly change > > > > FROM: > > > > This clause specifies a list of tables to exclude from the publication. > > > > TO: > > > > This clause specifies a list of tables to be excluded from the publication. > > > > or > > > > This clause specifies a list of tables excluded from the publication. > > > > > > Modified > > > > > > > (6) Minor suggestion for an expression change > > > > > > > > Marks the publication as one that replicates changes for all tables in > > > > - the database, including tables created in the future. > > > > + the database, including tables created in the future. If > > > > + <literal>EXCEPT TABLE</literal> is specified, then exclude replicating > > > > + the changes for the specified tables. > > > > > > > > > > > > I'll suggest a minor rewording. > > > > FROM: > > > > ...exclude replicating the changes for the specified tables > > > > TO: > > > > ...exclude replication changes for the specified tables > > > > > > I felt the existing is better. > > > > > > > (7) > > > > (7-1) > > > > > > > > +/* > > > > + * Check if the publication has default values > > > > + * > > > > + * Check the following: > > > > + * a) Publication is not set with "FOR ALL TABLES" > > > > + * b) Publication is having default options > > > > + * c) Publication is not associated with schemas > > > > + * d) Publication is not associated with relations > > > > + */ > > > > +static bool > > > > +CheckPublicationDefValues(HeapTuple tup) > > > > > > > > > > > > I think this header comment can be improved. > > > > FROM: > > > > Check the following: > > > > TO: > > > > Returns true if the publication satisfies all the following conditions: > > > > > > Modified > > > > > > > (7-2) > > > > > > > > b) should be changed as well > > > > FROM: > > > > Publication is having default options > > > > TO: > > > > Publication has the default publish operations > > > > > > Changed it to "Publication is having default publication parameter values" > > > > > > Thanks for the comments, the attached v8 patch has the changes for the same. > > > > The patch needed to be rebased on top of HEAD because of commit > > "0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8 > > version for the changes of the same. > > I had missed attaching one of the changes that was present locally. > The updated patch has the changes for the same. > > Regards, > Vignesh
On Thu, Aug 18, 2022 at 12:33 PM Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote: > > I spent some time on understanding the proposal and the patch. Here > are a few comments wrt the test cases. > > > +ALTER PUBLICATION testpub_reset ADD TABLE pub_sch1.tbl1; > > + > > +-- Verify that tables associated with the publication are dropped after RESET > > +\dRp+ testpub_reset > > +ALTER PUBLICATION testpub_reset RESET; > > +\dRp+ testpub_reset > > > > +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public; > > + > > +-- Verify that schemas associated with the publication are dropped after RESET > > +\dRp+ testpub_reset > > +ALTER PUBLICATION testpub_reset RESET; > > +\dRp+ testpub_reset > > The results for the above two cases are the same before and after the > reset. Is there any way to verify that? If you see the expected, first \dRp+ command includes: +Tables: + "pub_sch1.tbl1" The second \dRp+ does not include the Tables. We are trying to verify that after reset, the tables will be removed from the publication. The second test is similar to the first, the only difference here is that we test schema instead of tables. i.e we verify that the schemas will be removed from the publication. > --- > > > +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES' publication > > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1; > > + > > > > +-- Can't add EXCEPT TABLE to 'FOR TABLE' publication > > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1; > > + > > > > +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES IN SCHEMA' publication > > +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1; > > + > > I did not understand the objective of these tests. I think we need to > improve the comments. There are different publications like "ALL TABLES", "TABLE", "ALL TABLES IN SCHEMA" publications. Here we are trying to verify that except tables cannot be added to "ALL TABLES", "TABLE", "ALL TABLES IN SCHEMA" publications. If you see the expected file, you will see the following error: +-- Can't add EXCEPT TABLE to 'FOR ALL TABLES' publication +ALTER PUBLICATION testpub_reset ADD ALL TABLES EXCEPT TABLE pub_sch1.tbl1; +ERROR: adding ALL TABLES requires the publication to have default publication parameter values +DETAIL: ALL TABLES flag should not be set and no tables/schemas should be associated. +HINT: Use ALTER PUBLICATION ... RESET to reset the publication I felt the existing comment is ok. Let me know if you still feel any change is required. Regards, Vignesh
On Mon, Aug 8, 2022 at 2:53 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, Aug 8, 2022 at 12:46 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com > > > <osumi.takamichi@fujitsu.com> wrote: > > > > > > > > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > Attached v7 patch which fixes the buildfarm warning for an unused warning in > > > > > release mode as in [1]. > > > > Hi, thank you for the patches. > > > > > > > > > > > > I'll share several review comments. > > > > > > > > For v7-0001. > > > > > > > > (1) I'll suggest some minor rewording. > > > > > > > > + <para> > > > > + The <literal>RESET</literal> clause will reset the publication to the > > > > + default state which includes resetting the publication options, setting > > > > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > > > > + dropping all relations and schemas that are associated with the publication. > > > > > > > > My suggestion is > > > > "The RESET clause will reset the publication to the > > > > default state. It resets the publication operations, > > > > sets ALL TABLES flag to false and drops all relations > > > > and schemas associated with the publication." > > > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > > > (2) typo and rewording > > > > > > > > +/* > > > > + * Reset the publication. > > > > + * > > > > + * Reset the publication options, setting ALL TABLES flag to false and drop > > > > + * all relations and schemas that are associated with the publication. > > > > + */ > > > > > > > > The "setting" in this sentence should be "set". > > > > > > > > How about changing like below ? > > > > FROM: > > > > "Reset the publication options, setting ALL TABLES flag to false and drop > > > > all relations and schemas that are associated with the publication." > > > > TO: > > > > "Reset the publication operations, set ALL TABLES flag to false and drop > > > > all relations and schemas associated with the publication." > > > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > > > (3) AlterPublicationReset > > > > > > > > Do we need to call CacheInvalidateRelcacheAll() or > > > > InvalidatePublicationRels() at the end of > > > > AlterPublicationReset() like AlterPublicationOptions() ? > > > > > > CacheInvalidateRelcacheAll should be called if we change all tables > > > from true to false, else the cache will not be invalidated. Modified > > > > > > > > > > > For v7-0002. > > > > > > > > (4) > > > > > > > > + if (stmt->for_all_tables) > > > > + { > > > > + bool isdefault = CheckPublicationDefValues(tup); > > > > + > > > > + if (!isdefault) > > > > + ereport(ERROR, > > > > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > > > > + errmsg("adding ALL TABLES requires the publication to have default publicationoptions, no tables/.... > > > > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > > > > > > > > > > The errmsg string has three messages for user and is a bit long > > > > (we have two sentences there connected by 'and'). > > > > Can't we make it concise and split it into a couple of lines for code readability ? > > > > > > > > I'll suggest a change below. > > > > FROM: > > > > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated andALL TABLES flag should not be set" > > > > TO: > > > > "adding ALL TABLES requires the publication defined not for ALL TABLES" > > > > "to have default publish actions without any associated tables/schemas" > > > > > > Added errdetail and split it > > > > > > > (5) typo > > > > > > > > <varlistentry> > > > > + <term><literal>EXCEPT TABLE</literal></term> > > > > + <listitem> > > > > + <para> > > > > + This clause specifies a list of tables to exclude from the publication. > > > > + It can only be used with <literal>FOR ALL TABLES</literal>. > > > > + </para> > > > > + </listitem> > > > > + </varlistentry> > > > > + > > > > > > > > Kindly change > > > > FROM: > > > > This clause specifies a list of tables to exclude from the publication. > > > > TO: > > > > This clause specifies a list of tables to be excluded from the publication. > > > > or > > > > This clause specifies a list of tables excluded from the publication. > > > > > > Modified > > > > > > > (6) Minor suggestion for an expression change > > > > > > > > Marks the publication as one that replicates changes for all tables in > > > > - the database, including tables created in the future. > > > > + the database, including tables created in the future. If > > > > + <literal>EXCEPT TABLE</literal> is specified, then exclude replicating > > > > + the changes for the specified tables. > > > > > > > > > > > > I'll suggest a minor rewording. > > > > FROM: > > > > ...exclude replicating the changes for the specified tables > > > > TO: > > > > ...exclude replication changes for the specified tables > > > > > > I felt the existing is better. > > > > > > > (7) > > > > (7-1) > > > > > > > > +/* > > > > + * Check if the publication has default values > > > > + * > > > > + * Check the following: > > > > + * a) Publication is not set with "FOR ALL TABLES" > > > > + * b) Publication is having default options > > > > + * c) Publication is not associated with schemas > > > > + * d) Publication is not associated with relations > > > > + */ > > > > +static bool > > > > +CheckPublicationDefValues(HeapTuple tup) > > > > > > > > > > > > I think this header comment can be improved. > > > > FROM: > > > > Check the following: > > > > TO: > > > > Returns true if the publication satisfies all the following conditions: > > > > > > Modified > > > > > > > (7-2) > > > > > > > > b) should be changed as well > > > > FROM: > > > > Publication is having default options > > > > TO: > > > > Publication has the default publish operations > > > > > > Changed it to "Publication is having default publication parameter values" > > > > > > Thanks for the comments, the attached v8 patch has the changes for the same. > > > > The patch needed to be rebased on top of HEAD because of commit > > "0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8 > > version for the changes of the same. > > I had missed attaching one of the changes that was present locally. > The updated patch has the changes for the same. The patch needed to be rebased on top of HEAD because of a recent commit. The updated v8 patch has the changes for the same. Regards, Vignesh
Attachment
2022年8月19日(金) 2:41 vignesh C <vignesh21@gmail.com>: > > On Mon, Aug 8, 2022 at 2:53 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Mon, Aug 8, 2022 at 12:46 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Fri, Jun 3, 2022 at 3:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > On Thu, May 26, 2022 at 7:04 PM osumi.takamichi@fujitsu.com > > > > <osumi.takamichi@fujitsu.com> wrote: > > > > > > > > > > On Monday, May 23, 2022 2:13 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Attached v7 patch which fixes the buildfarm warning for an unused warning in > > > > > > release mode as in [1]. > > > > > Hi, thank you for the patches. > > > > > > > > > > > > > > > I'll share several review comments. > > > > > > > > > > For v7-0001. > > > > > > > > > > (1) I'll suggest some minor rewording. > > > > > > > > > > + <para> > > > > > + The <literal>RESET</literal> clause will reset the publication to the > > > > > + default state which includes resetting the publication options, setting > > > > > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > > > > > + dropping all relations and schemas that are associated with the publication. > > > > > > > > > > My suggestion is > > > > > "The RESET clause will reset the publication to the > > > > > default state. It resets the publication operations, > > > > > sets ALL TABLES flag to false and drops all relations > > > > > and schemas associated with the publication." > > > > > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > > > > > (2) typo and rewording > > > > > > > > > > +/* > > > > > + * Reset the publication. > > > > > + * > > > > > + * Reset the publication options, setting ALL TABLES flag to false and drop > > > > > + * all relations and schemas that are associated with the publication. > > > > > + */ > > > > > > > > > > The "setting" in this sentence should be "set". > > > > > > > > > > How about changing like below ? > > > > > FROM: > > > > > "Reset the publication options, setting ALL TABLES flag to false and drop > > > > > all relations and schemas that are associated with the publication." > > > > > TO: > > > > > "Reset the publication operations, set ALL TABLES flag to false and drop > > > > > all relations and schemas associated with the publication." > > > > > > > > I felt the existing looks better. I would prefer to keep it that way. > > > > > > > > > (3) AlterPublicationReset > > > > > > > > > > Do we need to call CacheInvalidateRelcacheAll() or > > > > > InvalidatePublicationRels() at the end of > > > > > AlterPublicationReset() like AlterPublicationOptions() ? > > > > > > > > CacheInvalidateRelcacheAll should be called if we change all tables > > > > from true to false, else the cache will not be invalidated. Modified > > > > > > > > > > > > > > For v7-0002. > > > > > > > > > > (4) > > > > > > > > > > + if (stmt->for_all_tables) > > > > > + { > > > > > + bool isdefault = CheckPublicationDefValues(tup); > > > > > + > > > > > + if (!isdefault) > > > > > + ereport(ERROR, > > > > > + errcode(ERRCODE_INVALID_OBJECT_DEFINITION), > > > > > + errmsg("adding ALL TABLES requires the publication to have default publicationoptions, no tables/.... > > > > > + errhint("Use ALTER PUBLICATION ... RESET to reset the publication")); > > > > > > > > > > > > > > > The errmsg string has three messages for user and is a bit long > > > > > (we have two sentences there connected by 'and'). > > > > > Can't we make it concise and split it into a couple of lines for code readability ? > > > > > > > > > > I'll suggest a change below. > > > > > FROM: > > > > > "adding ALL TABLES requires the publication to have default publication options, no tables/schemas associated andALL TABLES flag should not be set" > > > > > TO: > > > > > "adding ALL TABLES requires the publication defined not for ALL TABLES" > > > > > "to have default publish actions without any associated tables/schemas" > > > > > > > > Added errdetail and split it > > > > > > > > > (5) typo > > > > > > > > > > <varlistentry> > > > > > + <term><literal>EXCEPT TABLE</literal></term> > > > > > + <listitem> > > > > > + <para> > > > > > + This clause specifies a list of tables to exclude from the publication. > > > > > + It can only be used with <literal>FOR ALL TABLES</literal>. > > > > > + </para> > > > > > + </listitem> > > > > > + </varlistentry> > > > > > + > > > > > > > > > > Kindly change > > > > > FROM: > > > > > This clause specifies a list of tables to exclude from the publication. > > > > > TO: > > > > > This clause specifies a list of tables to be excluded from the publication. > > > > > or > > > > > This clause specifies a list of tables excluded from the publication. > > > > > > > > Modified > > > > > > > > > (6) Minor suggestion for an expression change > > > > > > > > > > Marks the publication as one that replicates changes for all tables in > > > > > - the database, including tables created in the future. > > > > > + the database, including tables created in the future. If > > > > > + <literal>EXCEPT TABLE</literal> is specified, then exclude replicating > > > > > + the changes for the specified tables. > > > > > > > > > > > > > > > I'll suggest a minor rewording. > > > > > FROM: > > > > > ...exclude replicating the changes for the specified tables > > > > > TO: > > > > > ...exclude replication changes for the specified tables > > > > > > > > I felt the existing is better. > > > > > > > > > (7) > > > > > (7-1) > > > > > > > > > > +/* > > > > > + * Check if the publication has default values > > > > > + * > > > > > + * Check the following: > > > > > + * a) Publication is not set with "FOR ALL TABLES" > > > > > + * b) Publication is having default options > > > > > + * c) Publication is not associated with schemas > > > > > + * d) Publication is not associated with relations > > > > > + */ > > > > > +static bool > > > > > +CheckPublicationDefValues(HeapTuple tup) > > > > > > > > > > > > > > > I think this header comment can be improved. > > > > > FROM: > > > > > Check the following: > > > > > TO: > > > > > Returns true if the publication satisfies all the following conditions: > > > > > > > > Modified > > > > > > > > > (7-2) > > > > > > > > > > b) should be changed as well > > > > > FROM: > > > > > Publication is having default options > > > > > TO: > > > > > Publication has the default publish operations > > > > > > > > Changed it to "Publication is having default publication parameter values" > > > > > > > > Thanks for the comments, the attached v8 patch has the changes for the same. > > > > > > The patch needed to be rebased on top of HEAD because of commit > > > "0c20dd33db1607d6a85ffce24238c1e55e384b49", attached a rebased v8 > > > version for the changes of the same. > > > > I had missed attaching one of the changes that was present locally. > > The updated patch has the changes for the same. > > The patch needed to be rebased on top of HEAD because of a recent > commit. The updated v8 patch has the changes for the same. Hi cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is currently underway, this would be an excellent time to update the patch. [1] http://cfbot.cputube.org/patch_40_3646.log Thanks Ian Barwick
On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > Hi > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is > currently underway, this would be an excellent time to update the patch. > > [1] http://cfbot.cputube.org/patch_40_3646.log Here is an updated patch which is rebased on top of HEAD. Regards, Vignesh
Attachment
2022年11月7日(月) 22:39 vignesh C <vignesh21@gmail.com>: > > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > > > Hi > > > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is > > currently underway, this would be an excellent time to update the patch. > > > > [1] http://cfbot.cputube.org/patch_40_3646.log > > Here is an updated patch which is rebased on top of HEAD. Thanks for the updated patch. While reviewing the patch backlog, we have determined that this patch adds one or more TAP tests but has not added the test to the "meson.build" file. To do this, locate the relevant "meson.build" file for each test and add it in the 'tests' dictionary, which will look something like this: 'tap': { 'tests': [ 't/001_basic.pl', ], }, For some additional details please see this Wiki article: https://wiki.postgresql.org/wiki/Meson_for_patch_authors For more information on the meson build system for PostgreSQL see: https://wiki.postgresql.org/wiki/Meson Regards Ian Barwick
On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > 2022年11月7日(月) 22:39 vignesh C <vignesh21@gmail.com>: > > > > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > > > > > Hi > > > > > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is > > > currently underway, this would be an excellent time to update the patch. > > > > > > [1] http://cfbot.cputube.org/patch_40_3646.log > > > > Here is an updated patch which is rebased on top of HEAD. > > Thanks for the updated patch. > > While reviewing the patch backlog, we have determined that this patch adds > one or more TAP tests but has not added the test to the "meson.build" file. Thanks, I have updated the meson.build to include the TAP test. The attached patch has the changes for the same. Regards, Vignesh
Attachment
On Wed, 16 Nov 2022 at 15:35, vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > > > 2022年11月7日(月) 22:39 vignesh C <vignesh21@gmail.com>: > > > > > > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > > > > > > > Hi > > > > > > > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is > > > > currently underway, this would be an excellent time to update the patch. > > > > > > > > [1] http://cfbot.cputube.org/patch_40_3646.log > > > > > > Here is an updated patch which is rebased on top of HEAD. > > > > Thanks for the updated patch. > > > > While reviewing the patch backlog, we have determined that this patch adds > > one or more TAP tests but has not added the test to the "meson.build" file. > > Thanks, I have updated the meson.build to include the TAP test. The > attached patch has the changes for the same. The patch was not applying on top of HEAD, attached a rebased version. Regards, Vignesh
Attachment
On Fri, 20 Jan 2023 at 15:30, vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 16 Nov 2022 at 15:35, vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, 16 Nov 2022 at 09:34, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > > > > > 2022年11月7日(月) 22:39 vignesh C <vignesh21@gmail.com>: > > > > > > > > On Fri, 4 Nov 2022 at 08:19, Ian Lawrence Barwick <barwick@gmail.com> wrote: > > > > > > > > > > Hi > > > > > > > > > > cfbot reports the patch no longer applies [1]. As CommitFest 2022-11 is > > > > > currently underway, this would be an excellent time to update the patch. > > > > > > > > > > [1] http://cfbot.cputube.org/patch_40_3646.log > > > > > > > > Here is an updated patch which is rebased on top of HEAD. > > > > > > Thanks for the updated patch. > > > > > > While reviewing the patch backlog, we have determined that this patch adds > > > one or more TAP tests but has not added the test to the "meson.build" file. > > > > Thanks, I have updated the meson.build to include the TAP test. The > > attached patch has the changes for the same. > > The patch was not applying on top of HEAD, attached a rebased version. As I did not see much interest from others, I'm withdrawing this patch for now. But if there is any interest others in future, I would be more than happy to work on this feature. Regards, Vignesh
On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote: > > As I did not see much interest from others, I'm withdrawing this patch > for now. But if there is any interest others in future, I would be > more than happy to work on this feature. > Just FYI, I noticed a use case for this patch in email [1]. Users would like to replicate all except a few columns having sensitive information. The challenge with current column list features is that adding new tables to columns would lead users to change the respective publications as well. [1] - https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE270FDC3047540208%40qq.com -- With Regards, Amit Kapila.
On Thu, Apr 10, 2025 at 7:25 PM Amit Kapila wrote: > > On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote: > > > > As I did not see much interest from others, I'm withdrawing this patch > > for now. But if there is any interest others in future, I would be > > more than happy to work on this feature. > > > > Just FYI, I noticed a use case for this patch in email [1]. Users would like to > replicate all except a few columns having sensitive information. The challenge > with current column list features is that adding new tables to columns would > lead users to change the respective publications as well. > > [1] - > https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE > 270FDC3047540208%40qq.com BTW, I noticed that debezium, an open source distributed platform for change data capture that replies on logical decoding, also support specifying the column exclusion list[1]. So, this indicates that there could be some use cases for this feature. https://debezium.io/documentation/reference/stable/connectors/postgresql.html#postgresql-property-column-exclude-list Best Regards, Hou zj
On Wed, Apr 16, 2025 at 8:22 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Thu, Apr 10, 2025 at 7:25 PM Amit Kapila wrote: > > > > On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > As I did not see much interest from others, I'm withdrawing this patch > > > for now. But if there is any interest others in future, I would be > > > more than happy to work on this feature. > > > > > > > Just FYI, I noticed a use case for this patch in email [1]. Users would like to > > replicate all except a few columns having sensitive information. The challenge > > with current column list features is that adding new tables to columns would > > lead users to change the respective publications as well. > > > > [1] - > > https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE > > 270FDC3047540208%40qq.com > > BTW, I noticed that debezium, an open source distributed platform for change > data capture that replies on logical decoding, also support specifying the > column exclusion list[1]. So, this indicates that there could be some use cases > for this feature. > Thanks for sharing the link. I see that they support both the include and exclude lists for columns and tables. -- With Regards, Amit Kapila.
On Thu, 17 Apr 2025 at 09:12, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Apr 16, 2025 at 8:22 AM Zhijie Hou (Fujitsu) > <houzj.fnst@fujitsu.com> wrote: > > > > On Thu, Apr 10, 2025 at 7:25 PM Amit Kapila wrote: > > > > > > On Tue, Jan 9, 2024 at 12:02 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > As I did not see much interest from others, I'm withdrawing this patch > > > > for now. But if there is any interest others in future, I would be > > > > more than happy to work on this feature. > > > > > > > > > > Just FYI, I noticed a use case for this patch in email [1]. Users would like to > > > replicate all except a few columns having sensitive information. The challenge > > > with current column list features is that adding new tables to columns would > > > lead users to change the respective publications as well. > > > > > > [1] - > > > https://www.postgresql.org/message-id/tencent_DCDF626FCD4A556C51BE > > > 270FDC3047540208%40qq.com > > > > BTW, I noticed that debezium, an open source distributed platform for change > > data capture that replies on logical decoding, also support specifying the > > column exclusion list[1]. So, this indicates that there could be some use cases > > for this feature. > > > > Thanks for sharing the link. I see that they support both the include > and exclude lists for columns and tables. > Hi Hackers, I see there is some interest in the functionality added by this patch. I have rebased the patches in [1]. I saw a new column 'pubgencols' was added in pg_publication in PG 18. So, I have modified v11-0001 to RESET this as well. I am also working on creating a patch to exclude columns in publication as per suggestion in [2]. [1]: https://www.postgresql.org/message-id/CALDaNm3dWZCYDih55qTNAYsjCvYXMFv%3D46UsDWmfCnXMt3kPCg%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CAA4eK1KRdAPC%3D5%3D7tQ1GW0cRwD%3DzaDMi%2BT4u_k4GxPhPY6e8BQ%40mail.gmail.com Thanks and Regards, Shlok Kyal
Attachment
On Wed, 18 Jun 2025 at 06:34, Peter Smith <smithpb2250@gmail.com> wrote: > > On Tue, Jun 17, 2025 at 5:41 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > ... > > I have attached a patch support excluding columns for publication. > > > > I have added a syntax: "FOR TABLE table_name EXCEPT (c1, c2, ..)" > > It can be used with CREATE or ALTER PUBLICATION. > > > > v12-0003 patch contains the changes for the same. > > > > Hi Shlok, > > I was interested in your new EXCEPT (col-list) so I had a quick look > at your patch v12-0003 (only looked at the documentation). > > Below are some comments: > > ====== > > 1. Chapter 29.5 "Column Lists". > > I think new EXCEPT syntax needs a mention here as well. > Added > ====== > > doc/src/sgml/catalogs.sgml > > 2. > + <para> > + This is an array of values that indicates which table columns are > + excluded from the publication. For example, a value of > + <literal>1 3</literal> would mean that the columns except the first and > + the third columns are published. > + A null value indicates that no columns are excluded from being > published. > + </para></entry> > > The sentence "A null value indicates that no columns are excluded from > being published" seems kind of confusing, because if the user has a > "normal" column-list although nothing was being *explicitly* excluded > (using EXCEPT), any columns not named are *implicitly* excluded from > being published. > I have removed this line. > ~ > > 3. > TBH, I was wondering why a new catalog attribute was necessary... > > Can't you simply re-use the existing attribute "prattrs" attribute. > e.g. let's just define negative means exclude. > > e.g. a value of 1 3 means only the 1st and 3rd columns are published > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published > e.g. a value of null mean all columns are published > > (mixes of negative and positive will not be possible) > Currently I have added a new attribute 'prexcludeattrs' in pg_publication_rel table. I used this approach because it will be easier for user to get the exclude column list, in code no extra processing is required to get the exclude column list. For an approach to use negative numbers for exclude columns. I see an advantage that we do not need to introduce a new column for pg_publication_rel. But in code, each time we want to get a column list or exclude column list we need an extra processing of 'prattrs' columns. Also I don't see any existing catalog table using a negative attribute for column list. Based on above observations, I feel that the current is better. Please correct me if I missed an advantage for the approach you suggested. > ====== > > doc/src/sgml/ref/alter_publication.sgml > > 4. ALTER PUBLICATION syntax > > The syntax is currently written as: > TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [ > EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ... > ] > > Can't this be more simply written as: > TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] ) > ] [ WHERE ( expression ) ] [, ... ] > > ~~~ Fixed > > 5. > + <para> > + Alter publication <structname>mypublication</structname> to add table > + <structname>users</structname> except column > + <structname>security_pin</structname>: > +<programlisting> > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin); > > Those tags don't seem correct. e.g. "users" and "security_pin" are not > <structname> (???). > > Perhaps, every other example here is wrong too and you just copied > them? Anyway, something here looks wrong to me. > I saw different documents and usage of tags seems not well defined. For example for table we are using tags in document create_publication.sgml, update.sgml <structname> is used, in document table.sgml, advanced.sgml <classname> is used, and in logical-replication.sgml <literal> is used. Similarly for column names <structname>, <structfield> or <literal> are used in different parts of the document. I kept the changed tag to <structfield> for the column for this patch. Do you have any suggestions? > ====== > doc/src/sgml/ref/create_publication.sgml > > 6. CREATE PUBLICATION syntax > > The syntax is currently written as: > TABLE [ ONLY ] table_name [ * ] { [ [ ( column_name [, ... ] ) ] | [ > EXCEPT ( column_name [, ... ] ) ] ] } [ WHERE ( expression ) ] [, ... > ] > > Can't this be more simply written as: > TABLE [ ONLY ] table_name [ * ] [ [ EXCEPT ] ( column_name [, ... ] ) > ] [ WHERE ( expression ) ] [, ... ] > > ~~~ Fixed > > 7. > + <para> > + When a column list is specified with EXCEPT, the named columns are not > + replicated. The excluded column list cannot contain generated > columns. The > + column list and excluded column list cannot be specified together. > + Specifying a column list has no effect on <literal>TRUNCATE</literal> > + commands. > + </para> > > IMO you don't need to say "The column list and excluded column list > cannot be specified together." because AFAIK the syntax makes that > impossible to do anyhow. > Removed this line > ~~~ > > 8. > + <para> > + Create a publication that publishes all changes for table > <structname>users</structname> > + except changes for columns <structname>security_pin</structname>: > +<programlisting> > +CREATE PUBLICATION users_safe FOR TABLE users EXCEPT (security_pin); > +</programlisting> > + </para> > > 8a. > Same review comment as previously -- Those tags don't seem correct. > e.g. "users" and "security_pin" are not <structname> (???). > Again, are all the other existing tags also wrong? Maybe a new thread > needed to address these? > > ~ Same as point 5. I also feel this should be addressed in a new thread. > 8b. > Plural? /except changes for columns/except changes for column/ Fixed Also in this patch I added displaying "EXCEPT (column_list)" for \dRp+ and \d table_name psql commands. Thanks and Regards, Shlok Kyal
Attachment
On Thu, Jun 19, 2025 at 4:42 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: ... > > 3. > > TBH, I was wondering why a new catalog attribute was necessary... > > > > Can't you simply re-use the existing attribute "prattrs" attribute. > > e.g. let's just define negative means exclude. > > > > e.g. a value of 1 3 means only the 1st and 3rd columns are published > > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published > > e.g. a value of null mean all columns are published > > > > (mixes of negative and positive will not be possible) > > > > Currently I have added a new attribute 'prexcludeattrs' in > pg_publication_rel table. I used this approach because it will be > easier for user to get the exclude column list, in code no extra > processing is required to get the exclude column list. > > For an approach to use negative numbers for exclude columns. I see an > advantage that we do not need to introduce a new column for > pg_publication_rel. But in code, each time we want to get a column > list or exclude column list we need an extra processing of 'prattrs' > columns. Also I don't see any existing catalog table using a negative > attribute for column list. > > Based on above observations, I feel that the current is better. > > Please correct me if I missed an advantage for the approach you suggested. > OK. Maybe using negative numbers was a bridge too far... But IMO it is not good to have 2 separate attributes for the lists. Doing so implies they can coexist, but that is not true. I felt there are not really 2 "kinds" of columns list anyway -- there is just a "column list" which defines columns that are either included or excluded from the publication determined by EXCEPT. Having dual lists gets weird/confusing to describe them -- you end up continually having to refer to the other one to clarify behaviour. e.g. Does 'prattrs' value NULL mean publish everything? Well, no... that depends if there is a non null 'prexcludeattrs' e.g. Does 'prexcludeattrs' value NULL mean publish everything? Well, no... that depends if there is a non null 'prattrs' Furthermore, all the code is doubling up referring to "column list" and "exclude column list" -- code / docs / comments / error messages. There are quite a lot of places the patch touches that I thought were not really needed if you don't have 2 different kinds of column-lists. To summarise, I felt it would be better to just keep the existing 'prattrs' as the one-and-only column list, but add another BOOLEAN attribute to flag whether 'prattrs' columns should be included or excluded. prattrs; prattrs_exclude; Means -------------------------------------------- 1 2 3 f only cols 1,2,3 will be published 4 5 6 t only cols 4,5,6 will NOT be published null f all cols are published (flag is ignored) null t all cols are published (flag is ignored) > > 5. > > + <para> > > + Alter publication <structname>mypublication</structname> to add table > > + <structname>users</structname> except column > > + <structname>security_pin</structname>: > > +<programlisting> > > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin); > > > > Those tags don't seem correct. e.g. "users" and "security_pin" are not > > <structname> (???). > > > > Perhaps, every other example here is wrong too and you just copied > > them? Anyway, something here looks wrong to me. > > > I saw different documents and usage of tags seems not well defined. > For example for table we are using tags in document > create_publication.sgml, update.sgml <structname> is used, in document > table.sgml, advanced.sgml <classname> is used, and in > logical-replication.sgml <literal> is used. Similarly for column > names <structname>, <structfield> or <literal> are used in different > parts of the document. > > I kept the changed tag to <structfield> for the column for this patch. > Do you have any suggestions? No, for this patch I think it is best that you just follow nearby code (as you are already doing). I plan to raise another thread to ask what are the guidelines for this sort of markup which is currently used inconsistently in different places. ////////// Below are a few more review comments for v13-0003 ====== Commit message 1. Typo /THe/The/ ~~~ 2. The new syntax allows specifying excluded column list when creating or altering a publication. For example: CREATE PUBLICATION pubname FOR TABLE tabname EXCEPT (exclude_column_list) or ALTER PUBLICATION pubname ADD TABLE tabname EXCEPT (exclude_column_list) ~ I felt since you say these "For example:" it would be better to give real examples. e.g. say "(col1,col2,col3)" instead of "(exclude_column_list)". ~~~ 3. Typo /family of command/family of commands/ ====== doc/src/sgml/logical-replication.sgml 4. I am not sure that it was a good idea to be making a new term called an "exclude column list"... because in introduces a new concept of something that sounds like it is a different kind of list, and now you have to keep referring everywhere to both to "column list" versus "exclude column list". All the doubling up add more complication I think. IMO really there is just a "column list". Whether that list is for exclusion or not just depends on the presence of EXCEPT. So I felt maybe all places mentioning "exclude column list" could be rephrased. ====== src/backend/catalog/pg_publication.c 5. +/* + * Returns true if the relation has exluded column list associated with the + * publication, false otherwise. + * + * If a exclude column list is found, the corresponding bitmap is returned + * through the cols parameter, if provided. The bitmap is constructed within the + * given memory context (mcxt). + */ + Typo /exluded column/an excluded column/ Typo /exclude column list/excluded column list/ ~~~ 6. +/* + * pub_exclude_collist_validate + * Process and validate the 'excluded columns' list and ensure the columns + * are all valid to exclude from publication. Checks for and raises an + * ERROR for any unknown columns, system columns, duplicate columns, or + * generated columns. + * Why can't you exclude generated columns? e.g. Maybe PUBLICATION says publish_generated_columns=stored and there are 100s of such columns, but the user just wants to exclude one of them. Why say they cannot do that? Hmm. Perhaps this is being already handled elsewhere, in which case this comment still seems misleading. ====== src/backend/commands/publicationcmds.c 7. + * With REPLICA IDENTITY FULL, no column list and no excluded column + * list is allowed. Really, just "no column list is allowed." same as it said before. ====== Kind Regards, Peter Smith. Fujitsu Australia
> > > 5. > > > + <para> > > > + Alter publication <structname>mypublication</structname> to add table > > > + <structname>users</structname> except column > > > + <structname>security_pin</structname>: > > > +<programlisting> > > > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin); > > > > > > Those tags don't seem correct. e.g. "users" and "security_pin" are not > > > <structname> (???). > > > > > > Perhaps, every other example here is wrong too and you just copied > > > them? Anyway, something here looks wrong to me. > > > > > I saw different documents and usage of tags seems not well defined. > > For example for table we are using tags in document > > create_publication.sgml, update.sgml <structname> is used, in document > > table.sgml, advanced.sgml <classname> is used, and in > > logical-replication.sgml <literal> is used. Similarly for column > > names <structname>, <structfield> or <literal> are used in different > > parts of the document. > > > > I kept the changed tag to <structfield> for the column for this patch. > > Do you have any suggestions? > > No, for this patch I think it is best that you just follow nearby code > (as you are already doing). I plan to raise another thread to ask what > are the guidelines for this sort of markup which is currently used > inconsistently in different places. > FYI - I created a new thread asking this markup question [1]. ====== [1] https://www.postgresql.org/message-id/CAHut%2BPvtf24r%2BbdPgBind84dBLPvgNL7aB%2B%3DHxAUupdPuo2gRg%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Fri, 20 Jun 2025 at 09:28, Peter Smith <smithpb2250@gmail.com> wrote: > > On Thu, Jun 19, 2025 at 4:42 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > ... > > > 3. > > > TBH, I was wondering why a new catalog attribute was necessary... > > > > > > Can't you simply re-use the existing attribute "prattrs" attribute. > > > e.g. let's just define negative means exclude. > > > > > > e.g. a value of 1 3 means only the 1st and 3rd columns are published > > > e.g. a value of -1 -3 means all columns except 1st and 3rd columns are published > > > e.g. a value of null mean all columns are published > > > > > > (mixes of negative and positive will not be possible) > > > > > > > Currently I have added a new attribute 'prexcludeattrs' in > > pg_publication_rel table. I used this approach because it will be > > easier for user to get the exclude column list, in code no extra > > processing is required to get the exclude column list. > > > > For an approach to use negative numbers for exclude columns. I see an > > advantage that we do not need to introduce a new column for > > pg_publication_rel. But in code, each time we want to get a column > > list or exclude column list we need an extra processing of 'prattrs' > > columns. Also I don't see any existing catalog table using a negative > > attribute for column list. > > > > Based on above observations, I feel that the current is better. > > > > Please correct me if I missed an advantage for the approach you suggested. > > > > OK. Maybe using negative numbers was a bridge too far... > > But IMO it is not good to have 2 separate attributes for the lists. > Doing so implies they can coexist, but that is not true. I felt there > are not really 2 "kinds" of columns list anyway -- there is just a > "column list" which defines columns that are either included or > excluded from the publication determined by EXCEPT. > > Having dual lists gets weird/confusing to describe them -- you end up > continually having to refer to the other one to clarify behaviour. > > e.g. Does 'prattrs' value NULL mean publish everything? Well, no... > that depends if there is a non null 'prexcludeattrs' > e.g. Does 'prexcludeattrs' value NULL mean publish everything? Well, > no... that depends if there is a non null 'prattrs' > > Furthermore, all the code is doubling up referring to "column list" > and "exclude column list" -- code / docs / comments / error messages. > There are quite a lot of places the patch touches that I thought were > not really needed if you don't have 2 different kinds of column-lists. > > To summarise, I felt it would be better to just keep the existing > 'prattrs' as the one-and-only column list, but add another BOOLEAN > attribute to flag whether 'prattrs' columns should be included or > excluded. > > prattrs; prattrs_exclude; Means > -------------------------------------------- > 1 2 3 f only cols 1,2,3 will be published > 4 5 6 t only cols 4,5,6 will NOT be published > null f all cols are published (flag is ignored) > null t all cols are published (flag is ignored) > I agree with your point and also it would be a better approach. In patch 0001 an column 'prexcept' was added in pg_publication_rel. We use that only for publication with all tables. I have reused this column for patch 0003. If publication is not for all tables and the 'prexcept' flag is true, it implies that the columns in 'prattrs' are to be excluded from being published. I have included the changes for it in v14-0003 patch. > > > 5. > > > + <para> > > > + Alter publication <structname>mypublication</structname> to add table > > > + <structname>users</structname> except column > > > + <structname>security_pin</structname>: > > > +<programlisting> > > > +ALTER PUBLICATION production_publication ADD TABLE users EXCEPT (security_pin); > > > > > > Those tags don't seem correct. e.g. "users" and "security_pin" are not > > > <structname> (???). > > > > > > Perhaps, every other example here is wrong too and you just copied > > > them? Anyway, something here looks wrong to me. > > > > > I saw different documents and usage of tags seems not well defined. > > For example for table we are using tags in document > > create_publication.sgml, update.sgml <structname> is used, in document > > table.sgml, advanced.sgml <classname> is used, and in > > logical-replication.sgml <literal> is used. Similarly for column > > names <structname>, <structfield> or <literal> are used in different > > parts of the document. > > > > I kept the changed tag to <structfield> for the column for this patch. > > Do you have any suggestions? > > No, for this patch I think it is best that you just follow nearby code > (as you are already doing). I plan to raise another thread to ask what > are the guidelines for this sort of markup which is currently used > inconsistently in different places. Thanks for starting a thread for it. > > ////////// > > Below are a few more review comments for v13-0003 > > ====== > Commit message > > 1. > Typo /THe/The/ > > ~~~ Fixed > 2. > The new syntax allows specifying excluded column list when creating or > altering a publication. For example: > CREATE PUBLICATION pubname FOR TABLE tabname EXCEPT (exclude_column_list) > or > ALTER PUBLICATION pubname ADD TABLE tabname EXCEPT (exclude_column_list) > > ~ > > I felt since you say these "For example:" it would be better to give > real examples. > e.g. say "(col1,col2,col3)" instead of "(exclude_column_list)". > Fixed > ~~~ > > 3. > Typo /family of command/family of commands/ > > ====== > doc/src/sgml/logical-replication.sgml > > 4. > I am not sure that it was a good idea to be making a new term called > an "exclude column list"... because in introduces a new concept of > something that sounds like it is a different kind of list, and now you > have to keep referring everywhere to both to "column list" versus > "exclude column list". All the doubling up add more complication I > think. > > IMO really there is just a "column list". Whether that list is for > exclusion or not just depends on the presence of EXCEPT. So I felt > maybe all places mentioning "exclude column list" could be rephrased. > > ====== > src/backend/catalog/pg_publication.c > > 5. > +/* > + * Returns true if the relation has exluded column list associated with the > + * publication, false otherwise. > + * > + * If a exclude column list is found, the corresponding bitmap is returned > + * through the cols parameter, if provided. The bitmap is constructed > within the > + * given memory context (mcxt). > + */ > + > > Typo /exluded column/an excluded column/ > Typo /exclude column list/excluded column list/ > updated the comment according to latest implementation > ~~~ > > 6. > +/* > + * pub_exclude_collist_validate > + * Process and validate the 'excluded columns' list and ensure the columns > + * are all valid to exclude from publication. Checks for and raises an > + * ERROR for any unknown columns, system columns, duplicate columns, or > + * generated columns. > + * > > Why can't you exclude generated columns? > > e.g. Maybe PUBLICATION says publish_generated_columns=stored and there > are 100s of such columns, but the user just wants to exclude one of > them. Why say they cannot do that? Hmm. Perhaps this is being already > handled elsewhere, in which case this comment still seems misleading. > I have removed this restriction. Now we can specify stored generated columns in EXCEPT (column_list) when we use the 'publish_generated_columns' flag. > ====== > src/backend/commands/publicationcmds.c > > 7. > + * With REPLICA IDENTITY FULL, no column list and no excluded column > + * list is allowed. > > Really, just "no column list is allowed." same as it said before. > > ====== Fixed Thanks and Regards, Shlok Kyal
Attachment
Hi Shlok. Below are some review comments for v14-0003 ====== 1. GENERAL Since the new syntax uses EXCEPT, then, in my opinion, you should try to use that same term where possible when describing things. I understand it is hard to do this in text and I agree often it makes more sense to say "exclude" columns etc, but OTOH in the code there are lots of places where you could have named vars/params differently: e.g. 'except_collist' instead of 'exclude_collist' might have been better. ====== Commit message 2. Column list specifed with EXCEPT is stored in column "prattrs" in table "pg_publication_rel" and also column "prexcept" is set to "true", to maintain the column list that user wants to exclude from the publication. ~ That paragraph could do with some rewording. For example, AFAIK, "prattrs" is for all column lists -- not just except col-lists, but the way it is described here sounds different. Also, /specifed/specified/ ====== doc/src/sgml/catalogs.sgml 3. (52.42. pg_publication_rel) <para> - True if the relation must be excluded + True if the relation or column list must be excluded. If publication is + created <literal>FOR ALL TABLES</literal> and it is specified as true, + the relation should be excluded. Else if it is true the columns in + <literal>prattrs</literal> should be excluded from being published. </para></entry> I felt this could be expressed more simply without mentioning anything about FOR ALL TABLES. SUGGESTION True if the column list or relation must be excluded from publication. If a column list is specified in <literal>prattrs</literal>, then exclude only those columns. If <literal>prattrs</literal> is NULL, then exclude the entire relation. ====== doc/src/sgml/logical-replication.sgml 4. (29.5. Column Lists) <para> - Each publication can optionally specify which columns of each table are - replicated to subscribers. The table on the subscriber side must have at - least all the columns that are published. If no column list is specified, - then all columns on the publisher are replicated. + Each publication can optionally specify which columns of each table should be + replicated or excluded from replication. On the subscriber side, the table + must include at least all the columns that are published. If no column list + is provided, all columns from the publisher are replicated by default. See <xref linkend="sql-createpublication"/> for details on the syntax. </para> I felt this patch may have changed too much text. IMO, you only needed to say "... are replicated or excluded from replication.". The other changes did not seem necessary. ~~~ 5. <para> - If no column list is specified, any columns added to the table later are - automatically replicated. This means that having a column list which names - all columns is not the same as having no column list at all. + If no column list or a column list with EXCEPT is specified, any columns + added to the table later are automatically replicated. This means that having + a column list which names all columns is not the same as having no + column list at all. If an column list is specified, any columns added to the + table later are automatically replicated. </para> 5a. "This means that having a column list which names all columns is not the same as having no column list at all." -- That note does not make sense when you say EXCEPT. I think some rewording is needed here. ~ 5b. "If an column list is specified, any columns added to the table later are automatically replicated.". This made no sense -- some words missing? ~~~ 6. Generated columns can also be specified in a column list. This allows generated columns to be published, regardless of the publication parameter <link linkend="sql-createpublication-params-with-publish-generated-columns"> - <literal>publish_generated_columns</literal></link>. See - <xref linkend="logical-replication-gencols"/> for details. + <literal>publish_generated_columns</literal></link>. Generated columns can + be included in column list specified with EXCEPT clause if publication + parameter + <link linkend="sql-createpublication-params-with-publish-generated-columns"> + <literal>publish_generated_columns</literal></link> is not set to + <literal>none</literal>. Specified generated columns will not be published. + See <xref linkend="logical-replication-gencols"/> for details. </para> I am not so sure about this. It seemed overly strict to me. Why can't it simply say: "Generated columns can also be specified in a column list. This allows generated columns to be published or excluded, regardless of the publication parameter..." Specifically, I don't know why you need to say: Generated columns can be included in column list specified with EXCEPT clause if publication parameter publish_generated_columns is not set to none. Specified generated columns will not be published. IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named cols. So if param is "stored", then the named cols will be excluded. OTOH, if param is "none" then all generated cols will be excluded anyway, so why not just allow the EXCEPT (gencol,gencol2) here as well, because the result will be the same. ~~~ 7. (29.5.1. Examples) <para> - Create a table <literal>t1</literal> to be used in the following example. + Create tables <literal>t1</literal>, <literal>t2</literal> to be used in the + following example. /Create tables t1, t2/Create tables t1 and t2/ ~~~ 8. <para> Create a publication <literal>p1</literal>. A column list is defined for - table <literal>t1</literal> to reduce the number of columns that will be - replicated. Notice that the order of column names in the column list does - not matter. + table <literal>t1</literal> and a column list is defined for table + <literal>t2</literal> with EXCEPT clause to reduce the number of columns that will be + replicated. Notice that the order of column names in the column lists does not matter. BEFORE A column list is defined for table t1 and a column list is defined for table t2... SUGGESTION (added comma, etc.) A column list is defined for table t1, and another column list is defined for table t2... ~~~ 9. The final example still says: "Only data from the column list of publication p1 is replicated." That doesn't seem quite appropriate now that you also have an EXCEPT column list. SUGGESTION: Only data specified by the column lists of publication p1 is replicated. ====== doc/src/sgml/ref/create_publication.sgml 10. + <para> + When a column list is specified with EXCEPT, the named columns are not + replicated. Specifying a column list has no effect on + <literal>TRUNCATE</literal> commands. + </para> I felt that to be clearer the preceding paragraph should be changed as follows: /When a column list is specified, only the named columns are replicated./When a column list without EXCEPT is specified, only the named columns are replicated./ ~~~ 11. CREATE PUBLICATION (NOTES section) 11a. The NOTES talk about replica identity columns -- should you mention EXCEPT here? ~ 11b. The NOTES talk about generated columns -- should you mention EXCEPT here? ====== src/backend/catalog/pg_publication.c 12. check_and_fetch_column_list + if (!isnull) + except = DatumGetBool(cfdatum); + + *except_columns = except && !pub->alltables; AFAICT, you can Assert(!pub->alltables) because you already checked that earlier up front. So you don't need 'except' var either. Just assign *except_cols up front and then overwrite it later if true. SUGGESTION: *except_cols = false; if (pub->alltables) return false; ... if (!isnull) *except_cols = DatumGetBool(cfdatum); ~~~ 13. publication_add_relation /* Validate and translate column names into a Bitmapset of attnums. */ - attnums = pub_collist_validate(pri->relation, pri->columns); + attnums = pub_collist_validate(pri->relation, pri->columns, + pri->except && !pub->alltables, + pub->pubgencols_type); I am wondering why we are even calling a function to validate column lists if pub->alltables was true. AFAIK, that combination of column-lists and FOR ALL TABLES is not even possible, so the code seems strange. ~~~ 14. pub_exclude_collist_validate . + /* + * Check if column list specified with EXCEPT have any stored + * generated column and 'publish_generated_columns' is not set to + * 'stored'. + */ + if (except_columns && + TupleDescAttr(tupdesc, attnum - 1)->attgenerated == ATTRIBUTE_GENERATED_STORED && + pubgencols_type != PUBLISH_GENCOLS_STORED) + ereport(ERROR, + errcode(ERRCODE_INVALID_COLUMN_REFERENCE), + errmsg("cannot use stored generated column \"%s\" in publication column list specified with EXCEPT when \"%s\" set to \"%s\"", + colname, "publish_generated_columns", "stored")); As mentioned in the above DOCS comments, I was having doubts about why we have this error. If the parameter says "none", then generated columns will not be replicated, so why should we care if the user also says EXCEPT(gencol1,gencol2). Either way, the result will be the same; the generated column will not be published. ~~~ 15. GetRelationPublications { HeapTuple tup = &pubrellist->members[i]->tuple; Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid; + HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid)); + bool is_table_excluded = ((Form_pg_publication) GETSTRUCT(pubtup))->puballtables && + ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept; - if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept) + if (except_flag == is_table_excluded) result = lappend_oid(result, pubid); + + ReleaseS I'm not 100% sure you need the additional 'pubtup'... Can't you just look at the "prattrs" field to see if a column-list was specified? If "prattrs" is null and "prexcept" is true, isn't that the same combination as what you are looking for here? ~~~ 16. pg_get_publication_tables + columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple, + Anum_pg_publication_rel_prattrs, + &(nulls[2])); + + /* if column list is specified with EXCEPT */ + if (!pub->alltables && except) + columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL); + else + values[2] = columnsDatum; 16a. Something seems fishy here. Isn't there a pathway where you missed assigning value[2] to anything? ~ 16b. Also, I feel there should be some other boolean variable used here instead of checking bot (!pub->alltables && except) in multiple places. ====== src/backend/replication/pgoutput/pgoutput.c 17. RelationSyncEntry + + /* Indicate if no column is included in the publication */ + bool no_cols_published; Maybe this can have a more explanatory comment to explain why it is needed? ~~~ 18. check_and_init_gencol + bool found = false; + bool except_columns = false; + + found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, + NULL, &except_columns); + /* * The column list takes precedence over the * 'publish_generated_columns' parameter. Those will be checked later, - * see pgoutput_column_list_init. + * see pgoutput_column_list_init. But when a column list is specified + * with EXCEPT, it should be checked. */ - if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL)) + if (found && !except_columns) continue; The variable 'found' seems a poor name; how about 'has_column_list' or similar? ~~~ 19. pgoutput_change + /* + * If all columns of a table is present in column list specified with + * EXCEPT, skip publishing the changes. + */ + if (relentry->no_cols_published) + return; /is present/are present/ ====== src/bin/pg_dump/pg_dump.c 20. getPublicationTables + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL; + else + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL; pubrinfo[j].dobj.catId.tableoid = atooid(PQgetvalue(res, i, i_tableoid)); @@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables) pubrinfo[j].pubrelqual = NULL; else pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual)); + pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0); Why not assign pubrinfo[j].pubexcept earlier so you don't have to repeat the strcmp? ~~~ 21. - if (strcmp(prexcept, "t") == 0) + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) simple_ptr_list_append(&exceptinfo, &pubrinfo[j]); Why not assign pubrinfo[j].pubexcept earlier so you don't have to repeat the strcmp? Same also for the PQgetisnull(res, i, i_prattrs))... ~~~ 22. dumpPublicationTable if (pubrinfo->pubrattrs) - appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); + { + if (pubrinfo->pubexcept) + appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs); + else + appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); + } SUGGESTION { if (pubrinfo->pubexcept) appendPQExpBuffer(query, " EXCEPT"); appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); } ====== Kind Regards, Peter Smith. Fujitsu Australia
On Tue, Jun 24, 2025 at 9:48 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > I have included the changes for > it in v14-0003 patch. > Thanks for the patches. I have reviewed patch001 alone, please find few comments: 1) + <para> + The <literal>RESET</literal> clause will reset the publication to the + default state which includes resetting the publication parameters, setting + <literal>ALL TABLES</literal> flag to <literal>false</literal> and + dropping all relations and schemas that are associated with the + publication. </para> It is misleading, as far as I have understood, we do not drop the tables or schemas associated with the pub; we just remove those from the publication's object list. See previous doc: "The ADD and DROP clauses will add and remove one or more tables/schemas from the publication" Perhaps we want to say the same thing when we speak about the 'drop' aspect of RESET. 2) AlterPublicationReset(): + if (!OidIsValid(prid)) + ereport(ERROR, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("relation \"%s\" is not part of the publication", + get_rel_name(relid)))); Can you please help me understand which scenario will give this error? Another question is do we really need this error? IIUC, we generally give errors if a user has explicitly called out a name of an object and that object is not found. Example: postgres=# alter publication pubnew drop table t1,tab2; ERROR: relation "t1" is not part of the publication While in a few other cases, we pass missing_okay as true and do not give errors. Please see other callers of performDeletion in publicationcmds.c itself. There we have usage of missing_okay=true. I have not researched myself, but please analyze the cases where missing_okay is passed as true to figure out if those match our RESET case. Try to reproduce if possible and then take a call. 3) +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public; +ERROR: syntax error at or near "ALL" +LINE 1: ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA pub... There is a problem in syntax, I think the intention of testcase was to run this query successfully. thanks Shveta
On Thu, 26 Jun 2025 at 09:06, Peter Smith <smithpb2250@gmail.com> wrote: > > Hi Shlok. > > Below are some review comments for v14-0003 > > ====== > 1. GENERAL > > Since the new syntax uses EXCEPT, then, in my opinion, you should try > to use that same term where possible when describing things. I > understand it is hard to do this in text and I agree often it makes > more sense to say "exclude" columns etc, but OTOH in the code there > are lots of places where you could have named vars/params differently: > e.g. 'except_collist' instead of 'exclude_collist' might have been > better. > Fixed the variable names. > ====== > Commit message > > 2. > Column list specifed with EXCEPT is stored in column "prattrs" in table > "pg_publication_rel" and also column "prexcept" is set to "true", to maintain > the column list that user wants to exclude from the publication. > > ~ > > That paragraph could do with some rewording. For example, AFAIK, > "prattrs" is for all column lists -- not just except col-lists, but > the way it is described here sounds different. > > Also, /specifed/specified/ > Reworded the paragraph > ====== > doc/src/sgml/catalogs.sgml > > 3. (52.42. pg_publication_rel) > > <para> > - True if the relation must be excluded > + True if the relation or column list must be excluded. If publication is > + created <literal>FOR ALL TABLES</literal> and it is specified as true, > + the relation should be excluded. Else if it is true the columns in > + <literal>prattrs</literal> should be excluded from being published. > </para></entry> > > I felt this could be expressed more simply without mentioning anything > about FOR ALL TABLES. > > SUGGESTION > True if the column list or relation must be excluded from publication. > If a column list is specified in <literal>prattrs</literal>, then > exclude only those columns. If <literal>prattrs</literal> is NULL, > then exclude the entire relation. > Fixed > ====== > doc/src/sgml/logical-replication.sgml > > 4. (29.5. Column Lists) > > <para> > - Each publication can optionally specify which columns of each table are > - replicated to subscribers. The table on the subscriber side must have at > - least all the columns that are published. If no column list is specified, > - then all columns on the publisher are replicated. > + Each publication can optionally specify which columns of each > table should be > + replicated or excluded from replication. On the subscriber side, the table > + must include at least all the columns that are published. If no column list > + is provided, all columns from the publisher are replicated by default. > See <xref linkend="sql-createpublication"/> for details on the syntax. > </para> > > I felt this patch may have changed too much text. IMO, you only needed > to say "... are replicated or excluded from replication.". The other > changes did not seem necessary. > > ~~~ Fixed > 5. > <para> > - If no column list is specified, any columns added to the table later are > - automatically replicated. This means that having a column list which names > - all columns is not the same as having no column list at all. > + If no column list or a column list with EXCEPT is specified, any columns > + added to the table later are automatically replicated. This means > that having > + a column list which names all columns is not the same as having no > + column list at all. If an column list is specified, any columns added to the > + table later are automatically replicated. > </para> > > 5a. > "This means that having a column list which names all columns is not > the same as having no column list at all." -- That note does not make > sense when you say EXCEPT. I think some rewording is needed here. > Fixed > ~ > > 5b. > "If an column list is specified, any columns added to the table later > are automatically replicated.". > > This made no sense -- some words missing? > This change was done by mistake. Removed it. > ~~~ > > 6. > Generated columns can also be specified in a column list. This allows > generated columns to be published, regardless of the publication parameter > <link linkend="sql-createpublication-params-with-publish-generated-columns"> > - <literal>publish_generated_columns</literal></link>. See > - <xref linkend="logical-replication-gencols"/> for details. > + <literal>publish_generated_columns</literal></link>. Generated columns can > + be included in column list specified with EXCEPT clause if publication > + parameter > + <link linkend="sql-createpublication-params-with-publish-generated-columns"> > + <literal>publish_generated_columns</literal></link> is not set to > + <literal>none</literal>. Specified generated columns will not be published. > + See <xref linkend="logical-replication-gencols"/> for details. > </para> > > I am not so sure about this. It seemed overly strict to me. > > Why can't it simply say: > "Generated columns can also be specified in a column list. This allows > generated columns to be published or excluded, regardless of the > publication parameter..." > > Specifically, I don't know why you need to say: > Generated columns can be included in column list specified with EXCEPT > clause if publication parameter publish_generated_columns is not set > to none. Specified generated columns will not be published. > > IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named > cols. So if param is "stored", then the named cols will be excluded. > OTOH, if param is "none" then all generated cols will be excluded > anyway, so why not just allow the EXCEPT (gencol,gencol2) here as > well, because the result will be the same. > > I have removed this change. And allowed specifying generated columns in EXCEPT column list as well irrespective of value of ‘publish_generated_columns’. > ~~~ > > 7. (29.5.1. Examples) > > <para> > - Create a table <literal>t1</literal> to be used in the following example. > + Create tables <literal>t1</literal>, <literal>t2</literal> to be > used in the > + following example. > > /Create tables t1, t2/Create tables t1 and t2/ > Fixed > ~~~ > > 8. > <para> > Create a publication <literal>p1</literal>. A column list is defined for > - table <literal>t1</literal> to reduce the number of columns that will be > - replicated. Notice that the order of column names in the column list does > - not matter. > + table <literal>t1</literal> and a column list is defined for table > + <literal>t2</literal> with EXCEPT clause to reduce the number of > columns that will be > + replicated. Notice that the order of column names in the column > lists does not matter. > > BEFORE > A column list is defined for table t1 and a column list is defined for > table t2... > > SUGGESTION (added comma, etc.) > A column list is defined for table t1, and another column list is > defined for table t2... > Fixed > ~~~ > > 9. > The final example still says: > "Only data from the column list of publication p1 is replicated." > > That doesn't seem quite appropriate now that you also have an EXCEPT > column list. > > SUGGESTION: > Only data specified by the column lists of publication p1 is replicated. > Fixed > ====== > doc/src/sgml/ref/create_publication.sgml > > 10. > + <para> > + When a column list is specified with EXCEPT, the named columns are not > + replicated. Specifying a column list has no effect on > + <literal>TRUNCATE</literal> commands. > + </para> > > I felt that to be clearer the preceding paragraph should be changed as follows: > > /When a column list is specified, only the named columns are > replicated./When a column list without EXCEPT is specified, only the > named columns are replicated./ > Fixed > ~~~ > > 11. CREATE PUBLICATION (NOTES section) > > 11a. > The NOTES talk about replica identity columns -- should you mention EXCEPT here? > Added notes for EXCEPT > ~ > > 11b. > The NOTES talk about generated columns -- should you mention EXCEPT here? > I felt it is not needed. > ====== > src/backend/catalog/pg_publication.c > > 12. check_and_fetch_column_list > > + if (!isnull) > + except = DatumGetBool(cfdatum); > + > + *except_columns = except && !pub->alltables; > > AFAICT, you can Assert(!pub->alltables) because you already checked > that earlier up front. > So you don't need 'except' var either. Just assign *except_cols up > front and then overwrite it later if true. > > SUGGESTION: > > *except_cols = false; > > if (pub->alltables) > return false; > ... > if (!isnull) > *except_cols = DatumGetBool(cfdatum); > Fixed > ~~~ > > 13. publication_add_relation > > /* Validate and translate column names into a Bitmapset of attnums. */ > - attnums = pub_collist_validate(pri->relation, pri->columns); > + attnums = pub_collist_validate(pri->relation, pri->columns, > + pri->except && !pub->alltables, > + pub->pubgencols_type); > > > I am wondering why we are even calling a function to validate column > lists if pub->alltables was true. AFAIK, that combination of > column-lists and FOR ALL TABLES is not even possible, so the code > seems strange. > Fixed > ~~~ > > 14. pub_exclude_collist_validate > . > + /* > + * Check if column list specified with EXCEPT have any stored > + * generated column and 'publish_generated_columns' is not set to > + * 'stored'. > + */ > + if (except_columns && > + TupleDescAttr(tupdesc, attnum - 1)->attgenerated == > ATTRIBUTE_GENERATED_STORED && > + pubgencols_type != PUBLISH_GENCOLS_STORED) > + ereport(ERROR, > + errcode(ERRCODE_INVALID_COLUMN_REFERENCE), > + errmsg("cannot use stored generated column \"%s\" in publication > column list specified with EXCEPT when \"%s\" set to \"%s\"", > + colname, "publish_generated_columns", "stored")); > > As mentioned in the above DOCS comments, I was having doubts about why > we have this error. > > If the parameter says "none", then generated columns will not be > replicated, so why should we care if the user also says > EXCEPT(gencol1,gencol2). Either way, the result will be the same; the > generated column will not be published. > Removed this restriction. > ~~~ > > 15. GetRelationPublications > > { > HeapTuple tup = &pubrellist->members[i]->tuple; > Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid; > + HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid)); > + bool is_table_excluded = ((Form_pg_publication) > GETSTRUCT(pubtup))->puballtables && > + ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept; > > - if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept) > + if (except_flag == is_table_excluded) > result = lappend_oid(result, pubid); > + > + ReleaseS > > > I'm not 100% sure you need the additional 'pubtup'... Can't you just > look at the "prattrs" field to see if a column-list was specified? If > "prattrs" is null and "prexcept" is true, isn't that the same > combination as what you are looking for here? > Yes, we can use this combination as well. Fixed it in latest patch. > ~~~ > > 16. pg_get_publication_tables > > + columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple, > + Anum_pg_publication_rel_prattrs, > + &(nulls[2])); > + > + /* if column list is specified with EXCEPT */ > + if (!pub->alltables && except) > + columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL); > + else > + values[2] = columnsDatum; > > 16a. > Something seems fishy here. Isn't there a pathway where you missed > assigning value[2] to anything? > Modified this change. > ~ > > 16b. > Also, I feel there should be some other boolean variable used here > instead of checking bot (!pub->alltables && except) in multiple > places. > Fixed > > ====== > src/backend/replication/pgoutput/pgoutput.c > > 17. RelationSyncEntry > + > + /* Indicate if no column is included in the publication */ > + bool no_cols_published; > > Maybe this can have a more explanatory comment to explain why it is needed? > Fixed > ~~~ > > 18. check_and_init_gencol > > + bool found = false; > + bool except_columns = false; > + > + found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, > + NULL, &except_columns); > + > /* > * The column list takes precedence over the > * 'publish_generated_columns' parameter. Those will be checked later, > - * see pgoutput_column_list_init. > + * see pgoutput_column_list_init. But when a column list is specified > + * with EXCEPT, it should be checked. > */ > - if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL)) > + if (found && !except_columns) > continue; > > The variable 'found' seems a poor name; how about 'has_column_list' or similar? > Fixed > ~~~ > > 19. pgoutput_change > > + /* > + * If all columns of a table is present in column list specified with > + * EXCEPT, skip publishing the changes. > + */ > + if (relentry->no_cols_published) > + return; > > /is present/are present/ > fixed > ====== > src/bin/pg_dump/pg_dump.c > > 20. getPublicationTables > > + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) > pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL; > + else > + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL; > > pubrinfo[j].dobj.catId.tableoid = > atooid(PQgetvalue(res, i, i_tableoid)); > @@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo > tblinfo[], int numTables) > pubrinfo[j].pubrelqual = NULL; > else > pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual)); > + pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0); > > > Why not assign pubrinfo[j].pubexcept earlier so you don't have to > repeat the strcmp? > Fixed > ~~~ > > 21. > - if (strcmp(prexcept, "t") == 0) > + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) > simple_ptr_list_append(&exceptinfo, &pubrinfo[j]); > > Why not assign pubrinfo[j].pubexcept earlier so you don't have to > repeat the strcmp? Same also for the PQgetisnull(res, i, > i_prattrs))... > Fixed > ~~~ > > 22. dumpPublicationTable > > if (pubrinfo->pubrattrs) > - appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); > + { > + if (pubrinfo->pubexcept) > + appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs); > + else > + appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); > + } > > SUGGESTION > { > if (pubrinfo->pubexcept) > appendPQExpBuffer(query, " EXCEPT"); > > appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); > } Fixed I have addressed the comments shared by you and shared the updated v15 patch set here. Thanks and Regards, Shlok Kyal
Attachment
On Thu, 26 Jun 2025 at 15:27, shveta malik <shveta.malik@gmail.com> wrote: > > On Tue, Jun 24, 2025 at 9:48 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > I have included the changes for > > it in v14-0003 patch. > > > Thanks for the patches. I have reviewed patch001 alone, please find > few comments: > > 1) > + <para> > + The <literal>RESET</literal> clause will reset the publication to the > + default state which includes resetting the publication parameters, setting > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > + dropping all relations and schemas that are associated with the > + publication. > </para> > > It is misleading, as far as I have understood, we do not drop the > tables or schemas associated with the pub; we just remove those from > the publication's object list. See previous doc: > "The ADD and DROP clauses will add and remove one or more > tables/schemas from the publication" > > Perhaps we want to say the same thing when we speak about the 'drop' > aspect of RESET. I have updated the document. > 2) > AlterPublicationReset(): > > + if (!OidIsValid(prid)) > + ereport(ERROR, > + (errcode(ERRCODE_UNDEFINED_OBJECT), > + errmsg("relation \"%s\" is not part of the publication", > + get_rel_name(relid)))); > > Can you please help me understand which scenario will give this error? > > Another question is do we really need this error? IIUC, we generally > give errors if a user has explicitly called out a name of an object > and that object is not found. Example: > > postgres=# alter publication pubnew drop table t1,tab2; > ERROR: relation "t1" is not part of the publication > > While in a few other cases, we pass missing_okay as true and do not > give errors. Please see other callers of performDeletion in > publicationcmds.c itself. There we have usage of missing_okay=true. I > have not researched myself, but please analyze the cases where > missing_okay is passed as true to figure out if those match our RESET > case. Try to reproduce if possible and then take a call. I thought about the above point and I also think this check is not required. Also, the function was calling PublicationDropSchemas with missing_ok as false. I have changed it to be true. > 3) > +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public; > +ERROR: syntax error at or near "ALL" > +LINE 1: ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA pub... > > There is a problem in syntax, I think the intention of testcase was to > run this query successfully. I have fixed it. Thanks Shveta for reviewing the patch. I have addressed the comments and posted an updated version v15 in [1]. [1]: https://www.postgresql.org/message-id/CANhcyEU%2BaPu6iAH2cTA0cDtn3pd6c_njBONCt3FubYZoEEnm8Q%40mail.gmail.com Thanks and Regards, Shlok Kyal
Hi Shlok. Some review comments for v15-0003. ====== doc/src/sgml/catalogs.sgml 1. <para> - True if the relation must be excluded + True if the column list or relation must be excluded from publication. + If a column list is specified in <literal>prattrs</literal>, then + exclude only those columns. If <literal>prattrs</literal> is NULL, + then exclude the entire relation. </para></entry> I noticed other fields on this page say "null" instead of "NULL". It seems like "null" is more conventional. ====== doc/src/sgml/logical-replication.sgml 2. <para> If no column list is specified, any columns added to the table later are automatically replicated. This means that having a column list which names - all columns is not the same as having no column list at all. + all columns is not the same as having no column list at all. Similarly, if an + column list is specified with EXCEPT, any columns added to the table later + are also replicated automatically. </para> 2a. CURRENTLY If no column list or a column list with EXCEPT is specified, any columns added to the table later are automatically replicated. This means that having a column list which names all columns is not the same as having no column list at all. If an column list is specified, any columns added to the table later are automatically replicated. ~ That still doesn't quite make sense. I think instead of saying "This means..." it needs to say something a bit like below: However, a normal column list (without EXCEPT) only replicates the specified columns and no more. Therefore, having a column list that names all columns is not the same as having no column list at all, as more columns may be added to the table later. ~ 2b. And the final sentence "If an column list..." looks like a cut/paste error (??) ~ 2c. Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ~~~ 2.5A. The description about generated columns still says this: CURRENT: Generated columns can also be specified in a column list. This allows generated columns to be published, regardless of the publication parameter publish_generated_columns. See Section 29.6 for details. ~ But I don't think it is quite correct. IMO gencols behaviour is much more subtle... e.g. a) Normal collist - these named cols are published REGARDLESS of the 'publish_generated_cols' parameter (same as before) b) EXCEPT collist - you can specify gencols in the list REGARDLESS of the 'publish_generated_cols' parameter, because since they are named as "except" then they will not be published anyhow.... c) BUT for EXCEPT collist case, I think any gencols that are *not* covered by that EXCEPT collist should follow the rules according to the 'publish_generated_cols' parameter. So, it is much more tricky than the docs currently say: Also 2.5B. - The text says "See Section 29.6 for details," but there are no examples of these combinations (e.g. EXCEPT collist and diff parameter setting) 2.5C, - The regression tests also need to be more complex to cover these 2.5D. - You might need to add something in the CREATE PUBLICATION "NOTES" section after all -- even if it just refers to here. ~~~ 3. <para> Create a publication <literal>p1</literal>. A column list is defined for - table <literal>t1</literal> to reduce the number of columns that will be - replicated. Notice that the order of column names in the column list does - not matter. + table <literal>t1</literal>, and another column list is defined for table + <literal>t2</literal> using the EXCEPT clause to reduce the number of + columns that will be replicated. Note that the order of column names in + the column lists does not matter. <programlisting> -/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d); +/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d), t2 EXCEPT (d, a); </programlisting></para> Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ====== doc/src/sgml/ref/create_publication.sgml 4. <para> - When a column list is specified, only the named columns are replicated. - The column list can contain stored generated columns as well. If the - column list is omitted, the publication will replicate all non-generated - columns (including any added in the future) by default. Stored generated - columns can also be replicated if <literal>publish_generated_columns</literal> - is set to <literal>stored</literal>. Specifying a column list has no - effect on <literal>TRUNCATE</literal> commands. See + When a column list without EXCEPT is specified, only the named columns are + replicated. The column list can contain stored generated columns as well. + If the column list is omitted, the publication will replicate + all non-generated columns (including any added in the future) by default. + Stored generated columns can also be replicated if + <literal>publish_generated_columns</literal> is set to + <literal>stored</literal>. Specifying a column list has no effect on + <literal>TRUNCATE</literal> commands. See <xref linkend="logical-replication-col-lists"/> for details about column lists. </para> Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ~~~ 5. + <para> + When a column list is specified with EXCEPT, the named columns are not + replicated. Specifying a column list has no effect on + <literal>TRUNCATE</literal> commands. + </para> Maybe here EXCEPT should be written as <literal>EXCEPT</literal>. ** Note all the extra subtleties that I mentioned in the review comment #2.5 above --- e.g. IMO any *un-listed* gencols still should follow the parameter rules. ~~~ 6. <para> Any column list must include the <literal>REPLICA IDENTITY</literal> columns - in order for <command>UPDATE</command> or <command>DELETE</command> - operations to be published. There are no column list restrictions if the - publication publishes only <command>INSERT</command> operations. + and any column list specified with EXCEPT must not include the + <literal>REPLICA IDENTITY</literal> columns in order for + <command>UPDATE</command> or <command>DELETE</command> operations to be + published. There are no column list restrictions if the publication publishes + only <command>INSERT</command> operations. </para> 6a. CURRENT: Any column list must include the REPLICA IDENTITY columns, and any column list specified with EXCEPT must not include the REPLICA IDENTITY columns in order for UPDATE or DELETE operations to be published. ~ I felt that might be better expressed the other way around. Also, it might be better to say "not name" instead of "not include" because EXCEPT + include seemed a bit contrary. SUGGESTION (maybe like this) In order for UPDATE or DELETE operations to work, all the REPLICA IDENTITY columns must be published. So, any column list must name all REPLICA IDENTITY columns, and any EXCEPT column list must not name any REPLICA IDENTITY columns. ~~ 6b. Maybe here EXCEPT should be written as <literal>EXCEPT</literal> ====== src/backend/catalog/pg_publication.c check_and_fetch_column_list: 7. + /* Lookup the except attribute */ + cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple, + Anum_pg_publication_rel_prexcept, &isnull); + + if (!isnull) + { + Assert(!pub->alltables); + *except_columns = DatumGetBool(cfdatum); + } + I felt it would be safer to also assign *except_columns = false; up-front so the caller could be sure this flag was meaningful on return. ~~~ pub_form_cols_map: 8. Maybe use snake case like for other params, so /excepcols/except_cols/ ~~~ pg_get_publication_tables: 9. I felt all the logic in this function maybe can be simpler: e.g. If you just have "Bitmapset *except_columns = NULL;" then null nmeans there is no except columns; otherwise there is. This means you don't need a separate 'bool except_column' variable. e.g. Assign the Bitmapset *except_columns after you already have the values[2], instead of doing it later. e.g. The skip code if (except_columns && bms_is_member(att->attnum, columns)) could just check the list member, I think, without the additional bool. ~~~ 10. + /* + * We fetch pubtuple if publication is not FOR ALL TABLES and not + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicate that + * prattrs contains columns to be excluded for replication. + */ + if (!isnull) + except_columns = DatumGetBool(exceptDatum); /indicate/indicates/ ====== src/backend/parser/gram.y 11. + | TABLE relation_expr EXCEPT opt_except_column_list OptWhereClause + { + $$ = makeNode(PublicationObjSpec); + $$->pubobjtype = PUBLICATIONOBJ_TABLE; + $$->pubtable = makeNode(PublicationTable); + $$->pubtable->relation = $2; + $$->pubtable->columns = $4; + $$->pubtable->whereClause = $5; + $$->pubtable->except = true; + $$->location = @1; + } I wasn't expecting you would need another 'opt_except_column_list' and all the code duplication that causes. AFAIK, the syntax is identical for 'opt_column_list' apart from the preceding EXCEPT so I thought all you need is to allow the 'opt_column_list' to have an optional EXCEPT qualifier. ====== src/backend/replication/pgoutput/pgoutput.c 12. + + /* + * Indicates whether no columns are published for a given relation. With + * the introduction of the EXCEPT clause in column lists, it is now + * possible to define a publication that excludes all columns of a table. + * However, the 'columns' attribute cannot represent this case, since a + * NULL value implies that all columns are published. To distinguish this + * scenario, the 'no_cols_published' flag is introduced. + */ + bool no_cols_published; } RelationSyncEntry; But, what about when Bitmapset *columns is not null, but has no bits set -- doesn't that mean the same as "no columns"? ====== src/include/catalog/pg_publication.h 13. extern Bitmapset *pub_form_cols_map(Relation relation, - PublishGencolsType include_gencols_type); + PublishGencolsType include_gencols_type, + Bitmapset *exceptcols); Maybe snake-case like the other params: /exceptcols/except_cols/ ====== src/test/regress/sql/publication.sql 14. +-- Verify that publication is created with EXCEPT +CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1, pub_sch1.pub_test_except2 EXCEPT (b, c); +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; + I think tests should also use psql \dRp+ commands in places to show that the "describe" stuff is working correctly. ~~~ 15. +-- Check for invalid cases +CREATE PUBLICATION testpub_except2 FOR TABLES IN SCHEMA pub_sch1, TABLE pub_test_except1 EXCEPT (b, c); +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT; Should explain more about what you are testing here: a) cannot use EXCEPT col-lists combined with TABLES IN SCHEMA b) syntax error EXCEPT without a col-list ~~~ 16. +-- Verify that publication can be altered with EXCEPT +ALTER PUBLICATION testpub_except SET TABLE pub_test_except1 EXCEPT (a, b), pub_sch1.pub_test_except2; +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; The comment is a bit misleading because there are many kinds of "alter". Maybe say more like Verify ok - ALTER PUBLICATION ... SET ... EXCEPT (col-list) ~~~ 17. +-- Verify ALTER PUBLICATION ... DROP +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1 EXCEPT (a, b); +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1; Should explain more: +-- Verify fails - ALTER PUBLICATION ... DROP ... EXCEPT (col-list) +-- Verify ok - ALTER PUBLICATION ... DROP ... ~~~ 18. +ALTER PUBLICATION testpub_except ADD TABLE pub_test_except1 EXCEPT (c, d); +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; Missing comment: +-- Verify ok - ALTER PUBLICATION ... ADD ... EXCEPT (col-list) ~~~ 19. +-- Verify excluded columns cannot be part of REPLICA IDENTITY +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c); +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX pub_test_except1_a_idx; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; +DROP INDEX pub_test_except1_a_idx; +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a); +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX pub_test_except1_a_idx; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; + +DROP INDEX pub_test_except1_a_idx; 19a. IIUC, really there are multiple tests here, so I think it should all be split and commented separately. a) Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL) b) Verify that EXCEPT col-list cannot contain RI cols (when using INDEX) c) Verify that so long as no clash between RI cols and the EXCEPT col-list, then it is ok ~ 19b. IMO, some index names could be better: CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c); How about 'pub_test_except1_ac_idx'? ~~~ 20. +DROP PUBLICATION testpub_except; +DROP TABLE pub_test_except1; +DROP TABLE pub_sch1.pub_test_except2; Add a "cleanup" comment. ====== Kind Regards, Peter Smith. Fujitsu Australia
Hi Shlok, One more thing, I noticed there is no tab-completion code yet for this new EXCEPT (column_list) syntax. ====== Kind Regards, Peter Smith. Fujitsu Australia
On Fri, Jun 27, 2025 at 3:44 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Thu, 26 Jun 2025 at 15:27, shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Jun 24, 2025 at 9:48 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > I have included the changes for > > > it in v14-0003 patch. > > > > > Thanks for the patches. I have reviewed patch001 alone, please find > > few comments: > > > > 1) > > + <para> > > + The <literal>RESET</literal> clause will reset the publication to the > > + default state which includes resetting the publication parameters, setting > > + <literal>ALL TABLES</literal> flag to <literal>false</literal> and > > + dropping all relations and schemas that are associated with the > > + publication. > > </para> > > > > It is misleading, as far as I have understood, we do not drop the > > tables or schemas associated with the pub; we just remove those from > > the publication's object list. See previous doc: > > "The ADD and DROP clauses will add and remove one or more > > tables/schemas from the publication" > > > > Perhaps we want to say the same thing when we speak about the 'drop' > > aspect of RESET. > I have updated the document. > > > 2) > > AlterPublicationReset(): > > > > + if (!OidIsValid(prid)) > > + ereport(ERROR, > > + (errcode(ERRCODE_UNDEFINED_OBJECT), > > + errmsg("relation \"%s\" is not part of the publication", > > + get_rel_name(relid)))); > > > > Can you please help me understand which scenario will give this error? > > > > Another question is do we really need this error? IIUC, we generally > > give errors if a user has explicitly called out a name of an object > > and that object is not found. Example: > > > > postgres=# alter publication pubnew drop table t1,tab2; > > ERROR: relation "t1" is not part of the publication > > > > While in a few other cases, we pass missing_okay as true and do not > > give errors. Please see other callers of performDeletion in > > publicationcmds.c itself. There we have usage of missing_okay=true. I > > have not researched myself, but please analyze the cases where > > missing_okay is passed as true to figure out if those match our RESET > > case. Try to reproduce if possible and then take a call. > I thought about the above point and I also think this check is not > required. Also, the function was calling PublicationDropSchemas with > missing_ok as false. I have changed it to be true. > Okay. Is there a reason for not using PublicationDropTables() here? We have rewritten similar code in the Reset flow. > > 3) > > +ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA public; > > +ERROR: syntax error at or near "ALL" > > +LINE 1: ALTER PUBLICATION testpub_reset ADD ALL TABLES IN SCHEMA pub... > > > > There is a problem in syntax, I think the intention of testcase was to > > run this query successfully. > > I have fixed it. > > Thanks Shveta for reviewing the patch. I have addressed the comments > and posted an updated version v15 in [1]. Thanks for the patches. My review is in progress but please find few comments on 002: 1) where exception_object is: [ ONLY ] table_name [ * ] We have the above in CREATE and ALTER pub docs, but we do not explain ONLY with EXCEPT. We do have an explanation of ONLY under 'FOR TABLE'. But since 'FOR TABLE' and 'EXCEPT' do not go together, it is somewhat difficult to connect the dots and find the information ONLY in the context of EXCEPT. We shall have ONLY explained for EXCEPT as well. Or we can have ONLY defined in a way that both 'FOR TABLE' and 'EXCEPT' can refer to it. 2) We get tab-completion options in this command: postgres=# create publication pub5 for TABLE tab1 W WHERE ( WITH ( Similarly in this command: create publication pub5 for TABLES IN SCHEMA s1 But once we have 'EXCEPT TABLE', we do not get further tab-completion option like WITH(...) create publication pub5 for ALL TABLES EXCEPT TABLE tab1 3) During tab-expansion, 'EXCEPT TABLE' and 'WITH (' in the below command looks like they are connecting words. Can the gap be increased similar to tab-expansion of next command shown below: postgres=# create publication pub4 for ALL TABLES EXCEPT TABLE WITH ( postgres=# create publication pub4 for ALL TABLES TABLE TABLES IN SCHEMA 4) alter_publication.sgml.orig is a left-over in patch002. thanks Shveta