Re: Add support for specifying tables in pg_createsubscriber. - Mailing list pgsql-hackers
From | Shubham Khanna |
---|---|
Subject | Re: Add support for specifying tables in pg_createsubscriber. |
Date | |
Msg-id | CAHv8RjJqDW8VbDtmkVgZ94DL+Bj9DxFpAog9ex9UHSAgkvo=_A@mail.gmail.com Whole thread Raw |
In response to | Re: Add support for specifying tables in pg_createsubscriber. (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Thu, Sep 11, 2025 at 8:02 AM Peter Smith <smithpb2250@gmail.com> wrote: Hi Peter, Thank you for the review. I have addressed your comments in the latest patch. > My first impression of this latest patch is that the new option name > is not particularly intuitive. > > Below are some more review comments for patch v5-0001; any changes to > the name will propagate all through this with different member names > and usage, etc, so I did not repeat that same comment over and over. > > ====== > Commit message > > 1. > Add a new '--if-not-exists' option to pg_createsubscriber, allowing users to > reuse existing publications if they are already present, or create them if they > do not exist. > > ~ > > IMO, this option name ("if-not-exists") gives the user no clues as to > its purpose. > > I suppose you wanted the user to associate this option with a CREATE > PUBLICATION, so that they can deduce that it acts internally like a > "CREATE PUBLICATION pub ... IF NOT EXISTS" (if there was such a > thing), I don't see how a user is able to find any meaning at all from > this vague name. The only way they can know what this means is to read > all the documentation. > > e.g. > - if WHAT doesn't exist? > - if <??> not exists, then do WHAT? > > Alternatives like "--reuse-pub-if-exists" or "--reuse-existing-pubs" > might be easier to understand. > I agree that --if-not-exists may not be very intuitive, as it doesn’t clearly convey the intended behavior without reading the documentation. For this version, I have renamed the option to "--reuse-existing-publications", which I believe makes the purpose clearer: if the specified publication already exists, it will be reused; otherwise, a new one will be created. This avoids ambiguity while still keeping the semantics aligned with the original intent. > ~~~ > > 2. > Key features: > 1. New '--if-not-exists' flag changes the behavior of '--publication'. > 2. If the publication exists, it is reused. > 3. If it does not exist, it is created automatically. > 4. Supports per-database specification, consistent with other options. > 5. Avoids the complexity of option conflicts and count-matching rules. > 6. Provides semantics consistent with SQL’s IF NOT EXISTS syntax. > > ~ > > Some of those "features" seem wrong, and some seem inappropriate for > the commit message: > > e.g. TBH, I doubt that "4. Supports per-database specification" can > work as claimed here. Supposing I have multiple databases, then I > cannot see how I can say "if-not-exists" for some databases but not > for others. > > e.g. The "5. Avoids the complexity..." seems like a reason why an > alternative design was rejected; Why does this comment belong here? > > e.g. The "6. Provides semantics..." seems like your internal > implementation justification, whereas IMO this patch should be more > focused on making any new command-line option more > intuitive/meaningful from the point of view of the user. > Updated the commit message. > ====== > doc/src/sgml/ref/pg_createsubscriber.sgml > > 3. > + <para> > + This option provides flexibility for mixed scenarios where some > + publications may already exist while others need to be created. > + It eliminates the need to know in advance which publications exist on > + the publisher. > + </para> > > Hmm. Is that statement ("It eliminates the need to know...") correct? > I feel it should say the total opposite of this. > > For example, IMO now the user needs to be extra careful because if > they say --publication=pub1 --if-not-exists then they have to be 100% > sure if 'pub1' exists or does not exist, otherwise they might > accidentally be making pub1 FOR ALL TABLES when really they expected > they were reusing some existing publication 'pub1', or vice versa. > > In fact, I think the docs should go further and *recommend* that the > user never uses this new option without firstly doing a --dry-run to > verify they are actually getting the publications that they assume > they are getting. > Fixed. > ~~~ > > 4. > + <para> > + This option follows the same semantics as SQL > + <literal>IF NOT EXISTS</literal> clauses, providing consistent behavior > + with other PostgreSQL tools. > + </para> > > Why even say this in the docs? What other "tools" are you referring to? > Fixed. > ====== > src/bin/pg_basebackup/pg_createsubscriber.c > > 5. > bool made_publication; /* publication was created */ > + bool publication_existed; /* publication existed before we > + * started */ > }; > > This new member feels redundant to me. > > AFAIK, the publication will either exist already, OR it will be > created/made by pg_createsubscriber. You don't need 2 booleans to > represent that. > At most, maybe all you want is another local variable in the function > setup_publisher(). > Fixed. > ~~~ > > cleanup_objects_atexit: > > (same comments as posted in my v4 review) > Fixed. > 6a. > - if (dbinfo->made_publication) > + if (dbinfo->made_publication && !dbinfo->publication_existed) > > Same as above review comment #5. It's either made or it's not made, > right? Why the extra boolean? > > ~ > > 6b. > - if (dbinfo->made_publication) > + if (dbinfo->made_publication && !dbinfo->publication_existed) > > Ditto. > Fixed. > ~~~ > > check_publication_exists: > > 7. > +/* > + * Add function to check if publication exists. > + */ > > Function comment should not say "Add function". > Fixed. > ~ > > 8. > + PGresult *res; > + bool exists = false; > + char *query; > > Redundant assignment? > Fixed. > ~~~ > > setup_publisher: > > 9. > + /* > + * Check if publication already exists when --if-not-exists is > + * specified > + */ > + if (opt->if_not_exists) > + { > + dbinfo[i].publication_existed = check_publication_exists(conn, > dbinfo[i].pubname, dbinfo[i].dbname); > + if (dbinfo[i].publication_existed) > + { > + pg_log_info("using existing publication \"%s\" in database \"%s\"", > + dbinfo[i].pubname, dbinfo[i].dbname); > + } > + } > + > /* > * Create publication on publisher. This step should be executed > * *before* promoting the subscriber to avoid any transactions between > * consistent LSN and the new publication rows (such transactions > * wouldn't see the new publication rows resulting in an error). > */ > - create_publication(conn, &dbinfo[i]); > + if (opt->if_not_exists && dbinfo[i].publication_existed) > + dbinfo[i].made_publication = false; > + else > + { > + create_publication(conn, &dbinfo[i]); > + if (opt->if_not_exists) > + { > + pg_log_info("created publication \"%s\" in database \"%s\"", > + dbinfo[i].pubname, dbinfo[i].dbname); > + } > + } > > This all seems way more complicated than it needs to be. e.g. I doubt > that you need to be checking opt->if_not_exists 3 times. > > Simpler logic might be more like below: > > bool make_pub = true; > > if (opt->if_not_exists && check_publication_exists(...)) > { > pg_log_info("using existing..."); > make_pub = false; > } > > ... > > if (make_pub) > { > create_publication(conn, &dbinfo[i]); > pg_log_info("created publication..."); > } > > dbinfo[i].made_publication = make_pub; > Fixed. > ====== > .../t/040_pg_createsubscriber.pl > > 10. > +# Initialize node_s2 as a fresh standby of node_p for table-level > +# publication test. > > I don't know that you should still be calling this a "table-level" > test. Maybe it's more like an "existing publication" test now? > Fixed. > ~~~ > > 11. > +is( $result, qq({test_pub_existing} > +{test_pub_new}), > + "subscriptions use the correct publications with --if-not-exists in > $db1 and $db2" > +); > > Something seems broken. I deliberately caused an error to occur to see > what would happen, and the substitutions of $db1 and $db2 went crazy. > > # Failed test 'subscriptions use the correct publications with > --if-not-exists in regression\"\ > > > ¬ !"\#$%&'()*+,-\\"\\\ > and regression./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ' > # at t/040_pg_createsubscriber.pl line 614. > Fixed. The attached patch contains the suggested changes. Thanks and regards, Shubham Khanna,
Attachment
pgsql-hackers by date: