Re: Add support for specifying tables in pg_createsubscriber. - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Add support for specifying tables in pg_createsubscriber.
Date
Msg-id CAHut+PseKe9hZmvOO_UjUExjY5KiyLPHbrUJr4CP+a+q66oEDg@mail.gmail.com
Whole thread Raw
In response to Re: Add support for specifying tables in pg_createsubscriber.  (Shubham Khanna <khannashubham1197@gmail.com>)
Responses Re: Add support for specifying tables in pg_createsubscriber.
List pgsql-hackers
Hi Shubham,

Here are some v7 review comments.

======
Commit message

1.
This change eliminates the need to know in advance which publications exist
on the publisher, making the tool more user-friendly. Users can specify
publication names and the tool will handle both existing and new publications
appropriately.

~

I disagree with the "eliminates the need to know" part. This change
doesn't remove that responsibility from the user. They still need to
be aware of what the existing publications are so they don't end up
reusing a publication they did not intend to reuse.

~~~

2. <general>
It would be better if the commit message wording was consistent with
the wording in the docs.

======
doc/src/sgml/ref/pg_createsubscriber.sgml

3.
+      <para>
+       When reusing existing publications, you should understand their current
+       configuration. Existing publications are used exactly as configured,
+       which may replicate different tables than expected.
+       New publications created with <literal>FOR ALL TABLES</literal> will
+       replicate all tables in the database, which may be more than intended.
+      </para>

Is that paragraph needed? What does it say that was not already said
by the previous paragraph?

~~~

4.
+      <para>
+       Use <option>--dry-run</option> to see which publications will be reused
+       and which will be created before running the actual command.
+       When publications are reused, they will not be dropped during cleanup
+       operations, ensuring they remain available for other uses.
+       Only publications created by
+       <application>pg_createsubscriber</application> will be cleaned up.
+      </para>

There also needs to be more clarity about the location of the
publications that are getting "cleaned up". AFAIK the function
check_and_drop_publications() only cleans up for the target server,
but that does not seem at all obvious here.

======
src/bin/pg_basebackup/pg_createsubscriber.c

setup_publisher:

5.
- if (num_pubs == 0 || num_subs == 0 || num_replslots == 0)
+ if (dbinfo[i].pubname == NULL || dbinfo[i].subname == NULL ||
dbinfo[i].replslotname == NULL)
  genname = generate_object_name(conn);
- if (num_pubs == 0)
+ if (dbinfo[i].pubname == NULL)
  dbinfo[i].pubname = pg_strdup(genname);
- if (num_subs == 0)
+ if (dbinfo[i].subname == NULL)
  dbinfo[i].subname = pg_strdup(genname);
- if (num_replslots == 0)
+ if (dbinfo[i].replslotname == NULL)
  dbinfo[i].replslotname = pg_strdup(dbinfo[i].subname);
~

Are these changes related to the --publication change, or are these
some other issue?

~~~

6.
  * consistent LSN and the new publication rows (such transactions
  * wouldn't see the new publication rows resulting in an error).
  */
- create_publication(conn, &dbinfo[i]);
+ if (check_publication_exists(conn, dbinfo[i].pubname, dbinfo[i].dbname))

The comment preceding this if/else is only talking about "Create the
publications..", but it should be more like "Reuse existing or create
new publications...". Alternatively, move the comments within the if
and else.

~~~

check_and_drop_publications:

7.
  if (!drop_all_pubs || dry_run)
- drop_publication(conn, dbinfo->pubname, dbinfo->dbname,
- &dbinfo->made_publication);
+ {
+ if (dbinfo->made_publication)
+ drop_publication(conn, dbinfo->pubname, dbinfo->dbname,
+ &dbinfo->made_publication);
+ }

I find this function logic confusing. In particular, why is the
"made_publication" flag only checked for dry_run?

This function comment says it will "drop all pre-existing
publications." but doesn't that contradict your commit message and
docs that said statements like "When publications are reused, they are
never dropped during cleanup operations"?

======
.../t/040_pg_createsubscriber.pl

8.
IMO one of the most important things for the user is that they must be
able to know exactly which publications will be reused, and which
publications will be created as FOR ALL TABLES. So, there should be a
test to verify that the --dry_run option emits all the necessary
logging so the user can tell that.

======
Kind Regards,
Peter Smith
Fujitsu Australia



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Orphan page in _bt_split
Next
From: Michael Paquier
Date:
Subject: Re: Resetting recovery target parameters in pg_createsubscriber