Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CALDaNm3q_2Qfo4XbFkBT8mPz_ALS2MyuR=isP7_8Pz27JnLHiw@mail.gmail.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: Support logical replication of DDLs
|
List | pgsql-hackers |
On Fri, 11 Nov 2022 at 10:17, Peter Smith <smithpb2250@gmail.com> wrote: > > Here are more review comments for the v32-0001 file ddl_deparse.c > > This completes my first review pass over this overly large file. > > This review has taken a long time, so for any of my review comments > (in this and previous posts) that get rejected, please reply citing > the rejected reference numbers, because I hope to avoid spending > multiple days (in a future review) trying to reconcile what was > addressed vs what was not addressed. TIA. > > *** NOTE - my review post became too big, so I split it into smaller parts. > > THIS IS PART 1 OF 4. > > ====== > > src/backend/commands/ddl_deparse.c > > G.1. GENERAL _VA args wrapping > > + tmp = new_objtree_VA("WITH GRANT OPTION", > + 1, "present", ObjTypeBool, > + stmt->action->grant_option); > > In general, I think all these _VA() style function calls are easier to > read if you can arrange to put each of the argument names on a new > line instead of just wrapping them randomly. > > So the above would be better as: > > tmp = new_objtree_VA("WITH GRANT OPTION", 1, > "present", ObjTypeBool, stmt->action->grant_option); > > Please search/modify all cases of new_objtree_VA like this. Modified > G.2. GENERAL - "type" object > > There are many functions that insert a "type" object for some purpose: > > e.g. > + tmpobj = new_objtree_VA("DETACH PARTITION %{partition_identity}D FINALIZE", 2, > + "type", ObjTypeString, "detach partition finalize", > + "partition_identity", ObjTypeObject, > + new_objtree_for_qualname_id(RelationRelationId, > + sub->address.objectId)); > > e.g. > + tmpobj = new_objtree_VA(fmtstr, 2, > + "type", ObjTypeString, "add column", > + "definition", ObjTypeObject, tree); > > I'm not sure yet what these "type" objects are used for, but I felt > that these unsubstituted values should look slightly more like enum > values, and slightly less like real SQL syntax. > > For example - maybe do like this (for the above): > > "detach partition finalize" -> "type_detach_partition_finalize" > "add column" -> "type_add_column" > etc. I felt this is mainly useful for handling when the publisher is running on a higher version and the subscriber is running on a lower version, this feature may or may not be part of the first version. We might be removing this code from the final patch. I have not made any change for this. We will handle this at a later point based on handling version is required or not as part of patch to be committed. > G.3. GENERAL - JSON deparsed structures should be documented > > AFAICT there are mixtures of different JSON structure styles at play > in this module. Sometimes there are trees with names and sometimes > not, sometimes there are "present" objects and sometimes not. > Sometimes entire trees seemed unnecessary to me. It feels quite > arbitrary in places but it's quite hard to compare them because > everything is spread across 9000+ lines. > > IMO all these deparse structures ought to be documented. Then I think > it will become apparent that lots of them are inconsistent with the > others. Even if such documentation is ultimately not needed by > end-users, I think it would be a very valuable internal design > accompaniment to this module, and it would help a lot for > reviews/maintenance/bug prevention etc. Better late than never. There is "type" and "present" which might confuse the user, this is required for handling when the publisher is running on a higher version and the subscriber is running on a lower version, this feature may or may not be part of the first version. We might be removing this code from the final patch. I have not documented this part, the others I have documented. Let me know if you are looking for adding comments for some others particularly. > G.4 GENERAL - Underuse of _VA() function. > > (Probably I've mentioned this before in previous review comments, but > I keep encountering this many times). > > The json is sort of built up part by part and objects are appended ... > it was probably easier to think about each part during coding but OTOH > I think this style is often unnecessary. IMO most times the function > can be far simpler just by gathering together all the necessary values > and then using a single big new_objtree_VA() call to deparse the > complete format in one call. I think it could also shave 100s of lines > of code from the module. Modified > G.5 GENERAL - Inconsistent function comment wording. > > The function comments are worded in different ways... > > "Given a XXX OID and the parse tree that created it, return an ObjTree > representing the creation command." > > versus > > "Given a XXX OID and the parse tree that created it, return the JSON > blob representing the creation command." > > Please use consistent wording throughout. Modified > G.6 GENERAL - present=false > > There are many calls that do like: > append_bool_object(tmpobj, "present", false); > > I was thinking the code would be cleaner if there was a wrapper function like: > > static void > append_not_present(ObjTree objTree) > { > append_bool_object(objTree, "present", false); > } Modified > G.7 GENERAL - psprintf format strings > > There are quite a few places where the format string is > pre-constructed using psprintf. > > e.g. > + fmt = psprintf("ALTER %s %%{identity}s OWNER TO %%{newowner}I", > + stringify_objtype(node->objectType)); > + > + ownerStmt = new_objtree_VA(fmt, 2, > + "identity", ObjTypeString, > + getObjectIdentity(&address, false), > + "newowner", ObjTypeString, > + get_rolespec_name(node->newowner)); > > It's not entirely clear to me why this kind of distinction is even > made, or even what are the rules governing the choice. AFAICT this > same result could be achieved by using another string substitution > marker. So why not do it that way instead of mixing different styles? > > IMO many/most of the psprintf can be removed. > > e.g. I mean something like this (for the above example): > > fmt = "ALTER %{obj_type}s %{identity}s OWNER TO %{newowner}I"; > > ownerStmt = new_objtree_VA(fmt, 3, > "obj_type", ObjTypeString, stringify_objtype(node->objectType), > "identity", ObjTypeString, getObjectIdentity(&address, false), > "newowner", ObjTypeString, get_rolespec_name(node->newowner)); Modified wherever possible > G.8 GENERAL - Inconsistent OID/oid in error messages. > > errmsg("role with OID %u does not exist", roleoid))); > elog(ERROR, "cache lookup failed for collation with OID %u", objectId); > elog(ERROR, "cache lookup failure for function with OID %u", > elog(ERROR, "cache lookup failed for schema with OID %u", > errmsg("role with OID %u does not exist", istmt->grantor_uid))); > elog(ERROR, "cache lookup failed for operator with OID %u", objectId); > elog(ERROR, "cache lookup failed for type with OID %u", objectId); > elog(ERROR, "cache lookup failed for conversion with OID %u", objectId); > elog(ERROR, "cache lookup failed for extension with OID %u", > elog(ERROR, "cache lookup failed for extension with OID %u", > elog(ERROR, "cache lookup failed for cast with OID %u", objectId); > elog(ERROR, "cache lookup failed for domain with OID %u", objectId); > elog(ERROR, "cache lookup failure for function with OID %u", > elog(ERROR, "cache lookup failure for language with OID %u", > elog(ERROR, "null prosrc in function with OID %u", objectId); > elog(ERROR, "cache lookup failed for opclass with OID %u", opcoid); > elog(ERROR, "cache lookup failed for operator family with OID %u", > opcForm->opcfamily); > elog(ERROR, "cache lookup failed for operator family with OID %u", objectId); > elog(ERROR, "cache lookup failed for domain with OID %u", > elog(ERROR, "cache lookup failed for collation with OID %u", objectId); > elog(ERROR, "cache lookup failed for operator with OID %u", objectId); > elog(ERROR, "cache lookup failed for type with OID %u", objectId); > elog(ERROR, "cache lookup failed for text search parser with OID %u", > elog(ERROR, "cache lookup failed for text search dictionary " "with > OID %u", objectId); > elog(ERROR, "cache lookup failed for text search template with OID %u", > elog(ERROR, "cache lookup failed for text search dictionary " "with > OID %u", objectId); > elog(ERROR, "cache lookup failed for opclass with OID %u", > elog(ERROR, "cache lookup failed for operator family with OID %u", > elog(ERROR, "cache lookup failure for transform with OID %u", > elog(ERROR, "cache lookup failure for language with OID %u", > elog(ERROR, "cache lookup failure for function with OID %u", > elog(ERROR, "cache lookup failure for function with OID %u", > elog(ERROR, "cache lookup failed for rewrite rule for view with OID > %u", viewoid) > > elog(ERROR, "cache lookup failed for range with type oid %u", > elog(ERROR, "cache lookup failed for rewrite rule with oid %u", > > G.8a. > Most are uppercase 'OID'. A few are lowercase 'oid' Modified > G.8b. > There is a mixture of "cache lookup failed" and "cache lookup failure" > -- should all be the same. Modified > G.8c. > A few above (e.g. role) have a different message text. Shouldn't those > also be "cache lookup failed"? Modified > G.9 GENERAL - ObjTree variables > > Often the ObjTree variable (for the deparse_XXX function return) is > given the name of the statement it is creating. > > Although it is good to be descriptive, often there is no need for long > variable names (e.g. 'createTransform' etc), because there is no > ambiguity anyway and it just makes for extra code characters and > unnecessary wrapping. IMO it would be better to just call everything > some short but *consistent* name across every function -- like 'stmt' > or 'json_ddl' or 'root' or 'ret' ... or whatever. Modified Thanks for the comments, the attached v52 patch has the changes for the same. Regards, Vignesh
Attachment
- v52-0005-Skip-ALTER-TABLE-subcommands-generated-for-Table.patch
- v52-0004-Test-cases-for-DDL-replication.patch
- v52-0003-Support-CREATE-TABLE-AS-SELECT-INTO.patch
- v52-0001-Functions-to-deparse-DDL-commands.patch
- v52-0002-Support-DDL-replication.patch
- v52-0006-Support-DDL-replication-of-alter-type-having-USI.patch
- v52-0007-Introduce-the-test_ddl_deparse_regress-test-modu.patch
pgsql-hackers by date: