Home > mailing lists
Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers

From	Ashutosh Bapat
Subject	Re: SQL Property Graph Queries (SQL/PGQ)
Date	August 7 13:47:37
Msg-id	CAExHW5t8UJgon2VW2HMndESyOfhpk2syDYaBSuw60ig9qXcHAA@mail.gmail.com Whole thread Raw
In response to	Re: SQL Property Graph Queries (SQL/PGQ) (Amit Langote <amitlangote09@gmail.com>)
List	pgsql-hackers
Tree view
Hi Amit,
Thanks for your comments

On Wed, Jul 23, 2025 at 2:08 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> Hi Ashutosh,
>
> On Tue, Jul 22, 2025 at 8:40 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> > Here's the patchset rebased on the latest master.
>
> Thanks for the update. I was going through the patch series today and
> had a few comments on the structure that might help with reviewing and
> eventually committing, whoever ends up doing that. :)
>
> I noticed that some functions, like graph_table_property_reference(),
> are introduced in one patch and then renamed or removed in a later
> one. It’s easier to review if such refactoring is avoided across
> patches -- ideally, each patch should introduce code in its final
> intended form. That also helps reduce the number of patches and makes
> each one more self-contained.
>
> One of the later patches (0004) collects several follow-up changes.
> While it fixes a real bug -- a crash when GRAPH_TABLE is used inside
> plpgsql due to a conflict with the columnref hook -- it also includes
> incidental cleanups like switching to foreach_node(), updating
> comments, and adjusting function signatures. Those would be better
> folded into the patches that introduced the relevant code, rather than
> grouped into a catch-all at the end. That keeps each patch focused and
> easier to review -- and easier to merge when committing.
>
> A general principle that might help: if you're refactoring existing
> code, a standalone preliminary patch makes sense. But if you're
> tweaking something just added in the same series, it’s usually better
> to squash that into the original patch. The rename from
> graph_table_property_reference() to transformGraphTablePropertyRef()
> may be a fair exception since it reflects a shift prompted by the bug
> fix -- but most other adjustments could be folded in without loss of
> clarity.
>
> I understand the intent to spell out each change, but if the details
> are incidental to the overall design, they don’t necessarily need to
> be split out. Explaining the reasoning in the thread is always
> helpful, but consolidating the patches once the design has settled
> makes things easier for both reviewers and committers.

0001 is almost the same patch that Peter posted almost a year ago.
Each of 0002 onwards patches contains logically grouped changes. I
have changed 0001 minimally (so that it continues to apply, compile
and pass regression). I think this arrangement will make it easy for
Peter to review the changes. It will help him understand what all
changed since 0001 and each change in its own context. That has led to
a lot of overlap between 0002 and 0001 which I think should be
squashed together even now. But I would like to know what Peter thinks
- what would make his review easier. This arrangement might also help
Junwang, who has reviewed some patches, to continue his incremental
review.

I have rearranged the patches 0002 to 0014 so that the same change
isn't revised in multiple patches. Hope that helps.

I am also attaching a single patch as well to this email, so that
newer reviewers can review the changes as a whole.

SQL_PGQ_20250807.zip is separate patches zipped together.
SQL_PGQ_one_patch_20250807.zip is a single diff with changes from all
the patches squashed together. It's still zipped to avoid the email
being held for moderation.

>
> Finally, attaching the patches directly, with versioned names like
> v8-000x, instead of zipping them helps.  Many folks (myself included)
> will casually skip a zip file because of the small hassle of unzipping
> just to read a patch. I once postponed reading such a patch and didn’t
> get back to it for quite a while. :)

I used to attach those as separate patches till [1], but after that
the total size of the patchset grew so much that my emails used to get
held back for moderation. So I started attaching zip files. The
increase in size is mostly due to the 0014 patch. If we think that
it's not needed or can be trimmed down, we will be able to attach the
patchset as separate attachments.

Here's what changed between the last patchset and this

0001 has some TODOs, FIXMEs and XXXs. I and Junwang had fixed some of
them earlier. This patchset fixes some more. Below is my analysis of
each of them.
+
+ <!-- TODO: multiple path patterns in a graph pattern (comma-separated) -->
+
+ <!-- TODO: quantifiers -->

I think we should remove these TODOs since we are not going to support
these cases in this patchset. We will add the documentation when we
implement these features.

+/*
+ * 15.2
+ * PG_DEFINED_LABEL_SETS view
+ */
+
+-- TODO
+
+
+/*
+ * 15.3
+ * PG_DEFINED_LABEL_SET_LABELS view
+ */
+
+-- TODO
+
+
+/*
+ * 15.4
+ * PG_EDGE_DEFINED_LABEL_SETS view
+ */
+
+-- TODO
+
+/*
+ * 15.6
+ * PG_EDGE_TRIPLETS view
+ */
+
+-- TODO
+
+/*
+ * 15.15
+ * PG_VERTEX_DEFINED_LABEL_SETS view
+ */
+
+-- TODO

All these views are related to the defined label set. IIUC, defined
label set is a set of labels which is shared by multiple edges or
vertices of a property graph. Please note it's the whole set shared;
not just individual labels in the set. In that sense, I think, the
labels associated with each edge or vertex table in a property graph
form a defined label set. That way the property graph element's OID
becomes the defined label set's identifier. I am not sure if this view
has any practical use for tabular property graphs. It's not clear to
me what these views will contain. Since they are not essential for
SQL/PGQ functionality, I have not implemented them.

+ /*
+ * Now check number and names.
+ *
+ * XXX We could provide more detail in the error messages, but that would
+ * probably only be useful for some ALTER commands, because otherwise it's
+ * not really clear which label definition is the wrong one, and so you'd
+ * have to construct a rather verbose report to be of any use. Let's keep
+ * it simple for now.
+ */

Agree with XXX. I think the code is good enough as is. We can leave
XXX there in case we expect someone to improve it in future. Otherwise
we should remove XXX as well.

+
+ rel = table_open(PropgraphLabelPropertyRelationId, RowShareLock);
+ ScanKeyInit(&key[0],
+ Anum_pg_propgraph_label_property_plppropid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(propoid));
+ scan = systable_beginscan(rel, InvalidOid /* FIXME */ ,
+ true, NULL, 1, key);

I think the FIXME is to add the correct index OID.
pg_propgraph_label_property_label_prop_index contains plppropid as a
key but it's not the first column. So probably some other, yet not
defined, index is expected here?

+ALTER PROPERTY GRAPH g3 ALTER VERTEX TABLE t3 DROP LABEL t3l3;
+ALTER PROPERTY GRAPH g3 DROP VERTEX TABLES (t2); -- fail (TODO:
dubious error message)
+ERROR: cannot drop vertex t2 of property graph g3 because other
objects depend on it
+DETAIL: edge e1 of property graph g3 depends on vertex t2 of property graph g3
+HINT: Use DROP ... CASCADE to drop the dependent objects too.

I don't see what's dubious about this error message. edge e1 connects
t2 and t1. Dropping t2 would leave e1 dangling; error is preventing
it. Looks similar to how we prevent a referenced table being dropped.
Or is it expected that e1 be dropped without specifying cascade when
t2 is dropped since e1 has no existence without t2?

Also the patchset fixes the CI failure from the previous patchset.

The patches are reordered slightly. Each patch contains a commit
message that explains the changes in that patch.

[1] https://www.postgresql.org/message-id/CAExHW5sM+ZGVzR1428FsDHuWc-FU2-6zQr5j91KLh6vZaWY0ow@mail.gmail.com





--
Best Wishes,
Ashutosh Bapat
Attachment

pgsql-hackers by date:
From: John Naylor
Date: 07 August, 13:30:26
Subject: Re: cpluspluscheck vs ICU again
From: Nazir Bilal Yavuz
Date: 07 August, 14:15:06
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
Re: SQL Property Graph Queries (SQL/PGQ) - Mailing list pgsql-hackers

Attachment

Previous

Next