On Fri, Jan 24, 2025 at 1:23 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> I may not be close to the task monitoring area, but I utilise queryId and other tools to differentiate plan nodes inside extensions. To begin with, just as queryId serves as a class identifier for queries, plan_id identifies a class of nodes, not a single node. In the implementation provided here, nodes with the same hash can represent different subtrees. For example, JOIN(A, JOIN(B,C)) and JOIN(JOIN(B,C),A) may have the same ID.
> Moreover, I wonder if this version of plan_id reacts to a change of join level. It appears that only a change of the join clause alters the plan_id hash value, which means you would end up with a single hash for very different plan nodes. Is that acceptable? To address this, we should consider the hashes of the left and right subtrees and the hashes of each subplan (especially in the case of Append).
I looked at this again to confirm we're not missing anything:
I don't think any of the posted patch versions (including the just-shared v4) have a problem distinguishing two plans that are otherwise very similar and differ only in JOIN order. Since we descend into the inner/outer plans via the setrefs.c treewalk, the position of the JOIN nodes relative to the other nodes changes the order in which nodes are added to the jumble, so the resulting plan jumble should differ. For each join/scan node we jumble the node tag, and for scans we also jumble the RT index they point to.
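To illustrate the argument, here is a toy Python model of the idea. It is not the patch's code: `PlanNode`, `jumble` and `plan_id` are names made up for this sketch, SHA-256 stands in for whatever hash the patch actually uses, and the only assumption carried over is that the walk visits each node, then its outer subtree, then its inner subtree, recording the node tag and the scan's RT index.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node (not PostgreSQL's struct)."""
    tag: str                              # node tag, e.g. "NestLoop", "SeqScan"
    scanrelid: int = 0                    # RT index for scans, 0 for non-scans
    outer: Optional["PlanNode"] = None    # outer (left) subtree
    inner: Optional["PlanNode"] = None    # inner (right) subtree


def jumble(node: Optional[PlanNode], buf: List[str]) -> None:
    """Append this node's tag and RT index, then descend outer, then inner."""
    if node is None:
        return
    buf.append(f"{node.tag}:{node.scanrelid};")
    jumble(node.outer, buf)
    jumble(node.inner, buf)


def plan_id(root: PlanNode) -> str:
    """Hash the jumble buffer produced by the tree walk."""
    buf: List[str] = []
    jumble(root, buf)
    return hashlib.sha256("".join(buf).encode()).hexdigest()[:16]


def scan(rti: int) -> PlanNode:
    return PlanNode("SeqScan", scanrelid=rti)


def join(outer: PlanNode, inner: PlanNode) -> PlanNode:
    return PlanNode("NestLoop", outer=outer, inner=inner)


# A, B, C are scans on RT indexes 1, 2, 3.
plan_1 = join(scan(1), join(scan(2), scan(3)))   # JOIN(A, JOIN(B,C))
plan_2 = join(join(scan(2), scan(3)), scan(1))   # JOIN(JOIN(B,C), A)

print(plan_id(plan_1) == plan_id(plan_2))
```

In this model the first plan is jumbled as NestLoop, A, NestLoop, B, C and the second as NestLoop, NestLoop, B, C, A, so the two buffers (and hence the IDs) differ even though the same set of nodes is involved. Whether the real patch behaves the same way depends on the walk matching this assumption.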
Do you have a reproducer in which two such plans, e.g. JOIN(A, JOIN(B,C)) and JOIN(JOIN(B,C),A), generate the same plan ID?