Re: [PATCH] Optionally record Plan IDs to track plan changes for a query - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: [PATCH] Optionally record Plan IDs to track plan changes for a query
Date
Msg-id 25429a0d-e795-45e9-8f99-f578f70c20a6@gmail.com
In response to Re: [PATCH] Optionally record Plan IDs to track plan changes for a query  (Lukas Fittl <lukas@fittl.com>)
List pgsql-hackers
On 2/5/25 09:16, Lukas Fittl wrote:
> Hi Andrei,
> 
> On Fri, Jan 24, 2025 at 1:23 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
> 
>     I may not be close to the task monitoring area, but I utilise queryId
>     and other tools to differ plan nodes inside extensions. Initially, like
>     queryId serves as a class identifier for queries, plan_id identifies a
>     class of nodes, not a single node. In the implementation provided here,
>     nodes with the same hash can represent different subtrees. For example,
>     JOIN(A, JOIN(B,C)) and JOIN(JOIN(B,C),A) may have the same ID.
> 
> 
>     Moreover, I wonder if this version of plan_id reacts to the join level
>     change. It appears that only a change of the join clause alters the
>     plan_id hash value, which means you would end up with a single hash for
>     very different plan nodes. Is that acceptable? To address this, we
>     should consider the hashes of the left and right subtrees and the
>     hashes of each subplan (especially in the case of Append).
> 
> 
> I looked back at this again just to confirm we're not missing anything:
> 
> I don't think any of the posted patch versions (including the just 
> shared v4) have a problem with distinguishing two plans that are very 
> similar but only differ in JOIN order. Since we descend into the inner/ 
> outer plans via the setrefs.c treewalk, the placement of JOIN nodes vs 
> other nodes should cause a different plan jumble (and we include both 
> the node tag for the join/scan nodes, as well as the RT index the scans 
> point to in the jumble).
Maybe. I haven't dived into that stuff deeply yet. It is not difficult to 
check.
The main point was that different extensions want different plan_ids. 
For example, planner extensions want this field to be distinct and 
reasonably stable inside a query plan. Does the hash value 
guarantee that?
We have discussed more than once how queryId should be generated. That's 
why I think the plan_id generation logic should be implemented inside an 
extension, not in the core.
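
To make the join-order question above concrete, here is a minimal sketch in Python. It is not the patch's jumbling code and not any PostgreSQL API: PlanNode, flat_id and structural_id are hypothetical names chosen only for illustration. It compares an ID that ignores tree shape with one that folds in the hashes of the outer and inner subtrees, as suggested in the quoted text.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlanNode:
    """Toy plan node (hypothetical, not a PostgreSQL struct)."""
    tag: str                              # node type, e.g. "Join" or "Scan"
    rt_index: int = 0                     # range-table index for scan nodes
    outer: Optional["PlanNode"] = None    # left (outer) subtree
    inner: Optional["PlanNode"] = None    # right (inner) subtree


def _h(*parts) -> int:
    """Deterministic 64-bit hash of the given parts."""
    data = "|".join(map(str, parts)).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def flat_id(node: Optional[PlanNode]) -> int:
    """Shape-insensitive ID: XOR of per-node hashes, so subtree
    placement does not affect the result."""
    if node is None:
        return 0
    return _h(node.tag, node.rt_index) ^ flat_id(node.outer) ^ flat_id(node.inner)


def structural_id(node: Optional[PlanNode]) -> int:
    """Shape-sensitive ID: the node hash includes the outer and inner
    subtree hashes in a fixed order."""
    if node is None:
        return _h("nil")
    return _h(node.tag, node.rt_index,
              structural_id(node.outer), structural_id(node.inner))


def scan(rt_index: int) -> PlanNode:
    return PlanNode("Scan", rt_index)


def join(outer: PlanNode, inner: PlanNode) -> PlanNode:
    return PlanNode("Join", 0, outer, inner)


a, b, c = scan(1), scan(2), scan(3)
plan1 = join(a, join(b, c))   # JOIN(A, JOIN(B,C))
plan2 = join(join(b, c), a)   # JOIN(JOIN(B,C), A)

print("flat ids equal:      ", flat_id(plan1) == flat_id(plan2))
print("structural ids equal:", structural_id(plan1) == structural_id(plan2))
```

Which of the two behaviours a consumer wants is exactly the open question: a monitoring tool may be fine with either, while a planner extension would likely need the shape-sensitive variant.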


-- 
regards, Andrei Lepikhov


