Re: plan shape work - Mailing list pgsql-hackers

From Robert Haas
Subject Re: plan shape work
Msg-id CA+Tgmob2PHK9Bbi6TettDiD7ju03RHK_5VxOUsSmPPx7iDb_+g@mail.gmail.com
In response to Re: plan shape work  (Richard Guo <guofenglinux@gmail.com>)
List pgsql-hackers
First of all, as an administrative note, since both you and Alex seem
to like 0001 and 0002 and no suggestions for improvement have been
offered, I plan to commit those soon unless there are objections or
additional review comments. I will likely do the same for 0003 as
well, pending the results of the current conversation, but maybe not
quite as quickly. I believe that 0004 still needs more review, and its
effects will be more user-visible than 0001-0003, so I don't plan to
move forward with that immediately, but I invite review comments.

On Mon, Sep 8, 2025 at 10:22 PM Richard Guo <guofenglinux@gmail.com> wrote:
> One idea (not fully thought through) is that we record the calculated
> outerjoin_relids for each outer join in its JoinPaths.  (We cannot
> store this in the joinrel's RelOptInfo because it varies depending on
> the join sequence we use.)  And then we could use the recorded
> outerjoin_relids for the assertion here:
>
>  outer_relids U inner_relids U joinpath->ojrelids == joinrel->relids
>
> The value of this approach, IMO, is that it could help verify the
> correctness of how we compute outer joins' outerjoin_relids, i.e. the
> logic in add_outer_joins_to_relids(), which is quite complex due to
> outer-join identity 3.  If we miscalculate the outerjoin_relids for
> one certain outer join, this assertion could catch it effectively.
>
> However, this shouldn't be a requirement for committing your patches.
> Maybe we should discuss it in a separate thread.

I'm OK with moving the conversation to a separate thread, but can you
clarify from where you believe that joinpath->ojrelids would be
populated? It seems to me that the assertion couldn't pass unless
every join path ended up with the same value of joinpath->ojrelids.
That's because, for a given joinrel, there is only one value of
joinrel->relids, and each of those RTIs is either RTE_JOIN or
non-RTE_JOIN. The non-RTE_JOIN RTIs will be found only in outer_relids
U inner_relids, and the RTE_JOIN RTIs will be found only in
joinpath->ojrelids. Therefore, it seems impossible for the assertion
to pass unless the value is the same for all join paths. If that is
correct, then I don't think we should store the value in the join
path. Instead, if we want to cross-check it, we could calculate the
value that would have been stored into joinpath->ojrelids at whatever
earlier stage we had the information available to do so, and it should
be equal to bms_intersect(joinrel->relids, root->outer_join_rels),
which I think would have to be already initialized before we can think
of building a join path.
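
To make that argument concrete, here is a toy sketch in C. It is not
planner code: relid sets are modelled as 64-bit masks rather than
Bitmapsets, the names Relids, relid_bit, expected_ojrelids and
proposed_assertion_holds are made up for illustration, and it assumes,
as above, that outer_relids and inner_relids carry only non-RTE_JOIN
RTIs while ojrelids carries only RTE_JOIN RTIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t Relids;

/* Singleton set for RT index rti (hypothetical helper; rti must be 0..63). */
static inline Relids
relid_bit(int rti)
{
    assert(rti >= 0 && rti < 64);
    return (Relids) 1 << rti;
}

/*
 * The value every join path for this joinrel would have to carry:
 * joinrel->relids restricted to the outer-join RTIs.  The bitwise AND
 * plays the role of bms_intersect() in this toy model.
 */
static inline Relids
expected_ojrelids(Relids joinrel_relids, Relids outer_join_rels)
{
    return joinrel_relids & outer_join_rels;
}

/* The proposed check: outer U inner U ojrelids == joinrel->relids. */
static inline bool
proposed_assertion_holds(Relids outer_relids, Relids inner_relids,
                         Relids path_ojrelids, Relids joinrel_relids)
{
    return (outer_relids | inner_relids | path_ojrelids) == joinrel_relids;
}
```

For example, with baserels 1, 2, 3 and outer joins 4 and 5, joining
{1,2} to {3} and joining {1} to {2,3} both satisfy the check only when
ojrelids is {4,5}, which is just joinrel->relids restricted to the
outer-join RTIs. That is why, in this model, the value cannot differ
from one join path to another.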

Please feel free to correct me if I am misunderstanding.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com


