Re: postgres_fdw: wrong results with self join + enable_nestloop off - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: postgres_fdw: wrong results with self join + enable_nestloop off |
Date | |
Msg-id | CAPmGK17ov=DJ=PZ1r-sCcS+mOX3J5fCWwsW0y3Di37nZgOnvVw@mail.gmail.com Whole thread Raw |
In response to | Re: postgres_fdw: wrong results with self join + enable_nestloop off (Önder Kalacı <onderkalaci@gmail.com>) |
Responses |
Re: postgres_fdw: wrong results with self join + enable_nestloop off
|
List | pgsql-hackers |
Hi Onder, On Wed, Aug 16, 2023 at 10:58 PM Önder Kalacı <onderkalaci@gmail.com> wrote: >> Maybe we could do so by leaving to extensions the decision whether >> they replace joins with pseudoconstant clauses, but I am not sure that >> that is a good idea, because that would require the authors to modify >> and recompile their extensions to fix the issue... > I think I cannot easily follow this argument. The decision to push down the join > (or not) doesn't seem to be related to calling set_join_pathlist_hook. It seems like the > extension should decide what to do with the hook. > > That seems the generic theme of the hooks that Postgres provides. For example, the extension > is allowed to even override the whole planner/executor, and there is no condition that would > prevent it from happening. In other words, an extension can easily return wrong results with the > wrong actions taken with the hooks, and that should be responsibility of the extension, not Postgres >> I am not familiar with the Citus extension, but such pseudoconstant >> clauses are handled within the Citus extension? > As I noted earlier, Citus relies on this hook for collecting information about all the joins that Postgres > knows about, there is nothing specific to pseudoconstants. Some parts of creating the (distributed) > plan relies on the information gathered from this hook. So, if information about some of the joins > are not passed to the extension, then the decisions that the extension gives are broken (and as a result > the queries are broken). Thanks for the explanation! Maybe my explanation was not enough, so let me explain: * I think you could use the set_join_pathlist_hook hook as you like at your own responsibility, but typical use cases of the hook that are designed to support in the core system would be just add custom paths for replacing joins with scans, as described in custom-scan.sgml (this note is about set_rel_pathlist_hook, but it should also apply to set_join_pathlist_hook): Although this hook function can be used to examine, modify, or remove paths generated by the core system, a custom scan provider will typically confine itself to generating <structname>CustomPath</structname> objects and adding them to <literal>rel</literal> using <function>add_path</function>. * The problem we had with the set_join_pathlist_hook hook is that in such a typical use case, previously, if the replaced joins had any pseudoconstant clauses, the planner would produce incorrect query plans, due to the lack of support for handling such quals in createplan.c. We could fix the extensions side, as you proposed, but the cause of the issue is 100% the planner's deficiency, so it would be unreasonable to force the authors to do so, which would also go against our policy of ABI compatibility. So I fixed the core side, as in the FDW case, so that extensions created for such a typical use case, which I guess are the majority of the hook extensions, need not be modified/recompiled. I think it is unfortunate that that breaks the use case of the Citus extension, though. BTW: commit 9e9931d2b removed the restriction on the call to the hook extensions, so you might want to back-patch it. Though, I think it would be better if the hook was well implemented from the beginning. Best regards, Etsuro Fujita
pgsql-hackers by date: