Re: Oddity in handling of cached plans for FDW queries - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Oddity in handling of cached plans for FDW queries |
Date | |
Msg-id | CAFjFpRd9rx1kc-Dvm6ogeNKF8C6XxFA-QAH89T08pFHTKCOaFw@mail.gmail.com Whole thread Raw |
In response to | Re: Oddity in handling of cached plans for FDW queries (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>) |
Responses |
Re: Oddity in handling of cached plans for FDW queries
|
List | pgsql-hackers |
On Thu, Jul 14, 2016 at 5:10 PM, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
-- On 2016/07/13 18:00, Ashutosh Bapat wrote:To fix the first, I'd like to propose (1) replacing the existing
has_foreign_join flag in the CachedPlan data structure with a new
flag, say uses_user_mapping, that indicates whether a cached plan
uses any user mapping regardless of whether the cached plan has
foreign joins or not (likewise, replace the hasForeignJoin flag in
PlannedStmt), and (2) invalidating the cached plan if the
uses_user_mapping flag is true.That way we will have plan cache invalidations even when simple foreign
tables scans (not join) are involved, which means all the plans
involving any reference to a foreign table with valid user mapping
associated with it. That can be a huge cost as compared to the current
solution where sub-optimal plan will be used only when a user mapping is
changed while a statement has been prepared. That's a rare scenario and
somebody can work around that by preparing the statement again.
I'm not sure that's a good workaround. ISTM that people often don't pay much attention to plan changes, so they would execute the inefficient plan without realizing the plan change, it would take long, they would start thinking what's happening there, and finally, they would find that the reason for that is due to the plan change. I think we should prevent such a trouble.
The case you described is other way round. When the statement was prepared the join was not pushed down. A change in user mapping afterwards may allow the join to be pushed down. But right now it won't be, so a user wouldn't see any difference, right?
IIRC, we
had discussed this situation when implementing the cache invalidation
logic.
I didn't know that. Sorry for speaking up late.But there's no workaround for your solution.
As you said, this is a rare scenario; in many cases, people define user mappings properly beforehand. So, just invalidating all relevant plans on the syscache invalidation events would be fine. (I thought one possible improvement might be to track exactly the dependencies of plans on user mappings and invalidate just those plans that depend on the user mapping being modified the same way for user-defined functions, but I'm not sure it's worth complicating the code.)
Exactly, for a rare scenario, should we be penalizing large number of plans or just continue to use a previously prepared plan when an optimal plan has become available because of changed condition. I would choose second over the first as it doesn't make things worse than they are.
I don't think the above change is sufficient to fix the second. The
root reason for that is that since currently, we allow the user
mapping OID (rel->umid) to be InvalidOid in two cases: (1) user
mappings mean something to the FDW but it can't get any user mapping
at planning time and (2) user mappings are meaningless to the FDW,
we cannot distinguish these two cases.The way to differentiate between these two is to look at the serverid.
If server id is invalid it's the case 1,
Really? Maybe my explanation was not good, but consider a foreign join plan created through GetForeignJoinPaths, by an FDW to which user mappings are meaningless, like file_fdw. In that plan, the corresponding server id would be valid, not invalid. No?
While planning join, we invalidate user mapping id if the user mappings do not match (see build_join_rel()). In such case, the joinrel will have invalid user mapping (and invalid server id) even though user mapping is associated with the joining tables. The way to differentiate between this case and the case when an FDW doesn't need user mappings and the join is shippable is through valid serverid (see build_join_rel()). Non-availability of a user mapping for a table whose FDW requires user mappings should end up in an error (in FDW code), so we shouldn't add complexity for that case.
So, I'd like to introduce a new callback routine to specify that
user mappings mean something to the FDW as proposed by Tom [2], and
use that to reject the former case, which allows us to set the above
uses_user_mapping flag appropriately, ie, set the flag to true only
if user mapping changes require forcing a replan.This routine is meaningless unless the core (or FDW) does not allow a
user mapping to be created for such FDWs. Without that, core code would
get confused as to what it should do when it sees a user mapping for an
FDW which says user mappings are meaningless.
The core wouldn't care about such a user mapping for the FDW; the core would just ignore the user mapping. No?
See build_join_rel(). I would like to see, how do you change the conditions below in that function with your proposal.
468 /*
469 * Set up foreign-join fields if outer and inner relation are foreign
470 * tables (or joins) belonging to the same server and using the same user
471 * mapping.
472 *
473 * Otherwise those fields are left invalid, so FDW API will not be called
474 * for the join relation.
475 *
476 * For FDWs like file_fdw, which ignore user mapping, the user mapping id
477 * associated with the joining relation may be invalid. A valid serverid
478 * distinguishes between a pushed down join with no user mapping and a
479 * join which can not be pushed down because of user mapping mismatch.
480 */
481 if (OidIsValid(outer_rel->serverid) &&
482 inner_rel->serverid == outer_rel->serverid &&
483 inner_rel->umid == outer_rel->umid)
484 {
485 joinrel->serverid = outer_rel->serverid;
486 joinrel->umid = outer_rel->umid;
487 joinrel->fdwroutine = outer_rel->fdwroutine;
488 }
468 /*
469 * Set up foreign-join fields if outer and inner relation are foreign
470 * tables (or joins) belonging to the same server and using the same user
471 * mapping.
472 *
473 * Otherwise those fields are left invalid, so FDW API will not be called
474 * for the join relation.
475 *
476 * For FDWs like file_fdw, which ignore user mapping, the user mapping id
477 * associated with the joining relation may be invalid. A valid serverid
478 * distinguishes between a pushed down join with no user mapping and a
479 * join which can not be pushed down because of user mapping mismatch.
480 */
481 if (OidIsValid(outer_rel->serverid) &&
482 inner_rel->serverid == outer_rel->serverid &&
483 inner_rel->umid == outer_rel->umid)
484 {
485 joinrel->serverid = outer_rel->serverid;
486 joinrel->umid = outer_rel->umid;
487 joinrel->fdwroutine = outer_rel->fdwroutine;
488 }
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
pgsql-hackers by date: