Re: BUG #17355: Server crashes on ExecReScanForeignScan in postgres_fdw when accessing foreign partition - Mailing list pgsql-bugs
From | Amit Langote |
---|---|
Subject | Re: BUG #17355: Server crashes on ExecReScanForeignScan in postgres_fdw when accessing foreign partition |
Date | |
Msg-id | CA+HiwqFt2oAc9b12ba32H1_TULpShmHz26V1W3TiAFzpeBf1jA@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #17355: Server crashes on ExecReScanForeignScan in postgres_fdw when accessing foreign partition (Etsuro Fujita <etsuro.fujita@gmail.com>) |
Responses |
Re: BUG #17355: Server crashes on ExecReScanForeignScan in postgres_fdw when accessing foreign partition
|
List | pgsql-bugs |
On Tue, Feb 1, 2022 at 5:58 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote: > Hi Alexander, > > On Tue, Feb 1, 2022 at 1:00 PM Alexander Lakhin <exclusion@gmail.com> wrote: > > 30.01.2022 13:59, Etsuro Fujita wrote: > > >> (Besides that I've observed an infinite waiting for ShareLock with > > >> step "s1i" { INSERT INTO pt VALUES (2000); } > > >> This doesn't happen with a regular (not foreign) table.) > > > You mean the lock wait occurs on the remote side, not on the local > > > side? If so, I think that that is expected behavior because a write > > > conflict occurs on the remote side in that case. Maybe I don’t fully > > > understand your words, so could you elaborate a bit more on your > > > observation? > > Yes, you are right, that was expected behavior. I didn't realize that > > the isolationtester itself resolves blocking when the target table is local. > > The isolationtester controls the step execution using > > pg_isolation_test_session_is_blocked(), but when the target table is > > foreign, it can not determine correctly whether one step blocking other > > (cause it checks local, not remote session pids) and just hangs. > > Ok, thanks for the explanation! > > Attached is an updated patch. I tweaked a comment a little bit and > added the commit message. I didn’t include your test because in my > understanding we don’t add such a test into the postgres_fdw > regression test. I’ll commit the patch if there are no objections > from you (or anyone else). Thank you for working on this and sorry about failing to reply to this sooner. Your proposed fix makes sense. I was trying to understand in what way commit 86dc90056's is related to this, because you mention it here in the thread and also in the commit message. I see that c3928b467, also mentioned in your commit message, fixed a related bug [1], which in turn refers to 1375422c as the cause [2]. c3928b467 fixed things so that BeginDirectModify() won't get called if EPQ is active, so fdw_state won't get set. Given that, it also fixed ForeignNext() to not enter the FDW to perform a non-SELECT operation in the EPQ-active case, which made sure its caller Exec[Foreign]Scan() won't enter the FDW in that case. Though it didn't also fix ExecReScan[ForeignScan](), so this bug report. Your patch fixes both ExecForeignScan() and ExecReScanForeignScan() to not enter the FDW during EvalPlanQual() in non-SELECT cases. Before commit 86dc90056, EPQState.plan would be set to the 1st child subplan, which if it happened to be a direct-modify ForeignScan node, this bug would manifest. After that commit though, a direct-modify child subplan would *always* be present in EPQState.plan by way of its parent Append node or some parent node thereof being set as EPQState.plan, so the bug would now always manifest. So my understanding is that the commit 86dc90056 only changed *when* the bug manifests, which is now "always" if direct-modify children are present. Is that understanding correct? BTW, isn't the following code in ForeignNext() added by c3928b467 non-reachable after your patch: /* * direct modifications cannot be re-evaluated, so shouldn't get here * during EvalPlanQual processing */ if (estate->es_epq_active != NULL) elog(ERROR, "cannot re-evaluate a Foreign Update or Delete during EvalPlanQual"); Should that be converted to an Assert(estate->es_epq_active == NULL)? -- Amit Langote EDB: http://www.enterprisedb.com [1] EvalPlanQual() attempting to initialize a direct-modify ForeignScan node without the necessary ResultRelInfo having been set up. [2] As of 1375422c, ResultRelInfos are initialized in ExecInitModifyTable() which won't be invoked during EvalPlanQualStart(), so EvalPlanQual() can no longer rely on accessing a direct-modify ForeignScan node's ResultRelInfo.
pgsql-bugs by date: