Thread: es_query_dsa is broken
Hi hackers,

While reviewing commit c6755e23 I realised that es_query_dsa is
broken.  It might have made some kind of sense as a name and a concept
in an earlier version of the proposal to add a DSA area for parallel
query's use, when the DSA area was going to be independent of parallel
query DSM segments and could have been used for the whole query.  But
as it stands, each Gather [Merge] node has its own DSA area in its own
DSM segment, and that needs to be the one available to the nodes of
the child plan when executed in the leader process, and it needs to be
not available to any other nodes in the tree.  It's OK for the workers
to have just one es_query_dsa set up at the beginning and used for the
whole lifetime of the executor, but it's not OK for the leader to
install it and forget about it as it does now.

The attached draft patch (for discussion only) shows one way to fix
that: only install it for the duration of Gather[Merge]'s
ExecProcNode(child), instead of doing it in ExecInitParallelPlan().
Likewise for the [Re]InitializeDSM invocations.  This type of
install-call-clear coding isn't ideal, but unfortunately the area
doesn't exist until the first time through ExecGather runs, and we
don't seem to have another suitable scope (ie some object easily
accessible to all children of a given Gather) to put it in right now.

Better ideas?

--
Thomas Munro
http://www.enterprisedb.com
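For readers without the attachment, the install-call-clear shape being
described looks roughly like this.  This is a sketch only: the wrapper
function name and surrounding structure are illustrative, while the
es_query_dsa install/clear lines correspond to the hunk quoted later in
the thread.

/*
 * Illustrative sketch (not the attached patch): expose this Gather
 * node's own DSA area, which lives in its own DSM segment and is
 * reachable via gatherstate->pei->area, only while the child plan is
 * actually being executed in the leader.
 */
static TupleTableSlot *
gather_getnext_from_local_plan(GatherState *gatherstate)
{
	PlanState  *outerPlan = outerPlanState(gatherstate);
	EState	   *estate = gatherstate->ps.state;
	TupleTableSlot *outerTupleSlot;

	/* Install our DSA area while executing the child plan... */
	estate->es_query_dsa = gatherstate->pei->area;
	outerTupleSlot = ExecProcNode(outerPlan);
	/* ...and clear it so no other node in the tree can see it. */
	estate->es_query_dsa = NULL;

	return outerTupleSlot;
}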
On Wed, Nov 29, 2017 at 7:30 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> While reviewing commit c6755e23 I realised that es_query_dsa is
> broken.  It might have made some kind of sense as a name and a concept
> in an earlier version of the proposal to add a DSA area for parallel
> query's use, when the DSA area was going to be independent of parallel
> query DSM segments and could have been used for the whole query.  But
> as it stands, each Gather [Merge] node has its own DSA area in its own
> DSM segment, and that needs to be the one available to the nodes of
> the child plan when executed in the leader process and it needs to be
> not available to any other nodes in the tree.  It's OK for the workers
> to have just one es_query_dsa set up at the beginning and used for the
> whole lifetime of the executor, but it's not OK for the leader to
> install it and forget about it as it does now.

Ugh.

> Better ideas?

How about this:

1. Remove es_query_dsa altogether.
2. Add a dsa_area * to ExecParallelInitializeDSMContext.
3. In ExecParallelInitializeDSM, pass the dsa_area * as a separate
   argument to the per-node-type function.
4. If the per-node-type function cares about the dsa_area *, it is
   responsible for saving a pointer to it in the PlanState node.
5. Also pass it down via the ParallelWorkerContext.

In v10 we might need to go with a solution like what you've sketched
here, because Tom will complain about breaking binary compatibility
with EState (and maybe other PlanState nodes) in a released branch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
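To make the shape of steps 2-4 concrete, a hedged sketch follows.  The
extra dsa_area parameter, the "Foo" node type, and its fields are
hypothetical illustrations of this proposal, not existing PostgreSQL
APIs, and existing fields of ExecParallelInitializeDSMContext are
elided.

#include "postgres.h"

#include "access/parallel.h"
#include "nodes/execnodes.h"
#include "utils/dsa.h"

/* Step 2: the initialization context would carry the area. */
typedef struct ExecParallelInitializeDSMContext
{
	ParallelContext *pcxt;
	dsa_area   *area;			/* new: handed to per-node callbacks */
	int			nnodes;
	/* other existing fields elided */
} ExecParallelInitializeDSMContext;

/*
 * Steps 3-4: each per-node-type function receives the dsa_area *
 * explicitly and, if it cares, keeps its own pointer in its PlanState
 * node instead of reaching for estate->es_query_dsa.  "Foo" stands in
 * for any parallel-aware node type.
 */
typedef struct FooState
{
	PlanState	ps;
	dsa_area   *area;			/* hypothetical node-private pointer */
	dsa_pointer shared_state;	/* hypothetical per-node shared state */
} FooState;

static void
ExecFooInitializeDSM(FooState *node, ParallelContext *pcxt, dsa_area *area)
{
	node->area = area;

	/* Allocate whatever shared state this node needs in the DSA area. */
	node->shared_state = dsa_allocate(area, sizeof(int));
}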
On Thu, Nov 30, 2017 at 4:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Better ideas?
>
> How about this:
>
> 1. Remove es_query_dsa altogether.
> 2. Add a dsa_area * to ExecParallelInitializeDSMContext.
> 3. In ExecParallelInitializeDSM, pass the dsa_area * as a separate
> argument to the per-node-type function.
> 4. If the per-node-type function cares about the dsa_area *, it is
> responsible for saving a pointer to it in the PlanState node.
> 5. Also pass it down via the ParallelWorkerContext.
>
> In v10 we might need to go with a solution like what you've sketched
> here, because Tom will complain about breaking binary compatibility
> with EState (and maybe other PlanState nodes) in a released branch.

I will post both versions.  I've been stuck for a while now trying to
come up with a query that actually breaks, so I can show that it's
fixed...

--
Thomas Munro
http://www.enterprisedb.com
On Thu, Nov 30, 2017 at 11:19 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Nov 30, 2017 at 4:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> Better ideas?
>>
>> How about this:
>>
>> 1. Remove es_query_dsa altogether.
>> 2. Add a dsa_area * to ExecParallelInitializeDSMContext.
>> 3. In ExecParallelInitializeDSM, pass the dsa_area * as a separate
>> argument to the per-node-type function.
>> 4. If the per-node-type function cares about the dsa_area *, it is
>> responsible for saving a pointer to it in the PlanState node.
>> 5. Also pass it down via the ParallelWorkerContext.
>>
>> In v10 we might need to go with a solution like what you've sketched
>> here, because Tom will complain about breaking binary compatibility
>> with EState (and maybe other PlanState nodes) in a released branch.

Here's a pair of versions of that patch tidied up a bit, for
REL_10_STABLE and master.  Unfortunately I've been unable to come up
with a reproducer that shows misbehaviour (something like what was
reported on -bugs[1] which appears to be a case of calling
dsa_get_address(wrong_area, some_dsa_pointer)).

I don't yet have a patch to remove es_query_dsa.  Instead of adding a
dsa_area * parameter to a ton of per-node functions as you suggested,
I'm wondering if we shouldn't just pass the whole ParallelExecutorInfo
object into all ExecXXXInitializeDSM() and ExecXXXInitializeWorker()
functions so they can hold onto it if they want to.  In the worker
case it'd be only partially filled out: just area and pcxt, and pcxt
would also be only partially filled out.  Alternatively I could create
a new struct ParallelExecutorWorkerInfo which has just the bits needed
for workers (that'd replace ParallelWorkerContext, which I now realise
was put in the wrong place -- it doesn't belong in access/parallel.h,
it belongs in an executor header with an executor-sounding name).

I'd like to get a couple of other things done before I come back to this.

[1] https://www.postgresql.org/message-id/CAEepm=1V-ZjAAyG-xj6s7ESXFhzLsRBpyX=BLz-w2DS=ObNu4A@mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com
On Tue, Dec 5, 2017 at 6:45 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Nov 30, 2017 at 11:19 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Thu, Nov 30, 2017 at 4:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> In v10 we might need to go with a solution like what you've sketched
>>> here, because Tom will complain about breaking binary compatibility
>>> with EState (and maybe other PlanState nodes) in a released branch.
>
> Here's a pair of versions of that patch tidied up a bit, for
> REL_10_STABLE and master.  Unfortunately I've been unable to come up
> with a reproducer that shows misbehaviour (something like what was
> reported on -bugs[1] which appears to be a case of calling
> dsa_get_address(wrong_area, some_dsa_pointer)).
>

+ EState *estate = gatherstate->ps.state;
+
+ /* Install our DSA area while executing the plan. */
+ estate->es_query_dsa = gatherstate->pei->area;
  outerTupleSlot = ExecProcNode(outerPlan);
+ estate->es_query_dsa = NULL;

Won't the above coding pattern create a problem, if ExecProcNode
throws an error and outer block catches it and continues execution
(consider the case of execution inside PL blocks)?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Dec 5, 2017 at 7:24 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> + EState *estate = gatherstate->ps.state;
> +
> + /* Install our DSA area while executing the plan. */
> + estate->es_query_dsa = gatherstate->pei->area;
>   outerTupleSlot = ExecProcNode(outerPlan);
> + estate->es_query_dsa = NULL;
>
> Won't the above coding pattern create a problem, if ExecProcNode
> throws an error and outer block catches it and continues execution
> (consider the case of execution inside PL blocks)?

I don't see what the problem is.  The query that got aborted by the
error wouldn't be sharing an EState with one that didn't.  Control
would have to return to someplace inside the same ExecProcNode and it
would have to return normally.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Dec 6, 2017 at 1:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Dec 5, 2017 at 7:24 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> + EState *estate = gatherstate->ps.state;
>> +
>> + /* Install our DSA area while executing the plan. */
>> + estate->es_query_dsa = gatherstate->pei->area;
>>   outerTupleSlot = ExecProcNode(outerPlan);
>> + estate->es_query_dsa = NULL;
>>
>> Won't the above coding pattern create a problem, if ExecProcNode
>> throws an error and outer block catches it and continues execution
>> (consider the case of execution inside PL blocks)?
>
> I don't see what the problem is.  The query that got aborted by the
> error wouldn't be sharing an EState with one that didn't.

That's right.  Ignore my comment, I got confused.  Other than that, I
don't see any problem with the code as such apart from that it looks
slightly hacky.  I think Thomas or someone needs to develop a patch on
the lines you have mentioned or what Thomas was trying to describe in
his email and see how it comes out.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Wed, Dec 6, 2017 at 10:16 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Dec 6, 2017 at 1:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Dec 5, 2017 at 7:24 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> + EState *estate = gatherstate->ps.state;
>>> +
>>> + /* Install our DSA area while executing the plan. */
>>> + estate->es_query_dsa = gatherstate->pei->area;
>>>   outerTupleSlot = ExecProcNode(outerPlan);
>>> + estate->es_query_dsa = NULL;
>>>
>>> Won't the above coding pattern create a problem, if ExecProcNode
>>> throws an error and outer block catches it and continues execution
>>> (consider the case of execution inside PL blocks)?
>>
>> I don't see what the problem is.  The query that got aborted by the
>> error wouldn't be sharing an EState with one that didn't.
>
> That's right.  Ignore my comment, I got confused.  Other than that, I
> don't see any problem with the code as such apart from that it looks
> slightly hacky.  I think Thomas or someone needs to develop a patch on
> the lines you have mentioned or what Thomas was trying to describe in
> his email and see how it comes out.

Yeah, it is a bit hacky, but I can't see another way to fix it without
changing released APIs, and it's only for one release and will
certainly be unhackified in v11.

For v11 I think we need to decide between:

1.  Removing es_query_dsa and injecting the right context into the
executor tree as discussed.

2.  Another idea mentioned by Robert in an off-list chat: We could
consolidate all DSM segments in a multi-gather plan into one.  See the
nearby thread where someone had over a hundred Gather nodes and had to
crank up max_connections to reserve enough DSM slots.  Of course,
optimising for that case doesn't make too much sense (I suspect
multi-gather plans will become less likely with Parallel Append and
Parallel Hash in the picture anyway), but it would reduce a bunch of
duplicated work we do when it happens, as well as scarce slot
consumption.  If we did that, then all of a sudden es_query_dsa would
make sense again (ie it'd be whole-query scoped), and we could revert
that hacky change.

Or we could do both things anyway.

--
Thomas Munro
http://www.enterprisedb.com
On Thu, Dec 7, 2017 at 12:51 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Wed, Dec 6, 2017 at 10:16 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> On Wed, Dec 6, 2017 at 1:14 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Tue, Dec 5, 2017 at 7:24 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>> + EState *estate = gatherstate->ps.state;
>>>> +
>>>> + /* Install our DSA area while executing the plan. */
>>>> + estate->es_query_dsa = gatherstate->pei->area;
>>>>   outerTupleSlot = ExecProcNode(outerPlan);
>>>> + estate->es_query_dsa = NULL;
>>>>
>>>> Won't the above coding pattern create a problem, if ExecProcNode
>>>> throws an error and outer block catches it and continues execution
>>>> (consider the case of execution inside PL blocks)?
>>>
>>> I don't see what the problem is.  The query that got aborted by the
>>> error wouldn't be sharing an EState with one that didn't.
>>
>> That's right.  Ignore my comment, I got confused.  Other than that, I
>> don't see any problem with the code as such apart from that it looks
>> slightly hacky.  I think Thomas or someone needs to develop a patch on
>> the lines you have mentioned or what Thomas was trying to describe in
>> his email and see how it comes out.

Andreas Seltenreich found a query[1] that suffers from this bug
(thanks!), and Amit confirmed that this patch fixes it, but further
sqlsmith fuzz testing with the patch applied also revealed that it
failed to handle the case where pei is NULL because parallelism was
inhibited.  Here's a new version to fix that.  Someone might prefer a
coding with "if" rather than the ternary operator, but I didn't have a
strong preference.

[1] https://www.postgresql.org/message-id/flat/CAEepm%3D2tfz1XNDcyU_a5ZiEaopz6%2BbBo9vQY%2BiJVLTtUVNztcQ%40mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com
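The NULL-guarded coding described here amounts to something like the
following sketch of the fix as described in the message above; it is
not necessarily character-for-character what ended up in the committed
patch.

EState *estate = gatherstate->ps.state;

/*
 * Install our DSA area while executing the plan, if we have one; pei
 * is NULL when parallelism was inhibited and no ParallelExecutorInfo
 * was set up, in which case es_query_dsa simply stays NULL.
 */
estate->es_query_dsa = gatherstate->pei ? gatherstate->pei->area : NULL;
outerTupleSlot = ExecProcNode(outerPlan);
estate->es_query_dsa = NULL;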
On Sun, Dec 17, 2017 at 7:35 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Andreas Seltenreich found a query[1] that suffers from this bug
> (thanks!), and Amit confirmed that this patch fixes it, but further
> sqlsmith fuzz testing with the patch applied also revealed that it
> failed to handle the case where pei is NULL because parallelism was
> inhibited.  Here's a new version to fix that.  Someone might prefer a
> coding with "if" rather than the ternary operator, but I didn't have a
> strong preference.

Looks OK to me.  Committed and back-patched to v10.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,

On 2017-12-07 12:51:56 +1300, Thomas Munro wrote:
> 1. Removing es_query_dsa and injecting the right context into the
> executor tree as discussed.
>
> 2. Another idea mentioned by Robert in an off-list chat: We could
> consolidate all DSM segments in a multi-gather plan into one. See the
> nearby thread where someone had over a hundred Gather nodes and had to
> crank up max_connections to reserve enough DSM slots. Of course,
> optimising for that case doesn't make too much sense (I suspect
> multi-gather plans will become less likely with Parallel Append and
> Parallel Hash in the picture anyway), but it would reduce a bunch of
> duplicated work we do when it happens as well as scarce slot
> consumption. If we did that, then all of a sudden es_query_dsa would
> make sense again (ie it'd be whole-query scoped), and we could revert
> that hacky change.
>
> Or we could do both things anyway.

This is an open item for v11:

Tidy up es_query_dsa and possibly ParallelWorkerContext?
Original commit: e13029a5ce353574516c64fd1ec9c50201e705fd (principal
author: Thomas Munro; owner: Robert Haas)
Bug fix: fd7c0fa732d97a4b4ebb58730e6244ea30d0a618
While the bug was fixed with something back-patchable, we should
consider improving this situation.  As discussed in the above-linked
thread, options might include (1) getting rid of es_query_dsa entirely
and injecting dependencies into nodes, (2) making all Gather nodes
share the same DSM segment so there truly could be a per-query DSA
segment.

Do we want to make any changes here for v11?  If not, are we ok with
just closing the entry and waiting till it bugs anybody for some
reason?

Greetings,

Andres Freund
On Thu, Apr 12, 2018 at 4:04 AM, Andres Freund <andres@anarazel.de> wrote:
> This is an open item for v11:
>
> Tidy up es_query_dsa and possibly ParallelWorkerContext?
> Original commit: e13029a5ce353574516c64fd1ec9c50201e705fd (principal
> author: Thomas Munro; owner: Robert Haas)
> Bug fix: fd7c0fa732d97a4b4ebb58730e6244ea30d0a618
> While the bug was fixed with something back-patchable, we should
> consider improving this situation.  As discussed in the above-linked
> thread, options might include (1) getting rid of es_query_dsa entirely
> and injecting dependencies into nodes, (2) making all Gather nodes
> share the same DSM segment so there truly could be a per-query DSA
> segment.
>
> Do we want to make any changes here for v11?  If not, are we ok with
> just closing the entry and waiting till it bugs anybody for some
> reason?

I think we should probably do both of the things listed above, but
given where we are and given that it's refactoring work and not a
stop-ship issue, I propose to do that in the v12 cycle.  I'll go
remove that from the open items if no one objects soon.

--
Thomas Munro
http://www.enterprisedb.com