Thread: [HACKERS] A GUC to prevent leader processes from running subplans?
Hi hackers, While testing parallelism work I've wanted to be able to prevent gather nodes from running the plan in the leader process, and I've heard others say the same. One way would be to add a GUC "multiplex_gather", like in the attached patch. If you set it to off, Gather and Gather Merge won't run the subplan unless they have to because no workers could be launched. I thought about adding a new value for force_parallel_mode instead, but someone mentioned they might want to do this on a production system too and force_parallel_mode is not really for end users. Better ideas? -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Tue, Oct 17, 2017 at 7:27 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > While testing parallelism work I've wanted to be able to prevent > gather nodes from running the plan in the leader process, and I've > heard others say the same. One way would be to add a GUC > "multiplex_gather", like in the attached patch. If you set it to off, > Gather and Gather Merge won't run the subplan unless they have to > because no workers could be launched. I thought about adding a new > value for force_parallel_mode instead, but someone mentioned they > might want to do this on a production system too and > force_parallel_mode is not really for end users. Better ideas? I don't think overloading force_parallel_mode is a good idea, but having some other GUC for this seems OK to me. Not sure I like multiplex_gather, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Oct 21, 2017 at 8:09 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Oct 17, 2017 at 7:27 AM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> While testing parallelism work I've wanted to be able to prevent >> gather nodes from running the plan in the leader process, and I've >> heard others say the same. One way would be to add a GUC >> "multiplex_gather", like in the attached patch. If you set it to off, >> Gather and Gather Merge won't run the subplan unless they have to >> because no workers could be launched. I thought about adding a new >> value for force_parallel_mode instead, but someone mentioned they >> might want to do this on a production system too and >> force_parallel_mode is not really for end users. Better ideas? > > I don't think overloading force_parallel_mode is a good idea, but > having some other GUC for this seems OK to me. Not sure I like > multiplex_gather, though. How about parallel_leader_participation = on|off? The attached version has it that way, and adds regression tests to exercise on, off and off-but-couldn't-start-any-workers for both kinds of gather node. I'm not sure why node->need_to_rescan is initialised by both ExecGatherInit() and ExecGather(). Only the latter's value matters, right? I've added this to the January Commitfest. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Sun, Nov 12, 2017 at 9:18 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Sat, Oct 21, 2017 at 8:09 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Oct 17, 2017 at 7:27 AM, Thomas Munro >> <thomas.munro@enterprisedb.com> wrote: >> >> I don't think overloading force_parallel_mode is a good idea, but >> having some other GUC for this seems OK to me. Not sure I like >> multiplex_gather, though. > > How about parallel_leader_participation = on|off? The attached > version has it that way, and adds regression tests to exercise on, off > and off-but-couldn't-start-any-workers for both kinds of gather node. > > I'm not sure why node->need_to_rescan is initialised by both > ExecGatherInit() and ExecGather(). Only the latter's value matters, > right? > I don't see anything like need_to_rescan in the GatherState node. Do you intend to say need_to_scan_locally? If yes, then I think whatever you said is right. > I've added this to the January Commitfest. > +1 to this idea. Do you think such an option at table level can be meaningful? We have a parallel_workers as a storage option for tables, so users might want leader to participate in parallelism only for some of the tables. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Nov 12, 2017 at 8:51 PM, Amit Kapila <amit.kapila16@gmail.com> wrote: > On Sun, Nov 12, 2017 at 9:18 AM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> How about parallel_leader_participation = on|off? The attached >> version has it that way, and adds regression tests to exercise on, off >> and off-but-couldn't-start-any-workers for both kinds of gather node. >> >> I'm not sure why node->need_to_rescan is initialised by both >> ExecGatherInit() and ExecGather(). Only the latter's value matters, >> right? >> > > I don't see anything like need_to_rescan in the GatherState node. Do > you intend to say need_to_scan_locally? If yes, then I think whatever > you said is right. Right, that's what I meant to write. Thanks. >> I've added this to the January Commitfest. >> > > +1 to this idea. Do you think such an option at table level can be > meaningful? We have a parallel_workers as a storage option for > tables, so users might want leader to participate in parallelism only > for some of the tables. I'm not sure. I think the reason for turning it off (other than developer testing) would be that the leader is getting tied up doing work that takes a long time (sorting, hashing, aggregating) and that's causing the workers to be blocked because their output queue is full. I think that type of behaviour comes from certain plan types, and it probably wouldn't make sense to associate this behaviour with the tables you're scanning. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Nov 11, 2017 at 10:48 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > How about parallel_leader_participation = on|off? The attached > version has it that way, and adds regression tests to exercise on, off > and off-but-couldn't-start-any-workers for both kinds of gather node. This looks mostly fine to me, but I think the documentation is strange: + Allows the leader process to execute the query plan under + <literal>Gather</literal> and <literal>Gather Merge</literal> nodes + instead of waiting for worker processes. The default is + <literal>on</literal>. Setting this value to <literal>on</literal> + can cause the leader process to begin producing tuples sooner instead + of waiting for worker processes to start up, but might in some cases + also cause workers to become blocked waiting for the leader to clear + tuple queues. This documentation would seem exactly right to me if the default value were off, but as it is it seems kinda backwards, because it's explaining why you might want to set the value to what is anyway the default. Also, it's always possible for the workers to become blocked waiting for the leader, regardless of how this GUC is set. It becomes more likely when this turned off, but that's not quite the same thing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 14, 2017 at 10:30 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Sat, Nov 11, 2017 at 10:48 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> How about parallel_leader_participation = on|off? The attached >> version has it that way, and adds regression tests to exercise on, off >> and off-but-couldn't-start-any-workers for both kinds of gather node. > > This looks mostly fine to me, but I think the documentation is strange: > > + Allows the leader process to execute the query plan under > + <literal>Gather</literal> and <literal>Gather Merge</literal> nodes > + instead of waiting for worker processes. The default is > + <literal>on</literal>. Setting this value to <literal>on</literal> > + can cause the leader process to begin producing tuples sooner instead > + of waiting for worker processes to start up, but might in some cases > + also cause workers to become blocked waiting for the leader to clear > + tuple queues. > > This documentation would seem exactly right to me if the default value > were off, but as it is it seems kinda backwards, because it's > explaining why you might want to set the value to what is anyway the > default. Also, it's always possible for the workers to become blocked > waiting for the leader, regardless of how this GUC is set. It becomes > more likely when this turned off, but that's not quite the same thing. Thanks. You're right. Rebased and updated to describe what "off" does. -- Thomas Munro http://www.enterprisedb.com
Attachment
On Tue, Nov 14, 2017 at 12:28 AM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > Thanks. You're right. Rebased and updated to describe what "off" does. Committed. I noticed that you didn't add the new GUC to postgresql.conf.sample, so I did that. But then I thought it didn't really belong in the section you put it, so I moved it to RESOURCES_ASYNCHRONOUS. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 11/15/2017 08:28 AM, Robert Haas wrote: > On Tue, Nov 14, 2017 at 12:28 AM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> Thanks. You're right. Rebased and updated to describe what "off" does. > > Committed. I noticed that you didn't add the new GUC to > postgresql.conf.sample, so I did that. But then I thought it didn't > really belong in the section you put it, so I moved it to > RESOURCES_ASYNCHRONOUS. > Small typo. Best regards, Jesper
Attachment
On Wed, Nov 15, 2017 at 8:35 AM, Jesper Pedersen <jesper.pedersen@redhat.com> wrote: > Small typo. Thanks. That just proves no task is so simple that I can't foul it up. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Nov 15, 2017 at 6:58 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Nov 14, 2017 at 12:28 AM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> Thanks. You're right. Rebased and updated to describe what "off" does. > > Committed. > Thanks, reflected the same in CF entry (https://commitfest.postgresql.org/16/1375/). -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com