Re: Disable parallel query by default - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: Disable parallel query by default
Date
Msg-id 9dab89e2307fe449e7451c96e863b98e570867a2.camel@cybertec.at
In response to Re: Disable parallel query by default  ("Scott Mead" <scott@meads.us>)
Responses Re: Disable parallel query by default
List pgsql-hackers

On Tue, 2025-05-20 at 16:58 -0400, Scott Mead wrote:
> On Wed, May 14, 2025, at 4:06 AM, Laurenz Albe wrote:
> > On Tue, 2025-05-13 at 17:53 -0400, Scott Mead wrote:
> > > On Tue, May 13, 2025, at 5:07 PM, Greg Sabino Mullane wrote:
> > > > On Tue, May 13, 2025 at 4:37 PM Scott Mead <scott@meads.us> wrote:
> > > > > I'll open by proposing that we prevent the planner from automatically
> > > > > selecting parallel plans by default
> >
> > > > > What is the fallout?  When a high-volume, low-latency query flips to
> > > > > parallel execution on a busy system, we end up in a situation where
> > > > > the database is effectively DDOSing itself with a very high rate of
> > > > > connection establish and tear-down requests.
> >
> > You are painting a bleak picture indeed.  I get to see PostgreSQL databases
> > in trouble regularly, but I have not seen anything like what you describe.
> >
> > With an argument like that, you may as well disable nested loop joins.
> > I have seen enough cases where disabling nested loop joins, without any
> > deeper analysis, made very slow queries reasonably fast.
>
> My argument is that parallel query should not be allowed to be invoked without
> user intervention.  Yes, nestedloop can have a similar impact, but let's take
> a look at the breakdown at scale of PQ:
>
> [pgbench run that shows that parallel query is bad for throughput]

I think that your experiment is somewhat misleading.  Sure, if you
overload the machine with parallel workers, that will eventually also
harm the query response time.  But many databases out there are not
overloaded, and the shorter response time that parallel query offers
makes many users happy.

It is well known that what is beneficial for response time is detrimental
to overall throughput, and vice versa.
Parallel query clearly is a feature that is good for response time
and bad for throughput, but that is not necessarily wrong.
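
To make the trade-off concrete, here is a toy model, not a benchmark.
It assumes that the leader and the workers share the work perfectly
evenly and that each worker adds a fixed start-up overhead.  All function
names and numbers are made up for illustration and are not taken from
PostgreSQL.

```python
# Toy model of the response time / throughput trade-off.
# Every name and number here is a hypothetical assumption.

def cpu_seconds(work, workers, overhead_per_worker):
    """Total CPU time one query consumes, including worker start-up overhead."""
    return work + workers * overhead_per_worker


def idle_response_time(work, workers, overhead_per_worker):
    """Elapsed time on an otherwise idle machine, assuming the leader and
    all workers split the total CPU time evenly."""
    return cpu_seconds(work, workers, overhead_per_worker) / (workers + 1)


def saturated_throughput(cores, work, workers, overhead_per_worker):
    """Queries per second once every core is busy."""
    return cores / cpu_seconds(work, workers, overhead_per_worker)


if __name__ == "__main__":
    # Assumed: 8 cores, 1 second of work per query, 0.1 second per worker.
    for workers in (0, 2, 4):
        rt = idle_response_time(1.0, workers, 0.1)
        tp = saturated_throughput(8, 1.0, workers, 0.1)
        print(f"workers={workers} response_time={rt:.2f}s throughput={tp:.2f}/s")
```

In this model, more workers shorten the response time on an idle machine,
but every query burns more CPU in total, so fewer queries per second fit
once the machine is saturated.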

Essentially, you are arguing that the default configuration should favor
throughput over response time.

> Going back to the original commit which enabled PQ by default[1], it was
> done so that the feature would be tested during beta.  I think it's time
> that we limit the accidental impact this can have to users by disabling
> the feature by default.

I disagree.
My experience is that parallel query often improves the user experience.
Sure, there are cases where I recommend disabling it, but I think that
disabling it by default would be a move in the wrong direction.

On the other hand, I have also seen cases where bad estimates trigger
parallel query by mistake, making queries slower.  So I'd support an
effort to increase the default value for "parallel_setup_cost".
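
A rough sketch of why a higher "parallel_setup_cost" would help here.
This is a deliberately simplified model and not the planner's actual
costing code: it assumes the run cost divides evenly among the leader and
the workers, and the function names are hypothetical.  The default values
of 1000 for "parallel_setup_cost" and 0.1 for "parallel_tuple_cost" are
the documented ones.

```python
# Simplified model of the serial-versus-parallel plan choice.
# NOT the real planner logic; the even split of the cost is an assumption.

def parallel_cost(serial_cost, rows_out, workers,
                  parallel_setup_cost=1000.0, parallel_tuple_cost=0.1):
    """Estimated cost of the parallel plan in this toy model."""
    shared_run_cost = serial_cost / (workers + 1)   # leader participates too
    transfer_cost = parallel_tuple_cost * rows_out  # rows passed to the leader
    return parallel_setup_cost + shared_run_cost + transfer_cost


def picks_parallel(serial_cost, rows_out, workers, **settings):
    """True if the parallel plan looks cheaper than the serial one."""
    return parallel_cost(serial_cost, rows_out, workers, **settings) < serial_cost


if __name__ == "__main__":
    # Assumed scenario: a bad estimate inflates the serial cost to 2000.
    print(picks_parallel(2000.0, 1, 2))
    print(picks_parallel(2000.0, 1, 2, parallel_setup_cost=10000.0))
```

With the default setup cost, a modest overestimate is enough to tip the
choice toward the parallel plan.  With a higher setup cost, the estimate
has to be much larger before that happens, so small misestimates no
longer trigger parallel query by mistake.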

Yours,
Laurenz Albe
