Home > mailing lists

Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [HACKERS] Parallel Append implementation
Date	October 2, 2017 15:51:35
Msg-id	CA+TgmobG-Q=-TvdXxSxwXr6nOc-fdTDR57mYrsNKneq2H+HH0g@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] Parallel Append implementation (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: [HACKERS] Parallel Append implementation
List	pgsql-hackers

Tree view

On Sun, Oct 1, 2017 at 9:55 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Isn't it for both?  I mean it is about comparing the non-partial paths
> for child relations of the same relation and also when there are
> different relations involved as in Union All kind of query.  In any
> case, the point I was trying to say is that generally non-partial
> relations will have relatively smaller scan size, so probably should
> take lesser time to complete.

I don't think that's a valid inference.  It's true that a relation
could fail to have a partial path because it's small, but that's only
one reason among very many.  The optimal index type could be one that
doesn't support parallel index scans, for example.

> Sure, I think it is quite good if we can achieve that but it seems to
> me that we will not be able to achieve that in all scenario's with the
> patch and rather I think in some situations it can result in leader
> ended up picking the costly plan (in case when there are all partial
> plans or mix of partial and non-partial plans).  Now, we are ignoring
> such cases based on the assumption that other workers might help to
> complete master backend.  I think it is quite possible that the worker
> backends picks up some plans which emit rows greater than tuple queue
> size and they instead wait on the master backend which itself is busy
> in completing its plan.  So master backend will end up taking too much
> time.  If we want to go with a strategy of master (leader) backend and
> workers taking a different strategy to pick paths to work on, then it
> might be better if we should try to ensure that master backend always
> starts from the place which has cheapest plans irrespective of whether
> the path is partial or non-partial.

I agree that it's complicated to get this right in all cases; I'm
realizing that's probably an unattainable ideal.

However, I don't think ignoring the distinction between partial and
non-partial plans is an improvement, because the argument that other
workers may be able to help the leader is a correct one.  You are
correct in saying that the workers might fill up their tuple queues
while the leader is working, but once the leader returns one tuple it
will switch to reading from the queues.  Every other tuple can be
returned by some worker.  On the other hand, if the leader picks a
non-partial plan, it must run that plan through to completion.

Imagine (a) a non-partial path with a cost of 1000 returning 100
tuples and (b) a partial path with a cost of 10,000 returning 1,000
tuples.  No matter which path the leader picks, it has about 10 units
of work to do to return 1 tuple.  However, if it picks the first path,
it is committed to doing 990 more units of work later, regardless of
whether the workers have filled the tuple queues or not.  If it picks
the second path, it again has about 10 units of work to do to return 1
tuple, but it hasn't committed to doing all the rest of the work of
that path.  So that's better.

Now it's hard to get all of the cases right.  If the partial path in
the previous example had a cost of 1 crore then even returning 1 tuple
is more costly than running the whole non-partial path.  When you
introduce partition-wise join and parallel hash, there are even more
problems.  Now the partial path might have a large startup cost, so
returning the first tuple may be very expensive, and that work may
help other workers (if this is a parallel hash) or it may not (if this
is a non-parallel hash).  I don't know that we can get all of these
cases right, or that we should try.  However, I still think that
picking partial paths preferentially is sensible -- we don't know all
the details, but we do know that they're typically going to at least
try to divide up the work in a fine-grained fashion, while a
non-partial path, once started, the leader must run it to completion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

From: Maksim Milyutin
Date: 02 October 2017, 15:43:45
Subject: Re: [HACKERS][BUG] Cache invalidation for queries that contains constof temporary composite type

From: Robert Haas
Date: 02 October 2017, 16:04:30
Subject: Re: [HACKERS] CLUSTER command progress monitor

Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers

Previous

Next