Re: MergeAppend could consider sorting cheapest child path - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: MergeAppend could consider sorting cheapest child path
Date
Msg-id 1e9be983-467c-41bf-91d7-9565bff5825f@gmail.com
Whole thread Raw
In response to Re: MergeAppend could consider sorting cheapest child path  (Alexander Pyhalov <a.pyhalov@postgrespro.ru>)
Responses Re: MergeAppend could consider sorting cheapest child path
List pgsql-hackers
On 4/25/25 11:16, Alexander Pyhalov wrote:
> Andrei Lepikhov писал(а) 2025-04-24 16:01:
>> On 3/28/25 09:19, Alexander Pyhalov wrote:
>> In the attachment, see the patch written according to the idea. There 
>> are I introduced two new routines:
>> get_cheapest_path_for_pathkeys_ext
>> get_cheapest_fractional_path_for_pathkeys_ext
> 
> Hi. I'm a bit confused that 
Thanks for the participation!

> get_cheapest_fractional_path_for_pathkeys_ext() looks only on sorting 
> cheapest fractional path, and get_cheapest_path_for_pathkeys_ext() in 
> STARTUP_COST case looks only on sorting cheapest_startup_path.
At first, at the moment, I don't understand why we calculate the 
cheapest_startup path at all. In my opinion, after commit 6b94e7a [1, 
2], the min-fractional path totally covers the case. I began this 
discussion in [3] - maybe we need to scrutinise that issue beforehand.

Looking into the min-fractional-path + Sort, we propose a path for the 
case when extracting minor part of tuples with sorting later is cheaper 
than doing a massive job of non-selective index scan. You also may 
imagine the case of a JOIN as a subpath: non-sorted, highly selective 
parameterised NestLoop may be way more optimal than MergeJoin, which 
fits the pathkeys.

> Usually, sorted cheapest_total_path will be cheaper than sorted 
> fractional/startup path at least by startup cost (as after sorting it 
> includes total_cost of input path). But we ignore this case when 
> selecting cheapest_startup and cheapest_fractional subpaths. As result 
> selected cheapest_startup and cheapest_fractional can be not cheapest 
> for startup or selecting a fraction of rows.
I don't know what you mean by that. The cheapest_total_path is 
considered when we chose optimal cheapest_total path. The same works for 
the fractional path - get_cheapest_fractional_path gives us the most 
optimal fractional path and probes cheapest_total_path too.
As above, not sure about min-startup case for now. I can imagine 
MergeAppend over sophisticated subquery: non-sorted includes highly 
parameterised JOINs and the alternative (with pathkeys) includes 
HashJoin, drastically increasing startup cost. It is only a theory, of 
course. So, lets discover how min-startup works.

At the end, when the sorted path already estimated, we each time compare 
it with previously selected pathkeys-cheapest. So, if the sorted path is 
worse, we end up with the original path and don't lose anything.

[1] 
https://www.postgresql.org/message-id/e8f9ec90-546d-e948-acce-0525f3e92773%40enterprisedb.com
[2] 
https://www.postgresql.org/message-id/1581042da8044e71ada2d6e3a51bf7bb%40index.de
[3] 
https://www.postgresql.org/message-id/f0206ef2-6b5a-4d07-8770-cfa7cd30f685@gmail.com

-- 
regards, Andrei Lepikhov



pgsql-hackers by date:

Previous
From: Erik Rijkers
Date:
Subject: gcc 15.1 warnings - jsonb_util.c
Next
From: Nikhil Kumar Veldanda
Date:
Subject: Re: ZStandard (with dictionaries) compression support for TOAST compression