On 3/28/25 09:19, Alexander Pyhalov wrote:
> Andy Fan писал(а) 2024-10-17 03:33:
> I've updated patch. One more interesting case which we found - when
> fractional path is selected, it still can be more expensive than sorted
> cheapest total path (as we look only on indexes whith necessary
> pathkeys, not on indexes which allow efficiently fetch data).
> So far couldn't find artificial example, but we've seen inadequate index
> selection due to this issue - instead of using index suited for filters
> in where, index, suitable for sorting was selected as one having the
> cheapest fractional cost.
I think it is necessary to generalise the approach a little.
Each MergeAppend subpath candidate that fits pathkeys should be compared
to the overall-optimal path + explicit Sort node. Let's label this
two-node composition as base_path. There are three cases exist:
startup-optimal, total-optimal and fractional-optimal.
In the startup case, base_path should use cheapest_startup_path, the
total-optimal case - cheapest_total_path and for a fractional case, we
may employ the get_cheapest_fractional_path routine to detect proper
base_path.
It may provide a more effective plan either in full, fractional and
partial scan cases:
1. The Sort node may be pushed down to subpaths under a parallel or
async Append.
2. When a minor set of subpaths doesn't have a proper index, and it is
profitable to sort them instead of switching to plain Append.
In general, analysing the regression tests changed by this code, I see
that the cost model prefers a series of small sortings instead of a
single massive one. May be it will show some profit for execution time.
I am not afraid of any palpable planning overhead here because we just
do cheap cost estimation and comparison operations that don't need
additional memory allocations. The caller is responsible for building a
proper Sort node if this method is chosen as optimal.
In the attachment, see the patch written according to the idea. There
are I introduced two new routines:
get_cheapest_path_for_pathkeys_ext
get_cheapest_fractional_path_for_pathkeys_ext
I have designed the code that way to reduce duplicated code in the
generate_orderedappend_paths routine. But the main point is that I
envision these new routines may be reused elsewhere.
This feature looks сlose to the one we discussed before [1]. So, let me
CC the people from the previous conversation to the discussion.
[1]
https://www.postgresql.org/message-id/flat/CAN-LCVPxnWB39CUBTgOQ9O7Dd8DrA_tpT1EY3LNVnUuvAX1NjA%40mail.gmail.com
--
regards, Andrei Lepikhov