On 4/6/2025 00:41, Alexander Korotkov wrote:
> On Tue, Jun 3, 2025 at 5:35 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> On 3/6/2025 16:05, Alexander Korotkov wrote:
>>> On Tue, Jun 3, 2025 at 4:53 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>>>> Additionally, as I mentioned earlier, the primary reason for choosing
>>>> MergeAppend in the regression test was a slight total cost difference
>>>> that triggered the startup cost comparison.
>>>> Could you show the query and its EXPLAIN output that are the subject of
>>>> your concern?
>>>
>>> My point is that the difference in total cost is very small. For small
>>> datasets it could even be within the fuzzy limit. However, in
>>> practice the difference in total time is as big as the difference in
>>> startup time. So, it would be good to make the total cost difference bigger.
>> For me, it seems like a continuation of the 7d8ac98 discussion. We may
>> charge a small fee for MergeAppend to adjust the balance, of course.
>> However, I think this small change requires a series of benchmarks to
>> determine how it affects the overall cost balance. Without examples it
>> is hard to say how important this issue is and whether it is worth
>> starting such work.
>
> Yes, I think it's fair to charge the MergeAppend node. We currently
> cost it similarly to Sort merge stage, but it's clearly more
> expensive. It works at the executor level, dealing with slots etc.,
> while the Sort node has a set of lower-level optimizations.
As I see it, it makes sense to charge MergeAppend for the heap operation
or, what is more logical, reduce the charge on Sort due to internal
optimisations.
Playing with both approaches, I found that either one breaks many more
tests than the current patch does. Hence, additional analysis of the
results is needed to understand how correct these changes are.
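
To make the fuzzy-limit effect discussed above concrete, here is a minimal
sketch in Python. It is not the planner code, only a simplified model of a
fuzzy comparison: the 1.01 factor, the Path tuple, the prefer() helper and
all cost numbers are assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a planner path; not a PostgreSQL structure.
Path = namedtuple("Path", ["name", "startup", "total"])

FUZZ = 1.01  # assumed fuzz factor: costs within 1% are treated as equal


def prefer(a, b, fuzz=FUZZ):
    """Return the path a simplified fuzzy comparison would keep."""
    # Total cost decides when it differs by more than the fuzz factor.
    if a.total > b.total * fuzz:
        return b
    if b.total > a.total * fuzz:
        return a
    # Totals are fuzzily equal, so fall back to startup cost.
    if a.startup > b.startup * fuzz:
        return b
    if b.startup > a.startup * fuzz:
        return a
    return a  # fuzzily equal on both counts; keep the first one


# Made-up costs: the totals differ by less than 1%.
sort_path = Path("Sort over Append", startup=100.0, total=110.0)
merge_path = Path("MergeAppend", startup=1.0, total=110.5)

# An extra (hypothetical) charge pushes MergeAppend outside the fuzz limit.
merge_charged = Path("MergeAppend", startup=1.0, total=115.0)

print(prefer(sort_path, merge_path).name)
print(prefer(sort_path, merge_charged).name)
```

With these numbers, the first comparison falls through to startup cost and
keeps MergeAppend, while the second is decided by total cost alone and keeps
the Sort path. That is why even a small extra charge can flip plans in many
tests.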
--
regards, Andrei Lepikhov