Thread: Incorrect calculation of path fraction value in MergeAppend

Incorrect calculation of path fraction value in MergeAppend

From
Andrei Lepikhov
Date:
Hi,

Commit 6b94e7a has added a "fractional" strategy to the merge join path 
search scope. But as I see it incorrectly calculates tuple_fraction 
value. Let's see one of the regression tests' queries:

EXPLAIN (COSTS OFF)
SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y
   USING (id) ORDER BY x.id DESC LIMIT 10;

It uses NestedLoop with parameterised inner scan as a subplan of 
MergeAppend. Overall cost is:
Limit  (cost=1.11..4.86 rows=10 width=16)

Let's reduce the LIMIT a little:

EXPLAIN (COSTS OFF)
SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y
   USING (id) ORDER BY x.id DESC LIMIT 2;

There, the query plan uses plain NestLoop node and overall cost 
significantly increases:
Limit  (cost=0.56..30.70 rows=2 width=16)

The origin problem lies in the following code line:
    path_fraction = (1.0 / root->tuple_fraction);

Building startup, total and fractional MergeAppends, the optimiser puts 
off the 'fractional' way because it has built for a 50% limit (uses 
MergeJoin with high startup cost) and chooses a startup-optimal path 
(built upon plain NestLoops) that is better than the total one.

Instead, the path_fraction value must be corrected when 
root->tuple_fraction contains the absolute value of tuples. See the 
patch in the attachment.

I report this issue into the bugs mailing list because it might cause 
some 'enigmatic' performance slowdowns I have heard about (but have no 
proofs) and may need back-patching.

-- 
regards, Andrei Lepikhov

Attachment

Re: Incorrect calculation of path fraction value in MergeAppend

From
Junwang Zhao
Date:
Hi Andrei,

On Tue, Apr 29, 2025 at 7:14 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> Hi,
>
> Commit 6b94e7a has added a "fractional" strategy to the merge join path
> search scope. But as I see it incorrectly calculates tuple_fraction
> value. Let's see one of the regression tests' queries:
>
> EXPLAIN (COSTS OFF)
> SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y
>    USING (id) ORDER BY x.id DESC LIMIT 10;
>
> It uses NestedLoop with parameterised inner scan as a subplan of
> MergeAppend. Overall cost is:
> Limit  (cost=1.11..4.86 rows=10 width=16)
>
> Let's reduce the LIMIT a little:
>
> EXPLAIN (COSTS OFF)
> SELECT x.id, y.id FROM fract_t x LEFT JOIN fract_t y
>    USING (id) ORDER BY x.id DESC LIMIT 2;
>
> There, the query plan uses plain NestLoop node and overall cost
> significantly increases:
> Limit  (cost=0.56..30.70 rows=2 width=16)
>
> The origin problem lies in the following code line:
>         path_fraction = (1.0 / root->tuple_fraction);
>
> Building startup, total and fractional MergeAppends, the optimiser puts
> off the 'fractional' way because it has built for a 50% limit (uses
> MergeJoin with high startup cost) and chooses a startup-optimal path
> (built upon plain NestLoops) that is better than the total one.
>
> Instead, the path_fraction value must be corrected when
> root->tuple_fraction contains the absolute value of tuples. See the
> patch in the attachment.
>
> I report this issue into the bugs mailing list because it might cause
> some 'enigmatic' performance slowdowns I have heard about (but have no
> proofs) and may need back-patching.
>
> --
> regards, Andrei Lepikhov

I've apply the patch and the tests passed, one small nitpick:

+ double path_fraction = root->tuple_fraction;
+
+ /* Convert absolute limit to a path fraction */
+ if (path_fraction >= 1.)
+ path_fraction /= childrel->rows;

maybe add `&& childrel->rows > 0.` to the if condition to
avoid potential crash. I'm not sure if `childrel->rows` will
be 0 but I see get_cheapest_fractional_path did this.

/* Convert absolute # of tuples to a fraction; no need to clamp to 0..1 */
if (tuple_fraction >= 1.0 && best_path->rows > 0)
tuple_fraction /= best_path->rows;

--
Regards
Junwang Zhao



Re: Incorrect calculation of path fraction value in MergeAppend

From
Junwang Zhao
Date:
On Thu, May 8, 2025 at 2:42 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
>
> On 7/5/2025 12:45, Álvaro Herrera wrote:
>  > On 2025-May-07, Andrei Lepikhov wrote:
>  >> The magic is how it finds a link to the thread. This time it doesn't
>  > Weird, that doesn't seem to work very well if you try to search for
>  > stuff.  But if you just specify the Message-Id without clicking
>  > "search" then it works.  So here you go:
>  > https://commitfest.postgresql.org/patch/5742/
> Thanks! I think this memo should be pinned at the top of the page.
>
> On 7/5/2025 17:35, Junwang Zhao wrote:
> > + /* Convert absolute limit to a path fraction */
> > + if (path_fraction >= 1.)
> > + path_fraction /= childrel->rows;
> >
> > maybe add `&& childrel->rows > 0.` to the if condition to
> > avoid potential crash. I'm not sure if `childrel->rows` will
> > be 0 but I see get_cheapest_fractional_path did this.
> You may look at the 76281aa for the reason. Also, looking into the code,
> I see that it is impossible now to find more places with rows == 0
> except for the Dummy result node. Moreover, it just doesn't make sense
> in the case of append nodes: why should the optimiser spend resources
> and consider paths if it is impossible to pull any tuples from the subplan?
>
> So, I think checking assertion instead would be the better option. See
> new version in the attachment.

Agree, assertion is better, the new version LGTM.

>
> --
> regards, Andrei Lepikhov



--
Regards
Junwang Zhao