Home > mailing lists

Re: Add proper planner support for ORDER BY / DISTINCT aggregates - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Add proper planner support for ORDER BY / DISTINCT aggregates
Date	December 13, 2022 07:53:40
Msg-id	CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com Whole thread Raw
In response to	Re: Add proper planner support for ORDER BY / DISTINCT aggregates (David Rowley <dgrowleyml@gmail.com>)
Responses	Re: Add proper planner support for ORDER BY / DISTINCT aggregates
List	pgsql-hackers

Tree view

On Wed, 9 Nov 2022 at 14:58, David Rowley <dgrowleyml@gmail.com> wrote:
> v2 attached.

I've been looking at this again and this time around understand why
the * 1.5 pessimism factor was included in the incremental sort code.

If we create a table with a very large skew in the number of rows per
what will be our pre-sorted groups.

create table ab (a int not null, b int not null);
insert into ab select 0,x from generate_Series(1,999000)x union all
select x%1000+1,0 from generate_Series(999001,1000000)x;

Here the 0 group has close to 1 million rows, but the remaining groups
1-1000 have just 1 row each. The planner only knows there are about
1001 distinct values in "a" and assumes an even distribution of rows
between those values.

With:
explain (analyze, timing off) select * from ab order by a,b;

In master, the plan is:

                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Sort  (cost=122490.27..124990.27 rows=1000000 width=8) (actual
rows=1000000 loops=1)
   Sort Key: a, b
   Sort Method: quicksort  Memory: 55827kB
   ->  Index Scan using ab_a_idx on ab  (cost=0.42..22832.42
rows=1000000 width=8) (actual rows=1000000 loops=1)
 Planning Time: 0.069 ms
 Execution Time: 155.469 ms

With the v2 patch it's:
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Incremental Sort  (cost=2767.38..109344.55 rows=1000000 width=8)
(actual rows=1000000 loops=1)
   Sort Key: a, b
   Presorted Key: a
   Full-sort Groups: 33  Sort Method: quicksort  Average Memory: 27kB
Peak Memory: 27kB
   Pre-sorted Groups: 1  Sort Method: quicksort  Average Memory:
55795kB  Peak Memory: 55795kB
   ->  Index Scan using ab_a_idx on ab  (cost=0.42..22832.42
rows=1000000 width=8) (actual rows=1000000 loops=1)
 Planning Time: 0.072 ms
 Execution Time: 163.614 ms

So there is a performance regression.

Sometimes teaching the planner new tricks means that it might use
those tricks at a bad time.  Normally we put in an off switch for
these situations to allow users an escape hatch. We have
enable_incremental_sort for this.  It seems like incremental sort has
tried to avoid this problem by always considering the same "Sort"
paths that we did prior to incremental sort, and also considers
incremental sort for pre-sorted paths with the 1.5 pessimism factor.
The v2 patch taking away the safety net.

I think what we need to do is:  Do our best to give incremental sort
the most realistic costs we can and accept that it might choose a
worse plan in some cases. Users can turn it off if they really have no
other means to convince the planner it's wrong.

Additionally, I think what we also need to add a GUC such as
enable_presorted_aggregate.  People can use that when their Index Scan
-> Incremental Sort -> Aggregate plan is worse than their previous Seq
Scan -> Sort -> Aggregate plan that they were getting in < 16.
Turning off enable_incremental_sort alone won't give them the same
aggregate plan that they had in pg15 as we always set the
query_pathkeys to request a sort order that will suit the order by /
distinct aggregates.

I'll draft up a patch for the enable_presorted_aggregate.

David

pgsql-hackers by date:

From: Bharath Rupireddy
Date: 13 December 2022, 07:36:48
Subject: Use get_call_result_type() more widely

From: Michael Paquier
Date: 13 December 2022, 08:13:21
Subject: Re: Use get_call_result_type() more widely

Re: Add proper planner support for ORDER BY / DISTINCT aggregates - Mailing list pgsql-hackers

Previous

Next