Thread: WITH ORDINALITY planner improvements

WITH ORDINALITY planner improvements

From

Greg Stark

Date:

31 July 2013, 15:51:41

On Mon, Jul 29, 2013 at 1:02 PM, Greg Stark <stark@mit.edu> wrote:
> On Mon, Jul 29, 2013 at 8:56 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> Unless LATERAL provides a way to do lock-step iteration through a pair
>> (or more) of functions I don't think we can get rid of SRFs [in select target lists] yet
>
> You don't even need lateral. This works fine:
>
> postgres=# select * from generate_series(1,10) with ordinality as
> a(a,o) natural full outer join generate_series(1,5) with ordinality as
> b(b,o) ;
>
>  o  | a  | b
> ----+----+---
>   1 |  1 | 1
>   2 |  2 | 2
>   3 |  3 | 3
>   4 |  4 | 4
>   5 |  5 | 5
>   6 |  6 |
>   7 |  7 |
>   8 |  8 |
>   9 |  9 |
>  10 | 10 |
> (10 rows)
>
> However it occurs to me that the plan isn't ideal:
>
> postgres=# explain select * from generate_series(1,10) with ordinality
> as a(a,o) natural full outer join generate_series(1,5) with ordinality
> as b(b,o) ;
>                                       QUERY PLAN
> ---------------------------------------------------------------------------------------
>  Merge Full Join  (cost=119.66..199.66 rows=5000 width=24)
>    Merge Cond: (a.o = b.o)
>    ->  Sort  (cost=59.83..62.33 rows=1000 width=12)
>          Sort Key: a.o
>          ->  Function Scan on generate_series a  (cost=0.00..10.00
> rows=1000 width=12)
>    ->  Sort  (cost=59.83..62.33 rows=1000 width=12)
>          Sort Key: b.o
>          ->  Function Scan on generate_series b  (cost=0.00..10.00
> rows=1000 width=12)
> (8 rows)
>
>
> I think all that's required to avoid the sorts is teaching the planner
> that the Path has a PathKey of the ordinal column. I can look at that
> later but I'll go ahead and commit it without it at first. I wonder if
> it's also useful to teach the planner that the column is unique.

So I took a crack at this. But it's not working.

To generate the pathkey for the ordinality column I basically exported
make_pathkey_from_sortop() through a thin wrapper called
make_pathkey_for_functionscan() and passed the Var from the
reltargetlist and hard coded int8gt's oid. That might stand to be
generalized a bit but in any case it doesn't matter because it doesn't
work.

My best guess of what's going wrong is that the eclass is being
generated too late and instead of getting the canonical eclass it's
getting a freshly minted one that doesn't match the eclass for the
merge join pathkey. But as I said, that's just a guess. I'm missing a
high level view of the order of operations between turning parser
column references into vars into eclasses and what has to be done
when. My best guess is that what's needed is marking the ordinality
column with a sortref early on but again, as I said, I'm just
guessing.

--
greg

--
greg

Attachment

ordinality-path.diff

Re: WITH ORDINALITY planner improvements

From

"Etsuro Fujita"

Date:

13 August 2013, 03:11:15

> > However it occurs to me that the plan isn't ideal:
> >
> > postgres=# explain select * from generate_series(1,10) with ordinality
> > as a(a,o) natural full outer join generate_series(1,5) with ordinality
> > as b(b,o) ;
> >                                       QUERY PLAN
> > ----------------------------------------------------------------------
> > -----------------  Merge Full Join  (cost=119.66..199.66 rows=5000
> > width=24)
> >    Merge Cond: (a.o = b.o)
> >    ->  Sort  (cost=59.83..62.33 rows=1000 width=12)
> >          Sort Key: a.o
> >          ->  Function Scan on generate_series a  (cost=0.00..10.00
> > rows=1000 width=12)
> >    ->  Sort  (cost=59.83..62.33 rows=1000 width=12)
> >          Sort Key: b.o
> >          ->  Function Scan on generate_series b  (cost=0.00..10.00
> > rows=1000 width=12)
> > (8 rows)

> > I think all that's required to avoid the sorts is teaching the planner
> > that the Path has a PathKey of the ordinal column. I can look at that
> > later but I'll go ahead and commit it without it at first. I wonder if
> > it's also useful to teach the planner that the column is unique.
> 
> So I took a crack at this. But it's not working.
> 
> To generate the pathkey for the ordinality column I basically exported
> make_pathkey_from_sortop() through a thin wrapper called
> make_pathkey_for_functionscan() and passed the Var from the reltargetlist and
> hard coded int8gt's oid. That might stand to be generalized a bit but in any
> case it doesn't matter because it doesn't work.

I think the patch is buggy in some ways.  I've reworked on the patch.  (I did
nothing on the uniqueness of an ordinality column.  We should do something on
that?)  Attached is a reworked version of the patch.  With the reworked version,
we get a plan for the above query:

postgres=# EXPLAIN SELECT * FROM generate_series(1,10) WITH ORDINALITY AS a(a,o)
NATURAL FULL OUTER JOIN generate_series(1,5) WITH ORDINALITY AS b(b,o);                                     QUERY PLAN
--------------------------------------------------------------------------------
-------Merge Full Join  (cost=0.01..97.50 rows=5000 width=24)  Merge Cond: (a.o = b.o)  ->  Function Scan on
generate_seriesa  (cost=0.00..10.00 rows=1000 width=12)  ->  Materialize  (cost=0.00..12.50 rows=1000 width=12)
-> Function Scan on generate_series b  (cost=0.00..10.00 rows=1000
 
width=12)
(5 rows)

> My best guess of what's going wrong is that the eclass is being generated too
> late and instead of getting the canonical eclass it's getting a freshly minted
> one that doesn't match the eclass for the merge join pathkey. But as I said,
> that's just a guess. I'm missing a high level view of the order of operations
> between turning parser column references into vars into eclasses and what has
> to be done when. My best guess is that what's needed is marking the ordinality
> column with a sortref early on but again, as I said, I'm just guessing.

You can find the discussion about this in src/backend/optimizer/README (Part:
Order of processing for EquivalenceClasses and PathKeys).

Thanks,

Best regards,
Etsuro Fujita

Re: WITH ORDINALITY planner improvements

From

"Etsuro Fujita"

Date:

15 August 2013, 10:20:26

I wrote:
> I've reworked on the patch.

Attached is an updated version of the patch.  In that version the code for the
newly added function build_function_pathkeys() has been made more simple by
using the macro INTEGER_BTREE_FAM_OID.

Thanks,

Best regards,
Etsuro Fujita

Re: WITH ORDINALITY planner improvements

From

Bruce Momjian

Date:

31 January 2014, 22:46:42

On Thu, Aug 15, 2013 at 07:25:17PM +0900, Etsuro Fujita wrote:
> I wrote:
> > I've reworked on the patch.
> 
> Attached is an updated version of the patch.  In that version the code for the
> newly added function build_function_pathkeys() has been made more simple by
> using the macro INTEGER_BTREE_FAM_OID.

Is this patch to be applied?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Re: WITH ORDINALITY planner improvements

From

Tom Lane

Date:

31 January 2014, 23:01:50

Bruce Momjian <bruce@momjian.us> writes:
> On Thu, Aug 15, 2013 at 07:25:17PM +0900, Etsuro Fujita wrote:
>> Attached is an updated version of the patch.  In that version the code for the
>> newly added function build_function_pathkeys() has been made more simple by
>> using the macro INTEGER_BTREE_FAM_OID.

> Is this patch to be applied?

It hasn't been reviewed AFAIK.
        regards, tom lane

Re: WITH ORDINALITY planner improvements

From

Etsuro Fujita

Date:

03 February 2014, 02:51:32

(2014/02/01 8:01), Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
>> On Thu, Aug 15, 2013 at 07:25:17PM +0900, Etsuro Fujita wrote:
>>> Attached is an updated version of the patch.  In that version the code for the
>>> newly added function build_function_pathkeys() has been made more simple by
>>> using the macro INTEGER_BTREE_FAM_OID.
> 
>> Is this patch to be applied?
> 
> It hasn't been reviewed AFAIK.

Thanks for picking this up.

I think it doesn't need to be reviewed anymore because the same feature
has been commited in [1] if I understand correctly.

[1]
http://www.postgresql.org/message-id/E1Vjel1-0007x9-TR@gemulon.postgresql.org

Best regards,
Etsuro Fujita