Re: [PERFORM] select subquery versus join subquery - Mailing list pgsql-performance

From Jeff Janes
Subject Re: [PERFORM] select subquery versus join subquery
Date
Msg-id CAMkU=1z_zzi+h-NTo0Skftr3YxF0LPtZqDGLBSRr-pRcBKQOLQ@mail.gmail.com
Whole thread Raw
In response to Re: [PERFORM] select subquery versus join subquery  ("Gunnar \"Nick\" Bluth" <gunnar.bluth.extern@elster.de>)
Responses Re: [PERFORM] select subquery versus join subquery
List pgsql-performance
On Tue, May 23, 2017 at 4:03 AM, Gunnar "Nick" Bluth <gunnar.bluth.extern@elster.de> wrote:
Am 05/22/2017 um 09:57 PM schrieb Jeff Janes:
>
> create view view2 as select id,
>   (
>      select md5 from thing_alias where thing_id=id
>         order by priority desc limit 1
>   ) as md5,
>   cutoff from thing;
>
> Cheers,
>
> Jeff

Hi Jeff,

how does something like

CREATE OR REPLACE VIEW public.view3 AS
 SELECT thing.id,
    foo.md5,
    thing.cutoff
   FROM thing,
    LATERAL ( SELECT DISTINCT ON (thing_alias.thing_id)
thing_alias.thing_id,
            thing_alias.md5
           FROM thing_alias
          WHERE thing_alias.thing_id = thing.id
          ORDER BY thing_alias.thing_id, thing_alias.priority DESC) foo

work for you? At least that's always using an index scan here, as
opposed to view1, which (for me) defaults to a SeqScan on thing_alias at
a low cutoff.

Unfortunately that always uses the index scan, even at a high cutoff where aggregation on the seq scan and then hash joining is more appropriate.  So it is very similar to view2, except that it doesn't return the rows from "thing" which have zero corresponding rows in thing_alias.

*****
Note btw. that both view1 and view2 don't return any md5 values for me,
while view3 does!
*****

Because of the way I constructed the data, using the power transform of the uniform random distribution, the early rows of the view (if sorted by thing_id) are mostly null in the md5 column, so if you only look at the first few screen-fulls you might not see any md5.  But your view does effectively an inner join rather than a left join, so your view gets rid of the rows with a NULL md5.  Most things don't have aliases; of the things that do, most have 1; and some have a several.

 

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Dinesh Chandra 12108
Date:
Subject: Re: [PERFORM] Log update query along with username who has executed thesame.
Next
From: Jarek
Date:
Subject: [PERFORM] More cores or higer frequency ?