Home > mailing lists

Unions and where optimisation - Mailing list pgsql-performance

From	Boris Klug
Subject	Unions and where optimisation
Date	January 8, 2003 08:27:08
Msg-id	200301081425.48597.boris.klug@control.de Whole thread Raw
Responses	Re: Unions and where optimisation Re: Unions and where optimisation Re: Unions and where optimisation
List	pgsql-performance

Tree view

Hello!

I am quite new in the PostgreSQL performance business, done a few years Oracle
stuff before. My ist question is the following:

We have three table, lets name them rk150, 151 and rk152. They all have a
timestamp and a order number in common but than different data after this.
Now I need the data from all tables in one view for a given order number, so
I created a view

create view orderevents as
   select ts, aufnr from rk150
     union
   select ts, aufnr from rk151
     union
   select ts, aufnr from rk152;

When I does a "select * from orderevents where aufnr='1234'" it takes over 14
seconds!
The problem is now that PostgreSQL first does the union with all the three
tables and after this sorts out the right rows:

Subquery Scan a  (cost=54699.06..56622.18 rows=38462 width=20)
  ->  Unique  (cost=54699.06..56622.18 rows=38462 width=20)
        ->  Sort  (cost=54699.06..54699.06 rows=384624 width=20)
              ->  Append  (cost=0.00..10689.24 rows=384624 width=20)
                    ->  Subquery Scan *SELECT* 1
                           (cost=0.00..8862.52 rows=314852 width=20)
                          ->  Seq Scan on rk150
                           (cost=0.00..8862.52 rows=314852 width=20)
                    ->  Subquery Scan *SELECT* 2
                           (cost=0.00..1208.58 rows=45858 width=20)
                          ->  Seq Scan on rk151
                           (cost=0.00..1208.58 rows=45858 width=20)
                    ->  Subquery Scan *SELECT* 3
                           (cost=0.00..618.14 rows=23914 width=20)
                          ->  Seq Scan on rk152
                           (cost=0.00..618.14 rows=23914 width=20)

A better thing would it (Oracle does this and I think I have seen it on
PostgreSQL before), that the where-clause is moved inside every select so we
have something like this (written by hand):

select * from (
   select zeit, aufnr from rk150 where aufnr='13153811'
     union
   select zeit, aufnr from rk151 where aufnr='13153811'
     union
   select zeit, aufnr from rk152 where aufnr='13153811')
 as A;

This takes less than 1 second because the nr of rows that have to be joined
are only 45 (optimizer expects 4), not > 300.000:

Subquery Scan a  (cost=45.97..46.19 rows=4 width=20)
  ->  Unique  (cost=45.97..46.19 rows=4 width=20)
        ->  Sort  (cost=45.97..45.97 rows=45 width=20)
              ->  Append  (cost=0.00..44.74 rows=45 width=20)
                    ->  Subquery Scan *SELECT* 1
                           (cost=0.00..32.22 rows=31 width=20)
                          ->  Index Scan using rk150_uidx_aufnr on rk150
                           (cost=0.00..32.22 rows=31 width=20)
                    ->  Subquery Scan *SELECT* 2
                           (cost=0.00..7.67 rows=9 width=20)
                          ->  Index Scan using rk151_uidx_aufnr on rk151
                           (cost=0.00..7.67 rows=9 width=20)
                    ->  Subquery Scan *SELECT* 3
                           (cost=0.00..4.85 rows=5 width=20)
                          ->  Index Scan using rk152_uidx_aufnr on rk152
                           (cost=0.00..4.85 rows=5 width=20)

My question now: Is the optimizer able to move the where clause into unions?
If so, how I can get him to do it?

Thank you for the help in advance!

--
Dipl. Inform. Boris Klug, control IT GmbH, Germany

pgsql-performance by date:

From: "scott.marlowe"
Date: 07 January 2003, 18:24:27
Subject: Re: [GENERAL] PostgreSQL and memory usage

From: Achilleus Mantzios
Date: 08 January 2003, 08:45:17
Subject: Re: [SQL] 7.3.1 index use / performance

Unions and where optimisation - Mailing list pgsql-performance

Previous

Next