Unions and where optimisation - Mailing list pgsql-performance
From | Boris Klug |
---|---|
Subject | Unions and where optimisation |
Date | |
Msg-id | 200301081425.48597.boris.klug@control.de Whole thread Raw |
Responses |
Re: Unions and where optimisation
Re: Unions and where optimisation Re: Unions and where optimisation |
List | pgsql-performance |
Hello! I am quite new in the PostgreSQL performance business, done a few years Oracle stuff before. My ist question is the following: We have three table, lets name them rk150, 151 and rk152. They all have a timestamp and a order number in common but than different data after this. Now I need the data from all tables in one view for a given order number, so I created a view create view orderevents as select ts, aufnr from rk150 union select ts, aufnr from rk151 union select ts, aufnr from rk152; When I does a "select * from orderevents where aufnr='1234'" it takes over 14 seconds! The problem is now that PostgreSQL first does the union with all the three tables and after this sorts out the right rows: Subquery Scan a (cost=54699.06..56622.18 rows=38462 width=20) -> Unique (cost=54699.06..56622.18 rows=38462 width=20) -> Sort (cost=54699.06..54699.06 rows=384624 width=20) -> Append (cost=0.00..10689.24 rows=384624 width=20) -> Subquery Scan *SELECT* 1 (cost=0.00..8862.52 rows=314852 width=20) -> Seq Scan on rk150 (cost=0.00..8862.52 rows=314852 width=20) -> Subquery Scan *SELECT* 2 (cost=0.00..1208.58 rows=45858 width=20) -> Seq Scan on rk151 (cost=0.00..1208.58 rows=45858 width=20) -> Subquery Scan *SELECT* 3 (cost=0.00..618.14 rows=23914 width=20) -> Seq Scan on rk152 (cost=0.00..618.14 rows=23914 width=20) A better thing would it (Oracle does this and I think I have seen it on PostgreSQL before), that the where-clause is moved inside every select so we have something like this (written by hand): select * from ( select zeit, aufnr from rk150 where aufnr='13153811' union select zeit, aufnr from rk151 where aufnr='13153811' union select zeit, aufnr from rk152 where aufnr='13153811') as A; This takes less than 1 second because the nr of rows that have to be joined are only 45 (optimizer expects 4), not > 300.000: Subquery Scan a (cost=45.97..46.19 rows=4 width=20) -> Unique (cost=45.97..46.19 rows=4 width=20) -> Sort (cost=45.97..45.97 rows=45 width=20) -> Append (cost=0.00..44.74 rows=45 width=20) -> Subquery Scan *SELECT* 1 (cost=0.00..32.22 rows=31 width=20) -> Index Scan using rk150_uidx_aufnr on rk150 (cost=0.00..32.22 rows=31 width=20) -> Subquery Scan *SELECT* 2 (cost=0.00..7.67 rows=9 width=20) -> Index Scan using rk151_uidx_aufnr on rk151 (cost=0.00..7.67 rows=9 width=20) -> Subquery Scan *SELECT* 3 (cost=0.00..4.85 rows=5 width=20) -> Index Scan using rk152_uidx_aufnr on rk152 (cost=0.00..4.85 rows=5 width=20) My question now: Is the optimizer able to move the where clause into unions? If so, how I can get him to do it? Thank you for the help in advance! -- Dipl. Inform. Boris Klug, control IT GmbH, Germany
pgsql-performance by date: