Thread: Slow SQL?
I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs really slow compared toon the Oracle database and I am not able to interpret why this is slow. The SQL looks like this: Select a.status, a.plass, a.navn, a.avlsnr, date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0) From sed_uttak s, sem_avlsverdi a, semin_vare v where a.aktiv = 1 And s.dato = (Select Max(y.dato) From sed_uttak y Where y.avlsnr = s.avlsnr) And a.avlsnr = s.avlsnr And s.sedtype = v.tallkode And a.avlsnr Not In (Select avlsnr From dyr_pause_mot) Union Select a.status, a.plass, a.navn, a.avlsnr, date_part('day',(now() - s.dato)) dato_diff, 'Tappe pause', COALESCE(a.avlsverdi,0) From sed_uttak s, sem_avlsverdi a, dyr_pause_mot p Where s.dato = (Select Max(x.dato) From sed_uttak x Where x.avlsnr = s.avlsnr) And a.avlsnr = s.avlsnr And a.avlsnr = p.avlsnr Union Select a.status, a.plass, a.navn, a.avlsnr, null dato_diff, 'IKKE TAPPET', COALESCE(a.avlsverdi,0) From sem_avlsverdi a Where a.aktiv = 1 And a.avlsnr Not In (Select avlsnr From sed_uttak) And a.avlsnr Not In (Select avlsnr From dyr_pause_mot) And the explain result looks like this: HashAggregate (cost=7288068.92..7288092.10 rows=2318 width=36) (actual time=10740.366..10741.879 rows=6475 loops=1) Group Key: a.status, a.plass, a.navn, a.avlsnr, (date_part('day'::text, (now() - (s.dato)::timestamp with time zone))),v.tekst, (COALESCE((a.avlsverdi)::integer, 0)) -> Append (cost=1.46..7288028.35 rows=2318 width=36) (actual time=0.203..10730.906 rows=8915 loops=1) -> Nested Loop (cost=1.46..7274678.41 rows=698 width=82) (actual time=0.203..10638.870 rows=8602 loops=1) Join Filter: (s.sedtype = v.tallkode) Rows Removed by Join Filter: 127006 -> Nested Loop (cost=1.46..7274438.07 rows=698 width=26) (actual time=0.189..10607.509 rows=6164 loops=1) -> Seq Scan on sem_avlsverdi a (cost=1.04..153.19 rows=3238 width=16) (actual time=0.024..4.027 rows=6474loops=1) Filter: ((NOT (hashed SubPlan 5)) AND (aktiv = 1)) Rows Removed by Filter: 3 SubPlan 5 -> Seq Scan on dyr_pause_mot dyr_pause_mot_1 (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.003rows=3 loops=1) -> Index Scan using idx_seduttak_avlsnr on sed_uttak s (cost=0.42..2246.53 rows=1 width=14) (actualtime=1.404..1.637 rows=1 loops=6474) Index Cond: (avlsnr = a.avlsnr) Filter: (dato = (SubPlan 4)) Rows Removed by Filter: 42 SubPlan 4 -> Aggregate (cost=43.09..43.10 rows=1 width=8) (actual time=0.037..0.037 rows=1 loops=279035) -> Index Scan using idx_seduttak_avlsnr on sed_uttak y (cost=0.42..42.96 rows=52 width=8)(actual time=0.003..0.029 rows=76 loops=279035) Index Cond: (avlsnr = s.avlsnr) -> Materialize (cost=0.00..1.33 rows=22 width=60) (actual time=0.000..0.001 rows=22 loops=6164) -> Seq Scan on semin_vare v (cost=0.00..1.22 rows=22 width=60) (actual time=0.002..0.005 rows=22 loops=1) -> Nested Loop (cost=0.70..6761.33 rows=1 width=24) (actual time=0.342..1.801 rows=3 loops=1) -> Nested Loop (cost=0.28..25.96 rows=3 width=20) (actual time=0.008..0.015 rows=3 loops=1) -> Seq Scan on dyr_pause_mot p (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.002 rows=3 loops=1) -> Index Scan using idx_avlsverdi_avlsnr on sem_avlsverdi a_1 (cost=0.28..8.30 rows=1 width=16) (actualtime=0.003..0.003 rows=1 loops=3) Index Cond: (avlsnr = p.avlsnr) -> Index Scan using idx_seduttak_avlsnr on sed_uttak s_1 (cost=0.42..2245.11 rows=1 width=12) (actual time=0.377..0.589rows=1 loops=3) Index Cond: (avlsnr = a_1.avlsnr) Filter: (dato = (SubPlan 3)) Rows Removed by Filter: 27 SubPlan 3 -> Aggregate (cost=43.09..43.10 rows=1 width=8) (actual time=0.020..0.020 rows=1 loops=85) -> Index Scan using idx_seduttak_avlsnr on sed_uttak x (cost=0.42..42.96 rows=52 width=8) (actualtime=0.002..0.015 rows=37 loops=85) Index Cond: (avlsnr = s_1.avlsnr) -> Seq Scan on sem_avlsverdi a_2 (cost=6393.04..6565.43 rows=1619 width=16) (actual time=87.448..89.059 rows=310loops=1) Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (aktiv = 1)) Rows Removed by Filter: 6167 SubPlan 1 -> Seq Scan on sed_uttak (cost=0.00..5694.20 rows=279120 width=4) (actual time=0.005..36.384 rows=279120loops=1) SubPlan 2 -> Seq Scan on dyr_pause_mot (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.003 rows=3 loops=1) Planning time: 0.927 ms Execution time: 10742.300ms Appreciate any pointers on where to look... :) Regards, BTJ -- ----------------------------------------------------------------------------------------------- Bjørn T Johansen btj@havleik.no ----------------------------------------------------------------------------------------------- Someone wrote: "I understand that if you play a Windows CD backwards you hear strange Satanic messages" To which someone replied: "It's even worse than that; play it forwards and it installs Windows" -----------------------------------------------------------------------------------------------
Hi Bjorn, I have experienced that some subqueries can be quite slow, and would suspect the NOT IN clause. I occasionally rewrite NOT IN (select key from unwanted_candidates) as IN (select key from possible_candidates except select key from unwanted_candidates) Admittedly, I am not running latest version Regards Wolfgang Hamann >> I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs really slow comparedto on the Oracle database and >> I am not able to interpret why this is slow. >> >> The SQL looks like this: >> >> >> Select a.status, a.plass, a.navn, a.avlsnr, >> date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0) >> From sed_uttak s, sem_avlsverdi a, semin_vare v >> where a.aktiv = 1 >> And s.dato = (Select Max(y.dato) >> From sed_uttak y >> Where y.avlsnr = s.avlsnr) >> And a.avlsnr = s.avlsnr >> And s.sedtype = v.tallkode >> And a.avlsnr Not In (Select avlsnr >> From dyr_pause_mot) >> Union >> Select a.status, a.plass, a.navn, a.avlsnr, >> date_part('day',(now() - s.dato)) dato_diff, 'Tappe pause', COALESCE(a.avlsverdi,0) >> From sed_uttak s, sem_avlsverdi a, dyr_pause_mot p >> Where s.dato = (Select Max(x.dato) >> From sed_uttak x >> Where x.avlsnr = s.avlsnr) >> And a.avlsnr = s.avlsnr >> And a.avlsnr = p.avlsnr >> Union >> Select a.status, a.plass, a.navn, a.avlsnr, null dato_diff, 'IKKE TAPPET', COALESCE(a.avlsverdi,0) >> From sem_avlsverdi a >> Where a.aktiv = 1 >> And a.avlsnr Not In (Select avlsnr From sed_uttak) >> And a.avlsnr Not In (Select avlsnr From dyr_pause_mot) >> >> >> >> And the explain result looks like this: >> >> HashAggregate (cost=7288068.92..7288092.10 rows=2318 width=36) (actual time=10740.366..10741.879 rows=6475 loops=1) >> Group Key: a.status, a.plass, a.navn, a.avlsnr, (date_part('day'::text, (now() - (s.dato)::timestamp with time zone))),v.tekst, (COALESCE((a.avlsverdi)::integer, 0)) >> -> Append (cost=1.46..7288028.35 rows=2318 width=36) (actual time=0.203..10730.906 rows=8915 loops=1) >> -> Nested Loop (cost=1.46..7274678.41 rows=698 width=82) (actual time=0.203..10638.870 rows=8602 loops=1) >> Join Filter: (s.sedtype = v.tallkode) >> Rows Removed by Join Filter: 127006 >> -> Nested Loop (cost=1.46..7274438.07 rows=698 width=26) (actual time=0.189..10607.509 rows=6164 loops=1) >> -> Seq Scan on sem_avlsverdi a (cost=1.04..153.19 rows=3238 width=16) (actual time=0.024..4.027rows=6474 loops=1) >> Filter: ((NOT (hashed SubPlan 5)) AND (aktiv = 1)) >> Rows Removed by Filter: 3 >> SubPlan 5 >> -> Seq Scan on dyr_pause_mot dyr_pause_mot_1 (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.003rows=3 loops=1) >> -> Index Scan using idx_seduttak_avlsnr on sed_uttak s (cost=0.42..2246.53 rows=1 width=14) (actualtime=1.404..1.637 rows=1 loops=6474) >> Index Cond: (avlsnr = a.avlsnr) >> Filter: (dato = (SubPlan 4)) >> Rows Removed by Filter: 42 >> SubPlan 4 >> -> Aggregate (cost=43.09..43.10 rows=1 width=8) (actual time=0.037..0.037 rows=1 loops=279035) >> -> Index Scan using idx_seduttak_avlsnr on sed_uttak y (cost=0.42..42.96 rows=52 width=8)(actual time=0.003..0.029 rows=76 loops=279035) >> Index Cond: (avlsnr = s.avlsnr) >> -> Materialize (cost=0.00..1.33 rows=22 width=60) (actual time=0.000..0.001 rows=22 loops=6164) >> -> Seq Scan on semin_vare v (cost=0.00..1.22 rows=22 width=60) (actual time=0.002..0.005 rows=22loops=1) >> -> Nested Loop (cost=0.70..6761.33 rows=1 width=24) (actual time=0.342..1.801 rows=3 loops=1) >> -> Nested Loop (cost=0.28..25.96 rows=3 width=20) (actual time=0.008..0.015 rows=3 loops=1) >> -> Seq Scan on dyr_pause_mot p (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.002 rows=3loops=1) >> -> Index Scan using idx_avlsverdi_avlsnr on sem_avlsverdi a_1 (cost=0.28..8.30 rows=1 width=16)(actual time=0.003..0.003 rows=1 loops=3) >> Index Cond: (avlsnr = p.avlsnr) >> -> Index Scan using idx_seduttak_avlsnr on sed_uttak s_1 (cost=0.42..2245.11 rows=1 width=12) (actualtime=0.377..0.589 rows=1 loops=3) >> Index Cond: (avlsnr = a_1.avlsnr) >> Filter: (dato = (SubPlan 3)) >> Rows Removed by Filter: 27 >> SubPlan 3 >> -> Aggregate (cost=43.09..43.10 rows=1 width=8) (actual time=0.020..0.020 rows=1 loops=85) >> -> Index Scan using idx_seduttak_avlsnr on sed_uttak x (cost=0.42..42.96 rows=52 width=8)(actual time=0.002..0.015 rows=37 loops=85) >> Index Cond: (avlsnr = s_1.avlsnr) >> -> Seq Scan on sem_avlsverdi a_2 (cost=6393.04..6565.43 rows=1619 width=16) (actual time=87.448..89.059 rows=310loops=1) >> Filter: ((NOT (hashed SubPlan 1)) AND (NOT (hashed SubPlan 2)) AND (aktiv = 1)) >> Rows Removed by Filter: 6167 >> SubPlan 1 >> -> Seq Scan on sed_uttak (cost=0.00..5694.20 rows=279120 width=4) (actual time=0.005..36.384 rows=279120loops=1) >> SubPlan 2 >> -> Seq Scan on dyr_pause_mot (cost=0.00..1.03 rows=3 width=4) (actual time=0.002..0.003 rows=3 loops=1) >> Planning time: 0.927 ms >> Execution time: 10742.300ms >> >> >> Appreciate any pointers on where to look... :) >> >> >> Regards, >> >> BTJ >> >> -- >> ----------------------------------------------------------------------------------------------- >> Bjørn T Johansen >> >> btj@havleik.no >> ----------------------------------------------------------------------------------------------- >> Someone wrote: >> "I understand that if you play a Windows CD backwards you hear strange Satanic messages" >> To which someone replied: >> "It's even worse than that; play it forwards and it installs Windows" >> ----------------------------------------------------------------------------------------------- >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general
hamann.w@t-online.de wrote: Bjørn T Johansen wrote: >> I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs >> really slow compared to on the Oracle database and I am not able to interpret why this is slow. > I have experienced that some subqueries can be quite slow, and would suspect the NOT IN > clause. I occasionally rewrite > NOT IN (select key from unwanted_candidates) > as > IN (select key from possible_candidates except select key from unwanted_candidates) I would try to rewrite these clauses to NOT EXISTS, for example a.avlsnr Not In (Select avlsnr From dyr_pause_mot) could be NOT EXISTS (SELECT NULL FROM dyr_pause_mot WHERE avlsnr = a.avlsnr) This can be executed as anti-join and is often more efficient. Yours, Laurenz Albe
On Tue, 12 Jul 2016 09:12:23 +0000 Albe Laurenz <laurenz.albe@wien.gv.at> wrote: > hamann.w@t-online.de wrote: > Bjørn T Johansen wrote: > >> I am trying to move a small system from Oracle to PostgreSQL and I have come upon a sql that runs > >> really slow compared to on the Oracle database and I am not able to interpret why this is slow. > > > I have experienced that some subqueries can be quite slow, and would suspect the NOT IN > > clause. I occasionally rewrite > > NOT IN (select key from unwanted_candidates) > > as > > IN (select key from possible_candidates except select key from unwanted_candidates) > > I would try to rewrite these clauses to NOT EXISTS, for example > > a.avlsnr Not In (Select avlsnr From dyr_pause_mot) > > could be > > NOT EXISTS (SELECT NULL FROM dyr_pause_mot WHERE avlsnr = a.avlsnr) > > This can be executed as anti-join and is often more efficient. > > Yours, > Laurenz Albe > Thx for your suggestions. Tried to use NOT EXISTS and the query was about half a second quicker so not much difference... But when I try to run the 3 queries separately, then they are very quick, 2 barely measurable and the third takes about 1,5seconds. The union query takes a little over 9 seconds, so I guess the union part is the bottleneck? BTJ
On Tue, Jul 12, 2016 at 10:23:24AM +0200, Bjørn T Johansen wrote: > I am trying to move a small system from Oracle to PostgreSQL and > I have come upon a sql that runs really slow compared to on the Oracle > database and I am not able to interpret why this is slow. I loaded your explain analyze to https://explain.depesz.com/, as: https://explain.depesz.com/s/iXK as you can see there, the problem is that you made 280 thousand checks for "sed_uttak y", which seems to be related to this part: > Select a.status, a.plass, a.navn, a.avlsnr, > date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0) > From sed_uttak s, sem_avlsverdi a, semin_vare v > where a.aktiv = 1 > And s.dato = (Select Max(y.dato) > From sed_uttak y > Where y.avlsnr = s.avlsnr) from what I understand, you're doing it to work on newest record from sed_uttak, for each avlsnr. What is rowcount in the table, and how many different avlsnr are there? You might want to do something like: with s as ( select distinct on (avlsnr) * from sed_uttak order by avlsnr desc, dato desc ) and then use "s" instead of set_uttak, and get rid of the s.dato = (select max....) checks. Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/
Bjørn T Johansen wrote: > Thx for your suggestions. Tried to use NOT EXISTS and the query was about half a second quicker so not > much difference... > But when I try to run the 3 queries separately, then they are very quick, 2 barely measurable and the > third takes about 1,5 seconds. The union query > takes a little over 9 seconds, so I guess the union part is the bottleneck? Looking at EXPLAIN (ANALYZE) output should tell you. If yes, is UNION ALL an option for you? That should be cheaper. Yours, Laurenz Albe
=?UTF-8?B?QmrDuHJu?= T Johansen <btj@havleik.no> writes: > But when I try to run the 3 queries separately, then they are very quick, 2 barely measurable and the third takes about1,5 seconds. The union query > takes a little over 9 seconds, so I guess the union part is the bottleneck? No; it's clear from your EXPLAIN output that the first UNION arm is taking the bulk of the runtime. It's odd that it wouldn't be exactly the same when run standalone. What does EXPLAIN ANALYZE show if you run that first arm by itself? I concur with depesz's observation that the MAX() subselects seem to be pretty expensive. If you don't want to rewrite the query in a wholesale fashion like he suggests, you might be able to make the MAX's cheaper by providing an index on sed_uttak(avlsnr, dato); but I'm not sure how much that will help. regards, tom lane
On Tue, 12 Jul 2016 14:34:50 +0200 hubert depesz lubaczewski <depesz@depesz.com> wrote: > On Tue, Jul 12, 2016 at 10:23:24AM +0200, Bjørn T Johansen wrote: > > I am trying to move a small system from Oracle to PostgreSQL and > > I have come upon a sql that runs really slow compared to on the Oracle > > database and I am not able to interpret why this is slow. > > I loaded your explain analyze to https://explain.depesz.com/, as: > https://explain.depesz.com/s/iXK > > as you can see there, the problem is that you made 280 thousand checks > for "sed_uttak y", which seems to be related to this part: > > > > Select a.status, a.plass, a.navn, a.avlsnr, > > date_part('day',(now() - s.dato)) dato_diff, v.tekst, COALESCE(a.avlsverdi,0) > > From sed_uttak s, sem_avlsverdi a, semin_vare v > > where a.aktiv = 1 > > And s.dato = (Select Max(y.dato) > > From sed_uttak y > > Where y.avlsnr = s.avlsnr) > > from what I understand, you're doing it to work on newest record from > sed_uttak, for each avlsnr. > > What is rowcount in the table, and how many different avlsnr are there? > > You might want to do something like: > > with s as ( > select distinct on (avlsnr) * > from sed_uttak > order by avlsnr desc, dato desc > ) > > and then use "s" instead of set_uttak, and get rid of the s.dato > = (select max....) checks. > > Best regards, > > depesz > Rewrote using the with s as method and now it's faster than the oracle querytime, so I am happy... And I learned something new, never used the with s as way before... Thx! :) BTJ