Re: 7.3.1 New install, large queries are slow - Mailing list pgsql-performance
From | Roman Fail |
---|---|
Subject | Re: 7.3.1 New install, large queries are slow |
Date | |
Msg-id | 9B1C77393DED0D4B9DAA1AA1742942DA0E4C02@pos_pdc.posportal.com Whole thread Raw |
In response to | 7.3.1 New install, large queries are slow ("Roman Fail" <rfail@posportal.com>) |
Responses |
Re: 7.3.1 New install, large queries are slow
Re: 7.3.1 New install, large queries are slow Re: 7.3.1 New install, large queries are slow Re: 7.3.1 New install, large queries are slow |
List | pgsql-performance |
*********************** > Josh Berkus wrote: > Hey, Roman, how many records in BatchDetail, anyway? 23 million. *********************** > Stephan Szabo wrote: > What does vacuum verbose batchdetail give you (it'll give an idea of pages anyway) trans=# VACUUM VERBOSE batchdetail; INFO: --Relation public.batchdetail-- INFO: Pages 1669047: Changed 0, Empty 0; Tup 23316674: Vac 0, Keep 0, UnUsed 0. Total CPU 85.36s/9.38u sec elapsed 253.38 sec. INFO: --Relation pg_toast.pg_toast_8604247-- INFO: Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0. Total CPU 0.00s/0.00u sec elapsed 0.00 sec. VACUUM trans=# *********************** At Stephan Szabo and Tom Lane's suggestion, I reorganized the query so the JOIN syntax was only used in the outer joins. This did not seem to help at all. Of note: during this query 'sar -b' showed a consistent 6000 blocks read/sec, CPUwas about 2%. EXPLAIN ANALYZE SELECT b.batchdate, d.batchdetailid, t.bankno, d.trandate, d.tranamount, d.submitinterchange, d.authamount, d.authno, d.cardtypeid, d.mcccode, m.name AS merchantname, c.cardtype, m.merchid, p1.localtaxamount, p1.productidentifier, dr.avsresponse, cr.checkoutdate, cr.noshowindicator, ck.checkingacctno, ck.abaroutingno, ck.checkno FROM tranheader t, batchheader b, merchants m, cardtype c, batchdetail d LEFT JOIN purc1 p1 on p1.batchdetailid=d.batchdetailid LEFT JOIN direct dr ON dr.batchdetailid = d.batchdetailid LEFT JOIN carrental cr ON cr.batchdetailid = d.batchdetailid LEFT JOIN checks ck ON ck.batchdetailid = d.batchdetailid WHERE t.tranheaderid=b.tranheaderid AND m.merchantid=b.merchantid AND d.batchid=b.batchid AND c.cardtypeid=d.cardtypeid AND t.clientid = 6 AND d.tranamount BETWEEN 500.0 AND 700.0 AND b.batchdate > '2002-12-15' AND m.merchid = '701252267' ORDER BY b.batchdate DESC LIMIT 50 Limit (cost=1789105.21..1789105.22 rows=1 width=285) (actual time=1222029.59..1222029.61 rows=5 loops=1) -> Sort (cost=1789105.21..1789105.22 rows=1 width=285) (actual time=1222029.58..1222029.59 rows=5 loops=1) Sort Key: b.batchdate -> Nested Loop (cost=1787171.22..1789105.20 rows=1 width=285) (actual time=1221815.14..1222019.46 rows=5 loops=1) Join Filter: ("inner".tranheaderid = "outer".tranheaderid) -> Nested Loop (cost=1787171.22..1789026.02 rows=1 width=269) (actual time=1221809.33..1221978.62 rows=5loops=1) Join Filter: ("inner".cardtypeid = "outer".cardtypeid) -> Merge Join (cost=1787171.22..1789024.79 rows=1 width=255) (actual time=1221802.47..1221971.48 rows=5loops=1) Merge Cond: ("outer".batchid = "inner".batchid) -> Sort (cost=476.17..476.18 rows=4 width=102) (actual time=678.05..678.07 rows=17 loops=1) Sort Key: b.batchid -> Nested Loop (cost=0.00..476.14 rows=4 width=102) (actual time=161.62..677.95 rows=17loops=1) -> Index Scan using merchants_ix_merchid_idx on merchants m (cost=0.00..5.65 rows=1width=78) (actual time=13.87..13.88 rows=1 loops=1) Index Cond: (merchid = '701252267'::character varying) -> Index Scan using batchheader_ix_merchantid_idx on batchheader b (cost=0.00..470.30rows=15 width=24) (actual time=147.72..663.94 rows=17 loops=1) Index Cond: ("outer".merchantid = b.merchantid) Filter: (batchdate > '2002-12-15 00:00:00'::timestamp without time zone) -> Sort (cost=1786695.05..1787621.82 rows=370710 width=153) (actual time=1220080.34..1220722.19rows=368681 loops=1) Sort Key: d.batchid -> Merge Join (cost=1704191.25..1713674.49 rows=370710 width=153) (actual time=1200184.91..1213352.77rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Merge Join (cost=1704085.28..1712678.33 rows=370710 width=115) (actual time=1199705.71..1210336.37rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Merge Join (cost=1704085.27..1711751.54 rows=370710 width=98) (actual time=1199705.65..1208122.73rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Merge Join (cost=1704085.26..1710824.75 rows=370710 width=89) (actualtime=1199705.55..1205977.76 rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Sort (cost=1543119.01..1544045.79 rows=370710 width=70) (actualtime=1181172.79..1181902.77 rows=370307 loops=1) Sort Key: d.batchdetailid -> Index Scan using batchdetail_ix_tranamount_idx on batchdetaild (cost=0.00..1489103.46 rows=370710 width=70) (actual time=14.45..1176074.90 rows=370307 loops=1) Index Cond: ((tranamount >= 500.0) AND (tranamount <=700.0)) -> Sort (cost=160966.25..163319.59 rows=941335 width=19) (actualtime=18532.70..20074.09 rows=938770 loops=1) Sort Key: p1.batchdetailid -> Seq Scan on purc1 p1 (cost=0.00..44285.35 rows=941335width=19) (actual time=9.44..9119.83 rows=938770 loops=1) -> Sort (cost=0.01..0.02 rows=1 width=9) (actual time=0.08..0.08 rows=0loops=1) Sort Key: dr.batchdetailid -> Seq Scan on direct dr (cost=0.00..0.00 rows=1 width=9) (actualtime=0.01..0.01 rows=0 loops=1) -> Sort (cost=0.01..0.02 rows=1 width=17) (actual time=0.04..0.04 rows=0 loops=1) Sort Key: cr.batchdetailid -> Seq Scan on carrental cr (cost=0.00..0.00 rows=1 width=17) (actualtime=0.00..0.00 rows=0 loops=1) -> Sort (cost=105.97..109.13 rows=1267 width=38) (actual time=479.17..480.74 rows=1267loops=1) Sort Key: ck.batchdetailid -> Seq Scan on checks ck (cost=0.00..40.67 rows=1267 width=38) (actual time=447.88..475.60rows=1267 loops=1) -> Seq Scan on cardtype c (cost=0.00..1.10 rows=10 width=14) (actual time=1.37..1.39 rows=10 loops=5) -> Seq Scan on tranheader t (cost=0.00..55.15 rows=1923 width=16) (actual time=0.01..5.14 rows=1923 loops=5) Filter: (clientid = 6) Total runtime: 1222157.28 msec *********************** Just to see what would happen, I executed: ALTER TABLE batchdetail ALTER COLUMN tranamount SET STATISTICS 1000; ANALYZE; It seemed to hurt performance if anything. But the EXPLAIN estimate for rows was much closer to the real value than it waspreviously. *********************** It seems to me that the big, big isolated problem is the index scan on batchdetail.tranamount. During this small query,'sar -b' showed consistent 90,000 block reads/sec. (contrast with only 6,000 with larger query index scan). 'top'shows the CPU is at 20% user, 30% system the whole time (contrast with 2% total in larger query above). This resultshere still seem pretty bad (although not as bad as above), but I still don't know what is the bottleneck. And thestrange sar stats are confusing me. EXPLAIN ANALYZE SELECT * FROM batchdetail WHERE tranamount BETWEEN 300 AND 499; Seq Scan on batchdetail (cost=0.00..2018797.11 rows=783291 width=440) (actual time=45.66..283926.58 rows=783687 loops=1) Filter: ((tranamount >= 300::numeric) AND (tranamount <= 499::numeric)) Total runtime: 285032.47 msec *********************** > Stephan Szabo wrote: > As a followup, if you do set enable_indexscan=off; > before running the explain analyze, what does that give you? Now this is very interesting: 'sar -b' shows about 95,000 block reads/sec; CPU is at 20% user 30% system, vmstat shows noswapping, query takes only 5 minutes to execute (which is one-quarter of the time WITH the index scan!!!!). Obviouslythe execution plan is pretty different on this one (query is identical the larger one above). EXPLAIN ANALYZE SELECT b.batchdate, d.batchdetailid, t.bankno, d.trandate, d.tranamount, d.submitinterchange, d.authamount, d.authno, d.cardtypeid, d.mcccode, m.name AS merchantname, c.cardtype, m.merchid, p1.localtaxamount, p1.productidentifier, dr.avsresponse, cr.checkoutdate, cr.noshowindicator, ck.checkingacctno, ck.abaroutingno, ck.checkno FROM tranheader t, batchheader b, merchants m, cardtype c, batchdetail d LEFT JOIN purc1 p1 on p1.batchdetailid=d.batchdetailid LEFT JOIN direct dr ON dr.batchdetailid = d.batchdetailid LEFT JOIN carrental cr ON cr.batchdetailid = d.batchdetailid LEFT JOIN checks ck ON ck.batchdetailid = d.batchdetailid WHERE t.tranheaderid=b.tranheaderid AND m.merchantid=b.merchantid AND d.batchid=b.batchid AND c.cardtypeid=d.cardtypeid AND t.clientid = 6 AND d.tranamount BETWEEN 500.0 AND 700.0 AND b.batchdate > '2002-12-15' AND m.merchid = '701252267' ORDER BY b.batchdate DESC LIMIT 50 Limit (cost=2321460.56..2321460.57 rows=1 width=285) (actual time=308194.57..308194.59 rows=5 loops=1) -> Sort (cost=2321460.56..2321460.57 rows=1 width=285) (actual time=308194.57..308194.58 rows=5 loops=1) Sort Key: b.batchdate -> Nested Loop (cost=2319526.57..2321460.55 rows=1 width=285) (actual time=307988.56..308194.46 rows=5 loops=1) Join Filter: ("inner".tranheaderid = "outer".tranheaderid) -> Nested Loop (cost=2319526.57..2321381.37 rows=1 width=269) (actual time=307982.80..308153.22 rows=5 loops=1) Join Filter: ("inner".cardtypeid = "outer".cardtypeid) -> Merge Join (cost=2319526.57..2321380.14 rows=1 width=255) (actual time=307982.69..308152.82 rows=5loops=1) Merge Cond: ("outer".batchid = "inner".batchid) -> Sort (cost=2316388.70..2317315.47 rows=370710 width=153) (actual time=305976.74..306622.88rows=368681 loops=1) Sort Key: d.batchid -> Merge Join (cost=2233884.90..2243368.15 rows=370710 width=153) (actual time=286452.12..299485.43rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Merge Join (cost=2233778.93..2242371.98 rows=370710 width=115) (actual time=286428.77..296939.66rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Merge Join (cost=2233778.92..2241445.19 rows=370710 width=98) (actual time=286428.72..294750.01rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Merge Join (cost=2233778.91..2240518.40 rows=370710 width=89) (actualtime=286428.60..292606.56 rows=370307 loops=1) Merge Cond: ("outer".batchdetailid = "inner".batchdetailid) -> Sort (cost=2072812.66..2073739.44 rows=370710 width=70) (actualtime=269738.34..270470.83 rows=370307 loops=1) Sort Key: d.batchdetailid -> Seq Scan on batchdetail d (cost=0.00..2018797.11 rows=370710width=70) (actual time=41.66..266568.83 rows=370307 loops=1) Filter: ((tranamount >= 500.0) AND (tranamount <= 700.0)) -> Sort (cost=160966.25..163319.59 rows=941335 width=19) (actualtime=16690.20..18202.65 rows=938770 loops=1) Sort Key: p1.batchdetailid -> Seq Scan on purc1 p1 (cost=0.00..44285.35 rows=941335width=19) (actual time=6.88..7779.31 rows=938770 loops=1) -> Sort (cost=0.01..0.02 rows=1 width=9) (actual time=0.10..0.10 rows=0loops=1) Sort Key: dr.batchdetailid -> Seq Scan on direct dr (cost=0.00..0.00 rows=1 width=9) (actualtime=0.00..0.00 rows=0 loops=1) -> Sort (cost=0.01..0.02 rows=1 width=17) (actual time=0.03..0.03 rows=0 loops=1) Sort Key: cr.batchdetailid -> Seq Scan on carrental cr (cost=0.00..0.00 rows=1 width=17) (actualtime=0.00..0.00 rows=0 loops=1) -> Sort (cost=105.97..109.13 rows=1267 width=38) (actual time=23.32..24.89 rows=1267loops=1) Sort Key: ck.batchdetailid -> Seq Scan on checks ck (cost=0.00..40.67 rows=1267 width=38) (actual time=6.51..19.59rows=1267 loops=1) -> Sort (cost=3137.87..3137.88 rows=4 width=102) (actual time=954.18..954.20 rows=19 loops=1) Sort Key: b.batchid -> Nested Loop (cost=0.00..3137.84 rows=4 width=102) (actual time=236.26..954.04 rows=17loops=1) -> Seq Scan on merchants m (cost=0.00..2667.35 rows=1 width=78) (actual time=2.48..227.71rows=1 loops=1) Filter: (merchid = '701252267'::character varying) -> Index Scan using batchheader_ix_merchantid_idx on batchheader b (cost=0.00..470.30rows=15 width=24) (actual time=233.75..726.22 rows=17 loops=1) Index Cond: ("outer".merchantid = b.merchantid) Filter: (batchdate > '2002-12-15 00:00:00'::timestamp without time zone) -> Seq Scan on cardtype c (cost=0.00..1.10 rows=10 width=14) (actual time=0.02..0.04 rows=10 loops=5) -> Seq Scan on tranheader t (cost=0.00..55.15 rows=1923 width=16) (actual time=0.01..5.21 rows=1923 loops=5) Filter: (clientid = 6) Total runtime: 308323.60 msec *********************** I hope we can come up with something soon.....it seems this index scan is a big part of the problem. I'm still really curiouswhy the disk reads are so few with the index scan. Let's hope I can get it near the 3 second time for MSSQL by Friday! Roman Fail
pgsql-performance by date: