Thread: Query optimization
We have a query as below:
=========================================================
select relationship_id as "Id", company_name as "Company",
       product_category_desc as "Product", uom_desc as "UOM",
       shipment_term_desc as "Shipment Term", payment_term_desc as "Payment Term",
       city_name as "Port", currency_name as "Currency",
       activity_desc as "Activity", credit_days as "Credit Days"
from _100001relationships rs, master_member mm, member_product_details mpd,
     master_product_category mpc, master_uom mu, master_shipment_term mst,
     master_payment_term mpt, member_financial_details mfd, master_currency mc,
     member_commercial_details mcd, master_activity ma, master_city mcy
where mpd.product_details_id = rs.product_details_id
  and mpd.product_category_id = mpc.product_category_id
  and mpd.uom_id = mu.uom_id
  and rs.shipment_term_id = mst.shipment_term_id
  and rs.payment_term_id = mpt.payment_term_id
  and rs.commercial_details_id = mcd.commercial_details_id
  and mcd.port = mcy.city_id
  and rs.financial_details_id = mfd.financial_details_id
  and mfd.currency_id = mc.currency_id
  and mpd.activity_id = ma.activity_id
  and mm.member_id = mpd.member_id
ORDER BY rs.relationship_id DESC;
=====================================================

The decision to keep the fields in different tables was taken in view of the overall needs of the system (there may be scope for improvement here too). This query normally selects about 10-20 rows. The problem is that the page load takes about 4-5 seconds on the local network, while the query run in a psql terminal takes about 2 seconds to execute (outputting 3 rows).

When hosted on the internet, with most of our users on dialup connections and the query returning 10+ rows, this will not be acceptable. Please point us in the right direction for handling this type of problem.

Best regards,

Siva Kumar
On Friday 04 Oct 2002 9:17 am, Siva Kumar wrote:
> We have a query as below:
[snip query with many joins]
> The decision to keep the fields in different tables was taken in view of
> the overall need of the system (there might be scope for improvement here
> too).

If that's the way the design makes sense, stick with it. It's better to get Postgresql to handle a clean design than to mangle the design.

> This query normally selects about 10-20 rows. The problem is, the page
> load takes about 4-5 seconds in the local network. The query run in psql
> terminal takes about 2 seconds to execute (outputting 3 rows).
>
> When hosted on the internet with most of our users using dialup
> connections, and the query returning 10+ rows, this will not be acceptable.

Start by running EXPLAIN SELECT ... and looking at how the planner is handling the query. One thing you might find useful is explicit JOINs, which tell Postgresql what order to connect the tables in. You might prefer EXPLAIN ANALYSE SELECT ..., which also measures actual times for each stage. See the online manuals for details.

If you are missing indexes, you can add them. If Postgresql is not using indexes you already have, then we can look at why. Finally, if the plan looks OK, we can look at tuning sort memory or similar.

First stage though: run an EXPLAIN, and if you need help understanding it, post the output back to the list.

- Richard Huxton
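For illustration, a sketch of the explicit-JOIN form Richard suggests, using a few of the tables from the original query (the join order shown is arbitrary, chosen only to show the syntax):

```sql
-- Sketch only: explicit JOIN syntax fixes the join order for the planner,
-- so it does not have to search all orderings of 12 relations.
-- Table and column names are taken from the original query.
SELECT rs.relationship_id         AS "Id",
       mm.company_name            AS "Company",
       mpc.product_category_desc  AS "Product"
FROM _100001relationships rs
JOIN member_product_details  mpd ON mpd.product_details_id  = rs.product_details_id
JOIN master_product_category mpc ON mpc.product_category_id = mpd.product_category_id
JOIN master_member           mm  ON mm.member_id            = mpd.member_id
ORDER BY rs.relationship_id DESC;
```

In this era of PostgreSQL, explicit JOINs constrain the join order, which cuts planning time on many-table queries at the cost of trusting the order you wrote.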
Richard Huxton <dev@archonet.com> writes:
> First stage though, run an EXPLAIN and if you need help understanding it
> post the output back to the list.

Yep, that's a good idea. Another thing to try is to disable GEQO (SET geqo = off;) and see what difference that makes to the performance of the optimizer and the quality of the chosen plan (12 relations is close to the cross-over point for the usability of the genetic optimizer, IME).

Also, your comment regarding slow access over the Internet is not a problem with PostgreSQL -- if it really *is* 2x or so slower when done remotely, talk to your network admin.

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
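A minimal psql session for Neil's suggestion (the SET applies only to the current session, so it is safe to experiment with; the SELECT is abbreviated here):

```sql
-- Disable the genetic query optimizer for this session, then re-run
-- the query under EXPLAIN ANALYZE and compare plan quality and
-- planning time against the GEQO plan.
SET geqo = off;

EXPLAIN ANALYZE
SELECT rs.relationship_id AS "Id" /* ... rest of the original column list ... */
FROM _100001relationships rs /* ... rest of the original FROM and WHERE ... */
ORDER BY rs.relationship_id DESC;
```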
On Friday 04 Oct 2002 3:44 pm, you wrote:
> > The decision to keep the fields in different tables was taken in view of
> > the overall need of the system (there might be scope for improvement here
> > too).
>
> If that's the way the design makes sense, stick with it. It's better to get
> Postgresql to handle a clean design rather than mangle a design.

We took another look at the design and managed to add one redundant field to a table, which made sense on other grounds as well. This shaved some time off the earlier query. The query now looks like this:
======================================
select relationship_id as "Id", company_name as "Company",
       product_category_desc as "Product", uom_desc as "UOM",
       shipment_term_desc as "Shipment Term", payment_term_desc as "Payment Term",
       city_name as "Port", currency_name as "Currency",
       activity_desc as "Activity", credit_days as "Credit Days"
from _100001relationships rs, master_member mm, member_product_details mpd,
     master_product_category mpc, master_uom mu, master_shipment_term mst,
     master_payment_term mpt, member_financial_details mfd, master_currency mc,
     member_commercial_details mcd, master_activity ma, master_city mcy
where mpd.product_details_id = rs.product_details_id
  and mpd.product_category_id = mpc.product_category_id
  and mpd.uom_id = mu.uom_id
  and rs.shipment_term_id = mst.shipment_term_id
  and rs.payment_term_id = mpt.payment_term_id
  and rs.commercial_details_id = mcd.commercial_details_id
  and mcd.port = mcy.city_id
  and rs.financial_details_id = mfd.financial_details_id
  and mfd.currency_id = mc.currency_id
  and mpd.activity_id = ma.activity_id
  and mm.member_id = rs.partner_id
ORDER BY rs.relationship_id DESC;
==================================================

> First stage though, run an EXPLAIN and if you need help understanding it
> post the output back to the list.

Giving below the output of EXPLAIN ANALYSE. I could not make much sense out of it, please help!
NOTICE:  QUERY PLAN:

Sort  (cost=1382.45..1382.45 rows=1000 width=442) (actual time=3.47..3.47 rows=3 loops=1)
  ->  Merge Join  (cost=1263.12..1332.62 rows=1000 width=442) (actual time=3.21..3.27 rows=3 loops=1)
    ->  Index Scan using master_activity_pkey on master_activity ma  (cost=0.00..52.00 rows=1000 width=50) (actual time=0.06..0.09 rows=4 loops=1)
    ->  Sort  (cost=1263.12..1263.12 rows=1000 width=392) (actual time=3.04..3.05 rows=3 loops=1)
      ->  Merge Join  (cost=1143.79..1213.29 rows=1000 width=392) (actual time=2.87..2.95 rows=3 loops=1)
        ->  Index Scan using master_member_pkey on master_member mm  (cost=0.00..52.00 rows=1000 width=47) (actual time=0.03..0.23 rows=45 loops=1)
        ->  Sort  (cost=1143.79..1143.79 rows=1000 width=345) (actual time=2.48..2.48 rows=3 loops=1)
          ->  Merge Join  (cost=1024.46..1093.96 rows=1000 width=345) (actual time=2.30..2.41 rows=3 loops=1)
            ->  Index Scan using master_product_category_pkey on master_product_category mpc  (cost=0.00..52.00 rows=1000 width=37) (actual time=0.02..0.08 rows=7 loops=1)
            ->  Sort  (cost=1024.46..1024.46 rows=1000 width=308) (actual time=2.17..2.17 rows=3 loops=1)
              ->  Merge Join  (cost=905.13..974.63 rows=1000 width=308) (actual time=1.96..2.10 rows=3 loops=1)
                ->  Index Scan using master_uom_pkey on master_uom mu  (cost=0.00..52.00 rows=1000 width=41) (actual time=0.03..0.06 rows=6 loops=1)
                ->  Sort  (cost=905.13..905.13 rows=1000 width=267) (actual time=1.87..1.87 rows=3 loops=1)
                  ->  Merge Join  (cost=785.80..855.30 rows=1000 width=267) (actual time=1.70..1.77 rows=3 loops=1)
                    ->  Index Scan using member_product_details_pkey on member_product_details mpd  (cost=0.00..52.00 rows=1000 width=23) (actual time=0.03..0.11 rows=14 loops=1)
                    ->  Sort  (cost=785.80..785.80 rows=1000 width=244) (actual time=1.53..1.53 rows=3 loops=1)
                      ->  Merge Join  (cost=666.47..735.97 rows=1000 width=244) (actual time=1.35..1.47 rows=3 loops=1)
                        ->  Index Scan using master_shipment_term_pkey on master_shipment_term mst  (cost=0.00..52.00 rows=1000 width=50) (actual time=0.02..0.06 rows=5 loops=1)
                        ->  Sort  (cost=666.47..666.47 rows=1000 width=194) (actual time=1.27..1.27 rows=3 loops=1)
                          ->  Merge Join  (cost=547.14..616.64 rows=1000 width=194) (actual time=1.12..1.19 rows=3 loops=1)
                            ->  Index Scan using master_city_pkey on master_city mcy  (cost=0.00..52.00 rows=1000 width=47) (actual time=0.02..0.04 rows=4 loops=1)
                            ->  Sort  (cost=547.14..547.14 rows=1000 width=147) (actual time=1.05..1.06 rows=3 loops=1)
                              ->  Merge Join  (cost=427.82..497.32 rows=1000 width=147) (actual time=0.92..1.00 rows=3 loops=1)
                                ->  Index Scan using master_currency_pkey on master_currency mc  (cost=0.00..52.00 rows=1000 width=40) (actual time=0.03..0.06 rows=4 loops=1)
                                ->  Sort  (cost=427.82..427.82 rows=1000 width=107) (actual time=0.81..0.81 rows=3 loops=1)
                                  ->  Merge Join  (cost=308.49..377.99 rows=1000 width=107) (actual time=0.69..0.74 rows=3 loops=1)
                                    ->  Index Scan using member_financial_details_pkey on member_financial_details mfd  (cost=0.00..52.00 rows=1000 width=11) (actual time=0.03..0.08 rows=11 loops=1)
                                    ->  Sort  (cost=308.49..308.49 rows=1000 width=96) (actual time=0.56..0.57 rows=3 loops=1)
                                      ->  Merge Join  (cost=189.16..258.66 rows=1000 width=96) (actual time=0.46..0.51 rows=3 loops=1)
                                        ->  Index Scan using master_payment_term_pkey on master_payment_term mpt  (cost=0.00..52.00 rows=1000 width=50) (actual time=0.02..0.04 rows=4 loops=1)
                                        ->  Sort  (cost=189.16..189.16 rows=1000 width=46) (actual time=0.35..0.35 rows=3 loops=1)
                                          ->  Merge Join  (cost=69.83..139.33 rows=1000 width=46) (actual time=0.21..0.25 rows=3 loops=1)
                                            ->  Index Scan using member_commercial_details_pkey on member_commercial_details mcd  (cost=0.00..52.00 rows=1000 width=8) (actual time=0.02..0.06 rows=6 loops=1)
                                            ->  Sort  (cost=69.83..69.83 rows=1000 width=38) (actual time=0.11..0.11 rows=3 loops=1)
                                              ->  Seq Scan on _100001relationships rs  (cost=0.00..20.00 rows=1000 width=38) (actual time=0.03..0.05 rows=3 loops=1)
Total runtime: 5.45 msec
=====================================================

Best regards,

Siva Kumar
Quoting Siva Kumar <tech@leatherlink.net>:
> Giving below the output of EXPLAIN ANALYSE. I could not make much sense
> out of it, please help!
[snip EXPLAIN ANALYSE output]

EXPLAIN shows the way PostgreSQL will try to get data from the different tables and indexes and combine it to produce the query result. To choose the best way, PostgreSQL depends on statistical information about the contents of the tables. Those statistics are not collected automatically; you need to tell PostgreSQL to collect them.

Your EXPLAIN plan always shows an estimate of 1000 rows for each operation. This is usually an indication that you haven't gathered statistics yet. You need to run ANALYZE first and then try again. Also, take a look at the manual entry for VACUUM, another maintenance operation that can be critical for performance.

If your database is not in production, I would recommend you do a VACUUM FULL ANALYZE and then try the EXPLAIN again. If your database is in production, do VACUUM ANALYZE and read up on the FULL part so you know when to use it.

After that, see if performance got better and post the new EXPLAIN output.

Jochem
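The maintenance commands Jochem describes, as they would be run from psql (VACUUM FULL rewrites tables under an exclusive lock, hence the production caveat):

```sql
-- Non-production database: reclaim space, compact the tables, and
-- gather planner statistics in one pass (takes exclusive locks).
VACUUM FULL ANALYZE;

-- Production database: non-exclusive space reclaim plus statistics.
VACUUM ANALYZE;

-- Statistics only, for a single table:
ANALYZE member_product_details;
```

Once real statistics exist, the row estimates in EXPLAIN should drop from the default 1000 to something near the actual counts, and the planner can choose a cheaper plan.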
Siva Kumar wrote:
> [snip requoted query and EXPLAIN ANALYSE output]

First, read this documentation:

http://www.postgresql.org/idocs/index.php?performance-tips.html

Thanks

Antonis
On Friday 04 Oct 2002 1:26 pm, Siva Kumar wrote:
> Giving below the output of EXPLAIN ANALYSE. I could not make much sense
> out of it, please help!

Scary, aren't they? ;-)

The important thing is that you've got plenty of "Index Scan"s rather than Seq Scans. The other point is the final time:

> Total runtime: 5.45 msec

Now, since that's not what you're seeing, I think Neil (see other reply) is right and you need to look at issuing "SET geqo = off;" before the query.

Postgresql has a genetic algorithm that kicks in on what it thinks is a very complex query. It can take a long time to analyse the options available, but pays dividends on a big query. In your case you're only getting a few rows, so it takes longer to plan the query than to fetch the results.

The other thing that might work is rewriting the query with explicit JOINs - I think that should make it clear to Postgresql what order to do things in.

- Richard Huxton
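One way to test the planning-time theory without turning GEQO off entirely, sketched on the assumption of a 7.2-era server where the genetic optimizer kicks in once the number of FROM items reaches the geqo_threshold setting (default 11 in that era, so this 12-table query would trip it):

```sql
-- Check the threshold at which the genetic optimizer takes over.
SHOW geqo_threshold;

-- Raise it so this 12-table query uses the exhaustive planner
-- instead of GEQO, then re-time the query in the application.
SET geqo_threshold = 15;
```

If the page load drops to roughly the EXPLAIN ANALYSE runtime, the cost was planning rather than execution.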
Thanks to Richard, Neil, Jochem and Antonis.

I went through the documentation on EXPLAIN/ANALYSE. Finally, I decided to split the query into two and reduce the number of tables to get a faster query. In short, I have not solved the problem as presented (too lazy of me :-( ), but opted to walk an easier path.

Thanks once again.

Best regards,

Siva Kumar

On Friday 04 Oct 2002 6:59 pm, Richard Huxton wrote:
> Now, since that's not what you're seeing, I think Neil (see other reply)
> is right and you need to look at issuing "SET geqo = off;" before the
> query.
[snip]