Thread: Query - CPU issue
Hi,
I have this queryexplain analyze
select distinct geoip_city(src_ip) , src_ip
from alert where timestamp>=1378512000 and timestamp < 1378598400
The explain takes forever, and CPU goes upto 100%. So I end up killing the query/explain.
This one, without the function call, comes back in under a second -
explain analyze
select distinct
src_ip
from alert where timestamp>=1378512000 and timestamp < 1378598400
"HashAggregate (cost=493.94..494.40 rows=46 width=8) (actual time=38.669..38.684 rows=11 loops=1)"
" -> Index Scan using idx_alert_ts on alert (cost=0.29..468.53 rows=10162 width=8) (actual time=0.033..20.436 rows=10515 loops=1)"
" Index Cond: (("timestamp" >= 1378512000) AND ("timestamp" < 1378598400))"
"Total runtime: 38.740 ms"
The function doesn't do much, code given below -
CREATE OR REPLACE FUNCTION geoip_city(IN p_ip bigint, OUT loc_desc character varying)
RETURNS character varying AS
$BODY$
SELECT l.id || l.country ||l.region || l.city FROM blocks b JOIN locations l ON (b.location_id = l.id)
WHERE $1 >= start_ip and $1 <= end_ip limit 1 ;
$BODY$
LANGUAGE sql IMMUTABLE
COST 100;
CREATE OR REPLACE FUNCTION geoip_city(IN p_ip bigint, OUT loc_desc character varying)
RETURNS character varying AS
$BODY$
SELECT l.id || l.country ||l.region || l.city FROM blocks b JOIN locations l ON (b.location_id = l.id)
WHERE $1 >= start_ip and $1 <= end_ip limit 1 ;
$BODY$
LANGUAGE sql IMMUTABLE
COST 100;
There are indexes on the start_ip and end_ip and an explain tells me the indexes are being used (if I execute the SELECT in the function using a valid value for the ip value.
Regards,
Jayadevan
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Jayadevan M Sent: Wednesday, September 18, 2013 9:08 AM To: pgsql-general@postgresql.org Subject: [GENERAL] Query - CPU issue Hi, I have this query explain analyze select distinct geoip_city(src_ip) , src_ip from alert where timestamp>=1378512000 and timestamp < 1378598400 The explain takes forever, and CPU goes upto 100%. So I end up killing the query/explain. This one, without the function call, comes back in under a second - explain analyze select distinct src_ip from alert where timestamp>=1378512000 and timestamp < 1378598400 "HashAggregate (cost=493.94..494.40 rows=46 width=8) (actual time=38.669..38.684 rows=11 loops=1)" " -> Index Scan using idx_alert_ts on alert (cost=0.29..468.53 rows=10162 width=8) (actual time=0.033..20.436 rows=10515loops=1)" " Index Cond: (("timestamp" >= 1378512000) AND ("timestamp" < 1378598400))" "Total runtime: 38.740 ms" The function doesn't do much, code given below - CREATE OR REPLACE FUNCTION geoip_city(IN p_ip bigint, OUT loc_desc character varying) RETURNS character varying AS $BODY$ SELECT l.id || l.country ||l.region || l.city FROM blocks b JOIN locations l ON (b.location_id = l.id) WHERE $1 >= start_ip and $1 <= end_ip limit 1 ; $BODY$ LANGUAGE sql IMMUTABLE COST 100; There are indexes on the start_ip and end_ip and an explain tells me the indexes are being used (if I execute the SELECTin the function using a valid value for the ip value. Regards, Jayadevan ----------------------------------------------------------------------------------------------------------------------- Did you try to do just EXPLAIN, not EXPLAIN ANALYZE, to see what's coming without actually executing the query? Regards, Igor Neyman
Jayadevan M <maymala.jayadevan@gmail.com> wrote: > explain analyze > select distinct geoip_city(src_ip) , src_ip > from alert where timestamp>=1378512000 and timestamp < 1378598400 > > The explain takes forever What is the longest you have let it run, in hours or minutes? > This one, without the function call, comes back in under a second - > > explain analyze > select distinct > src_ip > from alert where timestamp>=1378512000 and timestamp < 1378598400 > The function doesn't do much, code given below - But it is called 10,515 times -- even a few milliseconds per call can add up. > CREATE OR REPLACE FUNCTION geoip_city(IN p_ip bigint, OUT loc_desc character varying) > RETURNS character varying AS > $BODY$ > SELECT l.id || l.country ||l.region || l.city FROM blocks b JOIN locations l ON (b.location_id = l.id) > WHERE $1 >= start_ip and $1 <= end_ip limit 1 ; > $BODY$ > LANGUAGE sql IMMUTABLE > COST 100; Try running the SELECT from the function with different values in place of $1: common versus uncommon (or even missing) and low values versus high values. Show the EXPLAIN ANALYZE output of the longest-running. By the way, IMMUTABLE has to be wrong here, since the results depend on the state of the database. STABLE is likely the right designation. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Sep 18, 2013 at 6:08 AM, Jayadevan M <maymala.jayadevan@gmail.com> wrote:
The function doesn't do much, code given below -
CREATE OR REPLACE FUNCTION geoip_city(IN p_ip bigint, OUT loc_desc character varying)
RETURNS character varying AS
$BODY$
SELECT l.id || l.country ||l.region || l.city FROM blocks b JOIN locations l ON (b.location_id = l.id)
WHERE $1 >= start_ip and $1 <= end_ip limit 1 ;
$BODY$
LANGUAGE sql IMMUTABLE
COST 100;There are indexes on the start_ip and end_ip and an explain tells me the indexes are being used (if I execute the SELECT in the function using a valid value for the ip value.
That construct is not efficiently indexable using two independent btree indexes. What did the plan look like that used the index? Two large bitmaps with a bitmap_and?
If you build ranges consisting of [start_ip, end_ip] and then build an index that specializes in range queries (GiST, I think) it should be able to do it efficiently, but you would have to rewrite the query to use a contains operator rather than two independent inequality tests.
Cheers,
Jeff
Kevin Grittner-5 wrote > What is the longest you have let it run, in hours or minutes? I let it run for about 10 minutes and killed it. Kevin Grittner-5 wrote > By the way, IMMUTABLE has to be wrong here, since the results > depend on the state of the database. STABLE is likely the right > designation. The data will not change unless I do a reload of the data set manually. In that case, is IMMUTABLE wrong? Here is the EXPLAIN (no analyze) explain select distinct geoip_city(src_ip) , src_ip from alert where timestamp>=1378512000 and timestamp < 1378598400; QUERY PLAN --------------------------------------------------------------------------------------------- Unique (cost=3815.75..3894.61 rows=39 width=8) -> Sort (cost=3815.75..3842.04 rows=10515 width=8) Sort Key: (geoip_city(src_ip)), src_ip -> Index Scan using idx_alert_ts on alert (cost=0.29..3113.34 rows=10515 width=8) Index Cond: (("timestamp" >= 1378512000) AND ("timestamp" < 1378598400)) (5 rows) -- View this message in context: http://postgresql.1045698.n5.nabble.com/Query-CPU-issue-tp5771421p5771552.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Jayadevan <maymala.jayadevan@gmail.com> wrote: > Kevin Grittner-5 wrote >> What is the longest you have let it run, in hours or minutes? > > I let it run for about 10 minutes and killed it. Doing the arithmetic, that means if the estimated row counts from the explain of the outer query are right, the function must average at least 57 ms per call. > Kevin Grittner-5 wrote >> By the way, IMMUTABLE has to be wrong here, since the results >> depend on the state of the database. STABLE is likely the right >> designation. > > The data will not change unless I do a reload of the data set > manually. In that case, is IMMUTABLE wrong? Yes. I don't think that's any part of your current problem, but it is kinda asking for trouble to get that wrong. IMMUTABLE means that regardless of how the data in the database changes, calling the function with any given set of arguments will always return the same results. STABLE means that during the course of running a single query the same arguments always yield the same results. > Here is the EXPLAIN (no analyze) > [ explain of outer query showing 10515 calls to a function ] We already had this. I was asking for you to get EXPLAIN ANALYZE output for a run of the SELECT statement inside the geoip_city() function. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Kevin Grittner-5 wrote > We already had this. I was asking for you to get EXPLAIN ANALYZE > output for a run of the SELECT statement inside the geoip_city() > function. "Merge Join (cost=9268.34..26193.41 rows=6282 width=24) (actual time=892.188..892.190 rows=1 loops=1)" " Merge Cond: (l.id = b.location_id)" " -> Index Scan using locations_pkey on locations l (cost=0.42..15739.22 rows=438386 width=24) (actual time=0.022..865.025 rows=336605 loops=1)" " -> Sort (cost=9267.84..9283.54 rows=6282 width=8) (actual time=1.329..1.330 rows=1 loops=1)" " Sort Key: b.location_id" " Sort Method: quicksort Memory: 25kB" " -> Index Scan using ix_end_start_ip on blocks b (cost=0.43..8871.54 rows=6282 width=8) (actual time=0.573..1.268 rows=1 loops=1)" " Index Cond: ((3721196957::bigint <= end_ip) AND (3721196957::bigint >= start_ip))" "Total runtime: 892.439 ms" -- View this message in context: http://postgresql.1045698.n5.nabble.com/Query-CPU-issue-tp5771421p5771558.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Rewriting the query in the function like this solved the issue (this is how it was in the SQL at https://github.com/tvondra/geoip/blob/master/sql/geoip--0.1.0.sql) explain analyze SELECT l.id, l.country, l.region, l.city FROM blocks b JOIN locations l ON (b.location_id = l.id) WHERE 3721196957 >= start_ip ORDER BY start_ip DESC LIMIT 1 "Limit (cost=0.85..2.09 rows=1 width=32) (actual time=0.015..0.015 rows=1 loops=1)" " -> Nested Loop (cost=0.85..2216242.97 rows=1784157 width=32) (actual time=0.014..0.014 rows=1 loops=1)" " -> Index Scan Backward using ix_start_end_ip on blocks b (cost=0.43..934027.92 rows=1784157 width=16) (actual time=0.008..0.008 rows=1 loops=1)" " Index Cond: (3721196957::bigint >= start_ip)" " -> Index Scan using locations_pkey on locations l (cost=0.42..0.71 rows=1 width=24) (actual time=0.003..0.003 rows=1 loops=1)" " Index Cond: (id = b.location_id)" "Total runtime: 0.039 ms" -- View this message in context: http://postgresql.1045698.n5.nabble.com/Query-CPU-issue-tp5771421p5771561.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Jayadevan <maymala.jayadevan@gmail.com> wrote: > "Merge Join (cost=9268.34..26193.41 rows=6282 width=24) (actual time=892.188..892.190 rows=1 loops=1)" > " Merge Cond: (l.id = b.location_id)" > " -> Index Scan using locations_pkey on locations l (cost=0.42..15739.22 rows=438386 width=24) (actual time=0.022..865.025rows=336605 loops=1)" > " -> Sort (cost=9267.84..9283.54 rows=6282 width=8) (actual time=1.329..1.330 rows=1 loops=1)" > " Sort Key: b.location_id" > " Sort Method: quicksort Memory: 25kB" > " -> Index Scan using ix_end_start_ip on blocks b (cost=0.43..8871.54 rows=6282 width=8) (actual time=0.573..1.268rows=1 loops=1)" > " Index Cond: ((3721196957::bigint <= end_ip) AND (3721196957::bigint >= start_ip))" > "Total runtime: 892.439 ms" This is the query which needs to be optimized. When I multiply the runtime of this function's query by the estimated number of function calls, I get 2.6 hours. Copying the query from the first email on the thread: SELECT l.id || l.country || l.region || l.city FROM blocks b JOIN locations l ON (b.location_id = l.id) WHERE $1 >= start_ip and $1 <= end_ip limit 1; Can you provide the table definitions for blocks and locations, including indexes? Also, could you tell us what the OS is, how much RAM is on the system, what the storage system looks like, and provide the output from running this?: SELECT version(); SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override'); You might also try running EXPLAIN ANALYZE for this query after running these statements on the connection, and see if you get a different plan: VACUUM ANALYZE blocks; VACUUM ANALYZE locations; SET cpu_tuple_cost = 0.03; SET random_page_cost = 1; SET effective_cache_size = <75% of machine RAM> SET work_mem = <25% of machine RAM / max_connections> -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Thank you for the pointers. I will try those suggestions. As I mentioned later, resolving the query solved the problem for now.
Regards,On Thu, Sep 19, 2013 at 11:40 AM, Kevin Grittner-5 [via PostgreSQL] <[hidden email]> wrote:
Jayadevan <[hidden email]> wrote:
> "Merge Join (cost=9268.34..26193.41 rows=6282 width=24) (actual time=892.188..892.190 rows=1 loops=1)"
> " Merge Cond: (l.id = b.location_id)"
> " -> Index Scan using locations_pkey on locations l (cost=0.42..15739.22 rows=438386 width=24) (actual time=0.022..865.025 rows=336605 loops=1)"
> " -> Sort (cost=9267.84..9283.54 rows=6282 width=8) (actual time=1.329..1.330 rows=1 loops=1)"
> " Sort Key: b.location_id"
> " Sort Method: quicksort Memory: 25kB"
> " -> Index Scan using ix_end_start_ip on blocks b (cost=0.43..8871.54 rows=6282 width=8) (actual time=0.573..1.268 rows=1 loops=1)"
> " Index Cond: ((3721196957::bigint <= end_ip) AND (3721196957::bigint >= start_ip))"
> "Total runtime: 892.439 ms"This is the query which needs to be optimized. When I multiply the
runtime of this function's query by the estimated number of
function calls, I get 2.6 hours.
Copying the query from the first email on the thread:SELECT l.id || l.country || l.region || l.city
FROM blocks b
JOIN locations l ON (b.location_id = l.id)
WHERE $1 >= start_ip
and $1 <= end_ip
limit 1;Can you provide the table definitions for blocks and locations,--
including indexes? Also, could you tell us what the OS is, how
much RAM is on the system, what the storage system looks like, and
provide the output from running this?:
SELECT version();
SELECT name, current_setting(name), source
FROM pg_settings
WHERE source NOT IN ('default', 'override');
You might also try running EXPLAIN ANALYZE for this query after
running these statements on the connection, and see if you get a
different plan:
VACUUM ANALYZE blocks;
VACUUM ANALYZE locations;
SET cpu_tuple_cost = 0.03;
SET random_page_cost = 1;
SET effective_cache_size = <75% of machine RAM>
SET work_mem = <25% of machine RAM / max_connections>
Sent via pgsql-general mailing list ([hidden email])http://postgresql.1045698.n5.nabble.com/Query-CPU-issue-tp5771421p5771567.htmlIf you reply to this email, your message will be added to the discussion below:
View this message in context: Re: Query - CPU issue
Sent from the PostgreSQL - general mailing list archive at Nabble.com.