Re: BUG #7571: Query high memory usage - Mailing list pgsql-bugs
From | Radovan Jablonovsky |
---|---|
Subject | Re: BUG #7571: Query high memory usage |
Date | |
Msg-id | CAJYcdTtYu4K+N87LK3JjkiWNQZHt3OEamMapRz7sLJBD+SJhCA@mail.gmail.com Whole thread Raw |
In response to | Re: BUG #7571: Query high memory usage (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: BUG #7571: Query high memory usage
|
List | pgsql-bugs |
Hi Pavel, Here are the test data with set enable_hashagg to off. It does not looks like improvement. Query was running for 30min without returning result set. db=> set enable_hashagg=off; SET db=> explain db-> SELECT db-> schema_name, db-> sum(table_size) db-> FROM db-> (SELECT db(> pg_catalog.pg_namespace.nspname as schema_name, db(> pg_relation_size(pg_catalog.pg_class.oid) as table_size, db(> sum(pg_relation_size(pg_catalog.pg_class.oid)) over () as database_size db(> FROM pg_catalog.pg_class db(> JOIN pg_catalog.pg_namespace db(> ON relnamespace = pg_catalog.pg_namespace.oid db(> ) t db-> GROUP BY schema_name, database_size; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ GroupAggregate (cost=725540.59..756658.18 rows=40000 width=104) -> Sort (cost=725540.59..733219.99 rows=3071759 width=104) Sort Key: pg_namespace.nspname, (sum(pg_relation_size((pg_class.oid)::regclass, 'main'::text)) OVER (?)) -> WindowAgg (cost=120.98..243838.73 rows=3071759 width=68) -> Hash Join (cost=120.98..190082.95 rows=3071759 width=68) Hash Cond: (pg_class.relnamespace = pg_namespace.oid) -> Seq Scan on pg_class (cost=0.00..143885.59 rows=3071759 width=8) -> Hash (cost=90.99..90.99 rows=2399 width=68) -> Seq Scan on pg_namespace (cost=0.00..90.99 rows=2399 width=68) (9 rows) Data from top after 30 min of query run with hashagg set off: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2235 postgres 25 0 27.5g 23g 4.6g R 95.1 75.2 31:39.81 postgres: aspuser aspdata 10.0.2.67(52716) SELECT Radovan On Wed, Sep 26, 2012 at 10:15 PM, Pavel Stehule <pavel.stehule@gmail.com>wrote: > Hello > > you should to run this query on real data - and if it works now, then > send EXPLAIN ANALYZE result, please > > Pavel > > 2012/9/27 Melese Tesfaye <mtesfaye@gmail.com>: > > Thanks Pavel, > > Setting enable_hashagg to off didn't resolve the issue. > > Please find the explain as well as query results after "set > > enable_hashagg=off;" > > > > mtesfaye@[local](test_db)=# EXPLAIN SELECT DISTINCT(A.*) > > test_db-# FROM table1_t A LEFT JOIN table2_v B > > test_db-# ON A.pnr_id=B.pnr_id > > test_db-# WHERE A.pnr_id IN(1801,2056) AND > > B.departure_date_time>=DATE('2012-09-26') > > test_db-# ORDER BY pnr_id ASC,nam_id ASC; > > > +-----------------------------------------------------------------------------------------------------------+ > > | QUERY PLAN > > | > > > +-----------------------------------------------------------------------------------------------------------+ > > | Unique (cost=1354.62..1354.66 rows=4 width=13) > > | > > | -> Sort (cost=1354.62..1354.63 rows=4 width=13) > > | > > | Sort Key: a.pnr_id, a.nam_id, a.pty_num > > | > > | -> Merge Join (cost=1084.06..1354.58 rows=4 width=13) > > | > > | Merge Cond: (table2_t.pnr_id = a.pnr_id) > > | > > | -> Unique (cost=1084.06..1198.67 rows=11461 width=16) > > | > > | -> Sort (cost=1084.06..1112.72 rows=11461 > width=16) > > | > > | Sort Key: table2_t.pnr_id, table2_t.itn_id, > > table2_t.departure_date_time | > > | -> Seq Scan on table2_t (cost=0.00..311.34 > > rows=11461 width=16) | > > | Filter: (departure_date_time >= > > '2012-09-26'::date) | > > | -> Index Scan using table1_t_pnr_id_idx1 on table1_t a > > (cost=0.00..12.60 rows=4 width=13) | > > | Index Cond: (pnr_id = ANY > ('{1801,2056}'::integer[])) > > | > > > +-----------------------------------------------------------------------------------------------------------+ > > (12 rows) > > > > Time: 5.889 ms > > > > mtesfaye@[local](test_db)=# show enable_hashagg; > > +----------------+ > > | enable_hashagg | > > +----------------+ > > | on | > > +----------------+ > > (1 row) > > > > Time: 0.136 ms > > > > mtesfaye@[local](test_db)=# set enable_hashagg=off; > > SET > > Time: 0.203 ms > > mtesfaye@[local](test_db)=# show enable_hashagg; > > +----------------+ > > | enable_hashagg | > > +----------------+ > > | off | > > +----------------+ > > (1 row) > > > > Time: 0.131 ms > > > > > > mtesfaye@[local](test_db)=# SELECT DISTINCT(A.*) > > test_db-# FROM table1_t A LEFT JOIN table2_v B > > test_db-# ON A.pnr_id=B.pnr_id > > test_db-# WHERE A.pnr_id IN(1801,2056) AND > > B.departure_date_time>=DATE('2012-09-26') > > test_db-# ORDER BY pnr_id ASC,nam_id ASC; > > +--------+--------+---------+ > > | pnr_id | nam_id | pty_num | > > +--------+--------+---------+ > > | 1801 | 3359 | 1 | > > | 1801 | 3360 | 1 | > > | 1801 | 3361 | 1 | > > | 1801 | 3362 | 1 | > > +--------+--------+---------+ > > (4 rows) > > > > Time: 8.452 ms > > > > > > On Thu, Sep 27, 2012 at 3:54 AM, Pavel Stehule <pavel.stehule@gmail.com> > > wrote: > >> > >> Hello > >> > >> this situation is possible, when optimizer use HashAgg where should not > >> use it. > >> > >> Please, try to disable HashAgg - set enable_hashagg to off; > >> > >> please, send EXPLAIN result > >> > >> Regards > >> > >> Pavel Stehule > >> > >> 2012/9/26 <radovan.jablonovsky@replicon.com>: > >> > The following bug has been logged on the website: > >> > > >> > Bug reference: 7571 > >> > Logged by: Radovan Jablonovsky > >> > Email address: radovan.jablonovsky@replicon.com > >> > PostgreSQL version: 9.1.5 > >> > Operating system: CentOs 5.8 Linux 2.6.18-308.el5 x86_64 > >> > Description: > >> > > >> > During checking our company database size we used query, which was not > >> > the > >> > best to find out the tables/db size but should do the job. The query > was > >> > tested on server with 32GB of RAM, 2 CPU with 4 cores and it was > running > >> > alone without other activity. It consumed almost all RAM forced server > >> > to > >> > use swap and after 1hour it was still running. The simplified version > of > >> > query used 20% of memory and finished after 1hour 8min. > >> > > >> > The size of pg_class is 3mil rows/objects and pg_namespace has 3000 > >> > rows/schemata. > >> > > >> > query: > >> > SELECT > >> > schema_name, > >> > sum(table_size) > >> > FROM > >> > (SELECT > >> > pg_catalog.pg_namespace.nspname as schema_name, > >> > pg_relation_size(pg_catalog.pg_class.oid) as table_size, > >> > sum(pg_relation_size(pg_catalog.pg_class.oid)) over () as > >> > database_size > >> > FROM pg_catalog.pg_class > >> > JOIN pg_catalog.pg_namespace > >> > ON relnamespace = pg_catalog.pg_namespace.oid > >> > ) t > >> > GROUP BY schema_name, database_size; > >> > > >> > > >> > top - 10:50:44 up 20 days, 19:00, 1 user, load average: 1.15, 1.10, > >> > 0.84 > >> > Tasks: 239 total, 3 running, 236 sleeping, 0 stopped, 0 zombie > >> > Cpu(s): 15.1%us, 1.5%sy, 0.0%ni, 83.0%id, 0.5%wa, 0.0%hi, 0.0%si, > >> > 0.0%st > >> > Mem: 32946260k total, 32599908k used, 346352k free, 141924k > buffers > >> > Swap: 55043952k total, 85216k used, 54958736k free, 14036516k > cached > >> > > >> > Info from top: > >> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > >> > 2016 postgres 25 0 22.8g 17g 3.2g R 96.1 56.0 19:17.01 postgres: > >> > postgres db 10.0.1.10(49928) SELECT > >> > > >> > Simplified version of query uses pg_tables. It has 0.5mil rows/tables. > >> > Simplified version of query: > >> > SELECT > >> > schemaname, > >> > sum(pg_relation_size(schemaname || '.' || tablename))::bigint > >> > FROM pg_tables > >> > GROUP BY schemaname; > >> > > >> > > >> > > >> > > >> > > >> > > >> > -- > >> > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > >> > To make changes to your subscription: > >> > http://www.postgresql.org/mailpref/pgsql-bugs > >> > >> > >> -- > >> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > >> To make changes to your subscription: > >> http://www.postgresql.org/mailpref/pgsql-bugs > > > > >
pgsql-bugs by date: