speed up query with max() and odd estimates - Mailing list pgsql-performance
| From | Matthew Nuzum |
|---|---|
| Subject | speed up query with max() and odd estimates |
| Date | |
| Msg-id | f3c0b40805042613167c47c7e9@mail.gmail.com |
| Responses | Re: speed up query with max() and odd estimates; Re: speed up query with max() and odd estimates |
| List | pgsql-performance |
I have this query that takes a little over 8 minutes to run:

```sql
select client, max(atime) as atime
from usage_access
where atime >= (select atime - '1 hour'::interval
                from usage_access
                order by atime desc limit 1)
group by client;
```

I think it can go a lot faster. Any suggestions on improving this? The DB is 7.3.4, I think. (There is no index on client because the table is very big and this data is used infrequently.)

```
explain analyze
select client, max(atime) as atime
from usage_access
where atime >= (select atime - '1 hour'::interval
                from usage_access
                order by atime desc limit 1)
group by client;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=3525096.28..3620450.16 rows=1271385 width=20) (actual time=482676.95..482693.69 rows=126 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..0.59 rows=1 width=8) (actual time=0.40..0.41 rows=1 loops=1)
           ->  Index Scan Backward using usage_access_atime on usage_access  (cost=0.00..22657796.18 rows=38141552 width=8) (actual time=0.39..0.40 rows=2 loops=1)
   ->  Group  (cost=3525096.28..3588665.53 rows=12713851 width=20) (actual time=482676.81..482689.29 rows=3343 loops=1)
         ->  Sort  (cost=3525096.28..3556880.90 rows=12713851 width=20) (actual time=482676.79..482679.16 rows=3343 loops=1)
               Sort Key: client
               ->  Seq Scan on usage_access  (cost=0.00..1183396.40 rows=12713851 width=20) (actual time=482641.57..482659.18 rows=3343 loops=1)
                     Filter: (atime >= $0)
 Total runtime: 482694.65 msec
```

I'm starting to understand this, which is quite frightening to me. I thought that maybe if I shrank the number of rows down I could improve things a bit, but my first attempt didn't work. I thought I'd replace the "from usage_access" with this query instead:

```sql
select * from usage_access
where atime >= (select atime - '1 hour'::interval
                from usage_access
                order by atime desc limit 1);
```

```
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Seq Scan on usage_access  (cost=0.00..1183396.40 rows=12713851 width=116) (actual time=481796.22..481839.43 rows=3343 loops=1)
   Filter: (atime >= $0)
   InitPlan
     ->  Limit  (cost=0.00..0.59 rows=1 width=8) (actual time=0.41..0.42 rows=1 loops=1)
           ->  Index Scan Backward using usage_access_atime on usage_access  (cost=0.00..22657796.18 rows=38141552 width=8) (actual time=0.40..0.41 rows=2 loops=1)
 Total runtime: 481842.47 msec
```

It doesn't look like this will help at all. This table is primarily append-only; however, I just recently deleted a few million rows from it, if that helps anyone.

--
Matthew Nuzum
www.bearfruit.org
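[Editor's note, not part of the original message.] The plans above point at the deleted rows: the planner expects 12,713,851 rows to pass the `atime >= $0` filter while only 3,343 actually do, and the seq scan takes roughly 482 seconds to produce its first row, which is consistent with wading through millions of dead tuples left by the recent bulk delete. A minimal sketch of the usual remedy, assuming a maintenance window is acceptable:

```sql
-- Reclaim dead tuples from the bulk delete and refresh planner statistics.
-- With a realistic row estimate for the atime filter, the planner can
-- prefer the existing usage_access_atime index over a full sequential scan.
VACUUM ANALYZE usage_access;
```

VACUUM FULL would additionally return the freed space to the OS, at the cost of an exclusive lock on the table for the duration.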
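If an index involving client were ever reconsidered, here is a hedged sketch of an index-assisted rewrite. The composite index name is hypothetical (the post says no index on client exists today), and `DISTINCT ON` is a PostgreSQL extension that is available in 7.3:

```sql
-- Hypothetical composite index; not present in the original setup.
CREATE INDEX usage_access_client_atime ON usage_access (client, atime);

-- DISTINCT ON keeps the first row per client under the ORDER BY,
-- i.e. the row with the greatest atime for each client.
select distinct on (client) client, atime
from usage_access
where atime >= (select atime - '1 hour'::interval
                from usage_access
                order by atime desc limit 1)
order by client, atime desc;
```

This avoids the sort-then-aggregate over the whole filtered set; whether the extra index is worth its size on a mostly-append table of this scale is the trade-off the original post already flags.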