Re: Slow query with a lot of data - Mailing list pgsql-performance

From Matthew Wakeling
Subject Re: Slow query with a lot of data
Date
Msg-id alpine.DEB.1.10.0808181550110.4454@aragorn.flymine.org
Whole thread Raw
In response to Re: Slow query with a lot of data  (Moritz Onken <onken@houseofdesign.de>)
Responses Re: Slow query with a lot of data
List pgsql-performance
On Mon, 18 Aug 2008, Moritz Onken wrote:
> "HashAggregate  (cost=817397.78..817428.92 rows=2491 width=8) (actual  time=42874.339..42878.419 rows=3361 loops=1)"
> "  ->  Merge Join  (cost=748.47..674365.50 rows=19070970 width=8) (actual > time=15702.449..42829.388 rows=36308
loops=1)"
> "        Merge Cond: (b.domain = a.domain)"
> "        ->  Index Scan using domain_categories_domain on domain_categories b > (cost=0.00..391453.79 rows=12105014
width=8)(actual time=39.018..30166.349 > rows=12104989 loops=1)" 
> "        ->  Sort  (cost=395.52..405.49 rows=3985 width=8) (actual > time=0.188..32.345 rows=36309 loops=1)"
> "              Sort Key: a.domain"
> "              Sort Method:  quicksort  Memory: 27kB"
> "              ->  Index Scan using result_user_idx on result a > (cost=0.00..157.21 rows=3985 width=8) (actual
time=0.021..0.101rows=61 > loops=1)" 
> "                    Index Cond: ("user" = 1337)"
> "Total runtime: 42881.382 ms"
>
> This is still very slow...

Well, you're getting the database to read the entire contents of the
domain_categories table in order. That's 12 million rows - a fair amount
of work.

You may find that removing the "user = 1337" constraint doesn't make the
query much slower - that's where you get a big win by clustering on
domain. You might also want to cluster the results table on domain.

If you want the results for just one user, it would be very helpful to
have a user column on the domain_categories table, and an index on that
column. However, that will slow down the query for all users a little.

Matthew

--
For every complex problem, there is a solution that is simple, neat, and wrong.
                                                      -- H. L. Mencken

pgsql-performance by date:

Previous
From: Moritz Onken
Date:
Subject: Re: Slow query with a lot of data
Next
From: Tom Lane
Date:
Subject: Re: Cross Join Problem