Thread: query optimization: aggregate and distinct
I have below a simplified version of what I'm trying to do. Basically, I
am trying to get both an aggregate (an average) and the "most recent"
value.

 g | v  |             ts
---+----+----------------------------
 1 | 10 | 2003-08-20 16:00:27.010769
 1 | 20 | 2003-08-20 16:00:30.380476
 2 | 40 | 2003-08-20 16:00:37.399717
 2 | 80 | 2003-08-20 16:00:40.265717

I would like, as output, something like this:

 g | v  |        avg         |             ts
---+----+--------------------+----------------------------
 1 | 20 | 15.000000000000000 | 2003-08-20 16:00:30.380476
 2 | 80 | 60.000000000000000 | 2003-08-20 16:00:40.265717

which I got with a query like:

SELECT
    t2.g, t2.v, t1.avg, t2.ts
FROM
    (SELECT g, avg(v)
     FROM t
     GROUP BY g
    ) t1,
    (SELECT DISTINCT ON (g) *
     FROM t
     ORDER BY g, ts DESC
    ) t2
WHERE t1.g = t2.g;

That produces the results that I need, but it seems inefficient to join a
table with itself like that. My real query (not this simplified example)
takes 5+ seconds, and I suspect this join is why.

Is there a better way?

For my real query, it's using index scans where I'd expect, I frequently
VACUUM ANALYZE the big table, and I have all the stats turned on. Also, I
have more than enough shared buffers to hold everything in RAM.

Right now I'm using 7.2.1. Are there any improvements in 7.3 or 7.4 that
would help with this issue?

Regards,
    Jeff Davis
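For anyone following along, the sample data above can be reproduced with a
minimal table like this (the name t and the column types are inferred from
the query and the output; they are not stated explicitly in the thread):

    CREATE TABLE t (
        g  int,        -- group key
        v  int,        -- measured value
        ts timestamp   -- when the value was recorded
    );

    INSERT INTO t VALUES (1, 10, '2003-08-20 16:00:27.010769');
    INSERT INTO t VALUES (1, 20, '2003-08-20 16:00:30.380476');
    INSERT INTO t VALUES (2, 40, '2003-08-20 16:00:37.399717');
    INSERT INTO t VALUES (2, 80, '2003-08-20 16:00:40.265717');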
On Wed, Aug 20, 2003 at 16:26:26 -0700,
  Jeff Davis <jdavis-pgsql@empires.org> wrote:
>
> That produces the results that I need, but it seems inefficient to join a
> table with itself like that. My real query (not this simplified example)
> takes 5+ seconds, and I suspect this join is why.
>
> Is there a better way?
>
> For my real query, it's using index scans where I'd expect, I frequently
> VACUUM ANALYZE the big table, and I have all the stats turned on. Also, I
> have more than enough shared buffers to hold everything in RAM.
>
> Right now I'm using 7.2.1. Are there any improvements in 7.3 or 7.4 that
> would help with this issue?

I think there is a chance you might benefit from hash aggregates in 7.4.

EXPLAIN ANALYZE should give you a better idea of where the time is being
spent. If it is spent sorting the data for the GROUP BY, and there are
only a few groups relative to the total number of rows in the table, you
will probably get a big speedup in 7.4.
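A sketch of that check, applied to the simplified query from the first
message (EXPLAIN ANALYZE is available from 7.2 on):

    EXPLAIN ANALYZE
    SELECT t2.g, t2.v, t1.avg, t2.ts
    FROM (SELECT g, avg(v) FROM t GROUP BY g) t1,
         (SELECT DISTINCT ON (g) * FROM t ORDER BY g, ts DESC) t2
    WHERE t1.g = t2.g;

If most of the actual time lands in the Sort nodes feeding the aggregation
and the DISTINCT ON step, that would point to the sorting cost this reply
describes.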
On Wednesday 20 August 2003 04:26 pm, Jeff Davis wrote:
> I have below a simplified version of what I'm trying to do. Basically, I
> am trying to get both an aggregate (an average) and the "most recent"
> value.
>
>  g | v  |             ts
> ---+----+----------------------------
>  1 | 10 | 2003-08-20 16:00:27.010769
>  1 | 20 | 2003-08-20 16:00:30.380476
>  2 | 40 | 2003-08-20 16:00:37.399717
>  2 | 80 | 2003-08-20 16:00:40.265717
>
> I would like, as output, something like this:
>
>  g | v  |        avg         |             ts
> ---+----+--------------------+----------------------------
>  1 | 20 | 15.000000000000000 | 2003-08-20 16:00:30.380476
>  2 | 80 | 60.000000000000000 | 2003-08-20 16:00:40.265717
>
> which I got with a query like:
>
> SELECT
>     t2.g, t2.v, t1.avg, t2.ts
> FROM
>     (SELECT g, avg(v)
>      FROM t
>      GROUP BY g
>     ) t1,
>     (SELECT DISTINCT ON (g) *
>      FROM t
>      ORDER BY g, ts DESC
>     ) t2
> WHERE t1.g = t2.g;
>
> That produces the results that I need, but it seems inefficient to join a
> table with itself like that. My real query (not this simplified example)
> takes 5+ seconds, and I suspect this join is why.
>
> Is there a better way?
>
> For my real query, it's using index scans where I'd expect, I frequently
> VACUUM ANALYZE the big table, and I have all the stats turned on. Also, I
> have more than enough shared buffers to hold everything in RAM.
>
> Right now I'm using 7.2.1. Are there any improvements in 7.3 or 7.4 that
> would help with this issue?

I had an idea about using aggregates: what if I made an aggregate
function called "first" that just returned the value in the first tuple
it encountered? That way, I could do an ORDER BY, run a real aggregate on
some columns, and just call first() on the rest, which would return the
value from the first tuple according to the ordering.

Will that work? Are aggregates guaranteed to visit tuples in the same
order as the ORDER BY? It seems like a bad hack, but I think it works. If
not, I suppose I could pass a user-defined aggregate a user-defined
complex type.

Regards,
    Jeff Davis
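For concreteness, a minimal sketch of what that first() aggregate could
look like for int, in roughly 7.4-era syntax (first_agg is a made-up
name; the trick is that a STRICT transition function with no initial
condition makes PostgreSQL seed the state with the first non-null input,
after which the function just keeps returning that state):

    -- Keep whatever is already in the state; later inputs are ignored.
    CREATE FUNCTION first_agg(int, int) RETURNS int
        AS 'SELECT $1'
        LANGUAGE SQL STRICT;

    CREATE AGGREGATE first (
        BASETYPE = int,       -- input type
        SFUNC    = first_agg, -- transition function
        STYPE    = int        -- state type
    );

Whether first() actually sees the tuples in ORDER BY order is exactly the
question raised above, which the next reply addresses.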
Jeff Davis <jdavis-pgsql@empires.org> writes:
> I had an idea about using aggregates: what if I made an aggregate
> function called "first" that just returned the value in the first tuple
> it encountered?

You could make that work in 7.4, but not in any existing release.

The trouble is that you need something like

    SELECT first(foo) FROM (SELECT ... ORDER BY col1, col2) ss
    GROUP BY col1

and before 7.4 the optimizer doesn't realize that it can skip re-sorting
at the outer level. So unless the sort is stable (which it won't be, on
most platforms anyway), the needed ordering by col2 within each group is
destroyed.

            regards, tom lane
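Applied to the example table from the start of the thread, and assuming
the first() sketch above, the pattern would look something like this
(again, 7.4 only, since it relies on the subquery's ordering surviving
the outer aggregation step):

    SELECT g,
           first(v)  AS v,   -- value from the newest row in each group
           avg(v)    AS avg,
           first(ts) AS ts   -- timestamp of that newest row
    FROM (SELECT * FROM t ORDER BY g, ts DESC) ss
    GROUP BY g;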
On Thursday 21 August 2003 06:36 am, Tom Lane wrote:
> Jeff Davis <jdavis-pgsql@empires.org> writes:
> > I had an idea about using aggregates: what if I made an aggregate
> > function called "first" that just returned the value in the first
> > tuple it encountered?
>
> You could make that work in 7.4, but not in any existing release.
>
> The trouble is that you need something like
>
>     SELECT first(foo) FROM (SELECT ... ORDER BY col1, col2) ss
>     GROUP BY col1
>
> and before 7.4 the optimizer doesn't realize that it can skip re-sorting
> at the outer level. So unless the sort is stable (which it won't be, on
> most platforms anyway), the needed ordering by col2 within each group is
> destroyed.

Interesting. It turns out I don't really need the hack, because I was
able to optimize the query with some reworking and EXPLAIN ANALYZE. It
now takes about 1 second as opposed to 5.

However, it still has me wondering what the most efficient algorithm
would be. Here is my plan (see the sketch after this message):

- make a new complex type (say, most_recent_t) that holds an int and a
  timestamp
- make a function that turns an int and a timestamp into a most_recent_t
- make an aggregate function that takes most_recent_t values and finds
  the int with the highest timestamp

I tried a preliminary version, but all the functions were in plpgsql,
which I think may have slowed it down (plus, I was using a text[] instead
of a complex type, which meant more converting). The performance was
nothing great, but it seemed like it should have been more efficient.
After all, doesn't my plan skip the sorting phase needed for DISTINCT?

The main problem is that I need to make a lot of extra aggregate calls.
Does that seem like a more efficient plan overall, or would I be wasting
my time writing all those functions?

Regards,
    Jeff Davis
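A rough sketch of that three-step plan in SQL rather than plpgsql. Every
name here (most_recent_t, mr_later, most_recent) is invented for
illustration, and the CREATE TYPE ... AS and ROW(...)::type syntax shown
comes from later PostgreSQL releases than the 7.x versions discussed in
this thread:

    -- Step 1: a complex type pairing a value with its timestamp.
    CREATE TYPE most_recent_t AS (v int, ts timestamp);

    -- Step 3: transition function that keeps whichever candidate has
    -- the later timestamp, wrapped in an aggregate.
    CREATE FUNCTION mr_later(most_recent_t, most_recent_t)
        RETURNS most_recent_t
        AS 'SELECT CASE WHEN $2.ts > $1.ts THEN $2 ELSE $1 END'
        LANGUAGE SQL STRICT;

    CREATE AGGREGATE most_recent (
        BASETYPE = most_recent_t,
        SFUNC    = mr_later,
        STYPE    = most_recent_t
    );

    -- Step 2 is the ROW(v, ts)::most_recent_t construction below.
    -- One scan, no DISTINCT ON sort; note the two aggregate calls needed
    -- to get both fields back, which is the extra-call overhead the
    -- message above worries about.
    SELECT g,
           (most_recent(ROW(v, ts)::most_recent_t)).v  AS v,
           avg(v),
           (most_recent(ROW(v, ts)::most_recent_t)).ts AS ts
    FROM t
    GROUP BY g;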