Thread: Approach to extract top records from table based upon aggregate
Hi, I have a table that contains call records. I'm looking to get only records for users who made the most calls over a particular time duration in an efficient way. calls() time, duration, caller_number, dialed_number -- query to get top 10 callers select caller_number, count(1) from calls group by caller_number order by calls desc limit 10 --my current query to get those callers select * from call where caller_number in (above query) It works but I was hoping for something a little more efficient if anyone has an idea. Tahnks -- View this message in context: http://postgresql.nabble.com/Approach-to-extract-top-records-from-table-based-upon-aggregate-tp5872427.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Hi, I have a table that contains call records. I'm looking to get only
records for users who made the most calls over a particular time duration in
an efficient way.
calls()
time, duration, caller_number, dialed_number
-- query to get top 10 callers
select caller_number, count(1) from calls group by caller_number order by
calls desc limit 10
--my current query to get those callers
select * from call where caller_number in (above query)
It works but I was hoping for something a little more efficient if anyone
has an idea.
I don't think there is anything that is "a little more efficient" (implying, only a bit harder to implement).
You can probably get significantly faster by combining various forms of pre-computation and caching. It is likewise significantly more complex to implement.
David J.
2015-11-02 19:14 GMT-03:00 droberts <david.roberts@riverbed.com>:
Hi, I have a table that contains call records. I'm looking to get only
records for users who made the most calls over a particular time duration in
an efficient way.
calls()
time, duration, caller_number, dialed_number
-- query to get top 10 callers
select caller_number, count(1) from calls group by caller_number order by
calls desc limit 10
--my current query to get those callers
select * from call where caller_number in (above query)
It works but I was hoping for something a little more efficient if anyone
has an idea.
I think that almost every time based tables, should be partitioned. Also, depending on your workload you can create lazy views over the last entries in calls table during a particular time frame.
Probably in this particular case, you will want to dig into more underneath design in order to get the best performance.
Doing a lazy view with that query, you can use the top n of it and get less callers if you need to (or more if you want to expand the feature).
Hope it helps,
-- On Mon, Nov 2, 2015 at 4:14 PM, droberts <david.roberts@riverbed.com> wrote: > Hi, I have a table that contains call records. I'm looking to get only > records for users who made the most calls over a particular time duration in > an efficient way. > > calls() > > time, duration, caller_number, dialed_number > > > > -- query to get top 10 callers > select caller_number, count(1) from calls group by caller_number order by > calls desc limit 10 > > --my current query to get those callers > > select * from call where caller_number in (above query) > > > It works but I was hoping for something a little more efficient if anyone > has an idea. How fast is it running, and how fast do you expect it to run? To make that faster than that, you're going to have to rethink things a little bit. For example, you could narrow the search down to a time range, or maybe you could keep a running internalization of the count. This query looks suspicious: select caller_number, count(1) from calls group by caller_number order by calls desc limit 10 you're ordering by the entire table, which is almost certainly a mistake. It probably needs to look like: select * from ( select caller_number, count(1) as count_calls from calls group by caller_number ) q order by count_calls desc limit 10; merlin