Thread: two count columns?
Hello all, I have written a small firewall, and sshd, logging system which uses postgresql for storing the logs. The table that stores the firewall logs has, among other columns, from_ip, port and a timestamp. I want to construct a query which returns the columns: from_ip | port 22 entries | total entries | last_access I have managed to put this together: select from_ip, count(from_ip) as entries, max(ts)::timestamp(0) as last_access from log where to_port=22 and direction='in' group by from_ip So the only thing I'm missing is the total number of log entries matching the from_ip, but that's where I'm stuck. My instinct is to try to use subqueries: select from_ip, count(from_ip) as entries, count(select * from log where ...) as tot_entries, max(ts)::timestamp(0) as last_access from log where to_port=22 and direction='in' group by from_ip ..but count() doesn't take a subquery afaik, and how would I match the from_ip in the inner select with the outer one? So I assume that subqueries aren't the way yo go. Is there even a query to return those columns in that configuration? Thankful for any hints or tips. -- Kind Regards, Jan Danielsson Te audire non possum. Musa sapientum fixa est in aure.
Jan Danielsson <jan.danielsson@gmail.com> writes: > So the only thing I'm missing is the total number of log entries > matching the from_ip, but that's where I'm stuck. My instinct is to try > to use subqueries: > select from_ip, count(from_ip) as entries, count(select * from log where > ...) as tot_entries, max(ts)::timestamp(0) as last_access from log where > to_port=22 and direction='in' group by from_ip > ...but count() doesn't take a subquery afaik, and how would I match the > from_ip in the inner select with the outer one? I think you want something like select ..., (select count(*) from log loginner where from_ip = log.from_ip) ... from log ... You need to attach an alias to either the inner or the outer use of "log" so that you can refer to the outer one from the inner SELECT. Here I chose to alias the inner one, but it might be clearer to alias the outer: select ..., (select count(*) from log where from_ip = logouter.from_ip) ... from log logouter ... Note that anything like this is going to be pretty expensive if your log table is large. You might want to think about something involving another layer of GROUP BY instead. regards, tom lane
Jan Danielsson <jan.danielsson@gmail.com> writes: > select from_ip, count(from_ip) as entries, count(select * from log where > ...) as tot_entries, max(ts)::timestamp(0) as last_access from log where > to_port=22 and direction='in' group by from_ip select from_ip, count(from_ip) as entries, (select count(*) from log as l where l.from_ip = log.from_ip ) as tot_entries, max(ts) as last_access from logwhere to_port=22 and direction='ingroup by from_ip expect it to be pretty slow though. For every from_ip it has to look up every other entry with that from_ip. > Thankful for any hints or tips. There is a trick you could use to make it faster but it gets cumbersome and pretty tricky to use when you're doing more than one thing at a time: select from_ip sum(case when to_port=22 and direction='in' then 1 else 0 end) as entries, count(*) as tot_entries, max(case when to_port=22 and direction='in' then ts::timestamp(0) else null end) as last_access from loggroupby from_iphaving entries > 0 Note that in either case you might want to look at ANALYZE results for the query and try raising work_mem for this query using SET until you see the plan using a hash aggregate. If it can use a hash aggregate for your query (more likely for the first query than the second) without swapping it'll be faster than sorting. -- greg