Re: Need some help with a query (uniq -c) - Mailing list pgsql-general
From | Kenichiro Tanaka |
---|---|
Subject | Re: Need some help with a query (uniq -c) |
Date | |
Msg-id | 4BC451A8.3010900@ashisuto.co.jp Whole thread Raw |
In response to | Re: Need some help with a query (uniq -c) (Steve Atkins <steve@blighty.com>) |
Responses |
Re: Need some help with a query (uniq -c)
|
List | pgsql-general |
Hello. I try with "With Query". http://www.postgresql.org/docs/8.4/static/queries-with.html #We can use "With Queries" > v8.4 #That'll only work if the time values are contiguous, but there's probably a #similar trick for non-contiguous ,too. --create data drop table foo; create table foo( time int,message text); insert into foo values(1,'a'); insert into foo values(2,'b'); insert into foo values(3,'b'); insert into foo values(4,'c'); insert into foo values(5,'a'); insert into foo values(6,'c'); insert into foo values(7,'c'); insert into foo values(8,'a'); insert into foo values(9,'a'); insert into foo values(10,'a'); --begin Answer with recursive r as ( select foo.time,foo.message,1 as dummy from foo union all select foo.time,foo.message,r.dummy+1 from foo , r where foo.time=r.time-1 and foo.message=r.message ) ,rr as ( select foo.time,foo.message,'OLID' as flag from foo union all select foo.time,foo.message,'DUP' as flag from foo , rr where foo.time-1=rr.time-2 and foo.message=rr.message ) select time min,time+max(dummy)-1 max,message,max(dummy) counts from r where time not in (select distinct (time+1) times from rr where flag='DUP') group by time,message order by time; --result postgres(# where flag='DUP') group by time,message order by time; min | max | message | counts -----+-----+---------+-------- 1 | 1 | a | 1 2 | 3 | b | 2 4 | 4 | c | 1 5 | 5 | a | 1 6 | 7 | c | 2 8 | 10 | a | 3 (6 rows) --end But I think some one can provide more simple SQL. Thank you. > On Apr 12, 2010, at 11:31 AM, Scott Marlowe wrote: > > >> On Mon, Apr 12, 2010 at 12:22 PM, A B<gentosaker@gmail.com> wrote: >> >>> Hello! >>> >>> I have a table (think of it as a table of log messages) >>> >>> time | message >>> ----------------------- >>> 1 | a >>> 2 | b >>> 3 | b >>> 4 | b >>> 5 | a >>> >>> the three 'b' are the same message, so I would like to write a query >>> that would give me a result that is similar to what the unix command >>> "uniq -c" would give: >>> >>> first | message | last | count >>> -------------------------------------- >>> 1 | a | 1 | 1 >>> 2 | b | 4 | 3<--- here it squeezes >>> similar consecutive messages into a single row >>> 5 | a | 5 | 1 >>> >>> How do I write such a command? >>> >> Pretty straight ahead: >> >> select min(t), message, max(t), count(*) from table group by message. >> > That was my first though too, but it combines everything not just adjacent messages. > > Something like this, maybe > > select t1.message, t1.time as first, t2.time as last, t2.time-t1.time+1 as count > from foo as t1, foo as t2 > where t1.time<= t2.time and t1.message = t2.message > and not exists > (select * from foo as t3 > where (t3.time between t1.time and t2.time and t3.message<> t1.message) > or (t3.time = t2.time + 1 and t3.message = t1.message) > or (t3.time = t1.time - 1 and t3.message = t1.message)); > > message | first | last | count > ---------+-------+------+------- > a | 1 | 1 | 1 > b | 2 | 4 | 3 > a | 5 | 5 | 1 > > That'll only work if the time values are contiguous, but there's probably a > similar trick for non-contiguous. > > Cheers, > Steve > > > -- ================================================ Kenichiro Tanaka K.K.Ashisuto http://www.ashisuto.co.jp/english/index.html ================================================
pgsql-general by date: