Thread: unique fields
hi: I have a fairly large table. sample_id | chr | cfrom | cto | ------------------------------------------- 1 c2 19 20 2 c2 19 20 3 c2 19 20 1 c5 10 11 3 c5 10 11 (25,000 rows) I want to find out how many duplications are there for chr, cfrom and cto a. c2,19,20 are common to samples 1,2 and 3. since there will be many instances like that, do I have to loop over entire rows and find common chr, cfrom and c2 and ouput with sample_id. how can I do that. thanks adrian
Hi, On 23 September 2010 07:30, Adrian Johnson <oriolebaltimore@gmail.com> wrote: > I want to find out how many duplications are there for chr, cfrom and cto Start with it http://www.postgresql.org/docs/9.0/interactive/tutorial-agg.html p.s. SELECT chr, cfrom, cto, count(*) FROM your_table GROUP BY 1, 2, 3; > > a. c2,19,20 are common to samples 1,2 and 3. > > since there will be many instances like that, do I have to loop over > entire rows and find common chr, cfrom and c2 and ouput with > sample_id. > how can I do that. > > thanks > adrian > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql > -- Sergey Konoplev Blog: http://gray-hemp.blogspot.com / Linkedin: http://ru.linkedin.com/in/grayhemp / JID/GTalk: gray.ru@gmail.com / Skype: gray-hemp / ICQ: 29353802
Howdy, Adrian Dunno if this is exactly what you want SELECT * FROM ( SELECT chr,cfrom,cto,count(*) as numberOfDuplicates FROM t_fairly_large_table GROUP BY chr,cfrom,cto ) x NATURAL JOIN t_fairly_large_table y WHERE numberOfDuplicates > 1 The idea of this (untested) query is to produce something like chr| cfrom | cto | numberOfDuplicates| sample_id c2 19 20 3 1 c2 19 20 3 2 c2 19 20 3 3 c5 10 11 2 1 c5 10 11 2 3 Can this be what you need? Best, Oliver ----- Original Message ----- From: "Adrian Johnson" <oriolebaltimore@gmail.com> To: <pgsql-sql@postgresql.org> Sent: Thursday, September 23, 2010 4:30 AM Subject: [SQL] unique fields > hi: > > I have a fairly large table. > > sample_id | chr | cfrom | cto | > ------------------------------------------- > 1 c2 19 20 > 2 c2 19 20 > 3 c2 19 20 > 1 c5 10 11 > 3 c5 10 11 > > > (25,000 rows) > > I want to find out how many duplications are there for chr, cfrom and cto > > a. c2,19,20 are common to samples 1,2 and 3. > > since there will be many instances like that, do I have to loop over > entire rows and find common chr, cfrom and c2 and ouput with > sample_id. > how can I do that. > > thanks > adrian > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql