Thread: counting distinct values
Using the example from http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I do if I wanted to know the number of different cities where I had a friend in each state? select count(city) group by state; would not work because if you had two friends in the same city it would be counted twice.
> Using the example from > http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I > do if I wanted to know the number of different cities where I had a > friend in each state? select count(city) group by state; would not work > because if you had two friends in the same city it would be counted > twice. > How about? select city, state, count(*) group by state, city; -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > > Using the example from > > http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I > > do if I wanted to know the number of different cities where I had a > > friend in each state? select count(city) group by state; would not work > > because if you had two friends in the same city it would be counted > > twice. > > > > How about? > > select city, state, count(*) group by state, city; OK that doesn't work for me because I want to use other agregates on the state level. Forgetting the state/city example. playpen=> create table tablem ( a int, b int, n int); CREATE playpen=> insert into tablem (a, b, n) values (1, 2, 1); INSERT 35197 1 playpen=> insert into tablem (a, b, n) values (1, 2, 2); INSERT 35198 1 playpen=> insert into tablem (a, b, n) values (1, 3, 1); INSERT 35199 1 playpen=> insert into tablem (a, b, n) values (1, 4, 3); INSERT 35296 1 playpen=> insert into tablem (a, b, n) values (1, 1, 4); INSERT 35297 1 playpen=> insert into tablem (a, b, n) values (2, 3, 3); INSERT 35298 1 playpen=> insert into tablem (a, b, n) values (2, 5, 7); INSERT 35299 1 playpen=> insert into tablem (a, b, n) values (2, 3, 3); INSERT 35300 1 playpen=> insert into tablem (a, b, n) values (2, 3, 1); INSERT 35301 1 playpen=> select a, count(*), avg(n), sum(n) from tablem group by a; a|count|avg|sum -+-----+---+--- 1| 5| 2| 11 2| 4| 3| 14 (2 rows) ... now suppose I want to have the number of distictive b's in the results as well. I try: playpen=> select a, count(m.oid), avg(m.n), sum(m.n), (select count(mm.oid) from tablem mm where m.a = mm.a group by mm.b ) from tablem m group by a; ERROR: More than one tuple returned by a subselect used as an expression. playpen=> ... even though the subselect should only return one tuple.
Joseph Shraibman <jks@selectacast.net> writes: >>>> Using the example from >>>> http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I >>>> do if I wanted to know the number of different cities where I had a >>>> friend in each state? select count(city) group by state; would not work >>>> because if you had two friends in the same city it would be counted >>>> twice. Er, what's wrong with select count(distinct city) group by state? > ... now suppose I want to have the number of distictive b's in the > results as well. I try: > playpen=> select a, count(m.oid), avg(m.n), sum(m.n), (select > count(mm.oid) from tablem mm where m.a = mm.a group by mm.b ) from > tablem m group by a; > ERROR: More than one tuple returned by a subselect used as an > expression. > playpen=> > ... even though the subselect should only return one tuple. Not unless there's only one b value for any one a value --- otherwise the sub-select will return one row per b group. The error message looks correct to me. regards, tom lane
Tom Lane wrote: > > Joseph Shraibman <jks@selectacast.net> writes: > >>>> Using the example from > >>>> http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I > >>>> do if I wanted to know the number of different cities where I had a > >>>> friend in each state? select count(city) group by state; would not work > >>>> because if you had two friends in the same city it would be counted > >>>> twice. > > Er, what's wrong with select count(distinct city) group by state? I thought I tried that, but it didn't work the first time. <shrug> Guess I mistyped something. Sorry. > > > ... now suppose I want to have the number of distictive b's in the > > results as well. I try: > > > playpen=> select a, count(m.oid), avg(m.n), sum(m.n), (select > > count(mm.oid) from tablem mm where m.a = mm.a group by mm.b ) from > > tablem m group by a; > > ERROR: More than one tuple returned by a subselect used as an > > expression. > > playpen=> > > > ... even though the subselect should only return one tuple. > > Not unless there's only one b value for any one a value --- otherwise > the sub-select will return one row per b group. The error message looks > correct to me. > OK I thought that group by was eleminating duplicates which would then be counted. That evolved from q simple select(*) ... group by that worked. Don't know where I screwed it up.