Thread: difference when using 'distinct on'
Howdy:
Can someone tell what the difference (and why
you would use it) is between the following:
[snip]
select distinct on (col_1, col_2),
col_1,
col_2,
col_3
from t_table
--
select distinct
col_1,
col_2,
col_3
from t_table
[/snip]
In the first example, is it just getting
the unique rows for the first two columns?
Thanks!
-X
On Fri, Sep 12, 2003 at 09:29:39 -0400, "Johnson, Shaunn" <SJohnson6@bcbsm.com> wrote: > > In the first example, is it just getting > the unique rows for the first two columns? Yes. Normally you would use distinct on in combination with an order by so that you get the particular unique row you are interested in. For example if col1 and col2 were an event type and col3 was a timestamp for the event, you could use distinct on to get the earliest or latest timestamp for each event type.
On Fri, 12 Sep 2003, Johnson, Shaunn wrote: > Howdy: > > Can someone tell what the difference (and why > you would use it) is between the following: > > [snip] > select distinct on (col_1, col_2), > col_1, > col_2, > col_3 > from t_table > > -- > > select distinct > col_1, > col_2, > col_3 > from t_table > [/snip] > > In the first example, is it just getting > the unique rows for the first two columns? In the first, for each set of rows that have a distinct col1,col2 value it's taking one of those rows and using its col3 value. It's like group by, but less restrictive since you don't need to use a set function on col_3. In general distinct on in that fashion is most usable when combined with an order by so that you can get a particular row from each set. For example, you might say do something like: select distinct on (col1, col2) col1, col2, col3 from t_table order by col1, col2, col4; In this case you should get the col3 value for each col1,col2 distinct group that corresponds to the row having the lowest col4 value.