Thread: Selecting distinct records
Have a table that is populated by external radius reporting which contains name, ipaddress, sessionid, refid, and others None of the fields are unique, but the combination of those 4 listed should reasonably be unique. There is a certain type of interaction with external hardware that results in multple entries into the table, resulting in multiple duplicate entries. What we need to to is build a select that omits any duplicate records from the output, but I can't get a distinct to work given the way we need the data sorted. For example, Name IPaddress sessionid refid User1 201.201 1234 5678 first user User2 201.201 1235 5679 same ip as first user User3 201.202 3234 5670 same sessid as first user User4 201.203 3236 5678 same refid as first user User1 201.202 4234 5678 first user new entry User1 201.202 4234 5678 DUPLICATE User1 201.202 4234 5678 DUPLICATE User2 201.203 1234 5671 same ip as user 4 User3 201.204 2234 5672 unique if we do distinct on sessionid then put a where clause for the user=User1 we end up missing some of user1's entries where another users identical sessionid appears first. What I am looking to do is - grab every record for $user - remove any records that have identical ipaddress+sessionid+refid ie: turn user1 201.102 1234 5678 user1 201.102 1234 5678 user1 201.102 1234 5678 into user1 201.102 1234 5678 - then sort the results by date_time or something else Just can't get it to do all those things at the same time. Any thoughts, or am I not making sense? thanks Dave
>What I am looking to do is >- grab every record for $user >- remove any records that have identical ipaddress+sessionid+refid >- then sort the results by date_time or something else this last requirement is where the problem is... is you do a sum() or order by in the select statement you get and error, for example; select distinct on (sessionid) sum(sessiontime) from logs where name='joeblowuser' and datetime > 1036040400; ERROR: Attribute logs.sessionid must be GROUPed or used in an aggregate function Same if you have to order by datetime or something... Dave
"Dave [Hawk-Systems]" <dave@hawk-systems.com> writes: >> What I am looking to do is >> - grab every record for $user >> - remove any records that have identical ipaddress+sessionid+refid >> - then sort the results by date_time or something else > this last requirement is where the problem is... I think you need two levels of select. To do the DISTINCT ON you must sort by the columns to be distinct'd. You can re-sort the result in an outer select. For example: select * from (select distinct on (ten) * from tenk1 order by ten) ss order by hundred; In practice the inner select's order by will need additional sort columns, with which you control which row gets chosen out of any set of duplicates. regards, tom lane
>"Dave [Hawk-Systems]" <dave@hawk-systems.com> writes: >>> What I am looking to do is >>> - grab every record for $user >>> - remove any records that have identical ipaddress+sessionid+refid >>> - then sort the results by date_time or something else > >> this last requirement is where the problem is... > >I think you need two levels of select. To do the DISTINCT ON you must >sort by the columns to be distinct'd. You can re-sort the result in >an outer select. For example: > >select * from > (select distinct on (ten) * from tenk1 order by ten) ss >order by hundred; is this vailable in 7.0.3, I get an error at the second select Dave