Thread: Change detection
select Name, Department, min(year), max(year) from data group by Name, Department
The meaning of Staff_ID is obscure because it’s not unique.
Mit freundlichen Grüßen
Martin Stöcker
-----------------------------------------
ETL Datenservice GmbH
Widdersdorfer Str. 415 | D-50933 Köln
Telefon: +49(0)2219544010
Fax: +49(0)2219544015
Email: Martin.Stoecker@etl-datenservice.de
ETL Datenservice GmbH
Widdersdorfer Str. 415 · 50933 Köln
Geschäftsführer: Dr. Dirk Goldner, ppa. Melanie Lillich, ppa. Udo Heuschmann
Amtsgericht Köln · HRB 75439 · USt.-Id: DE 122 805 685
www.etl-datenservice.de
Email: info@etl-datenservice.de
Von: Shaozhong SHI <shishaozhong@gmail.com>
Gesendet: Freitag, 9. Dezember 2022 13:55
An: pgsql-sql <pgsql-sql@lists.postgresql.org>
Betreff: Change detection
A record shows all staff member worked in different departments.
For instance, Tom was in sales for several years and got promoted to management. How to detect the time of this change?
Data
Staff_ID Name Department Year
1 Tom Sales 1990
2 Tom Sales 1991
3 Tom Sales 1991
4 Tom Management 1992
4 Tom Management 1992
Regards,
David
Attachment
DataStaff_ID Name Department Year1 Tom Sales 19902 Tom Sales 19913 Tom Sales 19914 Tom Management 19924 Tom Management 1992
----------+------+------------+------+---------
1 | Tom | Sales | 1990 | f
2 | Tom | Sales | 1991 | f
3 | Tom | Sales | 1991 | f
4 | Tom | Management | 1992 | t
4 | Tom | Management | 1992 | f
(5 rows)
DataStaff_ID Name Department Year1 Tom Sales 19902 Tom Sales 19913 Tom Sales 19914 Tom Management 19924 Tom Management 1992select *, coalesce(lag(department) over(order by year), department) <> department Changed from (Values (1, 'Tom', 'Sales', 1990),(2, 'Tom', 'Sales', 1991),(3, 'Tom', 'Sales', 1991),(4, 'Tom', 'Management', 1992),(4, 'Tom', 'Management', 1992)) as x(Staff_ID, Name, Department, Year);staff_id | name | department | year | changed
----------+------+------------+------+---------
1 | Tom | Sales | 1990 | f
2 | Tom | Sales | 1991 | f
3 | Tom | Sales | 1991 | f
4 | Tom | Management | 1992 | t
4 | Tom | Management | 1992 | f
(5 rows)
DataStaff_ID Name Department Year1 Tom Sales 19902 Tom Sales 19913 Tom Sales 19914 Tom Management 19924 Tom Management 1992select *, coalesce(lag(department) over(order by year), department) <> department Changed from (Values (1, 'Tom', 'Sales', 1990),(2, 'Tom', 'Sales', 1991),(3, 'Tom', 'Sales', 1991),(4, 'Tom', 'Management', 1992),(4, 'Tom', 'Management', 1992)) as x(Staff_ID, Name, Department, Year);staff_id | name | department | year | changed
----------+------+------------+------+---------
1 | Tom | Sales | 1990 | f
2 | Tom | Sales | 1991 | f
3 | Tom | Sales | 1991 | f
4 | Tom | Management | 1992 | t
4 | Tom | Management | 1992 | f
(5 rows)

How about finding all changes for all people in a large record set?See the follwoing:David1 Tom Sales 19902 Tom Sales 19913 Tom Sales 19914 Tom Management 19925 Tom Management 19926 Tim finance 19827 Tim finance 19838 Tim management 19849 Tim management 1985On Fri, 9 Dec 2022 at 13:06, Marcos Pegoraro <marcos@f10.com.br> wrote:DataStaff_ID Name Department Year1 Tom Sales 19902 Tom Sales 19913 Tom Sales 19914 Tom Management 19924 Tom Management 1992select *, coalesce(lag(department) over(order by year), department) <> department Changed from (Values (1, 'Tom', 'Sales', 1990),(2, 'Tom', 'Sales', 1991),(3, 'Tom', 'Sales', 1991),(4, 'Tom', 'Management', 1992),(4, 'Tom', 'Management', 1992)) as x(Staff_ID, Name, Department, Year);staff_id | name | department | year | changed
----------+------+------------+------+---------
1 | Tom | Sales | 1990 | f
2 | Tom | Sales | 1991 | f
3 | Tom | Sales | 1991 | f
4 | Tom | Management | 1992 | t
4 | Tom | Management | 1992 | f
(5 rows)
just change lag(department) over(order by year) to lag(department) over(partition by name order by year)Atenciosamente,Em sex., 9 de dez. de 2022 às 11:15, Shaozhong SHI <shishaozhong@gmail.com> escreveu:How about finding all changes for all people in a large record set?See the follwoing:David1 Tom Sales 19902 Tom Sales 19913 Tom Sales 19914 Tom Management 19925 Tom Management 19926 Tim finance 19827 Tim finance 19838 Tim management 19849 Tim management 1985On Fri, 9 Dec 2022 at 13:06, Marcos Pegoraro <marcos@f10.com.br> wrote:DataStaff_ID Name Department Year1 Tom Sales 19902 Tom Sales 19913 Tom Sales 19914 Tom Management 19924 Tom Management 1992select *, coalesce(lag(department) over(order by year), department) <> department Changed from (Values (1, 'Tom', 'Sales', 1990),(2, 'Tom', 'Sales', 1991),(3, 'Tom', 'Sales', 1991),(4, 'Tom', 'Management', 1992),(4, 'Tom', 'Management', 1992)) as x(Staff_ID, Name, Department, Year);staff_id | name | department | year | changed
----------+------+------------+------+---------
1 | Tom | Sales | 1990 | f
2 | Tom | Sales | 1991 | f
3 | Tom | Sales | 1991 | f
4 | Tom | Management | 1992 | t
4 | Tom | Management | 1992 | f
(5 rows)
On 12/9/22 12:32, Shaozhong SHI wrote: > That works well. > > I just wonder whether we can tell Tom or Tim has worked in more than 1 > department. Apparently, PostgreSQL does not allow count(distinct > department) when window function is used. > > Given this data set, can we do something like count(distinct) to > provide an answer to how many different department someone has worked in? > > Regards, > > David Use the working query in a CTE and simply do the count(distinct department) on that, else there's no good place to put the number of departments per person. Pretty sure you've been asked before but Please don't top post.