Normalized Tables & SELECT [was: Find "smallest common year"] - Mailing list pgsql-general
From | Stefan Schwarzer |
---|---|
Subject | Normalized Tables & SELECT [was: Find "smallest common year"] |
Date | |
Msg-id | 880A3932-8522-4145-9B36-F568F538585F@grid.unep.ch Whole thread Raw |
In response to | Re: Find "smallest common year" ("Scott Marlowe" <scott.marlowe@gmail.com>) |
Responses |
Re: Normalized Tables & SELECT [was: Find "smallest common year"]
Re: Normalized Tables & SELECT [was: Find "smallest common year"] Re: Normalized Tables & SELECT [was: Find "smallest common year"] Re: Normalized Tables & SELECT [was: Find "smallest common year"] |
List | pgsql-general |
>> What would you recommend for say, 500 global national statistical >> variables, >> 500 regional and 500 subregional and 500 global aggregations? >> Years being >> covered having something between 10 and 60 years for each of these >> variables. All available for 240 countries/territories. > > I generally approach such problems by putting the data right > (normalized) at the start, then munging the data into summary tables > to handle the problems you're seeing now. > > I find it far easier to maintain normalized tables that produced > non-normalized ones (for things like data warehousing) than it is to > maintain non-normalized tables and trying to produce normalized data > from that. Ok, I do understand that. So, instead of the earlier mentioned database design, I would have something like this: - one table for the country names/ids/etc. (Afghanistan, 1; Albania, 2....) - one table for the variable names/ids/etc. (GDP, 1; Population, 2; Fish Catch, 3;....) - one table for the years names/ids/etc. (1970, 1; 1971, 2; 1973, 3; ....) and - one table for all "statistical data" with four fields - id_variable, id_country, id_year, and the actual value You say > I find it far easier to maintain normalized tables that produced > non-normalized ones (for things like data warehousing) than it is to > maintain non-normalized tables and trying to produce normalized data > from that. It seems to me more difficult now to produce a non-normalized output based on the normalized table. How would look a query like, if I need now to SELECT, say 100 countries and 20 years? Something like this (simplified and without joins): SELECT value, id.year FROM main_table WHERE year = '1970' OR year = '1971' OR .... country_name = 'Afghanistan' OR country_name = 'Albania' OR ... Actually, last time we came up with SubSelects for each year. So, does this make sense? Thanks a lot for your help! Stef ____________________________________________________________________ Stefan Schwarzer Lean Back and Relax - Enjoy some Nature Photography: http://photoblog.la-famille-schwarzer.de Appetite for Global Data? UNEP GEO Data Portal: http://geodata.grid.unep.ch ____________________________________________________________________
pgsql-general by date: