Re: Importing normalised data by SQL script in remote DB - Mailing list pgsql-novice

From | Andreas
---|---
Subject | Re: Importing normalised data by SQL script in remote DB
Date |
Msg-id | 48056C16.3040205@gmx.net
In response to | Re: Importing normalised data by SQL script in remote DB ("Sean Davis" <sdavis2@mail.nih.gov>)
Responses | Re: Importing normalised data by SQL script in remote DB
List | pgsql-novice
Sean Davis schrieb:
> On Tue, Apr 15, 2008 at 9:54 PM, Andreas <maps.on@gmx.net> wrote:
>
>> Hi,
>> I've got to import data into a remote database.
>> I usually get the data as Excel files that don't fit the db structure
>> with respect to normalisation, so I import them into Access, brush up
>> the data and push the columns from there into the right tables via ODBC.
>>
>> For bigger imports (~5000 lines in Excel that get spread over 4-6
>> tables in my db) I fetch the remote DB, do the import locally and
>> transfer a dump back to the remote site when no one uses the server.
>> This is getting unwieldy as the db grows, and the connection is not
>> really that fast.
>>
>> I can access the server by ssh, so it might be much faster to run a
>> prepared SQL file consisting of just the new data against the DB on
>> the console.
>>
>> Could I build an SQL script that adds the new data and creates the
>> foreign keys on the fly, without needing to know the new IDs
>> beforehand?
>>
>> Let's suppose I have a couple of foreign-key-linked tables:
>> company (company_id serial primary key, name, ...)
>> person (person_id serial primary key, company_fk, name, ...)
>> contact (contact_id serial primary key, person_fk, contact_date,
>> contact_notes, ...)
>>
>> The SQL script would need to:
>> 1) INSERT INTO company ...
>> 2) look up the last created company_id
>> 3) INSERT INTO person and use the company_id to set company_fk
>> 4) look up the last created person_id
>> 5) INSERT INTO contact and use the person_id to set person_fk
>> It would probably then create some other persons with their contacts,
>> then another company, and so on.
>>
>> I do this right now with VBA in Access, but it is painfully slow over
>> the Internet, as it constantly has to ask the server for the last IDs,
>> and the network latency spoils all the fun.
>
> Why not avoid Access altogether? Load the data into postgres "as-is"
> into a set of loader tables.
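[Editor's note: the step-by-step script in the quoted question can be written in plain SQL with no client round trips, using currval() on the serial columns' sequences. A minimal sketch, assuming the default serial sequence names and illustrative data:]

```sql
BEGIN;

INSERT INTO company (name) VALUES ('company_1');

-- currval() returns the id most recently generated for this session's
-- use of the sequence, so no SELECT round trip is needed
INSERT INTO person (company_fk, name)
VALUES (currval('company_company_id_seq'), 'person_1');

INSERT INTO contact (person_fk, contact_date, contact_notes)
VALUES (currval('person_person_id_seq'), '2008-04-15', 'first call');

-- another person for the same company reuses the same currval()
INSERT INTO person (company_fk, name)
VALUES (currval('company_company_id_seq'), 'person_2');

COMMIT;
```

[Such a file can then be run in one shot over the ssh connection, e.g. with `psql -f import.sql`, so latency is paid only once.]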
> Then you can use SQL to do the data munging and inserts, including the
> foreign key relationships. You can use \copy in psql to load the data,
> so there is no need to have access to the server.
>
> Sean

Thanks for your suggestion.

As said, I usually get the data as an .xls file with wrong column names,
often enough wrong column types, missing info, or data cluttered up in
creative ways. How could I easily import a certain tab within an .xls
into a temporary pg table?

Then, the data is often a dump of JOINs, so it is not normalised. I
can't help it; I just get the stuff dumped on me in whatever form our
customers are able or willing to provide. Most of the time it's like
this:

company_1, ..., person_1, ...
company_1, ..., person_2, ...
company_1, ..., person_3, ...
company_2, ..., person_4, ...
company_3, ..., person_5, ...
company_3, ..., person_6, ...

I've got to catch the repeating companies to avoid duplicates. Until
now, I try to find some columns that represent a key for the company
part of the data, sort by those key columns, walk over the table, and
create a company only when that part of the key changes.

AFAIK I need to insert the person records right after I create their
company, because if I first created all the companies and then the
people, I wouldn't know which company_id belongs to each person.

I'd love to learn how to do this more cleverly w/o Access.

regards
Andreas
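[Editor's note: a sketch of Sean's loader-table approach applied to the duplicated-company dump above. The raw rows are staged, each distinct company is inserted once, and the staging rows are then joined back on the company's natural key to recover the generated company_id; table and column names here are illustrative:]

```sql
-- staging table matching the raw dump: all text, no constraints
CREATE TEMP TABLE loader (company_name text, person_name text);

-- \copy reads the file on the client side, so no server file
-- access is required
\copy loader FROM 'dump.csv' WITH CSV

-- one row per distinct company, no matter how often it repeats
INSERT INTO company (name)
SELECT DISTINCT company_name FROM loader;

-- persons can be inserted afterwards in bulk: the natural key
-- (here the company name) recovers the generated company_id
INSERT INTO person (company_fk, name)
SELECT c.company_id, l.person_name
FROM loader l
JOIN company c ON c.name = l.company_name;
```

[With this pattern there is no need to insert each person immediately after its company, and no need to sort and walk the rows: once a natural key exists, insertion order no longer matters.]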