Thread: need help with import
Hi I get CSV files to import. Th structure is like this. main part, sub part Could be like this A, a1 A, a2 A, a3 B, b1 B, b2 The database has a table for main_part and one for sub_part. The relation needs to be n:m so there is a relation table that holds ( main_id, sub_id ). The 2 primary keys main_part.id and sub_part.id are both serials. Is there a way to do an import with SQL? I can read the CSV into a temporary table and I can do a INSERT INTO main_part ( ... ) SELECT DISTINCT main columns FROM import; as well as a INSERT INTO sub_part ( ... ) SELECT sub columns FROM import; But how would I know what main_id and sub_id to insert into the n:m relation? At first when I do the import the relation is actually 1:n.
-----Original Message----- From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Andreas Sent: Wednesday, February 15, 2012 8:03 PM To: pgsql-sql@postgresql.org Subject: [SQL] need help with import Hi I get CSV files to import. Th structure is like this. main part, sub part Could be like this A, a1 A, a2 A, a3 B, b1 B, b2 The database has a table for main_part and one for sub_part. The relation needs to be n:m so there is a relation table that holds ( main_id, sub_id ). The 2 primary keys main_part.id and sub_part.id are both serials. Is there a way to do an import with SQL? I can read the CSV into a temporary table and I can do a INSERT INTO main_part ( ... ) SELECT DISTINCT main columns FROMimport; as well as a INSERT INTO sub_part ( ... ) SELECT sub columns FROM import; But how would I know what main_id and sub_id to insert into the n:m relation? At first when I do the import the relation is actually 1:n. -------------------------------------------- You will need to use the temporary table and perform multiple insert+select. I do not understand where you are confused. It would help to provide more meaningful sample data and/or the final resultyou are trying to achieve. Keep in mind any n:m setup requires three tables with the joining table usually havingsome descriptive meaning. Is time one of your components that you are not showing us? Dave -- Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-sql
Am 16.02.2012 02:13, schrieb David Johnston: > -----Original Message----- > From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Andreas > Sent: Wednesday, February 15, 2012 8:03 PM > To: pgsql-sql@postgresql.org > Subject: [SQL] need help with import > > Hi > I get CSV files to import. > Th structure is like this. > main part, sub part > Could be like this > > A, a1 > A, a2 > A, a3 > B, b1 > B, b2 > > The database has a table for main_part and one for sub_part. > The relation needs to be n:m so there is a relation table that holds ( main_id, sub_id ). > The 2 primary keys main_part.id and sub_part.id are both serials. > > Is there a way to do an import with SQL? > > I can read the CSV into a temporary table and I can do a INSERT INTO main_part ( ... ) SELECT DISTINCT main columns FROMimport; as well as a INSERT INTO sub_part ( ... ) SELECT sub columns FROM import; > > But how would I know what main_id and sub_id to insert into the n:m relation? > > At first when I do the import the relation is actually 1:n. > > -------------------------------------------- > > You will need to use the temporary table and perform multiple insert+select. > > I do not understand where you are confused. It would help to provide more meaningful sample data and/or the final resultyou are trying to achieve. Keep in mind any n:m setup requires three tables with the joining table usually havingsome descriptive meaning. Is time one of your components that you are not showing us? > > As you say there are 3 tables main_part ( id serial primary key, ... ) sub_part ( id serial primary key, ... ) main_to_sub ( main_id, sub_id ) I would read the csv into a temporary table "import" and insert the main columns into main_part (). Then there are new tuples in main_part() 42, A 43, B Now I insert the sub columns into sub_part() I'll get e.g. 1000, a1 1001, a2 1002, a3 1003, b1 1004, b2 To fill up main_to_sub ( ... ) I needed to know which sub_id belongs to which main_id. ( 42, 1000 ) ( 42, 1001 ) ( 42, 1002 ) ( 43, 1003 ) ( 43, 1004 ) I could compare every main-column in "import" to every related data-column in main_part to get the newly created main_id and do the same with every sub-data-column but this seems to be a wee bit tedious. Is there a more elegant way hat I don't see, yet?
On Feb 15, 2012, at 21:05, Andreas <maps.on@gmx.net> wrote: > Am 16.02.2012 02:13, schrieb David Johnston: >> -----Original Message----- >> From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Andreas >> Sent: Wednesday, February 15, 2012 8:03 PM >> To: pgsql-sql@postgresql.org >> Subject: [SQL] need help with import >> >> Hi >> I get CSV files to import. >> Th structure is like this. >> main part, sub part >> Could be like this >> >> A, a1 >> A, a2 >> A, a3 >> B, b1 >> B, b2 >> >> The database has a table for main_part and one for sub_part. >> The relation needs to be n:m so there is a relation table that holds ( main_id, sub_id ). >> The 2 primary keys main_part.id and sub_part.id are both serials. >> >> Is there a way to do an import with SQL? >> >> I can read the CSV into a temporary table and I can do a INSERT INTO main_part ( ... ) SELECT DISTINCT main columns FROMimport; as well as a INSERT INTO sub_part ( ... ) SELECT sub columns FROM import; >> >> But how would I know what main_id and sub_id to insert into the n:m relation? >> >> At first when I do the import the relation is actually 1:n. >> >> -------------------------------------------- >> >> You will need to use the temporary table and perform multiple insert+select. >> >> I do not understand where you are confused. It would help to provide more meaningful sample data and/or the final resultyou are trying to achieve. Keep in mind any n:m setup requires three tables with the joining table usually havingsome descriptive meaning. Is time one of your components that you are not showing us? >> >> > As you say there are 3 tables > main_part ( id serial primary key, ... ) > sub_part ( id serial primary key, ... ) > main_to_sub ( main_id, sub_id ) > > I would read the csv into a temporary table "import" and insert the main columns into main_part (). > Then there are new tuples in main_part() > 42, A > 43, B > > Now I insert the sub columns into sub_part() > I'll get e.g. > 1000, a1 > 1001, a2 > 1002, a3 > 1003, b1 > 1004, b2 > > To fill up main_to_sub ( ... ) I needed to know which sub_id belongs to which main_id. > ( 42, 1000 ) > ( 42, 1001 ) > ( 42, 1002 ) > ( 43, 1003 ) > ( 43, 1004 ) > > I could compare every main-column in "import" to every related data-column in main_part to get the newly created main_idand do the same with every sub-data-column but this seems to be a wee bit tedious. > > Is there a more elegant way hat I don't see, yet? > Ditch the whole idea of using a sequence and take your primary keys from the source data. Otherwise yes, you will need toperform the join between the import and live tables to determine the newly created identifier. The question to answer is if you see the same values in subsequent import files do you create a new sequence value or reusethe existing value? Why? David J.
On Thursday 16 Feb 2012, Andreas wrote: > Hi > I get CSV files to import. > Th structure is like this. > main part, sub part > Could be like this > > A, a1 > A, a2 > A, a3 > B, b1 > B, b2 > > The database has a table for main_part and one for sub_part. > The relation needs to be n:m so there is a relation table that holds > ( main_id, sub_id ). > The 2 primary keys main_part.id and sub_part.id are both serials. > > Is there a way to do an import with SQL? > > I can read the CSV into a temporary table > and I can do a > INSERT INTO main_part ( ... ) SELECT DISTINCT main columns FROM > import; as well as a > INSERT INTO sub_part ( ... ) SELECT sub columns FROM import; > > But how would I know what main_id and sub_id to insert into the n:m > relation? Is this what you need? foo=> create table mp(mid serial primary key, m text); CREATE TABLE foo=> create table sp(sid serial primary key, s text); CREATE TABLE foo=> create table ms(mid int references mp, sid int references sp, primary key(mid, sid)); CREATE TABLE foo=> create temporary table t(m text, s text); CREATE TABLE foo=> \copy t from '/tmp/x' csv foo=> select * from t;m | s ---+----A | a1A | a2A | a3B | b1B | b2 (5 rows) foo=> insert into mp(m) (select distinct m from t); INSERT 0 2 foo=> insert into sp(s) (select distinct s from t); INSERT 0 5 foo=> select * from mp;mid | m -----+--- 1 | A 2 | B (2 rows) foo=> select * from sp;sid | s -----+---- 1 | a1 2 | a2 3 | a3 4 | b1 5 | b2 (5 rows) foo=> insert into ms (select mid, sid from mp, sp where (m,s) in (select m, s from t)); INSERT 0 5 foo=> select * from ms;mid | sid -----+----- 1 | 1 1 | 2 1 | 3 2 | 4 2 | 5 (5 rows) foo=> Regards, -- Raj -- Raj Mathur || raju@kandalaya.org || GPG: http://otheronepercent.blogspot.com || http://kandalaya.org || CC68 It is the mind that moves || http://schizoid.in || D17F