Re: Retrieve most recent 1 record from joined table - Mailing list pgsql-sql
From | Vik Fearing |
---|---|
Subject | Re: Retrieve most recent 1 record from joined table |
Date | |
Msg-id | 53F885F2.2050606@dalibo.com Whole thread Raw |
In response to | Retrieve most recent 1 record from joined table (agharta <agharta82@gmail.com>) |
Responses |
Re: Retrieve most recent 1 record from joined table
|
List | pgsql-sql |
On 08/22/2014 10:05 AM, agharta wrote: > Hi all, > This is my first question, don't hate me please if it is the wrong place. This is the right place. See below for my answer. > Mi question is relative seimple, but goes me crazy. > > First, create some example tables: > > ---- > create table table1( > t1_ID SERIAL primary key, > t1_value text > ); > > create table table2( > t2_ID SERIAL primary key, > t2_value text, > t1_id integer > ); > > > create table table3( > t3_ID SERIAL primary key, > t3_value text, > t3_date timestamp, > t2_id integer > ); > > ALTER TABLE table2 ADD FOREIGN KEY (t1_id) REFERENCES table1 (t1_ID) ON > DELETE CASCADE; > ALTER TABLE table3 ADD FOREIGN KEY (t2_id) REFERENCES table2 (t2_ID) ON > DELETE CASCADE; > > CREATE INDEX IDX_TABLE1_T1_value ON table1 (t1_value); > CREATE INDEX IDX_TABLE2_T2_value ON table2 (t2_value); > CREATE INDEX IDX_TABLE3_T3_value ON table3 (t3_value); > CREATE INDEX IDX_TABLE3_T3_DATE ON table3 (t3_date); > ----- > > As you can see, table3 is referenced to table2 and table2 is referenced > to table1 > > Next, insert some data on tables: > > --one millon records into table1 > insert into table1(t1_value) select md5(random()::text) from (select > generate_series(1,1000000) as a) as c; > > --100.000 records on table2 where t1_id is a random number between 1 and > 10.001. This guarantees many records pointing to same table1 t1_id > insert into table2(t2_value, t1_id) select md5(random()::text), > trunc((random()+1)*10000)::integer from (select > generate_series(1,100000) as a) as c; > > --again, 1.022.401 records on table3 where t2_id is a random number > between 1 and 10.001. This guarantee many records pointing to same > table2 t2_id. > -- random numbers (1.022.401) are generated by generated_series function > with timestamp (1 minute) interval > insert into table3(t3_value, t2_id, t3_date) select md5(random()::text), > trunc((random()+1)*10000)::integer, c.date_val from (select > generate_series(timestamp '2013-01-01',timestamp '2014-12-12', interval > '1 minute') as date_val) as c; > > > So, now we should have table3 with many rows per table2(t2_id) and many > rows in table2 with same t1_id. > > > Now, the question: > > > Joining the tables, how to get ONLY most recent record per > table3(t3_date)?? > > Query example: > > select * from table1 as t1 > inner join table2 t2 on (t1.t1_id = t2.t1_id and t2.t2_value like('%ab%') ) > inner join table3 t3 on (t2.t2_id = t3.t2_id and t3.t3_date <= timestamp > '2014-08-20') > order by t3.t2_id, t3.t3_date desc > > produces to me this dataset (extract) > > > t1_id t1_value t2_id t2_value t1_id > t3_id t3_value t3_date > t2_id > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 857683 > 0fb6cb380522dd83b6ac4beba2c6e98f 2014-08-19 14:42:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 857088 > 9c3481bfc0bdea51e62b338a1777cde6 2014-08-19 04:47:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 853208 > 253dc2083e70dd9e276867f43889973f 2014-08-16 12:07:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 851237 > 970acf901c4232c178b5dbeda4e44ac8 2014-08-15 03:16:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 847436 > cb8d52255eaa850f9d7f203092a2ce13 2014-08-12 11:55:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 819339 > d9e36ad622b1db499b3f623cdd46a811 2014-07-23 23:38:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 818022 > b9d13239f522e3a22f20d36ea6dab8ad 2014-07-23 01:41:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 803046 > c36164f3077894a1986c4922dfb632ec 2014-07-12 16:05:00.0 10020 > 17098 74127bc80ca759678892c957b6a34fc7 10020 > 9182c1f48cd008e31d781abc45723a10 17098 788129 > 32c966feab2212a29f86bebbaa6dfec9 2014-07-02 07:28:00.0 10020 > > > As you can see, there are many t3_id per single t2_id. > > I need the same dataset, but i want only the most recent (one) record > per table3, and, of course, directly in the join rule (in production i > have a long and complex query). > > My personal solution (very slow, i can't use it): > > select * from table1 as t1 > inner join table2 t2 on (t1.t1_id = t2.t1_id and t2.t2_value like('%ab%') ) > inner join table3 t3 on ( > t3.t3_id = (select t3_id from table3 where t2_id = t2.t2_id and t3_date > <= timestamp '2014-08-20' order by t3_date desc fetch first 1 rows only) > and t3.t2_id = t2.t2_id > ) > order by t3.t2_id, t3.t3_date desc > > > it gives me the right result, but performances are poor........ > > Same bad performance result with a function that performs the table3 query. > > Suggestions? > > Thanks to anyone who can answer to me!!!! You didn't say what version of PostgreSQL you're using. The following solution works with 9.3. select * from table1 as t1 join table2 as t2 on t2.t1_id = t1.t1_id and t2.t2_value like '%ab%' join lateral (select * from table3 where t2_id = t2.t2_id and t3_date <= '2014-08-20' order by t3_date desc limit 1) as t3 on true order by t3.t2_id, t3.t3_date desc; You will need the following indexes for it to give good performance: create index on t2 using gist (t2_value gist_trgm_ops); create index on t3 using btree (t2_id, t3_date desc); That first index will require the pg_trgm extension that you can get with: create extension pg_trgm; If you are not using 9.3, I can give you a (much) more complicated version that will perform well, but ideally you'll want to use the above. -- Vik