Re: Looking for ideas on how to speed up warehouse loading - Mailing list pgsql-performance

From Joe Conway
Subject Re: Looking for ideas on how to speed up warehouse loading
Date
Msg-id 40889DAD.6080703@joeconway.com
In response to Looking for ideas on how to speed up warehouse loading  (Sean Shanny <shannyconsulting@earthlink.net>)
List pgsql-performance
Sean Shanny wrote:
> explain analyze SELECT t1.id, t2.url FROM referral_temp t2 LEFT OUTER
> JOIN d_referral t1 ON t2.url = t1.referral_raw_url ORDER BY t1.id;

> What I would like to know is if there are better ways to do the join?  I
> need to get all the rows back from the referral_temp table as they are
> used for assigning FK's for the fact table later in processing.  When I
> iterate over the values that I get back those with t1.id = null I assign
> a new FK and push both into the d_referral table as new entries as well
> as a text file for later use.  The matching records are written to a
> text file for later use.

Would something like this work any better (without disabling index scans):

SELECT t1.id, t2.url
FROM referral_temp t2, d_referral t1
WHERE t1.referral_raw_url = t2.url;

<process rows with a match>

SELECT t2.url
FROM referral_temp t2
WHERE NOT EXISTS
(SELECT 1 FROM d_referral t1 WHERE t1.referral_raw_url = t2.url);

<process rows without a match>

?
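
To check that the two queries together return every row of referral_temp exactly once, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory SQLite database, the sample URLs, and the helper names split_referrals and demo are assumptions for illustration only; the real tables live in PostgreSQL, and this says nothing about the plans or timings there. The second query selects only t2.url, because t1 is not visible outside the subquery and the id is by definition missing for those rows.

```python
import sqlite3


def split_referrals(conn):
    """Return (matched, unmatched) rows from referral_temp.

    matched:   (id, url) pairs already present in d_referral
    unmatched: (url,) rows that still need a new key assigned
    """
    matched = conn.execute(
        "SELECT t1.id, t2.url "
        "FROM referral_temp t2, d_referral t1 "
        "WHERE t1.referral_raw_url = t2.url "
        "ORDER BY t1.id"
    ).fetchall()
    unmatched = conn.execute(
        "SELECT t2.url "
        "FROM referral_temp t2 "
        "WHERE NOT EXISTS "
        "(SELECT 1 FROM d_referral t1 WHERE t1.referral_raw_url = t2.url) "
        "ORDER BY t2.url"
    ).fetchall()
    return matched, unmatched


def demo():
    """Build two tiny stand-in tables (hypothetical data) and split them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE d_referral (id INTEGER PRIMARY KEY, referral_raw_url TEXT);"
        "CREATE TABLE referral_temp (url TEXT);"
    )
    conn.executemany(
        "INSERT INTO d_referral (id, referral_raw_url) VALUES (?, ?)",
        [(1, "http://a.example/"), (2, "http://b.example/")],
    )
    conn.executemany(
        "INSERT INTO referral_temp (url) VALUES (?)",
        [("http://a.example/",), ("http://c.example/",), ("http://b.example/",)],
    )
    result = split_referrals(conn)
    conn.close()
    return result
```

Calling demo() returns the two known URLs with their ids in the first list and the one unknown URL in the second, which is the same split the LEFT OUTER JOIN produces via t1.id being NULL or not.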

Joe
