Re: Parallel Scaling of a pgplsql problem - Mailing list pgsql-performance

From Yeb Havinga
Subject Re: Parallel Scaling of a pgplsql problem
Date
Msg-id 4F98EFE8.5090609@gmail.com
In response to Re: Parallel Scaling of a pgplsql problem  (Venki Ramachandran <venki_ramachandran@yahoo.com>)
List pgsql-performance
On 2012-04-26 04:40, Venki Ramachandran wrote:
Thanks Tom, clock_timestamp() worked. I appreciate it! Sorry, I was hurrying to get this done at work and hence did not read things through.

Can you comment on how you would solve the original problem? Even if I can get the 11 seconds down to 500 ms for one pair, running it for 300k pairs will take multiple hours. How can one write a combination of a bash script and PL/pgSQL code that uses all 8 cores of a server? As far as I can see, this is executing in just one session/process.

You want to run a calculation over the cross product 'employee x employee'. If employee is partitioned into emp1, emp2, ..., emp8, the cross product is equal to the union of emp1 x employee, emp2 x employee, ..., emp8 x employee. These 8 partial cross products are independent of each other, so they can be executed in parallel. I'd look into dblink to execute each of the 8 cross products in parallel, and then union all of those results.

http://www.postgresql.org/docs/9.1/static/contrib-dblink-connect.html
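
The decomposition described above can be illustrated outside the database. The following is only a minimal in-memory sketch in Python, not dblink code: the employee ids, the score() calculation, and all function names are hypothetical stand-ins. It uses threads purely to show the split-and-union structure; in PostgreSQL the real parallelism would come from each partition running in its own backend session.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def partition(rows, n_parts):
    """Split rows into n_parts disjoint slices (emp1 .. empN) that together cover rows."""
    return [rows[i::n_parts] for i in range(n_parts)]


def score(a, b):
    """Hypothetical stand-in for the real per-pair calculation."""
    return abs(a - b)


def cross_with_all(part, employees):
    """Evaluate the calculation for every pair in 'part x employees'."""
    return [(a, b, score(a, b)) for a, b in product(part, employees)]


def parallel_cross(employees, n_parts=8):
    """Run the n_parts partial cross products concurrently and union the results."""
    parts = partition(employees, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        chunks = pool.map(cross_with_all, parts, [employees] * n_parts)
        # Concatenating the chunks plays the role of UNION ALL.
        return [row for chunk in chunks for row in chunk]


employees = list(range(1, 21))  # 20 hypothetical employee ids
result = parallel_cross(employees)
```

Because the partitions are disjoint and together cover the whole table, the combined result contains every pair exactly once, the same rows a single-session 'employee x employee' would produce.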

regards,
Yeb
