Thread: Do parallel queries work with only dblink not with fdw?

Do parallel queries work with only dblink not with fdw?

From

"Klaus P."

Date:

04 May 2016, 08:28:17

In order to improve cpu and disk utilization, I am testing parallel queries.

The approach with dblink_send_query() and dblink_get_result() works in my proof-of-concept. Runtime of my reference query was reduced from 55 seconds to ~20seconds using 4 parallel connections. Not what I had hoped but certainly a significant improvement.

My approach is fairly easy:

db_link_send_query('conn1', 'statement based on partitioning field');
db_link_send_query('conn2', 'statement based on partitioning field');

...

SELECT
dblink_get_result('conn1')

UNION ALL
dblink_get_result('conn2')

...

However, using fdw foreign data wrappers, I was not able to run any query in parallel. I came across this presentation http://de.slideshare.net/EnterpriseDB/foreign-data-wrappers-and-you-with-postgres where it says on page 12 "parallel FDW access". Is there any example available on how to run FDW queries in parallel?

My approach with FDW is similar to the dblink example above:

SELECT * FROM myFdwTable WHERE (clause based on partitioning field)

UNION ALL
SELECT * FROM myFdwTable WHERE (clause based on partitioning field)
...

My experience is however that the statements are carried out sequentially.

Thanks

Klaus Pieper

Re: Do parallel queries work with only dblink not with fdw?

From

Adrian Klaver

Date:

04 May 2016, 13:30:08

On 05/04/2016 01:28 AM, Klaus P. wrote:
> In order to improve cpu and disk utilization, I am testing parallel
> queries.
>
> The approach with dblink_send_query() and dblink_get_result() works in
> my proof-of-concept. Runtime of my reference query was reduced from 55
> seconds to ~20seconds using 4 parallel connections. Not what I had hoped
> but certainly a significant improvement.
>
> My approach is fairly easy:
>
> db_link_send_query('conn1', 'statement based on partitioning field');
> db_link_send_query('conn2', 'statement based on partitioning field');
>
> ...
>
> SELECT
> dblink_get_result('conn1')
> UNION ALL
> dblink_get_result('conn2')
>
> ...
>
>
> However, using fdw foreign data wrappers, I was not able to run any
> query in parallel. I came across this presentation
> http://de.slideshare.net/EnterpriseDB/foreign-data-wrappers-and-you-with-postgres
> where it says on page 12  "parallel FDW access". Is there any example
> available on how to run FDW queries in parallel?
>
> My approach with FDW is similar to the dblink example above:
>
> SELECT * FROM myFdwTable WHERE (clause based on partitioning field)
> UNION ALL
> SELECT * FROM myFdwTable WHERE (clause based on partitioning field)
> ...
>
> My experience is however that the statements are carried out sequentially.

I am not sure of the status of parallel query in FDW, but for those that
are some more information would be helpful:

1) You are using postgres_fdw, correct?

2) What version of the FDW are you using?

3) What version(s) of Postgres are you connecting from/to?


>
> Thanks
>
> Klaus Pieper


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Do parallel queries work with only dblink not with fdw?

From

"kpi6288"

Date:

04 May 2016, 14:09:59

> -----Ursprüngliche Nachricht-----
> Von: Adrian Klaver [mailto:adrian.klaver@aklaver.com]
>
> I am not sure of the status of parallel query in FDW, but for those that are some
> more information would be helpful:
>
> 1) You are using postgres_fdw, correct?

Yes.

>
> 2) What version of the FDW are you using?

Not sure if FDW could be different from the cluster version (see below). Postgres_fdw.dll shows file version
9.5.2.16088- this is the one installed with the Windows PostgreSQL package. 

>
> 3) What version(s) of Postgres are you connecting from/to?

For testing I have installed a fresh download 9.5.2 Windows x64 (on Windows Server 2008 R2, if this is of interest)

There is only one test database running in this cluster. I am connecting in a "loopback" mode to that database.

I could repeat my tests with two different databases or run it on a Linux / Debian machine if that makes a difference.

Thanks Klaus

Re: Do parallel queries work with only dblink not with fdw?

From

Marc Mamin

Date:

04 May 2016, 14:10:19

>From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Klaus P.
>Sent: Mittwoch, 4. Mai 2016 10:28
>To: pgsql-general@postgresql.org
>Subject: [GENERAL] Do parallel queries work with only dblink not with fdw?
>
>In order to improve cpu and disk utilization, I am testing parallel queries. 
>
>The approach with dblink_send_query() and dblink_get_result() works in my proof-of-concept. Runtime of my reference
querywas reduced from 55 seconds to ~20seconds using 4 parallel connections. Not what I had hoped but certainly a
significantimprovement.
 
>My approach is fairly easy: 
>db_link_send_query('conn1', 'statement based on partitioning field');
>db_link_send_query('conn2', 'statement based on partitioning field');
>
>...
>SELECT 
>dblink_get_result('conn1')
>UNION ALL 
>dblink_get_result('conn2')
>
>...


Hello,

I'm using the db_link approach too, but rather than using dblink_get_result, I store the result in an additional
table.
This was faster, at least for my use case and at the time when I implemented that solution...

something like 

db_link_send_query('conn0', 'CREATE UNLOGGED TABLE  my_result...
db_link_send_query('conn1', 'INSERT INTO my_result  statement based on partitioning field');
db_link_send_query('conn2', 'INSERT INTO my_result  statement based on partitioning field');
...
select * from my_result

regards,

Marc Mamin

Re: Do parallel queries work with only dblink not with fdw?

From

Stefan Keller

Date:

08 September 2016, 11:31:35

Hi,

Sorry for appending to that thread, but I think this is related:
Does anyone have experience with parsel [1] and/or it's extension
parallelsql [2]?

:Stefan

[1] http://geeohspatial.blogspot.ch/2013/12/a-simple-function-for-parallel-queries_18.html
[2] https://github.com/k1aus/parallelsql




2016-05-04 16:10 GMT+02:00 Marc Mamin <M.Mamin@intershop.de>:
>>From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Klaus P.
>>Sent: Mittwoch, 4. Mai 2016 10:28
>>To: pgsql-general@postgresql.org
>>Subject: [GENERAL] Do parallel queries work with only dblink not with fdw?
>>
>>In order to improve cpu and disk utilization, I am testing parallel queries.
>>
>>The approach with dblink_send_query() and dblink_get_result() works in my proof-of-concept. Runtime of my reference
querywas reduced from 55 seconds to ~20seconds using 4 parallel connections. Not what I had hoped but certainly a
significantimprovement. 
>>My approach is fairly easy:
>>db_link_send_query('conn1', 'statement based on partitioning field');
>>db_link_send_query('conn2', 'statement based on partitioning field');
>>
>>...
>>SELECT
>>dblink_get_result('conn1')
>>UNION ALL
>>dblink_get_result('conn2')
>>
>>...
>
>
> Hello,
>
> I'm using the db_link approach too, but rather than using dblink_get_result, I store the result in an additional
table.
> This was faster, at least for my use case and at the time when I implemented that solution...
>
> something like
>
> db_link_send_query('conn0', 'CREATE UNLOGGED TABLE  my_result...
> db_link_send_query('conn1', 'INSERT INTO my_result  statement based on partitioning field');
> db_link_send_query('conn2', 'INSERT INTO my_result  statement based on partitioning field');
> ...
> select * from my_result
>
> regards,
>
> Marc Mamin
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general