Re: Question about optimising (Postgres_)FDW - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Question about optimising (Postgres_)FDW |
Date | |
Msg-id | 534E6B0C.6080403@lab.ntt.co.jp |
In response to | Question about optimising (Postgres_)FDW (Hannu Krosing <hannu@2ndQuadrant.com>) |
Responses | Re: Question about optimising (Postgres_)FDW |
List | pgsql-hackers |
(2014/04/16 6:55), Hannu Krosing wrote:
> ----------------------------------
> CREATE EXTENSION postgres_fdw;
>
> CREATE SERVER loop foreign data wrapper postgres_fdw
>   OPTIONS (port '5432', dbname 'testdb');
>
> CREATE USER MAPPING FOR PUBLIC SERVER loop;
>
> create table onemillion (
>     id serial primary key,
>     inserted timestamp default clock_timestamp(),
>     data text
> );
>
> insert into onemillion(data) select random() from
> generate_series(1,1000000);
>
> CREATE FOREIGN TABLE onemillion_pgfdw (
>     id int,
>     inserted timestamp,
>     data text
> ) SERVER loop
>   OPTIONS (table_name 'onemillion',
>            use_remote_estimate 'true');
>
> testdb=# explain analyse
> select * from onemillion_pgfdw where id in (select id from onemillion
> where data > '0.9' limit 100);
>                                                      QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=122.49..10871.06 rows=500000 width=44) (actual time=4.269..93.444 rows=100 loops=1)
>    ->  HashAggregate  (cost=22.06..23.06 rows=100 width=4) (actual time=1.110..1.263 rows=100 loops=1)
>          ->  Limit  (cost=0.00..20.81 rows=100 width=4) (actual time=0.038..1.026 rows=100 loops=1)
>                ->  Seq Scan on onemillion  (cost=0.00..20834.00 rows=100115 width=4) (actual time=0.036..0.984 rows=100 loops=1)
>                      Filter: (data > '0.9'::text)
>                      Rows Removed by Filter: 805
>    ->  Foreign Scan on onemillion_pgfdw  (cost=100.43..108.47 rows=1 width=29) (actual time=0.772..0.773 rows=1 loops=100)
>  Total runtime: 93.820 ms
> (8 rows)
>
> Time: 97.283 ms
> ------------------------------
>
> ... actually performs 100 distinct "SELECT * FROM onemillion WHERE id =
> $1" calls on the "remote" side.

Maybe I'm missing something, but I think you can do what you'd like to do with the following procedure:

postgres=# ALTER SERVER loop OPTIONS (ADD fdw_startup_cost '1000');
ALTER SERVER
postgres=# EXPLAIN VERBOSE SELECT * FROM onemillion_pgsql WHERE id in
(SELECT id FROM onemillion WHERE data > '0.9' LIMIT 100);
                                           QUERY PLAN
-----------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=1023.10..41983.21 rows=100 width=30)
   Output: onemillion_pgsql.id, onemillion_pgsql.inserted, onemillion_pgsql.data
   Hash Cond: (onemillion_pgsql.id = onemillion.id)
   ->  Foreign Scan on public.onemillion_pgsql  (cost=1000.00..39334.00 rows=1000000 width=29)
         Output: onemillion_pgsql.id, onemillion_pgsql.inserted, onemillion_pgsql.data
         Remote SQL: SELECT id, inserted, data FROM public.onemillion
   ->  Hash  (cost=21.85..21.85 rows=100 width=4)
         Output: onemillion.id
         ->  Limit  (cost=0.00..20.85 rows=100 width=4)
               Output: onemillion.id
               ->  Seq Scan on public.onemillion  (cost=0.00..20834.00 rows=99918 width=4)
                     Output: onemillion.id
                     Filter: (onemillion.data > '0.9'::text)
 Planning time: 0.690 ms
(14 rows)
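If you experiment with fdw_startup_cost like this, it can help to check which options are currently attached to the server and to drop the override again afterwards. A minimal sketch, assuming the "loop" server created above; pg_foreign_server is the system catalog that stores server options:

-- Show the options currently attached to the foreign server;
-- srvoptions is a text[] of "keyword=value" pairs.
SELECT srvname, srvoptions
FROM pg_foreign_server
WHERE srvname = 'loop';

-- Remove the experimental cost override when the test is done.
ALTER SERVER loop OPTIONS (DROP fdw_startup_cost);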
Or, as Tom mentioned, by disabling the use_remote_estimate option:

postgres=# ALTER FOREIGN TABLE onemillion_pgsql OPTIONS (SET use_remote_estimate 'false');
ALTER FOREIGN TABLE
postgres=# EXPLAIN VERBOSE SELECT * FROM onemillion_pgsql WHERE id in
(SELECT id FROM onemillion WHERE data > '0.9' LIMIT 100);
                                           QUERY PLAN
----------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=123.10..41083.21 rows=100 width=30)
   Output: onemillion_pgsql.id, onemillion_pgsql.inserted, onemillion_pgsql.data
   Hash Cond: (onemillion_pgsql.id = onemillion.id)
   ->  Foreign Scan on public.onemillion_pgsql  (cost=100.00..38434.00 rows=1000000 width=30)
         Output: onemillion_pgsql.id, onemillion_pgsql.inserted, onemillion_pgsql.data
         Remote SQL: SELECT id, inserted, data FROM public.onemillion
   ->  Hash  (cost=21.85..21.85 rows=100 width=4)
         Output: onemillion.id
         ->  Limit  (cost=0.00..20.85 rows=100 width=4)
               Output: onemillion.id
               ->  Seq Scan on public.onemillion  (cost=0.00..20834.00 rows=99918 width=4)
                     Output: onemillion.id
                     Filter: (onemillion.data > '0.9'::text)
 Planning time: 0.215 ms
(14 rows)
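Note that use_remote_estimate can be set at the server level as well as per table, and a table-level setting overrides the server-level one. A minimal sketch, assuming the same "loop" server and the onemillion_pgsql foreign table used above:

-- Disable remote estimates for every foreign table on this server at once;
-- a per-table setting, where present, still takes precedence.
ALTER SERVER loop OPTIONS (ADD use_remote_estimate 'false');

-- Inspect the per-table options stored in the catalog;
-- ftoptions is a text[] of "keyword=value" pairs.
SELECT ftrelid::regclass AS foreign_table, ftoptions
FROM pg_foreign_table
WHERE ftrelid = 'onemillion_pgsql'::regclass;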
Thanks,

Best regards,
Etsuro Fujita