Re: Optimization for updating foreign tables in Postgres FDW - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Optimization for updating foreign tables in Postgres FDW |
Date | |
Msg-id | 53D2078B.5020603@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Optimization for updating foreign tables in Postgres FDW (Shigeru Hanada <shigeru.hanada@gmail.com>) |
Responses |
Re: Optimization for updating foreign tables in Postgres FDW
|
List | pgsql-hackers |
(2014/07/24 18:30), Shigeru Hanada wrote: > My coworker Takaaki EITOKU reviewed the patch, and here are some > comments from him. Thank you for the review, Eitoku-san! > 2014-07-08 16:07 GMT+09:00 Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>: >> In the patch postgresPlanForeignModify has been modified so that if, in >> addition to the above condition, the followings are satisfied, then the >> ForeignScan and ModifyTable node will work that way. >> >> - There are no local BEFORE/AFTER triggers. > > The reason the check ignores INSTEAD OF triggers here is that INSTEAD > OF trigger would prevent executing UPDATE/DELETE statement against a > foreign tables at all, right? I'm not sure that I understand your question correctly, but the reason for that is because foreign tables cannot have INSTEAD OF triggers. >> - In UPDATE it's safe to evaluate expressions to assign to the target >> columns on the remote server. > > IIUC, in addition to target expressions, whole WHERE clause should be > safe to be pushed-down. Though that is obviously required, but > mentioning that in documents and source comments would help users and > developers. OK, I'll add the comments and documentation notes. (I intentionaly didn't mention that because I think the comment for postgresPlanForeignModify has already said the equivalent condition, ie, "there are no local conditions", which means all conditions are executed remotely.) > We found that this patch speeds up DELETE case remarkably, as you > describe above, but we saw only less than 2x speed on UPDATE cases. > Do you have any numbers of UPDATE cases? Here is the result for an UPDATE case. >> On remote side:>> postgres=# create table t (id serial primary key, inserted timestamp>> default clock_timestamp(), datatext);>> CREATE TABLE>> postgres=# insert into t(data) select random() from generate_series(0,>> 99999);>> INSERT 0 100000>>postgres=# vacuum t;>> VACUUM>>>> On local side:>> postgres=# create foreign table ft (id integer, inserted timestamp,data>> text) server myserver options (table_name 't');>> CREATE FOREIGN TABLE Unpatched: postgres=# explain analyze verbose update ft set data = 'notsolongtext' where id < 10000; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Updateon public.ft (cost=100.00..147.38 rows=650 width=18) (actual time=1463.944..1463.944 rows=0 loops=1) Remote SQL: UPDATE public.t SET data = $2 WHERE ctid = $1 -> Foreign Scan onpublic.ft (cost=100.00..147.38 rows=650 width=18) (actual time=2.069..83.220 rows=9999 loops=1) Output: id, inserted, 'notsolongtext'::text, ctid Remote SQL: SELECT id, inserted, ctid FROM public.t WHERE ((id < 10000)) FOR UPDATE Planning time: 0.355 ms Execution time: 1470.032 ms (7 rows) Patched: postgres=# explain analyze verbose update ft set data = 'notsolongtext' where id < 10000; QUERY PLAN -------------------------------------------------------------------------------------------------------------------- Updateon public.ft (cost=100.00..147.38 rows=650 width=18) (actual time=0.005..0.005 rows=0 loops=1) -> Foreign Scan on public.ft (cost=100.00..147.38 rows=650 width=18) (actual time=0.001..0.001 rows=0 loops=1) Output: id, inserted, data, ctid Remote SQL: UPDATE public.tSET data = 'notsolongtext'::text WHERE ((id < 10000)) Planning time: 0.197 ms Execution time: 61.519 ms (6 rows) I think that the precise effect of this optimization for DELETE/UPDATE would depend on eg, data, queries (inc. w/ or w/o RETRUNING clauses) and server/network performance. Could you tell me these information about the UPDATE evaluation? > Some more random thoughts: > > * Naming of new behavior > You named this optimization "Direct Update", but I'm not sure that > this is intuitive enough to express this behavior. I would like to > hear opinions of native speakers. +1 > * Macros for # of fdw_private elements > In postgres_fdw.c you defined MaxFdwScanFdwPrivateLength and > MinFdwScanFdwPrivateLength for determining the mode, but number of > fdw_private elements is not a ranged value but an enum value (I mean > that fdw_private holds 3 or 7, not 4~6, so Max and Min seems > inappropriate for prefix. IMO those macros should be named so that > the names represent the behavior, or define a macro to determine the > mode, say IS_SHIPPABLE(foo). OK, Will fix. > * Comparison of Macros > Comparison against MaxFdwScanFdwPrivateLength and > MinFdwScanFdwPrivateLength should be == instead of >= or <= to detect > unexpected value. Adding assert macro seems good too. Will fix this too. Thanks, Best regards, Etsuro Fujita
pgsql-hackers by date: