Thread: [POC] FETCH limited by bytes.
Hello. As in the discussion on async fetching in postgres_fdw, a FETCH with a data-size limitation would be useful for keeping the memory usage of postgres_fdw stable. Could such a feature and syntax be acceptable?

==

Postgres_fdw fetches tuples from remote servers using a cursor. The transfer gets faster as the number of fetches decreases; on the other hand, the buffer size for the fetched tuples varies widely with their average length. 100 tuples per fetch is quite small for short tuples, but a larger fetch size can easily cause memory exhaustion, and there is no way to know that in advance. One means to settle the contradiction would be a FETCH that limits the result by size, not by the number of tuples. So I'd like to propose this. This patch is a POC for the feature. For example,

  FETCH 10000 LIMIT 1000000 FROM c1;

This FETCH retrieves up to 10000 tuples, but cuts off just after the total tuple length exceeds 1MB. (It does not literally "LIMIT" in that sense.)

The syntax added by this patch is described as follows:

  FETCH [FORWARD|BACKWARD] <ALL|SignedIconst> LIMIT Iconst [FROM|IN] curname

The "data size" compared against the LIMIT size is the sum of the results of the following expression; its appropriateness is arguable.

  [if the tuple slot has tts_tuple]
    HEAPTUPLESIZE + slot->tts_tuple->t_len
  [else]
    HEAPTUPLESIZE + heap_compute_data_size(slot->tts_tupleDescriptor,
                                           slot->tts_values, slot->tts_isnull);

========================

This patch makes the following changes:

- Adds the parameter "size" to the following functions:
  (standard_)ExecutorRun / ExecutePlan / RunFromStore
  PortalRun / PortalRunSelect / PortalRunFetch / DoPortalRunFetch

- The core is in standard_ExecutorRun and RunFromStore: simply sum up the length of the tuples sent and compare it against the given limit.

- struct FetchStmt and EState have a new member.

- The modifications in gram.y affect the ecpg parser. I think I fixed them, but with no confidence :(

- The corresponding parts of auto_explain and pg_stat_statements are modified, but only in their parameter lists.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
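To make the accounting rule above concrete, here is a minimal sketch of it as a helper function (illustrative only, not the POC patch itself; it assumes the 9.4-era TupleTableSlot layout with a tts_tuple field):

  #include "postgres.h"
  #include "access/htup_details.h"
  #include "executor/tuptable.h"

  /*
   * Rough per-tuple size accounting as described above (illustrative;
   * the POC patch does the equivalent inside the executor's send loop).
   */
  static Size
  sent_tuple_size(TupleTableSlot *slot)
  {
      if (slot->tts_tuple != NULL)
          return HEAPTUPLESIZE + slot->tts_tuple->t_len;

      /* virtual tuple: estimate from the already-deformed columns */
      return HEAPTUPLESIZE +
          heap_compute_data_size(slot->tts_tupleDescriptor,
                                 slot->tts_values, slot->tts_isnull);
  }

  /* In the ExecutePlan loop, roughly:
   *     sent_size += sent_tuple_size(slot);
   *     if (size_limit > 0 && sent_size >= size_limit)
   *         break;      -- cut off just after exceeding the limit
   */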
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> Hello. As in the discussion on async fetching in postgres_fdw, a FETCH
> with a data-size limitation would be useful for keeping the memory
> usage of postgres_fdw stable.

> Could such a feature and syntax be acceptable?

This seems like a lot of work, and frankly an incredibly ugly API, for a benefit that is entirely hypothetical. Have you got numbers showing any actual performance win for postgres_fdw?

Even if we wanted to do something like this, I strongly object to measuring size by heap_compute_data_size. That's not a number that users would normally have any direct knowledge of; nor does it have anything at all to do with the claimed use-case, where what you'd really need to measure is bytes transmitted down the wire. (The difference is not small: for instance, toasted values would likely still be toasted at the point where you're measuring.)

regards, tom lane
Thank you for the comment.

The automatic way to determine the fetch_size seems to have become too much for the purpose. An example of a non-automatic way is a new foreign table option like 'fetch_size', but this exposes the insides too much... Which do you think is preferable?

Thu, 22 Jan 2015 11:17:52 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in <24503.1421943472@sss.pgh.pa.us>
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> > Hello. As in the discussion on async fetching in postgres_fdw, a FETCH
> > with a data-size limitation would be useful for keeping the memory
> > usage of postgres_fdw stable.
>
> > Could such a feature and syntax be acceptable?
>
> This seems like a lot of work, and frankly an incredibly ugly API,
> for a benefit that is entirely hypothetical. Have you got numbers
> showing any actual performance win for postgres_fdw?

The API is rush work to make a path for the new parameter (and, yes, I did too much for what postgres_fdw needs...); it could be any saner syntax, but it's not the time for that yet.

A data-size limitation, whatever the limit, would give a significant gain, especially for small rows. This patch began from the fact that the transfer runs about twice as fast with fetch size = 10000 as with 100.

http://www.postgresql.org/message-id/20150116.171849.109146500.horiguchi.kyotaro@lab.ntt.co.jp

I took execution times for fetching 1M rows from localhost via postgres_fdw, with the following numbers.

=# SELECT a from ft1;
fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
(local) 0.75s
100 60 6.2s 6000 (0.006)
10000 60 2.7s 600000 (0.6 )
33333 60 2.2s 1999980 (2.0 )
66666 60 2.4s 3999960 (4.0 )

=# SELECT a, b, c from ft1;
fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
(local) 0.8s
100 204 12 s 20400 (0.02 )
1000 204 10 s 204000 (0.2 )
10000 204 5.8s 2040000 (2 )
20000 204 5.9s 4080000 (4 )

=# SELECT a, b, d from ft1;
fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
(local) 0.8s
100 1356 17 s 135600 (0.136)
1000 1356 15 s 1356000 (1.356)
1475 1356 13 s 2000100 (2.0 )
2950 1356 13 s 4000200 (4.0 )

The definitions of the environment are the following.

CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres');
CREATE USER MAPPING FOR PUBLIC SERVER sv1;
CREATE TABLE lt1 (a int, b timestamp, c text, d text);
CREATE FOREIGN TABLE ft1 (a int, b timestamp, c text, d text) SERVER sv1 OPTIONS (table_name 'lt1');
INSERT INTO lt1 (SELECT a, now(), repeat('x', 128), repeat('x', 1280) FROM generate_series(0, 999999) a);

The "avg row size" is alloced_mem/fetch_size, where alloced_mem is the sum of HeapTuple[fetch_size] and (HEAPTUPLESIZE + tup->t_len) for all tuples stored on the receiver side, in fetch_more_data() in postgres_fdw.

That is about a 50% gain for the smaller tuple sizes and 25% for the larger. They look to be optimal where alloced_mem is around 2MB, for reasons unknown to me. Anyway, the difference seems significant.

> Even if we wanted to do something like this, I strongly object to
> measuring size by heap_compute_data_size. That's not a number that users
> would normally have any direct knowledge of; nor does it have anything
> at all to do with the claimed use-case, where what you'd really need to
> measure is bytes transmitted down the wire. (The difference is not small:
> for instance, toasted values would likely still be toasted at the point
> where you're measuring.)

Sure. So, attached is patch #1, which does the following things:

- The sender limits the number of tuples using the sum of the net length of the column values to be sent, not including protocol overhead. It is calculated in the added function slot_compute_attr_size(), using the raw length for compressed values.

- postgres_fdw calculates the fetch limit in bytes by the following formula:

  MAX_FETCH_MEM - MAX_FETCH_SIZE * (estimated overhead per tuple);

The result of the patch is as follows, with MAX_FETCH_MEM = 2MiB and MAX_FETCH_SIZE = 30000.

fetch_size, avg row size(*1), time, max alloced_mem/fetch(Mbytes)
(auto) 60 2.4s 1080000 ( 1.08)
(auto) 204 7.3s 536400 ( 0.54)
(auto) 1356 15 s 430236 ( 0.43)

This is meaningfully fast, but the patch looks too big, and the meaning of the new parameter is hard to understand.. :(

On the other hand, the cause of the variation in alloced_mem shown above is per-tuple overhead, the total of which is unknown before execution. The second patch makes FETCH accept the tuple overhead in bytes. The result seems pretty good, but I think this might be too specialized to this usage.

MAX_FETCH_SIZE = 30000 and MAX_FETCH_MEM = 2MiB,
max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
30000 60 2.3s 1080000 ( 1.0)
9932 204 5.7s 1787760 ( 1.7)
1376 1356 13 s 1847484 ( 1.8)

MAX_FETCH_SIZE = 25000 and MAX_FETCH_MEM = 1MiB,
max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
25000 60 2.4s 900000 ( 0.86)
4358 204 6.6s 816840 ( 0.78)
634 1356 16 s 844488 ( 0.81)

MAX_FETCH_SIZE = 10000 and MAX_FETCH_MEM = 0.5MiB,
max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
10000 60 2.8s 360000 ( 0.35)
2376 204 7.8s 427680 ( 0.41)
332 1356 17 s 442224 ( 0.42)

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
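To make the formula above concrete, a minimal sketch of the derivation (MAX_FETCH_MEM and MAX_FETCH_SIZE are the constants named in the mail; the function name and the clamping are illustrative assumptions, not from the attached patch):

  #define MAX_FETCH_MEM   (2 * 1024 * 1024)   /* memory budget per fetch */
  #define MAX_FETCH_SIZE  30000               /* upper bound on tuple count */

  static long
  fetch_limit_bytes(long est_overhead_per_tuple)
  {
      long    limit = MAX_FETCH_MEM - MAX_FETCH_SIZE * est_overhead_per_tuple;

      /* never ask the remote side for a non-positive byte limit */
      return (limit > 0) ? limit : 1;
  }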
Last year I was working on a patch to postgres_fdw where the fetch_size could be set at the table level and the server level.
I was able to get the settings parsed and they would show up in pg_foreign_table and pg_foreign_server. Unfortunately, I'm not very familiar with how foreign data wrappers work, so I wasn't able to figure out how to get these custom values passed from the PgFdwRelationInfo struct into the query's PgFdwScanState struct.
I bring this up only because it might be a simpler solution, in that the table designer could set the fetch size very high for narrow tables, and lower or default for wider tables. It's also a very clean syntax, just another option on the table and/or server creation.
My incomplete patch is attached.
On Tue, Jan 27, 2015 at 4:24 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Thank you for the comment.
The automatic way to determine the fetch_size seems to have become too
much for the purpose. An example of a non-automatic way is a new
foreign table option like 'fetch_size', but this exposes the
insides too much... Which do you think is preferable?
Thu, 22 Jan 2015 11:17:52 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in <24503.1421943472@sss.pgh.pa.us>
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> > Hello. As in the discussion on async fetching in postgres_fdw, a FETCH
> > with a data-size limitation would be useful for keeping the memory
> > usage of postgres_fdw stable.
>
> > Could such a feature and syntax be acceptable?
>
> This seems like a lot of work, and frankly an incredibly ugly API,
> for a benefit that is entirely hypothetical. Have you got numbers
> showing any actual performance win for postgres_fdw?
The API is rush work to make a path for the new parameter
(and, yes, I did too much for what postgres_fdw needs...); it
could be any saner syntax, but it's not the time for that yet.
A data-size limitation, whatever the limit, would give a
significant gain, especially for small rows.
This patch began from the fact that the transfer runs about
twice as fast with fetch size = 10000 as with 100.
http://www.postgresql.org/message-id/20150116.171849.109146500.horiguchi.kyotaro@lab.ntt.co.jp
I took execution times for fetching 1M rows from localhost via
postgres_fdw, with the following numbers.
=# SELECT a from ft1;
fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
(local) 0.75s
100 60 6.2s 6000 (0.006)
10000 60 2.7s 600000 (0.6 )
33333 60 2.2s 1999980 (2.0 )
66666 60 2.4s 3999960 (4.0 )
=# SELECT a, b, c from ft1;
fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
(local) 0.8s
100 204 12 s 20400 (0.02 )
1000 204 10 s 204000 (0.2 )
10000 204 5.8s 2040000 (2 )
20000 204 5.9s 4080000 (4 )
=# SELECT a, b, d from ft1;
fetch_size, avg row size(*1), time, alloced_mem/fetch(Mbytes)(*1)
(local) 0.8s
100 1356 17 s 135600 (0.136)
1000 1356 15 s 1356000 (1.356)
1475 1356 13 s 2000100 (2.0 )
2950 1356 13 s 4000200 (4.0 )
The definitions of the environment are the following.
CREATE SERVER sv1 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'postgres');
CREATE USER MAPPING FOR PUBLIC SERVER sv1;
CREATE TABLE lt1 (a int, b timestamp, c text, d text);
CREATE FOREIGN TABLE ft1 (a int, b timestamp, c text, d text) SERVER sv1 OPTIONS (table_name 'lt1');
INSERT INTO lt1 (SELECT a, now(), repeat('x', 128), repeat('x', 1280) FROM generate_series(0, 999999) a);
The "avg row size" is alloced_mem/fetch_size, where alloced_mem
is the sum of HeapTuple[fetch_size] and (HEAPTUPLESIZE +
tup->t_len) for all tuples stored on the receiver side, in
fetch_more_data() in postgres_fdw.
That is about a 50% gain for the smaller tuple sizes and 25% for
the larger. They look to be optimal where alloced_mem is around
2MB, for reasons unknown to me. Anyway, the difference seems
significant.
> Even if we wanted to do something like this, I strongly object to
> measuring size by heap_compute_data_size. That's not a number that users
> would normally have any direct knowledge of; nor does it have anything
> at all to do with the claimed use-case, where what you'd really need to
> measure is bytes transmitted down the wire. (The difference is not small:
> for instance, toasted values would likely still be toasted at the point
> where you're measuring.)
Sure. So, attached is patch #1, which does the following
things:
- The sender limits the number of tuples using the sum of the net
length of the column values to be sent, not including protocol
overhead. It is calculated in the added function
slot_compute_attr_size(), using the raw length for compressed
values.
- postgres_fdw calculates the fetch limit in bytes by the following
formula:
MAX_FETCH_MEM - MAX_FETCH_SIZE * (estimated overhead per tuple);
The result of the patch is as follows. MAX_FETCH_MEM = 2MiB and
MAX_FETCH_SIZE = 30000.
fetch_size, avg row size(*1), time, max alloced_mem/fetch(Mbytes)
(auto) 60 2.4s 1080000 ( 1.08)
(auto) 204 7.3s 536400 ( 0.54)
(auto) 1356 15 s 430236 ( 0.43)
This is meaningfully fast, but the patch looks too big, and the
meaning of the new parameter is hard to understand.. :(
On the other hand, the cause of the variation in alloced_mem
shown above is per-tuple overhead, the total of which is unknown
before execution. The second patch makes FETCH accept the tuple
overhead in bytes. The result seems pretty good, but I think this
might be too specialized to this usage.
MAX_FETCH_SIZE = 30000 and MAX_FETCH_MEM = 2MiB,
max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
30000 60 2.3s 1080000 ( 1.0)
9932 204 5.7s 1787760 ( 1.7)
1376 1356 13 s 1847484 ( 1.8)
MAX_FETCH_SIZE = 25000 and MAX_FETCH_MEM = 1MiB,
max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
25000 60 2.4s 900000 ( 0.86)
4358 204 6.6s 816840 ( 0.78)
634 1356 16 s 844488 ( 0.81)
MAX_FETCH_SIZE = 10000 and MAX_FETCH_MEM = 0.5MiB,
max_fetch_size, avg row size(*1), time, max alloced_mem/fetch(MiBytes)
10000 60 2.8s 360000 ( 0.35)
2376 204 7.8s 427680 ( 0.41)
332 1356 17 s 442224 ( 0.42)
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hmm, somehow I removed some recipients, especially the list. Sorry for the duplicate.

-----

Sorry, I'm back now. Thank you for the comment.

> Do you have any insight into where I would pass the custom row fetches from
> the table struct to the scan struct?

Yeah, it's one simple way to tune it, if the user knows the appropriate value.

> Last year I was working on a patch to postgres_fdw where the fetch_size
> could be set at the table level and the server level.
>
> I was able to get the settings parsed and they would show up in
> pg_foreign_table
> and pg_foreign_server. Unfortunately, I'm not very familiar with how
> foreign data wrappers work, so I wasn't able to figure out how to get these
> custom values passed from the PgFdwRelationInfo struct into the
> query's PgFdwScanState
> struct.

Directly answering: the states that need to be shared among several stages are held within fdw_private. Your new variable fpinfo->fetch_size can be read in postgresGetForeignPlan, which newly creates another fdw_private. You can pass your values to the ForeignPlan by making it hold the value there. Finally, you get it in postgresBeginForeignScan and can set it into PgFdwScanState.

> I bring this up only because it might be a simpler solution, in that the
> table designer could set the fetch size very high for narrow tables, and
> lower or default for wider tables. It's also a very clean syntax, just
> another option on the table and/or server creation.
>
> My incomplete patch is attached.

However, the fetch_size is not needed by the planner (so far), so we can simply read the options in postgresBeginForeignScan() and set them in PgFdwScanState. This runs once per execution.

Finally, the attached patch will work as you intended.

What do you think about this?

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
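A minimal sketch of the fdw_private hand-off described above (the list positions are illustrative; postgres_fdw's real fdw_private list carries other members, and fsplan stands for the ForeignScan node fetched from the scan state):

  /* In postgresGetForeignPlan(): stash fetch_size alongside the remote SQL. */
  fdw_private = list_make2(makeString(sql.data),
                           makeInteger(fpinfo->fetch_size));

  /* In postgresBeginForeignScan(), with
   * fsplan = (ForeignScan *) node->ss.ps.plan: */
  fsstate->fetch_size = intVal(list_nth(fsplan->fdw_private, 1));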
I applied this patch to REL9_4_STABLE, and I was able to connect to a foreign database (redshift, actually).
the basic outline of the test is below, names changed to protect my employment.
create extension if not exists postgres_fdw;

create server redshift_server foreign data wrapper postgres_fdw
options ( host 'some.hostname.ext', port '5439', dbname 'redacted', fetch_size '150' );

create user mapping for public server redshift_server options ( user 'redacted_user', password 'comeonyouarekiddingright' );

create foreign table redshift_tab150 ( <colspecs> )
server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema' );

create foreign table redshift_tab151 ( <colspecs> )
server redshift_server options (table_name 'redacted_table', schema_name 'redacted_schema', fetch_size '151' );

-- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight.
select count(*) from redshift_ccp150;

-- i don't expect the fdw to push the aggregate, this is just a test to see what query shows up in stv_inflight.
select count(*) from redshift_ccp151;
For those of you that aren't familiar with Redshift, it's a columnar database that seems to be a fork of postgres 8.cough. You can connect to it with modern libpq programs (psql, psycopg2, etc).
Redshift has a table, stv_inflight, which serves about the same purpose as pg_stat_activity. Redshift seems to perform better with very high fetch sizes (10,000 is a good start), so the default foreign data wrapper didn't perform so well.
I was able to confirm that the first query showed "FETCH 150 FROM c1" as the query, which is normally highly unhelpful, but in this case it confirms that tables created in redshift_server do by default use the fetch_size option given during server creation.
I was also able to confirm that the second query showed "FETCH 151 FROM c1" as the query, which shows that per-table overrides also work.
I'd be happy to stop here, but Kyotaro might feel differently. With this limited patch, one could estimate the number of rows that would fit into the desired memory block based on the row width and set fetch_size accordingly.
But we could go further, and have a fetch_max_memory option only at the table level, and the fdw could do that same memory / estimated_row_size calculation and derive fetch_size that way at table creation time, not query time.
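In code, that derivation might look roughly like this (a sketch with hypothetical names, not part of any posted patch):

  static int
  derive_fetch_size(Size fetch_max_memory, Size estimated_row_width)
  {
      Size    rows = fetch_max_memory / Max(estimated_row_width, 1);

      /* clamp: always fetch at least one row, and keep the count sane */
      return (int) Min(Max(rows, 1), 100000);
  }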
Thanks to Kyotaro and Bruce Momjian for their help.
On Mon, Feb 2, 2015 at 2:27 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hmm, somehow I removed some recipients, especially the
list. Sorry for the duplicate.
-----
Sorry, I'm back now. Thank you for the comment.
> Do you have any insight into where I would pass the custom row fetches from
> the table struct to the scan struct?
Yeah it's one simple way to tune it, if the user knows the
appropriate value.
> Last year I was working on a patch to postgres_fdw where the fetch_size
> could be set at the table level and the server level.
>
> I was able to get the settings parsed and they would show up in
> pg_foreign_table
> and pg_foreign_server. Unfortunately, I'm not very familiar with how
> foreign data wrappers work, so I wasn't able to figure out how to get these
> custom values passed from the PgFdwRelationInfo struct into the
> query's PgFdwScanState
> struct.
Directly answering: the states that need to be shared among several
stages are held within fdw_private. Your new variable
fpinfo->fetch_size can be read in postgresGetForeignPlan, which
newly creates another fdw_private. You can pass your values to
the ForeignPlan by making it hold the value there. Finally, you get
it in postgresBeginForeignScan and can set it into
PgFdwScanState.
> I bring this up only because it might be a simpler solution, in that the
> table designer could set the fetch size very high for narrow tables, and
> lower or default for wider tables. It's also a very clean syntax, just
> another option on the table and/or server creation.
>
> My incomplete patch is attached.
However, the fetch_size is not needed by the planner (so far), so we
can simply read the options in postgresBeginForeignScan() and set
them in PgFdwScanState. This runs once per execution.
Finally, the attached patch will work as you intended.
What do you think about this?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello,

> Redshift has a table, stv_inflight, which serves about the same purpose as
> pg_stat_activity. Redshift seems to perform better with very high fetch
> sizes (10,000 is a good start), so the default foreign data wrapper didn't
> perform so well.

I agree with you.

> I was able to confirm that the first query showed "FETCH 150 FROM c1" as
> the query, which is normally highly unhelpful, but in this case it confirms
> that tables created in redshift_server do by default use the fetch_size
> option given during server creation.
>
> I was also able to confirm that the second query showed "FETCH 151 FROM c1"
> as the query, which shows that per-table overrides also work.
>
> I'd be happy to stop here, but Kyotaro might feel differently.

This is enough in its own way, of course.

> With this
> limited patch, one could estimate the number of rows that would fit into
> the desired memory block based on the row width and set fetch_size
> accordingly.

The users, including me, will be happy with it as long as they know how to determine the fetch size, especially when the remote tables are very few or the configuration is stable enough. On widely distributed systems it would be far more difficult to tune the fetch size manually for every foreign table, so it would finally be left at the default, safe size, which is 100 or so. This is similar to the discussion about max_wal_size on another thread. Calculating a fetch size is far tougher for users than setting an expected memory usage, I think.

> But we could go further, and have a fetch_max_memory option only at the
> table level, and the fdw could do that same memory / estimated_row_size
> calculation and derive fetch_size that way *at table creation time*, not
> query time.

We cannot know the real length of text-type data in advance. Beyond that, even for a column defined as char(n), the n is the theoretical (in-design) maximum size for the field, but in most cases the mean length of the real data will be far smaller than that. For that reason, calculating the ratio at table creation time seems difficult.

However, I agree with Tom's suggestion that the changes to the FETCH statement are definitely ugly; especially the "overhead" argument is prohibitive even for me :(

> Thanks to Kyotaro and Bruce Momjian for their help.

Not at all.

regards,

At Wed, 4 Feb 2015 18:06:02 -0500, Corey Huinker <corey.huinker@gmail.com> wrote in <CADkLM=eTpKYX5VOfjLr0VvfXhEZbC2UeakN=P6MXMg7S86Cdqw@mail.gmail.com>

--
Kyotaro Horiguchi
NTT Open Source Software Center
Attached is a diff containing the original (working) patch derived from my (incomplete) patch, plus regression test changes and documentation changes.
While it's easy to regression-test the persistence of the fetch_size options, I am confounded as to how we would show that the fetch_size setting was respected. I've seen it with my own eyes viewing the query log on redshift, but I see no way to automate that. Suggestions welcome.
On Fri, Feb 6, 2015 at 3:11 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,
> Redshift has a table, stv_inflight, which serves about the same purpose as
> pg_stat_activity. Redshift seems to perform better with very high fetch
> sizes (10,000 is a good start), so the default foreign data wrapper didn't
> perform so well.
I agree with you.
> I was able to confirm that the first query showed "FETCH 150 FROM c1" as
> the query, which is normally highly unhelpful, but in this case it confirms
> that tables created in redshift_server do by default use the fetch_size
> option given during server creation.
>
> I was also able to confirm that the second query showed "FETCH 151 FROM c1"
> as the query, which shows that per-table overrides also work.
>
> I'd be happy to stop here, but Kyotaro might feel differently.
This is enough in its own way, of course.
> With this
> limited patch, one could estimate the number of rows that would fit into
> the desired memory block based on the row width and set fetch_size
> accordingly.
The users, including me, will be happy with it as long as they know
how to determine the fetch size, especially when the remote tables
are very few or the configuration is stable enough.
On widely distributed systems it would be far more difficult to tune
the fetch size manually for every foreign table, so it would finally
be left at the default, safe size, which is 100 or so.
This is similar to the discussion about max_wal_size on another
thread. Calculating a fetch size is far tougher for users than
setting an expected memory usage, I think.
> But we could go further, and have a fetch_max_memory option only at the
> table level, and the fdw could do that same memory / estimated_row_size
> calculation and derive fetch_size that way *at table creation time*, not
> query time.
We cannot know the real length of text-type data in advance. Beyond
that, even for a column defined as char(n), the n is the theoretical
(in-design) maximum size for the field, but in most cases the mean
length of the real data will be far smaller than that. For that
reason, calculating the ratio at table creation time seems
difficult.
However, I agree with Tom's suggestion that the changes to the
FETCH statement are definitely ugly; especially the "overhead"
argument is prohibitive even for me :(
> Thanks to Kyotaro and Bruce Momjian for their help.
Not at all.
regards,
At Wed, 4 Feb 2015 18:06:02 -0500, Corey Huinker <corey.huinker@gmail.com> wrote in <CADkLM=eTpKYX5VOfjLr0VvfXhEZbC2UeakN=P6MXMg7S86Cdqw@mail.gmail.com>
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 2015-02-27 13:50:22 -0500, Corey Huinker wrote:
> +static DefElem*
> +get_option(List *options, char *optname)
> +{
> + ListCell *lc;
> +
> + foreach(lc, options)
> + {
> + DefElem *def = (DefElem *) lfirst(lc);
> +
> + if (strcmp(def->defname, optname) == 0)
> + return def;
> + }
> + return NULL;
> +}

> /*
> * Do nothing in EXPLAIN (no ANALYZE) case. node->fdw_state stays NULL.
> @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
> server = GetForeignServer(table->serverid);
> user = GetUserMapping(userid, server->serverid);
>
> + /* Reading table options */
> + fsstate->fetch_size = -1;
> +
> + def = get_option(table->options, "fetch_size");
> + if (!def)
> + def = get_option(server->options, "fetch_size");
> +
> + if (def)
> + {
> + fsstate->fetch_size = strtod(defGetString(def), NULL);
> + if (fsstate->fetch_size < 0)
> + elog(ERROR, "invalid fetch size for foreign table \"%s\"",
> + get_rel_name(table->relid));
> + }
> + else
> + fsstate->fetch_size = 100;

I don't think it's a good idea to make such checks at runtime - and either way it's something that should be reported back using an ereport(), not an elog.

Also, it seems somewhat wrong to determine this at execution time. Shouldn't this rather be done when creating the foreign scan node? And be a part of the scan state?

Have you thought about how this option should cooperate with join pushdown once implemented?

Greetings,

Andres Freund
On Wed, Sep 2, 2015 at 9:08 AM, Andres Freund <andres@anarazel.de> wrote:
On 2015-02-27 13:50:22 -0500, Corey Huinker wrote:
> +static DefElem*
> +get_option(List *options, char *optname)
> +{
> + ListCell *lc;
> +
> + foreach(lc, options)
> + {
> + DefElem *def = (DefElem *) lfirst(lc);
> +
> + if (strcmp(def->defname, optname) == 0)
> + return def;
> + }
> + return NULL;
> +}
> /*
> * Do nothing in EXPLAIN (no ANALYZE) case. node->fdw_state stays NULL.
> @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
> server = GetForeignServer(table->serverid);
> user = GetUserMapping(userid, server->serverid);
>
> + /* Reading table options */
> + fsstate->fetch_size = -1;
> +
> + def = get_option(table->options, "fetch_size");
> + if (!def)
> + def = get_option(server->options, "fetch_size");
> +
> + if (def)
> + {
> + fsstate->fetch_size = strtod(defGetString(def), NULL);
> + if (fsstate->fetch_size < 0)
> + elog(ERROR, "invalid fetch size for foreign table \"%s\"",
> + get_rel_name(table->relid));
> + }
> + else
> + fsstate->fetch_size = 100;
I don't think it's a good idea to make such checks at runtime - and
either way it's something that should be reported back using an
ereport(), not an elog.
Also, it seems somewhat wrong to determine this at execution
time. Shouldn't this rather be done when creating the foreign scan node?
And be a part of the scan state?
I agree, that was my original plan, but I wasn't familiar enough with the FDW architecture to know where the table struct and the scan struct were both exposed in the same function.
What I submitted incorporated some of Kyotaro's feedback (and working patch) to my original incomplete patch.
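For reference, the sort of DDL-time check being suggested might look roughly like this in postgres_fdw's option validator (a sketch, untested; not from the submitted patch):

  if (strcmp(def->defname, "fetch_size") == 0)
  {
      int     fetch_size = atoi(defGetString(def));

      if (fetch_size <= 0)
          ereport(ERROR,
                  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                   errmsg("%s requires a positive integer value",
                          def->defname)));
  }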
Have you thought about how this option should cooperate with join
pushdown once implemented?
I hadn't until now, but I think the only sensible thing would be to disregard table-specific settings once a second foreign table is detected, and instead consider only the server-level setting.
I suppose one could argue that if ALL the tables in the join had the same table-level setting, we should go with that, but I think that would be complicated, expensive, and generally a good argument for changing the server setting instead.
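In code, that policy might look something like this (a hypothetical sketch reusing the get_option() helper from the patch quoted above; the nrels count stands in for however join pushdown would report the number of base relations):

  static int
  resolve_fetch_size(ForeignServer *server, ForeignTable *table, int nrels)
  {
      DefElem    *def = NULL;

      if (nrels == 1 && table != NULL)
          def = get_option(table->options, "fetch_size");  /* per-table wins */
      if (def == NULL)
          def = get_option(server->options, "fetch_size"); /* server-level */

      return def ? atoi(defGetString(def)) : 100;          /* current default */
  }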
Hello,

> > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node,
> > int eflags)
...
> > > + def = get_option(table->options, "fetch_size");
>
> > I don't think it's a good idea to make such checks at runtime - and
> > either way it's something that should be reported back using an
> > ereport(), not an elog.
>
> > Also, it seems somewhat wrong to determine this at execution
> > time. Shouldn't this rather be done when creating the foreign scan node?
> > And be a part of the scan state?
>
> I agree, that was my original plan, but I wasn't familiar enough with the
> FDW architecture to know where the table struct and the scan struct were
> both exposed in the same function.
>
> What I submitted incorporated some of Kyotaro's feedback (and working
> patch) to my original incomplete patch.

Sorry, it certainly isn't a good place to do such a thing. I chose that place simply to avoid adding a new, similar member to multiple existing structs (PgFdwRelationInfo and PgFdwScanState).

Given that a new member fetch_size is added to both PgFdwRelationInfo and PgFdwScanState, I think postgresGetForeignRelSize is the best place to do this, in that it collects the basic information needed to calculate scan costs.

# Fetch sizes for foreign joins would be a future issue..

> typedef struct PgFdwRelationInfo
> {
...
+ int fetch_size; /* fetch size for this remote table */

====================

> postgresGetForeignRelSize()
> {
...
> fpinfo->table = GetForeignTable(foreigntableid);
> fpinfo->server = GetForeignServer(fpinfo->table->serverid);
>
+ def = get_option(table->options, "fetch_size");
+ ..
+ fpinfo->fetch_size = strtod(defGetString...

Also, it is doable in postgresGetForeignPlan, and doing it there removes the redundant copy of fetch_size from PgFdwRelationInfo to PgFdwScanState, but the theoretical basis would be weak.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center
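Filled out, the truncated fragment above might read as follows (an illustrative completion, not from a posted patch):

  fpinfo->table = GetForeignTable(foreigntableid);
  fpinfo->server = GetForeignServer(fpinfo->table->serverid);

  fpinfo->fetch_size = 100;                     /* default */
  def = get_option(fpinfo->table->options, "fetch_size");
  if (def == NULL)
      def = get_option(fpinfo->server->options, "fetch_size");
  if (def != NULL)
      fpinfo->fetch_size = atoi(defGetString(def));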
On Fri, Sep 4, 2015 at 2:45 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,
> > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node,
> > int eflags)
...
> > > + def = get_option(table->options, "fetch_size");
>
> > I don't think it's a good idea to make such checks at runtime - and
> > either way it's somethign that should be reported back using an
> > ereport(), not an elog.
>
> > Also, it seems somewhat wrong to determine this at execution
> > time. Shouldn't this rather be done when creating the foreign scan node?
> > And be a part of the scan state?
>
> I agree, that was my original plan, but I wasn't familiar enough with the
> FDW architecture to know where the table struct and the scan struct were
> both exposed in the same function.
>
> What I submitted incorporated some of Kyotaro's feedback (and working
> patch) to my original incomplete patch.
Sorry, it certainly isn't a good place to do such a thing. I
chose that place simply to avoid adding a similar new member to
multiple existing structs (PgFdwRelationInfo and
PgFdwScanState).
Given that a new member fetch_size is to be added to both
PgFdwRelationInfo and PgFdwScanState, I think
postgresGetForeignRelSize is the best place to set it, since that
is where the basic information needed to calculate scan costs is
collected.
# Fetch sizes for foreign joins would be a future issue..
> typedef struct PgFdwRelationInfo
> {
...
+ int fetch_size; /* fetch size for this remote table */
====================
> postgresGetForeignRelSize()
> {
...
> fpinfo->table = GetForeignTable(foreigntableid);
> fpinfo->server = GetForeignServer(fpinfo->table->serverid);
>
+ def = get_option(table->options, "fetch_size");
+ ..
+ fpinfo->fetch_size = strtod(defGetString(def), NULL);
It is also doable in postgresGetForeignPlan, and doing it there
would remove the redundant copy of fetch_size from
PgFdwRelationInfo to PgFdwScanState, but the theoretical basis
for that would be weaker.
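To be concrete, here is a rough sketch of the precedence I have in mind, with the table-level fetch_size overriding the server-level one (illustrative only, not final code):

    /* in postgresGetForeignRelSize(), after table/server lookup (sketch) */
    fpinfo->fetch_size = 100;       /* built-in default */

    /* server-level option first ... */
    def = get_option(fpinfo->server->options, "fetch_size");
    if (def)
        fpinfo->fetch_size = strtod(defGetString(def), NULL);

    /* ... then any table-level option overrides it */
    def = get_option(fpinfo->table->options, "fetch_size");
    if (def)
        fpinfo->fetch_size = strtod(defGetString(def), NULL);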
regards,
> > On 2015-02-27 13:50:22 -0500, Corey Huinker wrote:
> > > +static DefElem*
> > > +get_option(List *options, char *optname)
> > > +{
> > > + ListCell *lc;
> > > +
> > > + foreach(lc, options)
> > > + {
> > > + DefElem *def = (DefElem *) lfirst(lc);
> > > +
> > > + if (strcmp(def->defname, optname) == 0)
> > > + return def;
> > > + }
> > > + return NULL;
> > > +}
> >
> >
> > > /*
> > > * Do nothing in EXPLAIN (no ANALYZE) case. node->fdw_state stays
> > NULL.
> > > @@ -915,6 +933,23 @@ postgresBeginForeignScan(ForeignScanState *node,
> > int eflags)
> > > server = GetForeignServer(table->serverid);
> > > user = GetUserMapping(userid, server->serverid);
> > >
> > > + /* Reading table options */
> > > + fsstate->fetch_size = -1;
> > > +
> > > + def = get_option(table->options, "fetch_size");
> > > + if (!def)
> > > + def = get_option(server->options, "fetch_size");
> > > +
> > > + if (def)
> > > + {
> > > + fsstate->fetch_size = strtod(defGetString(def), NULL);
> > > + if (fsstate->fetch_size < 0)
> > > + elog(ERROR, "invalid fetch size for foreign table \"%s\"",
> > > + get_rel_name(table->relid));
> > > + }
> > > + else
> > > + fsstate->fetch_size = 100;
> >
> > I don't think it's a good idea to make such checks at runtime - and
> > either way it's something that should be reported back using an
> > ereport(), not an elog.
>
> > Also, it seems somewhat wrong to determine this at execution
> > time. Shouldn't this rather be done when creating the foreign scan node?
> > And be a part of the scan state?
>
> I agree, that was my original plan, but I wasn't familiar enough with the
> FDW architecture to know where the table struct and the scan struct were
> both exposed in the same function.
>
> What I submitted incorporated some of Kyotaro's feedback (and working
> patch) to my original incomplete patch.
>
> > Have you thought about how this option should cooperate with join
> > pushdown once implemented?
> >
>
> I hadn't until now, but I think the only sensible thing would be to
> disregard table-specific settings once a second foreign table is detected,
> and instead consider only the server-level setting.
>
> I suppose one could argue that if ALL the tables in the join had the same
> table-level setting, we should go with that, but I think that would be
> complicated, expensive, and generally a good argument for changing the
> server setting instead.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Ok, with some guidance from RhodiumToad (thanks!) I was able to get the proper RelOptInfo->Plan->Scan handoff.
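In rough outline, the handoff looks like this (a sketch of the approach rather than the exact patch text; the FdwScanPrivateFetchSize enum name is an assumption):

    /* planner side, postgresGetForeignPlan(): pack the value into fdw_private */
    fdw_private = list_make3(makeString(sql.data),
                             retrieved_attrs,
                             makeInteger(fpinfo->fetch_size));

    /* executor side, postgresBeginForeignScan(): unpack it into the scan state */
    fsstate->fetch_size = intVal(list_nth(fsplan->fdw_private,
                                          FdwScanPrivateFetchSize));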
What I don't know how to do is show that the proper fetch sizes are being used on the remote server with the resources available in the regression test. Suggestions welcome.
This patch works for my original added test cases, and works for me when connecting to a Redshift cluster that we have; the queries show up in the console like this:
FETCH 101 FROM c1
FETCH 30 FROM c1
FETCH 50 FROM c1
The (redacted) source of that test is as follows:
begin;
create extension if not exists postgres_fdw;
create server redshift foreign data wrapper postgres_fdw
  options (host 'REDACTED', port '5439', dbname 'REDACTED', fetch_size '101');
select * from pg_foreign_server;
create user mapping for public server redshift
  options ( user 'REDACTED', password 'REDACTED');
select * from pg_user_mappings;
create foreign table test_table ( date date, tval text )
  server redshift
  options (table_name 'REDACTED');
select count(*) from ( select tval from test_table where date = 'REDACTED' ) x;
alter server redshift options ( set fetch_size '30' );
select count(*) from ( select tval from test_table where date = 'REDACTED' ) x;
alter foreign table test_table options ( fetch_size '50' );
select count(*) from ( select tval from test_table where date = 'REDACTED' ) x;
rollback;
Attached is the patch / diff against current master.
Attachment
On Mon, Sep 21, 2015 at 1:52 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
> Attached is the patch / diff against current master.

I have moved this patch to next CF, there is a new patch, but no reviews.
--
Michael
On Wed, Dec 23, 2015 at 8:43 AM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Sep 21, 2015 at 1:52 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
> > Attached is the patch / diff against current master.
>
> I have moved this patch to next CF, there is a new patch, but no reviews.
> --
> Michael

That's fair. I'm still at a loss for how to show that the fetch size was respected by the remote server, suggestions welcome!
On 12/23/15 12:15 PM, Corey Huinker wrote:
> That's fair. I'm still at a loss for how to show that the fetch size was
> respected by the remote server, suggestions welcome!

A combination of repeat() and generate_series()?

I'm guessing it's not that obvious and that I'm missing something; can you elaborate?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On Wed, Dec 23, 2015 at 3:08 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 12/23/15 12:15 PM, Corey Huinker wrote:
> That's fair. I'm still at a loss for how to show that the fetch size was
> respected by the remote server, suggestions welcome!
A combination of repeat() and generate_series()?
I'm guessing it's not that obvious and that I'm missing something; can you elaborate?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
I'll try. So the basic test of whether the FDW respected the fetch limit is this:
1. create foreign server using postgres_fdw, create foreign table.
2. run a query against that table. it works. great.
3. alter server set fetch size option to 101 (or any number different from 100)
4. run same query against the table. the server side should show that the result set was fetched in 101 row chunks[1].
5. alter table set fetch size option to 102 (or any number different from 100 and the number you picked in #3)
6. run same query against the table. the server side should show that the result set was fetched in 102 row chunks[1].
The parts marked [1] are the problem... because the way I know it works is by looking at the query console on the remote Redshift cluster, where the query column reads "FETCH 101 in c1" or somesuch rather than the query text. That's great, *I* know it works, but I don't know how to capture that information from a vanilla postgres server, and I don't know if we can do the regression with a loopback connection, or if we'd need to set up a second pg instance for the regression test scaffolding.
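The closest I can get on a stock postgres remote is to watch the remote session's current query by hand, which is timing-dependent and so presumably useless for an automated test (object names assumed):

    -- on the remote server, while the foreign scan is running
    SELECT pid, query
      FROM pg_stat_activity
     WHERE query LIKE 'FETCH%';
    -- expect something like: FETCH 101 FROM c1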
Hello,

At Thu, 24 Dec 2015 18:31:42 -0500, Corey Huinker <corey.huinker@gmail.com> wrote in <CADkLM=fh+ZUEykcCDu8P0PPrOyYwLEp5OBRjKCe5O7swqDF65w@mail.gmail.com>
> On Wed, Dec 23, 2015 at 3:08 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> > On 12/23/15 12:15 PM, Corey Huinker wrote:
> > > That's fair. I'm still at a loss for how to show that the fetch size was
> > > respected by the remote server, suggestions welcome!
> >
> > A combination of repeat() and generate_series()?
> >
> > I'm guessing it's not that obvious and that I'm missing something; can you
> > elaborate?
>
> I'll try. So the basic test of whether the FDW respected the fetch limit is
> this:
>
> 1. create foreign server using postgres_fdw, create foreign table.
> 2. run a query against that table. it works. great.
> 3. alter server set fetch size option to 101 (or any number different from 100)
> 4. run same query against the table. the server side should show that the
>    result set was fetched in 101 row chunks[1].
> 5. alter table set fetch size option to 102 (or any number different from 100
>    and the number you picked in #3)
> 6. run same query against the table. the server side should show that the
>    result set was fetched in 102 row chunks[1].
>
> The parts marked [1] are the problem... because the way I know it works is
> by looking at the query console on the remote Redshift cluster, where the
> query column reads "FETCH 101 in c1" or somesuch rather than the query text.
> That's great, *I* know it works, but I don't know how to capture that
> information from a vanilla postgres server, and I don't know if we can do
> the regression with a loopback connection, or if we'd need to set up a
> second pg instance for the regression test scaffolding.

I believe it won't be needed as a regression test, but DEBUGn messages could be used to see "what was sent to the remote side". They can be shown on the console, so this is easily done in a regression test. See the documentation for client_min_messages for details. (It could receive far more messages than expected, though, and may be fragile against changes in other parts.)

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Fri, Dec 25, 2015 at 12:31 AM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello,
At Thu, 24 Dec 2015 18:31:42 -0500, Corey Huinker <corey.huinker@gmail.com> wrote in <CADkLM=fh+ZUEykcCDu8P0PPrOyYwLEp5OBRjKCe5O7swqDF65w@mail.gmail.com>
> On Wed, Dec 23, 2015 at 3:08 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>
> > On 12/23/15 12:15 PM, Corey Huinker wrote:
> >
> >> That's fair. I'm still at a loss for how to show that the fetch size was
> >> respected by the remote server, suggestions welcome!
> >>
> >
> > A combination of repeat() and generate_series()?
> >
> > I'm guessing it's not that obvious and that I'm missing something; can you
> > elaborate?
> > --
> > Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> > Experts in Analytics, Data Architecture and PostgreSQL
> > Data in Trouble? Get it in Treble! http://BlueTreble.com
> >
>
>
> I'll try. So the basic test of whether the FDW respected the fetch limit is
> this:
>
> 1. create foreign server using postgres_fdw, create foreign table.
> 2. run a query against that table. it works. great.
> 3. alter server set fetch size option to 101 (or any number different from
> 100)
> 4. run same query against the table. the server side should show that the
> result set was fetched in 101 row chunks[1].
> 5. alter table set fetch size option to 102 (or any number different from
> 100 and the number you picked in #3)
> 6. run same query against the table. the server side should show that the
> result set was fetched in 102 row chunks[1].
>
> The parts marked [1] are the problem...because the way I know it works is
> looking at the query console on the remote redshift cluster where the query
> column reads "FETCH 101 in c1" or somesuch rather than the query text.
> That's great, *I* know it works, but I don't know how capture that
> information from a vanilla postgres server, and I don't know if we can do
> the regression with a loopback connection, or if we'd need to set up a
> second pg instance for the regression test scaffolding.
I believe it won't be needed as a regression test, but DEBUGn
messages could be used to see "what was sent to the remote
side". They can be shown on the console, so this is easily done in a
regression test. See the documentation for client_min_messages
for details. (It could receive far more messages than expected,
though, and may be fragile against changes in other parts.)
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Ok, I'll see what debug-level messages reveal.
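The idea would be something along these lines in postgres_fdw's fetch_more_data() (a sketch only; exact placement and message wording to be determined):

    /* sketch: log the cursor command actually sent to the remote server */
    snprintf(sql, sizeof(sql), "FETCH %d FROM c%u",
             fsstate->fetch_size, fsstate->cursor_number);
    elog(DEBUG1, "postgres_fdw: %s", sql);

A regression script could then run under SET client_min_messages = debug1; and the emitted lines would show the fetch size actually used.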
> Have you got numbers showing any actual performance win for postgres_fdw?

For JDBC purposes, it would be nice to have the ability to ask the backend "to stop fetching if it produced more than X MiB of response data".
For a small table (4 int4 fields), having a decent fetchSize (~1000) makes result processing 7 times faster than with a fetchSize of 50 rows (14 ms -> 2 ms for 2000 rows).
Here are the measurements: [1] and [2].

Note: it is not required to precisely follow the given "max fetch bytes" limit. It would be enough just to stop after a certain amount of data was sent.
The whole point of using a limited fetch size is to avoid running out of memory at the client side.
I do not think developers care how many rows are fetched at once. If they do, they should rather use "limit X" SQL syntax.

Do you have a suggestion for such a use case?

For fixed-size data types, the JDBC driver can estimate a "max sane fetch size", however:
1) In order to know the data types, a round trip is required. This means the first fetch must be conservative, thus small queries would be penalized.
2) For variable-length types there is no way to estimate a "sane number of rows", except by using the "average row size of already received data". This is not reliable, especially if the first rows have nulls and subsequent ones contain non-empty strings.

[1]: https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473
[2]: https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387

Vladimir
On Sat, Dec 26, 2015 at 6:16 AM, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
> > Have you got numbers showing any actual performance win for postgres_fdw?
>
> For JDBC purposes, it would be nice to have the ability to ask the backend
> "to stop fetching if it produced more than X MiB of response data".
> For a small table (4 int4 fields), having a decent fetchSize (~1000) makes
> result processing 7 times faster than with a fetchSize of 50 rows (14 ms
> -> 2 ms for 2000 rows).
> Here are the measurements: [1] and [2].
>
> Note: it is not required to precisely follow the given "max fetch bytes"
> limit. It would be enough just to stop after a certain amount of data was
> sent.
> The whole point of using a limited fetch size is to avoid running out of
> memory at the client side.
> I do not think developers care how many rows are fetched at once. If they
> do, they should rather use "limit X" SQL syntax.
>
> Do you have a suggestion for such a use case?
>
> For fixed-size data types, the JDBC driver can estimate a "max sane fetch
> size", however:
> 1) In order to know the data types, a round trip is required. This means
> the first fetch must be conservative, thus small queries would be penalized.
> 2) For variable-length types there is no way to estimate a "sane number
> of rows", except by using the "average row size of already received data".
> This is not reliable, especially if the first rows have nulls and
> subsequent ones contain non-empty strings.
>
> [1]: https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473
> [2]: https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387
>
> Vladimir

I believe that Kyotaro proposed something like that, wherein the FDW would be more adaptive based on the amount of memory available, and fetch a number of rows that, by its estimation, would fit in the memory available. I don't know the progress of that patch.

This patch is a far less complicated solution and puts the burden on the DBA to figure out approximately how many rows would fit in memory based on the average row size, and set the per-table option accordingly. If it is later determined that the rows are now too heavy to fit into the space allotted, the fetch size can be altered for that table as needed.
> and fetch a number of rows that, by its estimation, would fit in the memory available

What's wrong with having a size limit in the first place? It seems to make much more sense.

Vladimir
I believe it won't be needed as a regression test, but DEBUGn
messages could be used to see "what was sent to the remote
side". They can be shown on the console, so this is easily done in a
regression test. See the documentation for client_min_messages
for details. (It could receive far more messages than expected,
though, and may be fragile against changes in other parts.)
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center

Ok, I'll see what debug-level messages reveal.
Here's a re-based patch. Notable changes since the last patch are:
- some changes had to move to postgres_fdw.h
- the regression tests are in their own script, fetch_size.sql / fetch_size.out. That should make things easier for the reviewer(s), even if we later merge them into postgres_fdw.sql/.out.
- I put braces around a multi-line error ereport(). That might be a style violation, but the statement seemed confusing without it.
- The fetch SQL is reported via a DEBUG1-level ereport(), confirming that server-level options are respected and that table-level options are used in preference to server-level options.
I left the DEBUG1-level message in this patch so that others can see the output, but I assume we would omit that for production code, with corresponding deletions from fetch_size.sql and fetch_size.out.
Attachment
On Thu, Dec 31, 2015 at 1:10 AM, Corey Huinker <corey.huinker@gmail.com> wrote:
> Here's a re-based patch. Notable changes since the last patch are:
>
> - some changes had to move to postgres_fdw.h
> - the regression tests are in their own script fetch_size.sql /
>   fetch_size.out. that should make things easier for the reviewer(s), even
>   if we later merge it into postgres_fdw.sql/.out.
> - I put braces around a multi-line error ereport(). That might be a style
>   violation, but the statement seemed confusing without it.
> - The fetch sql is reported as a DEBUG1 level ereport(), and confirms that
>   server level options are respected, and table level options are used in
>   favor of server-level options.
>
> I left the DEBUG1-level message in this patch so that others can see the
> output, but I assume we would omit that for production code, with
> corresponding deletions from fetch_size.sql and fetch_size.out.

Review:

- There is an established idiom in postgres_fdw for how to extract data from
  lists of DefElems, and this patch does something else instead. Please make
  it look like postgresIsForeignRelUpdatable, postgresGetForeignRelSize, and
  postgresImportForeignSchema instead of inventing something new. (Note that
  your approach would require multiple passes over the list if you needed
  information on multiple options, whereas the existing approach does not.)

- I think the comment in InitPgFdwOptions() could be reduced to a one-line
  comment similar to those already present.

- I would reduce the debug level on the fetch command to something lower
  than DEBUG1, or drop it entirely. If you keep it, you need to fix the
  formatting so that the line breaks look like other ereport() calls.

- We could consider folding fetch_size into "Remote Execution Options", but
  maybe that's too clever.

I would not worry about trying to get this into the regression tests.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Review:
- There is an established idiom in postgres_fdw for how to extract
data from lists of DefElems, and this patch does something else
instead. Please make it look like postgresIsForeignRelUpdatable,
postgresGetForeignRelSize, and postgresImportForeignSchema instead of
inventing something new. (Note that your approach would require
multiple passes over the list if you needed information on multiple
options, whereas the existing approach does not.)
I will look into that. The original patch pre-dates import foreign schema, so I'm not surprised that I missed the established pattern.
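For anyone following along, that idiom is a single pass over the options list, testing each DefElem by name; a rough sketch patterned on postgresGetForeignRelSize (not the final code):

    ListCell   *lc;

    foreach(lc, fpinfo->table->options)
    {
        DefElem    *def = (DefElem *) lfirst(lc);

        if (strcmp(def->defname, "use_remote_estimate") == 0)
            fpinfo->use_remote_estimate = defGetBoolean(def);
        else if (strcmp(def->defname, "fetch_size") == 0)
            fpinfo->fetch_size = strtol(defGetString(def), NULL, 10);
        /* unknown options have already been rejected by the validator */
    }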
- I think the comment in InitPgFdwOptions() could be reduced to a
one-line comment similar to those already present.
Aye.
- I would reduce the debug level on the fetch command to something
lower than DEBUG1, or drop it entirely. If you keep it, you need to
fix the formatting so that the line breaks look like other ereport()
calls.
As I mentioned, I plan to drop it entirely. It was only there to show a reviewer that it works as advertised. There's not much to see without it.
- We could consider folding fetch_size into "Remote Execution
Options", but maybe that's too clever.
If you care to explain, I'm listening. Otherwise I'm going forward with the other suggestions you've made.
I would not worry about trying to get this into the regression tests.
Happy to hear that.
On Mon, Jan 25, 2016 at 1:21 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>> - We could consider folding fetch_size into "Remote Execution
>> Options", but maybe that's too clever.
>
> If you care to explain, I'm listening. Otherwise I'm going forward with the
> other suggestions you've made.

It's just a little unfortunate to have multiple sections with only a single option in each. It would be nice to avoid that somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jan 25, 2016 at 3:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 25, 2016 at 1:21 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>> - We could consider folding fetch_size into "Remote Execution
>> Options", but maybe that's too clever.
>
> If you care to explain, I'm listening. Otherwise I'm going forward with the
> other suggestions you've made.
It's just a little unfortunate to have multiple sections with only a
single option in each. It would be nice to avoid that somehow.
Revised in patch v3:
* get_option() and get_fetch_size() removed, fetch_size searches added to existing loops.
* Move the fetch_size <= 0 tests into the postgres_fdw_validator() routine in option.c (sketched below)
* DEBUG1 message removed, never intended that to live beyond the proof of concept.
* Missing regression test mentioned in makefile de-mentioned, as there's nothing to see without the DEBUG1 message.
* Multi-line comment shrunk
(There's a v2 patch that is prior to the change to postgres_fdw_validator() in option.c, but in retrospect that's not interesting to you).
I'm not too keen on having *no* new regression tests, but defer to your judgement.
Still not sure what you mean by remote execution options. But it might be simpler now that the patch is closer to your expectations.
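For reference, a rough sketch of the validator-side check mentioned above (the error code and message wording are assumptions, not necessarily what the patch says):

    /* in postgres_fdw_validator(), option.c (sketch) */
    else if (strcmp(def->defname, "fetch_size") == 0)
    {
        int     fetch_size;

        fetch_size = strtol(defGetString(def), NULL, 10);
        if (fetch_size <= 0)
            ereport(ERROR,
                    (errcode(ERRCODE_SYNTAX_ERROR),
                     errmsg("%s requires a positive integer value",
                            def->defname)));
    }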
Attachment
On Mon, Jan 25, 2016 at 4:18 PM, Corey Huinker <corey.huinker@gmail.com> wrote: > Revised in patch v3: > * get_option() and get_fetch_size() removed, fetch_size searches added to > existing loops. > * Move fetch_size <= 0 tests into postgres_fdw_validator() routine in > option.c > * DEBUG1 message removed, never intended that to live beyond the proof of > concept. > * Missing regression test mentioned in makefile de-mentioned, as there's > nothing to see without the DEBUG1 message. > * Multi-line comment shrunk Looks pretty good. You seem to have added a blank line at the end of postgres_fdw.c, which should be stripped back out. > I'm not too keen on having *no* new regression tests, but defer to your > judgement. So... I'm not *opposed* to regression tests. I just don't see a reasonable way to write one. If you've got an idea, I'm all ears. > Still not sure what you mean by remote execution options. But it might be > simpler now that the patch is closer to your expectations. I'm talking about the documentation portion of the patch, which regrettably seems to have disappeared between v2 and v3. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Looks pretty good. You seem to have added a blank line at the end of
postgres_fdw.c, which should be stripped back out.
Done.
> I'm not too keen on having *no* new regression tests, but defer to your
> judgement.
So... I'm not *opposed* to regression tests. I just don't see a
reasonable way to write one. If you've got an idea, I'm all ears.
The possible tests might be:
- creating a server/table with fetch_size set
- altering an existing server/table to have fetch_size set
- verifying that the value exists in pg_foreign_server.srvoptions and pg_foreign_table.ftoptions.
- attempting to set fetch_size to a non-integer value
None of which are very interesting, and none of which actually test which fetch_size was used.
But I'm happy to add any of those that seem worthwhile.
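Sketched out, those tests amount to little more than the following (server/table names assumed):

    ALTER SERVER loopback OPTIONS (ADD fetch_size '101');
    ALTER FOREIGN TABLE ft1 OPTIONS (ADD fetch_size '102');

    SELECT srvoptions FROM pg_foreign_server WHERE srvname = 'loopback';
    SELECT ftoptions FROM pg_foreign_table WHERE ftrelid = 'ft1'::regclass;

    -- should fail: fetch_size must be a positive integer
    ALTER FOREIGN TABLE ft1 OPTIONS (SET fetch_size 'not_a_number');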
> Still not sure what you mean by remote execution options. But it might be
> simpler now that the patch is closer to your expectations.
I'm talking about the documentation portion of the patch, which
regrettably seems to have disappeared between v2 and v3.
Ah, didn't realize you were speaking about the documentation, and since that section was new, I wasn't familiar with the name. Moved.
...and not sure why the doc change didn't make it into the last patch, but it's in this one.
I also moved the paragraph starting "When using the extensions option, it is the user's responsibility that..." such that it is in the same varlistentry as "extensions", though maybe that change should be delegated to the patch that created the extensions option.
Enjoy.
Attachment
Corey Huinker wrote:
> Enjoy.

Didn't actually look into the patch but looks like there's progress here and this might be heading for commit. Moving to next one.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 27, 2016 at 11:19 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
> The possible tests might be:
> - creating a server/table with fetch_size set
> - altering an existing server/table to have fetch_size set
> - verifying that the value exists in pg_foreign_server.srvoptions and
>   pg_foreign_table.ftoptions.
> - attempting to set fetch_size to a non-integer value

I'd add a test that does one of the first two and skip the others. I'm not wedded to that exact thing; that's just a suggestion.

> Enjoy.

I'd enjoy it more if, heh heh, it compiled.

postgres_fdw.c:2642:16: error: use of undeclared identifier 'server'
        foreach(lc, server->options)
                    ^
../../src/include/nodes/pg_list.h:153:26: note: expanded from macro 'foreach'
        for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell))
                                 ^
1 error generated.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
postgres_fdw.c:2642:16: error: use of undeclared identifier 'server'
foreach(lc, server->options)
Odd. Compiled for me. Maybe I messed up the diff. Will get back to you on that with the tests suggested.
^
../../src/include/nodes/pg_list.h:153:26: note: expanded from macro 'foreach'
for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell))
^
Didn't modify this file on this or any other work of mine. Possible garbage from a git pull. Will investigate.
On Tue, Feb 2, 2016 at 5:57 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>> postgres_fdw.c:2642:16: error: use of undeclared identifier 'server'
>> foreach(lc, server->options)
>
> Odd. Compiled for me. Maybe I messed up the diff. Will get back to you on
> that with the tests suggested.

I don't see how. There really is no declaration in there for a variable called server.

>> ../../src/include/nodes/pg_list.h:153:26: note: expanded from macro 'foreach'
>> for ((cell) = list_head(l); (cell) != NULL; (cell) = lnext(cell))
>> ^
>
> Didn't modify this file on this or any other work of mine. Possible garbage
> from a git pull. Will investigate.

This is context information for the same error, not a separate problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
I don't see how. There really is no declaration in there for a
variable called server.
Absolutely correct. My only guess is that it was from failing to make clean after a checkout/re-checkout. A good reason to have even boring regression tests.
Regression tests added.
Attachment
On Tue, Feb 2, 2016 at 10:42 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>> I don't see how. There really is no declaration in there for a
>> variable called server.
>
> Absolutely correct. My only guess is that it was from failing to make clean
> after a checkout/re-checkout. A good reason to have even boring regression
> tests.
>
> Regression tests added.

Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
There seems to be a double successive assignment to fdw_private in the recent commit. Here's a patch to remove the first one.
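For illustration, the redundant pattern is presumably of this shape, with the first list built and then immediately overwritten (the exact lines are in the attached patch):

    /* sketch of the issue: the first assignment is dead code */
    fdw_private = list_make2(makeString(sql.data), retrieved_attrs);
    fdw_private = list_make3(makeString(sql.data), retrieved_attrs,
                             makeInteger(fpinfo->fetch_size));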
On Wed, Feb 3, 2016 at 7:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Feb 2, 2016 at 10:42 PM, Corey Huinker <corey.huinker@gmail.com> wrote:
>> I don't see how. There really is no declaration in there for a
>> variable called server.
>
> Absolutely correct. My only guess is that it was from failing to make clean
> after a checkout/re-checkout. A good reason to have even boring regression
> tests.
>
> Regression tests added.
>
> Committed.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachment
On Wed, Feb 3, 2016 at 11:17 AM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> There seems to be a double successive assignment to fdw_private in the
> recent commit. Here's a patch to remove the first one.

Committed along with a fix for another problem I noted along the way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company