Thread: why do we need create tuplestore for each fetch?

why do we need create tuplestore for each fetch?

From

高增琦

Date:

15 December 2011, 09:31:34

Hi everyone,<br /><br />I found this several days ago when I try to debug a "fetch" of cursor. <br />And I have sent a
mailto this list, but no one reply...<br />Maybe this is a very simple problem, please help me, thanks a lot...<br
/><br/>Here is the example:<br />    create table t (a int);<br />    insert into t values (1),(3),(5),(7),(9);<br
/>   insert into t select a+1 from t;<br />    begin;<br />    declare c cursor for select * from t order by a;<br
/>   fetch 3 in c;<br />     fetch 3 in c;<br />    fetch 3 in c;<br />    <br />In 'PortalRun', a fetch stmt will be
treatedwith PORTAL_UTIL_SELECT,<br />and then a tuplestore will be created in 'FillPortalStore' in the<br />fetch
stmt'sportal.<br /><br />In 'FillPortalStore', all result will be store at that tuplestore,<br />Then, go back to
'PortalRun';next,  'PortalRunSelect' will send this<br />results to client...<br /><br />My problem is: why do we need
createthat tuplestore as an<br /> middle storeage? why do not we just send these result to clent<br />at the first
time?<br/><br />Thank you very much.<br clear="all" /><br />-- <br />GaoZengqi<br /><a href="mailto:pgf00a@gmail.com"
target="_blank">pgf00a@gmail.com</a><br/><a href="mailto:zengqigao@gmail.com"
target="_blank">zengqigao@gmail.com</a><br/>

Re: why do we need create tuplestore for each fetch?

From

Robert Haas

Date:

19 December 2011, 09:06:24

On Thu, Dec 15, 2011 at 8:30 AM, 高增琦 <pgf00a@gmail.com> wrote:
> I found this several days ago when I try to debug a "fetch" of cursor.
> And I have sent a mail to this list, but no one reply...
> Maybe this is a very simple problem, please help me, thanks a lot...
>
> Here is the example:
>     create table t (a int);
>     insert into t values (1),(3),(5),(7),(9);
>     insert into t select a+1 from t;
>     begin;
>     declare c cursor for select * from t order by a;
>     fetch 3 in c;
>     fetch 3 in c;
>     fetch 3 in c;
>
> In 'PortalRun', a fetch stmt will be treated with PORTAL_UTIL_SELECT,
> and then a tuplestore will be created in 'FillPortalStore' in the
> fetch stmt's portal.
>
> In 'FillPortalStore', all result will be store at that tuplestore,
> Then, go back to 'PortalRun'; next,  'PortalRunSelect' will send this
> results to client...
>
> My problem is: why do we need create that tuplestore as an
> middle storeage? why do not we just send these result to clent
> at the first time?

Good question.  I wouldn't expect it to matter very much for a
three-row fetch, but maybe it does for larger ones?  What is your
motivation for investigating this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: why do we need create tuplestore for each fetch?

From

高增琦

Date:

20 December 2011, 23:34:42

Thanks for you reply.

I found query without cursor is faster then query with server-side cursor and several fetches.
But I have a large result set to retrieve from database. I have to choose server-side cursor
to avoid out-of-memory problem.

When I try to debug the cursor and fetch, I found this unexpected behavior. I think maybe
the tuplestore slows the cursor. (maybe I should do some profile later)

I want to change the code, but I am afraid there are important reasons for the tuplestore.
Therefore, I post it to this list for help: why create tuplestore for each fetch?

p.s. a large fetch may turn tuplestore to use buffer file, and slow the performance very much.

On Mon, Dec 19, 2011 at 9:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Thu, Dec 15, 2011 at 8:30 AM, 高增琦 <pgf00a@gmail.com> wrote:
> I found this several days ago when I try to debug a "fetch" of cursor.
> And I have sent a mail to this list, but no one reply...
> Maybe this is a very simple problem, please help me, thanks a lot...
>
> Here is the example:
>     create table t (a int);
>     insert into t values (1),(3),(5),(7),(9);
>     insert into t select a+1 from t;
>     begin;
>     declare c cursor for select * from t order by a;
>     fetch 3 in c;
>     fetch 3 in c;
>     fetch 3 in c;
>
> In 'PortalRun', a fetch stmt will be treated with PORTAL_UTIL_SELECT,
> and then a tuplestore will be created in 'FillPortalStore' in the
> fetch stmt's portal.
>
> In 'FillPortalStore', all result will be store at that tuplestore,
> Then, go back to 'PortalRun'; next, 'PortalRunSelect' will send this
> results to client...
>
> My problem is: why do we need create that tuplestore as an
> middle storeage? why do not we just send these result to clent
> at the first time?

Good question. I wouldn't expect it to matter very much for a
three-row fetch, but maybe it does for larger ones? What is your
motivation for investigating this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
GaoZengqi
pgf00a@gmail.com
zengqigao@gmail.com

Re: why do we need create tuplestore for each fetch?

From

Tom Lane

Date:

21 December 2011, 01:18:51

高增琦 <pgf00a@gmail.com> writes:
> Here is the example:
> create table t (a int);
> insert into t values (1),(3),(5),(7),(9);
> insert into t select a+1 from t;
> begin;
> declare c cursor for select * from t order by a;
> fetch 3 in c;
> fetch 3 in c;
> fetch 3 in c;
> 
> In 'PortalRun', a fetch stmt will be treated with PORTAL_UTIL_SELECT,
> and then a tuplestore will be created in 'FillPortalStore' in the
> fetch stmt's portal.

How are you trying to do the fetches, PQexec("fetch 3 in c") ?
That is an inherently inefficient way to do things, and trying to shave
a few cycles off the intermediate tuplestore isn't going to fix that.
The general overhead of parsing a new SQL command is probably going to
swamp the costs of a tuplestore, especially if it's too small to spill
to disk (and if it isn't, you really do need the tuplestore mechanism,
slow or not).

If you want to get a speed improvement there would probably be a lot
more bang for the buck in extending libpq to support protocol-level
portal access.  It does already have PQdescribePortal, but for some
reason not anything for "fetch N rows from portal so-and-so".  Not
sure whether it's worth providing explicit portal open/close commands
separate from PQexec'ing DECLARE CURSOR and CLOSE, but maybe at the
margins those steps would be worth improving too.
        regards, tom lane