Thread: Question about #80 - itersize in cursor dict
Hi!

Thanks for the great library.
Dear group users, please help me understand the following situation.

In psycopg2 2.4.4, extras.py had:

    def __iter__(self):
        if self._prefetch:
            res = _cursor.fetchmany(self, self.itersize)
            if not res:
                return
        if self._query_executed:
            self._build_index()
        if not self._prefetch:
            res = _cursor.fetchmany(self, self.itersize)

        for r in res:
            yield r

        # the above was the first itersize record. the following are
        # in a repeated loop.
        while 1:
            res = _cursor.fetchmany(self, self.itersize)
            if not res:
                return
            for r in res:
                yield r

In 2.4.5 it became:

    def __iter__(self):
        if self._prefetch:
            res = _cursor.__iter__(self)
            first = res.next()
        if self._query_executed:
            self._build_index()
        if not self._prefetch:
            res = _cursor.__iter__(self)
            first = res.next()

        yield first
        while 1:
            yield res.next()

----------------------------------------------------------------------------
Now iterating over a named cursor returns 1 row, not many.

Please help: should I use cursor.fetchmany instead?
Or how do I use itersize? (Why is itersize not working?)

Thank you,
Alex
On 16/05/12 11:22, Kryklia Alex wrote:
> Thank for great library.
> Dear group users, please help me with understanding of situation.

This is probably a regression/bug. We'll investigate. Thank you for this
report.

federico

--
Federico Di Gregorio federico.digregorio@dndg.it
Studio Associato Di Nunzio e Di Gregorio http://dndg.it
I came like Water, and like Wind I go. -- Omar Khayam
On Tue, May 15, 2012 at 5:59 PM, Kryklia Alex <kryklia@gmail.com> wrote:
> Hi!
> Thank for great library.
> Dear group users, please help me with understanding of situation.

[...]

> and now iterating over named cursor return 1 row, not many
>
> Please help, should i use cursor.fetchmany instead?
> Or how use itersize? (Why itersize not working)

Hi Alex, sorry, I may not have understood you, but everything seems to be
working fine for me: iterating over a named dict cursor fetches itersize
records at a time. The explicit reference to itersize has been dropped
from extras.py because __iter__ now calls into the superclass' __iter__
instead of fetchmany. Testing with debug enabled on psycopg 2.4.5, and
including the relevant debug info:

    In [3]: cnn = psycopg2.connect('')
    In [4]: cur = cnn.cursor('test', cursor_factory=psycopg2.extras.DictCursor)
    In [5]: cur.execute("select generate_series(1,20)")
    [13211] DECLARE "test" CURSOR WITHOUT HOLD FOR select generate_series(1,20)

    In [6]: for i in cur: print i
       ...:
    [13211] FETCH FORWARD 2000 FROM "test"
    [1]
    [2]
    [3]
    [...]

It has fetched the default itersize records. Testing with a different
cursor class and a non-default itersize:

    In [16]: cur = cnn.cursor('test', cursor_factory=psycopg2.extras.RealDictCursor)
    In [17]: cur.itersize = 2
    In [18]: cur.execute("select generate_series(1,5)")
    [13211] DECLARE "test" CURSOR WITHOUT HOLD FOR select generate_series(1,5)

    In [19]: for i in cur: print i
       ....:
    [13211] FETCH FORWARD 2 FROM "test"
    {'generate_series': 1}
    {'generate_series': 2}
    [13211] FETCH FORWARD 2 FROM "test"
    {'generate_series': 3}
    {'generate_series': 4}
    [13211] FETCH FORWARD 2 FROM "test"
    {'generate_series': 5}
    [13211] FETCH FORWARD 2 FROM "test"

    In [20]:

This is the expected behaviour for me. If there is any problem I have
not understood, please include a test case.

Thank you,

-- Daniele
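The fetch pattern in the debug log above (batches of itersize rows from the server, one row handed out per loop step) can be imitated with a small pure-Python model. This is only a sketch for illustration, not psycopg2's implementation: `FakeBackend`, `fetch_forward` and `iterate` are hypothetical names, and no database is involved.

```python
class FakeBackend:
    """Hypothetical stand-in for the server: serves rows in batches
    and counts how many roundtrips were made."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0
        self.roundtrips = 0

    def fetch_forward(self, n):
        # Models a single "FETCH FORWARD n" roundtrip.
        self.roundtrips += 1
        batch = self._rows[self._pos:self._pos + n]
        self._pos += len(batch)
        return batch


def iterate(backend, itersize):
    """Yield single rows, refilling from the backend itersize rows at a time."""
    while True:
        batch = backend.fetch_forward(itersize)
        if not batch:
            return
        for row in batch:
            # One record per iteration step, whatever itersize is.
            yield row


backend = FakeBackend((i,) for i in range(1, 6))
rows = list(iterate(backend, itersize=2))
print(rows)                # five one-column rows, one per step
print(backend.roundtrips)  # batches of 2, 2 and 1 rows, plus a final empty one
```

With five rows and an itersize of 2 the model makes four roundtrips, the same number of FETCH statements as in the log, while the loop body still sees one row at a time.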
Hi,
Thank you for the fast response.
Sorry, Daniele, for accidentally sending my answer to you.
As I understand it, to enable debug in psycopg2 I need to recompile it?

Now without debug:

    In [29]: cur=conn.cursor('1234')

    In [30]: cur.itersize=3

    In [31]: cur.execute("select generate_series(1,20)")

    In [32]: for i in cur: print i,'end of iteration step'
    (1,) end of iteration step
    (2,) end of iteration step
    (3,) end of iteration step
    (4,) end of iteration step
    (5,) end of iteration step
    (6,) end of iteration step
    (7,) end of iteration step
    (8,) end of iteration step
    (9,) end of iteration step
    (10,) end of iteration step
    (11,) end of iteration step
    (12,) end of iteration step
    (13,) end of iteration step
    (14,) end of iteration step
    (15,) end of iteration step
    (16,) end of iteration step
    (17,) end of iteration step
    (18,) end of iteration step
    (19,) end of iteration step
    (20,) end of iteration step

I was assuming that the chunk of data on every iteration would consist of 3 ints:

    (1,2,3,) end of iteration step

As mentioned in the docs:
The attribute itersize now controls how many records are fetched at
time during the iteration:
(http://initd.org/psycopg/docs/usage.html#server-side-cursors)

Or did I misunderstand the docs?

Alex
On Wed, May 16, 2012 at 1:45 PM, Kryklia Alex <kryklia@gmail.com> wrote:
> Hi,
> Thank you for fast response.
> Sorry, Daniel, for accidentally sending you answer.
> As i understood, to enable debug in psycopg2 i need to recompile it?

Yes, adding PSYCOPG_DEBUG to the define in setup.cfg and then running
python with the env variable PSYCOPG_DEBUG set to something.

> Now without debug:
>
> In [29]: cur=conn.cursor('1234')
>
> In [30]: cur.itersize=3
>
> In [31]: cur.execute("select generate_series(1,20)")
>
> In [32]: for i in cur: print i,'end of iteration step'
> (1,) end of iteration step
> (2,) end of iteration step
> (3,) end of iteration step

[...]

> I assuming, that chunk of data on every iteration consists of 3 ints:
> (1,2,3,) end of iteration step

No, this is not the case.

> As mentioned in docs:
> The attribute itersize now controls how many records are fetched at
> time during the iteration:
> (http://initd.org/psycopg/docs/usage.html#server-side-cursors)
> Or i misunderstood the docs?

Yes, I think you got the docs wrong: itersize records are fetched from
the backend, but iteration still yields one record at a time. The entire
paragraph reads:

"""
Named cursors are also iterable like regular cursors. Note however
that before Psycopg 2.4 iteration was performed fetching one record at
time from the backend, resulting in a large overhead. The attribute
itersize now controls how many records are fetched at time during the
iteration: the default value of 2000 allows to fetch about 100KB per
roundtrip assuming records of 10-20 columns of mixed number and
strings; you may decrease this value if you are dealing with huge
records.
"""

"itersize" only changes how records are fetched from the backend, not
how they are returned. I think the paragraph as a whole is clean
enough. However, if you or anybody else can suggest a better wording,
docs patches are welcome.

-- Daniele
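Since iteration hands out one record per step regardless of itersize, getting several rows per step means asking for them explicitly with the DB-API `fetchmany()` call mentioned in the original question. A minimal sketch follows, with two assumptions: `iter_chunks` is a hypothetical helper name, not part of psycopg2, and the demo runs against an in-memory `sqlite3` database purely so that it is runnable without a PostgreSQL server. The same loop should presumably work with a psycopg2 named cursor, but that was not verified here.

```python
import sqlite3


def iter_chunks(cur, size):
    """Yield lists of up to `size` rows until the cursor is exhausted.

    Hypothetical helper: relies only on the DB-API fetchmany() method.
    """
    while True:
        chunk = cur.fetchmany(size)
        if not chunk:
            return
        yield chunk


# Demo data in an in-memory SQLite database (an assumption made for
# runnability; the thread itself is about PostgreSQL).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (n INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 8)])
cur.execute("SELECT n FROM t ORDER BY n")

chunks = list(iter_chunks(cur, 3))
for chunk in chunks:
    print(chunk, "end of iteration step")
conn.close()
```

Each loop step now receives a list of up to three rows; the last chunk is shorter when the row count is not a multiple of the chunk size.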
Thank you very much, got it.
BTW, I read the whole block of text 3 times :(
I'd like to see this line added to the docs:
"itersize" only changes how records are fetched from the backend, not how they are returned

Thank you.
Sincerely, Alex