Re: [HACKERS] What about LIMIT in SELECT ? - Mailing list pgsql-hackers
| From | jwieck@debis.com (Jan Wieck) | 
|---|---|
| Subject | Re: [HACKERS] What about LIMIT in SELECT ? | 
| Date | |
| Msg-id | m0zTRrE-000EBRC@orion.SAPserv.Hamburg.dsh.de Whole thread Raw | 
| In response to | Re: [HACKERS] What about LIMIT in SELECT ? (Oleg Bartunov <oleg@sai.msu.su>) | 
| Responses | Re: [HACKERS] What about LIMIT in SELECT ? | 
| List | pgsql-hackers | 
Oleg Bartunov wrote:
>       This is not a problem for CGI-script to know which key to start.
    Never  meant that would be a problem. A FORM variable will of
    course do this.
>       Without LIMIT every CGI call backend will do *FULL* selection
>       and cursor helps just in fetching a definite number of rows,
>       in principle I can do this with CGI-script. Also, cursor
>       returns data back in  ASCII  format (man l declare) and this requires
>       additional job for backend to convert data from intrinsic (binary)
>       format. Right implementation of LIMIT offset,number_of_rows could be
>       a great win and make postgres superior free database engine for
>       Web applications. Many colleagues of mine used mysql instead of
    That's the point I was missing. The offset!
>       postgres just because of lacking LIMIT. Tatsuo posted a patch
>       for set query_limit to 'num', I just tested it and seems it
>       works fine. Now, we need only possibility to specify offset,
>       say
>          set query_limit to 'offset,num'
>       ( Tatsuo, How difficult to do this ?)
>       and LIMIT problem will ne gone.
    Think you haven't read my posting completely. Even  with  the
    executor  limit,  the  complete scan into the sort is done by
    the backend.  You need to specify ORDER BY to  get  the  same
    list  again  (without  the  offset  doesn't  make sense). But
    currently, ORDER BY forces a sort node into the query plan.
    What the executor limit  tells  is  how  many  rows  will  be
    returned  from  the sorted data. Not what goes into the sort.
    Filling the sort and sorting the data consumes the most  time
    of the queries execution.
    I  haven't  looked  at  Tatsuo's  patch  very well. But if it
    limits the amount of data going into the sort (on ORDER  BY),
    it  will  break it! The requested ordering could be different
    from what the choosen index might return. The used  index  is
    choosen by the planner upon the qualifications given, not the
    ordering wanted.
    So if you select WHERE b = 1 ORDER BY a, then it will use  an
    index on attribute b to match the qualification. The complete
    result of that index scan goes into the sort to  get  ordered
    by  a. If now the executor limit stops sort filling after the
    limit is exceeded, only the same tuples will go into the sort
    every  time.  But  they have nothing to do with the requested
    order by a.
    What LIMIT first needs is  a  planner  enhancement.  In  file
    backend/optimizer/plan/planner.c  line 284 it must be checked
    if the actual plan is an indexscan, if the indexed attributes
    are  all  the same as those in the given sort clause and that
    the requested sort order (operator) is that  what  the  index
    will  return.   If  that  all matches, it can ignore the sort
    clause and return the index scan itself.
    Second enhancement must be the handling of  the  offset.   In
    the  executor,  the  index scan must skip offset index tuples
    before returning the first. But  NOT  if  the  plan  isn't  a
    1-table-index-scan.  In that case the result tuples (from the
    topmost unique/join/whatever node) have to be skipped.
    With these enhancements,  the  index  tuples  to  be  skipped
    (offset)  will still be scanned, but not the data tuples they
    point to. Index scanning might be somewhat faster.
    This all will only speedup simple 1-table-queries,  no  joins
    or  if  the requested order isn't that what the index exactly
    returns.
    Anyway, I'll take a look if I can change the planner to  omit
    the  sort  if  the tests described above are true. I think it
    would be good anyway.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #
		
	pgsql-hackers by date: