Home > mailing lists

Re: [HACKERS] Solution for LIMIT cost estimation - Mailing list pgsql-hackers

From	Hannu Krosing
Subject	Re: [HACKERS] Solution for LIMIT cost estimation
Date	February 14, 2000 04:34:44
Msg-id	38A7CDE1.B3005B98@tm.ee Whole thread Raw
In response to	Solution for LIMIT cost estimation (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [HACKERS] Solution for LIMIT cost estimation
List	pgsql-hackers

Tree view

Chris Bitmead wrote:
> 
> Don Baccus wrote:
> >
> > At 03:32 PM 2/14/00 +1100, Chris Bitmead wrote:
> >
> > >I agree you should probably go the whole hog one way or the other. I
> > >think
> > >ignoring offset+limit is a useful option, but like I said at the
> > >beginning, it doesn't bother me _that_ much.
> >
> > It should bother you that folks who understand how SQL works might
> > be penalized in order to insulate the fact that those who don't know
> > how SQL works from an understanding of their own ignorance...
> >
> > Shouldn't we be more concerned with folks who bother to read an
> > SQL primer?  Or Oracle or Informix docs on SQL?
> 
> LIMIT is not SQL, both as a technical fact, and philosophically
> because it reaches outside of set theory.

I see limit as a shortcut (plus an optimizer hint) for the sequence
DECLARE CURSOR - MOVE offset - FETCH limit - CLOSE CURSOR

It's utility was much debated befor it was included in Postgres, 
the main argument for inclusion being "mySQL has it and it's useful 
for fast-start queries", the main argument against being "it's not SQL,
people won't understand it a and will start to misuse it".

Maybe we should still discourage the use of LIMIT, and rather introduce 
another "mode" for optimiser, activated by SET FastStart TO 'ON'.
Then queries with limit could be rewritten into
SET FastStart to 'ON';
DECLARE
MOVE
FETCH
CLOSE
SET FastStart to PREVIOUS_VALUE;

also maybe we will need PUSH/POP for set commands ?

> What LIMIT does without ORDER BY is non-deterministic, and therefore 
> a subjective matter of what is the most useful: a faster answer, 
> or a more consistant answer.

As SQL queries are all one-time things you can't be "consistent". 
It's like being able to grab the same set of socks from a bag and 
then trying to devise a strategy for getting them in same order 
without sorting them (i.e. possible but ridiculous)

If you need them in some order, you use ORDER BY, if you don't need 
any order you omit ORDER BY.

> My predudices are caused by what I use PostgreSQL for, which is
> more favourable to the latter.

Whats wrong with using ORDER BY ? 

I can't imagine a set of queries that need to be consistent _almost_
all the time, but without any order.

If you really need that kind of behaviour, the right decision is to 
select the rows into a work table that has an additional column for 
preserving order and then do the limit queries from that table.

But in that case it is often faster to have an index on said column
and to do WHERE ID BETWEEN OFFSET AND OFFSET+LIMITORDER BY ID
than to use LIMIT, more so for large offsets.

pgsql-hackers by date:

From: "davida"
Date: 14 February 2000, 04:19:43
Subject: subscribe hackers

From: Michael Meskes
Date: 14 February 2000, 05:00:44
Subject: function defined in libpq?

Re: [HACKERS] Solution for LIMIT cost estimation - Mailing list pgsql-hackers

Previous

Next