Re: Why does a simple query not use an obvious index? - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Re: Why does a simple query not use an obvious index? |
Date | |
Msg-id | 16093.1093817023@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Why does a simple query not use an obvious index? (Greg Stark <gsstark@mit.edu>) |
Responses |
Re: Why does a simple query not use an obvious index?
Re: Why does a simple query not use an obvious index? Re: Why does a simple query not use an obvious index? |
List | pgsql-performance |
>> select somefield from sometable where timestampfield > now()-'60 >> seconds'::interval This is a FAQ, but since the archives don't seem to be up at the moment, here's the answer once again: The expression "now() - something" is not a constant, so the planner is faced with "timestampfield > unknownvalue". Its default assumption about the number of rows that will match is much too high to make an indexscan look profitable (from memory, I think it guesses that about a third of the table will match...). There are a couple of hacks you can use to deal with this. Plan A is just "set enable_seqscan = false" for this query. This is ugly and not really recommended, but you should try it first to verify that you do get an indexscan that way, just to be sure that lack of statistics is the problem and not something else. Plan B is to add an extra WHERE clause to make the problem look like a range query, eg where timestampfield > now() - ... AND timestampfield <= now(); The planner still doesn't know the exact values involved, so it still can't make use of any statistics, but it can see that this is a range constraint on timestampfield. The default guess about the selectivity will be a lot smaller than in the case of the one-sided inequality, and in most cases you should get an indexscan out of it. This isn't completely guaranteed though. Also, it's got a severe problem in that if you sometimes do queries with a large interval, it'll still do an indexscan even though that may be quite inappropriate. Plan C is to fix things so that the compared-to value *does* look like a constant; then the planner will correctly observe that only a small part of the table is to be scanned, and do the right thing (given reasonably up-to-date ANALYZE statistics, anyway). The most trustworthy way of doing that is to compute the "now() - interval" value on the client side and send over a timestamp constant. If that's not convenient for some reason, people frequently use a hack like this: create function ago(interval) returns timestamptz as 'select now() - $1' language sql strict immutable; select ... where timestampfield > ago('60 seconds'); This is a kluge because you are lying when you say that the result of ago() is immutable; it obviously isn't. But the planner will believe you, fold the function call to a constant during planning, and use the result. CAUTION: this works nicely for interactively-issued SQL queries, but it will come back to bite you if you try to use ago() in prepared queries or plpgsql functions, because the premature collapsing of the now() result will become significant. We have speculated about ways to get the planner to treat expressions involving now() and similar functions as pseudo-constants, so that it would do the right thing in this sort of situation without any kluges. It's not been done yet though. BTW, the above discussion applies to PG 7.3 and later; if you're dealing with an old version then there are some different considerations. regards, tom lane
pgsql-performance by date: