Thread: Avoiding execution of some functions by query rewriting

Avoiding execution of some functions by query rewriting

From

Thomas Girault

Date:

16 May 2012, 09:30:39

Hello,

I would like to allow the execution of a function (my_function) only if its argument (my_table.x) belongs to a predefined interval (e.g. [100,1000]).

Let's take the following query example :

(q) SELECT * FROM my_table WHERE my_function(mytable.x);

I would like this query automatically rewrites itself to check whether "mytable.x" belong to the interval [100,1000] :

(q') SELECT * FROM my_table WHERE (my_table.x BETWEEN 100 AND 1000) AND my_function(my_table.x);

The command EXPLAIN ANALYSE shows that the second query is really faster than the first one.

How can I change the query execution plan in order to automate the process of query rewriting (q into q') ?

Where can I store suitably the metadata about the interval [100,1000] associated to my_function ?

Thanks by advance,

Thomas Girault

Re: Avoiding execution of some functions by query rewriting

From

Tom Lane

Date:

16 May 2012, 09:36:10

Thomas Girault <toma.girault@gmail.com> writes:
> Hello,
> I would like to allow the execution of a function (my_function) only if its
> argument (my_table.x) belongs to a predefined interval (e.g. [100,1000]).

> Let's take the following query example :
> (q)  SELECT * FROM my_table WHERE my_function(mytable.x);

> I would like this query automatically rewrites itself to check whether
> "mytable.x" belong to the interval [100,1000] :
> (q')  SELECT * FROM my_table WHERE (my_table.x BETWEEN 100 AND 1000) AND
> my_function(my_table.x);

> The command EXPLAIN ANALYSE shows that the second query is really faster
> than the first one.

> How can I change the query execution plan in order to automate the process
> of query rewriting (q into q') ?
> Where can I store suitably the metadata about the interval [100,1000]
> associated to my_function ?

Surely a quick "if" test added to the top of the function would be
a better solution than trying to hack the system to do this.  Keep
in mind that any code you add to the planner for such a thing would
fire on every function call, in every query, all the time.  It seems
unlikely that the possible benefit of not entering your function at
all would outweigh the distributed penalty from the added planner
overhead.
        regards, tom lane

Re: Avoiding execution of some functions by query rewriting

From

Florian Pflug

Date:

16 May 2012, 10:01:21

On May16, 2012, at 14:30 , Thomas Girault wrote:
> I would like to allow the execution of a function (my_function) only if its argument (my_table.x) belongs to a
predefinedinterval (e.g. [100,1000]). 
>
> Let's take the following query example :
> (q)  SELECT * FROM my_table WHERE my_function(mytable.x);
>
> I would like this query automatically rewrites itself to check whether "mytable.x" belong to the interval [100,1000]
:
> (q')  SELECT * FROM my_table WHERE (my_table.x BETWEEN 100 AND 1000) AND my_function(my_table.x);

Rename my_function to my_function_impl or the like, and create a SQL-language
wrapper function which does
 SELECT (my_table.x BETWEEN 100 AND 1000) AND my_function_impl(my_table.x)

i.e. do

CREATE FUNCTION my_function(sometype) RETURNS BOOL AS $$ SELECT ($1 BETWEEN 100 AND 1000) AND my_function_impl($1)
$$ LANGUAGE SQL IMMUTABLE;

This works because sufficiently simply SQL-language functions are inlined early during
query planning, which allows later stages to see through the function call and e.g.
use an index range scan to pull only those rows which satisfy a BETWEEN clause.

I don't remember the precise conditions under which SQL-language functions are inlined,
but I'm pretty sure that not referencing any tables and not having a FROM clause is
sufficient. Note that inlining *only* happens for SQL-language functions, *not* for
PLPGSQL functions, even if the latter consist of a simple RETURN statement.

best regards,
Florian Pflug