Thread: [HACKERS] is possible cache tupledesc templates in execution plan? significantperformance issue, maybe bug?

[HACKERS] is possible cache tupledesc templates in execution plan? significantperformance issue, maybe bug?

From

Pavel Stehule

Date:

08 October 2017, 19:36:23

I am looking why some queries are significantly slower on PostgreSQL than on Oracle, although there is pretty simple plan. The queries are usually 10x times slower on Postgres than on Oracle.

I migrate old Oracle application to Postgres. There are important two factors:

1. Often usage of "view" functions (I don't know better terminology) like:

CREATE OR REPLACE FUNCTION foo(_id integer)

RETURNS text AS $$

BEGIN

RETURN (SELECT xx FROM a JOIN b ON a.id = b.id WHERE b.y = _id)

END;

$$ LANGUAGE plpgsql;

These functions are used in views

CREATE VIEW xx AS

SELECT a, b, c, foo(id) as d, ...

And sometimes are used in filters

SELECT * FROM xx WHERE d IN NOT NULL;

2. Lot of used tables are pretty wide - 60, 120, .. columns

Now, I am doing profiling, and I see so most time is related to

ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)

This function is executed in exec init time - in this case pretty often. Although there are used few columns from the table, the target list is build for columns (maybe it is bug)

I have a function

CREATE OR REPLACE FUNCTION ides_funcs.najdatsplt_cislo_exekuce(mid_najdatsplt bigint)
RETURNS character varying
LANGUAGE plpgsql
STABLE SECURITY DEFINER COST 1000
AS $function$
DECLARE

Result varchar(200);

--mZALOBCISLO NAJZALOBCE.ZALOBCISLO%TYPE;
mAdra varchar(200);

BEGIN
BEGIN

-- there are only tables

     select CISLOEXEKUCE INTO STRICT mADRA
      from najzalobpr MT, najvzallok A1,
                    NAJZALOBST A2, NAJZALOBCE A3 where
                    MT.ID_NAJVZALLOK= A1.ID_NAJVZALLOK AND
                    A1.ID_NAJZALOBST=A2.ID_NAJZALOBST AND
                    A2.ID_NAJZALOBCE= A3.ID_NAJZALOBCE AND
                    MT.ID_NAJDATSPLT = mID_NAJDATSPLT LIMIT 1;
    EXCEPTION
      WHEN OTHERS THEN
        mADRA := NULL;
    END;

Result:=mADRA;
return(Result);
end;
$function$

where is necessary only few columns:

but it processing target list of length

NOTICE: plpgsql_exec_function: ides_funcs.najdatsplt_cislo_exekuce(bigint)
NOTICE: >>len: 38, hasoid: 0, skipjunk: 0
NOTICE: >>len: 21, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0
NOTICE: >>len: 65, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0
NOTICE: >>len: 93, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0

len is length of targetlist

The numbers are related to number of columns of tables najzalobpr, najvzallok, NAJZALOBST, ..

Because these tables are wide, then the queries are too slow

So, my questions?

1. Why target list is too long in this case. It should be reduced to few fields?

2. If is not possible to reduce the number of fields of target list, is possible to store tupledesc template to plan?

Without this issue, the Postgres has same speed or is faster than Ora.

I can send a schema by some private channel.

Regards

Pavel

Re: [HACKERS] is possible cache tupledesc templates in executionplan? significant performance issue, maybe bug?

From

Andres Freund

Date:

08 October 2017, 19:44:34

Hi,

On 2017-10-08 18:36:23 +0200, Pavel Stehule wrote:
> 2. Lot of used tables are pretty wide - 60, 120, .. columns
> 
> Now, I am doing profiling, and I see so most time is related to
> 
> ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)

Yea, that's known - I've complained about this a couple times. You could
try whether the following master branch helps:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/simple_statement_perf

That's just micro-optimization though, not a more fundamental
solution. But for me it yields pretty nice speedups for cases with long
tlists.

> This function is executed in exec init time - in this case pretty often.
> Although there are used few columns from the table, the target list is
> build for columns (maybe it is bug)

It's probably just the physical tlist "optimization".

> 2. If is not possible to reduce the number of fields of target list, is
> possible to store tupledesc template to plan?

We should do that, but it's not a small change.

Greetings,

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: is possible cache tupledesc templates in execution plan?significant performance issue, maybe bug?

From

Pavel Stehule

Date:

08 October 2017, 19:51:49

2017-10-08 18:36 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

I am looking why some queries are significantly slower on PostgreSQL than on Oracle, although there is pretty simple plan. The queries are usually 10x times slower on Postgres than on Oracle.

I migrate old Oracle application to Postgres. There are important two factors:

1. Often usage of "view" functions (I don't know better terminology) like:

CREATE OR REPLACE FUNCTION foo(_id integer)
RETURNS text AS $$
BEGIN
RETURN (SELECT xx FROM a JOIN b ON a.id = b.id WHERE b.y = _id)
END;
$$ LANGUAGE plpgsql;

These functions are used in views

CREATE VIEW xx AS
SELECT a, b, c, foo(id) as d, ...

And sometimes are used in filters

SELECT * FROM xx WHERE d IN NOT NULL;

2. Lot of used tables are pretty wide - 60, 120, .. columns

Now, I am doing profiling, and I see so most time is related to

ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)

This function is executed in exec init time - in this case pretty often. Although there are used few columns from the table, the target list is build for columns (maybe it is bug)

I have a function

CREATE OR REPLACE FUNCTION ides_funcs.najdatsplt_cislo_exekuce(mid_najdatsplt bigint)
RETURNS character varying
LANGUAGE plpgsql
STABLE SECURITY DEFINER COST 1000
AS $function$
DECLARE

Result varchar(200);

--mZALOBCISLO NAJZALOBCE.ZALOBCISLO%TYPE;
mAdra varchar(200);

BEGIN
   BEGIN
-- there are only tables
     select CISLOEXEKUCE INTO STRICT mADRA
      from najzalobpr MT, najvzallok A1,
                    NAJZALOBST A2, NAJZALOBCE A3 where
                    MT.ID_NAJVZALLOK= A1.ID_NAJVZALLOK AND
                    A1.ID_NAJZALOBST=A2.ID_NAJZALOBST AND
                    A2.ID_NAJZALOBCE= A3.ID_NAJZALOBCE AND
                    MT.ID_NAJDATSPLT = mID_NAJDATSPLT LIMIT 1;
    EXCEPTION
      WHEN OTHERS THEN
        mADRA := NULL;
    END;

Result:=mADRA;
return(Result);
end;
$function$

where is necessary only few columns:

but it processing target list of length

NOTICE: plpgsql_exec_function: ides_funcs.najdatsplt_cislo_exekuce(bigint)
NOTICE: >>len: 38, hasoid: 0, skipjunk: 0
NOTICE: >>len: 21, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0
NOTICE: >>len: 65, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0
NOTICE: >>len: 93, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0
NOTICE: >>len: 1, hasoid: 0, skipjunk: 0

len is length of targetlist

The numbers are related to number of columns of tables najzalobpr, najvzallok, NAJZALOBST, ..

Because these tables are wide, then the queries are too slow

So, my questions?

1. Why target list is too long in this case. It should be reduced to few fields?

2. If is not possible to reduce the number of fields of target list, is possible to store tupledesc template to plan?

Without this issue, the Postgres has same speed or is faster than Ora.

I can send a schema by some private channel.

Regards

The following workaround is working

create view xxxx as select CISLOEXEKUCE, MT.ID_NAJDATSPLT
      from najzalobpr MT, najvzallok A1,
                    NAJZALOBST A2, NAJZALOBCE A3 where
                    MT.ID_NAJVZALLOK= A1.ID_NAJVZALLOK AND
                    A1.ID_NAJZALOBST=A2.ID_NAJZALOBST AND
                    A2.ID_NAJZALOBCE= A3.ID_NAJZALOBCE;

and function should be changed to

BEGIN
   BEGIN
    select CISLOEXEKUCE INTO STRICT mADRA
      from xxxx
      WHERE id_najdatsplt = mID_najdatsplt LIMIT 1;
    EXCEPTION
      WHEN OTHERS THEN
        mADRA := NULL;
    END;

Result:=mADRA;
return(Result);
end;

So this issue is really related to tupleDesc management

Pavel

Re: [HACKERS] is possible cache tupledesc templates in executionplan? significant performance issue, maybe bug?

From

Pavel Stehule

Date:

08 October 2017, 19:57:28

2017-10-08 18:44 GMT+02:00 Andres Freund <andres@anarazel.de>:

Hi,

On 2017-10-08 18:36:23 +0200, Pavel Stehule wrote:
> 2. Lot of used tables are pretty wide - 60, 120, .. columns
>
> Now, I am doing profiling, and I see so most time is related to
>
> ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)

Yea, that's known - I've complained about this a couple times. You could
try whether the following master branch helps:
https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/simple_statement_perf

That's just micro-optimization though, not a more fundamental
solution. But for me it yields pretty nice speedups for cases with long
tlists.

it is just this patch

        HeapTuple   tup;
        Form_pg_type typTup;

+       if (typid < FirstBootstrapObjectId)
+           break;
+
        tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
        if (!HeapTupleIsValid(tup))
            elog(ERROR, "cache lookup failed for type %u", typid);

> This function is executed in exec init time - in this case pretty often.
> Although there are used few columns from the table, the target list is
> build for columns (maybe it is bug)

It's probably just the physical tlist "optimization".

> 2. If is not possible to reduce the number of fields of target list, is
> possible to store tupledesc template to plan?

We should do that, but it's not a small change.

Greetings,

Andres Freund

Re: [HACKERS] is possible cache tupledesc templates in executionplan? significant performance issue, maybe bug?

From

Andres Freund

Date:

08 October 2017, 19:59:19

On 2017-10-08 18:57:28 +0200, Pavel Stehule wrote:
> 2017-10-08 18:44 GMT+02:00 Andres Freund <andres@anarazel.de>:
> 
> > Hi,
> >
> > On 2017-10-08 18:36:23 +0200, Pavel Stehule wrote:
> > > 2. Lot of used tables are pretty wide - 60, 120, .. columns
> > >
> > > Now, I am doing profiling, and I see so most time is related to
> > >
> > > ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)
> >
> > Yea, that's known - I've complained about this a couple times. You could
> > try whether the following master branch helps:
> > https://git.postgresql.org/gitweb/?p=users/andresfreund/
> > postgres.git;a=shortlog;h=refs/heads/simple_statement_perf
> >
> > That's just micro-optimization though, not a more fundamental
> > solution. But for me it yields pretty nice speedups for cases with long
> > tlists.
> >
> >
> it is just this patch
> 
>         HeapTuple   tup;
>         Form_pg_type typTup;
> 
> +       if (typid < FirstBootstrapObjectId)
> +           break;
> +
>         tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
>         if (!HeapTupleIsValid(tup))
>             elog(ERROR, "cache lookup failed for type %u", typid);

No.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] is possible cache tupledesc templates in executionplan? significant performance issue, maybe bug?

From

Pavel Stehule

Date:

08 October 2017, 20:04:59

2017-10-08 18:59 GMT+02:00 Andres Freund <andres@anarazel.de>:

On 2017-10-08 18:57:28 +0200, Pavel Stehule wrote:
> 2017-10-08 18:44 GMT+02:00 Andres Freund <andres@anarazel.de>:
>
> > Hi,
> >
> > On 2017-10-08 18:36:23 +0200, Pavel Stehule wrote:
> > > 2. Lot of used tables are pretty wide - 60, 120, .. columns
> > >
> > > Now, I am doing profiling, and I see so most time is related to
> > >
> > > ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)
> >
> > Yea, that's known - I've complained about this a couple times. You could
> > try whether the following master branch helps:
> > https://git.postgresql.org/gitweb/?p=users/andresfreund/
> > postgres.git;a=shortlog;h=refs/heads/simple_statement_perf
> >
> > That's just micro-optimization though, not a more fundamental
> > solution. But for me it yields pretty nice speedups for cases with long
> > tlists.
> >
> >
> it is just this patch
>
> HeapTuple tup;
> Form_pg_type typTup;
>
> + if (typid < FirstBootstrapObjectId)
> + break;
> +
> tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
> if (!HeapTupleIsValid(tup))
> elog(ERROR, "cache lookup failed for type %u", typid);

No.

please, how I can clone your repo?

Re: [HACKERS] is possible cache tupledesc templates in executionplan? significant performance issue, maybe bug?

From

Pavel Stehule

Date:

08 October 2017, 20:10:32

2017-10-08 19:04 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-10-08 18:59 GMT+02:00 Andres Freund <andres@anarazel.de>:
On 2017-10-08 18:57:28 +0200, Pavel Stehule wrote:
> 2017-10-08 18:44 GMT+02:00 Andres Freund <andres@anarazel.de>:
>
> > Hi,
> >
> > On 2017-10-08 18:36:23 +0200, Pavel Stehule wrote:
> > > 2. Lot of used tables are pretty wide - 60, 120, .. columns
> > >
> > > Now, I am doing profiling, and I see so most time is related to
> > >
> > > ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)
> >
> > Yea, that's known - I've complained about this a couple times. You could
> > try whether the following master branch helps:
> > https://git.postgresql.org/gitweb/?p=users/andresfreund/
> > postgres.git;a=shortlog;h=refs/heads/simple_statement_perf
> >
> > That's just micro-optimization though, not a more fundamental
> > solution. But for me it yields pretty nice speedups for cases with long
> > tlists.
> >
> >
> it is just this patch
>
> HeapTuple tup;
> Form_pg_type typTup;
>
> + if (typid < FirstBootstrapObjectId)
> + break;
> +
> tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
> if (!HeapTupleIsValid(tup))
> elog(ERROR, "cache lookup failed for type %u", typid);

No.

please, how I can clone your repo?

I found it

[HACKERS] Re: is possible cache tupledesc templates in execution plan?significant performance issue, maybe bug?

From

Pavel Stehule

Date:

08 October 2017, 20:23:59

The following workaround is working

create view xxxx as select CISLOEXEKUCE, MT.ID_NAJDATSPLT
      from najzalobpr MT, najvzallok A1,
                    NAJZALOBST A2, NAJZALOBCE A3 where
                    MT.ID_NAJVZALLOK= A1.ID_NAJVZALLOK AND
                    A1.ID_NAJZALOBST=A2.ID_NAJZALOBST AND
                    A2.ID_NAJZALOBCE= A3.ID_NAJZALOBCE;

and function should be changed to

BEGIN
   BEGIN
    select CISLOEXEKUCE INTO STRICT mADRA
      from xxxx
      WHERE id_najdatsplt = mID_najdatsplt LIMIT 1;
    EXCEPTION
      WHEN OTHERS THEN
        mADRA := NULL;
    END;

Result:=mADRA;
return(Result);
end;

So this issue is really related to tupleDesc management

I found a bug in this workaround. It doesn't work

Pavel

Re: [HACKERS] is possible cache tupledesc templates in executionplan? significant performance issue, maybe bug?

From

Pavel Stehule

Date:

09 October 2017, 07:07:35

2017-10-08 19:10 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-10-08 19:04 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-10-08 18:59 GMT+02:00 Andres Freund <andres@anarazel.de>:
On 2017-10-08 18:57:28 +0200, Pavel Stehule wrote:
> 2017-10-08 18:44 GMT+02:00 Andres Freund <andres@anarazel.de>:
>
> > Hi,
> >
> > On 2017-10-08 18:36:23 +0200, Pavel Stehule wrote:
> > > 2. Lot of used tables are pretty wide - 60, 120, .. columns
> > >
> > > Now, I am doing profiling, and I see so most time is related to
> > >
> > > ExecTypeFromTLInternal(List *targetList, bool hasoid, bool skipjunk)
> >
> > Yea, that's known - I've complained about this a couple times. You could
> > try whether the following master branch helps:
> > https://git.postgresql.org/gitweb/?p=users/andresfreund/
> > postgres.git;a=shortlog;h=refs/heads/simple_statement_perf
> >
> > That's just micro-optimization though, not a more fundamental
> > solution. But for me it yields pretty nice speedups for cases with long
> > tlists.
> >
> >
> it is just this patch
>
> HeapTuple tup;
> Form_pg_type typTup;
>
> + if (typid < FirstBootstrapObjectId)
> + break;
> +
> tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typid));
> if (!HeapTupleIsValid(tup))
> elog(ERROR, "cache lookup failed for type %u", typid);

No.

please, how I can clone your repo?

I found it

With your branch the execution is about 15-20% faster - so overhead of exec init is more significant.

Unfortunately Oracle is significantly faster for this pattern

Regards

Pavel