Thread: Memory leak with palloc

Memory leak with palloc

From

Han Holl

Date:

29 November 2002, 09:16:34

Hello,

Let's say I want to run
select
  count(column1) as column1,
  count(column2) as column2,
  ...
  count(column228) as column228
from aview;

coulumni are really function calls that return a computed
text that is palloc'd.
Postgres seems to free the palloc'd memory at the end of the
select statement, which means that the above query may of may
not succeed depending on machine size, database size and amount
of memory allocated.
Is there a way (and if there isn't, should there be) to tell
postgres either:
  1. Please release this memory when you're done with this record.
or 2. Here is a pointer to static memory. Please don't free.

Both would solve my problem (and I guess 2 would be the more efficient).

Cheers,

Han Holl

PS The above query failed very quickly indeed on Postgres 7.2.3, (due to
memory exhaustion,) which seems to have a additional memory leak of it's
own. On 7.3RC2 it's running fine but consuming memory at a steady rate
(from 10M to 115M in 35 minutes, with 1.3G virtual memory I', confident
that it will complete).

Re: Memory leak with palloc

From

Tom Lane

Date:

29 November 2002, 11:24:40

Han Holl <han.holl@prismant.nl> writes:
> Postgres seems to free the palloc'd memory at the end of the
> select statement,

It should do so sooner than that.  Can you provide a self-contained
example?

            regards, tom lane

Re: Memory leak with palloc

From

"Han Holl"

Date:

30 November 2002, 11:07:35

On Fri, Nov 29, 2002 at 05:24:20PM +0100, tgl@sss.pgh.pa.us wrote:
>
> Han Holl <han.holl@prismant.nl> writes:
> > Postgres seems to free the palloc'd memory at the end of the
> > select statement,
>
> It should do so sooner than that.  Can you provide a self-contained
> example?

I have tried to do that and failed. The problem proved to have
nothing to do with palloc (I was too hasty there), but there
_is_ something fishy here:

Given table:
CREATE TABLE "rubdefs" (
        "srto" character(1) NOT NULL,
        "titel" character varying(40),
        "rubnr" smallint
);
and the function:
CREATE FUNCTION "rubton" (text,text) RETURNS smallint AS '
select rubnr from rubdefs where srto = $1 and titel = $2
' LANGUAGE 'sql' WITH ( iscachable );

A query like:
select count( rubton(udps.srto, 'adres')) from udps where srto = 'T';

on a view will use (leak) approximately 64 bytes per tuples visited,
per column. My original query had 230 columns and 450000 tuples, so
it would have required about 6000 Mb.
Rewriting the mapping function as:

create function rubton2(text, text) returns int as '
declare
  rsl integer;
begin
  select into rsl rubnr from rubdefs where srto = $1 and titel = $2;
  return rsl;
end
' language 'plpgsql';

solves the problem (but is slightly slower).

So it looks like using the pure SQL version here is not a good idea.
I'm quite willing to produce an example of this behaviour, but I
suspect that it's 'known behaviour' to experts.

Cheers,

Han Holl

>
>                regards, tom lane
>

Re: Memory leak with palloc

From

Tom Lane

Date:

30 November 2002, 11:21:37

"Han Holl" <han.holl@prismant.nl> writes:
> So it looks like using the pure SQL version here is not a good idea.
> I'm quite willing to produce an example of this behaviour, but I
> suspect that it's 'known behaviour' to experts.

No, not to me anyway.  I would not be surprised if there's some leak in
the SQL-function code, but could I trouble you for a complete example,
rather than having to reverse-engineer one?

            regards, tom lane

Re: Memory leak with palloc

From

"Han Holl"

Date:

01 December 2002, 15:45:18

On Sat, Nov 30, 2002 at 05:20:57PM +0100, tgl@sss.pgh.pa.us wrote:
>
> "Han Holl" <han.holl@prismant.nl> writes:
> > So it looks like using the pure SQL version here is not a good idea.
> > I'm quite willing to produce an example of this behaviour, but I
> > suspect that it's 'known behaviour' to experts.
>
> No, not to me anyway.  I would not be surprised if there's some leak in
> the SQL-function code, but could I trouble you for a complete example,
> rather than having to reverse-engineer one?

Of course. I hope you have a Linux like system, otherwise the following
might not run.

If I rub this shellscript, I see with tom in a different terminal that
the postmaster process grows to about 28Mb, and then collapses to the
initial 1.5 Mb.

To avoid quoting problems, I;ll attach the shellscript as well

Cheers

Han Holl

#!/bin/bash


PGLIB=/usr/local/pgsql/lib
INC=/usr/src/postgresql-7.3rc2/src/include
TABLE=/tmp/demotable.tbl
MAPPING=/tmp/mapping.tbl
SQL=/tmp/demo.sql

set -e
[ -f $TABLE ] || {
  perl -e 'for ($i = 0 ; $i < 300000 ; $i++) { printf("%s%d\n",chr(65 + ($i % 4)), $i); }' >$TABLE
}
[ -f $MAPPING ] || {
  for sr in A B C D
  do
    for i in 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    do
      echo -e "$sr\tstring$i\t$i"
    done
  done >$MAPPING
}


cat >$SQL <<EOF
drop table demotable cascade;
create table demotable ( column1 text );
drop table demorub cascade;
create table demorub (
  srto char,
  titel text,
  rnum integer
);

copy demotable from '$TABLE';
copy demorub from '$MAPPING';
create or replace function mapping (char, text) returns int as '
select rnum from demorub where srto = \$1 and titel = \$2
' language 'sql';
select count(mapping(substring(column1,1,1), 'string2')) from demotable;
EOF
psql <$SQL

Attachment

memdemo.sh

Re: Memory leak with palloc

From

Tom Lane

Date:

01 December 2002, 17:09:40

"Han Holl" <han.holl@prismant.nl> writes:
>> No, not to me anyway.  I would not be surprised if there's some leak in
>> the SQL-function code, but could I trouble you for a complete example,
>> rather than having to reverse-engineer one?

> Of course. I hope you have a Linux like system, otherwise the following
> might not run.

Okay, it turns out this is indeed an aspect of a known problem, namely
that SQL-language functions aren't good about cleaning up query-lifespan
data created by the queries they run.  The individual queries really
ought to be run using a query memory context that's shorter-lived than
that of the query that's calling the function.  But they're not, yet.

I was able to eliminate the memory leak in your example by adding
"pfree(scanstate);" to the end of ExecEndSeqScan().  However, that's
not a general solution, since there are comparable leaks in most of
the other executor node types; and even if we eliminated them all today,
it would be difficult to prevent new ones from appearing in future.

The right long-term solution is to arrange for all query-lifetime data
to be allocated in a specific context that can be freed in-toto when
we end the query. To do this cleanly, we have to fix the executor to
not scribble on plan trees (executor state nodes should point to plan
nodes, never vice-versa).  I've been intending to do that for awhile,
but it's not done yet ...

            regards, tom lane

Re: Memory leak with palloc

From

"Han Holl"

Date:

01 December 2002, 17:39:33

On Sun, Dec 01, 2002 at 09:45:14PM +0100, Han.Holl@prismant.nl wrote:
> Of course. I hope you have a Linux like system, otherwise the following
> might not run.
>
> If I rub this shellscript, I see with tom in a different terminal that
Lousy spelling here! Please read as:
  If I run this shellscript, I see with top in a different terminal that
> the postmaster process grows to about 28Mb, and then collapses to the
> initial 1.5 Mb.

Sorry,

Han Holl

Re: Memory leak with palloc

From

Han Holl

Date:

02 December 2002, 10:58:30

Thanks for the information.
I must say the involvement of the developers of PostgreSQL is
impressive.

Cheers,

Han Holl

On Sunday 01 December 2002 11:09 pm, tgl@sss.pgh.pa.us wrote:
>
> Okay, it turns out this is indeed an aspect of a known problem, namely
> that SQL-language functions aren't good about cleaning up query-lifespan
> data created by the queries they run.  The individual queries really
> ought to be run using a query memory context that's shorter-lived than
> that of the query that's calling the function.  But they're not, yet.
>
> I was able to eliminate the memory leak in your example by adding
> "pfree(scanstate);" to the end of ExecEndSeqScan().  However, that's
> not a general solution, since there are comparable leaks in most of
> the other executor node types; and even if we eliminated them all today,
> it would be difficult to prevent new ones from appearing in future.
>
> The right long-term solution is to arrange for all query-lifetime data
> to be allocated in a specific context that can be freed in-toto when
> we end the query. To do this cleanly, we have to fix the executor to
> not scribble on plan trees (executor state nodes should point to plan
> nodes, never vice-versa).  I've been intending to do that for awhile,
> but it's not done yet ...
>
>                regards, tom lane

Re: Memory leak with palloc

From

Tom Lane

Date:

15 December 2002, 16:05:15

A couple weeks ago I said:
> The right long-term solution is to arrange for all query-lifetime data
> to be allocated in a specific context that can be freed in-toto when
> we end the query. To do this cleanly, we have to fix the executor to
> not scribble on plan trees (executor state nodes should point to plan
> nodes, never vice-versa).  I've been intending to do that for awhile,
> but it's not done yet ...

Just FYI, this is now done.  Your example doesn't leak memory anymore
in CVS tip.

            regards, tom lane