Some memory allocations in gin fastupdate code are a bit brain dead - Mailing list pgsql-hackers

From David Rowley
Subject Some memory allocations in gin fastupdate code are a bit brain dead
Msg-id CAKJS1f8vn-iSBE8PKeVHrnhvyjRNYCxguPFFY08QLYmjWG9hPQ@mail.gmail.com
List pgsql-hackers
I recently stumbled upon the following code in ginfast.c:

while (collector->ntuples + nentries > collector->lentuples)
{
    collector->lentuples *= 2;
    collector->tuples = (IndexTuple *) repalloc(collector->tuples,
      sizeof(IndexTuple) * collector->lentuples);
}

It does not seem very smart to perform the repalloc() inside the loop
here, as lentuples may need to be doubled many times before it is large
enough to store all the required values, and every one of those
iterations reallocates the array.

The attached patch changes things around so that the repalloc() is
done outside of the loop. i.e. done only once, after we've determined
the correct size to reallocate it to. I've also added an else
condition so that we only bother checking this case when the tuples[]
array has already been allocated.
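
As a rough, standalone illustration of that restructuring (this is not the
attached patch): the sketch below uses plain malloc()/realloc() instead of
palloc()/repalloc() so it compiles outside the PostgreSQL tree. The
Collector struct, the collector_reserve() name, the initial allocation size
and the error handling are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the collector; only the fields used here. */
typedef struct
{
    void  **tuples;     /* array of tuple pointers */
    size_t  ntuples;    /* slots currently in use */
    size_t  lentuples;  /* slots currently allocated */
} Collector;

/*
 * Make room for nentries more slots.  Returns 0 on success, or -1 if the
 * allocation fails, in which case the collector is left unchanged.
 * Overflow of the doubled length is not handled in this sketch.
 */
static int
collector_reserve(Collector *c, size_t nentries)
{
    assert(nentries > 0);

    if (c->tuples == NULL)
    {
        /* First call: allocate the array (initial size is an assumption). */
        c->tuples = malloc(sizeof(void *) * nentries);
        if (c->tuples == NULL)
            return -1;
        c->lentuples = nentries;
    }
    else if (c->ntuples + nentries > c->lentuples)
    {
        size_t  newlen = c->lentuples;
        void  **newtuples;

        /* Work out the final length first ... */
        do
            newlen *= 2;
        while (c->ntuples + nentries > newlen);

        /* ... then reallocate exactly once. */
        newtuples = realloc(c->tuples, sizeof(void *) * newlen);
        if (newtuples == NULL)
            return -1;
        c->tuples = newtuples;
        c->lentuples = newlen;
    }
    return 0;
}
```

With 4 slots allocated and all 4 in use, a request for 30 more entries
doubles the length four times (8, 16, 32, 64) but calls realloc() only once,
where the loop quoted above would have reallocated on each doubling.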

I tested with the following:

create table t1 (a int[], b int[]);
create index on t1 using gin (a,b) with (fastupdate = on);
truncate t1;
insert into t1
  select '{1}'::int[], ('{' || string_agg(y::text, ',') || '}')::int[]
  from generate_series(1,1000000) x, generate_series(1,10) y
  group by x
  order by x desc;

In the above case, with an unpatched master, I measured the repalloc()
as consuming only 0.6% of the total runtime of the INSERT, so this does
not really improve performance by much, but I thought it was worth
fixing nevertheless.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
