Re: sortsupport for text - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: sortsupport for text
Msg-id CAEYLb_XAW=jj8ejVj8eTU+PA+QDnsTc4Fr5W2qDb7jnWkBTz_A@mail.gmail.com
In response to Re: sortsupport for text  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 15 June 2012 21:06, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jun 15, 2012 at 1:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Fri, Jun 15, 2012 at 12:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> (And from a performance standpoint, I'm not entirely convinced it's not
>>>> a bug, anyway.  Worst-case behavior could be pretty bad.)
>>
>>> Instead of simply asserting that, could you respond to the specific
>>> points raised in my analysis?  I think there's no way it can be bad.
>>> I am happy to be proven wrong, but I like to understand why it is that
>>> I am wrong before changing things.
>>
>> Maybe I missed something, but as far as I saw your argument was not that
>> the performance wasn't bad but that the rest of the sort code would
>> dominate the runtime anyway.  I grant that entirely, but that doesn't
>> mean that it's good for this piece of it to possibly have bad behavior.
>
> That, plus the fact that not wasting memory in code paths where memory
> is at a premium seems important to me.  I'm shocked that either of you
> think it's OK to overallocate by as much as 2X in a code path that's
> only going to be used when we're going through fantastic gyrations to
> make memory usage fit inside work_mem.  The over-allocation by itself
> could easily exceed work_mem.

That seems pretty thin to me. We're talking about a couple of buffers
whose ultimate size is only roughly big enough to hold the largest
text datum seen when sorting. Meanwhile, if it's the leading key we're
dealing with (and of course, it usually will be), then before work_mem
is exceeded the *entire* set of strings to be sorted is sitting in
palloc()'d memory anyway. I'm surprised that you didn't immediately
concede the point, to be honest.
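
To make the scale of the over-allocation concrete, here is a minimal
sketch of the kind of doubling scratch buffer under discussion. It is
not the patch's code: ScratchBuf, scratch_reserve(), scratch_cstring()
and the 1024-byte starting size are hypothetical names and values
chosen for illustration, and it uses plain malloc()/free() rather than
palloc(). The assumption is that each text datum is copied into the
buffer and NUL-terminated so it can be handed to strcoll().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch buffer; not PostgreSQL's actual sortsupport state. */
typedef struct ScratchBuf
{
    char   *data;   /* allocated storage, or NULL before first use */
    size_t  cap;    /* bytes currently allocated */
} ScratchBuf;

#define SCRATCH_INITIAL_CAP 1024    /* assumed starting size */

/*
 * Make sure the buffer can hold `need` bytes, doubling the capacity
 * until it fits.  Returns 0 on success, -1 if allocation fails.
 */
static int
scratch_reserve(ScratchBuf *buf, size_t need)
{
    size_t  newcap;
    char   *newdata;

    assert(buf != NULL);
    if (need <= buf->cap)
        return 0;               /* already big enough, nothing to do */

    newcap = buf->cap ? buf->cap : SCRATCH_INITIAL_CAP;
    while (newcap < need)
    {
        if (newcap > SIZE_MAX / 2)
        {
            newcap = need;      /* doubling would overflow; fall back */
            break;
        }
        newcap *= 2;
    }

    /* Old contents are scratch data, so allocate fresh instead of realloc. */
    newdata = malloc(newcap);
    if (newdata == NULL)
        return -1;
    free(buf->data);
    buf->data = newdata;
    buf->cap = newcap;
    return 0;
}

/*
 * Copy a datum that is not NUL-terminated into the buffer and terminate
 * it.  Returns the buffer contents, or NULL on allocation failure.
 */
static const char *
scratch_cstring(ScratchBuf *buf, const char *datum, size_t len)
{
    if (scratch_reserve(buf, len + 1) != 0)
        return NULL;
    memcpy(buf->data, datum, len);
    buf->data[len] = '\0';
    return buf->data;
}
```

Once a request exceeds the starting size, the capacity in this sketch
stays below twice the largest request seen, so with two such buffers
the slack is bounded by roughly twice the size of the largest datum,
however many strings are being sorted.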

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

