Home > mailing lists

Re: Tid scan improvements - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Tid scan improvements
Date	November 24, 2018 02:46:17
Msg-id	8d68f680-1a1c-142d-428d-987eb6d16db0@2ndquadrant.com Whole thread Raw
In response to	Re: Tid scan improvements (Edmund Horner <ejrh00@gmail.com>)
Responses	Re: Tid scan improvements
List	pgsql-hackers

Tree view


On 11/24/18 1:56 AM, Edmund Horner wrote:
> On Fri, 23 Nov 2018 at 07:03, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> On 11/22/18 8:41 AM, David Rowley wrote:
>>> ...
>>> 3. I'd rather see EnsureTidRangeSpace() keep doubling the size of the
>>> allocation until it reaches the required size. See how
>>> MakeSharedInvalidMessagesArray() does it.  Doing it this way ensures
>>> we always have a power of two sized array which is much nicer if we
>>> ever reach the palloc() limit as if the array is sized at the palloc()
>>> limit / 2 + 1, then if we try to double it'll fail.  Of course, it's
>>> unlikely to be a problem here, but... the question would be how to
>>> decide on the initial size.
>>
>> I think it kinda tries to do that in some cases, by doing this:
>>
>>      *numAllocRanges *= 2;
>>      ...
>>      tidRanges = (TidRange *)
>>          repalloc(tidRanges,
>>                   *numAllocRanges * sizeof(TidRange));
>>
>> The problem here is that what matters is not numAllocRanges being 2^N,
>> but the number of bytes allocated being 2^N. Because that's what ends up
>> in AllocSet, which keeps lists of 2^N chunks.
>>
>> And as TidRange is 12B, so this is guaranteed to waste memory, because
>> no matter what the first factor is, the result will never be 2^N.
> 
> For simplicity, I think making it a strict doubling of capacity each
> time is fine.  That's what we see in numerous other places in the
> backend code.
> 

Sure.

> What we don't really see is intentionally setting the initial capacity
> so that each subsequent capacity is close-to-but-not-exceeding a power
> of 2 bytes.  You can't really do that optimally if working in terms of
> whole numbers of items that aren't each a power of 2 size.  This step,
> there may be 2/3 of an item spare; next step, we'll have a whole item
> spare that we're not going to use.

Ah, I missed the detail with setting initial size.

> So we could keep track in terms of bytes allocated, and then figure
> out how many items we can fit at the current time.
> 
> In my opinion, such complexity is overkill for Tid scans.
> 
> Currently, we try to pick an initial size based on the number of
> expressions.  We assume each expression will yield one range, and
> allow that a saop expression might require us to enlarge the array.
> 
> Again, for simplicity, we should scrap that and pick something like
> floor(256/sizeof(TidRange)) = 21 items, with about 1.5% wastage.
> 

Probably. I don't think it'd be a lot of code to do the exact sizing,
but you're right 1.5% is close enough. As long as there is a comment
explaining the initial sizing, I'm fine with that.

If I could suggest one more thing, I'd define a struct combining the
array of ranges, numRanges and numAllocRangeslike:

typedef struct TidRanges
{
    int         numRanges;
    int         numAllocRanges;
    TidRange    ranges[FLEXIBLE_ARRAY_MEMBER];
} TidRanges;

and use that instead of the plain array. I find it easier to follow
compared to passing the various fields directly (sometimes as a value,
sometimes pointer to the value, etc.).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-hackers by date:

From: Tomas Vondra
Date: 24 November 2018, 02:09:25
Subject: Re: COPY FROM WHEN condition

From: Paul A Jungwirth
Date: 24 November 2018, 02:55:54
Subject: Re: SQL:2011 PERIODS vs Postgres Ranges?

Re: Tid scan improvements - Mailing list pgsql-hackers

Previous

Next