Re: [HACKERS] PATCH: two slab-like memory allocators - Mailing list pgsql-hackers
From | Tomas Vondra
---|---
Subject | Re: [HACKERS] PATCH: two slab-like memory allocators
Date |
Msg-id | 19464ac8-e66b-3598-858f-29c749bec35a@2ndquadrant.com
In response to | Re: [HACKERS] PATCH: two slab-like memory allocators (Andres Freund <andres@anarazel.de>)
Responses | Re: [HACKERS] PATCH: two slab-like memory allocators
List | pgsql-hackers
On 02/14/2017 03:22 AM, Andres Freund wrote:
> Hi,
>
> On 2017-02-11 14:40:18 +0100, Tomas Vondra wrote:
>> On 02/11/2017 02:33 AM, Andres Freund wrote:
>>>> I have a hard time believing that the cache efficiency of linked lists
>>>> (which may or may not be real in this case) outweighs this, but if you
>>>> want to try, be my guest.
>>>
>>> I'm not following - why would there be overhead in anything for
>>> allocations bigger than 4 (or maybe 8) bytes? You can store the list
>>> (via chunk ids, not pointers) inside the chunks themselves, where data
>>> otherwise would be. And I don't see why you'd need a doubly linked
>>> list, as the only operations that are needed are to push to the front
>>> of the list, and to pop from the front of the list - and both
>>> operations are simple to do with a singly linked list?
>>
>> Oh! I had not considered storing the chunk indexes (for the linked
>> lists) in the chunks themselves, which obviously eliminates the
>> overhead concerns, and you're right that a singly linked list should be
>> enough.
>>
>> There will be some minimum-chunk-size requirement, depending on the
>> block size/chunk size. I wonder whether it makes sense to try to be
>> smart and make it dynamic, so that we only require 1B or 2B for cases
>> when only that many chunks fit into a block, or just say that it's 4B
>> and be done with it.
>
> I doubt it's worth it - it seems likely that the added branch is more
> noticeable overall than the possible savings of 3 bytes. Also, won't the
> space be lost due to alignment *anyway*?
>
> + /* chunk, including SLAB header (both addresses nicely aligned) */
> + fullChunkSize = MAXALIGN(sizeof(SlabChunkData) + MAXALIGN(chunkSize));
>
> In that case I'd just Assert(MAXIMUM_ALIGNOF >= sizeof(slist_head)) and
> use a plain slist - no point in being more careful than that.

Hmm, I think you're right.

>> I mean 2^32 chunks ought to be enough for anyone, right?
>
> Yea, that seems enough; but given the alignment thing pointed out above,
> I think we can just use plain pointers - and that definitely should be
> enough :P

People in the year 2078: Why the hell did they only use 32 bits? Wasn't it
obvious we'd have tiny computers with 32EB of RAM? ;-)

>> I'm still not buying the cache efficiency argument, though. One of
>> the reasons is that the implementation prefers blocks with fewer
>> free chunks when handling palloc(), so pfree() is making the block
>> less likely to be chosen by the next palloc().
>
> That'll possibly de-optimize L1, but for L2 usage the higher density
> seems like it'll be a win. All this memory is only accessed by a
> single backend, so packing as densely as possible is good.
>
>>> If so, if you have e.g. 8 byte allocations and 64kb sized blocks,
>>> you end up with an array of 1024 doubly linked lists, which'll
>>> take up 64kb on its own. And there are certainly scenarios where
>>> even bigger block sizes could make sense. That's both memory
>>> overhead, and runtime overhead, because at reset-time we'll have
>>> to check the whole array (which'll presumably largely be empty
>>> lists). Resetting is a pretty common path...
>>
>> True, but it's not entirely clear whether resetting is common for the
>> paths where we use those new allocators.
>
> That's fair enough. There's still the memory overhead, but I guess
> we can also live with that.

Right. My ambition was not to develop another general-purpose memory
context that would work perfectly for everything, but something that works
(better than the current code) for places like reorderbuffer.
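To make the idea concrete, here is a minimal, self-contained sketch of the scheme being discussed: the per-block free list is threaded through the free chunks themselves, so it costs no extra memory, and MAXALIGN guarantees the data area can hold the link. The names (SlabBlock, SlabChunk, slab_alloc, slab_free) and the fixed 8-byte alignment are invented for this illustration; the actual patch uses SlabChunkData, MAXALIGN and PostgreSQL's slist machinery and differs in detail.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define ALIGNOF_MAX     8       /* stand-in for MAXIMUM_ALIGNOF */
#define MAXALIGN(x)     (((size_t) (x) + ALIGNOF_MAX - 1) & ~(size_t) (ALIGNOF_MAX - 1))

typedef struct SlabBlock
{
    char       *freeHead;       /* data area of the first free chunk, NULL if block is full */
    int         nfree;          /* number of free chunks on this block */
} SlabBlock;

typedef struct SlabChunk
{
    SlabBlock  *block;          /* per-chunk header: owning block */
} SlabChunk;

/*
 * While a chunk is free, the bytes that would otherwise hold user data hold
 * the link to the next free chunk instead.  MAXALIGN makes the data area big
 * enough (and aligned enough) for a pointer, so no extra space is needed.
 */
static void
slab_free(SlabBlock *block, char *data)
{
    assert(ALIGNOF_MAX >= sizeof(char *));

    *(char **) data = block->freeHead;  /* push onto the front of the list */
    block->freeHead = data;
    block->nfree++;
}

static char *
slab_alloc(SlabBlock *block)
{
    char       *data = block->freeHead;

    if (data == NULL)
        return NULL;            /* no free chunk on this block */

    block->freeHead = *(char **) data;  /* pop from the front of the list */
    block->nfree--;
    return data;
}

int
main(void)
{
    size_t      chunkSize = 24;
    /* chunk, including SLAB header (both addresses nicely aligned) */
    size_t      fullChunkSize = MAXALIGN(sizeof(SlabChunk) + MAXALIGN(chunkSize));
    int         nchunks = 4;
    SlabBlock   block = {NULL, 0};
    char       *chunks = malloc(nchunks * fullChunkSize);

    /* thread all chunks of a freshly allocated block onto the free list */
    for (int i = 0; i < nchunks; i++)
    {
        SlabChunk  *chunk = (SlabChunk *) (chunks + i * fullChunkSize);

        chunk->block = &block;
        slab_free(&block, (char *) (chunk + 1));
    }

    printf("fullChunkSize = %zu, free chunks = %d\n", fullChunkSize, block.nfree);
    printf("allocated chunk at %p, free chunks now = %d\n",
           (void *) slab_alloc(&block), block.nfree);

    free(chunks);
    return 0;
}
```

The same trick works with 32-bit chunk indexes instead of pointers, as long as each chunk can hold 4 bytes - which is where the minimum-chunk-size question above comes from.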
>> Also, if we accept that it might be a problem, what other solution do you
>> propose? I don't think just merging everything into a single list is a
>> good idea, for the reasons I explained before (it might make the resets
>> somewhat less expensive, but it'll make pfree() more expensive).
>
> Now that I think about it, a binary heap, as suggested elsewhere, isn't
> entirely trivial to use for this - it's more or less trivial to "fix"
> the heap after changing an element's value, but it's harder to find that
> element first.
>
> But a two-level list approach seems like it could work quite well -
> basically a simplified skip-list. A top-level list that has the property
> that all its elements have a distinct #free, and hanging off each of
> those a sub-list of all the other blocks with the same number of free
> chunks.
>
> I think that'd be a better implementation, but I can understand if you
> don't immediately want to go there.

I don't want to go there. I'm not all that interested in reorderbuffer, to
be honest, and this started more as a "Hold my beer!" hack, after a
midnight discussion with Petr, than a seriously meant patch. I've already
spent like 100x the time on it that I expected.

>> What might work is replacing the array with a list, though. So we'd have
>> a list of lists, which would eliminate the array overhead.
>
> That seems likely to be significantly worse, because a) iteration is
> more expensive, and b) accessing the relevant list to move between two
> different "freecount" lists would be O(n).

Oh, right, I hadn't realized we won't know the current head of the list, so
we'd have to search for it. OTOH, we could replace it with a small hash
table, which would reduce the lookup time because we'd only have to search
a single bin.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
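For readers following the thread, a minimal sketch of the two-level free list ("simplified skip-list") described above: the top level keeps one entry per distinct free-chunk count, sorted ascending, and each entry carries a sub-list of all blocks with exactly that count. All names here (FreeListEntry, BlockNode, freelist_insert, freelist_pick_densest) are invented for the illustration and are not part of the patch.

```c
#include <stdlib.h>

typedef struct BlockNode
{
    struct BlockNode *next;         /* next block with the same free count */
    /* ... block contents would follow ... */
} BlockNode;

typedef struct FreeListEntry
{
    int         nfree;              /* distinct free-chunk count (kept sorted ascending) */
    BlockNode  *blocks;             /* all blocks having exactly nfree free chunks */
    struct FreeListEntry *next;     /* next distinct count */
} FreeListEntry;

/* Insert a block under its free count, creating the top-level entry if needed. */
static void
freelist_insert(FreeListEntry **top, BlockNode *block, int nfree)
{
    FreeListEntry **entry = top;

    /* walk the (short) top-level list of distinct counts */
    while (*entry && (*entry)->nfree < nfree)
        entry = &(*entry)->next;

    if (*entry == NULL || (*entry)->nfree != nfree)
    {
        /* no blocks with this count yet: add a new top-level entry */
        FreeListEntry *e = malloc(sizeof(FreeListEntry));

        e->nfree = nfree;
        e->blocks = NULL;
        e->next = *entry;
        *entry = e;
    }

    block->next = (*entry)->blocks;
    (*entry)->blocks = block;
}

/* Pick the densest block (fewest free chunks) for the next allocation. */
static BlockNode *
freelist_pick_densest(FreeListEntry **top)
{
    FreeListEntry *e = *top;
    BlockNode  *block;

    if (e == NULL)
        return NULL;

    block = e->blocks;
    e->blocks = block->next;
    if (e->blocks == NULL)          /* drop top-level entries that became empty */
    {
        *top = e->next;
        free(e);
    }
    return block;
}
```

Part of the appeal is that a single palloc()/pfree() changes a block's free count by exactly one, so its new top-level entry is adjacent to the old one and the move can stay cheap; the small hash table keyed on the free count mentioned at the end of the message would serve the same purpose of avoiding a linear search.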