Re: [HACKERS] PATCH: two slab-like memory allocators - Mailing list pgsql-hackers
From | Tomas Vondra
---|---
Subject | Re: [HACKERS] PATCH: two slab-like memory allocators
Date |
Msg-id | 19464ac8-e66b-3598-858f-29c749bec35a@2ndquadrant.com
In response to | Re: [HACKERS] PATCH: two slab-like memory allocators (Andres Freund <andres@anarazel.de>)
Responses | Re: [HACKERS] PATCH: two slab-like memory allocators
List | pgsql-hackers
On 02/14/2017 03:22 AM, Andres Freund wrote:
> Hi,
>
> On 2017-02-11 14:40:18 +0100, Tomas Vondra wrote:
>> On 02/11/2017 02:33 AM, Andres Freund wrote:
>>>> I have a hard time believing that the cache efficiency of linked lists
>>>> (which may or may not be real in this case) outweighs this, but if you
>>>> want to try, be my guest.
>>>
>>> I'm not following - why would there be overhead in anything for
>>> allocations bigger than 4 (or maybe 8) bytes? You can store the list
>>> (via chunk ids, not pointers) inside the chunks themselves, where data
>>> otherwise would be. And I don't see why you'd need a doubly linked
>>> list, as the only operations that are needed are to push to the front
>>> of the list, and to pop from the front of the list - and both
>>> operations are simple to do with a singly linked list?
>>
>> Oh! I had not considered storing the chunk indexes (for the linked
>> lists) in the chunks themselves, which obviously eliminates the
>> overhead concerns, and you're right that a singly linked list should be
>> enough.
>>
>> There will be some minimum-chunk-size requirement, depending on the
>> block size/chunk size. I wonder whether it makes sense to try to be
>> smart and make it dynamic, so that we only require 1B or 2B for cases
>> when only that many chunks fit into a block, or just say that it's 4B
>> and be done with it.
>
> I doubt it's worth it - it seems likely that the added branch is more
> noticeable overall than the possible savings of 3 bytes. Also, won't the
> space be lost due to alignment *anyway*?
>
> + /* chunk, including SLAB header (both addresses nicely aligned) */
> + fullChunkSize = MAXALIGN(sizeof(SlabChunkData) + MAXALIGN(chunkSize));
>
> In that case I'd just Assert(MAXIMUM_ALIGNOF >= sizeof(slist_head)) and
> use a plain slist - no point in being more careful than that.

Hmm, I think you're right.

>> I mean 2^32 chunks ought to be enough for anyone, right?
>
> Yea, that seems enough; but given the alignment thing pointed out above,
> I think we can just use plain pointers - and that definitely should be
> enough :P

People in the year 2078: Why the hell did they only use 32 bits? Wasn't it
obvious we'd have tiny computers with 32EB of RAM? ;-)

>> I'm still not buying the cache efficiency argument, though. One of
>> the reasons is that the implementation prefers blocks with fewer
>> free chunks when handling palloc(), so pfree() is making the block
>> less likely to be chosen by the next palloc().
>
> That'll possibly de-optimize L1, but for L2 usage the higher density
> seems like it'll be a win. All this memory is only accessed by a
> single backend, so packing as densely as possible is good.
>
>>> If so, if you have e.g. 8 byte allocations and 64kb sized blocks,
>>> you end up with an array of 1024 doubly linked lists, which'll
>>> take up 64kb on its own. And there are certainly scenarios where
>>> even bigger block sizes could make sense. That's both memory
>>> overhead, and runtime overhead, because at reset-time we'll have
>>> to check the whole array (which'll presumably largely be empty
>>> lists). Resetting is a pretty common path...
>>
>> True, but it's not entirely clear whether resetting is common for the
>> paths where we use those new allocators.
>
> That's fair enough. There's still the memory overhead, but I guess
> we can also live with that.

Right. My ambition was not to develop another general-purpose memory
context that would work perfectly for everything, but something that works
(better than the current code) for places like reorderbuffer.
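To make the idea concrete, here is a minimal, self-contained sketch of the scheme being discussed: the per-block free list is threaded through the free chunks themselves, so it costs no extra memory, and MAXALIGN guarantees the data area can hold the link. The names (SlabBlock, SlabChunk, slab_alloc, slab_free) and the fixed 8-byte alignment are invented for this illustration; the actual patch uses SlabChunkData, MAXALIGN and PostgreSQL's slist machinery and differs in detail.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define ALIGNOF_MAX     8       /* stand-in for MAXIMUM_ALIGNOF */
#define MAXALIGN(x)     (((size_t) (x) + ALIGNOF_MAX - 1) & ~(size_t) (ALIGNOF_MAX - 1))

typedef struct SlabBlock
{
    char       *freeHead;       /* data area of the first free chunk, NULL if block is full */
    int         nfree;          /* number of free chunks on this block */
} SlabBlock;

typedef struct SlabChunk
{
    SlabBlock  *block;          /* per-chunk header: owning block */
} SlabChunk;

/*
 * While a chunk is free, the bytes that would otherwise hold user data hold
 * the link to the next free chunk instead.  MAXALIGN makes the data area big
 * enough (and aligned enough) for a pointer, so no extra space is needed.
 */
static void
slab_free(SlabBlock *block, char *data)
{
    assert(ALIGNOF_MAX >= sizeof(char *));

    *(char **) data = block->freeHead;  /* push onto the front of the list */
    block->freeHead = data;
    block->nfree++;
}

static char *
slab_alloc(SlabBlock *block)
{
    char       *data = block->freeHead;

    if (data == NULL)
        return NULL;            /* no free chunk on this block */

    block->freeHead = *(char **) data;  /* pop from the front of the list */
    block->nfree--;
    return data;
}

int
main(void)
{
    size_t      chunkSize = 24;
    /* chunk, including SLAB header (both addresses nicely aligned) */
    size_t      fullChunkSize = MAXALIGN(sizeof(SlabChunk) + MAXALIGN(chunkSize));
    int         nchunks = 4;
    SlabBlock   block = {NULL, 0};
    char       *chunks = malloc(nchunks * fullChunkSize);

    /* thread all chunks of a freshly allocated block onto the free list */
    for (int i = 0; i < nchunks; i++)
    {
        SlabChunk  *chunk = (SlabChunk *) (chunks + i * fullChunkSize);

        chunk->block = &block;
        slab_free(&block, (char *) (chunk + 1));
    }

    printf("fullChunkSize = %zu, free chunks = %d\n", fullChunkSize, block.nfree);
    printf("allocated chunk at %p, free chunks now = %d\n",
           (void *) slab_alloc(&block), block.nfree);

    free(chunks);
    return 0;
}
```

The same trick works with 32-bit chunk indexes instead of pointers, as long as each chunk can hold 4 bytes - which is where the minimum-chunk-size question above comes from.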
>> Also, if we accept that it might be a problem, what other solution do you
>> propose? I don't think just merging everything into a single list is a
>> good idea, for the reasons I explained before (it might make the resets
>> somewhat less expensive, but it'll make pfree() more expensive).
>
> Now that I think about it, a binary heap, as suggested elsewhere, isn't
> entirely trivial to use for this - it's more or less trivial to "fix"
> the heap after changing an element's value, but it's harder to find that
> element first.
>
> But a two-level list approach seems like it could work quite well -
> basically a simplified skip-list. A top-level list that has the property
> that all its elements have a distinct #free, and hanging off each of
> those a sub-list of all the other blocks with the same number of free
> chunks.
>
> I think that'd be a better implementation, but I can understand if you
> don't immediately want to go there.

I don't want to go there. I'm not all that interested in reorderbuffer, to
be honest, and this started more as a "Hold my beer!" hack, after a
midnight discussion with Petr, than a seriously meant patch. I've already
spent like 100x the time on it that I expected.

>> What might work is replacing the array with a list, though. So we'd have
>> a list of lists, which would eliminate the array overhead.
>
> That seems likely to be significantly worse, because a) iteration is
> more expensive, and b) accessing the relevant list to move between two
> different "freecount" lists would be O(n).

Oh, right, I hadn't realized we won't know the current head of the list, so
we'd have to search for it. OTOH, we could replace it with a small hash
table, which would reduce the lookup time because we'd only have to search
a single bin.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
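For readers following the thread, a minimal sketch of the two-level free list ("simplified skip-list") described above: the top level keeps one entry per distinct free-chunk count, sorted ascending, and each entry carries a sub-list of all blocks with exactly that count. All names here (FreeListEntry, BlockNode, freelist_insert, freelist_pick_densest) are invented for the illustration and are not part of the patch.

```c
#include <stdlib.h>

typedef struct BlockNode
{
    struct BlockNode *next;         /* next block with the same free count */
    /* ... block contents would follow ... */
} BlockNode;

typedef struct FreeListEntry
{
    int         nfree;              /* distinct free-chunk count (kept sorted ascending) */
    BlockNode  *blocks;             /* all blocks having exactly nfree free chunks */
    struct FreeListEntry *next;     /* next distinct count */
} FreeListEntry;

/* Insert a block under its free count, creating the top-level entry if needed. */
static void
freelist_insert(FreeListEntry **top, BlockNode *block, int nfree)
{
    FreeListEntry **entry = top;

    /* walk the (short) top-level list of distinct counts */
    while (*entry && (*entry)->nfree < nfree)
        entry = &(*entry)->next;

    if (*entry == NULL || (*entry)->nfree != nfree)
    {
        /* no blocks with this count yet: add a new top-level entry */
        FreeListEntry *e = malloc(sizeof(FreeListEntry));

        e->nfree = nfree;
        e->blocks = NULL;
        e->next = *entry;
        *entry = e;
    }

    block->next = (*entry)->blocks;
    (*entry)->blocks = block;
}

/* Pick the densest block (fewest free chunks) for the next allocation. */
static BlockNode *
freelist_pick_densest(FreeListEntry **top)
{
    FreeListEntry *e = *top;
    BlockNode  *block;

    if (e == NULL)
        return NULL;

    block = e->blocks;
    e->blocks = block->next;
    if (e->blocks == NULL)          /* drop top-level entries that became empty */
    {
        *top = e->next;
        free(e);
    }
    return block;
}
```

Part of the appeal is that a single palloc()/pfree() changes a block's free count by exactly one, so its new top-level entry is adjacent to the old one and the move can stay cheap; the small hash table keyed on the free count mentioned at the end of the message would serve the same purpose of avoiding a linear search.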