Re: Dynamic shared memory areas - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Dynamic shared memory areas |
Date | |
Msg-id | CAEepm=0-pbokaQdCXhtYn=w64MmdJz4hYW7qcSU235ar276x7w@mail.gmail.com Whole thread Raw |
In response to | Re: Dynamic shared memory areas (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Dynamic shared memory areas
Re: Dynamic shared memory areas |
List | pgsql-hackers |
On Wed, Nov 16, 2016 at 2:31 PM, Robert Haas <robertmhaas@gmail.com> wrote: > ... my > view is that utils/mmgr is a better fit, ... OK, changed. > The #ifdef HAVE__BUILTIN_TYPES_COMPATIBLE_P hack in relptr.h, for > which I believe I'm responsible, is ugly. There is probably a > compiler out there that has __typeof__ but not > __builtin_types_compatible_p, and we could cater to that by adding a > separate configure test for __typeof__. A little browsing of the > documentation on at https://gcc.gnu.org/onlinedocs/ seems to suggest > that __builtin_types_compatible_p didn't exist before GCC 3.1, but > __typeof__ seems to be there even in 2.95.3. That's not very > interesting, honestly, because 3.1 came out in 2002, but there might > be non-GCC compilers that implement __typeof__ but not > __builtin_types_compatible_p. I am inclined not to worry about this > unless somebody feels otherwise, but it's not beautiful. +1 > I wonder if it wouldn't be a good idea to allow the dsa_area_control > to be stored wherever instead of insisting that it's got to be located > inside the first DSM segment backing the pool itself. For example, > you could make dsa_create_dynamic() take a third argument which is a > pointer to enough space for a dsa_area_control, and it would > initialize it in place. Then you could make dsa_attach_dynamic() take > a pointer to that same structure instead of taking a dsa_handle. > Actually, I think dsa_handle goes away: it becomes the caller's > responsibility to figure out the correct pointer address to pass in > the second process. The advantage of this design change is that you > could stuff the dsa_area_control into the existing parallel DSM and > only create additional DSMs if anything is actually allocated. What > would be even cooler is to allow a little bit of space inside the > parallel DSM that gets used first, and then, when that overflows, we > start creating new DSMs. Say, 64kB. Am I sounding greedy yet? It > just seems like a good idea not to needlessly multiply the number of > DSMs. Alternatively we could stop using DSM directly for parallel query and just use a DSA area for all the shmem needs of a parallel query execution as I mentioned elsewhere[1]. That would involve changing a bunch of stuff including the FDW interface, so that's probably a bad idea at this point. So I tried this in-place idea out today. See the attached version which provides: dsa_area *dsa_create(...); dsa_area *dsa_attach(dsa_handle handle); Those replace the functions that previously had _dynamic in the name. Then I have new variants: dsa_area *dsa_create_in_place(void *place, size_t size, ...); dsa_area *dsa_attach_in_place(void *place); Those let you create an area in existing memory (in a DSM segment, traditional inherited shmem). The in-place versions will stlll create DSM segments on demand as required, though I suppose if you wanted to prevent that you could with dsa_set_size_limit(area, size). One complication is that of course the automatic detach feature doesn't work if you're in some random piece of memory. I have exposed dsa_on_dsm_detach, so that there is a way to hook it up to the detach hook for a pre-existing DSM segment, but that's the caller's responibility. This is important because although the first 'segment' is created in place, if other segments have been created we still have to manage those; it gets tricky if you are the last attached process for the area, but do not have a particular segment mapped in currently because you've never accessed it; that works with a regular dsa_create area, because everyone has the control segment mapped in so we use that one's dsm_on_detach hook and from there we can do the cleanup we need to do, but in this new case there is no such thing. You can see an example of manual detach hook installation in dsa-area-for-executor-v2.patch which I'll now go and post over in that other thread. > + /* Unlink span. */ > + /* TODO: Does it even need to be linked in in the > first place? */ > + LWLockAcquire(DSA_SCLASS_LOCK(area, DSA_SCLASS_SPAN_LARGE), > + LW_EXCLUSIVE); > + unlink_span(area, span); > + LWLockRelease(DSA_SCLASS_LOCK(area, DSA_SCLASS_SPAN_LARGE)); > > In answer to the TODO, I think this isn't strictly necessary, but it > seems like a good idea to do it anyway for debuggability. If we > didn't do this, the space occupied by a large object wouldn't be > "known" in any way other than by having disappeared from the free page > map, whereas this way it's linked into the DSA's listed of allocated > chunks like anything else, so for example dsa_dump() can print it. I > recommend removing this TODO. Removed. > + /* > + * TODO: We could take Max(fpm->contiguous_pages, result of > + * FreePageBtreeCleanup) and give it to FreePageManagerUpdatLargest as a > + * starting point for its search, potentially avoiding a bunch of work, > + * since there is no way the largest contiguous run is bigger than that. > + */ > > Typo: Updat. Fixed. > + /* > + * TODO: Figure out how to avoid setting this every time. It > may not be as > + * simple as it looks. > + */ > > Something isn't right with this function, because it takes the trouble > to calculate a value for contiguous_pages that it then doesn't use for > anything. I think the original idea here was that if we calculated a > value for contiguous_pages that was less than fpm->contiguous_pages, > there was no need to dirty it. If we can't get away with that for > some reason, then there's no point in calculating the value in the > first place. Yeah. Will come back on this point. The attached patch is just for discussion only... I need to resolve that contiguous_pages question and do some more testing. [1] https://www.postgresql.org/message-id/CAEepm=0HmRefi1+xDJ99Gj5APHr8Qr05KZtAxrMj8b+ay3o6sA@mail.gmail.com -- Thomas Munro http://www.enterprisedb.com
Attachment
pgsql-hackers by date: