Re: Add bump memory context type and use it for tuplesorts - Mailing list pgsql-hackers
From: David Rowley
Subject: Re: Add bump memory context type and use it for tuplesorts
Date:
Msg-id: CAApHDvpWkPp4ep=Lm+5es_i1yyBmmMEDguvAE0fdGM8Pfpmigg@mail.gmail.com
In response to: Re: Add bump memory context type and use it for tuplesorts (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses: Re: Add bump memory context type and use it for tuplesorts
List: pgsql-hackers
On Tue, 12 Mar 2024 at 23:57, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
> Attached is an updated version of the mempool patch, modifying all the
> memory contexts (not just AllocSet), including the bump context. And
> then also PDF with results from the two machines, comparing results
> without and with the mempool. There's very little impact on small reset
> values (128kB, 1MB), but pretty massive improvements on the 8MB test
> (where it's a 2x improvement).

I think it would be good to have something like this. I've done some experiments before with something like this [1]. However, mine was much more trivial.

One thing my version did was get rid of the *context* freelist stuff in aset. I wondered if we'd need that anymore as, if I understand correctly, it's just there to stop malloc/free thrashing, which is what the patch aims to do anyway. Aside from that, it's now a little weird that aset.c has that but generation.c and slab.c do not.

One thing I found was that in btbeginscan(), we have "so = (BTScanOpaque) palloc(sizeof(BTScanOpaqueData));", which on this machine is 27344 bytes and results in a call to AllocSetAllocLarge() and therefore a malloc(). Ideally, there'd be no malloc() calls in a standard pgbench run, at least once the rel and cat caches have been warmed up.

I think there are a few things in your patch that could be improved; here's a quick review:

1. MemoryPoolEntryIndex() could follow the lead of AllocSetFreeIndex(), which is quite well-tuned and has no looping. I think you can get rid of MemoryPoolEntrySize() and just have MemoryPoolEntryIndex() round up to the next power of 2.

2. The following could use "result = Max(MEMPOOL_MIN_BLOCK, pg_nextpower2_size_t(size));" (see the sketch at the end of this mail):

+ * should be very low, though (less than MEMPOOL_SIZES, i.e. 14).
+ */
+ result = MEMPOOL_MIN_BLOCK;
+ while (size > result)
+     result *= 2;

3. "MemoryPoolFree": I wonder if this is a good name for such a function. Really you want to return it to the pool. "Free" sounds like you're going to free() it. I went for "Fetch" and "Release", which I thought was less confusing.

4. MemoryPoolRealloc(): could this just do nothing if the old and new indexes are the same?

5. It might be good to put a likely() around this:

+ /* only do this once every MEMPOOL_REBALANCE_DISTANCE allocations */
+ if (pool->num_requests < MEMPOOL_REBALANCE_DISTANCE)
+     return;

Otherwise, if that function is inlined then you'll bloat the functions that inline it for not much good reason. Another approach would be to have a static inline function which checks and calls a noinline function that does the work, so that the rebalance stuff is never inlined (also sketched at the end of this mail).

Overall, I wonder if the rebalance stuff might make performance testing quite tricky. I see:

+/*
+ * How often to rebalance the memory pool buckets (number of allocations).
+ * This is a tradeoff between the pool being adaptive and more overhead.
+ */
+#define MEMPOOL_REBALANCE_DISTANCE 25000

Will TPS take a sudden jump after 25k transactions doing the same thing? I'm not saying this shouldn't happen, but... benchmarking is pretty hard already. I wonder if there's something more fine-grained that can be done which makes the pool adapt faster but not all at once. (I've not studied your algorithm for the rebalance.)

David

[1] https://github.com/david-rowley/postgres/tree/malloccache
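
For points 1 and 2, here's a minimal sketch of the non-looping rounding. It leans on pg_nextpower2_size_t() from port/pg_bitutils.h and Max() from c.h; the function name and the MEMPOOL_MIN_BLOCK value below are placeholders of mine, not taken from the patch:

    #include "postgres.h"            /* Max() */
    #include "port/pg_bitutils.h"    /* pg_nextpower2_size_t() */

    /* placeholder value for illustration; the patch defines the real one */
    #define MEMPOOL_MIN_BLOCK 1024

    /*
     * Round a request size up to the pool's bucket size without looping.
     * Equivalent to the quoted doubling loop, provided MEMPOOL_MIN_BLOCK
     * is a power of two and size > 0.
     */
    static inline size_t
    MemoryPoolRoundSize(size_t size)
    {
        return Max(MEMPOOL_MIN_BLOCK, pg_nextpower2_size_t(size));
    }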
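
And a sketch of the inline-check/noinline-worker split from point 5, so the hot path stays tiny and the rebalance work never gets inlined into callers. pool->num_requests and MEMPOOL_REBALANCE_DISTANCE come from the hunks quoted above; the MemoryPool typedef name and the two function names are assumptions on my part:

    #include "postgres.h"    /* pg_noinline, likely() */

    /* never inlined: only reached once per rebalance interval */
    static pg_noinline void
    MemoryPoolRebalanceSlow(MemoryPool *pool)
    {
        /* the actual bucket-rebalancing logic from the patch goes here */
    }

    /* inlined into the allocation paths: just the cheap distance check */
    static inline void
    MemoryPoolMaybeRebalance(MemoryPool *pool)
    {
        /* only do this once every MEMPOOL_REBALANCE_DISTANCE allocations */
        if (likely(pool->num_requests < MEMPOOL_REBALANCE_DISTANCE))
            return;

        MemoryPoolRebalanceSlow(pool);
    }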