Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers
From | John Naylor |
---|---|
Subject | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date | |
Msg-id | CANWCAZYYzoKp_4+1m5mn-TRD62BTwom8iLXLOWMsHkkwFi=rzg@mail.gmail.com Whole thread Raw |
In response to | Re: [PoC] Improve dead tuple storage for lazy vacuum (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: [PoC] Improve dead tuple storage for lazy vacuum
|
List | pgsql-hackers |
On Tue, Jan 23, 2024 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > The new patches probably need to be polished but the VacDeadItemInfo > idea looks good to me. That idea looks good to me, too. Since you already likely know what you'd like to polish, I don't have much to say except for a few questions below. I also did a quick sweep through every patch, so some of these comments are unrelated to recent changes: v55-0003: +size_t +dsa_get_total_size(dsa_area *area) +{ + size_t size; + + LWLockAcquire(DSA_AREA_LOCK(area), LW_SHARED); + size = area->control->total_segment_size; + LWLockRelease(DSA_AREA_LOCK(area)); I looked and found dsa.c doesn't already use shared locks in HEAD, even dsa_dump. Not sure why that is... +/* + * Calculate the slab blocksize so that we can allocate at least 32 chunks + * from the block. + */ +#define RT_SLAB_BLOCK_SIZE(size) \ + Max((SLAB_DEFAULT_BLOCK_SIZE / (size)) * (size), (size) * 32) The first parameter seems to be trying to make the block size exact, but that's not right, because of the chunk header, and maybe alignment. If the default block size is big enough to waste only a tiny amount of space, let's just use that as-is. Also, I think all block sizes in the code base have been a power of two, but I'm not sure how much that matters. +#ifdef RT_SHMEM + fprintf(stderr, " [%d] chunk %x slot " DSA_POINTER_FORMAT "\n", + i, n4->chunks[i], n4->children[i]); +#else + fprintf(stderr, " [%d] chunk %x slot %p\n", + i, n4->chunks[i], n4->children[i]); +#endif Maybe we could invent a child pointer format, so we only #ifdef in one place. --- /dev/null +++ b/src/test/modules/test_radixtree/meson.build @@ -0,0 +1,35 @@ +# FIXME: prevent install during main install, but not during test :/ Can you look into this? test_radixtree.c: +/* + * XXX: should we expose and use RT_SIZE_CLASS and RT_SIZE_CLASS_INFO? + */ +static int rt_node_class_fanouts[] = { + 4, /* RT_CLASS_3 */ + 15, /* RT_CLASS_32_MIN */ + 32, /* RT_CLASS_32_MAX */ + 125, /* RT_CLASS_125 */ + 256 /* RT_CLASS_256 */ +}; These numbers have been wrong a long time, too, but only matters for figuring out where it went wrong when something is broken. And for the XXX, instead of trying to use the largest number that should fit (it's obviously not testing that the expected node can actually hold that number anyway), it seems we can just use a "big enough" number to cause growing into the desired size class. As far as cleaning up the tests, I always wondered why these didn't use EXPECT_TRUE, EXPECT_FALSE, etc. as in Andres's prototype where where convenient, and leave comments above the tests. That seemed like a good idea to me -- was there a reason to have hand-written branches and elog messages everywhere? --- a/src/tools/pginclude/cpluspluscheck +++ b/src/tools/pginclude/cpluspluscheck @@ -101,6 +101,12 @@ do test "$f" = src/include/nodes/nodetags.h && continue test "$f" = src/backend/nodes/nodetags.h && continue + # radixtree_*_impl.h cannot be included standalone: they are just code fragments. + test "$f" = src/include/lib/radixtree_delete_impl.h && continue + test "$f" = src/include/lib/radixtree_insert_impl.h && continue + test "$f" = src/include/lib/radixtree_iter_impl.h && continue + test "$f" = src/include/lib/radixtree_search_impl.h && continue Ha! I'd forgotten about these -- they're long outdated. v55-0005: - * The radix tree is locked in shared mode during the iteration, so - * RT_END_ITERATE needs to be called when finished to release the lock. + * The caller needs to acquire a lock in shared mode during the iteration + * if necessary. "need if necessary" is maybe better phrased as "is the caller's responsibility" + /* + * We can rely on DSA_AREA_LOCK to get the total amount of DSA memory. + */ total = dsa_get_total_size(tree->dsa); Maybe better to have a header comment for RT_MEMORY_USAGE that the caller doesn't need to take a lock. v55-0006: "WIP: Not built, since some benchmarks have broken" -- I'll work on this when I re-run some benchmarks. v55-0007: + * Internally, a tid is encoded as a pair of 64-bit key and 64-bit value, + * and stored in the radix tree. This hasn't been true for a few months now, and I thought we fixed this in some earlier version? + * TODO: The caller must be certain that no other backend will attempt to + * access the TidStore before calling this function. Other backend must + * explicitly call TidStoreDetach() to free up backend-local memory associated + * with the TidStore. The backend that calls TidStoreDestroy() must not call + * TidStoreDetach(). Do we need to do anything now? v55-0008: -TidStoreAttach(dsa_area *area, TidStoreHandle handle) +TidStoreAttach(dsa_area *area, dsa_pointer rt_dp) "handle" seemed like a fine name. Is that not the case anymore? The new one is kind of cryptic. The commit message just says "remove control object" -- does that imply that we need to think of this parameter differently, or is it unrelated? (Same with dead_items_handle in 0011) v55-0011: + /* + * Recreate the tidstore with the same max_bytes limitation. We cannot + * use neither maintenance_work_mem nor autovacuum_work_mem as they could + * already be changed. + */ I don't understand this part.
pgsql-hackers by date: