On Mon, May 20, 2024 at 12:02 AM torikoshia <torikoshia@oss.nttdata.com> wrote:
>
> Hi,
>
> Thank you for working on this issue.
> It seems that we have run into the same issue.
>
> > On Wed, May 24, 2023 at 9:27 AM Masahiko Sawada <sawada.mshk@gmail.com>
> > wrote:
> >> Yes, it's because the above modification doesn't fix the memory
> >> accounting issue but only reduces memory bloat in some (extremely bad)
> >> cases. Without this modification, the maximum actual memory usage
> >> could easily reach several tens of times logical_decoding_work_mem
> >> (e.g. 4GB vs. 256MB as originally reported). Since the reorderbuffer
> >> still doesn't account for memory fragmentation etc., it's still
> >> possible for the actual memory usage to reach several times
> >> logical_decoding_work_mem. In my environment, with the reproducer.sh
> >> you shared, the total actual memory usage reached about 430MB with
> >> logical_decoding_work_mem set to 256MB. A similar issue would probably
> >> still happen even if we used another type of memory allocator such as
> >> AllocSet. If we want the reorderbuffer memory usage never to exceed
> >> logical_decoding_work_mem, we would need to change how the
> >> reorderbuffer uses and accounts for memory, which would require much
> >> work, I guess.
>
> Considering that the manual says logical_decoding_work_mem "specifies
> the maximum amount of memory to be used by logical decoding", and that
> this would be easy for users to tune, it may be best to do this work.
> However...
>
> >>> One idea to deal with this issue is to choose the block sizes
> >>> carefully while measuring the performance as the comment shows:
> >>>
> >>>     /*
> >>>      * XXX the allocation sizes used below pre-date generation context's
> >>>      * block growing code. These values should likely be benchmarked and
> >>>      * set to more suitable values.
> >>>      */
> >>>     buffer->tup_context = GenerationContextCreate(new_ctx,
> >>>                                                    "Tuples",
> >>>                                                    SLAB_LARGE_BLOCK_SIZE,
> >>>                                                    SLAB_LARGE_BLOCK_SIZE,
> >>>                                                    SLAB_LARGE_BLOCK_SIZE);
>
> Since this idea can prevent the issue in some (though not all)
> situations, it may be a good mitigation measure.
> One concern is that this would cause more frequent malloc() calls, but
> that is better than memory bloat, isn't it?
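
As a minimal sketch of that block-size idea (illustrative values only,
not necessarily what was committed), the change would look something
like using SLAB_DEFAULT_BLOCK_SIZE (8kB) instead of
SLAB_LARGE_BLOCK_SIZE (8MB):

    /*
     * Use small blocks for the tuple context so that a mostly-empty
     * block wastes far less memory; the trade-off is more frequent
     * malloc()/free() of blocks.
     */
    buffer->tup_context = GenerationContextCreate(new_ctx,
                                                  "Tuples",
                                                  SLAB_DEFAULT_BLOCK_SIZE,
                                                  SLAB_DEFAULT_BLOCK_SIZE,
                                                  SLAB_DEFAULT_BLOCK_SIZE);

Alternatively, a small initBlockSize combined with a larger
maxBlockSize would let the generation context's block growing code keep
the malloc() frequency down for large transactions.
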
FYI, I've just pushed the commit fixing this memory issue to all
supported branches[1]. This is just to let people, including the
reporter, know about the recent updates on this topic.
[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=1b9b6cc3456be0f6ab929107293b31c333270bc1
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com