Re: Memory leak from ExecutorState context? - Mailing list pgsql-hackers
From: Melanie Plageman
Subject: Re: Memory leak from ExecutorState context?
Msg-id: CAAKRu_ZxT9K9FuUCKV6SwZZnC9ExAiELa1oUYyiPk4eGHh12GQ@mail.gmail.com
In response to: Re: Memory leak from ExecutorState context? (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
List: pgsql-hackers
On Fri, Apr 7, 2023 at 8:01 PM Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
>
> On Fri, 31 Mar 2023 14:06:11 +0200
> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
>
> > > [...]
> > > >> Hmmm, not sure is WARNING is a good approach, but I don't have a better
> > > >> idea at the moment.
> > > >
> > > > I stepped it down to NOTICE and added some more infos.
> > > >
> > > > [...]
> > > > NOTICE: Growing number of hash batch to 32768 is exhausting allowed
> > > > memory (137494656 > 2097152)
> > > [...]
> > >
> > > OK, although NOTICE that may actually make it less useful - the default
> > > level is WARNING, and regular users are unable to change the level. So
> > > very few people will actually see these messages.
> [...]
> > Anyway, maybe this should be added in the light of next patch, balancing
> > between increasing batches and allowed memory. The WARNING/LOG/NOTICE message
> > could appears when we actually break memory rules because of some bad HJ
> > situation.
>
> So I did some more minor editions to the memory context patch and start working
> on the balancing memory patch. Please, find in attachment the v4 patch set:
>
> * 0001-v4-Describe-hybrid-hash-join-implementation.patch:
>   Adds documentation written by Melanie few years ago
> * 0002-v4-Allocate-hash-batches-related-BufFile-in-a-dedicated.patch:
>   The batches' BufFile dedicated memory context patch

This is only a review of the code in patch 0002 (the patch to use a more
granular memory context for multi-batch hash join batch files). I have not
reviewed the changes to comments in detail, either.

I think the biggest change that is needed is to implement this memory
context usage for parallel hash join.

To implement a file buffer memory context for multi-batch parallel hash
join we would need, at a minimum, to switch into the proposed fileCxt
memory context in sts_puttuple() before BufFileCreateFileSet() (see the
sketches at the end of this mail).

We should also consider changing the SharedTuplestoreAccessor->context
from HashTableContext to the proposed fileCxt.

In parallel hash join code, the SharedTuplestoreAccessor->context is only
used when allocating the SharedTuplestoreAccessor->write_chunk and
read_chunk. Those are the buffers for writing out and reading from the
SharedTuplestore, and they are part of the memory overhead of file
buffers for a multi-batch hash join. Note that we will allocate a
STS_CHUNK_PAGES * BLCKSZ sized buffer for every batch -- this is on top
of the BufFile buffer per batch.

sts_initialize() and sts_attach() set the
SharedTuplestoreAccessor->context to the CurrentMemoryContext. We could
switch into the fileCxt before calling those functions from the hash join
code. That would mean that the SharedTuplestoreAccessor data structure
would also be counted in the fileCxt (and probably hashtable->batches).
This is probably what we want anyway.

As for this patch's current implementation and use of the fileCxt, I
think you are going to need to switch into the fileCxt before calling
BufFileClose() on the batch files throughout the hash join code (e.g. in
ExecHashJoinNewBatch()) if you want the mem_allocated to be accurate,
since closing a batch file frees its BufFile buffer (sketched below as
well).

Once you figure out whether that makes sense and implement it, I think we
will have to revisit whether it still makes sense to pass the fileCxt as
an argument to ExecHashJoinSaveTuple().

- Melanie
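
For concreteness, here are some sketches of the patterns described above.
They are written against my reading of the 0002 patch, so the field names
(hashtable->fileCxt in particular) and the exact call sites are
assumptions, not committed code. First, the dedicated context itself,
which patch 0002 proposes so that batch file buffer memory can be tracked
separately from the rest of the hash table:

    /*
     * Sketch: a dedicated memory context for batch file buffers, created
     * as a child of the hash table's context. The name and parent are
     * assumptions based on the patch under review, not existing code.
     */
    hashtable->fileCxt = AllocSetContextCreate(hashtable->hashCxt,
                                               "HashBatchFiles",
                                               ALLOCSET_DEFAULT_SIZES);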
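
Next, the sts_puttuple() suggestion: switch into the fileCxt before the
batch file is created, so the BufFile and its buffer are charged to the
file buffer context. How the context reaches the accessor
(accessor->fileCxt below) is hypothetical plumbing that would have to be
passed down from the hash join code:

    /* In sts_puttuple(), where the participant's write file is created. */
    MemoryContext oldcxt;

    oldcxt = MemoryContextSwitchTo(accessor->fileCxt);  /* hypothetical field */
    accessor->write_file =
        BufFileCreateFileSet(&accessor->fileset->fs, name);
    MemoryContextSwitchTo(oldcxt);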
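
Similarly, for the accessor context: since sts_initialize() and
sts_attach() capture CurrentMemoryContext, the hash join code could
switch contexts around those calls. The variables and arguments here are
illustrative only:

    /*
     * In the hash join code that sets up per-batch shared tuplestores:
     * with fileCxt current, accessor->context becomes the fileCxt, so
     * write_chunk/read_chunk (STS_CHUNK_PAGES * BLCKSZ each) and the
     * accessor struct itself are counted there.
     */
    MemoryContext oldcxt = MemoryContextSwitchTo(hashtable->fileCxt);

    accessor = sts_attach(sts, ParallelWorkerNumber + 1, fileset);

    MemoryContextSwitchTo(oldcxt);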
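
Finally, the BufFileClose() point, e.g. in ExecHashJoinNewBatch():
closing batch files with the fileCxt current so that its mem_allocated
accounting reflects the freed BufFile buffers:

    /* Sketch: releasing a no-longer-needed batch file under the fileCxt. */
    MemoryContext oldcxt = MemoryContextSwitchTo(hashtable->fileCxt);

    if (hashtable->innerBatchFile[curbatch] != NULL)
    {
        BufFileClose(hashtable->innerBatchFile[curbatch]);
        hashtable->innerBatchFile[curbatch] = NULL;
    }

    MemoryContextSwitchTo(oldcxt);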