Re: Memory leak from ExecutorState context? - Mailing list pgsql-hackers
From: Melanie Plageman
Subject: Re: Memory leak from ExecutorState context?
Msg-id: CAAKRu_ZxT9K9FuUCKV6SwZZnC9ExAiELa1oUYyiPk4eGHh12GQ@mail.gmail.com
In response to: Re: Memory leak from ExecutorState context? (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
List: pgsql-hackers
On Fri, Apr 7, 2023 at 8:01 PM Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
>
> On Fri, 31 Mar 2023 14:06:11 +0200
> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
>
> > > [...]
> > > >> Hmmm, not sure is WARNING is a good approach, but I don't have a better
> > > >> idea at the moment.
> > > >
> > > > I stepped it down to NOTICE and added some more infos.
> > > >
> > > > [...]
> > > > NOTICE: Growing number of hash batch to 32768 is exhausting allowed
> > > > memory (137494656 > 2097152)
> > > [...]
> > >
> > > OK, although NOTICE that may actually make it less useful - the default
> > > level is WARNING, and regular users are unable to change the level. So
> > > very few people will actually see these messages.
> [...]
> > Anyway, maybe this should be added in the light of next patch, balancing
> > between increasing batches and allowed memory. The WARNING/LOG/NOTICE message
> > could appears when we actually break memory rules because of some bad HJ
> > situation.
>
> So I did some more minor editions to the memory context patch and start working
> on the balancing memory patch. Please, find in attachment the v4 patch set:
>
> * 0001-v4-Describe-hybrid-hash-join-implementation.patch:
>   Adds documentation written by Melanie few years ago
> * 0002-v4-Allocate-hash-batches-related-BufFile-in-a-dedicated.patch:
>   The batches' BufFile dedicated memory context patch

This is only a review of the code in patch 0002 (the patch to use a more
granular memory context for multi-batch hash join batch files). I have not
reviewed the changes to comments in detail, either.

I think the biggest change that is needed is to implement this memory
context usage for parallel hash join.

To implement a file buffer memory context for multi-batch parallel hash
join we would need, at a minimum, to switch into the proposed fileCxt
memory context in sts_puttuple() before BufFileCreateFileSet() (see the
sketches at the end of this mail).

We should also consider changing the SharedTuplestoreAccessor->context
from HashTableContext to the proposed fileCxt.

In parallel hash join code, the SharedTuplestoreAccessor->context is only
used when allocating the SharedTuplestoreAccessor->write_chunk and
read_chunk. Those are the buffers for writing out and reading from the
SharedTuplestore, and they are part of the memory overhead of file
buffers for a multi-batch hash join. Note that we will allocate a
STS_CHUNK_PAGES * BLCKSZ sized buffer for every batch -- this is on top
of the BufFile buffer per batch.

sts_initialize() and sts_attach() set the
SharedTuplestoreAccessor->context to the CurrentMemoryContext. We could
switch into the fileCxt before calling those functions from the hash join
code. That would mean that the SharedTuplestoreAccessor data structure
would also be counted in the fileCxt (and probably hashtable->batches).
This is probably what we want anyway.

As for this patch's current implementation and use of the fileCxt, I
think you are going to need to switch into the fileCxt before calling
BufFileClose() on the batch files throughout the hash join code (e.g. in
ExecHashJoinNewBatch()) if you want the mem_allocated to be accurate,
since closing a batch file frees its BufFile buffer (sketched below as
well).

Once you figure out whether that makes sense and implement it, I think we
will have to revisit whether it still makes sense to pass the fileCxt as
an argument to ExecHashJoinSaveTuple().

- Melanie
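
For concreteness, here are some sketches of the patterns described above.
They are written against my reading of the 0002 patch, so the field names
(hashtable->fileCxt in particular) and the exact call sites are
assumptions, not committed code. First, the dedicated context itself,
which patch 0002 proposes so that batch file buffer memory can be tracked
separately from the rest of the hash table:

    /*
     * Sketch: a dedicated memory context for batch file buffers, created
     * as a child of the hash table's context. The name and parent are
     * assumptions based on the patch under review, not existing code.
     */
    hashtable->fileCxt = AllocSetContextCreate(hashtable->hashCxt,
                                               "HashBatchFiles",
                                               ALLOCSET_DEFAULT_SIZES);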
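
Next, the sts_puttuple() suggestion: switch into the fileCxt before the
batch file is created, so the BufFile and its buffer are charged to the
file buffer context. How the context reaches the accessor
(accessor->fileCxt below) is hypothetical plumbing that would have to be
passed down from the hash join code:

    /* In sts_puttuple(), where the participant's write file is created. */
    MemoryContext oldcxt;

    oldcxt = MemoryContextSwitchTo(accessor->fileCxt);  /* hypothetical field */
    accessor->write_file =
        BufFileCreateFileSet(&accessor->fileset->fs, name);
    MemoryContextSwitchTo(oldcxt);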
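
Similarly, for the accessor context: since sts_initialize() and
sts_attach() capture CurrentMemoryContext, the hash join code could
switch contexts around those calls. The variables and arguments here are
illustrative only:

    /*
     * In the hash join code that sets up per-batch shared tuplestores:
     * with fileCxt current, accessor->context becomes the fileCxt, so
     * write_chunk/read_chunk (STS_CHUNK_PAGES * BLCKSZ each) and the
     * accessor struct itself are counted there.
     */
    MemoryContext oldcxt = MemoryContextSwitchTo(hashtable->fileCxt);

    accessor = sts_attach(sts, ParallelWorkerNumber + 1, fileset);

    MemoryContextSwitchTo(oldcxt);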
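
Finally, the BufFileClose() point, e.g. in ExecHashJoinNewBatch():
closing batch files with the fileCxt current so that its mem_allocated
accounting reflects the freed BufFile buffers:

    /* Sketch: releasing a no-longer-needed batch file under the fileCxt. */
    MemoryContext oldcxt = MemoryContextSwitchTo(hashtable->fileCxt);

    if (hashtable->innerBatchFile[curbatch] != NULL)
    {
        BufFileClose(hashtable->innerBatchFile[curbatch]);
        hashtable->innerBatchFile[curbatch] = NULL;
    }

    MemoryContextSwitchTo(oldcxt);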