Home > mailing lists

Re: hyrax vs. RelationBuildPartitionDesc - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: hyrax vs. RelationBuildPartitionDesc
Date	May 17, 2019 19:59:29
Msg-id	20190517195929.uey6hadbgazplm3h@alap3.anarazel.de Whole thread Raw
In response to	Re: hyrax vs. RelationBuildPartitionDesc (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: hyrax vs. RelationBuildPartitionDesc
List	pgsql-hackers

Tree view

Hi,

On 2019-05-01 13:09:07 -0400, Robert Haas wrote:
> The original issue on this thread was that hyrax started running out
> of memory when it hadn't been doing so previously.  That happened
> because, for complicated reasons, commit
> 898e5e3290a72d288923260143930fb32036c00c resulted in the leak being
> hit lots of times in CLOBBER_CACHE_ALWAYS builds instead of just once.
> Commits 2455ab48844c90419714e27eafd235a85de23232 and
> d3f48dfae42f9655425d1f58f396e495c7fb7812 fixed that problem.
> 
> In the email at issue, Tom complains about two things.  First, he
> complains about the fact that I try to arrange things so that relcache
> data remains valid for as long as required instead of just copying it.
> Second, he complains about the fact repeated ATTACH and DETACH
> PARTITION operations can produce a slow session-lifespan memory leak.
> 
> I think it's reasonable to fix the second issue for v12.  I am not
> sure how important it is, because (1) the leak is small, (2) it seems
> unlikely that anyone would execute enough ATTACH/DETACH PARTITION
> operations in one backend for the leak to amount to anything
> significant, and (3) if a relcache flush ever happens when you don't
> have the relation open, all of the "leaked" memory will be un-leaked.
> However, I believe that we could fix it as follows.  First, invent
> rd_pdoldcxt and put the first old context there; if that pointer is
> already in use, then parent the new context under the old one.
> Second, in RelationDecrementReferenceCount, if the refcount hits 0,
> nuke rd_pdoldcxt and set the pointer back to NULL.  With that change,
> you would only keep the old PartitionDesc around until the ref count
> hits 0, whereas at present, you keep the old PartitionDesc around
> until an invalidation happens while the ref count is 0.

That sounds roughly reasonably. Tom, any objections?  I think it'd be
good if we fixed this soon.


> I think the first issue is not v12 material.  Tom proposed to fix it
> by copying all the relevant data out of the relcache, but his own
> performance results show a pretty significant hit,

Yea, the numbers didn't look very convincing.


> and AFAICS he
> hasn't pointed to anything that's actually broken by the current
> approach.  What I think should be done is actually generalize the
> approach I took in this patch, so that instead of the partition
> directory holding a reference count, the planner or executor holds
> one.  Then not only could people who want information about the
> PartitionDesc avoid copying stuff from the relcache, but maybe other
> things as well.  I think that would be significantly better than
> continuing to double-down on the current copy-everything approach,
> which really only works well in a world where a table can't have all
> that much metadata, which is clearly not true for PostgreSQL any more.
> I'm not sure that Tom is actually opposed to this approach -- although
> I may have misunderstood his position -- but where we disagree, I
> think, is that I see what I did in this commit as a stepping-stone
> toward a better world, and he sees it as something that should be
> killed with fire until that better world has fully arrived.

Tom, are you ok with deferring further work here for v13?

Greetings,

Andres Freund

pgsql-hackers by date:

From: Andres Freund
Date: 17 May 2019, 19:54:19
Subject: Re: Pluggable Storage - Andres's take

From: Andres Freund
Date: 17 May 2019, 20:04:30
Subject: Re: Passing CopyMultiInsertInfo structure toCopyMultiInsertInfoNextFreeSlot()

Re: hyrax vs. RelationBuildPartitionDesc - Mailing list pgsql-hackers

Previous

Next