Home > mailing lists

Re: [BUG] orphaned function - Mailing list pgsql-hackers

From	Drouvot, Bertrand
Subject	Re: [BUG] orphaned function
Date	February 2, 2021 08:15:05
Msg-id	03a3a7f2-4c67-7bb2-cad2-8ec7b9800942@amazon.com Whole thread Raw
In response to	Re: [BUG] orphaned function (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [BUG] orphaned function
List	pgsql-hackers

Tree view

Hi,

+ Jim and Nathan.

On 12/18/20 12:26 AM, Tom Lane wrote:
> Gilles Darold <gilles@darold.net> writes:
>> The same problem applies if the returned type or the procedural language
>> is dropped. I have tried to fix that in ProcedureCreate() by using an
>> AccessSharedLock for the data types and languages involved in the
>> function declaration. With this patch the race condition with parameters
>> types, return types and PL languages are fixed. Only data types and
>> procedural languages with Oid greater than FirstBootstrapObjectId will
>> be locked locked. But this is probably more complex than this fix so it
>> is proposed as a separate patch
>> (v1-003-orphaned_function_types_language.patch) to not interfere with
>> the applying of Bertran's patch.
> Indeed.  This points up one of the things that is standing in the way
> of any serious consideration of applying this patch.  To my mind there
> are two fundamental, somewhat interrelated considerations that haven't
> been meaningfully addressed:
>
> 1. What set of objects is it worth trying to remove this race condition
> for, and with respect to what dependencies?  Bertrand gave no
> justification for worrying about function-to-namespace dependencies
> specifically,

I focused on this one because this is the most (if not the unique) one 
we have observed so far.

Those orphaned functions are breaking stuff like pg_dump, pg_upgrade and 
that's how we discovered them.

That being said, I agree that there is no reason to focus only on this 
ones and we should better try to "fix" them all.
>   and you've likewise given none for expanding the patch
> to consider function-to-datatype dependencies.  There are dozens more
> cases that could be considered; but I sure don't want to be processing
> another ad-hoc fix every time someone thinks they're worried about
> another specific case.
>
> Taking a practical viewpoint, I'm far more worried about dependencies
> of tables than those of any other kind of object.  A messed-up function
> definition doesn't cost you much: you can just drop and recreate the
> function.  A table full of data is way more trouble to recreate, and
> indeed the data might be irreplaceable.  So it seems pretty silly to
> propose that we try to remove race conditions for functions' dependencies
> on datatypes without removing the same race condition for tables'
> dependencies on their column datatypes.
>
> But any of these options lead to the same question: why stop there?
> An approach that would actually be defensible, perhaps, is to incorporate
> this functionality into the dependency mechanism: any time we're about to
> create a pg_depend or pg_shdepend entry, take out an AccessShareLock on
> the referenced object, and then check to make sure it still exists.
> This would improve the dependency mechanism so it prevents creation-time
> race conditions, not just attempts to drop dependencies of
> already-committed objects.  It would mean that the one patch would be
> the end of the problem, rather than looking forward to a steady drip of
> new fixes.

Agree that working with pg_depend and pg_shdepend is the way to go.

Instead of using the locking machinery (and then make the one that would 
currently produce the orphaned object waiting), Jim proposed another 
approach: make use of special snapshot (like a dirty one depending of 
the case).

That way instead of locking we could instead report an error, something 
like this:

postgres=# drop schema tobeorph;
ERROR:  cannot drop schema tobeorph because other objects that are 
currently under creation depend on it
DETAIL:  function under creation tobeorph.bdttime() depends on schema 
tobeorph

>
> 2. How much are we willing to pay for this added protection?  The fact
> that we've gotten along fine without it for years suggests that the
> answer needs to be "not much".  But I'm not sure that that actually
> is the answer, especially if we don't have a policy that says "we'll
> protect against these race conditions but no other ones".  I think
> there are possibly-serious costs in three different areas:
>
> * Speed.  How much do all these additional lock acquisitions slow
> down a typical DDL command?
>
> * Number of locks taken per transaction.  This'd be particularly an
> issue for pg_restore runs using single-transaction mode: they'd take
> not only locks on the objects they create, but also a bunch of
> reference-protection locks.  It's not very hard to believe that this'd
> make a difference in whether restoring a large database is possible
> without increasing max_locks_per_transaction.
>
> * Risk of deadlock.  The reference locks themselves should be sharable,
> so maybe there isn't much of a problem, but I want to see this question
> seriously analyzed not just ignored.
All of this should be mitigated with the "special snapshot" approach.
>
> Obviously, the magnitude of these costs depends a lot on how many
> dependencies we want to remove the race condition for.  But that's
> exactly the reason why I don't want a piecemeal approach of fixing
> some problems now and some other problems later.  That's way too
> much like the old recipe for boiling a frog: we could gradually get
> into serious performance problems without anyone ever having stopped
> to consider the issue.
>
> In short, I think we should either go for a 100% solution if careful
> analysis shows we can afford it, or else have a reasoned policy
> why we are going to close these specific race conditions and no others
> (implying that we'll reject future patches in the same area).

Agree.

What about taking the 100% solution way with the "special snapshot" 
approach?

If you think that could make sense then we (Jim, Nathan and I) can work 
on a patch with this approach and come back with a proposal.

Bertrand

pgsql-hackers by date:

From: "Joel Jacobson"
Date: 02 February 2021, 08:13:43
Subject: Re: [PATCH] Doc: improve documentation of oid columns that can be zero. (correct version)

From: Fujii Masao
Date: 02 February 2021, 08:18:53
Subject: Re: Transactions involving multiple postgres foreign servers, take 2

Re: [BUG] orphaned function - Mailing list pgsql-hackers

Previous

Next