Re: [bug?] Missed parallel safety checks, and wrong parallel safety - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [bug?] Missed parallel safety checks, and wrong parallel safety
Date	June 7, 2021 13:58:53
Msg-id	CA+TgmoZWLervc55Fgxd3sphL+N3ixJ29HasQDdToHPz+vjF7YQ@mail.gmail.com
In response to	Re: [bug?] Missed parallel safety checks, and wrong parallel safety (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: [bug?] Missed parallel safety checks, and wrong parallel safety
List	pgsql-hackers

On Fri, Jun 4, 2021 at 6:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Thoughts?

As far as I can see, trying to error out at function call time if the
function is parallel-unsafe doesn't fix any problem we have, and just
makes the design of this part of the system less consistent with what
we've done elsewhere. For example, if you create an immutable function
that internally calls a volatile function, you don't get an error. You
can use your immutable function in an index definition if you wish. That
may break, but if so, that's your problem. Also, when it breaks, it
probably won't blow up the entire world; you'll just have a messed-up
index. Currently, the parallel-safety stuff works the same way. If we
notice that something is marked parallel-unsafe, we'll skip
parallelism. But you can lie to us and claim that things are safe when
they're not, and if you do, it may break, but that's your problem.
Most likely your query will just error out, and there will be no
worse consequences than that, though if your parallel-unsafe function
is written in C, it could do horrible things like crash, which is
unavoidable because C code can do anything.
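
To make that concrete, here is a minimal SQL sketch of both kinds of
mislabeling. All object names (noisy, not_really_immutable, t, audit_log,
log_and_return) are made up for illustration. Index expressions only
accept functions marked IMMUTABLE, so that is the label the first
function lies about. None of the CREATE statements should be rejected,
because the labels are taken on trust.

```sql
-- A volatile helper: returns a different value on each call.
CREATE FUNCTION noisy() RETURNS double precision
    LANGUAGE sql VOLATILE
    AS $$ SELECT random() $$;

-- Declared IMMUTABLE although it calls a volatile function.
CREATE FUNCTION not_really_immutable(x integer) RETURNS double precision
    LANGUAGE sql IMMUTABLE
    AS $$ SELECT x + noisy() $$;

CREATE TABLE t (a integer);
-- Accepted because the label is trusted; the index contents are unreliable.
CREATE INDEX t_expr_idx ON t (not_really_immutable(a));

-- Declared PARALLEL SAFE although it writes to a table.
CREATE TABLE audit_log (v integer);
CREATE FUNCTION log_and_return(x integer) RETURNS integer
    LANGUAGE plpgsql PARALLEL SAFE
    AS $$
BEGIN
    INSERT INTO audit_log VALUES (x);
    RETURN x;
END;
$$;

-- Show the declared labels; proparallel is 's' (safe), 'r' (restricted)
-- or 'u' (unsafe).
SELECT proname, provolatile, proparallel
FROM pg_proc
WHERE proname IN ('noisy', 'not_really_immutable', 'log_and_return');
```

If the planner happens to choose a parallel plan for a query that calls
log_and_return, the INSERT inside it is expected to fail at run time;
nothing catches the wrong label earlier.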

Now, the reason for all of this work, as I understand it, is because
we want to enable parallel inserts, and the problem there is that a
parallel insert could involve a lot of different things: it might need
to compute expressions, or fire triggers, or check constraints, and
any of those things could be parallel-unsafe. If we enable parallelism
and then find out that we need to do one of those things, we have a
problem. Something probably will error out. The thing is, with this
proposal, that issue is not solved. Something will definitely error
out. You'll probably get the error in a different place, but nobody
fires off an INSERT hoping to get one error message rather than
another. What they want is for it to work. So I'm kind of confused how
we ended up going in this direction, which seems to me at least to be a
tangent from the real issue, and somewhat at odds with the way the
rest of PostgreSQL is designed.

It seems to me that we could simply add a flag to each relation saying
whether or not INSERT operations - or perhaps DML operations
generally - are believed to be parallel-safe for that relation. Like
the marking on functions, it would be the user's
responsibility to get that marking correct. If they don't, they might
call a parallel-unsafe function in parallel mode, and that will
probably error out. But that's no worse than what we already have in
existing cases, so I don't see why it requires doing what's proposed
here first. Now, it does have the disadvantage of being not very
convenient for users, who, I'm sure, would prefer that the system
figure out for them automatically whether or not parallel inserts are
likely to be safe, rather than making them declare it, especially
since presumably the default declaration would have to be "unsafe," as
it is for functions. But I don't have a better idea right now.
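
A user deciding how to set such a per-relation marking would have to
check by hand what an INSERT can invoke. A rough sketch for the trigger
part only, using the existing catalogs; the table name my_table is a
placeholder, and this does not cover constraints, column defaults, or
index expressions:

```sql
-- List user-defined triggers on one table together with the declared
-- parallel-safety label of each trigger function.
SELECT t.tgname,
       p.proname,
       p.proparallel   -- 's' = safe, 'r' = restricted, 'u' = unsafe
FROM pg_trigger AS t
JOIN pg_proc AS p ON p.oid = t.tgfoid
WHERE t.tgrelid = 'my_table'::regclass
  AND NOT t.tgisinternal;
```

Any row showing 'u' or 'r' suggests the relation should not be declared
safe, though the labels themselves are only as accurate as whoever set
them.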

-- 
Robert Haas
EDB: http://www.enterprisedb.com
