Re: [bug?] Missed parallel safety checks, and wrong parallel safety - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [bug?] Missed parallel safety checks, and wrong parallel safety |
Date | |
Msg-id | CAA4eK1Keqx5RMJ0-5PnNEy3b9yj5eEzMtCmR_OCVpRcjaxAMZQ@mail.gmail.com Whole thread Raw |
In response to | Re: [bug?] Missed parallel safety checks, and wrong parallel safety (Robert Haas <robertmhaas@gmail.com>) |
Responses |
RE: [bug?] Missed parallel safety checks, and wrong parallel safety
|
List | pgsql-hackers |
On Thu, May 6, 2021 at 4:35 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, May 6, 2021 at 3:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > The idea here is to check for parallel safety of functions at > > someplace in the code during function invocation so that if we execute > > any parallel unsafe/restricted function via parallel worker then we > > error out. I think that is a good safety net especially if we can do > > it with some simple check. Now, we already have pg_proc information in > > fmgr_info_cxt_security for non-built-in functions, so we can check > > that and error out if the unsafe function is invoked in parallel mode. > > It has been observed that we were calling some unsafe functions in > > parallel-mode in the regression tests which is caught by such a check. > > I see your point, but I am not convinced. As I said to Tsunakawa-san, > doing the check here seems expensive. > If I read your email correctly then you are saying it is expensive based on the idea that we need to perform extra syscache lookup but actually for non-built-in functions, we already have parallel-safety information so such a check should not incur a significant cost. > Also, I had the idea in mind > that parallel-safety should work like volatility. We don't check at > runtime whether a volatile function is being called in a context where > volatile functions are not supposed to be used. If for example you try > to call a volatile function directly from an index expression I > believe you will get an error. But if the index expression calls an > immutable function and then that function internally calls something > volatile, you don't get an error. Now it might not be a good idea: you > could end up with a broken index. But that's your fault for > mislabeling the function you used. > > Sometimes this is actually quite useful. You might know that, while > the function is in general volatile, it is immutable in the particular > way that you are using it. Or, perhaps, you are using the volatile > function incidentally and it doesn't affect the output of your > function at all. Or, maybe you actually want to build an index that > might break, and then it's up to you to rebuild the index if and when > that is required. Users do this kind of thing all the time, I think, > and would be unhappy if we started checking it more rigorously than we > do today. > > Now, I don't see why the same idea can't or shouldn't apply to > parallel-safety. If you call a parallel-unsafe function in a parallel > context, it's pretty likely that you are going to get an error, and so > you might not want to do it. If the function is written in C, it could > even cause horrible things to happen so that you crash the whole > backend or something, but I tried to set things up so that for > built-in functions you'll just get an error. But on the other hand, > maybe the parallel-unsafe function you are calling is not > parallel-unsafe in all cases. If you want to create a wrapper function > that is labelled parallel-safe and try to make that it only calls the > parallel-unsafe function in the cases where there's no safety problem, > that's up to you! > I think it is difficult to say for what purpose parallel-unsafe function got called in parallel context so if we give an error in cases where otherwise it could lead to a crash or caused other horrible things, users will probably appreciate us. OTOH, if the parallel-safety labeling is wrong (parallel-safe function is marked parallel-unsafe) and we gave an error in such a case, the user can always change the parallel-safety attribute by using Alter Function. Now, if adding such a check is costly or needs some major re-design then probably it might not be worth whereas I don't think that is the case for non-built-in function invocation. -- With Regards, Amit Kapila.
pgsql-hackers by date: