Thread: [HACKERS] Fix for segfault in plpython's exception handling
The patch attached (a one-liner) includes information about how to reproduce the issue and why it happens.
We applied it to our production servers (9.5) and has been working as expected for a while. I've also checked that it can be applied cleanly to current master.
Could it make its way into master? maybe 9.5 and 9.6 as well?
Thanks!
Rafa de la Torre
Attachment
Rafa de la Torre <rtorre@carto.com> writes: > exc = PyErr_NewException(exception_map[i].name, base, dict); > + Py_INCREF(exc); > PyModule_AddObject(mod, exception_map[i].classname, exc); Hm. Seems like if this is a problem, the code for the other three exceptions is being a bit careless: it does do Py_INCREFs on them, but not soon enough to ensure no problems. Also, I wonder why that code checks for a null result from PyErr_NewException but this doesn't. Good catch though. A naive person would have assumed that PyModule_AddObject would increment the object's refcount, but the Python docs say "This steals a reference to value", which I guess must mean that the caller is required to do it. regards, tom lane
I wrote: > Rafa de la Torre <rtorre@carto.com> writes: >> exc = PyErr_NewException(exception_map[i].name, base, dict); >> + Py_INCREF(exc); >> PyModule_AddObject(mod, exception_map[i].classname, exc); > Hm. Seems like if this is a problem, the code for the other three > exceptions is being a bit careless: it does do Py_INCREFs on them, > but not soon enough to ensure no problems. Also, I wonder why that > code checks for a null result from PyErr_NewException but this doesn't. > Good catch though. A naive person would have assumed that > PyModule_AddObject would increment the object's refcount, but > the Python docs say "This steals a reference to value", which > I guess must mean that the caller is required to do it. For me (testing with Python 2.6.6 on RHEL6), this test case didn't result in a server crash, but in the wrong exception object name being reported. Tracing through it showed that a python GC was happening during the loop adding all the exceptions to the spiexceptions module, so that some of the exception objects went away and then were immediately reallocated as other exception objects. The explicit gc call in the test case wasn't necessary, because the problem happened before that. Fun fun. I've pushed a patch that deals with all these problems. Thanks for the report! regards, tom lane
Thank you! Glad to see that the report was useful.
On Fri, Dec 9, 2016 at 9:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Rafa de la Torre <rtorre@carto.com> writes:
>> exc = PyErr_NewException(exception_map[i].name, base, dict);
>> + Py_INCREF(exc);
>> PyModule_AddObject(mod, exception_map[i].classname, exc);
> Hm. Seems like if this is a problem, the code for the other three
> exceptions is being a bit careless: it does do Py_INCREFs on them,
> but not soon enough to ensure no problems. Also, I wonder why that
> code checks for a null result from PyErr_NewException but this doesn't.
> Good catch though. A naive person would have assumed that
> PyModule_AddObject would increment the object's refcount, but
> the Python docs say "This steals a reference to value", which
> I guess must mean that the caller is required to do it.
For me (testing with Python 2.6.6 on RHEL6), this test case didn't result
in a server crash, but in the wrong exception object name being reported.
Tracing through it showed that a python GC was happening during the loop
adding all the exceptions to the spiexceptions module, so that some of the
exception objects went away and then were immediately reallocated as other
exception objects. The explicit gc call in the test case wasn't
necessary, because the problem happened before that. Fun fun.
I've pushed a patch that deals with all these problems. Thanks for
the report!
regards, tom lane
Rafa de la Torre
For the record: I tested the patch by Tom Lane in our setup (python 2.7.3-0ubuntu3.8) and works like a charm. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9cda81f0056ca488dbd6cded64db1238aed816b2 Also in 9.5 and 9.6 series. My request in commitfest queue can be closed. Cheers!