Thread: Fix runtime errors from -fsanitize=undefined
After many years of trying, it seems the -fsanitize=undefined checking in gcc is now working somewhat reliably. Attached is a patch that fixes all errors of the kind runtime error: null pointer passed as argument N, which is declared to never be null Most of the cases are calls to memcpy(), memcmp(), etc. with a length of zero, so one appears to get away with passing a null pointer. Note that these are runtime errors, not static analysis, so the code in question is actually reached. To reproduce, configure normally and then set COPT=-fsanitize=undefined -fno-sanitize=alignment -fno-sanitize-recover=all and build and run make check-world. Unpatched, this will core dump in various places. (-fno-sanitize=alignment should also be fixed but I took it out here to deal with it separately.) See https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html for further documentation. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
On Mon, Jun 3, 2019 at 3:22 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > After many years of trying, it seems the -fsanitize=undefined checking > in gcc is now working somewhat reliably. Attached is a patch that fixes > all errors of the kind Is this as of some particular gcc version? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2019-06-05 21:30, Robert Haas wrote: > On Mon, Jun 3, 2019 at 3:22 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> After many years of trying, it seems the -fsanitize=undefined checking >> in gcc is now working somewhat reliably. Attached is a patch that fixes >> all errors of the kind > > Is this as of some particular gcc version? I used gcc-8. The option has existed in gcc for quite some time, but in previous releases it always tended to hang or get confused somewhere. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, I tested this patch with clang 7 on master. - On unpatched master I can't reproduce errors with make check-world in: src/backend/access/heap/heapam.c src/backend/utils/cache/relcache.c (IIRC I triggered this one in a pg previous version) src/backend/utils/misc/guc.c - I have a hard to reproduce one not in this patched: src/backend/storage/ipc/shm_mq.c line 727 About the changes - in src/fe_utils/print.c line memset(header_done, false, col_count * sizeof(bool)); is redundant and should be remove not guarded with if (hearder_done), header_done is either null or already zeroed, it's pg_malloc0 ed. In all cases but one patched version shortcut an undefined no ops but in src/backend/access/transam/clog.c memcmp 0 bytes return 0 thus current change modifies code path, before with nsubxids == 0 if branch was taken now it's not. Could wait more often while taking lock, no idea if it's relevant. Regards Didier On Thu, Jun 6, 2019 at 11:36 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 2019-06-05 21:30, Robert Haas wrote: > > On Mon, Jun 3, 2019 at 3:22 PM Peter Eisentraut > > <peter.eisentraut@2ndquadrant.com> wrote: > >> After many years of trying, it seems the -fsanitize=undefined checking > >> in gcc is now working somewhat reliably. Attached is a patch that fixes > >> all errors of the kind > > > > Is this as of some particular gcc version? > > I used gcc-8. > > The option has existed in gcc for quite some time, but in previous > releases it always tended to hang or get confused somewhere. > > -- > Peter Eisentraut http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > >
On Mon, Jun 03, 2019 at 09:21:48PM +0200, Peter Eisentraut wrote: > After many years of trying, it seems the -fsanitize=undefined checking > in gcc is now working somewhat reliably. Attached is a patch that fixes > all errors of the kind > > runtime error: null pointer passed as argument N, which is declared to > never be null > > Most of the cases are calls to memcpy(), memcmp(), etc. with a length of > zero, so one appears to get away with passing a null pointer. I just saw this proposal. The undefined behavior in question is strictly academic. These changes do remove the need for new users to discover -fno-sanitize=nonnull-attribute, but they make the code longer and no clearer. Given the variety of code this touches, I expect future commits will reintroduce the complained-of usage patterns, prompting yet more commits to restore the invariant achieved here. Hence, I'm -0 for this change.
On 2019-07-05 01:33, Noah Misch wrote: > I just saw this proposal. The undefined behavior in question is strictly > academic. These changes do remove the need for new users to discover > -fno-sanitize=nonnull-attribute, but they make the code longer and no clearer. > Given the variety of code this touches, I expect future commits will > reintroduce the complained-of usage patterns, prompting yet more commits to > restore the invariant achieved here. Hence, I'm -0 for this change. This sanitizer has found real problems in the past. By removing these trivial issues we can then set up a build farm animal or similar to automatically check for any new issues. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> This sanitizer has found real problems in the past. By removing these > trivial issues we can then set up a build farm animal or similar to > automatically check for any new issues. We have done exactly this in postgis with 2 different jobs (gcc and clang) and, even though it doesn't happen too often, it's really satisfying when it detects these issues automatically. -- Raúl Marín Rodríguez carto.com
On Fri, Jul 05, 2019 at 06:14:31PM +0200, Peter Eisentraut wrote: > On 2019-07-05 01:33, Noah Misch wrote: > > I just saw this proposal. The undefined behavior in question is strictly > > academic. These changes do remove the need for new users to discover > > -fno-sanitize=nonnull-attribute, but they make the code longer and no clearer. > > Given the variety of code this touches, I expect future commits will > > reintroduce the complained-of usage patterns, prompting yet more commits to > > restore the invariant achieved here. Hence, I'm -0 for this change. > > This sanitizer has found real problems in the past. By removing these > trivial issues we can then set up a build farm animal or similar to > automatically check for any new issues. Has it found one real problem that it would not have found given "-fno-sanitize=nonnull-attribute"? I like UBSan in general, but I haven't found a reason to prefer plain "-fsanitize=undefined" over "-fsanitize=undefined -fno-sanitize=nonnull-attribute".
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2019-07-05 01:33, Noah Misch wrote: >> I just saw this proposal. The undefined behavior in question is strictly >> academic. These changes do remove the need for new users to discover >> -fno-sanitize=nonnull-attribute, but they make the code longer and no clearer. >> Given the variety of code this touches, I expect future commits will >> reintroduce the complained-of usage patterns, prompting yet more commits to >> restore the invariant achieved here. Hence, I'm -0 for this change. > This sanitizer has found real problems in the past. By removing these > trivial issues we can then set up a build farm animal or similar to > automatically check for any new issues. I think Noah's point is just that we can do that with the addition of -fno-sanitize=nonnull-attribute. I agree with him that it's very unclear why we should bother to make the code clean against that specific subset of warnings. regards, tom lane
On 2019-07-05 19:10, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> On 2019-07-05 01:33, Noah Misch wrote: >>> I just saw this proposal. The undefined behavior in question is strictly >>> academic. These changes do remove the need for new users to discover >>> -fno-sanitize=nonnull-attribute, but they make the code longer and no clearer. >>> Given the variety of code this touches, I expect future commits will >>> reintroduce the complained-of usage patterns, prompting yet more commits to >>> restore the invariant achieved here. Hence, I'm -0 for this change. > >> This sanitizer has found real problems in the past. By removing these >> trivial issues we can then set up a build farm animal or similar to >> automatically check for any new issues. > > I think Noah's point is just that we can do that with the addition of > -fno-sanitize=nonnull-attribute. I agree with him that it's very > unclear why we should bother to make the code clean against that > specific subset of warnings. OK, I'm withdrawing this patch. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services