Thread: -Wformat-signedness
Hi hackers, There're probably mostly harmless, being mostly error and debug messages and the like, and considering that eg OID parsing tolerates negative numbers when reading them back in, but for what it's worth: GCC complains about many %d vs %u type mixups if you build with $SUBJECT.
On 2020-10-29 22:37, Thomas Munro wrote: > There're probably mostly harmless, being mostly error and debug > messages and the like, and considering that eg OID parsing tolerates > negative numbers when reading them back in, but for what it's worth: > GCC complains about many %d vs %u type mixups if you build with > $SUBJECT. I had looked into this some time ago. I have dusted off my patch again. The attached version fixes all warnings for me. The following are the main categories of issues: 1. enums are unsigned by default in gcc, so all those internal error messages "unrecognized blah kind: %d" need to be changed to %u. I have split that into its own patch since it's easily separable. All the remaining issues are in one patch. 2. Various trickery at the boundary of internal counters that are unsigned and external functions or views using signed types. These need another look. 3. Various messages print signed values using %x formats, which need to be unsigned. These might also need another look. 4. Issues with constants being signed by default. For example, things like elog(ERROR, "foo is %u but should be %u", somevar, 55) warns because of the constant. Should be changed to something like 55U for symmetry, or change the %u to %d. This also reaches into genbki territory with all the OID constants being generated. 5. Some "surprising" but correct C behavior. For example, unsigned short is promoted to int (not unsigned int) in variable arguments, so needs a %d format. 6. Finally, a bunch of uses were just plain wrong and should be corrected. I haven't found anything that is a really serious bug, but I imagine you could run into trouble in various ways when you exceed the INT_MAX value. But then again, if you use up INT_MAX WAL timelines, you probably have other problems. ;-)
Attachment
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > 1. enums are unsigned by default in gcc, so all those internal error > messages "unrecognized blah kind: %d" need to be changed to %u. Do we have reason to think that that is true in every C compiler? My own preference for this would be to leave the messages as-is and add explicit "(int)" casts to the arguments. There are some fraction of these that are like that already. regards, tom lane
On Tue, Nov 10, 2020 at 4:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > > 1. enums are unsigned by default in gcc, so all those internal error > > messages "unrecognized blah kind: %d" need to be changed to %u. > > Do we have reason to think that that is true in every C compiler? > My own preference for this would be to leave the messages as-is > and add explicit "(int)" casts to the arguments. There are some > fraction of these that are like that already. From experimentation, it seems that GCC enumerator constants are int, but enum variables are int or signed int depending on whether any negative values were defined. Valid values have to be representable as int anyway regardless of what size and signedness a compiler chooses to use, so yeah, +1 for casting to int.
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: Hi, > On 2020-10-29 22:37, Thomas Munro wrote: >> There're probably mostly harmless, being mostly error and debug >> messages and the like, and considering that eg OID parsing tolerates >> negative numbers when reading them back in, but for what it's worth: >> GCC complains about many %d vs %u type mixups if you build with >> $SUBJECT. > > I had looked into this some time ago. I have dusted off my patch > again. The attached version fixes all warnings for me. When Dean pointed me this thread[1], I was thinking we need to add the "-Wformat-signedness" and fix all the existing warnning. Then after some research, it is not such easy and seems we need some agreement first if we want to fix them. > The following are the main categories of issues: > > 1. enums are unsigned by default in gcc, so all those internal error > messages "unrecognized blah kind: %d" need to be changed to %u. IIUC, we have agreed that we should cast enum to int and continue to use "%d". At least Tom suggested this and Thomas agreed this [1] and Peter didn't raise any opposition. > 2. Various trickery at the boundary of internal counters that are > unsigned and external functions or views using signed types. These need > another look. I also noticed we lack of UNSIGNED INT32/64 SQL type. Changing the counter to signed looks not good to me as well. This one looks doesn't have an agreement yet. > 3. Various messages print signed values using %x formats, which need to > be unsigned. These might also need another look. > > 4. Issues with constants being signed by default. For example, things > like elog(ERROR, "foo is %u but should be %u", somevar, 55) warns > because of the constant. Should be changed to something like 55U for > symmetry, or change the %u to %d. This also reaches into genbki > territory with all the OID constants being generated. > > 5. Some "surprising" but correct C behavior. For example, unsigned > short is promoted to int (not unsigned int) in variable arguments, so > needs a %d format. > > 6. Finally, a bunch of uses were just plain wrong and should be corrected. 7. __FILE__ in gcc is 'int', but we elog() it with "%u". Should we change it to "%d"? > I haven't found anything that is a really serious bug, but I imagine you > could run into trouble in various ways when you exceed the INT_MAX > value. But then again, if you use up INT_MAX WAL timelines, you > probably have other problems. ;-) Me too, just that want some clean code:) But FWIW, "-Wformat-signedness" is not supported by clang so far, so if people is using clang, they still can't benefit from this changes. My soluation (I use clang everyday) is adding a "gcc-checker" for my c file, if I make such mistake, it can remind me directly. [0] https://www.postgresql.org/message-id/874j4yl4cj.fsf%40163.com [1] https://www.postgresql.org/message-id/CA%2BhUKGJNUk434tcsVbs5YUGsujZbveo43QcZeWbv0xPzg9us-A%40mail.gmail.com -- Best Regards Andy Fan
On 27.10.24 04:59, Andy Fan wrote: >> I haven't found anything that is a really serious bug, but I imagine you >> could run into trouble in various ways when you exceed the INT_MAX >> value. But then again, if you use up INT_MAX WAL timelines, you >> probably have other problems. ;-) > Me too, just that want some clean code:) But FWIW, "-Wformat-signedness" > is not supported by clang so far, so if people is using clang, they > still can't benefit from this changes. clang 19 supports it now. > My soluation (I use clang > everyday) is adding a "gcc-checker" for my c file, if I make such > mistake, it can remind me directly. I think it could be useful to set up some better test coverage for various things overflowing signed integer maximums. For example, maybe you could hack initdb to advance the OID counter to INT32_MAX+1 or thereabouts and run the test suites from there. That would also catch things like inappropriate uses of atoi(), things beyond just the format strings.
On 29.10.24 07:51, Michael Paquier wrote: > On Tue, Oct 29, 2024 at 07:38:36AM +0100, Peter Eisentraut wrote: >> I think it could be useful to set up some better test coverage for various >> things overflowing signed integer maximums. For example, maybe you could >> hack initdb to advance the OID counter to INT32_MAX+1 or thereabouts and run >> the test suites from there. That would also catch things like inappropriate >> uses of atoi(), things beyond just the format strings. > > Fun. One way to be fancy here would be to force a pg_resetwal > --next-oid in some of the test paths (Cluster.pm and/or pg_regress) > with an environment variable to force the command to trigger across > the board for all the clusters created in the tests. initdb cannot be > used here as the TAP tests reuse a cluster already initdb'd to save > time. No need to touch at pg_regress, either, as we could count on the > pg_regress runs in 002_pg_upgrade.pl and 027_stream_regress.pl. I was thinking just compiling with a patch like this: -#define FirstNormalObjectId 16384 +#define FirstNormalObjectId ((Oid) INT_MAX + 1) Already found one bug: pg_checksums --filenode only accepts files up to INT_MAX.