Thread: Unused header file inclusion
Hi,
I noticed that there are many header files being
included which need not be included.
I have tried this in a few files and found the
compilation and regression to be working.
I have attached the patch for the files that
I tried.
I noticed that there are many header files being
included which need not be included.
I have tried this in a few files and found the
compilation and regression to be working.
I have attached the patch for the files that
I tried.
I tried this in CentOS, I did not find the header
files to be platform specific.
Should we pursue this further and cleanup in
all the files?
Regards,
Vignesh
all the files?
Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
Attachment
On Wed, Jul 31, 2019 at 11:19:08AM +0530, vignesh C wrote: > I noticed that there are many header files being included which need > not be included. I have tried this in a few files and found the > compilation and regression to be working. I have attached the patch > for the files that I tried. I tried this in CentOS, I did not find > the header files to be platform specific. > Should we pursue this further and cleanup in all the files? Do you use a particular method here or just manual deduction after looking at each file individually? If this can be cleaned up a bit, I think that's welcome. The removal of headers is easily forgotten when moving code from one file to another... -- Michael
Attachment
On Wed, Jul 31, 2019 at 11:26 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Wed, Jul 31, 2019 at 11:19:08AM +0530, vignesh C wrote: > > I noticed that there are many header files being included which need > > not be included. I have tried this in a few files and found the > > compilation and regression to be working. I have attached the patch > > for the files that I tried. I tried this in CentOS, I did not find > > the header files to be platform specific. > > Should we pursue this further and cleanup in all the files? > > Do you use a particular method here or just manual deduction after > looking at each file individually? If this can be cleaned up a bit, I > think that's welcome. The removal of headers is easily forgotten when > moving code from one file to another... > Thanks Michael. I'm writing some perl scripts to identify this. The script will scan through all the files, make changes, and verify. Finally it will give the changed files. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 31, 2019 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, Jul 31, 2019 at 11:26 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > On Wed, Jul 31, 2019 at 11:19:08AM +0530, vignesh C wrote: > > > I noticed that there are many header files being included which need > > > not be included. I have tried this in a few files and found the > > > compilation and regression to be working. I have attached the patch > > > for the files that I tried. I tried this in CentOS, I did not find > > > the header files to be platform specific. > > > Should we pursue this further and cleanup in all the files? > > > > Do you use a particular method here or just manual deduction after > > looking at each file individually? If this can be cleaned up a bit, I > > think that's welcome. The removal of headers is easily forgotten when > > moving code from one file to another... > > > Thanks Michael. > I'm writing some perl scripts to identify this. > The script will scan through all the files, make changes, > and verify. If we can come up with some such tool, we might be able to integrate it with Thomas's patch tester [1] wherein it can apply the patch, verify if there are unnecessary includes in the patch and report the same. [1] - http://commitfest.cputube.org/ -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 31, 2019 at 11:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jul 31, 2019 at 11:31 AM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, Jul 31, 2019 at 11:26 AM Michael Paquier <michael@paquier.xyz> wrote: > > > > > > On Wed, Jul 31, 2019 at 11:19:08AM +0530, vignesh C wrote: > > > > I noticed that there are many header files being included which need > > > > not be included. I have tried this in a few files and found the > > > > compilation and regression to be working. I have attached the patch > > > > for the files that I tried. I tried this in CentOS, I did not find > > > > the header files to be platform specific. > > > > Should we pursue this further and cleanup in all the files? > > > > > > Do you use a particular method here or just manual deduction after > > > looking at each file individually? If this can be cleaned up a bit, I > > > think that's welcome. The removal of headers is easily forgotten when > > > moving code from one file to another... > > > > > Thanks Michael. > > I'm writing some perl scripts to identify this. > > The script will scan through all the files, make changes, > > and verify. > > If we can come up with some such tool, we might be able to integrate > it with Thomas's patch tester [1] wherein it can apply the patch, > verify if there are unnecessary includes in the patch and report the > same. > > [1] - http://commitfest.cputube.org/ > Thanks Amit. I will post the tool along with the patch once I complete this activity. We can enhance the tool further based on feedback and take it forward. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 31, 2019 at 11:55:37AM +0530, Amit Kapila wrote: > If we can come up with some such tool, we might be able to integrate > it with Thomas's patch tester [1] wherein it can apply the patch, > verify if there are unnecessary includes in the patch and report the > same. > > [1] - http://commitfest.cputube.org/ Or even get something into src/tools/? If the produced result is clean enough, that could be interesting. -- Michael
Attachment
Hi, On 2019-07-31 11:19:08 +0530, vignesh C wrote: > I noticed that there are many header files being > included which need not be included. > I have tried this in a few files and found the > compilation and regression to be working. > I have attached the patch for the files that > I tried. > I tried this in CentOS, I did not find the header > files to be platform specific. > Should we pursue this further and cleanup in > all the files? These type of changes imo have a good chance of making things more fragile. A lot of the includes in header files are purely due to needing one or two definitions (which often could even be avoided by forward declarations). If you remove all the includes directly from the c files that are also included from some .h file, you increase the reliance on the indirect includes - making it harder to clean up. If anything, we should *increase* the number of includes, so we don't rely on indirect includes. But that's also not necessarily the right choice, because it adds unnecessary dependencies. > --- a/src/backend/utils/mmgr/freepage.c > +++ b/src/backend/utils/mmgr/freepage.c > @@ -56,7 +56,6 @@ > #include "miscadmin.h" > > #include "utils/freepage.h" > -#include "utils/relptr.h" I don't think it's a good idea to remove this header, for example. A *lot* of code in freepage.c relies on it. The fact that freepage.h also includes it here is just due to needing other parts of it > /* Magic numbers to identify various page types */ > diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c > index 334e35b..67268fd 100644 > --- a/src/backend/utils/mmgr/portalmem.c > +++ b/src/backend/utils/mmgr/portalmem.c > @@ -26,7 +26,6 @@ > #include "utils/builtins.h" > #include "utils/memutils.h" > #include "utils/snapmgr.h" > -#include "utils/timestamp.h" Similarly, this file uses timestamp.h functions directly. The fact that timestamp.h already is included is just due to implementation details of xact.h that this file shouldn't depend on. Greetings, Andres Freund
Michael Paquier <michael@paquier.xyz> writes: > On Wed, Jul 31, 2019 at 11:55:37AM +0530, Amit Kapila wrote: >> If we can come up with some such tool, we might be able to integrate >> it with Thomas's patch tester [1] wherein it can apply the patch, >> verify if there are unnecessary includes in the patch and report the >> same. > Or even get something into src/tools/? If the produced result is > clean enough, that could be interesting. I take it nobody has actually bothered to *look* in src/tools. src/tools/pginclude/README Note that our experience with this sort of thing has not been very good. See particularly 1609797c2. regards, tom lane
On 2019-Jul-31, vignesh C wrote: > I noticed that there are many header files being > included which need not be included. Yeah, we have tooling for this already in src/tools/pginclude. It's been used before, and it has wreaked considerable havoc; see "git log --grep pgrminclude". I think doing this sort of cleanup is useful to a point -- as Andres mentions, some includes are somewhat more "important" than others, so judgement is needed in each case. I think removing unnecessary include lines from header files is much more useful than from .c files. However, nowadays even I am not very convinced that that is a very fruitful use of time, since many/most developers use ccache which will reduce the compile times anyway in many cases; and development machines are typically much faster than ten years ago. Also, I think addition of new include lines to existing .c files should be a point worth specific attention in patch review, to avoid breaking reasonable modularity boundaries unnecessarily. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi, On 2019-07-31 11:23:22 -0400, Alvaro Herrera wrote: > I think removing unnecessary include lines from header files is much > more useful than from .c files. However, nowadays even I am not very > convinced that that is a very fruitful use of time, since many/most > developers use ccache which will reduce the compile times anyway in many > cases; and development machines are typically much faster than ten years > ago. IDK, I find the compilation times annoying. And it's gotten quite noticably worse with all the speculative execution mitigations. Although to some degree that's not really the fault of individual compilations, but our buildsystem being pretty slow. I think there's also just modularity reasons for removing includes from headers. We've some pretty oddly interlinked systems, often without good reason (*). Cleaning those up imo is well worth the time - but hardly can be automated. If one really wanted to automate removal of header files, it'd need to be a lot smarter than just checking whether a file fails to compile if one header is removed. In the general case we'd have to test if the .c file itself uses any of the symbols from the possibly-to-be-removed header. That's hard to do without using something like llvm's libclang. The one case that's perhaps a bit easier to automate, and possibly worthwhile: If a header is not indirectly included (possibly testable via #ifndef HEADER_NAME_H #error 'already included' etc), and compilation doesn't fail with it removed, *then* it's actually likely useless (except for portability cases). * I think a lot of the interlinking stems from the bad idea to use typedef's everywhere. In contrast to structs they cannot be forward declared portably in our version of C. We should use a lot more struct forward declarations, and just not use the typedef. Greetings, Andres Freund
On 2019-Jul-31, Andres Freund wrote: > IDK, I find the compilation times annoying. And it's gotten quite > noticably worse with all the speculative execution mitigations. Although > to some degree that's not really the fault of individual compilations, > but our buildsystem being pretty slow. We're in a much better position now than a decade ago, in terms of clock time. Back then I would resort to many tricks to avoid spurious compiles, even manually touching some files to dates in the past to avoid them. Nowadays I never bother with such things. But yes, reducing the build time even more would be welcome for sure. > * I think a lot of the interlinking stems from the bad idea to use > typedef's everywhere. In contrast to structs they cannot be forward > declared portably in our version of C. We should use a lot more struct > forward declarations, and just not use the typedef. I don't know about that ... I think the problem is that we both declare the typedef *and* define the struct in the same place. If we were to split those things to separate files, the required rebuilds would be much less, I think, because changing a struct would no longer require recompiles of files that merely pass those structs around (that's very common for Node-derived structs). Forward-declaring structs in unrelated header files just because they need them, feels a bit like cheating to me. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2019-Jul-31, Andres Freund wrote: >> * I think a lot of the interlinking stems from the bad idea to use >> typedef's everywhere. In contrast to structs they cannot be forward >> declared portably in our version of C. We should use a lot more struct >> forward declarations, and just not use the typedef. > I don't know about that ... I think the problem is that we both declare > the typedef *and* define the struct in the same place. If we were to > split those things to separate files, the required rebuilds would be > much less, I think, because changing a struct would no longer require > recompiles of files that merely pass those structs around (that's very > common for Node-derived structs). Forward-declaring structs in > unrelated header files just because they need them, feels a bit like > cheating to me. Yeah. I seem to recall a proposal that nodes.h should contain typedef struct Foo Foo; for every node type Foo, and then the other headers would just fill in the structs, and we could get rid of a lot of ad-hoc forward struct declarations and other hackery. regards, tom lane
Hi, On 2019-07-31 16:55:31 -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > On 2019-Jul-31, Andres Freund wrote: > >> * I think a lot of the interlinking stems from the bad idea to use > >> typedef's everywhere. In contrast to structs they cannot be forward > >> declared portably in our version of C. We should use a lot more struct > >> forward declarations, and just not use the typedef. > > > I don't know about that ... I think the problem is that we both declare > > the typedef *and* define the struct in the same place. If we were to > > split those things to separate files, the required rebuilds would be > > much less, I think, because changing a struct would no longer require > > recompiles of files that merely pass those structs around (that's very > > common for Node-derived structs). Forward-declaring structs in > > unrelated header files just because they need them, feels a bit like > > cheating to me. > > Yeah. I seem to recall a proposal that nodes.h should contain > > typedef struct Foo Foo; > > for every node type Foo, and then the other headers would just > fill in the structs, and we could get rid of a lot of ad-hoc > forward struct declarations and other hackery. That to me just seems objectively worse. Now adding a new struct as a minor implementation detail of some subsystem doesn't just require recompiling the relevant files, but just about all of pg. And just about every header would acquire a nodes.h include - there's still a lot of them that today don't. I don't understand why you guys consider forward declaring structs ugly. It's what just about every other C project does. The only reason it's sometimes problematic in postgres is that we "hide the pointer" within some typedefs, making it not as obvious which type we're referring to (because the struct usage will be struct FooData*, instead of just Foo). But we also have confusion due to that in a lot of other places, so I don't really buy that this is a significant issue. Right now we really have weird dependencies between largely independent subsystem. Some are partially because functions aren't always in the right file, but it's also not often clear what the right one would be. E.g. snapmgr.h depending on relcache.h (for TransactionIdLimitedForOldSnapshots() having a Relation parameter), on resowner.h (for RegisterSnapshotOnOwner()) is imo not good. For one they lead to a number of .c files that actually use functionality from resowner.h to not have the necessary includes. There's a lot of things like that. .oO(the fmgr.h include in snapmgr.h has been unnecessary since 352a24a1f9) We could of course be super rigorous and have an accompanying foo_structs.h or something for every foo.h. But that seems to add no actual advantages, and makes things more complicated. The only reason the explicit forward declaration is needed in the common case of a 'struct foo*' parameter is that C has weird visibility rules about the scope of forward declarations in paramters. If you instead first have e.g. a function *return* type of that struct type, the explicit forward declaration isn't even needed - it's visible afterwards. But for parameters it's basically a *different* struct, that cannot be referenced again. Note that in C++ the visibility routines are more consistent, and you don't need an explicit forward declaration in either case (I'd also be OK with requiring it in both cases, it's just weird to only need them in one). Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2019-07-31 16:55:31 -0400, Tom Lane wrote: >> Yeah. I seem to recall a proposal that nodes.h should contain >> >> typedef struct Foo Foo; >> >> for every node type Foo, and then the other headers would just >> fill in the structs, and we could get rid of a lot of ad-hoc >> forward struct declarations and other hackery. > That to me just seems objectively worse. Now adding a new struct as a > minor implementation detail of some subsystem doesn't just require > recompiling the relevant files, but just about all of pg. Er, what? This list of typedefs would change exactly when enum NodeTag changes, so AFAICS your objection is bogus. It's true that this proposal doesn't help for structs that aren't Nodes, but my sense is that > 90% of our ad-hoc struct references are for Nodes. > Right now we really have weird dependencies between largely independent > subsystem. Agreed, but I think fixing that will take some actually serious design work. It's not going to mechanically fall out of changing typedef rules. > The only reason the explicit forward declaration is needed in the common > case of a 'struct foo*' parameter is that C has weird visibility rules > about the scope of forward declarations in paramters. Yeah, but there's not much we can do about that, nor is getting rid of typedefs in favor of "struct" going to make it even a little bit better. regards, tom lane
Hi, On 2019-07-31 19:25:01 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2019-07-31 16:55:31 -0400, Tom Lane wrote: > >> Yeah. I seem to recall a proposal that nodes.h should contain > >> > >> typedef struct Foo Foo; > >> > >> for every node type Foo, and then the other headers would just > >> fill in the structs, and we could get rid of a lot of ad-hoc > >> forward struct declarations and other hackery. > > > That to me just seems objectively worse. Now adding a new struct as a > > minor implementation detail of some subsystem doesn't just require > > recompiling the relevant files, but just about all of pg. > > Er, what? This list of typedefs would change exactly when enum NodeTag > changes, so AFAICS your objection is bogus. > It's true that this proposal doesn't help for structs that aren't Nodes, > but my sense is that > 90% of our ad-hoc struct references are for Nodes. Ah, well, I somehow assumed you were talking about all nodes. I don't think I agree with the 90% figure. In headers I feel like most the references are to things like Relation, Snapshot, HeapTuple, etc. > > Right now we really have weird dependencies between largely independent > > subsystem. > > Agreed, but I think fixing that will take some actually serious design > work. It's not going to mechanically fall out of changing typedef rules. No, but without finding a more workable approach than what we're doing often doing now wrt includes and forward declares, we'll have a lot harder time to separate subsystems out more. > > The only reason the explicit forward declaration is needed in the common > > case of a 'struct foo*' parameter is that C has weird visibility rules > > about the scope of forward declarations in paramters. > > Yeah, but there's not much we can do about that, nor is getting rid > of typedefs in favor of "struct" going to make it even a little bit > better. It imo pretty fundamentally does. You cannot redefine typedefs, but you can forward declare structs. E.g. in the attached series of patches, I'm removing a good portion of unnecessary dependencies to fmgr.h. But to actually make a difference that requires referencing two structs without including the header - and I don't think restructing fmgr.h into two headers is a particularly attractive alternative (would make it a lot more work and a lot more invasive). Think the first three are pretty clearly a good idea, I'm a bit less sanguine about the fourth: Headers like utils/timestamp.h are often included just because we need a TimestampTz type somewhere, or call GetCurrentTimestamp(). Approximately none of these need the PG_GETARG_* macros, which are the only reason for including fmgr.h in these headers. As they're macros that's not actually needed, although I think normally good style. But I' think here avoiding exposing fmgr.h to more headers is a bigger win. Greetings, Andres Freund
Attachment
On Wed, Jul 31, 2019 at 12:26 PM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2019-07-31 11:19:08 +0530, vignesh C wrote: > > I noticed that there are many header files being > > included which need not be included. > > I have tried this in a few files and found the > > compilation and regression to be working. > > I have attached the patch for the files that > > I tried. > > I tried this in CentOS, I did not find the header > > files to be platform specific. > > Should we pursue this further and cleanup in > > all the files? > > These type of changes imo have a good chance of making things more > fragile. A lot of the includes in header files are purely due to needing > one or two definitions (which often could even be avoided by forward > declarations). If you remove all the includes directly from the c files > that are also included from some .h file, you increase the reliance on > the indirect includes - making it harder to clean up. > > If anything, we should *increase* the number of includes, so we don't > rely on indirect includes. But that's also not necessarily the right > choice, because it adds unnecessary dependencies. > > > > --- a/src/backend/utils/mmgr/freepage.c > > +++ b/src/backend/utils/mmgr/freepage.c > > @@ -56,7 +56,6 @@ > > #include "miscadmin.h" > > > > #include "utils/freepage.h" > > -#include "utils/relptr.h" > > I don't think it's a good idea to remove this header, for example. A > *lot* of code in freepage.c relies on it. The fact that freepage.h also > includes it here is just due to needing other parts of it > Fixed this. > > > /* Magic numbers to identify various page types */ > > diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c > > index 334e35b..67268fd 100644 > > --- a/src/backend/utils/mmgr/portalmem.c > > +++ b/src/backend/utils/mmgr/portalmem.c > > @@ -26,7 +26,6 @@ > > #include "utils/builtins.h" > > #include "utils/memutils.h" > > #include "utils/snapmgr.h" > > -#include "utils/timestamp.h" > > Similarly, this file uses timestamp.h functions directly. The fact that > timestamp.h already is included is just due to implementation details of > xact.h that this file shouldn't depend on. > Fixed this. Made the fixes based on your comments, updated patch has the changes for the same. Regards, Vignesh EnterpriseDB: http://www.enterprisedb.com
Attachment
On 2019-Aug-04, vignesh C wrote: > Made the fixes based on your comments, updated patch has the changes > for the same. Well, you fixed the two things that seem to me quoted as examples of problems, but you didn't fix other occurrences of the same issues elsewhere. For example, you remove lwlock.h from dsa.c but there are structs there that have LWLocks as members. That's just the first hunk of the patch; didn't look at the others but it wouldn't surprise that they have similar issues. I suggest this patch should be rejected. Then there's the <limits.h> removal, which is in tuplesort.c because of INT_MAX as added by commit d26559dbf356 and still present ... FWIW sharedtuplestore.c, a very young file, also includes <limits.h> but that appears to be copy-paste of includes from some other file (and also in an inappropriate place), so I have no objections to obliterating that one. But other than that one line, this patch needs more "adult supervision". -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Then there's the <limits.h> removal, which is in tuplesort.c because of > INT_MAX as added by commit d26559dbf356 and still present ... One has to be especially wary of removing system-header inclusions; the fact that they don't seem to be needed on your own machine doesn't prove they aren't needed elsewhere. > ... I suggest this patch should be rejected. Yeah. If we do anything along this line it should be based on pgrminclude results, and even then I think it'd require manual review, especially for changes in header files. The big picture here is that removing #includes is seldom worth the effort it takes :-( regards, tom lane
On 2019-Aug-05, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Then there's the <limits.h> removal, which is in tuplesort.c because of > > INT_MAX as added by commit d26559dbf356 and still present ... > > One has to be especially wary of removing system-header inclusions; > the fact that they don't seem to be needed on your own machine doesn't > prove they aren't needed elsewhere. As far as I can see, this line was just carried over from tuplestore.c, but that file uses INT_MAX and this one doesn't use any of those macros as far as I can tell. I pushed this change and hereby consider this to be over. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Aug 8, 2019 at 9:00 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On 2019-Aug-05, Tom Lane wrote: > > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > > Then there's the <limits.h> removal, which is in tuplesort.c because of > > > INT_MAX as added by commit d26559dbf356 and still present ... > > > > One has to be especially wary of removing system-header inclusions; > > the fact that they don't seem to be needed on your own machine doesn't > > prove they aren't needed elsewhere. > > As far as I can see, this line was just carried over from tuplestore.c, > but that file uses INT_MAX and this one doesn't use any of those macros > as far as I can tell. Yeah, probably, or maybe I used one of those sorts of macros in an earlier version of the patch. By the way, I see now that I had put the offending #include <limits.h> *after* project headers, which is a convention I picked up from other projects, mentors and authors, but not what PostgreSQL usually does. In my own code I do that to maximise the chances that project headers will fail to compile if they themselves forget to include the system headers they depend on. Of course an attempt to compile every header in the project individually as a transaction unit also achieves that. /me runs away -- Thomas Munro https://enterprisedb.com
On Thu, Aug 8, 2019 at 9:47 AM Thomas Munro <thomas.munro@gmail.com> wrote: > transaction unit *translation unit -- Thomas Munro https://enterprisedb.com
Hi, On 2019-08-03 12:37:33 -0700, Andres Freund wrote: > Think the first three are pretty clearly a good idea, I'm a bit less > sanguine about the fourth: > Headers like utils/timestamp.h are often included just because we need a > TimestampTz type somewhere, or call GetCurrentTimestamp(). Approximately > none of these need the PG_GETARG_* macros, which are the only reason for > including fmgr.h in these headers. As they're macros that's not > actually needed, although I think normally good style. But I' think here > avoiding exposing fmgr.h to more headers is a bigger win. I still think the fourth is probably worthwhile, but I don't feel confident enough to do it without somebody else +0.5'ing it... I've pushed the other ones. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > I've pushed the other ones. Checking whether header files compile standalone shows you were overly aggressive about removing fmgr.h includes: In file included from /tmp/headerscheck.Ss8bVx/test.c:3: ./src/include/utils/selfuncs.h:143: error: expected declaration specifiers or '...' before 'FmgrInfo' ./src/include/utils/selfuncs.h:146: error: expected declaration specifiers or '...' before 'FmgrInfo' ./src/include/utils/selfuncs.h:152: error: expected declaration specifiers or '...' before 'FmgrInfo' That's with a script I use that's like cpluspluscheck except it tests with plain gcc not g++. I attached it for the archives' sake. Oddly, cpluspluscheck does not complain about those cases, but it does complain about In file included from /tmp/cpluspluscheck.FgX2SW/test.cpp:4: ./src/bin/scripts/scripts_parallel.h:18: error: ISO C++ forbids declaration of 'PGconn' with no type ./src/bin/scripts/scripts_parallel.h:18: error: expected ';' before '*' token ./src/bin/scripts/scripts_parallel.h:29: error: 'PGconn' has not been declared (My headerscheck script is missing that header; I need to update it to match the latest version of cpluspluscheck.) regards, tom lane #!/bin/sh # Check all exported PostgreSQL include files for standalone build. # Run this from the top-level source directory after performing a build. # No output if everything is OK, else compiler errors. me=`basename $0` tmp=`mktemp -d /tmp/$me.XXXXXX` trap 'rm -rf $tmp' 0 1 2 3 15 # This should match what configure chooses, though we only care about -Wxxx CFLAGS="-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -D_GNU_SOURCE -I . -I src/include -I src/interfaces/libpq $CFLAGS" # Omit src/include/port/, because it's platform specific, and c.h includes # the relevant file anyway. # rusagestub.h is also platform-specific, and will be included by # utils/pg_rusage.h if necessary. # access/rmgrlist.h is not meant to be included standalone. # regex/regerrs.h is not meant to be included standalone. # parser/gram.h will be included by parser/gramparse.h. # parser/kwlist.h is not meant to be included standalone. # pg_trace.h and utils/probes.h can include sys/sdt.h from SystemTap, # which itself contains C++ code and so won't compile with a C++ # compiler under extern "C" linkage. for f in `find src/include src/interfaces/libpq/libpq-fe.h src/interfaces/libpq/libpq-events.h -name '*.h' -print | \ grep -v -e ^src/include/port/ \ -e ^src/include/rusagestub.h -e ^src/include/regex/regerrs.h \ -e ^src/include/access/rmgrlist.h \ -e ^src/include/parser/gram.h -e ^src/include/parser/kwlist.h \ -e ^src/include/pg_trace.h -e ^src/include/utils/probes.h` do { test $f != "src/include/postgres_fe.h" && echo '#include "postgres.h"' echo "#include \"$f\"" } >$tmp/test.c ${CC:-gcc} $CFLAGS -c $tmp/test.c -o $tmp/test.o done
I wrote: > (My headerscheck script is missing that header; I need to update it to > match the latest version of cpluspluscheck.) I did that, and ended up with the attached. I'm rather tempted to stick this into src/tools/ alongside cpluspluscheck, because it seems to find rather different trouble spots than cpluspluscheck does. Thoughts? As of HEAD (927f34ce8), I get clean passes from both this and cpluspluscheck, except for the FmgrInfo references in selfuncs.h. I tried it on RHEL/Fedora, FreeBSD 12, and current macOS. regards, tom lane #!/bin/sh # Check (almost) all PostgreSQL include files for standalone build. # # Argument 1 is the top-level source directory, argument 2 the # top-level build directory (they might be the same). If not set, they # default to the current directory. # # Needs to be run after configuring and creating all generated headers. # It's advisable to configure --with-perl --with-python, or you're # likely to get errors from associated headers. # # No output if everything is OK, else compiler errors. if [ -z "$1" ]; then srcdir="." else srcdir="$1" fi if [ -z "$2" ]; then builddir="." else builddir="$2" fi me=`basename $0` # Pull some info from configure's results. MGLOB="$builddir/src/Makefile.global" CFLAGS=`sed -n 's/^CFLAGS[ ]*=[ ]*//p' "$MGLOB"` CPPFLAGS=`sed -n 's/^CPPFLAGS[ ]*=[ ]*//p' "$MGLOB"` PG_SYSROOT=`sed -n 's/^PG_SYSROOT[ ]*=[ ]*//p' "$MGLOB"` CC=`sed -n 's/^CC[ ]*=[ ]*//p' "$MGLOB"` perl_includespec=`sed -n 's/^perl_includespec[ ]*=[ ]*//p' "$MGLOB"` python_includespec=`sed -n 's/^python_includespec[ ]*=[ ]*//p' "$MGLOB"` # needed on Darwin CPPFLAGS=`echo "$CPPFLAGS" | sed "s|\\$(PG_SYSROOT)|$PG_SYSROOT|g"` # (EXTRAFLAGS is not set here, but user can pass it in if need be.) # Create temp directory. tmp=`mktemp -d /tmp/$me.XXXXXX` trap 'rm -rf $tmp' 0 1 2 3 15 # Scan all of src/ and contrib/ for header files. for f in `cd "$srcdir" && find src contrib -name '*.h' -print` do # Ignore files that are unportable or intentionally not standalone. # These files are platform-specific, and c.h will include the # one that's relevant for our current platform anyway. test "$f" = src/include/port/aix.h && continue test "$f" = src/include/port/cygwin.h && continue test "$f" = src/include/port/darwin.h && continue test "$f" = src/include/port/freebsd.h && continue test "$f" = src/include/port/hpux.h && continue test "$f" = src/include/port/linux.h && continue test "$f" = src/include/port/netbsd.h && continue test "$f" = src/include/port/openbsd.h && continue test "$f" = src/include/port/solaris.h && continue test "$f" = src/include/port/win32.h && continue # Additional Windows-specific headers. test "$f" = src/include/port/win32_port.h && continue test "$f" = src/include/port/win32/sys/socket.h && continue test "$f" = src/include/port/win32_msvc/dirent.h && continue test "$f" = src/port/pthread-win32.h && continue # Likewise, these files are platform-specific, and the one # relevant to our platform will be included by atomics.h. test "$f" = src/include/port/atomics/arch-arm.h && continue test "$f" = src/include/port/atomics/arch-hppa.h && continue test "$f" = src/include/port/atomics/arch-ia64.h && continue test "$f" = src/include/port/atomics/arch-ppc.h && continue test "$f" = src/include/port/atomics/arch-x86.h && continue test "$f" = src/include/port/atomics/fallback.h && continue test "$f" = src/include/port/atomics/generic.h && continue test "$f" = src/include/port/atomics/generic-acc.h && continue test "$f" = src/include/port/atomics/generic-gcc.h && continue test "$f" = src/include/port/atomics/generic-msvc.h && continue test "$f" = src/include/port/atomics/generic-sunpro.h && continue test "$f" = src/include/port/atomics/generic-xlc.h && continue # rusagestub.h is also platform-specific, and will be included # by utils/pg_rusage.h if necessary. test "$f" = src/include/rusagestub.h && continue # sepgsql.h depends on headers that aren't there on most platforms. test "$f" = contrib/sepgsql/sepgsql.h && continue # These files are not meant to be included standalone, because # they contain lists that might have multiple use-cases. test "$f" = src/include/access/rmgrlist.h && continue test "$f" = src/include/parser/kwlist.h && continue test "$f" = src/pl/plpgsql/src/pl_reserved_kwlist.h && continue test "$f" = src/pl/plpgsql/src/pl_unreserved_kwlist.h && continue test "$f" = src/interfaces/ecpg/preproc/c_kwlist.h && continue test "$f" = src/interfaces/ecpg/preproc/ecpg_kwlist.h && continue test "$f" = src/include/regex/regerrs.h && continue test "$f" = src/pl/plpgsql/src/plerrcodes.h && continue test "$f" = src/pl/plpython/spiexceptions.h && continue test "$f" = src/pl/tcl/pltclerrcodes.h && continue # We can't make these Bison output files compilable standalone # without using "%code require", which old Bison versions lack. # parser/gram.h will be included by parser/gramparse.h anyway. test "$f" = src/include/parser/gram.h && continue test "$f" = src/backend/parser/gram.h && continue test "$f" = src/pl/plpgsql/src/pl_gram.h && continue test "$f" = src/interfaces/ecpg/preproc/preproc.h && continue # This produces a "no previous prototype" warning. test "$f" = src/include/storage/checksum_impl.h && continue # ppport.h is not under our control, so we can't make it standalone. test "$f" = src/pl/plperl/ppport.h && continue # regression.h is not actually C, but ECPG code. test "$f" = src/interfaces/ecpg/test/regression.h && continue # printf_hack.h produces "unused function" warnings. test "$f" = src/interfaces/ecpg/test/printf_hack.h && continue # OK, create .c file to include this .h file. { test "$f" != src/include/postgres_fe.h && echo '#include "postgres.h"' echo "#include \"$f\"" } >$tmp/test.c # Some subdirectories need extra -I switches. case "$f" in src/pl/plperl/*) EXTRAINCLUDES="$perl_includespec" ;; src/pl/plpython/*) EXTRAINCLUDES="$python_includespec" ;; src/interfaces/ecpg/*) EXTRAINCLUDES="-I $builddir/src/interfaces/ecpg/include -I $srcdir/src/interfaces/ecpg/include" ;; *) EXTRAINCLUDES="" ;; esac # Run the test. ${CC:-gcc} $CPPFLAGS $CFLAGS -I $builddir -I $srcdir \ -I $builddir/src/include -I $srcdir/src/include \ -I $builddir/src/interfaces/libpq -I $srcdir/src/interfaces/libpq \ $EXTRAINCLUDES $EXTRAFLAGS -c $tmp/test.c -o $tmp/test.o done
On 2019-Aug-18, Tom Lane wrote: > I wrote: > > (My headerscheck script is missing that header; I need to update it to > > match the latest version of cpluspluscheck.) > > I did that, and ended up with the attached. I'm rather tempted to stick > this into src/tools/ alongside cpluspluscheck, because it seems to find > rather different trouble spots than cpluspluscheck does. Thoughts? Yeah, let's include this. I've written its equivalent a couple of times already. (My strategy is just to compile the .h file directly though, which creates a .gch file, rather than writing a temp .c file.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2019-Aug-18, Tom Lane wrote: >> I did that, and ended up with the attached. I'm rather tempted to stick >> this into src/tools/ alongside cpluspluscheck, because it seems to find >> rather different trouble spots than cpluspluscheck does. Thoughts? > Yeah, let's include this. Done. regards, tom lane
Hi, On 2019-08-18 14:37:34 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > I've pushed the other ones. > > Checking whether header files compile standalone shows you were overly > aggressive about removing fmgr.h includes: > > In file included from /tmp/headerscheck.Ss8bVx/test.c:3: > ./src/include/utils/selfuncs.h:143: error: expected declaration specifiers or '...' before 'FmgrInfo' > ./src/include/utils/selfuncs.h:146: error: expected declaration specifiers or '...' before 'FmgrInfo' > ./src/include/utils/selfuncs.h:152: error: expected declaration specifiers or '...' before 'FmgrInfo' Darn. Pushed the obvious fix of adding a direct fmgr.h include, rather than the preivous indirect include. > That's with a script I use that's like cpluspluscheck except it tests > with plain gcc not g++. I attached it for the archives' sake. > > Oddly, cpluspluscheck does not complain about those cases, but it > does complain about Hm. I don't understand why it's not complaining. Wonder if it's a question of the flags or such. > In file included from /tmp/cpluspluscheck.FgX2SW/test.cpp:4: > ./src/bin/scripts/scripts_parallel.h:18: error: ISO C++ forbids declaration of 'PGconn' with no type > ./src/bin/scripts/scripts_parallel.h:18: error: expected ';' before '*' token > ./src/bin/scripts/scripts_parallel.h:29: error: 'PGconn' has not been declared I noticed that "manually" earlier, when looking at the openbsd issue. > (My headerscheck script is missing that header; I need to update it to > match the latest version of cpluspluscheck.) I wonder if we should just add a #ifndef HEADERCHECK or such to the headers that we don't want to process as standalone headers (or #ifndef NOT_STANDALONE or whatever). That way multiple tools can rely on these markers, rather than copying knowledge about that kind of information into multiple places. I wish we could move the whole logic of those scripts into makefiles, so we could employ parallelism. Greetings, Andres Freund
Hi, On 2019-08-19 13:07:50 -0700, Andres Freund wrote: > On 2019-08-18 14:37:34 -0400, Tom Lane wrote: > > That's with a script I use that's like cpluspluscheck except it tests > > with plain gcc not g++. I attached it for the archives' sake. > > > > Oddly, cpluspluscheck does not complain about those cases, but it > > does complain about > > Hm. I don't understand why it's not complaining. Wonder if it's a > question of the flags or such. I'ts caused by -fsyntax-only # These switches are g++ specific, you may override if necessary. CXXFLAGS=${CXXFLAGS:- -fsyntax-only -Wall} which also explains why your headercheck doesn't have that problem, it doesn't use -fsyntax-only. > > (My headerscheck script is missing that header; I need to update it to > > match the latest version of cpluspluscheck.) > > I wonder if we should just add a #ifndef HEADERCHECK or such to the > headers that we don't want to process as standalone headers (or #ifndef > NOT_STANDALONE or whatever). That way multiple tools can rely on these markers, > rather than copying knowledge about that kind of information into > multiple places. > > I wish we could move the whole logic of those scripts into makefiles, so > we could employ parallelism. Hm. Perhaps the way to do that would be to use gcc's -include to include postgres.h, and use -Wc++-compat to detect c++ issues, rather than using g++. Without tempfiles it ought to be a lot easier to just do all of the relevant work in make, without a separate shell script. The python/perl/ecpg logic would b e abit annoying, but probably not too bad? Think we could just always add all of them? Greetings, Andres Freund
On 2019-Aug-19, Andres Freund wrote: > > I wish we could move the whole logic of those scripts into makefiles, so > > we could employ parallelism. > > Hm. Perhaps the way to do that would be to use gcc's -include to include > postgres.h, and use -Wc++-compat to detect c++ issues, rather than using > g++. Without tempfiles it ought to be a lot easier to just do all of the > relevant work in make, without a separate shell script. I used to have this: https://postgr.es/m/1293469595-sup-1462@alvh.no-ip.org Not sure how much this helps, since it's a shell line in make, so not very paralellizable. And you still have to build the exclusions somehow. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andres Freund <andres@anarazel.de> writes: > On 2019-08-19 13:07:50 -0700, Andres Freund wrote: >> On 2019-08-18 14:37:34 -0400, Tom Lane wrote: >>> Oddly, cpluspluscheck does not complain about those cases, but it >>> does complain about >> Hm. I don't understand why it's not complaining. Wonder if it's a >> question of the flags or such. > I'ts caused by -fsyntax-only Ah-hah. Should we change that to something else? That's probably a hangover from thinking that all we had to do was check for C++ keywords. >> I wish we could move the whole logic of those scripts into makefiles, so >> we could employ parallelism. I can't really get excited about expending a whole bunch of additional work on these scripts. regards, tom lane