Thread: WIP: a way forward on bootstrap data
Hi, I was looking through the archives one day for a few topics that interest me, and saw there was continued interest in making bootstrap data less painful [1] [2]. There were quite a few good ideas thrown around in those threads, but not much in the way of concrete results. I took a few of them as a starting point and threw together the attached WIP patchset. == An overview (warning: long): 1 through 3 are small tweaks worth doing even without a data format change. 4 through 7 are separated for readability and flexibility, but should be understood as one big patch. I tried to credit as many people's ideas as possible. Some things are left undone until basic agreement is reached. -- Patch 1 - Minor corrections -- Patch 2 - Minor cleanup Be more consistent style-wise, change a confusing variable name, fix perltidy junk. -- Patch 3 - Remove hard-coded schema information about pg_attribute from genbki.pl. This means slightly less code maintenance, but more importantly it's a proving ground for mechanisms used in later patches. 1. Label a column's default value in the catalog header [3]. Note: I implemented it in the simplest way possible for now, which happens to prevents a column from having both FORCE_(NOT_)NULL and a default at the same time, but I think in practice that would almost never matter. If more column options are added in the future, this will have to be rewritten. 2. Add a new function to Catalog.pm to fill in a tuple with default values. It will complain loudly if it can't find either a default or a given value, so change the signature of emit_pgattr_row() so we can pass a partially built tuple to it. 3. Format the schema macro entries according to their types. 4. Commit 8137f2c32322c624e0431fac1621e8e9315202f9 labeled which columns are variable length. Expose that label so we can exclude those columns from schema macros in a general fashion. -- Patch 4 - Infrastructure for the data conversion 1. Copy a modified version of Catalogs() from Catalog.pm to a new script that turns DATA()/(SH)DESCR() statements into serialized Perl data structures in pg_*.dat files, preserving comments along the way. Toast and index statements are unaffected. Although it's a one-off as far as the core project is concerned, I imagine third-parties might want access to this tool, so it's in the patch and not separate. 2. Remove data parsing from the original Catalogs() function and rename it to reflect its new, limited role of extracting the schema info from a header. The data files are handled by a new function. 3. Introduce a script to rewrite pg_*.dat files in a standard format [4], strip default values, preserve comments and normalize blank lines. It can also change default values on the fly. It is intended to be run when adding new data. 4. Add default values to a few catalog headers for demonstration purposes. There might be some questionable choices here. Review is appreciated. Note: At this point, the build is broken. TODO: See what pgindent does to the the modified headers. -- Patch 5 - Mechanical data conversion This is the result of: cd src/include/catalog perl convert_header2dat.pl list-of-catalog-headers-from-the-Makefile perl -I ../../backend/catalog rewrite_dat.pl *.dat rm *.bak Note: The data has been stripped of all double-quotes for readability, since the Perl hash literals have single quotes everywhere. Patches 6 and 7 restore them where needed. -- Patch 6 - Hand edits Re-doublequote values that are macros expanded by initdb.c, remove stray comments, fix up whitespace, and do a minimum of comment editing to reflect the new data format. At this point the files are ready to take a look at. Format is the central issue, of course. I tried to structure things so it wouldn't be a huge lift to bikeshed on the format. See pg_authid.dat for a conveniently small example. Each entry is 1 or 2 lines long, depending on whether oid or (sh)descr is present. Regarding pg_proc.dat, I think readability is improved, and to some extent line length, although it's not great: pg_proc.h: avg=175, stdev=25 pg_proc.dat: avg=159, stdev=43 (grep -E '^DATA' pg_proc.h | awk '{print length}' grep prosrc pg_proc.dat | awk '{print length}') Many lines now fit in an editor window, but the longest line has ballooned from 576 chars to 692. I made proparallel default to 'u' for safety, but the vast majority are 's'. If we risked making 's' the default, most entries would shrink by 20 chars. On the other hand, human-readable types would inflate that again, but maybe patch 8 below can help with editing. There was some discussion about abbreviated labels that were mapped to column names - I haven't thought about that yet. -- Patch 7 - Distprep scripts 1. Teach genbki.pl and Gen_fmgrtab.pl to read the data files, and arrange for the former to double-quote certain values so bootscanner.l can read them. 2. Introduce Makefile dependencies on the data files. The build works now. Note: Since the original DATA() entries contained some (as far as I can tell) useless quotes, the postgres.bki diff won't be zero, but it will be close. Performance note: On my laptop, running Make in the catalog dir with no compilation goes from ~700ms on master to ~1000ms with the new data files. -- Patch 8 - SQL output 1. Write out postgres.sql, which allows you to insert all the source BKI data into a development catalog schema for viewing and possibly bulk-editing [5]. It retains oids, and (sh)descr fields in their own columns, and implements default values. 2. Make it a distclean target. TODO: Find a way to dump dev catalog tuples into the canonical format. -- Not implemented yet: -Gut the header files of DATA() statements. I'm thinking we should keep the #defines in the headers, but see below: -Update README and documentation -- Future work: -More lookups for human-readable types, operators, etc. -Automate pg_type #defines a la pg_proc [6], which could also be used to maintain ecpg's copy of pg_type #defines. -- [1] https://www.postgresql.org/message-id/flat/20150220234142.GH12653%40awork2.anarazel.de [2] https://www.postgresql.org/message-id/flat/CAGoxFiFeW64k4t95Ez2udXZmKA%2BtazUFAaSTtYQLM4Jhzw%2B-pg%40mail.gmail.com [3] https://www.postgresql.org/message-id/20161113171017.7sgaqdeq6jslmsr3%40alap3.anarazel.de [4] https://www.postgresql.org/message-id/D8F1D509-6498-43AC-BEFC-052DFE16847A%402ndquadrant.com [5] https://www.postgresql.org/message-id/20150304150712.GV29780%40tamriel.snowman.net [6] https://www.postgresql.org/message-id/15697.1479161432%40sss.pgh.pa.us == I'll add this to the next commitfest soon. -John Naylor
Attachment
There doesn't seem to be any interest in bootstrap data at the moment, but rather than give up just yet, I've added a couple features to make a data migration more compelling: 1. Human-readable types, operators, opfamilies, and access methods 2. Column abbreviations For an example of both practices, an entry from pg_amop changes from DATA(insert ( 1976 21 21 1 s 95 403 0 )); to { opf => 'btree/integer_ops', lt => 'int2', rt => 'int2', str => '1', oper => '<(int2,int2)', am => 'btree' }, 3. Reduce redundancy in pg_proc data by -computing pronargs from proargtypes and -leaving out prosrc if it's just a copy of proname. This, plus a few column abbreviations drastically shrinks pg_proc.dat line length, even with human-readable types: pg_proc.h: avg=175, stdev=25 pg_proc.dat: avg=92, stdev=43 An example before: DATA(insert OID = 300 ( float48ne PGNSP PGUID 12 1 0 0 0 f f f t t f i s 2 0 16 "700 701" _null_ _null_ _null_ _null_ _null_ float48ne _null_ _null_ _null_ )); and after: { oid => '300', n => 'float48ne', lp => 't', p => 's', rt => 'bool', at => 'float4 float8' }, -- I've changed the numbering so that patches with the same number should be taken as unit, separated only for readability. When referring to the previous email overview, they map like this: 1-3 : unchanged 4-7 : 4A-4D 8 : N/A - I've left out the SQL generation for now, but I can add it later. New in this patch set: Patch 5 rips out the DATA() and DESCR() lines from the headers and updates the comments to reflect that. Patches 6A and 6B implement human-readable types etc. as described above. -John Naylor
Attachment
- 0001_corrections_v1.patch
- 0002_cleanup_v1.patch
- 0003_pgattr_schema_isolation_v1.patch
- 0004A_conversion_scripts_and_headers_v2.patch
- 0004B_data_files_mechanical_v2.patch.tar.gz
- 0004C_data_files_hand_edits_v2.patch
- 0004D_update_distprep_scripts_v2.patch
- 0005_gut_headers_v2.patch.tar.gz
- 0006A_human_readable_entries_data_v2.patch.tar.gz
- 0006B_human_readable_entries_code_v2.patch
On 12/13/17 04:06, John Naylor wrote: > There doesn't seem to be any interest in bootstrap data at the moment, > but rather than give up just yet, I've added a couple features to make > a data migration more compelling: I took a brief look at your patches, and there appear to be a number of good cleanups in there at least. But could you please send patches in git format-patch format with commit messages, so we don't have to guess what each patch does? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 12/13/17, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > On 12/13/17 04:06, John Naylor wrote: >> There doesn't seem to be any interest in bootstrap data at the moment, >> but rather than give up just yet, I've added a couple features to make >> a data migration more compelling: > > I took a brief look at your patches, and there appear to be a number of > good cleanups in there at least. But could you please send patches in > git format-patch format with commit messages, so we don't have to guess > what each patch does? Thanks for taking a look and for pointing me to git format-patch. That's much nicer than trying to keep emails straight. I've attached a new patchset. Note that 4-7 and 9-10 are units as far as the build is concerned. Meaning, once 4 is applied, the build is broken until 7 is applied. Also, postgres.bki won't diff 100% clean with the master branch because of some useless quotes in the latter. One thing that occured to me while looking over patch 0004 again: It's now a bit uglier to handle indexing.h and toasting.h. I think it might be cleaner to keep those statements in the header of the catalog they refer to. That has the additional benefit of making the headers the Single Point of Truth for a catalog schema. TODO: -Docs and README -Finish SQL generation patch -Consider generating pg_type #defines -John Naylor
Attachment
- 0001-Fix-typos-and-oversights-in-catalog-headers_v3.patch
- 0002-Cleanup-distprep-scripts_v3.patch
- 0003-Remove-hard-coded-schema-knowledge-about-pg_attri_v3.patch
- 0004-Data-conversion-infrastructure_v3.patch
- 0005-Mechanical-data-conversion_v3.patch.tar.gz
- 0006-Hand-edits-of-data-files_v3.patch
- 0007-Update-distprep-scripts_v3.patch
- 0008-Remove-data-from-catalog-headers_v3.patch.tar.gz
- 0009-Replace-oids-with-human-readable-names_v3.patch.tar.gz
- 0010-Type-aliases-for-oid-lookups_v3.patch
On Thu, Dec 14, 2017 at 05:59:12PM +0700, John Naylor wrote: > On 12/13/17, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 12/13/17 04:06, John Naylor wrote: > >> There doesn't seem to be any interest in bootstrap data at the moment, > >> but rather than give up just yet, I've added a couple features to make > >> a data migration more compelling: > > > > I took a brief look at your patches, and there appear to be a number of > > good cleanups in there at least. But could you please send patches in > > git format-patch format with commit messages, so we don't have to guess > > what each patch does? > > Thanks for taking a look and for pointing me to git format-patch. > That's much nicer than trying to keep emails straight. I've attached a > new patchset. Thanks for your hard work on this. It'll really make developing this part of the code a lot more pleasant. Unfortunately, it no longer applies to master. Can we get a rebase? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 12/20/17, David Fetter <david@fetter.org> wrote: > Thanks for your hard work on this. It'll really make developing this > part of the code a lot more pleasant. I hope so, thanks. > Unfortunately, it no longer applies to master. Can we get a rebase? Hmm, it still applied for me, except when I forgot to gunzip the larger patches. In any case, I've attached version 4 which contains some recent improvements. It was rebased against master as of 6719b238e8f0ea. If it doesn't apply for you please let me know the details. -- New in this patch set: -Remove code duplication and improve modularity in data parsing -Update README at appropriate points -Shift some hunks around to make patches more focused and readable -Comment and commit message editing -Patch 11 reduces indentation -Patch 12 moves all the toast/index commands into the individual catalog headers and simplifies some #includes (Note: I failed to shave the yak of selinux, so the #include changes for contrib/sepgsql are untested) At this point, I think it's no longer a WIP, and will only make further changes based on review or if I find a mistake. Left for future projects: -SQL generation for querying source bootstrap data -Generating pg_type #defines -John Naylor
Attachment
- v4-0001-Fix-typos-and-oversights-in-catalog-headers.patch
- v4-0002-Cleanup-distprep-scripts.patch
- v4-0003-Remove-hard-coded-schema-knowledge-about-pg_attri.patch
- v4-0004-Data-conversion-infrastructure.patch
- v4-0005-Mechanical-data-conversion.patch.tar.gz
- v4-0006-Hand-edits-of-data-files.patch
- v4-0007-Update-catalog-scripts-to-read-data-files.patch
- v4-0008-Remove-data-from-catalog-headers.patch.tar.gz
- v4-0009-Replace-oids-with-human-readable-names.patch.tar.gz
- v4-0010-Type-aliases-for-oid-lookups.patch
- v4-0011-Reduce-indentation-level.patch
- v4-0012-Move-toast-index-macros-to-the-pg_-catalog-header.patch
Pushed 0001 and 0002. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hmm, patch 0008 removes data lines from the .h but leaves the dependent OID define lines around: #define BTREE_AM_OID 403 This is not good, because then the define depends on data that appears in a distant file. Another consideration is that the current system has the property that these OIDs are discoverable by a hacker by navigating to the containing .h file; and a missing symbol is easily fixable if they need to hardcode the OID for which there isn't a symbol yet. Maybe a generated .h file containing defines for OIDs from all catalogs is okay. Of course, not all symbols are to be listed -- we should have a special marker in the data lines for those that are. Maybe something like this { oid => '403', descr => 'b-tree index access method', amname => 'btree', amhandler => 'bthandler', amtype => 'i', cpp_symbol => 'BTREE_AM_OID' }, (where 'cpp_symbol' would be skipped by genbki explicitly). Any better ideas? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Dec 21, 2017 at 5:32 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Hmm, patch 0008 removes data lines from the .h but leaves the dependent > OID define lines around: Just a question here -- do we actually have consensus on doing the stuff that these patches do? Because I'm not sure we do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 12/22/17, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Hmm, patch 0008 removes data lines from the .h but leaves the dependent > OID define lines around: > > #define BTREE_AM_OID 403 > > This is not good, because then the define depends on data that appears > in a distant file. I see what you mean. > Another consideration is that the current system has > the property that these OIDs are discoverable by a hacker by navigating > to the containing .h file; and a missing symbol is easily fixable if > they need to hardcode the OID for which there isn't a symbol yet. I'm not sure I follow you here. > Maybe a generated .h file containing defines for OIDs from all catalogs > is okay. Of course, not all symbols are to be listed -- we should have > a special marker in the data lines for those that are. Maybe something > like this > > { oid => '403', descr => 'b-tree index access method', > amname => 'btree', amhandler => 'bthandler', amtype => 'i', > cpp_symbol => 'BTREE_AM_OID' }, > > (where 'cpp_symbol' would be skipped by genbki explicitly). The last part makes sense and would be a fairly mechanical change. I'm not sure of the best way to include those generated symbols back in the code again, though. I think a single file might not be desirable because of namespace pollution. The alternative would be to have, say, pg_am.h include pg_am_sym.h. More complex but doable. Also, no need to skip non-data values explicitly. The code knows where to find the schema. :-) Thanks for pushing 1 and 2, BTW. -John Naylor
Robert Haas wrote: > On Thu, Dec 21, 2017 at 5:32 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > Hmm, patch 0008 removes data lines from the .h but leaves the dependent > > OID define lines around: > > Just a question here -- do we actually have consensus on doing the > stuff that these patches do? Because I'm not sure we do. My reading of the old threads (linked provided by John in his initial email in this thread) is that we have a consensus that we want to get rid of the old data representation, because it causes endless amount of pain. The proposed format seems to satisfy the constraints that we all discussed, namely 1. be easier to modify than the current format, 2. in particular, allow for default values on certain columns 3. not cause git merge problems because of too similar lines in every record 4. not require onerous Perl modules The one thing we seem to lack is a tool to edit the data files, as you suggested[1]. Stephen Frost mentioned[2] that we could do this by allowing the .data files be loaded in a database table, have the changes made via SQL, then have a way to create an updated .data file. Tom said[3] he didn't like that particular choice. So we already have Catalog.pm that (after these patches) knows how to load .data files; we could use that as a basis to enable easy oneliners to do whatever editing is needed. Also, the CPP symbols remaining in the pg_*.h that I commented yesterday was already mentioned[4] before. [1] https://www.postgresql.org/message-id/CA%2BTgmoa4%3D5oz7wSMcLNLh8h6cXzPoxxNJKckkdSQA%2BzpUG0%2B0A%40mail.gmail.com [2] https://www.postgresql.org/message-id/20150304150712.GV29780%40tamriel.snowman.net [3] https://www.postgresql.org/message-id/24766.1478821303%40sss.pgh.pa.us [4] https://www.postgresql.org/message-id/15697.1479161432@sss.pgh.pa.us -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I wrote: > On 12/22/17, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> Maybe a generated .h file containing defines for OIDs from all catalogs >> is okay. Of course, not all symbols are to be listed -- we should have >> a special marker in the data lines for those that are. Maybe something >> like this >> >> { oid => '403', descr => 'b-tree index access method', >> amname => 'btree', amhandler => 'bthandler', amtype => 'i', >> cpp_symbol => 'BTREE_AM_OID' }, >> >> (where 'cpp_symbol' would be skipped by genbki explicitly). > > The last part makes sense and would be a fairly mechanical change. I'm > not sure of the best way to include those generated symbols back in > the code again, though. I think a single file might not be desirable > because of namespace pollution. [snip] I've attached version 5 (rebased onto be2343221fb), which removes OID #define symbols from the headers and stores them with the records they refer to. I went ahead with your suggestion to try a single generated header. I believe there is no chance of namespace pollution since the symbols already have a nomenclature that reflects what catalog they belong to. This required some additional Makefile changes, since some code outside the backend needs certain OID symbols to be visible. There are probably bugs, but I wanted to share the initial design. The MSVC changes are untested. In addition, FindDefinedSymbol() now doesn't work for catalog headers, so I added a new function to search within the data. On the plus side, there is now a mechanism to generate pg_type OID symbols, and ECPG's knowledge of types is now maintained automatically. Since I had to rebase over recent additions to SP-GiST opclasses anyway, I restructured the patches to have a clean separation between data migration and data compaction. This makes the patches easier to follow. The pg_proc defaults have been tweaked slightly to match those suggested by Andrew Dunstan [1]. There are now human-readable entries for oprcom and oprnegate in pg_operator.dat. Finally, assorted cosmetic improvements and README/comment editing. [1] https://www.postgresql.org/message-id/b76d153a-33d7-7827-746c-1109f7bf529d%40dunslane.net -- -John Naylor
Attachment
- v5-0001-Remove-hard-coded-schema-knowledge-about-pg_attri.patch
- v5-0002-Data-conversion-infrastructure.patch
- v5-0003-Mechanical-data-conversion.patch.tar.gz
- v5-0004-Hand-edits-of-data-files.patch
- v5-0005-Update-catalog-scripts-to-read-data-files.patch
- v5-0006-Remove-data-from-catalog-headers.patch.tar.gz
- v5-0007-Remove-OID-define-symbols-from-catalog-headers.patch
- v5-0008-Implement-data-compaction-strategies.patch
- v5-0009-Data-file-compaction.patch.tar.gz
- v5-0010-Replace-OIDs-with-human-readable-names.patch.tar.gz
- v5-0011-Type-aliases-for-OID-lookups.patch
- v5-0012-Reduce-indentation-level.patch
- v5-0013-Move-toast-index-macros-to-the-pg_-catalog-header.patch
Pushed 0001 with some changes of my own. I'm afraid I created a few conflicts for the later patches in your series; please rebase. I don't think we introduced anything that would fail on old Perls, but let's see what buildfarm has to say. Others: Now is the time to raise concerns related to the proposed file formats and tooling, so please do have a look when you have a moment. At this stage, the proposed data format seems a good choice to me. Thanks -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > Others: Now is the time to raise concerns related to the proposed file > formats and tooling, so please do have a look when you have a moment. > At this stage, the proposed data format seems a good choice to me. It's not very clear to me what the proposed data format actually is, and I don't really want to read several hundred KB worth of patches in order to reverse-engineer that information. Nor do I see anything in the patch list that obviously looks like it updates doc/src/sgml/bki.sgml to explain things. So could we have an explanation of what it is we're agreeing to? regards, tom lane
On Fri, Jan 12, 2018 at 11:38:54AM -0500, Tom Lane wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > Others: Now is the time to raise concerns related to the proposed > > file formats and tooling, so please do have a look when you have a > > moment. At this stage, the proposed data format seems a good > > choice to me. > > It's not very clear to me what the proposed data format actually is, > and I don't really want to read several hundred KB worth of patches > in order to reverse-engineer that information. Nor do I see > anything in the patch list that obviously looks like it updates > doc/src/sgml/bki.sgml to explain things. > > So could we have an explanation of what it is we're agreeing to? That would be awesome. A walk-through example or two would also help. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Tom Lane wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > Others: Now is the time to raise concerns related to the proposed file > > formats and tooling, so please do have a look when you have a moment. > > At this stage, the proposed data format seems a good choice to me. > > It's not very clear to me what the proposed data format actually is, > and I don't really want to read several hundred KB worth of patches > in order to reverse-engineer that information. Nor do I see > anything in the patch list that obviously looks like it updates > doc/src/sgml/bki.sgml to explain things. > > So could we have an explanation of what it is we're agreeing to? Here's a small sample pg_proc entry: { oid => '2147', descr => 'number of input rows for which the input expression is not null', n => 'count', proisagg => 't', v => 'i', p => 's', rt => 'int8', at => 'any', s => 'aggregate_dummy' }, An pg_amop entry: { opf => 'btree/integer_ops', lt => 'int2', rt => 'int2', str => '1', oper => '<(int2,int2)', am => 'btree' }, Notes: 1. this is Perl data; it is read with 'eval' without any external modules. 2. the pg_proc entry has been compressed to two lines, to avoid content-free lines that would easily confuse git merge, but keep line length reasonable. 3. references to objects in other catalogs are by name, such as "int8" or "btree/integer_ops" rather than OID. 4. for each attribute, an abbreviation can be declared. In the pg_proc sample we have "n" which stands for proname, because we have this line: + NameData proname BKI_ABBREV(n); I think John has gone overboard with some of these choices, but we can argue the specific choices once we decide that abbreviation is a good idea. (Prior discussion seems to suggest we already agreed on that.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > Tom Lane wrote: >> So could we have an explanation of what it is we're agreeing to? > Here's a small sample pg_proc entry: > { oid => '2147', descr => 'number of input rows for which the input expression is not null', > n => 'count', proisagg => 't', v => 'i', p => 's', rt => 'int8', at => 'any', s => 'aggregate_dummy' }, > An pg_amop entry: > { opf => 'btree/integer_ops', lt => 'int2', rt => 'int2', str => '1', oper => '<(int2,int2)', am => 'btree' }, > Notes: > 1. this is Perl data; it is read with 'eval' without any external modules. Check. Where is it coming from --- I suppose we aren't going to try to store this in the existing .h files? What provisions will there be for comments? > 2. the pg_proc entry has been compressed to two lines, to avoid > content-free lines that would easily confuse git merge, but keep line > length reasonable. Seems like we would almost need a per-catalog convention on how to lay out the entries, or else we're going to end up (over time) with lots of cowboy coding leading to entries that look randomly different from the ones around them. > 3. references to objects in other catalogs are by name, such as "int8" > or "btree/integer_ops" rather than OID. +1 > 4. for each attribute, an abbreviation can be declared. In the > pg_proc sample we have "n" which stands for proname, because we have > this line: > + NameData proname BKI_ABBREV(n); I think single-letter abbreviations here are a pretty bad space vs readability tradeoff, particularly for wider catalogs where it risks ambiguity. The pg_amop sample you show looks noticeably more legible than the other one. Still, this is something we can debate on a case-by-case basis, it's not a defect in the mechanism. One other question is how we'll verify the conversion. Is there an expectation that the .bki file immediately after the conversion will be identical to immediately before? regards, tom lane
Tom, everyone, It's getting late in my timezone, but I wanted to give a few quick answers. I'll follow up tomorrow. Thanks Alvaro for committing my refactoring of pg_attribute data creation. I think your modifications are sensible and I'll rebase soon. On 1/13/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It's not very clear to me what the proposed data format actually is, > and I don't really want to read several hundred KB worth of patches > in order to reverse-engineer that information. Nor do I see > anything in the patch list that obviously looks like it updates > doc/src/sgml/bki.sgml to explain things. Alvaro gave a good overview, so I'll just point out a few things. -Patches 0002 through 0007 represent a complete one-to-one migration of data entries. I didn't see much in bki.sgml specific to the current format, so my documentation changes are confined largely to the README, in patch 0005. -Patches 0008 and 0009 implement techniques to make the data lines shorter. My choices are certainly debatable. There is a brief addition to the README in patch 0008. The abbreviation technique was only used in three catalogs to demonstrate. -Patches 0010 and 0011 implement human-readable OID references. -Patches 0012 and 0013 are cosmetic, but invasive. > Seems like we would almost need a per-catalog convention on how to lay out > the entries, or else we're going to end up (over time) with lots of cowboy > coding leading to entries that look randomly different from the ones > around them. If I understand your concern correctly, the convention is enforced by a script (rewrite_dat.pl). At the very least this would be done at the same time as pg_indent and perltidy. To be sure, because of default values many entries will look randomly different from the ones around them regardless. I have a draft patch to load the source data into tables for viewing, but it's difficult to rebase, so I thought I'd offer that enhancement later. > One other question is how we'll verify the conversion. Is there an > expectation that the .bki file immediately after the conversion will > be identical to immediately before? Not identical. First, as part of the base migration, I stripped almost all double quotes from the data entries since the new Perl hash values are already single-quoted. (The exception is macros expanded by initdb.c) I made genbki.pl add quotes on output to match what bootscanner.l expects. Where a simple rule made it possible, it also matches the original .bki. The new .bki will only diff where the current data has superfluous quotes. (ie. "0", "sql"). Second, if the optional cosmetic patch 0013 is applied, the individual index and toast commands will be in a different order. > Check. Where is it coming from --- I suppose we aren't going to try to > store this in the existing .h files? What provisions will there be for > comments? Yes, they're in ".dat" files. Perl comments (#) on their own line are supported. I migrated all existing comments from the header files as part of the conversion. This is scripted, so I can rebase over new catalog entries that get committed. > I think single-letter abbreviations here are a pretty bad space vs > readability tradeoff, particularly for wider catalogs where it risks > ambiguity. Ironically, I got that one from you [1] ;-), but if you have a different opinion upon seeing concrete, explicit examples, I think that's to be expected. -- Now is probably a good time to disclose concerns of my own: 1. MSVC dependency tracking is certainly broken until such time as I can shave that yak and test. 2. Keeping the oid symbols with the data entries required some Makefile trickery to make them visible to .c files outside the backend (patch 0007). It builds fine, but the dependency tracking might have bugs. -- [1] https://www.postgresql.org/message-id/15697.1479161432%40sss.pgh.pa.us Thanks, John Naylor
On 1/12/18 12:24, Alvaro Herrera wrote: > Here's a small sample pg_proc entry: > > { oid => '2147', descr => 'number of input rows for which the input expression is not null', > n => 'count', proisagg => 't', v => 'i', p => 's', rt => 'int8', at => 'any', s => 'aggregate_dummy' }, > > An pg_amop entry: > { opf => 'btree/integer_ops', lt => 'int2', rt => 'int2', str => '1', oper => '<(int2,int2)', am => 'btree' }, > > Notes: > 1. this is Perl data; it is read with 'eval' without any external modules. > 2. the pg_proc entry has been compressed to two lines, to avoid > content-free lines that would easily confuse git merge, but keep line > length reasonable. I don't think I like this. I know pg_proc.h is a pain to manage, but at least right now it's approachable programmatically. I recently proposed to patch to replace the columns proisagg and proiswindow with a combined column prokind. I could easily write a small Perl script to make that change in pg_proc.h, because the format is easy to parse and has one line per entry. With this new format, that approach would no longer work, and I don't know what would replace it. > 3. references to objects in other catalogs are by name, such as "int8" > or "btree/integer_ops" rather than OID. I think we could already do this by making more use of things like regtype and regproc. That should be an easy change to make. > 4. for each attribute, an abbreviation can be declared. In the > pg_proc sample we have "n" which stands for proname, because we have > this line: > + NameData proname BKI_ABBREV(n); I'm afraid a key value system would invite writing the attributes in random order and create a mess over time. But if we want to do it, I think we could also add it to the current BKI format. The same goes for defining default values for some columns. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jan 12, 2018 at 04:22:26PM -0500, Peter Eisentraut wrote: > On 1/12/18 12:24, Alvaro Herrera wrote: > > Here's a small sample pg_proc entry: > > > > { oid => '2147', descr => 'number of input rows for which the input expression is not null', > > n => 'count', proisagg => 't', v => 'i', p => 's', rt => 'int8', at => 'any', s => 'aggregate_dummy' }, > > > > An pg_amop entry: > > { opf => 'btree/integer_ops', lt => 'int2', rt => 'int2', str => '1', oper => '<(int2,int2)', am => 'btree' }, > > > > Notes: > > 1. this is Perl data; it is read with 'eval' without any external modules. > > 2. the pg_proc entry has been compressed to two lines, to avoid > > content-free lines that would easily confuse git merge, but keep line > > length reasonable. > > I don't think I like this. I know pg_proc.h is a pain to manage, > but at least right now it's approachable programmatically. I > recently proposed to patch to replace the columns proisagg and > proiswindow with a combined column prokind. I could easily write a > small Perl script to make that change in pg_proc.h, because the > format is easy to parse and has one line per entry. With this new > format, that approach would no longer work, and I don't know what > would replace it. How about ingesting with Perl, manipulating there, and spitting back out as Perl data structures? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Peter Eisentraut wrote: > On 1/12/18 12:24, Alvaro Herrera wrote: > > Here's a small sample pg_proc entry: > > > > { oid => '2147', descr => 'number of input rows for which the input expression is not null', > > n => 'count', proisagg => 't', v => 'i', p => 's', rt => 'int8', at => 'any', s => 'aggregate_dummy' }, > > > > An pg_amop entry: > > { opf => 'btree/integer_ops', lt => 'int2', rt => 'int2', str => '1', oper => '<(int2,int2)', am => 'btree' }, > > > > Notes: > > 1. this is Perl data; it is read with 'eval' without any external modules. > > 2. the pg_proc entry has been compressed to two lines, to avoid > > content-free lines that would easily confuse git merge, but keep line > > length reasonable. > > I don't think I like this. I know pg_proc.h is a pain to manage, but at > least right now it's approachable programmatically. I recently proposed > to patch to replace the columns proisagg and proiswindow with a combined > column prokind. I could easily write a small Perl script to make that > change in pg_proc.h, because the format is easy to parse and has one > line per entry. With this new format, that approach would no longer > work, and I don't know what would replace it. The idea in my mind is that you'd write a Perl program to do such changes, yeah. If the code we supply contains enough helpers and a few samples, it should be reasonably simple for people that don't do much Perl. The patch series does contain a few helper programs to write the data files. I haven't looked in detail what can they do and what they cannot. > > 3. references to objects in other catalogs are by name, such as "int8" > > or "btree/integer_ops" rather than OID. > > I think we could already do this by making more use of things like > regtype and regproc. That should be an easy change to make. Well, that assumes we *like* the current format, which I think is not a given ... more the opposite. > > 4. for each attribute, an abbreviation can be declared. In the > > pg_proc sample we have "n" which stands for proname, because we have > > this line: > > + NameData proname BKI_ABBREV(n); > > I'm afraid a key value system would invite writing the attributes in > random order and create a mess over time. Yeah, I share this concern. But you could fix it if the Perl tooling to write these files had a hardcoded list to work with. Maybe we could put it in a declaration of sorts at the start of each data file. > But if we want to do it, I think we could also add it to the current BKI > format. The same goes for defining default values for some columns. As above -- do we really like our current format so much that we're satisfied with minor tweaks? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-01-12 18:36:14 -0300, Alvaro Herrera wrote: > As above -- do we really like our current format so much that we're > satisfied with minor tweaks? A definite no from here. I think especially pg_proc desperately needs something key=value like to be understandable, and that very clearly seems to be something we can't do in the current format. - Andres
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > Peter Eisentraut wrote: >> I don't think I like this. I know pg_proc.h is a pain to manage, but at >> least right now it's approachable programmatically. I recently proposed >> to patch to replace the columns proisagg and proiswindow with a combined >> column prokind. I could easily write a small Perl script to make that >> change in pg_proc.h, because the format is easy to parse and has one >> line per entry. With this new format, that approach would no longer >> work, and I don't know what would replace it. > The idea in my mind is that you'd write a Perl program to do such > changes, yeah. If the code we supply contains enough helpers and a few > samples, it should be reasonably simple for people that don't do much > Perl. It would be good to see a sample --- for a concrete example, how about creating a Perl script to do the conversion Peter mentions? >>> 3. references to objects in other catalogs are by name, such as "int8" >>> or "btree/integer_ops" rather than OID. >> I think we could already do this by making more use of things like >> regtype and regproc. That should be an easy change to make. > Well, that assumes we *like* the current format, which I think is not a > given ... more the opposite. Note that we *can't* easily improve that given the current tooling, mainly because the bootstrap-time capabilities of regproc_in et al are so limited. We don't even have regxxx types for many of the other cross-reference columns like opclass references, and I don't think I want to build them because they'd also have bootstrapping issues. According to my understanding, part of what's going on here is that we're going to teach genbki.pl to parse these object references and convert them to hard-coded OIDs in the emitted BKI file. That seems good to me, but one thing we're going to need is a spec for how genbki.pl knows what to do. >> I'm afraid a key value system would invite writing the attributes in >> random order and create a mess over time. > Yeah, I share this concern. But you could fix it if the Perl tooling to > write these files had a hardcoded list to work with. Maybe we could put > it in a declaration of sorts at the start of each data file. This is more or less the same concern I stated upthread. But the impression I'm getting is that we expect these files to often be written out from a Perl script, so it's mostly a question of how we teach the Perl scripts to emit stylistically consistent data. Then we can use the Perl scripts as a kind of pgindent for this data. >> But if we want to do it, I think we could also add it to the current BKI >> format. The same goes for defining default values for some columns. > As above -- do we really like our current format so much that we're > satisfied with minor tweaks? I'm sure not. This will be a big change, without a doubt, but I think we'll end up in a better place. regards, tom lane
On 1/13/18, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > I'm afraid a key value system would invite writing the attributes in > random order and create a mess over time. A developer can certainly write them in random order, and it will still work. However, in patch 0002 I have a script to enforce a standard appearance. Of course, for it to work, you have to run it. I describe it, if rather tersely, in the README changes in patch 0008. Since several people have raised this concern, I will go into a bit more depth here. Perhaps I should reuse some of this language for the README to improve it. src/include/catalog/rewrite_dat.pl knows where to find the schema of each catalog, namely the pg_*.h header, accessed via ParseHeader() in Catalog.pm. It writes key/value pairs in the order found in the schema: { key_1 => 'value_1', key_2 => 'value_2', ..., key_n => 'value_n' } The script also has an array of four hard-coded metadata fields: oid, oid_symbol, descr, and shdescr. If any of these are present, they will go on their own line first, in the order given: { oid => 9999, oid_symbol => 'FOO_OID', descr => 'comment on foo', key_1 => 'value_1', key_2 => 'value_2', ..., key_n => 'value_n' } > I don't think I like this. I know pg_proc.h is a pain to manage, but at > least right now it's approachable programmatically. I recently proposed > to patch to replace the columns proisagg and proiswindow with a combined > column prokind. I could easily write a small Perl script to make that > change in pg_proc.h, because the format is easy to parse and has one > line per entry. With this new format, that approach would no longer > work, and I don't know what would replace it. I've attached four diffs/patches to walk through how you would replace the columns proisagg and proiswindow with a combined column prokind. Patch 01: Add new prokind column to pg_proc.h, with a default of 'n'. In many cases, this is all you would have to do, as far as bootstrapping is concerned. Diff 02: This is a one-off script diffed against rewrite_dat.pl. In rewrite_dat.pl, I have a section with this comment, and this is where I put the one-off code: # Note: This is also a convenient place to do one-off # bulk-editing. (I haven't documented this with explicit examples, so I'll have to remedy that) You would run it like this: cd src/include/catalog perl -I ../../backend/catalog/ rewrite_dat_with_prokind.pl pg_proc.dat While reading pg_proc.dat, the default value for prokind is added automatically. We inspect proisagg and proiswindow, and change prokind accordingly. pg_proc.dat now has all three columns, prokind, proisagg, and proiswindow. Patch 03: Remove old columns from pg_proc.h Now we run the standard rewrite: perl -I ../../backend/catalog/ rewrite_dat.pl pg_proc.dat Any values not found in the schema will simply not be written to pg_proc.dat, so the old columns are now gone. The result is found in patch 04. -- Note: You could theoretically also load the source data into tables, do the updates with SQL, and dump back out again. I made some progress with this method, but it's not complete. I think the load and dump steps add too much complexity for most use cases, but it's a possibility. -John Naylor
Attachment
On 1/13/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > According to my understanding, part of what's going on here is that > we're going to teach genbki.pl to parse these object references and > convert them to hard-coded OIDs in the emitted BKI file. That seems > good to me, but one thing we're going to need is a spec for how > genbki.pl knows what to do. I don't know if it qualifies as a spec, but here's my implementation: Use dummy type aliases in the header files: regam, regoper, regopf, and regtype These are #defined away in genbki.h: +/* ---------------- + * Some columns of type Oid have human-readable entries that are + * resolved when creating postgres.bki. + * ---------------- + */ +#define regam Oid +#define regoper Oid +#define regopf Oid +#define regtype Oid Likewise, in genbki.pl (and I just noticed a typo, s/names/types/): +# We use OID aliases to indicate when to do OID lookups, so column names +# have to be turned back into 'oid' before writing the CREATE command. +my %RENAME_REGOID = ( + regam => 'oid', + regoper => 'oid', + regopf => 'oid', + regtype => 'oid'); + When genbki.pl sees one of these type aliases, it consults the appropriate lookup table, exactly how we do now for regproc. One possibly dubious design point is that I declined to teach the pg_attribute logic about this, so doing lookups in tables with schema macros has to be done explicitly. There is only one case of this right now, and I noted the tradeoff: + # prorettype + # Note: We could handle this automatically by using the + # 'regtype' alias, but then we would have to teach + # morph_row_for_pgattr() to change the attribute type back to + # oid. Since we have to treat pg_proc differently anyway, + # just do the type lookup manually here. + my $rettypeoid = $regtypeoids{ $bki_values{prorettype}}; + $bki_values{prorettype} = $rettypeoid + if defined($rettypeoid); This is all in patch 0011. -John Naylor
On 1/12/18, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > Pushed 0001 with some changes of my own. I'm afraid I created a few > conflicts for the later patches in your series; please rebase. Attached, rebased over 255f14183ac. This time it's a single .tar.gz. Let me know if single files are better. Here's the renumbered overview copied from one of my emails responding to Tom: -Patches 0001 through 0006 represent a complete one-to-one migration of data entries. I didn't see much in bki.sgml specific to the current format, so my documentation changes are confined largely to the README, in patch 0004. -Patches 0007 and 0008 implement techniques to make the data lines shorter. My choices are certainly debatable. There is a brief addition to the README in patch 0007. The abbreviation technique was only used in three catalogs to demonstrate. -Patches 0009 and 0010 implement human-readable OID references. -Patches 0011 and 0012 are cosmetic, but invasive. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > On 1/12/18, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: >> Pushed 0001 with some changes of my own. I'm afraid I created a few >> conflicts for the later patches in your series; please rebase. > Attached, rebased over 255f14183ac. I decided that I wasn't going to get answers about the things I cared about without looking through the patchset, so I've now done so, in a once-over-lightly fashion. Here's a quick summary of what it does, for those who may be equally confused, and then some comments (not really a review). The patch removes DATA and (SH)DESCR lines from all the catalog/pg_*.h files, as well as the #defines for OID-value macros, and puts that information into pg_*.dat files corresponding to the .h files, in a form that is easily readable and writable by Perl scripts. Comments associated with this info are also transferred to the .dat files, and the scripts that can rewrite the .dat files are able to preserve the comments. genbki.pl is able to generate postgres.bki and other initdb input files directly from the .dat files. It also creates a single header file src/include/catalog/oid_symbols.h that contains all of the OID-value macros that were formerly in the pg_*.h files. The patch gets rid of the need to write hard-wired OIDs when referencing operators, opfamilies, etc in the .dat files; now you can write their names instead. genbki.pl will look up the names and substitute numeric OIDs in the emitted postgres.bki file. There are also provisions to shorten the .dat files through the use of abbreviated field names, default values for fields, and some other less-general techniques. -- OK, now some comments: I'm not sure about the decision to move all the OID macros into one file; that seems like namespace pollution. It's especially odd that you did that but did not consolidate fmgroids.h with the macros for symbols from other catalogs. Now it's true that we need all those symbols to be distinct, since it needs to be possible for .c files to include any combination of pg_*.h headers, but I don't think it's an especially good idea that you have to include all of them or none. Even if we're willing to put up with that namespace pollution for backend code, there are clients that currently include pg_*.h headers to get some of those macros, and they're likely to be less happy about it. The design I'd kind of imagined was one generated file of #define's per pg_*.h file, not just one giant one. It would be really nice, also, if the attribute number macros (Natts_pg_proc, Anum_pg_proc_proname, etc) could be autogenerated. Manually renumbering those is one of the bigger pains in the rear when adding catalog columns. It was less of a pain than adjusting the DATA lines of course, so I never figured it was worth doing something about in isolation --- but with this infrastructure in place, that's manual work we shouldn't have to do anymore. Another thing that I'd sort of hoped might happen from this patchset is to cure the problem of keeping some catalog headers safe for client-side inclusion, because some clients want the OID value macros and/or macros for column values (eg PROVOLATILE_IMMUTABLE), so they currently have to #include those headers or else hard-code the values. We've worked around that to date with ad-hoc solutions like splitting function declarations out to pg_*_fn.h files, but I never liked that much. With the OID value macros being moved out to separate generated file(s), there's now a possibility that we could fix this once and for all by making client-side code include those file(s) not pg_type.h et al themselves. But we'd need a way to put the column-value macros into those files too; maybe that's too messy to make it practical. The .dat files need to have header comments that follow project conventions, in particular they need to contain copyright statements. Likewise for generated files. I've got zero faith that the .h files will hold still long enough for these patches to be reviewed and applied. The ones that touch significant amounts of data need to be explained as "run this script on the current data", rather than presented as static diffs. I'm not really thrilled by the single-purpose "magic" behaviors added in 0007, such as computing prosrc from proname. I think those will add more confusion than they're worth. In 0010, you relabel the types of some OID columns so that genbki.pl will know which lookup to apply to them. That's not such a problem for the relabelings that are just macros and genbki.pl converts back to type OID in the .bki file. But you also did things like s/Oid/regtype/, and that IS a problem because it will affect what client code sees in those catalog columns. We've discussed changing those columns to regfoo types in the past, and decided not to, because of the likelihood of breaking client queries. I do not think this patch gets to change that policy. So the way to identify the lookup rule needs to be independent of whether the column is declared as Oid or an Oid alias type. Perhaps an explicit marker telling what transformation to make, like Oid rngsubtype BKI_LOOKUP(pg_type); would work for that. I'm not really on board at all with 0012, which AFAICS moves the indexing and toast-table information out of indexing.h and toasting.h for no good reason whatever. We'll have quite enough code thrash and pending-patch breakage from this patch set; we don't need to take on rearrangements that aren't buying anything. regards, tom lane
I'm 1000% on board with replacing oid constants with symbolic names that get substituted programmatically. However I wonder why we're bothering inventing a new syntax that doesn't actually do much more than present static tabular data. If things like magic proname->prosrc behaviour are not valuable then we're not getting much out of this perl-friendly syntax that a simpler more standard format wouldn't get us. So just as a straw man proposal.... What if we just replaced the data file with a csv file that could be maintained in a spreadsheet. It could easily be parsed by perl and we could even have perl scripts that load the records into memory and modify them. You could even imagine writing a postgres script that loaded the csv file into a temporary table, did complex SQL updates or other DML, then wrote it back out to a csv file.
Greg Stark <stark@mit.edu> writes: > I'm 1000% on board with replacing oid constants with symbolic names > that get substituted programmatically. Yeah, that's almost an independent feature --- we could do that without any of this other stuff, if we wanted. > However I wonder why we're bothering inventing a new syntax that > doesn't actually do much more than present static tabular data. If > things like magic proname->prosrc behaviour are not valuable then > we're not getting much out of this perl-friendly syntax that a simpler > more standard format wouldn't get us. TBH, the thing that was really drawing my ire about that was that John was inventing random special rules and documenting them *noplace* except for the guts of some perl code. If I have to read perl code to find out what the catalog data means, I'm going to be bitching loudly. That could be done better --- one obvious idea is to add a comment to the relevant .h file, next to the field whose value will be implicitly calculated. > So just as a straw man proposal.... What if we just replaced the data > file with a csv file that could be maintained in a spreadsheet. Bleah --- that's no better than what we have today, just different. And "maintained in a spreadsheet" doesn't sound attractive to me; you'd almost certainly lose comments, for instance. regards, tom lane
I wrote: > Another thing that I'd sort of hoped might happen from this patchset > is to cure the problem of keeping some catalog headers safe for > client-side inclusion, because some clients want the OID value macros > and/or macros for column values (eg PROVOLATILE_IMMUTABLE), so they > currently have to #include those headers or else hard-code the values. > We've worked around that to date with ad-hoc solutions like splitting > function declarations out to pg_*_fn.h files, but I never liked that > much. With the OID value macros being moved out to separate generated > file(s), there's now a possibility that we could fix this once and for all > by making client-side code include those file(s) not pg_type.h et al > themselves. But we'd need a way to put the column-value macros into > those files too; maybe that's too messy to make it practical. I had a thought about how to do that. It's clearly desirable that that sort of material remain in the manually-maintained pg_*.h files, because that's basically where you look to find out C-level details of what's in a particular catalog. However, that doesn't mean that that's where the compiler has to find it. Imagine that we write such sections of the catalog .h files like #ifdef EXPOSE_TO_CLIENT_CODE /* * ... comment here ... */ #define PROVOLATILE_IMMUTABLE 'i' /* never changes for given input */ #define PROVOLATILE_STABLE 's' /* does not change within a scan */ #define PROVOLATILE_VOLATILE 'v' /* can change even within a scan */ #endif /* EXPOSE_TO_CLIENT_CODE */ Like CATALOG_VARLEN, the symbol EXPOSE_TO_CLIENT_CODE is never actually defined to the compiler. What it does is to instruct genbki.pl to copy the material up to the matching #endif into the generated file for this catalog. So, for each catalog header pg_foo.h, there would be a generated file, say pg_foo_d.h, containing: * Natts_ and Anum_ macros for pg_foo * Any EXPOSE_TO_CLIENT_CODE sections copied from pg_foo.h * Any OID-value macros for entries in that catalog pg_foo.h would contain a #include for pg_foo_d.h, so that backend-side code would obtain all these values the same as it did before. But the new policy for client code would be to include pg_foo_d.h *not* pg_foo.h, and so we are freed of any worry about whether pg_foo.h has to be clean for clients to include. We could re-merge the various pg_foo_fn.h files back into the main files, if we wanted. The contents of EXPOSE_TO_CLIENT_CODE sections wouldn't necessarily have to be just macros --- they could be anything that's safe and useful for client code. But that's probably the main usage. regards, tom lane
On 1/14/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I'm not sure about the decision to move all the OID macros into > one file; that seems like namespace pollution. <snip> > The design I'd kind of imagined was one generated file of #define's > per pg_*.h file, not just one giant one. First, thanks for the comments. I will start incorporating them in a few days to give others the chance to offer theirs. I'm inclined to agree about namespace pollution. One stumbling block is the makefile changes to allow OID symbols to be visible to non-backend code. Assuming I took the correct approach for the single file case, adapting that to multiple files would require some rethinking. > It's especially > odd that you did that but did not consolidate fmgroids.h with > the macros for symbols from other catalogs. Point taken. > It would be really nice, also, if the attribute number macros > (Natts_pg_proc, Anum_pg_proc_proname, etc) could be autogenerated. > Manually renumbering those is one of the bigger pains in the rear > when adding catalog columns. It was less of a pain than adjusting > the DATA lines of course, so I never figured it was worth doing > something about in isolation --- but with this infrastructure in > place, that's manual work we shouldn't have to do anymore. Searching the archives for discussion of Anum_* constants [1], I prefer your one-time suggestion to use enums instead. I'd do that, and then have Catalog.pm throw an error if Natts_* doesn't match the number of columns. That's outside the scope of this patch, however. > Another thing that I'd sort of hoped might happen from this patchset > is to cure the problem of keeping some catalog headers safe for > client-side inclusion, because some clients want the OID value macros > and/or macros for column values (eg PROVOLATILE_IMMUTABLE), so they > currently have to #include those headers or else hard-code the values. > We've worked around that to date with ad-hoc solutions like splitting > function declarations out to pg_*_fn.h files, but I never liked that > much. With the OID value macros being moved out to separate generated > file(s), there's now a possibility that we could fix this once and for all > by making client-side code include those file(s) not pg_type.h et al > themselves. But we'd need a way to put the column-value macros into > those files too; maybe that's too messy to make it practical. To make sure I understand you correctly: Currently we have pg_type.h oid symbols, column symbols pg_type_fn.h function decls And assuming we go with one generated oid symbol file per header, my patch would end up with something like pg_type.h column symbols (#includes pg_type_sym.h) pg_type_fn.h function decls pg_type_sym.h oid symbols (generated) And you're saying you'd prefer pg_type.h function decls (#includes pg_type_sym.h) pg_type_sym.h oid symbols, column symbols (generated) I agree that it'd be messy to drive the generation of the column symbols. I'll think about it. What about pg_type.h function decls (#includes pg_type_sym.h) pg_type_sym.h column symbols (static, #includes pg_type_oids.h) pg_type_oids.h oid symbols (generated) It's complicated, but arguably no more so than finding someplace more distant to stick the column symbols and writing code to copy them. It'd be about than 20 *_sym.h headers and 10 *_oids.h headers. > The .dat files need to have header comments that follow project > conventions, in particular they need to contain copyright statements. > Likewise for generated files. Okay. > I've got zero faith that the .h files will hold still long enough > for these patches to be reviewed and applied. The ones that touch > significant amounts of data need to be explained as "run this script > on the current data", rather than presented as static diffs. I've already rebased over a catalog change and it was not much fun, so I'd be happy to do it this way. > I'm not really thrilled by the single-purpose "magic" behaviors added > in 0007, such as computing prosrc from proname. I think those will > add more confusion than they're worth. Okay. I still think generating pg_type OID symbols is worth doing, but I no longer think this is a good place to do it. > In 0010, you relabel the types of some OID columns so that genbki.pl > will know which lookup to apply to them. That's not such a problem for > the relabelings that are just macros and genbki.pl converts back to > type OID in the .bki file. But you also did things like s/Oid/regtype/, > and that IS a problem because it will affect what client code sees in > those catalog columns. We've discussed changing those columns to > regfoo types in the past, and decided not to, because of the likelihood > of breaking client queries. I do not think this patch gets to change > that policy. So the way to identify the lookup rule needs to be > independent of whether the column is declared as Oid or an Oid alias type. > Perhaps an explicit marker telling what transformation to make, like > > Oid rngsubtype BKI_LOOKUP(pg_type); > > would work for that. Okay. I fail to see how client queries are affected, since I change everything back to oid, but I think your design is cleaner anyway. > I'm not really on board at all with 0012, which AFAICS moves the indexing > and toast-table information out of indexing.h and toasting.h for no good > reason whatever. We'll have quite enough code thrash and pending-patch > breakage from this patch set; we don't need to take on rearrangements that > aren't buying anything. I don't have a convincing rebuttal, so I'll withdraw it. -- [1] https://www.postgresql.org/message-id/25254.1248533810%40sss.pgh.pa.us -John Naylor
Tom Lane wrote: > I had a thought about how to do that. It's clearly desirable that that > sort of material remain in the manually-maintained pg_*.h files, because > that's basically where you look to find out C-level details of what's > in a particular catalog. However, that doesn't mean that that's where > the compiler has to find it. > > [ elided explanation of pg_foo_d.h and pg_foo.h ] Sounds good to me. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 1/14/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > So, for each catalog header pg_foo.h, there would be a > generated file, say pg_foo_d.h, containing: > > * Natts_ and Anum_ macros for pg_foo > > * Any EXPOSE_TO_CLIENT_CODE sections copied from pg_foo.h > > * Any OID-value macros for entries in that catalog I'm on board in principle, but I have some questions: How do we have the makefiles gracefully handle 62 generated headers which need to be visible outside the backend? Can I generalize the approach I took for the single OIDs file I had, or is that not even the right way to go? (In short, I used a new backend make target that was invoked in src/common/Makefile - the details are in patch v6-0006) If we move fmgr oid generation here as you suggested earlier, I imagine we don't want to create a lot of #include churn. My idea is to turn src/include/utils/fmgroids.h into a static file that just #includes catalog/pg_proc_d.h. Thoughts? And I'm curious, what is "_d" intended to convey? (While I'm thinking outloud, I'm beginning to think that these headers lie outside the scope of genbki.pl, and belong in a separate script.) -John Naylor
John Naylor <jcnaylor@gmail.com> writes: > On 1/14/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> So, for each catalog header pg_foo.h, there would be a >> generated file, say pg_foo_d.h, containing: >> * Natts_ and Anum_ macros for pg_foo >> * Any EXPOSE_TO_CLIENT_CODE sections copied from pg_foo.h >> * Any OID-value macros for entries in that catalog > I'm on board in principle, but I have some questions: > How do we have the makefiles gracefully handle 62 generated headers > which need to be visible outside the backend? There are other people around here who are better make wizards than I, but I'm sure this is soluble. A substitution like $(CATALOG_HEADERS:_d.h=.h) might get you started. (It looks like CATALOG_HEADERS would have to be separated out of POSTGRES_BKI_SRCS, but that's easy.) > If we move fmgr oid generation here as you suggested earlier, I > imagine we don't want to create a lot of #include churn. My idea is to > turn src/include/utils/fmgroids.h into a static file that just > #includes catalog/pg_proc_d.h. Thoughts? Yeah ... or vice versa. I don't know if touching the way fmgroids.h is built is worthwhile. Certainly, if we'd built all this to begin with we'd have unified pg_proc.h's OID macro handling with the other catalogs, but we didn't and that might not be worth changing. I'm not strongly convinced either way. > And I'm curious, what is "_d" intended to convey? I was thinking "#define" or "data". You could make as good a case for "_g" for "generated", or probably some other choices. I don't have a strong preference; but I didn't especially like your original suggestion of "_sym", because that seemed like it would invite confusion with possible actual names for catalogs. A one-letter suffix seems less likely to conflict with anything anybody would think was a good choice of catalog name. > (While I'm thinking outloud, I'm beginning to think that these headers > lie outside the scope of genbki.pl, and belong in a separate script.) Maybe, but the conditions for regenerating these files would be exactly the same as for the .bki file, no? So we might as well just have one script do both, rather than writing duplicative rules in the Makefiles and the MSVC scripts. regards, tom lane
On 1/14/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I've got zero faith that the .h files will hold still long enough > for these patches to be reviewed and applied. The ones that touch > significant amounts of data need to be explained as "run this script > on the current data", rather than presented as static diffs. For version 7, I've attached a bash script along with the patches (against master a044378ce2f) that does exactly this. To run, one would save the patches somewhere and change the PATCH_DIR and REPO_DIR variables to match. -I've added MSVC changes, but they are untested. -I've moved a cosmetic patch up in the series to reduce rebasing effort. There are some additional comment and style changes. Not done in this version: -For the time being I've left out human-readable OIDs and data file compaction strategies. This is just to reduce effort in rebasing. I'll add some form of those back after the basics have had serious review. -Change who is responsible for fmgroids.h. It's debatable whether that would be a gain anyway. The README might need to be fleshed out further, possibly with a separate README for working with the new data format. > So, for each catalog header pg_foo.h, there would be a > generated file, say pg_foo_d.h, containing: > > * Natts_ and Anum_ macros for pg_foo > > * Any EXPOSE_TO_CLIENT_CODE sections copied from pg_foo.h > > * Any OID-value macros for entries in that catalog This is done (patch 0006). As I mentioned earlier, the sticking point is the makefiles. I have a working build, but it's not up to project standards. In particular, for the first attempt I've resorted to discarding conventions for parallel make safety, so if anyone can review and offer improvements, I'd be grateful. > The .dat files need to have header comments that follow project > conventions, in particular they need to contain copyright statements. > Likewise for generated files. Done. I'll also go ahead and move this to next commitfest. -John Naylor
Attachment
- v7-apply-bootstrap-data-patches.sh
- v7-0001-Create-data-conversion-infrastructure.patch
- v7-0002-Hand-edits-of-data-files.patch
- v7-0003-Reduce-indentation-level.patch
- v7-0004-Update-catalog-scripts-to-read-data-files.patch
- v7-0005-Clean-up-header-files-and-update-comments.patch
- v7-0006-Move-symbols-from-catalog-headers-to-generated-de.patch
Version 8, rebased against 76b6aa41f41d. -John Naylor
Attachment
- v8-apply-bootstrap-data-patches.sh
- v8-0001-Create-data-conversion-infrastructure.patch
- v8-0002-Hand-edits-of-data-files.patch
- v8-0003-Reduce-indentation-level.patch
- v8-0004-Update-catalog-scripts-to-read-data-files.patch
- v8-0005-Clean-up-header-files-and-update-comments.patch
- v8-0006-Move-symbols-from-catalog-headers-to-generated-de.patch
John Naylor <jcnaylor@gmail.com> writes: > Version 8, rebased against 76b6aa41f41d. I took a preliminary look through this, without yet attempting to execute the script against HEAD. I have a few thoughts: * I'm inclined not to commit the conversion scripts to the repo. I doubt there are third parties out there with a need for them, and if they do need them they can get 'em out of this thread in the mailing list archives. (If anyone has a different opinion about that, speak up!) * I notice you have a few "preliminary cleanup" changes here and there in the series, such as fixing the inconsistent spelling of Anum_pg_init_privs_initprivs. These could be applied before we start the main conversion process, and I'm inclined to do that since we could get 'em out of the way early. Ideally, the main conversion would only touch the header files and related scripts/makefiles. * I'm a little disturbed by the fact that 0002 has to "re-doublequote values that are macros expanded by initdb.c". I see that there are only a small number of affected places, so maybe it's not really worth working harder, but I worry that something might get missed. Is there any way to include this consideration in the automated conversion, or at least to verify that we found all the places to quote? Or, seeing that 0004 seems to be introducing some quoting-related hacks to genbki.pl anyway, maybe we could take care of the issue there? * In 0003, I'd recommend leaving the re-indentation to happen in the next perltidy run (assuming perltidy would fix that, which I think is true but I might be wrong). It's just creating more review work to do it here. In any case, the patch summary line is pretty misleading since it's *not* just reindenting, but also refactoring genbki.pl. (BTW, if that refactoring would work on the script as it is, maybe that's another thing we could do early? The more we can do before "flag day", the better IMO.) * In 0006, I'm not very pleased with the introduction of "Makefile.headers". I'd keep those macros where they are in catalog/Makefile. backend/Makefile doesn't need to know about that, especially since it's doing an unconditional invocation of catalog/Makefile anyway. It could just do something like submake-schemapg: $(MAKE) -C catalog generated-headers and leave it to catalog/Makefile to know what needs to happen for both schemapg.h and the other generated files. Overall, though, this is looking pretty promising. regards, tom lane
Thanks for taking a look. On 3/3/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > John Naylor <jcnaylor@gmail.com> writes: >> Version 8, rebased against 76b6aa41f41d. > > I took a preliminary look through this, without yet attempting to execute > the script against HEAD. I have a few thoughts: > > * I'm inclined not to commit the conversion scripts to the repo. I doubt > there are third parties out there with a need for them, and if they do > need them they can get 'em out of this thread in the mailing list > archives. (If anyone has a different opinion about that, speak up!) If no one feels strongly otherwise, I'll just attach the conversion script along with the other patches for next version. To be clear, the rewrite script is intended be committed, both to enforce formatting and as a springboard for bulk editing. Now, whether that belongs in /src/include/catalog or /src/tools is debatable. > * I'm a little disturbed by the fact that 0002 has to "re-doublequote > values that are macros expanded by initdb.c". I see that there are only > a small number of affected places, so maybe it's not really worth working > harder, but I worry that something might get missed. Is there any way to > include this consideration in the automated conversion, or at least to > verify that we found all the places to quote? Or, seeing that 0004 seems > to be introducing some quoting-related hacks to genbki.pl anyway, maybe > we could take care of the issue there? The quoting hacks are really to keep the postgres.bki diff as small as possible (attached). The simplest and most air-tight way to address your concern would be to double-quote everything when writing the bki file. That could be done last as a follow-up. > * I notice you have a few "preliminary cleanup" changes here and there > in the series, such as fixing the inconsistent spelling of > Anum_pg_init_privs_initprivs. These could be applied before we start > the main conversion process, and I'm inclined to do that since we could > get 'em out of the way early. Ideally, the main conversion would only > touch the header files and related scripts/makefiles. ... > * In 0003, I'd recommend leaving the re-indentation to happen in the next > perltidy run (assuming perltidy would fix that, which I think is true but > I might be wrong). It's just creating more review work to do it here. > In any case, the patch summary line is pretty misleading since it's > *not* just reindenting, but also refactoring genbki.pl. (BTW, if that > refactoring would work on the script as it is, maybe that's another > thing we could do early? The more we can do before "flag day", the > better IMO.) I tested perltidy 20090616 and it handles it fine. I'll submit a preliminary patch soon to get some of those items out of the way. > * In 0006, I'm not very pleased with the introduction of > "Makefile.headers". I'd keep those macros where they are in > catalog/Makefile. backend/Makefile doesn't need to know about that, > especially since it's doing an unconditional invocation of > catalog/Makefile anyway. It could just do something like > > submake-schemapg: > $(MAKE) -C catalog generated-headers > > and leave it to catalog/Makefile to know what needs to happen for > both schemapg.h and the other generated files. I wasn't happy with it either, but I couldn't get it to build otherwise. The sticking point was the symlinks in $(builddir)/src/include/catalog. $(MAKE) -C catalog doesn't handle that. The makefile in /src/common relies on the backend makefile to know what to invoke for a given header. IIRC, relpath.c includes pg_tablespace.h, which now requires pg_tablespace_d.h to be built. Perhaps /src/common/Makefile could invoke the catalog makefile directly, and the pg_*_d.h headers could be written to $(builddir)/src/include/catalog directly? I'll hack on it some more. > Overall, though, this is looking pretty promising. > > regards, tom lane > Glad to hear it. -John Naylor
Attachment
I wrote: > I'll submit a > preliminary patch soon to get some of those items out of the way. I've attached a patch that takes care of these cleanups so they don't clutter the patch set. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > On 3/3/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> * I'm a little disturbed by the fact that 0002 has to "re-doublequote >> values that are macros expanded by initdb.c". I see that there are only >> a small number of affected places, so maybe it's not really worth working >> harder, but I worry that something might get missed. Is there any way to >> include this consideration in the automated conversion, or at least to >> verify that we found all the places to quote? Or, seeing that 0004 seems >> to be introducing some quoting-related hacks to genbki.pl anyway, maybe >> we could take care of the issue there? > The quoting hacks are really to keep the postgres.bki diff as small as > possible (attached). The simplest and most air-tight way to address > your concern would be to double-quote everything when writing the bki > file. That could be done last as a follow-up. Oh, if you're cross-checking by diff'ing the produced .bki file, then that's sufficient to address my concern here. No need to do more. regards, tom lane
John Naylor <jcnaylor@gmail.com> writes: > I've attached a patch that takes care of these cleanups so they don't > clutter the patch set. Pushed. I made a couple of cosmetic changes in genbki.pl. regards, tom lane
John Naylor <jcnaylor@gmail.com> writes: > On 3/3/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> * In 0006, I'm not very pleased with the introduction of >> "Makefile.headers". > I wasn't happy with it either, but I couldn't get it to build > otherwise. The sticking point was the symlinks in > $(builddir)/src/include/catalog. $(MAKE) -C catalog doesn't handle > that. The makefile in /src/common relies on the backend makefile to > know what to invoke for a given header. IIRC, relpath.c includes > pg_tablespace.h, which now requires pg_tablespace_d.h to be built. I'm not following. AFAICS, what you put in src/common/Makefile was just +.PHONY: generated-headers + +generated-headers: + $(MAKE) -C ../backend generated-headers which doesn't appear to care whether backend/Makefile knows anything about specific generated headers or not. I think all we need to do is consider that the *_d.h files ought to be built as another consequence of invoking the generated-headers target. BTW, there's already a submake-generated-headers target in Makefile.global, which you should use in preference to rolling your own. regards, tom lane
On 3/4/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > John Naylor <jcnaylor@gmail.com> writes: >> On 3/3/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> * In 0006, I'm not very pleased with the introduction of >>> "Makefile.headers". > >> I wasn't happy with it either, but I couldn't get it to build >> otherwise. The sticking point was the symlinks in >> $(builddir)/src/include/catalog. $(MAKE) -C catalog doesn't handle >> that. The makefile in /src/common relies on the backend makefile to >> know what to invoke for a given header. IIRC, relpath.c includes >> pg_tablespace.h, which now requires pg_tablespace_d.h to be built. > > I'm not following. AFAICS, what you put in src/common/Makefile was just > > +.PHONY: generated-headers > + > +generated-headers: > + $(MAKE) -C ../backend generated-headers > > which doesn't appear to care whether backend/Makefile knows anything > about specific generated headers or not. I think all we need to do > is consider that the *_d.h files ought to be built as another consequence > of invoking the generated-headers target. > > BTW, there's already a submake-generated-headers target in > Makefile.global, which you should use in preference to rolling your own. I've attached version 9, whose biggest change is to address the above points of review. I pushed all of the catalog header build logic into catalog Makefile to avoid creating a separate symbol file. This involved putting the distprep logic there as well. Enough of the structure changed that one or two names didn't make sense anymore, so I changed them. As suggested, the conversion script is now part of the patchset and not committed to the repo. To run the conversion, save everything to a directory and update the dir vars at the top of apply-bootstrap-data-patches.sh accordingly. A couple things to note that I didn't do: -With all the new generated headers, the message "Writing ..." is now quite verbose. It might be worth changing that. -I'm not sure if I need to change anything involving "make install". -I haven't tested the MSVC changes. -I didn't change any clients to actually use the new headers directly. That might be too ambitious for this cycle anyway. While this goes through review, I'll get a head start rebasing the human readable OIDs and data compaction patches. -John Naylor
Attachment
- v9-convert_header2dat.pl
- v9-0001-Create-data-conversion-infrastructure.patch
- v9-0002-Hand-edits-of-data-files.patch
- v9-0003-Update-catalog-scripts-to-read-data-files.patch
- v9-0004-Clean-up-header-files-and-update-comments.patch
- v9-0005-Remove-symbols-from-catalog-headers.patch
- v9-apply-bootstrap-data-patches.sh
I wrote: > I've attached version 9, whose biggest change is to address the above > points of review. I pushed all of the catalog header build logic into > catalog Makefile to avoid creating a separate symbol file. This > involved putting the distprep logic there as well. Enough of the > structure changed that one or two names didn't make sense anymore, so > I changed them. > > As suggested, the conversion script is now part of the patchset and > not committed to the repo. To run the conversion, save everything to a > directory and update the dir vars at the top of > apply-bootstrap-data-patches.sh accordingly. > > A couple things to note that I didn't do: > -With all the new generated headers, the message "Writing ..." is now > quite verbose. It might be worth changing that. > -I'm not sure if I need to change anything involving "make install". > -I haven't tested the MSVC changes. > -I didn't change any clients to actually use the new headers directly. > That might be too ambitious for this cycle anyway. > > While this goes through review, I'll get a head start rebasing the > human readable OIDs and data compaction patches. It didn't take that long to rebase the remaining parts of the patchset, so despite what I said above I went ahead and put them in version 10 (attached), this time via scripted bulk editing rather than as large patches. Changes since the last patchset that contained these parts: -Split out the generation of pg_type OID symbols into its own patch. -Remove single-purpose magic behaviors. -Ditto for the ability to abbreviate attribute names. I decided the added complexity and possible confusion wasn't worth the space savings. -Add some more OID macros for pg_aggregate and pg_range that I missed before. Also, more generally, I cleaned up the apply-patches script and edited its comments and commit messages. Tom Lane wrote: > In 0010, you relabel the types of some OID columns so that genbki.pl > will know which lookup to apply to them. That's not such a problem for > the relabelings that are just macros and genbki.pl converts back to > type OID in the .bki file. But you also did things like s/Oid/regtype/, > and that IS a problem because it will affect what client code sees in > those catalog columns. We've discussed changing those columns to > regfoo types in the past, and decided not to, because of the likelihood > of breaking client queries. I do not think this patch gets to change > that policy. So the way to identify the lookup rule needs to be > independent of whether the column is declared as Oid or an Oid alias type. > Perhaps an explicit marker telling what transformation to make, like > > Oid rngsubtype BKI_LOOKUP(pg_type); > > would work for that. This is also done (now in 0007). -John Naylor
Attachment
- v10-apply-bootstrap-data-patches.sh
- v10-0001-Create-infrastructure-for-working-with-the-new-data-.patch
- v10-0002-Hand-edits-of-data-files.patch
- v10-0003-Update-catalog-scripts-to-read-data-files.patch
- v10-0004-Clean-up-header-files-and-update-comments.patch
- v10-0005-Remove-symbols-from-catalog-headers.patch
- v10-0006-Use-default-values-on-more-catalogs.patch
- v10-0007-Implement-OID-macro-lookups.patch
- v10-0008-Generate-pg_type-OID-symbols.patch
- v10-convert_header2dat.pl
- v10-rewrite_dat_oid2name.pl
- v10-remove_pg_type_oid_symbols.pl
John Naylor <jcnaylor@gmail.com> writes: > It didn't take that long to rebase the remaining parts of the > patchset, so despite what I said above I went ahead and put them in > version 10 (attached), this time via scripted bulk editing rather than > as large patches. Starting to look into this version now, but a small suggestion while it's still fresh in mind: it might be easier, in future rounds, to put all the files in a tarball and attach 'em as one big attachment. At least with my mail setup, it's way easier to save off a tarball and "tar xf" it than it is to individually save a dozen attachments. I suspect that way might be easier on your end, too. There's some value in posting a patchset as separate attachments when it's possible to just apply the patches in series; Munro's patch tester knows what to do with that, but not with a tarball AFAIK. But in this case, there's little hope that the patch tester would get it right anyhow. regards, tom lane
On 3/15/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > John Naylor <jcnaylor@gmail.com> writes: >> It didn't take that long to rebase the remaining parts of the >> patchset, so despite what I said above I went ahead and put them in >> version 10 (attached), this time via scripted bulk editing rather than >> as large patches. > > Starting to look into this version now, but a small suggestion while > it's still fresh in mind: it might be easier, in future rounds, to > put all the files in a tarball and attach 'em as one big attachment. Sure thing. I've done so here for version 11, which is just a rebase over the removal of pg_class.relhaspkey. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > [ v11-bootstrap-data-conversion.tar.gz ] I've done a round of review work on this, focusing on the Makefile infrastructure. I found a bunch of problems with parallel builds and VPATH builds, which are addressed in the attached incremental patch. The parallel-build issues are a bit of a mess really: it's surprising we've not had problems there earlier. For instance, catalog/Makefile has postgres.description: postgres.bki ; postgres.shdescription: postgres.bki ; schemapg.h: postgres.bki ; However, genbki.pl doesn't make any particular guarantee that postgres.bki will be written sooner than its other output files, which means that in principle make might think it needs to rebuild these other files on every subsequent check. That was somewhat harmless given the empty update rule, but it's not really the right thing. Your patch extended this to make all the generated headers dependent on postgres.bki, and those are definitely written before postgres.bki, meaning make *will* think they are out of date. Worse, it'll also think the symlinks to them are out of date. So I was running into problems with different parallel make sub-tasks removing and recreating the symlinks. VPATH builds didn't work well either, because out-of-date-ness ties into whether make will accept a file in the source dir as a valid replacement target. I resolved this mess by setting up a couple of stamp files, which is a technique we also use elsewhere. bki-stamp is a single file representing the action of having run genbki.pl, and header-stamp likewise represents the action of having made the symlinks to the generated headers. By depending on those rather than individual files, we avoid questions of exactly what the timestamps on the individual output files are. In the attached, I've also done some more work on the README file and cleaned up a few other little things. I've not really looked at the MSVC build code at all. Personally, I'm willing to just commit this (when the time comes) and let the buildfarm see if the MSVC code works ... but if anyone else wants to check that part beforehand, please do. I also have not spent much time yet looking at the end-product .h and .dat files. I did note a bit of distressing inconsistency in the formatting of the catalog struct declarations, some of which predates this patch but it seems like you've introduced more. I think what we ought to standardize on is a format similar to this in pg_opclass.h: CATALOG(pg_opclass,2616) { /* index access method opclass is for */ Oid opcmethod BKI_LOOKUP(pg_am); /* name of this opclass */ NameData opcname; /* namespace of this opclass */ Oid opcnamespace BKI_DEFAULT(PGNSP); /* opclass owner */ Oid opcowner BKI_DEFAULT(PGUID); The former convention used in some places, of field descriptions in same-line comments, clearly won't work anymore if we're sticking BKI_DEFAULT annotations there. I also don't like the format used in, eg, pg_aggregate.h of putting field descriptions in a separate comment block before the struct proper. Bitter experience has shown that there are a lot of people on this project who won't update comments that are more than about two lines away from the code they change; so the style in pg_aggregate.h is just inviting maintenance oversights. I've got mixed feelings about the whitespace lines between fields. They seem like they are mostly bulking up the code and we could do without 'em. On the other hand, pgindent will insist on putting one before any multi-line field comment, and so that would create inconsistent formatting if we don't use 'em elsewhere. Thoughts? Speaking of pgindent, those prettily aligned BKI annotations are a waste of effort, because when pgindent gets done with the code it will look like regproc aggfnoid; char aggkind BKI_DEFAULT(n); int16 aggnumdirectargs BKI_DEFAULT(0); regproc aggtransfn BKI_LOOKUP(pg_proc); regproc aggfinalfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); regproc aggcombinefn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); regproc aggserialfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); regproc aggdeserialfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); regproc aggmtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); regproc aggminvtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); regproc aggmfinalfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); I'm not sure if there's anything much to be done about this. BKI_DEFAULT isn't so bad, but additional annotations get unreadable fast. Maybe BKI_LOOKUP was a bad idea after all, and we should just invent more Oid-equivalent typedef names. The attached is just one incremental patch on top of your v11 series. I couldn't think of an easy way to migrate the changes back into the most relevant diffs of your series, so I didn't try. regards, tom lane diff --git a/src/Makefile b/src/Makefile index febbced..433d00b 100644 *** a/src/Makefile --- b/src/Makefile *************** SUBDIRS = \ *** 37,42 **** --- 37,47 ---- $(recurse) + # Update the commonly used headers before building the subdirectories; + # otherwise, in a parallel build, several different sub-jobs will try to + # remake them concurrently + $(SUBDIRS:%=all-%-recurse): | submake-generated-headers + install: install-local install-local: installdirs-local diff --git a/src/backend/Makefile b/src/backend/Makefile index ef00b38..775f7a3 100644 *** a/src/backend/Makefile --- b/src/backend/Makefile *************** utils/probes.h: utils/probes.d *** 150,156 **** # run this unconditionally to avoid needing to know its dependencies here: submake-catalog-headers: ! $(MAKE) -C catalog distprep builddir-headers .PHONY: submake-catalog-headers --- 150,156 ---- # run this unconditionally to avoid needing to know its dependencies here: submake-catalog-headers: ! $(MAKE) -C catalog distprep generated-header-symlinks .PHONY: submake-catalog-headers *************** endif *** 299,312 **** ########################################################################## clean: ! rm -f $(LOCALOBJS) postgres$(X) $(POSTGRES_IMP) \ ! $(top_builddir)/src/include/parser/gram.h \ ! $(top_builddir)/src/include/catalog/pg_*_d.h \ ! $(top_builddir)/src/include/catalog/schemapg.h \ ! $(top_builddir)/src/include/storage/lwlocknames.h \ ! $(top_builddir)/src/include/utils/fmgroids.h \ ! $(top_builddir)/src/include/utils/fmgrprotos.h \ ! $(top_builddir)/src/include/utils/probes.h ifeq ($(PORTNAME), cygwin) rm -f postgres.dll libpostgres.a endif --- 299,305 ---- ########################################################################## clean: ! rm -f $(LOCALOBJS) postgres$(X) $(POSTGRES_IMP) ifeq ($(PORTNAME), cygwin) rm -f postgres.dll libpostgres.a endif *************** distclean: clean *** 318,333 **** rm -f port/tas.s port/dynloader.c port/pg_sema.c port/pg_shmem.c maintainer-clean: distclean rm -f bootstrap/bootparse.c \ bootstrap/bootscanner.c \ parser/gram.c \ parser/gram.h \ parser/scan.c \ - catalog/pg_*_d.h \ - catalog/schemapg.h \ - catalog/postgres.bki \ - catalog/postgres.description \ - catalog/postgres.shdescription \ replication/repl_gram.c \ replication/repl_scanner.c \ replication/syncrep_gram.c \ --- 311,322 ---- rm -f port/tas.s port/dynloader.c port/pg_sema.c port/pg_shmem.c maintainer-clean: distclean + $(MAKE) -C catalog $@ rm -f bootstrap/bootparse.c \ bootstrap/bootscanner.c \ parser/gram.c \ parser/gram.h \ parser/scan.c \ replication/repl_gram.c \ replication/repl_scanner.c \ replication/syncrep_gram.c \ diff --git a/src/backend/catalog/.gitignore b/src/backend/catalog/.gitignore index 1044a1c..9abe91d 100644 *** a/src/backend/catalog/.gitignore --- b/src/backend/catalog/.gitignore *************** *** 3,5 **** --- 3,6 ---- /postgres.shdescription /schemapg.h /pg_*_d.h + /bki-stamp diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 6fe5566..39dae86 100644 *** a/src/backend/catalog/Catalog.pm --- b/src/backend/catalog/Catalog.pm *************** sub ParseHeader *** 201,207 **** # # The parameter $preserve_formatting needs to be set for callers that want # to work with non-data lines in the data files, such as comments and blank ! # lines. If a caller just wants consume the data, leave it unset. sub ParseData { my ($input_file, $schema, $preserve_formatting) = @_; --- 201,207 ---- # # The parameter $preserve_formatting needs to be set for callers that want # to work with non-data lines in the data files, such as comments and blank ! # lines. If a caller just wants to consume the data, leave it unset. sub ParseData { my ($input_file, $schema, $preserve_formatting) = @_; *************** sub ParseData *** 299,304 **** --- 299,305 ---- # Fill in default values of a record using the given schema. It's the # caller's responsibility to specify other values beforehand. + # If we fail to fill all fields, return a nonempty error message. sub AddDefaultValues { my ($row, $schema) = @_; *************** sub FindDefinedSymbolFromData *** 391,398 **** { return $row->{oid}; } - die "no definition found for $symbol\n"; } } 1; --- 392,399 ---- { return $row->{oid}; } } + die "no definition found for $symbol\n"; } 1; diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile index 9b87f85..17213d4 100644 *** a/src/backend/catalog/Makefile --- b/src/backend/catalog/Makefile *************** CATALOG_HEADERS := \ *** 44,68 **** pg_default_acl.h pg_init_privs.h pg_seclabel.h pg_shseclabel.h \ pg_collation.h pg_partitioned_table.h pg_range.h pg_transform.h \ pg_sequence.h pg_publication.h pg_publication_rel.h pg_subscription.h \ ! pg_subscription_rel.h \ GENERATED_HEADERS := $(CATALOG_HEADERS:%.h=%_d.h) schemapg.h - distprep: $(BKIFILES) $(GENERATED_HEADERS) - - .PHONY: builddir-headers - - builddir-headers: $(addprefix $(top_builddir)/src/include/catalog/, $(GENERATED_HEADERS)) - - all: distprep builddir-headers - # In the list of headers used to assemble postgres.bki, indexing.h needs # be last, and toasting.h just before it. ! POSTGRES_BKI_SRCS := $(addprefix $(top_srcdir)/src/include/catalog/, $(CATALOG_HEADERS) toasting.h indexing.h) POSTGRES_BKI_DATA = $(addprefix $(top_srcdir)/src/include/catalog/,\ pg_aggregate.dat pg_am.dat pg_amop.dat pg_amproc.dat pg_authid.dat \ ! pg_cast.dat pg_class.dat pg_collation.dat pg_database.dat pg_language.dat \ pg_namespace.dat pg_opclass.dat pg_operator.dat pg_opfamily.dat \ pg_pltemplate.dat pg_proc.dat pg_range.dat pg_tablespace.dat \ pg_ts_config.dat pg_ts_config_map.dat pg_ts_dict.dat pg_ts_parser.dat \ --- 44,64 ---- pg_default_acl.h pg_init_privs.h pg_seclabel.h pg_shseclabel.h \ pg_collation.h pg_partitioned_table.h pg_range.h pg_transform.h \ pg_sequence.h pg_publication.h pg_publication_rel.h pg_subscription.h \ ! pg_subscription_rel.h GENERATED_HEADERS := $(CATALOG_HEADERS:%.h=%_d.h) schemapg.h # In the list of headers used to assemble postgres.bki, indexing.h needs # be last, and toasting.h just before it. ! POSTGRES_BKI_SRCS := $(addprefix $(top_srcdir)/src/include/catalog/,\ ! $(CATALOG_HEADERS) toasting.h indexing.h \ ! ) + # The .dat files we need can just be listed alphabetically. POSTGRES_BKI_DATA = $(addprefix $(top_srcdir)/src/include/catalog/,\ pg_aggregate.dat pg_am.dat pg_amop.dat pg_amproc.dat pg_authid.dat \ ! pg_cast.dat pg_class.dat pg_collation.dat \ ! pg_database.dat pg_language.dat \ pg_namespace.dat pg_opclass.dat pg_operator.dat pg_opfamily.dat \ pg_pltemplate.dat pg_proc.dat pg_range.dat pg_tablespace.dat \ pg_ts_config.dat pg_ts_config_map.dat pg_ts_dict.dat pg_ts_parser.dat \ *************** POSTGRES_BKI_DATA = $(addprefix $(top_sr *** 72,104 **** # location of Catalog.pm catalogdir = $(top_srcdir)/src/backend/catalog ! # see explanation in ../parser/Makefile ! postgres.description: postgres.bki ; ! postgres.shdescription: postgres.bki ; ! $(GENERATED_HEADERS): postgres.bki ; ! # Technically, this should depend on Makefile.global, but then ! # postgres.bki would need to be rebuilt after every configure run, ! # even in distribution tarballs. So this is cheating a bit, but it ! # will achieve the goal of updating the version number when it ! # changes. ! postgres.bki: genbki.pl Catalog.pm $(POSTGRES_BKI_SRCS) $(POSTGRES_BKI_DATA) $(top_srcdir)/configure $(top_srcdir)/src/include/catalog/duplicate_oids cd $(top_srcdir)/src/include/catalog && $(PERL) ./duplicate_oids $(PERL) -I $(catalogdir) $< --set-version=$(MAJORVERSION) $(POSTGRES_BKI_SRCS) ! # see explanation in src/backend/Makefile ! $(top_builddir)/src/include/catalog/%_d.h: %_d.h ! prereqdir=`cd '$(dir $<)' >/dev/null && pwd` && \ ! cd '$(dir $@)' && rm -f $(notdir $@) && \ ! $(LN_S) "$$prereqdir/$(notdir $<)" . ! ! $(top_builddir)/src/include/catalog/schemapg.h: schemapg.h prereqdir=`cd '$(dir $<)' >/dev/null && pwd` && \ ! cd '$(dir $@)' && rm -f $(notdir $@) && \ ! $(LN_S) "$$prereqdir/$(notdir $<)" . .PHONY: install-data ! install-data: $(BKIFILES) installdirs $(INSTALL_DATA) $(call vpathsearch,postgres.bki) '$(DESTDIR)$(datadir)/postgres.bki' $(INSTALL_DATA) $(call vpathsearch,postgres.description) '$(DESTDIR)$(datadir)/postgres.description' $(INSTALL_DATA) $(call vpathsearch,postgres.shdescription) '$(DESTDIR)$(datadir)/postgres.shdescription' --- 68,105 ---- # location of Catalog.pm catalogdir = $(top_srcdir)/src/backend/catalog ! all: distprep generated-header-symlinks ! distprep: bki-stamp ! ! .PHONY: generated-header-symlinks ! ! generated-header-symlinks: $(top_builddir)/src/include/catalog/header-stamp ! ! # Technically, this should depend on Makefile.global which supplies ! # $(MAJORVERSION); but then postgres.bki would need to be rebuilt after every ! # configure run, even in distribution tarballs. So depending on configure.in ! # instead is cheating a bit, but it will achieve the goal of updating the ! # version number when it changes. ! bki-stamp: genbki.pl Catalog.pm $(POSTGRES_BKI_SRCS) $(POSTGRES_BKI_DATA) $(top_srcdir)/configure.in $(top_srcdir)/src/include/catalog/duplicate_oids cd $(top_srcdir)/src/include/catalog && $(PERL) ./duplicate_oids $(PERL) -I $(catalogdir) $< --set-version=$(MAJORVERSION) $(POSTGRES_BKI_SRCS) + touch $@ ! # The generated headers must all be symlinked into builddir/src/include/, ! # using absolute links for the reasons explained in src/backend/Makefile. ! # We use header-stamp to record that we've done this because the symlinks ! # themselves may appear older than bki-stamp. ! $(top_builddir)/src/include/catalog/header-stamp: bki-stamp prereqdir=`cd '$(dir $<)' >/dev/null && pwd` && \ ! cd '$(dir $@)' && for file in $(GENERATED_HEADERS); do \ ! rm -f $$file && $(LN_S) "$$prereqdir/$$file" . ; \ ! done ! touch $@ + # Note: installation of generated headers is handled elsewhere .PHONY: install-data ! install-data: bki-stamp installdirs $(INSTALL_DATA) $(call vpathsearch,postgres.bki) '$(DESTDIR)$(datadir)/postgres.bki' $(INSTALL_DATA) $(call vpathsearch,postgres.description) '$(DESTDIR)$(datadir)/postgres.description' $(INSTALL_DATA) $(call vpathsearch,postgres.shdescription) '$(DESTDIR)$(datadir)/postgres.shdescription' *************** installdirs: *** 113,121 **** uninstall-data: rm -f $(addprefix '$(DESTDIR)$(datadir)'/, $(BKIFILES) system_views.sql information_schema.sql sql_features.txt) ! # postgres.bki, postgres.description, postgres.shdescription, and the generated headers ! # are in the distribution tarball, so they are not cleaned here. clean: maintainer-clean: clean ! rm -f $(BKIFILES) --- 114,123 ---- uninstall-data: rm -f $(addprefix '$(DESTDIR)$(datadir)'/, $(BKIFILES) system_views.sql information_schema.sql sql_features.txt) ! # postgres.bki, postgres.description, postgres.shdescription, ! # and the generated headers are in the distribution tarball, ! # so they are not cleaned here. clean: maintainer-clean: clean ! rm -f bki-stamp $(BKIFILES) $(GENERATED_HEADERS) diff --git a/src/backend/catalog/README b/src/backend/catalog/README index 447fce6..4aa2adb 100644 *** a/src/backend/catalog/README --- b/src/backend/catalog/README *************** src/backend/catalog/README *** 3,20 **** System Catalog ============== ! This directory contains .c files that manipulate the system catalogs; ! src/include/catalog contains the .h files that define the structure ! of the system catalogs. ! When the compile-time script genbki.pl executes, it parses the .h files and .dat files in order to generate the postgres.* files. These are then used as input to initdb (which is just a wrapper around postgres running single-user in bootstrapping mode) in order to generate the initial (template) system catalog relation files. ! backend/utils/Gen_fmgrtab.pl uses the same mechanism to genarate .c and ! .h files used by the function manager. ----------------------------------------------------------------- --- 3,27 ---- System Catalog ============== ! This directory contains .c files that manipulate the system catalogs. ! src/include/catalog contains the pg_XXX.h files that define the structure ! of the system catalogs, as well as pg_XXX.dat files that define the ! initial contents of the catalogs. ! When the compile-time script genbki.pl executes, it parses the pg_XXX.h files and .dat files in order to generate the postgres.* files. These are then used as input to initdb (which is just a wrapper around postgres running single-user in bootstrapping mode) in order to generate the initial (template) system catalog relation files. ! genbki.pl also produces some generated header files that are used in ! compiling the C code. These include a pg_XXX_d.h file corresponding ! to each pg_XXX.h catalog header file, which contains #define's extracted ! from the corresponding .dat file as well as some automatically-generated ! symbols. ! ! backend/utils/Gen_fmgrtab.pl uses the same catalog-data-reading code ! to generate .c and .h files used by the function manager. ----------------------------------------------------------------- *************** The data file format and bootstrap data *** 22,33 **** - As far as the bootstrap code is concerned, it is very important that the insert statements in postgres.bki be properly formatted ! (e.g., no broken lines, proper use of white-space and _null_). The ! scripts are line-oriented and break easily. In addition, the only ! documentation on the proper format for them is the code in the ! bootstrap/ directory. Fortunately, the source bootstrap data is much ! more tolerant with respect to formatting, but it still pays to be ! careful when adding new data. - The .dat files contain Perl data structure literals that are simply eval'd to produce in-memory data structures. As such, the code reading --- 29,39 ---- - As far as the bootstrap code is concerned, it is very important that the insert statements in postgres.bki be properly formatted ! (e.g., no broken lines, proper use of white-space and _null_). ! In addition, the only documentation on the proper format for them ! is the code in the bootstrap/ directory. Fortunately, the source ! bootstrap data is much more tolerant with respect to formatting, ! but it still pays to be careful when adding new data. - The .dat files contain Perl data structure literals that are simply eval'd to produce in-memory data structures. As such, the code reading *************** demonstrate the key features: *** 43,54 **** # a comment { oid => '1', oid_symbol => 'TemplateDbOid', shdescr => 'default template', ! datname => 'Berkely\'s DB', datcollate => '"LC_COLLATE"', datacl => '_null_' }, ] ! -The layout is: open bracket, one or more sets of curly brackets containing ! comma-separated key-value pairs, close bracket. -All values are single-quoted. -Single quotes within values must be escaped. -If a value is a macro to be expanded by initdb.c, it must also have double- --- 49,61 ---- # a comment { oid => '1', oid_symbol => 'TemplateDbOid', shdescr => 'default template', ! datname => 'Berkeley\'s DB', datcollate => '"LC_COLLATE"', ! datacl => '_null_' }, ] ! -The overall file layout is: open bracket, one or more sets of curly brackets ! containing comma-separated key-value pairs, close bracket. -All values are single-quoted. -Single quotes within values must be escaped. -If a value is a macro to be expanded by initdb.c, it must also have double- *************** the catalog's data file, and use the #de *** 91,100 **** the actual numeric value of any OID in C code is considered very bad form. Direct references to pg_proc OIDs are common enough that there's a special mechanism to create the necessary #define's automatically: see ! backend/utils/Gen_fmgrtab.pl. We also have standard conventions for setting ! up #define's for the pg_class OIDs of system catalogs and indexes. For all ! the other system catalogs, you have to manually create any #define's you ! need. - If you need to find a valid OID for a new predefined tuple, use the script src/include/catalog/unused_oids. It generates inclusive ranges of --- 98,109 ---- the actual numeric value of any OID in C code is considered very bad form. Direct references to pg_proc OIDs are common enough that there's a special mechanism to create the necessary #define's automatically: see ! backend/utils/Gen_fmgrtab.pl. Similarly (but, for historical reasons, not ! done the same way), there's an automatic method for creating #define's ! for pg_type OIDs. We also have standard conventions for setting up #define's ! for the pg_class OIDs of system catalogs and indexes. For all the other ! system catalogs, you have to manually specify any #define's you need, via ! oid_symbol entries. - If you need to find a valid OID for a new predefined tuple, use the script src/include/catalog/unused_oids. It generates inclusive ranges of *************** script src/include/catalog/unused_oids. *** 102,111 **** not been allocated yet). Currently, OIDs 1-9999 are reserved for manual assignment; the unused_oids script simply looks through the include/catalog headers and .dat files to see which ones do not appear. ! (As of Postgres 8.1, it also looks at CATALOG and DECLARE_INDEX lines.) ! You can use the duplicate_oids script to check for mistakes. This script ! is also run at compile time, and will stop the build if a duplicate is ! found. - The OID counter starts at 10000 at bootstrap. If a catalog row is in a table that requires OIDs, but no OID was preassigned by an "OID =" clause, --- 111,119 ---- not been allocated yet). Currently, OIDs 1-9999 are reserved for manual assignment; the unused_oids script simply looks through the include/catalog headers and .dat files to see which ones do not appear. ! You can also use the duplicate_oids script to check for mistakes. ! (This script is run automatically at compile time, and will stop the build ! if a duplicate is found.) - The OID counter starts at 10000 at bootstrap. If a catalog row is in a table that requires OIDs, but no OID was preassigned by an "OID =" clause, *************** the tables are in place, and toasting.h *** 134,144 **** (or at least after all the tables that need toast tables). There are reputedly some other order dependencies in the .bki list, too. ! -Client code should not include the catalog headers directly. Instead, it ! should include the corresponding generated pg_*_d.h header. If you want ! macros or other code in the catalog headers to be visible to clients, use ! the undefined macro EXPOSE_TO_CLIENT_CODE to instruct genbki.pl to copy ! that section to the pg_*_d.h header. ----------------------------------------------------------------- --- 142,155 ---- (or at least after all the tables that need toast tables). There are reputedly some other order dependencies in the .bki list, too. ! - Frontend code should not include any pg_XXX.h header directly, as these ! files may contain C code that won't compile outside the backend. Instead, ! frontend code may include the corresponding generated pg_*_d.h header, which ! will contain OID #defines and any other data that might be of use on the ! client side. If you want macros or other code in a catalog header to be ! visible to frontend code, write "#ifdef EXPOSE_TO_CLIENT_CODE" ... "#endif" ! around that section to instruct genbki.pl to copy that section to the ! pg_*_d.h header. ----------------------------------------------------------------- *************** piece of code will likely perform "typet *** 155,163 **** random errors or even segmentation violations. Hence, do NOT insert catalog tuples that contain NULL attributes except in their variable-length portions! (The bootstrapping code is fairly good about ! marking NOT NULL each of the columns that can legally be referenced via ! C struct declarations ... but those markings won't be enforced against ! insert commands, so you must get it right in the data files.) - Modification of the catalogs must be performed with the proper updating of catalog indexes! That is, most catalogs have indexes --- 166,176 ---- random errors or even segmentation violations. Hence, do NOT insert catalog tuples that contain NULL attributes except in their variable-length portions! (The bootstrapping code is fairly good about ! automatically marking NOT NULL each of the columns that can legally be ! referenced via C struct declarations, and it can be helped along with ! BKI_FORCE_NOT_NULL and BKI_FORCE_NULL annotations where needed. But ! attnotnull constraints are only enforced in the executor, not against ! tuples that are generated by random C code.) - Modification of the catalogs must be performed with the proper updating of catalog indexes! That is, most catalogs have indexes *************** method calls to insert new or modified t *** 167,170 **** also make the calls to insert the tuple into ALL of its indexes! If not, the new tuple will generally be "invisible" to the system because most of the accesses to the catalogs in question will be through the ! associated indexes. --- 180,185 ---- also make the calls to insert the tuple into ALL of its indexes! If not, the new tuple will generally be "invisible" to the system because most of the accesses to the catalogs in question will be through the ! associated indexes. Current practice is to always use CatalogTupleInsert, ! CatalogTupleUpdate, CatalogTupleDelete, or one of their sibling functions ! when updating the system catalogs, so that this is handled automatically. diff --git a/src/common/Makefile b/src/common/Makefile index f28c136..80e78d7 100644 *** a/src/common/Makefile --- b/src/common/Makefile *************** libpgcommon_srv.a: $(OBJS_SRV) *** 88,95 **** %_srv.o: %.c %.o $(CC) $(CFLAGS) $(subst -DFRONTEND ,, $(CPPFLAGS)) -c $< -o $@ - $(OBJS_COMMON): | submake-generated-headers - $(OBJS_SRV): | submake-errcodes .PHONY: submake-errcodes --- 88,93 ---- diff --git a/src/include/Makefile b/src/include/Makefile index 7fe4d45..59e18c7 100644 *** a/src/include/Makefile --- b/src/include/Makefile *************** install: all installdirs *** 54,60 **** chmod $(INSTALL_DATA_MODE) '$(DESTDIR)$(includedir_server)'/$$dir/*.h || exit; \ done ifeq ($(vpath_build),yes) ! for file in dynloader.h catalog/schemapg.h parser/gram.h storage/lwlocknames.h utils/probes.h; do \ cp $$file '$(DESTDIR)$(includedir_server)'/$$file || exit; \ chmod $(INSTALL_DATA_MODE) '$(DESTDIR)$(includedir_server)'/$$file || exit; \ done --- 54,60 ---- chmod $(INSTALL_DATA_MODE) '$(DESTDIR)$(includedir_server)'/$$dir/*.h || exit; \ done ifeq ($(vpath_build),yes) ! for file in dynloader.h catalog/schemapg.h catalog/pg_*_d.h parser/gram.h storage/lwlocknames.h utils/probes.h; do\ cp $$file '$(DESTDIR)$(includedir_server)'/$$file || exit; \ chmod $(INSTALL_DATA_MODE) '$(DESTDIR)$(includedir_server)'/$$file || exit; \ done *************** uninstall: *** 73,79 **** clean: ! rm -f utils/fmgroids.h utils/fmgrprotos.h utils/errcodes.h parser/gram.h utils/probes.h catalog/schemapg.h catalog/pg_*_d.h distclean maintainer-clean: clean rm -f pg_config.h pg_config_ext.h pg_config_os.h dynloader.h stamp-h stamp-ext-h --- 73,81 ---- clean: ! rm -f utils/fmgroids.h utils/fmgrprotos.h utils/errcodes.h ! rm -f parser/gram.h storage/lwlocknames.h utils/probes.h ! rm -f catalog/schemapg.h catalog/pg_*_d.h catalog/header-stamp distclean maintainer-clean: clean rm -f pg_config.h pg_config_ext.h pg_config_os.h dynloader.h stamp-h stamp-ext-h diff --git a/src/include/catalog/.gitignore b/src/include/catalog/.gitignore index dfd5616..6c8da54 100644 *** a/src/include/catalog/.gitignore --- b/src/include/catalog/.gitignore *************** *** 1,2 **** --- 1,3 ---- /schemapg.h /pg_*_d.h + /header-stamp
On 3/21/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > John Naylor <jcnaylor@gmail.com> writes: >> [ v11-bootstrap-data-conversion.tar.gz ] > > I've done a round of review work on this, focusing on the Makefile > infrastructure. I found a bunch of problems with parallel builds and > VPATH builds, which are addressed in the attached incremental patch. > [explanation of Make issues and stamp files] > In the attached, I've also done some more work on the README file > and cleaned up a few other little things. Thanks for pulling my attempt at makefile hackery across the finish line. It sounds like there are no more obvious structural issues remaining (fingers crossed). Your other improvements make sense. > I did note a bit of distressing inconsistency in the formatting of > the catalog struct declarations, some of which predates this patch but it > seems like you've introduced more. I think what we ought to standardize > on is a format similar to this in pg_opclass.h: > > CATALOG(pg_opclass,2616) > { > /* index access method opclass is for */ > Oid opcmethod BKI_LOOKUP(pg_am); > [snip] That is the most sensible format. Did you mean all 62 catalog headers for future-proofing, or just the ones with annotations now? > The former convention used in some places, of field descriptions in > same-line comments, clearly won't work anymore if we're sticking > BKI_DEFAULT annotations there. Yeah. > I also don't like the format used in, eg, > pg_aggregate.h of putting field descriptions in a separate comment block > before the struct proper. Bitter experience has shown that there are a > lot of people on this project who won't update comments that are more than > about two lines away from the code they change; so the style in > pg_aggregate.h is just inviting maintenance oversights. Okay. > I've got mixed feelings about the whitespace lines between fields. They > seem like they are mostly bulking up the code and we could do without 'em. > On the other hand, pgindent will insist on putting one before any > multi-line field comment, and so that would create inconsistent formatting > if we don't use 'em elsewhere. Thoughts? I'll do it both ways for one header and post the results for people to look at. > Speaking of pgindent, those prettily aligned BKI annotations are a waste > of effort, because when pgindent gets done with the code it will look > like > [snip] > regproc aggmtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); > regproc aggminvtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); > regproc aggmfinalfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); > I'm not sure if there's anything much to be done about this. BKI_DEFAULT > isn't so bad, but additional annotations get unreadable fast. Maybe > BKI_LOOKUP was a bad idea after all, and we should just invent more > Oid-equivalent typedef names. Well, until version 7, I used "fake" type aliases that only genbki.pl could see. The C compiler and postgres.bki could only see the actual Oid/oid type. Perhaps it was a mistake to model their appearance after regproc (regtype, etc), because that was misleading. Maybe something more obviously transient like 'lookup_typeoid? I'm leaning towards this idea. Another possibility is to teach the pgindent pre_/post_indent functions to preserve annotation formatting, but I'd rather not add yet another regex to that script. Plus, over the next 10+ years, I could see people adding several more BKI_* macros, leading to readability issues regardless of formatting, so maybe we should nip this one in the bud. > The attached is just one incremental patch on top of your v11 series. > I couldn't think of an easy way to migrate the changes back into the > most relevant diffs of your series, so I didn't try. I've done that quite a few times while developing this patch series, so I'm used to it. I'll incorporate your changes soon and also rebase over the new pg_class column that landed recently. I'll have a new version by this weekend, assuming we conclude the formatting discussion, so if you or others have any more comments by then, I'll include them. -John Naylor
John Naylor <jcnaylor@gmail.com> writes: > On 3/21/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> regproc aggmtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); >> regproc aggminvtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); >> regproc aggmfinalfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); >> I'm not sure if there's anything much to be done about this. BKI_DEFAULT >> isn't so bad, but additional annotations get unreadable fast. Maybe >> BKI_LOOKUP was a bad idea after all, and we should just invent more >> Oid-equivalent typedef names. > Well, until version 7, I used "fake" type aliases that only genbki.pl > could see. The C compiler and postgres.bki could only see the actual > Oid/oid type. Perhaps it was a mistake to model their appearance after > regproc (regtype, etc), because that was misleading. Maybe something > more obviously transient like 'lookup_typeoid? I'm leaning towards > this idea. Looking at this again, I think a big chunk of the readability problem here is just from the fact that we have long, similar-looking lines tightly packed. If it were reformatted to have comment lines and whitespace between, it might not look nearly as bad. > Another possibility is to teach the pgindent pre_/post_indent > functions to preserve annotation formatting, but I'd rather not add > yet another regex to that script. Plus, over the next 10+ years, I > could see people adding several more BKI_* macros, leading to > readability issues regardless of formatting, so maybe we should nip > this one in the bud. Well, whether or not we invent BKI_LOOKUP, the need for other kinds of annotations isn't likely to be lessened. I wondered whether we could somehow convert the format into multiple lines, say regproc aggmfinalfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc); but some quick experimentation was discouraging: either Emacs' C syntax mode, or pgindent, or both would make a hash of it. It wasn't great from the vertical-space-consumption standpoint either. Anyway, for the moment I'd stick with BKI_LOOKUP rather than undoing that work. I think it's a more transparent way of saying what we want than the magic-OID-typedefs approach. The formatting issue is just a mild annoyance, and it's not really BKI_LOOKUP's fault anyway. While I'm thinking of it --- I noticed one or two places where you had "BKI_DEFAULT(\0)". That coding scares me a bit --- gcc seems to tolerate it, but other C compilers might feel that \0 is not a valid preprocessing token, or it might confuse some editors' syntax highlight rules. I'd rather write cases like this as "BKI_DEFAULT('\0')". IOW, the argument should be a valid C identifier, number, or string literal. regards, tom lane
I wrote: > On 3/21/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I've got mixed feelings about the whitespace lines between fields. They >> seem like they are mostly bulking up the code and we could do without >> 'em. >> On the other hand, pgindent will insist on putting one before any >> multi-line field comment, and so that would create inconsistent >> formatting >> if we don't use 'em elsewhere. Thoughts? > > I'll do it both ways for one header and post the results for people to look > at. I've attached an earlier version of pg_proc.h with both formats as I understand them. I turned a couple comments into multi-line comments to demonstrate. I think without spaces it's just as hard to read as with multiple annotations. I'd vote for spaces, but then again I'm not the one who has to read these things very often. -John Naylor
Attachment
On 3/22/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Looking at this again, I think a big chunk of the readability problem here > is just from the fact that we have long, similar-looking lines tightly > packed. If it were reformatted to have comment lines and whitespace > between, it might not look nearly as bad. > ... > Anyway, for the moment I'd stick with BKI_LOOKUP rather than undoing > that work. I think it's a more transparent way of saying what we > want than the magic-OID-typedefs approach. The formatting issue is > just a mild annoyance, and it's not really BKI_LOOKUP's fault anyway. Okay, I'll do it with comments and whitespace. > While I'm thinking of it --- I noticed one or two places where you > had "BKI_DEFAULT(\0)". That coding scares me a bit --- gcc seems to > tolerate it, but other C compilers might feel that \0 is not a valid > preprocessing token, or it might confuse some editors' syntax highlight > rules. I'd rather write cases like this as "BKI_DEFAULT('\0')". IOW, > the argument should be a valid C identifier, number, or string literal. Hmm, I only see this octal in pg_type.h char typdelim BKI_DEFAULT(\054); Which I hope is fine. Were you thinking of this comment in pg_attribute.h? We use the double-quoted empty string for postgres.bki and change it to '\0' for schemapg.h. /* One of the ATTRIBUTE_IDENTITY_* constants below, or '\0' */ char attidentity BKI_DEFAULT(""); -John Naylor
John Naylor wrote: > I've attached an earlier version of pg_proc.h with both formats as I > understand them. I turned a couple comments into multi-line comments > to demonstrate. I think without spaces it's just as hard to read as > with multiple annotations. I'd vote for spaces, but then again I'm not > the one who has to read these things very often. how about letting the line go long, with the comment at the right of each definition, with one blank line between struct members, as in the sample below? You normally don't care that these lines are too long since you seldom edit them -- one mostly adds or remove entire lines instead, so there's not as much need for side-by-side diffs as with regular code. (One issue with this proposal is how to convince pgindent to leave the long lines alone.) To me, an important property of these structs is fitting as much as possible (vertically) in a screenful; the other proposed modes end up with too many lines. CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81) BKI_SCHEMA_MACRO { NameData proname; /* procedure name */ Oid pronamespace BKI_DEFAULT(PGNSP); /* OID of namespace containing this proc */ Oid proowner BKI_DEFAULT(PGUID); /* procedure owner */ Oid prolang BKI_DEFAULT(12); /* OID of pg_language entry */ float4 procost BKI_DEFAULT(1); /* estimated execution cost */ float4 prorows BKI_DEFAULT(0); /* estimated # of rows out (if proretset) */ Oid provariadic BKI_DEFAULT(0); /* element type of variadic array, or 0 */ regproc protransform BKI_DEFAULT(0); /* transforms calls to it during planning */ bool proisagg BKI_DEFAULT(f); /* is it an aggregate? */ bool proiswindow BKI_DEFAULT(f); /* is it a window function? */ bool prosecdef BKI_DEFAULT(f); /* security definer */ bool proleakproof BKI_DEFAULT(f); /* is it a leak-proof function? */ bool proisstrict BKI_DEFAULT(f); /* strict with respect to NULLs? */ bool proretset BKI_DEFAULT(f); /* returns a set? */ char provolatile BKI_DEFAULT(v); /* see PROVOLATILE_ categories below */ char proparallel BKI_DEFAULT(u); /* see PROPARALLEL_ categories below */ int16 pronargs; /* number of arguments */ int16 pronargdefaults BKI_DEFAULT(0); /* number of arguments with defaults */ Oid prorettype; /* OID of result type */ /* * variable-length fields start here, but we allow direct access to * proargtypes */ oidvector proargtypes; /* parameter types (excludes OUT params) */ #ifdef CATALOG_VARLEN Oid proallargtypes[1] BKI_DEFAULT(_null_); /* all param types (NULL if IN only) */ char proargmodes[1] BKI_DEFAULT(_null_); /* parameter modes (NULL if IN only) */ text proargnames[1] BKI_DEFAULT(_null_); /* parameter names (NULL if no names) */ pg_node_tree proargdefaults BKI_DEFAULT(_null_); /* list of expression trees for argument defaults (NULL if none) */ Oid protrftypes[1] BKI_DEFAULT(_null_); /* types for which to apply transforms */ text prosrc BKI_FORCE_NOT_NULL; /* procedure source text */ text probin BKI_DEFAULT(_null_); /* secondary procedure info (can be NULL) */ text proconfig[1] BKI_DEFAULT(_null_); /* procedure-local GUC settings */ aclitem proacl[1] BKI_DEFAULT(_null_); /* access permissions */ #endif } FormData_pg_proc; -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
John Naylor <jcnaylor@gmail.com> writes: > On 3/22/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> While I'm thinking of it --- I noticed one or two places where you >> had "BKI_DEFAULT(\0)". > Hmm, I only see this octal in pg_type.h > char typdelim BKI_DEFAULT(\054); Sorry, I was going by memory rather than looking at the code. > Which I hope is fine. I don't really think it's legal C; I'd rather write BKI_DEFAULT('\054'). > Were you thinking of this comment in > pg_attribute.h? We use the double-quoted empty string for postgres.bki > and change it to '\0' for schemapg.h. > /* One of the ATTRIBUTE_IDENTITY_* constants below, or '\0' */ > char attidentity BKI_DEFAULT(""); That definitely seems like a hack --- why not BKI_DEFAULT('\0') ? regards, tom lane
John Naylor <jcnaylor@gmail.com> writes: >> On 3/21/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I've got mixed feelings about the whitespace lines between fields. > I've attached an earlier version of pg_proc.h with both formats as I > understand them. I turned a couple comments into multi-line comments > to demonstrate. I think without spaces it's just as hard to read as > with multiple annotations. I'd vote for spaces, but then again I'm not > the one who has to read these things very often. Yeah, after looking at this example I agree --- it's too tight without the blank lines. regards, tom lane
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > how about letting the line go long, with the comment at the right of > each definition, with one blank line between struct members, as in the > sample below? > NameData proname; /* procedure name */ > Oid pronamespace BKI_DEFAULT(PGNSP); /* OID of namespace containing this proc */ > Oid proowner BKI_DEFAULT(PGUID); /* procedure owner */ I don't think this is going to work: pgindent is going to wrap most of these comments, ending up with something that's ugly *and* consumes just as much vertical space as if we'd given the comments their own lines. The problem is that in the headers where we were using same-line comments, the comments were written to fit in the space available without this extra annotation. (For my money, having spent lots of time shaving a character or two off such comments to make 'em fit, I'd much prefer the luxury of having a whole line to write in.) We could go with some scheme that preserves the old formatting of the struct definition proper and puts the added info somewhere else, ie Oid pronamespace; /* OID of namespace containing this proc */ Oid prolang; /* OID of pg_language entry */ then after the struct: BKI_DEFAULT(pronamespace, PGNSP); BKI_DEFAULT(prolang, 12); but on the whole I don't think that's an improvement. I'd rather keep the info about a field together. regards, tom lane
On 3/22/18, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > how about letting the line go long, with the comment at the right of > each definition, with one blank line between struct members, as in the > sample below? You normally don't care that these lines are too long > since you seldom edit them -- one mostly adds or remove entire lines > instead, so there's not as much need for side-by-side diffs as with > regular code. (One issue with this proposal is how to convince pgindent > to leave the long lines alone.) Yeah, it seems when perltidy or pgindent mangle things badly, it's to try and shoehorn a long line into a smaller number of characters. If memory serves, I've come across things like this: pg_node_tree proargdefaults BKI_DEFAULT(_null_); /* list of expression trees for argument defaults (NULL if none) */ And thought "only a machine could be so precisely awkward" -John Naylor
On 3/22/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't really think it's legal C; I'd rather write BKI_DEFAULT('\054'). Okay, I'll do that. >> Were you thinking of this comment in >> pg_attribute.h? We use the double-quoted empty string for postgres.bki >> and change it to '\0' for schemapg.h. > >> /* One of the ATTRIBUTE_IDENTITY_* constants below, or '\0' */ >> char attidentity BKI_DEFAULT(""); > > That definitely seems like a hack --- why not BKI_DEFAULT('\0') ? Hmm, yes, the way I had it, the comment is a mystery. I'll switch it around. -John Naylor
On 3/21/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The attached is just one incremental patch on top of your v11 series. > I couldn't think of an easy way to migrate the changes back into the > most relevant diffs of your series, so I didn't try. I've applied your changes to the v12 patch series (attached), but I hope you'll allow two nit-picky adjustments: -s/pg_XXX.h/pg_xxx.h/ in the README. There seems to be greater precedent for the lower-case spelling if the rest of the word is lower case. -I shortened the data example in the README so it would comfortably fit on two lines. Spreading it out over three lines doesn't match what's in the data files. It's valid syntax, but real data is formatted to at most two lines (See rewrite_dat.pl. Hmm, maybe I should make that more explicit elsewhere in the README) > I also have not spent much time yet looking at the end-product .h and .dat > files. I did note a bit of distressing inconsistency in the formatting of > the catalog struct declarations, some of which predates this patch but it > seems like you've introduced more. I think what we ought to standardize > on is a format similar to this in pg_opclass.h: > > CATALOG(pg_opclass,2616) > { > /* index access method opclass is for */ > Oid opcmethod BKI_LOOKUP(pg_am); > Done, with blank lines interspersed. I put most changes of this sort in with the other cleanups in patch 0004. I neglected to do this separately for couple of tiny tables that have lookups, but no default values. I don't think it impacts the readability of patch 0007 much. On 3/22/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't really think it's legal C; I'd rather write BKI_DEFAULT('\054'). [snip] >> /* One of the ATTRIBUTE_IDENTITY_* constants below, or '\0' */ >> char attidentity BKI_DEFAULT(""); > > That definitely seems like a hack --- why not BKI_DEFAULT('\0') ? Done (patch 0006). Other changes: -A README note about OID macros (patch 0007). -A couple minor cosmetic rearrangements and comment/commit message edits. Open items: -Test MSVC. -Arrange for rewrite_dat.pl to run when perltidy does. -I was a bit cavalier about when to use =/:= in the Makefiles. Not sure if there's a preferred project style for when the choice doesn't really matter. -Maybe document examples of how to do bulk-editing of data files? -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > -I shortened the data example in the README so it would comfortably > fit on two lines. Spreading it out over three lines doesn't match > what's in the data files. It's valid syntax, but real data is > formatted to at most two lines (See rewrite_dat.pl. Hmm, maybe I > should make that more explicit elsewhere in the README) Well, as I said, I hadn't really reviewed the .dat files, but if that's what you're doing I'm going to request a change. Project style is to fit in 80 columns as much as possible. I do not see a reason to exempt the .dat files from that, especially not since it would presumably be a trivial change in rewrite_dat.pl to insert extra newlines between fields when needed. (Obviously, if a field value is so wide it runs past 80 columns on its own, it's not rewrite_dat.pl's charter to fix that.) > Open items: > -Test MSVC. Again, while I'd be happy if someone did that manually, I'm prepared to let the buildfarm do it. > -Arrange for rewrite_dat.pl to run when perltidy does. What I was thinking we should have is a convenience target in include/Makefile to do this, say "make reformat-dat-files". I'm not that excited about bundling it into pgindent runs. > -Maybe document examples of how to do bulk-editing of data files? +1. In the end, that's the reason we're doing all this work, so showing people how to benefit seems like a good thing. regards, tom lane
On 3/25/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Well, as I said, I hadn't really reviewed the .dat files, but if that's > what you're doing I'm going to request a change. Project style is to > fit in 80 columns as much as possible. I do not see a reason to exempt > the .dat files from that, especially not since it would presumably be a > trivial change in rewrite_dat.pl to insert extra newlines between fields > when needed. (Obviously, if a field value is so wide it runs past 80 > columns on its own, it's not rewrite_dat.pl's charter to fix that.) This feature is working now. I've attached a 100-line sample of all the catalogs' files for viewing. Note, this is pretty raw output, without the clean-up step from patch 0002. In the most of the original DATA() lines, there was no spacing between entries except in some cases to separate groups (often with a comment to describe the group). My clean-up patch tried to make that more consistent. For this sample, it would add blank lines before the comments in pg_amop, and remove blank lines from the first few entries in pg_type. If you wanted to opine on that before I rework that patch, I'd be grateful. Also, these data entries have default values removed, but they don't have human-readable OID macros. (I'll have to adjust that script to the 80-column limit as well). >> -Arrange for rewrite_dat.pl to run when perltidy does. > > What I was thinking we should have is a convenience target in > include/Makefile to do this, say "make reformat-dat-files". > I'm not that excited about bundling it into pgindent runs. I've attached a draft patch for this. If it's okay, I'll incorporate it into the series. I think reformat_dat_files.pl also works as a better script name. >> -Maybe document examples of how to do bulk-editing of data files? > > +1. In the end, that's the reason we're doing all this work, so showing > people how to benefit seems like a good thing. It seems like with that, it'd be good to split off the data-format section of the README into a new file, maybe README.data, which will contain code snippets and some example scenarios. I'll include the example pg_proc.prokind merger among those. -John Naylor
With the attachments this time. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > With the attachments this time. Layout of .dat files seems generally reasonable, but I don't understand the proposed make rule: +reformat-dat-files: + $(PERL) -I $(catalogdir) $< catalog/rewrite_dat.pl -o catalog catalog/pg_*.dat This rule has no prerequisite, so what's $< supposed to be? Also, I think the rule probably ought to be located in src/include/catalog/Makefile, because that's typically where you'd be cd'd to when messing with the .dat files, I'd think. (Hm, I see no such makefile, but maybe it's time for one. A convenience rule located one level up doesn't seem very convenient.) > My clean-up patch tried to make that more consistent. For this sample, > it would add blank lines before the comments in pg_amop, and remove > blank lines from the first few entries in pg_type. If you wanted to > opine on that before I rework that patch, I'd be grateful. No particular objection to either. >>> -Maybe document examples of how to do bulk-editing of data files? >> +1. In the end, that's the reason we're doing all this work, so showing >> people how to benefit seems like a good thing. > It seems like with that, it'd be good to split off the data-format > section of the README into a new file, maybe README.data, which will > contain code snippets and some example scenarios. I'll include the > example pg_proc.prokind merger among those. It would be more work, but maybe we should move this into the main SGML docs. It seems rather silly to have SGML documentation for the .BKI file format, which now will be an internal matter that hardly any developers need worry about, but not for the .DAT file format. But I understand if that seems a bridge too far for today --- certainly a README file is way better than nothing. regards, tom lane
> On Mar 26, 2018, at 10:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote > Layout of .dat files seems generally reasonable, but I don't understand > the proposed make rule: > > +reformat-dat-files: > + $(PERL) -I $(catalogdir) $< catalog/rewrite_dat.pl -o catalog catalog/pg_*.dat > > This rule has no prerequisite, so what's $< supposed to be? Also, I think > the rule probably ought to be located in src/include/catalog/Makefile, > because that's typically where you'd be cd'd to when messing with the > .dat files, I'd think. (Hm, I see no such makefile, but maybe it's time > for one. A convenience rule located one level up doesn't seem very > convenient.) > Oops, copy-pasto. And I’ll see about a new Makefile. >> It seems like with that, it'd be good to split off the data-format >> section of the README into a new file, maybe README.data, which will >> contain code snippets and some example scenarios. I'll include the >> example pg_proc.prokind merger among those. > > It would be more work, but maybe we should move this into the main > SGML docs. It seems rather silly to have SGML documentation for the > .BKI file format, which now will be an internal matter that hardly > any developers need worry about, but not for the .DAT file format. > But I understand if that seems a bridge too far for today --- certainly > a README file is way better than nothing. Makes sense on all points. I’m not optimistic about creating a new sgml doc on time, but I’ll keep it in mind. -John Naylor
Tom Lane wrote: >> -Maybe document examples of how to do bulk-editing of data files? > > +1. In the end, that's the reason we're doing all this work, so showing > people how to benefit seems like a good thing. I'll hold off on posting a new patchset until I add this to the documentation, but I wanted to report on a couple of other things: While adjusting to the 80-column limit, I encountered a separation of concerns violation between Catalog.pm and reformat_dat_files.pl that I hadn't noticed before. Fixing that made things easier to read, with fewer lines of code. Speaking of bulk editing, that would be done via adopting reformat_dat_files.pl to the task at hand. I did this myself for two of the conversion helper scripts. However, enough bitrot has now occurred that to make the relationship murky. Since I had to adopt them to the 80-column limit as well, I shaved all the irrelevant differences away, and now they're just a small diff away from the reformat script. I also added block comments to help developers find where they need to edit the script. Since reformat_dat_files.pl has been substantially altered, I'll attach it here, along with the diffs to the the helper scripts. I wrote: > I’ll see about a new Makefile. I've attached a draft of this. I thought about adding a call to duplicate_oids here, but this won't run unless you've run configure first, and if you've done that, you've likely built already, running duplicate_oids in the process. I think I'll consolidate all documentation patches into one, at the end of the series for maximum flexibility. I liked the idea of spreading the doc changes over the patches, but there is not a huge amount of time left. -John Naylor
Attachment
Attached is v13, rebased against b0c90c85fc. Patch 0001: -The data files are formatted to at most 80 columns wide. -Rename rewrite_dat.pl to reformat_dat_file.pl. -Refactor Catalog.pm and reformat_dat_file.pl to have better separation of concerns. -Add src/include/catalog/Makefile with convenience targets for rewriting data files. Patch 0002: -Some adjustments to the post-conversion cleanup of data files. Patch 0005: -I made a stub version of Solution.pm to simulate testing the MSVC build. This found one bug, and also allowed me to bring in some of the more pedantic dependencies I added to utils/Makefile. Patch 0009: -New patch that puts all doc changes in one patch, for flexibility. -Split the parts of catalog/README having to do with data into a new README.data file. Add recipes for how to edit data, with code examples. On 3/26/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It would be more work, but maybe we should move this into the main > SGML docs. It seems rather silly to have SGML documentation for the > .BKI file format, which now will be an internal matter that hardly > any developers need worry about, but not for the .DAT file format. > But I understand if that seems a bridge too far for today --- certainly > a README file is way better than nothing. I had an idea to make it less silly without doing as much work: Get rid of the SGML docs for the BKI format, and turn them into bootstrap/README. Thoughts? And in the department of second thoughts, it occurred to me that the only reason that the .dat files are in include/catalog is because that's where the DATA() statements were. Since they are separate now, one could make the case that they actually belong in backend/catalog. One trivial advantage here is that there is already an existing Makefile in which to put convenience targets for formatting. On the other hand, it kind of makes sense to have the files describing the schema (.h) and the contents (.dat) in the same directory. I'm inclined to leave things as they are for that reason. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > And in the department of second thoughts, it occurred to me that the > only reason that the .dat files are in include/catalog is because > that's where the DATA() statements were. Since they are separate now, > one could make the case that they actually belong in backend/catalog. > One trivial advantage here is that there is already an existing > Makefile in which to put convenience targets for formatting. On the > other hand, it kind of makes sense to have the files describing the > schema (.h) and the contents (.dat) in the same directory. I'm > inclined to leave things as they are for that reason. Yeah. The fact that, eg, both the .h and .dat files are inputs to duplicate_oids and unused_oids makes me think it's better to keep them together. I'd actually been thinking of something that's about the reverse: instead of building the derived .h files in backend/catalog and then symlinking them into include/catalog, it'd be saner to build them in include/catalog to begin with. However, that would mean that the Perl scripts need to produce output in two different places, so maybe it'd end up more complicated not less so. In any case, that seems like something to leave for another day. regards, tom lane
I'm starting to look through v13 seriously, and one thing struck me that could use some general discussion: what is our policy going to be for choosing the default values for catalog columns? In particular, I noticed that you have for pg_proc bool proisstrict BKI_DEFAULT(f); char provolatile BKI_DEFAULT(v); char proparallel BKI_DEFAULT(u); which do not comport at all with the most common values in those columns. As of HEAD, I see postgres=# select proisstrict, count(*) from pg_proc group by 1; proisstrict | count -------------+------- f | 312 t | 2640 (2 rows) postgres=# select provolatile, count(*) from pg_proc group by 1; provolatile | count -------------+------- i | 2080 s | 570 v | 302 (3 rows) postgres=# select proparallel, count(*) from pg_proc group by 1; proparallel | count -------------+------- r | 139 s | 2722 u | 91 (3 rows) (Since this is from the final initdb state, this overstates the number of .bki entries for pg_proc a bit, but not by much.) I think there's no question that the default for proisstrict ought to be "true" --- not only is that by far the more common choice, but it's actually the safer choice. A C function that needs to be marked strict and isn't will at best do the wrong thing, and quite likely will crash, if passed a NULL value. The defaults for provolatile and proparallel maybe require more thought though. What you've chosen corresponds to the default assumptions of CREATE FUNCTION, which are what we need for user-defined functions that we don't know anything about; but I'm not sure that makes them the best defaults for built-in functions. I'm inclined to go with the majority values here, in part because that will make the outliers stand out when looking at pg_proc.dat. I don't think it's great that we'll have 2800+ entries explicitly marked with proparallel 'i' or 's', but the less-than- 100 with proparallel 'u' will be so only implicitly because the rewrite script will strip out any field entries that match the default. That's really the worst of all worlds: it'd be better to have no default in this column at all, I think, than to behave like that. In short, I'm tempted to say that when there's a clear majority of entries that would use a particular default, that's the default we should use, whether or not it's "surprising" or "unsafe" according to the semantics. It's clearly not "surprising" for a C function to be marked proparallel 's'; the other cases are more so. I'm not seeing any other BKI_DEFAULT choices that I'm inclined to question, so maybe it's a mistake to try to derive any general policy choices from such a small number of cases. But anyway I'm inclined to change these cases. Comments anyone? regards, tom lane
Hi, On 2018-04-04 18:29:31 -0400, Tom Lane wrote: > I'm starting to look through v13 seriously, and one thing struck > me that could use some general discussion: what is our policy > going to be for choosing the default values for catalog columns? > > [...] > > In short, I'm tempted to say that when there's a clear majority of > entries that would use a particular default, that's the default we > should use, whether or not it's "surprising" or "unsafe" according > to the semantics. It's clearly not "surprising" for a C function > to be marked proparallel 's'; the other cases are more so. > > [...] > > I'm not seeing any other BKI_DEFAULT choices that I'm inclined to > question, so maybe it's a mistake to try to derive any general > policy choices from such a small number of cases. But anyway > I'm inclined to change these cases. > > Comments anyone? I think choosing SQL defaults is defensible, but so is choosing the most common value as default to make uncommon stand out more, and so is choosing the safest values. In short, I don't think it matters terribly much, we just should try to be reasonably consistent about. Greetings, Andres Freund
Here are the results of an evening's desultory hacking on v13. I was dissatisfied with the fact that we still had several function-referencing columns that had numeric instead of symbolic contents, for instance pg_aggregate.aggfnoid. Of course, the main reason is that those are declared regproc but reference functions with overloaded names, which regproc can't handle. Now that the lookups are being done in genbki.pl there's no reason why we have to live with that limitation. In the attached, I've generalized the BKI_LOOKUP(pg_proc) code so that you can use either regproc-like or regprocedure-like notation, and then applied that to relevant columns. I did not like the hard-wired handling of proargtypes and proallargtypes in genbki.pl; it hardly seems impossible that we'll want similar conversions for other array-of-OID columns in future. After a bit of thought, it seemed like we could allow oidvector proargtypes BKI_LOOKUP(pg_type); Oid proallargtypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type); and just teach genbki.pl that if a lookup rule is attached to an oidvector or Oid[] column, it means to apply the rule to each array element individually. I also changed genbki.pl so that it'd warn about entries that aren't recognized by the lookup rules. This seems like a good idea for catching errors, such as (ahem) applying BKI_LOOKUP to a column that isn't even an OID. bootstrap-v13-delta.patch is a diff atop your patch series for the in-tree files, and convert_oid2name.patch adjusts that script to make use of the additional conversion capability. regards, tom lane diff --git a/src/backend/catalog/README.data b/src/backend/catalog/README.data index b7c680c..22ad0f2 100644 *** a/src/backend/catalog/README.data --- b/src/backend/catalog/README.data *************** teach Catalog::ParseData() how to expand *** 62,71 **** representation. - To aid readability, some values that are references to other catalog ! entries are represented by macros rather than numeric OIDs. This is ! the case for index access methods, opclasses, operators, opfamilies, ! and types. This is done for functions as well, but only if the proname ! is unique. Bootstrap Data Conventions ========================== --- 62,103 ---- representation. - To aid readability, some values that are references to other catalog ! entries are represented by names rather than numeric OIDs. Currently ! this is the case for access methods, functions, operators, opclasses, ! opfamilies, and types. The rules are as follows: ! ! * Use of names rather than numbers is enabled for a particular catalog ! column by attaching BKI_LOOKUP(lookuprule) to the column's definition, ! where "lookuprule" is pg_am, pg_proc, pg_operator, pg_opclass, ! pg_opfamily, or pg_type. ! ! * In a name-lookup column, all entries must use the name format except ! when writing "0" for InvalidOid. (If the column is declared regproc, ! you can optionally write "-" instead of "0".) genbki.pl will warn ! about unrecognized names. ! ! * Access methods are just represented by their names, as are types. ! Type names must match the referenced pg_type entry's typname; you ! do not get to use any aliases such as "integer" for "int4". ! ! * A function can be represented by its proname, if that is unique among ! the pg_proc.dat entries (this works like regproc input). Otherwise, ! write it as "proname(argtypename,argtypename,...)", like regprocedure. ! The argument type names must be spelled exactly as they are in the ! pg_proc.dat entry's proargtypes field. Do not insert any spaces. ! ! * Operators are represented by "oprname(lefttype,righttype)", writing the ! type names exactly as they appear in the pg_operator.dat entry's oprleft ! and oprright fields. (Write 0 for the omitted operand of a unary ! operator.) ! ! * The names of opclasses and opfamilies are only unique within an access ! method, so they are represented by "access_method_name/object_name". ! ! In none of these cases is there any provision for schema-qualification; ! all objects created during bootstrap are expected to be in the pg_catalog ! schema. ! Bootstrap Data Conventions ========================== *************** You can also use the duplicate_oids scri *** 105,111 **** build if a duplicate is found.) - The OID counter starts at 10000 at bootstrap. If a catalog row is ! in a table that requires OIDs, but no OID was preassigned by an "OID =" clause, then it will receive an OID of 10000 or above. --- 137,143 ---- build if a duplicate is found.) - The OID counter starts at 10000 at bootstrap. If a catalog row is ! in a table that requires OIDs, but no OID was preassigned by an "oid =>" clause, then it will receive an OID of 10000 or above. diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index 27494d9..f6be50a 100644 *** a/src/backend/catalog/genbki.pl --- b/src/backend/catalog/genbki.pl *************** foreach my $row (@{ $catalog_data{pg_opf *** 169,181 **** my %procoids; foreach my $row (@{ $catalog_data{pg_proc} }) { ! if (defined($procoids{ $row->{proname} })) { ! $procoids{ $row->{proname} } = 'MULTIPLE'; } else { ! $procoids{ $row->{proname} } = $row->{oid}; } } --- 169,197 ---- my %procoids; foreach my $row (@{ $catalog_data{pg_proc} }) { ! # Generate an entry under just the proname (corresponds to regproc lookup) ! my $prokey = $row->{proname}; ! if (defined($procoids{ $prokey })) { ! $procoids{ $prokey } = 'MULTIPLE'; } else { ! $procoids{ $prokey } = $row->{oid}; ! } ! # Also generate an entry using proname(proargtypes). This is not quite ! # identical to regprocedure lookup because we don't worry much about ! # special SQL names for types etc; we just use the names in the source ! # proargtypes field. These *should* be unique, but do a multiplicity ! # check anyway. ! $prokey .= '(' . join(',', split(/\s+/, $row->{proargtypes})) . ')'; ! if (defined($procoids{ $prokey })) ! { ! $procoids{ $prokey } = 'MULTIPLE'; ! } ! else ! { ! $procoids{ $prokey } = $row->{oid}; } } *************** EOM *** 294,300 **** print $def $line; } ! # Open it, unless bootstrap case (create bootstrap does this # automatically) if (!$catalog->{bootstrap}) { --- 310,316 ---- print $def $line; } ! # Open it, unless it's a bootstrap catalog (create bootstrap does this # automatically) if (!$catalog->{bootstrap}) { *************** EOM *** 323,375 **** $bki_values{$attname} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g; $bki_values{$attname} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g; ! # Replace OID macros with OIDs. ! # If we don't have a unique value to substitute, just do ! # nothing. This should only happen in the case for functions, ! # in which case regprocin will complain. if ($column->{lookup}) { my $lookup = $lookup_kind{ $column->{lookup} }; - my $lookupoid = $lookup->{ $bki_values{$attname} }; - $bki_values{$attname} = $lookupoid - if defined($lookupoid) && $lookupoid ne 'MULTIPLE'; - } - } ! # Some pg_proc columns contain lists of types, so we must unpack ! # these and do the lookups on each element in turn. ! if ($catname eq 'pg_proc') ! { ! # proargtypes ! if ($bki_values{proargtypes}) ! { ! my @argtypenames = split /\s+/, $bki_values{proargtypes}; ! my @argtypeoids; ! foreach my $argtypename (@argtypenames) { ! my $argtypeoid = $typeoids{$argtypename}; ! push @argtypeoids, $argtypeoid; } ! $bki_values{proargtypes} = join(' ', @argtypeoids); ! } ! ! # proallargtypes ! if ($bki_values{proallargtypes} ne '_null_') ! { ! $bki_values{proallargtypes} =~ s/[{}]//g; ! my @argtypenames = split /,/, $bki_values{proallargtypes}; ! my @argtypeoids; ! foreach my $argtypename (@argtypenames) { ! my $argtypeoid = $typeoids{$argtypename}; ! push @argtypeoids, $argtypeoid; } - $bki_values{proallargtypes} = - sprintf "{%s}", join(',', @argtypeoids); } } ! elsif ($catname eq 'pg_type' and !exists $bki_values{oid_symbol}) { my $symbol = form_pg_type_symbol($bki_values{typname}); $bki_values{oid_symbol} = $symbol --- 339,423 ---- $bki_values{$attname} =~ s/\bPGUID\b/$BOOTSTRAP_SUPERUSERID/g; $bki_values{$attname} =~ s/\bPGNSP\b/$PG_CATALOG_NAMESPACE/g; ! # Replace OID synonyms with OIDs per the appropriate lookup rule. ! # ! # If the column type is oidvector or oid[], we have to replace ! # each element of the array as per the lookup rule. ! # ! # If we don't have a unique value to substitute, warn and ! # leave the entry unchanged. if ($column->{lookup}) { my $lookup = $lookup_kind{ $column->{lookup} }; ! die "unrecognized BKI_LOOKUP type " . $column->{lookup} ! if !defined($lookup); ! if ($atttype eq 'oidvector') { ! my @lookupnames = split /\s+/, $bki_values{$attname}; ! my @lookupoids; ! foreach my $lookupname (@lookupnames) ! { ! my $lookupoid = $lookup->{ $lookupname }; ! if (defined($lookupoid) && $lookupoid ne 'MULTIPLE') ! { ! $lookupname = $lookupoid; ! } ! else ! { ! warn "unresolved OID reference \"$lookupname\" in $catname row " . join(',', values(%bki_values)) ! if $lookupname ne '-' && $lookupname ne '0'; ! } ! push @lookupoids, $lookupname; ! } ! $bki_values{$attname} = join(' ', @lookupoids); } ! elsif ($atttype eq 'oid[]') { ! if ($bki_values{$attname} ne '_null_') ! { ! $bki_values{$attname} =~ s/[{}]//g; ! my @lookupnames = split /,/, $bki_values{$attname}; ! my @lookupoids; ! foreach my $lookupname (@lookupnames) ! { ! my $lookupoid = $lookup->{ $lookupname }; ! if (defined($lookupoid) && $lookupoid ne 'MULTIPLE') ! { ! $lookupname = $lookupoid; ! } ! else ! { ! warn "unresolved OID reference \"$lookupname\" in $catname row " . join(',', values(%bki_values)) ! if $lookupname ne '-' && $lookupname ne '0'; ! } ! push @lookupoids, $lookupname; ! } ! $bki_values{$attname} = ! sprintf "{%s}", join(',', @lookupoids); ! } ! } ! else ! { ! my $lookupname = $bki_values{$attname}; ! my $lookupoid = $lookup->{ $lookupname }; ! if (defined($lookupoid) && $lookupoid ne 'MULTIPLE') ! { ! $bki_values{$attname} = $lookupoid; ! } ! else ! { ! warn "unresolved OID reference \"$lookupname\" in $catname row " . join(',', values(%bki_values)) ! if $lookupname ne '-' && $lookupname ne '0'; ! } } } } ! ! # Special hack to generate OID symbols for pg_type entries ! # that lack one. ! if ($catname eq 'pg_type' and !exists $bki_values{oid_symbol}) { my $symbol = form_pg_type_symbol($bki_values{typname}); $bki_values{oid_symbol} = $symbol diff --git a/src/include/catalog/pg_aggregate.h b/src/include/catalog/pg_aggregate.h index 226bb07..767bab5 100644 *** a/src/include/catalog/pg_aggregate.h --- b/src/include/catalog/pg_aggregate.h *************** *** 31,37 **** CATALOG(pg_aggregate,2600) BKI_WITHOUT_OIDS { /* pg_proc OID of the aggregate itself */ ! regproc aggfnoid; /* aggregate kind, see AGGKIND_ categories below */ char aggkind BKI_DEFAULT(n); --- 31,37 ---- CATALOG(pg_aggregate,2600) BKI_WITHOUT_OIDS { /* pg_proc OID of the aggregate itself */ ! regproc aggfnoid BKI_LOOKUP(pg_proc); /* aggregate kind, see AGGKIND_ categories below */ char aggkind BKI_DEFAULT(n); diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h index f045bfa..accbe83 100644 *** a/src/include/catalog/pg_amproc.h --- b/src/include/catalog/pg_amproc.h *************** CATALOG(pg_amproc,2603) *** 54,63 **** Oid amprocrighttype BKI_LOOKUP(pg_type); /* support procedure index */ ! int16 amprocnum BKI_LOOKUP(pg_type); /* OID of the proc */ ! regproc amproc; } FormData_pg_amproc; /* ---------------- --- 54,63 ---- Oid amprocrighttype BKI_LOOKUP(pg_type); /* support procedure index */ ! int16 amprocnum; /* OID of the proc */ ! regproc amproc BKI_LOOKUP(pg_proc); } FormData_pg_amproc; /* ---------------- diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h index 2674701..f3bc3c0 100644 *** a/src/include/catalog/pg_cast.h --- b/src/include/catalog/pg_cast.h *************** CATALOG(pg_cast,2605) *** 39,45 **** Oid casttarget BKI_LOOKUP(pg_type); /* cast function; 0 = binary coercible */ ! Oid castfunc; /* contexts in which cast can be used */ char castcontext; --- 39,45 ---- Oid casttarget BKI_LOOKUP(pg_type); /* cast function; 0 = binary coercible */ ! Oid castfunc BKI_LOOKUP(pg_proc); /* contexts in which cast can be used */ char castcontext; diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 8d5d044..b9d9cfd 100644 *** a/src/include/catalog/pg_proc.h --- b/src/include/catalog/pg_proc.h *************** CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ *** 49,55 **** float4 prorows BKI_DEFAULT(0); /* element type of variadic array, or 0 */ ! Oid provariadic BKI_DEFAULT(0); /* transforms calls to it during planning */ regproc protransform BKI_DEFAULT(0) BKI_LOOKUP(pg_proc); --- 49,55 ---- float4 prorows BKI_DEFAULT(0); /* element type of variadic array, or 0 */ ! Oid provariadic BKI_DEFAULT(0) BKI_LOOKUP(pg_type); /* transforms calls to it during planning */ regproc protransform BKI_DEFAULT(0) BKI_LOOKUP(pg_proc); *************** CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ *** 90,101 **** */ /* parameter types (excludes OUT params) */ ! oidvector proargtypes; #ifdef CATALOG_VARLEN /* all param types (NULL if IN only) */ ! Oid proallargtypes[1] BKI_DEFAULT(_null_); /* parameter modes (NULL if IN only) */ char proargmodes[1] BKI_DEFAULT(_null_); --- 90,101 ---- */ /* parameter types (excludes OUT params) */ ! oidvector proargtypes BKI_LOOKUP(pg_type); #ifdef CATALOG_VARLEN /* all param types (NULL if IN only) */ ! Oid proallargtypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type); /* parameter modes (NULL if IN only) */ char proargmodes[1] BKI_DEFAULT(_null_); diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h index 381da18..8992fcd 100644 *** a/src/include/catalog/pg_type.h --- b/src/include/catalog/pg_type.h *************** CATALOG(pg_type,1247) BKI_BOOTSTRAP BKI_ *** 109,121 **** * * typelem != 0 and typlen == -1. */ ! Oid typelem BKI_DEFAULT(0); /* * If there is a "true" array type having this type as element type, * typarray links to it. Zero if no associated "true" array type. */ ! Oid typarray; /* * I/O conversion procedures for the datatype. --- 109,121 ---- * * typelem != 0 and typlen == -1. */ ! Oid typelem BKI_DEFAULT(0) BKI_LOOKUP(pg_type); /* * If there is a "true" array type having this type as element type, * typarray links to it. Zero if no associated "true" array type. */ ! Oid typarray BKI_DEFAULT(0) BKI_LOOKUP(pg_type); /* * I/O conversion procedures for the datatype. *** boot-13/convert_oid2name.pl~ Sat Mar 31 07:53:29 2018 --- boot-13/convert_oid2name.pl Wed Apr 4 21:45:05 2018 *************** *** 129,146 **** $row->{oprname}, $typenames{$row->{oprleft}}, $typenames{$row->{oprright}}; } ! # Proc oid lookup. my %procoids; foreach my $row (@{ $catalog_data{pg_proc} }) { next if !ref $row; if (defined($procoids{ $row->{proname} })) { $procoids{ $row->{proname} } = 'MULTIPLE'; } else { ! $procoids{ $row->{oid} } = $row->{proname}; } } --- 129,153 ---- $row->{oprname}, $typenames{$row->{oprleft}}, $typenames{$row->{oprright}}; } ! # Proc oid lookup (see lookup_procname). ! my %procshortnames; ! my %proclongnames; my %procoids; foreach my $row (@{ $catalog_data{pg_proc} }) { next if !ref $row; + $procshortnames{ $row->{oid} } = $row->{proname}; + $proclongnames{ $row->{oid} } = sprintf "%s(%s)", + $row->{proname}, + join(',', map($typenames{$_}, split(/\s+/, $row->{proargtypes}))); + # We use this to track whether a proname is duplicated. if (defined($procoids{ $row->{proname} })) { $procoids{ $row->{proname} } = 'MULTIPLE'; } else { ! $procoids{ $row->{proname} } = $row->{oid}; } } *************** *** 186,191 **** --- 193,200 ---- if ($catname eq 'pg_proc') { + $values{provariadic} = $typenames{$values{provariadic}} + if exists $values{provariadic}; $values{prorettype} = $typenames{$values{prorettype}}; if ($values{proargtypes}) { *************** *** 211,222 **** --- 220,236 ---- } elsif ($catname eq 'pg_aggregate') { + $values{aggfnoid} = lookup_procname($values{aggfnoid}); $values{aggsortop} = $opernames{$values{aggsortop}} if exists $values{aggsortop}; $values{aggtranstype} = $typenames{$values{aggtranstype}}; $values{aggmtranstype} = $typenames{$values{aggmtranstype}} if exists $values{aggmtranstype}; } + elsif ($catname eq 'pg_am') + { + $values{aggfnoid} = lookup_procname($values{aggfnoid}); + } elsif ($catname eq 'pg_amop') { $values{amoplefttype} = $typenames{$values{amoplefttype}}; *************** *** 232,242 **** --- 246,258 ---- $values{amprocfamily} = $opfnames{$values{amprocfamily}}; $values{amproclefttype} = $typenames{$values{amproclefttype}}; $values{amprocrighttype} = $typenames{$values{amprocrighttype}}; + $values{amproc} = lookup_procname($values{amproc}); } elsif ($catname eq 'pg_cast') { $values{castsource} = $typenames{$values{castsource}}; $values{casttarget} = $typenames{$values{casttarget}}; + $values{castfunc} = lookup_procname($values{castfunc}); } elsif ($catname eq 'pg_opclass') { *************** *** 255,260 **** --- 271,277 ---- if exists $values{oprcom}; $values{oprnegate} = $opernames{$values{oprnegate}} if exists $values{oprnegate}; + $values{oprcode} = lookup_procname($values{oprcode}); } elsif ($catname eq 'pg_opfamily') { *************** *** 266,271 **** --- 283,295 ---- $values{rngsubtype} = $typenames{$values{rngsubtype}}; $values{rngsubopc} = $opcnames{$values{rngsubopc}}; } + elsif ($catname eq 'pg_type') + { + $values{typelem} = $typenames{$values{typelem}} + if exists $values{typelem}; + $values{typarray} = $typenames{$values{typarray}} + if exists $values{typarray}; + } ############################################################ *************** *** 412,417 **** --- 436,454 ---- return $hash_str; } + sub lookup_procname + { + my $oid = shift; + return $oid if !defined($oid) || $oid eq '-' || $oid eq '0'; + my $shortname = $procshortnames{$oid}; + return $shortname if defined($shortname) && + defined($procoids{$shortname}) && + $procoids{$shortname} eq $oid; + my $longname = $proclongnames{$oid}; + return $longname if defined($longname); + return $oid; + } + sub usage { die <<EOM;
On 4/5/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Here are the results of an evening's desultory hacking on v13. > > [numeric function oids with overloaded name] Thank you for the detailed review and for improving the function references (not to mention the type references I somehow left on the table). I was also not quite satisfied with just the regproc columns. > I did not like the hard-wired handling of proargtypes and proallargtypes > in genbki.pl; it hardly seems impossible that we'll want similar > conversions for other array-of-OID columns in future. After a bit of > thought, it seemed like we could allow > > oidvector proargtypes BKI_LOOKUP(pg_type); > > Oid proallargtypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type); > > and just teach genbki.pl that if a lookup rule is attached to > an oidvector or Oid[] column, it means to apply the rule to > each array element individually. I think that's a good idea. I went an extra step and extracted the common logic into a function (attached draft patch to be applied on top of yours). It treats all lookups as operating on arrays. The common case is that we pass a single-element array. That may seem awkward, but I think it's clear. The code is slimmer, and the lines now fit within 80 characters. > I also changed genbki.pl so that it'd warn about entries that aren't > recognized by the lookup rules. This seems like a good idea for > catching errors, such as (ahem) applying BKI_LOOKUP to a column > that isn't even an OID. Yikes, I must have fat-fingered that during the comment reformatting. Unrelated, I noticed my quoting of defaults that contain back-slashes was half-baked, so I'll include that fix in the next patchset. I'll put out a new one in a couple days, to give a chance for further review and discussion of the defaults. I didn't feel the need to respond to the other messages, but yours and Andres' points are well taken. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > On 4/5/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I did not like the hard-wired handling of proargtypes and proallargtypes >> in genbki.pl; it hardly seems impossible that we'll want similar >> conversions for other array-of-OID columns in future. After a bit of >> thought, it seemed like we could allow >> oidvector proargtypes BKI_LOOKUP(pg_type); >> Oid proallargtypes[1] BKI_DEFAULT(_null_) BKI_LOOKUP(pg_type); >> and just teach genbki.pl that if a lookup rule is attached to >> an oidvector or Oid[] column, it means to apply the rule to >> each array element individually. > I think that's a good idea. I went an extra step and extracted the > common logic into a function (attached draft patch to be applied on > top of yours). It treats all lookups as operating on arrays. The > common case is that we pass a single-element array. That may seem > awkward, but I think it's clear. The code is slimmer, and the lines > now fit within 80 characters. Sounds good. I too was bothered by the duplication of code, but I'm not a good enough Perl programmer to have thought of that solution. Something that bothered me a bit while writing the warning-producing code is that showing %bki_values isn't actually that great a way of identifying the trouble spot. By this point we've expanded out defaults and possibly replaced some other macros, so it doesn't look that much like what was in the .dat file. I think what would be ideal, both here and in some other places like AddDefaultValues, is to be able to finger the location of the bad tuple by filename and line number, but I have no idea whether it's practical to annotate the tuples with that while reading the .dat files. Any thoughts? (Obviously, better error messages could be a future improvement; it's not something we have to get done before the conversion.) > Unrelated, I noticed my quoting of defaults that contain back-slashes > was half-baked, so I'll include that fix in the next patchset. I'll > put out a new one in a couple days, to give a chance for further > review and discussion of the defaults. I didn't feel the need to > respond to the other messages, but yours and Andres' points are well > taken. We're getting down to the wire here --- I think the plan is to close the CF on Saturday or Sunday, and then push the bootstrap changes right after that. So please turn around whatever you're planning to do ASAP. I'm buckling down to a final review today and tomorrow. regards, tom lane
I experimented with converting all frontend code to include just the catalog/pg_foo_d.h files instead of catalog/pg_foo.h, as per the proposed new policy. I soon found that we'd overlooked one thing: some clients expect to see the relation OID macros, eg LargeObjectRelationId. Attached is a patch that changes things around so that those appear in the _d files instead of the master files. This is cleaner anyway because it removes duplication of the OIDs in the master files, with attendant risk of error. For example we have this change in pg_aggregate.h: -#define AggregateRelationId 2600 - -CATALOG(pg_aggregate,2600) BKI_WITHOUT_OIDS +CATALOG(pg_aggregate,2600,AggregateRelationId) BKI_WITHOUT_OIDS Some of the CATALOG lines spill well past 80 characters with this, although many of the affected ones already were overlength, eg -#define DatabaseRelationId 1262 -#define DatabaseRelation_Rowtype_Id 1248 - -CATALOG(pg_database,1262) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248) BKI_SCHEMA_MACRO +CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248,DatabaseRelation_Rowtype_Id) BKI_SCHEMA_MACRO I thought about improving that by removing the restriction that these BKI annotations appear on the same line as the CATALOG macro, so that we could break the above into several lines. I think the original key reason for the restriction was to avoid accidentally taking some bit of a DATA line as a BKI annotation. With the DATA lines gone from these files, that's no longer a significant hazard (although passing references to BKI keywords in comments might still be hazards for the Perl scripts). However, if we try to format things like CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248,DatabaseRelation_Rowtype_Id) BKI_SCHEMA_MACRO { fields... } I'm afraid that neither pgindent nor a lot of common editors would indent that very nicely. So at least for the moment I'm inclined to just keep it all on one line ... we know how that behaves, anyway. regards, tom lane diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 519247e..fb3d62a 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -104,18 +104,29 @@ sub ParseHeader { push @{ $catalog{indexing} }, "build indices\n"; } - elsif (/^CATALOG\(([^,]*),(\d+)\)/) + elsif (/^CATALOG\(([^,]*),(\d+),(\w+)\)/) { $catalog{catname} = $1; $catalog{relation_oid} = $2; + $catalog{relation_oid_macro} = $3; $catalog{bootstrap} = /BKI_BOOTSTRAP/ ? ' bootstrap' : ''; $catalog{shared_relation} = /BKI_SHARED_RELATION/ ? ' shared_relation' : ''; $catalog{without_oids} = /BKI_WITHOUT_OIDS/ ? ' without_oids' : ''; - $catalog{rowtype_oid} = - /BKI_ROWTYPE_OID\((\d+)\)/ ? " rowtype_oid $1" : ''; + if (/BKI_ROWTYPE_OID\((\d+),(\w+)\)/) + { + $catalog{rowtype_oid} = $1; + $catalog{rowtype_oid_clause} = " rowtype_oid $1"; + $catalog{rowtype_oid_macro} = $2; + } + else + { + $catalog{rowtype_oid} = ''; + $catalog{rowtype_oid_clause} = ''; + $catalog{rowtype_oid_macro} = ''; + } $catalog{schema_macro} = /BKI_SCHEMA_MACRO/ ? 1 : 0; $declaring_attributes = 1; } diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index f6be50a..fe8c3ca 100644 --- a/src/backend/catalog/genbki.pl +++ b/src/backend/catalog/genbki.pl @@ -228,6 +228,7 @@ my @tables_needing_macros; # produce output, one catalog at a time foreach my $catname (@catnames) { + my $catalog = $catalogs{$catname}; # Create one definition header with macro definitions for each catalog. my $def_file = $output_path . $catname . '_d.h'; @@ -258,13 +259,21 @@ foreach my $catname (@catnames) EOM + # Emit OID macros for catalog's OID and rowtype OID, if wanted + print $def + sprintf("#define %s %s\n", $catalog->{relation_oid_macro}, $catalog->{relation_oid}) + if $catalog->{relation_oid_macro} ne ''; + print $def + sprintf("#define %s %s\n", $catalog->{rowtype_oid_macro}, $catalog->{rowtype_oid}) + if $catalog->{rowtype_oid_macro} ne ''; + print $def "\n"; + # .bki CREATE command for this catalog - my $catalog = $catalogs{$catname}; print $bki "create $catname $catalog->{relation_oid}" . $catalog->{shared_relation} . $catalog->{bootstrap} . $catalog->{without_oids} - . $catalog->{rowtype_oid} . "\n"; + . $catalog->{rowtype_oid_clause} . "\n"; my $first = 1; diff --git a/src/include/catalog/duplicate_oids b/src/include/catalog/duplicate_oids index 9732f61..0e6285f 100755 --- a/src/include/catalog/duplicate_oids +++ b/src/include/catalog/duplicate_oids @@ -15,7 +15,7 @@ while (<>) next if /^CATALOG\(.*BKI_BOOTSTRAP/; next unless /\boid *=> *'(\d+)'/ - || /^CATALOG\([^,]*, *(\d+).*BKI_ROWTYPE_OID\((\d+)\)/ + || /^CATALOG\([^,]*, *(\d+).*BKI_ROWTYPE_OID\((\d+),/ || /^CATALOG\([^,]*, *(\d+)/ || /^DECLARE_INDEX\([^,]*, *(\d+)/ || /^DECLARE_UNIQUE_INDEX\([^,]*, *(\d+)/ diff --git a/src/include/catalog/genbki.h b/src/include/catalog/genbki.h index af064fc..02c38c4 100644 --- a/src/include/catalog/genbki.h +++ b/src/include/catalog/genbki.h @@ -20,13 +20,13 @@ #define GENBKI_H /* Introduces a catalog's structure definition */ -#define CATALOG(name,oid) typedef struct CppConcat(FormData_,name) +#define CATALOG(name,oid,oidmacro) typedef struct CppConcat(FormData_,name) /* Options that may appear after CATALOG (on the same line) */ #define BKI_BOOTSTRAP #define BKI_SHARED_RELATION #define BKI_WITHOUT_OIDS -#define BKI_ROWTYPE_OID(oid) +#define BKI_ROWTYPE_OID(oid,oidmacro) #define BKI_SCHEMA_MACRO /* Options that may appear after an attribute (on the same line) */ @@ -34,7 +34,7 @@ #define BKI_FORCE_NOT_NULL /* Specifies a default value for a catalog field */ #define BKI_DEFAULT(value) -/* Indicates where to perform lookups for OID macros */ +/* Indicates how to perform name lookups for OID fields */ #define BKI_LOOKUP(catalog) /* The following are never defined; they are here only for documentation. */ diff --git a/src/include/catalog/pg_aggregate.h b/src/include/catalog/pg_aggregate.h index 767bab5..4d0ec01 100644 --- a/src/include/catalog/pg_aggregate.h +++ b/src/include/catalog/pg_aggregate.h @@ -26,9 +26,7 @@ * cpp turns this into typedef struct FormData_pg_aggregate * ---------------------------------------------------------------- */ -#define AggregateRelationId 2600 - -CATALOG(pg_aggregate,2600) BKI_WITHOUT_OIDS +CATALOG(pg_aggregate,2600,AggregateRelationId) BKI_WITHOUT_OIDS { /* pg_proc OID of the aggregate itself */ regproc aggfnoid BKI_LOOKUP(pg_proc); diff --git a/src/include/catalog/pg_am.h b/src/include/catalog/pg_am.h index d6454c5..5aa2bac 100644 --- a/src/include/catalog/pg_am.h +++ b/src/include/catalog/pg_am.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_am * ---------------- */ -#define AccessMethodRelationId 2601 - -CATALOG(pg_am,2601) +CATALOG(pg_am,2601,AccessMethodRelationId) { /* access method name */ NameData amname; diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h index 59842a6..a481691 100644 --- a/src/include/catalog/pg_amop.h +++ b/src/include/catalog/pg_amop.h @@ -51,9 +51,7 @@ * typedef struct FormData_pg_amop * ---------------- */ -#define AccessMethodOperatorRelationId 2602 - -CATALOG(pg_amop,2602) +CATALOG(pg_amop,2602,AccessMethodOperatorRelationId) { /* the index opfamily this entry is for */ Oid amopfamily BKI_LOOKUP(pg_opfamily); diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h index accbe83..d638e0c 100644 --- a/src/include/catalog/pg_amproc.h +++ b/src/include/catalog/pg_amproc.h @@ -40,9 +40,7 @@ * typedef struct FormData_pg_amproc * ---------------- */ -#define AccessMethodProcedureRelationId 2603 - -CATALOG(pg_amproc,2603) +CATALOG(pg_amproc,2603,AccessMethodProcedureRelationId) { /* the index opfamily this entry is for */ Oid amprocfamily BKI_LOOKUP(pg_opfamily); diff --git a/src/include/catalog/pg_attrdef.h b/src/include/catalog/pg_attrdef.h index 068ab64..16b106d 100644 --- a/src/include/catalog/pg_attrdef.h +++ b/src/include/catalog/pg_attrdef.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_attrdef * ---------------- */ -#define AttrDefaultRelationId 2604 - -CATALOG(pg_attrdef,2604) +CATALOG(pg_attrdef,2604,AttrDefaultRelationId) { Oid adrelid; /* OID of table containing attribute */ int16 adnum; /* attnum of attribute */ diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h index 1b3f306..69b651a 100644 --- a/src/include/catalog/pg_attribute.h +++ b/src/include/catalog/pg_attribute.h @@ -34,10 +34,7 @@ * You may need to change catalog/genbki.pl as well. * ---------------- */ -#define AttributeRelationId 1249 -#define AttributeRelation_Rowtype_Id 75 - -CATALOG(pg_attribute,1249) BKI_BOOTSTRAP BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(75) BKI_SCHEMA_MACRO +CATALOG(pg_attribute,1249,AttributeRelationId) BKI_BOOTSTRAP BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(75,AttributeRelation_Rowtype_Id)BKI_SCHEMA_MACRO { Oid attrelid; /* OID of relation containing this attribute */ NameData attname; /* name of attribute */ diff --git a/src/include/catalog/pg_auth_members.h b/src/include/catalog/pg_auth_members.h index b8ac653..75bc2ba 100644 --- a/src/include/catalog/pg_auth_members.h +++ b/src/include/catalog/pg_auth_members.h @@ -27,10 +27,7 @@ * typedef struct FormData_pg_auth_members * ---------------- */ -#define AuthMemRelationId 1261 -#define AuthMemRelation_Rowtype_Id 2843 - -CATALOG(pg_auth_members,1261) BKI_SHARED_RELATION BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(2843) BKI_SCHEMA_MACRO +CATALOG(pg_auth_members,1261,AuthMemRelationId) BKI_SHARED_RELATION BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(2843,AuthMemRelation_Rowtype_Id)BKI_SCHEMA_MACRO { Oid roleid; /* ID of a role */ Oid member; /* ID of a member of that role */ diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h index f27906f..863ef65 100644 --- a/src/include/catalog/pg_authid.h +++ b/src/include/catalog/pg_authid.h @@ -28,10 +28,7 @@ * typedef struct FormData_pg_authid * ---------------- */ -#define AuthIdRelationId 1260 -#define AuthIdRelation_Rowtype_Id 2842 - -CATALOG(pg_authid,1260) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842) BKI_SCHEMA_MACRO +CATALOG(pg_authid,1260,AuthIdRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(2842,AuthIdRelation_Rowtype_Id) BKI_SCHEMA_MACRO { NameData rolname; /* name of role */ bool rolsuper; /* read this field via superuser() only! */ diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h index 8a5b7f9..a9e7e2b 100644 --- a/src/include/catalog/pg_cast.h +++ b/src/include/catalog/pg_cast.h @@ -28,9 +28,7 @@ * typedef struct FormData_pg_cast * ---------------- */ -#define CastRelationId 2605 - -CATALOG(pg_cast,2605) +CATALOG(pg_cast,2605,CastRelationId) { /* source datatype for cast */ Oid castsource BKI_LOOKUP(pg_type); diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h index 914aa63..28d939d 100644 --- a/src/include/catalog/pg_class.h +++ b/src/include/catalog/pg_class.h @@ -26,10 +26,7 @@ * typedef struct FormData_pg_class * ---------------- */ -#define RelationRelationId 1259 -#define RelationRelation_Rowtype_Id 83 - -CATALOG(pg_class,1259) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83) BKI_SCHEMA_MACRO +CATALOG(pg_class,1259,RelationRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83,RelationRelation_Rowtype_Id) BKI_SCHEMA_MACRO { NameData relname; /* class name */ Oid relnamespace; /* OID of namespace containing this class */ diff --git a/src/include/catalog/pg_collation.h b/src/include/catalog/pg_collation.h index 0c6d47f..9c643cf 100644 --- a/src/include/catalog/pg_collation.h +++ b/src/include/catalog/pg_collation.h @@ -27,9 +27,7 @@ * typedef struct FormData_pg_collation * ---------------- */ -#define CollationRelationId 3456 - -CATALOG(pg_collation,3456) +CATALOG(pg_collation,3456,CollationRelationId) { NameData collname; /* collation name */ Oid collnamespace; /* OID of namespace containing collation */ diff --git a/src/include/catalog/pg_constraint.h b/src/include/catalog/pg_constraint.h index 2024c45..e1ef9cc 100644 --- a/src/include/catalog/pg_constraint.h +++ b/src/include/catalog/pg_constraint.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_constraint * ---------------- */ -#define ConstraintRelationId 2606 - -CATALOG(pg_constraint,2606) +CATALOG(pg_constraint,2606,ConstraintRelationId) { /* * conname + connamespace is deliberately not unique; we allow, for diff --git a/src/include/catalog/pg_conversion.h b/src/include/catalog/pg_conversion.h index eacc09a..2a38d71 100644 --- a/src/include/catalog/pg_conversion.h +++ b/src/include/catalog/pg_conversion.h @@ -35,9 +35,7 @@ * condefault true if this is a default conversion * ---------------------------------------------------------------- */ -#define ConversionRelationId 2607 - -CATALOG(pg_conversion,2607) +CATALOG(pg_conversion,2607,ConversionRelationId) { NameData conname; Oid connamespace; diff --git a/src/include/catalog/pg_database.h b/src/include/catalog/pg_database.h index 9435f24..7f03d24 100644 --- a/src/include/catalog/pg_database.h +++ b/src/include/catalog/pg_database.h @@ -26,10 +26,7 @@ * typedef struct FormData_pg_database * ---------------- */ -#define DatabaseRelationId 1262 -#define DatabaseRelation_Rowtype_Id 1248 - -CATALOG(pg_database,1262) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248) BKI_SCHEMA_MACRO +CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248,DatabaseRelation_Rowtype_Id) BKI_SCHEMA_MACRO { NameData datname; /* database name */ Oid datdba; /* owner of database */ diff --git a/src/include/catalog/pg_db_role_setting.h b/src/include/catalog/pg_db_role_setting.h index 9793c69..cccb28a 100644 --- a/src/include/catalog/pg_db_role_setting.h +++ b/src/include/catalog/pg_db_role_setting.h @@ -29,9 +29,7 @@ * typedef struct FormData_pg_db_role_setting * ---------------- */ -#define DbRoleSettingRelationId 2964 - -CATALOG(pg_db_role_setting,2964) BKI_SHARED_RELATION BKI_WITHOUT_OIDS +CATALOG(pg_db_role_setting,2964,DbRoleSettingRelationId) BKI_SHARED_RELATION BKI_WITHOUT_OIDS { Oid setdatabase; /* database */ Oid setrole; /* role */ diff --git a/src/include/catalog/pg_default_acl.h b/src/include/catalog/pg_default_acl.h index 868ac0c..ac81df1 100644 --- a/src/include/catalog/pg_default_acl.h +++ b/src/include/catalog/pg_default_acl.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_default_acl * ---------------- */ -#define DefaultAclRelationId 826 - -CATALOG(pg_default_acl,826) +CATALOG(pg_default_acl,826,DefaultAclRelationId) { Oid defaclrole; /* OID of role owning this ACL */ Oid defaclnamespace; /* OID of namespace, or 0 for all */ diff --git a/src/include/catalog/pg_depend.h b/src/include/catalog/pg_depend.h index 030f655..bf31c1a 100644 --- a/src/include/catalog/pg_depend.h +++ b/src/include/catalog/pg_depend.h @@ -38,9 +38,7 @@ * typedef struct FormData_pg_depend * ---------------- */ -#define DependRelationId 2608 - -CATALOG(pg_depend,2608) BKI_WITHOUT_OIDS +CATALOG(pg_depend,2608,DependRelationId) BKI_WITHOUT_OIDS { /* * Identification of the dependent (referencing) object. diff --git a/src/include/catalog/pg_description.h b/src/include/catalog/pg_description.h index d3c8644..b95b188 100644 --- a/src/include/catalog/pg_description.h +++ b/src/include/catalog/pg_description.h @@ -45,9 +45,7 @@ * typedef struct FormData_pg_description * ---------------- */ -#define DescriptionRelationId 2609 - -CATALOG(pg_description,2609) BKI_WITHOUT_OIDS +CATALOG(pg_description,2609,DescriptionRelationId) BKI_WITHOUT_OIDS { Oid objoid; /* OID of object itself */ Oid classoid; /* OID of table containing object */ diff --git a/src/include/catalog/pg_enum.h b/src/include/catalog/pg_enum.h index edea5e3..a0922be 100644 --- a/src/include/catalog/pg_enum.h +++ b/src/include/catalog/pg_enum.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_enum * ---------------- */ -#define EnumRelationId 3501 - -CATALOG(pg_enum,3501) +CATALOG(pg_enum,3501,EnumRelationId) { Oid enumtypid; /* OID of owning enum type */ float4 enumsortorder; /* sort position of this enum value */ diff --git a/src/include/catalog/pg_event_trigger.h b/src/include/catalog/pg_event_trigger.h index 3ca0a88..f06cbe0 100644 --- a/src/include/catalog/pg_event_trigger.h +++ b/src/include/catalog/pg_event_trigger.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_event_trigger * ---------------- */ -#define EventTriggerRelationId 3466 - -CATALOG(pg_event_trigger,3466) +CATALOG(pg_event_trigger,3466,EventTriggerRelationId) { NameData evtname; /* trigger's name */ NameData evtevent; /* trigger's event */ diff --git a/src/include/catalog/pg_extension.h b/src/include/catalog/pg_extension.h index a60bd44..10bbb69 100644 --- a/src/include/catalog/pg_extension.h +++ b/src/include/catalog/pg_extension.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_extension * ---------------- */ -#define ExtensionRelationId 3079 - -CATALOG(pg_extension,3079) +CATALOG(pg_extension,3079,ExtensionRelationId) { NameData extname; /* extension name */ Oid extowner; /* extension owner */ diff --git a/src/include/catalog/pg_foreign_data_wrapper.h b/src/include/catalog/pg_foreign_data_wrapper.h index ae9b0be..67e3319 100644 --- a/src/include/catalog/pg_foreign_data_wrapper.h +++ b/src/include/catalog/pg_foreign_data_wrapper.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_foreign_data_wrapper * ---------------- */ -#define ForeignDataWrapperRelationId 2328 - -CATALOG(pg_foreign_data_wrapper,2328) +CATALOG(pg_foreign_data_wrapper,2328,ForeignDataWrapperRelationId) { NameData fdwname; /* foreign-data wrapper name */ Oid fdwowner; /* FDW owner */ diff --git a/src/include/catalog/pg_foreign_server.h b/src/include/catalog/pg_foreign_server.h index 34fc827..0d25839 100644 --- a/src/include/catalog/pg_foreign_server.h +++ b/src/include/catalog/pg_foreign_server.h @@ -25,9 +25,7 @@ * typedef struct FormData_pg_foreign_server * ---------------- */ -#define ForeignServerRelationId 1417 - -CATALOG(pg_foreign_server,1417) +CATALOG(pg_foreign_server,1417,ForeignServerRelationId) { NameData srvname; /* foreign server name */ Oid srvowner; /* server owner */ diff --git a/src/include/catalog/pg_foreign_table.h b/src/include/catalog/pg_foreign_table.h index 1a1fefc..13de918 100644 --- a/src/include/catalog/pg_foreign_table.h +++ b/src/include/catalog/pg_foreign_table.h @@ -25,9 +25,7 @@ * typedef struct FormData_pg_foreign_table * ---------------- */ -#define ForeignTableRelationId 3118 - -CATALOG(pg_foreign_table,3118) BKI_WITHOUT_OIDS +CATALOG(pg_foreign_table,3118,ForeignTableRelationId) BKI_WITHOUT_OIDS { Oid ftrelid; /* OID of foreign table */ Oid ftserver; /* OID of foreign server */ diff --git a/src/include/catalog/pg_index.h b/src/include/catalog/pg_index.h index ecae0db..b70ad73 100644 --- a/src/include/catalog/pg_index.h +++ b/src/include/catalog/pg_index.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_index. * ---------------- */ -#define IndexRelationId 2610 - -CATALOG(pg_index,2610) BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO +CATALOG(pg_index,2610,IndexRelationId) BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO { Oid indexrelid; /* OID of the index */ Oid indrelid; /* OID of the relation it indexes */ diff --git a/src/include/catalog/pg_inherits.h b/src/include/catalog/pg_inherits.h index 478a587..3b2e03c 100644 --- a/src/include/catalog/pg_inherits.h +++ b/src/include/catalog/pg_inherits.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_inherits * ---------------- */ -#define InheritsRelationId 2611 - -CATALOG(pg_inherits,2611) BKI_WITHOUT_OIDS +CATALOG(pg_inherits,2611,InheritsRelationId) BKI_WITHOUT_OIDS { Oid inhrelid; Oid inhparent; diff --git a/src/include/catalog/pg_init_privs.h b/src/include/catalog/pg_init_privs.h index 7dcb70c..6ce2646 100644 --- a/src/include/catalog/pg_init_privs.h +++ b/src/include/catalog/pg_init_privs.h @@ -43,9 +43,7 @@ * typedef struct FormData_pg_init_privs * ---------------- */ -#define InitPrivsRelationId 3394 - -CATALOG(pg_init_privs,3394) BKI_WITHOUT_OIDS +CATALOG(pg_init_privs,3394,InitPrivsRelationId) BKI_WITHOUT_OIDS { Oid objoid; /* OID of object itself */ Oid classoid; /* OID of table containing object */ diff --git a/src/include/catalog/pg_language.h b/src/include/catalog/pg_language.h index d2d878c..e2d8d15 100644 --- a/src/include/catalog/pg_language.h +++ b/src/include/catalog/pg_language.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_language * ---------------- */ -#define LanguageRelationId 2612 - -CATALOG(pg_language,2612) +CATALOG(pg_language,2612,LanguageRelationId) { NameData lanname; /* Language name */ Oid lanowner; /* Language's owner */ diff --git a/src/include/catalog/pg_largeobject.h b/src/include/catalog/pg_largeobject.h index 2157bab..07adca0 100644 --- a/src/include/catalog/pg_largeobject.h +++ b/src/include/catalog/pg_largeobject.h @@ -24,9 +24,7 @@ * typedef struct FormData_pg_largeobject * ---------------- */ -#define LargeObjectRelationId 2613 - -CATALOG(pg_largeobject,2613) BKI_WITHOUT_OIDS +CATALOG(pg_largeobject,2613,LargeObjectRelationId) BKI_WITHOUT_OIDS { Oid loid; /* Identifier of large object */ int32 pageno; /* Page number (starting from 0) */ diff --git a/src/include/catalog/pg_largeobject_metadata.h b/src/include/catalog/pg_largeobject_metadata.h index 3d5e0cd..a8732bc 100644 --- a/src/include/catalog/pg_largeobject_metadata.h +++ b/src/include/catalog/pg_largeobject_metadata.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_largeobject_metadata * ---------------- */ -#define LargeObjectMetadataRelationId 2995 - -CATALOG(pg_largeobject_metadata,2995) +CATALOG(pg_largeobject_metadata,2995,LargeObjectMetadataRelationId) { Oid lomowner; /* OID of the largeobject owner */ diff --git a/src/include/catalog/pg_namespace.h b/src/include/catalog/pg_namespace.h index 5f80e86..0d9cada 100644 --- a/src/include/catalog/pg_namespace.h +++ b/src/include/catalog/pg_namespace.h @@ -31,9 +31,7 @@ * nspacl access privilege list * ---------------------------------------------------------------- */ -#define NamespaceRelationId 2615 - -CATALOG(pg_namespace,2615) +CATALOG(pg_namespace,2615,NamespaceRelationId) { NameData nspname; Oid nspowner; diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h index 1b20d0d..16c3875 100644 --- a/src/include/catalog/pg_opclass.h +++ b/src/include/catalog/pg_opclass.h @@ -46,9 +46,7 @@ * typedef struct FormData_pg_opclass * ---------------- */ -#define OperatorClassRelationId 2616 - -CATALOG(pg_opclass,2616) +CATALOG(pg_opclass,2616,OperatorClassRelationId) { /* index access method opclass is for */ Oid opcmethod BKI_LOOKUP(pg_am); diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h index 9c829d0..4950d28 100644 --- a/src/include/catalog/pg_operator.h +++ b/src/include/catalog/pg_operator.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_operator * ---------------- */ -#define OperatorRelationId 2617 - -CATALOG(pg_operator,2617) +CATALOG(pg_operator,2617,OperatorRelationId) { /* name of operator */ NameData oprname; diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h index 598e546..4bedc9a 100644 --- a/src/include/catalog/pg_opfamily.h +++ b/src/include/catalog/pg_opfamily.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_opfamily * ---------------- */ -#define OperatorFamilyRelationId 2753 - -CATALOG(pg_opfamily,2753) +CATALOG(pg_opfamily,2753,OperatorFamilyRelationId) { /* index access method opfamily is for */ Oid opfmethod BKI_LOOKUP(pg_am); diff --git a/src/include/catalog/pg_partitioned_table.h b/src/include/catalog/pg_partitioned_table.h index 39ee67e..676532a 100644 --- a/src/include/catalog/pg_partitioned_table.h +++ b/src/include/catalog/pg_partitioned_table.h @@ -25,9 +25,7 @@ * typedef struct FormData_pg_partitioned_table * ---------------- */ -#define PartitionedRelationId 3350 - -CATALOG(pg_partitioned_table,3350) BKI_WITHOUT_OIDS +CATALOG(pg_partitioned_table,3350,PartitionedRelationId) BKI_WITHOUT_OIDS { Oid partrelid; /* partitioned table oid */ char partstrat; /* partitioning strategy */ diff --git a/src/include/catalog/pg_pltemplate.h b/src/include/catalog/pg_pltemplate.h index 116a4a0..d84c86b 100644 --- a/src/include/catalog/pg_pltemplate.h +++ b/src/include/catalog/pg_pltemplate.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_pltemplate * ---------------- */ -#define PLTemplateRelationId 1136 - -CATALOG(pg_pltemplate,1136) BKI_SHARED_RELATION BKI_WITHOUT_OIDS +CATALOG(pg_pltemplate,1136,PLTemplateRelationId) BKI_SHARED_RELATION BKI_WITHOUT_OIDS { NameData tmplname; /* name of PL */ bool tmpltrusted; /* PL is trusted? */ diff --git a/src/include/catalog/pg_policy.h b/src/include/catalog/pg_policy.h index 543077c..7ad0cde 100644 --- a/src/include/catalog/pg_policy.h +++ b/src/include/catalog/pg_policy.h @@ -20,9 +20,7 @@ * typedef struct FormData_pg_policy * ---------------- */ -#define PolicyRelationId 3256 - -CATALOG(pg_policy,3256) +CATALOG(pg_policy,3256,PolicyRelationId) { NameData polname; /* Policy name. */ Oid polrelid; /* Oid of the relation with policy. */ diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index b9d9cfd..fd0b909 100644 --- a/src/include/catalog/pg_proc.h +++ b/src/include/catalog/pg_proc.h @@ -25,10 +25,7 @@ * typedef struct FormData_pg_proc * ---------------- */ -#define ProcedureRelationId 1255 -#define ProcedureRelation_Rowtype_Id 81 - -CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81) BKI_SCHEMA_MACRO +CATALOG(pg_proc,1255,ProcedureRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81,ProcedureRelation_Rowtype_Id) BKI_SCHEMA_MACRO { /* procedure name */ NameData proname; diff --git a/src/include/catalog/pg_publication.h b/src/include/catalog/pg_publication.h index 92cdcf1..9a6a64d 100644 --- a/src/include/catalog/pg_publication.h +++ b/src/include/catalog/pg_publication.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_publication * ---------------- */ -#define PublicationRelationId 6104 - -CATALOG(pg_publication,6104) +CATALOG(pg_publication,6104,PublicationRelationId) { NameData pubname; /* name of the publication */ diff --git a/src/include/catalog/pg_publication_rel.h b/src/include/catalog/pg_publication_rel.h index 864d6ca..2208e42 100644 --- a/src/include/catalog/pg_publication_rel.h +++ b/src/include/catalog/pg_publication_rel.h @@ -25,9 +25,7 @@ * typedef struct FormData_pg_publication_rel * ---------------- */ -#define PublicationRelRelationId 6106 - -CATALOG(pg_publication_rel,6106) +CATALOG(pg_publication_rel,6106,PublicationRelRelationId) { Oid prpubid; /* Oid of the publication */ Oid prrelid; /* Oid of the relation */ diff --git a/src/include/catalog/pg_range.h b/src/include/catalog/pg_range.h index d507e4e..3762b3e 100644 --- a/src/include/catalog/pg_range.h +++ b/src/include/catalog/pg_range.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_range * ---------------- */ -#define RangeRelationId 3541 - -CATALOG(pg_range,3541) BKI_WITHOUT_OIDS +CATALOG(pg_range,3541,RangeRelationId) BKI_WITHOUT_OIDS { /* OID of owning range type */ Oid rngtypid BKI_LOOKUP(pg_type); diff --git a/src/include/catalog/pg_replication_origin.h b/src/include/catalog/pg_replication_origin.h index 02856dd..1adc3f7 100644 --- a/src/include/catalog/pg_replication_origin.h +++ b/src/include/catalog/pg_replication_origin.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_replication_origin * ---------------- */ -#define ReplicationOriginRelationId 6000 - -CATALOG(pg_replication_origin,6000) BKI_SHARED_RELATION BKI_WITHOUT_OIDS +CATALOG(pg_replication_origin,6000,ReplicationOriginRelationId) BKI_SHARED_RELATION BKI_WITHOUT_OIDS { /* * Locally known id that get included into WAL. diff --git a/src/include/catalog/pg_rewrite.h b/src/include/catalog/pg_rewrite.h index d656990..7712586 100644 --- a/src/include/catalog/pg_rewrite.h +++ b/src/include/catalog/pg_rewrite.h @@ -29,9 +29,7 @@ * typedef struct FormData_pg_rewrite * ---------------- */ -#define RewriteRelationId 2618 - -CATALOG(pg_rewrite,2618) +CATALOG(pg_rewrite,2618,RewriteRelationId) { NameData rulename; Oid ev_class; diff --git a/src/include/catalog/pg_seclabel.h b/src/include/catalog/pg_seclabel.h index d6d2f97..48d4548 100644 --- a/src/include/catalog/pg_seclabel.h +++ b/src/include/catalog/pg_seclabel.h @@ -19,9 +19,7 @@ * typedef struct FormData_pg_seclabel * ---------------- */ -#define SecLabelRelationId 3596 - -CATALOG(pg_seclabel,3596) BKI_WITHOUT_OIDS +CATALOG(pg_seclabel,3596,SecLabelRelationId) BKI_WITHOUT_OIDS { Oid objoid; /* OID of the object itself */ Oid classoid; /* OID of table containing the object */ diff --git a/src/include/catalog/pg_sequence.h b/src/include/catalog/pg_sequence.h index de6ed1a..a13b05e 100644 --- a/src/include/catalog/pg_sequence.h +++ b/src/include/catalog/pg_sequence.h @@ -14,9 +14,7 @@ #include "catalog/genbki.h" #include "catalog/pg_sequence_d.h" -#define SequenceRelationId 2224 - -CATALOG(pg_sequence,2224) BKI_WITHOUT_OIDS +CATALOG(pg_sequence,2224,SequenceRelationId) BKI_WITHOUT_OIDS { Oid seqrelid; Oid seqtypid; diff --git a/src/include/catalog/pg_shdepend.h b/src/include/catalog/pg_shdepend.h index 708980b..9f4dcb9 100644 --- a/src/include/catalog/pg_shdepend.h +++ b/src/include/catalog/pg_shdepend.h @@ -35,9 +35,7 @@ * typedef struct FormData_pg_shdepend * ---------------- */ -#define SharedDependRelationId 1214 - -CATALOG(pg_shdepend,1214) BKI_SHARED_RELATION BKI_WITHOUT_OIDS +CATALOG(pg_shdepend,1214,SharedDependRelationId) BKI_SHARED_RELATION BKI_WITHOUT_OIDS { /* * Identification of the dependent (referencing) object. diff --git a/src/include/catalog/pg_shdescription.h b/src/include/catalog/pg_shdescription.h index 1777144..00fd0e0 100644 --- a/src/include/catalog/pg_shdescription.h +++ b/src/include/catalog/pg_shdescription.h @@ -38,9 +38,7 @@ * typedef struct FormData_pg_shdescription * ---------------- */ -#define SharedDescriptionRelationId 2396 - -CATALOG(pg_shdescription,2396) BKI_SHARED_RELATION BKI_WITHOUT_OIDS +CATALOG(pg_shdescription,2396,SharedDescriptionRelationId) BKI_SHARED_RELATION BKI_WITHOUT_OIDS { Oid objoid; /* OID of object itself */ Oid classoid; /* OID of table containing object */ diff --git a/src/include/catalog/pg_shseclabel.h b/src/include/catalog/pg_shseclabel.h index 9fceeee..22ecf98 100644 --- a/src/include/catalog/pg_shseclabel.h +++ b/src/include/catalog/pg_shseclabel.h @@ -19,10 +19,7 @@ * typedef struct FormData_pg_shseclabel * ---------------- */ -#define SharedSecLabelRelationId 3592 -#define SharedSecLabelRelation_Rowtype_Id 4066 - -CATALOG(pg_shseclabel,3592) BKI_SHARED_RELATION BKI_ROWTYPE_OID(4066) BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO +CATALOG(pg_shseclabel,3592,SharedSecLabelRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(4066,SharedSecLabelRelation_Rowtype_Id)BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO { Oid objoid; /* OID of the shared object itself */ Oid classoid; /* OID of table containing the shared object */ diff --git a/src/include/catalog/pg_statistic.h b/src/include/catalog/pg_statistic.h index ea4d1be..c0ab74b 100644 --- a/src/include/catalog/pg_statistic.h +++ b/src/include/catalog/pg_statistic.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_statistic * ---------------- */ -#define StatisticRelationId 2619 - -CATALOG(pg_statistic,2619) BKI_WITHOUT_OIDS +CATALOG(pg_statistic,2619,StatisticRelationId) BKI_WITHOUT_OIDS { /* These fields form the unique key for the entry: */ Oid starelid; /* relation containing attribute */ diff --git a/src/include/catalog/pg_statistic_ext.h b/src/include/catalog/pg_statistic_ext.h index 3c6be71..5d57b81 100644 --- a/src/include/catalog/pg_statistic_ext.h +++ b/src/include/catalog/pg_statistic_ext.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_statistic_ext * ---------------- */ -#define StatisticExtRelationId 3381 - -CATALOG(pg_statistic_ext,3381) +CATALOG(pg_statistic_ext,3381,StatisticExtRelationId) { Oid stxrelid; /* relation containing attributes */ diff --git a/src/include/catalog/pg_subscription.h b/src/include/catalog/pg_subscription.h index 1b2981f..93b249f 100644 --- a/src/include/catalog/pg_subscription.h +++ b/src/include/catalog/pg_subscription.h @@ -20,8 +20,6 @@ * typedef struct FormData_pg_subscription * ---------------- */ -#define SubscriptionRelationId 6100 -#define SubscriptionRelation_Rowtype_Id 6101 /* * Technically, the subscriptions live inside the database, so a shared catalog @@ -31,7 +29,7 @@ * * NOTE: When adding a column, also update system_views.sql. */ -CATALOG(pg_subscription,6100) BKI_SHARED_RELATION BKI_ROWTYPE_OID(6101) BKI_SCHEMA_MACRO +CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(6101,SubscriptionRelation_Rowtype_Id)BKI_SCHEMA_MACRO { Oid subdbid; /* Database the subscription is in. */ NameData subname; /* Name of the subscription */ diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h index 64aa121..d82b262 100644 --- a/src/include/catalog/pg_subscription_rel.h +++ b/src/include/catalog/pg_subscription_rel.h @@ -22,9 +22,7 @@ * typedef struct FormData_pg_subscription_rel * ---------------- */ -#define SubscriptionRelRelationId 6102 - -CATALOG(pg_subscription_rel,6102) BKI_WITHOUT_OIDS +CATALOG(pg_subscription_rel,6102,SubscriptionRelRelationId) BKI_WITHOUT_OIDS { Oid srsubid; /* Oid of subscription */ Oid srrelid; /* Oid of relation */ diff --git a/src/include/catalog/pg_tablespace.h b/src/include/catalog/pg_tablespace.h index bd9c118..4782e78 100644 --- a/src/include/catalog/pg_tablespace.h +++ b/src/include/catalog/pg_tablespace.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_tablespace * ---------------- */ -#define TableSpaceRelationId 1213 - -CATALOG(pg_tablespace,1213) BKI_SHARED_RELATION +CATALOG(pg_tablespace,1213,TableSpaceRelationId) BKI_SHARED_RELATION { NameData spcname; /* tablespace name */ Oid spcowner; /* owner of tablespace */ diff --git a/src/include/catalog/pg_transform.h b/src/include/catalog/pg_transform.h index c571fb5..6059b89 100644 --- a/src/include/catalog/pg_transform.h +++ b/src/include/catalog/pg_transform.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_transform * ---------------- */ -#define TransformRelationId 3576 - -CATALOG(pg_transform,3576) +CATALOG(pg_transform,3576,TransformRelationId) { Oid trftype; Oid trflang; diff --git a/src/include/catalog/pg_trigger.h b/src/include/catalog/pg_trigger.h index eabd301..7d60861 100644 --- a/src/include/catalog/pg_trigger.h +++ b/src/include/catalog/pg_trigger.h @@ -31,9 +31,7 @@ * to be associated with a deferrable constraint. * ---------------- */ -#define TriggerRelationId 2620 - -CATALOG(pg_trigger,2620) +CATALOG(pg_trigger,2620,TriggerRelationId) { Oid tgrelid; /* relation trigger is attached to */ NameData tgname; /* trigger's name */ diff --git a/src/include/catalog/pg_ts_config.h b/src/include/catalog/pg_ts_config.h index d0b7aa9..d344bb7 100644 --- a/src/include/catalog/pg_ts_config.h +++ b/src/include/catalog/pg_ts_config.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_ts_config * ---------------- */ -#define TSConfigRelationId 3602 - -CATALOG(pg_ts_config,3602) +CATALOG(pg_ts_config,3602,TSConfigRelationId) { NameData cfgname; /* name of configuration */ Oid cfgnamespace; /* name space */ diff --git a/src/include/catalog/pg_ts_config_map.h b/src/include/catalog/pg_ts_config_map.h index cdee4b4..2120021 100644 --- a/src/include/catalog/pg_ts_config_map.h +++ b/src/include/catalog/pg_ts_config_map.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_ts_config_map * ---------------- */ -#define TSConfigMapRelationId 3603 - -CATALOG(pg_ts_config_map,3603) BKI_WITHOUT_OIDS +CATALOG(pg_ts_config_map,3603,TSConfigMapRelationId) BKI_WITHOUT_OIDS { Oid mapcfg; /* OID of configuration owning this entry */ int32 maptokentype; /* token type from parser */ diff --git a/src/include/catalog/pg_ts_dict.h b/src/include/catalog/pg_ts_dict.h index 58af179..1e285ad 100644 --- a/src/include/catalog/pg_ts_dict.h +++ b/src/include/catalog/pg_ts_dict.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_ts_dict * ---------------- */ -#define TSDictionaryRelationId 3600 - -CATALOG(pg_ts_dict,3600) +CATALOG(pg_ts_dict,3600,TSDictionaryRelationId) { NameData dictname; /* dictionary name */ Oid dictnamespace; /* name space */ diff --git a/src/include/catalog/pg_ts_parser.h b/src/include/catalog/pg_ts_parser.h index 51b70ae..ccaf40b 100644 --- a/src/include/catalog/pg_ts_parser.h +++ b/src/include/catalog/pg_ts_parser.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_ts_parser * ---------------- */ -#define TSParserRelationId 3601 - -CATALOG(pg_ts_parser,3601) +CATALOG(pg_ts_parser,3601,TSParserRelationId) { /* parser's name */ NameData prsname; diff --git a/src/include/catalog/pg_ts_template.h b/src/include/catalog/pg_ts_template.h index cfb97925..5e66e02 100644 --- a/src/include/catalog/pg_ts_template.h +++ b/src/include/catalog/pg_ts_template.h @@ -26,9 +26,7 @@ * typedef struct FormData_pg_ts_template * ---------------- */ -#define TSTemplateRelationId 3764 - -CATALOG(pg_ts_template,3764) +CATALOG(pg_ts_template,3764,TSTemplateRelationId) { /* template name */ NameData tmplname; diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h index 8992fcd..4ddc09a 100644 --- a/src/include/catalog/pg_type.h +++ b/src/include/catalog/pg_type.h @@ -31,10 +31,7 @@ * See struct FormData_pg_attribute for details. * ---------------- */ -#define TypeRelationId 1247 -#define TypeRelation_Rowtype_Id 71 - -CATALOG(pg_type,1247) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71) BKI_SCHEMA_MACRO +CATALOG(pg_type,1247,TypeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71,TypeRelation_Rowtype_Id) BKI_SCHEMA_MACRO { /* type name */ NameData typname; diff --git a/src/include/catalog/pg_user_mapping.h b/src/include/catalog/pg_user_mapping.h index ec62ee2..6efbed0 100644 --- a/src/include/catalog/pg_user_mapping.h +++ b/src/include/catalog/pg_user_mapping.h @@ -25,9 +25,7 @@ * typedef struct FormData_pg_user_mapping * ---------------- */ -#define UserMappingRelationId 1418 - -CATALOG(pg_user_mapping,1418) +CATALOG(pg_user_mapping,1418,UserMappingRelationId) { Oid umuser; /* Id of the user, InvalidOid if PUBLIC is * wanted */ diff --git a/src/include/catalog/unused_oids b/src/include/catalog/unused_oids index e8be34c..f71222d 100755 --- a/src/include/catalog/unused_oids +++ b/src/include/catalog/unused_oids @@ -30,7 +30,7 @@ export FIRSTOBJECTID cat pg_*.h pg_*.dat toasting.h indexing.h | egrep -v -e '^CATALOG\(.*BKI_BOOTSTRAP' | \ sed -n -e 's/.*\boid *=> *'\''\([0-9][0-9]*\)'\''.*$/\1/p' \ - -e 's/^CATALOG([^,]*, *\([0-9][0-9]*\).*BKI_ROWTYPE_OID(\([0-9][0-9]*\)).*$/\1,\2/p' \ + -e 's/^CATALOG([^,]*, *\([0-9][0-9]*\).*BKI_ROWTYPE_OID(\([0-9][0-9]*\),.*$/\1,\2/p' \ -e 's/^CATALOG([^,]*, *\([0-9][0-9]*\).*$/\1/p' \ -e 's/^DECLARE_INDEX([^,]*, *\([0-9][0-9]*\).*$/\1/p' \ -e 's/^DECLARE_UNIQUE_INDEX([^,]*, *\([0-9][0-9]*\).*$/\1/p' \
BTW, I experimented with adding blank lines between the hash items in the .dat files, and that seemed to make a nice improvement in readability, converting masses of rather gray text into visibly distinct stanzas. I'm not dead set on that, but try it and see what you think. A small example from pg_aggregate.dat: { aggfnoid => 'avg(int8)', aggtransfn => 'int8_avg_accum', aggfinalfn => 'numeric_poly_avg', aggcombinefn => 'int8_avg_combine', aggserialfn => 'int8_avg_serialize', aggdeserialfn => 'int8_avg_deserialize', aggmtransfn => 'int8_avg_accum', aggminvtransfn => 'int8_avg_accum_inv', aggmfinalfn => 'numeric_poly_avg', aggtranstype => 'internal', aggtransspace => '48', aggmtranstype => 'internal', aggmtransspace => '48' }, { aggfnoid => 'avg(int4)', aggtransfn => 'int4_avg_accum', aggfinalfn => 'int8_avg', aggcombinefn => 'int4_avg_combine', aggmtransfn => 'int4_avg_accum', aggminvtransfn => 'int4_avg_accum_inv', aggmfinalfn => 'int8_avg', aggtranstype => '_int8', aggmtranstype => '_int8', agginitval => '{0,0}', aggminitval => '{0,0}' }, { aggfnoid => 'avg(int2)', aggtransfn => 'int2_avg_accum', aggfinalfn => 'int8_avg', aggcombinefn => 'int4_avg_combine', aggmtransfn => 'int2_avg_accum', aggminvtransfn => 'int2_avg_accum_inv', aggmfinalfn => 'int8_avg', aggtranstype => '_int8', aggmtranstype => '_int8', agginitval => '{0,0}', aggminitval => '{0,0}' }, { aggfnoid => 'avg(numeric)', aggtransfn => 'numeric_avg_accum', aggfinalfn => 'numeric_avg', aggcombinefn => 'numeric_avg_combine', aggserialfn => 'numeric_avg_serialize', aggdeserialfn => 'numeric_avg_deserialize', aggmtransfn => 'numeric_avg_accum', aggminvtransfn => 'numeric_accum_inv', aggmfinalfn => 'numeric_avg', aggtranstype => 'internal', aggtransspace => '128', aggmtranstype => 'internal', aggmtransspace => '128' }, { aggfnoid => 'avg(float4)', aggtransfn => 'float4_accum', aggfinalfn => 'float8_avg', aggcombinefn => 'float8_combine', aggtranstype => '_float8', agginitval => '{0,0,0}' }, versus { aggfnoid => 'avg(int8)', aggtransfn => 'int8_avg_accum', aggfinalfn => 'numeric_poly_avg', aggcombinefn => 'int8_avg_combine', aggserialfn => 'int8_avg_serialize', aggdeserialfn => 'int8_avg_deserialize', aggmtransfn => 'int8_avg_accum', aggminvtransfn => 'int8_avg_accum_inv', aggmfinalfn => 'numeric_poly_avg', aggtranstype => 'internal', aggtransspace => '48', aggmtranstype => 'internal', aggmtransspace => '48' }, { aggfnoid => 'avg(int4)', aggtransfn => 'int4_avg_accum', aggfinalfn => 'int8_avg', aggcombinefn => 'int4_avg_combine', aggmtransfn => 'int4_avg_accum', aggminvtransfn => 'int4_avg_accum_inv', aggmfinalfn => 'int8_avg', aggtranstype => '_int8', aggmtranstype => '_int8', agginitval => '{0,0}', aggminitval => '{0,0}' }, { aggfnoid => 'avg(int2)', aggtransfn => 'int2_avg_accum', aggfinalfn => 'int8_avg', aggcombinefn => 'int4_avg_combine', aggmtransfn => 'int2_avg_accum', aggminvtransfn => 'int2_avg_accum_inv', aggmfinalfn => 'int8_avg', aggtranstype => '_int8', aggmtranstype => '_int8', agginitval => '{0,0}', aggminitval => '{0,0}' }, { aggfnoid => 'avg(numeric)', aggtransfn => 'numeric_avg_accum', aggfinalfn => 'numeric_avg', aggcombinefn => 'numeric_avg_combine', aggserialfn => 'numeric_avg_serialize', aggdeserialfn => 'numeric_avg_deserialize', aggmtransfn => 'numeric_avg_accum', aggminvtransfn => 'numeric_accum_inv', aggmfinalfn => 'numeric_avg', aggtranstype => 'internal', aggtransspace => '128', aggmtranstype => 'internal', aggmtransspace => '128' }, { aggfnoid => 'avg(float4)', aggtransfn => 'float4_accum', aggfinalfn => 'float8_avg', aggcombinefn => 'float8_combine', aggtranstype => '_float8', agginitval => '{0,0,0}' }, regards, tom lane
On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I experimented with converting all frontend code to include just the > catalog/pg_foo_d.h files instead of catalog/pg_foo.h, as per the > proposed new policy. I soon found that we'd overlooked one thing: > some clients expect to see the relation OID macros, eg > LargeObjectRelationId. Attached is a patch that changes things around > so that those appear in the _d files instead of the master files. > This is cleaner anyway because it removes duplication of the OIDs in > the master files, with attendant risk of error. For example we > have this change in pg_aggregate.h: > > -#define AggregateRelationId 2600 > - > -CATALOG(pg_aggregate,2600) BKI_WITHOUT_OIDS > +CATALOG(pg_aggregate,2600,AggregateRelationId) BKI_WITHOUT_OIDS > > Some of the CATALOG lines spill well past 80 characters with this, > although many of the affected ones already were overlength, eg > > -#define DatabaseRelationId 1262 > -#define DatabaseRelation_Rowtype_Id 1248 > - > -CATALOG(pg_database,1262) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248) > BKI_SCHEMA_MACRO > +CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION > BKI_ROWTYPE_OID(1248,DatabaseRelation_Rowtype_Id) BKI_SCHEMA_MACRO It seems most of the time the FooRelationId labels are predictable, although not as pristine as the Anum_* constants. One possibility that came to mind is to treat these like pg_type OID #defines -- have a simple rule that can be overridden for historical reasons. In this case the pg_database change would simply be: -#define DatabaseRelationId 1262 -#define DatabaseRelation_Rowtype_Id 1248 - and genbki.pl would know what to do. But for pg_am: -#define AccessMethodRelationId 2601 - -CATALOG(pg_am,2601) +CATALOG(pg_am,2601) BKI_REL_LABEL(AccessMethod) I haven't thought this through yet. I imagine it will add as well as remove a bit of complexity, code-wise. The upside is most CATALOG lines will remain unchanged, and those that do won't end up quite as long. I can try a draft tomorrow to see how it looks, unless you see an obvious downside. -John Naylor
On 4/5/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I've generalized the BKI_LOOKUP(pg_proc) code so that > you can use either regproc-like or regprocedure-like notation, and then > applied that to relevant columns. > [...] > bootstrap-v13-delta.patch is a diff atop your patch series for the > in-tree files, and convert_oid2name.patch adjusts that script to > make use of the additional conversion capability. Looking at convert_oid2name.patch again, I see this: + elsif ($catname eq 'pg_am') + { + $values{aggfnoid} = lookup_procname($values{aggfnoid}); + } aggfnoid is in pg_aggregate, and pg_am already had a regproc lookup. Do you remember the intent here? -John Naylor
John Naylor <jcnaylor@gmail.com> writes: > It seems most of the time the FooRelationId labels are predictable, > although not as pristine as the Anum_* constants. One possibility that > came to mind is to treat these like pg_type OID #defines -- have a > simple rule that can be overridden for historical reasons. Meh. I'd just as soon avoid having some catalogs done one way and some another. regards, tom lane
John Naylor <jcnaylor@gmail.com> writes: > Looking at convert_oid2name.patch again, I see this: > + elsif ($catname eq 'pg_am') > + { > + $values{aggfnoid} = lookup_procname($values{aggfnoid}); > + } > aggfnoid is in pg_aggregate, and pg_am already had a regproc lookup. > Do you remember the intent here? Ugh, copy-and-pasteo. I intended to have it lookup pg_am.amhandler, but must have missed changing the field name after copying code from the pg_aggregate stanza. Seems to have been unnecessary anyway, since all the entries in the column are already symbolic. regards, tom lane
Just had another thought about this business: if practical, we should remove the distinction between "descr" and "shdescr" and just use the former name in .dat files. genbki.pl knows which catalogs are shared, so it ought to be able to figure out where to route the descriptions. regards, tom lane
On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Just had another thought about this business: if practical, we should > remove the distinction between "descr" and "shdescr" and just use the > former name in .dat files. genbki.pl knows which catalogs are shared, > so it ought to be able to figure out where to route the descriptions. Fairly trivial (attached), and shouldn't be too hard to integrate into the series. -John Naylor
Attachment
On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > BTW, I experimented with adding blank lines between the hash items in the > .dat files, and that seemed to make a nice improvement in readability, > converting masses of rather gray text into visibly distinct stanzas. > I'm not dead set on that, but try it and see what you think. Narrow entries with natural whitespace might be okay as is. The pg_aggregate example is better with blank lines, but another thing to consider is that a comment that hugs a block is clear on which entries it's referring to (pg_amop): # btree integer_ops # default operators int2 { amopfamily => 'btree/integer_ops', amoplefttype => 'int2', amoprighttype => 'int2', amopstrategy => '1', amopopr => '<(int2,int2)', amopmethod => 'btree' }, { amopfamily => 'btree/integer_ops', amoplefttype => 'int2', amoprighttype => 'int2', amopstrategy => '2', amopopr => '<=(int2,int2)', amopmethod => 'btree' }, { amopfamily => 'btree/integer_ops', amoplefttype => 'int2', amoprighttype => 'int2', amopstrategy => '3', amopopr => '=(int2,int2)', amopmethod => 'btree' }, { amopfamily => 'btree/integer_ops', amoplefttype => 'int2', amoprighttype => 'int2', amopstrategy => '4', amopopr => '>=(int2,int2)', amopmethod => 'btree' }, { amopfamily => 'btree/integer_ops', amoplefttype => 'int2', amoprighttype => 'int2', amopstrategy => '5', amopopr => '>(int2,int2)', amopmethod => 'btree' }, # crosstype operators int24 { amopfamily => 'btree/integer_ops', amoplefttype => 'int2', ... [more blocks of integer ops ...] amopmethod => 'btree' }, # btree oid_ops With a blank line beween every entry, the comments would "float" more, and the scope is not as clear. I'm okay with whatever the community thinks, but at this point I'm inclined to leave things as they are and focus on the other points of review for the next patchset. While on the subject of viewing, I do have a badly outdated patch that would create a postgres.sql file which would load into a development schema so one could query the bootstrap data in a database without running initdb. I could update it at a future point. -John Naylor
On 4/5/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Something that bothered me a bit while writing the warning-producing code > is that showing %bki_values isn't actually that great a way of identifying > the trouble spot. By this point we've expanded out defaults and possibly > replaced some other macros, so it doesn't look that much like what was > in the .dat file. I think what would be ideal, both here and in some > other places like AddDefaultValues, is to be able to finger the location > of the bad tuple by filename and line number, but I have no idea whether > it's practical to annotate the tuples with that while reading the .dat > files. Any thoughts? We could use the $. variable to save the line number, which is what the old code had. AddDefaultValues will report the line number on failure, so I left out explicit line number reporting. If memory serves, Perl is sensitive to how you format the "die" message. If I delete a default value from the header, I get this, reporting line 16: Failed to form full tuple for pg_opfamily Missing values for: opfnamespace Showing other values for context: oid => 421, opfmethod => 403, opfowner => PGUID, opfname => abstime_ops, at ../../../src/backend/catalog/Catalog.pm line 259, <$ifd> line 16. Makefile:23: recipe for target 'reformat-dat-files' failed make: *** [reformat-dat-files] Error 25 I think the context is good for pg_attribute, because there is no file to read from. I'll think about the lookup code. -John Naylor
For version 14, diffed against f1464c53804: -Use majority values for proisstrict, provolatile, proparallel (patch 0006) -Use valid C string for multi-char defaults containing a backslash (patch 0006) -Apply Tom's patch for additional lookups, slightly modified by me (convert_oid2name.pl, patch 0007) -Apply Tom's patch for relation/rowtype OID macros (patch 0005) On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Just had another thought about this business: if practical, we should > remove the distinction between "descr" and "shdescr" and just use the > former name in .dat files. genbki.pl knows which catalogs are shared, > so it ought to be able to figure out where to route the descriptions. Done (convert_header2dat.pl, patches 0001, 0003, 0009) On 4/5/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think what would be ideal, both here and in some > other places like AddDefaultValues, is to be able to finger the location > of the bad tuple by filename and line number, but I have no idea whether > it's practical to annotate the tuples with that while reading the .dat > files. Any thoughts? Done (patch 0007). So far only lookup_oids() uses it. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> BTW, I experimented with adding blank lines between the hash items in the >> .dat files, and that seemed to make a nice improvement in readability, >> converting masses of rather gray text into visibly distinct stanzas. >> I'm not dead set on that, but try it and see what you think. > Narrow entries with natural whitespace might be okay as is. The > pg_aggregate example is better with blank lines, Yeah, it's somewhat of a worst-case example, because (more or less by chance) most of the entries have last lines that are mostly full. In other places it often happens that the last line of an entry is much shorter than average, and that provides enough of a visual break, as in your example from pg_amop. > but another thing to > consider is that a comment that hugs a block is clear on which entries > it's referring to (pg_amop): True, we'd need to rethink vertical spacing for comments. I don't offhand see why we couldn't keep comments that apply to a specific entry directly adjacent to that entry, though. Anyway, as I said, I'm not set on this change. If you're unexcited by the idea then let's drop it. regards, tom lane
I wrote: > John Naylor <jcnaylor@gmail.com> writes: >> On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> BTW, I experimented with adding blank lines between the hash items in the >>> .dat files, and that seemed to make a nice improvement in readability, > Anyway, as I said, I'm not set on this change. If you're unexcited by > the idea then let's drop it. On third thought, there's actually a positive reason not to do that. The more vertical whitespace we have, the less traction there is for context diffs to find the right place to change, so that we'd be increasing the risk of patch misapplication. That was one of the worries that we set out to avoid with this whole design. So nevermind ... I'll get on with looking at v14. regards, tom lane
Attached is a round of minor review fixes. Much of this is cleaning up existing mistakes or out-of-date comments, rather than anything introduced by this patchset, but I noticed it while going through the patch. The additional EXPOSE_TO_CLIENT_CODE bits actually are necessary, in some cases, as I had compile failures without them after changing client include lines. (I'll post that separately.) I added a couple more BKI_DEFAULT markers here, but omitted the ensuing .dat file updates, because they're just mechanical. I don't think there's any great need to incorporate this into your patch set. As far as I'm concerned, v14 is ready as-is, and I'll just apply this over the top of it. (Note that I'll probably smash the whole thing to one commit when the time comes.) I have some other work pending on the documentation aspect, but that's not quite ready for public consumption yet. regards, tom lane diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 8729ccd..1626999 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -3566,7 +3566,7 @@ Oid PQftype(const PGresult *res, You can query the system table <literal>pg_type</literal> to obtain the names and properties of the various data types. The <acronym>OID</acronym>s of the built-in data types are defined - in the file <filename>src/include/catalog/pg_type.h</filename> + in the file <filename>src/include/catalog/pg_type_d.h</filename> in the source tree. </para> </listitem> diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile index 17213d4..d25d98a 100644 --- a/src/backend/catalog/Makefile +++ b/src/backend/catalog/Makefile @@ -25,8 +25,10 @@ BKIFILES = postgres.bki postgres.description postgres.shdescription include $(top_srcdir)/src/backend/common.mk -# Note: there are some undocumented dependencies on the ordering in which -# the catalog header files are assembled into postgres.bki. +# Note: the order of this list determines the order in which the catalog +# header files are assembled into postgres.bki. BKI_BOOTSTRAP catalogs +# must appear first, and there are reputedly other, undocumented ordering +# dependencies. CATALOG_HEADERS := \ pg_proc.h pg_type.h pg_attribute.h pg_class.h \ pg_attrdef.h pg_constraint.h pg_inherits.h pg_index.h pg_operator.h \ @@ -49,7 +51,8 @@ CATALOG_HEADERS := \ GENERATED_HEADERS := $(CATALOG_HEADERS:%.h=%_d.h) schemapg.h # In the list of headers used to assemble postgres.bki, indexing.h needs -# be last, and toasting.h just before it. +# be last, and toasting.h just before it. This ensures we don't try to +# create indexes or toast tables before their catalogs exist. POSTGRES_BKI_SRCS := $(addprefix $(top_srcdir)/src/include/catalog/,\ $(CATALOG_HEADERS) toasting.h indexing.h \ ) diff --git a/src/include/catalog/Makefile b/src/include/catalog/Makefile index d84a572..1da3ea7 100644 --- a/src/include/catalog/Makefile +++ b/src/include/catalog/Makefile @@ -2,9 +2,6 @@ # # Makefile for src/include/catalog # -# 'make reformat-dat-files' is a convenience target for rewriting the -# catalog data files in a standard format. -# # Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group # Portions Copyright (c) 1994, Regents of the University of California # @@ -19,10 +16,16 @@ include $(top_builddir)/src/Makefile.global # location of Catalog.pm catalogdir = $(top_srcdir)/src/backend/catalog +# 'make reformat-dat-files' is a convenience target for rewriting the +# catalog data files in our standard format. This includes collapsing +# out any entries that are redundant with a BKI_DEFAULT annotation. reformat-dat-files: $(PERL) -I $(catalogdir) reformat_dat_file.pl pg_*.dat +# 'make expand-dat-files' is a convenience target for expanding out all +# default values in the catalog data files. This should be run before +# altering or removing any BKI_DEFAULT annotation. expand-dat-files: $(PERL) -I $(catalogdir) reformat_dat_file.pl pg_*.dat --full-tuples -.PHONY: reformat-dat-files +.PHONY: reformat-dat-files expand-dat-files diff --git a/src/include/catalog/duplicate_oids b/src/include/catalog/duplicate_oids index 0e6285f..8c143cf 100755 --- a/src/include/catalog/duplicate_oids +++ b/src/include/catalog/duplicate_oids @@ -30,7 +30,7 @@ foreach my $oid (sort { $a <=> $b } keys %oidcounts) { next unless $oidcounts{$oid} > 1; $found = 1; - print "***Duplicate OID: $oid\n"; + print "$oid\n"; } exit $found; diff --git a/src/include/catalog/genbki.h b/src/include/catalog/genbki.h index 02c38c4..b1e2cbd 100644 --- a/src/include/catalog/genbki.h +++ b/src/include/catalog/genbki.h @@ -34,7 +34,7 @@ #define BKI_FORCE_NOT_NULL /* Specifies a default value for a catalog field */ #define BKI_DEFAULT(value) -/* Indicates how to perform name lookups for OID fields */ +/* Indicates how to perform name lookups for an OID or OID-array field */ #define BKI_LOOKUP(catalog) /* The following are never defined; they are here only for documentation. */ @@ -48,11 +48,12 @@ #undef CATALOG_VARLEN /* - * There is code in the catalog headers that needs to be visible to clients, - * but we don't want them to include the full header because of safety issues - * with other code in the header. This symbol instructs genbki.pl to copy this - * section when generating the corresponding definition header, where it can - * be included by both client and backend code. + * There is code in some catalog headers that needs to be visible to clients, + * but we don't want clients to include the full header because of safety + * issues with other code in the header. To handle that, surround code that + * should be visible to clients with "#ifdef EXPOSE_TO_CLIENT_CODE". That + * instructs genbki.pl to copy the section when generating the corresponding + * "_d" header, which can be included by both client and backend code. */ #undef EXPOSE_TO_CLIENT_CODE diff --git a/src/include/catalog/pg_am.h b/src/include/catalog/pg_am.h index 5aa2bac..9620bcd 100644 --- a/src/include/catalog/pg_am.h +++ b/src/include/catalog/pg_am.h @@ -47,9 +47,8 @@ typedef FormData_pg_am *Form_pg_am; #ifdef EXPOSE_TO_CLIENT_CODE -/* ---------------- - * compiler constant for amtype - * ---------------- +/* + * Allowed values for amtype */ #define AMTYPE_INDEX 'i' /* index access method */ diff --git a/src/include/catalog/pg_authid.h b/src/include/catalog/pg_authid.h index c410b99..863ef65 100644 --- a/src/include/catalog/pg_authid.h +++ b/src/include/catalog/pg_authid.h @@ -23,17 +23,6 @@ #include "catalog/genbki.h" #include "catalog/pg_authid_d.h" -/* - * The CATALOG definition has to refer to the type of rolvaliduntil as - * "timestamptz" (lower case) so that bootstrap mode recognizes it. But - * the C header files define this type as TimestampTz. Since the field is - * potentially-null and therefore can't be accessed directly from C code, - * there is no particular need for the C struct definition to show the - * field type as TimestampTz --- instead we just make it int. - */ -#define timestamptz int - - /* ---------------- * pg_authid definition. cpp turns this into * typedef struct FormData_pg_authid @@ -58,8 +47,6 @@ CATALOG(pg_authid,1260,AuthIdRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(284 #endif } FormData_pg_authid; -#undef timestamptz - /* ---------------- * Form_pg_authid corresponds to a pointer to a tuple with * the format of pg_authid relation. diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h index e58eb8d..a9e7e2b 100644 --- a/src/include/catalog/pg_cast.h +++ b/src/include/catalog/pg_cast.h @@ -53,6 +53,8 @@ CATALOG(pg_cast,2605,CastRelationId) */ typedef FormData_pg_cast *Form_pg_cast; +#ifdef EXPOSE_TO_CLIENT_CODE + /* * The allowable values for pg_cast.castcontext are specified by this enum. * Since castcontext is stored as a "char", we use ASCII codes for human @@ -81,4 +83,6 @@ typedef enum CoercionMethod COERCION_METHOD_INOUT = 'i' /* use input/output functions */ } CoercionMethod; +#endif /* EXPOSE_TO_CLIENT_CODE */ + #endif /* PG_CAST_H */ diff --git a/src/include/catalog/pg_class.dat b/src/include/catalog/pg_class.dat index ff972c0..e1450e3 100644 --- a/src/include/catalog/pg_class.dat +++ b/src/include/catalog/pg_class.dat @@ -12,9 +12,10 @@ [ -# Note: only "bootstrapped" relations need to be declared here. Be sure that -# the OIDs listed here match those given in their CATALOG macros, and that -# the relnatts values are correct. +# Note: only "bootstrapped" relations, ie those marked BKI_BOOTSTRAP, need to +# have entries here. Be sure that the OIDs listed here match those given in +# their CATALOG and BKI_ROWTYPE_OID macros, and that the relnatts values are +# correct. # Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId; # similarly, "1" in relminmxid stands for FirstMultiXactId diff --git a/src/include/catalog/pg_db_role_setting.h b/src/include/catalog/pg_db_role_setting.h index 0faa81f..896c410 100644 --- a/src/include/catalog/pg_db_role_setting.h +++ b/src/include/catalog/pg_db_role_setting.h @@ -1,7 +1,7 @@ /*------------------------------------------------------------------------- * * pg_db_role_setting.h - * definition of configuration settings + * definition of per-database/per-user configuration settings relation * * * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group @@ -18,6 +18,7 @@ #ifndef PG_DB_ROLE_SETTING_H #define PG_DB_ROLE_SETTING_H +#include "catalog/genbki.h" #include "catalog/pg_db_role_setting_d.h" #include "utils/guc.h" #include "utils/relcache.h" diff --git a/src/include/catalog/pg_index.h b/src/include/catalog/pg_index.h index 65c1110..b70ad73 100644 --- a/src/include/catalog/pg_index.h +++ b/src/include/catalog/pg_index.h @@ -64,6 +64,8 @@ CATALOG(pg_index,2610,IndexRelationId) BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO */ typedef FormData_pg_index *Form_pg_index; +#ifdef EXPOSE_TO_CLIENT_CODE + /* * Index AMs that support ordered scans must support these two indoption * bits. Otherwise, the content of the per-column indoption fields is @@ -72,6 +74,8 @@ typedef FormData_pg_index *Form_pg_index; #define INDOPTION_DESC 0x0001 /* values are in reverse order */ #define INDOPTION_NULLS_FIRST 0x0002 /* NULLs are first instead of last */ +#endif /* EXPOSE_TO_CLIENT_CODE */ + /* * Use of these macros is recommended over direct examination of the state * flag columns where possible; this allows source code compatibility with diff --git a/src/include/catalog/pg_largeobject.h b/src/include/catalog/pg_largeobject.h index 07adca0..481d2ff 100644 --- a/src/include/catalog/pg_largeobject.h +++ b/src/include/catalog/pg_largeobject.h @@ -10,6 +10,8 @@ * src/include/catalog/pg_largeobject.h * * NOTES + * The Catalog.pm module reads this file and derives schema + * information. * *------------------------------------------------------------------------- */ diff --git a/src/include/catalog/pg_operator.dat b/src/include/catalog/pg_operator.dat index f8118ff..4382738 100644 --- a/src/include/catalog/pg_operator.dat +++ b/src/include/catalog/pg_operator.dat @@ -3054,8 +3054,6 @@ { oid => '3681', descr => 'OR-concatenate', oprname => '||', oprleft => 'tsquery', oprright => 'tsquery', oprresult => 'tsquery', oprcode => 'tsquery_or' }, - -# <-> operation calls tsquery_phrase, but function is polymorphic. So, point to OID of the tsquery_phrase { oid => '5005', descr => 'phrase-concatenate', oprname => '<->', oprleft => 'tsquery', oprright => 'tsquery', oprresult => 'tsquery', oprcode => 'tsquery_phrase(tsquery,tsquery)' }, diff --git a/src/include/catalog/pg_policy.h b/src/include/catalog/pg_policy.h index 7ad0cde..45fdc28 100644 --- a/src/include/catalog/pg_policy.h +++ b/src/include/catalog/pg_policy.h @@ -7,6 +7,12 @@ * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California * + * src/include/catalog/pg_policy.h + * + * NOTES + * The Catalog.pm module reads this file and derives schema + * information. + * *------------------------------------------------------------------------- */ #ifndef PG_POLICY_H diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 43e24d2..5b5a1c0 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -30,7 +30,9 @@ # "convert srctypename to desttypename" for cast functions # "less-equal-greater" for B-tree comparison functions -# Keep the following ordered by OID so that later changes can be made easier. +# Once upon a time these entries were ordered by OID. Lately it's often +# been the custom to insert new entries adjacent to related older entries. +# Try to do one or the other though, don't just insert entries at random. # OIDS 1 - 99 diff --git a/src/include/catalog/pg_range.h b/src/include/catalog/pg_range.h index 3762b3e..d8e16cc 100644 --- a/src/include/catalog/pg_range.h +++ b/src/include/catalog/pg_range.h @@ -35,7 +35,7 @@ CATALOG(pg_range,3541,RangeRelationId) BKI_WITHOUT_OIDS Oid rngsubtype BKI_LOOKUP(pg_type); /* collation for this range type, or 0 */ - Oid rngcollation; + Oid rngcollation BKI_DEFAULT(0); /* subtype's btree opclass */ Oid rngsubopc BKI_LOOKUP(pg_opclass); diff --git a/src/include/catalog/pg_shdepend.h b/src/include/catalog/pg_shdepend.h index 9f4dcb9..0f8508c 100644 --- a/src/include/catalog/pg_shdepend.h +++ b/src/include/catalog/pg_shdepend.h @@ -11,7 +11,6 @@ * for example, there's not much value in creating an explicit dependency * from a relation to its database. Currently, only dependencies on roles * are explicitly stored in pg_shdepend. - * * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group * Portions Copyright (c) 1994, Regents of the University of California diff --git a/src/include/catalog/pg_statistic.h b/src/include/catalog/pg_statistic.h index 30d6868..c0ab74b 100644 --- a/src/include/catalog/pg_statistic.h +++ b/src/include/catalog/pg_statistic.h @@ -126,6 +126,8 @@ CATALOG(pg_statistic,2619,StatisticRelationId) BKI_WITHOUT_OIDS */ typedef FormData_pg_statistic *Form_pg_statistic; +#ifdef EXPOSE_TO_CLIENT_CODE + /* * Several statistical slot "kinds" are defined by core PostgreSQL, as * documented below. Also, custom data types can define their own "kind" @@ -255,4 +257,6 @@ typedef FormData_pg_statistic *Form_pg_statistic; */ #define STATISTIC_KIND_BOUNDS_HISTOGRAM 7 +#endif /* EXPOSE_TO_CLIENT_CODE */ + #endif /* PG_STATISTIC_H */ diff --git a/src/include/catalog/pg_statistic_ext.h b/src/include/catalog/pg_statistic_ext.h index 24e4bd8..5d57b81 100644 --- a/src/include/catalog/pg_statistic_ext.h +++ b/src/include/catalog/pg_statistic_ext.h @@ -58,7 +58,11 @@ CATALOG(pg_statistic_ext,3381,StatisticExtRelationId) */ typedef FormData_pg_statistic_ext *Form_pg_statistic_ext; +#ifdef EXPOSE_TO_CLIENT_CODE + #define STATS_EXT_NDISTINCT 'd' #define STATS_EXT_DEPENDENCIES 'f' +#endif /* EXPOSE_TO_CLIENT_CODE */ + #endif /* PG_STATISTIC_EXT_H */ diff --git a/src/include/catalog/pg_subscription_rel.h b/src/include/catalog/pg_subscription_rel.h index d82b262..033f5a1 100644 --- a/src/include/catalog/pg_subscription_rel.h +++ b/src/include/catalog/pg_subscription_rel.h @@ -12,9 +12,9 @@ #ifndef PG_SUBSCRIPTION_REL_H #define PG_SUBSCRIPTION_REL_H -#include "access/xlogdefs.h" #include "catalog/genbki.h" #include "catalog/pg_subscription_rel_d.h" +#include "access/xlogdefs.h" #include "nodes/pg_list.h" /* ---------------- diff --git a/src/include/catalog/pg_trigger.h b/src/include/catalog/pg_trigger.h index ee64bb4..7d60861 100644 --- a/src/include/catalog/pg_trigger.h +++ b/src/include/catalog/pg_trigger.h @@ -69,6 +69,8 @@ CATALOG(pg_trigger,2620,TriggerRelationId) */ typedef FormData_pg_trigger *Form_pg_trigger; +#ifdef EXPOSE_TO_CLIENT_CODE + /* Bits within tgtype */ #define TRIGGER_TYPE_ROW (1 << 0) #define TRIGGER_TYPE_BEFORE (1 << 1) @@ -128,4 +130,6 @@ typedef FormData_pg_trigger *Form_pg_trigger; #define TRIGGER_USES_TRANSITION_TABLE(namepointer) \ ((namepointer) != (char *) NULL) +#endif /* EXPOSE_TO_CLIENT_CODE */ + #endif /* PG_TRIGGER_H */ diff --git a/src/include/catalog/pg_ts_parser.h b/src/include/catalog/pg_ts_parser.h index 20b3a4b..ccaf40b 100644 --- a/src/include/catalog/pg_ts_parser.h +++ b/src/include/catalog/pg_ts_parser.h @@ -32,7 +32,7 @@ CATALOG(pg_ts_parser,3601,TSParserRelationId) NameData prsname; /* name space */ - Oid prsnamespace; + Oid prsnamespace BKI_DEFAULT(PGNSP); /* init parsing session */ regproc prsstart BKI_LOOKUP(pg_proc); diff --git a/src/include/catalog/pg_ts_template.h b/src/include/catalog/pg_ts_template.h index 5f5b580..5e66e02 100644 --- a/src/include/catalog/pg_ts_template.h +++ b/src/include/catalog/pg_ts_template.h @@ -32,7 +32,7 @@ CATALOG(pg_ts_template,3764,TSTemplateRelationId) NameData tmplname; /* name space */ - Oid tmplnamespace; + Oid tmplnamespace BKI_DEFAULT(PGNSP); /* initialization method of dict (may be 0) */ regproc tmplinit BKI_LOOKUP(pg_proc); diff --git a/src/include/catalog/pg_type.dat b/src/include/catalog/pg_type.dat index a430c79..3c2c813 100644 --- a/src/include/catalog/pg_type.dat +++ b/src/include/catalog/pg_type.dat @@ -12,20 +12,21 @@ [ -# Keep the following ordered by OID so that later changes can be made more -# easily. - # For types used in the system catalogs, make sure the values here match # TypInfo[] in bootstrap.c. -# The defined symbols for pg_type OIDs are generated by genbki.pl +# OID symbol macro names for pg_type OIDs are generated by genbki.pl # according to the following rule, so you don't need to specify them # here: # foo_bar -> FOO_BAROID # _foo_bar -> FOO_BARARRAYOID # -# The only symbols in this file are ones that don't match this rule, and -# are grandfathered in. +# The only oid_symbol entries in this file are for names that don't match +# this rule, and are grandfathered in. + +# Once upon a time these entries were ordered by OID. Lately it's often +# been the custom to insert new entries adjacent to related older entries. +# Try to do one or the other though, don't just insert entries at random. # OIDS 1 - 99 diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h index 695ed4e..4bcfac9 100644 --- a/src/include/catalog/pg_type.h +++ b/src/include/catalog/pg_type.h @@ -92,7 +92,7 @@ CATALOG(pg_type,1247,TypeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71,TypeRelati /* delimiter for arrays of this type */ char typdelim BKI_DEFAULT("\054"); - /* 0 if not a composite type */ + /* associated pg_class OID if a composite type, else 0 */ Oid typrelid BKI_DEFAULT(0); /* @@ -246,7 +246,7 @@ typedef FormData_pg_type *Form_pg_type; #ifdef EXPOSE_TO_CLIENT_CODE /* - * macros + * macros for values of poor-mans-enumerated-type columns */ #define TYPTYPE_BASE 'b' /* base type (ordinary scalar type) */ #define TYPTYPE_COMPOSITE 'c' /* composite (e.g., table's rowtype) */ diff --git a/src/include/catalog/reformat_dat_file.pl b/src/include/catalog/reformat_dat_file.pl index f748a45..cb7a020 100644 --- a/src/include/catalog/reformat_dat_file.pl +++ b/src/include/catalog/reformat_dat_file.pl @@ -13,7 +13,7 @@ # Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group # Portions Copyright (c) 1994, Regents of the University of California # -# /src/include/catalog/reformat_dat_file.pl +# src/include/catalog/reformat_dat_file.pl # #----------------------------------------------------------------------
Oh, one more thing: looking again at the contents of pg_proc.dat, I find myself annoyed at the need to specify pronargs. That's entirely derivable from proargtypes, and if we did so, we'd get down to this for the first few pg_proc entries: { oid => '1242', descr => 'I/O', proname => 'boolin', prorettype => 'bool', proargtypes => 'cstring', prosrc => 'boolin' }, { oid => '1243', descr => 'I/O', proname => 'boolout', prorettype => 'cstring', proargtypes => 'bool', prosrc => 'boolout' }, { oid => '1244', descr => 'I/O', proname => 'byteain', prorettype => 'bytea', proargtypes => 'cstring', prosrc => 'byteain' }, which seems pretty close to the minimum amount of stuff you could expect to need to write. I recall that this and some other data compression methods were on the table awhile back, and I expressed doubt about putting very much magic knowledge in genbki.pl, but this case seems like a no-brainer. genbki can always get the right answer, and expecting people to do it just creates another way to make a mistake. So attached is an incremental patch to do that. I had to adjust AddDefaultValues's argument list to tell it the catalog name, and once I did that, I saw that the API arrangement whereby the caller throws error was just useless complexity, so I got rid of it. regards, tom lane diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 70aea89..3b3bb6b 100644 *** a/src/backend/catalog/Catalog.pm --- b/src/backend/catalog/Catalog.pm *************** sub ParseData *** 254,265 **** } # Expand tuples to their full representation. ! my $error = AddDefaultValues($hash_ref, $schema); ! if ($error) ! { ! print "Failed to form full tuple for $catname\n"; ! die $error; ! } } else { --- 254,260 ---- } # Expand tuples to their full representation. ! AddDefaultValues($hash_ref, $schema, $catname); } else { *************** sub ParseData *** 289,302 **** return $data; } ! # Fill in default values of a record using the given schema. It's the ! # caller's responsibility to specify other values beforehand. ! # If we fail to fill all fields, return a nonempty error message. sub AddDefaultValues { ! my ($row, $schema) = @_; my @missing_fields; - my $msg; foreach my $column (@$schema) { --- 284,295 ---- return $data; } ! # Fill in default values of a record using the given schema. ! # It's the caller's responsibility to specify other values beforehand. sub AddDefaultValues { ! my ($row, $schema, $catname) = @_; my @missing_fields; foreach my $column (@$schema) { *************** sub AddDefaultValues *** 311,316 **** --- 304,316 ---- { $row->{$attname} = $column->{default}; } + elsif ($catname eq 'pg_proc' && $attname eq 'pronargs' && + defined($row->{proargtypes})) + { + # pg_proc.pronargs can be derived from proargtypes. + my @proargtypes = split /\s+/, $row->{proargtypes}; + $row->{$attname} = scalar(@proargtypes); + } else { # Failed to find a value. *************** sub AddDefaultValues *** 320,333 **** if (@missing_fields) { ! $msg = "Missing values for: " . join(', ', @missing_fields); ! $msg .= "\nShowing other values for context:\n"; while (my($key, $value) = each %$row) { $msg .= "$key => $value, "; } } - return $msg; } # Rename temporary files to final names. --- 320,334 ---- if (@missing_fields) { ! my $msg = "Failed to form full tuple for $catname\n"; ! $msg .= "Missing values for: " . join(', ', @missing_fields); ! $msg .= "\nOther values for row:\n"; while (my($key, $value) = each %$row) { $msg .= "$key => $value, "; } + die $msg; } } # Rename temporary files to final names. diff --git a/src/backend/catalog/genbki.pl b/src/backend/catalog/genbki.pl index d4c2ddf..3512952 100644 *** a/src/backend/catalog/genbki.pl --- b/src/backend/catalog/genbki.pl *************** sub morph_row_for_pgattr *** 641,651 **** $row->{attnotnull} = 'f'; } ! my $error = Catalog::AddDefaultValues($row, $pgattr_schema); ! if ($error) ! { ! die "Failed to form full tuple for pg_attribute: ", $error; ! } } # Write an entry to postgres.bki. Adding quotes here allows us to keep --- 641,647 ---- $row->{attnotnull} = 'f'; } ! Catalog::AddDefaultValues($row, $pgattr_schema, 'pg_attribute'); } # Write an entry to postgres.bki. Adding quotes here allows us to keep diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h index 2c42b5c..b25546e 100644 *** a/src/include/catalog/pg_proc.h --- b/src/include/catalog/pg_proc.h *************** CATALOG(pg_proc,1255,ProcedureRelationId *** 73,78 **** --- 73,79 ---- char proparallel BKI_DEFAULT(s); /* number of arguments */ + /* Note: need not be given in pg_proc.dat; genbki.pl will compute it */ int16 pronargs; /* number of arguments with defaults */ diff --git a/src/include/catalog/reformat_dat_file.pl b/src/include/catalog/reformat_dat_file.pl index b20d236..bbceb16 100644 *** a/src/include/catalog/reformat_dat_file.pl --- b/src/include/catalog/reformat_dat_file.pl *************** sub strip_default_values *** 201,206 **** --- 201,213 ---- { delete $row->{$attname}; } + + # Also delete pg_proc.pronargs, since that can be recomputed. + if ($catname eq 'pg_proc' && $attname eq 'pronargs' && + defined($row->{proargtypes})) + { + delete $row->{$attname}; + } } }
On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't think there's any great need to incorporate this into your patch > set. As far as I'm concerned, v14 is ready as-is, and I'll just apply > this over the top of it. (Note that I'll probably smash the whole thing > to one commit when the time comes.) Glad to hear it. A couple recent commits added #define symbols to headers, which broke the patchset, so I've attached v15, diff'd against 4f813c7203e. Commit 9fdb675fc added a symbol to pg_opfamily.h where there were none before, so I went ahead and wrapped it with an EXPOSE_TO_CLIENT_CODE macro. All your additional patches apply still apply over it. Your SGML patch can only apply if my doc patch is not applied, but I've included it anyway for the sake of no surprises. I'll check back in 24 hours to see if everything still applies. -John Naylor
Attachment
John Naylor wrote: > On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > I don't think there's any great need to incorporate this into your patch > > set. As far as I'm concerned, v14 is ready as-is, and I'll just apply > > this over the top of it. (Note that I'll probably smash the whole thing > > to one commit when the time comes.) > > Glad to hear it. A couple recent commits added #define symbols to > headers, which broke the patchset, so I've attached v15, diff'd > against 4f813c7203e. Commit 9fdb675fc added a symbol to pg_opfamily.h > where there were none before, so I went ahead and wrapped it with an > EXPOSE_TO_CLIENT_CODE macro. Actually, after pushing that, I was thinking maybe it's better to remove that #define from there and put it in each of the two .c files that need it. I don't think it makes sense to expose this macro any further, and before that commit it was localized to a single file. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > John Naylor wrote: >> Commit 9fdb675fc added a symbol to pg_opfamily.h >> where there were none before, so I went ahead and wrapped it with an >> EXPOSE_TO_CLIENT_CODE macro. > Actually, after pushing that, I was thinking maybe it's better to remove > that #define from there and put it in each of the two .c files that need > it. I don't think it makes sense to expose this macro any further, and > before that commit it was localized to a single file. We're speaking of IsBooleanOpfamily, right? Think I'd leave it where it is. As soon as you have more than one place using a macro like that, there's room for maintenance mistakes. Now it could also be argued that indxpath.c and partprune.c don't necessarily have the same idea of "boolean opfamily" anyway, in which case giving them separate copies might be better. Not sure about that. Anyway, now that John and I have each (separately) rebased the bootstrap patch over that, I'd appreciate it if you hold off cosmetic refactoring till said patch goes in, which I expect to do in ~ 24 hours. regards, tom lane
Tom Lane wrote: > Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > > John Naylor wrote: > >> Commit 9fdb675fc added a symbol to pg_opfamily.h > >> where there were none before, so I went ahead and wrapped it with an > >> EXPOSE_TO_CLIENT_CODE macro. > > > Actually, after pushing that, I was thinking maybe it's better to remove > > that #define from there and put it in each of the two .c files that need > > it. I don't think it makes sense to expose this macro any further, and > > before that commit it was localized to a single file. > > We're speaking of IsBooleanOpfamily, right? Yeah, that's the one. > Think I'd leave it where it is. As soon as you have more than one > place using a macro like that, there's room for maintenance mistakes. Yeah, that's why I thought it'd be better to have it somewhere central (the originally submitted patch just added it to partprune.c). > Anyway, now that John and I have each (separately) rebased the bootstrap > patch over that, I'd appreciate it if you hold off cosmetic refactoring > till said patch goes in, which I expect to do in ~ 24 hours. Understood. I'm going over David Rowley's runtime pruning patch now (doesn't touch any catalog files), which I intend to be my last functional commit this cycle, and won't be doing any other commits till after bootstrap rework has landed. (As I mentioned elsewhere, I intend to propose some restructuring of partitioning code, without any functional changes, during next week.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I wrote: > I'll check back in 24 hours to see if everything still applies. There were a couple more catalog changes that broke patch context, so attached is version 16. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > There were a couple more catalog changes that broke patch context, so > attached is version 16. Pushed with a few more adjustments (mostly, another round of copy-editing on bki.sgml). Now we wait to see what the buildfarm thinks, particularly about the MSVC build ... Congratulations, and many thanks, to John for seeing this through! I know it's been a huge amount of work, but we've needed this for years. regards, tom lane
On 4/9/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Congratulations, and many thanks, to John for seeing this through! > I know it's been a huge amount of work, but we've needed this for years. Many thanks for your review, advice, and additional hacking. Thanks also to Álvaro for review and initial commits, and to all who participated in previous discussion. -John Naylor
On April 8, 2018 10:19:59 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote: >John Naylor <jcnaylor@gmail.com> writes: >> There were a couple more catalog changes that broke patch context, so >> attached is version 16. > >Pushed with a few more adjustments (mostly, another round of >copy-editing >on bki.sgml). Now we wait to see what the buildfarm thinks, >particularly >about the MSVC build ... > >Congratulations, and many thanks, to John for seeing this through! >I know it's been a huge amount of work, but we've needed this for >years. Seconded and thirded and fourthed (?)! -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I experimented with converting all frontend code to include just the > catalog/pg_foo_d.h files instead of catalog/pg_foo.h, as per the > proposed new policy. I soon found that we'd overlooked one thing: > some clients expect to see the relation OID macros, eg > LargeObjectRelationId. Attached is a patch that changes things around > so that those appear in the _d files instead of the master files. [...] > Some of the CATALOG lines spill well past 80 characters with this, > although many of the affected ones already were overlength, eg > > -#define DatabaseRelationId 1262 > -#define DatabaseRelation_Rowtype_Id 1248 > - > -CATALOG(pg_database,1262) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248) > BKI_SCHEMA_MACRO > +CATALOG(pg_database,1262,DatabaseRelationId) BKI_SHARED_RELATION > BKI_ROWTYPE_OID(1248,DatabaseRelation_Rowtype_Id) BKI_SCHEMA_MACRO [ to which I responded with an inadequate alternative ] Thinking about this some more, a way occurred to me to shorten the CATALOG lines while still treating all headers the same, and with very little code (Patch 0001). What we do is automate the use of 'RelationId' and 'Relation_Rowtype_Id' so that the CATALOG macro only needs the part pertaining to the table name, and the BKI_ROWTYPE_OID macro can go back to just having the OID, eg: CATALOG(pg_database,1262,Database) BKI_SHARED_RELATION BKI_ROWTYPE_OID(1248) BKI_SCHEMA_MACRO This is shorter, but not quite as short as before the conversion. If we really wanted to, we could also leave off the BKI_ prefix from the CATALOG options (Patch 0002), eg: CATALOG(pg_database,1262,Database) SHARED_RELATION ROWTYPE_OID(1248) SCHEMA_MACRO CATALOG itself lacks a prefix, and its options can only go in one place, so we'd lose some consistency but perhaps we don't lose any clarity. (This isn't true for the attribute-level options for nullability etc, which can appear all over the place.) Results below - number of CATALOG lines with more than 80 characters, and the longest line: grep -E '^CATALOG\(' src/include/catalog/pg_*.h | sed 's/.*://' | awk '{ print length, $0 }' | sort -n -r -- before conversion: 6 lines > 80 chars 105 CATALOG(pg_auth_members,1261) BKI_SHARED_RELATION BKI_WITHOUT_OIDS BKI_ROWTYPE_OID(2843) BKI_SCHEMA_MACRO -- after conversion: 14 lines > 80 chars 162 CATALOG(pg_shseclabel,3592,SharedSecLabelRelationId) BKI_SHARED_RELATION BKI_ROWTYPE_OID(4066,SharedSecLabelRelation_Rowtype_Id) BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO -- with Patch 0001: 11 lines > 80 chars 118 CATALOG(pg_shseclabel,3592,SharedSecLabel) BKI_SHARED_RELATION BKI_ROWTYPE_OID(4066) BKI_WITHOUT_OIDS BKI_SCHEMA_MACRO -- with Patch 0002: 5 lines > 80 chars 102 CATALOG(pg_shseclabel,3592,SharedSecLabel) SHARED_RELATION ROWTYPE_OID(4066) WITHOUT_OIDS SCHEMA_MACRO -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > On 4/6/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Some of the CATALOG lines spill well past 80 characters with this, >> although many of the affected ones already were overlength, eg ... > Thinking about this some more, a way occurred to me to shorten the > CATALOG lines while still treating all headers the same, and with very > little code (Patch 0001). What we do is automate the use of > 'RelationId' and 'Relation_Rowtype_Id' so that the CATALOG macro only > needs the part pertaining to the table name, and the BKI_ROWTYPE_OID > macro can go back to just having the OID, eg: Hm ... I don't like this too much, because it means that grepping for those macros will no longer turn up the source of their definition. Yeah, if you already know how Relation_Rowtype_Id macros are created, you might not be confused, but I think it'd be problematic for newcomers. Essentially we'd be shortening these lines by obfuscating, which doesn't seem like a good tradeoff. It might be all right to drop the BKI_ prefixes as per your other suggestion, but I'm worried about possible symbol conflicts. It's probably not really worth changing that by itself. regards, tom lane
There still seems to be a lot of boilerplate in the .dat files that could be eliminated. Tom mentioned upthread that he did not want too much magic in genbki.pl or Catalog.pm, but I think I can propose putting some magic in the header files themselves. Take, for example, some of the fields in pg_type.dat. I'll elide the ones I'm not talking about with ...: {... typname => 'X', ... typinput => 'Xin', typoutput => 'Xout', typreceive => 'Xrecv', typsend => 'Xsend', ... }, If we changed pg_type.h: /* text format (required) */ - regproc typinput BKI_LOOKUP(pg_proc); - regproc typoutput BKI_LOOKUP(pg_proc); + regproc typinput BKI_DEFAULT("${typname}in") BKI_LOOKUP(pg_proc); + regproc typoutput BKI_DEFAULT("${typname}out") BKI_LOOKUP(pg_proc); /* binary format (optional) */ - regproc typreceive BKI_LOOKUP(pg_proc); - regproc typsend BKI_LOOKUP(pg_proc); + regproc typreceive BKI_DEFAULT("${typname}recv") BKI_LOOKUP(pg_proc); + regproc typsend BKI_DEFAULT("${typname}send") BKI_LOOKUP(pg_proc); we could remove the typinput, typoutput, typreceive, and typsend fields from many of the records. The logic for how to derive these fields would not be hardcoded into genbki.pl or Catalog.pm, but rather would be in pg_type.h, where it belongs. The pattern "${typname}in", for example, is not hardcoded in perl, but is rather specified in pg_type.h, making it obvious for those reading pg_type.h what the pattern will be for generating the default value. For those types where the typreceive and/or typsend values are not defined, owing to receive and send functionality not being implemented, the user would need to specify something like: {... typname => 'X', typreceive => '-', typsend => '-'}, but that seems desirable anyway, as it helps to document the lacking functionality. For types with associated functions named 'X_in', 'X_out', 'X_recv' and 'X_send' rather than 'Xin', 'Xout', 'Xrecv' and 'Xsend' the user would have to specify those function names. That's not a regression from how things are now, as all function names are currently required. Perhaps after this change is applied (if it is) there will be some pressure to standardize the naming of these functions. I'd consider that a good thing. If we changed pg_proc.h: /* procedure source text */ - text prosrc BKI_FORCE_NOT_NULL; + text prosrc BKI_DEFAULT("${proname}") BKI_FORCE_NOT_NULL; we could remove the prosrc field from many of the records, which would do a better job of calling attention to the remaining records where the C function name differs from the SQL function name. These two changes have been made in the patch I am submitting, with the consequence that pg_type.dat drops from 52954 bytes to 49106 bytes, and pg_proc.dat drops from 521302 bytes to 464554 bytes. Since postgres.bki is unchanged, I don't mean to suggest any runtime or initdb time performance improvement; I only mention the size reduction to emphasize that there is less text for human programmers to review. There are further changes possible along these lines, where instead of specifying s/FOO/BAR/ type substitition, we have something more like s/FOO/BAR/e and s/FOO/BAR/ee type symantics, but before proposing them, I'd like to see if the community likes the direction I am going with this patch. If so, we can debate whether, for example, the default alignment requirements of a type can be derived from the type's size rather than having to be specified for every row in pg_type.dat. mark
Attachment
On Wed, Apr 25, 2018 at 3:44 PM, Mark Dilger <hornschnorter@gmail.com> wrote: > There still seems to be a lot of boilerplate in the .dat files > that could be eliminated. Tom mentioned upthread that he did > not want too much magic in genbki.pl or Catalog.pm, but I think > I can propose putting some magic in the header files themselves. > > Take, for example, some of the fields in pg_type.dat. I'll elide > the ones I'm not talking about with ...: > > > {... typname => 'X', ... typinput => 'Xin', typoutput => 'Xout', > typreceive => 'Xrecv', typsend => 'Xsend', ... }, -1 for trying to automate this. It varies between fooin and foo_in, and it'll be annoying to have to remember which one happens automatically and which one needs an override. > If we changed pg_proc.h: > > /* procedure source text */ > - text prosrc BKI_FORCE_NOT_NULL; > + text prosrc BKI_DEFAULT("${proname}") BKI_FORCE_NOT_NULL; > > we could remove the prosrc field from many of the records, which would > do a better job of calling attention to the remaining records where the > C function name differs from the SQL function name. That one I kinda like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> On Apr 25, 2018, at 1:00 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Apr 25, 2018 at 3:44 PM, Mark Dilger <hornschnorter@gmail.com> wrote: >> There still seems to be a lot of boilerplate in the .dat files >> that could be eliminated. Tom mentioned upthread that he did >> not want too much magic in genbki.pl or Catalog.pm, but I think >> I can propose putting some magic in the header files themselves. >> >> Take, for example, some of the fields in pg_type.dat. I'll elide >> the ones I'm not talking about with ...: >> >> >> {... typname => 'X', ... typinput => 'Xin', typoutput => 'Xout', >> typreceive => 'Xrecv', typsend => 'Xsend', ... }, > > -1 for trying to automate this. It varies between fooin and foo_in, > and it'll be annoying to have to remember which one happens > automatically and which one needs an override. That may be a good argument. I was on the fence about it, because I like the idea that the project might do some cleanup and standardize the names of all in/out/send/recv functions so that no overrides would be required. (Likewise, I'd like the eq,ne,lt,le,gt,ge functions to be named in a standard way.) Would that be an API break and hence a non-starter? > >> If we changed pg_proc.h: >> >> /* procedure source text */ >> - text prosrc BKI_FORCE_NOT_NULL; >> + text prosrc BKI_DEFAULT("${proname}") BKI_FORCE_NOT_NULL; >> >> we could remove the prosrc field from many of the records, which would >> do a better job of calling attention to the remaining records where the >> C function name differs from the SQL function name. > > That one I kinda like. Yeah, that one is simpler, and it is the one that makes the majority of the difference in bringing down the size of the .dat files. mark
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Apr 25, 2018 at 3:44 PM, Mark Dilger <hornschnorter@gmail.com> wrote: >> There still seems to be a lot of boilerplate in the .dat files >> that could be eliminated. ... >> {... typname => 'X', ... typinput => 'Xin', typoutput => 'Xout', >> typreceive => 'Xrecv', typsend => 'Xsend', ... }, > -1 for trying to automate this. It varies between fooin and foo_in, > and it'll be annoying to have to remember which one happens > automatically and which one needs an override. Yeah, that was my first reaction to that example as well. If it weren't so nearly fifty-fifty then we might have more traction there; but it's pretty close, and actually the foo_in cases outnumber fooin if I counted correctly. (Array entries should be ignored for this purpose; maybe we'll autogenerate them someday.) >> + text prosrc BKI_DEFAULT("${proname}") BKI_FORCE_NOT_NULL; > That one I kinda like. Agreed, this seems more compelling. However, I think we need more than one compelling example to justify the additional infrastructure. There aren't that many places where there's obvious internal redundancy in single catalog rows, so I'm not sure that we're going to find a lot of win here. (The prosrc-from-proname case, in isolation, could be handled about as well by adding a hardwired rule like we have for pronargs.) I don't especially like the idea of trying to compute, for instance, typalign from typlen. That's mostly going to encourage people to overlook it, which isn't a good thing, because there are plenty of places where genbki.pl couldn't be expected to get it right. regards, tom lane
Mark Dilger <hornschnorter@gmail.com> writes: >> On Apr 25, 2018, at 1:00 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> -1 for trying to automate this. It varies between fooin and foo_in, >> and it'll be annoying to have to remember which one happens >> automatically and which one needs an override. > That may be a good argument. I was on the fence about it, because I > like the idea that the project might do some cleanup and standardize > the names of all in/out/send/recv functions so that no overrides would > be required. (Likewise, I'd like the eq,ne,lt,le,gt,ge functions to > be named in a standard way.) > Would that be an API break and hence a non-starter? Yeah, I'm afraid so. I'm pretty sure I recall cases where people invoked I/O functions by name, and I definitely recall cases where operator-implementation functions were invoked by name. Those might not be very bright things to do, but people do 'em. We'd also be risking creating problems for individual apps by conflicting with user-defined functions that we didn't use to conflict with. We take that chance every time we add functionality, of course, but usually we can soothe complaints by pointing out that they got some new functionality in return. I don't think "we made I/O function names more uniform" is going to be a very satisfactory excuse for breakage. regards, tom lane
> On Apr 25, 2018, at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Apr 25, 2018 at 3:44 PM, Mark Dilger <hornschnorter@gmail.com> wrote: >>> There still seems to be a lot of boilerplate in the .dat files >>> that could be eliminated. ... > >>> {... typname => 'X', ... typinput => 'Xin', typoutput => 'Xout', >>> typreceive => 'Xrecv', typsend => 'Xsend', ... }, > >> -1 for trying to automate this. It varies between fooin and foo_in, >> and it'll be annoying to have to remember which one happens >> automatically and which one needs an override. > > Yeah, that was my first reaction to that example as well. If it > weren't so nearly fifty-fifty then we might have more traction there; > but it's pretty close, and actually the foo_in cases outnumber fooin > if I counted correctly. (Array entries should be ignored for this > purpose; maybe we'll autogenerate them someday.) Part of the problem right now is that nothing really encourages new functions to be named foo_in vs. fooin, so the nearly 50/50 split will continue when new code is contributed. If the system forced you to specify the name when you did it in a nonstandard way, and not otherwise, that would have the effect of documenting which way is now considered standard. > >>> + text prosrc BKI_DEFAULT("${proname}") BKI_FORCE_NOT_NULL; > >> That one I kinda like. > > Agreed, this seems more compelling. However, I think we need more than > one compelling example to justify the additional infrastructure. I'll look for more. > There > aren't that many places where there's obvious internal redundancy in > single catalog rows, so I'm not sure that we're going to find a lot of win > here. (The prosrc-from-proname case, in isolation, could be handled about > as well by adding a hardwired rule like we have for pronargs.) Right, but doing that in genbki.pl or Catalog.pm is an obfuscation. Doing it in pg_proc.h makes is much more reasonable, I think. Note that my patch does not hardcode the logic in the perl code. It teaches the perl code how to do variable substitution, but then allows the actual rules for each row to be written in the header file itself. > > I don't especially like the idea of trying to compute, for instance, > typalign from typlen. That's mostly going to encourage people to overlook > it, which isn't a good thing, because there are plenty of places where > genbki.pl couldn't be expected to get it right. Well, it wouldn't be in genbki.pl, it would be in pg_type.h, but I take your broader point that you don't want this calculated.
>> >>>> + text prosrc BKI_DEFAULT("${proname}") BKI_FORCE_NOT_NULL; >> >>> That one I kinda like. >> >> Agreed, this seems more compelling. However, I think we need more than >> one compelling example to justify the additional infrastructure. > > I'll look for more. There are two more that seem reasonable optimizations in pg_cast.h: diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h index 7f4a25b2da..98794df7f8 100644 --- a/src/include/catalog/pg_cast.h +++ b/src/include/catalog/pg_cast.h @@ -37,13 +37,13 @@ CATALOG(pg_cast,2605,CastRelationId) Oid casttarget BKI_LOOKUP(pg_type); /* cast function; 0 = binary coercible */ - Oid castfunc BKI_LOOKUP(pg_proc); + Oid castfunc BKI_DEFAULT("${casttarget}(${castsource})") BKI_LOOKUP(pg_proc); /* contexts in which cast can be used */ char castcontext; /* cast method */ - char castmethod; + char castmethod BKI_DEFAULT("/${castfunc} == 0 ? 'b' : 'f'/e"); } FormData_pg_cast; /* ---------------- Which would convert numerous lines like: { castsource => 'money', casttarget => 'numeric', castfunc => 'numeric(money)', castcontext => 'a', castmethod => 'f' }, To shorter lines like: { castsource => 'money', casttarget => 'numeric', castcontext => 'a' }, which is great, because all you really are trying to tell the postgres system is that when you cast from numeric to money it has to be an explicit assignment and not an implicit cast. The extra stuff about castfunc and castmethod just gets in the way of understanding what is being done. There are another two in pg_opclass.h: diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h index b980327fc0..9f528f97c0 100644 --- a/src/include/catalog/pg_opclass.h +++ b/src/include/catalog/pg_opclass.h @@ -52,7 +52,7 @@ CATALOG(pg_opclass,2616,OperatorClassRelationId) Oid opcmethod BKI_LOOKUP(pg_am); /* name of this opclass */ - NameData opcname; + NameData opcname BKI_DEFAULT("${opcintype}_ops"); /* namespace of this opclass */ Oid opcnamespace BKI_DEFAULT(PGNSP); @@ -61,7 +61,7 @@ CATALOG(pg_opclass,2616,OperatorClassRelationId) Oid opcowner BKI_DEFAULT(PGUID); /* containing operator family */ - Oid opcfamily BKI_LOOKUP(pg_opfamily); + Oid opcfamily BKI_DEFAULT("${opcmethod}/${opcintype}_ops") BKI_LOOKUP(pg_opfamily); /* type of data indexed by opclass */ Oid opcintype BKI_LOOKUP(pg_type); Which would convert numerous lines like the following line from pg_opclass.dat: { opcmethod => 'btree', opcname => 'bytea_ops', opcfamily => 'btree/bytea_ops', opcintype => 'bytea' }, to much easier to read lines like: { opcmethod => 'btree', opcintype => 'bytea' }, which is also great, because you're really just declaring that type bytea has a btree opclass and letting the system do the rest of the work. Having to manually specify opfamily and the opname just clutters the row and makes it less intuitive. There are a bunch more of varying quality, depending on which automations you like or don't like. mark
On Thu, Apr 26, 2018 at 2:11 AM, Mark Dilger <hornschnorter@gmail.com> wrote: > >> On Apr 25, 2018, at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Robert Haas <robertmhaas@gmail.com> writes: >>> On Wed, Apr 25, 2018 at 3:44 PM, Mark Dilger <hornschnorter@gmail.com> wrote: >>>> There still seems to be a lot of boilerplate in the .dat files >>>> that could be eliminated. ... >> >>>> {... typname => 'X', ... typinput => 'Xin', typoutput => 'Xout', >>>> typreceive => 'Xrecv', typsend => 'Xsend', ... }, >> >>> -1 for trying to automate this. It varies between fooin and foo_in, >>> and it'll be annoying to have to remember which one happens >>> automatically and which one needs an override. >> >> Yeah, that was my first reaction to that example as well. If it >> weren't so nearly fifty-fifty then we might have more traction there; >> but it's pretty close, and actually the foo_in cases outnumber fooin >> if I counted correctly. (Array entries should be ignored for this >> purpose; maybe we'll autogenerate them someday.) > > Part of the problem right now is that nothing really encourages new > functions to be named foo_in vs. fooin, so the nearly 50/50 split will > continue when new code is contributed. If the system forced you to > specify the name when you did it in a nonstandard way, and not otherwise, > that would have the effect of documenting which way is now considered > standard. > FWIW, I would like some standard naming convention one way or the other for in/out function names. Looking up pg_type.h for in/out functions when you know the built-in type name isn't great. But that itself may not be enough reason to change it. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
On Wed, Apr 25, 2018 at 04:39:42PM -0400, Tom Lane wrote: > Mark Dilger <hornschnorter@gmail.com> writes: > >> On Apr 25, 2018, at 1:00 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >> -1 for trying to automate this. It varies between fooin and foo_in, > >> and it'll be annoying to have to remember which one happens > >> automatically and which one needs an override. > > > That may be a good argument. I was on the fence about it, because I > > like the idea that the project might do some cleanup and standardize > > the names of all in/out/send/recv functions so that no overrides would > > be required. (Likewise, I'd like the eq,ne,lt,le,gt,ge functions to > > be named in a standard way.) > > > Would that be an API break and hence a non-starter? > > Yeah, I'm afraid so. I'm pretty sure I recall cases where people > invoked I/O functions by name, and I definitely recall cases where > operator-implementation functions were invoked by name. Those might > not be very bright things to do, but people do 'em. > > We'd also be risking creating problems for individual apps by conflicting > with user-defined functions that we didn't use to conflict with. We take > that chance every time we add functionality, of course, but usually we can > soothe complaints by pointing out that they got some new functionality in > return. I don't think "we made I/O function names more uniform" is going > to be a very satisfactory excuse for breakage. What do you rate the chances that someone created a foo_in when there's already a built-in fooin? If it's low enough, we could add all the foo_ins as aliases to fooins and then mandate that new ones be called foo_in for future foos. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 4/26/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > (The prosrc-from-proname case, in isolation, could be handled about > as well by adding a hardwired rule like we have for pronargs.) If we think we might go in the direction of more special-case behavior, the attached patch more fully documents what we already do, and does some minor refactoring to make future additions more straightforward. I think this would even be good to do for v11. > Robert Haas <robertmhaas@gmail.com> writes: >> -1 for trying to automate this. It varies between fooin and foo_in, >> and it'll be annoying to have to remember which one happens >> automatically and which one needs an override. > > Yeah, that was my first reaction to that example as well. If it > weren't so nearly fifty-fifty then we might have more traction there; > but it's pretty close, and actually the foo_in cases outnumber fooin > if I counted correctly. (Array entries should be ignored for this > purpose; maybe we'll autogenerate them someday.) Hmm, that wouldn't be too hard. Add a new metadata field called 'array_type_oid', then if it finds such an OID, teach genbki.pl to construct a tuple for the corresponding array type by consulting something like char typcategory BKI_ARRAY(A); char typstorage BKI_ARRAY(x); ...etc in the header file, plus copying typalign from the element type. I'll whip this up sometime and add it to the next commitfest. On 4/26/18, Mark Dilger <hornschnorter@gmail.com> wrote: > /* cast method */ > - char castmethod; > + char castmethod BKI_DEFAULT("/${castfunc} == 0 ? 'b' : 'f'/e"); > } FormData_pg_cast; I don't have a strong opinion about your simple substitution mechanism, but I find the above a bit harder to reason about. It also has the added disadvantage that there is whitespace in the BKI annotation, so it can no longer be parsed with the current setup. That could be overcome with additional complexity, of course, but then you're back in the position of advocating that for a single use case. -John Naylor
Attachment
John Naylor <jcnaylor@gmail.com> writes: > On 4/26/18, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> (The prosrc-from-proname case, in isolation, could be handled about >> as well by adding a hardwired rule like we have for pronargs.) > If we think we might go in the direction of more special-case > behavior, the attached patch more fully documents what we already do, > and does some minor refactoring to make future additions more > straightforward. I think this would even be good to do for v11. Agreed, pushed. regards, tom lane
Hackers, Have you already considered and rejected the idea of having genbki.pl/Catalog.pm define constants that can be used in the catalog .dat files? I'm mostly curious if people think the resulting .dat files are better or worse using constants of this sort. For example: diff --git a/src/backend/catalog/Catalog.pm b/src/backend/catalog/Catalog.pm index 7497d9cd9f..58ce24adf0 100644 --- a/src/backend/catalog/Catalog.pm +++ b/src/backend/catalog/Catalog.pm @@ -250,6 +250,21 @@ sub ParseData if ($lcnt == $rcnt) { + # pg_cast constants for castcontext + use constant IMPLICIT => 'i'; + use constant ASSIGNMENT => 'a'; + use constant EXPLICIT => 'e'; + + # pg_cast constants for castmethod + use constant FUNCTION => 'f'; + use constant BINARY => 'b'; + use constant INOUT => 'i'; + + # pg_proc constants for provolatile + use constant IMMUTABLE => 'i'; + use constant STABLE => 's'; + use constant VOLATILE => 'v'; + eval '$hash_ref = ' . $_; if (!ref $hash_ref) { diff --git a/src/include/catalog/pg_cast.dat b/src/include/catalog/pg_cast.dat index cf007528fd..a4ceceb652 100644 --- a/src/include/catalog/pg_cast.dat +++ b/src/include/catalog/pg_cast.dat @@ -19,79 +19,79 @@ # int2->int4->int8->numeric->float4->float8, while casts in the # reverse direction are assignment-only. { castsource => 'int8', casttarget => 'int2', castfunc => 'int2(int8)', - castcontext => 'a', castmethod => 'f' }, + castcontext => ASSIGNMENT, castmethod => FUNCTION }, { castsource => 'int8', casttarget => 'int4', castfunc => 'int4(int8)', - castcontext => 'a', castmethod => 'f' }, + castcontext => ASSIGNMENT, castmethod => FUNCTION }, { castsource => 'int8', casttarget => 'float4', castfunc => 'float4(int8)',
Mark Dilger <hornschnorter@gmail.com> writes: > Have you already considered and rejected the idea of having > genbki.pl/Catalog.pm define constants that can be used in > the catalog .dat files? I'm mostly curious if people think > the resulting .dat files are better or worse using constants > of this sort. For example: Hm, I don't think it's been debated in exactly these terms; but I find myself a bit skeptical of the proposal as-is. At least going by your examples, it'd make the .dat files a good bit more verbose, which doesn't seem like what we want. Also it creates a bigger impedance mismatch between what you see in the .dat files and what you see in the actual catalogs, inviting confusion. Also, with the particular implementation technique you suggest here, there'd be a single namespace for the constants across all catalogs, which creates a lot of possibilities for conflicts and mistakes. We could fix that by instituting some sort of unique-ifying naming convention, say CAST_IMPLICIT not just IMPLICIT, but that makes the verbosity problem worse. I think if we wanted to have something along this line, my preferred approach would be to continue to write the entries as string literals: castcontext => 'ASSIGNMENT', castmethod => 'FUNCTION' }, and make it the responsibility of genbki.pl to convert to the values recognized by the backend. That way the conversion could happen on a per-column basis, eliminating the conflict issue. But I'm not really convinced that this'd be an improvement over where we are now. regards, tom lane
On 5/7/18, Mark Dilger <hornschnorter@gmail.com> wrote: > Hackers, > > Have you already considered and rejected the idea of having > genbki.pl/Catalog.pm define constants that can be used in > the catalog .dat files? I'm mostly curious if people think > the resulting .dat files are better or worse using constants > of this sort. For example: ... > + # pg_cast constants for castcontext > + use constant IMPLICIT => 'i'; > + use constant ASSIGNMENT => 'a'; > + use constant EXPLICIT => 'e'; The comment refers to pg_cast, but these constants apply globally. It's also not the right place from a maintainability perspective, and if it was, now these values have different macros defined in two places. This is not good. > - castcontext => 'a', castmethod => 'f' }, > + castcontext => ASSIGNMENT, castmethod => FUNCTION }, For one, this breaks convention that the values are always single-quoted. If you had a use case for something like this, I would instead use the existing lookup infrastructure and teach genbki.pl to parse the enums (or #defines as the case might be) in the relevant header file. You'd need some improvement in readability to justify that additional code, though. I don't think this example quite passes (it's pretty obvious locally what the letters refer to), but others may feel differently. -John Naylor
> On May 6, 2018, at 12:08 PM, John Naylor <jcnaylor@gmail.com> wrote: > > On 5/7/18, Mark Dilger <hornschnorter@gmail.com> wrote: >> Hackers, >> >> Have you already considered and rejected the idea of having >> genbki.pl/Catalog.pm define constants that can be used in >> the catalog .dat files? I'm mostly curious if people think >> the resulting .dat files are better or worse using constants >> of this sort. For example: > ... >> + # pg_cast constants for castcontext >> + use constant IMPLICIT => 'i'; >> + use constant ASSIGNMENT => 'a'; >> + use constant EXPLICIT => 'e'; > > The comment refers to pg_cast, but these constants apply globally. > It's also not the right place from a maintainability perspective, and > if it was, now these values have different macros defined in two > places. This is not good. > >> - castcontext => 'a', castmethod => 'f' }, >> + castcontext => ASSIGNMENT, castmethod => FUNCTION }, > > For one, this breaks convention that the values are always > single-quoted. If you had a use case for something like this, I would > instead use the existing lookup infrastructure and teach genbki.pl to > parse the enums (or #defines as the case might be) in the relevant > header file. You'd need some improvement in readability to justify > that additional code, though. I don't think this example quite passes > (it's pretty obvious locally what the letters refer to), but others > may feel differently. John, Tom, thanks for the feedback. I share your concerns about my straw-man proposal. But.... In the catalogs, 'f' usually means 'false', not 'function'. A person reading pg_cast.dat could see: castmethod => 'f' and think that meant a binary conversion, since castmethod is false. That's almost exactly wrong. Hence my desire to write castmethod => FUNCTION I don't have any better proposal, though. mark