Thread: Release note trimming: another modest proposal
We've been around on this before, I know, but I got annoyed about it again while waiting around for test builds of the back-branch documentation. I think that we need some policy about maintaining back-branch release notes that's not "keep everything, forever".

The release notes are becoming an ever-larger fraction of the docs, and that's not good for documentation maintenance or for download bandwidth. As an example, looking at the US-letter PDF version of the v10 docs, as things stand today:

Total page count: 3550
Pages in release notes for 10.x: 41 (1%)
Pages in release notes for older branches: 898 (25%)
Pages in release notes for pre-9.2 branches: 546 (15%)

I've not measured directly, but it's a reasonable assumption that if we dropped all the back-branch release notes the documentation build time would drop about 25%, whichever format you were building.

I also live in fear of overrunning TeX's hard-wired limits, in the back branches that depend on a TeX-based PDF toolchain. We've hit those before and been able to work around them, but I wouldn't count on doing so again, and I sure don't want to discover that we have a problem of that sort the day before a release deadline. Trimming the release notes would definitely give us enough slack to not worry about that before all those branches are EOL.

We've discussed trimming the release notes before, and people have objected on the grounds that they like being able to access ancient notes from time to time. I'm not unsympathetic to that issue, but does that access point need to be our daily working documentation?

Anyway, I'd like to propose a compromise position that I don't think has been discussed before: let's drop release notes for branches that were already EOL when a given branch was released. So for example, 9.3 and before would go away from v12, due out next year. Working backwards, we'd drop 9.1 and before from v10, giving the 15% savings in page count that I showed above. A quick measurement says that would also trim the size of the v10 tarball by about 4%, which is not a lot maybe but it's noticeable across a lot of downloads.

It seems to me that this would still provide enough historical info for just about any ordinary interest. We could discuss ways of making a complete release-note archive available somewhere, if "go dig in the git repo" doesn't seem like an adequate answer for that.

Thoughts?

regards, tom lane
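The proposed rule (keep a branch's release notes only if that branch was not yet EOL when the new branch shipped) can be sketched as a small filter. This is only an illustration: the function name `notes_to_keep` is made up here, and the dates in `EOL_DATES` and `V10_RELEASE` are illustrative assumptions rather than figures taken from this thread.

```python
from datetime import date


def notes_to_keep(release_date, eol_dates):
    """Return the branches whose release notes stay in a branch shipped on release_date.

    A branch is dropped if it was already EOL on the day the new branch was released.
    """
    return [branch for branch, eol in eol_dates.items() if eol > release_date]


# Illustrative dates only (assumptions for this sketch, not authoritative).
EOL_DATES = {
    "9.0": date(2015, 10, 8),
    "9.1": date(2016, 10, 27),
    "9.2": date(2017, 11, 9),
    "9.3": date(2018, 11, 8),
}
V10_RELEASE = date(2017, 10, 5)

kept = notes_to_keep(V10_RELEASE, EOL_DATES)
print(kept)
```

With these assumed dates, 9.1 and earlier fall out of the v10 docs, which matches the example given in the proposal.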
On 5 August 2018 at 23:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Anyway, I'd like to propose a compromise position that I don't think
> has been discussed before: let's drop release notes for branches
> that were already EOL when a given branch was released.

WFM. +1

Regards,
Dean
On Aug 5, 2018, at 6:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We've been around on this before, I know, but I got annoyed about it
again while waiting around for test builds of the back-branch
documentation. I think that we need some policy about maintaining
back-branch release notes that's not "keep everything, forever".
The release notes are becoming an ever-larger fraction of the docs,
and that's not good for documentation maintenance or for download
bandwidth. As an example, looking at the US-letter PDF version of
the v10 docs, as things stand today:
Total page count: 3550
Pages in release notes for 10.x: 41 (1%)
Pages in release notes for older branches: 898 (25%)
Pages in release notes for pre-9.2 branches: 546 (15%)
I've not measured directly, but it's a reasonable assumption that if
we dropped all the back-branch release notes the documentation build
time would drop about 25%, whichever format you were building.
I also live in fear of overrunning TeX's hard-wired limits, in the
back branches that depend on a TeX-based PDF toolchain. We've hit
those before and been able to work around them, but I wouldn't count
on doing so again, and I sure don't want to discover that we have a
problem of that sort the day before a release deadline. Trimming the
release notes would definitely give us enough slack to not worry
about that before all those branches are EOL.
We've discussed trimming the release notes before, and people have
objected on the grounds that they like being able to access ancient
notes from time to time. I'm not unsympathetic to that issue, but
does that access point need to be our daily working documentation?
I’ll reference old release notes when researching some historical
evolution of a feature, but it’s definitely not a part of daily work.
Anyway, I'd like to propose a compromise position that I don't think
has been discussed before: let's drop release notes for branches
that were already EOL when a given branch was released. So for
example, 9.3 and before would go away from v12, due out next year.
Working backwards, we'd drop 9.1 and before from v10, giving the 15%
savings in page count that I showed above. A quick measurement says
that would also trim the size of the v10 tarball by about 4%, which
is not a lot maybe but it's noticeable across a lot of downloads.
+1. This is also a time consuming process when working the release
itself, so any time savings is great.
It seems to me that this would still provide enough historical
info for just about any ordinary interest. We could discuss ways
of making a complete release-note archive available somewhere,
if "go dig in the git repo" doesn't seem like an adequate answer
for that.
Why not www.postgresql.org? We could add it as a subnav to the
documentation section and just have the entire archive there. We could
then update the official docs to say “If you would like to reference release
notes for earlier versions, please visit <URL>”
Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: >> On Aug 5, 2018, at 6:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> ... We could discuss ways >> of making a complete release-note archive available somewhere, >> if "go dig in the git repo" doesn't seem like an adequate answer >> for that. > Why not www.postgresql.org <http://www.postgresql.org/>? We could add it as a subnav to the > documentation section and just have the entire archive there. We could > then update the official docs to say “If you would like to reference release > notes for earlier versions, please visit <URL>” Yeah, that should certainly be part of it. The questions I have are (1) Is it sufficient to have that info on the website? People who want it locally can always fall back on searching the development git repo, but it'd be less convenient perhaps. (2) How would we maintain that exactly? It's not, for instance, possible to build the release notes as a standalone document right now. (Bruce's eagerness to provide xrefs for just about everything is the main stumbling block, though there might be others.) The process I'm vaguely imagining is that when a release branch is EOL'd, before removing its release-NN.sgml file from the HEAD branch, we copy that file into some archive somewhere and do a one-time edit to make it buildable as part of a standalone release-notes document. Maybe the "archive" contains a makefile and enough supporting stuff to build a document that has just the obsolete release notes, and somewhere we have a git repo for that. Then anybody who wants local access can clone that repo (solving question 1), and we annually use it to build a new version of the old-release-notes document to put on the website. This seems like a nontrivial amount of work, but maybe we can automate it to some extent. regards, tom lane
> On Aug 6, 2018, at 11:09 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>>> On Aug 5, 2018, at 6:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> ... We could discuss ways
>>> of making a complete release-note archive available somewhere,
>>> if "go dig in the git repo" doesn't seem like an adequate answer
>>> for that.
>
>> Why not www.postgresql.org <http://www.postgresql.org/>? We could add it as a subnav to the
>> documentation section and just have the entire archive there. We could
>> then update the official docs to say “If you would like to reference release
>> notes for earlier versions, please visit <URL>”
>
> Yeah, that should certainly be part of it. The questions I have are
>
> (1) Is it sufficient to have that info on the website? People who want
> it locally can always fall back on searching the development git repo,
> but it'd be less convenient perhaps.

Skimming some other OSS projects, it seems to be all over the board. Some have a webpage covering releases, some have nicer formatted documentation with a release section, some just link to the CHANGELOG in a repo.

We could do something like:

- Host release notes on .org
- Have a reference in the official release notes to the page on the website that houses the historical notes.

That way we’re building “pointers” to the official release notes as opposed to having to build them every single time.

Though thinking on this further, we’d probably want to maintain the URLs that have been generated through the years so they don’t all 404 at once. That would require having the appropriate URL rules written out either in pgweb itself or at the web server level.

> (2) How would we maintain that exactly? It's not, for instance, possible
> to build the release notes as a standalone document right now. (Bruce's
> eagerness to provide xrefs for just about everything is the main stumbling
> block, though there might be others.)

Well, as long as we are still housing the docs and those references are still alive, it should be ok.

> The process I'm vaguely imagining is that when a release branch is EOL'd,
> before removing its release-NN.sgml file from the HEAD branch, we copy
> that file into some archive somewhere and do a one-time edit to make it
> buildable as part of a standalone release-notes document. Maybe the
> "archive" contains a makefile and enough supporting stuff to build a
> document that has just the obsolete release notes, and somewhere we have
> a git repo for that. Then anybody who wants local access can clone that
> repo (solving question 1), and we annually use it to build a new version
> of the old-release-notes document to put on the website.

Another option is we could have a script that just scrapes the data from the already built docs and loads it into (file system, database, etc.). This could become a part of the (minor/major) release process. The biggest pain would be doing this the first time, as we’d have to get all of the historical notes in a one-time sweep.

> This seems like a nontrivial amount of work, but maybe we can automate it
> to some extent.

If nontrivial work saves a lot of wasted time during the build process, I’m for it.

Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > Though thinking on this further, we’d probably want to maintain the URLs > that have been generated through the years so they don’t all 404 at once. > That would require having the appropriate URL rules written out either in > pgweb itself or at the web server level. I dunno, you think it's worth the trouble? The whole premise of this proposal is that hardly anybody is looking at those pages. If that's not the case, we shouldn't be doing this. OTOH, if we can easily set up a generic redirect rule like "if https://www.postgresql.org/docs/*/static/release-*.html doesn't exist, then redirect to https://www.postgresql.org/docs/old-release-notes/static/release-*.html" it might be worth doing. regards, tom lane
On 2018-Aug-06, Tom Lane wrote:
> OTOH, if we can easily set up a generic redirect rule like "if
> https://www.postgresql.org/docs/*/static/release-*.html
> doesn't exist, then redirect to
> https://www.postgresql.org/docs/old-release-notes/static/release-*.html"
> it might be worth doing.

Yeah I'm pretty sure we can do that. I'm not sure how many people rely on this, but it seems useful to keep HTML-rendered relnotes for all versions (rather than require people to read SGML source). I don't think we need PDFs though ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On Aug 6, 2018, at 11:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>> Though thinking on this further, we’d probably want to maintain the URLs
>> that have been generated through the years so they don’t all 404 at once.
>> That would require having the appropriate URL rules written out either in
>> pgweb itself or at the web server level.
>
> I dunno, you think it's worth the trouble? The whole premise of this
> proposal is that hardly anybody is looking at those pages. If that's
> not the case, we shouldn't be doing this.

I took a look at the stats and directionally it’s incredibly low. I’m more concerned about introducing 404s that could hurt any SEO-related metrics, but that could just be general concern vs. anything factual.

> OTOH, if we can easily set up a generic redirect rule like "if
> https://www.postgresql.org/docs/*/static/release-*.html
> doesn't exist, then redirect to
> https://www.postgresql.org/docs/old-release-notes/static/release-*.html"
> it might be worth doing.

And looking at how the docs are served, we could do this from pgweb, which is fairly straightforward.

FWIW I’m thinking of something like:

`/docs/release-notes/release-X-Y(-Z)?.html`

and have them all live there. Of course the docs themselves would still have their copy of the release notes, but we could at least have a single repository of all the releases, which I do see on other OSS projects.

Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > FWIW I’m thinking of something like: > `/docs/release-notes/release-X-Y(-Z)?.html` > and have them all live there. Of course the docs themselves would still > have their copy of the release notes, but we could at least have a single > repository of all the releases, which I do see on other OSS projects. I'm imagining this being a repo of only the obsolete branches' release notes, not the active ones. Otherwise we are talking about maintaining two copies of active release note files (because of the xref problem). I personally will flat out refuse to do that; the overhead of maintaining the relnotes is high enough already. Maybe you could make the website look like that without any manual effort using a reverse redirection rule (redirecting from this new area back into the standard docs, for pages belonging to active branches). But that seems pretty confusing, and prone to redirection loops if we also have the other thing. regards, tom lane
> On Aug 6, 2018, at 12:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>> FWIW I’m thinking of something like:
>
>> `/docs/release-notes/release-X-Y(-Z)?.html`
>
>> and have them all live there. Of course the docs themselves would still
>> have their copy of the release notes, but we could at least have a single
>> repository of all the releases, which I do see on other OSS projects.
>
> I'm imagining this being a repo of only the obsolete branches' release
> notes, not the active ones. Otherwise we are talking about maintaining
> two copies of active release note files (because of the xref problem).
> I personally will flat out refuse to do that; the overhead of maintaining
> the relnotes is high enough already.

Well I want to make this easier, not harder. Thinking about the process of maintaining all of them, no matter what, I see it making things more complicated for someone, so I will drop that for now.

> Maybe you could make the website look like that without any manual effort
> using a reverse redirection rule (redirecting from this new area back
> into the standard docs, for pages belonging to active branches). But that
> seems pretty confusing, and prone to redirection loops if we also have the
> other thing.

Agreed. So perhaps `/docs/archive/release-notes/release-X-Y(-Z)?.html` will be where they live.

I can make a quick prototype of this on pgweb just to see how easy it is to get the release notes up in it. Basically, once the archived ones are in pgweb, we would not need to have to build them anymore.

Jonathan
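To make the `release-X-Y(-Z)?.html` naming concrete, here is a small sketch of how an archive URL of that shape could be matched and split into version components. The regular expression and the helper name `parse_archive_url` are assumptions for illustration, not pgweb's actual URL configuration.

```python
import re

# Assumed archive location; X, Y and the optional Z are taken to be integers.
ARCHIVE_URL = re.compile(
    r"^/docs/archive/release-notes/release-(\d+)-(\d+)(?:-(\d+))?\.html$"
)


def parse_archive_url(path):
    """Return the version as a tuple of ints, or None if the path does not match."""
    match = ARCHIVE_URL.match(path)
    if match is None:
        return None
    # The third group is None when the optional -Z part is absent.
    return tuple(int(part) for part in match.groups() if part is not None)


print(parse_archive_url("/docs/archive/release-notes/release-9-1-5.html"))
print(parse_archive_url("/docs/archive/release-notes/release-10-4.html"))
```

A route handler could use the returned tuple as the lookup key for the stored release note.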
On Aug 6, 2018, at 1:22 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
I can make a quick prototype of this on pgweb just to see how easy it is to get
the release notes up in it. Basically, once the archived ones are in pgweb, we
would not need to have to build them anymore.
Attached is a screenshot of something real quick I drew up. I was able to
generate these notes from what was already loaded in via “docload” (which
is what happens in every release).
I did manually edit the xref’s in order to have them appear more cleanly, but
I should be able to script the process.
To quote you earlier, yes there is a bit of nontrivial work here, but I do think
we have most of the tools in place to do this. What I am thinking is the
following:
1. Add to the “docload” script to segment out the release notes and store
them in a separate table. Perform an “upsert” (i.e. check for an existing
reference; if it’s there, update any content, otherwise insert).
2. Perform any modifications to the content (i.e. there’s some HTML I
explicitly removed from the generated docs).
3. Display the archived docs on the page.
That way in future docloads, if there are missing release notes, the script
would be ok as it would not remove any release notes.
This strategy *should* also work with displaying current release notes on the
site, as it’s basically following what docload currently does, if we wanted to
go down this path.
Once we run this for the first time with the collection of *all* release notes,
we could then trim down release.sgml et al. And thus as far as I can tell, you
would not have to modify anything in the doc build process.
Thoughts?
Jonathan
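The upsert step can be sketched as follows. This is only an illustration of the check-update-else-insert idea: it uses an in-memory SQLite database as a stand-in for pgweb's real database, and the table name `release_notes`, its columns, and the function `upsert_release_note` are all hypothetical.

```python
import sqlite3


def upsert_release_note(conn, version, content):
    """Update the archived note for `version` if present, otherwise insert it."""
    cur = conn.execute(
        "UPDATE release_notes SET content = ? WHERE version = ?",
        (content, version),
    )
    if cur.rowcount == 0:  # no existing row was updated, so insert a new one
        conn.execute(
            "INSERT INTO release_notes (version, content) VALUES (?, ?)",
            (version, content),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE release_notes (version TEXT PRIMARY KEY, content TEXT)")

# First docload: the tarball still ships notes for 9.1 and 10.4.
for version, content in [("9.1", "old notes"), ("10.4", "draft")]:
    upsert_release_note(conn, version, content)

# Later docload: 9.1 has been trimmed from the tarball, 10.4 was revised.
for version, content in [("10.4", "final")]:
    upsert_release_note(conn, version, content)

rows = conn.execute(
    "SELECT version, content FROM release_notes ORDER BY version"
).fetchall()
print(rows)
```

Because the loader never deletes rows, notes that are missing from a later docload simply stay in the table, which is the property the steps above rely on.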

On Aug 6, 2018, at 2:05 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
> On Aug 6, 2018, at 1:22 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
>> I can make a quick prototype of this on pgweb just to see how easy it is to get
>> the release notes up in it. Basically, once the archived ones are in pgweb, we
>> would not need to have to build them anymore.
>
> Attached is a screenshot of something real quick I drew up. I was able to
> generate these notes from what was already loaded in via “docload” (which
> is what happens in every release).
>
> I did manually edit the xref’s in order to have them appear more cleanly, but
> I should be able to script the process.
>
> To quote you earlier, yes there is a bit of nontrivial work here, but I do think
> we have most of the tools in place to do this. What I am thinking is the
> following:
>
> 1. Add to the “docload” script to segment out the release notes and store
> them in a separate table. Perform an “upsert” (i.e. check for an existing
> reference; if it’s there, update any content, otherwise insert).
>
> 2. Perform any modifications to the content (i.e. there’s some HTML I
> explicitly removed from the generated docs).
>
> 3. Display the archived docs on the page.
>
> That way in future docloads, if there are missing release notes, the script
> would be ok as it would not remove any release notes.
>
> This strategy *should* also work with displaying current release notes on the
> site, as it’s basically following what docload currently does, if we wanted to
> go down this path.
>
> Once we run this for the first time with the collection of *all* release notes,
> we could then trim down release.sgml et al. And thus as far as I can tell, you
> would not have to modify anything in the doc build process.
OK, I’ve codified Step #2 from the above, which in turn performs Step #3.
The script reads in the release notes that are loaded in via docload, updates
the xrefs to point to other release notes in the archive, updates the doc URLs
to point at the equivalent docs in “current,” and performs some general
cleanup on the page.
Attached is another screenshot of the end result.
To proceed, I would want to ensure we feel good about this direction. I will
also need to discuss with Magnus about how we would want to store this
in pgweb itself. And of course, test it across all the different release notes
to ensure it works.
Jonathan
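The link rewriting described above might look roughly like the sketch below. This is not the actual script: the base paths, the regular expression, and the function name `rewrite_links` are assumptions, and it only handles simple relative `href` values of the form `page.html` or `page.html#anchor`.

```python
import re

# Assumed target locations (hypothetical, for illustration only).
ARCHIVE_BASE = "/docs/archive/release-notes/"
CURRENT_BASE = "/docs/current/static/"

# Relative links only: no slash, colon, or leading '#' in the page name.
HREF = re.compile(r'href="([^"/:#]+\.html)(#[^"]*)?"')


def rewrite_links(html):
    """Point release-note links at the archive and other doc links at 'current'."""

    def repl(match):
        page, fragment = match.group(1), match.group(2) or ""
        base = ARCHIVE_BASE if page.startswith("release-") else CURRENT_BASE
        return f'href="{base}{page}{fragment}"'

    return HREF.sub(repl, html)


sample = '<a href="release-9-1-4.html">9.1.4</a> <a href="sql-vacuum.html#opt">VACUUM</a>'
print(rewrite_links(sample))
```

Absolute URLs are left untouched because the pattern refuses any `href` containing a colon or slash.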

"Jonathan S. Katz" <jkatz@postgresql.org> writes: > On Aug 6, 2018, at 2:05 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote: >> 1. Add to the “docload” script to segment out the release notes and store >> them in a separate table. Perform an “upsert” (i.e. check for an existing >> reference; if it’s there, update any content, otherwise insert). >> >> 2. Perform any modifications to the content (i.e. there’s some HTML I >> explicitly removed from the generated docs). >> >> 3. Display the archived docs on the page. >> >> That way in future docloads, if there are missing release notes, the script >> would be ok as it would not remove any release notes. > To proceed, I would want to ensure we feel good about this direction. I will > also need to discuss with Magnus about how we would want to store this > in pgweb itself. And of course, test it across all the different release notes > to ensure it works. Hm, so the only objection I can think of is that this results in the old release notes only being available on the website; there's no other way to access them, short of digging around in the git repo. But maybe that's enough. It's certainly attractive that this doesn't seem like it'd entail any manual effort once it's set up initially. regards, tom lane
I wrote:
> Hm, so the only objection I can think of is that this results in the old
> release notes only being available on the website; there's no other way
> to access them, short of digging around in the git repo. But maybe that's
> enough.

Actually, a concrete reason why that might not be good is that it results in having a single point of failure: once we remove branch N's relnotes from the active branches, the only copy of that data is the one in the archive table the docload script is filling. Given, say, a bug in the docload script that causes it to overwrite the wrong table entries, can we recover?

This doesn't seem insoluble, but it might mean a bit more work to do to ensure we can revert back to an earlier version of that table.

regards, tom lane
> On Aug 6, 2018, at 3:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
>> Hm, so the only objection I can think of is that this results in the old
>> release notes only being available on the website; there's no other way
>> to access them, short of digging around in the git repo. But maybe that's
>> enough.
>
> Actually, a concrete reason why that might not be good is that it results
> in having a single point of failure: once we remove branch N's relnotes
> from the active branches, the only copy of that data is the one in the
> archive table the docload script is filling. Given, say, a bug in the
> docload script that causes it to overwrite the wrong table entries,
> can we recover?

Well, the release notes are still in the git history as well as the tarballs. One could always pull an older tarball of PostgreSQL with the full release.sgml and load from there.

Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: >> On Aug 6, 2018, at 3:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Actually, a concrete reason why that might not be good is that it results >> in having a single point of failure: once we remove branch N's relnotes >> from the active branches, the only copy of that data is the one in the >> archive table the docload script is filling. Given, say, a bug in the >> docload script that causes it to overwrite the wrong table entries, >> can we recover? > Well, the release notes are still in the git history as well as the tarballs. > One could always pull an older tarball of PostgreSQL with the full > release.sgml and load from there. True ... as long as those older tarballs represent data that our current workflow can process. For instance, if we did another documentation format change (from XML to something else), the older tarballs would perhaps no longer be useful for this purpose. On the other hand, it's hard to believe that we'd make such a conversion without tools to help. So probably if the situation came up, we could cobble together something that would allow ingesting the old format. regards, tom lane
> On Aug 6, 2018, at 3:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>>> On Aug 6, 2018, at 3:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Actually, a concrete reason why that might not be good is that it results
>>> in having a single point of failure: once we remove branch N's relnotes
>>> from the active branches, the only copy of that data is the one in the
>>> archive table the docload script is filling. Given, say, a bug in the
>>> docload script that causes it to overwrite the wrong table entries,
>>> can we recover?
>
>> Well, the release notes are still in the git history as well as the tarballs.
>> One could always pull an older tarball of PostgreSQL with the full
>> release.sgml and load from there.
>
> True ... as long as those older tarballs represent data that our current
> workflow can process. For instance, if we did another documentation
> format change (from XML to something else), the older tarballs would
> perhaps no longer be useful for this purpose.
>
> On the other hand, it's hard to believe that we'd make such a conversion
> without tools to help. So probably if the situation came up, we could
> cobble together something that would allow ingesting the old format.

Attached is a (rough) working copy of the patch to pgweb. It can:

- Extract the release notes from the docload and put them into their own table
- Display the release notes via pgweb akin to earlier screenshots

It needs:

- The notes actually exposed in the navigation tree
- Review of how some of the xrefs are translated (esp. non-release ones)
- A fix for the dependency on all major versions being cataloged in our “Version” table on pgweb, which currently we do not do
- Magnus's review, as to do this I introduced a new Python dependency

I was able to successfully load all of the release notes from the 10.4 tarball and spot-checked several different major/minor version combinations.

It’s not near production ready, but I wanted to demonstrate that it would not be too hard to get this done.

Jonathan
On Mon, Aug 06, 2018 at 08:14:23AM +0100, Dean Rasheed wrote:
> On 5 August 2018 at 23:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Anyway, I'd like to propose a compromise position that I don't think
>> has been discussed before: let's drop release notes for branches
>> that were already EOL when a given branch was released.
>
> WFM. +1

+1.
--
Michael
On Mon, Aug 6, 2018, 05:57 Tom Lane <tgl@sss.pgh.pa.us> wrote:
We've been around on this before, I know, but I got annoyed about it
again while waiting around for test builds of the back-branch
documentation. I think that we need some policy about maintaining
back-branch release notes that's not "keep everything, forever".
The release notes are becoming an ever-larger fraction of the docs,
and that's not good for documentation maintenance or for download
bandwidth. As an example, looking at the US-letter PDF version of
the v10 docs, as things stand today:
Total page count: 3550
Pages in release notes for 10.x: 41 (1%)
Pages in release notes for older branches: 898 (25%)
Pages in release notes for pre-9.2 branches: 546 (15%)
I've not measured directly, but it's a reasonable assumption that if
we dropped all the back-branch release notes the documentation build
time would drop about 25%, whichever format you were building.
I also live in fear of overrunning TeX's hard-wired limits, in the
back branches that depend on a TeX-based PDF toolchain. We've hit
those before and been able to work around them, but I wouldn't count
on doing so again, and I sure don't want to discover that we have a
problem of that sort the day before a release deadline. Trimming the
release notes would definitely give us enough slack to not worry
about that before all those branches are EOL.
We've discussed trimming the release notes before, and people have
objected on the grounds that they like being able to access ancient
notes from time to time. I'm not unsympathetic to that issue, but
does that access point need to be our daily working documentation?
Anyway, I'd like to propose a compromise position that I don't think
has been discussed before: let's drop release notes for branches
that were already EOL when a given branch was released. So for
example, 9.3 and before would go away from v12, due out next year.
Working backwards, we'd drop 9.1 and before from v10, giving the 15%
savings in page count that I showed above. A quick measurement says
that would also trim the size of the v10 tarball by about 4%, which
is not a lot maybe but it's noticeable across a lot of downloads.
It seems to me that this would still provide enough historical
info for just about any ordinary interest. We could discuss ways
of making a complete release-note archive available somewhere,
if "go dig in the git repo" doesn't seem like an adequate answer
for that.
Works for me. Especially with a release note archive available somewhere.
Thoughts?
regards, tom lane
On Wed, Aug 8, 2018 at 09:53:42PM +0700, Chris Travers wrote:
> It seems to me that this would still provide enough historical
> info for just about any ordinary interest. We could discuss ways
> of making a complete release-note archive available somewhere,
> if "go dig in the git repo" doesn't seem like an adequate answer
> for that.
>
> Works for me. Especially with a release note archive available somewhere.

Works for me, though, is there no interest in keeping the SGML files in the git tree and just not building them as docs?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes:
> Works for me, though, is there no interest in keeping the SGML files in
> the git tree and just not building them as docs?

Yeah, I thought about that alternative, but I'm not sure I see the percentage. It'd bloat the tarballs compared to removing them, and for what?

Another point that's bothered me a bit is that we're failing to keep the historical notes historical. Every so often, somebody decides they need to run around and fix misspellings or whatever, and they do it to the old notes as well as stuff that's current. To me that goes against every principle of archiving. Taking old notes files out of the tree once we've stopped updating them would at least put a limit on how long they're exposed to historical revisionism.

regards, tom lane
On 06/08/2018 00:57, Tom Lane wrote:
> Anyway, I'd like to propose a compromise position that I don't think
> has been discussed before: let's drop release notes for branches
> that were already EOL when a given branch was released. So for
> example, 9.3 and before would go away from v12, due out next year.
> Working backwards, we'd drop 9.1 and before from v10, giving the 15%
> savings in page count that I showed above. A quick measurement says
> that would also trim the size of the v10 tarball by about 4%, which
> is not a lot maybe but it's noticeable across a lot of downloads.

Why not go further and just ship the release notes of the current major version? If you want to look at the release notes of version 11, read the documentation for version 11. Who reads the documentation of version 12 to get the release notes of version 11?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 06/08/2018 00:57, Tom Lane wrote:
>> Anyway, I'd like to propose a compromise position that I don't think
>> has been discussed before: let's drop release notes for branches
>> that were already EOL when a given branch was released.

> Why not go further and just ship the release notes of the current major
> version. If you want to look at the release notes of version 11, read
> the documentation for version 11. Who reads the documentation of
> version 12 to get the release notes of version 11?

Personally, I'd be OK with that, but it seemed to me that that had already been proposed and shot down (on the grounds of not-enough-history) the last time this was discussed.

regards, tom lane
On Thu, Aug 9, 2018 at 07:45:08PM -0400, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > On 06/08/2018 00:57, Tom Lane wrote:
> >> Anyway, I'd like to propose a compromise position that I don't think
> >> has been discussed before: let's drop release notes for branches
> >> that were already EOL when a given branch was released.
>
> > Why not go further and just ship the release notes of the current major
> > version. If you want to look at the release notes of version 11, read
> > the documentation for version 11. Who reads the documentation of
> > version 12 to get the release notes of version 11?
>
> Personally, I'd be OK with that, but it seemed to me that that had
> already been proposed and shot down (on the grounds of not-enough-
> history) the last time this was discussed.

We allow people to skip several major versions as long as they read the release notes of all the versions they skipped. Shipping all active major version release notes works for that.

Personally, I would find a git tree or tarball of all release notes in SGML or HTML format useful.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
On Mon, Aug 6, 2018 at 9:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Jonathan S. Katz" <jkatz@postgresql.org> writes:
>> On Aug 6, 2018, at 3:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Actually, a concrete reason why that might not be good is that it results
>> in having a single point of failure: once we remove branch N's relnotes
>> from the active branches, the only copy of that data is the one in the
>> archive table the docload script is filling. Given, say, a bug in the
>> docload script that causes it to overwrite the wrong table entries,
>> can we recover?
> Well, the release notes are still in the git history as well as the tarballs.
> One could always pull an older tarball of PostgreSQL with the full
> release.sgml and load from there.
True ... as long as those older tarballs represent data that our current
workflow can process. For instance, if we did another documentation
format change (from XML to something else), the older tarballs would
perhaps no longer be useful for this purpose.
On the other hand, it's hard to believe that we'd make such a conversion
without tools to help. So probably if the situation came up, we could
cobble together something that would allow ingesting the old format.
The current process to load the docs is basically "extract the HTML files from the tarballs". We run this against the tarballs of any "latest minor release".
So yes, as long as we are OK with only loading release notes the same way we do docs, which is from tarballs, then I really don't think this part will be a problem, and we don't need to do anything about the old files either. But it's not like we're going to be *editing* old release notes in branches that are out of support. We'll be trimming them out of the master branch, but the master branch is not used to load the old docs, only the developer docs.
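As a rough illustration of the "extract the HTML files from the tarballs" step described above, here is a minimal sketch using Python's standard `tarfile` module. It is not the actual docload script: the function names are hypothetical, and the `doc/src/sgml/html/` location of the built pages inside the tarball is an assumption.

```python
import tarfile

# Assumption: built HTML pages live under this directory inside the tarball.
HTML_DIR = "/doc/src/sgml/html/"


def release_pages(tar):
    """Return {file name: HTML bytes} for the release-notes pages in an open tarball."""
    pages = {}
    for member in tar.getmembers():
        if not member.isfile() or HTML_DIR not in member.name:
            continue
        name = member.name.rsplit("/", 1)[-1]
        if name.startswith("release") and name.endswith(".html"):
            pages[name] = tar.extractfile(member).read()
    return pages


def load_from_tarball(path):
    """Open a (possibly compressed) tarball and collect its release-notes pages."""
    with tarfile.open(path, "r:*") as tar:
        return release_pages(tar)
```

Something like `load_from_tarball("postgresql-11.1.tar.bz2")` would then return the release pages of that one branch, which is all the website needs if each branch only carries its own notes.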
On Fri, Aug 10, 2018 at 1:38 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 06/08/2018 00:57, Tom Lane wrote:
> Anyway, I'd like to propose a compromise position that I don't think
> has been discussed before: let's drop release notes for branches
> that were already EOL when a given branch was released. So for
> example, 9.3 and before would go away from v12, due out next year.
> Working backwards, we'd drop 9.1 and before from v10, giving the 15%
> savings in page count that I showed above. A quick measurement says
> that would also trim the size of the v10 tarball by about 4%, which
> is not a lot maybe but it's noticeable across a lot of downloads.
Why not go further and just ship the release notes of the current major
version. If you want to look at the release notes of version 11, read
the documentation for version 11. Who reads the documentation of
version 12 to get the release notes of version 11?
+1 for that. At least if we get a generic release notes index up on the website, easy to find.
That might also make the process of manually merging release notes back and forth in the release process easier, I assume?
On Mon, Aug 6, 2018 at 11:17 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
> On Aug 6, 2018, at 3:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>>> On Aug 6, 2018, at 3:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Actually, a concrete reason why that might not be good is that it results
>>> in having a single point of failure: once we remove branch N's relnotes
>>> from the active branches, the only copy of that data is the one in the
>>> archive table the docload script is filling. Given, say, a bug in the
>>> docload script that causes it to overwrite the wrong table entries,
>>> can we recover?
>
>> Well, the release notes are still in the git history as well as the tarballs.
>> One could always pull an older tarball of PostgreSQL with the full
>> release.sgml and load from there.
>
> True ... as long as those older tarballs represent data that our current
> workflow can process. For instance, if we did another documentation
> format change (from XML to something else), the older tarballs would
> perhaps no longer be useful for this purpose.
>
> On the other hand, it's hard to believe that we'd make such a conversion
> without tools to help. So probably if the situation came up, we could
> cobble together something that would allow ingesting the old format.
Attached is a (rough) working copy of the patch to pgweb. It can:
- Extract the release notes from the docload and put them into their
own table
Not a huge fan of keeping a separate copy of them. I think we can find a way to make it work off the current data, which would simplify the process a bit I think.
- Display the release notes via pgweb akin to earlier screenshots
A thought on this.
Do we actually need a separate copy of the release notes at all? What I mean is:
We have all the old branch tip release notes already on the site, in the docs for that particular version. Wouldn't we get pretty far by just creating a separate *index*, that then links directly to those release notes?
One advantage of that would be that we'd get away from that link rewriting that you did in your patch -- because the docs will actually live at their "natural" location.
The downside would be that they'd end up under "docs" in the navigation breadcrumbs, rather than under "release notes". But is that really a problem?
On 8/30/18 4:15 PM, Magnus Hagander wrote:
> On Fri, Aug 10, 2018 at 1:38 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com
> <mailto:peter.eisentraut@2ndquadrant.com>> wrote:
>
> On 06/08/2018 00:57, Tom Lane wrote:
> > Anyway, I'd like to propose a compromise position that I don't think
> > has been discussed before: let's drop release notes for branches
> > that were already EOL when a given branch was released. So for
> > example, 9.3 and before would go away from v12, due out next year.
> > Working backwards, we'd drop 9.1 and before from v10, giving the 15%
> > savings in page count that I showed above. A quick measurement says
> > that would also trim the size of the v10 tarball by about 4%, which
> > is not a lot maybe but it's noticeable across a lot of downloads.
>
> Why not go further and just ship the release notes of the current major
> version. If you want to look at the release notes of version 11, read
> the documentation for version 11. Who reads the documentation of
> version 12 to get the release notes of version 11?
>
>
> +1 for that. At least if we get a generic release notes index up on the
> website, easy to find.

So circling back on this, Peter's point makes a lot of sense.

If you want to see release notes for other major versions, there would be URLs to the other major versions, but that would be far less costly than keeping the actual release notes in each tarball.

So for example, let's take PostgreSQL 11:

https://www.postgresql.org/docs/11/release.html

We could do something like:

==snip==
- Release 11.1
  Migration to Version 11.1
  Changes
- Release 11.0
  Migration to Version 11.1
  Changes

Older Major Versions:
PostgreSQL 10 [URL to https://www.postgresql.org/docs/10/release.html]
PostgreSQL 9.6 [URL to https://www.postgresql.org/docs/9.6/release.html]
etc. etc.
== snip ==

That would both save significant space and hopefully solve the archiving problem, as we would have the older docs available with all of their respective versions.

The downside would be the PDFs: you would not have all the release notes for, say, PostgreSQL 10 in the PostgreSQL 11 PDFs. But does that really matter? I could see that being helpful if you're migrating between versions, but if you're using PostgreSQL 11, you're using PostgreSQL 11 and the information for that version is the most relevant.

It also seems like it'd make it easier to maintain the release notes too, which would be another big win in addition to the build speedup.

Thoughts?

Jonathan
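The "Older Major Versions" list sketched in the message above is mechanical enough to generate. A minimal sketch, assuming the URL pattern shown in the example (`/docs/<version>/release.html`); the function name and the idea of generating the list from a version list are hypothetical, not something the thread settled on:

```python
# URL pattern taken from the example in the message above.
BASE_URL = "https://www.postgresql.org/docs/{version}/release.html"


def older_versions_index(versions):
    """Render a plain-text index linking to each older branch's release notes."""
    lines = ["Older Major Versions:"]
    for version in versions:
        url = BASE_URL.format(version=version)
        lines.append(f"PostgreSQL {version} [{url}]")
    return "\n".join(lines)


print(older_versions_index(["10", "9.6", "9.5", "9.4"]))
```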
On Mon, Jan 7, 2019 at 09:18:29PM -0500, Jonathan Katz wrote: > So circling back on this, Peter's point makes a lot of sense. > > If you want to see release notes for other major versions, there would > be URLs to the other major versions, but that would be far less costly > than keeping the actual release notes in each tarball. I assume this means we would only keep the current release notes in the git tree too, e.g. 11.0, 11.1, 11.2, etc. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce Momjian <bruce@momjian.us> writes: > I assume this means we would only keep the current release notes in the > git tree too, e.g. 11.0, 11.1, 11.2, etc. Yeah, I'd imagine that each branch would have just its own release notes. I'm not sure whether to apply this policy retroactively to the supported back branches or just establish it going forward. Maintaining the notes could be pretty confusing for the next few years if we do the latter, though. regards, tom lane
On Fri, Jan 25, 2019 at 06:41:20PM -0500, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > I assume this means we would only keep the current release notes in the > > git tree too, e.g. 11.0, 11.1, 11.2, etc. > > Yeah, I'd imagine that each branch would have just its own release notes. > > I'm not sure whether to apply this policy retroactively to the supported > back branches or just establish it going forward. Maintaining the notes > could be pretty confusing for the next few years if we do the latter, > though. Agreed. We would need to backpatch. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On 1/25/19 6:46 PM, Bruce Momjian wrote:
> On Fri, Jan 25, 2019 at 06:41:20PM -0500, Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> I assume this means we would only keep the current release notes in the
>>> git tree too, e.g. 11.0, 11.1, 11.2, etc.
>>
>> Yeah, I'd imagine that each branch would have just its own release notes.
>>
>> I'm not sure whether to apply this policy retroactively to the supported
>> back branches or just establish it going forward. Maintaining the notes
>> could be pretty confusing for the next few years if we do the latter,
>> though.
>
> Agreed. We would need to backpatch.
>

I am in favor of backpatching.

The one "caveat" I will bring up is that once pushed and applied to the site, we would introduce a lot of 404s into the website.

Doing some research on our traffic analytics over the past 6 months, the only release notes that even registered in the top 500 pages visited were the ones from whatever the newest release was (i.e. 10.4, 10.5, 11.0, 11.1). That, combined with the fact that I don't think we will take an SEO hit from unceremoniously removing the pages even with the sudden rise in 404s, should make it ok to backpatch.

Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > The one "caveat" I will bring up is that once pushed and applied to the > site, we would bring introduce a lot of 404s into the website. Hm. In principle we could probably insert some redirects, but I doubt it's worth the trouble. If I haven't heard objections, I'll see about making this happen during the first week of Feb (after the CF closes, but before it's time to do the February releases' notes). regards, tom lane
On 1/26/19 10:06 AM, Tom Lane wrote:
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>> The one "caveat" I will bring up is that once pushed and applied to the
>> site, we would bring introduce a lot of 404s into the website.
>
> Hm. In principle we could probably insert some redirects, but
> I doubt it's worth the trouble.

The reason I didn't bring up the redirect method was due to the latter point: it'd be more trouble than it's worth and for not much gain. How often do people look at the 8.4.7 release notes anyway? (...I have when researching various things, but that's not a regular occurrence :-)

>
> If I haven't heard objections, I'll see about making this happen
> during the first week of Feb (after the CF closes, but before
> it's time to do the February releases' notes).

Thank you! I was hoping to take a crack at doing this, but I would not be able to do so in the above timeline. However, I should be able to review.

Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > On 1/26/19 10:06 AM, Tom Lane wrote: >> If I haven't heard objections, I'll see about making this happen >> during the first week of Feb (after the CF closes, but before >> it's time to do the February releases' notes). > Thank you! I was hoping to take a crack at doing this, but I would not > be able to do so in the above timeline. However, I should be able to review. Attached is a diff showing what I'm thinking about, for HEAD; each active back branch would get a similar change. I'd also "git rm" now-unreferenced files in relevant branches, but that'd just bulk up the diff so I've not shown it here. It's not quite clear to me what the policy would be for removing back-branch links from this list when old versions drop out of support. Should we go back and remove them in surviving back branches, or just change HEAD? Note that this would change our workflow for release notes a bit, in that real editing work would happen in the back branches, rather than them just getting copies of text from HEAD. I don't see a big problem there, but it's a bit different from how we've traditionally done things. Just for the record, this change causes the time to build HEAD's HTML documentation to drop from ~120 sec to ~95 sec for me; the size of the resulting html/ directory drops from 21MB to 15MB, while the PDF output goes from 17MB to 12.2MB. I didn't try to measure the impact on tarball size, but it should be noticeable. 
regards, tom lane

diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 5dfdf54..a03ea14 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -166,22 +166,6 @@
 
 <!ENTITY release SYSTEM "release.sgml">
 <!ENTITY release-12 SYSTEM "release-12.sgml">
-<!ENTITY release-11 SYSTEM "release-11.sgml">
-<!ENTITY release-10 SYSTEM "release-10.sgml">
-<!ENTITY release-9.6 SYSTEM "release-9.6.sgml">
-<!ENTITY release-9.5 SYSTEM "release-9.5.sgml">
-<!ENTITY release-9.4 SYSTEM "release-9.4.sgml">
-<!ENTITY release-9.3 SYSTEM "release-9.3.sgml">
-<!ENTITY release-9.2 SYSTEM "release-9.2.sgml">
-<!ENTITY release-9.1 SYSTEM "release-9.1.sgml">
-<!ENTITY release-9.0 SYSTEM "release-9.0.sgml">
-<!ENTITY release-8.4 SYSTEM "release-8.4.sgml">
-<!ENTITY release-8.3 SYSTEM "release-8.3.sgml">
-<!ENTITY release-8.2 SYSTEM "release-8.2.sgml">
-<!ENTITY release-8.1 SYSTEM "release-8.1.sgml">
-<!ENTITY release-8.0 SYSTEM "release-8.0.sgml">
-<!ENTITY release-7.4 SYSTEM "release-7.4.sgml">
-<!ENTITY release-old SYSTEM "release-old.sgml">
 
 <!ENTITY limits SYSTEM "limits.sgml">
 <!ENTITY acronyms SYSTEM "acronyms.sgml">
diff --git a/doc/src/sgml/release.sgml b/doc/src/sgml/release.sgml
index 4055adf..cd12e1b 100644
--- a/doc/src/sgml/release.sgml
+++ b/doc/src/sgml/release.sgml
@@ -70,27 +70,78 @@ For new features, add links to the documentation sections.
 
  </para>
 <!--
-  To add a new major-release series, add an entry here and in filelist.sgml.
+  When beginning a new major-release series, create a new release-N.sgml
+  file and replace the previous one with a link to the on-line documentation
+  for that branch. Don't forget to update filelist.sgml.
 
-  The reason for splitting the release notes this way is so that appropriate
-  subsets can easily be copied into back branches.
+  The reason for keeping each branch's release notes in a differently-named
+  file is to reduce confusion when preparing minor-release updates.
+  All the active branches have to be edited concurrently when doing that.
 -->
+
 &release-12;
-&release-11;
-&release-10;
-&release-9.6;
-&release-9.5;
-&release-9.4;
-&release-9.3;
-&release-9.2;
-&release-9.1;
-&release-9.0;
-&release-8.4;
-&release-8.3;
-&release-8.2;
-&release-8.1;
-&release-8.0;
-&release-7.4;
-&release-old;
+
+ <sect1 id="release-prior">
+  <title>Prior Releases</title>
+
+  <para>
+   Release notes for currently-supported previous release series can be
+   found at:
+
+   <itemizedlist>
+    <listitem>
+     <para>
+      <productname>PostgreSQL</productname> 11:
+      <ulink url="https://www.postgresql.org/docs/11/release.html">
+       <literal>https://www.postgresql.org/docs/11/release.html</literal>
+      </ulink>
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      <productname>PostgreSQL</productname> 10:
+      <ulink url="https://www.postgresql.org/docs/10/release.html">
+       <literal>https://www.postgresql.org/docs/10/release.html</literal>
+      </ulink>
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      <productname>PostgreSQL</productname> 9.6:
+      <ulink url="https://www.postgresql.org/docs/9.6/release.html">
+       <literal>https://www.postgresql.org/docs/9.6/release.html</literal>
+      </ulink>
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      <productname>PostgreSQL</productname> 9.5:
+      <ulink url="https://www.postgresql.org/docs/9.5/release.html">
+       <literal>https://www.postgresql.org/docs/9.5/release.html</literal>
+      </ulink>
+     </para>
+    </listitem>
+
+    <listitem>
+     <para>
+      <productname>PostgreSQL</productname> 9.4:
+      <ulink url="https://www.postgresql.org/docs/9.4/release.html">
+       <literal>https://www.postgresql.org/docs/9.4/release.html</literal>
+      </ulink>
+     </para>
+    </listitem>
+   </itemizedlist>
+  </para>
+
+  <para>
+   Release notes for out-of-support release series can be found at
+   <ulink url="https://www.postgresql.org/docs/manuals/archive/">
+    <literal>https://www.postgresql.org/docs/manuals/archive/</literal>
+   </ulink>
+  </para>
+ </sect1>
 
 </appendix>
On 2/4/19 11:12 AM, Tom Lane wrote:
> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>> On 1/26/19 10:06 AM, Tom Lane wrote:
>>> If I haven't heard objections, I'll see about making this happen
>>> during the first week of Feb (after the CF closes, but before
>>> it's time to do the February releases' notes).
>
>> Thank you! I was hoping to take a crack at doing this, but I would not
>> be able to do so in the above timeline. However, I should be able to review.
>
> Attached is a diff showing what I'm thinking about, for HEAD; each
> active back branch would get a similar change. I'd also "git rm"
> now-unreferenced files in relevant branches, but that'd just bulk up
> the diff so I've not shown it here.

Thanks on all counts. I reviewed it and it's along the lines of what I was thinking as well. The documentation in release.sgml on how to create things is clear. I did not try applying the patch, but syntactically it passes the eyeball test.

> It's not quite clear to me what the policy would be for removing
> back-branch links from this list when old versions drop out of support.
> Should we go back and remove them in surviving back branches, or just
> change HEAD?

Yeah, that was one of my first thoughts as I reviewed the patch. It's one of those "once-a-year" things that are easily forgotten (e.g. with EOL warnings, which is why we updated a few things around that). But as long as they're added to the process of wrapping for the release, it does not sound like it's a huge burden.

> Note that this would change our workflow for release notes a bit,
> in that real editing work would happen in the back branches, rather
> than them just getting copies of text from HEAD. I don't see a big
> problem there, but it's a bit different from how we've traditionally
> done things.

I guess one way to look at it: overhead of adding these additional changes vs. overhead saved with build times + tarball size? Are the extra X minutes of developer time worth it?

>
> Just for the record, this change causes the time to build HEAD's
> HTML documentation to drop from ~120 sec to ~95 sec for me; the
> size of the resulting html/ directory drops from 21MB to 15MB,
> while the PDF output goes from 17MB to 12.2MB. I didn't try to
> measure the impact on tarball size, but it should be noticeable.

Wow, 28-29% reduction in the file sizes, and 20% reduction in build time! Nice.

Jonathan
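The percentages quoted above follow directly from the before/after figures in the quoted measurement. A small sketch of the arithmetic (the helper name `pct_reduction` and the dictionary labels are just for illustration):

```python
def pct_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Before/after figures as quoted in the message above.
measurements = {
    "HTML build time (sec)": (120, 95),
    "html/ directory (MB)": (21, 15),
    "PDF output (MB)": (17, 12.2),
}

for label, (before, after) in measurements.items():
    print(f"{label}: {pct_reduction(before, after):.1f}% reduction")
```

This gives roughly 21% for build time and a bit over 28% for both output sizes, consistent with the rounded "20%" and "28-29%" figures.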
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > On 2/4/19 11:12 AM, Tom Lane wrote: >> It's not quite clear to me what the policy would be for removing >> back-branch links from this list when old versions drop out of support. >> Should we go back and remove them in surviving back branches, or just >> change HEAD? > Yeah, that was one of my first thoughts as I reviewed the patch. It's > one of those "once-a-year" things that are easily forgotten (e.g. with > EOL warnings, which is why we updated a few things around that). But as > long as they're added to the process of wrapping for the release, it > does not sound like its a huge burden. After a bit more thought, I'm inclined to propose that the policy be that we *don't* update the surviving back branches for branch retirement. The new wording in release.sgml should be adjusted to clarify this, along the lines of Release notes for prior release branches can be found on the PostgreSQL web site. At the time of release of version 12, these were the supported prior release branches: <list of direct links, as before> Release notes for older branches can be found at <link to docs/manuals/archive/>. In this way, the prior-release notes section just provides some handy links for recent past releases, and isn't purporting to offer up-to-the-minute info on what's in support. regards, tom lane
On 2/4/19 4:25 PM, Tom Lane wrote: > "Jonathan S. Katz" <jkatz@postgresql.org> writes: >> On 2/4/19 11:12 AM, Tom Lane wrote: >>> It's not quite clear to me what the policy would be for removing >>> back-branch links from this list when old versions drop out of support. >>> Should we go back and remove them in surviving back branches, or just >>> change HEAD? > >> Yeah, that was one of my first thoughts as I reviewed the patch. It's >> one of those "once-a-year" things that are easily forgotten (e.g. with >> EOL warnings, which is why we updated a few things around that). But as >> long as they're added to the process of wrapping for the release, it >> does not sound like its a huge burden. > > After a bit more thought, I'm inclined to propose that the policy be > that we *don't* update the surviving back branches for branch retirement. > The new wording in release.sgml should be adjusted to clarify this, > along the lines of ...so I guess in turn, we would not update back branches with newer releases as well, i.e. adding references about 12 to 10? That makes sense, and eases some of the burden on releases. > Release notes for prior release branches can be found on the > PostgreSQL web site. At the time of release of version 12, > these were the supported prior release branches: > > <list of direct links, as before> > > Release notes for older branches can be found at > <link to docs/manuals/archive/>. > > In this way, the prior-release notes section just provides some handy > links for recent past releases, and isn't purporting to offer > up-to-the-minute info on what's in support. +1 Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > On 2/4/19 4:25 PM, Tom Lane wrote: >> After a bit more thought, I'm inclined to propose that the policy be >> that we *don't* update the surviving back branches for branch retirement. > ...so I guess in turn, we would not update back branches with newer > releases as well, i.e. adding references about 12 to 10? That makes > sense, and eases some of the burden on releases. No, I definitely didn't have any intention of putting in forward references to later releases. That seems a bit weird. regards, tom lane
On 2/4/19 5:23 PM, Tom Lane wrote: > "Jonathan S. Katz" <jkatz@postgresql.org> writes: >> On 2/4/19 4:25 PM, Tom Lane wrote: >>> After a bit more thought, I'm inclined to propose that the policy be >>> that we *don't* update the surviving back branches for branch retirement. > >> ...so I guess in turn, we would not update back branches with newer >> releases as well, i.e. adding references about 12 to 10? That makes >> sense, and eases some of the burden on releases. > > No, I definitely didn't have any intention of putting in forward > references to later releases. That seems a bit weird. Agreed. Anyway, I like the overall solution: +1 Thanks for writing the patch, Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > On 2/4/19 11:12 AM, Tom Lane wrote: >> Just for the record, this change causes the time to build HEAD's >> HTML documentation to drop from ~120 sec to ~95 sec for me; the >> size of the resulting html/ directory drops from 21MB to 15MB, >> while the PDF output goes from 17MB to 12.2MB. I didn't try to >> measure the impact on tarball size, but it should be noticeable. > Wow, 28-29% reduction in the file sizes, and 20% reduction in build > time! Nice. For amusement's sake (well, and to be sure I'd not broken anything) I ran tarball builds on the various branch heads, and got -rw-r--r-- 1 pgsql pgsql 18929153 Feb 5 00:27 postgresql-10.6.tar.bz2 -rw-r--r-- 1 pgsql pgsql 19703728 Feb 5 00:25 postgresql-11.1.tar.bz2 -rw-r--r-- 1 pgsql pgsql 16858141 Feb 5 00:32 postgresql-9.4.20.tar.bz2 -rw-r--r-- 1 pgsql pgsql 17506811 Feb 5 00:30 postgresql-9.5.15.tar.bz2 -rw-r--r-- 1 pgsql pgsql 18737381 Feb 5 00:29 postgresql-9.6.11.tar.bz2 (The minor numbers are lies, since we've not done a version_stamp.pl run recently.) The last real releases were -rw-r--r--. 1 tgl tgl 20350612 Nov 5 17:11 postgresql-10.6.tar.bz2 -rw-r--r--. 1 tgl tgl 21263173 Nov 6 19:03 postgresql-11.1.tar.bz2 -rw-r--r--. 1 tgl tgl 17905682 Nov 5 17:11 postgresql-9.4.20.tar.bz2 -rw-r--r--. 1 tgl tgl 18707696 Nov 5 17:11 postgresql-9.5.15.tar.bz2 -rw-r--r--. 1 tgl tgl 20009048 Nov 5 17:11 postgresql-9.6.11.tar.bz2 so this change got us about 6%-7% savings in post-compression tarball size. This isn't quite apples to apples of course, since the new builds include code fixes since November ... but patches seldom make things smaller, so if anything this is understating the savings. regards, tom lane
Hi, On 2019-01-26 10:06:06 -0500, Tom Lane wrote: > "Jonathan S. Katz" <jkatz@postgresql.org> writes: > > The one "caveat" I will bring up is that once pushed and applied to the > > site, we would bring introduce a lot of 404s into the website. > > Hm. In principle we could probably insert some redirects, but > I doubt it's worth the trouble. > > If I haven't heard objections, I'll see about making this happen > during the first week of Feb (after the CF closes, but before > it's time to do the February releases' notes). Gah, I'd skipped this thread, because I was OK, if not happy, about the original modest proposal (trimming to supported versions). My fault. For the record: I think this is a terrible idea. Makes it much harder to figure out what changed when, and requires per-branch incantations to grep through the log. That's not to speak of the fact that now it's just about impossible to reference all releasenotes on the website in a useful manner now. For crying out loud, super prominent and often referenced URLs like https://www.postgresql.org/docs/devel/release-10.html are now broken, and soon URLs like https://www.postgresql.org/docs/current/release-10.html will be too. I don't understand how this can be considered a good idea. Greetings, Andres Freund
On 2019-Feb-04, Andres Freund wrote: > Gah, I'd skipped this thread, because I was OK, if not happy, about the > original modest proposal (trimming to supported versions). My fault. > > For the record: I think this is a terrible idea. +1 I don't like it either. The original idea of just removing unsupported ones was fine. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/5/19 1:02 AM, Andres Freund wrote:
> Hi,
>
> On 2019-01-26 10:06:06 -0500, Tom Lane wrote:
>> "Jonathan S. Katz" <jkatz@postgresql.org> writes:
>>> The one "caveat" I will bring up is that once pushed and applied to the
>>> site, we would bring introduce a lot of 404s into the website.
>>
>> Hm. In principle we could probably insert some redirects, but
>> I doubt it's worth the trouble.
>>
>> If I haven't heard objections, I'll see about making this happen
>> during the first week of Feb (after the CF closes, but before
>> it's time to do the February releases' notes).
>
> Gah, I'd skipped this thread, because I was OK, if not happy, about the
> original modest proposal (trimming to supported versions). My fault.
>
> For the record: I think this is a terrible idea. Makes it much harder to
> figure out what changed when, and requires per-branch incantations to
> grep through the log. That's not to speak of the fact that now it's
> just about impossible to reference all releasenotes on the website in a
> useful manner now.

How frequently are you referencing release notes from older versions -- and I don't mean ones that are just deprecated, but things like 8.2? Or even minor versions such as 8.2.5?

Is there a way to keep a balance on the code side: keep the source files in but don't reference them to be built? That may not help with the tarball size, but would certainly still help build times and reduce the HTML/PDF output size.

>
> For crying out loud, super prominent and often referenced URLs like
> https://www.postgresql.org/docs/devel/release-10.html
> are now broken, and soon URLs like
> https://www.postgresql.org/docs/current/release-10.html
> will be too.

We can set up some redirect rules for this in pgweb. We have a record of what the latest version is, so we can intercept anything going to `/current/release-(1?[0-9]+(\.[0-9]+)?)` (untested regex) and point it to the correct version.

The original thought process was to _not_ do that given the effort, but if it's just for `/current/` it may not be so bad.

Jonathan
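To make the redirect idea above concrete, here is a minimal standalone sketch. It is not pgweb code: the function name, the exact URL shapes (a dotted version in the file name, as in the untested regex above) and the choice of redirect target are all assumptions for illustration.

```python
import re

# Assumption: old-style URLs look like /docs/current/release-10.html or
# /docs/current/release-9.6.html, following the untested regex in the message.
RELEASE_URL = re.compile(r"^/docs/current/release-(\d+(?:\.\d+)?)\.html$")


def redirect_target(path):
    """Map a /current/ release-notes URL for an older branch to that branch's docs.

    Returns None when the path is not a versioned release-notes page.
    """
    match = RELEASE_URL.match(path)
    if match is None:
        return None
    version = match.group(1)
    # Hypothetical target: the same page inside the older branch's own docs.
    return f"/docs/{version}/release-{version}.html"


print(redirect_target("/docs/current/release-10.html"))
print(redirect_target("/docs/current/release.html"))
```

A real rule would also have to leave the current branch's own release page alone, which is where the "record of what the latest version is" mentioned above would come in.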
Andres Freund <andres@anarazel.de> writes: > For the record: I think this is a terrible idea. Makes it much harder to > figure out what changed when, and requires per-branch incantations to > grep through the log. Uh ... "grep through the log"? The git log output hasn't changed at all. I've personally never found the SGML/HTML release notes to be even slightly useful for search purposes, because they're spread across so many files. This just changes how many copies of those files we have. > That's not to speak of the fact that now it's > just about impossible to reference all releasenotes on the website in a > useful manner now. You can still point to, say, https://www.postgresql.org/docs/devel/release.html There's maybe two more clicks needed to reach any particular back branch from there, but I would not call that "just about impossible". Anyway, if people want something resembling the old presentation, I think the way to get there is to have some sort of aggregate release notes in a separate place on the web site. We'd discussed that briefly upthread, but no one's volunteered to push it through. regards, tom lane
On 2/5/19 9:12 AM, Tom Lane wrote: > Anyway, if people want something resembling the old presentation, > I think the way to get there is to have some sort of aggregate > release notes in a separate place on the web site. We'd discussed > that briefly upthread, but no one's volunteered to push it through. I do have one patch for exactly that. Magnus and I disagreed on the implementation, perhaps we can circle back around and find something we both agree on. Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > On 2/5/19 9:12 AM, Tom Lane wrote: >> Anyway, if people want something resembling the old presentation, >> I think the way to get there is to have some sort of aggregate >> release notes in a separate place on the web site. We'd discussed >> that briefly upthread, but no one's volunteered to push it through. > I do have one patch for exactly that. Magnus and I disagreed on the > implementation, perhaps we can circle back around and find something we > both agree on. If we do get something like that set up, I'd be inclined to replace the branch-varying "Prior Releases" text I put into release.sgml with a single pointer to that. BTW, while we're thinking about this --- I remembered that as things stand, we've broken my historical practice of putting up first-draft minor release notes for people to look at if they choose. Those will now be in the newest back branch, which we don't have an automatic build-and-post pipeline for, AFAIK. Now, maybe the people who would review those notes are all comfortable with looking at the git commitdiff anyway. But somebody who preferred to wait for the next guaibasaurus run and then look at the website is now out of luck. Would it be possible to drive this aggregation off the git copies of release-NN.sgml (from appropriate branches) instead of the last released versions? Or set up something equivalent to the devel notes pipeline for back branches? regards, tom lane
I wrote: > BTW, while we're thinking about this --- I remembered that as things > stand, we've broken my historical practice of putting up first-draft > minor release notes for people to look at if they choose. Those will > now be in the newest back branch, which we don't have an automatic > build-and-post pipeline for, AFAIK. Now, maybe the people who would > review those notes are all comfortable with looking at the git > commitdiff anyway. But somebody who preferred to wait for the next > guaibasaurus run and then look at the website is now out of luck. > Would it be possible to drive this aggregation off the git copies > of release-NN.sgml (from appropriate branches) instead of the last > released versions? Or set up something equivalent to the devel > notes pipeline for back branches? After further thought about that, I'm liking the idea that was discussed upthread of setting up a separate git repo for the aggregate release notes. It'd have a simple(?) Makefile with the only build product being the aggregate release notes as HTML (maybe PDF too). The constituent files would be copies of the release-NN.sgml files from the master code repo. There'd be no particular need for multiple branches in this repo, it'd just be latest data all the time. The main drawback of this approach is the need to copy the release-NN.sgml files from the master code repo. But since we'd only touch it four or five times a year, that doesn't seem like unacceptable overhead to me. The benefits are: * It's not so hard to cope with the fact that the various branches don't all use the same docs toolchain. We'd just agree that the release notes repo uses the current toolchain, and when transferring over old release notes, they'd have to be edited as necessary to make them build. * The web site could be set up to build-and-publish from this repo automatically, more or less like the devel docs are published from the master code repo automatically. 
That'd fix the problem I worry about above: drafts could be published by shoving them into the release note repo ahead of official release. (Contrariwise, if we had say a security-related update we did *not* want to be visible immediately, we'd just delay transferring that to the release note repo.) I'd be willing to do most of the legwork in populating this repo, if someone else were to handle the website plumbing. regards, tom lane
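As a rough illustration of the copy step, the sketch below only builds the `git show` command that would print one branch's release-NN.sgml, so the output could be redirected into the separate repo. It assumes the files live under doc/src/sgml/ and that branch names follow the REL_11_STABLE pattern, with the REL9_6_STABLE form for pre-10 branches; `stable_branch` and `export_command` are hypothetical helper names, and nothing is executed here.

```python
from typing import List


def stable_branch(major: str) -> str:
    """Assumed branch naming: '11' -> REL_11_STABLE, '9.6' -> REL9_6_STABLE."""
    parts = major.split(".")
    if len(parts) == 1:
        return f"REL_{parts[0]}_STABLE"
    return f"REL{parts[0]}_{parts[1]}_STABLE"


def export_command(major: str) -> List[str]:
    """Build the argv that prints that branch's own release notes file."""
    path = f"doc/src/sgml/release-{major}.sgml"
    return ["git", "show", f"{stable_branch(major)}:{path}"]
```

The returned list could be passed to `subprocess.run` from a checkout of the master code repo; any markup edits needed for the current toolchain would still be manual.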
On 2019-02-05 08:50:16 -0500, Jonathan S. Katz wrote: > On 2/5/19 1:02 AM, Andres Freund wrote: > > Hi, > > > > On 2019-01-26 10:06:06 -0500, Tom Lane wrote: > >> "Jonathan S. Katz" <jkatz@postgresql.org> writes: > >>> The one "caveat" I will bring up is that once pushed and applied to the > >>> site, we would bring introduce a lot of 404s into the website. > >> > >> Hm. In principle we could probably insert some redirects, but > >> I doubt it's worth the trouble. > >> > >> If I haven't heard objections, I'll see about making this happen > >> during the first week of Feb (after the CF closes, but before > >> it's time to do the February releases' notes). > > > > Gah, I'd skipped this thread, because I was OK, if not happy, about the > > original modest proposal (trimming to supported versions). My fault. > > > > For the record: I think this is a terrible idea. Makes it much harder to > > figure out what changed when, and requires per-branch incantations to > > grep through the log. That's not to speak of the fact that now it's > > just about impossible to reference all releasenotes on the website in a > > useful manner now. > > How frequently are you referencing release notes from older versions -- > and I don't mean ones that are just deprecated, but things like 8.2? Or > even minor versions such as 8.2.5? Not never, but acceptably rare. That's why I didn't protest loudly when Tom proposed cutting those; I like having them in tree, but Tom cares about not having too many, so that seemed like a reasonable compromise. But then this thread got a lot more extreme. > Is there a way to keep a balance on the code side: keep the source files > in but don't reference them to be built? That may not help with the > tarball size, but would certainly still help build times + lower > HTML/PDF output. Well, the still supported stuff actually makes a ton of sense to have in a release. E.g. looking up minor release notes of the old release when doing a pg_upgrade makes sense. 
> > For crying out loud, super prominent and often referenced URLs like > > https://www.postgresql.org/docs/devel/release-10.html > > are now broken, and soon URLs like > > https://www.postgresql.org/docs/current/release-10.html > > will be too. > > We can set up some redirect rules for this in pgweb. We have a record of > what the latest version is, so we can intercept anything going to > `/current/release-(1?[0-9]+(\.[0-9]?` (untested regex) and point it to > the correct version. > > The original thought process was to _not_ do that given the effort, but > if it's just for `/current/` it may not be so bad. I think it definitely should also be on /devel/, that's what's out there on blog posts and such. I am flummoxed that we're just giving up google juice by willy nilly returning 404 for stuff that's more widely linked than the average page. It's not like we are that good placed in searches (although that's primarily related to other things). Greetings, Andres Freund
On 2019-02-05 09:12:56 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > For the record: I think this is a terrible idea. Makes it much harder to > > figure out what changed when, and requires per-branch incantations to > > grep through the log. > > Uh ... "grep through the log"? The git log output hasn't changed at all. Sorry, release notes. > I've personally never found the SGML/HTML release notes to be even > slightly useful for search purposes, because they're spread across so > many files. This just changes how many copies of those files we have. IDK, it's really easy right now to just do a grep term doc/src/sgml/release*sgml, and that gives pretty useful results. It's pretty common that a feature is not that easily searchable in the git log, because the focus is a lot lower level. > You can still point to, say, > https://www.postgresql.org/docs/devel/release.html > > There's maybe two more clicks needed to reach any particular back > branch from there, but I would not call that "just about impossible". > > Anyway, if people want something resembling the old presentation, > I think the way to get there is to have some sort of aggregate > release notes in a separate place on the web site. We'd discussed > that briefly upthread, but no one's volunteered to push it through. Yea, and now people's old links are broken. I don't understand how the status quo wouldn't have at least required fixing that before pushing this into the wild. Greetings, Andres Freund
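The single-directory search mentioned above can be sketched in Python as follows. The function name `grep_release_notes` is hypothetical, and matching is case-insensitive here as a convenience, which differs from a plain `grep`.

```python
from pathlib import Path


def grep_release_notes(sgml_dir, term):
    """Yield (file name, line number, line) for each matching line."""
    needle = term.lower()
    # release*.sgml covers release.sgml as well as the release-NN.sgml files.
    for path in sorted(Path(sgml_dir).glob("release*.sgml")):
        with path.open(encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if needle in line.lower():
                    yield path.name, lineno, line.rstrip("\n")
```

Once each branch carries only its own notes, the same search needs one checkout (or one `git show`) per branch, which is the per-branch overhead being objected to.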
Andres Freund <andres@anarazel.de> writes: > On 2019-02-05 08:50:16 -0500, Jonathan S. Katz wrote: >> The original thought process was to _not_ do that given the effort, but >> if it's just for `/current/` it may not be so bad. > I think it definitely should also be on /devel/, that's what's out there > on blog posts and such. I am flummoxed that we're just giving up google > juice by willy nilly returning 404 for stuff that's more widely linked > than the average page. It's not like we are that good placed in searches > (although that's primarily related to other things). I thought there was some concern that we were deoptimizing by having multiple copies of substantially the same page. For something like release-9-6-10.html, there's no value in having it appear in three or four different places. You can't even argue that the later branches might be more up-to-date: that text is *the same*, modulo toolchain-forced markup differences, in every branch; or at least if it isn't it means I screwed up. regards, tom lane
Hi, On 2019-02-05 12:10:57 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2019-02-05 08:50:16 -0500, Jonathan S. Katz wrote: > >> The original thought process was to _not_ do that given the effort, but > >> if it's just for `/current/` it may not be so bad. > > > I think it definitely should also be on /devel/, that's what's out there > > on blog posts and such. I am flummoxed that we're just giving up google > > juice by willy nilly returning 404 for stuff that's more widely linked > > than the average page. It's not like we are that good placed in searches > > (although that's primarily related to other things). > > I thought there was some concern that we were deoptimizing by having > multiple copies of substantially the same page. I think that's an independent issue, given that the rest of the docs are largely duplicated between the versions too. > For something like release-9-6-10.html, there's no value in having it > appear in three or four different places. You can't even argue that > the later branches might be more up-to-date: that text is *the same*, > modulo toolchain-forced markup differences, in every branch; or at > least if it isn't it means I screwed up. If somebody proposed adding automatic redirects from the older linked versions to the newest /current/ URL with that version's release notes, I'm not sure I would have argued against that. But I do *not* think it's actually accurate they are the same - it's a significant difference that they're linking to the corresponding version's pages, because those will contain that version's syntax / docs. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2019-02-05 12:10:57 -0500, Tom Lane wrote: >> For something like release-9-6-10.html, there's no value in having it >> appear in three or four different places. You can't even argue that >> the later branches might be more up-to-date: that text is *the same*, >> modulo toolchain-forced markup differences, in every branch; or at >> least if it isn't it means I screwed up. > If somebody proposed adding automatic redirects from the older linked > versions to the newest /current/ URL with that version's release notes, > I'm not sure I would have argued against that. But I do *not* think > it's actually accurate they are the same - it's a significant difference > that they're linking to the corresponding version's pages, because those > will contain that version's syntax / docs. Huh? The release note contents are identical cross-branch. I know, because I'm generally the one making them. Anyway, what I'm now thinking would be useful would be to set up a separate area for aggregated release notes, driven off a new git repo as I suggested; and then we could consider auto-redirecting existing URLs like https://www.postgresql.org/docs/10/release-9-4-19.html into that area. regards, tom lane
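One way the auto-redirect into an aggregated area might look, as a sketch only. The prefix `/docs/release-notes` is a hypothetical placeholder, since no such area exists yet, and the helper names are made up for illustration. Pages that already sit in their own branch are left untouched.

```python
import re
from typing import Optional

BRANCH_RELEASE = re.compile(
    r"^/docs/(?P<branch>\d+(?:\.\d+)?)/release-(?P<release>\d+(?:-\d+){0,2})\.html$"
)

# Hypothetical location of the aggregated notes; not an existing URL.
AGGREGATE_PREFIX = "/docs/release-notes"


def owning_branch(release: str) -> str:
    """Map a page suffix such as '9-4-19' to its docs branch ('9.4')."""
    parts = release.split("-")
    if int(parts[0]) >= 10:
        return parts[0]
    return ".".join(parts[:2])


def redirect_to_aggregate(path: str, prefix: str = AGGREGATE_PREFIX) -> Optional[str]:
    """Redirect cross-branch copies of release notes; return None otherwise."""
    m = BRANCH_RELEASE.match(path)
    if m is None:
        return None
    release = m.group("release")
    if m.group("branch") == owning_branch(release):
        return None  # the page lives in its own branch; leave it alone
    return f"{prefix}/release-{release}.html"
```

For the URL quoted above, `/docs/10/release-9-4-19.html` would map to `/docs/release-notes/release-9-4-19.html` under this assumed prefix.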
Hi, On 2019-02-05 12:24:00 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2019-02-05 12:10:57 -0500, Tom Lane wrote: > >> For something like release-9-6-10.html, there's no value in having it > >> appear in three or four different places. You can't even argue that > >> the later branches might be more up-to-date: that text is *the same*, > >> modulo toolchain-forced markup differences, in every branch; or at > >> least if it isn't it means I screwed up. > > > If somebody proposed adding automatic redirects from the older linked > > versions to the newest /current/ URL with that version's release notes, > > I'm not sure I would have argued against that. But I do *not* think > > it's actually accurate they are the same - it's a significant difference > > that they're linking to the corresponding version's pages, because those > > will contain that version's syntax / docs. > > Huh? The release note contents are identical cross-branch. > I know, because I'm generally the one making them. The point is that links in release-$version.html in /current/ or in a magical new repo will likely contain references to other pages in the docs. Even when the contents of the specific release-*.html page look the same, the pages they link to will differ more. As I said, that's not necessarily bad, but that did use to be a difference between the pages depending on which version they are from. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2019-02-05 12:24:00 -0500, Tom Lane wrote: >> Huh? The release note contents are identical cross-branch. >> I know, because I'm generally the one making them. > The point is that links in release-$version.html in /current/ or in a > magical new repo will likely contain references to other pages in the > docs. Yeah, I was just wondering what a separate aggregated-notes document could do with links to pages outside the release notes proper. The low-tech answer would be to remove the links, but it'd be nicer (at least in the HTML version) to make them point over to the main docs. As you say, it'd be good if they pointed to the corresponding version of the main docs, and that's actually something that's broken in our historical process. REL_11_STABLE's copy of release-10.sgml, say, was not linking to the right places. I don't know enough SGML/XML to know if there's some automatic way of rendering <xref linkend="app-pg-dumpall"/> as an external URL. I sure wouldn't care to convert such stuff by hand. regards, tom lane
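A text-level sketch of the xref conversion, not a toolchain feature. It assumes the caller supplies a mapping from `linkend` ids to HTML page names, because not every id corresponds to a page of its own. The id is used as the link text because the target's title is not available at this level. `externalize_xrefs` and `id_to_page` are hypothetical names.

```python
import re

XREF = re.compile(r'<xref\s+linkend="(?P<id>[^"]+)"\s*/>')


def externalize_xrefs(sgml, version, id_to_page,
                      base="https://www.postgresql.org/docs"):
    """Replace known <xref> elements with <ulink> elements for one docs version."""
    def repl(match):
        target = match.group("id")
        page = id_to_page.get(target)
        if page is None:
            return match.group(0)  # unknown id: leave the xref untouched
        return f'<ulink url="{base}/{version}/{page}">{target}</ulink>'
    return XREF.sub(repl, sgml)
```

Passing the version of the release being described (for example "10" when processing release-10.sgml) makes each link point at the matching version of the main docs.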
On 2/5/19 11:37 AM, Tom Lane wrote: > I wrote: >> BTW, while we're thinking about this --- I remembered that as things >> stand, we've broken my historical practice of putting up first-draft >> minor release notes for people to look at if they choose. Those will >> now be in the newest back branch, which we don't have an automatic >> build-and-post pipeline for, AFAIK. Now, maybe the people who would >> review those notes are all comfortable with looking at the git >> commitdiff anyway. But somebody who preferred to wait for the next >> guaibasaurus run and then look at the website is now out of luck. >> Would it be possible to drive this aggregation off the git copies >> of release-NN.sgml (from appropriate branches) instead of the last >> released versions? Or set up something equivalent to the devel >> notes pipeline for back branches? > > After further thought about that, I'm liking the idea that was > discussed upthread of setting up a separate git repo for the > aggregate release notes. It'd have a simple(?) Makefile with > the only build product being the aggregate release notes as > HTML (maybe PDF too). The constituent files would be copies > of the release-NN.sgml files from the master code repo. There'd > be no particular need for multiple branches in this repo, it'd > just be latest data all the time. > > The main drawback of this approach is the need to copy the > release-NN.sgml files from the master code repo. This is where I had a slight moment of panic especially regarding the release process. Yes, it's not often -- it's an extra step, but perhaps in the end it saves a lot of headaches and allows us to cover the below. > But since > we'd only touch it four or five times a year, that doesn't > seem like unacceptable overhead to me. The benefits are: > > * It's not so hard to cope with the fact that the various > branches don't all use the same docs toolchain. 
We'd just > agree that the release notes repo uses the current toolchain, > and when transferring over old release notes, they'd have to > be edited as necessary to make them build. > > * The web site could be set up to build-and-publish from this > repo automatically, more or less like the devel docs are published > from the master code repo automatically. That'd fix the problem > I worry about above: drafts could be published by shoving them into > the release note repo ahead of official release. The original pgweb patch I wrote sort-of handled this: it basically looked for release notes within the core repo, found ones that it did not already have, and stuffed them into a table. It should not be difficult to repurpose that code to load them in from a separate repo, and perform similar parsing. > (Contrariwise, if we had say a security-related update we did > *not* want to be visible immediately, we'd just delay transferring > that to the release note repo.) I don't see this as an insurmountable issue. The contrary point I will make is handling this via a different method. I believe one of the things Magnus objected to in the original patch upthread (or in a private conversation) was that we were double-storing the release note data in the patch I proposed. My way around that was going to be some careful scripting, i.e.: - Find the versions of PostgreSQL from newest to oldest - Find the associated release notes from newest to oldest - Make available on the site All of which should be doable from the current data we store. The advantage is that it allows us to leave everything as is when displaying release notes on the site. (If we end up going this way, I'm happy to work on this.) > > I'd be willing to do most of the legwork in populating this repo, > if someone else were to handle the website plumbing. If we go down the new path, I would be happy to do the website work; it will require Magnus's sign-off if there is a schema change. Thanks, Jonathan
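The ordering part of the scripting steps above can be sketched as follows. The input list of version strings is a hypothetical stand-in for whatever the site already stores, and `version_key`, `major_of` and `group_by_major` are made-up names.

```python
def version_key(version):
    """Sort key: '9.6.10' -> (9, 6, 10), so 9.6.10 sorts after 9.6.2."""
    return tuple(int(part) for part in version.split("."))


def major_of(version):
    """'9.6.10' -> '9.6'; '10.5' -> '10' (one-part majors from 10 onward)."""
    parts = version.split(".")
    if int(parts[0]) >= 10:
        return parts[0]
    return ".".join(parts[:2])


def group_by_major(versions):
    """Group versions by major version, newest first at both levels."""
    groups = {}
    for version in sorted(versions, key=version_key, reverse=True):
        groups.setdefault(major_of(version), []).append(version)
    return groups
```

Comparing tuples of integers rather than strings keeps 9.6.10 ahead of 9.6.2, which a plain string sort would get wrong.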
On 2/5/19 12:17 PM, Andres Freund wrote: > Hi, > > On 2019-02-05 12:10:57 -0500, Tom Lane wrote: >> Andres Freund <andres@anarazel.de> writes: >>> On 2019-02-05 08:50:16 -0500, Jonathan S. Katz wrote: >>>> The original thought process was to _not_ do that given the effort, but >>>> if it's just for `/current/` it may not be so bad. >> >>> I think it definitely should also be on /devel/, that's what's out there >>> on blog posts and such. I am flummoxed that we're just giving up google >>> juice by willy nilly returning 404 for stuff that's more widely linked >>> than the average page. It's not like we are that good placed in searches >>> (although that's primarily related to other things). >> >> I thought there was some concern that we were deoptimizing by having >> multiple copies of substantially the same page. > > I think that's an independent issue, given that the rest of the docs are > largely duplicated between the versions too. To chime in on this quickly: I remember researching the traffic to the various release notes over a fairly large window. Other than when a major release comes out, traffic to all the other release pages is so insignificant that dropping them down to 404s would barely register. For minor releases, readers of the news article (and presumably the email to -announce) outnumber readers of the release notes themselves by several orders of magnitude. The redirects would be a courtesy for our users rather than anything affecting what's in search. Jonathan
"Jonathan S. Katz" <jkatz@postgresql.org> writes: > On 2/5/19 11:37 AM, Tom Lane wrote: >> After further thought about that, I'm liking the idea that was >> discussed upthread of setting up a separate git repo for the >> aggregate release notes. > The contrary point I will make is handling this via a different method. > I believe one of the things Magnus objected to in the original patch > upthread (or in a private conversation) was that we were double-storing > the release note data in the patch I proposed. Yeah, the $64 question is whether that is a feature or a bug. A big thing that I like about how matters stand right now is that there's one source of truth about what are the release notes for release X.Y[.Z]. Previously, it was never real clear about whether HEAD or that release branch had precedence, and the possibility of different markup requirements in the two branches didn't make that better. Plus, as Andres points out, *only* the release branch really provided correct pointers in any links to the rest of the docs. If we could avoid the separate git repo, and instead do some redirection magic to sew together the existing pages https://www.postgresql.org/docs/X/release-Y-Z.html for only pages with X = Y, that would be cool probably. regards, tom lane