Thread: [DOCS] PDF building with FOP
As we are moving away from the old DSSSL toolchain, we also need to look into how we are building PDFs.

The old way, using jadetex, is called by

    make postgres-A4.pdf postgres-US.pdf

The new way, using FOP, is called by

    make postgres-A4-fop.pdf postgres-US-fop.pdf

This already exists.

The questions for those who are building PDFs are

- Can you make the build work?

- Does the output look OK?

Some tips: FOP is extremely memory hungry. You will probably have to fiddle with some Java memory settings to make it work. One way is by editing ~/.foprc and setting something like

    FOP_OPTS='-Xmx1200m'          # fop upstream binary installation
    ADDITIONAL_FLAGS='-Xmx1200m'  # centos/fedora
    JAVA_ARGS='-Xmx1200m'         # debian

It looks like you need at least -Xmx1000m, depending on the fop version. More memory can make things faster. (Some of this could go into the documentation.)

Note also that there are wildly different fop versions shipped with distributions. Compare your package version or fop -v output with the available versions at <https://xmlgraphics.apache.org/fop/download.html>.

I have done a fair amount of testing across different platforms. My assessment is that it's good enough to move forward. It doesn't have to work out of the box for everyone. But I want to make sure that those who are building PDFs regularly are on board with this.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
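For anyone unsure which of the three variables applies, the tip about ~/.foprc can be sketched as a small shell helper. This is only an illustration: the variable names are the ones listed in the message, but the function name foprc_line and the flavor keywords (upstream, redhat, debian) are made up here, and the 1200m default is just the example value from the message.

```shell
#!/bin/sh
# foprc_line FLAVOR [HEAP]
# Print the ~/.foprc line that raises the Java heap limit for fop.
# FLAVOR keywords are hypothetical labels for this sketch:
#   upstream -> FOP_OPTS          (fop upstream binary installation)
#   redhat   -> ADDITIONAL_FLAGS  (centos/fedora packages)
#   debian   -> JAVA_ARGS         (debian packages)
# HEAP defaults to 1200m, the example value from the message.
foprc_line() {
    flavor=$1
    heap=${2:-1200m}
    case "$flavor" in
        upstream) var=FOP_OPTS ;;
        redhat)   var=ADDITIONAL_FLAGS ;;
        debian)   var=JAVA_ARGS ;;
        *) echo "unknown flavor: $flavor" >&2; return 1 ;;
    esac
    printf "%s='-Xmx%s'\n" "$var" "$heap"
}
```

Appending the result to the file and rebuilding would then look like this:

    foprc_line debian 1200m >> ~/.foprc
    make postgres-A4-fop.pdf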
Peter Eisentraut wrote:

> The questions for those who are building PDFs are
>
> - Can you make the build work?

It worked fine for me after installing the "fop" package. No need to rerun configure.

> - Does the output look OK?

It looks generally sane in a very quick skim, modulo the problems we've always had with PDFs, such as tables 9-44 and 9-45 being completely unusable.

> Some tips: FOP is extremely memory hungry. You will probably have to
> fiddle with some Java memory settings to make it work. One way is by
> editing ~/.foprc and setting something like
>
> FOP_OPTS='-Xmx1200m'          # fop upstream binary installation
> ADDITIONAL_FLAGS='-Xmx1200m'  # centos/fedora
> JAVA_ARGS='-Xmx1200m'         # debian

I didn't have to change anything; it just worked out of the box.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> As we are moving away from the old DSSSL toolchain, we also need to look
> into how we are building PDFs.

> The new way, using FOP, is called by
> make postgres-A4-fop.pdf postgres-US-fop.pdf

> The questions for those who are building PDFs are
> - Can you make the build work?
> - Does the output look OK?

I poked at this a little bit.

* On RHEL 6: the available version of fop is 0.95, and it just fails completely. It looks like <title id="..."> constructs cause a NullPointerException in some cases. I tried installing fop 1.0 from back-rev Fedora SRPMs, but ran into dependency hell and gave up for the time being. I didn't look into whether the binary downloads available from apache.org would work.

* On Fedora 25: the available version of fop is 2.0, and it seems to Just Work. I did not need to mess with memory settings. And it's enormously faster than the DSSSL toolchain, like 8x.

The output PDF varies quite a bit from what I get from DSSSL; it's got slightly different font and spacing choices. But it looks okay, and the hyperlinks seem to work.

It would be slightly annoying for me to not be able to build PDFs on my RHEL6 workstation; but thinking about that, the only reason I ever do that is to precheck releases for occurrences of the dreaded "\pdfendlink ended up in different nesting level than \pdfstartlink" error. We can hope that the fop toolchain hasn't got that problem.

I think we need some more research into what's the minimum recommendable version of fop, but on the whole it seems like we can move forward.

BTW, the .gitignore for doc/src/sgml fails to ignore *.fo files.

			regards, tom lane
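Once a minimum version is settled, checking an installed fop against it could be sketched roughly as below. Everything here is an assumption for illustration: the function names are made up, the parser assumes that fop -v prints a line containing "FOP Version X.Y", and the comparison relies on sort -V from GNU coreutils. No particular minimum is implied.

```shell
#!/bin/sh
# extract_fop_version: read text on stdin and print the first version
# number that follows the words "FOP Version". The output format of
# "fop -v" is assumed here, not verified against every fop release.
extract_fop_version() {
    sed -n 's/.*FOP Version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# version_at_least HAVE WANT: succeed if HAVE >= WANT.
# Uses "sort -V" (version sort, a GNU coreutils extension) so that
# 2.10 is treated as newer than 2.2.
version_at_least() {
    have=$1
    want=$2
    lowest=$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}
```

A check in a build script might then read as follows, with MIN_FOP standing in for whatever minimum ends up being recommended:

    have=$(fop -v 2>&1 | extract_fop_version)
    version_at_least "$have" "$MIN_FOP" || echo "fop $have may be too old" >&2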
Hello Alvaro,

11.03.2017 06:19, Alvaro Herrera wrote:
> - Does the output look OK?
> It looks generally sane in a very quick skim, modulo the problems we've
> always had with PDFs, such as tables 9-44 and 9-45 being completely
> unusable.

We at postgrespro added a custom XSL stylesheet (see pg-customize-fo.xsl in patches/xml in pg-doc.check.tar.bz2 attached to https://www.postgresql.org/message-id/449e34c4-9cc8-d17d-5ebe-be92b4c0a87a%40gmail.com) to get long text wrapped in the table cells.

Look at http://repo.postgrespro.ru/doc/pgpro/9.6/en/postgres-A4-fop.pdf (pages 224-228).

There are a few places where long strings still have to be broken manually (with &zwsp;), but the majority of the long-string issues are fixed automatically by this XSL.

Best regards,
Alexander
Hello Peter,

11.03.2017 05:25, Peter Eisentraut wrote:
> The questions for those who are building PDFs are
>
> - Can you make the build work?
>
> - Does the output look OK?

Yes, we use the FOP build to generate all the PDFs for current versions (starting with 9.5). The output looks OK, though we need to use an extra fop-config.xml with font substitutions to get Russian letters in the translated docs (e.g. http://repo.postgrespro.ru/doc/pgsql/9.6/ru/postgres-A4-fop.pdf).

> Some tips: FOP is extremely memory hungry. You will probably have to
> fiddle with some Java memory settings to make it work. One way is by
> editing ~/.foprc and setting something like
>
> FOP_OPTS='-Xmx1200m'          # fop upstream binary installation
> ADDITIONAL_FLAGS='-Xmx1200m'  # centos/fedora
> JAVA_ARGS='-Xmx1200m'         # debian
>
> It looks like you need at least -Xmx1000m, depending on the fop version.
> More memory can make things faster. (Some of this could go into the
> documentation.)

Yes, we added JAVA_ARGS=-Xmx4096m to get the docs built on Ubuntu 14.04 (with FOP Version 1.1).

We also haven't encountered new problems on Ubuntu 16.04 (with FOP Version 2.1). (This version prints some additional warnings, but the resulting PDF looks valid.)

Best regards,
Alexander
Awhile back I wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
>> The questions for those who are building PDFs are
>> - Can you make the build work?

> * On RHEL 6: the available version of fop is 0.95, and it just fails
> completely. It looks like <title id="..."> constructs cause a
> NullPointerException in some cases. I tried installing fop 1.0
> from back-rev Fedora SRPMs, but ran into dependency hell and gave
> up for the time being. I didn't look into whether the binary
> downloads available from apache.org would work.

I got around to trying the binary-download solution today. The binary tarballs available from

http://www.apache.org/dist/xmlgraphics/fop/binaries/

are definitely recommendable: they contain essentially a bunch of jars and an invocation shell script, and they work great even with RHEL6's none-too-shiny-new JRE. You just need to unpack the tarball someplace and put a symlink to its included "fop" script into your PATH.

I initially tried the fop-2.0 tarball, and got a NullPointerException on today's HEAD docs, which seems odd because the version of fop 2.0 included in Fedora 25 works. I suppose Red Hat is carrying some patches in their distribution, but I didn't look more closely. However, the fop-2.2 tarball works just fine. (No need for tweaking memory settings either, although it does bloat to multiple GB while running.)

So we could recommend 2.2 or later for people going this route.

			regards, tom lane
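The "unpack someplace and symlink the fop script into your PATH" step might look something like the sketch below. The function name link_fop and the directory arguments are hypothetical. Rather than assuming where the script sits inside the tarball, the helper searches the unpacked tree for a regular file named fop and links the first one it finds.

```shell
#!/bin/sh
# link_fop UNPACKED_DIR BIN_DIR
# Find the "fop" launcher script somewhere under UNPACKED_DIR and create
# a symlink to it in BIN_DIR (which should be on PATH).
# The tarball layout is not assumed; the first regular file named "fop"
# found under UNPACKED_DIR is used.
link_fop() {
    unpacked=$1
    bindir=$2
    script=$(find "$unpacked" -type f -name fop | head -n 1)
    if [ -z "$script" ]; then
        echo "no fop script found under $unpacked" >&2
        return 1
    fi
    mkdir -p "$bindir"
    # Use an absolute target so the link resolves from any directory.
    script_dir=$(cd "$(dirname "$script")" && pwd)
    ln -sf "$script_dir/fop" "$bindir/fop"
}
```

With a tarball already unpacked under, say, ~/opt and ~/bin on PATH (both paths are placeholders), the call would be:

    link_fop ~/opt ~/bin
    fop -v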