August 16, 2013

The Foundation is pleased to announce that Gábor Kövesdán has completed a funded project to improve the infrastructure behind the documentation project. Here is how Gábor describes the project.

Our documentation infrastructure uses very old, discontinued standards and tools. The work on modernising it has already started but is not yet complete. After discussions, we agreed that the toolchain migration had to be done in several smaller milestones for better quality assurance. The first major step changed the syntax to a more XML-like one but still allowed some SGML constructs. This is what we currently use in head. The second step is nearing completion; it will upgrade the documentation from DocBook 4.2 to DocBook 4.5 and at the same time migrate to proper XML tools. A significant exception is PDF rendering, which will still use Jade and DSSSL stylesheets. DSSSL is an old, dead standard that will not evolve any more. The only available open source DSSSL processors are Jade and OpenJade, and their development is also discontinued. This means a serious technology and vendor lock-in for the documentation project, and the aim of this proposal is to eliminate it.

The following deliverables are provided as a result of the project:

  1. A DocBook 5.0 tree of the documentation. The basic set is available in projects/db5, but the converter still has to be run on a fresh checkout. The converter can be obtained from user/gabor/db5 and has a README that explains how it works. The converted sources are not checked in because this way merges are easier and the conversion parameters can still be adjusted. These remain flexible until we merge this to head.
  2. A FOP-based rendering toolchain. This gives high quality output, and after some final touches it will be print quality. I18N support is very good: all languages can be rendered in high quality. In particular, Xin Li (delphij@) confirmed that the Simplified Chinese output is very close to print quality. A full PDF build can be found here:
  3. A dblatex-based rendering toolchain. This toolchain is more limited. Its customization is more difficult and non-Latin text is not wrapped properly. It also fails with programlistings in tables, so it cannot build the whole documentation set. This limitation is explicitly listed in the dblatex documentation. It is a Java-free solution and is probably good enough to render release notes. In most cases it is still better than the current PDF rendering, but it does not give us full support or the same high quality as FOP. Some sample documents can be found here:
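
To judge which documents the dblatex toolchain could build, it helps to know where a programlisting sits inside a table. The following is only a minimal sketch, not part of the project's tooling: the function name and the sample document are hypothetical, and it assumes the source parses as standalone XML (sources that rely on DTD-defined entities would need those resolved first).

```python
import xml.etree.ElementTree as ET

# DocBook elements that open a table context.
TABLE_TAGS = {"table", "informaltable"}


def local_name(tag):
    """Strip a namespace such as '{http://docbook.org/ns/docbook}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def count_listings_in_tables(xml_text):
    """Return how many programlisting elements are nested inside a table."""
    root = ET.fromstring(xml_text)
    count = 0
    # Each stack entry pairs an element with whether it is inside a table.
    stack = [(root, local_name(root.tag) in TABLE_TAGS)]
    while stack:
        elem, in_table = stack.pop()
        if in_table and local_name(elem.tag) == "programlisting":
            count += 1
        for child in elem:
            child_in_table = in_table or local_name(child.tag) in TABLE_TAGS
            stack.append((child, child_in_table))
    return count


# Hypothetical sample: one listing outside a table, one inside.
SAMPLE = """<article xmlns="http://docbook.org/ns/docbook">
  <programlisting>outside</programlisting>
  <informaltable><tgroup cols="1"><tbody><row><entry>
    <programlisting>inside</programlisting>
  </entry></row></tbody></tgroup></informaltable>
</article>"""

if __name__ == "__main__":
    print(count_listings_in_tables(SAMPLE))
```

A document for which this count is zero avoids the documented dblatex limitation; it says nothing about the other restrictions mentioned above, such as wrapping of non-Latin text.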

During this work, I discovered some things that are not related to the rendering itself but should be addressed before the print edition of the Handbook:

  1. Indexterms are misused. They are not placed in the correct location, which may result in incorrect page number references. There are also outdated entries. In addition, some best practices should be consulted and applied to our docs. For example, should we index company names and brands or just technical terms? Which ones? What are the conventions for secondary indexterms? I think they are a bit abused as well. We should probably avoid third and further levels. We need to standardize this a bit and document it in fdp-primer. For a manual as large as the Handbook, it is very important to have a good and usable index.
  2. Sectioning has too many levels. It is recommended to limit them to 3, or to 4 when really necessary. In DocBook terms, this means using only sect2 and sect3 but not sect4. There are several places where sect4 appears.
  3. Markup is sometimes overused. There are plenty of admonitions (warnings, notes, …), some of them in lists, as well as tables in examples, tables in admonitions, and so on. These are meant to facilitate reading, but when abused they are just confusing and overwhelming.
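
The three issues above could be located mechanically before any manual cleanup. The following is a rough sketch under stated assumptions, not an existing tool: the `audit` function and its category names are hypothetical, it assumes numbered sectN elements rather than recursive section elements, and it assumes the input parses as standalone XML with any DTD entities already resolved.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed thresholds and groupings, chosen to mirror the points above.
DEEP_SECTIONS = {"sect4", "sect5"}
ADMONITIONS = {"note", "warning", "caution", "important", "tip"}
CONTAINERS = {"itemizedlist", "orderedlist", "variablelist",
              "table", "informaltable", "example"}


def local_name(tag):
    """Strip a namespace such as '{http://docbook.org/ns/docbook}' from a tag."""
    return tag.rsplit("}", 1)[-1]


def audit(xml_text):
    """Count deep sections, third-level index entries, and nested admonitions."""
    findings = Counter()

    def walk(elem, in_container):
        name = local_name(elem.tag)
        if name in DEEP_SECTIONS:
            findings["deep_section"] += 1
        if name == "tertiary":
            findings["tertiary_indexterm"] += 1
        if name in ADMONITIONS and in_container:
            findings["nested_admonition"] += 1
        for child in elem:
            walk(child, in_container or name in CONTAINERS)

    walk(ET.fromstring(xml_text), False)
    return findings


# Hypothetical sample exercising each category once.
SAMPLE = """<chapter><sect1>
  <sect2><sect3><sect4><title>Too deep</title></sect4></sect3></sect2>
  <itemizedlist><listitem><note><para>In a list</para></note></listitem></itemizedlist>
  <warning><para>Top-level admonition</para></warning>
  <indexterm><primary>a</primary><secondary>b</secondary><tertiary>c</tertiary></indexterm>
</sect1></chapter>"""

if __name__ == "__main__":
    for category, count in sorted(audit(SAMPLE).items()):
        print(category, count)
```

Counts like these only show where to look. Whether an indexterm is misplaced or outdated, or whether a given admonition is justified, still needs an editor's judgment.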