Sources of information
The most useful sources of information I have found are:
Installation
All the required files are provided as part of Slackware 11. However,
the DocBook V4.1.2 that is provided with Slackware 11 does not
support HTML tables and may have problems
handling Xincludes. I've
upgraded the linuxdoc-tools-0.9.21-i486-2 package to linuxdoc-tools-0.9.21-i486-6
(also from Slackware), which includes DocBook V4.5.
Generating Java help HTML
The xsltproc program is
used to generate HTML from DocBook XML files (xsltproc is used instead of the
docbook2html command that
we originally used). The command line should be something like:
    xsltproc \
        --nonet \
        --stringparam base.dir docdir/ \
        --stringparam root.filename manual \
        --stringparam use.id.as.filename 1 \
        --stringparam chunker.output.indent yes \
        --stringparam chunk.first.sections 1 \
        --stringparam html.stylesheet javahelp.css \
        http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl \
        manual.xml
(See Using a
customized stylesheet below for a way to simplify this command.)
manual.xml is the DocBook
XML file that the HTML files are to be generated from.
http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl
is the XML stylesheet to be used to convert from DocBook to HTML.
Although this is specified using a URL, it should be found on the local
machine (the catalog
file /etc/xml/catalog
provides a mapping from the URL to the local filename). This stylesheet
provides "chunked" output - which means that the output is split into
several HTML files (see Chunking into
multiple HTML files).
The --nonet option tells xsltproc
not to fetch files from the internet. This is optional, and should have
no effect since all the required files should be installed locally. But
it should serve as a check that the files are installed locally.
The various stringparam
options are described in the following table:
Parameter | Value | Description
base.dir | e.g. "docdir/" | Directory where the generated HTML files are to be put. It can be a relative or absolute pathname. A trailing '/' is essential! The directory must already exist.
root.filename | e.g. "manual" | Name to be used for the first HTML file (i.e. in this example the first HTML file will be called manual.html). By default, the first HTML file is called index.html.
use.id.as.filename | 1 | Use the id attribute of each "chunk" element as the name (with ".html" appended) for each HTML file except the first.
chunker.output.indent | yes | Indent the HTML output to make it more readable by humans.
html.stylesheet | e.g. "javahelp.css" | CSS stylesheet to be used to control appearance of HTML. Only the name of this file is used by the DocBook processing, not the contents (the browser interprets the contents, as usual).
chunk.first.sections | 1 | Don't put the first section in the same page as the table of contents.
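The two base.dir requirements in the table (trailing '/' and a directory that already exists) are easy to forget, so they can be wrapped in a small shell helper. This is only a sketch: the function name prepare_base_dir is hypothetical and not part of any DocBook tool.

```shell
#!/bin/sh
# Hypothetical helper: make a directory name safe to pass as base.dir.
prepare_base_dir() {
    dir=$1
    # Refuse an empty name rather than silently using "/".
    [ -n "$dir" ] || return 1
    # Append the trailing '/' that base.dir needs, if it is missing.
    case "$dir" in
        */) ;;
        *)  dir="$dir/" ;;
    esac
    # The directory must exist before xsltproc runs.
    mkdir -p "$dir" || return 1
    # Print the normalized name for use in a command substitution.
    printf '%s\n' "$dir"
}
```

It could then be used as: xsltproc --stringparam base.dir "$(prepare_base_dir docdir)" followed by the remaining options.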
The xsltproc command does not check the XML input file very thoroughly
for errors, so it is best to validate the XML file with xmllint before
passing it to xsltproc. For example:
    xmllint --valid --noout --noent --nonet modsak.xml
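One way to make sure validation is never skipped is to chain the two tools so that xsltproc only runs when xmllint succeeds. The following is a minimal sketch; the function name validate_then_transform and the XMLLINT/XSLTPROC override variables are assumptions of this example, while the xmllint and xsltproc options are the ones used above.

```shell
#!/bin/sh
# Tool names can be overridden from the environment (an assumption of this sketch).
XMLLINT=${XMLLINT:-xmllint}
XSLTPROC=${XSLTPROC:-xsltproc}

# Usage: validate_then_transform FILE STYLESHEET [extra xsltproc options...]
validate_then_transform() {
    file=$1
    stylesheet=$2
    shift 2
    # If the document is not valid, stop here; xsltproc is never started.
    "$XMLLINT" --valid --noout --noent --nonet "$file" || return 1
    # Remaining arguments (e.g. --stringparam settings) are passed through.
    "$XSLTPROC" --nonet "$@" "$stylesheet" "$file"
}
```

For example: validate_then_transform manual.xml http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl --stringparam base.dir docdir/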
Changes needed to files to use xsltproc
A few minor changes were needed to the XML files in order to work with xsltproc and the XSL stylesheet:
- Change the DOCTYPE header to start:
      <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
          "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
- Remove the <authorgroup> element so that author information doesn't
  appear in the output.
- Remove the "scale=..." attribute from all <imagedata> elements - they
  generate "width" attributes in the HTML that don't work well. If an
  image needs to be smaller, use the ImageMagick convert program to
  reduce its size, e.g.:
      convert input.png -resize 75% -depth 8 -colors 256 output.png
  Even if the image doesn't need to be smaller, it's a good idea to use a
  256-colour palette to reduce the image file size:
      convert input.png -depth 8 -colors 256 output.png
The javahelp.css
HTML stylesheet also needed modifying, mainly because different
heading levels are generated when using xsltproc instead of docbook2html.
Entities
"Entities" are a convenient way of naming a string that you want to use
repeatedly in the XML file. They are a bit like C's #define statement. Entity
definitions should be placed within the DOCTYPE header at the start of
the XML file, for example:
    <!ENTITY company "Wingpath Limited">
gives the name "company" to the string "Wingpath Limited".
Each occurrence of "&company;" in the document text will be
automatically replaced by "Wingpath Limited".
If you want the string to come from a file, you can use a "system
entity". For example:
    <!ENTITY company SYSTEM "company.txt">
will get the string from the file company.txt.
I.e. each occurrence of "&company;"
in the document text will be replaced by the contents of the file company.txt. Note that files
created/edited using vim
will normally have a newline at end. This newline will get rendered as
a space when the HTML is displayed, which may not be what is
wanted. To stop vim
writing the newline at the end of the file, use the vim commands:
    :set binary
    :set noendofline
    :w
to save the file.
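If the entity file is generated from a script rather than edited by hand, the trailing newline can be avoided at the source. A minimal sketch, reusing the company.txt example from above (printf is used here as a swapped-in alternative to the vim settings, not something the original workflow relied on):

```shell
#!/bin/sh
# printf writes exactly the bytes it is given, so no newline is appended
# (unlike echo, which adds one).
printf '%s' 'Wingpath Limited' > company.txt

# Show the size in bytes: the 16 characters of the string and nothing else.
wc -c < company.txt
```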
Including files
If you give xsltproc the extra option "--xinclude", it will perform Xinclude
processing of the input document. This allows you to break up the input
document into multiple files, share document fragments between
documents, include generated text (plain or DocBook), etc. See DocBook XSL:
The Complete Guide: Modular DocBook files for more details.
If you use xmllint, you
will have to give it the --xinclude
option too, and also use the option --postvalid instead of --valid (so that validity
checks are done after the
Xinclude processing - see Validating
with Xincludes).
If you have a document that is split into multiple XML files, and
you are using entities as described above, then each XML file will have
to include the entity definitions. The best way to handle this is to
move the entity definitions into a separate file, which is then
included in each XML file - see Shared
text entities. For example, you might have a file definitions.ent containing:
    <?xml version="1.0" encoding="utf-8"?>
    <!ENTITY company "Wingpath Limited">
    <!ENTITY appname SYSTEM "appname">
    <!ENTITY appversion SYSTEM "../src/VERSION">
which is included in each XML file using:
    <?xml version="1.0"?>
    <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
        "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
    <!ENTITY % entities SYSTEM "definitions.ent" >
    %entities;
    ]>
Note that xmllint and xsltproc seem to require the
"encoding" in the entities file, but allow it to be omitted in XML
files (according to the XML standard,
the encoding is optional even in entities).
You would normally use a relative
pathname in an xi:include
element to specify the file to be included. The pathname is interpreted
as relative to the directory that the including file came from. If the included file itself includes other
files, then the names of these
files will be relative to the directory that the included
file came from. If you want the names of the indirectly-included files
to be interpreted as relative to the directory of the original
including file, you can use a symbolic link from the original directory
to the first included file, and use the name of the symbolic link in
the xi:include element.
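The symbolic-link arrangement can be sketched as follows. The directory layout and file names (doc/, doc/common/intro.xml, doc/glossary.xml, intro-link.xml) are hypothetical and exist only to illustrate the idea.

```shell
#!/bin/sh
# Hypothetical layout: the main document lives in doc/, and the first
# included file lives in doc/common/.
mkdir -p doc/common
: > doc/common/intro.xml    # placeholder for the first included file
: > doc/glossary.xml        # placeholder for a file that intro.xml includes

# Create a link in the original directory that points at the included file.
# -f replaces an existing link so the script can be re-run.
ln -sf common/intro.xml doc/intro-link.xml
```

The main document would then name intro-link.xml (rather than common/intro.xml) in its xi:include element, so that, as described above, a name such as glossary.xml inside intro.xml is looked up in doc/ instead of doc/common/.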
Conditional text
DocBook has a rudimentary mechanism (called "profiling")
for conditionally including DocBook elements. Profiling uses a few
special attributes (e.g. the "condition"
attribute) to mark elements that are to be conditionally included. An
element that is so marked will only be included if the attribute value
in the element matches the attribute value specified on the xsltproc command-line.
For example, if an XML file contains:
    <phrase condition="modsak">master or slave</phrase><phrase condition="modmaster">master only</phrase>
and the xsltproc command-line specifies:
    --stringparam profile.condition modsak
then the output will include the phrase "master or slave", but not the
phrase "master only".
You can make an element conditional on more than one condition. For
example, if an XML file contains:
    <phrase condition="modsak;modmaster">master</phrase>
then the phrase will be included if the command line specifies "profile.condition modsak" or "profile.condition modmaster".
You cannot easily AND two conditions - it requires two passes of xsltproc, one for each
condition.
When using profiling, the DocBook processing is best done in two passes:
the first pass processes the conditionals (and any Xincludes) and
produces a temporary XML file, and the second pass translates the
temporary XML file into HTML. It's a good idea to use xmllint to validate both the
original XML file and the temporary XML file (see Validation
and profiling). The commands to do the two-pass processing with
validation will be something like:
    xmllint --xinclude --postvalid --noent --noout --nonet manual.xml

    xsltproc \
        --nonet \
        --xinclude \
        --output manual.tmp.xml \
        --stringparam profile.condition modsak \
        http://docbook.sourceforge.net/release/xsl/current/profiling/profile.xsl \
        manual.xml

    xmllint --dtdvalid http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd \
        --noout --nonet manual.tmp.xml

    xsltproc \
        --nonet \
        --stringparam base.dir docdir/ \
        --stringparam root.filename manual \
        --stringparam use.id.as.filename 1 \
        --stringparam chunker.output.indent yes \
        --stringparam chunk.first.sections 1 \
        --stringparam html.stylesheet javahelp.css \
        http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl \
        manual.tmp.xml
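When several variants of the help are built from the same source (for example one for "modsak" and one for "modmaster"), the two passes can be wrapped in a function that takes the condition as its argument. This is a sketch under stated assumptions: the function name build_variant, the RUN dry-run hook, and the per-variant names manual.COND.tmp.xml and help-COND/ are all hypothetical, and the two xmllint validation steps are left out for brevity.

```shell
#!/bin/sh
# RUN is a hypothetical hook: leave it empty to execute the commands,
# or set RUN=echo to print them instead (dry run).
RUN=${RUN:-}
PROFILE_XSL=http://docbook.sourceforge.net/release/xsl/current/profiling/profile.xsl
CHUNK_XSL=http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl

# Usage: build_variant CONDITION
build_variant() {
    cond=$1
    tmp="manual.$cond.tmp.xml"   # profiled intermediate file (hypothetical name)
    out="help-$cond/"            # output directory; note the trailing '/'
    # base.dir must exist before the second pass runs.
    $RUN mkdir -p "$out" &&
    # Pass 1: resolve Xincludes and apply the profiling condition.
    $RUN xsltproc --nonet --xinclude --output "$tmp" \
        --stringparam profile.condition "$cond" "$PROFILE_XSL" manual.xml &&
    # Pass 2: turn the profiled XML into chunked HTML.
    $RUN xsltproc --nonet --stringparam base.dir "$out" \
        --stringparam root.filename manual "$CHUNK_XSL" "$tmp"
}
```

A caller might then loop over the variants, e.g. "for cond in modsak modmaster; do build_variant "$cond" || exit 1; done", or run the same loop with RUN=echo to inspect the commands first.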
Tables
DocBook supports two sets of elements for tables: CALS and HTML.
If you use xsltproc, setting CALS column widths is not supported
- you have to use HTML elements if you want to specify column widths.
The widths are best specified in the <td> elements of the first
table row - the <col> element is not supported by Java EditorPane.
Using a customized stylesheet
The section Customization
Methods in the Complete Guide describes how to write a wrapper for
the chunk.xsl stylesheet.
One use for this is to simplify the xsltproc
command line by moving commonly used parameter settings into the
stylesheet wrapper. For example, by using the stylesheet wrapper:
    <?xml version='1.0'?>
    <xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
      <xsl:import
          href="http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl"/>
      <xsl:param name="root.filename">manual</xsl:param>
      <xsl:param name="use.id.as.filename">1</xsl:param>
      <xsl:param name="chunker.output.indent">yes</xsl:param>
      <xsl:param name="chunk.first.sections">1</xsl:param>
      <xsl:param name="html.stylesheet">javahelp.css</xsl:param>
      <xsl:param name="generate.section.toc.level">1</xsl:param>
      <xsl:param name="toc.section.depth">3</xsl:param>
      <xsl:param name="bridgehead.in.toc">1</xsl:param>
    </xsl:stylesheet>
we can simplify the command line:
    xsltproc \
        --nonet \
        --stringparam base.dir docdir/ \
        --stringparam root.filename manual \
        --stringparam use.id.as.filename 1 \
        --stringparam chunker.output.indent yes \
        --stringparam chunk.first.sections 1 \
        --stringparam html.stylesheet javahelp.css \
        --stringparam generate.section.toc.level 1 \
        --stringparam toc.section.depth 3 \
        --stringparam bridgehead.in.toc 1 \
        http://docbook.sourceforge.net/release/xsl/current/html/chunk.xsl \
        manual.tmp.xml
to:
    xsltproc \
        --nonet \
        --stringparam base.dir docdir/ \
        javahelp.xsl \
        manual.tmp.xml
where javahelp.xsl is the stylesheet wrapper.
The
wrapper can also be used to customize the rendering by overriding
templates that are
defined in (and called from) chunk.xsl. For example,
the following template definition puts a copyright notice at the bottom
of every page (with the notice also being a link to our website):
    <xsl:template name="user.footer.navigation">
      <div class="centre">
        <a href="http://wingpath.co.uk">
          Copyright © 2009 Wingpath Limited
        </a>
      </div>
    </xsl:template>
The template user.footer.navigation
is one of many templates
that have empty definitions in chunk.xsl and are there
specifically to allow customization.
You can also change
"generated" text. For example, the following changes "Home" in the
footer to "Contents":
    <xsl:param name="local.l10n.xml" select="document('')" />
    <l:i18n xmlns:l="http://docbook.sourceforge.net/xmlns/l10n/1.0">
      <l:l10n language="en">
        <l:gentext key="nav-home" text="Contents"/>
      </l:l10n>
    </l:i18n>