International Virtual Observatory Alliance
This note describes software designed to assist in creating IVOA documentation conforming to the IVOA documentation standards [std:docSTD].
This is an IVOA Note expressing suggestions from and opinions of the authors. It is intended to share best practices, possible approaches, or other perspectives on interoperability with the Virtual Observatory. It should not be referenced or otherwise interpreted as a standard specification.
A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.
The document transformations described in this document are derived from similar transformations created by Norman Gray and Ray Plante. Markus Demleitner has also contributed to the development of the scripts since initial publication.
This document describes an XSL-based document processing system aimed at producing documents suitable for publishing to the IVOA.
The fundamental design aims were:
The main features of the system are listed below and are described in section 3.
Several of these features allow for more accurate authoring of documents, avoiding errors that can creep in when only manual editing is available. For instance, because raw XML files can be included and formatted automatically, the original XML files can be independently validated, and the error-prone process of pasting XML into a document while trying to maintain formatting is avoided.
The scripts are maintained in, and intended to be distributed from, the GoogleCode Volute repository at http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%2Fivoadoc. The primary method of use is envisaged to be via the svn externals mechanism within other projects in the Volute repository.
svn propset svn:externals 'ivoadoc https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoadoc' .
The bulk of the processing is done with version 2.0 XSL Transformations [std:XSL2], which means that an XSLT 2.0 processor must be installed. Virtually the only fully capable XSLT 2.0 processor available is Saxon [saxon], which is itself written in Java, so a Java runtime must also be installed. The scripting engine that is used to drive the processing is Ant [ant], which must also be installed. The version of Java used should be Java 1.6 or newer. The version of Ant should not be too critical; any recent version, e.g. 1.6 or greater, should suffice. The choice of Saxon version is more complex, as Saxon now has several variants, including commercial offerings. The free version that still has all the required functionality is Saxon-B version 9.1 (note that this is not the most recent free version of Saxon). There are several ways in which Saxon can be made available for running within Ant, but perhaps the simplest is to install the jar file in the $HOME/.ant/lib/ directory.
The PDF version of the document is produced using the Apache FOP system [FOP]. A binary version of the FOP distribution should be downloaded, and its location set as the fop.home variable in the Ant build.xml file.
To use the bibliography feature, LaTeX and BibTeX need to be installed.
In order to take advantage of the various features of the IvoaPub system, the input file needs to be valid XHTML with some extra DIV-delimited structure that denotes where sections start and end. These DIVs can be nested to create sub-sections and sub-sub-sections in a properly structured fashion, rather than relying on the level of the HTML heading tag (H1, H2, H3, etc.) to denote the hierarchy.
If starting afresh, there is a template.html file that has the appropriate structuring with some example sections, as well as other standard elements of an IVOA document. If you already have a document that needs to have the DIV structuring added, then there is a Python script, structure.py, that can be used to add the DIV tags based on the existing heading tags. It is likely that you will have to customize the structure.py script (which uses a SAX [SAX] processing model) to alter when the DIV wrapping should occur.
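As a rough illustration of the kind of processing involved, the following Python sketch uses the standard xml.sax module to wrap each heading, and the content that follows it, in a DIV of class "section". It is not the actual structure.py script: the names SectionWrapper and wrap_sections are hypothetical, and the rule for closing a section (at the next heading of the same or a higher level, or when the enclosing element ends) is an assumption that a real document may need to adjust.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLGenerator

HEADING = re.compile(r"^h([1-6])$")


class SectionWrapper(xml.sax.ContentHandler):
    """Copy the input, wrapping each heading and what follows it in a section DIV."""

    def __init__(self, out):
        super().__init__()
        self.gen = XMLGenerator(out, encoding="utf-8")
        self.depth = 0      # number of currently open input elements
        self.sections = []  # (heading level, element depth) for each open section DIV

    def _close_while(self, should_close):
        # Close open section DIVs, innermost first, while the predicate holds.
        while self.sections and should_close(*self.sections[-1]):
            self.gen.endElement("div")
            self.sections.pop()

    def startElement(self, name, attrs):
        match = HEADING.match(name)
        if match:
            level = int(match.group(1))
            # Assumed rule: a heading ends sibling sections of the same or deeper level.
            self._close_while(lambda lv, d: d == self.depth and lv >= level)
            self.gen.startElement("div", {"class": "section"})
            self.sections.append((level, self.depth))
        self.gen.startElement(name, attrs)
        self.depth += 1

    def endElement(self, name):
        self.depth -= 1
        # Sections opened inside the element that is ending must be closed first.
        self._close_while(lambda lv, d: d > self.depth)
        self.gen.endElement(name)

    def characters(self, content):
        self.gen.characters(content)


def wrap_sections(xhtml):
    """Return the XHTML text with section DIVs added around headings."""
    out = io.StringIO()
    xml.sax.parseString(xhtml.encode("utf-8"), SectionWrapper(out))
    return out.getvalue()
```

Calling wrap_sections on a body containing an H1, a paragraph and an H2 yields an outer section DIV for the H1 with a nested section DIV for the H2.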
The various features can be invoked with the Ant build.xml script (which is available from the same location as this document, http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%253Fstate%253Dclosed) in the following ways:
ant without any arguments will run the default ivoarestructure.xslt transformation which causes
ant biblio will regenerate the bibliography (which has to be manually included).
ant createPDF will generate the PDF version.
ant package will zip up the HTML and PDF versions of the document and associated files into a form suitable to submit to the IVOA.
There is a makefile in the same svn repository directory that performs similar functions for those who prefer to drive this process via a makefile.
The automated section numbering is useful when sections have been moved or before any numbering has occurred at all.
For section numbering to occur, the initial input simply needs an HTML heading tag as the first element within the DIV of class="section", as shown below
After running the script, the same section will look like
Note that a link anchor is automatically created with an autogenerated identifier. If desired, this can be changed to a more memorable value (for instance, to make authoring a cross reference easier), and the new value will be used in the next iteration of contents generation.
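The numbering itself is performed by the XSLT scripts. Purely as an illustration of the idea, the following Python sketch numbers nested DIVs of class "section" hierarchically and wraps each heading's text in a link anchor. The function name number_sections, the "sec-N" identifier scheme and the exact output markup are assumptions made for this sketch and are likely to differ from what the stylesheet actually produces. The sketch also handles only the initial, unnumbered input and does not preserve an identifier that an author has already edited.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def is_section(elem):
    """True for <div class="section"> elements (no XML namespace assumed)."""
    return elem.tag == "div" and elem.get("class") == "section"


def number_sections(parent, prefix=""):
    """Number the section DIVs that are direct children of parent: 1, 1.1, 2, ..."""
    count = 0
    for child in parent:
        if not is_section(child):
            continue
        count += 1
        number = f"{prefix}{count}"
        # The heading must be the first element within the section DIV.
        if len(child) and child[0].tag in HEADINGS:
            heading = child[0]
            anchor = ET.Element("a", {"id": f"sec-{number}"})  # hypothetical id scheme
            anchor.text = f"{number} {heading.text or ''}"
            heading.text = None
            heading.insert(0, anchor)
        # Sub-sections are section DIVs nested directly inside this one.
        number_sections(child, prefix=number + ".")
```

Parsing a document with ET.fromstring, calling number_sections on its body element and serializing with ET.tostring gives headings such as "1 Introduction" and "1.1 Scope", each wrapped in an anchor.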
The location of the table of contents itself is indicated by a ToC processing instruction located within a DIV tag, as shown below
As with other features that use a processing instruction to indicate where some special processing should take place, all text within the enclosing DIV is replaced by the processing.
Cross references can be made with a span of class "xref".
The XML inclusion feature can be used to include and format either the whole or a part of an XML document. The principal advantage of this automated inclusion is that the source XML can be independently validated to ensure the accuracy of the document being created.
The <?xmlinc?> processing instruction can be used to include and format XML within the document. The XML to be included is specified as a URL by using the href pseudo-attribute of the processing instruction.
It is also possible to include part of an XML document by giving an XPath expression for that part in the select pseudo-attribute. Note that the XPath specified is preceded by a nominal "//" when creating the full XPath that is used to select the portion of the document to be displayed.
Note that this example shows a relative path being used to locate the XML to be included, and this illustrates another quirk of the system. Because the XSLT scripts live in a subdirectory of the main document, it is necessary to make an "up" directory path segment first, even to refer to a file that is in the same directory as the main document.
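The two path rules described above can be illustrated with a short Python sketch. It is not part of the IvoaPub scripts: the function names are hypothetical, the script subdirectory is assumed to be called ivoadoc (as in the svn:externals command), and Python's ElementTree stands in for the XSLT processor, so the leading "//" is written ".//" relative to the root element and the root element itself is never matched.

```python
import posixpath
import xml.etree.ElementTree as ET


def resolve_href(href, script_dir="ivoadoc"):
    """Resolve an xmlinc href relative to the (assumed) script subdirectory."""
    return posixpath.normpath(posixpath.join(script_dir, href))


def full_xpath(select):
    """Prefix the select value with the nominal "//" described in the text."""
    return "//" + select


def select_fragments(xml_text, select):
    """Return the serialized elements matched by the select value."""
    root = ET.fromstring(xml_text)
    # ElementTree evaluates paths relative to an element, hence the leading ".".
    matches = root.findall("." + full_xpath(select))
    return [ET.tostring(m, encoding="unicode").strip() for m in matches]
```

With these assumptions, resolve_href("../example.xml") refers to example.xml next to the main document, whereas resolve_href("example.xml") would point into the ivoadoc subdirectory. The file name example.xml is only a placeholder.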
Schema documentation of a style that is used in other IVOA documents can be generated with the <?schemadef?> processing instruction. In this case the schema
Citations can be made by using the standard HTML <cite> element with the citation key as content. The citation key should reference an entry in a standard LaTeX bibliography file. After processing, the citation is restyled and a hypertext link is created to the corresponding entry in the references.
The bibliography file to be used is configured as below
The file refs.bib contains some BibTeX references to IVOA standards.
In order to generate the references list from the citations within the document, the ant biblio command should be run and the result pasted into the document.
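Before regenerating the bibliography it can be helpful to confirm that every citation key used in the document exists in the bibliography file. The Python sketch below is one possible way to do this and is not part of the IvoaPub scripts: the function name unknown_citation_keys is hypothetical, and the regular expression recognises only simple BibTeX entries of the form @type{key, ...}.

```python
import re
import xml.etree.ElementTree as ET

# Matches the key of a simple BibTeX entry such as "@Misc{std:docSTD,".
BIB_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def local_name(tag):
    """Strip any "{namespace}" prefix that ElementTree adds to a tag name."""
    return tag.rsplit("}", 1)[-1]


def unknown_citation_keys(xhtml, bibtex):
    """Return the sorted <cite> keys that have no entry in the BibTeX text."""
    root = ET.fromstring(xhtml)
    cited = {
        (elem.text or "").strip()
        for elem in root.iter()
        if local_name(elem.tag) == "cite"
    }
    cited.discard("")
    known = set(BIB_KEY.findall(bibtex))
    return sorted(cited - known)
```

An empty list means that every cited key was found in the BibTeX text.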
The PDF generation requires no special configuration within the file as long as the template structure is being used.
This is usually because the input document is not well-formed XHTML. The parser should indicate where the error is, although it can be difficult to spot the message amongst the stack trace that is produced. If you can separately validate your XHTML before running the script, you will find these errors more easily.
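One lightweight way to check well-formedness ahead of time is sketched below in Python, using only the standard library. The function name check_well_formed is hypothetical. The check covers well-formedness only, not validity against the XHTML DTD, and because no DTD is loaded, named entities such as &nbsp; may be reported as undefined.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return None if text parses as XML, else a message locating the first error."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position  # position of the error reported by the parser
        return f"line {line}, column {column}: {err}"
    return None
```

Running this on the contents of the input file gives a single message with the line and column of the first problem, which is easier to read than the stack trace from the transformation.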
If the transformation fails then the original file can be obtained from the copy made before the transformation with "out" appended to the name before the xsl extension.
This can often happen because there is a broken internal link in the document. The error messages in such a case can be quite lengthy, but the source of the error is usually indicated as a missing identifier.
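Broken internal links can also be looked for before the PDF is generated. The following Python sketch lists link targets of the form href="#identifier" that have no matching id attribute anywhere in the document. It is an independent check rather than part of the IvoaPub scripts, the function name missing_link_targets is hypothetical, and it assumes that link targets are always declared with id attributes.

```python
import xml.etree.ElementTree as ET


def missing_link_targets(xhtml):
    """Return the sorted internal link targets that match no id in the document."""
    root = ET.fromstring(xhtml)
    # Every id attribute in the document is a possible link target.
    ids = {elem.get("id") for elem in root.iter() if elem.get("id")}
    # Internal links are href values that start with "#".
    targets = {
        elem.get("href")[1:]
        for elem in root.iter()
        if (elem.get("href") or "").startswith("#")
    }
    return sorted(t for t in targets if t and t not in ids)
```

An empty result means that every internal link points at an identifier that exists.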
There can be several reasons for this
If you do not want an include to be redone in a document formatting cycle, then you need only to comment out or remove the processing instruction that would cause the inclusion - the already included/formatted XML will not be touched.
The full template document is reproduced below.
$Revision$ $Date: 2011-08-03 13:48:08 +0100 (Wed, 03. Aug 2011) $ $HeadURL: https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoapub.html $