International Virtual Observatory Alliance
This note describes software designed to assist in creating IVOA documentation conforming to the IVOA documentation standards [std:docSTD].
This is an IVOA note expressing suggestions from and opinions of the authors. It is intended to share best practices, possible approaches, or other perspectives on interoperability with the Virtual Observatory.
A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.
The document transformations described in this document are derived from similar transformations created by Norman Gray and Ray Plante.
This document describes an XSL-based document processing system.
The fundamental design aims were:
The main features of the system are listed below and are described in section 3.
Several of these features allow for more accurate authoring of documents, avoiding errors that can creep in when only manual editing is available. For instance, the ability to include and format raw XML files means not only that the original XML files can be independently validated, but also that no errors are introduced by pasting XML into a document whilst trying to maintain formatting.
The scripts are maintained in, and intended to be distributed from, the GoogleCode Volute repository at http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%2Fivoadoc. The primary method of use is envisaged to be via the svn externals mechanism within other projects in the Volute repository.
svn propset svn:externals 'ivoadoc https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoadoc' .
The bulk of the processing is done with version 2.0 XSL Transformations [std:XSL2], which means that an XSLT 2.0 processor must be installed. Virtually the only fully capable XSLT 2.0 processor available is Saxon [saxon], which is itself written in Java, so a Java runtime must also be installed. The scripting engine used to drive the processing is Ant [ant], which must also be installed. Java 1.6 or newer should be used; the version of Ant is not critical, and any recent version (1.6 or greater) should suffice. The choice of Saxon version is more complex, as Saxon now has several variants including commercial offerings. The free version that still has all the required functionality is Saxon-B version 9.1 (note that this is not the most recent free version of Saxon). There are several ways in which Saxon can be made available for running within Ant, but perhaps the simplest is to install the jar file in the $HOME/.ant/lib/ directory.
The production of the PDF version of the document is done using the Apache FOP system [FOP]. A binary version of the FOP distribution should be downloaded, and its location set as the fop.home variable in the Ant build.xml file.
To use the bibliography feature, LaTeX and BibTeX need to be installed.
In order to take advantage of the various features of the IvoaPub system, the input file needs to be valid XHTML with some extra DIV-delimited structure that denotes where sections start and end. These DIVs can be nested to create sub-sections and sub-sub-sections in a properly structured fashion, rather than relying on the level of the HTML heading tag (H1, H2, H3, etc.) to denote the level.
If starting afresh, there is a template.html file that has the appropriate structuring with some example sections, as well as other standard elements in an IVOA document. If you already have a document that needs to have the DIV structuring added, then there is a Python script, structure.py, that can be used to add the DIV tags based on the already existing heading tags. It is likely that you will have to do some customization of the structure.py script (which uses a SAX [SAX] processing model) to alter when the DIV wrapping should occur.
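The kind of heading-driven DIV wrapping described above can be sketched with Python's standard SAX modules. This is not the distributed structure.py, only a minimal illustration under stated assumptions: the class name SectionWrapper and the function add_sections are hypothetical, all headings are assumed to be direct children of the body element, and comments and the DOCTYPE declaration are not carried through.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLGenerator

HEADING = re.compile(r"^h([1-6])$")


class SectionWrapper(XMLGenerator):
    """Wrap each heading and the content that follows it in <div class="section">.

    Hypothetical sketch: assumes every heading is a direct child of <body>.
    """

    def __init__(self, out):
        super().__init__(out, encoding="utf-8", short_empty_elements=True)
        self.open_levels = []  # heading levels of the sections currently open

    def _close_down_to(self, level):
        # Close every open section at this heading level or deeper.
        while self.open_levels and self.open_levels[-1] >= level:
            self.open_levels.pop()
            super().endElement("div")

    def startElement(self, name, attrs):
        match = HEADING.match(name)
        if match:
            level = int(match.group(1))
            self._close_down_to(level)
            super().startElement("div", {"class": "section"})
            self.open_levels.append(level)
        super().startElement(name, attrs)

    def endElement(self, name):
        if name == "body":
            # Close any sections still open before the body ends.
            self._close_down_to(1)
        super().endElement(name)


def add_sections(xhtml_text):
    """Return xhtml_text with nested section DIVs added around headings."""
    out = io.StringIO()
    xml.sax.parseString(xhtml_text.encode("utf-8"), SectionWrapper(out))
    return out.getvalue()
```

The stack of open heading levels is what turns a flat sequence of H1, H2 and H3 tags into properly nested DIVs; changing the test in startElement is the natural place to customize when wrapping should occur.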
The various different features can be invoked with the Ant build script in the following ways
ant without any arguments will run the default ivoarestructure.xslt transformation which causes
ant biblio will regenerate the bibliography (which has to be manually included).
ant createPDF will generate the PDF version.
ant package will zip up the HTML and PDF versions of the document and associated files.
The automated section numbering is useful when sections have been moved or before any numbering has occurred at all.
For section numbering to occur, the input simply needs an HTML heading tag as the first element within the DIV of class="section", as shown below
After running the script, the same section will look like
It should be noted that a link anchor has been automatically created with an autogenerated identifier - if it is desired this can be changed to a more memorable value (to make authoring a cross reference easier for instance) and the new value will be used in the next iteration of contents generation.
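How numbers can follow from the DIV nesting rather than from the heading level can be sketched in Python. This is an illustration only, not the XSLT used by the system: the function name section_numbers is hypothetical, and the dotted numbering scheme (1, 1.1, 1.1.1) is an assumption.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    # Strip a namespace such as {http://www.w3.org/1999/xhtml} from a tag name.
    return tag.rsplit("}", 1)[-1]


def _is_section(elem):
    classes = elem.attrib.get("class", "").split()
    return _local(elem.tag) == "div" and "section" in classes


def _walk(elem, prefix, counter, out):
    for child in elem:
        if _is_section(child):
            counter[0] += 1
            number = prefix + (counter[0],)
            # The heading is expected to be the first element in the DIV.
            heading = child[0] if len(child) else None
            title = "".join(heading.itertext()).strip() if heading is not None else ""
            out.append((".".join(str(n) for n in number), title))
            _walk(child, number, [0], out)  # sub-sections restart at 1
        else:
            _walk(child, prefix, counter, out)


def section_numbers(xhtml_text):
    """Return (number, title) pairs derived purely from DIV nesting."""
    out = []
    _walk(ET.fromstring(xhtml_text), (), [0], out)
    return out
```

Because the number depends only on the position of the DIV, moving a section and rerunning the numbering yields a consistent result without any manual renumbering.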
The location of the table of contents itself is indicated by a ToC processing instruction located within a DIV tag, as shown below
As with other features that use a processing instruction to indicate where some special processing should take place, all text within the enclosing DIV is replaced by the processing.
Cross references can be made with a span of class "xref".
The XML inclusion feature can be used to include and format either the whole or a part of an XML document. The principal advantage of this automated inclusion is that the source XML can be independently validated to ensure the accuracy of the document being created.
The <?xmlinc?> processing instruction can be used to include and format XML within the document. The XML to be included is specified as a URL by using the href pseudo-attribute of the processing instruction.
It is also possible to include only part of an XML document by giving an XPath expression for that part in the select pseudo-attribute. Note that the XPath specified is preceded by a nominal "//" when creating the full XPath that is used to select the portion of the document to be displayed.
Note that this example shows a relative path being used to locate the XML to be included.
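The effect of the nominal "//" prefix on a select value can be illustrated with Python's ElementTree. This is a sketch only: the real selection is performed by Saxon with full XPath 2.0, whereas ElementTree supports a limited XPath subset, and the sample XML and the function name select_fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, for illustration only.
SAMPLE = (
    "<registry><resource>"
    '<capability standardID="ivo://example/cap"><interface/></capability>'
    "</resource></registry>"
)


def select_fragment(xml_text, select):
    """Return the serialized elements matched by '//' + select."""
    root = ET.fromstring(xml_text)
    # ElementTree spells the descendant axis as './/' relative to the root.
    matches = root.findall(".//" + select)
    return [ET.tostring(elem, encoding="unicode") for elem in matches]
```

With this prefixing, a select value of capability matches the element wherever it occurs in the document, so the author does not need to spell out the full path from the document element.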
Schema documentation of the style that is used in other IVOA documents can be generated with the <?schemadef?> processing instruction. In this case the schema
Citations can be made by using the standard HTML <cite> element with the citation key as content. The citation key should reference an entry in a standard LaTeX bibliography file. After processing, the citation is restyled and a hypertext link is created connecting it to the entry in the references.
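To check which keys a document cites before comparing them with the bibliography file, the <cite> elements can be collected with a few lines of Python. This is a hypothetical helper, not part of the distributed scripts; it assumes an unprocessed input file in which each <cite> still contains only the bare key.

```python
import xml.etree.ElementTree as ET


def citation_keys(xhtml_text):
    """Return the distinct citation keys in document order."""
    root = ET.fromstring(xhtml_text)
    keys = []
    for elem in root.iter():
        # Compare local names so that namespaced XHTML is handled too.
        if isinstance(elem.tag, str) and elem.tag.rsplit("}", 1)[-1] == "cite":
            key = "".join(elem.itertext()).strip()
            if key and key not in keys:
                keys.append(key)
    return keys
```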
The bibliography file to be used is configured as below
In order to generate the references list from the citations within the document, the ant biblio command should be run and the result pasted into the document.
The PDF generation requires no special configuration within the file as long as the template structure is being used.
This is usually because the input document is not well-formed XHTML. The parser should indicate where the error is, although it can be difficult to spot the message amongst the stack trace that is produced. If you can separately validate your XHTML before running the script you will find these errors more easily.
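A quick well-formedness check can be run separately from the Ant build, for instance with Python's standard library. The sketch below checks well-formedness only, not validity against the XHTML DTD, and the function name check_well_formed is hypothetical. Documents that rely on named entities defined in the DTD (such as &nbsp;) may be reported as errors by this simple parser.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return None if text parses as XML, else (line, column, message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position
        return (line, column, str(err))
    return None
```

The line and column returned point at the place where the parser gave up, which is usually at or shortly after the unclosed or mismatched tag.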
If the transformation fails then the original file can be obtained from the copy made before the transformation with "out" appended to the name before the xsl extension.
This can often happen because there is a broken internal link in the document. The error messages in such a case can be quite lengthy, but the source of the error is usually indicated as a missing identifier.
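Broken internal links can be located before running the transformation with a short Python check. This is a minimal sketch with a hypothetical function name; it treats both id attributes and the older <a name="..."> form as link targets, which is an assumption about how the document defines its anchors.

```python
import xml.etree.ElementTree as ET


def missing_link_targets(xhtml_text):
    """Return the sorted fragment identifiers that are linked to but never defined."""
    root = ET.fromstring(xhtml_text)
    defined = set()
    referenced = set()
    for elem in root.iter():
        if not isinstance(elem.tag, str):
            continue
        if "id" in elem.attrib:
            defined.add(elem.attrib["id"])
        if elem.tag.rsplit("}", 1)[-1] == "a" and "name" in elem.attrib:
            defined.add(elem.attrib["name"])
        href = elem.attrib.get("href", "")
        # Only links of the form #identifier point inside this document.
        if href.startswith("#") and len(href) > 1:
            referenced.add(href[1:])
    return sorted(referenced - defined)
```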
There can be several reasons for this
If you do not want an include to be redone in a document formatting cycle, then you need only comment out or remove the processing instruction that would cause the inclusion; the already included and formatted XML will not be touched.
The full template document