International Virtual Observatory Alliance

IVOAPub - A system for creating IVOA documents
Version 1.0

IVOA Note 10 April 2012

This version:
1.0
Latest version:
not issued outside GWS-WG
Previous version(s):
None
Author(s):
Paul Harrison

Abstract

This note describes software designed to assist in creating IVOA documentation conforming to the IVOA documentation standards [std:docSTD].

Status of This Document

This is an IVOA Note expressing suggestions from and opinions of the authors. It is intended to share best practices, possible approaches, or other perspectives on interoperability with the Virtual Observatory. It should not be referenced or otherwise interpreted as a standard specification.

A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.

Acknowledgements

The document transformations described in this document are derived from similar transformations created by Norman Gray and Ray Plante. Markus Demleitner has also contributed to the development of the scripts since initial publication.

1. Introduction

This document describes an XSL-based document processing system aimed at producing documents suitable for publishing to the IVOA.

1.1. Design Aims

The fundamental design aims were:

  1. Author the document in XHTML. This allows both text editors and popular WYSIWYG editors to be used, and makes the full features of source code revision systems available.
  2. Allow the document created by one iteration of the processing to be used as input for the next. Tweaks can then be made to the final output format in a WYSIWYG editor and will, for instance, be auto-numbered on the next run, which makes authoring easier than with "batch" oriented systems such as LaTeX.

1.2. Features

The main features of the system are listed below and are described in section 3.

  • automated section numbering and table of contents generation.
  • automated reference handling with BibTeX.
  • formatting of included XML files.
  • automatic generation of some boilerplate text.
  • generation of schema documentation according to uniform standards.
  • final output of the IVOA mandated PDF file with associated PDF bookmarks.

Several of these features allow for more accurate authoring of documents by avoiding errors that can creep in when only manual editing is available. For instance, the facility to include and format raw XML files means not only that the original XML files can be independently validated, but also that no errors are introduced by pasting XML into a document by hand whilst trying to maintain its formatting.

2. Installation

The scripts are maintained in, and intended to be distributed from, the GoogleCode Volute repository at http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%2Fivoadoc. The primary method of use is envisaged to be via the svn externals mechanism within other projects in the Volute repository:

svn propset svn:externals 'ivoadoc https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoadoc' .

2.1. Prerequisites

The bulk of the processing is done with XSL Transformations version 2.0 [std:XSL2], so an XSLT 2.0 processor must be installed. Virtually the only fully capable XSLT 2.0 processor available is Saxon [saxon], which is written in Java, so a Java runtime must also be installed. The scripting engine that drives the processing is Ant [ant], which must also be installed. Java 1.6 or newer should be used. The version of Ant should not be critical; any recent version, e.g. 1.6 or greater, should suffice. The choice of Saxon version is more complex, as Saxon now has several variants, including commercial offerings. The free version that still has all the required functionality is Saxon-B version 9.1 (note that this is not the most recent free version of Saxon). There are several ways in which Saxon can be made available for running within Ant, but perhaps the simplest is to install the jar file in the $HOME/.ant/lib/ directory.

The production of the PDF version of the document is done using the Apache FOP system [FOP]. A binary version of the FOP distribution should be downloaded, and its location set as the fop.home variable in the ant build.xml file.

To use the bibliography feature, LaTeX and BibTeX need to be installed.

 

3. User Guide

In order to take advantage of the various features of the IVOAPub system, the input file needs to be valid XHTML with some extra DIV-delimited structure that denotes where sections start and end. These DIVs can be nested to create sub-sections and sub-sub-sections in a properly structured fashion, rather than relying on the level of the HTML heading tag (H1, H2, H3, etc.) to denote the hierarchy.

If starting afresh, there is a template.html file that has the appropriate structuring with some example sections, as well as the other standard elements of an IVOA document. If you already have a document that needs to have the DIV structuring added, then there is a Python script, structure.py, that can be used to add the DIV tags based on the existing heading tags. It is likely that you will have to customize the structure.py script (which uses a SAX [SAX] processing model) to alter when the DIV wrapping should occur.
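To illustrate the kind of DIV wrapping described above, the following is a minimal sketch using the SAX classes of the Python standard library. It is not the structure.py script distributed with IVOAPub: the names SectionWrapper and add_section_divs are hypothetical, and the rule that a heading closes every open section of the same or a deeper level is an assumption about when the wrapping should occur.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Heading element names mapped to their nesting level.
HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class SectionWrapper(XMLGenerator):
    """Hypothetical SAX filter: wrap each heading and what follows it in a section DIV."""

    def __init__(self, out):
        super().__init__(out, encoding="utf-8")
        self.open_levels = []  # heading levels of the section DIVs currently open

    def _close_down_to(self, level):
        # Close every open section at this heading level or deeper.
        while self.open_levels and self.open_levels[-1] >= level:
            super().endElement("div")
            self.open_levels.pop()

    def startElement(self, name, attrs):
        level = HEADINGS.get(name)
        if level is not None:
            self._close_down_to(level)
            super().startElement("div", {"class": "section"})
            self.open_levels.append(level)
        super().startElement(name, attrs)

    def endElement(self, name):
        if name == "body":
            self._close_down_to(0)  # close all remaining sections before </body>
        super().endElement(name)


def add_section_divs(xhtml_text):
    """Return the document text with section DIVs added around headings."""
    out = io.StringIO()
    xml.sax.parseString(xhtml_text.encode("utf-8"), SectionWrapper(out))
    return out.getvalue()
```

The sketch assumes that the headings are direct children of the body element. Because a SAX content handler does not see comments, they are dropped from the output, and empty elements are written as a start and end tag pair.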

The various features can be invoked with the Ant build.xml script (which is available from the same location as this document, http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%253Fstate%253Dclosed) in the following ways:

  1. ant without any arguments will run the default ivoarestructure.xslt transformation, which causes:
    • sections to be renumbered and the table of contents to be regenerated.
    • cross references to be updated.
    • XML files to be re-included.
    • schema documentation to be regenerated.
  2. ant biblio will regenerate the bibliography (which has to be manually included).
  3. ant createPDF will generate the PDF version.
  4. ant package will zip up the HTML and PDF versions of the document and associated files into a form suitable to submit to the IVOA.

There is a makefile in the same svn repository directory that performs similar functions for those who prefer to drive this process via a makefile.

3.1. Section Numbering and Table of Contents generation

The automated section numbering is useful when sections have been moved, or before any numbering has occurred at all.

For section numbering to occur on the initial input, there simply needs to be an HTML heading tag as the first element within the DIV of class="section", as shown below:

<div class="section">
<h2>Subtitle</h2>
<p>text</p>
</div>

After running the script, the same section will look like

<div class="section">
<h3>
<a id="d2e144" shape="rect"/>
<span class="secnum">1.1. </span>
Subtitle
</h3>
<p>text</p>
</div>

Note that a link anchor has been automatically created with an autogenerated identifier. If desired, this can be changed to a more memorable value (to make authoring a cross reference easier, for instance), and the new value will be used in the next iteration of contents generation.
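The numbering scheme can be sketched in Python as follows. This is only an illustration of how nested section DIVs map onto numbers such as "1.1. "; the real work is done by the XSLT scripts, and the function name number_sections is hypothetical. The sketch assumes that the elements carry no XML namespace, and it neither creates the link anchor nor adjusts the heading level.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def number_sections(parent, prefix=""):
    """Write '1. ', '1.1. ', ... into the first heading of each nested section DIV."""
    count = 0
    for child in parent:
        if child.tag == "div" and child.get("class") == "section":
            count += 1
            number = f"{prefix}{count}."
            if len(child) and child[0].tag in HEADINGS:
                heading = child[0]
                span = heading.find("span[@class='secnum']")
                if span is None:
                    # First run: put a new span in front of the heading text.
                    span = ET.Element("span", {"class": "secnum"})
                    span.tail = heading.text
                    heading.text = None
                    heading.insert(0, span)
                # Later runs only overwrite the number already present.
                span.text = number + " "
            number_sections(child, number)
        else:
            # Not a section: keep looking for sections further down.
            number_sections(child, prefix)
```

Because an existing span of class "secnum" is reused, the function can be applied repeatedly to its own output, which mirrors the design aim of feeding one iteration of the processing into the next.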

The location of the table of contents itself is indicated by a ToC processing instruction located within a DIV tag, as shown below

<div>
<?toc ?>
</div>

As with other features that use a processing instruction to indicate where some special processing should take place, all text within the enclosing DIV is replaced by the processing.
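The replacement behaviour can be sketched in Python as below. This is an assumption-laden illustration rather than the IVOAPub implementation: the function names are hypothetical, the table of contents is written as a flat list instead of a nested one, only headings that already carry an anchor with an id are listed, and the elements are assumed to carry no XML namespace. Keeping processing instructions in the tree requires Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def parse_keeping_pis(text):
    """Parse XML and keep processing instructions in the tree (needs Python 3.8+)."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
    return ET.fromstring(text, parser=parser)


def is_pi(node, target):
    """True if node is a processing instruction with the given target name."""
    return node.tag is ET.ProcessingInstruction and node.text.split()[0] == target


def fill_toc(root):
    """Rebuild the content of every DIV that holds a <?toc?> instruction."""
    entries = []
    for div in root.iter("div"):
        if div.get("class") == "section" and len(div) and div[0].tag in HEADINGS:
            heading = div[0]
            anchor = heading.find("a[@id]")
            if anchor is not None:
                title = " ".join("".join(heading.itertext()).split())
                entries.append((anchor.get("id"), title))
    for div in list(root.iter("div")):
        marker = next((c for c in div if is_pi(c, "toc")), None)
        if marker is None:
            continue
        # Drop the old content but keep the instruction so the next run works too.
        for child in list(div):
            div.remove(child)
        div.text = None
        marker.tail = None
        div.append(marker)
        toc_list = ET.SubElement(div, "ul")
        for anchor_id, title in entries:
            item = ET.SubElement(toc_list, "li")
            link = ET.SubElement(item, "a", {"href": "#" + anchor_id})
            link.text = title
```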

3.2. Cross-Referencing

Cross references can be made with a span of class "xref" whose content is the id of the anchor being referred to:

<div class="section">
<h3>Cross references</h3>
<p>
A cross reference to the
<span class="xref">xmlinclusion</span>
</p>
</div>
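How a cross reference key relates to its target can be sketched in Python as follows. The function name xref_targets is hypothetical, and the sketch only looks up the heading that carries the matching anchor id; how the XSLT scripts actually render the resolved reference is not shown here. Elements are assumed to carry no XML namespace.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def xref_targets(root):
    """Map each xref key to the text of the heading that carries the matching anchor id."""
    titles = {}
    for heading in root.iter():
        if heading.tag in HEADINGS:
            title = " ".join("".join(heading.itertext()).split())
            for anchor in heading.iter("a"):
                if anchor.get("id"):
                    titles[anchor.get("id")] = title
    resolved = {}
    for span in root.iter("span"):
        if span.get("class") == "xref":
            key = (span.text or "").strip()
            resolved[key] = titles.get(key)  # None marks a dangling reference
    return resolved
```

A key that maps to None has no matching anchor, which is one way of spotting a mistyped reference before the document is processed.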

 

3.3. XML Inclusion

The XML inclusion feature can be used to include and format either the whole or a part of an XML document. The principal advantage of this automated inclusion is that the source XML can be independently validated to ensure the accuracy of the document being created.

The <?incxml?> processing instruction can be used to include and format XML within the document. The XML to be included is specified as a URL by using the href pseudo-attribute of the processing instruction.

<div class="section">
<h2>
<a id="xmlinclusion" shape="rect"/>
Full XML Inclusion
</h2>
<div>
<?incxml href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd"?>
</div>
</div>

It is also possible to include part of an XML document by giving an XPath expression for the part to be included in the select pseudo-attribute. Note that the expression given is preceded by a nominal "//" when creating the full XPath that is used to select the portion of the document to be displayed.

<div class="section">
<h2>Partial XML Inclusion</h2>
<div>
<?incxml href="../build.xml" select="target[1]" ?>
</div>
</div>

Note that this example uses a relative path to locate the XML to be included, which illustrates another quirk of the system. Because the XSLT scripts live in a subdirectory of the directory containing the main document, it is necessary to start the path with an "up" directory segment even to refer to a file that is in the same directory as the main document.
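The effect of the select pseudo-attribute can be explored with the following Python sketch. It is not the XSLT used by IVOAPub: the function names are hypothetical, the regular expression handles only double-quoted pseudo-attributes, and ElementTree supports just a subset of XPath in which ".//" plays the role of the nominal "//" prefix.

```python
import re
import xml.etree.ElementTree as ET

# Matches name="value" pairs inside the data of a processing instruction.
PSEUDO_ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_pseudo_attributes(pi_data):
    """Turn 'href="..." select="..."' into a dictionary."""
    return dict(PSEUDO_ATTR.findall(pi_data))


def select_fragment(xml_text, select=None):
    """Return the elements an inclusion would pick from the given XML text."""
    root = ET.fromstring(xml_text)
    if select is None:
        return [root]  # no select: the whole document is included
    # The expression is searched for at any depth, as with a leading "//".
    return root.findall(".//" + select)
```

For instance, with select="target[1]" the sketch returns each target element that is the first target child of its parent, which for a typical Ant build file is the first target of the project.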

3.4. Schema Documentation

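Schema documentation of a style that is used in other IVOA documents can be generated with the <?schemadef?> processing instruction. In this case the schema is located with the href pseudo-attribute and the definition to be documented is named with the defn pseudo-attribute.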

<div class="section">
<h2>Generate documentation for Capability</h2>
<div>
<?schemadef href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd" defn="Capability" ?>
</div>
</div>

 

3.5. Bibliography

Citations can be made by using the standard HTML <cite> element with the citation key as content. The citation key should reference an entry in a standard LaTeX (BibTeX) bibliography file. After processing, the citation is restyled and a hypertext link is created to the corresponding entry in the references.

<div class="section">
<h3>Making the citation</h3>
<p>
A citation can begin like
<cite>std:RM</cite>
, but after processing will be like
<cite>
[
<a href="#std:RM" shape="rect">std:RM</a>
]
</cite>
</p>
</div>
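The restyling shown above can be mimicked with a short Python sketch. The function name link_citations is hypothetical, the elements are assumed to carry no XML namespace, and the sketch does not check that the key exists in the bibliography file.

```python
import xml.etree.ElementTree as ET


def link_citations(root):
    """Turn <cite>key</cite> into <cite>[<a href="#key">key</a>]</cite>."""
    for cite in list(root.iter("cite")):
        # A cite that already contains a link has been processed; leave it alone.
        if len(cite) == 0 and cite.text and cite.text.strip():
            key = cite.text.strip()
            cite.text = "["
            link = ET.SubElement(cite, "a", {"href": "#" + key})
            link.text = key
            link.tail = "]"
```

Skipping citations that already contain a link keeps the transformation safe to run on a document that has been processed before.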

The bibliography file to be used is configured as below

<div class="section-nonum">
<h2>
<a name="References" id="References" shape="rect"/>
References
</h2>
<?bibliography ivoadoc/refs ?>
</div>

The file refs.bib contains some BibTeX references to IVOA standards.

In order to generate the references list from the citations within the document, the ant biblio command should be run and the result pasted into the document.

3.6. PDF Generation

The PDF generation requires no special configuration within the file as long as the template structure is being used.

 

4. Troubleshooting & FAQ

4.1. XSLT transformation fails

This is usually because the input document is not well-formed XHTML. The parser should indicate where the error is, although it can be difficult to spot the message amongst the stack trace that is produced. If you can separately validate your XHTML before running the script, you will find these errors more easily.
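One way of checking well-formedness separately is sketched below using the XML parser in the Python standard library. The function name check_well_formed is hypothetical. The sketch tests only that the text parses as XML, not that it is valid XHTML.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xhtml_text):
    """Return None if the text parses, otherwise a message giving the position."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as err:
        line, column = err.position
        return f"line {line}, column {column}: {err}"
    return None
```

The line and column in the message point at the place where the parser gave up, which is usually at or shortly after the offending tag.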

If the transformation fails then the original file can be obtained from the copy made before the transformation with "out" appended to the name before the xsl extension.

4.2. PDF Generation fails

This can often happen because there is a broken internal link in the document. The error messages in such a case can be quite lengthy, but the source of the error is usually indicated as a missing identifier.
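Broken internal links can be looked for before the PDF is generated. The following Python sketch lists the link targets that have no matching identifier. The function name missing_link_targets is hypothetical, and the elements are assumed to carry no XML namespace.

```python
import xml.etree.ElementTree as ET


def missing_link_targets(root):
    """List the targets of internal links that match no id (or anchor name)."""
    known = {el.get("id") for el in root.iter() if el.get("id")}
    known |= {el.get("name") for el in root.iter("a") if el.get("name")}
    missing = []
    for link in root.iter("a"):
        href = link.get("href", "")
        # Only links of the form "#something" point inside the document.
        if len(href) > 1 and href.startswith("#") and href[1:] not in known:
            missing.append(href[1:])
    return missing
```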

4.3. XML inclusion fails

There can be several reasons for this:

  • The included document is not well formed.
  • The relative URL is incorrect. It is a quirk of the system that the URL must be relative to the location of the ivoarestructure.xsl file.
  • The selecting XPath is incorrect. Be particularly careful of namespaces.

4.4. I do not want all my XML includes to be redone

If you do not want an include to be redone in a document formatting cycle, then you need only to comment out or remove the processing instruction that would cause the inclusion - the already included/formatted XML will not be touched.

 

A. Full Template

The full template document is reproduced below.

<!-- $Id$ Note that this file should be xhtml with div to mark sections - see README for more information Paul Harrison -->
<html xmlns:xml="http://www.w3.org/XML/1998/namespace" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Template IVOA Document</title>
<meta name="Title" content="IVOA WG Internal Draft"/>
<meta name="author" content="Paul Harrison, paul.harrison@manchester.ac.uk"/>
<meta name="maintainedBy" content="Paul Harrison, paul.harrison@manchester.ac.uk"/>
<link href="http://www.ivoa.net/misc/ivoa_a.css" rel="stylesheet" type="text/css"/>
<link rel="stylesheet" href="http://www.ivoa.net/misc/ivoa_wd.css" type="text/css"/>
<!-- Add other styling information here (but this element, if present, mustn't be empty) <style type="text/css"></style> -->
<link href="./ivoadoc/XMLPrint.css" rel="stylesheet" type="text/css"/>
<link href="./ivoadoc/ivoa-extras.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<div class="head">
<div id="titlehead" style="position:relative;height:170px;width: 500px">
<div id="logo" style="position:absolute;width:300px;height:169px;left: 50px;top: 0px;">
<img src="http://www.ivoa.net/pub/images/IVOA_wb_300.jpg" alt="IVOA logo"/>
</div>
<div id="logo-title" style="position: absolute; width: 200px; height: 115px; left: 320px; top: 5px; font-size: 14pt; color: #005A9C; font-style: italic;">
<p style="position: absolute; left: 0px; top: 0px;">
<span style="font-weight: bold;">I</span>
nternational
</p>
<p style="position: absolute; left: 15pt; top: 25pt;">
<span style="font-weight: bold;">V</span>
irtual
</p>
<p style="position: absolute; left: 15pt; top: 50pt;">
<span style="font-weight: bold;">O</span>
bservatory
</p>
<p style="position: absolute; left: 0px; top: 75pt;">
<span style="font-weight: bold;">A</span>
lliance
</p>
</div>
</div>
<h1>
Title
<br clear="none"/>
Version
<span class="docversion">0.1</span>
</h1>
<h2 class="subtitle">Filled in automatically</h2>
<dl>
<dt>Working Group</dt>
<dd>
<a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaGridAndWebServices" shape="rect">http://www.ivoa.net/twiki/bin/view/IVOA/IvoaGridAndWebServices</a>
</dd>
<dt>
<b>This version:</b>
</dt>
<dd>
<a href="" class="currentlink" shape="rect">filled in automatically</a>
</dd>
<dt>
<b>Latest version:</b>
</dt>
<dd> not issued outside GWS-WG</dd>
<dt>
<b>Previous version(s):</b>
</dt>
<dd> Internal Working Draft v0.1, 2005-01-24 Internal Working Draft v0.2, 2006-05-11 Internal Working Draft v.0.3, 2007-04-26 Internal Working Draft v.04 2008-05-10</dd>
<dt>
<b>Author(s):</b>
</dt>
<dd> Paul Harrison</dd>
</dl>
<h2>Abstract</h2>
<p>Blah Blah</p>
<h2> Status of This Document</h2>
<p>This is an working draft of the GWS-WG. The first release of this document was on 2005-01-24 within the working group; This version is the first public WD.</p>
<p id="statusdecl">(updated automatically)</p>
<p>
<em>A list of </em>
<span style="background: transparent">
<a href="http://www.ivoa.net/Documents/" shape="rect">
<i>current IVOA Recommendations and other technical documents</i>
</a>
</span>
<em> can be found at http://www.ivoa.net/Documents/.</em>
</p>
<h2 class="prologue-heading-western">Acknowledgements</h2>
<p>blah</p>
</div>
<h2>Contents</h2>
<div>
<?toc ?>
</div>
<div class="body">
<div class="section">
<h1>
<a id="Introduction" shape="rect"/>
Introduction
</h1>
<p> </p>
<div class="section">
<h2>Subtitle</h2>
<p>subtext</p>
<p>
A citation can begin like
<cite>std:RM</cite>
</p>
</div>
<div class="section">
<h2> Subtitle</h2>
<p>subtext</p>
</div>
</div>
<div class="section">
<h1> Section</h1>
<div class="section">
<h2> Subsection</h2>
<p>Subsection blah</p>
<div class="section">
<h3>SubSubSection</h3>
</div>
<p>subsubsection blah</p>
</div>
</div>
<div class="section">
<h3> Section</h3>
<p>blah blah</p>
</div>
</div>
<div class="appendices">
<div class="section">
<h1>An appendix</h1>
<p>Include some xml.</p>
<div>
<?incxml href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd"?>
</div>
</div>
</div>
<div class="section-nonum">
<h1>
<a name="References" id="References" shape="rect"/>
References
</h1>
<?bibliography ivoadoc/refs ?>
<!-- omit '.bib' -->
</div>
<hr/>
<p style="text-align: right; font-size: x-small; color: #888;"> $Revision$ $Date$ $HeadURL$ </p>
</body>
</html>

References

[std:docSTD] R. J. Hanisch, C. Arviset, F. Genova, and B. Rino.
IVOA document standards, version 1.2. IVOA Recommendation, 2010.
[ant] http://ant.apache.org/.
Apache Ant. [Online].
[saxon] http://saxon.sourceforge.net/.
Saxon. [Online].
[SAX] http://www.saxproject.org/.
SAX. [Online].
[FOP] http://xmlgraphics.apache.org/fop/.
Apache FOP. [Online].
[std:XSL2] Michael Kay, editor.
XSL transformations (XSLT) version 2.0. W3C Recommendation, January 2007.

$Revision$ $Date: 2011-08-03 13:48:08 +0100 (Wed, 03. Aug 2011) $ $HeadURL: https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoapub.html $