IVOAPub 10 April 2012

1. Introduction

This document describes an XSL based document processing system aimed at producing documents suitable for publishing to the IVOA.

1.1. Design Aims

The fundamental design aims were:

Author the document in XHTML - this allows both text editors and popular WYSIWYG editors to be used, and makes the full features of source code revision systems available.
Allow the document created by one iteration of the processing to be used as input for the next - this allows tweaks to be made in a WYSIWYG editor to the final output, which will then be auto-numbered, for instance, making the authoring process easier than with "batch" oriented systems such as LaTeX.

1.2. Features

The main features of the system are listed below and are described in section 3:

automated section numbering and table of contents generation.
automated reference handling with BibTeX.
formatting of included XML files.
automatic generation of some boilerplate text.
generation of schema documentation according to uniform standards.
final output of the IVOA mandated PDF file with associated PDF bookmarks.

Several of these features allow for more accurate authoring of documents, avoiding errors that can creep in when only manual editing is available. For instance, because raw XML files are included and formatted automatically, the original XML files can be independently validated, and no errors are introduced by pasting XML into a document whilst trying to maintain formatting.

2. Installation

The scripts are maintained and intended to be distributed from the GoogleCode Volute repository at http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%2Fivoadoc. The primary method of use is envisaged to be via the svn externals mechanism within other projects in the Volute repository.

svn propset svn:externals 'ivoadoc https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoadoc' .

2.1. Prerequisites

The bulk of the processing is done with version 2.0 XSL Transformations [std:XSL2], which means that an XSLT 2.0 processor must be installed. Virtually the only fully capable XSLT 2.0 processor available is Saxon [saxon], which is itself written in Java, so a Java runtime must also be installed. The scripting engine used to drive the processing is Ant [ant], which must also be installed. Java 1.6 or newer should be used; the version of Ant is not too critical, and any recent version (1.6 or greater) should suffice. The choice of Saxon version is more complex, as there are now several variants including commercial offerings - the free version that still has all the required functionality is Saxon-B version 9.1 (note that this is not the most recent free version of Saxon). There are several ways in which Saxon can be made available for running within Ant, but perhaps the simplest is to install the jar file in the $HOME/.ant/lib/ directory.

The production of the PDF version of the document is done using the Apache FOP system [FOP]. A binary version of the FOP distribution should be downloaded, and its location set as the fop.home variable in the ant build.xml file.

To use the bibliography feature, LaTeX and BibTeX need to be installed.

3. User Guide

In order to take advantage of the various features of the IvoaPub system the input file needs to be valid XHTML with some extra DIV delimited structure that denotes where sections start and end. These DIVs can be nested to create sub-sections and sub-sub-sections in a properly structured fashion, rather than relying on the level of the HTML heading tag (H1, H2, H3, etc.) to denote the hierarchy.

If starting afresh, there is a template.html file that has the appropriate structuring with some example sections, as well as other standard elements in an IVOA document. If you already have a document that needs to have the DIV structuring added, then there is a Python script, structure.py, that can be used to add the DIV tags based on the existing heading tags. It is likely that you will have to do some customization of the structure.py script (which uses a SAX [SAX] processing model) to alter when the DIV wrapping should occur.
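As an illustration of the kind of DIV wrapping described above, the following is a minimal Python sketch using the standard xml.sax module. It is not the actual structure.py script: the names SectionWrapper and wrap_sections, and the rule applied (each heading opens a section that stays open until the next heading of the same or a higher level, or the end of the body), are assumptions made for this example. It only handles headings that are direct children of the body element.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLGenerator

HEADING = re.compile(r"^h([1-6])$")


class SectionWrapper(xml.sax.ContentHandler):
    """Hypothetical handler: wrap each heading and what follows it in <div class="section">."""

    def __init__(self, out):
        super().__init__()
        self.gen = XMLGenerator(out, encoding="utf-8", short_empty_elements=False)
        self.open_levels = []  # heading levels of the section DIVs currently open

    def _close_down_to(self, level):
        # Close every open section whose heading level is `level` or deeper.
        while self.open_levels and self.open_levels[-1] >= level:
            self.gen.endElement("div")
            self.open_levels.pop()

    def startElement(self, name, attrs):
        match = HEADING.match(name)
        if match:
            level = int(match.group(1))
            self._close_down_to(level)
            self.gen.startElement("div", {"class": "section"})
            self.open_levels.append(level)
        self.gen.startElement(name, attrs)

    def endElement(self, name):
        if name == "body":
            # Assumption: all sections end, at the latest, with the body.
            self._close_down_to(1)
        self.gen.endElement(name)

    def characters(self, content):
        self.gen.characters(content)

    def processingInstruction(self, target, data):
        # Pass instructions such as <?toc ?> through unchanged.
        self.gen.processingInstruction(target, data)


def wrap_sections(xhtml):
    """Return the XHTML text with section DIVs added around the headings."""
    out = io.StringIO()
    xml.sax.parseString(xhtml.encode("utf-8"), SectionWrapper(out))
    return out.getvalue()
```

Changing when a section closes, for example to keep a preamble outside any section, would mean adjusting the _close_down_to rule, which is the sort of customization the paragraph above anticipates.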

The various features can be invoked with the Ant build.xml script (which is available from the same location as this document http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%253Fstate%253Dclosed) in the following ways:

ant without any arguments will run the default ivoarestructure.xslt transformation, which causes:
- sections to be renumbered and the table of contents to be regenerated.
- cross references to be updated.
- XML files to be re-included.
- schema documentation to be regenerated.
ant biblio will regenerate the bibliography (which has to be manually included).
ant createPDF will generate the PDF version.
ant package will zip up the HTML and PDF versions of the document and associated files into a form suitable to submit to the IVOA.

There is a makefile in the same svn repository directory that performs similar functions for those who prefer to drive this process via a makefile.

3.1. Section Numbering and Table of Contents generation

The automated section numbering is useful when sections have been moved, or before any numbering has occurred at all.

For section numbering to work on the initial input, there simply needs to be an HTML heading tag as the first element within the DIV of class="section", as shown below

<div class="section">

<h2>Subtitle</h2>

</div>

After running the script, the same section will look like

<div class="section">

<h3>

Subtitle

</h3>

</div>

It should be noted that a link anchor has been automatically created with an autogenerated identifier - if it is desired this can be changed to a more memorable value (to make authoring a cross reference easier for instance) and the new value will be used in the next iteration of contents generation.
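The numbering scheme itself can be illustrated with a short Python sketch. This is not the XSLT used by the system; the function name number_sections is hypothetical, and the sketch assumes a document without an XML namespace in which each section DIV is a direct child of its parent section and has the heading as its first element.

```python
import xml.etree.ElementTree as ET


def number_sections(parent, prefix=""):
    """Return (number, title) pairs for the nested <div class="section"> elements."""
    numbered = []
    index = 0
    for child in parent:
        if child.tag == "div" and child.get("class") == "section" and len(child) > 0:
            index += 1
            number = f"{prefix}{index}."
            heading = child[0]  # assumed to be the heading element
            numbered.append((number, (heading.text or "").strip()))
            # Sub-sections extend the number of the enclosing section.
            numbered.extend(number_sections(child, prefix=number))
    return numbered
```

Because the numbers are derived from the DIV nesting alone, moving a section and re-running the numbering gives a consistent result regardless of which heading tags were used.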

The location of the table of contents itself is indicated by a ToC processing instruction located within a DIV tag, as shown below

<div>

<?toc ?>

</div>

As with other features that use a processing instruction to indicate where some special processing should take place, all text within the enclosing DIV is replaced by the processing.

3.2. Cross-Referencing

Cross references can be made with a span of class "xref".

<h3>Cross references</h3>

<p>

A cross reference to the

<span class="xref">xmlinclusion</span>

</p>

</div>

3.3. XML Inclusion

The XML inclusion feature can be used to include and format either the whole or a part of an XML document. The principal advantage of this automated inclusion is that the source XML can be independently validated to ensure the accuracy of the document being created.

The <?incxml?> processing instruction can be used to include and format XML within the document. The XML to be included is specified as a URL by using the href pseudo-attribute of the processing instruction.

<h2>

Full XML Inclusion

</h2>

<div>

<?incxml href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd"?>

</div>
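To show how such processing instructions and their pseudo-attributes can be located in a document, here is a small Python sketch using the standard xml.dom.minidom module. It is only an illustration and not part of the IvoaPub scripts; the function name find_instructions and the regular expression used to read the pseudo-attributes are assumptions of this example.

```python
import re
from xml.dom import minidom, Node

# Pseudo-attributes look like name="value" inside the instruction data.
PSEUDO_ATTR = re.compile(r'([\w:-]+)\s*=\s*"([^"]*)"')


def find_instructions(xhtml, target):
    """Return the pseudo-attributes of every <?target ...?> instruction, in document order."""
    found = []

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE and child.target == target:
                found.append(dict(PSEUDO_ATTR.findall(child.data)))
            walk(child)

    walk(minidom.parseString(xhtml))
    return found
```

A check of this kind can be useful before a formatting run, for example to list which files a document will try to include.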

It is also possible to include only part of an XML document by giving an XPath expression for the part to be included in the select pseudo-attribute - note that the XPath specified is preceded by a nominal "//" when creating the full XPath that is used to select the portion of the document to be displayed.

<h2>Partial XML Inclusion</h2>

<div>

<?incxml href="../build.xml" select="target[1]" ?>

</div>

Note that this example shows a relative path being used to locate the XML to be included, and this illustrates another quirk of the system. Because the XSLT scripts live in a subdirectory of the main document, it is necessary to make an "up" directory path segment first even to refer to a file that is in the same directory as the main document.
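Both points, the implied "//" prefix on the select expression and the resolution of relative paths against the directory holding the XSLT scripts, can be sketched in Python. This is an approximation for illustration only: the system itself uses XSLT 2.0, whereas ElementTree supports just a subset of XPath and its ".//" search does not consider the root element itself. The function names and the directory name mydoc are hypothetical.

```python
import posixpath
import xml.etree.ElementTree as ET


def resolve_href(stylesheet_dir, href):
    """Resolve a relative href against the directory that holds the XSLT scripts."""
    return posixpath.normpath(posixpath.join(stylesheet_dir, href))


def select_fragments(xml_text, select=None):
    """Return the elements picked out by `select`, searched for anywhere in the document."""
    root = ET.fromstring(xml_text)
    if select is None:
        return [root]  # no select given: the whole document is included
    # The user-supplied expression is prefixed so that it matches at any depth.
    return root.findall(".//" + select)
```

With the scripts in mydoc/ivoadoc, resolve_href("mydoc/ivoadoc", "../build.xml") gives mydoc/build.xml, which is why the leading "../" is needed even for a file that sits next to the main document.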

3.4. Schema Documentation

Schema documentation of a style that is used in other IVOA documents can be generated with the <?schemadef?> processing instruction. In this case the schema

<h2>Generate documentation for Capability</h2>

<div>

<?schemadef href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd" defn="Capability" ?>

</div>

3.5. Bibliography

Citations can be made by using the standard HTML <cite> element with the citation key as content. The citation key should reference an entry in a standard LaTeX bibliography file. After processing, the citation is restyled and a hypertext link is created connecting it to the corresponding entry in the references.

<h3>Making the citation</h3>

<p>

A citation can begin like

, but after processing will be like

<cite>

[

]

</cite>

</p>

</div>

The bibliography file to be used is configured as below

<h2>

References

</h2>

<?bibliography ivoadoc/refs ?>

</div>

The file refs.bib contains some BibTeX references to IVOA standards.

In order to generate the references list from the citations within the document, the ant biblio command should be run and the result pasted into the document.
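Before running ant biblio it can be helpful to confirm that every citation key used in the document has an entry in the bibliography file. The following Python sketch shows one way of doing this; it is not part of the IvoaPub scripts, the function names are hypothetical, and it assumes a document without an XML namespace and BibTeX entries of the simple form @type{key, ...}.

```python
import re
import xml.etree.ElementTree as ET


def citation_keys(xhtml):
    """Return the distinct citation keys found in <cite> elements, in document order."""
    keys = []
    for cite in ET.fromstring(xhtml).iter("cite"):
        # Already processed citations are assumed to look like [key]; strip the brackets.
        key = "".join(cite.itertext()).strip().strip("[]").strip()
        if key and key not in keys:
            keys.append(key)
    return keys


def bibtex_keys(bib_text):
    """Return the set of entry keys declared in a BibTeX file."""
    return set(re.findall(r"@\w+\s*\{\s*([^,\s]+)\s*,", bib_text))


def missing_citations(xhtml, bib_text):
    """Return the cited keys that have no entry in the bibliography."""
    known = bibtex_keys(bib_text)
    return [key for key in citation_keys(xhtml) if key not in known]
```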

3.6. PDF Generation

The PDF generation requires no special configuration within the file as long as the template structure is being used.

4. Troubleshooting & FAQ

4.1. XSLT transformation fails

This is usually because the input document is not well-formed XHTML. The parser should indicate where the error is, although it can be difficult to spot the message amongst the stack trace that is produced. If you can separately validate your XHTML before running the script, you will find these errors more easily.
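A quick well-formedness check of this kind can be done with a few lines of Python using the standard library parser. This is a sketch rather than part of the IvoaPub scripts, and the function name check_well_formed is hypothetical. Note that it checks well-formedness only, not validity against the XHTML DTD, and since no DTD is loaded it will also report named entities such as &nbsp; as errors.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xhtml):
    """Return None if the text parses as XML, else a (line, column, message) tuple."""
    try:
        ET.fromstring(xhtml)
    except ET.ParseError as err:
        line, column = err.position
        return (line, column, str(err))
    return None
```

The line and column returned point at the place where the parser gave up, which is usually at or shortly after the offending tag.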

If the transformation fails then the original file can be obtained from the copy made before the transformation with "out" appended to the name before the xsl extension.

4.2. PDF Generation fails

This can often happen because there is a broken internal link in the document. The error messages in such a case can be quite lengthy, but the source of the error is usually indicated as a missing identifier.
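Broken internal links can be found before the PDF stage with a short Python check such as the sketch below. It is not part of the IvoaPub scripts; the function name is hypothetical, and it assumes a document without an XML namespace in which link targets are given by id attributes or by name attributes on anchors.

```python
import xml.etree.ElementTree as ET


def broken_internal_links(xhtml):
    """Return the fragment identifiers that are linked to but never defined."""
    root = ET.fromstring(xhtml)
    targets = set()
    for element in root.iter():
        if element.get("id"):
            targets.add(element.get("id"))
        if element.tag == "a" and element.get("name"):
            targets.add(element.get("name"))
    broken = []
    for anchor in root.iter("a"):
        href = anchor.get("href", "")
        # Only links of the form #identifier point inside the document.
        if len(href) > 1 and href.startswith("#") and href[1:] not in targets:
            broken.append(href[1:])
    return broken
```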

4.3. XML inclusion fails

There can be several reasons for this:

The included document is not well formed.
The relative URL is incorrect - it is a quirk of the system that the URL must be relative to the location of the ivoarestructure.xsl file.
The selecting XPath is incorrect - be particularly careful of namespaces.

4.4. I do not want all my XML includes to be redone

If you do not want an include to be redone in a document formatting cycle, then you need only to comment out or remove the processing instruction that would cause the inclusion - the already included/formatted XML will not be touched.

A. Full Template

The full template document is reproduced below.

<head>

<title>Template IVOA Document</title>

</head>

<body>

</div>

International

</p>

Virtual

</p>

Observatory

</p>

Alliance

</p>

</div>

<h1>

Title

Version

</h1>

<h2 class="subtitle">Filled in automatically</h2>

<dl>

<dt>Working Group</dt>

<dd>

<a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaGridAndWebServices" shape="rect">http://www.ivoa.net/twiki/bin/view/IVOA/IvoaGridAndWebServices</a>

</dd>

<dt>

<b>This version:</b>

</dt>

<dd>

<a href="" class="currentlink" shape="rect">filled in automatically</a>

</dd>

<dt>

<b>Latest version:</b>

</dt>

<dd> not issued outside GWS-WG</dd>

<dt>

<b>Previous version(s):</b>

</dt>

<dd> Internal Working Draft v0.1, 2005-01-24 Internal Working Draft v0.2, 2006-05-11 Internal Working Draft v.0.3, 2007-04-26 Internal Working Draft v.04 2008-05-10</dd>

<dt>

<b>Author(s):</b>

</dt>

<dd> Paul Harrison</dd>

</dl>

<h2>Abstract</h2>

<h2> Status of This Document</h2>

<p>This is a working draft of the GWS-WG. The first release of this document was on 2005-01-24 within the working group; this version is the first public WD.</p>

<p id="statusdecl">(updated automatically)</p>

<p>

<i>current IVOA Recommendations and other technical documents</i>

</a>

</span>

<em> can be found at http://www.ivoa.net/Documents/.</em>

</p>

<h2 class="prologue-heading-western">Acknowledgements</h2>

</div>

<h2>Contents</h2>

<div>

<?toc ?>

</div>

<h1>

Introduction

</h1>

<h2>Subtitle</h2>

<p>subtext</p>

<p>

A citation can begin like

</p>

</div>

<h2> Subtitle</h2>

<p>subtext</p>

</div>

<h1> Section</h1>

<h2> Subsection</h2>

<p>Subsection blah</p>

<h3>SubSubSection</h3>

</div>

<p>subsubsection blah</p>

</div>

<h3> Section</h3>

</div>

<h1>An appendix</h1>

<p>Include some xml.</p>

<div>

<?incxml href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd"?>

</div>

<h1>

References

</h1>

<?bibliography ivoadoc/refs ?>

</div>

<hr/>

<p style="text-align: right; font-size: x-small; color: #888;"> $Revision$ $Date$ $HeadURL$ </p>

</body>

</html>

$Revision$ $Date: 2011-08-03 13:48:08 +0100 (Wed, 03. Aug 2011) $ $HeadURL: https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoapub.html $

IVOAPub - A system for creating IVOA documents
Version 1.0

IVOA Note 10 April 2012

Abstract

Status of This Document

Acknowledgements

Contents

1. Introduction

1.1. Design Aims

1.2. Features

2. Installation

2.1. Prerequisites

3. User Guide

3.1. Section Numbering and Table of Contents generation

3.2. Cross-Referencing

3.3. XML Inclusion

3.4. Schema Documentation

3.5. Bibliography

3.6. PDF Generation

4. Troubleshooting & FAQ

4.1. XSLT transformation fails

4.2. PDF Generation fails

4.3. XML inclusion fails

4.4. I do not want all my XML includes to be redone

A. Full Template

References
