IVOAPub

1. Introduction

This document describes an XSL based document processing system.

1.1. Design Aims

The fundamental design aims were:

Author the document in XHTML - this allows both text editors and popular WYSIWYG editors to be used, and makes the full features of source code revision systems available.
Allow the document created by one iteration of the processing to be used as input for the next - this allows tweaks to be made in a WYSIWYG editor to the final output format, which will then be auto-numbered for instance, making the authoring process easier than with "batch" oriented systems such as LaTeX.

1.2. Features

The main features of the system are listed below and are described in section 3.

automated section numbering and table of contents generation
automated reference handling with BibTeX.
formatting of included XML files.
generation of schema documentation according to uniform standards.
final output of IVOA mandated PDF file with associated pdf bookmarks.

Several of these features allow for more accurate authoring of documents, avoiding errors that can creep in when only manual editing is available. For instance, the ability to include and format raw XML files means that not only can the original XML files be independently validated, but there are also no errors introduced by the complex process of pasting XML into a document whilst trying to maintain formatting.

2. Installation

The scripts are maintained in, and intended to be distributed from, the GoogleCode Volute repository at http://code.google.com/p/volute/source/browse/#svn%2Ftrunk%2Fprojects%2Fivoapub%2Fivoadoc. The primary method of use is envisaged to be via the svn externals mechanism within other projects in the Volute repository.

svn propset svn:externals 'ivoadoc https://volute.googlecode.com/svn/trunk/projects/ivoapub/ivoadoc' .

2.1. Prerequisites

The bulk of the processing is done with version 2.0 XSL Transformations [std:XSL2], which means that an XSLT 2.0 processor must be installed. Virtually the only fully capable XSLT 2.0 processor available is Saxon [saxon], which is itself written in Java, so a Java runtime must also be installed. The scripting engine that is used to drive the processing is Ant [ant], which must also be installed. Java 1.6 or newer should be used; the version of Ant is not critical, and any recent version (e.g. 1.6 or greater) should suffice. The choice of Saxon version is more complex, as there are now several variants including commercial offerings - the free version that still has all the required functionality is Saxon-B version 9.1 (note that this is not the most recent free version of Saxon). There are several ways in which Saxon can be made available for running within Ant, but perhaps the simplest is to install the jar file in the $HOME/.ant/lib/ directory.

The production of the PDF version of the document is done using the Apache FOP system [FOP]. A binary version of the FOP distribution should be downloaded, and its location set as the fop.home variable in the ant build.xml file.

To use the bibliography feature, LaTeX and BibTeX need to be installed.
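
The command-line prerequisites can be checked before a first run. The following is a minimal sketch, assuming the tools are installed under their usual command names (java, ant, latex, bibtex); these names and the helper function are illustrative and not part of the IVOAPub distribution. It does not check for the Saxon jar or the FOP directory, which are located through the Ant configuration rather than the PATH.

```python
import shutil

# Command names are assumptions about a typical installation; adjust as needed.
REQUIRED = ["java", "ant"]
BIBLIOGRAPHY_ONLY = ["latex", "bibtex"]


def missing_tools(names, which=shutil.which):
    """Return the subset of names that cannot be found on the PATH."""
    return [name for name in names if which(name) is None]
```

Calling missing_tools(REQUIRED + BIBLIOGRAPHY_ONLY) returns an empty list when everything is found. The which parameter exists only so that the lookup can be replaced when testing.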

3. User Guide

In order to take advantage of the various features of the IvoaPub system, the input file needs to be valid XHTML with some extra DIV delimited structure that denotes where sections start and end. These DIVs can be nested to create sub-sections and sub-sub-sections in a properly structured fashion, rather than relying on the level of the HTML heading tag (H1, H2, H3, etc.) to denote the level.

If starting afresh, there is a template.html file that has the appropriate structuring with some example sections, as well as other standard elements in an IVOA document. If you already have a document that needs to have the DIV structuring added, then there is a Python script, structure.py, that can be used to add the DIV tags based on the already existing heading tags. It is likely that you will have to do some customization of the structure.py script (which uses a SAX [SAX] processing model) to alter when the DIV wrapping should occur.
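
To illustrate the kind of SAX processing involved, the following is a minimal sketch of heading-driven DIV wrapping. It is not the distributed structure.py: the class and function names are invented here, and it assumes that headings are direct children of the body element and that the document has no DOCTYPE or comments that need to be preserved.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLGenerator

HEADING = re.compile(r"^h([1-6])$")


class SectionWrapper(xml.sax.ContentHandler):
    """Wrap each heading and the content that follows it in a section DIV."""

    def __init__(self, out):
        super().__init__()
        self.gen = XMLGenerator(out, encoding="utf-8")
        self.open_levels = []  # heading levels of the currently open DIVs

    def _close_down_to(self, level):
        # Close every open section at the same or a deeper level.
        while self.open_levels and self.open_levels[-1] >= level:
            self.gen.endElement("div")
            self.open_levels.pop()

    def startElement(self, name, attrs):
        match = HEADING.match(name)
        if match:
            level = int(match.group(1))
            self._close_down_to(level)
            self.gen.startElement("div", {"class": "section"})
            self.open_levels.append(level)
        self.gen.startElement(name, attrs)

    def endElement(self, name):
        if name == "body":
            self._close_down_to(0)  # close all sections before </body>
        self.gen.endElement(name)

    def characters(self, content):
        self.gen.characters(content)

    def processingInstruction(self, target, data):
        self.gen.processingInstruction(target, data)


def add_sections(xhtml_text):
    """Return the document text with section DIVs added around headings."""
    out = io.StringIO()
    xml.sax.parseString(xhtml_text.encode("utf-8"), SectionWrapper(out))
    return out.getvalue()
```

A lower-level heading opens a nested DIV, while a heading at the same or a higher level first closes the sections that are still open. Changing the condition in startElement is the place where the wrapping rule would be customized.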

The various features can be invoked with the Ant build script in the following ways:

ant without any arguments will run the default ivoarestructure.xslt transformation which causes
- Section renumbering and regeneration of the table of contents
- cross references to be updated.
- XML files to be re-included.
- schema documentation to be regenerated.
ant biblio will regenerate the bibliography (which has to be manually included).
ant createPDF will generate the PDF version.
ant package will zip up the HTML and PDF versions of the document and associated files.
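
When the whole cycle is run repeatedly, the targets listed above can be driven from a small script. This is a sketch only: the step names on the left of the mapping and both functions are invented for illustration, while the target names come from the list above and -f is Ant's standard option for naming the build file.

```python
import subprocess

# Target names are taken from the list above; None means the default target.
TARGETS = {
    "restructure": None,
    "biblio": "biblio",
    "pdf": "createPDF",
    "package": "package",
}


def ant_command(step, build_file="build.xml"):
    """Build the argument list for one processing step."""
    command = ["ant", "-f", build_file]
    target = TARGETS[step]
    if target is not None:
        command.append(target)
    return command


def run_steps(steps, runner=subprocess.run):
    """Run the steps in order, stopping at the first failure."""
    for step in steps:
        runner(ant_command(step), check=True)
```

For example, run_steps(["restructure", "pdf", "package"]) would renumber the document, build the PDF and then create the zip file, raising an exception as soon as one Ant invocation fails.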

3.1. Section Numbering and Table of Contents generation

The automated section numbering is useful when sections have been moved or before any numbering has occurred at all.

For section numbering to work on the initial input, there simply needs to be an HTML heading tag as the first element within the DIV of class="section", as shown below

<div class="section">

<h2>Subtitle</h2>

</div>

After running the script, the same section will look like

It should be noted that a link anchor has been automatically created with an autogenerated identifier - if it is desired this can be changed to a more memorable value (to make authoring a cross reference easier for instance) and the new value will be used in the next iteration of contents generation.
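
The numbering rule itself can be sketched in a few lines. The example below is not the XSLT used by the system: it is an illustration in Python that assumes un-namespaced elements, plain text inside each heading, and sections nested directly inside one another. It omits the creation of link anchors. Any existing number is stripped first, so the output of one run can be fed into the next.

```python
import re
import xml.etree.ElementTree as ET

NUMBER_PREFIX = re.compile(r"^\s*(\d+\.)+\s*")


def number_sections(parent, prefix=""):
    """Number the heading of every nested div of class "section" in place."""
    count = 0
    for child in parent:
        if child.tag != "div" or child.get("class") != "section":
            continue
        count += 1
        number = f"{prefix}{count}."
        heading = child[0] if len(child) else None
        if heading is not None and re.fullmatch(r"h[1-6]", heading.tag):
            # Drop a number left by an earlier run before adding the new one.
            title = NUMBER_PREFIX.sub("", heading.text or "")
            heading.text = f"{number} {title}"
        number_sections(child, number)
```

The function is called on the element that contains the top-level sections, for example number_sections(ET.fromstring(text)) for a fragment whose root is the body.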

The location of the table of contents itself is indicated by a ToC processing instruction located within a DIV tag, as shown below

<div>

<?toc ?>

</div>

As with other features that use a processing instruction to indicate where some special processing should take place, all text within the enclosing DIV is replaced by the processing.
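
The replacement rule can be illustrated with the following sketch, which fills the DIV enclosing a toc processing instruction with one paragraph per heading. It is an assumption-laden illustration rather than the system's stylesheet: the function names are invented, the real output is likely to be a linked, nested list, and keeping the processing instruction in place (so that the next run can regenerate the contents) is inferred from the iterative design described in section 1.1.

```python
from xml.dom import minidom, Node

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


def walk(node):
    """Yield every element below node in document order."""
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            yield child
            yield from walk(child)


def text_of(node):
    """Concatenate all text found inside an element."""
    parts = []
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            parts.append(child.data)
        elif child.nodeType == Node.ELEMENT_NODE:
            parts.append(text_of(child))
    return "".join(parts).strip()


def find_instructions(node, target):
    """Yield every processing instruction with the given target."""
    for child in node.childNodes:
        if child.nodeType == Node.PROCESSING_INSTRUCTION_NODE:
            if child.target == target:
                yield child
        elif child.nodeType == Node.ELEMENT_NODE:
            yield from find_instructions(child, target)


def fill_toc(dom):
    """Replace the content of each DIV holding <?toc?> with heading titles."""
    titles = [text_of(e) for e in walk(dom) if e.tagName in HEADINGS]
    for instruction in list(find_instructions(dom, "toc")):
        holder = instruction.parentNode
        for old in list(holder.childNodes):
            if old is not instruction:  # keep the instruction for the next run
                holder.removeChild(old)
        for title in titles:
            item = dom.createElement("p")
            item.appendChild(dom.createTextNode(title))
            holder.appendChild(item)
    return dom
```

Because everything in the holder except the instruction is discarded, hand edits made inside that DIV do not survive a run.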

3.2. Cross-Referencing

Cross references can be made with a span of class "xref".

<div class="section">

<h3>Cross references</h3>

<p>

A cross reference to the

<span class="xref">xmlinclusion</span>

</p>

</div>

3.3. XML Inclusion

The XML inclusion feature can be used to include and format either the whole or a part of an XML document. The principal advantage of this automated inclusion is that the source XML can be independently validated to ensure the accuracy of the document being created.

The <?incxml?> processing instruction can be used to include and format XML within the document. The XML to be included is specified as a URL by using the href pseudo-attribute of the processing instruction.

<div>

<?incxml href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd"?>

</div>

It is also possible to include part of an XML document by giving an XPATH specifier for that part in the select pseudo-attribute - note that the specified XPATH is preceded by a nominal "//" when creating the full XPATH that is used to select the portion of the document to be displayed.

<h2>Partial XML Inclusion</h2>

<div>

<?incxml href="../build.xml" select="target[1]" ?>

</div>

Note that this example shows a relative path being used to locate the XML to be included.
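
The effect of the select pseudo-attribute can be sketched as follows. This is an illustration only, with invented function names: it uses Python's ElementTree, which implements a small subset of XPath, whereas the system itself evaluates the expression in XSLT 2.0. The leading ".//" plays the role of the nominal "//" described above, except that it never matches the root element itself.

```python
import re
import xml.etree.ElementTree as ET

PSEUDO_ATTRIBUTE = re.compile(r'([\w:-]+)\s*=\s*"([^"]*)"')


def parse_pseudo_attributes(data):
    """Turn 'href="a" select="b"' into {"href": "a", "select": "b"}."""
    return dict(PSEUDO_ATTRIBUTE.findall(data))


def select_fragments(xml_text, select=None):
    """Return the whole document, or the elements matched by //select."""
    root = ET.fromstring(xml_text)
    if select is None:
        return [root]
    # ElementTree paths are relative to the root, so ".//" stands in for "//".
    return root.findall(".//" + select)
```

With a build file containing several target elements, a select value of target[1] returns only the first one, which is what the example above asks for.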

3.4. Schema Documentation

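Schema documentation of a style that is used in other IVOA documents can be generated with the <?schemadef?> processing instruction. In this case the schema is given by the href pseudo-attribute and the definition to be documented by the defn pseudo-attribute.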

<h2>Generate documentation for Capability</h2>

<div>

<?schemadef href="http://www.ivoa.net/xml/VOResource/VOResource-v1.0.xsd" defn="Capability" ?>

</div>
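
As a rough illustration of what has to be looked up for such an instruction, the sketch below finds a named type in a schema and collects its documentation text and element names. It assumes that defn names a top-level complexType or simpleType; the function name and the sample schema are invented for this example, and the sample is not the VOResource schema.

```python
import xml.etree.ElementTree as ET

XS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# A tiny stand-in schema, invented for this sketch.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Capability">
    <xs:annotation>
      <xs:documentation>A sample description.</xs:documentation>
    </xs:annotation>
    <xs:sequence>
      <xs:element name="interface" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def describe_type(xsd_text, defn):
    """Return the name, documentation and element names of a named type."""
    root = ET.fromstring(xsd_text)
    for kind in ("complexType", "simpleType"):
        for node in root.findall(f"xs:{kind}", XS):
            if node.get("name") != defn:
                continue
            doc = node.find("xs:annotation/xs:documentation", XS)
            text = (doc.text or "").strip() if doc is not None else ""
            names = [e.get("name") for e in node.findall(".//xs:element", XS)]
            return {"name": defn, "documentation": text, "elements": names}
    return None
```

A result of None means that no type of that name exists at the top level of the schema, which is worth checking before suspecting the processing itself.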

3.5. Bibliography

Citations can be made by using the standard HTML <cite> element with the citation key as content. The citation key should reference an entry in a standard LaTeX bibliography file. After processing, the citation is restyled and a hypertext link is created connecting it to the entry in the references.

<h3>Making the citation</h3>

<p>

A citation can begin like

, but after processing will be like

The bibliography file to be used is configured as below

In order to generate the references list from the citations within the document, the ant biblio command should be run and the result pasted into the document.
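
Before running the bibliography step it can be useful to see which keys the document actually cites. The following sketch, with an invented function name, collects the content of every cite element. Stripping square brackets is an assumption based on how processed citations such as [saxon] appear in this document.

```python
import xml.etree.ElementTree as ET


def citation_keys(xhtml_text):
    """Return the distinct citation keys in order of first appearance."""
    keys = []
    for cite in ET.fromstring(xhtml_text).iter("cite"):
        # Processed citations may wrap the key in a link and brackets.
        key = "".join(cite.itertext()).strip().strip("[]")
        if key and key not in keys:
            keys.append(key)
    return keys
```

Each returned key should correspond to an entry in the bibliography file; a key with no entry is a likely cause of a missing reference.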

3.6. PDF Generation

The PDF generation requires no special configuration within the file as long as the template structure is being used.

4. Troubleshooting & FAQ

4.1. XSLT transformation fails

This is usually because the input document is not well-formed XHTML - the parser should indicate where the error is, although it can be difficult to spot the message amongst the stack trace that is produced. If you can separately validate your XHTML before running the script you will find these errors more easily.
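
One lightweight way to make such a check is sketched below; the function name is invented for this example. It tests well-formedness only, not validity against the XHTML DTD, and because no DTD is loaded, named entities such as &nbsp; are reported as errors.

```python
import xml.etree.ElementTree as ET


def first_syntax_error(xhtml_text):
    """Return (line, column, message) for the first parse error, or None."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as error:
        line, column = error.position
        return line, column, str(error)
    return None
```

The line and column point at the place where the parser gave up, which for an unclosed element is usually the closing tag of its parent rather than the element itself.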

If the transformation fails then the original file can be obtained from the copy made before the transformation with "out" appended to the name before the xsl extension.

4.2. PDF Generation fails

This can often happen because there is a broken internal link in the document. The error messages in such a case can be quite lengthy, but the source of the error is usually indicated as a missing identifier.
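
Broken internal links can be found before the PDF step with a check along the following lines. This is a sketch with an invented function name, assuming un-namespaced elements; it treats both id and name attributes as possible link targets.

```python
import xml.etree.ElementTree as ET


def broken_internal_links(xhtml_text):
    """Return the sorted internal link targets that have no matching anchor."""
    root = ET.fromstring(xhtml_text)
    known = set()
    for element in root.iter():
        for attribute in ("id", "name"):
            if element.get(attribute):
                known.add(element.get(attribute))
    targets = [
        a.get("href")[1:]
        for a in root.iter("a")
        if a.get("href", "").startswith("#")
    ]
    return sorted({target for target in targets if target not in known})
```

An empty list means that every link of the form #identifier has a destination in the same document.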

4.3. XML inclusion fails

There can be several reasons for this:

The included document is not well-formed.
The relative URL is incorrect - it is a quirk of the system that the URL must be relative to the location of the ivoarestructure.xsl file.
The selecting XPATH is incorrect - be particularly careful of namespaces.
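
The relative URL rule can be checked by resolving the href by hand. The sketch below uses a hypothetical directory layout in which the ivoadoc scripts sit in a sub-directory of the document project; the paths and the function name are illustrative only.

```python
from urllib.parse import urljoin


def resolve_include(stylesheet_url, href):
    """Resolve an include href against the location of the stylesheet."""
    return urljoin(stylesheet_url, href)


# Hypothetical layout: the stylesheet lives in /work/mydoc/ivoadoc/.
STYLESHEET = "file:///work/mydoc/ivoadoc/ivoarestructure.xsl"
print(resolve_include(STYLESHEET, "../build.xml"))
```

Under this layout, "../build.xml" refers to the build file one level above the ivoadoc directory, and an absolute URL is returned unchanged.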

4.4. I do not want all my XML includes to be redone

If you do not want an include to be redone in a document formatting cycle, then you need only comment out or remove the processing instruction that would cause the inclusion - the already included and formatted XML will not be touched.

IVOAPub - A system for creating IVOA documents
Version 1.0

IVOA Note 2011-05-16

Abstract

Status of This Document

Acknowledgements

Contents

1. Introduction

1.1. Design Aims

1.2. Features

2. Installation

2.1. Prerequisites

3. User Guide

3.1. Section Numbering and Table of Contents generation

3.2. Cross-Referencing

3.3. XML Inclusion

3.4. Schema Documentation

3.5. Bibliography

3.6. PDF Generation

4. Troubleshooting & FAQ

4.1. XSLT transformation fails

4.2. PDF Generation fails

4.3. XML inclusion fails

4.4. I do not want all my XML includes to be redone

Appendix A. Full Template

References
