% draft for Utypes May 2009
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%  For an conversion via cgiprint (HTX):
%  See http://vizier.u-strasbg.fr/local/man/cgiprint.htx
\def\ifhtx{\iffalse}    % Lines used only for the HTML version
\ifhtx
% . . .
% . . . Definitions in HTX context
% . . .
\else
\documentclass[12pt,notitlepage,onecolumn]{ivoa}
% . . .
% . . . Definitions in LaTeX context
% . . .
\fi
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% Comment/uncomment lines below to follow your LateX distribution...

%%
%% If document is processed with latex, dvips and ps2pdf
%%
\ifx\pdftexversion\undefined
  \usepackage[dvips]{graphicx}
  \DeclareGraphicsExtensions{.eps,.ps}
%% Uncomment following line if you want PDF thumbnails
%  \usepackage[ps2pdf]{thumbpdf}
% for old hyperref, use:
  \usepackage{hyperref}
%% for recent hyperref, use:
%  \usepackage[ps2pdf,bookmarks=true,bookmarksnumbered=true,hypertexnames=false,breaklinks=true,%
%  colorlinks,linkcolor=blue,urlcolor=blue]{hyperref}



%%
%% else if document is processed with pdflatex
%%
\else
  \usepackage[pdftex]{graphicx} %% graphics for pdftex (supports .pdf .jpg .png)
  \usepackage{epstopdf}         %% requires epstopdf
%% this is to support .ps files :
  \makeatletter
  \g@addto@macro\Gin@extensions{,.ps}
  \@namedef{Gin@rule@.ps}#1{{pdf}{.pdf}{`ps2pdf #1}}
  \makeatother
%% comment above lines if you have included ps files
%\DeclareGraphicsExtensions{.pdf,.jpg,.png}
%% Uncomment following line if you want PDF thumbnails
%  \usepackage[pdftex]{thumbpdf}
%% for old hyperref, use:
  \usepackage{hyperref}
% for recent hyperref, use:
%  \usepackage[pdftex,bookmarks=true,bookmarksnumbered=true,hypertexnames=false,breaklinks=true,%
%  colorlinks,linkcolor=blue,urlcolor=blue]{hyperref}
  \pdfadjustspacing=1
\fi
 %%
%%  Header of the document...
%%
% Provide a title for your document
\title{\textsf{The Observation Core Components Data Model}}
% Give date and version number
\date{\today}

% Choose one document type from below
%\ivoatype{IVOA Note}
\ivoatype{IVOA Working Draft}
%\ivoatype{IVOA Proposed Recommendation}
%\ivoatype{IVOA Recommendation}

\version{0.1}
% Give author list: separate different authors with \\
% You can add email addresses with links \url{mailto:yourname@ivoa.net}
\author{Mireille Louys, Fran\c{c}ois Bonnarel, Alberto Micol, David Schade, Pat Dowler, ... }
\editor{Mireille Louys, ...???\\}
\urlthisversion{\url{
http://www.ivoa.net/Documents/Notes/WD-ObsCoreDM-0.1-20091211.pdf}}
\urllastversion{\url{http://www.ivoa.net/Documents/latest/ObsCoreComponentsDM.html}}
\previousversion{0.3}
%%%%%%%%%%%%%%%%
%mir \documentclass[12pt]{article}
%\usepackage{graphicx}
%\usepackage{hyperref}
%\usepackage{psfig}
%\usepackage{html}
%\usepackage{epsf}
%\usepackage{lscape}
\usepackage{algorithm}
\usepackage{algorithmicx}
\usepackage{algpseudocode}
\algrenewcommand{\algorithmiccomment}[1]{\hskip3em$\rightarrow$ #1}
%mir \textheight 9.0in \hoffset -0.5in \voffset -0.5in
%\newcommand{\Sensitiv}{Variation}
\newcommand{\Sensitiv}{Sensitivity}

%Mir colors definitions
\definecolor{ipink}{rgb}{1.0,0.937,0.957}
\definecolor{dblue}{rgb}{0.60,0.60,1.0}
\definecolor{ddblue}{rgb}{0.20,0.20,1.0}
\definecolor{iblue}{rgb}{0.9,0.9,1.0}

\newcommand{\bleu}[1]{\textcolor[rgb]{0.00,0.00,1.00}{#1}}
\newcommand{\blue}{\textcolor{blue}}
\newcommand{\violet}{\textcolor[rgb]{0.50,0.00,0.50}}
\newcommand{\green}{\textcolor[rgb]{0.00,1.00,0.00}}
%%%%%%%%%%%%%%%%%
\newcommand{\m}[1]{\mbox{#1}}
%\newcommand{\change}[1]{{\color{red}\it #1}}
\newcommand{\change}[1]{{ #1}}
\newsavebox{\fmbox}
\newenvironment{fmpage}
     {\begin{lrbox}{\fmbox}\begin{footnotesize}\begin{minipage}{15cm}} 
     {\end{minipage}\end{footnotesize}\end{lrbox} \colorbox{iblue}{\fbox{\usebox{\fmbox}}}}
     
\newenvironment{fmpagesmall}
     {\begin{lrbox}{\fmbox} \begin{minipage}{5cm}} 
     {\end{minipage}\end{lrbox}\colorbox{ipink}{\fbox{ \usebox{\fmbox}}}}

\newenvironment{fmppage}
     {\begin{lrbox}{\fmbox}\begin{minipage}{\textwidth}}
     {\end{minipage}\end{lrbox}\colorbox{ipink}{\fbox{\usebox{\fmbox}}}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}

\maketitle % print header in standard form

\section*{Abstract}
This document discusses the definition of the core components of the Observation data model that are necessary to cover data discovery use cases when querying data provider centers.
It presents the use cases to be carried out, explains the model, and provides a table of fields to be implemented along the lines of a TAP/SCHEMA strategy. Such a small model is easy to understand and implement for data providers who wish to publish their data into the Virtual Observatory.
 
\section{Status of this document}
This document has been produced by the Data Model Working Group. It is still a draft.
 
\section*{Acknowledgements}
Members of the IVOA Data Model Working Group and of the Euro-VO have
contributed to the present draft. F.\ Bonnarel, Anita Richards and M.\ Louys
acknowledge support from the European {\em EUROVO-AIDA} project.
\clearpage

\tableofcontents

\clearpage

\section{Introduction}
\subsection{Scope of the document}
This document summarizes the practice adopted in the Virtual
Observatory for naming and identifying data model elements. It defines
a very simple formalism for describing data models, introduces the Utype
concept itself and the syntax proposed to represent Utype lists in the VO,
and finally illustrates how to use them.

\subsection {Context and definition}
\label{sec:context}

The Virtual Observatory provides protocols and interoperable
applications in order to access, retrieve, and analyze astronomical data.
To facilitate this, a unified representation of metadata is required.
For each domain -- Observation, Spectra, Simulations, VOEvents, etc. --,
a data model can be defined to provide such a unified representation.

The metadata in astronomy are distributed using file formats like
VOTable, FITS, or plain text.  To link data items within such formats to
elements in data models (``roles''), an interoperable way to
denote roles is necessary.

For example, a FITS header card (\texttt{SPATRES = 1.3 arcsec}) could be
unequivocally linked to the typical resolution on the spatial axis
within the Characterisation Data Model~\cite{UtypeCharac} by
associating it with the tag
\textit{cha:Spatial\-Axis.resolution.referenceValue}; this tag is the
role's utype, denoting both the data model (``cha'') and the role
the particular value fills within this data model.

Within the VO, UCDs also serve as semantic labels for metadata.  They
categorise physical quantities (e.g., ``this is \emph{a} declination''),
but are not linked to a data model; utypes, in contrast, define the role
of a specific value (e.g., ``this is \emph{the} declination of the
target object'').

\begin{figure}[!htbp]
    \includegraphics[width=0.8\textwidth]{./images/simpledm.eps}
  \caption{\it A small part of the simple data model representation of
  the characterization data model.}
  \label{fig:simpledm}
\end{figure}


\section{Data Models}
\label{sec:dms}

Several formalisms are in use to define data models (using the term in a
loose sense) -- there are many UML modelling tools with a multitude of
serialization formats, more conventional entity-relationship modellers,
and more.  Current practice in the VO is to formally define data models
using XML schema files; these typically are generated using modelling
programs, either directly or using tools like VO-URP \ref{vourp-todo}.

To keep the utype definition independent of the concrete formalism
employed and still be able to formally talk about what a data model is,
we adopt the following definition of a data model for the purpose of
this document:

\medskip
\noindent \textbf{Definition.}  A data model is a directed acyclic graph
$G=(V,E)$, where each vertex $v\in V$ has an associated label $l$; all
elements of $V$ have at most one incoming edge in $E$.
\medskip

Note that the graph need not be connected; a data model may thus have
several ``roots''; a connected subgraph is called a ``package''.

In effect, this definition describes a collection of trees.
It is hoped that all features of data modelling formalisms necessary for
the transmission of serialized instances of VO data models are covered
in this simplified definition.

In the common case of a data model specified in an XML schema, 
algorithm \ref{alg:xsdtodm} can perform the transformation to such a
simple data model description.  All names mentioned in the algorithm are
understood to be namespace-less, and element references are assumed
to be resolved before the algorithm starts.

Note that the algorithm does not halt for data models that do not yield
acyclic graphs.

Algorithm \ref{alg:xsdtodm} is a recommendation, but not normative; data
model authors are free to prescribe different mappings even if they use
XSD. In particular in the presence of substitution groups, this may be
necessary, as the example of the STC data model shows
\ref{stcinvotable-todo}.

\begin{algorithm}
\caption{Generation of a simplified data model from a subset of XSD}
\label{alg:xsdtodm}
\begin{algorithmic}
\Procedure{buildForType}{$n$, $v$, $V$, $E$}

\Comment{$n$ is an XSD type definition}

\Comment{$v$ is the graph vertex corresponding to $n$}

\Comment{$V$ and $E$ are the sets of vertices and edges of a graph}

\For{each descendant $n'$ of $n$ of type \texttt{xs:attribute}}
  \State{Add a node $v'$ to $V$, labeled with the name attribute of $n'$}
  \State{Add an edge $(v,v')$ to $E$}
 \EndFor
\For{each descendant $n'$ of $n$ of type \texttt{xs:element}}
  \State{Add a node $v'$ to $V$, labeled with the name attribute of $n'$}
  \State{Add an edge $(v,v')$ to $E$}
  \State{$t\gets$ the type definition node for $n'$}
  \State call {\sc buildForType($t$, $v'$, $V$, $E$)}
\EndFor
\EndProcedure

\Procedure{buildModel}{$x$}

\Comment{$x$ is the XML Schema (e.g., as a DOM tree)}

\State{$V\gets$ empty set of labeled vertices}
\State{$E\gets$ empty set of edges}
\For{each global element definition $n$ in $x$}
  \State{Add a node $v$ labeled with the name of $n$ to $V$}
  \State{$t\gets$ the type definition node for $n$}
  \State{call {\sc buildForType($t$, $v$, $V$, $E$)}}
\EndFor
\State \Return $(V, E)$
\EndProcedure
\end{algorithmic}
\end{algorithm}

As an example, consider the following (abridged) excerpt from the
characterization data model:

\begin{verbatim}
<xsd:element name="timeAxis" type="cha:TimeAxisType"/>
<xsd:element name="spectralAxis" type="cha:SpectralAxisType"/>
<xsd:complexType name="SpectralAxisType">
  <xsd:complexContent>
      <xsd:sequence>
        <xsd:element name="unit" type="xsd:anyType"/>
        <xsd:element name="accuracy" 
          type="cha:AccuracyType" minOccurs="0"/>
      </xsd:sequence>
  </xsd:complexContent>
</xsd:complexType>
<xsd:complexType name="TimeAxisType">
  <xsd:complexContent>
      <xsd:sequence>
          <xsd:element name="unit" type="xsd:anyType"/>
          <xsd:element name="accuracy" 
            type="cha:AccuracyType" minOccurs="0"/>
      </xsd:sequence>
  </xsd:complexContent>
</xsd:complexType>
<xsd:complexType name="AccuracyType">
  <xsd:sequence>
    <xsd:element name="quality" type="xsd:string" minOccurs="0"/>
    <xsd:element name="statError" 
      type="cha:StatErrorType" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
\end{verbatim}

The corresponding simplified data model is shown in
figure~\ref{fig:simpledm}.
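
Algorithm~\ref{alg:xsdtodm} can be sketched in Python roughly as follows.
This is only an illustration under stated assumptions, not a normative
implementation: it assumes that all types are named, global
\texttt{xsd:complexType} definitions, treats every other type as a leaf,
and represents vertices as indices into the label list \texttt{V}.  The
function names and the reduced schema are made up for this example.

\begin{verbatim}
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Reduced, hypothetical schema modelled on the excerpt above.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="spectralAxis" type="cha:SpectralAxisType"/>
  <xsd:complexType name="SpectralAxisType">
    <xsd:sequence>
      <xsd:element name="unit" type="xsd:anyType"/>
      <xsd:element name="accuracy" type="cha:AccuracyType"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="AccuracyType">
    <xsd:sequence>
      <xsd:element name="quality" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
"""

def local(name):
    """Strip a namespace prefix: 'cha:AccuracyType' -> 'AccuracyType'."""
    return name.split(":")[-1]

def build_for_type(type_node, v, V, E, types):
    if type_node is None:      # built-in or unknown type: v is a leaf
        return
    for kind in ("attribute", "element"):
        for n in type_node.iter(XS + kind):
            child = len(V)     # index of the new vertex
            V.append(n.get("name"))
            E.append((v, child))
            if kind == "element":
                t = types.get(local(n.get("type", "")))
                build_for_type(t, child, V, E, types)

def build_model(schema_root):
    types = {t.get("name"): t
             for t in schema_root.findall(XS + "complexType")}
    V, E = [], []
    for n in schema_root.findall(XS + "element"):
        v = len(V)
        V.append(n.get("name"))
        t = types.get(local(n.get("type", "")))
        build_for_type(t, v, V, E, types)
    return V, E

V, E = build_model(ET.fromstring(SCHEMA))
\end{verbatim}

For this reduced schema, \texttt{V} holds the labels spectralAxis, unit,
accuracy and quality, and \texttt{E} holds the edges $(0,1)$, $(0,2)$ and
$(2,3)$.  Like the algorithm itself, the sketch does not terminate on
recursive type definitions.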

\section{Utypes Syntax }
\label{sec:syntax}

Utypes are simple strings, interpreted case-insensitively.  Client
programs may compare utypes without any need to parse them (``utypes are
opaque for clients'').  Still, for advanced usage, utypes can be parsed.
The grammar is (nonterminals are in italics, terminals in teletype
and double quotes):

\begin{eqnarray}
\textit{utype}&::=&\textit{prefix}\,\texttt{":"}\,\textit{path}\cr
\textit{path}&::=&\textit{label}\,\{\texttt{"."}\,\textit{label}\,\}
\end{eqnarray}

Both \textit{prefix} and \textit{label} are simple Java or C identifiers
(i.e., a letter or underscore followed by zero or more alphanumeric
characters or underscores).
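
The grammar is small enough to be checked with a regular expression.
The following Python sketch shows one possible reading of it;
\texttt{parse\_utype} and \texttt{same\_utype} are hypothetical helper
names, not part of any VO library.

\begin{verbatim}
import re

# A label is a C-like identifier, as described above.
LABEL = r"[A-Za-z_][A-Za-z0-9_]*"
UTYPE_RE = re.compile(rf"({LABEL}):({LABEL}(?:\.{LABEL})*)")

def parse_utype(utype):
    """Return (prefix, [labels]) or raise ValueError."""
    m = UTYPE_RE.fullmatch(utype)
    if m is None:
        raise ValueError(f"not a valid utype: {utype!r}")
    return m.group(1), m.group(2).split(".")

def same_utype(a, b):
    """Compare two utypes case-insensitively, without parsing."""
    return a.lower() == b.lower()

prefix, labels = parse_utype("cha:spectralAxis.accuracy.quality")
\end{verbatim}

Here \texttt{prefix} is \texttt{cha} and \texttt{labels} contains
spectralAxis, accuracy and quality.  Clients that only compare utypes
need \texttt{same\_utype} alone.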

The prefix is a (short) identifier for the data model used.  It is fixed
for any given data model (e.g., \texttt{cha} for characterization,
\texttt{stc} for space-time-coordinates, etc). This prefix remains the
same even for revisions of the data model specification as long as
clients can still deserialize older versions of the data model from
newer utype serializations (this is certainly the case for pure
extensions).
A mapping of the prefix to the concrete definition of the data
model is provided by the \texttt{DataModel.URI} mechanism (see below).

To generate the path, traverse the appropriate subgraph from its root to
the desired data model element.  Gather the labels of the vertices
traversed in sequence and join them with periods (``.'').  The result is
the utype path.  It is unique because the graph is acyclic and no vertex
has more than one predecessor.

For example, in figure~\ref{fig:simpledm}, the utype of the quality role on the
spectral axis is obtained by starting at the spectralAxis node (the root
of the subgraph containing the target role), then traversing to
accuracy and finally reaching quality.  The result then is
\texttt{cha:spectralAxis.accuracy.quality}.  It is of course legal to
stop the traversal before reaching a leaf node.  Thus,
\texttt{cha:spectralAxis.accuracy} is another valid utype resulting from
the data model depicted in figure~\ref{fig:simpledm}.
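
The traversal can be sketched in Python as follows, assuming the simple
data model is given as a list of vertex labels \texttt{V} and a list of
(parent, child) index pairs \texttt{E}.  This representation and the
function name \texttt{utype\_for} are assumptions made for this
illustration only.

\begin{verbatim}
def utype_for(v, V, E, prefix):
    """Build the utype of vertex v by walking up to its root."""
    # Each vertex has at most one incoming edge, so the parent
    # of a vertex is unique.
    parent = {child: par for (par, child) in E}
    labels = []
    while v is not None:
        labels.append(V[v])
        v = parent.get(v)      # None once the root is reached
    return prefix + ":" + ".".join(reversed(labels))

V = ["spectralAxis", "unit", "accuracy", "quality"]
E = [(0, 1), (0, 2), (2, 3)]
leaf = utype_for(3, V, E, "cha")
inner = utype_for(2, V, E, "cha")
\end{verbatim}

With these inputs, \texttt{leaf} is
\texttt{cha:spectralAxis.accuracy.quality} and \texttt{inner} is
\texttt{cha:spectralAxis.accuracy}.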

\section{Data model re-use}

In the VO, some common structures are needed everywhere, such as IVOA identifiers or coordinates.
Coordinates are defined in a separate model, STC (\violet{\footnotesize{\url{http://www.ivoa.net/Documents/latest/STC.html}}}), and identifiers are standardised at \violet{\footnotesize{\url{http://www.ivoa.net/Documents/latest/IDs.html}}}.

In object-oriented programs, classes of these packages are simply linked using libraries and can then be used as types (primitive classes) for other models.
For serialisation, we need an explicit mechanism to state that, for instance, attributes in a class re-use basic STC structures.

XML serialisations that re-use other models are easy to build, as existing schemata can be linked together or imported.
For instance, the Characterisation data model imports STC elements, which are then parsed using the XML namespace mechanism.
For Utype serialisation, two strategies are proposed:

\subsubsection{Canonical notation}
This is the most explicit notation; it allows various versions of the two associated models to be combined.
Utypes are prefixed with their respective data model namespace string and simply concatenated using a specific delimiter, as suggested in Fig.~\ref{fig:DM_UTypeExample}.
Utypes would then be chained according to this pattern:
\\
\begin{fmpagesmall}
{\texttt{dm1:Utype1;dm2:Utype2}}
\end{fmpagesmall} 
\\
which means that entities named Utype2 in `dm2' are re-used as atomic constructs inside Utype1 entities in `dm1'.
This notation clearly identifies the data model to which each Utype belongs.
\begin{figure}[!htbp]
    \includegraphics[width=\textwidth]{./images/DMexSpat.jpg}
  \caption{\it Correspondence between XML elements and Utypes: this example illustrates the similarities between the XML path reaching a leaf element and its Utype representation.}
  \label{fig:DM_UTypeExample}
\end{figure}

%\begin{table}
%    \begin{tabular}{|lp{17cm}|}
%\verb <characterisationAxis>      \\
%\verb <axisName>spatial</axisName>    \\  
%\verb <ucd>pos</ucd>                \\
%\verb <unit>deg</unit>               \\
%{\verb <coordsystem id\="TT-ICRS-TOPO" xlink:type\="simple    cha:characterisationAxis.Coordsystem  } \\
%  \verb xlink: $href=$"ivo://STClib/CoordSys\# TT-ICRS-TOPO"/>           \\ 
%  \verb <coverage>                                            \\        
%    \verb <location>              cha:characterisationAxis.Coverage.location \\
%{      \verb <coord $coord_system_id=$"TT-ICRS-TOPO"> } \\
%        \verb <stc:Position2D>  \\
%          \verb   <stc:Name1>RA</stc:Name1>  \\     
%          \verb   <stc:Name2>Dec</stc:Name2>  \\                          
%          \verb <stc:Value2>       \\
%{      \verb       <stc:C1>132.4210</stc:C1>      } cha:characterisationAxis.Coverage.location  \\
%{      \verb       <stc:C2>12.1232</stc:C2>      } ; stc:Position2D.Value2D.C1   \\
%      \verb     </stc:Value2>                      \\           
%  \verb     </stc:Position2D>                     \\
%    \verb   </coord>  \\
%  \verb </location>   \\
%\verb </coverage>                                \\
%\verb </characterisationAxis>     \\        
%\end{tabular}
%%\end{fmppage}
%  \caption{Correspondance between XML elements and Utypes: a simple example}
%  \label{tab:Correspondance}
%  \end{table}

Here, dm1 and dm2 are namespace prefixes that point to the data model representation, for example the XML schema for the relevant version of the model.
The concatenating character \textbf{\emph{\texttt{;} (semicolon)}} does not clash with any reserved character of the VOTable standard, XML, or the URI syntax. To remain compatible with the URI mechanism~\cite{URIutypes} described by Norman Gray, the characters \texttt{@}, \texttt{*}, \texttt{\$}, \texttt{\#} and \texttt{\%}, which are reserved or otherwise special in URIs, should be avoided.

The concatenation is supposed to happen only once, which means that the part to the right of the `;' is a kind of VO type described consistently and self-sufficiently in one single data model.
This assumes that VO models are properly organised in nested packages and are cooperative enough to cover the whole field of astronomical metadata with a minimum of overlap.
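
A client could take a canonical Utype apart as in the following Python
sketch.  It assumes that every part carries its own prefix, as in the
pattern above, and enforces the rule that concatenation happens at most
once.  The function name \texttt{split\_canonical} and the \texttt{cha}
prefix on the example string are assumptions made for this illustration.

\begin{verbatim}
def split_canonical(utype):
    """Split 'dm1:Utype1;dm2:Utype2' into [(prefix, path), ...]."""
    parts = utype.split(";")
    if len(parts) > 2:
        raise ValueError("more than one ';' in " + repr(utype))
    result = []
    for part in parts:
        prefix, sep, path = part.partition(":")
        if not sep or not prefix or not path:
            raise ValueError("missing prefix or path in "
                             + repr(part))
        result.append((prefix, path))
    return result

pairs = split_canonical(
    "cha:SpatialAxis.Coverage.location.coord"
    ";stc:Position2D.Value2D.C1")
\end{verbatim}

The result pairs the host model part (prefix \texttt{cha}) with the
re-used part (prefix \texttt{stc}), so each part can be resolved
against its own data model.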

\subsubsection{Alternative Utypes representation}
The canonical notation applied to the Characterisation data model yields very long Utype strings that are unappealing to users and too long to be used as database column names.
If we consider, for instance, a specific version of the Characterisation data model whose classes integrate coordinates and regions from the STC v1.33 data model, we get a simpler notation by just browsing down the attribute chain as shown in section~\ref{sec:syntax}.
\\
\begin{fmppage}
{\texttt{SpatialAxis.Coverage.location.coord;stc:Position2D.Value2D.C1}}
\end{fmppage} 
\\
would simply become \\  
\begin{fmppage}
{\texttt{SpatialAxis.Coverage.location.Position2D.Value2D.C1}}
\end{fmppage} 
\\

Such a notation does not show the boundary between the two models but is consistent with the XML schema import mechanism.
Parsing the Utype string and resolving the namespace will point to the specific version of the CharDM together with the specific STC v1.33 data model version. This provides a fixed binding between the two data model versions.
Although the string is not much shorter, this notation makes it possible to pick any single value in a data model instance and browse down the nested classes to build the corresponding Utype string.
 
\subsection{Short abbreviations for Utypes}
By construction, Utypes tend to be long, because object-oriented design encourages nested classes and package re-use.
However, even if this is a drawback for display in applications, long strings are easier for data providers and VO programmers to interpret, avoid ambiguities, and foster uniqueness.

Inside an application, a database, or a server, where Utypes are only machine-interpreted,
aliases to short names can be built and used internally.
For instance, a mapping table between Characterisation Utypes and local abbreviations is generated in the SaadaDB system~\cite{Saada}.
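
One conceivable way to derive such short aliases is sketched below in
Python.  The scheme (last label in lower case, with a numeric suffix on
collision) and the function name \texttt{make\_aliases} are assumptions
chosen for this illustration; they do not describe how any existing
system builds its mapping table.

\begin{verbatim}
def make_aliases(utypes, max_len=30):
    """Map each utype to a short, unique, lower-case alias."""
    aliases = {}
    used = set()
    for utype in utypes:
        # Keep only the last label of the path.
        last = utype.rsplit(".", 1)[-1].rsplit(":", 1)[-1]
        base = last.lower()[:max_len]
        alias, n = base, 1
        while alias in used:   # resolve collisions with a counter
            n += 1
            suffix = str(n)
            alias = base[:max_len - len(suffix)] + suffix
        used.add(alias)
        aliases[utype] = alias
    return aliases

table = make_aliases([
    "cha:spectralAxis.accuracy.quality",
    "cha:timeAxis.accuracy.quality",
    "cha:spectralAxis.unit",
])
\end{verbatim}

The three Utypes are mapped to \texttt{quality}, \texttt{quality2} and
\texttt{unit}, respectively.  The full Utype remains the key, so no
information is lost by using the alias internally.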

\section{Generating Utypes from UML data models via their XML representation} 
\label{sec:gene}
%
The syntax rules proposed in section~\ref{sec:syntax} above can be implemented from an XML schema representing the data model, using the XPath mechanism~\cite{Xpath} to build a path from the root of the schema down to the finest-grained elements, which correspond to class attributes in the model.
XPath is not directly used in Utype generation, but its properties are indirectly applied in the approach described here. \\
Suppose now that we have an XML schema fully mapping the UML model content, with all classes represented as elements in the model, and with nested elements for aggregation, references, and basic types.\\
For the sake of clarity, we avoid substitution groups and choice patterns and prefer the XML extension mechanism instead.
Such a rule helps to guarantee that the name of an XML element at any level can be mapped to only one sub-structure, and therefore allows for direct class encoding.
Nested classes are organized as XML trees; browsing down the tree to the leaf elements and concatenating the names then provides a path similar to the Utype construction described in section~\ref{sec:syntax}.

In order to achieve a proper mapping from UML to an XML serialisation, and to derive object code or a Utype list from the generated XML, some requirements on the style of the UML design as well as on the XML schema construction should be met.
\begin{itemize}
  \item UML: For any association, each connected class should have a role name in order to clearly identify references.
Template classes provide the same name for differently typed structures and are difficult to translate into XML; hence they should be avoided.
  \item XML: Classes should be converted to XML elements and class attributes to nested sub-elements. XML attributes mostly provide context for the XML translation and are not used to describe the data model structures (this is only valid for Characterisation; SimDB has a different strategy).
\end{itemize}

Most commercial UML modeling tools, such as RationalRose, MagicDraw, Objecteering, etc., have an internal XML representation of a UML model encoded in a proprietary XMI format. After simplifying this representation, one can apply XSLT transformation rules to directly generate output products such as:
\begin{itemize}
  \item an XML schema
  \item an example of XML document instance 
  \item a Utype list with documentation
  \item a set of hyperlinked webpages for the datamodel documentation  
\end{itemize}

Such an approach has been implemented successfully by G. Lemson and L. Bourges in the Theory interest group
(see http://volute...).

UML allows various designs for a specific project and fully integrates the properties of graphs, with association links between classes, whereas XML emphasizes the hierarchy of elements.
Therefore the translation is not straightforward. Some modeling rules should be imposed on the UML design in
order to simplify translation and produce robust XML schemata and Utype
lists.
The Theory interest group~\cite{Lemson} has tried to come up with a minimal, necessary set of rules to produce a string that uniquely represents any of the
fundamental syntactic elements in the model. These rules are the following:
\begin{itemize}
  \item Property names are unique in a Class. Note that there are three types of
properties: An Attribute is a property whose data type is a value
type (not an object type or class), though it need not be primitive but
may be structured (i.e., have attributes of its own). A Collection is a
named, one-to-many composition relation of a parent to a child class. A
Reference is a named, many-to-one shared association to another class.
\item Class names are unique in a Package (name space).
\item Package names are unique in either an enclosing parent package, or in
the Model (the root of all).

\end{itemize}
\section{ How are Utypes documented? }
\label{sec:docUri}
The documentation for a Utype is defined when the data model is built and is stored in the XMI representation of a UML model. Most CASE tools provide a documentation generator that produces a hyperlinked set of HTML pages.
These may contain just a few lines or a fully illustrated text if necessary.
N. Gray has proposed a URI generation function for each Utype in a DM, which could be used to point to the corresponding anchors in the on-line documentation of a data model.

\section{ How are Utypes published? }
\label{sec:publish}

%identified via namespaces 
For each version of the VO data models, an explicit set of Utype strings is built up in an XML Schema enumerating the various Utype strings.
In VOTable documents or Utype lists, a namespace definition should be included for Utype validation.

Services and applications to describe, assign, and parse all Utypes defined from a data model should be developed, similarly to the UCD tools available at \violet{\footnotesize{\url{http://cdsweb.u-strasbg.fr/UCD/}}}, for instance.
As a (training) example, the revised version of the Characterisation DM, version 2.0, has a new XML schema and an updated set of Utypes available at \violet{\footnotesize{\url{http://ivoa.net/DM/UTypeListCharacterisationDM/UtypeListCharacterisationDM-V0.2-20090522.xsd ....}}}
 
\section{ How are Utypes used? }
\label{sec:usage}
\subsection{Publishing data to the VO}
\label{sec:pubdata}
Data providers can use Utypes to label the metadata attached to their data collections.
The process is the following:
\begin{itemize}
  \item select a data model which covers the domain of these data
  \item map proprietary metadata (FITS, archive, etc.) to VO DM Utypes
  \item generate metadata as serialised documents (VOTable, Utype lists, others?)
\end{itemize}
Different scenarios can be explored (to be developed).
To publish data with the CharacterisationDM-v1.11, one can use the CAMEA VO Tool (\violet{\footnotesize{\url{http://eurovotech.org/twiki/bin/view/VOTech/CharacEditorTool}}}) to check the Utype assignment and verify that the Utype serialisation complies with this model.
Other strategies?

At the data collection level, tools have been developed to help map FITS keywords to Utype lists.
Here is a list of the first tools developed for mapping FITS to a DAL interface or to data model Utypes:
\begin{itemize}
  \item MEX (ESO) DAL interface link...
  \item DM-Mapper (ESA) DAL interface link...
  \item Interactive mapping tool (CDS) (prototype) link...
    This tool takes a data model description and helps the data provider to interactively build a mapping table from FITS keywords to Utypes.
  \end{itemize}
  Such a tool is under development and should be stabilized and tested for different data models.
  It would help data providers to map their metadata to a standardized VO Utype description.


\subsection{Naming metadata in VO protocols }
\label{sec:voprotocols}
The SSA query response consists of a number of fields, identified by Utypes, grouped into
component data models of the form ``\texttt{<component-name>.<field-name>}''.
This is used in the Simple Spectral Access (SSA) protocol with a specific `hand-carved' list of keywords representing the object structure. See Appendix D of the Simple Spectral Access Protocol V1.04 standard document at
\violet{\footnotesize{\url{http://www.ivoa.net/Documents/latest/SSA.html}}}.

Similarly, the SLAP protocol defines its own set of Utypes in Appendix D of the Simple Spectral Line Access Protocol V0.9 standard document (\violet{\footnotesize{\url{http://www.ivoa.net/Internal/IVOA/SpectralLinesListDocs/WD-SLAP-0.9-20090518.pdf}}}).

The protocols generally use Utypes pointing to leaves of a data model.
\subsection{Querying data bases}
\label{sec:query}
Queries in ADQL or SQL use column names to ask for information.
For a database to be compliant with a data model, only the mapping between existing columns and Utypes must be defined. Unfortunately, Utype strings may be longer than the maximum length allowed for column names in database systems; therefore Utypes cannot be used directly in queries. Using a mapping table makes it possible to build a service where:
\begin{enumerate}
\item the client application asks a server for its list of supported metadata and Utypes
\item the server exposes the metadata
\item the user selects the required metadata by browsing the Utypes and the documentation
\item the client translates each Utype in the query into a column name and submits the query
\item the server parses and resolves the query and sends back the result columns
\item the client translates each column name into a Utype when possible and displays the results.
\end{enumerate}
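
The two client-side translation steps of this scenario can be sketched
in Python as follows.  The mapping table, the column names
(\texttt{spat\_res}, \texttt{spec\_qual}, \texttt{obs\_id}) and the
table name \texttt{obs} are hypothetical; a real service would expose
its own mapping.

\begin{verbatim}
# Hypothetical mapping exposed by the server.  Keys are lower-cased
# because utypes are compared case-insensitively.
COLUMN_FOR = {
    "cha:spatialaxis.resolution.referencevalue": "spat_res",
    "cha:spectralaxis.accuracy.quality": "spec_qual",
}
UTYPE_FOR = {col: ut for ut, col in COLUMN_FOR.items()}

def build_query(utypes, table):
    """Translate utypes into column names and build the query."""
    cols = [COLUMN_FOR[u.lower()] for u in utypes]
    return "SELECT " + ", ".join(cols) + " FROM " + table

def label_results(column_names):
    """Translate column names back into utypes where possible."""
    return [UTYPE_FOR.get(c, c) for c in column_names]

query = build_query(
    ["cha:SpatialAxis.resolution.referenceValue"], "obs")
labels = label_results(["spat_res", "obs_id"])
\end{verbatim}

Here \texttt{query} is \texttt{SELECT spat\_res FROM obs}, and the
unmapped column \texttt{obs\_id} keeps its own name in \texttt{labels}.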
 
Such a scenario is interesting as it offers a general vocabulary to the user, whatever the database content, and needs only a few re-engineering steps.
\section{Conclusion}
Utypes are useful to convey the role, the structure, and the normalized name of each piece of metadata involved in a service or a protocol. They are an important factor in interoperability.
A compromise between long descriptive strings and usability has been found by developing simple mapping mechanisms on the client side.

  
\bibliographystyle{plain}
\bibliography{utypes}

\appendix
\newpage
\section{Appendix A: Utype serialisation example}
\label{app:xml}
\emph{Include a SimDB or SNAP simulation serialisation?}

\newpage
\section{ Appendix B: VOTable serialisation example}
\label{app:VOT}
\begin{figure}[!ht]
  \includegraphics[width=0.87 \textwidth]{./images/xmlSSAQueryResponse.jpg}
  \caption{\it Identifying pieces of a data model: SSA service.
  Here is a short extract of the Query response of an SSA protocol implementation.
  A VOTable document is returned, each piece of metadata being mapped to a Utype name in the SSA Utype data model.}
  \label{fig:ssaqueryresponse}
\end{figure}

\newpage
\section{ Appendix C: Updates of the document}
\label{app:docupdate}
\begin{itemize}
\item version 0.3 to 0.4
  \begin{itemize}
    \item introduce canonical and alternative notations
    \item update fig.1 and fig.2
  \end{itemize}
\end{itemize}


\end{document}

