IVOA

Simulation Data Access Layer (SimDAL)
Draft

IVOA Note March 2009

This version:
http://www.ivoa.net/Documents/...
Latest version:
http://www.ivoa.net/Documents/latest/...
Previous versions:
http://www.ivoa.net/Documents/...
http://www.ivoa.net/Documents/...
Interest Group:
http://www.ivoa.net/twiki/bin/view/IVOA/IvoaTheory
Author(s):
Claudio Gheller
Gerard Lemson
............

Abstract

This specification defines a protocol for retrieving data produced by numerical simulations from a variety of data repositories through a uniform interface. The interface is meant to be reasonably simple for service providers to implement. Data are selected through a search procedure. Once the data of interest are identified, specific quantities can be selected and sub-samples can be extracted and downloaded. Data are returned in a simulation-specific VOTable format, with support for managing external binary files.

Status of this Document

This is a Note. The first release of this document was 18 May 2008.

This is an IVOA Note expressing suggestions from and opinions of the authors.
It is intended to share best practices, possible approaches, or other perspectives on interoperability with the Virtual Observatory. It should not be referenced or otherwise interpreted as a standard specification.

A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.


Acknowledgments

We thank Ugo Becciani, Laurent Bourgès, Patrizia Manzato, Hervé Wozniak......... for discussions and feedback on the topic.

1. Introduction

This specification defines a prototype standard for accessing theoretical data from a variety of astrophysical simulation repositories: the Simulation Data Access Layer (hereafter SimDAL). In this context, theoretical data is defined as the outcome of different kinds of numerical applications, such as dynamical simulations, semi-analytical models, Monte Carlo simulations, etc.
SimDAL deals with datasets that can always be represented as binary tables in which rows identify a simulated element (a mesh cell, a particle, a pixel...) and columns represent the associated physical parameters (the 3D spatial coordinates, the velocity, the temperature...). Typically, the tables are stored in files (either raw files or files based on standard formats such as HDF5 or FITS) which are managed by filesystem-like infrastructures (e.g. UNIX filesystems, iRODS, SRB...). Datasets can represent different timesteps (that is, evolutionary configurations) of the same simulated system or different realizations of a physical model (e.g. obtained by changing some basic parameters). In the rest of the document, we refer to these datasets as snapshots (the term recalls the time dependency central to the original idea of SimDAL, but it extends to any raw numerical result).
In operation, SimDAL represents a negotiation between the client and the data service, which allows the user to query, preview, select and retrieve data. Services supporting SimDAL provide access to both existing datasets and virtual ones (i.e., datasets generated by the service on demand).
A SimDAL service is expected to allow the user to explore and select available datasets by means of a TAP-like interface relying on the SimDB data model, namely SimTAP. In principle, SimTAP is only a recommendation and not a mandatory part of the SimDAL protocol: particularly complex or computationally demanding queries (e.g. S3-like queries) may be better handled by ad hoc services. However, their results must adopt the standard defined by SimDAL.
Once the datasets of interest are identified, the SimDAL service allows the user to access them. Besides a basic download capability, a SimDAL service can support complex processing functionalities working on the available snapshots. For example, data can be so large that direct download is unfeasible. The SimDAL protocol describes a standard interface to services that allow the user to cut out only the part of the data of actual interest, reducing the data volume moved over the network and making retrieval possible. In the same way, the cut-out function can extract only the part of the data that matches the user's requirements (e.g. with some reference parameters in specific ranges). The cut-out and the download are the only mandatory functionalities expected from the SimDAL data-access service. Other data processing functions (custom functions) can, however, be included and published.

The result of any SimDAL operation is a VOTable in the specialized format (TDFF) specified for theoretical data.

In conclusion, SimDAL services in general support a number of operations, which can be summarized as follows:

1.1 SimDAL: use case

A typical usage scenario of a SimDAL service can be as follows:

1. The user finds out, by querying general VO Registries, which SimDB services are available and their status (whether they are available, running, etc.). SimDB services are those which give access to theoretical data and related services.

2. SimDB services can be either stand-alone or associated with a SimDAL service. The former can support different types of data access services: they can be simple lists of available data products, they can refer to SimDAL services, or they can address non-standard data access services. The latter act as the search engine for the SimDAL service, allowing the user to find the simulations, datasets and quantities of interest. Such a search engine is implemented as a TAP service.

3. Once the user has found the datasets of interest by means of the SimTAP interface, the user performs one of the available operations (e.g. preview, download, cutout). The operation can return the results immediately (synchronous) or after some time, owing to complex processing (asynchronous).

SimDAL use case

1.2 Data discovery: use cases

Here we may give some examples of typical queries that we support (from the simplest to the most complex, S3-like)...

2 SimTAP Discovery Service

2.0 Workflow

The typical workflow for obtaining access to data in the VO follows a particular pattern, which is by now well established for observational data products [TBD references]. First a VO registry is queried to find data access layer (DAL) services of a particular type, e.g. SCS, SIA, SSA or (Obs)TAP. These services describe themselves with appropriate metadata, allowing users to select those of interest, which are then accessed.

Accessing a particular data service generally comes in two stages, which may be iterated. A first "queryData" request allows users to search the archive exposed by the service for data products of particular interest. This search and selection is facilitated by a set of metadata items (a data model) that describes the data products. This model may also be used in the query request itself, as is most explicitly done in the ObsTAP specification.

After particular data products have been selected, a "getData" request will retrieve them. This data retrieval is most often a simple download of individual data items through an accessReference URL. But the service may allow more general actions, such as cut-outs or mosaics. In some (only SSA?) cases, a common data model exists for expressing the data products. More generally, a generic container object such as a FITS file or VOTable is used.

SimDAL services as proposed in this Note follow this same pattern for accessing products of theoretical research, called "simulations" for short. The differences are in the details, which we will discuss in the reverse of the order in which they appear in the workflow.

We assume that a getData request in SimDAL services is generally more complex than a simple download of existing data products. This is especially true for the often very large results of "cosmological" simulations. By this we mean simulations reproducing part of 3+1D space, such as N-body simulations, the original target of the SNAP effort [TBD link].

Also, no standard containers such as FITS exist for the large data sets that often are stored in proprietary, binary data formats. Though VOTable can be used for data products that can be flattened to tables, the size often means that the basic ASCII XML serialisation of the data is unwieldy.

It was therefore assumed from the beginning that getData requests would not be simple downloads of complete products, but would instead apply some type of filtering on the server side. Part of this may be standardised, as is attempted in Section(s) TBD of this Note. But in many cases the services may be customised to the actual data exposed by the SimDAL services.

This last sentence exposes the main complicating factor in dealing with simulation results: lack of knowledge about the content and format of the data products.

2.1 Overview of SimDB and SimDB Data Model

SimDB stands for Simulation Database. It is a protocol (in the making) for querying a database structured according to the SimDB data model. In the proposed protocol, the SimDB data model is mapped onto a relational database schema, and TAP is intended to be used for accessing databases that implement the schema. This approach is in fact a precursor of the ObsTAP protocol, which aims to do something similar: define a common/global data model that must be implemented and use TAP for querying it. A potential part of the SimDB protocol is how to update the database. This would be performed by submitting XML documents structured according to the XML schemas also derived from the data model.

TODO describe model, use some images

2.2 SimTAP: TAP service over SimDB

SimDB (and originally SNAP) has always aimed to use the data model explicitly in the query part of any protocol definition. The idea was to create a database model from the model defined in UML and to use the TAP specification as a shortcut to define a protocol for querying for simulations. A similar idea is also being pursued by ObsTAP. The main work here is to map the UML model to a relational representation. This is a standard problem that has many possible solutions. One approach is implemented in the VO-URP framework [REF]. VO-URP uses the XMI representation [REF] of a UML model to derive other representations using XSLT transformation scripts [REF]. In SimDM this was used to derive the HTML documentation, UTYPEs and XML schemas that form part of that specification. But VO-URP also includes XSLT scripts to generate relational DDL scripts for creating tables and views, and scripts for generating appropriate files and scripts for TAP metadata.

The Simulation Data Model is perfectly adequate for describing Simulation resources, for example in XML. It is less well suited for use in a relational context if we expect users to submit ADQL queries against the TAP_SCHEMA derived from the model. The model is simply too complex for that. The issue is not so much that the model is very normalised, which will mainly cause difficulties for maintenance of and insertions into the database. These problems are faced daily by programmers in business who have to deal with models containing hundreds of tables.

It is especially the high level of abstraction of the model itself that causes problems for users wishing to query for interesting simulations and other SimDB resources. For example, ... [something about definition of parameters etc]

Another consequence is that the model can make no assumptions about the datatypes of the parameters. This has consequences when values must be assigned to the variables. In the model there should be an attribute storing the parameter value, but we cannot assign the appropriate data type to this value attribute.

2.3 VOSI services

2.4 ... [anything else?]

3. The SimDAL interface

3.1 overview

SimDAL standardizes the interface to data services and the protocol for data exchange (input/output). A SimDAL service must be represented as a tree structure of web resources each addressable via a URL in the http scheme, or the https scheme, or both. The web resource at the root of the tree must represent the service as a whole. This specification defines no standard representation for this root resource. Implementations may provide a representation, or may return a '404 not found' response to requests for the root web-resource. One possible representation is an HTML page describing the scientific usage and content of the service. The service operations described here use HTTP GET and POST as the low level communications protocol. The functionality of each operation is defined independently of the low level communications protocol, and semantically equivalent operations could be implemented via other protocols.

The result of a SimDAL function is a VOTable or a file in some other format. Support for VOTable output is mandatory; all other formats are optional.

Examples:

http://example.org/simdap (SimDAL service root)
http://example.org/simdap/sync?REQUEST=LISTEXPERIMENTS (simdap service request)

3.2 Operation execution

The SimDAL service specification defines both synchronous and asynchronous query execution. For an HTTP-based service, the user selects synchronous or asynchronous execution by choosing the appropriate resource below the base URL for the service (see xxx). A query is synchronous if its results are delivered in the HTTP response to the request that originally posed the query. If the service returns an immediate HTTP response upon accepting a query and the client later obtains the results in response to a separate HTTP request, the request is asynchronous.

3.2.1 Synchronous Operations

Support for synchronous operations is mandatory. A SimDAL service must provide a web resource with relative URL /sync that is a direct child of the root web resource. This web resource represents the results of synchronous requests. The exact form of the request and the representations of the results are described in the next sections. Synchronous operations execute immediately and the client must wait for the query to finish. If the HTTP request times out or the client otherwise loses the connection to the service before receiving the response, then the query fails.
Synchronous execution is adequate when the operation executes quickly (e.g. a data query) and yields a small number of results, or when it can at least start returning results quickly. Synchronous operations are generally simple to implement using standard web technologies and easy to use from a browser or scripting environment. However, they are generally not sufficient, and are likely to fail, for queries that take a long time to execute, especially before returning any results.

Example:

http://example.org/simdap/sync?REQUEST=GetAvailability 

3.2.2 Asynchronous Operations

Support for asynchronous operations is optional. Asynchronous operations require that client and server share knowledge of the state of the query during its execution and between HTTP exchanges. A SimDAL service can provide a web resource with relative URL /async that is a direct child of the root web resource. This web resource represents controls for asynchronous queries. Specifically, the web resource must represent the job list as specified in the UWS standard (ref., see also the TAP document). The response to the request is a job_id, which maps to an associated web resource.

Example:

http://example.org/simdap/async?REQUEST=Cutout&EXPERIMENT=mysim&SNAPSHOT=output0001.h5 
This invokes a cutout operation which is performed asynchronously and identified by a job id, e.g. 10021. A corresponding web resource is then available:
http://example.org/simdap/async/10021
At this point, a number of other operations, for monitoring or controlling the job, can be invoked by using the web resource. For example,
http://example.org/simdap/async/10021/error
returns possible errors during the job execution.
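
The URL layout used in this example can be sketched in Python as follows. The helper name job_resource is hypothetical; only the /async/<job id> resource and its "error" child are taken from the example above, and any other child resource name would have to be checked against the UWS standard.

```python
def job_resource(base_url, job_id, child=None):
    """Return the URL of an asynchronous job, or of one of its child resources."""
    url = f"{base_url.rstrip('/')}/async/{job_id}"
    return f"{url}/{child}" if child else url


# Job resource and its error resource for the job id used in the example.
print(job_resource("http://example.org/simdap", 10021))
print(job_resource("http://example.org/simdap", 10021, "error"))
```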

3.3 Parameters

The /sync and /async web-resources must accept the parameters listed in the following sub-sections. In a synchronous request, the parameters select the representation returned in the response message. In an asynchronous request, the parameters select the representation of the eventual result rather than the response to the initial request.
Not all combinations of the parameters are meaningful. For example, ...... . If a service receives a spurious parameter in an otherwise correct request, then the service must ignore the spurious parameter, must respond to the request normally and must not report errors concerning the spurious parameter.

3.3.1 REQUEST

This is the basic parameter. It distinguishes between the service operations, makes it possible to extend the service specification (with additional or custom operations), and specifies how other parameters should be interpreted. A SimDAL client must set this parameter correctly in every request (GET or POST) to the /async or /sync web resources. If a SimDAL service receives a request without this parameter or with an incorrect value for this parameter, then the service must reject the request and return an error document as the result.
These are the standard values of the parameter:

A detailed description of the associated operations is given in the following sections.
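
A minimal Python sketch of how a service might apply the rules of Sections 3.3 and 3.3.1 is given here: a request without a valid REQUEST value is rejected, and spurious parameters are silently dropped. The operation names are those used in the operation descriptions of this Note. The per-operation parameter sets, the case-insensitive matching of REQUEST values, and the names RequestError and parse_request are assumptions made for illustration only.

```python
from urllib.parse import parse_qsl

# Operation names as they appear in Sections 3.4, 3.6 and 3.7 of this Note.
# The parameter sets are illustrative assumptions, not a normative list.
OPERATIONS = {
    "GetAvailability": set(),
    "GetCapabilities": set(),
    "ListExperiments": set(),
    "ListSnapshots": {"EXPERIMENT"},
    "Cutout": {"EXPERIMENT", "SNAPSHOT", "PROPERTY", "PARAM", "MINVAL", "MAXVAL"},
    "Preview": {"EXPERIMENT", "SNAPSHOT", "PROPERTY"},
}
COMMON = {"VERSION", "FORMAT"}
_CANONICAL = {name.upper(): name for name in OPERATIONS}


class RequestError(Exception):
    """Raised when a request must be rejected with an error document."""


def parse_request(query_string):
    """Return (operation, parameters), dropping spurious parameters silently."""
    params = dict(parse_qsl(query_string))
    raw = params.pop("REQUEST", None)
    if raw is None:
        raise RequestError("missing REQUEST parameter")
    # Assumption: REQUEST values are matched case-insensitively.
    operation = _CANONICAL.get(raw.upper())
    if operation is None:
        raise RequestError(f"unknown REQUEST value: {raw}")
    allowed = COMMON | OPERATIONS[operation]
    return operation, {k: v for k, v in params.items() if k in allowed}


print(parse_request("REQUEST=Cutout&EXPERIMENT=mysim&SNAPSHOT=output0001.h5&FOO=1"))
```

In this sketch the unknown FOO parameter does not cause an error; it is simply absent from the returned dictionary.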

3.3.2 VERSION

The VERSION parameter specifies the SimDAL protocol version number. The format of the version number, and version negotiation, are described in section xxx.
A SimDAL service must support the VERSION parameter.

3.3.3 FORMAT

The FORMAT parameter indicates the client's desired format for the table of results of a query.
If the FORMAT parameter is omitted, the default format is VOTable.
A SimDAL service must support VOTable as an output format and may support other formats. A SimDAL service must accept a FORMAT parameter indicating a format that the service supports and should reject queries where the FORMAT parameter specifies a format not supported by the service implementation.
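
The FORMAT rules can be sketched as a small Python function. This Note does not define the literal values of FORMAT, so the string "votable", the case-insensitive comparison, and the names resolve_format and UnsupportedFormat are assumptions.

```python
DEFAULT_FORMAT = "votable"  # assumed spelling of the VOTable format value


class UnsupportedFormat(Exception):
    """Raised when the requested output format is not offered by the service."""


def resolve_format(params, supported=("votable",)):
    """Return the output format for a request, defaulting to VOTable."""
    fmt = params.get("FORMAT", DEFAULT_FORMAT).lower()
    if fmt not in supported:
        raise UnsupportedFormat(fmt)
    return fmt


print(resolve_format({}))                     # FORMAT omitted
print(resolve_format({"FORMAT": "VOTable"}))  # FORMAT given explicitly
```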

3.3.4 Parameters Values

Integer numbers are represented as defined in the specification of integers in XML Schema Datatypes. Real numbers are represented as specified for double precision numbers in XML Schema Datatypes. Sexagesimal formatting is not permitted, either for parameter input or in formal output metadata, other than in ISO 8601 formatted time strings (sexagesimal format is permitted in any informal output intended for a human, e.g., text or HTML formatted tables). SimDAL defines a special range-list format for specifying numerical ranges or lists of ranges as parameter values. For example, 1E-7/3E-6 specifies a closed range from 1E-7 to 3E-6 inclusive. The syntax supports both open and closed ranges. Ranges or range lists are permitted only when explicitly indicated in the definition of an individual parameter. A variant of the range list is the value of the WHERE parameter, used to specify the query constraint for a ParamQuery operation. For a full description of range list syntax refer to section xxx. Repeated values in an array are specified using a single comma-separated list, in order to preserve the order of the elements when specifying spatial dimensions.

      $/sync?REQUEST=CUTOUT&EXPERIMENT=clrc00&SNAPSHOT=clrc00_0010&LEFTEDGE=0.5,0.6,0.2&RIGHTEDGE=0.7,0.8,0.4
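
As an illustration, the two value forms shown above (a closed range such as 1E-7/3E-6 and a comma-separated list such as 0.5,0.6,0.2) could be parsed as in the following Python sketch. Only these two forms are handled; open ranges and lists of ranges are left out because their syntax is deferred to section xxx. The function names are hypothetical, and Python's float() is used as a stand-in for the XML Schema double lexical form, which it matches only approximately.

```python
def parse_range(text):
    """Parse a closed range 'low/high' into a (low, high) tuple of floats."""
    low, sep, high = text.partition("/")
    if not sep or not low or not high:
        raise ValueError(f"not a closed range: {text!r}")
    bounds = (float(low), float(high))
    if bounds[0] > bounds[1]:
        raise ValueError(f"lower bound exceeds upper bound: {text!r}")
    return bounds


def parse_vector(text):
    """Parse a comma-separated list into floats, preserving element order."""
    return [float(item) for item in text.split(",")]


print(parse_range("1E-7/3E-6"))
print(parse_vector("0.5,0.6,0.2"))
```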
    

3.4 VOSI services

Similar to the Virtual Observatory Service Interface (VOSI ref.), the SimDAL interface specifies a base service interface common to all SimDAL services. SimDAL interface requests supply metadata concerning the availability of the service ('SimDAL-availability') and of its main interfaces ('SimDAL-capabilities'). SimDAL-capabilities outputs use the same schema introduced in section xxx. Two further interfaces are specifically designed to get the lists of the available data collections (experiments) and associated datasets (snapshots).

3.4.1 Availability

This interface indicates whether the service is operable and the reliability of the service for extended and scheduled requests.

Interface

Operation name:

GetAvailability

Input parameters: NONE

Examples

http://example.org/simdap/sync?REQUEST=GetAvailability

3.4.2 Capability

This interface provides the service metadata in the form of a list of Capability descriptions. Each of these descriptions is an XML element that

An entry for a service in the resource registry, i.e. its VOResource, contains the Dublin Core Resource metadata (identifier, curation information, content description etc.) followed by the Service's capability descriptions. For a detailed description of the resource and service metadata we refer to the TAP document (ref.). Of these, the Resource metadata (Identity, Curation and General Content) can be adopted by SimDAL without exception. Collection and Service Content metadata, by contrast, do not apply to SimDAL and are excluded. Among the Service metadata, Interface metadata (which describe how to access the service) can be adopted as they are, whereas Capability metadata (describing the usage of the service) must be defined specifically. The service metadata shall be represented as an XML document which contains a sequence of one or more elements of type {http://www.ivoa.net/xml/VOResource/v1.0}Capability or sub-types thereof.

Interface

Operation name:

GetCapabilities

Input parameters: NONE

Examples

http://example.org/simdap/sync?REQUEST=GetCapabilities

3.5 SimTAP services: data model and access protocol

The discovery part in SimDAL services goes by the name of SimTAP. It is a TAP service with a restricted data model based on the Simulation Data Model [REF]. It is not identical with that model for reasons explained in section 2.2 above. Here we describe how the model depends on the Simulation Data Model, present the corresponding TAP schema and review the actual TAP access protocol applied to the model.

3.5.1 SimTAP: data model derivation

The assumption is that a SimDAL service gives access to simulation results derived using a specific SimDB:simdb/protocol/Protocol [TBD could we allow more than one?]. The advantage of this assumption is that we can describe experiments and their results obtained with the protocol in a much more direct manner than SimDM allows. In particular...

3.5.2 SimTAP access protocol

The protocol part of SimTAP is very simple. It basically says that TAP should be supported on the tables defined by the SimTAP TAP_SCHEMA. In this we follow the approach in ObsTAP [TBD do we?]

TBD list special features of SimTAP vs TAP. Check how ObsTAP deals with this.

Examples

3.6 Data Listing

These functions list all the simulations available at the selected SimDAL service and all the snapshots associated with a simulation.

3.6.1 List Experiments

This function returns the list of the experiments (simulations) served by this SimDAL instance.

Interface

Operation name:

ListExperiments
Input parameters: NONE

Examples

http://example.org/simdap/sync?REQUEST=ListExperiments

3.6.2 List Snapshots

This function lists the available snapshots, either all or for one experiment.

Interface

Operation name:

ListSnapshots

Input parameter  UTYPE                          Required?
EXPERIMENT       SimDB.Experiment.PublisherDID  OPT

Examples

http://example.org/simdap/sync?REQUEST=ListSnapshots
Returns a list of ALL datafiles served by this service.

http://example.org/simdap/sync?REQUEST=ListSnapshots&EXPERIMENT=my_favourite_simulation
Returns a list of the datafiles of simulation "my_favourite_simulation".

3.7 Data Processing Services

Two basic operations are expected to be implemented by a SimDAL service: the cutout (mandatory) and the preview (optional) of data. The cutout concept originated with geometric selections in cosmological simulations (i.e. extracting all data inside a given spatial region). It has been generalized to any kind of multidimensional, multiparametric selection. Note that the Download service can be represented as a Cutout over the whole domain.
The preview function is broadly defined: it is up to the service to provide the preview functions best suited to the available data.
Custom services can perform any kind of processing, provided their interfaces and responses conform to the standards defined in 4.3.xxx.

3.7.1 Cutout

The cutout operation refers to a single snapshot. Cutouts over multiple sources, such as several time steps of the same simulation, are not supported by the protocol. Their implementation is left to the client, for example as a sequence of requests with the same sub-box and fields but different datasets.

Interface

Operation name:

Cutout

Input parameter  UTYPE                                Required?  Description
EXPERIMENT       SimDB.Experiment.PublisherDID        REQ        ID of the simulation
SNAPSHOT         SimDB.Snapshot.PublisherDID          REQ        ID of the snapshot subject to the cutout
PROPERTY         SimDB.RepresentationObject.Property  OPT        ID of the quantities to be extracted (if more than one, comma separated list)
PARAM                                                 OPT        IDs of the parameters that define the cutout region (e.g. x, y, z for a geometric cutout)
MINVAL                                                OPT        minimum value of PARAM (MINVAL/MAXVAL defines the range)
MAXVAL                                                OPT        maximum value of PARAM (MINVAL/MAXVAL defines the range)

Note that the Cutout interface specification is provided by GetCapabilities; EXPERIMENT and SNAPSHOT values are provided by QueryData (or ListExperiments + ListSnapshots); PROPERTY and PARAM values are provided (verify.............) by GetMetadata. Among the optional parameters: if only PROPERTY is specified, all data for that property are selected; if only PARAM is set, all the properties of the elements within the MINVAL/MAXVAL range are selected. If none of the optional parameters is specified, the whole snapshot is selected and the cutout reduces to a download of the snapshot.
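
The selection rules for the optional parameters can be illustrated with a small in-memory Python sketch, in which a snapshot is modelled as a list of dictionaries (one per simulated element). This representation, the function name cutout, and the treatment of MINVAL/MAXVAL as inclusive bounds are assumptions; a real service would operate on the stored binary tables.

```python
def cutout(rows, properties=None, params=None, minval=None, maxval=None):
    """Select rows inside the PARAM ranges, then keep only the PROPERTY columns."""
    params = list(params or [])
    minval = list(minval or [])
    maxval = list(maxval or [])
    if not (len(params) == len(minval) == len(maxval)):
        raise ValueError("PARAM, MINVAL and MAXVAL must have the same length")
    bounds = list(zip(params, minval, maxval))
    # With no PARAM the 'all' below is vacuously true and every row is kept.
    selected = [r for r in rows if all(lo <= r[p] <= hi for p, lo, hi in bounds)]
    if properties:
        selected = [{name: r[name] for name in properties} for r in selected]
    return selected


snapshot = [
    {"xpos": 0.1, "temperature": 10.0, "density": 1.0},
    {"xpos": 0.5, "temperature": 20.0, "density": 2.0},
]
print(cutout(snapshot, properties=["temperature"],
             params=["xpos"], minval=[0.3], maxval=[0.8]))
```

Calling cutout(snapshot) with no optional arguments returns every row unchanged, which corresponds to the download case described above.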

Examples

http://example.org/simdap/sync?REQUEST=Cutout&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5
Returns the whole snap0001.h5 dataset in the standardized format.
http://example.org/simdap/sync?REQUEST=Cutout&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5& \
PROPERTY=temperature,density
Returns the temperature and density for the whole snap0001.h5 dataset in the standardized format.
http://example.org/simdap/sync?REQUEST=Cutout&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5& \
PROPERTY=temperature,density&PARAM=xpos,ypos,zpos&MINVAL=0.3,0.5,0.3&MAXVAL=0.8,1.0,0.8
Returns a subvolume with temperature and density from the snap0001.h5 dataset. The subvolume has coordinates between 0.3 and 0.8 in x and z and between 0.5 and 1.0 in y.
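
On the client side, request URLs like the ones above can be assembled programmatically. The following Python sketch uses the standard urllib.parse module; the function name cutout_url, the representation of the region as a dictionary, and the second snapshot name are hypothetical. It also shows the client-side loop over several snapshots that Section 3.7.1 leaves to the client.

```python
from urllib.parse import urlencode

BASE_URL = "http://example.org/simdap"  # example service root used in this Note


def cutout_url(experiment, snapshot, properties=(), region=None, base_url=BASE_URL):
    """Build a synchronous Cutout URL; region maps a parameter to (min, max)."""
    query = {"REQUEST": "Cutout", "EXPERIMENT": experiment, "SNAPSHOT": snapshot}
    if properties:
        query["PROPERTY"] = ",".join(properties)
    if region:
        query["PARAM"] = ",".join(region)
        query["MINVAL"] = ",".join(str(lo) for lo, _ in region.values())
        query["MAXVAL"] = ",".join(str(hi) for _, hi in region.values())
    # Keep commas unescaped so that lists read as in the examples above.
    return f"{base_url}/sync?{urlencode(query, safe=',')}"


box = {"xpos": (0.3, 0.8), "ypos": (0.5, 1.0), "zpos": (0.3, 0.8)}
# Same sub-box and fields, different snapshots: one request per snapshot.
for snap in ("snap0001.h5", "snap0002.h5"):
    print(cutout_url("my_favourite_simulation", snap, ["temperature", "density"], box))
```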

3.7.2 Preview

The preview can be implemented in different ways, depending on the specific data being served. The input of this method is the basic pair EXPERIMENT and SNAPSHOT. The PROPERTY parameter may be used to specify which fields to preview (if supported, otherwise it is discarded). An omitted or blank PROPERTY parameter is interpreted as: preview all available fields. If PROPERTY requests unavailable quantities, the corresponding request is discarded. If the cutout service is available, the preview service MUST provide the means to select the fields of interest and the cutout region.

Interface

Operation name:

Preview

Input parameter  UTYPE                                Required?  Description
EXPERIMENT       SimDB.Experiment.PublisherDID        REQ        ID of the simulation
SNAPSHOT         SimDB.Snapshot.PublisherDID          REQ        ID of the snapshot to be previewed
PROPERTY         SimDB.RepresentationObject.Property  OPT        ID of the quantities to be previewed (if more than one, comma separated list)

Examples

http://example.org/simdap/sync?REQUEST=Preview&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5& \ 
PROPERTY=entropy
Previews the entropy from snap0001.h5 of "my_favourite_simulation". How data are previewed depends on the service.

3.8 Representation of results

The basic format of a response from a SimDAL service is a table. This table must be encoded in the output format specified by the FORMAT parameter of the query. See section xxx for required, optional and default formats. VOTable is the default format and VOTable support is mandatory.

3.8.1 VOSI

Representations of VOSI outputs (capabilities and availability) must be as defined in the VOSI standard (ref.), extended to match specific requirements of the SimDAL service......

3.8.2 Data Search

3.8.3 Data Listing

3.8.4 Data Processing

Unlike the other operations, both Cutout and Preview return a pair consisting of a VOTable and data files. Support for VOTable is mandatory, although alternative formats can also be offered.
External data files are in general necessary in order to deal with large and binary data. However, in some cases, the resulting data can still be represented by ASCII tables. In such cases a standard VOTable is the only result of the SimDAL operation. In all other cases, the VOTable is used to describe the results and the data files. Furthermore, it stores the links to the files, which can be downloaded from the remote storage area (via ftp, gridftp, http...).
The description of the VOTable customization needed to support external file description (Theoretical Data File Format, TDFF) is provided, together with examples, in Section 6.

3.8.5 Errors

3.9 SimDAL Versioning

The SimDAL protocol provides explicitly for versioning of the interface in order to support version negotiation between a client and a service where one or both parties support more than one version.
The versioning is based on a version number, which follows IVOA conventions (ref.). The version number applies to all aspects of the protocol as defined in this document, including any associated XML schema and the request encodings.
If a SimDAL client does not specify the version number in a request, the server assumes the highest standard version supported by the service, and no explicit version checking takes place. If the client specifies an explicit version number, and this does not match a version available from the service, the service returns a version number mismatch error as described in 3.8.5. The client can determine what versions of the protocol the service supports by a prior call to VOSI-capabilities or via a registry query.
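
The server-side part of this negotiation can be sketched in Python as follows. The dotted numeric form of the version number ("major.minor") is assumed here as the IVOA convention referred to above; the function name and the exception used to stand in for the version mismatch error of 3.8.5 are hypothetical.

```python
class VersionMismatch(Exception):
    """Stands in for the version number mismatch error of section 3.8.5."""


def _numeric(version):
    # "1.10" -> (1, 10), so that 1.10 sorts after 1.2.
    return tuple(int(part) for part in version.split("."))


def negotiate_version(requested, supported):
    """Return the protocol version to use for a request."""
    if requested is None:
        # No VERSION given: assume the highest supported version.
        return max(supported, key=_numeric)
    if requested not in supported:
        raise VersionMismatch(f"version {requested} is not supported")
    return requested


print(negotiate_version(None, ["1.0", "1.2", "1.10"]))
print(negotiate_version("1.0", ["1.0", "1.2", "1.10"]))
```

Comparing the numeric components rather than the raw strings avoids ranking "1.2" above "1.10".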

4. Service registration

Publication of a service to the VO requires its registration with an IVOA registry, including a description of the identity and capabilities of the service.
............................

5. Extended capabilities

The SimDAL service allows for optional extended capabilities and operations. Extensions may be defined within an information community when needed for additional functionality or specialization. A generic client must not be required or expected to make use of such extensions. Extended capabilities or operations must be defined by the service metadata. Extended capabilities provide additional metadata about the service, and may or may not enable optional new parameters to be included in operation requests. Extended operations add operations beyond those defined in this document.
A server must produce a valid response to the operations defined in this document, even if parameters used by extended capabilities are missing or malformed (i.e. the server must supply a default value for any extended capabilities it defines), or if parameters are supplied that are not known to the server.
Service providers must choose extension names with care to avoid conflicting with standard metadata fields, parameters and operations.
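
The requirement that a server still produce a valid response when an extension parameter is missing or malformed can be met by falling back to a server-defined default. The Python sketch below shows one way to do this; the function name and the extension parameter X_MAXROWS are purely hypothetical.

```python
def resolve_extension(params, name, parse, default):
    """Return the parsed value of an extension parameter, or its default."""
    raw = params.get(name)
    if raw is None:
        return default  # parameter missing
    try:
        return parse(raw)
    except (TypeError, ValueError):
        return default  # parameter malformed


# X_MAXROWS is an invented extension parameter used only for illustration.
print(resolve_extension({"X_MAXROWS": "500"}, "X_MAXROWS", int, 1000))
print(resolve_extension({"X_MAXROWS": "many"}, "X_MAXROWS", int, 1000))
print(resolve_extension({}, "X_MAXROWS", int, 1000))
```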

6. Theoretical Data File Format (TDFF)

The standard result of a cutout operation is a specialized VOTable, adopting the following schema:

Theoretical Data File Format class diagram

6.1 Description of TDFF Elements

Class
UTYPE                       UCD1+  Description
TDFF.FileType               ?      Type of file produced by a software protocol

Attributes
UTYPE                       UCD1+  Datatype  Description
TDFF.FileType.Name                 string    Short name
TDFF.FileType.PublisherDID         string    Publisher assigned identifier of the FileType.
TDFF.FileType.Description          text
TDFF.FileType.Mimetype             string    Content-type

Class
UTYPE                       UCD1+  Description
TDFF.File                   ?      File or table containing one or more arrays

Attributes
UTYPE                       UCD1+  Datatype  Description
TDFF.File.Name                     string    File name
TDFF.File.Type                     string    Reference to the PublisherDID of the FileType.
TDFF.File.PublisherDID             string
TDFF.File.Size                     int       Approximate size in KiB
TDFF.File.AccessURL                string    Resolvable URL for retrieving the file

Class
UTYPE                       UCD1+  Description
TDFF.Array                  ?      Sequence of data values in a binary array or table column

Attributes
UTYPE                       UCD1+  Datatype  Description
TDFF.Array.Name                    string    Array or column name
TDFF.Array.Datatype                string    Datatype of the Array elements
TDFF.Array.Property                string    Reference to the PublisherDID of the Property represented by the Array.
TDFF.Array.Rank                    int       Number of axes in the Array.
TDFF.Array.Dims                    int[]     Array of length Rank indicating the number of elements along each axis of the Array.
TDFF.Array.Offset                  int       Number of bytes in the File before the beginning of the Array.
TDFF.Array.Stride                  int       Number of bytes to skip between each element of the Array.
TDFF.Array.SkipByte                int       (Open question: is this needed if the Offset is provided for each Array?)
TDFF.Array.Endian                  string    The endianness of the Array; possible values are "little" or "big".
TDFF.Array.RowMajor                bool      True if the Array is in row-major order, false if it is in column-major order.
TDFF.Array.InternalPath            string    Internal path of the Array within the File, if the File uses a self-describing format such as FITS or HDF5.
TDFF.Array.LeftEdge                float[]   Array of length Rank giving the minimum spatial extent in each dimension.
TDFF.Array.RightEdge               float[]   Array of length Rank giving the maximum spatial extent in each dimension.
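To show how the Array attributes above could be used by a client, here is a minimal Python sketch that reads one Array from a raw binary file using Offset, Dims, Stride and Endian. Several points are assumptions rather than part of this Note: Stride is interpreted as the number of bytes skipped after each element (not the distance between element starts), the mapping from TDFF.Array.Datatype to a `struct` format character is left to the caller, and the function name and sample file layout are hypothetical.

```python
import io
import math
import struct


def read_array(stream, offset, dims, stride=0, endian="little", fmt="f"):
    """Read a flat list of values for one Array from a binary stream.

    offset: bytes before the first element (TDFF.Array.Offset).
    dims:   number of elements along each axis (TDFF.Array.Dims).
    stride: bytes skipped after each element (assumed reading of
            TDFF.Array.Stride).
    endian: "little" or "big" (TDFF.Array.Endian).
    fmt:    struct format character for one element, e.g. "f" or "i"
            (hypothetical mapping from TDFF.Array.Datatype).
    """
    item = struct.Struct({"little": "<", "big": ">"}[endian] + fmt)
    values = []
    stream.seek(offset)
    for _ in range(math.prod(dims)):
        data = stream.read(item.size)
        if len(data) != item.size:
            raise ValueError("file ends before the Array does")
        values.append(item.unpack(data)[0])
        stream.seek(stride, io.SEEK_CUR)  # skip bytes belonging to other arrays
    return values


# Hypothetical file: 4-byte header, then big-endian (float32, int32) records.
records = [(1.5, 10), (2.5, 20), (3.5, 30)]
payload = b"HEAD" + b"".join(struct.pack(">fi", x, i) for x, i in records)

xs = read_array(io.BytesIO(payload), offset=4, dims=[3], stride=4, endian="big", fmt="f")
ids = read_array(io.BytesIO(payload), offset=8, dims=[3], stride=4, endian="big", fmt="i")
print(xs, ids)
```

In this layout the two arrays are interleaved in the same File, which is why each one needs its own Offset and a non-zero Stride. Reshaping the flat list according to Dims and RowMajor is left out of the sketch.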

6.2 Examples

TO BE DONE





END OF REVISED PART



==================================================================================

3.5 Query Response

The basic format of a response from a SimDAL service is a VOTable XML document, containing a nested hierarchy of RESOURCE elements.

   <RESOURCE utype="SimDB.Experiment">
     <!-- Experiment metadata -->
     <RESOURCE utype="SimDB.Snapshot">
       <!-- Snapshot metadata -->
       <RESOURCE utype="TDFF.File">
         <!-- File metadata, access reference -->
         <TABLE utype="TDFF.Array">
           <!-- Table of arrays metadata -->
         </TABLE>
       </RESOURCE>
     </RESOURCE>
   </RESOURCE>
    

The response to a ListExperiments request is a VOTable containing a series of RESOURCE elements, where each RESOURCE contains the metadata for a single Experiment. Individual attributes of the Experiment (taken from the SimDM) are listed as PARAM or LINK elements in the RESOURCE. Attributes that are collections, ParameterSetting for example, are listed as TABLEs in the RESOURCE.
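A client could walk such a response with any XML parser. The following Python sketch uses the standard library to list the nested RESOURCE and TABLE elements by utype and to collect the PARAM values of an Experiment. The sample document is hypothetical and simplified: it omits the VOTable XML namespace and most required metadata, and the function names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified response (no XML namespace, minimal metadata).
SAMPLE = """<VOTABLE>
  <RESOURCE utype="SimDB.Experiment">
    <PARAM name="name" utype="SimDB.Experiment.Name" datatype="char"
           arraysize="*" value="demo-run"/>
    <RESOURCE utype="SimDB.Snapshot">
      <RESOURCE utype="TDFF.File">
        <TABLE utype="TDFF.Array"/>
      </RESOURCE>
    </RESOURCE>
  </RESOURCE>
</VOTABLE>"""


def walk(element, depth=0):
    """Yield (depth, tag, utype) for nested RESOURCE and TABLE elements."""
    for child in element:
        if child.tag in ("RESOURCE", "TABLE"):
            yield depth, child.tag, child.get("utype")
            yield from walk(child, depth + 1)


def resource_params(resource):
    """Map utype -> value for the PARAMs directly inside one RESOURCE."""
    return {p.get("utype"): p.get("value") for p in resource.findall("PARAM")}


root = ET.fromstring(SAMPLE)
for depth, tag, utype in walk(root):
    print("  " * depth + f"{tag} {utype}")
print(resource_params(root.find("RESOURCE")))
```

With a real VOTable document the element tags would carry the VOTable namespace, so the tag comparisons would need to be adapted accordingly.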

The required and optional attributes are in Section A.3.1 of the Appendix. This list has been deliberately kept to a minimum, since not all data providers will have a complete database with all of the classes from the SimDM. Instead, they can use a LINK element for the ReferenceURL attribute to point the client to a richer description of the simulation. Ideally, this would point to an XML instance document describing the Experiment based on the XML Schema from the SimDM. (SimDM or SimDB?)

Similarly, the required attributes are the same for all service operations. It is assumed that a client performing a ListExperiments query is exploring the Experiments, and would like more metadata. However, when performing a QueryData request, the client may already have

TODO: Should the service allow continuation tokens for long responses?

Appendix A: Detailed List of Query Parameters and Response Content

A.1 Custom Services

Custom services must define their own input parameters and responses.

A.2 Input Parameters

Parameter                                        Service Operation
Name        UTYPE                                ListExperiments  ListSnapshots  QueryData  Preview  Cutout
EXPERIMENT  SimDB.Experiment.PublisherDID        N/A              OPT            REQ        REQ      REQ
SNAPSHOT    SimDB.Snapshot.PublisherDID          N/A              N/A            OPT        OPT      REQ
PROPERTY    SimDB.RepresentationObject.Property  N/A              N/A            OPT        OPT      OPT
LEFTEDGE    TDFF.Array.LeftEdge                  N/A              N/A            N/A        N/A      OPT
RIGHTEDGE   TDFF.Array.RightEdge                 N/A              N/A            N/A        N/A      OPT
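The table above lends itself to a simple lookup structure. The Python sketch below encodes it and reports, for a given operation, which required parameters are missing and which supplied parameters are not applicable. The function name is hypothetical, and how a service should react to a not-applicable parameter (reject or ignore) is not stated in this Note, so the sketch only reports it.

```python
OPERATIONS = ("ListExperiments", "ListSnapshots", "QueryData", "Preview", "Cutout")

# Rows of the input parameter table: status per operation, in OPERATIONS order.
PARAMETER_TABLE = {
    "EXPERIMENT": ("N/A", "OPT", "REQ", "REQ", "REQ"),
    "SNAPSHOT":   ("N/A", "N/A", "OPT", "OPT", "REQ"),
    "PROPERTY":   ("N/A", "N/A", "OPT", "OPT", "OPT"),
    "LEFTEDGE":   ("N/A", "N/A", "N/A", "N/A", "OPT"),
    "RIGHTEDGE":  ("N/A", "N/A", "N/A", "N/A", "OPT"),
}


def check_request(operation, params):
    """Return (missing, not_applicable) parameter names for one request."""
    column = OPERATIONS.index(operation)
    missing = [
        name for name, row in PARAMETER_TABLE.items()
        if row[column] == "REQ" and name not in params
    ]
    not_applicable = [
        name for name in params
        if name in PARAMETER_TABLE and PARAMETER_TABLE[name][column] == "N/A"
    ]
    return missing, not_applicable


print(check_request("Cutout", {"EXPERIMENT": "exp1"}))
print(check_request("ListSnapshots", {"SNAPSHOT": "snap1"}))
```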

A.3 Query Response

Tables are used to represent collections from the data model. In many cases, these tables are optional. In this case, the required fields (columns) of the table only apply if the service chooses to return that table. This way, the client can be assured of a minimal set of metadata if the table is returned.

Resource                      Service Operation
Name        UTYPE             ListExperiments  ListSnapshots  QueryData  Preview  Cutout
EXPERIMENT  SimDB.Experiment  REQ              REQ            REQ        REQ      REQ
SNAPSHOT    SimDB.Snapshot    OPT              REQ            REQ        REQ      REQ
FILE        TDFF.File         OPT              OPT            REQ        REQ      REQ

A.3.1 Experiment Resource Metadata

The Experiment, Simulation, and PostProcessing classes from the SimDB have more attributes than are listed here. In principle, all of these attributes can be returned by a SimDAL service, in addition to appropriately related elements from the other classes, namely Protocol and its subclasses. The attributes and collections given here are the ones most important for describing the data.

UTYPE                                            VOT Element  Required?
SimDB.Experiment.Name                            PARAM        REQ
SimDB.Experiment.Created                         PARAM        OPT
SimDB.Experiment.Description                     PARAM        OPT
SimDB.Experiment.Status                          PARAM        OPT
SimDB.Experiment.Updated                         PARAM        OPT
SimDB.Experiment.ReferenceURL                    LINK         REQ
SimDB.Protocol.Name                              PARAM        REQ
SimDB.Protocol.PublisherDID                      PARAM        REQ
SimDB.Protocol.ReferenceURL                      LINK         REQ
SimDB.Protocol.Version                           PARAM        OPT
SimDB.Experiment.GenericParameterSetting         TABLE        OPT
SimDB.Experiment.NumericParameterSetting         TABLE        OPT
SimDB.Experiment.InputDataset                    TABLE        OPT
SimDB.Experiment.ExperimentRepresentationObject  TABLE        OPT

A.3.1.1 Generic Experiment Parameter Setting Table Columns

Column                                          Required?
SimDB.Protocol.InputParameter.Name              REQ
SimDB.Protocol.InputParameter.Description       OPT
SimDB.Protocol.InputParameter.Datatype          REQ
SimDB.Experiment.GenericParameterSetting.Value  REQ

A.3.1.2 Numeric Experiment Parameter Setting Table Columns

Column                                                Required?
SimDB.Protocol.InputParameter.Name                    REQ
SimDB.Protocol.InputParameter.Description             OPT
SimDB.Protocol.InputParameter.Datatype                REQ
SimDB.Experiment.NumericParameterSetting.Value.Value  REQ
SimDB.Experiment.NumericParameterSetting.Value.Unit   REQ

A.3.1.3 Input Dataset Table Columns

Column                         Required?
SimDB.Experiment.Name          REQ
SimDB.Experiment.PublisherDID  REQ
SimDB.Experiment.ReferenceURL  REQ
SimDB.Snapshot.PublisherDID    REQ

A.3.1.4 Experiment Representation Object Table Columns

Column                                               Required?
SimDB.Protocol.RepresentationObjectType.Name         REQ
SimDB.Protocol.RepresentationObjectType.Description  OPT
SimDB.Protocol.RepresentationObjectType.Label        OPT
SimDB.Protocol.RepresentationObjectType.Type         REQ

A.3.2 Snapshot Resource Metadata

UTYPE                                   VOT Element  Required?
SimDB.Experiment.Snapshot.PublisherDID  PARAM        REQ

A.3.3 File Resource Metadata

UTYPE                           VOT Element  Required?
TDFF.File.PublisherDID          PARAM        REQ
TDFF.File.AccessURL             LINK         REQ
Protocol.FileType.PublisherDID  PARAM        REQ
Protocol.FileType.Mimetype      PARAM        REQ
TDFF.Array                      TABLE        REQ

A.3.3.1 Array Table Columns

Column                                                     Required?
TDFF.Array.Name                                            REQ
SimDB.Protocol.RepresentationObject.Name                   REQ
SimDB.Protocol.RepresentationObject.Description            OPT
SimDB.Protocol.RepresentationObject.PublisherDID           REQ
SimDB.Protocol.RepresentationObject.Property.Name          REQ
SimDB.Protocol.RepresentationObject.Property.Description   OPT
SimDB.Protocol.RepresentationObject.Property.PublisherDID  REQ
TDFF.Array.Datatype                                        REQ

Appendix B: Example SQL queries vs SimDB and SimTAP

Here we give some example queries comparing the SQL (ADQL) required when equivalent information is stored in a SimDB or in a SimTAP database.

B.1 Sample protocol

We assume a simplified Gadget 2 protocol [REF]. It is defined by the following XML document, which follows the Simulation Data Model's XML serialisation.
		XML example here ...
		

References

[1] R. Hanisch, Resource Metadata for the Virtual Observatory
http://www.ivoa.net/Documents/latest/RM.html

[2] R. Hanisch, M. Dolensky, M. Leoni, Document Standards Management: Guidelines and Procedure
http://www.ivoa.net/Documents/latest/DocStdProc.html