IVOA

Simulation Data Access Protocol (SimDAP)
Draft

IVOA Note March 2009

This version:
http://www.ivoa.net/Documents/...
Latest version:
http://www.ivoa.net/Documents/latest/...
Previous versions:
http://www.ivoa.net/Documents/...
http://www.ivoa.net/Documents/...
Interest Group:
http://www.ivoa.net/twiki/bin/view/IVOA/IvoaTheory
Author(s):
Claudio Gheller
Gerard Lemson
Rick Wagner

Abstract

This specification defines a protocol for retrieving data produced by numerical simulations from a variety of data repositories through a uniform interface. The interface is meant to be reasonably simple for service providers to implement. Data are selected through a search procedure. Once the data of interest are identified, specific quantities can be selected and sub-samples can be extracted and downloaded. Data are returned in a simulation-specific VOTable format, with support for external binary files.

Status of this Document

This is a Note. The first release of this document was 18 May 2008.

This is an IVOA Note expressing suggestions from and opinions of the authors.
It is intended to share best practices, possible approaches, or other perspectives on interoperability with the Virtual Observatory. It should not be referenced or otherwise interpreted as a standard specification.

A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.


Acknowledgments

We thank Ugo Becciani, Laurent Bourgès, Patrizia Manzato and Hervé Wozniak for discussions and feedback on the topic.

1. Introduction

This specification defines a prototype standard for accessing theoretical data from a variety of astrophysical simulation repositories: the Simulation Data Access Protocol (hereafter SimDAP). In this context, theoretical data are defined as the outcome of different kinds of numerical applications, such as dynamical simulations, semi-analytical models, Monte Carlo simulations, etc.
SimDAP deals with datasets that can always be represented as binary tables in which rows identify a simulated element (a mesh cell, a particle, a pixel...) and columns represent the associated physical parameters (the 3D spatial coordinates, the velocity, the temperature...). Typically, the tables are stored in files (either raw files or files in standard formats such as HDF5 or FITS) which are managed by filesystem-like infrastructures (e.g. UNIX filesystems, iRODS, SRB...). Datasets can represent different timesteps (that is, evolutionary configurations) of the same simulated system or different realizations of a physical model (e.g. obtained by changing some basic parameters). In the rest of the document, we refer to these datasets as snapshots (the term recalls the time dependency behind the original idea of SimDAP, but it extends to any raw numerical result). Snapshots are the data sources. No further assumption is made about the data.
In operation, SimDAP represents a negotiation between the client and the data service, which allows the user to query, preview, select and retrieve data. Services supporting SimDAP provide access to both existing datasets and virtual ones (i.e., datasets generated by the service on demand). They are composite services, which must implement registry as well as data resources and processing functionalities. The resulting protocol must support all these features.
A SimDAP service is expected to allow the user to explore and select available datasets by means of a query interface. Support for an explicit query mechanism is not mandatory (see sections 3 and 4). Once the datasets of interest are identified, the SimDAP service must allow the user to access them. Besides a basic download capability, a SimDAP service can support complex processing functionalities working on the available snapshots. For example, data can be so large that direct download is unfeasible. The SimDAP protocol describes a standard interface to services that let the user cut out only the part of the data of actual interest, reducing the data volume moved over the network and making retrieval possible. In the same way, the cut-out function can extract only the part of the data that matches the user's requirements (e.g. with some reference parameters in specific ranges). The cut-out and the download are the only mandatory functionalities expected from the SimDAP data-access service. Other data processing functions (custom functions) can, however, be included and published.
In many respects, SimDAP is similar to other VO-compliant DAL protocols, like TAP (ref.), SIAP (ref.) or SSAP (ref.). We will refer to such approaches wherever possible, specifying only the details of the features specific to theoretical data access. Furthermore, SimDAP will rely on the Simulation Data Model (SimDM, ref.) for the modeling of the data and the registration of the services.

1.1 SimDAP: use case

A typical usage scenario of a SimDAP service is expected to be as follows:

1. The user finds out, querying general VO registries, which SimDAP services are available and which appear to be suitable for his/her needs.

2. One of these services is selected. This can be a local service or a distributed service; there is no difference from the user's perspective.

3. The selected SimDAP client application allows the user to submit a query to search for data of interest. Results are returned to the user in a proper format.

4. The user chooses the datasets of interest and performs one of the available operations (e.g. preview, cutout, selective identification, etc.). The operation can return the results immediately (synchronous) or after some time (asynchronous, due to complex processing).

5. The user downloads the result.

All these steps require interactions between servers, data resources, registries, client applications and human users. Such interactions require proper protocols.

1.2 SimDAP: features

The previous scenario relies on a number of features which are the foundation of a SimDAP service:

1. The service must be registered in VO registries, according to the Registry Services metadata model (ref.). Therefore it must be described in the IVOA compliant standard.

2. The SimDAP service must support a Metadata Query capability which gets the queryable metadata (in terms of metadata schema) from an associated SimDM registry.

3. A query is submitted to the SimDAP metadata database based on the queryable metadata (for instance, as an HTTP GET/POST request).

4. The SimDAP service can support a Data Processing capability, which processes data through the available interfaces (each driving a specific functionality of the service). Each interface is characterized by a set of constraint parameters, which are used to set up the query properly. The only mandatory Data Processing capability is the cut-out (see section ...)

5. The SimDAP service must support a download capability to permit the retrieval of the final results.

2 The Data Model

A well-defined data model is needed to organize and describe data so that they can be searched by means of queries on associated metadata and used for the service operations. The SimDAP service provides a mechanism to describe the metadata schema self-consistently, so that any data model can be adopted. This mechanism relies on the SimDM-TAP service (ref.).

2.1 The SimDAP_SCHEMA

The SimDAP_SCHEMA is derived from the SimDM-TAP schema and defines a set of tables that contain the minimal metadata required to describe and use the tables exposed by a SimDAP service. Services must provide these tables and make them accessible by all supported operations and query mechanisms.
The qualified names in the tables of the SimDAP_SCHEMA must follow the rules defined in XXXXXXX. The names must be stated in a form that is acceptable as an operand of a query.
All columns in the SimDAP_SCHEMA tables are of type VARCHAR except for .............. which are ........ values.
The SimDAP_SCHEMA consists of four main tables describing

A SimDAP service must provide the tables listed above and may provide other tables in the SimDAP_SCHEMA namespace.

3. The SimDAP interface

3.1 overview

SimDAP services in general support a number of requests, which can be summarized as follows:

SimDAP standardizes the interface to each functionality and the protocol for data exchange (input/output). A SimDAP service must be represented as a tree structure of web resources each addressable via a URL in the http scheme, or the https scheme, or both. The web resource at the root of the tree must represent the service as a whole. This specification defines no standard representation for this root resource. Implementations may provide a representation, or may return a '404 not found' response to requests for the root web-resource. One possible representation is an HTML page describing the scientific usage and content of the service. The service operations described here use HTTP GET and POST as the low level communications protocol. The functionality of each operation is defined independently of the low level communications protocol, and semantically equivalent operations could be implemented via other protocols.

The result of a SimDAP function is a VOTable or a file in some other format. Support for VOTable output is mandatory; all other formats are optional.

Examples:

http://example.org/simdap (SimDAP service root)
http://example.org/simdap/sync?REQUEST=LISTEXPERIMENTS (SimDAP service request)
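
The URL layout described above can be illustrated with a minimal client-side sketch. The helper name build_request_url is hypothetical and not part of the protocol; the sketch only assumes that a request is the service root, followed by /sync or /async, followed by REQUEST and any further parameters as a query string.

```python
from urllib.parse import urlencode


def build_request_url(base_url, mode, request, **params):
    """Return a SimDAP request URL below the /sync or /async resource.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    query = {"REQUEST": request}
    query.update(params)
    # safe="," keeps comma-separated lists readable in the query string.
    return f"{base_url.rstrip('/')}/{mode}?{urlencode(query, safe=',')}"


print(build_request_url("http://example.org/simdap", "sync", "LISTEXPERIMENTS"))
```

Extra keyword arguments become additional query parameters in the order they are given.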

3.2 Operation execution

The SimDAP service specification defines both synchronous and asynchronous query execution. In the case of an HTTP-based service, the user selects synchronous or asynchronous execution by choosing the appropriate resource below the base URL for the service (see xxx). A query is synchronous if the results of the query are delivered in the HTTP response to the request that originally posed the query. If the service returns an immediate HTTP response upon accepting a query and the client later obtains the results of the query in response to a separate HTTP request, then we say the request is asynchronous.

3.2.1 Synchronous Operations

Support for synchronous operations is mandatory. A SimDAP service must provide a web resource with relative URL /sync that is a direct child of the root web resource. This web resource represents the results of synchronous requests. The exact form of the request and the representations of the results are described in the next sections. Synchronous operations execute immediately and the client must wait for the query to finish. If the HTTP request times out or the client otherwise loses the connection to the service before receiving the response, then the query fails.
Synchronous operation execution is adequate when the operation will execute quickly (e.g. a data query) and with a small number of results, or when they can at least start returning results quickly. They are generally simple to implement using standard web technologies and easy to use from a browser or scripting environment. However, synchronous queries are generally not sufficient and likely to fail for queries that take a long time to execute, especially before returning any results.

Example:

http://example.org/simdap/sync?REQUEST=GetAvailability 

3.2.2 Asynchronous Operations

Support for asynchronous operations is not mandatory. Asynchronous operations require that client and server share knowledge of the state of the query during its execution and between HTTP exchanges. A SimDAP service can provide a web resource with relative URL /async that is a direct child of the root web resource. This web resource represents controls for asynchronous queries. Specifically, the web resource must represent the job list as specified in the UWS standard (ref., see also the TAP document). The response to the request is a job_id, which maps to an associated web resource.

Example:

http://example.org/simdap/async?REQUEST=Cutout&EXPERIMENT=mysim&SNAPSHOT=output0001.h5 
This invokes a cutout operation which is performed asynchronously and identified by a job id, e.g. 10021. A corresponding web resource is then available:
http://example.org/simdap/async/10021
At this point, a number of other operations, for monitoring or controlling the job, can be invoked by using the web resource. For example,
http://example.org/simdap/async/10021/error
returns possible errors during the job execution.
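
The job resources above can be sketched from the client side as follows. This is only an illustration: it assumes the service exposes the UWS phase child resource of a job and that the phase is returned as plain text, as in the UWS standard. The helper names job_url and wait_for_job are hypothetical, and the fetch argument stands for any function that performs an HTTP GET and returns the response body, so the sketch runs here with canned responses instead of a real service.

```python
import time

# UWS phases after which a job no longer changes state.
TERMINAL_PHASES = {"COMPLETED", "ERROR", "ABORTED"}


def job_url(base_url, job_id, child=None):
    """URL of an asynchronous job, or of one of its child resources."""
    url = f"{base_url.rstrip('/')}/async/{job_id}"
    return f"{url}/{child}" if child else url


def wait_for_job(base_url, job_id, fetch, poll_seconds=0.0, max_polls=100):
    """Poll the job phase until it reaches a terminal phase."""
    for _ in range(max_polls):
        phase = fetch(job_url(base_url, job_id, "phase")).strip()
        if phase in TERMINAL_PHASES:
            return phase
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Canned responses standing in for a real service.
canned = iter(["QUEUED", "EXECUTING", "COMPLETED"])
print(wait_for_job("http://example.org/simdap", 10021, lambda url: next(canned)))
print(job_url("http://example.org/simdap", 10021, "error"))
```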

3.3 Parameters

The /sync and /async web-resources must accept the parameters listed in the following sub-sections. In a synchronous request, the parameters select the representation returned in the response message. In an asynchronous request, the parameters select the representation of the eventual result rather than the response to the initial request.
Not all combinations of the parameters are meaningful. For example, ...... . If a service receives a spurious parameter in an otherwise correct request, then the service must ignore the spurious parameter, must respond to the request normally and must not report errors concerning the spurious parameter.
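
A minimal server-side sketch of the rule on spurious parameters is given below. The set KNOWN_PARAMETERS and the function drop_spurious are hypothetical; the set simply collects parameter names that appear in this document, and a real service would use whatever list its operations define.

```python
# Hypothetical set of parameter names understood by a service.
KNOWN_PARAMETERS = {
    "REQUEST", "VERSION", "FORMAT",
    "EXPERIMENT", "SNAPSHOT", "PROPERTY",
    "PARAM", "MINVAL", "MAXVAL",
}


def drop_spurious(params):
    """Keep recognised parameters; silently ignore everything else."""
    return {name: value for name, value in params.items()
            if name in KNOWN_PARAMETERS}


request = {"REQUEST": "ListSnapshots", "EXPERIMENT": "mysim", "COLOUR": "blue"}
print(drop_spurious(request))
```

The unknown COLOUR parameter is dropped without raising an error, so the rest of the request is processed normally.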

3.3.1 REQUEST

This is the basic parameter that distinguishes between service operations, makes it possible to extend the service specification (with additional or custom operations), and specifies how other parameters should be interpreted. A SimDAP client must set this parameter correctly in every request (GET or POST) to the /async or /sync web resources. If a SimDAP service receives a request without this parameter or with an incorrect value for this parameter, then the service must reject the request and return an error document as the result.
These are the standard values of the parameter:

A detailed description of the associated operations is given in the next sections.
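
The handling of REQUEST can be sketched as below. The operation names are those that appear in the operation sections of this document; treating them as the complete list, and matching them case-insensitively (the examples use both LISTEXPERIMENTS and ListExperiments), are assumptions of this sketch. RequestError and resolve_request are hypothetical names; a service would translate the exception into an error document.

```python
# Operation names taken from the operation sections of this document.
STANDARD_REQUESTS = (
    "GetAvailability", "GetCapabilities", "GetMetadata", "QueryData",
    "ListExperiments", "ListSnapshots", "Cutout", "Preview",
)
_BY_LOWER = {name.lower(): name for name in STANDARD_REQUESTS}


class RequestError(ValueError):
    """Missing or unknown REQUEST value (to be reported as an error document)."""


def resolve_request(params):
    """Return the canonical operation name or raise RequestError."""
    value = params.get("REQUEST")
    if value is None:
        raise RequestError("missing REQUEST parameter")
    try:
        # Assumption: REQUEST values are compared case-insensitively.
        return _BY_LOWER[value.lower()]
    except KeyError:
        raise RequestError(f"unknown REQUEST value: {value!r}") from None


print(resolve_request({"REQUEST": "LISTEXPERIMENTS"}))
```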

3.3.2 VERSION

The VERSION parameter specifies the SimDAP protocol version number. The format of the version number, and version negotiation, are described in section xxx.
A SimDAP service must support the VERSION parameter.

3.3.3 FORMAT

The FORMAT parameter indicates the client's desired format for the table of results of a query.
If the FORMAT parameter is omitted, the default format is VOTable.
A SimDAP service must support VOTable as an output format and may support other formats. A SimDAP service must accept a FORMAT parameter indicating a format that the service supports and should reject queries where the FORMAT parameter specifies a format not supported by the service implementation.
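
As an illustration, the FORMAT rules could be applied as in the following sketch. The format token "votable" and the names SUPPORTED_FORMATS and select_format are assumptions made for the example; this document does not define the actual token values.

```python
DEFAULT_FORMAT = "votable"       # hypothetical token for the mandatory format
SUPPORTED_FORMATS = {"votable"}  # a service may add optional formats here


def select_format(params):
    """Return the output format for a request, defaulting to VOTable."""
    requested = params.get("FORMAT", DEFAULT_FORMAT).lower()
    if requested not in SUPPORTED_FORMATS:
        # The service should reject formats it does not implement.
        raise ValueError(f"unsupported FORMAT: {requested!r}")
    return requested


print(select_format({"REQUEST": "ListExperiments"}))
```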

3.3.4 Parameters Values

Integer numbers are represented as defined in the specification of integers in XML Schema Datatypes. Real numbers are represented as specified for double precision numbers in XML Schema Datatypes. Sexagesimal formatting is not permitted, either for parameter input or in formal output metadata, other than in ISO 8601 formatted time strings (sexagesimal format is permitted in any informal output intended for a human, e.g., text or HTML formatted tables). SimDAP defines a special range-list format for specifying numerical ranges or lists of ranges as parameter values. For example, 1E-7/3E-6 specifies a closed range from 1E-7 to 3E-6 inclusive. The syntax supports both open and closed ranges. Ranges or range lists are permitted only when explicitly indicated in the definition of an individual parameter. A variant of the range list is the value of the WHERE parameter, used to specify the query constraint for a ParamQuery operation. For a full description of range list syntax refer to section xxx. Repeated values in an array are specified using a single comma-separated list, in order to preserve the order of the elements when specifying spatial dimensions.

      $/sync?REQUEST=CUTOUT&EXPERIMENT=clrc00&SNAPSHOT=clrc00_0010&LEFTEDGE=0.5,0.6,0.2&RIGHTEDGE=0.7,0.8,0.4
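
The two value forms mentioned above can be parsed as in this sketch. It covers only the closed range written as low/high and the comma-separated list; open ranges and range lists are left out because their syntax is not given in this section. The function names are hypothetical.

```python
def parse_list(value):
    """Parse a comma-separated list of reals, preserving element order."""
    return [float(item) for item in value.split(",")]


def parse_closed_range(value):
    """Parse 'low/high' into a (low, high) tuple; both ends are inclusive."""
    low, high = value.split("/")
    return float(low), float(high)


def in_range(x, bounds):
    """True if x lies inside the closed range."""
    low, high = bounds
    return low <= x <= high


print(parse_closed_range("1E-7/3E-6"))
print(parse_list("0.5,0.6,0.2"))
```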
    

3.4 VOSI services

Similar to the Virtual Observatory Service Interface (VOSI ref.), the SimDAP interface specifies a base service interface common to all SimDAP services. SimDAP interface requests supply metadata concerning the availability of the service ('SimDAP-availability') and of its main interfaces ('SimDAP-capabilities'). SimDAP-capabilities outputs use the same schema introduced in section xxx. Two further interfaces are specifically designed to get the lists of the available data collections (experiments) and associated datasets (snapshots).

3.4.1 Availability

This interface indicates whether the service is operable and the reliability of the service for extended and scheduled requests.

Interface

Operation name:

GetAvailability

Input parameters: NONE

Examples

http://example.org/simdap/sync?REQUEST=GetAvailability

3.4.2 Capability

This interface provides the service metadata in the form of a list of Capability descriptions. Each of these descriptions is an XML element that

An entry for a service in the resource registry, i.e. its VOResource, contains the Dublin Core Resource metadata (identifier, curation information, content description, etc.) followed by the service's capability descriptions. For a detailed description of the resource and service metadata we refer to the TAP document (ref.). From this description, the Resource metadata (Identity, Curation and General Content) can be adopted by SimDAP with no exceptions. Collection and Service Content metadata, instead, do not apply to SimDAP and are ruled out. Among the Service metadata, Interface metadata (which describe how to access the service) can be adopted as they are. Capability metadata (describing the usage of the service) must instead be specifically defined. The service metadata shall be represented as an XML document which contains a sequence of one or more elements of type {http://www.ivoa.net/xml/VOResource/v1.0}Capability or sub-types thereof.

Interface

Operation name:

GetCapabilities

Input parameters: NONE

Examples

http://example.org/simdap/sync?REQUEST=GetCapabilities

3.5 Data search

Data collections can be selected according to specific requirements that compose a query. The requirements are set on parameters which are retrieved as metadata.

3.5.1 Metadata Query ???????????

Metadata queries are applied to standardized tables which describe the data model of the SimDAP data search service. Metadata queries allow a client to discover the names of the parameters to be used in data search (3.5.2) and processing (3.7) requests.

Interface

Operation name:

GetMetadata

Input parameters: NONE

Examples

3.5.2 Data Query

The information describing the available data collections made accessible via SimDAP is typically stored in relational database management systems (whose schema is retrieved by the GetMetadata operation, see 3.5.1). The SimDAP service makes it possible to identify all the simulations and snapshots which match specific conditions.

Interface

Operation name:

QueryData

Input parameters: TBD

Examples

3.6 Data Listing

These functions list all the simulations available at the selected SimDAP service and all the snapshots associated with a simulation.

3.6.1 List Experiments

This function returns the list of the experiments (simulations) served by this SimDAP instance.

Interface

Operation name:

ListExperiments
Input parameters: NONE

Examples

http://example.org/simdap/sync?REQUEST=ListExperiments

3.6.2 List Snapshots

This function lists the available snapshots, either all or for one experiment.

Interface

Operation name:

ListSnapshots

Input parameter  UTYPE                          Required?
EXPERIMENT       SimDB.Experiment.PublisherDID  OPT

Examples

http://example.org/simdap/sync?REQUEST=ListSnapshots
Returns a list of ALL datafiles served by this service.

http://example.org/simdap/sync?REQUEST=ListSnapshots&EXPERIMENT=my_favourite_simulation
Returns a list of the datafiles of simulation "my_favourite_simulation".

3.7 Data Processing

Two basic operations are expected to be implemented by a SimDAP service: the cutout (mandatory) and the preview (optional) of data. The cutout concept was born for geometric selections in cosmological simulations (i.e. extracting all data inside a given spatial region). It has been generalized to any kind of multidimensional, multiparametric selection.
The preview function has a broad definition. It is up to the service to provide the preview functions most suitable for the available data.
Custom services can perform any kind of processing, provided their interfaces and responses conform to the standards defined in 4.3.xxx.

3.7.1 Cutout

The cutout operation refers to a single snapshot. Cutouts from multiple sources, such as various time steps of the same simulation, are not supported by the protocol. Their implementation is up to the client, for example as a sequence of requests with the same sub-box and fields but different datasets.
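
A client-side loop over several snapshots might look like the sketch below. The function cutout_urls is hypothetical; it only generates one synchronous Cutout request per snapshot with identical constraints, using the parameter names of the Cutout interface.

```python
from urllib.parse import urlencode


def cutout_urls(base_url, experiment, snapshots, **constraints):
    """One Cutout request URL per snapshot, all with the same constraints."""
    urls = []
    for snapshot in snapshots:
        query = {"REQUEST": "Cutout", "EXPERIMENT": experiment, "SNAPSHOT": snapshot}
        query.update(constraints)
        # safe="," keeps comma-separated lists readable.
        urls.append(f"{base_url.rstrip('/')}/sync?{urlencode(query, safe=',')}")
    return urls


for url in cutout_urls("http://example.org/simdap", "mysim",
                       ["snap0001.h5", "snap0002.h5"],
                       PROPERTY="temperature,density"):
    print(url)
```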

Interface

Operation name:

Cutout

Input parameter  UTYPE                                Required?  Description
EXPERIMENT       SimDB.Experiment.PublisherDID        REQ        ID of the simulation
SNAPSHOT         SimDB.Snapshot.PublisherDID          REQ        ID of the snapshot subject to the cutout
PROPERTY         SimDB.RepresentationObject.Property  OPT        ID of the quantities to be extracted (if more than one, comma separated list)
PARAM                                                 OPT        IDs of the parameters that define the cutout region (e.g. x, y, z for a geometric cutout)
MINVAL                                                OPT        minimum value of PARAM (MINVAL/MAXVAL defines the range)
MAXVAL                                                OPT        maximum value of PARAM (MINVAL/MAXVAL defines the range)

Note that the Cutout interface specification is provided by GetCapabilities; EXPERIMENT and SNAPSHOT are provided by QueryData (or ListExperiments + ListSnapshots); PROPERTY and PARAM are provided (verify.............) by GetMetadata. Among the optional parameters, if only PROPERTY is specified, all data for that property are selected. If only PARAM is set, all the properties in the MINVAL/MAXVAL range are selected. If none of the optional parameters is specified, the whole snapshot is selected and the cutout reduces to a download of the snapshot.

Examples

http://example.org/simdap/sync?REQUEST=Cutout&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5
Returns the whole snap0001.h5 dataset in the standardized format.
http://example.org/simdap/sync?REQUEST=Cutout&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5& \
PROPERTY=temperature,density
Returns the whole temperature and density from snap0001.h5 dataset in the standardized format.
http://example.org/simdap/sync?REQUEST=Cutout&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5& \
PROPERTY=temperature,density&PARAM=xpos,ypos,zpos&MINVAL=0.3,0.5,0.3&MAXVAL=0.8,1.0,0.8
Returns a sub volume with temperature and density from snap0001.h5 dataset. The subvolume has coordinates between 0.3 and 0.8 in x and z and between 0.5 and 1.0 in y.
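
The selection semantics of the three examples can be sketched on an in-memory table. This is only an illustration of the rules stated above, not of how a service stores or scans its data. The function name cutout, the row layout (one dictionary per simulated element) and the inclusive treatment of MINVAL and MAXVAL are assumptions of the sketch.

```python
def cutout(rows, properties=None, params=(), minval=(), maxval=()):
    """Select rows inside the PARAM ranges and keep the requested properties.

    rows: list of dicts mapping column name to value (hypothetical layout).
    Bounds are treated as inclusive, which is an assumption.
    """
    if not (len(params) == len(minval) == len(maxval)):
        raise ValueError("PARAM, MINVAL and MAXVAL must have the same length")
    selected = []
    for row in rows:
        inside = all(lo <= row[name] <= hi
                     for name, lo, hi in zip(params, minval, maxval))
        if inside:
            if properties is None:
                selected.append(dict(row))  # no PROPERTY: keep every column
            else:
                selected.append({name: row[name] for name in properties})
    return selected


rows = [
    {"xpos": 0.4, "ypos": 0.6, "zpos": 0.5, "temperature": 1.0e4, "density": 0.2},
    {"xpos": 0.9, "ypos": 0.6, "zpos": 0.5, "temperature": 2.0e4, "density": 0.3},
    {"xpos": 0.5, "ypos": 0.4, "zpos": 0.5, "temperature": 3.0e4, "density": 0.4},
]
print(cutout(rows, ["temperature", "density"],
             ["xpos", "ypos", "zpos"], [0.3, 0.5, 0.3], [0.8, 1.0, 0.8]))
```

With no optional arguments the function returns every row unchanged, which mirrors the case where the cutout reduces to a download of the snapshot.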

3.7.2 Preview

The preview can be implemented in different ways, depending on the specific data involved. The input of this method is the basic pair EXPERIMENT and SNAPSHOT. The PROPERTY parameter may be used to specify which fields to preview (if supported; otherwise it is discarded). An omitted or blank PROPERTY parameter is interpreted as: preview all available fields. If PROPERTY requests unavailable quantities, the corresponding request is discarded. If the cutout service is available, the preview service MUST provide means to select the fields of interest and the cutout region.

Interface

Operation name:

Preview

Input parameter  UTYPE                                Required?  Description
EXPERIMENT       SimDB.Experiment.PublisherDID        REQ        ID of the simulation
SNAPSHOT         SimDB.Snapshot.PublisherDID          REQ        ID of the snapshot subject to the cutout
PROPERTY         SimDB.RepresentationObject.Property  OPT        ID of the quantities to be extracted (if more than one, comma separated list)

Examples

http://example.org/simdap/sync?REQUEST=Preview&EXPERIMENT=my_favourite_simulation&SNAPSHOT=snap0001.h5& \ 
PROPERTY=entropy
Previews the entropy from snap0001.h5 of "my_favourite_simulation". How data are previewed depends on the service.

3.8 Representation of results

The basic format of a response from a SimDAP service is a table. This table must be encoded in the output format specified by the FORMAT parameter of the query. See section xxx for required, optional and default formats. VOTable is the default format and VOTable support is mandatory.

3.8.1 VOSI

Representations of VOSI outputs (capabilities and availability) must be as defined in the VOSI standard (ref.), extended to match specific requirements of the SimDAP service......

3.8.2 Data Search

3.8.3 Data Listing

3.8.4 Data Processing

Unlike the other operations, the result of both Cutout and Preview is a pair consisting of a VOTable and data files. Support for VOTable is mandatory, even if alternative formats can be deployed.
External data files are in general necessary in order to deal with large and binary data. However, in some cases, the resulting data can still be represented by ASCII tables. In such cases a standard VOTable is the only result of the SimDAP operation. In all other cases, the VOTable is used to describe the results and the data files. Furthermore, it stores the links to the files, which can be downloaded from the remote storage area (via ftp, gridftp, http...).
The description of the VOTable customization needed to support external file description (Theoretical Data File Format, TDFF) is provided, together with examples, in appendix Axxx.

3.8.5 Errors

3.9 SimDAP Versioning

The SimDAP protocol provides explicitly for versioning of the interface in order to support version negotiation between a client and a service where one or both parties support more than one version.
The versioning is based on a version number, which follows IVOA conventions (ref.). The version number applies to all aspects of the protocol as defined in this document, including any associated XML schema and the request encodings.
If a SimDAP client does not specify the version number in a request, the server assumes the highest standard version supported by the service, and no explicit version checking takes place. If the client specifies an explicit version number, and this does not match a version available from the service, the service returns a version number mismatch error as described in 3.8.5. The client can determine what versions of the protocol the service supports by a prior call to VOSI-capabilities or via a registry query.
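
The version handling described above can be sketched as follows. The list SUPPORTED_VERSIONS, the major.minor form of the version strings and the function names are assumptions made for the example; the ValueError stands for the version number mismatch error that the service returns.

```python
SUPPORTED_VERSIONS = ("1.0", "1.1")  # hypothetical versions offered by a service


def _version_key(version):
    """Compare versions numerically, so that 1.10 sorts after 1.9."""
    return tuple(int(part) for part in version.split("."))


def negotiate_version(requested=None, supported=SUPPORTED_VERSIONS):
    """Return the protocol version to use for a request."""
    if requested is None:
        # No VERSION given: assume the highest supported version.
        return max(supported, key=_version_key)
    if requested in supported:
        return requested
    raise ValueError(f"version mismatch: {requested!r} not in {supported}")


print(negotiate_version())
print(negotiate_version("1.0"))
```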

4. Service registration

Publication of a service to the VO requires its registration with an IVOA registry, including a description of the identity and capabilities of the service.
............................

5. Extended capabilities

The SimDAP service allows for optional extended capabilities and operations. Extensions may be defined within an information community when needed for additional functionality or specialization. A generic client must not be required or expected to make use of such extensions. Extended capabilities or operations must be defined by the service metadata. Extended capabilities provide additional metadata about the service, and may or may not enable optional new parameters to be included in operation requests. Extended operations may allow additional operations to be defined.
A server must produce a valid response to the operations defined in this document, even if parameters used by extended capabilities are missing or malformed (i.e. the server must supply a default value for any extended capabilities it defines), or if parameters are supplied that are not known to the server.
Service providers must choose extension names with care to avoid conflicting with standard metadata fields, parameters and operations.



==================================================================================

3.5 Query Response

The basic format of a response from a SimDAP service is a VOTable XML document, containing a nested hierarchy of RESOURCE elements.

   <RESOURCE utype="SimDB.Experiment">
     ...Experiment metadata...
     <RESOURCE utype="SimDB.Snapshot">
       ...Snapshot metadata...
       <RESOURCE utype="TDFF.File">
         ...File metadata, access reference...
         <TABLE utype="TDFF.Array">
           ...Table of arrays metadata...
    

The response to a ListExperiments request is a VOTable containing a series of RESOURCE elements, where each RESOURCE contains the metadata for a single Experiment. Individual attributes of the Experiment (taken from the SimDM), are listed as PARAM or LINK elements in the RESOURCE. Attributes that are collections, ParameterSetting for example, are listed as TABLEs in the RESOURCE.
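
A client could walk such a response as in the sketch below. The sample document is invented for illustration: it omits the VOTable XML namespace and most mandatory metadata, and the experiment name, link and snapshot identifier are placeholders. The function name list_experiments is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented, simplified sample (no XML namespace) with placeholder values.
SAMPLE = """<VOTABLE>
  <RESOURCE utype="SimDB.Experiment">
    <PARAM name="name" utype="SimDB.Experiment.Name" datatype="char" arraysize="*" value="mysim"/>
    <LINK href="http://example.org/mysim.xml"/>
    <RESOURCE utype="SimDB.Snapshot">
      <PARAM name="id" utype="SimDB.Experiment.Snapshot.PublisherDID" datatype="char" arraysize="*" value="snap0001.h5"/>
    </RESOURCE>
  </RESOURCE>
</VOTABLE>"""


def list_experiments(votable_text):
    """Collect PARAM values, LINK targets and snapshot ids per Experiment."""
    root = ET.fromstring(votable_text)
    experiments = []
    for resource in root.findall("RESOURCE"):  # direct children only
        if resource.get("utype") != "SimDB.Experiment":
            continue
        params = {p.get("utype"): p.get("value") for p in resource.findall("PARAM")}
        links = [link.get("href") for link in resource.findall("LINK")]
        snapshots = [
            p.get("value")
            for snap in resource.findall("RESOURCE")
            if snap.get("utype") == "SimDB.Snapshot"
            for p in snap.findall("PARAM")
        ]
        experiments.append({"params": params, "links": links, "snapshots": snapshots})
    return experiments


print(list_experiments(SAMPLE))
```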

The required and optional attributes are listed in Section A.3.1 of the Appendix. This list has been deliberately kept to a minimum, since not all data providers will have a complete database with all of the classes from the SimDM. Instead, they can use a LINK element for the ReferenceURL attribute to point the client to a richer description of the simulation. Ideally, this would point to an XML instance document describing the Experiment based on the XML Schema from the SimDM. (SimDM or SimDB?)

Similarly, the required attributes are the same for all service operations. It is assumed that a client performing a ListExperiments request is exploring the Experiments, and would like more metadata. However, when performing a QueryData request, the client may already have

TODO: Should the service allow continuation tokens for long responses?

Appendix A: Detailed List of Query Parameters and Response Content

A.1 Custom Services

Custom services must define their own input parameters and responses.

A.2 Input Parameters

Parameter                                        Service Operation
Name        UTYPE                                ListExperiments  ListSnapshots  QueryData  Preview  Cutout
EXPERIMENT  SimDB.Experiment.PublisherDID        N/A              OPT            REQ        REQ      REQ
SNAPSHOT    SimDB.Snapshot.PublisherDID          N/A              N/A            OPT        OPT      REQ
PROPERTY    SimDB.RepresentationObject.Property  N/A              N/A            OPT        OPT      OPT
LEFTEDGE    TDFF.Array.LeftEdge                  N/A              N/A            N/A        N/A      OPT
RIGHTEDGE   TDFF.Array.RightEdge                 N/A              N/A            N/A        N/A      OPT

A.3 Query Response

Tables are used to represent collections from the data model. In many cases, these tables are optional. In this case, the required fields (columns) of the table only apply if the service chooses to return that table. This way, the client can be assured of a minimal set of metadata if the table is returned.

Resource                      Service Operation
Name        UTYPE             ListExperiments  ListSnapshots  QueryData  Preview  Cutout
EXPERIMENT  SimDB.Experiment  REQ              REQ            REQ        REQ      REQ
SNAPSHOT    SimDB.Snapshot    OPT              REQ            REQ        REQ      REQ
FILE        TDFF.File         OPT              OPT            REQ        REQ      REQ

A.3.1 Experiment Resource Metadata

The Experiment, Simulation, and PostProcessing classes from the SimDB have more attributes than are listed here. In principle, all of these attributes can be returned by a SimDAP service, in addition to appropriately related elements from the other classes, namely Protocol and its subclasses. The attributes and collections given here are the ones most important for describing the data.

UTYPE                                            VOT Element  Required?
SimDB.Experiment.Name                            PARAM        REQ
SimDB.Experiment.Created                         PARAM        OPT
SimDB.Experiment.Description                     PARAM        OPT
SimDB.Experiment.Status                          PARAM        OPT
SimDB.Experiment.Updated                         PARAM        OPT
SimDB.Experiment.ReferenceURL                    LINK         REQ
SimDB.Protocol.Name                              PARAM        REQ
SimDB.Protocol.PublisherDID                      PARAM        REQ
SimDB.Protocol.ReferenceURL                      LINK         REQ
SimDB.Protocol.Version                           PARAM        OPT
SimDB.Experiment.GenericParameterSetting         TABLE        OPT
SimDB.Experiment.NumericParameterSetting         TABLE        OPT
SimDB.Experiment.InputDataset                    TABLE        OPT
SimDB.Experiment.ExperimentRepresentationObject  TABLE        OPT

A.3.1.1 Generic Experiment Parameter Setting Table Columns

Column                                          Required?
SimDB.Protocol.InputParameter.Name              REQ
SimDB.Protocol.InputParameter.Description       OPT
SimDB.Protocol.InputParameter.Datatype          REQ
SimDB.Experiment.GenericParameterSetting.Value  REQ

A.3.1.2 Numeric Experiment Parameter Setting Table Columns

Column                                                Required?
SimDB.Protocol.InputParameter.Name                    REQ
SimDB.Protocol.InputParameter.Description             OPT
SimDB.Protocol.InputParameter.Datatype                REQ
SimDB.Experiment.NumericParameterSetting.Value.Value  REQ
SimDB.Experiment.NumericParameterSetting.Value.Unit   REQ

A.3.1.3 Input Dataset Table Columns

Column                         Required?
SimDB.Experiment.Name          REQ
SimDB.Experiment.PublisherDID  REQ
SimDB.Experiment.ReferenceURL  REQ
SimDB.Snapshot.PublisherDID    REQ

A.3.1.4 Experiment Representation Object Table Columns

Column                                               Required?
SimDB.Protocol.RepresentationObjectType.Name         REQ
SimDB.Protocol.RepresentationObjectType.Description  OPT
SimDB.Protocol.RepresentationObjectType.Label        OPT
SimDB.Protocol.RepresentationObjectType.Type         REQ

A.3.2 Snapshot Resource Metadata

UTYPE VOT Element Required?
SimDB.Experiment.Snapshot.PublisherDID PARAM REQ

A.3.3 File Resource Metadata

UTYPE                           VOT Element  Required?
TDFF.File.PublisherDID          PARAM        REQ
TDFF.File.AccessURL             LINK         REQ
Protocol.FileType.PublisherDID  PARAM        REQ
Protocol.FileType.Mimetype      PARAM        REQ
TDFF.Array                      TABLE        REQ

A.3.3.1 Array Table Columns

Column                                                     Required?
TDFF.Array.Name                                            REQ
SimDB.Protocol.RepresentationObject.Name                   REQ
SimDB.Protocol.RepresentationObject.Description            OPT
SimDB.Protocol.RepresentationObject.PublisherDID           REQ
SimDB.Protocol.RepresentationObject.Property.Name          REQ
SimDB.Protocol.RepresentationObject.Property.Description   OPT
SimDB.Protocol.RepresentationObject.Property.PublisherDID  REQ
TDFF.Array.Datatype                                        REQ

Appendix B: Theoretical Data File Format (TDFF)

B.1 TDFF Class Diagram

Theoretical Data File Format class diagram

B.2 Description of TDFF Elements

Class
UTYPE          UCD1+  Description
TDFF.FileType  ?      Type of file produced by a software protocol

Attributes
UTYPE                       UCD1+  Datatype  Description
TDFF.FileType.Name                 string    Short name
TDFF.FileType.PublisherDID         string    Publisher assigned identifier of the FileType.
TDFF.FileType.Description          text
TDFF.FileType.Mimetype             string    Content-type

Class
UTYPE      UCD1+  Description
TDFF.File  ?      File or table containing one or more arrays

Attributes
UTYPE                   UCD1+  Datatype  Description
TDFF.File.Name                 string    File name
TDFF.File.Type                 string    Reference to the PublisherDID of the FileType.
TDFF.File.PublisherDID         string
TDFF.File.Size                 int       Approximate size in KiB
TDFF.File.AccessURL            string    Resolvable URL for retrieving file

Class
UTYPE       UCD1+  Description
TDFF.Array  ?      Sequence of data values in a binary array or table column

Attributes
UTYPE                    UCD1+  Datatype  Description
TDFF.Array.Name                 string    Array or column name
TDFF.Array.Datatype             string    Array name
TDFF.Array.Property             string    Reference to the PublisherDID of the Property represented by the Array.
TDFF.Array.Rank                 int       Number of axes in the Array.
TDFF.Array.Dims                 int[]     Array of length Rank indicating the number of elements along each axis of the Array.
TDFF.Array.Offset               int       Number of bytes in the File before the beginning of the Array.
TDFF.Array.Stride               int       Number of bytes to skip between each element of the Array.
TDFF.Array.SkipByte             int       Claudio, do we need this if we're providing the offset for each array?
TDFF.Array.Endian               string    The endian-ness of the Array; possible values are "little" or "big".
TDFF.Array.RowMajor             bool      Whether the Array is in row-major or column-major order.
TDFF.Array.InternalPath         string    Internal path of the Array if it is in a self-describing file format, such as FITS or HDF5.
TDFF.Array.LeftEdge             float[]   Array of length Rank of the minimum spatial extent in each dimension.
TDFF.Array.RightEdge            float[]   Array of length Rank of the maximum spatial extent in each dimension.
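
To show how a client might use the Array attributes to read values from a raw binary file, a minimal sketch follows. Several points are assumptions, since this appendix does not fix them: the Datatype tokens in STRUCT_CODES are hypothetical, Stride is interpreted as the number of gap bytes skipped after each element (0 meaning contiguous storage), and the values are returned as a flat list without applying Dims or RowMajor to reshape them.

```python
import struct

# Hypothetical Datatype tokens mapped to struct format codes.
STRUCT_CODES = {"int32": "i", "int64": "q", "float32": "f", "float64": "d"}


def read_array(data, datatype, dims, offset=0, stride=0, endian="little"):
    """Read a TDFF-described array from a bytes object as a flat list.

    Assumption: stride counts the bytes skipped between consecutive elements.
    """
    if endian not in ("little", "big"):
        raise ValueError("endian must be 'little' or 'big'")
    code = ("<" if endian == "little" else ">") + STRUCT_CODES[datatype]
    size = struct.calcsize(code)
    count = 1
    for n in dims:
        count *= n  # total number of elements over all axes
    step = size + stride
    return [struct.unpack_from(code, data, offset + i * step)[0]
            for i in range(count)]


# A 4-byte header followed by three big-endian doubles.
blob = b"HEAD" + struct.pack(">3d", 1.0, 2.0, 3.0)
print(read_array(blob, "float64", [3], offset=4, endian="big"))
```

Under the same interpretation of Stride, two interleaved float32 columns can be read by giving each one its own Offset and a Stride of 4 bytes.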

References

[1] R. Hanisch, Resource Metadata for the Virtual Observatory
http://www.ivoa.net/Documents/latest/RM.html

[2] R. Hanisch, M. Dolensky, M. Leoni, Document Standards Management: Guidelines and Procedure
http://www.ivoa.net/Documents/latest/DocStdProc.html