I nternational

V irtual

O bservatory

A lliance

IVOA Registry Relational Schema
Version

Filled in automatically

Working Group:
Registry WG
This version:
filled in automatically
Latest version:
http://www.ivoa.net/Documents/RegTAP/
Previous versions:
http://www.ivoa.net/documents/RegTAP/20130909/WD-RegTAP-1.0-20130909.html
http://www.ivoa.net/documents/RegTAP/20130411/WD-RegTAP-1.0-20130411.html
(Unpublished) Internal Working Draft 2013-03-05 (Volute rev. 2011)
(Unpublished) Internal Working Draft 2012-11-12 (Volute rev. 1864)
Authors:
Markus Demleitner
Paul Harrison
Marco Molinaro
Gretchen Greene
Theresa Dower
Menelaos Perdikeas

Abstract

Registries provide a mechanism with which VO applications can discover and select resources—first and foremost data and services—that are relevant for a particular scientific problem. This specification defines an interface for searching this resource metadata based on the IVOA's TAP protocol. It specifies a set of tables that comprise a useful subset of the information contained in the registry records, as well as the table's data content in terms of the XML VOResource data model. The general design of the system is geared towards allowing easy authoring of queries.

Status of this Document

A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.

Acknowledgments

This document has been developed in part with support from the German Astronomical Virtual Observatory (BMBF Bewilligungsnummer 05A08VHA).

Conformance-related definitions

The words "MUST ", "SHOULD", "MAY", "RECOMMENDED", and "OPTIONAL" (in upper or lower case) used in this document are to be interpreted as described in the IETF standard RFC 2119 std:RFC2119.

Contents

Introduction

In the Virtual Observatory (VO), registries provide a means for discovering useful resources, i.e., data and services. Individual publishers offer the descriptions for their resources ("resource records") in publishing registries. At the time of writing, there are roughly 14000 such resource records active within the VO, originating from about 20 publishing registries.

The protocol spoken by these publishing registries, OAI-PMH only allows restricting queries by modification date and identifier and is hence not suitable for data discovery. Even if it were, data discovery would at least be fairly time consuming if each client had to query dozens or, potentially, hundreds of publishing registries.

To enable efficient data discovery nevertheless, there are services harvesting the resource records from the publishing registries and offering rich query languages. The IVOA Registry Interfaces specification std:RI1 defined, among other aspects of the VO registry system, such an interface using SOAP and an early draft of an XML-based query language.

This document provides an update to the query ("searchable registry") part of that specifiation (chapter 2), aimed towards usage with current VO standards, in particular TAP std:TAP and ADQL std:ADQL. It follows the model of ObsCore std:OBSCORE of defining a representation of a data model within a relational database. In this case, the data model is a simplification of the VO's resource metadata interchange representation, the VOResource XML format std:VOR. The simplification yields 13 tables. For each table, TAP_SCHEMA metadata is given together with rules for how to fill these tables from VOResource-serialized metadata records as well as conditions on foreign keys and recommendations on indices.

The resulting set of tables has a modest size by today's standards, but is still non-trivial. The largest table, table_column, has about half a million rows at the time of writing.

The architecture laid out here allows client applications to perform "canned" queries on behalf of their users as well as complex queries formulated directly by advanced users, using the same TAP clients they employ to query astronomical data servers.

The combination of the table set discussed here and a TAP server is referred to in this document as "RegTAP".

The Relational Registry within the VO Architecture

The Relational Registry within the VO architecture

Figure 1: IVOA Architecture diagram with the Relational Registry specification (RegTAP) and the related standards marked up.

This specification directly relates to other VO standards in the following ways:

VOResource, v1.03 std:VOR
VOResource sets the foundation for a formal definition of the data model for resource records via its schema definition. This document refers to concepts laid down there via XPaths std:XPATH.
VODataService, v1.1 std:VODS11
VODataService extends the VOResource data model by important concepts and resource types (tablesets, data services, data collections). These concepts and types are reflected in the database schema. Again xpaths link this specification and VODataService.
Other Registry Extensions
Registry extensions are VO standards defining how particular resources (e.g., Standards) or capabilities (e.g., IVOA defined interfaces) are described. Most aspects introduced by them are reflected in the res_detail table using xpaths into the registry documents. This document should not in general need updates for registry extension updates. For completeness, we note the versions current as of this specification: SimpleDALRegExt 1.0 std:DALREGEXT, StandardsRegExt 1.0 std:STDREGEXT, TAPRegExt 1.0 std:TAPREGEXT, Registry Interfaces 1.0 std:RI1.
TAP, v1.0 std:TAP
The queries against the schema defined in the present document, and the results of these queries, will usually be transported using the Table Access Protocol TAP. It also allows discovering local additions to the registry relations via TAP's metadata publishing mechanisms.
IVOA Identifiers, v1.12 std:VOID
IVOA identifiers are something like the primary keys within the VO registry; as such, they are actual primary keys of the central table of the relational registry. Also, the notion of an authority as laid down in IVOA Identifiers plays an important role as publishing registries can be viewed as a realization of a set of authorities.

This standard also relates to other IVOA standards:

ADQL, v2.0 std:ADQL
The rules for ingestion are designed to allow easy queries given the constraints of ADQL 2.0. Also, we give three functions that extend ADQL using the language's built-in facility for user-defined functions.
RegTAP-STC
This specification will be complemented by a schema of four tables, also under the rr schema making up the relational registry, that gives coverages of resources on the spatial, temporal, spectral and redshift axes. These will be defined in a separate document.

Design Considerations

In the design of the tables, the goal has been to preserve as much of VOResource and its extensions, including the element names, as possible.

An overriding consideration has been, however, to make natural joins between the tables behave usefully, i.e., to actually combine rows relevant to the same entity (resource, table, capability, etc.). To disambiguate column names that name the same concept on different entities (name, description, etc.) and would therefore interfere with the natural join, a shortened tag for the source object is prepended to the name. Thus, a description element within a resource ends up in a column named res_description, whereas the same element from a capability becomes cap_description.

We further renamed some columns with respect to their VOResource counterparts to avoid clashes with reserved words in popular database management systems. The alternatives would have been to either recommend quoting them or burden ADQL translation layers with the task of automatically converting them to delimited identifiers. Both alternatives seemed either confusing or impacting robustness.

Furthermore, camel-case identifiers have been converted to underscore-separated ones (thus, standardId becomes standard_id) to have all-lowercase column names; this saves potential headache if users choose to reference the columns using SQL delimited identifiers. Dashes in VOResource attribute names are converted to underscores, too, with the exception of ivo-id, which is just rendered ivoid.

Another design goal of this specification has been that different registries operating on the same set of registry records will return identical responses for most queries; hence, we try to avoid relying on features left not defined by ADQL (e.g., the case sensitivity of string matches). However, with a view to non-uniform support for information retrieval-type queries in database systems, the ivo_hasword user defined function is not fully specified here; queries employing it may yield different results on different implementations, even if they operate on the same set of resource records.

Note on case normalization

ADQL has no operators for case-insensitive matching of strings. To allow for robust and straightforward queries nevertheless, most columns containing values not usually intended for display are required to be converted to lower case; in the table descriptions below, there are explicit requirements on case normalization near the end of each section. This is particularly important when the entities to be compared are defined to be case-insensitive (e.g., ucds, IVORNs). Client software that can inspect user-provided arguments (e.g., when filling template queries) should also convert the respective fields to lower case.

This conversion MUST cover all ASCII letters, i.e., A through Z. The conversion SHOULD take place according to algorithm R2 in section 3.13, "Default Case Algorithms" of std:UNICODE. In practice, non-ASCII characters are not expected to occur in columns for which lowercasing is required.

Analogously, case-insensitive comparisons as required by some of the user-defined functions for the relational registry MUST compare the ASCII letters without regard for case. They SHOULD compare according to D144 in std:UNICODE.

QNames in VOResource attributes

VOResource and its extensions make use of XML QNames in attribute values, most prominently in xsi:type. The standard representation of these QNames in XML instance documents makes use of an abbreviated notation employing prefixes declared using the xmlns mechanism as discussed in std:XMLNS. Within an ADQL-exposed database, no standard mechanism exists that could provide a similar mapping of URLs and abbreviations. The correct way to handle this problem would thus be to have full QNames in the database (e.g., {http://www.ivoa.net/xml/ConeSearch/v1.0}ConeSearch for the canonical cs:ConeSearch). This, of course, would make for excessively tedious and error-prone querying.

For various reasons, VOResource authors have always been encouraged to use a set of "standard" prefixes. This allows an easy and, to users, unsurprising exit from the problem of the missing xmlns declarations: For the representation of QNames within the database, these recommended prefixes are now mandatory. Future VOResource extensions define their mandatory prefixes themselves.

Following the existing practice, minor version changes are not in general reflected in the recommended prefixes—e.g., both VODataService 1.0 and VODataService 1.1 use vs:. For reference, here is a table of XML namespaces and prefixes for namespaces relevant to this specification:

oaihttp://www.openarchives.org/OAI/2.0/
rihttp://www.ivoa.net/xml/RegistryInterface/v1.0
vghttp://www.ivoa.net/xml/VORegistry/v1.0
vrhttp://www.ivoa.net/xml/VOResource/v1.0
dchttp://purl.org/dc/elements/1.1/
vshttp://www.ivoa.net/xml/VODataService/v1.0
vshttp://www.ivoa.net/xml/VODataService/v1.1
cshttp://www.ivoa.net/xml/ConeSearch/v1.0
siahttp://www.ivoa.net/xml/SIA/v1.0
siahttp://www.ivoa.net/xml/SIA/v1.1
ssaphttp://www.ivoa.net/xml/SSA/v1.0
ssaphttp://www.ivoa.net/xml/SSA/v1.1
trhttp://www.ivoa.net/xml/TAPRegExt/v1.0
vstdhttp://www.ivoa.net/xml/StandardsRegExt/v1.0
xsihttp://www.w3.org/2001/XMLSchema-instance

XPaths

This specification piggybacks on top of the well-established VOResource standard. This means that it does not define a full data model, but rather something like a reasonably query-friendly view of a partial representation of one. The link between the actual data model, i.e., VOResource and its extensions as defined by the XML Schema documents, and the fields within this database schema, is provided by xpaths, which are here slightly abbreviated for both brevity and generality.

All xpaths given in this specifiation are assumed to be relative to the enclosing vr:Resource element; these are called "resource xpaths" in the following. If resource xpaths are to be applied to an OAI-PMH response, the xpath expression */*/*/oai:metadata/ri:Resource must be prepended to it, with the canonical prefixes from section qnameatts implied. The resource xpaths themselves largely do not need explicit namespaces since VOResource elements are by default unqualified. Elements and attributes from non-VOResource schemata have the canonical namespace prefixes, which in this specification only applies to several xsi:type attribute names.

Some tables draw data from several different VOResource elements. For those, we have introduced an extended syntax with additional metacharacters (, ), and |, where the vertical bar denotes an alternative and the parentheses grouping. For instance, our notation /(tableset/schema/|)table/ corresponds to the two xpaths /table and /tableset/schema/table.

Within the Virtual Observatory, the link between data models and concrete data representations is usually made using utypes. Since VOResource is directly modelled in XML Schema, the choice of XPath as the bridging formalism is compelling, though, and utypes themselves are not necessary for the operation of a TAP service containing the relational registry. TAP, however, offers fields for utypes in its TAP_SCHEMA. Since they are not otherwise required, this specification takes the liberty of using them to denote the xpaths.

In the metadata for tables and columns below, the utypes given are obtained from the xpaths by simply prepending them with xpath:. To avoid repetition, we allow relative xpaths: when the xpath in a column utype does not start with a slash, it is understood that is must be concatenated with the table utype to obtain the full xpath.

For illustration, if a table has a utype of xpath:/capability/interface/ and a column within this table has a utype of xpath:accessURL/@use, the resulting resource xpath would come out to be /capability/interface/accessURL/@use; to match this in an OAI-PMH response, the xpath would be */*/*/oai:metadata/ri:Resource/capability/interface/accessURL/@use.

While clients MUST NOT rely on these utypes in either TAP_SCHEMA or the metadata delievered with TAP replies, service operators SHOULD provide them, in particular when there are local extensions to the relational registry in their services. Giving xpaths for extra columns and tables helps human interpretation of them at least when the defining schema files are available.

Resource xpaths are also used in the res_details table (section table_res_detail). These are normal xpaths (although again understood relative to the enclosing Resource element), which, in particular, means that they are case sensitive. On the other hand, to clients they are simply opaque strings, i.e., clients cannot just search for any xpaths into VOResource within res_details.

Discovering Relational Registries

The relational registry can be part of any TAP service. The presence of the tables discussed here is indicated by declaring support for the data model Registry 1.0 with the IVORN ivo://ivoa.net/std/RegTAP/vor in the service's capabilities (see std:TAPREGEXT). Technically, this entails adding

Registry 1.0
]]>

as a child of the capability element with the type {http://www.ivoa.net/xml/TAPRegExt/v1.0}TableAccess.

A client that knows the access URL of one TAP service containing a relational registry can thus discover all other services exposing one. The "Find all TAP endpoints offering the relational registry" example in section sample_queries shows a query that does this.

It is recommended to additionally register a relational registry as a VODataService data collection and connect this and the TAP services with a pair of service-for/served-by relations. This allows giving more metadata on the data content like, for example, the frequency of harvesting.

Services implementing this data model that do not (strive to) offer the full data content of the VO registry (like domain-specific registries or experimental systems) MUST NOT declare the above data model in order to not invite clients expecting the VO registry to send queries to it.

VOResource Tables

In the following table descriptions, the names of tables and columns are normative and MUST be used as given, and all-lowercase. On the values in the utype columns within TAP_SCHEMA, see section vorutypes. All columns defined in this document MUST have a 1 in the std column of the TAP_SCHEMA.table_columns table. Unless otherwise specified, all values of ucd and unit in TAP_SCHEMA.table_columns are NULL for columns defined here. Descriptions are not normative (as given, they usually are taken from the schema files of VOResource and its extensions with slight redaction). Registry operators MAY provide additional columns in their tables, but they MUST provide all columns given in this specification.

All table descriptions start out with brief remarks on the relationship of the table to the VOResource XML data model. Then, the columns are described in a selection of TAP_SCHEMA metadata. For each table, recommendations on explicit primary and foreign keys as well as indexed columns are given, where it is understood that primary and foreign keys are already indexed in order to allow efficient joins; these parts are not normative, but operators should ensure decent performance for queries assuming the presence of the given indices and relationships. Finally, lowercasing requirements (normative) are given.

The following tables make up the data model ivo://ivoa.net/std/RegTAP/vor:

Table
Utype
Description
rr.capability
xpath:/capability/
Pieces of behaviour of a resource.
rr.interface
xpath:/capability/interface/
Information on access modes of a capability.
rr.intf_param
xpath:/capability/interface/param/
Input parameters for services.
rr.relationship
xpath:/content/relationship/
Relationships between resources, e.g., mirroring, derivation, but also providing access to data within a resource.
rr.res_date
xpath:/curation/
A date associated with an event in the life cycle of the resource. This could be creation or update. The role column can be used to clarify.
rr.res_detail
N/A
XPath-value pairs for members of resource or capability and their derivations that are less used and/or from VOResource extensions. The pairs refer to a resource if cap_index is NULL, to the referenced capability otherwise.
rr.res_role
N/A
Entities, i.e., persons or organizations, operating on resources: creators, contacts, publishers, contributors.
rr.res_schema
xpath:/tableset/schema/
Sets of tables related to resources.
rr.res_subject
xpath:/content/
Topics, object types, or other descriptive keywords about the resource.
rr.res_table
xpath:/(tableset/schema/|)table/
(Relational) tables that are part of schemata or resources.
rr.resource
xpath:/
The resources, i.e., services, data collections, organizations, etc., present in this registry.
rr.table_column
xpath:/(tableset/schema|)/table/column/
Metadata on columns of tables pertaining to resources.
rr.validation
xpath:/(capability|)
Validation levels for resources and capabilities.

The resource Table

The rr.resource table contains most atomic members of vr:Resource that have a 1:1 relationship to the resource itself. Members of derived types are, in general, handled through the res_detail table even if 1:1 (see table_res_detail). The content_level, content_type, waveband, and rights members are 1:n but still appear here. If there are multiple values, they are concatenated with hash characters (#). Use the ivo_hashlist_has ADQL extension function to check for the presence of a single value. This convention saves on tables while not complicating common queries significantly.

A local addition is the creator_seq column. It contains all content of the name elements below a resource element curation child's creator children, concatenated with a sequence of semicolon and blank characters ("; "). The individual parts must be concatenated preserving the sequence of the XML elements. The resulting string is primarily intended for display purposes ("author list") and is hence not case-normalized. It was added since the equivalent of an author list is expected to be a metadatum that is displayed fairly frequently, but also since the sequence of author names is generally considered significant. The res_role table, on the other hand, does not allow recovering the input sequence of the rows belonging to one resource.

The res_type column reflects the lower-cased value of the ri:Resource element's xsi:type attribute, where the canonical prefixes are used. While custom or experimental VOResource extensions may lead to more or less arbitrary strings in that column, VOResource and its IVOA-recommended extensions at the time of writing define the following values for res_type:

vg:authority
A naming authority (these records allow resolving who is responsible for IVORNs with a certain authority; cf. std:VOID)
vg:registry
A registry is considered a publishing registry if it contains a capability element with xsi:type="vg:Harvest". Old, RegistryInterface 1-compliant registries also use this type with a capability of type vg:Search. The relational registry as specified here, while superceding these old vg:Search capabilities, does not use this type any more. See section registration on how to locate services supporting it.
vr:organisation
The main purpose of an organisation as a registered resource is to serve as a publisher of other resources.
vr:resource
Any entity or component of a VO application that is describable and identifiable by a IVOA Identifier; while it is technically possible to publish such records, the authors of such records should probably be asked to use a more specific type.
vr:service
A resource that can be invoked by a client to perform some action on its behalf
vs:catalogservice
A service that interacts with one or more specified tables having some coverage of the sky, time, and/or frequency.
vs:dataservice
A service for accessing astronomical data; publishers choosing this over vs:catalogservice probably intend to communicate that there are no actual sky positions in the tables exposed.
vs:datacollection
A schema as a logical grouping of data which, in general, is composed of one or more accessible datasets. Use the rr.relationship table to find out services that allow access to the data (the served_by relation), and/or look for values for /accessURL in rr.res_detail.
vstd:standard
A description of a standard specification.

The status attribute of vr:Resource is considered an implementation detail of the XML serialization and is not kept here. Neither inactive nor deleted records may be kept in the resource table. Since all other tables in the relational registry should keep a foreign key on the ivoid column, this implies that only metadata on active records is being kept in the relational registry. In other words, users can expect a resource to exist and work if they find it in a relational registry .

Name
Utype
TypeDescription
ivoid
xpath:identifier
adql:VARCHAR(*)Unambiguous reference to the resource conforming to the IVOA standard for identifiers
res_type
xpath:@xsi:type
adql:VARCHAR(*)Resource type (something like vs:datacollection, vs:catalogservice, etc)
created
xpath:@created
adql:TIMESTAMPThe UTC date and time this resource metadata description was created. This timestamp must not be in the future. This time is not required to be accurate; it should be at least accurate to the day. Any insignificant time fields should be set to zero.
short_name
xpath:shortName
adql:VARCHAR(*)A short name or abbreviation given to something. This name will be used where brief annotations for the resource name are required. Applications may use to refer to this resource in a compact display. One word or a few letters is recommended. No more than sixteen characters are allowed.
res_title
xpath:title
adql:VARCHAR(*)The full name given to the resource
updated
xpath:@updated
adql:TIMESTAMPThe UTC date this resource metadata description was last updated. This timestamp must not be in the future. This time is not required to be accurate; it should be at least accurate to the day. Any insignificant time fields should be set to zero.
content_level
xpath:content/contentLevel
adql:VARCHAR(*)Description of the content level or intended audience
res_description
xpath:content/description
adql:VARCHAR(*)An account of the nature of the resource.
reference_url
xpath:content/referenceURL
adql:VARCHAR(*)URL pointing to a human-readable document describing this resource.
creator_seq
xpath:curation/creator/name
adql:VARCHAR(*)The creator(s) of the resource in the order given by the resource record author.
content_type
xpath:content/type
adql:VARCHAR(*)Nature or genre of the content of the resource
source_format
xpath:content/source/@format
adql:VARCHAR(*)The format of source_value. Recognized values include "bibcode", referring to a standard astronomical bibcode (http://cdsweb.u-strasbg.fr/simbad/refcode.html).
source_value
xpath:content/source
adql:VARCHAR(*)A bibliographic reference from which the present resource is derived or extracted.
res_version
xpath:curation/version
adql:VARCHAR(*)Label associated with creation or availablilty of a version of a resource.
region_of_regard
xpath:coverage/regionOfRegard
adql:REALA single numeric value representing the angle, given in decimal degrees, by which a positional query against this resource should be "blurred" in order to get an appropriate match.
waveband
xpath:coverage/waveband
adql:VARCHAR(*)A hash-separated list of regions of the electro-magnetic spectrum that the resource's spectral coverage overlaps with.
rights
xpath:rights
adql:VARCHAR(*)Information about rights held in and over the resource.

This table should have the ivoid column explicitly set as its primary key.

The following columns MUST be lowercased during ingestion: ivoid, res_type, content_level, content_type, source_format, waveband. Clients are advised to query the res_description and res_title columns using the the ivo_hasword function, and to use ivo_hashlist_has on content_level, content_type, waveband, and rights.

The row for region_of_regard TAP_SCHEMA.tables must have deg in its unit column.

The res_role Table

This table subsumes the contact, publisher, contributor, and creator members of the VOResource data model. They have been combined into a single table to reduce the total number of tables, and also in anticipation of a unified data model for such entities in future versions of VOResource.

The actual role is given in the base_role column, which can be one of contact, publisher, contributor, or creator. Depending on this value, here are the xpaths for the table fields (we have abbreviated /curation/publisher as cp, /curation/contact as co, /curation/creator as cc, and /curation/contributor as cb):

base_role value contact publisher creator contributor
role_name co/name cp cc/name cb
role_ivoid co/name/@ivo-id cp/@ivo-id cc/name/@ivo-id cb/@ivo-id
address co/address N/A N/A N/A
email co/email N/A N/A N/A
telephone co/telephone N/A N/A N/A
logo co/logo N/A cc/logo N/A

Not all columns are available for each role type in VOResource. For example, contacts have no logo, and creators no telephone members. Unavailable metadata (marked with N/A in the above table) MUST be represented with NULL values in the corresponding columns.

Note that, due to current practice in the VO, it is not easy to predict what role_name will contain; it could be a single name, where again the actual format is unpredictable (full first name, initials in front or behind, or even a project name), but it could as well be a full author list. Thus, when matching against role_names, you will have to use rather lenient regular expressions. Changing this, admittedly regrettable, situation would probably require a change in the VOResource schema.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
role_name
N/A
adql:VARCHAR(*)The real-world name or title of a person or organization
role_ivoid
N/A
adql:VARCHAR(*)An IVOA identifier of a person or organization
street_address
N/A
adql:VARCHAR(*)A mailing address for a person or organization
email
N/A
adql:VARCHAR(*)An email address the entity can be reached at
telephone
N/A
adql:VARCHAR(*)A telephone number the entity can be reached at
logo
N/A
adql:VARCHAR(*)URL pointing to a graphical logo, which may be used to help identify the entity
base_role
N/A
adql:VARCHAR(*)The role played by this entity; this is one of contact, publisher, and creator

The ivoid column should be an explicit foreign key into the resource table. It is recommended to maintain indexes on at least the role_name column, ideally in a way that supports regular expressions.

The following columns MUST be lowercased during ingestion: ivoid, role_ivoid, base_role. Clients are advised to query the remaining columns, in particular role_name, case-insensitively, e.g., using ivo_nocasematch.

The res_subject Table

Since subject queries are expected to be frequent and perform relatively complex checks (e.g., resulting from thesaurus queries in the clients), the subjects are kept in a separate table rather than being hash-joined like other string-like 1:n members of resource.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
res_subject
xpath:subject
adql:VARCHAR(*)Topics, object types, or other descriptive keywords about the resource.

The ivoid column should be an explicit foreign key into resource. It is recommended to index the res_subject column, preferably in a way that allows to process case-insensitive and pattern queries using the index.

The ivoid column MUST be lowercased during ingestion. Clients are advised query the res_subject column case-insensitively, e.g., using ivo_nocasematch.

The capability Table

The capability table describes a resource's modes of interaction; it only contains the members of the base type vr:Capability. Members of derived types are kept in the res_detail table (see table_res_detail).

The table has an integer-typed column cap_index to disambiguate multiple capabilities on a single resource. Operators are free to choose the actual values as convenient, although it is recommended to just enumerate the capabilities in their physical sequence per-resource.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
cap_index
N/A
adql:SMALLINTRunning number of this capability within the resource.
cap_type
xpath:@xsi:type
adql:VARCHAR(*)The type of capability covered here.
cap_description
xpath:description
adql:VARCHAR(*)A human-readable description of what this capability provides as part of the over-all service
standard_id
xpath:@standardID
adql:VARCHAR(*)A URI for a standard this capability conforms to.

This table should have an explicit primary key made up of ivoid and cap_index. The ivoid column should be an explicit foreign key into resource. It is recommended to maintain indexes on at least the cap_type and standard_id columns.

The following columns MUST be lowercased during ingestion: ivoid, cap_type, standard_id. Clients are advised to query the cap_description column using the ivo_hasword function.

The res_schema Table

The res_schema table corresponds to VODataService's schema element. It has been renamed to avoid clashes with the SQL reserved word SCHEMA.

The table has an integer-typed column schema_index to disambiguate multiple schema elements on a single resource. Operators are free to choose the actual values as convenient, although it is recommended to just enumerate the schema elements in their physical sequence per-resource. The xpath count(preceding-sibling::schema) gives one possible set of values for this column; it has not been prescribed via a utype, though, since other enumeration schemes are valid as well.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
schema_index
N/A
adql:SMALLINTA running number for the res_schema rows belonging to a resource.
schema_description
xpath:description
adql:VARCHAR(*)A free text description of the tableset explaining in general how all of the tables are related.
schemaname
xpath:name
adql:VARCHAR(*)A name for the set of tables.
schema_title
xpath:title
adql:VARCHAR(*)A descriptive, human-interpretable name for the table set.
schema_utype
xpath:utype
adql:VARCHAR(*)An identifier for a concept in a data model that the data in this schema as a whole represent.

This table should have an explicit primary key made up of ivoid and schema_index. The ivoid column should be an explicit foreign key into resource.

The following columns MUST be lowercased during ingestion: ivoid, schema_name, schema_utype. Clients are advised to query the schema_description and schema_title columns using the the ivo_hasword function.

The res_table Table

The res_table table models VODataService's table element. It has been renamed to avoid name clashes with the SQL reserved word TABLE.

VODataService 1.0 had a similar element that was a direct child of resource. Ingestors should also accept such tables, as there are still numerous active VODataService 1.0 resources in the Registry at the time of writing (this is the reason for the alternative in the table xpath).

The table contains an integer-typed column table_index to disambiguate multiple tables on a single resource. Operators are free to choose the actual values as convenient, although it is recommended to just enumerate the tables in their physical sequence per-resource (but not per schema—table_index MUST be unique within a resource).

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
schema_index
N/A
adql:SMALLINTIndex of the schema this table belongs to, if it belongs to a schema (otherwise NULL).
table_description
xpath:description
adql:VARCHAR(*)A free-text description of the table's contents
table_name
xpath:name
adql:VARCHAR(*)The fully qualified name of the table. This name should include all catalog or schema prefixes needed to distinguish it in a query.
table_index
N/A
adql:INTEGERAn artificial counter for the tables belonging to a resource
table_title
xpath:title
adql:VARCHAR(*)A descriptive, human-interpretable name for the table
table_type
xpath:@type
adql:VARCHAR(*)A name for the role this table plays. Recognized values include "output", indicating this table is output from a query; "base_table", indicating a table whose records represent the main subjects of its schema; and "view", indicating that the table represents a useful combination or subset of other tables. Other values are allowed.
table_utype
xpath:utype
adql:VARCHAR(*)An identifier for a concept in a data model that the data in this table as a whole represent.

This table should have an explicit primary key made up of ivoid and table_index. The ivoid column should be an explicit foreign key into resource. It is recommended to maintain an index on at least the table_description column, ideally one suited for queries with ivo_hasword.

The following columns MUST be lowercased during ingestion: ivoid, table_name, table_type, table_utype. Clients are advised to query the table_description and table_title columns using the the ivo_hasword function.

The table_column Table

The table_column table models the content of VOResource's column element. The table has been renamed to avoid a name clash with the SQL reserved word COLUMN.

Since it is expected that queries for column properties will be fairly common in advanced queries, it is the column table that has the unprefixed versions of common member names (name, ucd, utype, etc).

The flag column contains a concatenation of all values of a column element's flag children, separated by hash characters. Use the ivo_hashlist_has function in queries against flag.

The table_column table also includes information from VODataService's data type concept. VODataService 1.1 includes several type systems (VOTable, ADQL, Simple). The type_system column contains the value of the column's datatype child, with the VODataService XML prefix fixed to vs; hence, this column will contain one of NULL, vs:TAPType, vs:SimpleDataType, and vs:VOTableType.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
table_index
N/A
adql:INTEGERIndex of the table this column belongs to.
name
xpath:name
adql:VARCHAR(*)The name of the column.
ucd
xpath:ucd
adql:VARCHAR(*)A unified content descriptor that describes the scientific content of the parameter.
unit
xpath:unit
adql:VARCHAR(*)The unit associated with all values in the column.
utype
xpath:utype
adql:VARCHAR(*)An identifier for a role in a data model that the data in this column represents.
std
xpath:@std
adql:SMALLINTIf 1, the meaning and use of this parameter is reserved and defined by a standard model. If 0, it represents a database-specific parameter that effectively extends beyond the standard.
datatype
xpath:dataType
adql:VARCHAR(*)The type of the data contained in the column.
extended_schema
xpath:dataType/@extendedSchema
adql:VARCHAR(*)An identifier for the schema that the value given by the extended attribute is drawn from.
extended_type
xpath:dataType/@extendedType
adql:VARCHAR(*)A custom type for the values this column contains.
arraysize
xpath:dataType/@arraysize
adql:VARCHAR(*)The shape of the array that constitutes the value, e.g., 4, *, 4*, 5x4, or 5x*, as specified by VOTable.
delim
xpath:dataType/@delim
adql:VARCHAR(*)The string that is used to delimit elements of an array value when arraysize is not '1'.
type_system
xpath:dataType/@xsi:type
adql:VARCHAR(*)The type system used, as a QName with a canonical prefix; this will ususally be one of vs:simpledatatype, vs:votabletype, and vs:taptype.
flag
xpath:flag
adql:VARCHAR(*)Hash-separated keywords representing traits of the column. Recognized values include "indexed", "primary", and "nullable".
column_description
xpath:description
adql:VARCHAR(*)A free-text description of the column's contents.

The pair ivoid, table_index should be an explicit foreign key into res_table. It is recommended to maintain indexes on at least the column_description, name, ucd, and utype columns, where the index on column_description should ideally be able to handle queries using ivo_hasword.

The following columns MUST be lowercased during ingestion: ivoid, name, ucd, utype, datatype, type_system. Clients are advised to query the description column using the ivo_hasword function, and to query the flag column using the ivo_hashlist_has function.

The interface Table

The interface table subsumes both the vr:Interface and vr:accessURL types from VOResource. The integration of accessURL into the interface table means that an interface in the relational registry can only have one access URL, where in VOResource it can have many. In practice, this particular VOResource capability has not been used by registry record authors. Since access URLs are probably the item most queried for, it seems warranted to save one indirection when querying for them.

This specification deprecates the maxOccurs="unbounded" in the definition of Interface's accessURL child in the XML schema http://www.ivoa.net/xml/VOResource/v1.0; in future versions of VOResource, implementations can expect this to be maxOccurs="1". Meanwhile, implementation behaviour in the presence of multiple access URLs in an interface is undefined.

The table contains an integer-typed column intf_index to disambiguate multiple interfaces of one resource. Operators are free to choose the actual values as convenient, although it is recommended to just enumerate the interfaces in their physical sequence per resource.

The query_type column is a hash-joined list (analoguos to waveband, etc. in the resource table), as the XML schema allows listing up to two request methods.

This table only contains interface elements from within capabilities. Interface elements in StandardsRegExt records are ignored in RegTAP, and they must not be inserted in this table, since doing so would disturb the foreign key from interface into capability. In other words, RegTAP requires every interface to have a parent capability.

Analogous to resource.res_type, the intf_type column contains type names; VOResource extensions can define new types here, but at the time of writing, the following types are mentioned in IVOA-recommended schemata:

vs:paramHTTP
A service invoked via an HTTP query with either form-urlencoded or multipart form-data parameters.
vr:WebBrowser
A (form-based) interface intended to be accessed interactively by a user via a web browser.
vg:OAIHTTP
A standard OAI PMH interface using HTTP queries with form-urlencoded parameters.
vg:OAISOAP
A standard OAI PMH interface using a SOAP Web Service interface.
vr:WebService
A Web Service that is describable by a WSDL document.
Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
cap_index
N/A
adql:SMALLINTThe index of the parent capability.
intf_index
N/A
adql:SMALLINTA running number for the interfaces of a resource.
intf_type
xpath:@xsi:type
adql:VARCHAR(*)The type of the interface (vr:WebBrowser, vs:ParamHTTP, etc).
intf_role
xpath:@role
adql:VARCHAR(*)An identifier for the role the interface plays in the particular capability. If the value is equal to "std" or begins with "std:", then the interface refers to a standard interface defined by the standard referred to by the capability's standardID attribute.
std_version
xpath:@version
adql:VARCHAR(*)The version of a standard interface specification that this interface complies with. When the interface is provided in the context of a Capability element, then the standard being refered to is the one identified by the Capability's standardID element.
query_type
xpath:queryType
adql:VARCHAR(*)Hash-joined list of expected HTTP method (GET or POST) supported by the service.
result_type
xpath:resultType
adql:VARCHAR(*)The MIME type of a document returned in the HTTP response.
wsdl_url
xpath:wsdlURL
adql:VARCHAR(*)The location of the WSDL that describes this Web Service. If NULL, the location can be assumed to be the accessURL with '?wsdl' appended.
url_use
xpath:accessURL/@use
adql:VARCHAR(*)A flag indicating whether this should be interpreted as a base URL ('base'), a full URL ('full'), or a URL to a directory that will produce a listing of files ('dir').
access_url
xpath:accessURL
adql:VARCHAR(*)The URL at which the interface is found.

This table should have the pair ivoid, cap_index as an explicit foreign key into capability, and the pair ivoid, and intf_index as an explicit primary key. Additionally, it is recommended to maintain an index on at least the intf_type column.

The following columns MUST be lowercased during ingestion: ivoid, intf_type, intf_role, std_version, query_type, result_type, url_use. Clients are adviced to query query_type using the the ivo_hashlist_has function.

The intf_param Table

The intf_param table keeps information on the parameters available on interfaces. It is therefore closely related to table_column, but the differences between the two are significant enough to warrant a separation between the two tables. Since the names of common column attributes are used where applicable in both tables (e.g., name, ucd, etc), the two tables cannot be (naturally) joined.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
intf_index
N/A
adql:SMALLINTThe index of the interface this parameter belongs to.
name
xpath:name
adql:VARCHAR(*)The name of the parameter.
ucd
xpath:ucd
adql:VARCHAR(*)A unified content descriptor that describes the scientific content of the parameter.
unit
xpath:unit
adql:VARCHAR(*)The unit associated with all values in the parameter.
utype
xpath:utype
adql:VARCHAR(*)An identifier for a role in a data model that the data in this parameter represents.
std
xpath:@std
adql:SMALLINTIf 1, the meaning and use of this parameter is reserved and defined by a standard model. If 0, it represents a database-specific parameter that effectively extends beyond the standard.
datatype
xpath:dataType
adql:VARCHAR(*)The type of the data contained in the parameter.
extended_schema
xpath:dataType/@extendedSchema
adql:VARCHAR(*)An identifier for the schema that the value given by the extended attribute is drawn from.
extended_type
xpath:dataType/@extendedType
adql:VARCHAR(*)A custom type for the values this parameter contains.
arraysize
xpath:dataType/@arraysize
adql:VARCHAR(*)The shape of the array that constitutes the value, e.g., 4, *, 4*, 5x4, or 5x*, as specified by VOTable.
delim
xpath:dataType/@delim
adql:VARCHAR(*)The string that is used to delimit elements of an array value when arraysize is not '1'.
param_use
xpath:@use
adql:VARCHAR(*)An indication of whether this parameter is required to be provided for the application or service to work properly (one of required, optional, ignored, or NULL).
param_description
xpath:description
adql:VARCHAR(*)A free-text description of the column's contents.

The pair ivoid, intf_index should be an explicit foreign key into interface.

The remaining requirements and conventions are as per section table_table_column where applicable, and param_description taking the role of column_description.

The relationship Table

The relationship element is a slight denormalization of the vr:Relationship type: Whereas in VOResource, a single relationship element can take several IVORNs, in the relational model, the pairs are stored directly. It is straightforward to translate between the two representations in the database ingestor.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
relationship_type
xpath:relationshipType
adql:VARCHAR(*)The named type of relationship; this can be mirror-of, service-for, served-by, derived-from, related-to, or something user-defined.
related_id
xpath:relatedResource/@ivo-id
adql:VARCHAR(*)The URI form of the IVOA identifier for the resource refered to.
related_name
xpath:relatedResource
adql:VARCHAR(*)The name of resource that this resource is related to.

The ivoid column should be an explicit foreign key into the resoure table. You should index at least the related_id column.

The following columns MUST be lowercased during ingestion: ivoid, relationship_type, related_id.

The validation Table

The validation subsumes the vr:validationLevel-typed members of both vr:Resource and vr:Capability.

If the cap_index column in NULL the validation comprises the entire resource. Otherwise, only the referenced capability has been validated.

While it is recommended that harvesters only accept resource records from their originating registries, it is valuable to gather validation results from various sources. Hence, harvesters for the relational registry may choose to obtain validation data from the OAI-PMH endpoints of various registries by not harvesting just for the ivo_managed set and generate rr.validation rows from these records. This can trigger potentially problematic behaviour when the original registry updates its resource record in that naive implementations will lose all third-party validation rows; this may actually be the correct behaviour, since an update of the registry record might very well indicate validation-relevant changes in the underlying services. Implementations are free to handle or ignore validation results as they see fit, and they may add validation results of their own.

For convenicence, we repeat the definition of the validation levels from std:RM:

0
The resource has a description that is stored in a registry. This level does not imply a compliant description.
1
In addition to meeting the level 0 definition, the resource description conforms syntactically to this standard and to the encoding scheme used.
2
In addition to meeting the level 1 definition, the resource description refers to an existing resource that has been demonstrated to be functionally compliant. When the resource is a service, it is considered to exist and to be functionally compliant if use of the Service.InterfaceURL or Service.BaseURL responds without error when used as intended by the resource. If the service is a standard one, it must also demonstrate the response is syntactically compliant with the service standard in order to be considered functionally compliant. If the resource is not a service, then the ReferenceURL must be shown to return a document without error.
3
In addition to meeting the level 2 definition, the resource description has been inspected by a human and judged to comply semantically to this standard as well as meeting any additional minimum quality criteria (e.g., providing values for important but non-required metadata) set by the human inspector. [...]
4
In addition to meeting the level 3 definition, the resource description meets additional quality criteria set by the human inspector and is therefore considered an excellent description of the resource. Consequently, the resource is expected to operate well as part of a VO application or research study.
Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
validated_by
xpath:validationLevel/@validatedBy
adql:VARCHAR(*)The IVOA ID of the registry or organisation that assigned the validation level.
val_level
xpath:validationLevel
adql:SMALLINTA numeric grade describing the quality of the resource description, when applicable, to be used to indicate the confidence an end-user can put in the resource as part of a VO application or research study.
cap_index
N/A
adql:SMALLINTIf non-NULL, the validation only refers to the the capability referenced here.

The ivoid should be an explicit foreign key into resource. Note, however, that ivoid, cap_index is not a foreign key into capability since cap_index may be NULL (in case the validation addresses the entire resource).

The following columns MUST be lowercased during ingestion: ivoid, validated_by.

The res_date Table

The res_date table contains information gathered from vr:Curation's date children.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
date_value
xpath:date
adql:TIMESTAMPA date associated with an event in the life cycle of the resource.
value_role
xpath:date/@role
adql:VARCHAR(*)A string indicating what the date refers to, e.g., created, availability, updated.

The ivoid column should be an explicit foreign key into resource.

The following columns MUST be lowercased during ingestion: ivoid, value_role.

The res_detail Table

The res_detail table is RegTAP's primary means for extensibility as well as a fallback for less-used simple metadata. Conceptually, it stores triples of resource entity references, resource xpaths, and values, where resource entities can be resource records themselves or capabilities. Thus, metadata with values that are either string-valued or sets of strings can be represented in this table.

As long as the metadata that needs to be represented in the relational registry for new VOResource extensions is simple enough, no changes to the schema defined here will be neccessary as these are introduced. Instead, the extension itself simply defines new xpaths to be added in res_detail.

Some complex metadata—tr:languageFeature or vstd:key being examples—cannot be kept in this table. If a representation of such information in the relational registry is required, this standard will need to be changed.

Appendix d_u_list gives a list of resource xpaths from the registry extensions that were recommendations at the time of writing. For the resource xpaths marked with an exclamation mark there, xpath/value pairs MUST be generated whenever the corresponding metadata items are given in a resource record. For the remaining resource xpaths, these pairs should be provided if convenient; they mostly concern test queries and other curation-type information that, while unlikely to be useful to normal users, may be relevant to curation-type clients that, e.g., ascertain the continued operation of services.

Some detail values must be interpreted case-insensitively; this concerns, in particular, IVORNs like the TAP data model type. For other rows—the test queries are immediate examples—, changing the case will likely break the data. In order to avoid having to give and implement case normalization rules by detail xpath, no case normalization is done on detail values at all, and users and clients will have to use the ivo_nocasematch user defined function (see section adqludf) when locating case-insensitive values. For the resource xpaths given in Appendix d_u_list, this concerns all items with xpaths ending in @ivo-id.

Individual ingestors MAY choose to expose additional metadata using other utypes, provided they are formed according to the rules in section vorutypes (this rule is intended to minimize the risk of later clashes).

In addition to the metadata listed in this specification, metadata defined in future IVOA-approved VOResource extensions MUST or SHOULD be present in res_details as the extensions require it.

Name
Utype
TypeDescription
ivoid
xpath:/identifier
adql:VARCHAR(*)The parent resource.
cap_index
N/A
adql:SMALLINTThe index of the parent capability; if NULL the xpath-value pair describes a member of the entire resource.
detail_xpath
N/A
adql:VARCHAR(*)The xpath of the data item
detail_value
N/A
adql:VARCHAR(*)(Atomic) value of the member

The ivoid column should be an explicit foreign key into resource. It is recommended to maintain indexes on at least the columns detail_xpath and detail_value, where the indices on detail_value should ideally work for both direct comparisons and searches using the ivo_nocasematch function.

The following column MUST be lowercased during ingestion: ivoid. Clients are advised to use the ivo_nocasematch function to search in detail_value if the values are to be compared case-insensitively (e.g., all IVORNs).

ADQL User Defined Functions

TAP Servers implementing the ivo://ivoa.net/std/RegTAP/vor data model MUST implement the following three functions in their ADQL 2.0 language, with signatures written as recommended in std:TAPREGEXT:

ivo_nocasematch(value VARCHAR(*), pattern VARCHAR(*))->INTEGER
The function returns 1 if pattern matches value, 0 otherwise. pattern is defined as for the SQL LIKE operator, but the match is performed case-insensitively.
ivo_hasword(haystack VARCHAR(*), needle VARCHAR(*)) -> INTEGER
The function takes two strings and returns 1 if the second is contained in the first one in a "word" sense, i.e., delimited by non-letter characters or the beginning or end of the string, where case is ignored. Additionally, servers MAY employ techniques to improve recall, in particular stemming. Registry clients must hence expect different results from different servers when using ivo_hasword; for such queries trying them on multiple registries may improve recall.
ivo_hashlist_has(haslist VARCHAR(*), item VARCHAR(*))->INTEGER
The function takes two strings; the first is a list of words not containing the hash sign (#), concatenated by hash signs, the second is a word not containing the hash sign. It returns 1 if, compared case-insensitively, the second argument is in the list of words encoded in the first argument. The behaviour for second arguments containing a hash sign is undefined.

Reference implementations of the three functions for the PostgreSQL database system are given in Appendix appPGDefs.

Common Queries to the Relational Registry

This section contains sample queries to the relational registry, mostly contributed as use cases by various members of the IVOA Registry working group. They are intended as an aid in designing relational registry queries, in particular for users new to the data model.

Where these queries locate access URLs of standard services, they contain intf_role='std' conditions. Desktop clients should probably include this to weed out nonstandard interfaces on standard capabilities, which are rare but technically legal; on the other hand, there are fairly standards-compliant interfaces for which the operator neglected to set role="std", and the respective services would be missed with such a condition. Hand-crafted queries can probably typically do without restraining role since most standard capabilities in real-world resources only have one—standard—interface.

TAP accessURLs

Problem: Find all TAP services; return their accessURLs

As the capability type is in rr.capability, whereas the access URL can be found from rr.interface, this requires a join:

SELECT ivoid, access_url 
FROM rr.capability
NATURAL JOIN rr.interface
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND intf_role='std'

Other standard_ids relevant here include:

  • ivo://ivoa.net/std/registry for OAI-PMH services,
  • ivo://ivoa.net/std/sia for SIA services,
  • ivo://ivoa.net/std/conesearch for SCS services, and
  • ivo://ivoa.net/std/ssa for SSA services.

Image Services with Spirals

Problem: Find all SIA services that might have spiral galaxies

This is somewhat tricky since it is probably hard to image a part of the sky guaranteed not to have some, possibly distant, spiral galaxy in it. However, translating the intention into "find all SIA services that mention spiral in either the subject (from rr.res_subject), the description, or the title (which are in rr.resource)", the query would become:

SELECT ivoid, access_url 
FROM rr.capability 
  NATURAL JOIN rr.resource
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.res_subject
WHERE standard_id='ivo://ivoa.net/std/sia'
  AND intf_role='std'
  AND (
    1=ivo_nocasematch(res_subject, '%spiral%')
    OR 1=ivo_hasword(res_description, 'spiral')
    OR 1=ivo_hasword(res_title, 'spiral'))

Infrared Image Services

Problem: Find all SIA services that provide infrared images

The waveband information in rr.resource comes in hash-separated atoms (which can be radio, millimeter, infrared, optical, uv, euv, x-ray, or gamma-ray). For matching those, use the ivo_hashlist_has function as below. The access URL and the service standard come from rr.interface and rr.capability, respectively.

SELECT ivoid, access_url 
FROM rr.capability 
  NATURAL JOIN rr.resource
  NATURAL JOIN rr.interface
WHERE standard_id='ivo://ivoa.net/std/sia'
  AND intf_role='std'
  AND 1=ivo_hashlist_has('infrared', waveband)

Catalogs with Redshifts

Problem: Find all searchable catalogs (i.e., cone search services) that provide a column containing redshifts

Metadata on columns exposed by a service are contained in rr.table_column. Again, this table can be naturally joined with rr.capability and rr.interface:

SELECT ivoid, access_url 
FROM rr.capability 
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.interface 
WHERE standard_id='ivo://ivoa.net/std/conesearch'
  AND intf_role='std'
  AND ucd='src.redshift'

Sometimes you want to match a whole set of ucds. Frequently the simple regular expressions of SQL will help, as in AND ucd LIKE 'pos.parallax%' . When that is not enough, use boolean OR expressions

Names from an Authority

Problem: Find all the resources published by a certain authority

An "authority" within the VO is something that hands out identifiers. You can tell what authority a record came from by looking at the "host part" of the IVO identifier, most naturally obtained from rr.resource. Since ADQL cannot actually parse URIs, we make do with simple string matching:

SELECT ivoid 
FROM rr.resource
WHERE ivoid LIKE 'ivo://org.gavo.dc%'

Records Published by X

Problem: What registry records are there from a given publisher?

This uses the rr.res_role table both match names (in this case, a publisher that has "gavo" in its name) and to ascertain the named entity actually publishes the resource (rather than, e.g., just being the contact in case of trouble). The result is a list of ivoids in this case. You could join this with any other table in the relational registry to find out more about these services.

SELECT ivoid 
FROM rr.res_role
WHERE 1=ivo_nocasematch(role_name, '%gavo%')
  AND base_role='publisher'

or, if the publisher actually gives its ivo-id in the registry records,

SELECT ivoid 
FROM rr.res_role
WHERE role_ivoid='ivo://ned.ipac/ned'
  AND base_role='publisher'

Records from Registry

Problem: What registry records are there originating from registry X?

This is mainly a query interesting for registry maintainers. Sill, it is a nice example for joining with the rr.res_detail table, in this case to first get a list of all authorities managed by the CDS registry.

SELECT ivoid FROM rr.resource
RIGHT OUTER JOIN (
  SELECT 'ivo://' || detail_value || '%' AS pat FROM rr.res_detail
  WHERE detail_xpath='/managedAuthority' 
    AND ivoid='ivo://cds.vizier/registry') 
  AS authpatterns
ON (resource.ivoid LIKE authpatterns.pat)

Locate RegTAP services

Problem: Find all TAP endpoints offering the relational registry

This is the discovery query for RegTAP services themselves; note how this combines rr.res_detail pairs with rr.capability and rr.interface to locate the desired protocol endpoints.

SELECT access_url
FROM rr.interface
NATURAL JOIN rr.capability
NATURAL JOIN rr.res_detail
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND intf_role='std'
  AND detail_xpath='/capability/dataModel/@ivo-id'
  AND 1=ivo_nocasematch(detail_value, 'ivo://ivoa.net/std/regtap/vor')

TAP with Physics

Problem: Find all TAP services exposing a table with certain features

"Certain features" could be "have some word in their description and having a column with a certain UCD". Either way, this kind of query fairly invariably combines the usual rr.capability and rr.interface for service location with rr.table_column for the column metadata and rr.res_table for properties of tables.

SELECT ivoid, access_url, name, ucd, column_description
FROM rr.capability 
  NATURAL JOIN rr.interface
  NATURAL JOIN rr.table_column
  NATURAL JOIN rr.res_table
WHERE standard_id='ivo://ivoa.net/std/tap'
  AND intf_role='std'
  AND 1=ivo_hasword(table_description, 'quasar')
  AND ucd='phot.mag;em.opt.v'

Theoretical SSA

Problem: Find all SSAP services that provide theoretical spectra

The metadata required to solve this problem is found in the SSAP registry extension and is thus kept in rr.res_detail:

SELECT access_url 
FROM rr.res_detail 
  NATURAL JOIN rr.capability 
  NATURAL JOIN rr.interface 
WHERE detail_xpath='/capability/dataSource' 
  AND intf_role='std'
  AND standard_id='ivo://ivoa.net/std/ssa'
  AND detail_value='theory'

Find Contact Persons

Problem: The service at http://dc.zah.uni-heidelberg.de/__system__/tap/run/tap is down, who can fix it?

This uses the rr.res_role table and returns all information on it based on the IVORN of a service that in turn was obtained from rr.interface. You could restrict to the actual technical contact person by requiring base_role='contact'.

SELECT DISTINCT base_role, role_name, email 
FROM rr.res_role 
  NATURAL JOIN rr.interface 
WHERE access_url='http://dc.zah.uni-heidelberg.de/__system__/tap/run/tap'

Related Capabilities

Problem: Get the capabilities of all services serving a specific resource (typically, a data collection)

In the VO, data providers can register pure data collections without access options (or just furnished with a link to a download). They can then declare that their data can be "served-by" some other resource, typically a TAP service or some collective service for a number of instruments. To locate the access options to the data itself, inspect rr.relationship and use it to select records from rr.capability.

SELECT * 
FROM rr.relationship AS a
  JOIN rr.capability AS b 
    ON (a.related_id=b.ivoid) 
WHERE relationship_type='served-by'

XPaths for res_details

This appendix defines the of the res_details table (see section table_res_detail for details) by giving xpaths for which xpath/value pairs MUST (where marked with an exclamation mark) or SHOULD be given if the corresponding data is present in the resource records. This list is normative for metadata defined in IVOA recommendations current as of the publication of this document (see section rolewithinivoa). As laid down in section table_res_detail, new VOResource extensions or new versions of existing VOResource extensions may amend this list.

The xpaths are suffient for locating the respective metdata as per section vorutypes. With the explanations we give the canonical prefixes for the XML namespaces from which the items originate, which is where further information can be found.

/accessURL (!)
For data collection resources, this is the URL that can be used to download the data contained. Do not enter accessURLs from interfaces into res_detail. (vs).
/capability/executionDuration/hard
The hard run time limit, given in seconds (tr).
/capability/complianceLevel
The category indicating the level to which this instance complies with the SSA standard (ssap).
/capability/creationType (!)
The category that describes the process used to produce the dataset; one of archival, cutout, filtered, mosaic, projection, specialExtraction, catalogExtraction (ssap).
/capability/dataModel (!)
The short, human-readable name of a data model supported by a TAP service; for most applications, clients should rather constrain /capability/dataModel/@ivo-id (tr).
/capability/dataModel/@ivo-id (!)
The IVORN of the data model supported by a TAP service (tr).
/capability/dataSource (!)
The category specifying where the data originally came from; one of survey, pointed, custom, theory, artificial (ssap).
/capability/defaultMaxRecords (!)
The largest number of records that the service will return when the MAXREC parameter is not specified in the query input (ssap).
/capability/executionDuration/default
The run time limit for newly-created jobs, given in seconds (tr).
/capability/imageServiceType (!)
The class of image service: Cutout, Mosaic, Atlas, Pointed (sia).
/capability/language/name (!)
A short, human-readable name of a language understood by the TAP service (tr).
/capability/language/version/@ivo-id (!)
The IVORN of a language supported by a TAP service (tr).
/capability/maxAperture
The largest aperture that can be supported upon request via the APERTURE input parameter by a service that supports the special extraction creation method (ssap).
/capability/maxFileSize (!)
The maximum image file size in bytes (sia).
/capability/maxImageExtent/lat
The maximum size in the latitude (Dec.) direction (sia).
/capability/maxImageExtent/long
The maximum size in the longitude (R.A.) direction (sia).
/capability/maxImageSize/lat
The image size in the latitude (Dec.) direction in pixels (sia-1.0).
/capability/maxImageSize/long
The image size in the longitude (R.A.) direction in pixels (sia-1.0).
/capability/maxImageSize
A measure of the largest image the service can produce given as the maximum number of pixels along the first or second axes. (sia).
/capability/maxQueryRegionSize/lat
The maximum size in the latitude (Dec.) direction (sia).
/capability/maxQueryRegionSize/long
The maximum size in the longitude (R.A.) direction (sia).
/capability/maxRecords (!)
The largest number of items (records, rows, etc) that the service will return (cs, sia, vg, ssap).
/capability/maxSearchRadius (!)
The largest search radius, in degrees, that will be accepted by the service without returning an error condition. Not providing this element or specifying a value of 180 indicates that there is no restriction. (ssap)
/capability/maxSR (!)
The largest search radius of a cone search service (cs).
/capability/outputFormat/@ivo-id (!)
An IVORN of an output format (tr).
/capability/outputFormat/alias
A short, mnemonic identifier for a service's output format (tr).
/capability/outputFormat/mime (!)
The MIME type of an output format (tr).
/capability/outputLimit/default
The maximal output size for newly-created jobs (tr).
/capability/outputLimit/default/@unit
The unit (rows/bytes) in which the service's default output limit is given (tr).
/capability/outputLimit/hard
The hard limit of a service's output size (tr).
/capability/outputLimit/hard/@unit
The unit of this service's hard output limit (tr).
/capability/retentionPeriod/default
The default time between job creation and removal on this service, given in seconds (tr).
/capability/retentionPeriod/hard
The hard limit for the retention time of jobs on this services (tr).
/capability/supportedFrame (!)
The STC name for a world coordinate system frame supported by this service (ssap).
/capability/testQuery/catalog
The catalog to query (cs).
/capability/testQuery/dec
Declination in a test query (cs)
/capability/testQuery/extras
Any extra (non-standard) parameters that must be provided (apart from what is part of base URL given by the accessURL element; cs, sia).
/capability/testQuery/pos/lat
Declination for a SIA test query (sia).
/capability/testQuery/pos/lat
The Declination of the center of the search position in decimal degrees (ssap).
/capability/testQuery/pos/long
The Right Ascension of the center of the search position in decimal degrees (ssap).
/capability/testQuery/pos/refframe
A coordinate system reference frame name for a test query. If not provided, ICRS is assumed (ssap).
/capability/testQuery/pos/refframe
The coordinate system reference frame name indicating the frame to assume for the given position (empty means ICRS; ssap).
/capability/testQuery/queryDataCmd
Fully specified test query formatted as an URL argument list in the syntax specified by the SSA standard. The list must exclude the REQUEST argument (ssap).
/capability/testQuery/ra
Right ascension in a test query (cs)
/capability/testQuery/size
The size of the search radius in an SSA search query (ssap).
/capability/testQuery/size/lat
Region size for a SIA test query in declination (sia).
/capability/testQuery/size/long
Region size for a SIA test query in RA (sia).
/capability/testQuery/sr
Search radius of a cone search service's test query (cs).
/capability/testQuery/verb
Verbosity of a service's test query (cs, sia).
/capability/uploadLimit/default
An advisory size above which user agents should reconfirm uploads to this service (tr).
/capability/uploadLimit/default/@unit
The unit of the limit specified (tr).
/capability/uploadLimit/hard
Hard limit for the size of uploads on this service (tr).
/capability/uploadLimit/hard/@unit
The unit of the limit specified (tr).
/capability/verbosity (!)
true if the service supports the VERB keyword; false, otherwise (cs).
/coverage/footprint (!)
A URL of a footprint service for retrieving precise and up-to-date description of coverage (vs).
/coverage/footprint/@ivo-id (!)
The URI form of the IVOA identifier for the service describing the capability refered to by this element (vs).
/deprecated (!)
A sentinel that all versions of the referenced standard are deprectaed. The value is a human-readable explanation for the designation (vstd).
/endorsedVersion (!)
A version of a standard that is recommended for use (vstd).
/facility (!)
The observatory or facility used to collect the data contained or managed by this resource (vs).
/format (!)
The physical or digital manifestation of the information supported by a (DataCollection) resource. MIME types should be used for network-retrievable, digital data, non-MIME type values are used for media that cannot be retrieved over the network (vs).
/format/@isMIMEType
If true, then an accompanying /format item is a MIME Type. Within res_details, this does not work for services that give more than one format; since furthermore the literal of vs:Format allows a good guess whether or not it is a MIME type, this does not appear a dramatic limitation (vs).
/instrument (!)
The instrument used to collect the data contained or managed by a resource (vr).
/instrument/@ivo-id (!)
IVORN of the instrument used to collect the data contained or managed by a resource (vr).
/managedAuthority (!)
An authority identifier managed by a registry (vg).
/managingOrg (!)
The organization that manages or owns this authority (vr).
/schema/@namespace (!)
An identifier for a schema described by a standard (vstd).

Note that the representation of StandardsRegExt's status and use attributes as well as its key would require sequences of complex objects, which is impossible using res_detail. Hence, the respective metadata is not queriable within the relational registry. Similarly, complex TAPRegExt metadata on languages, user defined functions, and the like cannot be represented in this table. Since these pieces of metadata do not seem relevant to resource discovery and are geared towards other uses of the respective VOResource extensions, a more complex model does not seem warrented just so they can be exposed.

The Extra UDFs in PL/pgSQL

What follows are (non-normative) implementations of the three user defined functions specificed in section adqludf in the SQL dialect of PostgreSQL (e.g., doc:Postgres92).

Note that PostgreSQL cannot use fulltext indices on the respective columns if the functions are defined in this way for (fairly subtle) reasons connected with NULL value handling. While workarounds are conceivable, they come with potentially unwelcome side effects, at least as of PostgreSQL 9.x. It is therefore recommended to replace expressions involving the functions given here with simple boolean expressions in the ADQL translation layer whenever possible.

Implementation notes

This appendix contains a set of constraints and recommendations for implementing the relational registry model on actual RDBMSes, originating partly from an analysis of the data content of the VO registry in February 2013, partly from a consideration of limits in various XML schema documents. This concerns, in particular, minimum lengths for columns of type adql:VARCHAR. Implementations MUST NOT truncate strings of length equal to or smaller than the minimal lengths given here; the limitations are not, however, upper limits, and indeed, when choosing an implementation strategy it is in general preferable to not impose artificial length restrictions, in particular if no performance penalty is incurred.

These notes can also be useful with a view to preparing user interfaces for the relational registry, since input forms and similar user interface elements invariably have limited space; the limits here give reasonable defaults for the amount of data that should minimally be manipulatable by a user with reasonable effort.

The ivoid field present in every table of this specification merits special consideration, on the one hand due to its frequency, but also since other IVOA identifiers stored in the relational registry should probably be treated analoguously. Given that IVORNs in the 2013 data fields have a maximum length of roughly 100 characters, we propose that a maximum length of 255 should be sufficient even when taking possible fragment identifiers into account.

Field type Datatype suggested Pertinent Fields
ivo-id VARCHAR(255) {all_tables}.ivoid
res_role.role_ivoid
capability.standard_id
relationship.related_id
validation.validated_by
res_detail.detail_value for several values of detail_xpath

The relational registry also contains some date-time values. The most straightforward implementation certainly is to use SQL timestamps. Other relational registry fields that straightforwardly map to common SQL types are those that require numeric values, viz., REAL, SMALLINT, and INTEGER. The following table summarizes these:

Field type Datatype Pertinent Fields
floating point REAL resource.region_of_regard
small integer SMALLINT capability.cap_index
res_schema.schema_index
res_table.schema_index
table_column.std
interface.cap_index
interface.intf_index
intf_param.intf_index
intf_param.std
validation.level
validation.cap_index
res_detail.cap_index
res_table.table_index
table_column.table_index

The fields containing Utypes, UCDs, and Units are treated in parallel here. The 2013 registry data indicates a length of 128 characters is sufficient for real-world purposes; actually, at least UCDs and Units could of course grow without limitations, but it seems unreasonable anything longer than a typical line might actually be useful. As far as utypes are concerned, we expect those to shrink rather than grow with new standardization efforts.

In this category, we also summarize our resource xpaths. With the conventions laid down in this document, it seems unlikely that future extensions to VOResource will be so deeply nested that 128 characters will not be sufficient for their resource xpaths.

Field type Datatype suggested Pertinent Fields
Utype(s)
UCD(s)
Units
xpaths
VARCHAR(128) res_schema.schema_utype
res_table.table_utype
table_column.ucd
table_column.unit
table_column.utype
intf_param.ucd
intf_param.unit
intf_param.utype
res_detail.detail_xpath

The relational registry further has an e-mail field, for which we chose 128 characters as a reasonable upper limit (based on a real maximum of 40 characters in the 2013 data). There are furthermore URLs (in addition to access and reference URLs, there are also URLs for the WSDL of SOAP services and logos for roles). Due to the importance of in particular the access URLs we strongly recommend to use non-truncating types here. Empirically, there are access URLs of up to 224 characters in 2013 (reference URLs are less critical at a maximum of 96 characters). Expecting that with REST-based services, URL lengths will probably rather tend down than up, we still permit truncation at 255 characters.

Field type Datatype suggested Pertinent Fields
e-mail VARCHAR(128) res_role.email
URLs VARCHAR(255) resource.reference_url
res_role.logo
interface.wsdl_url
interface.access_url

The next group of columns comprises those that have values taken from a controlled or finite vocabulary. Trying to simplify the view, lengths in the form of powers of two are considered.

Field type Datatype suggested Pertinent Fields
predefined
values
VARCHAR(255) resource.content_level
resource.content_type
VARCHAR(64) resource.rights
resource.waveband
VARCHAR(32) resource.res_type
capability.cap_type
res_table.table_type
table_column.flag
table_column.datatype
table_column.extended_schema
table_column.extended_type
table_column.type_system
interface.result_type
intf_param.datatype
intf_param.extended_schema
intf_param.extended_type
VARCHAR(4) interface.query_type
interface.url_use

Finally, there are the fields that actually contain what is basically free text. For these, we have made a choice from reasonable powers of two lengths considering the actual lengths in the 2013 registry data. A special case are fields that either contain natural language text (the descriptions) or those that have grown without limit historically (resource.creator_seq, and, giving in to current abuses discussed above, res_role.role_name). For all such fields, no upper limit can sensibly be defined. However, since certain DBMSes (e.g., MySQL older than version 5.6) cannot index fields with a TEXT datatype and thus using VARCHAR may be necessary at least for frequenly-searched fields, we give the maximal length of the fields in the 2013 registry in parentheses after the column designations for the TEXT datatype:

Field type Datatype suggested Interested Fields
free string values TEXT resource.res_description (7801)
resource.creator_seq (712)
res_role.role_name (712)
res_schema.schema_description (934)
res_table.table_description (934)
table_column.description (3735)
intf_param.description (347) capability.cap_description (100)
VARCHAR(255) resource.res_title
res_role.address
res_schema.schema_title
res_table.table_title
relationship.related_name
res_detail.detail_value
VARCHAR(128) resource.version
resource.source_value
res_subject.res_subject
VARCHAR(64) res_table.table_name
table_column.name
intf_param.name
VARCHAR(32) resource.source_format
res_role.telephone
res_schema.schema_name
interface.intf_type
interface.intf_role
relationship_type
res_date.value_role
VARCHAR(16) resource.short_name
table_column.arraysize
interface.std_version
intf_param.use

XSLT to enumerate RegTAP XPaths

The (non-normative) following XSL stylesheet emits RegTAP xpaths as discussed in section vorutypes when applied to a VOResource extension. Considering readability and limitations of XSLT, this is not fully general -- if VOResource extensions started to inherit from other extensions' subclasses of Resource, Capability, or Interface, this stylesheet will need to be extended.

Still, it is a useful tool when evaluating how to map a given extension to RegTAP.

Changes from Previous Versions

Changes from WD-20130909

  • Updates for REC of SimpleDALRegExt, which contains versions 1.1 of both the sia and the ssap XML schemas; thisi means there are now additional namespace URIs to take into accound, as well as new res_detail xpaths /capability/maxSearchRadius, /capability/maxImageSize, and /testQuery/pos/refframe.
  • Reinstated makeutypes.xslt script; it's useful even with the new xpaths.

Changes from WD-20130411

  • The final utype reform: most of our ex-utype strings aren't called utypes any more, they're fairly plain xpaths. Consequently, res_detail.detail_utype has been renamed res_detail.detail_xpath.
  • Renamed some columns and the subject table to relieve the need of quoting in MS SQL Server (or, in the case or use_param, maintain consistency after the renaming):
    OldNew
    resource.versionresource.res_version
    res_role.addressres_role.street_address
    subject.*res_subject.*
    res_subject.res_subjectres_subject.res_subject
    table_column.descriptiontable_column.column_description
    intf_param.descriptionintf_param.param_description
    intf_param.use_paramintf_param.param_use
    validation.levelvalidation.val_level
  • rr.intf_param grew the arraysize and delim columns that before accidentally were only present in rr.table_column.
  • Added warnings about having to match case-insensitively in res_detail.detail_value for IVORN-valued rows.
  • Restored the foreign key from interface to capability. Mandating ignoring interface elements from StandardsRegExt records really is the lesser evil.
  • resource.region_of_regard now must have unit metadata declared.
  • We now explicitely deprecate multiple access urls per interface and announce that single access URLs will be enforced in future VOResource versions.

References