I nternational
V irtual
O bservatory
A lliance
Registries provide a mechanism with which VO applications can discover and select resources—first and foremost data and services—that are relevant for a particular scientific problem. This specification defines an interface for searching this resource metadata based on the IVOA's TAP protocol. It specifies a set of tables that comprise a useful subset of the information contained in the registry records, as well as the table's data content in terms of the XML VOResource data model. The general design of the system is geared towards allowing easy authoring of queries.
A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.
This document has been developed in part with support from the German Astronomical Virtual Observatory (BMBF Bewilligungsnummer 05A08VHA).
The words "MUST ", "SHOULD", "MAY", "RECOMMENDED", and "OPTIONAL" (in upper or lower case) used in this document are to be interpreted as described in the IETF standard RFC 2119 std:RFC2119.
In the Virtual Observatory (VO), registries provide a means for discovering useful resources, i.e., data and services. Individual publishers offer the descriptions for their resources ("resource records") in publishing registries. At the time of writing, there are roughly 14000 such resource records active within the VO, originating from about 20 publishing registries.
The protocol spoken by these publishing registries, OAI-PMH only allows restricting queries by modification date and identifier and is hence not suitable for data discovery. Even if it were, data discovery would at least be fairly time consuming if each client had to query dozens or, potentially, hundreds of publishing registries.
To enable efficient data discovery nevertheless, there are services harvesting the resource records from the publishing registries and offering rich query languages. The IVOA Registry Interfaces specification std:RI1 defined, among other aspects of the VO registry system, such an interface using SOAP and an early draft of an XML-based query language.
This document provides an update to the query ("searchable registry") part of that specifiation (chapter 2), aimed towards usage with current VO standards, in particular TAP std:TAP and ADQL std:ADQL. It follows the model of ObsCore std:OBSCORE of defining a representation of a data model within a relational database. In this case, the data model is a simplification of the VO's resource metadata interchange representation, the VOResource XML format std:VOR. The simplification yields 13 tables. For each table, TAP_SCHEMA metadata is given together with rules for how to fill these tables from VOResource-serialized metadata records as well as conditions on foreign keys and recommendations on indices.
The resulting set of tables has a modest size by today's standards,
but is still non-trivial. The largest table, table_column
,
has about half a million rows at the time of writing.
The architecture laid out here allows client applications to perform "canned" queries on behalf of their users as well as complex queries formulated directly by advanced users, using the same TAP clients they employ to query astronomical data servers.
The set of tables and their metadata specified here, together with the mapping from VOResource ("ingestion rules") is collectively called "relational registry schema" or relational registry for short.
The specificiation additionally talks about how to embed these into TAP services, gives additional user defined functions, talks about discovering compliant services, etc. Since all this is tightly coupled to the "relational registry" as defined above, we do not introduce a new term for it. Hence, the entire standard is now known as "IVOA relational registry schema".
Historically, we intended to follow the ObsCore/ObsTAP model and talked about RegTAP. As changing this acronym is technically painful (e.g., identifiers and URLs would need to be adapted), we kept it after the distinction between the schema and its mapping on the one hand and the its combination with a TAP service on the other went away. This means that the official acronym for "IVOA relational registry schema" is RegTAP. This aesthetic defect seems preferable to causing actual incompatibilities.
This specification directly relates to other VO standards in the following ways:
res_detail
table using
xpaths into the registry documents.
This document should not in general need updates
for registry extension updates. For completeness, we note the
versions current as of this specification: SimpleDALRegExt 1.0
std:DALREGEXT,
StandardsRegExt 1.0 std:STDREGEXT, TAPRegExt 1.0
std:TAPREGEXT, Registry Interfaces 1.0 std:RI1.This standard also relates to other IVOA standards:
rr
schema making up the relational registry, that gives coverages of resources on the spatial,
temporal, spectral and redshift axes. These will be defined in a
separate document.In the design of the tables, the goal has been to preserve as much of VOResource and its extensions, including the element names, as possible.
An overriding consideration has been, however, to make natural joins
between the tables behave usefully, i.e., to actually combine rows
relevant to the same entity (resource, table, capability, etc.).
To disambiguate column names that name the same concept on different
entities (name, description, etc.) and would therefore interfere with
the natural join, a shortened tag for the source object
is prepended to the name. Thus, a description
element within
a resource ends up in a column named
res_description
, whereas the same element from a
capability becomes cap_description
.
We further renamed some columns with respect to their VOResource counterparts to avoid clashes with reserved words in popular database management systems. The alternatives would have been to either recommend quoting them or burden ADQL translation layers with the task of automatically converting them to delimited identifiers. Both alternatives seemed either confusing or impacting robustness.
Furthermore, camel-case identifiers have been converted to
underscore-separated ones (thus, standardId
becomes
standard_id
) to have all-lowercase column names; this saves
potential headache if users choose to reference the columns using SQL
delimited identifiers. Dashes in VOResource attribute names are
converted to underscores, too, with the exception of
ivo-id
, which is just rendered ivoid
.
Another design goal of this specification has been that different registries
operating on the same set of registry records will return identical responses
for most queries; hence, we try to avoid relying on features left not
defined by ADQL (e.g., the case sensitivity of string matches). However,
with a view to non-uniform support for information retrieval-type
queries in database systems, the ivo_hasword
user defined
function is not fully specified here; queries employing it may yield
different results on different implementations, even if they operate on
the same set of resource records.
ADQL has no operators for case-insensitive matching of strings. To allow for robust and straightforward queries nevertheless, most columns containing values not usually intended for display are required to be converted to lower case; in the table descriptions below, there are explicit requirements on case normalization near the end of each section. This is particularly important when the entities to be compared are defined to be case-insensitive (e.g., ucds, IVORNs). Client software that can inspect user-provided arguments (e.g., when filling template queries) should also convert the respective fields to lower case.
This conversion MUST cover all ASCII letters, i.e., A through Z. The conversion SHOULD take place according to algorithm R2 in section 3.13, "Default Case Algorithms" of std:UNICODE. In practice, non-ASCII characters are not expected to occur in columns for which lowercasing is required.
Analogously, case-insensitive comparisons as required by some of the user-defined functions for the relational registry MUST compare the ASCII letters without regard for case. They SHOULD compare according to D144 in std:UNICODE.
VOResource and its extensions make use of XML QNames in attribute
values, most prominently in xsi:type
. The standard
representation of these QNames in XML instance documents makes use of an
abbreviated notation employing prefixes declared using the xmlns mechanism
as discussed in std:XMLNS. Within an ADQL-exposed database, no
standard mechanism exists that could provide a similar mapping of URLs
and abbreviations. The correct way to handle this problem would thus be
to have full QNames in the database (e.g.,
{http://www.ivoa.net/xml/ConeSearch/v1.0}ConeSearch
for the
canonical cs:ConeSearch
). This, of course, would make for
excessively tedious and error-prone querying.
For various reasons, VOResource authors have always been encouraged to use a set of "standard" prefixes. This allows an easy and, to users, unsurprising exit from the problem of the missing xmlns declarations: For the representation of QNames within the database, these recommended prefixes are now mandatory. Future VOResource extensions define their mandatory prefixes themselves.
Following the existing practice, minor version changes are not in
general reflected in the recommended prefixes—e.g., both VODataService
1.0 and VODataService 1.1 use vs:
. For reference, here is
a table of XML namespaces and prefixes for namespaces relevant to this
specification:
oai | http://www.openarchives.org/OAI/2.0/ |
ri | http://www.ivoa.net/xml/RegistryInterface/v1.0 |
vg | http://www.ivoa.net/xml/VORegistry/v1.0 |
vr | http://www.ivoa.net/xml/VOResource/v1.0 |
dc | http://purl.org/dc/elements/1.1/ |
vs | http://www.ivoa.net/xml/VODataService/v1.0 |
vs | http://www.ivoa.net/xml/VODataService/v1.1 |
cs | http://www.ivoa.net/xml/ConeSearch/v1.0 |
sia | http://www.ivoa.net/xml/SIA/v1.0 |
sia | http://www.ivoa.net/xml/SIA/v1.1 |
ssap | http://www.ivoa.net/xml/SSA/v1.0 |
ssap | http://www.ivoa.net/xml/SSA/v1.1 |
tr | http://www.ivoa.net/xml/TAPRegExt/v1.0 |
vstd | http://www.ivoa.net/xml/StandardsRegExt/v1.0 |
xsi | http://www.w3.org/2001/XMLSchema-instance |
This specification piggybacks on top of the well-established VOResource standard. This means that it does not define a full data model, but rather something like a reasonably query-friendly view of a partial representation of one. The link between the actual data model, i.e., VOResource and its extensions as defined by the XML Schema documents, and the fields within this database schema, is provided by xpaths, which are here slightly abbreviated for both brevity and generality.
All xpaths given in this specifiation are assumed to be relative to
the enclosing vr:Resource
element; these are called
"resource xpaths" in the following. If resource xpaths are to be
applied to an OAI-PMH response, the xpath expression
*/*/*/oai:metadata/ri:Resource
must be prepended to it,
with the canonical prefixes from section qnameatts implied. The resource xpaths themselves
largely do not need explicit namespaces since VOResource elements are by
default unqualified. Elements and attributes from non-VOResource
schemata have the canonical namespace prefixes, which in this
specification only applies to several xsi:type
attribute
names.
Some tables draw data from several different VOResource elements.
For those, we have introduced an extended syntax with additional
metacharacters (, ), and |, where the vertical bar denotes an
alternative and the parentheses grouping. For instance, our notation
/(tableset/schema/|)table/
corresponds to the two xpaths
/table
and /tableset/schema/table
.
Within the Virtual Observatory, the link between data models and concrete data representations is usually made using utypes. Since VOResource is directly modelled in XML Schema, the choice of XPath as the bridging formalism is compelling, though, and utypes themselves are not necessary for the operation of a TAP service containing the relational registry. TAP, however, offers fields for utypes in its TAP_SCHEMA. Since they are not otherwise required, this specification takes the liberty of using them to denote the xpaths.
In the metadata for tables and columns below, the utypes given are
obtained from the xpaths by simply prepending them with
xpath:
. To avoid repetition, we allow relative xpaths:
when the xpath in a column utype does not start with a slash, it is
understood that is must be concatenated with the table utype to obtain
the full xpath.
For illustration, if a table has a utype of
xpath:/capability/interface/
and a column within this table
has a utype of xpath:accessURL/@use
, the resulting resource
xpath would come out to be
/capability/interface/accessURL/@use
; to match this in an
OAI-PMH response, the xpath would be
*/*/*/oai:metadata/ri:Resource/capability/interface/accessURL/@use
.
While clients MUST NOT rely on these utypes in either TAP_SCHEMA or the metadata delievered with TAP replies, service operators SHOULD provide them, in particular when there are local extensions to the relational registry in their services. Giving xpaths for extra columns and tables helps human interpretation of them at least when the defining schema files are available.
Resource xpaths are also used in the res_details table (section table_res_detail). These are normal xpaths (although again understood relative to the enclosing Resource element), which, in particular, means that they are case sensitive. On the other hand, to clients they are simply opaque strings, i.e., clients cannot just search for any xpaths into VOResource within res_details.
The relational registry can be part of any TAP service. The presence
of the tables discussed here is indicated by declaring support for the
data model Registry 1.0
with the IVORN
ivo://ivoa.net/std/RegTAP#1.0
in the service's
capabilities (see std:TAPREGEXT). Technically, this
entails adding
Registry 1.0 ]]>
as a child of the capability element with the type
{http://www.ivoa.net/xml/TAPRegExt/v1.0}TableAccess
.
A client that knows the access URL of one TAP service containing a relational registry can thus discover all other services exposing one. The "Find all TAP endpoints offering the relational registry" example in section sample_queries shows a query that does this.
It is recommended to additionally register a relational registry as a VODataService data collection and connect this and the TAP services with a pair of service-for/served-by relations. This allows giving more metadata on the data content like, for example, the frequency of harvesting.
Services implementing this data model that do not (strive to) offer the full data content of the VO registry (like domain-specific registries or experimental systems) MUST NOT declare the above data model in order to not invite clients expecting the VO registry to send queries to it.
In the following table descriptions, the names of tables and columns
are normative and MUST be used as given, and all-lowercase. On the
values in the utype
columns within TAP_SCHEMA
,
see section vorutypes. All columns defined in
this document MUST have a 1 in the std
column of the
TAP_SCHEMA.table_columns
table. Unless otherwise
specified, all values of ucd and unit in
TAP_SCHEMA.table_columns
are NULL for columns defined here.
Descriptions are not normative (as given, they usually are taken from
the schema files of VOResource and its extensions with slight
redaction). Registry operators MAY provide additional columns in their
tables, but they MUST provide all columns given in this
specification.
All table descriptions start out with brief remarks on the relationship of the table to the VOResource XML data model. Then, the columns are described in a selection of TAP_SCHEMA metadata. For each table, recommendations on explicit primary and foreign keys as well as indexed columns are given, where it is understood that primary and foreign keys are already indexed in order to allow efficient joins; these parts are not normative, but operators should ensure decent performance for queries assuming the presence of the given indices and relationships. Finally, lowercasing requirements (normative) are given.
The following tables make up the data model
ivo://ivoa.net/std/RegTAP#1.0
:
Table Utype | Description |
---|---|
rr.capability xpath:/capability/ | Pieces of behaviour of a resource. |
rr.interface xpath:/capability/interface/ | Information on access modes of a capability. |
rr.intf_param xpath:/capability/interface/param/ | Input parameters for services. |
rr.relationship xpath:/content/relationship/ | Relationships between resources, e.g., mirroring, derivation, but also providing access to data within a resource. |
rr.res_date xpath:/curation/ | A date associated with an event in the life cycle of the resource. This could be creation or update. The role column can be used to clarify. |
rr.res_detail N/A | XPath-value pairs for members of resource or capability and their derivations that are less used and/or from VOResource extensions. The pairs refer to a resource if cap_index is NULL, to the referenced capability otherwise. |
rr.res_role N/A | Entities, i.e., persons or organizations, operating on resources: creators, contacts, publishers, contributors. |
rr.res_schema xpath:/tableset/schema/ | Sets of tables related to resources. |
rr.res_subject xpath:/content/ | Topics, object types, or other descriptive keywords about the resource. |
rr.res_table xpath:/(tableset/schema/|)table/ | (Relational) tables that are part of schemata or resources. |
rr.resource xpath:/ | The resources, i.e., services, data collections, organizations, etc., present in this registry. |
rr.table_column xpath:/(tableset/schema|)/table/column/ | Metadata on columns of tables pertaining to resources. |
rr.validation xpath:/(capability|) | Validation levels for resources and capabilities. |
The rr.resource
table contains most atomic members of
vr:Resource
that have a 1:1 relationship to the resource
itself. Members of derived types are, in general, handled through
the res_detail
table even if 1:1 (see table_res_detail). The
content_level, content_type, waveband, and rights members are 1:n but still appear
here. If there are multiple values, they are concatenated with hash
characters (#). Use the ivo_hashlist_has
ADQL extension
function to check for the presence of a single value. This convention
saves on tables while not complicating common queries significantly.
A local addition is the creator_seq
column. It contains
all content of the name
elements below a resource element
curation
child's creator
children,
concatenated with a sequence of semicolon and blank characters (";
"
). The individual parts must be concatenated preserving the
sequence of the XML elements. The resulting string is primarily intended
for display purposes ("author list") and is hence not case-normalized.
It was added since the equivalent of an author list is expected to be
a metadatum that is displayed fairly frequently, but also since the
sequence of author names is generally considered significant. The
res_role
table, on the other hand, does not allow
recovering the input sequence of the rows belonging to one resource.
The res_type
column reflects the lower-cased value of
the ri:Resource
element's xsi:type
attribute,
where the canonical prefixes are used. While custom or experimental
VOResource extensions may lead to more or less arbitrary strings in that
column, VOResource and its IVOA-recommended extensions at the time of
writing define the following values for res_type
:
xsi:type="vg:Harvest"
. Old,
RegistryInterface 1-compliant registries also use this type with a
capability of type vg:Search
. The relational registry as
specified here, while superceding these old vg:Search
capabilities, does not use this type any more. See section
registration on how to locate services
supporting it.vs:catalogservice
probably intend to communicate
that there are no actual sky positions in the tables exposed.rr.relationship
table to find out services that allow
access to the data (the served_by
relation), and/or look for values for
/accessURL
in
rr.res_detail.The status
attribute of vr:Resource
is
considered an implementation detail of the XML serialization and is not
kept here. Neither inactive
nor deleted
records may be kept in the resource
table. Since all
other tables in the relational registry should keep a foreign key on the
ivoid
column, this implies that only metadata on
active
records
is being kept in the relational registry. In other words, users can
expect a resource to exist and work if they find it in a relational
registry .
Name Utype | Type | Description |
---|---|---|
res_type xpath:@xsi:type | adql:VARCHAR(*) | Resource type (something like vs:datacollection, vs:catalogservice, etc) |
created xpath:@created | adql:TIMESTAMP | The UTC date and time this resource metadata description was created. This timestamp must not be in the future. This time is not required to be accurate; it should be at least accurate to the day. Any insignificant time fields should be set to zero. |
short_name xpath:shortName | adql:VARCHAR(*) | A short name or abbreviation given to something. This name will be used where brief annotations for the resource name are required. Applications may use to refer to this resource in a compact display. One word or a few letters is recommended. No more than sixteen characters are allowed. |
res_title xpath:title | adql:VARCHAR(*) | The full name given to the resource |
updated xpath:@updated | adql:TIMESTAMP | The UTC date this resource metadata description was last updated. This timestamp must not be in the future. This time is not required to be accurate; it should be at least accurate to the day. Any insignificant time fields should be set to zero. |
content_level xpath:content/contentLevel | adql:VARCHAR(*) | Description of the content level or intended audience |
res_description xpath:content/description | adql:VARCHAR(*) | An account of the nature of the resource. |
reference_url xpath:content/referenceURL | adql:VARCHAR(*) | URL pointing to a human-readable document describing this resource. |
creator_seq xpath:curation/creator/name | adql:VARCHAR(*) | The creator(s) of the resource in the order given by the resource record author. |
content_type xpath:content/type | adql:VARCHAR(*) | Nature or genre of the content of the resource |
source_format xpath:content/source/@format | adql:VARCHAR(*) | The format of source_value. Recognized values include "bibcode", referring to a standard astronomical bibcode (http://cdsweb.u-strasbg.fr/simbad/refcode.html). |
source_value xpath:content/source | adql:VARCHAR(*) | A bibliographic reference from which the present resource is derived or extracted. |
res_version xpath:curation/version | adql:VARCHAR(*) | Label associated with creation or availablilty of a version of a resource. |
region_of_regard xpath:coverage/regionOfRegard | adql:REAL | A single numeric value representing the angle, given in decimal degrees, by which a positional query against this resource should be "blurred" in order to get an appropriate match. |
waveband xpath:coverage/waveband | adql:VARCHAR(*) | A hash-separated list of regions of the electro-magnetic spectrum that the resource's spectral coverage overlaps with. |
rights xpath:rights | adql:VARCHAR(*) | Information about rights held in and over the resource. |
ivoid xpath:identifier | adql:VARCHAR(*) | Unambiguous reference to the resource conforming to the IVOA standard for identifiers |
This table should have the ivoid
column explicitly set
as its primary key.
The following columns MUST be lowercased during ingestion:
ivoid
, res_type
, content_level
,
content_type
, source_format
,
waveband
.
Clients are advised to query the res_description
and
res_title
columns
using the the ivo_hasword
function, and to use
ivo_hashlist_has
on content_level
,
content_type
,
waveband
, and rights
.
The row for region_of_regard
TAP_SCHEMA.tables
must have deg
in its
unit
column.
This table subsumes the contact, publisher, contributor, and creator members of the VOResource data model. They have been combined into a single table to reduce the total number of tables, and also in anticipation of a unified data model for such entities in future versions of VOResource.
The actual role is given in the base_role
column, which
can be one of contact
, publisher
, contributor
, or
creator
. Depending on this value, here are the xpaths
for the table fields (we have abbreviated
/curation/publisher
as cp, /curation/contact
as co, /curation/creator
as cc,
and /curation/contributor
as cb):
base_role value | contact | publisher | creator | contributor |
---|---|---|---|---|
role_name | co/name | cp | cc/name | cb |
role_ivoid | co/name/@ivo-id | cp/@ivo-id | cc/name/@ivo-id | cb/@ivo-id |
address | co/address | N/A | N/A | N/A |
co/email | N/A | N/A | N/A | |
telephone | co/telephone | N/A | N/A | N/A |
logo | co/logo | N/A | cc/logo | N/A |
Not all columns are available for each role type in VOResource. For example, contacts have no logo, and creators no telephone members. Unavailable metadata (marked with N/A in the above table) MUST be represented with NULL values in the corresponding columns.
Note that, due to current practice in the VO, it is not easy to
predict what role_name
will contain; it could be a single
name, where again the actual format is unpredictable
(full first name, initials in front or behind, or
even a project name), but it could as well be a full author list. Thus,
when matching against role_names
, you will have to use
rather lenient regular expressions. Changing this, admittedly
regrettable, situation would
probably require a change in the VOResource schema.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
role_name N/A | adql:VARCHAR(*) | The real-world name or title of a person or organization |
role_ivoid N/A | adql:VARCHAR(*) | An IVOA identifier of a person or organization |
street_address N/A | adql:VARCHAR(*) | A mailing address for a person or organization |
email N/A | adql:VARCHAR(*) | An email address the entity can be reached at |
telephone N/A | adql:VARCHAR(*) | A telephone number the entity can be reached at |
logo N/A | adql:VARCHAR(*) | URL pointing to a graphical logo, which may be used to help identify the entity |
base_role N/A | adql:VARCHAR(*) | The role played by this entity; this is one of contact, publisher, and creator |
The ivoid
column should be an explicit foreign key into
the resource
table. It is recommended to maintain indexes
on at least the role_name
column, ideally in a way that
supports regular expressions.
The following columns MUST be lowercased during ingestion:
ivoid
, role_ivoid
,
base_role
.
Clients are advised to query the remaining columns, in particular
role_name
,
case-insensitively, e.g., using ivo_nocasematch
.
Since subject queries are expected to be frequent and perform relatively complex checks (e.g., resulting from thesaurus queries in the clients), the subjects are kept in a separate table rather than being hash-joined like other string-like 1:n members of resource.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
res_subject xpath:subject | adql:VARCHAR(*) | Topics, object types, or other descriptive keywords about the resource. |
The ivoid
column should be an explicit foreign key into
resource
. It is recommended to index the
res_subject
column, preferably in a way that allows to process
case-insensitive and pattern queries using the index.
The ivoid
column MUST be lowercased during
ingestion. Clients are advised query the res_subject
column
case-insensitively, e.g., using ivo_nocasematch
.
The capability table describes a resource's modes of interaction; it only
contains the members of the base type vr:Capability
.
Members of derived types are kept in the res_detail
table
(see table_res_detail).
The table has an
integer-typed column cap_index
to disambiguate multiple
capabilities on a single resource. Operators are free to choose the
actual values as convenient, although it is recommended to just
enumerate the capabilities in their physical sequence per-resource.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
cap_index N/A | adql:SMALLINT | Running number of this capability within the resource. |
cap_type xpath:@xsi:type | adql:VARCHAR(*) | The type of capability covered here. |
cap_description xpath:description | adql:VARCHAR(*) | A human-readable description of what this capability provides as part of the over-all service |
standard_id xpath:@standardID | adql:VARCHAR(*) | A URI for a standard this capability conforms to. |
This table should have an explicit primary key made up of
ivoid
and cap_index
.
The ivoid
column should be
an explicit foreign key into resource
.
It is recommended to maintain indexes on at least the
cap_type
and standard_id
columns.
The following columns MUST be lowercased during ingestion:
ivoid
, cap_type
, standard_id
.
Clients are advised to query the cap_description
column
using the ivo_hasword
function.
The res_schema
table corresponds to VODataService's
schema
element. It has been renamed to avoid clashes with
the SQL reserved word SCHEMA
.
The table has an integer-typed column schema_index
to disambiguate multiple schema elements on a single resource.
Operators are free to choose the actual values as convenient, although
it is recommended to just enumerate the schema elements in their
physical sequence per-resource. The xpath
count(preceding-sibling::schema)
gives one possible
set of values for this column; it has not been prescribed via a utype,
though, since other enumeration schemes are valid as well.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
schema_index N/A | adql:SMALLINT | A running number for the res_schema rows belonging to a resource. |
schema_description xpath:description | adql:VARCHAR(*) | A free text description of the tableset explaining in general how all of the tables are related. |
schema_name xpath:name | adql:VARCHAR(*) | A name for the set of tables. |
schema_title xpath:title | adql:VARCHAR(*) | A descriptive, human-interpretable name for the table set. |
schema_utype xpath:utype | adql:VARCHAR(*) | An identifier for a concept in a data model that the data in this schema as a whole represent. |
This table should have an explicit primary key made up of
ivoid
and schema_index
. The
ivoid
column should be an explicit foreign key into
resource
.
The following columns MUST be lowercased during ingestion:
ivoid
, schema_name
, schema_utype
.
Clients are advised to query the schema_description
and schema_title
columns
using the the ivo_hasword
function.
The res_table
table models VODataService's
table
element. It has been renamed to avoid name clashes
with the SQL reserved word TABLE
.
VODataService 1.0 had a similar element that was a direct child of resource. Ingestors should also accept such tables, as there are still numerous active VODataService 1.0 resources in the Registry at the time of writing (this is the reason for the alternative in the table xpath).
The table contains an integer-typed column table_index
to
disambiguate multiple tables on a single resource. Operators are free to
choose the actual values as convenient, although it is recommended to just
enumerate the tables in their physical sequence per-resource (but not per
schema—table_index
MUST be unique within a resource).
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
schema_index N/A | adql:SMALLINT | Index of the schema this table belongs to, if it belongs to a schema (otherwise NULL). |
table_description xpath:description | adql:VARCHAR(*) | A free-text description of the table's contents |
table_name xpath:name | adql:VARCHAR(*) | The fully qualified name of the table. This name should include all catalog or schema prefixes needed to distinguish it in a query. |
table_index N/A | adql:INTEGER | An artificial counter for the tables belonging to a resource |
table_title xpath:title | adql:VARCHAR(*) | A descriptive, human-interpretable name for the table |
table_type xpath:@type | adql:VARCHAR(*) | A name for the role this table plays. Recognized values include "output", indicating this table is output from a query; "base_table", indicating a table whose records represent the main subjects of its schema; and "view", indicating that the table represents a useful combination or subset of other tables. Other values are allowed. |
table_utype xpath:utype | adql:VARCHAR(*) | An identifier for a concept in a data model that the data in this table as a whole represent. |
This table should have an explicit primary key made up of
ivoid
and table_index
. The
ivoid
column should be an explicit
foreign key into resource
. It is recommended to
maintain an index on at least the table_description
column, ideally one suited for queries with ivo_hasword
.
The following columns MUST be lowercased during ingestion:
ivoid
, table_name
, table_type
,
table_utype
.
Clients are advised to query the table_description
and table_title
columns
using the the ivo_hasword
function.
The table_column
table models the content of VOResource's
column
element. The table has been renamed to avoid
a name clash with the SQL reserved word COLUMN
.
Since it is expected that queries for column properties will be fairly common in advanced queries, it is the column table that has the unprefixed versions of common member names (name, ucd, utype, etc).
The flag
column contains a concatenation of all values
of a column
element's flag
children, separated
by hash characters. Use the ivo_hashlist_has
function in
queries against flag
.
The table_column
table also includes information from
VODataService's data type concept. VODataService 1.1 includes several type
systems (VOTable, ADQL, Simple). The
type_system
column contains the value of the column's
datatype
child, with the VODataService XML prefix fixed
to vs; hence, this column will contain one of NULL,
vs:TAPType
,
vs:SimpleDataType
, and vs:VOTableType
.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
table_index N/A | adql:INTEGER | Index of the table this column belongs to. |
name xpath:name | adql:VARCHAR(*) | The name of the column. |
ucd xpath:ucd | adql:VARCHAR(*) | A unified content descriptor that describes the scientific content of the parameter. |
unit xpath:unit | adql:VARCHAR(*) | The unit associated with all values in the column. |
utype xpath:utype | adql:VARCHAR(*) | An identifier for a role in a data model that the data in this column represents. |
std xpath:@std | adql:SMALLINT | If 1, the meaning and use of this parameter is reserved and defined by a standard model. If 0, it represents a database-specific parameter that effectively extends beyond the standard. |
datatype xpath:dataType | adql:VARCHAR(*) | The type of the data contained in the column. |
extended_schema xpath:dataType/@extendedSchema | adql:VARCHAR(*) | An identifier for the schema that the value given by the extended attribute is drawn from. |
extended_type xpath:dataType/@extendedType | adql:VARCHAR(*) | A custom type for the values this column contains. |
arraysize xpath:dataType/@arraysize | adql:VARCHAR(*) | The shape of the array that constitutes the value, e.g., 4, *, 4*, 5x4, or 5x*, as specified by VOTable. |
delim xpath:dataType/@delim | adql:VARCHAR(*) | The string that is used to delimit elements of an array value when arraysize is not '1'. |
type_system xpath:dataType/@xsi:type | adql:VARCHAR(*) | The type system used, as a QName with a canonical prefix; this will ususally be one of vs:simpledatatype, vs:votabletype, and vs:taptype. |
flag xpath:flag | adql:VARCHAR(*) | Hash-separated keywords representing traits of the column. Recognized values include "indexed", "primary", and "nullable". |
column_description xpath:description | adql:VARCHAR(*) | A free-text description of the column's contents. |
The pair ivoid
, table_index
should be an
explicit foreign key into res_table
. It is recommended to
maintain indexes on at least the column_description
,
name
, ucd
, and utype
columns,
where the index on column_description
should ideally be able
to handle queries using ivo_hasword
.
The following columns MUST be lowercased during ingestion:
ivoid
, name
, ucd
,
utype
, datatype
, type_system
.
Clients are advised to query the description
column using the ivo_hasword
function, and to query
the flag
column using the ivo_hashlist_has
function.
The interface
table subsumes both the
vr:Interface
and vr:accessURL
types from
VOResource. The integration of accessURL
into
the interface
table means that an interface in the
relational registry can only have one access URL, where in VOResource it
can have many. In practice, this particular VOResource capability has
not been used by registry record authors. Since access URLs are
probably the item most queried for, it seems warranted to save one
indirection when querying for them.
This specification deprecates the maxOccurs="unbounded"
in the definition of Interface
's accessURL
child in the XML schema
http://www.ivoa.net/xml/VOResource/v1.0
; in future versions
of VOResource, implementations can expect this to be
maxOccurs="1"
. Meanwhile, implementation behaviour in the
presence of multiple access URLs in an interface is undefined.
The table contains an
integer-typed column intf_index
to disambiguate multiple
interfaces of one resource. Operators are free to choose the
actual values as convenient, although it is recommended to just
enumerate the interfaces in their physical sequence per resource.
The query_type
column is a hash-joined list (analoguos
to waveband
, etc. in the resource table), as
the XML schema allows listing up to two request methods.
This table only contains interface elements from within capabilities. Interface elements in StandardsRegExt records are ignored in the relational registry, and they must not be inserted in this table, since doing so would disturb the foreign key from interface into capability. In other words, the relational registry requires every interface to have a parent capability.
Analogous to resource.res_type
, the
intf_type
column contains type names; VOResource extensions
can define new types here, but at the time of writing, the following
types are mentioned in IVOA-recommended schemata:
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
cap_index N/A | adql:SMALLINT | The index of the parent capability. |
intf_index N/A | adql:SMALLINT | A running number for the interfaces of a resource. |
intf_type xpath:@xsi:type | adql:VARCHAR(*) | The type of the interface (vr:WebBrowser, vs:ParamHTTP, etc). |
intf_role xpath:@role | adql:VARCHAR(*) | An identifier for the role the interface plays in the particular capability. If the value is equal to "std" or begins with "std:", then the interface refers to a standard interface defined by the standard referred to by the capability's standardID attribute. |
std_version xpath:@version | adql:VARCHAR(*) | The version of a standard interface specification that this interface complies with. When the interface is provided in the context of a Capability element, then the standard being refered to is the one identified by the Capability's standardID element. |
query_type xpath:queryType | adql:VARCHAR(*) | Hash-joined list of expected HTTP method (GET or POST) supported by the service. |
result_type xpath:resultType | adql:VARCHAR(*) | The MIME type of a document returned in the HTTP response. |
wsdl_url xpath:wsdlURL | adql:VARCHAR(*) | The location of the WSDL that describes this Web Service. If NULL, the location can be assumed to be the accessURL with '?wsdl' appended. |
url_use xpath:accessURL/@use | adql:VARCHAR(*) | A flag indicating whether this should be interpreted as a base URL ('base'), a full URL ('full'), or a URL to a directory that will produce a listing of files ('dir'). |
access_url xpath:accessURL | adql:VARCHAR(*) | The URL at which the interface is found. |
This table should have the pair ivoid
, cap_index
as an explicit foreign key into
capability
, and the pair ivoid
, and
intf_index
as an explicit primary key. Additionally, it
is recommended to maintain an index on at least the
intf_type
column.
The following columns MUST be lowercased during ingestion:
ivoid
, intf_type
, intf_role
,
std_version
, query_type
,
result_type
, url_use
.
Clients are adviced to query query_type
using the the
ivo_hashlist_has
function.
The intf_param
table keeps information on the parameters
available on interfaces. It is therefore closely related to
table_column
, but the differences between the two are
significant enough to warrant a separation between the two tables.
Since the names of common column attributes are used where applicable in
both tables (e.g., name, ucd, etc), the two tables cannot be (naturally)
joined.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
intf_index N/A | adql:SMALLINT | The index of the interface this parameter belongs to. |
name xpath:name | adql:VARCHAR(*) | The name of the parameter. |
ucd xpath:ucd | adql:VARCHAR(*) | A unified content descriptor that describes the scientific content of the parameter. |
unit xpath:unit | adql:VARCHAR(*) | The unit associated with all values in the parameter. |
utype xpath:utype | adql:VARCHAR(*) | An identifier for a role in a data model that the data in this parameter represents. |
std xpath:@std | adql:SMALLINT | If 1, the meaning and use of this parameter is reserved and defined by a standard model. If 0, it represents a database-specific parameter that effectively extends beyond the standard. |
datatype xpath:dataType | adql:VARCHAR(*) | The type of the data contained in the parameter. |
extended_schema xpath:dataType/@extendedSchema | adql:VARCHAR(*) | An identifier for the schema that the value given by the extended attribute is drawn from. |
extended_type xpath:dataType/@extendedType | adql:VARCHAR(*) | A custom type for the values this parameter contains. |
arraysize xpath:dataType/@arraysize | adql:VARCHAR(*) | The shape of the array that constitutes the value, e.g., 4, *, 4*, 5x4, or 5x*, as specified by VOTable. |
delim xpath:dataType/@delim | adql:VARCHAR(*) | The string that is used to delimit elements of an array value when arraysize is not '1'. |
param_use xpath:@use | adql:VARCHAR(*) | An indication of whether this parameter is required to be provided for the application or service to work properly (one of required, optional, ignored, or NULL). |
param_description xpath:description | adql:VARCHAR(*) | A free-text description of the column's contents. |
The pair
ivoid
, intf_index
should be an explicit foreign key into interface
.
The remaining requirements and conventions are as per
section table_table_column
where applicable, and param_description
taking the role
of column_description
.
The relationship element is a slight denormalization of the
vr:Relationship
type: Whereas in VOResource, a single
relationship element can take several IVORNs, in the relational model,
the pairs are stored directly. It is straightforward to translate
between the two representations in the database ingestor.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
relationship_type xpath:relationshipType | adql:VARCHAR(*) | The named type of relationship; this can be mirror-of, service-for, served-by, derived-from, related-to, or something user-defined. |
related_id xpath:relatedResource/@ivo-id | adql:VARCHAR(*) | The URI form of the IVOA identifier for the resource refered to. |
related_name xpath:relatedResource | adql:VARCHAR(*) | The name of resource that this resource is related to. |
The
ivoid
column should be an explicit foreign key into the
resoure
table. You should index at least the
related_id
column.
The following columns MUST be lowercased during ingestion:
ivoid
, relationship_type
,
related_id
.
The validation
subsumes the
vr:validationLevel
-typed members of both vr:Resource
and vr:Capability
.
If the cap_index
column in NULL
the
validation comprises the entire resource. Otherwise, only the
referenced capability has been validated.
While it is recommended that harvesters only accept resource records
from their originating registries, it is valuable to gather validation
results from various sources. Hence, harvesters for the relational
registry may choose to obtain validation data from the OAI-PMH endpoints
of various registries by not harvesting just for the ivo_managed set and
generate rr.validation
rows from these records. This can
trigger potentially problematic behaviour when the original registry
updates its resource record in that naive implementations will lose all
third-party validation rows; this may actually be the correct behaviour,
since an update of the registry record might very well indicate
validation-relevant changes in the underlying services. Implementations
are free to handle or ignore validation results as they see fit, and
they may add validation results of their own.
For convenicence, we repeat the definition of the validation levels from std:RM:
- 0
- The resource has a description that is stored in a registry. This level does not imply a compliant description.
- 1
- In addition to meeting the level 0 definition, the resource description conforms syntactically to this standard and to the encoding scheme used.
- 2
- In addition to meeting the level 1 definition, the resource description refers to an existing resource that has been demonstrated to be functionally compliant. When the resource is a service, it is considered to exist and to be functionally compliant if use of the Service.InterfaceURL or Service.BaseURL responds without error when used as intended by the resource. If the service is a standard one, it must also demonstrate the response is syntactically compliant with the service standard in order to be considered functionally compliant. If the resource is not a service, then the ReferenceURL must be shown to return a document without error.
- 3
- In addition to meeting the level 2 definition, the resource description has been inspected by a human and judged to comply semantically to this standard as well as meeting any additional minimum quality criteria (e.g., providing values for important but non-required metadata) set by the human inspector. [...]
- 4
- In addition to meeting the level 3 definition, the resource description meets additional quality criteria set by the human inspector and is therefore considered an excellent description of the resource. Consequently, the resource is expected to operate well as part of a VO application or research study.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
validated_by xpath:validationLevel/@validatedBy | adql:VARCHAR(*) | The IVOA ID of the registry or organisation that assigned the validation level. |
val_level xpath:validationLevel | adql:SMALLINT | A numeric grade describing the quality of the resource description, when applicable, to be used to indicate the confidence an end-user can put in the resource as part of a VO application or research study. |
cap_index N/A | adql:SMALLINT | If non-NULL, the validation only refers to the the capability referenced here. |
The
ivoid
should be an explicit foreign key into resource
.
Note, however, that ivoid
, cap_index
is
not a foreign key into capability
since
cap_index
may be NULL
(in case the validation
addresses the entire resource).
The following columns MUST be lowercased during ingestion:
ivoid
, validated_by
.
The res_date
table contains information gathered from
vr:Curation
's date children.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
date_value xpath:date | adql:TIMESTAMP | A date associated with an event in the life cycle of the resource. |
value_role xpath:date/@role | adql:VARCHAR(*) | A string indicating what the date refers to, e.g., created, availability, updated. |
The ivoid
column should be an explicit foreign key into
resource
.
The following columns MUST be lowercased during ingestion:
ivoid
, value_role
.
The res_detail
table is the relational registry's primary means for
extensibility as well as a fallback for less-used simple
metadata. Conceptually, it stores triples of resource entity
references, resource xpaths,
and values, where resource entities can be resource records themselves
or capabilities. Thus, metadata with values that are either
string-valued or sets of strings can be represented in this table.
As long as the metadata that needs to be represented in the
relational registry for new VOResource extensions is simple enough, no changes to the schema defined
here will be neccessary as these are introduced. Instead, the extension itself simply defines
new xpaths to be added in res_detail
.
Some complex metadata—tr:languageFeature
or
vstd:key
being examples—cannot be kept in this table.
If a representation of such information in the relational registry is
required, this standard will need to be changed.
Appendix d_u_list gives a list of resource xpaths from the registry extensions that were recommendations at the time of writing. For the resource xpaths marked with an exclamation mark there, xpath/value pairs MUST be generated whenever the corresponding metadata items are given in a resource record. For the remaining resource xpaths, these pairs should be provided if convenient; they mostly concern test queries and other curation-type information that, while unlikely to be useful to normal users, may be relevant to curation-type clients that, e.g., ascertain the continued operation of services.
Some detail values must be interpreted case-insensitively; this
concerns, in particular, IVORNs like the TAP data model type. For other
rows—the test queries are immediate examples—, changing the case
will likely break the data. In order to avoid having to give and
implement case normalization rules by detail xpath, no case normalization
is done on detail values at all, and users and clients will have to use
the ivo_nocasematch
user defined function (see section
adqludf) when locating
case-insensitive values. For the resource xpaths given in Appendix d_u_list, this concerns all items with xpaths ending
in @ivo-id
.
Individual ingestors MAY choose to expose additional metadata using other utypes, provided they are formed according to the rules in section vorutypes (this rule is intended to minimize the risk of later clashes).
In addition to the metadata listed in this specification,
metadata defined in future
IVOA-approved VOResource extensions MUST or SHOULD be present in
res_details
as the extensions require it.
Name Utype | Type | Description |
---|---|---|
ivoid xpath:/identifier | adql:VARCHAR(*) | The parent resource. |
cap_index N/A | adql:SMALLINT | The index of the parent capability; if NULL the xpath-value pair describes a member of the entire resource. |
detail_xpath N/A | adql:VARCHAR(*) | The xpath of the data item |
detail_value N/A | adql:VARCHAR(*) | (Atomic) value of the member |
The ivoid
column should be an explicit foreign key into
resource
. It is recommended to maintain indexes on
at least the columns
detail_xpath
and detail_value
, where the
indices on detail_value
should ideally work for both direct
comparisons and searches using the ivo_nocasematch
function.
The following column MUST be lowercased during ingestion:
ivoid
. Clients are advised to
use the ivo_nocasematch
function to search in
detail_value
if the values are to be compared
case-insensitively (e.g., all IVORNs).
TAP Servers implementing the
ivo://ivoa.net/std/RegTAP#1.0
data model MUST
implement the following three functions in their ADQL 2.0 language,
with signatures written as recommended in std:TAPREGEXT:
ivo_nocasematch(value VARCHAR(*), pattern VARCHAR(*))->INTEGER
pattern
matches
value
, 0 otherwise.
pattern
is defined as for the SQL LIKE operator, but the
match is performed case-insensitively.
ivo_hasword(haystack VARCHAR(*), needle VARCHAR(*)) -> INTEGER
ivo_hasword
; for such
queries trying them on multiple registries may improve recall.ivo_hashlist_has(haslist VARCHAR(*), item VARCHAR(*))->INTEGER
Reference implementations of the three functions for the PostgreSQL database system are given in Appendix appPGDefs.
This section contains sample queries to the relational registry, mostly contributed as use cases by various members of the IVOA Registry working group. They are intended as an aid in designing relational registry queries, in particular for users new to the data model.
Where these queries locate access URLs of standard services, they
contain intf_role='std'
conditions. Desktop clients should
probably include this to weed out nonstandard interfaces on standard
capabilities, which are rare but technically legal; on the other hand,
there are fairly standards-compliant interfaces for which the operator
neglected to set role="std"
, and the respective services
would be missed with such a condition. Hand-crafted queries can probably
typically do without restraining role since most standard capabilities
in real-world resources only have
one—standard—interface.
Problem: Find all TAP services; return their accessURLs
As the capability type is in rr.capability, whereas the access URL can be found from rr.interface, this requires a join:
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.interface WHERE standard_id='ivo://ivoa.net/std/tap' AND intf_role='std'
Other standard_id
s relevant here include:
ivo://ivoa.net/std/registry
for OAI-PMH services,ivo://ivoa.net/std/sia
for SIA services,ivo://ivoa.net/std/conesearch
for SCS services,
andivo://ivoa.net/std/ssa
for SSA services.Problem: Find all SIA services that might have spiral galaxies
This is somewhat tricky since it is probably hard to image a part of the sky guaranteed not to have some, possibly distant, spiral galaxy in it. However, translating the intention into "find all SIA services that mention spiral in either the subject (from rr.res_subject), the description, or the title (which are in rr.resource)", the query would become:
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.resource NATURAL JOIN rr.interface NATURAL JOIN rr.res_subject WHERE standard_id='ivo://ivoa.net/std/sia' AND intf_role='std' AND ( 1=ivo_nocasematch(res_subject, '%spiral%') OR 1=ivo_hasword(res_description, 'spiral') OR 1=ivo_hasword(res_title, 'spiral'))
Problem: Find all SIA services that provide infrared images
The waveband information in rr.resource
comes in hash-separated atoms (which can be
radio, millimeter, infrared, optical, uv, euv, x-ray, or gamma-ray).
For matching those, use the ivo_hashlist_has
function as
below. The access URL and the service standard come from rr.interface and rr.capability, respectively.
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.resource NATURAL JOIN rr.interface WHERE standard_id='ivo://ivoa.net/std/sia' AND intf_role='std' AND 1=ivo_hashlist_has('infrared', waveband)
Problem: Find all searchable catalogs (i.e., cone search services) that provide a column containing redshifts
Metadata on columns exposed by a service are contained in rr.table_column. Again, this table can be naturally joined with rr.capability and rr.interface:
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.table_column NATURAL JOIN rr.interface WHERE standard_id='ivo://ivoa.net/std/conesearch' AND intf_role='std' AND ucd='src.redshift'
Sometimes you want to match a whole set of ucds. Frequently the
simple regular expressions of SQL will help, as in
AND ucd LIKE 'pos.parallax%'
. When that is not enough,
use boolean OR expressions
Problem: Find all the resources published by a certain authority
An "authority" within the VO is something that hands out identifiers. You can tell what authority a record came from by looking at the "host part" of the IVO identifier, most naturally obtained from rr.resource. Since ADQL cannot actually parse URIs, we make do with simple string matching:
SELECT ivoid FROM rr.resource WHERE ivoid LIKE 'ivo://org.gavo.dc%'
Problem: What registry records are there from a given publisher?
This uses the rr.res_role table both match names (in this case, a publisher that has "gavo" in its name) and to ascertain the named entity actually publishes the resource (rather than, e.g., just being the contact in case of trouble). The result is a list of ivoids in this case. You could join this with any other table in the relational registry to find out more about these services.
SELECT ivoid FROM rr.res_role WHERE 1=ivo_nocasematch(role_name, '%gavo%') AND base_role='publisher'
or, if the publisher actually gives its ivo-id in the registry records,
SELECT ivoid FROM rr.res_role WHERE role_ivoid='ivo://ned.ipac/ned' AND base_role='publisher'
Problem: What registry records are there originating from registry X?
This is mainly a query interesting for registry maintainers. Sill, it is a nice example for joining with the rr.res_detail table, in this case to first get a list of all authorities managed by the CDS registry.
SELECT ivoid FROM rr.resource RIGHT OUTER JOIN ( SELECT 'ivo://' || detail_value || '%' AS pat FROM rr.res_detail WHERE detail_xpath='/managedAuthority' AND ivoid='ivo://cds.vizier/registry') AS authpatterns ON (resource.ivoid LIKE authpatterns.pat)
Problem: Find all TAP endpoints offering the relational registry
This is the discovery query for RegTAP services themselves; note how this combines rr.res_detail pairs with rr.capability and rr.interface to locate the desired protocol endpoints.
SELECT access_url FROM rr.interface NATURAL JOIN rr.capability NATURAL JOIN rr.res_detail WHERE standard_id='ivo://ivoa.net/std/tap' AND intf_role='std' AND detail_xpath='/capability/dataModel/@ivo-id' AND 1=ivo_nocasematch(detail_value, 'ivo://ivoa.net/std/regtap#1.0')
Problem: Find all TAP services exposing a table with certain features
"Certain features" could be "have some word in their description and having a column with a certain UCD". Either way, this kind of query fairly invariably combines the usual rr.capability and rr.interface for service location with rr.table_column for the column metadata and rr.res_table for properties of tables.
SELECT ivoid, access_url, name, ucd, column_description FROM rr.capability NATURAL JOIN rr.interface NATURAL JOIN rr.table_column NATURAL JOIN rr.res_table WHERE standard_id='ivo://ivoa.net/std/tap' AND intf_role='std' AND 1=ivo_hasword(table_description, 'quasar') AND ucd='phot.mag;em.opt.v'
Problem: Find all SSAP services that provide theoretical spectra
The metadata required to solve this problem is found in the SSAP registry extension and is thus kept in rr.res_detail:
SELECT access_url FROM rr.res_detail NATURAL JOIN rr.capability NATURAL JOIN rr.interface WHERE detail_xpath='/capability/dataSource' AND intf_role='std' AND standard_id='ivo://ivoa.net/std/ssa' AND detail_value='theory'
Problem: The service at http://dc.zah.uni-heidelberg.de/__system__/tap/run/tap is down, who can fix it?
This uses the rr.res_role table and
returns all information on it based on the IVORN of a service that in
turn was obtained from rr.interface. You
could restrict to the actual technical contact person by requiring
base_role='contact'
.
SELECT DISTINCT base_role, role_name, email FROM rr.res_role NATURAL JOIN rr.interface WHERE access_url='http://dc.zah.uni-heidelberg.de/__system__/tap/run/tap'
Problem: Get the capabilities of all services serving a specific resource (typically, a data collection)
In the VO, data providers can register pure data collections without access options (or just furnished with a link to a download). They can then declare that their data can be "served-by" some other resource, typically a TAP service or some collective service for a number of instruments. To locate the access options to the data itself, inspect rr.relationship and use it to select records from rr.capability.
SELECT * FROM rr.relationship AS a JOIN rr.capability AS b ON (a.related_id=b.ivoid) WHERE relationship_type='served-by'
This appendix defines the of the res_details
table (see section table_res_detail for
details) by giving
xpaths for which xpath/value pairs MUST (where marked with an
exclamation mark) or SHOULD be given if the
corresponding data is present in the resource records. This list is
normative for metadata defined in IVOA recommendations current as of the
publication of this document (see section rolewithinivoa). As laid down in section table_res_detail, new VOResource extensions or new
versions of existing VOResource extensions may amend this list.
The xpaths are suffient for locating the respective metdata as per section vorutypes. With the explanations we give the canonical prefixes for the XML namespaces from which the items originate, which is where further information can be found.
true
if the service supports the VERB keyword; false
, otherwise (cs).true
, then an accompanying /format
item is a MIME Type. Within res_details, this does not work for services that give more than one format; since furthermore the literal of vs:Format allows a good guess whether or not it is a MIME type, this does not appear a dramatic limitation (vs).Note that the representation of StandardsRegExt's
status
and use
attributes as well as its key
would require sequences of
complex objects, which is impossible using res_detail
.
Hence, the respective metadata is not queriable
within the relational registry. Similarly, complex TAPRegExt metadata on
languages, user defined functions, and the like cannot be
represented in this table. Since these pieces of metadata do not seem
relevant to resource discovery and are geared towards other uses of the
respective VOResource extensions, a more complex model does not seem
warrented just so they can be exposed.
What follows are (non-normative) implementations of the three user defined functions specificed in section adqludf in the SQL dialect of PostgreSQL (e.g., doc:Postgres92).
Note that PostgreSQL cannot use fulltext indices on the respective columns if the functions are defined in this way for (fairly subtle) reasons connected with NULL value handling. While workarounds are conceivable, they come with potentially unwelcome side effects, at least as of PostgreSQL 9.x. It is therefore recommended to replace expressions involving the functions given here with simple boolean expressions in the ADQL translation layer whenever possible.
This appendix contains a set of constraints and recommendations for implementing the relational registry model on actual RDBMSes, originating partly from an analysis of the data content of the VO registry in February 2013, partly from a consideration of limits in various XML schema documents. This concerns, in particular, minimum lengths for columns of type adql:VARCHAR. Implementations MUST NOT truncate strings of length equal to or smaller than the minimal lengths given here; the limitations are not, however, upper limits, and indeed, when choosing an implementation strategy it is in general preferable to not impose artificial length restrictions, in particular if no performance penalty is incurred.
These notes can also be useful with a view to preparing user interfaces for the relational registry, since input forms and similar user interface elements invariably have limited space; the limits here give reasonable defaults for the amount of data that should minimally be manipulatable by a user with reasonable effort.
The ivoid
field present in every table of this
specification merits special consideration, on the one hand due to its
frequency, but also since other IVOA identifiers stored in the
relational registry should probably be treated analoguously.
Given that IVORNs in the 2013 data fields have a maximum
length of roughly 100 characters, we propose that a maximum length of
255 should be sufficient even when taking possible fragment identifiers
into account.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
ivo-id | VARCHAR(255) |
{all_tables}.ivoid res_role.role_ivoid capability.standard_id relationship.related_id validation.validated_by res_detail.detail_value for several values of detail_xpath |
The relational registry also contains some date-time values. The most
straightforward implementation certainly is to use SQL timestamps.
Other relational registry fields that straightforwardly map to common
SQL types are those that require numeric values, viz.,
REAL
, SMALLINT
, and
INTEGER
. The following table summarizes these:
Field type | Datatype | Pertinent Fields |
---|---|---|
floating point | REAL |
resource.region_of_regard |
small integer | SMALLINT |
capability.cap_index res_schema.schema_index res_table.schema_index table_column.std interface.cap_index interface.intf_index intf_param.intf_index intf_param.std validation.level validation.cap_index res_detail.cap_index res_table.table_index table_column.table_index |
The fields containing Utypes, UCDs, and Units are treated in parallel here. The 2013 registry data indicates a length of 128 characters is sufficient for real-world purposes; actually, at least UCDs and Units could of course grow without limitations, but it seems unreasonable anything longer than a typical line might actually be useful. As far as utypes are concerned, we expect those to shrink rather than grow with new standardization efforts.
In this category, we also summarize our resource xpaths. With the conventions laid down in this document, it seems unlikely that future extensions to VOResource will be so deeply nested that 128 characters will not be sufficient for their resource xpaths.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
Utype(s) UCD(s) Units xpaths |
VARCHAR(128) |
res_schema.schema_utype res_table.table_utype table_column.ucd table_column.unit table_column.utype intf_param.ucd intf_param.unit intf_param.utype res_detail.detail_xpath |
The relational registry further has an e-mail field, for which we chose 128 characters as a reasonable upper limit (based on a real maximum of 40 characters in the 2013 data). There are furthermore URLs (in addition to access and reference URLs, there are also URLs for the WSDL of SOAP services and logos for roles). Due to the importance of in particular the access URLs we strongly recommend to use non-truncating types here. Empirically, there are access URLs of up to 224 characters in 2013 (reference URLs are less critical at a maximum of 96 characters). Expecting that with REST-based services, URL lengths will probably rather tend down than up, we still permit truncation at 255 characters.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
VARCHAR(128) |
res_role.email |
|
URLs | VARCHAR(255) |
resource.reference_url res_role.logo interface.wsdl_url interface.access_url |
The next group of columns comprises those that have values taken from a controlled or finite vocabulary. Trying to simplify the view, lengths in the form of powers of two are considered.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
predefined values |
VARCHAR(255) |
resource.content_level resource.content_type |
VARCHAR(64) |
resource.rights resource.waveband |
|
VARCHAR(32) |
resource.res_type capability.cap_type res_table.table_type table_column.flag table_column.datatype table_column.extended_schema table_column.extended_type table_column.type_system interface.result_type intf_param.datatype intf_param.extended_schema intf_param.extended_type |
|
VARCHAR(4) |
interface.query_type interface.url_use |
Finally, there are the fields that actually contain what is basically free text. For these, we have made a choice from reasonable powers of two lengths considering the actual lengths in the 2013 registry data. A special case are fields that either contain natural language text (the descriptions) or those that have grown without limit historically (resource.creator_seq, and, giving in to current abuses discussed above, res_role.role_name). For all such fields, no upper limit can sensibly be defined. However, since certain DBMSes (e.g., MySQL older than version 5.6) cannot index fields with a TEXT datatype and thus using VARCHAR may be necessary at least for frequenly-searched fields, we give the maximal length of the fields in the 2013 registry in parentheses after the column designations for the TEXT datatype:
Field type | Datatype suggested | Interested Fields |
---|---|---|
free string values | TEXT |
resource.res_description (7801) resource.creator_seq (712) res_role.role_name (712) res_schema.schema_description (934) res_table.table_description (934) table_column.description (3735) intf_param.description (347) capability.cap_description (100) |
VARCHAR(255) |
resource.res_title res_role.address res_schema.schema_title res_table.table_title relationship.related_name res_detail.detail_value |
|
VARCHAR(128) |
resource.version resource.source_value res_subject.res_subject |
|
VARCHAR(64) |
res_table.table_name table_column.name intf_param.name |
|
VARCHAR(32) |
resource.source_format res_role.telephone res_schema.schema_name interface.intf_type interface.intf_role relationship_type res_date.value_role |
|
VARCHAR(16) |
resource.short_name table_column.arraysize interface.std_version intf_param.use |
The (non-normative) following XSL stylesheet emits xpaths as discussed in section vorutypes when applied to a VOResource extension. Considering readability and limitations of XSLT, this is not fully general -- if VOResource extensions started to inherit from other extensions' subclasses of Resource, Capability, or Interface, this stylesheet will need to be extended.
Still, it is a useful tool when evaluating how to map a given extension to the relational registry.
ivo://ivoa.net/std/RegTAP#1.0
to match usage with a
later standards record./capability/uploadMethod/@ivo-id
res_detail keys that was
accidentally lost in a previous version./capability/maxSearchRadius
,
/capability/maxImageSize
, and
/testQuery/pos/refframe
.res_detail.detail_utype
has been renamed
res_detail.detail_xpath
.use_param
, maintain
consistency after the renaming):Old | New |
---|---|
resource.version | resource.res_version |
res_role.address | res_role.street_address |
subject.* | res_subject.* |
res_subject.res_subject | res_subject.res_subject |
table_column.description | table_column.column_description |
intf_param.description | intf_param.param_description |
intf_param.use_param | intf_param.param_use |
validation.level | validation.val_level |
resource.region_of_regard
now must have unit metadata
declared.