I nternational
V irtual
O bservatory
A lliance
Registries provide a mechanism with which VO applications can discover and select resources—first and foremost data and services—that are relevant for a particular scientific problem. This specification defines an interface for searching this resource metadata based on the IVOA's TAP protocol. It specifies a set of tables that comprise a useful subset of the information contained in the registry records, as well as the table's data content in terms of the XML VOResource data model. The general design of the system is geared towards allowing easy authoring of queries.
This is an IVOA Working Draft for review by IVOA members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use IVOA Working Drafts as reference materials or to cite them as other than "work in progress".
A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.
This document has been developed in part with support from the German Astronomical Virtual Observatory (BMBF Bewilligungsnummer 05A08VHA).
The words "MUST ", "SHOULD", "MAY", "RECOMMENDED", and "OPTIONAL" (in upper or lower case) used in this document are to be interpreted as described in the IETF standard RFC 2119 [std:RFC2119].
In the Virtual Observatory (VO), registries provide a means for discovering useful resources, i.e., data and services. Individual publishers offer the descriptions for their resources ("resource records") in publishing registries. At the time of writing, there are roughly 14000 such resource records active within the VO, originating from about 20 publishing registries.
The protocol spoken by these publishing registries, OAI-PMH, is not suitable for data discovery, and even if it were, data discovery would at least be fairly time consuming if each client had to query dozens or, potentially, hundreds of publishing registries.
To enable efficient data discovery nevertheless, there are services harvesting the resource records from the publishing registries and offering interfaces more suitable for querying by their users. The IVOA Registry Interfaces specification [std:RI1] defined, among other aspects of the VO registry system, such an interface using SOAP and an early draft of an XML-based query language.
This document provides an update to the query ("full registry") part of this specifiation, aimed towards usage with current VO standards, in particular TAP [std:TAP] and ADQL [std:ADQL]. It follows the model of ObsCore [std:OBSCORE] of defining a representation of a data model within a relational data base. In this case, the data model is a simplification of the VO's resource metadata interchange representation, the VOResource XML format [std:VOR]. The simplification yields 13 tables. This specification gives the table metadata together with rules for how to fill these tables from VOResource-serialized metadata records.
This architecture allows client applications to perform "canned" queries on behalf of their user as well as complex queries formulated directly by advanced users, using the same TAP clients they employ to query astronomical data servers.
This specification directly relates to other VO standards in the following ways:
res_detail
table using
utypes algorithmically generated from the XML schema documents given by
these standards. This document should not in general need updates
for registry extension updates. Still, in particular with a view to the
caveat on the limits of the utypes generation algorithm given
in section 5, we note the
versions current as of this specification: SimpleDALRegExt 1.0
[std:DALREGEXT],
StandardsRegExt 1.0 [std:STDREGEXT], TAPRegExt 1.0
[std:TAPREGEXT], Registry Interfaces 1.0 [std:RI1].This standard also relates to other IVOA standards:
In the design of the tables, the goal has been to preserve as much of VOResource and its extensions, including the element names, as possible.
An overriding consideration has been, however, to make natural joins
between the tables behave usefully, i.e., to actually combine rows
relevant to the same entity (i.e., resource, table, capability, etc.).
To disambiguate column names that name the same concept on different
entities (name, description, etc.) and would therefore interfere with
the natural join, a shortened tag for the source object
is prepended to the name. Thus, a description
element within
a resource ends up in a column named
res_description
, whereas the same element from a
capability becomes cap_description
.
Furthermore, camel-case identifiers have been converted to
underscore-separated ones (thus, standardId
becomes
standard_id
) to have all-lowercase column names; this saves
potential headache if users choose to reference the columns using SQL
delimited identifiers. Dashes in VOResource attribute names are
converted to underscores, too, with the exception of
ivo-id
, which is just rendered ivoid
.
Another design goal of this specification has been that different registries
operating on the same set of registry records will return identical responses
for most queries; hence, we try to avoid relying on features left not
defined by ADQL (e.g., the case sensitivity of string matches). However,
with a view to non-uniform support for information retrieval-type
queries in database systems, the ivo_hasword
user defined
function is not fully specified here; queries employing it may yield
different results on different implementations, even if they operate on
the same set of resource records.
ADQL has no operators for case-insensitive matching of strings. To allow for robust and straightforward queries nevertheless, most columns containing values not usually intended for display are required to be converted to lower case; in the table descriptions below, there are explicit requirements on case normalization near the end of each section. This is particularly important when the entities to be compared are defined to be case-insensitive (e.g., ucds, IVORNs, utypes). Client software that can inspect user-provided arguments (e.g., when filling template queries) should also convert the respective fields to lower-case.
This conversion MUST cover all ASCII letters, i.e., A through Z. The conversion SHOULD take place according to algorithm R2 in section 3.13, "Default Case Algorithms" of [std:UNICODE]. In practice, non-ASCII characters are not expected to occur in columns for which lowercasing is required.
Analogously, case-insensitive comparisons as required by some of the user-defined functions for the relational registry MUST compare the ASCII letters without regard for case. They SHOULD compare according to D144 in [std:UNICODE].
VOResource and its extensions make use of XML QNames in attribute
values, most prominently in xsi:type
. The standard
representation of these QNames in XML instance documents makes use of an
abbreviated notation employing prefixes declared using the xmlns mechanism
as discussed in [std:XMLNS]. Within an ADQL database, no
standard mechanism exists that could provide a similar mapping of URLs
and abbreviations. The correct way to handle this problem would thus be
to have full QNames in the database (e.g.,
{http://www.ivoa.net/xml/ConeSearch/v1.0}ConeSearch
for the
canonical cs:ConeSearch
). This, of course, would make for
excessively tedious and error-prone querying.
For various reasons, VOResource authors have always been encouraged to use a set of "standard" prefixes. This allows an easy and, to users, unsurprising exit from the problem of the missing xmlns declarations: For the representation of QNames within the database, these recommended prefixes are now mandatory. Future VOResource extensions define their mandatory prefixes themselves.
Following the existing practice, minor version changes are not in
general reflected in the recommended prefixes—e.g., both VODataService
1.0 and VODataService 1.1 use vs:
. For reference, here is
a table of XML namespaces and prefixes for namespaces relevant to this
specification:
oai | http://www.openarchives.org/OAI/2.0/ |
ri | http://www.ivoa.net/xml/RegistryInterface/v1.0 |
vg | http://www.ivoa.net/xml/VORegistry/v1.0 |
vr | http://www.ivoa.net/xml/VOResource/v1.0 |
dc | http://purl.org/dc/elements/1.1/ |
vs | http://www.ivoa.net/xml/VODataService/v1.0 |
vs | http://www.ivoa.net/xml/VODataService/v1.1 |
cs | http://www.ivoa.net/xml/ConeSearch/v1.0 |
sia | http://www.ivoa.net/xml/SIA/v1.0 |
ssap | http://www.ivoa.net/xml/SSA/v1.0 |
tr | http://www.ivoa.net/xml/TAPRegExt/v1.0 |
vstd | http://www.ivoa.net/xml/StandardsRegExt/v1.0 |
This specification piggybacks on top of the well-established VOResource standard. It explicitly does not define a full data model, but rather something like a reasonably query-friendly view of a partial representation of one. The link between the actual data model, i.e., VOResource and extensions as defined by the XML Schema documents, and the fields within this database schema, is provided by utypes.
These utypes are generated from the XML schema files for VOResource and its extensions by means of an XSL stylesheet. They are intended to be compatible with the utypes generated according to a future IVOA specification. These are expected to be based on a modelling language called VO-DML; since VOResource is not modelled in VO-DML — and probably never will, since interoperability with other OAI-PMH applications makes XML Schema a compelling choice as the modelling language —, we have to provide some custom method to define utypes and their relation to modelling elements
The basic premise here is that 1:1 relationships are mapped to utypes
by concatenating type and attribute names until the leaf element is
reached; attributes and elements (in XSD terms) are not distinguished.
1:n relationships yield one utype for the attribute (the "collection
utype", which is used on some tables), while utypes for the elements of
the collection are rooted in the collection element's types. For
example, the capabilities of a service have the collection utype
vr:Service.capability
, but the standard identifier of a
capability has the utype vr:Capability.standardID
.
XML namespaces are ignored in utype generation. In particular, the
xsi:type
attributes that provide polymorphism for VOResource's
Resource
and Capability
elements yield
utypes ending in type, as in vr:Resource.type
and
vr:Capability.type
.
At the utype level, polymorphism is reflected by using the names of the
type definitions in which an element or attribute is defined as a utype
fragment. For
example, in the description of a single data collection there can be
values for vr:Resource.created
(as the definition of
the created
attribute is on vr:Resource
itself) as well as vs:DataCollection.instrument
(since
instrument
comes from the definition of the XML Schema
DataCollection
type in this case).
The data model name of the utypes, i.e., the part in front of the colon, is the canonical target name space prefix taken from the XML schema. The remarks on versioning those from section 4 apply here as well.
An XSLT stylesheet producing utypes together with short
documentation strings from the XML schema files is given in Appendix
A. Note that due to limitations of XSLT version 1, the complexity
of XML schema, as well as considerations of implementation simplicity,
this stylesheet is not completely general (see also comments in the
file). Future VOResource extensions could break it if, e.g., they were
to inherit from types derived from vr:Resource
in a
different file. As given here, the stylesheet produces the desired
results for the versions of VOResource and its extensions given in
section 1.1.
Note that for readability utypes are given in camel-case throughout the normative text of this document, following the XML Schema names. However, utypes are generally considered to be case insensitive. Hence, in the database tables, all utypes are stored all-lowercase; this is mandated by the case conversion rules in the table descriptions.
The relational registry can be part of any TAP service. The presence
of the tables discussed here is indicated by declaring support for the
data model Registry 1.0
with the IVORN
ivo://ivoa.net/std/RegTAP/vor
in the service's
capabilities (see [std:TAPREGEXT]). Technically, this
entails adding
<dataModel ivo-id="ivo://ivoa.net/std/RegTAP/vor" >Registry 1.0</dataModel>
as a child of the capability element with the type
{http://www.ivoa.net/xml/TAPRegExt/v1.0}TableAccess
.
A client that knows the access URL of one relational registry can thus discover all other services exposing one. The "Find all TAP endpoints offering the relational registry" example in section 9 shows a query that does this.
In the following table descriptions, names and utypes of tables and columns are normative and MUST be used as given, and all-lowercase. Descriptions are not normative (as given, they usually are taken from the schema files of VOResource and its extensions with slight redaction). Registry operators MAY provide additional columns in their tables, but they MUST provide all columns given in this specification.
All table descriptions start out with brief remarks on the relationship of the table to the VOResource XML data model. Then, the columns are described in a selection of TAP_SCHEMA metadata. For each table, recommendations on explicit primary and foreign keys as well as indexed columns are given, where it is understood that primary and foreign keys are already indexed in order to allow efficient joins; these parts are not normative, but operators should ensure decent performance for queries assuming the presence of the given indices and relationships. Finally, lowercasing requirements (normative) are given.
The following tables make up the data model
ivo://ivoa.net/std/RegTAP/vor
:
Table Utype | Description |
---|---|
rr.capability vr:Service.capability | Pieces of behaviour of a resource. |
rr.interface vr:Interface | Information on access modes of a capability. |
rr.intf_param vs:InputParam | Input parameters for services. |
rr.relationship vr:Resource.content.relationship | Relationships between resources, e.g., mirroring, derivation, but also providing access to data within a resource. |
rr.res_date vr:Resource.curation.date | A date associated with an event in the life cycle of the resource. This could be creation or update. The role column can be used to clarify. |
rr.res_detail N/A | Utype-value pairs for members of resource or capability and their derivations that are less used and/or from VOResource extensions. The pairs refer to a resource if cap_index is NULL, to the referenced capability otherwise. |
rr.res_role N/A | Entities, i.e., persons or organizations, operating on resources: creators, contacts, publishers. |
rr.res_schema vs:TableSchema | Sets of tables related to resources. |
rr.res_table vs:Table | (Relational) tables that are part of schemata or resources. |
rr.resource vr:Resource | The resources, i.e., services, data collections, organizations, etc., present in this registry. |
rr.subject vr:Resource.content.subject | Topics, object types, or other descriptive keywords about the resource. |
rr.table_column vs:Table.column | Metadata on columns of tables pertaining to resources. |
rr.validation N/A | Validation levels for resources and capabilities. |
The rr.resource
table contains most atomic members of
vr:Resource
that have a 1:1 relationship to the resource
itself. Members of derived types are, in general, handled through
the res_detail
table even if 1:1 (see 7.13). The
content_level, waveband, and rights members are 1:n but still appear
here. If there are multiple values, concatenate them with hash
characters (#). Use the ivo_hashlist_has
ADQL extension
function to check for the presence of a single value. This convention
saves on tables while not complicating common queries significantly.
A local addition is the creator_seq
column. It contains
all content of the name
elements below a resource element
curation
child's creator
children,
concatenated with a sequence of semicolon and blank characters (";
"
). The individual parts must be concatenated preserving the
sequence of the XML elements. The resulting string is primarily intended
for display purposes ("author list") and is hence not case-normalized.
It was added since the equivalent of an author list is expected to be
a metadatum that is displayed fairly frequently, but also since the
sequence of author names is generally considered significant. The
res_role
table, on the other hand, does not allow
recovering the input sequence of the rows belonging to one resource.
The res_type
column reflects the lower-cased value of
the ri:Resource
element's xsi:type
attribute,
where the canonical prefixes are used. While custom or experimental
VOResource extensions may lead to more or less arbitrary strings in that
column, here is an enumeration of res_type
s from the
VOResource and its IVOA-recommended extensions at the time of
writing, together with explanations taken from the schema files and
comments.
xsi:type="vg:Harvest"
. Old,
RegistryInterface 1-compliant registries also use this type with a
capability of type vg:Search
. The relational registry as
specified here, while superceding these old vg:Search
capabilities, does not use this type any more. See section
6 on how to locate services
supporting it.vs:catalogservice
probably intend to communicate
that there are no actual sky positions in the tables exposed.rr.relationship
table to find out services that allow
access to the data (the served_by
relation), and/or look for values for
vs:datacollection.accessurl
in
rr.res_detail
.The status
attribute of vr:Resource
is
considered an implementation detail of the XML serialization and is not
kept here. Neither inactive
nor deleted
records may be kept in the resource
table. Since all
other tables in the relational registry should keep a foreign key on the
ivoid
column, this implies that only metadata on
active
records
is being kept in the relational registry. In other words, users can
expect a resource to exist and work if they find it in a relational
registry .
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | Unambiguous reference to the resource conforming to the IVOA standard for identifiers |
res_type vr:Resource.type | adql:VARCHAR(*) | Resource type (something like vs:datacollection, vs:catalogservice, etc) |
created vr:Resource.created | adql:TIMESTAMP | The UTC date and time this resource metadata description was created. This timestamp must not be in the future. This time is not required to be accurate; it should be at least accurate to the day. Any insignificant time fields should be set to zero. |
short_name vr:Resource.shortName | adql:VARCHAR(*) | A short name or abbreviation given to something. This name will be used where brief annotations for the resource name are required. Applications may use to refer to this resource in a compact display. One word or a few letters is recommended. No more than sixteen characters are allowed. |
res_title vr:Resource.title | adql:VARCHAR(*) | The full name given to the resource |
updated vr:Resource.updated | adql:TIMESTAMP | The UTC date this resource metadata description was last updated. This timestamp must not be in the future. This time is not required to be accurate; it should be at least accurate to the day. Any insignificant time fields should be set to zero. |
content_level vr:ContentLevel | adql:VARCHAR(*) | Description of the content level or intended audience |
res_description vr:Resource.content.description | adql:VARCHAR(*) | An account of the nature of the resource. |
reference_url vr:Resource.content.referenceURL | adql:VARCHAR(*) | URL pointing to a human-readable document describing this resource. |
creator_seq vr:Creator.name | adql:VARCHAR(*) | The creator(s) of the resource in the order given by the resource record author. |
content_type vr:Resource.content.type | adql:VARCHAR(*) | Nature or genre of the content of the resource |
source_format vr:Resource.content.source.format | adql:VARCHAR(*) | The format of source_value. Recognized values include "bibcode", referring to a standard astronomical bibcode (http://cdsweb.u-strasbg.fr/simbad/refcode.html). |
source_value vr:Resource.content.source | adql:VARCHAR(*) | A bibliographic reference from which the present resource is derived or extracted. |
version vr:Resource.curation.version | adql:VARCHAR(*) | Label associated with creation or availablilty of a version of a resource. |
region_of_regard vr:Resource.coverage.regionOfRegard | adql:REAL | A single numeric value representing the angle, given in decimal degrees, by which a positional query against this resource should be "blurred" in order to get an appropriate match. |
waveband vs:Waveband | adql:VARCHAR(*) | A hash-separated list of regions of the electro-magnetic spectrum that the resource's spectral coverage overlaps with. |
rights vr:Rights | adql:VARCHAR(*) | Information about rights held in and over the resource. |
This table should have the ivoid
column explicitly set
as its primary key.
The following columns MUST be lowercased during ingestion:
ivoid
, res_type
, content_level
,
content_type
, source_format
,
waveband
.
Clients are advised to query the res_description
and
res_title
columns
using the the ivo_hasword
function, and to use
ivo_hashlist_has
on content_level
,
waveband
, and rights
.
This table subsumes the contact, publisher, contributor, and creator members of the VOResource data model. They have been combined into a single table to reduce the total number of tables, and also in anticipation of a unified data model for such entities in future versions of VOResource.
The actual role is given in the base_role
column, which
can be one of contact
, publisher
, contributor
, or
creator
. Depending on this value, here are the utypes
for the table fields (we have abbreviated
vr:Resource.curation.publisher
as Rcp, vr:Contact
as Co, vr:Creator
as Cr,
and vr:ResourceName
as R):
base_role value | contact | publisher | creator | contributor |
---|---|---|---|---|
role_name | Co.name | Rcp | Cr.name | R |
role_ivoid | Co.name.ivoid | Rcp.ivoid | Cr.name.ivoid | R.ivoid |
address | Co.address | N/A | N/A | N/A |
Co.email | N/A | N/A | N/A | |
telephone | Co.telephone | N/A | N/A | N/A |
logo | Co.logo | N/A | Cr.logo | N/A |
Not all columns are available for each role type in VOResource. For example, contacts have no logo, and creators no telephone members. Unavailable metadata MUST be represented with NULL values in the corresponding columns.
Note that, due to current practice in the VO, it is not easy to
predict what role_name
will contain; it could be a single
name, where again the actual format is unpredictable
(full first name, initials in front or behind, or
even a project name), but it could as well be a full author list. Thus,
when matching against role_names
, you will have to use
rather lenient regular expressions. Changing this, admittedly
regrettable, situation would
probably require a change in the VOResource schema.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
role_name N/A | adql:VARCHAR(*) | The real-world name or title of a person or organization |
role_ivoid N/A | adql:VARCHAR(*) | An IVOA identifier of a person or organization |
address N/A | adql:VARCHAR(*) | A mailing address for a person or organization |
email N/A | adql:VARCHAR(*) | An email address the entity can be reached at |
telephone N/A | adql:VARCHAR(*) | A telephone number the entity can be reached at |
logo N/A | adql:VARCHAR(*) | URL pointing to a graphical logo, which may be used to help identify the entity |
base_role N/A | adql:VARCHAR(*) | The role played by this entity; this is one of contact, publisher, and creator |
The ivoid
column should be an explicit foreign key into
the resource
table. It is recommended to maintain indexes
on at least the role_name
column, ideally in a way that
supports regular expressions.
The following columns MUST be lowercased during ingestion:
ivoid
, role_ivoid
,
base_role
.
Clients are advised to query the remaining columns, in particular
role_name
,
case-insensitively, e.g., using ivo_nocasematch
.
Since subject queries are expected to be frequent and perform relatively complex checks (e.g., resulting from thesaurus queries in the clients), the subjects are kept in a separate table rather than being hash-joined like other string-like 1:n members of resource.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
subject xs:token | adql:VARCHAR(*) | Topics, object types, or other descriptive keywords about the resource. |
The ivoid
column should be an explicit foreign key into
resource
. It is recommended to index the
subject
column, preferably in a way that allows to process
case-insensitive and pattern queries using the index.
The ivoid
column MUST be lowercased during
ingestion. Clients are advised query the subject
column
case-insensitively, e.g., using ivo_nocasematch
.
The capability table describes the capabilities of a resource; it only
contains the members of the base type vr:Capability
.
Members of derived types are kept in the res_detail
table
(see 7.13).
The table has an
integer-typed column cap_index
to disambiguate multiple
capabilities on a single resource. Operators are free to choose the
actual values as convenient, although it is recommended to just
enumerate the capabilities in their physical sequence per-resource.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
cap_index N/A | adql:SMALLINT | Running number of this capability within the resource. |
cap_type vr:Capability.type | adql:VARCHAR(*) | The type of capability covered here. |
cap_description vr:Capability.description | adql:VARCHAR(*) | A human-readable description of what this capability provides as part of the over-all service |
standard_id vr:Capability.standardID | adql:VARCHAR(*) | A URI for a standard this capability conforms to. |
This table should have an explicit primary key made up of
ivoid
and cap_index
.
The ivoid
column should be
an explicit foreign key into resource
.
It is recommended to maintain indexes on at least the
cap_type
and standard_id
columns.
The following columns MUST be lowercased during ingestion:
ivoid
, cap_type
, standard_id
.
Clients are advised to query the cap_description
column
using the ivo_hasword
function.
The res_schema
table corresponds to VODataService's
schema
element. It has been renamed to avoid clashes with
the SQL reserved word SCHEMA
.
The table has an integer-typed column schema_index
to disambiguate multiple schema elements on a single resource.
Operators are free to choose the actual values as convenient, although
it is recommended to just enumerate the schema elements in their
physical sequence per-resource.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
schema_index N/A | adql:SMALLINT | A running number for the res_schema rows belonging to a resource. |
schema_description vs:TableSchema.description | adql:VARCHAR(*) | A free text description of the tableset explaining in general how all of the tables are related. |
schema_name vs:TableSchema.name | adql:VARCHAR(*) | A name for the set of tables. |
schema_title vs:TableSchema.title | adql:VARCHAR(*) | A descriptive, human-interpretable name for the table set. |
schema_utype vs:TableSchema.utype | adql:VARCHAR(*) | An identifier for a concept in a data model that the data in this schema as a whole represent. |
This table should have an explicit primary key made up of
ivoid
and schema_index
. The
ivoid
column should be an explicit foreign key into
resource
.
The following columns MUST be lowercased during ingestion:
ivoid
, schema_name
, schema_utype
.
Clients are advised to query the schema_description
and schema_title
columns
using the the ivo_hasword
function.
The res_table
table models VODataService's
table
element. It has been renamed to avoid name clashes
with the SQL reserved word TABLE
.
The table contains an integer-typed column table_index
to
disambiguate multiple tables on a single resource. Operators are free to
choose the actual values as convenient, although it is recommended to just
enumerate the tables in their physical sequence per-resource (but not per
schema—table_index
MUST be unique within a resource).
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
schema_index N/A | adql:SMALLINT | Index of the schema this table belongs to, if it belongs to a schema (otherwise NULL). |
table_description vs:Table.description | adql:VARCHAR(*) | A free-text description of the table's contents |
table_name vs:Table.name | adql:VARCHAR(*) | The fully qualified name of the table. This name should include all catalog or schema prefixes needed to distinguish it in a query. |
table_index N/A | adql:INTEGER | An artificial counter for the tables belonging to a resource |
table_title vs:Table.title | adql:VARCHAR(*) | A descriptive, human-interpretable name for the table |
table_type vs:Table.type | adql:VARCHAR(*) | A name for the role this table plays. Recognized values include "output", indicating this table is output from a query; "base_table", indicating a table whose records represent the main subjects of its schema; and "view", indicating that the table represents a useful combination or subset of other tables. Other values are allowed. |
table_utype vs:Table.utype | adql:VARCHAR(*) | An identifier for a concept in a data model that the data in this table as a whole represent. |
This table should have an explicit primary key made up of
ivoid
and table_index
. The pair
ivoid
, schema_index
should be an explicit
foreign key into res_schema
. It is recommended to
maintain an index on at least the table_description
column, ideally one suited for queries with ivo_hasword
.
The following columns MUST be lowercased during ingestion:
ivoid
, table_name
, table_type
,
table_utype
.
Clients are advised to query the table_description
and table_title
columns
using the the ivo_hasword
function.
The table_column
table models the content of VOResource's
column
element. The table has been renamed to avoid
a name clash with the SQL reserved word COLUMN
.
Since it is expected that queries for column properties will be fairly common in advanced queries, it is the column table that has the unprefixed versions of common member names (name, description, ucd, utype, etc).
The flag
column contains a concatenation of all values
of a column
element's flag
children, separated
by hash characters. Use the ivo_hashlist_has
function in
queries against flag
.
The table_column
table also includes information from
VODataService's data type concept. VODataService 1.1 includes several type
systems (VOTable, ADQL, Simple). The
typesystem
column contains the value of the column's
datatype
child, with the VODataService XML prefix fixed
to vs; hence, this column will contain one of NULL,
vs:TAPType
,
vs:SimpleDataType
, and vs:VOTableType
.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
table_index N/A | adql:INTEGER | An artificial counter for the tables belonging to a resource |
description vs:BaseParam.description | adql:VARCHAR(*) | A free-text description of the column's contents. |
name vs:BaseParam.name | adql:VARCHAR(*) | The name of the column. |
ucd vs:BaseParam.ucd | adql:VARCHAR(*) | A unified content descriptor that describes the scientific content of the parameter. |
unit vs:BaseParam.unit | adql:VARCHAR(*) | The unit associated with all values in the column. |
utype vs:BaseParam.utype | adql:VARCHAR(*) | An identifier for a role in a data model that the data in this column represents. |
std vs:TableParam.std | adql:SMALLINT | If 1, the meaning and use of this parameter is reserved and defined by a standard model. If 0, it represents a database-specific parameter that effectively extends beyond the standard. |
datatype vs:DataType | adql:VARCHAR(*) | The type of the data contained in the column. |
extended_schema vs:DataType.extendedSchema | adql:VARCHAR(*) | An identifier for the schema that the value given by the extended attribute is drawn from. |
extended_type vs:DataType.extendedType | adql:VARCHAR(*) | The data value represented by this type can be interpreted as of a custom type identified by the value of this attribute. |
arraysize vs:DataType.arraysize | adql:VARCHAR(*) | The shape of the array that constitutes the value. |
delim vs:DataType.delim | adql:VARCHAR(*) | The string that is used to delimit elements of an array value when arraysize is not '1'. |
type_system vs:DataType.type | adql:VARCHAR(*) | The type system used, as a QName with a canonical prefix; this will ususally be one of vs:SimpleDataType, vs:VOTableType, and vs:TAPType. |
flag vs:TableParam.flag | adql:VARCHAR(*) | hash-separated keywords representing traits of the column. Recognized values include "indexed", "primary", and "nullable". |
The pair ivoid
, table_index
should be an
explicit foreign key into res_table
. It is recommended to
maintain indexes on at least the description
,
name
, ucd
, and utype
columns,
where the index on description
should ideally be able
to handle queries using ivo_hasword
.
The following columns MUST be lowercased during ingestion:
ivoid
, name
, ucd
,
utype
, datatype
, type_system
.
Clients are advised to query the description
column using the ivo_hasword
function, and to query
the flag
column using the ivo_hashlist_has
function.
The interface
table subsumes both the
vr:Interface
and vr:accessURL
types from
VOResource. The integration of accessURL
into
the interface
table means that an interface in the
relational registry can only have one access URL, where in VOResource it
can have many. In practice, this particular VOResource capability has
not been used by registry record authors. Since access URLs are
probably the item most queried for, it seems warranted to save one
indirection when querying for them.
Should interfaces with multiple access URLs come into use in the future, we propose a slight denormalization by creating mulitple interfaces with one access URL each within the registry's ingestion logic.
The table contains an
integer-typed column intf_index
to disambiguate multiple
interfaces of one resource. Operators are free to choose the
actual values as convenient, although it is recommended to just
enumerate the interfaces in their physical sequence per resource.
Note that StandardsRegExt records can contain interface elements
outside of capabilities; therefore, intf_index
runs over
resources, and the otherwise natural foreign key (ivoid
,
cap_index
) into the
capability
table cannot be used.
Analogous to resource.res_type
, the
intf_type
column contains type names; VOResource extensions
can define new types here, but at the time of writing, the following
types are mentioned in IVOA-recommended schemata:
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
cap_index N/A | adql:SMALLINT | The index of the parent capability; may be NULL for interfaces in resource (e.g., StandardsRegExt) |
intf_index N/A | adql:SMALLINT | A running number for the interfaces of a resource. |
intf_type vr:Interface.type | adql:VARCHAR(*) | The type of the interface (WebBrowser, ParamHTTP, etc). |
intf_role vr:Interface.role | adql:VARCHAR(*) | A tag name the identifies the role the interface plays in the particular capability. If the value is equal to "std" or begins with "std:", then the interface refers to a standard interface defined by the standard referred to by the capability's standardID attribute. |
std_version vr:Interface.version | adql:VARCHAR(*) | The version of a standard interface specification that this interface complies with. When the interface is provided in the context of a Capability element, then the standard being refered to is the one identified by the Capability's standardID element. |
query_type vs:ParamHTTP.queryType | adql:VARCHAR(*) | The type of HTTP request, either GET or POST. |
result_type vs:ParamHTTP.resultType | adql:VARCHAR(*) | The MIME type of a document returned in the HTTP response. |
wsdl_url vr:WebService.wsdlURL | adql:VARCHAR(*) | The location of the WSDL that describes this Web Service. If not provided, the location is assumed to be the accessURL with '?wsdl' appended. |
url_use vr:AccessURL.use | adql:VARCHAR(*) | A flag indicating whether this should be interpreted as a base URL, a full URL, or a URL to a directory that will produce a listing of files. |
access_url vr:AccessURL | adql:VARCHAR(*) | The URL at which the interface is found. |
This table should have ivoid
as an explicit foreign key into
resource
, and the pair ivoid
, and
intf_index
as an explicit primary key. The pair
ivoid
, cap_index
is frequently used in joins
with the capability table and hence should be indexed. Additionally, it
is recommended to maintain an index on at least the
intf_type
column.
The following columns MUST be lowercased during ingestion:
ivoid
, intf_type
, intf_role
,
std_version
, query_type
,
result_type
, url_use
.
The intf_param
table keeps information on the parameters
avalailable on interfaces. It is therefore closely related to
table_column
, but the differences between the two are
significant enough to warrant a separation between the two tables.
Since the names of common column attributes are used where applicable in
both tables (e.g., name, ucd, etc), the two tables cannot be (naturally)
joined.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
intf_index N/A | adql:SMALLINT | A running number for the interfaces of a resource. |
description vs:BaseParam.description | adql:VARCHAR(*) | A free-text description of the column's contents. |
name vs:BaseParam.name | adql:VARCHAR(*) | The name of the column. |
ucd vs:BaseParam.ucd | adql:VARCHAR(*) | A unified content descriptor that describes the scientific content of the parameter. |
unit vs:BaseParam.unit | adql:VARCHAR(*) | The unit associated with all values in the column. |
utype vs:BaseParam.utype | adql:VARCHAR(*) | An identifier for a role in a data model that the data in this column represents. |
std vs:InputParam.std | adql:SMALLINT | If 1, the meaning and use of this parameter is reserved and defined by a standard model. If 0, it represents a database-specific parameter that effectively extends beyond the standard. |
datatype vs:DataType | adql:VARCHAR(*) | The type of the data contained in the column. |
extended_schema vs:DataType.extendedSchema | adql:VARCHAR(*) | An identifier for the schema that the value given by the extended attribute is drawn from. |
extended_type vs:DataType.extendedType | adql:VARCHAR(*) | The data value represented by this type can be interpreted as of a custom type identified by the value of this attribute. |
use_param vs:InputParam.use | adql:VARCHAR(*) | An indication of whether this parameter is required to be provided for the application or service to work properly. |
The pair
ivoid
, intf_index
should be an explicit foreign key into interface
.
The remaining requirements and conventions are as per section 7.7 where applicable.
The relationship element is a slight denormalization of the
vr:Relationship
type: Whereas in VOResource, a single
relationship element can take several IVORNs, in the relational model,
the pairs are stored directly. It is straightforward to translate
between the two representations in the database ingestor.
For convenicence, we repeat the definition of the validation levels from [std:RM]:
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
relationship_type vr:Relationship.relationshipType | adql:VARCHAR(*) | The named type of relationship; this can be mirror-of, service-for, served-by, derived-from, related-to, or something user-defined. |
related_id vr:Relationship.relatedResource.ivoId | adql:VARCHAR(*) | The URI form of the IVOA identifier for the resource refered to. |
related_name vr:Relationship.relatedResource | adql:VARCHAR(*) | The name of resource that this resource is related to. |
The
ivoid
column should be an explicit foreign key into the
resoure
table. You should index at least the
related_id
column.
The following columns MUST be lowercased during ingestion:
ivoid
, relationship_type
,
related_id
.
The validation
subsumes the
vr:validationLevel
members of both vr:Resource
and vr:Capability
.
If the cap_index
column in NULL
the
validation comprises the entire resource. Otherwise, only the
referenced capability has been validated.
While it is recommended that harvesters only accept resource records
from their originating registries, rows in the validation table could very
well originate from third-party registries. Hence, rows in
rr.validation
for the same resource might originate from
completely different registries. This can trigger potentially
problematic behaviour when the original registry updates its resource
record in that naive implementations will lose all third-party
validation rows; also, the semantics of validation results over updates
of resources and/or resource records is not fully mapped to VOResource.
Implementations are free to handle or ignore validation results as they
see fit, and they may add validation results of their own.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
validated_by vr:Validation.validatedBy | adql:VARCHAR(*) | The IVOA ID of the registry or organisation that assigned the validation level. |
level vr:Validation.validationLevel | adql:SMALLINT | A numeric grade describing the quality of the resource description, when applicable, to be used to indicate the confidence an end-user can put in the resource as part of a VO application or research study. |
cap_index N/A | adql:SMALLINT | If non-NULL, the validation only refers to the the capability referenced here. |
The
ivoid
should be an explicit foreign key into resource
.
Note, however, that ivoid
, cap_index
is
not a foreign key into capability
since
cap_index
may be NULL
(in case the validation
addresses the entire resource).
The following columns MUST be lowercased during ingestion:
ivoid
, validated_by
.
The res_date
table contains information gathered from
vr:Curation
's date children.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
date_value vr:Date | adql:TIMESTAMP | A date associated with an event in the life cycle of the resource. |
value_role vr:Date.role | adql:VARCHAR(*) | A string indicating what the date refers to, e.g., created, availability, updated. |
The ivoid
column should be an explicit foreign key into
resource
.
The following columns MUST be lowercased during ingestion:
ivoid
, value_role
.
The res_detail
table is RegTAP's primary means for
extensability as well as a fallback for less-used simple
metadata. Conceptually, it stores triples of resource entity references, utypes,
and values, where resource entities can be resource records themselves
or capabilities. Thus, metadata with values that are either
string-valued or sets of strings can be represented in this table.
As long as the metadata that needs to be represented in the
relational registry for new VOResource extensions is simple enough, no changes to the schema defined
here will be neccessary as these are introduced. Instead, the extension itself simply defines
new utypes to be added in res_detail
.
Some complex metadata—tr:languageFeature
or
vstd:key
being examples—cannot be kept in this table.
If a representation of such information in the relational registry is
required, this standard will need to be changed.
Appendix B gives a list of utypes from the registry extensions that were recommendations at the time of writing. Implementations MUST put all rows that can be generated from the resource records processed for the utypes marked with an exclamation mark there. For the remaining utypes, rows should be provided if convenient; they mostly concern test queries and other curation-type information that, while unlikely to be useful to normal users, may be useful to clients to, e.g., ascertain the continued operation of services before exposing users to them.
Individual ingestors MAY choose to expose additional metadata using other utypes, provided they are formed according to the rules in section 5 (this rule is intended to minimize the risk of later clashes).
In addition to the metadata listed in this specification,
metadata defined in future
IVOA-approved VOResource extensions MUST or SHOULD be present in
res_details
as the extensions require it.
Name Utype | Type | Description |
---|---|---|
ivoid vr:Resource.identifier | adql:VARCHAR(*) | The parent resource. |
cap_index N/A | adql:SMALLINT | The index of the parent capability; if NULL the utype-value pair describes a member of the entire resource. |
detail_utype N/A | adql:VARCHAR(*) | The utype of the member |
detail_value N/A | adql:VARCHAR(*) | (atomic) value of the member |
The ivoid
column should be an explicit foreign key into
resource
. It is recommended to maintain indexes on
at least the columns
detail_utype
and detail_value
.
The following columns MUST be lowercased during ingestion:
ivoid
, detail_utype
. Clients are advised to
use the ivo_nocasematch
function to search in
detail_value
if the values have textual content.
TAP Servers implementing the
ivo://ivoa.net/std/RegTAP/vor
data model MUST
implement the following three functions in their ADQL 2.0 language:
ivo_nocasematch(value VARCHAR(*), pattern VARCHAR(*))->INTEGER
pattern
matches
value
, 0 otherwise.
pattern
is defined as for the SQL LIKE operator, but the
match is performed case-insensitively.
ivo_hasword(haystack VARCHAR(*), needle VARCHAR(*)) -> INTEGER
ivo_hasword
; for such
queries trying them on multiple registries may improve recall.ivo_hashlist_has(haslist VARCHAR(*), item VARCHAR(*))->INTEGER
Reference implementations of the three functions for the PostgreSQL database system are given in Appendix C.
This section contains sample queries to the relational registry, mostly contributed as use cases by various members of the IVOA Registry working group. They are intended as an aid in designing relational registry queries, in particular for users new to the data model.
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.interface WHERE standard_id='ivo://ivoa.net/std/tap'
Other standard_id
s relevant include:
ivo://ivoa.net/std/registry
for OAI-PMH services,ivo://ivoa.net/std/sia
for SIA services,ivo://ivoa.net/std/conesearch
for SCS services,
andivo://ivoa.net/std/ssa
for SSA services.This is somewhat tricky since it is probably hard to image a part of the sky guaranteed not to have some, possibly distant, spiral galaxy in it. However, translating the intention into "find all SIA services that mention spiral in either the subject, the description, or the title", the query would become:
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.resource NATURAL JOIN rr.interface NATURAL JOIN rr.subject WHERE standard_id='ivo://ivoa.net/std/sia' AND ( 1=ivo_nocasematch(subject, '%spiral%') OR 1=ivo_hasword(res_description, 'spiral') OR 1=ivo_hasword(res_title, 'spiral'))
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.resource NATURAL JOIN rr.interface WHERE standard_id='ivo://ivoa.net/std/sia' AND 1=ivo_hashlist_has('infrared', waveband)
SELECT ivoid, access_url FROM rr.capability NATURAL JOIN rr.table_column NATURAL JOIN rr.interface WHERE standard_id='ivo://ivoa.net/std/conesearch' AND ucd='src.redshift'
SELECT ivoid FROM rr.resource WHERE ivoid LIKE 'ivo://org.gavo.dc%'
SELECT ivoid FROM rr.res_role WHERE 1=ivo_nocasematch(role_name, '%gavo%') AND base_role='publisher'
or, if the publisher actually gives its ivo-id in the registry records,
SELECT ivoid FROM rr.res_role WHERE role_ivoid='ivo://ned.ipac/ned' AND base_role='publisher'
This is using the CDS registry as an example. Note how this query
builds conditions on values in res_detail
:
SELECT ivoid FROM rr.resource RIGHT OUTER JOIN ( SELECT 'ivo://' || detail_value || '%' AS pat FROM rr.res_detail WHERE detail_utype='vg:registry.managedauthority' AND ivoid='ivo://cds.vizier/registry') AS authpatterns ON (resource.ivoid LIKE authpatterns.pat)
SELECT access_url FROM rr.interface NATURAL JOIN rr.capability NATURAL JOIN rr.res_detail WHERE standard_id='ivo://ivoa.net/std/tap' AND detail_utype='tr:datamodeltype.ivoid' AND detail_value='ivo://ivoa.net/std/regtap/vor'
"Certain features" could be "have some word in their description and having a column with a certain UCD". Then, a query could look like this:
SELECT ivoid, access_url, name, ucd, description FROM rr.capability NATURAL JOIN rr.interface NATURAL JOIN rr.table_column NATURAL JOIN rr.res_table WHERE standard_id='ivo://ivoa.net/std/tap' AND 1=ivo_hasword(table_description, 'quasar') AND ucd='phot.mag;em.opt.v'
SELECT access_url FROM rr.res_detail NATURAL JOIN rr.capability NATURAL JOIN rr.interface WHERE detail_utype='ssap:datasource' AND standard_id='ivo://ivoa.net/std/ssa' AND detail_value='theory'
SELECT access_url, name, ucd, description FROM rr.capability NATURAL JOIN rr.interface NATURAL JOIN rr.table_column WHERE standard_id='ivo://ivoa.net/std/conesearch' AND ucd LIKE 'pos.parallax%'
Since there are multiple UCDs that might fit the concept of "stellar distance", one might need to give multiple UCDs here.
SELECT DISTINCT base_role, role_name, email FROM rr.res_role NATURAL JOIN rr.interface WHERE access_url='http://dc.zah.uni-heidelberg.de/__system__/tap/run/tap'
This appendix defines the required (those utypes annotated with an
exclamation mark) or recommended content of the res_details
table (section 7.13) by giving
utypes for which utype/value pairs MUST or SHOULD be given if the
corresponding data is present in the resource records. This list is
normative for metadata defined in IVOA recommendations current as of the
publication of this document (see section 1.1). As laid down in section 7.13, new VOResource extensions or new
versions of existing VOResource extensions may amend this list.
The utypes given here double
as partial XPath-like expressions into the VOResource XML trees.
This should make the generation of res_details
rows fairly
straightforward (see also the rules for utype generation given in
section 5).
The following list is sorted by originating standard.
true
, then an accompanying vs:Format
item is a MIME Type. Within res_details, this
does not work for services that give more than one format; since
furthermore the literal of vs:Format allows a good guess whether or not
it is a MIME type, this does not appear a dramatic limitation.Note that StandardsRegExt's status
and use
attributes and the key
element
cannot be represented here, and hence their values are not accessible
in the relational registry. Similarly, complex TAPRegExt metadata on
languages, user defined functions, and the like cannot be
represented in this table. Since these pieces of metadata do not seem
relevant to resource discovery, a more complex model does not seem
warrented just so they can be exposed.
What follows are implementations of the three user defined functions specificed in section 8 in the dialect the PostgreSQL server uses for SQL procedures, PL/pgSQL. For more details on PostgreSQL, see, e.g., [doc:Postgres92].
CREATE OR REPLACE FUNCTION ivo_hasword(haystack TEXT, needle TEXT) RETURNS INTEGER AS $func$ BEGIN IF to_tsvector(haystack) @@ plainto_tsquery(needle) THEN RETURN 1; ELSE RETURN 0; END IF; END; $func$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION ivo_hashlist_has(hashlist TEXT, item TEXT) RETURNS INTEGER AS $func$ BEGIN -- postgres can't RE-escape a user string; hence, we'll have -- to work on the hashlist IF lower(item) = ANY(string_to_array(lower(hashlist), '#')) THEN RETURN 1; ELSE RETURN 0; END IF; END; $func$ LANGUAGE plpgsql; CREATE OR REPLACE FUNCTION ivo_nocasematch(value TEXT, pattern TEXT) RETURNS INTEGER AS $func$ BEGIN IF value ILIKE pattern THEN RETURN 1; ELSE RETURN 0; END IF; END; $func$ LANGUAGE plpgsql;
This appendix contains a set of constraints and recommendations for implementing the relational registry model on actual RDBMSes, originating partly from an analysis of the data content of the VO registry in February 2013, partly from a consideration of limits in various XML schema documents. This concerns, in particular, minimum lengths for columns of type adql:VARCHAR. Implementations MUST NOT truncate strings of length equal to or smaller than the minimal lengths given here; the limitations are not, however, upper limits, and indeed, when choosing an implementation strategy it is in general preferable to not impose artificial length restrictions, in particular if no performance penalty is incurred.
These notes can also be useful with a view to preparing user interfaces for the relational registry, since input forms and similar user interface elements invariably have limited space; the limits here give reasonable defaults for the amount of data that should minimally be manipulatable by a user with reasonable effort.
The ivoid
field present in every table of this
specification merits special consideration, on the one hand due to its
frequency, but also since other IVOA identifiers present in the
relational registry should probably be treated analoguously.
Given that IVORNs in the 2013 data fields have a maximum
length of roughly 100 characters, we propose that a maximum length of
255 should be sufficient even when taking possible fragment identifiers
into account.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
ivo-id | VARCHAR(255) |
{all_tables}.ivoid resource.footprint_ivoid res_role.role_ivoid capability.standard_id relationship.related_id validation.validated_by |
The relational registry also contains some date-time values. The most
straightforward implementation certainly is to use SQL timestamps.
Other relational registry fields that straightforwardly map to common
SQL types are those that require numeric values, viz.,
adql:REAL
, adql:SMALLINT
, and
adql:INTEGER
. The following table summarizes these:
Field type | Datatype | Pertinent Fields |
---|---|---|
floating point | adql:REAL |
resource.region_of_regard |
small integer | adql:SMALLINT |
capability.cap_index res_schema.schema_index res_table.schema_index table_column.std interface.cap_index interface.intf_index intf_param.intf_index intf_param.std validation.level validation.cap_index res_detail.cap_index res_table.table_index table_column.table_index |
The fields containing Utypes, UCDs, and Units are treated in parallel here. The 2013 registry data indicates a length of 128 characters is sufficient for real-world purposes; actually, at least UCDs and Units could of course grow without limitations, but it seems unreasonable anything longer than a typical line might actually be useful. As far as utypes are concerned, we expect those to shrink rather than grow with new standardization efforts.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
Utype(s) UCD(s) Units |
VARCHAR(128) |
res_schema.schema_utype res_table.table_utype table_column.ucd table_column.unit table_column.utype intf_param.ucd intf_param.unit intf_param.utype res_detail.detail_utype |
The relational registry further has an e-mail field, for which we chose 128 characters as a reasonable upper limit (based on a real maximum of 40 characters in the 2013 data). There are furthermore URLs (in addition to access and reference URLs, there are also URLs for the WSDL of SOAP services and logos for roles). Due to the importance of in particular the access URLs we strongly recommend to use non-truncating types here. Empirically, there are access URLs of up to 224 characters in 2013 (reference URLs are less critical at a maximum of 96 characters). Expecting that with REST-based services, URL lengths will probably rather tend down than up, we still permit truncation at 255 characters.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
VARCHAR(128) |
res_role.email |
|
URLs | VARCHAR(255) |
resource.reference_url res_role.logo interface.wsdl_url interface.access_url |
The next group of columns comprises those that have values taken from a controlled or finite vocabulary. Trying to simplify the view, lengths in the form of powers of two are considered.
Field type | Datatype suggested | Pertinent Fields |
---|---|---|
predefined values |
VARCHAR(255) |
resource.content_level resource.content_type |
VARCHAR(64) |
resource.waveband | |
VARCHAR(32) |
resource.res_type resource.rights capability.cap_type res_table.table_type table_column.flag table_column.datatype table_column.extended_schema table_column.extended_type table_column.type_system interface.result_type intf_param.datatype intf_param.extended_schema intf_param.extended_type |
|
VARCHAR(4) |
interface.query_type interface.url_use |
Finally, there are the fields that actually contain what is basically free text. For these, we have made a choice from reasonable powers of two lengths considering the actual lengths in the 2013 registry data. A special case are fields that either contain natural language text (the descriptions) or those that have grown without limit historically (resource.creator_seq, and, giving in to current abuses discussed above, res_role.role_name). For all such fields, no upper limit can sensibly be defined. However, since certain DBMSes (e.g., MySQL older than version 5.6) cannot index fields with a TEXT datatype and thus using VARCHAR may be necessary at least for frequenly-searched fields, we give the maximal length of the fields in the 2013 registry in parentheses after the column designations for the TEXT datatype:
Field type | Datatype suggested | Interested Fields |
---|---|---|
free string values | TEXT |
resource.res_description (7801) resource.creator_seq (712) res_role.role_name (712) res_schema.schema_description (934) res_table.table_description (934) table_column.description (3735) intf_param.description (347) capability.cap_description (100) |
VARCHAR(255) |
resource.res_title res_role.address res_schema.schema_title res_table.table_title relationship.related_name res_detail.detail_value |
|
VARCHAR(128) |
resource.version resource.source_value resource.facility resource.instrument subject.subject |
|
VARCHAR(64) |
res_table.table_name table_column.name intf_param.name |
|
VARCHAR(32) |
resource.source_format res_role.telephone res_schema.schema_name interface.intf_type interface.intf_role relationship_type res_date.value_role |
|
VARCHAR(16) |
resource.short_name table_column.arraysize interface.std_version intf_param.use |