Contents of /trunk/projects/registry/RegistryInterface/RegistryInterface.tex

Revision 4224
Fri Sep 1 17:27:57 2017 UTC by dower
File MIME type: application/x-tex
File size: 50386 byte(s)
RegistryInterface: 1.1 typos and added reference
1 \documentclass{ivoa}
2 \input tthdefs
4 \usepackage[utf8]{inputenc}
5 \usepackage{todonotes}
6 \usepackage{listings}
7 \usepackage{natbib}
8 \lstloadlanguages{XML}
9 \lstset{flexiblecolumns=true,basicstyle=\small,tagstyle=\ttfamily}
11 \SVN$Rev$
12 \SVN$Date$
13 \SVN$URL$
15 \hyphenation{name-space}
17 \newcommand{\oaiop}[1]{\textit{#1}}
19 \ivoagroup{Registry}
22 \author{Theresa Dower}
23 \author{Markus Demleitner}
24 \author{Kevin Benson}
25 \author{Ray Plante}
26 \author{Elizabeth Auden}
27 \author{Matthew Graham}
28 \author{Gretchen Greene}
29 \author{Martin Hill}
30 \author{Tony Linde}
31 \author{Dave Morris}
32 \author{Wil O'Mullane}
33 \author{Guy Rixon}
34 \author{Aur\'elien St\'eb\'e}
35 \author{Kona Andrews}
37 \editor{Theresa Dower}
38 \editor{Markus Demleitner}
40 \previousversion[http://www.ivoa.net/documents/RegistryInterface/20091104/]
41 {IVOA Registry Interfaces 1.0, IVOA Recommendation 2009-11-04}
44 \title{Registry Interfaces}
46 \begin{document}
48 \begin{abstract}
49 The VO Registry provides a mechanism with which VO applications can
50 discover and select resources that are relevant for a particular
51 scientific problem. This specification defines the operation of this
52 system. It is based on a general, distributed model composed of
53 searchable and publishing registries, as introduced at the beginning of
54 this document. The main body of the specification has three components:
55 (a) an interface for harvesting publishing registries, which builds upon
56 the Open Archives Initiative Protocol for Metadata Harvesting; (b) a
57 VOResource extension for registering registry services, together with a
58 description of a central list of said IVOA registry services; and (c) a
59 discussion of the Registry of Registries as the root component of data
60 discovery in the VO.
61 \end{abstract}
64 \section{Introduction}
66 \label{introduction}
68 In the Virtual Observatory (VO), registries provide a means for
69 discovering useful resources, i.e., data and services. This discovery
70 takes place by searching within structured descriptions of resources,
71 the resource records, authored by the data providers. In order to avoid
72 a single point of failure for the VO, the Registry is distributed.
73 This means that each data provider can run a service injecting
74 resource records into the Registry (a ``publishing registry'' as defined
75 below), and anyone can run services that allow global discovery (a
76 ``searchable registry'' as defined below).
78 To enable this, common mechanisms for registry communication and
79 interaction are required.
80 This document therefore describes the standard interfaces that enable
81 interoperable registries. Through these interfaces, registry
82 builders have a common way of sharing resource descriptions with users,
83 applications, and other registries.
85 This specification does not cover interfaces for global discovery, which
86 are the subject of other IVOA standards. Also, service operators are
87 free to build interactive, end-user interfaces in
88 any way that best serves their target community.
90 While the architecture and standard processes for distributed registry
91 search and maintenance remain similar to this document's version 1.0 and
92 it remains backward compatible, there is a significant philosophical
93 change in this update. Most importantly, a defined search interface
94 using SOAP technology is no longer recommended, and a Table Access
95 Protocol service using a registry data model is encouraged for search, with the
96 understanding that new technologies will continue to be developed and adopted.
98 \subsection{Registry Architecture and Definitions}
100 \label{arch}
102 A \emph{registry} is first a repository of structured descriptions of
103 resources. In the VO, a \emph{resource} is defined by the IVOA
104 Recommendation ``Resource Metadata for the Virtual Observatory''
105 \citep{std:RM}, henceforth referred to as RM, as being
108 \begin{quotation}
109 a general term referring to a VO element that can be
110 described in terms of who curates or maintains it and which can be
111 given a name and a unique identifier. Just about anything can be a
112 resource: it can be an abstract idea, such as sky coverage or an
113 instrumental setup, or it can be fairly concrete, like an organization
114 or a data collection.
115 \end{quotation}
117 Organizations, data collections, and services can be considered
118 classes of resources. The most important type of resource to
119 applications is a service that actually does something. A registry
120 (lower case),
121 then, is ``a service for which the response is a structured description
122 of resources'' (RM).
124 This specification is based on the general IVOA model for registries
125 \citep{2004ASPC..314..585P}, which builds on RM's model
126 for resources. In this model, the VO environment features
127 different types of registries that serve different functions. The
128 primary distinction is between publishing registries and searchable
129 ones. A secondary distinction is full versus local.
131 A \emph{searchable registry} is one that allows users and client
132 applications to search for resource records using selection criteria
133 against the metadata contained in the records. The purpose of this type
134 of registry is to aggregate descriptions of many resources distributed
135 across the network. By providing a single place to locate data and
136 services, applications are spared from having to visit many different
137 sites just to determine which ones are relevant to the scientific
138 problem at hand. A searchable registry gathers its descriptions from
139 across the network through a process called \emph{harvesting}.
141 A \emph{publishing registry} is one that simply exposes its resource
142 descriptions to the VO environment in a way that allows those
143 descriptions to be harvested. The contents of these registries tend to
144 be limited to resources maintained by one or a few providers and thus
145 are local in nature; for example, a data center will run its own
146 publishing registry to allow other VO components to gather metadata on
147 the data center's published services.
148 Since the purpose is simply publishing and not to serve
149 users and applications directly, it is not necessary to support full
150 searching capabilities. This simplifies the requirements for a
151 publishing registry:
152 storage, management, and indexing of the records can be simpler, as
153 there is no need to support a
154 search interface facilitating complex discovery queries.
155 While a searchable registry in practice will necessitate the
156 use of a database system, a simple publishing registry may get by
157 storing its records as flat files on disk.
159 Note that some registries can play both roles; that is, a searchable
160 registry may also publish its own resource descriptions.
162 A secondary distinction is full versus local. A \emph{full registry}
163 is one that attempts to contain records of all resources known to the VO.
164 Several such registries exist, run by various VO projects. A
165 \emph{local registry}, on the other hand, contains only a subset of
166 known resources. While for publishing registries this subset usually is
167 defined by what services are maintained by the registry's operator,
168 other selection criteria are conceivable. For instance, the IVOA's
169 Education IG is considering running a registry only containing resources
170 manually selected for suitability for primary and secondary education.
172 As mentioned above, harvesting is the mechanism by which a registry can
173 collect resource records from other registries. It is used by full
174 registries to aggregate resource records from publishing
175 registries. It can also be used to synchronize two registries to ensure
176 that they have the same contents. Harvesting, in this specification, is
177 modeled as a pull operation between two registries. The term
178 \emph{harvester} refers to the registry that wishes to receive records
179 (usually a full searchable registry); it sends its request to the
180 \emph{harvestee} (usually a publishing registry), which responds with
181 the records. Harvesting is a much simpler process than a fully-featured
182 search interface, as only very few constraints need to be supported and
183 only full records are being transmitted in responses.
184 Consequently, different protocols are employed for the
185 two types of registry operations.
187 In this text, ``registry'' in lower case refers to concrete services,
188 while ``Registry'' (or ``VO Registry'') in upper case refers to the
189 combination of the set of all resource records and the interfaces to
190 query and manage them.
192 \subsection{The Registry Interface within the VO Architecture}
194 \label{sect:rolewithinivoa}
197 \begin{figure}[th]
198 \begin{center}
199 \includegraphics[width=0.9\textwidth]{archdiag.png}
200 \caption{IVOA Architecture
201 diagram with the Registry Interface specification (RI) and
202 the related standards marked up.}
203 \label{fig:arch}
204 \end{center}
205 \end{figure}
207 This specification directly relates to other VO standards in the
208 following ways:
211 \begin{bigdescription}
212 \item[VOResource \citep{std:VOR}]VOResource sets the foundation for a
213 formal definition of the data model for resource records via its schema
214 definition.
216 \item[IVOA Identifiers \citep{std:VOID2}]IVOA identifiers serve as
217 the primary keys of the VO Registry. Also, the notion of an authority as
218 laid down in IVOA Identifiers plays an important role as publishing
219 registries can be viewed as a realization of a set of authorities.
221 \end{bigdescription}
224 \section{The IVOA Harvesting Interface}
226 \label{harvesting}
228 The harvesting interface allows the retrieval of complete VOResource
229 records from registries supporting harvesting. Publishing registries
230 MUST support the IVOA harvesting interface; searchable registries SHOULD
231 do so.
233 The IVOA harvesting interface is built on the standard Protocol for
234 Metadata Harvesting developed by the Open Archives Initiative, OAI-PMH
235 \citep{std:OAIPMH}. In this section, after giving a brief introduction
236 to OAI-PMH, we define additional constraints and requirements for
237 OAI-PMH services to be interoperable with the VO environment.
239 In version 1.0 of this document, a variant of the OAI-PMH
240 protocol was defined using SOAP in the exchange of messages. Version
241 1.1 no longer defines it (although of course there is no requirement to
242 remove it from running services); since OAI-PMH over SOAP has never been
243 in active use by the IVOA, we consider this still a minor specification
244 change not warranting a new major version.
246 \subsection{The OAI Protocol for Metadata Harvesting}
248 \label{oaipmh}
250 While for details of OAI-PMH we refer to \citet{std:OAIPMH},
251 in the following we give a
252 brief overview of OAI-PMH that should be sufficient to understand the
253 protocol's role within the Registry interface architecture.
255 The OAI-PMH v2.0 specification defines:
258 \begin{itemize}
260 \item the meaning and behavior of the six harvesting operations, referred
261 to as verbs,{}
263 \item the meaning of the input arguments for each operation, and{}
265 \item the XML Schema used to encode response messages.{}
266 \end{itemize}
268 The six standard operations laid down in OAI-PMH are:
271 \begin{bigdescription}
272 \item[Identify] provides a description of the registry.
274 \item[ListIdentifiers]returns a list of identifiers for the resource
275 records held by the registry, possibly restricted to records changed
276 within a certain time span or to those belonging to a certain set.
278 \item[ListRecords]returns complete resource records in the registry,
279 possibly restricted to records changed within a certain time span or to
280 those belonging to a certain set.
282 \item[GetRecord]returns a single resource description matching a given
283 identifier.
285 \item[ListMetadataFormats]returns a list of supported formats that the
286 registry can use to encode resource descriptions upon a harvester's
287 request.
289 \item[ListSets]returns a list of set names supported by the registry
290 that harvesters can request in order to get back a subset of the
291 descriptions held by the registry.
293 \end{bigdescription}
295 The ListRecords and GetRecord operations return the actual resource
296 description records held by the registry. These descriptions are encoded
297 in XML and wrapped in a general-purpose envelope defined by the OAI-PMH
298 XML Schema (with the namespace
299 \texttt{http://www.openarchives.org/OAI/2.0/}).
301 Through the operations' arguments, OAI-PMH provides a number of useful
302 features:
305 \begin{itemize}
307 \item Support for multiple return formats. As suggested by the existence
308 of the
309 \oaiop{ListMetadataFormats} operation, a harvester can request the
310 formats available for encoding returned resource descriptions.{}
312 \item Harvesting by date. The \oaiop{ListIdentifiers} and
313 \oaiop{ListRecords} operations both support \texttt{from} and
314 \texttt{until} date arguments which restrict the response to records
315 changed within the given, possibly half-open, interval.{}
317 \item Harvesting by category. The \oaiop{ListIdentifiers} and
318 \oaiop{ListRecords} operations both support a set argument for
319 retrieving resources that are grouped in a particular category. Resource
320 records may belong to multiple sets.{}
322 \item Marking records as deleted. Registries may mark records as deleted
323 so that harvesters will be notified that a resource has become
324 unavailable even if only performing incremental harvests.
326 \item Support for resumption tokens. If a request results in returning a
327 very large number of records, the registry can choose to split the
328 results over several calls; this is done by passing a resumption token
329 back to the harvester. The harvester uses it to retrieve the next set of
330 matching results.{}
332 \end{itemize}
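
The selective-harvesting and flow-control features listed above can be
illustrated with a response to an incremental \oaiop{ListRecords} request.
The following is only a sketch: the endpoint
\nolinkurl{http://registry.example.org/oai}, the datestamps, and the token
value are invented for illustration, and the records themselves are elided.

\begin{lstlisting}[language=XML]
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2017-09-01T17:30:00Z</responseDate>
  <!-- the request element echoes the arguments the harvester sent -->
  <request verb="ListRecords" metadataPrefix="ivo_vor"
    from="2017-08-01T00:00:00Z"
    until="2017-09-01T00:00:00Z">http://registry.example.org/oai</request>
  <ListRecords>
    <!-- the first batch of record elements goes here -->
    <!-- a non-empty token signals that more records are pending -->
    <resumptionToken>hypothetical-token-0001</resumptionToken>
  </ListRecords>
</OAI-PMH>
\end{lstlisting}

To obtain the next batch, the harvester repeats the request with only the
\texttt{verb} and \texttt{resumptionToken} arguments. An empty
\xmlel{resumptionToken} element in a later response marks the end of the
list.
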
333 It is important to note that the OAI-PMH interface is not intended
334 to be a general search interface. The filtering capabilities described
335 above are just enough to support intelligent harvesting between
336 registries. Most end-user applications will use a dedicated search
337 interface on a searchable registry (cf.~sect.~\ref{sect:searching}).
339 In addition to basic OAI-PMH compliance, this specification defines
340 a set of OAI-PMH-compliant requirements and recommendations
341 special to OAI-PMH's use within the VO that are described in the
342 remaining subsections.
345 \subsection{Metadata Formats for Resource Descriptions}
347 \label{sect:metadataformats}
349 All IVOA registries that support the Harvesting Interface must support
350 two standard metadata formats: the OAI Dublin Core format (mandated by
351 the base OAI-PMH standard) and the IVOA VOResource metadata format
352 \citep{std:VOR}.
354 The VOResource metadata format has the metadata prefix name
355 \texttt{ivo\_vor}, which can be used wherever \citet{std:OAIPMH} allows a
356 metadata prefix name. The format uses the VOResource core XML Schema
357 with the namespace
358 \texttt{http://www.ivoa.net/xml/VOResource/v1.0}
359 (recommended namespace prefix \xmlel{vr:}) along with any legal
360 extension of this schema to encode the resource descriptions within the
361 OAI-PMH metadata tag from the OAI XML Schema (namespace
362 \texttt{http://www.openarchives.org/OAI/2.0/}, recommended namespace
363 prefix \xmlel{oai:}).
365 As VOResource and its extensions do not define global elements, the
366 child element within \xmlel{oai:metadata} needs to be separately
367 defined. This specification does this by providing the
368 \xmlel{ri:Resource} element. It is defined in a schema with the target
369 namespace
370 \nolinkurl{http://www.ivoa.net/xml/RegistryInterface/v1.0}, which is given
371 in appendix~\ref{app:rischema}.
373 The
374 \xmlel{ri:Resource} element MUST include an \xmlel{xsi:type} attribute
375 that assigns the element's type to \xmlel{vr:Resource} or one of its
376 legal extensions.
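
As an illustration of these rules, an OAI-PMH record carrying a VOResource
description might look as follows. This is a sketch, not normative text:
the identifier \nolinkurl{ivo://example.org/org} and the datestamps are
invented, the choice of \xmlel{vr:Organisation} as the resource type is
arbitrary, and the content of the resource record is elided. The
\xmlel{xsi:} prefix is bound to the standard XML Schema instance namespace.

\begin{lstlisting}[language=XML]
<oai:record
  xmlns:oai="http://www.openarchives.org/OAI/2.0/"
  xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0"
  xmlns:vr="http://www.ivoa.net/xml/VOResource/v1.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <oai:header>
    <oai:identifier>ivo://example.org/org</oai:identifier>
    <oai:datestamp>2017-09-01T17:27:57Z</oai:datestamp>
  </oai:header>
  <oai:metadata>
    <!-- xsi:type selects vr:Resource or one of its extensions -->
    <ri:Resource xsi:type="vr:Organisation" status="active"
      created="2017-09-01T17:27:57" updated="2017-09-01T17:27:57">
      <!-- title, identifier, curation, content, ... -->
    </ri:Resource>
  </oai:metadata>
</oai:record>
\end{lstlisting}
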
378 It is strongly recommended that all QName values of \xmlel{xsi:type}
379 attributes within the VOResource record use XML namespace prefixes as
380 recommended in VOResource or the VOResource extensions. Minor version
381 changes are not in general reflected in the recommended prefixes --
382 e.g., both VODataService 1.0 and VODataService 1.1 use \xmlel{vs:}.
383 Registry operators
384 who must deliver OAI-PMH documents containing resource records written
385 to different versions of a registry extension are advised to
386 override the prefix
387 bindings on the element level if at all possible.
389 The OAI Dublin Core format, with the metadata prefix of \texttt{oai\_dc},
390 is defined by the OAI-PMH base standard and must be supported by all
391 OAI-PMH compliant registries.
393 Harvestable registries may support other metadata formats. Responses to
394 the
395 \oaiop{ListMeta\-dataFormats} operation
396 must list all names for formats supported
397 by the registry; even though they are mandatory, this list must include
398 \texttt{ivo\_vor} and \texttt{oai\_dc}.
401 \subsection{Identifiers in OAI Messages}
403 \label{oaiidentifiers}
405 In accordance with the OAI-PMH standard, an OAI-PMH XML envelope that
406 contains a resource description must include a globally unique URI that
407 identifies that resource record. This identifier must be the IVOA
408 identifier used to identify the resource being described as given in
409 its \xmlel{vr:identifier} child element.
411 This specification does not follow the recommendation of the OAI-PMH
412 standard with regard to record identifiers. OAI-PMH makes a distinction
413 between the resource record containing resource metadata and the
414 resource itself; thus, it recommends that the identifier in the OAI
415 envelope be different from the resource identifier. In particular, the
416 former is the choice of the publishing registry. This allows one to
417 distinguish resource descriptions of the same resource from different
418 registries, which in principle could be different.
420 In the VO, because it is intended that resource descriptions of the
421 same resource from different registries should not differ (apart from
422 possible additions of \xmlel{vr:validationLevel} elements), there
423 is not a strong need to distinguish between the resource and the
424 resource description.
426 By making the resource and resource record
427 identifiers the same, it becomes much easier to retrieve the record for
428 a single resource via \oaiop{GetRecord}, regardless of which
429 registry is being queried. Otherwise -- when the registry chooses
430 the record identifier -- a client will not a priori know the record
431 identifier for a particular resource, and so it is left to call
432 \oaiop{ListRecords} and search through the metadata of all the
433 records itself to find the one of interest. In contrast, IVOA
434 identifiers are intended to be a cross-application way of referring to a
435 resource, and thus when a client wants only a single specific resource
436 record, it is very likely that it would know the resource identifier
437 when making a call to the \oaiop{GetRecord} operation.
440 \subsection{Required Records}
441 \label{oairequired}
443 This section describes the records that a harvestable VO registry
444 must include among those it emits via the OAI-PMH operations.
446 The harvestable registry MUST return one record that describes the
447 registry itself as a whole, and the \texttt{ivo\_vor} format MUST be
448 supported for this record. This record is also included in the
449 \oaiop{Identify} operation response. When encoded using the
450 \texttt{ivo\_vor} format, the returned \xmlel{ri:Resource} element must
451 be of the type \xmlel{vg:Registry} from the VORegistry schema
452 (see sect.~\ref{sect:vgharvest}). The
453 record MUST include a \xmlel{vg:managedAuthority} for every authority
454 identifier that originates at that registry.
456 Additions to the list of a registry's managed authorities must follow
457 the protocol outlined in sect.~\ref{sect:authres}.
459 The harvestable registry must be able to return exactly one record in
460 \texttt{ivo\_vor} for each authority identifier listed as a
461 \xmlel{vg:managedAuthority} in the \xmlel{vg:Registry} record
462 that describes that registry. When encoded in the \texttt{ivo\_vor}
463 format, the type of these elements must be \xmlel{vg:Authority}.
466 \subsection{The Identify Operation}
468 \label{sect:oaiidentify}
470 The \oaiop{Identify} operation describes the harvestable registry as a
471 whole. The response from this operation must include all information
472 required by the OAI-PMH standard. In particular, it must include an
473 \xmlel{oai:baseURL} element that must refer to the base URL to the
474 harvesting interface endpoint. The \oaiop{Identify} response must
475 include an \xmlel{oai:description} element containing a single
476 \xmlel{ri:Resource} element with an \xmlel{xsi:type} attribute that
477 sets the element's type to \xmlel{vg:Registry}. The content of
478 \xmlel{vg:Registry} type must be the registry description of the
479 harvestable registry itself.
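
A response satisfying these requirements could be structured as in the
following sketch. The repository name, URLs, e-mail address, and datestamps
are invented for illustration, and the registry's own resource record is
elided. The \xmlel{deletedRecord} and \xmlel{granularity} elements are
required by the base OAI-PMH standard; the values shown are merely one
possible choice.

\begin{lstlisting}[language=XML]
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
  xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0"
  xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <responseDate>2017-09-01T17:30:00Z</responseDate>
  <request verb="Identify">http://registry.example.org/oai</request>
  <Identify>
    <repositoryName>Example Publishing Registry</repositoryName>
    <!-- must point at the harvesting endpoint itself -->
    <baseURL>http://registry.example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <adminEmail>registry@example.org</adminEmail>
    <earliestDatestamp>2010-01-01T00:00:00Z</earliestDatestamp>
    <deletedRecord>transient</deletedRecord>
    <granularity>YYYY-MM-DDThh:mm:ssZ</granularity>
    <description>
      <ri:Resource xsi:type="vg:Registry" status="active"
        created="2010-01-01T00:00:00" updated="2017-09-01T17:27:57">
        <!-- the registry's own resource record -->
      </ri:Resource>
    </description>
  </Identify>
</OAI-PMH>
\end{lstlisting}
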
481 In its \oaiop{Identify} response, an OAI-PMH-compliant registry must
482 declare its support for deleted records. This can be one of
484 \begin{description}
486 \item[\texttt{no}] -- the registry will never notify harvesters of
487 records that have become unavailable. In an environment like the VO,
488 where searchable registries frequently harvest publishing registries,
489 this is strongly discouraged, as without deleted records, harvesters
490 need to perform full harvests every time or risk delivering stale
491 records.
492 \item[\texttt{transient}] -- the registry will notify harvesters of
493 records that have become unavailable, but the deleted records will
494 entirely vanish after some time. This specification adds to the OAI-PMH
495 requirements that registries declaring \texttt{transient} support MUST
496 keep their deleted records for at least six months (after which they may
497 discard them).
498 \item[\texttt{persistent}] -- the registry promises to indefinitely keep
499 deleted records.
500 \end{description}
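
For registries declaring \texttt{transient} or \texttt{persistent} support,
a deleted record appears in \oaiop{ListRecords} and
\oaiop{ListIdentifiers} responses as a header without metadata. The
following fragment is a sketch with an invented identifier and datestamp;
the OAI-PMH namespace is assumed to be the default namespace declared on an
enclosing element.

\begin{lstlisting}[language=XML]
<record>
  <header status="deleted">
    <identifier>ivo://example.org/retired-service</identifier>
    <!-- datestamp of the deletion, so incremental harvests see it -->
    <datestamp>2017-06-15T08:12:40Z</datestamp>
    <setSpec>ivo_managed</setSpec>
  </header>
  <!-- no metadata element follows for a deleted record -->
</record>
\end{lstlisting}

A harvester encountering such a header would typically drop or flag its
local copy of the record with the given identifier.
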
502 \subsection{IVOA Supported Sets}
504 \label{supportedsets}
506 Sets, as defined in the OAI-PMH standard, are ``an optional construct
507 for grouping items for the purpose of selective harvesting'' (see
508 \citet{std:OAIPMH}, section 2.6). Harvestable IVOA registries are free
509 to define any number of custom sets for categorizing records. The
510 OAI-PMH standard allows a record to be a member of multiple sets.
512 This specification defines one reserved set name with a special
513 meaning; future versions of this specification may define additional set
514 names. These reserved set names will all start with the characters
515 \texttt{ivo\_}; implementors should not define their own set names
516 that begin with this string. While support for sets is optional
517 in the OAI-PMH standard, a VO registry MUST support
518 the set with the reserved name \texttt{ivo\_managed} to be compliant
519 with this specification.
521 The \texttt{ivo\_managed} set refers to all records that originate from the
522 queried registry. That is, those records that were harvested from other
523 registries are excluded. The resource identifiers given in the
524 records MUST have an authority identifier that matches one of the
525 \xmlel{vg:managedAuthority} values in the \xmlel{vg:Registry}
526 record for that registry. Full searchable registries may use this set
527 while harvesting other registries to avoid getting duplicate records.
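
As a sketch of how the reserved set is used, the following response
answers a \oaiop{ListIdentifiers} request restricted to
\texttt{ivo\_managed}. The endpoint, identifier, and datestamps are
hypothetical; the example assumes that the registry's
\xmlel{vg:Registry} record lists \texttt{example.org} as a managed
authority.

\begin{lstlisting}[language=XML]
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2017-09-01T17:30:00Z</responseDate>
  <request verb="ListIdentifiers" metadataPrefix="ivo_vor"
    set="ivo_managed">http://registry.example.org/oai</request>
  <ListIdentifiers>
    <header>
      <!-- authority part "example.org" must be a managed authority -->
      <identifier>ivo://example.org/catalog</identifier>
      <datestamp>2017-08-20T11:05:00Z</datestamp>
      <setSpec>ivo_managed</setSpec>
    </header>
  </ListIdentifiers>
</OAI-PMH>
\end{lstlisting}
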
529 \subsection{Time Granularity}
531 \label{sect:timegranularity}
533 Datestamps in the OAI-PMH 2.0 standard are encoded using ISO8601 and
534 expressed in UTC, with the UTC designator ``Z'' appended to seconds-based
535 granularity where supplied, i.e. \texttt{YYYY-MM-DDThh:mm:ssZ}. In
536 general OAI-PMH registries, granularity at seconds scale is optional.
537 Harvestable IVOA registries MUST report datestamps at the granularity of
538 seconds and accept \texttt{from} and \texttt{until} arguments in the same format. This
539 simplifies the incremental harvesting process in the multi-registry IVOA
540 environment.
542 \section{Registering Registries}
543 \label{regreg}
545 Harvesting registries must be able to locate remote registry resources
546 relevant to them, and both harvesting registries and clients need access
547 to metadata for the registry service itself. We address both of these
548 issues by providing a schema for describing registries themselves, and a
549 repository for indexing them.
551 The resource specification for registries themselves is defined by an
552 \xmlel{ri:Re\-source} extension \xmlel{vg:Registry}, which describes
553 metadata of the registry itself and its support for interfaces
554 described in this document or elsewhere.
555 These resources are themselves stored as
556 records in registries as described in sect.~\ref{oairequired}. From each
557 identifier, further IVOA identifiers for authority information,
558 services, and other records belonging under that publishing umbrella
559 may be created. A publishing registry is said to exclusively manage a
560 naming authority on behalf of the owning publisher; this means that
561 within the IVOA registry network, only that specific registry may
562 publish records having identifiers which begin with that authority identifier.
564 The XML namespace URI of this schema is
565 \nolinkurl{http://www.ivoa.net/xml/VORegistry/v1.0}. It has been chosen
566 to allow it to be resolved as a URL to the XML Schema document, which is
567 also given in appendix~\ref{app:vgschema}. The recommended prefix for
568 this namespace is \xmlel{vg:}.
570 The schema has not been changed from the one used in version 1.0,
571 although the standard contents have somewhat changed. The rationale for
572 keeping the schema unchanged is that the presence of schema features no longer relevant
573 has no detrimental consequences for Registry operations, whereas changing
574 the schema could break already operational clients.
577 \begin{figure}[th]
578 \begin{lstlisting}[language=XML]
579 <ri:Resource status="active" xsi:type="vg:Authority"
580 updated="2006-07-01T09:00:00" created="2006-07-01T09:00:00">
581 <title>IVOA Naming Authority</title>
582 <shortName>IVOA</shortName>
583 <identifier>ivo://ivoa.net</identifier>
584 <curation>
585 <publisher ivo-id="ivo://ivoa.net/IVOA">International Virtual
586 Observatory Alliance</publisher>
587 <creator>
588 <name>Raymond Plante</name>
589 <logo>http://www.ivoa.net/icons/ivoa_logo_small.jpg</logo>
590 </creator>
591 <date>2006-07-01</date>
592 <contact>
593 <name>IVOA Resource Registry Working Group</name>
594 <email>registry@ivoa.net</email>
595 </contact>
596 </curation>
597 <content>
598 <subject>virtual observatory</subject>
599 <description>This registers the IVOA as the owner of the ivoa.net
600 authority identifier.</description>
601 <referenceURL>http://rofr.ivoa.net</referenceURL>
602 </content>
603 <managingOrg>International Virtual Observatory Alliance</managingOrg>
604 </ri:Resource>
605 \end{lstlisting}
606 \caption{A sample \xmlel{vg:Authority}-typed resource record as it would
607 be delivered within \xmlel{oai:metadata}. XML namespace declarations
608 for the prefixes \xmlel{ri:}, \xmlel{xsi:}, and \xmlel{vg:} are
609 assumed on enclosing elements.}
610 \label{fig:authrecord}
611 \end{figure}
613 \subsection{The Authority Resource Extension and the Publishing Process}
615 \label{sect:authres}
618 The \xmlel{vg:Authority} type extends the core \xmlel{vr:Resource}
619 type to specifically describe the ownership of an authority identifier
620 by a publishing organization.
622 The IVOA identifier of a \xmlel{vg:Authority} record provided via the
623 \xmlel{vr:identifi\-er} element must have an empty resource key component
624 as defined in \citet{std:VOID}.
626 The meaning of a \xmlel{vg:Authority} record is that the organization
627 referenced in the \xmlel{vg:managingOrg} element has the sole right to
628 create (in collaboration with a publishing registry) and register
629 resource descriptions using the authority identifier given by the
630 \xmlel{vr:identifier} element.
632 Before a publisher can create resource descriptions using a new
633 authority identifier, it must first register its claim to the authority
634 identifier by creating a \xmlel{vg:Authority} record. Before the
635 publishing registry commits the record for export, it must first search
636 a full registry to determine if a \xmlel{vg:Authority} with this
637 identifier already exists; if it does, the publication of the new
638 \xmlel{vg:Authority} record must fail.
640 When a registry creates a
641 \xmlel{vg:Authority} record, it is said that the registry manages the
642 associated authority identifier (on behalf of the owning publisher)
643 because only that registry may create records with identifiers beginning
644 with that authority identifier. The registry must also document this ownership
645 by adding a corresponding \xmlel{vg:managedAuthority} element to the
646 registry's own resource record.
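
The corresponding part of the registry's own record might then look like
the following sketch. The title, identifier, and dates are invented, and
most of the record is elided. Two points are assumptions made for
illustration: child elements are written without namespace prefixes, as in
the sample record in Figure~\ref{fig:authrecord}, and the authority
identifier is given without the \texttt{ivo://} scheme.

\begin{lstlisting}[language=XML]
<ri:Resource xsi:type="vg:Registry" status="active"
  created="2010-01-01T00:00:00" updated="2017-09-01T17:27:57">
  <title>Example Publishing Registry</title>
  <identifier>ivo://example.org/registry</identifier>
  <!-- curation, content, and capability elements elided -->
  <full>false</full>
  <!-- one entry per vg:Authority record this registry has created -->
  <managedAuthority>example.org</managedAuthority>
</ri:Resource>
\end{lstlisting}
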
648 The mechanism outlined here is not free of potential conflicts in the distributed
649 environment of the VO Registry. The IVOA Registry Working Group
650 periodically monitors the registry-authority graph to ensure each
651 authority in the Registry is claimed by exactly one registry.
653 \subsection{Describing Registries with the Registry Resource Extension}
655 \label{sect:resext}
657 The \xmlel{vg:Registry} type extends the core \xmlel{vr:Service} type to
658 specifically describe registries in order to support discovering them
659 and collecting their metadata; in addition, the extension type also
660 defines the VO-specific metadata in the response to an OAI-PMH
661 \oaiop{Identify} request.
663 As a subclass of \xmlel{vr:Service}, the \xmlel{vg:Registry}
664 type uses \xmlel{vr:capability} elements to describe its support for
665 network interfaces to the services. The specific types defined here
666 derive from an intermediate restriction on \xmlel{vr:Capability} called
667 \xmlel{vg:RegCapRestriction} to force the value of the
668 \xmlel{standardID} attribute to be \nolinkurl{ivo://ivoa.net/std/Registry}.
669 In particular, OAI-PMH endpoints as specified here are identified by
670 \nolinkurl{ivo://ivoa.net/std/Registry}. Clients should discover
671 registries by looking for records with capabilities declaring
672 this \xmlel{standardID}.
674 If the \xmlel{vg:full} element in a \xmlel{vg:Registry} instance
675 is set to \texttt{true}, it indicates the registry's intent to
676 accept all valid resource records it harvests from other
677 registries in accordance with the OAI-PMH specification. Such registries
678 will typically be searchable registries implementing some Registry search
679 interface, but there are also use cases for full registries only
680 implementing OAI-PMH (and thus only providing a \xmlel{vg:Harvest}
681 capability).
683 The \xmlel{vg:managedAuthority} is used by publishing registries to
684 claim an authority identifier (see also sect.~\ref{oairequired}). Note
685 that for each managed authority claimed, the registry MUST provide a
686 \xmlel{vg:Authority}-typed resource record for that authority identifier
687 within its \texttt{ivo\_managed} set.
689 As of version 1.1 of this specification, VO registry records must provide
690 the three mandatory VOSI capabilities: availability, a listing of
691 service capabilities, and a listing of tables if relevant, i.e. if a
692 RegTAP or other tabular interface is available \citep{std:VOSI}.
695 \subsection{The Search Capability}
696 \label{sect:vgsearch}
698 Version 1.0 of this standard defined a search interface, and such
interfaces are described by capabilities of the type \xmlel{vg:Search}.
700 Since in this version, search interfaces are specified by external
701 standards, such external standards may define differing ways of
702 discovering them\footnote{For instance, RegTAP \citep{std:RegTAP} uses
703 the \xmlel{tre:dataModel} element from TAPRegExt as its primary
704 discovery mechanism in its version 1.0.}. The search capability nevertheless is
705 not removed from the schema for backward compatibility, and is available in appendix
706 \ref{app:RISearch}.
708 \subsection{The Harvesting Capability}
710 \label{sect:vgharvest}
712 A registry declares itself to be a harvestable registry by including a
713 \xmlel{vr:capability} element with an \xmlel{xsi:type}
714 attribute set to \xmlel{vg:Harvest}. An example capability for this
type is provided in appendix~\ref{sect:exampleCap}.
717 A \xmlel{vr:capability} element of type \xmlel{vg:Harvest} MUST
718 include at least one \xmlel{vr:interface} element with an
719 \xmlel{xsi:type} attribute set to \xmlel{vg:OAIHTTP} and the
720 \xmlel{role} attribute set to \texttt{std}. If the
721 \xmlel{vr:capability} element is used to simultaneously describe
722 support for other versions of this Registry Interface standard, then the
723 \xmlel{vr:interface} element describing support for this version must
724 include the version attribute set to \texttt{1.0}. The
725 \xmlel{vr:accessURL} element must be set to the base URL for the
726 OAI-PMH interface.
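
As an illustration only, the following Python sketch extracts the OAI-PMH
base URLs from a record according to the rules above. The function names
are hypothetical. The sketch compares only the local part of each
\xmlel{xsi:type} value and therefore assumes, rather than verifies, that
the prefix used is bound to the VORegistry namespace.

\begin{lstlisting}[language=Python]
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def local_name(qname):
    """Strip a namespace prefix such as 'vg:' from an xsi:type value."""
    return qname.rsplit(":", 1)[-1]


def harvest_endpoints(record_xml):
    """Return accessURLs of standard OAIHTTP interfaces in Harvest capabilities."""
    root = ET.fromstring(record_xml)
    urls = []
    for cap in root.iter("capability"):
        if local_name(cap.get(XSI_TYPE, "")) != "Harvest":
            continue
        for intf in cap.findall("interface"):
            is_oai = local_name(intf.get(XSI_TYPE, "")) == "OAIHTTP"
            if is_oai and intf.get("role") == "std":
                urls.extend(u.text.strip()
                            for u in intf.findall("accessURL") if u.text)
    return urls
\end{lstlisting}

An empty result for a record that claims to be harvestable would indicate
that the requirement on the \xmlel{vr:interface} element is not met.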
728 The \xmlel{vg:OAISOAP} extension of \xmlel{vr:WebService}
729 was defined in version 1.0 of this specification and is no longer part of VO
730 Registry interfaces since it was never used.
732 \section{Registry Discovery}
734 \subsection{The Registry of Registries}
736 \label{sect:rofr}
738 To facilitate discovery and automated harvesting of VO registries,
739 a master list of IVOA registries exists as part of the IVOA web
740 infrastructure, hosted at \nolinkurl{http://rofr.ivoa.net}.
741 It is referred to as the Registry of Registries, or RofR (pronounced ``rover'').
742 As the RofR is itself a registry, it provides an OAI-PMH interface conforming
743 to this document. The OAI-PMH interface is always available at
744 \nolinkurl{http://rofr.ivoa.net/oai}. The RofR includes resource records
745 describing each currently active registry of IVOA resources, its status
746 as a full or local registry, authorities associated with it, and its
747 programmatic interfaces. Each record is of type \xmlel{vg:Registry} as
748 defined in section \ref{sect:resext}.
750 Once a registry provider has deployed a new publishing registry, they
must enroll it in the RofR for their records to be seen by the full
searchable registries, and therefore by registry search clients accessing
753 the whole IVOA registry ecosystem. The RofR provides a dedicated
754 web-based interface for this purpose accessible
755 from \nolinkurl{http://rofr.ivoa.net}. The RofR includes a
756 validator package, which thoroughly checks the new registry, including
757 schema validation for the OAI interface itself and all listed resources.
758 The registration process will only accept registries that validate
759 successfully. Local updates within a publishing registry post-inclusion
760 in the RofR are not necessarily automatically validated by the RofR
761 software later: the validator tool can, and indeed should, be used
762 independently of the initial admission process by the registry providers
763 to periodically make sure their registries are still compliant with the
764 relevant IVOA standards.
766 The Registry of Registries also contains resources describing
767 the most recent versions of IVOA standards for resources and
768 resource extensions themselves; these are of type \xmlel{vstd:Standard}.
769 It is not guaranteed that every standard will be represented in RofR,
770 but for the ones that are listed, the RofR version of their document
771 is the canonical version.
773 \subsection{Harvesting the Registry of Registries}
775 \label{sect:harvestrofr}
Given that the Registry of Registries contains records for all other
778 currently active and validated IVOA registries, a client wishing to
779 harvest the contents of all registries should begin at the RofR. Full
780 searchable registries wishing to include records from the other IVOA
781 registries count among these potential clients. To harvest the entire
782 contents of IVOA registries, it is recommended to first harvest the
783 Registry of Registries via its OAI-PMH interface.
785 This first step is done by making a call to the RofR's OAI-PMH interface
786 with the \textbf{ListRecords} operation, with the \textbf{set} argument
787 set to \textbf{ivo\_publishers}. This will return the registry records
(i.e., resources with \xmlel{xsi:type} set to \xmlel{vg:Registry}) for the
registries that successfully registered themselves as described in
section~\ref{sect:rofr}.
791 The next step in harvesting the entire distributed IVOA registry
792 contents is to iterate over the \xmlel{accessURL} of each
793 \xmlel{vg:Registry} record's \xmlel{vr:capability} of type
794 \xmlel{vg:Harvest}, and use the URL for each of those OAI-PMH interfaces
795 to harvest the individual registries. In iterating over the OAI interface
796 of each registry itself, to avoid harvesting duplicate records from the
797 full searchable registries, it is recommended to add the \texttt{set}
798 parameter to that OAI query as well: records locally published by
799 a full registry comprise that registry's supported set \texttt{ivo\_managed}.
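
The two harvesting steps can be sketched in Python as follows. This is a
minimal illustration under stated assumptions, not a reference
implementation: the function names are hypothetical, the access URLs are
assumed to contain no query string of their own, and the handling of
OAI-PMH resumption tokens is left out.

\begin{lstlisting}[language=Python]
from urllib.parse import urlencode

ROFR_OAI_URL = "http://rofr.ivoa.net/oai"


def list_records_url(base_url, set_name):
    """Build an OAI-PMH ListRecords request URL for the ivo_vor format."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": "ivo_vor",
        "set": set_name,
    })
    return base_url + "?" + query


def harvest_plan(registry_access_urls):
    """Return the request URLs for a full harvest, RofR first."""
    plan = [list_records_url(ROFR_OAI_URL, "ivo_publishers")]
    # Restricting to ivo_managed avoids duplicates from full registries.
    plan.extend(list_records_url(url, "ivo_managed")
                for url in registry_access_urls)
    return plan
\end{lstlisting}

In practice the list of access URLs passed to
\texttt{harvest\_plan} would be taken from the \xmlel{vg:Harvest}
capabilities of the records returned by the first request.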
801 The very first time the harvester executes the \textbf{ListRecords}
802 operation on the RofR or any listed registry, the \textbf{from} argument
should not be used so that all known publishing registries are returned,
804 as well as all known resources within each discovered registry. If the
805 harvesting client wishes to use the OAI interface for incremental
806 updates, it can cache at least a mapping of the registry identifiers to
807 their respective harvesting endpoints along with a timestamp for when
808 this operation was last successfully carried out on each. Then, at the
809 start of subsequent harvesting updates, the harvester can provide the
810 cached date using the \textbf{from} argument to receive only new and
811 updated records, and update the cached timestamp upon success.
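
A possible shape for such a cache is sketched below in Python. The
function names and the dictionary layout are hypothetical. Storing the
time at which a harvest \emph{started}, rather than when it finished, is
an assumption made here so that records changed during a long-running
harvest are picked up by the next one.

\begin{lstlisting}[language=Python]
from datetime import datetime, timezone
from urllib.parse import urlencode


def incremental_url(access_url, last_success=None):
    """Build a ListRecords URL, adding 'from' when a timestamp is cached."""
    params = {"verb": "ListRecords", "metadataPrefix": "ivo_vor"}
    if last_success is not None:
        # Datestamps are sent in UTC with seconds granularity;
        # last_success must be a timezone-aware datetime.
        utc = last_success.astimezone(timezone.utc)
        params["from"] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    return access_url + "?" + urlencode(params)


def record_success(cache, registry_id, access_url, started):
    """Remember the endpoint and the start time of a successful harvest."""
    cache[registry_id] = {"access_url": access_url, "last_success": started}
\end{lstlisting}

A harvester would call \texttt{record\_success} only after all records
of a response sequence have been ingested without error, so that a failed
run is simply repeated from the previously cached timestamp.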
813 Experience has shown that when relying on incremental harvests
814 exclusively, minor problems eventually accumulate to severe
815 inconsistencies even when registries declare support for deleted
816 records. It is therefore recommended that harvesting clients occasionally
(e.g., semiannually) perform full updates to an empty local copy without
818 using the \textbf{from} parameter, even for registries that announce
819 deletion of records. To further provide some robustness against small
820 operational issues in the publishing process, it is also recommended
821 to leave an overlap in incremental harvesting requests, e.g. to request
822 resources going back to the beginning of the day of last incremental harvest.
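
The recommended overlap can be computed as in the following Python
sketch, which moves the timestamp of the last incremental harvest back to
the beginning of that day in UTC. The function name is hypothetical, and
interpreting ``beginning of the day'' in UTC is an assumption.

\begin{lstlisting}[language=Python]
from datetime import datetime, timezone


def overlapped_from(last_harvest):
    """Return an OAI-PMH 'from' value at 00:00:00 UTC of the last harvest day.

    last_harvest must be a timezone-aware datetime.
    """
    utc = last_harvest.astimezone(timezone.utc)
    start_of_day = utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_day.strftime("%Y-%m-%dT%H:%M:%SZ")
\end{lstlisting}

Records harvested twice because of the overlap are harmless as long as
the client replaces existing records by identifier when ingesting.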
825 For example, to get a listing of registries in the IVOA ecosystem, one
826 would first query
\nolinkurl{http://rofr.ivoa.net/oai?verb=ListRecords\&metadataPrefix=ivo\_vor\&set=ivo\_publishers}.
Then, for each returned resource, the URL given in the \xmlel{accessURL}
under a \xmlel{capability} with \xmlel{xsi:type} set to \xmlel{vg:Harvest}
could be called as such:
\nolinkurl{http://accessURLValue?verb=ListRecords\&metadataPrefix=ivo\_vor}
or
\nolinkurl{http://accessURLValue?verb=ListRecords\&metadataPrefix=ivo\_vor\&from=YYYY-MM-DDTHH:MM:SSZ}
for return visits, with the \textbf{from} date representing the last successful
query to that \xmlel{accessURL}.
837 \section{Searching Registries}
838 \label{sect:searching}
840 Experience with version~1 of this specification suggests that it is
841 preferable to not couple the relatively stable standards for harvesting and
842 general registry maintenance with client interfaces to the registry,
843 which were found to be in much more need of experimentation. For a
844 discussion of the history of client interfaces in the VO, see
845 \citet{paper:regclient}.
847 \subsection{RI Search}
848 \label{RISearch}
A SOAP-based search capability, \xmlel{vg:RISearch}, defined in Registry
Interfaces 1.0, exists but is no longer encouraged or required for searchable
851 registries as technologies have moved forward. However, it is still a valid
852 capability defined in the registry resource schema so that registry operators
853 may continue to provide valid RI1 registries without having to support different
854 versions of the VORegistry schema. The base \xmlel{vg:RISearch} extension may
855 also be useful for the description of future registry search interfaces. RISearch
856 is described in appendix~\ref{app:RISearch}.
859 \subsection{Registry Table Access Protocol Services}
860 \label{RegTAP}
862 One second-generation standard search interface to the VO Registry that
863 has progressed to become an IVOA recommendation is RegTAP
864 \citep{std:RegTAP}, an interface based on a relational representation of
key fields in resource descriptions and on the IVOA Table Access Protocol
866 \citep{std:TAP}. RegTAP services have been made available from several
867 registry providers listed in the Registry of Registries.
869 RegTAP-based registries should be located by clients as described
870 in the RegTAP standard (which in version 1.0 happens through locating
871 TAP services with a certain data model identifier like
872 \nolinkurl{ivo://ivoa.net/std/RegTAP#1.0}). To aid smart clients of
873 the full RofR which generate lists for initial discovery, RegTAP registries
874 must also be registered as separate resources with the appropriate tableset.
875 These must include either a full TAP service capability according to
876 TAPRegExt \citep{std:TAPREGEXT-20120827} or an auxiliary capability
877 referencing a TAP service as per \citet{note:DataCollect}. An example
878 for the latter option, preferable if the TAP service in question
879 contains additional tables, is given in appendix \ref{sect:exampleCap}.
881 \subsection{Announcing Local vs Full Searchable Registries}
882 \label{FullSearch}
884 While a publishing registry may provide search capabilities for its
885 own hosted records, this is considered a locally searchable registry,
886 and not a full searchable one, as distinguished in the RofR listing.
887 For a registry to be considered full searchable, it must harvest resources
888 from the other publishing registries listed in the RofR, and implement
889 an IVOA standard programmatic interface beyond the interface for OAI harvesting,
890 with some method for filtering resource queries.
891 This can be announced simply in the registry's own self-describing resource
record with a \xmlel{full} tag set to true, without having to prescribe any
893 one interface as the defining search feature.
895 \section{Looking Forward}
896 \label{LookingForward}
898 While the OAI-PMH harvesting interface as adopted from outside the IVOA
899 community is stable and replacing it would require a major revision
900 of this document, we expect that new search interfaces for registries will
901 be continually developed, leveraging new technologies and best practices
902 as they emerge. These search interfaces can be added without sacrificing
interoperability with the IVOA registry ecosystem. Whether or not these emerging
904 search technologies become formally endorsed by the IVOA as notes or new
905 standards documents, so long as a registry supports the basic harvest interface
906 and hosts valid \xmlel{ri:Resource} documents including registry and authority
907 records, it should be considered covered by the practices described herein and
908 a welcome addition to the Registry of Registries listing, with all of its
909 records also accessible through the full registries.
911 \appendix
913 \section{The RegistryInterface Schema}
914 \label{app:rischema}
916 The following schema defines a global element, allowing the inclusion of
917 VOResource records into \xmlel{oai:metadata} elements in OAI-PMH
918 responses for the \texttt{ivo\_vor} metadata prefix. See
919 sect.~\ref{sect:metadataformats} for details.
921 The schema is unchanged from version 1.0 of this specification and
922 therefore does not change its version.
924 \lstinputlisting[language=XML]{RegistryInterface-1.0.xsd}
926 \section{The VORegistry Schema}
927 \label{app:vgschema}
929 The following schema defines VOResource types for describing registries
930 in the Registry. It is unchanged from version 1.0 of this specification
931 and therefore does not change its version.
933 Note that standards defining search interfaces may specify alternative
934 or complementary methods of registering the services defined by them,
935 and that auxiliary capabilities for these search capabilities may be
936 listed within the registry record.
938 \lstinputlisting[language=XML]{VORegistry-1.0.xsd}
940 \section{Example Capabilities}
941 \label{sect:exampleCap}
The following XML fragment shows the three capability elements discussed
in this document: the OAI-PMH-based publishing registry, the legacy
RI 1.0 searchable registry, and an auxiliary TAP capability as used
for RegTAP.
948 \begin{lstlisting}[language=XML]
<ri:Resource
  xmlns:vr="http://www.ivoa.net/xml/VOResource/v1.0"
  xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.1"
  xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">

  <!-- Standard VOResource metadata omitted for brevity -->

  <!-- The capability for an OAI-PMH endpoint (publishing registry) -->
  <capability xsi:type="vg:Harvest" standardID="ivo://ivoa.net/std/Registry">
    <interface xsi:type="vg:OAIHTTP" version="1.0" role="std">
      <accessURL use="base">http://registry.example.org/oai</accessURL>
    </interface>
    <maxRecords>100</maxRecords>
  </capability>

  <!-- A legacy, RI1.0 searchable registry endpoint, with an
       extra interface for web browsers. -->
  <capability xsi:type="vg:Search" standardID="ivo://ivoa.net/std/Registry">
    <interface xsi:type="vr:WebBrowser" version="1.0" role="gui">
      <accessURL use="full">http://registry.euro-vo.org</accessURL>
    </interface>
    <interface xsi:type="vr:WebService" version="1.0" role="std">
      <accessURL use="full"
        >http://registry.example.org/services/RegistrySearch</accessURL>
    </interface>
    <maxRecords>100</maxRecords>
    <extensionSearchSupport>core</extensionSearchSupport>
  </capability>

  <!-- A reference to RegTAP-enabled TAP service as an auxiliary
       capability -->
  <capability standardID="ivo://ivoa.net/std/TAP#aux">
    <interface xsi:type="vs:ParamHTTP" role="std">
      <accessURL use="base">http://registry.example.org/tap</accessURL>
    </interface>
  </capability>

  <!-- A RegTAP-capable searchable registry should have a tableset
       with all its tables in the rr schema here -->
</ri:Resource>
989 \end{lstlisting}
991 \section{The RISearch Schema}
992 \label{app:RISearch}
994 The following schema defines the SOAP-based RISearch interface, which
995 is discouraged as of version 1.1 but still available.
996 It is unchanged from version 1.0 of this specification
997 and therefore does not change its version.
999 % GENERATED: !schemadoc VORegistry-1.0.xsd Search
1000 %\begin{generated}
1002 \begingroup
1003 \renewcommand*\descriptionlabel[1]{%
1004 \hbox to 5.5em{\emph{#1}\hfil}}\vspace{2ex}\noindent\textbf{\xmlel{vg:Search} Type Schema Documentation}
1006 \noindent{\small
1007 The capabilities of the Registry Search implementation.
1008 \par}
1010 \vspace{1ex}\noindent\textbf{\xmlel{vg:Search} Type Schema Definition}
1012 \begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<xs:complexType name="Search" >
  <xs:complexContent >
    <xs:extension base="vr:Capability" >
      <xs:sequence >
        <xs:element name="maxRecords" type="xs:int" />
        <xs:element name="extensionSearchSupport"
                    type="vg:ExtensionSearchSupport" />
        <xs:element name="optionalProtocol" type="vg:OptionalProtocol" minOccurs="0"
                    maxOccurs="unbounded" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
1026 \end{lstlisting}
1028 \vspace{0.5ex}\noindent\textbf{\xmlel{vg:Search} Extension Metadata Elements}
1030 \begingroup\small\begin{bigdescription}\item[Element \xmlel{maxRecords}]
1031 \begin{description}
1032 \item[Type] \xmlel{xs:int}
1033 \item[Meaning]
1034 The largest number of records that the registry search
1035 method will return. A value of zero or less indicates
1036 that there is no explicit limit.
1038 \item[Occurrence] required
1040 \end{description}
1041 \item[Element \xmlel{extensionSearchSupport}]
1042 \begin{description}
1043 \item[Type] string
1044 \item[Meaning]
1045 (deprecated)
1047 \item[Occurrence] required
1049 \item[Allowed Values]\hfil
1050 \begin{longtermsdescription}
1051 \item[core]
1052 Only searches against the core VOResource metadata are
1053 supported.
1055 \item[partial]
1056 Searches against some VOResource extension metadata are
1057 supported but not necessarily all that exist in the registry.
1059 \item[full]
1060 Searches against all VOResource extension metadata contained
1061 in the registry are supported.
1063 \end{longtermsdescription}
1064 \item[Comment]
1065 This was used in Registry Interfaces 1.0 to indicate
1066 what VOResource extensions a search interface supported.
1067 Modern search interfaces will indicate that through
1068 version, their tableset, or similar.
1071 \end{description}
1072 \item[Element \xmlel{optionalProtocol}]
1073 \begin{description}
1074 \item[Type] string
1075 \item[Meaning]
1076 (deprecated)
1078 \item[Occurrence] optional; multiple occurrences allowed.
1080 \item[Allowed Values]\hfil
1081 \begin{longtermsdescription}
1082 \item[XQuery]
1083 the XQuery (http://www.w3.org/TR/xquery/) protocol as defined
1084 in the VO Registry Interface standard.
1086 \end{longtermsdescription}
1087 \item[Comment]
1088 This was used in Registry Interfaces 1.0 to indicate
1089 search protocol extensions. In 1.1, use multiple
1090 capabilities with the appropriate standardIDs
1091 to declare special search capabilities.
1094 \end{description}
1097 \end{bigdescription}\endgroup
1099 \endgroup
1100 %\end{generated}
1105 \section{Changes from Previous Versions}
1107 \label{sect:changes}
1109 For pre-REC-1.0 changes, see \citet{std:RI1}.
1111 \subsection{Changes from first 1.1 WD}
1113 \begin{itemize}
1115 \item Text clarifications for harvesting the entire RofR, and
exhortation to harvest from scratch occasionally as OAI
announcements of record deletions are not mandatory.
\item Simplified announcement of full searchable registry
in the RofR and removed operational instructions which may change.
1122 \end {itemize}
1124 \subsection{Changes from Version 1.0}
1126 \label{changes-1.0}
1129 \begin{itemize}
1131 \item Corrected reference to OAI-PMH spec in registry interface
1132 description to v2.0.
1134 \item Added requirement for OAI-PMH interface to support seconds
1135 granularity, optional in the OAI-PMH 2.0 standard itself. {}
1137 \item Removed requirement for VOResource version number changes to force
1138 an update of this document. {}
1140 \item Removed the implementation-dependent requirement for searchable
1141 registries in section 2, specifically the SOAP-based services
1142 based on ``ADQL 1.0'' and XQuery.{}
1144 \item Dropped the requirement on registries to not deliver any records
1145 that are OAI-PMH deleted when no temporal constraint is given.{}
1147 \item Added a requirement to provide VOSI endpoints.
\item Added support for auxiliary Registry TAP Service search interfaces.
1151 \item Clarified that the requirement to keep deleted records for six
1152 months only applies to the transient case; also discouraging registries
1153 with no support of deleted records.
1155 \item Added recommended process for discovery of registries and their
1156 resources using the Registry of Registries, based on the Registry of
Registries IVOA note.
1159 \item Added conclusion describing implications of future search and
1160 publishing interface changes in the Registry environment.
1162 \item Many editorial changes across the text, mostly as a consequence of
1163 externalizing search interfaces.
1165 \end{itemize}
1168 \bibliography{ivoatex/ivoabib,ivoatex/docrepo}
1170 \end{document}

