Revision 4224 - (show annotations)
Fri Sep 1 17:27:57 2017 UTC (3 years, 10 months ago) by dower
RegistryInterface: 1.1 typos and added reference
1 \documentclass{ivoa}
2 \input tthdefs
3
4 \usepackage[utf8]{inputenc}
5 \usepackage{todonotes}
6 \usepackage{listings}
7 \usepackage{natbib}
8 \lstloadlanguages{XML}
9 \lstset{flexiblecolumns=true,basicstyle=\small,tagstyle=\ttfamily}
10
11 \SVN$Rev$
12 \SVN$Date$
13 \SVN$URL$
14
15 \hyphenation{name-space}
16
17 \newcommand{\oaiop}[1]{\textit{#1}}
18
19 \ivoagroup{Registry}
20
21
22 \author{Theresa Dower}
23 \author{Markus Demleitner}
24 \author{Kevin Benson}
25 \author{Ray Plante}
26 \author{Elizabeth Auden}
27 \author{Matthew Graham}
28 \author{Gretchen Greene}
29 \author{Martin Hill}
30 \author{Tony Linde}
31 \author{Dave Morris}
\author{Wil O'Mullane}
33 \author{Guy Rixon}
34 \author{Aur\'elien St\'eb\'e}
35 \author{Kona Andrews}
36
37 \editor{Theresa Dower}
\editor{Markus Demleitner}
39
40 \previousversion[http://www.ivoa.net/documents/RegistryInterface/20091104/]
41 {IVOA Registry Interfaces 1.0, IVOA Recommendation 2009-11-04}
42
43
44 \title{Registry Interfaces}
45
46 \begin{document}
47
48 \begin{abstract}
49 The VO Registry provides a mechanism with which VO applications can
50 discover and select resources that are relevant for a particular
51 scientific problem. This specification defines the operation of this
52 system. It is based on a general, distributed model composed of
53 searchable and publishing registries, as introduced at the beginning of
54 this document. The main body of the specification has three components:
55 (a) an interface for harvesting publishing registries, which builds upon
the Open Archives Initiative Protocol for Metadata Harvesting; (b) a
VOResource extension for registering registry services, together with a
description of a central list of said IVOA registry services; and (c) a
discussion of the Registry of Registries as the root component of data
discovery in the VO.
61 \end{abstract}
62
63
64 \section{Introduction}
65
66 \label{introduction}
67
68 In the Virtual Observatory (VO), registries provide a means for
69 discovering useful resources, i.e., data and services. This discovery
70 takes place by searching within structured descriptions of resources,
71 the resource records, authored by the data providers. In order to avoid
72 a single point of failure for the VO, the Registry is distributed.
73 This means that each data provider can run a service injecting
74 resource records into the Registry (a ``publishing registry'' as defined
75 below), and anyone can run services that allow global discovery (a
76 ``searchable registry'' as defined below).
77
78 To enable this, common mechanisms for registry communication and
79 interaction are required.
80 This document therefore describes the standard interfaces that enable
81 interoperable registries. Through these interfaces, registry
82 builders have a common way of sharing resource descriptions with users,
83 applications, and other registries.
84
85 This specification does not cover interfaces for global discovery, which
86 are the subject of other IVOA standards. Also, service operators are
87 free to build interactive, end-user interfaces in
88 any way that best serves their target community.
89
90 While the architecture and standard processes for distributed registry
91 search and maintenance remain similar to this document's version 1.0 and
92 it remains backward compatible, there is a significant philosophical
93 change in this update. Most importantly, a defined search interface
94 using SOAP technology is no longer recommended, and a Table Access
95 Protocol service using a registry data model is encouraged for search, with the
96 understanding that new technologies will continue to be developed and adopted.
97
98 \subsection{Registry Architecture and Definitions}
99
100 \label{arch}
101
102 A \emph{registry} is first a repository of structured descriptions of
103 resources. In the VO, a \emph{resource} is defined by the IVOA
104 Recommendation ``Resource Metadata for the Virtual Observatory''
105 \citep{std:RM}, henceforth referred to as RM, as being
106
107
108 \begin{quotation}
109 a general term referring to a VO element that can be
110 described in terms of who curates or maintains it and which can be
111 given a name and a unique identifier. Just about anything can be a
112 resource: it can be an abstract idea, such as sky coverage or an
113 instrumental setup, or it can be fairly concrete, like an organization
114 or a data collection.
115 \end{quotation}
116
117 Organizations, data collections, and services can be considered
118 classes of resources. The most important type of resource to
119 applications is a service that actually does something. A registry
120 (lower case),
121 then, is ``a service for which the response is a structured description
122 of resources'' (RM).
123
124 This specification is based on the general IVOA model for registries
125 \citep{2004ASPC..314..585P}, which builds on RM's model
126 for resources. In this model, the VO environment features
127 different types of registries that serve different functions. The
128 primary distinction is between publishing registries and searchable
ones. A secondary distinction is full versus local.
130
131 A \emph{searchable registry} is one that allows users and client
132 applications to search for resource records using selection criteria
133 against the metadata contained in the records. The purpose of this type
134 of registry is to aggregate descriptions of many resources distributed
135 across the network. By providing a single place to locate data and
136 services, applications are spared from having to visit many different
137 sites just to determine which ones are relevant to the scientific
138 problem at hand. A searchable registry gathers its descriptions from
139 across the network through a process called \emph{harvesting}.
140
141 A \emph{publishing registry} is one that simply exposes its resource
142 descriptions to the VO environment in a way that allows those
143 descriptions to be harvested. The contents of these registries tend to
144 be limited to resources maintained by one or a few providers and thus
145 are local in nature; for example, a data center will run its own
146 publishing registry to allow other VO components to gather metadata on
147 the data center's published services.
148 Since the purpose is simply publishing and not to serve
149 users and applications directly, it is not necessary to support full
150 searching capabilities. This simplifies the requirements for a
151 publishing registry:
152 storage, management, and indexing of the records can be simpler, as
153 there is no need to support a
154 search interface facilitating complex discovery queries.
155 While a searchable registry in practice will necessitate the
156 use of a database system, a simple publishing registry may get by
157 storing its records as flat files on disk.
158
159 Note that some registries can play both roles; that is, a searchable
160 registry may also publish its own resource descriptions.
161
162 A secondary distinction is full versus local. A \emph{full registry}
163 is one that attempts to contain records of all resources known to the VO.
164 Several such registries exist, run by various VO projects. A
165 \emph{local registry}, on the other hand, contains only a subset of
166 known resources. While for publishing registries this subset usually is
167 defined by what services are maintained by the registry's operator,
168 other selection criteria are conceivable. For instance, the IVOA's
169 Education IG is considering running a registry only containing resources
170 manually selected for suitability for primary and secondary education.
171
172 As mentioned above, harvesting is the mechanism by which a registry can
173 collect resource records from other registries. It is used by full
174 registries to aggregate resource records from publishing
175 registries. It can also be used to synchronize two registries to ensure
176 that they have the same contents. Harvesting, in this specification, is
177 modeled as a pull operation between two registries. The term
178 \emph{harvester} refers to the registry that wishes to receive records
179 (usually a full searchable registry); it sends its request to the
180 \emph{harvestee} (usually a publishing registry), which responds with
181 the records. Harvesting is a much simpler process than a fully-featured
182 search interface, as only very few constraints need to be supported and
183 only full records are being transmitted in responses.
184 Consequently, different protocols are employed for the
185 two types of registry operations.
186
187 In this text, ``registry'' in lower case refers to concrete services,
188 while ``Registry'' (or ``VO Registry'') in upper case refers to the
189 combination of the set of all resource records and the interfaces to
190 query and manage them.
191
192 \subsection{The Registry Interface within the VO Architecture}
193
194 \label{sect:rolewithinivoa}
195
196
197 \begin{figure}[th]
198 \begin{center}
199 \includegraphics[width=0.9\textwidth]{archdiag.png}
200 \caption{IVOA Architecture
201 diagram with the Registry Interface specification (RI) and
202 the related standards marked up.}
203 \label{fig:arch}
204 \end{center}
205 \end{figure}
206
207 This specification directly relates to other VO standards in the
208 following ways:
209
210
211 \begin{bigdescription}
212 \item[VOResource \citep{std:VOR}]VOResource sets the foundation for a
213 formal definition of the data model for resource records via its schema
214 definition.
215
\item[IVOA Identifiers \citep{std:VOID2}]IVOA identifiers serve as
the primary keys of the VO Registry. Also, the notion of an authority as
laid down in IVOA Identifiers plays an important role, as publishing
registries can be viewed as a realization of a set of authorities.
220
221 \end{bigdescription}
222
223
224 \section{The IVOA Harvesting Interface}
225
226 \label{harvesting}
227
228 The harvesting interface allows the retrieval of complete VOResource
records from registries supporting harvesting. Publishing registries
MUST support the IVOA harvesting interface; searchable registries SHOULD
do so.
232
233 The IVOA harvesting interface is built on the standard Protocol for
234 Metadata Harvesting developed by the Open Archives Initiative, OAI-PMH
235 \citep{std:OAIPMH}. In this section, after giving a brief introduction
236 to OAI-PMH, we define additional constraints and requirements for
237 OAI-PMH services to be interoperable with the VO environment.
238
239 In version 1.0 of this document, a variant of the OAI-PMH
240 protocol was defined using SOAP in the exchange of messages. Version
241 1.1 no longer defines it (although of course there is no requirement to
242 remove it from running services); since OAI-PMH over SOAP has never been
243 in active use by the IVOA, we consider this still a minor specification
244 change not warranting a new major version.
245
246 \subsection{The OAI Protocol for Metadata Harvesting}
247
248 \label{oaipmh}
249
250 While for details of OAI-PMH we refer to \citet{std:OAIPMH},
251 in the following we give a
252 brief overview of OAI-PMH that should be sufficient to understand the
253 protocol's role within the Registry interface architecture.
254
255 The OAI-PMH v2.0 specification defines:
256
257
258 \begin{itemize}
259
\item the meaning and behavior of the six harvesting operations, referred
to as verbs,

\item the meaning of the input arguments for each operation, and

\item the XML Schema used to encode response messages.
266 \end{itemize}
267
268 The six standard operations laid down in OAI-PMH are:
269
270
271 \begin{bigdescription}
\item[Identify] provides a description of the registry.
273
274 \item[ListIdentifiers]returns a list of identifiers for the resource
275 records held by the registry, possibly restricted to records changed
276 within a certain time span or to those belonging to a certain set.
277
278 \item[ListRecords]returns complete resource records in the registry,
279 possibly restricted to records changed within a certain time span or to
280 those belonging to a certain set.
281
282 \item[GetRecord]returns a single resource description matching a given
283 identifier.
284
285 \item[ListMetadataFormats]returns a list of supported formats that the
286 registry can use to encode resource descriptions upon a harvester's
287 request.
288
289 \item[ListSets]returns a list of set names supported by the registry
290 that harvesters can request in order to get back a subset of the
291 descriptions held by the registry.
292
293 \end{bigdescription}
294
295 The ListRecords and GetRecord operations return the actual resource
296 description records held by the registry. These descriptions are encoded
297 in XML and wrapped in a general-purpose envelope defined by the OAI-PMH
298 XML Schema (with the namespace
\texttt{http://www.openarchives.org/OAI/2.0/}).
300
301 Through the operations' arguments, OAI-PMH provides a number of useful
302 features:
303
304
305 \begin{itemize}
306
\item Support for multiple return formats. As suggested by the existence
of the
\oaiop{ListMetadataFormats} operation, a harvester can request a list of
the formats available for encoding returned resource descriptions.

\item Harvesting by date. The \oaiop{ListIdentifiers} and
\oaiop{ListRecords} operations both support \texttt{from} and
\texttt{until} date arguments, which restrict the response to records
changed within the given, possibly half-open, interval.

\item Harvesting by category. The \oaiop{ListIdentifiers} and
\oaiop{ListRecords} operations both support a \texttt{set} argument for
retrieving resources that are grouped in a particular category. Resource
records may belong to multiple sets.

\item Marking records as deleted. Registries may mark records as deleted
so that harvesters will be notified that a resource has become
unavailable even if only performing incremental harvests.

\item Support for resumption tokens. If a request results in returning a
very large number of records, the registry can choose to split the
results over several calls; this is done by passing a resumption token
back to the harvester. The harvester uses it to retrieve the next set of
matching results.
331
332 \end{itemize}
333 It is important to note that the OAI-PMH interface is not intended
334 to be a general search interface. The filtering capabilities described
335 above are just enough to support intelligent harvesting between
336 registries. Most end-user applications will use a dedicated search
337 interface on a searchable registry (cf.~sect.~\ref{sect:searching}).
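
The date, set, and resumption-token features listed above combine into a
simple incremental harvest. As a minimal sketch, assuming a hypothetical
harvesting endpoint at \nolinkurl{http://registry.example.org/oai} and an
illustrative token value, a harvester might issue the following two
requests (the metadata prefix \texttt{ivo\_vor} and the set
\texttt{ivo\_managed} are defined in sect.~\ref{sect:metadataformats} and
sect.~\ref{supportedsets}):

\begin{lstlisting}[breaklines=true]
http://registry.example.org/oai?verb=ListRecords&metadataPrefix=ivo_vor&set=ivo_managed&from=2017-08-01T00:00:00Z
http://registry.example.org/oai?verb=ListRecords&resumptionToken=token-0001
\end{lstlisting}

The second request carries only the token, since OAI-PMH treats
\texttt{resumptionToken} as an exclusive argument. The harvester repeats
this step with each newly returned token until the registry returns an
empty one.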
338
In addition to basic OAI-PMH compliance, this specification defines
a set of requirements and recommendations specific to the use of OAI-PMH
within the VO, all of which remain compatible with OAI-PMH itself. They
are described in the remaining subsections.
343
344
345 \subsection{Metadata Formats for Resource Descriptions}
346
347 \label{sect:metadataformats}
348
349 All IVOA registries that support the Harvesting Interface must support
350 two standard metadata formats: the OAI Dublin Core format (mandated by
351 the base OAI-PMH standard) and the IVOA VOResource metadata format
352 \citep{std:VOR}.
353
354 The VOResource metadata format has the metadata prefix name
355 \texttt{ivo\_vor}, which can be used wherever \citet{std:OAIPMH} allows a
356 metadata prefix name. The format uses the VOResource core XML Schema
357 with the namespace
358 \texttt{http://www.ivoa.net/xml/VOResource/v1.0}
359 (recommended namespace prefix \xmlel{vr:}) along with any legal
360 extension of this schema to encode the resource descriptions within the
361 OAI-PMH metadata tag from the OAI XML Schema (namespace
\texttt{http://www.openarchives.org/OAI/2.0/}, recommended namespace
363 prefix \xmlel{oai:}).
364
365 As VOResource and its extensions do not define global elements, the
366 child element within \xmlel{oai:metadata} needs to be separately
367 defined. This specification does this by providing the
368 \xmlel{ri:Resource} element. It is defined in a schema with the target
369 namespace
370 \nolinkurl{http://www.ivoa.net/xml/RegistryInterface/v1.0}, which is given
371 in appendix~\ref{app:rischema}.
372
373 The
374 \xmlel{ri:Resource} element MUST include an \xmlel{xsi:type} attribute
375 that assigns the element's type to \xmlel{vr:Resource} or one of its
376 legal extensions.
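
As an illustration, a record in the \texttt{ivo\_vor} format might be
delivered as sketched below. The identifier, the timestamps, and the
choice of \xmlel{vr:Service} as the concrete type are hypothetical; the
content of the resource record is elided, and XML namespace declarations
for the prefixes \xmlel{oai:}, \xmlel{ri:}, \xmlel{vr:}, and
\xmlel{xsi:} are assumed on enclosing elements.

\begin{lstlisting}[language=XML]
<oai:record>
  <oai:header>
    <oai:identifier>ivo://example.org/myservice</oai:identifier>
    <oai:datestamp>2017-09-01T17:27:57Z</oai:datestamp>
  </oai:header>
  <oai:metadata>
    <ri:Resource xsi:type="vr:Service" status="active"
        created="2017-01-01T00:00:00" updated="2017-09-01T17:27:57">
      <!-- title, identifier, curation, content, etc. elided -->
    </ri:Resource>
  </oai:metadata>
</oai:record>
\end{lstlisting}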
377
378 It is strongly recommended that all QName values of \xmlel{xsi:type}
379 attributes within the VOResource record use XML namespace prefixes as
380 recommended in VOResource or the VOResource extensions. Minor version
381 changes are not in general reflected in the recommended prefixes --
382 e.g., both VODataService 1.0 and VODataService 1.1 use \xmlel{vs:}.
383 Registry operators
384 who must deliver OAI-PMH documents containing resource records written
385 to different versions of a registry extension are advised to
386 override the prefix
387 bindings on the element level if at all possible.
388
389 The OAI Dublin Core format, with the metadata prefix of \texttt{oai\_dc},
390 is defined by the OAI-PMH base standard and must be supported by all
391 OAI-PMH compliant registries.
392
393 Harvestable registries may support other metadata formats. Responses to
394 the
395 \oaiop{ListMeta\-dataFormats} operation
396 must list all names for formats supported
397 by the registry; even though they are mandatory, this list must include
398 \texttt{ivo\_vor} and \texttt{oai\_dc}.
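
A \oaiop{ListMetadataFormats} response from a registry supporting only
the two mandatory formats might then contain the fragment sketched
below. The \texttt{oai\_dc} values are those given by OAI-PMH; using the
RegistryInterface namespace URI for both the schema location and the
namespace of \texttt{ivo\_vor} is an assumption made for this
illustration. Namespace declarations for \xmlel{oai:} are assumed on
enclosing elements.

\begin{lstlisting}[language=XML,breaklines=true]
<oai:ListMetadataFormats>
  <oai:metadataFormat>
    <oai:metadataPrefix>oai_dc</oai:metadataPrefix>
    <oai:schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</oai:schema>
    <oai:metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</oai:metadataNamespace>
  </oai:metadataFormat>
  <oai:metadataFormat>
    <oai:metadataPrefix>ivo_vor</oai:metadataPrefix>
    <!-- assumed: schema retrievable from the namespace URI -->
    <oai:schema>http://www.ivoa.net/xml/RegistryInterface/v1.0</oai:schema>
    <oai:metadataNamespace>http://www.ivoa.net/xml/RegistryInterface/v1.0</oai:metadataNamespace>
  </oai:metadataFormat>
</oai:ListMetadataFormats>
\end{lstlisting}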
399
400
401 \subsection{Identifiers in OAI Messages}
402
403 \label{oaiidentifiers}
404
405 In accordance with the OAI-PMH standard, an OAI-PMH XML envelope that
406 contains a resource description must include a globally unique URI that
407 identifies that resource record. This identifier must be the IVOA
408 identifier used to identify the resource being described as given in
409 its \xmlel{vr:identifier} child element.
410
411 This specification does not follow the recommendation of the OAI-PMH
412 standard with regard to record identifiers. OAI-PMH makes a distinction
413 between the resource record containing resource metadata and the
414 resource itself; thus, it recommends that the identifier in the OAI
415 envelope be different from the resource identifier. In particular, the
416 former is the choice of the publishing registry. This allows one to
417 distinguish resource descriptions of the same resource from different
418 registries, which in principle could be different.
419
420 In the VO, because it is intended that resource descriptions of the
421 same resource from different registries should not differ (apart from
422 possible additions of \xmlel{vr:validationLevel} elements), there
423 is not a strong need to distinguish between the resource and the
424 resource description.
425
426 By making the resource and resource record
427 identifiers the same, it becomes much easier to retrieve the record for
428 a single resource via \oaiop{GetRecord}, regardless of which
429 registry is being queried. Otherwise -- when the registry chooses
430 the record identifier -- a client will not a priori know the record
431 identifier for a particular resource, and so it is left to call
432 \oaiop{ListRecords} and search through the metadata of all the
433 records itself to find the one of interest. In contrast, IVOA
434 identifiers are intended to be a cross-application way of referring to a
435 resource, and thus when a client wants only a single specific resource
436 record, it is very likely that it would know the resource identifier
437 when making a call to the \oaiop{GetRecord} operation.
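
For instance, a client that knows the IVOA identifier
\nolinkurl{ivo://ivoa.net} could retrieve the corresponding record with a
request like the following sketch. The endpoint
\nolinkurl{http://registry.example.org/oai} is hypothetical; the colon
and slashes of the identifier are percent-encoded, as OAI-PMH requires
for argument values in request URLs.

\begin{lstlisting}[breaklines=true]
http://registry.example.org/oai?verb=GetRecord&metadataPrefix=ivo_vor&identifier=ivo%3A%2F%2Fivoa.net
\end{lstlisting}

The same request works unchanged against any registry holding the
record, because the record identifier does not depend on the registry
being queried.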
438
439
440 \subsection{Required Records}
441 \label{oairequired}
442
443 This section describes the records that a harvestable VO registry
444 must include among those it emits via the OAI-PMH operations.
445
446 The harvestable registry MUST return one record that describes the
447 registry itself as a whole, and the \texttt{ivo\_vor} format MUST be
448 supported for this record. This record is also included in the
449 \oaiop{Identify} operation response. When encoded using the
450 \texttt{ivo\_vor} format, the returned \xmlel{ri:Resource} element must
451 be of the type \xmlel{vg:Registry} from the VORegistry schema
452 (see sect.~\ref{sect:vgharvest}). The
453 record MUST include a \xmlel{vg:managedAuthority} for every authority
454 identifier that originates at that registry.
455
456 Additions to the list of a registry's managed authorities must follow
457 the protocol outlined in sect.~\ref{sect:authres}.
458
459 The harvestable registry must be able to return exactly one record in
460 \texttt{ivo\_vor} for each authority identifier listed as a
461 \xmlel{vg:managedAuthority} in the \xmlel{vg:Registry} record
462 that describes that registry. When encoded in the \texttt{ivo\_vor}
463 format, the type of these elements must be \xmlel{vg:Authority}.
464
465
466 \subsection{The Identify Operation}
467
468 \label{sect:oaiidentify}
469
470 The \oaiop{Identify} operation describes the harvestable registry as a
471 whole. The response from this operation must include all information
required by the OAI-PMH standard. In particular, it must include an
\xmlel{oai:baseURL} element giving the base URL of the
harvesting interface endpoint. The \oaiop{Identify} response must also
include an \xmlel{oai:description} element containing a single
\xmlel{ri:Resource} element with an \xmlel{xsi:type} attribute that
sets the element's type to \xmlel{vg:Registry}. The content of this
element must be the registry description of the
harvestable registry itself.
480
481 In its \oaiop{Identify} response, an OAI-PMH-compliant registry must
482 declare its support for deleted records. This can be one of
483
484 \begin{description}
485
\item[\texttt{no}] -- the registry will never notify harvesters of
records that have become unavailable. In an environment like the VO,
where searchable registries frequently harvest publishing registries,
this is strongly discouraged, as without deleted records, harvesters
need to perform full harvests every time or risk delivering stale
records.
492 \item[\texttt{transient}] -- the registry will notify harvesters of
493 records that have become unavailable, but the deleted records will
494 entirely vanish after some time. This specification adds to the OAI-PMH
495 requirements that registries declaring \texttt{transient} support MUST
496 keep their deleted records for at least six months (after which they may
497 discard them).
498 \item[\texttt{persistent}] -- the registry promises to indefinitely keep
499 deleted records.
500 \end{description}
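
Putting the requirements of this section together, the body of an
\oaiop{Identify} response might look like the sketch below. The
repository name, URLs, e-mail address, and timestamps are hypothetical,
the registry's own resource record is elided, and namespace declarations
for \xmlel{oai:}, \xmlel{ri:}, \xmlel{vg:}, and \xmlel{xsi:} are assumed
on enclosing elements.

\begin{lstlisting}[language=XML]
<oai:Identify>
  <oai:repositoryName>Example Publishing Registry</oai:repositoryName>
  <oai:baseURL>http://registry.example.org/oai</oai:baseURL>
  <oai:protocolVersion>2.0</oai:protocolVersion>
  <oai:adminEmail>registry@example.org</oai:adminEmail>
  <oai:earliestDatestamp>2006-07-01T09:00:00Z</oai:earliestDatestamp>
  <oai:deletedRecord>persistent</oai:deletedRecord>
  <oai:granularity>YYYY-MM-DDThh:mm:ssZ</oai:granularity>
  <oai:description>
    <ri:Resource xsi:type="vg:Registry" status="active"
        created="2006-07-01T09:00:00" updated="2006-07-01T09:00:00">
      <!-- the registry's own resource record, elided -->
    </ri:Resource>
  </oai:description>
</oai:Identify>
\end{lstlisting}

A registry declaring \texttt{transient} or \texttt{persistent} support
reports a removed resource through a record that has a header marked as
deleted and no \xmlel{oai:metadata} child, for example (identifier and
datestamp again hypothetical):

\begin{lstlisting}[language=XML]
<oai:record>
  <oai:header status="deleted">
    <oai:identifier>ivo://example.org/retired-service</oai:identifier>
    <oai:datestamp>2017-09-01T17:27:57Z</oai:datestamp>
  </oai:header>
</oai:record>
\end{lstlisting}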
501
502 \subsection{IVOA Supported Sets}
503
504 \label{supportedsets}
505
506 Sets, as defined in the OAI-PMH standard, are ``an optional construct
507 for grouping items for the purpose of selective harvesting'' (see
508 \citet{std:OAIPMH}, section 2.6). Harvestable IVOA registries are free
509 to define any number of custom sets for categorizing records. The
510 OAI-PMH standard allows a record to be a member of multiple sets.
511
512 This specification defines one reserved set name with a special
513 meaning; future versions of this specification may define additional set
514 names. These reserved set names will all start with the characters
515 \texttt{ivo\_}; implementors should not define their own set names
516 that begin with this string. While support for sets is optional
517 in the OAI-PMH standard, a VO registry MUST support
518 the set with the reserved name \texttt{ivo\_managed} to be compliant
519 with this specification.
520
521 The \texttt{ivo\_managed} set refers to all records that originate from the
522 queried registry. That is, those records that were harvested from other
523 registries are excluded. The resource identifiers given in the
records MUST have an authority identifier that matches one of the
525 \xmlel{vg:managedAuthority} values in the \xmlel{vg:Registry}
526 record for that registry. Full searchable registries may use this set
527 while harvesting other registries to avoid getting duplicate records.
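
As a sketch, a registry that defines no custom sets might answer a
\oaiop{ListSets} request with the fragment below. The text of
\xmlel{oai:setName} is a hypothetical, human-readable label chosen for
this example; namespace declarations for \xmlel{oai:} are assumed on
enclosing elements.

\begin{lstlisting}[language=XML]
<oai:ListSets>
  <oai:set>
    <oai:setSpec>ivo_managed</oai:setSpec>
    <oai:setName>Records originating from this registry</oai:setName>
  </oai:set>
</oai:ListSets>
\end{lstlisting}

A harvester then restricts its requests to this set by passing
\texttt{set=ivo\_managed} to \oaiop{ListRecords} or
\oaiop{ListIdentifiers}.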
528
529 \subsection{Time Granularity}
530
531 \label{sect:timegranularity}
532
533 Datestamps in the OAI-PMH 2.0 standard are encoded using ISO8601 and
534 expressed in UTC, with the UTC designator ``Z'' appended to seconds-based
535 granularity where supplied, i.e. \texttt{YYYY-MM-DDThh:mm:ssZ}. In
536 general OAI-PMH registries, granularity at seconds scale is optional.
537 Harvestable IVOA registries MUST report datestamps at the granularity of
538 seconds and accept \texttt{from} and \texttt{until} arguments in the same format. This
539 simplifies the incremental harvesting process in the multi-registry IVOA
540 environment.
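
For example, a harvester interested only in changes made during August
2017 could bound its request with two seconds-granularity datestamps, as
in the following sketch (the endpoint
\nolinkurl{http://registry.example.org/oai} is hypothetical):

\begin{lstlisting}[breaklines=true]
http://registry.example.org/oai?verb=ListIdentifiers&metadataPrefix=ivo_vor&from=2017-08-01T00:00:00Z&until=2017-08-31T23:59:59Z
\end{lstlisting}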
541
542 \section{Registering Registries}
543 \label{regreg}
544
Harvesting registries must be able to locate remote registry resources
546 relevant to them, and both harvesting registries and clients need access
547 to metadata for the registry service itself. We address both of these
548 issues by providing a schema for describing registries themselves, and a
549 repository for indexing them.
550
551 The resource specification for registries themselves is defined by an
552 \xmlel{ri:Re\-source} extension \xmlel{vg:Registry}, which describes
553 metadata of the registry itself and its support for interfaces
554 described in this document or elsewhere.
555 These resources are themselves stored as
records in registries as described in sect.~\ref{oairequired}. From each
557 identifier, further IVOA identifiers for authority information,
558 services, and other records belonging under that publishing umbrella
559 may be created. A publishing registry is said to exclusively manage a
560 naming authority on behalf of the owning publisher; this means that
561 within the IVOA registry network, only that specific registry may
562 publish records having identifiers which begin with that authority identifier.
563
564 The XML namespace URI of this schema is
565 \nolinkurl{http://www.ivoa.net/xml/VORegistry/v1.0}. It has been chosen
566 to allow it to be resolved as a URL to the XML Schema document, which is
567 also given in appendix~\ref{app:vgschema}. The recommended prefix for
568 this namespace is \xmlel{vg:}.
569
570 The schema has not been changed from the one used in version 1.0,
571 although the standard contents have somewhat changed. The rationale for
572 keeping the schema unchanged is that the presence of schema features no longer relevant
573 has no detrimental consequences for Registry operations, whereas changing
574 the schema could break already operational clients.
575
576
577 \begin{figure}[th]
578 \begin{lstlisting}[language=XML]
579 <ri:Resource status="active" xsi:type="vg:Authority"
580 updated="2006-07-01T09:00:00" created="2006-07-01T09:00:00">
581 <title>IVOA Naming Authority</title>
582 <shortName>IVOA</shortName>
583 <identifier>ivo://ivoa.net</identifier>
584 <curation>
585 <publisher ivo-id="ivo://ivoa.net/IVOA">International Virtual
586 Observatory Alliance</publisher>
587 <creator>
588 <name>Raymond Plante</name>
589 <logo>http://www.ivoa.net/icons/ivoa_logo_small.jpg</logo>
590 </creator>
591 <date>2006-07-01</date>
592 <contact>
593 <name>IVOA Resource Registry Working Group</name>
594 <email>registry@ivoa.net</email>
595 </contact>
596 </curation>
597 <content>
598 <subject>virtual observatory</subject>
599 <description>This registers the IVOA as the owner of the ivoa.net
600 authority identifier.</description>
601 <referenceURL>http://rofr.ivoa.net</referenceURL>
602 </content>
603 <managingOrg>International Virtual Observatory Alliance</managingOrg>
604 </ri:Resource>
605 \end{lstlisting}
606 \caption{A sample \xmlel{vg:Authority}-typed resource record as it would
607 be delivered within \xmlel{oai:metadata}. XML namespace declarations
608 for the prefixes \xmlel{ri:}, \xmlel{xsi:}, and \xmlel{vg:} are
609 assumed on enclosing elements.}
610 \label{fig:authrecord}
611 \end{figure}
612
613 \subsection{The Authority Resource Extension and the Publishing Process}
614
615 \label{sect:authres}
616
617
618 The \xmlel{vg:Authority} type extends the core \xmlel{vr:Resource}
619 type to specifically describe the ownership of an authority identifier
620 by a publishing organization.
621
622 The IVOA identifier of a \xmlel{vg:Authority} record provided via the
623 \xmlel{vr:identifi\-er} element must have an empty resource key component
624 as defined in \citet{std:VOID}.
625
626 The meaning of a \xmlel{vg:Authority} record is that the organization
627 referenced in the \xmlel{vg:managingOrg} element has the sole right to
628 create (in collaboration with a publishing registry) and register
629 resource descriptions using the authority identifier given by the
630 \xmlel{vr:identifier} element.
631
632 Before a publisher can create resource descriptions using a new
633 authority identifier, it must first register its claim to the authority
634 identifier by creating a \xmlel{vg:Authority} record. Before the
635 publishing registry commits the record for export, it must first search
636 a full registry to determine if a \xmlel{vg:Authority} with this
637 identifier already exists; if it does, the publication of the new
638 \xmlel{vg:Authority} record must fail.
639
640 When a registry creates a
641 \xmlel{vg:Authority} record, it is said that the registry manages the
642 associated authority identifier (on behalf of the owning publisher)
643 because only that registry may create records with identifiers beginning
644 with that authority identifier. The registry must also document this ownership
645 by adding a corresponding \xmlel{vg:managedAuthority} element to the
646 registry's own resource record.
647
648 The mechanism outlined here is not free of potential conflicts in the distributed
environment of the VO Registry. The IVOA Registry Working Group
650 periodically monitors the registry-authority graph to ensure each
651 authority in the Registry is claimed by exactly one registry.
652
653 \subsection{Describing Registries with the Registry Resource Extension}
654
655 \label{sect:resext}
656
657 The \xmlel{vg:Registry} type extends the core \xmlel{vr:Service} type to
658 specifically describe registries in order to support discovering them
659 and collecting their metadata; in addition, the extension type also
660 defines the VO-specific metadata in the response to an OAI-PMH
661 \oaiop{Identify} request.
662
663 As a subclass of \xmlel{vr:Service}, the \xmlel{vg:Registry}
664 type uses \xmlel{vr:capability} elements to describe its support for
665 network interfaces to the services. The specific types defined here
666 derive from an intermediate restriction on \xmlel{vr:Capability} called
667 \xmlel{vg:RegCapRestriction} to force the value of the
668 \xmlel{standardID} attribute to be \nolinkurl{ivo://ivoa.net/std/Registry}.
669 In particular, OAI-PMH endpoints as specified here are identified by
\nolinkurl{ivo://ivoa.net/std/Registry}. Clients should discover
671 registries by looking for records with capabilities declaring
672 this \xmlel{standardID}.
673
If the \xmlel{vg:full} element in a \xmlel{vg:Registry} instance
is set to \texttt{true}, it indicates the registry's intent to
accept all valid resource records it harvests from other
registries in accordance with the OAI-PMH specification. Such registries
will typically be searchable registries implementing some Registry search
interface, but there are also use cases for full registries
implementing only OAI-PMH (and thus providing only a \xmlel{vg:Harvest}
capability).
682
683 The \xmlel{vg:managedAuthority} is used by publishing registries to
684 claim an authority identifier (see also sect.~\ref{oairequired}). Note
685 that for each managed authority claimed, the registry MUST provide a
686 \xmlel{vg:Authority}-typed resource record for that authority identifier
687 within its \texttt{ivo\_managed} set.
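
The two elements discussed above can be sketched as follows. This is only an
illustrative fragment for a publishing registry that is not a full registry.
The authority identifier \texttt{example.org}, the dates, and the access URL
are hypothetical, and namespace declarations as well as the standard
VOResource metadata are omitted.

\begin{lstlisting}[language=XML]
<ri:Resource xsi:type="vg:Registry" status="active"
    created="2017-01-01T00:00:00Z" updated="2017-01-01T00:00:00Z">
  <!-- title, identifier, curation, and content omitted -->
  <capability xsi:type="vg:Harvest"
      standardID="ivo://ivoa.net/std/Registry">
    <interface xsi:type="vg:OAIHTTP" role="std">
      <accessURL>http://registry.example.org/oai</accessURL>
    </interface>
    <maxRecords>100</maxRecords>
  </capability>
  <!-- this registry does not harvest other registries -->
  <full>false</full>
  <!-- one element per claimed authority identifier; a matching
       vg:Authority record belongs in the ivo_managed set -->
  <managedAuthority>example.org</managedAuthority>
</ri:Resource>
\end{lstlisting}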
688
689 As of version 1.1 of this specification, VO registry records must provide
690 the three mandatory VOSI capabilities: availability, a listing of
691 service capabilities, and a listing of tables if relevant, i.e. if a
692 RegTAP or other tabular interface is available \citep{std:VOSI}.
693
694
695 \subsection{The Search Capability}
696 \label{sect:vgsearch}
697
698 Version 1.0 of this standard defined a search interface, and such
699 interfaces are described by capabilities of the type \xmlel{vg:Search}.
700 Since in this version, search interfaces are specified by external
701 standards, such external standards may define differing ways of
702 discovering them\footnote{For instance, RegTAP \citep{std:RegTAP} uses
703 the \xmlel{tre:dataModel} element from TAPRegExt as its primary
704 discovery mechanism in its version 1.0.}. The search capability is nevertheless
705 retained in the schema for backward compatibility, and is documented in appendix
706 \ref{app:RISearch}.
707
708 \subsection{The Harvesting Capability}
709
710 \label{sect:vgharvest}
711
712 A registry declares itself to be a harvestable registry by including a
713 \xmlel{vr:capability} element with an \xmlel{xsi:type}
714 attribute set to \xmlel{vg:Harvest}. An example capability for this
715 type is provided in the appendix \ref{sect:exampleCap}.
716
717 A \xmlel{vr:capability} element of type \xmlel{vg:Harvest} MUST
718 include at least one \xmlel{vr:interface} element with an
719 \xmlel{xsi:type} attribute set to \xmlel{vg:OAIHTTP} and the
720 \xmlel{role} attribute set to \texttt{std}. If the
721 \xmlel{vr:capability} element is used to simultaneously describe
722 support for other versions of this Registry Interface standard, then the
723 \xmlel{vr:interface} element describing support for this version must
724 include the version attribute set to \texttt{1.0}. The
725 \xmlel{vr:accessURL} element must be set to the base URL for the
726 OAI-PMH interface.
727
728 The \xmlel{vg:OAISOAP} extension of \xmlel{vr:WebService}
729 was defined in version 1.0 of this specification and is no longer part of VO
730 Registry interfaces since it was never used.
731
732 \section{Registry Discovery}
733
734 \subsection{The Registry of Registries}
735
736 \label{sect:rofr}
737
738 To facilitate discovery and automated harvesting of VO registries,
739 a master list of IVOA registries exists as part of the IVOA web
740 infrastructure, hosted at \nolinkurl{http://rofr.ivoa.net}.
741 It is referred to as the Registry of Registries, or RofR (pronounced ``rover'').
742 As the RofR is itself a registry, it provides an OAI-PMH interface conforming
743 to this document. The OAI-PMH interface is always available at
744 \nolinkurl{http://rofr.ivoa.net/oai}. The RofR includes resource records
745 describing each currently active registry of IVOA resources, its status
746 as a full or local registry, authorities associated with it, and its
747 programmatic interfaces. Each record is of type \xmlel{vg:Registry} as
748 defined in section \ref{sect:resext}.
749
750 Once a registry provider has deployed a new publishing registry, they
751 must enroll it in the RofR for their records to be seen by the full
752 searchable registries, and therefore by registry search clients accessing
753 the whole IVOA registry ecosystem. The RofR provides a dedicated
754 web-based interface for this purpose accessible
755 from \nolinkurl{http://rofr.ivoa.net}. The RofR includes a
756 validator package, which thoroughly checks the new registry, including
757 schema validation for the OAI interface itself and all listed resources.
758 The registration process will only accept registries that validate
759 successfully. Local updates within a publishing registry post-inclusion
760 in the RofR are not necessarily automatically validated by the RofR
761 software later: the validator tool can, and indeed should, be used
762 independently of the initial admission process by the registry providers
763 to periodically make sure their registries are still compliant with the
764 relevant IVOA standards.
765
766 The Registry of Registries also contains resources describing
767 the most recent versions of IVOA standards for resources and
768 resource extensions themselves; these are of type \xmlel{vstd:Standard}.
769 It is not guaranteed that every standard will be represented in RofR,
770 but for the ones that are listed, the RofR version of their document
771 is the canonical version.
772
773 \subsection{Harvesting the Registry of Registries}
774
775 \label{sect:harvestrofr}
776
777 Given the Registry of Registries contains records for all other
778 currently active and validated IVOA registries, a client wishing to
779 harvest the contents of all registries should begin at the RofR. Full
780 searchable registries wishing to include records from the other IVOA
781 registries count among these potential clients. To harvest the entire
782 contents of IVOA registries, it is recommended to first harvest the
783 Registry of Registries via its OAI-PMH interface.
784
785 This first step is done by making a call to the RofR's OAI-PMH interface
786 with the \textbf{ListRecords} operation, with the \textbf{set} argument
787 set to \textbf{ivo\_publishers}. This will return the registry records
788 (i.e., resources with \xmlel{xsi:type="vg:Registry"}) for the registries that
789 successfully registered themselves as described in section~\ref{sect:rofr}.
790
791 The next step in harvesting the entire distributed IVOA registry
792 contents is to iterate over the \xmlel{accessURL} of each
793 \xmlel{vg:Registry} record's \xmlel{vr:capability} of type
794 \xmlel{vg:Harvest}, and use the URL for each of those OAI-PMH interfaces
795 to harvest the individual registries. In iterating over the OAI interface
796 of each registry itself, to avoid harvesting duplicate records from the
797 full searchable registries, it is recommended to add the \texttt{set}
798 parameter to that OAI query as well: records locally published by
799 a full registry comprise that registry's supported set \texttt{ivo\_managed}.
800
801 The very first time the harvester executes the \textbf{ListRecords}
802 operation on the RofR or any listed registry, the \textbf{from} argument
803 should not be used so that all known publishing registries are returned,
804 as well as all known resources within each discovered registry. If the
805 harvesting client wishes to use the OAI interface for incremental
806 updates, it can cache at least a mapping of the registry identifiers to
807 their respective harvesting endpoints along with a timestamp for when
808 this operation was last successfully carried out on each. Then, at the
809 start of subsequent harvesting updates, the harvester can provide the
810 cached date using the \textbf{from} argument to receive only new and
811 updated records, and update the cached timestamp upon success.
812
813 Experience has shown that when relying on incremental harvests
814 exclusively, minor problems eventually accumulate to severe
815 inconsistencies even when registries declare support for deleted
816 records. It is therefore recommended that harvesting clients occasionally
817 (e.g., semiannually) perform full updates to an empty local copy without
818 using the \textbf{from} parameter, even for registries that announce
819 deletion of records. To further provide some robustness against small
820 operational issues in the publishing process, it is also recommended
821 to leave an overlap in incremental harvesting requests, e.g. to request
822 resources going back to the beginning of the day of the last incremental harvest.
823
824
825 For example, to get a listing of registries in the IVOA ecosystem, one
826 would first query
827 \nolinkurl{http://rofr.ivoa.net/oai?verb=ListRecords\&metadataPrefix=ivo\_vor\&set=ivo_publishers}.
828 Then, for each returned resource, the \xmlel{accessURL} under a
829 \xmlel{capability} with \xmlel{xsi:type="vg:Harvest"} could be
830 called as follows:
831 \nolinkurl{http://accessURLValue?verb=ListRecords\&metadataPrefix=ivo\_vor}
832 or
833 \nolinkurl{http://accessURLValue?verb=ListRecords\&metadataPrefix=ivo_vor\&from=YYYY-MM-DDTHH:MM:SSZ}
834 for return visits, with the 'from' date representing the last successful
835 query to that accessURL.
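
As a further sketch combining the recommendations above, an incremental
harvest restricted to locally published records and leaving an overlap could
look as follows. The endpoint \texttt{registry.example.org} and the date are
hypothetical: the last successful harvest is assumed to have taken place at
some point on 2017-06-15, so the request goes back to the beginning of that
day. The URL is broken across two lines only for display.

\begin{lstlisting}
http://registry.example.org/oai?verb=ListRecords&metadataPrefix=ivo_vor
    &set=ivo_managed&from=2017-06-15T00:00:00Z
\end{lstlisting}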
836
837 \section{Searching Registries}
838 \label{sect:searching}
839
840 Experience with version~1 of this specification suggests that it is
841 preferable to not couple the relatively stable standards for harvesting and
842 general registry maintenance with client interfaces to the registry,
843 which were found to be in much more need of experimentation. For a
844 discussion of the history of client interfaces in the VO, see
845 \citet{paper:regclient}.
846
847 \subsection{RI Search}
848 \label{RISearch}
849 A SOAP-based search capability, \xmlel{vg:RISearch} defined in Registry
850 Interfaces 1.0, exists but is no longer encouraged or required for searchable
851 registries as technologies have moved forward. However, it is still a valid
852 capability defined in the registry resource schema so that registry operators
853 may continue to provide valid RI1 registries without having to support different
854 versions of the VORegistry schema. The base \xmlel{vg:RISearch} extension may
855 also be useful for the description of future registry search interfaces. RISearch
856 is described in appendix~\ref{app:RISearch}.
857
858
859 \subsection{Registry Table Access Protocol Services}
860 \label{RegTAP}
861
862 One second-generation standard search interface to the VO Registry that
863 has progressed to become an IVOA recommendation is RegTAP
864 \citep{std:RegTAP}, an interface based on a relational representation of
865 key fields in resource descriptions and on the IVOA Table Access Protocol
866 \citep{std:TAP}. RegTAP services have been made available from several
867 registry providers listed in the Registry of Registries.
868
869 RegTAP-based registries should be located by clients as described
870 in the RegTAP standard (which in version 1.0 happens through locating
871 TAP services with a certain data model identifier like
872 \nolinkurl{ivo://ivoa.net/std/RegTAP#1.0}). To aid smart clients of
873 the full RofR which generate lists for initial discovery, RegTAP registries
874 must also be registered as separate resources with the appropriate tableset.
875 These must include either a full TAP service capability according to
876 TAPRegExt \citep{std:TAPREGEXT-20120827} or an auxiliary capability
877 referencing a TAP service as per \citet{note:DataCollect}. An example
878 for the latter option, preferable if the TAP service in question
879 contains additional tables, is given in appendix \ref{sect:exampleCap}.
880
881 \subsection{Announcing Local vs Full Searchable Registries}
882 \label{FullSearch}
883
884 While a publishing registry may provide search capabilities for its
885 own hosted records, this is considered a locally searchable registry,
886 and not a full searchable one, as distinguished in the RofR listing.
887 For a registry to be considered full searchable, it must harvest resources
888 from the other publishing registries listed in the RofR, and implement
889 an IVOA standard programmatic interface beyond the interface for OAI harvesting,
890 with some method for filtering resource queries.
891 This can be announced simply in the registry's own self-describing resource
892 record with a \xmlel{full} element set to \texttt{true}, without having to prescribe any
893 one interface as the defining search feature.
894
895 \section{Looking Forward}
896 \label{LookingForward}
897
898 While the OAI-PMH harvesting interface as adopted from outside the IVOA
899 community is stable and replacing it would require a major revision
900 of this document, we expect that new search interfaces for registries will
901 be continually developed, leveraging new technologies and best practices
902 as they emerge. These search interfaces can be added without sacrificing
903 interoperability with the IVOA registry ecosystem. Whether or not these emerging
904 search technologies become formally endorsed by the IVOA as notes or new
905 standards documents, so long as a registry supports the basic harvest interface
906 and hosts valid \xmlel{ri:Resource} documents including registry and authority
907 records, it should be considered covered by the practices described herein and
908 a welcome addition to the Registry of Registries listing, with all of its
909 records also accessible through the full registries.
910
911 \appendix
912
913 \section{The RegistryInterface Schema}
914 \label{app:rischema}
915
916 The following schema defines a global element, allowing the inclusion of
917 VOResource records into \xmlel{oai:metadata} elements in OAI-PMH
918 responses for the \texttt{ivo\_vor} metadata prefix. See
919 sect.~\ref{sect:metadataformats} for details.
920
921 The schema is unchanged from version 1.0 of this specification and
922 therefore does not change its version.
923
924 \lstinputlisting[language=XML]{RegistryInterface-1.0.xsd}
925
926 \section{The VORegistry Schema}
927 \label{app:vgschema}
928
929 The following schema defines VOResource types for describing registries
930 in the Registry. It is unchanged from version 1.0 of this specification
931 and therefore does not change its version.
932
933 Note that standards defining search interfaces may specify alternative
934 or complementary methods of registering the services defined by them,
935 and that auxiliary capabilities for these search capabilities may be
936 listed within the registry record.
937
938 \lstinputlisting[language=XML]{VORegistry-1.0.xsd}
939
940 \section{Example Capabilities}
941 \label{sect:exampleCap}
942
943 The following XML fragment shows the three capability elements discussed
944 in this document: the OAI-PMH-based publishing registry, the legacy
945 RI 1.0 searchable registry, and an auxiliary TAP capability as used
946 for RegTAP.
947
948 \begin{lstlisting}[language=XML]
949 <ri:Resource
950 xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
951 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
952 xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">
953
954 <!-- Standard VOResource metadata omitted for brevity -->
955
956 <!-- The capability for an OAI-PMH endpoint (publishing registry) -->
957 <capability xsi:type="vg:Harvest" standardID="ivo://ivoa.net/std/Registry">
958 <interface xsi:type="vg:OAIHTTP" version="1.0" role="std">
959 <accessURL use="base">http://registry.example.org/oai</accessURL>
960 </interface>
961 <maxRecords>100</maxRecords>
962 </capability>
963
964 <!-- A legacy, RI1.0 searchable registry endpoint, with an
965 extra interface for web browsers. -->
966 <capability xsi:type="vg:Search" standardID="ivo://ivoa.net/std/Registry">
967 <interface xsi:type="vr:WebBrowser" version="1.0" role="gui">
968 <accessURL use="full">http://registry.euro-vo.org</accessURL>
969 </interface>
970 <interface xsi:type="vr:WebService" version="1.0" role="std">
971 <accessURL use="full"
972 >http://registry.example.org/services/RegistrySearch</accessURL>
973 </interface>
974 <maxRecords>100</maxRecords>
975 <extensionSearchSupport>core</extensionSearchSupport>
976 </capability>
977
978 <!-- A reference to RegTAP-enabled TAP service as an auxiliary
979 capability -->
980 <capability standardID="ivo://ivoa.net/std/TAP#aux">
981 <interface xsi:type="vs:ParamHTTP" role="std">
982 <accessURL use="base">http://registry.example.org/tap</accessURL>
983 </interface>
984 </capability>
985
986 <!-- A RegTAP-capable searchable registry should have a tableset
987 with all its tables in the rr schema here -->
988 </ri:Resource>
989 \end{lstlisting}
990
991 \section{The RISearch Schema}
992 \label{app:RISearch}
993
994 The following schema defines the SOAP-based RISearch interface, which
995 is discouraged as of version 1.1 but still available.
996 It is unchanged from version 1.0 of this specification
997 and therefore does not change its version.
998
999 % GENERATED: !schemadoc VORegistry-1.0.xsd Search
1000 %\begin{generated}
1001
1002 \begingroup
1003 \renewcommand*\descriptionlabel[1]{%
1004 \hbox to 5.5em{\emph{#1}\hfil}}\vspace{2ex}\noindent\textbf{\xmlel{vg:Search} Type Schema Documentation}
1005
1006 \noindent{\small
1007 The capabilities of the Registry Search implementation.
1008 \par}
1009
1010 \vspace{1ex}\noindent\textbf{\xmlel{vg:Search} Type Schema Definition}
1011
1012 \begin{lstlisting}[language=XML,basicstyle=\footnotesize]
1013 <xs:complexType name="Search" >
1014 <xs:complexContent >
1015 <xs:extension base="vr:Capability" >
1016 <xs:sequence >
1017 <xs:element name="maxRecords" type="xs:int" />
1018 <xs:element name="extensionSearchSupport"
1019 type="vg:ExtensionSearchSupport" />
1020 <xs:element name="optionalProtocol" type="vg:OptionalProtocol" minOccurs="0"
1021 maxOccurs="unbounded" />
1022 </xs:sequence>
1023 </xs:extension>
1024 </xs:complexContent>
1025 </xs:complexType>
1026 \end{lstlisting}
1027
1028 \vspace{0.5ex}\noindent\textbf{\xmlel{vg:Search} Extension Metadata Elements}
1029
1030 \begingroup\small\begin{bigdescription}\item[Element \xmlel{maxRecords}]
1031 \begin{description}
1032 \item[Type] \xmlel{xs:int}
1033 \item[Meaning]
1034 The largest number of records that the registry search
1035 method will return. A value of zero or less indicates
1036 that there is no explicit limit.
1037
1038 \item[Occurrence] required
1039
1040 \end{description}
1041 \item[Element \xmlel{extensionSearchSupport}]
1042 \begin{description}
1043 \item[Type] string
1044 \item[Meaning]
1045 (deprecated)
1046
1047 \item[Occurrence] required
1048
1049 \item[Allowed Values]\hfil
1050 \begin{longtermsdescription}
1051 \item[core]
1052 Only searches against the core VOResource metadata are
1053 supported.
1054
1055 \item[partial]
1056 Searches against some VOResource extension metadata are
1057 supported but not necessarily all that exist in the registry.
1058
1059 \item[full]
1060 Searches against all VOResource extension metadata contained
1061 in the registry are supported.
1062
1063 \end{longtermsdescription}
1064 \item[Comment]
1065 This was used in Registry Interfaces 1.0 to indicate
1066 what VOResource extensions a search interface supported.
1067 Modern search interfaces will indicate that through
1068 version, their tableset, or similar.
1069
1070
1071 \end{description}
1072 \item[Element \xmlel{optionalProtocol}]
1073 \begin{description}
1074 \item[Type] string
1075 \item[Meaning]
1076 (deprecated)
1077
1078 \item[Occurrence] optional; multiple occurrences allowed.
1079
1080 \item[Allowed Values]\hfil
1081 \begin{longtermsdescription}
1082 \item[XQuery]
1083 the XQuery (http://www.w3.org/TR/xquery/) protocol as defined
1084 in the VO Registry Interface standard.
1085
1086 \end{longtermsdescription}
1087 \item[Comment]
1088 This was used in Registry Interfaces 1.0 to indicate
1089 search protocol extensions. In 1.1, use multiple
1090 capabilities with the appropriate standardIDs
1091 to declare special search capabilities.
1092
1093
1094 \end{description}
1095
1096
1097 \end{bigdescription}\endgroup
1098
1099 \endgroup
1100 %\end{generated}
1101
1102 % /GENERATED
1103
1104
1105 \section{Changes from Previous Versions}
1106
1107 \label{sect:changes}
1108
1109 For pre-REC-1.0 changes, see \citet{std:RI1}.
1110
1111 \subsection{Changes from first 1.1 WD}
1112
1113 \begin{itemize}
1114
1115 \item Text clarifications for harvesting the entire RofR, and
1116 exhortation to harvest from scratch occasionally, as OAI
1117 announcements of record deletions are not mandatory.
1118
1119 \item Simplified announcement of full searchable registry
1120 in the RofR and removed operational instructions which may change.
1121
1122 \end {itemize}
1123
1124 \subsection{Changes from Version 1.0}
1125
1126 \label{changes-1.0}
1127
1128
1129 \begin{itemize}
1130
1131 \item Corrected reference to OAI-PMH spec in registry interface
1132 description to v2.0.
1133
1134 \item Added requirement for OAI-PMH interface to support seconds
1135 granularity, optional in the OAI-PMH 2.0 standard itself.
1136
1137 \item Removed requirement for VOResource version number changes to force
1138 an update of this document.
1139
1140 \item Removed the implementation-dependent requirement for searchable
1141 registries in section 2, specifically the SOAP-based services
1142 based on ``ADQL 1.0'' and XQuery.
1143
1144 \item Dropped the requirement on registries to not deliver any records
1145 that are OAI-PMH deleted when no temporal constraint is given.
1146
1147 \item Added a requirement to provide VOSI endpoints.
1148
1149 \item Added support for auxiliary Registry TAP Service search interfaces
1150
1151 \item Clarified that the requirement to keep deleted records for six
1152 months only applies to the transient case; also discouraging registries
1153 with no support of deleted records.
1154
1155 \item Added recommended process for discovery of registries and their
1156 resources using the Registry of Registries, based on the Registry of
1157 Registries IVOA note
1158
1159 \item Added conclusion describing implications of future search and
1160 publishing interface changes in the Registry environment.
1161
1162 \item Many editorial changes across the text, mostly as a consequence of
1163 externalizing search interfaces.
1164
1165 \end{itemize}
1166
1167
1168 \bibliography{ivoatex/ivoabib,ivoatex/docrepo}
1169
1170 \end{document}
