Contents of /trunk/projects/registry/RegistryInterface/RegistryInterface.tex

Revision 3892 - (show annotations)
Wed Mar 8 23:06:07 2017 UTC (3 years, 10 months ago) by mbt
File MIME type: application/x-tex
File size: 49980 byte(s)
RegistryInterface: fix some tiny typos
1 \documentclass{ivoa}
2 \input tthdefs
3
4 \usepackage[utf8]{inputenc}
5 \usepackage{todonotes}
6 \usepackage{listings}
7 \usepackage{natbib}
8 \lstloadlanguages{XML}
9 \lstset{flexiblecolumns=true,basicstyle=\small,tagstyle=\ttfamily}
10
11 \SVN$Rev$
12 \SVN$Date$
13 \SVN$URL$
14
15 \hyphenation{name-space}
16
17 \newcommand{\oaiop}[1]{\textit{#1}}
18
19 \ivoagroup{Registry}
20
21
22 \author{Theresa Dower}
23 \author{Markus Demleitner}
24 \author{Kevin Benson}
25 \author{Ray Plante}
26 \author{Elizabeth Auden}
27 \author{Matthew Graham}
28 \author{Gretchen Greene}
29 \author{Martin Hill}
30 \author{Tony Linde}
31 \author{Dave Morris}
32 \author{Wil O'Mullane}
33 \author{Guy Rixon}
34 \author{Aur\'elien St\'eb\'e}
35 \author{Kona Andrews}
36
37 \editor{Theresa Dower}
38 \editor{Markus Demleitner}
39
40 \previousversion[http://www.ivoa.net/documents/RegistryInterface/20091104/]
41 {IVOA Registry Interfaces 1.0, IVOA Recommendation 2009-11-04}
42
43
44 \title{Registry Interfaces}
45
46 \begin{document}
47
48 \begin{abstract}
49 The VO Registry provides a mechanism with which VO applications can
50 discover and select resources that are relevant for a particular
51 scientific problem. This specification defines the operation of this
52 system. It is based on a general, distributed model composed of
53 searchable and publishing registries, as introduced at the beginning of
54 this document. The main body of the specification has three components:
55 (a) an interface for harvesting publishing registries, which builds upon
56 the Open Archives Initiative Protocol for Metadata Harvesting; (b) a
57 VOResource extension for registering registry services, together with a
58 description of a central list of these IVOA registry services; and (c) a
59 discussion of the Registry of Registries as the root component of data
60 discovery in the VO.
61 \end{abstract}
62
63
64 \section{Introduction}
65
66 \label{introduction}
67
68 In the Virtual Observatory (VO), registries provide a means for
69 discovering useful resources, i.e., data and services. This discovery
70 takes place by searching within structured descriptions of resources,
71 the resource records, authored by the data providers. In order to avoid
72 a single point of failure for the VO, the Registry is distributed.
73 This means that each data provider can run a service injecting
74 resource records into the Registry (a ``publishing registry'' as defined
75 below), and anyone can run services that allow global discovery (a
76 ``searchable registry'' as defined below).
77
78 To enable this, common mechanisms for registry communication and
79 interaction are required.
80 This document therefore describes the standard interfaces that enable
81 interoperable registries. Through these interfaces, registry
82 builders have a common way of sharing resource descriptions with users,
83 applications, and other registries.
84
85 This specification does not cover interfaces for global discovery, which
86 are the subject of other IVOA standards. Also, service operators are
87 free to build interactive, end-user interfaces in
88 any way that best serves their target community.
89
90 \subsection{Registry Architecture and Definitions}
91
92 \label{arch}
93
94 A \emph{registry} is first a repository of structured descriptions of
95 resources. In the VO, a \emph{resource} is defined by the IVOA
96 Recommendation ``Resource Metadata for the Virtual Observatory''
97 \citep{std:RM}, henceforth referred to as RM, as being
98
99
100 \begin{quotation}
101 a general term referring to a VO element that can be
102 described in terms of who curates or maintains it and which can be
103 given a name and a unique identifier. Just about anything can be a
104 resource: it can be an abstract idea, such as sky coverage or an
105 instrumental setup, or it can be fairly concrete, like an organization
106 or a data collection.
107 \end{quotation}
108
109 Organizations, data collections, and services can be considered
110 classes of resources. The most important type of resource to
111 applications is a service that actually does something. A registry
112 (lower case),
113 then, is ``a service for which the response is a structured description
114 of resources'' (RM).
115
116 This specification is based on the general IVOA model for registries
117 \citep{2004ASPC..314..585P}, which builds on RM's model
118 for resources. In this model, the VO environment features
119 different types of registries that serve different functions. The
120 primary distinction is between publishing registries and searchable
121 ones. A secondary distinction is full versus local.
122
123 A \emph{searchable registry} is one that allows users and client
124 applications to search for resource records using selection criteria
125 against the metadata contained in the records. The purpose of this type
126 of registry is to aggregate descriptions of many resources distributed
127 across the network. By providing a single place to locate data and
128 services, applications are spared from having to visit many different
129 sites just to determine which ones are relevant to the scientific
130 problem at hand. A searchable registry gathers its descriptions from
131 across the network through a process called \emph{harvesting}.
132
133 A \emph{publishing registry} is one that simply exposes its resource
134 descriptions to the VO environment in a way that allows those
135 descriptions to be harvested. The contents of these registries tend to
136 be limited to resources maintained by one or a few providers and thus
137 are local in nature; for example, a data center will run its own
138 publishing registry to allow other VO components to gather metadata on
139 the data center's published services.
140 Since the purpose is simply publishing and not to serve
141 users and applications directly, it is not necessary to support full
142 searching capabilities. This simplifies the requirements for a
143 publishing registry:
144 storage, management, and indexing of the records can be simpler, as
145 there is no need to support a
146 search interface facilitating complex discovery queries.
147 While a searchable registry in practice will necessitate the
148 use of a database system, a simple publishing registry may get by
149 storing its records as flat files on disk.
150
151 Note that some registries can play both roles; that is, a searchable
152 registry may also publish its own resource descriptions.
153
154 A secondary distinction is full versus local. A \emph{full registry} or
155 \emph{fully searchable registry} is one that attempts to contain records
156 of all resources known to the VO.
157 Several such registries exist, run by various VO projects. A
158 \emph{local registry}, on the other hand, contains only a subset of
159 known resources. While for publishing registries this subset usually is
160 defined by what services are maintained by the registry's operator,
161 other selection criteria are conceivable. For instance, the IVOA's
162 Education IG is considering running a registry only containing resources
163 manually selected for suitability for primary and secondary education.
164
165 As mentioned above, harvesting is the mechanism by which a registry can
166 collect resource records from other registries. It is used by full
167 registries to aggregate resource records from publishing
168 registries. It can also be used to synchronize two registries to ensure
169 that they have the same contents. Harvesting, in this specification, is
170 modeled as a pull operation between two registries. The term
171 \emph{harvester} refers to the registry that wishes to receive records
172 (usually a fully searchable registry); it sends its request to the
173 \emph{harvestee} (usually a publishing registry), which responds with
174 the records. Harvesting is a much simpler process than a fully-featured
175 search interface, as only very few constraints need to be supported and
176 only full records are being transmitted in responses.
177 Consequently, different protocols are employed for the
178 two types of registry operations.
179
180 In this text, ``registry'' in lower case refers to concrete services,
181 while ``Registry'' (or ``VO Registry'') in upper case refers to the
182 combination of the set of all resource records and the interfaces to
183 query and manage them.
184
185 \subsection{The Registry Interface within the VO Architecture}
186
187 \label{sect:rolewithinivoa}
188
189
190 \begin{figure}[thm]
191 \begin{center}
192 \includegraphics[width=0.9\textwidth]{archdiag.png}
193 \caption{IVOA Architecture
194 diagram with the Registry Interface specification (RI) and
195 the related standards marked up.}
196 \label{fig:arch}
197 \end{center}
198 \end{figure}
199
200 This specification directly relates to other VO standards in the
201 following ways:
202
203
204 \begin{bigdescription}
205 \item[VOResource \citep{std:VOR}]VOResource sets the foundation for a
206 formal definition of the data model for resource records via its schema
207 definition.
208
209 \item[IVOA Identifiers]IVOA identifiers are something like the
210 primary keys to the VO registry. Also, the notion of an authority as
211 laid down in IVOA Identifiers plays an important role as publishing
212 registries can be viewed as a realization of a set of authorities.
213
214 \end{bigdescription}
215
216
217 \section{The IVOA Harvesting Interface}
218
219 \label{harvesting}
220
221 The harvesting interface allows the retrieval of complete VOResource
222 records from registries supporting harvesting. Publishing registries
223 MUST support the IVOA harvesting interface; searchable registries SHOULD
224 do so.
225
226 The IVOA harvesting interface is built on the standard Protocol for
227 Metadata Harvesting developed by the Open Archives Initiative, OAI-PMH
228 \citep{std:OAIPMH}. In this section, after giving a brief introduction
229 to OAI-PMH, we define additional constraints and requirements for
230 OAI-PMH services to be interoperable with the VO environment.
231
232 Version 1.0 of this document defined a variant of the OAI-PMH
233 protocol that used SOAP for the exchange of messages. Version
234 1.1 no longer defines it (although of course there is no requirement to
235 remove it from running services); since OAI-PMH over SOAP has never been
236 in active use by the IVOA, we still consider this a minor specification
237 change not warranting a new major version.
238
239 \subsection{The OAI Protocol for Metadata Harvesting}
240
241 \label{oaipmh}
242
243 While for details of OAI-PMH we refer to \citet{std:OAIPMH},
244 in the following we give a
245 brief overview of OAI-PMH that should be sufficient to understand the
246 protocol's role within the Registry interface architecture.
247
248 The OAI-PMH v2.0 specification defines:
249
250
251 \begin{itemize}
252
253 \item the meaning and behavior of the six harvesting operations, referred
254 to as verbs,{}
255
256 \item the meaning of the input arguments for each operation, and{}
257
258 \item the XML Schema used to encode response messages.{}
259 \end{itemize}
260
261 The six standard operations laid down in OAI-PMH are:
262
263
264 \begin{bigdescription}
265 \item[Identify] provides a description of the registry.
266
267 \item[ListIdentifiers]returns a list of identifiers for the resource
268 records held by the registry, possibly restricted to records changed
269 within a certain time span or to those belonging to a certain set.
270
271 \item[ListRecords]returns complete resource records in the registry,
272 possibly restricted to records changed within a certain time span or to
273 those belonging to a certain set.
274
275 \item[GetRecord]returns a single resource description matching a given
276 identifier.
277
278 \item[ListMetadataFormats]returns a list of supported formats that the
279 registry can use to encode resource descriptions upon a harvester's
280 request.
281
282 \item[ListSets]returns a list of set names supported by the registry
283 that harvesters can request in order to get back a subset of the
284 descriptions held by the registry.
285
286 \end{bigdescription}
287
288 The ListRecords and GetRecord operations return the actual resource
289 description records held by the registry. These descriptions are encoded
290 in XML and wrapped in a general-purpose envelope defined by the OAI-PMH
291 XML Schema (with the namespace
292 \texttt{http://www.openarchives.org/OAI/2.0}).
293
294 Through the operations' arguments, OAI-PMH provides a number of useful
295 features:
296
297
298 \begin{itemize}
299
300 \item Support for multiple return formats. As suggested by the existence
301 of the
302 \oaiop{ListMetadataFormats} operation, a harvester can request the
303 formats available for encoding returned resource descriptions.{}
304
305 \item Harvesting by date. The \oaiop{ListIdentifiers} and
306 \oaiop{ListRecords} operations both support \texttt{from} and
307 \texttt{until} date arguments which restrict the response to records
308 changed within the given, possibly half-open, interval.{}
309
310 \item Harvesting by category. The \oaiop{ListIdentifiers} and
311 \oaiop{ListRecords} operations both support a set argument for
312 retrieving resources that are grouped in a particular category. Resource
313 records may belong to multiple sets.{}
314
315 \item Marking records as deleted. Registries may mark records as deleted
316 so that harvesters will be notified that a resource has become
317 unavailable even if only performing incremental harvests.
318
319 \item Support for resumption tokens. If a request results in returning a
320 very large number of records, the registry can choose to split the
321 results over several calls; this is done by passing a resumption token
322 back to the harvester. The harvester uses it to retrieve the next set of
323 matching results.{}
324
325 \end{itemize}
326 It is important to note that the OAI-PMH interface is not intended
327 to be a general search interface. The filtering capabilities described
328 above are just enough to support intelligent harvesting between
329 registries. Most end-user applications will use a dedicated search
330 interface on a searchable registry (cf.~sect.~\ref{sect:searching}).
331
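As an illustration of the operations and arguments summarized above, the
following listing sketches a few HTTP GET requests a harvester might
issue. The base URL \nolinkurl{http://registry.example.org/oai} and the
resumption token value are placeholders invented for this example; the
argument names are those defined by OAI-PMH, and the line breaks within
a request are for readability only.

\begin{lstlisting}
# describe the registry as a whole
GET http://registry.example.org/oai?verb=Identify

# incremental harvest of records changed since the given date
GET http://registry.example.org/oai?verb=ListRecords
      &metadataPrefix=ivo_vor
      &from=2017-01-01T00:00:00Z

# continue a response that was split by the registry; the
# resumption token is the only argument besides the verb
GET http://registry.example.org/oai?verb=ListRecords
      &resumptionToken=TOKEN-FROM-PREVIOUS-RESPONSE
\end{lstlisting}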
332 Beyond basic OAI-PMH compliance, this specification defines
333 additional requirements and recommendations that are compatible with
334 OAI-PMH but specific to its use within the VO; these are described in the
335 remaining subsections.
336
337
338 \subsection{Metadata Formats for Resource Descriptions}
339
340 \label{sect:metadataformats}
341
342 All IVOA registries that support the Harvesting Interface must support
343 two standard metadata formats: the OAI Dublin Core format (mandated by
344 the base OAI-PMH standard) and the IVOA VOResource metadata format
345 \citep{std:VOR}.
346
347 The VOResource metadata format has the metadata prefix name
348 \texttt{ivo\_vor}, which can be used wherever \citet{std:OAIPMH} allows a
349 metadata prefix name. The format uses the VOResource core XML Schema
350 with the namespace
351 \texttt{http://www.ivoa.net/xml/VOResource/v1.0}
352 (recommended namespace prefix \xmlel{vr:}) along with any legal
353 extension of this schema to encode the resource descriptions within the
354 OAI-PMH metadata tag from the OAI XML Schema (namespace
355 \texttt{http://www.openarchives.org/OAI/2.0}, recommended namespace
356 prefix \xmlel{oai:}).
357
358 As VOResource and its extensions do not define global elements, the
359 child element within \xmlel{oai:metadata} needs to be separately
360 defined. This specification does this by providing the
361 \xmlel{ri:Resource} element. It is defined in a schema with the target
362 namespace
363 \nolinkurl{http://www.ivoa.net/xml/RegistryInterface/v1.0}, which is given
364 in appendix~\ref{app:rischema}.
365
366 The
367 \xmlel{ri:Resource} element MUST include an \xmlel{xsi:type} attribute
368 that assigns the element's type to \xmlel{vr:Resource} or one of its
369 legal extensions.
370
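The following listing is a minimal sketch of how a VOResource record
might appear inside the OAI-PMH envelope. The identifier
\nolinkurl{ivo://example.org/myservice}, the datestamps, and the choice
of \xmlel{vs:CatalogService} as the resource type are made up for
illustration; namespace declarations for the prefixes \xmlel{oai:},
\xmlel{ri:}, \xmlel{xsi:}, and \xmlel{vs:} are assumed on enclosing
elements, and most of the record content is elided.

\begin{lstlisting}[language=XML]
<oai:record>
  <oai:header>
    <oai:identifier>ivo://example.org/myservice</oai:identifier>
    <oai:datestamp>2017-03-08T23:06:07Z</oai:datestamp>
  </oai:header>
  <oai:metadata>
    <!-- xsi:type selects vr:Resource or one of its extensions -->
    <ri:Resource xsi:type="vs:CatalogService" status="active"
        created="2016-01-01T00:00:00" updated="2017-03-08T23:06:07">
      <title>An Example Catalog Service</title>
      <identifier>ivo://example.org/myservice</identifier>
      <!-- curation, content, capability, ... elided -->
    </ri:Resource>
  </oai:metadata>
</oai:record>
\end{lstlisting}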
371 It is strongly recommended that all QName values of \xmlel{xsi:type}
372 attributes within the VOResource record use XML namespace prefixes as
373 recommended in VOResource or the VOResource extensions. Minor version
374 changes are not in general reflected in the recommended prefixes --
375 e.g., both VODataService 1.0 and VODataService 1.1 use \xmlel{vs:}.
376 Registry operators
377 who must deliver OAI-PMH documents containing resource records written
378 to different versions of a registry extension are advised to
379 override the prefix
380 bindings on the element level if at all possible.
381
382 The OAI Dublin Core format, with the metadata prefix of \texttt{oai\_dc},
383 is defined by the OAI-PMH base standard and must be supported by all
384 OAI-PMH compliant registries.
385
386 Harvestable registries may support other metadata formats. Responses to
387 the
388 \oaiop{ListMeta\-dataFormats} operation
389 must list all names for formats supported
390 by the registry; even though they are mandatory, this list must include
391 \texttt{ivo\_vor} and \texttt{oai\_dc}.
392
393
394 \subsection{Identifiers in OAI Messages}
395
396 \label{oaiidentifiers}
397
398 In accordance with the OAI-PMH standard, an OAI-PMH XML envelope that
399 contains a resource description must include a globally unique URI that
400 identifies that resource record. This identifier must be the IVOA
401 identifier used to identify the resource being described as given in
402 its \xmlel{vr:identifier} child element.
403
404 This specification does not follow the recommendation of the OAI-PMH
405 standard with regard to record identifiers. OAI-PMH makes a distinction
406 between the resource record containing resource metadata and the
407 resource itself; thus, it recommends that the identifier in the OAI
408 envelope be different from the resource identifier. In particular, the
409 former is the choice of the publishing registry. This allows one to
410 distinguish resource descriptions of the same resource from different
411 registries, which in principle could be different.
412
413 In the VO, because it is intended that resource descriptions of the
414 same resource from different registries should not differ (apart from
415 possible additions of \xmlel{vr:validationLevel} elements), there
416 is not a strong need to distinguish between the resource and the
417 resource description.
418
419 By making the resource and resource record
420 identifiers the same, it becomes much easier to retrieve the record for
421 a single resource via \oaiop{GetRecord}, regardless of which
422 registry is being queried. Otherwise -- when the registry chooses
423 the record identifier -- a client will not a priori know the record
424 identifier for a particular resource, and so it is left to call
425 \oaiop{ListRecords} and search through the metadata of all the
426 records itself to find the one of interest. In contrast, IVOA
427 identifiers are intended to be a cross-application way of referring to a
428 resource, and thus when a client wants only a single specific resource
429 record, it is very likely that it would know the resource identifier
430 when making a call to the \oaiop{GetRecord} operation.
431
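Because the record identifier is the IVOA identifier, a client that
knows the latter can form a \oaiop{GetRecord} request directly. The
sketch below assumes a hypothetical registry endpoint at
\nolinkurl{http://registry.example.org/oai} and a hypothetical resource
\nolinkurl{ivo://example.org/myservice}; as with any URI passed in a
query string, the identifier's reserved characters are percent-encoded.
The line breaks are for readability only.

\begin{lstlisting}
GET http://registry.example.org/oai?verb=GetRecord
      &metadataPrefix=ivo_vor
      &identifier=ivo%3A%2F%2Fexample.org%2Fmyservice
\end{lstlisting}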
432
433 \subsection{Required Records}
434 \label{oairequired}
435
436 This section describes the records that a harvestable VO registry
437 must include among those it emits via the OAI-PMH operations.
438
439 The harvestable registry MUST return one record that describes the
440 registry itself as a whole, and the \texttt{ivo\_vor} format MUST be
441 supported for this record. This record is also included in the
442 \oaiop{Identify} operation response. When encoded using the
443 \texttt{ivo\_vor} format, the returned \xmlel{ri:Resource} element must
444 be of the type \xmlel{vg:Registry} from the VORegistry schema
445 (see sect.~\ref{sect:vgharvest}). The
446 record MUST include a \xmlel{vg:managedAuthority} for every authority
447 identifier that originates at that registry.
448
449 Additions to the list of a registry's managed authorities must follow
450 the protocol outlined in sect.~\ref{sect:authres}.
451
452 The harvestable registry must be able to return exactly one record in
453 \texttt{ivo\_vor} for each authority identifier listed as a
454 \xmlel{vg:managedAuthority} in the \xmlel{vg:Registry} record
455 that describes that registry. When encoded in the \texttt{ivo\_vor}
456 format, the type of these elements must be \xmlel{vg:Authority}.
457
458
459 \subsection{The Identify Operation}
460
461 \label{sect:oaiidentify}
462
463 The \oaiop{Identify} operation describes the harvestable registry as a
464 whole. The response from this operation must include all information
465 required by the OAI-PMH standard. In particular, it must include an
466 \xmlel{oai:baseURL} element that must refer to the base URL to the
467 harvesting interface endpoint. The \oaiop{Identify} response must
468 include an \xmlel{oai:description} element containing a single
469 \xmlel{ri:Resource} element with an \xmlel{xsi:type} attribute that
470 sets the element's type to \xmlel{vg:Registry}. The content of
471 \xmlel{vg:Registry} type must be the registry description of the
472 harvestable registry itself.
473
474 In its \oaiop{Identify} response, an OAI-PMH-compliant registry must
475 declare its support for deleted records. This can be one of
476
477 \begin{description}
478
479 \item[\texttt{no}] -- the registry will never notify harvesters of
480 records that have become unavailable. In an environment like the VO,
481 where searchable registries frequently harvest publishing registries,
482 this is strongly discouraged, as without deleted records, harvesters
483 need to perform full harvests every time or risk delivering stale
484 records.
485 \item[\texttt{transient}] -- the registry will notify harvesters of
486 records that have become unavailable, but the deleted records will
487 entirely vanish after some time. This specification adds to the OAI-PMH
488 requirements that registries declaring \texttt{transient} support MUST
489 keep their deleted records for at least six months (after which they may
490 discard them).
491 \item[\texttt{persistent}] -- the registry promises to indefinitely keep
492 deleted records.
493 \end{description}
494
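To make the requirements of this section concrete, the following is a
sketch of the payload of an \oaiop{Identify} response. The repository
name, URLs, e-mail address, and dates are invented for this example, the
registry's self-description is elided, and namespace declarations for
the prefixes used are assumed on enclosing elements. The element names
in the \xmlel{oai:} namespace are those defined by OAI-PMH.

\begin{lstlisting}[language=XML]
<oai:Identify>
  <oai:repositoryName>Example Publishing Registry</oai:repositoryName>
  <oai:baseURL>http://registry.example.org/oai</oai:baseURL>
  <oai:protocolVersion>2.0</oai:protocolVersion>
  <oai:adminEmail>registry@example.org</oai:adminEmail>
  <oai:earliestDatestamp>2006-07-01T09:00:00Z</oai:earliestDatestamp>
  <!-- one of no, transient, persistent -->
  <oai:deletedRecord>persistent</oai:deletedRecord>
  <oai:granularity>YYYY-MM-DDThh:mm:ssZ</oai:granularity>
  <oai:description>
    <ri:Resource xsi:type="vg:Registry" status="active"
        created="2006-07-01T09:00:00" updated="2017-03-08T23:06:07">
      <!-- the registry's own resource record, elided here -->
    </ri:Resource>
  </oai:description>
</oai:Identify>
\end{lstlisting}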
495 \subsection{IVOA Supported Sets}
496
497 \label{supportedsets}
498
499 Sets, as defined in the OAI-PMH standard, are ``an optional construct
500 for grouping items for the purpose of selective harvesting'' (see
501 \citet{std:OAIPMH}, section 2.6). Harvestable IVOA registries are free
502 to define any number of custom sets for categorizing records. The
503 OAI-PMH standard allows a record to be a member of multiple sets.
504
505 This specification defines one reserved set name with a special
506 meaning; future versions of this specification may define additional set
507 names. These reserved set names will all start with the characters
508 \texttt{ivo\_}; implementors should not define their own set names
509 that begin with this string. While support for sets is optional
510 in the OAI-PMH standard, a VO registry MUST support
511 the set with the reserved name \texttt{ivo\_managed} to be compliant
512 with this specification.
513
514 The \texttt{ivo\_managed} set refers to all records that originate from the
515 queried registry. That is, those records that were harvested from other
516 registries are excluded. The resource identifiers given in the
517 records MUST have an authority identifier that matches one of the
518 \xmlel{vg:managedAuthority} values in the \xmlel{vg:Registry}
519 record for that registry. Fully searchable registries may use this set
520 while harvesting other searchable registries to avoid getting
521 duplicate records.
522
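For illustration, a registry offering only the mandatory set might
answer a \oaiop{ListSets} request with a payload along the following
lines. The set name text is an arbitrary choice made for this sketch;
only the set specification \texttt{ivo\_managed} is prescribed by this
document.

\begin{lstlisting}[language=XML]
<oai:ListSets>
  <oai:set>
    <oai:setSpec>ivo_managed</oai:setSpec>
    <oai:setName>Records originating from this registry</oai:setName>
  </oai:set>
</oai:ListSets>
\end{lstlisting}

A harvester would then restrict a harvest to these records by adding
\texttt{set=ivo\_managed} to its \oaiop{ListRecords} or
\oaiop{ListIdentifiers} request.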
523 \subsection{Time Granularity}
524
525 \label{sect:timegranularity}
526
527 Datestamps in the OAI-PMH 2.0 standard are encoded using ISO8601 and
528 expressed in UTC, with the UTC designator ``Z'' appended to seconds-based
529 granularity where supplied, i.e. \texttt{YYYY-MM-DDThh:mm:ssZ}. In
530 general OAI-PMH registries, granularity at seconds scale is optional.
531 Harvestable IVOA registries MUST report datestamps at the granularity of
532 seconds and accept \texttt{from} and \texttt{until} arguments in the same format. This
533 simplifies the incremental harvesting process in the multi-registry IVOA
534 environment.
535
536 \section{Registering Registries}
537 \label{regreg}
538
539 Harvesting registries must be able to locate remote registry resources
540 relevant to them, and both harvesting registries and clients need access
541 to metadata for the registry service itself. We address both of these
542 issues by providing a schema for describing registries themselves, and a
543 repository for indexing them.
544
545 The resource specification for registries themselves is defined by an
546 \xmlel{ri:Re\-source} extension \xmlel{vg:Registry}, which describes
547 metadata of the registry itself and its support for interfaces
548 described in this document or elsewhere.
549 These resources are themselves stored as
550 records in registries as described in sect.~\ref{oairequired}. From each
551 identifier, further IVOA identifiers for authority information,
552 services, and other records belonging under that publishing umbrella
553 may be created. A publishing registry is said to exclusively manage a
554 naming authority on behalf of the owning publisher; this means that
555 within the IVOA registry network, only that specific registry may
556 publish records having identifiers which begin with that authority identifier.
557
558 The XML namespace URI of this schema is
559 \nolinkurl{http://www.ivoa.net/xml/VORegistry/v1.0}. It has been chosen
560 to allow it to be resolved as a URL to the XML Schema document, which is
561 also given in appendix~\ref{app:vgschema}. The recommended prefix for
562 this namespace is \xmlel{vg:}.
563
564 The schema has not been changed from the one used in version 1.0,
565 although the standard contents have somewhat changed. The rationale for
566 keeping the schema unchanged is that the presence of schema features no longer relevant
567 has no detrimental consequences for Registry operations, whereas changing
568 the schema could break already operational clients.
569
570
571 \begin{figure}[thm]
572 \begin{lstlisting}[language=XML]
573 <ri:Resource status="active" xsi:type="vg:Authority"
574 updated="2006-07-01T09:00:00" created="2006-07-01T09:00:00">
575 <title>IVOA Naming Authority</title>
576 <shortName>IVOA</shortName>
577 <identifier>ivo://ivoa.net</identifier>
578 <curation>
579 <publisher ivo-id="ivo://ivoa.net/IVOA">International Virtual
580 Observatory Alliance</publisher>
581 <creator>
582 <name>Raymond Plante</name>
583 <logo>http://www.ivoa.net/icons/ivoa_logo_small.jpg</logo>
584 </creator>
585 <date>2006-07-01</date>
586 <contact>
587 <name>IVOA Resource Registry Working Group</name>
588 <email>registry@ivoa.net</email>
589 </contact>
590 </curation>
591 <content>
592 <subject>virtual observatory</subject>
593 <description>This registers the IVOA as the owner of the ivoa.net
594 authority identifier.</description>
595 <referenceURL>http://rofr.ivoa.net</referenceURL>
596 </content>
597 <managingOrg>International Virtual Observatory Alliance</managingOrg>
598 </ri:Resource>
599 \end{lstlisting}
600 \caption{A sample \xmlel{vg:Authority}-typed resource record as it would
601 be delivered within \xmlel{oai:metadata}. XML namespace declarations
602 for the prefixes \xmlel{ri:}, \xmlel{xsi:}, and \xmlel{vg:} are
603 assumed on enclosing elements.}
604 \label{fig:authrecord}
605 \end{figure}
606
607 \subsection{The Authority Resource Extension and the Publishing Process}
608
609 \label{sect:authres}
610
611
612 The \xmlel{vg:Authority} type extends the core \xmlel{vr:Resource}
613 type to specifically describe the ownership of an authority identifier
614 by a publishing organization.
615
616 The IVOA identifier of a \xmlel{vg:Authority} record provided via the
617 \xmlel{vr:identifi\-er} element must have an empty resource key component
618 as defined in \citet{std:VOID}.
619
620 The meaning of a \xmlel{vg:Authority} record is that the organization
621 referenced in the \xmlel{vg:managingOrg} element has the sole right to
622 create (in collaboration with a publishing registry) and register
623 resource descriptions using the authority identifier given by the
624 \xmlel{vr:identifier} element.
625
626 Before a publisher can create resource descriptions using a new
627 authority identifier, it must first register its claim to the authority
628 identifier by creating a \xmlel{vg:Authority} record. Before the
629 publishing registry commits the record for export, it must first search
630 a full registry to determine if a \xmlel{vg:Authority} with this
631 identifier already exists; if it does, the publication of the new
632 \xmlel{vg:Authority} record must fail.
633
634 When a registry creates a
635 \xmlel{vg:Authority} record, it is said that the registry manages the
636 associated authority identifier (on behalf of the owning publisher)
637 because only that registry may create records with identifiers beginning
638 with that authority identifier. The registry must also document this ownership
639 by adding a corresponding \xmlel{vg:managedAuthority} element to the
640 registry's own resource record.
641
642 The mechanism outlined here is not free of potential conflicts in the distributed
643 environment of the VO Registry. The IVOA Registry Working Group
644 periodically monitors the registry-authority graph to ensure each
645 authority in the Registry is claimed by exactly one registry.
646
647 \subsection{Describing Registries with the Registry Resource Extension}
648
649 \label{sect:resext}
650
651 The \xmlel{vg:Registry} type extends the core \xmlel{vr:Service} type to
652 specifically describe registries in order to support discovering them
653 and collecting their metadata; in addition, the extension type also
654 defines the VO-specific metadata in the response to an OAI-PMH
655 \oaiop{Identify} request.
656
657 As a subclass of \xmlel{vr:Service}, the \xmlel{vg:Registry}
658 type uses \xmlel{vr:capability} elements to describe its support for
659 network interfaces to the services. The specific types defined here
660 derive from an intermediate restriction on \xmlel{vr:Capability} called
661 \xmlel{vg:RegCapRestriction} to force the value of the
662 \xmlel{standardID} attribute to be \nolinkurl{ivo://ivoa.net/std/Registry}.
663 In particular, OAI-PMH endpoints as specified here are identified by
664 \nolinkurl{ivo://ivoa.net/std/Registry}. Clients should discover
665 registries by looking for records with capabilities declaring
666 this \xmlel{standardID}.
667
668 If the \xmlel{vg:full} element in a \xmlel{vg:Registry} instance
669 is set to \texttt{true}, it indicates the registry's intent to
670 accept all valid resource records it harvests from other
671 registries in accordance with the OAI-PMH specification. Such registries
672 will typically be searchable registries implementing some Registry search
673 interface, but there are also use cases for full registries only
674 implementing OAI-PMH (and thus only providing a \xmlel{vg:Harvest}
675 capability).
676
677 The \xmlel{vg:managedAuthority} is used by publishing registries to
678 claim an authority identifier (see also sect.~\ref{oairequired}). Note
679 that for each managed authority claimed, the registry MUST provide a
680 \xmlel{vg:Authority}-typed resource record for that authority identifier
681 within its \texttt{ivo\_managed} set.
682
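As an illustration of the pairing described above, the following sketch
shows a publishing registry claiming one authority identifier, together
with the matching authority record it would have to serve in its
\texttt{ivo\_managed} set. The authority \texttt{example.org}, the
dates, and the organisation name are made-up placeholders, and the
standard VOResource metadata is omitted; this is a non-normative
sketch rather than a complete, validating record.

\begin{lstlisting}[language=XML]
<!-- Registry record (hypothetical values throughout) -->
<ri:Resource xsi:type="vg:Registry" status="active"
    created="2017-01-01T00:00:00Z" updated="2017-01-01T00:00:00Z"
    xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">
  <!-- title, identifier, curation, content, and the capability
       elements are omitted for brevity -->
  <full>false</full>
  <managedAuthority>example.org</managedAuthority>
</ri:Resource>

<!-- The authority record backing the claim above -->
<ri:Resource xsi:type="vg:Authority" status="active"
    created="2017-01-01T00:00:00Z" updated="2017-01-01T00:00:00Z"
    xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">
  <!-- title, curation, and content omitted for brevity -->
  <identifier>ivo://example.org</identifier>
  <managingOrg>Example Observatory</managingOrg>
</ri:Resource>
\end{lstlisting}
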
683 As of version 1.1 of this specification, VO registry records must provide
684 the three mandatory VOSI capabilities: availability, a listing of
685 service capabilities, and a listing of tables if relevant, i.e. if a
686 RegTAP or other tabular interface is available \citep{std:VOSI}.
687
688
689 \subsection{The Search Capability}
690 \label{sect:vgsearch}
691
692 Version 1 of this standard defined a search interface, and such
693 interfaces are described by capabilities of the type \xmlel{vg:Search}.
694 Since in this version, search interfaces are specified by external
695 standards, such external standards may define differing ways of
696 discovering them\footnote{For instance, RegTAP \citep{std:RegTAP} uses
697 the \xmlel{tre:dataModel} element from TAPRegExt as its primary
698 discovery mechanism in its version 1.0.}. The search capability nevertheless is not removed
699 from the schema for backward compatibility, and is available in appendix
700 \ref{app:RISearch}.
701
702 \subsection{The Harvesting Capability}
703
704 \label{sect:vgharvest}
705
706 A registry declares itself to be a harvestable registry by including a
707 \xmlel{vr:capability} element with an \xmlel{xsi:type}
708 attribute set to \xmlel{vg:Harvest}. An example capability for this
709 type is provided in appendix~\ref{sect:exampleCap}.
710
711 A \xmlel{vr:capability} element of type \xmlel{vg:Harvest} MUST
712 include at least one \xmlel{vr:interface} element with an
713 \xmlel{xsi:type} attribute set to \xmlel{vg:OAIHTTP} and the
714 \xmlel{role} attribute set to \texttt{std}. If the
715 \xmlel{vr:capability} element is used to simultaneously describe
716 support for other versions of this Registry Interface standard, then the
717 \xmlel{vr:interface} element describing support for this version must
718 include the version attribute set to \texttt{1.0}. The
719 \xmlel{vr:accessURL} element must be set to the base URL for the
720 OAI-PMH interface.
721
722 The \xmlel{vg:OAISOAP} extension of \xmlel{vr:WebService}
723 was defined in version 1.0 of this specification and is no longer part of VO
724 Registry interfaces since it was never used.
725
726 \section{Registry Discovery}
727
728 \subsection{The Registry of Registries}
729
730 \label{sect:rofr}
731
732 To facilitate discovery and automated harvesting of VO registries,
733 a registry serving as a master list of
734 IVOA registries exists as part of the IVOA web infrastructure, hosted at
735 \nolinkurl{http://rofr.ivoa.net}. It is referred to as the Registry of
736 Registries, or RofR (pronounced ``rover''). As the RofR is itself a
737 registry, an OAI-PMH interface is provided which conforms to this
738 document. The OAI-PMH interface is always available at
739 \nolinkurl{http://rofr.ivoa.net/oai}.
740
741 The Registry of Registries includes the resource records directly
742 representing each currently active registry of IVOA resources, be they
743 fully searchable or publishing registries providing only an OAI-PMH
744 harvesting interface. These resources are of type \xmlel{vg:Registry} as
745 defined in section \ref{sect:resext}.
746
747 Once a registry provider has deployed a new publishing registry, they
748 must enroll it in the RofR for their records to be seen by the fully
749 searchable registries, and therefore registry search clients accessing
750 the whole IVOA registry ecosystem. The RofR provides a dedicated
751 web-based interface for this purpose accessible
752 from \nolinkurl{http://rofr.ivoa.net}. The RofR includes a
753 validator package, which thoroughly checks the new registry, including
754 schema validation for the OAI interface itself and all listed resources.
755 The registration process will only accept registries that validate
756 successfully. Local updates within a publishing registry post-inclusion
757 in the RofR are not necessarily automatically validated by the RofR
758 software later: the validator tool can, and indeed should, be used
759 independently of the initial admission process by the registry providers
760 to periodically make sure their registries are still compliant with the
761 relevant IVOA standards.
762
763 The Registry of Registries also contains resources describing
764 the most recent versions of IVOA standards for resources and
765 resource extensions themselves; these are of type \xmlel{vstd:Standard}.
766 It is not guaranteed that every standard will be represented in RofR,
767 but for the ones that are listed, the RofR version of their document
768 is the canonical version.
769
770 \subsection{Harvesting the Registry of Registries}
771
772 \label{sect:harvestrofr}
773
774 Given that the Registry of Registries contains records for all other
775 currently active and validated IVOA registries, a client wishing to
776 harvest the contents of all registries should begin at the RofR. Fully
777 searchable registries wishing to include records from the other IVOA
778 registries count among these potential clients. To harvest the entire
779 contents of IVOA registries, it is recommended to first harvest the
780 Registry of Registries via its OAI-PMH interface.
781
782 This first step is done by making a call to the RofR's OAI-PMH interface
783 with the \textbf{ListRecords} operation, with the \textbf{set} argument
784 set to \textbf{ivo\_publishers}. This will return the registry records
785 (i.e. resources with xsi:type='vg:Registry') for the registries that
786 successfully registered themselves as described in section~\ref{sect:rofr}.
787
788 The next step in harvesting the entire distributed IVOA registry
789 contents is to iterate over the \xmlel{accessURL} of each
790 \xmlel{vg:Registry} record's \xmlel{vr:capability} of type
791 \xmlel{vg:Harvest}, and use the URL for each of those OAI-PMH interfaces
792 to harvest the individual registries. This filtering of RofR contents
793 can be done by adding the \texttt{set} parameter to an OAI query to the
794 RofR: registries in the RofR comprise the supported set
795 \texttt{ivo\_publishers}. Then when harvesting each registry in turn, to
796 avoid harvesting duplicate records from the fully searchable registries,
797 it is recommended to add the \texttt{set} parameter to that OAI query:
798 records specifically published by a registry which also has a search
799 interface comprise that registry's supported set \texttt{ivo\_managed}.
800
801 The very first time the harvester executes the \textbf{ListRecords}
802 operation on the RofR or any listed registry, the \textbf{from} argument
803 should not be used so that all known publishing registries are returned,
804 as well as all known resources within each discovered registry. If the
805 harvesting client wishes to use the OAI interface for incremental
806 updates, it can cache at least a mapping of the registry identifiers to
807 their respective harvesting endpoints along with a timestamp for when
808 this operation was last successfully carried out on each. Then, at the
809 start of subsequent harvesting updates, the harvester can provide the
810 cached date using the \textbf{from} argument to receive only new and
811 updated records, and update the cached timestamp upon success. It is
812 suggested that harvesting clients perform full updates without the
813 \textbf{from} parameter on an occasional basis.
814
815 For example, to get a listing of registries in the IVOA ecosystem, one
816 would first query
817 \nolinkurl{http://rofr.ivoa.net/oai?verb=ListRecords\&metadataPrefix=ivo\_vor\&set=ivo_publishers}.
818 Then, for each returned resource, the \xmlel{accessURL} under a
819 \xmlel{capability} with \xmlel{xsi:type=vg:Harvest} could be
820 called as follows:
821 \nolinkurl{http://accessURLValue?verb=ListRecords\&metadataPrefix=ivo\_vor}
822 or
823 \nolinkurl{http://accessURLValue?verb=ListRecords\&metadataPrefix=ivo_vor\&from=YYYY-MM-DDTHH:MM:SSZ}
824 for return visits, with the 'from' date representing the last successful
825 query to that accessURL. Note that, according to the OAI-PMH standard, the granularity
826 of dates in 'from' fields is optional beyond day (DD), and some overlap in
827 requested timeframes may be useful from an operational standpoint.
828
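To make the second step above more concrete, the following sketch
shows roughly what a response to the first query might look like and
where the harvesting endpoint is found in it. The envelope elements
follow OAI-PMH 2.0; the registry identifier, the dates, the access URL,
and the token value are made-up placeholders, and most VOResource
metadata is omitted.

\begin{lstlisting}[language=XML]
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <responseDate>2017-03-08T23:06:07Z</responseDate>
  <request verb="ListRecords" metadataPrefix="ivo_vor"
    set="ivo_publishers">http://rofr.ivoa.net/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>ivo://example.org/registry</identifier>
        <datestamp>2017-03-01T12:00:00Z</datestamp>
        <setSpec>ivo_publishers</setSpec>
      </header>
      <metadata>
        <ri:Resource xsi:type="vg:Registry" status="active"
          created="2017-01-01T00:00:00Z" updated="2017-03-01T12:00:00Z"
          xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
          xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">
          <!-- Standard VOResource metadata omitted for brevity -->
          <capability xsi:type="vg:Harvest"
            standardID="ivo://ivoa.net/std/Registry">
            <interface xsi:type="vg:OAIHTTP" version="1.0" role="std">
              <!-- This is the URL to harvest in the next step -->
              <accessURL use="base"
                >http://registry.example.org/oai</accessURL>
            </interface>
            <maxRecords>100</maxRecords>
          </capability>
          <full>false</full>
          <managedAuthority>example.org</managedAuthority>
        </ri:Resource>
      </metadata>
    </record>
    <!-- ... further records ... -->
    <!-- A non-empty token signals that the list is incomplete -->
    <resumptionToken>hypothetical-token</resumptionToken>
  </ListRecords>
</OAI-PMH>
\end{lstlisting}

If a non-empty \xmlel{resumptionToken} is returned, OAI-PMH expects the
client to continue with a request carrying only the verb and the token,
for instance
\nolinkurl{http://rofr.ivoa.net/oai?verb=ListRecords\&resumptionToken=hypothetical-token},
until a response arrives without a token or with an empty one.
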
829 \section{Searching Registries}
830 \label{sect:searching}
831
832 Experience with version~1 of this specification suggests that it is
833 preferable to not couple the relatively stable standards for harvesting and
834 general registry maintenance with client interfaces to the registry,
835 which were found to be in much more need of experimentation. For a
836 discussion of the history of client interfaces in the VO, see
837 \citet{paper:regclient}.
838
839 \subsection{RI Search}
840 \label{RISearch}
841 A SOAP-based search capability, \xmlel{vg:RISearch} defined in Registry
842 Interfaces 1.0, exists but is no longer encouraged or required for searchable
843 registries as technologies have moved forward. However, it is still a valid
844 capability defined in the registry resource schema so that registry operators
845 may continue to provide valid RI1 registries without having to support different
846 versions of the VORegistry schema. The base \xmlel{vg:RISearch} extension may
847 also be useful for the description of future registry search interfaces. RISearch
848 is described in appendix~\ref{app:RISearch}.
849
850
851 \subsection{Registry Table Access Protocol Services}
852 \label{RegTAP}
853
854 One second-generation standard search interface to the VO Registry that
855 has progressed to become an IVOA recommendation is RegTAP
856 \citep{std:RegTAP}, an interface based on a relational representation of
857 key fields in resource descriptions and on the IVOA Table Access Protocol
858 \citep{std:TAP}. RegTAP services have been made available from several
859 registry providers listed in the Registry of Registries.
860
861 RegTAP-based registries should be located by clients as described
862 in the RegTAP standard (which in version 1.0 happens through locating
863 TAP services with a certain data model identifier like
864 \nolinkurl{ivo://ivoa.net/std/RegTAP#1.0}). To aid smart clients of
865 the full RofR which generate lists for initial discovery, RegTAP registries
866 must also be registered as separate resources with the appropriate tableset.
867 These must include either a full TAP service capability according to
868 TAPRegExt \citep{std:TAPREGEXT-20120827} or an auxiliary capability
869 referencing a TAP service as per \citet{note:DataCollect}. An example
870 for the latter option, preferable if the TAP service in question
871 contains additional tables, is given in appendix \ref{sect:exampleCap}.
872
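For the first option, a full TAP capability, the data model declaration
that RegTAP 1.0 clients look for might appear as in the following
sketch. The namespace prefix \texttt{tre} and the access URL are
illustrative choices, and the remaining metadata required by TAPRegExt
(supported languages, output formats, and so on) is left out, so the
fragment is not a complete capability.

\begin{lstlisting}[language=XML]
<capability standardID="ivo://ivoa.net/std/TAP"
    xsi:type="tre:TableAccess"
    xmlns:tre="http://www.ivoa.net/xml/TAPRegExt/v1.0"
    xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <interface xsi:type="vs:ParamHTTP" role="std">
    <accessURL use="base">http://registry.example.org/tap</accessURL>
  </interface>
  <!-- The identifier RegTAP 1.0 discovery is based on -->
  <dataModel ivo-id="ivo://ivoa.net/std/RegTAP#1.0"
    >Registry 1.0</dataModel>
  <!-- Further TAPRegExt metadata omitted for brevity -->
</capability>
\end{lstlisting}
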
873 \subsection{Locally and Fully Searchable Registries}
874 \label{FullSearch}
875
876 While a publishing registry may provide search capabilities for its
877 own hosted records, this is considered a locally searchable registry,
878 and not a fully searchable one, as distinguished in the RofR listing.
879 For a registry to be considered fully searchable, it must harvest resources
880 from the other publishing registries listed in the RofR, and implement
881 an IVOA standard programmatic interface beyond the interface for OAI harvesting.
882 Thus a fully searchable registry which regularly harvests local-set resources
883 from the other registries known to the RofR is represented by a Registry
884 resource with an additional capability or auxiliary capability referencing an
885 IVOA standard and a ParamHTTP interface element to that capability.
886 The ability for a registry to harvest others is not an announced feature in
887 the \xmlel{vr:Resource} schema, and whether this harvesting is being used
888 operationally cannot be simply programmatically determined.
889 Therefore, to have a registry listed as fully searchable
890 in the Registry of Registries, one must use the contact information
891 in the RofR's own registry record after listing it as a publishing registry,
892 and have the registry moved to the fully searchable list by hand.
893
894 \section{Looking Forward}
895 \label{LookingForward}
896
897 While the OAI-PMH harvesting interface as adopted from outside the IVOA
898 community is stable and replacing it would require a major revision
899 of this document, we expect that new search interfaces for registries will
900 be continually developed, leveraging new technologies and best practices
901 as they emerge. These search interfaces can be added without sacrificing
902 interoperability with the IVOA registry ecosystem. Whether or not these emerging
903 search technologies become formally endorsed by the IVOA as notes or new
904 standards documents, so long as a registry supports the basic harvest interface
905 and hosts valid \xmlel{ri:Resource} documents including registry and authority
906 records, it should be considered covered by the practices described herein and
907 a welcome addition to the Registry of Registries listing, with all of its
908 records also accessible through the fully searchable registries.
909
910 \appendix
911
912 \section{The RegistryInterface Schema}
913 \label{app:rischema}
914
915 The following schema defines a global element, allowing the inclusion of
916 VOResource records into \xmlel{oai:metadata} elements in OAI-PMH
917 responses for the \texttt{ivo\_vor} metadata prefix. See
918 sect.~\ref{sect:metadataformats} for details.
919
920 The schema is unchanged from version 1.0 of this specification and
921 therefore does not change its version.
922
923 \lstinputlisting[language=XML]{RegistryInterface-1.0.xsd}
924
925 \section{The VORegistry Schema}
926 \label{app:vgschema}
927
928 The following schema defines VOResource types for describing registries
929 in the Registry. It is unchanged from version 1.0 of this specification
930 and therefore does not change its version.
931
932 Note that standards defining search interfaces may specify alternative
933 or complementary methods of registering the services defined by them,
934 and that auxiliary capabilities for these search capabilities may be
935 listed within the registry record.
936
937 \lstinputlisting[language=XML]{VORegistry-1.0.xsd}
938
939 \section{Example Capabilities}
940 \label{sect:exampleCap}
941
942 The following XML fragment shows the three capability elements discussed
943 in this document: the OAI-PMH-based publishing registry, the legacy
944 RI 1.0 searchable registry, and an auxiliary TAP capability as used
945 for RegTAP.
946
947 \begin{lstlisting}[language=XML]
948 <ri:Resource
949 xmlns:vg="http://www.ivoa.net/xml/VORegistry/v1.0"
950 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
951 xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0">
952
953 <!-- Standard VOResource metadata omitted for brevity -->
954
955 <!-- The capability for an OAI-PMH endpoint (publishing registry) -->
956 <capability xsi:type="vg:Harvest" standardID="ivo://ivoa.net/std/Registry">
957 <interface xsi:type="vg:OAIHTTP" version="1.0" role="std">
958 <accessURL use="base">http://registry.example.org/oai</accessURL>
959 </interface>
960 <maxRecords>100</maxRecords>
961 </capability>
962
963 <!-- A legacy, RI1.0 searchable registry endpoint, with an
964 extra interface for web browsers. -->
965 <capability xsi:type="vg:Search" standardID="ivo://ivoa.net/std/Registry">
966 <interface xsi:type="vr:WebBrowser" version="1.0" role="gui">
967 <accessURL use="full">http://registry.euro-vo.org</accessURL>
968 </interface>
969 <interface xsi:type="vr:WebService" version="1.0" role="std">
970 <accessURL use="full"
971 >http://registry.example.org/services/RegistrySearch</accessURL>
972 </interface>
973 <maxRecords>100</maxRecords>
974 <extensionSearchSupport>core</extensionSearchSupport>
975 </capability>
976
977 <!-- A reference to RegTAP-enabled TAP service as an auxiliary
978 capability -->
979 <capability standardID="ivo://ivoa.net/std/TAP#aux">
980 <interface xsi:type="vs:ParamHTTP" role="std">
981 <accessURL use="base">http://registry.example.org/tap</accessURL>
982 </interface>
983 </capability>
984
985 <!-- A RegTAP-capable searchable registry should have a tableset
986 with all its tables in the rr schema here -->
987 </ri:Resource>
988 \end{lstlisting}
989
990 \section{The RISearch Schema}
991 \label{app:RISearch}
992
993 The following schema defines the SOAP-based RISearch interface, which
994 is discouraged as of version 1.1 but still available.
995 It is unchanged from version 1.0 of this specification
996 and therefore does not change its version.
997
998 % GENERATED: !schemadoc VORegistry-1.0.xsd Search
999 %\begin{generated}
1000 \begingroup
1001 \renewcommand*\descriptionlabel[1]{%
1002 \hbox to 5.5em{\emph{#1}\hfil}}\vspace{2ex}\noindent\textbf{\xmlel{vg:Search} Type Schema Documentation}
1003
1004 \noindent{\small
1005 The capabilities of the Registry Search implementation.
1006 \par}
1007
1008 \vspace{1ex}\noindent\textbf{\xmlel{vg:Search} Type Schema Definition}
1009
1010 \begin{lstlisting}[language=XML,basicstyle=\footnotesize]
1011 <xs:complexType name="Search" >
1012 <xs:complexContent >
1013 <xs:extension base="vr:Capability" >
1014 <xs:sequence >
1015 <xs:element name="maxRecords" type="xs:int" />
1016 <xs:element name="extensionSearchSupport"
1017 type="vg:ExtensionSearchSupport" />
1018 <xs:element name="optionalProtocol" type="vg:OptionalProtocol" minOccurs="0"
1019 maxOccurs="unbounded" />
1020 </xs:sequence>
1021 </xs:extension>
1022 </xs:complexContent>
1023 </xs:complexType>
1024 \end{lstlisting}
1025
1026 \vspace{0.5ex}\noindent\textbf{\xmlel{vg:Search} Extension Metadata Elements}
1027
1028 \begingroup\small\begin{bigdescription}\item[Element \xmlel{maxRecords}]
1029 \begin{description}
1030 \item[Type] \xmlel{xs:int}
1031 \item[Meaning]
1032 The largest number of records that the registry search
1033 method will return. A value of zero or less indicates
1034 that there is no explicit limit.
1035
1036 \item[Occurrence] required
1037
1038 \end{description}
1039 \item[Element \xmlel{extensionSearchSupport}]
1040 \begin{description}
1041 \item[Type] string
1042 \item[Meaning]
1043 (deprecated)
1044
1045 \item[Occurrence] required
1046
1047 \item[Allowed Values]\hfil
1048 \begin{longtermsdescription}
1049 \item[core]
1050 Only searches against the core VOResource metadata are
1051 supported.
1052
1053 \item[partial]
1054 Searches against some VOResource extension metadata are
1055 supported but not necessarily all that exist in the registry.
1056
1057 \item[full]
1058 Searches against all VOResource extension metadata contained
1059 in the registry are supported.
1060
1061 \end{longtermsdescription}
1062 \item[Comment]
1063 This was used in Registry Interfaces 1.0 to indicate
1064 what VOResource extensions a search interface supported.
1065 Modern search interfaces will indicate that through
1066 version, their tableset, or similar.
1067
1068
1069 \end{description}
1070 \item[Element \xmlel{optionalProtocol}]
1071 \begin{description}
1072 \item[Type] string
1073 \item[Meaning]
1074 (deprecated)
1075
1076 \item[Occurrence] optional; multiple occurrences allowed.
1077
1078 \item[Allowed Values]\hfil
1079 \begin{longtermsdescription}
1080 \item[XQuery]
1081 the XQuery (http://www.w3.org/TR/xquery/) protocol as defined
1082 in the VO Registry Interface standard.
1083
1084 \end{longtermsdescription}
1085 \item[Comment]
1086 This was used in Registry Interfaces 1.0 to indicate
1087 search protocol extensions. In 1.1, use multiple
1088 capabilities with the appropriate standardIDs
1089 to declare special search capabilities.
1090
1091
1092 \end{description}
1093
1094
1095 \end{bigdescription}\endgroup
1096
1097 \endgroup
1098 %\end{generated}
1099
1100 % /GENERATED
1101
1102
1103 \section{Changes from Previous Versions}
1104
1105 \label{sect:changes}
1106
1107 For pre-REC-1.0 changes, see \citet{std:RI1}.
1108
1109 \subsection{Changes from Version 1.0}
1110
1111 \label{changes-1.0}
1112
1113
1114 \begin{itemize}
1115
1116 \item Corrected reference to OAI-PMH spec in registry interface
1117 description to v2.0.
1118
1119 \item Added requirement for OAI-PMH interface to support seconds
1120 granularity, optional in the OAI-PMH 2.0 standard itself. {}
1121
1122 \item Removed requirement for VOResource version number changes to force
1123 an update of this document. {}
1124
1125 \item Removed the implementation-dependent requirement for searchable
1126 registries in section 2, specifically the SOAP-based services
1127 based on ``ADQL 1.0'' and XQuery.{}
1128
1129 \item Dropped the requirement on registries to not deliver any records
1130 that are OAI-PMH deleted when no temporal constraint is given.{}
1131
1132 \item Added a requirement to provide VOSI endpoints.
1133
1134 \item Added support for auxiliary Registry TAP Service search interfaces.
1135
1136 \item Clarified that the requirement to keep deleted records for six
1137 months only applies to the transient case; also discouraging registries
1138 with no support of deleted records.
1139
1140 \item Added recommended process for discovery of registries and their
1141 resources using the Registry of Registries, based on the Registry of
1142 Registries IVOA note.
1143
1144 \item Added conclusion describing implications of future search and
1145 publishing interface changes in the Registry environment.
1146
1147 \item Many editorial changes across the text, mostly as a consequence of
1148 externalizing search interfaces.
1149
1150 \end{itemize}
1151
1152
1153 \bibliography{ivoatex/ivoabib,ivoatex/docrepo}
1154
1155 \end{document}
