Contents of /trunk/projects/registry/Identifiers/Identifiers.tex


Revision 3158 - (show annotations)
Fri Nov 20 08:04:09 2015 UTC (5 years, 11 months ago) by msdemlei
File MIME type: application/x-tex
File size: 45667 byte(s)
Identifiers: Fixing references to previous version in both preamble and
changelog.

(Thanks, Brian)

1 \documentclass{ivoa}
2 \input tthdefs
3
4 \SVN$Rev$
5 \SVN$Date$
6 \SVN$URL$
7
8 \newcommand{\abnfterm}[1]{%
9 \ensuremath{\,\hbox{\texttt{\char'042\relax #1\char'042}}\,}}
10 \newcommand{\abnfrepeat}[1]{\,*#1}
11 \newcommand{\abnfoptional}[1]{[#1]}
12 \newcommand{\abnfor}{\ensuremath{\,\,/\,\,}}
13 \newcommand{\abnfnt}[1]{\ensuremath{\langle\textit{#1\/}\rangle\,}}
14 \newcommand{\abnfto}{=}
15 \newcommand{\mytilde}{\char'176}
16
17 \iftth
18 \renewcommand{\abnfterm}[1]{%
19 \special{html:<tt>"}#1\special{html:"</tt>}}
20 \renewcommand{\abnfnt}[1]{\special{html:<i>&lt;}#1\special{html:&gt;</i>}}
21 \renewcommand{\mytilde}{\special{html:~}}
22 \fi
23
24 \hyphenation{Stan-dards-Reg-Ext}
25 \hyphenation{Obs-Core}
26
27 \title{IVOA Identifiers}
28
29 \ivoagroup{Resource Registry}
30
31 \author[http://www.ivoa.net/twiki/bin/view/IVOA/MarkusDemleitner]{Markus Demleitner}
32 \author[http://www.ivoa.net/twiki/bin/view/IVOA/RayPlante]{Raymond Plante}
33 \author[http://www.ivoa.net/twiki/bin/view/IVOA/TonyLinde]{Tony Linde}
34 \author[http://www.ivoa.net/twiki/bin/view/IVOA/RoyWilliams]{Roy Williams}
35 \author[http://www.ivoa.net/twiki/bin/view/IVOA/KeithNoddle]{Keith Noddle}
36 \author{and the IVOA Registry Working Group}
37
38 \editor{Markus Demleitner}
39
40 \previousversion[http://www.ivoa.net/Documents/REC/Identifiers/Identifiers-20070302.html]{REC-1.12}
41 \previousversion[http://www.ivoa.net/Documents/PR/Identifiers/Identifiers-20050302.html]{PR-20050302}
42 \previousversion[http://www.ivoa.net/Documents/PR/Identifiers/Identifiers-20040621.html]{PR-20040621}
43 \previousversion[http://www.ivoa.net/Documents/WD/Identifiers/Identifiers-20040209.html]{WD-20040209}
44 \previousversion[http://www.ivoa.net/Documents/PR/Identifiers/Identifiers-20031031.html]{WD-20031031}
45 \previousversion[http://www.ivoa.net/Documents/WD/Identifiers/Identifiers-20030930.html]{WD-20030930}
46 \previousversion[http://www.ivoa.net/Documents/WD/Identifiers/Identifiers-20030830.html]{PR-20030830}
47
48 \begin{document}
49 \begin{abstract}
50 An IVOA Identifier is a globally unique name for a resource
51 within the Virtual Observatory. This
52 name can be used to retrieve a unique description of the resource
53 from an IVOA-compliant registry or to identify an entity like a dataset
54 or a protocol without dereferencing the identifier.
55 This document describes the syntax
56 for IVOA Identifiers as well as how they are created.
57 The syntax has been defined to encourage global uniqueness naturally
58 and to maximize the freedom of resource providers to control the
59 character content of an identifier.
60 \end{abstract}
61
62
63 \section*{Acknowledgments}
64
65 This document builds on the concept of a Uniform Resource Identifier
66 as described in RFC 3986 \citep{std:RFC3986} and its predecessors.
67
68 This document has been developed with support from the
69 \href{http://www.nsf.gov}{National Science Foundation}'s
70 Information Technology Research Program under Cooperative Agreement
71 AST0122449 with The Johns Hopkins University, from the
72 \href{http://www.pparc.ac.uk}{UK Particle Physics and Astronomy
73 Research Council (PPARC)}, from the
74 \href{http://fp6.cordis.lu/fp6/home.cfm}{European Commission's Sixth
75 Framework Program} via the \href{http://www.astro-opticon.org/} {Optical
76 Infrared Coordination Network (OPTICON)}, and from the German
77 Astrophysical Virtual Observatory GAVO, BMBF grant 05A14VHA.
78
79
80 \section*{Conformance-related definitions}
81
82 The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
83 ``OPTIONAL'' (in upper or lower case) used in this document are to be
84 interpreted as described in RFC 2119 \citep{std:RFC2119}.
85
86 \section*{Usage of ABNF}
87
88 This specification uses ABNF \citep{std:RFC2234} to specify grammar
89 rules. The rules from RFC 3986 are assumed throughout. Where both this
90 specification and RFC 3986 define a nonterminal, the rule in this
91 specification overrides the corresponding rule from RFC 3986.
92
93 For explicitness, we write ABNF nonterminals in angle brackets
94 (\abnfnt{like this}) throughout.
95
96 \section{Introduction}
97
98 Virtual Observatory applications frequently need to
99 unambiguously refer to some resource or concept
100 which is described elsewhere. It is therefore necessary to
101 define global, potentially
102 dereferenceable identifiers. In the VO, these are called
103 IVOA identifiers (IVOIDs).
104 An unambiguous reference within the entire Virtual Observatory
105 requires that the identifier is globally unique. Ensuring
106 this uniqueness inevitably requires oversight by a moderating
107 authority; however, a flexible framework can minimize the opportunity
108 for duplicated identifiers.
109
110 Many data providers in the VO were creating
111 and using identifiers long before this specification was developed.
112 Their choices of identifiers were presumably made
113 to best fit the needs of the data.
114 In order to minimize the cost of adopting the IVOA identifier framework,
115 the design specified here maximizes the
116 control providers have, thus allowing the reuse of the identifiers
117 data providers already have in place
118 as well as the creation of
119 new identifiers that are consistent with their overall
120 organization.
121
122 Identifiers are crucial to the operation of registries that aid users in
123 discovering data and services \citep{std:RI1}. In general, a registry stores
124 descriptions of data and services in a searchable form, and it
125 distinguishes them by the unique identifier defined here. The identifier thus
126 serves as a primary key for the VO Registry and allows
127 dereferencing identifiers to metadata about a resource (the
128 resource record).
129
130 IVOA identifiers with query or fragment parts can furthermore
131 reference essentially arbitrary
132 entities like datasets or protocols, based on this primary mechanism of
133 dereferencing.
134
135 We recognize that resources do
136 not always remain in the control of a single organization forever.
137 This
138 necessitates a form of referencing that is
139 location-independent -- or more precisely, organization-independent.
140 Apart from enabling seamless transfers of data curation, an
141 attractive use case for such identifiers is
142 when several copies of a dataset exist at several locations around
143 the VO and one could refer to all of them collectively, deferring the choice
144 of a particular instance until it is actually needed.
145 Such references thus serve as
146 \emph{persistent} pointers to data that can be flexibly resolved.
147 This is very important to journal publishers
148 that wish to refer to data in publications (whose useful life might be
149 measured in decades) without worry that the references will become
150 obsolete.
151
152 This specification, in contrast, defines
153 \emph{organization-dependent identifiers}.
154 Persistent, organization- and location-independent identifiers are
155 \emph{not} (directly) defined here.
156
157 Referencing resources is
158 addressed by the IETF standard for URIs, RFC 3986 \citep{std:RFC3986}.
159 Thus, the framework proposed
160 in this document builds directly on this standard. Essentially, this
161 standard sets the parameters left open for application use
162 by RFC 3986.
163
164 \subsection{Definitions}
165
166 A \emph{Uniform Resource Identifier}
167 (URI) is defined by RFC 3986 as ``a compact sequence of
168 characters that identifies an abstract or physical resource'' which
169 complies with the syntax specification of that document
170 \citep{std:RFC3986}. It can point to an
171 actual retrievable resource, but there is no requirement for it to be
172 dereferenceable at all, let alone by a stock web browser.
173
174 An \emph{IVOA identifier}, or IVOID, is a
175 special sort of URI complying with all parts of this specification.
176 Historically, these have also been known as \emph{IVOA Resource Names}
177 (IVORNs), in parallel to
178 the \emph{Uniform
179 Resource Names} (URNs) that formulated extra requirements on persistency
180 and location-independence. As plain IVOIDs do not fulfill those
181 requirements and the term URN has been deprecated by
182 RFC 3986, we now deprecate the term IVORN, too.
183
184 A full IVOID can thus be split into a \emph{Registry part} (scheme,
185 authority, and path) and a possibly empty
186 \emph{local part} consisting of query and fragment component, again
187 using RFC 3986 nomenclature. An IVOID with an empty local part is also
188 known as a \emph{Registry reference}.
189
190 In VO practice, the term \emph{resource} is somewhat ambiguous.
191 The IVOA Recommendation on
192 Resource Metadata \citep{std:RM}, from here on referred to as RM,
193 defines it as ``a VO element that can be described in terms of who
194 curates or maintains it and which can be given a name and a unique
195 identifier.'' It then goes on to define the relevant pieces of
196 metadata, which later provided the foundations of the data model behind
197 the IVOA Registry.
198
199 This might lead to the expectation that there is a 1:1 relationship
200 between Registry records, ``VO resources'', and IVOA identifiers, and
201 version 1 of this document essentially implied as much. In this
202 version, we only require Registry references to resolve in the
203 Registry.
204
205 IVOIDs having a nonempty local part do not dereference to
206 Registry records. Since we want to
207 maintain the notion that a resource is whatever a URI points to,
208 ``resource'' as used here does \emph{not} correspond to the usage of the term in
209 VOResource \citep{std:VOR}. To maintain the distinction, we call
210 resources in the sense of VOResource \emph{Registry records}.
211 These form a subset of the resources (in the URI sense) that
212 can be referenced by IVOIDs.
213
214 We refer to organizations and providers in the sense that they
215 are defined in RM:
216
217 \begin{quotation}
218 An \emph{organization} is a specific
219 type of resource that brings people together to pursue
220 participation in VO applications. Organizations can be hierarchical
221 and range greatly in size and scope. At a high level, it could be a
222 university, observatory, or government agency. At a finer level, it
223 could be a specific scientific project, space mission, or individual
224 researcher. A \emph{provider} is an
225 \emph{organization} that makes data and/or services
226 available to users over the network.
227 \end{quotation}
228
229 Definitions of other types of resources, including data collection
230 and service, are also provided in RM, and
231 are assumed by this document.
232
233
234 \subsection{Selected Requirements}
235
236 This proposal is the result of various requirement studies for VO
237 identifiers and registries in general (e.g.
238 NVO ID
239 requirements\footnote{
240 \url{http://web.archive.org/web/20070226120639/http://nvo.ncsa.uiuc.edu/~rplante/VO/metadata/oidreq2.txt}}).
241 This section highlights a few of the important
242 ones that guided the design of the ID framework.
243
244 \begin{enumerate}
245 \item A single framework should be used to identify anything a VO
246 application can refer to, including organizations, projects
247 (mission/telescope), data collections, and services.
248
249 \item It should be easy to compare two instances of an identifier to
250 determine if they refer to the same object.
251
252 \item It should be possible to use an identifier to access a unique
253 description of the resource it identifies.
254
255 \item The framework should maximize the freedom of data providers to
256 choose identifiers for resources and collections under their
257 control.
258 \end{enumerate}
259
260 \subsection{Rationale for Version 2}
261
262 The need for revising the IVOA Identifiers specification has been apparent
263 ever since \citet{note:uriforms} pointed out that common practices
264 regarding dataset identifiers were not in line with URI semantics.
265 Also, with the publication of StandardsRegExt
266 \citep{std:STDREGEXT}, it became advisable to regulate the ways
267 standards are referenced in the VO in ways compatible with the spirit of
268 that standard.
269
270 As the Registry Working Group set about revising the Identifiers
271 recommendation, it was decided to drop the XML representation for
272 IVOIDs since it complicated the text but had never actually been found
273 useful. Even in XML serializations only the URI form of IVOA
274 identifiers had been used. Dropping the XML form nevertheless
275 constitutes an incompatible change, which necessitates an increase in the
276 major version number.
277
278 Despite the new major version, consensus was that current usage of
279 IVOIDs should not be impacted and existing practices sanctioned as far
280 as possible. Apart from deprecating the use of fragment identifiers to
281 distinguish datasets and restricting authorities to only use
282 \abnfnt{unreserved} characters (which does not impact existing authority
283 identifiers), this specification therefore refrains from modifying
284 version 1 regulations even where they were found somewhat burdensome
285 (e.g., as regards case insensitivity in resource keys).
286
287 The opportunity of a revision was also used to organize the
288 specification content in parallel to RFC 3986; for instance, the notion
289 of stop characters from version 1 -- necessitated by the non-URI XML
290 representation -- has no counterpart in non-IVOID URIs and is now
291 encompassed naturally by the usual rules for parsing URIs.
292
293 Closely following RFC 3986 also allows rigorous definitions for the
294 interpretation of local parts. In addition to what version 1
295 specified, we now allow
296 percent-encoded characters there, and we comment on techniques to
297 resolve IVOIDs with such local parts. This is finally used to define
298 the standard and the dataset identifiers that started the revision
299 process.
300
301 \subsection{IVOA Identifiers within the VO Architecture}
302
303 \begin{figure}[ht]
304 \centering
305 \includegraphics[width=0.9\textwidth]{archdiag.png}
306 \caption{Architecture diagram for the IVOA Resource Identifier
307 specification}
308 \label{fig:archdiag}
309 \end{figure}
310
311 Fig.~\ref{fig:archdiag} shows the role this document plays within the
312 IVOA architecture \citep{note:VOARCH}. As identifiers are the primary
313 keys into the Registry, essentially all standards regulating the
314 Registry depend on this specification. The data access protocols are
315 mainly impacted through the use of dataset identifiers (e.g., in SSAP
316 \citep{std:SSAP} and ObsCore \citep{std:OBSCORE}), which are also
317 IVOIDs. For the same reason, VOEvent is impacted.
318
319 The core of this standard has no dependencies on other VO standards.
320 The section on identifiers for standards depends on
321 StandardsRegExt \citep{std:STDREGEXT}.
322
323 \section{Specification}
324
325 After a brief, informal specification that should be enough for
326 non-demanding applications, this section gives, for each relevant part
327 of RFC 3986, additional requirements for IVOA identifiers. The
328 normative content should be read together with RFC 3986.
329
330 \subsection{Overview (non-normative)}
331
332 IVOA identifiers (IVOIDs for short) are RFC 3986-compliant URIs with a
333 scheme of \texttt{ivo}. Thus, their generic form is
334
335 $$\underbrace{\texttt{ivo:}
336 \texttt{//}
337 \abnfnt{authority}
338 \abnfnt{path}}_{\mbox{Registry part}}
339 \underbrace{\texttt{?}\abnfnt{query}
340 \texttt{\#}\abnfnt{fragment}}_{\mbox{local part}},
341 $$
342 where \abnfnt{path} is either empty or starts with a slash, and both
343 items in the local part are optional.
344
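As a rough illustration of the generic form above, the following Python sketch takes an IVOID apart with the standard library's \texttt{urllib.parse.urlsplit}. The helper name \texttt{ivoid\_components} is made up for this example, and the sketch performs no validation beyond checking the scheme.

```python
from urllib.parse import urlsplit


def ivoid_components(ivoid):
    """Return the RFC 3986 components of an IVOID as a dict.

    Hypothetical helper for illustration; it does not validate the
    additional restrictions this specification places on IVOIDs.
    """
    parts = urlsplit(ivoid)
    # urlsplit lowercases the scheme, so "IVO:" and "ivo:" both pass.
    if parts.scheme != "ivo":
        raise ValueError("not an IVOID: " + ivoid)
    return {
        "authority": parts.netloc,     # Registry part
        "resource_key": parts.path,    # Registry part (RFC 3986 path)
        "query": parts.query,          # local part
        "fragment": parts.fragment,    # local part
    }


print(ivoid_components("ivo://example.org/svc?voc.xml#Term"))
```

Note that \texttt{urlsplit} cannot distinguish an absent query or fragment from an empty one; applications that care about that difference would need their own splitting logic.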
345
346 IVOIDs consisting of only scheme and authority are known as authority
347 identifiers and play a special role in creating other IVOIDs (see
348 sec.~\ref{sect:creating}). IVOIDs without a local part
349 must resolve to a Registry record within the IVOA Registry.
350 Likewise, for all IVOIDs, the IVOID resulting from stripping the local
351 part (the Registry part) must resolve within the IVOA Registry. It is
352 called a \emph{Registry reference}.
353
354 The RFC 3986 \abnfnt{path} element is called \emph{resource
355 key} in IVOIDs.
356
357 Authority ids must consist of letters, numbers, dashes and dots
358 exclusively. Resource keys must not contain URI reserved characters
359 (essentially, only alphanumeric characters, dashes, dots,
360 underscores, and tildes are allowed) except where an IVOA standard
361 defines how they are to be treated.
362
363 The Registry references are,
364 as a whole, compared case-insensitively, and must be treated
365 case-insensitively throughout to maintain backwards compatibility with
366 version 1 of this specification. When comparing full IVOIDs, the local
367 part must be split off and compared preserving case, while the registry
368 part must be compared case-insensitively.
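
The comparison rule just described can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that both inputs are already syntactically valid IVOIDs; it only illustrates the split into Registry part and local part and the different case handling of the two.

```python
def split_ivoid(ivoid):
    """Split an IVOID into (registry_part, local_part).

    The local part starts at the first "?" or "#", whichever comes
    first, and includes that delimiter; it is "" if neither occurs.
    """
    for pos, char in enumerate(ivoid):
        if char in "?#":
            return ivoid[:pos], ivoid[pos:]
    return ivoid, ""


def ivoids_equal(first, second):
    """Compare two IVOIDs: Registry part ignoring case, local part exactly."""
    reg_a, local_a = split_ivoid(first)
    reg_b, local_b = split_ivoid(second)
    # Registry parts only contain ASCII, so str.lower() is sufficient here.
    return reg_a.lower() == reg_b.lower() and local_a == local_b
```

With these definitions, \texttt{ivo://ivoa.net/std/Identifiers} and \texttt{ivo://IVOA.NET/std/identifiers} compare equal, whereas two IVOIDs differing only in the case of a fragment do not.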
369
370 To make IVOIDs useful where these complex rules are hard to implement
371 (e.g., database columns), handling applications SHOULD NOT change the
372 case of any part of IVOIDs when these might have a local part.
373
374 Examples for IVOIDs:
375
376 \begin{itemize}
377 \item \nolinkurl{ivo://ivoa.net} -- an IVOID without a resource key,
378 i.e., an authority; dereferencing in the Registry must yield a
379 \xmlel{vr:Authority}-typed record.
380
381 \item \nolinkurl{ivo://ivoa.net/std/Identifiers} -- an IVOID with a
382 resource key. Dereferencing this in the Registry must yield a resource
383 record. As long as there is no local part, an IVOID only differing
384 in case, e.g.,
385 \nolinkurl{ivo://IVOA.NET/std/identifiers}, is in every respect equivalent to
386 it.
387
388 \item \nolinkurl{ivo://example.org/~?path/to/\%C3\%89CLAIRE} -- an IVOID
389 without guarantees as to if it resolves and what it resolves to. The
390 Registry reference \nolinkurl{ivo://example.org/~} must resolve to a valid Registry
391 record, though.
392
393 \item \nolinkurl{ivo://example.org/svc?voc.xml#Term} -- an IVOID
394 conceptually referencing some item within
395 \nolinkurl{ivo://example.org/svc?voc.xml}. If that latter IVOID can be
396 dereferenced, there should be an entity within the resource retrieved
397 that is itself identified by \texttt{Term}. The classic example would
398 be an element with an \xmlel{id} of \texttt{Term} within an XML
399 document.
400 \end{itemize}
401
402 The remainder of this section contains a formalization of these points.
403
404 \subsection{Characters}
405
406 \label{sect:chars}
407
408 This specification poses no additional global constraints on the
409 character content of IVOIDs over what Section~2 of RFC 3986 specifies.
410 Special restrictions on the authority part and the resource key are
411 given below. In particular, the \abnfnt{gen-delims} have, where
412 applicable, the standard URI interpretation. As IVOIDs have no use for
413 IPv6 addresses or user components, square brackets and the commercial at
414 sign MUST NOT occur literally in IVOIDs anywhere.
415
416 The \abnfnt{sub-delims} MUST NOT be part of the resource key unless
417 another IVOA specification defines their use. Their use in local parts
418 is not restricted by this specification, nor is any semantics defined
419 for them. Other IVOA specifications may furnish them with semantics.
420
421 In IVOIDs, characters from \abnfnt{unreserved} MUST NOT be
422 percent-encoded.
423
424 Percent-encoded characters are allowed in local parts (but neither in
425 authority nor the resource key). When
426 specifications or applications require text to be percent-encoded within
427 an IVOID, the text MUST be encoded in UTF-8.
428
429
430 \subsection{Syntax Components}
431
432 \subsubsection{Scheme}
433
434 The \abnfnt{scheme} part of IVOIDs is \texttt{ivo}. Note that, by RFC
435 3986, scheme identifiers are case-insensitive.
436
437 A URI that uses this scheme (an IVOID) signals that:
438
439 \begin{itemize}
440
441 \item the registry part of the IVOID
442 and the resource it refers to have been
443 registered in the VO Registry
444 \item the URI complies with the additional restrictions laid down in
445 this document
446 \end{itemize}
447
448 The ivo scheme does not imply a transport protocol by which the resource
449 may be accessed. Agents, in general, should not depend on implicit
450 mappings between IVOIDs and URIs in other schemes like \texttt{http}
451 when dereferencing them. The only defined way to dereference IVOIDs is
452 described in sect.~\ref{sect:dereferencing}. Resource publishers,
453 however, may support additional mappings between identifiers and other
454 URIs (such as http URLs) that they manage; in this case, agents should
455 only assume the mapping applies within the domain of the publisher.
456
457
458
459 \subsubsection{Authority}
460
461 \begin{admonition}{Note}
462 While the syntax for the authority identifiers
463 allows it to look just like a DNS hostname, current convention
464 discourages this practice to avoid the suggestion that an IVOA
465 Identifier can be resolved like a common http URL.
466 As of this writing, the
467 convention of the US Virtual Astronomical Observatory (VAO)
468 is hierarchical naming that
469 combines the publishing organization name with the project or
470 archive (e.g. ``adil.ncsa'') while leaving out fields like
471 ``.edu''
472 or ``.org''. In the AstroGrid
473 project, the convention is to use a DNS name in reverse order
474 (e.g. ``org.astrogrid.www''); this practice has the advantage of
475 reducing the probability that two organizations will want to
476 use the same authority identifier.
477 \end{admonition}
478
479
480 A \emph{naming authority} is an
481 organization (usually a data
482 provider) that has been granted the right by
483 the IVOA to create IVOA-compliant identifiers for resources it
484 registers. See sect.~\ref{sect:creating} for
485 details on how this right is granted. The naming authority creates
486 IVOIDs with empty local parts within the scope of one or more
487 authority identifiers.
488
489 The \emph{authority} component of an IVOID is severely restricted relative to
490 RFC 3986 as follows:
491
492 \begin{itemize}
493 \item it MUST be at least three characters long
494 \item it MUST begin with an alpha-numeric character
495 \item it MUST NOT contain percent-encoded characters
496 \item it MUST NOT contain characters outside of \abnfnt{unreserved},
497 with the tilde strongly discouraged
498 \item there are no \abnfnt{userinfo} or \abnfnt{port} components
499 \end{itemize}
500
501
502 In ABNF, using the symbols from RFC 3986, an authority identifier
503 in IVOIDs thus has the form:
504
505 \begin{eqnarray*}
506 \abnfnt{authority} &\abnfto& \abnfnt{alphanum} \abnfnt{unreserved}
507 \abnfnt{unreserved} \abnfrepeat{\abnfnt{unreserved}}
508 \end{eqnarray*}
509
510 A naming authority is allowed to control multiple
511 authority identifiers to organize related resources into different
512 namespaces. For example, an organization may
513 choose to control two authority identifiers, one for research-related
514 resources and one for education/outreach resources, even though they
515 are all maintained by the same organization and perhaps made available
516 through the same machine.
517
518
519 \paragraph{Examples for valid authorities}
520
521 \begin{compactenum}[(1)]
522 \item \texttt{nasa.heasarc}
523 \item \texttt{n\_1a.alph-0.02}
524 \item \texttt{123} (authorities can start with a number)
525 \end{compactenum}
526
527 \paragraph{Examples for invalid authorities}
528
529 \begin{compactenum}[(1)]
530 \item \texttt{a2} (fewer than three characters)
531 \item \texttt{\_temporary.id} (authorities must begin with an alphanumeric
532 character, which the underscore is not)
533 \item \texttt{DAT\%41} (percent-encoded characters are not allowed, even if they
534 work out to be unreserved characters)
535 \item \texttt{de!uni-hd!physics\#ari} (not entirely consisting of unreserved
536 characters)
537 \end{compactenum}
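
The authority rules and the examples above can be checked with a single regular expression. The following Python sketch is one possible transcription of the ABNF rule; the names \texttt{AUTHORITY\_RE} and \texttt{is\_valid\_authority} are invented for this example.

```python
import re

# One alphanumeric character followed by at least two characters from
# RFC 3986's <unreserved> set (ALPHA / DIGIT / "-" / "." / "_" / "~").
AUTHORITY_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._~-]{2,}")


def is_valid_authority(authority):
    """Return True if the string satisfies the authority syntax rules."""
    return AUTHORITY_RE.fullmatch(authority) is not None


for candidate in ["nasa.heasarc", "123", "a2", "_temporary.id", "DAT%41"]:
    print(candidate, is_valid_authority(candidate))
```

The sketch accepts the tilde because the grammar allows it, although its use in authority identifiers is strongly discouraged.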
538
539
540 \subsubsection{Resource Key}
541
542 \label{sect:reskey}
543
544 RFC 3986's \abnfnt{path} part of an IVOID is called a \emph{resource key}.
545 It is a
546 name for a resource that is unique within the namespace of an
547 authority identifier. The naming authority creates keys for its namespaces
548 and has complete control of their forms beyond the syntax constraints
549 specified here.
550
551 On top of the definitions in RFC 3986 for paths, section 3.3, resource keys in
552 IVOIDs are further constrained in that
553
554 \begin{itemize}
555 \item \abnfnt{segment} MUST NOT contain percent-encoded characters
556 \item \abnfnt{segment} MUST NOT contain colons or commercial at signs
557 \item Only \abnfnt{path-abempty} expansions are allowed
558 \end{itemize}
559
560 In ABNF, using or overriding the symbols of RFC 3986, this means:
561
562 \begin{eqnarray*}
563 \abnfnt{path} &\abnfto &\abnfnt{path-abempty}\\
564 \abnfnt{segment} &\abnfto & \abnfrepeat{\abnfnt{ivo-segment-char}}\\
565 \abnfnt{ivo-segment-char}& \abnfto& \abnfnt{unreserved} \abnfor
566 \abnfnt{sub-delims}
567 \end{eqnarray*}
568
569 Naming authorities MUST NOT create path
570 segments matching either ``.'' or ``..''; empty
571 segments, resulting in two or more consecutive slashes or a trailing
572 slash, are also forbidden. In particular, as
573 described in sect.~\ref{sect:comparing},
574 such segments would not have the
575 special meaning they have in traditional file system pathnames; that
576 is, a resource key cannot be transformed by removing any kinds of
577 segments and still reference the same resource.
578
579 Note that, as discussed in sect.~\ref{sect:chars}, characters from
580 \abnfnt{sub-delims} MUST NOT be used in resource keys unless their
581 semantics is defined in an IVOA specification. As percent-encoded
582 characters are not allowed in resource keys, these characters MUST NOT
583 occur in generic Registry references at all.
584
585 The naming authority is free to create a
586 resource key that suggests something about the resource it refers to.
587 Any meaning that is suggested by the resource key is intended only for
588 human consumption. The character content of a resource key is not
589 semantically machine-interpretable within the context of the IVOA as
590 defined by this document.
591
592 The presence of a resource key is optional. An identifier that
593 contains only an authority identifier refers to the authority
594 itself and MUST resolve to a \xmlel{vr:Authority}-typed resource record
595 \citep{std:VOR} in the IVOA Registry.
596
597 VO applications MUST be case-insensitive when processing
598 resource keys. In presentation,
599 the preferred use of case is set by the rendering of the key by the
600 naming authority when the IVOID is registered. This may contain
601 capital letters to improve readability.
602
603 \paragraph{Examples for valid resource keys}
604
605 \begin{compactenum}[(1)]
606 \item \texttt{""} (i.e., the empty string; zero repetitions of (\abnfterm/ \abnfnt{segment}) are
607 legal)
608 \item \texttt{/reskey}
609 \item \texttt{/\mytilde{}user/STScI\_1/1a-7z.u} (unreserved characters are
610 allowed, and arbitrarily many segments are allowed)
611 \end{compactenum}
612
613 \paragraph{Examples of invalid resource keys}
614
615 \begin{compactenum}[(1)]
616 \item \texttt{/} (empty \abnfnt{segment}s are forbidden)
617 \item \texttt{reskey} (nonempty resource keys must always start with a
618 slash)
619 \item \texttt{/data/} (empty \abnfnt{segment}s are forbidden)
620 \item \texttt{/data//other} (empty \abnfnt{segment}s are forbidden)
621 \item \texttt{/data/c/../d} (\abnfnt{segment}s that indicate tree traversal in
622 other URI schemes are forbidden)
623 \item \texttt{/data!g-vo.org} (although this might become legal when some
624 IVOA standard gives the bang -- which is from \abnfnt{sub-delims} -- an
625 extra meaning)
626 \item \texttt{/user/M\%fcller} (percent encoding is forbidden in resource
627 keys; even if it were allowed, the codepoint 0xfc is not in
628 \abnfnt{ivo-segment-char}; and even if it were, the sequence would still not be
629 valid UTF-8)
630 \end{compactenum}
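
The resource key rules can be sketched as a small Python validator. This is an illustration only: the function name and the \texttt{allow\_sub\_delims} switch are assumptions of this example, the latter standing in for the case where another IVOA specification defines the use of \abnfnt{sub-delims} characters.

```python
import re

UNRESERVED = "A-Za-z0-9._~"
SUB_DELIMS = "!$&'()*+,;="


def is_valid_resource_key(key, allow_sub_delims=False):
    """Check a resource key against the syntax rules for IVOID paths."""
    if key == "":
        return True  # the resource key is optional
    if not key.startswith("/"):
        return False  # nonempty keys must start with a slash
    # The hyphen goes last so it is literal inside the character class.
    chars = UNRESERVED + (SUB_DELIMS if allow_sub_delims else "") + "-"
    segment_re = re.compile("[" + chars + "]+")
    for segment in key[1:].split("/"):
        # Empty segments and the dot segments "." and ".." are forbidden.
        if segment in ("", ".", ".."):
            return False
        # Anything else (e.g. "%" or ":") falls outside the allowed set.
        if segment_re.fullmatch(segment) is None:
            return False
    return True
```

By default the sketch rejects \texttt{/data!g-vo.org}; calling it with \texttt{allow\_sub\_delims=True} accepts that key, mirroring the remark in the last-but-one example above.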
631
632
633 \subsubsection{Query}
634 \label{sect:querypart}
635
636 This specification does not pose constraints on \abnfnt{query} beyond
637 the definitions in RFC 3986. It also does not define any semantics.
638
639 Creators of IVOIDs are encouraged to adhere to URI semantics, i.e.,
640 IVOIDs with different query parts should refer to different resources.
641
642 To allow some resilience towards clients erroneously case folding the
643 query part, operators SHOULD NOT define IVOIDs referring to different
644 resources differing only by case in the query part.
645
646 Still, operators are not required to perform case folding on query
647 parts. Therefore, applications MUST NOT change the case of characters
648 in query parts.
649
650 \paragraph{Examples for valid query parts}
651
652 \begin{compactenum}[(1)]
653 \item \texttt{par1=val1\&par2=val2} (the classic use for query parts in
654 HTTP URLs as, e.g., generated by browser forms)
655 \item \texttt{//..//!:??} (but sub-delims, slashes and question marks
656 are allowed here, as are strings looking like forbidden segments in
657 resource keys)
658 \item \texttt{\%C2\%B5\%20Her} (percent-encoding special characters is
659 legal, but outside of ASCII one has to use utf-8; this example works out
660 to be ``$\mu$ Her'')
661 \item \texttt{\%23\%5B\%5D} (while the generic delimiters \#, [,
662 and ] are not allowed in query parts literally, they can be included
663 in percent-encoded forms)
664 \end{compactenum}
665
666 \paragraph{Examples for invalid query parts}
667
668 \begin{compactenum}[(1)]
669 \item \texttt{:\#[] bad} (most generic delimiters are not allowed
670 literally in query parts, nor is the blank)
671 \item \texttt{\%B5\%20Her} (sequences of percent-encoded characters must
672 be valid utf-8 after decoding)
673 \end{compactenum}
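
The examples above can be reproduced with a short Python check. The sketch below combines the RFC 3986 grammar for \abnfnt{query} with the rules of sect.~\ref{sect:chars} (no literal commercial at sign, no percent-encoded \abnfnt{unreserved} characters, percent-encoded text in UTF-8). The function name is hypothetical, and the same check applies to fragments, which share the grammar of query parts in RFC 3986.

```python
import re
from urllib.parse import unquote_to_bytes

# unreserved / sub-delims / ":" / "/" / "?" literally, or a %XX escape.
# "@" is left out because IVOIDs must not contain it literally.
LOCAL_RE = re.compile(r"(?:[A-Za-z0-9._~!$&'()*+,;=:/?-]|%[0-9A-Fa-f]{2})*")
UNRESERVED_RE = re.compile(r"[A-Za-z0-9._~-]")


def is_valid_local_component(text):
    """Check a query or fragment component (without its "?" or "#")."""
    if LOCAL_RE.fullmatch(text) is None:
        return False
    # Characters from <unreserved> must not be percent-encoded.
    for hexpair in re.findall(r"%([0-9A-Fa-f]{2})", text):
        if UNRESERVED_RE.fullmatch(chr(int(hexpair, 16))):
            return False
    # Percent-encoded sequences must decode as UTF-8.
    try:
        unquote_to_bytes(text).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```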
674
675
676 \subsubsection{Fragment}
677
678 This specification does not pose constraints on \abnfnt{fragment}
679 beyond the definitions in RFC 3986.
680
681 Creators of IVOIDs are encouraged to adhere to URI semantics, i.e.,
682 fragment identifiers should be used to distinguish between different
683 entities within the same parent resource as discussed in
684 \citet{note:uriforms}. The details of this process depend on the type
685 of document being retrieved. See sects.~\ref{sect:dereferencing} and
686 \ref{sect:standards} for details.
687
688 Applications MUST NOT change the case of characters in fragments.
689
690 For examples of valid and invalid fragments, see the examples for query
691 parts in sect.~\ref{sect:querypart}.
692
693 \subsection{Usage}
694
695 IVOIDs are used to identify resources in the general sense, i.e., they
696 might refer to datasets, abstract concepts, etc.; their Registry
697 parts, on the other
698 hand, MUST always be dereferenceable, i.e., resolve in the VO Registry.
699
700 No hierarchy is implied in any of the components. Therefore, there are
701 no relative URIs for IVOA Identifiers. In effect, this specification
702 overrides the rule in section~4.1 of RFC 3986 to become
703
704 $$
705 \abnfnt{URI-reference} \abnfto \abnfnt{URI}.
706 $$
707
708 \subsection{Reference Resolution}
709 \label{sect:dereferencing}
710
711 Registry references
712 can always be resolved to a Registry record by querying a
713 searchable registry, for instance, using RegTAP \citep{std:RegTAP}.
714 Clients will usually have some Registry endpoint URLs built in; more
715 are discoverable as described in \citet{std:RI1}. In a full registry
716 with an OAI-PMH interface, the OAI-PMH \emph{GetRecord} operation
717 provides another means for obtaining the Registry record referenced by
718 an IVOID.
719
720 If an IVOID's Registry part does not resolve in the Registry,
721 clients SHOULD assume it
722 is obsolete and that any IVOID built with it does not reference an
723 existing resource or entity either.
724
725 When dereferencing IVOIDs with query parts, applications should first
726 dereference the reference part to a registry record. From that, a service
727 should be identified that can dereference the full IVOID. Concrete
728 procedures may be given in IVOA specifications introducing certain
729 resource types. One example of this is sect.~\ref{sect:dids}.
730
731 There is no mechanism that would allow applications to tell from
732 an IVOID's form whether or not it can be dereferenced in any special
733 way. Any such information has to be obtained from the context the IVOID
734 is found in.
735
736 For resolving IVOIDs with fragment identifiers, applications would again
737 resolve the Registry part in the Registry. In the presence of a query
738 component, it would be dereferenced as just discussed to obtain a basic
739 document, otherwise the basic document is the Registry record itself.
740 The entity referred to is then extracted from the basic document by
741 means specific to the document type; one example of such a prescription
742 is given in sect.~\ref{sect:standards}.
743
744 As there are no relative IVOIDs, most of RFC 3986's section~5 does not
745 apply here.
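
The first step of the resolution procedure above, separating the
Registry part from an optional query part and fragment identifier, can
be sketched in Python as follows. This is an illustration under the
assumption that the input is a syntactically valid IVOID; the function
name \texttt{split\_ivoid} is hypothetical and not defined by any IVOA
standard.

```python
def split_ivoid(ivoid):
    """Split an IVOID into (registry_part, query, fragment).

    Components that are absent are returned as None.
    """
    # The fragment identifier starts at the first '#'.
    before_fragment, hash_sep, fragment = ivoid.partition("#")
    # The query part starts at the first '?' in what remains.
    registry_part, query_sep, query = before_fragment.partition("?")
    return (
        registry_part,
        query if query_sep else None,
        fragment if hash_sep else None,
    )


print(split_ivoid("ivo://example.com/res/key1?par=U%20Pic#Part1"))
```

Only the first element of the result is looked up in the Registry; the
query part and the fragment identifier are passed on unchanged to
whatever service or document type handles them.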
746
747 \subsection{Normalization and Comparison}
748 \label{sect:comparing}
749
750 An important use of identifiers is comparing two instances to
751 determine if they refer to the same resource. This will most commonly
752 occur when using an identifier to look up the associated resource
753 description in a registry.
754
755 IVOID comparison is according to RFC 3986, section 6.2.2, with the
756 following additional regulations:
757
758 \begin{itemize}
759 \item As no hierarchy is implied in any IVOID part, no path segment
760 normalization is ever performed on IVOIDs.
761 \item As IVOIDs must not percent-encode characters that do not need to
762 be encoded, no percent-encoding normalization is ever performed on
763 IVOIDs.
764 \item In addition to scheme and authority as in RFC 3986, in IVOIDs the
765 resource key is also compared case-insensitively. This means that
766 Registry references can be case-folded for processing.
767 \end{itemize}
768
769 Note that neither query parts nor fragment identifiers may be compared
770 case-insensitively or normalized in any other way; allowing this would
771 severely impact their usefulness, as they, in general, refer to
772 case-sensitive entities like XML ids or file system paths.
773
774 No further normalizations are performed in IVOID comparison, i.e.,
775 sections 6.2.3 and 6.2.4 of RFC 3986 do not apply.
776
777 For instance, given the IVOID
778 $$\mbox{\nolinkurl{ivo://example.com/res/key1?par=U\%20Pic\#Part1},}$$ the
779 IVOID
780 $$\mbox{\nolinkurl{IVO://EXAMPLE.COM/RES/KEY1?par=U\%20Pic\#Part1}}
781 $$ must compare equal, while the following IVOIDs must compare
782 non-equal:
783
784 \begin{itemize}
785 \item \nolinkurl{ivo://example.com/res/key1?par=u\%20Pic\#part1}
786 (query part and fragment are compared case-sensitively)
787 \item \nolinkurl{ivo://example.com/./res/key1?par=U\%20Pic\#Part1}
788 (no path normalization takes place, even if that were a legal IVOID)
789 \item \nolinkurl{ivo://example.com/res/key1?par=U\%20Pic} (fragment
790 identifiers may not be stripped off for comparison)
791 \item \nolinkurl{ivo://example.com/res/key1?par=U\%20Pic\&\#Part1}
792 (query parts are not parsed, and their interpretation as key/value pairs
793 is up to data providers)
794 \item \nolinkurl{ivo://example.com/res/\%6Bey1?par=U\%20Pic\#Part1}
795 (no normalization of percent encoding takes place)
796 \end{itemize}
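
The comparison rules and the examples above can be summarized in a
short Python sketch. It assumes syntactically valid IVOIDs, folds the
case of the Registry part only, and leaves query parts and fragment
identifiers untouched; the function names \texttt{normalize\_ivoid} and
\texttt{ivoids\_equal} are hypothetical.

```python
def normalize_ivoid(ivoid):
    """Case-fold the Registry part; keep query and fragment verbatim."""
    # The Registry part ends at the first '?' or '#', whichever comes first.
    cut = len(ivoid)
    for delimiter in "?#":
        position = ivoid.find(delimiter)
        if position != -1:
            cut = min(cut, position)
    # Scheme, authority and resource key compare case-insensitively.
    # No path-segment or percent-encoding normalization is applied.
    return ivoid[:cut].lower() + ivoid[cut:]


def ivoids_equal(first, second):
    """Compare two IVOIDs after case-folding their Registry parts."""
    return normalize_ivoid(first) == normalize_ivoid(second)


reference = "ivo://example.com/res/key1?par=U%20Pic#Part1"
print(ivoids_equal(reference, "IVO://EXAMPLE.COM/RES/KEY1?par=U%20Pic#Part1"))
print(ivoids_equal(reference, "ivo://example.com/res/key1?par=u%20Pic#part1"))
```

Because nothing beyond case folding is done, the remaining non-equal
examples from the list above (the dot segment, the stripped fragment,
the extra ampersand, and the percent-encoded \texttt{k}) also compare
non-equal under this sketch.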
797
798 In general, the string-based comparison of identifiers
799 cannot determine definitively if two identifiers refer to different
800 resources. While it is not intended that a Registry record is
801 registered multiple times with different identifiers, it is not
802 disallowed by this specification. In particular, it is possible that
803 two resources with different identifiers may be mirrors of each other;
804 such a relationship can only be determined by examining the metadata
805 contained in the descriptions associated with each identifier.
806
807
808 This concludes the additional constraints and regulations for IVOIDs
809 over RFC 3986 compliant URIs. The remainder of this document
810 standardizes certain aspects not in the scope of RFC 3986.
811
812 \section{Creating Identifiers}
813 \label{sect:creating}
814
815 An important aim of the process for creating identifiers is to ensure
816 uniqueness. In the context of IVOA
817 identifiers, ``unique'' means that a given identifier MUST NOT refer
818 to two different resources at any instant. Furthermore, the
819 identifier SHOULD refer to at most one resource over all time; that
820 is, IVOIDs should not be reused for unrelated resources. Note that a
821 resource may potentially be dynamic (such as ``weather at telescope'' or
822 ``current version of the standard'') -- here, there is a conceptually unique
823 resource, even though the content of it may change in time.
824
825 Another aim of the identifier creation process is to trace the
826 delegation of authority over the identifier.
827 In practice, a Registry reference is created by
828 an organization when registering a resource.
829 Thus, only recognized naming authorities (or
830 persons representing such organizations) may create Registry references.
831
832 The details of the service used to claim a
833 naming authority are described in the IVOA Registry
834 Interfaces standard \citep{std:RI2}.
835
836 Once an organization is recognized as a naming authority, it is free
837 to register any number of resources with identifiers having an
838 authority identifier that it controls. No
839 organization may create an identifier with an
840 authority identifier it does not control. The naming
841 authority has full control over the creation of a
842 resource key as long as it conforms to the syntax
843 and uniqueness constraints described in this specification.
844
845 Likewise, once a Registry reference is established, any number of IVOIDs may be
846 built using it (e.g., when publishing new datasets). In this case, the
847 VO Registry is not involved; IVOID creation happens under the exclusive
848 control of the owner of the service or data collection the Registry
849 reference refers to.
850
851
852
853 \section{Special Identifier Types}
854 \label{sect:specials}
855
856 This section discusses some special classes of IVOIDs that reference
857 something other than Registry records and whose forms, for one reason
858 or another, must or should be uniform across the various
859 standards that define the resources referenced.
860
861 \subsection{Dataset Identifiers}
862 \label{sect:dids}
863
864 DAL standards like Obscore \citep{std:OBSCORE}, SSAP
865 \citep{std:SSAP}, or Datalink \citep{std:Datalink} need to reference
866 datasets. The SSAP standard defines these as ``an individual data object
867 usually including associated metadata.'' In astronomy, single images or
868 spectra are datasets, but tables or more complex data products might, at
869 the publisher's discretion, also be referenced as a single dataset.
870
871 A reference to a dataset is called a dataset identifier (DID), more
872 specifically publisher DID if the DID was assigned by the dataset's
873 publisher, and creator DID if the DID was assigned by the dataset's
874 author. Various standards mandate that DIDs must be IVOIDs.
875
876 Historically, DIDs were customarily formed by adding fragment
877 identifiers to Registry references, a practice recommended in
878 SSAP in versions up to 1.1.
879 This definition was criticized in
880 \citet{note:uriforms} as a potential interoperability issue.
881
882 Therefore, this specification deprecates the regulation from SSAP 1.1.
883 Instead, DIDs in the VO now MUST use the query part to distinguish
884 datasets within one VO resource. In short, the separator between
885 Registry reference and local part now must be the question mark rather than the
886 octothorpe. A welcome side effect is that the fragment identifier can
887 now be used to reference sub-entities within the datasets.
888
889 An example for a dataset id (that should actually resolve according to
890 the scheme laid out below) is $$
891 \mbox{\nolinkurl{ivo://org.gavo.dc/\~?flashheros/data/ca92/f0065.mt}.}$$
892
893 Existing DIDs in services implementing SSAP up to 1.1 and Obscore 1.0
894 are not affected by these requirements and may be used until the
895 respective services are updated to newer standards.
896
897 Note that by this specification publishers have no obligation to ensure
898 continued access to datasets identified with PubDIDs. They are \emph{not}
899 by themselves
900 persistent identifiers with guarantees on resolvability. Their main
901 function is to provide globally unique identifiers for use in, e.g.,
902 federating responses from different services.
903
904 Publishers are, however, encouraged to declare at least one capability
905 of a protocol dealing with
906 PubDIDs\footnote{At the time of this writing, Datalink, Obscore, and
907 SSA are IVOA recommended protocols allowing queries involving PubDIDs.
908 SIA \citep{std:SIAP} will, according to current
909 proposed recommendations, have an analogous facility in version
910 2.0.} in the resource record referenced by the Registry part of
911 a PubDID (i.e., the URI in front of the first question mark). In that
912 way, clients can attempt to retrieve data based on
913 stand-alone PubDIDs by querying the
914 Registry for the ``embedding'' resource and seeing if it supports any
915 protocol they implement.
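
As an illustration of the lookup just described, the following Python
sketch derives the Registry part of a PubDID and builds an ADQL query
for the capabilities declared in the embedding resource. It is not a
resolver. It assumes a RegTAP service exposing the
\texttt{rr.capability} table with \texttt{ivoid} and
\texttt{standard\_id} columns, and it assumes that the service stores
IVOIDs in lower case; the function name \texttt{capability\_query} is
hypothetical.

```python
def capability_query(pubdid):
    """Build an ADQL query listing capabilities of the embedding resource."""
    # The Registry part is everything in front of the first question mark.
    registry_part = pubdid.split("?", 1)[0]
    # Assumption: the RegTAP service stores IVOIDs in lower case.
    # Single quotes are doubled so the value is a safe string literal.
    literal = registry_part.lower().replace("'", "''")
    return (
        "SELECT standard_id FROM rr.capability "
        "WHERE ivoid = '" + literal + "'"
    )


print(capability_query("ivo://org.gavo.dc/~?flashheros/data/ca92/f0065.mt"))
```

A client would then compare the returned \texttt{standard\_id} values
with the protocols it implements.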
916
917 The definition of a proper resolver or resolution strategy is beyond the
918 scope of this standard. Although services prototyping such functionality have
919 been written\footnote{e.g., GAVO's global PubDID resolver at
920 \url{http://dc.g-vo.org/glopidir}.}, we
921 maintain that additional efforts are required outside of the Registry to build a
922 reliable infrastructure on top of PubDIDs.
923
924 \subsection{Standard Identifiers}
925 \label{sect:standards}
926
927 In many VO standards, it is important to express adherence to a
928 set of constraints.
929 Common examples include the declaration of the protocol --
930 and the version of the protocol -- that an endpoint implements in
931 VOResource's \xmlel{capability} element or a data model represented
932 with a TAP service in TAPRegExt. The resource record such identifiers
933 reference is defined by StandardsRegExt \citep{std:STDREGEXT}. As such
934 records typically describe multiple versions of a standard, and a single
935 standard may contain definitions of multiple different capabilities that
936 need to be discerned, the simple Registry Reference of the standard record usually is
937 not enough.
938
939 Therefore, StandardsRegExt records should define one
940 \xmlel{key} element for each such referenceable
941 entity. The \xmlel{name} child of this key, denoting both the kind of
942 capability and the major and minor version, is then what is referenced
943 by the identifier as defined by StandardsRegExt, such that the complete
944 identifier will typically have the form
945 $$
946 \abnfnt{standard-ref}\abnfterm{\#}\abnfnt{key-name}\abnfterm{-}\abnfnt{version}
947 $$
948
949 For instance, the standard exampleProto might define both a
950 data model \texttt{model} and a query capability \texttt{query}. In
951 its version 1.0, there would be two standard keys \texttt{model-1.0} and
952 \texttt{query-1.0}. In a \xmlel{capability} element in another
953 resource's Registry record, support of the query capability would then
954 be declared with the IVOID
955 \texttt{ivo://ivoa.net/std/exampleProto\#query-1.0}, whereas a TAP
956 service exposing the model would contain a \xmlel{dataModel} element
957 with an \xmlel{ivo-id} attribute of
958 \texttt{ivo://ivoa.net/std/exampleProto\#model-1.0}.
959
960 As the exampleProto develops, new standard keys like
961 \texttt{query-1.1} or \texttt{query-2.0} are added. Note that while ideally,
962 the version tags in the keys will correspond to the version of the
963 document that defines them, this is not a requirement. Indeed, if the
964 underlying model has no incompatible changes, even exampleProto 2.0
965 might specify that its data model would remain
966 \texttt{ivo://ivoa.net/std/exampleProto\#model-1.0}. This allows clients
967 to easily discover all services they can operate.
968
969 Registry interfaces will typically offer some pattern matching
970 capability for comparing such identifiers.
971 Clients should use that feature
972 to ignore minor versions if appropriate -- by the IVOA's versioning
973 rules \citep{std:docSTD},
974 a generic client for version 1 of a protocol should be able to
975 operate all version 1 services, regardless of their minor versions, and
976 clients implementing multiple versions of a standard can entirely ignore
977 the version tag. For instance, with RegTAP \citep{std:RegTAP}
978 an exampleProto 1.0 client would look for capabilities for which
979 $$
980 \texttt{standard\_id LIKE 'ivo://ivoa.net/std/exampleProto\#query-1.\%'}
981 $$
982 holds, whereas a client that speaks both versions 1 and 2 of the
983 protocol would look for capabilities with
984 $$
985 \texttt{standard\_id LIKE 'ivo://ivoa.net/std/exampleProto\#query-\%'}.
986 $$
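
The two patterns above follow directly from the
\abnfnt{standard-ref}, \abnfnt{key-name}, and \abnfnt{version}
structure of standard identifiers. A minimal Python sketch of how a
client might assemble them is given below; the function name
\texttt{standard\_id\_pattern} is hypothetical, and the sketch assumes
that neither the standard reference nor the key name contains the
\texttt{LIKE} wildcards \texttt{\%} or \texttt{\_}.

```python
def standard_id_pattern(standard_ref, key_name, major=None):
    """Build a LIKE pattern for a standard key.

    With a major version, any minor version of that major version
    matches; without one, any version of the key matches.
    """
    # The trailing dot keeps major version 1 from matching 10, 11, ...
    version_prefix = "" if major is None else str(major) + "."
    return standard_ref + "#" + key_name + "-" + version_prefix + "%"


base = "ivo://ivoa.net/std/exampleProto"
print(standard_id_pattern(base, "query", major=1))
print(standard_id_pattern(base, "query"))
```

Depending on how a particular Registry service stores identifiers, the
pattern may additionally have to be case-folded before use.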
987
988 \appendix
989
990 \section{Changes from Previous Versions}
991
992 \subsection{Changes from PR-2015-07-09}
993
994 \begin{itemize}
995 \item Now deprecating the term IVORN, as historical usage has been too
996 inconsistent. Instead, there is now the ``Registry part'' of an IVOID,
997 and an IVOID that only has a registry part is called a Registry
998 reference.
999 \item More examples
1000 \item No longer suggesting a concrete algorithm for PubDID resolution;
1001 instead, clear encouragement to PubDID minters to point to appropriate
1002 services from the Registry part of a PubDID.
1003 \item Editorial changes
1004 \end{itemize}
1005
1006 \subsection{Changes from 1.12}
1007
1008 \begin{itemize}
1009 \item Removed the (unused) XML representation of Identifiers.
1010 \item Rewrote the section on URI forms to more closely correspond to
1011 the organization of RFC 3986.
1012 \item Case-insensitive handling of IVORNs is now a MUST.
1013 \item Now allowing percent-encoded items outside of the authority and
1014 resource key.
1015 \item Added rules for forming URI-compliant dataset identifiers
1016 \item Added rules for forming StandardsRegExt-compliant standard
1017 identifiers.
1018 \item Empty path segments, as well as those consisting exclusively of
1019 dots, are now forbidden rather than just discouraged.
1020 \item Dropped the recommendation to present authority identifiers in
1021 lower case.
1022 \item Generally moved to IVOID as the abbreviation for IVOA identifier,
1023 defined IVORN to be the part of an IVOID without a local part.
1024 \item Removed some obsolete introductory material that has been
1025 superseded by other standards.
1026 \item Migrated to ivoatex source
1027 \end{itemize}
1028
1029 \subsection{Changes from v1.10}
1030
1031 \begin{itemize}
1032 \item Moved ``!'' from the discouraged list of
1033 characters to the reserved list,
1034 thereby disallowing its inclusion in IVOA identifiers.
1035 \item Clarified the list of characters disallowed in an authority ID by:
1036 \begin{itemize}
1037 \item explicitly disallowing URI-escaped sequences.
1038 \item listing as reserved characters only those characters
1039 that are allowed by the URI spec but disallowed by this
1040 one.
1041 \item Listed in a tip box the characters that are disallowed
1042 by the URI spec.
1043 \end{itemize}
1044 As before, the definition of the resource key
1045 refers to the same list of
1046 reserved characters as those disallowed.
1047 \item Fixed numerous links and references.
1048 \end{itemize}
1049
1050
1051 \subsection{Changes from v1.0}
1052
1053 \begin{itemize}
1054 \item The prohibition of using ``+'' and ``='' within
1055 Identifier components has been dropped.
1056 \item Recommendations for authority ID strings
1057 have been updated to match current practice in AstroGrid and the
1058 NVO.
1059 \item In the example schema in App. A, the namespace was altered to conform
1060 with IVOA conventions. A correction was also made to the
1061 allowed pattern for AuthorityIDType to properly comply with the XML
1062 specification defined in section 3.2.1.
1063 \item various clarifications based on reviewer comments
1064 \end{itemize}
1065
1066 \subsection{Changes from v0.1}
1067
1068 \begin{itemize}
1069 \item Resource key is now required except when referring to a naming
1070 authority itself.
1071 \item support for DNS-like authority IDs clarified.
1072 \item added role of \# and ? as ``stop'' characters in URI form.
1073 \item dropped non-binding Appendix B: Recommended Mechanism for
1074 becoming a Naming authority.
1075 \end{itemize}
1076
1077
1078 \bibliography{ivoatex/ivoabib}
1079
1080
1081 \end{document}
