% Vocabularies.tex, revision 5952, Tue May 11 12:26:25 2021 UTC, by msdemlei:
% Changes after first DAL review.
\documentclass[11pt,a4paper]{ivoa}
\input tthdefs

\usepackage{todonotes}
\lstloadlanguages{XML,python}
\lstset{flexiblecolumns=true,tagstyle=\ttfamily, showstringspaces=False,
  basicstyle=\footnotesize}

\definecolor{termcolor}{rgb}{0.6,0.1,0.1}

\iftth
  \def\vocterm#1{\emph{\color{termcolor}#1}}

\else
  \def\vocterm{\startvocterm\realvocterm}
  \def\realvocterm#1{\emph{\color{termcolor}#1}\endvocterm}
  \begingroup
    \gdef\breakablecolon{:\hskip0pt}
    \catcode`\:=\active
    \gdef\startvocterm{\begingroup
      \catcode`\:=\active\let:=\breakablecolon}
    \gdef\endvocterm{\endgroup}
  \endgroup
\fi


\newcommand{\vepitem}[1]{\emph{#1}}

\title{Vocabularies in the VO}

% see ivoatexDoc for what group names to use here
\ivoagroup{Semantics}

\author{Markus Demleitner}
\author{Norman Gray}
\author{Mark Taylor}

\editor{Markus Demleitner}

\previousversion{WD-20200612}
\previousversion{WD-20200326}
\previousversion{WD-20190905}


\begin{document}
\begin{abstract}
In this document, we discuss practices related to the use of RDF-based
consensus vocabularies in the Virtual Observatory, that is, the creation,
publication, maintenance, and consumption of
hierarchical word lists agreed upon within the IVOA.
To cover the wide range of use cases envisioned, we define different
vocabulary types for informal knowledge organisation on the
one hand, and strict hierarchies of classes and properties on the other.
While the framework rests on the solid foundations of W3C RDF,
provisions are made to facilitate using IVOA vocabularies without
specific RDF tooling.
Non-normative appendices detail the current vocabulary-related tooling.
\end{abstract}


\section*{Acknowledgments}

While this is a complete rewrite of the specification of how vocabularies
are treated in the VO, we gratefully acknowledge the groundbreaking work
of the authors of version 1 of Vocabularies in the VO, S\'ebastien
Derriere, Alasdair Gray, Norman Gray, Frederic Hessmann, Tony Linde,
Andrea Preite Martinez, Rob Seaman, and Brian Thomas.

In particular, the vocabulary for datalink semantics done by Norman Gray
was formative for many aspects of what is specified here.

\section*{Conformance-related definitions}

The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
``OPTIONAL'' (in upper or lower case) used in this document are to be
interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.

The \emph{Virtual Observatory (VO)} is a
general term for a collection of federated resources that can be used
to conduct astronomical research, education, and outreach.
The \href{http://www.ivoa.net}{International
Virtual Observatory Alliance (IVOA)} is a global
collaboration of separately funded projects to develop standards and
infrastructure that enable VO applications.

\section{Introduction}

The W3C's Resource Description Framework RDF \citep{note:rdfprimer} is a powerful
and very generic means to represent, transmit, and reason about highly
structured, ``semantic'' information.  With both its power and
generality, however, comes a high complexity for consumers of this
information if no further conventions are in force.  Also, the generic
W3C standards understandably do not cover how semantic resources (e.g.,
vocabularies or ontologies) are to be managed, let alone developed
within organisations like the IVOA.

While for many applications even within the VO, the significant
complexity and the lack of defined management processes are acceptable,
for several other use cases -- in particular those given in
sect.~\ref{sect:usecases} --, having extra conventions greatly
helps implementability and interoperability.

Based on requirements derived from these use cases
(sect.~\ref{sect:requirements}), this standard will therefore define
conventions for vocabularies based on either SKOS \citep{std:skos} or
RDFS \citep{std:rdfs} in
sect.~\ref{sect:voccontent}.  Where these vocabularies -- and hence, in
particular, the permanent URIs of their RDF resources (``terms'')
-- are managed by the
IVOA, they need to be reviewed, and consensus needs to be found.  A process to
ensure this is described in
sect.~\ref{sect:management}.  In order
to provide certain guarantees to clients, sect.~\ref{sect:deployment}
defines minimal standards for how IVOA-managed vocabularies must be made
available.  To help adopters who are merely looking for simple
vocabulary-related recipes, sect.~\ref{sect:withoutrdf} discusses how IVOA
vocabularies can be used without knowledge of RDF.

The non-normative appendices~\ref{app:tools} and \ref{app:curtech}
describe the tooling
currently used or recommended for building and managing vocabularies in the
IVOA.


\subsection{Role within the VO Architecture}

\begin{figure}
\centering

\includegraphics[width=0.9\textwidth]{role_diagram.pdf}
\caption{Architecture diagram for this document}
\label{fig:archdiag}
\end{figure}

Fig.~\ref{fig:archdiag} shows the role the Vocabularies in the VO standard
plays within the IVOA architecture \citep{2010ivoa.rept.1123A}.

This standard defines a set of conventions and procedures on
top of several W3C standards that can be adopted by other VO standards
that require interoperable, consensus vocabularies, such as:

\begin{bigdescription}
\item[Datalink \citep{2015ivoa.spec.0617D}] Datalink includes a
vocabulary letting clients work out the kind of artefact a row pertains
to.

\item[VOResource \citep{2018ivoa.spec.0625P}] VOResource 1.1 comes with
several (rather flat) vocabularies enumerating, for instance, the types
of relationships between VO resources, their intended audiences, or
classes of actions performed on them.

\item[VOEvent \citep{2006ivoa.spec.1101S}] VOEvent defines \emph{Why}
and \emph{What} elements which, while not formally required to be drawn
from a specific vocabulary in version 1.11, certainly become much more
useful if they are.

\item[VOTable \citep{2019ivoa.spec.1021O}] VOTable, in its version 1.4,
introduces vocabularies for time scales and reference positions.


\item[UCDs \citep{2007ivoa.spec.0402M}] UCDs are related to vocabularies in
that they provide machine-readable semantics.  Because the terms listed
in the document can be combined and have an underlying grammar, however,
they go beyond standard RDF.  Hence, no attempt is being made to
integrate them into the framework proposed here at this time.  The
UCD atoms might be organised in an RDF vocabulary, though, and doing so
might be considered in the future.
\end{bigdescription}

Other VO standards can do with fewer normative constraints; using W3C
standards without the extra requirements laid down here is explicitly
encouraged where the use cases do not require the extra management and
definition effort, or where perhaps more complex structures (e.g., full
ontologies) must be employed.
An example of a direct use of SKOS
without adoption of the present document is the Simulation Data Model
SimDM \citep{2012ivoa.spec.0503L}, where several fields constrain their
values to be \vocterm{skos:narrower} than certain top-level concepts.

\subsection{Relationship to Vocabularies in the VO Version 1}
\label{sect:version1rel}

Published in 2009, version 1.19 of the IVOA Recommendation on
Vocabularies in the VO had an outlook fairly different from the present
document: the big use case was VOEvent's Why and What, and so its focus
was on large, general-purpose vocabularies, of which several existed even
back then, while an overhaul of a thesaurus of general astronomical
terms approved by the IAU in 1993 was underway as part of IVOA's
activities.  Mapping between vocabularies maintained by different VO
and non-VO parties seemed to be the way to ensure interoperability and
therefore played a large role in the document.  Also, the use cases
called for ``soft'' relations, which is why the standard confined itself
to SKOS as the vocabulary formalism.

Since then, ``the'' large astronomy thesaurus has been maintained
outside of the IVOA (the UAT\footnote{\url{http://astrothesaurus.org}}),
and there is hope that its takeup will be sufficient to make mapping
between it and, say, legacy journal keyword systems an exercise general
clients will not have to perform.

Instead, in 2010, a fairly formal vocabulary of what
should be properties (in the RDF sense) rather than \vocterm{skos:Concept}-s
was required during the development of the datalink standard.  The
vocabulary was (and still is) small in comparison to, say, the UAT.
In
contrast to the expectations of Vocabularies~1, the plan had been that
most data providers would work with this small vocabulary, and terms
from external vocabularies would only be used as temporary stand-ins
until the consensus vocabulary was updated.  Of course, this required a
process for managing such vocabularies.  The lack of such a process
became even more noticeable when VOResource 1.1 and VOTable 1.4
introduced vocabularies of their own similar in size and scope to the
datalink vocabulary.

On the other hand, we are not aware of a single attempt to map
between different vocabularies in a VO context, and the SKOS versions of
some vocabularies that Vocabularies~1 declared as normative in its
section~4 were largely unused and have been unmaintained for a while now.

Since large parts of the original specification turned out to be
irrelevant or unsustainable as the VO ecosystem evolved,
while some core requirements found later
were not addressed, it was decided to prepare a new major version of the
Vocabularies in the VO standard.

\subsection{Reading Guide}

We hope that software authors or annotators just wanting to consume IVOA
vocabularies or use them to annotate documents will be able to
do so after reading just section~\ref{sect:withoutrdf}.  In particular, no
deeper understanding of RDF should be necessary.

Persons intending to participate in vocabulary evolution should skim
sect.~\ref{sect:voccontent}, in particular the subsection on the kind of
vocabulary they want to modify, and must study
sect.~\ref{sect:management}.

Readers unfamiliar with RDF should read \citet{local:normanspaper} before
reading anything outside of section~\ref{sect:withoutrdf}.
In particular, we assume familiarity with all RDF
terminology discussed there.  Concepts not covered by Gray's
essay will be informally introduced here.
Of course, the
underlying W3C standards are normative where applicable.



\subsection{Terminology, Conventions, Typography}

When we speak of a \emph{term} here, we mean either a \vocterm{skos:Concept}
in SKOS vocabularies, an \vocterm{rdfs:Class} in RDF class vocabularies,
or an \vocterm{rdf:Property} in RDF property vocabularies.  We also use
\emph{term} for the string after the hash character in
the RDF resource URI, i.e., the machine-readable string typically used
in annotation.  It is rarely necessary to distinguish between the two
meanings.

We refer to classes and properties by CURIEs \citep{std:curie}, i.e.,
URIs shortened by replacing long strings with compact prefixes and a
colon.  The prefixes in this
document correspond to the following base URIs:

\begin{compactitem}
\item dc -- \url{http://purl.org/dc/terms/}
\item rdf -- \url{http://www.w3.org/1999/02/22-rdf-syntax-ns#}
\item rdfs -- \url{http://www.w3.org/2000/01/rdf-schema#}
\item owl -- \url{http://www.w3.org/2002/07/owl#}
\item skos -- \url{http://www.w3.org/2004/02/skos/core#}
\item ivoasem -- \url{http://www.ivoa.net/rdf/ivoasem#}
\end{compactitem}

Vocabulary terms are written in italics (e.g., \vocterm{rdfs:Class})
and, where supported, in a reddish hue.  As is common in IVOA
specifications, XML element and attribute names are written in
typewriter italic (e.g., \xmlel{img}).

\section{Derivation of Requirements (Non-Normative)}

\subsection{Use Cases}
\label{sect:usecases}

The normative content of this document is guided by a set of
requirements derived from the following use cases.

\subsubsection{Controlled Vocabulary in VOResource}
\label{uc:simplevoc}

In VOResource, in certain use cases clients have to find services that
publish a given data collection.
This is effected by linking the resource
records for service and data with a
DataCite-compatible \vocterm{isServedBy} relationship.
Its concrete literal needs to be reliably defined in order to let
clients find such relationships by a simple string comparison in RegTAP
queries.

A related use case is that validators can flag errors (or at least
warnings) when resource records use terms that are not part of some
controlled vocabulary (e.g., content levels or types of events in a
resource's history).  Very typically, such out-of-vocabulary terms
indicate small oversights on the part of the resource record author that
will lead to hard-to-debug problems in data discovery.

\subsubsection{Controlled Vocabularies in VOTable}
\label{uc:votvoc}

VOTable 1.4 constrains two attributes of the TIMESYS element
-- reference positions and time
scales -- using vocabularies.
While with time scales the situation is not fundamentally
different from the VOResource case discussed in
use case~\ref{uc:simplevoc} -- a simple enumeration of agreed-upon strings
is enough to uniquely determine what operations need to be performed to
combine times given in different time scales --, the situation for
reference positions is probably different.  There, even if a client does
not exactly know the location of, say, the Hubble Space Telescope at any
given time, several important use cases can already be satisfied if a
client knows that it is in low Earth orbit (e.g., assuming a reference
position Geocenter and adjusting the systematic error estimates).  For
this, a client needs information of the type ``\vocterm{HST}
\vocterm{is-close-to} \vocterm{GEOCENTER\/}'' (or similar).

There is also another difference between this and at least the
VOResource relationship vocabulary from use case~\ref{uc:simplevoc}
in that the latter is property-like, as
in ``Resource-1 \vocterm{isServedBy} Resource-2\/''.  In contrast with
this, a time scale would be used like ``Time-coordinate
\vocterm{is-given-in} \vocterm{TT\/}''.  In RDFS terminology, time scales
are therefore better modelled as classes rather than properties.

\subsubsection{Datalink Link Selection}
\label{uc:links}

In Datalink, clients receive a set of links
to pieces of information (e.g., previews, additional metadata,
progenitors, or
derived data) and need to present to the user only those items
relevant to the task at hand.  For instance, in a discovery phase, only
previews should be offered, while scientific exploitation would call for
cutout services, alternate formats, or derived data.  For debugging,
progenitors should be made accessible, and so on.

Operators of datalink services, on the other hand, want to be precise in
their annotation of datasets.  For instance, they may want to discern
among progenitors: the raw image, a dark frame, and a flat field.  In all
these cases, clients should still be able to work out that such
artefacts are progenitors.

\subsubsection{VOEvent Filtering, Query Expansion}
\label{uc:filtering}

In VOEvent, an event stream can contain a classification of what the
observers believe was observed, for instance ``supernova Ia explosion''.
While an event stream from one project might provide a classification on
that level for some event, it might not (yet) be able to do that in
another event, and a different event stream might not be able to
distinguish between different sorts of supernovae at all.

In this situation, an event broker looking for supernovae of type Ia
will filter out anything not related to supernovae; however, since for
one reason or another a type Ia supernova might only be tagged as supernova,
it will want to widen its filter somewhat.  Some backend process
might then prioritise events classified as Ia upstream over those only tagged
as a generic supernova, and those, again, over those tagged explicitly
as some different type of supernova.

Similar use cases exist, for instance, in the discovery of simulations
and possibly for subjects of VO resources.


\subsubsection{Vocabulary Updates in VOResource}
\label{uc:deprecation}

In VOResource 1.0 \citep{2008ivoa.spec.0222P}, relationship types
like \vocterm{served-by} or
\vocterm{service-for} were defined.  Later, DataCite defined equivalent
terms \vocterm{IsServedBy} and \vocterm{IsServiceFor}.  Arguably, the VO should,
as far as sensible, take up standards in the wider data management
community, and so VOResource 1.1 adopts the DataCite terms.  In a minor
version, it cannot forbid the old terms.  It can, however, say not only
``\vocterm{served-by\/} is the same as \vocterm{isServedBy\/}'' but also
``Use the latter term in preference to the former''.  If this information is
available machine-readably, validators can warn against the use of
deprecated terms and user interfaces can transparently replace
deprecated terms with current ones.  This latter use case is
already specified in RegTAP 1.1 \citep{2019ivoa.spec.1011D}.

Another use case in the context of VOResource and vocabulary updating
is the definition of content levels.  In VOResource 1.0, a list of
terms was adopted that was far too fine-grained in the area of public
outreach, distinguishing, for instance, ``Middle School'' from
``Secondary Education''.
While this granularity was useful for the
original realm of the list of terms, in the VO it resulted in extremely
inhomogeneous annotation.  Obviously, persons employed in research
institutions can hardly be expected to assess needs and capabilities of
middle school versus elementary school educators.  Eventually, for
VOResource 1.1 a three-term list was drawn up and is now actually used.
To avoid a repetition of such an experience, we want to enable small
initial vocabularies that are easily extendable as new terms are actually needed
and the use of the existing terms is well understood.


\subsubsection{Vocabularies in VO-DML}

The modelling language VO-DML \citep{2018ivoa.spec.0910L} lets model
designers constrain attribute values through external resources defined
through a vocabulary URI and possibly a top concept.  The standard
mentions both SKOS -- inspired by version 1 of this document -- and RDFS
as possible technologies for such constraints.

Depending on the nature of the attributes constrained, modellers might
foresee the need for having these vocabularies managed by the IVOA.  Of
course, that is up to the modeller: There are certainly many cases in
which there is no need for the overhead this specification brings with
it, be it because vocabularies are externally defined or because the
concrete application profits from less-constrained vocabularies.

\subsubsection{Discovering Meanings}
\label{uc:discovering}

Software developers or researchers want to work out
what some term mentioned ``means'' (where we are agnostic as to what
``means'' should mean here).  If the term URI alone is insufficient,
they can simply paste the resource URI of the term into a web browser
and read (at least) its description and perhaps find out even more using
relationships between terms.

\subsubsection{Simple Review Process}
\label{uc:simplereview}

As vocabularies evolve, new terms are being added to
vocabularies.  To facilitate their review and enable rapid uptake
of the proposed terms, it is desirable that new terms and even
new vocabularies are immediately visible to users and tools.
Note that since terms under review might be modified or removed later,
this use case is somewhat in conflict with the basic requirement
of stable vocabularies (i.e., a document valid once will not
become invalid later because of changes in vocabularies).

\subsubsection{Understanding Vocabulary Evolution}
\label{uc:understanding}

When a question comes up, such as what \vocterm{calibration} actually means
in the datalink core vocabulary, and the (legacy) description is not
sufficiently clear, people can go back to the discussions that led up
to the addition of that term.  This will also help clarify existing
usage that might have begun at the time of the initial definition.

\subsubsection{Offline operation}
\label{uc:offline}

A system doing, say, coordinate transformations might run without an internet
connection but still needs to use semantic resources on frames and
reference positions (e.g., figure out that a given space probe is in L1
and use that as reference position).  To do that, it wants to use a
previously downloaded copy of the vocabulary.

\subsubsection{UAT in VOResource}
\label{uc:uat}

VOResource 1.1, in the description of the \xmlel{subject} element, says
that its content should be drawn from the ``Unified Astronomy Thesaurus''
(here: UAT).  This is intended to later facilitate interactive topic
navigation within the Registry or semantic expansion of Registry queries
(``include narrower terms'').
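
As a purely illustrative, non-normative sketch of what ``include narrower
terms'' could amount to on the client side, the following Python fragment
expands a term to itself plus everything reachable through narrower-term
links.  The function \texttt{expand\_with\_narrower}, the mapping
\texttt{NARROWER}, and the term strings in it are hypothetical; they are
neither UAT concepts nor part of any IVOA vocabulary or library.

\begin{lstlisting}[language=python]
# Hypothetical toy hierarchy: each term maps to its directly narrower
# terms.  The strings are invented for illustration only.
NARROWER = {
    "supernova": ["type-ia-supernova", "type-ii-supernova"],
    "type-ia-supernova": [],
    "type-ii-supernova": [],
}


def expand_with_narrower(term, narrower):
    """Return term followed by all transitively narrower terms."""
    expanded = []
    pending = [term]
    while pending:
        current = pending.pop(0)
        if current in expanded:
            # thesaurus-like hierarchies may reach a term by several paths
            continue
        expanded.append(current)
        pending.extend(narrower.get(current, []))
    return expanded


print(expand_with_narrower("supernova", NARROWER))
\end{lstlisting}

A query for the wider term would then match records annotated with any of
the returned strings; terms unknown to the mapping simply expand to
themselves.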


\subsection{Requirements}
\label{sect:requirements}

\subsubsection{Lists of Terms}
\label{req:lists}

We need to be able to represent simple lists of terms even for the most
basic use case~\ref{uc:simplevoc}.  As per
use case~\ref{uc:votvoc}, we will have to represent instances of both
\vocterm{rdf:Property} and \vocterm{rdfs:Class} (though not necessarily
in one vocabulary).  In order to not break existing practices (e.g.,
use cases \ref{uc:simplevoc}, \ref{uc:votvoc}, \ref{uc:links}), the
machine-readable terms must be allowed to follow existing patterns of
essentially human-readable identifiers (against external best practices
of using non-informative URI forms).  In general, in essentially all use
cases discussed, making the machine-readable terms discernible by a
human is an advantage.

\subsubsection{Hierarchies of Terms}
\label{req:hierarchy}

Both use case~\ref{uc:links} and use case~\ref{uc:filtering} require a hierarchy
of terms, where clients can find wider and potentially narrower terms
relative to an original one.  There is a difference,
however: in the datalink use case, strict \vocterm{is-a} relationships
are what clients need (e.g., ``give me all kinds of previews'').  In the
VOEvent case, however, a somewhat softer sort of hierarchy is required.
For instance, a filter for accretion disks might very well expand to
match both quasars and cataclysmic variables.  Hence, we want to
be able to represent strict class hierarchies as well as thesaurus-like
soft knowledge structures.

\subsubsection{Tree-like Hierarchies}
\label{req:tree}

Where we expect some sort of semi-formal inference to take place on the
vocabularies, the hierarchy should be a tree in order to facilitate
traversal and controlled query expansion.  In other words, outside of
SKOS we do not support multiple inheritance.
Use cases requiring
something equivalent would have to resort to supporting multiple terms
on the annotation level.

\subsubsection{Consensus Vocabularies}
\label{req:consensus}

Essentially all of our use cases will be much easier to implement if
clients can work through simple string comparisons.  Therefore,
wherever feasible IVOA standards should build on IVOA-sanctioned,
consensus vocabularies.

\subsubsection{Deprecating Terms}
\label{req:deprecating}

While we believe at this point that terms once approved by the IVOA
should never disappear -- for instance, because validators might
otherwise flag previously valid instance documents as invalid --, use
case~\ref{uc:deprecation} shows that some way of declaring
deprecations must be foreseen.

\subsubsection{Public Availability of Machine-Readable Vocabularies}
\label{req:machine}

In particular in use cases~\ref{uc:links} and \ref{uc:filtering},
clients can flexibly incorporate vocabulary updates without code
changes, perhaps even without re-deployment, if vocabularies are
available at constant, public URIs, where clients can retrieve them in
formats reasonably easy to parse.

Use case~\ref{uc:discovering} implies that at least one representation
of the vocabulary should be human-readable.

\subsubsection{Minimal Term Metadata}
\label{req:mtm}

To support use case~\ref{uc:discovering}, all terms in IVOA vocabularies
MUST come with a non-trivial description.

\subsubsection{Simple Cases do not Require RDF Tooling}
\label{req:nordf}

(Not derived from any specific use case).  Since libraries implementing
(some subset of) RDF tend to be rather massive and thus appear
disproportionate when all a client wants is an up-to-date list of terms
with their descriptions, at least the basic use cases must not require
specific RDF tooling.
Indeed, simple uses should not require an
understanding of RDF in the first place.


\subsubsection{Vocabulary Evolution}
\label{req:evolution}

Most use cases make it desirable that terms can be added to existing
vocabularies; this is very clear for the reference positions in
use case~\ref{uc:votvoc}, where new instruments would imply new
terms.  The history of content level annotation in VOResource mentioned
in use case~\ref{uc:deprecation} illustrates the desirability of a
simple process that invites standard authors to start with minimal
vocabularies, relying on later extensions.

\subsubsection{Traceable Provenance}
\label{req:traceable}

To satisfy use case~\ref{uc:understanding}, the considerations that led
to the adoption or modification of a term must be documented publicly
in sufficient detail.  It is clearly an advantage if a brief, accessible
summary of these considerations can easily be found without, say,
resorting to version control logs.

\subsubsection{Preliminary Vocabularies and Terms}
\label{req:preliminary}

In use case~\ref{uc:simplereview}, it is desirable to admit
``preliminary'' vocabularies and terms.  For these, both humans
and machines must be able to discern a temporary status, and
their use implies that the general rule ``once valid, always
valid'' does not apply.  Validators and similar software could
then add notices to that effect in their outputs.

\subsubsection{Vocabulary Files are Usable Stand-Alone}
\label{req:standalone}

Vocabulary files need to be cacheable without applications having to
manage extra metadata (e.g., the URL from which the file was obtained)
in order to easily satisfy use case~\ref{uc:offline} (or other scenarios
in which vocabulary content cannot be retrieved from the IVOA
site for each session).

\subsubsection{Externally Curated Vocabularies and VO Tooling}
\label{req:external}

Regrettably, VOResource does not explain what use case~\ref{uc:uat} would
look like in actual documents, and the example given in the document
clearly does not use UAT concepts.

The first difficulty in a straightforward uptake is that UAT URIs look
like \url{http://astrothesaurus.org/uat/1774}.  Given that, should
publishers have such URIs in \xmlel{subject}?  Or should they rather use
just the last URI segment for conciseness?  Or perhaps the preferred
labels, in keeping with the style of existing subject content and its
use by clients (which typically look for natural language in subject),
even though the labels are not considered stable?

Regardless of how VOResource clarifies this matter, UAT artefacts (e.g.,
SKOS files) do not match some of our other requirements.  In particular,
the human-readable URIs from \ref{req:lists}, the specific way we
satisfy \ref{req:machine}, and the non-RDF requirement \ref{req:nordf} are
not immediately satisfied by the UAT as distributed at the time of
writing.

For simple, uniform use of such externally curated vocabularies, it
should be possible to have some sort of endorsement process and then
distribute the vocabularies in a form compliant with this specification.
This will entail IVOA-specific concept URIs, and we must be able to
express that these resources have the same meaning as the ones
externally maintained.


\subsection{Non-Requirement}

This specification is not called ``Semantics in the VO'' or the like
because we do \emph{not} intend to prescribe ways to turn any VO
artefact into RDF triples.  Indeed, for many existing vocabularies, it
is left open what exactly the domain or range of properties might be or
what subject and predicate the classes or concepts should be used with.

This is partly because this would substantially complicate the
generation of vocabularies -- which would quickly turn into proper
ontologies --, partly because the information encoded by
the triples has traditionally been expressed using techniques developed
by the Data Models working group.

In particular with a view to later use in linked data scenarios,
vocabulary authors should nevertheless take care that, given appropriate
properties or annotation tools, the vocabularies \emph{could} be used in
meaningful RDF triples.

Conversely, this specification is written with future ``deeper''
semantics in the VO in mind; tools restricting their operations to the ones
discussed here should not break when future specifications enrich
existing vocabularies towards full ontologies.


\section{Using IVOA Vocabularies without RDF Tooling}
\label{sect:withoutrdf}

RDF is a
powerful system for expressing a wide range of semantics and enriching
various documents with semantic information in a globally distributed
fashion.  Due to its generality, handling its artefacts is relatively
involved and in general requires special tooling, non-negligible
investment in understanding RDF, and non-trivial management of URIs and
prefix mappings.

To lower the bar for an adoption of IVOA vocabularies
[requirement~\ref{req:nordf}], they are given in
two formats usable without RDF tooling or, indeed, deeper knowledge of
RDF.  This section discusses these.

\subsection{Choosing Terms From IVOA Vocabularies (non-normative)}

Resource annotators can usually treat IVOA Vocabularies as simple lists
of (case-sensitive) strings with human-readable labels and definitions.
These lists can be inspected with a simple web browser.

Each IVOA vocabulary has an associated URI starting with
\url{http://www.ivoa.net/rdf}.
Dereferencing that URI yields a list of
the vocabularies approved or under review.

An individual vocabulary has a
URI like \url{http://www.ivoa.net/rdf/refposition}. Dereferencing this URI
with a web browser (or, indeed, any user agent indicating it prefers
text/html media) redirects to a tabular representation of the vocabulary,
giving:
\begin{itemize}
\item \emph{terms} -- i.e., the strings actually used in annotation,
\item \emph{labels} -- i.e., strings that should be presented to humans instead of
the slightly formalised terms, and
\item \emph{descriptions}, which should
be sufficiently precise to allow someone with a certain amount
of domain expertise to decide whether a certain ``thing'' is or is not
covered by the term (or more precisely, the underlying concept).
\end{itemize}

Some terms may be marked as deprecated, in which case they should no
longer be used in new annotations. In most cases, deprecated terms will
come with information about what to use instead.

Some terms may be marked as preliminary. Such terms might disappear
without further notice. Casual users should avoid the use of such
terms; if they find they want to use them, the semantics working group
requests notification over its mailing list, since such use is clearly
relevant to the term's adoption process.

Once a term is located within the HTML page, annotators can usually
directly use it in instance documents. For instance, continuing the
refposition example, the string \texttt{BARYCENTER} found in the
vocabulary is directly used in VOTable's TIMESYS element.

Some applications (Datalink being the prime example) instead use URIs
relative to the vocabulary URI. In practical terms, this just means
that a hash sign is prepended to the term (e.g., \texttt{\#progenitor}).
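
As a non-authoritative illustration, the two ways of writing a term can
be sketched in Python as follows. The function names
\texttt{relative\_uri} and \texttt{full\_uri} are hypothetical and are
not part of any IVOA library; the only rule assumed is that a hash sign
separates the vocabulary URI from the term.

\begin{lstlisting}[language=python]
# Vocabulary URI of the running refposition example.
VOCAB_URI = "http://www.ivoa.net/rdf/refposition"


def relative_uri(term):
    """Return the Datalink-style form: a hash sign plus the term."""
    return "#" + term


def full_uri(vocab_uri, term):
    """Return the full concept URI: vocabulary URI, hash, term."""
    return vocab_uri + "#" + term


print(relative_uri("BARYCENTER"))
print(full_uri(VOCAB_URI, "BARYCENTER"))
\end{lstlisting}

Which of the bare term, the hash-prefixed form, or the full URI is
appropriate is defined by the standard using the vocabulary, not by
the annotator.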

The practice of using relative URIs builds on the property of IVOA
vocabularies that if
one adds the term as fragment to the vocabulary URI (e.g.,
\url{http://www.ivoa.net/rdf/refposition#BARYCENTER}), that URI is the full,
RDF-compliant resource identifier of the concept. When used in
HTML-aware user agents (such as a web browser), dereferencing this URI
(i.e., opening it) will give the table of terms with the chosen term
highlighted. How exactly this is represented depends on the user agent.


\subsection{Semantic Operations Without RDF Tooling}
\label{sect:desise}

Many VO components need a machine-readable representation of the
entire vocabulary, for instance in order to
(cf.~sect.~\ref{sect:usecases}):

\begin{compactitem}
\item display labels and descriptions for terms to users,
\item perform query expansion or similar exploitation of hierarchical
relationships, or
\item validate annotated instances for the use of correct and current
terms.
\end{compactitem}

\subsubsection{Vocabularies in desise}

To let VO programs perform such tasks with minimal technical overhead,
in addition to the RDF artefacts described in
sect.~\ref{sect:deployment}, IVOA vocabularies are also available in an
ad-hoc format defined here for VO-internal use, nicknamed ``desise''
(``dead simple semantics''). Clients can retrieve vocabularies in
desise by requesting the vocabulary URI with the HTTP Accept header set
to \texttt{application/x-desise+json}.

What is returned is a JSON-encoded \citep{std:JSON} mapping (``object''
in JSON terms)
containing the following keys (all mandatory):

\begin{description}
\item[uri] The vocabulary URI. All terms occurring in desise documents
can be turned into full, RDF-compliant resource URIs by prefixing them
with this URI and a hash character.

\item[flavour] The flavour of the vocabulary (can generally be ignored;
see sect.~\ref{sect:voccontent}).

\item[terms] A JSON object mapping the (machine-readable) terms to a
JSON object giving the term's properties as described below.
The keys in \textit{terms} are the strings used in
machine-readable data.
\end{description}

The JSON objects present as values in the terms object can have the
following keys:

\begin{description}
\item[label] (mandatory)
A human-readable label for display purposes; clients should
always try to display this rather than the raw term.

\item[description] (mandatory) A human-readable definition of the underlying
concept.

\item[deprecated] present and mapped to a reserved value if the term is
deprecated and should no longer be used; validators will warn against
its use.

\item[preliminary] present and mapped to a reserved value if the term
is preliminary, meaning that in contrast to the other, ``eternal'' terms
it can disappear again; validators should qualify a validation as
preliminary if a document uses such a term.

\item[wider] (mandatory) A JSON array
of ``wider'' terms. Most IVOA vocabularies are
tree-like, and for them, there is at most one term in here, which
is the parent node, i.e., the hypernym of the current term.
In SKOS-flavoured vocabularies, multiple terms can be here, and the
meaning of ``wider'' is a bit less clear-cut. The \textit{wider} list
is empty for top-level terms.

\item[narrower] (mandatory) A JSON array
of ``narrower'' terms. In SKOS-flavoured
vocabularies, that is just a list of all terms that list the current
term as wider. Otherwise, the vocabularies are tree-like and
\textit{narrower} is a list of all terms on the term's branch and below
it in the tree (it is the transitive closure of the inverse of
``wider'').
This is much more easily understood in an example, which we
give below in the discussion on addressing use case~\ref{uc:links}.
\end{description}

Note that, while \textit{wider} and \textit{narrower} are mandatory
keys, their values can of course be empty lists.

See appendix~\ref{app:desiseexample} for an example of a vocabulary
represented in desise.

\subsubsection{Working with desise (non-normative)}

For illustration, here are recipes showing how to address
the various use cases in Python:

\paragraph{Load a vocabulary} Using the popular requests module:\\
\begin{lstlisting}[language=python]
import requests

# Content negotiation: the Accept header selects the desise rendition.
voc = requests.get(
  "http://www.ivoa.net/rdf/uat",
  headers={"accept": "application/x-desise+json"}
  ).json()
\end{lstlisting}

Note, however, that non-trivial clients should cache files retrieved in
this way for a reasonable time span; IVOA vocabularies typically remain
unchanged for months at a time.
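
One way to follow this advice is sketched below; it is an illustration
under stated assumptions rather than a recommended implementation. The
function names \texttt{fetch\_desise} and \texttt{load\_vocabulary}, the
cache file naming, and the 30-day default lifetime are all hypothetical
choices of this sketch; only the media type and the use of the requests
module are taken from the recipe above.

\begin{lstlisting}[language=python]
import json
import os
import time
from urllib.parse import quote

DESISE_TYPE = "application/x-desise+json"


def fetch_desise(voc_uri):
    """Retrieve a vocabulary in desise over HTTP."""
    import requests  # only needed when the cache is stale
    response = requests.get(
        voc_uri, headers={"accept": DESISE_TYPE})
    response.raise_for_status()
    return response.json()


def load_vocabulary(voc_uri, cache_dir,
        max_age=30*24*3600, fetch=fetch_desise):
    """Return the desise dictionary for voc_uri via a file cache.

    max_age (in seconds) and the file naming are assumptions of
    this sketch; the specification does not define them.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Percent-encode the URI to obtain a safe, unique file name.
    cache_name = os.path.join(
        cache_dir, quote(voc_uri, safe="") + ".json")

    if os.path.exists(cache_name):
        age = time.time() - os.path.getmtime(cache_name)
        if max_age > age:
            with open(cache_name, encoding="utf-8") as f:
                return json.load(f)

    # No sufficiently recent copy: retrieve and store it.
    voc = fetch(voc_uri)
    with open(cache_name, "w", encoding="utf-8") as f:
        json.dump(voc, f)
    return voc
\end{lstlisting}

Passing a different \texttt{fetch} callable lets the function be
exercised without network access, for instance in unit tests.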

\paragraph{See if a term is in the vocabulary} (\ref{uc:simplevoc},
\ref{uc:votvoc})\\ \lstinline{term in voc["terms"]}

\paragraph{See if a term is deprecated} (\ref{uc:deprecation})\\
\lstinline{"deprecated" in voc["terms"][term]}

\paragraph{Find a human-readable label for a term}
(\ref{uc:discovering})\\
\lstinline{voc["terms"][term]["label"]}

\paragraph{Find a human-readable description for a term}
(\ref{uc:discovering})\\
\lstinline{voc["terms"][term]["description"]}

\paragraph{Find out if a term is preliminary} (\ref{uc:simplereview})\\
\lstinline{"preliminary" in voc["terms"][term]}

\paragraph{Query expansion: select branch} (in \ref{uc:links}, select all
progenitors, including flat fields, dark frames, etc.)
\begin{lstlisting}[language=python]
base_term = "progenitor"
# The term itself plus everything below it in the tree.
expanded_terms = set(
  [base_term]
  +voc["terms"][base_term]["narrower"])
# Datalink semantics look like "#progenitor"; [1:] drops the hash.
is_match = datalink_row["semantics"][1:] in expanded_terms
\end{lstlisting}

\paragraph{SKOS-type query expansion by neighbouring terms}
(\ref{uc:filtering})
\begin{lstlisting}[language=python]
assert voc["flavour"]=="SKOS"
# The term itself plus its direct narrower and wider neighbours.
expanded_terms = set(
  [base_term]
  +voc["terms"][base_term]["narrower"]
  +voc["terms"][base_term]["wider"])
is_match = keyword_found in expanded_terms
\end{lstlisting}


\section{Vocabulary Content}
\label{sect:voccontent}

IVOA vocabularies MUST be based on W3C's Resource Description Framework.
Details on required serialisations are given in
sect.~\ref{sect:deployment}. This section deals with what kinds of
statements users of IVOA vocabularies SHOULD evaluate to ensure
interoperability. Statements of other types are legal in IVOA
vocabularies but are not expected to be interpreted interoperably.
Clients MAY ignore them.

In IVOA vocabularies, the concept URI MUST begin with
\url{http://www.ivoa.net/rdf}\footnote{In retrospect, the unnecessary
``www'' in this URI is somewhat regrettable, but existing vocabularies
have used URIs including it, and it seems a small price to pay for
having uniform URIs.}. It is recommended to not introduce
additional hierarchy levels, i.e., vocabulary URIs SHOULD be direct children
of \texttt{rdf}\footnote{Some existing vocabularies do not follow this
rule; since vocabulary URI changes will break certain usage scenarios,
their URIs are still retained.}.

Since all vocabularies specified here are
single-file, the full term (i.e., RDF resource)
URI is formed by appending a hash sign
and a fragment identifier to the vocabulary URI. In IVOA vocabularies,
this fragment
identifier MUST consist of ASCII letters, digits, underscores and
dashes exclusively [for requirement~\ref{req:machine}].

The fragment identifiers in the term URIs SHOULD be
human-readable, usually by suitably contracting the
preferred label. In the IVOA, we do \emph{not} use natural
language-neutral concept identifiers but instead expect that domain
experts will already have an impression of a term's meaning from looking
at its URI.

Examples of URIs in the recommended form include:

\begin{itemize}
\item \url{http://www.ivoa.net/rdf/ivoasem#preliminary} for the
property marking a term as preliminary according to this specification.
\item \url{http://www.ivoa.net/rdf/timescale#TT} for the Terrestrial Time
time scale.
\item \url{http://www.ivoa.net/rdf/uat#active-galactic-nuclei} for the
concept ``Active Galactic Nuclei''.
\end{itemize}

In this specification, we distinguish three different ``flavours'' of
vocabularies. Each covers a particular domain of problems and is
therefore subject to different requirements.
Although the requirements are largely compatible, each vocabulary must
be clearly identified as \emph{either} giving SKOS concepts, RDFS
classes or RDF properties so clients know how to extract word lists and
hierarchies; see sect.~\ref{sect:genprop}
for details.


\subsection{SKOS Vocabularies}
\label{sect:skosvoc}

SKOS vocabularies should be used where terms are organised
in informal (i.e., not necessarily strict is-a)
hierarchies. The classic use case here is query expansion, where, for
instance, a search for ``AGN'' might be expanded to include matches for
``accretion disk'' (under certain circumstances).

The terms in SKOS vocabularies have the RDF type \vocterm{skos:Concept}.

\subsubsection{Properties in SKOS Vocabularies}
\label{sect:skosvoc-prop}

IVOA SKOS vocabularies use the following properties:

\begin{itemize}
\item \vocterm{skos:broader} -- interpreted in the standard SKOS sense.
The reverse property, \vocterm{skos:narrower}, MAY be given, but clients
MUST NOT depend on its presence [this satisfies
requirement~\ref{req:hierarchy}].

\item \vocterm{skos:prefLabel} -- all concepts MUST have an
English-language preferred label, which is an RDF plain literal [by
requirement~\ref{req:mtm}]. No RDF language label is allowed on the
literal, and only one preferred label is permitted
[these help requirement~\ref{req:nordf}].

\item \vocterm{skos:definition} -- all concepts MUST have a non-trivial
English-language definition. It is obviously impossible to define
``non-trivial'' in a rigorous way; a suggested criterion is that a
domain expert would, given the definition, presumably arrive at a
similar preferred label, and recursive definitions (i.e., those using
the label itself) should be avoided whenever possible.
Definitions in
non-English languages are not permitted, and only one definition is
permitted [again, this helps requirement~\ref{req:mtm}].

\item \vocterm{skos:exactMatch} -- for externally managed vocabularies
the IVOA has endorsed (see sect.~\ref{sect:externally-managed}), this
property links the IVOA term (subject) to the external RDF resource
(object) [mostly for requirement~\ref{req:external}].

\item General properties discussed in sect.~\ref{sect:genprop} [this is
for requirements~\ref{req:deprecating} and
\ref{req:preliminary}]. The \vocterm{ivoasem:vocflavour} of these
vocabularies is \verb|SKOS|.
\end{itemize}

This specification does not include requirements on the use or the
interpretation of \vocterm{skos:related},
\vocterm{skos:closeMatch}, \vocterm{skos:broadMatch},
\vocterm{skos:narrowMatch}, \vocterm{skos:ConceptScheme},
\vocterm{skos:inScheme}, \vocterm{skos:hasTopConcept},
\vocterm{skos:altLabel}, and \vocterm{skos:hiddenLabel}. If use cases
are found that require those, this specification will be amended. Until
then, vocabulary authors SHOULD NOT use them in order to avoid creating
practices that might conflict with later usage patterns.

This specification does not include requirements on the use or the
interpretation of the transitive SKOS properties
(\vocterm{skos:broaderTransitive}, \vocterm{skos:narrowerTransitive}).
At this point, we believe that applications requiring this type of
reasoning-friendly semantics should preferably use RDF class
vocabularies.

\subsubsection{Example (non-normative)}

Here is a term from a SKOS vocabulary conforming to this specification
in RDF/XML serialisation:

\begin{lstlisting}[language=XML]

AGN
A compact object in the center of a galaxy showing
unusual emission ("active galactic nucleus").

\end{lstlisting}

\subsection{RDF Properties Vocabularies}
\label{sect:refpropvoc}

RDF properties vocabularies should be used when the terms in the
vocabulary are mainly used to state
relationships between entities that can sensibly be imagined as
resources in the RDF sense. Such terms would naturally be used as
predicates in RDF triples. Obvious examples might be something
like is-progenitor-for in a provenance chain or, indeed, the special
properties for IVOA vocabularies introduced in sect.~\ref{sect:genprop}.


The terms in RDF Properties vocabularies have the RDF type
\vocterm{rdf:Property}.

\subsubsection{Properties in RDF Properties Vocabularies}
\label{sect:propvoc-prop}

IVOA RDF properties vocabularies use the following properties (where
not specified, the requirements considered essentially match those in
sect.~\ref{sect:skosvoc-prop}):

\begin{itemize}
\item \vocterm{rdfs:label} -- all terms MUST have an English-language
label, and clients should prefer it over the fragment in the
term URI for presentation purposes. Only
one such label is permitted.

\item \vocterm{rdfs:comment} -- all concepts MUST have a non-trivial
English-language comment serving as a human-oriented definition of the
term. The considerations for \vocterm{skos:definition} in
sect.~\ref{sect:skosvoc-prop} apply. As for those, only one
\vocterm{rdfs:comment} per term is allowed.

\item \vocterm{rdfs:subPropertyOf} -- interpreted as in RDFS to induce
the hierarchy of terms; a term MUST NOT appear as subject of more than
one \vocterm{rdfs:subPropertyOf} triple (i.e., the hierarchy is a tree).

\item General properties discussed in sect.~\ref{sect:genprop}.
The \vocterm{ivoasem:vocflavour} of these vocabularies is
\verb|RDF Property|.

\end{itemize}

\subsubsection{Example (non-normative)}
\label{sect:rdfpxex}

\begin{lstlisting}[language=XML]

preview of the data as a 2-dimensional
image
Image preview

\end{lstlisting}


\subsection{RDF Class Vocabularies}

RDF class vocabularies should be used when the terms in the vocabulary
are reasonably class-like, i.e., would usually be either subjects or
objects in RDF triples. As opposed to SKOS vocabularies, the hierarchy
implied is strict in the sense of \vocterm{rdfs:subClassOf}
(roughly: statements that are true for a wider term must be true
for a more specialised term, too). This lets clients confidently perform
inferences.

For instance, coordinates in the FK4 reference frame are equatorial, and
thus even a client unfamiliar with the FK4 frame as such can confidently
infer that the coordinates are right ascension and declination, and that
right ascensions increase eastwards. Reasoning of this type is
impossible within a SKOS vocabulary.

The terms in RDF Class vocabularies have the RDF type
\vocterm{rdfs:Class}.

\subsubsection{Properties in RDF Class Vocabularies}
\label{sect:classvoc-prop}

IVOA RDF class vocabularies use the following properties:

\begin{itemize}
\item \vocterm{rdfs:label} -- all terms MUST have an English-language
label, and clients should prefer it over the term (the fragment of the
term URI) for presentation purposes. Only
one such label is permitted.

\item \vocterm{rdfs:comment} -- all concepts MUST have a non-trivial
English-language comment serving as a human-oriented definition of the
term. The considerations for \vocterm{skos:definition} in
sect.~\ref{sect:skosvoc-prop} apply. As for those, only one
\vocterm{rdfs:comment} per term is allowed.

\item \vocterm{rdfs:subClassOf} -- interpreted as in RDFS to induce
the hierarchy of terms; a term MUST NOT appear as subject of more than
one \vocterm{rdfs:subClassOf} triple (i.e., the hierarchy is a tree).

\item General properties discussed in sect.~\ref{sect:genprop}.
The \vocterm{ivoasem:vocflavour} of these vocabularies is
\verb|RDF Class|.
\end{itemize}

\subsubsection{Example (non-normative)}

Here is a term from an RDF class vocabulary conforming to this
specification in RDF/XML serialisation:

\begin{lstlisting}[language=XML]

Positions based on the 5th Fundamental Katalog. If no equinox is
[...]

FK5

\end{lstlisting}

\subsection{General Properties}
\label{sect:genprop}

To cover requirements~\ref{req:deprecating} and
\ref{req:preliminary} and to facilitate the handling of vocabularies not
directly retrieved via HTTP (which means that the application may not
know the vocabulary URI a priori; cf.~requirement~\ref{req:standalone}),
the Semantics WG defines some
properties of its own in the vocabulary
\url{http://www.ivoa.net/rdf/ivoasem}. The following properties may be
used in all three vocabulary flavours:

\begin{itemize}
\item \vocterm{dc:created} -- IVOA vocabularies MUST include exactly one
triple with the vocabulary as subject and a predicate
\vocterm{dc:created}. The object is the datestamp of the vocabulary in
YYYY-MM-DD format. Clients may only use this for debugging and similar
purposes.

\item \vocterm{ivoasem:vocflavour} -- IVOA vocabularies MUST include
exactly one triple with the vocabulary as subject, this property as
predicate, and a string literal as object
specifying the kind of vocabulary as per this specification.
The
``General properties'' bullet points of sects.~\ref{sect:skosvoc-prop}
(\verb|SKOS|), \ref{sect:propvoc-prop} (\verb|RDF Property|), and
\ref{sect:classvoc-prop} (\verb|RDF Class|) define what strings may occur
here.

\item \vocterm{ivoasem:preliminary} -- this property indicates
that a term is preliminary and might disappear from the
vocabulary without warning. The object of triples using it
is a blank node. Validators need not warn against the use
of preliminary terms, but as they encounter them, they SHOULD
qualify their validation to the effect that it is temporary.

\item \vocterm{ivoasem:deprecated} -- this property indicates
that a term is deprecated. The object of triples using it
is a blank node. Validators SHOULD issue warnings if such terms
are encountered.

\item \vocterm{ivoasem:useInstead} -- for a deprecated term, the
objects of RDF triples using this property indicate
which terms should be
used instead of the deprecated one. This property MUST NOT be used with
non-deprecated subjects.

\end{itemize}

\subsubsection{Example (non-normative)}

The following snippets show RDF/XML triples using the common terms,
taken from the existing relationship\_type vocabulary; the notation
\verb|__| as a blank node is an implementation detail and must not be
relied upon. In general, where ivoasem properties take blank nodes as
objects, clients should normally just ignore the objects.

\begin{lstlisting}[language=XML]

2016-08-17

RDF Property

\end{lstlisting}


\section{Vocabulary Management}
\label{sect:management}

This section discusses the processes through which new vocabularies can be
defined and how vocabulary updates are performed in a way
that ensures community participation and at least a minimal level of
consensus. Procedures here primarily address requirements
\ref{req:consensus}, \ref{req:evolution} and \ref{req:traceable}.

In the following, the phrase ``chair of the Semantics WG'' is understood
to mean ``chair or vice-chair of the Semantics WG''; in the unlikely
situation that chair and vice-chair disagree, the resolution of the
problem is up to the TCG chair.


\subsection{New Vocabularies}
\label{sect:new-vocabularies}

New vocabularies in the VO should be introduced with a document going
through the normal IVOA approval process, i.e., intended to become a
recommendation or an endorsed note, with RFC as described in the IVOA
Document Standards \citep{2017ivoa.spec.0517G}.

At the discretion of the chair of the Semantics WG, the vocabulary is
uploaded to the vocabulary repository when a document reaches the state
of a Working Draft. At the latest, the vocabulary is uploaded when the
document becomes a Proposed Recommendation or a Proposed Endorsed Note
in order to support a thorough review and reference implementations.

The entire vocabulary is marked human-readably as preliminary in the
vocabulary index (cf.~sect.~\ref{sect:deployment}). All terms in the
vocabulary are marked as preliminary using the
\vocterm{ivoasem:preliminary} property (cf.~sect.~\ref{sect:genprop}) in
order to satisfy requirement~\ref{req:preliminary}.

The entire new vocabulary gets approved as the document introducing it
reaches the status of Recommendation or Endorsed Note. At that point,
all its terms cease to be preliminary. From then
on, it is managed by the Semantics WG using the process defined in
the next section.

Once approved (i.e., no longer marked as preliminary),
terms in IVOA vocabularies cannot be removed. They can,
however, be marked as deprecated.

\subsection{Updating Vocabularies}
\label{sect:updating-vocabularies}

IVOA vocabularies can be extended as domain requirements develop
[requirement~\ref{req:evolution}]. Clients
should therefore be designed such that they gracefully deal with terms
that have not been part of the vocabulary at build time, typically by
exploiting information in the vocabulary, perhaps by falling back to
wider, known terms, or by presenting their users labels and descriptions
for terms not explicitly handled.


\subsubsection{Vocabulary Enhancement Proposals}

To add one or more terms to a vocabulary, to introduce deprecations or
to change term labels, descriptions, or relationships,
an interested party -- not necessarily affiliated with the Working Group
that has originally introduced the vocabulary -- prepares a Vocabulary
Enhancement Proposal (VEP). In the interest of thorough review and
topical discussion, a single VEP should only cover directly related
terms. For instance, in a vocabulary of reference frames, it would be
reasonable to add old-style and new-style galactic frames in one
VEP, but not, say, azimuthal and supergalactic coordinates.
The
arguments for both terms in the former pair are rather
analogous\footnote{This does not rule out that, in the example, one
might argue that old-style galactic coordinates are so ancient that
perhaps they should not be supported in the VO at all; the chair of the
Semantics WG might then decree that the VEP still needs to be split.}.
In the latter case, two very different rationales would have
to be put forward, which is a clear sign that two VEPs are in order.

\begin{figure}
\begin{verbatim}
Vocabulary: http://www.ivoa.net/rdf/datalink/core
Author: msdemlei@ari.uni-heidelberg.de
Date: 2019-07-19

Term: IsPreviousVersionOf
Action: Addition
Label: Newer Version
Description: This dataset in a previous edition, e.g., processed
  with an older pipeline, as part of an older data release.
Relationships: rdfs:subPropertyOf(this)
Used-in: http://example.org/datalink?ID=doc-v1

Term: IsNewVersionOf
Action: Addition
Label: Previous Version
Description: This dataset in a newer edition, e.g., processed
  with a newer pipeline, as part of a newer data release.
Relationships: rdfs:subPropertyOf(this)
Used-in: http://example.org/datalink?ID=doc-v2

Rationale:

The terms are mainly intended for projects with data releases.
IsPreviousVersionOf allows services to mark up links to (typically
datalink documents for) later version(s) of this data set. It
allows a client to alert users that a newer, probably improved,
rendition of the current dataset is available and should
presumably be used instead of what they are looking at. The
inverse relationship, IsNewVersionOf, is useful if projects want
to keep previous versions of the dataset findable without having
them show up in the default queries.

The terms are taken from the relationship types of DataCite.
\end{verbatim}

\caption{A sample VEP.}
\label{fig:vepsample}
\end{figure}

A VEP is a semistructured text file containing the following items:

\begin{itemize}
\item \vepitem{Vocabulary:} The URI of the vocabulary.
\item \vepitem{Author:} Contact information for the author(s) of
the VEP.
\item \vepitem{Date:} The date on which the VEP was posted.
\item \vepitem{Term:} The identifier of the term to be added, modified,
or deprecated.
\item \vepitem{Action:} One of \textit{Addition}, \textit{Deprecation}, or
\textit{Modification}.
\item \vepitem{Label:} The English-language, human-readable label of the term.
\item \vepitem{Description:} The description that will come with the term.
\item \vepitem{Relationships:} If applicable, relationships the new
term will have to existing terms, using the properties defined in
the present document.
\item \vepitem{Used-In:} At least one URI of a document using the
proposed term.
\item \vepitem{Rationale:} A discussion of use cases, the role of the term in
the vocabulary, and the like. In particular, the item(s) in Used-In
should be commented on.
\end{itemize}

The items \vepitem{Term}, \vepitem{Action}, \vepitem{Label},
\vepitem{Description}, \vepitem{Used-In},
and \vepitem{Relationships} may be repeated if
multiple terms are affected by a VEP. In \textit{Addition} VEPs, all items
except \vepitem{Relationships} are mandatory.

When \vepitem{Action} is \textit{Deprecation}, \vepitem{Label},
\vepitem{Description}, and \vepitem{Relationships} are optional but can be
given if useful for understanding the VEP. The rationale MUST discuss
the reasons for a deprecation. Usually, one or more replacement
term(s) will be proposed within the same VEP.

When \vepitem{Action} is \textit{Modification}, \vepitem{Label},
\vepitem{Description}, and \vepitem{Relationships} give the proposed new
values of the term. The term itself cannot be modified. The rationale
will usually detail the changes proposed while mentioning the previous
values.

We do not expect the VEPs to be evaluated by machines. Therefore, we
define no grammar for the markup of sections, section headers, and their
content. It is still recommended that authors follow the formatting of
the example in Fig.~\ref{fig:vepsample}.

\subsubsection{Publishing a VEP}

To publish a VEP, its author sends it to the chair of the Semantics WG,
preferably by e-mail. The chair of the Semantics WG will perform a
formal validation, in particular as regards the presence of all required
items and syntactically valid relationships. No assessment of the
contents is done at this stage.

Formally valid VEPs then receive a running number. The first VEP was
VEP-0001, the second VEP-0002, and so on. The chair of the Semantics WG
then adds the new VEP to the public index of VEPs as
``Current'' (see Appendix~\ref{app:curtech} for the technical details).
This index has a link to each VEP's text (in general, a location in a
version control system).

Once the VEP is uploaded, it is announced to the IVOA Semantics Working
Group and all other IVOA Working Groups concerned (again, the technical
details are found in Appendix~\ref{app:curtech}). The chair of the
Semantics WG can extend the distribution as they see fit. The
announcement in particular contains a copy of the VEP in question.

As soon as possible after the upload, the chair of the Semantics WG adds
any term(s) proposed to the vocabulary as a preliminary term using the
\vocterm{ivoasem:preliminary} property.
This means that the terms can
immediately be used without raising warnings or errors, but in contrast
to approved terms, they may disappear again. Deprecation or
modification VEPs have no immediate effect.

\subsubsection{Approval Process}
\label{sect:approval}

Discussion of a VEP takes place in the WGs' discussion forums (again,
see Appendix~\ref{app:curtech}). The chair of the Semantics WG will
summarise the discussion in the VEP in a \textit{Discussion} section.

During the process, all parts of the VEP may be changed except the
term(s) proposed.

Once the chair of the Semantics WG sees a sufficient consensus reached,
they announce the VEP in the TCG. If, at the next meeting of the TCG,
no Working Group objects to the VEP, it is accepted and the marker that
a term is preliminary is removed from the relationships of any terms
added by the VEP. In the case of deprecation or modification VEPs, the
requested actions are taken at this point.

If, on the other hand, discussion of an addition request results in the
realisation that terms proposed need to be changed, the VEP in question
must be withdrawn and its effects on the vocabulary undone; zero or
more new VEPs are then posted containing proposals for terms for which
consensus appears feasible. The VEP withdrawn receives a
\vepitem{Superceded-by} item referencing any new VEPs, and any new VEPs
have a \vepitem{Supercedes} item referencing the original VEP.

\subsubsection{Guidelines for Creating Concepts (non-normative)}

When introducing terms, it is useful to consider a very simple
semantic model, where the world is a set of (tangible or non-tangible)
``things'' in the sense of naive set theory.

A vocabulary has a scope, which is a subset of the world; this could be
``reference systems'' or ``astronomical object types'' or even something
as concrete as ``observatories''.

In this picture, a term denotes a certain subset of a vocabulary's
scope. This set is called the term's (or, where an additional level
between the concrete letters making up the term as defined by this
document and the set is useful, the concept's) ``extension''.

Now, in an ideal vocabulary the extensions of its
top-level terms are disjoint (meaning: each thing in scope of the vocabulary
belongs to not more than one top-level term's extension) and the terms cover the
entire scope (meaning: for each thing in the scope, there is at least
one term's extension that contains that thing). The top-level terms then
correspond to equivalence classes over the vocabulary's scope.

Where vocabularies are hierarchical, analogous considerations would
apply for the extensions of a general term and its more specialised
terms.

When natural language and the real world are involved,
this ideal generally is unreachable.
But when proposing a term and its definition, authors should try to
make sure that

\begin{compactenum}
\item their new term has a useful extension (i.e., consumers actually
want to know whether a thing is or is not inside it)
\item the extension is reasonably disjoint from existing terms, or is a
proper superset (in which case the other terms are narrower), or is a proper
subset (in which case they are wider) of other terms' extensions.
\end{compactenum}

Put another way: When designing terms, it is as important to say what is
not covered as to clearly say what is.

This is a major reason why it is important to give clear definitions
whenever these definitions are not uniquely given by the domain.
For
instance, while an object type vocabulary probably does not need to be
very diligent in defining $\delta$~Cephei stars because the extension of
that term is uncontroversial to first order\footnote{Although it might
seem desirable to clarify whether, say, W~Virginis stars are or are not
excluded.}, a term like ``dataset'' should come with a precise
definition, ideally containing a reference to a longer explanation.

\subsection{Externally Managed Vocabularies}
\label{sect:externally-managed}

The IVOA is not the only body developing vocabularies, and of course VO
components are free to use other, non-IVOA vocabularies whenever
convenient or even required for interoperability beyond the IVOA.

Sometimes, however, it is advantageous to subject an external vocabulary
to the requirements set forth by this specification. The motivating use
case here is \ref{uc:uat}, the Unified Astronomy Thesaurus. As derived
in requirement~\ref{req:external}, multiple considerations make a
``mirror'' of the vocabulary in the IVOA RDF repository highly
desirable. Regrettably, since RDF resources (i.e., what we call terms
here) are identified by their full URIs, this will create new RDF
resources, and hence care must be taken that RDF tools can work out the
identity of the mirrored IVOA terms and the original RDF resources.

Also, the processes from sects.~\ref{sect:new-vocabularies}
and~\ref{sect:updating-vocabularies} obviously cannot apply to such
vocabularies, which have their own management procedures.

To address these issues, the following rules apply:

When a vocabulary managed by an IVOA-external body needs to be made
available in the form prescribed by this specification, a proposal for
doing this needs to pass the endorsed notes process of the IVOA as laid
out in the IVOA Document Standards \citep{2017ivoa.spec.0517G}. As it
concerns external relationships of the IVOA, it additionally needs
endorsement by the IVOA Executive Committee to become effective.

This proposal has to specify:
\begin{itemize}
\item The basic metadata for the vocabulary on the IVOA side.
\item The rules for mapping the external RDF resource URIs to IVOA term
URIs, together with a plan for how this mapping is kept stable.
\item If external RDF triples are discarded during the mapping of the
vocabulary (which likely is necessary to ensure adherence to our
constraints), which triples are discarded.
\item A description of and reference to software that performs this
mapping.
\item A description of the external management process.
\end{itemize}

The proposing party has to provide software to automatically translate
resources from the external format to a suitable input for the IVOA
vocabulary tooling.

Each term in the IVOA vocabulary mirror MUST declare its identity to
the original, external RDF resource. At this point, this is only
defined for SKOS-flavoured vocabularies, where the IVOA term must be the
subject of exactly one triple with the \vocterm{skos:exactMatch}
property. The object of that triple is the URI of the external RDF
resource.

For other flavours, no such mechanism is defined in this version of the
specification, which means that for now, externally managed vocabularies
must use the SKOS flavour.
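
The identity rule above lends itself to a mechanical check. The
following is a minimal sketch only, not part of the IVOA tooling: it
assumes the mirrored vocabulary has already been parsed into plain
(subject, predicate, object) tuples of URI strings, and the function
name \verb|check_identity| as well as the example URIs are hypothetical.
It reports every term that is not the subject of exactly one
\vocterm{skos:exactMatch} triple.

\begin{lstlisting}[language=python]
# Full URI of skos:exactMatch in the W3C SKOS core namespace.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def check_identity(term_uris, triples):
    """Map each offending term URI to its number of exactMatch triples.

    term_uris: iterable of IVOA term URIs (strings).
    triples: iterable of (subject, predicate, object) string tuples.
    An empty result means every term declares exactly one identity.
    """
    counts = {uri: 0 for uri in term_uris}
    for subject, predicate, _target in triples:
        if predicate == SKOS_EXACT_MATCH and subject in counts:
            counts[subject] += 1
    return {uri: n for uri, n in counts.items() if n != 1}


# Hypothetical data: term "a" is fine, "b" has no identity
# declaration, "c" has two.
VOCAB = "http://www.ivoa.net/rdf/example#"
EXTERNAL = "http://example.org/external/"
TERMS = [VOCAB + "a", VOCAB + "b", VOCAB + "c"]
TRIPLES = [
    (VOCAB + "a", SKOS_EXACT_MATCH, EXTERNAL + "a"),
    (VOCAB + "c", SKOS_EXACT_MATCH, EXTERNAL + "c"),
    (VOCAB + "c", SKOS_EXACT_MATCH, EXTERNAL + "c-alt"),
]
PROBLEMS = check_identity(TERMS, TRIPLES)
\end{lstlisting}

A check of this kind would most naturally run as part of the
synchronisation process, so that a mirror violating the rule is never
published.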

Once an external vocabulary is endorsed by both the TCG and the
Executive Committee, the chair of the Semantics working group has the
responsibility to keep the IVOA mirror of the vocabulary synchronised,
ideally by using a monitored, automated process like a post-commit
action on an external version control system.


\section{Publishing Vocabularies}
\label{sect:deployment}

This section is an adaptation of \citet{note:cooluris} and is
intended to satisfy requirements~\ref{req:machine}
and~\ref{req:mtm}. It also briefly discusses how IVOA vocabularies
should be referenced.

\subsection{Deploying Vocabularies}

All IVOA-approved vocabularies are accessible as children of
\url{http://www.ivoa.net/rdf}. Dereferencing that URI will lead to an
index of currently approved and proposed vocabularies.
Vocabularies still under review are clearly marked as such.

When dereferencing a vocabulary URI, clients will receive an HTTP 303
(See Other) code, with the \texttt{Location} header set to the latest
version of the vocabulary. The version is written as the date of the
last update in the format YYYY-MM-DD. Depending on the value of the
request's accept header, the redirect will end up at

\begin{itemize}
\item an HTML rendition of the vocabulary by default. The HTML element
corresponding to a term has the term (i.e., the fragment identifier in the
term's URI) as its HTML id; hence a URI
\verb|#| will immediately focus the term's HTML
rendition in common
user agents [requirement~\ref{req:mtm}].

\item a Turtle rendition of the vocabulary if the accept header
indicates that \verb|text/turtle| documents are preferred.

\item an RDF/XML rendition of the vocabulary
if the accept header indicates that
\verb|application/rdf+xml| documents are preferred.

\item an ad-hoc JSON rendition of the vocabulary as specified in
sect.~\ref{sect:desise} if the accept header indicates that
\verb|application/x-desise+json| documents are preferred.
\end{itemize}

Individual vocabularies may be available in additional formats.
Content negotiation might then consider additional media types.

Clients may record the full versioned URI of the vocabulary used for
debugging or provenance purposes. These URIs, however, MUST NOT be used
externally. In particular, a URI like
\url{http://www.ivoa.net/rdf/example/2019-07-14/example.html#term} has no
RDF meaning by this standard and must never be used in publicly visible
RDF triples. Always use URIs of the form
\url{http://www.ivoa.net/rdf/example#term}.

\subsection{Referencing Vocabularies}

Since IVOA vocabularies, at least after some time, generally are a
collective effort with a continuous evolution, it is inappropriate to
cite them in the conventional author-year-title format.

However, the vocabulary URI is intended to be stable and uniquely
identifies the vocabulary as such. Hence, this URI is what should
normally be cited. The standard style would be along the lines of
\begin{lstlisting}[language={}]
Terms in this field must be taken from the IVOA vocabulary
\url{http://www.ivoa.net/rdf/voresource/content_level}.
\end{lstlisting}
or, in formats where footnotes are appropriate and inline URIs should be
avoided for typographical reasons
\begin{lstlisting}[language={}]
Terms in this field must be taken from the IVOA vocabulary
\emph{Content levels for VO resources}\footnote{
\url{http://www.ivoa.net/rdf/voresource/content_level}}.
\end{lstlisting}
In the latter case, the footnote anchor should be the vocabulary name as
given in the
IVOA vocabulary repository\footnote{\url{http://www.ivoa.net/rdf}}.
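
On the client side, the distinction between the stable vocabulary URI
and the dated URIs that content negotiation redirects to can be sketched
as follows. This is an illustration under stated assumptions rather than
a reference implementation: the helper names \verb|is_versioned| and
\verb|desise_request| are hypothetical, and the test for a dated path
segment is merely inferred from the YYYY-MM-DD version format given in
the section on deploying vocabularies. The code only constructs a
request and performs no network access.

\begin{lstlisting}[language=python]
import re
import urllib.request

# Media type of the desise rendition as given in this document.
DESISE_MEDIA_TYPE = "application/x-desise+json"

# Assumption: a versioned URI contains a YYYY-MM-DD path segment.
_DATED_SEGMENT = re.compile(r"/\d{4}-\d{2}-\d{2}(/|$)")


def is_versioned(uri):
    """Return True if uri looks like a dated (versioned) URI."""
    return _DATED_SEGMENT.search(uri) is not None


def desise_request(vocabulary_uri):
    """Build a request asking for the desise rendition.

    Only the stable vocabulary URI is accepted; dated URIs are
    rejected so they do not spread into client configuration.
    """
    if is_versioned(vocabulary_uri):
        raise ValueError("Use the unversioned vocabulary URI")
    return urllib.request.Request(
        vocabulary_uri, headers={"Accept": DESISE_MEDIA_TYPE})


REQUEST = desise_request("http://www.ivoa.net/rdf/example")
# urllib.request.urlopen(REQUEST) would follow the 303 redirect;
# it is not called here to keep the sketch free of network access.
\end{lstlisting}

A client that wants to keep provenance information can still record the
final, dated URL it was redirected to, as long as that URL stays
internal.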

Except in the rare cases in which version-sharp references are actually
necessary (for instance, descriptions of errors), it is inappropriate to
reference URLs with dates (e.g.,
\url{http://ivoa.net/rdf/voresource/content_level/2016-08-17/}). URIs
to actual resources (e.g., the XML or Turtle renditions) must never be
used to reference vocabularies.

We do not see a relevant use case for having IVOA vocabularies formally
cited in reference sections of scholarly works: such references will not
aid in finding them, and there is no credible benefit in tracking their
usage from citation in literature.


\appendix
\section{The 2019 IVOA Vocabulary Toolset (non-normative)}
\label{app:tools}

This appendix describes the recommended toolset for authoring IVOA
vocabularies as of 2019. Vocabulary authors may decide to use other
tools but should consider that this may incur additional work for the
chair of the Semantics WG in later maintenance.

This appendix is non-normative. It will serve as documentation of the
toolset and will occasionally be updated as the tooling evolves;
vocabulary authors are still advised to inspect documentation within the
tools. Even major changes here will not lead to a new major version of
the standard.


\subsection{Input Format}

In the current tooling, RDF class and property
vocabularies are authored in simple CSV files
with five columns. These columns are:

\begin{description}
\item[term]
This is the actual, machine-readable vocabulary term. Only use
letters, digits, underscores, and dashes here. As specified in
sect.~\ref{sect:voccontent}, these identifiers should be
human-readable, even though they are not directly intended for human
consumption (clients will use the label).
In the interest of
reasonably compact URIs we advise keeping the length of the
terms below, say, 30 characters.
\item[level]
This is used for simple input of wider/narrower relationships.
It is 1 for ``root'' terms. Terms with a level of 2 that follow a
root term become its children, i.e., the tooling will add the
appropriate wider relationship between the level 2 and the level 1
term. You can nest, i.e., have
terms of level 3 below terms of level 2. Note that this means the
order of rows must be preserved in the CSV files: Do \emph{not} sort
vocabulary CSVs.
\item[label]
This is a short, human-readable label for the term. In the VO, this
is generally derived fairly directly from the content of the first
column, usually by
inserting blanks at the right places and fixing capitalisation.
\item[description]
This is a longer explanation of what the term means. We do not
support any markup here, not even paragraphs, so there is probably a
limit to how much can be communicated.
\item[more\_relations]
This column can be used to declare non-hierarchical relationships
and contains whitespace-separated declarations. Each declaration has
the form property[(term)]. Omitting the term is allowed for certain
properties; in RDF, this corresponds to a blank node. See below for
the common properties supported here. Plain terms are resolved
within the vocabulary, but CURIEs with known prefixes or full URIs are
admitted, too.
\end{description}

Non-ASCII characters are allowed in label and description; files must be
encoded in UTF-8. The column separator currently is required to be a
semicolon in order to save on escaping in descriptions (which very
commonly contain commas). Fields that contain semicolons are escaped
with double quotes; embedded double quotes are doubled.
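
To illustrate how the level column and the quoting rules play together,
here is a minimal reader written against the description above. It is a
sketch and not the code of the actual IVOA tooling: the function names
\verb|read_terms| and \verb|parse_relations| are hypothetical, the
sample rows are contrived, and the sketch assumes that the CSV file has
no header row.

\begin{lstlisting}[language=python]
import csv
import io
import re

# A declaration is property[(term)]; the target is optional.
_RELATION = re.compile(r"^([^()\s]+)(?:\(([^()]*)\))?$")


def parse_relations(cell):
    """Split a more_relations cell into (property, target) pairs.

    The target is None where the term was omitted.
    """
    relations = []
    for declaration in cell.split():
        match = _RELATION.match(declaration)
        if match is None:
            raise ValueError("Malformed declaration: %r" % declaration)
        relations.append((match.group(1), match.group(2)))
    return relations


def read_terms(csv_text):
    """Parse vocabulary CSV text, deriving wider terms from levels."""
    reader = csv.reader(
        io.StringIO(csv_text), delimiter=";", quotechar='"')
    terms = {}
    ancestors = []  # ancestors[i] is the last term seen at level i+1
    for row in reader:
        if not row or not row[0].strip():
            continue  # skip empty lines
        row = row + [""] * (5 - len(row))  # pad missing columns
        term, level, label, description, more = [
            cell.strip() for cell in row[:5]]
        level = int(level)
        if level < 1 or level > len(ancestors) + 1:
            raise ValueError("Bad level for term %s" % term)
        # Forget ancestors at this level and below; row order matters.
        del ancestors[level - 1:]
        terms[term] = {
            "label": label,
            "description": description,
            "wider": ancestors[-1] if ancestors else None,
            "relations": parse_relations(more),
        }
        ancestors.append(term)
    return terms


SAMPLE = (
    'EQUATORIAL;1;Equatorial;Umbrella term for equatorial frames;\n'
    'ICRS;2;ICRS;As defined by 1998AJ....116..516M.;\n'
    'B1875;2;Bonner Durchmusterung System;"Deprecated; use BD instead";'
    'ivoasem:deprecated ivoasem:useInstead(BD)\n'
    'BD;2;Bonner Durchmusterung System;'
    'The reference system implied by BD/CD;\n'
)
TERMS = read_terms(SAMPLE)
\end{lstlisting}

Because the parent of a term is simply the most recent row with the next
lower level, sorting the rows would silently attach terms to the wrong
parents; this is why the order of rows must be preserved.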

The following properties are supported in the more\_relations
column:

\begin{itemize}
\item \vocterm{ivoasem:deprecated} -- see sect.~\ref{sect:genprop}.
\item \vocterm{ivoasem:useInstead} -- see sect.~\ref{sect:genprop}.
\item \vocterm{ivoasem:preliminary} -- see sect.~\ref{sect:genprop}.
\end{itemize}

\subsection{Vocabulary Metadata}
\label{sect:vocmeta}

Global vocabulary metadata is kept in an INI-style format. The following
keys are understood:

\begin{description}
\item[timestamp]
A manually maintained date of the last modification. This is
essentially a version marker and should be changed only in preparation
for a release. It is recommended to set it to the intended release
date during development and not change it for every edit.
\item[title]
A human-readable short phrase saying what the vocabulary describes.
\item[flavour]
One of \textit{RDF Class}, \textit{RDF Property}, or \textit{SKOS}
(where SKOS currently expects RDF/XML serialised SKOS rather than CSV).
\item[description]
A longer text (about a paragraph) stating what the vocabulary should
be used for. No markup is supported here.
\item[authors]
Persons involved with the creation of the vocabulary. These are \emph{not}
the persons to ask for maintenance; all requests for changes should be
directed to the Semantics working group first.
\item[filename]
The tooling expects the input at
\verb|/terms.csv|. If it is kept elsewhere, give
the source file name here. This is to support legacy
vocabularies with nonstandard names and native SKOS input.
\item[draft]
While a vocabulary is still being reviewed in its entirety, add a key
draft set to \texttt{True}. This will add language to the effect that
terms may still vanish from the vocabulary and mark all terms as
preliminary.
Once the vocabulary is approved, this key is deleted.
\item[licenseuri]
IVOA-managed vocabularies are always made available under CC-0 and
hence do not use this key. External vocabularies as per
sect.~\ref{sect:externally-managed} may be subject to actual licences,
in which case this field holds a URI containing the licence's
conditions.
\item[licensehtml]
This is arbitrary HTML expressing whatever licence terms may be
attached to an external vocabulary. Again, do not use for IVOA
vocabularies.
\end{description}

Currently, the global metadata is maintained in a file
\verb|vocabs.conf| in the root of the vocabulary source repository, with one
section per vocabulary. The section name is the vocabulary name.

\subsection{Vocabulary Source Repository}

Vocabulary authors are encouraged to maintain their vocabularies in the
shared version control system of the IVOA. At the time of writing, this
is a subversion repository at
\url{https://volute.g-vo.org/svn/trunk/projects/semantics/voc-source}.

Authors of new vocabularies should create a child directory and place
their terms.csv file in there. They should then edit \verb|vocabs.conf|
and add a section named after their directory with the content discussed
in sect.~\ref{sect:vocmeta}.


\section{Current Network Resources (non-normative)}
\label{app:curtech}

This appendix details network resources used in vocabulary management.
It is non-normative and will occasionally be updated as the IVOA's
infrastructure evolves. Even major changes here will not lead to a new
major version of the standard.

The list of vocabulary enhancement proposals is maintained in the IVOA's
wiki at
\url{https://wiki.ivoa.net/twiki/bin/view/IVOA/VEPs}.
Approved VEPs will be moved to an archive page linked there.
VEPs may be added as attachments to this page, but authors are
encouraged to maintain them in version-controlled repositories instead.
The recommended place to do that is
\url{https://volute.g-vo.org/svn/trunk/projects/semantics/veps}.

The discussion of VEPs (see sect.~\ref{sect:approval}) is to take place
on the appropriate mailing list(s). See
\url{http://ivoa.net/members/index.html} for a directory of IVOA mailing
lists and their addresses.

\section{An Example for a Vocabulary in Desise (non-normative)}
\label{app:desiseexample}

The following example shows what a vocabulary in desise looks like. The
content is, superficial similarities to real vocabularies
notwithstanding, contrived.

\begin{lstlisting}[language=python]
{
  "uri": "http://www.ivoa.net/rdf/example",
  "flavour": "RDF Class",
  "terms": {
    "EQUATORIAL": {
      "label": "Equatorial",
      "description": "Umbrella term for all sorts of equatorial frames.",
      "narrower": ["ICRS", "ICRS2", "BD", "B1875"], "wider": []
    },
    "ICRS": {
      "label": "ICRS",
      "description": "As defined by 1998AJ....116..516M.",
      "wider": ["EQUATORIAL"], "narrower": []
    },
    "B1875": {
      "label": "Bonner Durchmusterung System",
      "description": "Deprecated term for the reference system implied by BD/CD",
      "deprecated": "",
      "wider": ["EQUATORIAL"], "narrower": []
    },
    "BD": {
      "label": "Bonner Durchmusterung System",
      "description": "The reference system implied by BD/CD",
      "wider": ["EQUATORIAL"], "narrower": []
    },
    "ICRS2": {
      "label": "ICRS 2",
      "description": "The reference system defined by 2027A&A..1234...12B",
      "preliminary": "",
      "wider": ["EQUATORIAL"], "narrower": []
    }
  }
}
\end{lstlisting}

\section{Changes from Previous Versions}

\subsection{Changes from WD-2020-06-12}

\begin{itemize}
\item No changes to normative material.
\item Adding a use case on vocabulary evolution and on VO-DML.
\item Various editorial changes.
\end{itemize}

\subsection{Changes from WD-2020-03-26}

\begin{itemize}
\item Desise term values are now dicts with label and description to
make it a bit more self-explanatory; this let us pull in preliminary,
deprecated, and wider as well.
\item Desise now contains an inversion of wider, narrower, with meanings
quite different between SKOS and the other flavours.
\item The main media type for Desise is now application/x-desise+json rather
than text/json because there is no text/json, and you can't have
content media type parameters on either.
\item Mentioning licenseuri and licensehtml in the non-normative part on
managing vocabulary metadata. Also stating there that IVOA-managed
vocabularies are CC-0.
\end{itemize}


\subsection{Changes from WD-2019-09-05}

\begin{itemize}
\item We no longer recommend that non-RDF clients use RDF/XML. We have
therefore removed the ``usage with plain XML tooling'' sections. We
have also removed the description of the revovo python module from the
toolset appendix.

\item Instead, we now have the custom ``desise'' format described in a
new section that doubles as a very quick introduction for adopters not
interested in RDF.

\item Adding a use case and requirement for the UAT (and, perhaps,
similar externally curated vocabularies). Adding a section on how
such vocabularies may be integrated into the IVOA RDF repository.

\item Now requiring a \emph{Used-in} item in addition VEPs, implying
that only terms that are already applied may be proposed.

\item Adding \emph{Supercedes} and \emph{Superceded-by} items,
formalising the previous language on ``splitting'' VEPs a bit.

\item Adding advice on referencing vocabularies.

\item We now demand a formal validation of VEPs by the semantics chair.
The responsibility for ``uploading'' the VEP, i.e., adding it to the VEP
index, is now assigned to them.

\item Adding a soapbox section with advice on what to do when proposing
new terms and introducing a naive semantics model.
\end{itemize}

\subsection{Changes from REC-1.19}

The present document is a full re-write of Version 1 of Vocabularies in
the VO. See sect.~\ref{sect:version1rel} for details.

\bibliography{local.bib,ivoatex/ivoabib,ivoatex/docrepo}


\end{document}
