/[volute]/trunk/projects/vocabularies/doc/vocabularies.xml

Annotation of /trunk/projects/vocabularies/doc/vocabularies.xml



Revision 537
Fri May 23 11:04:43 2008 UTC by norman.x.gray
File MIME type: text/xml
File size: 71742 byte(s)
Make the revision/date appear in this editors' draft

1 norman.x.gray 2 <?xml version="1.0" encoding="utf-8"?>
2     <!-- Based on template at
3     http://www.ivoa.net/Documents/templates/ivoa-tmpl.html -->
4     <html xmlns="http://www.w3.org/1999/xhtml"
5     xmlns:dc="http://purl.org/dc/elements/1.1/"
6     xmlns:dcterms="http://purl.org/dc/terms/"
7     xml:lang="en" lang="en">
8    
9     <head>
10     <title>Vocabularies in the Virtual Observatory</title>
11     <link rev="made" href="http://nxg.me.uk/norman/#norman" title="Norman Gray"/>
12     <meta name="author" content="Norman Gray"/>
13     <meta name="DC.subject" content="IVOA, Virtual Observatory, Vocabulary"/>
14     <meta name="rcsdate" content="$Date$"/>
15     <link href="http://www.ivoa.net/misc/ivoa_wd.css" rel="stylesheet" type="text/css"/>
16     <style type="text/css">
17 norman.x.gray 58 /* make the ToC a little more compact, and without bullets */
18 norman.x.gray 2 div.toc ul { list-style: none; padding-left: 1em; }
19 norman.x.gray 26 div.toc li { padding-top: 0ex; padding-bottom: 0ex; }
20 norman.x.gray 23 li { padding-top: 1ex; padding-bottom: 1ex; }
21 alasdair.gray 34 td { vertical-align: top; }
22 norman.x.gray 420 td.rdfxml { background: #ECC; }
23     td.turtle { background: #CEC; }
24 norman.x.gray 2 span.userinput { font-weight: bold; }
25     span.url { font-family: monospace; }
26 norman.x.gray 43 span.rfc2119 { color: #800; }
27 norman.x.gray 2 .todo { background: #ff7; }
28 norman.x.gray 420 pre { background: #EEE; padding: 1em; }
29     pre.rdfxml { background: #ECC; padding: 1em; }
30     pre.turtle { background: #CEC; padding: 1em; }
31 norman.x.gray 58
32     /* 'link here' text in section headers */
33     *.hlink a {
34     text-decoration: none;
35     color: #fff; /* the page background colour */
36     }
37     *:hover.hlink a {
38     color: #800;
39     }
40 norman.x.gray 2 </style>
41     </head>
42    
43     <body>
44     <div class="head">
45 norman.x.gray 77 <a href="http://www.ivoa.net/"><img alt="IVOA" src="http://www.ivoa.net/pub/images/IVOA_wb_300.jpg" width="300" height="169"/></a>
46 norman.x.gray 2
47 norman.x.gray 77 <h1>Vocabularies in the Virtual Observatory<br/>Version @VERSION@</h1>
48 norman.x.gray 537 <h2>IVOA Working Draft, @RELEASEDATE@ [Editors' draft]</h2>
49     <p><strong>$Revision$ $Date$</strong></p>
50 norman.x.gray 2
51     <dl>
52    
53     <dt>This version</dt>
54 norman.x.gray 420 <dd><span class='url'>@DOCURI@.html</span></dd>
55 norman.x.gray 2
56     <dt>Latest version</dt>
57 norman.x.gray 104 <dd><span class='url'>http://www.ivoa.net/Documents/latest/vocabularies.html</span><br/>
58 norman.x.gray 70 and <a href='@ISSUESLIST@' >issues list</a></dd>
59 norman.x.gray 2
60 norman.x.gray 420 <dt>Previous version</dt>
61     <dd><span class='url'>http://www.ivoa.net/Documents/WD/Semantics/vocabularies-20080320.html</span></dd>
62    
63 norman.x.gray 77 <dt>Working Group</dt>
64     <dd><em><a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaSemantics">Semantics</a></em></dd>
65    
66 norman.x.gray 2 <dt>Editors</dt>
67 norman.x.gray 420 <dd>Alasdair J G Gray, University of Glasgow, UK<br/>
68     <a href='http://nxg.me.uk/norman/' >Norman Gray</a>, University of
69     Leicester / University of Glasgow, UK<br/>
70     Frederic V Hessman, University of Göttingen, Germany<br/>
71     Andrea Preite Martinez, INAF, Italy</dd>
72 norman.x.gray 2
73     <dt>Authors</dt>
74     <dd>
75 norman.x.gray 57 <span property="dc:creator">Sébastien Derriere</span>,
76 norman.x.gray 2 <span property="dc:creator">Alasdair J G Gray</span>,
77     <span property="dc:creator">Norman Gray</span>,
78 norman.x.gray 57 <span property="dc:creator">Frederic V Hessman</span>,
79     <span property="dc:creator">Tony Linde</span>,
80     <span property="dc:creator">Andrea Preite Martinez</span>,
81     <span property="dc:creator">Rob Seaman</span> and
82     <span property="dc:creator">Brian Thomas</span>
83 norman.x.gray 2 </dd>
84     </dl>
85     <hr/>
86     </div>
87    
88     <div class="section-nonum" id="abstract">
89     <p class="title">Abstract</p>
90    
91     <div class="abstract">
92 norman.x.gray 21 <p>As the astronomical information processed within the <em>Virtual Observatory
93     </em> becomes more complex, there is an increasing need for a more
94     formal means of identifying quantities, concepts, and processes not
95     confined to things easily placed in a FITS image, or expressed in a
96 alasdair.gray 534 catalogue or a table. We propose that the IVOA adopt a standard
97 norman.x.gray 21 format for vocabularies based on the W3C's <em>Resource Description
98 norman.x.gray 57 Framework</em> (RDF) and <em>Simple Knowledge Organisation System</em>
99 norman.x.gray 21 (SKOS). By adopting a standard and simple format, the IVOA will
100 norman.x.gray 57 permit different groups to create and maintain their own specialised
101 norman.x.gray 21 vocabularies while letting the rest of the astronomical community
102 alasdair.gray 65 access, use, and combine them. The use of current, open standards
103 norman.x.gray 22 ensures that VO applications will be able to tap into resources of the
104 norman.x.gray 21 growing semantic web. Several examples of useful astronomical
105     vocabularies are provided, including work on a common IVOA thesaurus
106     intended to provide a semantic common base for VO applications.</p>
107 norman.x.gray 2 </div>
108    
109     </div>
110    
111     <div class="section-nonum" id="status">
112     <p class="title">Status of this document</p>
113    
114 norman.x.gray 70 <p>This is an IVOA Working
115 norman.x.gray 43 Draft. The first release of this document was
116 norman.x.gray 102 <span property="dc:date">2008 March 20</span>.</p>
117 norman.x.gray 2
118     <p>This document is an IVOA Working Draft for review by IVOA members
119     and other interested parties. It is a draft document and may be
120     updated, replaced, or obsoleted by other documents at any time. It is
121     inappropriate to use IVOA Working Drafts as reference materials or to
122     cite them as other than <q>work in progress</q>.</p>
123    
124     <p>A list of current IVOA Recommendations and other technical
125     documents can be found at
126 norman.x.gray 43 <span class='url' >http://www.ivoa.net/Documents/</span>.</p>
127 norman.x.gray 2
128 norman.x.gray 536 <p>This document includes a normative reference to the W3C SKOS
129     standard <span class='cite'>std:skosref</span>, despite the fact that,
130     at the time this present document was standardised, the SKOS document
131     was still a W3C Working Draft and thus a <q>work in progress</q>. The
132     core parts of the SKOS standard to which this standard refers (that is,
133     the concept schemes, documentation and intravocabulary relationship
134     vocabularies) are stable, and are very unlikely to change before
135     Recommendation. When the SKOS document becomes a W3C Recommendation,
136     we will issue a minor update to this present document referring to the
137     finalised SKOS standard, and incorporating any errata which have
138     appeared by then.</p>
139    
140 norman.x.gray 2 <h3>Acknowledgments</h3>
141    
142 norman.x.gray 21 <p>We would like to thank the members of the IVOA semantic working
143     group for many interesting ideas and fruitful discussions.</p>
144 norman.x.gray 2 </div>
145    
146     <h2><a id="contents" name="contents">Table of Contents</a></h2>
147     <?toc?>
148    
149     <hr/>
150    
151     <div class="section" id="introduction">
152 norman.x.gray 61 <p class="title">Introduction (informative)</p>
153 norman.x.gray 2
154 norman.x.gray 77 <div class="section" id='astrovocab'>
155 norman.x.gray 2 <p class="title">Vocabularies in astronomy</p>
156    
157     <p>Astronomical information of relevance to the Virtual Observatory
158     (VO) is not confined to quantities easily expressed in a catalogue or
159 alasdair.gray 25 a table.
160     Fairly simple things such as position on the sky, brightness in some
161 norman.x.gray 57 units, times measured in some frame, redshifts, classifications or
162 alasdair.gray 25 other similar quantities are easily manipulated and stored in VOTables
163     and can currently be identified using IVOA Unified Content Descriptors
164     (UCDs) <span class="cite">std:ucd</span>.
165     However, astrophysical concepts and quantities use a wide variety of
166     names, identifications, classifications and associations, most of
167     which cannot be described or labelled via UCDs.</p>
168 norman.x.gray 2
169 alasdair.gray 25 <p>There are a number of basic forms of organised semantic knowledge
170 norman.x.gray 536 of potential use to the VO. Informal <q>folksonomies</q> are at one
171     extreme, and are a very lightly coordinated collection of labels
172     chosen by users. In a slightly more formally structured
173     <q>vocabulary</q>, the label is drawn from a predefined set of
174     definitions, which can include relationships to other labels;
175     vocabularies are primarily associated with searching and browsing
176     tasks. At the other extreme are <q>ontologies</q>, where the domain
177     is formally captured in a set of logical classes, typically related in
178     a subclass hierarchy. More formal definitions are presented later in
179     this document.
180 alasdair.gray 25 </p>
181    
182 norman.x.gray 48 <p>An astronomical ontology is necessary if we are to have a computer
183 norman.x.gray 70 (appear to) <q>understand</q> something of the domain.
184 alasdair.gray 25 There has been some progress towards creating an ontology of
185 norman.x.gray 22 astronomical object types <span
186 alasdair.gray 25 class="cite">std:ivoa-astro-onto</span> to meet this need.
187     However, there are distinct use cases for letting human users find
188     resources of interest through search and navigation of the information space.
189     The most appropriate technology to meet these use cases derives from
190     the Information Science community, that of <em>controlled
191     vocabularies, taxonomies and thesauri</em>.
192     In the present document, we do not distinguish between controlled
193     vocabularies, taxonomies and thesauri, and use the term
194     <em>vocabulary</em> to represent all three.
195     </p>
196 norman.x.gray 2
197     <p>One of the best examples of the need for a simple vocabulary within
198     the VO is VOEvent <span class="cite">std:voevent</span>, the VO
199 norman.x.gray 43 standard for supporting rapid notification of astronomical events.
200     This standard requires some formalised indication of what a published
201 norman.x.gray 70 event is <q>about</q>, in a formalism which can be used straightforwardly
202 norman.x.gray 43 by the developer of relevant services. See <span class='xref'
203     >usecases</span> for further discussion.</p>
204 norman.x.gray 2
205 norman.x.gray 43 <p>A number of astronomical vocabularies have been created, with a
206     variety of goals and intended uses. Some examples are detailed below. </p>
207 alasdair.gray 34
208 norman.x.gray 2 <ul>
209 alasdair.gray 25
210 norman.x.gray 2 <li>The <em>Second Reference Dictionary of the Nomenclature of
211 alasdair.gray 25 Celestial Objects</em> <span class="cite">lortet94</span>, <span
212     class="cite">lortet94a</span> contains 500 printed pages of astronomical
213     nomenclature.</li>
214 norman.x.gray 2
215     <li>For decades professional journals have used a set of reasonably
216     compatible keywords to help classify the content of whole articles.
217     These keywords have been analysed by Preite Martinez &amp; Lesteven
218 norman.x.gray 43 <span class="cite">preitemartinez07</span>, who derived a
219 alasdair.gray 31 set of common keywords constituting one of the potential bases for a
220 norman.x.gray 2 fuller VO vocabulary. The same authors also attempted to derive a set
221 norman.x.gray 57 of common concepts by analysing the contents of abstracts in journal
222 norman.x.gray 22 articles, which should comprise a list of tokens/concepts more
223 alasdair.gray 31 up-to-date than the old list of journal keywords. A similar but less
224     formal attempt was made by Hessman <span class='cite'>hessman05</span>
225 norman.x.gray 70 for the VOEvent working group, resulting in a similar list.</li>
226 norman.x.gray 2
227 alasdair.gray 34 <li>Astronomical databases generally use simple sets of keywords
228 norman.x.gray 57 – sometimes hierarchically organised – to help users make queries.
229 norman.x.gray 43 Two examples from very
230 alasdair.gray 34 different contexts are the list of object types used in the <a
231 alasdair.gray 25 href="http://simbad.u-strasbg.fr">Simbad</a> database and the search
232     keywords used in the educational Hands-On Universe image database
233     portal.</li>
234 norman.x.gray 2
235 alasdair.gray 25 <li>The Astronomical Outreach Imagery (AOI) working group has created
236     a simple taxonomy for helping to classify images used for educational
237 norman.x.gray 61 or public relations purposes <span class="cite">std:aoim</span>. See
238 norman.x.gray 36 <span class='xref'>vocab-aoim</span>.</li>
239 norman.x.gray 48
240 alasdair.gray 25 <!--
241 norman.x.gray 2 <li>The Hands-On Universe project (see <span class='url'
242 norman.x.gray 22 >http://sunra.lbl.gov/telescope2/index.html</span>) has maintained a
243 norman.x.gray 2 public database of images for use by the general public since the
244     1990s. The images are very heterogeneous, since they are gathered from
245     a variety of professional, semi-professional, amateur, and school
246 norman.x.gray 22 observatories, so a simple taxonomy is used to facilitate browsing
247 norman.x.gray 2 by the users of the database.</li>
248 norman.x.gray 48 -->
249 norman.x.gray 2
250     <li>In 1993, Shobbrook and Shobbrook published an Astronomy Thesaurus
251 norman.x.gray 536 endorsed by the IAU <span class='cite' >shobbrook93</span>. This
252 alasdair.gray 25 collection of nearly 3000 terms, in five languages, is a valuable
253     resource, but has seen little use in recent years. Its very size,
254     which gives it expressive power, is a disadvantage to the extent that
255 norman.x.gray 57 it is consequently hard to use. See <span class='xref'>vocab-iau93</span>.</li>
256 norman.x.gray 2
257 norman.x.gray 43 <li>The VO's Unified Content Descriptors <span class='cite'
258     >std:ucd</span> (UCD) constitute the main controlled vocabulary of the
259 norman.x.gray 57 IVOA and contain some taxonomic information. However, UCD has some
260 norman.x.gray 43 features which support its goals, but which make it difficult to use
261 norman.x.gray 57 beyond the present applications of labelling VOTables: firstly, there
262 norman.x.gray 43 is no standard means of identifying and processing the contents of the
263     text-based reference document; secondly, the content cannot be openly
264     extended beyond that set by a formal IVOA committee without going
265     through a laborious and time-consuming negotiation process of
266     extending the primary vocabulary itself; and thirdly, the UCD
267     vocabulary is primarily concerned with data types and their
268     processing, and only peripherally with astronomical objects (for
269     example, it defines formal labels for RA, flux, and bandpass, but does
270 norman.x.gray 57 not mention the Sun). See <span class='xref'>vocab-ucd1</span>.</li>
271 norman.x.gray 21
272 norman.x.gray 2 </ul>
273     </div>
274    
275 norman.x.gray 40 <div class='section' id='usecases'>
276     <p class='title'>Use-cases, and the motivation for formalised vocabularies</p>
277    
278     <p>The most immediate high-level motivation for this work is the
279     requirement of the VOEvent standard <span class='cite'
280     >std:voevent</span> for a controlled vocabulary usable in the
281 norman.x.gray 64 VOEvent's <code>&lt;Why/&gt;</code> and <code>&lt;What/&gt;</code>
282 norman.x.gray 57 elements, which describe what
283 norman.x.gray 40 sort of object the VOEvent packet is describing, in some broadly
284 norman.x.gray 70 intelligible way. For example a <q>burst</q> might be a gamma-ray burst
285 norman.x.gray 43 due to the collapse of a star in a distant galaxy, a solar flare, or
286     the brightening of a stellar or AGN accretion disk, and having an
287     explicit list of vocabulary terms can help guide the event publisher
288     into using a term which will be usefully precise for the event's
289     consumers. A free-text label can help here (which brings us into the
290 norman.x.gray 70 domain sometimes referred to as folksonomies), but the astronomical
291 norman.x.gray 57 community, with a culture sympathetic to international agreement, can
292     do better.</p>
293 norman.x.gray 40
294 norman.x.gray 57 <p>The purpose of this proposal is to establish a set of conventions for
295     the creation, publication, use, and manipulation of
296     astronomical vocabularies within the Virtual Observatory, based upon
297     the W3C's SKOS standard. We include as appendices to this proposal
298     formalised versions of a number of existing vocabularies, encoded as
299 alasdair.gray 59 SKOS vocabularies <span class="cite">std:skosref</span>.</p>
300 norman.x.gray 57
301 norman.x.gray 40 <p>Specific use-cases include the following.</p>
302     <ul>
303     <li>A user wishes to process all events concerning supernovae, which
304 norman.x.gray 57 means that an event concerning a type Ia supernova must be understood to be
305 norman.x.gray 40 relevant. [This supports a system working autonomously, filtering
306     incoming information]</li>
307    
308     <li>A user is searching an archive of VOEvents for microlensing
309     events, and retrieves a large number of them; the search interface may
310 norman.x.gray 536 then prompt them to narrow their search using one of a set of terms
311 norman.x.gray 70 including, say, binary lens events. [This supports so-called <q>semantic
312     search</q>, providing semantic support to an interface which is in turn
313 norman.x.gray 40 supporting a user]</li>
314    
315     <li>A user wishes to search for resources based on the
316 norman.x.gray 43 journal-supported keywords in a paper; they might either initiate this by
317 norman.x.gray 40 hand, or have this done on their behalf by a tool which can extract
318     the keywords from a PDF. The keywords are in the A&amp;A vocabulary,
319     and mappings have been defined between this vocabulary and others,
320 alasdair.gray 65 which means that the query keywords are translated automatically
321 norman.x.gray 40 into those appropriate for a search of an outreach image database
322 norman.x.gray 57 (everyone likes pretty pictures), the VO Registry, a set of Simbad
323 norman.x.gray 40 object types, and one or more concepts in more formal ontologies. The
324     search interface is then able to support the user browsing up and down
325 norman.x.gray 57 the AOIM vocabulary, and a specialised Simbad tool is able to take
326 norman.x.gray 40 over the search, now it has an appropriate starting place. [This
327     supports interoperability, building on the investments which
328     institutions and users have made in existing vocabularies]</li>
329    
330 norman.x.gray 64 <li>A user receives a VOTable of results from a VO application – for
331     example a catalogue of objects or observations – and wants to search a
332     database of old FITS files for potential matches. Because the UCDs
333     labelling the columns of the tables are expressed in well-documented
334     SKOS, both the official descriptions of the UCDs and their semantic
335     matches to a variety of other plain-text vocabularies (such as the IAU
336 alasdair.gray 65 or AOIM thesauri) are available to the VO application, providing a basis
337 norman.x.gray 64 for massive searches for all kinds of FITS keyword values.</li>
338    
339 norman.x.gray 40 </ul>
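<p>As a minimal sketch of how the first and third use-cases might be
supported, the Turtle fragment below declares a broader term within one
vocabulary and a mapping between two vocabularies. The
<code>example.org</code> namespaces and all of the concept names are
hypothetical placeholders rather than identifiers from the distributed
vocabularies, and the mapping property <code>skos:exactMatch</code> is
assumed to be available in the version of SKOS being used.</p>
<pre class='turtle'>
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
# Hypothetical namespaces, for illustration only
@prefix aa:   &lt;http://example.org/vocab/aa#&gt; .
@prefix aoim: &lt;http://example.org/vocab/aoim#&gt; .

# First use-case: a search on aa:supernova can also retrieve events
# labelled with the narrower concept aa:typeIaSupernova
aa:typeIaSupernova a skos:Concept;
    skos:prefLabel "type Ia supernova"@en;
    skos:broader aa:supernova .

# Third use-case: a mapping lets a query expressed in one vocabulary
# be translated into a term from another
aa:supernova skos:exactMatch aoim:supernova .
</pre>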
340    
341 norman.x.gray 420 <p>The goal of this standard is to show how vocabularies can be easily
342     expressed in an interoperable and computer-manipulable format, and the
343     sole normative section of this Recommendation (namely section <span
344     class='xref'>publishing</span>) contains requirements and suggestions
345     intended to promote this. Four example vocabularies that have
346     previously been expressed using non-standardized formats – namely the
347 alasdair.gray 534 A&amp;A keyword list, the IAU thesaurus and AOIM taxonomy, and UCD1 – are
348 norman.x.gray 420 included below as illustrations of how simple it is to publish them in
349     SKOS, without losing any of the information of the original source
350     vocabularies.</p>
351    
352 norman.x.gray 57 <p>It is not a goal of this standard, as it is not a goal of SKOS, to
353     produce knowledge-engineering artefacts which can support elaborate
354     machine reasoning – such artefacts would be very valuable, but require
355     much more expensive work on ontologies. As the supernova use-case
356     above illustrates, even simple vocabularies can support useful machine
357     reasoning.</p>
358    
359 norman.x.gray 69 <p>It is also not a goal of this standard to produce new vocabularies,
360     or substantially alter existing ones; instead, the vocabularies
361     included below in section <span class='xref'>distvocab</span> are directly
362     derived from existing vocabularies (the exceptions are the IVOAT
363     vocabulary, which is ultimately intended to be a significant update to
364     the IAU-93 original, and the constellations vocabulary, which is
365     intended to be purely didactic). It therefore follows that the ambiguities,
366     redundancies and incompleteness of the source vocabularies are
367     faithfully represented in the distributed SKOS vocabularies. We hope
368     that this formalisation process will create greater visibility and
369     broader use for the various vocabularies, and that this will guide the
370     maintenance efforts of the curating groups.</p>
371 norman.x.gray 57
372     <p>The reason for both of these limitations is that vocabularies are
373     extremely expensive to produce, maintain and deploy, and we must
374     therefore rely on such vocabularies as have been developed, and
375     attached as metadata to resources, by others. Such vocabularies are
376 norman.x.gray 420 less rich or less coherent than we might prefer, but they are widely enough
377     deployed to be useful. We hope that the set of example vocabularies
378     we have provided will build on this deployment, by providing material
379     which is useful out of the box.</p>
380 norman.x.gray 57
381 norman.x.gray 40 </div>
382    
383 norman.x.gray 77 <div class="section" id='formalising'>
384 norman.x.gray 2 <p class="title">Formalising and managing multiple vocabularies</p>
385    
386     <p>We find ourselves in the situation where there are multiple
387     vocabularies in use, describing a broad range of resources of interest
388     to professional and amateur astronomers, and members of the public.
389     These different vocabularies use different terms and different
390     relationships to support the different constituencies they cater for.
391 alasdair.gray 25 For example, <q>delta Sct</q> and <q>RR Lyr</q> are terms one would
392     find in a vocabulary aimed at professional astronomers, associated
393     with the notion of <q>variable star</q>; however one would
394     <em>not</em> find such technical terms in a vocabulary intended to
395     support outreach activities.</p>
396 norman.x.gray 2
397     <p>One approach to this problem is to create a single consensus
398     vocabulary, which draws terms from the various existing vocabularies
399     to create a new vocabulary which is able to express anything its users
400     might desire. The problem with this is that such an effort would be
401 norman.x.gray 43 very expensive, both for those creating it, in terms of time and effort,
402 norman.x.gray 2 and for the potential users, who have to learn
403     to navigate around it and recognise the new terms, and who have to be
404     supported in using the new terms correctly (or, more often,
405     incorrectly).</p>
406    
407     <p>The alternative approach to the problem is to evade it, and this is
408 norman.x.gray 22 the approach taken in this document. Rather than deprecating the
409 norman.x.gray 2 existence of multiple overlapping vocabularies, we embrace it,
410 norman.x.gray 43 help interest groups formalise as many of them as are appropriate, and
411     standardise the process of formally declaring the relationships between
412 norman.x.gray 2 them. This means that:</p>
413     <ul>
414 alasdair.gray 25 <li>The various vocabularies are allowed to evolve separately, on
415 norman.x.gray 21 their own timescales, managed either by the IVOA, individual working
416     groups within the IVOA, or by third parties;</li>
417    
418 norman.x.gray 57 <li>Specialised vocabularies can be developed and maintained by the
419 norman.x.gray 22 community with the most knowledge about a specific topic, ensuring
420 norman.x.gray 43 that the vocabulary will have the most appropriate breadth, depth, and
421     precision;</li>
422 norman.x.gray 21
423 alasdair.gray 25 <li>Users can choose the vocabulary or combination of vocabularies most
424 norman.x.gray 21 appropriate to their situation, either when annotating resources, or
425     when querying them; and</li>
426    
427 alasdair.gray 25 <li>We can retain the previous investments made in vocabularies by
428 norman.x.gray 21 users and resource owners.</li>
429    
430 norman.x.gray 2 </ul>
431    
432    
433     </div>
434    
435     </div>
436    
437 norman.x.gray 77 <div class='section' id='skos'>
438 norman.x.gray 43 <p class='title'>SKOS-based vocabularies (informative)</p>
439 norman.x.gray 2
440 norman.x.gray 43 <p>In this section, we introduce the concepts of SKOS-based
441 norman.x.gray 57 vocabularies, and the technology of mapping between them. We describe
442     some additional requirements for IVOA vocabularies in the next
443     section, <span class='xref' >publishing</span>.</p>
444 norman.x.gray 43
445 norman.x.gray 22 <div class="section" id='vocab'>
446 norman.x.gray 21 <p class="title">Selection of the vocabulary format</p>
447 norman.x.gray 2
448 norman.x.gray 21 <p>After extensive online and face-to-face discussions, the authors have
449     brokered a consensus within the IVOA community that
450     formalised vocabularies should be published at least in SKOS (Simple Knowledge
451 norman.x.gray 57 Organisation System) format, a W3C draft standard application of RDF to the
452 norman.x.gray 2 field of knowledge organisation <span
453 alasdair.gray 59 class="cite">std:skosref</span>. SKOS draws on long experience
454 norman.x.gray 2 within the Library and Information Science community, to address a
455     well-defined set of problems to do with the indexing and retrieval of
456     information and resources; as such, it is a close match to the problem
457 norman.x.gray 43 this document is addressing.</p>
458 norman.x.gray 2
459     <p>ISO 5964 <span class='cite' >std:iso5964</span> defines a number of
460     the relevant terms (ISO 5964:1985=BS 6723:1985; see also <span
461     class='cite' >std:bs8723-1</span> and <span class='cite'
462     >std:z39.19</span>), and some of the (lightweight) theoretical
463     background. The only technical distinction relevant to this document
464 norman.x.gray 70 is that between vocabulary and thesaurus: BS-8723-1 defines a
465 norman.x.gray 536 controlled vocabulary as a</p>
466 norman.x.gray 2 <blockquote>
467 norman.x.gray 536 prescribed list of terms or headings each one having an assigned meaning
468     [noting that <q>Controlled vocabularies are designed for use in
469     classifying or indexing documents and for searching them.</q>]
470     </blockquote>
471     <p>and a thesaurus as a</p>
472     <blockquote>
473 alasdair.gray 25 Controlled vocabulary in which concepts are represented by preferred
474 norman.x.gray 2 terms, formally organized so that paradigmatic relationships between
475     the concepts are made explicit, and the preferred terms are
476 norman.x.gray 57 accompanied by lead-in entries for synonyms or quasi-synonyms.
477     <!-- NOTE:
478 norman.x.gray 2 The purpose of a thesaurus is to guide both the indexer and the
479     searcher to select the same preferred term or combination of preferred
480 norman.x.gray 57 terms to represent a given subject. -->
481     (BS-8723-1, sect. 2.39)
482 norman.x.gray 2 </blockquote>
483 norman.x.gray 536 <p>with a similar definition in ISO-5964 sect. 3.16.</p>
484 norman.x.gray 2
485 norman.x.gray 536 <p>The paradigmatic relationships in question are those relating a
486     term to a <q>broader</q>, <q>narrower</q> or more generically
487     <q>related</q> term. These notions have an operational definition:
488     any resource retrieved as a result of a search on a given term will
489     also be retrievable through a search on that term's <q>broader
490     term</q> (<q>narrower</q> is a simple inverse, so that for any pair of
491     terms, if <code>A skos:broader B</code>, then <code>B skos:narrower
492     A</code>; a term may have multiple narrower and broader terms). This
493     is not a subsumption relationship, as there is no implication that the
494     concept referred to by a narrower term is of the same <em>type</em> as
495     a broader term. Further, the <code>skos:broader</code> and
496     <code>skos:narrower</code> relationships are not transitive (that is,
497     declaring that <code>A skos:broader B</code> and <code>B
498     skos:broader C</code> does not imply that <code>A skos:broader
499     C</code>). However, the SKOS standard includes the notions of
500     <code>skos:broaderTransitive</code> and
501     <code>skos:narrowerTransitive</code> relations for the subset of
502     vocabularies and systems which would find these useful.</p>
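<p>The following Turtle fragment is a minimal sketch of these
relationships, reusing the concept identifiers of the spiral-galaxy
example in this document. The statements are illustrative only, and the
final comment assumes the reading in the SKOS reference in which
<code>skos:broader</code> is a sub-property of
<code>skos:broaderTransitive</code>.</p>
<pre class='turtle'>
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .

# Two direct 'broader' assertions, with the inverse stated explicitly
&lt;#barredSpiralGalaxy&gt; skos:broader &lt;#spiralGalaxy&gt; .
&lt;#spiralGalaxy&gt; skos:narrower &lt;#barredSpiralGalaxy&gt;;
    skos:broader &lt;#galaxy&gt; .

# NOT implied by the statements above, since skos:broader is not transitive:
#   &lt;#barredSpiralGalaxy&gt; skos:broader &lt;#galaxy&gt; .
# A system that opts into the transitive reading may instead conclude:
#   &lt;#barredSpiralGalaxy&gt; skos:broaderTransitive &lt;#galaxy&gt; .
</pre>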
503    
504 norman.x.gray 21 <p>Thus <strong>a vocabulary (SKOS or otherwise) is not an
505     ontology</strong>. It has lighter and looser semantics than an
506     ontology, and is specialised for the restricted case of resource
507     retrieval. Those interested in ontological analyses can easily
508     transfer the vocabulary relationship information from SKOS to a formal
509 norman.x.gray 22 ontological format such as OWL <span class='cite' >std:owl</span>.</p>
510 norman.x.gray 2
511 norman.x.gray 57 <p>The purpose of a thesaurus is to help users find resources they
512     might be interested in, be they library books, image archives, or VOEvent
513     packets.</p>
514    
515 norman.x.gray 21 </div>
516 norman.x.gray 2
517 norman.x.gray 77 <div class='section' id='skos-format'>
518 norman.x.gray 21 <p class='title'>Content and format of a SKOS vocabulary</p>
519    
520 alasdair.gray 25 <p>A published vocabulary in SKOS format consists of a set of
521 alasdair.gray 34 <q>concepts</q> – an example concept capturing the
522 norman.x.gray 43 vocabulary information about spiral galaxies is provided in the <a
523     href='#figexample' >Figure below</a>, with the RDF shown in both
524     RDF/XML <span class='cite' >std:rdfxml</span> and Turtle notation <span
525 alasdair.gray 34 class='cite' >std:turtle</span> (Turtle is similar to the more
526 norman.x.gray 420 informal <em>Notation3</em>). The elements of a concept are detailed
527 alasdair.gray 34 below.</p>
528    
529     <center>
530 norman.x.gray 43 <p><a name='figexample' >Figure: an example SKOS concept, in RDF/XML and Turtle</a></p>
531 alasdair.gray 34 <table>
532     <tr>
533 norman.x.gray 420 <th class='rdfxml'>XML Syntax</th>
534 alasdair.gray 34 <th width="10"/>
535 norman.x.gray 420 <th class='turtle'>Turtle Syntax</th>
536 alasdair.gray 34 </tr>
537 norman.x.gray 36 <tr><td/></tr>
538 alasdair.gray 34 <tr>
539 norman.x.gray 420 <td class='rdfxml'>
540     <pre class='rdfxml'>
541 alasdair.gray 34 &lt;skos:Concept rdf:about="#spiralGalaxy"&gt;
542     &lt;skos:prefLabel xml:lang="en"&gt;
543     spiral galaxy
544     &lt;/skos:prefLabel&gt;
545     &lt;skos:prefLabel xml:lang="de"&gt;
546     Spiralgalaxie
547     &lt;/skos:prefLabel&gt;
548     &lt;skos:altLabel xml:lang="en"&gt;
549     spiral nebula
550     &lt;/skos:altLabel&gt;
551     &lt;skos:hiddenLabel xml:lang="en"&gt;
552     spiral glaxy
553     &lt;/skos:hiddenLabel&gt;
554     &lt;skos:definition xml:lang="en"&gt;
555     A galaxy having a spiral structure.
556     &lt;/skos:definition&gt;
557     &lt;skos:scopeNote xml:lang="en"&gt;
558     Spiral galaxies fall into one of
559     three categories: Sa, Sc, and Sd.
560     &lt;/skos:scopeNote&gt;
561 norman.x.gray 36 &lt;skos:narrower
562     rdf:resource="#barredSpiralGalaxy"/&gt;
563     &lt;skos:broader
564     rdf:resource="#galaxy"/&gt;
565     &lt;skos:related
566     rdf:resource="#spiralArm"/&gt;
567 alasdair.gray 34 &lt;/skos:Concept&gt;
568     </pre>
569     </td>
570     <td/>
571 norman.x.gray 420 <td class='turtle'>
572     <pre class='turtle'>
573 alasdair.gray 34 &lt;#spiralGalaxy&gt; a skos:Concept;
574 norman.x.gray 36 skos:prefLabel
575     "spiral galaxy"@en,
576 alasdair.gray 34 "Spiralgalaxie"@de;
577     skos:altLabel "spiral nebula"@en;
578     skos:hiddenLabel "spiral glaxy"@en;
579 norman.x.gray 36 skos:definition """A galaxy having a
580     spiral structure."""@en;
581     skos:scopeNote """Spiral galaxies fall
582     into one of three categories:
583     Sa, Sc, and Sd"""@en;
584 alasdair.gray 34 skos:narrower &lt;#barredSpiralGalaxy&gt;;
585     skos:broader &lt;#galaxy&gt;;
586     skos:related &lt;#spiralArm&gt; .
587     </pre>
588     </td>
589     </tr>
590     </table>
591     </center>
592    
593 norman.x.gray 43 <p>Each concept in a SKOS vocabulary has the following features.</p>
594    
595 alasdair.gray 34 <ul>
596    
597 norman.x.gray 43 <li>A single URI representing the concept, mainly for use by computers.
598 alasdair.gray 34 <!--
599     <code>&lt;#spiralGalaxy&gt; a skos:Concept</code>.
600     <code>&lt;skos:Concept rdf:about="#spiralGalaxy"&gt;</code>
601     -->
602 norman.x.gray 22 </li>
603 norman.x.gray 21
604 alasdair.gray 34 <li>A single preferred label in each supported language of the
605 norman.x.gray 43 vocabulary, for use by humans.
606 alasdair.gray 34 <!--
607     <code>skos:prefLabel "spiral galaxy"@en, "Spiralgalaxie"@de</code>.
608     <code>&lt;skos:prefLabel&gt;spiral galaxy&lt;/skos:prefLabel&gt;</code>
609     -->
610 norman.x.gray 22 </li>
611 norman.x.gray 21
612 alasdair.gray 25 <li>Optional alternative labels which applications may encounter or which are in
613 norman.x.gray 70 common use, whether simple synonyms or commonly-used aliases such as
614     <q>GRB</q> for <q>gamma-ray burst</q>, or <q>spiral nebula</q> for
615 alasdair.gray 34 spiral galaxies.
616     <!--
617     <code>skos:altLabel "GRB"@en</code>
618     <code>&lt;skos:altLabel lang="de"&gt;Spiralgalaxie&lt;/skos:altLabel&gt;</code>
619     -->
620     </li>
621 norman.x.gray 21
622 norman.x.gray 43 <li>Optional hidden labels which capture terms which are sometimes
623     used for the corresponding concept, but which are deprecated in some
624     sense. This might include common misspellings for
625     either the preferred or alternate labels, for example <q>glaxy</q> for
626     <q>galaxy</q>.
627 alasdair.gray 34 </li>
628 alasdair.gray 25
629     <li>A definition for the concept, where one exists in the original
630 alasdair.gray 34 vocabulary, to clarify the meaning of the term.
631     <!--
632     <code>skos:definition "A galaxy having a spiral structure."@en</code>
633     <code>&lt;skos:definition lang="en"&gt;<br/>A galaxy having a spiral structure.<br/>&lt;/skos:definition&gt;</code>
634     -->
635     </li>
636 alasdair.gray 25
637 norman.x.gray 57 <li>A scope note to further clarify a definition, or the usage of the
638 alasdair.gray 34 concept.
639     <!--
640     <code>skos:scopeNote "Spiral galaxies fall into one of three categories: Sa, Sc, and Sd"@en</code>
641     <code>&lt;skos:scopeNote lang="en"&gt;<br/>Spiral galaxies fall into one of three categories: Sa, Sc, and Sd.<br/>&lt;/skos:scopeNote&gt;</code>
642     -->
643     </li>
644 alasdair.gray 25
645 alasdair.gray 34 <li>Optionally, a concept may be involved in any number of relationships
646 alasdair.gray 25 to other concepts. The types of relationships are
647 norman.x.gray 21 <ul>
648 norman.x.gray 43 <li>Narrower or more specific concepts, for example a link to the concept
649 alasdair.gray 34 representing a <q>barred spiral galaxy</q>.
650 alasdair.gray 31 <!--
651 alasdair.gray 34 <code>skos:narrower &lt;#barredSpiralGalaxy&gt;</code>.
652     <code>&lt;skos:narrower rdf:resource="#barredSpiralGalaxy"&gt;</code>
653     -->
654 norman.x.gray 22 </li>
655 norman.x.gray 43 <li>Broader or more general concepts, for example a link to the concept
656 alasdair.gray 34 representing galaxies in general.
657     <!--
658     <code>skos:broader &lt;#galaxy&gt;</code>.
659     <code>&lt;skos:broader rdf:resource="#galaxy"&gt;</code>
660     -->
661 norman.x.gray 22 </li>
662 norman.x.gray 43 <li>Related concepts, for example a link to the concept representing spiral
663 alasdair.gray 34 arms of galaxies
664     <!--
665     <code>skos:related &lt;#spiralArm&gt;</code>
666     <code>&lt;skos:related rdf:resource="#spiralArm"&gt;</code>
667     -->
668 alasdair.gray 31 <br/>
669 alasdair.gray 25 (note this relationship does not say that spiral galaxies have spiral
670     arms – that would be ontological information of a higher order which
671     is beyond the requirements for information stored in a vocabulary).</li>
672 norman.x.gray 21 </ul>
673     </li>
674     </ul>
675 alasdair.gray 25
676     <p>In addition to the information about a single concept, a vocabulary
677     can contain information to help users navigate its structure and
678     contents:</p>
679     <ul>
680     <li>The <q>top concepts</q> of the vocabulary, i.e. those that occur
681     at the top of the vocabulary hierarchy defined by the broader/narrower
682     relationships, can be explicitly stated to make it easier to navigate
683     the vocabulary.</li>
684    
685     <li>Concepts that form a natural group can be defined as being members
686     of a <q>collection</q>.</li>
687    
688     <li>Versioning information can be added using change notes.</li>
689    
690 norman.x.gray 536 <li>Additional metadata about the vocabulary, for example indicating
691     the publisher, may be documented using the Dublin Core metadata set
692     <span class='cite' >std:dublincore</span>, <span
693     class='cite'>std:pubguide</span>. At a minimum, the vocabulary's
694     <code>skos:ConceptScheme</code> should be annotated with DC title,
695     creator, description and date terms.</li>
696    
697     <li>The SKOS standard describes a number of <q>documentation
698     properties</q>; these should be used to document provenance of and
699     changes to vocabulary terms.</li>
700    
701     <li>A set of mappings between vocabularies has the potential to be
702     circular or to create inconsistencies, though this is probably
703     unlikely in practice. Such problems are in principle out of the
704     control of the vocabulary authors, since vocabularies do not contain
705     mappings, and so they can only be detected dynamically by applications
706     which use the vocabularies.</li>
707 alasdair.gray 25 </ul>
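<p>As a minimal sketch of how these navigation and metadata features
might look in Turtle: all of the URIs, names, dates and label text
below are hypothetical and chosen only for illustration; the empty
URI <code>&lt;&gt;</code> stands for the vocabulary document itself.</p>
<pre class='turtle'>
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
@prefix dc: &lt;http://purl.org/dc/elements/1.1/&gt; .

# The vocabulary as a whole, with minimal Dublin Core metadata
&lt;&gt; a skos:ConceptScheme;
    dc:title "An example vocabulary"@en;
    dc:creator "A. N. Author";
    dc:description "An illustrative vocabulary of galaxy types."@en;
    dc:date "2008-05-23";
    skos:hasTopConcept &lt;#galaxy&gt; .

# A top concept, with a change note recording its history
&lt;#galaxy&gt; a skos:Concept;
    skos:inScheme &lt;&gt;;
    skos:prefLabel "galaxy"@en;
    skos:changeNote "Added in the first draft of this example."@en .

# A collection grouping concepts which naturally belong together
&lt;#galaxyTypes&gt; a skos:Collection;
    skos:prefLabel "galaxy types"@en;
    skos:member &lt;#spiralGalaxy&gt;, &lt;#barredSpiralGalaxy&gt; .
</pre>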
708 norman.x.gray 21 </div>
709    
710 alasdair.gray 25
711 norman.x.gray 77 <div class='section' id='skos-relationships'>
712 norman.x.gray 420 <p class='title'>Mapping relationships between vocabularies</p>
713 norman.x.gray 21
714 norman.x.gray 420 <p>There already exist several vocabularies in the domain of astronomy.
715 alasdair.gray 27 Instead of attempting to replace all these existing vocabularies,
716     which have been developed to serve different aims and user groups,
717     we embrace them.
718     This requires a mechanism to relate the concepts in the different
719 norman.x.gray 420 vocabularies.</p>
720 alasdair.gray 27
721 norman.x.gray 420 <p>Part of the SKOS standard <span class='cite'>std:skosref</span>
722 alasdair.gray 59 allows a concept in one vocabulary to be related to a concept in
723     another vocabulary.
724     There are four types of relationship provided to capture the
725     relationships between concepts in vocabularies, which are similar to
726     those defined for relationships between concepts within a single
727     vocabulary.
728 norman.x.gray 536 The types of mapping relationships are as follows.</p>
729 alasdair.gray 59
730 norman.x.gray 21 <ul>
731 alasdair.gray 27
732     <li>
733 alasdair.gray 35 Equivalence between concepts, i.e. the concepts in the different
734 alasdair.gray 27 vocabularies refer to the same real world entity.
735 alasdair.gray 59 This is captured with the RDF statement
736     <blockquote>
737     <code>AAkeys:#Cosmology skos:exactMatch aoim:#Cosmology</code>
738     </blockquote>
739     which states that the cosmology concept in the A&amp;A Keywords is the
740     same as the cosmology concept in the AOIM.
741     (Note the use of the external namespaces <code>AAkeys</code> and
742     <code>aoim</code> which must be defined within the document.)
743 alasdair.gray 25 </li>
744 alasdair.gray 27
745     <li>
746     Broader concept, i.e. there is not an equivalent concept but there is
747     a more general one.
748 alasdair.gray 59 This is captured with the RDF statement
749     <blockquote>
750     <code>AAkeys:#Moon skos:broadMatch aoim:#PlanetSatellite</code>
751     </blockquote>
752 norman.x.gray 536 which states that the AOIM concept <q>Planet Satellite</q> is a more general
753     term than the A&amp;A Keywords concept <q>Moon</q>.
754 alasdair.gray 27 </li>
755    
756     <li>
757     Narrower concept, i.e. there is not an equivalent concept but there is
758     a more specific one.
759 alasdair.gray 59 This is captured with the RDF statement
760     <blockquote>
761     <code>AAkeys:#IsmClouds skos:narrowMatch
762     aoim:#NebulaAppearanceDarkMolecularCloud</code>
763     </blockquote>
764 norman.x.gray 536 which states that the AOIM concept <q>Nebula Appearance Dark Molecular
765     Cloud</q> is more specific than the A&amp;A Keywords concept <q>ISM Clouds</q>.
766 alasdair.gray 27 </li>
767    
768     <li>
769     Related concept, i.e. there is some form of relationship.
770 alasdair.gray 59 This is captured with the RDF statement
771     <blockquote>
772     <code>AAkeys:#BlackHolePhysics skos:relatedMatch
773     aoim:#StarEvolutionaryStageBlackHole</code>
774     </blockquote>
775 norman.x.gray 536 which states that the A&amp;A Keywords concept <q>Black Hole Physics</q> has
776     an association with the AOIM concept <q>Star Evolutionary Stage Black Hole</q>.
777 alasdair.gray 27 </li>
778    
779 alasdair.gray 25 </ul>
780 norman.x.gray 21
781 norman.x.gray 70 <p>The semantic mapping relationships have certain properties.
782 alasdair.gray 59 The broadMatch relationship has the narrowMatch relationship as its
783     inverse and the exactMatch and relatedMatch relationships are
784     symmetrical.
785     The consequence of these properties is that if you have a mapping from
786     concept <code>A</code> in one vocabulary to concept <code>B</code> in
787     another vocabulary then you can infer a mapping from concept
788     <code>B</code> to concept <code>A</code>.
789 alasdair.gray 27 </p>
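<p>The inference described above can be sketched in Turtle. The two
namespace URIs are placeholders, not the real locations of these
vocabularies, and the prefixed names are written in standard Turtle
form (the <code>#</code> is part of the prefix declaration rather than
of the local name). The commented statements are those which an
application could infer, not statements the mapping author needs to
write.</p>
<pre class='turtle'>
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
@prefix AAkeys: &lt;http://example.org/vocab/AAkeys#&gt; .  # hypothetical URI
@prefix aoim: &lt;http://example.org/vocab/aoim#&gt; .      # hypothetical URI

# broadMatch and narrowMatch are inverses of each other
AAkeys:Moon skos:broadMatch aoim:PlanetSatellite .
# inferable: aoim:PlanetSatellite skos:narrowMatch AAkeys:Moon .

# exactMatch and relatedMatch are symmetric
AAkeys:Cosmology skos:exactMatch aoim:Cosmology .
# inferable: aoim:Cosmology skos:exactMatch AAkeys:Cosmology .
</pre>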
790    
791 norman.x.gray 420 <p class='todo'>At the time of writing, the SKOS document is still a
792 alasdair.gray 534 working draft, and may or may not end up with support for mappings in
793     the core document rather than in a companion document. This section
794     of this Working Draft, and other references to mappings below, should
795     therefore be considered as current best practice and could be updated
796     in a subsequent version of this document once the SKOS document has
797     become a standard.</p>
798 norman.x.gray 420
799 norman.x.gray 21 </div>
800    
801 norman.x.gray 420 <div class='section' id='vocabversions'>
802     <p class='title'>Vocabulary versions</p>
803    
804     <p>The document <span class='cite'>kendall08</span> discusses good
805     practice for managing RDF vocabularies. At the time of writing (2008
806     May) this is still an editor's draft, and it itself notes that good
807     practice in this area is not yet fully stable, so our recommendations
808     here are necessarily tentative, and in some places restricted to the
809     relatively small vocabularies (100s to 1000s of terms) we expect to
810     encounter in the VO. We expect to adjust or enhance this advice in
811     future editions of this Recommendation, as best practice evolves, or
812     as we gain more experience with the relevant vocabularies.</p>
813    
814     <p>We must distinguish between versions of a vocabulary, and versions
815     of the description of a vocabulary. In the former case, we are
816     concerned with the presence or absence of certain concepts, such as
817     <q>star</q> or <q>GRB</q>, and expect that there will be some
818     reasonably stable relationship between the concept URI and the
819     real-world concept it refers to. In the latter case, we are concerned
820     with the technicalities of associating a concept URI with its
821     labels, its description, and with other related concept URIs. While
822     it is true that there are epistemological commitments involved in the
823     simple act of naming (and the terms <q>GRB</q> and <q>planet</q>
824     remind us that there is knowledge implicit within a name), it is the
825     latter case that generally represents the <em>knowledge</em> we have
826     of an object, and it is this knowledge which we must version.</p>
827    
828     <p>In consequence, <em>the concept URIs should not carry
829     version information</em>. The partial exception to this is when a
830     vocabulary undergoes a major restructuring, as a result of the terms
831     in it becoming significantly incoherent – for example, we might
832     imagine the IAU92 thesaurus being updated to form an IAU 200x
833     thesaurus – but in this case we should regard the result as a new
834     vocabulary, rather than simply an adjusted version of an old one.</p>
835    
836 norman.x.gray 536 <p>All the terms in the SKOS vocabulary appear in an unversioned
837     namespace, and once in the vocabulary they are not removed <span
838     class='cite'>kendall08</span>. Successive versions of the vocabulary
839     description describe the vocabulary terms as <q>unstable</q>, <q>testing</q>,
840 norman.x.gray 420 <q>stable</q> or <q>deprecated</q>.</p>
841 norman.x.gray 536 <!-- there seems to be no
842     discussion of this in a SKOS document, as opposed to commentary on
843     SKOS -->
844 norman.x.gray 420
845     <p>The Dublin Core namespaces are managed in a similar way <span
846     class='cite'>dc:namespaces</span>. The namespace URIs, which act as
847     common prefixes to the DC terms, and which are defined using a <q>hash
848     URI</q> strategy, in RDF terms, have no version numbers, so that
849     the namespace for the DC terms vocabulary is <span
850     class='url'>http://purl.org/dc/terms/</span>. Terms such as <span
851     class='url'>http://purl.org/dc/terms/extent</span> then 302-redirect
852     to a URL which, for administrative convenience, happens to contain a
853     release date, but which resolves to RDF which defines the unversioned
854     term <span class='url'>http://purl.org/dc/terms/extent</span>. This
855     file includes the following content (translated into Turtle from the
856     original RDF/XML for legibility).</p>
857     <pre>@prefix rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .
858     @prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
859     @prefix dcam: &lt;http://purl.org/dc/dcam/&gt; .
860     @prefix dcterms: &lt;http://purl.org/dc/terms/&gt; .
861     @prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
862    
863     &lt;http://purl.org/dc/terms/&gt;
864     dcterms:title """DCMI Namespace for metadata terms
865     in the http://purl.org/dc/terms/ namespace"""@en-us;
866     rdfs:comment """To comment on this schema,
867     please contact dcmifb@dublincore.org.""";
868     dcterms:publisher "The Dublin Core Metadata Initiative"@en-us;
869     dcterms:modified "2008-01-14" .
870    
871     dcterms:extent
872     rdfs:label "Extent"@en-us;
873     rdfs:comment "The size or duration of the resource."@en-us;
874     rdfs:isDefinedBy &lt;http://purl.org/dc/terms/&gt;;
875     dcterms:issued "2000-07-11";
876     dcterms:modified "2008-01-14";
877     a rdf:Property;
878     dcterms:hasVersion &lt;http://dublincore.org/usage/terms/history/#extent-003&gt;;
879     rdfs:range dcterms:SizeOrDuration;
880     rdfs:subPropertyOf &lt;http://purl.org/dc/elements/1.1/format&gt;,
881     dcterms:format .
882     ...
883     </pre>
884     <p>This includes the definition of the (unversioned) <span class='url'
885     >http://purl.org/dc/terms/extent</span> concept, along with semantic
886     knowledge about the concept (<code>rdfs:subPropertyOf</code>) as of
887     2008-01-14, plus other editorial (<code>dcterms:modified</code>) and
888     definitional (<code>rdfs:isDefinedBy</code>) metadata.</p>
889    
890 norman.x.gray 43 </div>
891    
892 norman.x.gray 420 </div>
893    
894 norman.x.gray 43 <div class='section' id='publishing'>
895     <p class='title'>Publishing vocabularies (normative)</p>
896    
897     <div class='section' id='pubreq'>
898     <p class='title'>Requirements</p>
899    
900     <p>A vocabulary which conforms to this IVOA standard has the following
901     features. In this section, the keywords
902     <span class='rfc2119' >must</span>,
903     <span class='rfc2119' >should</span>
904     and so on, are to be interpreted as described in <span
905     class='cite'>std:rfc2119</span>.</p>
906    
907     <div class='section'>
908 norman.x.gray 48 <p class='title'>Dereferenceable namespace</p>
909 norman.x.gray 43
910 norman.x.gray 420 <p>The namespace of the vocabulary <span class='rfc2119'>must</span>
911     be dereferenceable on the web. That is, typing the namespace URL into
912     a web browser will produce human-readable documentation about the
913     vocabulary. In addition, the namespace URL <span class='rfc2119'
914     >should</span> return an RDF version of the vocabulary if it is
915     retrieved with one of the RDF MIME types in the HTTP Accept header.
916     At the time of writing, the only fully standardised RDF MIME type is
917     <code>application/rdf+xml</code> for RDF/XML, but
918     <code>text/rdf+n3</code> and <code>text/turtle</code> are the proposed
919     types for Notation3 <span class='cite'>notation3</span> and Turtle
920     <span class='cite'>std:turtle</span>, respectively.</p>
921 norman.x.gray 43
922     <p><em>Rationale: These prescriptions are intended to be compatible
923     with the patterns described in <span class='cite'>berrueta08</span>
924 norman.x.gray 420 and <span class='cite'>sauermann08</span>, and vocabulary distributors
925 norman.x.gray 43 <span class='rfc2119' >should</span> follow these patterns where
926     possible.</em></p>
927     </div>
928    
929     <div class='section'>
930 norman.x.gray 48 <p class='title'>Long-term availability</p>
931 norman.x.gray 43
932 alasdair.gray 534 <p>The files defining a vocabulary, including those of superseded
933     versions, <span class='rfc2119' >should</span> remain permanently
934     available. There is no requirement that the namespace URL be at any
935     particular location, although the IVOA web pages, or a journal
936     publisher's web pages, would likely be suitable archival
937     locations.</p> </div>
938 norman.x.gray 43
939     <div class='section'>
940 norman.x.gray 48 <p class='title'>Distribution format</p>
941 norman.x.gray 43
942     <p>Vocabularies <span class='rfc2119'>must</span> be made available
943 alasdair.gray 534 for distribution as SKOS RDF files in RDF/XML <span
944     class='cite'>std:rdfxml</span> format. A human readable version in
945     Turtle <span class='cite'>std:turtle</span> format <span
946     class='rfc2119'>should</span> also be made available. As an
947     alternative to Turtle, vocabularies may be made available in that
948 norman.x.gray 420 subset of Notation3 <span class='cite'>notation3</span> which is
949     compatible with Turtle; if Turtle or Notation3 is being served, it is
950     prudent to support both <code>text/rdf+n3</code> and
951     <code>text/turtle</code> as MIME types in the <code>Accept</code>
952 alasdair.gray 534 header of the HTTP request. <!-- See issue <a
953     href='@ISSUESLIST@#distformat-2'>[distformat-2]</a> --></p>
954 norman.x.gray 43
955 norman.x.gray 420 <p>A publisher <span class='rfc2119'>may</span> make available RDF in
956     other formats, or other supporting files. A publisher <span
957     class='rfc2119'>must</span> make available at least some
958     human-readable documentation – see section <span
959     class='xref'>servevocab</span> for a discussion of the mechanics here.</p>
960 norman.x.gray 43
961     <p><em>Rationale: this does imply that the vocabulary source files can only
962     realistically be parsed using an RDF parser. An alternative is to
963     require that vocabularies be distributed using a subset of RDF/XML
964     which can also be naively handled as traditional XML; however as well
965     as creating an extra standardisation requirement, this would make it
966     effectively infeasible to write out the distribution version of the
967     vocabulary using an RDF or general SKOS tool.</em></p>
968     </div>
969    
970 norman.x.gray 420 <div class='section' id='versioning'>
971 norman.x.gray 48 <p class='title'>Clearly versioned vocabulary</p>
972 norman.x.gray 43
973 alasdair.gray 534 <p>The vocabulary <em>namespace</em> <span class='rfc2119' >should
974     not</span> be versioned, but it <span class='rfc2119' >should</span>
975     be easy to retrieve earlier versions of the RDF describing the
976 norman.x.gray 420 vocabulary. See the discussion in section <span
977     class='xref'>vocabversions</span> for the rationale for this, and see
978     section <span class='xref'>servevocab</span> for a discussion of its
979     implications for the way that vocabularies are served on the web.</p>
980 norman.x.gray 43
981 norman.x.gray 420 <!-- Issue <a href='@ISSUESLIST@#versioning-3' >[versioning-3]</a>-->
982    
983 norman.x.gray 43 </div>
984    
985     <div class='section'>
986 norman.x.gray 48 <p class='title'>No restrictions on source files</p>
987 norman.x.gray 43
988 norman.x.gray 420 <p>This Recommendation does not place any restrictions on the format of the
989 norman.x.gray 43 files managed by the maintenance process, as long as the distributed
990 alasdair.gray 534 files are as specified above.
991     <!-- See issue
992     <a href='@ISSUESLIST@#masterformat-1' >[masterformat-1]</a> -->
993     </p>
994 norman.x.gray 43 </div>
995    
996     </div>
997    
998 norman.x.gray 36 <div class='section' id='practices'>
999 norman.x.gray 420 <p class='title'>Good practices of vocabulary design</p>
1000 norman.x.gray 21
1001 norman.x.gray 43 <p>This standard imposes a number of requirements on conformant
1002 norman.x.gray 420 vocabularies (see <span class='xref' >pubreq</span>). In
1003 norman.x.gray 43 this section we list a number of good practices that IVOA vocabularies
1004     <span class='rfc2119'>should</span> abide by. Some of the
1005     prescriptions below are more specific than good-practice guidelines
1006     for vocabularies in general.</p>
1007    
1008 norman.x.gray 57 <p>The adoption of the following guidelines will make it easier to use
1009     vocabularies in generic VO applications. However, VO applications
1010     <span class='rfc2119'>should</span> be able to accept any vocabulary
1011     that complies with the latest SKOS standard
1012 norman.x.gray 420 <span class="cite">std:skosref</span> (this is a syntactical
1013     requirement, and does not imply that an application will necessarily
1014     understand the terms in
1015 norman.x.gray 57 an alien vocabulary, although the presence of mappings to a known
1016     vocabulary should allow it to derive some benefit).</p>
1017    
1018 norman.x.gray 21 <ol>
1019    
1020 norman.x.gray 43 <li>Concept identifiers <span class='rfc2119'>should</span> consist
1021     only of the letters a-z, A-Z, and numbers 0-9, i.e. no spaces, no
1022 norman.x.gray 70 exotic letters (for example umlauts), and no characters which would make a
1023 norman.x.gray 43 token inexpressible as part of a URI; tokens are for use by
1024     computers only, so this is not a significant restriction, and the exotic
1025     letters can be used within the labels and documentation if
1026     appropriate.</li>
1027 norman.x.gray 21
1028 norman.x.gray 43 <li>The concept identifiers <span class='rfc2119'>should</span> be
1029     kept in human-readable form, directly reflect the implied meaning, and
1030     not be semi-random identifiers only (for example, use
1031 norman.x.gray 70 <code>spiralGalaxy</code>, not <code>t1234567</code>); tokens <span
1032 norman.x.gray 43 class='rfc2119'>should</span> preferably be created via a direct
1033     conversion from the preferred label via removal/translation of
1034     non-token characters (see above) and sub-token separation via
1035 norman.x.gray 70 capitalisation of the first sub-token character (for example the label <code>My
1036     favourite idea-label #42</code> is converted into
1037     <code>MyFavouriteIdeaLabel42</code>).</li>
1038 norman.x.gray 21
1039 norman.x.gray 420 <li>Labels <span class='rfc2119'>should</span> be unchanged from the
1040     labels appearing in the source vocabulary. When
1041     developing a new vocabulary, standard thesaurus practice indicates that
1042     English-language labels for concrete concepts should be pluralised –
1043     thus <code>"galaxies"@en</code>, but <code>"astronomy"@en</code>;
1044     thesaurus practice in other European languages uses the singular for
1045     all cases.</li>
1046     <!--
1047     the singular form <span
1048 norman.x.gray 43 class='rfc2119'>should</span> be preferred,
1049 norman.x.gray 70 for example <code>spiral galaxy</code>, not <code>spiral galaxies</code>.
1050     <span class='todo'><a href="http://code.google.com/p/volute/issues/detail?id=1">Open issue</a></span></li>
1051 norman.x.gray 420 -->
1052 norman.x.gray 21
1053 norman.x.gray 43 <li>Each concept <span class='rfc2119'>should</span> have a definition
1054 alasdair.gray 25 (<code>skos:definition</code>) that constitutes a short description of
1055     the concept which could be adopted by an application using the
1056 norman.x.gray 57 vocabulary. Each concept <span class='rfc2119'>should</span> have
1057     additional documentation using SKOS Notes or
1058     Dublin Core terms as appropriate
1059 norman.x.gray 420 (see <span class='cite'>std:skosref</span>). In practice, this
1060     requirement is rather difficult to satisfy, since pre-existing
1061     structured vocabularies, being converted to SKOS, frequently provide
1062     only labels, and not fuller descriptions or scope notes.</li>
1063 norman.x.gray 21
1064 norman.x.gray 57 <li>The language localisation <span class='rfc2119'>should</span> be
1065 norman.x.gray 43 declared where appropriate, in preferred labels, alternate labels,
1066 norman.x.gray 57 definitions, and the like.</li>
1067 norman.x.gray 21
1068 alasdair.gray 25 <li>Relationships (<q>broader</q>, <q>narrower</q>, <q>related</q>)
1069 norman.x.gray 43 between concepts <span class='rfc2119'>should</span> be present, but
1070     are not required; if used, they <span class='rfc2119'>should</span> be
1071     complete (thus all <q>broader</q> links have corresponding
1072 alasdair.gray 25 <q>narrower</q> links in the referenced entries and <q>related</q>
1073     entries link to each other).</li>
1074 norman.x.gray 21
1075 norman.x.gray 43 <li><q>TopConcept</q> entries (see above) <span
1076     class='rfc2119'>should</span> be declared and normally consist of
1077     those concepts that do not have any <q>broader</q> relationships
1078     (i.e. not at a subordinate position in the hierarchy).</li>
1079 norman.x.gray 21
1080 norman.x.gray 61 <li>The SKOS standard describes some good practices for vocabulary
1081     maintenance, such as using <code>&lt;skos:changeNote&gt;</code> and
1082 norman.x.gray 420 the like, and these are elaborated in the (currently draft) note <span
1083 norman.x.gray 536 class='cite'>kendall08</span>. At a minimum, the vocabulary's
1084     <code>skos:ConceptScheme</code> <span class='rfc2119'>must</span> be
1085     annotated with DC title, creator, description and date terms
1086     <span class='cite'>std:dublincore</span>,
1087     <span class='cite'>std:pubguide</span>.
1088     Publishers <span class='rfc2119'>should</span> respect such good practices
1089     as are available to direct vocabulary development and
1090     maintenance.</li>
1091 norman.x.gray 61
1092 norman.x.gray 43 <li>Publishers <span class='rfc2119'>should</span> publish
1093     <q>mappings</q> between their vocabularies and other commonly used
1094 norman.x.gray 536 vocabularies. These <span class='rfc2119'>must</span> be external to
1095 norman.x.gray 43 the defining vocabulary document so that the vocabulary can be used
1096 alasdair.gray 534 independently of the publisher's mappings. The mapping file <span
1097     class='rfc2119' >should</span> contain metadata using suitable Dublin
1098 norman.x.gray 536 Core terms.</li>
1099 norman.x.gray 21
1100 norman.x.gray 536 <li>When adapting an existing vocabulary into the SKOS format,
1101     implementors <span class='rfc2119'>should not</span> change the form
1102     of labels (for example changing the grammatical number) beyond any
1103     changes necessarily required by SKOS.</li>
1104    
1105     </ol>
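<p>Several of these practices can be seen together in the following
Turtle sketch. The concepts and their definition text are hypothetical
illustrations: each concept has a language-tagged label, labels for
concrete concepts are plural, every <code>skos:broader</code> link has
a matching <code>skos:narrower</code> link, and
<code>skos:related</code> links are stated in both directions.</p>
<pre class='turtle'>
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .

&lt;#galaxy&gt; a skos:Concept;
    skos:prefLabel "galaxies"@en;
    skos:definition "A gravitationally bound system of stars, gas and dust."@en;
    skos:narrower &lt;#spiralGalaxy&gt; .

&lt;#spiralGalaxy&gt; a skos:Concept;
    skos:prefLabel "spiral galaxies"@en;
    skos:definition "A galaxy having a spiral structure."@en;
    skos:broader &lt;#galaxy&gt;;      # reciprocal of the narrower link above
    skos:related &lt;#spiralArm&gt; .

&lt;#spiralArm&gt; a skos:Concept;
    skos:prefLabel "spiral arms"@en;
    skos:related &lt;#spiralGalaxy&gt; .  # related links stated in both directions
</pre>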
1106    
1107 norman.x.gray 57 <!--
1108 alasdair.gray 34 <p>These suggestions are by no means trivial – there was
1109     considerable discussion within the semantic working group on many of
1110     these topics, particularly about token formats (some wanted lower-case
1111     only), and singular versus plural forms of the labels (different
1112     traditions exist within the international library science
1113     community). Obviously, no publisher of an astronomical vocabulary has
1114     to adopt these rules, but the adoption of these rules will make it
1115     easier to use the vocabulary in external generic VO
1116     applications. However, VO applications should be developed to accept
1117     any vocabulary that complies with the latest SKOS standard <span
1118 alasdair.gray 59 class="cite">std:skosref</span>.</p>
1119 norman.x.gray 57 -->
1120     </div>
1121 norman.x.gray 21
1122 norman.x.gray 420 <div class='section' id='servevocab'>
1123     <p class='title'>Good practices when serving vocabularies on the web</p>
1124    
1125     <p>The W3C Interest Group Note <em>Cool URIs for the Semantic Web</em>
1126     <span class='cite'>sauermann08</span> presents guidelines for the
1127     effective use of URIs when serving web documents and concepts on the
1128     Semantic Web. When providing vocabularies to the VO, we recommend
1129     that publishers conform to these guidelines in general. We make some
1130     further observations below.</p>
1131    
1132     <p>The “Cool URIs” guidelines describe a number of desirable
1133     features of URIs in this context, namely simplicity, stability and
1134     manageability. Section 4.5 of the document describes
1135     these features as follows (quoted directly).</p>
1136     <dl>
1137     <dt>Simplicity</dt>
1138     <dd>Short, mnemonic URIs will not break as easily when sent in emails
1139     and are in general easier to remember, e.g. when debugging your
1140     Semantic Web server.</dd>
1141     <dt>Stability</dt>
1142     <dd>Once you set up a URI to identify a certain resource, it should
1143     remain this way as long as possible. Think about the next ten
1144     years. Maybe twenty. Keep implementation-specific bits and pieces such
1145     as .php and .asp out of your URIs, you may want to change technologies
1146     later.</dd>
1147     <dt>Manageability</dt>
1148     <dd>Issue your URIs in a way that you can manage. One good practice is
1149     to include the current year in the URI path, so that you can change
1150     the URI-schema each year without breaking older URIs. Keeping all 303
1151     URIs on a dedicated subdomain,
1152     e.g. <code>http://id.example.com/alice</code>, eases later migration
1153     of the URI-handling subsystem.</dd>
1154     </dl>
1155     <p>We endorse this advice in this Recommendation: VO vocabularies
1156     <span class='rfc2119'>should</span> use URIs which have these
1157     properties. The advice in the third point concerns the management of
1158     the overall URI namespace on a particular server; it is
1159     not about versioning vocabulary namespaces.</p>
1160    
1161     <p>The “Cool URIs” document also describes two broad strategies for
1162     making these URIs available on the web, which they name <em>303
1163     URIs</em> and <em>hash URIs</em> (see the document, section 4, for
1164     descriptions). They note that the <em>hash URI</em> strategy <q>should
1165     be preferred for rather small and stable sets of resources that evolve
1166     together. The ideal case[s] are RDF Schema vocabularies and OWL
1167     ontologies, where the terms are often used together, and the number of
1168     terms is unlikely to grow out of control in the future.</q> Since
1169     this is the case for the (relatively small) SKOS vocabularies this
1170     Recommendation discusses, and since an application will generally want
1171     to use the complete vocabulary rather than only single concepts, we
1172     suggest that vocabularies conformant to this Recommendation <span
1173     class='rfc2119'>should</span> be distributed as <em>hash URI</em>
1174     ones.</p>
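<p>As a minimal sketch of why the <em>hash URI</em> strategy suits a
vocabulary that is used as a whole: a client removes the fragment
before making the HTTP request, so every concept in the vocabulary is
retrieved from the same document. The namespace and concept names
below are hypothetical, chosen only for illustration.</p>

```python
from urllib.parse import urldefrag

# Hypothetical namespace and concept names, for illustration only.
NAMESPACE = "http://vocab.example.org/rdf/AAkeys"
CONCEPTS = [NAMESPACE + "#Cosmology", NAMESPACE + "#Sun"]


def split_hash_uri(concept_uri):
    """Return (document URI, fragment) for a hash-style concept URI."""
    document, fragment = urldefrag(concept_uri)
    return document, fragment


# Every concept URI dereferences to the same vocabulary document.
documents = {split_hash_uri(uri)[0] for uri in CONCEPTS}
print(documents)
```

<p>Since all concepts share one document, a single retrieval gives an
application the complete vocabulary.</p>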
1175    
1176     <p>Common to the two strategies above is the insistence that the
1177     vocabulary URIs <em>are HTTP URIs which are retrievable on the
1178     web</em> – they differ only in the practicalities of achieving this.
1179     The strategies also share the expectation that the vocabulary URIs are
1180     retrievable both as RDF (machine-readable) and as HTML (providing
1181     documentation for humans). We elevate this to a requirement of this
1182     Recommendation: vocabulary terms <span class='rfc2119'>must</span> be
1183     HTTP URIs which <span class='rfc2119'>must</span> be dereferenceable
1184     as both RDF and HTML using the mechanism appropriate to the URI naming
1185     strategy.</p>
1186    
1187     </div>
1188    
1189     <div class='section'>
1190     <p class='title'>Example: serving the A&amp;A vocabulary</p>
1191    
1192     <p>While <span class='cite'>sauermann08</span> discusses the design of
1193     the URIs naming concepts, it says little about the mechanics of making
1194     these available on the web. We refer vocabulary publishers to the
1195     recipe advice contained in <span class='cite'>berrueta08</span>, which
1196     we illustrate here in the case of the <em>hash URI</em> strategy.</p>
1197    
1198     <p>The A&amp;A vocabulary has the namespace <span
1199     class='url'>@BASEURI@/AAkeys</span>. In accordance with the above
1200     guidelines, this namespace URI is dereferenceable, and if you enter
1201     the URI into a web browser, you will end up at a page describing the
1202     vocabulary. The way this works can be illustrated by using
1203     <code>curl</code> to dereference the URI (URIs are cropped for legibility):</p>
1204 norman.x.gray 536 <pre>% curl --include http://[...]/rdf/AAkeys
1205 norman.x.gray 420 HTTP/1.1 303 See Other
1206     Date: Thu, 08 May 2008 14:07:12 GMT
1207     Server: Apache
1208     Location: http://[...]/rdf/vocabularies-2008-05-08/AAkeys/AAkeys.html
1209     Connection: close
1210 norman.x.gray 536 Content-Type: text/html; charset=utf-8
1211    
1212     &lt;title&gt;Redirected&lt;/title&gt;
1213     &lt;p&gt;See &lt;a href='http://[...]/rdf/vocabularies-2008-05-08/AAkeys/AAkeys.html'
1214     &gt;elsewhere&lt;/a&gt;&lt;/p&gt;
1215 norman.x.gray 420 </pre>
1216     <p>The server has responded to the HTTP GET for the URI with a 303
1217     response, and a <code>Location</code> header, pointing to the HTML
1218 norman.x.gray 536 representation of the vocabulary. In this example, the server has
1219     included a brief HTML explanation in case a human happens to see this response.</p>
1220 norman.x.gray 420
1221     <p>If we instead request an RDF representation, by stating a desired
1222     MIME type in the HTTP <code>Accept</code> header, we get a slightly
1223     different response:</p>
1224     <pre>% curl --head -H accept:text/turtle http://[...]/rdf/AAkeys
1225     HTTP/1.1 303 See Other
1226     Date: Thu, 08 May 2008 14:11:28 GMT
1227     Server: Apache
1228     Location: http://[...]/rdf/vocabularies-2008-05-08/AAkeys/AAkeys.ttl
1229     Connection: close
1230     Content-Type: text/html; charset=iso-8859-1
1231     </pre>
1232     <p>This is also a 303 response, but the <code>Location</code> header
1233     this time points to an RDF file in Turtle syntax, which we can now retrieve normally.</p>
1234     <pre>% curl --include http://[...]/rdf/vocabularies-2008-05-08/AAkeys/AAkeys.ttl
1235     HTTP/1.1 200 OK
1236     Date: Thu, 08 May 2008 14:13:35 GMT
1237     Server: Apache
1238     Content-Type: text/turtle; charset=utf-8
1239    
1240     @base &lt;http://[...]/rdf/AAkeys&gt; .
1241     @prefix rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .
1242     @prefix dc: &lt;http://purl.org/dc/elements/1.1/&gt; .
1243     @prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
1244     @prefix : &lt;http://www.w3.org/2004/02/skos/core#&gt; .
1245    
1246     &lt;&gt;
1247     dc:created "2008-05-08" ;
1248     dc:title "Vocabulary for Astronomy &amp; Astrophysics Journal keywords (Version wd-1.0)"@en ;
1249     a :ConceptScheme ;
1250    
1251     # and so on...
1252     </pre>
1253     <p>Note that the base URI in the returned RDF still refers to the
1254     unversioned concept names.</p>
1255    
1256     <p>This behaviour is controlled by (in this case) an Apache
1257     <code>.htaccess</code> file which looks like this:</p>
1258     <pre>AddType application/rdf+xml .rdf
1259     # The MIME type for .n3 should be text/rdf+n3, not application/n3:
1260     # see MIME notes at &lt;http://www.w3.org/2000/10/swap/doc/changes.html&gt;
1261     #
1262     # The MIME type for Turtle is text/turtle, though this has not
1263     # completed its registration: see
1264     # &lt;http://www.w3.org/TeamSubmission/turtle/#sec-mediaReg&gt;
1265     AddType text/rdf+n3 .n3
1266     AddType text/turtle .ttl
1267     # For Charset types, see &lt;http://www.iana.org/assignments/character-sets&gt;
1268     AddCharset UTF-8 .n3
1269     AddCharset UTF-8 .ttl
1270    
1271     RewriteEngine On
1272     # This will match the directory where this file is located.
1273     RewriteBase /users/norman/ivoa/vocabularies/rdf
1274    
1275     RewriteCond %{HTTP_ACCEPT} application/rdf\+xml
1276     RewriteRule ^(AAkeys|AOIM|UCD|IVOAT|IAUT93)$ vocabularies-2008-05-08/$1/$1.rdf [R=303]
1277    
1278     RewriteCond %{HTTP_ACCEPT} text/rdf\+n3 [OR]
1279     RewriteCond %{HTTP_ACCEPT} application/n3 [OR]
1280     RewriteCond %{HTTP_ACCEPT} text/turtle
1281     RewriteRule ^(AAkeys|AOIM|UCD|IVOAT|IAUT93)$ vocabularies-2008-05-08/$1/$1.ttl [R=303]
1282    
1283 norman.x.gray 536 # Any other accept header, including none: make the .html version the default
1284 norman.x.gray 420 RewriteRule ^(AAkeys|AOIM|UCD|IVOAT|IAUT93)$ vocabularies-2008-05-08/$1/$1.html [R=303]
1285     </pre>
1286     <p>These various <code>RewriteRule</code> statements examine the
1287     content of the HTTP <code>Accept</code> header, and return
1288     303-redirections to the appropriate actual resource.</p>
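<p>The decision these rules make can be sketched outside Apache as
follows. This is an illustrative model of the rewrite rules above, not
part of the distributed configuration; the function name
<code>redirect_target</code> is our own. Like the
<code>RewriteCond</code> patterns, it only searches the
<code>Accept</code> header for a MIME type and ignores any quality
values.</p>

```python
import re

VERSION_DIR = "vocabularies-2008-05-08"
VOCABULARIES = ("AAkeys", "AOIM", "UCD", "IVOAT", "IAUT93")

# (pattern searched for in the Accept header, file extension), tried in order.
RULES = (
    (r"application/rdf\+xml", "rdf"),
    (r"text/rdf\+n3|application/n3|text/turtle", "ttl"),
)


def redirect_target(name, accept=""):
    """Return the relative Location for the 303 response, or None."""
    if name not in VOCABULARIES:
        return None
    extension = "html"  # default for any other Accept header, including none
    for pattern, candidate in RULES:
        if re.search(pattern, accept):
            extension = candidate
            break
    return "{0}/{1}/{1}.{2}".format(VERSION_DIR, name, extension)


print(redirect_target("AAkeys", "text/turtle"))
```

<p>Because the rules are tried in order, a request whose
<code>Accept</code> header mentions both RDF/XML and Turtle is
redirected to the RDF/XML file.</p>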
1289    
1290     <p>Note that the namespace remains unversioned throughout the
1291     maintenance history of this vocabulary, even though the actual RDF
1292     files being returned might change as labels or relationships are
1293     adjusted. Previous versions of the vocabulary RDF will remain
1294     available, though they will no longer be served by dereferencing the
1295     namespace URL.</p>
1296    
1297     </div>
1298    
1299 alasdair.gray 27 </div>
1300 norman.x.gray 21
1301    
1302 norman.x.gray 57 <div class="section" id='distvocab'>
1303 norman.x.gray 420 <p class="title">Example vocabularies (informative)</p>
1304 norman.x.gray 2
1305 norman.x.gray 57 <p>The intent of having the IVOA adopt SKOS as the preferred format for
1306 norman.x.gray 21 astronomical vocabularies is to encourage the creation and management
1307     of diverse vocabularies by competent astronomical groups, so that
1308     users of the VO and related resources can benefit directly and
1309 alasdair.gray 25 dynamically without the intervention of the IAU or IVOA. However, we
1310     felt it important to provide several examples of vocabularies in the
1311     SKOS format as part of the proposal, to illustrate their simplicity
1312 norman.x.gray 57 and power, and to provide an immediate vocabulary basis for VO
1313 alasdair.gray 25 applications.</p>
1314 norman.x.gray 21
1315 norman.x.gray 61 <p>The vocabularies described below are included, as SKOS files, in
1316     the distributed version of this standard. These vocabularies have
1317 norman.x.gray 420 stable URLs, and may be cited and used indefinitely. These
1318     vocabularies will not, however, be developed as part of the
1319     maintenance of this standard. Interested groups, within and outwith
1320     the IVOA, are encouraged to take these as a starting point and absorb
1321     them within existing processes.</p>
1322 norman.x.gray 43
1323 norman.x.gray 64 <p>The exceptions to this rule are the constellation vocabulary,
1324     provided here mainly for didactic purposes, and the proposed IVOA
1325     Thesaurus, which is being developed as a separate project and whose
1326     aim is to provide a corrected, more user-friendly, more complete, and
1327     updated version of the 1993 IAU thesaurus. Although work on the IVOA
1328     Thesaurus is on-going, the fact that it is largely based on the IAU
1329     thesaurus means that it is already a very useful resource, so a usable
1330     snapshot of this vocabulary will be published with the other
1331     examples.</p>
1332 norman.x.gray 61
1333 norman.x.gray 2 <p>We provide a set of SKOS files representing the vocabularies which
1334 norman.x.gray 420 have been developed, and mappings between them. These vocabularies
1335     have base URIs starting <span class='url'>@BASEURI@</span>, and can be
1336 alasdair.gray 25 downloaded at the URL</p>
1337 norman.x.gray 5 <blockquote>
1338     <span class='url'>@BASEURI@/@DISTNAME@.tar.gz</span>
1339     </blockquote>
1340 norman.x.gray 2
1341 alasdair.gray 65 <!--
1342 norman.x.gray 70 <p class='todo'>Not yet: instead go to
1343     <span class='url'>http://code.google.com/p/volute/downloads/list</span></p>-->
1344 norman.x.gray 2
1345 norman.x.gray 36 <div class='section' id='vocab-constellation'>
1346 norman.x.gray 57 <p class='title'>A Constellation Name Vocabulary</p>
1347 norman.x.gray 21
1348 norman.x.gray 420 <p>This vocabulary is presented as a simple example of an astronomical vocabulary for a very particular purpose, such as handling constellation information like that commonly encountered in variable star research. For example, <q>SS Cygni</q> is a cataclysmic variable located in the constellation <q>Cygnus</q>. The name of the star uses the genitive form <q>Cygni</q>, but the alternate label <q>SS Cyg</q> uses the standard abbreviation <q>Cyg</q>. Given the constellation vocabulary, all of these forms are recorded together in a computer-manipulatable format. Various incorrect forms should probably be represented in SKOS hidden labels.</p>
1349 norman.x.gray 21
1350 norman.x.gray 420 <p>The <code>&lt;skos:ConceptScheme&gt;</code> contains a single
1351     top concept, <q>constellation</q>:</p>
1352    
1353     <center>
1354 norman.x.gray 36 <table>
1355     <tr><th bgcolor="#eecccc">XML Syntax</th>
1356     <th width="10"/><th bgcolor="#cceecc">Turtle Syntax</th></tr>
1357     <tr><td/></tr>
1358     <tr>
1359 norman.x.gray 420 <td class='rdfxml'>
1360     <pre class='rdfxml'>&lt;skos:Concept rdf:about="#constellation"&gt;
1361 alasdair.gray 31 &lt;skos:inScheme rdf:resource=""/&gt;
1362     &lt;skos:prefLabel&gt;
1363     constellation
1364     &lt;/skos:prefLabel&gt;
1365     &lt;skos:definition&gt;
1366     IAU-sanctioned constellation names
1367     &lt;/skos:definition&gt;
1368     &lt;skos:narrower rdf:resource="#Andromeda"/&gt;
1369     ...
1370     &lt;skos:narrower rdf:resource="#Vulpecula"/&gt;
1371     &lt;/skos:Concept&gt;
1372 norman.x.gray 21 </pre>
1373 alasdair.gray 31 </td>
1374     <td/>
1375 norman.x.gray 420 <td class='turtle'>
1376     <pre class='turtle'>&lt;#constellation&gt; a :Concept;
1377 alasdair.gray 31 :inScheme &lt;&gt;;
1378     :prefLabel "constellation";
1379     :definition "IAU-sanctioned constellation names";
1380     :narrower &lt;#Andromeda&gt;;
1381     ...
1382     :narrower &lt;#Vulpecula&gt;.
1383 norman.x.gray 22 </pre>
1384 alasdair.gray 31 </td></tr>
1385     </table></center>
1386 norman.x.gray 22 <p>and the entry for <q>Cygnus</q> is</p>
1387 alasdair.gray 31 <center><table><tr>
1388 norman.x.gray 420 <td class='rdfxml'>
1389     <pre class='rdfxml'>&lt;skos:Concept rdf:about="#Cygnus"&gt;
1390 alasdair.gray 31 &lt;skos:inScheme rdf:resource=""/&gt;
1391     &lt;skos:prefLabel&gt;Cygnus&lt;/skos:prefLabel&gt;
1392     &lt;skos:definition&gt;Cygnus&lt;/skos:definition&gt;
1393     &lt;skos:altLabel&gt;Cygni&lt;/skos:altLabel&gt;
1394     &lt;skos:altLabel&gt;Cyg&lt;/skos:altLabel&gt;
1395     &lt;skos:broader rdf:resource="#constellation"/&gt;
1396     &lt;skos:scopeNote&gt;
1397     Cygnus is nominative form; the alternative
1398     labels are the genitive and short forms
1399     &lt;/skos:scopeNote&gt;
1400     &lt;/skos:Concept&gt;
1401 norman.x.gray 21 </pre>
1402 alasdair.gray 31 </td>
1403     <td width="10"/>
1404 norman.x.gray 420 <td class='turtle'>
1405     <pre class='turtle'>&lt;#Cygnus&gt; a :Concept;
1406 alasdair.gray 31 :inScheme &lt;&gt;;
1407     :prefLabel "Cygnus";
1408     :definition "Cygnus";
1409     :altLabel "Cygni";
1410     :altLabel "Cyg";
1411     :broader &lt;#constellation&gt;;
1412 norman.x.gray 36 :scopeNote """Cygnus is nominative form;
1413     the alternative labels are the genitive and
1414     short forms""" .
1415 alasdair.gray 31 </pre>
1416     </td>
1417     </tr></table></center>
1418    
1419 norman.x.gray 21 <p>Note that SKOS alone cannot distinguish between
1420     genitive forms and abbreviations, but the use of alternative labels
1421     is more than adequate for processing by VO applications, where
1422 norman.x.gray 22 the difference between <q>SS Cygni</q>, <q>SS Cyg</q>, and the incorrect form
1423     <q>SS Cygnus</q> is probably irrelevant.</p>
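<p>A minimal sketch of how an application might use these labels,
assuming it has already loaded them into a simple table (only the
Cygnus entry from the example above is shown). The helper names are
hypothetical, and matching on the last word of a star name is a
simplification that would not handle two-word constellation names.</p>

```python
# Labels copied from the Cygnus example; a full table would list
# every constellation in the vocabulary.
CONSTELLATIONS = {
    "Cygnus": {"prefLabel": "Cygnus", "altLabels": ["Cygni", "Cyg"]},
}


def build_label_index(concepts):
    """Map every preferred and alternative label to its concept token."""
    index = {}
    for token, labels in concepts.items():
        for label in [labels["prefLabel"]] + labels["altLabels"]:
            index[label.lower()] = token
    return index


def constellation_of(star_name, index):
    """Return the concept token for the last word of a star name, or None."""
    words = star_name.split()
    if not words:
        return None
    return index.get(words[-1].lower())


INDEX = build_label_index(CONSTELLATIONS)
for name in ("SS Cygni", "SS Cyg", "SS Cygnus"):
    print(name, constellation_of(name, INDEX))
```

<p>All three spellings resolve to the same concept, which is the
behaviour the paragraph above describes.</p>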
1424 norman.x.gray 2 </div>
1425 norman.x.gray 21
1426 norman.x.gray 36 <div class='section' id='vocab-aa'>
1427 norman.x.gray 57 <p class='title'>The Astronomy &amp; Astrophysics Keyword List</p>
1428 norman.x.gray 21
1429 norman.x.gray 420 <p>Namespace: <span class='url'>@BASEURI@/AAkeys</span>.</p>
1430 alasdair.gray 35
1431 norman.x.gray 420 <p>This vocabulary is a set of keywords maintained jointly by the
1432     publishers of the journals <em>Astronomy and Astrophysics</em>
1433     (A&amp;A), <em>Monthly Notices of the Royal Astronomical Society</em>
1434     (MNRAS) and the <em>Astrophysical Journal</em> (ApJ). As noted in the
1435     introduction, an analysis of these keywords <span
1436     class='cite'>preitemartinez07</span> indicates that the different
1437     journals are slightly inconsistent with each other; we have rather
1438     arbitrarily used the list from the A&amp;A web site. The intended
1439     usage of the vocabulary is to tag articles with descriptive keywords
1440     to aid searching for articles on a particular topic.</p>
1441    
1442     <p>The keywords are organised into categories which have been modelled as
1443 norman.x.gray 57 hierarchical relationships.
1444 alasdair.gray 35 Additionally, some of the keywords are grouped into collections, and
1445     this grouping has been mirrored in the SKOS version.
1446 alasdair.gray 59 The vocabulary contains no definitions or related links as these are
1447     not provided in the original keyword list, and only a handful of
1448     alternative labels and scope notes that are present in the original
1449 norman.x.gray 420 keyword list.</p>
1450 alasdair.gray 35
1451 norman.x.gray 21 </div>
1452    
1453 norman.x.gray 36 <div class='section' id='vocab-aoim'>
1454 norman.x.gray 57 <p class='title'>The AOIM Taxonomy</p>
1455 norman.x.gray 21
1456 norman.x.gray 420 <p>Namespace: <span class='url'>@BASEURI@/AOIM</span>.</p>
1457    
1458     <p>This vocabulary is published by the IVOA to allow images to be tagged
1459 alasdair.gray 35 with keywords that are relevant for the public.
1460     It consists of a set of keywords organised into an enumerated
1461     hierarchical structure.
1462     Each term consists of a taxonomic number and a label.
1463 alasdair.gray 59 There are no definitions, scope notes, or cross references.
1464 alasdair.gray 35 </p>
1465 norman.x.gray 21
1466 norman.x.gray 36 <p>When converting the AOIM into SKOS, it was decided to model the
1467 alasdair.gray 35 taxonomic number as an alternative label.
1468     Since some terms are duplicated, the token for a term consists of
1469     the full hierarchical location of the term.
1470 norman.x.gray 36 Thus, it is possible to distinguish between</p>
1471 norman.x.gray 420 <pre>Planet -> Feature -> Surface -> Canyon</pre>
1472 norman.x.gray 36 <p>and</p>
1473 norman.x.gray 420 <pre>Planet -> Satellite -> Feature -> Surface -> Canyon</pre>
1474 norman.x.gray 36 <p>which have the tokens <code>PlanetFeatureSurfaceCanyon</code> and
1475 alasdair.gray 35 <code>PlanetSatelliteFeatureSurfaceCanyon</code> respectively.
1476     </p>
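<p>The token construction can be sketched as a simple concatenation of
the labels along the path. This assumes single-word labels, as in the
two examples above; the function name <code>aoim_token</code> is
hypothetical and the sketch says nothing about how multi-word labels
are treated.</p>

```python
def aoim_token(path):
    """Join the labels along a hierarchical path into a single token."""
    return "".join(path)


print(aoim_token(["Planet", "Feature", "Surface", "Canyon"]))
print(aoim_token(["Planet", "Satellite", "Feature", "Surface", "Canyon"]))
```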
1477    
1478 norman.x.gray 21 </div>
1479    
1480 norman.x.gray 36 <div class='section' id='vocab-ucd1'>
1481 norman.x.gray 57 <p class='title'>The UCD1+ Vocabulary</p>
1482 norman.x.gray 21
1483 norman.x.gray 420 <p>Namespace: <span class='url'>@BASEURI@/UCD</span>.</p>
1484    
1485 alasdair.gray 25 <p>The UCD standard is an officially sanctioned and managed vocabulary
1486     of the IVOA. The normative document is a simple text file containing
1487 norman.x.gray 70 entries consisting of tokens (for example <code>em.IR</code>), a short
1488 alasdair.gray 25 description, and usage information (<q>syntax codes</q> which permit
1489 norman.x.gray 21 UCD tokens to be concatenated). The form of the tokens implies a
1490 alasdair.gray 25 natural hierarchy: <code>em.IR.8-15um</code> is obviously a narrower
1491     term than <code>em.IR</code>, which in turn is narrower than
1492     <code>em</code>.</p>
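<p>The implied hierarchy can be recovered mechanically by dropping the
last dot-separated component of a UCD word. The following is a sketch
under that assumption only; the function names are our own and the
sketch does not interpret the syntax codes.</p>

```python
def broader_ucd(word):
    """Return the UCD word one level up, or None for a top-level word."""
    head, separator, _ = word.rpartition(".")
    return head if separator else None


def ancestors(word):
    """Return all broader UCD words, nearest first."""
    chain = []
    parent = broader_ucd(word)
    while parent is not None:
        chain.append(parent)
        parent = broader_ucd(parent)
    return chain


print(ancestors("em.IR.8-15um"))
```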
1493 norman.x.gray 21
1494     <p>Given the structure of the UCD1+ vocabulary, the natural
1495     translation to SKOS consists of preferred labels equal to the original
1496     tokens (the UCD1 words include dashes and periods), vocabulary tokens
1497 norman.x.gray 57 created using guidelines in <span class='xref'
1498 norman.x.gray 70 >practices</span> (for example, <code>emIR815Um</code> for
1499 norman.x.gray 22 <code>em.IR.8-15um</code>), direct use of the definitions, and the syntax codes
1500 norman.x.gray 57 placed in usage documentation: <code>&lt;skos:scopeNote&gt;UCD syntax code: P&lt;/skos:scopeNote&gt;</code></p>
1501 norman.x.gray 21
1502     <p>Note that the SKOS document containing the UCD1+ vocabulary does
1503     NOT constitute the official version: the normative document is still
1504     the text list. However, in the long term, the IVOA may decide to make
1505     the SKOS version normative, since the SKOS version contains all of the
1506     information contained in the original text document but has the
1507     advantage of being in a standard format easily read and used by any
1508 alasdair.gray 59 application on the semantic web whilst still being usable in the
1509     current ways.</p>
1510 norman.x.gray 21
1511     </div>
1512    
1513 norman.x.gray 36 <div class='section' id='vocab-iau93'>
1514 norman.x.gray 57 <p class='title'>The 1993 IAU Thesaurus</p>
1515 norman.x.gray 21
1516 norman.x.gray 420 <p>Namespace: <span class='url'>@BASEURI@/IAUT93</span>.</p>
1517    
1518 norman.x.gray 57 <p>The IAU Thesaurus consists of concepts with mostly capitalised
1519 alasdair.gray 59 labels and a rich set of thesaurus relationships (<q>BT</q> for
1520     <q>broader term</q>, <q>NT</q> for <q>narrower term</q>, and <q>RT</q> for
1521     <q>related term</q>). The thesaurus also contains <q>U</q> (for
1522 norman.x.gray 36 <q>use</q>) and <q>UF</q> (<q>use for</q>) relationships. In a SKOS
1523     model of a vocabulary these are captured as alternative labels. A
1524     separate document contains translations of the vocabulary terms in
1525     five languages: English, French, German, Italian, and
1526 norman.x.gray 70 Spanish. Enumerable concepts are plural (for example <q>SPIRAL
1527 norman.x.gray 36 GALAXIES</q>) and non-enumerable concepts are singular
1528 norman.x.gray 70 (for example <q>STABILITY</q>). Finally, there are some usage hints like
1529 alasdair.gray 65 <q>combine with other</q>, which have been modelled as scope notes.</p>
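<p>The correspondence between thesaurus relationship codes and SKOS
properties can be sketched as a lookup table. This is an illustration
of the modelling described above, not the conversion actually used to
produce the distributed files; the function name and the sample entry
are hypothetical, and <q>U</q> entries (which point from a
non-preferred term to its preferred term) are omitted for brevity.</p>

```python
RELATION_TO_SKOS = {
    "BT": "skos:broader",
    "NT": "skos:narrower",
    "RT": "skos:related",
    "UF": "skos:altLabel",
}


def to_skos(term, entries):
    """Convert (code, target) entries for one term into SKOS statements."""
    statements = [(term, "skos:prefLabel", term)]
    for code, target in entries:
        statements.append((term, RELATION_TO_SKOS[code], target))
    return statements


# Hypothetical entry, for illustration only.
for statement in to_skos("SPIRAL GALAXIES", [("BT", "GALAXIES"), ("UF", "SPIRALS")]):
    print(statement)
```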
1530 norman.x.gray 36
1531     <p>In converting the IAU Thesaurus to SKOS, we have been as faithful
1532     as possible to the original format of the thesaurus. Thus, preferred
1533     labels have been kept in their uppercase format.</p>
1534    
1535     <p>The IAU Thesaurus has been unmaintained since its initial production in
1536     1993; it is therefore significantly out of date in places. This
1537     vocabulary is published for the sake of completeness, and to make the
1538     link between the evolving vocabulary work and any uses of the 1993
1539     vocabulary which come to light. We do not expect to make any future
1540     maintenance changes to this vocabulary, and would expect the IVOAT
1541 norman.x.gray 57 vocabulary, based on this one, to be used instead (see <span class='xref'>vocab-ivoat</span>).</p>
1542 norman.x.gray 36
1543     </div>
1544    
1545     <div class='section' id='vocab-ivoat'>
1546     <p class='title'>Towards an IVOA Thesaurus</p>
1547    
1548 norman.x.gray 22 <p>While it is true that the adoption of SKOS will make it easy to
1549 norman.x.gray 21 publish and access different astronomical vocabularies, the fact is
1550 norman.x.gray 22 that there is no vocabulary which makes it easy to jump-start the
1551 norman.x.gray 21 use of vocabularies in generic astrophysical VO applications: each of
1552     the previously developed vocabularies has its own limits and
1553     biases. For example, the IAU Thesaurus provides a large number of
1554 norman.x.gray 22 entries, copious relationships, and translations to four other languages,
1555 norman.x.gray 21 but there are no definitions, many concepts are now only useful for
1556 norman.x.gray 70 historical purposes (for example many photographic or historical instrument
1557 norman.x.gray 21 entries), some of the relationships are false or outdated, and many
1558     important or newer concepts and their common abbreviations are
1559     missing.</p>
1560    
1561 norman.x.gray 22 <p>Despite its faults, the IAU Thesaurus constitutes a very extensive
1562 norman.x.gray 21 vocabulary which could easily serve as the basis vocabulary once
1563 norman.x.gray 36 we have removed its most egregious faults and extended it to cover the
1564 norman.x.gray 21 most obvious semantic holes. To this end, a heavily revised IAU
1565     thesaurus is in preparation for use within the IVOA and other
1566     astronomical contexts. The goal is to provide a general vocabulary
1567 norman.x.gray 57 foundation to which other, more specialised, vocabularies can be added
1568 norman.x.gray 22 as needed, and to provide a good <q>lingua franca</q> for the creation of
1569 norman.x.gray 21 vocabulary mappings.</p>
1570     </div>
1571     </div> <!-- End: Example vocabularies -->
1572    
1573 alasdair.gray 59 <div class='section' id='distmappings'>
1574 norman.x.gray 420 <p class='title'>Mapping vocabularies (informative)</p>
1575 norman.x.gray 21
1576 norman.x.gray 420 <p>Part of the motivation for formalising vocabularies within the VO
1577     is to support <em>mapping between vocabularies</em>, so that an
1578     application which understands, or can natively process, one
1579     vocabulary, can use a mapping to provide at least partial support for
1580 norman.x.gray 536 data described using another vocabulary. The SKOS standard describes
1581     a number of properties for expressing such matches, and we anticipate
1582     that we will shortly see explicit mappings between vocabularies,
1583     produced either by vocabulary maintainers, describing the
1584     relationships between their own vocabularies and others, or by third
1585     parties, asserting such relationships as an intellectual contribution
1586     of their own.</p>
1587 norman.x.gray 420
1588 norman.x.gray 536 <p>The vocabularies distributed in association with this document
1589     include some non-exhaustive mappings between vocabularies, as an
1590     example of how such mappings will appear.</p>
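<p>As a sketch of how an application might exploit such mappings, a
set of statements from one vocabulary to another can be read in the
opposite direction, because <code>skos:broadMatch</code> and
<code>skos:narrowMatch</code> are inverses of each other while
<code>skos:exactMatch</code> and <code>skos:relatedMatch</code> are
symmetric. The concept identifiers below are hypothetical and are not
taken from the distributed mapping files.</p>

```python
INVERSE = {
    "skos:exactMatch": "skos:exactMatch",
    "skos:relatedMatch": "skos:relatedMatch",
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}


def invert(mappings):
    """Return the mappings as seen from the target vocabulary."""
    return [(target, INVERSE[prop], source) for source, prop, target in mappings]


# Hypothetical concept identifiers, for illustration only.
A_TO_B = [
    ("a:ConceptX", "skos:exactMatch", "b:ConceptX"),
    ("a:ConceptY", "skos:broadMatch", "b:ConceptZ"),
]
for statement in invert(A_TO_B):
    print(statement)
```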
1591    
1592 norman.x.gray 420 <!--
1593 alasdair.gray 59 <p>To show how mappings can be expressed between two vocabularies, we
1594     have provided one example mapping document which maps the concepts in
1595     the A&amp;A Keywords vocabulary to the concepts in the AOIM
1596     vocabulary.
1597     All four types of mappings were required.
1598     Since all the mapping relationships have inverse relationships
1599     defined, the mapping document can also be used to infer the set of
1600 norman.x.gray 61 mappings from the AOIM vocabulary to the A&amp;A keywords.</p>
1601 alasdair.gray 59
1602 norman.x.gray 61 <p>To provide provenance information about the set of mappings in a
1603     document, Dublin Core metadata is included in the mapping
1604     document.</p>
1605 alasdair.gray 59
1606 norman.x.gray 420 -->
1607 norman.x.gray 70
1608 alasdair.gray 59 </div>
1609    
1610 norman.x.gray 2 <div class="appendices">
1611    
1612 norman.x.gray 70 <!-- <p><span class='todo'>To come</span></p>-->
1613 alasdair.gray 65
1614    
1615 norman.x.gray 2 <div class="section-nonum" id="bibliography">
1616 norman.x.gray 70 <p class="title">References</p>
1617 norman.x.gray 2 <?bibliography rm-refs ?>
1618     </div>
1619    
1620     <p style="text-align: right; font-size: x-small; color: #888;">
1621     $Revision$ $Date$
1622     </p>
1623    
1624     </div>
1625    
1626     </body>
1627     </html>
