
Contents of /trunk/projects/vocabularies/doc/vocabularies.xml



Revision 31
Fri Jan 11 10:28:35 2008 UTC (12 years, 10 months ago) by alasdair.gray
File MIME type: text/xml
File size: 40282 byte(s)
Faithfully added Rick's additions to the document. His email message stating his alterations was:
       "I filled in the missing reference to "VOConcepts".

       I changed it so that the examples are BOTH in XML and Turtle. "
1 <?xml version="1.0" encoding="utf-8"?>
2 <!-- Based on template at
3 http://www.ivoa.net/Documents/templates/ivoa-tmpl.html -->
4 <html xmlns="http://www.w3.org/1999/xhtml"
5 xmlns:dc="http://purl.org/dc/elements/1.1/"
6 xmlns:dcterms="http://purl.org/dc/terms/"
7 xml:lang="en" lang="en">
8
9 <head>
10 <title>Vocabularies in the Virtual Observatory</title>
11 <link rev="made" href="http://nxg.me.uk/norman/#norman" title="Norman Gray"/>
12 <meta name="author" content="Norman Gray"/>
13 <meta name="DC.subject" content="IVOA, Virtual Observatory, Vocabulary"/>
14 <meta name="rcsdate" content="$Date$"/>
15 <link href="http://www.ivoa.net/misc/ivoa_wd.css" rel="stylesheet" type="text/css"/>
16 <!-- style: make the ToC a little more compact, and without bullets -->
17 <style type="text/css">
18 div.toc ul { list-style: none; padding-left: 1em; }
19 div.toc li { padding-top: 0ex; padding-bottom: 0ex; }
20 li { padding-top: 1ex; padding-bottom: 1ex; }
21 span.userinput { font-weight: bold; }
22 span.url { font-family: monospace; }
23 q { color: #666; }
24 q:before { content: "“"; }
25 q:after { content: "”"; }
26 .todo { background: #ff7; }
27 </style>
28 </head>
29
30 <body>
31 <div class="head">
32 <table>
33 <tr><td><a href="http://www.ivoa.net/"><img alt="IVOA logo" src="http://ivoa.net/icons/ivoa_logo_small.jpg" border="0"/></a></td></tr>
34 </table>
35
36 <h1>Vocabularies in the Virtual Observatory, v@VERSION@</h1>
37 <h2>IVOA Working Draft, @RELEASEDATE@ [DRAFT $Revision$]</h2>
38 <!-- $Revision$ $Date$ -->
39
40 <dl>
41 <dt>Working Group</dt>
42 <dd><em><a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaSemantics">Semantics</a></em></dd>
43
44 <dt>This version</dt>
45 <dd>@BASEURI@</dd> <!-- XXX adjust current/latest URI from Makefile -->
46
47 <dt>Latest version</dt>
48 <dd>@BASEURI@</dd>
49
50 <dt>Editors</dt>
51 <dd>TBD</dd>
52
53 <dt>Authors</dt>
54 <dd>
55 <!-- The following are the folk that I definitely know have contributed
56 text or code to this document: add others as appropriate -->
57 <span property="dc:creator">Alasdair J G Gray</span>,
58 <span property="dc:creator">Norman Gray</span>,
59 <span property="dc:creator">Frederic V Hessman</span> and
60 <span property="dc:creator">Andrea Preite Martinez</span>
61 </dd>
62 </dl>
63 <hr/>
64 </div>
65
66 <div class="section-nonum" id="abstract">
67 <p class="title">Abstract</p>
68
69 <div class="abstract">
70 <p>As the astronomical information processed within the <em>Virtual Observatory
71 </em> becomes more complex, there is an increasing need for a more
72 formal means of identifying quantities, concepts, and processes not
73 confined to things easily placed in a FITS image, or expressed in a
74 catalogue or a table. We propose that the IVOA adopt a standard
75 format for vocabularies based on the W3C's <em>Resource Description
76 Framework</em> (RDF) and <em>Simple Knowledge Organization System</em>
77 (SKOS). By adopting a standard and simple format, the IVOA will
78 permit different groups to create and maintain their own specialized
79 vocabularies while letting the rest of the astronomical community
80 access, use, and combine them. The use of current, open standards
81 ensures that VO applications will be able to tap into resources of the
82 growing semantic web. Several examples of useful astronomical
83 vocabularies are provided, including work on a common IVOA thesaurus
84 intended to provide a semantic common base for VO applications.</p>
85 </div>
86
87 </div>
88
89 <div class="section-nonum" id="status">
90 <p class="title">Status of this document</p>
91
92 <p>This is an IVOA Working Draft. The first release of this document was
93 <span property="dc:date">@RELEASEDATE@</span>.</p>
94
95 <p>This document is an IVOA Working Draft for review by IVOA members
96 and other interested parties. It is a draft document and may be
97 updated, replaced, or obsoleted by other documents at any time. It is
98 inappropriate to use IVOA Working Drafts as reference materials or to
99 cite them as other than <q>work in progress</q>.</p>
100
101 <p>A list of current IVOA Recommendations and other technical
102 documents can be found at
103 <a href="http://www.ivoa.net/Documents/"><code>http://www.ivoa.net/Documents/</code></a>.</p>
104
105 <h3>Acknowledgments</h3>
106
107 <p>We would like to thank the members of the IVOA semantic working
108 group for many interesting ideas and fruitful discussions.</p>
109 </div>
110
111 <h2><a id="contents" name="contents">Table of Contents</a></h2>
112 <?toc?>
113
114 <hr/>
115
116 <div class="section" id="introduction">
117 <p class="title">Introduction</p>
118
119 <div class="section">
120 <p class="title">Vocabularies in astronomy</p>
121
122 <p>Astronomical information of relevance to the Virtual Observatory
123 (VO) is not confined to quantities easily expressed in a catalogue or
124 a table.
125 Fairly simple things such as position on the sky, brightness in some
126 units, times measured in some frame, redshifts, classifications or
127 other similar quantities are easily manipulated and stored in VOTables
128 and can currently be identified using IVOA Unified Content Descriptors
129 (UCDs) <span class="cite">std:ucd</span>.
130 However, astrophysical concepts and quantities use a wide variety of
131 names, identifications, classifications and associations, most of
132 which cannot be described or labelled via UCDs.</p>
133
134 <p>There are a number of basic forms of organised semantic knowledge
135 of potential use to the VO, ranging from informal <q>folksonomies</q>
136 (where users are free to choose their own labels) at one extreme, to
137 formally structured <q>vocabularies</q> (where the label is drawn from
138 a predefined set of definitions which can include relationships between
139 labels) and <q>ontologies</q> (where the domain is captured in a data
140 model) at the other.
141 More formal definitions are presented later in this document.
142 </p>
143
144 <!-- <span
145 class='todo' >I think this list covers definitions covered more
146 naturally in the text below it - omissable?[NG]</span></p>
147 <ul>
148 <li>A <em>controlled vocabulary</em> is a standardized list of
149 words or other tokens with accepted meanings (for example <q>M31</q>,
150 <q>spiral galaxy</q>, <q>star</q>, <q>gas</q>, <q>dust</q>,
151 <q>cloud</q>, <q>black hole</q>, <q>Dark Matter</q>,
152 <q>halo</q>). See the fuller discussion in <span class='xref'
153 >vocab</span>.</li>
154
155 <li>A <em>taxonomy</em> is a controlled vocabulary encompassing all of
156 the members of a semantic group (for example there are <q>spiral</q>,
157 <q>elliptical</q>, <q>lenticular</q>, and <q>irregular</q> galaxies).</li>
158
159 <li>A <em>thesaurus</em> is a controlled vocabulary with some linking
160 between tokens so that simple hierarchical structures and equivalences
161 can be identified (for example <q>M31</q> is a narrower term for a <q>spiral
162 galaxy</q> which, in turn, is a narrower term for a
163 <q>galaxy</q>).</li>
164
165 <li>At the most formal end of this spectrum, an <em>ontology</em> is,
166 in the now-standard description
167 ultimately attributable to <span class='cite' >gruber93</span>, <q>a
168 formal specification of a shared conceptualisation</q>, that is, a set
169 of classes and properties which articulate a model of the world (see
170 also <span class='cite' >baader04</span>). It can range from an
171 elaborate set of definitions and restrictions, to a lightweight model
172 which is barely more than a set of subclass relationships. For
173 example, one might define a set of astronomical concepts and their
174 relations with each other, and say that <q>M31</q> is a
175 member of the class <q>Spiral Galaxy</q>, the latter consisting of
176 <q>Stars</q>, <q>Gas and Dust Clouds</q>, a <q>Central Black Hole</q>,
177 and a <q>Dark Matter Halo</q>.</li>
178 </ul>
179
180 <p>The term <q>folksonomy</q> has emerged in the last few years, to
181 describe what would in other circumstances be described as an
182 uncontrolled keyword list. The new term, and the substantial recent
183 interest in it, is a consequence of the realisation that even such a
184 simple mechanism can in certain circumstances (well-known examples are
185 the Flickr and del.icio.us social services) add substantial value to
186 a set of resources.</p>
187 -->
188
189 <p>
190 An astronomical ontology is necessary if we are to have a computer
191 (appear to) `understand' something of the domain.
192 There has been some progress towards creating an ontology of
193 astronomical object types <span
194 class="cite">std:ivoa-astro-onto</span> to meet this need.
195 However there are distinct use cases for letting human users find
196 resources of interest through search and navigation of the information space.
197 The most appropriate technology to meet these use cases derives from
198 the Information Science community, that of <em>controlled
199 vocabularies, taxonomies and thesauri</em>.
200 In the present document, we do not distinguish between controlled
201 vocabularies, taxonomies and thesauri, and use the term
202 <em>vocabulary</em> to represent all three.
203 </p>
204
205 <p>One of the best examples of the need for a simple vocabulary within
206 the VO is VOEvent <span class="cite">std:voevent</span>, the VO
207 standard for handling astronomical events: if someone broadcasts, or
208 `publishes', the occurrence of an event, the implication is that
209 someone else is going to want to respond to it, but no institution is
210 interested in all possible events, so some standardised information
211 about what the event `is about' is necessary, in a form which
212 ensures that the parties can communicate effectively. If a `burst' is
213 announced, is it a Gamma Ray Burst due to the collapse of a star in a
214 distant galaxy, a solar flare, or the brightening of a stellar or AGN
215 accretion disk? If a publisher doesn't use the label one might have
216 expected, how is one to guess what other equivalent labels might have
217 been used?</p>
218
219 <p>There have been a number of attempts to create astronomical
220 vocabularies.</p>
221 <ul>
222
223 <li>The <em>Second Reference Dictionary of the Nomenclature of
224 Celestial Objects</em> <span class="cite">lortet94</span>, <span
225 class="cite">lortet94a</span> contains 500 paper pages of astronomical
226 nomenclature.</li>
227
228 <li>For decades professional journals have used a set of reasonably
229 compatible keywords to help classify the content of whole articles.
230 These keywords have been analysed by Preite Martinez &amp; Lesteven
231 <span class="cite">preitemartinez07</span>, from which they derived a
232 set of common keywords constituting one of the potential bases for a
233 fuller VO vocabulary. The same authors also attempted to derive a set
234 of common concepts by analyzing the contents of abstracts in journal
235 articles, which should comprise a list of tokens/concepts more
236 up-to-date than the old list of journal keywords. A similar but less
237 formal attempt was made by Hessman <span class='cite'>hessman05</span>
238 for the VOEvent working group, resulting in a similar list <span
239 class="todo">Check differences from the A&amp;A list</span>.</li>
240
241 <li>Astronomical databases generally use simple sets of keywords –
242 sometimes hierarchically organized – to aid the users in the querying
243 of the databases. Two examples from totally different contexts are the
244 list of object types used in the <a
245 href="http://simbad.u-strasbg.fr">Simbad</a> database and the search
246 keywords used in the educational Hands-On Universe image database
247 portal.</li>
248
249 <li>The Astronomical Outreach Imagery (AOI) working group has created
250 a simple taxonomy for helping to classify images used for educational
251 or public-relations purposes <span class="cite">std:aoim</span>.</li>
252 <!--
253 <li>The Hands-On Universe project (see <span class='url'
254 >http://sunra.lbl.gov/telescope2/index.html</span>) has maintained a
255 public database of images for use by the general public since the
256 1990s. The images are very heterogeneous, since they are gathered from
257 a variety of professional, semi-professional, amateur, and school
258 observatories, so a simple taxonomy is used to facilitate browsing
259 by the users of the database.</li>
260
261 <li>Remote Telescope Markup Language <span
262 class="cite">std:rtml</span>, a document definition for the transfer
263 of observing requests that has been adopted by the Heterogeneous
264 Telescope Network (HTN) Consortium and is indirectly supported by the
265 VOEvent protocol, currently contains several telescope and
266 observation-related taxonomies of terms (e.g. for devices, filters,
267 objects).<span class='todo'>Confirm status: does this need to be
268 converted to SKOS? [AG]. No: RTML will use IVOAT! [FVH] So delete
269 this item? [NG]</span></li>
270 -->
271 <li>In 1993, Shobbrook and Shobbrook published an Astronomy Thesaurus
272 endorsed by the IAU <span class='cite' >shobbrook92</span>. This
273 collection of nearly 3000 terms, in five languages, is a valuable
274 resource, but has seen little use in recent years. Its very size,
275 which gives it expressive power, is a disadvantage to the extent that
276 it is therefore hard to use.</li>
277
278 <li>The Unified Content Descriptors <span class='cite' >std:ucd</span>
279 (UCD) constitute the main controlled vocabulary of the IVOA and
280 contain some taxonomic information. However, UCD suffers from two
281 major problems which make it difficult to use beyond the present
282 applications of labeling VOTables: firstly, there is no standard means of
283 identifying and processing the contents of the text-based reference
284 document; and secondly, the content cannot be openly extended beyond that set
285 by a formal IVOA committee without going through a laborious and
286 time-consuming negotiation process of extending the primary vocabulary
287 itself.</li>
288
289 </ul>
290 </div>
291
292 <div class="section">
293 <p class="title">Formalising and managing multiple vocabularies</p>
294
295 <p>We find ourselves in the situation where there are multiple
296 vocabularies in use, describing a broad range of resources of interest
297 to professional and amateur astronomers, and members of the public.
298 These different vocabularies use different terms and different
299 relationships to support the different constituencies they cater for.
300 For example, <q>delta Sct</q> and <q>RR Lyr</q> are terms one would
301 find in a vocabulary aimed at professional astronomers, associated
302 with the notion of <q>variable star</q>; however one would
303 <em>not</em> find such technical terms in a vocabulary intended to
304 support outreach activities.</p>
305
306 <p>One approach to this problem is to create a single consensus
307 vocabulary, which draws terms from the various existing vocabularies
308 to create a new vocabulary which is able to express anything its users
309 might desire. The problem with this is that such an effort would be
310 very expensive: both in terms of time and effort on the part of those
311 creating it, and for the potential users, who have to learn
312 to navigate around it, recognise the new terms, and who have to be
313 supported in using the new terms correctly (or, more often,
314 incorrectly).</p>
315
316 <p>The alternative approach to the problem is to evade it, and this is
317 the approach taken in this document. Rather than deprecating the
318 existence of multiple overlapping vocabularies, we embrace it,
319 formalise all of them, and formally declare the relationships between
320 them. This means that:</p>
321 <ul>
322 <li>The various vocabularies are allowed to evolve separately, on
323 their own timescales, managed either by the IVOA, individual working
324 groups within the IVOA, or by third parties;</li>
325
326 <li>Specialized vocabularies can be developed and maintained by the
327 community with the most knowledge about a specific topic, ensuring
328 that the vocabulary will have the right breadth, depth, and
329 accuracy;</li>
330
331 <li>Users can choose the vocabulary or combination of vocabularies most
332 appropriate to their situation, either when annotating resources, or
333 when querying them; and</li>
334
335 <li>We can retain the previous investments made in vocabularies by
336 users and resource owners.</li>
337
338 </ul>
339
340 <p>The purpose of this proposal is to establish a common format for
341 the grass-roots creation, publishing, use, and manipulation of
342 astronomical vocabularies within the Virtual Observatory, based upon
343 the W3C's SKOS standard. We include as appendices to this proposal
344 formalised versions of a number of existing vocabularies, encoded as
345 SKOS vocabularies <span class="cite">std:skoscore</span>.</p>
346
347 </div>
348
349 </div>
350
351 <div class='section'>
352 <p class='title'>SKOS-based vocabularies</p>
353
354 <div class="section" id='vocab'>
355 <p class="title">Selection of the vocabulary format</p>
356
357 <p>After extensive online and face-to-face discussions, the authors have
358 brokered a consensus within the IVOA community that
359 formalised vocabularies should be published at least in SKOS (Simple Knowledge
360 Organization System) format, a W3C draft standard application of RDF to the
361 field of knowledge organisation <span
362 class="cite">std:skoscore</span>. SKOS draws on long experience
363 within the Library and Information Science community, to address a
364 well-defined set of problems to do with the indexing and retrieval of
365 information and resources; as such, it is a close match to the problem
366 this working group is addressing.</p>
367
368 <p>ISO 5964 <span class='cite' >std:iso5964</span> defines a number of
369 the relevant terms (ISO 5964:1985=BS 6723:1985; see also <span
370 class='cite' >std:bs8723-1</span> and <span class='cite'
371 >std:z39.19</span>), and some of the (lightweight) theoretical
372 background. The only technical distinction relevant to this document
373 is that between `vocabulary' and `thesaurus': BS-8723-1 defines a
374 thesaurus as a</p>
375 <blockquote>
376 Controlled vocabulary in which concepts are represented by preferred
377 terms, formally organized so that paradigmatic relationships between
378 the concepts are made explicit, and the preferred terms are
379 accompanied by lead-in entries for synonyms or quasi-synonyms. NOTE:
380 The purpose of a thesaurus is to guide both the indexer and the
381 searcher to select the same preferred term or combination of preferred
382 terms to represent a given subject. (BS-8723-1, sect. 2.39)
383 </blockquote>
384 <p>with a similar definition in ISO-5964 sect. 3.16. The paradigmatic
385 relationships in question are those relating a term to a <q>broader</q>,
386 <q>narrower</q> or more generically <q>related</q> term, with an operational
387 definition of <q>broader term</q> which is such that a resource retrieved
388 by a given term will also be retrieved by that term's <q>broader term</q>.
389 This is not a subsumption relationship, as there is no implication
390 that the concept referred to by a narrower term is of the same
391 <em>type</em> as a broader term.</p>
392
393 <p>Thus <strong>a vocabulary (SKOS or otherwise) is not an
394 ontology</strong>. It has lighter and looser semantics than an
395 ontology, and is specialised for the restricted case of resource
396 retrieval. Those interested in ontological analyses can easily
397 transfer the vocabulary relationship information from SKOS to a formal
398 ontological format such as OWL <span class='cite' >std:owl</span>.</p>
399
400 <!--
401 <p><span class='todo' >What is to be the format of the `master' files?
402 SKOS or mildly-formatted plain text?[NG] By definition, this will be
403 left up to the publishers! All we need to see is SKOS. [FVH] There's
404 more than one notation for SKOS (RDF/XML and Turtle/N3): do we need to
405 mandate one over others (FVH says yes, RDF/XML; NG says no). Open
406 issue.</span></p>
407 -->
408 </div>
409
410 <div class='section'>
411 <p class='title'>Content and format of a SKOS vocabulary</p>
412
413 <p>A published vocabulary in SKOS format consists of a set of
414 <q>concepts</q> – the examples below are shown in both the
415 RDF/XML and the Turtle notation for RDF <span class='cite'
416 >std:turtle</span> (this is similar to the more informal N3 notation).
417 Each concept should contain the following elements:</p> <ul> <li>A
418 single URI representing the concept, mainly for use by computers but
419 preferably human-readable, e.g. an entry for <q>spiral galaxy</q>
420 might look like: <br/><br/><center><table><tr><th bgcolor="#eecccc">XML Syntax</th><th width="10"/><th bgcolor="#cceecc">Turtle Syntax</th></tr><tr></tr><tr>
421 <td bgcolor="#eecccc"><code>&lt;skos:Concept rdf:about="#spiralGalaxy"&gt;<br/>...<br/>&lt;/skos:Concept&gt;</code></td>
422 <td/>
423 <td bgcolor="#cceecc"><code>&lt;#spiralGalaxy&gt; a skos:Concept</code></td></tr>
424 </table></center>
425 <!-- <code>&lt;#spiralGalaxy&gt; a skos:Concept</code>.
426 <code>&lt;skos:Concept rdf:about="#spiralGalaxy"&gt;</code>-->
427 </li>
428
429 <li>A preferred label in each supported language for the vocabulary for
430 use by humans, e.g.
431 <br/><br/><center><table><tr>
432 <td bgcolor="#eecccc"><code>&lt;skos:prefLabel xml:lang="en"&gt;spiral galaxy&lt;/skos:prefLabel&gt;<br/>&lt;skos:prefLabel xml:lang="de"&gt;Spiralgalaxie&lt;/skos:prefLabel&gt;</code></td>
433 <td width="10"/>
434 <td bgcolor="#cceecc"><code>skos:prefLabel "spiral galaxy"@en, "Spiralgalaxie"@de
435 </code></td></tr>
436 </table></center>
437 <!-- <code>skos:prefLabel "spiral galaxy"@en,
438 "Spiralgalaxie"@de</code>.
439 <code>&lt;skos:prefLabel&gt;spiral galaxy&lt;/skos:prefLabel&gt;</code>-->
440 </li>
441
442 <li>Optional alternative labels which applications may encounter or which are in
443 common use, whether simple synonyms or commonly-used aliases,
444 e.g. <q>GRB</q> for <q>gamma-ray burst</q>:
445 <br/><br/><center><table><tr>
446 <td bgcolor="#eecccc"><code>&lt;skos:altLabel xml:lang="en"&gt;GRB&lt;/skos:altLabel&gt;</code></td>
447 <td width="10"/>
448 <td bgcolor="#cceecc"><code>skos:altLabel "GRB"@en</code></td></tr>
449 </table></center>
450 <!-- code>skos:altLabel "GRB"@en</code> <code>&lt;skos:altLabel
451 lang="de"&gt;Spiralgalaxie&lt;/skos:altLabel&gt;</code> --></li>
452
453 <li>Optional hidden labels which capture common misspellings for
454 either the preferred or alternate labels, e.g. <q>spiral glaxy</q> for
455 <q>spiral galaxy</q>:
456 <br/><br/><center><table><tr>
457 <td bgcolor="#eecccc"><code>&lt;skos:hiddenLabel xml:lang="en"&gt;spiral glaxy&lt;/skos:hiddenLabel&gt;</code></td>
458 <td width="10"/>
459 <td bgcolor="#cceecc"><code>skos:hiddenLabel "spiral glaxy"@en</code></td></tr>
460 </table></center>
461 <!-- <code>skos:hiddenLabel "spiral glaxy"@en</code>-->.</li>
462
463 <li>A definition for the concept, where one exists in the original
464 vocabulary, to clarify the meaning of the term, e.g.
465 <br/><br/><center><table><tr> <td
466 bgcolor="#eecccc"><code>&lt;skos:definition xml:lang="en"&gt;<br/>A galaxy
467 having a spiral structure.<br/>&lt;/skos:definition&gt;</code></td>
468 <td width="10"/> <td bgcolor="#cceecc"><code>skos:definition "A galaxy
469 having a spiral structure."@en</code></td></tr> </table></center>
470 <!--<code>skos:definition "A galaxy having a spiral
471 structure."@en</code>-->.</li>
472
473 <li>A scope note to further clarify a definition, or the usage of the
474 concept, e.g.
475 <br/><br/><center><table><tr>
476 <td bgcolor="#eecccc"><code>&lt;skos:scopeNote xml:lang="en"&gt;<br/>Spiral galaxies fall into one of three categories: Sa, Sb, and Sc.<br/>&lt;/skos:scopeNote&gt;</code></td>
477 <td width="10"/>
478 <td bgcolor="#cceecc"><code>skos:scopeNote "Spiral galaxies fall into one of
479 three categories: Sa, Sb, and Sc."@en</code></td></tr>
480 </table></center>
481 <!-- <code>skos:scopeNote "Spiral galaxies fall into one of
482 three categories: Sa, Sc, and Sd"@en</code>-->.</li>
483
484 <li>Optionally, a concept may be involved in any number of relationships
485 to other concepts. The types of relationships are:
486 <ul>
487 <li>Narrower or more specific concepts, e.g. a link to the concept
488 representing a <q>barred spiral galaxy</q>:
489 <br/><br/><center><table><tr>
490 <td bgcolor="#eecccc"><code>&lt;skos:narrower rdf:resource="#barredSpiralGalaxy"/&gt;</code></td>
491 <td width="10"/>
492 <td bgcolor="#cceecc"><code>skos:narrower &lt;#barredSpiralGalaxy&gt;</code></td></tr>
493 </table></center>
494 <!--
495 <code>skos:narrower
496 &lt;#barredSpiralGalaxy&gt;</code>.
497 <code>&lt;skos:narrower rdf:resource="#barredSpiralGalaxy&gt;</code> -->
498 </li>
499 <li>Broader or more general concepts, e.g. a link to the token
500 representing galaxies in general:
501 <br/><br/><center><table><tr>
502 <td bgcolor="#eecccc"><code>&lt;skos:broader rdf:resource="#galaxy"/&gt;</code></td>
503 <td width="10"/>
504 <td bgcolor="#cceecc"><code>skos:broader &lt;#galaxy&gt;</code></td></tr>
505 </table></center>
506 <!-- <code>skos:broader
507 &lt;#galaxy&gt;</code>.
508 <code>&lt;skos:broader rdf:resource="#galaxy&gt;</code>-->
509 </li>
510 <li>Related concepts, e.g. a link to the token representing spiral
511 arms of galaxies:
512 <br/><br/><center><table><tr>
513 <td bgcolor="#eecccc"><code>&lt;skos:related rdf:resource="#spiralArm"/&gt;</code></td>
514 <td width="10"/>
515 <td bgcolor="#cceecc"><code>skos:related &lt;#spiralArm&gt;</code></td></tr>
516 </table></center>
517 <br/>
518 <!-- <code>skos:related &lt;#spiralArm&gt;</code>
519 <code>&lt;skos:related rdf:resource="#spiralArm"&gt;</code> -->
520 (note this relationship does not say that spiral galaxies have spiral
521 arms – that would be ontological information of a higher order which
522 is beyond the requirements for information stored in a vocabulary).</li>
523 </ul>
524 </li>
525 </ul>
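<p>Taken together, the elements listed above might be assembled into a
single concept entry along the following lines. This is a minimal
sketch in Turtle rather than a normative example: it reuses the
illustrative identifiers and labels from the list, and it assumes the
usual SKOS core namespace and that relative URIs such as
<code>&lt;#spiralGalaxy&gt;</code> are resolved against the location of
the vocabulary document.</p>
<pre>
# Assumed namespace declaration for the SKOS core vocabulary.
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .

# One concept, carrying its labels, definition and relationships.
&lt;#spiralGalaxy&gt; a skos:Concept ;
    skos:prefLabel "spiral galaxy"@en, "Spiralgalaxie"@de ;
    skos:hiddenLabel "spiral glaxy"@en ;
    skos:definition "A galaxy having a spiral structure."@en ;
    skos:narrower &lt;#barredSpiralGalaxy&gt; ;
    skos:broader &lt;#galaxy&gt; ;
    skos:related &lt;#spiralArm&gt; .
</pre>
<p>Each statement about the concept is separated by a semicolon, and
the final full stop closes the entry; the RDF/XML form would nest the
same properties inside a single <code>skos:Concept</code> element.</p>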
526
527 <p>In addition to the information about a single concept, a vocabulary
528 can contain information to help users navigate its structure and
529 contents:</p>
530 <ul>
531 <li>The <q>top concepts</q> of the vocabulary, i.e. those that occur
532 at the top of the vocabulary hierarchy defined by the broader/narrower
533 relationships, can be explicitly stated to make it easier to navigate
534 the vocabulary.</li>
535
536 <li>Concepts that form a natural group can be defined as being members
537 of a <q>collection</q>.</li>
538
539 <li>Versioning information can be added using change notes.</li>
540
541 <li>Additional metadata about the vocabulary, e.g. the publisher, may
542 be documented using the Dublin Core metadata set <span class='cite'
543 >std:dublincore</span>.</li>
544 </ul>
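<p>The vocabulary-level information described above could be recorded
roughly as follows. This is a sketch under stated assumptions, not a
prescribed layout: the collection identifier
<code>&lt;#galaxyTypes&gt;</code>, the title, the publisher string and
the text of the change note are all hypothetical, and the Dublin Core
namespace is the one already declared at the head of this document.</p>
<pre>
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
@prefix dc:   &lt;http://purl.org/dc/elements/1.1/&gt; .

# The vocabulary itself, identified by the document's own URI.
# Title and publisher are placeholders.
&lt;&gt; a skos:ConceptScheme ;
    dc:title "Example astronomical vocabulary"@en ;
    dc:publisher "Example publisher" ;
    skos:hasTopConcept &lt;#galaxy&gt; .

# A top concept, with a (hypothetical) change note for versioning.
&lt;#galaxy&gt; a skos:Concept ;
    skos:inScheme &lt;&gt; ;
    skos:prefLabel "galaxy"@en ;
    skos:changeNote "Added narrower concept spiralGalaxy."@en .

# A collection grouping concepts that naturally belong together.
&lt;#galaxyTypes&gt; a skos:Collection ;
    skos:prefLabel "galaxy types"@en ;
    skos:member &lt;#spiralGalaxy&gt;, &lt;#barredSpiralGalaxy&gt; .
</pre>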
545 </div>
546
547
548 <div class='section'>
549 <p class='title'>Relationships Between Vocabularies</p>
550
551 <p>
552 There already exist several vocabularies in the domain of astronomy.
553 Instead of attempting to replace all these existing vocabularies,
554 which have been developed to serve different aims and user groups,
555 we embrace them.
556 This requires a mechanism to relate the concepts in the different
557 vocabularies.
558 The W3C are in the process of developing a standard for relating the
559 concepts in different SKOS vocabularies <span
560 class='cite'>std:skosMapping</span> and when completed this should be
561 reviewed for use by the IVOA.
562 </p>
563
564 <p>
565 Four types of relationship are sufficient to capture the relationships
566 between concepts in vocabularies and are similar to those defined for
567 relationships between concepts within a single vocabulary.
568 The relationships are as follows.
569 <span class='todo'>[TODO] Add specifics to the examples.</span>
570 </p>
571 <ul>
572
573 <li>
574 Equivalence between concepts, i.e. the concepts in the different
575 vocabularies refer to the same real world entity.
576 This is captured with the following RDF statement
577 <code>iau93:#SPIRALGALAXY map:exactMatch ivoat:#spiralGalaxy</code>
578 which states that the spiral galaxy concept in the IAU thesaurus is the
579 same as the spiral galaxy concept in the IVOAT.
580 (Note the use of the external namespaces <code>iau93</code> and
581 <code>ivoat</code> which must be defined within the document.)
582 </li>
583
584 <li>
585 Broader concept, i.e. there is not an equivalent concept but there is
586 a more general one.
587 This is captured with the RDF statement <code>iau93:#XXX
588 map:broadMatch ivoat:#YYY</code> which states that the IVOAT concept
589 YYY is more general than the IAU93 concept XXX.
590 </li>
591
592 <li>
593 Narrower concept, i.e. there is not an equivalent concept but there is
594 a more specific one.
595 This is captured with the RDF statement <code>iau93:#XXX
596 map:narrowMatch ivoat:#YYY</code> which states that the IVOAT concept
597 YYY is more specific than the IAU93 concept XXX.
598 </li>
599
600 <li>
601 Related concept, i.e. there is some form of relationship.
602 This is captured with the RDF statement <code>iau93:#XXX
603 map:relatedMatch ivoat:#YYY</code> which states that the IAU93 concept
604 XXX has an association with the IVOAT concept YYY.
605 </li>
606
607 </ul>
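<p>As an illustration of how the namespaces used in these statements
might be defined, the equivalence statement above could appear in a
standalone Turtle mapping document as sketched below. The two
vocabulary namespace URIs are placeholders invented for this example,
and the <code>map:</code> namespace is assumed to be that of the SKOS
mapping vocabulary; both should be checked against the published
vocabularies and against <span class='cite'>std:skosMapping</span>.
Only the spiral galaxy mapping is shown, since the other examples do
not yet name specific concepts.</p>
<pre>
# Assumed namespace of the SKOS mapping vocabulary.
@prefix map:   &lt;http://www.w3.org/2004/02/skos/mapping#&gt; .
# Placeholder namespaces; the real vocabulary locations may differ.
@prefix iau93: &lt;http://example.org/vocabularies/iau93#&gt; .
@prefix ivoat: &lt;http://example.org/vocabularies/ivoat#&gt; .

# The IAU thesaurus concept and the IVOAT concept denote the same thing.
iau93:SPIRALGALAXY map:exactMatch ivoat:spiralGalaxy .
</pre>
<p>In this form the <q>#</q> is carried by the namespace URI rather
than by the prefixed name, so that <code>ivoat:spiralGalaxy</code>
expands to the placeholder namespace followed by
<code>spiralGalaxy</code>.</p>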
608
609 <p>
610 <span class='todo'>[TODO:] Enter text regarding the resolution of
611 issue 7.</span>
612 </p>
613
614 </div>
615
616 <div class='section'>
617 <p class='title'>Suggested good practices</p>
618
619 <p>As long as the vocabularies conform to the SKOS standard and are
620 published in a machine-processable RDF format, there is nothing
621 keeping a VO application from using the vocabulary to support the
622 human user and to enable new connections between different sources of
623 information.
624 However, we have identified a set of
625 <q>best practice rules</q> which, if followed, will make the creation,
626 management, and use of the vocabularies within the VO simpler and more
627 effective:</p>
628
629 <ol>
630 <li>The SKOS documents defining the vocabulary should be published at
631 a long-term accessible URI and should be mirrored at a central IVOA
632 vocabulary repository.
633 Each version of the vocabulary should be indicated within the name
634 (e.g. "MyFavoriteVocabulary-v3.14") and previous versions should
635 continue to be available even after having been subsumed by newer
636 versions. Published vocabulary updates should be infrequent and
637 individual changes should be documented, e.g. by
638 <code>&lt;skos:changeNote&gt;</code>. The vocabulary namespace should
639 be the same as the location of the vocabulary.</li>
640
641 <li>Concept identifiers should consist only of the letters a-z, A-Z,
642 and numbers 0-9, i.e. no spaces, no exotic letters (e.g. umlauts), and
643 no characters which would make a token inexpressible as part of a URI;
644 since tokens are for use by computers only, this is not a big
645 restriction - the exotic letters can be used within the labels and
646 documentation if appropriate.</li>
647
648 <li>Token names should be kept in human-readable form, directly
649 reflect the implied meaning, and not be semi-random identifiers only
650 (e.g. <q>spiralGalaxy</q>, not "t1234567"); tokens should preferably
651 be created via a direct conversion from the preferred label via
652 removal/translation of non-token characters (see above) and
653 sub-token separation via capitalization of the first sub-token
654 character (e.g. the label "My favorite idea-label #42" is converted
655 into "MyFavoriteIdeaLabel42"). <span class='todo'>Open
656 issue</span></li>
657
658 <li>Labels should be in the form of the source vocabulary. When
659 developing a new vocabulary the singular form is preferred,
660 e.g. <q>spiral galaxy</q>, not "spiral galaxies". <span
661 class='todo'>Open issue</span></li>
662
663 <li>Each concept should have a definition
664 (<code>skos:definition</code>) that constitutes a short description of
665 the concept which could be adopted by an application using the
666 vocabulary. The use of additional documentation in standard SKOS or
667 Dublin Core format (see above) is encouraged. <span class='todo'>Note
668 distinction between description and SKOS scope-note</span></li>
669
670 <li>The language localization should be declared where appropriate,
671 e.g. for preferred labels, alternate labels, definitions, etc.</li>
672
673 <li>Relationships (<q>broader</q>, <q>narrower</q>, <q>related</q>)
674 between concepts are encouraged, but not required; if used, they
675 should be complete (e.g. all <q>broader</q> links have corresponding
676 <q>narrower</q> links in the referenced entries and <q>related</q>
677 entries link each other).</li>
678
679 <li><q>TopConcept</q> entries (see above) should be declared and
680 normally consist of those concepts that do not have any <q>broader</q>
681 relationships (i.e. are not at a subordinate position in the
682 hierarchy).</li>
683
684 <li>Publishers are encouraged to publish <q>mappings</q> between their
685 vocabularies and other commonly used vocabularies. These should be
686 external to the defining vocabulary document so that the vocabulary
687 can be used independently of the publisher's mappings.</li>
688 </ol>
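<p>The label-to-token conversion described in the list above can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the function name <code>label_to_token</code> is
hypothetical, <q>translation</q> of exotic letters is taken to mean
stripping accents, and every sub-token is capitalized, following the
<q>MyFavoriteIdeaLabel42</q> example rather than the
<q>spiralGalaxy</q> one.</p>

```python
import re
import unicodedata


def label_to_token(label: str) -> str:
    """Turn a preferred label into a token made only of a-z, A-Z, 0-9.

    Hypothetical helper: the source describes the rule but no code.
    """
    # Assumption: "translation" means dropping accents, so the label is
    # decomposed and every character outside ASCII is discarded.
    decomposed = unicodedata.normalize("NFKD", label)
    ascii_label = decomposed.encode("ascii", "ignore").decode("ascii")
    # Any run of characters outside a-z, A-Z, 0-9 separates sub-tokens.
    sub_tokens = [s for s in re.split(r"[^A-Za-z0-9]+", ascii_label) if s]
    # Capitalize the first character of each sub-token, keep the rest.
    return "".join(s[0].upper() + s[1:] for s in sub_tokens)


print(label_to_token("My favorite idea-label #42"))  # MyFavoriteIdeaLabel42
```

<p>Letters without an ASCII decomposition are simply dropped in this
sketch, so a publisher would probably want an explicit transliteration
table for such cases.</p>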
689
690 <p>These suggestions are by no means trivial – there was considerable
691 discussion within the semantic working group on many of these topics,
692 particularly about token formats (some wanted lower-case only), and
693 singular versus plural forms of the labels (different traditions exist
694 within the international library science community). Obviously, no
695 publisher of an astronomical vocabulary has to adopt these rules, but
696 the adoption of these rules will make it easier to use the vocabulary
697 in external generic VO applications. However, VO applications should
698 be developed to accept any vocabulary that complies with the latest
699 SKOS standard <span class="cite">std:skoscore</span>.</p>
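<p>The completeness rule for relationships (every <q>broader</q> link
matched by a <q>narrower</q> link, and <q>related</q> links pointing
both ways) lends itself to a mechanical check. The following Python
sketch assumes the vocabulary has already been loaded into a plain
dictionary; that layout and the name
<code>missing_reciprocal_links</code> are illustrative assumptions, not
part of any IVOA tool.</p>

```python
# Inverse of each SKOS relationship named in the good practices.
INVERSE = {"broader": "narrower", "narrower": "broader", "related": "related"}


def missing_reciprocal_links(concepts):
    """Return (concept, relation, target) links needed for completeness.

    `concepts` maps a concept token to a dict of relation name to a set
    of target tokens (a hypothetical in-memory layout).
    """
    missing = []
    for subject, relations in concepts.items():
        for relation, targets in relations.items():
            inverse = INVERSE[relation]
            for target in targets:
                back_links = concepts.get(target, {}).get(inverse, set())
                if subject not in back_links:
                    missing.append((target, inverse, subject))
    return sorted(missing)


concepts = {
    "constellation": {"narrower": {"Andromeda", "Cygnus"}},
    "Andromeda": {"broader": {"constellation"}},
    "Cygnus": {},  # the broader link to "constellation" was forgotten
}
print(missing_reciprocal_links(concepts))
# [('Cygnus', 'broader', 'constellation')]
```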
700 </div>
701
702 </div>
703
704
705 <div class="section">
706 <p class="title">Example vocabularies</p>
707
708 <p>The intent of having the IVOA adopt SKOS as the preferred format for
709 astronomical vocabularies is to encourage the creation and management
710 of diverse vocabularies by competent astronomical groups, so that
711 users of the VO and related resources can benefit directly and
712 dynamically without the intervention of the IAU or IVOA. However, we
713 felt it important to provide several examples of vocabularies in the
714 SKOS format as part of the proposal, to illustrate their simplicity
715 and power, and to provide an immediate vocabulary basis for VO
716 applications.</p>
717
718 <p>We provide a set of SKOS files representing the vocabularies which
719 have been developed, and mappings between them. These can be
720 downloaded at the URL</p>
721 <blockquote>
722 <span class='url'>@BASEURI@/@DISTNAME@.tar.gz</span>
723 </blockquote>
724
725 <p><span class='todo' >[To be expanded:] there are no mappings at the
726 moment. Also, the vocabularies are all in a single language, though
727 translations of the IAU93 thesaurus are available.</span></p>
728
729 <div class='section'>
730 <p class='title'>A Constellation Name Vocabulary (normative)</p>
731
732 <p>This vocabulary is presented as a simple example of an astronomical vocabulary for a very particular purpose, e.g. handling constellation information like that commonly encountered in variable star research. For example, <q>SS Cygni</q> is a cataclysmic variable located in the constellation <q>Cygnus</q>. The name of the star uses the genitive form <q>Cygni</q>, but the alternate label <q>SS Cyg</q> uses the standard abbreviation <q>Cyg</q>. In the constellation vocabulary, all of these forms are recorded together in a machine-processable format. <span class='todo'><q>Incorrect</q> forms should probably be represented as SKOS <q>hidden labels</q></span></p>
733
734 <p>The &lt;skos:ConceptScheme&gt; contains a single &lt;skos:TopConcept&gt;, <q>constellation</q>:</p>
735 <br/><br/><center><table><tr><th bgcolor="#eecccc">XML Syntax</th><th width="10"/><th bgcolor="#cceecc">Turtle Syntax</th></tr><tr></tr><tr>
736 <td bgcolor="#eecccc">
737 <pre>
738 &lt;skos:Concept rdf:about="#constellation"&gt;
739 &lt;skos:inScheme rdf:resource=""/&gt;
740 &lt;skos:prefLabel&gt;
741 constellation
742 &lt;/skos:prefLabel&gt;
743 &lt;skos:definition&gt;
744 IAU-sanctioned constellation names
745 &lt;/skos:definition&gt;
746 &lt;skos:narrower rdf:resource="#Andromeda"/&gt;
747 ...
748 &lt;skos:narrower rdf:resource="#Vulpecula"/&gt;
749 &lt;/skos:Concept&gt;
750 </pre>
751 </td>
752 <td/>
753 <td bgcolor="#cceecc">
754 <pre>
755 &lt;#constellation&gt; a :Concept;
756 :inScheme &lt;&gt;;
757 :prefLabel "constellation";
758 :definition "IAU-sanctioned constellation names";
759 :narrower &lt;#Andromeda&gt;;
760 ...
761 :narrower &lt;#Vulpecula&gt;.
762 </pre>
763 </td></tr>
764 </table></center>
765 <p>and the entry for <q>Cygnus</q> is</p>
766 <center><table><tr>
767 <td bgcolor="#eecccc">
768 <pre>
769 &lt;skos:Concept rdf:about="#Cygnus"&gt;
770 &lt;skos:inScheme rdf:resource=""/&gt;
771 &lt;skos:prefLabel&gt;Cygnus&lt;/skos:prefLabel&gt;
772 &lt;skos:definition&gt;Cygnus&lt;/skos:definition&gt;
773 &lt;skos:altLabel&gt;Cygni&lt;/skos:altLabel&gt;
774 &lt;skos:altLabel&gt;Cyg&lt;/skos:altLabel&gt;
775 &lt;skos:broader rdf:resource="#constellation"/&gt;
776 &lt;skos:scopeNote&gt;
777 Cygnus is nominative form; the alternative
778 labels are the genitive and short forms
779 &lt;/skos:scopeNote&gt;
780 &lt;/skos:Concept&gt;
781 </pre>
782 </td>
783 <td width="10"/>
784 <td bgcolor="#cceecc">
785 <pre>
786 &lt;#Cygnus&gt; a :Concept;
787 :inScheme &lt;&gt;;
788 :prefLabel "Cygnus";
789 :definition "Cygnus";
790 :altLabel "Cygni";
791 :altLabel "Cyg";
792 :broader &lt;#constellation&gt;;
793 :scopeNote "Cygnus is nominative form; the alternative
794 labels are the genitive and short forms".
795 </pre>
796 </td>
797 </tr></table></center>
798
799 <p>Note that SKOS alone does not distinguish between genitive forms
800 and abbreviations, but the use of alternate labels
801 is more than adequate for processing by VO applications, where
802 the difference between <q>SS Cygni</q>, <q>SS Cyg</q>, and the incorrect form
803 <q>SS Cygnus</q> is probably irrelevant.</p>
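<p>To illustrate how an application might exploit these labels, the
Python sketch below resolves a variable-star name to its constellation
concept. The labels are those of the <q>Cygnus</q> entry above; the
dictionary layout, the function names, and the assumption that the last
word of the star name identifies the constellation are all
illustrative.</p>

```python
# Labels copied from the Cygnus entry; the layout is an assumption.
CONSTELLATION_LABELS = {
    "Cygnus": {"pref": "Cygnus", "alt": ["Cygni", "Cyg"]},
}


def build_label_index(vocabulary):
    """Map every preferred and alternate label (lower-cased) to its concept."""
    index = {}
    for concept, labels in vocabulary.items():
        for label in [labels["pref"], *labels["alt"]]:
            index[label.lower()] = concept
    return index


def constellation_of(star_name, index):
    """Return the concept named by the last word of a star name, or None."""
    words = star_name.split()
    if not words:
        return None
    return index.get(words[-1].lower())


index = build_label_index(CONSTELLATION_LABELS)
for name in ("SS Cygni", "SS Cyg", "SS Cygnus"):
    print(name, "->", constellation_of(name, index))
```

<p>All three spellings resolve to the same concept, which is the sense
in which the distinction between them is irrelevant to such an
application.</p>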
804 </div>
805
806 <div class='section'>
807 <p class='title'>The 1993 IAU Thesaurus (normative)</p>
808
809 <p>The IAU Thesaurus consists of concepts with mostly capitalized
810 labels and a rich set of thesaurus relationships (<q>BF</q> for
811 <q>broader form</q>, <q>NF</q> for <q>narrower form</q>, and <q>RF</q> for
812 <q>related form</q>). The thesaurus also contains <q>U</q> (for
813 <q>use</q>) and <q>UF</q> (<q>use for</q>) relationships. In a SKOS
814 model of a vocabulary these are captured as alternative labels. A
815 separate document contains translations of the vocabulary terms in
816 five languages: English, French, German, Italian, and
817 Spanish. Enumerable concepts are plural (e.g. <q>SPIRAL
818 GALAXIES</q>) and non-enumerable concepts are singular
819 (e.g. <q>STABILITY</q>). Finally, there are some usage hints such as
820 <q>combine with other</q>.</p>
821
822 <p>In converting the IAU Thesaurus to SKOS, we have been as faithful
823 as possible to the original format of the thesaurus. Thus, preferred
824 labels have been kept in their uppercase format.</p>
825
826 </div>
827
828 <div class='section'>
829 <p class='title'>The Astronomy &amp; Astrophysics Keyword List (normative)</p>
830
831 <p><span class='todo'>[TODO] AG to write a short description here</span></p>
832 </div>
833
834 <div class='section'>
835 <p class='title'>The AOIM Taxonomy (normative)</p>
836
837 <p><span class='todo'>[TODO] AG to write a short description here</span></p>
838
839 </div>
840
841 <div class='section'>
842 <p class='title'>The UCD1+ Vocabulary (non-normative)</p>
843
844 <p>The UCD standard is an officially sanctioned and managed vocabulary
845 of the IVOA. The normative document is a simple text file containing
846 entries consisting of tokens (e.g. <code>em.IR</code>), a short
847 description, and usage information (<q>syntax codes</q> which permit
848 UCD tokens to be concatenated). The form of the tokens implies a
849 natural hierarchy: <code>em.IR.8-15um</code> is obviously a narrower
850 term than <code>em.IR</code>, which in turn is narrower than
851 <code>em</code>.</p>
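<p>The hierarchy implied by the dotted tokens can be recovered
mechanically. The Python sketch below is illustrative only (the
function name <code>broader_ucd_terms</code> is hypothetical) and
assumes that each <q>.</q> in a token separates one level of the
hierarchy.</p>

```python
def broader_ucd_terms(token):
    """List the broader terms implied by a dotted UCD token, nearest first."""
    parts = token.split(".")
    # Drop one trailing component at a time: a.b.c gives a.b, then a.
    return [".".join(parts[:n]) for n in range(len(parts) - 1, 0, -1)]


print(broader_ucd_terms("em.IR.8-15um"))  # ['em.IR', 'em']
```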
852
853 <p>Given the structure of the UCD1+ vocabulary, the natural
854 translation to SKOS consists of preferred labels equal to the original
855 tokens (the UCD1 words include dashes and periods), vocabulary tokens
856 created using the "5th Commandment" (e.g. "emIR815Um" for
857 <code>em.IR.8-15um</code>), direct use of the definitions, and the syntax codes
858 placed in usage documentation: <code>&lt;skos:scopeNote&gt;UCD syntax code: P&lt;/skos:scopeNote&gt;</code>
859 <span class='todo'>NOTE: THIS IS THE FORMAT I USED IN MY VERSION - MAY NOT BE THE SAME AS NORMAN'S [FVH]</span></p>
860
861 <p>Note that the SKOS document containing the UCD1+ vocabulary does
862 NOT constitute the official version: the normative document is still
863 the text list. However, in the long term, the IVOA may decide to make
864 the SKOS version normative, since the SKOS version contains all of the
865 information contained in the original text document but has the
866 advantage of being in a standard format easily read and used by any
867 application on the semantic web.</p>
868
869 </div>
870
871 <div class='section'>
872 <p class='title'>The proposed IVOA Thesaurus</p>
873
874 <p>While it is true that the adoption of SKOS will make it easy to
875 publish and access different astronomical vocabularies, the fact is
876 that there is no vocabulary which makes it easy to jump-start the
877 use of vocabularies in generic astrophysical VO applications: each of
878 the previously developed vocabularies has its own limits and
879 biases. For example, the IAU Thesaurus provides a large number of
880 entries, copious relationships, and translations to four other languages,
881 but there are no definitions, many concepts are now only useful for
882 historical purposes (e.g. many photographic or historical instrument
883 entries), some of the relationships are false or outdated, and many
884 important or newer concepts and their common abbreviations are
885 missing.</p>
886
887 <p>Despite its faults, the IAU Thesaurus constitutes a very extensive
888 vocabulary which could easily serve as the basis vocabulary once
889 we have removed its most egregious faults and extended it to cover the
890 most obvious semantic holes. To this end, a heavily revised IAU
891 thesaurus is in preparation for use within the IVOA and other
892 astronomical contexts. The goal is to provide a general vocabulary
893 foundation to which other, more specialized, vocabularies can be added
894 as needed, and to provide a good <q>lingua franca</q> for the creation of
895 vocabulary mappings.</p>
896 </div>
897 </div> <!-- End: Example vocabularies -->
898
899
900 <div class="appendices">
901
902 <div class="section-nonum" id="bibliography">
903 <p class="title">Bibliography</p>
904 <?bibliography rm-refs ?>
905 </div>
906
907 <p style="text-align: right; font-size: x-small; color: #888;">
908 $Revision$ $Date$
909 </p>
910
911 </div>
912
913 </body>
914 </html>
