/[volute]/trunk/projects/vocabularies/doc/vocabularies.xml
ViewVC logotype

Annotation of /trunk/projects/vocabularies/doc/vocabularies.xml

Parent Directory Parent Directory | Revision Log Revision Log


Revision 59 - (hide annotations)
Tue Feb 12 13:03:58 2008 UTC (12 years, 11 months ago) by alasdair.gray
File MIME type: text/xml
File size: 49288 byte(s)
Rewritten Rewritten Section 2.3 SKOS Mappings 

updated document to use the latest working draft of the skos standard

Corrected some factual details about the AOIM and A&A vocabularies-issues.bbl

Added a section detailing the example mapping included

1 norman.x.gray 2 <?xml version="1.0" encoding="utf-8"?>
2     <!-- Based on template at
3     http://www.ivoa.net/Documents/templates/ivoa-tmpl.html -->
4     <html xmlns="http://www.w3.org/1999/xhtml"
5     xmlns:dc="http://purl.org/dc/elements/1.1/"
6     xmlns:dcterms="http://purl.org/dc/terms/"
7     xml:lang="en" lang="en">
8    
9     <head>
10     <title>Vocabularies in the Virtual Observatory</title>
11     <link rev="made" href="http://nxg.me.uk/norman/#norman" title="Norman Gray"/>
12     <meta name="author" content="Norman Gray"/>
13     <meta name="DC.subject" content="IVOA, Virtual Observatory, Vocabulary"/>
14     <meta name="rcsdate" content="$Date$"/>
15     <link href="http://www.ivoa.net/misc/ivoa_wd.css" rel="stylesheet" type="text/css"/>
16     <style type="text/css">
17 norman.x.gray 58 /* make the ToC a little more compact, and without bullets */
18 norman.x.gray 2 div.toc ul { list-style: none; padding-left: 1em; }
19 norman.x.gray 26 div.toc li { padding-top: 0ex; padding-bottom: 0ex; }
20 norman.x.gray 23 li { padding-top: 1ex; padding-bottom: 1ex; }
21 alasdair.gray 34 td { vertical-align: top; }
22 norman.x.gray 2 span.userinput { font-weight: bold; }
23     span.url { font-family: monospace; }
24 norman.x.gray 43 span.rfc2119 { color: #800; }
25 norman.x.gray 2 q { color: #666; }
26     q:before { content: "“"; }
27     q:after { content: "”"; }
28     .todo { background: #ff7; }
29 norman.x.gray 58
30     /* 'link here' text in section headers */
31     *.hlink a {
32     text-decoration: none;
33     color: #fff; /* the page background colour */
34     }
35     *:hover.hlink a {
36     color: #800;
37     }
38 norman.x.gray 2 </style>
39     </head>
40    
41     <body>
42     <div class="head">
43     <table>
44     <tr><td><a href="http://www.ivoa.net/"><img alt="IVOA logo" src="http://ivoa.net/icons/ivoa_logo_small.jpg" border="0"/></a></td></tr>
45     </table>
46    
47     <h1>Vocabularies in the Virtual Observatory, v@VERSION@</h1>
48 norman.x.gray 22 <h2>IVOA Working Draft, @RELEASEDATE@ [DRAFT $Revision$]</h2>
49 norman.x.gray 2 <!-- $Revision$ $Date$ -->
50    
51     <dl>
52     <dt>Working Group</dt>
53     <dd><em><a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaSemantics">Semantics</a></em></dd>
54    
55     <dt>This version</dt>
56 norman.x.gray 43 <dd><span class='url'>http://www.ivoa.net/twiki/bin/view/IVOA/IvoaSemantics</span><br/>
57 norman.x.gray 46 <span class='url'>@BASEURI@.xhtml</span></dd>
58 norman.x.gray 2
59     <dt>Latest version</dt>
60 norman.x.gray 43 <dd><span class='url'>@DISTURI@</span></dd>
61 norman.x.gray 2
62     <dt>Editors</dt>
63 norman.x.gray 57 <dd>Alasdair J G Gray,
64     <a href='http://nxg.me.uk/norman#norman' >Norman Gray</a>,
65     Frederic V Hessman and
66     Andrea Preite Martinez</dd>
67 norman.x.gray 2
68     <dt>Authors</dt>
69     <dd>
70 norman.x.gray 57 <span property="dc:creator">Sébastien Derriere</span>,
71 norman.x.gray 2 <span property="dc:creator">Alasdair J G Gray</span>,
72     <span property="dc:creator">Norman Gray</span>,
73 norman.x.gray 57 <span property="dc:creator">Frederic V Hessman</span>,
74     <span property="dc:creator">Tony Linde</span>,
75     <span property="dc:creator">Andrea Preite Martinez</span>,
76     <span property="dc:creator">Rob Seaman</span> and
77     <span property="dc:creator">Brian Thomas</span>
78 norman.x.gray 2 </dd>
79     </dl>
80     <hr/>
81     </div>
82    
83     <div class="section-nonum" id="abstract">
84     <p class="title">Abstract</p>
85    
86     <div class="abstract">
87 norman.x.gray 21 <p>As the astronomical information processed within the <em>Virtual Observatory
88     </em> becomes more complex, there is an increasing need for a more
89     formal means of identifying quantities, concepts, and processes not
90     confined to things easily placed in a FITS image, or expressed in a
91     catalogue or a table. We proposed that the IVOA adopt a standard
92     format for vocabularies based on the W3C's <em>Resource Description
93 norman.x.gray 57 Framework</em> (RDF) and <em>Simple Knowledge Organisation System</em>
94 norman.x.gray 21 (SKOS). By adopting a standard and simple format, the IVOA will
95 norman.x.gray 57 permit different groups to create and maintain their own specialised
96 norman.x.gray 21 vocabularies while letting the rest of the astronomical community
97     access, use, and combined them. The use of current, open standards
98 norman.x.gray 22 ensures that VO applications will be able to tap into resources of the
99 norman.x.gray 21 growing semantic web. Several examples of useful astronomical
100     vocabularies are provided, including work on a common IVOA thesaurus
101     intended to provide a semantic common base for VO applications.</p>
102 norman.x.gray 2 </div>
103    
104     </div>
105    
106     <div class="section-nonum" id="status">
107     <p class="title">Status of this document</p>
108    
109 norman.x.gray 43 <p>This is (<strong>an internal draft of</strong>) an IVOA Working
110     Draft. The first release of this document was
111 norman.x.gray 2 <span property="dc:date">@RELEASEDATE@</span>.</p>
112    
113     <p>This document is an IVOA Working Draft for review by IVOA members
114     and other interested parties. It is a draft document and may be
115     updated, replaced, or obsoleted by other documents at any time. It is
116     inappropriate to use IVOA Working Drafts as reference materials or to
117     cite them as other than <q>work in progress</q>.</p>
118    
119     <p>A list of current IVOA Recommendations and other technical
120     documents can be found at
121 norman.x.gray 43 <span class='url' >http://www.ivoa.net/Documents/</span>.</p>
122 norman.x.gray 2
123     <h3>Acknowledgments</h3>
124    
125 norman.x.gray 21 <p>We would like to thank the members of the IVOA semantic working
126     group for many interesting ideas and fruitful discussions.</p>
127 norman.x.gray 2 </div>
128    
129     <h2><a id="contents" name="contents">Table of Contents</a></h2>
130     <?toc?>
131    
132     <hr/>
133    
134     <div class="section" id="introduction">
135     <p class="title">Introduction</p>
136    
137     <div class="section">
138     <p class="title">Vocabularies in astronomy</p>
139    
140     <p>Astronomical information of relevance to the Virtual Observatory
141     (VO) is not confined to quantities easily expressed in a catalogue or
142 alasdair.gray 25 a table.
143     Fairly simple things such as position on the sky, brightness in some
144 norman.x.gray 57 units, times measured in some frame, redshifts, classifications or
145 alasdair.gray 25 other similar quantities are easily manipulated and stored in VOTables
146     and can currently be identified using IVOA Unified Content Descriptors
147     (UCDs) <span class="cite">std:ucd</span>.
148     However, astrophysical concepts and quantities use a wide variety of
149     names, identifications, classifications and associations, most of
150     which cannot be described or labelled via UCDs.</p>
151 norman.x.gray 2
152 alasdair.gray 25 <p>There are a number of basic forms of organised semantic knowledge
153     of potential use to the VO, ranging from informal <q>folksonomies</q>
154     (where users are free to choose their own labels) at one extreme, to
155     formally structured <q>vocabularies</q> (where the label is drawn from
156 norman.x.gray 57 a predefined set of definitions, and which can include relationships between
157 norman.x.gray 43 labels) and <q>ontologies</q> (where the domain is captured in a
158     formal data model) at the other.
159 alasdair.gray 25 More formal definitions are presented later in this document.
160     </p>
161    
162 norman.x.gray 48 <p>An astronomical ontology is necessary if we are to have a computer
163 alasdair.gray 25 (appear to) `understand' something of the domain.
164     There has been some progress towards creating an ontology of
165 norman.x.gray 22 astronomical object types <span
166 alasdair.gray 25 class="cite">std:ivoa-astro-onto</span> to meet this need.
167     However there are distinct use cases for letting human users find
168     resources of interest through search and navigation of the information space.
169     The most appropriate technology to meet these use cases derives from
170     the Information Science community, that of <em>controlled
171     vocabularies, taxonomies and thesauri</em>.
172     In the present document, we do not distinguish between controlled
173     vocabularies, taxonomies and thesauri, and use the term
174     <em>vocabulary</em> to represent all three.
175     </p>
176 norman.x.gray 2
177     <p>One of the best examples of the need for a simple vocabulary within
178     the VO is VOEvent <span class="cite">std:voevent</span>, the VO
179 norman.x.gray 43 standard for supporting rapid notification of astronomical events.
180     This standard requires some formalised indication of what a published
181     event is `about', in a formalism which can be used straightforwardly
182     by the developer of relevant services. See <span class='xref'
183     >usecases</span> for further discussion.</p>
184 norman.x.gray 2
185 norman.x.gray 43 <p>A number of astronomical vocabularies have been created, with a
186     variety of goals and intended uses. Some examples are detailed below. </p>
187 alasdair.gray 34
188 norman.x.gray 2 <ul>
189 alasdair.gray 25
190 norman.x.gray 2 <li>The <em>Second Reference Dictionary of the Nomenclature of
191 alasdair.gray 25 Celestial Objects</em> <span class="cite">lortet94</span>, <span
192     class="cite">lortet94a</span> contains 500 paper pages of astronomical
193     nomenclature</li>
194 norman.x.gray 2
195     <li>For decades professional journals have used a set of reasonably
196     compatible keywords to help classify the content of whole articles.
197     These keywords have been analysed by Preite Martinez &amp; Lesteven
198 norman.x.gray 43 <span class="cite">preitemartinez07</span>, who derived a
199 alasdair.gray 31 set of common keywords constituting one of the potential bases for a
200 norman.x.gray 2 fuller VO vocabulary. The same authors also attempted to derive a set
201 norman.x.gray 57 of common concepts by analysing the contents of abstracts in journal
202 norman.x.gray 22 articles, which should comprise a list of tokens/concepts more
203 alasdair.gray 31 up-to-date than the old list of journal keywords. A similar but less
204     formal attempt was made by Hessman <span class='cite'>hessman05</span>
205     for the VOEvent working group, resulting in a similar list <span
206 alasdair.gray 34 class="todo">[TODO] Check differences from the A&amp;A
207     list</span>.</li>
208 norman.x.gray 2
209 alasdair.gray 34 <li>Astronomical databases generally use simple sets of keywords
210 norman.x.gray 57 – sometimes hierarchically organised – to help users make queries.
211 norman.x.gray 43 Two examples from very
212 alasdair.gray 34 different contexts are the list of object types used in the <a
213 alasdair.gray 25 href="http://simbad.u-strasbg.fr">Simbad</a> database and the search
214     keywords used in the educational Hands-On Universe image database
215     portal.</li>
216 norman.x.gray 2
217 alasdair.gray 25 <li>The Astronomical Outreach Imagery (AOI) working group has created
218     a simple taxonomy for helping to classify images used for educational
219 norman.x.gray 36 or public relations <span class="cite">std:aoim</span>. See section
220     <span class='xref'>vocab-aoim</span>.</li>
221 norman.x.gray 48
222 alasdair.gray 25 <!--
223 norman.x.gray 2 <li>The Hands-On Universe project (see <span class='url'
224 norman.x.gray 22 >http://sunra.lbl.gov/telescope2/index.html</span>) has maintained a
225 norman.x.gray 2 public database of images for use by the general public since the
226     1990s. The images are very heterogeneous, since they are gathered from
227     a variety of professional, semi-professional, amateur, and school
228 norman.x.gray 22 observatories, so a simple taxonomy is used to facilitate browsing
229 norman.x.gray 2 by the users of the database.</li>
230 norman.x.gray 48 -->
231 norman.x.gray 2
232     <li>In 1993, Shobbrook and Shobbrook published an Astronomy Thesaurus
233 norman.x.gray 26 endorsed by the IAU <span class='cite' >shobbrook92</span>. This
234 alasdair.gray 25 collection of nearly 3000 terms, in five languages, is a valuable
235     resource, but has seen little use in recent years. Its very size,
236     which gives it expressive power, is a disadvantage to the extent that
237 norman.x.gray 57 it is consequently hard to use. See <span class='xref'>vocab-iau93</span>.</li>
238 norman.x.gray 2
239 norman.x.gray 43 <li>The VO's Unified Content Descriptors <span class='cite'
240     >std:ucd</span> (UCD) constitute the main controlled vocabulary of the
241 norman.x.gray 57 IVOA and contain some taxonomic information. However, UCD has some
242 norman.x.gray 43 features which supports its goals, but which make it difficult to use
243 norman.x.gray 57 beyond the present applications of labelling VOTables: firstly, there
244 norman.x.gray 43 is no standard means of identifying and processing the contents of the
245     text-based reference document; secondly, the content cannot be openly
246     extended beyond that set by a formal IVOA committee without going
247     through a laborious and time-consuming negotiation process of
248     extending the primary vocabulary itself; and thirdly, the UCD
249     vocabulary is primarily concerned with data types and their
250     processing, and only peripherally with astronomical objects (for
251     example, it defines formal labels for RA, flux, and bandpass, but does
252 norman.x.gray 57 not mention the Sun). See <span class='xref'>vocab-ucd1</span>.</li>
253 norman.x.gray 21
254 norman.x.gray 2 </ul>
255     </div>
256    
257 norman.x.gray 40 <div class='section' id='usecases'>
258     <p class='title'>Use-cases, and the motivation for formalised vocabularies</p>
259    
260     <p>The most immediate high-level motivation for this work is the
261     requirement of the VOEvent standard <span class='cite'
262     >std:voevent</span> for a controlled vocabulary usable in the
263 norman.x.gray 57 VOEvent's <code>&lt;why/&gt;</code> and <code>&lt;what/&gt;</code>
264     elements, which describe what
265 norman.x.gray 40 sort of object the VOEvent packet is describing, in some broadly
266 norman.x.gray 57 intelligible way. For example a `burst' might be a gamma-ray burst
267 norman.x.gray 43 due to the collapse of a star in a distant galaxy, a solar flare, or
268     the brightening of a stellar or AGN accretion disk, and having an
269     explicit list of vocabulary terms can help guide the event publisher
270     into using a term which will be usefully precise for the event's
271     consumers. A free-text label can help here (which brings us into the
272 norman.x.gray 57 area sometimes referred to as `folksonomies'), but the astronomical
273     community, with a culture sympathetic to international agreement, can
274     do better.</p>
275 norman.x.gray 40
276 norman.x.gray 57 <p>The purpose of this proposal is to establish a set of conventions for
277     the creation, publication, use, and manipulation of
278     astronomical vocabularies within the Virtual Observatory, based upon
279     the W3C's SKOS standard. We include as appendices to this proposal
280     formalised versions of a number of existing vocabularies, encoded as
281 alasdair.gray 59 SKOS vocabularies <span class="cite">std:skosref</span>.</p>
282 norman.x.gray 57
283 norman.x.gray 40 <p>Specific use-cases include the following.</p>
284     <ul>
285     <li>A user wishes to process all events concerning supernovae, which
286 norman.x.gray 57 means that an event concerning a type 1a supernova must be understood to be
287 norman.x.gray 40 relevant. [This supports a system working autonomously, filtering
288     incoming information]</li>
289    
290     <li>A user is searching an archive of VOEvents for microlensing
291     events, and retrieves a large number of them; the search interface may
292     then prompt her to narrow her search using one of a set of terms
293     including, say, binary lens events. [This supports so-called `semantic
294     search', providing semantic support to an interface which is in turn
295     supporting a user]</li>
296    
297     <li>A user wishes to search for resources based on the
298 norman.x.gray 43 journal-supported keywords in a paper; they might either initiate this by
299 norman.x.gray 40 hand, or have this done on their behalf by a tool which can extract
300     the keywords from a PDF. The keywords are in the A&amp;A vocabulary,
301     and mappings have been defined between this vocabulary and others,
302     which means that the query keywords is translated automatically
303     into those appropriate for a search of an outreach image database
304 norman.x.gray 57 (everyone likes pretty pictures), the VO Registry, a set of Simbad
305 norman.x.gray 40 object types, and one or more concepts in more formal ontologies. The
306     search interface is then able to support the user browsing up and down
307 norman.x.gray 57 the AOIM vocabulary, and a specialised Simbad tool is able to take
308 norman.x.gray 40 over the search, now it has an appropriate starting place. [This
309     supports interoperability, building on the investments which
310     institutions and users have made in existing vocabularies]</li>
311    
312     </ul>
313    
314 norman.x.gray 57 <p>It is not a goal of this standard, as it is not a goal of SKOS, to
315     produce knowledge-engineering artefacts which can support elaborate
316     machine reasoning – such artefacts would be very valuable, but require
317     much more expensive work on ontologies. As the supernova use-case
318     above illustrates, even simple vocabularies can support useful machine
319     reasoning.</p>
320    
321     <p>It is also not a goal of this standard to produce new vocabularies, or
322     substantially alter existing ones; instead, the vocabularies included
323     below in <span class='xref'>distvocab</span> are directly derived from
324     existing vocabularies, with adjustments to make them structurally
325     compatible with SKOS, or to remove (in the case of the IAU-93 and
326     IVOAT pair) significant anachronisms. It therefore follows that the
327     ambiguities, redundancies and incompleteness of the source vocabularies
328     are faithfully represented in the distributed SKOS vocabularies.</p>
329    
330     <p>The reason for both of these limitations is that vocabularies are
331     extremely expensive to produce, maintain and deploy, and we must
332     therefore rely on such vocabularies as have been developed, and
333     attached as metadata to resources, by others. Such vocabularies are
334     less rich or less coherent than we might prefer, but widely enough
335     deployed to be useful.</p>
336    
337 norman.x.gray 40 </div>
338    
339 norman.x.gray 2 <div class="section">
340     <p class="title">Formalising and managing multiple vocabularies</p>
341    
342     <p>We find ourselves in the situation where there are multiple
343     vocabularies in use, describing a broad range of resources of interest
344     to professional and amateur astronomers, and members of the public.
345     These different vocabularies use different terms and different
346     relationships to support the different constituencies they cater for.
347 alasdair.gray 25 For example, <q>delta Sct</q> and <q>RR Lyr</q> are terms one would
348     find in a vocabulary aimed at professional astronomers, associated
349     with the notion of <q>variable star</q>; however one would
350     <em>not</em> find such technical terms in a vocabulary intended to
351     support outreach activities.</p>
352 norman.x.gray 2
353     <p>One approach to this problem is to create a single consensus
354     vocabulary, which draws terms from the various existing vocabularies
355     to create a new vocabulary which is able to express anything its users
356     might desire. The problem with this is that such an effort would be
357 norman.x.gray 43 very expensive, both in terms of time and effort on the part of those
358 norman.x.gray 2 creating it, and to the potential users, who have to learn
359     to navigate around it, recognise the new terms, and who have to be
360     supported in using the new terms correctly (or, more often,
361     incorrectly).</p>
362    
363     <p>The alternative approach to the problem is to evade it, and this is
364 norman.x.gray 22 the approach taken in this document. Rather than deprecating the
365 norman.x.gray 2 existence of multiple overlapping vocabularies, we embrace it,
366 norman.x.gray 43 help interest groups formalise as many of them as are appropriate, and
367     standardise the process of formally declaring the relationships between
368 norman.x.gray 2 them. This means that:</p>
369     <ul>
370 alasdair.gray 25 <li>The various vocabularies are allowed to evolve separately, on
371 norman.x.gray 21 their own timescales, managed either by the IVOA, individual working
372     groups within the IVOA, or by third parties;</li>
373    
374 norman.x.gray 57 <li>Specialised vocabularies can be developed and maintained by the
375 norman.x.gray 22 community with the most knowledge about a specific topic, ensuring
376 norman.x.gray 43 that the vocabulary will have the most appropriate breadth, depth, and
377     precision;</li>
378 norman.x.gray 21
379 alasdair.gray 25 <li>Users can choose the vocabulary or combination of vocabularies most
380 norman.x.gray 21 appropriate to their situation, either when annotating resources, or
381     when querying them; and</li>
382    
383 alasdair.gray 25 <li>We can retain the previous investments made in vocabularies by
384 norman.x.gray 21 users and resource owners.</li>
385    
386 norman.x.gray 2 </ul>
387    
388    
389     </div>
390    
391     </div>
392    
393 norman.x.gray 21 <div class='section'>
394 norman.x.gray 43 <p class='title'>SKOS-based vocabularies (informative)</p>
395 norman.x.gray 2
396 norman.x.gray 43 <p>In this section, we introduce the concepts of SKOS-based
397 norman.x.gray 57 vocabularies, and the technology of mapping between them. We describe
398     some additional requirements for IVOA vocabularies in the next
399     section, <span class='xref' >publishing</span>.</p>
400 norman.x.gray 43
401 norman.x.gray 22 <div class="section" id='vocab'>
402 norman.x.gray 21 <p class="title">Selection of the vocabulary format</p>
403 norman.x.gray 2
404 norman.x.gray 21 <p>After extensive online and face-to-face discussions, the authors have
405     brokered a consensus within the IVOA community that
406     formalised vocabularies should be published at least in SKOS (Simple Knowledge
407 norman.x.gray 57 Organisation System) format, a W3C draft standard application of RDF to the
408 norman.x.gray 2 field of knowledge organisation <span
409 alasdair.gray 59 class="cite">std:skosref</span>. SKOS draws on long experience
410 norman.x.gray 2 within the Library and Information Science community, to address a
411     well-defined set of problems to do with the indexing and retrieval of
412     information and resources; as such, it is a close match to the problem
413 norman.x.gray 43 this document is addressing.</p>
414 norman.x.gray 2
415     <p>ISO 5964 <span class='cite' >std:iso5964</span> defines a number of
416     the relevant terms (ISO 5964:1985=BS 6723:1985; see also <span
417     class='cite' >std:bs8723-1</span> and <span class='cite'
418     >std:z39.19</span>), and some of the (lightweight) theoretical
419     background. The only technical distinction relevant to this document
420     is that between `vocabulary' and `thesaurus': BS-8723-1 defines a
421     thesaurus as a</p>
422     <blockquote>
423 alasdair.gray 25 Controlled vocabulary in which concepts are represented by preferred
424 norman.x.gray 2 terms, formally organized so that paradigmatic relationships between
425     the concepts are made explicit, and the preferred terms are
426 norman.x.gray 57 accompanied by lead-in entries for synonyms or quasi-synonyms.
427     <!-- NOTE:
428 norman.x.gray 2 The purpose of a thesaurus is to guide both the indexer and the
429     searcher to select the same preferred term or combination of preferred
430 norman.x.gray 57 terms to represent a given subject. -->
431     (BS-8723-1, sect. 2.39)
432 norman.x.gray 2 </blockquote>
433     <p>with a similar definition in ISO-5964 sect. 3.16. The paradigmatic
434 norman.x.gray 22 relationships in question are those relating a term to a <q>broader</q>,
435 norman.x.gray 57 <q>narrower</q> or more generically <q>related</q> term. These
436     notions have an operational definition: any resource
437     retrieved as a result of a search on a given term will also be
438     retrievable through a search on that term's <q>broader term</q>
439     (<q>narrower</q> is a simple inverse, so that for any pair of terms,
440     if <code>A skos:broader B</code>, then <code>B skos:narrower A</code>;
441     a term may have multiple narrower and broader terms).
442 norman.x.gray 2 This is not a subsumption relationship, as there is no implication
443     that the concept referred to by a narrower term is of the same
444     <em>type</em> as a broader term.</p>
445    
446 norman.x.gray 21 <p>Thus <strong>a vocabulary (SKOS or otherwise) is not an
447     ontology</strong>. It has lighter and looser semantics than an
448     ontology, and is specialised for the restricted case of resource
449     retrieval. Those interested in ontological analyses can easily
450     transfer the vocabulary relationship information from SKOS to a formal
451 norman.x.gray 22 ontological format such as OWL <span class='cite' >std:owl</span>.</p>
452 norman.x.gray 2
453 norman.x.gray 57 <p>The purpose of a thesaurus is to help users find resources they
454     might be interested in, be they library books, image archives, or VOEvent
455     packets.</p>
456    
457 norman.x.gray 21 </div>
458 norman.x.gray 2
459 norman.x.gray 21 <div class='section'>
460     <p class='title'>Content and format of a SKOS vocabulary</p>
461    
462 alasdair.gray 25 <p>A published vocabulary in SKOS format consists of a set of
463 alasdair.gray 34 <q>concepts</q> – an example concept capturing the
464 norman.x.gray 43 vocabulary information about spiral galaxies is provided in the <a
465     href='#figexample' >Figure below</a>, with the RDF shown in both
466     RDF/XML <span class='cite' >std:rdfxml</span> and Turtle notation <span
467 alasdair.gray 34 class='cite' >std:turtle</span> (Turtle is similar to the more
468     informal N3 notation). The elements of a concept are detailed
469     below.</p>
470    
471     <center>
472 norman.x.gray 43 <p><a name='figexample' >Figure: examples of SKOS vocabularies</a></p>
473 alasdair.gray 34 <table>
474     <tr>
475     <th bgcolor="#eecccc">XML Syntax</th>
476     <th width="10"/>
477     <th bgcolor="#cceecc">Turtle Syntax</th>
478     </tr>
479 norman.x.gray 36 <tr><td/></tr>
480 alasdair.gray 34 <tr>
481     <td bgcolor="#eecccc">
482     <pre>
483     &lt;skos:Concept rdf:about="#spiralGalaxy"&gt;
484     &lt;skos:prefLabel lang="en"&gt;
485     spiral galaxy
486     &lt;/prefLabel&gt;
487     &lt;skos:prefLabel lang="de"&gt;
488     Spiralgalaxie
489     &lt;/prefLabel&gt;
490     &lt;skos:altLabel lang="en"&gt;
491     spiral nebula
492     &lt;/skos:altLabel&gt;
493     &lt;skos:hiddenLabel lang="en"&gt;
494     spiral glaxy
495     &lt;/hiddenLabel&gt;
496     &lt;skos:definition lang="en"&gt;
497     A galaxy having a spiral structure.
498     &lt;/skos:definition&gt;
499     &lt;skos:scopeNote lang="en"&gt;
500     Spiral galaxies fall into one of
501     three catagories: Sa, Sc, and Sd.
502     &lt;/skos:scopeNote&gt;
503 norman.x.gray 36 &lt;skos:narrower
504     rdf:resource="#barredSpiralGalaxy"/&gt;
505     &lt;skos:broader
506     rdf:resource="#galaxy"/&gt;
507     &lt;skos:related
508     rdf:resource="#spiralArm"/&gt;
509 alasdair.gray 34 &lt;/skos:Concept&gt;
510     </pre>
511     </td>
512     <td/>
513     <td bgcolor="#cceecc">
514     <pre>
515     &lt;#spiralGalaxy&gt; a skos:Concept;
516 norman.x.gray 36 skos:prefLabel
517     "spiral galaxy"@en,
518 alasdair.gray 34 "Spiralgalaxie"@de;
519     skos:altLabel "spiral nebula"@en;
520     skos:hiddenLabel "spiral glaxy"@en;
521 norman.x.gray 36 skos:definition """A galaxy having a
522     spiral structure."""@en;
523     skos:scopeNote """Spiral galaxies fall
524     into one of three categories:
525     Sa, Sc, and Sd"""@en;
526 alasdair.gray 34 skos:narrower &lt;#barredSpiralGalaxy&gt;;
527     skos:broader &lt;#galaxy&gt;;
528     skos:related &lt;#spiralArm&gt; .
529     </pre>
530     </td>
531     </tr>
532     </table>
533     </center>
534    
535 norman.x.gray 43 <p>A SKOS vocabulary includes the following features.</p>
536    
537 alasdair.gray 34 <ul>
538    
539 norman.x.gray 43 <li>A single URI representing the concept, mainly for use by computers.
540 alasdair.gray 34 <!--
541     <code>&lt;#spiralGalaxy&gt; a skos:Concept</code>.
542     <code>&lt;skos:Concept rdf:about="#spiralGalaxy"&gt;</code>
543     -->
544 norman.x.gray 22 </li>
545 norman.x.gray 21
546 alasdair.gray 34 <li>A single prefered label in each supported language of the
547 norman.x.gray 43 vocabulary, for use by humans.
548 alasdair.gray 34 <!--
549     <code>skos:prefLabel "spiral galaxy"@en, "Spiralgalaxie"@de</code>.
550     <code>&lt;skos:prefLabel&gt;spiral galaxy&lt;/skos:prefLabel&gt;</code>
551     -->
552 norman.x.gray 22 </li>
553 norman.x.gray 21
554 alasdair.gray 25 <li>Optional alternative labels which applications may encounter or in
555     common use, whether simple synonyms or commonly-used aliases,
556 alasdair.gray 34 e.g. <q>GRB</q> for "gamma-ray burst", or <q>Spiral nebula</q> for
557     spiral galaxies.
558     <!--
559     <code>skos:altLabel "GRB"@en</code>
560     <code>&lt;skos:altLabel lang="de"&gt;Spiralgalaxie&lt;/skos:altLabel&gt;</code>
561     -->
562     </li>
563 norman.x.gray 21
564 norman.x.gray 43 <li>Optional hidden labels which capture terms which are sometimes
565     used for the corresponding concept, but which are deprecated in some
566     sense. This might include common misspellings for
567     either the preferred or alternate labels, for example <q>glaxy</q> for
568     <q>galaxy</q>.
569 alasdair.gray 34 </li>
570 alasdair.gray 25
571     <li>A definition for the concept, where one exists in the original
572 alasdair.gray 34 vocabulary, to clarify the meaning of the term.
573     <!--
574     <code>skos:definition "A galaxy having a spiral structure."@en</code>
575     <code>&lt;skos:definition lang="en"&gt;<br/>A galaxy having a spiral structure.<br/>&lt;/skos:definition&gt;</code>
576     -->
577     </li>
578 alasdair.gray 25
579 norman.x.gray 57 <li>A scope note to further clarify a definition, or the usage of the
580 alasdair.gray 34 concept.
581     <!--
582     <code>skos:scopeNote "Spiral galaxies fall into one of three categories: Sa, Sc, and Sd"@en</code>
583     <code>&lt;skos:scopeNote lang="en"&gt;<br/>Spiral galaxies fall into one of three catagories: Sa, Sc, and Sd.<br/>&lt;/skos:scopeNote&gt;</code>
584     -->
585     </li>
586 alasdair.gray 25
587 alasdair.gray 34 <li>Optionally, a concept may be involved in any number of relationships
588 alasdair.gray 25 to other concepts. The types of relationships are
589 norman.x.gray 21 <ul>
590 norman.x.gray 43 <li>Narrower or more specific concepts, for example a link to the concept
591 alasdair.gray 34 representing a <q>barred spiral galaxy</q>.
592 alasdair.gray 31 <!--
593 alasdair.gray 34 <code>skos:narrower &lt;#barredSpiralGalaxy&gt;</code>.
594     <code>&lt;skos:narrower rdf:resource="#barredSpiralGalaxy"&gt;</code>
595     -->
596 norman.x.gray 22 </li>
597 norman.x.gray 43 <li>Broader or more general concepts, for example a link to the token
598 alasdair.gray 34 representing galaxies in general.
599     <!--
600     <code>skos:broader &lt;#galaxy&gt;</code>.
601     <code>&lt;skos:broader rdf:resource="#galaxy"&gt;</code>
602     -->
603 norman.x.gray 22 </li>
604 norman.x.gray 43 <li>Related concepts, for example a link to the token representing spiral
605 alasdair.gray 34 arms of galaxies
606     <!--
607     <code>skos:related &lt;#spiralArm&gt;</code>
608     <code>&lt;skos:related rdf:resource="#spiralArm"&gt;</code>
609     -->
610 alasdair.gray 31 <br/>
611 alasdair.gray 25 (note this relationship does not say that spiral galaxies have spiral
612     arms – that would be ontological information of a higher order which
613     is beyond the requirements for information stored in a vocabulary).</li>
614 norman.x.gray 21 </ul>
615     </li>
616     </ul>
617 alasdair.gray 25
618     <p>In addition to the information about a single concept, a vocabulary
619     can contain information to help users navigate its structure and
620     contents:</p>
621     <ul>
622     <li>The <q>top concepts</q> of the vocabulary, i.e. those that occur
623     at the top of the vocabulary hierarchy defined by the broader/narrower
624     relationships, can be explicitly stated to make it easier to navigate
625     the vocabulary.</li>
626    
627     <li>Concepts that form a natural group can be defined as being members
628     of a <q>collection</q>.</li>
629    
630     <li>Versioning information can be added using change notes.</li>
631    
632     <li>Additional metadata about the vocabulary, e.g. the publisher, may
633 norman.x.gray 26 be documented using the Dublin Core metadata set <span class='cite'
634     >std:dublincore</span>.</li>
635 alasdair.gray 25 </ul>
636 norman.x.gray 21 </div>
637    
638 alasdair.gray 25
639 norman.x.gray 21 <div class='section'>
640 alasdair.gray 25 <p class='title'>Relationships Between Vocabularies</p>
641 norman.x.gray 21
642 alasdair.gray 27 <p>
643     There already exist several vocabularies in the domain of astronomy.
644     Instead of attempting to replace all these existing vocabularies,
645     which have been developed to achieve different aims and user groups,
646     we embrace them.
647     This requires a mechanism to relate the concepts in the different
648     vocabularies.
649     </p>
650    
651     <p>
652 alasdair.gray 59 Part of the SKOS standard <span class='cite'>std:skosref</span>
653     allows a concept in one vocabulary to be related to a concept in
654     another vocabulary.
655    
656     There are four types of relationship provided to capture the
657     relationships between concepts in vocabularies, which are similar to
658     those defined for relationships between concepts within a single
659     vocabulary.
660     The types of mapping relationships are:
661 alasdair.gray 27 </p>
662 alasdair.gray 59
663 norman.x.gray 21 <ul>
664 alasdair.gray 27
665     <li>
666 alasdair.gray 35 Equivalence between concepts, i.e. the concepts in the different
667 alasdair.gray 27 vocabularies refer to the same real world entity.
668 alasdair.gray 59 This is captured with the RDF statement
669     <blockquote>
670     <code>AAkeys:#Cosmology skos:exactMatch aoim:#Cosmology</code>
671     </blockquote>
672     which states that the cosmology concept in the A&amp;A Keywords is the
673     same as the cosmology concept in the AOIM.
674     (Note the use of an external namespaces <code>AAkeys</code> and
675     <code>aoim</code> which must be defined within the document.)
676 alasdair.gray 25 </li>
677 alasdair.gray 27
678     <li>
679     Broader concept, i.e. there is not an equivalent concept but there is
680     a more general one.
681 alasdair.gray 59 This is captured with the RDF statement
682     <blockquote>
683     <code>AAkeys:#Moon skos:broadMatch aoim:PlanetSatellite</code>
684     </blockquote>
685     which states that the AOIM concept Planet Satellite is a more general
686     term than the A&amp;A Keywords concept Moon.
687 alasdair.gray 27 </li>
688    
689     <li>
690     Narrower concept, i.e. there is not an equivalent concept but there is
691     a more specific one.
692 alasdair.gray 59 This is captured with the RDF statement
693     <blockquote>
694     <code>AAkeys:#IsmClouds skos:narrowMatch
695     aoim:#NebulaAppearanceDarkMolecularCloud</code>
696     </blockquote>
697     which states that the AOIM concept Nebula Appearance Dark Molecular
698     Cloud is more specific than the A&amp;A Keywords concept ISM Clouds.
699 alasdair.gray 27 </li>
700    
701     <li>
702     Related concept, i.e. there is some form of relationship.
703 alasdair.gray 59 This is captured with the RDF statement
704     <blockquote>
705     <code>AAkeys:#BlackHolePhysics skos:relatedMatch
706     aoim:#StarEvolutionaryStageBlackHole</code>
707     </blockquote>
708     which states that the A&amp;A Keywords concept Black Hole Physics has
709     an association with the AOIM concept Star Evolutionary Stage Black Hole.
710 alasdair.gray 27 </li>
711    
712 alasdair.gray 25 </ul>
713 norman.x.gray 21
714 alasdair.gray 27 <p>
715 alasdair.gray 59 <!-- <span class='todo'>[TODO:] Enter text regarding the resolution of <a
716 alasdair.gray 34 href="http://code.google.com/p/volute/issues/detail?id=7">Issue
717 alasdair.gray 59 7</a>.</span> -->
718    
719     The semantic mapping relationships have certain properties.
720     The broadMatch relationship has the narrowMatch relationship as its
721     inverse and the exactMatch and relatedMatch relationships are
722     symmetrical.
723     The consequence of these properties is that if you have a mapping from
724     concept <code>A</code> in one vocabulary to concept <code>B</code> in
725     another vocabulary then you can infer a mapping from concept
726     <code>B</code> to concept <code>A</code>.
727 alasdair.gray 27 </p>
728    
729 norman.x.gray 21 </div>
730    
731 norman.x.gray 43 </div>
732    
733     <div class='section' id='publishing'>
734     <p class='title'>Publishing vocabularies (normative)</p>
735    
736     <div class='section' id='pubreq'>
737     <p class='title'>Requirements</p>
738    
739     <p>A vocabulary which conforms to this IVOA standard has the following
740     features. In this section, the keywords
741     <span class='rfc2119' >must</span>,
742     <span class='rfc2119' >should</span>
743     and so on, are to be interpreted as described in <span
744     class='cite'>std:rfc2119</span>.</p>
745    
746     <div class='section'>
747 norman.x.gray 48 <p class='title'>Dereferenceable namespace</p>
748 norman.x.gray 43
749     <p>The namespace of the
750     vocabulary <span class='rfc2119'>must</span> be dereferenceable on the
751     web. That is, typing the namespace URL into a web browser will
752     produce human-readable documentation about the vocabulary. In
753     addition, the namespace URL <span class='rfc2119' >should</span>
754     return the RDF version of the vocabulary if it is retrieved with an
755     HTTP Accept header of <code>application/rdf+xml</code>.</p>
756    
757     <p><em>Rationale: These prescriptions are intended to be compatible
758     with the patterns described in <span class='cite'>berrueta08</span>
759     and <span class='cite'>sauermann07</span>, and vocabulary distributors
760     <span class='rfc2119' >should</span> follow these patterns where
761     possible.</em></p>
762     </div>
763    
764     <div class='section'>
765 norman.x.gray 48 <p class='title'>Long-term availability</p>
766 norman.x.gray 43
767     <p>The files defining a
768 norman.x.gray 57 vocabulary, including those of superseded versions, should remain
769 norman.x.gray 43 permanently available. There is no requirement that the namespace
770     URL be at any particular location, although the IVOA web pages, or the
771     online sections of the A&amp;A journal would likely be suitable
772     archival locations.</p>
773     </div>
774    
775     <div class='section'>
776 norman.x.gray 48 <p class='title'>Distribution format</p>
777 norman.x.gray 43
778     <p>Vocabularies <span class='rfc2119'>must</span> be made available
779     for distribution as SKOS RDF files, in either RDF/XML <span
780     class='cite'>std:rdfxml</span> or Turtle <span
781     class='cite'>std:turtle</span> format; vocabularies <span
782     class='rfc2119'>should</span> be made available in both formats. See
783 norman.x.gray 46 issue <a href='@DISTURI@/issues#distformat-2'>[distformat-2]</a>.</p>
784 norman.x.gray 43
785     <p>A publisher <span class='rfc2119'>may</span> make available
786     documentation and supporting files in other formats.</p>
787    
788     <p><em>Rationale: this does imply that the vocabulary source files can only
789     realistically be parsed using an RDF parser. An alternative is to
790     require that vocabularies be distributed using a subset of RDF/XML
791     which can also be naively handled as traditional XML; however as well
792     as creating an extra standardisation requirement, this would make it
793     effectively infeasible to write out the distribution version of the
794     vocabulary using an RDF or general SKOS tool.</em></p>
795     </div>
796    
797     <div class='section'>
798 norman.x.gray 48 <p class='title'>Clearly versioned vocabulary</p>
799 norman.x.gray 43
800     <p><span class='todo' >To be decided. There are interactions with
801 norman.x.gray 57 'long-term availability' and 'dereferenceable namespace', since this
802 norman.x.gray 43 implies that the vocabulary version should be manifestly encoded in
803     the namespace URI.</span> See issue <a
804 norman.x.gray 46 href='@DISTURI@/issues#versioning-3' >[versioning-3]</a>.</p>
805 norman.x.gray 43
806     </div>
807    
808     <div class='section'>
809 norman.x.gray 48 <p class='title'>No restrictions on source files</p>
810 norman.x.gray 43
811     <p>This standard does not place any restrictions on the format of the
812     files managed by the maintenance process, as long as the distributed
813     files are as specified above. See issue
814 norman.x.gray 46 <a href='@DISTURI@/issues#masterformat-1' >[masterformat-1]</a>.</p>
815 norman.x.gray 43 </div>
816    
817     </div>
818    
819 norman.x.gray 36 <div class='section' id='practices'>
820 norman.x.gray 21 <p class='title'>Suggested good practices</p>
821    
822 norman.x.gray 43 <p>This standard imposes a number of requirements on conformant
823 norman.x.gray 57 vocabularies (see <span class='xref' >publishing</span>). In
824 norman.x.gray 43 this section we list a number of good practices that IVOA vocabularies
825     <span class='rfc2119'>should</span> abide by. Some of the
826     prescriptions below are more specific than good-practice guidelines
827     for vocabularies in general.</p>
828    
829 norman.x.gray 57 <p>The adoption of the following guidelines will make it easier to use
830     vocabularies in generic VO applications. However, VO applications
831     <span class='rfc2119'>should</span> be able to accept any vocabulary
832     that complies with the latest SKOS standard
833 alasdair.gray 59 <span class="cite">std:skosref</span> (this does not imply, of
834 norman.x.gray 57 course, that an application will necessarily understand the terms in
835     an alien vocabulary, although the presence of mappings to a known
836     vocabulary should allow it to derive some benefit).</p>
837    
838 norman.x.gray 21 <ol>
839    
840 norman.x.gray 43 <li>Concept identifiers <span class='rfc2119'>should</span> consist
841     only of the letters a-z, A-Z, and numbers 0-9, i.e. no spaces, no
842     exotic letters (e.g. umlauts), and no characters which would make a
843     token inexpressible as part of a URI; since tokens are for use by
844     computers only, this is not a big restriction, since the exotic
845     letters can be used within the labels and documentation if
846     appropriate.</li>
847 norman.x.gray 21
848 norman.x.gray 43 <li>The concept identifiers <span class='rfc2119'>should</span> be
849     kept in human-readable form, directly reflect the implied meaning, and
850     not be semi-random identifiers only (for example, use
851     <q>spiralGalaxy</q>, not "t1234567"); tokens <span
852     class='rfc2119'>should</span> preferably be created via a direct
853     conversion from the preferred label via removable/translation of
854     non-token characters (see above) and sub-token separation via
855 norman.x.gray 57 capitalisation of the first sub-token character (e.g. the label "My
856     favourite idea-label #42" is converted into
857     "MyFavouriteIdeaLabel42").</li>
858 norman.x.gray 21
859 norman.x.gray 43 <li>Labels <span class='rfc2119'>should</span> be in the form of the source vocabulary. When
860     developing a new vocabulary the singular form <span
861     class='rfc2119'>should</span> be preferred,
862 alasdair.gray 25 e.g. <q>spiral galaxy</q>, not "spiral galaxies". <span
863 alasdair.gray 34 class='todo'><a
864     href="http://code.google.com/p/volute/issues/detail?id=1">Open
865     issue</a></span></li>
866 norman.x.gray 21
867 norman.x.gray 43 <li>Each concept <span class='rfc2119'>should</span> have a definition
868 alasdair.gray 25 (<code>skos:definition</code>) that constitutes a short description of
869     the concept which could be adopted by an application using the
870 norman.x.gray 57 vocabulary. Each concept <span class='rfc2119'>should</span> have
871     additional documentation using SKOS Notes or
872     Dublin Core terms as appropriate
873 alasdair.gray 59 (see <span class='cite'>std:skosref</span>)</li>
874 norman.x.gray 21
875 norman.x.gray 57 <li>The language localisation <span class='rfc2119'>should</span> be
876 norman.x.gray 43 declared where appropriate, in preferred labels, alternate labels,
877 norman.x.gray 57 definitions, and the like.</li>
878 norman.x.gray 21
879 alasdair.gray 25 <li>Relationships (<q>broader</q>, <q>narrower</q>, <q>related</q>)
880 norman.x.gray 43 between concepts <span class='rfc2119'>should</span> be present, but
881     are not required; if used, they <span class='rfc2119'>should</span> be
882     complete (thus all <q>broader</q> links have corresponding
883 alasdair.gray 25 <q>narrower</q> links in the referenced entries and <q>related</q>
884     entries link each other).</li>
885 norman.x.gray 21
886 norman.x.gray 43 <li><q>TopConcept</q> entries (see above) <span
887     class='rfc2119'>should</span> be declared and normally consist of
888     those concepts that do not have any <q>broader</q> relationships
889     (i.e. not at a sub-ordinate position in the hierarchy).</li>
890 norman.x.gray 21
891 norman.x.gray 43 <li>Publishers <span class='rfc2119'>should</span> publish
892     <q>mappings</q> between their vocabularies and other commonly used
893     vocabularies. These <span class='rfc2119'>should</span> be external to
894     the defining vocabulary document so that the vocabulary can be used
895     independently of the publisher's mappings. <span class='todo' ><a
896     href='http://code.google.com/p/volute/issues/detail?id=8' >Open
897     issue</a></span>.</li>
898 norman.x.gray 21 </ol>
899    
900 norman.x.gray 57 <!--
901 alasdair.gray 34 <p>These suggestions are by no means trivial – there was
902     considerable discussion within the semantic working group on many of
903     these topics, particularly about token formats (some wanted lower-case
904     only), and singular versus plural forms of the labels (different
905     traditions exist within the international library science
906     community). Obviously, no publisher of an astronomical vocabulary has
907     to adopt these rules, but the adoption of these rules will make it
908     easier to use the vocabularly in external generic VO
909     applications. However, VO applications should be developed to accept
910     any vocabulary that complies with the latest SKOS standard <span
911 alasdair.gray 59 class="cite">std:skosref</span>.</p>
912 norman.x.gray 57 -->
913     </div>
914 norman.x.gray 21
915 alasdair.gray 27 </div>
916 norman.x.gray 21
917    
918 norman.x.gray 57 <div class="section" id='distvocab'>
919 norman.x.gray 21 <p class="title">Example vocabularies</p>
920 norman.x.gray 2
921 norman.x.gray 57 <p>The intent of having the IVOA adopt SKOS as the preferred format for
922 norman.x.gray 21 astronomical vocabularies is to encourage the creation and management
923     of diverse vocabularies by competent astronomical groups, so that
924     users of the VO and related resources can benefit directly and
925 alasdair.gray 25 dynamically without the intervention of the IAU or IVOA. However, we
926     felt it important to provide several examples of vocabularies in the
927     SKOS format as part of the proposal, to illustrate their simplicity
928 norman.x.gray 57 and power, and to provide an immediate vocabulary basis for VO
929 alasdair.gray 25 applications.</p>
930 norman.x.gray 21
931 norman.x.gray 43 <p>See also issue
932 norman.x.gray 46 <a href='@DISTURI@/issues#vocabset-5' >[vocabset-5]</a>. The
933 norman.x.gray 43 identification of sections as normative or informative depends on the
934     outcome of this issue.</p>
935    
936 norman.x.gray 2 <p>We provide a set of SKOS files representing the vocabularies which
937 norman.x.gray 12 have been developed, and mappings between them. These can be
938 alasdair.gray 25 downloaded at the URL</p>
939 norman.x.gray 5 <blockquote>
940     <span class='url'>@BASEURI@/@DISTNAME@.tar.gz</span>
941     </blockquote>
942 norman.x.gray 46 <p class='todo'>Not yet: instead go to
943     <span class='url'>http://code.google.com/p/volute/downloads/list</span></p>
944 norman.x.gray 2
945 alasdair.gray 25 <p><span class='todo' >[To be expanded:] there are no mappings at the
946     moment. Also, the vocabularies are all in a single language, though
947 norman.x.gray 36 translations of the IAU93 thesaurus are available. See also
948     <a href='http://code.google.com/p/volute/issues/detail?id=8' >issue 8</a></span></p>
949 norman.x.gray 2
950 norman.x.gray 36 <div class='section' id='vocab-constellation'>
951 norman.x.gray 57 <p class='title'>A Constellation Name Vocabulary</p>
952 norman.x.gray 21
953 norman.x.gray 57 <p>This vocabulary is presented as a simple example of an astronomical vocabulary for a very particular purpose, e.g. handling constellation information like that commonly encountered in variable star research. For example, <q>SS Cygni</q> is a cataclysmic variable located in the constellation <q>Cygnus</q>. The name of the star uses the genitive form <q>Cygni</q>, but the alternate label <q>SS Cyg</q> uses the standard abbreviation <q>Cyg</q>. Given the constellation vocabulary, all of these forms are recorded together in a computer-manipulatable format. Various incorrect forms should probably be represented in SKOS `hidden labels'</p>
954 norman.x.gray 21
955 norman.x.gray 22 <p>The &lt;skos:ConceptScheme&gt; contains a single &lt;skos:TopConcept&gt;, <q>constellation</q></p>
956 norman.x.gray 36 <br/><br/><center>
957     <table>
958     <tr><th bgcolor="#eecccc">XML Syntax</th>
959     <th width="10"/><th bgcolor="#cceecc">Turtle Syntax</th></tr>
960     <tr><td/></tr>
961     <tr>
962 alasdair.gray 31 <td bgcolor="#eecccc">
963 norman.x.gray 21 <pre>
964 alasdair.gray 31 &lt;skos:Concept rdf:about="#constellation"&gt;
965     &lt;skos:inScheme rdf:resource=""/&gt;
966     &lt;skos:prefLabel&gt;
967     constellation
968     &lt;/skos:prefLabel&gt;
969     &lt;skos:definition&gt;
970     IAU-sanctioned constellation names
971     &lt;/skos:definition&gt;
972     &lt;skos:narrower rdf:resource="#Andromeda"/&gt;
973     ...
974     &lt;skos:narrower rdf:resource="#Vulpecula"/&gt;
975     &lt;/skos:Concept&gt;
976 norman.x.gray 21 </pre>
977 alasdair.gray 31 </td>
978     <td/>
979     <td bgcolor="#cceecc">
980 norman.x.gray 21 <pre>
981 norman.x.gray 22 &lt;#constellation&gt; a :Concept;
982 alasdair.gray 31 :inScheme &lt;&gt;;
983     :prefLabel "constellation";
984     :definition "IAU-sanctioned constellation names";
985     :narrower &lt;#Andromeda&gt;;
986     ...
987     :narrower &lt;#Vulpecula&gt;.
988 norman.x.gray 22 </pre>
989 alasdair.gray 31 </td></tr>
990     </table></center>
991 norman.x.gray 22 <p>and the entry for <q>Cygnus</q> is</p>
992 alasdair.gray 31 <center><table><tr>
993     <td bgcolor="#eecccc">
994 norman.x.gray 22 <pre>
995 alasdair.gray 31 &lt;skos:Concept rdf:about="#Cygnus"&gt;
996     &lt;skos:inScheme rdf:resource=""/&gt;
997     &lt;skos:prefLabel&gt;Cygnus&lt;/skos:prefLabel&gt;
998     &lt;skos:definition&gt;Cygnus&lt;/skos:definition&gt;
999     &lt;skos:altLabel&gt;Cygni&lt;/skos:altLabel&gt;
1000     &lt;skos:altLabel&gt;Cyg&lt;/skos:altLabel&gt;
1001     &lt;skos:broader rdf:resource="#constellation"/&gt;
1002     &lt;skos:scopeNote&gt;
1003     Cygnus is nominative form; the alternative
1004     labels are the genitive and short forms
1005     &lt;/skos:scopeNote&gt;
1006     &lt;/skos:Concept&gt;
1007 norman.x.gray 21 </pre>
1008 alasdair.gray 31 </td>
1009     <td width="10"/>
1010     <td bgcolor="#cceecc">
1011     <pre>
1012     &lt;#Cygnus&gt; a :Concept;
1013     :inScheme &lt;&gt;;
1014     :prefLabel "Cygnus";
1015     :definition "Cygnus";
1016     :altLabel "Cygni";
1017     :altLabel "Cyg";
1018     :broader &lt;#constellation&gt;;
1019 norman.x.gray 36 :scopeNote """Cygnus is nominative form;
1020     the alternative labels are the genitive and
1021     short forms""" .
1022 alasdair.gray 31 </pre>
1023     </td>
1024     </tr></table></center>
1025    
1026 norman.x.gray 21 <p>Note that SKOS alone does not permit the distinct differentiation
1027     of genitive forms and abbreviations, but the use of alternate labels
1028     is more than adequate enough for processing by VO applications where
1029 norman.x.gray 22 the difference between <q>SS Cygni</q>, <q>SS Cyg</q>, and the incorrect form
1030     <q>SS Cygnus</q> is probably irrelevant.</p>
1031 norman.x.gray 2 </div>
1032 norman.x.gray 21
1033 norman.x.gray 36 <div class='section' id='vocab-aa'>
1034 norman.x.gray 57 <p class='title'>The Astronomy &amp; Astrophysics Keyword List</p>
1035 norman.x.gray 21
1036 alasdair.gray 35 <p>
1037     This vocabulary is a set of keywords made available on a web page by
1038     the publisher of the journal.
1039     The intended usage of the vocabulary is to tag articles with
1040     descriptive keywords to aid searching for articles on a particular
1041     topic.
1042     </p>
1043    
1044     <p>
1045     The keywords are organised into categories which have been modelled as
1046 norman.x.gray 57 hierarchical relationships.
1047 alasdair.gray 35 Additionally, some of the keywords are grouped into collections which
1048     has been mirrored in the SKOS version.
1049 alasdair.gray 59 The vocabulary contains no definitions or related links as these are
1050     not provided in the original keyword list, and only a handful of
1051     alternative labels and scope notes that are present in the original
1052 alasdair.gray 35 keyword list.
1053     </p>
1054    
1055 norman.x.gray 21 </div>
1056    
1057 norman.x.gray 36 <div class='section' id='vocab-aoim'>
1058 norman.x.gray 57 <p class='title'>The AOIM Taxonomy</p>
1059 norman.x.gray 21
1060 alasdair.gray 35 <p>
1061     This vocabulary is published by the IVOA to allow images to be tagged
1062     with keywords that are relevant for the public.
1063     It consists of a set of keywords organised into an enumerated
1064     hierarchical structure.
1065     Each term consists of a taxonomic number and a label.
1066 alasdair.gray 59 There are no definitions, scope notes, or cross references.
1067 alasdair.gray 35 </p>
1068 norman.x.gray 21
1069 norman.x.gray 36 <p>When converting the AOIM into SKOS, it was decided to model the
1070 alasdair.gray 35 taxonomic number as an alternative label.
1071     Since there are duplication of terms, the token for a term consists of
1072     the full hierarchical location of the term.
1073 norman.x.gray 36 Thus, it is possible to distinguish between</p>
1074 alasdair.gray 35 <pre>
1075     Planet -> Feature -> Surface -> Canyon
1076     </pre>
1077 norman.x.gray 36 <p>and</p>
1078 alasdair.gray 35 <pre>
1079     Planet -> Satellite -> Feature -> Surface -> Canyon
1080     </pre>
1081 norman.x.gray 36 <p>which have the tokens <code>PlanetFeatureSurfaceCanyon</code> and
1082 alasdair.gray 35 <code>PlanetSatelliteFeatureSurfaceCanyon</code> respectively.
1083     </p>
1084    
1085 norman.x.gray 21 </div>
1086    
1087 norman.x.gray 36 <div class='section' id='vocab-ucd1'>
1088 norman.x.gray 57 <p class='title'>The UCD1+ Vocabulary</p>
1089 norman.x.gray 21
1090 alasdair.gray 25 <p>The UCD standard is an officially sanctioned and managed vocabulary
1091     of the IVOA. The normative document is a simple text file containing
1092     entries consisting of tokens (e.g. <code>em.IR</code>), a short
1093     description, and usage information (<q>syntax codes</q> which permit
1094 norman.x.gray 21 UCD tokens to be concatenated). The form of the tokens implies a
1095 alasdair.gray 25 natural hierarchy: <code>em.IR.8-15um</code> is obviously a narrower
1096     term than <code>em.IR</code>, which in turn is narrower than
1097     <code>em</code>.</p>
1098 norman.x.gray 21
1099     <p>Given the structure of the UCD1+ vocabulary, the natural
1100     translation to SKOS consists of preferred labels equal to the original
1101     tokens (the UCD1 words include dashes and periods), vocabulary tokens
1102 norman.x.gray 57 created using guidelines in <span class='xref'
1103 norman.x.gray 36 >practices</span> (e.g., "emIR815Um" for
1104 norman.x.gray 22 <code>em.IR.8-15um</code>), direct use of the definitions, and the syntax codes
1105 norman.x.gray 57 placed in usage documentation: <code>&lt;skos:scopeNote&gt;UCD syntax code: P&lt;/skos:scopeNote&gt;</code></p>
1106 norman.x.gray 21
1107     <p>Note that the SKOS document containing the UCD1+ vocabulary does
1108     NOT consistute the official version: the normative document is still
1109     the text list. However, on the long term, the IVOA may decide to make
1110     the SKOS version normative, since the SKOS version contains all of the
1111     information contained in the original text document but has the
1112     advantage of being in a standard format easily read and used by any
1113 alasdair.gray 59 application on the semantic web whilst still being usable in the
1114     current ways.</p>
1115 norman.x.gray 21
1116     </div>
1117    
1118 norman.x.gray 36 <div class='section' id='vocab-iau93'>
1119 norman.x.gray 57 <p class='title'>The 1993 IAU Thesaurus</p>
1120 norman.x.gray 21
1121 norman.x.gray 57 <p>The IAU Thesaurus consists of concepts with mostly capitalised
1122 alasdair.gray 59 labels and a rich set of thesaurus relationships (<q>BT</q> for
1123     "broader term", <q>NT</q> for <q>narrower term</q>, and <q>RT</q> for
1124     <q>related term</q>). The thesaurus also contains <q>U</q> (for
1125 norman.x.gray 36 <q>use</q>) and <q>UF</q> (<q>use for</q>) relationships. In a SKOS
1126     model of a vocabulary these are captured as alternative labels. A
1127     separate document contains translations of the vocabulary terms in
1128     five languages: English, French, German, Italian, and
1129 norman.x.gray 57 Spanish. Enumerable concepts are plural (e.g. <q>SPIRAL
1130 norman.x.gray 36 GALAXIES</q>) and non-enumerable concepts are singular
1131     (e.g. <q>STABILITY</q>). Finally, there are some usage hints like
1132     <q>combine with other</q></p>
1133    
1134     <p>In converting the IAU Thesaurus to SKOS, we have been as faithful
1135     as possible to the original format of the thesaurus. Thus, preferred
1136     labels have been kept in their uppercase format.</p>
1137    
1138     <p>The IAU Thesaurus has been unmaintained since its initial production in
1139     1993; it is therefore significantly out of date in places. This
1140     vocabulary is published for the sake of completeness, and to make the
1141     link between the evolving vocabulary work and any uses of the 1993
1142     vocabulary which come to light. We do not expect to make any future
1143     maintenance changes to this vocabulary, and would expect the IVOAT
1144 norman.x.gray 57 vocabulary, based on this one, to be used instead (see <span class='xref'>vocab-ivoat</span>).</p>
1145 norman.x.gray 36
1146     </div>
1147    
1148     <div class='section' id='vocab-ivoat'>
1149     <p class='title'>Towards an IVOA Thesaurus</p>
1150    
1151 norman.x.gray 22 <p>While it is true that the adoption of SKOS will make it easy to
1152 norman.x.gray 21 publish and access different astronomical vocabularies, the fact is
1153 norman.x.gray 22 that there is no vocabulary which makes it easy to jump-start the
1154 norman.x.gray 21 use of vocabularies in generic astrophysical VO applications: each of
1155     the previously developed vocabularies has their own limits and
1156     biases. For example, the IAU Thesaurus provides a large number of
1157 norman.x.gray 22 entries, copious relationships, and translations to four other languages,
1158 norman.x.gray 21 but there are no definitions, many concepts are now only useful for
1159     historical purposes (e.g. many photographic or historical instrument
1160     entries), some of the relationships are false or outdated, and many
1161     important or newer concepts and their common abbreviations are
1162     missing.</p>
1163    
1164 norman.x.gray 22 <p>Despite its faults, the IAU Thesaurus constitutes a very extensive
1165 norman.x.gray 21 vocabulary which could easily serve as the basis vocabulary once
1166 norman.x.gray 36 we have removed its most egregious faults and extended it to cover the
1167 norman.x.gray 21 most obvious semantic holes. To this end, a heavily revised IAU
1168     thesaurus is in preparation for use within the IVOA and other
1169     astronomical contexts. The goal is to provide a general vocabulary
1170 norman.x.gray 57 foundation to which other, more specialised, vocabularies can be added
1171 norman.x.gray 22 as needed, and to provide a good <q>lingua franca</q> for the creation of
1172 norman.x.gray 21 vocabulary mappings.</p>
1173     </div>
1174     </div> <!-- End: Example vocabularies -->
1175    
1176 alasdair.gray 59 <div class='section' id='distmappings'>
1177     <p class='title'>Example Mapping</p>
1178 norman.x.gray 21
1179 alasdair.gray 59 <p>To show how mappings can be expressed between two vocabularies, we
1180     have provided one example mapping document which maps the concepts in
1181     the A&amp;A Keywords vocabulary to the concepts in the AOIM
1182     vocabulary.
1183     All four types of mappings were required.
1184     Since all the mapping relationships have inverse relationships
1185     defined, the mapping document can also be used to infer the set of
1186     mappings from the AOIM vocabulary to the A&amp;A keywords.
1187     </p>
1188    
1189     <p>
1190     To provide provenence information about the set of mappings in a
1191     document, dublin core metadata is included in the mapping document.
1192     </p>
1193    
1194     </div>
1195    
1196 norman.x.gray 2 <div class="appendices">
1197    
1198     <div class="section-nonum" id="bibliography">
1199     <p class="title">Bibliography</p>
1200     <?bibliography rm-refs ?>
1201     </div>
1202    
1203     <p style="text-align: right; font-size: x-small; color: #888;">
1204     $Revision$ $Date$
1205     </p>
1206    
1207     </div>
1208    
1209     </body>
1210     </html>

Properties

Name Value
svn:keywords Author Date Revision

msdemlei@ari.uni-heidelberg.de
ViewVC Help
Powered by ViewVC 1.1.26