
Contents of /trunk/projects/vocabularies/doc/vocabularies.xml

Revision 12
Thu Dec 6 17:28:48 2007 UTC, by norman.x.gray
File MIME type: text/xml
File size: 14239 byte(s)
Adjust release date in configure.ac
Mild tweaks to doc
This is release vocabularies-0.01

<?xml version="1.0" encoding="utf-8"?>
<!-- Based on template at
http://www.ivoa.net/Documents/templates/ivoa-tmpl.html -->
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcterms="http://purl.org/dc/terms/"
xml:lang="en" lang="en">

<head>
<title>Vocabularies in the Virtual Observatory</title>
<link rev="made" href="http://nxg.me.uk/norman/#norman" title="Norman Gray"/>
<meta name="author" content="Norman Gray"/>
<meta name="DC.subject" content="IVOA, Virtual Observatory, Vocabulary"/>
<meta name="rcsdate" content="$Date$"/>
<link href="http://www.ivoa.net/misc/ivoa_wd.css" rel="stylesheet" type="text/css"/>
<!-- style: make the ToC a little more compact, and without bullets -->
<style type="text/css">
div.toc ul { list-style: none; padding-left: 1em; }
span.userinput { font-weight: bold; }
span.url { font-family: monospace; }
q { color: #666; }
q:before { content: "“"; }
q:after { content: "”"; }
.todo { background: #ff7; }
</style>
</head>
<body>
<div class="head">
<table>
<tr><td><a href="http://www.ivoa.net/"><img alt="IVOA logo" src="http://ivoa.net/icons/ivoa_logo_small.jpg" border="0"/></a></td></tr>
</table>

<h1>Vocabularies in the Virtual Observatory, v@VERSION@</h1>
<h2>IVOA Note/Working Draft, @RELEASEDATE@ [DRAFT $Revision$]</h2>
<!-- $Revision$ $Date$ -->

<dl>
<dt>Working Group</dt>
<dd><em><a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaSemantics">Semantics</a></em></dd>

<dt>This version</dt>
<dd>@BASEURI@</dd> <!-- XXX adjust current/latest URI from Makefile -->

<dt>Latest version</dt>
<dd>@BASEURI@</dd>

<dt>Editors</dt>
<dd>TBD</dd>
<dt>Authors</dt>
<dd>
<!-- The following are the folk that I definitely know have contributed
text or code to this document: add others as appropriate -->
<span property="dc:creator">Alasdair J G Gray</span>,
<span property="dc:creator">Norman Gray</span>,
<span property="dc:creator">Frederick V Hessman</span> and
<span property="dc:creator">Andrea Preite Martinez</span>
</dd>
</dl>
<hr/>
</div>
<div class="section-nonum" id="abstract">
<p class="title">Abstract</p>

<div class="abstract">
<p>Use SKOS. <span class='todo' >might need some expansion...</span></p>
</div>

</div>
<div class="section-nonum" id="status">
<p class="title">Status of this document</p>

<p>This is an IVOA Note/Working Draft. The first release of this document was
<span property="dc:date">@RELEASEDATE@</span>.</p>

<p>This document is an IVOA Working Draft for review by IVOA members
and other interested parties. It is a draft document and may be
updated, replaced, or obsoleted by other documents at any time. It is
inappropriate to use IVOA Working Drafts as reference materials or to
cite them as other than <q>work in progress</q>.</p>

<p>A list of current IVOA Recommendations and other technical
documents can be found at
<a href="http://www.ivoa.net/Documents/"><code>http://www.ivoa.net/Documents/</code></a>.</p>

<h3>Acknowledgments</h3>

<p>None so far.</p>

</div>
<h2><a id="contents" name="contents">Table of Contents</a></h2>
<?toc?>

<hr/>
<div class="section" id="introduction">
<p class="title">Introduction</p>

<div class="section">
<p class="title">Vocabularies in astronomy</p>

<p>Astronomical information of relevance to the Virtual Observatory
(VO) is not confined to quantities easily expressed in a catalogue or
a table. Fairly simple things like position on the sky, brightness
in some units, times measured in some frame, redshifts, classifications
or other similar quantities are easily manipulated and stored in
VOTables and can now be identified using IVOA UCDs <span class="cite">std:ucd</span>. However, astrophysical concepts and
quantities involve a wide variety of names, identifications,
classifications and associations, most of which cannot be described or
labelled via UCDs.</p>
<p>Although there has been some progress towards creating an ontology of
astronomical object types <span class="cite">std:ivoa-astro-onto</span> (an ontology is a systematic formal
description of a set of concepts and their relations with each other),
such a formal approach may not be necessary, and may be
counterproductive <span class='todo'>[AG: Not sure counterproductive is the right argument here. Ontologies do not meet all of the navigation and retrieval use cases.]</span>. An ontology is necessary if we are to have a
computer (appear to) `understand' something of a domain, but in the
present case we are more concerned with the related but distinct
problem of letting human users find resources of interest. The
most appropriate technology therefore derives from the Information Science
community: that of <em>controlled vocabularies, taxonomies and
thesauri</em>.</p>
<p>One of the best examples of the need for a simple vocabulary within
the VO is VOEvent <span class="cite">std:voevent</span>, the VO
standard for handling astronomical events. If someone broadcasts, or
`publishes', the occurrence of an event, the implication is that
someone else is going to want to respond to it. No institution is
interested in all possible events, however, so some standardised information
about what the event `is about' is necessary, in a form which
ensures that the parties can communicate effectively. If a `burst' is
announced, is it a Gamma Ray Burst due to the collapse of a star in a
distant galaxy, a solar flare, or the brightening of a stellar or AGN
accretion disk? If a publisher doesn't use the label one might have
expected, how is one to guess what other equivalent labels might have
been used?</p>
<p>There have been a number of attempts to create astronomical
vocabularies (in the present note we will not need to distinguish
vocabularies, taxonomies and thesauri, and will use the term
`vocabulary' for all three cases).</p>
<ul>
<li>The <em>Second Reference Dictionary of the Nomenclature of
Celestial Objects</em> <span class="cite">lortet94</span>, <span class="cite">lortet94a</span> contains 500 paper pages of
astronomical nomenclature.</li>
<li>For decades professional journals have used a set of reasonably
compatible keywords to help classify the content of whole articles.
These keywords have been analysed by Preite Martinez &amp; Lesteven
<span class="cite">preitemartinez07</span>, who derived from them a set
of common keywords constituting one of the potential bases for a
fuller VO vocabulary. The same authors also attempted to derive a set
of common concepts by analysing the contents of abstracts in journal
articles; the resulting list should contain more up-to-date
tokens/concepts than the old list of journal keywords. A similar but
less formal attempt was made by Hessman for the VOEvent working group,
resulting in a similar list <span class="todo">Find Hessman05
reference, and check differences from the A&amp;A list</span>.</li>
<li>Astronomical databases generally use simple sets of keywords –
sometimes hierarchically organized – to help users query
the databases. Two examples from totally different contexts are the
list of object types used in the <a href="http://simbad.u-strasbg.fr">Simbad</a> database and the search keywords used in the educational
Hands-On Universe image database portal.</li>

<li>The Astronomical Outreach Imagery (AOI) working group has created a simple
taxonomy for helping to classify images used for educational or public
relations purposes. See <span class='url'>http://ivoa.net/Documents/latest/AOIMetadata.html</span>.</li>
<li>The Hands-On Universe project (see <span class='url'
>http://sunra.lbl.gov/telescope2/index.html</span>) has maintained a
public database of images for use by the general public since the
1990s. The images are very heterogeneous, since they are gathered from
a variety of professional, semi-professional, amateur, and school
observatories, so a simple taxonomy is used to help users browse
the database.</li>
<li>Remote Telescope Markup Language <span
class="cite">std:rtml</span>, a document definition for the transfer
of observing requests that has been adopted by the Heterogeneous
Telescope Network (HTN) Consortium and is indirectly supported by the
VOEvent protocol, currently contains several telescope and
observation-related taxonomies of terms (e.g. for devices, filters,
objects). <span class='todo'>Confirm status: does this need to be
converted to SKOS? [AG]. Possibly: chase with Rick? [NG]</span></li>
<li>In 1993, Shobbrook and Shobbrook published an Astronomy Thesaurus
endorsed by the IAU (see <span class='url'
>http://www.aao.gov.au/lib/thesaurus.html</span> <span class='todo'
>What's the correct full reference for this?</span>). This collection of
just short of 3000 terms, in four languages, is a valuable resource,
but has unfortunately been little used in recent years. Its very
size, which gives it expressive power, is also a disadvantage,
since it makes the thesaurus hard to use.</li>

</ul>
</div>
<div class="section">
<p class="title">Formalising and managing multiple vocabularies</p>

<p>We find ourselves in the situation where there are multiple
vocabularies in use, describing a broad range of resources of interest
to professional and amateur astronomers, and members of the public.
These different vocabularies use different terms and different
relationships to support the different constituencies they cater for.
For example, `delta Sct' and `RR Lyr' are terms one would hope to find
in a vocabulary aimed at professional astronomers, associated with the
notion of `variable star'; one would hope <em>not</em> to find such
technical terms in a vocabulary intended to support outreach
activities.</p>
<p>One approach to this problem is to create a single consensus
vocabulary, which draws terms from the various existing vocabularies
to create a new vocabulary which is able to express anything its users
might desire. The problem with this is that such an effort would be
very expensive, both for those creating it, in time and effort,
and for its potential users, who would have to learn
to navigate around it and recognise the new terms, and who would have to be
supported in using the new terms correctly (or, more often,
incorrectly).</p>
<p>The alternative approach to the problem is to evade it, and this is
the approach taken in this Draft. Rather than deprecating the
existence of multiple overlapping vocabularies, we embrace it,
formalise all of them, and formally declare the relationships between
them. This means that:</p>
<ul>
<li>The various vocabularies can evolve separately, on their own
timescales, managed by the IVOA or by third parties;</li>
<li>Users can use the vocabulary most appropriate to their situation,
either when annotating resources, or when querying them;</li>
<li>We retain the investments made in vocabularies by users and
resource owners.</li>
</ul>
<p>To this end we present in this Draft formalised versions of a
number of existing vocabularies, encoded as SKOS vocabularies <span class="cite">std:skoscore</span>.</p>
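<p>As a sketch only of what `formally declaring the relationships'
between two vocabularies could look like, the RDF/XML fragment below
links terms in a hypothetical professional vocabulary to a term in a
hypothetical outreach vocabulary. It assumes the mapping properties
<code>skos:broadMatch</code> and <code>skos:exactMatch</code> as defined
in the SKOS Reference later published by the W3C, which may differ from
the mapping vocabulary available to the draft cited here. All of the
<span class='url'>http://example.org/</span> URIs are invented for this
illustration, and no such mappings are distributed with this document.</p>
<pre><![CDATA[
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- Hypothetical professional term: its nearest outreach
       counterpart is the broader notion `variable star' -->
  <skos:Concept rdf:about="http://example.org/vocab/professional#rrLyr">
    <skos:prefLabel xml:lang="en">RR Lyr</skos:prefLabel>
    <skos:broadMatch
        rdf:resource="http://example.org/vocab/outreach#variableStar"/>
  </skos:Concept>

  <!-- Hypothetical case of the same concept in both vocabularies -->
  <skos:Concept rdf:about="http://example.org/vocab/professional#variableStar">
    <skos:prefLabel xml:lang="en">variable star</skos:prefLabel>
    <skos:exactMatch
        rdf:resource="http://example.org/vocab/outreach#variableStar"/>
  </skos:Concept>

</rdf:RDF>
]]></pre>
<p>With mappings of this kind, a resource annotated with the
professional term `RR Lyr' could be found by an outreach user searching
on `variable star', without either vocabulary having to adopt the
other's terms.</p>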
</div>

</div>
<div class="section">
<p class="title">Formalising the Vocabularies</p>

<p>After a number of online and face-to-face discussions, the authors
brokered a consensus within the IVOA community that the published formats of
formalised vocabularies should include at least SKOS (Simple Knowledge
Organization System), a W3C draft standard which applies RDF to the
field of knowledge organisation <span
class="cite">std:skoscore</span>. SKOS draws on long experience
within the Library and Information Science community to address a
well-defined set of problems to do with the indexing and retrieval of
information and resources; as such, it is a close match to the problem
this working group is addressing.</p>
<p>ISO 5964 <span class='cite' >std:iso5964</span> defines a number of
the relevant terms (ISO 5964:1985 = BS 6723:1985; see also <span
class='cite' >std:bs8723-1</span> and <span class='cite'
>std:z39.19</span>), and some of the (lightweight) theoretical
background. The only technical distinction relevant to this document
is that between `vocabulary' and `thesaurus': BS-8723-1 defines a
thesaurus as a</p>
<blockquote>
controlled vocabulary in which concepts are represented by preferred
terms, formally organized so that paradigmatic relationships between
the concepts are made explicit, and the preferred terms are
accompanied by lead-in entries for synonyms or quasi-synonyms. NOTE:
The purpose of a thesaurus is to guide both the indexer and the
searcher to select the same preferred term or combination of preferred
terms to represent a given subject. (BS-8723-1, sect. 2.39)
</blockquote>
<p>with a similar definition in ISO-5964 sect. 3.16. The paradigmatic
relationships in question are those relating a term to a `broader',
`narrower' or more generically `related' term. The operational
definition of `broader term' is that a resource retrieved
by a given term will also be retrieved by that term's `broader term'.
This is not a subsumption relationship, as there is no implication
that the concept referred to by a narrower term is of the same
<em>type</em> as that referred to by a broader term.</p>
<p>Thus a vocabulary (SKOS or otherwise) is not an ontology. It has
lighter and looser semantics than an ontology, and is specialised for
the restricted case of resource retrieval.</p>
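<p>As an illustration only, the `broader' and `narrower' relationships
described above, together with a lead-in entry for a synonym, might be
written in SKOS, serialised as RDF/XML, as in the following sketch. The
concept URIs under <span class='url'>http://example.org/</span> and the
alternative labels are assumptions made for this example; they are not
taken from any of the vocabularies distributed with this document.</p>
<pre><![CDATA[
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xml:base="http://example.org/vocab/demo">

  <!-- The broader term: lists its narrower terms -->
  <skos:Concept rdf:about="#variableStar">
    <skos:prefLabel xml:lang="en">variable star</skos:prefLabel>
    <skos:narrower rdf:resource="#deltaSct"/>
    <skos:narrower rdf:resource="#rrLyr"/>
  </skos:Concept>

  <!-- Narrower terms: prefLabel is the preferred term, altLabel a
       lead-in entry for a synonym (labels are illustrative) -->
  <skos:Concept rdf:about="#deltaSct">
    <skos:prefLabel xml:lang="en">delta Sct</skos:prefLabel>
    <skos:altLabel xml:lang="en">delta Scuti star</skos:altLabel>
    <skos:broader rdf:resource="#variableStar"/>
  </skos:Concept>

  <skos:Concept rdf:about="#rrLyr">
    <skos:prefLabel xml:lang="en">RR Lyr</skos:prefLabel>
    <skos:altLabel xml:lang="en">RR Lyrae star</skos:altLabel>
    <skos:broader rdf:resource="#variableStar"/>
  </skos:Concept>

</rdf:RDF>
]]></pre>
<p>Read operationally, a resource indexed with `RR Lyr' should also be
retrieved by a search on `variable star'. Nothing in the fragment
asserts that the narrower concept is of the same <em>type</em> as the
broader one.</p>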
<p><span class='todo' >What is to be the format of the `master' files?
SKOS or mildly-formatted plain text?</span></p>
<div class="section">
<p class="title">SKOS files (normative)</p>

<p>We provide a set of SKOS files representing the vocabularies which
have been developed, and mappings between them. These can be
downloaded at the URL</p>
<blockquote>
<span class='url'>@BASEURI@/@DISTNAME@.tar.gz</span>
</blockquote>

<p><span class='todo' >To be expanded: there are no mappings at the moment.
Also, the vocabularies are all in a single language, though
translations of the IAU93 thesaurus are available.</span></p>

</div>
</div>
<div class="appendices">

<div class="section-nonum" id="bibliography">
<p class="title">Bibliography</p>
<?bibliography rm-refs ?>
</div>

<p style="text-align: right; font-size: x-small; color: #888;">
$Revision$ $Date$
</p>

</div>

</body>
</html>


Name Value
svn:keywords Author Date Revision
