Contents of /trunk/projects/vocabularies/doc/vocabularies.xml


Revision 2 - (show annotations)
Mon Dec 3 22:17:21 2007 UTC (12 years, 11 months ago) by norman.x.gray
File MIME type: text/xml
File size: 14069 byte(s)
Create initial structure, and copy draft note-so-far from Explicator repository.

1 <?xml version="1.0" encoding="utf-8"?>
2 <!-- Based on template at
3 http://www.ivoa.net/Documents/templates/ivoa-tmpl.html -->
4 <html xmlns="http://www.w3.org/1999/xhtml"
5 xmlns:dc="http://purl.org/dc/elements/1.1/"
6 xmlns:dcterms="http://purl.org/dc/terms/"
7 xml:lang="en" lang="en">
8
9 <head>
10 <title>Vocabularies in the Virtual Observatory</title>
11 <link rev="made" href="http://nxg.me.uk/norman/#norman" title="Norman Gray"/>
12 <meta name="author" content="Norman Gray"/>
13 <meta name="DC.subject" content="IVOA, Virtual Observatory, Vocabulary"/>
14 <meta name="rcsdate" content="$Date$"/>
15 <link href="http://www.ivoa.net/misc/ivoa_wd.css" rel="stylesheet" type="text/css"/>
16 <!-- style: make the ToC a little more compact, and without bullets -->
17 <style type="text/css">
18 div.toc ul { list-style: none; padding-left: 1em; }
19 span.userinput { font-weight: bold; }
20 span.url { font-family: monospace; }
21 q { color: #666; }
22 q:before { content: "“"; }
23 q:after { content: "”"; }
24 .todo { background: #ff7; }
25 </style>
26 </head>
27
28 <body>
29 <div class="head">
30 <table>
31 <tr><td><a href="http://www.ivoa.net/"><img alt="IVOA logo" src="http://ivoa.net/icons/ivoa_logo_small.jpg" border="0"/></a></td></tr>
32 </table>
33
34 <h1>Vocabularies in the Virtual Observatory, v@VERSION@</h1>
35 <h2>IVOA Working Draft, @RELEASEDATE@</h2>
36 <!-- $Revision$ $Date$ -->
37
38 <dl>
39 <dt>Working Group</dt>
40 <dd><em><a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaSemantics">Semantics</a></em></dd>
41
42 <dt>This version</dt>
43 <dd>@BASEURI@</dd> <!-- XXX adjust current/latest URI from Makefile -->
44
45 <dt>Latest version</dt>
46 <dd>@BASEURI@</dd>
47
48 <dt>Editors</dt>
49 <dd>TBD</dd>
50
51 <dt>Authors</dt>
52 <dd>
53 <!-- The following are the folk that I'm aware have contributed text or code to this document: add others as appropriate -->
54 <span property="dc:creator">Alasdair J G Gray</span>,
55 <span property="dc:creator">Norman Gray</span>,
56 <span property="dc:creator">Frederick V Hessman</span> and
57 <span property="dc:creator">Andrea Preite Martinez</span>
58 </dd>
59 </dl>
60 <hr/>
61 </div>
62
63 <div class="section-nonum" id="abstract">
64 <p class="title">Abstract</p>
65
66 <div class="abstract">
67 <p>Use SKOS.</p>
68 </div>
69
70 </div>
71
72 <div class="section-nonum" id="status">
73 <p class="title">Status of this document</p>
74
75 <p>This is an IVOA Working Draft. The first release of this document was
76 <span property="dc:date">@RELEASEDATE@</span>.</p>
77
78 <p>This document is an IVOA Working Draft for review by IVOA members
79 and other interested parties. It is a draft document and may be
80 updated, replaced, or obsoleted by other documents at any time. It is
81 inappropriate to use IVOA Working Drafts as reference materials or to
82 cite them as other than <q>work in progress</q>.</p>
83
84 <p>A list of current IVOA Recommendations and other technical
85 documents can be found at
86 <a href="http://www.ivoa.net/Documents/"><code>http://www.ivoa.net/Documents/</code></a>.</p>
87
88 <h3>Acknowledgments</h3>
89
90 <p>None so far.</p>
91
92 </div>
93
94 <h2><a id="contents" name="contents">Table of Contents</a></h2>
95 <?toc?>
96
97 <hr/>
98
99 <div class="section" id="introduction">
100 <p class="title">Introduction</p>
101
102 <div class="section">
103 <p class="title">Vocabularies in astronomy</p>
104
105 <p>Astronomical information of relevance to the Virtual Observatory
106 (VO) is not confined to quantities easily expressed in a catalogue or
107 a table. Fairly simple things like position on the sky, brightness
108 in some units, times measured in some frame, redshifts, classifications
109 or other similar quantities are easily manipulated and stored in
110 VOTables and can now be identified using IVOA UCDs <span class="cite">std:ucd</span>. However, astrophysical concepts and
111 quantities consist of a wide variety of names, identifications,
112 classifications and associations, most of which cannot be described or
113 labelled via UCDs.</p>
114
115 <p>Although there has been some progress towards creating an ontology of
116 astronomical object types <span class="cite">std:ivoa-astro-onto</span> (an ontology is a systematic formal
117 description of a set of concepts and their relations with each other),
118 such a formal approach may not be necessary, and may be
119 counterproductive <span class="todo">[AG: Not sure counterproductive is the right argument here. Ontologies do not meet all of the navigation and retrieval use cases.]</span>. An ontology is necessary if we are to have a
120 computer (appear to) `understand' something of a domain, but in the
121 present case, we are more concerned with the related but distinct
122 problem of letting human users find resources of interest, and so the
123 most appropriate technology derives from the Information Science
124 community, that of <em>controlled vocabularies, taxonomies and
125 thesauri</em>.</p>
126
127 <p>One of the best examples of the need for a simple vocabulary within
128 the VO is VOEvent <span class="cite">std:voevent</span>, the VO
129 standard for handling astronomical events. If someone broadcasts, or
130 `publishes', the occurrence of an event, the implication is that
131 someone else is going to want to respond to it. No institution is
132 interested in all possible events, however, so some standardised information
133 about what the event `is about' is necessary, in a form which
134 ensures that the parties can communicate effectively. If a `burst' is
135 announced, is it a Gamma Ray Burst due to the collapse of a star in a
136 distant galaxy, a solar flare, or the brightening of a stellar or AGN
137 accretion disk? If a publisher doesn't use the label one might have
138 expected, how is one to guess what other equivalent labels might have
139 been used?</p>
140
141 <p>There have been a number of attempts to create astronomical
142 vocabularies (in the present note we will not need to distinguish
143 vocabularies, taxonomies and thesauri, and will use the term
144 `vocabulary' for all three cases).</p>
145 <ul>
146 <li>The <em>Second Reference Dictionary of the Nomenclature of
147 Celestial Objects</em> <span class="cite">lortet94</span>, <span class="cite">lortet94a</span> contains 500 printed pages of
148 astronomical nomenclature.</li>
149
150 <li>For decades professional journals have used a set of reasonably
151 compatible keywords to help classify the content of whole articles.
152 These keywords have been analysed by Preite Martinez &amp; Lesteven
153 <span class="cite">preitemartinez07</span>, from which they derived a set
154 of common keywords constituting one of the potential bases for a
155 fuller VO vocabulary. The same authors also attempted to derive a set
156 of common concepts by analysing the contents of abstracts in journal
157 articles; the resulting list should contain more up-to-date
158 tokens/concepts than the old list of journal keywords. A similar but
159 less formal attempt was made by Hessman for the VOEvent working group,
160 resulting in a similar list <span class="todo">Find Hessman05
161 reference, and check differences from the A&amp;A list</span>.</li>
162
163 <li>Astronomical databases generally use simple sets of keywords –
164 sometimes hierarchically organized – to help users query
165 them. Two examples from totally different contexts are the
166 list of object types used in the <a href="http://simbad.u-strasbg.fr">Simbad</a> database and the search keywords used in the educational
167 Hands-On Universe image database portal.</li>
168
169 <li>The Astronomical Outreach Imagery (AOI) working group has created a simple
170 taxonomy to help classify images used for education or public
171 relations. See <span class='url'>http://ivoa.net/Documents/latest/AOIMetadata.html</span>.</li>
172
173 <li>The Hands-On Universe project (see <span class='url'
174 >http://sunra.lbl.gov/telescope2/index.html</span>) has maintained a
175 public database of images for use by the general public since the
176 1990s. The images are very heterogeneous, since they are gathered from
177 a variety of professional, semi-professional, amateur, and school
178 observatories, so a simple taxonomy is used to help users browse
179 the database.</li>
180
181 <li>Remote Telescope Markup Language <span
182 class="cite">std:rtml</span>, a document definition for the transfer
183 of observing requests that has been adopted by the Heterogeneous
184 Telescope Network (HTN) Consortium and is indirectly supported by the
185 VOEvent protocol, currently contains several telescope and
186 observation-related taxonomies of terms (e.g. for devices, filters,
187 objects).<span class='todo'>Confirm status: does this need to be
188 converted to SKOS? [AG]. Possibly: chase with Rick? [NG]</span></li>
189
190 <li>In 1993, Shobbrook and Shobbrook published an Astronomy Thesaurus
191 endorsed by the IAU (see <span class='url'
192 >http://www.aao.gov.au/lib/thesaurus.html</span> <span class='todo'
193 >What's the correct citation for this?</span>). This collection of
194 just short of 3000 terms, in four languages, is a valuable resource,
195 but has unfortunately been little used in recent years. Its very
196 size, which gives it expressive power, is also a disadvantage, since
197 it makes the thesaurus hard to use.</li>
198
199 </ul>
200 </div>
201
202 <div class="section">
203 <p class="title">Formalising and managing multiple vocabularies</p>
204
205 <p>We find ourselves in the situation where there are multiple
206 vocabularies in use, describing a broad range of resources of interest
207 to professional and amateur astronomers, and members of the public.
208 These different vocabularies use different terms and different
209 relationships to support the different constituencies they cater for.
210 For example, `delta Sct' and `RR Lyr' are terms one would hope to find
211 in a vocabulary aimed at professional astronomers, associated with the
212 notion of `variable star'; one would hope <em>not</em> to find such
213 technical terms in a vocabulary intended to support outreach
214 activities.</p>
215
216 <p>One approach to this problem is to create a single consensus
217 vocabulary, which draws terms from the various existing vocabularies
218 to create a new vocabulary which is able to express anything its users
219 might desire. The problem with this is that such an effort would be
220 very expensive, both for those creating it, in time and effort,
221 and for the potential users, who would have to learn
222 to navigate around it and recognise the new terms, and who would have to be
223 supported in using the new terms correctly (or, more often,
224 incorrectly).</p>
225
226 <p>The alternative approach to the problem is to evade it, and this is
227 the approach taken in this Draft. Rather than deprecating the
228 existence of multiple overlapping vocabularies, we embrace it,
229 formalise all of them, and formally declare the relationships between
230 them. This means that:</p>
231 <ul>
232 <li>The various vocabularies can evolve separately, on their own
233 timescales, managed by the IVOA or by third parties;</li>
234 <li>Users can use the vocabulary most appropriate to their situation,
235 either when annotating resources, or when querying them;</li>
236 <li>We retain the investments made in vocabularies by users and
237 resource owners.</li>
238 </ul>
239
240 <p>To this end we present in this Draft formalised versions of a
241 number of existing vocabularies, encoded as SKOS vocabularies <span class="cite">std:skoscore</span>.</p>
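<p>As an illustration of what declaring a relationship between two
vocabularies might involve, the following is a minimal sketch in RDF/XML.
The two namespaces (<code>http://example.org/professional#</code> and
<code>http://example.org/outreach#</code>) and the concept identifiers are
hypothetical, invented for this example only. The mapping properties
<code>skos:exactMatch</code> and <code>skos:broadMatch</code> are assumed
to be those of the W3C SKOS Reference; the draft SKOS specification cited
above may name its mapping properties differently.</p>
<pre>
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"&gt;

  &lt;!-- Hypothetical: both vocabularies have a concept for variable stars --&gt;
  &lt;rdf:Description rdf:about="http://example.org/professional#variableStar"&gt;
    &lt;skos:exactMatch rdf:resource="http://example.org/outreach#variableStar"/&gt;
  &lt;/rdf:Description&gt;

  &lt;!-- Hypothetical: the technical term has no outreach counterpart,
       so it is mapped to a broader concept in the outreach vocabulary --&gt;
  &lt;rdf:Description rdf:about="http://example.org/professional#deltaSct"&gt;
    &lt;skos:broadMatch rdf:resource="http://example.org/outreach#variableStar"/&gt;
  &lt;/rdf:Description&gt;

&lt;/rdf:RDF&gt;
</pre>
<p>With mappings of this kind, a query phrased in the outreach vocabulary
could in principle retrieve resources annotated with the professional one,
without either vocabulary having to change.</p>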
242
243 </div>
244
245 </div>
246
247
248 <div class="section">
249 <p class="title">Formalising the Vocabularies</p>
250
251 <p>After a number of online and face-to-face discussions, the authors
252 brokered a consensus within the IVOA community that the published formats of
253 formalised vocabularies should include at least SKOS (Simple Knowledge
254 Organization System), a W3C draft standard application of RDF to the
255 field of knowledge organisation <span
256 class="cite">std:skoscore</span>. SKOS draws on long experience
257 within the Library and Information Science community, to address a
258 well-defined set of problems to do with the indexing and retrieval of
259 information and resources; as such, it is a close match to the problem
260 this working group is addressing.</p>
261
262 <p>ISO 5964 <span class='cite' >std:iso5964</span> defines a number of
263 the relevant terms (ISO 5964:1985=BS 6723:1985; see also <span
264 class='cite' >std:bs8723-1</span> and <span class='cite'
265 >std:z39.19</span>), and some of the (lightweight) theoretical
266 background. The only technical distinction relevant to this document
267 is that between `vocabulary' and `thesaurus': BS-8723-1 defines a
268 thesaurus as a</p>
269 <blockquote>
270 controlled vocabulary in which concepts are represented by preferred
271 terms, formally organized so that paradigmatic relationships between
272 the concepts are made explicit, and the preferred terms are
273 accompanied by lead-in entries for synonyms or quasi-synonyms. NOTE:
274 The purpose of a thesaurus is to guide both the indexer and the
275 searcher to select the same preferred term or combination of preferred
276 terms to represent a given subject. (BS-8723-1, sect. 2.39)
277 </blockquote>
278 <p>with a similar definition in ISO-5964 sect. 3.16. The paradigmatic
279 relationships in question are those relating a term to a `broader',
280 `narrower' or more generically `related' term. The definition of
281 `broader term' is operational: a resource retrieved
282 by a given term will also be retrieved by that term's `broader term'.
283 This is not a subsumption relationship, as there is no implication
284 that the concept referred to by a narrower term is of the same
285 <em>type</em> as the concept referred to by its broader term.</p>
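<p>As an illustration of these relationships, the following is a minimal
sketch of two concepts written in SKOS and serialised as RDF/XML. The
<code>http://example.org/vocab#</code> namespace, the concept identifiers
and the alternative label are hypothetical, chosen only for this example;
they are not taken from the vocabularies accompanying this Draft. The
preferred term is given with <code>skos:prefLabel</code>, a lead-in entry
for a synonym with <code>skos:altLabel</code>, and the `broader' and
`narrower' relationships with <code>skos:broader</code> and
<code>skos:narrower</code>.</p>
<pre>
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"&gt;

  &lt;!-- Hypothetical broader concept, with one lead-in synonym --&gt;
  &lt;skos:Concept rdf:about="http://example.org/vocab#variableStar"&gt;
    &lt;skos:prefLabel xml:lang="en"&gt;variable star&lt;/skos:prefLabel&gt;
    &lt;skos:altLabel xml:lang="en"&gt;variable&lt;/skos:altLabel&gt;
    &lt;skos:narrower rdf:resource="http://example.org/vocab#deltaSct"/&gt;
  &lt;/skos:Concept&gt;

  &lt;!-- Hypothetical narrower concept: a resource indexed with this term
       should also be retrieved by a search on its broader term --&gt;
  &lt;skos:Concept rdf:about="http://example.org/vocab#deltaSct"&gt;
    &lt;skos:prefLabel xml:lang="en"&gt;delta Sct&lt;/skos:prefLabel&gt;
    &lt;skos:broader rdf:resource="http://example.org/vocab#variableStar"/&gt;
  &lt;/skos:Concept&gt;

&lt;/rdf:RDF&gt;
</pre>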
286
287 <p>Thus a vocabulary (SKOS or otherwise) is not an ontology. It has
288 lighter and looser semantics than an ontology, and is specialised for
289 the restricted case of resource retrieval.</p>
290
291 <p><span class='todo' >What is to be the format of the `master' files?
292 SKOS or mildly-formatted plain text?</span></p>
293
294 <div class="section">
295 <p class="title">SKOS files (normative)</p>
296
297 <p>We provide a set of SKOS files representing the vocabularies which
298 have been developed.</p>
299
300 <p>To come: one SKOS file per vocabulary, defining the list of
301 concepts; at least one file per vocabulary, giving mappings to other
302 vocabularies; possibly translations. See Makefile in ../Vocabularies,
303 which produces a tarball, at present without mappings or translations.</p>
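<p>Pending those files, the following sketch suggests the shape one such
per-vocabulary file might take: a concept scheme, a top-level concept
belonging to it, and a translated label. It is an assumption about the
layout and not a description of the files produced by the Makefile; the
<code>http://example.org/vocab</code> URIs, the scheme title and the
French label are hypothetical.</p>
<pre>
&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;

  &lt;!-- Hypothetical scheme: one per vocabulary file --&gt;
  &lt;skos:ConceptScheme rdf:about="http://example.org/vocab"&gt;
    &lt;dc:title&gt;Example astronomical vocabulary&lt;/dc:title&gt;
    &lt;skos:hasTopConcept rdf:resource="http://example.org/vocab#variableStar"/&gt;
  &lt;/skos:ConceptScheme&gt;

  &lt;!-- Each concept declares the scheme it belongs to; translations
       are further labels distinguished by xml:lang --&gt;
  &lt;skos:Concept rdf:about="http://example.org/vocab#variableStar"&gt;
    &lt;skos:inScheme rdf:resource="http://example.org/vocab"/&gt;
    &lt;skos:prefLabel xml:lang="en"&gt;variable star&lt;/skos:prefLabel&gt;
    &lt;skos:prefLabel xml:lang="fr"&gt;étoile variable&lt;/skos:prefLabel&gt;
  &lt;/skos:Concept&gt;

&lt;/rdf:RDF&gt;
</pre>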
304
305 </div>
306 </div>
307
308 <div class="appendices">
309
310 <div class="section-nonum" id="bibliography">
311 <p class="title">Bibliography</p>
312 <?bibliography rm-refs ?>
313 </div>
314
315 <p style="text-align: right; font-size: x-small; color: #888;">
316 $Revision$ $Date$
317 </p>
318
319 </div>
320
321 </body>
322 </html>

Properties

Name Value
svn:keywords Author Date Revision
