
Contents of /trunk/projects/dal/SODA/SODA.tex



Revision 3749
Mon Nov 28 18:25:45 2016 UTC (4 years, 8 months ago) by francois
File MIME type: application/x-tex
File size: 44567 byte(s)


1 \documentclass[11pt,a4paper]{ivoa}
2 \input tthdefs
3
4 \newcommand{\xtype}[1]{\texttt{#1}}
5 \newcommand{\ucd}[1]{\texttt{#1}}
6
7 \usepackage{listings}
8 \lstloadlanguages{XML,sh}
9 \lstset{flexiblecolumns=true,basicstyle=\small,tagstyle=\ttfamily}
10 \usepackage[utf8]{inputenc}
11 \usepackage{todonotes}
12
13 \title{IVOA Server-side Operations for Data Access}
14
15 \ivoagroup{Data Access Layer Working Group}
16
17 \author{Fran\c cois Bonnarel}
18 \author{Markus Demleitner}
19 \author{Patrick Dowler}
20 \author{Douglas Tody}
21 \author{James Dempsey}
22
23 \editor{Fran\c cois Bonnarel, Patrick Dowler}
24
25 \previousversion{PR-SODA-1.0-20160429}
26 \previousversion{WD-SODA-1.0-20151221}
27 \previousversion{WD-AccessData-1.0-20151021}
28 \previousversion{WD-AccessData-1.0-20140730}
29 \previousversion{WD-AccessData-1.0-20140312}
30
31
32
33 \begin{document}
34
35 \begin{abstract}
36 This document describes the Server-side Operations for Data Access
37 (SODA) web service capability. SODA is a low-level data access
38 capability for server-side data processing that can act upon the data
39 files, performing various kinds of operations: filtering/subsection,
40 transformations, pixel operations, and applying functions to the data.
41 \end{abstract}
42
43 \section*{Acknowledgments}
44 The authors would like to thank all the participants in DAL-WG discussions for
45 their ideas, critical
46 reviews, and contributions to this document.
47
48 \section*{Conformance-related definitions}
49
50 The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and
51 ``OPTIONAL'' (in upper or lower case) used in this document are to be
52 interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}.
53
54 The \emph{Virtual Observatory (VO)} is a
55 general term for a collection of federated resources that can be used
56 to conduct astronomical research, education, and outreach.
57 The \href{http://www.ivoa.net}{International
58 Virtual Observatory Alliance (IVOA)} is a global
59 collaboration of separately funded projects to develop standards and
60 infrastructure that enable VO applications.
61
62
63 \section{Introduction}
64 The SODA web service interface defines a RESTful web service for
65 performing server-side operations and processing on data before
66 transfer.
67
68 \subsection{The Role in the IVOA Architecture}
69
70 \begin{figure}
71 \centering
72
73 \includegraphics[width=0.9\textwidth]{archdiag.png}
74 \caption{SODA in the global VO architecture}
75 \label{fig:architecture}
76 \end{figure}
77
78 Figure~\ref{fig:architecture} shows how SODA fits into the IVOA architecture.
79 SODA services conform to the Data Access Layer Interface (DALI,
80 \citet{std:DALI}) specification,
81 including the Virtual Observatory Support Interfaces (VOSI,
82 \citet{std:VOSI}) resources.
83
84 Within the IVOA architecture, SODA services could be found and used in
85 several ways. First, a SODA service could be found in the IVOA Registry
86 and used directly. Second, a description of a SODA service may be found along
87 with specific dataset metadata, either at the data discovery phase using
88 Simple Image Access (SIA, \citet{std:SIAv2}) or Table Access Protocol
89 (TAP, \citet{std:TAP}) and the ObsCore data model \citep{std:OBSCORE} or
90 via a DataLink \citep{std:DataLink} service. The service descriptors and
91 three-factor semantics rely on UCDs \citep{std:UCD} and the VO Unit
92 specification \citep{std:VOUNIT}. Since the discovery of SODA services
93 makes use of DataLink service descriptor(s) to provide parameter
94 metadata, the VOSI-capabilities specified in
95 Section~\ref{sec:capabilities} do not make use of a registry extension.
96
97 \subsubsection{SODA Service in the Registry}
98
99 Resources in the IVOA Registry may include SODA capabilities. In order to
100 use such services, clients require prior knowledge of suitable
101 identifiers that are usable with a registered SODA service. This
102 scenario is described in more detail below in
103 Section~\ref{sec:reg-soda}.
104
105 \subsubsection{SODA Service from Data Discovery}
106
107 In the simplest case, the identifiers found via data discovery can be
108 used directly with an associated SODA service. The query response (from
109 SIA or TAP) would include one or more DataLink service descriptor(s)
110 that describe the available SODA capabilities. This scenario is described
111 in detail in Section~\ref{sec:disc-soda}.
112
113 \subsubsection{SODA Service from DataLink}
114
115 In the general case, data discovery responses may direct clients to an
116 associated DataLink service where access details can be obtained. The
117 DataLink output will in turn provide service descriptor(s) of the
118 associated SODA service(s). Service providers may choose this approach
119 for several reasons; for example, one entry from data discovery may lead
120 to multiple files or resources, or access via services such as SODA may
121 be considered the primary access mode and direct download is not
122 available or discouraged. This scenario is described in detail in
123 Section~\ref{sec:disc-links-soda}.
124
125
126 \subsection{Motivating Use Cases}
127 Below are some of the more common use cases that have motivated the
128 development of the SODA specification. While this list is not complete, it
129 helps in understanding the problem area covered by this specification.
130
131 \subsubsection{Retrieve Subsection of a Datacube}
132 \label{sec:use-cube}
133
134 Cut out a subsection using coordinate axis values. The input to the
135 cutout operation will include one or more of the following:
136
137 \begin{itemize}
138 \item a region on the sky
139 \item an energy value or range
140 \item a time value or range
141 \item one or more polarization states
142 \end{itemize}
143
144 The region on the sky should be something simple: a circle,
145 a range of coordinate values, or a polygon.
146
147 \subsubsection{Retrieve subsection of a 2D Image}
148 This is a special case of \ref{sec:use-cube},
149 where the cutout is only in the spatial axes.
150
151 \subsubsection{Retrieve subsection of a Spectrum}
152
153 This is a special case of \ref{sec:use-cube},
154 where the cutout is only in the spectral axis.
155
156 \subsection{Anticipated Future Use Cases}
157
158 \subsubsection{Provide the data in different formats}
159
160 Examples are images in PNG or JPEG instead of FITS, and spectra in CSV,
161 FITS, or VOTable.
162
163 \subsubsection{Flatten a Datacube into a 2D Image}
164
165 This use case will be developed and supported in the
166 SODA-1.1 (or later) specification.
167
168 \subsubsection{Flatten a Datacube into a 1D Spectrum}
169
170 This use case will be developed and supported in the
171 SODA-1.1 (or later) specification.
172
173 \subsubsection{Rebin Data by a Fixed Factor}
174
175 This use case will be developed and supported in the
176 SODA-1.1 (or later) specification.
177
178 \subsubsection{Reproject Data onto a Specified Grid}
179
180 This use case will be developed and supported in the
181 SODA-1.1 (or later) specification.
182
183 \subsubsection{Compute Aggregate Functions over the Data}
184
185 This use case will be developed and supported in the
186 SODA-1.1 (or later) specification.
187
188
189 \subsubsection{Apply Standard Function to Data Values}
190
191 Examples could be
192 ``denoising'' with standard methods or ``on the fly'' recalibration.
193 This use case will be developed and supported in the SODA-1.1 (or later)
194 specification.
195
196 \subsubsection{Apply Arbitrary User-Specified Function to Data Values}
197
198 This use case will be developed and supported in the
199 SODA-1.1 (or later) specification.
200
201 \subsubsection{Run Arbitrary User-Supplied Code on the Data}
202
203 This use case will be developed and supported in the
204 SODA-1.1 (or later) specification.
205
206
207 \section{Resources}
208 \label{sec:resources}
209
210 SODA services are implemented as HTTP REST \citep{richardson07} web
211 services with a \{sync\} resource that conforms to the
212 DALI-sync resource description.
213
214 \begin{table}[ht]
215 \begin{tabular}{rrr}
216 \sptablerule
217 \textbf{resource type}&\textbf{resource name}&\textbf{required}\cr
218 \sptablerule
219 \{sync\}&service specific&\cr
220 \{async\}&service specific&\cr
221 {DALI-examples}&/examples&no\cr
222 {VOSI-availability}&/availability&yes\cr
223 {VOSI-capabilities}&/capabilities&yes\cr
224 \sptablerule
225 \end{tabular}
226 \caption{Endpoints for SODA services}
227 \end{table}
228
229 A stand-alone SODA service may have one or both of the \{sync\} and
230 \{async\} resources. For either type, it could have multiple resources
231 (e.g. to support alternate authentication schemes). The SODA service may
232 also include other custom or supporting resources.
233
234 Either the \{sync\} or \{async\} SODA capability may be included as part
235 of other web services. For example, a single web service could contain
236 the SIA-2.0 \{query\} capability, the DataLink-1.0 \{links\} capability,
237 and the SODA \{sync\} capability. Such a service must also have the
238 VOSI-availability and VOSI-capabilities resources to report on and
239 describe all the implemented capabilities.
240
241 \subsection{\{sync\} resource}
242 \label{sec:sync}
243
244 The \{sync\} resource is a synchronous web service resource
245 that conforms to the DALI-sync description. Implementors
246 are free to name this resource however
247 they like, except that the name must consist of one URI segment only (i.e.,
248 contain no slash). This ensures that clients, given the access URL,
249 can reliably find out the URL of the capabilities endpoint.
250 Clients, in turn, can find the resource path using the
251 VOSI-capabilities resource, but will in general be provided the access
252 URLs through a previous data discovery query or through direct user
253 input.
254
255 The \{sync\} resource performs the data access as specified by
256 the input parameters and returns the data directly in the
257 output stream. Synchronous data access is suitable when the
258 operations can be quickly performed and the data stream can
259 be set up and written to (by the service) in a short period
260 of time (e.g. before any timeouts).
261
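A synchronous request of this kind can be assembled as a plain HTTP GET
URL. The following Python sketch is illustrative only: the helper name
\texttt{build\_sync\_url} and the identifier value are assumptions made
for this example, and the access URL is the one used in the sample
capabilities document of Section~\ref{sec:capabilities}.

\begin{lstlisting}[language=Python]
from urllib.parse import urlencode


def build_sync_url(access_url, params):
    """Return a GET URL for a SODA {sync} request.

    params is a list of (name, value) pairs, so that a parameter
    name may appear more than once.
    """
    return access_url + "?" + urlencode(params)


url = build_sync_url(
    "http://example.com/data/sync",
    [
        ("ID", "ivo://example.com/cubes?c1"),  # hypothetical identifier
        ("CIRCLE", "12 34 0.5"),
    ],
)
\end{lstlisting}

The identifier is passed through unchanged apart from URL encoding,
since ID values are opaque to the client.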
262 \subsection{\{async\} resource}
263 \label{sec:async}
264
265 The \{async\} resource is an asynchronous web service resource
266 that conforms to the DALI-async description. The considerations on
267 naming the resource given in sect.~\ref{sec:sync} apply for it.
268
269 The \{async\} resource performs the data access as specified
270 by the input parameters and either (i) stores the results
271 for later transfer or (ii) pushes the results to a specified
272 destination (e.g. to a VOSpace location). Asynchronous data
273 access usually introduces resource constraints on the
274 service (which may be limited) and usually imposes a higher
275 latency before any results can be seen because the location
276 of results does not have to be valid until the data access
277 job is complete. Asynchronous data access is intended for
278 (but not limited to) use when the operations take
279 considerable time and results must be staged (e.g. some
280 multi-pass algorithms or operations that result in multiple
281 outputs).
282
283 \subsection{Examples: DALI-examples}
284 \label{sec:examples}
285
286 SODA services should provide a DALI-examples resource
287 with at least one example invocation that shows the variety of
288 operations the service can perform. Example operations using
289 the \{sync\} resource and that output a small data stream are
290 preferred, as the examples may be used by automatic validators doing
291 relatively frequent (of order daily) queries.
292
293 Parameters to be passed to the service must be given using the DALI
294 \texttt{generic-parameter} term.
295
296
297 \subsection{Availability: VOSI-availability}
298 \label{sec:availability}
299
300 A SODA web service must have a VOSI-availability
301 resource \citep{std:VOSI} as described in DALI \citep{std:DALI}.
302
303 \subsection{Capabilities: VOSI-capabilities}
304 \label{sec:capabilities}
305
306 A web service that includes SODA capabilities must
307 have a VOSI-capabilities resource \citep{std:VOSI} as described in DALI
308 \citep{std:DALI}. The standardID for the \{sync\} resource is
309 $$\hbox{\texttt{ivo://ivoa.net/std/SODA\#sync-1.0}}$$
310
311 The standardID for the \{async\} resource is
312 $$\hbox{\texttt{ivo://ivoa.net/std/SODA\#async-1.0}}$$
313
314 All DAL services must implement the \texttt{/capabilities} resource.
315 The following capabilities document shows the minimal
316 metadata for a stand-alone SODA service and does not
317 require a registry extension schema:
318
319 \begin{lstlisting}[language=XML]
320 <?xml version="1.0"?>
321 <capabilities
322 xmlns:vosi="http://www.ivoa.net/xml/VOSICapabilities/v1.0"
323 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
324 xmlns:vod="http://www.ivoa.net/xml/VODataService/v1.1">
325
326 <capability standardID="ivo://ivoa.net/std/VOSI#capabilities">
327 <interface xsi:type="vod:ParamHTTP" version="1.0">
328 <accessURL use="full">http://example.com/data/capabilities</accessURL>
329 </interface>
330 </capability>
331
332 <capability standardID="ivo://ivoa.net/std/VOSI#availability">
333 <interface xsi:type="vod:ParamHTTP" version="1.0">
334 <accessURL use="full">
335 http://example.com/data/availability
336 </accessURL>
337 </interface>
338 </capability>
339
340 <capability standardID="ivo://ivoa.net/std/SODA#sync-1.0">
341 <interface xsi:type="vod:ParamHTTP" role="std" version="1.0">
342 <accessURL use="full">
343 http://example.com/data/sync
344 </accessURL>
345 </interface>
346 </capability>
347
348 <capability standardID="ivo://ivoa.net/std/SODA#async-1.0">
349 <interface xsi:type="vod:ParamHTTP" role="std" version="1.0">
350 <accessURL use="full">
351 http://example.com/data/async
352 </accessURL>
353 </interface>
354 </capability>
355 </capabilities>
356 \end{lstlisting}
357
358 Note that the \{sync\} and \{async\} resources do not have to be
359 named as shown in the accessURL(s) above. Multiple
360 interface elements within the \{sync\} and the \{async\} capabilities
361 may be included; this is typically used if they differ in
362 protocol (http vs. https) and/or authentication
363 requirements.
364
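A client that starts from a capabilities document can look up the access
URL(s) of a capability by its standardID. The sketch below is a minimal
illustration under stated assumptions: it works on an abbreviated copy
of the document above embedded as a string, and
\texttt{find\_access\_urls} is a hypothetical helper rather than part of
any IVOA library.

\begin{lstlisting}[language=Python]
import xml.etree.ElementTree as ET

SODA_SYNC = "ivo://ivoa.net/std/SODA#sync-1.0"

# Abbreviated sample document (assumption: only one capability shown).
SAMPLE = """<capabilities
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:vod="http://www.ivoa.net/xml/VODataService/v1.1">
  <capability standardID="ivo://ivoa.net/std/SODA#sync-1.0">
    <interface xsi:type="vod:ParamHTTP" role="std" version="1.0">
      <accessURL use="full">
        http://example.com/data/sync
      </accessURL>
    </interface>
  </capability>
</capabilities>"""


def find_access_urls(capabilities_xml, standard_id):
    """Return all accessURL values of the matching capability."""
    root = ET.fromstring(capabilities_xml)
    urls = []
    for capability in root.iter("capability"):
        if capability.get("standardID") == standard_id:
            for access in capability.iter("accessURL"):
                # The URL may be surrounded by whitespace and newlines.
                urls.append(access.text.strip())
    return urls


sync_urls = find_access_urls(SAMPLE, SODA_SYNC)
\end{lstlisting}

Returning a list rather than a single URL reflects that a capability may
carry several interface elements.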
365
366 \section{Parameters for SODA \{sync\} and \{async\}}
367 \label{sec:parameters}
368
369 The \{sync\} and \{async\} resources accept the same set of
370 parameters. Support for multiple values of parameters is optional.
371 If a request includes multiple values for a parameter and the
372 service does not support multiple values for that parameter, the
373 request must fail with the MultiValuedParamNotSupported error listed
374 below (\ref{sec:error-codes}). For example, a service may
375 allow only single values for ID but multiple values for cutout parameters.
376 Supported multiplicity may also differ between \{sync\} and \{async\} requests.
377
378 \enlargethispage{\baselineskip}
379
380 In general, services would support multi-valued parameters because doing so may
381 let them provide more efficient access to data files. Clients may attempt to use
382 multi-valued parameters, but must be prepared to fall back to multiple requests
383 if the service indicates this is not supported. A future version of
384 DataLink
385 \citep{std:DataLink} should provide a mechanism to describe parameter
386 multiplicity.
387
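The fallback behaviour expected from clients can be sketched as follows.
This is an illustration, not a prescribed implementation: the exception
class, the \texttt{send} callable and the stand-in service are all
hypothetical, and the way a real client detects the
MultiValuedParamNotSupported error in a response is not shown here.

\begin{lstlisting}[language=Python]
class MultiValuedParamNotSupported(Exception):
    """Hypothetical client-side signal for the SODA error of that name."""


def request_with_fallback(send, fixed, name, values):
    """Try one multi-valued request, else one request per value.

    send   -- callable taking a list of (name, value) pairs
    fixed  -- pairs sent with every request (e.g. the ID)
    name   -- the parameter that carries several values
    values -- the values for that parameter
    """
    try:
        return [send(fixed + [(name, v) for v in values])]
    except MultiValuedParamNotSupported:
        return [send(fixed + [(name, v)]) for v in values]


def single_valued_service(params):
    """Stand-in service that rejects any repeated parameter name."""
    names = [n for n, _ in params]
    if len(names) != len(set(names)):
        raise MultiValuedParamNotSupported(names)
    return dict(params)


results = request_with_fallback(
    single_valued_service,
    [("ID", "example-id")],  # hypothetical identifier
    "CIRCLE",
    ["12 34 0.5", "14 36 0.5"],
)
\end{lstlisting}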
388 \subsection{Common Parameters}
389
390 \subsubsection{ID}
391 \label{sec:ID}
392
393 The ID parameter is used to specify the dataset or file to
394 be accessed. The values for the ID parameter are generally
395 discovered from data discovery or DataLink requests. The
396 values must be treated as opaque identifiers that are used
397 as-is. The DataLink specification \citep{std:DataLink} describes mechanisms
398 for conveying opaque parameters and values in service
399 descriptor resources that can be used by clients to set the
400 ID parameter.
401
402 The UCD \citep{std:UCD} describing the ID parameter is\\
403 \ucd{meta.ref.url;meta.curation}.
404
405 \subsection{Filtering Parameters}
406
407 Filtering parameters are used to extract subsets of larger
408 datasets or data files.
409
410 \subsubsection{CIRCLE}
411 \label{sec:CIRCLE}
412
413 The CIRCLE parameter defines a spatial region using the \xtype{circle}
414 xtype defined in DALI \citep{std:DALI}.
415
416 Example: a circle at (12,34) with radius 0.5:
417
418 \begin{lstlisting}
419 CIRCLE=12 34 0.5
420 \end{lstlisting}
421
422 The UCD \citep{std:UCD} describing the CIRCLE parameter is
423 \ucd{phys.angArea;obs}.
424
425 CIRCLE is equivalent in functionality to \texttt{POS=CIRCLE ...} but
426 with type-safe serialised value and unit metadata.
427
428 \subsubsection{POLYGON}
429 \label{sec:POLYGON}
430
431 The POLYGON parameter defines a spatial region using the \xtype{polygon}
432 xtype defined in DALI \citep{std:DALI}.
433
434 Example: a polygon from (12,34) to (14,34) to (14,36) to (12,36) and
435 (implicitly) back to (12,34):
436
437 \begin{lstlisting}
438 POLYGON=12 34 14 34 14 36 12 36
439 \end{lstlisting}
440
441 The UCD \citep{std:UCD} describing the POLYGON parameter is
442 \ucd{phys.angArea;obs}.
443
444 POLYGON is equivalent in functionality to \texttt{POS=POLYGON ...} but
445 with type-safe serialised value and unit metadata.
446
447 \subsubsection{POS}
448 \label{sec:POS}
449
450 The POS parameter defines the positional region(s) to be
451 extracted from the data using the \xtype{region}
452 xtype defined in DALI \citep{std:DALI}. The value is made up of a shape
453 keyword followed by coordinate values. The
454 allowed shapes are given in Table~\ref{tab:shapetypes}.
455
456 \begin{table}[h]
457 \begin{tabular}{ll}
458 \sptablerule
459 \textbf{Shape}&\textbf{Coordinate values}\cr
460 \sptablerule
461 \texttt{CIRCLE}&\texttt{<longitude> <latitude> <radius>}\cr
462 \texttt{RANGE}&\texttt{<longitude1> <longitude2> <latitude1> <latitude2>}\cr
463 \texttt{POLYGON}&\texttt{<longitude1> <latitude1> ... (at least 3 pairs)}\cr
464 \sptablerule
465 \end{tabular}
466 \caption{POS Values in Spherical Coordinates}
467 \label{tab:shapetypes}
468 \end{table}
469
470 As in DALI intervals, open ranges use -Inf or +Inf as one limit.
471
472 \goodbreak
473 Examples for POS values:
474
475 \begin{itemize}
476 \item A circle at (12,34) with radius 0.5:
477
478 \begin{lstlisting}
479 POS=CIRCLE 12 34 0.5
480 \end{lstlisting}
481
482 \item A range of [12,14] in longitude and [34,36] in latitude:
483
484 \begin{lstlisting}
485 POS=RANGE 12 14 34 36
486 \end{lstlisting}
487
488 \item A polygon from (12,34) to (14,34) to (14,36) to (12,36) and
489 (implicitly) back to (12,34):
490
491 \begin{lstlisting}
492 POS=POLYGON 12 34 14 34 14 36 12 36
493 \end{lstlisting}
494
495 \item A band around the equator:
496
497 \begin{lstlisting}
498 POS=RANGE 0 360 -2 2
499 \end{lstlisting}
500
501 \item The north pole:
502
503 \begin{lstlisting}
504 POS=RANGE 0 360 89 +Inf
505 \end{lstlisting}
506 \end{itemize}
507
508 All longitude and latitude values (plus the radius of the
509 CIRCLE) are expressed in degrees in ICRS. A future
510 version of this specification may allow the use of other
511 reference systems (specifically the native system of the
512 data).
513
514 The UCD \citep{std:UCD} describing the POS parameter is \ucd{phys.angArea;obs}.
515
516 Since it is string-valued, POS is unitless; however, the numeric values
517 contained in the string are all in decimal degrees. In VOTable, the
518 POS parameter has \texttt{datatype="char"} and \texttt{arraysize="*"}.
519
520
521
522 POS is included in SODA for consistency with the SIA-2.0
523 \citep{std:SIAv2} query parameter of the same name. Note that use of the
524 POS parameter with shape keyword ``circle'' provides the equivalent
525 spatial region as the CIRCLE parameter and POS with the shape keyword
526 ``polygon'' is equivalent to the POLYGON parameter. There is no
527 type-specific parameter that is equivalent to the ``range'' shape
528 keyword. There is no way for a service provider to declare support for a
529 subset of the POS shape keywords in a DataLink \citep{std:DataLink}
530 service descriptor; either POS is included or not and if included then
531 all keywords must be supported.
532
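The structure of a POS value (shape keyword followed by coordinate
values) lends itself to a small validating parser. The sketch below is
one possible reading of Table~\ref{tab:shapetypes}; the function name
\texttt{parse\_pos} is an assumption, and matching the shape keywords
exactly as written in the table (upper case) is a choice made for this
example.

\begin{lstlisting}[language=Python]
def parse_pos(value):
    """Split a POS value into (shape, coordinates in degrees).

    Raises ValueError for an unknown shape or a wrong number of
    coordinate values.
    """
    shape, *tokens = value.split()
    # float() accepts "-Inf" and "+Inf", used for open ranges.
    coords = [float(token) for token in tokens]
    if shape == "CIRCLE":
        valid = len(coords) == 3
    elif shape == "RANGE":
        valid = len(coords) == 4
    elif shape == "POLYGON":
        # At least three longitude/latitude pairs.
        valid = len(coords) >= 6 and len(coords) % 2 == 0
    else:
        raise ValueError("unknown shape: " + shape)
    if not valid:
        raise ValueError("wrong number of values for " + shape)
    return shape, coords
\end{lstlisting}

For instance, \texttt{parse\_pos("RANGE 0 360 89 +Inf")} yields the
shape \texttt{RANGE} and four floating point values, the last of which
is positive infinity.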
533
534 \subsubsection{BAND}
535 \label{sec:BAND}
536
537 The BAND parameter defines the energy interval(s) to be extracted from
538 the data using a floating point interval (\texttt{xtype="interval"}) as
539 defined in DALI \citep{std:DALI}. The value is an open or closed numeric
540 interval with numeric values interpreted as wavelength(s) in metres. As
541 in DALI, open intervals use -Inf or +Inf as one limit.
542
543 \begin{itemize}
544 \item The closed interval [500,550]:
545
546 \begin{lstlisting}
547 BAND=500 550
548 \end{lstlisting}
549
550 \item The open interval (-inf,300]:
551
552 \begin{lstlisting}
553 BAND=-Inf 300
554 \end{lstlisting}
555
556 \item The open interval [750,inf):
557
558 \begin{lstlisting}
559 BAND=750 +Inf
560 \end{lstlisting}
561
562 \item The scalar value 550, equivalent to [550,550]:
563
564 \begin{lstlisting}
565 BAND=550 550
566 \end{lstlisting}
567
568 \end{itemize}
569
570 Extracting using a scalar value should normally extract a
571 single pixel along the energy axis of the data; extracting
572 using an interval should extract one or more pixels.
573
574 All energy values are expressed as barycentric wavelength in
575 metres. A future version of this specification may allow the
576 use of other reference systems (specifically the native
577 system of the data).
578
579 The UCD \citep{std:UCD} describing the BAND parameter is \ucd{em.wl}.
580
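Since BAND values are wavelengths in metres, a client holding a
frequency range has to convert it first. The following sketch shows the
conversion under the assumption that no reference frame correction is
needed; the function names are hypothetical. Note that the interval
limits swap, because the highest frequency corresponds to the shortest
wavelength.

\begin{lstlisting}[language=Python]
SPEED_OF_LIGHT = 299792458.0  # metres per second (exact SI value)


def band_from_frequency(f_min_hz, f_max_hz):
    """Convert a frequency interval in Hz to a wavelength interval in m."""
    if not 0 < f_min_hz <= f_max_hz:
        raise ValueError("need 0 < f_min_hz <= f_max_hz")
    wl_min = SPEED_OF_LIGHT / f_max_hz
    wl_max = SPEED_OF_LIGHT / f_min_hz
    return wl_min, wl_max


def band_param(wl_min, wl_max):
    """Serialise a wavelength interval as a BAND parameter."""
    return f"BAND={wl_min} {wl_max}"


# Example: an interval around the 21 cm line, 1.40 GHz to 1.42 GHz.
wl_min, wl_max = band_from_frequency(1.40e9, 1.42e9)
\end{lstlisting}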
581
582 \subsubsection{TIME}
583 \label{sec:TIME}
584
585 The TIME parameter defines the time interval(s) to be extracted from the
586 data using a floating point interval (\texttt{xtype="interval"}) as
587 defined in DALI \citep{std:DALI}. The value is an open or closed
588 interval with numeric values interpreted as Modified Julian Date(s) in
589 UTC. As in DALI, open intervals use -Inf or +Inf as one limit.
590
591 \begin{itemize}
592 \item An open interval from the MJD 55100.0 and all later times:
593
594 \begin{lstlisting}
595 TIME=55100.0 +Inf
596 \end{lstlisting}
597
598 \item A range of MJD values:
599
600 \begin{lstlisting}
601 TIME=55123.456 55123.466
602 \end{lstlisting}
603
604 \item An instant in time using Modified Julian Date:
605
606 \begin{lstlisting}
607 TIME=55678.123456 55678.123456
608 \end{lstlisting}
609 \end{itemize}
610
611 The UCD \citep{std:UCD} describing the TIME parameter is
612 \ucd{time.interval;obs.exposure}.
613
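Clients usually hold calendar dates rather than Modified Julian Dates.
The sketch below converts timezone-aware UTC datetimes to MJD, using the
MJD zero point of 1858-11-17 00:00 UTC; the function names are
assumptions made for this example, and leap seconds are ignored, as they
are by Python's \texttt{datetime} module.

\begin{lstlisting}[language=Python]
from datetime import datetime, timezone

MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def to_mjd(moment):
    """Convert a timezone-aware datetime to Modified Julian Date."""
    return (moment - MJD_EPOCH).total_seconds() / 86400.0


def time_param(start, stop):
    """Serialise a closed time interval as a TIME parameter."""
    return f"TIME={to_mjd(start)} {to_mjd(stop)}"


start = datetime(2009, 9, 26, 0, 0, tzinfo=timezone.utc)
stop = datetime(2009, 9, 26, 12, 0, tzinfo=timezone.utc)
time_value = time_param(start, stop)
\end{lstlisting}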
614
615 \subsubsection{POL}
616 \label{sec:POL}
617
618 The POL parameter defines the polarization state(s) (Stokes)
619 to be extracted from the data.
620
621 \begin{itemize}
622 \item Extract the unpolarized intensity:
623 \begin{lstlisting}
624 POL=I
625 \end{lstlisting}
626 \item Extract the standard circular polarization:
627 \begin{lstlisting}
628 POL=V
629 \end{lstlisting}
630
631 \item Extract only the IQU components:
632 \begin{lstlisting}
633 POL=I
634 POL=Q
635 POL=U
636 \end{lstlisting}
637 \end{itemize}
638
639 As shown in the example above, the POL parameter must support multiple values
640 for both \{sync\} and \{async\} requests. Unlike general filtering parameters,
641 all values of POL are combined into a single filter; for example, if the request
642 includes the three values above, the job would generate one result with
643 some or all of these polarization states (per combination of ID and
644 other filtering parameters).
645
646 The UCD \citep{std:UCD} describing the POL parameter is
647 \ucd{meta.code;phys.polarization}.
648
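The rule that all POL values form a single filter can be illustrated
from the receiving side of a request. In this sketch the function name
and the query string are hypothetical; it only shows how repeated POL
parameters collapse into one list of requested states.

\begin{lstlisting}[language=Python]
from urllib.parse import parse_qs


def combined_pol_filter(query_string):
    """Return the requested polarization states as one ordered list."""
    params = parse_qs(query_string)
    # dict.fromkeys drops duplicates while keeping the request order.
    return list(dict.fromkeys(params.get("POL", [])))


states = combined_pol_filter("ID=example-id&POL=I&POL=Q&POL=U")
\end{lstlisting}

A service following this pattern would produce one result containing
some or all of the listed states, rather than one result per POL value.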
649
650 \subsubsection{Filtering parameters and ObsCore data model}
651
652
653 Filtering parameters drive the generation of virtual datasets. The ObsCore model is well suited to describing the virtual data that SODA can generate, so every SODA filtering parameter corresponds to an ObsCore model concept.
654
655 The spatial parameters (CIRCLE, POLYGON and POS) constrain the spatial support of the output virtual dataset (ObsCore attribute s\_region, utype "obscore:Char.SpatialAxis.Coverage.Support.Area"). The spatial support of the output dataset is included in the spatial support of the archived dataset.
656
657
658 The TIME parameter constrains the time bounds of the SODA output virtual dataset (ObsCore feature of utype: \\
659 "Char.TimeAxis.Coverage.Bounds.Limits"). The time bounds of the output dataset are limited by the time bounds of the archived dataset.
660 The BAND parameter constrains the spectral bounds of the SODA output virtual
661 dataset (ObsCore feature of utype: \\ "Char.SpectralAxis.Coverage.Bounds.Limits").
662 The spectral bounds of the output dataset are limited by the spectral bounds of
663 the archived dataset.
664
665
666 The POL parameter constrains the list of polarization states in the output virtual dataset (ObsCore attribute pol\_states, \\
667 utype "obscore:Char.PolarizationAxis.stateList"). The
668 valid values for this parameter are those listed in the value
669 of the pol\_states attribute of the archived dataset.
670
671 \subsection{Three-Factor Semantics}
672
673 Parameters in SODA are uniquely defined by the triple of name, UCD
674 \citep{std:UCD}, and unit \citep{std:VOUNIT}. Data services are free to
675 support as many such parameters as is appropriate for their datasets, in
676 addition to supporting standard parameters. With the three factors, it
677 is unlikely that two service providers will by accident use the same
678 three factors for parameters of differing semantics.
679
680 With standard parameters as defined in this document, clients can rely
681 on certain semantics and exploit that knowledge in the provision of
682 special UIs or APIs. Standard parameters defined so far are given
683 in Table~\ref{table:standardpars}. Instructions for how to propose
684 additional standard parameters are given on the landing page of the IVOA
685 DAL working group\footnote{At the time of writing, this is\\
686 \href{http://wiki.ivoa.net/twiki/bin/view/IVOA/DefiningServiceParameters}{http://wiki.ivoa.net/twiki/bin/view/IVOA/DefiningServiceParameters}.}.
687
688 \begin{table}[ht]
689 \begin{tabular}{l l l l}
690 \sptablerule
691 \textbf{Name}&\textbf{UCD}&\textbf{Unit}&\textbf{Semantics} \cr
692 \sptablerule
693 ID&meta.ref.url;meta.curation&&cf.~sect.~\ref{sec:ID} \cr
694 CIRCLE&phys.angArea;obs&deg&cf.~sect.~\ref{sec:CIRCLE} \cr
695 POLYGON&phys.angArea;obs&deg&cf.~sect.~\ref{sec:POLYGON} \cr
696 POS&phys.angArea;obs&&cf.~sect.~\ref{sec:POS} \cr
697 BAND&em.wl&m&cf.~sect.~\ref{sec:BAND} \cr
698 TIME&time.interval;obs.exposure&d&cf.~sect.~\ref{sec:TIME} \cr
699 POL&meta.code;phys.polarization&&cf.~sect.~\ref{sec:POL} \cr
700 \sptablerule
701 \end{tabular}
702 \caption{Three-Factor Semantics for standard SODA parameters}
703 \label{table:standardpars}
704 \end{table}
705
706 Both standard and non-standard parameters should follow DALI conventions
707 if at all possible. Roughly, float-valued target fields should be accessed or
708 constrained via interval-valued parameters (i.e., do not split up
709 minimum and maximum into separate parameters). Depending on their
710 semantics, integer parameters should either be intervals or enumerated
711 parameters (which typically can be repeated in the manner of POL).
712 Geometry fields should be
713 accessed or constrained using geometry values (circle and polygon xtypes
714 from DALI \citep{std:DALI}), following the examples of CIRCLE
715 (\ref{sec:CIRCLE}) and POLYGON (\ref{sec:POLYGON}).
716
717 Parameter metadata, including three-factor semantics, is conveyed to
718 clients via DataLink \citep{std:DataLink} service descriptor(s) as
719 described in Section~\ref{sec:integration}.
720
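Because a parameter is identified by the full triple of name, UCD and
unit, a client can recognise standard parameters with a simple set
lookup. The sketch below copies the triples from
Table~\ref{table:standardpars}; representing a missing unit as an empty
string, and the function name \texttt{is\_standard}, are assumptions
made for this example.

\begin{lstlisting}[language=Python]
# (name, UCD, unit) triples of the standard SODA parameters.
STANDARD_PARAMETERS = {
    ("ID", "meta.ref.url;meta.curation", ""),
    ("CIRCLE", "phys.angArea;obs", "deg"),
    ("POLYGON", "phys.angArea;obs", "deg"),
    ("POS", "phys.angArea;obs", ""),
    ("BAND", "em.wl", "m"),
    ("TIME", "time.interval;obs.exposure", "d"),
    ("POL", "meta.code;phys.polarization", ""),
}


def is_standard(name, ucd, unit=""):
    """Return True if all three factors match a standard parameter."""
    return (name, ucd, unit) in STANDARD_PARAMETERS
\end{lstlisting}

A custom parameter that reuses the name BAND but is declared with a
different UCD or unit would therefore not be mistaken for the standard
one.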
721 \section{Integration of Service Capabilities}
722 \label{sec:integration}
723
724 Finding and using SODA services depends on several other standards;
725 service providers can follow one or more strategies in integrating a
726 range of standard and custom services with their SODA implementation.
727 Here we describe these strategies and show how to use the standards
728 together.
729
730 Within the IVOA architecture, SODA services could be found and used in two
731 ways. First, a SODA service could be found in the IVOA Registry and used
732 directly. Second, a description of a SODA service may be found along
733 with specific dataset metadata; this is the primary anticipated usage:
734 clients discover applicable SODA services while doing data discovery
735 queries.
736
737 The DataLink \citep{std:DataLink} recommendation provides a mechanism
738 to include ``a description of a SODA service'' using a standard resource
739 called a service descriptor. The service descriptor is included in any
740 VOTable \citep{std:VOTable} output and can describe the parameters for
741 use with a DALI-sync or DALI-async compliant capability. Aside from DALI
742 \citep{std:DALI} compliance, this may be a standard service or a custom
743 service. Since the service descriptor can describe all input parameters,
744 it can declare available standard parameters, extensions (custom
745 parameters in standard services), and parameters for custom services.
746 This mechanism is expected to be the primary means of finding and using
747 a SODA service.
748
749 A generic SODA sync service descriptor describing the standard
750 parameters (see sect.~\ref{sec:parameters}):
751
752 \begin{lstlisting}[language=XML]
753 <RESOURCE type="meta" ID="soda-sync" utype="adhoc:service">
754 <PARAM name="standardID" datatype="char" arraysize="*"
755 value="ivo://ivoa.net/std/SODA#sync-1.0" />
756 <PARAM name="accessURL" datatype="char" arraysize="*"
757 value="http://example.com/soda/sync" />
758 <GROUP name="inputParams">
759 <PARAM name="ID" ucd="meta.ref.url;meta.curation"
760 ref="idcolumn-ref"
761 datatype="char" arraysize="*" value="" />
762 <PARAM name="POS" ucd="phys.angArea;obs"
763 datatype="char" arraysize="*" value="" />
764 <PARAM name="CIRCLE" unit="deg" ucd="phys.angArea;obs"
765 datatype="double" arraysize="3"
766 xtype="circle" value="" />
767 <PARAM name="POLYGON" unit="deg" ucd="phys.angArea;obs"
768 datatype="double" arraysize="*"
769 xtype="polygon" value="" />
770 <PARAM name="BAND" unit="m" ucd="em.wl"
771 datatype="double" arraysize="2"
772 xtype="interval" value="" />
773 <PARAM name="TIME" ucd="time.interval;obs.exposure" unit="d"
774 datatype="double" arraysize="2"
775 xtype="interval" value="" />
776 <PARAM name="POL" ucd="meta.code;phys.polarization"
777 datatype="char" arraysize="*" value="" />
778 </GROUP>
779 </RESOURCE>
780 \end{lstlisting}
781
782 This service descriptor is generic because the ID parameter uses a
783 \xmlel{ref} attribute to specify that identifier values come from
784 elsewhere in the document (usually this refers to a FIELD element that
785 describes a table column within another RESOURCE element). Thus, this
786 descriptor can be used with any ID values in that column.
787
788 The PARAM with \texttt{name="standardID"} specifies that this service is
789 a SODA sync service. The standardID values for SODA are specified in
790 Section~\ref{sec:capabilities}.
791
792 The GROUP with \texttt{name="inputParams"} shows the standard
793 description of the standard SODA parameters as defined in
794 Section~\ref{sec:parameters}. Services should only include parameter
795 descriptions for supported parameters; in a generic service descriptor
796 ``supported'' means supported by the implementation and does not imply
797 that use of that parameter is applicable to all data (e.g. to all
798 possible identifier values).
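
As a non-normative illustration, a client might read such a descriptor
roughly as sketched below. The sketch uses only the Python standard
library; the function name \texttt{read\_descriptor} is hypothetical,
and the embedded descriptor is abridged and omits the VOTable XML
namespace, which real documents will normally declare.

\begin{lstlisting}[language=Python]
import xml.etree.ElementTree as ET

# Abridged copy of the generic descriptor above (assumption: no namespace).
DESCRIPTOR = """
<RESOURCE type="meta" ID="soda-sync" utype="adhoc:service">
  <PARAM name="standardID" datatype="char" arraysize="*"
    value="ivo://ivoa.net/std/SODA#sync-1.0" />
  <PARAM name="accessURL" datatype="char" arraysize="*"
    value="http://example.com/soda/sync" />
  <GROUP name="inputParams">
    <PARAM name="ID" ref="idcolumn-ref"
      datatype="char" arraysize="*" value="" />
    <PARAM name="CIRCLE" unit="deg" datatype="double" arraysize="3"
      xtype="circle" value="" />
    <PARAM name="BAND" unit="m" datatype="double" arraysize="2"
      xtype="interval" value="" />
  </GROUP>
</RESOURCE>
"""

def read_descriptor(xml_text):
    """Return (standardID, accessURL, names of declared input parameters)."""
    resource = ET.fromstring(xml_text)
    # Direct PARAM children carry the service-level metadata.
    meta = {p.get("name"): p.get("value") for p in resource.findall("PARAM")}
    # The inputParams GROUP lists the parameters the service supports.
    group = resource.find("GROUP[@name='inputParams']")
    names = [p.get("name") for p in group.findall("PARAM")]
    return meta["standardID"], meta["accessURL"], names

standard_id, access_url, supported = read_descriptor(DESCRIPTOR)
\end{lstlisting}

A client would then offer only the parameters found in
\texttt{supported}, keeping in mind that ``supported'' does not imply
applicability to every identifier value.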
799
800
All PARAMs in the descriptor may include a \xmlel{VALUES} subelement. This
element gives the domain limits of the parameter or the list of admitted
values. See Section~\ref{sec:disc-links-soda} for a full description of
this feature.
804
805 \subsection{SODA Service in the Registry}
806 \label{sec:reg-soda}
807
808 Resources in the IVOA Registry may include SODA capabilities. However,
809 in order to
810 use such services, clients require prior knowledge of suitable
811 identifiers that are usable with a registered SODA service. As a result,
812 finding and
813 using a SODA service via the registry is not expected to be a common
814 usage pattern.
815
816
817 \subsection{SODA Service Descriptor from Data Discovery}
818 \label{sec:disc-soda}
819
In the simplest case, the identifiers found via data discovery (e.g. the
\texttt{obs\_publisher\_did} in Obscore \citep{std:OBSCORE}) can be
used directly with an associated SODA service. The query response from
SIAv2 \citep{std:SIAv2} or TAP \citep{std:TAP} should include one or
more DataLink \citep{std:DataLink} service descriptors that describe the
SODA capabilities. These would have a \texttt{standardID} parameter
specifying SODA \{async\} or SODA \{sync\} as specified in
Section~\ref{sec:capabilities} and an appropriate \texttt{accessURL}
parameter for the service. If the service is registered, the provider
can include a \texttt{resourceIdentifier} parameter. The supported SODA
service parameters (standard and custom) would be declared in the
inputParams group of the service descriptor.
832
833 The declaration of the ID parameter will specify which column in the
834 data discovery response contains the suitable identifier; although this
835 is usually the obs\_publisher\_did from the ObsCore data model, this is
836 not required and the provider may have the ID parameter reference
837 another (possibly custom) column.
838
839 The data discovery response will in general contain metadata the client
840 can use to determine the values of SODA filtering parameters that will
841 yield valid subsets of the data. For example, standard data discovery
842 using either SIAv2 or TAP and Obscore will provide metadata for
843 specifying POS, CIRCLE, and POLYGON (s\_region, s\_ra, s\_dec, s\_fov),
844 BAND (em\_min, em\_max), TIME (t\_min, t\_max), and POL (pol\_states)
845 parameters.
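
To illustrate, the following non-normative Python sketch assembles a
\{sync\} request URL from values a client might have taken from a
discovery response. The function name \texttt{build\_sync\_url}, the
dataset identifier, and the numeric values are hypothetical; the sketch
also assumes that array-valued parameters are serialised as
whitespace-separated numbers, as in the \xmlel{VALUES} examples of this
document.

\begin{lstlisting}[language=Python]
from urllib.parse import urlencode

def build_sync_url(access_url, dataset_id, circle=None, band=None):
    """Assemble a {sync} request URL from discovery metadata."""
    params = [("ID", dataset_id)]
    if circle is not None:
        # CIRCLE: ra, dec and radius, all in degrees.
        params.append(("CIRCLE", " ".join(repr(float(v)) for v in circle)))
    if band is not None:
        # BAND: wavelength interval (min, max) in metres.
        params.append(("BAND", " ".join(repr(float(v)) for v in band)))
    # urlencode percent-encodes reserved characters and turns blanks into '+'.
    return access_url + "?" + urlencode(params)

url = build_sync_url(
    "http://example.com/soda/sync",   # accessURL from the descriptor
    "ivo://example.org/data?cube1",   # hypothetical obs_publisher_did
    circle=(12.0, 34.0, 0.5),         # e.g. derived from s_ra, s_dec, s_fov
    band=(5e-7, 6e-7),                # chosen within em_min .. em_max
)
\end{lstlisting}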
846
847 When a service descriptor for a SODA service is provided in the data
848 discovery response, it should be a generic descriptor (see above) for
849 use with multiple ID values. Thus, there will normally be a single
850 service descriptor for each available service.
851
852
853 \subsection{SODA Service Descriptor from DataLink}
854 \label{sec:disc-links-soda}
855
The alternative scenario has the discovery service return DataLink
documents (see \citet{std:DataLink} for the two ways to do that: via the
access\_url or via a DataLink ``service descriptor'' in the query
response). These DataLink documents can then contain one or more SODA
descriptors, most typically one per dataset described. To allow SODA
clients to infer parameter ranges and to present useful user
interfaces, data providers SHOULD communicate the admissible ranges of
the parameters in question using the VOTable \xmlel{VALUES} element.
864
865 For float-valued intervals (e.g., the standard BAND and TIME
866 parameters), \xmlel{VALUES/MIN} and \xmlel{VALUES/MAX} should be used to
867 communicate the range of values for which clients can expect to
868 receive data. Example:
869
\begin{lstlisting}[language=XML]
<PARAM name="BAND" unit="m" ucd="em.wl"
  datatype="double" arraysize="2"
  xtype="interval" value="">
  <DESCRIPTION>The wavelength intervals to be extracted</DESCRIPTION>
  <VALUES>
    <MIN value="3e-7"/>
    <MAX value="8e-7"/>
  </VALUES>
</PARAM>
\end{lstlisting}
881
Enumerated values, both for integral and textual types, use
\xmlel{VALUES}/\xmlel{OP\-TION} elements unless there are too many possible
values. Again, only values for which nonempty responses can be
expected for the described dataset should be listed. Example:
886
\begin{lstlisting}[language=XML]
<PARAM name="POL" ucd="meta.code;phys.polarization"
  datatype="char" arraysize="*" value="">
  <DESCRIPTION>Polarization states to be extracted.</DESCRIPTION>
  <VALUES>
    <OPTION>I</OPTION>
    <OPTION>V</OPTION>
  </VALUES>
</PARAM>
\end{lstlisting}
897
In case the option enumeration becomes too large, the description
of the parameter should carefully describe what values are
admissible, e.g., by providing a link to an enumeration in the
\xmlel{DESCRIPTION}.
902
Intervals of integers are described analogously to float-valued
intervals, i.e., using \xmlel{MIN} and \xmlel{MAX} elements.
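
As a non-normative illustration of how a client might use these
declarations, the Python sketch below reads \xmlel{MIN} and \xmlel{MAX}
from a parameter declaration and tests whether a requested interval can
be expected to yield data. The function names
\texttt{declared\_range} and \texttt{overlaps} are hypothetical, and
the VOTable namespace is again omitted for brevity.

\begin{lstlisting}[language=Python]
import xml.etree.ElementTree as ET

PARAM = """
<PARAM name="BAND" unit="m" ucd="em.wl" datatype="double" arraysize="2"
  xtype="interval" value="">
  <VALUES>
    <MIN value="3e-7"/>
    <MAX value="8e-7"/>
  </VALUES>
</PARAM>
"""

def declared_range(param_xml):
    """Return (min, max) from VALUES; a missing limit is given as None."""
    param = ET.fromstring(param_xml)
    lo = param.find("VALUES/MIN")
    hi = param.find("VALUES/MAX")
    return (float(lo.get("value")) if lo is not None else None,
            float(hi.get("value")) if hi is not None else None)

def overlaps(requested, declared):
    """True if the requested (lo, hi) interval intersects the declared range."""
    lo, hi = declared
    return ((hi is None or requested[0] <= hi) and
            (lo is None or requested[1] >= lo))

band_range = declared_range(PARAM)
\end{lstlisting}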
905
906 Standard VOTable semantics are insufficient for the metadata of the SODA
907 POLYGON and CIRCLE parameters. We therefore define special cases for
908 the \xmlel{xtype}s \emph{circle} and \emph{polygon} at least until such
909 time as a proper data model for space-time coordinates defines a
910 different way to communicate such coverages within VOTables.
911
912 For CIRCLE, only a \xmlel{MAX} is given. It contains three
913 floating point values, separated by whitespace. These correspond
914 to the RA and Dec of the center of a spherical circle covering the
915 dataset, and a radius of such a covering circle. Data providers
916 SHOULD make sure they choose the center and radius such that the
917 covering circle is close to the minimal one of the dataset.
918 Example:
919
\begin{lstlisting}[language=XML]
<PARAM name="CIRCLE" unit="deg" ucd="phys.angArea;obs"
  datatype="double" arraysize="3"
  xtype="circle" value="">
  <DESCRIPTION>
    A spherical circle to be contained by the cutout
  </DESCRIPTION>
  <VALUES> <MAX value="12.0 34.0 0.5"/> </VALUES>
</PARAM>
\end{lstlisting}
930
931 For POLYGON, again only a \xmlel{MAX} is given. It consists of
932 a sequence of floating-point values, again separated by blanks,
933 describing RA and Dec of the vertices of a spherical polygon
934 covering the dataset. Data providers are encouraged to choose a
935 minimal polygon. Example:
936
\begin{lstlisting}[language=XML]
<PARAM name="POLYGON" unit="deg" ucd="phys.angArea;obs"
  datatype="double" arraysize="*"
  xtype="polygon" value="">
  <DESCRIPTION>A polygon to be contained by the cutout</DESCRIPTION>
  <VALUES>
    <MAX value="11.5 33.5 12.5 33.5 12.5 34.5 11.5 34.5"/>
  </VALUES>
</PARAM>
\end{lstlisting}
947
948 Angles in both CIRCLE and POLYGON are in degrees. As in the input,
949 the ICRS reference system is assumed, with no further metadata (e.g.,
950 reference position) prescribed by this standard. Further metadata
951 should be given using standard STC annotation when the formalism to do
952 that is finalised.
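
A client could decode the \xmlel{MAX} values for CIRCLE and POLYGON
along the lines of the following non-normative Python sketch. The
function names \texttt{parse\_circle} and \texttt{parse\_polygon} are
hypothetical; the check for at least three vertices is an assumption of
this sketch rather than a requirement stated here.

\begin{lstlisting}[language=Python]
def parse_circle(max_value):
    """Split a CIRCLE MAX value into (ra, dec, radius), all in degrees."""
    ra, dec, radius = (float(v) for v in max_value.split())
    return ra, dec, radius

def parse_polygon(max_value):
    """Split a POLYGON MAX value into a list of (ra, dec) vertices."""
    numbers = [float(v) for v in max_value.split()]
    # Assumption: a usable polygon has an even count of at least six numbers.
    if len(numbers) % 2 != 0 or len(numbers) < 6:
        raise ValueError("expected at least three ra/dec pairs")
    return list(zip(numbers[0::2], numbers[1::2]))

circle = parse_circle("12.0 34.0 0.5")
polygon = parse_polygon("11.5 33.5 12.5 33.5 12.5 34.5 11.5 34.5")
\end{lstlisting}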
953
954 For POS, useful metadata cannot be given. Services supporting POS
955 should therefore provide POLYGON as well, and clients wishing to
956 use POS should infer sensible values for that parameter from
957 \xmlel{VALUES} given for POLYGON.
958
959 A full example for a dataset-specific datalink descriptor is given in
960 appendix~\ref{app:fullsoda}.
961
962 Providing values in the parameter descriptions of a data-specific
963 service descriptor implies that the resource generating this has access
964 to the applicable metadata. Depending on system architecture, this may be
965 difficult to implement. An ``autodescription'' mechanism where the SODA
966 service can generate a data-specific service descriptor of itself
967 may be included in SODA-1.1 or later.
968
969
970
971
972 \section{\{sync\} Responses}
973
974 All responses from the \{sync\} resource follow the rules for
975 DALI-sync resources, except that the \{sync\} response allows
976 for error messages for individual input identifier values.
977
978 \subsection{Successful Requests}
979
980 Successfully executed requests should result in a response
981 with HTTP status code 200 (OK) and a response in the format
982 requested by the client or in the default format for the
983 service.
984
985 If the values specified for cutout parameters do not include
986 any pixels from the target dataset/file, the service must
987 respond with HTTP status code 204 (No Content) and no
988 response body.
989
990 The service should set the following HTTP headers to the
991 correct values where possible.
992
993 \begin{tabular}{rr}
994 \sptablerule
995 Content-Type&mime-type of the response\cr
996 Content-Encoding&encoding/compression of the response (if applicable)\cr
997 \sptablerule
998 \end{tabular}
999
1000 Since the response is usually dynamically generated, the
1001 Content-Length and Last-Modified headers cannot usually be
1002 set.
1003
1004 \subsection{Errors}
1005 \label{sec:error-codes}
1006
1007 The error handling specified for DALI-sync resources applies
1008 to service failure. Error documents should be text using the
1009 text/plain content-type and the text must begin with one of
1010 the following strings:
1011
1012 \begin{table}[h]
1013 \begin{tabular}{l l}
1014 \sptablerule
1015 \textbf{Error Code} & \textbf{Description} \cr
1016 \sptablerule
1017 Error&General error (not covered below) \cr
1018 AuthenticationError&Not authenticated \cr
1019 AuthorizationError&Not authorized to access the resource \cr
1020 ServiceUnavailable&Transient error (could succeed with retry) \cr
1021 UsageError&Permanent error (retry pointless) \cr
1022 MultiValuedParamNotSupported&request included multiple values for a parameter\cr
1023 &but the service only supports a single value \cr
1024 \sptablerule
1025 \end{tabular}
1026 \caption{error messages with their meaning}
1027 \end{table}
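
A client might recognise these error documents as in the following
non-normative Python sketch. The function names
\texttt{classify\_error} and \texttt{should\_retry} are hypothetical,
and the message text after the error code in the usage shown is an
assumption, since only the leading string is prescribed.

\begin{lstlisting}[language=Python]
# Error codes from the table above; the generic "Error" is tested last.
ERROR_CODES = (
    "MultiValuedParamNotSupported",
    "AuthenticationError",
    "AuthorizationError",
    "ServiceUnavailable",
    "UsageError",
    "Error",
)

def classify_error(text):
    """Return the error code a text/plain document begins with, or None."""
    for code in ERROR_CODES:
        if text.startswith(code):
            return code
    return None

def should_retry(text):
    """Only ServiceUnavailable is described as a transient error."""
    return classify_error(text) == "ServiceUnavailable"

code = classify_error("UsageError: invalid BAND value")
\end{lstlisting}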
1028
1029 \section{\{async\} Responses}
1030
1031 The \{async\} resource conforms to the DALI-async resource
1032 description, which means the job is a UWS job with all the
1033 job control features available. All result files are to be
1034 listed as children of the UWS results resource. The service
1035 provider is free to name each result.
1036
1037 When multiple values of input parameters are accepted,
1038 each combination of values produces one result. For
1039 example, if an \{async\} job included two CIRCLE and two BAND
1040 values, there must be four results. If a combination
1041 of input parameters does not produce a result (e.g. there is no
1042 overlap between the parameter values and data extent), the job results
1043 must contain a result entry that indicates this. This should be
1044 a result URL which returns a text/plain document with a message
1045 starting with one of the error labels in Section~\ref{sec:error-codes}
1046 above.
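
The combination rule can be illustrated with the following
non-normative Python sketch, which enumerates the parameter
combinations for which an \{async\} job is expected to list a result.
The function name \texttt{expected\_combinations} and the parameter
values are hypothetical.

\begin{lstlisting}[language=Python]
from itertools import product

def expected_combinations(**multi_params):
    """Enumerate every combination of the supplied parameter values."""
    names = list(multi_params)
    value_lists = (multi_params[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

combos = expected_combinations(
    CIRCLE=["12.0 34.0 0.5", "12.2 34.1 0.1"],
    BAND=["5e-7 6e-7", "6e-7 7e-7"],
)
# Two CIRCLE values and two BAND values give four combinations.
\end{lstlisting}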
1047
1048 \appendix
1049
1050 \section{Full Sample SODA Descriptor}
1051 \label{app:fullsoda}
1052
Below is an example illustrating what a SODA descriptor for a dataset as
delivered in a DataLink document might look like (see
Section~\ref{sec:disc-links-soda}). Note in particular how \xmlel{value}
is used in the declaration of the ID parameter to convey the fixed value
corresponding to the dataset described.
1058
1059 The particular dataset described here is a spectral cube. Therefore
1060 no TIME and POL parameters are defined.
1061
1062 The example also illustrates how a custom parameter (here, KIND) would
1063 be declared.
1064
\begin{lstlisting}[language=XML,basicstyle=\footnotesize]
<RESOURCE ID="referenced" type="meta" utype="adhoc:service">
  <GROUP name="inputParams">
    <PARAM arraysize="*" datatype="char" name="ID" ucd="meta.id;meta.main"
      value="ivo://org.gavo.dc/~?califa/datadr3/COMB/NGC0180.COMB.rscube.fits">
      <DESCRIPTION>The publisher DID of the dataset of interest</DESCRIPTION>
    </PARAM>
    <PARAM arraysize="*" datatype="char" name="POS" ucd="phys.angArea;obs"
      value="">
      <DESCRIPTION>Region to (approximately) cut out, as Circle, Box,
        or Polygon</DESCRIPTION>
    </PARAM>
    <PARAM arraysize="*" datatype="double" name="POLYGON"
      ucd="phys.angArea;obs" unit="deg" value="">
      <DESCRIPTION>A polygon (as a flattened array of ra, dec pairs) that
        should be covered by the cutout.</DESCRIPTION>
      <VALUES>
        <MAX value="9.499 8.626 9.499 8.645 9.478 8.645 9.478 8.626"/>
      </VALUES>
    </PARAM>
    <PARAM arraysize="3" datatype="double" name="CIRCLE"
      ucd="phys.angArea;obs" unit="deg" value="">
      <DESCRIPTION>A circle (as a flattened array of ra, dec, radius)
        that should be covered by the cutout.</DESCRIPTION>
      <VALUES>
        <MAX value="9.4889955890 8.6358711588 0.0146493214"/>
      </VALUES>
    </PARAM>
    <PARAM arraysize="2" datatype="double" name="BAND" ucd="em.wl"
      unit="m" value="" xtype="interval">
      <DESCRIPTION>Vacuum wavelength limits</DESCRIPTION>
      <VALUES>
        <MIN value="3.701e-07"/>
        <MAX value="7.501e-07"/>
      </VALUES>
    </PARAM>
    <PARAM arraysize="*" datatype="char" name="KIND" ucd="" value="">
      <DESCRIPTION>Set to HEADER to retrieve just the primary header,
        leave empty for data.</DESCRIPTION>
      <VALUES>
        <OPTION name="Retrieve header only" value="HEADER"/>
        <OPTION name="Retrieve the full data, including header (default)"
          value="DATA"/>
      </VALUES>
    </PARAM>
  </GROUP>
  <PARAM arraysize="*" datatype="char" name="accessURL" ucd="meta.ref.url"
    value="http://dc.g-vo.org/califa/q3/dl/dlget"/>
  <PARAM arraysize="*" datatype="char" name="standardID"
    value="ivo://ivoa.net/std/SODA#sync-1.0"/>
</RESOURCE>
\end{lstlisting}
1117
1118
1119 \section{Changes from Previous Versions}
1120
1121 \subsection{Changes from PR-SODA-20160429}
\begin{itemize}
\item Made multiple values for all parameters optional in both \{sync\} and
\{async\} requests and introduced a specific error message for the case
that multiplicity of a parameter is not supported.
\item Added a section introducing the different usage scenarios for SODA
and how they can interact with other DAL capabilities. Moved the bulk of
the normative text to an integration section so that it follows the
primary specification of SODA resources and parameters.
\item Re-organised so that UCDs for parameters are only specified once
in the section on three-factor semantics.
\item Added CIRCLE and POLYGON ``double array'' parameters. POS is
retained for consistency with SIA-2.0 query.
\item The interval xtype is now a strict arraysize=2 array, consistent
with DALI 1.1.
\item SODA autodescription is postponed to version 1.1.
\item VALUES for xtype=interval now use MIN and MAX rather than MAX
alone.
\end{itemize}
1139
1140 \subsection{Changes from WD-SODA-1.0-20151212}
1141 \begin{itemize}
1142 \item POS is now unitless
1143 \item Aligned parameter UCDs with what is in ObsCore
1144 \item Removed gratuitous xtypes.
1145 \end{itemize}
1146
1147 \subsection{Changes from WD-SODA-1.0-20151120}
1148
Changed the name of the protocol. Removed SELECT and COORD. The xtype
descriptions are now in DALI, and a reference to it has been added.
1150
1151 \subsection{Changes from WD-AccessData-1.0-20151021}
1152
1153 Added general introduction on PARAMETER description to
1154 section 3. Modified SELECT and COORD sections in order to
1155 detach them from SimDal. Added Appendix on xtype description
1156 with BNF syntax.
1157
1158 \subsection{Changes from WD-AccessData-1.0-20140730}
1159
\begin{itemize}
\item Removed the REQUEST parameter, following the DAL-WG decision not to
include it when there is only one value.

\item Clarified that ID and filtering parameters are single
valued for \{sync\} and multi-valued for \{async\}, with POL
being multi-valued but still being treated as a single
filter.
\end{itemize}
1169
1170 \subsection{Changes from WD-AccessData-1.0-20140312}
1171
1172 This is the initial document version.
1173
1174 \bibliography{ivoatex/ivoabib}
1175
1176 \end{document}
