/[volute]/branches/SODA-markus/SODA.tex

Diff of /branches/SODA-markus/SODA.tex


revision 3229 by msdemlei, Mon Jan 25 17:21:28 2016 UTC revision 3230 by msdemlei, Tue Jan 26 12:32:11 2016 UTC
# Line 121  Line 121 
121  This use case will be developed and supported in the  This use case will be developed and supported in the
122  SODA-1.1 (or later) specification.  SODA-1.1 (or later) specification.
123    
124    \subsection{SODA Operation}
125    
126    In contrast to other IVOA protocols, SODA services are not usually
127    discovered through Registry queries.  Instead, clients encounter them
128    in Datalink \citep{std:datalink} declarations, which can either be
129    standalone or embedded within other services' responses.
130    
131    Since this pattern can appear somewhat confusing at first, this
132    introductory (non-normative) chapter discusses the scenarios in which
133    clients can encounter SODA services.  In parallel, we provide advice on
134    the server-side implications of these scenarios.
135    
136    In all cases, the first step is a query to a data discovery service; when used
137    below, this term could refer to, for instance, SIA, SSA, or ObsTAP, but also
138    to some sort of resolution engine for persistent identifiers.
139    
140    \subsubsection{Pure Datalink Discovery}
141    \label{sect:pure-datalink}
142    
143    In this scenario, in the data discovery step, the service has returned a
144    row that declared a result with the media type
145    $$\hbox{\texttt{application/x-votable+xml;content=datalink}}.$$
146    
147    To the client, this indicates that what is given in the access reference
148    (e.g., the access\_url column in ObsTAP or SIA version 2 or the column
149    with the UCD VOX:Image\_AccessReference in SIA version 1) is a datalink
150    document.  Within that document, there is a datalink service descriptor
151    with certain properties defined in this specification that looks somewhat
152    like this:\todo{of course,
153    this needs descriptions and ranges;  if this text is accepted for the
154    main standard, MD will fill this in.}
155    
156    \begin{lstlisting}[language=XML,basicstyle=\footnotesize]
157    <VOTABLE...>
158    <RESOURCE type="results">
159      [datalink links, one of them being]
160      <TR>[id=ivo://example.com/data?ds1 service-def=soda; semantics=#proc]</TR>
161    </RESOURCE>
162    
163    <RESOURCE type="meta" utype="adhoc:service" ID="soda">
164    
165      <PARAM name="standardID" datatype="char" arraysize="*"
166            value="ivo://ivoa.net/std/SODA#sync-1.0" />
167    
168      <PARAM name="accessURL" datatype="char" arraysize="*"
169            value="http://example.com/my-svcs/soda/sync?ID=ivo://example.com/data?ds1" />
170    
171      <GROUP name="inputParams">
172         <PARAM name="POS" ucd="phys.angArea;obs" datatype="char"
173              arraysize="*" />
174         <PARAM name="BAND" ucd="em.wl" unit="m" datatype="double"
175              arraysize="*"/>
176         <PARAM name="TIME" ucd="time.interval;obs.exposure"
177              unit="d" datatype="double"
178              arraysize="*" xtype="interval"  />
179         <PARAM name="POL" ucd="meta.code;phys.polarization" datatype="char"
180              arraysize="*" />
181      </GROUP>
182    </RESOURCE>
183    \end{lstlisting}
184    
185    Of course, the service is free to choose the VOTable ID of the
186    adhoc:service resource; the service will only declare the parameters it
187    (and the underlying data) actually supports.  
188    
189    From the datalink line, the client sees that there is a service for the
190    dataset in question (identified here through its publisher DID,
191    \nolinkurl{ivo://example.com/data?ds1}), and from the service
192    descriptor's standardID \xmlel{PARAM} it learns that the service's
193    parameters follow the rules laid down here, in particular as regards the
194    three-factor semantics.  For instance, the client is guaranteed that
195    BAND, with UCD em.wl and unit meters, actually denotes the parameter
196    controlling where a cutout on the spectral axis will happen.
197    
198    SODA's role here is exactly this guarantee of a specific semantics, as
199    opposed to a non-standard service that could use BAND in an entirely
200    different way.
201    
202    An attractive implementation strategy for small-to-medium sized
203    installations is to pre-generate the datalink files.  In that way, no
204    extra endpoint is required besides the discovery service and the SODA
205    service.
206    
207    Here is a sketch of the query pattern in this case\todo{If people think
208    this is a good idea, I'll do SVGs of these}:
209    
210    \begin{verbatim}
211    Client ---- discovery query ----> DAL service
212                                         |
213         +----- Results with ------------+
214         |      Datalink-valued accrefs
215         v
216     Datalink client --- retrieved accref ---> e.g., plain HTTP
217                                                  service
218                                                    |
219        +-------- Datalink document with -----------+
220        |         SODA descriptor
221        v
222     SODA client -----> SODA instructions ----> SODA service
223                                                   |
224    Data viewer <------ sliced-and-diced data -----+
225    \end{verbatim}
226    
227    \subsubsection{Datalink Discovery with Backward Compatibility}
228    \label{sect:dlplusbackward}
229    
230    The problem with the scheme discussed in sect.~\ref{sect:pure-datalink} is
231    that legacy clients, i.e., those that do not understand Datalink, will
232    not be able to interpret the results of the discovery step.  While this
233    is probably desirable when services hand out large data cubes that
234    legacy clients will not properly handle anyway, in many other
235    situations it is desirable to deliver conventional (e.g., FITS) data
236    products to such legacy clients.  To still enable SODA and other
237    datalink functionality, DAL services can add a service descriptor in the
238    DAL response that indicates the availability of a Datalink
239    \emph{service} accompanying the DAL service, looking more or less like
240    this:
241    
242    \begin{lstlisting}[language=XML]
243    <RESOURCE type="results">
244      [a result from services like TAP, SIA, SSA]
245      <TABLE>
246        [in particular, we have one field like]
247        <FIELD ID="primaryID" name="pubDID" datatype="char" arraysize="*">
248          <DESCRIPTION>The publisher DID for the dataset</DESCRIPTION>
249        </FIELD>
250        ...
251      </TABLE>
252    </RESOURCE>
253    <RESOURCE type="meta" utype="adhoc:service">
254      <PARAM name="standardID" datatype="char" arraysize="*"
255        value="ivo://ivoa.net/std/DataLink#links-1.0" />
256      <PARAM name="accessURL" datatype="char" arraysize="*"
257        value="http://example.com/mylinks/get" />
258      <GROUP name="inputParams">
259        <PARAM name="ID" datatype="char" arraysize="*"
260          value="" ref="primaryID"/>
261      </GROUP>
262    </RESOURCE>
263    \end{lstlisting}
264    
265    Note that while this looks very similar to the SODA descriptor above,
266    this fragment is in the DAL response and references one (or more)
267    field(s) from the DAL response, whereas the SODA descriptor resides in a
268    Datalink document.  This is explained in more detail in section 4.2 of
269    the Datalink recommendation 1.0.  The net result is that
270    datalink-enabled clients can find ancillary data and use SODA services
271    for data access by virtue of being able to retrieve Datalink documents,
272    whereas legacy clients still retain basic functionality.
273    
274    On the service side, this incurs the additional cost of having to
275    provide a datalink \{links\} resource; on the client side, some extra
276    dereferencing becomes necessary.  Hence, this pattern should be
277    preferred over the simpler pattern from sect.~\ref{sect:pure-datalink}
278    only if there is a significant advantage in serving data to legacy
279    clients.
280    
281    The query pattern in this case looks like this:
282    
283    \begin{verbatim}
284    
285    Client ---- discovery query ----> DAL service
286                                         |
287         +----- Results with ------------+
288         |      pubDIDs and a {links} descriptor
289         v
290     Datalink client ----- ID=pubDID -----> Datalink service
291                                                    |
292        +-------- datalink document with -----------+
293        |         SODA descriptor
294        v
295     SODA client -----> SODA instructions ----> SODA service
296                                                   |
297    Data viewer <------ sliced-and-diced data -----+
298    \end{verbatim}
299    
300    \subsubsection{Sidestepping Datalink}
301    
302    In some situations, the extra request to retrieve the datalink document
303    for each dataset is inconvenient, while the client may have sufficient
304    information to operate the SODA service based on common metadata.  A
305    classic example would be a service containing relatively homogeneous
306    results of a single instrument, perhaps a spectrograph where all
307    spectra essentially have the same spectral coverage and a client may
308    want to only retrieve, say, the vicinity of a spectral line.
309    
310    In such cases a service may provide a shortcut by including a SODA
311    descriptor directly in the DAL response.  In essence the resulting
312    descriptor looks like a union of the one given in
313    sect.~\ref{sect:pure-datalink} and the one given in
314    sect.~\ref{sect:dlplusbackward}: It includes the SODA parameters and the ID
315    parameter with the reference to the column to take the publisher DID
316    from, but it has the SODA standardID from sect.~\ref{sect:pure-datalink}
317    rather than the Datalink one from sect.~\ref{sect:dlplusbackward}.
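       
       As a rough, non-normative sketch assembled from the fragments in
       sect.~\ref{sect:pure-datalink} and sect.~\ref{sect:dlplusbackward},
       such a combined descriptor might look somewhat like the following.
       The access URL and the choice of BAND as the only cutout parameter
       (matching the spectrograph example) are assumptions made purely for
       illustration:
       
       \begin{lstlisting}[language=XML]
       <RESOURCE type="results">
         [a result from services like TAP, SIA, SSA]
         <TABLE>
           <FIELD ID="primaryID" name="pubDID" datatype="char" arraysize="*">
             <DESCRIPTION>The publisher DID for the dataset</DESCRIPTION>
           </FIELD>
           ...
         </TABLE>
       </RESOURCE>
       <!-- hypothetical combined descriptor: SODA standardID, ID taken from
            the pubDID column, plus the SODA parameter(s) on offer -->
       <RESOURCE type="meta" utype="adhoc:service">
         <PARAM name="standardID" datatype="char" arraysize="*"
           value="ivo://ivoa.net/std/SODA#sync-1.0" />
         <PARAM name="accessURL" datatype="char" arraysize="*"
           value="http://example.com/my-svcs/soda/sync" />
         <GROUP name="inputParams">
           <PARAM name="ID" datatype="char" arraysize="*"
             value="" ref="primaryID"/>
           <PARAM name="BAND" ucd="em.wl" unit="m" datatype="double"
             arraysize="*"/>
         </GROUP>
       </RESOURCE>
       \end{lstlisting}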
318    
319    While sidestepping the extra datalink request might appear attractive in
320    principle, the difficulty of determining the useful parameter ranges
321    makes this pattern interesting only in relatively few special cases.
322    Clients must not rely on the presence of full SODA descriptors in DAL
323    responses.  Normal SODA operation follows the pattern given in
324    sects.~\ref{sect:pure-datalink} and~\ref{sect:dlplusbackward}.
325    
326    The query pattern here is:
327    
328    \begin{verbatim}
329    Client ---- discovery query ----> DAL service
330                                         |
331         +----- Results with ------------+
332         |      SODA descriptor
333         v
334     SODA client -----> SODA instructions ----> SODA service
335                                                   |
336    Data viewer <------ sliced-and-diced data -----+
337    \end{verbatim}
338    
339    
340  \section{Resources}  \section{Resources}
341    
342  SODA services are implemented as HTTP REST \citep{richardson07} web  SODA services are implemented as HTTP REST \citep{richardson07} web


msdemlei@ari.uni-heidelberg.de