% Annotation of /trunk/projects/dm/provenance/description/provaccess.tex
% Revision 4205, Wed Aug 2 23:04:40 2017 UTC, by kriebe
% File MIME type: application/x-tex
% File size: 16681 byte(s)
% Log message: Add updates on ProvDAL (and TODO-boxes with questions concerning this)

\subsection{Provenance Data Model serialization}\label{sec:serialisations}
There are two families of ProvenanceDM metadata serializations. Examples of both can be found in the implementation section (Section~\ref{sec:usecases-implementations}) and the links therein.
\begin{itemize}
\item W3C serializations: PROV-N, PROV-JSON and PROV-XML. These are serializations of the W3C provenance data model. They allow additional IVOA or ad hoc attributes to be added to the basic ones in each class, so that the IVOA models can produce W3C-compliant serializations.
% \item Mapping of ProvenanceDM classes onto tables with appropriate relationships. This can allow management by a TAP service (the model mapping is then described with the TAP schema). The serialization will result in a single table according to the query.

%\TODO{TAP SCHEMA of the ProvenanceDM datamodel: Maybe Mathieu can provide us with a copy of the TAP schema he designed ?}

\item Direct VOTABLE mapping, using an ad hoc mapping based on a transcription of the PROV-N format: this is called PROV-VOTABLE. In the future we could also define a VO-DML \citep{std:VODML} version of the mapping.
%The following is an example of provenance metadata in this PROV-VOTABLE format. Objects become tables, their classes are rendered by a utype. Attributes and relationships become FIELDS or PARAMS. The model attribute names also become VOTABLE utypes.

\end{itemize}

Such serializations can be produced with the voprov\footnote{\url{https://github.com/sanguillon/voprov}} Python module; see also Section~\ref{sec:implementation_voprov}.
Here is an example serialization, in PROV-N format, for the process of creating a composite image from three images:

\begin{verbatim}
document
    prefix ivo <http://www.ivoa.net/documents/rer/ivo/>
    prefix hips <http://cds.u-strasbg.fr/data/>
    prefix voprov <http://www.ivoa.net/documents/dm/provdm/voprov/>

    entity(ivo://CDS/P/DSS2color#RGB_NGC6946, [voprov:annotation="This is a PNG RGB image built from DSS2 with Aladin for galaxy NGC 6946", voprov:doculink="http://cds.u-strasbg.fr/aladin.gml", voprov:name="RGB DSS2 image for NGC 6946"])
    entity(ivo://CDS/P/DSS2/POSSII#POSSII.J-DSS2.143, [voprov:annotation="This is the DSS2 digitization of the Blue POSSII Schmidt survey around NGC 6946", voprov:doculink="http://cds.u-strasbg.fr/aladin.gm", voprov:name="POSSII Blue Survey DSS2 NGC6946"])
    entity(ivo://CDS/P/DSS2/POSSII#POSSII.F-DSS2.143, [voprov:annotation="This is the DSS2 digitization of the Red POSSII Schmidt survey around NGC 6946", voprov:doculink="http://cds.u-strasbg.fr/aladin.gml", voprov:name="POSSII Red Survey DSS2 NGC6946"])
    entity(ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143, [voprov:annotation="This is the DSS2 digitization of the Infra red POSSII Schmidt survey around NGC 6946", voprov:doculink="http://cds.u-strasbg.fr/aladin.gm", voprov:name="POSSII Infra Red Survey DSS2 NGC6946"])
    activity(hips:AlaRGB1, 2017-04-18T17:28:00, 2017-04-19T17:29:00, [voprov:desc_id="AlaRGB", voprov:desc_type="RGBencoding", voprov:annotation="Aladin RGB image generation for NGC 6946", voprov:desc_name="Aladin RGB image generation algorithm", voprov:name="Aladin RGB 1", voprov:desc_doculink="http://cds.u-strasbg.fr/aladin.gml"])
    used(hips:AlaRGB1, ivo://CDS/P/DSS2/POSSII#POSSII.J-DSS2.143, -)
    used(hips:AlaRGB1, ivo://CDS/P/DSS2/POSSII#POSSII.F-DSS2.143, -)
    used(hips:AlaRGB1, ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143, -)
    wasGeneratedBy(ivo://CDS/P/DSS2color#RGB_NGC6946, hips:AlaRGB1, 2017-05-05T00:00:00)
endDocument
\end{verbatim}

This is the corresponding PROV-JSON serialization:

\begin{verbatim}
{
  "prefix": {
    "ivo": "http://www.ivoa.net/documents/rer/ivo/",
    "voprov": "http://www.ivoa.net/documents/dm/provdm/voprov/",
    "hips": "http://cds.u-strasbg.fr/data/"
  },
  "activity": {
    "hips:AlaRGB1": {
      "voprov:desc_doculink": "http://cds.u-strasbg.fr/aladin.gml",
      "voprov:desc_id": "AlaRGB",
      "prov:startTime": "2017-04-18T17:28:00",
      "voprov:annotation": "Aladin RGB image generation for NGC 6946",
      "voprov:desc_type": "RGBencoding",
      "voprov:desc_name": "Aladin RGB image generation algorithm",
      "prov:endTime": "2017-04-19T17:29:00",
      "voprov:name": "Aladin RGB 1"
    }
  },
  "wasGeneratedBy": {
    "_:id4": {
      "prov:time": "2017-05-05T00:00:00",
      "prov:entity": "ivo://CDS/P/DSS2color#RGB_NGC6946",
      "prov:activity": "hips:AlaRGB1"
    }
  },
  "used": {
    "_:id1": {
      "prov:entity": "ivo://CDS/P/DSS2/POSSII#POSSII.J-DSS2.143",
      "prov:activity": "hips:AlaRGB1"
    },
    "_:id3": {
      "prov:entity": "ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143",
      "prov:activity": "hips:AlaRGB1"
    },
    "_:id2": {
      "prov:entity": "ivo://CDS/P/DSS2/POSSII#POSSII.F-DSS2.143",
      "prov:activity": "hips:AlaRGB1"
    }
  },
  "entity": {
    "ivo://CDS/P/DSS2/POSSII#POSSII.J-DSS2.143": {
      "voprov:name": "POSSII Blue Survey DSS2 NGC6946",
      "voprov:annotation": "This is the DSS2 digitization of the Blue POSSII Schmidt survey around NGC 6946",
      "voprov:doculink": "http://cds.u-strasbg.fr/aladin.gm"
    },
    "ivo://CDS/P/DSS2/POSSII#POSSII.F-DSS2.143": {
      "voprov:name": "POSSII Red Survey DSS2 NGC6946",
      "voprov:annotation": "This is the DSS2 digitization of the Red POSSII Schmidt survey around NGC 6946",
      "voprov:doculink": "http://cds.u-strasbg.fr/aladin.gml"
    },
    "ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143": {
      "voprov:name": "POSSII Infra Red Survey DSS2 NGC6946",
      "voprov:annotation": "This is the DSS2 digitization of the Infra red POSSII Schmidt survey around NGC 6946",
      "voprov:doculink": "http://cds.u-strasbg.fr/aladin.gm"
    },
    "ivo://CDS/P/DSS2color#RGB_NGC6946": {
      "voprov:name": "RGB DSS2 image for NGC 6946",
      "voprov:annotation": "This is a PNG RGB image built from DSS2 with Aladin for galaxy NGC 6946",
      "voprov:doculink": "http://cds.u-strasbg.fr/aladin.gml"
    }
  }
}
\end{verbatim}
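
To illustrate how a client might consume such a PROV-JSON document, the following minimal Python sketch loads an abridged copy of the example above (most attributes omitted) with the standard \texttt{json} module and extracts the inputs and the generating activity of the composite image. The helper names \texttt{inputs\_of} and \texttt{generator\_of} are chosen for this illustration only and do not belong to any library.

\begin{verbatim}
import json

# Abridged copy of the PROV-JSON example (attributes omitted for brevity).
PROV_JSON = """
{
  "activity": {"hips:AlaRGB1": {"voprov:name": "Aladin RGB 1"}},
  "wasGeneratedBy": {
    "_:id4": {"prov:entity": "ivo://CDS/P/DSS2color#RGB_NGC6946",
              "prov:activity": "hips:AlaRGB1"}
  },
  "used": {
    "_:id1": {"prov:entity": "ivo://CDS/P/DSS2/POSSII#POSSII.J-DSS2.143",
              "prov:activity": "hips:AlaRGB1"},
    "_:id3": {"prov:entity": "ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143",
              "prov:activity": "hips:AlaRGB1"},
    "_:id2": {"prov:entity": "ivo://CDS/P/DSS2/POSSII#POSSII.F-DSS2.143",
              "prov:activity": "hips:AlaRGB1"}
  }
}
"""


def inputs_of(doc, activity_id):
    """Return the sorted identifiers of all entities used by an activity."""
    return sorted(
        rel["prov:entity"]
        for rel in doc.get("used", {}).values()
        if rel["prov:activity"] == activity_id
    )


def generator_of(doc, entity_id):
    """Return the activity that generated an entity, or None if unknown."""
    for rel in doc.get("wasGeneratedBy", {}).values():
        if rel["prov:entity"] == entity_id:
            return rel["prov:activity"]
    return None


doc = json.loads(PROV_JSON)
activity = generator_of(doc, "ivo://CDS/P/DSS2color#RGB_NGC6946")
print(activity, inputs_of(doc, activity))
\end{verbatim}

The relation identifiers (\texttt{\_:id1} etc.) carry no meaning of their own, which is why the sketch iterates over the relation values only.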

This is the corresponding PROV-VOTABLE serialization:

\begin{verbatim}
<?xml version="1.0" encoding="UTF-8"?>
<VOTABLE version="1.2"
    xmlns="http://www.ivoa.net/xml/VOTable/v1.2"
    xmlns:hips="http://cds.u-strasbg.fr/data/"
    xmlns:ivo="http://www.ivoa.net/documents/rer/ivo/"
    xmlns:voprov="http://www.ivoa.net/documents/dm/provdm/voprov/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.ivoa.net/xml/VOTable/v1.2 http://www.ivoa.net/xml/VOTable/VOTable-1.2.xsd">
  <RESOURCE type="provenance">
    <DESCRIPTION>Provenance VOTable</DESCRIPTION>
    <TABLE name="Usage" utype="voprov:used">
      <FIELD arraysize="*" datatype="char" name="activity" ucd="meta.id" utype="voprov:Usage.activity"/>
      <FIELD arraysize="*" datatype="char" name="entity" ucd="meta.id" utype="voprov:Usage.entity"/>
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>hips:AlaRGB1</TD>
            <TD>ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
    <TABLE name="Generation" utype="voprov:wasGeneratedBy">
      <FIELD arraysize="*" datatype="char" name="entity" ucd="meta.id" utype="voprov:Generation.entity"/>
      <FIELD arraysize="*" datatype="char" name="activity" ucd="meta.id" utype="voprov:Generation.activity"/>
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>ivo://CDS/P/DSS2color#RGB_NGC6946</TD>
            <TD>hips:AlaRGB1</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
    <TABLE name="Activity" utype="voprov:Activity">
      <FIELD arraysize="*" datatype="char" name="id" ucd="meta.id" utype="voprov:Activity.id"/>
      <FIELD arraysize="*" datatype="char" name="name" ucd="meta.title" utype="voprov:Activity.name"/>
      <FIELD arraysize="*" datatype="char" name="start" ucd="" utype="voprov:Activity.startTime"/>
      <FIELD arraysize="*" datatype="char" name="stop" ucd="" utype="voprov:Activity.endTime"/>
      <FIELD arraysize="*" datatype="char" name="annotation" ucd="meta.description" utype="voprov:Activity.annotation"/>
      <FIELD arraysize="*" datatype="char" name="desc_id" ucd="" utype="voprov:ActivityDescription.id"/>
      <FIELD arraysize="*" datatype="char" name="desc_name" ucd="" utype="voprov:ActivityDescription.name"/>
      <FIELD arraysize="*" datatype="char" name="desc_type" ucd="meta.code.class" utype="voprov:ActivityDescription.type"/>
      <FIELD arraysize="*" datatype="char" name="desc_doculink" ucd="meta.ref.url" utype="voprov:ActivityDescription.doculink"/>
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>hips:AlaRGB1</TD>
            <TD>Aladin RGB 1</TD>
            <TD>2017-04-18 17:28:00</TD>
            <TD>2017-04-19 17:29:00</TD>
            <TD>Aladin RGB image generation for NGC 6946</TD>
            <TD>AlaRGB</TD>
            <TD>Aladin RGB image generation algorithm</TD>
            <TD>RGBencoding</TD>
            <TD>http://cds.u-strasbg.fr/aladin.gml</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
    <TABLE name="Entity" utype="voprov:Entity">
      <FIELD arraysize="*" datatype="char" name="id" ucd="meta.id" utype="voprov:Entity.id"/>
      <FIELD arraysize="*" datatype="char" name="name" ucd="meta.title" utype="voprov:Entity.name"/>
      <FIELD arraysize="*" datatype="char" name="annotation" ucd="meta.description" utype="voprov:Entity.annotation"/>
      <DATA>
        <TABLEDATA>
          <TR>
            <TD>ivo://CDS/P/DSS2/POSSII#POSSII.J-DSS2.143</TD>
            <TD>POSSII Blue Survey DSS2 NGC6946</TD>
            <TD>This is the DSS2 digitization of the Blue POSSII Schmidt survey around NGC 6946</TD>
          </TR>
          <TR>
            <TD>ivo://CDS/P/DSS2/POSSII#POSSII.F-DSS2.143</TD>
            <TD>POSSII Red Survey DSS2 NGC6946</TD>
            <TD>This is the DSS2 digitization of the Red POSSII Schmidt survey around NGC 6946</TD>
          </TR>
          <TR>
            <TD>ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143</TD>
            <TD>POSSII Infra Red Survey DSS2 NGC6946</TD>
            <TD>This is the DSS2 digitization of the Infra red POSSII Schmidt survey around NGC 6946</TD>
          </TR>
          <TR>
            <TD>ivo://CDS/P/DSS2color#RGB_NGC6946</TD>
            <TD>RGB DSS2 image for NGC 6946</TD>
            <TD>This is a PNG RGB image built from DSS2 with Aladin for galaxy NGC 6946</TD>
          </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
    <INFO name="QUERY_STATUS" value="OK"/>
  </RESOURCE>
</VOTABLE>
\end{verbatim}
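
A PROV-VOTABLE document can be read back with any XML parser. The following minimal Python sketch, which uses only the standard library, turns each TABLE into a list of rows keyed by the FIELD names and indexes the tables by their utype. The function name \texttt{read\_tables} is hypothetical, and the embedded document is an abridged copy of the example above (Usage table only).

\begin{verbatim}
import xml.etree.ElementTree as ET

# Default namespace of the VOTable 1.2 elements in the example.
NS = {"vot": "http://www.ivoa.net/xml/VOTable/v1.2"}

# Abridged copy of the PROV-VOTABLE example (Usage table only).
VOTABLE = """<VOTABLE version="1.2"
    xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
  <RESOURCE type="provenance">
    <TABLE name="Usage" utype="voprov:used">
      <FIELD arraysize="*" datatype="char" name="activity"/>
      <FIELD arraysize="*" datatype="char" name="entity"/>
      <DATA><TABLEDATA>
        <TR>
          <TD>hips:AlaRGB1</TD>
          <TD>ivo://CDS/P/DSS2/POSSII#POSSII.N-DSS2.143</TD>
        </TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def read_tables(votable_text):
    """Map each table utype to its rows, one dict per TR element."""
    root = ET.fromstring(votable_text)
    tables = {}
    for table in root.iterfind(".//vot:TABLE", NS):
        names = [f.get("name") for f in table.findall("vot:FIELD", NS)]
        tables[table.get("utype")] = [
            dict(zip(names, (td.text for td in tr.findall("vot:TD", NS))))
            for tr in table.iterfind(".//vot:TR", NS)
        ]
    return tables


for row in read_tables(VOTABLE)["voprov:used"]:
    print(row["activity"], "used", row["entity"])
\end{verbatim}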

Such serializations can be retrieved through access protocols (see Section~\ref{sec:access_protocols}) or integrated directly into dataset headers or ``associated metadata'' in order to provide provenance metadata for these datasets. For example, a FITS file could carry an extension called ``PROVENANCE'' that contains, in one of the serialization formats, the provenance information of the workflow that generated the file.

\TODO{Check that this keyword is not already taken.}

% \subsection{Graphic formats} --> moved to implementation section. But may want to
% include a more general section here, mentioning different ways to serialize


\subsection{Access protocols}
\label{sec:access_protocols}
We envision two possible access protocols:
\begin{itemize}
\item ProvDAL: retrieves provenance information based on the given ID of a data entity or activity.
\item ProvTAP: allows detailed queries for provenance information and the discovery of datasets based on, e.g., the code version.
\end{itemize}

\subsubsection{ProvDAL}
ProvDAL is a service whose interface is organized around one main parameter, the ``ID'' of an entity (for example the obs\_publisher\_did of an ObsDataSet) or of an activity. The response is given in one of the following formats: PROV-N, PROV-JSON, PROV-XML or PROV-VOTABLE. Additional parameters can complement ID to refine the query. FORMAT selects the output format. BACKWARD gives the number of relations to be tracked in the backward direction, i.e. along the provenance history. Its value is either a non-negative integer or ALL. If this parameter is omitted, the default is ALL, which returns the complete provenance history.
The optional parameter FORWARD defines the number of relations to be tracked in the forward direction; its value is also either a non-negative integer or ALL, but its default is 0. Thus, if neither FORWARD nor BACKWARD is specified, the complete provenance history is returned.


The ID parameter may occur more than once in order to retrieve the provenance of several datasets at the same time. An example request could look like this:

\begin{verbatim}
{provdal-base-url}?ID=rave:dr4&BACKWARD=1&FORMAT=PROV-JSON
\end{verbatim}

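A client could assemble such a request as in the following minimal Python sketch, which uses only the standard library. The function name \texttt{provdal\_url}, the base URL \texttt{https://example.org/provdal} and the second identifier \texttt{rave:dr5} are placeholders invented for this illustration; only the parameter names are taken from the text. The colon is left unescaped here to match the example request; whether a service expects it percent-encoded is not specified.

\begin{verbatim}
from urllib.parse import urlencode


def provdal_url(base_url, ids, fmt, backward=None, forward=None):
    """Build a ProvDAL request URL; ID is repeated once per identifier."""
    params = [("ID", identifier) for identifier in ids]
    if backward is not None:
        params.append(("BACKWARD", str(backward)))
    if forward is not None:
        params.append(("FORWARD", str(forward)))
    params.append(("FORMAT", fmt))
    # safe=":" keeps the colon of qualified identifiers readable.
    return base_url + "?" + urlencode(params, safe=":")


# Placeholder base URL and identifiers, for illustration only.
print(provdal_url("https://example.org/provdal", ["rave:dr4"],
                  "PROV-JSON", backward=1))
print(provdal_url("https://example.org/provdal", ["rave:dr4", "rave:dr5"],
                  "PROV-N", backward="ALL", forward=1))
\end{verbatim}
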
Each provenance relation has a direction. BACKWARD follows the relations along their direction, whereas FORWARD follows them in the reverse direction, independent of the relation type. This is easier to implement, but has a side effect that users may not expect: for example, agent relations are retrieved only with BACKWARD, never with FORWARD. The same holds for membership relations (hadStep, hadMember): the members of a collection or activityFlow are retrieved only in the BACKWARD direction, and the collections or activityFlows that contain a given entity or activity are found only in the FORWARD direction. To provide a more user-friendly interface with less surprising behaviour, we define three more request parameters: EXPAND\_AGENT, EXPAND\_COLLECTION and EXPAND\_ACTIVITYFLOW. They take TRUE or FALSE as values. If set to TRUE, the relations with agents, collections and activityFlows, respectively, are always included, independent of the direction in which the provenance graph is traversed.
\TODO{Draw a provenance graph picture here with different relation types and arrows for direction.}
\TODO{Implementations need to show if this is really the best way.}
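
The direction rule can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a reference implementation: relations are stored as triples (type, source, target) pointing from the first to the second argument of the PROV-N statement, the identifiers are shortened placeholders, and the EXPAND parameters are not modelled.

\begin{verbatim}
from collections import deque

# (type, source, target); placeholder identifiers for illustration.
RELATIONS = [
    ("wasGeneratedBy", "rgb", "alaRGB1"),
    ("used", "alaRGB1", "blue"),
    ("used", "alaRGB1", "red"),
    ("wasAssociatedWith", "alaRGB1", "agent1"),
]


def track(start, relations, depth="ALL", direction="BACKWARD"):
    """Collect the relations reachable from start within depth steps.

    BACKWARD follows each relation from source to target, FORWARD
    from target to source, regardless of the relation type.
    """
    limit = None if depth == "ALL" else int(depth)
    seen, found = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if limit is not None and dist >= limit:
            continue  # do not follow relations beyond the limit
        for rel in relations:
            _, src, dst = rel
            here, there = (src, dst) if direction == "BACKWARD" else (dst, src)
            if here != node:
                continue
            if rel not in found:
                found.append(rel)
            if there not in seen:
                seen.add(there)
                queue.append((there, dist + 1))
    return found


print(track("rgb", RELATIONS, depth=1))
print(track("blue", RELATIONS, direction="FORWARD"))
\end{verbatim}

Starting from \texttt{blue} in FORWARD direction, the sketch finds the usage and the generation relation but not the agent relation, which is the side effect described above.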

A ProvDAL service MUST implement the parameters ID, BACKWARD and FORMAT; the remaining parameters are optional.
If a request contains an optional parameter that the service does not implement, the service should return an error.
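
A service could check incoming requests along the following lines. This Python sketch is only one possible reading of the rules above: the function name \texttt{check\_request} is hypothetical, and signalling the error with a \texttt{ValueError} is an assumption, since the form of the error response is not specified here.

\begin{verbatim}
MANDATORY = {"ID", "BACKWARD", "FORMAT"}
FORMATS = {"PROV-N", "PROV-JSON", "PROV-XML", "PROV-VOTABLE"}


def check_request(params, implemented_optional=frozenset()):
    """Raise ValueError if the request cannot be honoured.

    params is a list of (name, value) string pairs;
    implemented_optional names the optional parameters
    (FORWARD, EXPAND_AGENT, ...) that this service supports.
    """
    supported = MANDATORY | set(implemented_optional)
    for name, value in params:
        if name not in supported:
            raise ValueError(f"parameter not implemented: {name}")
        if name in ("BACKWARD", "FORWARD"):
            if value != "ALL" and not value.isdigit():
                raise ValueError(f"invalid value for {name}: {value}")
        elif name == "FORMAT" and value not in FORMATS:
            raise ValueError(f"unknown format: {value}")
        elif name.startswith("EXPAND_") and value not in ("TRUE", "FALSE"):
            raise ValueError(f"invalid value for {name}: {value}")
    if not any(name == "ID" for name, _ in params):
        raise ValueError("at least one ID is required")


check_request([("ID", "rave:dr4"), ("BACKWARD", "1"),
               ("FORMAT", "PROV-JSON")])
\end{verbatim}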

Table~\ref{tab:provdal-parameters} summarizes the parameters for such a ProvDAL service interface.

\begin{table}[h]
\small
\begin{tabulary}{1.0\textwidth}{@{}p{0.17\textwidth}Lp{0.2\textwidth}p{0.10\textwidth}p{0.3\textwidth}@{}}
%{llp{0.2\textwidth}p{0.3\textwidth}}
\toprule
\head{Parameter} & \head{Requirement} & \head{Value/options} & \head{Default} & \head{Description}\\
\midrule
ID & required & qualified ID & -- & a valid qualified identifier for an entity or activity (can occur multiple times)\\
BACKWARD & required & 0,1,2,..., ALL & ALL & number of relations to be followed backwards or \texttt{ALL} for everything\\
FORWARD & optional & 0,1,2,..., ALL & 0 & number of relations to be followed forward or \texttt{ALL} for everything\\
FORMAT & required & PROV-N, PROV-JSON, PROV-XML, PROV-VOTABLE & ? & serialization format of the response\\
EXPAND\_ AGENT & optional & TRUE or FALSE & TRUE & include agent relations in any case\\
EXPAND\_ COLLECTION & optional & TRUE or FALSE & TRUE & include relations with collections in any case\\
EXPAND\_ ACTIVITYFLOW & optional & TRUE or FALSE & TRUE & include relations with activityFlows in any case\\
\bottomrule
\end{tabulary}
\caption{ProvDAL request parameters}
\label{tab:provdal-parameters}
\end{table}

\TODO{If EXPAND\_AGENT=TRUE: include all agent relations, but if EXPAND\_AGENT=FALSE, then use default behaviour? Or do not include any of the agent relations? Which one would it be?}

\clearpage
\subsubsection{ProvTAP}
ProvTAP is a TAP service implementing the ProvenanceDM data model. The data model mapping is included in the TAP schema. ProvenanceDM classes and attributes are mapped onto tables and columns of the schema, with the appropriate relationships, datatypes, units, utypes and UCDs, in a way similar to the PROV-VOTABLE serialization. The response to a query is a single table.
This single table joins information from one or several ``provenance'' tables available in the database.
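
How a single response table can join several provenance tables may be illustrated with the following Python sketch. It uses an in-memory SQLite database as a stand-in for a TAP service and plain SQL instead of ADQL; the table and column names (\texttt{entity}, \texttt{activity}, \texttt{was\_generated\_by}) are assumptions made for this example and do not represent a proposed TAP schema.

\begin{verbatim}
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE entity (id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE activity (id TEXT PRIMARY KEY, name TEXT,
                               desc_type TEXT);
        CREATE TABLE was_generated_by (entity TEXT, activity TEXT);
        """
    )
    db.executemany(
        "INSERT INTO entity VALUES (?, ?)",
        [
            ("ivo://CDS/P/DSS2color#RGB_NGC6946",
             "RGB DSS2 image for NGC 6946"),
            ("ivo://CDS/P/DSS2/POSSII#POSSII.J-DSS2.143",
             "POSSII Blue Survey DSS2 NGC6946"),
        ],
    )
    db.execute("INSERT INTO activity VALUES (?, ?, ?)",
               ("hips:AlaRGB1", "Aladin RGB 1", "RGBencoding"))
    db.execute("INSERT INTO was_generated_by VALUES (?, ?)",
               ("ivo://CDS/P/DSS2color#RGB_NGC6946", "hips:AlaRGB1"))
    return db


# One result table, joined from three provenance tables.
QUERY = """
SELECT e.id, e.name, a.id, a.desc_type
FROM entity AS e
JOIN was_generated_by AS g ON g.entity = e.id
JOIN activity AS a ON a.id = g.activity
WHERE a.desc_type = ?
"""


def generated_by_type(db, desc_type):
    """Find entities generated by activities of a given type."""
    return db.execute(QUERY, (desc_type,)).fetchall()


for row in generated_by_type(build_demo_db(), "RGBencoding"):
    print(row)
\end{verbatim}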

A special case arises when ProvenanceDM and ObsCore are both implemented in the same TAP service and queried together. The TAP response then provides an ObsCore table with a ProvenanceDM extension. In the future this could be hard-coded and registered as an ObsTapProv service.


%\TODO{Do we need combined query possibilities, i.e. ask for ObsCore-fields and Provenance fields in one query? Or rather use a 2-step-process, decoupling them from each other?}


%\TODO{Also look at PROV-AQ from the W3C.}