# Annotation of /trunk/projects/dm/provenance/description/provaccess.tex

Revision 4205
Wed Aug 2 23:04:40 2017 UTC (3 years, 2 months ago) by kriebe
File MIME type: application/x-tex
File size: 16681 byte(s)
Add updates on ProvDAL (and TODO-boxes with questions concerning this)


\end{verbatim}

Such serializations can be retrieved through access protocols (see Section~\ref{sec:access_protocols}) or integrated directly into dataset headers or ``associated metadata'' in order to provide provenance metadata for these datasets. For example, a FITS file could carry an extension called ``PROVENANCE'' that contains the provenance information of the workflow that generated the file, in one of the serialisation formats.

\TODO{Check that this keyword is not already taken.}

% \subsection{Graphic formats} --> moved to implementation section. But may want to
% include a more general section here, mentioning different ways to serialize


\subsection{Access protocols}
\label{sec:access_protocols}
We envision two possible access protocols:
\begin{itemize}
\item ProvDAL: retrieves provenance information for a given ID of a data entity or activity.
\item ProvTAP: allows detailed queries for provenance information and the discovery of datasets based on, e.g., the code version.
\end{itemize}

\subsubsection{ProvDAL}
ProvDAL is a service whose interface is organized around one main parameter, the ``ID'' of an entity (for example, the obs\_publisher\_did of an ObsDataSet) or activity. The response is given in one of the following formats: PROV-N, PROV-JSON, PROV-XML, PROV-VOTABLE. Additional parameters can complement ID to refine the query. FORMAT selects the output format. BACKWARD gives the number of relations to be followed in the backward direction, i.e. along the provenance history. Its value is either a non-negative integer or ALL.
If this parameter is omitted, the default is ALL, which returns the complete provenance history.
The optional parameter FORWARD defines the number of relations to be followed in the forward direction; its value is also either a non-negative integer or ALL, but the default is 0. Thus, if neither FORWARD nor BACKWARD is specified, the complete provenance history is returned.


The ID parameter may occur more than once in order to retrieve the provenance details of several datasets at the same time. An example request could look like this:

\begin{verbatim}
{provdal-base-url}?ID=rave:dr4&BACKWARD=1&FORMAT=PROV-JSON
\end{verbatim}

Each provenance relation has a direction: BACKWARD follows these directions, whereas FORWARD follows the relations in the reverse direction, independent of the relation type. This is easier to implement, but it has the side effect (unexpected for a user) that, e.g., agent relations are retrieved only when using BACKWARD, never with FORWARD. The same holds for membership relations (hadStep, hadMember): members of a collection or activityFlow are retrieved only in the BACKWARD direction, and collections or activityFlows that contain an entity or activity are found only in the FORWARD direction.
In order to provide a more user-friendly interface with less surprising behaviour, we define three more request parameters: EXPAND\_AGENT, EXPAND\_COLLECTION and EXPAND\_ACTIVITYFLOW. They take TRUE or FALSE as arguments. If they are set to TRUE, the relations with agents, collections and activityFlows are always included, independent of the direction in which the provenance graph is traversed.
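The request parameters described above can be illustrated with a small client-side sketch in Python. It is only an illustration under stated assumptions, not part of the specification: the function name \texttt{build\_provdal\_url}, its keyword arguments and the base URL \texttt{https://example.org/provdal} are hypothetical, and the depth values are assumed to be non-negative integers or ALL.

```python
from urllib.parse import urlencode

# Serialisation formats listed in the text above.
FORMATS = {"PROV-N", "PROV-JSON", "PROV-XML", "PROV-VOTABLE"}


def _depth(name, value):
    """Return a depth as a string: a non-negative integer or 'ALL'."""
    if value == "ALL":
        return "ALL"
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return str(value)
    raise ValueError(f"{name} must be a non-negative integer or 'ALL'")


def build_provdal_url(base_url, ids, backward="ALL", fmt="PROV-JSON",
                      forward=None, expand=None):
    """Build a ProvDAL request URL (hypothetical helper, not a standard API).

    ids     -- one or more qualified identifiers; ID may be repeated
    expand  -- optional dict such as {"AGENT": True}, mapped to EXPAND_AGENT=TRUE
    """
    if not ids:
        raise ValueError("at least one ID is required")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported FORMAT: {fmt}")

    # A list of pairs keeps repeated ID parameters and a stable order.
    params = [("ID", identifier) for identifier in ids]
    params.append(("BACKWARD", _depth("BACKWARD", backward)))
    if forward is not None:
        params.append(("FORWARD", _depth("FORWARD", forward)))
    params.append(("FORMAT", fmt))
    for key, flag in (expand or {}).items():
        params.append((f"EXPAND_{key}", "TRUE" if flag else "FALSE"))

    # Keep ':' unescaped so qualified IDs stay readable, as in the example.
    return f"{base_url}?{urlencode(params, safe=':')}"


if __name__ == "__main__":
    print(build_provdal_url("https://example.org/provdal",
                            ["rave:dr4"], backward=1))
```

Called with the values of the example request, the sketch reproduces the same query string after the (assumed) base URL. Passing several identifiers in \texttt{ids} yields one ID parameter per identifier.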
\TODO{Draw a provenance graph picture here with different relation types and arrows for direction.}
\TODO{Implementations need to show if this is really the best way.}

A ProvDAL service MUST implement the parameters ID, BACKWARD and FORMAT; the remaining parameters are optional.
If a request contains optional parameters that the service does not implement, the service should return an error.

Table~\ref{tab:provdal-parameters} summarizes the parameters for such a ProvDAL service interface.

\begin{table}[h]
\small
\begin{tabulary}{1.0\textwidth}{@{}p{0.17\textwidth}Lp{0.2\textwidth}p{0.10\textwidth}p{0.3\textwidth}@{}}
%{llp{0.2\textwidth}p{0.3\textwidth}}
\toprule
\head{Parameter} & \head{Requirement} & \head{Value/options} & \head{Default} & \head{Description}\\\hline
\midrule
ID & required & qualified ID & -- & a valid qualified identifier for an entity or activity (can occur multiple times)\\
BACKWARD & required & 0,1,2,..., ALL & ALL & number of relations to be followed backwards or \texttt{ALL} for everything\\
FORWARD & optional & 0,1,2,..., ALL & 0 & number of relations to be followed forward or \texttt{ALL} for everything\\
FORMAT & required & PROV-N, PROV-JSON, PROV-XML, PROV-VOTABLE & ?
& serialisation format of the response\\
EXPAND\_ AGENT & optional & TRUE or FALSE & TRUE & include agent relations in any case\\
EXPAND\_ COLLECTION & optional & TRUE or FALSE & TRUE & include relations with collections in any case\\
EXPAND\_ ACTIVITYFLOW & optional & TRUE or FALSE & TRUE & include relations with activityFlows in any case\\
\bottomrule
\end{tabulary}
\caption{ProvDAL request parameters}
\label{tab:provdal-parameters}
\end{table}

\TODO{If EXPAND\_AGENT=TRUE: include all agent relations, but if EXPAND\_AGENT=FALSE, then use default behaviour? Or do not include any of the agent relations? Which one would it be?}

\clearpage
\subsubsection{ProvTAP}
ProvTAP is a TAP service implementing the ProvenanceDM data model. The data model mapping is included in the TAP schema. The mapping of ProvenanceDM classes and attributes onto tables and columns of the schema, with the appropriate relationships, datatypes, units, utypes and UCDs, is done similarly to the PROV-VOTABLE serialization. The response to a query is a single table, as determined by the query.
This single table joins information coming from one or several ``provenance'' tables available in the database.

A special case is considered where ProvenanceDM and ObsCore are both implemented in the same TAP service and queried together. The TAP response then provides an ObsCore table with a ProvenanceDM extension. We can imagine that in the future this could be hard-coded and registered as an ObsTapProv service.


%\TODO{Do we need combined query possibilities, i.e. ask for ObsCore-fields and Provenance fields in one query? Or rather use a 2-step-process, decoupling them from each other?}


%\TODO{Also look at PROV-AQ from the W3C.}
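To make the combined ObsCore and ProvenanceDM case more concrete, the following Python sketch builds a synchronous TAP request that looks for datasets produced with a given code version. It is a rough illustration only. The provenance table and column names (\texttt{provenance.was\_generated\_by}, \texttt{provenance.activity}, \texttt{version}, etc.) and the base URL are invented for this example, since the actual ProvTAP schema is not fixed here; only \texttt{ivoa.obscore}, \texttt{obs\_publisher\_did} and the TAP parameters REQUEST, LANG and QUERY are taken from the existing ObsCore and TAP standards.

```python
from urllib.parse import urlencode

# ASSUMPTION: the provenance.* tables and their columns are hypothetical
# placeholders; a real ProvTAP service would publish its own schema.
ADQL_TEMPLATE = (
    "SELECT o.obs_publisher_did, a.name, a.version "
    "FROM ivoa.obscore AS o "
    "JOIN provenance.was_generated_by AS g "
    "ON g.entity_id = o.obs_publisher_did "
    "JOIN provenance.activity AS a ON a.id = g.activity_id "
    "WHERE a.version = '{version}'"
)


def build_provtap_sync_url(base_url, code_version):
    """Build a synchronous TAP query URL (hypothetical helper)."""
    # Double single quotes so the value is a valid ADQL string literal.
    literal = code_version.replace("'", "''")
    query = ADQL_TEMPLATE.format(version=literal)
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": query}
    return f"{base_url}/sync?{urlencode(params)}"


if __name__ == "__main__":
    print(build_provtap_sync_url("https://example.org/tap", "1.2"))
```

The result of such a query would be a single table with ObsCore columns extended by provenance columns, in line with the special case described above.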