# Contents of /trunk/projects/dm/provenance/ProvDM/doc/datamodel-description.tex

Revision 5691
Fri Nov 15 16:39:50 2019 UTC (8 months, 2 weeks ago) by mathieu.servillat
update intro phrase on FAIR principles, update agent role description, update figures


\subsection{Overview and class diagram}
\label{sec:overview}


\begin{figure}[hbt]
\centering
\includegraphics[width=1.0\textwidth]{PROV_Fig3.png}
\caption[Overview class diagram of the IVOA Provenance Data Model]{Overview class diagram of the IVOA Provenance Data Model. The core part in yellow is based on W3C PROV definitions, with relations shown in grey. It is extended by a description part (orange), specific types of entities (red) and an \class{ActivityConfiguration} package (green). A full diagram with attributes is shown in Appendix~\ref{sec:fulldiagram}, Figure~\ref{fig:fulldiagram}.}
\label{fig:overview}
\end{figure}

The IVOA Provenance DM is based on the PROV-DM recommendation \citep{std:W3CProvDM} of the World Wide Web Consortium (W3C), which provides the core elements of the model (see Sections~\ref{sec:ent_act} to~\ref{sec:agent+relations}).
In the VO context, the provenance of something is thus a sequence of activities, run by agents, that use and generate entities.

In addition, the model includes description classes (see Section~\ref{sec:descriptions}) to provide information common to several elements; specific types of \class{Entity} classes commonly used in astronomy (see Section~\ref{sec:spec_entities}); and an \class{ActivityConfiguration} package (see Section~\ref{sec:configuration}).

The IVOA Provenance DM is a class data model that follows the VO-DML design rules \citep{2018ivoa.spec.0910L}. It is represented as a UML class diagram: an overview diagram is shown in Figure~\ref{fig:overview}, and a full diagram with attributes is shown in Appendix~\ref{sec:fulldiagram}, Figure~\ref{fig:fulldiagram}.


\subsection{Entity and Activity classes}
\label{sec:ent_act}

The core classes and relations of the IVOA Provenance DM are presented in Figure~\ref{fig:coreclasses}.
Traceability (see goal A in Section~\ref{sec:goals}) is enabled by chaining entities and activities, which are the building blocks of the history graph.


\begin{figure}[ht]
\centering
\includegraphics[width=1.0\textwidth]{PROV_Fig4.png}
\caption[Core classes and relations]{Core classes and relations. Attributes for these classes are detailed in tables found in Sections~\ref{sec:ent_act} to~\ref{sec:agent+relations}.}
\label{fig:coreclasses}
\end{figure}



\subsubsection{Entity and Collection classes}
\label{sec:Entity}

An \textbf{entity} is a physical, digital, conceptual, or other kind of thing with some fixed aspects (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-entity}{\S5.1.1}).

The \class{Entity} class in the model has the attributes given in Table~\ref{tab:entity}.

Entities in astronomy are usually astronomical or astrophysical datasets in the form of images, tables, numbers, etc. But they can also be log files, files containing system information, any input or output value, environment variables, ambient conditions, or, in a wider sense, observation proposals, scientific articles, or manuals and other documents.
Though the focus is on digital entities in this document, entities can also refer to physical entities that may be linked to digital entities, such as tools, instruments, detectors, or photographic plates.


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize Entity}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
\textbf{id} & string & a unique identifier for this entity\\
name & string & a human-readable name for the entity\\
%a provenance type, i.e.~one of: prov:collection, prov:bundle, prov:plan; or any of the specialized entities defined in Section~\ref{sec:spec_entities} \\
%description\_ref & foreign key/url & link to \class{EntityDescription}\\
location & string & a path or spatial coordinates, e.g., a URL, latitude-longitude coordinates on Earth, the name of a place.\\
%value & prov:value & & provides a value that is a direct representation of the entity \\
generatedAtTime & datetime & date and time at which the entity was created (e.g., timestamp of a file)\\
invalidatedAtTime & datetime & date and time of invalidation of the entity. After that date, the entity is no longer available for any use.\\
%date and time of the destruction or cessation of the entity. The entity is no longer available for use (or further invalidation) after invalidation.\\
comment & string & text containing specific comments on the entity\\
%rights & -- & string & access rights for the entity, values: public, secure or proprietary; see Curation.Rights, RightsType in DatasetDM\\
%\midrule
%$\rightarrow$ description & & link & link to \class{EntityDescription}\\
%$\rightarrow$ wasAttributedTo & prov:wasAttributedTo & link & link to \class{WasAttributedTo} for linking with a responsible \class{Agent}\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{Entity} class]{Attributes of the \class{Entity} class. Attributes in \textbf{bold} are mandatory and must not be null.
}\label{tab:entity}
\end{table}


A \textbf{collection} is an entity that provides a structure to some constituents that must themselves be entities (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-collection}{\S5.6.1}). These constituents are said to be members of the collection. They are connected in the model with a \class{hadMember} relation.


\subsubsection{Activity class}
\label{sec:activity}

An \textbf{activity} is something that occurs over a period of time and acts upon or with entities; it may include consuming, processing, transforming, modifying, relocating, using, or generating entities (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-Activity}{\S5.1.2}).

The \class{Activity} class in the model has the attributes given in Table~\ref{tab:activity}.

Activities in astronomy include all steps from obtaining data to the reduction
of images and production of new datasets, such as image calibration, bias
subtraction, image stacking, light curve generation from a number of
observations, radial velocity determination from spectra, post-processing steps
of simulations, etc.


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize Activity}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
\textbf{id} & string & a unique id for this activity\\
name & string & a human-readable name (to be displayed by clients)\\
startTime & datetime & start of an activity\\
endTime & datetime & end of an activity\\
% startTime and endTime are not strictly required -- and in case of a reproducible activity
% they are meaningless. Therefore, I removed the bf here.
% mireille : I do not agree for this change by Ole : They can be unknown but they must be mandatory in order to allow to query on time for an activity, and to tell the execution order.
% ole so what should I put there if the time is not known? And what is the use case of using the temporal execution order instead of the logical one (following the provenance links used/wasgeneratedby)?
comment & string & text containing specific comments on the activity\\
%status & & string & can be used to describe the terminal status of the activity (e.g., completed, aborted, error...)\\
%votype & string & can be either ``activity'' or ``activityFlow''\\
%, used to differentiate between these two class types, if ``activityFlow'' is not implemented as an extra class (and in W3C compatible serializations)\\
%\midrule
%$\rightarrow$ description & & link & link to \class{ActivityDescription}\\
%$\rightarrow$ wasAssociatedWith & prov:wasAssociatedWith & link & link to \class{WasAssociatedWith} for linking with a responsible \class{Agent}\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{Activity} class.]{Attributes of the \class{Activity} class. Attributes in \textbf{bold} are mandatory and must not be null.}\label{tab:activity}
%, references are indicated with an arrow ($\rightarrow$).}
\end{table}


\subsection{Entity-Activity relations}
\label{sec:entity-activity-relations}

Each entity is usually a result of an activity, expressed by a link from the entity to its generating activity, and can be used as input for (many) other activities.
Thus the information on whether data is used as input or was produced as output of some activity is given by the \emph{relations} between activities and entities.
Tracking those relations answers one of the main objectives of the model (see goal A in Section~\ref{sec:goals}).


\subsubsection{Used class}

\textbf{Usage} is the beginning of utilizing an entity by an activity.
Before usage, the activity had not begun to utilize this entity and could not have been affected by the entity (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-Usage}{\S5.1.4}).

Usage is implemented in the model by a class \class{Used} that connects \class{Activity} to \class{Entity} and contains the attributes in Table~\ref{tab:used}.

For example, an activity ``calibration'' used entities with the roles ``calibration data'' and ``raw images''.

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize Used}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\multicolumn{4}{@{}l}{References}\\
%\midrule
%$\rightarrow$ \textbf{activity} & prov:activity & link & link to an \class{Activity} instance\\
%$\rightarrow$ \textbf{entity} & prov:entity & link & link to an \class{Entity} instance\\
%$\rightarrow$ description & & link & link to the corresponding \class{UsedDescription}, if existing\\
%\midrule
%\textbf{id} & prov:id & string & an identifier for this relation\\
role & string & function of the entity with respect to the activity\\
time & datetime & time at which the usage of an entity started\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{Used} relation class]{Attributes of the \class{Used} relation class.}
\label{tab:used}
\end{table}

The \attribute{time} of the usage can be specified, and must be between the \attribute{startTime} and the \attribute{endTime} of the corresponding activity.

The \class{Used} class is closely coupled to the \class{Activity} by a composition (see \ref{sect:Composition}).
Any given entity can be used by more than one activity.


\subsubsection{WasGeneratedBy class}

\textbf{Generation} is the completion of production of a new entity by an activity.
This entity did not exist before generation and becomes available for usage after this generation (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-Generation}{\S5.1.3}).

Generation is implemented in the model by a class \class{WasGeneratedBy} that connects \class{Entity} to \class{Activity} and contains the attributes in Table~\ref{tab:wasgeneratedby}.

For example, the entity ``raw\_image.fits'' was generated by the activity ``observation'' with the role ``raw image''.

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize WasGeneratedBy}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\multicolumn{4}{@{}l}{References}\\
%\midrule
%$\rightarrow$ \textbf{entity} & prov:entity & link & link to an \class{Entity} instance\\
%$\rightarrow$ \textbf{activity} & prov:activity & link & link to an \class{Activity} instance\\
%$\rightarrow$ description & & link & link to the corresponding \class{WasGeneratedByDescription}, if existing\\
%\midrule
%\textbf{id} & prov:id & string & an identifier for this relation\\
role & string & function of the entity with respect to the activity\\
%time & prov:time & datetime & time at which the generation of an entity is finished\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{WasGeneratedBy} relation class]{Attributes of the \class{WasGeneratedBy} relation class.}
\label{tab:wasgeneratedby}
\end{table}

As the \class{Entity} class has an attribute \attribute{generatedAtTime}, there is no additional time attribute in this relation.

The \class{WasGeneratedBy} relation is closely coupled with the \class{Entity} via a composition (see \ref{sect:Composition}).
An entity can be generated by only one activity, so the multiplicity between \class{Entity} and \class{WasGeneratedBy} is 0 or 1.
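
As an informal illustration of the two relation classes, the following Python sketch stores \class{Used} and \class{WasGeneratedBy} as plain records and checks two constraints stated in the text: the usage time lies within the interval of the activity, and an entity has at most one generation. It is not part of the model; the record layout and the helper names \texttt{usage\_time\_is\_valid} and \texttt{has\_single\_generation} are assumptions made for this example only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Entity:
    id: str
    generatedAtTime: Optional[datetime] = None


@dataclass
class Activity:
    id: str
    startTime: Optional[datetime] = None
    endTime: Optional[datetime] = None


@dataclass
class Used:
    activity: Activity
    entity: Entity
    role: Optional[str] = None
    time: Optional[datetime] = None


@dataclass
class WasGeneratedBy:
    entity: Entity
    activity: Activity
    role: Optional[str] = None


def usage_time_is_valid(used: Used) -> bool:
    """True if Used.time is unset or lies within the activity interval."""
    act = used.activity
    if used.time is None:
        return True
    if act.startTime is not None and used.time < act.startTime:
        return False
    if act.endTime is not None and used.time > act.endTime:
        return False
    return True


def has_single_generation(relations: list, entity: Entity) -> bool:
    """True if at most one WasGeneratedBy relation refers to the entity."""
    return sum(1 for r in relations if r.entity is entity) <= 1


# Hypothetical instances, loosely following the examples in the text.
raw = Entity("raw_image.fits")
observation = Activity("observation", datetime(2019, 1, 1, 20), datetime(2019, 1, 1, 22))
calibration = Activity("calibration", datetime(2019, 1, 2, 9), datetime(2019, 1, 2, 10))

generation = WasGeneratedBy(raw, observation, role="raw image")
usage = Used(calibration, raw, role="raw images", time=datetime(2019, 1, 2, 9, 5))
late_usage = Used(calibration, raw, role="raw images", time=datetime(2019, 1, 3))
```

Unknown start or end times are treated as open bounds here, which is one possible reading; an implementation may prefer to reject such cases.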


\subsubsection{Roles in Entity-Activity relations}
\label{sec:roles}

The \attribute{role} of an entity within an activity should be provided.
Roles in \class{Entity}-\class{Activity} relations are free text attributes.

The \attribute{role} cannot be an attribute of the \class{Entity} class, since the same entity (e.g., a specific file containing an image) may play different roles with different activities.

In some cases the role is mandatory to distinguish two input entities. For example, an activity for dark-frame subtraction requires two input images, and it is essential to know which of the images is the raw image and which one fulfils the role of dark frame.

Several entities may play the same role for an activity. For example, many image entities may be used as science-ready-images for an image stacking process.



\subsubsection{WasDerivedFrom relation}

A \textbf{derivation} is a transformation of an entity into another, an update of an entity resulting in a new one, or the construction of a new entity based on a pre-existing entity (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-Derivation}{\S5.2.1}).

Derivation is a relation \class{wasDerivedFrom} in the model that connects an instance of \class{Entity} to another instance.

For example, the entity ``calibrated\_image.fits'' was derived from the entity ``raw\_image.fits''.

This relation makes it possible to visualize independently the flow of entities, e.g., a dataflow. It does not need a priori a specific class or table in an implementation, but it provides a way to expose partial information that follows the general chain \class{WasGeneratedBy}-\class{Activity}-\class{Used}, where the activity may be an empty instance because it is unknown or irrelevant.
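
The chain \class{WasGeneratedBy}-\class{Activity}-\class{Used} can be traversed to list candidate derivations. The Python sketch below is only an illustration: relations are reduced to (activity id, entity id) tuples, and the function name \texttt{candidate\_derivations} is hypothetical. The result should be read as an upper bound, since an activity may use inputs that did not influence a given output; an explicit \class{wasDerivedFrom} relation can therefore still carry information of its own.

```python
from typing import Iterable, Set, Tuple

Relation = Tuple[str, str]  # (activity id, entity id)


def candidate_derivations(
    used: Iterable[Relation], was_generated_by: Iterable[Relation]
) -> Set[Tuple[str, str]]:
    """Return (derived entity, source entity) pairs linked through one activity."""
    # Group the input entities of each activity.
    inputs = {}
    for activity, entity in used:
        inputs.setdefault(activity, set()).add(entity)
    # Pair every output of an activity with every input of the same activity.
    pairs = set()
    for activity, entity in was_generated_by:
        for source in inputs.get(activity, ()):
            pairs.add((entity, source))
    return pairs


# Hypothetical relations, loosely following the examples in the text.
used = [("calibration", "raw_image.fits"), ("calibration", "dark_frame.fits")]
was_generated_by = [
    ("observation", "raw_image.fits"),
    ("calibration", "calibrated_image.fits"),
]
derivations = candidate_derivations(used, was_generated_by)
```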


\subsubsection{WasInformedBy relation}

\textbf{Communication} is the exchange of information (some unspecified entity) by two activities, one activity using some entity generated by the other (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-Communication}{\S5.1.5}).

Communication is a relation \class{wasInformedBy} in the model that connects an instance of \class{Activity} to another instance.

For example, the activity ``calibration'' was informed by the activity ``pipeline''.

This relation makes it possible to visualize independently the flow of activities as they occurred, which may be the result of the execution of a workflow. It does not need a priori a specific class or table in an implementation, but it provides a way to expose partial information that follows the general chain \class{Used}-\class{Entity}-\class{WasGeneratedBy}, where the entity may be an empty instance because it is unknown or irrelevant.


\subsection{Agent and relations to Agent}
\label{sec:agent+relations}

Contact information is needed in case more information about a certain activity or entity is required, and also to know who was involved and to fulfil the Acknowledgement objective (see goal B in Section~\ref{sec:goals}).


\subsubsection{Agent class}
\label{sec:agent}

An \textbf{agent} is something that bears some form of responsibility for an activity taking place or for the existence of an entity (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-agent}{\S5.3.1}).

The \class{Agent} class in the model has the attributes given in Table~\ref{tab:agent}.

An agent is generally someone who pressed a button, ran a script, performed the observation or published a dataset. The agent can be a single person, a group of persons, a project or an institute (the vocabulary is defined in Table~\ref{tab:agent-types}).

It is recommended to use organizational agents and agents with generic contacts.


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize Agent}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
\textbf{id} & string & unique identifier for an agent\\
\textbf{name} & string & a common name for this agent; e.g., first name and last name; project name, pipeline team, data center.\\
type & AgentType & type of the agent as given in Table~\ref{tab:agent-types}\\
comment & string & text containing specific comments on the agent\\
email & string & contact email of the agent\\
affiliation & string & affiliation of the agent\\
phone & string & phone number\\
address & string & address of the agent\\
url & anyURI & reference URL to the agent\\
% insert here the attributes dedicated to contact for a Party in DataSet Metadata DM.
% \hline
% \multicolumn{4}{l}{Additional optional attributes from Dataset.Party subclasses:}\\
% \hline
% address & & string & Address of the agent both for Individual (Person) and Organization\\
% phone & & string & Contact phone number of the agent both for Individual (Person) and Organization\\
% email & & string & Contact email of the agent both for Individual (Person) and Organization\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{Agent} class]{Attributes of the \class{Agent} class.
Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:agent}
\end{table}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize AgentType}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{lp{8cm}}
\toprule
\head{Type} &\head{Description} \\
\midrule
Person & person agents are people\\
Organization & a social or legal institution, e.g., an institute, a consortium, a project\\
SoftwareAgent & running software, e.g., a cron job or a trigger \\
\bottomrule
\end{tabulary}
\caption[Enumeration of Agent types.]{Enumeration of Agent types.}
\label{tab:agent-types}
\end{table}

% 2018-12 commented
%A definition of organizations is given in the
%IVOA Recommendation on Resource Metadata \citep{std:ResourceMeta}, hereafter
%referred to as RM: ``An organization is [a] specific type of resource that
%brings people together to pursue participation in VO applications.''
%It also specifies further that scientific projects can be considered
%as organizations on a finer level:
%``At a high level, an organization could be a university, observatory, or government
%agency. At a finer level, it could be a specific scientific project, space mission,
%or individual researcher. A provider is an organization that makes data and/or services
%available to users over the network.''

For each agent a \attribute{name} must be specified.
Other attributes can help locate or contact the agent (\attribute{email}, \attribute{affiliation}, \attribute{phone}, \attribute{address}).
Not every project will need them; e.g., an advanced system may use permanent identifiers (ORCIDs, identities in federations, etc.) to identify agents, and retrieve their properties from an external system instead.

%It is desired to have at least one agent given for each activity.
There can be more than one agent for each activity, and one agent can be responsible for more than one activity or entity, using the relations defined in the following sections.


\subsubsection{WasAssociatedWith class}

An activity \textbf{association} is an assignment of responsibility to an agent for an activity, indicating that the agent had a role in the activity (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-Association}{\S5.3.3}).

Association is implemented in the model by a class \class{WasAssociatedWith} that connects \class{Activity} to \class{Agent} and contains the attributes in Table~\ref{tab:wasassociatedwith}.

For example, the agent ``Max Smith'' was associated with the activity ``observation'' with the role ``Observer''.

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize WasAssociatedWith}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\multicolumn{4}{@{}l}{References}\\
%\midrule
%$\rightarrow$ \textbf{agent} & prov:agent & link & link to an \class{Agent} instance\\
%$\rightarrow$ \textbf{activity} & prov:activity & link & link to an \class{Activity} instance\\
%$\rightarrow$ description & & link & link to the corresponding \class{WasGeneratedByDescription}, if existing\\
%\midrule
%\textbf{id} & prov:id & string & an identifier for this relation\\
role & string & function of the agent with respect to the activity\\
\bottomrule
\end{tabulary}
\caption[Attributes of \class{WasAssociatedWith} relation class]{Attributes of \class{WasAssociatedWith} relation class.}
\label{tab:wasassociatedwith}
\end{table}


\subsubsection{WasAttributedTo class}

\textbf{Attribution} is the ascribing of an entity to an agent.
When an entity is attributed to an agent, this entity was generated by some unspecified activity that in turn was associated to the agent. Thus, this relation is generally useful when the activity is not known, or irrelevant (W3C PROV-DM \href{https://www.w3.org/TR/prov-dm/#term-attribution}{\S5.3.2}).
%The agent therefore bears some responsibility for its existence.

%When an entity is attributed to an agent, this entity was generated by some unspecified activity that in turn was associated to the agent. Thus, this relation is generally useful when the activity is not known, or irrelevant.

Attribution is implemented in the model by a class \class{WasAttributedTo} that connects \class{Entity} to \class{Agent} and contains the attributes in Table~\ref{tab:wasattributedto}.

For example, the entity ``science\_image.fits'' was attributed to the agent ``observatory''.


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize WasAttributedTo}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\multicolumn{4}{@{}l}{References}\\
%\midrule
%$\rightarrow$ \textbf{agent} & prov:agent & link & link to an \class{Agent} instance\\
%$\rightarrow$ \textbf{entity} & prov:entity & link & link to an \class{Entity} instance\\
%\midrule
%\textbf{id} & prov:id & string & an identifier for this relation\\
role & string & function of the agent with respect to the entity \\
\bottomrule
\end{tabulary}
\caption[Attributes of \class{WasAttributedTo} relation class]{Attributes of \class{WasAttributedTo} relation class.}
\label{tab:wasattributedto}
\end{table}


\subsubsection{Agent roles}

Agents may play a specific role with respect to an activity or an entity.
%For example: telescope observer, pipeline operator, principal investigator, software engineer, project helpdesk.
The \attribute{role} attribute should be specified whenever it is known.

Roles in relations to \class{Agent} are free text attributes, but if one of the terms in Table~\ref{tab:agent-roles} applies, it should be used.

% DataCite roles, see https://schema.datacite.org/meta/kernel-4.2/doc/DataCite-MetadataKernel_v4.2.pdf
% ContactPerson DataCollector DataCurator DataManager Distributor Editor HostingInstitution Producer ProjectLeader ProjectManager ProjectMember RegistrationAgency RegistrationAuthority RelatedPerson Researcher ResearchGroup RightsHolder Sponsor Supervisor WorkPackageLeader Other

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize Agent roles}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{lp{8cm}}
\toprule
\head{Role} & \head{Description} \\
\midrule
Author & the agent was at the origin of a written entity (e.g., article, document, proposal) \\
Contributor & the agent helped in the creation of an entity or execution of an activity \\
Coordinator & the agent was leading the organisation of an activity \\
Creator & the agent created an entity or an activity \\
Curator & the agent was responsible for the legacy aspects of an entity \\
Editor & the agent validated the content of an entity \\
Funder & the agent provided financial support for an activity or the creation of an entity \\
Investigator & the agent was responsible for the scientific goals of an activity \\
Observer & the agent executed an observation activity or was responsible for observing a specific entity \\
Operator & the agent was in charge of performing an activity or using an entity \\
Provider & the agent effectively gave access and delivered the entity \\
Publisher & the agent certified and made an entity available to the public \\
\bottomrule
\end{tabulary}
\caption[Terms applicable as agent roles.]{Terms applicable as agent roles.}
\label{tab:agent-roles}
\end{table}




\subsection{Description classes}
\label{sec:descriptions}

In the domain of astronomy, certain processes and steps are repeated over and over again, possibly with a different configuration and within a different context.
We therefore separate the descriptions of activities from the actual processes and introduce an \class{ActivityDescription} class (Section~\ref{sec:activity_desc}).
Likewise, we apply the same pattern to \class{Entity} and add an \class{EntityDescription} class (Section~\ref{sec:entity_desc}).

Defining such descriptions allows them to be predefined and reused, which reduces redundancy when exposing the provenance of a series of tasks of the same type.
Providing detailed descriptions of activities and entities helps assess the quality and reliability of the processes executed (see goal C in Section~\ref{sec:goals}).

Figure~\ref{fig:classdiagram_descriptions} shows the part of the class diagram that focuses on the description classes.

\begin{figure}[ht]
\centering
\includegraphics[width=1.0\textwidth]{PROV_Fig5.png}
\caption[Partial class diagram focused on description classes.]{Partial class diagram focused on description classes.}
\label{fig:classdiagram_descriptions}
\end{figure}


\subsubsection{ActivityDescription class}
\label{sec:activity_desc}


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize ActivityDescription}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\textbf{id} & string & a unique id for this activity description\\
\textbf{name} & string & a human-readable name\\
version & string & a version number, if applicable (e.g., for the code used)\\
description & string & additional free text describing how the activity works internally\\
docurl & anyURI & link to further documentation on this activity, e.g., a
paper, the source code in a version control system etc.\\
type & string & type of the activity\\
subtype & string & more specific subtype of the activity\\
% code & string & the code (software) used for this process, if applicable\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{ActivityDescription} class]{Attributes of the \class{ActivityDescription} class. Attributes in \textbf{bold} are mandatory and must not be null.
}\label{tab:activitydescription}
\end{table}


The information necessary to describe how an activity works internally is stored in \class{ActivityDescription} objects.

\class{ActivityDescription} is directly attached to \class{Activity} and can thus be seen as a list of attributes that can be known before an \class{Activity} instance is created.

There must be either zero or one \class{ActivityDescription} instance per activity.
If an activity is linked to an \class{ActivityDescription} instance, \class{Used}/\class{WasGeneratedBy}/\class{Entity} objects bound to this activity must refer to the description elements composing the \class{ActivityDescription}.

%If a \class{Used}/\class{WasGeneratedBy} object is linked to a corresponding description element, then there must exist a link between the related activity and its corresponding activity description.


The activity \attribute{type} is a free text attribute, but if one of the terms in Table~\ref{tab:activitydescription-types} applies, it should be used.
The activity \attribute{subtype} is a free text attribute to be used internally by the project that defined \class{ActivityDescription} instances (e.g., mosaicing, denoising, photometric calibration, cross correlation).


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize ActivityDescription types}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{lp{8cm}}
\toprule
\head{Type} & \head{Description} \\
\midrule
Observation & active acquisition of information on a phenomenon\\
Simulation & generation of data through a computational process\\
Reduction & transformation of digital information into a corrected, ordered, and simplified form\\
Calibration & transformation and comparison of measurement values with respect to a calibration standard of known accuracy\\
Reconstruction & estimation of physical properties using indirect information\\
Selection & application of filters or criteria to select partial information\\
Analysis & process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making\\
\bottomrule
\end{tabulary}
\caption[Terms applicable as activity types.]{Terms applicable as activity types.}
\label{tab:activitydescription-types}
\end{table}



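
To make the reuse pattern concrete, the Python sketch below attaches one shared \class{ActivityDescription} record to several \class{Activity} records and checks whether its \attribute{type} is one of the recommended terms. The class layout, the constant \texttt{RECOMMENDED\_TYPES} and the method \texttt{uses\_recommended\_type} are assumptions made for this illustration and are not defined by the model.

```python
from dataclasses import dataclass
from typing import Optional

# Terms from the table of activity types; other free-text values remain allowed.
RECOMMENDED_TYPES = {
    "Observation", "Simulation", "Reduction", "Calibration",
    "Reconstruction", "Selection", "Analysis",
}


@dataclass(frozen=True)
class ActivityDescription:
    name: str  # the only mandatory attribute
    version: Optional[str] = None
    type: Optional[str] = None
    subtype: Optional[str] = None

    def uses_recommended_type(self) -> bool:
        """True if the free-text type matches one of the recommended terms."""
        return self.type in RECOMMENDED_TYPES


@dataclass
class Activity:
    id: str
    description: Optional[ActivityDescription] = None  # zero or one per activity


# One predefined description, reused by three hypothetical executions.
stacking = ActivityDescription(
    name="image stacking", version="1.2", type="Reduction", subtype="mosaicing"
)
runs = [Activity(f"stack-{n}", stacking) for n in range(3)]
```

Because the description is stored once and referenced by each run, the static information is not repeated per activity.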
\subsubsection{EntityDescription class}
\label{sec:entity_desc}


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize EntityDescription}\vspace{0.25em}\\
\begin{tabulary}{\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\textbf{id} & string & a unique identifier for this description\\
name & string & a human-readable name for the entity description\\
description & string & a descriptive text for this kind of entity\\
doculink & anyURI & link to more documentation\\
type & string & type of the entity\\
% \midrule
% \multicolumn{3}{@{}l}{\textbf{Optional attributes:}} \\
% content\_type & string & MIME type for the content of the entity\\
% format & string & type of container for the entity\\
% removed the obscore attributes, since specific for observations only, not applicable to configuration entities etc.
% dataproduct\_ type & string & from ObsCore data model \citep{std:ObsCore}, if applicable; describes, what kind of product it is (e.g., image, table)\\
% dataproduct\_ subtype & string & from ObsCore data model, more specific subtype\\
% level & enum integer & the level of processing or calibration; for ObsCore's calib\_level it is an integer between 0 and 3\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{EntityDescription} class]{Attributes of the \class{EntityDescription} class.
}\label{tab:entitydescription}
\end{table}


The \class{EntityDescription} class is meant to store descriptive information for different categories of entities. It contains information that is known before an \class{Entity} instance is created. The \class{EntityDescription} general attributes are summarized in Table~\ref{tab:entitydescription}.

For example, a specific category of entities in a project may be defined in detail in a document or on a webpage (e.g., a CTA DL3 file, a CCD device, a photographic plate).

The entity \attribute{type} is a free text attribute that contains the general category of the entity, e.g., whether it is data, a document, a visualization, or a device.

The \class{EntityDescription} class should not contain information about the usage of the data; in particular, it generally says nothing about whether the entities are used as input or generated as output. This kind of information should be provided by the relations (and their descriptions) between activities and entities (see Sections~\ref{sec:entity-activity-relations} and \ref{sec:use_gen_desc}).


\subsubsection{UsageDescription and GenerationDescription classes}
\label{sec:use_gen_desc}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize UsageDescription}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
% \midrule
% $\rightarrow$ \textbf{activityDescription} & link & link to \class{ActivityDescription}\\
% $\rightarrow$ entityDescription & link & link to \class{EntityDescription}\\
\midrule
%\textbf{id} & string & identifier\\
\textbf{role} & string & function of the entity with respect to the activity \\
description & string & a descriptive text for this kind of usage \\
type & string & type of relation, see Section~\ref{sec:ugtypes} \\
multiplicity & string & Number of expected input entities to be used with the given role. The multiplicity syntax is similar to that of VO-DML (\citealt{2018ivoa.spec.0910L}, \S4.19) in the form ``minOccurs..maxOccurs'' or a single value if minOccurs and maxOccurs are identical, e.g., ``1'' for one item, ``*'' for unbounded or ``3..*'' for unbounded with at least 3 items.
\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{UsageDescription} class]{Attributes of the \class{UsageDescription} class. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:usagedescription}
\end{table}


\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize GenerationDescription}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\textbf{id} & string & identifier\\
\textbf{role} & string & function of the entity with respect to the activity \\
description & string & a descriptive text for this kind of generation \\
type & string & type of relation, see Section~\ref{sec:ugtypes} \\
multiplicity & string & Number of expected output entities that will be generated with the given role. The multiplicity syntax is similar to that of VO-DML (\citealt{2018ivoa.spec.0910L}, \S4.19) in the form ``minOccurs..maxOccurs'' or a single value if minOccurs and maxOccurs are identical, e.g., ``1'' for one item, ``*'' for unbounded or ``3..*'' for unbounded with at least 3 items. \\
% \midrule
% $\rightarrow$ \textbf{activityDescription} & link & link to an \class{ActivityDescription}\\
% $\rightarrow$ entityDescription & link & link to \class{EntityDescription}\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{GenerationDescription} class]{Attributes of the \class{GenerationDescription} class. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:wasgeneratedbydescription}
\end{table}


To describe an activity more precisely, the expected inputs and outputs of this activity should be specified.
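
As an illustration of the \attribute{multiplicity} syntax given in Tables~\ref{tab:usagedescription} and~\ref{tab:wasgeneratedbydescription}, the following Python sketch parses a multiplicity string and checks an observed number of entities against it. It is not part of the data model: the function names \texttt{parse\_multiplicity} and \texttt{check\_count} are hypothetical, only the three forms listed in the tables are handled, and a bare ``*'' is assumed to mean zero or more.

```python
from typing import Optional, Tuple


def parse_multiplicity(text: str) -> Tuple[int, Optional[int]]:
    """Return (min_occurs, max_occurs); max_occurs is None when unbounded."""
    text = text.strip()
    if text == "*":
        # Assumption: a bare "*" is read as "0..*" (zero or more).
        return 0, None
    if ".." in text:
        low, high = text.split("..", 1)
        min_occurs = int(low)
        max_occurs = None if high == "*" else int(high)
    else:
        # A single value means minOccurs and maxOccurs are identical.
        min_occurs = max_occurs = int(text)
    if min_occurs < 0 or (max_occurs is not None and max_occurs < min_occurs):
        raise ValueError(f"invalid multiplicity: {text!r}")
    return min_occurs, max_occurs


def check_count(multiplicity: str, count: int) -> bool:
    """Tell whether `count` entities with a given role satisfy the multiplicity."""
    min_occurs, max_occurs = parse_multiplicity(multiplicity)
    return count >= min_occurs and (max_occurs is None or count <= max_occurs)
```

With these assumptions, \texttt{check\_count("3..*", 2)} is false because at least three entities are expected, whereas any number of stacked input images satisfies ``*''.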

We introduce the \class{UsageDescription} and the \class{GenerationDescription} classes, which are meant to store the information about the usage or generation of entities that is known before an activity instance is executed, i.e.~what we expect to store in the \class{Used} and \class{WasGeneratedBy} relations (see Section~\ref{sec:entity-activity-relations}).
Instances of \class{Used} (respectively \class{WasGeneratedBy}) may thus point to an instance of \class{UsageDescription} (respectively \class{GenerationDescription}).

If a \class{UsageDescription} (respectively \class{GenerationDescription}) instance is defined, the \attribute{role} attribute of the related \class{Used} (respectively \class{WasGeneratedBy}) instances must match the \attribute{role} attribute of this \class{UsageDescription} (respectively \class{GenerationDescription}) instance.

A \attribute{multiplicity} attribute should be specified to indicate the number of entities expected to share the same role for a given \class{ActivityDescription} instance, e.g., in the case of the stacking of images, several images are expected with the same input role (\attribute{multiplicity=*}).

When related to the \class{UsageDescription} or \class{GenerationDescription}, the attributes of \class{EntityDescription} (see Section~\ref{sec:entity_desc}) help to describe the category of entities expected as an input or an output in an activity.
For example: if the input bias files are expected to be in FITS format, the \class{UsageDescription} object would have a relation to a \class{DatasetDescription} object with \attribute{contentType}=``application/fits''.


\subsubsection{Types of Usage and Generation}
\label{sec:ugtypes}

The typing of those relations is particularly needed to enable quality assessment and identification of error sources in the process (see goals C and D in Section~\ref{sec:goals}), so as to facilitate the exploration of provenance information.

The type of usage or generation is a free text attribute, but if one of the terms in Table~\ref{tab:usage-generation-types} applies, it should be used.

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\begin{tabulary}{1.0\textwidth}{Lp{8cm}}
\toprule
\head{Type} & \head{Description} \\
\midrule
Main & main input or output entities of the activity, i.e.~strictly necessary, and the primary objective of the activity\\
Calibration & usage of an entity to calibrate another entity\\
Preview & generation of a quick representation of an entity\\
Setup & usage of an entity as configuration information, see also Section~\ref{sec:configurationpackage}\\
Quality & generation of information that helps to assess the quality of the activity results, e.g., errors, warnings, flags, percentage of overexposed pixels\\
Log & generation of logging information\\
Context & contextual information that influences the activity, but over which there is little or no control at the moment of its execution, for example: temperature, wind, conditions of observation, execution platform, operating system, instrumental context\\
\bottomrule
\end{tabulary}
\caption[Terms applicable as usage or generation type.]{Terms applicable as usage or generation type.}
\label{tab:usage-generation-types}
\end{table}

The type ``Main'' indicates the main input and output entities of an activity. It should help to provide the minimum relevant data flow to the initial entity or activity, i.e.~to find the most relevant progenitors.


\subsection{Specific types of Entity classes}
\label{sec:spec_entities}

\class{Entity} and \class{EntityDescription} classes carry the minimum metadata that can apply to any kind of entity without specifying the nature or the structure of the content of the entity.
In some cases, the structure of the content is relevant information to assess the usefulness of the entity, in particular for datasets.

In some other cases, the content itself of an entity is relevant information to assess the usefulness of the related entities or activities. Such content must then be exposed as properly described values.

In astronomy and the VO, we thus define two main types of entity classes:

\begin{itemize}
\item \textbf{Dataset}: a dataset is a resource which encodes data in a defined structure. It is generally a file or a set of files which are considered to be a single deliverable. The content may be e.g., a cube, an image, a table, a list.
\item \textbf{Value}: a value is an atomic piece of data with a given value type (e.g., a data type such as boolean, integer, real, string).
\end{itemize}

\begin{figure}[ht]
\centering
\includegraphics[width=1.0\textwidth]{PROV_Fig6.png}
\caption[Partial class diagram focused on specific types of \class{Entity} classes.]{Partial class diagram focused on specific types of \class{Entity} classes.}
\label{fig:classdiagram_entityclasses}
\end{figure}

As shown in Figure~\ref{fig:classdiagram_entityclasses}, the entity description classes for both \class{ValueEntity} and \class{DatasetEntity} are subsetted respectively as \class{ValueDescription} and \class{DatasetDescription}.

We anticipate that more specific categories of entities can be defined by the projects (for example, a device, a document, a visualization). The \attribute{type} attribute of the \class{EntityDescription} class should be used to differentiate the different categories of entities.


\subsubsection{DatasetEntity and DatasetDescription classes}

The handling of datasets is implemented in the model by a \class{DatasetEntity} class.
A corresponding \class{DatasetDescription} class contains a \attribute{contentType} attribute that must not be null (see Table~\ref{tab:datasetdescription}).

The \attribute{contentType} indicates the MIME-type or format of a dataset, or a more precise structure, following the definition of the attribute \attribute{access\_format} defined in ObsCoreDM (\citealt{2017ivoa.spec.0509L}, Section 4.7).

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize DatasetDescription}\vspace{0.25em}\\
%\begin{tabulary}{1.0\textwidth}{@{}p{2.5cm}p{0cm}lL@{}}
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
\textbf{contentType} & string & format of the dataset, MIME type when applicable \\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{DatasetDescription} class]{Attributes of the \class{DatasetDescription} class. The class also inherits the attributes of \class{EntityDescription} listed in Table~\ref{tab:entitydescription}. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:datasetdescription}
\end{table}


\subsubsection{ValueEntity and ValueDescription classes}

The handling of values is implemented in the model by a \class{ValueEntity} class that contains a \attribute{value} attribute. A corresponding \class{ValueDescription} class contains attributes commonly used in the VO to qualify values. Those attributes are listed in Table~\ref{tab:valuedescription}.

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize ValueEntity}\vspace{0.25em}\\
%\begin{tabulary}{1.0\textwidth}{@{}p{2.5cm}p{0cm}lL@{}}
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
\textbf{value} & string & the value of the entity.
If a corresponding \class{ValueDescription}.\attribute{valueType} attribute is set, the value string can be interpreted by this \attribute{valueType}. \\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{ValueEntity} class]{Attributes of the \class{ValueEntity} class. The class also inherits the attributes of \class{EntityDescription} listed in Table~\ref{tab:entitydescription}. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:valueentity}
\end{table}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize ValueDescription}\vspace{0.25em}\\
%\begin{tabulary}{1.0\textwidth}{@{}p{2.5cm}p{0cm}lL@{}}
\begin{tabulary}{1.0\textwidth}{p{2cm}LL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\textbf{id} & string & parameter unique identifier\\
\textbf{valueType} & VOTableType & combination of \attribute{datatype}, \attribute{arraysize} and \attribute{xtype} following VOTable 1.3 \citep[][\S4.1]{2013ivoa.spec.0920O} \\
unit & Unit & VO unit, see Section~\ref{sect:Units} and \citet{2014ivoa.spec.0523D} for recommended unit representation \\
ucd & string & Unified Content Descriptor, supplying a standardized classification of the physical quantity, see \citet{2018ivoa.spec.0527M}\\
utype & string & Utype, meant to express the role of the value in the context of an external data model, see \citet{note:utypeusage} \\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{ValueDescription} class]{Attributes of the \class{ValueDescription} class. The class also inherits the attributes of \class{EntityDescription} listed in Table~\ref{tab:entitydescription}.
Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:valuedescription}
\end{table}



\subsection{Activity configuration}
\label{sec:configuration}

Configuring an activity means setting parameters so that the activity runs under the desired conditions.

In some cases developed in Section~\ref{sec:goals} (goals C and D in particular), configuration information is relevant to assess the quality and reliability of an activity or an entity, and to locate configuration errors in a processing workflow. It also facilitates the re-execution of an activity (reproducibility).

Configuration information may be carried by entities using the core features, where an entity (e.g., \class{ValueEntity} and \class{DatasetEntity} instances) is referenced in \class{Used} relations with a given \attribute{role} and \attribute{type}=``setup''. With this solution, the configuration information is independent from the activity and can be generated and used as any entity.

The data model also provides a specialized \class{ActivityConfiguration} package to directly attach configuration information to an activity. This package is composed of a \class{WasConfiguredBy} relation connecting \class{Parameter} and \class{ConfigFile} classes with the \class{Activity} class (see Section~\ref{sec:configurationpackage}). With this solution, the configuration information is independent from the entities and seen as part of the activity.


\begin{figure}[hbt]
\centering
\includegraphics[width=1.0\textwidth]{PROV_Fig7.png}
% Mireille: updated the diagram file for the last version with the proper cardinalities for Parameter and ConfigFile
\caption[Partial class diagram focused on the \class{ActivityConfiguration} package.]{Partial class diagram focused on the \class{ActivityConfiguration} package.
The \class{Parameter} and \class{ConfigFile} classes provide configuration information for an \class{Activity} instance. The right side of the diagram shows the descriptions, where an \class{ActivityDescription} class is bound with the \class{ParameterDescription} and \class{ConfigFileDescription} classes.}
\label{fig:activityconfig}
\end{figure}


\subsubsection{Overview of the ActivityConfiguration package} \label{sec:configurationpackage}

As shown in Figure~\ref{fig:activityconfig}, the \class{ActivityConfiguration} package contains two classes for the execution side: \class{Parameter} and \class{ConfigFile}, which are connected to an \class{Activity} instance via the \class{WasConfiguredBy} association class.
An \class{Activity} may thus be configured by a set of \class{Parameter} instances, by \class{ConfigFile} instances, or by a combination of both.

The corresponding description classes, \class{ParameterDescription} and \class{ConfigFileDescription}, are both defined in the context of the description of an activity.
There can be several instances of a \class{Parameter} (respectively \class{ConfigFile}) that are described by the same instance of \class{ParameterDescription} (respectively \class{ConfigFileDescription}).


\subsubsection{Parameter and ParameterDescription classes}
\label{sec:parameterandD}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize Parameter}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\textbf{id} & string & a unique id\\
\textbf{name} & string & name of the parameter \\
\textbf{value} & string & the value of the parameter. If a corresponding \class{ParameterDescription}.\attribute{valueType} attribute is set, the value string can be interpreted by this \attribute{valueType}.
\\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{Parameter} class]{Attributes of the \class{Parameter} class. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:param}
\end{table}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize ParameterDescription}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{lLL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\textbf{id} & string & unique ParameterDescription identifier\\
\textbf{name} & string & name of the parameter \\
\textbf{valueType} & VOTableType & combination of \attribute{datatype}, \attribute{arraysize} and \attribute{xtype} following VOTable 1.3 \citep[][\S4.1]{2013ivoa.spec.0920O} \\
description & string & a descriptive text for the parameter \\
unit & Unit & VO unit, see Section~\ref{sect:Units} and \citet{2014ivoa.spec.0523D} for recommended unit representation \\
ucd & string & Unified Content Descriptor, supplying a standardized classification of the physical quantity, see \citet{2018ivoa.spec.0527M} \\
utype & string & Utype, meant to express the role of the parameter in the context of an external data model, see \citet{note:utypeusage} \\
%xtype & string & extended datatype as in VOTable 1.2 and above.
% A list of proposed \\
% \midrule
% \multicolumn{3}{@{}l}{\textbf{Optional attributes:}} \\
min & string & minimum value as a string whose value can be interpreted by the \attribute{valueType} attribute \\
max & string & maximum value as a string whose value can be interpreted by the \attribute{valueType} attribute\\
options & array of strings & array of possible values\\
default & string & the default value of the parameter as a string whose value can be interpreted by the \attribute{valueType} attribute \\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{ParameterDescription} class]{Attributes of the \class{ParameterDescription} class. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:Paramdescription}
\end{table}

The \class{Parameter} class contains a \attribute{value} and a \attribute{name} attribute that must be set (Table~\ref{tab:param}).

The \class{ParameterDescription} class describes the parameter \attribute{value} attribute similarly to the \class{ValueEntity} and \class{ValueDescription} classes. Those attributes are listed in Table~\ref{tab:Paramdescription}.

If a \class{ParameterDescription} instance is defined, the \attribute{name} attribute of the related \class{Parameter} instances must match the \attribute{name} attribute of this \class{ParameterDescription} instance.

The \class{Parameter} instance may refer to a \class{ValueEntity} instance using a \textit{hadReference} relation which gives the origin of the parameter value.


\subsubsection{ConfigFile and ConfigFileDescription classes}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize ConfigFile}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
\textbf{name} & string & a human-readable name for the config file \\
\textbf{location} & string & a path to the config file, e.g., a URL \\
comment & string & text containing comments on the config file \\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{ConfigFile} class]{Attributes of the \class{ConfigFile} class. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:configfile}
\end{table}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize ConfigFileDescription}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
\textbf{name} & string & a human-readable name for the config file \\
\textbf{contentType} & string & format of the config file, MIME type when applicable \\
description & string & a descriptive text for the config file \\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{ConfigFileDescription} class]{Attributes of the \class{ConfigFileDescription} class. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:configfiledescription}
\end{table}

The \class{ConfigFile} points to a structured, machine-readable file, where parameters for running an activity are stored. It contains a \attribute{location} and a \attribute{name} that must be set, and a \attribute{comment} attribute (Table~\ref{tab:configfile}).

The \class{ConfigFileDescription} class indicates, in its \attribute{contentType} attribute, the format in which the list of parameters is provided (see Table~\ref{tab:configfiledescription}).

If a \class{ConfigFileDescription} instance is defined, the \attribute{name} attribute of the related \class{ConfigFile} instances must match the \attribute{name} attribute of this \class{ConfigFileDescription} instance.


\subsubsection{Relations with Activity class}

\begin{table}[ht]
\small
\tymax 0.5\textwidth
\textbf{\normalsize WasConfiguredBy}\vspace{0.25em}\\
\begin{tabulary}{1.0\textwidth}{llL}
\toprule
\head{Attribute} & \head{Data type} & \head{Description}\\
\midrule
%\textbf{id} & string & a unique id\\
\textbf{artefactType} & TypeOfConfigArtefact & literal that takes the value ``Parameter'' or ``ConfigFile'' to indicate the type of class pointed to by the \class{WasConfiguredBy} instance. \\
\bottomrule
\end{tabulary}
\caption[Attributes of the \class{WasConfiguredBy} class]{Attributes of the \class{WasConfiguredBy} class. Attributes in \textbf{bold} are mandatory and must not be null.}
\label{tab:WasConfiguredBy}
\end{table}

The relation of \class{Parameter} and \class{ConfigFile} to \class{Activity} is formalized by a \class{WasConfiguredBy} class. There must be exactly one instance connected to a \class{WasConfiguredBy} instance, either a \class{Parameter} instance or a \class{ConfigFile} instance. The \class{WasConfiguredBy} class contains the attribute \attribute{artefactType} to indicate the type of class pointed to by the \class{WasConfiguredBy} instance (see Table~\ref{tab:WasConfiguredBy}).

The life cycle of a \class{Parameter} instance (respectively \class{ConfigFile} instance) follows that of the corresponding \class{Activity} instance.
The life cycle of a \class{ParameterDescription} instance (respectively \class{ConfigFileDescription} instance) follows that of the corresponding \class{ActivityDescription} instance.
This means that when an activity is deleted from the provenance repository, its parameters and config files also disappear.

Several activities launched with various possible values for a parameter share the same \class{ParameterDescription} instance.
For instance, a cube analysis activity with a parameter ``nbofChannels'' will point to the corresponding instance of \class{ParameterDescription} (\attribute{name} = ``nbofChannels'', \attribute{ucd} = ``meta.number'', \attribute{unit} = Null, \attribute{description} = ``Nb of channel used for segmentation'').

Similarly, we can foresee a number of different \class{ConfigFile} instances used for various instances of an \class{Activity}, which rely on the same \class{ConfigFileDescription} instance bound to the corresponding \class{ActivityDescription} instance.
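
The sharing of one \class{ParameterDescription} instance by several \class{Parameter} instances can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the dataclasses, the \texttt{described\_by} field and the \texttt{is\_consistent} helper are hypothetical names, the parameter values ``16'' and ``64'' are invented for the example, and the \attribute{valueType} ``int'' is assumed because the text does not give one. The check mirrors the rule that the \attribute{name} of a \class{Parameter} must match the \attribute{name} of its \class{ParameterDescription}.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParameterDescription:
    name: str
    value_type: str  # stands in for the mandatory valueType attribute
    description: Optional[str] = None
    unit: Optional[str] = None
    ucd: Optional[str] = None


@dataclass(frozen=True)
class Parameter:
    name: str
    value: str  # values are always carried as strings in the model
    described_by: Optional[ParameterDescription] = None  # hypothetical link name

    def is_consistent(self) -> bool:
        """Names must match whenever a description is attached."""
        return self.described_by is None or self.described_by.name == self.name


nb_channels = ParameterDescription(
    name="nbofChannels",
    value_type="int",  # assumed; not specified in the text
    description="Nb of channel used for segmentation",
    unit=None,
    ucd="meta.number",
)

# Two activities launched with different (invented) values share one description.
run_a = Parameter("nbofChannels", "16", nb_channels)
run_b = Parameter("nbofChannels", "64", nb_channels)
```

Keeping the value as a string and leaving its interpretation to the description follows the wording of Tables~\ref{tab:param} and~\ref{tab:Paramdescription}; a real implementation would also have to handle the \attribute{min}, \attribute{max}, \attribute{options} and \attribute{default} attributes.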
