
Diff of /trunk/projects/theory/snapdm/doc/note/SimDB-note.html


revision 480 by gerard.lemson, Mon May 12 19:52:12 2008 UTC revision 481 by bourges.laurent, Wed May 14 07:38:34 2008 UTC
# Line 1  Line 1 
 <?xml version="1.0" encoding="iso-8859-1"?>  
1  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
2  <html>  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
3  <head>  <head>
4          <title>IVOA Working Group - Internal Draft</title>          <title>IVOA Working Group - Internal Draft</title>
5          <meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />          <meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
6          <meta name="keywords" content="IVOA, International, Virtual, Observatory, Alliance" />          <meta name="keywords" content="IVOA, International, Virtual, Observatory, Alliance" />
7          <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />          <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
8          <meta name="maintainedBy" content="IVOA Document Coordinator, ivoadoc@ivoa.net" />          <meta name="maintainedBy" content="IVOA Document Coordinator, ivoadoc@ivoa.net" />
9    <link rel="stylesheet" href="http://ivoa.net/misc/ivoa_wg.css" type="text/css" />          
10    <link rel="stylesheet" href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/css/simdb-note.css" type="text/css" />          <link rel="stylesheet" href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/css/ivoa_wg.css" type="text/css" />
11            <link rel="stylesheet" href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/css/simdb-note.css" type="text/css" />
12  </head>  </head>
13    
14  <body>  <body>
15    
16    
17      <div class="customCSSDisplay">
18        <h1>Custom CSS Classes :</h1>
19        
20        <div class="revision">
21          Revision marks [revision] :
22          <br/>
23          
24          <br/>normal text
25          <br/>
26          <br/>
27    
28          <div>
29            <span class="citation">original text [citation]</span>
30            =>
31            <span class="proposal">replacement text [proposal]</span>
32          </div>
33    
34          <br/>
35          Authors :
36          <div class="gerard">Gerard's comments [gerard]</div>
37          <div class="laurent">Laurent's comments [laurent]</div>
38          <div class="rick">Rick's comments [rick]</div>
39          <div class="patrizia">Patrizia's comments [patrizia]</div>
40          
41        </div>
42      </div>  
43      
44      
45  <div class="head">  <div class="head">
46  <a href="http://www.ivoa.net/"><img alt="IVOA" src="http://www.ivoa.net/pub/images/IVOA_wb_300.jpg" width="300" height="169"/></a>  <a href="http://www.ivoa.net/"><img alt="IVOA" src="http://www.ivoa.net/pub/images/IVOA_wb_300.jpg" width="300" height="169"/></a>
47  <h1>Simulation Database (SimDB)<br/>  <h1>Simulation Database (SimDB)<br/>
# Line 19  Line 49 
49  <h2>IVOA Theory Interest Group <br />Internal Draft 2008 April 19 </h2>  <h2>IVOA Theory Interest Group <br />Internal Draft 2008 April 19 </h2>
50    
51    
52    <dt>This version:</dt>    <dt>This version:</dt>
53    <dd><a href="http://www.ivoa.net/Documents/...">    <dd><a href="http://www.ivoa.net/Documents/...">
54        http://www.ivoa.net/Documents/...</a></dd>        http://www.ivoa.net/Documents/...</a></dd>
55    
56    <dt>Latest version:</dt>    <dt>Latest version:</dt>
57    
58    <dd><a href="http://www.ivoa.net/Documents/latest/...">    <dd><a href="http://www.ivoa.net/Documents/latest/...">
59        http://www.ivoa.net/Documents/latest/...</a></dd>        http://www.ivoa.net/Documents/latest/...</a></dd>
60    
61    <dt>Previous versions:</dt>    <dt>Previous versions:</dt>
62    <dt>Interest Group:</dt>    <dt>Interest Group:</dt>
63                  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaTheory"> http://www.ivoa.net/twiki/bin/view/IVOA/IvoaTheory</a></dd>                  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/IvoaTheory"> http://www.ivoa.net/twiki/bin/view/IVOA/IvoaTheory</a></dd>
64          <dt>Author(s):</dt>          <dt>Author(s):</dt>
65  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/GerardLemson">Gerard Lemson</a> (editor)<br /></dd>  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/GerardLemson">Gerard Lemson</a> (editor)<br /></dd>
66  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/LaurentBourges">Laurent Bourges</a><br /></dd>  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/LaurentBourges">Laurent Bourges</a><br /></dd>
67  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/PatriziaManzato">Patrizia Manzato</a><br /></dd>  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/PatriziaManzato">Patrizia Manzato</a><br /></dd>
68  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/RickWagner">Rick Wagner</a><br /></dd>  <dd><a href="http://www.ivoa.net/twiki/bin/view/IVOA/RickWagner">Rick Wagner</a><br /></dd>
69  <dd>others?</dd>  <dd>others?</dd>
70  <hr/></div>  <hr/></div>
71    
72  <h2><a name="abstract" id="abstract">Abstract</a></h2>  <h2><a name="abstract" id="abstract">Abstract</a></h2>
73  <p>In this note we propose that the IVOA develop a standard protocol for discovering simulations.  <p>In this note we propose that the IVOA develop a standard protocol for discovering simulations.
74  We will call this protocol the <i>Simulation Database</i> (SimDB). Implementations of the SimDB will allow users to query for  We will call this protocol the <i>Simulation Database</i> (SimDB). Implementations of the SimDB will allow users to query for
75  results of simulations in quite some detail and will provide links to services for accessing these  results of simulations in quite some detail and will provide links to services for accessing these
76  simulations. </p>  simulations. </p>
77  <p>The results presented in this note, which form the core of the proposed standard, are one half of a concerted effort of the theory Interest Group that originally went by the name  <p>The results presented in this note, which form the core of the proposed standard, are one half of a concerted effort of the theory Interest Group that originally went by the name
78  S<i>imple Numerical Access Protocol</i> (SNAP), and is now split up in two parts. The second part defines protocols  S<i>imple Numerical Access Protocol</i> (SNAP), and is now split up in two parts. The second part defines protocols
79  for accessing the simulations data products themselves. This part will be written up in a separate Note  for accessing the simulations data products themselves. This part will be written up in a separate Note
80  (Gheller, Wagner et al, in preparation), under the name Simulation Data Access Protocol (SimDAP).  (Gheller, Wagner et al, in preparation), under the name Simulation Data Access Protocol (SimDAP).
81  </p>  </p>
82  <p>The current proposal is built around a UML data model describing simulations, a representation (mapping) of this model as a relational  <p>The current proposal is built around a UML data model describing simulations, a representation (mapping) of this model as a relational
83  database schema and a mapping to an XML schema.  database schema and a mapping to an XML schema.
84  We propose the relational schema to be the outer facade of a SimDB-TAP implementation which is to be queried using  We propose the relational schema to be the outer facade of a SimDB-TAP implementation which is to be queried using
85  <a href="http://www.ivoa.net/internal/IVOA/IvoaVOQL/ADQL-20080415.pdf">ADQL</a> <em class="todo">.@@ TODO update the ADQL link to later versions @@</em>  <a href="http://www.ivoa.net/internal/IVOA/IvoaVOQL/ADQL-20080415.pdf">ADQL</a> <em class="todo">.@@ TODO update the ADQL link to later versions @@</em>
86  The XML schema provides type definitions from  The XML schema provides type definitions from
87  which machine readable serialisations of the model may be constructed. The schema also defines root elements for documents  which machine readable serialisations of the model may be constructed. The schema also defines root elements for documents
88  describing SimDB-resources. The SimDB should return such documents for identified SimDB-Resources upon request, as an  describing SimDB-resources. The SimDB should return such documents for identified SimDB-Resources upon request, as an
89  alternative to the tabular (VOTable) results of ADQL queries.  alternative to the tabular (VOTable) results of ADQL queries.
90  In case updates are supported by a SimDB implementation, such documents may be sent    In case updates are supported by a SimDB implementation, such documents may be sent  
91  </p>  </p>
92  <p>  <p>
93  This Note describes use cases and requirements and the approach we have taken to define a specification  This Note describes use cases and requirements and the approach we have taken to define a specification
94  and the current state of the results. We feel that the results are  and the current state of the results. We feel that the results are
95  sufficiently far evolved that they can start following the formal IVOA standardisation track.  sufficiently far evolved that they can start following the formal IVOA standardisation track.
96  To this end it could be turned over to one of the existing working groups. If that is the decision, we feel  To this end it could be turned over to one of the existing working groups. If that is the decision, we feel
97  that the data modelling WG is closest to its scope, but there exist very strong links to Registry, Semantics, ADQL  that the data modelling WG is closest to its scope, but there exist very strong links to Registry, Semantics, ADQL
98  and DAL as well. One might argue that a targeted WG for this effort alone might be as appropriate.  and DAL as well. One might argue that a targeted WG for this effort alone might be as appropriate.
99  We leave the decision about this to the IVOA exec.  We leave the decision about this to the IVOA exec.
100  </p>  </p>
101    
102    
103    
104  <div class="status">  <div class="status">
105  <h2><a name="status" id="status">Status of this Document</a></h2>  <h2><a name="status" id="status">Status of this Document</a></h2>
# Line 88  Line 118 
118  </div><br />  </div><br />
119    
120  <h2><a name="acknowledgments" id="acknowledgments">Acknowledgments</a></h2>  <h2><a name="acknowledgments" id="acknowledgments">Acknowledgments</a></h2>
121  <p>We thank various persons for useful discussions in the course of this work. First the participants of the  <p>We thank various persons for useful discussions in the course of this work. First the participants of the
122  <a href="http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/CambridgeTheoryWorkshopFeb06">Feb 2006 theory  <a href="http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/CambridgeTheoryWorkshopFeb06">Feb 2006 theory
123  workshop</a> in Cambridge, UK,  where this work was started. Second the participants of the  workshop</a> in Cambridge, UK,  where this work was started. Second the participants of the
124  <a href="http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/GarchingSNAPWorkshop200704">April 2007 SNAP workshop</a> in  <a href="http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/GarchingSNAPWorkshop200704">April 2007 SNAP workshop</a> in
125  Garching, Germany, where the design started taking shape. Then we want to thank particularly the following persons  Garching, Germany, where the design started taking shape. Then we want to thank particularly the following persons
126  for useful discussions and feedback: Jeremy Blaizot, Klaus Dolag, Ray Plante, Volker Springel. We finally want to thank  for useful discussions and feedback: Jeremy Blaizot, Klaus Dolag, Ray Plante, Volker Springel. We finally want to thank
127  participants to the theory sessions in the interoperability meetings in Victoria, Moscow, Beijing and Cambridge where parts  participants to the theory sessions in the interoperability meetings in Victoria, Moscow, Beijing and Cambridge where parts
128  of this work was discussed.  of this work was discussed.
129  </p>  </p>
130  <h2><a id="contents" name="contents">Contents</a></h2>  <h2><a id="contents" name="contents">Contents</a></h2>
# Line 107  Line 137 
137  <li><a href="#sec1">1. Executive Summary</a></li>  <li><a href="#sec1">1. Executive Summary</a></li>
138    
139  <li><a href="#sec2">2. Overview</a></li>  <li><a href="#sec2">2. Overview</a></li>
140          <ul class="toc">          <ul class="toc">
141          <li><a href="#sec2_1">2.1 SNAP &rArr; SimDB + SimDAP</a></li>          <li><a href="#sec2_1">2.1 SNAP &rArr; SimDB + SimDAP</a></li>
142          <li><a href="#sec2_3">2.3 Simulation Database: structure and interface</a></li>          <li><a href="#sec2_3">2.3 Simulation Database: structure and interface</a></li>
143          <li><a href="#sec2_3">2.3 Registration</a></li>          <li><a href="#sec2_3">2.3 Registration</a></li>
144          <li><a href="#sec2_4">2.4 Technology: UML, XMI, XSLT</a></li>          <li><a href="#sec2_4">2.4 Technology: UML, XMI, XSLT</a></li>
145          <li><a href="#sec2_5">2.5 Reference implementations</a></li>          <li><a href="#sec2_5">2.5 Reference implementations</a></li>
146            </ul>
147    
148    
149    <li><a href="#sec3">3 Usage scenarios</a></li>
150            <ul class="toc">
151            <li><a href="#sec3_1">3.1 "20 questions"</a></li>
152            <li><a href="#sec3_2">3.2 SimDB-standard implementation</a></li>
153            <li><a href="#sec3_3">3.3 Legacy database</a></li>
154            <li><a href="#sec3_4">3.4 Meta data production pipe line</a></li>
155            <li><a href="#sec3_5">3.5 Client tools</a></li>
156            </ul>
157    
158    <li><a href="#sec4">4 Analysis model</a></li>
159            <ul class="toc">
160            <li><a href="#sec4_1">4.1 Universe of Discourse</a></li>
161            <li><a href="#sec4_2">4.2 <i>Domain Model for Astronomy</i></a></li>
162            <li><a href="#sec4_3">4.3 SimDB analysis model</a></li>
163            </ul>
164    
165    <li><a href="#sec5">5 Logical model</a></li>
166            <ul class="toc">
167            <li><a href="#sec5_1">5.1 Overview</a></li>
168            <li><a href="#sec5_2">5.2 Normalisation</a></li>
169            <li><a href="#sec5_3">5.3 Model content</a></li>
170            <li><a href="#sec5_3_1">5.3.1 Resource hierarchy</a></li>
171            <li><a href="#sec5_3_2">5.3.2 Object types</a></li>
172            <li><a href="#sec5_3_3">5.3.3 Target</a></li>
173            <li><a href="#sec5_3_4">5.3.4 Characterisation</a></li>
174            <li><a href="#sec5_3_5">5.3.5 Semantics</a></li>
175            <li><a href="#sec5_3_6">5.3.6 Units</a></li>
176            <li><a href="#sec5_3_7">5.3.7 Services</a></li>
177            </ul>
178            
179    <li><a href="#sec6">6 Physical models</a></li>
180            <ul class="toc">
181            <li><a href="#sec6_1">6.1 Identity and referencing</a></li>
182            <li><a href="#sec6_2">6.2 RDBM Schema</a></li>
183            <li><a href="#sec6_3">6.3 XML Schema</a></li>
184            <li><a href="#sec6_4">6.4 UTYPE-s</a></li>
185            <li><a href="#sec6_5">6.5 JAVA/JPA+JAXB (non-normative)</a></li>
186            </ul>
187            
188    <li><a href="#sec7">7. Query protocols</a></li>
189            <ul class="toc">
190            <li><a href="#sec7_1">7.1 ADQL</a></li>
191            <li><a href="#sec7_2">7.3 REST</a></li>
192            <li><a href="#sec7_3">7.2 TAP?</a></li>
193            </ul>
194    
195    <li><a href="#sec8">8. Next steps</a></li>
196            <ul class="toc">
197            <li><a href="#sec8_1">8.1 Reference implementations</a></li>
198            <ul class="toc">
199            <li><a href="#sec8_1_1">8.1.1 France</a></li>
200            <li><a href="#sec8_1_2">8.1.2 Germany</a></li>
201            <li><a href="#sec8_1_3">8.1.3 Italy</a></li>
202            <li><a href="#sec8_1_4">8.1.4 USA</a></li>
203            </ul>
204            <li><a href="#sec8_2">8.2 SimDAP services</a></li>
205          </ul>          </ul>
   
   
 <li><a href="#sec3">3 Usage scenarios</a></li>  
         <ul class="toc">  
         <li><a href="#sec3_1">3.1 "20 questions"</a></li>  
         <li><a href="#sec3_2">3.2 SimDB-standard implementation</a></li>  
         <li><a href="#sec3_3">3.3 Legacy database</a></li>  
         <li><a href="#sec3_4">3.4 Meta data production pipe line</a></li>  
         <li><a href="#sec3_5">3.5 Client tools</a></li>  
         </ul>  
   
 <li><a href="#sec4">4 Analysis model</a></li>  
         <ul class="toc">  
         <li><a href="#sec4_1">4.1 Universe of Discourse</a></li>  
         <li><a href="#sec4_2">4.2 <i>Domain Model for Astronomy</i></a></li>  
         <li><a href="#sec4_3">4.3 SimDB analysis model</a></li>  
         </ul>  
   
 <li><a href="#sec5">5 Logical model</a></li>  
         <ul class="toc">  
         <li><a href="#sec5_1">5.1 Overview</a></li>  
         <li><a href="#sec5_2">5.2 Normalisation</a></li>  
         <li><a href="#sec5_3">5.3 Model content</a></li>  
         <li><a href="#sec5_3_1">5.3.1 Resource hierarchy</a></li>  
         <li><a href="#sec5_3_2">5.3.2 Object types</a></li>  
         <li><a href="#sec5_3_3">5.3.3 Target</a></li>  
         <li><a href="#sec5_3_4">5.3.4 Characterisation</a></li>  
         <li><a href="#sec5_3_5">5.3.5 Semantics</a></li>  
         <li><a href="#sec5_3_6">5.3.6 Units</a></li>  
         <li><a href="#sec5_3_7">5.3.7 Services</a></li>  
         </ul>  
           
 <li><a href="#sec6">6 Physical models</a></li>  
         <ul class="toc">  
         <li><a href="#sec6_1">6.1 Identity and referencing</a></li>  
         <li><a href="#sec6_2">6.2 RDBM Schema</a></li>  
         <li><a href="#sec6_3">6.3 XML Schema</a></li>  
         <li><a href="#sec6_4">6.4 UTYPE-s</a></li>  
         <li><a href="#sec6_5">6.5 JAVA/JPA+JAXB (non-normative)</a></li>  
         </ul>  
           
 <li><a href="#sec7">7. Query protocols</a></li>  
         <ul class="toc">  
         <li><a href="#sec7_1">7.1 ADQL</a></li>  
         <li><a href="#sec7_2">7.3 REST</a></li>  
         <li><a href="#sec7_3">7.2 TAP?</a></li>  
         </ul>  
   
 <li><a href="#sec8">8. Next steps</a></li>  
         <ul class="toc">  
         <li><a href="#sec8_1">8.1 Reference implementations</a></li>  
         <ul class="toc">  
         <li><a href="#sec8_1_1">8.1.1 France</a></li>  
         <li><a href="#sec8_1_2">8.1.2 Germany</a></li>  
         <li><a href="#sec8_1_3">8.1.3 Italy</a></li>  
         <li><a href="#sec8_1_4">8.1.4 USA</a></li>  
         </ul>  
         <li><a href="#sec8_2">8.2 SimDAP services</a></li>  
         </ul>  
206  <br/>  <br/>
207  <li><a href="#appA">Appendix A: Data modelling specifics</a></li>  <li><a href="#appA">Appendix A: Data modelling specifics</a></li>
208  <li><a href="#appB">Appendix B: XSLT pipe line</a></li>  <li><a href="#appB">Appendix B: XSLT pipe line</a></li>
209  <li><a href="#glossary">Glossary and Acronyms</a></li>  <li><a href="#glossary">Glossary and Acronyms</a></li>
210    
211  <li><a href="#references">References</a></li>  <li><a href="#references">References</a></li>
212  </ul>  </ul>
213  </div>  </div>
# Line 185  Line 215 
215    
216    
217  <br/>  <br/>
218  <h2><a name="sec1">1. Executive summary</a></h2>  <h2><a name="sec1">1. Executive summary</a></h2>
219  <em class="todo">@@ TODO Modify this text, which was originally an email to be sent to THEORY, TCG, DM, maybe EXEC @@</em>  <em class="todo">@@ TODO Modify this text, which was originally an email to be sent to THEORY, TCG, DM, maybe EXEC @@</em>
220  <p>  <p>
221  We propose to derive two WG projects from what was so far the  We propose to derive two WG projects from what was so far the
222  SNAP project of the theory interest group: SimDB and SimDAP.  SNAP project of the theory interest group: SimDB and SimDAP.
223  In this note we discuss the first of these, SimDB, in some detail.  In this note we discuss the first of these, SimDB, in some detail.
224    
225  </p>  </p>
226  <h3> Simulation Database (SimDB)</h3>  <h3> Simulation Database (SimDB)</h3>
227  <p>We propose to develop a standard specification project, called the "Simulation Database" (SimDB).  <p>We propose to develop a standard specification project, called the "Simulation Database" (SimDB).
228  It is based on the description+discovery part of the old  It is based on the description+discovery part of the old
229  SNAP project. Its normative deliverables are  SNAP project. Its normative deliverables are
230  <ul>  <ul>
231  <li> A logical data model for describing simulations.<br/>  <li> A logical data model for describing simulations.<br/>
232     Following SNAP we keep concentrating     Following SNAP we keep concentrating
233     on 3+1D simulations, with which we mean simulations modelling a     on 3+1D simulations, with which we mean simulations modelling a
234     space-time sub-volume of the universe OF ANY SIZE, so not only large     space-time sub-volume of the universe OF ANY SIZE, so not only large
235     scale structure, galaxy clusters, but everything down to asteroid collisions etc.     scale structure, galaxy clusters, but everything down to asteroid collisions etc.
236     As the model <i>describes</i> simulations, it may be called a meta-data model.     As the model <i>describes</i> simulations, it may be called a meta-data model.
237     It will be a logical model in the sense of standard data modelling approaches <em class="todo">@@TODO add some references@@</em>,     It will be a logical model in the sense of standard data modelling approaches <em class="todo">@@TODO add some references@@</em>,
238     and is based on an analysis, or domain model which is presented but not normative.     and is based on an analysis, or domain model which is presented but not normative.
239     The logical model is presented in fully detailed and documented UML2, serialised     The logical model is presented in fully detailed and documented UML2, serialised
240     to XMI 2.1, created using the MagicDraw 12.1 Community edition tool.     to XMI 2.1, created using the MagicDraw 12.1 Community edition tool.
241     The data model is using a small subset of UML2 and has some UML profile     The data model is using a small subset of UML2 and has some UML profile
242     extensions added. Together this can be seen as a domain specific language,     extensions added. Together this can be seen as a domain specific language,
243     and this can be formalised in a UML Profile. We will propose using such a profile     and this can be formalised in a UML Profile. We will propose using such a profile
244     to the DM working group as a general approach for DM efforts.     to the DM working group as a general approach for DM efforts.
245     </li>     </li>
246  <li>A query protocol based on the logical model.    <li>A query protocol based on the logical model.  
247     <br />We propose this to have at least an ADQL version.     <br />We propose this to have at least an ADQL version.
248     To this end we will provide a relational mapping.     To this end we will provide a relational mapping.
249     This physical model is completely derived from the SimDB logical model using rules     This physical model is completely derived from the SimDB logical model using rules
250     implemented as a pipe-line of XSLT2 scripts working on the XMI representation of     implemented as a pipe-line of XSLT2 scripts working on the XMI representation of
251     the UML. The scripts will produce relational database DDL scripts defining the     the UML. The scripts will produce relational database DDL scripts defining the
252     database schema. That schema itself is not normative, instead we will define the     database schema. That schema itself is not normative, instead we will define the
253     replies to TAP metadata queries. We provide implementation scenarios in the text below,     replies to TAP metadata queries. We provide implementation scenarios in the text below,
254     for the case of someone using the results from this project completely and for the     for the case of someone using the results from this project completely and for the
255     case of someone implementing a SimDB on top of a legacy database.     case of someone implementing a SimDB on top of a legacy database.
256     </li>     </li>
257  <li> a messaging format for sending instances of the various components  <li> a messaging format for sending instances of the various components
258     in the data model around.     in the data model around.
259     <br />This format will be based on a number of XML     <br />This format will be based on a number of XML
260     schema documents (XSDs), one of which contains the root elements defining valid SimDB resources.     schema documents (XSDs), one of which contains the root elements defining valid SimDB resources.
261     This requires a mapping from the UML to XSD.     This requires a mapping from the UML to XSD.
262     This mapping will take the form of one or more XSLT documents.     This mapping will take the form of one or more XSLT documents.
263  </li>  </li>
264  <li> An IVOA working draft document describing these components.  <li> An IVOA working draft document describing these components.
265  <br />This will be based on the current document.</li></ul>  <br />This will be based on the current document.</li></ul>
266  </p>  </p>
267  <p>  <p>
268  We introduce some non-normative solutions that can be taken over for generic  We introduce some non-normative solutions that can be taken over for generic
269  data models (this is of course also true for the UML/XMI+XSLT approach for the  data models (this is of course also true for the UML/XMI+XSLT approach for the
270  normative standards).  normative standards).
271  <ul>  <ul>
272  <li> The XSLT scripts we propose above do not work on the XMI itself, but on  <li> The XSLT scripts we propose above do not work on the XMI itself, but on
273     an intermediate representation of the UML data model. This is an XML dialect     an intermediate representation of the UML data model. This is an XML dialect
274     based on a schema we define and which captures the UML profile more directly.     based on a schema we define and which captures the UML profile more directly.
275     XMI is very generic and rather cumbersome to work with. The representation of     XMI is very generic and rather cumbersome to work with. The representation of
276     the UML in our intermediate XML form is much more readable and XSLT based on it     the UML in our intermediate XML form is much more readable and XSLT based on it
277     is much simpler. It also allows easier adaptation to future modifications in UML,     is much simpler. It also allows easier adaptation to future modifications in UML,
278     or to tools whose XMI representation is different from the standard. We only need     or to tools whose XMI representation is different from the standard. We only need
279     to update the XMI->Intermediate XSLT transformation scripts, not the more complex     to update the XMI->Intermediate XSLT transformation scripts, not the more complex
280     transformations to the other official representations.       transformations to the other official representations.  
281     We will propose a similar approach to the DM WG.     We will propose a similar approach to the DM WG.
282     </li>     </li>
283  <li> We will provide XMI->Java+JPA+JAXB transformation scripts in XSLT (properly, intermediate->Java).  <li> We will provide XMI->Java+JPA+JAXB transformation scripts in XSLT (properly, intermediate->Java).
284     These scripts generate Java classes corresponding to the types (Class, DataType, Enumeration)     These scripts generate Java classes corresponding to the types (Class, DataType, Enumeration)
285     in UML. These classes are annotated with Java Persistence API (JPA)     in UML. These classes are annotated with Java Persistence API (JPA)
286     and Java Architecture for XML Binding (JAXB) attributes to assist in the transformation     and Java Architecture for XML Binding (JAXB) attributes to assist in the transformation
287     between relational database and XML representations.     between relational database and XML representations.
288     Similar scripts can be written for C#, which has supported annotations comparable to those of Java 5     Similar scripts can be written for C#, which has supported annotations comparable to those of Java 5
289     for longer. For persistence we will likely use Linq, which seems similar to JPA.     for longer. For persistence we will likely use Linq, which seems similar to JPA.
290  </li>  </li>
291  <li>We propose an approach for including application specific and legacy simulation databases  <li>We propose an approach for including application specific and legacy simulation databases
292     in this framework. This approach follows the "global-as-view" approach to information     in this framework. This approach follows the "global-as-view" approach to information
293     integration (see for example http://www.deg.byu.edu/papers/PODS.integration.pdf;     integration (see for example http://www.deg.byu.edu/papers/PODS.integration.pdf;
294     Leonid Kalinichenko from the RVO is an expert in this field).     Leonid Kalinichenko from the RVO is an expert in this field).
295     Implementors with an existing relational database schema may be able to define database     Implementors with an existing relational database schema may be able to define database
296     views which implement the relational representation of the SimDB data model,     views which implement the relational representation of the SimDB data model,
297     and in this way provide a simple way to support querying of their database using ADQL.     and in this way provide a simple way to support querying of their database using ADQL.
298  </li></ul></p>  </li></ul></p>
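The "global-as-view" idea in the last list item can be illustrated with a minimal sketch. It is not part of the note or of any SimDB schema: the legacy table `legacy_runs`, the view `simdb_simulation`, all column names and the sample rows are hypothetical, and SQLite stands in for whatever RDBMS an implementor actually uses. The point is only that the "global" relation is defined as a view over the existing local table, so queries written against the global names need no copy of the data.

```python
import sqlite3

# In-memory database standing in for an existing (legacy) simulation archive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical legacy table; names and values are invented for illustration.
CREATE TABLE legacy_runs (
    run_id      INTEGER PRIMARY KEY,
    code_name   TEXT,
    box_mpc     REAL,
    n_particles INTEGER
);
INSERT INTO legacy_runs VALUES (1, 'codeA', 500.0, 1000000);
INSERT INTO legacy_runs VALUES (2, 'codeB',  64.0,   50000);

-- Global-as-view: the global (SimDB-like) relation is defined as a
-- query over the local table. The view name and columns are assumptions,
-- not the relational mapping derived from the SimDB logical model.
CREATE VIEW simdb_simulation AS
SELECT run_id      AS id,
       code_name   AS simulator,
       box_mpc     AS box_size_mpc,
       n_particles AS number_of_particles
FROM legacy_runs;
""")


def large_boxes(min_size_mpc):
    """Query the view using only the global column names."""
    cur = conn.execute(
        "SELECT id, simulator FROM simdb_simulation "
        "WHERE box_size_mpc >= ? ORDER BY id",
        (min_size_mpc,),
    )
    return cur.fetchall()


print(large_boxes(100.0))
```

A client that only knows the view's columns never sees the legacy layout, which is the property an ADQL front end would rely on.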
299  <h4>organisation</h4>  <h4>organisation</h4>
300  <p>  <p>
301  The SimDB is ready to be transferred to the DM WG.  The SimDB is ready to be transferred to the DM WG.
302  <br />We propose that Gerard Lemson keeps leading this effort (as main editor), also when it is moved  <br />We propose that Gerard Lemson keeps leading this effort (as main editor), also when it is moved
303  to that WG. The DM WG's chair (Mireille Louys) will be responsible for all WG-chair  to that WG. The DM WG's chair (Mireille Louys) will be responsible for all WG-chair
304  issues associated with moving a specification through the document process.  issues associated with moving a specification through the document process.
305  The people at the bottom will be part of a "tiger team" to push the standard to RFC.  The people at the bottom will be part of a "tiger team" to push the standard to RFC.
306  We may want to expand this group with an expert from each of the WGs mentioned below.  We may want to expand this group with an expert from each of the WGs mentioned below.
307  </p>  </p>
308  <p>  <p>
309  We have been discussing the data model for some time now.  We have been discussing the data model for some time now.
310  Various projects (Italy, USA, France and Germany) have implementations that are similar  Various projects (Italy, USA, France and Germany) have implementations that are similar
311  to the envisioned SimDB. We believe that by autumn 2008 it can go to RFC.  to the envisioned SimDB. We believe that by autumn 2008 it can go to RFC.
312  Patrizia Manzato and Rick Wagner will have reference implementations based on existing DBs,  Patrizia Manzato and Rick Wagner will have reference implementations based on existing DBs,
313  so will various projects in France (Lyon: Jeremy Blaizot and Laurent Bourges;  so will various projects in France (Lyon: Jeremy Blaizot and Laurent Bourges;
314  Galmer database: Igor Chilingarian) and GAVO.  Galmer database: Igor Chilingarian) and GAVO.
315  </p>  </p>
316  <p>  <p>
317  Other relevant working groups for this process are Registry, ADQL and Semantics, possibly DAL.  Other relevant working groups for this process are Registry, ADQL and Semantics, possibly DAL.
318  Registry because the simulation database is similar to a registry. We can  Registry because the simulation database is similar to a registry. We can
319  learn from implementations and the registry interface. Also, we (think we) may need an  learn from implementations and the registry interface. Also, we (think we) may need an
320  extension to the IVO Identifier in the implementation of references in SimDB.  extension to the IVO Identifier in the implementation of references in SimDB.
321  ADQL because we propose it to be the standard (main) query interface to a SimDB implementation.  ADQL because we propose it to be the standard (main) query interface to a SimDB implementation.
322  Semantics because our model includes usage of semantic vocabularies, maybe full ontologies.  Semantics because our model includes usage of semantic vocabularies, maybe full ontologies.
323  DAL because our proposal for using ADQL in the query phase requires a version of  DAL because our proposal for using ADQL in the query phase requires a version of
324  the TAP protocol for defining the interface.  the TAP protocol for defining the interface.
325  We would like to include a person from each of these WGs in the tiger team.  We would like to include a person from each of these WGs in the tiger team.
326  Our wishes are: Ray Plante (Registry), ? (ADQL), Norman Gray (Semantics), ? (TAP).  Our wishes are: Ray Plante (Registry), ? (ADQL), Norman Gray (Semantics), ? (TAP).
327  Ray and Norm have contributed to early discussions about SNAP.  Ray and Norm have contributed to early discussions about SNAP.
328  </p>  </p>
329  <p>  <p>
330  Of these other efforts, TAP seems to pose the main risk to the SimDB standard going to  Of these other efforts, TAP seems to pose the main risk to the SimDB standard going to
331  RFC by the Autumn. What may help us is that we do not need all the details of TAP.  RFC by the Autumn. What may help us is that we do not need all the details of TAP.
332  In particular the information_schema approach allowing users to  In particular the information_schema approach allowing users to
333  query for the data model is not required as it is part of the SimDB specification.  query for the data model is not required as it is part of the SimDB specification.
334  We mainly need a prescription for sending ADQL queries to the SimDB, and what the  We mainly need a prescription for sending ADQL queries to the SimDB, and what the
335  format of results should be.  format of results should be.
336  Since we expect meta-data databases to be relatively small (compared to  Since we expect meta-data databases to be relatively small (compared to
337  say an SDSS or Millennium database), we expect few, if any, problems with  say an SDSS or Millennium database), we expect few, if any, problems with
338  performance and can stick to synchronous behaviour at first.  performance and can stick to synchronous behaviour at first.
339  </p>  </p>
340  <p>  <p>
341  We may need some explicit registry-interface-like features such as returning a  We may need some explicit registry-interface-like features such as returning a
342  complete XML document according to the messaging format of the SimDB data model.  complete XML document according to the messaging format of the SimDB data model.
343  Other issues will come up during the next phase of the discussions.  Other issues will come up during the next phase of the discussions.
344  </p>  </p>
345    
346  <h3>Simulation Data Access Protocol (SimDAP)</h3>  <h3>Simulation Data Access Protocol (SimDAP)</h3>
347  <p>  <p>
348  We propose to rename the second spin-off of the SNAP project to <i>Simulation Data Access Protocol</i> (SimDAP).  We propose to rename the second spin-off of the SNAP project to <i>Simulation Data Access Protocol</i> (SimDAP).
349  It deals with accessing the data after discovery by some means,  It deals with accessing the data after discovery by some means,
350  likely through an implementation of a Simulation Database.  likely through an implementation of a Simulation Database.
351  It should handle special services such as cut-out, projection,  It should handle special services such as cut-out, projection,
352  extraction (AMR-like cut-outs that produce regular grids), but also staging etc.  extraction (AMR-like cut-outs that produce regular grids), but also staging etc.
353  It should also deal with data formats. Claudio Gheller (Italy) is leading  It should also deal with data formats. Claudio Gheller (Italy) is leading
354  this effort with close help of Rick Wagner (USA).  this effort with close help of Rick Wagner (USA).
355  </p>  </p>
356  <p>  <p>
357  This project needs more fleshing out and is hopefully ready to be transmitted  This project needs more fleshing out and is hopefully ready to be transmitted
358  to a WG, likely DAL, by the Autumn interop.  to a WG, likely DAL, by the Autumn interop.
359  </p>  </p>
360  <h3>Connections between SimDB and SimDAP</h3>  <h3>Connections between SimDB and SimDAP</h3>
361  <p>  <p>
362  The two projects are connected as follows:  The two projects are connected as follows:
363  The meta-data formats to be included in SimDAP messages are derived from  The meta-data formats to be included in SimDAP messages are derived from
364  the data model of the SimDB.  the data model of the SimDB.
365  Vice versa, the SimDB will include a component describing  Vice versa, the SimDB will include a component describing
366  which SimDAP services are applicable/available for a given simulation.  which SimDAP services are applicable/available for a given simulation.
367  </p>  </p>
368    
369  <!-- ++++++++++++++++++++++++ -->  <!-- ++++++++++++++++++++++++ -->
370  <h2><a name="sec2"/> 2 Overview</h2>  <h2><a name="sec2"/> 2 Overview</h2>
371    
372  <h3><a name="sec2_1"/>2.1 SNAP &rArr; SimDB + SimDAP</h3>  <h3><a name="sec2_1"/>2.1 SNAP &rArr; SimDB + SimDAP</h3>
373  <p>This document presents a model for describing certain types of numerical computer simulations  <p>This document presents a model for describing certain types of numerical computer simulations
374  and certain types of simulation post-processing products.  The model was originally envisioned to  and certain types of simulation post-processing products.  The model was originally envisioned to
375  be used in the query part of the <i>Simple Numerical Access Protocol</i> (SNAP),  be used in the query part of the <i>Simple Numerical Access Protocol</i> (SNAP),
376  and in discovery of interesting SNAP services in the first place.  and in discovery of interesting SNAP services in the first place.
377  After investigating the application domain carefully, we have decided to abandon the concept of  After investigating the application domain carefully, we have decided to abandon the concept of
378  designing a DAL-like SxAP protocol for simulations. Instead we have split up the effort into  designing a DAL-like SxAP protocol for simulations. Instead we have split up the effort into
379  two separate efforts that can each be used in their own right, though there is a clear link between them.  two separate efforts that can each be used in their own right, though there is a clear link between them.
380  This document discusses the first of these, which we have named the <i>Simulation Database</i>, and  This document discusses the first of these, which we have named the <i>Simulation Database</i>, and
381  which will have the acronym <i>SimDB</i>. The second will be developed further in a separate effort and is  which will have the acronym <i>SimDB</i>. The second will be developed further in a separate effort and is
382  called the <i>Simulation Data Access Protocol</i> (SimDAP, "Sim" stands for "Simulation", <i>not</i> "Simple"!).  called the <i>Simulation Data Access Protocol</i> (SimDAP, "Sim" stands for "Simulation", <i>not</i> "Simple"!).
383  </p>  </p>
384  <p>  <p>
385  Following SNAP, SimDB only explicitly considers simulations for systems that represent a space-time  Following SNAP, SimDB only explicitly considers simulations for systems that represent a space-time
386  sub-volume of the universe and (part of) its material contents. Examples of such simulations are  sub-volume of the universe and (part of) its material contents. Examples of such simulations are
387  cosmological, pure dark matter N-body simulations of the large-scale structure of the universe;  cosmological, pure dark matter N-body simulations of the large-scale structure of the universe;
388  adaptive mesh refinement (AMR) simulations following the evolution of a galaxy cluster using full hydrodynamics;  adaptive mesh refinement (AMR) simulations following the evolution of a galaxy cluster using full hydrodynamics;
389  a simulation of the evolution of a globular cluster using a combination of tools, together simulating  a simulation of the evolution of a globular cluster using a combination of tools, together simulating
390  the various types of physics <em class="todo">@@ TODO reference to MODEST-like activities</em>; or  the various types of physics <em class="todo">@@ TODO reference to MODEST-like activities</em>; or
391  simulations calculating the few seconds of a supernova explosion in full 3D.  simulations calculating the few seconds of a supernova explosion in full 3D.
392  </p>  </p>
393  <p>  <p>
394  In general these simulations will evolve this system forward  In general these simulations will evolve this system forward
395  in time and are able to produce <i>snapshots</i>, representing the state of the system, a 3D volume of space,  in time and are able to produce <i>snapshots</i>, representing the state of the system, a 3D volume of space,
396  at a number of discrete times (though there are alternatives: light cone simulations, individual particle orbits).  at a number of discrete times (though there are alternatives: light cone simulations, individual particle orbits).
397  These direct, raw results of simulations we call Level-0 products, following  These direct, raw results of simulations we call Level-0 products, following
398  similar terminology for observations.  similar terminology for observations.
399  SimDB also covers Level-1 products, which consist of the results of certain types of post-processing  SimDB also covers Level-1 products, which consist of the results of certain types of post-processing
400  of simulations, namely those products that in some form create an alternative representation of  of simulations, namely those products that in some form create an alternative representation of
401  a spatial sub-volume of the universe. For example, a density field calculated on a regular grid,  a spatial sub-volume of the universe. For example, a density field calculated on a regular grid,
402  derived from an N-body or an AMR simulation; a cluster catalogue derived using some group finder applied  derived from an N-body or an AMR simulation; a cluster catalogue derived using some group finder applied
403  to a cosmological simulation; or a synthetic galaxy catalogue derived from the cluster catalogue using  to a cosmological simulation; or a synthetic galaxy catalogue derived from the cluster catalogue using
404  halo occupation distribution models (HODs) or semi-analytical models (SAMs).  halo occupation distribution models (HODs) or semi-analytical models (SAMs).
405  </p>  </p>
406  <p>  <p>
407  We do not make any restrictions on the type of systems being simulated, or the size of the  We do not make any restrictions on the type of systems being simulated, or the size of the
408  simulation, or the way the system is represented in the simulation code and results. We also  simulation, or the way the system is represented in the simulation code and results. We also
409  make no restrictions on the type of "observables" produced by the simulations.  make no restrictions on the type of "observables" produced by the simulations.
410  </p>  </p>
411  <p>  <p>
412  The SimDAP  The SimDAP
413  specification will include protocols for services that process level-0 or level-1 results and produce  specification will include protocols for services that process level-0 or level-1 results and produce
414  other level-1 results. The allowed services deal with selecting the results in a  other level-1 results. The allowed services deal with selecting the results in a
415  sub-volume of the complete result, sampling a regular 3-dimensional grid, etc. SimDAP also allows for  sub-volume of the complete result, sampling a regular 3-dimensional grid, etc. SimDAP also allows for
416  services that do not produce SimDB-like level-0 or level-1 products. Examples are projections and 1D or 2D samplings.  services that do not produce SimDB-like level-0 or level-1 products. Examples are projections and 1D or 2D samplings.
417  Custom services will also be allowed, for example calculating statistical properties such as correlation  Custom services will also be allowed, for example calculating statistical properties such as correlation
418  functions or power spectra in cosmological simulations. A more detailed description of SimDAP  functions or power spectra in cosmological simulations. A more detailed description of SimDAP
419  is outside of the main scope of this note.  is outside of the main scope of this note.
420  </p>  </p>
421  <h3><a name="sec2_2"/>2.2 Simulation Database: structure, interface and applicable services</h3>  <h3><a name="sec2_2"/>2.2 Simulation Database: structure, interface and applicable services</h3>
422  <p>  <p>
423  SimDB is a specification that defines the interface to a database containing meta data describing  SimDB is a specification that defines the interface to a database containing meta data describing
424  simulations. To this end it contains two main parts, one is a model for the meta data, the other  simulations. To this end it contains two main parts, one is a model for the meta data, the other
425  a protocol for interacting with the database. The model is the core of the specification.  a protocol for interacting with the database. The model is the core of the specification.
426  It describes the structure of individual data products in the database. We have chosen UML  It describes the structure of individual data products in the database. We have chosen UML
427  as modelling language, as prescribed by the data modelling working group in the interoperability meeting  as modelling language, as prescribed by the data modelling working group in the interoperability meeting
428  in Cambridge, UK, May 2003.  in Cambridge, UK, May 2003.
429  </p>  </p>
430  <p>  <p>
431  The UML model is a logical model (see [..] <em class="todo">@@ TODO add reference @@</em>) and  The UML model is a logical model (see [..] <em class="todo">@@ TODO add reference @@</em>) and
432  forms the basis for physical representations of the data products in the standard  forms the basis for physical representations of the data products in the standard
433  language that the IVOA has chosen for such purposes, XML. We derive an XML schema defining valid  language that the IVOA has chosen for such purposes, XML. We derive an XML schema defining valid
434  XML documents directly from the logical model. The SimDB interface will include functions for inserting  XML documents directly from the logical model. The SimDB interface will include functions for inserting
435  SimDB data products using such documents, and for retrieving individual, identified data products.  SimDB data products using such documents, and for retrieving individual, identified data products.
436  </p>  </p>
437  <p>  <p>
438  The logical model also forms the basis for a physical representation supporting formulation of queries.  The logical model also forms the basis for a physical representation supporting formulation of queries.
439  For various reasons explained below we have chosen ADQL to be the query language and accordingly we derive  For various reasons explained below we have chosen ADQL to be the query language and accordingly we derive
440  from the model a relational schema that defines the tables and columns that can be used in ADQL queries sent  from the model a relational schema that defines the tables and columns that can be used in ADQL queries sent
441  to a SimDB implementation. The result of ADQL queries is supposed to be a VOTable, and this will in general  to a SimDB implementation. The result of ADQL queries is supposed to be a VOTable, and this will in general
442  not represent a complete SimDB data product. However it can be used to browse the database, finally identifying  not represent a complete SimDB data product. However it can be used to browse the database, finally identifying
443  resources and possibly requesting these from the SimDB as XML documents.  resources and possibly requesting these from the SimDB as XML documents.
444  </p>  </p>
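<p>
As a rough illustration of how a tabular query result might be returned, the sketch below runs a query with SQLite and wraps the rows in a bare-bones VOTable. It is written under several assumptions: the table <code>simulation</code>, its columns and its single row are hypothetical, the query is plain SQL standing in for ADQL, and the XML omits the VOTable namespace and most metadata that a real service would add.
</p>

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_votable(columns, rows):
    """Wrap query rows in a minimal VOTable element tree.

    columns is a list of (name, VOTable datatype) pairs.
    """
    votable = ET.Element("VOTABLE", version="1.1")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE")
    for name, datatype in columns:
        attrs = {"name": name, "datatype": datatype}
        if datatype == "char":
            attrs["arraysize"] = "*"  # variable-length string
        ET.SubElement(table, "FIELD", attrs)
    data = ET.SubElement(table, "DATA")
    tabledata = ET.SubElement(data, "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return votable


# Hypothetical one-table schema, filled with made-up values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE simulation (id INTEGER, name TEXT, box_size REAL)")
conn.execute("INSERT INTO simulation VALUES (1, 'run-A', 1000.0)")

rows = conn.execute("SELECT name, box_size FROM simulation").fetchall()
votable = rows_to_votable([("name", "char"), ("box_size", "double")], rows)
print(ET.tostring(votable, encoding="unicode"))
```

<p>
The table only carries the selected columns, which is why such a result will in general not be a complete SimDB data product.
</p>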
445  <p>  <p>
446  We make very limited assumptions on <em>how</em> a data product discovered in a SimDB can actually be accessed.  We make very limited assumptions on <em>how</em> a data product discovered in a SimDB can actually be accessed.
447  We only assume there is a web-based service available, identified by a base URL and tagged with a service type.  We only assume there is a web-based service available, identified by a base URL and tagged with a service type.
448  The range of service types will be defined by SimDAP, but it will at least include "download" and "custom".  The range of service types will be defined by SimDAP, but it will at least include "download" and "custom".
449  The data model contains an explicit element for indicating which services are available for a given data product,  The data model contains an explicit element for indicating which services are available for a given data product,
450  and users may, if they wish, retrieve this information through ADQL queries and follow the links directly.  and users may, if they wish, retrieve this information through ADQL queries and follow the links directly.
451  SimDB implementations can and likely will eventually provide SimDAP related functionality, but this is not part  SimDB implementations can and likely will eventually provide SimDAP related functionality, but this is not part
452  of this specification.  of this specification.
453  </p>  </p>
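<p>
A possible shape for the service information is sketched below, assuming one row per (data product, service type, base URL). The tables <code>product</code> and <code>service</code>, the helper <code>services_for</code> and the example.org URLs are all hypothetical; only the service types "download" and "custom" come from the text above.
</p>

```python
import sqlite3

# Hypothetical tables and made-up URLs, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE service (
        product_id   INTEGER REFERENCES product(id),
        service_type TEXT,
        base_url     TEXT
    );
    INSERT INTO product VALUES (1, 'run-A snapshot 12');
    INSERT INTO service VALUES (1, 'download', 'http://example.org/simdap/download');
    INSERT INTO service VALUES (1, 'custom', 'http://example.org/simdap/powerspectrum');
""")


def services_for(conn, product_name, service_type=None):
    """Return (service_type, base_url) pairs for one data product."""
    sql = ("SELECT s.service_type, s.base_url "
           "FROM service s JOIN product p ON p.id = s.product_id "
           "WHERE p.name = ?")
    params = [product_name]
    if service_type is not None:
        sql += " AND s.service_type = ?"
        params.append(service_type)
    return conn.execute(sql + " ORDER BY s.service_type", params).fetchall()


all_services = services_for(conn, "run-A snapshot 12")
downloads = services_for(conn, "run-A snapshot 12", "download")
print(all_services)
print(downloads)
```

<p>
A client would then follow the returned base URL itself; what happens at that URL is left to SimDAP.
</p>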
454  <h3><a name="sec2_3"/>2.3 Registration</h3>  <h3><a name="sec2_3"/>2.3 Registration</h3>
455  <p>  <p>
456  It must be possible to find SimDB instances in an IVOA Resource Registry <em class="todo">@@ TODO add references @@</em>.  It must be possible to find SimDB instances in an IVOA Resource Registry <em class="todo">@@ TODO add references @@</em>.
457  This implies we need a corresponding resource type, and we have to design its structure.  This implies we need a corresponding resource type, and we have to design its structure.
458  We also assume that one may define resources in the sense of [...]  We also assume that one may define resources in the sense of [...]
459  <em class="todo">@@ TODO add reference to Resource data model document @@</em>  <em class="todo">@@ TODO add reference to Resource data model document @@</em>
460  from within the contents of a SimDB. We take this into account explicitly in the model.  from within the contents of a SimDB. We take this into account explicitly in the model.
461  The SimDB will have a "getIVOAResource" function, which will execute the appropriate transformation from  The SimDB will have a "getIVOAResource" function, which will execute the appropriate transformation from
462  the internal representation of the SimDB data products to the Resource model's XML representation [...]  the internal representation of the SimDB data products to the Resource model's XML representation [...]
463  <em class="todo">@@ TODO link to Resource XML schema document@@</em>.  <em class="todo">@@ TODO link to Resource XML schema document@@</em>.
464  This will likely put more requirements on the Registry model itself, maybe requiring extensions to its schema.  This will likely put more requirements on the Registry model itself, maybe requiring extensions to its schema.
465  Possibly a SimDB itself can be an extension registry. This we think can be postponed to a future version of the  Possibly a SimDB itself can be an extension registry. This we think can be postponed to a future version of the
466  specification.  specification.
467  </p>  </p>
468  <h3><a name="sec2_4"/>2.4 Technology: UML, XMI, XSLT</h3>  <h3><a name="sec2_4"/>2.4 Technology: UML, XMI, XSLT</h3>
469  <p>  <p>
470  We  We
471  </p>  </p>
472  <h3><a name="sec2_5"/>2.5 Reference implementations</h3>  <h3><a name="sec2_5"/>2.5 Reference implementations</h3>
473  <!-- ++++++++++++++++++++++++ -->  <!-- ++++++++++++++++++++++++ -->
474    
475  <h2><a name="sec3"/>3 Usage scenarios</h2>  <h2><a name="sec3"/>3 Usage scenarios</h2>
476  <em class="todo">@@ TODO needs severe editing @@</em>  <em class="todo">@@ TODO needs severe editing @@</em>
477  We have assembled a list of explicit use cases and scenarios from which we derive  We have assembled a list of explicit use cases and scenarios from which we derive
478  requirements for the current model and the SNAP protocol.  requirements for the current model and the SNAP protocol.
479  <h4><a name="sec3_1"/>3.1 "20 questions"</h4>  <h4><a name="sec3_1"/>3.1 "20 questions"</h4>
480  <p>  <p>
481  SimDB defines a common data model for simulations.  SimDB defines a common data model for simulations.
482  Following the good practice for database design initiated in [], we here provide a number of  Following the good practice for database design initiated in [], we here provide a number of
483  scientific questions one might want to ask such a database. The data model and associated data  scientific questions one might want to ask such a database. The data model and associated data
484  access protocol need to be sufficiently rich that they can support such questions.  access protocol need to be sufficiently rich that they can support such questions.
485  </p>  </p>
486  <ul>  <ul>
487  <li> Scientific goal: investigate baryon wiggles in the evolved density field<br/>  <li> Scientific goal: investigate baryon wiggles in the evolved density field<br/>
488  Query: Return all cosmological, pure dark matter, N-body simulations with WMAP 3 initial  Query: Return all cosmological, pure dark matter, N-body simulations with WMAP 3 initial
489  conditions and a box size of at least 1000 Mpc comoving, containing snapshots at about  conditions and a box size of at least 1000 Mpc comoving, containing snapshots at about
490  10 redshifts between 3 and 0.  10 redshifts between 3 and 0.
491  </li>  </li>
492  <li> Scientific goal: investigate whether observed structures in X-ray clusters that seem to  <li> Scientific goal: investigate whether observed structures in X-ray clusters that seem to
493  indicate turbulence can truly be that.<br/> Query: return all hydro-dynamical simulations of  indicate turbulence can truly be that.<br/> Query: return all hydro-dynamical simulations of
494  galaxy clusters of mass at least 10<sup>14</sup> M<sub>sun</sub>,  galaxy clusters of mass at least 10<sup>14</sup> M<sub>sun</sub>,
495  that have a model for viscosity included in the simulation.  that have a model for viscosity included in the simulation.
496  Moreover, return only those simulations that have associated to them an online visualisation  Moreover, return only those simulations that have associated to them an online visualisation
497  service that can produce projected temperature and pressure maps.  service that can produce projected temperature and pressure maps.
498  </li>  </li>
499  <li> Scientific goal: interpret the possible histories of an observed galaxy merger to calculate  <li> Scientific goal: interpret the possible histories of an observed galaxy merger to calculate
500  possible star formation episodes and compare these to the observed stellar populations.<br/>  possible star formation episodes and compare these to the observed stellar populations.<br/>
501  Query: Return all simulations of galaxy mergers where the component galaxies have a particular  Query: Return all simulations of galaxy mergers where the component galaxies have a particular
502  mass ratio and where there are enough snapshots to follow the evolution over a few Gyr.  mass ratio and where there are enough snapshots to follow the evolution over a few Gyr.
503  </li>  </li>
504    
505  <li> Scientific goal: compare the luminosity function of galaxies in the SDSS survey with those  <li> Scientific goal: compare the luminosity function of galaxies in the SDSS survey with those
506  in synthetic catalogues.<br/>Query: Select all cosmological simulations that have produced as  in synthetic catalogues.<br/>Query: Select all cosmological simulations that have produced as
507  secondary product synthetic galaxy catalogues on a light-cone and provide those via an SQL (ADQL?)  secondary product synthetic galaxy catalogues on a light-cone and provide those via an SQL (ADQL?)
508  query interface.  query interface.
509  </li>  </li>
510  <li> ...  <li> ...
511  </li>  </li>
512  </ul>  </ul>
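<p>
To make the first question in the list concrete, the sketch below phrases it as a query against a toy schema. Everything in it is an assumption made for illustration: the tables <code>simulation</code> and <code>snapshot</code>, their columns, the made-up rows, and the reading of "about 10 redshifts" as "at least 10 snapshots". The query is plain SQL run with SQLite, kept to constructs (joins, BETWEEN, GROUP BY, HAVING) that an ADQL query could be expected to use as well.
</p>

```python
import sqlite3

# Toy schema and made-up rows; none of the names come from the SimDB model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE simulation (
        id INTEGER PRIMARY KEY, name TEXT, method TEXT,
        matter TEXT, cosmology TEXT, box_size_mpc REAL
    );
    CREATE TABLE snapshot (simulation_id INTEGER, redshift REAL);
""")
conn.executemany("INSERT INTO simulation VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "run-A", "N-body", "dark matter", "WMAP3", 1000.0),
    (2, "run-B", "N-body", "dark matter", "WMAP3", 250.0),   # box too small
    (3, "run-C", "N-body", "dark matter", "WMAP3", 2000.0),  # too few snapshots
])
# Runs 1 and 2 get 13 snapshots at z = 0, 0.25, ..., 3; run 3 gets only 3.
snapshots = [(sim, i * 0.25) for sim in (1, 2) for i in range(13)]
snapshots += [(3, z) for z in (0.0, 1.0, 2.0)]
conn.executemany("INSERT INTO snapshot VALUES (?, ?)", snapshots)

QUERY = """
    SELECT s.name, COUNT(*) AS n_snapshots
    FROM simulation s
    JOIN snapshot p ON p.simulation_id = s.id
    WHERE s.method = 'N-body'
      AND s.matter = 'dark matter'
      AND s.cosmology = 'WMAP3'
      AND s.box_size_mpc >= 1000
      AND p.redshift BETWEEN 0 AND 3
    GROUP BY s.id, s.name
    HAVING COUNT(*) >= 10
    ORDER BY s.name
"""
matches = conn.execute(QUERY).fetchall()
print(matches)
```

<p>
Even this small example shows what the model has to offer for the question to be answerable: a simulation method, the type of matter, the cosmology, a box size with a known unit, and per-snapshot redshifts.
</p>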
513  <p>  <p>
514  In the design of the model it is useful to think about the steps a user might go through  In the design of the model it is useful to think about the steps a user might go through
515  when querying a database system in various "drilling down" steps. For example the following  when querying a database system in various "drilling down" steps. For example the following
516  questions might be asked:  questions might be asked:
517  </p>  </p>
518  <p>  <p>
519  <ul>  <ul>
520  <li>What system/object is being simulated?</li>  <li>What system/object is being simulated?</li>
521  <li>What physical processes are included?</li>  <li>What physical processes are included?</li>
522  <li>How is the system being represented in the simulation  <li>How is the system being represented in the simulation
523  (particles (Lagrangian), (adaptive) mesh (Eulerian), both, other)?</li>  (particles (Lagrangian), (adaptive) mesh (Eulerian), both, other)?</li>
524  <li>Per process:<ul>  <li>Per process:<ul>
525  <li>How are the physical processes implemented?</li>  <li>How are the physical processes implemented?</li>
526  <li>Characterise the numerical approximations (e.g. resolution, softening parameter)</li></ul></li>  <li>Characterise the numerical approximations (e.g. resolution, softening parameter)</li></ul></li>
527  <li>What observables are available for the system/object, possibly as function of time?  <li>What observables are available for the system/object, possibly as function of time?
528  As it is a spatial system, at least size, center-of-mass position.</li>  As it is a spatial system, at least size, center-of-mass position.</li>
529  <li>What observables are available for the constituents, i.e. what is the schema of the atomic objects?</li>  <li>What observables are available for the constituents, i.e. what is the schema of the atomic objects?</li>
530  <li>Per snapshot, per atomic object type, per variable:  <li>Per snapshot, per atomic object type, per variable:
531  <ul>  <ul>
532  <li>Characterise the possible values</li>  <li>Characterise the possible values</li>
533  <li>Characterise the result</li></ul></li>  <li>Characterise the result</li></ul></li>
534  <li>Are post-processing results available?</li>  <li>Are post-processing results available?</li>
535  <li>Are services/applications available working on the results?</li>  <li>Are services/applications available working on the results?</li>
536  <li>Which code ran the simulation?</li>  <li>Which code ran the simulation?</li>
537  <li>What were the values of physical parameters?</li>  <li>What were the values of physical parameters?</li>
538  <li>How were initial conditions created, what parameters?</li>  <li>How were initial conditions created, what parameters?</li>
539  </ul>  </ul>
540  </p>  </p>
541    
542  <h4><a name="sec3_2"/>3.2 SimDB-standard implementation</h4>  <h4><a name="sec3_2"/>3.2 SimDB-standard implementation</h4>
543  We foresee a simple implementation scenario based directly on products developed  We foresee a simple implementation scenario based directly on products developed
544  in the course of the SimDB effort. We believe that from the data model to be developed  in the course of the SimDB effort. We believe that from the data model to be developed
545  in this effort we should be able to derive physical representations that  in this effort we should be able to derive physical representations that
546  can be used directly in implementations. We envision that with only a little custom infrastructure code  can be used directly in implementations. We envision that with only a little custom infrastructure code
547  it should be possible to    it should be possible to  
548  <ul>  <ul>
549  <li>fill a relational database with tables and views representing the SimDB data model from  <li>fill a relational database with tables and views representing the SimDB data model from
550  DDL scripts generated from the UML</li>  DDL scripts generated from the UML</li>
551  <li>create a web-based service that accepts XML documents for inserting new simulation results  <li>create a web-based service that accepts XML documents for inserting new simulation results
552  and translates these, using generated code with JAXB annotations, to in-memory Java objects</li>  and translates these, using generated code with JAXB annotations, to in-memory Java objects</li>
553  <li>flush these objects to a relational database using a Java Persistence API (JPA) implementation,  <li>flush these objects to a relational database using a Java Persistence API (JPA) implementation,
554  structured using the JPA annotations generated on the Java classes.  structured using the JPA annotations generated on the Java classes.
555  It should not be too hard to support other languages as well if they provide similar simple XML binding and  It should not be too hard to support other languages as well if they provide similar simple XML binding and
556  OR-mapping capabilities. Python+Django and C#+LINQ or NHibernate come to mind.<em class="todo">  OR-mapping capabilities. Python+Django and C#+LINQ or NHibernate come to mind.<em class="todo">
557  @@ TODO check with people knowing more about these technologies @@</em></li>  @@ TODO check with people knowing more about these technologies @@</em></li>
558  <li>accept ADQL queries that are translated to the appropriate vendor specific SQL  <li>accept ADQL queries that are translated to the appropriate vendor specific SQL
559  (using modules defined by the ADQL effort?) and return a VOTable</li>  (using modules defined by the ADQL effort?) and return a VOTable</li>
560  <li>accept requests for identified SimDB resources (using an IVO or implementation specific identifier),  <li>accept requests for identified SimDB resources (using an IVO or implementation specific identifier),
561  translate this into a JPA query to retrieve the object from the database, which is translated to  translate this into a JPA query to retrieve the object from the database, which is translated to
562  the appropriate XML using the JAXB layer and sent back to the user.</li>  the appropriate XML using the JAXB layer and sent back to the user.</li>
563  </ul>  </ul>
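<p>The flow listed above (XML document in, in-memory object, relational storage, XML back out) can be sketched in a few lines. This is only an illustration in Python using the standard library, not the JAXB/JPA pipeline itself: the <code>Experiment</code> class, the <code>experiment</code> table and the element names are hypothetical stand-ins for what would be generated from the UML.</p>

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Experiment:
    """Hypothetical stand-in for a class generated from the UML model."""
    identifier: str
    name: str


def parse_experiment(xml_text):
    """XML binding step: turn an uploaded document into an in-memory object."""
    root = ET.fromstring(xml_text)
    return Experiment(root.get("id"), root.findtext("name"))


def store(conn, exp):
    """Persistence step: flush the object to a (hypothetical) relational table."""
    conn.execute("CREATE TABLE IF NOT EXISTS experiment (id TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO experiment VALUES (?, ?)", (exp.identifier, exp.name))


def fetch_as_xml(conn, identifier):
    """Retrieval step: look up an identified resource and serialise it to XML."""
    row = conn.execute(
        "SELECT id, name FROM experiment WHERE id = ?", (identifier,)
    ).fetchone()
    root = ET.Element("experiment", {"id": row[0]})
    ET.SubElement(root, "name").text = row[1]
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
uploaded = '<experiment id="sim-001"><name>Example run</name></experiment>'
store(conn, parse_experiment(uploaded))
round_trip = fetch_as_xml(conn, "sim-001")
```

<p>In a real implementation the three functions would be replaced by the generated binding and OR-mapping layers; the sketch only shows where each of them sits in the request cycle.</p>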
564    
565  <h4><a name="sec3_3"/>3.3 Legacy database</h4>  <h4><a name="sec3_3"/>3.3 Legacy database</h4>
566  Although by no means as common as similar efforts in the observational domain,  Although by no means as common as similar efforts in the observational domain,
567  databases have been developed containing the meta data of simulations.  databases have been developed containing the meta data of simulations.
568  How could a SimDB be implemented around such a database?  How could a SimDB be implemented around such a database?
569  Our ideas are inspired by (what we understand from) the "global-as-view" approach to information  Our ideas are inspired by (what we understand from) the "global-as-view" approach to information
570  integration. We assume the implementers have their own way of filling up their database with meta-data  integration. We assume the implementers have their own way of filling up their database with meta-data
571  describing simulations from their own efforts. The idea is that they write database views to provide  describing simulations from their own efforts. The idea is that they write database views to provide
572  a virtual implementation of the SimDB/RDB schema. ADQL queries sent to their service can now still be  a virtual implementation of the SimDB/RDB schema. ADQL queries sent to their service can now still be
573  understood and replied to. The users should also be able to write custom code to produce the appropriate  understood and replied to. The users should also be able to write custom code to produce the appropriate
574  XML documents based on a request for an identified resource, possibly by querying these same views.  XML documents based on a request for an identified resource, possibly by querying these same views.
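<p>As a rough illustration of the view-based approach, the sketch below wraps a legacy table in a view that exposes standard-looking column names. It uses SQLite through Python only because that is self-contained; the table <code>legacy_runs</code>, the view <code>simdb_experiment</code> and all column names are assumptions made for this example and are not taken from the SimDB/RDB schema.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Legacy table, filled by the implementers' own tools (hypothetical layout)
CREATE TABLE legacy_runs (run_id INTEGER PRIMARY KEY, title TEXT, box_mpc REAL, code TEXT);
INSERT INTO legacy_runs VALUES (1, 'LCDM 500', 500.0, 'CodeA');
INSERT INTO legacy_runs VALUES (2, 'LCDM 100', 100.0, 'CodeB');

-- View presenting the legacy rows under (hypothetical) standard names
CREATE VIEW simdb_experiment AS
SELECT 'legacy:' || run_id AS publisher_did,
       title               AS name,
       code                AS protocol_name
FROM legacy_runs;
""")

# A query written against the standard names never touches the legacy layout.
rows = conn.execute(
    "SELECT publisher_did, name FROM simdb_experiment WHERE protocol_name = 'CodeA'"
).fetchall()
```

<p>A query translated from ADQL would target the view, so the legacy tables can keep their own structure and loading procedures.</p>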
575    
576  <h4><a name="sec3_4"/>3.4 Meta data production pipe line</h4>  <h4><a name="sec3_4"/>3.4 Meta data production pipe line</h4>
577  The SimDB data model is relatively comprehensive, which reflects itself in XML documents  The SimDB data model is relatively comprehensive, which reflects itself in XML documents
578  of substantial size and complexity for realistic cases.  of substantial size and complexity for realistic cases.
579  For a registration scenario, i.e. one where a user is allowed to upload XML documents to a SimDB implementation,  For a registration scenario, i.e. one where a user is allowed to upload XML documents to a SimDB implementation,
580  one would prefer not to have to produce these documents by hand. By far the preferred manner in our opinion  one would prefer not to have to produce these documents by hand. By far the preferred manner in our opinion
581  would be for simulation and post-processing pipe-lines to produce compliant documents.  would be for simulation and post-processing pipe-lines to produce compliant documents.
582  We have contacted authors of some of the most popular major simulation codes (Springel; Norman et al; more needed),  We have contacted authors of some of the most popular major simulation codes (Springel; Norman et al; more needed),
583  and they have agreed that this is feasible and are willing to participate in this effort.  and they have agreed that this is feasible and are willing to participate in this effort.
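<p>What such a pipeline step might look like is sketched below: a small function that a simulation or post-processing script could call at the end of a run to emit a metadata document. The element and attribute names (<code>simulation</code>, <code>protocol</code>, <code>parameter</code>) and the function <code>describe_run</code> are hypothetical; a compliant document would follow the schema generated from the model.</p>

```python
import xml.etree.ElementTree as ET


def describe_run(run_id, code_name, parameters):
    """Build a small metadata document for one run (hypothetical element names)."""
    root = ET.Element("simulation", {"id": run_id})
    ET.SubElement(root, "protocol").text = code_name
    params = ET.SubElement(root, "parameters")
    # Sort by name so that repeated runs produce documents in a stable order.
    for name, value in sorted(parameters.items()):
        ET.SubElement(params, "parameter", {"name": name}).text = str(value)
    return ET.tostring(root, encoding="unicode")


document = describe_run("run-42", "ExampleCode", {"omega_m": 0.25, "box_size": 500})
```

<p>Because the parameter values are already known to the code that ran the simulation, producing the document this way avoids transcription by hand.</p>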
584    
585  <h4><a name="sec3_5"/>3.5 Client tools</h4>  <h4><a name="sec3_5"/>3.5 Client tools</h4>
586  One reason to produce a standard which uses ADQL on top of a standard data model is that client tools  One reason to produce a standard which uses ADQL on top of a standard data model is that client tools
587  can be written to query different such holdings. For example we could envision a tool such as VisIVO [..]  can be written to query different such holdings. For example we could envision a tool such as VisIVO [..]
588  to offer some user-friendly interface for querying SimDB implementations retrieved from an IVOA Registry.  to offer some user-friendly interface for querying SimDB implementations retrieved from an IVOA Registry.
589  The user need not see any ADQL, which is all generated by VisIVO, but can be shown results and services.  The user need not see any ADQL, which is all generated by VisIVO, but can be shown results and services.
590  In particular if a cut-out service is available, VisIVO could provide an interface for the user to decide  In particular if a cut-out service is available, VisIVO could provide an interface for the user to decide
591  on the sub-volume, retrieve and visualise it. The advantage of having a standard data model  on the sub-volume, retrieve and visualise it. The advantage of having a standard data model
592  clearly is that the same ADQL can be sent to all SimDB services.  clearly is that the same ADQL can be sent to all SimDB services.
593  <em class="todo">@@ TODO contact VisIVO people to see whether this could be implemented @@</em>.  <em class="todo">@@ TODO contact VisIVO people to see whether this could be implemented @@</em>.
594    
595  <!-- ++++++++++++++++++++++++ -->  <!-- ++++++++++++++++++++++++ -->
596    
597    
598  <h2><a name="sec4"/>4 Analysis model</h2>  <h2><a name="sec4"/>4 Analysis model</h2>
599  <p>  <p>
600  <em class="todo">@@TODO Gerard @@</em>  <em class="todo">@@TODO Gerard @@</em>
601  An <i>analysis model</i>, also called domain model, is an abstract, high-level representation of the  An <i>analysis model</i>, also called domain model, is an abstract, high-level representation of the
602  <i>universe of discourse</i> (UoD), the part of the world that our application deals with.  <i>universe of discourse</i> (UoD), the part of the world that our application deals with.
603  It is a UML model, with emphasis on the concepts and their exact relationships in the UoD, though details  It is a UML model, with emphasis on the concepts and their exact relationships in the UoD, though details
604  such as attributes need not be completely filled in.  such as attributes need not be completely filled in.
605  Importantly, it should not be influenced by application scenarios apart from knowledge of their UoD.  Importantly, it should not be influenced by application scenarios apart from knowledge of their UoD.
606  Here we describe the UoD and our analysis model. The model is strongly influenced by patterns  Here we describe the UoD and our analysis model. The model is strongly influenced by patterns
607  discovered in earlier work on a  discovered in earlier work on a
608  <i><a href="http://www.ivoa.net/internal/IVOA/IvoaDataModel/DomainModelv0.9.1.doc">Domain model for Astronomy</a></i>,  <i><a href="http://www.ivoa.net/internal/IVOA/IvoaDataModel/DomainModelv0.9.1.doc">Domain model for Astronomy</a></i>,
609  co-written by one of the authors of the present note. We describe some of its main patterns below as well.  co-written by one of the authors of the present note. We describe some of its main patterns below as well.
610  <em class="todo">@@ TODO or will we? @@</em>  <em class="todo">@@ TODO or will we? @@</em>
611  </p>  </p>
612  <h4><a name="sec4.1"/>4.1 Universe of Discourse</h4>  <h4><a name="sec4.1"/>4.1 Universe of Discourse</h4>
613    
614  <h4><a name="sec4.2"/>4.2 Domain Model for Astronomy</h4>  <h4><a name="sec4.2"/>4.2 Domain Model for Astronomy</h4>
615    
616  <h4><a name="sec4.3"/>4.3 SimDB analysis model</h4>  <h4><a name="sec4.3"/>4.3 SimDB analysis model</h4>
617  <em class="todo">@@TODO create a version and add it to volute@@</em>.  <em class="todo">@@TODO create a version and add it to volute@@</em>.
618    
619  <!-- ++++++++++++++++++++++++ -->  <!-- ++++++++++++++++++++++++ -->
620    
621  <h2><a name="sec5"/>5 Logical Model: SimDB</h2>  <h2><a name="sec5"/>5 Logical Model: SimDB</h2>
622  <p>  <p>
623  Here we introduce the core of our proposal, the UML representation of our logical data model  Here we introduce the core of our proposal, the UML representation of our logical data model
624  for our Simulation Database. The exact representation of this model is an  for our Simulation Database. The exact representation of this model is an
625  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/SimDB_DM.xml">XMI file</a>,  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/SimDB_DM.xml">XMI file</a>,
626  which can be found in the <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm">snapdm section</a>  which can be found in the <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm">snapdm section</a>
627  of the <a href="http://volute.googlecode.com/svn/">Volute subversion database</a> on Google code.  of the <a href="http://volute.googlecode.com/svn/">Volute subversion database</a> on Google code.
628  Other representations can be found in that same hierarchy, in particular check out the  Other representations can be found in that same hierarchy, in particular check out the
629  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/output/html/SimDB.html">HTML documentation</a> which we generated from the XMI  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/output/html/SimDB.html">HTML documentation</a> which we generated from the XMI
630  representation with the XSLT pipeline described in <a href="#appB">Appendix B</a>. This generated documentation file contains  representation with the XSLT pipeline described in <a href="#appB">Appendix B</a>. This generated documentation file contains
631  the explicit description of all of the elements in the model and forms the reference documentation for the model.  the explicit description of all of the elements in the model and forms the reference documentation for the model.
632  </p>  </p>
633  <h3><a name="sec5_1"/>5.1 Overview</h3>  <h3><a name="sec5_1"/>5.1 Overview</h3>
634  <p>  <p>
635  The logical data model is a fully detailed model of the application domain. It is to form the basis of physical  The logical data model is a fully detailed model of the application domain. It is to form the basis of physical
636  models, representing the model in various computational environments.  models, representing the model in various computational environments.
637  The logical model is represented as a set of UML diagrams, which we created using MagicDraw Community Edition 12.1 and stored as an  The logical model is represented as a set of UML diagrams, which we created using MagicDraw Community Edition 12.1 and stored as an
638  XMI file in the GoogleCode  XMI file in the GoogleCode
639  SVN repository: <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/SNAP_Simulation_DM.xml">  SVN repository: <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/SNAP_Simulation_DM.xml">
640  SNAP_Simulation_DM.xml</a> <em class="todo">@@TODO should change all occurrences of names with SNAP to using SimDB@@</em>  SNAP_Simulation_DM.xml</a> <em class="todo">@@TODO should change all occurrences of names with SNAP to using SimDB@@</em>
641  JPG representations of the model can be found in <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/images/">this</a>  JPG representations of the model can be found in <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/images/">this</a>
642  directory. <em class="todo">@@TODO find proper representation image of the complete model. Possibly color packages differently.@@</em>  directory. <em class="todo">@@TODO find proper representation image of the complete model. Possibly color packages differently.@@</em>
643  </p>  </p>
644  <h3><a name="sec5_2"/>5.2 Normalisation</h3>  <h3><a name="sec5_2"/>5.2 Normalisation</h3>
645  <p>  <p>
646  We have tried to find a balance in the level of <i>normalisation</i> of the data model.  We have tried to find a balance in the level of <i>normalisation</i> of the data model.
647  </p>  </p>
648  <h3><a name="sec5_3"/>5.3 Model contents</h3>  <h3><a name="sec5_3"/>5.3 Model contents</h3>
649  <p>Here we discuss the actual contents of the model, though the detailed description </p>  <p>Here we discuss the actual contents of the model, though the detailed description </p>
650  <h4><a name="sec5_3_1"/>5.3.1 Resource hierarchy</h4>  <h4><a name="sec5_3_1"/>5.3.1 Resource hierarchy</h4>
651  <p>  <p>
652  At the root of the SimDB data model is an abstract class called Resource; in the rest  At the root of the SimDB data model is an abstract class called Resource; in the rest
653  of this document we will refer to this as SimDB/Resource.  of this document we will refer to this as SimDB/Resource.
654  It represents the different types of highest level meta-data objects to be stored in a SimDB.  It represents the different types of highest level meta-data objects to be stored in a SimDB.
655  Examples of this are represented as subclasses. First Experiment (SimDB/Experiment), which represents  Examples of this are represented as subclasses. First Experiment (SimDB/Experiment), which represents
656  different types of experiments that have been performed (run/executed/...) and have produced the results  different types of experiments that have been performed (run/executed/...) and have produced the results
657  that SimDB users may be interested in. Examples of SimDB/Experiment-s are first simulations,  that SimDB users may be interested in. Examples of SimDB/Experiment-s are first simulations,
658  but also the various post-processing operations transforming simulation results into other products  but also the various post-processing operations transforming simulation results into other products
659  such as halo catalogues, density fields etc.  such as halo catalogues, density fields etc.
660  </p>  </p>
661  <p>  <p>
662  The second major type of SimDB/Resource is the SimDB/Protocol.  The second major type of SimDB/Resource is the SimDB/Protocol.
663  This concept represents a <i>formally prescribed way of doing an experiment</i>.  This concept represents a <i>formally prescribed way of doing an experiment</i>.
664  It is derived from the concept with the same name in the domain model, which itself was inspired  It is derived from the concept with the same name in the domain model, which itself was inspired
665  by the concept with the same name in Chapter 8.5 in <a href="#r_AnalaysisPatterns">[3]</a>.  by the concept with the same name in Chapter 8.5 in <a href="#r_AnalaysisPatterns">[3]</a>.
666  In the SimDB/DM this concept has concrete representations in the computer programs that are being  In the SimDB/DM this concept has concrete representations in the computer programs that are being
667  used to run simulations and post-processing etc. As such it defines the possible input parameters,  used to run simulations and post-processing etc. As such it defines the possible input parameters,
668  possible algorithms, the kind of results that can be produced by the code. Every SimDB/Experiment must  possible algorithms, the kind of results that can be produced by the code. Every SimDB/Experiment must
669  indicate which SimDB/Protocol was used and for example provide values for the input parameters, indicate  indicate which SimDB/Protocol was used and for example provide values for the input parameters, indicate
670  which physics was used.  which physics was used.
671  </p>  </p>
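<p>The relationship between the abstract Resource and its Experiment and Protocol subclasses can be illustrated with a few Python classes. This is a loose sketch and not the generated model: the attribute names, the <code>InputParameter</code> helper and the <code>undeclared_settings</code> check are assumptions chosen to show how an experiment's parameter values refer back to what its protocol declares.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Resource:
    """Root of the hierarchy; treated as abstract, so it cannot be instantiated."""
    name: str
    description: str = ""

    def __post_init__(self):
        if type(self) is Resource:
            raise TypeError("Resource is abstract; use a subclass")


@dataclass
class InputParameter:
    """A parameter that a protocol declares as possible input (hypothetical)."""
    name: str


@dataclass
class Protocol(Resource):
    """A formally prescribed way of doing an experiment, e.g. a simulation code."""
    parameters: List[InputParameter] = field(default_factory=list)


@dataclass
class Experiment(Resource):
    """A run of a protocol, with values for (some of) its input parameters."""
    protocol: Optional[Protocol] = None
    settings: Dict[str, float] = field(default_factory=dict)

    def undeclared_settings(self):
        """Return names of settings the protocol does not declare."""
        declared = {p.name for p in self.protocol.parameters} if self.protocol else set()
        return sorted(set(self.settings) - declared)


code = Protocol(name="ExampleCode", parameters=[InputParameter("box_size")])
run = Experiment(name="Example run", protocol=code,
                 settings={"box_size": 500.0, "softening": 0.01})
unknown = run.undeclared_settings()
```

<p>Here <code>unknown</code> lists the settings that the protocol never declared, which is the kind of consistency check the Experiment to Protocol link makes possible.</p>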
672  <p>  <p>
673  The SimDB/Resource concept is clearly similar, but in general <i>not equivalent</i> to the Resource Registry's Resource concept.  The SimDB/Resource concept is clearly similar, but in general <i>not equivalent</i> to the Resource Registry's Resource concept.
674  In data modeling terms, it is not true that a SimDB/Resource <i>is a</i> Registry/Resource.  In data modeling terms, it is not true that a SimDB/Resource <i>is a</i> Registry/Resource.
675  Often the reason is similar to the reasons that a single image is not a Registry/Resource, whereas a SIAP-compatible service is.  Often the reason is similar to the reasons that a single image is not a Registry/Resource, whereas a SIAP-compatible service is.
676  The granularity of a SimDB will be higher than a Registry and many simulations on their own will be too small.  The granularity of a SimDB will be higher than a Registry and many simulations on their own will be too small.
677  The SimDB itself will have to be registered (see <a href="#">section ???</a> for a further discussion  The SimDB itself will have to be registered (see <a href="#">section ???</a> for a further discussion
678  <em class="todo">@@ TODO add proper section and href@@</em>),  <em class="todo">@@ TODO add proper section and href@@</em>),
679  i.e. a SimDB service <i>is a</i> Registry/Resource. In discussion with Ray Plante (IVOA Interop May 2007, Beijing)  i.e. a SimDB service <i>is a</i> Registry/Resource. In discussion with Ray Plante (IVOA Interop May 2007, Beijing)
680  on this issue it was proposed that some part of the contents could also be registered in a Registry directly.  on this issue it was proposed that some part of the contents could also be registered in a Registry directly.
681  I.e. we should be able to identify Registry/Resource-s in SimDB. Considerations to decide on how to make this identification would be for example  I.e. we should be able to identify Registry/Resource-s in SimDB. Considerations to decide on how to make this identification would be for example
682  that all data products resulting from a well defined (and published) scientific project could qualify.  that all data products resulting from a well defined (and published) scientific project could qualify.
683  To represent such a possibility for now we have introduced another subclass of SimDB/Resource: SimDB/Project.  To represent such a possibility for now we have introduced another subclass of SimDB/Resource: SimDB/Project.
684  This is not much more than an aggregation of experiments, with some additional attributes describing the motivation etc.  This is not much more than an aggregation of experiments, with some additional attributes describing the motivation etc.
685  The metadata of a SimDB/Project is not the same as that of a Registry/Resource, however we propose that we should be able  The metadata of a SimDB/Project is not the same as that of a Registry/Resource, however we propose that we should be able
686  to define a transformation (possibly implemented again in XSLT) to transform a SimDB/Project and produce a Registry/XML representation.  to define a transformation (possibly implemented again in XSLT) to transform a SimDB/Project and produce a Registry/XML representation.
687  Some more thoughts on this subject will be given in <a href="#">section ???</a> <em class="todo">@@ TODO add proper section and href@@</em> mentioned above.  Some more thoughts on this subject will be given in <a href="#">section ???</a> <em class="todo">@@ TODO add proper section and href@@</em> mentioned above.
688  </p>  </p>
689  <ul>  <ul>
690  <li>Should we define explicit transformations for SimDB/Resource -> Registry/Resource ?</li>  <li>Should we define explicit transformations for SimDB/Resource -> Registry/Resource ?</li>
691  </ul>  </ul>
692    
693  <h4><a name="sec5_3_2"/>5.3.2 Object types</h4>  <h4><a name="sec5_3_2"/>5.3.2 Object types</h4>
694  <p>  <p>
695  One of the main differences between the SimDB data model and other data models in the IVOA so far, is that we do not  One of the main differences between the SimDB data model and other data models in the IVOA so far, is that we do not
696  know in advance what types of results we can expect. The Spectrum data model describes spectra, the characterisation data model  know in advance what types of results we can expect. The Spectrum data model describes spectra, the characterisation data model
697  characterises observational results, the model implicit (for now) in SIA deals with 2D images etc.  characterises observational results, the model implicit (for now) in SIA deals with 2D images etc.
698  This implies that many features describing these results can be explicitly modeled: Spectra have been taken of sources on the sky,  This implies that many features describing these results can be explicitly modeled: Spectra have been taken of sources on the sky,
699  during a certain time period, covering a certain wavelength range. And fluxes were measured (in some form).  during a certain time period, covering a certain wavelength range. And fluxes were measured (in some form).
700  This makes it possible for space/time/wavelength/flux to make explicit appearances in the corresponding models.  This makes it possible for space/time/wavelength/flux to make explicit appearances in the corresponding models.
701  </p>  </p>
702  <p>  <p>
703  In contrast, simulations come in a great variety of types, even if we constrain ourselves to the "3+1D" kind.  In contrast, simulations come in a great variety of types, even if we constrain ourselves to the "3+1D" kind.
704  We can not make many assumptions on the type of objects making up the result, or on the "observables" of these objects.  We can not make many assumptions on the type of objects making up the result, or on the "observables" of these objects.
705  It therefore becomes necessary to add components to the model that allow publishers to describe these explicitly.  It therefore becomes necessary to add components to the model that allow publishers to describe these explicitly.
706  We do so in the ObjectType hierarchy.  We do so in the ObjectType hierarchy.
707  </p>  </p>
708  <ul>  <ul>
709  <li>Should we add units to the properties and not to the characterisations; similar for InputParameter and ParameterSetting?</li>  <li>Should we add units to the properties and not to the characterisations; similar for InputParameter and ParameterSetting?</li>
710  </ul>  </ul>
711    
712  <h4><a name="sec5_3_3"/>5.3.3 Target</h4>  <h4><a name="sec5_3_3"/>5.3.3 Target</h4>
713  <p>  <p>
714  The first question most people want answered about a simulation is: "what is being simulated?".  The first question most people want answered about a simulation is: "what is being simulated?".
715  The answer should correspond to a real (astronomical) object, or collection of objects,  The answer should correspond to a real (astronomical) object, or collection of objects,
716  or  possibly a physical process. For SimDB to answer such questions implies that publishers must be  or  possibly a physical process. For SimDB to answer such questions implies that publishers must be
717  able to describe these concepts in the model.  able to describe these concepts in the model.
718  We have introduced the TargetObjectType and TargetProcess classes for this.... <em class="todo">@@ TODO expand @@</em>.  We have introduced the TargetObjectType and TargetProcess classes for this.... <em class="todo">@@ TODO expand @@</em>.
719  </p>  </p>
720    
721  <h4><a name="sec5_3_4"/>5.3.4 Characterisation</h4>  <h4><a name="sec5_3_4"/>5.3.4 Characterisation</h4>
722  <p>  <p>
723  Much of the metadata in the model describes how the results that are supposedly the ultimate  Much of the metadata in the model describes how the results that are supposedly the ultimate
724  goal of users were produced, the kind of objects contained in the results, and their scientific content. In an implementation of  goal of users were produced, the kind of objects contained in the results, and their scientific content. In an implementation of
725  the model as a database, one does not expect the actual data to be stored and therefore there is no need to have model elements  the model as a database, one does not expect the actual data to be stored and therefore there is no need to have model elements
726  describing these. However there is some use to getting a summary of the actual data values, both obtained, and obtainable.  describing these. However there is some use to getting a summary of the actual data values, both obtained, and obtainable.
727  To this end we have added the characterisation elements, in reference to the <a href="#r_Characterisation">Characterisation data model [5]</a>  To this end we have added the characterisation elements, in reference to the <a href="#r_Characterisation">Characterisation data model [5]</a>
728  that, as we will try to explain, performs a similar function for observations.  that, as we will try to explain, performs a similar function for observations.
729  </p>  </p>
730  <p>  <p>
731  <em class="todo">@@ TODO expand @@</em>  <em class="todo">@@ TODO expand @@</em>
732  </p>  </p>
733  <ul class="issue">  <ul class="issue">
734  <li>We need to add characterisation to TargetObject, so users can ask for "simulations of 1e14 M<sub>sun</sub> galaxy clusters"</li>  <li>We need to add characterisation to TargetObject, so users can ask for "simulations of 1e14 M<sub>sun</sub> galaxy clusters"</li>
735  <li>Do we need other types of characterisation, such as accuracy etc?</li>  <li>Do we need other types of characterisation, such as accuracy etc?</li>
736  </ul>  </ul>
737    
738  <h4><a name="sec5_3_5"/>5.3.5 Semantics</h4>  <h4><a name="sec5_3_5"/>5.3.5 Semantics</h4>
739  <p>  <p>
740  There are many instances in the data model where we need to describe elements of the  There are many instances in the data model where we need to describe elements of the
741  SimDB/Resource-s explicitly, because we do not have implicit information based on the context.  SimDB/Resource-s explicitly, because we do not have implicit information based on the context.
742  Examples are the various properties of object types, the target objects and processes etc.  Examples are the various properties of object types, the target objects and processes etc.
743  Apart from a name and a description we then frequently add  Apart from a name and a description we then frequently add
744  an attribute which is supposed to "label" the element according to an assumed standard list of terms.  an attribute which is supposed to "label" the element according to an assumed standard list of terms.
745  We model this using the <pre>&lt;&lt;ontologyterm&gt;&gt;</pre> stereotype. Attributes with this stereotype  We model this using the <pre>&lt;&lt;ontologyterm&gt;&gt;</pre> stereotype. Attributes with this stereotype
746  are assumed to take their values from such a predefined "ontology". See  are assumed to take their values from such a predefined "ontology". See
747    
748  </p>  </p>
749  <ul class="issue">  <ul class="issue">
750  <li>We need to list the ontologies that we use and create first attempts at those that do not yet exist.</li>  <li>We need to list the ontologies that we use and create first attempts at those that do not yet exist.</li>
751  <li>We need to have proper locations of machine readable vocabularies</li>  <li>We need to have proper locations of machine readable vocabularies</li>
752  <li>We need to get feedback on what kind of ontologies we want. Narrower/broader types, fully linked ontologies? etc</li>  <li>We need to get feedback on what kind of ontologies we want. Narrower/broader types, fully linked ontologies? etc</li>
753  </ul>  </ul>
754  <h4><a name="sec5_3_6"/>5.3.6 Units</h4>  <h4><a name="sec5_3_6"/>5.3.6 Units</h4>
755  <p>  <p>
756  The current (May 2008 <em class="todo">@@ TODO update when necessary @@</em>) version of the model  The current (May 2008 <em class="todo">@@ TODO update when necessary @@</em>) version of the model
757  allows publishers to specify numerical quantities using a real value and a unit.  allows publishers to specify numerical quantities using a real value and a unit.
758  I.e. we do not prescribe units for particular quantities.  I.e. we do not prescribe units for particular quantities.
759  Allowing this flexibility in unit assignment does pose a problem for a query interface that allows users to query on  Allowing this flexibility in unit assignment does pose a problem for a query interface that allows users to query on
760  characterisation values and other numerical quantities. ADQL does not include units, for example, yet a user  characterisation values and other numerical quantities. ADQL does not include units, for example, yet a user
761  cannot assume that every publisher will use the same unit for, say, the typical size of a simulation box.  cannot assume that every publisher will use the same unit for, say, the typical size of a simulation box.
762  This is even worse of course for the characterisation values of properties that have to be defined  This is even worse of course for the characterisation values of properties that have to be defined
763  in the model and can have any kind of assumed unit.  in the model and can have any kind of assumed unit.
764  </p>  </p>
765  <p>  <p>
766  We believe we should treat units as a special semantic vocabulary, possibly an ontology.  We believe we should treat units as a special semantic vocabulary, possibly an ontology.
767  This implies we push its development off to elsewhere for now, and assume we can  This implies we push its development off to elsewhere for now, and assume we can
768  at some point use a standard list of units in a similar way to the other ontology references.  at some point use a standard list of units in a similar way to the other ontology references.
769  Maybe this could include a link to the physical quantity (etc, see for example the  Maybe this could include a link to the physical quantity (etc, see for example the
770  <a href="http://physics.nist.gov/cuu/Units/introduction.html">NIST reference on SI</a>) to which the unit applies.  <a href="http://physics.nist.gov/cuu/Units/introduction.html">NIST reference on SI</a>) to which the unit applies.
771  </p>  </p>
772  <p>  <p>
773  If this kind of link can be made, we could eventually attempt to impose a single unit to correspond to  If this kind of link can be made, we could eventually attempt to impose a single unit to correspond to
774  all properties sharing a given <a href="http://physics.nist.gov/cuu/Units/introduction.html">quantity in the general sense</a>.  all properties sharing a given <a href="http://physics.nist.gov/cuu/Units/introduction.html">quantity in the general sense</a>.
775  This may lead to very small or very large values, depending on the simulation, but at least allows simpler  This may lead to very small or very large values, depending on the simulation, but at least allows simpler
776  interfaces.  interfaces.
777  </p>  </p>
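<p>To make the unit problem concrete, the sketch below compares box sizes published in different units by first converting them to one canonical unit. The conversion table, the choice of the metre as canonical unit, and the sample records are assumptions made for illustration; the model does not prescribe any of them.</p>

```python
# Hypothetical conversion factors from published length units to metres.
TO_METRE = {"m": 1.0, "kpc": 3.0857e19, "Mpc": 3.0857e22}


def to_canonical(value, unit):
    """Convert a (value, unit) pair to the canonical unit (metre)."""
    return value * TO_METRE[unit]


# Box sizes as three publishers might report them: (name, value, unit).
published = [
    ("sim-A", 500.0, "Mpc"),
    ("sim-B", 250000.0, "kpc"),  # the same order of size, but given in kpc
    ("sim-C", 50.0, "Mpc"),
]

# "Box size of at least 100 Mpc" only makes sense after conversion.
threshold = to_canonical(100.0, "Mpc")
large = [name for name, value, unit in published
         if to_canonical(value, unit) >= threshold]
```

<p>A naive numerical comparison on the raw values would rank sim-B far above sim-A; after conversion both pass the threshold and sim-C does not. The canonical values are of order 10<sup>24</sup>, which illustrates the remark above about very large numbers.</p>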
778  <ul class="issue">  <ul class="issue">
779  <li>We need input from the rest of the IVOA on how to deal with this issue</li>  <li>We need input from the rest of the IVOA on how to deal with this issue</li>
780  </ul>  </ul>
<h4><a name="sec5_3_7"/>5.3.7 Services</h4>
<p>
The goal of the SimDB specification is to define a protocol for querying interesting simulations
and related SimDB/Resource-s.
Once these have been identified, the user should be able to access the simulations.
We assume that web services are the means to do so, and we allow publishers to indicate which
web services are available for a given Experiment. We assume for now that we know little about a
web service beyond some generic types: <i>download, cut-out, extraction, projection, custom</i>.
The SimDAP specification is being developed to address those aspects in detail.
We assume that there will be a base URL implementing some standard DAL-like (VOSI?) services,
and we leave it up to SimDB-client implementations to interact with these services in standard ways.
Only custom services can be directly accessed, and for now many services will necessarily be custom.
</p>
<ul class="issue">
<li>How do we get the complete list of service types? Should it be predefined (as an enumeration) in the model?
</li>
</ul>
<h2><a name="sec6"/>6 Physical models</h2>
<p>
Here we describe how we create <i>physical models</i> out of the logical model.
A <i>physical model</i> is (see <em class="todo">@@TODO reference to some standard reference on data modelling@@</em>)
a representation of the logical model that is adapted to a particular software environment.
We present physical representations for the following contexts:
<ul>
<li>XML: we present an <a href="http://www.w3.org/XML/Schema">XML schema</a> defining valid XML documents.</li>
<li>Relational databases: we derive a relational database schema for storing instances of the model.</li>
<li>Java: we present Java classes representing the data model in a JVM. These classes are annotated with
<a href="http://java.sun.com/javaee/technologies/persistence.jsp">Java Persistence API (JPA)</a> and
<a href="http://java.sun.com/developer/technicalArticles/WebServices/jaxb/">Java Architecture for XML Binding (JAXB)</a>
annotations to enable easy transformations to the XML and relational contexts.</li>
<li>TAP: we present a representation of the model that hopefully has some similarity to the way
TAP will mandate that metadata about ADQL-queryable databases be returned.</li>
<li>UTYPE: for the simple, i.e. non-structured, elements in the SimDB/DM we present serialisations that should
resemble UTYPE-s. These can be used when representing (parts of) the model in VOTable.</li>
<li>HTML: we present a representation of the model as a web browser (and human) readable HTML document.
This contains all details of the model in human readable form.</li>
</ul>
We have <em>completely automated</em> the derivation of these representations from the logical model using transformation
rules implemented in XSLT.
Our XSLT pipeline is described in more detail in <a href="#appB">Appendix B</a>.
The Java representation, through its JPA and JAXB
annotations, provides a simple means to store the contents of SimDB/XML documents in
a SimDB relational database and to retrieve them from there again.
</p>
<ul class="issue">
<li>Discuss adoption of this approach with the DM WG.</li>
<li>Ultimately we believe that defining these mappings is the realm of the DM WG,
which might come up with a kind of meta-specification.</li>
</ul>
<h3><a name="sec6_1"/>6.1 Identity and Referencing</h3>
<p>
The main elements in our data model are the object types; these embody the core concepts that we model.
We follow standard object-oriented design approaches
(see [<a href="#r_DMApproaches">10</a>]), in which object types are assumed to have an explicit <i>identity</i>.
Two objects (i.e. instances of an object type) can have the same values for all fields, but if their identity is
not the same they are not the same object. Objects can be referenced by stating their identity (in whatever form this comes).
In contrast, <i>value types</i> are assumed to be identical if their value (or values, in the case of structured
value types) is the same.
In our UML model we have not defined an explicit <i>identifier</i> attribute on each object type to represent its identity;
its existence is assumed. There are some identifier-like attributes, but those refer to an identity the object has in another context,
generally that of the publisher or creator of the object.
In most of the physical models, however, we need to be able to represent this object identity explicitly.
</p>
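<p>
The distinction between object types and value types can be illustrated with a minimal Python sketch.
The class names <tt>Quantity</tt> and <tt>Experiment</tt>, their fields and the counter used to assign identifiers
are assumptions made for this illustration only; they are not taken from the SimDB model.
</p>

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical generator of local identifiers (illustration only).
_next_id = count(1)


@dataclass(frozen=True)
class Quantity:
    """Value type: two instances are identical when their values match."""
    value: float
    unit: str


@dataclass(eq=False)
class Experiment:
    """Object type: identity is explicit and independent of the field values."""
    name: str
    box_size: Quantity
    id: int = field(default_factory=lambda: next(_next_id))


a = Experiment("run", Quantity(100.0, "Mpc"))
b = Experiment("run", Quantity(100.0, "Mpc"))

same_values = (a.name, a.box_size) == (b.name, b.box_size)  # all fields agree
same_object = a == b          # eq=False keeps identity-based comparison
same_quantity = a.box_size == b.box_size  # value types compare by value
```

<p>
Here <tt>a</tt> and <tt>b</tt> agree in every field yet remain distinct objects with distinct identifiers,
whereas the two <tt>Quantity</tt> values are interchangeable.
</p>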
<p>
Related to this issue is the need to represent <a href="#uml_reference">reference</a>
relations between different objects.
Most contexts provide a natural mapping for references. For example, relational databases have the concept of foreign keys,
XML documents allow references to entities in the same document using ID/IDREF and other mechanisms, and
Java (implicitly) uses pointers to objects in the same virtual machine.
Problems arise when we need to leave the local context: references to resources that are not in the current database,
or that are in another XML document.
</p>
<p>
It is easy to imagine cases where this may occur. For example, when registering a simulation run with the
open source Gadget [<a href="#r_Gadget">12</a>] simulation code, one needs a reference to the corresponding Gadget SimDB/Simulator.
Unless one registers the experiment in the same SimDB where Gadget is registered, one needs to use a reference
across SimDB-s. One obvious way is to map all references to globally unique identifiers,
possibly using URIs or IVOA Identifiers [<a href="#r_IVOAIdentifier">11</a>].
The size of such URI-s makes this a rather expensive storage mechanism for use in a relational database,
certainly compared to simple integer (or bigint) columns.
</p>
This issue is not yet resolved satisfactorily. The following possible approaches offer themselves and need discussion:
<ul class="issue">
<li>Never allow references to objects not in the same SimDB.
This may require mirroring of resources not currently in the SimDB.
Registries have experience with similar mechanisms, though likely for different reasons.</li>
<li>Use complete URIs for all references, and allow references to objects not in the same SimDB.
This makes it impossible to have foreign keys on these references, as the referenced object may not exist.
It may be relatively expensive, though the cost may be reduced with an extra level of indirection.
If referenced SimDBs are themselves registered in each SimDB, they are assigned a possibly smaller
local ID (an integer or bigint). A reference then need not require more than two IDs, possibly one if a standardised
mapping is used.</li>
<li>Use IDs adjusted to the specific context. For example, use full URIs in XML documents, but resolve these to
smaller representations inside the database.</li>
</ul>
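<p>
The extra level of indirection mentioned in the second and third approaches could look roughly like the following Python sketch.
It assumes, purely for illustration, that a global reference has the form <tt>base-URI#local-integer-ID</tt>;
this URI layout, the class name <tt>ReferenceResolver</tt> and the example URIs are hypothetical and not part of any specification.
</p>

```python
class ReferenceResolver:
    """Maps full reference URIs to compact (simdb_id, local_id) integer pairs.

    Assumption for this sketch: a reference looks like 'base-uri#local-id',
    where local-id is the integer ID of the object inside the remote SimDB.
    """

    def __init__(self):
        self._id_by_base = {}   # base URI of a registered SimDB -> small integer
        self._base_by_id = {}   # reverse lookup

    def register_simdb(self, base_uri):
        """Return the local integer ID of a SimDB, registering it if needed."""
        if base_uri not in self._id_by_base:
            new_id = len(self._id_by_base) + 1
            self._id_by_base[base_uri] = new_id
            self._base_by_id[new_id] = base_uri
        return self._id_by_base[base_uri]

    def to_compact(self, uri):
        """Resolve a full URI to the two integers stored in the database."""
        base, separator, local = uri.rpartition("#")
        if not separator or not local.isdigit():
            raise ValueError(f"unsupported reference format: {uri}")
        return self.register_simdb(base), int(local)

    def to_uri(self, simdb_id, local_id):
        """Rebuild the full URI, e.g. when serialising to an XML document."""
        return f"{self._base_by_id[simdb_id]}#{local_id}"


resolver = ReferenceResolver()
compact = resolver.to_compact("ivo://example.org/simdb#42")    # made-up URI
restored = resolver.to_uri(*compact)
```

<p>
With such a mapping the database stores two integers per reference, while XML documents can still carry the full URI.
Note that no foreign key can be declared on the second integer, since the referenced object lives in another SimDB.
</p>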
<em class="todo">@@ TODO this needs rewriting, too much stream of consciousness @@</em>

<h3><a name="sec6_2"/>6.2 RDBM Schema</h3>
The public schema, i.e. the view the outside world has of a SimDB, is a relational schema.
This will be formally defined using VOTables containing the appropriate TABLE definitions.
Our object-relational mapping prescription contains the following elements:
<ul>
<li>Object types are mapped to tables, one table per object type.</li>
<li>Inheritance hierarchies use the JOINED strategy as defined in JPA, i.e. each table only has columns for the attributes and references defined on the corresponding type,
plus an ID column that is both the PK and a FK to the ID of the base class's table, and possibly a container column (see below).</li>
<li>Primary key column: <tt>ID NUMERIC(18)</tt></li>
<li>Foreign key to container: <tt>containerId</tt><br/>plus foreign key and index declaration.</li>
<li>References: &lt;referenceName&gt;Id<br/>plus foreign key and index declaration.</li>
<li>Using a topological sort of the object types based on their (extends|container|reference) relations, we generate
the create table statements and their indexes and foreign keys in blocks. The drop table statements are generated in the opposite order.</li>
<li>For each class we create a view named "v_&lt;class name&gt;".<br/>It returns all columns for that class, using a join to the base class's view.</li>
<li>We generate a discriminator column on the table for the root of an inheritance hierarchy; it stores the name of the class (which must be unique in the inheritance hierarchy!).</li>
<li>Attributes are mapped to a single column if their type is simple (i.e. primitive, or an enumeration).</li>
<li>If an attribute's type is a dataType, it is mapped to as many columns as the dataType has attributes,
with column names equal to the names of the dataType's attributes, prefixed by &lt;attribute-name&gt;_</li>
<li>For PK columns we use the</li>
</ul>
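<p>
Several of the rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our XSLT implementation:
the miniature <tt>MODEL</tt> dictionary, its type and attribute names, the SQL column types and the discriminator column name <tt>DTYPE</tt>
are all hypothetical, and foreign keys are declared inline rather than in separate blocks. The ordering uses <tt>graphlib.TopologicalSorter</tt> from the Python standard library (Python 3.9 or later).
</p>

```python
from graphlib import TopologicalSorter

# Hypothetical miniature model: each object type lists its base type,
# its container, its references and its simple attributes.
MODEL = {
    "Resource": {"extends": None, "container": None, "references": {},
                 "attributes": {"name": "VARCHAR(255)"}},
    "Protocol": {"extends": "Resource", "container": None, "references": {},
                 "attributes": {"code": "VARCHAR(255)"}},
    "Experiment": {"extends": "Resource", "container": None,
                   "references": {"protocol": "Protocol"}, "attributes": {}},
    "Snapshot": {"extends": None, "container": "Experiment", "references": {},
                 "attributes": {"redshift": "FLOAT"}},
}


def creation_order(model):
    """Sort types so that every table is created after the tables it points to."""
    graph = {}
    for name, spec in model.items():
        deps = {spec["extends"], spec["container"], *spec["references"].values()}
        graph[name] = deps - {None, name}   # drop empty slots and self-references
    return list(TopologicalSorter(graph).static_order())


def create_table(name, spec):
    """Emit a CREATE TABLE statement following the JOINED inheritance strategy."""
    columns = ["ID NUMERIC(18) NOT NULL PRIMARY KEY"]
    keys = []
    if spec["extends"] is None:
        # Root of a hierarchy: discriminator column holding the class name.
        columns.append("DTYPE VARCHAR(32) NOT NULL")
    else:
        # Subclass table: its ID is also a foreign key to the base table.
        keys.append(f"FOREIGN KEY (ID) REFERENCES {spec['extends']}(ID)")
    if spec["container"]:
        columns.append("containerId NUMERIC(18) NOT NULL")
        keys.append(f"FOREIGN KEY (containerId) REFERENCES {spec['container']}(ID)")
    for ref, target in spec["references"].items():
        columns.append(f"{ref}Id NUMERIC(18)")
        keys.append(f"FOREIGN KEY ({ref}Id) REFERENCES {target}(ID)")
    columns += [f"{attr} {sqltype}" for attr, sqltype in spec["attributes"].items()]
    return f"CREATE TABLE {name} (\n  " + ",\n  ".join(columns + keys) + "\n);"


ORDER = creation_order(MODEL)
CREATE_STATEMENTS = [create_table(name, MODEL[name]) for name in ORDER]
DROP_STATEMENTS = [f"DROP TABLE {name};" for name in reversed(ORDER)]
```

<p>
Printing <tt>CREATE_STATEMENTS</tt> followed by <tt>DROP_STATEMENTS</tt> gives a script in which no statement refers to a table that does not yet, or no longer, exist.
</p>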
<em class="todo">@@ TODO add links to actual generated schemas @@</em>

<h3><a name="sec6_3"/>6.3 XML Schema</h3>
<p>
The DM WG has mandated (IVOA interoperability meeting, Cambridge, UK, May 2003) that each data model should come with
an XML schema that represents valid XML serialisations of the data model.
We foresee that this representation can be used to communicate instances of SimDB/Resource-s as XML documents.
Such communication can serve to register new SimDB/Resources in a SimDB, or
can be used in messages to communicate instances of the SimDB Resource type.
Here we briefly describe some of the rules for deriving an XML schema from our logical model.
</p>
<ul>
<li>Object and data types are mapped to complexType-s. Object types inherit from a base class that defines
features dealing with identity.</li>
<li>primitiveType-s are mapped to appropriate simpleType-s.</li>
<li>Enumerations are mapped to simpleType-s which are a restriction of xsd:string and have
an enumeration element for each literal.</li>
<li>Packages are mapped to namespaces and each package has its own file, with dependencies translated
to schema imports.</li>
<li>A root element is generated for each concrete (i.e. non-abstract) root (i.e. not contained in other types)
object type.</li>
<li>Attributes are mapped to elements (<i>not</i> attributes!) of the appropriate type.</li>
<li>Collections are mapped to elements of the appropriate type, contained within the complexType of the containing type.</li>
<li>References are mapped to elements of a special purpose base complexType, Reference.
The precise definition of this type is postponed until the issues about identity and referencing are resolved.
For now it has multiple sub-elements reflecting the different possible ways to refer to other elements.</li>
</ul>
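<p>
As an illustration of one of these rules, the mapping of an enumeration to a simpleType restricting xsd:string can be sketched with
Python's standard <tt>xml.etree.ElementTree</tt> module. The function name and the type name <tt>ServiceType</tt> are assumptions made for this example;
the literals are the generic service types listed in section 5.3.7. Our actual derivation is done in XSLT.
</p>

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xsd", XSD)  # serialise with the conventional xsd prefix


def enumeration_to_simple_type(name, literals):
    """Build an xsd:simpleType that restricts xsd:string to the given literals."""
    simple_type = ET.Element(f"{{{XSD}}}simpleType", {"name": name})
    restriction = ET.SubElement(
        simple_type, f"{{{XSD}}}restriction", {"base": "xsd:string"}
    )
    for literal in literals:
        # One enumeration facet per literal of the UML enumeration.
        ET.SubElement(restriction, f"{{{XSD}}}enumeration", {"value": literal})
    return simple_type


# Hypothetical enumeration name; literals taken from the Services section.
service_type = enumeration_to_simple_type(
    "ServiceType",
    ["download", "cut-out", "extraction", "projection", "custom"],
)
schema_fragment = ET.tostring(service_type, encoding="unicode")
```

<p>
The resulting <tt>schema_fragment</tt> is a string holding the simpleType definition, ready to be placed in the schema file of the package that owns the enumeration.
</p>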
<em class="todo">@@ TODO add links to actual generated schemas @@</em>


<h3><a name="sec6_4"/>6.4 UTYPE-s</h3>
<p>
It is generally the case that the contents of databases may be represented in ways that do not
conform to one of the standard serialisations. Nothing prevents services from being developed on
top of SimDB that represent SimDB/Resource-s, or even fragments of these, in another form.
The standard example would be VOTables storing the results of a generic ADQL query on the SimDB/RDB representation.
VOTable first introduced the option of a UTYPE attribute in FIELD definition tags, which stores
a pointer to the element of a data model that the column represents.
</p>
<p>
The <a href="#r_SpectrumDatamodel">Spectrum data model</a> was the first to add explicit
UTYPE-s for each of the attributes in its model and the <a href="#r_CharacterisationDM">Characterisation data model</a>
has followed that example. As long as the precise usage of UTYPE-s, and the relation of their syntax to the underlying data model,
is not defined, we will follow these examples by assigning UTYPE-s explicitly to all elements in the model.
However, we will follow a fixed set of rules to make this assignment and implement these in XSLT.
If a similar approach is at some time accepted within the IVOA, possibly in an alternative form, it will be straightforward
to adjust our definitions. The important point we want to make is that it is possible to define simple rules that will then
automatically produce the UTYPE-s for a given data model, i.e. the only discussion that is required is on the rules for doing so.
</p>
<p>
Our assumption is that a UTYPE should be able to uniquely represent any element in the data model, in a manner
that is also easily interpreted. For now we assume that we need to point to those elements
that can be stored in a column in a VOTable, i.e. for now we are looking for "simple" elements.
We can use our relational mapping to identify all these features. They are:
<ul>
<li>attributes (paying attention to attributes with non-simple data types),</li>
<li>references (an identifier identifying the referenced object), and</li>
<li>collections (through a pointer to the containing, parent object).</li>
</ul>
VOTable also allows arrays to be stored in single columns, so a collection can be stored as an array of identifiers of
child objects. There are some other features that are not explicitly modelled, but are implied.
Examples are the identifier (ID) assigned to all objects and the name of the object type of an object.
</p>
<p>
Of course we could give each of the elements a uniquely generated identifier, but we assume that UTYPE-s should hold
semantic information; otherwise we could use the XMI-ids generated by the UML modelling tool.
To identify any of these elements uniquely within the context of the IVOA,
we then need the following components:
<ul>
<li>name of the element (possibly a path expression for structured attributes leading to a "leaf attribute")</li>
<li>name of the containing object type</li>
<li>a path expression for the package(s) containing the object type</li>
<li>unique identifier of the model, possibly its name if that is to be unique in the IVOA DM efforts</li>
<li>some indication of the context, unless this can be implicit.</li>
</ul>
NB this assumes that we do not have a uniqueness rule on the names of object types within a model, something we do actually
assume in the mapping of SimDB/RDB above. With such a rule we could leave out the package path.
</p>
976  <p>  <p>
977  One could argue one could also give nice, unique names to each of the elements, but to find out what the actual element in  One could argue one could also give nice, unique names to each of the elements, but to find out what the actual element in
978  the model and in other representations one would still need to perform a look up. Such a uniqe name would likely include some of  the model and in other representations one would still need to perform a look up. Such a uniqe name would likely include some of
979  the elements above anyhow. So we believe it would be a waste of efforts to do so and instead propose a simple convention  the elements above anyhow. So we believe it would be a waste of efforts to do so and instead propose a simple convention
980  for deriving the UTYPE-s form the model based on this hiherarchy.  for deriving the UTYPE-s form the model based on this hiherarchy.
981  We have done so using these rules (in BNF-like notation)  We have done so using these rules (in BNF-like notation)
982  <dl>  <dl>
983  <dt>attribute</dt>  <dt>attribute</dt>
984  <dd>  <dd>
985  <pre>  <pre>
986  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." &lt;attribute-name&gt; [ "." &lt;attribute-name&gt;]*  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." &lt;attribute-name&gt; [ "." &lt;attribute-name&gt;]*
987  </pre>  </pre>
988  </dd>  </dd>
989  <dt>reference</dt>  <dt>reference</dt>
990  <dd>  <dd>
991  <pre>  <pre>
992  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." &lt;reference-name&gt;  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." &lt;reference-name&gt;
993  </pre>  </pre>
994  </dd>  </dd>
995  <dt>collection (as array of p0inters to child objects)</dt>  <dt>collection (as array of p0inters to child objects)</dt>
996  <dd>  <dd>
997  <pre>  <pre>
998  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." &lt;collection-name&gt;  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." &lt;collection-name&gt;
999  </pre>  </pre>
1000  </dd>  </dd>
1001  <dt>container</dt>  <dt>container</dt>
1002  <dd>  <dd>
1003  <pre>  <pre>
1004  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." "CONTAINER";  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." "CONTAINER";
1005  </pre>  </pre>
1006  </dd>  </dd>
1007  <dt>ID</dt>  <dt>ID</dt>
1008  <dd>  <dd>
1009  <pre>  <pre>
1010  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." "ID";  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." "ID";
1011  </pre>  </pre>
1012  </dd>  </dd>
1013  <dt>object type name</dt>  <dt>object type name</dt>
1014  <dd>  <dd>
1015  <pre>  <pre>
1016  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." "DTYPE";  &lt;model-name&gt; ":" &lt;package-name&gt;[ "/" &lt;package-name&gt;]* "/" &lt;objecttype-name&gt; "." "DTYPE";
1017  </pre>  </pre>
1018  </dd>  </dd>
1019    
1020  </dl>  </dl>
1021  The HTML documentation generated from the logical model contains UTYPE-s for these features, generated according to these rules.  The HTML documentation generated from the logical model contains UTYPE-s for these features, generated according to these rules.
1022  It will be obvious how to accommodate changes in the precise UTYPE specification, <em>as long as similar rules are upheld</em>.  It will be obvious how to accommodate changes in the precise UTYPE specification, <em>as long as similar rules are upheld</em>.
1023  </p>  </p>
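The rules above can be sketched as a small string builder. This is only an illustration, not the generator used for the HTML documentation: the function name <code>utype</code>, the model name "SimDB" and the package and object type names in the usage note are hypothetical, and we assume that the trailing ";" in some of the rules is a terminator and not part of the UTYPE itself.

```python
from typing import Sequence

# Reserved feature names taken from the rules above.
CONTAINER = "CONTAINER"
ID = "ID"
DTYPE = "DTYPE"


def utype(model: str, packages: Sequence[str], object_type: str, feature: str) -> str:
    """Build <model-name> ":" <package-name>["/" <package-name>]* "/" <objecttype-name> "." <feature>.

    `feature` is either a collection name or one of CONTAINER, ID, DTYPE.
    """
    if not packages:
        # The grammar requires at least one package name.
        raise ValueError("at least one package name is required")
    package_path = "/".join(packages)
    return f"{model}:{package_path}/{object_type}.{feature}"
```

For example, <code>utype("SimDB", ["resource", "experiment"], "Simulation", ID)</code> yields <code>SimDB:resource/experiment/Simulation.ID</code>. A change in the precise UTYPE specification would only require adjusting the separators in this one function.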
1024  <em class="todo">@@ TODO add links to actual generated schemas @@</em>  <em class="todo">@@ TODO add links to actual generated schemas @@</em>
1025    
1026  <h3><a name="sec6_5"/>6.5 Java/JPA+JAXB (non normative)</h3>  <h3><a name="sec6_5"/>6.5 Java/JPA+JAXB (non normative)</h3>
1027    
1028  <h2><a name="sec7"/>7 Query Protocols</h2>  <h2><a name="sec7"/>7 Query Protocols</h2>
1029  <p>  <p>
1030  The previous chapter has defined a number of physical representations of the logical simulation data model.  The previous chapter has defined a number of physical representations of the logical simulation data model.
1031  Using these we can implement a database that can store instances of SimDB/Resources.  Using these we can implement a database that can store instances of SimDB/Resources.
1032  This could be done using an XML database, or using a relational database management system such as  This could be done using an XML database, or using a relational database management system such as
1033  Postgres, MySQL or any of the commercial versions. The data model is rather complex,  Postgres, MySQL or any of the commercial versions. The data model is rather complex,
1034  and more hierarchical than most other data models so far defined in the IVOA.  and more hierarchical than most other data models so far defined in the IVOA.
1035  Querying such a data model requires a rich query language and we propose to use  Querying such a data model requires a rich query language and we propose to use
1036  ADQL working on the relational representation. ADQL produces tabular results, whose structure is completely  ADQL working on the relational representation. ADQL produces tabular results, whose structure is completely
1037  governed by the query itself. We also assume it possible, once appropriate information is available, to  governed by the query itself. We also assume it possible, once appropriate information is available, to
1038  retrieve complete SimDB/Resource-s as XML documents and propose a simple REST-like query interface for that.  retrieve complete SimDB/Resource-s as XML documents and propose a simple REST-like query interface for that.
1039  Such an XML based interface will likely also be used to upload new resources to SimDB implementations that support  Such an XML based interface will likely also be used to upload new resources to SimDB implementations that support
1040  that functionality.  that functionality.
1041  </p>  </p>
1042  <h3><a name="sec7_1"/>7.1 ADQL + TAP</h3>  <h3><a name="sec7_1"/>7.1 ADQL + TAP</h3>
1043  <p>  <p>
1044  We expect no problems in formulating ADQL queries based on the relational representation of the data model  We expect no problems in formulating ADQL queries based on the relational representation of the data model
1045  described in the previous chapter. We need to require an appropriate protocol for sending these queries to  described in the previous chapter. We need to require an appropriate protocol for sending these queries to
1046  a SimDB service though. In DAL work has started on the Table Access Protocol (TAP) and clearly some version  a SimDB service though. In DAL work has started on the Table Access Protocol (TAP) and clearly some version
1047  of that seems to be applicable to our situation. However there are some simplifying features.  of that seems to be applicable to our situation. However there are some simplifying features.
1048  Foremost is that we pre-define the relational schema, so a generic TAP "getMetadata" service seems not necessary.  Foremost is that we pre-define the relational schema, so a generic TAP "getMetadata" service seems not necessary.
1049  There are likely going to be other standard DAL service features that we need to support (getCapabilities?),  There are likely going to be other standard DAL service features that we need to support (getCapabilities?),
1050  but as meta data databases are expected to be relatively small we may again not require the full richness of  but as meta data databases are expected to be relatively small we may again not require the full richness of
1051  asynchronous querying, staging, VOSpace and what not.  asynchronous querying, staging, VOSpace and what not.
1052  </p>  </p>
1053  Issues that need discussion:  Issues that need discussion:
1054  <ul class="issue">  <ul class="issue">
1055  <li>(How) does TAP deal with units?</li>    <li>(How) does TAP deal with units?</li>  
1056  <li>In TAP, does a table column containing values always have a single UCD and a single Unit?</li>  <li>In TAP, does a table column containing values always have a single UCD and a single Unit?</li>
1057  <li>Is TAP suited for this kind of meta data databases?</li>  <li>Is TAP suited for this kind of meta data databases?</li>
1058  </ul>  </ul>
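As a minimal sketch of what sending an ADQL query to a SimDB service could look like, the following builds the URL for a synchronous query. The parameter names REQUEST, LANG and QUERY and the "sync" endpoint follow the TAP 1.0 Recommendation, which was published after this note was written, so their use here is an assumption. The base URL and the table and column names in the usage note are hypothetical.

```python
from urllib.parse import urlencode


def build_sync_query_url(base_url: str, adql: str) -> str:
    """Return the URL of a synchronous TAP-style query carrying the given ADQL string."""
    params = {
        "REQUEST": "doQuery",  # assumed: TAP 1.0 request type
        "LANG": "ADQL",        # query language
        "QUERY": adql,         # the ADQL text, URL-encoded by urlencode
    }
    return f"{base_url.rstrip('/')}/sync?{urlencode(params)}"
```

A call such as <code>build_sync_query_url("http://example.org/simdb/tap", "SELECT TOP 10 name FROM Simulation")</code> produces a single GET URL. Because the relational schema is pre-defined, a client can write such queries without first asking the service for table metadata.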
1059    
1060  <h3><a name="sec7_2"/>7.2 REST</h3>  <h3><a name="sec7_2"/>7.2 REST</h3>
1061  <p>  <p>
1062  Under this heading we mean a protocol whereby data products can be retrieved through  Under this heading we mean a protocol whereby data products can be retrieved through
1063  HTTP GET requests. Possibly also they can be POST-ed, or PUT.  HTTP GET requests. Possibly also they can be POST-ed, or PUT.
1064  This needs to be discussed further, but maybe can be punted until a future release.  This needs to be discussed further, but maybe can be punted until a future release.
1065  The GET will always only be able to get a complete SimDB resource, serialised to SimDB/XML,  The GET will always only be able to get a complete SimDB resource, serialised to SimDB/XML,
1066  similar to the IVOA Resource Registry interface <em class="todo">@@ TODO is this actually a correct statement?@@</em>.  similar to the IVOA Resource Registry interface <em class="todo">@@ TODO is this actually a correct statement?@@</em>.
1067  </p>  </p>
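A minimal sketch of such a GET request, under stated assumptions: the URL layout (a "resources" path followed by an identifier), the identifier itself and the "application/xml" media type are hypothetical, since the note does not yet fix any of them. The request object is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def resource_request(base_url: str, resource_id: str) -> Request:
    """Build a GET request for one complete SimDB resource serialised as SimDB/XML."""
    # Percent-encode the identifier so it is safe as a single path segment.
    url = f"{base_url.rstrip('/')}/resources/{quote(resource_id, safe='')}"
    return Request(url, headers={"Accept": "application/xml"}, method="GET")
```

Passing the returned object to <code>urllib.request.urlopen</code> would perform the retrieval. Uploading through POST or PUT could reuse the same URL layout with a different <code>method</code> and a SimDB/XML document as the request body.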
1068    
1069    
1070    
1071  <h2><a name="sec8"/>8 Next Steps</h2>  <h2><a name="sec8"/>8 Next Steps</h2>
1072  <h3><a name="sec8_1"/>8.1 Reference implementations</h3>  <h3><a name="sec8_1"/>8.1 Reference implementations</h3>
1073  <h4><a name="sec8_1_1"/>8.1.1 France</h4>  <h4><a name="sec8_1_1"/>8.1.1 France</h4>
1074  <em class="todo">@@ TODO Laurent @@</em>  <em class="todo">@@ TODO Laurent @@</em>
1075  <h4><a name="sec8_1_2"/>8.1.2 Germany</h4>  <h4><a name="sec8_1_2"/>8.1.2 Germany</h4>
1076  <em class="todo">@@ TODO Gerard @@</em>  <em class="todo">@@ TODO Gerard @@</em>
1077  <h4><a name="sec8_1_3"/>8.1.3 Italy</h4>  <h4><a name="sec8_1_3"/>8.1.3 Italy</h4>
1078  <em class="todo">@@ TODO Patrizia @@</em>  <em class="todo">@@ TODO Patrizia @@</em>
1079  <h4><a name="sec8_1_4"/>8.1.4 USA</h4>  <h4><a name="sec8_1_4"/>8.1.4 USA</h4>
1080  <em class="todo">@@ TODO Rick @@</em>  <em class="todo">@@ TODO Rick @@</em>
1081    
1082  <h3><a name="sec8_2"/>8.2 Generating SimDB/XML documents from simulation pipe lines</h3>  <h3><a name="sec8_2"/>8.2 Generating SimDB/XML documents from simulation pipe lines</h3>
1083  <p>  <p>
1084  Assigning meta-data to describe simulations etc is quite a lot of work if this is to be done after the fact.  Assigning meta-data to describe simulations etc is quite a lot of work if this is to be done after the fact.
1085  It seems more fruitful to see if simulation codes could make the production of the appropriate  It seems more fruitful to see if simulation codes could make the production of the appropriate
1086  documents part of their pipe-line. It is our goal to contact the writers of some of the major  documents part of their pipe-line. It is our goal to contact the writers of some of the major
1087  simulation packages and see whether they are willing to do so.  simulation packages and see whether they are willing to do so.
1088  First contacts with Volker Springel (Gadget) and the group in San Diego (Enzo) give us hope that this could  First contacts with Volker Springel (Gadget) and the group in San Diego (Enzo) give us hope that this could
1089  be achieved. The TIG should see it as its task to contact more authors of such codes and promote this idea further.  be achieved. The TIG should see it as its task to contact more authors of such codes and promote this idea further.
1090  </p>  </p>
1091  <h3><a name="sec8_3"/>8.3 SimDAP services</h3>  <h3><a name="sec8_3"/>8.3 SimDAP services</h3>
1092  <p>  <p>
1093  Together with SimDB implementations we need to urge scientists to develop online services for accessing  Together with SimDB implementations we need to urge scientists to develop online services for accessing
1094  their published simulations. Until the SimDAP specification is further developed these can be custom services,  their published simulations. Until the SimDAP specification is further developed these can be custom services,
1095  but it is important that services are available asap. This outreach is a task for the TIG.  but it is important that services are available asap. This outreach is a task for the TIG.
1096  </p>  </p>
1097  <h2><a name="appA"/>Appendix A: Data modelling specifics</h2>  <h2><a name="appA"/>Appendix A: Data modelling specifics</h2>
1098  Here we describe various aspects of UML modelling as we applied it to the current  Here we describe various aspects of UML modelling as we applied it to the current
1099  problem area.  problem area.
1100  <p>  <p>
1101  UML allows communities to create a domain specific modelling language through its Profiling capabilities  UML allows communities to create a domain specific modelling language through its Profiling capabilities
1102  <em class="todo">@@ TODO is this the proper term ?@@</em>.  <em class="todo">@@ TODO is this the proper term ?@@</em>.
1103    
1104  We have an initial implementation of a UML profile as created by MagicDraw available under  We have an initial implementation of a UML profile as created by MagicDraw available under
1105  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/IVOA%20UML%20Profile%20v-2.xml">this link</a>.  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/IVOA%20UML%20Profile%20v-2.xml">this link</a>.
1106  Here we list the main elements and give a short motivation for their inclusion in the model.  Here we list the main elements and give a short motivation for their inclusion in the model.
1107  It is our opinion that the DM working group should be ultimately responsible for a profile such as this,  It is our opinion that the DM working group should be ultimately responsible for a profile such as this,
1108  defining a domain specific language for all IVOA data modelling efforts.  defining a domain specific language for all IVOA data modelling efforts.
1109  </p>  </p>
1110  <p>  <p>
1111  As a first step in our simulation pipeline we generate an XML document that represents the data model in a form  As a first step in our simulation pipeline we generate an XML document that represents the data model in a form
1112  that is more easily interpreted, both by human readers and by XSLT scripts, than the XMI representation.  that is more easily interpreted, both by human readers and by XSLT scripts, than the XMI representation.
1113  This document itself is structured according to an XML schema that  This document itself is structured according to an XML schema that
1114  represents the UML profile rather directly and that we here shortly describe.  represents the UML profile rather directly and that we here shortly describe.
1115  </p>  </p>
1116  This schema is located in  This schema is located in
1117  <a href="http://volute.googlecode.com//svn/trunk/projects/theory/snapdm/input/intermediateModel.xsd">  <a href="http://volute.googlecode.com//svn/trunk/projects/theory/snapdm/input/intermediateModel.xsd">
1118  http://volute.googlecode.com//svn/trunk/projects/theory/snapdm/input/intermediateModel.xsd</a>.  http://volute.googlecode.com//svn/trunk/projects/theory/snapdm/input/intermediateModel.xsd</a>.
1119    
1120    
1121  We introduce our own XML format, defined by the XML schema in  We introduce our own XML format, defined by the XML schema in
1122  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/res/intermediateModel.xsd">intermediateModel.xsd</a>,  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/res/intermediateModel.xsd">intermediateModel.xsd</a>,
1123  for representing the logical model. For the time being we call this the <i>intermediate representation</i>.  for representing the logical model. For the time being we call this the <i>intermediate representation</i>.
1124  The first step in the generation pipeline is a translation of the XMI to an XML document following this format.  The first step in the generation pipeline is a translation of the XMI to an XML document following this format.
1125  This transformation is implemented in the    This transformation is implemented in the  
1126  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/res/xmi2intermediate.xsl">xmi2intermediate.xsl</a>  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/res/xmi2intermediate.xsl">xmi2intermediate.xsl</a>
1127  XSLT script. The latest version of the intermediate representation for the SimDB data model can be found in  XSLT script. The latest version of the intermediate representation for the SimDB data model can be found in
1128  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/output/SNAP_Simulation_DM_INTERMEDIATE.xml">this location</a>.  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/output/SNAP_Simulation_DM_INTERMEDIATE.xml">this location</a>.
1129  All other generation scripts work on this intermediate representation, not on the XMI document.  All other generation scripts work on this intermediate representation, not on the XMI document.
1130  Variations in tool-generated XMI, or different versions of XMI can now be supported by an appropriately adjusted  Variations in tool-generated XMI, or different versions of XMI can now be supported by an appropriately adjusted
1131  XSLT script.  XSLT script.
1132  One reason why this may be useful is that different tools may produce different versions or different  One reason why this may be useful is that different tools may produce different versions or different
1133  dialects of XMI. Another reason for this representation is that XMI is a rather complex representation of a UML  dialects of XMI. Another reason for this representation is that XMI is a rather complex representation of a UML
1134  model. Since we are using a rather restricted <a href="#profile">profile</a> we do not need this generality, and  model. Since we are using a rather restricted <a href="#profile">profile</a> we do not need this generality, and
1135  this allows us to represent the model using XML documents that are much easier to handle with XSLT.  this allows us to represent the model using XML documents that are much easier to handle with XSLT.
1136    
1137    
1138  <p>  <p>
1139  We illustrate our UML profile using an example data model  We illustrate our UML profile using an example data model
1140  derived from the SimDB/DM, shown in the following diagram:<br/>  derived from the SimDB/DM, shown in the following diagram:<br/>
1141  <img src="img/example.jpg"/>  <img src="img/example.jpg"/>
1142  <br/>  <br/>
1143  We now describe the individual elements.  We now describe the individual elements.
1144  Some of these are standard, some of these are domain specific extensions following  Some of these are standard, some of these are domain specific extensions following
1145  standard UML profile <i>stereotype</i> extension elements and associated <i>tag definition</i>.  standard UML profile <i>stereotype</i> extension elements and associated <i>tag definition</i>.
1146    
1147  <dl>  <dl>
1148    <dt><a name="uml_model"/>Model (no visual counterpart)</dt>    <dt><a name="uml_model"/>Model (no visual counterpart)</dt>
1149    <dd>    <dd>
1150    <ul>    <ul>
1151    <li> &lt;&lt;model&gt;&gt; </li>    <li> &lt;&lt;model&gt;&gt; </li>
1152    <ul>    <ul>
1153    <li>TagDefinition: author</li>    <li>TagDefinition: author</li>
1154    <li>TagDefinition: title</li>    <li>TagDefinition: title</li>
1155    </ul>    </ul>
1156    </ul>    </ul>
1157    </dd>    </dd>
1158        
1159    <dt> Package <br/><img src="img/package.jpg" /></dt>    <dt> Package <br/><img src="img/package.jpg" /></dt>
1160    <dd>    <dd>
1161    <ul>    <ul>
1162    <li>package containment</li>    <li>package containment</li>
1163    <li>package dependency</li>    <li>package dependency</li>
1164    </ul>    </ul>
1165    </dd>    </dd>
1166    <dt> Class <br/><img src="img/class.jpg" /></dt>    <dt> Class <br/><img src="img/class.jpg" /></dt>
1167    <dd>    <dd>
1168    <ul>    <ul>
1169    <li>isAbstract<br/>    <li>isAbstract<br/>
1170    Indicated by <i>italicised</i> name of the object. Implies that no instances can be made of the class,    Indicated by <i>italicised</i> name of the object. Implies that no instances can be made of the class,
1171    one needs sub classes for that.    one needs sub classes for that.
1172    </li>    </li>
1173    </ul>    </ul>
1174    </dd>    </dd>
1175    <dt> DataType <br/><img src="img/datatype.jpg" /></dt>    <dt> DataType <br/><img src="img/datatype.jpg" /></dt>
1176    <dd></dd>    <dd></dd>
1177    <dt> Enumeration <br/><img src="img/enumeration.jpg" /></dt>    <dt> Enumeration <br/><img src="img/enumeration.jpg" /></dt>
1178    <dd></dd>    <dd></dd>
1179    <dt> Property: attribute<br/><img src="img/attribute.jpg" /></dt>    <dt> Property: attribute<br/><img src="img/attribute.jpg" /></dt>
1180    <dd>    <dd>
1181    <ul><li>&lt;&lt;attribute&gt;&gt; </li>    <ul><li>&lt;&lt;attribute&gt;&gt; </li>
1182    <ul>    <ul>
1183    <li>TagDefinition: minLength<br/>    <li>TagDefinition: minLength<br/>
1184    </li>    </li>
1185    <li>TagDefinition: maxLength<br/>    <li>TagDefinition: maxLength<br/>
1186    </li>    </li>
1187    </ul>    </ul>
1188    <li> &lt;&lt;ontologyterm&gt;&gt; <br/>    <li> &lt;&lt;ontologyterm&gt;&gt; <br/>
1189    There are many instances in the data model where we need to describe elements of the    There are many instances in the data model where we need to describe elements of the
1190  SimDB/Resource-s explicitly, because we do not have implicit information based on the context.  SimDB/Resource-s explicitly, because we do not have implicit information based on the context.
1191  Examples are the various properties of object types, the target objects and processes etc.  Examples are the various properties of object types, the target objects and processes etc.
1192  Apart from a name and a description we then frequently add  Apart from a name and a description we then frequently add
1193  an attribute which is supposed to "label" the element according to an assumed standard list of terms.  an attribute which is supposed to "label" the element according to an assumed standard list of terms.
1194  We model this using the <pre>&lt;&lt;ontologyterm&gt;&gt;</pre> stereotype. Attributes with this stereotype  We model this using the <pre>&lt;&lt;ontologyterm&gt;&gt;</pre> stereotype. Attributes with this stereotype
1195  are assumed to take their values from such a predefined "ontology".  are assumed to take their values from such a predefined "ontology".
1196    </li>    </li>
1197    <ul>    <ul>
1198    <li>TagDefinition: ontologyURI<br/>    <li>TagDefinition: ontologyURI<br/>
1199    A URL locating a standard (RDF|SKOS|OWL|???) document containing    A URL locating a standard (RDF|SKOS|OWL|???) document containing
1200    a list of terms from which the value for this attribute may be obtained.    a list of terms from which the value for this attribute may be obtained.
1201    It is our opinion that the Semantics working group should be responsible for the    It is our opinion that the Semantics working group should be responsible for the
1202    definition of relevant ontologies (or semantic vocabularies, or thesauri, or ...)    definition of relevant ontologies (or semantic vocabularies, or thesauri, or ...)
1203    required for a given application domain, though the contents should be decided in    required for a given application domain, though the contents should be decided in
1204    cooperation with domain experts.    cooperation with domain experts.
1205    </li>    </li>
1206    </ul>    </ul>
1207    </ul>    </ul>
1208    </dd>    </dd>
1209    <dt>Inheritance    <dt>Inheritance
1210    <br/><img src="img/inheritance.jpg" /></dt>    <br/><img src="img/inheritance.jpg" /></dt>
1211    <dd>    <dd>
1212    Indicates the typical <i>is a</i> relation between the sub-class and its base-class (the one pointed at).    Indicates the typical <i>is a</i> relation between the sub-class and its base-class (the one pointed at).
1213    In this profile we do not support multiple inheritance. <em class="todo">@@ TODO explain? @@</em>.    In this profile we do not support multiple inheritance. <em class="todo">@@ TODO explain? @@</em>.
1214    </dd>    </dd>
1215    <dt>Binary association end: collection    <dt>Binary association end: collection
1216    <br/><img src="img/collection.jpg" /></dt>    <br/><img src="img/collection.jpg" /></dt>
1217    <dd>    <dd>
1218    This relation indicates a <i>composition</i> relation between one parent object and 0 or more child objects.    This relation indicates a <i>composition</i> relation between one parent object and 0 or more child objects.
1219    The life cycles of the child objects are governed by that of the parent.    The life cycles of the child objects are governed by that of the parent.
1220    </dd>    </dd>
1221    <dt><a name="uml_reference"/>Binary association end: reference    <dt><a name="uml_reference"/>Binary association end: reference
1222    <br/><img src="img/reference.jpg" /></dt>    <br/><img src="img/reference.jpg" /></dt>
1223    <dd>    <dd>
1224    This is a relation that indicates a kind of <i>usage</i>, or <i>dependency</i> of one object on another.    This is a relation that indicates a kind of <i>usage</i>, or <i>dependency</i> of one object on another.
1225    It is in general shared, i.e. many objects may reference a single other object. Accordingly the referenced    It is in general shared, i.e. many objects may reference a single other object. Accordingly the referenced
1226    object is independent of the "referrer". In our model the cardinality cannot be &gt; 1.    object is independent of the "referrer". In our model the cardinality cannot be &gt; 1.
1227    </dd>    </dd>
1228    <dt>Binary association end: subsets    <dt>Binary association end: subsets
1229    <br/><img src="img/subsets.jpg" /></dt>    <br/><img src="img/subsets.jpg" /></dt>
1230    <dd>    <dd>
1231    This indicates that a relation overrides a relation defined on a base class.    This indicates that a relation overrides a relation defined on a base class.
1232    It does so by specifying that the class at the end point of the relation should be a subclass of the    It does so by specifying that the class at the end point of the relation should be a subclass of the
1233    class at the endpoint of the original, subsetted relation.    class at the endpoint of the original, subsetted relation.
1234    </dd>    </dd>
1235        </dl>
1236    </p>
1237  </p>  
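The difference between inheritance, collection and reference can be sketched in code. This is an illustrative mapping only, not the Java/JPA representation of the model: the class names Protocol, Snapshot, Experiment and Simulation are used as hypothetical examples, and the attributes are invented for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Protocol:
    """Target of a reference: independent and possibly shared by many referrers."""
    name: str


@dataclass
class Snapshot:
    """Child object: lives only inside the collection of its parent."""
    time: float


@dataclass
class Experiment:
    """Base class; the profile allows single inheritance only."""
    name: str
    # Reference: at most one target, which this object does not own.
    protocol: Optional[Protocol] = None


@dataclass
class Simulation(Experiment):
    """Sub-class: a Simulation is an Experiment."""
    # Collection: zero or more children owned by this parent.
    snapshots: List[Snapshot] = field(default_factory=list)

    def add_snapshot(self, time: float) -> Snapshot:
        # Children are created through the parent, so their life cycle follows it.
        snapshot = Snapshot(time)
        self.snapshots.append(snapshot)
        return snapshot
```

Two Simulation objects may point to the same Protocol instance, whereas each Snapshot belongs to exactly one Simulation and disappears with it.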
1238    
1239    <h2><a name="appB"/>Appendix B: XSLT pipe line</h2>
1240  <h2><a name="appB"/>Appendix B: XSLT pipe line</h2>  <em class="todo">@@ TODO Laurent @@</em>
1241  <em class="todo">@@ TODO Laurent @@</em>  
1242    <h2><a name="glossary"/>Glossary and Acronyms</h2>
1243  <h2><a name="glossary"/>Glossary and Acronyms</h2>  <dl>
1244  <dl>  <dt><a name="g_SimDB">SimDB</a></dt>
1245  <dt><a name="g_SimDB">SimDB</a></dt>  <dd>Acronym for <i>Simulation Database</i>, the standard that we propose to define in this Note.
1246  <dd>Acronym for <i>Simulation Database</i>, the standard that we propose to define in this Note.  Implementations of SimDB offer a query interface for discovering simulations (and related entities)
1247  Implementations of SimDB offer a query interface for discovering simulations (and related entities)  using ADQL, based on a prescribed (i.e. normative) relational data model and for describing simulations
1248  using ADQL, based on a prescribed (i.e. normative) relational data model and for describing simulations  via XML documents following prescribed XML (i.e. normative) schema.</dd>
1249  via XML documents following prescribed XML (i.e. normative) schema.</dd>  <dt><a name="g_SimDAP"/>SimDAP</dt>
1250  <dt><a name="g_SimDAP"/>SimDAP</dt>  <dd>Acronym for <i>Simulation Data Access Protocol</i>, a related standard to SimDB,
1251  <dd>Acronym for <i>Simulation Data Access Protocol</i>, a related standard to SimDB,  which will define services for accessing simulations discovered using SimDB.</dd>
1252  which will define services for accessing simulations discovered using SimDB.</dd>  <dt><a name="g_SimDB/DM"/>SimDB/DM</dt>
1253  <dt><a name="g_SimDB/DM"/>SimDB/DM</dt>  <dd>The logical data model defining the structure of <a href="#g_SimDB">SimDB</a>.</dd>
1254  <dd>The logical data model defining the structure of <a href="#g_SimDB">SimDB</a>.</dd>  <dt><a name="g_SimDB/RDB"/>SimDB/RDB</dt>
1255  <dt><a name="g_SimDB/RDB"/>SimDB/RDB</dt>  <dd>The representation of the SimDB/DM as a relational data base schema.
1256  <dd>The representation of the SimDB/DM as a relational data base schema.  This implies a parti</dd>
1257  This implies a parti</dd>  <dt><a name="g_SimDB/Views"/>SimDB/Views</dt>
1258  <dt><a name="g_SimDB/Views"/>SimDB/Views</dt>  <dd>The representation of the SimDB/DM as a collection of database view definitions. Each View directly represents
1259  <dd>The representation of the SimDB/DM as a collection of database view definitions. Each View directly represents  a complete DM class as a relational table, this in contrast to the underlying SimDB/RDB representation in tables,
1260  a complete DM class as a relational table, this in contrast to the underlying SimDB/RDB representation in tables,  at least in the JOINED object-relational mapping strategy.</dd>
1261  at least in the JOINED object-relational mapping strategy.</dd>  <dt><a name="g_SimDB/XML"/>SimDB/XML</dt>
1262  <dt><a name="g_SimDB/XML"/>SimDB/XML</dt>  <dd>The XML representation of the SimDB/DM</dd>
1263  <dd>The XML representation of the SimDB/DM</dd>  <dt><a name="g_SimDB/Resource"/>SimDB/Resource</dt>
1264  <dt><a name="g_SimDB/Resource"/>SimDB/Resource</dt>  <dd>A top-level data product stored in a SimDB.
1265  <dd>A top-level data product stored in a SimDB.  A SimDB/Resource can be described in a SimDB/XML document, but none of its constituents can.</dd>
1266  A SimDB/Resource can be described in a SimDB/XML document, but none of its constituents can.</dd>  <dt><a name="g_SimDB/TAP"/>SimDB/TAP</dt>
1267  <dt><a name="g_SimDB/TAP"/>SimDB/TAP</dt>  <dd>The TAP(-like) metadata representation of the SimDB/DM.
1268  <dd>The TAP(-like) metadata representation of the SimDB/DM.  This is currently (May 2008 <em class="todo">@@ TODO update once the TAP specification is out @@</em>
1269  This is currently (May 2008 <em class="todo">@@ TODO update once the TAP specification is out @@</em>  a representation of the <a href="#g_SimDB/Views">SimDB/Views</a> as a VOTable document.
1270  a representation of the <a href="#g_SimDB/Views">SimDB/Views</a> as a VOTable document.  </dd>
1271  </dd>  </dl>
 </dl>  
1272    
1273  <h2><a name="references">References</a></h2>  <h2><a name="references">References</a></h2>
1274    
1275  <p><a name="r_UML">[1] ???, <i>UML standard</i>  <p><a name="r_UML">[1] ???, <i>UML standard</i></a>
1276    <br/><a href="http://">http://</a>
1277    </p>
1278    <p><a name="r_XMI">[2] ???, <i>XMI standard</i></a>
1279    <br/><a href="http://">http://</a>
1280    </p>
1281    <p><a name="r_AnalaysisPatterns">[3] Martin Fowler, <i>Analysis Patterns</i>, 1997, Addison Wesley.</a>
1282    <br/><a href="http://">http://</a>
1283    </p>
1284    <p><a name="r_TheoryinVO">[4] Lemson & Colberg, <i>Theory in the virtual observatory</i></a>
1285    <br/><a href="http://">http://</a>
1286    </p>
1287    
1288    <p><a name="r_Characterisation">[5] ???, <i>Characterisation DM</i></a>
1289    <br/><a href="http://">http://</a>
1290    </p>
1291    
1292    <p><a name="r_informatonIntegration">[6] <em class="todo">@@ TODO @@</em>references on global-as-view and information integration</a>
1293    <br/><a href="http://">http://</a>
1294    </p>
1295    
1296    <p><a name="r_visivo">[7] <em class="todo">@@ TODO @@</em>reference to VisIVO</a>
1297    <br/><a href="http://">http://</a>
1298    </p>
1299    
1300    <p><a name="r_SpectrumDatamodel">[8] <em class="todo">@@ TODO @@</em>reference to Spectrum data model</a>
1301    <br/><a href="http://">http://</a>
1302    </p>
1303    
1304    <p><a name="r_Normalisation"/>[9], <i>Some links to pages on data model normalisation</i><br/>
1305    <a href="http://www.datamodel.org/NormalizationRules.html">http://www.datamodel.org/NormalizationRules.html</a><br/>
1306    <a href="http://en.wikipedia.org/wiki/Database_normalization">http://en.wikipedia.org/wiki/Database_normalization</a><br/>
1307    </p>
1308    
1309    <p><a name="r_DMApproaches"/>[10], some data model references<br/>
1310    <a href="http://www.agiledata.org/essays/dataModeling101.html">http://www.agiledata.org/essays/dataModeling101.html</a><br/>
1311    Meyer, B. <i>Object Oriented Software Construction, 2<sup>nd</sup> edition, Prentice Hall, 1997</i><br/>
1312    On object identity: <a href="http://en.wikipedia.org/wiki/Identity_(object-oriented_programming)">http://en.wikipedia.org/wiki/Identity_(object-oriented_programming)</a><br/>
1313    </p>
1314    
1315    <p><a name="r_IVOIdentifiers">[11] <em class="todo">@@ TODO @@</em>reference to IVOA Identifiers ...</a>
1316  <br/><a href="http://">http://</a>  <br/><a href="http://">http://</a>
1317  </p>  </p>
1318  <p><a name="r_XMI">[2] ???, <i>XMI standard</i>  
1319  <br/><a href="http://">http://</a>  <p><a name="r_Gadget">[12] <em class="todo">@@ TODO @@</em>reference to Gadget ...</a>
1320  </p>  <br/><a href="http://">http://</a>
1321  <p><a name="r_AnalaysisPatterns">[3] Martin Fowler, <i>Analysis Patterns</i>, 1997, Addison Wesley.  </p>
1322  <br/><a href="http://">http://</a>  
1323  </p>  
1324  <p><a name="r_TheoryinVO">[4] Lemson & Colberg, <i>Theory in the virtual observatory</i>  </body>
1325  <br/><a href="http://">http://</a>  </html>
 </p>  
   
 <p><a name="r_Characterisation">[5] ???, <i>Characterisation DM</i>  
 <br/><a href="http://">http://</a>  
 </p>  
   
 <p><a name="r_informatonIntegration">[6] <em class="todo">@@ TODO @@</em>references on global-as-view and information integration  
 <br/><a href="http://">http://</a>  
 </p>  
   
 <p><a name="r_visivo">[7] <em class="todo">@@ TODO @@</em>reference to VisIVO  
 <br/><a href="http://">http://</a>  
 </p>  
   
 <p><a name="r_SpectrumDatamodel">[8] <em class="todo">@@ TODO @@</em>reference to Spectrum data model  
 <br/><a href="http://">http://</a>  
 </p>  
   
 <p><a name="r_Normalisation"/>[9] <i>Some lionks to pages on data model normalisation<br/>  
 <a href="http://www.datamodel.org/NormalizationRules.html">http://www.datamodel.org/NormalizationRules.html</a><br/>  
 <a href="http://en.wikipedia.org/wiki/Database_normalization">http://en.wikipedia.org/wiki/Database_normalization</a><br/>  
 </p>  
   
 <p><a name="r_DMApproaches"/>[10] some data model references<br/>  
 <a href="http://www.agiledata.org/essays/dataModeling101.html">http://www.agiledata.org/essays/dataModeling101.html</a><br/>  
 Meyer, B. <i>Object Oriented Software Construction, 2<sup>nd</sup> edition, Prentice Hall, 1997<br/>  
 On object identity: <a href="http://en.wikipedia.org/wiki/Identity_(object-oriented_programming)">http://en.wikipedia.org/wiki/Identity_(object-oriented_programming)</a><br/>  
 </p>  
   
 <p><a name="r_IVOIdentifiers">[11] <em class="todo">@@ TODO @@</em>reference to IVOA Identifiers ...  
 <br/><a href="http://">http://</a>  
 </p>  
   
 <p><a name="r_Gadget">[12] <em class="todo">@@ TODO @@</em>reference to Gadget ...  
 <br/><a href="http://">http://</a>  
 </p>  
   
   
 </body></html>  
