
Diff of /trunk/projects/theory/snapdm/doc/note/SimDB-note.html


revision 433 by gerard.lemson, Wed Apr 30 16:28:24 2008 UTC revision 434 by gerard.lemson, Fri May 9 16:16:54 2008 UTC
# Line 8  Line 8 
8          <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />          <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
9          <meta name="maintainedBy" content="IVOA Document Coordinator, ivoadoc@ivoa.net" />          <meta name="maintainedBy" content="IVOA Document Coordinator, ivoadoc@ivoa.net" />
10    <link rel="stylesheet" href="http://ivoa.net/misc/ivoa_wg.css" type="text/css" />    <link rel="stylesheet" href="http://ivoa.net/misc/ivoa_wg.css" type="text/css" />
11      <link rel="stylesheet" href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/css/simdb-note.css" type="text/css">    <link rel="stylesheet" href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/css/simdb-note.css" type="text/css">
12  </head>  </head>
14  <body>  <body>
# Line 143  Line 143 
144  <li><a href="#sec6">6 Physical models</a></li>  <li><a href="#sec6">6 Physical models</a></li>
145          <ul class="toc">          <ul class="toc">
146          <li><a href="#sec6_1">6.1 RDBM Schema</a></li>          <li><a href="#sec6_1">6.1 Identifiers and references</a></li>
147          <li><a href="#sec6_2">6.2 XML Schema</a></li>          <li><a href="#sec6_2">6.2 RDBM Schema</a></li>
148          <li><a href="#sec6_3">6.3 Identifiers</a></li>          <li><a href="#sec6_3">6.3 XML Schema</a></li>
149          <li><a href="#sec6_4">6.4 JAVA/JPA+JAXB (non-normative)</a></li>          <li><a href="#sec6_4">6.4 Identifiers</a></li>
150            <li><a href="#sec6_5">6.5 JAVA/JPA+JAXB (non-normative)</a></li>
151          </ul>          </ul>
153  <li><a href="#sec7">7. Query protocols</a></li>  <li><a href="#sec7">7. Query protocols</a></li>
# Line 581  Line 582 
582  <h2><a name="sec5"/>5 Logical Model: SimDB</h2>  <h2><a name="sec5"/>5 Logical Model: SimDB</h2>
583  <p>  <p>
584  Here we introduce the core of our proposal, the UML representation of our logical data model  Here we introduce the core of our proposal, the UML representation of our logical data model
585  for our Simulation Database.  for our Simulation Database. The exact representation of this model is an
586  <h4><a name="sec5_1"/>5.1 Overview</h4>  <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/SimDB_DM.xml">XMI file</a>,
587    which can be found in the <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm">snapdm section</a>
588    of the <a href="http://volute.googlecode.com/svn/">Volute subversion database</a> on Google code.
589    Other representations can be found in that same hierarchy; see in particular the
590    <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/SimDB_DM.xml">HTML documentation</a>, which we generated from the XMI
591    representation with the XSLT pipeline described in <a href="#appB">Appendix B</a>. This generated file contains
592    an explicit description of every element in the model and serves as the reference documentation for the model.
593    <h3><a name="sec5_1"/>5.1 Overview</h3>
594  <p>  <p>
595  The logical data model is a fully detailed model of the application domain. It is to form the basis of physical  The logical data model is a fully detailed model of the application domain. It is to form the basis of physical
596  models, representing the model in various computational environments.  models, representing the model in various computational environments.
# Line 593  Line 601 
601  JPG representations of the model can be found in <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/images/">this</a>  JPG representations of the model can be found in <a href="http://volute.googlecode.com/svn/trunk/projects/theory/snapdm/input/images/">this</a>
602  directory. <em class="todo">@@TODO find proper representation image of the complete model. Possibly color packages differently.@@</em>  directory. <em class="todo">@@TODO find proper representation image of the complete model. Possibly color packages differently.@@</em>
603  </p>  </p>
604  <h4><a name="sec5_2"/>5.2 Normalisation</h4>  <h3><a name="sec5_2"/>5.2 Normalisation</h3>
605  <h4><a name="sec5_3"/>5.3 Target</h4>  <h3><a name="sec5_3"/>5.3 Model contents</h3>
606  <h4><a name="sec5_4"/>5.4 Characterisation</h4>  <h4><a name="sec5_3_1"/>5.3.1 Resource hierarchy</h4>
607  <h4><a name="sec5_5"/>5.5 Semantics</h4>  <p>
608  <h4><a name="sec5_6"/>5.6 Units</h4>  At the root of the SimDB data model is an abstract class called Resource; in the rest
609    of this document we will refer to this as SimDB/Resource.
610    It represents the different types of highest level meta-data objects to be stored in a SimDB.
611    Examples of this are represented as subclasses. The first is Experiment (SimDB/Experiment), which represents
612    the different types of experiments that have been performed (run/executed/...) and have produced the results
613    that SimDB users may be interested in. Examples of SimDB/Experiment-s are, first of all, simulations,
614    but also the various post-processing operations transforming simulation results into other products
615    such as halo catalogues, density fields etc.
616    </p>
617    <p>
618    The second major type of SimDB/Resource is the SimDB/Protocol.
619    This concept represents a <i>formally prescribed way of doing an experiment</i>.
620    It is derived from the concept with the same name in the domain model, which itself was inspired
621    by the concept with the same name in Chapter 8.5 in <a href="#r_AnalaysisPatterns">[3]</a>.
622    In the SimDB/DM this concept has concrete representations in the computer programs that are being
623    used to run simulations and post-processing etc. As such it defines the possible input parameters,
624    possible algorithms, and the kinds of results that can be produced by the code. Every SimDB/Experiment must
625    indicate which SimDB/Protocol was used and, for example, provide values for the input parameters and indicate
626    which physics was used.
627    </p>
628    <p>
629    The SimDB/Resource concept is clearly similar, but in general <i>not equivalent</i> to the Resource Registry's Resource concept.
630    In data modeling terms, it is not true that a SimDB/Resource <i>is a</i> Registry/Resource.
631    Often the reason is similar to the reasons that a single image is not a Registry/Resource, whereas a SIAP-compatible service is.
632    The granularity of a SimDB will be higher than that of a Registry, and many simulations on their own will be too small to be registered individually.
633    The SimDB itself will have to be registered (see <a href="#">section ???</a> for a further discussion
634    <em class="todo">@@ TODO add proper section and href@@</em>),
635    i.e. a SimDB service <i>is a</i> Registry/Resource. In discussion with Ray Plante (IVOA Interop May 2007, Beijing)
636    on this issue it was proposed that some part of the contents could also be registered in a Registry directly,
637    i.e. we should be able to identify Registry/Resource-s in SimDB. One consideration in deciding how to make this identification would be, for example,
638    that all data products resulting from a well defined (and published) scientific project could qualify.
639    To represent such a possibility we have for now introduced another subclass of SimDB/Resource: SimDB/Project.
640    This is not much more than an aggregation of experiments, with some additional attributes describing the motivation etc.
641    The metadata of a SimDB/Project is not the same as that of a Registry/Resource; however, we propose that we should be able
642    to define a transformation (possibly implemented again in XSLT) to transform a SimDB/Project and produce a Registry/XML representation.
643    Some more thoughts on this subject will be given in <a href="#">section ???</a> <em class="todo">@@ TODO add proper section and href@@</em> mentioned above.
644    </p>
646    <h4><a name="sec5_3_2"/>5.3.2 Target</h4>
648    <h4><a name="sec5_3_3"/>5.3.3 Characterisation</h4>
650    <h4><a name="sec5_3_4"/>5.3.4 Semantics</h4>
652    <h4><a name="sec5_3_5"/>5.3.5 Units</h4>
655    <h4><a name="sec5_3_6"/>5.3.6 Services</h4>
656    The goal of the SimDB specification is to define a protocol for querying interesting simulations and related SimDB/Resource-s.
657    Once these have been identified the user should be able to access these simulations.
660    <h2><a name="sec6"/>6 Physical models</h2>
661    Here we describe how we create physical models out of the logical model.
662    A <i>physical model</i> is (see <em class="todo">@@TODO reference to some standard reference on data modelling@@</em>)
663    a representation of the logical model that is adapted to a particular software environment.
664    The DM WG has mandated (IVOA interoperability meeting, Cambridge, UK, May 2003) that one
665    such representation should be an XML schema. This is to be used to define the structure of XML documents
666    used in messages to communicate instances of the SimDB Resource type.
667    Together with this we also create a relational database schema.
668    We propose this model as we want to use the ADQL standard under development in the VOQL WG
669    in the protocol for querying SimDB-s.
672    <h3><a name="sec6_1"/>6.1 Identifiers and References</h3>
674    We want to be able to identify each instance of each concrete type explicitly in a globally unique way.
676    To this end we need to be able to assign identifier on each
679    <h3><a name="sec6_2"/>6.2 RDBM Schema</h3>
680    The public schema, i.e. the view the outside world has of a SimDB, is a relational schema.
681    This will be formally defined using VOTables containing the appropriate TABLE definitions
682    <ul>
683    <li>object types are mapped to tables, one table per object type</li>
684    <li>Inheritance hierarchies: JOINED strategy as defined in JPA, i.e. each table only has columns for the attributes and references defined on the corresponding type.
685    Also an ID column that is a PK and also a FK to the ID of the base class' table. Possibly a container column (see below)</li>
686    <li>Primary key column: <tt>ID NUMERIC(18)</tt></li>
687    <li>Foreign key to container: <tt>containerId</tt><br/>plus foreign key and index declaration</li>
688    <li>References: &lt;referenceName&gt;Id<br/>plus foreign key and index declaration.</li>
689    <li>Using a topological sort of the object types based on their (extends|container|reference) relations, we generate
690    the create table statements and their indexes and foreign keys in blocks; the drop table statements are generated in the opposite order.</li>
691    <li>For each class we create a view named "v_&lt;class name&gt;"<br/>that returns all columns for that class; it uses a join to the base class's view.</li>
692    <li>generate a discriminator column on table for root in inheritance hierarchy, stores name of class (must be unique in inheritance hierarchy!)</li>
693    <li>attributes mapped to single column if their type is simple (i.e. primitive, or enumeration)</li>
694    <li>if attribute's type is dataType mapped to as many columns as the dataType has attributes,
695    with column names the name of the dataType's attributes, prefixed by &lt;attribute-name&gt;_</li>
696    <li>For PK columns we use the
697    </ul>
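The mapping rules in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the generator used for SimDB (which, per the note, is implemented elsewhere): the four-type `MODEL` dictionary, its attribute names, and the discriminator column name `DTYPE` (the JPA default) are hypothetical, and SQLite is used merely to check that the generated statements run in the computed order.

```python
import sqlite3
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical mini-model for illustration only; NOT the actual SimDB model.
MODEL = {
    "Resource":   {"extends": None, "container": None,
                   "references": {}, "attributes": {"name": "VARCHAR(255)"}},
    "Protocol":   {"extends": "Resource", "container": None,
                   "references": {}, "attributes": {"code": "VARCHAR(255)"}},
    "Experiment": {"extends": "Resource", "container": None,
                   "references": {"protocol": "Protocol"}, "attributes": {}},
    "Snapshot":   {"extends": None, "container": "Experiment",
                   "references": {}, "attributes": {"label": "VARCHAR(255)"}},
}

def creation_order(model):
    """Sort types so each table comes after every table it points to."""
    graph = {}
    for name, t in model.items():
        deps = set(t["references"].values())
        deps.update(d for d in (t["extends"], t["container"]) if d)
        graph[name] = deps  # node -> set of predecessors
    return list(TopologicalSorter(graph).static_order())

def create_table(name, t, model):
    """Emit one CREATE TABLE statement following the JOINED strategy."""
    if t["extends"]:
        # The ID is both the primary key and a foreign key to the base table.
        cols = [f"ID NUMERIC(18) PRIMARY KEY REFERENCES {t['extends']}(ID)"]
    else:
        cols = ["ID NUMERIC(18) PRIMARY KEY"]
        # Discriminator only on the root of an inheritance hierarchy.
        # The column name DTYPE is an assumption (JPA's default).
        if any(o["extends"] == name for o in model.values()):
            cols.append("DTYPE VARCHAR(32)")
    if t["container"]:
        cols.append(f"containerId NUMERIC(18) REFERENCES {t['container']}(ID)")
    for ref, target in t["references"].items():
        cols.append(f"{ref}Id NUMERIC(18) REFERENCES {target}(ID)")
    for attr, sqltype in t["attributes"].items():
        cols.append(f"{attr} {sqltype}")
    return f"CREATE TABLE {name} (\n  " + ",\n  ".join(cols) + "\n)"

order = creation_order(MODEL)
create_statements = [create_table(n, MODEL[n], MODEL) for n in order]
# Drop statements go in the opposite order of creation.
drop_statements = [f"DROP TABLE {n}" for n in reversed(order)]

conn = sqlite3.connect(":memory:")
for stmt in create_statements:
    conn.execute(stmt)
```

The sketch leaves out the index declarations, the per-class views and the flattening of dataType attributes into prefixed columns; a type that references itself would also need special handling, since `TopologicalSorter` raises `CycleError` on cyclic dependencies.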
699    <h3><a name="sec6_3"/>6.3 XML Schema</h3>
701  <h2><a hname="sec6"/>6 Physical models</h2>  <h3><a name="sec6_4"/>6.4 UTYPE-s</h3>
702  <h3><a name="sec6_1"/>6.1 RDBM Schema</h3>  <p>
703    It is generally the case that contents of databases may be represented in ways that do not
704    conform to one of the standard serialisations. Nothing prevents services to be developed on
705    top of SimDB that represent SimDB/Resource-s or even fragments of these in another form.
706    The standard example would be to have VOTables storing the results of a generic ADQL query of the SimDB/RDB representation.
707    VOTable first introduced the option to have a UTYPE attribute in FIELD definition tags store
708    a pointer to an element in a data model that the column represents.
709    </p>
710    <p>
711    The <a href="#r_SpectrumDatamodel">Spectrum data model</a> was the first to add explicit
712    UTYPE-s for each of the attributes in its model and the <a href="#r_CharacterisationDM">Characterisation data model</a>
713    has followed that example. As long as their precise usage and relation to the syntax of the underlying data model
714    is not defined, we will follow these examples by assigning UTYPE-s explicitly to all elements in the model.
715    However, we will follow a fixed set of rules to make this assignment and implement these in XSLT.
716    If a similar approach is at some time accepted within the IVOA, possibly in an alternative form, it will be straightforward
717    to adjust our definitions.
718    </p>
719    <p>
720    Our assumption is that the UTYPE should be able to uniquely represent any element in the data model, and in a manner
721    that is also easily interpreted. For now the elements that we assume we need to be able to address are those that can be
722    represented by a single value in a column. This leaves us needing to derive UTYPE-s for the following
723    model elements:
724    <ul>
725    <li>Attribute</li>
726    <li>Reference</li>
727    <li>Collection</li>
728    </ul>
730  <h3><a name="sec6_2"/>6.2 XML Schema</h3>  </p>
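As a rough illustration of rule-based UTYPE assignment for the three element kinds listed above, the following Python sketch derives one UTYPE per attribute, reference and collection of a class. The note does not state the actual rules (they are to be implemented in XSLT), so the pattern `<model>:<Class>.<member>`, the function name `derive_utypes` and the example members are all assumptions made purely for illustration.

```python
# Hypothetical UTYPE rule for illustration: "<model>:<Class>.<member>".
# The real SimDB rules are not given in the note; this is an assumption.
ADDRESSABLE_KINDS = {"attribute", "reference", "collection"}

def derive_utypes(model, class_name, members):
    """Return {member name: UTYPE} for (kind, name) pairs of one class."""
    utypes = {}
    for kind, name in members:
        if kind not in ADDRESSABLE_KINDS:
            raise ValueError(f"cannot derive a UTYPE for element kind {kind!r}")
        if name in utypes:
            # A UTYPE must identify exactly one model element.
            raise ValueError(f"duplicate member name {name!r} in {class_name}")
        utypes[name] = f"{model}:{class_name}.{name}"
    return utypes

# Example members are hypothetical, not taken from the SimDB model.
experiment_utypes = derive_utypes(
    "SimDB",
    "Experiment",
    [("attribute", "name"), ("reference", "protocol"), ("collection", "snapshot")],
)
```

Because the rule is a pure function of the model structure, changing the UTYPE syntax later would only require changing the one format string, which matches the note's point that a rule-based assignment is easy to adjust.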
733  <h3><a name="sec6_3"/>Identifiers</h3>  <h3><a name="sec6_5"/>6.5 Java/JPA+JAXB (non normative)</h3>
 <h3><a name="sec6_4"/>6.4 Java/JPA+JAXB (non normative)</h3>  
735  <h2><a name="sec7"/>7 Query Protocols</h2>  <h2><a name="sec7"/>7 Query Protocols</h2>
736  <h3><a name="sec7_1"/>7.1 ADQL</h3>  <h3><a name="sec7_1"/>7.1 ADQL</h3>
738  <h3><a name="sec7_2"/>7.2 REST</h3>  <h3><a name="sec7_2"/>7.2 REST</h3>
739  <p>  <p>
740  Under this heading we mean a protocol whereby data products can be retrieved through  Under this heading we mean a protocol whereby data products can be retrieved through
741  HTTP GET requests. Possibly also they can be POST-ed, or PUT.  HTTP GET requests. Possibly also they can be POST-ed, or PUT.
742  This needs to be discussed further, but maybe can be punted until a future release.  This needs to be discussed further, but maybe can be punted until a future release.
743  The GET will always only be able to get a complete SimDB resource, serialised to SimDB/XML.  The GET will always only be able to get a complete SimDB resource, serialised to SimDB/XML, similar to the Registry.
744  </p>  </p>
745  <h3><a name="sec7_3"/>7.3 TAP?</h3>  <h3><a name="sec7_3"/>7.3 TAP?</h3>
746  Issues:  Issues:
# Line 636  Line 761 
761  <h4><a name="sec8_1_4"/>8.1.4 USA</h4>  <h4><a name="sec8_1_4"/>8.1.4 USA</h4>
762  <em class="todo">@@ TODO Rick @@</em>  <em class="todo">@@ TODO Rick @@</em>
764  <h3><a name="sec8_2"/>8.2 Generating XML form simulation pipe lines</h3>  <h3><a name="sec8_2"/>8.2 Generating XML from simulation pipe lines</h3>
766  <h3><a name="sec8_3"/>8.3 SimDAP services</h3>  <h3><a name="sec8_3"/>8.3 SimDAP services</h3>
# Line 775  Line 900 
900  <p><a name="r_XMI">[2] ???, <i>XMI standard</i>  <p><a name="r_XMI">[2] ???, <i>XMI standard</i>
901  <br/><a href="http://">http://</a>  <br/><a href="http://">http://</a>
902  </p>  </p>
903  <p><a name="r_AnalaysisPatterns">[3] ???, <i>Analysis Patterns</i>  <p><a name="r_AnalaysisPatterns">[3] Martin Fowler, <i>Analysis Patterns</i>, 1997, Addison Wesley.
904  <br/><a href="http://">http://</a>  <br/><a href="http://">http://</a>
905  </p>  </p>
906  <p><a name="r_TheoryinVO">[4] Lemson & Colberg, <i>Theory in the virtual observatory</i>  <p><a name="r_TheoryinVO">[4] Lemson & Colberg, <i>Theory in the virtual observatory</i>
# Line 790  Line 915 
915  <br/><a href="http://">http://</a>  <br/><a href="http://">http://</a>
916  </p>  </p>
918  <p><a name="r_visivo">[6] <em class="todo">@@ TODO @@</em>reference to VisIVO  <p><a name="r_visivo">[7] <em class="todo">@@ TODO @@</em>reference to VisIVO
919    <br/><a href="http://">http://</a>
920    </p>
922    <p><a name="r_SpectrumDatamodel">[8] <em class="todo">@@ TODO @@</em>reference to Spectrum data model
923  <br/><a href="http://">http://</a>  <br/><a href="http://">http://</a>
924  </p>  </p>

