The issue of the identity of object types must be addressed.
------------------------------------------------------------

In particular, object type instances (i.e. objects) in SimDB (and the VO in general) can have representations in various contexts: XML, RDB (both DDL and query), code. We must be able to handle the fact that the identity of the same object can be represented in different ways, while preserving knowledge about the single underlying object. Furthermore, we must allow references to the object to be expressed in different ways, again depending on the context. For example:

- When an XML doc is created containing a new object to be registered in a SimDB, no "SimDB-identity" has been assigned yet. Nevertheless, the creator may need references from one object to another. If these are in the same document, an ID/IDREF representation may be used. If they are in different documents, this no longer holds.
- When stored in the database, we cannot assume that the user-supplied ID does not already exist in the DB. In our (generated) reference implementation we use JPA in the mode where primary keys are generated for each row. Within the database these ids are also used in foreign keys representing references.
- When we want to bridge different databases, we cannot assume that database ids generated in one DB will work in another. We have to make the host part of a globally unique "IVO-id". Our proposal for references from one XML doc to an existing object follows this same prescription.
- Users can have their own unique identifiers assigned to resources and maybe "sub-resources". These need to be preserved. Currently we only support this as an explicit publisherDID attribute on Resource (the root element) and on Snapshot (the results). We added the latter because we want to be able to communicate with SimDAP services using these user-supplied IDs.

At runtime we need to be able to resolve these different identifiers.
In our reference implementation we do this by storing between one and three different identifiers per row:

- The generated databaseId (a bigint/long).
- An xsd:ID (if supplied by the registrar, and kept for the registrar's future reference).
- An ivo-id, generated by the SimDB implementation and serving as the globally unique ID for the row. For the generation we propose something like:
  /#
  eg: http://simdb.g-vo.org/simdb/protocol/Simulator#123456789
  [TBD should this be ivo://... ?]

Our reference implementation uses various life cycle events to resolve the identifier appropriate in a given context.

References are to be resolved similarly. Depending on context, a user must be able to use the appropriate identifier of the referenced object to represent the reference:

- for remote references, outside of the current context, use the "ivo-id".
- for local references inside an XML doc, use the xml-id (as an IDREF).
- for local references inside a simdb/TAP, for example when querying using ADQL, use the appropriate foreign key column (a bigint/long).

TODO
----

1. We need to compare this approach to the one proposed in the STC standard, which uses XLink and similar techniques.
2. Can we use and expand the IVO Identifier spec to identify rows in a unique manner?
3. Do we allow users to define their own unique ivo-ids, or are these generated by us?
   GL: my vote is to not allow this. The ivoId of an object is assigned by the SimDB implementation. publisherDID-s on selected elements are designed to give a publisher this possibility.
   If we can generate it as proposed above, do we need to store it in the row? If not, this complicates SQL querying using them. We could see these ivoIDs purely as GUIDs, to be used in REST-like requests, not in SQL.
4. Do we need publisherDID-s on all possible objects? In that case these can be defined implicitly, on "MetadataObject", and do not need to be explicitly defined on Resource and Snapshot. These can be minOccurs=0.
5. ...
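
As an illustration of the identifier scheme and the context-dependent reference rules described in this note, here is a minimal Python sketch. It is not taken from the reference implementation (which is generated and uses JPA). The names `Identity`, `Context`, `BASE_URL` and `reference` are hypothetical. The ivo-id layout (base URL, object type, `#`, databaseId) is an assumption inferred from the example URL above. The sketch also derives the ivo-id on demand instead of storing it in the row, which is one of the open options under TODO item 3, not a decision.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union

# Assumption: base URL copied from the example in this note; a real
# deployment would configure its own host part.
BASE_URL = "http://simdb.g-vo.org/simdb/protocol"


class Context(Enum):
    """Where a reference is being written (hypothetical names)."""
    REMOTE = "remote"      # outside the current document or database
    XML_DOC = "xml"        # inside a single XML document (ID/IDREF)
    DATABASE = "database"  # inside one SimDB/TAP database (foreign key)


@dataclass
class Identity:
    """The one to three identifiers kept for a single object."""
    object_type: str
    database_id: Optional[int] = None  # generated on storage (bigint/long)
    xml_id: Optional[str] = None       # xsd:ID, only if the registrar supplied one

    @property
    def ivo_id(self) -> Optional[str]:
        # Assumed layout: <base URL>/<object type>#<databaseId>.
        # Undefined until the database has generated a primary key.
        if self.database_id is None:
            return None
        return f"{BASE_URL}/{self.object_type}#{self.database_id}"

    def reference(self, context: Context) -> Union[int, str]:
        """Return the identifier to use for a reference in the given context."""
        if context is Context.REMOTE:
            value = self.ivo_id
        elif context is Context.XML_DOC:
            value = self.xml_id
        else:
            value = self.database_id
        if value is None:
            raise ValueError(
                f"no identifier available for context {context.value!r}"
            )
        return value


if __name__ == "__main__":
    stored = Identity("Simulator", database_id=123456789, xml_id="sim1")
    for ctx in Context:
        print(ctx.value, stored.reference(ctx))
```

An object that exists only in a newly created XML document would carry just an `xml_id`; asking it for a remote or database reference raises `ValueError`, which mirrors the point that no "SimDB-identity" exists before registration.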