OBJECT-ORIENTED DATABASES
METHODS AND TOOLS FOR REFINING DATA MODEL MAPPING CONSTRUCTION
(ODB Course)

Leonid Kalinichenko
Institute of Informatics Problems
Russian Academy of Science
E-mail: [email protected]

OUTLINE

Introduction
- Heterogeneous multidatabase management and data model mapping
- Data model mapping issue in historical perspective
Commutative data model mapping method
- Data model formal definition and equivalence
- Principles and properties of the commutative data model mapping
- Data model mapping and unifying technique
- Denotational semantics as a data metamodel
- Structure of DDL and DML formal definitions
Network to relational DM mapping example
- Formal definition of source (network subset) and target (relational) DMs
- Construction of database schema mapping
- Axiomatic extension of the relational DM
- Semantics of target DM DML interpretation by the source DML
- Verification of the commutativity of the DML mapping diagram
Summary of results of the commutative DM mapping method application

OUTLINE (II)

Data model mapping as the data model refinement
- Model-based notations as metamodels
  Abstract Machine Notation
  Structure of the Abstract Machines and the respective Proof Obligations
  Refinement of Abstract Machines and proofs of the refinement properties
  AMN as a metamodel
- Data model refinement
- Mapping diagrams based on concept of refinement and their properties
- Commutative data model mapping construction
- Example of the relationship type mapping for different object models
Summary and discussion

INTRODUCTION
- Heterogeneous multidatabase management and data model mapping
- Data model mapping issue in historical perspective

HETEROGENEOUS MULTIDATABASE MANAGEMENT OBJECTIVES
Heterogeneous multidatabase management is defined as an approach to database design, application and management providing for the following objectives:
1. joint usage of data from several heterogeneous databases as from a logically single database
2. homogeneous presentation of a collection of various databases to an application, and maintenance of its integrity
3. unification of data description and data manipulation languages for various data models
4. maintenance of a DBMS-independent generalized level of application domain description
5. provision of application program independence of the DBMS
6. continuous coverage of an expanding spectrum of data representations and operations in digital collections of data.
In most of the known multidatabase management systems (MDBMSs) the above goals were only partially achieved.
Formal theory is required to study properties of data models as a whole (e.g., data model equivalence, data model similarity, data model refinement) and operations on them (such as data model transformation).
We attempt to develop a system of concepts and metalanguages to treat data models on the basis of formal semantics.

OBJECT-ORIENTED DATABASES IPI RAS Leonid Kalinichenko (5)

DATA MODEL MAPPING ISSUES
- How to obtain homogeneous specifications of heterogeneous components so that the existing specification of a resource can be proved to be equivalent to, or a refinement of, its canonical specification.
- Shared, interoperable databases: integration of multiple data model paradigms, data model mapping, canonical data model (database interlingua) development.
- Canonical schema design is based on composing reusable heterogeneous database components mapped to the canonical data model.
- The reuse of a type is feasible to the extent that we can justify that the implemented type state and behaviour correctly model the required canonical type state and behaviour.
- Subtype and substitutability: objects of a subtype can be used in place of objects of a supertype without affecting their clients.
- Different research areas (heterogeneous database integration, semantic interoperability, schema evolution and transformation, views integration, data model mapping and database transformation, database migration, database design with reuse of pre-existing databases) require similar fundamentals.
- Diverse world of DBMSs (X3H7 Object Model Features Matrix).
- Heterogeneous DB interoperability projects (e.g., IRO-DB).
- Completeness of DB specifications (up to provision of function specifications in type definitions) is a necessary prerequisite for semantic interoperation.
- The subtype definitions should be complete: taking into account only the relationship between the function signatures of a type and its subtype is not enough.

DATA MODEL MAPPING ISSUES (2)
Data model mapping approaches:
- notion of database state and data model equivalence (Biller H., Neuhold E.J., Borkin S.A.)
- approaches to data model mapping based on some kind of formalism:
  - a commutative data model mapping (verifiable design of model transformation and canonical model synthesis): Kalinichenko L.A.
  - application of many-sorted logic to transforming databases between different data models (Olga de Troyer).
Extensible data model challenges and requirements.
Type model commutative mapping method:
- preserving information and operations of types while mapping them into the canonical types
- the commutativity of two mapping diagrams (type state and type behaviour diagrams) should be established
- required state-based and behavioral properties of the mappings lead to a proof that a source type model is a refinement of its mapping into the canonical type model.

COMMUTATIVE DATA MODEL MAPPING METHOD
- Data model formal definition and equivalence
- Principles and properties of the commutative data model mapping
- Data model mapping and unifying technique
- Denotational semantics as a data metamodel
- Structure of DDL and DML formal definitions

BASIC IDEA OF HETEROGENEOUS MULTIDATABASE MANAGEMENT
Introduction of a DBMS-independent generalized level of data representation and manipulation (the virtual database level) and of a canonical data model of an integrating system corresponding to this level. Data models (DM) supported by DBMSs are internal ones with respect to this common DM.
Each DM is completely defined by the semantics of its data description language (DDL) and data manipulation language (DML).
Construction of an MDBMS becomes possible on the basis of methods of data model mapping (when speaking about mapping, the DM being mapped is called the source DM and the DM into which the mapping is carried out is called the target DM).
"Mapping" and "Transformation"
"Mapping" is the abstract mechanism of conversion of one model into another and "transformation" is an implementation of the mapping. A software processor performing such mapping will be called "a data model transformer". Wrappers.
In the transition from an internal to the canonical DM it is necessary to preserve the information and operators. This requires that the internal DM be equivalently represented in the canonical one in the process of DM mapping.

DATA MODEL EQUIVALENCE
Equivalence of database states
Database states in a source and a target DM are equivalent if they are mapped into one and the same state in the context of an abstract data metamodel. It is assumed that equivalent database states represent one and the same collection of facts.
Equivalence of database schemas
Database schemas are equivalent if they produce sets of database states of equal power (cardinality) related by a bijective dependency in such a way that the states in one-to-one correspondence are equivalent.
Equivalence of data models
Two data models are equivalent if each database schema in one model can be put into a one-to-one correspondence with the equivalent schema in the other model, while providing completeness of the DML operator set in each data model.

DML OPERATOR SET COMPLETENESS
A DML operator set is functionally complete if
- for each type of DDL objects the actions of retrieving, putting, deleting and updating of objects are expressible in the DML, and
- for each initial state $b_1$ of a database with the schema $s_i$ it is possible to define a sequence of DML operators transferring the database into any given state $b_2$ admissible for the schema $s_i$.
It is assumed that the DML operator sets of the data models considered here are functionally complete.

DATA MODEL FORMAL DEFINITION
For data model $M_i$ the set of all schemas expressible in the DDL of $M_i$ is denoted by $S_i$; the set of data manipulation statements which may be constructed in the DML of $M_i$ is denoted by $O_i$.
A database state corresponding to schema $s_i \in S_i$ is a function $b_s : Id_s \to V_i$ defining for each data type in the schema, denoted by identifier $Id$, its value $v_i$ taken from the set of admissible values $V_i$ of the data type. It is essential that $v_i$ can also be a function.
The set of admissible states corresponding to some schema $s_i \in S_i$ is a set of functions $B_s : [Id_s \to V_i]$.
The space of states expressible in $M_i$ is a set of functions $B_i : [Id_i \to V_i]$, which may be considered as the union of the sets $B_s$ for all $s_i \in S_i$.
Data model $M_i$ is a quadruple $\langle S_i, Ms_i, O_i, Mo_i \rangle$ where
- $Ms_i : S_i \to B_i$ is the semantics function of the $M_i$ DDL,
- $Mo_i : O_i \to [B_i \to B_i]$ is the semantics function of the $M_i$ DML.
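
The quadruple above can be made concrete on a toy model. The following is only a sketch under stated assumptions: the value set `VALUES`, the single `put` operator and all function names are hypothetical and do not come from the course. `ms` plays the role of $Ms_i$ (it returns the admissible states of a schema), `mo` plays the role of $Mo_i$, and `functionally_complete` checks only the reachability half of functional completeness by breadth-first search.

```python
from collections import deque
from itertools import product

VALUES = (0, 1)  # hypothetical set V_i of admissible values


def ms(schema):
    """DDL semantics: a schema (tuple of identifiers) -> its set of admissible states."""
    return {tuple(zip(schema, vals)) for vals in product(VALUES, repeat=len(schema))}


def mo(op):
    """DML semantics: an operator ("put", identifier, value) -> a state transformer."""
    kind, ident, value = op
    assert kind == "put"

    def transform(state):
        return tuple((i, value if i == ident else v) for i, v in state)

    return transform


def operators(schema):
    """All DML operators constructible for the schema in this toy model."""
    return [("put", ident, v) for ident in schema for v in VALUES]


def functionally_complete(schema, ops=None):
    """True if every admissible state is reachable from every other admissible state."""
    states = ms(schema)
    transforms = [mo(op) for op in (operators(schema) if ops is None else ops)]
    for start in states:
        seen, queue = {start}, deque([start])
        while queue:
            state = queue.popleft()
            for transform in transforms:
                nxt = transform(state)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if seen != states:
            return False
    return True
```

With the full operator set the toy model is complete; restricting it to a single `put` leaves some states unreachable, which is the situation the completeness assumption rules out.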

DATA MODEL MAPPING
The following set of mappings constitutes the mapping $f$ of data model $M_j$ into data model $M_i$:
- database state space of $M_j$ into the database state space of $M_i$: $B_j \to B_i$,
- database schemas of $M_j$ into database schemas of $M_i$: $S_j \to S_i$,
- DML operators of $M_i$ into sequences of operators of the DML of $M_j$: $O_i \to P_j$, where $p_j \in P_j$ is a procedure in the DML of $M_j$.

BASIC PRINCIPLES OF HETEROGENEOUS MULTIDATABASE MANAGEMENT
The following propositions form the basis of the heterogeneous multidatabase conception.
Data model axiomatic extension principle
The canonical data model in the integrating system should be extensible as new data models are included into the system. Such an extension is assumed to be axiomatic: the extension of the target DM is carried out by adding to its DDL a system of axioms determining (in terms of the target model) the logical dependencies of the source data model and the modified semantics of the DML operators of the target DM. The result of the extension should be equivalent to the source data model.
Construction of a target DM axiomatic extension is considered as a new language design (DDL and DML) on the basis of the target DM.
Data model commutative mapping principle
In the process of mapping it is necessary to preserve information and operators. This requirement is satisfied if the DM mapping is commutative.

BASIC PRINCIPLES OF HETEROGENEOUS MULTIDATABASE MANAGEMENT (II)
Mapping $f$ of data model $M_j$ into extension $M_{ij}$ of data model $M_i$ is commutative if the following conditions hold:
- the schema mapping diagram is commutative:
  [Diagram: a square with $Ms_{ij}$ leading from the schemas of $M_{ij}$ (schemas of $M_i$ together with the axiom schemas) to $B_{ij}$ on top, $Ms_j : S_j \to B_j$ at the bottom, and two upward arrows for the schema and state mappings; the arrow labels are not recoverable]
- the DML operators mapping diagram is commutative:
  [Diagram: a square with $Mo_{ij} : O_{ij} \to [B_{ij} \to B_{ij}]$ on top, $Mp_j : P_j \to [B_j \to B_j]$ at the bottom, a downward arrow $O_{ij} \to P_j$ and an upward arrow on the state side; the arrow labels are not recoverable]
- the state mapping is bijective.
The set of axiom schemas of $M_{ij}$ expresses the data dependencies of $M_j$ in terms of $M_i$. $P_j$ denotes sequences of $M_j$ DML operators (procedures).
Unifying canonical data model synthesis principle
Canonical data model synthesis is the process of constructing extensions of the canonical data model kernel equivalent to the data models of the DBMSs embraced by the MDBMS, together with the process of merging such extensions into a canonical data model. Thus a unifying canonical data model is formed in which the data models of various DBMSs have homogeneous equivalent representations (as subsets of the unifying data model).

DATABASE INVARIANTS
Axioms $a_{ij}$ from the set of axiom schemas of $M_{ij}$, which express in terms of $M_i$ the inherent consistency rules fixed in data model $M_j$, will be referred to as database invariants.
Axiom induced actions
The database state modifications induced by axiom $a_{ij}$ on the execution of a database update statement $o_{ij} \in O_{ij}$ are the minimal actions, additional to those of the statement $o_i \in O_i$ of the target data model $M_i$, which ensure that $a_{ij}$ is an invariant with respect to $o_{ij}$.
Complete set of database invariants for extensions $M_{ij}$ of data model $M_i$
In axiomatic extensions of $M_i$ complete sets of invariants are considered, which on the one hand completely define the inherent consistency rules of $M_j$ for the data types of $M_{ij}$, and on the other hand completely define the modifications of the semantics of $M_{ij}$ DML statements with respect to the analogous statements of $M_i$.

PROPERTIES OF COMMUTATIVE DATA MODEL MAPPINGS
Data model $M_i$ is included into data model $M_j$ if all axiom schemas of some extension $M_{ri}$ of a reference data model $M_r$ (equivalent to $M_i$) are included (not necessarily strictly) into the set of axiom schemas of the extension $M_{rj}$ of the reference data model $M_r$ equivalent to $M_j$.
Proposition of existence
If the database schema mapping diagram for the $M_j$ to $M_{ij}$ mapping commutes, the set of $M_j$ DML statements is functionally complete, and the semantics function $Mo_{ij}$ is defined, then a mapping $O_{ij} \to P_j$ exists which makes the DML statement mapping diagram commutative.
Proposition of equivalence
Data model $M_j$ is equivalent to $M_i$ if and only if there exists a commutative mapping of $M_j$ to $M_i$.
Proposition of constructiveness
A commutative mapping of data model $M_j$ into an extension of data model $M_i$ can be constructed if the target data model ($M_i$) is included into the source one ($M_j$).

FIRST OBSERVATIONS
The properties of commutative data model mappings defined above determine the methods of commutative DM mapping design.
From the proposition of existence it follows that two processes can be separated: the design of the axiomatic extensions of the canonical data model kernel equivalent to the data models of the set of DBMSs, and the design of the algorithms interpreting extended target DML statements by means of the source DML.
Thus the synthesis of the canonical unifying data model may be treated independently of the definition of the algorithms of the DML interpreters.

DATA METAMODEL
Formal definition of data models is needed to obtain their compact and precise description, making it possible to manipulate them as mathematical objects.
The metalanguage used for the formal definition of the data models is called the data metamodel (DMM).
The DMM should be independent of the concepts of particular data models, allowing precise expression of the semantic properties of different data models, and of their similarities or differences, on the basis of one and the same language.
We need a common discipline for the design of data model mappings, providing for the construction of the data model mapping algorithm and for the proof of its correctness.
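
As a toy illustration of what establishing the correctness of a mapping amounts to, the commutativity of the DML operators mapping diagram can be checked by brute force on a small finite state space. Everything in this sketch is an assumption made for illustration: the two toy models, the names `beta` (state mapping) and `pi` (operator mapping), and the operator sets are hypothetical, and an exhaustive check over four states is not a substitute for the proofs the method calls for.

```python
from itertools import combinations, product

NAMES = ("x", "y")

# Source model M_j (hypothetical): a state is a tuple of bits, one per name.
B_J = list(product((0, 1), repeat=len(NAMES)))

# Extended target model M_ij (hypothetical): a state is the set of names that are "on".
B_IJ = {frozenset(c) for r in range(len(NAMES) + 1) for c in combinations(NAMES, r)}
O_IJ = [("insert", n) for n in NAMES] + [("delete", n) for n in NAMES] + [("clear",)]


def mo_j(op):
    """Semantics of a source operator ("set", position, value)."""
    _, pos, val = op
    return lambda b: tuple(val if k == pos else v for k, v in enumerate(b))


def mp_j(procedure):
    """Semantics of a source procedure: apply its operators in sequence."""
    def run(b):
        for op in procedure:
            b = mo_j(op)(b)
        return b
    return run


def mo_ij(op):
    """Semantics of a target operator."""
    if op[0] == "insert":
        return lambda s: s | {op[1]}
    if op[0] == "delete":
        return lambda s: s - {op[1]}
    return lambda s: frozenset()


def beta(b):
    """State mapping from source states to target states."""
    return frozenset(n for n, v in zip(NAMES, b) if v)


def pi(op):
    """Operator mapping: each target operator becomes a source procedure."""
    if op[0] == "insert":
        return [("set", NAMES.index(op[1]), 1)]
    if op[0] == "delete":
        return [("set", NAMES.index(op[1]), 0)]
    return [("set", k, 0) for k in range(len(NAMES))]


def state_mapping_is_bijective():
    images = [beta(b) for b in B_J]
    return len(set(images)) == len(B_J) and set(images) == B_IJ


def dml_diagram_commutes(operator_mapping=pi):
    """Running the source procedure and then mapping the state must equal
    mapping the state and then running the target operator."""
    return all(
        beta(mp_j(operator_mapping(o))(b)) == mo_ij(o)(beta(b))
        for o in O_IJ
        for b in B_J
    )
```

Replacing `pi` by a mapping that sends every target operator to the empty procedure makes the check fail, which shows that the condition constrains the operator mapping and not only the state mapping.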

PROGRAMMING LANGUAGE SEMANTICS
Generally, the definition of the semantics of a programming language $L$ is a mechanism $M$ which relates the syntactical constructions or programs of the language to their content (or denotation: an object corresponding to its own name). In other words $M : P \to D$, where $P$ is the syntactical domain of $M$ (the set of all syntactically correct programs in $L$) and $D$ is the semantical domain of $M$ (the set of the program denotations).
The existing methods for defining programming language semantics are usually classified according to the type of domain $D$ as:
- compiler based, in which $D$ is the set of abstract representations of target language syntax (e.g., in the form of a parse tree)
- operational or interpretative, in which $D$ is the set of calculations (sequences of abstract machine states) induced by the programs
- axiomatic, in which $D$ is a relation defined on the sets of pre- and postconditions, defining the allowable set of initial and final program states
- functional or denotational, in which the denotation of the program is a partially recursive function.
Operational methods are mostly convenient for language implementors. Axiomatic methods mostly correspond to the process of programming. Denotational methods are generally convenient for language designers.
Among these methods only the functional method has the remarkable property of combining the treatment of programs as mathematical objects (functions) with flexible facilities for the formal definition of data types (in the form of partially ordered sets).

FUNCTIONS AND TYPES IN DENOTATIONAL SEMANTICS
The cornerstone of denotational semantics is the introduction of a class of "data types" (domains of functions) as posets and of a class of functions (generally recursive) for the creation of a natural and precisely defined computational model.
The foundations lie in the research of D. Scott, C. Strachey, D. de Bakker and Z. Manna devoted to the theory of computation, methods of program analysis and programming language semantics.
Functions $f$ with domain $D_1$ and range $D_2$ are in general considered to be partial. Such functions are considered as abstract objects treated as sets of pairs: $f = g$ if $f(x) = g(x)$ for all $x$.
Extensions of such functions to total ones are created by adding to the domains $D_1$, $D_2$ the undefined value $\bot$ in such a way that if function $f$ is undefined on $d_1 \in D_1$ then $f(d_1) = \bot$. Besides, function $f$ is also defined at the point $\bot$. The class of all functions mapping $D_1$ into $D_2$ is denoted further as $[D_1 \to D_2]$.
Relation $\sqsubseteq$ ("less defined or equal") is an order on domain $D$ such that for all $d \in D$ the following relationships hold: $\bot \sqsubseteq d$ and $d \sqsubseteq d$. Different elements $d_1, d_2$ from $D \setminus \{\bot\}$ are not related by $\sqsubseteq$.
A data type (domain) in the data metamodel is a set partially ordered by the relation $\sqsubseteq$.

DATA METAMODEL CONTENT
The abstract data metamodel consists of:
- a set of elementary domains $D_1, D_2, \ldots$ corresponding to primary sets of objects
- operations allowing construction of complex domains (data types) from simpler ones, and facilities for the formal definition of data types
- a set of data types defined on the basis of elementary domains by means of data type constructing operations
- a set of primitive functions and predicates defined on the data types
- a set of functional forms used for the definition of new functions, and facilities for the formal definition of functions
- a set of function definitions
- a set of rules of equivalent function transformations.

PRIMARY DOMAINS AND DOMAIN CONSTRUCTION OPERATIONS
Elementary domain
Let $E$ denote an arbitrary set $\{e_i \mid i \in I\}$. An elementary domain $D$ is created by adding the element $\bot$ to $E$ and setting on the resulting set a partial order $\sqsubseteq$:
- for all $e_i \in E$ it holds that $\bot \sqsubseteq e_i$
- for all $e_i, e_j \in E$ it holds that $e_i \sqsubseteq e_j \Rightarrow e_i = e_j$.
The operation of creating an elementary domain from an arbitrary set is denoted as $D = \{e_i \mid i \in I\}$.
E.g., $N = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$, $B = \{true, false\}$, $C = \{'a', 'b', 'c', \ldots\}$ are the domains of integer, boolean and character values.
Product domain
If $D_1$ and $D_2$ are domains then $D = D_1 \times D_2$ is a domain: the cartesian product of domains consisting of the pairs $\langle d_1, d_2 \rangle$, $d_1 \in D_1$, $d_2 \in D_2$. It is essential that $\langle d_1, d_2 \rangle \sqsubseteq \langle d_1', d_2' \rangle$ if and only if $d_1 \sqsubseteq d_1'$ in $D_1$ and $d_2 \sqsubseteq d_2'$ in $D_2$.
Let $D_i$ ($1 \le i \le n$) be a number of domains. $D = D_1 \times D_2 \times \ldots \times D_n$ is a new domain consisting of $n$-ary tuples of elements of $D_1, D_2, \ldots, D_n$. The special case when all $D_i$ are one and the same domain is denoted by $D = D_i^n$.

PRIMARY DOMAINS AND DOMAIN CONSTRUCTION OPERATIONS (II)
Domain of sum
If $D_1$ and $D_2$ are domains then $D = D_1 + D_2$ is a domain: the sum, including arbitrary elements of the initial domains. Elements $d, d' \in D$ are related by $\sqsubseteq$ if and only if $d, d'$ belong to one of the initial domains $D_1$ or $D_2$ and in this domain the relation between $d, d'$ holds. For a larger number of domains the domain-sum is denoted as $D = D_1 + D_2 + \ldots + D_n$.
Functional domain
The class of functions $[D_1 \to D_2]$, where $D_1$ and $D_2$ are domains, forms a domain with respect to the partial ordering $\sqsubseteq$ such that $f \sqsubseteq g$ ($f, g \in [D_1 \to D_2]$) if and only if for all $d \in D_1$ it holds that $f(d) \sqsubseteq g(d)$.
List domain
If $D$ is any domain then $D^* = D + D^2 + D^3 + \ldots$ denotes a list domain which includes all possible tuples of arbitrary length composed of the elements of $D$.

DATA TYPE CONSTRUCTION
Any data type (domain) $T$ is constructed recursively from data types (domains) $D_i$ ($1 \le i \le n$) using the domain constructing operations $OP = + \mid \times \mid \to$ according to the following definition:
- $T = D_i$ (any domain is a data type),
- $T = T^*$,
- $T = T^n$,
- $T = T\ OP\ T$.
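
The orderings attached to the elementary, product and functional domains can be sketched executably. This is a minimal illustration, assuming that Python's `None` stands in for the undefined value and that functional domains are compared over a finite list of sample arguments; the names `flat_leq`, `product_leq` and `function_leq` are hypothetical.

```python
BOTTOM = None  # assumed stand-in for the undefined value


def flat_leq(a, b):
    """Elementary domain: bottom is below everything, proper elements only below themselves."""
    return a is BOTTOM or a == b


def product_leq(component_leqs):
    """Product domain: tuples are compared componentwise."""
    def leq(xs, ys):
        return (
            len(xs) == len(ys) == len(component_leqs)
            and all(l(x, y) for l, x, y in zip(component_leqs, xs, ys))
        )
    return leq


def function_leq(arguments, range_leq):
    """Functional domain: f is below g if f(d) is below g(d) for every argument d."""
    def leq(f, g):
        return all(range_leq(f(d), g(d)) for d in arguments)
    return leq


# Example: the everywhere-undefined function is below any other function.
ARGS = [BOTTOM, 0, 1, 2]
undefined = lambda d: BOTTOM
parity = lambda d: BOTTOM if d is BOTTOM else d % 2
below = function_leq(ARGS, flat_leq)
```

Here `below(undefined, parity)` holds while `below(parity, undefined)` does not, matching the reading of the order as "less defined or equal".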
In domain constructing expressions parentheses may be used: $T = (T\ OP\ T)$.
For readability the functional domain is enclosed in square brackets: $T = [T \to T]$.
A data type definition in the metamodel is the following construction:
<data type definition> ::= <identifier> = <data type>
<data type> ::= $T$ | <predicate> $\to$ <data type>, <data type>

PRIMITIVE FUNCTIONS AND PREDICATES
Functional domain analysing functions
$domain(f) \equiv f \in [D \to D'] \to \{d \in D \mid f(d) \ne \bot\},\ \bot$
$range(f) \equiv f \in [D \to D'] \to \{d' \in D' \mid f^{-1}(d') \ne \bot\},\ \bot$
Element length function
$length(d) \equiv d \in D^* \ \&\ d = \langle d_1, d_2, \ldots, d_n \rangle \to n,\ \bot$
Projection function
$s\text{-}D_j(d) \equiv d = \langle d_1, d_2, \ldots, d_n \rangle \ \&\ d \in D_1 \times D_2 \times \ldots \times D_n \ \&\ (1 \le j \le n) \to d_j,\ \bot$
A shorter form indexed only by $j$ may also be written if the structure of $d$ is obvious.
Selector function
This function is used for analysis of the components of structured functional domains $[D_1 \to [D_2 \to \ldots [D_{k-1} \to D_k] \ldots ]]$: it is applied to $f$ with the selectors $d_1, d_2, \ldots, d_{k-1}$.
Sum domain analysis predicate
$is\text{-}D_j(d) \equiv d \in D_1 + D_2 + \ldots + D_n \to (d \in D_j \ \&\ (1 \le j \le n) \ \&\ d \ne \bot \to T,\ F),\ \bot$

FUNCTIONAL FORMS
A functional form is an expression denoting a function. The function is defined by the functions included into the functional form and by their arguments.
The basic functional forms used for the composition of complex forms:
Conditional form
$(p \to f, g) \equiv (p = T \to f,\ p = F \to g,\ \bot)$
Form-functional substitution
Let $f \in [D \to D']$, $d \in D$, $v \in D'$. Then $f[d/v]$ is a functional form describing a new function $f'$ obtained from $f$ by setting the value of $f$ on the argument $d$ equal to $v$.
In case $D$ and $D'$ are complex domains (e.g. $D = D_1 \times D_2 \times \ldots \times D_n$, $D' = D_1 \times D_2 \times \ldots \times D_m$) the substitution is denoted in the following way: $f[d_1, d_2, \ldots, d_n / v_1, v_2, \ldots, v_m]$. Here $\langle d_1, d_2, \ldots, d_n \rangle \in D$, $\langle v_1, v_2, \ldots, v_m \rangle \in D'$.
Composition
Let $f \in [D \to D']$, $g \in [D' \to D'']$. The composition (left composition) of functions is the form $g \circ f \in [D \to D'']$ defined by the rule $(g \circ f)(d) \equiv g(f(d))$ for all $d \in D$.
The composition of $n$ functions $f_1, f_2, \ldots, f_n$ is the form $\circ_{i=1}^{n} f_i \equiv f_n \circ f_{n-1} \circ \ldots \circ f_1$.
Form-constant
If $d \in D$ is an arbitrary element of some domain, $\bar{d}$ denotes a constant function such that for all arguments $a$ the value is $\bar{d}(a) \equiv a = \bot \to \bot,\ d$.

RULES OF EQUIVALENT FUNCTION TRANSFORMATION
Equivalent function transformation is required for proofs of the commutativity of data model mappings. Some basic transformations are shown below; the list is not complete. $f, g, h$ denote arbitrary functions, $p, q$ denote predicates, and $f \equiv g$ denotes equivalence of functions.
It is well known (e.g., J. Backus) that the following equations hold:
(1) $\bot \circ f \equiv f \circ \bot \equiv \bot$
(2) $(p \to f, g) \circ h \equiv p \circ h \to f \circ h,\ g \circ h$
(3) $h \circ (p \to f, g) \equiv p \to h \circ f,\ h \circ g$
(4) $p \to (p \to f, g),\ h \equiv p \to f,\ h$

STRUCTURE OF DATA MODEL FORMAL DEFINITION: DDL
1. Abstract DDL syntax is defined as the domain $Sch_i$ of all proper schemas expressible in $M_i$.
2. DDL semantics domains are the data types which may be represented in the data model. Every data type is interpreted by an elementary or recursively defined data type of the data metamodel.
3. DDL semantics function schemas are definitions of functional domains of the form $Id \to V$ establishing, for each $M_i$ data type, the correspondence of possible identifiers of the schemas of this type to the set of the data type values (the semantics domain). Thus a database state is described as a product of functional domains defined in this way.
4. Functions of DDL constructions interpretation express exactly the correspondence of $M_i$ DDL constructions to semantical domains.

STRUCTURE OF DATA MODEL FORMAL DEFINITION: DML
1. Abstract DML syntax is defined as the domain of all DML operators expressible in $M_i$.
2. DML semantics domains are the DDL semantics domains and additional data types characterizing the state of the program communicating with the database.
3. DML semantics function schemas define the general form of the interpretation function of the program including DML statements, and of the separate DML statements.
4. Primitive function schemas fix the domains and ranges of the primitive undefinable functions which are characteristic of the particular $M_i$.
5. Standard function definitions provide general functions and predicates useful for the specification of the functional forms of different DML statement interpretation functions.
6. DML statement interpretation function definitions give a functional form which interprets the changes of the database on execution of DML statements in the context of a particular database schema.

THE NECESSITY FOR FORMAL TOOLS
The formal tools we introduce are essential for:
- definition of the rules of mapping database schemas of the source DM into database schemas of the target one
- definition of the semantics of the DML operators of the extended target DM
- consistency of the operational semantics of the extended DM axioms with the DML operator semantics of the extended data model
- definition and verification of the algorithms interpreting extended data model DML operators by means of the source DM.
To solve the above problems we should express them in an abstract language environment so that the use of formal methods becomes possible.
To obtain practically useful results it is necessary to take into account all essential details of the DDL and DML semantics of the source and target data models.
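
The functional forms and the transformation rules listed earlier are the kind of formal tool meant here, and they can be tried out directly. The sketch below is an illustration under assumptions: `None` stands in for the undefined value, all functions are made strict with a hypothetical `strict` helper, and the rules for conditional forms and composition are only tested on a small sample of arguments, which is evidence and not a proof.

```python
BOTTOM = None  # assumed stand-in for the undefined value


def strict(fn):
    """Lift an ordinary function so that it maps the undefined value to itself."""
    return lambda x: BOTTOM if x is BOTTOM else fn(x)


def cond(p, f, g):
    """Conditional form: f where p yields True, g where p yields False, undefined otherwise."""
    def form(x):
        t = p(x)
        if t is True:
            return f(x)
        if t is False:
            return g(x)
        return BOTTOM
    return form


def subst(f, d, v):
    """Functional substitution: the same function as f except that argument d yields v."""
    return lambda x: v if x == d else f(x)


def compose(*fs):
    """compose(f_n, ..., f_1) applies f_1 first and propagates the undefined value."""
    def form(x):
        for f in reversed(fs):
            if x is BOTTOM:
                return BOTTOM
            x = f(x)
        return x
    return form


SAMPLE = [BOTTOM, -2, -1, 0, 1, 2]


def equal_on(f1, f2, sample=SAMPLE):
    return all(f1(x) == f2(x) for x in sample)


# Hypothetical example functions and predicate.
p = strict(lambda x: x > 0)
f = strict(lambda x: x + 1)
g = strict(lambda x: -x)
h = strict(lambda x: 2 * x)


def laws_hold():
    law2 = equal_on(compose(cond(p, f, g), h),
                    cond(compose(p, h), compose(f, h), compose(g, h)))
    law3 = equal_on(compose(h, cond(p, f, g)),
                    cond(p, compose(h, f), compose(h, g)))
    law4 = equal_on(cond(p, cond(p, f, g), h), cond(p, f, h))
    return law2, law3, law4
```

In a commutativity proof such rules are applied symbolically to rewrite one interpretation function into another; the sample-based check only shows how the forms behave.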

DATA MODEL MAPPING AND UNIFYING TECHNIQUE
The method of commutative data model mapping construction is oriented towards the solution of two different problems:
- construction of the canonical DM kernel extensions equivalent to internal data models, that is, construction and formal fixation of canonical DM level languages (DDL and DML) and their semantics
- development and verification of canonical DM level DML interpreters.
We need general, systematic procedures independent of data model types. The procedures fix the order of phases and the techniques used at each step by the person constructing the data model mapping.

STAGES OF UNIFYING CANONICAL MODEL (UCDM) SYNTHESIS
The UCDM contains arbitrary data models of different DBMSs reduced to their homogeneous equivalent representations.
The process of UCDM synthesis consists of the following stages:
1. selection of the set of source data models: $W = \{M_1, M_2, \ldots, M_n\}$
2. setting of a partial order on the set $W$ by the relation of inclusion of one data model into another ($\subseteq$)
3. UCDM kernel ($M_g$) selection. A "minimal" data model $M_i$ ($i = 1, 2, \ldots, n$) is selected so that $\forall j\ (1 \le j \le n)\ (M_{ri} \subseteq M_{rj})$, where $M_{ri}$ (or $M_{rj}$) denotes the mapping of the $M_i$ (or $M_j$) data model into a fixed reference data model (e.g. a relational one)
4. construction of extensions $M_{gj}$ of the kernel $M_g$ equivalent to $M_j$
5. UCDM construction as the union of all $M_{gj}$.
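
The stages above can be sketched as a small procedure. This is a deliberately simplified illustration: each data model is represented only by a hypothetical set of axiom schema labels relative to a reference model, inclusion is plain set inclusion, and the model names and labels in the example are invented.

```python
def included(a, b):
    """Model a is included into model b if a's axiom schemas are a subset of b's."""
    return a <= b


def select_kernel(models):
    """Stage 3: return the name of a model included into every model of the set, if any."""
    for name, axioms in models.items():
        if all(included(axioms, other) for other in models.values()):
            return name
    return None


def synthesize_ucdm(models):
    """Stages 3 to 5: pick the kernel, build per-model extensions, take their union."""
    kernel = select_kernel(models)
    if kernel is None:
        raise ValueError("no model is included into all others")
    # Extension of the kernel for model j: the axiom schemas j adds to the kernel.
    extensions = {name: axioms - models[kernel] for name, axioms in models.items()}
    ucdm = frozenset().union(*models.values())
    return kernel, extensions, ucdm


# Hypothetical source data models described by axiom schema labels.
W = {
    "relational": frozenset({"key"}),
    "network": frozenset({"key", "set_membership", "mandatory_member"}),
    "hierarchical": frozenset({"key", "parent_child"}),
}
```

For this invented set `W` the relational model is selected as the kernel, and the resulting unifying model contains the axiom schemas of all three models.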

NETWORK TO RELATIONAL DATA MODEL MAPPING EXAMPLE
- Formal definition of source (network subset) and target (relational) DMs
- Construction of database schema mapping
- Axiomatic extension of the relational DM
- Semantics of target DM DML interpretation by the source DML
- Verification of the commutativity of the DML mapping diagram
- Summary of results of the commutative DM mapping method application

SCHEMA OF THE PROCESS OF COMMUTATIVE DM MAPPING DESIGN
[Flowchart, steps in order; subscripts $t$, $s$ and $ts$ denote the target DM, the source DM and the extension of the target DM]
1. Formal definition of the target ($M_t$) and the source ($M_s$) DM by DMM tools (construction of the semantics functions for the DDL, $Ms : Sch \to B$, and for the DML, $Mo : O \to [B \to B]$).
2. Construction of the mapping of the source DDL into the target DDL (definition of the schema mapping $Sch_s \to Sch_t$; the state mapping $B_s \to B_t$ is injective).
3. Axiomatic extension of $M_t$ to $M_{ts}$: selection of axiom schemas, construction of the schema mapping from $Sch_s$ into $Sch_t$ extended with the axiom schemas, definition of the operational interpretation of axioms, checking of bijectivity of the state mapping $B_s \to B_{ts}$.
4. Definition of the DML semantics of $M_{ts}$: $Mo_{ts} : O_{ts} \to [B_{ts} \to B_{ts}]$. Expression of the operational semantics of the axioms in terms of functions in $Mo_{ts}$. Checking that the operational interpretation of the axioms is consistent with their semantics in $Mo_{ts}$.
5. Construction of the interpretation function for $M_{ts}$ DML statements: $Mo_s : P_s \to [B_s \to B_s]$.
6. Verification of commutativity of the DML statement mapping diagram (transformation of the source-side interpretation of a statement $o_{ts}$ into $Mo_{ts}(o_{ts})$).

CODASYL DATA MODEL SUBSET FORMAL DEFINITION
DDL definition
The following subset of the CODASYL DDL will be used:
Schema entry
  SCHEMA NAME IS name-of-schema
Record entry
  RECORD NAME IS record-name
  LOCATION MODE IS CALC USING identifier-1 [,identifier-2]... DUPLICATES NOT ALLOWED
Data subentry
  data-name TYPE IS { FIXED | FLOAT | CHARACTER [integer-1] }
Set entry
  SET NAME IS set-name
  OWNER IS record-name
Member subentry
  MEMBER IS record-name { MANDATORY AUTOMATIC | OPTIONAL MANUAL }
  DUPLICATES ARE NOT ALLOWED FOR identifier-1 [,identifier-2]...
  SET SELECTION IS THROUGH set-name OWNER IDENTIFIED BY KEY [identifier-1 EQUAL TO identifier-2]...
  [THEN THROUGH set-name OWNER IDENTIFIED BY [identifier-3 EQUAL TO identifier-4]...]

CODASYL DDL ABSTRACT SYNTAX
Database schema
  $Sch_n = Scn \times Ren \times Sen$   (product of schema names, record and set entries)
Record entry
  $Ren = Re^*$
  $Re = Rna \times Loc \times De$   (product of record names, location phrases, data subentries)
  $Rna = \mathbb{N}$
  $Loc = CALC \times Di^*$
  $Di = \mathbb{N} \times \mathbb{N}$
$Di$ is a data identifier, which consists of a record type name and a data element name.
$\mathbb{N} = \{1, 2, 3, \ldots\}$ is the domain of positive integers. Elements of $\mathbb{N}^*$ will be used to represent data structure names in the schema. In the sequence of numbers $n \in \mathbb{N}^*$ the first one usually denotes the record type, the next one the data element type.

CODASYL DDL ABSTRACT SYNTAX (II)
  $De = Dse^*$
  $Dse = Type$
  $Type = \{FIXED, FLOAT, CHARACTER\} + Char$
  $Char = CHARACTER \times \mathbb{N}$
The usage of a reserved word of the language in the abstract syntax denotes a domain and is defined as, e.g., $\{FIXED\}$.
Set entry
  $Sen = Se^*$
  $Se = Sna \times Owner \times Member$
  $Sna = \mathbb{N}$
  $Owner = Rna$
  $Member = Rna \times Tm \times Dup \times Sos$
  $Tm = \{MA, OM\}$   (MA, OM: MANDATORY AUTOMATIC, OPTIONAL MANUAL for short)
  $Dup = Di^*$
  $Sos = Calckey \times Sna \times Pseudo$
  $Calckey = (\mathbb{N} \times \mathbb{N})^*$
  $Pseudo = Di^*$

CODASYL DDL SEMANTIC DOMAINS
  $CHAR = \{'a', 'b', 'c', \ldots\}$
  $NR = \{$set of numbers representable in the database$\}$
  $DBK = \mathbb{N}$   (data base keys)
  $Vo = NR + CHAR^*$   (elementary values)
  $Vrec = [K \to Ds]$   (record values)
  $K = \{1, 2, \ldots, length(de)\}$
Here $de$ is the $s\text{-}De$ component of the $i$-th element of $ren$, and $ren = s\text{-}Ren(s)$ is the set of record entries in the schema $s$.
  $Ds = Dse \to Vo$, where $dse$ is the $j$-th component of $de$ (data subentry $j$)
  $Ds(dse) = (is\text{-}FIXED(s\text{-}Type(dse)) \to NR,\ is\text{-}FLOAT(s\text{-}Type(dse)) \to NR,\ is\text{-}CHARACTER(s\text{-}Type(dse)) \to CHAR,\ is\text{-}Char(s\text{-}Type(dse)) \to CHAR^{k},\ \bot)$, where $k$ is the second component of $s\text{-}Type(dse)$
  $Vr = [\mathbb{N} \to Vrec]$   (set of records)
  $Vs = [Nm \to No]$   (set of CODASYL sets)
  $Nm = \mathbb{N}$, $No = \mathbb{N}$
  $Ms_n : Sch_n \to Sdb_n$
  $Sdb_n = Rs \times Ss = [Rna \to Vr] \times [Sna \to Vs]$
Notation:
  $Vr_j = [\mathbb{N} \to Vrec]$   (set of instances of record type $j \in Rna$)
  $Vs_i = [Nm_i \to No_i]$   (set of instances of set type $i \in Sna$)

FRAGMENTS OF CODASYL DML DEFINITION
Abstract syntax
  $Prog = On^*$
  $On = Sts + Ers + Mods + Con + Dsc$   (DML statements: store, erase, modify, connect, disconnect)
  e.g., $Ers = ERASE \times Rna$
DML semantics domains
  $Suwa = [Rna \to Vuwa]$   (user working area)
  $Vuwa = \mathbb{N} \times Rna \times Vrec$
  $Cis = [(Rna + Sna + Cri) \to Kb \times Rna]$   ($Cri$: indicator of the current record of the run-unit; $Cis$: currency state indicators)
  $Kb = \mathbb{N}$   (data base key)
  $Spb = Cis \times Suwa$
  $B_n = Sdb_n \times Spb$   (set of states)
DML primitive functions
  $ckey : Rna \times Vl \to \mathbb{N}$, $Vl = Vo^*$
  $ownr : Rna \to \mathcal{P}(Sna)$   (the set of set types in which record type $i \in Rna$ is defined as an owner)
DML statements semantics
Erase statement
  $Mo_n(ers) = Cis[Cri / \bot, \bot] \circ erase(n, h)$, $ers \in Ers$
  $h$ is the first component of $Cis(Cri)$: the current record of the run unit
  $erase(n, h) = Vr_n[h / \bot] \circ \bigcirc_{i \in ownr(n)} \big( \bigcirc_{m \in Vs_i^{-1}(h)} ( s\text{-}Tm \circ s\text{-}Member(set) = MA \to erase(s\text{-}Rna \circ s\text{-}Member(set),\ m),\ Vs_i[m / \bot] ) \big)$
  $set$ is the $i$-th component of $s\text{-}Sen(s)$: the entry of set type $i$ in the schema

RELATIONAL MODEL FRAGMENTS FORMAL DEFINITION
DDL definition
Simple relational model schema definition language:
  RELATIONAL SCHEMA <schema name>
  DOMAIN <domain description> { <domain description> }
  DATABASE VAR <relation description> { <relation description> }
  END
  <domain description> ::= <identifier> : <type>
  <type> ::= INTEGER | REAL | CHAR | CHAR(<integer>)
  <relation description> ::= <identifier> : RELATION <list of domain identifiers> KEY <identifier list> END
DML statements:
Cursor definition
  (<label>) CURSOR <relation identifier> COND <selection expression>
Delete statement (enforces deletion of the current tuple defined by the cursor with the same label)
  DELETE <label>

RELATIONAL MODEL FRAGMENTS FORMAL DEFINITION (II)
DDL abstract syntax
  $Sch_r = Rsn \times Domain \times Rd$
  $Domain = Rtype^*$
  $Rtype = \{INTEGER, REAL, CHARACTER\} + Char$
  $Char = CHARACTER \times \mathbb{N}$
  $Rd = Rel^*$
  $Rel = Rena \times Rkey \times Rdom$
  $Rkey = \mathbb{N}^*$
  $Rdom = \mathbb{N}^*$
DDL semantics domains
  $Vrel = [\mathbb{N} \to Vt]$
  $Vt = [K \to Dos]$
  $K = \{1, 2, \ldots, length(dom)\}$
  $dom$ is the $s\text{-}Rdom$ component of the $i$-th element of $rd$, $rd = s\text{-}Rd(rs)$
  $d$ is the $j$-th component of $dom$ ($d$: name of attribute $j$)
  $Dos = Rdom \to Vo$
  $tp$ is the $d$-th component of $s\text{-}Domain(rs)$, $rs$: name of the relational database schema
  $Dos(d) = (is\text{-}INTEGER(tp) \to NR,\ is\text{-}REAL(tp) \to NR,\ is\text{-}CHARACTER(tp) \to CHAR,\ is\text{-}Char(tp) \to CHAR^{k},\ \bot)$, where $k$ is the second component of $tp$
  $Res = [Rena \to Vrel]$
  $Sdb_r = Res$
  $Ms_r : Sch_r \to Sdb_r$

RELATIONAL MODEL FRAGMENTS FORMAL DEFINITION (III)
Fragments of DML definition
Abstract syntax
  $Prog = Or^*$
  $Or = Pt + Upd + Dlt$   (PUT, UPDATE, DELETE statements)
  $Dlt = \mathrm{DELETE} \times Cl$   ($Cl$: cursor label)

DML semantic domains:
  $Isa = [Rena \to \mathbb{N} \times Vt]$   (tuple with a unique number in the working area)
  $Cp = [Cl \to \mathbb{N} \times Rena]$   (state of cursors: a cursor label defines the number and type of the current tuple of the cursor)
  $Spb = Cp \times Isa$   (program state with respect to the DBMS)
  $Br = Sdbr \times Spb$

DML statement semantics. Delete statement ($dlt \in Dlt$):
  $h = \pi_2(dlt)$   (the value of $h$ is the cursor label $cl \in Cl$)
  $t = \pi_1 \circ Cp(h)$   (current tuple of the cursor)
  $Mor(dlt) = Cp[h/nxt(i, t), i] \circ Vrel_i[t/\bot]$
Here $i = \pi_2 \circ Cp(h)$, and $nxt : Rena \times \mathbb{N} \to \mathbb{N}$ is a primitive function giving, for any tuple number of relation $i \in Rena$, the number of the next tuple in the relation.

CONSTRUCTION OF DATABASE SCHEMA MAPPING

The DB schema mapping $Schn \to Schr$ is represented by the following collection of mappings:

1. Mapping of scalar data types: $tmap : Type \to Rtype$.
2. Bijective mapping of record type names into relation type names: $nmap : [Rna \to Rena]$.
3. Correspondence of relation domain names to the data element names of the record types: $syn : [\mathbb{N}^{\star} \to \mathbb{N}^{\star}]$.
4. Correspondence of each record type to a set of selective keys identifying the selective paths that lead to the record through sets (owner and member record instances in the paths). The schema of each relation includes the attributes mapped from the data elements of the respective record type and additional attributes taken from all selective keys of the record type. Such selective keys make it possible to define the functional dependency (referential integrity) of relation type A on relation type B, which correspond in the network schema to record type A' (owner in some set) and record type B' (member in the same set). In this way the selective keys of each record type in the network schema are put into correspondence with the foreign keys of the respective relation type in the relational schema.
5. Function defining a key for each relation in the schema: $rk : Rena \to (\mathbb{N} \times \mathbb{N})^{\star}$.
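The five mappings above can be illustrated with a small Python sketch. It is only an illustration under simplifying assumptions, not the lecture's formal construction: the class names `RecordType`, `SetType`, `Relation` and the function `map_schema` are hypothetical, names are strings rather than elements of $\mathbb{N}^{\star}$, and the selective key of a set is assumed to be the CALC key of its owner record type.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class RecordType:
    name: str
    elements: tuple[str, ...]   # data element names
    calc_key: tuple[str, ...]   # CALC key of the record type


@dataclass(frozen=True)
class SetType:
    name: str
    owner: str                  # owner record type name
    member: str                 # member record type name
    membership: str             # "MA" (mandatory automatic) or "OM" (optional manual)


@dataclass
class Relation:
    name: str
    attributes: list[str]
    key: tuple[str, ...]
    # set name -> (owner relation, selective-key attributes, dependency class)
    foreign_keys: dict[str, tuple[str, tuple[str, ...], str]] = field(default_factory=dict)


def map_schema(records, sets):
    """Map a network schema to relations whose selective keys act as foreign keys."""
    by_name = {r.name: r for r in records}
    relations = {}
    for r in records:
        # nmap: one relation per record type; syn: attributes mirror the data elements
        relations[r.name] = Relation(r.name, list(r.elements), r.calc_key)
    for s in sets:
        owner = by_name[s.owner]
        member_rel = relations[s.member]
        # selective key: the owner's key attributes, prefixed with the set name
        sel_key = tuple(f"{s.name}_{a}" for a in owner.calc_key)
        member_rel.attributes.extend(sel_key)
        ftype = "TOTAL" if s.membership == "MA" else "PARTIAL"
        member_rel.foreign_keys[s.name] = (s.owner, sel_key, ftype)
    return relations
```

For a set WORKS with owner DEPT and member EMP, the EMP relation receives the extra attribute `WORKS_dno`, which plays the role of the selective (foreign) key.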
ABSTRACT SYNTAX OF TRANSFORMATIONAL SCHEMA LANGUAGE

  $Schnr = Rdm \times Nmap \times Syn \times Rd$
  $Rdm = Domain$
  $Rd = Rel^{\star}$   (definition of relations in the transformational schema)
  $Rel = Rena \times Rkey \times Rdom \times Rna \times Sel$

Here $Rkey = \mathbb{N}^{\star}$ is the key of the relation, $Rdom = \mathbb{N}^{\star}$ is the attribute list, and $Rna$ gives the names of the record types corresponding to the relation types.

  $Sel = (Rim \times Ftype \times Rsna \times Pak \times Pseudo)^{\star}$

$rs$ denotes the relational schema and $\pi_k$ selects the $k$-th component. To each relation that is functionally dependent on relation $s\text{-}Rena \circ \pi_i \circ s\text{-}Rd(rs)$, the component $\pi_j(sel)$ of $sel = s\text{-}Sel \circ \pi_i \circ s\text{-}Rd(rs)$ is put into correspondence. In $sel$:

- $s\text{-}Rim \circ \pi_j(sel)$ denotes the name of the relation that is functionally dependent on relation $i$;
- $s\text{-}Ftype \circ \pi_j(sel)$ is the class of the functional dependency, which may be TOTAL, corresponding to MANDATORY AUTOMATIC set membership, or PARTIAL, corresponding to OPTIONAL MANUAL set membership;
- $s\text{-}Rsna \circ \pi_j(sel)$ is the set name;
- $s\text{-}Pseudo \circ \pi_j(sel)$ is the pseudokey of this functional dependency: a collection of attributes whose values are sufficient to identify the tuples of relation $i$ corresponding to member type record instances. The pseudokey identifies a tuple among the other tuples of relation $i$ associated with one and the same tuple (corresponding to the owner record) of relation $s\text{-}Rim \circ \pi_j(sel)$, which corresponds to the owner record type in the set $s\text{-}Rsna \circ \pi_j(sel)$;
- $s\text{-}Pak \circ \pi_j(sel)$ is the selective path together with the selective and secondary keys.

ABSTRACT SYNTAX OF TRANSFORMATIONAL SCHEMA LANGUAGE (II)

Definitions of the $Sel$ components:
  $Rim = Rena$
  $Ftype = \{\mathrm{TOTAL}, \mathrm{PARTIAL}\}$
  $Rsna = Sna$
  $Pak = Cs \times Pa \times Kc$
  $Pa = Fst \times Nxtr$   (selective path)
  $Fst = Rna$
  $Nxtr = Rna \times Sna$
  $Cs = Kc = Fkey \times Nst$   ($Cs$: selective key, $Kc$: secondary key)
  $Fkey = Dn^{\star}$
  $Nst = Pseudo$
  $Pseudo = Dn^{\star}$

It is now obvious that the schema mapping function defined so far is injective, not bijective as required.
E.g., in a network database only those states are allowed in which every member record is connected to only one owner record in any given set. There is no corresponding constraint on the relational level.

AXIOMATIC EXTENSION OF RELATIONAL DM

By analysing the differences between the sets of states of the source and target data models, using the formal description of the DDL schema mapping, the axiom schemas (rn) are introduced to express additional logical dependencies of the target DM. An axiom should express some atomic (elementary) fact. In our example, axioms are introduced that express, on the level of the target data model, dependencies equivalent to CODASYL sets.

System of axioms of total functional dependency

Main axiom: $R_i(Cs_j) \to R_j(Kc_j)$ expresses the fact of total functional dependency of relation $R_j$ on relation $R_i$. $Cs_j$ denotes the attributes of $R_i$ constituting a selective key (which is a foreign key of $R_j$). $Kc_j$ is a secondary key of $R_j$; the attributes of $Kc_j$ are in one-to-one correspondence with the attributes of $Cs_j$.

Invariants of $R_i$:
- UNIQUE $Es_j$: the collection of attribute values given in the UNIQUE axiom should uniquely identify a tuple of the relation $R_i$.
- OBLIGATORY $Es_j$: the collection of attributes given in the OBLIGATORY axiom should have nonnull values.
$Es_j$ is a key of $R_i$ formed as the union of the attributes of $Cs_j$ and the attributes of the $R_i$ pseudokey in this functional dependency.

Invariants of $R_j$:
- UNIQUE $Kc_j$
- OBLIGATORY $Kc_j$

AXIOMATIC EXTENSION OF RELATIONAL DM (II)

System of axioms of partial functional dependency

Main axiom: $R_i(Cs_j) \Rightarrow R_j(Kc_j)$ expresses the fact of partial functional dependency of relation $R_j$ on relation $R_i$.
Invariants of $R_i$:
- UNIQUE NONNULL $Es_j$: the collection of attribute values given in the UNIQUE NONNULL axiom should uniquely identify a tuple of the relation whenever all such values are nonnull.
- SYNCHRONOUS NONNULL $Cs_j$: the attributes of $Cs_j$ should either all have nonnull values or all "disappear" (become null) together.

Invariants of $R_j$:
- UNIQUE NONNULL $Kc_j$

The schema mapping should now be extended so that it maps $Schn$ into $Schr$ together with the set of axioms (rn). The mapping function then becomes bijective and the schema mapping diagram becomes commutative. To show this, it is sufficient to consider every case in which the injective mapping violates bijectivity and to find, for each such case, a set of axioms removing the violation.

OPERATIONAL SEMANTICS OF THE AXIOMS

For each axiom, each DML statement should get a description of the axiom-induced minimal actions sufficient to preserve the axiom as a DB invariant.

Total functional dependency $R_i(Cs_j) \to R_j(Kc_j)$:
  DELETE($R_i$, $t_i$): no induced action.
  DELETE($R_j$, $t_j$): delete all $t_i \in R_i$ such that $t_i[Cs_j] = t_j[Kc_j]$. Apply such deletions recursively to the hierarchy of functional dependencies having $R_i$ as a root.
UNIQUE $Es_j$, ...:
  DELETE($R_i$, $t_i$): no induced action.
  DELETE($R_j$, $t_j$): no induced action.
Partial functional dependency $R_i(Cs_j) \Rightarrow R_j(Kc_j)$:
  DELETE($R_i$, $t_i$): no induced action.
  DELETE($R_j$, $t_j$): in all $t_i \in R_i$ such that $t_i[Cs_j] = t_j[Kc_j]$, the values of all attributes in $Cs_j$ should become null.
UNIQUE NONNULL $Es_j$, ...:
  DELETE($R_i$, $t_i$): no induced action.
  DELETE($R_j$, $t_j$): no induced action.

SEMANTICS OF DML STATEMENTS OF EXTENDED RELATIONAL DM

Delete statement:
  $Morn(dlt) = Idlt(n, t)$
  $Idlt(n, t) = Cp[h/nxt(n, t), n] \circ rmv(n, t)$
Here $h = \pi_2(dlt)$ is the cursor index, $n = \pi_2 \circ Cp(h)$ is the relation name and $t = \pi_1 \circ Cp(h)$ is the number of the current tuple of the cursor ($\pi_k$ selects the $k$-th component).

  $rmv(n, t) = Vrel_n[t/\bot] \circ \bigcirc_{n1 \in Iim(n)} \bigcirc_{m=1}^{length(sel1)} \Big( s\text{-}Rim \circ \pi_m(sel1) = n \to \bigcirc_{p \in seek(n1, con)} \big( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{TOTAL} \to rmv(n1, p),\; ( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{PARTIAL} \to ch(p, n1, cs1, \bot),\; b ) \big),\; b \Big)$

Here:
- $Iim(R_j) = \{R_i \mid R_j \text{ is functionally dependent on } R_i\}$;
- $sel1 = s\text{-}Sel \circ \pi_{n1} \circ s\text{-}Rd(rs)$;
- $ch(t, n, a, v)$ is a function modifying tuple $t$ of relation $n$ so that the values of the attributes of $t$ given by the list $a \in Dn^{\star}$ are replaced by $v \in Vo^{\star}$;
- $seek : Rena \times Con \to \mathbb{P}(\mathbb{N})$ puts into correspondence to each search condition applied to the given relation the subset of the numbers of its tuples satisfying the condition;
- a condition $con \in Con$ is expressed by the function $cond : Dn^{\star} \times Vo^{\star} \to Con$, whose arguments are lists of data element names $id \in Dn^{\star}$ and the data values $vo \in Vo^{\star}$ corresponding to them. In the expression for $rmv$, $con$ stands for the list of selective key attribute names of relation $n1$ and the corresponding values of the secondary key attributes of relation $n$.

FORMAL DEFINITION OF AXIOM OPERATIONAL SEMANTICS

Total functional dependency $R_i(Cs_j) \to R_j(Kc_j)$:
  DELETE($R_i$, $t_i$): no induced action.
  DELETE($R_j$, $t_j$): $rmv(R_j, t_j)$.
Partial functional dependency $R_i(Cs_j) \Rightarrow R_j(Kc_j)$:
  DELETE($R_i$, $t_i$): no induced action.
  DELETE($R_j$, $t_j$): $ch(p, n1, cs1, \bot)$ in the expression for $rmv$.

The set of axioms introduced is operationally complete with respect to the set of statements of the $Mrn$ data model. This means that each database state modification made by some DML statement of $Mrn$ is either caused by the semantics of this statement in the target data model or induced by some axiom of the target DM extension included in the definition of the data type that is the argument of the statement. The system of axioms extending $Mr$ and the semantics of the $Mrn$ DML are now in well-defined correspondence.

FUNCTION OF TARGET DML STATEMENT INTERPRETATION BY THE SOURCE DM DML

For each $orn \in Orn$ the interpretation functions by means of the $Mn$ DML should be defined. These functions are based on the mappings defined above. Only the function for the delete statement is considered here. The functions are given directly in the form
  $Morn : Or \to [Bn \to Bn]$
  $Mornn(dlt) = Cp[h/nxtn(\hat{n}, t)] \circ Mon(\mathrm{ERASE}\ \hat{n}) \circ Cis[Cri/t, \hat{n}]$
Here $\hat{n} = nmap^{-1}(n)$ denotes the respective network denotation obtained by the inverse name mapping.

VERIFICATION OF THE COMMUTATIVITY OF THE DML MAPPING DIAGRAM

Commutativity of a DML mapping diagram is a necessary condition to guarantee exact correspondence between the definition of the semantics of the extended target data model DML and its interpretation by the source data model DML. To verify the commutativity of the DML mapping diagram the following constructive approach is used:

1. For each $orn \in Orn$, on the basis of the function $Mornn(orn)$, a verification function $Vorn(orn)$ is constructed that expresses actions equivalent to $Mornn(orn)$ by means of the $Mrn$ data model. This transformation is done in the following way:
   - each argument of the function $Mornn(orn)$ that is a semantic domain of the source data model is replaced by the semantic domain of the target data model whose components are in one-to-one correspondence with the components of that source DM domain with respect to the mappings defined above;
   - every function $frn$ in the definition of $Mornn(orn)$ is replaced by a function $vfrn$ expressing modifications, equivalent to those of $frn$, of the objects of $Mrn$ that are in one-to-one correspondence with the $Mn$ objects modified by $frn$.
   It is expected that, due to this construction, the functions $Mornn(orn)$ and $Vorn(orn)$ applied to any pair of equivalent states of the source and target databases again produce equivalent database states.
2. The equivalence of the functions $Morn(orn)$ and $Vorn(orn)$ should be demonstrated. To achieve this, the function $Vorn(orn)$ is transformed into the function $Morn(orn)$ by means of rules of equivalent function transformation.
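The axiom-induced delete semantics described above (cascade for a total functional dependency, nullification of the selective key for a partial one) can be sketched operationally in Python. This is an informal illustration, not the formal $rmv$ function: the dictionary-based database layout, the tuple format of `deps` and the function name are assumptions made for this example only.

```python
def rmv(db, deps, n, t):
    """Delete tuple number t of relation n and apply the axiom-induced actions.

    db   : {relation name: {tuple number: {attribute: value}}}
    deps : list of (member relation, owner relation, ftype, cs, kc), where cs is
           the selective key of the member and kc the secondary key of the owner
    """
    deleted = db[n].pop(t, None)          # corresponds to Vrel_n[t / undefined]
    if deleted is None:
        return
    for member, owner, ftype, cs, kc in deps:
        if owner != n:
            continue
        key = tuple(deleted[a] for a in kc)
        # seek: numbers of member tuples whose selective key equals the owner's key
        hits = [p for p, row in db[member].items()
                if tuple(row[a] for a in cs) == key]
        for p in hits:
            if p not in db[member]:       # already removed by an earlier cascade
                continue
            if ftype == "TOTAL":
                rmv(db, deps, member, p)  # total f.d.: delete dependent tuples recursively
            else:
                for a in cs:              # partial f.d.: selective key becomes null
                    db[member][p][a] = None
```

Deleting a DEPT tuple with a TOTAL dependency from EMP removes the matching EMP tuples, and any relation that depends only PARTIALLY on EMP keeps its tuples with the selective key set to null.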
DEMONSTRATION OF THE COMMUTATIVITY OF THE DML MAPPING DIAGRAM

Verification function construction:
  $Vorn(dlt) = Cp[h/nxtn(\hat{n}, t)] \circ verase(n, h)$
  $verase(\hat{n}, t) = Vrel_n[t/\bot] \circ \bigcirc_{n1 \in Iim(\hat{n})} \bigcirc_{m=1}^{length(sel1)} \Big( s\text{-}Rim \circ \pi_m(sel1) = n \to \bigcirc_{p \in seek(n1, con)} \big( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{TOTAL} \to verase(n1, p),\; ch(p, n1, cs1, \bot) \big),\; b \Big)$

Function transformation. In $Morn(dlt)$ the function $rmv$ is transformed, taking into account that $s\text{-}Ftype \circ \pi_m(sel1)$ may take only two values, TOTAL or PARTIAL:
  $rmv(n, t) = Vrel_n[t/\bot] \circ \bigcirc_{n1 \in Iim(n)} \bigcirc_{m=1}^{length(sel1)} \Big( s\text{-}Rim \circ \pi_m(sel1) = n \to \bigcirc_{p \in seek(n1, con)} \big( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{TOTAL} \to rmv(n1, p),\; ( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{TOTAL} \to b,\; ch(p, n1, cs1, \bot) ) \big),\; b \Big)$

To the function
  $( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{TOTAL} \to rmv(n1, p),\; ( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{TOTAL} \to b,\; ch(p, n1, cs1, \bot) ) )$
rule (4) for equivalent function transformation may be applied:
  $(p \to (p \to f, g), h) \equiv (p \to f, h)$
In our case we have
  $(p \to h, (p \to f, g)) \equiv (p \to h, g)$.
After application of this rule we obtain:
  $rmv(n, t) = Vrel_n[t/\bot] \circ \bigcirc_{n1 \in Iim(n)} \bigcirc_{m=1}^{length(sel1)} \Big( s\text{-}Rim \circ \pi_m(sel1) = n \to \bigcirc_{p \in seek(n1, con)} \big( s\text{-}Ftype \circ \pi_m(sel1) = \mathrm{TOTAL} \to rmv(n1, p),\; ch(p, n1, cs1, \bot) \big),\; b \Big)$
It is now clearly seen that the functions $verase$ and $rmv$ are the same.

AXIOMATIC EXTENSION OF RELATIONAL DM EQUIVALENT TO CODASYL DM

Simple axioms (for relation $R_i$; $A_i$, $A_j$, $A_k$ are collections of attributes of $R_i$):
1. Axiom of uniqueness: UNIQUE $A_i$
2. Axiom of constancy: CONSTANT $A_i$
3. Axiom of definiteness: OBLIGATORY $A_i$
4. Axiom of conditional uniqueness: UNIQUE NONNULL $A_i$
5. Axiom of order: $R_i$ [RESTRICTED BY $A_i$] IS ORDERED [<order>] [BY <direction> $A_j$ {<direction> $A_k$}]

Compound axioms (for relations $R_i$, $R_j$):
6. Axiom of total functional dependency (f.d.): $R_j(A_j) \to R_i(A_i)$
7. Axiom of partial functional dependency (p.f.d.): $R_j(A_i) \Rightarrow R_i(A_i)$
8. Axiom of partial strong functional dependency: $R_i(A_i) \overset{S}{\Rightarrow} R_i(A_i)$
9. Axiom of partial functional dependency with initial connection: $R_j(A_j) \overset{L}{\Rightarrow} R_i(A_i)$

SATURATION OF UCDM

An important property of the synthesis process is the relatively fast saturation of the canonical data model: a point is reached at which taking a new source data model into consideration introduces no new axioms on the target DM level. The resulting canonical data model is saturated with respect to the known DMs. This circumstance allows the resulting model to be considered a unifying one.

UCDM facilities against data models 1-12 (one * per data model that has the facility)

Kernel of canonical data model (Normalized relations, Hierarchical relations, Positional aggregates): * * * * * * * * * * * * * * * * * * * *

Kernel extension:
  Axiom of uniqueness                          * * * * * * *
  Axiom of constancy                           * * * * * *
  Axiom of definiteness                        * * * * * * *
  Axiom of conditional uniqueness              *
  Axiom of conditional constancy               *
  Axiom-function                               *
  Axiom-partial function                       *
  Axiom of order                               * * * * * * *
  Axiom-predicate                              *
  Axiom of total f.d.                          * * * * * * *
  Axiom of partial f.d.                        * * *
  Axiom of strong p.f.d.                       *
  Axiom of p.f.d. with initial connection      *
  Axiom of total f.d. with backward connection *
  Axiom of p.f.d. with backward connection     *
  Axiom of stable total f.d. (s.t.f.d.)        *
  Axiom of s.t.f.d. with backward connection   *
  Axiom of duplex dependency                   * *

Data models denotation:
  1. Codd relational DM (1970)
  2. CODASYL network DM
  3. IMS Hierarchical DM
  4. IDS Network DM
  5. DM of DBMS PALMA
  6. ADABAS DM
  7. Descriptor DM of DBMS BASIS
  8. DM of DBMS POISK
  9. TOTAL network DM
  10. Hierarchical DM of DBMS INES
  11. Binary relational DM
  12. Codd relational DM (1979)

SUMMARY

The problem of data model heterogeneity in a multidatabase environment has been considered.
The data model axiomatic extension principle, the data model commutative mapping principle and the unifying canonical data model synthesis principle, all based on the notion of data model equivalence, were proposed as the basis for DM mappings. These principles are applied in a methodology for commutative data model mapping construction. This results in the following main steps of DM mapping design:

- formal definition of the target and source data model semantics (on the basis of an abstract data metamodel);
- formal definition and verification of the rules of database schema mapping;
- definition of the semantics of the DML operators of the extended target data model;
- checking the consistency of the operational semantics of the axioms with the semantics of the DML operators of the extended data model;
- definition and verification of the algorithms interpreting the DML operators of the extended data model by means of the source data model.

SUMMARY (II)

The methods introduced provide for the construction of canonical data model kernel extensions equivalent to the internal data models, and for the development and verification of the algorithms of the target-level DML interpreters. Systematic application of the approach to the design of the architecture of data model transformers (wrappers) yields important application features, including:

- a unifying canonical data model can be synthesized that equivalently represents the data models of various DBMSs; due to that, updates of the integrated database are supported;
- the spectrum of data models embraced by the heterogeneous environment may include structured as well as unstructured data models;
- a technology for developing generic, parameterized application programs independent of the DBMS can be applied.
DATA MODEL MAPPING AS THE DATA MODEL REFINEMENT

- Model-based notations as metamodels
  - Abstract Machine Notation
  - Structure of the Abstract Machines and the respective Proof Obligations
  - Refinement of Abstract Machines and proofs of the refinement properties
  - AMN as a metamodel
- Data model refinement
- Mapping diagrams based on the concept of refinement and their properties
- Commutative data model mapping construction
- Example of the relationship type mapping for different object models
- Summary and discussion

MODEL-BASED SPECIFICATIONS

Static aspects include the states a system can occupy and the invariant relationships (constraints) that should be preserved as the system moves from state to state.

Dynamic aspects include the possible operations and the changes of state that happen.

The specification of an operation (function) consists of a definition of the properties and relationships that the state transitions caused by the operation should satisfy. For that purpose, predicates relating the values of state variables before and after the operation (expressing mixed pre- and post-conditions) are defined.

The notion of execution of a model-based specification consists of the proof of the initial consistency of the model and of the preservation of the invariants by the operations.

Programs are developed from specifications in a provable way, by properly concretizing the abstract data types and operations of the specification with concrete data types and programs that satisfy strict concretization conditions.

Starting with VDM, model-theoretic methods evolved into the Z and Object-Z notations, RAISE and J.-R. Abrial's Abstract Machines technology, which have reached the level of "industrial strength" methods supported by proprietary dedicated program packages.

ABSTRACT MACHINE NOTATION

AMN is a notation for expressing specifications. The elements of this notation, put together, form a pseudo-programming language.
AMN specifications are intended for mathematical analysis: each construct of the notation receives a precise mathematical definition.

AMN serves multiple purposes, ranging from the specification of requirements to implementation and to the specification of existing components. More abstract and more imperative styles of using the notation are distinguished, suited respectively to the "what" domain characterizing specifications and to the "how" domain characterizing programs.

A central feature of the notation is the abstract machine. This is a modularization concept related to such notions as class in SIMULA, abstract data type in CLU, package in ADA, etc. Abstract machines make it possible to organize large specifications as independent fragments with well-defined interfaces.

SPECIFICATIONS IN B

An exact axiomatic definition (specification) of the properties of the modeled entity in AMN may be analysed formally. An AMN program is used for proving the correctness of definitions and for symbolic transformation of the initial program (composition, refinement, etc.).

An abstract machine definition contains the machine states and the operations that make it possible to read and change these states. States are defined by variables determining the state components and by invariants. Operations describe the properties and relationships that must be satisfied during a change of state within the limits of the invariants.

To express logical assertions relating the values of state variables before and after the execution of an operation, AMN uses the calculus of substitutions. Properties of operations are expressed in terms of predicate transformers, which associate with a post-condition of Op its weakest pre-condition. Generalized substitutions (the operators of this calculus) may be considered as abstract machine commands.

To prove that an operation preserves the invariants, every invariant is considered as a post-condition to which the operation (i.e., a predicate transformer) is applied.
This application results in a new predicate, which must be proved too.

IDEA OF A GENERALIZED SUBSTITUTION

The commands of AMN are generalized substitutions. $[x := E]P$ denotes the predicate obtained by substituting the expression $E$ for all free occurrences of $x$ in $P$. For the invariant $I$ of a type and a substitution $S$ corresponding to an operation, the following proof obligation should hold:
  $I \Rightarrow [S]I$

Every generalized substitution $S$ defines a predicate transformer associating with a post-condition $R$ its weakest pre-condition $[S]R$, which guarantees that $R$ holds after the operation is executed.

The operation insert for IntSet has been specified using the following logical statement (abridged):
  $s \in \mathbb{P}(\mathbb{N}_1) \Rightarrow \forall s' \cdot (s' = s \cup \{e\} \Rightarrow s' \in \mathbb{P}(\mathbb{N}_1))$
Using substitution, this statement can be rewritten without the primed variables and quantifiers:
  $s \in \mathbb{P}(\mathbb{N}_1) \Rightarrow [s := s \cup \{e\}]\, s \in \mathbb{P}(\mathbb{N}_1)$

Every generalized substitution defines a rewriting rule transforming the predicate that follows it into a pre-condition.

GENERALIZED SUBSTITUTIONS OF AMN

The kinds of generalized substitutions are listed below.

Multiple substitution:
  $[x, \ldots, y := E, \ldots, F]R \Leftrightarrow R'$
where $R'$ is $R$ in which the free occurrences of $x, \ldots, y$ are simultaneously replaced by $E, \ldots, F$ respectively. This means (with $z$ a fresh variable):
  $[x, y := E, F]R \Leftrightarrow [z := F][x := E][y := z]R$
The following notation is more readable: $x := E \parallel y := F$.

Empty substitution:
  $[skip]R \Leftrightarrow R$

Pre-conditioned substitution:
  $[P \mid S]R \Leftrightarrow P \wedge [S]R$
$S$ should be applied only if $P$ holds. The invariant preservation can only be proved under the extra hypothesis of the pre-condition:
  $I \wedge P \Rightarrow [P \mid S]I$

Guarded substitution:
  $[P \Longrightarrow S]R \Leftrightarrow (P \Rightarrow [S]R)$
For a pre-conditioned substitution it is necessary to prove $P$ in order to establish a post-condition. For a guarded substitution $P$ is assumed. This means that when $P$ does not hold, the substitution is said to be non-feasible (the implication $P \Rightarrow [S]R$ can then establish anything).
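The substitutions listed above can be read as functions from post-conditions to pre-conditions. The following Python sketch illustrates this reading under simplifying assumptions: a state is a dictionary of variable values, a predicate is a Boolean function on states, and the function names (`assign`, `multi`, `skip`, `pre`, `guard`) are chosen for this example and are not part of AMN.

```python
def assign(var, expr):
    """[x := E]R : R evaluated in the state where x holds the value of E."""
    return lambda R: lambda s: R({**s, var: expr(s)})


def multi(assignments):
    """[x, y := E, F]R : all right-hand sides are evaluated in the old state."""
    return lambda R: lambda s: R({**s, **{v: e(s) for v, e in assignments.items()}})


def skip():
    """[skip]R  <=>  R."""
    return lambda R: R


def pre(P, S):
    """[P | S]R  <=>  P and [S]R."""
    return lambda R: lambda s: P(s) and S(R)(s)


def guard(P, S):
    """[P ==> S]R  <=>  (P implies [S]R)."""
    return lambda R: lambda s: (not P(s)) or S(R)(s)
```

For the insert operation, `pre(lambda st: st["e"] >= 1, assign("s", lambda st: st["s"] | {st["e"]}))` applied to the invariant "all elements of s are at least 1" yields a predicate that holds exactly in the states where the invariant is re-established. With `guard` in place of `pre`, the same predicate also holds when the guard is false, which reflects non-feasibility.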
GENERALIZED SUBSTITUTIONS OF AMN (II)

Bounded choice substitution:
  $[S \mathbin{[]} T]R \Leftrightarrow [S]R \wedge [T]R$
"$S$ choice $T$" expresses bounded non-determinism. The implementor of the specification is free to implement either the operation corresponding to substitution $S$ or the one corresponding to substitution $T$. Either substitution must establish the post-condition $R$; that is why the conjunction sign is used in the predicate.

Unbounded choice substitution:
  $[@z \cdot S]R \Leftrightarrow \forall z \cdot [S]R$
This is unbounded choice, i.e. unbounded non-determinism. The implementor may choose any specific value of $z$ for a future implementation of the operation. The conjunction appearing in the case of the bounded choice substitution is generalized to the universal quantifier.

SYNTACTIC STRUCTURE OF THE ABSTRACT MACHINE

machine Identifier [(Identifier, ..., Identifier)]
    An AM has a name and may have formal parameters (simple scalars or non-empty finite sets).
sets GivenSet ... GivenSet
    The sets in the sets clause constitute the basis of its type system.
variables Variables
    An AM has a number of variables, which should obey a certain number of predicates forming together the invariant of the machine.
invariant Predicate
    The invariant makes it possible to type each variable set-theoretically.
initialization Substitution
    An AM also has an initialization, which is a substitution.
operations Operation ... Operation
    An abstract machine has a number of operations defined as:
      Variable <- Identifier [(Variable)] = Substitution
      Identifier [(Variable)] = Substitution
end

Other clauses: the constants clause declares identifiers for read-only use within the operations; the properties clause determines logical properties of the sets and constants.
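The two choice substitutions defined above can also be sketched as predicate transformers in Python. The representation (states as dictionaries, predicates as Boolean functions) and the names `choice`, `any_of` and `assign` are assumptions of this example; the unbounded choice is approximated by quantifying over a finite collection of values.

```python
def assign(var, expr):
    """[x := E]R : R evaluated in the state where x holds the value of E."""
    return lambda R: lambda s: R({**s, var: expr(s)})


def choice(S, T):
    """[S [] T]R  <=>  [S]R and [T]R."""
    return lambda R: lambda s: S(R)(s) and T(R)(s)


def any_of(var, values, S):
    """[@z . S]R  <=>  for all z . [S]R, with z ranging over a finite collection here."""
    return lambda R: lambda s: all(S(R)({**s, var: z}) for z in values)
```

With `inc` and `dec` as assignments `x := x + 1` and `x := x - 1`, `choice(inc, dec)` establishes `x >= 0` only from states with `x >= 1`, since the implementor may pick either branch.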
PROOF OBLIGATIONS OF AN ABSTRACT MACHINE

machine Identifier
variables z
invariant R
initialization T
operations
    Operation_name = pre L then S end
end

Proof obligation templates:
  $\exists z \cdot R$
  $[T]R$
  $R \wedge L \Rightarrow [S]R$

The first proof obligation is just the existence proof obligation for the variables. The second one concerns the establishment of the invariant by the initialization. The last one concerns the preservation of the invariant by each operation.

REFINEMENT OF ABSTRACT MACHINES

Refinement is the step-wise restriction of abstract specifications that finally makes them implementable.

Algorithmic refinement consists in removing non-determinism by being more and more precise about the way the operations are eventually to be made concrete through sequencing and loops. At the same time pre-conditions should be relaxed.

Data refinement consists in completely removing all variables whose types are too complicated to be implemented as such, and in replacing them by simpler variables whose types correspond to those found in programming notations. Data refinement also includes some algorithmic refinement at the same time.

Abstraction relation. Assume two substitutions $S$ and $T$ working within two different machines (the two distinct variable spaces are represented by two variables $x$ and $y$; $x \in s$ and $y \in t$ are the respective invariants of these machines). A binary relation $v$ from $s$ to $t$ such that $ran(v)$ is equal to $t$ is an abstraction relation.

Algorithmic refinement appears to be just a special case of data refinement with an identity abstraction relation. It is important that algorithmic and data refinements are monotonic with respect to all the generalized substitution constructs introduced.

A machine $N$ is said to refine a machine $M$ if a user can use $N$ instead of $M$ without noticing it. Both machines must have the same operation signatures.
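For a machine with a single variable and deterministic operations, the three proof obligation templates given above can be checked by exhaustive enumeration over a finite set of candidate values. The Python sketch below is a brute-force check, not a proof, and only illustrates what each obligation requires; the function `check_machine` and its parameters are hypothetical.

```python
def check_machine(states, inv, init, ops):
    """Check the three obligations over a finite collection of candidate states.

    states : finite iterable of candidate values of the machine variable z
    inv    : invariant R as a Boolean function
    init   : initialization T as a function returning the initial value
    ops    : list of (pre-condition L, deterministic step S) pairs
    """
    states = list(states)
    exists = any(inv(z) for z in states)          # exists z . R
    init_ok = inv(init())                         # [T]R
    ops_ok = all(inv(step(z))                     # R and L  =>  [S]R
                 for pre, step in ops
                 for z in states
                 if inv(z) and pre(z))
    return exists, init_ok, ops_ok
```

For a counter with invariant $0 \le z \le 3$, initialization $z := 0$ and an operation "pre $z < 3$ then $z := z + 1$ end", all three checks succeed. Dropping the pre-condition makes the last obligation fail at $z = 3$.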
A sufficient condition for $N$ to refine $M$ is that each operation of $N$ data refines the corresponding operation of $M$ (by a certain abstraction relation).

REFINEMENT CONSTRUCT

A refinement is a construct that resembles a machine. A refinement can refine either a machine or another refinement. The invariant clause of a refinement is just the abstraction relation. The operations of the refinement only involve the variables of the refinement, not those of the construct being refined. In pure algorithmic refinement the variables of the refinement and of the construct being refined are the same.

machine AM_Identifier
variables x
invariant P
initialization S
operations
    z <- OpName = pre Q then T end
end

Refinement of the abstract machine:

machine Identifier refines AM_Identifier
variables y
invariant R
initialization U
operations
    z <- OpName = pre L then V end
end

PROOF OBLIGATIONS FOR REFINEMENT

The proof obligation templates concerning the relationship between a machine and its refinement via the abstraction mapping $R$ on the variables of the two machines are:
  $\exists (x, y) \cdot (P \wedge R)$
  $[U]\,\neg[S]\,\neg R$
  $\forall (x, y) \cdot (P \wedge R \wedge Q \Rightarrow L \wedge [V']\,\neg[T]\,\neg(R \wedge z = z'))$

$R$ is a predicate containing both a typing invariant and other properties of the local state $y$, together with the abstraction mapping that relates $y$ and $x$. $V'$ stands for the substitution $V$ within which the variable $z$ has been replaced by $z'$.

The meaning of the proof obligations:
1. The existence of a model for a combined abstract and refined state that satisfies the abstraction relation.
2. The correct refinement of the initialization under the assumption of the constraints and properties of both machines.
3. The correct refinement of the operations.
CORRECTNESS OF OPERATION REFINEMENT

Under the abstraction relation and the pre-condition of the more abstract operation, the pre-condition of the respective refined operation holds (the pre-condition may be weakened in refinements, i.e. the refined operation has a wider domain of behaviour).

For every execution of $V$ there is a corresponding execution of $T$ from the same initial state (under the mapping $R$) which establishes the same resulting values ($z = z'$) and which re-establishes the abstraction mapping between the post-states.

The double negation $\neg[Q]\neg W$ is used to express the existence of an execution of $Q$ that establishes $W$. The predicate transformer $[Q]W$ denotes the weakest pre-condition guaranteeing that $Q$ establishes $W$. This means that in states satisfying $[Q]W$, $Q$ terminates, and that no starting state related through $Q$ to at least one ending state not fulfilling $W$ can belong to the set of states satisfying the pre-condition. $\neg[Q]\neg W$ expresses the fact that $Q$ cannot establish $\neg W$. This means that either $Q$ does not terminate or there exists at least one state satisfying $W$ in which $Q$ terminates.

REFINING SUBSTITUTIONS EXAMPLES

Abstract sequence:
  $x, y := y, x$
Concrete sequence:
  $t := v;\; v := w;\; w := t$
Here $v$ and $w$ are put into correspondence with $x$ and $y$ respectively by a refinement abstraction function, and $t$ is a variable introduced in the refining construct.

Weakening of the operation pre-condition: the pre-condition of the abstract operation must imply the pre-condition of the refined operation. $P \wedge [S]$ can be refined by $P \Rightarrow [S]$, i.e. the pre-conditioned substitution $P \mid S$ can be refined by the guarded substitution $P \Longrightarrow S$.

Reduction of non-determinism in the result of an operation: the bounded choice $S \mathbin{[]} T$ can be refined by either of the substitutions $S$ or $T$. A substitution by a choice from a set can be refined by a substitution by a particular element of this set.
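The first example above (the parallel swap refined by a sequence using a temporary variable) can be checked by enumeration. The sketch below assumes the abstraction relation $v = x$, $w = y$ stated in the text and tests that, starting from related states, both versions end in related states; the function names are illustrative only.

```python
from itertools import product


def abstract_swap(x, y):
    """Abstract operation: x, y := y, x."""
    return y, x


def concrete_swap(v, w):
    """Concrete operation: t := v; v := w; w := t."""
    t = v
    v = w
    w = t
    return v, w, t


def swap_is_refined(values):
    """Check that related initial states lead to related final states."""
    for x, y in product(values, repeat=2):
        v, w = x, y                       # abstraction relation: v = x, w = y
        x2, y2 = abstract_swap(x, y)
        v2, w2, _ = concrete_swap(v, w)   # t is local to the refinement
        if (v2, w2) != (x2, y2):
            return False
    return True
```

An exhaustive check over a small value range is of course weaker than discharging the refinement proof obligation, but it shows concretely what the obligation requires.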
AMN AS A DATA METAMODEL

The AMN-based data metamodel consists of:

State Specification Notation
1. built-in basic sets (elementary sorts), including the sets of natural numbers (NAT), booleans (BOOL) and strings (STRING)
2. constructors allowing the specification of complex sorts (types) from simpler ones, and facilities for the formal definition of data types
3. variables defined on the sorts
4. invariants (predicates on data types and variables)

Behavior Specification Notation
1. primitive operators (substitutions) used to define behaviours
2. operations of abstract machines.

STATE SPECIFICATION NOTATION

The notation is based on set theory and a typed first-order language with built-in types and type constructors. The set notation is introduced axiomatically: the axiomatic basis is needed to prove (formally) the properties of abstract machines.

Definition. The set of complex sort constructors includes: cartesian product ($\times$), powerset ($\mathbb{P}$), set comprehension ($\{x \mid x \in s \wedge P\}$), relational sort constructors ($s \leftrightarrow t$) and functional sort constructors ($s \to t$).

The set of sorts (types) $S$ is defined as follows:
- each primitive sort $s$ (NAT, BOOL, STRING) is a sort in $S$;
- if $s_1, s_2 \in S$, then $s_1 \times s_2$ is a sort in $S$;
- if $s \in S$, then $\mathbb{P}(s)$ is a sort in $S$;
- if $s \in S$ and $P$ is a certain predicate, then $\{x \mid x \in s \wedge P\}$ is a sort in $S$;
- if $s$, $t$ are in $S$, then $s \to t$ is a sort in $S$ denoting a set of functions from $s$ to $t$ (total, partial, bijective, injective and surjective functions are available);
- if $s$, $t$ are in $S$, then $s \leftrightarrow t$ is a sort in $S$ denoting a set of binary relations from $s$ to $t$ (various kinds of relations are provided);
- all sorts in $S$ are given in this way.
PREDICATES AND FORMULAE

Definition. Predicates in the notation are defined by the first-order language that is given by the following symbols:
- the sort symbols and sort constructor symbols,
- variable symbols $v^s$ for each sort $s$,
- n-ary function symbols for each (n+1)-tuple $(s_0, \dots, s_n)$ of sorts,
- the symbols $\cup$, $\cap$, $\setminus$,
- the logical connectors $\neg$, $\wedge$, $\vee$, $\Rightarrow$, $\Leftrightarrow$,
- the logical quantifiers $\forall$, $\exists$,
- the predicate symbols $=$, $\in$, $\subset$, $\subseteq$, and the parentheses ( and ).

Definition. The set of terms for each sort $s$:
- each variable $v^s$ is a term of sort $s$
- if $f$ is a function symbol of sort $(s_0, \dots, s_n)$ and $u_i$ ($i = 0, \dots, n-1$) are terms of sorts $s_i$, then $f(u_0, \dots, u_{n-1})$ is a term of sort $s_n$
- if $s = \mathbb{P}(s')$ for some sort $s'$ and $u_1, u_2$ are terms of sort $s$, then $u_1 \cup u_2$, $u_1 \cap u_2$, $u_1 \setminus u_2$ are terms of sort $s$
- if $r$ is a relational symbol of sort $(s \leftrightarrow t)$, then $dom(r)$, $ran(r)$, $r^{-1}$ are terms denoting the domain of $r$, the range of $r$ and the inverse of $r$
- all terms are given in this way.

The set of well-formed atomic formulae:
- if $u_1$ and $u_2$ are terms of the same sort $s$, then $u_1 = u_2$ is an atomic formula
- if $u_1$ and $u_2$ are terms of sort $s$ and $\mathbb{P}(s)$ respectively, then $u_1 \in u_2$ is an atomic formula
- if $u_1$ and $u_2$ are terms of sort $\mathbb{P}(s)$, or sort $s \to t$, or sort $s \leftrightarrow t$, then $u_1 \subset u_2$ and $u_1 \subseteq u_2$ are atomic formulae.

An interpretation $A$ assigns to each sort $s$ a non-empty domain (universe of sort $s$) $D_s$ having a type corresponding to the constructor of the sort $s$. $A$ assigns a function to each n-ary function symbol and its set-theoretic meaning to each predicate symbol.

STATE INTERPRETATION

The interpretation of a state of an AM is given by the machine variable binding $\sigma$, which assigns to each variable $v^s \in V_s$ a domain element in $D_s$. The set of all variable bindings gives the state space $\Sigma$:

$\Sigma = \{\sigma : V \to D \mid \sigma(V_s) \subseteq D_s \text{ for all sorts } s\}$

where $D$ may be taken as the solution in Scott's lattice of the following equation:

$D = B + \mathbb{P}(D) + (D \times D) + (D \to D) + (D \leftrightarrow D)$

The set of all states satisfying the invariant predicate $I$ is denoted by

$\Sigma_I = \{\sigma \mid \sigma \models I\}$

Here $\sigma \models I$ means that the state $\sigma$ satisfies the predicate $I$.

SPECIFICATION OF BEHAVIOUR: SUBSTITUTIONS

The operations of the abstract machines are based on the generalized substitutions: they allow non-determinism.

A generalized substitution $S$ is a predicate transformer that associates with a postcondition $R$ its weakest pre-condition $[S]R$, which guarantees that $R$ holds after the operation is executed. If this is so, one says that $S$ establishes $R$.

"Weakest" precondition means that the "initial state" predicate associated with a given "final state" predicate should allow as many states as possible. The predicate $R'$ is weaker than $R$ iff $R \Rightarrow R'$.

Definition. The weakest precondition $[S]R$ is obtained by replacing the variables in $R$ according to the rules provided by the different kinds of generalized substitutions defined for AMN.

DATA MODEL REFINEMENT

Database states in a source and a target DM are equivalent iff they are mapped into one and the same state in the context of an abstract data metamodel. Such a state mapping should be "isomorphic".

A type $t_s$ bijectively data refines a type $t_t$ iff the types produce sets of database states of equal cardinality, related by a bijective dependency in such a way that the states in one-to-one correspondence are equivalent. For such a refinement the data abstraction is a total bijective function relating equivalent states.
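The requirement that a data abstraction be a total bijective function can be checked mechanically when the state sets are small and finite. The following sketch assumes such finite state sets; `is_bijective_abstraction`, `to_attributes` and `forget_b` are hypothetical names introduced only for this illustration.

```python
from itertools import product

def is_bijective_abstraction(abstraction, source_states, target_states):
    """Check that the abstraction maps source states one-to-one onto target states."""
    images = [abstraction(s) for s in source_states]
    total = all(img in target_states for img in images)     # every image is a target state
    injective = len(set(images)) == len(images)             # no two source states collapse
    surjective = set(images) == set(target_states)          # every target state is reached
    return total and injective and surjective

# Source type: positional pairs (a, b); target type: sets of named attribute values.
SOURCE_STATES = set(product((0, 1), repeat=2))
TARGET_STATES = {frozenset({("a", a), ("b", b)}) for a, b in SOURCE_STATES}

def to_attributes(state):
    """Abstraction that keeps all information."""
    a, b = state
    return frozenset({("a", a), ("b", b)})

def forget_b(state):
    """Lossy mapping: different source states become indistinguishable."""
    a, _b = state
    return frozenset({("a", a), ("b", 0)})
```

Here `to_attributes` passes the check, while `forget_b` fails it because it loses information, so it could not serve as the abstraction of a bijective data refinement.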
Schema $S_s$ refines a schema $S_t$ iff for each type $t_s$ of $S_s$ there is a type $t_t$ in $S_t$ ($S_t$ includes no other types) such that $t_s$ is a refinement of $t_t$.

Data model $M_s$ refines data model $M_t$ iff for each admissible schema $S_s$ of $M_s$ there exists an admissible schema $S_t$ of $M_t$ such that $S_s$ is a refinement of $S_t$.

Definition. Data model $M_s$ is equivalent to data model $M_t$ iff $M_s$ refines $M_t$ and $M_t$ refines $M_s$.

For heterogeneous multidatabase management we require that any source data model should be a refinement of the canonical one: the paradigms of source data models should be integrated in the canonical data model.

DATA TYPE STATE

We assume that schemas in any DM are decomposable into types (type model mapping). We free type models from the peculiarities of specific object relationships (such as inheritance, generalization or classification) in a particular schema, taking their meaning into account. We focus on the subtyping relationship between types in schemas.

For data model $M_i$ the set of all data type schemas expressible in the DDL of $M_i$ is denoted by $T_i$.

Definition. The data type state corresponding to a type schema $t_i \in T_i$ is a function $s_{t_i} : Id_{t_i} \to V_i$ defining, for each state variable (attribute) of the type schema denoted by an identifier $I \in Id_{t_i}$, its value $v_i$ taken from the set of admissible values $V_i$ of the variable type. It is essential that $v_i$ in its turn can also be an analogous function.

A set of admissible states corresponding to some type schema $t_i \in T_i$ is a set of functions $S_{t_i} : [Id_{t_i} \to V_i]$.

A space of data type states expressible in $M_i$ is a set of functions $S_i : [Id_i \to V_i]$, which may be considered as the union of the sets $s_{t_i} \in S_{t_i}$ for all $t_i \in T_i$. We consider only admissible states that satisfy the invariants related to types.

DATA TYPE BEHAVIOUR

Definition. The data type behaviour corresponding to a type schema $t_i \in T_i$ is a function

$b_{t_i} : O_{t_i} \to [S_{t_i} \times S_{t_k} \times \dots \times S_{t_n} \to S_{t_i}]$

defining for each operation in the type schema the related state transformation for this type. $t_k, \dots, t_n$ are type schemas included in the same database schema as the type schema $t_i$.

A set of admissible behaviours corresponding to some type schema $t_i \in T_i$ is a set of functions

$B_{t_i} : [O_{t_i} \to [S_{t_i} \times S_{t_k} \times \dots \times S_{t_n} \to S_{t_i}]]$

A space of data type behaviours expressible in $M_i$ is a set of functions

$B_i : [O_i \to [S_i \times \dots \times S_i \to S_i]]$

which may be considered as the union of the sets $b_{t_i} \in B_{t_i}$ for all $t_i \in T_i$.

Definition. Data model $M_i$ is a triple $\langle T_i, Ms_i, Mb_i \rangle$ where
- $Ms_i : T_i \to S_i$ is a semantic state function of $M_i$
- $Mb_i : T_i \to B_i$ is a semantic behaviour function of $M_i$.

DATA TYPE MAPPINGS

The following set of mappings constitutes the mapping $f$ of data model $M_j$ into data model $M_i$:
- the mapping of data type schemas of $M_j$ into data type schemas of $M_i$: $T_j \to T_i$,
- the mapping of the data type state space of $M_j$ into the data type state space of $M_i$: $S_j \to S_i$,
- the mapping of data type behaviours of $M_i$ into data type behaviours of $M_j$: $B_i \to B_j$.

BASIC PROPOSITIONS FOR HETEROGENEOUS DATA MODEL INTEGRATION

Proposition. The data model axiomatic extension principle. The canonical data model in the multidatabase management system should be extensible as new source data models are considered. Such an extension is implemented axiomatically. The extension should make the source DM a bijective data refinement of the target data model.

Proposition. The data model commutative mapping principle. In the process of mapping the DM of a specific DBMS into a canonical one it is necessary to preserve information and operations. This requirement is satisfied if the DM mapping is commutative.

Proposition. The unifying canonical data model synthesis principle.
Canonical data model synthesis is a process of 1) constructing extensions of the canonical data model kernel such that the data models of the DBMSs embraced by the multidatabase system refine these extensions, and 2) merging such extensions into a canonical data model. In this way a unifying canonical data model is formed in which the data models of various DBMSs have homogeneous representations (by subsets of the unifying data model).

COMMUTATIVE DATA MODEL MAPPING

The mapping $f$ of data model $M_j$ into an extension $M_{ij}$ of data model $M_i$ is commutative iff the following conditions hold:
- the data type state diagram is commutative:
  [Diagram: the top row maps $T_i$ (together with the axiom schemas indexed by $ij$) to $S_{ij}$ by $Ms_{ij}$; the bottom row maps $T_j$ to $S_j$ by $Ms_j$; vertical arrows lead from the bottom row to the top row.]
- the data type behaviour diagram is commutative:
  [Diagram: $Mb_i$ maps $T_i$ to behaviours over the operations $O_{t_i}$ and the states $S_{t_i}, \dots, S_{t_{ik}}$; $Mb_j$ maps $T_j$ to behaviours over the operations $O_{t_j}$ and the states $S_{t_j}, \dots, S_{t_{jk}}$.]
- the state mapping is a bijective abstraction function of a data refinement
- the behaviour mapping is an algorithmic refinement.

The subscript $ij$ marks the set of axiom schemas expressing the data dependencies of $M_j$ in terms of $M_i$.

COMMUTATIVE DM CHARACTERIZATION

Data model $M_i$ is included into data model $M_j$ if all axiom schemas (indexed by $ri$) of an extension $M_{ri}$ of a reference data model $M_r$ such that $M_i$ is its refinement are included (possibly not strictly) into the set of axiom schemas (indexed by $rj$) of an extension $M_{rj}$ of the reference data model $M_r$ such that $M_j$ is its refinement.

Proposition of existence. If the data type state mapping diagram of $M_j$ to $M_{ij}$ commutes, then the behavioural data type mapping diagram can be constructed.

Proposition of refinement. Data model $M_j$ refines $M_i$ iff there exists a commutative mapping of $M_j$ to $M_i$.

This allows the synthesis of the canonical unifying data model to be separated from the definition of the behaviour of types.
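Commutativity of the data type state diagram can be illustrated on a toy pair of models with finite state sets. The sketch below is an assumption-laden illustration: the dictionaries standing for the semantic state functions (`MS_J`, `MS_IJ`), the schema mapping `SCHEMA_MAP` and the function `state_diagram_commutes` are all hypothetical, and the example types are invented for the purpose.

```python
def state_diagram_commutes(schemas_j, schema_map, ms_j, ms_ij, state_map):
    """Check, per source type, that mapping the schema and then taking its states
    gives the same set as taking the states and then mapping them."""
    for t_j in schemas_j:
        source_states = ms_j[t_j]
        mapped = {state_map(s) for s in source_states}
        if mapped != ms_ij[schema_map[t_j]]:
            return False            # the two paths through the diagram disagree
        if len(mapped) != len(source_states):
            return False            # the state mapping is not injective on this type
    return True

# Toy source model: one type whose states are the flags "Y" and "N".
MS_J = {"yn_flag": {"Y", "N"}}
# Toy extended target model: one boolean-valued type.
MS_IJ = {"bool_attr": {True, False}}
# Schema mapping from source types to target types.
SCHEMA_MAP = {"yn_flag": "bool_attr"}

commutes = state_diagram_commutes(MS_J, SCHEMA_MAP, MS_J, MS_IJ, lambda s: s == "Y")
lossy = state_diagram_commutes(MS_J, SCHEMA_MAP, MS_J, MS_IJ, lambda s: True)
```

With the faithful state mapping the check succeeds; with the lossy one it fails, since part of the target state set is never reached.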
COMMUTATIVE DATA MODEL CONSTRUCTION

Formal definition of data models in AMN makes it possible to construct a data model mapping and the proof of its correctness simultaneously.

Properties of AMN suggest the application of compiler-based semantics $M : P \to D$, where $P$ is the syntactical domain of $M$ (the set of all syntactically correct programs in the language $L$) and $D$ is the semantical domain of $M$. $D$ is the set of abstract representations in the target language: abstract machines are put into direct correspondence with data types instead of using other semantics (axiomatic, operational or denotational).

The method of commutative data model mapping construction is oriented towards the definition of canonical DM kernel extensions that should be refined by the internal data models.

Applying compiler-based semantics, we should construct the mapping of $M_j$ into an extension of $M_i$ together with the AMN semantics of $M_j$ and the AMN semantics of the extended $M_i$. After that we can apply the B technology to prove the state-based and behavioural properties of the mapping for all type models defined for an internal DM.

The mapping obtained can be used when defining new, application-specific types to get their particular mappings and the respective proofs.

Such an approach justifies the choice of compiler-based semantics and resembles the idea of computer-assisted application of formal methods to software design.

ODMG'93 RELATIONSHIP TYPE

We focus on one-to-many relationships here. The relationships rediscover CODASYL sets, which were widely used in the 1970s and 1980s.

We assume that a one-to-many relationship $R$ means a partial function $R : T \rightharpoonup S$ where $T$ ($S$) is a set of objects of the target (source) type. The semantics of operations on objects in $T$ and $S$ should preserve this constraint.
The BNF for the ODL relationship specification follows:

  <relationship spec> ::= [relationship] <target of path> <traversal path name>
                          [inverse <inverse traversal path>]
                          [order_by <attribute list>]
  <traversal path name> ::= <string>
  <target of path> ::= <collection type> <target type>
  <target type> ::= <type name>
  <inverse traversal path> ::= <target type> :: <traversal path name>

The <collection type> option indicates cardinality greater than one on the target side; otherwise the cardinality is one. Each attribute used in the ordering criterion must be defined in the property list of the target type definition. The inverse traversal path may be given in both the source and the target traversal path descriptions.

The operations defined on a one-to-many relationship type are the following:

  create (o1: Denotable_Object, s: Set<Denotable_Object>)
  delete ()
  add_one_to_one (o1: Denotable_Object, o2: Denotable_Object)
  remove_one_to_one (o1: Denotable_Object, o2: Denotable_Object)
  traverse (from: Denotable_Object) -> s: Set<Denotable_Object>

RELATED SYNTHESIS LANGUAGE CONSTRUCTS

Association metatype: a specification of an attribute may be considered as a specification of an association type. An object attribute (as an association type) is said to belong to a particular attribute category that is explicitly introduced by an association metatype.
  <association metatype> ::=
    { <association metatype identifier>
      in: association, metatype
      [params: {<formal parameter list>}]
      [supertype: <supertype list>]
      [inverse: <association metatype identifier>]
      [<attribute specification list>]
      [instance_section:
        { [association_type: {<bounds>, <bounds>}]
          [domain: <domain>]
          [range: <range>]
          [<attribute specification list>] }]
    }
  <bounds> ::= {<lower bound>, <upper bound>}
  <lower bound> ::= <arithmetic expression> | inf
  <upper bound> ::= <arithmetic expression> | inf

An association type is defined in the instance section of the association metatype.

ASSOCIATION TYPE

An association type is interpreted here as a subtype of a set type whose elements are of a product type defined on the association domain and range types. A constraint (defining a kind of binary relation) is set by the attribute association_type.

The binary relation $R$ is defined on a domain $C_1$ and a range $C_2$. The first bound gives, for any object $c_1$ of $C_1$, the admissible range (minimal and maximal value) of the number of different objects $c_2$ in $C_2$ such that $\langle c_1, c_2 \rangle$ belongs to $R$. The second bound gives, for any $c_2$ of $C_2$, the minimal and maximal number of objects $c_1$ of $C_1$ such that $\langle c_2, c_1 \rangle$ belongs to the association inverse to $R$. inf is a constant denoting an arbitrary positive integer.

By means of an instance section, other attributes of an association type may be defined ("attributes of attributes"). For an attribute price, attributes such as price status and currency can be provided.

If an attribute belongs to a certain attribute category (or categories), then a union is formed of the corresponding specifications given in the attribute metaslot and in the instance sections of the corresponding association metatypes.
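The meaning of the two bounds can be made concrete by counting, for a finite relation, how many partners each object has. This is a minimal sketch: the function `satisfies_bounds`, the use of `math.inf` for `inf`, and the department and employee identifiers are all assumptions made for illustration.

```python
import math
from collections import Counter

INF = math.inf   # stands in for the constant inf of the bounds

def satisfies_bounds(pairs, domain, range_, forward, inverse):
    """Check an association given as a set of (c1, c2) pairs against two bounds.

    forward = (lower, upper): admissible number of c2 objects per c1 in the domain.
    inverse = (lower, upper): admissible number of c1 objects per c2 in the range.
    """
    per_c1 = Counter(c1 for c1, _ in pairs)
    per_c2 = Counter(c2 for _, c2 in pairs)
    lo_f, hi_f = forward
    lo_i, hi_i = inverse
    return (all(lo_f <= per_c1[c1] <= hi_f for c1 in domain)
            and all(lo_i <= per_c2[c2] <= hi_i for c2 in range_))

DEPARTMENTS = {"d1", "d2"}
EMPLOYEES = {"e1", "e2", "e3"}

one_to_many = {("d1", "e1"), ("d1", "e2")}
shared_employee = one_to_many | {("d2", "e1")}

ok = satisfies_bounds(one_to_many, DEPARTMENTS, EMPLOYEES, (0, INF), (0, 1))
violated = satisfies_bounds(shared_employee, DEPARTMENTS, EMPLOYEES, (0, INF), (0, 1))
```

With the bounds {{0, inf}, {0, 1}} the first relation is admissible, while the second is rejected because the employee e1 would belong to two departments.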
AN IDEA OF MAPPING OF A RELATIONSHIP TYPE TO ASSOCIATION

Difference: the SYNTHESIS association metatype is a loose, unconstrained type that can easily be extended by associating new attributes with it: state, assertional and functional ones. The ODL relationship type is a concrete, built-in type that might be interpreted as a C++ class of which the specific relationships defined in types may be treated as instances.

We need a mapping of type schemas leading to the bijective data refinement of the data type state diagram. To reach that, we construct an appropriate axiomatic extension of the association metatype by introducing the following axioms:

1. The axiom of partial functional dependency of the association range on the association domain:
   association_type: {{0, inf}, {0, 1}}
   This makes the metatype attribute obligatory.
2. The axiom of order, provided by the assertion extending the canonical kernel:
   ordered_by: {<attribute list>}
   This axiom is optional; it makes the association metatype definition parameterized with <attribute list>. The attribute list is represented as an instance of the sequence of string type. An empty attribute list means that no ordering is implied.

The mapping of operations leading to the algorithmic refinement of the data type behaviour diagram should also be constructed. We should specify the relationship operations create, delete, add_one_to_one, remove_one_to_one and traverse in the attribute specification list of the metatype.

A RESULT OF THE RELATIONSHIP TYPE MAPPING

[Only three operations are shown.]
  {odl_relationship
    in: metatype, association
    params: {attributes: seq of string}
    inverse: odl_relationship_inv
    instance_section:
    {association_type: {{0, inf}, {0, 1}}
     domain: type
     range: type
     ord_pred: invariant, {ordered_by: attributes}
     add_one_to_one: {in: function
       params: {+v1/domain, +v2/range}
       ¬(v2 in this.range) & this' = union(this, {[v1, v2]})}
     remove_one_to_one: {in: function
       params: {+v1/domain, +v2/range}
       v1 in this.domain & v2 in this.range & this' = differ(this, {[v1, v2]})}
     traverse: {in: function
       params: {+from/domain, -to/range}
       to' = ordf(attributes, {x/range | [from, x] in this})}
     /* ordf is assumed to be a function that returns the set of objects given by
        the second parameter, ordered by the values of the ordering attributes
        given by the first parameter. */
    }}

ODL RELATIONSHIP TYPE IN AMN: INVARIANTS

$V1$ and $V2$ are sets of admissible values of two related object types. We consider the case when a set type is used for the collection type in a relationship.

To interpret the relationship we introduce two variables establishing the direct and inverse traversal paths:

$dpath \in V1 \rightharpoonup seq(V2) \wedge ipath \in V2 \rightharpoonup V1$

Here $\rightharpoonup$ denotes a partial function and $seq(S)$ is the set of finite sequences of elements from $S$. Each sequence over a set $S$ is a partial function whose domain is an interval $1..n$ for some natural number $n$.

$dpath$ is considered to be a natural representation of a CODASYL set type, which we use here to interpret the relationship. $dpath$ establishes set instances and an order imposed on each of them. $ipath$ expresses a constraint of partial functional dependency similar to CODASYL set membership.

Interrelation of $dpath$ and $ipath$:

$dom(dpath) = ran(ipath) \wedge \forall xx . (xx \in dom(dpath) \Rightarrow ran(dpath(xx)) = ipath^{-1}[\{xx\}])$

$r^{-1}$ is the inverse of a relation $r$.

To interpret ordered relationships we introduce the type:

$fsort \in \mathbb{P}(ATTR\_LIST) \times \mathbb{P}_1(VV) \to seq(VV) \wedge \forall (xx, yy) . (xx \subseteq ATTR\_LIST \wedge yy \in \mathbb{P}_1(VV) \Rightarrow fsort(xx \mapsto yy) \in perm(yy))$

where $\mathbb{P}_1(S)$ denotes the set of all non-empty subsets of $S$, and $perm(S)$ is the set of bijective sequences of elements of a finite set $S$: $perm(S) = 1..card(S) \mathrel{\rightarrowtail\!\!\!\!\!\twoheadrightarrow} S$, where $\rightarrowtail\!\!\!\!\!\twoheadrightarrow$ denotes a set of bijections. $(E \mapsto F)$ denotes an ordered pair in AMN.

Ordering is simulated by the operation order of the abstract machine ord.

ODL RELATIONSHIP TYPE IN AMN: ABSTRACT MACHINE

  MACHINE ord(VV, ATTR_LIST)
  VARIABLES fsort
  INVARIANT
    $fsort \in \mathbb{P}(ATTR\_LIST) \times \mathbb{P}_1(VV) \to seq(VV) \wedge$
    $\forall (xx, yy) . (xx \in \mathbb{P}(ATTR\_LIST) \wedge yy \in \mathbb{P}_1(VV) \Rightarrow fsort(xx \mapsto yy) \in perm(yy))$
  ASSERTIONS
    $\forall (xx, yy) . (xx \in \mathbb{P}(ATTR\_LIST) \wedge yy \in \mathbb{P}_1(VV) \wedge fsort(xx \mapsto yy) \in perm(yy) \Rightarrow ran(fsort(xx \mapsto yy)) = yy)$
    /* ASSERTIONS is a list of predicates separated by $\wedge$ which gives properties
       that can be asserted from the machine invariant and other contextual information. */
  OPERATIONS
    $oo \leftarrow order(par1, par2)$ =
      PRE $par1 \subseteq VV \wedge par2 \subseteq ATTR\_LIST$
      THEN $oo := fsort(par2 \mapsto par1)$
      END
    /* PRE P THEN S END is the AMN equivalent for the pre-conditioned substitution $P \mid S$. */
  END

EXTENDED ASSOCIATION TYPE IN AMN: INVARIANTS

The association metatype (a loose type) is interpreted by a binary relation of AMN (equivalent to the powerset $\mathbb{P}(V1 \times V2)$):

$rr \in V1 \leftrightarrow V2$

Additional constraints correspond to the two axioms of extension that make a relationship type its bijective data refinement.

To association_type: {{0, inf}, {0, 1}} the following invariant of AMN corresponds ($rr[s]$ is the image of the set $s$ under $rr$):

$\forall xx . (xx \in ran(rr) \Rightarrow card(rr^{-1}[\{xx\}]) \le 1)$

To interpret the axiom of order we impose an ordering constraint similar to that of the ODMG type:

$ordf \in \mathbb{P}_1(ATTR\_V2) \times \mathbb{P}_1(V2) \to seq(V2) \wedge \forall (xx, yy) . (xx \in \mathbb{P}_1(ATTR\_V2) \wedge yy \in \mathbb{P}_1(V2) \Rightarrow ordf(xx \mapsto yy) \in perm(yy))$

EXTENDED ASSOCIATION TYPE IN AMN: ABSTRACT MACHINE

  MACHINE canonical_rel
  SETS V1; V2; ATTR_V2
  VARIABLES rr, ordf, attr_v2
  INVARIANT
    $rr \in V1 \leftrightarrow V2 \wedge$
    $\forall xx . (xx \in ran(rr) \Rightarrow card(rr^{-1}[\{xx\}]) = 1) \wedge$
    $ordf \in \mathbb{P}(ATTR\_V2) \times \mathbb{P}_1(V2) \to seq(V2) \wedge$
    $\forall (xx, yy) . (xx \in \mathbb{P}(ATTR\_V2) \wedge yy \in \mathbb{P}_1(V2) \Rightarrow ordf(xx \mapsto yy) \in perm(yy)) \wedge$
    $attr\_v2 \in \mathbb{P}(ATTR\_V2)$
  ASSERTIONS
    $\forall (xx, yy) . (xx \in \mathbb{P}(ATTR\_V2) \wedge ordf(xx \mapsto yy) \in perm(yy) \Rightarrow ran(ordf(xx \mapsto yy)) = yy)$
  OPERATIONS
    $add\_one\_to\_one(par1, par2)$ =
      PRE $par1 \in V1 \wedge par2 \in V2 \wedge \neg(par2 \in ran(rr))$
      THEN $rr := rr \cup \{par1 \mapsto par2\}$
      END
    $remove\_one\_to\_one(par1, par2)$ =
      PRE $par1 \in dom(rr) \wedge par2 \in rr[\{par1\}]$
      THEN $rr := rr - \{par1 \mapsto par2\}$
      END
    $to \leftarrow traverse(from)$ =
      PRE $from \in dom(rr)$
      THEN $to := ordf(attr\_v2 \mapsto rr[\{from\}])$
      END
  END

ODMG RELATIONSHIP TYPE AS A REFINEMENT OF ASSOCIATION TYPE

Abstraction bijective relation:

$dom(dpath) = dom(rr) \wedge \forall xx . (xx \in dom(dpath) \Rightarrow ran(dpath(xx)) = rr[\{xx\}]) \wedge fsort = ordf \wedge ipath = rr^{-1}$

  REFINEMENT odmg_rel
  REFINES canonical_rel
  INCLUDES ord(V2, ATTR_V2)
  VARIABLES dpath, ipath, attr_pp2
  INVARIANT
    $dpath \in V1 \rightharpoonup seq(V2) \wedge ipath \in V2 \rightharpoonup V1 \wedge$
    $dom(dpath) = ran(ipath) \wedge$
    $\forall xx . (xx \in dom(dpath) \Rightarrow ran(dpath(xx)) = ipath^{-1}[\{xx\}]) \wedge$
    $\forall xx . (xx \in dom(dpath) \Rightarrow dpath(xx) = fsort(attr\_pp2 \mapsto ran(dpath(xx)))) \wedge$
    $attr\_pp2 \in \mathbb{P}(ATTR\_V2) \wedge attr\_pp2 = attr\_v2 \wedge$
    $dom(dpath) = dom(rr) \wedge$
    $\forall xx . (xx \in dom(dpath) \Rightarrow ran(dpath(xx)) = rr[\{xx\}]) \wedge$
    $fsort = ordf \wedge ipath = rr^{-1}$
  OPERATIONS
    $add\_one\_to\_one(par1, par2)$ =
      PRE $par1 \in V1 \wedge par2 \in V2 \wedge \neg(par2 \in dom(ipath))$
      THEN
        IF $par1 \in dom(dpath)$
        THEN ANY pp WHERE $pp \in seq(V2)$
             THEN $pp \leftarrow order(ran(dpath(par1)) \cup \{par2\}, attr\_pp2)$ ; $dpath(par1) := pp$
             END
        ELSE ANY pp WHERE $pp \in seq(V2)$
             THEN $pp \leftarrow order(\{par2\}, attr\_pp2)$ ; $dpath(par1) := pp$
             END
        END ;
        $ipath(par2) := par1$
      END
    /* ANY xx WHERE P THEN S END is the AMN equivalent for the generalized substitution $@xx.(P \Longrightarrow S)$. */

    $remove\_one\_to\_one(par1, par2)$ =
      PRE $par1 \in dom(dpath) \wedge par2 \in ran(dpath(par1))$
      THEN
        IF $ran(dpath(par1)) - \{par2\} = \emptyset$
        THEN $dpath := \{par1\} \ndres dpath$
          /* $s \ndres r$ is the antirestriction of $r$ by $s$. Here $r \in S \leftrightarrow T$ and $s \subseteq S$.
             The antirestriction gives the set $\{x, y \mid (x, y) \in r \wedge x \in S - s\}$. */
        ELSE LET qq BE $qq = ran(dpath(par1)) - \{par2\}$
             IN ANY pp WHERE $pp \in seq(V2)$
                THEN $pp \leftarrow order(qq, attr\_pp2)$ ; $dpath(par1) := pp$
                END
             END
        END ;
        $ipath := ipath - \{par2 \mapsto par1\}$
      END
    /* LET xx BE xx = E IN S END is the AMN equivalent for the generalized substitution $@xx.(xx = E \Longrightarrow S)$. */

    $to \leftarrow traverse(from)$ =
      PRE $from \in dom(dpath)$
      THEN $to := dpath(from)$
      END
  END

SUMMARY AND DISCUSSION

A methodological basis for resolving data model heterogeneity in a multidatabase environment has been introduced. The basis provides for verifiable design of data model mappings, handling the models as formal objects in the frame of an abstract metamodel.

Two different metamodels and verification techniques have been considered in detail. Finally, a concept of data model refinement was introduced, with AMN as a metamodel.
Pure mathematical notation is combined with the ability to prove obligatory properties of data type definitions, as well as the subtype property, using the refinement concept.

The principles of data model axiomatic extension, of data model commutative mapping, and of unifying canonical data model synthesis, all based on the notion of data model refinement, are exploited to integrate heterogeneous data models into one paradigm (the canonical model of the environment).

The basic steps of the mapping design are as follows:
- construct the mapping of the source data model type specifications into type specifications of an extension of the canonical data model (including state and behaviour mapping)
- provide a formal interpretation of the source data model types
- provide an interpretation in formal notation of the source data types mapped into the extension of the canonical data model types
- justify the state-based and behavioural properties of the type mappings by proving that a source data type is a refinement of its mapping to the canonical data model type.

The spectrum of data models embraced by the method may include object-oriented as well as structured data models.

A unifying canonical data model can be synthesized by integrating the data models of various DBMSs on the basis of the data model refinement concept. A canonical data model with an object-oriented kernel can be used.
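The last step listed above, justifying that a source type refines its canonical counterpart, is a matter of proof in the B technology. As a lightweight complement, the relationship example can also be exercised by random simulation. The sketch below is not a proof and rests on several assumptions: `CanonicalRel` and `OdmgRel` are simplified Python stand-ins for the machines canonical_rel and odmg_rel, objects are plain strings, and the ordering functions ordf and fsort are both replaced by the natural ordering of Python's `sorted`.

```python
import random

def ordf(objects):
    """Assumed stand-in for ordf/fsort: order a set of objects by natural order."""
    return tuple(sorted(objects))

class CanonicalRel:
    """Abstract side: the relation rr as a set of (v1, v2) pairs."""
    def __init__(self):
        self.rr = set()

    def image(self, v1):
        return {b for a, b in self.rr if a == v1}

    def can_add(self, v1, v2):
        return all(b != v2 for _, b in self.rr)     # not (v2 in ran(rr))

    def can_remove(self, v1, v2):
        return (v1, v2) in self.rr

    def add_one_to_one(self, v1, v2):
        self.rr.add((v1, v2))

    def remove_one_to_one(self, v1, v2):
        self.rr.discard((v1, v2))

    def traverse(self, v1):
        return ordf(self.image(v1))

class OdmgRel:
    """Concrete side: dpath maps v1 to an ordered tuple, ipath maps v2 to v1."""
    def __init__(self):
        self.dpath = {}
        self.ipath = {}

    def add_one_to_one(self, v1, v2):
        self.dpath[v1] = ordf(set(self.dpath.get(v1, ())) | {v2})
        self.ipath[v2] = v1

    def remove_one_to_one(self, v1, v2):
        remaining = set(self.dpath[v1]) - {v2}
        if remaining:
            self.dpath[v1] = ordf(remaining)
        else:
            del self.dpath[v1]                       # antirestriction by {v1}
        del self.ipath[v2]

    def traverse(self, v1):
        return self.dpath[v1]

def abstraction_holds(abstract, concrete):
    """dom(dpath) = dom(rr), ran(dpath(x)) = rr[{x}], ipath = inverse of rr."""
    domain = {a for a, _ in abstract.rr}
    return (set(concrete.dpath) == domain
            and all(set(concrete.dpath[x]) == abstract.image(x) for x in domain)
            and concrete.ipath == {b: a for a, b in abstract.rr})

def simulate(steps=500, seed=0):
    """Run the same random operations on both sides and compare after each step."""
    rng = random.Random(seed)
    abstract, concrete = CanonicalRel(), OdmgRel()
    for _ in range(steps):
        v1, v2 = rng.choice("ab"), rng.choice("wxyz")
        if abstract.can_add(v1, v2):
            abstract.add_one_to_one(v1, v2)
            concrete.add_one_to_one(v1, v2)
        elif abstract.can_remove(v1, v2):
            abstract.remove_one_to_one(v1, v2)
            concrete.remove_one_to_one(v1, v2)
        if not abstraction_holds(abstract, concrete):
            return False
        if any(abstract.traverse(x) != concrete.traverse(x) for x in concrete.dpath):
            return False
    return True
```

Operations are applied only when the abstract pre-condition holds, mirroring the refinement condition that the concrete operation must behave correctly wherever the abstract one is defined. A passing run increases confidence in the abstraction relation but does not replace the proof obligations.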