mediation

ODB Course. Mediation
environment for object DB
integration
Leonid A. Kalinichenko
Institute of Informatics Problems,
Russian Academy of Sciences
[email protected]
Talk outline
 Motivation
 Resource registration at a mediator
 Query rewriting for information integration
 Query rewriting algorithm in the typed environment
Motivation
Outline :
 Application-driven EIS development
 Canonical Information Model
 Consolidation of a mediator
 Mediator schema example
Concerns of EIS development
• Enterprise inter- and intra-organizational models: virtual corporations (e.g., virtual observatories in e-science)
• Integration at the model level: taking fragments of information within the enterprise and placing them in a larger context
• Which model is to be taken, and how a proper context is to be formed
• Heterogeneous information resources of various kinds (data, service, process, and ontological resources) relevant to an EIS are to be used in the specific context of an application
• Such resources are often autonomous and evolve with time. The set of resources relevant to a specific EIS may change quite rapidly. The technologies applied in the relevant resources are also evolving rapidly.
• Justifiable identification of the resources relevant to an EIS, and semantic integration of their various kinds in the contexts of appropriate applications
• Making an EIS stable in a rapidly evolving world
• New methods and tools are required for EIS application development over multiple distributed collections of data and services
Subject Domain in Natural Science
[Figure: structure of a subject domain in natural science. Labels: Domain Terminology and Concepts (abstract, methodological, concrete); Material System Def in NL Semantics; Semantics of T1…Tn constituents; Interpretation; Observable/Measurable Characteristics; Theory (Model) 1 with T1 Signature (attributes, types, classes, processes) and T1 Measurable Characteristics (attributes, types, classes, procs); Concretization A of T1, Concretization B of T1, … [simulators]; Theories (Models) T2, … , Tn with T2, … , Tn measurable characteristics; Simulation; Explaining, forecasting; Observations, simulations, measurements for T1; Methods and Instruments for observation, experimentation, measurement, data analysis, discovery; Problems, methods of solutions, algorithms, programs, workflows.]
Requirements for scientific results publishing
To publish means to make an information resource available through services:
• To unify theory, experiment, and simulation
• To query integrated information
• To allow independent checks of conclusions based on theoretical results, reproducing certain results
• To allow comparisons with similar results and methodologies, or with the corresponding data, by observers and theoreticians
• To make theoretical results more easily accessible and understandable for observers
• To establish invariants for observable classes, to treat observable classes as interpretations of theories (models), and to set triggers watching for inconsistencies between observations and theoretical models
Two approaches to resource integration
• Moving from resources to problems: an integrated schema of multiple resources is created independently of specific applications
• Moving from an application to resources: a description of the application subject domain (in terms of concepts, data structures, functions, processes) is created, into which the resources relevant to the application are mapped
• The first approach, driven by information resources, is not scalable with respect to the number of resources, does not make semantic integration of resources in the context of a specific application possible, does not lead to justifiable identification of the resources relevant to an EIS, and does not enhance EIS stability
• The second (application-driven) approach assumes the creation of a subject mediator that supports the interaction between an application and resources on the basis of the application domain definition (the description of the mediator)
• The application-driven, subject mediation approach overcomes the deficiencies of the resource-driven approach
• Basic methods for the application-driven approach will be characterized
Principles of application-driven EIS development
Basic principles of application-driven EIS development over multiple heterogeneous information resources are the following:
1. independence of the application (mediator) specification from the existing information resources;
2. definition of an application mediator as a result of the consolidation effort of the respective community;
3. emphasis on semantic canonical definitions for the mediator specification;
4. independence of user interfaces from the resources registered at the mediator: application users need to be aware only of the definition of the application domain (definition of the mediator);
5. independence of the publication of newly developed information resources from the mediators;
6. three-stage identification of information resources relevant to the mediator;
7. semantic integration of relevant heterogeneous information resources in the canonical mediator specification;
8. integrated access to the information resources registered at the mediator, applying the canonical model and the query rewriting system;
9. recursive structure of mediators: each mediator can be registered as a new information resource.
Synthesis of canonical information model
• Explosive growth of various information representation models: OMG architectures (e.g., MDA), Semantic Web and Web service architectures, digital library architectures as collective memories, information Grid architectures, languages and data models (ODMG, SQL, UML, XML and RDF stacks of data models), process models and workflow models, semantic models (including ontological models and models of metadata), models of digital repositories of data and knowledge in particular domains
• Another trend is the intensive development of information components and services based on such models, accelerating the need to integrate components and services in various applications
• Adequate methods for manipulating various information models are required
• The basis of these methods is the concept of a canonical information model serving as the common language, an "Esperanto"
• Initially, ideas of mapping structured data models and constructing a canonical model for them were developed
• The main principle of mapping an arbitrary resource data model (represented by its DDL and DML) into the target one (the canonical model) is the principle of commutative data model mapping
• Operations and information of a resource data model are preserved while mapping it into the canonical one, applying proofs in denotational semantics
Synthesis of canonical information model (2)
• For the object data models, the method of data model mapping and canonical model construction used the Abstract Machine Notation (AMN) as its formalism (metamodel). This made it possible to define model-theoretic specifications in first-order logic and to prove specification refinement
• The main principle of canonical model synthesis is extensibility, which is required for semantic integration and information interoperability in a heterogeneous environment that includes various models
• A kernel of the canonical model is fixed. For each specific information model M of the environment, an extension of the kernel is defined so that this extension together with the kernel is refined by M
• The canonical model for the environment is synthesized as the union of the extensions constructed for the models M of the environment. The resource schema refines the canonical model schema
• The refinement of the schema mapping is formally checked
• The canonical data model synthesis method plays a seminal role in the synthesis of canonical models for various kinds of resource information models, including process models (workflows), service models (Web services), and ontological models (OWL)
Heterogeneous models absorbed by the canonical model
[Figure: the Canonical Model (Core and Extensions) is_refined_by each of the following groups of models:]
• Semistructured Data Models (OEM, ADM, OQL-doc)
• Component Models (IDL, CDL, BOF)
• Object & Heterogeneous DB Models (ODL, SQL3, Garlic)
• Document Object Model
• Knowledge Base Representations (OKBC, Ontolingua)
• Metadata for DL (Dublin Core, Warwick, STARTS, Z39.50)
• Metadata Expressible in Meta Models (MOF, RDF)
• Unstructured Data (vocabularies, thesauri)
• Workflow Models
Mediator Definition as Subject
Metainformation Consolidation
For the mediator's scalability, two separate phases of the mediator's functioning are distinguished: the consolidation phase and the operational phase.
During the consolidation phase the efforts of the community are focused on defining the mediator subject by declaring its metainformation. Well-known representative information resources of the subject domain are used in the process of metainformation definition. The metainformation created during the consolidation phase constitutes the mediated level of the integrated system.
During the operational phase arbitrary information resources can be registered at the mediator, expressed in terms of the mediated level. The registration process is autonomous and can be carried out by resource providers independently of each other.
Users of the mediator know only the metainformation defining the mediator's subject and formulate their queries in terms of that subject. For each query the mediator decides which registered resources are relevant to it.
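Mediator's Recursion
[Figure: mediator recursion. A collection is registered at a mediator, receives queries from it, and returns data from the collection. The mediator itself is registered (as a collection) at another mediator, receives queries from it, and returns data from the mediator.]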
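The recursive structure can be illustrated with a minimal Python sketch. It is not the mediator's actual implementation: `Collection`, `Mediator`, `list_collection`, and the sample data in the usage note are hypothetical names introduced only for illustration, and queries are simplified to row predicates.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


@dataclass
class Collection:
    """A registered resource: a name plus a function that answers queries."""
    name: str
    answer: Callable[[Predicate], List[Row]]


@dataclass
class Mediator:
    """Hypothetical mediator that forwards a query to every registered resource."""
    name: str
    resources: List[Collection] = field(default_factory=list)

    def register(self, resource: Collection) -> None:
        self.resources.append(resource)

    def query(self, predicate: Predicate) -> List[Row]:
        result: List[Row] = []
        for resource in self.resources:
            result.extend(resource.answer(predicate))
        return result

    def as_collection(self) -> Collection:
        # Recursion: a mediator looks like any other collection from the outside.
        return Collection(self.name, self.query)


def list_collection(name: str, rows: List[Row]) -> Collection:
    """Wrap an in-memory list of rows as a collection (illustration only)."""
    return Collection(name, lambda predicate: [r for r in rows if predicate(r)])
```

A higher-level mediator can then register `heritage.as_collection()` exactly as it would register an ordinary collection, and a query posed to it reaches the collections registered at `heritage`.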
Advantages of subject domain mediation
1. Semantic integration of heterogeneous information collections is reached.
2. Users need to know only the subject definitions as defined by a community.
3. Information providers can disseminate their information for integration independently of each other and at any time.
4. Autonomous information collections are completely independent of the mediator and its consolidated metainformation definitions.
5. Users have integrated access to all information registered up to the moment of a query.
6. Mediators form a recursive structure. Multiple subjects can be semantically integrated by defining mediators of a higher level.
Cultural Heritage Subject Domain
[Figure: cultural heritage subject domain. «type» Heritage_Entity; Painting, Sculpture, Antiquities; Person; Creator, Collector, Owner; Repository; Museum, Gallery, Exhibition. Attributes and relations: created_by*, date*, narrative*, identifier*, relation*, …, place_of_origin, history_period, content, origin_history, in_collection, owned_by, digital_form, ... «type» Text with contains, near, within, follows, … Ontologies and thesauri: Cultural Heritage, History, Jurisdiction.]
Cultural Heritage Subject Mediator
[Figure: UML class diagram of the Cultural Heritage subject mediator. Types: Entity (title, date, created_by, value(in e : Entity) : real); Person (name, nationality, date_of_birth, date_of_death, residence); Creator (culture, general_Info, works); Repository (name, place, collections); Collection (name, description, in_repository); Heritage_Entity (contains, place_of_origin, date_of_origin, content, in_collection); Painting, Sculpture, and Antiquities, carrying the attributes dimensions, material_medium, exposition_space, type_specimen, archaeology. Associations carry 1 and * multiplicities.]
Resource registration at a mediator
Outline :
 Resource classes as views over mediator schema
 Example
 Resource registration facilities
 Contextualization of ontology
 Process of an information resource registration
Registration of relevant resources in mediator
• The definition of a subject mediator and the registration of information resources in the mediator are based on ideas of compositional development of information systems
• Registration of resources is a process of purposeful specification transformation. It includes decomposition of the mediator specifications into consistent fragments, a search among the specifications of relevant resources for data types that are candidates for refining the mediator specification types, and construction of expressions defining resource classes as compositions of the mediator classes
• Specification composition calculus, type reducts, type algebra
• The process of registering heterogeneous information resources in a subject mediator is based on GLAV, which combines two approaches: Local As View (LAV) and Global As View (GAV)
• GAV views reconcile the various conflicts between resource and mediator specifications and provide rules for transforming query results from the resource representation into the mediator representation
• The main registration result is a GLAV expression defining how a resource class is determined as a composition of the mediator classes
• This registration technique keeps the EIS application specification stable under any modifications of specific information resources and of their actual presence
• Identification of resources relevant to a mediator (which precedes the registration) is based on three models: the metadata model, the ontological model, and the canonical information model
Heterogeneous Information Resources Integration for Problem Solving
• Integrating information from pre-selected resources. This is a procedural approach: information from the resources is integrated through ad hoc procedures. When information needs or resources change, a new mediator has to be generated. This is known as the Global as View (GAV) approach.
• Integrating information from arbitrary resources according to predefined information needs. This is a declarative approach: mediators contain mechanisms that rewrite queries according to resource descriptions. A rewritten query should be contained in the original query. This is known as the Local as View (LAV) approach.
• Combined LAV and GAV approaches (GLAV), applying partial materialization
Representation of Information resources
Formally, the contents of an information resource are described by a set of expressions:
V(Z) ⊆ C1(U1) & … & Cn(Un)
where C1, … , Cn are classes on the mediator level and V is the class on the information resource level. This means that the resource can be asked a query of the form V(Z) (or any partial instantiation of it), and returns instances with state attributes that satisfy the following implication:
∀Z (V(Z) ⇒ ∃U (C1(U1) & … & Cn(Un)))
Z are assumed to be free variables in ∃U (C1(U1) & … & Cn(Un)).
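The implication can be checked on small finite extents. The following Python sketch assumes a view over two mediator classes, in the spirit of the Uffizi canvas view used later in the talk; the function name `is_sound_view`, the tuple layouts, and the sample rows in the usage note are illustrative assumptions, not part of the mediator.

```python
def is_sound_view(view_rows, painting_rows, creator_rows):
    """Check on finite extents that every view tuple has a mediator-level witness.

    view_rows:     (title, name, culture) tuples returned by the resource class V
    painting_rows: (title, name, date_of_origin) tuples of mediator class C1
    creator_rows:  (name, culture) tuples of mediator class C2
    The date_of_origin plays the role of an existentially quantified variable
    that is not exposed by the view.
    """
    creators = set(creator_rows)
    for title, name, culture in view_rows:
        has_painting = any(
            p_title == title and p_name == name
            for p_title, p_name, _date in painting_rows
        )
        if not (has_painting and (name, culture) in creators):
            return False
    return True
```

For example, a view row for a painting that exists at the mediator level with a matching creator passes the check, while a row naming an unknown creator makes the view unsound.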
Resources Representation at Subject Mediator
Mediator level metainformation; local into mediator level mapping

Hermitage Museum Web Site
• museum_objectP: title, name, place_of_origin, date_of_origin, r_name, dimensions
• museum_objectS: title, name, place_of_origin, date_of_origin, r_name, material
Louvre Museum Web Site
• workP: title, author, place_of_origin, date_of_origin, in_rep
• workS: title, author, place_of_origin, date_of_origin, in_rep, exp_space
Uffizi Museum Web Site
• artist: name, general_Info, works
• canvas: title, name, culture, place_of_origin, r_name

Local views in terms of mediator classes
museum_objectP(p/Museum_Object[title, name, place_of_origin, date_of_origin, r_name]) ⊆ painting(p/Painting[title, name: created_by.name, place_of_origin, date_of_origin, r_name: in_collection.in_repository.name]), date_of_origin < 1700, date_of_origin > 1600
…
workP(p/Work[title, author, place_of_origin, date_of_origin, in_rep]) ⊆ painting(p/Painting[title, author: created_by.name, place_of_origin, date_of_origin, in_rep: in_collection.in_repository.name]), in_rep = 'Louvre'
…
amount(h/Entity[title, name: created_by.name], v/real) ⊆ value(h/Entity[title, name: created_by.name], v/real)
canvas(p/Canvas[title, name, culture, place_of_origin, r_name]) ⊆ painting(p/Painting[title, name: created_by.name, place_of_origin, date_of_origin, r_name: in_collection.in_repository.name]), creator(c/Creator[name, culture]), r_name = 'Uffizi', date_of_origin >= 1550, date_of_origin < 1700
artist(a/Artist[name, general_Info, works]) ⊆ creator(a/Creator[name, general_Info, works/{set-of:Painting}])
Uffizi views in terms of the mediator
• canvas(p/Canvas[title, name, culture, place_of_origin, r_name]) ⊆ painting(p/Painting[title, name: created_by.name, place_of_origin, date_of_origin, r_name: in_collection.in_repository.name]), creator(c/Creator[name, culture]), r_name = 'Uffizi', date_of_origin >= 1550, date_of_origin < 1700
• artist(a/Artist[name, general_Info, works]) ⊆ creator(a/Creator[name, general_Info, works/{set-of:Painting}])
• amount(h/Entity[title, name: created_by.name], v/real) ⊆ value(h/Entity[title, name: created_by.name], v/real)
Resource Registration Facilities
The facilities intended to support functions of resource registration include:
• contextualization of ontology;
• constructing mapping of a resource data model and metadata into the canonical
ones;
• structural and behavioral conflicts resolution;
• view definition: representation of resource classes in terms of the mediator's
classes;
• semi-automatic construction of a resource wrapper;
• connecting the wrapper to the interoperation infrastructure.
Structure of the Resource Registration Tool
[Figure: the Resource Registration Tool comprises resource context / mediator metainformation reconciliation, most common reduct identification, construction of resource class specifications as views over mediator classes, and wrapper generation. Connected components: the mediator's DBMS (Oracle 9i) with the metainformation repository, B-Toolkit (B-AMN), and the generated wrapper code.]
Contextualization of Ontology
Mapping of the local ontological context to that of the mediator
• by names and relationships
• by natural language description
• applying structural integration to concept specifications
• introducing new concepts over existing ones
Contextualization through structural correlation
• establishing loose ontological relevance of specification elements, applying analysis of intercontext concept relationships
• establishing tight ontological relevance of specification elements, introducing a subsumption relationship between concepts
Correlation of Ontological Concepts
Evaluation of descriptor weights:
\[ W_{Xk} = \frac{f_k \log\frac{N}{n_k}}{\sqrt{\sum_{i \in V_X} \left( f_i \log\frac{N}{n_i} \right)^2}} \]
Establishing intercontext relationships between concepts:
\[ sim(X, Y) = \frac{\sum_{k \in V^t_X \cap V^t_Y} W_{Xk} W_{Yk}}{\sqrt{\sum_{k \in V^t_X} W_{Xk}^2 \cdot \sum_{k \in V^t_Y} W_{Yk}^2}} \]
\[ r(X, Y) = \frac{\sum_{k \in V^t_X \cap V^t_Y} \min(W_{Xk}, W_{Yk})}{\sum_{k \in V^t_X} W_{Xk}^2} \]
\[ r(Y, X) = \frac{\sum_{k \in V^t_X \cap V^t_Y} \min(W_{Xk}, W_{Yk})}{\sum_{k \in V^t_Y} W_{Yk}^2} \]
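A minimal Python sketch of descriptor weighting and concept correlation follows. It assumes that the weights are the usual normalized frequency-times-log(N/n) weights and that sim is the cosine measure over the descriptors of two concepts; the function names, the argument layout, and the use of the full descriptor sets as index sets are assumptions made for illustration.

```python
import math


def descriptor_weights(freq, doc_freq, n_total):
    """Normalized weights W_Xk for the descriptors of one concept X.

    freq:     descriptor -> frequency f_k in the concept description
    doc_freq: descriptor -> number n_k of descriptions containing the descriptor
    n_total:  total number N of descriptions
    """
    raw = {k: f * math.log(n_total / doc_freq[k]) for k, f in freq.items()}
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {k: (w / norm if norm else 0.0) for k, w in raw.items()}


def sim(wx, wy):
    """Cosine-style correlation of two concepts given their descriptor weights."""
    shared = wx.keys() & wy.keys()
    numerator = sum(wx[k] * wy[k] for k in shared)
    denominator = math.sqrt(
        sum(w * w for w in wx.values()) * sum(w * w for w in wy.values())
    )
    return numerator / denominator if denominator else 0.0
```

Two concepts with no shared descriptors get a correlation of zero, and a concept correlated with itself gets a value of one (up to rounding).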
Ontological Metainformation
[Figure: UML class diagram of the ontological metainformation. Classes: type, Class, ADT, Concept, ConceptRel, PositiveRel, NarrowRel, PartRel, RelativeRel, Category, ConceptWeight, Descriptor. Attributes shown: strength: float = 1; definition: string; wordClass: string; code: string; weight: float; frequency: float; name: string. Association roles: collection, fromConcept, toConcept, fromRelation, toRelation, foreign, descriptorOf, category, weightOf, concept, weights, descriptors, with 1 and * multiplicities.]
Identification of Ontologically Relevant Elements
[Figure: elements «concept» ArtAge, «attribute» FineArt.period, «concept» Period, «attribute» Creator.culture_race; relationship positive(0.64).]
Process of an Information Resource
Registration
For each resource class the following steps (of the compositional development process) are required:
1. Relevant mediator classes identification
• Find mediator classes that can ontologically be used to define the resource class in terms of mediator classes. Several mediator classes may correspond to one resource class, their instance types covering different reducts of the instance type of the resource class.
2. Most common reducts construction
For the instance type of each identified mediator class:
• Construct the most common reducts for the instance type of this mediator class and the resource class instance type, to concretize (partially) the mediator instance type.
• In this process, for each attribute type of the common reduct a concretizing type, a concretizing function, or their combination should be constructed (this step is applied recursively).
Process of an Information Resource
Registration
3. Partial resource view construction
• For each relevant mediator class construct a partial resource view expressing the constraints, in terms of the mediator class, that should be satisfied by the values of the respective most common reducts of resource class instances.
4. Partial views composition
• Construct compositions of the resource type most common reducts obtained for the instance types of all mediator classes involved.
• Construct a resource view as a composition of the partial views obtained above. This is an expression of a materialized view of an information resource in terms of mediator classes. The instance type of this view is determined by the most common reducts composition constructed above.
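Step 2 can be illustrated with a deliberately simplified Python sketch. It treats a type as a flat mapping from attribute names to type names and takes the common reduct to be the mediator attributes that the resource type can supply under a given renaming; real reduct construction relies on type refinement and concretizing functions, which are not modeled here. The function name, the `renames` argument, and the attribute types in the usage note are assumptions.

```python
def most_common_reduct(mediator_type, resource_type, renames):
    """Return mediator attribute -> resource attribute for the shared part.

    mediator_type: mediator attribute (or attribute path) -> type name
    resource_type: resource attribute -> type name
    renames:       resource attribute -> mediator attribute it concretizes
    """
    reduct = {}
    for r_attr, r_type in resource_type.items():
        m_attr = renames.get(r_attr, r_attr)
        # Keep the attribute only if the mediator type has it with the same type.
        if mediator_type.get(m_attr) == r_type:
            reduct[m_attr] = r_attr
    return reduct
```

Applied to a Painting type and the Uffizi Canvas type with name mapped to created_by.name and r_name mapped to in_collection.in_repository.name, the sketch keeps title, created_by.name, place_of_origin, and in_collection.in_repository.name, and leaves out culture, which Painting does not have.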
Query rewriting for information integration
Outline :
 Canonical model query language
 Query containment
 Query rewriting algorithms
 View definition and inverse rules construction
 Query rewriting algorithm schema and an example of rewriting
 Mediator architecture
Canonical model query language

A SYNTHESIS Conjunctive Query (SCQ) is a query of the form
q(v/Tv) :- C1(v1/Tv1), … , Cn(vn/Tvn), F1(X1,Y1), … , Fm(Xm,Ym), B
where q(v/Tv), C1(v1/Tv1), … , Cn(vn/Tvn) are collection (class) atoms, F1(X1,Y1), … , Fm(Xm,Ym) are functional atoms, and B, called the constraint, is a conjunction of predicates over the variables v, v1, … , vn, typed by Tv, Tv1, … , Tvn, or over the output variables Y1 ∪ Y2 ∪ … ∪ Ym of the functional atoms. Each atom Ci(vi/Tvi) or Fj(Xj,Yj) (i = 1, … , n; j = 1, … , m) is called a subgoal. The value v structured according to Tv is called the output value of the query. A union query is a finite union of SCQs.
The formal semantics of an SCQ is defined as a composition of: the Cartesian product of sets and classes; execution of the functional predicates; selection of the instances satisfying B; and typing of the joins of product domains by the join operation on the specifications of the respective argument types.
The semantics of a disjunction Ci(vi/Tvi) ∨ Cj(vj/Tvj) requires that the resulting type of the disjunction for Tvi and Tvj be defined by the type operation meet.
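One possible in-memory representation of an SCQ is sketched below in Python. The class and field names (`ClassAtom`, `FunctionalAtom`, `SCQ`, `subgoals`) are hypothetical, and the constraint B is kept as a list of predicate strings rather than being parsed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ClassAtom:
    """Collection (class) atom C(v/Tv[attributes])."""
    name: str
    var: str
    type_name: str
    attributes: Tuple[str, ...] = ()


@dataclass(frozen=True)
class FunctionalAtom:
    """Functional atom F(X, Y) with input variables X and output variables Y."""
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


@dataclass
class SCQ:
    head: ClassAtom
    class_atoms: List[ClassAtom]
    functional_atoms: List[FunctionalAtom] = field(default_factory=list)
    constraint: List[str] = field(default_factory=list)  # conjunction B

    def subgoals(self):
        """All class and functional atoms of the body, in order."""
        return [*self.class_atoms, *self.functional_atoms]
```

A query such as the valuable Italian heritage entities example has one class atom (heritage_entity), one functional atom (value), and a constraint with four predicates.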
Query Containment Algorithms
A query Q1 is said to be contained in a query Q2 , if for all databases D, the set of
tuples computed for Q1 is a subset of those computed for Q2
Query containment has been studied for the purposes of query optimization, detecting independence of queries from database updates, rewriting queries using views, maintenance of integrity constraints, semantic data caching, etc.
• Containment mapping for conjunctive queries [1977]
• Uniform containment of Datalog programs [1988]
• Containment of conjunctive queries with built-in subgoals [1989]
• Containment of conjunctive queries in Datalog program [1989]
• Uniform containment for Datalog programs [1996]
• Containment for queries with complex objects [1997]
• Boolean query containment [1997]
• Containment for conjunctive queries with regular expressions [1998]
• Query containment relative to views [1999]
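For plain relational conjunctive queries, containment can be decided by searching for a containment mapping. The Python sketch below does this by brute force; it ignores typing, functional atoms, and constraints, so it illustrates the classical untyped technique rather than the typed SCQ case. The query encoding (a head tuple plus a list of atoms, with variables written as capitalized strings) and the function names are assumptions.

```python
from itertools import product


def is_var(term):
    """Variables are strings starting with an uppercase letter (assumed convention)."""
    return isinstance(term, str) and term[:1].isupper()


def contained_in(q1, q2):
    """True if q1 is contained in q2, i.e. a containment mapping from q2 to q1 exists.

    A query is (head, body): head is a tuple of terms, body a list of
    (predicate, args) atoms. The mapping sends variables of q2 to terms of q1,
    must map the head of q2 onto the head of q1, and every body atom of q2
    onto some body atom of q1.
    """
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        return False
    vars2 = sorted({t for _, args in body2 for t in args if is_var(t)}
                   | {t for t in head2 if is_var(t)})
    terms1 = sorted({t for _, args in body1 for t in args} | set(head1))
    atoms1 = {(pred, tuple(args)) for pred, args in body1}
    for image in product(terms1, repeat=len(vars2)):
        mapping = dict(zip(vars2, image))
        if tuple(mapping.get(t, t) for t in head2) != tuple(head1):
            continue
        if all((pred, tuple(mapping.get(t, t) for t in args)) in atoms1
               for pred, args in body2):
            return True
    return False
```

The search is exponential in the number of variables of q2, which is acceptable for the small queries used in examples.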
Query rewriting algorithms for data integration
• U-join algorithm, Bucket algorithm: conjunctive queries using conjunctive views [1996]
• Inverse-rule algorithm: Datalog programs using Datalog views [1998]
• MiniCon algorithm, Shared-Variable-Bucket: improved versions of the Bucket algorithm [2000, 2001]
• Algorithms for finding contained rewritings in the presence of functional dependencies [1998-2002]
• Rewriting Unions of Relational Conjunctive Queries [John Wang Thesis, Griffith University, Brisbane, 2002]
• Resolution-based rewriting [2002]
View definition and inverse rules construction
During registration, a local resource class is described as a view over virtual classes of the mediator, having the following general SCQ form:
V(h/Th) ⊆ P1(b1/Tb1), … , Pk(bk/Tbk), F1(X1,Y1), … , Fr(Xr,Yr), B
A reduct of the view body instance type is to be refined by the concretizing type Th designed above the resource.
To produce inverse rules from the view definitions, replace in the view each attribute from Tb1, … , Tbk that is not contained in Th with a distinct Skolem function of h/Th producing an output value of the type of the respective attribute. This Skolemizing mapping of the view is denoted as ρ. After the Skolemizing mapping, inverse rules for the mediator classes in the view body are produced as
ρ(Pi(bi/Tbi) ← V(h/Th)) (for i = 1, … , k)
For the mediator functions that are type methods, the inverse rules look like
ρ(Tm.Fbj(Xbj,Ybj) ← Ts.Fhl(Xhl,Yhl)), for j = 1, … , r, where Fbj and Fhl are methods of Tm and Ts such that the function type of Fhl refines the function type of Fbj.
ρ(B) is the inferred constraint of the view predicate V(h/Th).
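The construction of inverse rules for the class atoms of a view can be sketched in Python as follows. The sketch works on attribute names only, after the name mappings of the view have been applied, and marks Skolem functions with the #n prefix used in the examples of this talk; the function name and the tuple encoding of atoms are assumptions, and functional atoms and constraints are not handled.

```python
def inverse_rules(view_name, head_attrs, body_atoms):
    """Build one inverse rule (head_atom, body_atom) per class atom of a view.

    view_name:  name of the resource class V
    head_attrs: attributes exposed by the view
    body_atoms: list of (mediator_class, attributes) from the view body
    Attributes of a mediator class that the view does not expose are replaced
    by distinct Skolem terms written as '#<n><attribute>'.
    """
    view_atom = (view_name, tuple(head_attrs))
    exposed = set(head_attrs)
    rules = []
    counter = 0
    for mediator_class, attrs in body_atoms:
        skolemized = []
        for attr in attrs:
            if attr in exposed:
                skolemized.append(attr)
            else:
                counter += 1  # a fresh Skolem function for each hidden attribute
                skolemized.append(f"#{counter}{attr}")
        rules.append(((mediator_class, tuple(skolemized)), view_atom))
    return rules
```

For the Uffizi canvas view this yields a rule for painting in which date_of_origin becomes #1date_of_origin, and a rule for creator without Skolem terms.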
Example: View Def and Inverse Rules
View definitions for the Uffizi site:
• canvas(p/Canvas[title, name, culture, place_of_origin, r_name]) ⊆ painting(p/Painting[title, name: created_by.name, place_of_origin, date_of_origin, r_name: in_collection.in_repository.name]), creator(c/Creator[name, culture]), r_name = 'Uffizi', date_of_origin >= 1550, date_of_origin < 1700
• artist(a/Artist[name, general_Info, works]) ⊆ creator(a/Creator[name, general_Info, works/{set-of:Painting}])
• amount(h/Entity[title, name: created_by.name], v/real) ⊆ value(h/Entity[title, name: created_by.name], v/real)
Inverse rules for the first view definition above:
• painting(p/Painting[title, name: created_by.name, place_of_origin, #1date_of_origin, r_name: in_collection.in_repository.name]) ← canvas(p/Canvas[title, name, culture, place_of_origin, r_name])
• creator(c/Creator[name, culture]) ← canvas(p/Canvas[title, name, culture, place_of_origin, r_name])
#1date_of_origin is a Skolem function
Query rewriting algorithm schema (1)
1. For each SCQ (denoted as Q) q(v/Tv) :- C1(v1/Tv1), … , Cn(vn/Tvn), F1(X1,Y1), … , Fm(Xm,Ym), B in Qu, generate a set of candidate formulae.
valuable_Italian_heritage_entities(h/Heritage_Entity_Valued[title, c_name, r_name, v]) :- heritage_entity(h/Heritage_Entity[title, c_name: created_by.name, place_of_origin, date_of_origin, r_name: in_collection.in_repository.name]), value(h/Heritage_Entity[title, name: c_name], v/real), v >= 200000, date_of_origin >= 1500, date_of_origin < 1750, place_of_origin = 'Italy'
2. For each subgoal Ci(vi/Tvi) or Fj(Xj,Yj) of Q, find an inverse rule r ∈ I whose head Pl(bl/Tbl) or Fbo(Xbo,Ybo) unifies with the subgoal (unification is based on subtyping and refinement relations*).
painting(p/Painting[title, name: created_by.name, place_of_origin, #1date_of_origin, r_name: in_collection.in_repository.name]) ← canvas(p/Canvas[title, name, culture, place_of_origin, r_name])
* Painting is a subtype of Heritage_Entity
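The role of subtyping in step 2 can be sketched in Python. The sketch selects the inverse rules whose head type is a subtype of the subgoal type; it does not model attribute-level refinement. The `SUPERTYPE` table encodes only the subtype facts stated in this talk, and the function names and the (head type, view name) encoding of rules are assumptions.

```python
# Direct supertype of each type; only the facts used in the example are listed.
SUPERTYPE = {
    "Painting": "Heritage_Entity",
    "Sculpture": "Heritage_Entity",
}


def is_subtype(sub, sup):
    """True if sub equals sup or reaches it by following SUPERTYPE links."""
    current = sub
    while current is not None:
        if current == sup:
            return True
        current = SUPERTYPE.get(current)
    return False


def unifying_rules(subgoal_type, inverse_rules):
    """Select inverse rules (head_type, view_name) whose head can unify with the subgoal."""
    return [rule for rule in inverse_rules if is_subtype(rule[0], subgoal_type)]
```

With rules for Painting (from canvas and workP) and Creator (from canvas), a heritage_entity subgoal typed by Heritage_Entity unifies only with the two Painting rules.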
Query rewriting algorithm schema (2)
3. A destination of Q is a sequence D of atoms P1(b1/Tb1), … , Pn(bn/Tbn), Fb1(Xb1,Yb1), … , Fbm(Xbm,Ybm) obtained as a result of unifying the query subgoals with the heads of inverse rules from I. Several destinations can be produced, as various combinations of the SCQ subgoal unifications found.
painting(p/Painting[title, name: created_by.name, place_of_origin, #1date_of_origin, r_name: in_collection.in_repository.name]), value(h/Painting[title, name: created_by.name], v/real)
4. To construct a candidate formula, for each atom in D establish a mapping γi of the attributes and variables in this atom and the associated view to the attributes and variables of the respective atom of Q.
γ mappings for the destination (only different name mappings are shown):
γ1 = { p → h, name: created_by.name → c_name: created_by.name, #1date_of_origin → date_of_origin: #1date_of_origin }
γ2 = { name: created_by.name → name: c_name }
Query rewriting algorithm schema (3)
5. For each destination and the variable mappings defined, construct a formula Φ:
γ1(P1(b1/Tb1)), … , γn(Pn(bn/Tbn)), γn+1(Fb1(Xb1,Yb1)), … , γn+m(Fbm(Xbm,Ybm))
6. Construct the mapping δ of a constraint of Q to a constraint in Φ.
7. Replace the heads of the inverse rules in the obtained SCQ with the associated inverse rule bodies to get the formula Φ2:
q(v/Tv) :- γ1(V1(h1/Th1)), … , γn(Vn(hn/Thn)), γn+1(Fh1(Xh1,Yh1)), … , γn+m(Fhm(Xhm,Yhm)), δ(B), E
Applying γ1, γ2, we get a candidate formula:
valuable_Italian_heritage_entities(h/Heritage_Entity_Valued[title, c_name, r_name, v]) :- canvas(h/Canvas[title, c_name: name, culture, place_of_origin, date_of_origin: #1date_of_origin, r_name]), amount(h/Painting[title, name: c_name], v/real), v >= 200000, date_of_origin >= 1500, date_of_origin < 1750, place_of_origin = 'Italy'
Query rewriting algorithm schema (4)
8. If the constraint δ(B) ∧ E and the inferred constraints of the view atoms in the candidate formula are consistent, and there are no Skolem functions in the candidate of Q, then the formula is a rewriting.
9. Skolem functions elimination: if the inferred constraints of the view atoms imply the constraints in the candidate formula, then those constraints can be removed directly.
The inferred constraint for canvas(h/Canvas[title, c_name: name, culture, place_of_origin, date_of_origin: #1date_of_origin, r_name]), which is r_name = 'Uffizi', #1date_of_origin >= 1550, #1date_of_origin < 1700, implies date_of_origin >= 1500, date_of_origin < 1750.
Because of that, the Skolem functions can be eliminated from this candidate formula, and after the consistency check we get the rewriting:
valuable_Italian_heritage_entities(h/Heritage_Entity_Valued[title, c_name, r_name, v]) :- canvas(h/Canvas[title, c_name: name, culture, place_of_origin, r_name]), amount(h/Painting[title, name: c_name], v/real), v >= 200000, place_of_origin = 'Italy'
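The implication and consistency checks on date constraints in steps 8 and 9 can be sketched in Python with half-open intervals. This covers only range constraints on a single numeric attribute; the `Interval` class and its method names are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Half-open range constraint lo <= x < hi on one attribute."""
    lo: float = -math.inf
    hi: float = math.inf

    def implies(self, other: "Interval") -> bool:
        # Every value allowed here is also allowed by the other constraint.
        return other.lo <= self.lo and self.hi <= other.hi

    def consistent_with(self, other: "Interval") -> bool:
        # The two constraints admit at least one common value.
        return max(self.lo, other.lo) < min(self.hi, other.hi)


# Inferred constraint of the Uffizi canvas view and the query constraint.
view_dates = Interval(1550, 1700)
query_dates = Interval(1500, 1750)
```

Since `view_dates.implies(query_dates)` holds, the date constraint of the query is redundant for this view and can be dropped together with the Skolem term, as in the example above. A view restricted to a disjoint date range would fail the consistency check and produce no rewriting.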
Query rewriting algorithm schema (5)
10. Containment property of the candidate formulae: if we expand each view atom with the corresponding Skolemized view body and treat the Skolem functions as variables, then we get a safe SCQ that is contained in Q (in particular, the instance type of a subgoal of Φ2 is a refinement of the instance type of the respective subgoal of Q).
Query Rewriting Example (1)
Similarly, for the Louvre the heritage_entity subgoal of the query unifies with painting and sculpture as heritage_entity subclasses. Only the rewriting formed for painting is shown here.
valuable_Italian_heritage_entities(h/Heritage_Entity_Valued[title, c_name, r_name, v]) :- workP(h/Work[title, c_name: author, place_of_origin, date_of_origin, r_name: in_rep]), amount(h/Work[title, name: c_name], v/real), v >= 200000, date_of_origin >= 1500, date_of_origin < 1750, place_of_origin = 'Italy'
Finally, for the Hermitage Museum the heritage_entity subgoal of the query also unifies with painting and sculpture as heritage_entity subclasses. Only the rewriting formed for painting is shown here.
valuable_Italian_heritage_entities(h/Heritage_Entity_Valued[title, c_name, r_name, v]) :- museum_objectP(h/Museum_Object[title, c_name: name, place_of_origin, date_of_origin, r_name]), pcost(h/Museum_Object[title, name], v/real), v >= 200000, date_of_origin >= 1500, date_of_origin < 1750, place_of_origin = 'Italy'
Query Rewriting Example (2)
[Figure: a user query passes through the mediator's Query Rewriter, supported by a Thesaurus, and is rewritten into one query per resource.]
User: Find data on valuable Italian heritage entities produced between the years 1500 and 1750
Mediator: heritage_entity(h/Heritage_Entity[title, c_name: created_by.name, place_of_origin, date_of_origin, r_name: in_collection.in_repository.name]), value(h/Heritage_Entity[title, name: c_name], v/real), v >= 200000, date_of_origin >= 1500, date_of_origin < 1750, place_of_origin = 'Italy'
Thesaurus: a thesaurus extension may add 'Italia'
Hermitage: museum_objectP(h/Museum_Object[title, c_name: name, place_of_origin, date_of_origin, r_name]), pcost(h/Museum_Object[title, name], v/real), v >= 200000, date_of_origin >= 1500, date_of_origin < 1750, place_of_origin = 'Italy'
Louvre: workP(h/Work[title, c_name: author, place_of_origin, date_of_origin, r_name: in_rep]), amount(h/Work[title, name: c_name], v/real), v >= 200000, date_of_origin >= 1500, date_of_origin < 1750, place_of_origin = 'Italy'
Uffizi: canvas(h/Canvas[title, c_name: name, culture, place_of_origin, r_name]), amount(h/Painting[title, name: c_name], v/real), v >= 200000, place_of_origin = 'Italy'
Mediator Architecture
[Figure: mediator architecture. Components shown: Portal; Web Browser with Web Pages; Application Client; Application Server (Servlets/JSP, EJB/WS); Mediator; Metadata Access; ADQL2SYFS; Supervisor; Rewriter; Planner; Synth2Oracle; SOAPWrapper; Oracle 10g with the Metainformation Repository and the Data Repository; Registration Client; Collection Adapters with their Collections; Tool Adapter with Software Tools. Numbered arrows mark the order of interactions.]