RRXS
Redundancy reducing XML
storage in relations
• O. MERT ERKUŞ 2002701054
• A. ONUR DOĞUÇ 2002701069
PRESENTATION OUTLINE
INTRODUCTION
FUNCTIONAL DEPENDENCIES
CONSTRAINT PRESERVING RELATIONAL STORAGE
EXPERIMENTAL EVALUATION
CONCLUSION
INTRODUCTION
PROBLEM
Current techniques for storing XML
using relational technology consider the
structure of an XML document but ignore
its semantics.
However, when the semantics of a
document are considered, redundancy may
be reduced!
RELATIONAL DATABASES REVIEW
STRUCTURAL APPROACH
DTD – SCHEMA GRAPHS
STRUCTURAL + SEMANTIC APPROACH
KEYS – FOREIGN KEYS
FUNCTIONAL DEPENDENCIES
XML SCHEMA
PROBLEM DEFINITION
Providing a mapping from XML to a
relational database that takes structural
constraints, as well as a broad class of
semantic constraints, into account.
EXAMPLE - XML TREE
EXAMPLE - CONSTRAINTS
EXAMPLE - COMMENTS
Constraints 1 and 3 are STRUCTURAL.
Constraints 2 and 4 are KEYS.
Constraint 5 is a FUNCTIONAL DEPENDENCY.
None of the relational storage strategies designed
to date would produce this design.
OUTLINE OF THE WORK
1. A new constraint definition, XFDs, that can
capture structural and key constraints, as well as
the functional dependencies
2. A set of rewriting rules for XFDs
3. A polynomial time algorithm to reduce the input
set of XFDs
4. A constraint-preserving mapping into relational
storage that reduces redundancy
5. Experimental evaluation which shows the
effectiveness of RRXS
FUNCTIONAL DEPENDENCIES
DEFINITION
Functional dependencies for XML
(XFDs) are used to describe the property
that the values of some attributes of a tuple
uniquely determine the values of other
attributes of the tuple.
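The definition above can be illustrated on ordinary relational tuples before moving to XML. The following is a minimal sketch, not part of the RRXS work itself: the helper name `fd_holds` and the sample rows (isbn, title, price) are assumptions made up for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the values of `lhs` uniquely determine the values of `rhs`."""
    seen = {}  # maps each observed lhs value to the rhs value seen with it
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores `right` on first sight, otherwise returns the stored value
        if seen.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample tuples: the same book offered at two different prices.
rows = [
    {"isbn": "111", "title": "XML Basics", "price": 30},
    {"isbn": "111", "title": "XML Basics", "price": 25},
    {"isbn": "222", "title": "SQL Primer", "price": 40},
]

isbn_determines_title = fd_holds(rows, ["isbn"], ["title"])
isbn_determines_price = fd_holds(rows, ["isbn"], ["price"])
```

In this sample, isbn determines title but not price, because the book with isbn 111 appears with two prices.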
EXAMPLE – XFDs from constraints
Variable Bindings
$x in //vendor
$y in //book
$z in $x/book
DEFINITIONS (CONTINUED)
• An “attribute” for XML, called a P-attribute, is defined
by a path expression $v=Q that occurs in some
functional dependency.
• The set of P-attributes in an XFD groups together values
to form a ‘tuple’ for an XML instance, named an X-tuple.
A functional dependency is defined on the P-attributes
of an X-tuple and, intuitively, must hold on the set of all
X-tuples formed by valid variable bindings.
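To make the notion of an X-tuple concrete, the sketch below enumerates the bindings `$x in //vendor` and `$z in $x/book` over a small document and checks a dependency on the resulting tuples. Everything in it is an assumption for illustration: the sample document, the child elements `name`, `isbn`, `title` and `price`, and the helper names `x_tuples` and `xfd_holds` are hypothetical and are not taken from the original example slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two vendors, one book sold by both.
DOC = """
<db>
  <vendor><name>A</name>
    <book><isbn>111</isbn><title>XML Basics</title><price>30</price></book>
    <book><isbn>222</isbn><title>SQL Primer</title><price>40</price></book>
  </vendor>
  <vendor><name>B</name>
    <book><isbn>111</isbn><title>XML Basics</title><price>25</price></book>
  </vendor>
</db>
""".strip()


def x_tuples(root):
    """Yield one X-tuple per valid binding of $x in //vendor, $z in $x/book."""
    for x in root.iter("vendor"):        # $x in //vendor
        for z in x.findall("book"):      # $z in $x/book
            yield {
                "$x/name": x.findtext("name"),
                "$z/isbn": z.findtext("isbn"),
                "$z/title": z.findtext("title"),
                "$z/price": z.findtext("price"),
            }


def xfd_holds(tuples, lhs, rhs):
    """Check that the lhs P-attributes determine the rhs P-attributes on all X-tuples."""
    seen = {}
    for t in tuples:
        left = tuple(t[p] for p in lhs)
        right = tuple(t[p] for p in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True


tuples = list(x_tuples(ET.fromstring(DOC)))
title_by_isbn = xfd_holds(tuples, ["$z/isbn"], ["$z/title"])
price_by_isbn = xfd_holds(tuples, ["$z/isbn"], ["$z/price"])
price_by_vendor_and_isbn = xfd_holds(tuples, ["$x/name", "$z/isbn"], ["$z/price"])
```

On this sample, the isbn alone determines the title, while the price is determined only by the vendor name together with the isbn.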
TYPES OF XFDs
Structural XFDs:
Structural XFDs are used to capture the tree
structure of an XML document and certain
types of schema information (constraints C1, C3).
Semantic XFDs:
Semantic XFDs are used to capture
deeper knowledge of the data (constraints C2, C4, C5).
REDUCING XFDs
THE TASK: FINDING A SET OF RULES FOR
XFD INFERENCE WHOSE SOUNDNESS AND
COMPLETENESS CAN BE PROVEN
REWRITE RULES
1. Armstrong Axioms
Reflexivity
Augmentation
Transitivity
2. Containment
Uses path expressions instead
of simple attributes.
Considers the relationships
between path expressions.
3. Singleton path
Exploits structural constraints
imposed by the definition of
XFDs.
4. Variable-move
Moves variable bindings in
relations.
5. Variable introduction
and elimination
Inserts new variables and
eliminates redundant ones.
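The first rule group, the Armstrong axioms, has a well-known computational form for ordinary relational dependencies: the attribute closure. The sketch below shows only that classical case on plain attribute names; it does not model containment, singleton path, or the variable rules, and the function name `closure` and the sample dependencies are assumptions for illustration.

```python
def closure(attrs, fds):
    """Attribute closure under reflexivity, augmentation, and transitivity.

    `fds` is a list of (lhs, rhs) pairs of attribute sets.
    """
    result = set(attrs)                  # reflexivity: attrs determine themselves
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already derivable, so is the right.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical dependencies: isbn -> title, and (vendor, isbn) -> price.
fds = [({"isbn"}, {"title"}), ({"vendor", "isbn"}, {"price"})]

from_isbn = closure({"isbn"}, fds)
from_vendor_isbn = closure({"vendor", "isbn"}, fds)
```

A dependency X → Y follows from the given set by the Armstrong axioms exactly when Y is contained in the closure of X.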
XFD INFERENCE
INFER:
A polynomial time algorithm which, given an
XFD φ : X → Y and a set of XFDs F, determines
whether or not φ can be inferred from F using L
(the rewrite rules).
It detects which XFDs in the initial set can be
eliminated or simplified, and derives
G (the redundancy-reduced set of XFDs).
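The idea of testing whether one dependency follows from the others, and dropping it if so, can be sketched for classical relational dependencies. This is an analogy under stated assumptions, not the INFER algorithm: it works on plain attribute sets using the Armstrong axioms only, and the names `closure`, `infers` and `reduce_fds` are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under the dependencies in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def infers(fds, lhs, rhs):
    """Return True if lhs -> rhs can be inferred from `fds`."""
    return rhs <= closure(lhs, fds)


def reduce_fds(fds):
    """Drop every dependency that can be inferred from the remaining ones."""
    kept = list(fds)
    for fd in fds:
        others = [g for g in kept if g is not fd]
        if infers(others, *fd):
            kept = others            # fd is redundant, so leave it out
    return kept


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
F = [(A, B), (B, C), (A, C)]         # A -> C follows from the first two
G = reduce_fds(F)
```

Each closure computation is polynomial in the size of the input, so the whole reduction is polynomial as well, which matches the flavour of the claim made for INFER.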
CONSTRAINT PRESERVING RELATIONAL STORAGE
RRXS: SCHEMA MAPPING
The XML-to-relational mapping method has the
following input and output:
Input: A set of XFDs F, and an optional DTD D.
Output: A target relational schema R with a set
of keys K, and a redundancy-reducing,
constraint-preserving transformation M.
REDUNDANCY REDUCING:
Redundancy which can be detected by F
using L is eliminated in R.
CONSTRAINT PRESERVING:
For any XML tree T, F holds on T if and
only if K holds on M(T).
ALGORITHM RRXS
EQUIVALENCE:
An algorithm to recognize equivalent XFDs and equivalent
elements, then group them in equivalence classes and
output G.
REDUCE:
An algorithm, similar to INFER, that removes redundant XFDs.
SHRINK:
An algorithm that removes unnecessary elements,
producing the set of XFDs
INSTANCE MAPPING
The instance mapping takes an XML tree T, which
conforms to DTD D and satisfies the XFDs F, together with
the schema mapping output M, and generates a relational
instance M(T) that conforms to schema R.
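A rough sketch of what such an instance mapping might look like for a single hand-picked schema is given below. It is not the mapping produced by RRXS: the sample document, the two target relations (`book` keyed by isbn, `sells` keyed by vendor name and isbn), and the function name `map_instance` are all assumptions chosen to show how a dependency such as isbn → title lets the title be stored once.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the same book is listed under two vendors.
DOC = """
<db>
  <vendor><name>A</name>
    <book><isbn>111</isbn><title>XML Basics</title><price>30</price></book>
    <book><isbn>222</isbn><title>SQL Primer</title><price>40</price></book>
  </vendor>
  <vendor><name>B</name>
    <book><isbn>111</isbn><title>XML Basics</title><price>25</price></book>
  </vendor>
</db>
""".strip()


def map_instance(root):
    """Shred the tree into two relations, each represented as a dict on its key."""
    book = {}    # relation book(isbn, title), key: isbn
    sells = {}   # relation sells(vendor, isbn, price), key: (vendor, isbn)
    for vendor in root.iter("vendor"):
        name = vendor.findtext("name")
        for b in vendor.findall("book"):
            isbn = b.findtext("isbn")
            title = b.findtext("title")
            # The key on book(isbn) mirrors the dependency isbn -> title.
            if book.setdefault(isbn, title) != title:
                raise ValueError(f"isbn {isbn} has conflicting titles")
            sells[(name, isbn)] = b.findtext("price")
    return book, sells


book, sells = map_instance(ET.fromstring(DOC))
```

With this layout each title is stored once per isbn rather than once per vendor listing, and a document that violates isbn → title is rejected as a key violation on `book`.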
EXPERIMENTAL EVALUATION
RESULTS
1. SOME NODE IDS GENERATED BY HYBRID
INLINING ARE REMOVED (ID).
2. USER-DEFINED XFDs ARE CORRECTLY
USED TO ELIMINATE REDUNDANCIES.
3. THE STRATEGY WORKS CORRECTLY FOR
RECURSIVE DATA
CONCLUSION
SUMMARY AND FUTURE WORK