Constructing Search Space for
Materialized View Selection
Dimitri Theodoratos
Wugang Xu
New Jersey Institute of Technology
DOLAP'04 - Washington DC
Problem (1)
• Many problems in Databases require the
selection of views to materialize.
• A general form of these problems is the
following:
– Given a set of queries, select a set of views to
materialize such that a cost function is
optimized and a number of constraints are
satisfied.
Problem (2)
• Examples of view selection problems in data warehousing (DW):
– Given a set of queries to be satisfied by the DW,
select a set of views to materialize such that the
combination of the query evaluation and view
maintenance cost is minimized and the size of the
materialized views does not exceed the space
allocated for materialization.
– Find the best global evaluation plan for multiple
incremental maintenance expressions for materialized
views.
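The space-constrained selection problem in the first example can be illustrated with a brute-force sketch. The candidate views, their sizes, their maintenance costs, and the per-query evaluation costs below are made-up numbers, and `total_cost` and `select_views` are hypothetical helpers; this is not the method of the talk, only a toy instance of the problem statement.

```python
from itertools import combinations

# Hypothetical candidate views: name -> (size, maintenance cost).
VIEWS = {"V1": (4, 2.0), "V2": (3, 1.0), "V3": (5, 4.0)}

# Hypothetical cost of evaluating each query from base relations only,
# and the cheaper cost when a given view is materialized.
BASE_COST = {"Q1": 10.0, "Q2": 8.0}
COST_WITH_VIEW = {("Q1", "V1"): 3.0, ("Q2", "V1"): 6.0,
                  ("Q1", "V2"): 7.0, ("Q2", "V2"): 2.0,
                  ("Q2", "V3"): 1.0}


def total_cost(selected):
    """Query evaluation cost plus view maintenance cost for a view set."""
    query_cost = sum(
        min([BASE_COST[q]] + [COST_WITH_VIEW[(q, v)]
                              for v in selected if (q, v) in COST_WITH_VIEW])
        for q in BASE_COST)
    maintenance = sum(VIEWS[v][1] for v in selected)
    return query_cost + maintenance


def select_views(space_limit):
    """Exhaustively pick the cheapest view set whose size fits the limit."""
    best, best_cost = (), total_cost(())
    names = sorted(VIEWS)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            if sum(VIEWS[v][0] for v in subset) > space_limit:
                continue  # violates the space constraint
            cost = total_cost(subset)
            if cost < best_cost:
                best, best_cost = subset, cost
    return set(best), best_cost


print(select_views(7))
```

Exhaustive enumeration is exponential in the number of candidate views, so it is usable only on toy inputs like this one.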
Problem (3)
• Solving view selection problems requires the
identification of common sub-expressions between
queries.
• Usually, this is done by identifying equivalent (or
subsumed) view nodes in query evaluation plans of two
queries in a bottom-up way.
• However, for this approach to be successful, all the
alternative query evaluation plans of the queries need to
be considered, which is infeasible.
Example - Query Evaluation Plans and
common subexpressions
[Figure: evaluation plans of Q1 and Q2, each a join tree over R, S, T, and U with join conditions A=A, B=B, and A=B.]
Example - Query Evaluation Plans and
common subexpressions
[Figure: alternative evaluation plans of Q1 and Q2 over R, S, T, and U with join conditions A=A, B=B, and A=B; a node labeled V appears in both plans.]
Example - Query Evaluation Plans and
common subexpressions
[Figure: another pair of evaluation plans of Q1 and Q2 over R, S, T, and U with join conditions A=A, B=B, and A=B; a node labeled V appears in both plans.]
Example - Query Evaluation Plans and
common subexpressions
[Figure: another pair of evaluation plans of Q1 and Q2 over R, S, T, and U with join conditions A=A, B=B, and A=B; a node labeled W appears in both plans.]
Example - Query Evaluation Plans and
common subexpressions
[Figure: another pair of evaluation plans of Q1 and Q2 over R, S, T, and U with join conditions A=A, B=B, and A=B; no shared node is labeled.]
Our approach
[Figure: diagram with the following labeled parts: Q1 and Q2; rewriting of Q1 using V and rewriting of Q2 using V; V (CCD); optimal evaluation plan of Q1 and optimal evaluation plan of Q2; evaluation plans of V (the search space of views to materialize).]
Goals
• Formalize the concept of ‘closeness’ of a
common subexpression to two queries.
• Design algorithms for computing common subexpressions that are as close to the queries as
possible (these common subexpressions are
called Closest Common Derivators).
• We address these problems starting with SPJ
queries that involve self-joins.
Example
Q1
Select R1.A, R2.B, R3.C
From U, R as R1, R as R2, R as R3, S as S1
Where U.A=R1.A and R1.B<=R2.B and R2.C<=R3.B
and R3.C=S1.C and R2.B<3 and R3.A>=4 and
R3.A<=7 and S1.D>=3
Q2
Select R4.C, R5.A, S3.C
From S as S2, R as R4, R as R5, S as S3, T
Where S2.C<=R4.C and R4.C=R5.B and R5.C<=S3.C
and S3.D=T.D and R4.B=3 and R5.A>=5 and
R5.A<=9 and S3.D>=3
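Query Graph Representation
[Figure: query graphs of Q1 and Q2. Each node is a relation occurrence, written occurrence[relation]:projected attributes. Each edge is labeled with a join condition, and selection conditions are attached to nodes.]
Q1
– Nodes: U, R1[R]:A, R2[R]:B, R3[R]:C, S1[S]
– Join edges: U.A=R1.A, R1.B≤R2.B, R2.C≤R3.B, R3.C=S1.C
– Selections: R2.B<3, R3.A≥4 ∧ R3.A≤7, S1.D≥3
Q2
– Nodes: S2[S], R4[R]:C, R5[R]:A, S3[S]:C, T
– Join edges: S2.C≤R4.C, R4.C=R5.B, R5.C≤S3.C, S3.D=T.D
– Selections: R4.B=3, R5.A≥5 ∧ R5.A≤9, S3.D≥3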
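A query graph of this kind can be held in a small data structure. The sketch below is one possible encoding, not the representation used by the authors: the class `QueryGraph` and its method names are hypothetical, and conditions are kept as plain strings. It encodes the query graph of Q1 from the example.

```python
from dataclasses import dataclass, field


@dataclass
class QueryGraph:
    """Nodes are relation occurrences; edges carry join conditions."""
    relation_of: dict = field(default_factory=dict)  # occurrence -> relation
    projected: dict = field(default_factory=dict)    # occurrence -> attributes
    selections: dict = field(default_factory=dict)   # occurrence -> conditions
    joins: dict = field(default_factory=dict)        # {occ, occ} -> conditions

    def add_node(self, occ, relation, attrs=(), selections=()):
        self.relation_of[occ] = relation
        self.projected[occ] = tuple(attrs)
        self.selections[occ] = list(selections)

    def add_join(self, occ1, occ2, condition):
        self.joins.setdefault(frozenset((occ1, occ2)), []).append(condition)

    def occurrences_of(self, relation):
        """All occurrences of one relation (more than one means a self-join)."""
        return sorted(o for o, r in self.relation_of.items() if r == relation)


# Query graph of Q1.
q1 = QueryGraph()
q1.add_node("U", "U")
q1.add_node("R1", "R", attrs=["A"])
q1.add_node("R2", "R", attrs=["B"], selections=["R2.B<3"])
q1.add_node("R3", "R", attrs=["C"], selections=["R3.A>=4", "R3.A<=7"])
q1.add_node("S1", "S", selections=["S1.D>=3"])
q1.add_join("U", "R1", "U.A=R1.A")
q1.add_join("R1", "R2", "R1.B<=R2.B")
q1.add_join("R2", "R3", "R2.C<=R3.B")
q1.add_join("R3", "S1", "R3.C=S1.C")

print(q1.occurrences_of("R"))
```

Keying nodes by occurrence name rather than by relation name is what lets the three occurrences of R coexist in one graph.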
Query rewritings
• A rewriting Q' of a query Q using a view V is a query that
references V and possibly base relations such that
replacing V by its definition results in a query equivalent
to Q. Notation: Q |-- V.
If there is a rewriting of Q that references only V (no
base relations), we call it a complete rewriting.
Notation: Q ||-- V.
Otherwise, we call it a partial rewriting.
• A rewriting Q' of a query Q using a view V is called a simple
rewriting if view V has a single occurrence in Q'.
• A rewriting Q' of a query Q using a view V is minimal if
for every relation R that has n, n>0, occurrences in Q, R
has k, 0 ≤ k ≤ n, occurrences in V and n-k occurrences in
Q'. Notation: Q |--m V.
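The occurrence-count condition in the definition of a minimal rewriting can be checked mechanically. The following is a minimal sketch under stated assumptions: the function name `satisfies_minimal_counts` is hypothetical, each query is reduced to the list of relation names it references (one entry per occurrence), and the check covers only the counts, not the equivalence between Q and the rewriting.

```python
from collections import Counter


def satisfies_minimal_counts(query_rels, view_rels, rewriting_rels):
    """Check the occurrence-count condition of a minimal rewriting.

    Each argument lists relation names, one entry per relation occurrence.
    rewriting_rels lists the base relations of Q' and excludes the view.
    """
    q, v, rw = Counter(query_rels), Counter(view_rels), Counter(rewriting_rels)
    # The view and the rewriting may only mention relations of the query.
    if set(v) - set(q) or set(rw) - set(q):
        return False
    # For each relation with n occurrences in Q: k in V and n - k in Q'.
    return all(0 <= v[r] <= n and rw[r] == n - v[r] for r, n in q.items())


# Q1 references U, three occurrences of R, and S. A view over two
# occurrences of R leaves U, one R, and S in the rewriting.
print(satisfies_minimal_counts(["U", "R", "R", "R", "S"],
                               ["R", "R"],
                               ["U", "R", "S"]))
```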
Common Derivator (CD) of two queries
• Let Q1 and Q2 be two queries and R1, R2 be two sets of
relation occurrences from Q1 and Q2, respectively, that
have the same number of relation occurrences of each
relation. A common derivator (CD) of Q1 and Q2 over the
respective sets R1 and R2 is a view V such that there is
a minimal rewriting of Q1 (resp. Q2) using V that
involves V and only those relation occurrences of Q1
(resp. Q2) that do not appear in R1 (resp. R2).
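The precondition on the two occurrence sets (the same number of occurrences of each relation) is easy to test. This is a small sketch with a hypothetical helper `same_relation_counts`; the two dictionaries map the occurrence names of Q1 and Q2 from the running example to their relations.

```python
from collections import Counter

# Occurrence -> relation, taken from the example queries Q1 and Q2.
Q1_RELATION_OF = {"U": "U", "R1": "R", "R2": "R", "R3": "R", "S1": "S"}
Q2_RELATION_OF = {"S2": "S", "R4": "R", "R5": "R", "S3": "S", "T": "T"}


def same_relation_counts(occs1, occs2,
                         relation_of_1=Q1_RELATION_OF,
                         relation_of_2=Q2_RELATION_OF):
    """True if both sets hold equally many occurrences of each relation."""
    count1 = Counter(relation_of_1[o] for o in occs1)
    count2 = Counter(relation_of_2[o] for o in occs2)
    return count1 == count2


print(same_relation_counts({"R2", "R3"}, {"R4", "R5"}))
print(same_relation_counts({"R2", "S1"}, {"R4", "R5"}))
```

Passing this test is necessary but not sufficient for a CD to exist over the two sets; the rewriting condition of the definition still has to hold.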
Example - Common Derivator
[Figure: query graphs of Q1 and Q2 with a CD V1 over R1={R2, R3} and R2={R4, R5}. V1 has nodes R6[R] and R7[R] joined by the edge R6.C≤R7.B. Arrows labeled f1 relate V1 to the occurrences R2 and R3 of Q1, and arrows labeled f2 relate it to the occurrences R4 and R5 of Q2.]
Example - Common Derivator
[Figure: query graphs of Q1 and Q2 with a CD V2 over R1={R2, R3, S1} and R2={R4, R5, S3}. V2 has nodes R6[R], R7[R], and S4[S], join edges R6.C≤R7.B and R7.C≤S4.C, and the selection R7.A≥3 ∧ R7.A≤9. Arrows labeled f1 relate V2 to the occurrences R2, R3, and S1 of Q1, and arrows labeled f2 relate it to the occurrences R4, R5, and S3 of Q2.]
Closeness relationship between CDs
• Let Q1, Q2 be two queries,
V = X(C(R)) be a CD of Q1 and Q2 over R1 and R2,
V' = X(C'(R')) be a CD of Q1 and Q2 over R1' and R2',
and R1 ⊆ R1' and R2 ⊆ R2'.
CD V' is closer to Q1 and Q2 than CD V if the following
conditions are satisfied:
(a) V' |-- V
(b) if C'(R') ||-- C(R) then V ||-- V'
Example - Closeness relationship
• V2 is closer to Q1 and Q2 than V1
[Figure: V1 and V2 side by side. V1, over R1={R2, R3} and R2={R4, R5}, has nodes R6[R] and R7[R] and the join edge R6.C≤R7.B. V2, over R1={R2, R3, S1} and R2={R4, R5, S3}, has nodes R6[R], R7[R], and S4[S], join edges R6.C≤R7.B and R7.C≤S4.C, and the selection R7.A≥3 ∧ R7.A≤9.]
Example - Closeness relationship
• V3 is closer to Q1 and Q2 than V2
[Figure: V2 and V3 side by side, both over R1={R2, R3, S1} and R2={R4, R5, S3}, with nodes R6[R], R7[R], and S4[S] and join edges R6.C≤R7.B and R7.C≤S4.C. V2 has the selection R7.A≥3 ∧ R7.A≤9. V3 has the selections R7.A≥3 ∧ R7.A≤9, R6.B≤3, and S4.D≥3.]
Example - Closeness relationship
• V4 is closer to Q1 and Q2 than V3
[Figure: V3 and V4 side by side, both over R1={R2, R3, S1} and R2={R4, R5, S3}, with join edges R6.C≤R7.B and R7.C≤S4.C and selections R7.A≥3 ∧ R7.A≤9, R6.B≤3, and S4.D≥3. V3 has nodes R6[R], R7[R], and S4[S]. V4 has nodes R6[R]:A,B,C, R7[R]:A,B,C, and S4[S]:D.]
Closest Common Derivator (CCD)
• Let Q1 and Q2 be two queries. A Closest Common
Derivator (CCD) of Q1 and Q2 over R1 and R2 is a
CD V of Q1 and Q2 over R1 and R2 such that there
exists no CD of Q1 and Q2 that is closer to Q1 and Q2
than V.
Example
[Figure: two CCDs of Q1 and Q2. CCD1, over R1={R2, R3, S1} and R2={R4, R5, S3}, has nodes R6[R]:A,B,C, R7[R]:A,B,C, and S4[S]:D, join edges R6.C≤R7.B and R7.C≤S4.C, and selections R7.A≥3 ∧ R7.A≤9, R6.B≤3, and S4.D≥3. CCD2 has nodes R8[R]:A,B,C and S5[S]:C,D, with arrows labeled f1 to R1 and S1 of Q1 and arrows labeled f2 to S2 and R4 of Q2; the slide text next to CCD2 reads R6.C≤R7.B, R1={R2, R3}, R2={R4, R5}.]
How to compute a CCD
• Query graph in Full Form
• Condition merging
• Candidate CCDs
• Comparison of Candidate CCDs over the
same occurrence set
Full Form Condition and Query
• A condition C is in full form if:
1. For every atomic condition Ai such that
C |= Ai, there is an atomic condition Aj in C such that
Aj |= Ai (|= denotes logical implication).
2. Condition C does not include strongly redundant
atomic conditions.
• A query X(C(R)) is in full form if its condition C is in
full form.
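Point 1 of the definition relies on implication between atomic conditions. The sketch below is a simplified illustration, not the procedure of the talk: it handles only atomic conditions that compare one attribute with a constant, it assumes a dense numeric domain (over integers it is sound but misses implications such as x>4 |= x>=5), and the names `implies` and `covered` are hypothetical.

```python
def implies(a, b):
    """Return True if atomic condition a logically implies b.

    Conditions are (attribute, op, constant) triples with op one of
    '>', '>=', '<', '<='.
    """
    (attr_a, op_a, c_a), (attr_b, op_b, c_b) = a, b
    # Only two bounds of the same direction on the same attribute compare.
    if attr_a != attr_b or op_a[0] != op_b[0]:
        return False
    if op_a[0] == ">":
        # Lower bounds: a implies b when a is at least as tight as b.
        if c_a > c_b:
            return True
        return c_a == c_b and (op_a == ">" or op_b == ">=")
    # Upper bounds, symmetric to the case above.
    if c_a < c_b:
        return True
    return c_a == c_b and (op_a == "<" or op_b == "<=")


def covered(condition, atom):
    """Point 1 for one atom: some atomic condition in C implies it."""
    return any(implies(a, atom) for a in condition)


print(implies(("B", ">", 7), ("B", ">", 5)))
print(covered([("B", ">", 7)], ("B", ">", 5)))
```

The call on the last line checks a single implied atom; a full-form test would have to do so for every atomic condition implied by C, which this sketch does not enumerate.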
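Example - Query graph full form
[Figure: query graphs before conversion to full form. Q1 has nodes R, S, and T, join edges R.A=S.A and S.B=T.B, and the selection B>5. Q2 has nodes S, T, and U, join edges S.B=T.B and T.C=U.C, and the selection B>7. The graph labeled 'CCD' has nodes S and T and the join edge S.B=T.B.]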
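Example - Query graph full form
[Figure: the same query graphs in full form. Q1 has nodes R, S, and T, join edges R.A=S.A and S.B=T.B, and the selection B>5. Q2 has nodes S, T, and U, join edges S.B=T.B and T.C=U.C, and the selections B>7 and B>5. The CCD has nodes S and T, the join edge S.B=T.B, and the selection B>5.]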
Condition Merging
• Two conditions C1 and C2 are mergeable if there is a
non-valid condition C such that C1 |= C and C2 |= C and
there exists no condition C', C' ≢ C, such that C1 |= C',
C2 |= C' and C' |= C.
Condition C is called a merge of C1 and C2.
• We show how the merge of two conditions can be
computed.
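For a restricted class of conditions the merge is easy to compute by hand. The sketch below is an illustration under strong assumptions, not the general computation referred to above: each condition is a single satisfiable closed range lo ≤ x ≤ hi on the same attribute, `None` stands for a missing bound, and the function name `merge_ranges` is hypothetical. Under these assumptions the tightest range implied by both inputs is their convex hull, and the conditions are not mergeable when that hull is unbounded on both sides (a valid condition).

```python
def merge_ranges(c1, c2):
    """Merge two range conditions lo <= x <= hi on the same attribute.

    A bound of None means "unbounded". Returns the tightest range implied
    by both inputs, or None when that range is always true, in which case
    the two conditions are not mergeable.
    """
    (lo1, hi1), (lo2, hi2) = c1, c2
    # A lower bound survives only if both inputs have one; keep the weaker.
    lo = None if lo1 is None or lo2 is None else min(lo1, lo2)
    # Same for the upper bound.
    hi = None if hi1 is None or hi2 is None else max(hi1, hi2)
    if lo is None and hi is None:
        return None
    return (lo, hi)


print(merge_ranges((5, None), (7, None)))   # two lower bounds
print(merge_ranges((1, 4), (2, 6)))         # two bounded ranges
print(merge_ranges((5, None), (None, 3)))   # nothing non-valid is shared
```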
CCD Computation
• We introduce the concept of a candidate CCD: a graph
representation of a CCD resulting from 'merging' common
subparts of two query graphs.
• We show that a CCD of two queries is a candidate CCD.
• We express the CCD closeness relationship on
candidate CCDs.
CCD Computation (2)
In order to compute all the CCDs of two queries:
• We compute all the candidate CCDs of two query graphs
in full form.
• We discard a candidate CCD V if there is another candidate CCD
V' that is closer to the queries than V.
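The discarding step amounts to keeping the candidates that no other candidate beats. The sketch below shows only that filtering pattern: `closest_candidates` is a hypothetical helper, candidates are modeled as pairs of frozensets (covered occurrences, conditions), and `toy_is_closer` is a deliberately simplistic stand-in for the closeness relationship, which in the talk is defined through query rewritings.

```python
def closest_candidates(candidates, is_closer):
    """Keep the candidates that no other candidate is closer than.

    is_closer(a, b) must return True when a is closer to the queries than b.
    """
    return [v for v in candidates
            if not any(is_closer(w, v) for w in candidates if w is not v)]


def toy_is_closer(a, b):
    """Stand-in closeness test (an assumption of this sketch):
    a covers at least the occurrences and conditions of b and differs."""
    return a != b and a[0] >= b[0] and a[1] >= b[1]


v1 = (frozenset({"R2", "R3"}), frozenset({"R6.C<=R7.B"}))
v2 = (frozenset({"R2", "R3", "S1"}),
      frozenset({"R6.C<=R7.B", "R7.C<=S4.C"}))
x = (frozenset({"R1", "S1"}), frozenset())

# v1 is discarded because v2 is closer; x is incomparable and survives.
print(closest_candidates([v1, v2, x], toy_is_closer) == [v2, x])
```

The filter compares every pair of candidates, so its cost is quadratic in the number of candidates.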
Future work
• Extend the concept of a CCD so that it
applies to a more general class of queries.
• Use the concept of a CCD to identify
common sub-expressions within one query.
• Use the concept of a CCD to design
algorithms for different materialized view
selection problems.
Thanks