Algorithm dec(S,F) - Decomposition into BCNF : R0(S0) – relation, F0 – set of FDs
% Initialization
S=S0; F= F0;
% Body
if (S in BCNF) return S;
else if (X → Y ∈ F violates BCNF)
{ compute X+ ;
S1 = X+ ; S2 = (X, S - X+ );
compute F1; compute F2;
return (dec(S1, F1) ∪ dec(S2, F2)) }
Basis of recursion (stop condition): binary relation
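The recursion above can be sketched in Python. This is a minimal illustration, not the course's reference implementation: the helper names `fd`, `closure` and `project_fds` are assumptions, "compute F1; compute F2" is done by brute force over all attribute subsets (exponential, acceptable only for small schemas), and the binary-relation stop condition is covered by the general BCNF test, since a two-attribute schema never has a violating FD.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD from two space-separated attribute strings (hypothetical helper)."""
    return frozenset(lhs.split()), frozenset(rhs.split())

def closure(attrs, fds):
    """Compute X+: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def project_fds(schema, fds):
    """Compute Fi: the nontrivial FDs that hold inside schema (brute force)."""
    projected = []
    for size in range(1, len(schema)):
        for lhs in map(frozenset, combinations(sorted(schema), size)):
            rhs = (closure(lhs, fds) & schema) - lhs
            if rhs:
                projected.append((lhs, rhs))
    return projected

def dec(schema, fds):
    """Return a set of BCNF schemas (frozensets) for schema under fds."""
    schema = frozenset(schema)
    for lhs, rhs in fds:
        x_plus = closure(lhs, fds) & schema
        if not rhs <= lhs and x_plus != schema:  # nontrivial FD, X is not a superkey
            s1 = x_plus                           # S1 = X+
            s2 = lhs | (schema - x_plus)          # S2 = (X, S - X+)
            return dec(s1, project_fds(s1, fds)) | dec(s2, project_fds(s2, fds))
    return {schema}                               # S is in BCNF

f0 = [fd("stud course", "lecturer"), fd("lecturer", "L_city"), fd("L_city", "L_region")]
design = dec({"stud", "course", "lecturer", "L_city", "L_region"}, f0)
for s in sorted(design, key=sorted):
    print(sorted(s))
```

The result depends on which violating FD is picked first; this sketch simply takes the first one in list order.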
Example 1: St (stud, course, semester, lecturer, time)
St:
stud  course  semester  lecturer  time
Avi   89-01   b         Lavan     Mon-16:18
Avi   89-01   b         Lavan     Wed-12:14
Beni  89-02   b         Kahol     Wed-12:14
S = (stud, course, semester, lecturer, time)
F = { stud, course → semester; stud, course → lecturer }
stud, course → semester violates BCNF
{ stud, course }+ = { stud, course, semester, lecturer }
S1 = (stud, course, semester, lecturer); S2 = (stud, course, time)
F1 = { stud, course → semester; stud, course → lecturer }; F2 is empty
⇒ Final design: S1, S2
Example 2: St1(stud, course, lecturer, L_city, L_region)
St1:
stud  course  lecturer  L_city  L_region
Avi   89-01   Lavan     RG      center
Beni  89-01   Lavan     RG      center
Beni  89-02   Kahol     Haifa   North
S = (stud, course, lecturer, L_city, L_region)
F = { stud, course → lecturer; lecturer → L_city; L_city → L_region }
lecturer → L_city violates BCNF
{ lecturer }+ = { lecturer, L_city, L_region }
S1 = (lecturer, L_city, L_region); S2 = (lecturer, stud, course)
F1 = { lecturer → L_city; L_city → L_region }, F2 = { stud, course → lecturer }
⇒ S1 – not BCNF; S2 – BCNF ⇒ dec(S1, F1)
S = (lecturer, L_city, L_region)
F = { lecturer → L_city; L_city → L_region }
L_city → L_region violates BCNF
{ L_city }+ = { L_city, L_region }
S1 = (L_city, L_region); S2 = (L_city, lecturer)
F1 = { L_city → L_region }, F2 = { lecturer → L_city }
⇒ Final design: (L_city, L_region); (lecturer, L_city); (stud, course, lecturer)
Decomposition properties: BCNF
Elimination of anomalies: v
Info preservation (lossless-join dec.): v
Dependencies preservation: -
Recovering of info from decomposition: natural join
Theorem. R(S), S = X ∪ Y ∪ Z, F = {Y → Z}
R = π_{X ∪ Y}(R) |X| π_{Y ∪ Z}(R)
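Example
Lossless-join dec.: R(X,Y,Z), Y → Z
R:
X   Y  Z
x1  y  z
x2  y  z
π_{X ∪ Y}(R):
X   Y
x1  y
x2  y
π_{Y ∪ Z}(R):
Y  Z
y  z
π_{X ∪ Y}(R) |X| π_{Y ∪ Z}(R) = R:
X   Y  Z
x1  y  z
x2  y  z
Lossy-join dec.: R(X,Y,Z), Y -/-> Z, Y -/-> X
R:
X   Y  Z
x1  y  z1
x2  y  z2
π_{X ∪ Y}(R):
X   Y
x1  y
x2  y
π_{Y ∪ Z}(R):
Y  Z
y  z1
y  z2
π_{X ∪ Y}(R) |X| π_{Y ∪ Z}(R) ≠ R:
X   Y  Z
x1  y  z1
x1  y  z2
x2  y  z1
x2  y  z2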
Join - the only way to recover!
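The two cases of the example can be replayed with a small sketch. It assumes relations are modelled as sets of rows, each row a frozenset of (attribute, value) pairs; the function names `relation`, `project`, `natural_join` and `rejoin` are illustrative only.

```python
def relation(schema, *tuples):
    """Build a relation as a set of rows; each row is a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(schema, t)) for t in tuples}

def project(rows, attrs):
    """Projection onto attrs; duplicates vanish because the result is a set."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rows}

def natural_join(r1, r2):
    """Combine every pair of rows that agree on all shared attributes."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined

def rejoin(r):
    """Decompose R(X,Y,Z) into (X,Y) and (Y,Z), then join the projections back."""
    return natural_join(project(r, {"X", "Y"}), project(r, {"Y", "Z"}))

with_fd = relation("XYZ", ("x1", "y", "z"), ("x2", "y", "z"))       # Y -> Z holds
without_fd = relation("XYZ", ("x1", "y", "z1"), ("x2", "y", "z2"))  # Y -/-> Z, Y -/-> X

print(rejoin(with_fd) == with_fd)   # True: lossless
print(len(rejoin(without_fd)))      # 4: two spurious tuples appear
```

In the lossy case the join is a strict superset of R, so the original tuples can no longer be told apart from the spurious ones.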
Chase test: lossless-join
True: Natural join - associative, commutative
True: R ⊆ N.j. of projections
Check: N.j. of projections ⊆ R (t ∈ N.j. ⇒ t ∈ R)
Tableau
One tuple for every schema Si; subscripted symbols for the attributes outside Si
Example
R(A,B,C,D); S1 = (A,D), S2 = (A,C), S3 = (B,C,D)
     A   B   C   D
S1:  a   b1  c1  d
S2:  a   b2  c   d2
S3:  a3  b   c   d
Chase
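X → Y: x's agree ⇒ y's agree
Example F = { A → B; B → C; C,D → A }
Initial tableau:
A   B   C   D
a   b1  c1  d
a   b2  c   d2
a3  b   c   d
After A → B (rows 1 and 2 agree on A, so b2 = b1):
A   B   C   D
a   b1  c1  d
a   b1  c   d2
a3  b   c   d
After B → C (c1 = c) and C,D → A (a3 = a):
A   B   C   D
a   b1  c   d
a   b1  c   d2
a   b   c   d
The last row is (a, b, c, d) with no subscripts ⇒ lossless join.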
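The chase on this example can be reproduced with a short sketch. The function name `chase_lossless` and the symbol encoding (lowercase attribute name, plus the row number as subscript) are assumptions made for illustration; the sketch stops as soon as one row carries no subscripts.

```python
def chase_lossless(attrs, schemas, fds):
    """Chase test: return (is_lossless, final tableau rows)."""
    target = {a: a.lower() for a in attrs}  # the row without subscripts
    # One row per schema Si: plain symbol inside Si, subscripted symbol outside
    rows = [{a: (a.lower() if a in s else f"{a.lower()}{i}") for a in attrs}
            for i, s in enumerate(schemas, start=1)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if any(r1[a] != r2[a] for a in lhs):
                        continue  # the rows disagree on X: nothing to equate
                    for b in rhs:
                        if r1[b] != r2[b]:
                            # Keep the unsubscripted symbol if there is one
                            keep, drop = sorted((r1[b], r2[b]),
                                                key=lambda s: (s != b.lower(), s))
                            for r in rows:
                                if r[b] == drop:
                                    r[b] = keep
                            changed = True
                            if target in rows:
                                return True, rows
    return target in rows, rows

attrs = ["A", "B", "C", "D"]
schemas = [{"A", "D"}, {"A", "C"}, {"B", "C", "D"}]
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"A"})]
ok, tableau = chase_lossless(attrs, schemas, fds)
print(ok)
for row in tableau:
    print(" ".join(row[a] for a in attrs))
```

If no row ever loses all its subscripts, the final tableau itself is a counterexample: an instance that satisfies F but is not recovered by the join.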
Dependency preservation: R(stud, course, lecturer),
F = { stud, course → lecturer; lecturer → course }
BCNF: R1(lecturer, course), R2(lecturer, stud)
R1:
lecturer  course
Kahol     DB
Lavan     DB
R2:
lecturer  stud
Kahol     Avi
Lavan     Avi
R1 |X| R2:
stud  course  lecturer
Avi   DB      Kahol
Avi   DB      Lavan
lecturer → course holds in the join
stud, course -/-> lecturer: this FD is lost
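The lost dependency can be checked on the instance above. The sketch below assumes rows are Python dicts and uses a hypothetical helper `holds` that tests whether an FD is satisfied by a list of rows.

```python
def holds(rows, lhs, rhs):
    """True if the FD lhs -> rhs is satisfied by every pair of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same left side, different right side
    return True

r1 = [{"lecturer": "Kahol", "course": "DB"}, {"lecturer": "Lavan", "course": "DB"}]
r2 = [{"lecturer": "Kahol", "stud": "Avi"}, {"lecturer": "Lavan", "stud": "Avi"}]

# Natural join of R1 and R2 on their only shared attribute, lecturer
joined = [{**t1, **t2} for t1 in r1 for t2 in r2 if t1["lecturer"] == t2["lecturer"]]

print(holds(r1, ["lecturer"], ["course"]))              # True
print(holds(joined, ["lecturer"], ["course"]))          # True
print(holds(joined, ["stud", "course"], ["lecturer"]))  # False
```

Each of R1 and R2 is legal on its own, yet their join violates stud, course → lecturer, which is why this FD cannot be enforced by checking the two relations separately.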
3NF - Third Normal Form
Def. 1. ∀ X → Y - nontrivial FD
  ⇒ either X - superkey
     or ∀ Yi ∈ Y - X, Yi ∈ some key
Def. 2. …
  ⇒ either left side - superkey
     or right side consists of prime attributes only
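A 3NF test following these definitions might look as follows. It is a brute-force sketch: `candidate_keys` enumerates all attribute subsets, so it is only suitable for small schemas, and all function names are assumptions.

```python
from itertools import combinations

def closure(attrs, fds):
    """Compute X+: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """All minimal superkeys, found by trying subsets in order of increasing size."""
    keys = []
    for size in range(1, len(schema) + 1):
        for cand in map(frozenset, combinations(sorted(schema), size)):
            if closure(cand, fds) >= schema and not any(k <= cand for k in keys):
                keys.append(cand)
    return keys

def is_3nf(schema, fds):
    """Every FD: left side is a superkey, or every attribute of Y - X is prime."""
    schema = frozenset(schema)
    prime = set().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:
            continue                      # X is a superkey
        if not (rhs - lhs) <= prime:
            return False                  # some Yi is in no key
    return True

r = {"stud", "course", "lecturer"}
f = [(frozenset({"stud", "course"}), frozenset({"lecturer"})),
     (frozenset({"lecturer"}), frozenset({"course"}))]
print(is_3nf(r, f))  # True: lecturer is not a superkey, but course is prime
```

The same relation is not in BCNF, because lecturer → course has a left side that is not a superkey.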
Decomposition properties: 3NF
Elimination of anomalies: -
Info preservation (lossless-join dec.): v
Dependencies preservation: v
Algorithm - Decomposition into 3NF : R (S) – relation, F – set of FDs
B = minimal basis for F;
for (∀ f = (X → Y) ∈ B) add scheme (X,Y);
if ( none of the schemes from the for-loop is a superkey for R) add scheme = key for R;
Example
R(A,B,C,D,E), F = {A,B → C; C → B; A → D}
1. Show that F is a minimal basis
2. R1(A,B,C), R2(B,C), R3(A,D)
S2 ⊆ S1 ⇒ drop R2 ⇒ current design: R1(A,B,C), R3(A,D)
3. Keys: K1 = A,B,E; K2 = A,C,E ⇒ final design: R1(A,B,C), R3(A,D), R4(A,B,E)
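Steps 2 and 3 of the example can be sketched in Python. The sketch assumes the minimal basis is already given (step 1 is not automated), and the names `synthesize_3nf` and `find_key` are hypothetical; the key search is brute force.

```python
from itertools import combinations

def closure(attrs, fds):
    """Compute X+: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def find_key(schema, fds):
    """Return one smallest key, trying subsets in sorted order."""
    for size in range(len(schema) + 1):
        for cand in combinations(sorted(schema), size):
            if closure(cand, fds) >= schema:
                return frozenset(cand)

def synthesize_3nf(schema, basis):
    """3NF synthesis from a minimal basis (assumed minimal, not verified here)."""
    schema = frozenset(schema)
    schemes = [lhs | rhs for lhs, rhs in basis]          # one scheme (X,Y) per FD
    # Drop a scheme contained in another one (or repeating an earlier one)
    schemes = [s for i, s in enumerate(schemes)
               if not any(s < t or (s == t and j < i) for j, t in enumerate(schemes))]
    # If no scheme is a superkey for R, add a scheme that is a key for R
    if not any(closure(s, basis) >= schema for s in schemes):
        schemes.append(find_key(schema, basis))
    return schemes

basis = [(frozenset("AB"), frozenset("C")),
         (frozenset("C"), frozenset("B")),
         (frozenset("A"), frozenset("D"))]
for s in synthesize_3nf("ABCDE", basis):
    print("".join(sorted(s)))
```

Either key may be added in the last step; this sketch happens to pick K1 = A,B,E because it tries candidates in alphabetical order.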
Multivalued dependencies - mvd X ->> Y ; S - X - Y
Tuples: one value on X, set of values on Y independent of the set of values on S - X - Y
R(name, phone, cr_card, valid): name ->> phone; name ->> cr_card, valid
Rules
1. Trivial mvd: if Y ⊆ X then X ->> Y
2. Transitive rule: if X ->> Y and Y ->> Z then X ->> Z (Z = Z - X)
3. No splitting: name -/>> cr_card, name -/>> valid
4. FD promotion: if X → Y then X ->> Y
5. Complementation rule: if X ->> Y then X ->> S-X-Y
6. More trivial mvd: if S = (X,Y) then X ->> Y
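The definition and the no-splitting and complementation rules can be tried on a small instance. The data below is hypothetical (one person with two phones and two credit cards), and `holds_mvd` is an illustrative helper that checks an mvd on a list of rows rather than inferring it from other dependencies.

```python
def holds_mvd(rows, schema, x, y):
    """Check X ->> Y on an instance: for two rows that agree on X, the row taking
    X and Y from the first and S - X - Y from the second must also be present."""
    rest = [a for a in schema if a not in x and a not in y]
    present = {tuple(r[a] for a in schema) for r in rows}
    for t in rows:
        for u in rows:
            if all(t[a] == u[a] for a in x):
                v = {**t, **{a: u[a] for a in rest}}  # X, Y from t; S - X - Y from u
                if tuple(v[a] for a in schema) not in present:
                    return False
    return True

schema = ["name", "phone", "cr_card", "valid"]
# Hypothetical data: every phone of Avi is paired with every (cr_card, valid) pair
rows = [dict(zip(schema, t)) for t in [
    ("Avi", "p1", "c1", "v1"), ("Avi", "p1", "c2", "v2"),
    ("Avi", "p2", "c1", "v1"), ("Avi", "p2", "c2", "v2"),
]]

print(holds_mvd(rows, schema, ["name"], ["phone"]))             # True
print(holds_mvd(rows, schema, ["name"], ["cr_card", "valid"]))  # True (complementation)
print(holds_mvd(rows, schema, ["name"], ["cr_card"]))           # False (no splitting)
```

The last check fails because card c1 never appears with validity v2, so cr_card alone is not independent of the remaining attributes.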
4NF: R(S), X, Y ⊆ S, X ->> Y
If X ->> Y is a nontrivial mvd, then X is a superkey
Algorithm dec4(S,F) - Dec. into 4NF : R0(S0) – relation, F0 – set of FDs and mvd's
% Initialization
S=S0; F= F0;
% Body
if (S in 4NF) return S;
else if (X ->> Y ∈ F violates 4NF) // X is not a superkey
{ S1 = (X,Y); S2 = (X, S-X-Y);
compute F1; compute F2;
return (dec4(S1, F1) ∪ dec4(S2, F2)) }
Basis of recursion (stop condition): binary relation
Inclusion: 3NF ⊃ BCNF ⊃ 4NF
Property of dec.              3NF  BCNF  4NF
Eliminates redundancy (FD)    -    v     v
Eliminates redundancy (mvd)   -    -     v
Preserves FDs                 v    -     -
Preserves mvd's               -    -     -