Multivalued Dependency - Department of Computer Science

Multivalued Dependency
Prof. Sin-Min Lee
Department of Computer Science
HIGHER NORMAL FORMS
1NF ⊃ 2NF ⊃ 3NF ⊃ BCNF ⊃ 4NF ⊃ 5NF
2NF, 3NF, BCNF: functional dependencies
4NF: multivalued dependencies
5NF: join dependencies
STUDENT learns MODULE
John learns Pascal, Databases, Java
Mary learns C++
Jenny learns C++, Databases

STUDENT enjoys HOBBY
John enjoys Music, Jogging
Mary enjoys Reading, Tennis, Cycling
Jenny enjoys Music

STUDENT  MODULE                   HOBBY
John     Pascal, Databases, Java  Music, Jogging
Mary     C++                      Reading, Tennis, Cycling
Jenny    C++, Databases           Music
PROFILE is in BCNF but exhibits
redundancy and insertion, deletion and update anomalies
A multivalued dependency X ↠ Y
holds in R(X, Y, Z), where X, Y, Z are pairwise disjoint subsets of R, if:
whenever two tuples of R
agree in value of X,
their image sets in the projection of R on (X, Y)
are the same.
In PROFILE:
STUDENT ↠ MODULE
STUDENT ↠ HOBBY
MODULE and HOBBY are mutually independent
PROFILE
STUDENT  MODULE     HOBBY
John     Pascal     Music
John     Pascal     Jogging
John     Databases  Music
John     Databases  Jogging
John     Java       Music
John     Java       Jogging
Mary     C++        Reading
Mary     C++        Tennis
Mary     C++        Cycling
Jenny    C++        Music
Jenny    Databases  Music
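The redundancy in PROFILE can be made concrete with a short Python sketch. It rebuilds the table from the two independent fact sets (who learns what, who enjoys what) and counts how often each fact is repeated. The names `learns`, `enjoys`, `build_profile` and `rows_for_new_hobby`, and the hobby "Chess", are illustrative assumptions and do not come from the slides.

```python
from collections import Counter

# The two independent fact sets from the slides.
learns = {
    "John": ["Pascal", "Databases", "Java"],
    "Mary": ["C++"],
    "Jenny": ["C++", "Databases"],
}
enjoys = {
    "John": ["Music", "Jogging"],
    "Mary": ["Reading", "Tennis", "Cycling"],
    "Jenny": ["Music"],
}


def build_profile(learns, enjoys):
    """Pair every module of a student with every hobby of that student."""
    return [
        (student, module, hobby)
        for student in learns
        for module in learns[student]
        for hobby in enjoys.get(student, [])
    ]


def rows_for_new_hobby(student, hobby):
    """Rows that must be inserted into PROFILE to record one new hobby."""
    return [(student, module, hobby) for module in learns[student]]


profile = build_profile(learns, enjoys)

# How many times is each "student learns module" fact stored?
module_copies = Counter((s, m) for s, m, _ in profile)

print(len(profile), "rows in PROFILE")
print("(Mary, C++) is stored", module_copies[("Mary", "C++")], "times")
print("One new hobby for John needs", len(rows_for_new_hobby("John", "Chess")), "rows")
```

Each module fact is repeated once per hobby of the same student, and recording a single new hobby means inserting one row per module, which is the insertion anomaly the slide refers to.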
Fourth Normal Form
preventing conjunction of unrelated facts
R(X, Y, Z) is in 4NF if,
whenever a nontrivial multivalued dependency X ↠ Y holds for R,
so does the functional dependency X → A for all attributes A in R
[Figure: R with columns X (x1, x2, …, xn), Y (y1, y2, …, yn), Z (z1, z2, …, zn); R is in 4NF if, whenever an MVD holds, it is also an FD.]
4NF: every MVD is an FD
Multivalued Dependencies
The multivalued dependency X ↠ Y holds
in a relation R if whenever we have two
tuples of R that agree in all the attributes of
X, then we can swap their Y components
and get two new tuples that are also in R.
[Figure: a tuple split into its X, Y and "others" components.]
Example
Drinkers(name, addr, phones, beersLiked)
with MVD name ↠ phones. If Drinkers has
the two tuples:
name  addr  phones  beersLiked
sue   a     p1      b1
sue   a     p2      b2
it must also have the same tuples with phones
components swapped:
name  addr  phones  beersLiked
sue   a     p2      b1
sue   a     p1      b2
Note: we must check this condition for all pairs of tuples
that agree on name, not just one pair.
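The swap condition can be checked mechanically. The following is a minimal Python sketch, not taken from the slides: the function name `satisfies_mvd` and the encoding of a relation as a list of tuples are assumptions made for illustration. It tests every ordered pair of tuples that agree on X, as the note above requires.

```python
from itertools import product


def satisfies_mvd(rows, attrs, x, y):
    """Return True if the MVD x ->> y holds in the relation instance `rows`.

    rows  : iterable of tuples, one value per attribute in `attrs`
    attrs : list of attribute names, in column order
    x, y  : lists of attribute names (left and right side of the MVD)
    """
    rel = {tuple(r) for r in rows}
    x_idx = [attrs.index(a) for a in x]
    y_idx = {attrs.index(a) for a in y}
    for t1, t2 in product(rel, repeat=2):
        if all(t1[i] == t2[i] for i in x_idx):
            # Take the Y components from t1 and every other component from t2.
            swapped = tuple(
                t1[i] if i in y_idx else t2[i] for i in range(len(attrs))
            )
            if swapped not in rel:
                return False
    return True


attrs = ["name", "addr", "phones", "beersLiked"]
two_tuples = [("sue", "a", "p1", "b1"), ("sue", "a", "p2", "b2")]
four_tuples = two_tuples + [("sue", "a", "p2", "b1"), ("sue", "a", "p1", "b2")]

print(satisfies_mvd(two_tuples, attrs, ["name"], ["phones"]))   # False
print(satisfies_mvd(four_tuples, attrs, ["name"], ["phones"]))  # True
```

Iterating over ordered pairs means both directions of each swap are covered without extra code.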
MVD Rules
1. Every FD is an MVD.
– Because if X → Y, then swapping Y’s between
tuples that agree on X doesn’t create new tuples.
– Example, in Drinkers: name ↠ addr.
2. Complementation: if X ↠ Y, then X ↠ Z,
where Z is all attributes not in X or Y.
– Example: since name ↠ phones
holds in Drinkers, so does
name ↠ addr beersLiked.
Splitting Doesn’t Hold
Sometimes you need to have several attributes on the right of an
MVD. For example:
Drinkers(name, areaCode, phones, beersLiked, beerManf)
name  areaCode  phones    beersLiked  beerManf
Sue   831       555-1111  Bud         A.B.
Sue   831       555-1111  Wicked Ale  Pete’s
Sue   408       555-9999  Bud         A.B.
Sue   408       555-9999  Wicked Ale  Pete’s
• name ↠ areaCode phones holds, but neither
name ↠ areaCode nor name ↠ phones does.
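The claim about splitting can be verified on the Drinkers table above. This is a sketch under assumptions: the function name `mvd_holds` is hypothetical, and it uses the grouping view of an MVD (within each group of tuples that agree on X, the Y values and the remaining values must form a full cross product) rather than the pairwise swap test. It expects Y to be disjoint from X.

```python
from collections import defaultdict


def mvd_holds(rows, attrs, x, y):
    """Check x ->> y by grouping on x and testing for a full cross product."""
    pos = {a: i for i, a in enumerate(attrs)}
    z = [a for a in attrs if a not in x and a not in y]  # the "other" attributes

    def pick(t, names):
        return tuple(t[pos[a]] for a in names)

    groups = defaultdict(set)
    for t in rows:
        groups[pick(t, x)].add((pick(t, y), pick(t, z)))

    for pairs in groups.values():
        y_values = {p[0] for p in pairs}
        z_values = {p[1] for p in pairs}
        # pairs is always a subset of y_values x z_values,
        # so equal sizes means the group is the whole cross product.
        if len(pairs) != len(y_values) * len(z_values):
            return False
    return True


attrs = ["name", "areaCode", "phones", "beersLiked", "beerManf"]
drinkers = [
    ("Sue", "831", "555-1111", "Bud", "A.B."),
    ("Sue", "831", "555-1111", "Wicked Ale", "Pete's"),
    ("Sue", "408", "555-9999", "Bud", "A.B."),
    ("Sue", "408", "555-9999", "Wicked Ale", "Pete's"),
]

print(mvd_holds(drinkers, attrs, ["name"], ["areaCode", "phones"]))  # True
print(mvd_holds(drinkers, attrs, ["name"], ["areaCode"]))            # False
print(mvd_holds(drinkers, attrs, ["name"], ["phones"]))              # False
```

The split MVDs fail because an area code is tied to its phone number: swapping only the area code would produce, for example, 408 with 555-1111, which is not in the table.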
4NF
Eliminate redundancy due to multiplicative effect of MVD’s.
• Roughly: treat MVD’s as FD’s for decomposition, but not
for finding keys.
• Formally: R is in Fourth Normal Form if whenever MVD
X ↠ Y is nontrivial (Y is not a subset of X, and X ∪ Y is
not all attributes), then X is a superkey.
– Remember, X → Y implies X ↠ Y, so 4NF is more stringent
than BCNF.
• Decompose R, using
4NF violation X ↠ Y,
into XY and X ∪ (R − Y).
[Figure: R decomposed around X and Y.]
Example
Drinkers(name, addr, phones, beersLiked)
• FD: name → addr
• Nontrivial MVD’s: name ↠ phones and
name ↠ beersLiked.
• Only key: {name, phones, beersLiked}
• All three dependencies above violate 4NF.
• Successive decomposition yields 4NF relations:
D1(name, addr)
D2(name, phones)
D3(name, beersLiked)
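A small Python sketch can illustrate why this decomposition is safe. It projects a Drinkers instance onto D1, D2 and D3 and joins the pieces back together. The helper names `project`, `natural_join` and `decompose_and_rejoin`, and the sample rows, are assumptions for illustration only.

```python
ATTRS = ["name", "addr", "phones", "beersLiked"]


def project(rows, attrs, keep):
    """Projection onto the attributes in `keep` (duplicates removed)."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two relations given as sets of tuples."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    joined = set()
    for t in r1:
        for u in r2:
            if all(t[attrs1.index(c)] == u[attrs2.index(c)] for c in common):
                joined.add(t + tuple(u[attrs2.index(e)] for e in extra))
    return joined, attrs1 + extra


def decompose_and_rejoin(rows):
    """Split into D1, D2, D3 and return D1 join D2 join D3."""
    d1 = project(rows, ATTRS, ["name", "addr"])
    d2 = project(rows, ATTRS, ["name", "phones"])
    d3 = project(rows, ATTRS, ["name", "beersLiked"])
    joined, joined_attrs = natural_join(d1, ["name", "addr"], d2, ["name", "phones"])
    joined, joined_attrs = natural_join(joined, joined_attrs, d3, ["name", "beersLiked"])
    return joined


# An instance that satisfies name -> addr, name ->> phones, name ->> beersLiked.
legal = [
    ("sue", "a", "p1", "b1"),
    ("sue", "a", "p1", "b2"),
    ("sue", "a", "p2", "b1"),
    ("sue", "a", "p2", "b2"),
]
# An instance that violates name ->> phones.
illegal = [("sue", "a", "p1", "b1"), ("sue", "a", "p2", "b2")]

print(decompose_and_rejoin(legal) == set(legal))      # True: nothing lost or gained
print(decompose_and_rejoin(illegal) == set(illegal))  # False: the join adds tuples
```

For the legal instance the join reproduces the original rows exactly. For the instance that violates the MVD, the join produces the two tuples the MVD would have required, so the decomposition is only lossless when the dependencies actually hold.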
Multivalued Dependencies
• Multivalued dependencies are referred to as tuple-generating dependencies.
• Let R be a relation schema and let a ⊆ R and b ⊆ R. The
multivalued dependency a ↠ b
holds on R if, in any legal relation r(R), for all pairs of
tuples t1 and t2 in r such that t1[a] = t2[a], there exist
tuples t3 and t4 in r such that
Multivalued Dependencies (cont)
• t1[a] = t2[a] = t3[a] = t4[a]
t3[b] = t1[b]
t3[R − b] = t2[R − b]
t4[b] = t2[b]
t4[R − b] = t1[R − b]
• The multivalued dependency a ↠ b says that the
relationship between a and b is independent of the
relationship between a and R − b.
Multivalued Dependencies (cont)
• If the multivalued dependency a ↠ b is satisfied by all
relations on schema R, then a ↠ b is a trivial
multivalued dependency on schema R.
• Thus, a ↠ b is trivial if b ⊆ a or b ∪ a = R
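The triviality test translates directly into set operations. A minimal sketch, assuming attributes are represented as Python sets of names; the function name `is_trivial_mvd` and the sample schema (borrowed from the Drinkers example) are illustrative choices.

```python
def is_trivial_mvd(schema, a, b):
    """a ->> b is trivial on `schema` if b is a subset of a, or a union b is all of `schema`."""
    schema, a, b = set(schema), set(a), set(b)
    return b <= a or (a | b) == schema


R = {"name", "addr", "phones", "beersLiked"}

print(is_trivial_mvd(R, {"name", "phones"}, {"phones"}))              # True: b is a subset of a
print(is_trivial_mvd(R, {"name"}, {"addr", "phones", "beersLiked"}))  # True: a and b cover R
print(is_trivial_mvd(R, {"name"}, {"phones"}))                        # False
```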
Tabular representation of a ↠ b
      a        b          R−a−b
t1    a1…ai    ai+1…aj    aj+1…an
t2    a1…ai    bi+1…bj    bj+1…bn
t3    a1…ai    ai+1…aj    bj+1…bn
t4    a1…ai    bi+1…bj    aj+1…an
Example: Here is an example of multivalued dependencies
given R(A B C D). To show that A ↠ BD, we can rearrange the
table to R(A B D C).

R(A B C D)
      A  B  C  D
r1    1  2  3  1
r2    2  1  4  1
r3    2  1  2  1
r4    2  1  2  2
r5    3  1  3  1
r6    4  2  3  1
r7    3  2  3  2
r8    2  1  4  2

R(A B D C)
      A  B  D  C
r1    1  2  1  3
r2    2  1  1  4
r3    2  1  1  2
r4    2  1  2  2
r5    3  1  1  3
r6    4  2  1  3
r7    3  2  2  3
r8    2  1  2  4
Example (Cont.): Perform each test to check if A ↠ BD.
(Tuples are written in the order A B D C.)

Test  t1            t2            t3            t4
1     r5 = 3 1 1 3  r7 = 3 2 2 3  r5 = 3 1 1 3  r7 = 3 2 2 3
2     r2 = 2 1 1 4  r3 = 2 1 1 2  r3 = 2 1 1 2  r2 = 2 1 1 4
3     r2 = 2 1 1 4  r8 = 2 1 2 4  r2 = 2 1 1 4  r8 = 2 1 2 4
4     r2 = 2 1 1 4  r4 = 2 1 2 2  r3 = 2 1 1 2  r8 = 2 1 2 4
5     r3 = 2 1 1 2  r4 = 2 1 2 2  r3 = 2 1 1 2  r4 = 2 1 2 2
6     r3 = 2 1 1 2  r8 = 2 1 2 4  r2 = 2 1 1 4  r4 = 2 1 2 2
7     r4 = 2 1 2 2  r8 = 2 1 2 4  r8 = 2 1 2 4  r4 = 2 1 2 2

Each test is satisfied, so A ↠ BD is true.
Multivalued Dependencies (cont)
• To illustrate the difference between functional and
multivalued dependencies, we consider again the BC-schema.
Graph 1
loan-number  customer-name  customer-street  customer-city
L-23         Smith          North            Rye
L-23         Smith          Main             Manchester
L-93         Curry          Lake             Horseneck
Multivalued Dependencies (cont)
• On graph 1, we must repeat the loan number once for each
address a customer has, and we must repeat the address for
each loan a customer has. This repetition is unnecessary,
since the relationship between that customer and his
address is independent of the relationship between that
customer and a loan.
• If a customer (say, Smith) has a loan (say, loan number L-23), we want that loan to be associated with all Smith’s
addresses.
Multivalued Dependencies (cont)
• The relation on graph 2 is illegal, therefore to make this
relation legal, we need to add the tuples (L-23, Smith,
Main, Manchester) and (L-27, Smith, North, Rye) to the
bc relation of graph 2.
Graph 2 (an illegal bc relation)
loan-number  customer-name  customer-street  customer-city
L-23         Smith          North            Rye
L-27         Smith          Main             Manchester
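The tuples that make the relation legal can be computed rather than found by inspection. The sketch below is an illustration under assumptions: the function name `missing_tuples` is hypothetical, and it handles a single MVD only. For each pair of tuples agreeing on the left side, it builds the tuple the definition requires and collects those not yet present. With one MVD, a single pass over all ordered pairs is enough, because it generates the full cross product within each group.

```python
ATTRS = ["loan-number", "customer-name", "customer-street", "customer-city"]

# The illegal bc relation of graph 2.
bc = [
    ("L-23", "Smith", "North", "Rye"),
    ("L-27", "Smith", "Main", "Manchester"),
]


def missing_tuples(rows, attrs, a, b):
    """Tuples that must be added so that the MVD a ->> b holds in `rows`."""
    rel = set(rows)
    a_idx = [attrs.index(x) for x in a]
    b_idx = {attrs.index(x) for x in b}
    needed = set()
    for t1 in rel:
        for t2 in rel:
            if all(t1[i] == t2[i] for i in a_idx):
                # Required tuple: b values from t1, everything else from t2.
                t3 = tuple(t1[i] if i in b_idx else t2[i] for i in range(len(attrs)))
                if t3 not in rel:
                    needed.add(t3)
    return needed


to_add = missing_tuples(
    bc, ATTRS, ["customer-name"], ["customer-street", "customer-city"]
)
for t in sorted(to_add):
    print(t)
```

The result is the pair (L-23, Smith, Main, Manchester) and (L-27, Smith, North, Rye), the same two tuples named on the slide.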
Multivalued Dependencies (cont)
• Comparing the preceding example with our definition of
multivalued dependency, we see that we want the
multivalued dependency
customer-name ↠ customer-street customer-city
to hold.
• As was the case for functional dependencies, we shall
use multivalued dependencies in two ways:
1. To test relations to determine whether they are legal under a given
set of functional and multivalued dependencies.
2. To specify constraints on the set of legal relations; we shall thus
concern ourselves with only those relations that satisfy a given set
of functional and multivalued dependencies.
Theory of Multivalued Dependencies
1. Reflexivity rule. If a is a set of attributes, and b ⊆ a, then
a → b holds.
2. Augmentation rule. If a → b holds, and c is a set of
attributes, then ca → cb holds.
3. Transitivity rule. If a → b holds, and b → c holds, then
a → c holds.
4. Complementation rule. If a ↠ b holds, then a ↠ R − b − a holds.
Theory of Multivalued Dependencies
5. Multivalued augmentation rule. If a ↠ b holds, and c ⊆ R
and d ⊆ c, then ca ↠ db holds.
6. Multivalued transitivity rule. If a ↠ b holds, and b ↠ c
holds, then a ↠ c − b holds.
7. Replication rule. If a → b holds, then a ↠ b.
8. Coalescence rule. If a ↠ b holds, and c ⊆ b, and there is
a d such that d ⊆ R, and d ∩ b = ∅, and d → c, then a → c
holds.
Theory of Multivalued Dependencies (cont)
1. Multivalued union rule. If a ↠ b holds, and a ↠ c
holds, then a ↠ bc holds.
2. Intersection rule. If a ↠ b holds, and a ↠ c holds,
then a ↠ b ∩ c holds.
3. Difference rule. If a ↠ b holds, and a ↠ c holds,
then a ↠ b − c holds and a ↠ c − b holds.