Spring 2017
SCHEMA REFINEMENT AND NORMAL FORMS
[CH 19]
4/23/17
CS 564: Database Management Systems; (c) Jignesh M. Patel, 2013
Database Design: The Story so Far
• Requirements Analysis
  – Data stored, operations, apps, …
• Conceptual Database Design
  – Model high-level description of the data, constraints, ER model
• Logical Database Design
  – Choose a DBMS and design a database schema
• Schema Refinement
  – Normalize relations, avoid redundancy, anomalies, …
• Physical Database Design
  – Examine physical database structures like indices, restructure, …
• Security Design
Normalization
What is a good relational schema? How can we improve it?
• e.g.: Suppliers(name, item, desc, addr, price)
Redundancy Problems:
1. A supplier supplies two items: Redundant Storage
2. Change address of a supplier: Update Anomaly
3. Insert a supplier: Insertion Anomaly
   o What if the supplier does not supply any items (nulls?)
   o Record desc for an item that is not supplied by any supplier
4. Delete the only supplier tuple: Delete Anomaly
   o Use nulls?
   o Delete the last item tuple. Can’t make name null. Why?
Alternative:
Dealing with Redundancy
• Identify “bad” schemas
  – functional dependencies
• Main refinement technique: decomposition
  – replacing larger relation with smaller ones
• Decomposition should be used judiciously:
  – Is there a reason to decompose a relation?
    • Normal forms: guarantees against (some) redundancy
  – Does decomposition cause any problems?
    • Lossless join
    • Dependency preservation
    • Performance (must join decomposed relations)
Functional Dependencies (FDs)
• A form of IC
• FD: X → Y
  X and Y subsets of relation R’s attributes
  t1 ∈ r, t2 ∈ r, ∏X(t1) = ∏X(t2) ⇒ ∏Y(t1) = ∏Y(t2)
• An FD is a statement about all allowable relations.
  – Based only on application semantics, can’t deduce from instances
  – Can simply check if an instance violates FD (and other ICs)
• Consider (X, Y) → Z. Does this imply (X, Y) is a key?

  X   Y   Z    K
  1   1   11   A
  1   2   12   A
  2   2   22   A
  2   2   22   B

• Primary Key IC is a special case of FD
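The question on this slide can be checked mechanically against the example instance. The following is a minimal Python sketch, not part of the original slides; the helper names `project` and `violates_fd` are hypothetical. Keep in mind that an instance can only refute an FD: a passing check means the FD is not violated by these four tuples, not that it holds for the schema.

```python
from itertools import combinations

# The example instance from the slide, one tuple per (X, Y, Z, K) row.
ATTRS = ("X", "Y", "Z", "K")
ROWS = [
    (1, 1, 11, "A"),
    (1, 2, 12, "A"),
    (2, 2, 22, "A"),
    (2, 2, 22, "B"),
]

def project(row, attrs):
    """Values of `row` on the attribute names in `attrs`."""
    return tuple(row[ATTRS.index(a)] for a in attrs)

def violates_fd(rows, lhs, rhs):
    """True if some pair of rows agrees on `lhs` but differs on `rhs`."""
    for t1, t2 in combinations(rows, 2):
        if project(t1, lhs) == project(t2, lhs) and project(t1, rhs) != project(t2, rhs):
            return True
    return False

# (X, Y) -> Z is not violated by this instance ...
xy_to_z_holds = not violates_fd(ROWS, ("X", "Y"), ("Z",))
# ... but (X, Y) does not determine every attribute, so it is not a key.
xy_is_key = not violates_fd(ROWS, ("X", "Y"), ATTRS)
print(xy_to_z_holds, xy_is_key)  # True False
```

The last two rows agree on X, Y, and Z but differ on K, so (X, Y) → Z does not make (X, Y) a key.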
Example: Constraints on Entity Set
[ER diagram: entity set Supplier with attributes name, addr, price, item, desc]
• S(name, item, desc, addr, price)
• FD: {n, i} → {n, i, d, a, p}
• FD: {n} → {a}
• FD: {i} → {d}
• Decompose to: NA, ID, INP
[ER diagram: Supplier (name, addr), Supplies (price), Item (item, desc)]
• Spl(name, item, price)
  – FD: {n, i} → {n, i, p}
• Sup(name, addr)
  – FD: {n} → {n, a}
• Item(item, desc)
  – FD: {i} → {i, d}
ER design is subjective and can have many E+Rs
FDs: sanity checks + deeper understanding of schema
Same situation could happen with a relationship set
Refining an ER Diagram
[ER diagram: Supplier, Supplies, Item; attributes name, addr, loc, price, item, desc]
• IS(item, name, desc, loc, price)
  S(name, addr)
• A supplier keeps all items in the same location
  FD: name → loc
• Solution:
Inferring FDs
• ename → ejob, ejob → esal; ⇒ ename → esal
• Armstrong’s Axioms (X, Y, Z are sets of attributes):
  – Reflexivity: If Y ⊆ X, then X → Y
  – Augmentation: If X → Y, then XZ → YZ for any Z
  – Transitivity: If X → Y and Y → Z, then X → Z
• Additional rules (derivable):
  – Union: If X → Y and X → Z, then X → YZ
  – Decomposition: If X → YZ, then X → Y and X → Z
• Set of all FDs implied by F = closure of F, denoted as F+
• Armstrong’s Axioms are sound: they only generate FDs in F+
• Armstrong’s Axioms are complete: repeated application generates all FDs in F+
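F+ itself can be exponentially large, so in practice one usually computes the attribute closure X+ instead: X → Y is in F+ exactly when Y ⊆ X+. The Python sketch below illustrates this on the ename/ejob/esal example; the function names `closure` and `implies` are illustrative choices, not from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole left side is already derived.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in the closure F+ of `fds`."""
    return set(rhs) <= closure(lhs, fds)

F = [({"ename"}, {"ejob"}), ({"ejob"}, {"esal"})]
print(implies(F, {"ename"}, {"esal"}))  # True  (transitivity)
print(implies(F, {"esal"}, {"ename"}))  # False
```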
Decomposition
• Replace a relation with two or more relations
• Problems with decomposition
  1. Some queries become more expensive. (more joins)
  2. Lossless Join: Can we reconstruct the original relation from instances of the decomposed relations?
  3. Dependency Preservation: Checking some dependencies may require joining the instances of the decomposed relations.
Lossless Join Decompositions
• Relation R, FDs F: Decomposed to X, Y
• Lossless-join decomposition if:
  ∏X(r) ⋈ ∏Y(r) = r for every instance r of R that satisfies F
• Note, r ⊆ ∏X(r) ⋈ ∏Y(r) is always true, not vice versa, unless the join is lossless
• Can generalize to three or more relations
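A small made-up instance shows what "lossy" means in practice. The Python sketch below is an illustration only: the relation r(A, B, C) and its two rows are invented for the example, and `project` and `natural_join` are hypothetical helpers. It projects r onto AB and BC and joins the projections back together.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each row is a dict."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Natural join of two sets of rows given as tuples of (attr, value)."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            common = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in common):
                out.add(tuple(sorted({**ld, **rd}.items())))
    return out

# Invented instance of R(A, B, C); no FD makes B a key of AB or BC.
r = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 4, "B": 2, "C": 6},
]
original = {tuple(sorted(row.items())) for row in r}
joined = natural_join(project(r, ["A", "B"]), project(r, ["B", "C"]))

print(original <= joined)  # True: r is always contained in the join
print(len(joined))         # 4: two spurious tuples appeared
```

The join returns more tuples than r contains, so the information about which A went with which C has been lost.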
Lossless Join…
• Relation R, FDs F: Decomposed to X, Y
  – Test: lossless-join w.r.t. F if and only if the closure of F contains:
    • X ∩ Y → X, or
    • X ∩ Y → Y
    i.e. attributes common to X and Y contain a key for either X or Y
  – Also, given FD: X → Y and X ∩ Y = ∅, the decomposition into R − Y and XY is lossless
    • X is a key in XY, and appears in both
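The test above reduces to one attribute-closure computation on X ∩ Y. The sketch below applies it to the Suppliers relation from the earlier slides; it is a Python illustration under that assumption, `closure` and `is_lossless` are hypothetical names, and the second decomposition is invented as a counterexample.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(x, y, fds):
    """Binary lossless-join test: (X ∩ Y) -> X or (X ∩ Y) -> Y is in F+."""
    common_closure = closure(set(x) & set(y), fds)
    return set(x) <= common_closure or set(y) <= common_closure

# FDs of S(name, item, desc, addr, price) from the earlier example.
F = [
    ({"name", "item"}, {"desc", "addr", "price"}),
    ({"name"}, {"addr"}),
    ({"item"}, {"desc"}),
]

# Splitting off (name, addr): the common attribute name is a key of it.
print(is_lossless({"name", "addr"}, {"name", "item", "desc", "price"}, F))  # True
# Invented bad split: item is a key of neither side.
print(is_lossless({"name", "item"}, {"item", "desc", "addr", "price"}, F))  # False
```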
Dependency Preserving Decomposition
• R(sailor, boat, date)  {D → S, D → B}
  → X(sailor, boat), Y(boat, date)  {D → B}
• To check D → S need to join X and Y (expensive)
• Dependency preserving:
  – R → X, Y:  F+ = (Fx ∪ Fy)+
    • Note: F not necessarily = Fx ∪ Fy
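Whether an FD survives a decomposition can be tested without materializing Fx and Fy: grow the closure of the left side using only what each component can "see". The Python sketch below uses this standard textbook procedure on the sailor/boat/date example; the function names are hypothetical and the code is an illustration, not the course's reference implementation.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(lhs, rhs, parts, fds):
    """Is lhs -> rhs implied by the union of the projections of fds onto parts?"""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            # Attributes of `part` derivable from what is known inside `part`.
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

def dependency_preserving(parts, fds):
    return all(preserved(lhs, rhs, parts, fds) for lhs, rhs in fds)

F = [({"date"}, {"sailor"}), ({"date"}, {"boat"})]
X = {"sailor", "boat"}
Y = {"boat", "date"}

print(preserved({"date"}, {"boat"}, [X, Y], F))    # True:  checkable inside Y
print(preserved({"date"}, {"sailor"}, [X, Y], F))  # False: needs a join
print(dependency_preserving([X, Y], F))            # False
```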
Normal Forms
• Is any refinement needed?
• Normal forms: guarantees that certain kinds of problems won’t occur
  – 1NF: Atomic values
  – 2NF: Historical
  – 3NF: …
  – BCNF: Boyce-Codd Normal Form
  (listed from least to most restrictive)
Role of FDs in detecting redundancy:
  • Relation R with 3 attributes, ABC.
    • No ICs (FDs) hold ⇒ no redundancy.
    • A → B ⇒ 2 or more tuples with the same A value redundantly have the same B value!
Boyce-Codd Normal Form (BCNF)
• Relation R with FDs F is in BCNF if, for all X → A in F+
  – A ∈ X (trivial FD), or
  – X is a superkey
• i.e. all non-trivial FDs over R are key constraints.
• No redundancy in R (at least none that FDs detect)
• Most desirable normal form
• Consider a relation in BCNF and FD: X → A, two tuples have the same X value

  X   Y    A
  x   y1   a
  x   y2   ?

  • Can the A values be the same (i.e. redundant)?
  • NO! X is a key ⇒ y1 = y2. Not a set!
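The BCNF condition can be tested with attribute closures. The Python sketch below checks only the FDs listed in F rather than all of F+, which is the usual shortcut when testing the original relation R (it does not carry over unchanged to the components of a decomposition). The function names are hypothetical and the Suppliers FDs come from the earlier example.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """FDs in `fds` that are non-trivial and whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = set(rhs) - set(lhs)
        is_superkey = set(attrs) <= closure(lhs, fds)
        if nontrivial and not is_superkey:
            bad.append((lhs, rhs))
    return bad

S = {"name", "item", "desc", "addr", "price"}
F = [
    ({"name", "item"}, {"desc", "addr", "price"}),
    ({"name"}, {"addr"}),
    ({"item"}, {"desc"}),
]

print(len(bcnf_violations(S, F)))                                   # 2
print(bcnf_violations({"name", "addr"}, [({"name"}, {"addr"})]))    # []
```

The two violating FDs, name → addr and item → desc, are exactly the ones the decomposition into NA, ID, and INP splits off.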
3NF
• Relation R with FDs F is in 3NF if, for all X → A in F+
  – A ∈ X or
  – X is a superkey or
  – A is part of some key for R (prime attribute)
    - Minimality of a key (not superkey) is crucial!
• BCNF implies 3NF
• e.g.: Sailor(Sailor, Boat, Date, CreditCrd)
  – SBD → SBDC, S → C (not 3NF)
  – If C → S, then CBD → SBDC (i.e. CBD is also a key). Now in 3NF!
  – Note redundancy in (S, C); 3NF permits this
• Compromise used when BCNF not achievable, or for performance considerations
  – Lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations always possible.
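The Sailor example can be replayed in code. The Python sketch below finds candidate keys by brute force (fine for four attributes, exponential in general) and then applies the three 3NF conditions to each FD in F, one right-hand attribute at a time. All function names are hypothetical, and S, B, D, C abbreviate Sailor, Boat, Date, CreditCrd.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    names = sorted(attrs)
    keys = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            cand = set(combo)
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(cand, fds) >= set(attrs):
                keys.append(cand)
    return keys

def in_3nf(attrs, fds):
    prime = set().union(*candidate_keys(attrs, fds))
    for lhs, rhs in fds:
        is_superkey = set(attrs) <= closure(lhs, fds)
        for a in set(rhs) - set(lhs):          # skip trivial parts
            if not is_superkey and a not in prime:
                return False
    return True

R = {"S", "B", "D", "C"}
F1 = [({"S", "B", "D"}, {"C"}), ({"S"}, {"C"})]
F2 = F1 + [({"C"}, {"S"})]

print(in_3nf(R, F1))  # False: S is not a superkey and C is not prime
print(in_3nf(R, F2))  # True:  CBD is also a key, so C becomes prime
```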