Chapter Four
Database Design (Relational)

Objectives
• Summary
• Keys (Constraints)
• Relational DBMS
• Normal Forms

Summary
• DB Lifecycle: Business Requirements, Design (ER), Build DB, Production
• Architecture of DBMS
• Definitions
• Data Models
• Database Design (ER Model): Strong Entity, Weak Entity, Relationship, Functionality, Functional Dependency

Keys (Constraints)
• A key is a set of attributes whose values uniquely identify each entity in an entity set, or each relationship in a relationship set.
• How do we identify keys?
• Consider a relation R with attributes a1, a2, …, an.

Keys (Constraints)
1. Super key: any set of attributes that uniquely identifies each tuple (row) of a table.
   Student (Name, ID, GPA, Major, Minor, Address, Phone)

Keys (Constraints)
2. Candidate key: a minimal super key, that is, a super key from which no attribute can be removed without losing uniqueness.
3. Primary key: the candidate key selected by the DBA.

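The definitions of super key and candidate key can be tried out on a small table. The following is a minimal sketch, not part of the original slides: the function names `is_superkey` and `candidate_keys` and the three Student rows are invented for illustration. Note that a key is a property of the schema, so sample data can only rule a key out; it cannot prove that a set of attributes will stay unique.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Minimal super keys of this particular instance, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical Student rows (values invented for illustration).
ATTRS = ("Name", "ID", "Major")
students = [
    {"Name": "Ann", "ID": 1, "Major": "CS"},
    {"Name": "Bob", "ID": 2, "Major": "CS"},
    {"Name": "Ann", "ID": 3, "Major": "Math"},
]

print(is_superkey(students, ("ID",)))      # ID alone is unique here
print(is_superkey(students, ("Name",)))    # two students are named Ann
print(candidate_keys(students, ATTRS))
```

In this sample (Name, Major) also happens to be unique, which is why the DBA, not the data, decides which candidate key becomes the primary key.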
Keys (Constraints)
Characteristics of a primary key:
a. Uniqueness: at any given time, no two tuples can have the same value for a given primary key.
b. Minimality: none of the attributes in the primary key can be discarded without destroying the uniqueness property.

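Keys (Constraints)
4. Foreign key: an attribute (or set of attributes) in one relation (entity set one) that is the primary key of another relation (entity set two).
   R1 (a, b, c, d, e)
   R2 (x, y, z, a, w)
   If a is the primary key of R1, then a in R2 is a foreign key.
   Faculty (ID, Name, Salary, D_name, age, Hiring_date)
   Department (D_name, No_Faculty, D_head)
   D_name in Faculty is a foreign key that refers to Department.
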
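A foreign key constraint can be seen in action with Python's built-in `sqlite3` module. This is a sketch under assumptions: the column types, the sample rows, and the helper `try_insert_faculty` are invented for illustration; only the table and column names come from the Faculty and Department schemas above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE Department ("
    " D_name TEXT PRIMARY KEY, No_Faculty INTEGER, D_head TEXT)"
)
conn.execute(
    "CREATE TABLE Faculty ("
    " ID INTEGER PRIMARY KEY, Name TEXT, Salary REAL,"
    " D_name TEXT REFERENCES Department(D_name),"
    " age INTEGER, Hiring_date TEXT)"
)

# Hypothetical sample data.
conn.execute("INSERT INTO Department VALUES ('CS', 1, 'Lee')")

def try_insert_faculty(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Faculty VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# 'CS' exists in Department, 'Physics' does not.
accepted_cs = try_insert_faculty((1, "Kim", 50000, "CS", 40, "2001-09-01"))
accepted_physics = try_insert_faculty((2, "Sam", 47000, "Physics", 38, "2003-06-01"))
print(accepted_cs, accepted_physics)
```

The second insert is rejected because its D_name value does not match any primary key value in Department.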
Relational DBMS
• RDBMS: data are represented as a set of tables (relation is the mathematical term for a table)
• Originated by E. F. Codd (1970)
• Based on set theory
• Record-based data model

Structure:
• A set of relations (tables)
• Each relation has a unique name
• Each relation has a set of attributes (columns)
• Each relation has a set of tuples (rows)

Restrictions on an RDB:
• No two tuples are the same
• No two attributes are the same
• The order of tuples is immaterial
• The order of attributes is immaterial
• There is an attribute or collection of attributes that identifies tuples uniquely, called the primary key
• The value of each attribute must be atomic

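Intension vs. Extension
The intension is the schema (relation name and attributes); the extension is the set of tuples the relation holds at a given time.

R    a1   a2   …   an
T1
T2
…

R: relation name
an: attribute
Tm: tuple
T[an]: value of attribute an for tuple T
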
Converting E-R Diagram to Relational
1. Strong entity sets:
   • Let E be a strong entity set with attributes a1, a2, a3, …, an
   • Create a relation R with n distinct columns, each of which corresponds to one of the attributes in E

Converting E-R Diagram to Relational
2. Weak entity sets:
   • Let W be a weak entity set with attributes a1, a2, a3, …, ak
   • Let E be the strong entity set on which W is dependent
   • Let the primary key of E be e1, e2, e3, …, ex
   • Create a relation R with k + x columns: (a1, a2, a3, …, ak) and (e1, e2, e3, …, ex)

Converting E-R Diagram to Relational
3. Relationship:
   • Let R be a relationship among entity sets E1, E2, …, En, where entity set Ei has primary key PK(Ei), and let R have its own attributes a1, …, am
   • Create a relation called R whose columns are PK(E1) ∪ PK(E2) ∪ … ∪ PK(En) ∪ {a1, …, am}

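The three conversion rules (strong entity set, weak entity set, relationship) can be written down as small functions that return column lists. This is only an illustrative sketch: the function names and the school-style attribute names (StudentID, CourseNo, Grade, and so on) are hypothetical and do not come from the slides.

```python
def strong_entity_to_relation(attributes):
    """Rule 1: one column per attribute of the strong entity set."""
    return list(attributes)

def weak_entity_to_relation(weak_attributes, owner_primary_key):
    """Rule 2: the weak entity's k attributes plus the owner's x key attributes."""
    return list(weak_attributes) + list(owner_primary_key)

def relationship_to_relation(participant_keys, own_attributes=()):
    """Rule 3: union of the participants' primary keys and the relationship's attributes."""
    columns = []
    candidates = [a for key in participant_keys for a in key] + list(own_attributes)
    for attr in candidates:
        if attr not in columns:  # a union keeps each column only once
            columns.append(attr)
    return columns

# Hypothetical schema used only to exercise the three rules.
student = strong_entity_to_relation(["StudentID", "Name", "GPA"])
dependent = weak_entity_to_relation(["DependentName", "Relation"], ["StudentID"])
enrolls = relationship_to_relation([["StudentID"], ["CourseNo"]], ["Grade"])

print(student)
print(dependent)
print(enrolls)
```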
Example
• Convert the school ER diagram into a relational database.

Normal Forms
(Guidelines for RD design)
• How do we know whether a design is good?
• If it is not a good design, what should we do?
• Modify the design.

Normal Forms
(Guidelines for RD design)
• First Normal Form (1NF)
• Deals with the shape of the records
• A relation is in 1NF if the domain of each attribute contains only atomic values.

First Normal Form: 1NF
Example: R (A, B, C, …)

R( A    B      )
   a1   b1, b2

=>

R( A    B  )
   a1   b1
   a1   b2

First Normal Form: 1NF
Example:

Person ( Name    Age   Children         )
         Smith   42    John, Lori, Mark

=>

Person ( Name    Age   Child )
         Smith   42    John
         Smith   42    Lori
         Smith   42    Mark

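The Person example can be reproduced in a few lines: a non-atomic attribute is replaced by one row per value. This is a minimal sketch; the function name `to_first_normal_form` and the use of Python dictionaries as rows are assumptions made for illustration.

```python
def to_first_normal_form(rows, multivalued, single_name):
    """Emit one row per value of the multi-valued attribute."""
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != multivalued}
        for value in row[multivalued]:
            flat.append({**rest, single_name: value})
    return flat

# The Person row from the slide, with Children held as a list (not atomic).
person = [{"Name": "Smith", "Age": 42, "Children": ["John", "Lori", "Mark"]}]

person_1nf = to_first_normal_form(person, "Children", "Child")
for row in person_1nf:
    print(row)
```

Name and Age are now repeated in every row, which is the redundancy the higher normal forms address.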
First Normal Form: 1NF
Example:

Student ( Name   Birthday    )
          S1     Feb 2, 91
          S2     March 8, 88

=>

Student (Name, D_Birth, M_Birth, Y_Birth)

Note: 2NF and 3NF deal with the relationship between non-key and key attributes.

Second Normal Form: 2NF
• A relation R is in 2NF with respect to a set of FDs if it is in 1NF and every non-prime attribute is fully dependent on the entire key of R.
• Fact: 2NF is violated when a non-key attribute is a fact about a subset of a primary key.

Second Normal Form: 2NF
Non-prime vs. prime:
Given a relation R with a set of FDs, an attribute A of R is prime if A is contained in some key of R; otherwise A is non-prime.

Second Normal Form: 2NF
Example: R(A,B,C,D) with FDs
• A, B ---> C, D
• A ---> D
• D partially depends on A,B
• C fully depends on A,B
• A and B are prime (part of the key)
• If A is the primary key, is this in 2NF?
• If A,B is the primary key, is this in 2NF?

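The example R(A,B,C,D) can be checked mechanically with the attribute closure. The sketch below is one possible way to do it, not a method given in the slides: the function names `closure`, `candidate_keys`, and `partial_dependencies` are invented, and attributes are single letters so that the string "AB" stands for the set {A, B}.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(key <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(set(combo))
    return keys

def partial_dependencies(attributes, fds):
    """Pairs (part of a key, non-prime attribute it determines): the 2NF violations."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                for attr in sorted(closure(part, fds) - set(part) - prime):
                    found.append(("".join(part), attr))
    return found

R = "ABCD"
FDS = [("AB", "CD"), ("A", "D")]

print(candidate_keys(R, FDS))        # the only key is {A, B}
print(partial_dependencies(R, FDS))  # D depends on A alone
```

With A,B as the key, D depends on only part of the key, so R is not in 2NF.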
Second Normal Form: 2NF
What should we do with a relation that is not in 2NF?
Example: R(A,B,C,D)
• A, B ---> C, D
• A ---> D
Decompose R into:
• R1 (A,B,C)
• R2 (A,D)

Second Normal Form: 2NF
Example: R(Part, Warehouse, Address, Quantity)

Part:      P1, P2, P3, P4, P4
Warehouse: W1, W1, W2, W4, W1
Address:   Frostburg, Frostburg, Cumberland, Frostburg
Quantity:  25, 30, 32, 25

What is the primary key?
Part, Warehouse ---> Quantity
Warehouse ---> Address

Second Normal Form: 2NF
Problems:
1. Repetition of information: changing the address of W1 means updating every row that mentions W1
2. Unable to represent information: a warehouse with no parts cannot be recorded
3. Inconsistency
So …
R1 (Warehouse, Address)
R2 (Part, Warehouse, Quantity)

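The split into R1 (Warehouse, Address) and R2 (Part, Warehouse, Quantity) loses nothing: joining the two projections on Warehouse gives back the original rows. The sketch below illustrates this with the first three rows of the warehouse example; the helper names `project`, `natural_join`, and `as_set` are hypothetical.

```python
def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out

def natural_join(left, right):
    """Combine rows that agree on every attribute the two sides share."""
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                out.append({**l, **r})
    return out

def as_set(rows):
    """Order-independent view of a list of rows, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}

rows = [
    {"Part": "P1", "Warehouse": "W1", "Address": "Frostburg", "Quantity": 25},
    {"Part": "P2", "Warehouse": "W1", "Address": "Frostburg", "Quantity": 30},
    {"Part": "P3", "Warehouse": "W2", "Address": "Cumberland", "Quantity": 32},
]

r1 = project(rows, ("Warehouse", "Address"))
r2 = project(rows, ("Part", "Warehouse", "Quantity"))

print(r1)  # the address of W1 is now stored once
print(as_set(natural_join(r2, r1)) == as_set(rows))
```

After the split, the address of W1 is stored in a single row of R1, so changing it is one update instead of several.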
Second Normal Form: 2NF
Example:

R( Professor   Student   Course   Degree )
   P1          S1        C1       Ph.D.
   P2          S2        C2       M.S.
   P3          S2        C4       M.S.
   P3          S3        C4       Ph.D.

Professor ---> Course
Student ---> Degree
Professor ---> Student
Key? Not in 2NF
R1(Student, Degree)
R2(Professor, Course, Student)

Third Normal Form (3NF):
A relation R is in 3NF with respect to a set of FDs if it is in 2NF and, whenever A ---> B holds, at least one of the following is true:
1. A ---> B is a trivial FD
2. A is a superkey for R
3. B is contained in a candidate key for R
• In other words, every non-key attribute depends non-transitively on the primary key.

Third Normal Form (3NF):
• Example: R(A,B,C,D)
  A, B ---> D
  D ---> C
  R1(A,B,D)
  R2(D,C)
• Fact: 3NF is violated when a non-key attribute is a fact about another non-key attribute
  Employee ---> Dept ---> Location

Third Normal Form (3NF):
Example: R(Employee, Dept, Location)
Employee ---> Dept
Dept ---> Location

Employee   Dept   Location
E1         D1     Frostburg
E2         D1     Frostburg
E3         D1     Frostburg

Problems?
R1(Employee, Dept)
R2(Dept, Location)

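The three conditions of the 3NF definition can be tested for each listed FD. The following is a sketch under assumptions: the function names are invented, FDs are written as pairs of tuples of attribute names, and only the FDs that are explicitly listed are examined.

```python
from itertools import combinations

def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(key <= set(combo) for key in keys):
                continue
            if closure(combo, fds) == set(attributes):
                keys.append(set(combo))
    return keys

def third_nf_violations(attributes, fds):
    """Listed FDs A -> B that are not trivial, have no superkey on the left, and have a non-prime B."""
    prime = set().union(*candidate_keys(attributes, fds))
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue  # condition 2: the left side is a superkey
        for b in rhs:
            if b in lhs or b in prime:
                continue  # condition 1 (trivial) or condition 3 (prime)
            violations.append((tuple(lhs), b))
    return violations

ATTRS = ("Employee", "Dept", "Location")
FDS = [(("Employee",), ("Dept",)), (("Dept",), ("Location",))]

print(third_nf_violations(ATTRS, FDS))
print(third_nf_violations(("Employee", "Dept"), FDS[:1]))  # R1
print(third_nf_violations(("Dept", "Location"), FDS[1:]))  # R2
```

Dept ---> Location is reported for R, while R1 and R2 each come back with no violations.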
Third Normal Form (3NF):
ItemInfo (item, price, discount)
• Item ---> price
• Price ---> discount

Item   price   discount
I1     .99     2%
I2     .80     2%
I3     .10     2%
I4     5       10%

Third Normal Form (3NF):
Employee (ID, Name, Expertise, Age, Dept)
• ID ---> Name
• ID ---> Expertise
• ID ---> Age
• ID ---> Dept
• Dept ---> Expertise

Third Normal Form (3NF):
Example: R(A,B,C,D)
• A,B ---> C
• A,C ---> D
• So A,B is the primary key
• Not in 3NF
• R1(A,B,C)
• R2(A,C,D)

Boyce Codd Normal Form:
Def: A relation schema R is in BCNF with respect to a set of FDs if it is in 3NF and, whenever X ---> A holds with A not in X, X is a superkey.

Boyce Codd Normal Form:
• Most 3NF relations are also in BCNF
• A 3NF relation can fail to be in BCNF only when:
  • The candidate keys in the relation are composite keys (not single attributes),
  • There is more than one candidate key in the relation, and
  • The keys are not disjoint (some attributes in the keys are common)

Boyce Codd Normal Form:
A relation is in BCNF if every determinant is a candidate key
R(A,B,C)
FD:
• A,B -> C
• C -> A
A is prime, so R is in 3NF
C is a determinant but not a candidate key, so R is not in BCNF
Decomposition:
R1(B,C)
R2(A,C)

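The rule "every determinant is a candidate key" can be checked by computing the closure of the left side of each FD. The sketch below is illustrative only: the function names are invented, single letters stand for attributes, and only the listed FDs are examined.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds (a list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Non-trivial listed FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]

R = "ABC"
FDS = [("AB", "C"), ("C", "A")]

print(bcnf_violations(R, FDS))              # C -> A: C is not a superkey of R
print(bcnf_violations("AC", [("C", "A")]))  # in R2(A,C), C is a key
```

In R the determinant C does not determine B, so C ---> A breaks BCNF; in R2(A,C) the same FD is harmless because C is a key there.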
Boyce Codd Normal Form:
S(SupplierNo, sname, status, city)
FD:
• SupplierNo ---> status
• SupplierNo ---> city
• SupplierNo ---> sname
• sname ---> status
• sname ---> city
• sname ---> SupplierNo
It is in BCNF: every determinant is a candidate key.

Boyce Codd Normal Form:
S( SupplierNo   sname     Status   City       )
   S1           Smith     H        Frostburg
   S2           Johnson   L        LaVale
   S3           Marker    M        Cumberland

Boyce Codd Normal Form:
S(SupplierNo, sname, PartNo, Qty)
FD:
• SupplierNo <---> sname (each determines the other)
• SupplierNo, PartNo ---> Qty
• sname, PartNo ---> Qty

Boyce Codd Normal Form:
S( SupplierNo   sname   PartNo   Qty )
   S1           Smith   P1       100
   S1           Smith   P2       200
   S1           Smith   P3       300
   S1           Smith   P4       400

It is in 3NF but not in BCNF.
Problem: sname and SupplierNo are determinants, but neither is a candidate key for this relation.
R1(SupplierNo, sname)
R2(sname, PartNo, Qty)

Boyce Codd Normal Form:
ClientInterview (ClientNo, InterviewDate, InterviewTime, StaffID, RoomNo)
ClientNo, InterviewDate -> InterviewTime
ClientNo, InterviewDate -> StaffID
ClientNo, InterviewDate -> RoomNo
StaffID, InterviewDate, InterviewTime -> ClientNo
RoomNo, InterviewDate, InterviewTime -> StaffID
RoomNo, InterviewDate, InterviewTime -> ClientNo
StaffID, InterviewDate -> RoomNo

Boyce Codd Normal Form:
ClientNo   InterviewDate   InterviewTime   StaffID   RoomNo
C25        March 2, 02     10:00           S10       GC104
C28        March 2, 02     11:30           S10       GC104
C72        March 2, 02     1:30            S8        GC103
C28        April 2, 02     10:00           S24       GC103

It is in 3NF
Not in BCNF
(StaffID, InterviewDate) is not a candidate key

Boyce Codd Normal Form:
R1(ClientNo, InterviewDate, InterviewTime, StaffID)
R2(StaffID, InterviewDate, RoomNo)

Normal Forms:
Cars(Model, NoCylinders, MadeIn, Tax, Fee)
• Model, NoCylinders ---> MadeIn
• Model, NoCylinders ---> Tax
• Model, NoCylinders ---> Fee
• NoCylinders ---> Fee
• MadeIn ---> Tax

Normal Forms:
Cars( Model    NoCylinders   MadeIn   Tax   Fee )
      GM       6             U.S.     $20   $30
      Toyota   4             Japan    $40   $5
      Honda    4             Japan    $40   $5
      VW       5             German   $50   $10

Primary Key? Model, NoCylinders
Is it in 1NF?
Is it in 2NF?

Normal Forms:
Cars(Model, NoCylinders, MadeIn, Tax)
Licensing(NoCylinders, Fee)

Normal Forms:
Is it in 3NF?
• Cars(Model, NoCylinders, MadeIn)
• Taxation(MadeIn, Tax)
• Licensing(NoCylinders, Fee)
Assume we have the FD
• MadeIn ---> NoCylinders
It is not in BCNF
• Cars(Model, NoCylinders)
• EngineSize(NoCylinders, MadeIn)

Practice:
A: PropertyNo
B: PropertyAddress
C: InspectionDate
D: InspectionTime
E: Comments
F: StaffID
G: StaffName
H: CarRegistrationNo

FD:
A,C -> D,E,F,G,H
A -> B
F -> G
F,C -> H
H,C,D -> A,B,E,F,G
F,C,D -> A,B,E

Multivalued Dependency (MVD)
• Multivalued dependencies are a generalization of FDs
• Given a relation R with X and Y subsets of the attributes of R, we write X --->-> Y
• This says there is a multivalued dependency of Y on X: given a value for X, there is a set of values for Y.

Multivalued Dependency (MVD)
Example: Name --->-> St, City

Name   St   City
S      S1   C1
S      S2   C2
M      S1   C1
M      S2   C2

Multivalued Dependency (MVD)
x --->-> y holds if, whenever t and s are two tuples in R with t[x] = s[x], there are also tuples u and v in R where
1. u[x]=v[x]=t[x]=s[x]
2. u[y]=t[y] & u[R-x-y]=s[R-x-y]
3. v[y]=s[y] & v[R-x-y]=t[R-x-y]
[The relationship between x and y is independent of the relationship between x and R-x-y]

R    x    y    R-x-y
t
s
u
v

Multivalued Dependency (MVD)
Example:

     Name   St   City   Car
t    S      S1   C1     Ford
s    S      S2   C2     Chev
u    S      S1   C1     Chev
v    S      S2   C2     Ford

1. u[Name]=v[Name]=s[Name]=t[Name]
2. u[St,City]=t[St,City] & u[Car]=s[Car]
3. v[St,City]=s[St,City] & v[Car]=t[Car]

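The tuple-swapping condition in the MVD definition can be tested directly on a set of rows. This is a minimal sketch with an invented function name, `satisfies_mvd`; it checks one concrete instance only, using the Name, St, City, Car rows of the example.

```python
def satisfies_mvd(rows, attributes, x, y):
    """Check x ->> y on a concrete list of rows (tuples aligned with attributes)."""
    rest = [a for a in attributes if a not in x and a not in y]

    def pick(row, attrs):
        return tuple(row[attributes.index(a)] for a in attrs)

    split = {(pick(r, x), pick(r, y), pick(r, rest)) for r in rows}
    for tx, ty, _ in split:
        for sx, _, sz in split:
            # Tuple u must combine t's y-values with s's remaining values.
            # Tuple v is covered when the roles of t and s are swapped.
            if tx == sx and (tx, ty, sz) not in split:
                return False
    return True

ATTRS = ["Name", "St", "City", "Car"]
ROWS = [
    ("S", "S1", "C1", "Ford"),  # t
    ("S", "S2", "C2", "Chev"),  # s
    ("S", "S1", "C1", "Chev"),  # u
    ("S", "S2", "C2", "Ford"),  # v
]

holds = satisfies_mvd(ROWS, ATTRS, ["Name"], ["St", "City"])
broken = satisfies_mvd(ROWS[:3], ATTRS, ["Name"], ["St", "City"])
print(holds, broken)
```

Dropping tuple v leaves the address (S2, C2) paired with only one of the two cars, so the dependency no longer holds on the reduced instance.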
Fourth Normal Form (4NF):
• A relation is in 4NF with respect to a set of MVDs if it is in 3NF and, whenever a non-trivial MVD x --->-> y holds, x is a superkey. (x --->-> y is trivial when y is a subset of x, or when x and y together contain every attribute of R.)
• 4NF is violated when a record type contains two or more independent multivalued facts about an entity.
• 4NF and 5NF are, in a sense, also about composite keys.

Fourth Normal Form (4NF):
Example: R(Employee, Skill, Language)
Employee --->-> Skill
Employee --->-> Language

Fourth Normal Form (4NF):
Example: R(Employee, Skill, Language)

Employee   Skill     Language
E1         Cook
E1         Cashier
E1         Manager
E1                   English
E1                   German
E1                   Italian
E2         Cook      German

Fourth Normal Form (4NF):
We have two many-to-many relationships:
• Employee and Skill
• Employee and Language
• Employee --->-> Skill
So we decompose into:
R1(Employee, Skill)
   <----- key ----->
R2(Employee, Language)
   <----- key ----->

Fourth Normal Form (4NF):
Employee   Skill          Employee   Language
E1         Cook           E1         English
E1         Cashier        E1         German
E1         Manager        E1         Italian
E2         Cook           E2         German

Fourth Normal Form (4NF):
In fourth normal form, a record should not contain two or more independent multivalued facts about an entity.

Join Dependency (5NF)

R( SupplierNo   PartNo   ProjectNo )
   S1           P1       N2
   S1           P2       N1
   S2           P1       N1
   S1           P1       N1

R1( SupplierNo   PartNo )
    S1           P1
    S1           P2
    S2           P1

R2( PartNo   ProjectNo )
    P1       N2
    P2       N1
    P1       N1

Join Dependency
R3( SupplierNo   ProjectNo )
    S1           N2
    S1           N1
    S2           N1

Join Dependency
Join R1 & R2 over PartNo

SupplierNo   PartNo   ProjectNo
S1           P1       N2
S1           P2       N1
S2           P1       N1
S2           P1       N2
S1           P1       N1

Join Dependency
Join Result with R3

SupplierNo   PartNo   ProjectNo
S1           P1       N2
S1           P2       N1
S2           P1       N1
S1           P1       N1

Join Dependency
• IF (S1,P1) appears in R1
  AND (P1,N1) appears in R2
  AND (N1,S1) appears in R3
  THEN (S1,P1,N1) appears in R
• Rewrite:
  IF (S1,P1,N2), (S2,P1,N1), (S1,P2,N1) appear in R
  THEN (S1,P1,N1) appears in R

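The join dependency can be replayed on the supplier, part, project data: joining only two of the projections produces a spurious tuple, and the third projection removes it. In this sketch the helper names `project` and `join_three` are invented, and the second row of R is taken as (S1, P2, N1), which is the tuple named in the IF ... THEN rule and the one consistent with R1 and R2.

```python
def project(rows, positions):
    """Projection of a set of tuples onto the given column positions."""
    return {tuple(row[i] for i in positions) for row in rows}

def join_three(r1, r2, r3):
    """Join R1(S,P), R2(P,N), R3(S,N) into triples (S,P,N)."""
    return {
        (s, p, n)
        for (s, p) in r1
        for (p2, n) in r2
        if p == p2 and (s, n) in r3
    }

R = {
    ("S1", "P1", "N2"),
    ("S1", "P2", "N1"),
    ("S2", "P1", "N1"),
    ("S1", "P1", "N1"),
}

R1 = project(R, (0, 1))  # SupplierNo, PartNo
R2 = project(R, (1, 2))  # PartNo, ProjectNo
R3 = project(R, (0, 2))  # SupplierNo, ProjectNo

# Joining only R1 and R2 over PartNo.
two_way = {(s, p, n) for (s, p) in R1 for (p2, n) in R2 if p == p2}

print(sorted(two_way - R))           # the spurious tuple
print(join_three(R1, R2, R3) == R)   # the three-way join restores R
```

The two-way join contains (S2, P1, N2), which is not in R; it disappears once R3 is joined in because (S2, N2) is not a supplier-project pair.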
Join Dependency
Example:
IF Nelson supplies screwdrivers
AND screwdrivers are used in the Pullen project
AND Nelson supplies the Pullen project
THEN Nelson supplies screwdrivers for the Pullen project

Fifth Normal Form (5NF):
• 5NF deals with cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy.
• Join Dependency
• If an agent represents a company, the company makes a product, and the agent sells the product, we have:

R( Agent   Company   Product )
   A1      Ford      Car
   A1      GM        Truck

Fifth Normal Form (5NF):
Let's assume there is a rule:
“If an agent sells a product and s/he represents the company making that product, then s/he sells that product for that company.”

Agent   Company   Product
S1      Ford      Car
S1      Ford      Truck
S1      GM        Car
S1      GM        Truck
S2      Ford      Car

Fifth Normal Form (5NF):
Agent   Company        Company   Products
S1      Ford           Ford      Car
S1      GM             Ford      Truck
S2      Ford           GM        Car
                       GM        Truck

Agent   Products
S1      Car
S1      Truck
S2      Car