Chapter 7 Normalization

Normalization
Normalization is a formal method that applies a series of tests
to help the database designer identify the optimal grouping of
attributes for each relation in a relational schema. Normalization
is applied to individual relations, so that a database can be
normalized to a specific normal form and update anomalies can be
prevented.
Data Redundancy and Update Anomalies
The main purpose of database design is to identify the optimal grouping of
attributes in order to minimize data redundancy and NULL values, both of which
waste storage space.
Data redundancy leads to update anomalies, which are
classified into three types:
Insertion anomalies
Deletion anomalies
Modification anomalies
Insertion Anomalies
To insert the details of a new student into the
Class_Info relation, we must include the details of a lecturer and a
subject in order to avoid NULL values.
Deletion Anomalies
If we delete a lecturer from the Class_Info relation, the details
of the students and subjects stored in the same rows are also lost
from the database.
Modification Anomalies
If we want to change the value of one of the attributes of a
particular student in the Class_Info relation, we must update every row
associated with that student. If the modification is not carried out on
all the appropriate rows of the Class_Info relation, the database
becomes inconsistent.
Insertion Anomaly
Class_Info
LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700   EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900   IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900   IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900   IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900   IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020   IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020   IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500   CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Data Structure  3       S2    Panu      2.45
E9001  Pattara  18500   CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Web Services    4       S2    Panu      2.45
New records (a student, a lecturer, and a subject, each inserted on its own):
NULL   NULL     NULL    NULL  NULL            NULL    S999  Luxana    NULL
E9999  Thana    17500   CPE   NULL            NULL    NULL  NULL      NULL
NULL   NULL     NULL    CPE   GIS             4       NULL  NULL      NULL
Inserting new records may cause data redundancy and NULL values in some fields.
Insertion Anomaly (continued)
Class_Info: rows added when lecturer E5001 also teaches Power Control to the
same four students (the existing rows are unchanged)
LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Power Control   3       S4    Panita    3.35
E5001  Dusit    28700   EE    Power Control   3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Power Control   3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Power Control   3       S7    Vichu     3.15
New records with missing details:
NULL   NULL     NULL    NULL  NULL            NULL    S999  Luxana    NULL
E9999  Thana    17500   CPE   NULL            NULL    NULL  NULL      NULL
NULL   NULL     NULL    CPE   Prob Stat       4       NULL  NULL      NULL
The four Power Control rows repeat the lecturer's name, salary, and department
and each student's name and GPA. The last three rows can only be stored by
padding the missing details with NULL.
Deletion Anomaly
Class_Info
LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700   EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900   IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900   IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900   IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900   IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020   IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020   IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500   CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Data Structure  3       S2    Panu      2.45
E9001  Pattara  18500   CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S3    Vallapa   3.02
E9001  Pattara  18500   CPE   Web Services    4       S1    Preeda    2.85
E9001  Pattara  18500   CPE   Web Services    4       S2    Panu      2.45
A deletion may cause the loss of other necessary data. For example, deleting
lecturer E6002 (Saeree) removes the only rows that record students S10 and S11.
Modification Anomaly
Class_Info (changed values are shown as old → new)
LID    Lname    Salary         Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700 → 45000  EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700 → 45000  EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700 → 45000  EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700 → 45000  EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900          IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900          IE    Optimization    3       S9    Chareon   3.08
E6001  Anan     24900          IE    Prob Stat       4       S8    Kitti     2.54
E6001  Anan     24900          IE    Prob Stat       4       S9    Chareon   3.08
E6002  Saeree   53020          IE    Optimization    3       S10   Sathit    2.67
E6002  Saeree   53020          IE    Optimization    3       S11   Vitthaya  3.25
E9001  Pattara  18500 → 21000  CPE   Data Structure  3       S1    Preeda    2.85
E9001  Pattara  18500          CPE   Data Structure  3       S2    Panu      2.45 → 2.67
E9001  Pattara  18500          CPE   Data Structure  3       S3    Vallapa   3.02
E9001  Pattara  18500          CPE   Web Services    4       S3    Vallapa   3.02
E9001  Pattara  18500 → 25000  CPE   Web Services    4       S1    Preeda    2.85
E9001  Pattara  18500          CPE   Web Services    4       S2    Panu      2.45
Dusit's salary is changed in all four of his rows, so his data stay consistent.
Pattara's salary is changed in only two of his six rows, and to two different
values, and Panu's GPA is changed in only one of his two rows, so the relation
now holds conflicting values for the same lecturer and the same student.
If we want to change the value of one of the attributes of a particular entity in the
relation, we must update all rows that relate to this entity. If the modification is not
carried out on all the appropriate rows, the database becomes inconsistent.
To solve update anomalies, a relation must be normalized: the
normalization process removes the existing data redundancy.
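The inconsistency left behind by a partial update can be detected mechanically. The following is a minimal sketch, assuming rows are held as Python dictionaries; the helper name inconsistent_values is hypothetical, and the sample rows are a subset of the modification anomaly example.

```python
from collections import defaultdict

# A subset of Class_Info rows after the partial salary update (illustrative).
rows = [
    {"LID": "E5001", "Lname": "Dusit", "Salary": 45000},
    {"LID": "E5001", "Lname": "Dusit", "Salary": 45000},
    {"LID": "E9001", "Lname": "Pattara", "Salary": 21000},
    {"LID": "E9001", "Lname": "Pattara", "Salary": 18500},
    {"LID": "E9001", "Lname": "Pattara", "Salary": 25000},
]

def inconsistent_values(rows, key, attr):
    """Map each key value to its attr values when more than one value is stored."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[key]].add(row[attr])
    return {k: sorted(v) for k, v in seen.items() if len(v) > 1}

conflicts = inconsistent_values(rows, "LID", "Salary")
print(conflicts)
```

In a normalized design the salary is stored once per lecturer, so a check of this kind has nothing to find.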
Functional Dependency
One of the main concepts associated with normalization is functional
dependency, which describes the relationship between attributes in a
relation.
(Definition of Functional Dependency)
If A and B are attributes (or sets of attributes) of relation R, B is
functionally dependent on A (denoted A → B) if each value of A is
associated with exactly one value of B.
The notation A → B can be read as follows:
B is functionally dependent on A
or A determines B (A functionally defines B)
or B depends on A
If the functional dependency α → β holds on schema R, then
in any legal relation r, for all pairs of tuples t1 and t2
in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].
Equivalently, attribute y of relation r is dependent on attribute x
if and only if whenever two tuples of r agree on their x-value,
they necessarily agree on their y-value.
In other words, if α → β holds on R and t1[α] = t2[α], the DBMS must
guarantee that t1[β] = t2[β].
When a functional dependency exists, the attribute or group
of attributes on the left-hand side of the arrow is called the
determinant.
Position is functionally dependent on Staff_No:
Staff_No SL21 → Position System Engineer
Staff_No is not functionally dependent on Position:
Position System Engineer → Staff_No SL21, SG5
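The definition can be tested against a relation instance. The sketch below assumes rows are Python dictionaries; fd_holds is a hypothetical helper name, and the two staff rows are illustrative. Note that an instance can only refute a functional dependency: passing the check on sample data does not prove that the dependency holds for every legal relation.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[y] for y in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(a, b) != b:
            return False
    return True

# Illustrative rows: two staff members share the same position.
staff = [
    {"Staff_No": "SL21", "Position": "System Engineer"},
    {"Staff_No": "SG5", "Position": "System Engineer"},
]

print(fd_holds(staff, ["Staff_No"], ["Position"]))
print(fd_holds(staff, ["Position"], ["Staff_No"]))
```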
Functional dependencies of the Class_Info relation:
(LID, Subject, SID) → Lname, Salary, Dept, Credit, Sname, GPA
LID → Lname, Salary, Dept
Subject → Credit
SID → Sname, GPA
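Whether a set of attributes is a key can be checked by computing its closure under the functional dependencies. This is a minimal sketch of the standard attribute-closure procedure; the function name closure and the tuple encoding of the dependencies are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole determinant is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

CLASS_INFO = {"LID", "Lname", "Salary", "Dept", "Subject",
              "Credit", "SID", "Sname", "GPA"}
FDS = [
    (("LID",), ("Lname", "Salary", "Dept")),
    (("Subject",), ("Credit",)),
    (("SID",), ("Sname", "GPA")),
]

print(closure({"LID", "Subject", "SID"}, FDS) == CLASS_INFO)
print(sorted(closure({"LID"}, FDS)))
```

The closure of (LID, Subject, SID) covers every attribute of Class_Info, while LID alone reaches only the lecturer attributes.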
Utilization of FD to decompose a relation
Class_Info
LID    Lname    Salary  Dept  Subject         Credit  SID   Sname     GPA
E5001  Dusit    28700   EE    Electronic 1    3       S4    Panita    3.35
E5001  Dusit    28700   EE    Electronic 1    3       S5    Sarun     2.96
E5001  Dusit    28700   EE    Electronic 1    3       S6    Kanok     2.75
E5001  Dusit    28700   EE    Electronic 1    3       S7    Vichu     3.15
E6001  Anan     24900   IE    Optimization    3       S8    Kitti     2.54
E6001  Anan     24900   IE    Optimization    3       S9    Chareon   3.08
...    ...      ...     ...   ...             ...     ...   ...       ...
Lecturer
LID    Lname    Salary  Dept
E5001  Dusit    28700   EE
E6001  Anan     24900   IE
E6002  Saeree   53020   IE
E9001  Pattara  18500   CPE
Subject
Subject         Credit
Electronic 1    3
Optimization    3
Prob Stat       4
Data Structure  3
Web Services    4
Student
SID  Sname    GPA
S1   Preeda   2.85
S2   Panu     2.45
S3   Vallapa  3.02
S4   Panita   3.35
S5   Sarun    2.96
S6   Kanok    2.75
S7   Vichu    3.15
S8   Kitti    2.54
S9   Chareon  3.08
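Each of the smaller relations is a projection of Class_Info onto the attributes of one functional dependency, with duplicate tuples removed. The sketch below assumes rows are Python dictionaries; project is a hypothetical helper, and only the first six Class_Info rows are used.

```python
COLUMNS = ("LID", "Lname", "Salary", "Dept", "Subject",
           "Credit", "SID", "Sname", "GPA")

# First six Class_Info rows from the example.
class_info = [dict(zip(COLUMNS, values)) for values in [
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S4", "Panita", 3.35),
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S5", "Sarun", 2.96),
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S6", "Kanok", 2.75),
    ("E5001", "Dusit", 28700, "EE", "Electronic 1", 3, "S7", "Vichu", 3.15),
    ("E6001", "Anan", 24900, "IE", "Optimization", 3, "S8", "Kitti", 2.54),
    ("E6001", "Anan", 24900, "IE", "Optimization", 3, "S9", "Chareon", 3.08),
]]

def project(rows, attrs):
    """Project rows onto attrs, dropping duplicates and keeping first-seen order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(values)
    return result

lecturer = project(class_info, ["LID", "Lname", "Salary", "Dept"])
subject = project(class_info, ["Subject", "Credit"])
student = project(class_info, ["SID", "Sname", "GPA"])
print(lecturer)
print(subject)
print(len(student))
```

Six redundant rows collapse to two lecturer tuples and two subject tuples, which is where the saving in redundancy comes from.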
The process of normalization is a formal method that identifies
relations based on their primary key (or candidate keys in the case
of BCNF) and the functional dependencies among their attributes.
The normal forms are applied in sequence:
Unnormalized Form
1st Normal Form
2nd Normal Form
3rd Normal Form
Boyce-Codd Normal Form
Relationships of Normal Forms
1NF, 2NF, 3NF/BCNF, 4NF, 5NF, and higher normal forms (DKNF).
Each successive normal form is stricter than the one before it, so a
relation in a higher normal form is also in all the lower ones.
Case Study
The DreamHome company manages property on behalf of the owners, and as part of this service, the
company takes care of the property’s rental. To simplify this example, we assume that a customer rents
a given property only once, and cannot rent more than one property at any one time.
Unnormalized form (UNF): A table that contains one or more repeating groups.
Customer_Rental Relation
Cust_No  CName          Property_No  PAddress         Rent  RentStart  RentFinish  Owner_No  OName        O_addr
CR76     John Kay       PG4          6 Lawrence St,   350   1-Jul-94   31-Aug-96   CO40      Tina Murphy  ...
                        PG16         5 Norwar Dr      450   1-Sep-96   1-Sep-98    CO93      Tony Shaw    ...
CR56     Aline Stewart  PG4          6 Lawrence St,   350   1-Sep-92   10-Jun-94   CO40      Tina Murphy  ...
                        PG36         2 Manor Rd,      375   10-Oct-94  1-Dec-95    CO93      Tony Shaw    ...
                        PG16         5 Norwar Dr      450   1-Jan-96   10-Aug-96   CO93      Tony Shaw    ...
A repeating group is an attribute or group of attributes within a table that occurs with multiple values
for a single occurrence of the key attribute(s) for that table. The term key refers to the attribute(s)
that uniquely identify each row within the unnormalized table.
Convert the unnormalized form to 1NF by removing the repeating groups, so that the data
fit the relational data model (data are conceptually structured in the form of a table).
Customer_Rental Relation
Cust_No  CName          Property_No  PAddress         Rent  RentStart  RentFinish  Owner_No  OName        O_addr
CR76     John Kay       PG4          6 Lawrence St,   350   1-Jul-94   31-Aug-96   CO40      Tina Murphy  ...
CR76     John Kay       PG16         5 Norwar Dr      450   1-Sep-96   1-Sep-98    CO93      Tony Shaw    ...
CR56     Aline Stewart  PG4          6 Lawrence St,   350   1-Sep-92   10-Jun-94   CO40      Tina Murphy  ...
CR56     Aline Stewart  PG36         2 Manor Rd,      375   10-Oct-94  1-Dec-95    CO93      Tony Shaw    ...
CR56     Aline Stewart  PG16         5 Norwar Dr      450   1-Jan-96   10-Aug-96   CO93      Tony Shaw    ...
First normal form (1NF): A relation in which the intersection of each row and column
contains one and only one value.
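Removing a repeating group amounts to copying the non-repeating attributes into one row per member of the group. The following is a minimal sketch, assuming the unnormalized data are held as nested Python dictionaries; the field name rentals and the helper to_1nf are hypothetical, and only a few of the Customer_Rental attributes are shown.

```python
# Unnormalized records: each customer carries a repeating group of rentals.
unf = [
    {"Cust_No": "CR76", "CName": "John Kay", "rentals": [
        {"Property_No": "PG4", "RentStart": "1-Jul-94"},
        {"Property_No": "PG16", "RentStart": "1-Sep-96"},
    ]},
    {"Cust_No": "CR56", "CName": "Aline Stewart", "rentals": [
        {"Property_No": "PG4", "RentStart": "1-Sep-92"},
        {"Property_No": "PG36", "RentStart": "10-Oct-94"},
        {"Property_No": "PG16", "RentStart": "1-Jan-96"},
    ]},
]

def to_1nf(records, group_field):
    """Emit one flat row per member of the repeating group."""
    flat = []
    for record in records:
        fixed = {k: v for k, v in record.items() if k != group_field}
        for member in record[group_field]:
            flat.append({**fixed, **member})
    return flat

rental_rows = to_1nf(unf, "rentals")
for row in rental_rows:
    print(row)
```

Every cell of the result holds a single value, at the cost of repeating the customer details on each row.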
Customer_Rental Relation
Customer_No  Property_No  CName          PAddress         Rent  RentStart  RentFinish  Owner_No  OName
CR76         PG4          John Kay       6 Lawrence St,   350   1-Jul-94   31-Aug-96   CO40      Tina Murphy
CR76         PG16         John Kay       5 Norwar Dr      450   1-Sep-96   1-Sep-98    CO93      Tony Shaw
CR56         PG4          Aline Stewart  6 Lawrence St,   350   1-Sep-92   10-Jun-94   CO40      Tina Murphy
CR56         PG36         Aline Stewart  2 Manor Rd,      375   10-Oct-94  1-Dec-95    CO93      Tony Shaw
CR56         PG16         Aline Stewart  5 Norwar Dr      450   1-Jan-96   10-Aug-96   CO93      Tony Shaw
For the relational data model, it is important to recognize that only first
normal form (1NF) is critical in creating appropriate relations. All the subsequent
normal forms are optional. However, to avoid update anomalies, it is recommended that
we proceed to at least 3NF.
Set of functional dependencies of the Customer_Rental relation
fd1  Customer_No, Property_No → RentStart, RentFinish   (Primary key)
fd2  Customer_No → CName   (Partial dependency)
fd3  Property_No → PAddress, Rent, Owner_No, OName   (Partial dependency)
fd4  Owner_No → OName, O_addr   (Transitive dependency)
fd5  Customer_No, RentStart → Property_No, PAddress, RentFinish,
     Rent, Owner_No, OName   (Candidate key)
fd6  Property_No, RentStart → Customer_No, CName, RentFinish   (Candidate key)
Second Normal Form (2NF):
A relation that is in first normal form and in which every non-primary-key attribute is
fully functionally dependent on the primary key.
Full functional dependency:
If A and B are attributes of a relation, B is fully functionally dependent
on A if B is functionally dependent on A, but not on any proper subset of A.
If B is a non-key attribute that is functionally dependent on only part of the primary
key, B is said to be partially dependent on the key. A partial dependency must be removed
by moving the attributes involved into a new table, so that the non-key attribute
becomes fully dependent on the primary key of its table.
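Partial dependencies can be picked out of a list of functional dependencies by comparing each determinant with the primary key. This is a minimal sketch using fd1 to fd4 of the Customer_Rental relation; partial_dependencies is a hypothetical helper, and it inspects only the dependencies it is given, not those implied by them.

```python
def partial_dependencies(primary_key, fds):
    """Return listed FDs whose determinant is a proper subset of the primary key."""
    pk = set(primary_key)
    found = []
    for lhs, rhs in fds:
        non_key = tuple(a for a in rhs if a not in pk)
        # A proper subset of the key determining a non-key attribute is partial.
        if set(lhs) < pk and non_key:
            found.append((tuple(lhs), non_key))
    return found

PRIMARY_KEY = ("Customer_No", "Property_No")
FDS = [
    (("Customer_No", "Property_No"), ("RentStart", "RentFinish")),   # fd1
    (("Customer_No",), ("CName",)),                                  # fd2
    (("Property_No",), ("PAddress", "Rent", "Owner_No", "OName")),   # fd3
    (("Owner_No",), ("OName", "O_addr")),                            # fd4
]

partials = partial_dependencies(PRIMARY_KEY, FDS)
for lhs, rhs in partials:
    print(lhs, "->", rhs)
```

Only fd2 and fd3 are reported: fd1 uses the whole key, and the determinant of fd4 is not part of the key at all.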
Customer_Rental (Customer_No, Property_No, CName, PAddress, RentStart, RentFinish,
Rent, Owner_No, OName, O_addr)
fd1  (Primary key)
fd2  (Partial dependency)
fd3  (Partial dependency)
Removing the partial dependencies fd2 and fd3 gives three relations:
Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_Owner (Property_No, PAddress, Rent, Owner_No, OName, O_addr)
Customer Relation
Customer_No  CName
CR76         John Kay
CR56         Aline Stewart
Rental Relation
Customer_No  Property_No  RentStart  RentFinish
CR76         PG4          1-Jul-94   31-Aug-96
CR76         PG16         1-Sep-96   1-Sep-98
CR56         PG4          1-Sep-92   10-Jun-94
CR56         PG36         10-Oct-94  1-Dec-95
CR56         PG16         1-Jan-96   10-Aug-96
Property_Owner Relation
Property_No  PAddress         Rent  Owner_No  OName        O_addr
PG4          6 Lawrence St,   350   CO40      Tina Murphy  28 North Rye
PG16         5 Norwar Dr      450   CO93      Tony Shaw    550/8 Lake Shore Dr.
PG36         2 Manor Rd,      375   CO93      Tony Shaw    550/8 Lake Shore Dr.
2NF applies to relations with composite keys, that is, relations with a primary key composed of
two or more attributes. A relation with a single-attribute primary key is automatically in at least 2NF.
Transitive dependency
Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_Owner (Property_No, PAddress, Rent, Owner_No, OName, O_addr)
The Customer and Rental relations contain no transitive dependency. The
Property_Owner relation does: Property_No → Owner_No and Owner_No → OName, O_addr,
so OName and O_addr depend on Property_No only via Owner_No.
Transitive dependency: A condition where A, B, and C are attributes of a relation such that
if A → B and B → C, then C is transitively dependent on A via B
(provided that A is not functionally dependent on B or C).
Definition of Third Normal Form:
A relation that is in first and second normal form, and in which no non-primary-key attribute
is transitively dependent on the primary key.
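A chain of the form key → B → C can be searched for among the listed functional dependencies. The sketch below applies the idea to the Property_Owner relation; transitive_dependencies is a hypothetical helper, and it is a simplification that looks only at dependencies stated explicitly, with the primary key as the first determinant.

```python
def transitive_dependencies(primary_key, fds):
    """Find chains key -> B -> C among the listed FDs, where B is not in the key."""
    pk = set(primary_key)
    chains = []
    for lhs1, rhs1 in fds:
        if set(lhs1) != pk:
            continue
        for lhs2, rhs2 in fds:
            via = set(lhs2)
            # B must be determined by the key and must not be part of the key.
            if via <= set(rhs1) and not via & pk:
                dependents = tuple(c for c in rhs2 if c not in pk)
                chains.append((tuple(lhs1), tuple(lhs2), dependents))
    return chains

PROPERTY_OWNER_KEY = ("Property_No",)
PROPERTY_OWNER_FDS = [
    (("Property_No",), ("PAddress", "Rent", "Owner_No", "OName", "O_addr")),
    (("Owner_No",), ("OName", "O_addr")),
]

chains = transitive_dependencies(PROPERTY_OWNER_KEY, PROPERTY_OWNER_FDS)
print(chains)
```

The single chain found, Property_No → Owner_No → (OName, O_addr), is the dependency that 3NF requires to be moved into a relation of its own.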
Removing the transitive dependency splits Property_Owner into two relations:
Property_for_Rent Relation
Property_No  PAddress         Rent  Owner_No
PG4          6 Lawrence St,   350   CO40
PG16         5 Norwar Dr      450   CO93
PG36         2 Manor Rd,      375   CO93
Owner Relation
Owner_No  OName        O_addr
CO40      Tina Murphy  28 North Rye
CO93      Tony Shaw    550/8 Lake Shore Dr.
The resulting 3NF relations are:
Customer (Customer_No, CName)
Rental (Customer_No, Property_No, RentStart, RentFinish)
Property_for_Rent (Property_No, PAddress, Rent, Owner_No)
Owner (Owner_No, OName, O_addr)
Decomposition of Customer_Rental from 1NF to 3NF
1NF: Customer_Rental
2NF: Customer, Rental, Property_Owner
3NF: Customer, Rental, Property_for_Rent, Owner
(Property_Owner is split into Property_for_Rent and Owner.)
From 3NF to Boyce-Codd Normal Form (BCNF)
BCNF is based on functional dependencies that take into account all candidate
keys in a relation. For a relation with only one candidate key, 3NF and BCNF are
equivalent.
The difference between 3NF and BCNF is that, for a functional dependency
A → B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is
not a candidate key, whereas BCNF insists that for this dependency to remain in a
relation, A must be a candidate key. Therefore, BCNF is a stronger form of 3NF, such
that every relation in BCNF is also in 3NF.
Boyce-Codd normal form (BCNF):
A relation is in BCNF if and only if every determinant is
a candidate key.
Violation of BCNF is quite rare, since it may only happen under specific
conditions. The potential to violate BCNF may occur in a relation that
• contains two (or more) composite candidate keys,
• which overlap, that is, share at least one attribute in common.
Case Study
In this example, Client_Interview relation is presented. It contains details of
the arrangements for interviews of clients by members of staff of the DreamHome
company. The members of staff involved in interviewing clients are allocated to a
specific room on the day of interview. However, a room may be allocated to several
members of staff as required throughout a working day. A client is only interviewed once
on a given date, but may be requested to attend further interviews at later dates. This
relation has three candidate keys:
(Client_No, Interview_Date),
(Staff_No, Interview_Date, Interview_Time), and
(Room_No, Interview_Date, Interview_Time).
Therefore the Client_Interview relation has three composite candidate keys, which
overlap by sharing the common attribute Interview_Date. We select
(Client_No, Interview_Date) to act as the primary key for this relation.
Client_Interview (Client_No, Interview_Date, Interview_Time, Staff_No, Room_No)
The Client_Interview relation has the following functional dependencies:
Fd1  Client_No, Interview_Date → Interview_Time, Staff_No, Room_No   (Primary key)
Fd2  Staff_No, Interview_Date, Interview_Time → Client_No   (Candidate key)
Fd3  Room_No, Interview_Date, Interview_Time → Staff_No, Client_No   (Candidate key)
Fd4  Staff_No, Interview_Date → Room_No
Client_Interview Relation
Client_No  Interview_Date  Interview_Time  Staff_No  Room_No
CR76       13-May-98       10:30           SG5       G101
CR56       13-May-98       12:00           SG5       G101
CR74       13-May-98       12:00           SG37      G102
CR56       1-Jul-98        10:30           SG5       G102
The determinant of Fd4, (Staff_No, Interview_Date), is not a candidate key, so the
relation is not in BCNF. It is decomposed into two relations:
Interview (Client_No, Interview_Date, Interview_Time, Staff_No)
Staff_Room (Staff_No, Interview_Date, Room_No)
Interview Relation
Client_No  Interview_Date  Interview_Time  Staff_No
CR76       13-May-98       10:30           SG5
CR56       13-May-98       12:00           SG5
CR74       13-May-98       12:00           SG37
CR56       1-Jul-98        10:30           SG5
Staff_Room Relation
Staff_No  Interview_Date  Room_No
SG5       13-May-98       G101
SG37      13-May-98       G102
SG5       1-Jul-98        G102
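The BCNF condition, that every determinant is a candidate key, can be checked with attribute closures: a determinant passes if its closure covers the whole relation. The following is a minimal sketch under that reading; closure and bcnf_violations are hypothetical helper names, and the dependencies are Fd1 to Fd4 of the Client_Interview case study.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return non-trivial FDs whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)]

CLIENT_INTERVIEW = ("Client_No", "Interview_Date", "Interview_Time", "Staff_No", "Room_No")
FD1 = (("Client_No", "Interview_Date"), ("Interview_Time", "Staff_No", "Room_No"))
FD2 = (("Staff_No", "Interview_Date", "Interview_Time"), ("Client_No",))
FD3 = (("Room_No", "Interview_Date", "Interview_Time"), ("Staff_No", "Client_No"))
FD4 = (("Staff_No", "Interview_Date"), ("Room_No",))

violations = bcnf_violations(CLIENT_INTERVIEW, [FD1, FD2, FD3, FD4])
print(violations)

# After decomposition, each relation keeps only the FDs over its own attributes.
STAFF_ROOM = ("Staff_No", "Interview_Date", "Room_No")
print(bcnf_violations(STAFF_ROOM, [FD4]))
```

Only Fd4 is reported for Client_Interview, and it no longer violates BCNF once it has a relation of its own, because its determinant is the key of Staff_Room.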
Review of Normalization (1NF to BCNF)
The DreamHome company manages property on behalf of the owners, and as
part of this service the company undertakes regular inspections of the property by
members of staff. When staff are required to undertake these inspections, they are
allocated a company car for use on the day of the inspections. However, a car may be
allocated to several members of staff, as required throughout the working day. A
member of staff may inspect several properties on a given date, but a property is
only inspected once on a given date.
Property_Inspection Relation (unnormalized)
Property_No  PAddress         IDate      ITime  Comments                  Staff_No  SName       Car_Reg
PG4          6 Lawrence St,   18-Oct-96  10:00  Need to replace crockery  SG37      Ann Beech   M231 JGR
                              22-Apr-97  09:00  In good order             SG14      David Ford  M533 HDR
                              1-Oct-98   12:00  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         5 Norwar Dr      22-Apr-96  13:00  Replace room carpet       SG14      David Ford  M533 HDR
                              24-Oct-97  14:00  Good condition            SG37      Ann Beech   N721 HFR
Property_Inspection (Property_No, PAddress, IDate, ITime, Comments, Staff_No, SName, Car_Reg)
1NF: Property_Inspection Relation
Property_No  IDate      ITime  PAddress         Comments                  Staff_No  SName       Car_Reg
PG4          18-Oct-96  10:00  6 Lawrence St,   Need to replace crockery  SG37      Ann Beech   M231 JGR
PG4          22-Apr-97  09:00  6 Lawrence St,   In good order             SG14      David Ford  M533 HDR
PG4          1-Oct-98   12:00  6 Lawrence St,   Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         22-Apr-96  13:00  5 Norwar Dr      Replace room carpet       SG14      David Ford  M533 HDR
PG16         24-Oct-97  14:00  5 Norwar Dr      Good condition            SG37      Ann Beech   N721 HFR
Property_Inspection (Property_No, IDate, ITime, PAddress, Comments, Staff_No, SName, Car_Reg)
The dependency diagram for this relation labels six functional dependencies:
FD1  Property_No, IDate → ITime, PAddress, Comments, Staff_No, SName, Car_Reg   (Primary key)
FD2  Property_No → PAddress   (Partial dependency)
FD3  Staff_No → SName   (Transitive dependency)
FD4  Staff_No, IDate → Car_Reg
FD5  IDate, ITime, Car_Reg → Property_No, PAddress, Comments, Staff_No, SName   (Candidate key)
FD6  IDate, ITime, Staff_No → Property_No, PAddress, Comments, SName, Car_Reg   (Candidate key)
The potential to violate BCNF may occur in a relation that
• contains two (or more) composite candidate keys,
• which overlap, that is, share at least one attribute in common.
The candidate keys of Property_Inspection overlap on IDate:
(Property_No, IDate)
(IDate, ITime, Car_Reg)
(IDate, ITime, Staff_No)
Remove the partial dependency FD2 (decompose the relation) to obtain 2NF
Property Relation (Property_No, PAddress)
Property_No  PAddress
PG4          6 Lawrence St,
PG16         5 Norwar Dr
Property_Inspection Relation
Property_No  IDate      ITime  Comments                  Staff_No  SName       Car_Reg
PG4          18-Oct-96  10:00  Need to replace crockery  SG37      Ann Beech   M231 JGR
PG4          22-Apr-97  09:00  In good order             SG14      David Ford  M533 HDR
PG4          1-Oct-98   12:00  Damp rot in bathroom      SG14      David Ford  N721 HFR
PG16         22-Apr-96  13:00  Replace room carpet       SG14      David Ford  M533 HDR
PG16         24-Oct-97  14:00  Good condition            SG37      Ann Beech   N721 HFR
Remaining in Property_Inspection: FD1 (Primary key), FD3 (Transitive dependency),
FD4, FD5 (Candidate key), FD6 (Candidate key)
Remove the transitive dependency FD3 (decompose the relation) to obtain 3NF
Property Relation
Property_No  PAddress
PG4          6 Lawrence St,
PG16         5 Norwar Dr
Staff Relation
Staff_No  SName
SG37      Ann Beech
SG14      David Ford
Property_Inspection Relation
Property_No  IDate      ITime  Comments                  Staff_No  Car_Reg
PG4          18-Oct-96  10:00  Need to replace crockery  SG37      M231 JGR
PG4          22-Apr-97  09:00  In good order             SG14      M533 HDR
PG4          1-Oct-98   12:00  Damp rot in bathroom      SG14      N721 HFR
PG16         22-Apr-96  13:00  Replace room carpet       SG14      M533 HDR
PG16         24-Oct-97  14:00  Good condition            SG37      N721 HFR
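Remove remaining anomalies from functional dependencies to obtain BCNF
Property_Inspection (Property_No, IDate, ITime, Comments, Staff_No, Car_Reg)
Primary key: (Property_No, IDate)
Candidate keys: (IDate, ITime, Car_Reg) and (IDate, ITime, Staff_No)
A member of staff is allocated one car for the day of the inspections, so
(Staff_No, IDate) determines Car_Reg, but (Staff_No, IDate) is not a candidate key.
The relation is therefore decomposed into:
Staff_Car (Staff_No, IDate, Car_Reg)
Inspection (Property_No, IDate, ITime, Comments, Staff_No)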
From BCNF to Fourth Normal Form (4NF)
Although BCNF removes any anomalies due to functional dependencies, further
research led to the identification of another type of dependency, called a multi-valued
dependency (MVD), which can cause similar design problems for relations in terms
of data redundancy.
Even though the following table is in BCNF, update anomalies still exist.
Lec_Sub_Research Relation
Lecturer_Name  Subject         Research
Yuen           Data Structure  Natural Language Processing
Yuen           Data Structure  Protocol Analyzer
Yuen           Discrete Math   Natural Language Processing
Yuen           Discrete Math   Protocol Analyzer
Yuen           Data Base       Natural Language Processing
Yuen           Data Base       Protocol Analyzer
Chalerrmsak    Data Structure  Protocol Analyzer
Chalerrmsak    Data Structure  Compiler Utilities
Chalerrmsak    Data Structure  Natural Language Processing
Multi-valued dependency (MVD):
Represents a dependency between attributes (for example, A,
B, and C) in a relation, such that for each value of A there is a
set of values for B and a set of values for C. However, the sets
of values for B and C are independent of each other.
A →→ B
A →→ C
Lecturer_Name →→ Subject
Lecturer_Name →→ Research
Removing the multi-valued dependencies splits Lec_Sub_Research into two relations:
Lec_Sub Relation
Lecturer_Name  Subject
Yuen           Data Structure
Yuen           Discrete Math
Yuen           Data Base
Chalerrmsak    Data Structure
Lec_Research Relation
Lecturer_Name  Research
Yuen           Natural Language Processing
Yuen           Protocol Analyzer
Chalerrmsak    Protocol Analyzer
Chalerrmsak    Compiler Utilities
Chalerrmsak    Natural Language Processing
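Because a lecturer's subjects and research topics are independent, joining the two smaller relations on Lecturer_Name should rebuild the original relation without losing or inventing rows. The sketch below checks this on the example data, assuming each relation is held as a Python set of tuples; the variable names are illustrative.

```python
lec_sub = {
    ("Yuen", "Data Structure"),
    ("Yuen", "Discrete Math"),
    ("Yuen", "Data Base"),
    ("Chalerrmsak", "Data Structure"),
}
lec_research = {
    ("Yuen", "Natural Language Processing"),
    ("Yuen", "Protocol Analyzer"),
    ("Chalerrmsak", "Protocol Analyzer"),
    ("Chalerrmsak", "Compiler Utilities"),
    ("Chalerrmsak", "Natural Language Processing"),
}

# Natural join on Lecturer_Name: pair every subject with every research topic
# of the same lecturer.
joined = {
    (lecturer, subject, research)
    for (lecturer, subject) in lec_sub
    for (other, research) in lec_research
    if lecturer == other
}

print(len(lec_sub) + len(lec_research), "rows stored instead of", len(joined))
```

Nine rows of the combined relation are represented by nine rows across two narrower relations here, and the gap widens as a lecturer gains subjects or research topics, since the combined relation grows with their product.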
Summary of the normalization process
Unnormalized form (UNF)
    ↓ Remove repeating groups
First normal form (1NF)
    ↓ Remove partial dependencies
Second normal form (2NF)
    ↓ Remove transitive dependencies
Third normal form (3NF)
    ↓ Remove remaining anomalies from functional dependencies
Boyce-Codd normal form (BCNF)
    ↓ Remove multi-valued dependencies
Fourth normal form (4NF)