Lecture 1

Chapter 5
• Normalization
• An Normalization
example
INSS 651
1
Learning Objectives
• What normalization is and what role it plays in
the database design process
• About the normal forms 1NF, 2NF, 3NF, BCNF,
and 4NF
• How normal forms can be transformed from
lower normal forms to higher normal forms
• That normalization and ER modeling are used
concurrently to produce a good database design
• That some situations require denormalization to
generate information efficiently
INSS 651
2
Normalization
A process for evaluating and correcting table
structure
Minimize data redundancy
Eliminate Anomalies
INSS 651
3
Is Normalization Necessary?
NO
But it is helpful to maintain data integrity and
consistency
INSS 651
4
Anomalies
• Update—requires update in multiple
locations
• Deletion—A deletion may lose important
information
• Insertion—Requires complete definitions,
ie does see page 187 (an employee can
not be entered unless he is assigned a
project
INSS 651
5
Normalization Process
• 1st NF
• 2nd NF
• 3rd NF
• Almost for 90-98% application 3rd NF is
sufficient
INSS 651
6
Dependency
When an attribute value depends on
attribute B then B is dependent on A
A---B
or values of B can be determined by value of
A, reverse may or may not be true
Ex:
ssn--Name
SSN, CID--Grade
INSS 651
7
Un-normalized relation
Remove REPEATING groups
1st NF
Remove PARTIAL dependency
2nd NF
Remove TRANSIENT dependency
3rd NF
Every determinant is a candidate key
Boyce-CODD NF
If we can convert a relation into 3NF almost 90-98% of anomalies are removed
INSS 651
8
The Need for Normalization
(continued)
• Structure of data set in Figure 5.1 does not
handle data very well
• The table structure appears to work; report
is generated with ease
• Unfortunately, the report may yield
different results, depending on what data
anomaly has occurred
INSS 651
9
1st NF
• Remove repeating groups
ASSIGNMENT
(Proj_num,
proj_name(Emp_num,E_name,job_class,c
hg_hours,Hour))
INSS 651
10
A Dependency Diagram:
First Normal Form (1NF)
INSS 651
11
Second Normal Form (2NF)
Conversion Results
INSS 651
12
Third Normal Form (3NF)
Conversion Results
INSS 651
13
Un-normalized form
A relation is in un-normalized form, if it contains repeating group
Typically shown in parentheses
Ex: PART NO DESC.
VENDOR-NAME ADDRESS UNIT-COST
1234
LOGIC
chip
INTEL
LSI LOGIC
5678
MEMORY INTEL
chip
INSS 651
SAN JOSE 150.00
SAN JOSE 120.00
SAN JOSE
50.00
14
SUPPLIER
(Part_no, Part_DESC, (Vendor_name,
Vendor_address, Unit_cost))
INSS 651
15
Another Way
(Part_NO, V_NAME)-> Unit_cost
Part_NO->P_Desc (Partial dependency)
V-Name->V_DESC (Partial Dependency)
Part_NO
P-DESC
V_NAME
V_ADDRESS
INSS 651
UNIT_COST
16
1st NF
A relation is in 1st NF if it does NOT contain any repeating groups
(Part_no, Part_DESC, (Vendor_name, Vendor_address, Unit_cost))
1st NF..remove repeating groups
Break it into TWO relations
One without repeating group and
ONE with repeating group AND PK of other relation
S1 (Part_no, Part_DESC)
S2 (Vendor_name, Part_no, Vendor_address, Unit_cost
INSS 651
17
2nd NF
A relation is in 2nf NF if it is in 1stNF and it does not
contain any partial dependency
Partial dependency: A partial dependency exists if an
attribute is dependent ONLY on PART of the PK and the
WHOLE PK
We must examine each relation for partial dependency
NOTE: A partial dependency can only exist if there are
more than ONE attribute as PK
INSS 651
18
S1 (Part_no, Part_DESC)
S2 (Vendor_name, Part_no, Vendor_address,
Unit_cost
Note S1 is already in 2nd NF since there is only
attribute as PK
In S2:
Question is Vendor_address dependent on BOTH
vendor_name AND Part_NO?
Question is Unit_price dependent on BOTH
vendor_name AND Part_NO?
INSS 651
19
Question is Vendor_address dependent on BOTH
vendor_name AND Part_NO?
Answer: NO
Give me vendor_no and I can find
vendor_address, we do NOT need Part_No to
know vendor_address, ie Vendor_address
depends ONLY Vendor_name, hence the partial
dependency
INSS 651
20
Question is Unit_price dependent on BOTH
vendor_name AND Part_NO?
YES if you examine the table, price changes
with vendor and part_no, ie price depnds
on both Part_no AND which vendor
supplies it
INSS 651
21
Remove Partial Dependency
VENDOR _ADDRESS
VENDOR_name
UNIT_PRICE
PART#
Create TWO tables:
One with Partial dependency and other without it
S21 (Vendor_name, vendor_address)
S22(Vendor_name, Part_no, Unit_price)
INSS 651
22
3rd NF
A relation is in 3rd NF if it is in 2nd NF and it
does not contain any transitive
dependency
Transitive dependency: A transitive
dependency exists when some of the nonkey attributes are dependent on other nonkey attributes
INSS 651
23
So far we have three relations that are in
at least 2nd NF
S1 (Part_no, Part_DESC)
S21 (Vendor_name, vendor_address)
S22(Vendor_name, Part_no, Unit_price)
S1, S21 & S22 are also in 3rd NF since there
is ONLY ONE non_key attribute and
transitive dependency can NOT exist
INSS 651
24
ERD
PART
PART-SUPPLIED
INSS 651
VENDOR
25
Q6/p 184
A Using notation from the book
a. (C1, C3)- C2,C4,C5
(i.e., C2, C4, C5) are functionally dependent on C1 and C3
Above relation is in at least 1stNF, since there are No
repeating groups
C1C2 there is PARTIAL dependency since C2 depends
on PART of the PK and the whole PK
C4--C5 (transitive dependency since C5 ( a non-PK
attribute) depends on another non-PK attribute (C4)
INSS 651
26
Part b
Table 1
C1
Primary key: C1
Foreign key: None
Normal form: 3NF
C2
Table 2
C1
C3
C4
C5
Primary key: C1 + C3
Foreign key: C1 (to Table 1)
Normal form: 2NF, because the
table exhibits the transitive
dependencies C4
C5
INSS 651
27
Part c
C1
C1
C3
C4
C2
Table 1
Primary key: C1
Foreign key: None
Normal form: 3NF
C4
Table 2
Primary key: C1 + C3
Foreign key: C1 (to Table 1)
C4 (to Table 3)
Normal form: 3NF
C5
Table 3
Primary key: C4
Foreign key: None
Normal form: 3NF
INSS 651
28
Q8/P187
Table P5.8 Sample ITEM Records
Sample Value
Sample Value
Sample Value
ITEM_ID
231134-678
342245-225
254668-449
ITEM_LABEL
HP DeskJet 895Cse
HP Toner
DT Scanner
ROOM_NUMBER
325
325
123
BLDG_CODE
NTC
NTC
CSF
BLDG_NAME
Nottooclear
Nottoclear
Canseefar
BLDG_MANAGER
I. B. Rightonit
I. B. Rightonit
May B. Next
Attribute Name
INSS 651
29
Problem 8 Solution
ITEM_ID ITEM_DESCRIPTION BLDG_ROOM BLDG_CODE BLDG_NAME BLDG_MANAGER
INSS 651
30
ITEM_ID ITEM_DESCRIPTION BLDG_ROOM BLDG_CODE BLDG_NAME BLDG_MANAGER
Problem 9 Solution: All tables in 3NF
ITEM_ID ITEM_DESCRIPTION BLDG_ROOM BLDG_CODE
BLDG_CODE BLDG_NAME
EMP_CODE
EMP_CODE
EMP_LNAME EMP_FNAME
INSS 651
EMP_INITIAL
31
1
EMPLOYEE
(0,N)
manages
M
(1,1)
M
1
BUILDING
contains
(0,N)
EMPLOYEE
EMP_CODE
EMP_INITIAL
(1,1)
BUILDING
1
BLDG_CODE
EMP_LNAME
EMP_FNAME
ITEM
M
ITEM
1
ITEM_ID
BLDG_NAME
ITEM_DESCRIPTION
EMP_CODE
ITEM_ROOM
M
BLDG_CODE
INSS 651
32
Denormalization
• Reversing normalization
• i.e from 3rd NF to 2nd NF
• Or 2nd to 1st NF
INSS 651
33
Convert into 3NF
INVOICE
(Inv_num,
cust_num,lastname,Firstname,street,city,st
ate,zip,date,(partnum,
description,price,numshipped))
INSS 651
34