Normalization

Lecture Nine: Normalization
Introduction to Normalization:
redundancy, anomalies
1st – 3rd Normal Forms
7/28/2017
Objectives







Purpose of normalization.
Problems associated with redundant data.
Identification of various types of update anomalies such as
insertion, deletion, and modification anomalies.
How to recognize the appropriateness or quality of the design of
relations.
How functional dependencies can be used to group attributes
into relations that are in a known normal form.
How to undertake the process of normalization.
How to identify the most commonly used normal forms, namely
1NF, 2NF, and 3NF.
Normalization

Normalization is defined as a technique for producing a set of
well designed relations that measure up to a set of requirements
which are outlined in various levels of normalization (or Normal
Forms).

Most commonly used normal forms are first (1NF), second (2NF)
and third (3NF) normal forms.

The underlying aim of normalization is to minimize information
redundancy, avoid data inconsistency, and prevent update
anomalies (insertion, deletion, and modification anomalies).
Data Redundancy

A major aim of relational database design is to group attributes
into relations so as to minimize data redundancy and reduce the file
storage space required by the base relations.

Problems associated with data redundancy are illustrated by
comparing the following Staff and Branch relations with the
StaffBranch relation.
Update Anomalies

Relations that contain redundant information may potentially
suffer from update anomalies.

Types of update anomalies include:
• Insertion
• Deletion
• Modification
Insertion Anomaly: Occurs when extra data beyond the desired
data must be added to the database.
Update anomalies: Insertion Anomaly

Until the new faculty member, Dr. Newsome, is assigned to
teach at least one course, his details cannot be recorded.
Update anomalies: Modification Anomaly
Modification Anomaly: Occurs when a value is stored redundantly, so that
changing it means updating every row that holds a copy. If any copy
is missed, the data becomes inconsistent.
For example, Employee 519 is shown as having different addresses on different records.
Update anomalies: Deletion Anomaly

Deletion Anomaly: Occurs whenever deleting a row
inadvertently causes other data to be deleted.
All information about Dr. Giddens is lost when he temporarily ceases to
be assigned to any courses.
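
The deletion anomaly above can be sketched in a few lines of Python. This is only an illustration: the rows, office codes, course codes, and "Dr. Lee" are hypothetical values, not data from the lecture.

```python
# Hypothetical rows: one table mixing faculty details with course assignments.
faculty_course = [
    {"faculty": "Dr. Giddens", "office": "B12", "course": "CS101"},
    {"faculty": "Dr. Lee", "office": "C07", "course": "CS202"},
    {"faculty": "Dr. Lee", "office": "C07", "course": "CS303"},
]


def unassign(rows, course):
    """Delete the row(s) that record an assignment to the given course."""
    return [r for r in rows if r["course"] != course]


# Removing Dr. Giddens's only assignment also removes his office details.
after = unassign(faculty_course, "CS101")
giddens_lost = "Dr. Giddens" not in {r["faculty"] for r in after}

# Split design (an assumption for illustration): faculty details are stored
# separately, so they survive the removal of an assignment, and a member with
# no courses yet (Dr. Newsome) can still be recorded.
faculty = {"Dr. Giddens": "B12", "Dr. Lee": "C07", "Dr. Newsome": "A03"}
assignments = [("Dr. Giddens", "CS101"), ("Dr. Lee", "CS202"), ("Dr. Lee", "CS303")]
assignments_after = [a for a in assignments if a[1] != "CS101"]
giddens_kept = "Dr. Giddens" in faculty
```

In the single-table version the faculty details exist only as a side effect of a course assignment, which is why both the insertion and the deletion anomaly appear.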
Functional Dependency




Functional Dependency: Describes the relationship between attributes
in a relation.
• If A and B are attributes of relation R, B is functionally
dependent on A (denoted A -> B) if each value of A in R is
associated with exactly one value of B in R.
Diagrammatic representation:
The determinant of a functional dependency is the attribute or group of
attributes on the left-hand side of the arrow.
Functional dependency is the main concept associated with normalization.
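
As a rough sketch, the definition can be tested against a set of rows. The function name `holds`, the attribute `staffNo`, and all row values are assumptions made for illustration. Note that sample rows can only refute a functional dependency; whether it truly holds depends on the meaning of the attributes, not on one particular instance.

```python
def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first time a determinant value is seen, remember its dependent
        # value; any later row with a different dependent value is a violation.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical StaffBranch-style rows.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B1", "bAddress": "1 Main St"},
    {"staffNo": "S2", "branchNo": "B1", "bAddress": "1 Main St"},
    {"staffNo": "S3", "branchNo": "B2", "bAddress": "9 High St"},
]

branch_determines_address = holds(staff_branch, ["branchNo"], ["bAddress"])  # True
branch_determines_staff = holds(staff_branch, ["branchNo"], ["staffNo"])     # False
```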
Example

[Diagram: functional dependency between branchNo and bAddress]
Example
Given TEXT, we know the COURSE.
TEXT -> COURSE
Each value of TEXT maps to a single value of COURSE.
The Process of Normalization

Formal technique for analyzing a relation based on its
primary key and functional dependencies between its
attributes.

Often executed as a series of steps. Each step corresponds to
a specific normal form, which has known properties.

As normalization proceeds, relations become progressively
more restricted (stronger) in format and also less vulnerable
to update anomalies.
Unnormalized Form (UNF)

A table that contains one or more repeating groups.
• Note: A repeating group is an attribute or group of
attributes within a table that occurs with multiple
values for a single occurrence of the nominated key
attributes for that table. For example, a book with
multiple authors.

To create an unnormalized table:
• Transform the data from the information source (e.g. a form) into
table format with columns and rows.
First normal form (1NF)




A table is in First Normal Form (1NF) if and only if all its attributes are
atomic.
A domain is atomic if its elements are considered to be
indivisible units. Equivalently, a 1NF relation is one in which the
intersection of each row and column contains one and only one value.
This implies that it should have no composite or multivalued attributes.
If a table is not in 1NF, its repeating groups must be removed to bring it into 1NF.
UNF to 1NF
First identify a primary key, then
Either
• Place each value of a repeating group on a tuple with duplicate
values of the non-repeating data (called "flattening" the table).
Or
• Make a new table to cater for the multivalued attributes.
• Place the repeating data, along with a copy of the original key
attribute(s), into a separate relation.
• The new primary key should be a combination of the multivalued attribute and the primary key of the parent table.
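
Both options can be sketched as follows, using the book-with-multiple-authors case as the repeating group. The function names, attribute names, and sample values are hypothetical and chosen only for illustration.

```python
# Hypothetical unnormalized rows: "authors" is a repeating group.
unf_books = [
    {"isbn": "111", "title": "Databases", "authors": ["Ade", "Bola"]},
    {"isbn": "222", "title": "Networks", "authors": ["Chidi"]},
]


def flatten(rows, group, single):
    """Option 1: one tuple per repeating value, non-repeating data duplicated."""
    flat = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != group}
        for value in row[group]:
            flat.append({**base, single: value})
    return flat


def split(rows, key, group, single):
    """Option 2: move the repeating group into a new relation.

    The new relation's primary key is the pair (parent key, repeating value).
    """
    parent = [{k: v for k, v in row.items() if k != group} for row in rows]
    child = [{key: row[key], single: v} for row in rows for v in row[group]]
    return parent, child


flat_books = flatten(unf_books, "authors", "author")
books, book_authors = split(unf_books, "isbn", "authors", "author")
```

Flattening keeps a single table but repeats the title on every author row, so the second option is usually the one that carries forward cleanly into 2NF.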

HEALTH HISTORY REPORT

PET ID  PET NAME  PET TYPE  PET AGE  OWNER
246     ROVER     DOG       12       SAM COOK
298     SPOT      DOG       2        TERRY KIM
341     MORRIS    CAT       4        SAM COOK
519     TWEEDY    BIRD      2        TERRY KIM

Visits (repeating group, rows in report order):

             PROCEDURE
VISIT DATE   PID  PNAME
JAN 13/2002  01   RABIES VACCINATION
MAR 27/2002  10   EXAMINE and TREAT WOUND
APR 02/2002  05   HEART WORM TEST
JAN 21/2002  08   TETANUS VACCINATION
MAR 10/2002  05   HEART WORM TEST
JAN 23/2001  01   RABIES VACCINATION
JAN 13/2002  01   RABIES VACCINATION
APR 30/2002  20   ANNUAL CHECK UP
APR 30/2002  12   EYE WASH
Second Normal Form (2NF)



Based on the concept of full functional dependency:
• A and B are attributes (or sets of attributes) of a relation.
• B is fully functionally dependent on A if B is functionally dependent
on A but not on any proper subset of A.
2NF - A relation that is in 1NF and in which every non-primary-key
attribute is fully functionally dependent on the primary
key.
2NF is only at issue for relations with a composite primary
key; a 1NF relation whose primary key is a single attribute is automatically in 2NF.
1NF to 2NF

This involves the removal of partial dependencies.

A partial dependency occurs when the primary key is made up
of more than one attribute (i.e. it is a composite primary key)
and there exists a non-primary-key attribute that is dependent
on only part of the primary key.

These partial dependencies can be eliminated by moving all of
the partially dependent attributes into another relation along
with a copy of their determinant (which is part of the
primary key in the original relation).
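
A minimal sketch of this step, assuming the functional dependencies are listed explicitly (no closure is computed). The Enrolment relation, its attributes, and the function names are hypothetical.

```python
def partial_dependencies(primary_key, fds):
    """Return listed FDs whose determinant is a proper subset of the key."""
    key = frozenset(primary_key)
    return [
        (frozenset(lhs), frozenset(rhs) - key)
        for lhs, rhs in fds
        if frozenset(lhs) < key and frozenset(rhs) - key
    ]


def to_2nf(attributes, primary_key, fds):
    """Move partially dependent attributes out with a copy of their determinant.

    Returns the attributes left in the original relation and a list of
    (key, attributes) pairs describing the new relations.
    """
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in partial_dependencies(primary_key, fds):
        new_relations.append((sorted(lhs), sorted(lhs | rhs)))
        remaining = [a for a in remaining if a not in rhs]
    return remaining, new_relations


# Hypothetical 1NF relation with composite key (studentNo, courseNo).
attributes = ["studentNo", "courseNo", "studentName", "courseTitle", "grade"]
primary_key = ["studentNo", "courseNo"]
fds = [
    (["studentNo"], ["studentName"]),         # partial dependency
    (["courseNo"], ["courseTitle"]),          # partial dependency
    (["studentNo", "courseNo"], ["grade"]),   # full dependency
]

enrolment, extracted = to_2nf(attributes, primary_key, fds)
```

Here `grade` stays with the composite key, while `studentName` and `courseTitle` each move to a relation keyed on the part of the key that determines them.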
Third Normal Form (3NF)

Based on the concept of transitive dependency:
• If A, B and C are attributes of a relation such that A -> B
and B -> C,
• then C is transitively dependent on A through B (provided
that A is not functionally dependent on B or C).

3NF - A relation that is in 1NF and 2NF and in which no
non-primary-key attribute is transitively dependent on the
primary key.
2NF to 3NF

Identify the primary key in the 2NF relation.

Identify functional dependencies in the relation.

If transitive dependencies on the primary key exist, remove
them by placing the transitively dependent attributes in a new
relation along with a copy of their determinant.
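
The steps above can be sketched in the same style. This is a simplified illustration that looks only at the listed functional dependencies; the attributes `staffNo` and `sName` and the function names are assumptions, while `branchNo` and `bAddress` follow the StaffBranch example.

```python
def transitive_dependencies(primary_key, fds):
    """Return listed FDs whose determinant contains no primary-key attribute."""
    key = frozenset(primary_key)
    return [
        (frozenset(lhs), frozenset(rhs) - key)
        for lhs, rhs in fds
        if not (frozenset(lhs) & key) and frozenset(rhs) - key
    ]


def to_3nf(attributes, primary_key, fds):
    """Move transitively dependent attributes out with a copy of their determinant."""
    remaining = list(attributes)
    new_relations = []
    for lhs, rhs in transitive_dependencies(primary_key, fds):
        new_relations.append((sorted(lhs), sorted(lhs | rhs)))
        remaining = [a for a in remaining if a not in rhs]
    return remaining, new_relations


# Hypothetical 2NF relation StaffBranch with primary key staffNo.
attributes = ["staffNo", "sName", "branchNo", "bAddress"]
fds = [
    (["staffNo"], ["sName", "branchNo"]),
    (["branchNo"], ["bAddress"]),  # bAddress depends on staffNo via branchNo
]

staff, extracted = to_3nf(attributes, ["staffNo"], fds)
```

The determinant `branchNo` stays behind in the original relation, where it acts as the link to the new relation that holds `bAddress`.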
Exercises: Instructions

The following tables are susceptible to update anomalies.
Provide examples of insertion, deletion, and modification
anomalies.

Describe and illustrate the process of normalizing the tables
to 3NF. State any assumptions you make about the data
shown in these tables.
Exercise 1
Exercise 2
Exercise 3