Normalization

Introduction
Badly structured tables that contain redundant data
may suffer from update anomalies:
• Insertion
• Deletion
• Modification
Bad structure may occur due to:
• Errors in the original ER diagram.
• Errors in the process of translating ER models into tables.
Database Tables and Normalization
• Normalization is a technique to support the design of
databases based on the relational model.
• Normalization helps reduce data redundancy and
helps eliminate data anomalies.
• Normalization works through a series of stages called
normal forms:
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
• The highest level of normalization is not always
desirable.
Data redundancy and update anomalies
A major aim of relational database design is to
group columns into tables to minimize data
redundancy and reduce the file storage space
required by base tables.
© Pearson Education Limited,
2004
Staff(staffNo, name, position, salary, branchNo)
Primary key staffNo
Foreign key branchNo references Branch(branchNo)
Branch(branchNo, branchAddress, telNo)
Primary key branchNo
StaffBranch(staffNo, name, position, salary, branchNo,
branchAddress, telNo)
Primary key staffNo
What is the problem?
• The StaffBranch table has redundant data: the details of a
branch (branchAddress and telNo) are repeated for
every member of staff located at that branch.
• In contrast, in the Branch table the branch information
appears only once for each branch, and only the branch
number (branchNo) is repeated in the Staff table to
represent where each member of staff is located.
• Tables that have redundant data may suffer from update
anomalies (insertion, deletion, or modification
anomalies).
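The redundancy can be made concrete with a few rows. The following Python sketch uses hypothetical sample values (the staff names, addresses, and telephone numbers are assumptions, not data from the slides) and splits StaffBranch into Staff and Branch:

```python
# Hypothetical sample rows; the values are illustrative only.
# Columns: staffNo, name, position, salary, branchNo, branchAddress, telNo
staff_branch = [
    ("S1", "Ann", "Manager", 30000, "B001", "1 Main St", "555-0101"),
    ("S2", "Bob", "Assistant", 12000, "B001", "1 Main St", "555-0101"),
    ("S3", "Cy", "Assistant", 12000, "B002", "9 High St", "555-0202"),
]

# Staff keeps only the foreign key branchNo (first five columns).
staff = [row[:5] for row in staff_branch]

# Branch stores each (branchNo, branchAddress, telNo) combination once.
branch = sorted({row[4:] for row in staff_branch})

print(len(staff_branch), "copies of branch details in StaffBranch")
print(len(branch), "copies of branch details in Branch")
```

In StaffBranch the details of branch B001 are stored once per member of staff; in Branch they are stored once per branch.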
Insertion anomalies
• How do we insert the details of a new member of staff at
branch B002 into the StaffBranch table?
• Is there any problem when the tables are separated?
Insertion anomalies (1)
To insert the details of a new member of staff at branch B002 into
the StaffBranch table, we must enter the correct details of
branch B002 so that they are consistent with the values
for branch B002 in the other records of the StaffBranch table.
There is no problem when the tables are separated: we do not need to
enter the branch details, because the foreign key branchNo is enough.
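A minimal sketch of this first insertion anomaly, using Python's standard sqlite3 module with an in-memory database. The column types and all sample values are assumptions made for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StaffBranch (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, position TEXT, salary INTEGER,
    branchNo TEXT, branchAddress TEXT, telNo TEXT
);
CREATE TABLE Branch (
    branchNo TEXT NOT NULL PRIMARY KEY, branchAddress TEXT, telNo TEXT
);
CREATE TABLE Staff (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, position TEXT, salary INTEGER,
    branchNo TEXT REFERENCES Branch(branchNo)
);
-- Hypothetical existing data for branch B002.
INSERT INTO StaffBranch VALUES
    ('S3', 'Cy', 'Assistant', 12000, 'B002', '9 High St', '555-0202');
INSERT INTO Branch VALUES ('B002', '9 High St', '555-0202');
INSERT INTO Staff VALUES ('S3', 'Cy', 'Assistant', 12000, 'B002');
""")

# Combined table: a mistyped address for B002 is accepted without complaint.
conn.execute(
    "INSERT INTO StaffBranch VALUES "
    "('S9', 'Dee', 'Assistant', 12000, 'B002', '9 Hihg St', '555-0202')"
)
addresses_for_b002 = conn.execute(
    "SELECT COUNT(DISTINCT branchAddress) FROM StaffBranch WHERE branchNo = 'B002'"
).fetchone()[0]

# Separate tables: the new staff row carries only the foreign key branchNo.
conn.execute("INSERT INTO Staff VALUES ('S9', 'Dee', 'Assistant', 12000, 'B002')")
branch_rows = conn.execute("SELECT COUNT(*) FROM Branch").fetchone()[0]

print(addresses_for_b002, branch_rows)
```

After the insert, StaffBranch holds two different addresses for the same branch, while the Branch table still holds a single row for B002.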
Insertion anomalies (2)
How do we insert the details of a new branch that currently
has no members of staff into the StaffBranch table?
Is there any problem when the tables are separated?
Insertion anomalies (2)
To insert the details of a new branch that currently has no
members of staff into the StaffBranch table, it is necessary to enter
nulls into the staff-related columns, such as staffNo (the primary
key). This violates entity integrity and is not allowed.
There is no problem when the tables are separated: we simply enter the
new branch in the Branch table.
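The second insertion anomaly can be sketched the same way with Python's sqlite3 module. The branch B003 and its details are hypothetical. Note that SQLite only rejects a null in a non-integer primary key if the column is also declared NOT NULL, so the sketch declares it explicitly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StaffBranch (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, position TEXT, salary INTEGER,
    branchNo TEXT, branchAddress TEXT, telNo TEXT
);
CREATE TABLE Branch (
    branchNo TEXT NOT NULL PRIMARY KEY, branchAddress TEXT, telNo TEXT
);
""")

# Combined table: a branch with no staff would need a null primary key.
try:
    conn.execute(
        "INSERT INTO StaffBranch VALUES "
        "(NULL, NULL, NULL, NULL, 'B003', '5 Park Rd', '555-0303')"
    )
    null_key_rejected = False
except sqlite3.IntegrityError:
    null_key_rejected = True

# Separate tables: the branch is inserted on its own.
conn.execute("INSERT INTO Branch VALUES ('B003', '5 Park Rd', '555-0303')")
branches = conn.execute("SELECT COUNT(*) FROM Branch").fetchone()[0]

print(null_key_rejected, branches)
```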
Deletion anomalies
What happens if we delete a record from the StaffBranch
table that represents the last member of staff located at a
branch?
Is there any problem when the tables are separated? Why?
Deletion anomalies
If we delete a record from the StaffBranch table
that represents the last member of staff located at a
branch, the details about the branch are also lost from
the database.
There is no problem when the tables are separated, because
branch records are stored separately.
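A minimal sqlite3 sketch of the deletion anomaly. It assumes, for illustration only, that S3 is the sole member of staff at branch B002; all values are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StaffBranch (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, position TEXT, salary INTEGER,
    branchNo TEXT, branchAddress TEXT, telNo TEXT
);
CREATE TABLE Branch (
    branchNo TEXT NOT NULL PRIMARY KEY, branchAddress TEXT, telNo TEXT
);
CREATE TABLE Staff (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, position TEXT, salary INTEGER,
    branchNo TEXT REFERENCES Branch(branchNo)
);
-- Hypothetical data: S3 is the only member of staff at branch B002.
INSERT INTO StaffBranch VALUES
    ('S3', 'Cy', 'Assistant', 12000, 'B002', '9 High St', '555-0202');
INSERT INTO Branch VALUES ('B002', '9 High St', '555-0202');
INSERT INTO Staff VALUES ('S3', 'Cy', 'Assistant', 12000, 'B002');
""")

# Delete the last member of staff at B002 in both designs.
conn.execute("DELETE FROM StaffBranch WHERE staffNo = 'S3'")
conn.execute("DELETE FROM Staff WHERE staffNo = 'S3'")

# Combined table: no row mentions B002 any more.
left_in_staff_branch = conn.execute(
    "SELECT COUNT(*) FROM StaffBranch WHERE branchNo = 'B002'"
).fetchone()[0]

# Separate tables: the Branch row is untouched.
left_in_branch = conn.execute(
    "SELECT COUNT(*) FROM Branch WHERE branchNo = 'B002'"
).fetchone()[0]

print(left_in_staff_branch, left_in_branch)
```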
Modification anomalies
What if we change the value of one of the columns of a
particular branch in the StaffBranch table (e.g. the telephone
number)?
Modification anomalies
We must update the records of all staff located at that
branch. If any record is missed, the database becomes inconsistent.
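The cost of such a change can be sketched by counting the rows an UPDATE touches in each design. The tables below are cut down to the relevant columns, and the values are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced column sets, for illustration only.
CREATE TABLE StaffBranch (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, branchNo TEXT, telNo TEXT
);
CREATE TABLE Branch (branchNo TEXT NOT NULL PRIMARY KEY, telNo TEXT);
INSERT INTO StaffBranch VALUES ('S1', 'Ann', 'B001', '555-0101');
INSERT INTO StaffBranch VALUES ('S2', 'Bob', 'B001', '555-0101');
INSERT INTO Branch VALUES ('B001', '555-0101');
""")

# Combined table: one row per member of staff at the branch must change.
rows_changed_combined = conn.execute(
    "UPDATE StaffBranch SET telNo = '555-0199' WHERE branchNo = 'B001'"
).rowcount

# Separate tables: exactly one Branch row changes.
rows_changed_separate = conn.execute(
    "UPDATE Branch SET telNo = '555-0199' WHERE branchNo = 'B001'"
).rowcount

print(rows_changed_combined, rows_changed_separate)
```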
First normal form (1NF)
Definition
A table in which the intersection of every column
and record contains only one value.
Only 1NF is critical in creating appropriate tables
for relational databases. All subsequent normal
forms are optional.
However, to avoid update anomalies, it is recommended to proceed to
3NF.
Problem: The telNos column does not comply with 1NF,
because there are multiple values at the intersection of the
telNos column with every record.
How do we solve the problem?
Solution: remove the telNos column from the Branch table and
create a separate table, BranchTelephone, to hold
the telephone numbers of branches.
NOTE: The primary key of the new BranchTelephone table is the
new telNo column.
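The move to 1NF can be sketched in Python. The branch data is hypothetical, and the multi-valued telNos column is modelled here as a list, which is an assumption about how the unnormalized table might be held in memory:

```python
# Unnormalized Branch: the third field holds several telephone numbers.
# All values are hypothetical.
branch_unnormalized = [
    ("B001", "1 Main St", ["555-0101", "555-0102"]),
    ("B002", "9 High St", ["555-0202"]),
]

# Branch without the multi-valued column.
branch = [(branch_no, address) for branch_no, address, _ in branch_unnormalized]

# BranchTelephone(telNo, branchNo): one telephone number per record,
# with telNo acting as the primary key.
branch_telephone = [
    (tel_no, branch_no)
    for branch_no, _, tel_nos in branch_unnormalized
    for tel_no in tel_nos
]

for record in branch_telephone:
    print(record)
```

Every intersection of a column and a record in both resulting tables now holds a single value.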
Functional dependency
• The particular relationships that we show
between the columns of a table are more
formally referred to as functional
dependencies.
• Functional dependency describes the
relationship between columns in a table.
Functional dependency
• Functional dependencies in a table indicate how columns relate to one
another.
• Column B is functionally dependent on column A (A→B) if, whenever we
know the value of A, we find only one value of B in all records that have
this value of A.
• We say that B can be worked out from A.
• However, for a given value of B there may be several values of A.
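The definition above can be checked against sample data with a short Python function. The function name and the rows are hypothetical. A check like this can only show that a dependency is violated in the sample; whether A→B really holds is a statement about the meaning of the data, not about one set of records:

```python
def functionally_determines(rows, a, b):
    """Return True if every value of column a maps to a single value of column b."""
    seen = {}
    for row in rows:
        # Remember the first b seen for this a; any later mismatch breaks a -> b.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


# Hypothetical sample records.
rows = [
    {"staffNo": "S1", "name": "Ann", "position": "Assistant"},
    {"staffNo": "S2", "name": "Bob", "position": "Assistant"},
    {"staffNo": "S3", "name": "Cy", "position": "Manager"},
]

# staffNo -> position holds in this sample.
print(functionally_determines(rows, "staffNo", "position"))
# position -> staffNo does not: "Assistant" appears with S1 and S2.
print(functionally_determines(rows, "position", "staffNo"))
```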
Problem: The TempStaffAllocation table is not in 2NF. Why?
[Figure: TempStaffAllocation table, with its primary-key columns, non-primary-key columns, and functional dependencies marked]
Second normal form (2NF)
A table in 2NF is one that is in 1NF and in which
each non-primary-key column can be worked out
from the values in all the columns that make up the
primary key (primary-key columns).
This means every non-primary-key column is fully
functionally dependent on the primary key.
"Fully" means dependent on the primary key A but not on any
proper subset of A.
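A sketch of how partial dependencies could be looked for in sample data: for each non-primary-key column, test whether some proper subset of the primary key already determines it. The helper names and the rows are hypothetical, and a small sample can suggest dependencies that do not hold in general:

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """True if the columns in lhs map to a single value of column rhs in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True


def partial_dependencies(rows, primary_key, columns):
    """List (subset, column) pairs where a proper subset of the key determines column."""
    found = []
    for column in columns:
        if column in primary_key:
            continue
        for size in range(1, len(primary_key)):
            for subset in combinations(primary_key, size):
                if determines(rows, subset, column):
                    found.append((subset, column))
    return found


# Hypothetical rows with composite primary key (staffNo, branchNo).
rows = [
    {"staffNo": "S1", "branchNo": "B1", "branchAddress": "1 Main St",
     "name": "Ann", "position": "Assistant", "hoursPerWeek": 16},
    {"staffNo": "S1", "branchNo": "B2", "branchAddress": "9 High St",
     "name": "Ann", "position": "Assistant", "hoursPerWeek": 9},
    {"staffNo": "S2", "branchNo": "B1", "branchAddress": "1 Main St",
     "name": "Bob", "position": "Porter", "hoursPerWeek": 10},
    {"staffNo": "S2", "branchNo": "B2", "branchAddress": "9 High St",
     "name": "Bob", "position": "Porter", "hoursPerWeek": 20},
]

result = partial_dependencies(
    rows,
    ("staffNo", "branchNo"),
    ["branchAddress", "name", "position", "hoursPerWeek"],
)
for subset, column in result:
    print(subset, "->", column)
```

In this sample hoursPerWeek is not reported, because it can only be worked out from the whole primary key.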
Second normal form (2NF)
NB: 2NF applies only to tables with
composite primary keys (primary keys composed
of two or more columns).
NB: A 1NF table with a single-column primary
key is automatically in at least 2NF.
Functional dependency
branchAddress can be worked out from branchNo (part of the
primary key).
• Every time B002 appears in the branchNo column, the same
address “City center ……..” appears in branchAddress. The
reverse is also true. (partial dependency)
name and position can be worked out from staffNo (part of the
primary key).
• Every time S455 appears in the staffNo column, the name
“Ellen Layman” and position “assistant” appear in the name
and position columns. (partial dependency)
Functional dependency
hoursPerWeek can be worked out only from both
staffNo and branchNo (the whole primary key).
• As partial dependencies exist on the primary key,
the table is not in 2NF.
• 2NF is achieved by removing the partial dependencies.
How?
Converting the TempStaffAllocation table to 2NF
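One way to sketch the conversion is to project the table onto one column group per dependency and then check that joining the pieces gives back the original records. The resulting table names (temp_staff, branch, allocation) and all sample values are assumptions made for illustration:

```python
# Hypothetical rows with composite primary key (staffNo, branchNo).
rows = [
    {"staffNo": "S1", "branchNo": "B1", "branchAddress": "1 Main St",
     "name": "Ann", "position": "Assistant", "hoursPerWeek": 16},
    {"staffNo": "S1", "branchNo": "B2", "branchAddress": "9 High St",
     "name": "Ann", "position": "Assistant", "hoursPerWeek": 9},
    {"staffNo": "S2", "branchNo": "B1", "branchAddress": "1 Main St",
     "name": "Bob", "position": "Porter", "hoursPerWeek": 10},
    {"staffNo": "S2", "branchNo": "B2", "branchAddress": "9 High St",
     "name": "Bob", "position": "Porter", "hoursPerWeek": 20},
]


def project(rows, columns):
    """Keep only the given columns and drop duplicate records."""
    return sorted({tuple(row[c] for c in columns) for row in rows})


# Columns that depend on staffNo alone move to their own table.
temp_staff = project(rows, ["staffNo", "name", "position"])
# Columns that depend on branchNo alone move to their own table.
branch = project(rows, ["branchNo", "branchAddress"])
# What remains depends on the whole primary key.
allocation = project(rows, ["staffNo", "branchNo", "hoursPerWeek"])

# Joining the three tables on their keys rebuilds the original records.
staff_by_no = {staff_no: rest for staff_no, *rest in temp_staff}
address_by_no = dict(branch)
rejoined = [
    {"staffNo": s, "branchNo": b, "branchAddress": address_by_no[b],
     "name": staff_by_no[s][0], "position": staff_by_no[s][1], "hoursPerWeek": h}
    for s, b, h in allocation
]

print(len(temp_staff), len(branch), len(allocation))
print(rejoined == rows)
```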
Third normal form (3NF)
Definition
A table that is in 1NF and 2NF and in which every
non-primary-key column can be worked out
from only the primary-key column(s) and no
other columns.
Is the StaffBranch table in 3NF?
Draw the dependency arrows.
The StaffBranch table is not in 3NF: branchAddress and telNo can be
worked out from branchNo, which is not the primary key.
Converting the StaffBranch table to 3NF
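A minimal sqlite3 sketch of this conversion, splitting StaffBranch into the Staff and Branch tables and checking that a join reproduces the original records. The column types and sample values are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StaffBranch (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, position TEXT, salary INTEGER,
    branchNo TEXT, branchAddress TEXT, telNo TEXT
);
-- Hypothetical sample data.
INSERT INTO StaffBranch VALUES
    ('S1', 'Ann', 'Manager', 30000, 'B001', '1 Main St', '555-0101'),
    ('S2', 'Bob', 'Assistant', 12000, 'B001', '1 Main St', '555-0101'),
    ('S3', 'Cy', 'Assistant', 12000, 'B002', '9 High St', '555-0202');

CREATE TABLE Branch (
    branchNo TEXT NOT NULL PRIMARY KEY, branchAddress TEXT, telNo TEXT
);
CREATE TABLE Staff (
    staffNo TEXT NOT NULL PRIMARY KEY, name TEXT, position TEXT, salary INTEGER,
    branchNo TEXT REFERENCES Branch(branchNo)
);

-- Columns that depend on branchNo move to Branch, one row per branch.
INSERT INTO Branch
    SELECT DISTINCT branchNo, branchAddress, telNo FROM StaffBranch;
-- Staff keeps the columns that depend only on staffNo, plus the foreign key.
INSERT INTO Staff
    SELECT staffNo, name, position, salary, branchNo FROM StaffBranch;
""")

original = conn.execute("SELECT * FROM StaffBranch ORDER BY staffNo").fetchall()
rejoined = conn.execute("""
    SELECT s.staffNo, s.name, s.position, s.salary,
           s.branchNo, b.branchAddress, b.telNo
    FROM Staff AS s JOIN Branch AS b ON b.branchNo = s.branchNo
    ORDER BY s.staffNo
""").fetchall()
branch_count = conn.execute("SELECT COUNT(*) FROM Branch").fetchone()[0]

print(branch_count, rejoined == original)
```

The join returns the same records as StaffBranch, so no information is lost by the split, while each branch's details are now stored only once.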