CmpE226-DB-L10n1

Database Design
Dr. M.E. Fayad, Professor
Computer Engineering Department, Room #283I
College of Engineering
San José State University
One Washington Square
San José, CA 95192-0180
http://www.engr.sjsu.edu/~fayad
2003
SJSU -- CmpE
L10-S1
Normalization
Lesson 10:
Normalization
Lesson Objectives
Objectives

• Understand the goals of normalization
• Explore the problems of bad DB design
• Understand normalization
• Learn how to apply normalization
Relational Database Design
• Goals:
– Reduce data redundancy (undesirable replication of data values)
– Minimize anomaly problems (which arise when the data model is structured improperly)
– Maintain correct information
– Enforce semantic & integrity constraints, e.g., using dependency & domain constraints
Problems of Bad DB Design
• Data replication: extra storage, update anomalies
• Anomalies: costly to handle and a source of data inconsistency
• Loss of information: caused by poor decomposition
• Unenforced dependency constraints: dependencies are lost
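To make the replication and update-anomaly problems concrete, here is a minimal Python sketch. The table, its column names (emp, dept, dept_phone), and the sample rows are hypothetical and not taken from the lecture.

```python
# Hypothetical un-normalized table: the department phone is repeated
# for every employee who works in that department.
employees = [
    {"emp": "Ann", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp": "Bob", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp": "Cy",  "dept": "IT",    "dept_phone": "555-0200"},
]

def phones_per_dept(rows):
    """Map each department to the set of phone numbers recorded for it."""
    phones = {}
    for row in rows:
        phones.setdefault(row["dept"], set()).add(row["dept_phone"])
    return phones

def inconsistent_depts(rows):
    """Departments with more than one recorded phone number."""
    return sorted(d for d, p in phones_per_dept(rows).items() if len(p) > 1)

before = inconsistent_depts(employees)

# Update anomaly: the phone is changed in only one of the two Sales rows.
employees[0]["dept_phone"] = "555-0199"
after = inconsistent_depts(employees)
```

Because the phone number is stored once per employee rather than once per department, a partial update leaves the Sales department with two different numbers.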
Normalization

• A DB scheme design tool
• A process of replacing associations among attributes in a relation scheme
• An approximation of the relation schemes that should be created
• Objective: accomplish the goals of relational database design
• Two approaches: decomposition & synthesis
Decomposition

• A process to split, or decompose, a relation until the resulting relations no longer exhibit undesirable problems, e.g., data redundancy, data inconsistency, anomalies, etc.

• Decomposing a relation scheme R means breaking R into a pair of schemes, possibly intersecting
+ This process is repeated until all the decomposed relation schemes are in the desired (normal) form.
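As an illustration of one decomposition step, the following Python sketch splits a relation into a pair of intersecting schemes by projection. The relation, its attribute names, and its rows are hypothetical examples, and `project` is a helper written for this sketch.

```python
def project(rows, attrs):
    """Projection onto attrs with duplicate tuples removed (sorted for stable output)."""
    return sorted({tuple(row[a] for a in attrs) for row in rows})

# Hypothetical relation r over the scheme R = (emp, dept, dept_phone).
r = [
    {"emp": "Ann", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp": "Bob", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp": "Cy",  "dept": "IT",    "dept_phone": "555-0200"},
]

# Break R into two schemes that intersect on "dept".
emp_dept = project(r, ["emp", "dept"])
dept_phone = project(r, ["dept", "dept_phone"])
```

After the split, each department's phone number is stored exactly once in `dept_phone`, so the redundancy in the original relation is gone.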
Normal Forms (NFs)
1NF
2NF
3NF
BCNF
4NF
PJNF
Normal Forms (NFs)


• Normal forms are restrictions on the DB scheme that preclude certain undesirable properties (data redundancy, update anomalies, loss of information, etc.) from the DB.
• The normal forms are nested, each one implying the next:
R is in PJNF ⇒ R is in 4NF ⇒ R is in BCNF ⇒ R is in 3NF ⇒ R is in 2NF ⇒ R is in 1NF
Normal Forms (NFs)

• Definition. A data value v is atomic if v is not (i) a set of values or (ii) a composite value; otherwise, v is non-atomic.

• First Normal Form (1NF). A relation scheme R is in 1NF if for every attribute A in R, the values in the domain of A, i.e., dom(A), are atomic.
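A rough way to see the 1NF condition in practice is to scan the stored values for sets or composite values. The sketch below is only a proxy, since 1NF is defined on domains rather than on a particular instance. It assumes that Python containers stand in for non-atomic values, and its table and attribute names are hypothetical.

```python
def is_atomic(value):
    """Treat sets, lists, tuples and dicts as non-atomic; anything else as atomic."""
    return not isinstance(value, (set, frozenset, list, tuple, dict))

def non_atomic_attributes(rows):
    """Sorted names of attributes that hold a non-atomic value in some row."""
    return sorted({a for row in rows for a, v in row.items() if not is_atomic(v)})

# "phones" holds a set of values, so this table is not in 1NF.
students = [
    {"id": 1, "name": "Ann", "phones": {"555-0100", "555-0101"}},
    {"id": 2, "name": "Bob", "phones": {"555-0200"}},
]

# One row per (student, phone) pair makes every value atomic.
students_flat = [
    {"id": 1, "name": "Ann", "phone": "555-0100"},
    {"id": 1, "name": "Ann", "phone": "555-0101"},
    {"id": 2, "name": "Bob", "phone": "555-0200"},
]
```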
Normal Forms (NFs)

• Boyce-Codd Normal Form (BCNF). A relation scheme R is in BCNF if for every non-trivial FD X → Y applied to R, X is a superkey for R.

• Definition. Let A be an attribute in a relation scheme R, and let F be a set of FDs over R. A is a prime attribute in R if A is contained in some candidate key of R; otherwise, A is a non-prime attribute in R.
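The two definitions above can be checked mechanically with attribute closures. The following Python sketch is one way to do it, under some assumptions: FDs are represented as (left side, right side) pairs of sets, candidate keys are found by brute force (reasonable only for small schemes), and the street/city/zip scheme is a hypothetical example. It tests only the FDs listed in F, which is enough to decide whether R itself is in BCNF.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, schema, fds):
    return closure(attrs, fds) >= set(schema)

def candidate_keys(schema, fds):
    """All minimal superkeys, smallest first."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if is_superkey(s, schema, fds) and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def prime_attributes(schema, fds):
    return set().union(*candidate_keys(schema, fds))

def bcnf_violations(schema, fds):
    """Non-trivial listed FDs whose left side is not a superkey."""
    return [(l, r) for l, r in fds
            if not r <= l and not is_superkey(l, schema, fds)]

# Hypothetical scheme: {street, city} -> zip and zip -> city.
schema = {"street", "city", "zip"}
fds = [({"street", "city"}, {"zip"}), ({"zip"}, {"city"})]
keys = candidate_keys(schema, fds)
violations = bcnf_violations(schema, fds)
```

Here the candidate keys are {city, street} and {street, zip}, so every attribute is prime, yet zip → city still violates BCNF because {zip} is not a superkey.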
Normal Forms (NFs)
• Third Normal Form (3NF). A relation scheme R is in 3NF if
1. R is in 1NF, and
2. For every non-trivial FD X → Y applied to R, either
– X is a superkey of R, or
– every attribute in Y − X is an attribute of some candidate key of R, i.e., prime.
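Condition 2 of the 3NF definition can be tested with the same closure machinery used for keys. The sketch below is a minimal Python version under stated assumptions: FDs are (left side, right side) pairs of sets, candidate keys are enumerated by brute force, only the FDs listed in F are tested, and both example schemes are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal superkeys, smallest first (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if closure(s, fds) >= schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def violations_3nf(schema, fds):
    """Listed FDs X -> Y that break condition 2 of the 3NF definition."""
    prime = set().union(*candidate_keys(schema, fds))
    return [(l, r) for l, r in fds
            if not r <= l                       # the FD is non-trivial
            and not closure(l, fds) >= schema   # X is not a superkey
            and not (r - l) <= prime]           # some attribute of Y - X is non-prime

# In 3NF (though not in BCNF): every attribute is prime.
address = {"street", "city", "zip"}
address_fds = [({"street", "city"}, {"zip"}), ({"zip"}, {"city"})]

# Not in 3NF: dept -> dept_phone has a non-superkey left side and a non-prime right side.
staff = {"emp", "dept", "dept_phone"}
staff_fds = [({"emp"}, {"dept"}), ({"dept"}, {"dept_phone"})]
```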
Normal Forms (NFs)

• Lossy decomposition: a decomposition is lossy if the natural join of all the decomposed relations contains additional (spurious) tuples, so that the original relation cannot be recovered.

• Lossless decomposition: a decomposition is lossless if the natural join of all the decomposed relations always yields the original relation without any extra tuples, i.e.,
– a decomposition {R1, R2, …, Rn} of R is lossless if, for every relation r(R),
r = π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r)
2-Relational lossless-join decomposition
• Let R1 and R2 be decomposed schemes of R. Let F be a set of FDs on R. The decomposition is lossless if
R1 ∩ R2 → R1 ∈ F+ or R1 ∩ R2 → R2 ∈ F+
Dependency preserving

• A decomposition is dependency preserving if no dependency is lost in the process.
Let F be the set of FDs on R.
Let R1, …, Rn be a decomposition of R.
Let Fi, 1 ≤ i ≤ n, be the set of FDs in F+ that apply to Ri.
Let F’ = F1 ∪ … ∪ Fn.
If F’+ = F+, then the decomposition is dependency preserving.
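Computing F+ and every Fi directly can be exponential, so the sketch below swaps in the usual closure-based test instead of materializing F’. For each FD X → Y in F, it grows X using only what can be derived inside a single Ri at a time, then checks whether Y is reached. This is an illustrative Python version that assumes FDs are (left side, right side) pairs of sets; the example schemes are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def preserves(fd, decomposition, fds):
    """True if fd = (X, Y) follows from the FDs that apply to individual schemes."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # Attributes of Ri derivable from the part of result that lies in Ri.
            gained = closure(result & ri, fds) & ri
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result

def dependency_preserving(decomposition, fds):
    return all(preserves(fd, decomposition, fds) for fd in fds)

# A -> B and B -> C survive the split into {A, B} and {B, C}.
chain_fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
chain_ok = dependency_preserving([{"A", "B"}, {"B", "C"}], chain_fds)

# {street, city} -> zip is lost when the scheme is split into {zip, city} and {street, zip}.
addr_fds = [({"street", "city"}, {"zip"}), ({"zip"}, {"city"})]
addr_ok = dependency_preserving([{"zip", "city"}, {"street", "zip"}], addr_fds)
```

The second example is a BCNF decomposition that is lossless but not dependency preserving, which is why the two properties are checked separately.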