Diploma of Government
(Enterprise Architecture)
Block 12
Conversion to Schema and Normalization
Overview
• Conversion of Entity Relationship
Diagrams to Schema
• Normalization of Schema
• Using Systems Architect to draw Entity
Relationship diagrams, and then convert
them to OV-7 and SV-11
Converting
ER to Relational Schema
• Once an ER model has been developed it
needs to be converted into a “relational
schema”
• A relational schema is a specification of
the required table definitions and their
foreign key links
• There are well-defined principles for
converting from one to the other
Conversion Rules for Entities
• Each entity set becomes a relation/table
• Each single-valued attribute of the entity
set becomes a column (attribute) of the
table representing the entity set
• Composite attributes are represented only
by their components
• Derived attributes are ignored
• ER key → table primary key
Multi-valued Attributes
• Multi-valued attributes are not dealt with
by having repeating columns in the table
representing the entity set
• That is, the multi-valued attribute Qualification of the entity set person
should not be represented by:
person (... qual1, qual2, qual3, ...etc...)
Multivalued Attributes
• The correct way to represent multivalued
attributes is with another table
• Example:
person (person-id, name… , address… , ...etc...)
person-qual (person-id, qualification)
where person-id in person-qual is a foreign key referencing person
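The separate-table approach can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the column types, the sample person and the qualification values are made up, and it assumes person_qual carries the person's key as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- One row per (person, qualification) pair instead of qual1, qual2, ...
    CREATE TABLE person_qual (
        person_id     INTEGER NOT NULL REFERENCES person(person_id),
        qualification TEXT    NOT NULL,
        PRIMARY KEY (person_id, qualification)
    );
""")
# Hypothetical sample data
conn.execute("INSERT INTO person VALUES (1, 'A. Example')")
conn.executemany("INSERT INTO person_qual VALUES (?, ?)",
                 [(1, "BSc"), (1, "MSc"), (1, "PhD")])

# Any number of qualifications can be stored without changing the schema
quals = [q for (q,) in conn.execute(
    "SELECT qualification FROM person_qual WHERE person_id = 1 "
    "ORDER BY qualification")]
print(quals)
```

A person with ten qualifications simply has ten rows in person_qual; no column ever has to be added.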
Relationships
• How an ER relationship is represented
depends on its connectivity and degree
• We consider binary relationships first
• Recall that the 3 possible connectivities
are:
– one-to-one (1:1)
– one-to-many (1:M)
– many-to-many (M:N)
1:1 Relationships (Binary)
• To represent a 1:1 relationship, a foreign
key is exported from either relation into
the other - but not both ways
• Which direction is chosen generally
depends on how the connected entity sets
participate in the relationship
1:1 Relationships
ER diagram: staff (1) -- head of -- (1) department
Could be represented by:
staff (staff-id, name, ...etc... )
department (dept-name, ...etc... , head-id)
1:1 Relationships
• It could also be represented by:
staff (staff-id, staff-name, ...etc... , head_dept)
department (dept-name, ...etc...)
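Which is better? The first, because every department must have a
head (mandatory participation), whereas only a few staff are heads
of departments (optional participation). Placing the foreign key in
department therefore avoids null values.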
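The first representation can be sketched in Python with sqlite3. This is an illustrative sketch only: the column types, the sample names and the helper try_insert are assumptions. The UNIQUE constraint on head_id is what keeps the relationship 1:1, and NOT NULL expresses that every department must have a head.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE staff (
        staff_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL
    );
    -- The foreign key lives on the mandatory side (department)
    CREATE TABLE department (
        dept_name TEXT PRIMARY KEY,
        head_id   INTEGER NOT NULL UNIQUE REFERENCES staff(staff_id)
    );
""")
# Hypothetical sample data
conn.executemany("INSERT INTO staff VALUES (?, ?)",
                 [(1, "Staff One"), (2, "Staff Two")])
conn.execute("INSERT INTO department VALUES ('Dept A', 1)")

def try_insert(dept_name, head_id):
    """Return True if the department row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO department VALUES (?, ?)", (dept_name, head_id))
        return True
    except sqlite3.IntegrityError:
        return False

second_dept_same_head = try_insert("Dept B", 1)  # staff 1 already heads Dept A
second_dept_new_head = try_insert("Dept C", 2)
print(second_dept_same_head, second_dept_new_head)
```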
Workbook Exercise 1:
President of Club
ER diagram: member (1) -- president -- (1) club
Write the schema:
Workbook Exercise 1
President of Club Solution
Workbook Exercise 2: Traditional
Monogamous Heterosexual Marriage
ER diagram: male (1) -- marries -- (1) female
Write the schema:
Workbook Solution 2: Traditional
Monogamous Heterosexual Marriage
1:N Relationships
• The primary key of the relation on the “1”
side of the relationship is exported as a
foreign key into the relation on the “N” side
(it’s that simple!).
1:N Relationships
ER diagram: lecturer (1) -- teaches -- (N) subject
This is represented by:
lecturer (lecturer-id, name, telephone, ...etc...)
subject (subject-code, name, ...etc..., subject-lecturer)
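As a rough sketch of this rule in Python with sqlite3, the lecturer's primary key is exported into subject as the foreign key subject_lecturer. The column types, subject codes and the second subject are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE lecturer (
        lecturer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    -- The "N" side carries the foreign key
    CREATE TABLE subject (
        subject_code     TEXT PRIMARY KEY,
        name             TEXT NOT NULL,
        subject_lecturer INTEGER REFERENCES lecturer(lecturer_id)
    );
""")
conn.execute("INSERT INTO lecturer VALUES (1, 'D. Hart')")
# Hypothetical subject codes; one lecturer teaches many subjects
conn.executemany("INSERT INTO subject VALUES (?, ?, ?)",
                 [("S1", "Databases", 1), ("S2", "Example Subject", 1)])

taught = conn.execute(
    "SELECT COUNT(*) FROM subject WHERE subject_lecturer = 1").fetchone()[0]

# A subject cannot point at a lecturer that does not exist
try:
    conn.execute("INSERT INTO subject VALUES ('S3', 'Orphan', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
print(taught, orphan_allowed)
```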
M:N Relationships
• These types of relationships require a new
table to represent them
• The new table (often called an intersection
relation or similar) contains two foreign
keys - one from each of the participants in
the relationship
• The primary key of the new table is the
concatenation of the two primary keys
M:N Relationships
ER diagram: student (M) -- enrols in -- (N) subject
This relationship is represented by:
student (number, name, ...etc...)
enrolment (student, subject)
subject (code, name, ...etc...)
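A minimal sqlite3 sketch of the intersection relation follows. The column types and subject codes are assumptions; the student numbers and names are borrowed from the enrolment example used later in this block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE student (number INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE subject (code TEXT PRIMARY KEY, name TEXT NOT NULL);
    -- Intersection relation: two foreign keys, concatenated into the primary key
    CREATE TABLE enrolment (
        student INTEGER NOT NULL REFERENCES student(number),
        subject TEXT    NOT NULL REFERENCES subject(code),
        PRIMARY KEY (student, subject)
    );
""")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(2467137, "J. Brady"), (1689347, "M. Roden")])
conn.executemany("INSERT INTO subject VALUES (?, ?)",
                 [("DB", "Databases"), ("NET", "Data Networks")])  # codes are made up
conn.executemany("INSERT INTO enrolment VALUES (?, ?)",
                 [(2467137, "DB"), (2467137, "NET"), (1689347, "DB")])

# The composite primary key rejects a repeated (student, subject) pair
try:
    conn.execute("INSERT INTO enrolment VALUES (2467137, 'DB')")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

brady_subjects = [name for (name,) in conn.execute("""
    SELECT s.name FROM enrolment AS e JOIN subject AS s ON s.code = e.subject
    WHERE e.student = 2467137 ORDER BY s.name
""")]
print(duplicate_allowed, brady_subjects)
```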
Workbook Exercise 3:
“Modern” (non-bigamous) Marriage
ER diagram: male (M) -- marries -- (N) female; attributes shown: start date, finish date
Write the schema:
Workbook Solution 3:
“Modern” (non-bigamous) Marriage
Variations in Degree:
Unary Relationships
• The same rules apply as those already
discussed for binary relationships
• In the case of 1:1 and 1:N connectivities
the foreign key is “exported” back into
the same table
• For M:N connectivities a new table is
created, with two foreign keys that
reference the same parent primary key but
appear in different roles
Unary Relationships
ER diagram: employee (1, supervisor) -- supervises -- (N, supervisee) employee
This is represented by:
employee (id, name, address, ...etc..., supervisor)
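The self-referencing foreign key can be sketched as follows in Python with sqlite3. The column types and employee names are hypothetical; a self-join pairs each supervisee with their supervisor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        -- Foreign key exported back into the same table; NULL for an
        -- employee with no supervisor
        supervisor INTEGER REFERENCES employee(id)
    );
""")
# Hypothetical sample data
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Boss", None), (2, "Worker A", 1), (3, "Worker B", 1)])

# Self-join: the same table plays the supervisee role (e) and the supervisor role (s)
pairs = conn.execute("""
    SELECT e.name, s.name
    FROM employee AS e JOIN employee AS s ON e.supervisor = s.id
    ORDER BY e.id
""").fetchall()
print(pairs)
```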
Unary Relationships
ER diagram: person (M) -- has friend -- (N) person
This is represented by:
person (person-id, name, address, ...etc...)
has-friend (person-1, person-2)
Ternary Relationships
• Ternary relationships are always
represented with a new table
• The new table contains three foreign keys
- one from each of the participating
relations
• The primary key of the new table is the
concatenation of all three of the imported
foreign keys
Ternary Relationships
ER diagram: item, vendor and purchaser all participate in the relationship sale
This is represented as shown on the next slide
Ternary Relationships
vendor (name, address, telephone, ...etc...)
customer (name, ...etc...)
sale (customer, vendor, item, date, value, ...etc...)
item (code, category, ...etc...)
Workbook Exercise 4:
Ternary Relationships
ER diagram: pilot (number, name), plane (number, type) and passenger (ticket number, name) participate in the relationship flight (date)
Write the schema:
Workbook Solution 4:
Ternary Relationships
Weak Entity Sets
• Weak entity sets are translated into a table
of their own, just as standard entity sets are
• However, the weak entity set only has (by
definition) a partial key
• Therefore, the primary key of the entity set
on which it is dependent is imported as a
foreign key, and becomes part of the weak
entity set’s primary key
Weak Entity Sets
ER diagram: golf course (Name) (1) -- consists of -- (N) hole (Number)
This is represented by:
golf_course (name, ...etc...)
hole (course, number, ...etc...)
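A sketch of the weak entity translation in Python with sqlite3 follows. The course names and the par column are made up; the point is that hole.number is only a partial key, so the primary key is the pair (course, number).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE golf_course (name TEXT PRIMARY KEY);
    CREATE TABLE hole (
        course TEXT    NOT NULL REFERENCES golf_course(name),
        number INTEGER NOT NULL,   -- partial key: unique only within a course
        par    INTEGER,            -- hypothetical attribute
        PRIMARY KEY (course, number)
    );
""")
conn.executemany("INSERT INTO golf_course VALUES (?)",
                 [("Course A",), ("Course B",)])
# Hole 1 exists on both courses; the imported course name tells them apart
conn.executemany("INSERT INTO hole VALUES (?, ?, ?)",
                 [("Course A", 1, 4), ("Course B", 1, 3)])

# A second hole 1 on the same course violates the composite primary key
try:
    conn.execute("INSERT INTO hole VALUES ('Course A', 1, 5)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

hole_count = conn.execute("SELECT COUNT(*) FROM hole").fetchone()[0]
print(duplicate_allowed, hole_count)
```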
Workbook Exercise 5:
ER diagram: motel (Name) (1) -- consists of -- (N) room (Number)
Write the schema:
Workbook Solution 5:
Supertypes & Subtypes
• A supertype and each of its subtypes all
translate into separate tables, as usual
• The primary keys of the supertype table
and all of the subtype tables are identical
• The supertype and each subtype table
contains only the specific attributes that
apply to it
Supertypes & Subtypes
ER diagram: supertype motor vehicle (key: Registration) generalizes (G) the subtypes truck, car and bus, each with its own attributes
Supertypes & Subtypes
• The previous ER model is represented by:
motor_vehicle (Registration, ...etc...)
truck (Registration, ...etc...)
car (Registration, ...etc...)
bus (Registration, ...etc...)
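One way to sketch the shared primary key in Python with sqlite3 is shown below for the supertype and two of its subtypes. The attribute names (make, payload_kg, seats) and the registration values are hypothetical; each subtype's primary key is also a foreign key to the supertype.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE motor_vehicle (
        registration TEXT PRIMARY KEY,
        make         TEXT                -- attribute shared by all vehicles
    );
    -- Each subtype reuses the supertype's key and holds only its own attributes
    CREATE TABLE truck (
        registration TEXT PRIMARY KEY REFERENCES motor_vehicle(registration),
        payload_kg   INTEGER
    );
    CREATE TABLE car (
        registration TEXT PRIMARY KEY REFERENCES motor_vehicle(registration),
        seats        INTEGER
    );
""")
# Hypothetical sample data
conn.executemany("INSERT INTO motor_vehicle VALUES (?, ?)",
                 [("ABC123", "Make X"), ("XYZ789", "Make Y")])
conn.execute("INSERT INTO truck VALUES ('ABC123', 8000)")
conn.execute("INSERT INTO car VALUES ('XYZ789', 5)")

# Joining on the shared key reassembles a complete truck record
trucks = conn.execute("""
    SELECT m.registration, m.make, t.payload_kg
    FROM motor_vehicle AS m JOIN truck AS t ON m.registration = t.registration
""").fetchall()
print(trucks)
```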
Workbook Exercise 6:
Supertypes & Subtypes
ER diagram: supertype craft item (key: stock no) generalizes (G) the subtypes pottery, craft and art, each with its own attributes
Write the schema:
Workbook Solution 6:
Supertypes & Subtypes
What is a “Good” Database
Design?
• “Bad” design exposes a database to three
types of anomaly
– Insertion anomalies
– Deletion anomalies
– Update anomalies
• Insertion and update anomalies result in
inconsistent data; deletion anomalies
result in lost data
A Poorly Designed Relation
enrol(Student_No,Student_Name,Subject,Lecturer,Department)
enrol
Student_No  Student_Name   Subject        Lecturer      Department
2467137     J. Brady       Databases      D. Hart       CompSci
1689347     M. Roden       Databases      D. Hart       CompSci
2271349     L. Landford    Databases      D. Hart       CompSci
2467137     J. Brady       Data Networks  A. Goscinski  CompSci
3351844     W. Latham      Comp Eng       C. Yang       ElecEng
1487221     R. Abbott      Data Networks  A. Goscinski  CompSci
2271349     L. Landford    Data Networks  A. Goscinski  CompSci
3351844     W. Latham      Data Networks  A. Goscinski  CompSci
1689347     M. Roden       Comp Eng       C. Yang       ElecEng
1174562     M. Wainwright  Comp Eng       C. Yang       ElecEng
1174562     M. Wainwright  Databases      D. Hart       CompSci
1174562     M. Wainwright  Data Networks  A. Goscinski  CompSci
1174562     M. Wainwright  Speech Proc    M. Wagner     CompSci
3351844     W. Latham      Databases      D. Hart       CompSci
etc         etc            etc            etc           etc
Insertion Anomaly
• Insertion anomalies come in two types
• Examples:
– Suppose student J. Brady (2467137) enrols in
Comp Eng, but the name is incorrectly input
as “J. Bradie”. Now student 2467137 has two
names
– Suppose a new subject is introduced. It is not
possible to record the subject details until at
least one student is enrolled in it
Deletion Anomaly
• Suppose student M. Wainwright is the only
student enrolled in the subject Speech
Processing but decides not to take it after
all
• The other information about this subject
(i.e. who the lecturer is and what
department offers it) is lost
Update Anomaly
• Suppose the lecturer for Databases
changes from D. Hart to J. Yang but not all
of the applicable tuples (rows) are updated
to reflect the change
• The database now contains conflicting
information about who the lecturer is for
this subject
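The update anomaly can be reproduced in a few lines of plain Python. This sketch uses three rows from the enrol relation; the helper name lecturers_per_subject is an assumption made for illustration.

```python
from collections import defaultdict

# Three rows of enrol(Student_No, Student_Name, Subject, Lecturer, Department)
enrol = [
    {"Student_No": 2467137, "Student_Name": "J. Brady", "Subject": "Databases",
     "Lecturer": "D. Hart", "Department": "CompSci"},
    {"Student_No": 1689347, "Student_Name": "M. Roden", "Subject": "Databases",
     "Lecturer": "D. Hart", "Department": "CompSci"},
    {"Student_No": 2467137, "Student_Name": "J. Brady", "Subject": "Data Networks",
     "Lecturer": "A. Goscinski", "Department": "CompSci"},
]

def lecturers_per_subject(rows):
    """Collect every lecturer name recorded against each subject."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["Subject"]].add(row["Lecturer"])
    return dict(seen)

# Partial update: only the first Databases row is changed
enrol[0]["Lecturer"] = "J. Yang"

# Any subject with more than one lecturer now holds conflicting information
conflicts = {subject: names
             for subject, names in lecturers_per_subject(enrol).items()
             if len(names) > 1}
print(conflicts)
```

Because the lecturer is stored once per enrolment rather than once per subject, nothing in the relation itself prevents this inconsistency.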
A Better Database Design
student(Number, Name)
enrol(Student, Subject)
subject(Name, Lecturer)
lecturer(Name, Department)
student
Number   Name
2467137  J. Brady
1689347  M. Roden
2271349  L. Landford
3351844  W. Latham
1487221  R. Abbott
1174562  M. Wainwright
etc      etc

lecturer
Name          Department
D. Hart       CompSci
A. Goscinski  CompSci
C. Yang       ElecEng
M. Wagner     CompSci
etc           etc

enrol
Student_No  Subject_Name
2467137     Databases
1689347     Databases
2271349     Databases
2467137     Data Networks
3351844     Comp Eng
1487221     Data Networks
2271349     Data Networks
3351844     Data Networks
1689347     Comp Eng
1174562     Comp Eng
1174562     Databases
1174562     Speech Proc
3351844     Databases
etc         etc

subject
Name           Lecturer
Databases      D. Hart
Data Networks  A. Goscinski
Comp Eng       C. Yang
Speech Proc    M. Wagner
etc            etc
Normalization
• There is a formally specifiable process that
can be followed to achieve a good
database design, or check that an existing
design is of good quality
• This process is known as normalization
• The different stages of normalization are
known as “normal forms”
Normal Forms
• The “normal forms” that database design
can be viewed as progressing through are:
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
– Boyce-Codd normal form (BCNF)
– Fourth normal form (4NF)
– Fifth normal form (5NF)
Normalization of Data
“This can be looked on as a process during
which unsatisfactory relation schemas
(entities) are decomposed by breaking up
their attributes into smaller relation
schemas (entities) that possess desirable
properties.”
Elmasri & Navathe, 1994, Fundamentals of Database Systems, p. 407
Normal Forms
• Normal forms 1NF to BCNF are based on
the concept of “functional dependency”
• 4NF is based on the concept of “multivalued dependency”
• 5NF is based on the concept of “join
dependency”
Functional Dependency
• Definition:
An attribute, or group of attributes, Y of
a relation (entity) is functionally dependent
on another attribute, or group of attributes,
X of the relation (entity), if the value(s) of
the attributes X uniquely determine the
value(s) of the attributes Y
Functional Dependency
• If attribute(s) Y are functionally dependent
on attribute(s) X then this is represented
by the notation:
X → Y
• Example: If the value of an attribute
“Service_No” is known then the value of
an attribute “Rank” is determined since
Service_No → Rank
for any service person.
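The definition can be turned into a small check over sample rows. The sketch below is plain Python; the function name holds and the service records are hypothetical. Note that sample data can only refute a functional dependency: a dependency is a rule about all possible data, so passing the check on a few rows does not prove it.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical service records
records = [
    {"Service_No": 1001, "Rank": "Sergeant"},
    {"Service_No": 1002, "Rank": "Sergeant"},
    {"Service_No": 1003, "Rank": "Captain"},
]

service_no_determines_rank = holds(records, ["Service_No"], ["Rank"])
rank_determines_service_no = holds(records, ["Rank"], ["Service_No"])
print(service_no_determines_rank, rank_determines_service_no)
```

Two people share the rank of Sergeant, so Rank does not determine Service_No even though Service_No determines Rank.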
First Normal Form (1NF)
• This is not really a normal form at all but
rather is assumed in the definition of a
relation. It states ...
• A relation schema is in 1NF if all of its
attributes are single-valued and restricted
to assuming simple (atomic) values, and
all attributes are functionally dependent
on the primary key
First Normal Form (1NF)
• Composite attributes (in the ER sense) are
represented only by their component
attributes
• Attributes cannot have multiple values (i.e.
no repeating groups)
• Attributes cannot have tuples or relations
as values
Example of 1NF
Dept-employee (dept-name, dept-no, emp-no, emp-name)
The key is unique, but dept-name does not rely on the
WHOLE key, neither does emp-name rely on the WHOLE
key.
To normalize it we need to break it into two relations and
remove the attributes that violate 1NF.
Dept (dept-name, dept-no)
Employee (emp-no, emp-name)
Workbook Exercise 7: 1NF
Reduce the following to 1NF
Employee (emp-no, emp-name, proj-no, proj-hours, hrly-rate)
Workbook Solution 7: 1NF
Prime & Non-Prime Attributes
• 2NF uses the concept of “prime” and “non-prime” attributes
• An attribute of an entity R that belongs to
any key of R is said to be a prime
attribute, otherwise it is non-prime
Full Functional Dependency
• 2NF also needs the concept of “full
functional dependency”
• If the functional dependency X → Y holds
(where X and Y are either individual or
groups of attributes) and removal of any
attribute from X results in the dependency
failing to hold, then Y is fully functionally
dependent on X
Full Functional Dependency
• Consider the relation schema:
credit_record(Customer, Credit_Card,
Address, Employer, Limit, Interest)
• None of “Address”, “Employer” and “Interest”
is fully functionally dependent on the
primary key {Customer, Credit_Card}: each
depends on only part of it
• “Limit” is, however, fully functionally
dependent on the primary key
Second Normal Form (2NF)
• Definition:
A relation schema (entity) R is in 2NF if
it is in 1NF and every non-prime attribute
is fully functionally dependent on every
key of R
• The relation “credit_record” is, therefore,
not in 2NF
Second Normal Form (2NF)
• Consider an example of the relation
“credit_record” and note the duplication of
data evident in it ...
credit_record
Customer  Credit_Card  Address         Employer  Limit  Interest
D. Hart   Bankcard     12 Gleeson Pl   ADFA      2000   23.5
D. Hart   Visa         12 Gleeson Pl   ADFA      3000   22.8
D. Hart   Mastercard   12 Gleeson Pl   ADFA      2500   23.5
A. Bone   Amex         10 Hargrave St  Telecom   null   null
A. Bone   Visa         10 Hargrave St  Telecom   4000   22.8
J. Eynon  Visa         24 Waller Cres  Ford      2000   22.8
J. Eynon  Mastercard   24 Waller Cres  Ford      3500   23.5
J. Eynon  Diners Club  24 Waller Cres  Ford      5000   21.8
Second Normal Form (2NF)
• The duplication of data evident in
“credit_record” is a product of its failure to
be in 2NF
• The functional dependencies applying to
“credit_record” are:
– Customer → Address
– Customer → Employer
– Credit_Card → Interest
– {Customer, Credit_Card} → Limit
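These dependencies can be checked against the sample credit_record rows with a short Python sketch. The function names holds and fully_dependent are assumptions, and null is represented by None. As with any check on sample data, the result illustrates the dependencies rather than proving them.

```python
COLUMNS = ("Customer", "Credit_Card", "Address", "Employer", "Limit", "Interest")
DATA = [
    ("D. Hart", "Bankcard", "12 Gleeson Pl", "ADFA", 2000, 23.5),
    ("D. Hart", "Visa", "12 Gleeson Pl", "ADFA", 3000, 22.8),
    ("D. Hart", "Mastercard", "12 Gleeson Pl", "ADFA", 2500, 23.5),
    ("A. Bone", "Amex", "10 Hargrave St", "Telecom", None, None),
    ("A. Bone", "Visa", "10 Hargrave St", "Telecom", 4000, 22.8),
    ("J. Eynon", "Visa", "24 Waller Cres", "Ford", 2000, 22.8),
    ("J. Eynon", "Mastercard", "24 Waller Cres", "Ford", 3500, 23.5),
    ("J. Eynon", "Diners Club", "24 Waller Cres", "Ford", 5000, 21.8),
]
rows = [dict(zip(COLUMNS, r)) for r in DATA]

def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

def fully_dependent(rows, lhs, rhs):
    """lhs -> rhs holds, and stops holding when any one attribute is dropped from lhs."""
    if not holds(rows, lhs, rhs):
        return False
    return not any(holds(rows, [a for a in lhs if a != dropped], rhs)
                   for dropped in lhs)

key = ["Customer", "Credit_Card"]
full = {attr: fully_dependent(rows, key, [attr])
        for attr in ("Address", "Employer", "Limit", "Interest")}
print(full)
```

Only Limit needs both parts of the key; the other three attributes are each determined by one part alone, which is why the relation is not in 2NF.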
Second Normal Form (2NF)
• If “credit_record” is decomposed into
separate relations based on the functional
dependencies then a much better design
results
• The algorithm ... For all functional
dependencies X → Ai that have identical
X’s, combine X and the Ai’s into one
relation with X as the primary key
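The grouping step can be sketched in plain Python. The function name decompose and the tuple-based encoding of dependencies are assumptions made for illustration; the dependencies themselves are the four listed for credit_record.

```python
def decompose(fds):
    """Group dependencies by left-hand side; each group becomes one relation
    whose primary key is that left-hand side."""
    relations = {}
    for lhs, rhs in fds:
        attrs = relations.setdefault(tuple(lhs), [])
        for a in rhs:
            if a not in attrs:
                attrs.append(a)
    return [(key, tuple(attrs)) for key, attrs in relations.items()]

# Functional dependencies of credit_record, written as (X, Ai) pairs
fds = [
    (("Customer",), ("Address",)),
    (("Customer",), ("Employer",)),
    (("Credit_Card",), ("Interest",)),
    (("Customer", "Credit_Card"), ("Limit",)),
]

for key, attrs in decompose(fds):
    print("key:", key, "non-key attributes:", attrs)
```

The three groups correspond to relations keyed on Customer, on Credit_Card, and on the pair {Customer, Credit_Card}.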
Second Normal Form (2NF)
• The result?
customer(Name, Address, Employer)
credit_card(Type, Interest)
credit_limit(Customer, Credit_Card, Limit)
• Each of these relations is now in 2NF (in
fact they are all in 5NF!)
Second Normal Form (2NF)
customer
Name      Address         Employer
D. Hart   12 Gleeson Pl   ADFA
A. Bone   10 Hargrave St  Telecom
J. Eynon  24 Waller Cres  Ford

credit_limit
Customer  Credit_Card  Limit
D. Hart   Bankcard     2000
D. Hart   Visa         3000
D. Hart   Mastercard   2500
A. Bone   Amex         null
A. Bone   Visa         4000
J. Eynon  Visa         2000
J. Eynon  Mastercard   3500
J. Eynon  Diners Club  5000

credit_card
Type         Interest
Bankcard     23.5
Visa         22.8
Mastercard   23.5
Amex         null
Diners Club  21.8
Workbook Exercise 8: 2NF
Normalize to 2NF
Emp-proj (emp-no, proj-no, hrs, emp-name, proj-name,proj-loc)
Workbook Solution 8: 2NF
Transitive Functional
Dependency
• Third normal form is based on the concept
of “transitive functional dependency”
• Definition ...
A functional dependency X → Y in a
relation (entity) R is a transitive
dependency if there is an attribute or
group of attributes Z that is not a
superkey of R for which X → Z and Z → Y
Transitive Functional
Dependency
• The presence of a transitive functional
dependency can result in redundant data
being present in a relation
• Consider the relation
customer(Name,Address,Employer,
Work_Phone,Work_Address)
• There are two transitive dependencies:
Name → Employer, Employer → Work_Phone
Name → Employer, Employer → Work_Address
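Transitive dependencies can be found mechanically from a list of functional dependencies. The following plain-Python sketch is one possible approach, not the slides' own method: it uses attribute closure to test whether a determinant Z is a superkey, and the function names closure and transitive_dependencies are assumptions.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def transitive_dependencies(all_attrs, key, fds):
    """Return (Z, Y) pairs where key -> Z -> Y and Z is not a superkey."""
    found = []
    reachable = closure(key, fds)
    for lhs, rhs in fds:
        z = set(lhs)
        is_superkey = closure(z, fds) >= set(all_attrs)
        if not is_superkey and z <= reachable:
            for y in rhs:
                if y not in z and y not in key:
                    found.append((tuple(lhs), y))
    return found

attrs = ("Name", "Address", "Employer", "Work_Phone", "Work_Address")
fds = [
    (("Name",), ("Address",)),
    (("Name",), ("Employer",)),
    (("Employer",), ("Work_Phone",)),
    (("Employer",), ("Work_Address",)),
]
result = transitive_dependencies(attrs, ("Name",), fds)
print(result)
```

Both Work_Phone and Work_Address are reached from Name only through Employer, which is not a superkey of customer.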
Transitive Functional
Dependency
• An example of the “customer” relation
could be (note the redundant data again)
...
customer
Name         Address          Employer       Work_Phone  Work_Address
B. Holloway  38 Sturdee Cres  Santronics     806422      11 Kembla St
M. Wright    15 Bowes St      Hills Telefix  806754      168 Gladstone St
A. Leslie    18 The Pines     Hills Telefix  806754      168 Gladstone St
K. Riley     67 Irving St     Brashs         815255      168 Melrose Drv
D. Martens   24 Green St      Santronics     806422      11 Kembla St
L. Kazar     27 Archer St     Brashs         815255      168 Melrose Drv
etc          etc              etc            etc         etc
Third Normal Form (3NF)
• Definition:
A relation schema R is in 3NF if it is in
2NF and every non-prime attribute is
non-transitively dependent on every key of R
• The algorithm for achieving 3NF is the
same as that for achieving 2NF
• Applying it results in a decomposition of
“customer” into two relations ...
Third Normal Form (3NF)
• customer(Name, Address, Employer)
employer(Name, Address, Telephone)
• This decomposition is based on the
functional dependencies (for the original
“customer”):
– Customer Name → Employer
– Customer Name → Address
– Employer → Work_Phone
– Employer → Work_Address
Workbook Exercise 9: 3NF
Normalize to 3NF
Emp-dept(emp-name, emp-no, gender, emp-address, dept-no,
dept-name, dept-location)
Workbook Solution 9: 3NF
Workbook Exercise 10
Normalize to 3NF
Faculty-bid (faculty-no, bid-no, faculty-name, bid-item-number,
bid-item-description, bid-priority, faculty-member-making-bid,
number-of-students-affected, item-price, first-date-item-required)
Workbook Solution 10
Normalize to 3NF
Workbook Exercise 11
Normalize to 3NF
Athletics-results (race-no, race-distance, race-type, race-gender,
race-age, race-place, place-student-name, place-time,
record-holder-name, record-holder-time, record-holder-year)
Workbook Solution 11
Normalize to 3NF
Summary
First Normal Form (1NF)
• A relation schema is in 1NF if
– all of its attributes are single-valued and
– restricted to assuming simple (atomic) values,
and
– all attributes are functionally dependent on
the primary key
Summary
Second Normal Form (2NF)
• A relation schema is in 2NF if
– it is in 1NF and
– every non-prime attribute is fully functionally
dependent on every key
Summary
Third Normal Form (3NF)
• A relation schema is in 3NF if
– it is in 2NF and
– every non-prime attribute is non-transitively
dependent on every key