Formally Verifying
the Third Normalization of
Relational Databases
AIST, Yoichi Hirai
2013-11-22, Nagano (TPP 2013)
ACID properties of database systems
• Atomicity
Changes are applied in an “all or nothing” manner.
Partial changes must be rolled back.
• Consistency
Changes on valid states result in valid states.
• Isolation
Concurrent changes behave as if they were
executed one after another.
• Durability
Once changes are applied, they remain forever
unless overwritten.
Anomalies: failures of consistency
• Update anomaly
titleID | Title    | Author      | Library
3       | Istanbul | Orhan Pamuk | Central
3       | Istanbul | Orhan Pamuk | East
An attempt to change the title failed to
update every occurrence.
Consistency is violated.
titleID | Title          | Author      | Library
3       | Istanbul       | Orhan Pamuk | Central
3       | My name is red | Orhan Pamuk | East
Anomalies: failures of consistency
• Deletion anomaly
FacultyID | Faculty name | Faculty hire date | Course name | Course day | Course time
33        | R. Wavey     | 1951-09-01        | Physics 2A  | Wed        | 15:00-
34        | …            | …                 | …           | …          | …
Removing a course also removed a
faculty member as a result.
FacultyID | Faculty name | Faculty hire date | Course name | Course day | Course time
34        | …            | …                 | …           | …          | …
Codd’s first normal form
1st normal form excludes repetition of
the same attributes.
titleID | Title    | Author      | Library | Library
3       | Istanbul | Orhan Pamuk | East    | Central
Functional dependencies
titleID | Title    | Author      | Library
3       | Istanbul | Orhan Pamuk | Central
3       | Istanbul | Orhan Pamuk | East
{titleID} → {Title, Author}
{titleID, Library} → {titleID, Title, Author, Library}
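A functional dependency X → Y holds in a table when any two rows that agree on X also agree on Y. The following minimal Python sketch checks this on the two example rows. The function name `fd_holds` and the one-dictionary-per-row representation are assumptions made for illustration; they are not part of the formalization presented in the talk.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if all rows that agree on lhs also agree on rhs."""
    seen = {}  # maps each lhs value tuple to the rhs value tuple first seen with it
    for row in rows:
        key = tuple(row[a] for a in sorted(lhs))
        val = tuple(row[a] for a in sorted(rhs))
        if seen.setdefault(key, val) != val:
            return False  # same lhs values, different rhs values
    return True


# The two rows of the example table.
books = [
    {"titleID": 3, "Title": "Istanbul", "Author": "Orhan Pamuk", "Library": "Central"},
    {"titleID": 3, "Title": "Istanbul", "Author": "Orhan Pamuk", "Library": "East"},
]

title_determines_book = fd_holds(books, {"titleID"}, {"Title", "Author"})
title_determines_library = fd_holds(books, {"titleID"}, {"Library"})
```

Here `title_determines_book` is true, while `title_determines_library` is false because the same titleID occurs with two different libraries.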
Functional dependencies
FacultyID | Faculty name | Faculty hire date | Course name | Course day | Course time
33        | R. Wavey     | 1951-09-01        | Physics 2A  | Wed        | 15:00-
34        | …            | …                 | …           | …          | …
{FacultyID} → {Faculty name, Faculty hire date}
{FacultyID, Course name} → {Course day, Course time}
{FacultyID, Course name} → {FacultyID, Faculty name,
Faculty hire date, Course name, Course day, Course time}
Armstrong’s laws
• Mizar has a formalization, including soundness and
completeness with respect to the relational
semantics
1. Reflexivity: Y ⊆ X implies X → Y
2. Augmentation: Z ⊆ W and X → Y imply
X ∪ W → Y ∪ Z
3. Transitivity: X → Y and Y → Z imply X → Z
These laws are sound and complete with respect to the
relational semantics.
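Whether X → Y follows from a given set of functional dependencies by Armstrong's laws can be decided with the standard attribute-closure computation: X → Y is derivable exactly when Y is contained in the closure of X. The sketch below is a plain Python illustration of that textbook technique, not the Coq or Mizar development; the names `closure` and `implies` are chosen here for illustration.

```python
def closure(attrs, fds):
    """Attributes reachable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is already covered.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)


# Dependencies of the tournament example used later in the talk.
fds = [
    (frozenset({"Tournament", "Year"}), frozenset({"Winner"})),
    (frozenset({"Winner"}), frozenset({"Winner Date of Birth"})),
]
derived = implies(fds, {"Tournament", "Year"}, {"Winner Date of Birth"})
not_derived = implies(fds, {"Winner"}, {"Tournament"})
```

The first query succeeds by transitivity; the second fails because nothing determines Tournament from Winner.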
Codd’s second normal form
• Excludes this
FacultyID | Faculty name | Faculty hire date | Course name | Course day | Course time
33        | R. Wavey     | 1951-09-01        | Physics 2A  | Wed        | 15:00-
34        | …            | …                 | …           | …          | …
because of these conditions:
1. {FacultyID, Course name} is a minimal set X with the functional dependency
X → {FacultyID, Faculty name, Faculty hire date, Course name, Course day,
Course time} (that is, {FacultyID, Course name} is a candidate key).
2. Faculty hire date is not contained in any candidate key (Faculty hire date is a
non-prime attribute).
3. Faculty hire date is dependent on {FacultyID}, which is a proper subset of
the candidate key {FacultyID, Course name}.
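The three conditions above can be checked mechanically for a small schema. The following Python sketch finds candidate keys by brute force and then lists non-prime attributes that depend on a proper subset of a candidate key. It is only an illustration under stated assumptions: the function names are hypothetical, and the exhaustive search is exponential in the number of attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes reachable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            subset = frozenset(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) == frozenset(all_attrs):
                keys.append(subset)
    return keys


def partial_dependencies(all_attrs, fds):
    """Pairs (S, a): non-prime a depends on a proper subset S of a key."""
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                sub = frozenset(combo)
                for attr in sorted(closure(sub, fds) - sub - prime):
                    found.append((sub, attr))
    return found


ATTRS = {"FacultyID", "Faculty name", "Faculty hire date",
         "Course name", "Course day", "Course time"}
FDS = [
    (frozenset({"FacultyID"}), frozenset({"Faculty name", "Faculty hire date"})),
    (frozenset({"FacultyID", "Course name"}), frozenset({"Course day", "Course time"})),
]
keys = candidate_keys(ATTRS, FDS)
violations = partial_dependencies(ATTRS, FDS)
```

For this schema the only candidate key is {FacultyID, Course name}, and `violations` reports that Faculty hire date and Faculty name depend on {FacultyID} alone.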
The third normal form
• Excludes this (example from Wikipedia)
Tournament           | Year | Winner         | Winner Date of Birth
Indiana Invitational | 1998 | Al Fredrickson | 21 July 1975
Des Moines Masters   | 1999 | Al Fredrickson | 21 July 1975
Indiana Invitational | 1999 | Chip Masterson | 14 March 1977
Because a non-prime attribute “Winner Date of Birth” is
transitively dependent on a candidate key. Concretely,
1. “Winner Date of Birth” is a non-prime attribute
2. {Tournament, Year} is a candidate key
3. {Tournament, Year} → {Winner} holds
4. {Winner} → {Tournament, Year} does not hold
5. {Winner} → {Winner Date of Birth} holds
6. “Winner Date of Birth” is not in {Tournament, Year}
7. “Winner Date of Birth” is not in {Winner}
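The seven conditions can be read as a single predicate over a candidate key, an intermediate attribute set, and a target attribute. The Python sketch below spells this out, one line per condition. The function `transitively_dependent` and its argument names are hypothetical, and the functional dependencies are the two listed in conditions 3 and 5.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes reachable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            subset = frozenset(combo)
            if any(key <= subset for key in keys):
                continue
            if closure(subset, fds) == frozenset(all_attrs):
                keys.append(subset)
    return keys


def transitively_dependent(all_attrs, fds, key, mid, attr):
    """Check conditions 1 to 7 for key -> mid -> attr."""
    keys = candidate_keys(all_attrs, fds)
    prime = set().union(*keys)
    return (
        attr not in prime                          # 1. attr is non-prime
        and frozenset(key) in keys                 # 2. key is a candidate key
        and set(mid) <= closure(key, fds)          # 3. key -> mid holds
        and not set(key) <= closure(mid, fds)      # 4. mid -> key does not hold
        and attr in closure(mid, fds)              # 5. mid -> attr holds
        and attr not in key                        # 6. attr is not in key
        and attr not in mid                        # 7. attr is not in mid
    )


ATTRS = {"Tournament", "Year", "Winner", "Winner Date of Birth"}
FDS = [
    (frozenset({"Tournament", "Year"}), frozenset({"Winner"})),
    (frozenset({"Winner"}), frozenset({"Winner Date of Birth"})),
]
violates_3nf = transitively_dependent(
    ATTRS, FDS, {"Tournament", "Year"}, {"Winner"}, "Winner Date of Birth")
```

With these inputs `violates_3nf` is true, matching the conclusion that the table is not in the third normal form.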
Obtaining the third normal form:
the input and output
• Input: a finite set of functional dependencies
{Tournament, Year} → {Winner}
{Winner} → {Winner Date of Birth}
• Output: a finite set of relations and their keys (in 3NF)
{Tournament, Year, Winner}, key {Tournament, Year}
{Winner, Winner Date of Birth}, key {Winner}
Bernstein’s algorithm 1
[Bernstein, 1976]
Obtained after two earlier erroneous attempts!
Bernstein’s algorithm 1, step 1
Eliminating extraneous attributes.
[Figure: functional dependencies over Tournament, Year, Winner, Winner Date of Birth, and Place, before and after eliminating extraneous attributes]
The result is smaller, but equivalent
(after taking the closure under
Armstrong’s laws).
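A left-hand-side attribute is extraneous when the dependency still follows from the whole set after that attribute is dropped. The Python sketch below applies this textbook test attribute by attribute. It is not the Coq code from the talk; the function names are hypothetical, and the third dependency in the example, {Tournament, Year, Winner} → {Place}, is an assumed input chosen for illustration.

```python
def closure(attrs, fds):
    """Attributes reachable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def eliminate_extraneous(fds):
    """Drop left-hand-side attributes that are not needed to derive the right-hand side."""
    result = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds]
    for i in range(len(result)):
        lhs, rhs = result[i]
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            # attr is extraneous if rhs is still derivable without it.
            if rhs <= closure(smaller, result):
                lhs = smaller
                result[i] = (lhs, rhs)
    return result


# Assumed example input: Winner is extraneous in the third dependency,
# because {Tournament, Year} already determines Winner.
fds = [
    ({"Tournament", "Year"}, {"Winner"}),
    ({"Winner"}, {"Winner Date of Birth"}),
    ({"Tournament", "Year", "Winner"}, {"Place"}),
]
reduced = eliminate_extraneous(fds)
```

After the call, the third dependency has become {Tournament, Year} → {Place}, while the first two are unchanged.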
Bernstein’s algorithm 1, step 2
Finding a nonredundant covering
• A set of functional dependencies is
nonredundant when no element can be
inferred from the others using
Armstrong’s laws.
• Step 2 removes functional dependencies
until the whole set becomes
nonredundant.
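Step 2 can be sketched as a loop that tests each dependency against the remaining ones and drops it when it is already derivable. This is a minimal Python illustration under assumptions: the name `nonredundant_cover` is hypothetical, derivability is tested with the standard attribute closure, and the redundant third dependency in the example is an assumed input.

```python
def closure(attrs, fds):
    """Attributes reachable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def nonredundant_cover(fds):
    """Remove dependencies that can be inferred from the remaining ones."""
    result = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds]
    i = 0
    while i < len(result):
        lhs, rhs = result[i]
        others = result[:i] + result[i + 1:]
        if rhs <= closure(lhs, others):
            result = others  # redundant: drop it and re-examine position i
        else:
            i += 1
    return result


# Assumed example input: the third dependency follows by transitivity.
fds = [
    ({"Tournament", "Year"}, {"Winner"}),
    ({"Winner"}, {"Winner Date of Birth"}),
    ({"Tournament", "Year"}, {"Winner Date of Birth"}),
]
cover = nonredundant_cover(fds)
```

The loop terminates because every iteration either removes one dependency or advances the index. The result depends on the order in which dependencies are examined, so a nonredundant covering is not unique in general.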
Bernstein’s algorithm 1, step 3
Partition
{Tournament, Year} → {Winner}
{Tournament, Year} → {Place}
{Winner} → {Winner Date of Birth}
The first two functional
dependencies share the
left-hand side.
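Partitioning groups the functional dependencies by identical left-hand sides. A minimal Python sketch, assuming dependencies are given as pairs of attribute sets; the name `partition_by_lhs` is chosen here for illustration.

```python
from collections import defaultdict


def partition_by_lhs(fds):
    """Group right-hand sides under their shared left-hand side."""
    groups = defaultdict(list)
    for lhs, rhs in fds:
        groups[frozenset(lhs)].append(frozenset(rhs))
    return dict(groups)


fds = [
    ({"Tournament", "Year"}, {"Winner"}),
    ({"Tournament", "Year"}, {"Place"}),
    ({"Winner"}, {"Winner Date of Birth"}),
]
groups = partition_by_lhs(fds)
```

Here `groups` has two entries: one for {Tournament, Year} with two right-hand sides, and one for {Winner} with a single right-hand side.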
Bernstein’s algorithm 1, step 4
Construct Relations
Relation 1
{Winner, Winner Date of Birth}, key {Winner}
Relation 2
{Tournament, Year, Place, Winner}, key {Tournament, Year}
The key of each relation is the left-hand side shared by its functional dependencies.
These relations are in the third normal form. Why?
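Each group of dependencies with a common left-hand side yields one relation: its attributes are the union of all attributes in the group, and its key is the common left-hand side. The Python sketch below shows this construction; the dictionary layout and the name `construct_relations` are assumptions made for illustration.

```python
def construct_relations(groups):
    """Build one relation per group: all attributes of the group, keyed by the left-hand side."""
    relations = []
    for lhs, rhs_list in groups.items():
        attrs = set(lhs)
        for rhs in rhs_list:
            attrs |= set(rhs)
        relations.append({"attributes": frozenset(attrs), "key": frozenset(lhs)})
    return relations


# Groups as produced by partitioning the example dependencies.
groups = {
    frozenset({"Winner"}): [frozenset({"Winner Date of Birth"})],
    frozenset({"Tournament", "Year"}): [frozenset({"Winner"}), frozenset({"Place"})],
}
relations = construct_relations(groups)
```

The two resulting relations correspond to Relation 1 and Relation 2 of the example.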
Formalization Strategies
• Never mention the relational semantics
• Attributes are just elements of a type (with
equalities)
• A functional dependency is a pair of sequences
of attributes
• Derivations based on Armstrong’s laws are
defined inductively.
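To make the strategy concrete, here is a hedged sketch of what such a purely syntactic definition could look like. It is written in Lean 4 rather than Coq, reads the two sides of a dependency as lists of attributes, and uses names (`FD`, `Derives`, `base`, `refl`, `aug`, `trans`) that are invented for this illustration; the actual development may be organized differently.

```lean
-- Attributes are elements of an arbitrary type; no relational semantics is mentioned.
structure FD (α : Type) where
  lhs : List α
  rhs : List α

-- `Derives F X Y` means that X → Y is derivable from the dependencies in F
-- using Armstrong's laws.
inductive Derives {α : Type} (F : List (FD α)) : List α → List α → Prop where
  -- every given dependency is derivable
  | base (f : FD α) : f ∈ F → Derives F f.lhs f.rhs
  -- Reflexivity: Y ⊆ X implies X → Y
  | refl (X Y : List α) : (∀ a, a ∈ Y → a ∈ X) → Derives F X Y
  -- Augmentation: Z ⊆ W and X → Y imply X ∪ W → Y ∪ Z
  | aug (X Y Z W : List α) :
      (∀ a, a ∈ Z → a ∈ W) → Derives F X Y → Derives F (X ++ W) (Y ++ Z)
  -- Transitivity: X → Y and Y → Z imply X → Z
  | trans (X Y Z : List α) : Derives F X Y → Derives F Y Z → Derives F X Z
```

In this sketch, list append plays the role of union, and reordering or duplication of attributes is absorbed by the reflexivity rule together with transitivity.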
Termination of algorithms.
(Coq computes only total, terminating functions)
• Termination of closure (under Armstrong’s laws)
– Sizes converge because they are increasing and bounded
– When sizes converge, the closure converges
• Termination of Bernstein’s algorithm 1
– This is easier because every step is a simplification in
some sense.
– Repeat simplifying something until it cannot be
simplified further.
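The size argument for the closure can be illustrated with an iteration that runs for a fixed, precomputed number of rounds, which is the usual way to obtain a structurally terminating function. The Python sketch below is only an analogy to the Coq proof obligation: the names `step` and `closure_with_fuel` are hypothetical, and it assumes that all attributes involved belong to the finite set `universe`.

```python
def step(current, fds):
    """One round: add the right-hand side of every dependency applicable to current."""
    result = set(current)
    for lhs, rhs in fds:
        if set(lhs) <= current:
            result |= set(rhs)
    return frozenset(result)


def closure_with_fuel(attrs, fds, universe):
    """Iterate step at most len(universe) + 1 times; return the fixpoint and the rounds used."""
    current = frozenset(attrs)
    # Each round that changes the set adds at least one attribute, and the set
    # never grows beyond universe, so at most len(universe) rounds can change it.
    # One extra round is needed to observe that nothing changes any more.
    for rounds in range(len(universe) + 1):
        nxt = step(current, fds)
        if nxt == current:
            return current, rounds
        current = nxt
    raise AssertionError("unreachable when attrs and fds stay inside universe")


UNIVERSE = {"Tournament", "Year", "Winner", "Winner Date of Birth"}
FDS = [
    (frozenset({"Tournament", "Year"}), frozenset({"Winner"})),
    (frozenset({"Winner"}), frozenset({"Winner Date of Birth"})),
]
closed, rounds_used = closure_with_fuel({"Tournament", "Year"}, FDS, UNIVERSE)
```

In the example, two rounds add attributes and the third round detects the fixpoint, well within the bound of five rounds.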
Proving Preservation Properties
• Each step preserves the closure of
functional dependencies!
• This property holds without
exception, so it is very easy to formalize and
to prove (straightforward divide and
conquer).
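Preservation of the closure means that the set of dependencies before a step and the set after it derive exactly the same dependencies. On concrete inputs this can be tested by checking that each set implies every member of the other. The Python sketch below does so with the standard attribute closure; the name `equivalent` and the example sets are assumptions for illustration, and a test of this kind is of course no substitute for the proof.

```python
def closure(attrs, fds):
    """Attributes reachable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return frozenset(result)


def covers(fds_a, fds_b):
    """True if every dependency in fds_b follows from fds_a."""
    return all(set(rhs) <= closure(lhs, fds_a) for lhs, rhs in fds_b)


def equivalent(fds_a, fds_b):
    """True if both sets derive the same dependencies."""
    return covers(fds_a, fds_b) and covers(fds_b, fds_a)


# Assumed example: the same set before and after dropping an extraneous attribute.
before = [
    ({"Tournament", "Year"}, {"Winner"}),
    ({"Tournament", "Year", "Winner"}, {"Place"}),
]
after = [
    ({"Tournament", "Year"}, {"Winner"}),
    ({"Tournament", "Year"}, {"Place"}),
]
preserved = equivalent(before, after)
lost = equivalent(before, after[:1])
```

`preserved` is true, whereas `lost` is false because dropping the dependency on Place changes the closure.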
Proving 3NF
• Mostly followed the text
(at first I omitted step 1, and the proof attempt failed)
• Changed a little to allow easier formalization.
• Some proof steps are not yet entirely understood
– Refactoring should bring enlightenment.
Some changes to
Bernstein’s original proof.
• Removed the graphical reasoning
• The root cause of such graphical objects:
“If there exists a (graphical) derivation
using a functional dependency g,”
• A reformulation:
“If every (graphical) derivation uses a
functional dependency g,”
Amount of code
Parts | Lines of code | Comments
Properties of Armstrong’s laws & closure operation | ~600 | Took ~100 lines to prove that a monotonic bounded sequence of natural numbers converges.
Definition of steps; steps keep closures; when steps terminate, certain things are removed totally. | ~700 | Somewhat boilerplate.
The whole algorithm produces 3NF | ~200 | Very involved monolithic proof.
Still to be seen: Bernstein’s algorithm 2
• The number of relations produced by Bernstein’s
algorithm 1 is not optimal
• Bernstein’s algorithm 2 gives optimal (= smallest)
number of relations, answering Codd’s challenge.
• We just formalized the algorithm 2.
• And multivalued dependencies, and the fourth and fifth normal forms.