Equivalents, Applications, and Consistency of the Axiom of Choice

Equivalents, Applications, and
Consistency
of the Axiom of Choice
Xi Yao
Supervised by Professor Greg Hjorth
[email protected]
Department of Mathematics and Statistics
The University of Melbourne
November 2010
To Hui-Ying Hu
Acknowledgments
I would like to particularly thank my supervisor Professor Greg Hjorth. As a
previous biology student who had never done any pure mathematics before and
somehow decided to do a pure mathematics Master thesis in set theory and
logic, and as an international student who had to work a lot to make a living,
I have experienced a hard time. Professor Greg Hjorth has been very patient
and considerate, helping me through many of the elementary aspects of pure
mathematics.
I also want to thank Raj A. Dahya, Leigh Humphries, and Toby Meadows
for the discussions about L, Dr Lawrence Reeves for the discussions about the
Banach–Tarski Paradox, Dr Paul Norbury for the discussions on the proof of
Vec ⇒ AC, and Bowen Plug for some general discussions about AC and correcting
some of my grammatic mistakes.
Abstract
Given any family of non-empty sets, the Axiom of Choice (AC) gives us the power
to choose one element from each member from the family. AC appears and is
used in most modern mathematics and sciences, more often than not, implicitly.
For example, when we say “choose coset representatives”, we are indeed using
AC.
Upon its formal formulation after unconscious uses, AC had been receiving
heavy criticisms for years due to its non-constructive nature. However, AC indeed
gives us a richer mathematical (and scientific) world.
We first provide, as motivation, some equivalents of AC and results requiring
AC from various branches of mathematics. Then we prove the consistency of AC
in ZF. Some results from AC, e.g. the Banach–Tarski paradox, are discussed.
Contents
1 Introduction
1
2 Preliminaries
2.1 Zermelo–Fraekel Set Theory
2.2 Ordinals . . . . . . . . . . .
2.3 Transfinite Induction . . . .
2.4 Cardinals . . . . . . . . . .
2.5 The universe of sets . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
3
3
7
9
12
13
3 Some equivalents of AC
14
4 Some results requiring AC
4.1 Set theory . . . . . . . .
4.2 General topology . . . .
4.3 Algebra . . . . . . . . .
4.4 Group theory . . . . . .
4.5 Measure theory . . . . .
4.6 Functional Analysis . . .
.
.
.
.
.
.
20
20
20
21
21
21
21
.
.
.
.
.
.
.
.
22
22
22
26
31
36
42
43
44
5 The
5.1
5.2
5.3
5.4
5.5
5.6
5.7
5.8
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
consistency of AC in ZF
Overview of the consistency and
A quick review of model theory
Absoluteness . . . . . . . . . . .
Constructible sets . . . . . . . .
Gödel operations . . . . . . . .
Con(ZF + V = L) . . . . . . .
Con(ZFC) . . . . . . . . . . . .
ZF + ¬AC . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
its proof
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
6 The Sun and a pea
45
7 Conclusion
47
References
48
1
Introduction
The Axiom of Choice is necessary to select a set from an infinite number of
socks, but not an infinite number of
shoes.
Bertrand Russell
Along with the eight Axioms in the Zermelo–Fraenkel axiomatic set theory
(ZF hereafter), together called ZFC, the Axiom of Choice (AC hereafter) forms a
natural and simple but indispensable foundation for most modern mathematics
and sciences. It penetrates into most, if not all, branches of mathematics, including set theory, order theory, algebra, topology, group theory, and analysis, just
to name a few; and it has many equivalents and applications.
In the history of mathematics, only through extensive use, did assumptions
become clean formulas. According to Moore 1982, the development of AC can
be divided into four stages. The rudimentary version of AC — picking out an
arbitrary element from a set — can be traced back at least to 2300 years ago
in Euclid’s Elements. In the second stage, around the start of the nineteenthcentury, mathematicians constructed the rule of choice when dealing with infinity.
But gradually, mathematicians, like Cauchy, started to leave the rule implicit.
The last stage can probably marked by that in 1871, Cantor once used some
choice functions that no rule existed and he did not recognize this. After the four
stages, AC was first formulated by Ernst Zermelo when he was seeking a proof
for the well-ordering principle.
AC often appears implicitly as an assumption in many proofs and can easily be
overlooked. The proof of Theorem 3.1 later provides an example of the oversight
of AC.
In the first 30 years after AC was formulated, whether or not AC could be
taken as meaningful and acceptable was very controversial. However, AC is now
widely accepted and used by the majority of mathematicians.
AC states:
The Axiom of Choice 1 Every family of non-empty sets has a choice function.
This choice function maps each set in the family to an element of the set. AC
has many forms. We list a few below and it is not hard to see they are essentially
the same.
1
The Axiom of Choice 2 For every set x, there exists a function
f : P(x) − {∅} → x
such that f (s) ∈ s.
The Axiom of Choice 3 The product of any family of non-empty sets is nonempty.
The Axiom of Choice 4 Every family x of non-empty sets has a choice function on x such that for all s ∈ x, f (s) is a non-empty finite subset of s.
If the sets are finite or are ordered in some way, e.g. natural numbers, we
do not need AC. A choice function is required if the sets are infinite and there
is no clear rule to select elements from them. It is important to point out that
the essence of AC is the existence of a choice function, not the choice that the
function gives. Therefore, we do not require the knowledge about the choice
function, and thus no construction of the choice function is needed; that is, the
choice function used need not be well-defined.
In constructivism, all existence proofs are required to be totally explicit. That
is, one must be able to construct, in an explicit and canonical manner, anything
that is proven to exist. Due to its non-constructive nature, AC received strong
criticisms, especially after Banach and Tarski proved in 1924 a paradox using
AC. We will discuss this in section 6.
The criticisms did not stop until in 1938 Kurt Gödel proved the consistency
of AC in ZF. This verified why we can use AC and displaced any suspicion that
the use of AC could lead to contradiction.
In this paper, we first provide, as motivation, some equivalents of AC and
results requiring AC from various branches of mathematics. Then we prove the
consistency of AC in ZF.
2
2
Preliminaries
The essential idea on which the Axiom of
Choice is based constitutes a general logical principle which . . . is necessary and
indispensable.
David Hilbert
We assume that the reader has some knowledge in the formal language of
logic. See Kunen 1980 [12] for a quick review or Mendelson 1987 [15] for a formal
treatment.
2.1
Zermelo–Fraekel Set Theory
We list, formulate, and command on all the axioms in ZF . Some familiar mathematics objects are defined using these axioms.
0. Axiom of Existence.
∃x(x = x).
This axiom assume a non-empty universe. It is listed here for emphasis
but can be omitted since it is derivable from the logical axioms of formal
logic [12]. (Thus it is not counted as one of ZF.) But what is a set and
what is not a set? It is not easy to give definitions to them without selfreference. Although in ZF, classes do not exist, i.e. the only objects are
sets [9], sometime it is unavoidable to work on classes.
Definition 2.1 (§4 [15]). A class is a collection {x : P (x)}, with some
property P . A class x is a set if it is a member of a class; it is a proper
class if it is not a set.
We would like this definition to capture all the legitimate mathematical
objects. For instance, we do not accept a P (x) that says x is a dog.
1. Axiom of Extensionality.
∀x∀y (∀u ∈ x(u ∈ y) ∧ ∀u ∈ y(y ∈ x)) ⇒ x = y .
This axiom says a set is defined by its members.
3
2. Axiom of Foundation.
∀x(x 6= ∅ ⇒ ∃u ∈ x(u ∩ x) = ∅).
This axiom says any set non-empty set contains an ∈-least element. It
prevents ill-founded sets from existing, e.g. a set x with x ∈ x, and thus
solve the famous Russell’s Paradox:
∃u u = {x : x ∈
/ x)} .
Since there exist no sets containing themselves, u contains all the sets and
thus is not a set itself by the Axiom of Separation below.
3. Axiom of Pairing.
∀x∀y∃u(x ∈ z ∧ y ∈ z ∧ ∀u ∈ z(u = x ∨ u = y)).
Given x and y, define the pair u to be {x, y}. Easy to see u is unique by
the Axiom of Extensionality. If y = x, then u = {x, x} = {x}, and there
exists a set z with
z = {x, u} = {x, {x}}.
Thus we define the ordered pair (x, y) of x and y to be {{x}, {x, y}} .
Similarly, define tuples with more than two entries to be
(x, y, z) = ((x, y), z) = (x, (y, z))
(x, y, z, u) = ((x, y, z), u)
..
.
and so on.
4. Axiom of Union.
∀x∃y∀t(t ∈ y ⇔ ∃s(s ∈ x ∧ t ∈ s)).
S
This axioms states for any set x, the collection x of all the elements in
the members in x is a set. Using pairing we define
[
x ∪ y = {x, y}
x ∪ y ∪ z = (x ∪ y) ∪ z
..
.
and so on.
4
5. Axiom of Power Set.
∀x∃y∀u(u ⊂ x ⇔ u ∈ y).
This axioms says for any set x, the collection of all the subset of x is a set.
Using ordered pairing and power set, we define the product x × y of x and
y to be {(s, t) : s ∈ x ∧ t ∈ y}. Similarly, define
x × y × z = ((x × y) × z) = (x × (y × z))
x × y × z × u = (x × y × z) × u
..
.
and so on. We also called a set of ordered tuples an relation and define the
domain of a relation r to be
dom(r) = {x : ∃y((x, y) ∈ r)},
and the range of r to be
ran(r) = {y : ∃x((x, y) ∈ r)}.
If f is a relation such that
(x, y) ∈ f ∧ (x, z) ∈ f ⇒ y = z,
then it is a function. If (x, y) ∈ f , we write
f (x) = y.
For a function f , if dom(f ) = x and ran(f ) = y, we write
f : x → y,
and say f maps x to y. The image of a set x under a function f , or simply
the image of f , is
f [x] = img(f ) = {t : ∃s ∈ x(t = f (s))}.
A function f is one-to-one or injective if
(x, z) ∈ f ∧ (y, z) ∈ f ⇒ x = y;
it is onto or surjective if
img(f ) = ran(f );
5
and it is bijective if it is both one-to-one and onto. The pre-image of
t ∈ ran(f ) under a function f is
f −1 (t) = {s ∈ dom(f ) : (s, t) ∈ f }.
If f is a one-to-one function, its inverse f −1 is also a function. By definition,
ran(f ) contains and could be strictly larger than the image of dom(f ). The
restriction f of a function f to a set x is
f x = {(s, t) ∈ f : s ∈ x}.
Easy to see the terms defined above are sets.
6. Axiom of Infinity.
∃x(∅ ∈ x ∧ ∀s ∈ x(s ∪ {s} ∈ x)).
This axioms says there exists a set with infinitely many members. If a set
x is finite, then there exists a natural number n such that x has exactly n
members. Instead of formulating the property of infinity, in the formula of
this axiom we formulate that of inductivity. In fact, the natural numbers
is the smallest inductive set.
Proposition 2.2. Any inductive set contains the natural numbers.
Proof. Let x be an inductive set, i.e.
∅ ∈ x ∧ ∀s ∈ x(s ∪ {s} ∈ x).
Suppose x does not contain all the natural numbers. Let n be the smallest
natural number that is not contained in x. Then
∀m < n(m ∈ x ∧ m ∪ {m} ∈ x).
Now since n is not zero, there exists a natural number l such l ∪ {l} = n.
If l < n, we are done. But clearly l n.
7. Axiom Schema of Replacement. For each formula ϕ,
∀x∀y∀z ϕ(x, y) ∧ ϕ(x, z) ⇒ y = z
⇒ ∀x∃y∀u u ∈ y ⇔ ∃s ∈ x(ϕ(s, u)) .
First note, since there are infinitely many formulas and thus infinitely many
axioms of Replacement, this is an axiom schema; and similarly for the
axioms of Separation. In short, this axiom schema says the image of a set
under a function is a set.
6
8. Axiom Schema of Separation. For each formula ϕ,
∀x∀p1 . . . ∀pn ∃y∀s s ∈ y ⇔ s ∈ x ∧ ϕ(s, p1 , . . . , pn ) .
This schema states given a formula ϕ, a set x, and parameters p = {p1 , . . . , pn },
there exists a set u = {s ∈ x : ϕ(s, p)}; and such u is unique by Extensionality. It is important to note that the statement that there exists a set
u = {s : ϕ(s, p)} is false. Then it follows that
∅ = {x : x 6= x}
is a set and that the universal class
{x : x = x}
is not.
2.2
Ordinals
Definition 2.3. A partial ordered set (x, <) consists of a set x and a binary
relation <, called a partial ordering, on x, such that
i) ∀s ∈ x(¬(s < s));
ii) ∀r, s, t ∈ x(r < s ∧ s < t ⇒ r < t).
Definition 2.4. Let (x, <) be a partial ordered set, ∅ =
6 t ⊂ x, and s ∈ x. We
say s is
i) maximal in t if s ∈ t and ¬(∃u ∈ t(s < u));
ii) minimal in t if s ∈ t and ¬(∃u ∈ t(u < s));
iii) largest or greatest in t if s ∈ t and ∀u ∈ t(u < s ∨ u = s));
iv) least or lowest in t if s ∈ t and ∀u ∈ t(s < u ∨ s = u)).
v) an upper bound of t if ∀u ∈ t(u < s ∨ u = s));
vi) a lower bound of t if ∀u ∈ t(s < u ∨ s = u)).
Definition 2.5. A linear ordered set (x, <) is a partial ordered set such that
∀s, t ∈ x(s = t ∨ s < t ∨ t < s).
Remark 2.6. A partial ordered set can have more than one maximal elements
but only one largest element, whereas in a linear ordered set, the maximal element
and a largest one coincide; similar for minimal and least elements.
7
Remark 2.7. If we order all the elements in a linear ordered set x, they look
like a chain. So we call a subset z of a partial ordered set (y, <) a chain if (z, <z )
is a linear ordered set, where <z is the restriction of < to z.
Notation 2.8. Let x and y be two sets. If there exists a one-to-one mapping
from x to y, we write
|x| ≤ |y|.
If |x| ≤ |y| and |y| ≤ |x|, we write
|x| = |y|.
If |x| ≤ |y| and |y| |x|, we write
|x| < |y|.
Definition 2.9. Two linearly ordered sets (x, <1 ) and (y, <2 ) are order isomorphic, denoted (x, <1 ) ' (y, <2 ), if there exists a map f from x to y such that f
is bijective and order-preserving, i.e.
∀s, t ∈ x s <1 t ⇒ f (s) <2 f (t) ∧ ∀s, t ∈ y s <2 t ⇒ f −1 (s) <1 f −1 (t) .
Definition 2.10. A well-ordered set (x, <) is a linear ordered set such that every
non-empty subset of x has a <–least element. An initial segment x(t) of (x, <)
is the set {s ∈ x : s < t} for some t ∈ x.
Definition 2.11. A set x is transitive if
∀s(s ∈ x ⇒ s ⊂ x).
Definition 2.12. A set α is an ordinal if and only if (α, ∈) is a transitive wellordered set. We write lowercase Greek letters α, β, . . . to represent ordinals.
Denote the class of ordinals by Ord. For α, β ∈ Ord, define
α < β ⇔ α ∈ β.
It is easy to see the following properties of ordinals:
Lemma 2.13. Let α, β ∈ Ord.
1. γ ∈ α ⇒ γ ∈ Ord.
2. β 6= α ∧ β ⊂ α ⇒ β ∈ α.
3. α ⊂ β ∨ β ⊂ α.
4. α ∪ {α} ∈ Ord.
5. β < α ⇔ β ∈ α ⇔ β $ α.
8
6. < linearly orders Ord.
7. α = β ⇔ (α, <) ' (β, <).
Definition 2.14. An ordinal α is a successor if there exists an ordinal β such
that
α = β ∪ {β} = β + 1;
it is a limit if it is not a successor, that is, it has no last element.
Remark 2.15. If α is a limit, then either
i) α = 0, or
i) α > 0, and α =
S
β<α
β.
Examples 2.16 (of ordinals).
0 = ∅,
1 = {∅},
2 = {∅, {∅}},
3 = {∅, {∅}, {∅, {∅}}},
..
.
ω = {the finite ordinals} = {the natural numbers},
ω1 = {the countable ordinals}.
2.3
Transfinite Induction
In the light of proofs by induction on natural numbers, we extend them to their
counterparts on ordinals.
Theorem 2.17 (Proof by transfinite induction, Theorem 1.7.3 [19]).
Let {ϕs : s ∈ x} be a set of mathematical expressions or statements labeled by A
well-ordered set (x, <). Suppose
∀s ∈ x ∀t ∈ x(t < s ⇒ ϕt ) ⇒ ϕs ,
then
∀s ∈ x(ϕs ).
Proof. Suppose, instead, ∀s ∈ x(ϕs ) is not true . Let u be the least element in x
such that ϕu fails. Then
∀t ∈ x(t < u ⇒ ϕt ).
But by the assumption,
∀t ∈ x(t < u ⇒ ϕt ) ⇒ ϕu .
Thus ϕu is true. This contradicts the definition of u.
9
Transfinite induction can also be used to define objects.
Theorem 2.18 (Definition by transfinite induction, Theorem 1.7.4 [19]). Let x
be a set, (y, <) be a well-ordered set, and define
Ω = f : t → x | ∃u ∈ y t = y(u) = {v ∈ y : v < u} ,
Let G : Ω → x be a function. Then there exists a unique function F on y such
that for each u ∈ y,
F (u) = G(F y(u)).
Remark 2.19. In Ω, each f is a sequence (ordered tuples) with an ordinal length
of elements in x labeled by the elements in y. For the purpose of demonstration,
say y is an ordinal. Then an f is in the form of
hs0 , s1 , . . . , sn , . . . , sα i,
where h. . .i = (. . .) denotes a sequence, α ∈ Ord and each si ∈ x.
Intuitively, Theorem 2.18 works because we get a sequence F of of elements
in x from the given G by induction as follows:
At α = 0, since ∅ is a sequence (of length 0), let F (0) = G(∅), say, = s0 ;
at α = 1, since hs0 i is a sequence (of length 1), let F (1) = G(hs0 i), say, = s1 ;
..
.
at α = n ∈ ω, since hs0 , s1 , . . . , sn−1 i is a sequence (of length n), let F (n) =
G(hs0 , s1 , . . . , sn−1 i); and so on.
So each time when we get an element from G, we glue it to the end of the old
sequence. This is exactly the familiar definition by induction. But Theorem 2.17
allows us to proceed to the limit cases.
Proof of Theorem 2.18. For each u ∈ y, let ϕu be the statement that there exists
a unique function fu : u → x such that
∀v < u fu (v) = G(fu v) .
Let w ∈ y such that for all u < w, ϕu holds. (ϕ∅ is obviously the trivial one
that holds.) We have the unique set Ωw = {fu : u < w}, where fu is as in the
statement ϕu . It is easy to see that for each u < w, if p < u, fu p also satisfies
∀v < p (fu p)(v) = G (fu p)v .
S
Therefore by the uniqueness of Ωw , if p < u < w, then fp = fu p. So h = u<w fu
is a common extension for {fu : u < w}. Now if w is a limit, let fw = h; if w is a
successor, let fw y(w − 1) = h and fw (w − 1) = G(h), where w − 1 is the element
whose successor is w. Since Ωw is unique, so is fw . Then we have
∀u ∈ y ∀v ∈ y(v < u ⇒ ϕv ) ⇒ ϕu .
10
So by Theorem 2.17,
∀u ∈ y(ϕu ).
Similarly, one can see
H=
[
fu
u∈y
is the unique common extension for {fu : u ∈ y}. Again, if y is a limit, let F = H;
if it is a successor, let s be the largest element in y and take F y(s − 1) = H and
F (s) = G(H). And similarly, F is unique.
Since induction is just a special case of transfinite induction, we collectively
call induction and transfinite induction just induction hereafter.
Theorem 2.20 (Theorem 2.12 [9]). Every well-ordered set is order isomorphic
to a unique ordinal.
Proof. Let (x, <) be a well-ordered set. Using Theorem 2.18, we define by induction
Ω = {f : α → x | α ∈ Ord}.
Since α and x are sets, the elements in Ω are well-defined. Ω is the set of the
sequence of elements in x (labeled by ordinals); in particular, Ω contains all the
ft ’s with t ∈ x such that
i) the domain of ft is an ordinal,
ii) the image of ft is the initial segment x(t) = {s ∈ x : s < t}, and
iii) ft is an order isomorphism.
Now let G : Ω → x be such that G maps an initial segment x(s) of x to s, where
s ∈ x. Then by Theorem 2.18, we get a unique F . And F is an is an order
isomorphism from some ordinal α to x. This α is also unique.
Remark 2.21. From Theorem 2.20, we see that an important idea of ordinals
is that they provide the canonical well-orderings for all sets (by assuming AC).
Later we will see in section 3, every set can be well-ordered. On the other hand,
since no set contains a copy (up to order isomorphim) of itself, Ord is a proper
class. Using this fact, we prove the following result.
Theorem 2.22 (Hartogs’ Theorem, Proposition 4.28 [15]).
∀x∃α ∈ Ord∀y(y ⊂ x ⇒ |y| =
6 |α|).
11
Proof. Suppose x is a set such that
∀α ∈ Ord∃y(y ⊂ x ∧ |y| = |α|).
Now for each pair of y and α such that |y| = |α|, we let fy,α : y → α be the
bijection and define a relation <y,α on y by
∀s, t ∈ y s <y,α t ⇔ fy,α (s) < fy,α (t) .
∗
: (y, <y,α ) → (α, <) is the natural extension of fy,α and is an obvious
Thus fy,α
∗
order isomorphism. For each fy,α
, since y ⊂ x and |y| = |α|, we have
∗
fy,α
⊂ x × (x × x) × x.
Define a function F mapping each ordinal α to the set
∗
uα = {fy,β
: β < α}.
As we see
uα ⊂ P(x) × (P(x) × P(x)) × P(x),
uα is a set. So it follows that the image of F is also a set since
F Ord ⊂ P P(x) × (P(x) × P(x)) × P(x) .
Now since F is a one-to-one function, F has an inverse F −1 . F −1 maps F Ord
to Ord. So, by the Axiom of Replacement, Ord is a set. This contradicts the fact
that Ord is a proper class.
2.4
Cardinals
Cardinality is a familiar concept seen in all branches of mathematics. It counts
the number of the elements of a set. Cardinality and cardinal are the same thing,
but intuitively, one may think of cardinality as a property and cardinal as a
number. Now we define cardinals formally using ordinals.
Definition 2.23 (§3 [9]). A cardinal is an ordinal κ such that
∀α ∈ Ord(α < κ ⇒ |α| < |κ|).
We say the cardinality of a set x is κ if |x| = |κ|.
Examples 2.24 (of cardinals).
1. All the natural numbers are the finite cardinals.
2. The infinite cardinals are called alephs, written ℵα ’s. The ordinal ω is the
first infinite cardinal, written ℵ0 . The first uncountably infinite cardinal is
ω1 , written ℵ1 .
3. The ordinal ω + 2 is not a cardinal, since ω < ω + 2 and |ω + 2| = |ω| = ℵ0 .
12
2.5
The universe of sets
Definition 2.25 (Definitions §III 2.1, 2.2 [12]). We define by induction the universe V of sets,
i) V0 = ∅,
(
ii) Vα =
iii) V =
S
P(Vα−1 ) if α is a successor
, and
S
V
if
α
is
a
limit
β
β<α
α∈Ord
Vα .
Since the Axiom of Foundation says there are no ill-founded sets, i.e. any (nonempty) set has an ∈–minimal element, it follows that any (non-empty) class has
the same property (see Lemma 6.2 [9]). Using this fact, we can show V contains
all the (well-founded) sets. (Thus V is not a set itself but a proper class.)
Lemma 2.26 (Lemma 6.3 [9]).
∀x(x ∈ V).
Proof. Let y = {x : x ∈
/ V}. If y = ∅, we are done; otherwise, let x be ∈-minimal
in y. Then we have
∀s ∈ x(s ∈ V);
if not, x is not ∈-minimal. So x ⊂ V and therefore there exists an obvious oneto-one map from x to V. ThenSsince x is a set, we use the Axiom of Replacement
to get an ordinal β with x ⊂ α<β Vα . Then
x ⊂ Vβ
⇒x ∈ P(Vβ ) = Vβ+1
⇒x ∈ V.
Definition 2.27. Let rank(x) = inf{β ∈ Ord : x ∈ Vβ+1 } be the rank for x ∈ V.
It is not hard to check the properties of V and Vα ’s listed in the following
lemma.
Lemma 2.28. Let α ∈ Ord.
1. Vα is transitive (Lemma § III 2.3 (a) [12]).
2. ∀γ ∈ Ord(γ ⊂ α ⇒ Vγ ⊂ Vα ) (Lemma § III 2.3 (b) [12]).
3. Vα ∩ Ord = α (Lemma § III 2.7 (b) [12]).
4. Vα = {x ∈ V : rank(x) < α} (Lemma § III 2.5 [12]).
13
3
Some equivalents of AC
The Axiom of Choice is obviously true,
the Well-ordering Principle obviously
false, and who can tell about Zorn’s
lemma?
Jerry Bona
The quote at the beginning of this section is indeed a joke, as we will show
the three things in the quote are equivalent.
AC has a large number of equivalents; they live in different branches of mathematics, including mathematical logic, set theory, order theory, algebra, general
topology, functional analysis, and so on [17]. Here we only list and prove a few
of them. For a more comprehensive review of AC’s equivalents, the interested
reader is referred to Rubin & Rubin’s 1963 monograph [17].
We will show the following statements are equivalent:
1. AC (the Axiom of Choice).
2. WOP (Zermelo’s Well-ordering Principle): every set can be well-ordered.
3. Tri (Trichotomy): ∀x∀y(|x| < |y| ∨ |y| < |x| ∨ |x| = |y|).
4. Zorn (Zorn’s lemma): if every chain in a non-empty partial ordered set
(x, ≺) has an upper bound, then x has a maximal element.
5. Vec: Every vector space has a basis.
6. Tych (Tychonoff’s Theorem): A product of compact spaces is compact.
Before starting the proof of the equivalences, we would like to point out that
since the use of AC often occurs in a very subtle way, sometimes it is easy to
make mistakes in the proof of AC from an equivalent of it. The wrong proof
below from a standard textbook is an example.
14
Theorem 3.1. Zorn ⇒ AC.
Proof. This is a wrong proof adapted from §1.3 [19].
Let {ui : i ∈ I} be a family of non-empty sets, for some index set I. Let F
denote the set of all the choice functions for
{ui : i ∈ J ∧ J ⊂ I}.
We want to show the set (F, ⊂) satisfies the assumption of Zorn’s lemma.
Let {fs : s ∈ S} be a ⊂–chain in F, for some S ⊂ I. Let
[
f=
fs
s∈S
be the common extension for {fs : s ∈ S}. Clearly, ∀s ∈ S(fs ⊂ f ). So f is
an upper bound of {fs : s ∈ S} in F. So by Zorn’s lemma, F has a maximal
element, say g.
It remains to show g is a choice functions for {ui : i ∈ I}. Suppose not. Since
g ∈ F, g is a choice function for, say, {ui : i ∈ J} for some J $ I. Let v ∈ I − J
and b ∈ uv . Define h ⊃ g such that h(uv ) = b. Then h ∈ F, contradicting the
maximality of g.
This proof of Zorn ⇒ AC by Srivastava [19] seems to be a routine exercise
using Zorn’s lemma. However, when the author chooses some v ∈ I − J and
b ∈ uv , he implicitly but indeed uses AC. Without assuming AC or any knowledge
of I − J and uv , it could happen that we have no way to choose elements from
I − J and uv .
Now we prove the equivalences listed above. We will prove the following
relations:
7? Tri
www
wwwww
w
w
www
w www
/ AC o
Tych
WOP
O
Fc F
w
FF
w
w
FF
ww
FF
ww
F
{ww
Zorn
Vec
The proof of Tychonoff’s Theorem using ultrafilter, a maximal set, and the proof
that every vector space has a basis using Zorn’s lemma, are omitted here since
they can be found, respectively, in most standard textbooks for general topology
(e.g. Theorem 7.4.2 [24]) and algebra (e.g. Theorem § 5.2 [7]).
Proposition 3.2. WOP ⇒ AC.
Proof. Let the minimal element be the choice.
15
Theorem 3.3 (Proposition 4.37 [15]). AC ⇒ Zorn
Proof. Let (x, ≺) be a non-empty partial ordered set such that every ≺–chain in
x has an upper bound in x. We want to show x has a maximal element.
Suppose f is a choice function for x. Let b = f (x). By induction we define a
function F : Ord → x by
(
b
if α = 0,
F (α) =
f (uα ) otherwise,
where
uα = {s ∈ x : s ∈
/ F [α] ∧ ∀t ∈ F [α](t ≺ s)}.
Note that
i) uα is the set of upper bounds of F [α] in x and uα ∩ F [α] = ∅; and
ii) F (·) is the image of a point, whereas F [ · ] is the image of a set.
Now there is an ordinal α such that uα = ∅; if not, by the argument similar
to that in Theorem 2.22, Ord would be a set. Let γ be the least such ordinal and
g = F γ. Observe, for each α > 0, that g(α) is an upper bound of g[α] in x, and
that g(α) ∈
/ g[α]. So g is a one-to-one function. Then g[γ] is a ≺–chain in x and
thus has an upper bound in x, say p. Since uγ = ∅, which states that there exists
no upper bounds of g[γ] that are in x − g[γ], we see p ∈ g[γ] and is unique.
We finish the proof by claiming that p is ≺–maximal in x. Suppose not.
Then there exists an s ∈ x such that s 6= p and p ≺ s. Then s ∈
/ F [γ] and
∀t ∈ F [γ](t ≺ s). So uγ 6= ∅, leading to contradiction.
Lemma 3.4 (Proposition 4.37 [15]). Zorn ⇒ WOP.
Proof. Let x be a set. Let F be all the sequences with an ordinal length of
elements in x, as described in Remark 2.19, but with no repeated element of x.
F 6= ∅, since the empty sequence ∅ ∈ F. By the argument similar to that in
Theorem 2.22, F is a set. Clearly, F is partial ordered by ⊂ and the union of any
⊂–chain in F is an upper bound. So by Zorn’s lemma, there exists a maximal
element in F, say f . Clearly, f is a sequence consisting of all the elements in
x. Let α ∈ Ord be the length of f and define a well-ordering for f similarly to
Theorem 2.22.
Proposition 3.5 (Proposition 4.37 [15]). WOP ⇒ Tri.
Proof. Follows Theorem 2.20 and Lemma 2.13.
16
Proposition 3.6 (Proposition 4.37 [15]). Tri ⇒ WOP.
Proof. By Hartogs’ Theorem (2.22) and Trichotomy,
∀x∃α ∈ x(|x| < |α|).
Then we can define a well-ordering for x by the argument similar to that in
Theorem 2.22.
Theorem 3.7 ([13]). Tych ⇒ AC.
Proof. Let {xi : i ∈ I} be a family of non-empty sets, for some index set I. We
want to show the Form 3 of AC, i.e.
Y
xi 6= ∅.
i∈I
Without loss of generality, assume {xi : i ∈ I} are pairwise disjoint.
Let α be an ordinal such that α ∈
/ xi for all i ∈ I. Such α exists by Hartogs’
Theorem. Now for each i ∈ I, let
yi = xi ∪ {α}
and equip it with the topology
{∅, {α}, xi , yi }.
Each yi is clearly compact. Let
y=
Y
yi
i∈I
equipped with the product topology.
For each i ∈ I, let
πi : y → yi
be the projection map and
ui = πi−1 ({α}).
Then ui ’s are open sets, since πi ’s are continuous and {α} is a basic open set.
Now for any finite J $ I,
[
uj
j∈J
does not contain the element
(. . . , {α}, . . .),
S
where {α} is the i-th entry with i ∈ I −J. It follows Sj∈J uj does not cover y and
so {ui : i ∈ I} has no finite subcover of y. Therefore i∈I ui too does not cover y;
otherwise, it has a finite subcoverQof y, since y is compact by Tychonoff’s Theorem.
Now since all yi ’s are different, i∈I xi must be non-empty, as required.
17
Theorem 3.8 (Theorem 2 [2]). Vec ⇒ AC.
Proof. Let {xi : i ∈ α} be a family of non-empty sets, where α is an ordinal. We
use {xi } to construct a vector space by the idea of the field of rational fractions,
and then find a finite set ui ∈ xi for each i ∈ α. For the background of the field of
rational fractions, one can consult any standard algebra textbook (e.g. S
p124 [7]).
Without loss of generality, assume {xi } are pairwise disjoint. Let x = i∈α xi .
For s ∈ x, let R[s] and R(s) be the field of polynomials and that of rational
fractions, respectively, with variable s and coefficients in R. Similarly let R[[x]] and
R((x)) denote the field of polynomials and that of rational fractions, respectively,
with variables in x and coefficients in R.
For an f ∈ R[[x]], we say f has [i]–degree n if the sum of the exponents of
elements in xi in each term of f is n. For an h ∈ R((x)), where h = f /g with
f, g ∈ R[[x]], we say h has (i)–degree n if
[i]–degree of f = n + [i]–degree of g.
For examples, if s, t ∈ x1 and p, q ∈ x2 , then
1
f = 2sp−2 + t2 p3 q 10 − t
2
has [1]–degree 4 and [2]–degree 11, and
h=
f
t2 q 15
has (1)–degree 2 and (2)–degree −4.
Now let
k = {h ∈ R((x)) : h has (i)–degree 0, i ∈ α}.
It is a routine to check that k is a subfield of R[[x]]. Then R[[x]] is a vector space
over k. So by our assumption, the subspace y spanned by x has a basis, say B.
Now for i ∈ α, s ∈ xi , and s 6= 0, by the definition of a basis, we can write
X
s=
a(s, b) · b,
b∈Bs
where Bs ⊂ B is finite, and a(s, b) ∈ k is non-zero and depends on s and b. Now
if t ∈ xi , we have
X
X
t
t=
a(t, b) · b =
a(s, b) · b.
s
b∈B
b∈B
t
s
Since a(t, b) and a(s, b)t/s are just coefficients, we see Bs = Bt and the s in
a(s, b)/s could be any element in xi . Since a(s, b) ∈ k so has (i)–degree 0;
18
therefore a(s, b)/s has (i)–degree −1. Then for i ∈ α, s ∈ xi , and s 6= 0, we can
write
X
s=
a(b) · b · s,
b∈Bi
where Bi ⊂ B is finite and only depends on i, and each a(b) has (i)–degree −1. Bi
is finite, so there are finitely many a(b)’s. Then we finish the proof by letting ui
be the finitely many elements in xi appearing in the denominators of a(b)’s.
19
4
Some results requiring AC
Mathematics would not be an international science if its theorems did not possess an objective content independent of
the language in which we express them.
Ernst Zermelo
In this chapter we list some results requiring AC in various branches of mathematics. We omit the proofs since they are all classic results and can all be found
in standard textbooks; we provide references instead. For a more comprehensive
list of the applications of AC, see Appendix 2 of Moore 1982 [16] and Chapter 2
of Jech 1973 [8].
4.1
Set theory
Theorem 4.1 (Schröder–Bernstein Theorem, Proposition 4.21 (d) [15], Theorem 1.2.3 [19]).
|x| ≤ |y| ∧ |y| ≤ |x| ⇒ |x| = |y|.
Theorem 4.2 (Lemma §I 10.21 [12]). Let κ be a cardinal. Then
[ κ ≥ ω ∧ ∀α ≤ κ(|xα | ≤ κ) ⇒ xα ≤ κ.
α≤κ
Corollary 4.3. Any union of finitely or countably many countable sets is itself
countable.
4.2
General topology
Theorem 4.4 (Theorem 11.3.7 [24]). A uniform space is compact if and only if
it is complete and totally bounded.
20
4.3
Algebra
Theorem 4.5 (Theorem §IV 4.4 [7]). Every field has an algebraic closure.
4.4
Group theory
Theorem 4.6 (Nielsen–Schreier Theorem, 1.5 [10] ). Every subgroup of a free
group is free.
4.5
Measure theory
Theorem 4.7 (Theorem 3.23 [23]). Let {xn : n ∈ ω} be countable disjoint measurable sets and µ be the Lebesgue measure. Then
[ X
µ(xn ).
µ
xn =
n∈ω
n∈ω
Theorem 4.8 (Vitali Theorem, 3.38 [23]). There exist set that are not Lebesgue
measurable.
4.6
Functional Analysis
Theorem 4.9 (Hahn–Banach Theorem, 12.4.1 [24]). Let x be a normed space
and y ⊂ x a vector subspace. Then every bounded linear functional on y can be
extended to a linear functional on x with the same norm.
Theorem 4.10 (Lemma 3.20 [18]). Every Hilbert space has an orthonormal basis.
21
5
The consistency of AC in ZF
With the increasing recognition that
there are questions in mathematics
which cannot be decided without this axiom [of choice], the resistance to it must
increasingly disappear.
Ernst Steinitz
5.1
Overview of the consistency and its proof
The proof of the consistency of AC in ZF, denoted
Con(ZF + AC) or Con(ZFC),
was first outlined in Kurt Gödel’s 1938 paper [5] and was fully given in his 1940
monograph [6].
To prove Con(ZFC), we in fact prove
Con(ZF) ⇒ Con(ZFC).
By the Gödel Second Incompleteness Theorem, ZF cannot prove its own consistency. So we take Con(ZF) as an assumption.
By the Gödel Completeness Theorem, if ZFC is consistent, it has a model. It
follows that if ZFC has a model, it is consistent. Therefore, while Con(ZFC) can
be proved within model theory [9], the unusual approach people take is to find a
model for ZFC.
We adopt Gödel’s approach.
5.2
A quick review of model theory
Definition 5.1 (Definition 1.1.1 [14]). A language L = {F, R, C} consists a set
F of functions, a set R of relations, and a set C of constants.
Definition 5.2 (Definition 1.1.2 [14]). A model M for a language L is a pair
(M, I), where M is a set, called the universe of M, and I is the interpretation of
L in the universe M. We write M = (M, F L , RL , C L ).
22
Examples 5.3.
1. The language for set theory is just {∈}. {Vα , ∈} is a model for {∈}, for
α ∈ Ord.
2. The language for additive groups is {+, e}, where + is a binary function
and e is constant. A set g is a group if it satisfies the axioms
∀x(e + x = x + e = x),
∀x∀y∀z(x + (y + z) = (x + y) + z),
∀x∃y(x + y = y + x = e).
A group g is Abelian if it satisfies the additional axiom
∀x∃y(x + y = y + x).
The set F of all the formulas in set theory is an Abelian group with the
following group structure:
i) let e be ∅, the empty formula;
ii) let ϕ ∧ ψ be ϕ + ψ;
ii) let ¬ϕ be the inverse of ϕ.
In addition, one can define multiplicative groups in a similar fasion.
3. The language for ring theory is {+, −, ·, 0, 1}, where +, −, and · are binary
functions and 0 and 1 are constants. An additive Abelian group R is a ring
is it satisfies axioms
∀x(x · 0 = 0 ∧ x · 1 = 1 · x = x),
∀x∀y∀z x · (y · z) = (x · y) · z
∧x−y =z ⇔x=y+z
∧ x · (y + z) = (x · y) + (x · z)
∧ x · (y + z) = (x · y) + (x · z) .
If x is a set, define
i) P(x) = {s ⊂ x}, the power set of x;
ii) 0L = ∅;
iii) 1L = x;
iv) for s, t ⊂ x, s + t = (s ∪ t) − (s ∩ t);
v) s = −s;
23
vi) s · t = s ∩ t.
Then (P(x), +, −, ∩, ∅, x) is a model for ring theory.
Definition 5.4 (Definition 1.1.4 [14]). Given L = {F, R, C}, the set of L -terms
is the smallest set T such that
i) C ⊂ T
ii) {vi is a variable : i ∈ N} ⊂ T
iii) if x1 , . . . , xn ∈ T and f ∈ F , then f (x1 , . . . , xn ) ∈ T .
Definition 5.5 (Definition 1.1.5 [14]). Given L = {F, R, C}, if r ∈ R and
t1 , . . . tn ∈ T , then ϕ is an atomic L-formula if it is in the form of t1 = t2
or r(t1 , . . . tn ). The set of L-formulas is the closure of atomic L-formulas under
the ¬, ∧, ∨, ∃, and ∀ operations.
A variable v in a formula ϕ(. . . , v, . . .) is bounded if it is inside a quantifier ∃v
or ∀v; otherwise, it is free. A quantifier ∃v or ∀v in ϕ(. . . , v, . . .) is restricted if v
has a domain x, i.e. ∃v ∈ x or ∀v ∈ x; otherwise, it is unrestricted. A formula is
a sentence if it has no free variables.
A sentence in a model is always true or false. A formula with free variables
shows a property of the free variables rather than anything true or false. For
example, ∃x ∈ R(x > 0) is a true sentence while x > 0 shows a property of x.
Definition 5.6 (Definition 1.1.6 [14]). By induction on the length of a term t ∈ T
of a model M, we define its value tM (a) over a finite sequence a = (a1 , . . . , an )
of M as follows.
i) if t is the constant c, then tM (a) = cM ;
ii) if t is the variable vi , then tM (a) = ai ;
iii) if t is the term f (t1 , . . . , tm ), where f ∈ F is a function and t1 , . . . , tm ∈ T
M
are terms, then tM (a) = f M (tM
1 (a), . . . , tm (a)).
24
Definition 5.7. By induction on the complexity of a formula ϕ of a model M,
we define its satisfaction ϕM over a finite sequence a = (a1 , . . . , an ) of M as
follows:
(t = s)M iff tM (a) = sM (a)
M
M
(r(t1 , . . . , tm ))M iff (tM
1 (a), . . . , tm (a)) ∈ r
(¬ψ)M iff ¬(ψ M )
(ψ ∧ φ)M iff ψ M and φM
(ψ ∨ φ)M iff ψ M or φM
(∃w(ψ(v1 , . . . vn , w)))M iff ∃b ∈ M (ψ(a1 , . . . , an , b))M
(∀w(ψ(v1 , . . . vn , w)))M iff ∀b ∈ M (ψ(a1 , . . . , an , b))M
where t, s ∈ T , r ∈ R, and ψ and φ are formulas.
If ϕM , we say ϕ holds or is true in M. We also write M |= ϕ(a) or |=M ϕ(a).
In the rest of this paper, we restrict our interest to {∈}, the language of set
theory.
Definition 5.8 (Definition 12.6 [9]). Let M = (M, ∈) be a model for the language of set theory, where M is a class, and let ϕ(v1 , . . . , vn ) be a formula for it.
We now define ϕM , the Relativization or the the counterpart of satisfaction for
classes of ϕ to M, by induction on the complexity of ϕ as follows:
(vi1 ∈ vi2 )M iff vi1 ∈ vi2
(vi1 = vi2 )M iff vi1 = vi2
(¬ψ)M iff ¬(ψ M )
(ψ ∧ φ)M iff ψ M ∧ φM
(ψ ∨ φ)M iff ψ M ∨ φM
(∃x(ψ))M iff ∃x ∈ M (ψ M )
(∀x(ψ))M iff ∀x ∈ M (ψ M )
where vi1 and vi2 are variables, and ψ and φ are formulas.
Since there is no confusion, we write M instead of M. So similarly, we can
write M |= ϕ or |=M ϕ.
25
5.3
Absoluteness
Definition 5.9 (Definition §IV 3.5 [12]). A ∆0 -formula is a formula ϕ (in ZF)
in the form of
i) an atomic formula, or
ii) ¬ψ, ψ ∧ φ, ψ ∨ φ, ψ ⇒ φ, or ψ ⇔ φ, or
iii) ∃x ∈ y(ψ) or ∀x ∈ y(ψ),
where ψ and φ are ∆0 -formula. We say ϕ is ∆0 .
First we notice that a ∆0 -formula has no unrestricted quantifiers. Then given
a formula ϕ with unrestricted quantifiers, say ∃x(ψ) with ψ ∆0 ,
(∃x(ψ))M = ∃x ∈ M (ψ M )
is ∆0 . So we have proved
Proposition 5.10. The relativization of a formula is ∆0 .
Definition 5.11. A formula is a Σ1 -formula or Π1 -formula if it can be written,
respectively, in the form of ∃x(ψ) or ∀x(ψ), where ψ is a ∆0 -formula. A formula
is a ∆1 -formula if it is both Σ1 and Π1 .
Proposition 5.12. A ∆0 -formula ϕ is ∆1 .
Proof.
ϕ ⇔ ∃x(x ∈ ω ∧ ϕ)
and
ϕ ⇔ ∀x(x ∈ ω ⇒ x ∈ R ∧ ϕ).
Lemma 5.13 (Lemma 13.10 [9]). Let ϕ(v, . . .) and ψ(v, . . .) be two ∆0 -formulas.
Then
1. the following formulas are Σ1 :
¬∀x ϕ(v, x, . . .) ,
∃x1 . . . ∃xn ϕ(x1 , . . . , xn , v, . . .) ,
∃x ϕ(v, x, . . .) ∧ ∃y ψ(v, x, . . .) , and
∃x ϕ(v, x, . . .) ∨ ∃y ψ(v, x, . . .) ;
26
2. the following formulas are Π1 :
¬∃x ϕ(v, x, . . .) ,
∀x1 . . . ∀xn ϕ(x1 , . . . , xn , v, . . .) ,
∀x ϕ(v, x, . . .) ∧ ∀y ψ(v, x, . . .) , and
∀x ϕ(v, x, . . .) ∨ ∀y ψ(v, x, . . .) .
Proof.
1.
¬∀x(ϕ) iff ∃x(¬ϕ),
∃x1 . . . ∃xn ϕ iff ∃s∃t ∈ s∃x1 ∈ t . . . ∃xn ∈ t(s = (x1 , . . . , xn ) ∧ ϕ),
∃x(ϕ) ∧ ∃y(ψ) iff ∃x∃y(ϕ ∧ ψ),
∃x(ϕ) ∨ ∃y(ψ) iff ∃x∃y(ϕ ∨ ψ).
2.
¬∃x(ϕ) iff ∀x(¬ϕ),
∀x1 . . . ∀xn (ϕ) iff ∀x1 . . . ∀xn−1 ¬∃xn (¬ϕ)
iff ∀x1 . . . ∀xn−2 ¬∃xn−1 ∃xn (¬ϕ)
..
.
iff ¬∃x1 . . . ∃xn (¬ϕ)
iff ¬∃s∃t ∈ s∃x1 ∈ t . . . ∃xn ∈ t(s = (x1 , . . . , xn ) ∧ ¬ϕ)
iff ∀s¬ ∃t ∈ s∃x1 ∈ t . . . ∃xn ∈ t(s = (x1 , . . . , xn ) ∧ ¬ϕ) ,
∀x(ϕ) ∧ ∀y(ψ) iff ∀x∀y(ϕ ∧ ψ),
∀x(ϕ) ∨ ∀y(ψ) iff ∀x∀y(ϕ ∨ ψ).
We will work in transitive models and show Con(ZFC). The relation between
a ∆0 -formula ϕ and a transitive model M , to which we wish to relativize ϕ, will be
essential in our proof, as we will see in lemma 5.18. Since all the quantifiers in ϕ
27
are unrestricted, so each bounded variable has a domain sitting in M . Therefore,
thanks to the transitivity of M , all the variables in ϕM also sit in M .
∆0 -formulas are defined purely by logic symbols. The following two lemmas
show that some familiar operations or expressions are ∆0 .
Definition 5.14. An expression is a ∆0 -expression or simply ∆0 if it can be
written as a ∆0 -formula. An operation T is a ∆0 -operation or simply ∆0 if
z = T (. . .) is a ∆0 -expression. Similarly we define Σ1 , Π1 , and ∆1 -expressions
and -operations.
Lemma
(Lemma 12.10 [9], Theorem §IV 3.9 [12]). {·, ·}, (·, ·), ⊂, ×, −,
T 5.15
S
∩, ∪, , , dom, ran, f (·), and f are ∆0 -operations, where f is a function.
Proof. Let x, y be sets. Since x = y iff x ⊂ y ∧ y ⊂ x, then we have
z = {x, y} iff x ∈ z ∧ y ∈ z ∧ ∀u ∈ z(u = x ∨ u = y),
z = (x, y) iff ∃u ∈ z(u = {x, x}) ∧ ∃u ∈ z(u = {x, y})
∧ ∀u ∈ z(u = {x, x} ∨ u = {x, y}),
x ⊂ z iff ∀u ∈ x(u ∈ z),
z = x × y iff ∀u ∈ z∃s ∈ x∃t ∈ y(u = (s, t)) ∧ ∀s ∈ x∀t ∈ y∃u ∈ z(u = (s, t)),
z = x − y iff z ⊂ x ∧ ¬(∃u ∈ z(u ∈ y)) ∧ ∀u ∈ x(u ∈ y ∨ u ∈ z),
z = x ∩ y iff ∀u ∈ z(u ∈ x ∧ u ∈ y) ∧ ∀u ∈ x(u ∈ y ⇔ u ∈ z)
∧ ∀u ∈ y(u ∈ x ⇔ u ∈ z),
z = x ∪ y iff ∀u ∈ z(u ∈ x ∨ u ∈ y) ∧ x ⊂ z ∧ y ⊂ z,
z=
\
x iff ∀u ∈ z∀s ∈ x(u ∈ s) ∧ ∀s ∈ x(z ⊂ s),
z=
[
x iff ∀u ∈ z∃s ∈ x(u ∈ s) ∧ ∀s ∈ x(s ⊂ z),
z = dom(x) iff ∀u ∈ z∃s ∈ x∀t ∈ s(u ∈ t)
∧ ∀s ∈ x∀t ∈ s∀u, v ∈ t(s = (u, v) ⇒ u ∈ z),
28
\
z = ran(x) iff ∀u ∈ z∃s, t ∈ x∃p ∈ t p =
t ∧ ∃v ∈ p(s = (v, u))
\
∧ ∀s ∈ x∃t ∈ x∀p ∈ t∃u ∈ z p =
t ⇒ ∃v ∈ p(s = (v, u)) ,
z = f (x) iff ∃u ∈ f (u = (x, z)), and
g = f x iff g ⊂ f ∧ dom(g) = dom(f ) ∩ x.
Lemma 5.16 (Lemma 12.10 [9], Theorems §IV 3.11, 5.1 [12]). The following
expressions are ∆0 : x = ∅ = 0, x is transitive, α ∈ Ord, α is a limit ordinal,
z = ω, n is a natural number, r is a relation, and f is a function.
Proof.
x = ∅ = 0 iff ¬(∃u ∈ z(u = u)),
x is transitive iff ∀s ∈ x(s ⊂ x),
α ∈ Ord iff α is transitive ∧ ∀s, t ∈ x(s ∈ t ∨ s = t ∨ t ∈ s)
∧ ∀s, t, u ∈ s(s ∈ t ∧ t ∈ u ⇒ s ∈ u),
α is a limit ordinal iff α ∈ Ord ∧ ∀s ∈ x∃t ∈ x(s ∈ t),
z = ω iff ¬(z = 0) ∧ z is a limit ordinal ∧ ∀u ∈ z(¬(u is a limit ordinal)),
n is a natural number iff n ∈ ω,
r is a relation iff ∀u ∈ r∃s ∈
[
u∃t ∈
[
u(u = (s, t)), and
f is a function iff f is a relation
[[ ∧ ∀s, t, u ∈
f (s, t) ∈ f ∧ (s, u) ∈ f ⇒ t = u .
29
Definition 5.17. A formula ϕ is absolute for a model M if for all v1 , . . . , vn ,
ϕM (v1 , . . . , vn ) ⇔ ϕ(v1 , . . . , vn ).
Lemma 5.18 (Lemma 12.9 [9]). A ∆0 -formula is absolute for a transitive M .
Proof. We prove by induction on the complexity of a formula ϕ. It is straightforward if ϕ is atomic, ¬ψ, ψ ∧ φ, ψ ∨ φ, ψ ⇒ φ, or ψ ⇔ φ, assuming the result
holds for ψ and φ.
Now assuming ψ = ψ(y, . . .) is a ∆0 -formula, if ϕ is ∃x ∈ y(ψ), then
∃x ∈ y(ψ)
⇔∃x(x ∈ y ∧ ψ)
⇔∃x(x ∈ y ∧ ψ M )
⇔∃x(x ∈ y ∧ x ∈ M ∧ ψ M )
⇔(∃x(x ∈ y ∧ ψ))M
⇔(∃x ∈ y(ψ))M .
∀x ∈ y(ψ) is similar.
Lemma 5.19. A ∆1 -formula ϕ is absolute for a transitive M .
Proof. First write ϕ in the form of ∃x(ψ), where ψ is a ∆0 -formula. Then
ϕM ⇔ (∃x(ψ))M
⇔ ∃x ∈ M (ψ)M
⇔ ∃x ∈ M (ψ)
⇒ ∃x(ψ)
⇔ ϕ.
by lemma 5.18
For the other direction, write ϕ in the form of ∀x(ψ), where ψ is a ∆0 -formula.
Then
ϕ ⇔ ∀x(ψ)
⇒ ∀x ∈ M (ψ)
⇔ ∀x ∈ M (ψ)M
by lemma 5.18
M
⇔ (∀x(ψ))
⇔ ϕM .
30
5.4
Constructible sets
Definition 5.20. A subset x of a set M (as an {∈}-model) is definable over M
if
x = {s ∈ M : ϕM (s, a)},
where a = (a1 , . . . , an ) is a finite sequence over M .
Intuitively, a definable set X is uniquely defined by some property. But its
careful definition by a formula in the formal language avoids the following false
paradox:
Define the first undefinable ordinal λ by the property that λ is a
undefinable ordinal and ∀α < λ(α is definable). Then we have defined
an undefinable ordinal.
Notation 5.21. Let
D(M ) = {X ⊂ M : X is definable over M }.
Definition 5.22 (Definitions §VI 1.4, 1.5 [12]). We now define by induction the
universe L of constructible sets, in which we will show Con(ZFC):
i) L0 = ∅,
(
ii) Lα =
iii) L =
S
D(Lα−1 )
S
β<α Lβ
α∈Ord
if α is a successor
, and
if α is a limit
Lα .
Remark 5.23. Being constructible and definable are not the same thing. A set
x is constructible if and only if x ∈ L, whereas definability can be only talked
about in the context of a model.
Definition 5.24. The Axiom of Constructibility states V = L, i.e.
∀x∃α ∈ Ord(x ∈ Lα ).
Our next task is to show L is a model for ZF + V = L. After that, the job
becomes straightforward: just to construct a well-ordering for all the constructible
sets.
Think carefully the Axiom of Constructibility. It is trivial that any set in L
is constructible. But as we have seen the definitions of V and L, which seem
(intuitively) very different, it is unlikely that any arbitrary set is constructible.
However, it is indeed a fact that ZF + V = L is consistent (if we can show L is
model of them). To show this, we need to show “x is constructible” is absolute
for L, i.e. (x is constructible)L ⇔ x is constructible. First we see some properties
of L and Lα .
31
Lemma 5.25 (Lemma §VI 1.3 [12]). Let M be a transitive set.
1. If x and y are definable over M , so are M − x, x ∩ y, and x ∪ y.
2. M ⊂ D(M ) ⊂ P(M ).
3. If x ⊂ M is finite, then x is definable.
4. (AC) If M is infinite, then |D(M )| = |M |.
Proof.
1. Let x = {s ∈ M : ψ M (s, a)} and y = {s ∈ M : φM (s, b)}. Then
M − x = {s ∈ M : ¬(ψ M (s, a))}
x ∩ y = {s ∈ M : ψ M (s, a) ∧ φM (s, b)}, and
x ∪ y = {s ∈ M : ψ M (s, a) ∨ φM (s, b)}.
2. Let y ∈ M . Since M is transitive, y = {x ∈ y} = {x ∈ M : x ∈ y} and
thus is definable. D(M ) ⊂ P(M ) follows the definition of D(M ).
3. Follows from 1 and 2.
4. By Theorem 4.2, ∀n ∈ ω(|M n | = |M |). |D(M )| ≥ |M | follows from M ⊂
D(M ).
Definition 5.26. Let ρ(x) = inf{β ∈ Ord : x ∈ Lβ+1 } be the L-rank for x ∈ L.
Some properties of L and Lα listed in Lemmas 5.27 and 5.28 can be routinely
checked.
Lemma 5.27. Let α ∈ Ord.
1. Lα and L are transitive.
2. Lα ∈ Lα+1 (Lemma §VI 1.10 [12]).
3. ∀β < α(Lβ ⊂ Lα ) (Lemma §VI 1.6 [12]).
4. Lα = {x ∈ L : ρ(x) < α} (Lemma §VI 1.8 [12]).
5. Lα ∩ Ord = α (Lemma 13.2 [9]).
6. α ⊂ Lα and ρ(α) = α (Lemma §VI 1.9 (a) [12]).
32
Lemma 5.28.
1. ∀α ∈ Ord(Lα ⊂ Vα ) (Lemma §VI 1.11 [12]).
2. ∀n ≤ ω(Ln = Vn ) (Lemma §VI 1.13 [12]).
Lemma 5.29 (Exercise 12.16 [9]). Let ϕ be a formula. Then there exists an
arbitrarily large ordinal α such that for all v1 , . . . , vn ∈ Lα ,
ϕL (v1 , . . . , vn ) ⇔ ϕLα (v1 , . . . , vn ).
Proof. We prove by induction on the complexity of ϕ. It is straightforward if ϕ
is atomic, ¬ψ, ψ ∧ φ, ψ ∨ φ, ψ ⇒ φ, or ψ ⇔ φ, assuming the result holds for ψ
and φ. Since
∃x ∈ y(ϕL ) ⇔ ∃x ∈ L(x ∈ y ∧ ϕL )
and
∀x ⇔ ¬∃x¬,
it remains to show if the result holds for ϕ, then it also holds for ∃x(ϕ). Thus
we finish the proof by seeing, for arbitrarily large α and v1 , . . . , vn ∈ Lα ,
∃x ∈ Lα (ϕLα (x, v1 , . . . , vn ))
⇔∃x ∈ Lα (ϕ(x, v1 , . . . , vn ))
⇔∃x ∈ L(ϕ(x, v1 , . . . , vn )).
The first equivalence is by the assumption of ϕ, and the second one is by the
arbitrary largeness of α.
Theorem 5.30 (Theorem 13.3 [9]). L is a model for ZF.
Proof. We prove σ L holds for every axiom σ in ZF.
Extensionality. We want to show
L |= ∀x∀y(ϕ(x, y)),
where ϕ is
(∀u ∈ x(u ∈ y) ∧ ∀u ∈ y(y ∈ x)) ⇒ x = y.
But ϕ is a ∆0 -formula and thus absolute.
Foundation. We want to show
L |= ∀x(x 6= ∅ ⇒ ∃u ∈ x(u ∩ x) = ∅).
Since ∅L = ∅, let ∅ 6= x ∈ L. Then ∃u ∈ x(u ∩ x) = ∅. L is transitive so
u ∈ L, and u ∩ x is ∆0 .
33
Pairing. We want to show
L |= ∀x∀y∃z(z = {x, y}).
Let x, y ∈ L and define z = {x, y}. Then ∃α ∈ Ord(x ∈ Lα ∧ y ∈ Lα ). So
z ∈ Lα+1 since {x, y} is definable on Lα . And z = {x, y} is ∆0 .
Union. We want to show
[
L |= ∀x∃y(y =
x).
S
Let x ∈ L and define y = x. Since L is transitive, z ∈ L for all z ∈ x.
Use the transitivity of L again, we have
∀z ∈ x(u ∈ z ⇒ u ∈ L).
Therefore
y ⊂ L. So there exists an ordinal α such that x ∈ Lα and y ⊂ Lα .
S
t ∈ s is ∆0 . So
[
[
y = {u ∈ Lα : u ∈
x} = {u ∈ Lα : Lα |= u ∈
x}
is Lα -definable. Then
∀x ∈ L∃y ∈ L(y =
[
x)
[
x)L
⇔∀x ∈ L∃y ∈ L(y =
[ L
⇔ ∀x∃y(y =
x) .
Power Set. We want to show
L |= ∀x∃y(ϕ(x, y)),
where ϕ is
∀u(u ⊂ x ⇔ u ∈ y).
Let x ∈ L and y = P(x) ∩ L. There exists an α such that y ∈ Lα . Lα is
transitive, so y ⊂ Lα . Then
y = {u ∈ Lα : u ⊂ x}.
Clearly, ϕ is ∆0 and y satisfies ϕ.
Infinity. It suffices to show L contains an (infinite) inductive set , i.e.
L |= ∃x(ϕ(x, u)),
where ϕ is
∅ ∈ x ∧ ∀u ∈ x({u, {u}} ∈ x).
We have already shown above ∅L = ∅ and the Axiom of Pairing holds in L.
So ω ∈ L satisfies ϕ.
34
Replacement. We want to show if ϕ is a formula such that
L |= ∀x∀y∀z(ϕ(x, y) ∧ ϕ(x, z) ⇒ y = z),
where we write ϕ(x, y) instead of ϕ(x, y, . . .) for convenience, then
L |= ∀x∃y∀u u ∈ y ⇔ ∃s ∈ x(ϕ(s, u)) .
Let x ∈ L and define χ(s, α) to be
α ∈ Ord ∧ ∃t ∈ Lα (ϕL (s, t)) ∧ ∀β < α∀u ∈ Lβ (¬ϕL (s, u)).
Note that for χ(s, α),
α = min{ρ(t) : ϕL (s, t)},
where ρ(t) is the L-rank of t. Then clearly, if χ(s, α) holds for some s, α is
unique. Now by the Axiom of Replacement, there exists y such that
∀α α ∈ y ⇔ ∃s ∈ x(χ(s, α)) .
S
Let γ = (y ∩ Ord). Then γ is an ordinal, and
∀s ∈ x∀α ∈ Ord(χ(s, α) ⇒ α < γ).
By the definition of χ,
∀s ∈ x∀t ∈ L∃α ∈ Ord(ϕL (s, t) ⇒ t ∈ Lα ).
But t ∈ Lα ⇒ t ∈ Lγ . So Lγ satisfies
L |= ∀u u ∈ Lγ ⇔ ∃s ∈ x(ϕ(s, u)) .
Separation. We want to show if ϕ is a formula, then
L |= ∀x∀p1 . . . ∀pn ∃y∀s s ∈ y ⇔ s ∈ x ∧ ϕ(s, p1 , . . . , pn ) .
Let x, p1 . . . , pn ∈ L and define
y = {s ∈ x : ϕL (s, p1 , . . . , pn )}.
It is easy to see by lemma 5.29 that there exists an α such that
y = {s ∈ Lα : s ∈ x ∧ ϕLα (s, p1 , . . . , pn )}.
Therefore y ∈ L and satisfies
∀s ∈ L s ∈ y ⇔ s ∈ x ∧ ϕ(s, p1 , . . . , pn ) .
35
5.5
Gödel operations
Definition 5.31. A Gödel operation is one of or a composition of the following
basic (Gödel) operations:
g1 (x, y) = {x, y},
g2 (x, y) = x × y,
g3 (x, y) = {(u, v) ∈ x × y : u ∈ v},
g4 (x, y) = x − y,
[
g5 (x) =
x,
g6 (x) = dom(x),
g7 (x × y) = y × x.
Proposition 5.32. The following operations are Gödel operations:
1. g8 (x, y) = x ∩ y.
2. g9 (x) = ran(x).
3. g10 (x × y × z) = x × z × y.
Proof.
1. g8 (x, y) = x ∩ y = x − (x − y).
2. g9 (x) = ran(x) = dom(g7 (x)).
3. Since x × y × z = x × (y × z), then
g10 (x × (y × z)) = x × (z × y)
= dom x × (y × z) × g7 ran(x × (y × z)) .
Observe that operations g1 to g10 are in fact the set operations to generate
new sets as
S a group of axioms of set theory [6]. Recall Lemma 5.15 that {·, ·},
×, −, ∩, , dom, and ran are ∆0 -operations. One may conjecture that all the
Gödel operations are ∆0 . We prove this now.
Lemma 5.33 (Lemma 13.7 [9]). Any Gödel operation G is a ∆0 -operation.
Proof. We prove by induction on the complexity of G. To show
i) z = G(. . .) is ∆0 ,
36
it is equivalent to show
∀u ∈ z(u ∈ G(. . .)) ∧ ∀u ∈ G(. . .)(u ∈ z)
is ∆0 ; thus it is also equivalent to show
ii) u ∈ G(. . .) is ∆0 , and
iii) if ϕ is a ∆0 -formula, then ∀u ∈ G(. . .)(ϕ(u, . . .)) is ∆0 . (Similarly, ∃u ∈
G(. . .)(ϕ(u, . . .)) is also ∆0 .)
First we show the basic Gödel operations g1 to g7 are ∆0 . Since we have already
seen in Lemma 5.15 that g1 , g2 , g4 , g5 , and g6 are ∆0 , it remains to show g3 and
g7 are also ∆0 . This is true since
z = g3 (x, y) = {(u, v) ∈ x × y : u ∈ v}
iff ∀s ∈ z∃u ∈ x∃v ∈ y(u ∈ v ∧ s = (u, v))
∧ ∀u ∈ x∀v ∈ y∃s ∈ z(u ∈ v ⇒ s = (u, v)),
and
z = g7 (x) = {(u, v) : (v, u) ∈ x} iff
[[
∀p ∈ z∃s ∈ x∃u, v ∈
x(s = (v, u) ∧ p = (u, v))
[[
∀s ∈ x∀u, v ∈
x∃p ∈ z(s = (v, u) ⇒ p = (u, v)).
Therefore, i), ii) and iii) hold for the basic cases.
We now proceed to the inductive steps. Assume i), ii) and iii) hold for G(. . .)
and H(. . .), two Gödel operations. We would like to show the basic Gödel operations on G(. . .) and H(. . .) are ∆0 . For convenience, we drop (. . .) in G(. . .) and
H(. . .).
z = g1 (G, H) = {G, H}
iff ∀u ∈ z(u = G ∨ u = H) ∧ ∃u, v ∈ z(u = G ∧ v = H),
z = g2 (G, H) = G × H
iff ∀u ∈ z∃s ∈ G∃t ∈ H(u = (s, t)) ∧ ∀s ∈ G∀t ∈ H∃u ∈ z(u = (s, t)),
z = g3 (G, H) = {(u, v) ∈ G × H : u ∈ v}
iff ∀s ∈ z∃u ∈ G∃v ∈ H(u ∈ v ∧ s = (u, v))
∧ ∀u ∈ G∀v ∈ H∃s ∈ z(u ∈ v ⇒ s = (u, v)),
37
z = g4 (G, H) = G − H
iff ∀u ∈ z(u ∈ G) ∧ ¬(∃u ∈ z(u ∈ H)) ∧ ∀u ∈ G(u ∈ H ∨ u ∈ z),
z = g5 (G) =
[
G
iff ∀u ∈ z∃s ∈ G(u ∈ s) ∧ ∀s ∈ G(s ⊂ z),
z = g6 (G) = dom(G)
iff ∀u ∈ z∃s ∈ G∀t ∈ s(u ∈ t)
∧ ∀s ∈ G∀t ∈ s∀u, v ∈ t(s = (u, v) ⇒ u ∈ z),
z = g7 (G) = {(u, v) : (v, u) ∈ G}
[[
iff ∀p ∈ z∃s ∈ G∃u, v ∈
G(s = (v, u) ∧ p = (u, v))
[[
∀s ∈ G∀u, v ∈
G∃p ∈ z(s = (v, u) ⇒ p = (u, v)).
For g1 , we proved directly. For
g2 to g7 , we used inductive hypotheses. For g7 ,
SS
we are allowed to use ∀u ∈
G since we already proved the g5 case. This
completes the proof.
Having seen that any Gödel operation can be somehow expressed by a ∆0 formula, we naturally ask if the reverse is also true. Next we show the set defined
by a ∆0 -formula is a Gödel operation.
Theorem 5.34 (Gödel’s Normal Form Theorem, 13.7 [9]). Given a ∆0 -formula
ϕ(v1 , . . . , vn ) and a finite sequence x1 , . . . , xn of sets , these exists a Gödel operation G on x1 , . . . , xn such that G is defined by ϕ, i.e.
G(x1 , . . . , xn ) = {(v1 , . . . , vn ) ∈ x1 × . . . × xn : ϕ(v1 , . . . , vn )}.
Proof. Let v = (v1 , . . . , vn ) and x = x1 × . . . × xn . We prove by induction on
the complexity of ϕ. As we see the definition 5.9, the induction is done by the
number of symbols ∈, =, ∧, ∨, ¬, ⇒, ⇔, and restricted quantifiers, ∃vi ∈ vj and
∀vi ∈ vj with i 6= j.
We can assume ϕ contains only ∈, ∧, ¬, and restricted ∃ since
a∨b
∀a ∈ b(ψ)
a=b
a⇒b
a⇔b
iff
iff
iff
iff
iff
¬(¬a ∧ ¬b),
¬∃a ∈ b(¬ψ),
∀u ∈ a(u ∈ b) ∧ ∀u ∈ b(u ∈ a),
b ∨ ¬a,
a ⇒ b ∧ b ⇒ a.
38
Further, if ϕ is ∃vs ∈ vt (ψ(. . . , vs , . . . , vt , . . .)), we can assume ϕ is
∃vs ∈ vt (ψ(. . . , vt , . . . , vs )),
so that the quantified variable vs is of the highest index, because this is just a
matter of renaming.
Case 1 We prove for an atomic formula ϕ, i.e. vi ∈ vj where i 6= j, by induction
on the number of variables in ϕ (allowing dummy variables).
Case 1a If n = 2, then
{(v1 , v2 ) ∈ x1 × x2 : v1 ∈ v2 } = g3 (x1 , x2 )
and
{(v1 , v2 ) ∈ x1 × x2 : v2 ∈ v1 } = g7 (g3 (x2 , x1 )).
Case 1b If n > 2 and i, j 6= n, suppose there exists a G such that
G(x1 , . . . , xn−1 ) = {(v1 , . . . , vn−1 ) ∈ x1 × . . . × xn−1 : vi ∈ vj }.
Then
{v ∈ x : vi ∈ vj ∧ i, j 6= n} = G(x1 , . . . , xn−1 ) × xn .
Case 1c If n > 2 and i, j 6= n − 1, by Case 1b we can assume there exists a G
such that
G(x1 , . . . , xn )
={(v1 , . . . , vn−2 , vn , vn−1 ) ∈ x1 × . . . × xn−2 × xn × xn−1 : vi ∈ vj ∧ i, j 6= n − 1}
={((v1 , . . . , vn−2 ), vn , vn−1 ) ∈ (x1 × . . . × xn−2 ) × xn × xn−1 : vi ∈ vj ∧ i, j 6= n − 1}.
Then
{v ∈ x : vi ∈ vj ∧ i, j 6= n − 1}
={((v1 , . . . , vn−2 ), vn−1 , vn ) ∈ (x1 × . . . × xn−2 ) × xn−1 × xn : vi ∈ vj ∧ i, j 6= n − 1}
=g10 (G(x1 , . . . , xn )).
Case 1d If n > 2 and i = n − 1, j = n, then
{v ∈ x : vn−1 ∈ vn } = x1 × . . . × xn−2 × g3 (xn−1 , xn ).
Case 1e If n > 2 and i = n, j = n − 1, then
{v ∈ x : vn ∈ vn−1 } = g10 ((x1 × . . . × xn−2 ) × g3 (xn , xn−1 )).
39
Case 2 If ϕ(v) is ¬ψ(v), suppose there exists a G such that
G(x1 , . . . , xn ) = {v ∈ x : ψ(v)}.
Then
{v ∈ x : ϕ(v)} = x − G(x1 , . . . , xn ).
Case 3 If ϕ(v) is (ψ1 ∧ ψ2 )(v), suppose for i = 1, 2,
{v ∈ x : ψi (v)} = Gi (x1 , . . . , xn ).
Then
{v ∈ x : ϕ(v)} = G1 (x1 , . . . , xn ) ∩ G2 (x1 , . . . , xn ).
Case 4 If ϕ(v) is ∃u ∈ vi (ψ(v, u)) with 1 ≤ i ≤ n, suppose by inductive hypothesis
{(v, u) ∈ x × y : ψ(v, u)} = G1 (x1 , . . . , xn , y)
and by Case 1
{(v, u) ∈ x × y : u ∈ vi } = G2 (x1 , . . . , xn , y)
Since
u ∈ v i ∈ xi ⇒ u ∈
[
xi ,
we have
ϕ(v) iff ∃u ∈ vi (ψ(v, u))
[
iff ∃u(ψ(v, u) ∧ u ∈ vi ∧ u ∈
xi )
[
iff v ∈ dom{(v, u) ∈ x ×
xi : ψ(v, u) ∧ u ∈ vi }.
Then
[
{(v, u) ∈ x × xi : ϕ(v, u)}
[
[ =x ∩ dom G1 (x1 , . . . , xn , xi ) ∩ G2 (x1 , . . . , xn , xi ) .
Therefore, we have finished the proof.
Notation 5.35. Let (cl)G (x) denote the closure of x under Gödel operations.
Remark 5.36. Think carefully (cl)G (x) as it will play a crucial role in the rest
of our proof. We construct it by induction on ω.
At n = 0, let
x0 = x.
40
At n = 1, apply once (arbitrary) basic Gödel operation over any u, v ∈ x0 ,
collect all the yielded sets, and along with x0 , define
x1 = x0 ∪ {gi (u, v) : u, v ∈ x0 , i = 1, . . . , 7}.
Then, inductively, at n = m, define
xm+1 = xm ∪ {gi (u, v) : u, v ∈ xm , i = 1, . . . , 7}.
Finally, we have
(cl)G (x) =
[
xn .
n∈ω
Proposition 5.37. (cl)G is a Σ1 -operation.
Proof. In the light of Remark 5.36, we see
y = (cl)G (x)
[
iff ∃f f is a function ∧ dom(f ) = ω ∧ y =
ran(f ) ∧ f (0) = x
∧ n ∈ ω(f (n + 1) = f (n) ∪ {gi (u, v) : u, v ∈ f (n), i = 1, . . . , 7}) .
Corollary 5.38. For a transitive set M ,
D(M ) = (cl)G (M ∪ {M }) ∩ P(M ).
Proof. By Proposition 5.10 and Gödel’s Normal Form Theorem, we have
D(M ) ⊂ (cl)G (M ∪ {M }) ∩ P(M ).
For the other direction, let x ⊂ M and x = G(. . .) for some Gödel operation G
over M ∪ {M }. Since G is ∆0 , there exists a ∆0 -formula ϕ such that
x = {v ∈ M : ϕ(v, . . .)}.
But ϕ is ∆0 , so
x = {v ∈ M : ϕM (v, . . .)} ∈ D(M ).
Corollary 5.39. Gödel operations are absolute for transitive models.
41
5.6
Con(ZF + V = L)
Lemma 5.40 (Lemma 13.14 [9]). The function D is absolute for L.
Proof. We show D is ∆1 . First it is clear that
z = D(x) = (cl)G (x ∪ {x}) ∩ P(x)
iff ∃y y = (cl)G (x ∪ {x}) ∧ ∀u ∈ z(u ⊂ x ∧ u ∈ y)
∧ ∀u ∈ y(u ⊂ x ⇒ u ∈ z) .
Since (cl)G is Σ1 by proposition 5.37 and the rest after the quantifier ∃y is ∆0 ,
the whole thing after ∃y is Σ1 . So D is a Σ1 -function by Lemma 5.13 (1).
Now
z = D(x) iff x ∈ dom(D) ∧ ∀u ¬(u = z) ⇒ ¬(u = D(x)) .
Expressions x ∈ dom(D) and ¬(u = z) are ∆0 , and ¬(u = D(x)) is Π1 by
Lemma 5.13 (2). Then z = D(x) is also Π1 , again by Lemma 5.13 (2). So D is
∆1 and thus absolute for L by Lemma 5.19.
Lemma 5.41. ∀α ∈ Ord((Lα )L ) = Lα .
Proof. We prove by contradiction. Suppose there exists an α such that (Lα )L 6= L.
Let β be the smallest such ordinal. First trivially β 6= 0. If β > 0 and is a limit,
then
[ L
(Lβ )L =
Lα
α<β
=
[
(Lα )L
by Lemma 5.15
Lα
by hypothesis
α<β
=
[
α<β
= Lβ
contradicting the definition of β.
If β is a successor, then
L
(Lβ )L = D(Lβ−1 )
= D (Lβ−1 )L
= D(Lβ−1 )
= Lβ
by Lemma 5.40
by hypothesis
contradicting the definition of β.
The result thus follows.
42
Corollary 5.42. L |= V = L.
Proof. Since
L |= V = L
iff L |= ∀x∃α(α ∈ Ord ∧ x ∈ Lα )
iff ∀x ∈ L∃α ∈ L α ∈ (Ord)L ∧ x ∈ (Lα )L ,
the result follows from Lemmas 5.27 (5) and 5.41.
Corollary 5.43. L is a model for ZF + V = L.
5.7
Con(ZFC)
Theorem 5.44 (Definition §VI 4.1 [12], Theorem 13.18 [9]). There exists a wellordering for L.
Proof. By induction we construct a well-ordering Cα for each Lα . We require
these well-orderings have the properties that whenever α < β,
x Cα y ⇒ x Cβ y, and
x ∈ Lα ∧ y ∈ Lβ − Lα ⇒ x Cβ y.
If α is a limit, define
Cα = {(x, y) ∈ Lα × Lα : ρ(x) < ρ(y) ∨ (ρ(x) = ρ(y) ∧ (x, y) ∈Cρ(x)+1 )},
where ρ is the L-rank function.
Recall, by Corollary 5.38 and Remark 5.36,
Lα+1 = D(Lα ) = (cl)G (Lα ∪ {Lα }) ∩ P(Lα ) =
[
xα,n ∩ P(Lα ),
n∈ω
where
xα,0 = Lα ,
xα,n+1 = xα,n ∪ {gi (u, v) : u, v ∈ xα,n , i = 1, . . . , 7}.
So given Cα and s, t ∈ Lα+1 , we use Gödel operations to construct Cα+1 by
induction on n (in xα,n ) as following:
(i) If s, t ∈ Lα , then s Cα+1 t iff s Cα t.
(ii) If s ∈ Lα and t ∈ Lα+1 − Lα , then s Cα+1 t.
iii) If s ∈ xα,n , t ∈ xα,m , then s Cα+1 t iff n < m.
43
(iv) If s, t ∈ xα,n+1 − xα,n , then s Cα+1 t iff is < it , where is is the least i such
that
∃u, v ∈ xα,n (s = gis (u, v)).
(v) If in (iv) is = it , then s Cα+1 t iff us Cα+1 ut , where us is the Cα+1 –least u
such that
∃v ∈ xα,n (s = gis (us , v)).
(vi) If in (v) us = ut , then s Cα+1 t iff vs Cα+1 vt , where vs is the Cα+1 –least v
such that
s = gis (us , vs ).
(vii) If in (vi) vs = vt , it is easy to see s = t.
Define
CL =
[
Cα .
α∈Ord
Then CL is a a well-ordering for L.
Corollary 5.45. L |= V = L ⇒ AC.
Corollary 5.46. Con(ZF) ⇒ Con(ZFC)
5.8
ZF + ¬AC
Our last words for this section is the following deep result which we only state
here.
Theorem 5.47 ([3], [4]). There is a model of ZF in which R cannot be wellordered. Therefore
Con(ZF) ⇒ Con(ZF + ¬AC).
This theorem is proved by a powerful technique, called forcing, for consistency
and independence proofs introduced by Paul J. Cohen in 1963 and 1964 ([3], [4]).
It looks somewhat surprising and strange at the first glance, since we know how
to order R and the consistency of ZFC in our “normal” mathematics world might
have let us believe AC was true in general. Along with Con(ZFC), we have:
Corollary 5.48. AC is independent of ZF.
44
6
The Sun and a pea
To those who feel that the Banach–
Tarski Paradox is absurd, it seems evident that the Axiom of Choice is the
culprit.
Stan Wagon
In 1924, Banach and Tarski, using AC, proved a surprising and counterintuitive theorem:
Theorem 6.1 (Banach–Tarski Theorem). A unit 3-ball can be divided into a
finite number of disjoint pieces, such that these pieces can be reassembled to create
two unit 3-balls.
The proof is not included here but can be found in Chapter 3 of Wagon 1985
[21] or Chapter 5 of Wapner 2005 [22]. The idea of the proof is: assuming AC,
there exist non-Lebesgue measurable sets (Theorom 4.8); divide a unit 3-ball
into a finite number of non-Lebesgue measurable subsets; and these subset can
somehow be put back to get two unit 3-balls.
Given Banach–Tarski Theorem, one can make an arbitrarily large ball using
a unit ball; hence the title of this section.
Banach–Tarski theorem was one of the important reasons that AC was rejected by many mathematicians before Con(ZFC) was proved. This paradox
raised the fear that AC could lead to contradiction. To remedy this, some weaker
forms of AC were suggested. One of them is
The Axiom of Dependent Choices Give a binary relation ≺ on a set x, if
∀s ∈ x∃t ∈ x(t ≺ s),
then there exist {sn ∈ x : n ∈ ω} such that
∀n ∈ ω(sn+1 ≺ sn ).
It can be shown that there exists a model for ZF + DC in which any subset of
R is Lebesgue measurable; see Theorem 10.10 [8]. Therefore, ZF + DC seems to
45
appease both sides: there are choice functions but no longer the Banach–Tarski
paradox.
However, as Wagon argued, if some forms of non-constructive choice functions
are tolerated, why are not all the choice functions [21]? AC endows and inspire
our mathematicians and scientists with a brave and liberal attitude. Without AC,
mathematics world would have been a collection of algorithms. Those paradoxes
led by AC, like the Banach–Tarski one, instead of raising any fear to AC, actually
reinforce and reshape our views to mathematics.
Many fruitful results are known to be provable if and only if AC is assumed.
One example is the existence of non-Lebesgue measurable sets. Another one is
Theorem 6.2 (Theorem 10.11 [8]). There exists a model for ZF in which there
exists a vector space with no basis.
In fact, the physical problems caused by AC happens entirely in abstract
mathematical worlds, more specifically, in the area of Lebesgue measurable sets.
Therefore, doubling the volume of a ball simply means the physical property,
volume, is an undefinable term in Lebesgue measure theory.
Moreover, AC also helps to discover results in the world without assuming
AC. For example, the proof of Schröder–Bernstein theorem can be without using
AC, but is easier in ZFC [21]. Without AC, its discovery would be postponed.
For another example, in response to the Banach–Tarski paradox, the concept of
amenable groups was introduced to deal with paradoxical groups by von Neumann
in 1929 [20].
46
7
Conclusion
No doubt [AC] will always be desirable
despite the technical interest of various
independence questions involving it and
weaker principles.
Dana Scott
We have shown some equivalents of AC and results requiring AC from various
branches of mathematics. We have also justified why AC can be used by our
proof of the consistency of AC in ZF.
In most areas other than set theory, AC is used as a simple fact. However,
in set theory, research on this topic often goes quite deep, e.g. large cardinals.
This is especially powered by forcing, the technique invented by Cohen for his
proof of Con(ZF +¬ AC). For the reader who would like more information about
forcing, Cohen 1963 [3], Cohen 1964 [4], Kunen 1980 [12], and Jech 2002 [9] are
recommended. For contents about large cardinals, one is referred to Jech’s 2002
textbook [9]. For more fundamental results, one may consult Marker 2002 [14]
for model theory or Mendelson 1987 [15] for the Gödel Completeness Theorem
and the two Gödel Incompleteness Theorems.
47
References
[1] Banach, S. & Tarski, A. 1924. Sur la décomposition des ensembles de points
en parties respectivement congruentes. Fundamenta Mathematicae, 6: 244–
277.
[2] Blass, A. 1984. Existence of bases implies the axiom of choice. Contemporary
Mathematics, 31: 31–33.
[3] Cohen, P.J. 1963. The Independence of the Continuum Hypothesis. Proceedings of the National Academy of Sciences of the USA, 50: 1143–1148.
[4] Cohen, P.J. 1964. The Independence of the Continuum Hypothesis, II.
Proceedings of the National Academy of Sciences of the USA, 51: 105–110.
[5] Gödel, K. 1938. The consistency of the axiom of choice and of the generalized
continuum hypothesis. Proceedings of the National Academy of Sciences of
the USA, 24: 556–557.
[6] Gödel, K. 1940. The consistency of the axiom of choice and of the generalized
continuum–hypothesis with the axioms of set theory. Princeton University
Press, Princeton, New Jersey.
[7] Grillet, P.A. 2007. Abstract Algebra, 2nd ed. In: Axler, S., & Riber, K.A.
(eds), Graduate texts in mathematics, 242. Springer-Verlag, New York.
[8] Jech, T.J. 1973. The Axiom of Choice. North Holland Publishing Company,
New York.
[9] Jech, T.J. 2002. Set Theory, 3rd ed. Springer, Berlin.
[10] Johnson, D.L. 1976. Presentations of groups. In: Shephard, G.C. (ed),
London Mathematical Society Lecture note series, 22. Cambridge University
Press, Cambridge.
[11] Johnstone, P.T. 1981. Tychonoff’s theorem without the axiom of choice.
Fundamenta Mathematicae, 113: 21–35
[12] Kunen, K. 1980. Set Theory: an introduction to independence proofs. In:
Barwise, J., Kaplan, D., Keisler, H.J., Suppes, P., & Troelstra, A.S. (eds),
Studies in logic and the foundations of mathematics, 102. North Holland
Publishing Company, New York.
[13] Kelley, J.L. 1950. The Tychonoff product theorem implies the axiom of
choice. Fundamenta Mathematicae, 37: 75–76.
48
[14] Marker, D. 2002. Model theory : an introduction. In: Axler, S.,
Gehring, F.W. & Riber, K.A. (eds), Graduate texts in mathematics, 216.
Springer-Verlag, New York.
[15] Mendelson, E. 1987. Introduction to Mathematical Logic, 3rd ed. Wadsworth
& Brooks/Cole Advanced Books & Software, California.
[16] Moore, G.H. 1982. Zermelo’s Axiom of Choice: its origins, development,
and influence. Springer-Verlag, New York.
[17] Rubin, H & Rubin J.E. 1963. Equivalents of the Axiom of Choice. In:
Brouwer, L.E.J., Beth, E.W. & Heyting, A. (eds), Studies in logic and the
foundations of mathematics. North Holland Publishing Company, New York.
[18] Rynne, B.P. & Youngson, M.A. 2008. Linear Functional Analysis, 2nd ed.
Springer-Verlag, London.
[19] Srivastava, S.M. 1998. A Course on Borel Sets. In: Axler, S., Gehring, F.W.
& Riber, K.A. (eds), Graduate texts in mathematics, 180. Springer-Verlag,
New York.
[20] von Neumann, J. 1929. Zur allgemeinen Theorie des Maßes. Fundamenta
Mathematicae, 13: 73–111.
[21] Wagon, S. 1985. The Banach–Tarski Paradox. Cambridge University Press,
New York.
[22] Wapner, L.M. 2005. The Pea and the Sun: a mathematical paradox. A K
Peters, Ltd, Wellesley, Massachusetts.
[23] Wheeden, R.L. & Zygmund, A. 1977. Measure and Integral: an introduction
to real analysis. In: Hewitt, E., Kobayashi, S. & Taft, E.J., etc. (eds),
Monographs and textbooks in pure and applied mathematics, 43. Marcel
Dekker, Inc., New York.
[24] Wilansky, A.
Toronto.
1970.
Topology for Analysis.
49
Xerox College Publishing,