Equivalents, Applications, and Consistency of the Axiom of Choice Xi Yao Supervised by Professor Greg Hjorth [email protected] Department of Mathematics and Statistics The University of Melbourne November 2010 To Hui-Ying Hu Acknowledgments I would like to particularly thank my supervisor Professor Greg Hjorth. As a previous biology student who had never done any pure mathematics before and somehow decided to do a pure mathematics Master thesis in set theory and logic, and as an international student who had to work a lot to make a living, I have experienced a hard time. Professor Greg Hjorth has been very patient and considerate, helping me through many of the elementary aspects of pure mathematics. I also want to thank Raj A. Dahya, Leigh Humphries, and Toby Meadows for the discussions about L, Dr Lawrence Reeves for the discussions about the Banach–Tarski Paradox, Dr Paul Norbury for the discussions on the proof of Vec ⇒ AC, and Bowen Plug for some general discussions about AC and correcting some of my grammatic mistakes. Abstract Given any family of non-empty sets, the Axiom of Choice (AC) gives us the power to choose one element from each member from the family. AC appears and is used in most modern mathematics and sciences, more often than not, implicitly. For example, when we say “choose coset representatives”, we are indeed using AC. Upon its formal formulation after unconscious uses, AC had been receiving heavy criticisms for years due to its non-constructive nature. However, AC indeed gives us a richer mathematical (and scientific) world. We first provide, as motivation, some equivalents of AC and results requiring AC from various branches of mathematics. Then we prove the consistency of AC in ZF. Some results from AC, e.g. the Banach–Tarski paradox, are discussed. Contents 1 Introduction 1 2 Preliminaries 2.1 Zermelo–Fraekel Set Theory 2.2 Ordinals . . . . . . . . . . . 2.3 Transfinite Induction . . . . 2.4 Cardinals . . . . . . . . . . 2.5 The universe of sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 3 7 9 12 13 3 Some equivalents of AC 14 4 Some results requiring AC 4.1 Set theory . . . . . . . . 4.2 General topology . . . . 4.3 Algebra . . . . . . . . . 4.4 Group theory . . . . . . 4.5 Measure theory . . . . . 4.6 Functional Analysis . . . . . . . . . 20 20 20 21 21 21 21 . . . . . . . . 22 22 22 26 31 36 42 43 44 5 The 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 . . . . . . . . . . . . . . . . . . . . . . . . consistency of AC in ZF Overview of the consistency and A quick review of model theory Absoluteness . . . . . . . . . . . Constructible sets . . . . . . . . Gödel operations . . . . . . . . Con(ZF + V = L) . . . . . . . Con(ZFC) . . . . . . . . . . . . ZF + ¬AC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . its proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 The Sun and a pea 45 7 Conclusion 47 References 48 1 Introduction The Axiom of Choice is necessary to select a set from an infinite number of socks, but not an infinite number of shoes. Bertrand Russell Along with the eight Axioms in the Zermelo–Fraenkel axiomatic set theory (ZF hereafter), together called ZFC, the Axiom of Choice (AC hereafter) forms a natural and simple but indispensable foundation for most modern mathematics and sciences. It penetrates into most, if not all, branches of mathematics, including set theory, order theory, algebra, topology, group theory, and analysis, just to name a few; and it has many equivalents and applications. In the history of mathematics, only through extensive use, did assumptions become clean formulas. According to Moore 1982, the development of AC can be divided into four stages. The rudimentary version of AC — picking out an arbitrary element from a set — can be traced back at least to 2300 years ago in Euclid’s Elements. In the second stage, around the start of the nineteenthcentury, mathematicians constructed the rule of choice when dealing with infinity. But gradually, mathematicians, like Cauchy, started to leave the rule implicit. The last stage can probably marked by that in 1871, Cantor once used some choice functions that no rule existed and he did not recognize this. After the four stages, AC was first formulated by Ernst Zermelo when he was seeking a proof for the well-ordering principle. AC often appears implicitly as an assumption in many proofs and can easily be overlooked. The proof of Theorem 3.1 later provides an example of the oversight of AC. In the first 30 years after AC was formulated, whether or not AC could be taken as meaningful and acceptable was very controversial. However, AC is now widely accepted and used by the majority of mathematicians. AC states: The Axiom of Choice 1 Every family of non-empty sets has a choice function. This choice function maps each set in the family to an element of the set. AC has many forms. We list a few below and it is not hard to see they are essentially the same. 1 The Axiom of Choice 2 For every set x, there exists a function f : P(x) − {∅} → x such that f (s) ∈ s. The Axiom of Choice 3 The product of any family of non-empty sets is nonempty. The Axiom of Choice 4 Every family x of non-empty sets has a choice function on x such that for all s ∈ x, f (s) is a non-empty finite subset of s. If the sets are finite or are ordered in some way, e.g. natural numbers, we do not need AC. A choice function is required if the sets are infinite and there is no clear rule to select elements from them. It is important to point out that the essence of AC is the existence of a choice function, not the choice that the function gives. Therefore, we do not require the knowledge about the choice function, and thus no construction of the choice function is needed; that is, the choice function used need not be well-defined. In constructivism, all existence proofs are required to be totally explicit. That is, one must be able to construct, in an explicit and canonical manner, anything that is proven to exist. Due to its non-constructive nature, AC received strong criticisms, especially after Banach and Tarski proved in 1924 a paradox using AC. We will discuss this in section 6. The criticisms did not stop until in 1938 Kurt Gödel proved the consistency of AC in ZF. This verified why we can use AC and displaced any suspicion that the use of AC could lead to contradiction. In this paper, we first provide, as motivation, some equivalents of AC and results requiring AC from various branches of mathematics. Then we prove the consistency of AC in ZF. 2 2 Preliminaries The essential idea on which the Axiom of Choice is based constitutes a general logical principle which . . . is necessary and indispensable. David Hilbert We assume that the reader has some knowledge in the formal language of logic. See Kunen 1980 [12] for a quick review or Mendelson 1987 [15] for a formal treatment. 2.1 Zermelo–Fraekel Set Theory We list, formulate, and command on all the axioms in ZF . Some familiar mathematics objects are defined using these axioms. 0. Axiom of Existence. ∃x(x = x). This axiom assume a non-empty universe. It is listed here for emphasis but can be omitted since it is derivable from the logical axioms of formal logic [12]. (Thus it is not counted as one of ZF.) But what is a set and what is not a set? It is not easy to give definitions to them without selfreference. Although in ZF, classes do not exist, i.e. the only objects are sets [9], sometime it is unavoidable to work on classes. Definition 2.1 (§4 [15]). A class is a collection {x : P (x)}, with some property P . A class x is a set if it is a member of a class; it is a proper class if it is not a set. We would like this definition to capture all the legitimate mathematical objects. For instance, we do not accept a P (x) that says x is a dog. 1. Axiom of Extensionality. ∀x∀y (∀u ∈ x(u ∈ y) ∧ ∀u ∈ y(y ∈ x)) ⇒ x = y . This axiom says a set is defined by its members. 3 2. Axiom of Foundation. ∀x(x 6= ∅ ⇒ ∃u ∈ x(u ∩ x) = ∅). This axiom says any set non-empty set contains an ∈-least element. It prevents ill-founded sets from existing, e.g. a set x with x ∈ x, and thus solve the famous Russell’s Paradox: ∃u u = {x : x ∈ / x)} . Since there exist no sets containing themselves, u contains all the sets and thus is not a set itself by the Axiom of Separation below. 3. Axiom of Pairing. ∀x∀y∃u(x ∈ z ∧ y ∈ z ∧ ∀u ∈ z(u = x ∨ u = y)). Given x and y, define the pair u to be {x, y}. Easy to see u is unique by the Axiom of Extensionality. If y = x, then u = {x, x} = {x}, and there exists a set z with z = {x, u} = {x, {x}}. Thus we define the ordered pair (x, y) of x and y to be {{x}, {x, y}} . Similarly, define tuples with more than two entries to be (x, y, z) = ((x, y), z) = (x, (y, z)) (x, y, z, u) = ((x, y, z), u) .. . and so on. 4. Axiom of Union. ∀x∃y∀t(t ∈ y ⇔ ∃s(s ∈ x ∧ t ∈ s)). S This axioms states for any set x, the collection x of all the elements in the members in x is a set. Using pairing we define [ x ∪ y = {x, y} x ∪ y ∪ z = (x ∪ y) ∪ z .. . and so on. 4 5. Axiom of Power Set. ∀x∃y∀u(u ⊂ x ⇔ u ∈ y). This axioms says for any set x, the collection of all the subset of x is a set. Using ordered pairing and power set, we define the product x × y of x and y to be {(s, t) : s ∈ x ∧ t ∈ y}. Similarly, define x × y × z = ((x × y) × z) = (x × (y × z)) x × y × z × u = (x × y × z) × u .. . and so on. We also called a set of ordered tuples an relation and define the domain of a relation r to be dom(r) = {x : ∃y((x, y) ∈ r)}, and the range of r to be ran(r) = {y : ∃x((x, y) ∈ r)}. If f is a relation such that (x, y) ∈ f ∧ (x, z) ∈ f ⇒ y = z, then it is a function. If (x, y) ∈ f , we write f (x) = y. For a function f , if dom(f ) = x and ran(f ) = y, we write f : x → y, and say f maps x to y. The image of a set x under a function f , or simply the image of f , is f [x] = img(f ) = {t : ∃s ∈ x(t = f (s))}. A function f is one-to-one or injective if (x, z) ∈ f ∧ (y, z) ∈ f ⇒ x = y; it is onto or surjective if img(f ) = ran(f ); 5 and it is bijective if it is both one-to-one and onto. The pre-image of t ∈ ran(f ) under a function f is f −1 (t) = {s ∈ dom(f ) : (s, t) ∈ f }. If f is a one-to-one function, its inverse f −1 is also a function. By definition, ran(f ) contains and could be strictly larger than the image of dom(f ). The restriction f of a function f to a set x is f x = {(s, t) ∈ f : s ∈ x}. Easy to see the terms defined above are sets. 6. Axiom of Infinity. ∃x(∅ ∈ x ∧ ∀s ∈ x(s ∪ {s} ∈ x)). This axioms says there exists a set with infinitely many members. If a set x is finite, then there exists a natural number n such that x has exactly n members. Instead of formulating the property of infinity, in the formula of this axiom we formulate that of inductivity. In fact, the natural numbers is the smallest inductive set. Proposition 2.2. Any inductive set contains the natural numbers. Proof. Let x be an inductive set, i.e. ∅ ∈ x ∧ ∀s ∈ x(s ∪ {s} ∈ x). Suppose x does not contain all the natural numbers. Let n be the smallest natural number that is not contained in x. Then ∀m < n(m ∈ x ∧ m ∪ {m} ∈ x). Now since n is not zero, there exists a natural number l such l ∪ {l} = n. If l < n, we are done. But clearly l n. 7. Axiom Schema of Replacement. For each formula ϕ, ∀x∀y∀z ϕ(x, y) ∧ ϕ(x, z) ⇒ y = z ⇒ ∀x∃y∀u u ∈ y ⇔ ∃s ∈ x(ϕ(s, u)) . First note, since there are infinitely many formulas and thus infinitely many axioms of Replacement, this is an axiom schema; and similarly for the axioms of Separation. In short, this axiom schema says the image of a set under a function is a set. 6 8. Axiom Schema of Separation. For each formula ϕ, ∀x∀p1 . . . ∀pn ∃y∀s s ∈ y ⇔ s ∈ x ∧ ϕ(s, p1 , . . . , pn ) . This schema states given a formula ϕ, a set x, and parameters p = {p1 , . . . , pn }, there exists a set u = {s ∈ x : ϕ(s, p)}; and such u is unique by Extensionality. It is important to note that the statement that there exists a set u = {s : ϕ(s, p)} is false. Then it follows that ∅ = {x : x 6= x} is a set and that the universal class {x : x = x} is not. 2.2 Ordinals Definition 2.3. A partial ordered set (x, <) consists of a set x and a binary relation <, called a partial ordering, on x, such that i) ∀s ∈ x(¬(s < s)); ii) ∀r, s, t ∈ x(r < s ∧ s < t ⇒ r < t). Definition 2.4. Let (x, <) be a partial ordered set, ∅ = 6 t ⊂ x, and s ∈ x. We say s is i) maximal in t if s ∈ t and ¬(∃u ∈ t(s < u)); ii) minimal in t if s ∈ t and ¬(∃u ∈ t(u < s)); iii) largest or greatest in t if s ∈ t and ∀u ∈ t(u < s ∨ u = s)); iv) least or lowest in t if s ∈ t and ∀u ∈ t(s < u ∨ s = u)). v) an upper bound of t if ∀u ∈ t(u < s ∨ u = s)); vi) a lower bound of t if ∀u ∈ t(s < u ∨ s = u)). Definition 2.5. A linear ordered set (x, <) is a partial ordered set such that ∀s, t ∈ x(s = t ∨ s < t ∨ t < s). Remark 2.6. A partial ordered set can have more than one maximal elements but only one largest element, whereas in a linear ordered set, the maximal element and a largest one coincide; similar for minimal and least elements. 7 Remark 2.7. If we order all the elements in a linear ordered set x, they look like a chain. So we call a subset z of a partial ordered set (y, <) a chain if (z, <z ) is a linear ordered set, where <z is the restriction of < to z. Notation 2.8. Let x and y be two sets. If there exists a one-to-one mapping from x to y, we write |x| ≤ |y|. If |x| ≤ |y| and |y| ≤ |x|, we write |x| = |y|. If |x| ≤ |y| and |y| |x|, we write |x| < |y|. Definition 2.9. Two linearly ordered sets (x, <1 ) and (y, <2 ) are order isomorphic, denoted (x, <1 ) ' (y, <2 ), if there exists a map f from x to y such that f is bijective and order-preserving, i.e. ∀s, t ∈ x s <1 t ⇒ f (s) <2 f (t) ∧ ∀s, t ∈ y s <2 t ⇒ f −1 (s) <1 f −1 (t) . Definition 2.10. A well-ordered set (x, <) is a linear ordered set such that every non-empty subset of x has a <–least element. An initial segment x(t) of (x, <) is the set {s ∈ x : s < t} for some t ∈ x. Definition 2.11. A set x is transitive if ∀s(s ∈ x ⇒ s ⊂ x). Definition 2.12. A set α is an ordinal if and only if (α, ∈) is a transitive wellordered set. We write lowercase Greek letters α, β, . . . to represent ordinals. Denote the class of ordinals by Ord. For α, β ∈ Ord, define α < β ⇔ α ∈ β. It is easy to see the following properties of ordinals: Lemma 2.13. Let α, β ∈ Ord. 1. γ ∈ α ⇒ γ ∈ Ord. 2. β 6= α ∧ β ⊂ α ⇒ β ∈ α. 3. α ⊂ β ∨ β ⊂ α. 4. α ∪ {α} ∈ Ord. 5. β < α ⇔ β ∈ α ⇔ β $ α. 8 6. < linearly orders Ord. 7. α = β ⇔ (α, <) ' (β, <). Definition 2.14. An ordinal α is a successor if there exists an ordinal β such that α = β ∪ {β} = β + 1; it is a limit if it is not a successor, that is, it has no last element. Remark 2.15. If α is a limit, then either i) α = 0, or i) α > 0, and α = S β<α β. Examples 2.16 (of ordinals). 0 = ∅, 1 = {∅}, 2 = {∅, {∅}}, 3 = {∅, {∅}, {∅, {∅}}}, .. . ω = {the finite ordinals} = {the natural numbers}, ω1 = {the countable ordinals}. 2.3 Transfinite Induction In the light of proofs by induction on natural numbers, we extend them to their counterparts on ordinals. Theorem 2.17 (Proof by transfinite induction, Theorem 1.7.3 [19]). Let {ϕs : s ∈ x} be a set of mathematical expressions or statements labeled by A well-ordered set (x, <). Suppose ∀s ∈ x ∀t ∈ x(t < s ⇒ ϕt ) ⇒ ϕs , then ∀s ∈ x(ϕs ). Proof. Suppose, instead, ∀s ∈ x(ϕs ) is not true . Let u be the least element in x such that ϕu fails. Then ∀t ∈ x(t < u ⇒ ϕt ). But by the assumption, ∀t ∈ x(t < u ⇒ ϕt ) ⇒ ϕu . Thus ϕu is true. This contradicts the definition of u. 9 Transfinite induction can also be used to define objects. Theorem 2.18 (Definition by transfinite induction, Theorem 1.7.4 [19]). Let x be a set, (y, <) be a well-ordered set, and define Ω = f : t → x | ∃u ∈ y t = y(u) = {v ∈ y : v < u} , Let G : Ω → x be a function. Then there exists a unique function F on y such that for each u ∈ y, F (u) = G(F y(u)). Remark 2.19. In Ω, each f is a sequence (ordered tuples) with an ordinal length of elements in x labeled by the elements in y. For the purpose of demonstration, say y is an ordinal. Then an f is in the form of hs0 , s1 , . . . , sn , . . . , sα i, where h. . .i = (. . .) denotes a sequence, α ∈ Ord and each si ∈ x. Intuitively, Theorem 2.18 works because we get a sequence F of of elements in x from the given G by induction as follows: At α = 0, since ∅ is a sequence (of length 0), let F (0) = G(∅), say, = s0 ; at α = 1, since hs0 i is a sequence (of length 1), let F (1) = G(hs0 i), say, = s1 ; .. . at α = n ∈ ω, since hs0 , s1 , . . . , sn−1 i is a sequence (of length n), let F (n) = G(hs0 , s1 , . . . , sn−1 i); and so on. So each time when we get an element from G, we glue it to the end of the old sequence. This is exactly the familiar definition by induction. But Theorem 2.17 allows us to proceed to the limit cases. Proof of Theorem 2.18. For each u ∈ y, let ϕu be the statement that there exists a unique function fu : u → x such that ∀v < u fu (v) = G(fu v) . Let w ∈ y such that for all u < w, ϕu holds. (ϕ∅ is obviously the trivial one that holds.) We have the unique set Ωw = {fu : u < w}, where fu is as in the statement ϕu . It is easy to see that for each u < w, if p < u, fu p also satisfies ∀v < p (fu p)(v) = G (fu p)v . S Therefore by the uniqueness of Ωw , if p < u < w, then fp = fu p. So h = u<w fu is a common extension for {fu : u < w}. Now if w is a limit, let fw = h; if w is a successor, let fw y(w − 1) = h and fw (w − 1) = G(h), where w − 1 is the element whose successor is w. Since Ωw is unique, so is fw . Then we have ∀u ∈ y ∀v ∈ y(v < u ⇒ ϕv ) ⇒ ϕu . 10 So by Theorem 2.17, ∀u ∈ y(ϕu ). Similarly, one can see H= [ fu u∈y is the unique common extension for {fu : u ∈ y}. Again, if y is a limit, let F = H; if it is a successor, let s be the largest element in y and take F y(s − 1) = H and F (s) = G(H). And similarly, F is unique. Since induction is just a special case of transfinite induction, we collectively call induction and transfinite induction just induction hereafter. Theorem 2.20 (Theorem 2.12 [9]). Every well-ordered set is order isomorphic to a unique ordinal. Proof. Let (x, <) be a well-ordered set. Using Theorem 2.18, we define by induction Ω = {f : α → x | α ∈ Ord}. Since α and x are sets, the elements in Ω are well-defined. Ω is the set of the sequence of elements in x (labeled by ordinals); in particular, Ω contains all the ft ’s with t ∈ x such that i) the domain of ft is an ordinal, ii) the image of ft is the initial segment x(t) = {s ∈ x : s < t}, and iii) ft is an order isomorphism. Now let G : Ω → x be such that G maps an initial segment x(s) of x to s, where s ∈ x. Then by Theorem 2.18, we get a unique F . And F is an is an order isomorphism from some ordinal α to x. This α is also unique. Remark 2.21. From Theorem 2.20, we see that an important idea of ordinals is that they provide the canonical well-orderings for all sets (by assuming AC). Later we will see in section 3, every set can be well-ordered. On the other hand, since no set contains a copy (up to order isomorphim) of itself, Ord is a proper class. Using this fact, we prove the following result. Theorem 2.22 (Hartogs’ Theorem, Proposition 4.28 [15]). ∀x∃α ∈ Ord∀y(y ⊂ x ⇒ |y| = 6 |α|). 11 Proof. Suppose x is a set such that ∀α ∈ Ord∃y(y ⊂ x ∧ |y| = |α|). Now for each pair of y and α such that |y| = |α|, we let fy,α : y → α be the bijection and define a relation <y,α on y by ∀s, t ∈ y s <y,α t ⇔ fy,α (s) < fy,α (t) . ∗ : (y, <y,α ) → (α, <) is the natural extension of fy,α and is an obvious Thus fy,α ∗ order isomorphism. For each fy,α , since y ⊂ x and |y| = |α|, we have ∗ fy,α ⊂ x × (x × x) × x. Define a function F mapping each ordinal α to the set ∗ uα = {fy,β : β < α}. As we see uα ⊂ P(x) × (P(x) × P(x)) × P(x), uα is a set. So it follows that the image of F is also a set since F Ord ⊂ P P(x) × (P(x) × P(x)) × P(x) . Now since F is a one-to-one function, F has an inverse F −1 . F −1 maps F Ord to Ord. So, by the Axiom of Replacement, Ord is a set. This contradicts the fact that Ord is a proper class. 2.4 Cardinals Cardinality is a familiar concept seen in all branches of mathematics. It counts the number of the elements of a set. Cardinality and cardinal are the same thing, but intuitively, one may think of cardinality as a property and cardinal as a number. Now we define cardinals formally using ordinals. Definition 2.23 (§3 [9]). A cardinal is an ordinal κ such that ∀α ∈ Ord(α < κ ⇒ |α| < |κ|). We say the cardinality of a set x is κ if |x| = |κ|. Examples 2.24 (of cardinals). 1. All the natural numbers are the finite cardinals. 2. The infinite cardinals are called alephs, written ℵα ’s. The ordinal ω is the first infinite cardinal, written ℵ0 . The first uncountably infinite cardinal is ω1 , written ℵ1 . 3. The ordinal ω + 2 is not a cardinal, since ω < ω + 2 and |ω + 2| = |ω| = ℵ0 . 12 2.5 The universe of sets Definition 2.25 (Definitions §III 2.1, 2.2 [12]). We define by induction the universe V of sets, i) V0 = ∅, ( ii) Vα = iii) V = S P(Vα−1 ) if α is a successor , and S V if α is a limit β β<α α∈Ord Vα . Since the Axiom of Foundation says there are no ill-founded sets, i.e. any (nonempty) set has an ∈–minimal element, it follows that any (non-empty) class has the same property (see Lemma 6.2 [9]). Using this fact, we can show V contains all the (well-founded) sets. (Thus V is not a set itself but a proper class.) Lemma 2.26 (Lemma 6.3 [9]). ∀x(x ∈ V). Proof. Let y = {x : x ∈ / V}. If y = ∅, we are done; otherwise, let x be ∈-minimal in y. Then we have ∀s ∈ x(s ∈ V); if not, x is not ∈-minimal. So x ⊂ V and therefore there exists an obvious oneto-one map from x to V. ThenSsince x is a set, we use the Axiom of Replacement to get an ordinal β with x ⊂ α<β Vα . Then x ⊂ Vβ ⇒x ∈ P(Vβ ) = Vβ+1 ⇒x ∈ V. Definition 2.27. Let rank(x) = inf{β ∈ Ord : x ∈ Vβ+1 } be the rank for x ∈ V. It is not hard to check the properties of V and Vα ’s listed in the following lemma. Lemma 2.28. Let α ∈ Ord. 1. Vα is transitive (Lemma § III 2.3 (a) [12]). 2. ∀γ ∈ Ord(γ ⊂ α ⇒ Vγ ⊂ Vα ) (Lemma § III 2.3 (b) [12]). 3. Vα ∩ Ord = α (Lemma § III 2.7 (b) [12]). 4. Vα = {x ∈ V : rank(x) < α} (Lemma § III 2.5 [12]). 13 3 Some equivalents of AC The Axiom of Choice is obviously true, the Well-ordering Principle obviously false, and who can tell about Zorn’s lemma? Jerry Bona The quote at the beginning of this section is indeed a joke, as we will show the three things in the quote are equivalent. AC has a large number of equivalents; they live in different branches of mathematics, including mathematical logic, set theory, order theory, algebra, general topology, functional analysis, and so on [17]. Here we only list and prove a few of them. For a more comprehensive review of AC’s equivalents, the interested reader is referred to Rubin & Rubin’s 1963 monograph [17]. We will show the following statements are equivalent: 1. AC (the Axiom of Choice). 2. WOP (Zermelo’s Well-ordering Principle): every set can be well-ordered. 3. Tri (Trichotomy): ∀x∀y(|x| < |y| ∨ |y| < |x| ∨ |x| = |y|). 4. Zorn (Zorn’s lemma): if every chain in a non-empty partial ordered set (x, ≺) has an upper bound, then x has a maximal element. 5. Vec: Every vector space has a basis. 6. Tych (Tychonoff’s Theorem): A product of compact spaces is compact. Before starting the proof of the equivalences, we would like to point out that since the use of AC often occurs in a very subtle way, sometimes it is easy to make mistakes in the proof of AC from an equivalent of it. The wrong proof below from a standard textbook is an example. 14 Theorem 3.1. Zorn ⇒ AC. Proof. This is a wrong proof adapted from §1.3 [19]. Let {ui : i ∈ I} be a family of non-empty sets, for some index set I. Let F denote the set of all the choice functions for {ui : i ∈ J ∧ J ⊂ I}. We want to show the set (F, ⊂) satisfies the assumption of Zorn’s lemma. Let {fs : s ∈ S} be a ⊂–chain in F, for some S ⊂ I. Let [ f= fs s∈S be the common extension for {fs : s ∈ S}. Clearly, ∀s ∈ S(fs ⊂ f ). So f is an upper bound of {fs : s ∈ S} in F. So by Zorn’s lemma, F has a maximal element, say g. It remains to show g is a choice functions for {ui : i ∈ I}. Suppose not. Since g ∈ F, g is a choice function for, say, {ui : i ∈ J} for some J $ I. Let v ∈ I − J and b ∈ uv . Define h ⊃ g such that h(uv ) = b. Then h ∈ F, contradicting the maximality of g. This proof of Zorn ⇒ AC by Srivastava [19] seems to be a routine exercise using Zorn’s lemma. However, when the author chooses some v ∈ I − J and b ∈ uv , he implicitly but indeed uses AC. Without assuming AC or any knowledge of I − J and uv , it could happen that we have no way to choose elements from I − J and uv . Now we prove the equivalences listed above. We will prove the following relations: 7? Tri www wwwww w w www w www / AC o Tych WOP O Fc F w FF w w FF ww FF ww F {ww Zorn Vec The proof of Tychonoff’s Theorem using ultrafilter, a maximal set, and the proof that every vector space has a basis using Zorn’s lemma, are omitted here since they can be found, respectively, in most standard textbooks for general topology (e.g. Theorem 7.4.2 [24]) and algebra (e.g. Theorem § 5.2 [7]). Proposition 3.2. WOP ⇒ AC. Proof. Let the minimal element be the choice. 15 Theorem 3.3 (Proposition 4.37 [15]). AC ⇒ Zorn Proof. Let (x, ≺) be a non-empty partial ordered set such that every ≺–chain in x has an upper bound in x. We want to show x has a maximal element. Suppose f is a choice function for x. Let b = f (x). By induction we define a function F : Ord → x by ( b if α = 0, F (α) = f (uα ) otherwise, where uα = {s ∈ x : s ∈ / F [α] ∧ ∀t ∈ F [α](t ≺ s)}. Note that i) uα is the set of upper bounds of F [α] in x and uα ∩ F [α] = ∅; and ii) F (·) is the image of a point, whereas F [ · ] is the image of a set. Now there is an ordinal α such that uα = ∅; if not, by the argument similar to that in Theorem 2.22, Ord would be a set. Let γ be the least such ordinal and g = F γ. Observe, for each α > 0, that g(α) is an upper bound of g[α] in x, and that g(α) ∈ / g[α]. So g is a one-to-one function. Then g[γ] is a ≺–chain in x and thus has an upper bound in x, say p. Since uγ = ∅, which states that there exists no upper bounds of g[γ] that are in x − g[γ], we see p ∈ g[γ] and is unique. We finish the proof by claiming that p is ≺–maximal in x. Suppose not. Then there exists an s ∈ x such that s 6= p and p ≺ s. Then s ∈ / F [γ] and ∀t ∈ F [γ](t ≺ s). So uγ 6= ∅, leading to contradiction. Lemma 3.4 (Proposition 4.37 [15]). Zorn ⇒ WOP. Proof. Let x be a set. Let F be all the sequences with an ordinal length of elements in x, as described in Remark 2.19, but with no repeated element of x. F 6= ∅, since the empty sequence ∅ ∈ F. By the argument similar to that in Theorem 2.22, F is a set. Clearly, F is partial ordered by ⊂ and the union of any ⊂–chain in F is an upper bound. So by Zorn’s lemma, there exists a maximal element in F, say f . Clearly, f is a sequence consisting of all the elements in x. Let α ∈ Ord be the length of f and define a well-ordering for f similarly to Theorem 2.22. Proposition 3.5 (Proposition 4.37 [15]). WOP ⇒ Tri. Proof. Follows Theorem 2.20 and Lemma 2.13. 16 Proposition 3.6 (Proposition 4.37 [15]). Tri ⇒ WOP. Proof. By Hartogs’ Theorem (2.22) and Trichotomy, ∀x∃α ∈ x(|x| < |α|). Then we can define a well-ordering for x by the argument similar to that in Theorem 2.22. Theorem 3.7 ([13]). Tych ⇒ AC. Proof. Let {xi : i ∈ I} be a family of non-empty sets, for some index set I. We want to show the Form 3 of AC, i.e. Y xi 6= ∅. i∈I Without loss of generality, assume {xi : i ∈ I} are pairwise disjoint. Let α be an ordinal such that α ∈ / xi for all i ∈ I. Such α exists by Hartogs’ Theorem. Now for each i ∈ I, let yi = xi ∪ {α} and equip it with the topology {∅, {α}, xi , yi }. Each yi is clearly compact. Let y= Y yi i∈I equipped with the product topology. For each i ∈ I, let πi : y → yi be the projection map and ui = πi−1 ({α}). Then ui ’s are open sets, since πi ’s are continuous and {α} is a basic open set. Now for any finite J $ I, [ uj j∈J does not contain the element (. . . , {α}, . . .), S where {α} is the i-th entry with i ∈ I −J. It follows Sj∈J uj does not cover y and so {ui : i ∈ I} has no finite subcover of y. Therefore i∈I ui too does not cover y; otherwise, it has a finite subcoverQof y, since y is compact by Tychonoff’s Theorem. Now since all yi ’s are different, i∈I xi must be non-empty, as required. 17 Theorem 3.8 (Theorem 2 [2]). Vec ⇒ AC. Proof. Let {xi : i ∈ α} be a family of non-empty sets, where α is an ordinal. We use {xi } to construct a vector space by the idea of the field of rational fractions, and then find a finite set ui ∈ xi for each i ∈ α. For the background of the field of rational fractions, one can consult any standard algebra textbook (e.g. S p124 [7]). Without loss of generality, assume {xi } are pairwise disjoint. Let x = i∈α xi . For s ∈ x, let R[s] and R(s) be the field of polynomials and that of rational fractions, respectively, with variable s and coefficients in R. Similarly let R[[x]] and R((x)) denote the field of polynomials and that of rational fractions, respectively, with variables in x and coefficients in R. For an f ∈ R[[x]], we say f has [i]–degree n if the sum of the exponents of elements in xi in each term of f is n. For an h ∈ R((x)), where h = f /g with f, g ∈ R[[x]], we say h has (i)–degree n if [i]–degree of f = n + [i]–degree of g. For examples, if s, t ∈ x1 and p, q ∈ x2 , then 1 f = 2sp−2 + t2 p3 q 10 − t 2 has [1]–degree 4 and [2]–degree 11, and h= f t2 q 15 has (1)–degree 2 and (2)–degree −4. Now let k = {h ∈ R((x)) : h has (i)–degree 0, i ∈ α}. It is a routine to check that k is a subfield of R[[x]]. Then R[[x]] is a vector space over k. So by our assumption, the subspace y spanned by x has a basis, say B. Now for i ∈ α, s ∈ xi , and s 6= 0, by the definition of a basis, we can write X s= a(s, b) · b, b∈Bs where Bs ⊂ B is finite, and a(s, b) ∈ k is non-zero and depends on s and b. Now if t ∈ xi , we have X X t t= a(t, b) · b = a(s, b) · b. s b∈B b∈B t s Since a(t, b) and a(s, b)t/s are just coefficients, we see Bs = Bt and the s in a(s, b)/s could be any element in xi . Since a(s, b) ∈ k so has (i)–degree 0; 18 therefore a(s, b)/s has (i)–degree −1. Then for i ∈ α, s ∈ xi , and s 6= 0, we can write X s= a(b) · b · s, b∈Bi where Bi ⊂ B is finite and only depends on i, and each a(b) has (i)–degree −1. Bi is finite, so there are finitely many a(b)’s. Then we finish the proof by letting ui be the finitely many elements in xi appearing in the denominators of a(b)’s. 19 4 Some results requiring AC Mathematics would not be an international science if its theorems did not possess an objective content independent of the language in which we express them. Ernst Zermelo In this chapter we list some results requiring AC in various branches of mathematics. We omit the proofs since they are all classic results and can all be found in standard textbooks; we provide references instead. For a more comprehensive list of the applications of AC, see Appendix 2 of Moore 1982 [16] and Chapter 2 of Jech 1973 [8]. 4.1 Set theory Theorem 4.1 (Schröder–Bernstein Theorem, Proposition 4.21 (d) [15], Theorem 1.2.3 [19]). |x| ≤ |y| ∧ |y| ≤ |x| ⇒ |x| = |y|. Theorem 4.2 (Lemma §I 10.21 [12]). Let κ be a cardinal. Then [ κ ≥ ω ∧ ∀α ≤ κ(|xα | ≤ κ) ⇒ xα ≤ κ. α≤κ Corollary 4.3. Any union of finitely or countably many countable sets is itself countable. 4.2 General topology Theorem 4.4 (Theorem 11.3.7 [24]). A uniform space is compact if and only if it is complete and totally bounded. 20 4.3 Algebra Theorem 4.5 (Theorem §IV 4.4 [7]). Every field has an algebraic closure. 4.4 Group theory Theorem 4.6 (Nielsen–Schreier Theorem, 1.5 [10] ). Every subgroup of a free group is free. 4.5 Measure theory Theorem 4.7 (Theorem 3.23 [23]). Let {xn : n ∈ ω} be countable disjoint measurable sets and µ be the Lebesgue measure. Then [ X µ(xn ). µ xn = n∈ω n∈ω Theorem 4.8 (Vitali Theorem, 3.38 [23]). There exist set that are not Lebesgue measurable. 4.6 Functional Analysis Theorem 4.9 (Hahn–Banach Theorem, 12.4.1 [24]). Let x be a normed space and y ⊂ x a vector subspace. Then every bounded linear functional on y can be extended to a linear functional on x with the same norm. Theorem 4.10 (Lemma 3.20 [18]). Every Hilbert space has an orthonormal basis. 21 5 The consistency of AC in ZF With the increasing recognition that there are questions in mathematics which cannot be decided without this axiom [of choice], the resistance to it must increasingly disappear. Ernst Steinitz 5.1 Overview of the consistency and its proof The proof of the consistency of AC in ZF, denoted Con(ZF + AC) or Con(ZFC), was first outlined in Kurt Gödel’s 1938 paper [5] and was fully given in his 1940 monograph [6]. To prove Con(ZFC), we in fact prove Con(ZF) ⇒ Con(ZFC). By the Gödel Second Incompleteness Theorem, ZF cannot prove its own consistency. So we take Con(ZF) as an assumption. By the Gödel Completeness Theorem, if ZFC is consistent, it has a model. It follows that if ZFC has a model, it is consistent. Therefore, while Con(ZFC) can be proved within model theory [9], the unusual approach people take is to find a model for ZFC. We adopt Gödel’s approach. 5.2 A quick review of model theory Definition 5.1 (Definition 1.1.1 [14]). A language L = {F, R, C} consists a set F of functions, a set R of relations, and a set C of constants. Definition 5.2 (Definition 1.1.2 [14]). A model M for a language L is a pair (M, I), where M is a set, called the universe of M, and I is the interpretation of L in the universe M. We write M = (M, F L , RL , C L ). 22 Examples 5.3. 1. The language for set theory is just {∈}. {Vα , ∈} is a model for {∈}, for α ∈ Ord. 2. The language for additive groups is {+, e}, where + is a binary function and e is constant. A set g is a group if it satisfies the axioms ∀x(e + x = x + e = x), ∀x∀y∀z(x + (y + z) = (x + y) + z), ∀x∃y(x + y = y + x = e). A group g is Abelian if it satisfies the additional axiom ∀x∃y(x + y = y + x). The set F of all the formulas in set theory is an Abelian group with the following group structure: i) let e be ∅, the empty formula; ii) let ϕ ∧ ψ be ϕ + ψ; ii) let ¬ϕ be the inverse of ϕ. In addition, one can define multiplicative groups in a similar fasion. 3. The language for ring theory is {+, −, ·, 0, 1}, where +, −, and · are binary functions and 0 and 1 are constants. An additive Abelian group R is a ring is it satisfies axioms ∀x(x · 0 = 0 ∧ x · 1 = 1 · x = x), ∀x∀y∀z x · (y · z) = (x · y) · z ∧x−y =z ⇔x=y+z ∧ x · (y + z) = (x · y) + (x · z) ∧ x · (y + z) = (x · y) + (x · z) . If x is a set, define i) P(x) = {s ⊂ x}, the power set of x; ii) 0L = ∅; iii) 1L = x; iv) for s, t ⊂ x, s + t = (s ∪ t) − (s ∩ t); v) s = −s; 23 vi) s · t = s ∩ t. Then (P(x), +, −, ∩, ∅, x) is a model for ring theory. Definition 5.4 (Definition 1.1.4 [14]). Given L = {F, R, C}, the set of L -terms is the smallest set T such that i) C ⊂ T ii) {vi is a variable : i ∈ N} ⊂ T iii) if x1 , . . . , xn ∈ T and f ∈ F , then f (x1 , . . . , xn ) ∈ T . Definition 5.5 (Definition 1.1.5 [14]). Given L = {F, R, C}, if r ∈ R and t1 , . . . tn ∈ T , then ϕ is an atomic L-formula if it is in the form of t1 = t2 or r(t1 , . . . tn ). The set of L-formulas is the closure of atomic L-formulas under the ¬, ∧, ∨, ∃, and ∀ operations. A variable v in a formula ϕ(. . . , v, . . .) is bounded if it is inside a quantifier ∃v or ∀v; otherwise, it is free. A quantifier ∃v or ∀v in ϕ(. . . , v, . . .) is restricted if v has a domain x, i.e. ∃v ∈ x or ∀v ∈ x; otherwise, it is unrestricted. A formula is a sentence if it has no free variables. A sentence in a model is always true or false. A formula with free variables shows a property of the free variables rather than anything true or false. For example, ∃x ∈ R(x > 0) is a true sentence while x > 0 shows a property of x. Definition 5.6 (Definition 1.1.6 [14]). By induction on the length of a term t ∈ T of a model M, we define its value tM (a) over a finite sequence a = (a1 , . . . , an ) of M as follows. i) if t is the constant c, then tM (a) = cM ; ii) if t is the variable vi , then tM (a) = ai ; iii) if t is the term f (t1 , . . . , tm ), where f ∈ F is a function and t1 , . . . , tm ∈ T M are terms, then tM (a) = f M (tM 1 (a), . . . , tm (a)). 24 Definition 5.7. By induction on the complexity of a formula ϕ of a model M, we define its satisfaction ϕM over a finite sequence a = (a1 , . . . , an ) of M as follows: (t = s)M iff tM (a) = sM (a) M M (r(t1 , . . . , tm ))M iff (tM 1 (a), . . . , tm (a)) ∈ r (¬ψ)M iff ¬(ψ M ) (ψ ∧ φ)M iff ψ M and φM (ψ ∨ φ)M iff ψ M or φM (∃w(ψ(v1 , . . . vn , w)))M iff ∃b ∈ M (ψ(a1 , . . . , an , b))M (∀w(ψ(v1 , . . . vn , w)))M iff ∀b ∈ M (ψ(a1 , . . . , an , b))M where t, s ∈ T , r ∈ R, and ψ and φ are formulas. If ϕM , we say ϕ holds or is true in M. We also write M |= ϕ(a) or |=M ϕ(a). In the rest of this paper, we restrict our interest to {∈}, the language of set theory. Definition 5.8 (Definition 12.6 [9]). Let M = (M, ∈) be a model for the language of set theory, where M is a class, and let ϕ(v1 , . . . , vn ) be a formula for it. We now define ϕM , the Relativization or the the counterpart of satisfaction for classes of ϕ to M, by induction on the complexity of ϕ as follows: (vi1 ∈ vi2 )M iff vi1 ∈ vi2 (vi1 = vi2 )M iff vi1 = vi2 (¬ψ)M iff ¬(ψ M ) (ψ ∧ φ)M iff ψ M ∧ φM (ψ ∨ φ)M iff ψ M ∨ φM (∃x(ψ))M iff ∃x ∈ M (ψ M ) (∀x(ψ))M iff ∀x ∈ M (ψ M ) where vi1 and vi2 are variables, and ψ and φ are formulas. Since there is no confusion, we write M instead of M. So similarly, we can write M |= ϕ or |=M ϕ. 25 5.3 Absoluteness Definition 5.9 (Definition §IV 3.5 [12]). A ∆0 -formula is a formula ϕ (in ZF) in the form of i) an atomic formula, or ii) ¬ψ, ψ ∧ φ, ψ ∨ φ, ψ ⇒ φ, or ψ ⇔ φ, or iii) ∃x ∈ y(ψ) or ∀x ∈ y(ψ), where ψ and φ are ∆0 -formula. We say ϕ is ∆0 . First we notice that a ∆0 -formula has no unrestricted quantifiers. Then given a formula ϕ with unrestricted quantifiers, say ∃x(ψ) with ψ ∆0 , (∃x(ψ))M = ∃x ∈ M (ψ M ) is ∆0 . So we have proved Proposition 5.10. The relativization of a formula is ∆0 . Definition 5.11. A formula is a Σ1 -formula or Π1 -formula if it can be written, respectively, in the form of ∃x(ψ) or ∀x(ψ), where ψ is a ∆0 -formula. A formula is a ∆1 -formula if it is both Σ1 and Π1 . Proposition 5.12. A ∆0 -formula ϕ is ∆1 . Proof. ϕ ⇔ ∃x(x ∈ ω ∧ ϕ) and ϕ ⇔ ∀x(x ∈ ω ⇒ x ∈ R ∧ ϕ). Lemma 5.13 (Lemma 13.10 [9]). Let ϕ(v, . . .) and ψ(v, . . .) be two ∆0 -formulas. Then 1. the following formulas are Σ1 : ¬∀x ϕ(v, x, . . .) , ∃x1 . . . ∃xn ϕ(x1 , . . . , xn , v, . . .) , ∃x ϕ(v, x, . . .) ∧ ∃y ψ(v, x, . . .) , and ∃x ϕ(v, x, . . .) ∨ ∃y ψ(v, x, . . .) ; 26 2. the following formulas are Π1 : ¬∃x ϕ(v, x, . . .) , ∀x1 . . . ∀xn ϕ(x1 , . . . , xn , v, . . .) , ∀x ϕ(v, x, . . .) ∧ ∀y ψ(v, x, . . .) , and ∀x ϕ(v, x, . . .) ∨ ∀y ψ(v, x, . . .) . Proof. 1. ¬∀x(ϕ) iff ∃x(¬ϕ), ∃x1 . . . ∃xn ϕ iff ∃s∃t ∈ s∃x1 ∈ t . . . ∃xn ∈ t(s = (x1 , . . . , xn ) ∧ ϕ), ∃x(ϕ) ∧ ∃y(ψ) iff ∃x∃y(ϕ ∧ ψ), ∃x(ϕ) ∨ ∃y(ψ) iff ∃x∃y(ϕ ∨ ψ). 2. ¬∃x(ϕ) iff ∀x(¬ϕ), ∀x1 . . . ∀xn (ϕ) iff ∀x1 . . . ∀xn−1 ¬∃xn (¬ϕ) iff ∀x1 . . . ∀xn−2 ¬∃xn−1 ∃xn (¬ϕ) .. . iff ¬∃x1 . . . ∃xn (¬ϕ) iff ¬∃s∃t ∈ s∃x1 ∈ t . . . ∃xn ∈ t(s = (x1 , . . . , xn ) ∧ ¬ϕ) iff ∀s¬ ∃t ∈ s∃x1 ∈ t . . . ∃xn ∈ t(s = (x1 , . . . , xn ) ∧ ¬ϕ) , ∀x(ϕ) ∧ ∀y(ψ) iff ∀x∀y(ϕ ∧ ψ), ∀x(ϕ) ∨ ∀y(ψ) iff ∀x∀y(ϕ ∨ ψ). We will work in transitive models and show Con(ZFC). The relation between a ∆0 -formula ϕ and a transitive model M , to which we wish to relativize ϕ, will be essential in our proof, as we will see in lemma 5.18. Since all the quantifiers in ϕ 27 are unrestricted, so each bounded variable has a domain sitting in M . Therefore, thanks to the transitivity of M , all the variables in ϕM also sit in M . ∆0 -formulas are defined purely by logic symbols. The following two lemmas show that some familiar operations or expressions are ∆0 . Definition 5.14. An expression is a ∆0 -expression or simply ∆0 if it can be written as a ∆0 -formula. An operation T is a ∆0 -operation or simply ∆0 if z = T (. . .) is a ∆0 -expression. Similarly we define Σ1 , Π1 , and ∆1 -expressions and -operations. Lemma (Lemma 12.10 [9], Theorem §IV 3.9 [12]). {·, ·}, (·, ·), ⊂, ×, −, T 5.15 S ∩, ∪, , , dom, ran, f (·), and f are ∆0 -operations, where f is a function. Proof. Let x, y be sets. Since x = y iff x ⊂ y ∧ y ⊂ x, then we have z = {x, y} iff x ∈ z ∧ y ∈ z ∧ ∀u ∈ z(u = x ∨ u = y), z = (x, y) iff ∃u ∈ z(u = {x, x}) ∧ ∃u ∈ z(u = {x, y}) ∧ ∀u ∈ z(u = {x, x} ∨ u = {x, y}), x ⊂ z iff ∀u ∈ x(u ∈ z), z = x × y iff ∀u ∈ z∃s ∈ x∃t ∈ y(u = (s, t)) ∧ ∀s ∈ x∀t ∈ y∃u ∈ z(u = (s, t)), z = x − y iff z ⊂ x ∧ ¬(∃u ∈ z(u ∈ y)) ∧ ∀u ∈ x(u ∈ y ∨ u ∈ z), z = x ∩ y iff ∀u ∈ z(u ∈ x ∧ u ∈ y) ∧ ∀u ∈ x(u ∈ y ⇔ u ∈ z) ∧ ∀u ∈ y(u ∈ x ⇔ u ∈ z), z = x ∪ y iff ∀u ∈ z(u ∈ x ∨ u ∈ y) ∧ x ⊂ z ∧ y ⊂ z, z= \ x iff ∀u ∈ z∀s ∈ x(u ∈ s) ∧ ∀s ∈ x(z ⊂ s), z= [ x iff ∀u ∈ z∃s ∈ x(u ∈ s) ∧ ∀s ∈ x(s ⊂ z), z = dom(x) iff ∀u ∈ z∃s ∈ x∀t ∈ s(u ∈ t) ∧ ∀s ∈ x∀t ∈ s∀u, v ∈ t(s = (u, v) ⇒ u ∈ z), 28 \ z = ran(x) iff ∀u ∈ z∃s, t ∈ x∃p ∈ t p = t ∧ ∃v ∈ p(s = (v, u)) \ ∧ ∀s ∈ x∃t ∈ x∀p ∈ t∃u ∈ z p = t ⇒ ∃v ∈ p(s = (v, u)) , z = f (x) iff ∃u ∈ f (u = (x, z)), and g = f x iff g ⊂ f ∧ dom(g) = dom(f ) ∩ x. Lemma 5.16 (Lemma 12.10 [9], Theorems §IV 3.11, 5.1 [12]). The following expressions are ∆0 : x = ∅ = 0, x is transitive, α ∈ Ord, α is a limit ordinal, z = ω, n is a natural number, r is a relation, and f is a function. Proof. x = ∅ = 0 iff ¬(∃u ∈ z(u = u)), x is transitive iff ∀s ∈ x(s ⊂ x), α ∈ Ord iff α is transitive ∧ ∀s, t ∈ x(s ∈ t ∨ s = t ∨ t ∈ s) ∧ ∀s, t, u ∈ s(s ∈ t ∧ t ∈ u ⇒ s ∈ u), α is a limit ordinal iff α ∈ Ord ∧ ∀s ∈ x∃t ∈ x(s ∈ t), z = ω iff ¬(z = 0) ∧ z is a limit ordinal ∧ ∀u ∈ z(¬(u is a limit ordinal)), n is a natural number iff n ∈ ω, r is a relation iff ∀u ∈ r∃s ∈ [ u∃t ∈ [ u(u = (s, t)), and f is a function iff f is a relation [[ ∧ ∀s, t, u ∈ f (s, t) ∈ f ∧ (s, u) ∈ f ⇒ t = u . 29 Definition 5.17. A formula ϕ is absolute for a model M if for all v1 , . . . , vn , ϕM (v1 , . . . , vn ) ⇔ ϕ(v1 , . . . , vn ). Lemma 5.18 (Lemma 12.9 [9]). A ∆0 -formula is absolute for a transitive M . Proof. We prove by induction on the complexity of a formula ϕ. It is straightforward if ϕ is atomic, ¬ψ, ψ ∧ φ, ψ ∨ φ, ψ ⇒ φ, or ψ ⇔ φ, assuming the result holds for ψ and φ. Now assuming ψ = ψ(y, . . .) is a ∆0 -formula, if ϕ is ∃x ∈ y(ψ), then ∃x ∈ y(ψ) ⇔∃x(x ∈ y ∧ ψ) ⇔∃x(x ∈ y ∧ ψ M ) ⇔∃x(x ∈ y ∧ x ∈ M ∧ ψ M ) ⇔(∃x(x ∈ y ∧ ψ))M ⇔(∃x ∈ y(ψ))M . ∀x ∈ y(ψ) is similar. Lemma 5.19. A ∆1 -formula ϕ is absolute for a transitive M . Proof. First write ϕ in the form of ∃x(ψ), where ψ is a ∆0 -formula. Then ϕM ⇔ (∃x(ψ))M ⇔ ∃x ∈ M (ψ)M ⇔ ∃x ∈ M (ψ) ⇒ ∃x(ψ) ⇔ ϕ. by lemma 5.18 For the other direction, write ϕ in the form of ∀x(ψ), where ψ is a ∆0 -formula. Then ϕ ⇔ ∀x(ψ) ⇒ ∀x ∈ M (ψ) ⇔ ∀x ∈ M (ψ)M by lemma 5.18 M ⇔ (∀x(ψ)) ⇔ ϕM . 30 5.4 Constructible sets Definition 5.20. A subset x of a set M (as an {∈}-model) is definable over M if x = {s ∈ M : ϕM (s, a)}, where a = (a1 , . . . , an ) is a finite sequence over M . Intuitively, a definable set X is uniquely defined by some property. But its careful definition by a formula in the formal language avoids the following false paradox: Define the first undefinable ordinal λ by the property that λ is a undefinable ordinal and ∀α < λ(α is definable). Then we have defined an undefinable ordinal. Notation 5.21. Let D(M ) = {X ⊂ M : X is definable over M }. Definition 5.22 (Definitions §VI 1.4, 1.5 [12]). We now define by induction the universe L of constructible sets, in which we will show Con(ZFC): i) L0 = ∅, ( ii) Lα = iii) L = S D(Lα−1 ) S β<α Lβ α∈Ord if α is a successor , and if α is a limit Lα . Remark 5.23. Being constructible and definable are not the same thing. A set x is constructible if and only if x ∈ L, whereas definability can be only talked about in the context of a model. Definition 5.24. The Axiom of Constructibility states V = L, i.e. ∀x∃α ∈ Ord(x ∈ Lα ). Our next task is to show L is a model for ZF + V = L. After that, the job becomes straightforward: just to construct a well-ordering for all the constructible sets. Think carefully the Axiom of Constructibility. It is trivial that any set in L is constructible. But as we have seen the definitions of V and L, which seem (intuitively) very different, it is unlikely that any arbitrary set is constructible. However, it is indeed a fact that ZF + V = L is consistent (if we can show L is model of them). To show this, we need to show “x is constructible” is absolute for L, i.e. (x is constructible)L ⇔ x is constructible. First we see some properties of L and Lα . 31 Lemma 5.25 (Lemma §VI 1.3 [12]). Let M be a transitive set. 1. If x and y are definable over M , so are M − x, x ∩ y, and x ∪ y. 2. M ⊂ D(M ) ⊂ P(M ). 3. If x ⊂ M is finite, then x is definable. 4. (AC) If M is infinite, then |D(M )| = |M |. Proof. 1. Let x = {s ∈ M : ψ M (s, a)} and y = {s ∈ M : φM (s, b)}. Then M − x = {s ∈ M : ¬(ψ M (s, a))} x ∩ y = {s ∈ M : ψ M (s, a) ∧ φM (s, b)}, and x ∪ y = {s ∈ M : ψ M (s, a) ∨ φM (s, b)}. 2. Let y ∈ M . Since M is transitive, y = {x ∈ y} = {x ∈ M : x ∈ y} and thus is definable. D(M ) ⊂ P(M ) follows the definition of D(M ). 3. Follows from 1 and 2. 4. By Theorem 4.2, ∀n ∈ ω(|M n | = |M |). |D(M )| ≥ |M | follows from M ⊂ D(M ). Definition 5.26. Let ρ(x) = inf{β ∈ Ord : x ∈ Lβ+1 } be the L-rank for x ∈ L. Some properties of L and Lα listed in Lemmas 5.27 and 5.28 can be routinely checked. Lemma 5.27. Let α ∈ Ord. 1. Lα and L are transitive. 2. Lα ∈ Lα+1 (Lemma §VI 1.10 [12]). 3. ∀β < α(Lβ ⊂ Lα ) (Lemma §VI 1.6 [12]). 4. Lα = {x ∈ L : ρ(x) < α} (Lemma §VI 1.8 [12]). 5. Lα ∩ Ord = α (Lemma 13.2 [9]). 6. α ⊂ Lα and ρ(α) = α (Lemma §VI 1.9 (a) [12]). 32 Lemma 5.28. 1. ∀α ∈ Ord(Lα ⊂ Vα ) (Lemma §VI 1.11 [12]). 2. ∀n ≤ ω(Ln = Vn ) (Lemma §VI 1.13 [12]). Lemma 5.29 (Exercise 12.16 [9]). Let ϕ be a formula. Then there exists an arbitrarily large ordinal α such that for all v1 , . . . , vn ∈ Lα , ϕL (v1 , . . . , vn ) ⇔ ϕLα (v1 , . . . , vn ). Proof. We prove by induction on the complexity of ϕ. It is straightforward if ϕ is atomic, ¬ψ, ψ ∧ φ, ψ ∨ φ, ψ ⇒ φ, or ψ ⇔ φ, assuming the result holds for ψ and φ. Since ∃x ∈ y(ϕL ) ⇔ ∃x ∈ L(x ∈ y ∧ ϕL ) and ∀x ⇔ ¬∃x¬, it remains to show if the result holds for ϕ, then it also holds for ∃x(ϕ). Thus we finish the proof by seeing, for arbitrarily large α and v1 , . . . , vn ∈ Lα , ∃x ∈ Lα (ϕLα (x, v1 , . . . , vn )) ⇔∃x ∈ Lα (ϕ(x, v1 , . . . , vn )) ⇔∃x ∈ L(ϕ(x, v1 , . . . , vn )). The first equivalence is by the assumption of ϕ, and the second one is by the arbitrary largeness of α. Theorem 5.30 (Theorem 13.3 [9]). L is a model for ZF. Proof. We prove σ L holds for every axiom σ in ZF. Extensionality. We want to show L |= ∀x∀y(ϕ(x, y)), where ϕ is (∀u ∈ x(u ∈ y) ∧ ∀u ∈ y(y ∈ x)) ⇒ x = y. But ϕ is a ∆0 -formula and thus absolute. Foundation. We want to show L |= ∀x(x 6= ∅ ⇒ ∃u ∈ x(u ∩ x) = ∅). Since ∅L = ∅, let ∅ 6= x ∈ L. Then ∃u ∈ x(u ∩ x) = ∅. L is transitive so u ∈ L, and u ∩ x is ∆0 . 33 Pairing. We want to show L |= ∀x∀y∃z(z = {x, y}). Let x, y ∈ L and define z = {x, y}. Then ∃α ∈ Ord(x ∈ Lα ∧ y ∈ Lα ). So z ∈ Lα+1 since {x, y} is definable on Lα . And z = {x, y} is ∆0 . Union. We want to show [ L |= ∀x∃y(y = x). S Let x ∈ L and define y = x. Since L is transitive, z ∈ L for all z ∈ x. Use the transitivity of L again, we have ∀z ∈ x(u ∈ z ⇒ u ∈ L). Therefore y ⊂ L. So there exists an ordinal α such that x ∈ Lα and y ⊂ Lα . S t ∈ s is ∆0 . So [ [ y = {u ∈ Lα : u ∈ x} = {u ∈ Lα : Lα |= u ∈ x} is Lα -definable. Then ∀x ∈ L∃y ∈ L(y = [ x) [ x)L ⇔∀x ∈ L∃y ∈ L(y = [ L ⇔ ∀x∃y(y = x) . Power Set. We want to show L |= ∀x∃y(ϕ(x, y)), where ϕ is ∀u(u ⊂ x ⇔ u ∈ y). Let x ∈ L and y = P(x) ∩ L. There exists an α such that y ∈ Lα . Lα is transitive, so y ⊂ Lα . Then y = {u ∈ Lα : u ⊂ x}. Clearly, ϕ is ∆0 and y satisfies ϕ. Infinity. It suffices to show L contains an (infinite) inductive set , i.e. L |= ∃x(ϕ(x, u)), where ϕ is ∅ ∈ x ∧ ∀u ∈ x({u, {u}} ∈ x). We have already shown above ∅L = ∅ and the Axiom of Pairing holds in L. So ω ∈ L satisfies ϕ. 34 Replacement. We want to show if ϕ is a formula such that L |= ∀x∀y∀z(ϕ(x, y) ∧ ϕ(x, z) ⇒ y = z), where we write ϕ(x, y) instead of ϕ(x, y, . . .) for convenience, then L |= ∀x∃y∀u u ∈ y ⇔ ∃s ∈ x(ϕ(s, u)) . Let x ∈ L and define χ(s, α) to be α ∈ Ord ∧ ∃t ∈ Lα (ϕL (s, t)) ∧ ∀β < α∀u ∈ Lβ (¬ϕL (s, u)). Note that for χ(s, α), α = min{ρ(t) : ϕL (s, t)}, where ρ(t) is the L-rank of t. Then clearly, if χ(s, α) holds for some s, α is unique. Now by the Axiom of Replacement, there exists y such that ∀α α ∈ y ⇔ ∃s ∈ x(χ(s, α)) . S Let γ = (y ∩ Ord). Then γ is an ordinal, and ∀s ∈ x∀α ∈ Ord(χ(s, α) ⇒ α < γ). By the definition of χ, ∀s ∈ x∀t ∈ L∃α ∈ Ord(ϕL (s, t) ⇒ t ∈ Lα ). But t ∈ Lα ⇒ t ∈ Lγ . So Lγ satisfies L |= ∀u u ∈ Lγ ⇔ ∃s ∈ x(ϕ(s, u)) . Separation. We want to show if ϕ is a formula, then L |= ∀x∀p1 . . . ∀pn ∃y∀s s ∈ y ⇔ s ∈ x ∧ ϕ(s, p1 , . . . , pn ) . Let x, p1 . . . , pn ∈ L and define y = {s ∈ x : ϕL (s, p1 , . . . , pn )}. It is easy to see by lemma 5.29 that there exists an α such that y = {s ∈ Lα : s ∈ x ∧ ϕLα (s, p1 , . . . , pn )}. Therefore y ∈ L and satisfies ∀s ∈ L s ∈ y ⇔ s ∈ x ∧ ϕ(s, p1 , . . . , pn ) . 35 5.5 Gödel operations Definition 5.31. A Gödel operation is one of or a composition of the following basic (Gödel) operations: g1 (x, y) = {x, y}, g2 (x, y) = x × y, g3 (x, y) = {(u, v) ∈ x × y : u ∈ v}, g4 (x, y) = x − y, [ g5 (x) = x, g6 (x) = dom(x), g7 (x × y) = y × x. Proposition 5.32. The following operations are Gödel operations: 1. g8 (x, y) = x ∩ y. 2. g9 (x) = ran(x). 3. g10 (x × y × z) = x × z × y. Proof. 1. g8 (x, y) = x ∩ y = x − (x − y). 2. g9 (x) = ran(x) = dom(g7 (x)). 3. Since x × y × z = x × (y × z), then g10 (x × (y × z)) = x × (z × y) = dom x × (y × z) × g7 ran(x × (y × z)) . Observe that operations g1 to g10 are in fact the set operations to generate new sets as S a group of axioms of set theory [6]. Recall Lemma 5.15 that {·, ·}, ×, −, ∩, , dom, and ran are ∆0 -operations. One may conjecture that all the Gödel operations are ∆0 . We prove this now. Lemma 5.33 (Lemma 13.7 [9]). Any Gödel operation G is a ∆0 -operation. Proof. We prove by induction on the complexity of G. To show i) z = G(. . .) is ∆0 , 36 it is equivalent to show ∀u ∈ z(u ∈ G(. . .)) ∧ ∀u ∈ G(. . .)(u ∈ z) is ∆0 ; thus it is also equivalent to show ii) u ∈ G(. . .) is ∆0 , and iii) if ϕ is a ∆0 -formula, then ∀u ∈ G(. . .)(ϕ(u, . . .)) is ∆0 . (Similarly, ∃u ∈ G(. . .)(ϕ(u, . . .)) is also ∆0 .) First we show the basic Gödel operations g1 to g7 are ∆0 . Since we have already seen in Lemma 5.15 that g1 , g2 , g4 , g5 , and g6 are ∆0 , it remains to show g3 and g7 are also ∆0 . This is true since z = g3 (x, y) = {(u, v) ∈ x × y : u ∈ v} iff ∀s ∈ z∃u ∈ x∃v ∈ y(u ∈ v ∧ s = (u, v)) ∧ ∀u ∈ x∀v ∈ y∃s ∈ z(u ∈ v ⇒ s = (u, v)), and z = g7 (x) = {(u, v) : (v, u) ∈ x} iff [[ ∀p ∈ z∃s ∈ x∃u, v ∈ x(s = (v, u) ∧ p = (u, v)) [[ ∀s ∈ x∀u, v ∈ x∃p ∈ z(s = (v, u) ⇒ p = (u, v)). Therefore, i), ii) and iii) hold for the basic cases. We now proceed to the inductive steps. Assume i), ii) and iii) hold for G(. . .) and H(. . .), two Gödel operations. We would like to show the basic Gödel operations on G(. . .) and H(. . .) are ∆0 . For convenience, we drop (. . .) in G(. . .) and H(. . .). z = g1 (G, H) = {G, H} iff ∀u ∈ z(u = G ∨ u = H) ∧ ∃u, v ∈ z(u = G ∧ v = H), z = g2 (G, H) = G × H iff ∀u ∈ z∃s ∈ G∃t ∈ H(u = (s, t)) ∧ ∀s ∈ G∀t ∈ H∃u ∈ z(u = (s, t)), z = g3 (G, H) = {(u, v) ∈ G × H : u ∈ v} iff ∀s ∈ z∃u ∈ G∃v ∈ H(u ∈ v ∧ s = (u, v)) ∧ ∀u ∈ G∀v ∈ H∃s ∈ z(u ∈ v ⇒ s = (u, v)), 37 z = g4 (G, H) = G − H iff ∀u ∈ z(u ∈ G) ∧ ¬(∃u ∈ z(u ∈ H)) ∧ ∀u ∈ G(u ∈ H ∨ u ∈ z), z = g5 (G) = [ G iff ∀u ∈ z∃s ∈ G(u ∈ s) ∧ ∀s ∈ G(s ⊂ z), z = g6 (G) = dom(G) iff ∀u ∈ z∃s ∈ G∀t ∈ s(u ∈ t) ∧ ∀s ∈ G∀t ∈ s∀u, v ∈ t(s = (u, v) ⇒ u ∈ z), z = g7 (G) = {(u, v) : (v, u) ∈ G} [[ iff ∀p ∈ z∃s ∈ G∃u, v ∈ G(s = (v, u) ∧ p = (u, v)) [[ ∀s ∈ G∀u, v ∈ G∃p ∈ z(s = (v, u) ⇒ p = (u, v)). For g1 , we proved directly. For g2 to g7 , we used inductive hypotheses. For g7 , SS we are allowed to use ∀u ∈ G since we already proved the g5 case. This completes the proof. Having seen that any Gödel operation can be somehow expressed by a ∆0 formula, we naturally ask if the reverse is also true. Next we show the set defined by a ∆0 -formula is a Gödel operation. Theorem 5.34 (Gödel’s Normal Form Theorem, 13.7 [9]). Given a ∆0 -formula ϕ(v1 , . . . , vn ) and a finite sequence x1 , . . . , xn of sets , these exists a Gödel operation G on x1 , . . . , xn such that G is defined by ϕ, i.e. G(x1 , . . . , xn ) = {(v1 , . . . , vn ) ∈ x1 × . . . × xn : ϕ(v1 , . . . , vn )}. Proof. Let v = (v1 , . . . , vn ) and x = x1 × . . . × xn . We prove by induction on the complexity of ϕ. As we see the definition 5.9, the induction is done by the number of symbols ∈, =, ∧, ∨, ¬, ⇒, ⇔, and restricted quantifiers, ∃vi ∈ vj and ∀vi ∈ vj with i 6= j. We can assume ϕ contains only ∈, ∧, ¬, and restricted ∃ since a∨b ∀a ∈ b(ψ) a=b a⇒b a⇔b iff iff iff iff iff ¬(¬a ∧ ¬b), ¬∃a ∈ b(¬ψ), ∀u ∈ a(u ∈ b) ∧ ∀u ∈ b(u ∈ a), b ∨ ¬a, a ⇒ b ∧ b ⇒ a. 38 Further, if ϕ is ∃vs ∈ vt (ψ(. . . , vs , . . . , vt , . . .)), we can assume ϕ is ∃vs ∈ vt (ψ(. . . , vt , . . . , vs )), so that the quantified variable vs is of the highest index, because this is just a matter of renaming. Case 1 We prove for an atomic formula ϕ, i.e. vi ∈ vj where i 6= j, by induction on the number of variables in ϕ (allowing dummy variables). Case 1a If n = 2, then {(v1 , v2 ) ∈ x1 × x2 : v1 ∈ v2 } = g3 (x1 , x2 ) and {(v1 , v2 ) ∈ x1 × x2 : v2 ∈ v1 } = g7 (g3 (x2 , x1 )). Case 1b If n > 2 and i, j 6= n, suppose there exists a G such that G(x1 , . . . , xn−1 ) = {(v1 , . . . , vn−1 ) ∈ x1 × . . . × xn−1 : vi ∈ vj }. Then {v ∈ x : vi ∈ vj ∧ i, j 6= n} = G(x1 , . . . , xn−1 ) × xn . Case 1c If n > 2 and i, j 6= n − 1, by Case 1b we can assume there exists a G such that G(x1 , . . . , xn ) ={(v1 , . . . , vn−2 , vn , vn−1 ) ∈ x1 × . . . × xn−2 × xn × xn−1 : vi ∈ vj ∧ i, j 6= n − 1} ={((v1 , . . . , vn−2 ), vn , vn−1 ) ∈ (x1 × . . . × xn−2 ) × xn × xn−1 : vi ∈ vj ∧ i, j 6= n − 1}. Then {v ∈ x : vi ∈ vj ∧ i, j 6= n − 1} ={((v1 , . . . , vn−2 ), vn−1 , vn ) ∈ (x1 × . . . × xn−2 ) × xn−1 × xn : vi ∈ vj ∧ i, j 6= n − 1} =g10 (G(x1 , . . . , xn )). Case 1d If n > 2 and i = n − 1, j = n, then {v ∈ x : vn−1 ∈ vn } = x1 × . . . × xn−2 × g3 (xn−1 , xn ). Case 1e If n > 2 and i = n, j = n − 1, then {v ∈ x : vn ∈ vn−1 } = g10 ((x1 × . . . × xn−2 ) × g3 (xn , xn−1 )). 39 Case 2 If ϕ(v) is ¬ψ(v), suppose there exists a G such that G(x1 , . . . , xn ) = {v ∈ x : ψ(v)}. Then {v ∈ x : ϕ(v)} = x − G(x1 , . . . , xn ). Case 3 If ϕ(v) is (ψ1 ∧ ψ2 )(v), suppose for i = 1, 2, {v ∈ x : ψi (v)} = Gi (x1 , . . . , xn ). Then {v ∈ x : ϕ(v)} = G1 (x1 , . . . , xn ) ∩ G2 (x1 , . . . , xn ). Case 4 If ϕ(v) is ∃u ∈ vi (ψ(v, u)) with 1 ≤ i ≤ n, suppose by inductive hypothesis {(v, u) ∈ x × y : ψ(v, u)} = G1 (x1 , . . . , xn , y) and by Case 1 {(v, u) ∈ x × y : u ∈ vi } = G2 (x1 , . . . , xn , y) Since u ∈ v i ∈ xi ⇒ u ∈ [ xi , we have ϕ(v) iff ∃u ∈ vi (ψ(v, u)) [ iff ∃u(ψ(v, u) ∧ u ∈ vi ∧ u ∈ xi ) [ iff v ∈ dom{(v, u) ∈ x × xi : ψ(v, u) ∧ u ∈ vi }. Then [ {(v, u) ∈ x × xi : ϕ(v, u)} [ [ =x ∩ dom G1 (x1 , . . . , xn , xi ) ∩ G2 (x1 , . . . , xn , xi ) . Therefore, we have finished the proof. Notation 5.35. Let (cl)G (x) denote the closure of x under Gödel operations. Remark 5.36. Think carefully (cl)G (x) as it will play a crucial role in the rest of our proof. We construct it by induction on ω. At n = 0, let x0 = x. 40 At n = 1, apply once (arbitrary) basic Gödel operation over any u, v ∈ x0 , collect all the yielded sets, and along with x0 , define x1 = x0 ∪ {gi (u, v) : u, v ∈ x0 , i = 1, . . . , 7}. Then, inductively, at n = m, define xm+1 = xm ∪ {gi (u, v) : u, v ∈ xm , i = 1, . . . , 7}. Finally, we have (cl)G (x) = [ xn . n∈ω Proposition 5.37. (cl)G is a Σ1 -operation. Proof. In the light of Remark 5.36, we see y = (cl)G (x) [ iff ∃f f is a function ∧ dom(f ) = ω ∧ y = ran(f ) ∧ f (0) = x ∧ n ∈ ω(f (n + 1) = f (n) ∪ {gi (u, v) : u, v ∈ f (n), i = 1, . . . , 7}) . Corollary 5.38. For a transitive set M , D(M ) = (cl)G (M ∪ {M }) ∩ P(M ). Proof. By Proposition 5.10 and Gödel’s Normal Form Theorem, we have D(M ) ⊂ (cl)G (M ∪ {M }) ∩ P(M ). For the other direction, let x ⊂ M and x = G(. . .) for some Gödel operation G over M ∪ {M }. Since G is ∆0 , there exists a ∆0 -formula ϕ such that x = {v ∈ M : ϕ(v, . . .)}. But ϕ is ∆0 , so x = {v ∈ M : ϕM (v, . . .)} ∈ D(M ). Corollary 5.39. Gödel operations are absolute for transitive models. 41 5.6 Con(ZF + V = L) Lemma 5.40 (Lemma 13.14 [9]). The function D is absolute for L. Proof. We show D is ∆1 . First it is clear that z = D(x) = (cl)G (x ∪ {x}) ∩ P(x) iff ∃y y = (cl)G (x ∪ {x}) ∧ ∀u ∈ z(u ⊂ x ∧ u ∈ y) ∧ ∀u ∈ y(u ⊂ x ⇒ u ∈ z) . Since (cl)G is Σ1 by proposition 5.37 and the rest after the quantifier ∃y is ∆0 , the whole thing after ∃y is Σ1 . So D is a Σ1 -function by Lemma 5.13 (1). Now z = D(x) iff x ∈ dom(D) ∧ ∀u ¬(u = z) ⇒ ¬(u = D(x)) . Expressions x ∈ dom(D) and ¬(u = z) are ∆0 , and ¬(u = D(x)) is Π1 by Lemma 5.13 (2). Then z = D(x) is also Π1 , again by Lemma 5.13 (2). So D is ∆1 and thus absolute for L by Lemma 5.19. Lemma 5.41. ∀α ∈ Ord((Lα )L ) = Lα . Proof. We prove by contradiction. Suppose there exists an α such that (Lα )L 6= L. Let β be the smallest such ordinal. First trivially β 6= 0. If β > 0 and is a limit, then [ L (Lβ )L = Lα α<β = [ (Lα )L by Lemma 5.15 Lα by hypothesis α<β = [ α<β = Lβ contradicting the definition of β. If β is a successor, then L (Lβ )L = D(Lβ−1 ) = D (Lβ−1 )L = D(Lβ−1 ) = Lβ by Lemma 5.40 by hypothesis contradicting the definition of β. The result thus follows. 42 Corollary 5.42. L |= V = L. Proof. Since L |= V = L iff L |= ∀x∃α(α ∈ Ord ∧ x ∈ Lα ) iff ∀x ∈ L∃α ∈ L α ∈ (Ord)L ∧ x ∈ (Lα )L , the result follows from Lemmas 5.27 (5) and 5.41. Corollary 5.43. L is a model for ZF + V = L. 5.7 Con(ZFC) Theorem 5.44 (Definition §VI 4.1 [12], Theorem 13.18 [9]). There exists a wellordering for L. Proof. By induction we construct a well-ordering Cα for each Lα . We require these well-orderings have the properties that whenever α < β, x Cα y ⇒ x Cβ y, and x ∈ Lα ∧ y ∈ Lβ − Lα ⇒ x Cβ y. If α is a limit, define Cα = {(x, y) ∈ Lα × Lα : ρ(x) < ρ(y) ∨ (ρ(x) = ρ(y) ∧ (x, y) ∈Cρ(x)+1 )}, where ρ is the L-rank function. Recall, by Corollary 5.38 and Remark 5.36, Lα+1 = D(Lα ) = (cl)G (Lα ∪ {Lα }) ∩ P(Lα ) = [ xα,n ∩ P(Lα ), n∈ω where xα,0 = Lα , xα,n+1 = xα,n ∪ {gi (u, v) : u, v ∈ xα,n , i = 1, . . . , 7}. So given Cα and s, t ∈ Lα+1 , we use Gödel operations to construct Cα+1 by induction on n (in xα,n ) as following: (i) If s, t ∈ Lα , then s Cα+1 t iff s Cα t. (ii) If s ∈ Lα and t ∈ Lα+1 − Lα , then s Cα+1 t. iii) If s ∈ xα,n , t ∈ xα,m , then s Cα+1 t iff n < m. 43 (iv) If s, t ∈ xα,n+1 − xα,n , then s Cα+1 t iff is < it , where is is the least i such that ∃u, v ∈ xα,n (s = gis (u, v)). (v) If in (iv) is = it , then s Cα+1 t iff us Cα+1 ut , where us is the Cα+1 –least u such that ∃v ∈ xα,n (s = gis (us , v)). (vi) If in (v) us = ut , then s Cα+1 t iff vs Cα+1 vt , where vs is the Cα+1 –least v such that s = gis (us , vs ). (vii) If in (vi) vs = vt , it is easy to see s = t. Define CL = [ Cα . α∈Ord Then CL is a a well-ordering for L. Corollary 5.45. L |= V = L ⇒ AC. Corollary 5.46. Con(ZF) ⇒ Con(ZFC) 5.8 ZF + ¬AC Our last words for this section is the following deep result which we only state here. Theorem 5.47 ([3], [4]). There is a model of ZF in which R cannot be wellordered. Therefore Con(ZF) ⇒ Con(ZF + ¬AC). This theorem is proved by a powerful technique, called forcing, for consistency and independence proofs introduced by Paul J. Cohen in 1963 and 1964 ([3], [4]). It looks somewhat surprising and strange at the first glance, since we know how to order R and the consistency of ZFC in our “normal” mathematics world might have let us believe AC was true in general. Along with Con(ZFC), we have: Corollary 5.48. AC is independent of ZF. 44 6 The Sun and a pea To those who feel that the Banach– Tarski Paradox is absurd, it seems evident that the Axiom of Choice is the culprit. Stan Wagon In 1924, Banach and Tarski, using AC, proved a surprising and counterintuitive theorem: Theorem 6.1 (Banach–Tarski Theorem). A unit 3-ball can be divided into a finite number of disjoint pieces, such that these pieces can be reassembled to create two unit 3-balls. The proof is not included here but can be found in Chapter 3 of Wagon 1985 [21] or Chapter 5 of Wapner 2005 [22]. The idea of the proof is: assuming AC, there exist non-Lebesgue measurable sets (Theorom 4.8); divide a unit 3-ball into a finite number of non-Lebesgue measurable subsets; and these subset can somehow be put back to get two unit 3-balls. Given Banach–Tarski Theorem, one can make an arbitrarily large ball using a unit ball; hence the title of this section. Banach–Tarski theorem was one of the important reasons that AC was rejected by many mathematicians before Con(ZFC) was proved. This paradox raised the fear that AC could lead to contradiction. To remedy this, some weaker forms of AC were suggested. One of them is The Axiom of Dependent Choices Give a binary relation ≺ on a set x, if ∀s ∈ x∃t ∈ x(t ≺ s), then there exist {sn ∈ x : n ∈ ω} such that ∀n ∈ ω(sn+1 ≺ sn ). It can be shown that there exists a model for ZF + DC in which any subset of R is Lebesgue measurable; see Theorem 10.10 [8]. Therefore, ZF + DC seems to 45 appease both sides: there are choice functions but no longer the Banach–Tarski paradox. However, as Wagon argued, if some forms of non-constructive choice functions are tolerated, why are not all the choice functions [21]? AC endows and inspire our mathematicians and scientists with a brave and liberal attitude. Without AC, mathematics world would have been a collection of algorithms. Those paradoxes led by AC, like the Banach–Tarski one, instead of raising any fear to AC, actually reinforce and reshape our views to mathematics. Many fruitful results are known to be provable if and only if AC is assumed. One example is the existence of non-Lebesgue measurable sets. Another one is Theorem 6.2 (Theorem 10.11 [8]). There exists a model for ZF in which there exists a vector space with no basis. In fact, the physical problems caused by AC happens entirely in abstract mathematical worlds, more specifically, in the area of Lebesgue measurable sets. Therefore, doubling the volume of a ball simply means the physical property, volume, is an undefinable term in Lebesgue measure theory. Moreover, AC also helps to discover results in the world without assuming AC. For example, the proof of Schröder–Bernstein theorem can be without using AC, but is easier in ZFC [21]. Without AC, its discovery would be postponed. For another example, in response to the Banach–Tarski paradox, the concept of amenable groups was introduced to deal with paradoxical groups by von Neumann in 1929 [20]. 46 7 Conclusion No doubt [AC] will always be desirable despite the technical interest of various independence questions involving it and weaker principles. Dana Scott We have shown some equivalents of AC and results requiring AC from various branches of mathematics. We have also justified why AC can be used by our proof of the consistency of AC in ZF. In most areas other than set theory, AC is used as a simple fact. However, in set theory, research on this topic often goes quite deep, e.g. large cardinals. This is especially powered by forcing, the technique invented by Cohen for his proof of Con(ZF +¬ AC). For the reader who would like more information about forcing, Cohen 1963 [3], Cohen 1964 [4], Kunen 1980 [12], and Jech 2002 [9] are recommended. For contents about large cardinals, one is referred to Jech’s 2002 textbook [9]. For more fundamental results, one may consult Marker 2002 [14] for model theory or Mendelson 1987 [15] for the Gödel Completeness Theorem and the two Gödel Incompleteness Theorems. 47 References [1] Banach, S. & Tarski, A. 1924. Sur la décomposition des ensembles de points en parties respectivement congruentes. Fundamenta Mathematicae, 6: 244– 277. [2] Blass, A. 1984. Existence of bases implies the axiom of choice. Contemporary Mathematics, 31: 31–33. [3] Cohen, P.J. 1963. The Independence of the Continuum Hypothesis. Proceedings of the National Academy of Sciences of the USA, 50: 1143–1148. [4] Cohen, P.J. 1964. The Independence of the Continuum Hypothesis, II. Proceedings of the National Academy of Sciences of the USA, 51: 105–110. [5] Gödel, K. 1938. The consistency of the axiom of choice and of the generalized continuum hypothesis. Proceedings of the National Academy of Sciences of the USA, 24: 556–557. [6] Gödel, K. 1940. The consistency of the axiom of choice and of the generalized continuum–hypothesis with the axioms of set theory. Princeton University Press, Princeton, New Jersey. [7] Grillet, P.A. 2007. Abstract Algebra, 2nd ed. In: Axler, S., & Riber, K.A. (eds), Graduate texts in mathematics, 242. Springer-Verlag, New York. [8] Jech, T.J. 1973. The Axiom of Choice. North Holland Publishing Company, New York. [9] Jech, T.J. 2002. Set Theory, 3rd ed. Springer, Berlin. [10] Johnson, D.L. 1976. Presentations of groups. In: Shephard, G.C. (ed), London Mathematical Society Lecture note series, 22. Cambridge University Press, Cambridge. [11] Johnstone, P.T. 1981. Tychonoff’s theorem without the axiom of choice. Fundamenta Mathematicae, 113: 21–35 [12] Kunen, K. 1980. Set Theory: an introduction to independence proofs. In: Barwise, J., Kaplan, D., Keisler, H.J., Suppes, P., & Troelstra, A.S. (eds), Studies in logic and the foundations of mathematics, 102. North Holland Publishing Company, New York. [13] Kelley, J.L. 1950. The Tychonoff product theorem implies the axiom of choice. Fundamenta Mathematicae, 37: 75–76. 48 [14] Marker, D. 2002. Model theory : an introduction. In: Axler, S., Gehring, F.W. & Riber, K.A. (eds), Graduate texts in mathematics, 216. Springer-Verlag, New York. [15] Mendelson, E. 1987. Introduction to Mathematical Logic, 3rd ed. Wadsworth & Brooks/Cole Advanced Books & Software, California. [16] Moore, G.H. 1982. Zermelo’s Axiom of Choice: its origins, development, and influence. Springer-Verlag, New York. [17] Rubin, H & Rubin J.E. 1963. Equivalents of the Axiom of Choice. In: Brouwer, L.E.J., Beth, E.W. & Heyting, A. (eds), Studies in logic and the foundations of mathematics. North Holland Publishing Company, New York. [18] Rynne, B.P. & Youngson, M.A. 2008. Linear Functional Analysis, 2nd ed. Springer-Verlag, London. [19] Srivastava, S.M. 1998. A Course on Borel Sets. In: Axler, S., Gehring, F.W. & Riber, K.A. (eds), Graduate texts in mathematics, 180. Springer-Verlag, New York. [20] von Neumann, J. 1929. Zur allgemeinen Theorie des Maßes. Fundamenta Mathematicae, 13: 73–111. [21] Wagon, S. 1985. The Banach–Tarski Paradox. Cambridge University Press, New York. [22] Wapner, L.M. 2005. The Pea and the Sun: a mathematical paradox. A K Peters, Ltd, Wellesley, Massachusetts. [23] Wheeden, R.L. & Zygmund, A. 1977. Measure and Integral: an introduction to real analysis. In: Hewitt, E., Kobayashi, S. & Taft, E.J., etc. (eds), Monographs and textbooks in pure and applied mathematics, 43. Marcel Dekker, Inc., New York. [24] Wilansky, A. Toronto. 1970. Topology for Analysis. 49 Xerox College Publishing,
© Copyright 2026 Paperzz