Set Theory-an Introduction

1. Intro
• A set theorist is a mathematician who admits to not knowing
what the real numbers are.
• Some commonly used axioms, theorems, statements and forms of
logical reasoning are dangerous. One has to know the danger and
should not ignore it.
• Naive set theory (Cantor: everything with a property is a set,
i.e. any definable collection is a set) and Russell’s paradox: the
set of all sets ⊃ the set of all sets not containing themselves as
an element. Let us call a set "abnormal" if it is a member of
itself, and "normal" otherwise. For example, take the set of all
squares in the plane. That set is not itself a square in the plane,
and therefore is not a member of the set of all squares in the
plane. So it is "normal". On the other hand, if we take the
complementary set that contains all non-(squares in the plane),
that set is itself not a square in the plane and so should be
one of its own members as it is a non-(square in the plane). It
is "abnormal". Now we consider the set of all normal sets, R.
Determining whether R is normal or abnormal is impossible: if
R were a normal set, it would be contained in the set of normal
sets (itself), and therefore be abnormal; and if R were abnormal,
it would not be contained in the set of all normal sets (itself),
and therefore be normal. This leads to the conclusion that R is
neither normal nor abnormal: Russell’s paradox.
• There is a model of R such that there is an ε-δ discontinuous
function that is sequentially continuous. From ZF it cannot be
proven that a continuous function on a compact interval attains
its maximum.
2. What is a set?
• Main obstacle: A set cannot contain itself as an element.
• Solutions: certain axiomatic systems, e.g. Sierpinski: a set is
of higher hierarchy than its elements.
• More common: Zermelo-Fraenkel axioms
• A set contains elements (objects): a ∈ A.
• Two sets are equal iff they have the same elements.
• No set is its own element.
• Given a condition we do not know beforehand that there is an
object fulfilling it. So it is convenient to define ∅ (empty set)
as the set that does not contain any element.
• Sets might be elements of other sets.
• Subsets are sets (collections of some elements in the set). The
empty set is a subset of all sets.
• Unions of sets are sets: if A is a set and for each α ∈ A there is
a set B_α, then the union ⋃_α B_α is defined as the set containing
all the elements of at least one of the B_α.
• Complements are sets: B \ A = C is a set.
• Intersections of sets are sets: if A is a set and for each α ∈ A
there is a set B_α, then the intersection ⋂_α B_α is defined as the
set containing all the elements that are elements of all B_α.
• Cartesian product of sets: A × B is the set of ordered pairs
(x, y), x ∈ A, y ∈ B.
• Exponents of sets: A^B is the set of all functions from B into
A. Example: A^N is the set of all sequences in A.
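For finite sets the constructions above can be tried out directly. The following is a minimal sketch in Python, assuming small example sets A and B chosen only for illustration. The helper `functions` is a hypothetical name; it encodes each function from B into A as a set of (input, output) pairs.

```python
from itertools import product

# Example sets (chosen only for illustration).
A = frozenset({1, 2, 3})
B = frozenset({3, 4})

union = A | B                          # elements of at least one of A, B
intersection = A & B                   # elements of both A and B
difference = B - A                     # the complement B \ A
cartesian = frozenset(product(A, B))   # A × B as a set of ordered pairs


def functions(domain, codomain):
    """All functions domain -> codomain, each encoded as a frozenset of pairs."""
    dom = sorted(domain)
    return frozenset(
        frozenset(zip(dom, values))
        for values in product(sorted(codomain), repeat=len(dom))
    )


A_to_the_B = functions(B, A)           # A^B: all functions from B into A

print(len(cartesian))                  # |A × B| = |A| * |B|
print(len(A_to_the_B))                 # |A^B| = |A| ** |B|
```

The count |A^B| = |A|^|B| for finite sets is what motivates the exponent notation.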
3. Consistency of ZF
By Gödel’s second incompleteness theorem, the consistency of ZF cannot
be proven within ZF (provided ZF is consistent). For this one needs the
existence of "large" cardinals (another axiom in set theory).
However, some redundancy among the axioms is known.
4. Classes
• A class is a collection of sets (or sometimes other mathematical
objects) that can be unambiguously defined by a property that
all its members share.
• A class that is not a set (informally in Zermelo-Fraenkel) is
called a proper class.
• Examples:
– The class of all sets
– The class of all one-element sets
– The class of all groups, rings, fields, vector spaces, etc.
• One way to prove that a class is proper is to place it in bijection
with the class of all ordinal numbers.
5. Equivalence relations
Let M be a set.
Definition 5.1. A relation "∼" on M, i.e. a subset of M × M, is an
equivalence relation if
(1) Reflexivity: a ∼ a for all a ∈ M.
(2) Symmetry: a ∼ b ⟹ b ∼ a.
(3) Transitivity: a ∼ b and b ∼ c ⟹ a ∼ c.
Definition 5.2. Partition into classes:
M = ⋃_α M_α, where α ≠ β ⟹ M_α ∩ M_β = ∅.
The sets M_α are called classes.
Lemma 5.1. Any equivalence relation gives a partition into classes
and vice versa.
Proof.
(⟹) M_a = {x ∈ M : x ∼ a}, and M_a = M_b = M_α ⟺ a ∼ b.
(⟸) a ∼ b ⟺ a ∈ M_b.
Two classes either coincide or are disjoint!
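Both directions of the lemma can be checked on a finite set. The sketch below is illustrative only. The relation "congruent mod 3" on {0, ..., 9} is an assumed example, and the helper names `classes`, `is_partition` and `relation_from` are hypothetical.

```python
def classes(M, related):
    """The classes M_a = {x in M : x ~ a}, collected as a set of frozensets."""
    return {frozenset(x for x in M if related(x, a)) for a in M}


def is_partition(M, parts):
    """Check that the parts are non-empty, pairwise disjoint and cover M."""
    parts = list(parts)
    covers = set().union(*parts) == set(M)
    disjoint = sum(len(p) for p in parts) == len(set(M))
    return covers and disjoint and all(parts)


def relation_from(parts):
    """The relation a ~ b iff a and b lie in the same class."""
    return lambda a, b: any(a in p and b in p for p in parts)


M = range(10)
mod3 = lambda a, b: (a - b) % 3 == 0    # example equivalence relation

parts = classes(M, mod3)                # relation -> partition
again = classes(M, relation_from(parts))  # partition -> relation -> partition
```

Here `parts` consists of three classes, and going back and forth returns the same partition.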
6. Equivalence of sets
Definition 6.1. A ∼ B iff there is a bijection f : A → B.
Remark 6.1. Two finite sets are equivalent iff they have the same
number of elements (generalization!)
Example 6.1. N^N ∼ [0, 1] \ Q via continued fraction expansion. Note
that N^N is the set of all sequences of natural numbers.
Definition 6.2. A set is of infinite power (infinite) if it is not equivalent to any finite set.
Definition 6.3. A set is countable iff it is equivalent to N. If a set
is neither finite nor countable it is called uncountable.
Example 6.2.
• Z is countable.
• The even numbers are countable.
• Q is countable.
• Any subset B of a countable set A is countable or finite.
(Enumerate A : a_1, a_2, a_3, · · · and let B : a_{n_1}, a_{n_2},
a_{n_3}, · · ·; either the enumeration of B is finite or it gives a
correspondence to ω.)
• Any countable union of countable sets with specified bijections
to N is countable (write a table with a row for each A_n :
· · · , a_{n,i}, · · · and mimic the enumeration of Q).
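The table argument in the last item can be sketched as a generator that walks the table along its anti-diagonals and skips repetitions. This is a minimal illustration, assuming the specified bijections are given as one function a(n, i); the names `diagonal_pairs` and `enumerate_union` are hypothetical. If the union happens to be finite, the generator simply never yields again after exhausting it.

```python
from fractions import Fraction
from itertools import count, islice


def diagonal_pairs():
    """Enumerate all (n, i) with n, i >= 1 along the diagonals n + i = 2, 3, ..."""
    for s in count(2):
        for n in range(1, s):
            yield (n, s - n)


def enumerate_union(a):
    """Enumerate the union of A_n = {a(n, 1), a(n, 2), ...} without repetition."""
    seen = set()
    for n, i in diagonal_pairs():
        x = a(n, i)
        if x not in seen:
            seen.add(x)
            yield x


# Example: A_n = {i/n : i = 1, 2, ...}; the union is the set of positive rationals.
first_rationals = list(islice(enumerate_union(lambda n, i: Fraction(i, n)), 6))
print(first_rationals)
```

Every entry of the table sits on some finite diagonal, so every element of the union is reached after finitely many steps.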
Definition 6.4. A set is called Dedekind finite if it does not contain
a countable set as a subset. Otherwise it is called Dedekind infinite.
Lemma 6.1 (ωAC equivalent). Any infinite set is Dedekind infinite.
Proof.
• Choose arbitrary a_1 ∈ M (OK!).
• Choose a_2 ∈ M \ {a_1} (OK!).
• Continue · · · since there are always elements left because M is
infinite (OK???????).
We run into the same problems as with sequential continuity. There
is a model of R where R contains infinite sets that are Dedekind
finite! This also gives an example of a discontinuous function that is
sequentially continuous!
Theorem 6.1. I := [0, 1] ∩ R is uncountable (in ZF without AC!).
Proof.
• Any x has a unique binary expansion with infinitely
many 0’s. So I ∼ B ⊂ {0, 1}^N. (The latter is the set of all
0–1 sequences.)
• {0, 1}^N ∼ C_{1/3} ⊂ I (use base 3 expansion).
• This shows by the Cantor-Bernstein Theorem that I ∼
{0, 1}^N.
• Use Cantor’s diagonal argument to show that N ≁ {0, 1}^N.
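The diagonal argument in the last step is constructive: from any attempted enumeration of 0–1 sequences one can write down a sequence that it misses. A minimal sketch, where the enumeration `attempt` (the n-th sequence is the binary digits of n) is an assumed example and not part of the notes:

```python
def diagonal(enumeration):
    """Return the sequence d with d(n) != enumeration(n)(n) for every n."""
    return lambda n: 1 - enumeration(n)(n)


def attempt(n):
    """Example enumeration: the n-th sequence lists the binary digits of n."""
    return lambda k: (n >> k) & 1


d = diagonal(attempt)

# d differs from the n-th sequence in position n, so d is not in the list.
differs_everywhere = all(d(n) != attempt(n)(n) for n in range(200))
```

The same function `diagonal` works for any enumeration, which is why no map from N onto {0, 1}^N can exist.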
7. The Cantor–Bernstein Theorem
Theorem 7.1 (Cantor–Bernstein). Let A ∼ B_1, B ∼ A_1, A_1 ⊂ A and
B_1 ⊂ B. Then A ∼ B.
Proof. Let f : A → B_1 and g : B → A_1 be (any) corresponding
bijective maps. Consider chains: a ↦ b iff a ∈ A, b ∈ B_1 ⊂ B and
f(a) = b. Similarly, b ↦ a iff b ∈ B, a ∈ A_1 ⊂ A and g(b) = a.
• Each chain is "infinite to the right".
• Each element of A, respectively B, is contained in exactly one
chain.
• There are 3 (disjoint) possibilities: a chain C is "infinite in both
directions" (type 1), the "least element" belongs to A (type 2)
or the "least element" belongs to B (type 3).
• f maps the elements of A that belong to chains of type 1 or 2 in
a 1-to-1 way onto B^{1,2} ⊂ B, the elements of B in such chains.
The remaining elements of A are in chains of type 3 and g^{-1}
maps A \ {chains of type 1 or type 2} in a 1-to-1 fashion onto
B \ B^{1,2}.
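The chain construction can be traced by hand in a small case. The sketch below assumes A = B = N with the injections f(a) = a + 1 and g(b) = b + 1, chosen only for illustration. In this example every chain has a least element, so type 1 chains do not occur and the backward trace always terminates; for a chain of type 1 the loop would not stop.

```python
def f(a):
    return a + 1                       # injection A -> B (example)


def g(b):
    return b + 1                       # injection B -> A (example)


def f_inv(b):
    return b - 1 if b >= 1 else None   # None: b is not in the image of f


def g_inv(a):
    return a - 1 if a >= 1 else None   # None: a is not in the image of g


def chain_starts_in_A(a):
    """Follow the chain through a backwards; True if its least element is in A."""
    in_A, x = True, a
    while True:
        prev = g_inv(x) if in_A else f_inv(x)
        if prev is None:
            return in_A
        x, in_A = prev, not in_A


def h(a):
    """The bijection A -> B: use f on type 2 chains and g^{-1} on type 3 chains."""
    return f(a) if chain_starts_in_A(a) else g_inv(a)


print([h(a) for a in range(6)])
```

In this example h swaps 2k and 2k + 1, which is a bijection of N even though neither f nor g is onto.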
8. The power of a set, Cardinals
Definition 8.1. A cardinal number is an equivalence class of sets.
Definition 8.2. The power or cardinality of a set A is the corresponding cardinal number denoted by m(A).
Remark 8.1. Some facts.
• The cardinality of a finite set is a natural number.
• The natural numbers have a cardinality denoted by ℵ_0.
• There are 4 possibilities:
– A ∼ B_1 ⊂ B and B ∼ A_1 ⊂ A
– A ∼ B_1 ⊂ B but ∀A_1 ⊂ A: A_1 ≁ B
– B ∼ A_1 ⊂ A but ∀B_1 ⊂ B: B_1 ≁ A
– ∀A_1 ⊂ A: A_1 ≁ B and ∀B_1 ⊂ B: B_1 ≁ A.
In the first case (Cantor-Bernstein) m(A) = m(B), in the
second we write m(A) < m(B), in the third m(A) > m(B).
The most interesting case is the fourth. Without any further
axioms it can happen or not. Assuming that the fourth
case does not happen, i.e. that we can compare any two sets
(Trichotomy), is equivalent to the Axiom of Choice that
we will study later.
For a set M we write P(M ) for its power set, i.e. the set of all
(including the empty set!) subsets of M .
Theorem 8.1 (Cantor). m(M) < m(P(M)).
Proof.
• x ∈ M ↦ {x} ∈ P(M) is a bijection onto its image. So
M and P(M) are comparable.
• Assume there is a bijection x ↦ f(x) = M_x ∈ P(M).
• Consider the set X := {x ∈ M : x ∉ M_x} ⊂ M, i.e. X ∈
P(M).
• As in Russell’s paradox, X ≠ f(y) for all y ∈ M. If X = f(y),
then y can be neither in X nor in its complement!
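For a finite set M the key step of the proof can be verified exhaustively: for every map f : M → P(M), the set X is not a value of f. A minimal sketch, assuming the three-element set M = {0, 1, 2} as an example; the helper names are hypothetical.

```python
from itertools import combinations, product


def power_set(M):
    """All subsets of M (including the empty set) as frozensets."""
    M = sorted(M)
    return [frozenset(c) for r in range(len(M) + 1) for c in combinations(M, r)]


def diagonal_set(M, f):
    """X = {x in M : x not in f(x)} for a map f : M -> P(M) given as a dict."""
    return frozenset(x for x in M if x not in f[x])


M = [0, 1, 2]
P = power_set(M)

# Run through all 8**3 maps f : M -> P(M); X is never in the image of f.
never_hit = all(
    diagonal_set(M, dict(zip(M, images))) not in images
    for images in product(P, repeat=len(M))
)
```

So no map from M to P(M) is onto, and in particular none is a bijection.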
Remark 8.2. We summarize:
• There is no ”largest” cardinal.
• The collection of all cardinal numbers is a proper class!
• Notation: m(P(M)) = 2^{m(M)}, in analogy with finite sets.
• 2^{ℵ_0} = c, where c = m(R) is the power of the continuum.
• m({0, 1}^N) = c.
• m(N^N) = c. For both of the last statements consider the
equivalence to the real numbers.
• The continuum hypothesis states that there is no cardinal
number m such that ℵ_0 < m < c.
• One can prove that m(Borel sets on R) = c. All subsets of C_{1/3}
have outer Lebesgue measure 0 and hence are Lebesgue measurable.
So m(Lebesgue measurable sets) > c = m(Borel sets).
9. CH and GCH
CH There is no cardinal m such that ℵ_0 = m(N) < m < c = 2^{ℵ_0}.
GCH For any infinite cardinal number m there is no other cardinal n
such that m < n < 2^m.
10. Well-ordering
Definition 10.1. A (partial) ordering a ≤ b is a relation, i.e. a subset
of M × M, with:
• a ≤ a (reflexivity)
• a ≤ b, b ≤ c ⟹ a ≤ c (transitivity)
• a ≤ b, b ≤ a ⟹ a = b (antisymmetry)
Example 10.1. N, R, P(M ) with subset relation, · · ·
Definition 10.2. A set M is totally ordered if for any distinct a, b ∈
M either a < b or b < a.
Example 10.2. N, R but not P(M ).
Definition 10.3. A well-ordering of a set M is a total ordering such
that any non-empty subset M_1 ⊂ M, i.e. M_1 ∈ P(M) \ {∅}, has a least
element, i.e. ∃x ∈ M_1 such that for all a ∈ M_1 we have x ≤ a.
Example 10.3. N but not R.
Definition 10.4. A subset M_1 ⊂ M of a (partially) ordered set M is
called a chain if it is totally ordered, i.e. ∀a, b ∈ M_1 either a ≤ b or
b ≤ a.
Definition 10.5. An element a ∈ M of a (partially) ordered set M is
said to be an upper bound for a subset M_1 ⊂ M if x ≤ a for all
x ∈ M_1. A lower bound is defined analogously. A subset with an
upper/lower bound is said to be bounded from above/below.
An element a ∈ M of a (partially) ordered set M is maximal if
a ≤ x ⟹ a = x.
Definition 10.6. An element a ∈ M_1 ⊂ M of a subset of a (partially)
ordered set is said to be comparable in M_1 if
∀x ∈ M_1: a ≤ x or x ≤ a.
Remark 10.1. A subset M_1 ⊂ M of a (partially) ordered set M is a
chain iff each of its elements is comparable in M_1.
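On a finite partially ordered set these definitions translate directly into checks. The sketch below is illustrative only and assumes the divisibility order on M = {2, 3, 4, 6, 8, 9} as an example; the function names are hypothetical.

```python
from itertools import combinations


def is_chain(subset, leq):
    """True if any two elements of the subset are comparable."""
    return all(leq(a, b) or leq(b, a) for a, b in combinations(subset, 2))


def is_upper_bound(a, subset, leq):
    """True if x <= a for all x in the subset."""
    return all(leq(x, a) for x in subset)


def maximal_elements(M, leq):
    """All a in M such that a <= x implies a == x."""
    return [a for a in M if all(a == x or not leq(a, x) for x in M)]


divides = lambda a, b: b % a == 0   # example partial ordering: a <= b iff a | b
M = [2, 3, 4, 6, 8, 9]

print(is_chain([2, 4, 8], divides))      # a chain: 2 | 4 | 8
print(is_chain([2, 3], divides))         # 2 and 3 are not comparable
print(maximal_elements(M, divides))      # several maximal elements, no largest one
```

The example shows that a partially ordered set can have several maximal elements without having a largest element.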
11. The Axiom of Choice and related statements
AC: For any collection {A_α}_{α∈I} of non-empty sets, where the A_α and
I are sets, there is a function f : {A_α}_{α∈I} → ⋃_α A_α such that for
any α we have f(A_α) ∈ A_α.
Remark 11.1. In contrast to ZF this axiom allows one to build sets
without specifying the elements. One can choose one shoe from each pair
in an infinite collection of pairs of shoes (always choose the left!)
but one needs AC to choose from pairs of socks.
Since any non-empty set contains an element, AC is not needed for a
finite collection of sets!
AC is not harmless at all but it has many convenient applications.
Sometimes one uses AC even when it is not needed (definition of a
differentiable structure).
One should always be aware when one uses AC!
The AC is equivalent to the following 4 statements, each of which
has its advantages in different applications.
Theorem 11.1 (Maximal chain theorem of Hausdorff). In any partially ordered set, every totally ordered subset (chain) is contained in a
maximal totally ordered subset (chain).
AC implies the Maximal chain theorem of Hausdorff.
• First change ≤ to ⊂ by defining M ⊃ M_x := {y ∈ M : y ≤ x}.
• Let X be a non-empty collection (it contains ∅) of subsets of M
with the following properties: every subset of a set in X belongs
to X, and the union of each chain in X belongs to X.
• Let f : P(M) \ {∅} → M be a choice function for M (AC!). For
each A ∈ X let A* := {x ∈ M : A ∪ {x} ∈ X}. Define g : X → X by
g(A) = A if A* \ A = ∅, and g(A) = A ∪ {f(A* \ A)} otherwise.
Then g(A) contains at most 1 element more than A.
• We want to prove that g(A) = A for some A ∈ X. That will
imply the theorem.
• We say a subcollection J of X is a tower if
– ∅ ∈ J.
– If A ∈ J then g(A) ∈ J.
– If C is a chain in J then ⋃_{A∈C} A ∈ J.
• The intersection of towers is a tower. Let J_0 be the intersection
of all towers, i.e. the smallest. We are going to prove that it is
a chain.
• Let C be comparable in J_0.
• Assume A ∈ J_0 and A ⊊ C; then g(A) ⊂ C. Otherwise, since C is
comparable with g(A), we get A ⊊ C ⊊ g(A), contradicting that g(A)
has at most one more element than A.
• Let U be the collection of sets A ∈ J_0 such that A ⊂ C or
g(C) ⊂ A. We want to show that U is a tower.
– ∅ ∈ U.
– A ⊊ C then (previously) g(A) ⊂ C, i.e. g(A) ∈ U.
– A = C then g(C) = g(A) (i.e. g(C) ⊂ g(A)) and g(A) ∈ U.
– g(C) ⊂ A then g(C) ⊂ A ⊂ g(A) and g(A) ∈ U.
– The union of the elements over a chain is by the definition
of U contained in U.
• U = J_0.
• If C is comparable then so is g(C) by the previous considerations:
if A ∈ J_0 = U then either A ⊂ C ⊂ g(C) or g(C) ⊂ A.
• g maps comparable sets to comparable sets. The union of
comparable sets over a chain is comparable. That implies that
comparable sets constitute a tower and hence J_0 consists of
comparable sets only, i.e. J_0 is a chain itself.
• Since J_0 is a chain and a tower the union A over all elements of
J_0 is in J_0. Therefore g(A) ⊂ A since A includes all sets in J_0.
On the other hand A ⊂ g(A). Therefore A = g(A).
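In a finite partially ordered set no choice axiom is needed, and the map g from the proof can simply be iterated until it reaches a fixed point. The sketch below assumes X is the collection of all chains in M, uses `min` as an explicit choice function f, and takes the divisibility order on a small example set; all of these are illustrative assumptions.

```python
def is_chain(A, leq):
    """True if any two elements of A are comparable."""
    return all(leq(a, b) or leq(b, a) for a in A for b in A)


def g(A, M, leq, choose=min):
    """One step of g: add one chosen element of A* \\ A, if there is any."""
    candidates = {x for x in M if x not in A and is_chain(A | {x}, leq)}
    return A if not candidates else A | {choose(candidates)}


def maximal_chain(start, M, leq):
    """Iterate g from a given chain until g(A) = A."""
    A = frozenset(start)
    while True:
        B = g(A, M, leq)
        if B == A:
            return A
        A = B


divides = lambda a, b: b % a == 0   # example ordering: a <= b iff a | b
M = {1, 2, 3, 4, 6, 8, 12}

print(sorted(maximal_chain(set(), M, divides)))
print(sorted(maximal_chain({3}, M, divides)))
```

A fixed point g(A) = A is a chain that cannot be extended, i.e. a maximal chain containing the starting chain. The tower argument in the proof is what replaces this finite iteration in the infinite case.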
Theorem 11.2 (Zorn’s Lemma). Every non-empty partially ordered
set in which every chain (i.e., totally ordered subset) has an upper
bound contains at least one maximal element.
Maximal chain theorem of Hausdorff implies Zorn’s Lemma.
Take a maximal chain C in the set (it exists by the Maximal chain
theorem) and an upper bound a for it. Then a ∈ C and a is a maximal
element. Otherwise there is an element b in the set such that
a < b ∉ C, so C ∪ {b} is a chain and the chain C is not maximal.
Theorem 11.3 (Zermelo’s Well-Ordering Principle (WOP)). Every
set can be well-ordered.
Zorn’s lemma implies WOP.
• If A, B are two well-ordered sets then A is said to be a
continuation of B if B ⊂ A, the subset inclusion preserves the
order in B, A, and B is an initial segment of A.
• If a collection of well-ordered sets forms a chain C with respect
to continuation then there is a unique well-ordering of U, the
union of the sets in the chain, that is a continuation of the
well-ordering of all sets in C. This well-order is defined in the
following way: take a, b ∈ U. Then there are A, B ∈ C with a ∈ A,
b ∈ B. By the continuation property either A = B or one is
the continuation of the other. That defines the order between
a and b. It is clearly a well-ordering.
• Consider U, the collection of well-ordered subsets of M, i.e.
subsets together with a (chosen) well-ordering. Then U is partially
ordered by continuation. This collection contains the empty set.
If C is a chain in U then the union of the sets in C is an upper
bound. Hence, by Zorn’s lemma, there is a maximal well-ordered set
M′ in U. This set must be equal to M since otherwise we could add
a "missing" element x ∈ M \ M′ on top by x > y, ∀y ∈ M′.
WOP implies AC.
Consider U = ⋃_{α∈I} A_α and choose (no AC needed for this!) a
well-ordering on the set U. Since A_α ⊂ U we can define f(A_α) as the
minimal element of A_α ⊂ U. This is a choice function.
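The construction of the choice function can be written down once a well-ordering is fixed. The sketch below is a finite illustration only: the family of sets is an assumed example, and the well-ordering of the integers 0, −1, 1, −2, 2, · · · is encoded by a hypothetical sort key.

```python
def choice_function(family, key=None):
    """f(A) = least element of A with respect to the ordering given by key."""
    return {A: min(A, key=key) for A in family}


# Example well-ordering of Z: 0, -1, 1, -2, 2, ... (by absolute value, then sign).
well_order = lambda z: (abs(z), z)

family = [frozenset({-3, 2, 5}), frozenset({7}), frozenset({-4, 4, 9})]
f = choice_function(family, key=well_order)

print(f[frozenset({-3, 2, 5})])
```

The point of the argument is that a single well-ordering of the union yields the choices for all sets at once, with no further choices needed.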
Theorem 11.4 (Trichotomy of Cardinals). If two sets are given, then
either they have the same cardinality, or one has a smaller cardinality
than the other.
Proof. This proof comes later.
There are other statements commonly used in analysis, topology,
combinatorics, · · · that are not provable in ZF but are implied by, or
even equivalent to, AC.
• Ultrafilter Lemma
• Krein-Milman Theorem
• Tychonov not for modules: ∏_{n=1}^∞ Z is not a free Z-module.
• Bernstein sets
• Under CH (or weaker) there is no finite, σ-additive, non-atomic
measure such that any subset of R is measurable (Ulam).
• Non-measurable sets via ultrafilters.
• Ellis’ Theorem: Any compact semi-group has an idempotent
element.
There are also stronger axioms that imply AC. The most familiar one
is the generalized continuum hypothesis (GCH). On the other hand, CH
is independent of ZFC (Cohen).
12. Tomas
• Without AC! There is a function ℵ associating to any cardinal
m an ℵ, i.e. ℵ(m) = m(X) where X is an ordinal, with the
property that ℵ(m) ≰ m (i.e. not necessarily comparable)
and ℵ(m) < 2^{2^{2^m}} (comparable).
• From this follows (without AC): There is no injective net in
any set (a net x_α, α along all ordinal numbers and x_α ≠ x_β if
α ≠ β). This gives an alternative proof of Zorn’s lemma (use
AC to construct an injective net if there is no maximal element).
• From this follows (without AC): Trichotomy ⟹ WOP.
• From this also follows (without AC): GCH ⟹ WOP.
• The collection of all ordinals is a proper class.
• Given any sequence of infinite ordinals strictly less than a given
ordinal β > ω_1, they are all prefixes of an ordinal β′ <
β. This proves that [0, ω_1) is not compact but sequentially
compact. (Similar for nets where the index set has cardinality
less than β.)