MAT4450 – Note 1 – On Zorn’s lemma, the axiom of choice and
Tychonoff’s theorem
1 Zorn’s lemma and the axiom of choice
The axiom of choice in set-theory can be stated as follows:
Let {Xi}i∈I be an arbitrary family of non-empty sets indexed by a non-empty set I. Then there exists a so-called choice function c : I → ⋃i∈I Xi
satisfying c(i) ∈ Xi for every i ∈ I.
Note that if we let Πi∈I Xi denote the cartesian product of the above family,
that is,
\[ \prod_{i \in I} X_i = \Big\{ f : I \to \bigcup_{i \in I} X_i \;\Big|\; f(i) \in X_i \text{ for every } i \in I \Big\}, \]
then the axiom of choice just says that Πi∈I Xi is non-empty.
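For a finite family no axiom is needed, since the product can be listed explicitly. The following is a minimal sketch of that finite case only; the name `cartesian_product` and the sample family are assumptions made for illustration and do not come from the note.

```python
from itertools import product


def cartesian_product(family):
    """Return every choice function of a finite indexed family.

    `family` maps each index i to a finite set X_i. Each returned dict f
    satisfies f[i] in family[i] for every index i.
    """
    indices = sorted(family)
    return [dict(zip(indices, values))
            for values in product(*(sorted(family[i]) for i in indices))]


# Hypothetical sample family {X_1, X_2, X_3}.
family = {1: {"a", "b"}, 2: {"c"}, 3: {"d", "e"}}
functions = cartesian_product(family)
```

If one of the sets is empty, the list returned is empty, which is why the X_i are required to be non-empty in the statement of the axiom.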
This innocent looking axiom is often useful when the index set I is infinite
and we don’t know much about the Xi’s. A famous example due to B. Russell
illustrates this: if each Xi consists of a pair of shoes, then we can just decide
to let c(i) denote the left shoe in Xi ; but if each Xi consists of a pair of socks
(and I is infinite), then we need the axiom of choice to secure the existence
of a choice function.
It can be shown that the axiom of choice is logically equivalent to many
other statements, but for most of them it is not immediately obvious that
they should hold. This is in particular true for the so-called Zorn’s lemma,
which is the key tool in the proof of several important results. Before we
state Zorn’s lemma, we recall some terminology and a few elementary facts.
Consider a partially ordered set X. By this we mean that X is equipped
with a relation ≤ that is reflexive, transitive and anti-symmetric. If it happens that for all x, y ∈ X we have x ≤ y or y ≤ x, we say that X is totally
ordered.
For example, R equipped with its usual ordering is a totally ordered set.
On the other hand, if Ω is a non-empty set, X = P(Ω) denotes the set
consisting of all subsets of Ω, and we consider the relation ⊂ on X (in other
words, X is ordered by set-inclusion), then X is partially ordered, but not
totally ordered (unless Ω has only one element).
An element m ∈ X in a partially ordered set X is called maximal if
{x ∈ X | m ≤ x} = {m}. If we write x < y when x ≤ y and x ≠ y,
this means that m is maximal if there does not exist any x ∈ X such that
m < x. Note that X may not have any maximal element (e.g. R with its
usual ordering), and that X may have several maximal elements.
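In a finite partially ordered set the maximal elements can be found by direct search. The sketch below assumes the order is given as a function `leq(x, y)`; the names `maximal_elements` and `divides` and the divisibility example are illustrative choices, not part of the note.

```python
def maximal_elements(elements, leq):
    """Return all m in `elements` for which no x satisfies m < x."""
    elements = list(elements)
    return [m for m in elements
            if not any(leq(m, x) and x != m for x in elements)]


def divides(x, y):
    """Partial order on positive integers: x <= y means x divides y."""
    return y % x == 0


# {2, ..., 10} ordered by divisibility has several maximal elements.
several = maximal_elements(range(2, 11), divides)

# A finite totally ordered set has exactly one maximal element.
single = maximal_elements(range(1, 6), lambda x, y: x <= y)
```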
Let A be a subset of a partially ordered set X. If b ∈ X is such that a ≤ b
for all a ∈ A, we say that b is an upper bound for A. If b is an upper bound
for A and we have b ≤ b′ for any other upper bound b′ for A, we say that b
is a least upper bound for A. Clearly, such a least upper bound is unique if
it exists. If A is non-empty and totally ordered with respect to the partial
order it inherits from X (by restricting ≤ to elements of A), then we say that
A is a chain in X.
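The three notions just introduced (upper bound, least upper bound, chain) can be tested mechanically in a finite poset. This is a sketch under the assumption that the order is passed as a function `leq`; all function names are hypothetical.

```python
def is_chain(subset, leq):
    """A chain is a non-empty subset in which any two elements are comparable."""
    subset = list(subset)
    return bool(subset) and all(leq(a, b) or leq(b, a)
                                for a in subset for b in subset)


def upper_bounds(subset, elements, leq):
    """All b in `elements` with a <= b for every a in `subset`."""
    return [b for b in elements if all(leq(a, b) for a in subset)]


def least_upper_bound(subset, elements, leq):
    """The least upper bound of `subset` in `elements`, or None if there is none."""
    bounds = upper_bounds(subset, elements, leq)
    least = [b for b in bounds if all(leq(b, c) for c in bounds)]
    return least[0] if least else None


def divides(x, y):
    return y % x == 0


X = [1, 2, 3, 4, 6, 12]
```

With `X` ordered by divisibility, `[1, 2, 4]` is a chain while `[2, 3]` is not. In the poset `[2, 3, 12, 18]` the subset `[2, 3]` has the upper bounds 12 and 18 but no least upper bound.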
We can now state Zorn’s lemma:
Zorn’s lemma
Let X ≠ ∅ be an inductively ordered set, that is, X is a partially ordered
set with the property that every chain in X has an upper bound. Then X has
(at least) one maximal element.
The goal of this section is to explain how Zorn’s lemma can be deduced
from the axiom of choice. Our approach goes back to Bourbaki (a pseudonym
for a collective of French mathematicians in the 20th century). As a first step,
we introduce the following:
Weak form of Zorn’s lemma
Let Y ≠ ∅ be a strictly inductively ordered set, that is, Y is a partially
ordered set with the property that every chain in Y has a least upper bound.
Then Y has (at least) one maximal element.
Proposition 1. Assume that the weak form of Zorn’s lemma holds. Then
Zorn’s lemma also holds.
Proof. Let X ≠ ∅ be an inductively ordered set. Let Y be the set
consisting of all chains in X. Since {x} ∈ Y for all x ∈ X, Y is non-empty.
We use set-inclusion as a partial order on Y . Observe that Y is strictly
inductively ordered:
Indeed, let A be a chain in Y and set
\[ B = \bigcup_{S \in A} S = \{ x \in X \mid x \in S \text{ for some } S \in A \}. \]
Then B is a chain in X. To see this, let x, x′ ∈ B. Then x ∈ S and x′ ∈ S′
for some S, S′ ∈ A. Since A is a chain in Y, we have S ⊂ S′ or S′ ⊂ S. If
S ⊂ S′, then x, x′ ∈ S′, so, as S′ is a chain in X, we have x ≤ x′ or x′ ≤ x.
Similarly, if S′ ⊂ S, we also get x ≤ x′ or x′ ≤ x. One also sees easily that
B is non-empty. Hence, B is a chain in X, so B ∈ Y . Moreover, B is clearly
a least upper bound for A in Y . Thus Y is strictly inductively ordered, as
asserted above.
Using the assumption (that the weak form of Zorn’s lemma holds) on Y ,
we get that Y has a maximal element M . This means that M is a chain in
X which is not contained in any other chain in X. Since X is inductively
ordered, M has an upper bound m ∈ X. Then m is a maximal element of
X:
Indeed, let x ∈ X and assume that m ≤ x. Since y ≤ m ≤ x for every y ∈ M,
the set M ∪ {x} is a chain in X which contains M. Hence, by maximality of M,
we must have M ∪ {x} = M, i.e. x ∈ M. As m is an upper bound for M, this
implies that x ≤ m, and we conclude (from anti-symmetry) that x = m, as
desired.
We have thus shown that X has a maximal element, and thereby that
Zorn’s lemma holds.
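The construction in the proof of Proposition 1 can be followed step by step in a finite poset, where the set Y of all chains can be listed outright. The sketch below is only a finite illustration (no appeal to Zorn's lemma is needed there); the function names and the example poset are assumptions.

```python
from itertools import combinations


def divides(x, y):
    return y % x == 0


def is_chain(subset, leq):
    return len(subset) > 0 and all(leq(a, b) or leq(b, a)
                                   for a in subset for b in subset)


def all_chains(elements, leq):
    """The set Y of all (non-empty) chains in X, each stored as a frozenset."""
    return [frozenset(c)
            for r in range(1, len(elements) + 1)
            for c in combinations(elements, r)
            if is_chain(c, leq)]


def maximal_chains(elements, leq):
    """Maximal elements of Y with respect to set-inclusion."""
    chains = all_chains(elements, leq)
    return [M for M in chains if not any(M < N for N in chains)]


def upper_bounds_of_chain(M, elements, leq):
    """Upper bounds in X of a chain M; for maximal M each one is maximal in X."""
    return [m for m in elements if all(leq(y, m) for y in M)]


X = [1, 2, 3, 4, 6]          # ordered by divisibility
chains = maximal_chains(X, divides)
```

Here the maximal chains are {1, 2, 4}, {1, 2, 6} and {1, 3, 6}, and their upper bounds 4 and 6 are exactly the maximal elements of X.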
It remains now to explain how the weak form of Zorn’s lemma can be
proven with the help of the axiom of choice. Consider a (non-empty) strictly
inductively ordered set Y and assume that Y has no maximal element. We
want to show that this leads to a contradiction. For each y ∈ Y , define
Zy = {z ∈ Y | y < z}. Then each Zy is non-empty (otherwise y would
be maximal). Using the axiom of choice, we can pick a choice function
c : Y → ∪y∈Y Zy ⊂ Y . Note that we then have c(y) ∈ Zy , i.e., y < c(y),
for all y ∈ Y . This will give us a contradiction once we have shown that the
following result is true:
Proposition 2. Assume Y is a (non-empty) strictly inductively ordered set
and f : Y → Y is such that y ≤ f (y) for every y ∈ Y . Then f has a fixed
point in Y , that is, there exists some y ∈ Y such that y = f (y).
Proof. Pick some a ∈ Y and set A = {b ∈ Y | a ≤ b}. Then A is non-empty, and it is easy to check that A is strictly inductively ordered
with respect to the order it inherits from Y . Moreover, f maps A into itself,
since a ≤ b ≤ f (b) for every b ∈ A. We will show that f has a fixed point in A (hence in Y ).
Let us say that a subset B of A is admissible if a ∈ B, f (B) ⊂ B and
the least upper bound in A of any chain in B lies in B. Clearly, A itself is
admissible. Our aim is to find an admissible totally ordered subset S of A,
because if we then let c be a least upper bound of S in A, we will get that
c ∈ S, so f (c) ∈ S, and this will give c ≤ f (c) ≤ c. Hence c will be a fixed
point for f , as desired.
Let S denote the intersection of all admissible subsets of A. We will show
that S is admissible and totally ordered, and this will finish the proof.
We first check that S is admissible. Trivially we have a ∈ S. Moreover,
let x ∈ S. Then for every admissible B we have x ∈ B, hence f (x) ∈ B.
Thus f (x) ∈ S. So f (S) ⊂ S. Finally, let T be a chain in S and u be the
least upper bound of T in A. For every admissible B, T is a chain in B and
u is the least upper bound of T in A, so u ∈ B. It follows that u ∈ S, as
desired.
Next, we remark that if B is an admissible subset of A and B ⊂ S, then,
as S ⊂ B (by definition of S), we must have B = S. Using this remark, it
is not very difficult to verify that the following claim holds (we leave this to
the reader as an exercise):
Claim. Let c ∈ S and set Sc = {x ∈ S | x ≤ c} ∪ {x ∈ S | f (c) ≤ x}.
Say that c is extremal if f (x) ≤ c whenever x ∈ S and x < c. Then the
following assertions hold:
(i) We have Sc = S whenever c ∈ S is extremal.
(ii) Every element of S is extremal.
Hint: For (i) show that Sc is admissible whenever c is extremal. For (ii),
use (i) to show that the set of all extremal elements is admissible.
It is now straightforward to deduce that S is totally ordered. Indeed, let
c, c′ ∈ S. Then, using the above claim, c is extremal in S and c′ ∈ S = Sc .
Hence we have c′ ≤ c or c ≤ f (c) ≤ c′.
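In a finite poset the proof of Proposition 2 becomes very concrete. Every non-empty chain then has a largest element, so a subset B is admissible as soon as a ∈ B and f (B) ⊂ B, and the intersection S of all admissible sets is just the orbit a, f (a), f (f (a)), ... The sketch below illustrates this finite case only; the map `f` on the divisors of 60 (ordered by divisibility) is a made-up example.

```python
def least_admissible_set(a, f):
    """Orbit a, f(a), f(f(a)), ..., stopping at the first fixed point.

    Assumes a finite poset and y <= f(y) for every y, so the orbit
    increases strictly until it reaches a fixed point of f.
    """
    orbit = [a]
    while f(orbit[-1]) != orbit[-1]:
        orbit.append(f(orbit[-1]))
    return orbit


def f(y):
    """Hypothetical map on the divisors of 60 with y dividing f(y)."""
    for p in (2, 3, 5):
        if 60 % (y * p) == 0:
            return y * p
    return y


S = least_admissible_set(1, f)
fixed_point = S[-1]
```

The list `S` is totally ordered by divisibility and its last element is a fixed point of `f`, in line with the argument above.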
Exercise 1. Prove that the above Claim holds.
Exercise 2. Let V be a non-trivial vector space (over R or C). Let S be a
non-empty (possibly infinite) subset of V . We recall that Span(S) denotes
the subspace of V consisting of all the possible linear combinations of vectors
obtained by using finitely many vectors in S. Moreover, if S is infinite, then
S is said to be linearly independent if every non-empty finite subset of S
is linearly independent. Finally, S is said to be a (Hamel) basis for V if
Span(S) = V and S is linearly independent.
a) Let S be a linearly independent subset of V . Assume that Span(S) ≠ V
and let v ∈ V \Span(S). Show that S ∪ {v} is linearly independent.
b) Use Zorn’s lemma to show that V has a (Hamel) basis.
Hint: Consider X = {S ∈ P(V ) | S is linearly independent} ordered by
set-inclusion.
c) Let R be a linearly independent subset of V . Show that there exists a
(Hamel) basis for V that contains R.
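In finite dimensions the extension step in a) can simply be repeated until a basis is reached, so Zorn's lemma is not needed there. The following sketch illustrates that finite-dimensional analogue over the rationals; it is not a solution to the exercise, and the names `rank`, `is_independent` and `extend_to_basis` are assumptions.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, computed by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_independent(vectors):
    return rank(vectors) == len(vectors)


def extend_to_basis(independent, candidates):
    """Add each candidate that keeps the set linearly independent."""
    basis = list(independent)
    for v in candidates:
        if is_independent(basis + [v]):
            basis.append(v)
    return basis


standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
basis = extend_to_basis([(1, 1, 0)], standard)
```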
Exercise 3. Let H be a non-trivial Hilbert space. Recall that B ⊂ H is
called an orthonormal basis for H when B is orthonormal and Span(B) is
dense in H. Show that H has an orthonormal basis.
Exercise 4. Let R be a unital ring. Recall that a left ideal L in R is called
proper if L ≠ R. Use Zorn’s lemma to show that R has a maximal proper
left ideal (i.e., a proper left ideal that is not contained in any other proper
left ideal).
2 Tychonoff’s theorem
Consider a family {Xi}i∈I of non-empty topological spaces indexed by a non-empty set I. For each i ∈ I, we let pi denote the canonical map from the
cartesian product X = Πi∈I Xi onto Xi given by
\[ p_i(f) = f(i), \qquad f \in X. \]
We recall that the product topology on X is the topology generated by
the subbasis E that consists of the subsets of X of the form $p_i^{-1}(U)$ for some
i ∈ I and some open U ⊂ Xi . One important result in topology is:
Tychonoff’s theorem
Assume that all the Xi’s are compact. Then X = Πi∈I Xi is compact in
the product topology.
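The subbasis E of the product topology can be listed explicitly when the index set and the spaces are finite. The sketch below does this for two small spaces; the representation of points as dicts and the names `product_points`, `preimage` and `subbasis` are assumptions made for illustration.

```python
from itertools import product


def product_points(spaces):
    """All points of the product, each a dict i -> f(i)."""
    indices = sorted(spaces)
    return [dict(zip(indices, values))
            for values in product(*(sorted(spaces[i]) for i in indices))]


def preimage(points, i, U):
    """p_i^{-1}(U): the points f of the product with f(i) in U."""
    return [f for f in points if f[i] in U]


def subbasis(spaces, topologies):
    """Map each pair (i, U), with U open in X_i, to the set p_i^{-1}(U)."""
    points = product_points(spaces)
    return {(i, U): preimage(points, i, U)
            for i in spaces for U in topologies[i]}


# Hypothetical example: a two-point space and an indiscrete three-point space.
spaces = {1: {0, 1}, 2: {0, 1, 2}}
topologies = {
    1: [frozenset(), frozenset({1}), frozenset({0, 1})],
    2: [frozenset(), frozenset({0, 1, 2})],
}
points = product_points(spaces)
E = subbasis(spaces, topologies)
```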
When the index set is finite, this result is usually proved in introductory
courses in topology. We will give a proof of the general case that makes use
of the so-called Alexander’s lemma, whose proof illustrates the usefulness of
Zorn’s lemma.
Alexander’s lemma
Let X be a topological space with a topology generated by a subbasis E.
Assume that every cover of X with elements from E has a finite subcover.
Then X is compact.
Proof. We first recall some terminology. A family C of subsets of X, i.e.
C ⊂ P(X), is called a cover of X if ∪C∈C C = X. A finite subcover of C is
a finite subset {C1 , . . . , Cn } of C such that $\bigcup_{k=1}^{n} C_k = X$. The family C is
called an open cover of X if all elements of C are open subsets of X. Finally,
compactness of X means that every open cover of X has a finite subcover.
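For a finite family of subsets of a finite set, the notions of cover and subcover can be checked by brute force. The sketch below is only meant to fix the terminology; the names `is_cover` and `smallest_subcover` are hypothetical.

```python
from itertools import combinations


def is_cover(family, X):
    """True if the union of the sets in `family` contains X."""
    return set().union(*family) >= set(X)


def smallest_subcover(family, X):
    """A subfamily of least size that covers X, or None if there is none."""
    family = list(family)
    for r in range(len(family) + 1):
        for sub in combinations(family, r):
            if is_cover(sub, X):
                return list(sub)
    return None


X = {1, 2, 3, 4}
family = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
sub = smallest_subcover(family, X)
```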
Assume (for contradiction) that X is not compact. This means that the
set U consisting of all open covers of X without finite subcovers is non-empty.
We order U by set-inclusion: hence, if C, D ∈ U, C ⊂ D means that every
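member of C is a member of D. Remark that U is inductively ordered: if V
is a chain in U, then
\[ W = \bigcup_{C \in V} C \subset P(X) \]
gives an upper bound for V (check this! The key point is that any finite subfamily of W is contained in a single member of the chain V, so W has no finite subcover).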
By Zorn’s lemma, U has a maximal element M. Note that this implies
that if V is an open subset of X such that V ∉ M, then M ∪ {V } necessarily
has a finite subcover (otherwise M ∪ {V } would be an element of U strictly containing M).
Set B = M ∩ E. We claim that B is an open cover of X. Since B ⊂ M and M
has no finite subcover, this will imply that B is a cover of X by members of E
with no finite subcover, thus contradicting the assumption.
Since every element of B is clearly an open subset of X, we have to show
that B is a cover of X.
Assume (for contradiction) that there exists some
x ∈ X such that x ∉ ⋃B∈B B. Since M is a cover of X, there exists U ∈ M
such that x ∈ U . Further, since E generates the topology of X, there exist
V1 , . . . , Vn ∈ E such that
\[ x \in \bigcap_{j=1}^{n} V_j \subset U. \]
Note that none of the Vj’s belongs to M (for if Vj ∈ M for some j, then
Vj ∈ B, so we would have x ∈ ⋃B∈B B). Hence, using the maximality of M,
we get that for every j, there exists a subset Wj of X which is a finite union
of members of M and satisfies Vj ∪ Wj = X.
This gives
\[ U \cup \bigcup_{j=1}^{n} W_j \;\supset\; \Big( \bigcap_{j=1}^{n} V_j \Big) \cup \bigcup_{j=1}^{n} W_j = X, \]
which shows that M has a finite subcover. But this contradicts the fact that
M ∈ U.
Hence, B must be a cover of X. As pointed out above, this finishes the
proof.
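The set identity used in the last step of the proof can be checked on small finite examples. The sketch below assumes n ≥ 1 and takes the hypotheses Vj ∪ Wj = X and ⋂ Vj ⊂ U as given; the function name `final_step` and the sample sets are made up for illustration.

```python
def final_step(X, U, V, W):
    """Check that (V_1 ∩ ... ∩ V_n) ∪ W_1 ∪ ... ∪ W_n = X = U ∪ W_1 ∪ ... ∪ W_n.

    Requires V_j ∪ W_j = X for every j and V_1 ∩ ... ∩ V_n ⊂ U.
    """
    X, U = set(X), set(U)
    if not all(set(Vj) | set(Wj) == X for Vj, Wj in zip(V, W)):
        raise ValueError("need V_j ∪ W_j = X for every j")
    intersection = set.intersection(*map(set, V))
    if not intersection <= U:
        raise ValueError("need the intersection of the V_j inside U")
    union_W = set().union(*W)
    return (intersection | union_W == X) and (U | union_W == X)


X = set(range(6))
V = [{0, 1, 2, 3}, {0, 1, 4, 5}]
W = [{4, 5}, {2, 3}]
U = {0, 1}
ok = final_step(X, U, V, W)
```

The reason the identity holds is that a point lying in no Wj must lie in every Vj, hence in their intersection.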
Proof of Tychonoff’s theorem.
To show that X is compact, we will use Alexander’s lemma. Let V be a
cover of X with members of E, where
\[ E = \{ V \in P(X) \mid V = p_i^{-1}(U) \text{ for some } i \in I \text{ and some open } U \subset X_i \}. \]
We have to show that V has a finite subcover.
For each i ∈ I, set
\[ V_i = \{ U \in P(X_i) \mid U \text{ is open and } p_i^{-1}(U) \in V \}. \]
Then there exists some j ∈ I such that Vj is a cover of Xj .
Indeed, assume (for contradiction) that this is not true. This means that
for each i ∈ I, we have $\bigcup_{U \in V_i} U \neq X_i$, that is, $X_i \setminus \bigcup_{U \in V_i} U \neq \emptyset$. By
the axiom of choice, we can let
\[ x : I \to \bigcup_{i \in I} \Big( X_i \setminus \bigcup_{U \in V_i} U \Big) \]
be a choice function. Then x ∈ X, but x ∉ ⋃V∈V V : otherwise, we would have x ∈ V
for some V ∈ V, thus x ∈ $p_i^{-1}(U)$ for some i ∈ I and some U ∈ Vi , that is,
x(i) = pi (x) ∈ U for some i ∈ I and some U ∈ Vi , which is not possible in
view of the definition of x. But this shows that V is not a cover of X, giving
a contradiction.
Now, since Xj is compact and Vj is an open cover of Xj , we know that
there exist U1 , . . . , Un ∈ Vj such that $\bigcup_{k=1}^{n} U_k = X_j$. This gives that
$p_j^{-1}(U_k) \in V$ for k = 1, . . . , n and
\[ \bigcup_{k=1}^{n} p_j^{-1}(U_k) = p_j^{-1}(X_j) = X, \]
so V has a finite subcover, as desired.
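The key step of this proof, finding a coordinate j whose sets Vj cover Xj or else building a point that the cover misses, can be traced on finite spaces. In the sketch below a subbasic set p_i^{-1}(U) is stored as the pair `(i, U)`, and `min` plays the role of the choice function, which is only possible because the sample sets are finite and ordered. All names are assumptions.

```python
def covering_coordinate(spaces, cover):
    """Analyse a family of subbasic sets p_i^{-1}(U), given as pairs (i, U).

    Returns (j, None) if the sets U with (j, U) in `cover` cover X_j.
    Otherwise returns (None, x), where the point x lies in no member of
    the family, so the family is not a cover of the product.
    """
    x = {}
    for i, X_i in spaces.items():
        covered = set().union(*(U for (k, U) in cover if k == i))
        missing = set(X_i) - covered
        if not missing:
            return i, None
        x[i] = min(missing)  # explicit choice of an uncovered coordinate value
    return None, x


spaces = {1: {0, 1}, 2: {0, 1}}
good = covering_coordinate(spaces, [(1, {0}), (2, {0}), (2, {1})])
bad = covering_coordinate(spaces, [(1, {0}), (2, {0})])
```

In the first call coordinate 2 is covered, so the two sets p_2^{-1}({0}) and p_2^{-1}({1}) already form a finite subcover. In the second call the point with both coordinates equal to 1 is missed.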