1 The Derivative of the Inverse Operation
In this section, we show that the mapping $A \mapsto A^{-1}$ is continuously differentiable. This will
be used to prove that the inverse function produced in the inverse function theorem has a
continuous derivative.
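Definition 1.1. $\mathrm{GL}(\mathbb{R}^k) := \{L \in \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k) : L \text{ is invertible and } L^{-1} \in \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)\}$.
Exercise: Is $\mathrm{GL}(\mathbb{R}^k)$ a subspace of $\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$? Why or why not?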
The following proposition implies that if $\|A\| < 1$, then $\mathrm{Id} - A$ is invertible. In other words:
$B_1(\mathrm{Id}) \subseteq \mathrm{GL}(\mathbb{R}^k)$.
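Proposition 1.2. Suppose $\|A\| < 1$. Then $\mathrm{Id} - A \in \mathrm{GL}(\mathbb{R}^k)$.
Proof. Let $S_n := \sum_{i=0}^{n} A^i$. We will now show that $S_n$ is a Cauchy sequence in $\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$.
Suppose that $m > n$. Then, by the triangle inequality and $\|A^i\| \le \|A\|^i$,
\[
\|S_m - S_n\| = \Bigl\| \sum_{i=n+1}^{m} A^i \Bigr\|
\le \sum_{i=n+1}^{m} \|A\|^i
= \frac{\|A\|^{n+1} - \|A\|^{m+1}}{1 - \|A\|}
\le \frac{\|A\|^{n+1}}{1 - \|A\|}.
\]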
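Since $\|A\| < 1$, we know that $\|A\|^n \to 0$, and so given $\varepsilon > 0$, there is an $N \in \mathbb{N}$ such
that whenever $n \in \mathbb{N}$ and $n > N$, $\frac{\|A\|^{n+1}}{1 - \|A\|} < \varepsilon$. Suppose now that $m > n > N$. Then the
inequalities above imply that $\|S_m - S_n\| < \varepsilon$. Thus, $S_n$ is a Cauchy sequence in $\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$,
which is complete, and so $S_n \to S \in \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$. We now show that $S$ is an inverse of $\mathrm{Id} - A$, i.e. $(\mathrm{Id} - A)S = \mathrm{Id}$
and $S(\mathrm{Id} - A) = \mathrm{Id}$. We have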
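\[
\begin{aligned}
\|(\mathrm{Id} - A)S - \mathrm{Id}\|
&= \|(\mathrm{Id} - A)(S - S_n) + (\mathrm{Id} - A)S_n - \mathrm{Id}\| \\
&\le \|\mathrm{Id} - A\| \, \|S - S_n\| + \|S_n - A S_n - \mathrm{Id}\| \\
&= \|\mathrm{Id} - A\| \, \|S - S_n\| + \Bigl\| \Bigl( \sum_{i=0}^{n} A^i \Bigr) - A \Bigl( \sum_{i=0}^{n} A^i \Bigr) - \mathrm{Id} \Bigr\| \\
&= \|\mathrm{Id} - A\| \, \|S - S_n\| + \Bigl\| \Bigl( \sum_{i=0}^{n} A^i \Bigr) - \Bigl( \sum_{i=0}^{n} A^{i+1} \Bigr) - \mathrm{Id} \Bigr\| \\
&= \|\mathrm{Id} - A\| \, \|S - S_n\| + \bigl\| \mathrm{Id} - A^{n+1} - \mathrm{Id} \bigr\| \\
&\le \|\mathrm{Id} - A\| \, \|S - S_n\| + \|A\|^{n+1}.
\end{aligned}
\]
Because $S_n \to S$ and $\|A\| < 1$ implies that $\|A\|^{n+1} \to 0$, letting $n \to \infty$ shows that $\|(\mathrm{Id} - A)S - \mathrm{Id}\| = 0$.
Showing that $S(\mathrm{Id} - A) = \mathrm{Id}$ is done similarly.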
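Notice that the preceding proof shows that the series $\sum_{i=0}^{\infty} A^i$ converges in $\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$ and
is the inverse of $\mathrm{Id} - A$. In other words: $(\mathrm{Id} - A)^{-1} = \sum_{i=0}^{\infty} A^i$. We now show that $\mathrm{GL}(\mathbb{R}^k)$ is
an open subset of $\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$.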
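As a numerical illustration of the series formula, the following Python sketch forms a partial sum of the series for one particular 2 x 2 matrix and checks that multiplying it by Id − A gives a result close to Id. The matrix `A`, the number of terms, and all helper names are assumptions chosen for illustration; they are not part of the notes.

```python
# Sketch: partial sums S_n = Id + A + ... + A^n for a 2x2 matrix with norm < 1.
# The matrix A and every helper name here are illustrative assumptions.

def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(A, n_terms):
    """Return S_n = Id + A + ... + A^n with n = n_terms."""
    power = identity(len(A))   # current power A^i, starting at A^0 = Id
    total = identity(len(A))   # running sum, starting at S_0 = Id
    for _ in range(n_terms):
        power = mat_mul(power, A)
        total = mat_add(total, power)
    return total

def max_abs_entry(X):
    """Largest absolute entry; used only to measure how small a matrix is."""
    return max(abs(x) for row in X for x in row)

# The Frobenius norm of A is sqrt(0.14) < 1, so the operator norm is < 1 too.
A = [[0.2, 0.1], [0.0, 0.3]]
Id = identity(2)
S = neumann_partial_sum(A, 60)

# Both residuals should be at rounding-error level.
right_residual = max_abs_entry(mat_sub(mat_mul(mat_sub(Id, A), S), Id))
left_residual = max_abs_entry(mat_sub(mat_mul(S, mat_sub(Id, A)), Id))
print(right_residual, left_residual)
```

The proof predicts that the residual (Id − A)S_n − Id equals −A^{n+1}, so it shrinks geometrically as the number of terms grows.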
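Theorem 1.3. $\mathrm{GL}(\mathbb{R}^k)$ is an open subset of $\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$.
Proof. Suppose $S \in \mathrm{GL}(\mathbb{R}^k)$, and let $r := \frac{1}{\|S^{-1}\|}$. We will now show that $B_r(S) \subseteq \mathrm{GL}(\mathbb{R}^k)$.
Suppose then that $T \in B_r(S)$. Notice that
\[
\|\mathrm{Id} - S^{-1}T\| = \|S^{-1}S - S^{-1}T\| \le \|S^{-1}\| \, \|S - T\| < \|S^{-1}\| \cdot \frac{1}{\|S^{-1}\|} = 1.
\]
Therefore, $\mathrm{Id} - (\mathrm{Id} - S^{-1}T)$ is invertible by Proposition 1.2. Note that we have
\[
T = S - S + T = S - S(\mathrm{Id} - S^{-1}T) = S\bigl(\mathrm{Id} - (\mathrm{Id} - S^{-1}T)\bigr),
\]
and so $T$ is the product of elements of $\mathrm{GL}(\mathbb{R}^k)$, and so $T$ is itself invertible.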
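We now show that the operation $\mathrm{Inv} : \mathrm{GL}(\mathbb{R}^k) \to \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$, $\mathrm{Inv}(S) = S^{-1}$, is differentiable,
and the derivative is continuous.
Theorem 1.4. $\mathrm{Inv} : \mathrm{GL}(\mathbb{R}^k) \to \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$ is differentiable, and $\mathrm{Inv}'(S)A = -S^{-1}AS^{-1}$
for any $A \in \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$. Moreover, the derivative is continuous.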
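Proof. Suppose $S \in \mathrm{GL}(\mathbb{R}^k)$, and suppose that $\|A\| < \frac{1}{\|S^{-1}\|}$. By Theorem 1.3, $S + A$ is
invertible. In fact, the proof of Theorem 1.3 shows that if $T := S + A$, then $T = S\bigl(\mathrm{Id} - (\mathrm{Id} - S^{-1}T)\bigr)$
with $\|\mathrm{Id} - S^{-1}T\| < 1$. Therefore, using the series for the inverse from Proposition 1.2,
\[
\begin{aligned}
\mathrm{Inv}(S + A) = (S + A)^{-1} = T^{-1}
&= \bigl(\mathrm{Id} - (\mathrm{Id} - S^{-1}T)\bigr)^{-1} S^{-1} \\
&= \Bigl( \sum_{i=0}^{\infty} \bigl(\mathrm{Id} - S^{-1}T\bigr)^i \Bigr) S^{-1} \\
&= \Bigl( \sum_{i=0}^{\infty} \bigl(S^{-1}S - S^{-1}T\bigr)^i \Bigr) S^{-1} \\
&= \Bigl( \sum_{i=0}^{\infty} \bigl(S^{-1}(S - T)\bigr)^i \Bigr) S^{-1} \\
&= \Bigl( \sum_{i=0}^{\infty} \bigl(S^{-1}(-A)\bigr)^i \Bigr) S^{-1} \\
&= \Bigl( \mathrm{Id} - S^{-1}A + (S^{-1}A)^2 + \sum_{i=3}^{\infty} \bigl(-S^{-1}A\bigr)^i \Bigr) S^{-1} \\
&= S^{-1} - S^{-1}AS^{-1} + (S^{-1}A)^2 S^{-1} + \Bigl( \sum_{i=3}^{\infty} \bigl(-S^{-1}A\bigr)^i \Bigr) S^{-1}.
\end{aligned}
\]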
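Thus, subtracting $S^{-1} - S^{-1}AS^{-1}$ from both sides, and taking norms, we will have
\[
\begin{aligned}
\|\mathrm{Inv}(S + A) - \mathrm{Inv}(S) + S^{-1}AS^{-1}\|
&\le \|S^{-1}\|^3 \|A\|^2 + \sum_{i=3}^{\infty} \|S^{-1}\|^{i+1} \|A\|^i \\
&= \|S^{-1}\|^3 \|A\|^2 + \|A\|^2 \cdot \sum_{i=3}^{\infty} \|S^{-1}\|^{i+1} \|A\|^{i-2} \\
&= \|S^{-1}\|^3 \|A\|^2 \Bigl( 1 + \sum_{i=3}^{\infty} \|S^{-1}\|^{i-2} \|A\|^{i-2} \Bigr) \\
&= \|S^{-1}\|^3 \|A\|^2 \Bigl( 1 + \sum_{j=1}^{\infty} \|S^{-1}\|^{j} \|A\|^{j} \Bigr) \\
&= \|S^{-1}\|^3 \|A\|^2 \Bigl( \sum_{j=0}^{\infty} \|S^{-1}\|^{j} \|A\|^{j} \Bigr) \\
&= \frac{\|S^{-1}\|^3 \|A\|^2}{1 - \|S^{-1}\| \, \|A\|},
\end{aligned}
\]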
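where the last series converges, since $\|S^{-1}\| \, \|A\| < \|S^{-1}\| \cdot \frac{1}{\|S^{-1}\|} = 1$. Suppose now that $\varepsilon > 0$
is given. Let $\delta := \min\bigl\{ \frac{1}{2\|S^{-1}\|}, \varepsilon \bigr\}$, and suppose $A \in \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$ is arbitrary and $\|A\| < \delta$.
This implies that $\|S^{-1}\| \, \|A\| < \frac{1}{2}$, and so $-\|S^{-1}\| \, \|A\| > -\frac{1}{2}$. Thus, if $\|A\| < \delta$, we will have
$1 - \|S^{-1}\| \, \|A\| > \frac{1}{2}$ and so $\frac{1}{1 - \|S^{-1}\| \, \|A\|} < 2$. Thus, the inequalities above imply that
\[
\begin{aligned}
\|\mathrm{Inv}(S + A) - \mathrm{Inv}(S) + S^{-1}AS^{-1}\|
&\le \|A\| \cdot \|A\| \cdot \frac{\|S^{-1}\|^3}{1 - \|S^{-1}\| \, \|A\|} \\
&\le \varepsilon \|A\| \cdot \|S^{-1}\|^3 \cdot \frac{1}{1 - \|S^{-1}\| \, \|A\|} \\
&\le 2\|S^{-1}\|^3 \cdot \varepsilon \|A\|.
\end{aligned}
\]
Since $\varepsilon > 0$ was arbitrary and the constant $2\|S^{-1}\|^3$ does not depend on $A$, this shows that $\mathrm{Inv}$ is differentiable at $S$, and $\mathrm{Inv}'(S)A = -S^{-1}AS^{-1}$.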
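We next show that $\mathrm{Inv}'$ is continuous, although it is a little tricky to decide what
that means. Since $\mathrm{GL}(\mathbb{R}^k) \subseteq \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$ and $\mathrm{Inv} : \mathrm{GL}(\mathbb{R}^k) \to \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$, the derivative
of $\mathrm{Inv}$ at an $S \in \mathrm{GL}(\mathbb{R}^k)$ must be a bounded linear map from $\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$ to itself, i.e. $\mathrm{Inv}'(S) \in
\mathcal{L}(\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k), \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k))$. To show that the derivative is continuous, we need to show that
if $S_n$ is a sequence in $\mathrm{GL}(\mathbb{R}^k)$ such that $S_n \to S \in \mathrm{GL}(\mathbb{R}^k)$, then $\mathrm{Inv}'(S_n) \to \mathrm{Inv}'(S)$ in the norm
on $\mathcal{L}(\mathcal{L}(\mathbb{R}^k, \mathbb{R}^k), \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k))$. That means that we need to show that for any $\varepsilon > 0$, there
is an $N \in \mathbb{N}$ such that whenever $n > N$, we have $\|\mathrm{Inv}'(S_n)A - \mathrm{Inv}'(S)A\| < \varepsilon$ for all
$A \in \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$ with $\|A\| \le 1$.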
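Let $\varepsilon > 0$ be given. Notice that since $\mathrm{Inv}$ is differentiable, $\mathrm{Inv}$ is continuous. Therefore,
$\mathrm{Inv}(S_n) \to \mathrm{Inv}(S)$. This also implies that $\|\mathrm{Inv}(S_n)\|$ is convergent, hence bounded, and thus there is a
$K > 0$ such that for all $n$, $\|\mathrm{Inv}(S_n)\| < K$. Since $\mathrm{Inv}(S_n) \to \mathrm{Inv}(S)$, there is an $N \in \mathbb{N}$ such
that whenever $n > N$, $\|\mathrm{Inv}(S_n) - \mathrm{Inv}(S)\| < \frac{\varepsilon}{\|S^{-1}\| + K}$. Suppose now that $n > N$, and let
$A \in \mathcal{L}(\mathbb{R}^k, \mathbb{R}^k)$ be arbitrary, and assume $\|A\| \le 1$. We then have
\[
\begin{aligned}
\|\mathrm{Inv}'(S_n)A - \mathrm{Inv}'(S)A\|
&= \|{-S_n^{-1}AS_n^{-1}} + S^{-1}AS^{-1}\| \\
&= \|S^{-1}AS^{-1} - S_n^{-1}AS_n^{-1}\| \\
&\le \|S^{-1}AS^{-1} - S^{-1}AS_n^{-1}\| + \|S^{-1}AS_n^{-1} - S_n^{-1}AS_n^{-1}\| \\
&= \|S^{-1}A(S^{-1} - S_n^{-1})\| + \|(S^{-1} - S_n^{-1})AS_n^{-1}\| \\
&\le \|S^{-1}\| \cdot \|A\| \cdot \|S^{-1} - S_n^{-1}\| + \|S^{-1} - S_n^{-1}\| \cdot \|A\| \cdot K \\
&\le \|S^{-1}\| \cdot \|\mathrm{Inv}(S) - \mathrm{Inv}(S_n)\| + \|\mathrm{Inv}(S) - \mathrm{Inv}(S_n)\| \cdot K \\
&= \bigl( \|S^{-1}\| + K \bigr) \|\mathrm{Inv}(S) - \mathrm{Inv}(S_n)\| < \varepsilon.
\end{aligned}
\]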
Notice that the derivative of $\mathrm{Inv}$ is expressed in terms of $\mathrm{Inv}$ itself. Thus, the continuity of the
derivative $\mathrm{Inv}'$ follows from the continuity of $\mathrm{Inv}$, which in turn follows from the differentiability
of $\mathrm{Inv}$ itself. A similar argument would show that $\mathrm{Inv}'$ is itself differentiable, and the
derivative of $\mathrm{Inv}'$ is determined by $\mathrm{Inv}'$ and $\mathrm{Inv}$, which are both continuous, which means
that $\mathrm{Inv}''$ is continuous. This is an example of bootstrapping: the differentiability of $\mathrm{Inv}$
implies the continuity of its derivative, which implies the differentiability of the derivative,
and so on. This process can be carried out forever! In essence, it shows that the map
$\mathrm{Inv} : S \mapsto S^{-1}$ is actually infinitely differentiable!
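The derivative formula in Theorem 1.4 can be sanity-checked numerically. The Python sketch below compares Inv(S + tA) − Inv(S) with the linear term −t S^{-1}AS^{-1} for two small values of t; if the formula is right, the remainder should shrink roughly like t^2. The matrices `S` and `A`, the step sizes, and the use of the largest absolute entry as a stand-in for the operator norm are all assumptions made for this illustration.

```python
# Sketch: finite-difference check of Inv'(S)A = -S^{-1} A S^{-1} for 2x2 matrices.
# S, A, the step sizes and the helper names are illustrative assumptions.

def mat_mul(X, Y):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula (assumes det != 0)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def remainder(S, A, t):
    """Largest entry of Inv(S + tA) - Inv(S) + t * S^{-1} A S^{-1}."""
    S_inv = inv2(S)
    perturbed = [[S[i][j] + t * A[i][j] for j in range(2)] for i in range(2)]
    P_inv = inv2(perturbed)
    linear = mat_mul(mat_mul(S_inv, A), S_inv)
    return max(abs(P_inv[i][j] - S_inv[i][j] + t * linear[i][j])
               for i in range(2) for j in range(2))

S = [[2.0, 1.0], [1.0, 3.0]]
A = [[1.0, 0.0], [2.0, -1.0]]

err_large = remainder(S, A, 1e-3)
err_small = remainder(S, A, 1e-4)
# Shrinking t by a factor of 10 should shrink the remainder by roughly 100.
print(err_large, err_small, err_large / err_small)
```

The quadratic decay matches the bound in the proof, where the remainder is controlled by a constant times the square of the norm of the perturbation.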
2 The Banach Fixed-Point Theorem
Definition 2.1. Suppose $\Omega \subseteq \mathbb{R}^d$. A function $f : \Omega \to \Omega$ is a contraction if there exists a
$\lambda \in [0, 1)$ such that $\|f(x) - f(y)\| \le \lambda \|x - y\|$ for all $x, y \in \Omega$.
Notice that a contraction is automatically continuous! In fact, a contraction is uniformly
continuous.
Theorem 2.2 (The Banach Fixed-Point Theorem, aka the contraction mapping theorem).
Suppose $\Omega \subseteq \mathbb{R}^d$ is nonempty and closed, and suppose $f : \Omega \to \Omega$ is a contraction. Then there is a unique
$u \in \Omega$ such that $f(u) = u$. (Such a $u$ is called a fixed point, since $f$ doesn't move $u$ at all.)
Proof. We first show that if a contraction has a fixed point, then it must be unique. Suppose
that $x, y \in \Omega$ are both fixed points of $f$. Then, $\|x - y\| = \|f(x) - f(y)\| \le \lambda \|x - y\|$. But
that implies that $(1 - \lambda)\|x - y\| \le 0$, and since $1 - \lambda > 0$, we get $\|x - y\| = 0$, i.e. $x = y$.
Next, suppose $\lambda = 0$. Then $f$ must be a constant function, since $\|f(x) - f(y)\| \le 0$ for
all $x, y \in \Omega$ implies that for all $x, y \in \Omega$, $f(x) = f(y)$. Since $f : \Omega \to \Omega$, there must be a
$p \in \Omega$ such that $f(x) = p$ for all $x \in \Omega$, and thus $p$ is a fixed point, since $f(p) = p$.
Suppose next that $\lambda \in (0, 1)$. Pick $u_1 \in \Omega$, and for any $n \in \mathbb{N}$, let $u_{n+1} := f(u_n)$. In
particular, for any $n \in \mathbb{N}$, $n \ge 2$, $\|u_{n+1} - u_n\| = \|f(u_n) - f(u_{n-1})\| \le \lambda \|u_n - u_{n-1}\|$. We now
use induction to show that for any $n \in \mathbb{N}$, $\|u_{n+1} - u_n\| \le \lambda^{n-1} \|u_2 - u_1\|$. The statement is
clearly true when $n = 1$. Suppose then that $n \in \mathbb{N}$ is arbitrary, and suppose for this fixed $n$
that $\|u_{n+1} - u_n\| \le \lambda^{n-1} \|u_2 - u_1\|$. Then
\[
\|u_{n+2} - u_{n+1}\| = \|f(u_{n+1}) - f(u_n)\| \le \lambda \|u_{n+1} - u_n\| \le \lambda^{n} \|u_2 - u_1\|.
\]
Thus, by induction, we know that for any $n \in \mathbb{N}$, $\|u_{n+1} - u_n\| \le \lambda^{n-1} \|u_2 - u_1\|$.
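We now show that $u_n$ is a Cauchy sequence. Notice that if $m > n$, then by the triangle inequality we have
\[
\begin{aligned}
\|u_m - u_n\| = \Bigl\| \sum_{j=n}^{m-1} (u_{j+1} - u_j) \Bigr\|
&\le \sum_{j=n}^{m-1} \|u_{j+1} - u_j\| \\
&\le \sum_{j=n}^{m-1} \lambda^{j-1} \|u_2 - u_1\| \\
&= \Bigl( \sum_{j=n}^{m-1} \lambda^{j-1} \Bigr) \|u_2 - u_1\| \\
&= \frac{\lambda^{n-1} - \lambda^{m-1}}{1 - \lambda} \|u_2 - u_1\|
\le \frac{\lambda^{n-1}}{1 - \lambda} \|u_2 - u_1\|.
\end{aligned}
\tag{1}
\]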
Suppose then that $\varepsilon > 0$ is given. Since $\lambda^n \to 0$, there is an $N \in \mathbb{N}$ such that whenever
$n > N$, $\frac{\lambda^{n-1}}{1 - \lambda} \|u_2 - u_1\| < \varepsilon$. Suppose now that $m > n > N$. By (1), we then have $\|u_m - u_n\| < \varepsilon$,
so $u_n$ is a Cauchy sequence.
Since $\mathbb{R}^d$ is complete, we know that $u_n$ converges to some $u \in \mathbb{R}^d$. Since $\Omega$ is closed, $u \in \Omega$.
Next, since $u_{n+1} = f(u_n)$, letting $n \to \infty$ and noting that $f$ is continuous, $u = f(u)$, i.e. $u$
is a fixed point of $f$.
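The iteration used in this proof is easy to try out. The Python sketch below applies it to the cosine function on the closed interval [0, 1], which cosine maps into itself and on which it is a contraction with constant sin(1) < 1 by the Mean Value Theorem. The choice of function, starting point, and iteration counts are assumptions made for illustration. The sketch also compares the distance between two iterates with the bound from (1).

```python
# Sketch: the iteration u_{n+1} = f(u_n) from the proof, applied to f = cos on [0, 1].
# The function, starting point and iteration counts are illustrative assumptions.
import math

def fixed_point_iterates(f, u1, count):
    """Return [u_1, u_2, ..., u_count], where u_{n+1} = f(u_n)."""
    iterates = [u1]
    for _ in range(count - 1):
        iterates.append(f(iterates[-1]))
    return iterates

# cos maps [0, 1] into [cos(1), 1], and |cos'(x)| = |sin(x)| <= sin(1) on [0, 1].
lam = math.sin(1.0)
us = fixed_point_iterates(math.cos, 0.5, 200)
u = us[-1]                      # u_200, used as an approximate fixed point

# Bound (1) with n = 20 and m = 200:
# |u_m - u_n| <= lam^(n-1) / (1 - lam) * |u_2 - u_1|.
n = 20
bound = lam ** (n - 1) / (1.0 - lam) * abs(us[1] - us[0])
actual = abs(us[-1] - us[n - 1])

print(u, abs(u - math.cos(u)))  # approximate fixed point and its residual
print(actual, bound)            # the observed distance sits below the bound
```

The bound from (1) is usually pessimistic: it uses the worst-case constant on the whole set, while the iterates soon settle near the fixed point, where the function contracts more strongly.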
3 The Inverse Function Theorem
Proposition 3.1. Suppose $\Omega \subseteq \mathbb{R}^d$ is an open set that contains 0. Suppose $f : \Omega \to \mathbb{R}^d$
satisfies the following:
1. $f(0) = 0$
2. $f$ is differentiable in $\Omega$, $f'(0) = \mathrm{Id}$, and $x \mapsto f'(x)$ is continuous in $\Omega$.
Then:
(i) there exists an $r_1 > 0$ such that for any $x_1, x_2 \in B_{r_1}(0) := \{y \in \mathbb{R}^d : \|y\| < r_1\}$,
$\|f(x_1) - f(x_2)\| \ge \frac{1}{2} \|x_1 - x_2\|$,
(ii) there exists an $r_2 > 0$ such that $B_{r_2}(0) \subseteq f(B_{r_1}(0))$,
(iii) there exists a function $g : B_{r_2}(0) \to B_{r_1}(0)$ such that $f(g(y)) = y$ for all $y \in B_{r_2}(0)$,
and $g$ is differentiable on $B_{r_2}(0)$, with $g'(y) = (f'(x))^{-1}$ when $g(y) = x$.
Proof. (i) By continuity of $f'$ at the origin and the openness of $\Omega$, there is an $r_1 > 0$ such
that $B_{r_1}(0) \subseteq \Omega$ and $\|\mathrm{Id} - f'(x)\| < \frac{1}{2}$ for all $x \in B_{r_1}(0)$. Now, let $H : B_{r_1}(0) \to \mathbb{R}^d$ be $H(x) := x - f(x)$.
Notice then that $H'(x) = \mathrm{Id} - f'(x)$, and therefore $\|H'(x)\| < \frac{1}{2}$ for all $x \in B_{r_1}(0)$. Then, by
the Mean Value Theorem, for any $x_1, x_2 \in B_{r_1}(0)$, we will have $\|H(x_1) - H(x_2)\| \le \frac{1}{2} \|x_1 - x_2\|$.
Therefore, by the triangle inequality,
\[
\begin{aligned}
\|x_1 - x_2\| &= \|H(x_1) + f(x_1) - (H(x_2) + f(x_2))\| \\
&= \|(H(x_1) - H(x_2)) + (f(x_1) - f(x_2))\| \\
&\le \frac{1}{2} \|x_1 - x_2\| + \|f(x_1) - f(x_2)\|.
\end{aligned}
\]
Subtracting $\frac{1}{2} \|x_1 - x_2\|$ then gives the inequality $\frac{1}{2} \|x_1 - x_2\| \le \|f(x_1) - f(x_2)\|$. Notice: this
implies that $f$ is injective on $B_{r_1}(0)$, and so $f$ will have an inverse function on $f(B_{r_1}(0))$.
(ii) Pick $r_2$ such that $0 < r_2 < \frac{r_1}{2}$. For any $y \in B_{r_2}(0)$, let $F_y : B_{r_1}(0) \to \mathbb{R}^d$ be
$F_y(x) := y + x - f(x)$. Notice that for every $y \in B_{r_2}(0)$, $F_y'(x) = \mathrm{Id} - f'(x)$, and so
$\|F_y'(x)\| < \frac{1}{2}$ for all $x \in B_{r_1}(0)$ and all $y \in B_{r_2}(0)$. Therefore, for any $y \in B_{r_2}(0)$, the Mean Value Theorem
implies that
\[
\|F_y(x_1) - F_y(x_2)\| \le \frac{1}{2} \|x_1 - x_2\| \quad \text{for all } x_1, x_2 \in B_{r_1}(0).
\tag{2}
\]
Let $K := \{x \in \mathbb{R}^d : \|x\| \le 2r_2\} \subseteq B_{r_1}(0)$ (since $r_2 < \frac{1}{2} r_1$). We will now show that for any
$y \in B_{r_2}(0)$, $F_y$ maps $K$ to itself, i.e. for any $y \in B_{r_2}(0)$ and $x \in K$, $F_y(x) \in K$. Suppose
that $y \in B_{r_2}(0)$ and $x \in K$ are arbitrary. Since $f(0) = 0$, we have $H(0) = 0$ and $F_y(0) = y$, and so
\[
\|F_y(x) - F_y(0)\| = \|y + x - f(x) - y\| = \|x - f(x)\| = \|H(x) - H(0)\| \le \frac{1}{2} \|x\|,
\]
and therefore, by the triangle inequality, $\|F_y(x)\| \le \|F_y(x) - F_y(0)\| + \|F_y(0)\| \le \frac{1}{2} \|x\| + \|y\| < \frac{1}{2} \cdot 2r_2 + r_2 = 2r_2$, and
so $F_y(x) \in K$. Therefore, by (2), for each $y \in B_{r_2}(0)$, $F_y : K \to K$ is a contraction. Since $K$ is
closed, for each $y \in B_{r_2}(0)$, Theorem 2.2 implies that $F_y$ has a unique fixed point in $K$. We
now show that $B_{r_2}(0) \subseteq f(B_{r_1}(0))$. Let $\tilde{y} \in B_{r_2}(0)$, and let $\tilde{x} \in K$ be the fixed point of $F_{\tilde{y}}$.
Since $\tilde{x}$ is a fixed point, $\tilde{x} = \tilde{y} + \tilde{x} - f(\tilde{x})$ and so $\tilde{y} = f(\tilde{x})$. Since $K \subseteq B_{r_1}(0)$, we have $\tilde{y} = f(\tilde{x})$ for
some $\tilde{x} \in B_{r_1}(0)$, i.e. $\tilde{y} \in f(B_{r_1}(0))$. Thus, $B_{r_2}(0) \subseteq f(B_{r_1}(0))$.
(iii) Let $g : B_{r_2}(0) \to B_{r_1}(0)$ be defined as: $g(y)$ is the unique fixed point of the function
$F_y : K \to K$, $F_y(x) = y + x - f(x)$. That is: $g(y) = x$ exactly when $x \in K$ and $x = y + x - f(x)$.
In particular, this tells us that $g(y) = x$ exactly when $x \in K$ and $y = f(x)$. Therefore, for any $y \in
B_{r_2}(0)$, $f(g(y)) = f(x) = y$. Next, we show that $g$ is differentiable at every point of $B_{r_2}(0)$.
First, we show that $g$ is continuous on $B_{r_2}(0)$. Let $y_1, y_2 \in B_{r_2}(0)$. If
$x_1 := g(y_1)$ and $x_2 := g(y_2)$, then $x_1, x_2 \in B_{r_1}(0)$. Therefore, by the inequality from (i),
$\frac{1}{2} \|x_1 - x_2\| \le \|f(x_1) - f(x_2)\|$, which means that $\frac{1}{2} \|g(y_1) - g(y_2)\| \le \|y_1 - y_2\|$, and so
$\|g(y_1) - g(y_2)\| \le 2\|y_1 - y_2\|$. Thus, $g$ is continuous on $B_{r_2}(0)$. In fact, we have shown
\[
\text{for all } y_1, y_2 \in B_{r_2}(0), \quad \|g(y_1) - g(y_2)\| \le 2\|y_1 - y_2\|.
\tag{3}
\]
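Suppose now that $y \in B_{r_2}(0)$ is arbitrary, and let $\varepsilon > 0$ be given. Let $x := g(y)$, and
note that $x \in B_{r_1}(0)$. Therefore, $\|\mathrm{Id} - f'(x)\| < \frac{1}{2}$, and so $\mathrm{Id} - (\mathrm{Id} - f'(x)) = f'(x)$ is
invertible by Proposition 1.2. Since $f$ is differentiable at $x$, there is a $\delta > 0$ such that
whenever $\|h\| < \delta$, $\|f(x + h) - f(x) - f'(x)h\| \le \frac{\varepsilon}{2\|(f'(x))^{-1}\|} \|h\|$. Suppose now that $v \in \mathbb{R}^d$ is
arbitrary, and suppose $\|v\| < \min\bigl\{ r_2 - \|y\|, \frac{\delta}{2} \bigr\}$, so that $y + v \in B_{r_2}(0)$. If $x := g(y)$ and $x_1 := g(y + v)$, then (3) implies
that $\|x_1 - x\| = \|g(y + v) - g(y)\| \le 2\|v\| < \delta$. Thus, if $h := x_1 - x$, then $\|h\| < \delta$. Moreover,
$f(x + h) = f(x_1) = y + v$ since $x_1 = g(y + v)$. Because $\|h\| < \delta$, we see
\[
\begin{aligned}
\bigl\| g(y + v) - g(y) - (f'(x))^{-1} v \bigr\|
&= \bigl\| x_1 - x - (f'(x))^{-1} \bigl( (y + v) - y \bigr) \bigr\| \\
&= \bigl\| h - (f'(x))^{-1} \bigl( f(x_1) - f(x) \bigr) \bigr\| \\
&= \bigl\| (f'(x))^{-1} f'(x) h - (f'(x))^{-1} \bigl( f(x_1) - f(x) \bigr) \bigr\| \\
&\le \bigl\| (f'(x))^{-1} \bigr\| \, \bigl\| f'(x)h - \bigl( f(x_1) - f(x) \bigr) \bigr\| \\
&= \bigl\| (f'(x))^{-1} \bigr\| \, \| f(x_1) - f(x) - f'(x)h \| \\
&= \bigl\| (f'(x))^{-1} \bigr\| \, \| f(x + h) - f(x) - f'(x)h \| \\
&\le \bigl\| (f'(x))^{-1} \bigr\| \cdot \frac{\varepsilon}{2 \|(f'(x))^{-1}\|} \|h\| \\
&= \frac{\varepsilon}{2} \|h\| = \frac{\varepsilon}{2} \|x_1 - x\| = \frac{\varepsilon}{2} \|g(y + v) - g(y)\| \\
&\le \frac{\varepsilon}{2} \cdot 2\|v\| = \varepsilon \|v\|,
\end{aligned}
\]
where we have used (3) in the last inequality. Therefore, $g$ is differentiable at $y$, and $g'(y) =
(f'(x))^{-1} = \bigl( f'(g(y)) \bigr)^{-1}$. Since $g$, $f'$ and the inverse operation are all continuous, we see
that $g'$ is continuous. Thus: the inverse function is continuously differentiable.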
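The construction of g in this proof is itself an algorithm: for a given y, iterate the map that sends x to y + x − f(x) until it settles. The Python sketch below does this for one hypothetical map on the plane, f(x1, x2) = (x1 + x2^2, x2 + x1^2), which satisfies f(0) = 0 and f'(0) = Id. For this f, the operator norm of Id − f'(x) is 2 max(|x1|, |x2|), so r1 = 1/4 works in part (i) and any r2 < 1/8 works in part (ii). The map, the radii, the sample point, and the iteration count are all assumptions made for illustration.

```python
# Sketch: computing the local inverse g(y) as the fixed point of F_y(x) = y + x - f(x).
# The map f, the radius r2, the point y and the step count are illustrative assumptions.
import math

def f(x):
    """Hypothetical example map with f(0) = 0 and f'(0) = Id."""
    x1, x2 = x
    return (x1 + x2 ** 2, x2 + x1 ** 2)

def local_inverse(y, steps=100):
    """Approximate g(y) by iterating F_y from the starting point x = 0."""
    x = (0.0, 0.0)                     # 0 lies in K = {x : ||x|| <= 2 * r2}
    for _ in range(steps):
        fx = f(x)
        x = (y[0] + x[0] - fx[0], y[1] + x[1] - fx[1])
    return x

r2 = 0.1                               # any value below r1 / 2 = 1/8 is allowed
y = (0.05, -0.03)                      # a point with ||y|| < r2
x = local_inverse(y)
fx = f(x)

print(x)                               # approximate value of g(y)
print(fx[0] - y[0], fx[1] - y[1])      # residual f(g(y)) - y, near zero
print(math.hypot(x[0], x[1]) <= 2 * r2)  # g(y) stays inside K
```

Because each F_y is a contraction with constant 1/2 on K, every step at least halves the distance to g(y), so a hundred steps is far more than double precision needs.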
We now get the general inverse function theorem:
Theorem 3.2 (The Inverse Function Theorem). Suppose $\Omega \subseteq \mathbb{R}^d$ is open, and suppose
$f : \Omega \to \mathbb{R}^d$. Suppose $f$ is continuously differentiable on $\Omega$, and $f'(p)$ is invertible at $p \in \Omega$.
Then, there is an $r_1 > 0$ such that $f$ is one-to-one on $B_{r_1}(p)$, $f(B_{r_1}(p))$ is open, and the
inverse function $g : f(B_{r_1}(p)) \to B_{r_1}(p)$ is continuously differentiable.
Proof. Let $q := f(p)$, and let $\varphi : \mathbb{R}^d \to \mathbb{R}^d$, $\psi : \mathbb{R}^d \to \mathbb{R}^d$ be defined by $\varphi(x) = x - p$,
$\psi(y) = (f'(p))^{-1}(y - q)$. Notice then that $\varphi(p) = 0 = \psi(q)$. Moreover, $\varphi'(x) = \mathrm{Id}$
and $\psi'(y) = (f'(p))^{-1}$. Notice also that both $\varphi$ and $\psi$ have inverses: $\varphi^{-1}(u) = u + p$,
$\psi^{-1}(v) = f'(p)v + q$. Notice that $\varphi(\Omega)$ is simply $\Omega$ shifted by $-p$, and so $\varphi(\Omega)$ is an
open set that contains 0. Let $F : \varphi(\Omega) \to \mathbb{R}^d$, $F(x) := \psi \circ f \circ \varphi^{-1}(x)$. Notice then that
$F(0) = \psi\bigl(f(\varphi^{-1}(0))\bigr) = \psi\bigl(f(p)\bigr) = (f'(p))^{-1}\bigl(f(p) - q\bigr) = 0$ since $q = f(p)$. Moreover, by
the chain rule, we know
\[
F'(0) = \psi'\bigl(f(\varphi^{-1}(0))\bigr) \times f'\bigl(\varphi^{-1}(0)\bigr) \times (\varphi^{-1})'(0) = (f'(p))^{-1} \times f'(p) \times \mathrm{Id} = \mathrm{Id}.
\]
Finally, since $f$ is assumed to be continuously differentiable, we know that $F$ will be continuously differentiable. We now apply Proposition 3.1 to $F$: there exist $r_1, r_2 > 0$ such
that