The Inverse Function Theorem
We start with some definitions.
Definition Let X be a complete metric space. If φ : X → X and if there is a number 0 < C < 1
such that
d(φ(x), φ(y)) ≤ Cd(x, y)
for all x and y in X, then φ is said to be a contraction of X to X.
Definition A map $f$ from an open subset $E \subset \mathbb{R}^n$ to $\mathbb{R}^m$ is called differentiable at $x \in E$ if there exists a
linear transformation $A : \mathbb{R}^n \to \mathbb{R}^m$ such that
\[ \lim_{h \to 0} \frac{\|f(x + h) - f(x) - A(h)\|}{\|h\|} = 0. \]
The linear transformation $A$ is called the derivative of $f$ at $x$ and is denoted $df_x$.
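Definition A function $f : E \to \mathbb{R}^m$ (here $m = n$, the case needed below) is called continuously differentiable if the map
\[ df : E \to M_n(\mathbb{R}) = \mathbb{R}^{n^2} \]
is continuous. Similarly, a map $f$ is $k$-times continuously differentiable, written $f \in C^k$, if
\[ d^k f : E \to M_{n^k}(\mathbb{R}) = \mathbb{R}^{n^{2k}} \]
is continuous.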
Now we give some fun facts.
• Invertible matrices are open in $M_n(\mathbb{R})$ (since $GL_n(\mathbb{R}) = \mathbb{R}^{n^2} - \det^{-1}\{0\}$ and $\det$ is continuous)
• Matrix inversion is smooth: $\mathrm{Inv} : GL_n(\mathbb{R}) \to GL_n(\mathbb{R})$ is smooth (Cramer's Rule gives each entry of the inverse as a
rational function of the coefficients of the matrix whose denominator, the determinant, does not vanish on $GL_n(\mathbb{R})$)
• Let $f$ map a convex open set $E$ to $\mathbb{R}^m$, be differentiable in $E$, and suppose there exists an $M > 0$ such
that
\[ \|df_x\| \le M \quad \text{for all } x \in E. \]
Then for all $a, b \in E$
\[ \|f(a) - f(b)\| \le M \|a - b\|. \]
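As a small illustration of the second fact, here is a minimal Python sketch of Cramer's Rule for a 2 × 2 matrix. The function name `inverse_2x2` and the sample matrix are assumptions made for this example only. Each entry of the inverse is a polynomial in the coefficients divided by the determinant, which is why inversion is smooth wherever the determinant is nonzero.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] by Cramer's rule (adjugate over determinant)."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is zero)")
    # Each entry is a polynomial in a, b, c, d divided by det,
    # hence a smooth function of the coefficients wherever det != 0.
    return ((d / det, -b / det), (-c / det, a / det))

# Sample matrix [[2, 1], [5, 3]] with determinant 1 (exact arithmetic via Fraction).
A = (Fraction(2), Fraction(1), Fraction(5), Fraction(3))
A_inv = inverse_2x2(*A)
```

Using `Fraction` keeps the arithmetic exact, so the rational-function structure of the inverse is visible without rounding error.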
Banach Fixed Point Theorem:
Let X be a nonempty complete metric space and φ a contraction of X into X. Then there exists a unique point x ∈ X
such that φ(x) = x.
Proof. Pick $x_0 \in X$ and define a sequence by $x_{n+1} = \varphi(x_n)$. Since φ is a contraction, there is a $0 < C < 1$ with
\[ d(x_{n+1}, x_n) = d(\varphi(x_n), \varphi(x_{n-1})) \le C\, d(x_n, x_{n-1}). \]
So, induction gives us
\[ d(x_{n+1}, x_n) \le C^n d(x_1, x_0). \]
We now show $(x_n)$ is a Cauchy sequence. Let $m > n$, then
\[ d(x_m, x_n) \le \sum_{k=n+1}^{m} d(x_k, x_{k-1}) \le \sum_{k=n+1}^{m} C^{k-1} d(x_1, x_0) \le C^n \sum_{k=0}^{\infty} C^k d(x_1, x_0) = C^n \frac{d(x_1, x_0)}{1 - C} \longrightarrow 0 \quad (n \to \infty), \]
since C < 1. Hence, the sequence is Cauchy in a complete metric space. So the sequence converges, say to
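x. Since φ is continuous (a contraction is Lipschitz, hence continuous) we get
\[ \varphi(x) = \varphi\Bigl(\lim_{n \to \infty} x_n\Bigr) = \lim_{n \to \infty} \varphi(x_n) = \lim_{n \to \infty} x_{n+1} = x. \]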
So we see φ has a fixed point.
We still need this x to be unique. Assume, then, that we have two fixed points x and y. That is, we have
φ(x) = x and φ(y) = y. Then
d(x, y) = d(φ(x), φ(y)) ≤ C d(x, y).
Since C < 1 the only way this can happen is if d(x, y) = 0; that is, if x = y. So the fixed point is unique.
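The iteration in the proof can be tried numerically. The following is a minimal sketch, assuming the example map φ(x) = cos x on X = [0, 1]; this map is not from the notes. It sends [0, 1] into itself and satisfies |φ′(x)| = |sin x| ≤ sin 1 < 1 there, so it is a contraction with C = sin 1. The helper name `fixed_point` and the tolerance are illustrative choices.

```python
import math

def fixed_point(phi, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = phi(x_n) until successive terms differ by at most tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Assumed example: phi = cos on [0, 1], a contraction with constant C = sin(1).
C = math.sin(1.0)
x_star, steps = fixed_point(math.cos, 0.5)

# The estimate from the proof: d(x_n, x) <= C^n * d(x_1, x_0) / (1 - C).
a_priori_bound = C ** steps * abs(math.cos(0.5) - 0.5) / (1.0 - C)
```

Starting from a different point of [0, 1] should lead to the same limit, which is the uniqueness statement in numerical form.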
We can now prove the Inverse Function Theorem in $\mathbb{R}^n$.
Inverse Function Theorem:
Suppose $f$ is a smooth mapping of an open set $E \subset \mathbb{R}^n$ into $\mathbb{R}^n$ with $df_a$ invertible for some $a \in E$. Then there exist
open sets $U$ containing $a$ and $V$ containing $f(a)$ such that $f : U \to V$ is a diffeomorphism.
Proof. We complete the proof in four steps, first constructing $U$ and $V$ where $f$ is a bijection, then showing
$V$ is in fact an open set, then showing $f^{-1} : V \to U$ is differentiable, and then showing $f^{-1}$ is in fact smooth.
Bijectivity
We are given that $df_a^{-1}$ exists. Since invertible matrices are open in $M_n(\mathbb{R})$ we can find an $r$ with $0 < r < \frac{1}{2\|df_a^{-1}\|}$ such
that if $A \in B(df_a, r)$, then $A$ is invertible. Now, since $f$ is smooth, it is continuously differentiable, meaning
$df : E \to M_n(\mathbb{R})$ is continuous. So, given $r$, there exists a $\delta > 0$ such that $B(a, \delta) \subset E$ and, if $\|x - a\| < \delta$, then $\|df_x - df_a\| < r$; in particular $df_x$ is invertible.
Now, for $y \in \mathbb{R}^n$ define $\varphi_y : E \to \mathbb{R}^n$ by $\varphi_y(x) = x + df_a^{-1}(y - f(x))$. Notice that
\[ x \text{ is a fixed point of } \varphi_y \iff f(x) = y. \]
We want $f$ to be injective, so we want $\varphi_y$ to have at most one fixed point. This will be accomplished if we can
show $\varphi_y$ is a contraction (since uniqueness didn't depend on completeness).
Define $U = B(a, \delta)$ and $V = f(U)$. If we can show $f$ is injective on $U$, then $f : U \to V$ will be a bijection. Take
$y \in f(U)$. Now, if $x \in U$ then $df_x \in B(df_a, r)$ and
\[ d(\varphi_y)_x = I - df_a^{-1} \circ df_x = df_a^{-1}(df_a - df_x). \]
So,
\[ \|d(\varphi_y)_x\| \le \|df_a^{-1}\| \, \|df_a - df_x\| < \|df_a^{-1}\| \cdot \frac{1}{2\|df_a^{-1}\|} = \frac{1}{2}. \]
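Hence, by the mean value inequality above (applied on $U$, which is a ball and hence convex), for $x_1$ and $x_2$ in $U$ we have
\[ \|\varphi_y(x_1) - \varphi_y(x_2)\| \le \frac{1}{2}\|x_1 - x_2\|. \]
But this says $\varphi_y$ contracts distances on $U$, so it has at most one fixed point in $U$; that is, $f$ is injective on $U$. Hence $f : U \to V$ is a bijection. It remains to
see $V$ is open.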
V open
Take $y_0 \in V$. We need an $R > 0$ such that if $\|y - y_0\| < R$ then $y \in V$. We leave $R$ unspecified for now and
determine what its value should be.
So, given $y$ with $\|y - y_0\| < R$, we want $y \in V$; that is, we want an $x \in U$ with $f(x) = y$. Let’s look
again at the function $\varphi_y$. If it has a fixed point $x$ then we’ll have $f(x) = y$ as desired. So, we need to find
a fixed point of $\varphi_y$. To do this we want to use the Banach Fixed Point Theorem. But, for this to work
we need a complete metric space for $\varphi_y$ to act on. We cannot use $U$ since an open ball is not complete, but if we find a
subset $K$ of $U$ that is closed in $\mathbb{R}^n$, then $K$ will be a complete metric space in its own right. Consequently, we just need to
find such a closed subset $K$ of $U$ with $\varphi_y : K \to K$ to get the existence of a fixed point.
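To this end, call $x_0 = f^{-1}(y_0)$ and take $K = \bar{B}(x_0, \rho)$ where $\rho > 0$ is small enough that $K \subset U$. We
want $\varphi_y : K \to K$. That is, if $\|x - x_0\| \le \rho$ then we need $\|\varphi_y(x) - x_0\| \le \rho$. So, take $x \in K$, then we have
\[ \begin{aligned} \|\varphi_y(x) - x_0\| &\le \|\varphi_y(x) - \varphi_y(x_0)\| + \|\varphi_y(x_0) - x_0\| \\ &\le \tfrac{1}{2}\|x - x_0\| + \|df_a^{-1}(y - f(x_0))\| \\ &\le \tfrac{1}{2}\rho + \|df_a^{-1}\| \, \|y - y_0\| \\ &\le \tfrac{1}{2}\rho + \|df_a^{-1}\| R. \end{aligned} \]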
We see that if $R = \frac{\rho}{2\|df_a^{-1}\|}$ we’ll get $\|\varphi_y(x) - x_0\| \le \rho$. Thus, $\varphi_y : K \to K$ as desired. Since it’s a contraction
we have the existence of an $x \in K \subset U$ with $\varphi_y(x) = x$. So, $f(x) = y$ and $y \in V$. We see now that $V$ is indeed open.
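The fixed point of $\varphi_y$ can also be found numerically by iterating $\varphi_y$, exactly as in the proof of the Banach Fixed Point Theorem. Below is a minimal sketch under stated assumptions: the map `f` on $\mathbb{R}^2$, the base point $a = (0, 0)$, the target `y`, and the iteration count are all hypothetical choices made for illustration, not taken from the notes.

```python
import math

def f(x):
    """Hypothetical smooth map R^2 -> R^2, used only for illustration."""
    u, v = x
    return (u + 0.5 * v + 0.1 * u * u, v + 0.1 * math.sin(u))

# At a = (0, 0) the derivative is df_a = [[1, 0.5], [0.1, 1]].
# Its inverse is computed with the 2x2 adjugate formula.
DET = 1.0 * 1.0 - 0.5 * 0.1
DFA_INV = ((1.0 / DET, -0.5 / DET), (-0.1 / DET, 1.0 / DET))

def phi(y, x):
    """phi_y(x) = x + df_a^{-1}(y - f(x)); its fixed points solve f(x) = y."""
    fx = f(x)
    r = (y[0] - fx[0], y[1] - fx[1])
    return (x[0] + DFA_INV[0][0] * r[0] + DFA_INV[0][1] * r[1],
            x[1] + DFA_INV[1][0] * r[0] + DFA_INV[1][1] * r[1])

def local_inverse(y, x0=(0.0, 0.0), iterations=60):
    """Approximate f^{-1}(y) near a by iterating phi_y starting from x0 = a."""
    x = x0
    for _ in range(iterations):
        x = phi(y, x)
    return x

y = (0.05, 0.02)                      # a point close to f(a) = (0, 0)
x = local_inverse(y)
residual = max(abs(f(x)[0] - y[0]), abs(f(x)[1] - y[1]))
```

Note that the derivative is inverted only once, at $a$, which is what distinguishes this iteration from Newton's method. It is only expected to converge for $y$ close enough to $f(a)$, mirroring the role of $R$ in the argument.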
We still need $f^{-1}$ to be differentiable (and smooth).
Differentiability
If $d(f^{-1})_y$ were to exist at $y = f(x)$ with $x \in U$, by the chain rule we would have
\[ (f \circ f^{-1})(y) = y \implies df_x \circ d(f^{-1})_y = I \implies d(f^{-1})_y = df_x^{-1}. \]
So this is our candidate for the derivative. Let $y$ and $y + h \in V$, then we have $x$ and $x + k \in U$ with $f(x) = y$
and $f(x + k) = y + h$, where $k \to 0$ as $h \to 0$ (this follows from the bound on $\|k\|/\|h\|$ established below). We can now calculate:
\[ \lim_{h \to 0} \frac{\|f^{-1}(y + h) - f^{-1}(y) - df_x^{-1}(h)\|}{\|h\|} = \lim_{h \to 0} \frac{\|x + k - x - df_x^{-1}(h)\|}{\|h\|} = \lim_{h \to 0} \frac{\|{-df_x^{-1}}(h - df_x(k))\|}{\|h\|}. \]
Since $h = f(x + k) - y = f(x + k) - f(x)$, the last limit is
\[ \lim_{h \to 0} \frac{\|df_x^{-1}(f(x + k) - f(x) - df_x(k))\|}{\|h\|} \le \lim_{h \to 0} \|df_x^{-1}\| \, \frac{\|k\|}{\|h\|} \, \frac{\|f(x + k) - f(x) - df_x(k)\|}{\|k\|} = 0 \]
IF we can show $\|k\|/\|h\|$ is bounded.
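Let’s look back at $\varphi_y$:
\[ \begin{aligned} \varphi_y(x + k) - \varphi_y(x) &= x + k + df_a^{-1}(y - f(x + k)) - x - df_a^{-1}(y - f(x)) \\ &= k - df_a^{-1}(h), \end{aligned} \]
so
\[ \|k\| - \|df_a^{-1}(h)\| \le \|k - df_a^{-1}(h)\| = \|\varphi_y(x + k) - \varphi_y(x)\| \le \frac{1}{2}\|x + k - x\| = \frac{1}{2}\|k\|. \]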
Some more unraveling gives
\[ \frac{1}{2}\|k\| \le \|df_a^{-1}(h)\| \le \|df_a^{-1}\| \, \|h\| \implies \frac{\|k\|}{\|h\|} \le 2\|df_a^{-1}\|. \]
That is, $\|k\|/\|h\|$ is bounded, meaning this limit is zero and that $f^{-1}$ is differentiable, with derivative
\[ d(f^{-1})_y = df_x^{-1}. \]
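The formula $d(f^{-1})_y = df_x^{-1}$ can be sanity-checked numerically in one variable, where it reads $(f^{-1})'(y) = 1/f'(x)$. The sketch below assumes the example $f(x) = x^3 + x$, which is not from the notes; the bisection bracket, the step size `h`, and the helper names are likewise illustrative choices.

```python
def f(x):
    """Assumed example: strictly increasing, so globally invertible on R."""
    return x ** 3 + x

def df(x):
    """Derivative of f; always at least 1, hence invertible everywhere."""
    return 3 * x ** 2 + 1

def f_inv(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert f on [lo, hi] by bisection (valid because f is increasing)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = 1.0
y = f(x)                                   # y = 2.0
h = 1e-6
# Central difference approximation of the derivative of f^{-1} at y.
finite_difference = (f_inv(y + h) - f_inv(y - h)) / (2 * h)
# Value predicted by the theorem: (df_x)^{-1} = 1 / f'(1) = 1/4.
predicted = 1.0 / df(x)
```

The two quantities should agree up to the discretization and rounding error of the finite difference.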
All that remains is showing the smoothness of $f^{-1}$.
Smoothness
We proceed by induction: we have $d(f^{-1})_y = df_x^{-1}$. What we’re doing here is
\[ y \overset{f^{-1}}{\longmapsto} x \overset{df}{\longmapsto} df_x \overset{\mathrm{Inv}}{\longmapsto} df_x^{-1}. \]
Specifically,
\[ d(f^{-1}) = \mathrm{Inv} \circ df \circ f^{-1}. \]
Inversion is smooth and hence continuous, $df$ is continuous since $f$ is smooth, and $f^{-1}$ is continuous since
it is differentiable. So $d(f^{-1})$ is continuous as it is the composition of continuous functions. Consequently
$f^{-1} \in C^1$ and the base case is proven.
Now, suppose we have shown $f^{-1} \in C^k$. In the identity
\[ d(f^{-1}) = \mathrm{Inv} \circ df \circ f^{-1}, \]
Inv is smooth, $df$ is smooth since $f$ is smooth, and $f^{-1} \in C^k$ by the induction
hypothesis. A composition of $C^k$ maps is again $C^k$ by the chain rule, so $d(f^{-1}) \in C^k$; that is, $f^{-1} \in C^{k+1}$. This completes the
induction. Since $f^{-1} \in C^k$ for each $k$ we get that $f^{-1}$ is smooth, as desired.
In summary, we have shown that there are open sets $U$ and $V$ such that $f : U \to V$ is invertible with a smooth
inverse. That is, $f : U \to V$ is a diffeomorphism.
For fun, we extend the result to manifolds.
Inverse Function Theorem Suppose $f : M \to N$ is smooth and for some $p \in M$, with $q = f(p)$, $df_p : T_pM \to T_qN$ is
invertible. Then there exist open sets $U' \subset M$ containing $p$ and $V' \subset N$ containing $q$ such that $f : U' \to V'$ is a diffeomorphism.
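Proof. We are given that $df_p$ is invertible. Take charts $(U, \varphi)$ and $(V, \psi)$ for $p$ and $q$ respectively, regarded as maps $\varphi : U \to M$ and $\psi : V \to N$ from open sets $U, V \subset \mathbb{R}^n$, and shrink $U$ if necessary so that $f(\varphi(U)) \subset \psi(V)$. Consider
the function
\[ \hat{f} = \psi^{-1} \circ f \circ \varphi. \]
This is smooth since each of $\psi^{-1}$, $f$, and $\varphi$ is. Say that $\varphi(x) = p$ and $\psi(y) = q$. We can compute
\[ d\hat{f}_x = d(\psi^{-1})_q \circ df_p \circ d\varphi_x : \mathbb{R}^n \to \mathbb{R}^n. \]
This is invertible since each of the maps in the composition is. By the Inverse Function Theorem in $\mathbb{R}^n$
we have open sets $\tilde{U} \subset U$ and $\tilde{V} \subset V$ such that
\[ \hat{f} : \tilde{U} \to \tilde{V} \]
is a diffeomorphism. That is, we have a diffeomorphism
\[ \psi^{-1} \circ f \circ \varphi : \tilde{U} \to \tilde{V}. \]
Call $U' = \varphi(\tilde{U})$ and $V' = \psi(\tilde{V})$. Since $\varphi$ and $\psi$ are diffeomorphisms onto open subsets of $M$ and $N$, $U'$ and $V'$ are open and we get that
\[ f = \psi \circ \hat{f} \circ \varphi^{-1} : U' \to V' \]
is a diffeomorphism as well.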