Implicit Function Theorem

John Nachbar
Washington University
September 6, 2016
1 Introduction
The Implicit Function Theorem is a non-linear version of the following observation
from linear algebra. Suppose first that $F : \mathbb{R}^2 \to \mathbb{R}$ is given by
$F(x) = a x_1 + b x_2$. If $a \neq 0$, then the zero set of $F$ (the set of $(x_1, x_2)$
such that $F(x) = 0$; also called the kernel of $F$, since $F$ is linear) can be written
as the graph of a linear function $\psi : \mathbb{R} \to \mathbb{R}$ that gives $x_1$ as
a function of $x_2$,
$$\psi(x_2) = -\frac{b}{a} x_2.$$
In particular, $F(\psi(x_2), x_2) = a[-(b/a) x_2] + b x_2 = 0$.
More generally, suppose that $L = M + N$ and that $F : \mathbb{R}^L \to \mathbb{R}^M$ is
linear and can be written in the form
$$F(x) = A x_\mu + B x_\nu,$$
where $x = (x_\mu, x_\nu) \in \mathbb{R}^L$, $x_\mu \in \mathbb{R}^M$,
$x_\nu \in \mathbb{R}^N$, $A$ is an $M \times M$ matrix, and $B$ is an $M \times N$
matrix. If $A$ is invertible then the zero set of $F$ can be written as the graph of a
linear function $\psi : \mathbb{R}^N \to \mathbb{R}^M$ given by
$$\psi(x_\nu) = -A^{-1} B x_\nu.$$
In particular, $F(\psi(x_\nu), x_\nu) = A[-A^{-1} B x_\nu] + B x_\nu = 0$.
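The linear case can be checked numerically. The following is a minimal sketch, assuming $M = 2$ and $N = 1$; the matrices `A` and `B` and the helper names are made up for illustration and do not come from the notes.

```python
# Linear case with M = 2, N = 1: F(x) = A x_mu + B x_nu.
# A and B are made-up example matrices, not taken from the notes.
A = [[2.0, 1.0],
     [1.0, 3.0]]          # M x M, invertible (determinant 5)
B = [[4.0],
     [-1.0]]              # M x N


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def psi(x_nu):
    """psi(x_nu) = -A^{-1} B x_nu, returned as a column (list of rows)."""
    col = matmul(matmul(inverse_2x2(A), B), x_nu)
    return [[-v for v in row] for row in col]


def F(x_mu, x_nu):
    """F(x) = A x_mu + B x_nu, with x_mu and x_nu given as columns."""
    ax, bx = matmul(A, x_mu), matmul(B, x_nu)
    return [[ax[i][0] + bx[i][0]] for i in range(len(ax))]


x_nu = [[0.7]]                     # an arbitrary point in R^N
residual = F(psi(x_nu), x_nu)      # should vanish up to rounding
print(residual)
```

Any other invertible `A` would serve equally well; if `A` is singular, `inverse_2x2` raises an error, which mirrors the invertibility requirement in the text.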
One version of the Implicit Function Theorem says the following. Suppose that
$F : \mathbb{R}^L \to \mathbb{R}^M$ is $C^r$, that $F(x^*) = 0$, and that
$D_\mu F(x^*)$ (the $M \times M$ matrix of derivatives with respect to the $x_\mu$
variables) is invertible. Then, near the point $x^*$, the zero set of $F$ can be written
as the graph of a $C^r$ function $\psi$ that gives $x_\mu$ as a function of $x_\nu$. The
function $\psi$ is said to be implicitly defined by $F(x) = 0$. Moreover, for any $x_\nu$
near $x^*_\nu$, setting $x = (\psi(x_\nu), x_\nu)$,
$$D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x),$$
which is the analog of what we found in the linear case. This equation allows us to
compute $D\psi$ even when we cannot solve for $\psi$ analytically.
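As an illustration of computing $D\psi$ without a closed form for $\psi$, here is a minimal sketch. The function $F(x) = x_1^5 + x_1 - x_2$ and the point $x^* = (1, 2)$ are assumptions chosen for this example and are not from the notes. The formula is compared against a finite difference of a numerically solved $\psi$; bisection is used only as a convenient root finder.

```python
def F(x1, x2):
    """Hypothetical example: F(x) = x1^5 + x1 - x2."""
    return x1**5 + x1 - x2


def D1F(x1, x2):
    """Derivative of F in x1; always at least 1, so never zero."""
    return 5 * x1**4 + 1


def D2F(x1, x2):
    """Derivative of F in x2."""
    return -1.0


def dpsi(x1, x2):
    """Dpsi = -[D_mu F]^{-1} D_nu F, with M = N = 1."""
    return -D2F(x1, x2) / D1F(x1, x2)


def psi_numeric(x2, lo=-10.0, hi=10.0, tol=1e-13):
    """Solve F(x1, x2) = 0 for x1 by bisection (F is increasing in x1)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid, x2) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


x_star = (1.0, 2.0)             # F(1, 2) = 1 + 1 - 2 = 0
slope_formula = dpsi(*x_star)   # 1 / (5 + 1) at x_star

# Central finite difference of the numerically computed psi around x2 = 2.
h = 1e-5
slope_numeric = (psi_numeric(2.0 + h) - psi_numeric(2.0 - h)) / (2 * h)
print(slope_formula, slope_numeric)
```

The two printed numbers should agree to several digits. The formula needs only the partial derivatives of $F$ at one point, whereas the finite difference requires solving $F(x) = 0$ twice.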
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.
Much of the intuition for the Implicit Function Theorem is illustrated by the unit
circle. Explicitly, the unit circle can be expressed as the zero set of
$F : \mathbb{R}^2 \to \mathbb{R}$, $F(x) = x_1^2 + x_2^2 - 1$. I can write the circle as
the union of the graphs of four differentiable functions on $(-1, 1)$, two giving $x_1$
as a function of $x_2$,
$$x_1 = \sqrt{1 - x_2^2} \quad\text{and}\quad x_1 = -\sqrt{1 - x_2^2},$$
and two giving $x_2$ as a function of $x_1$,
$$x_2 = \sqrt{1 - x_1^2} \quad\text{and}\quad x_2 = -\sqrt{1 - x_1^2}.$$
There is substantial overlap across these graphs, but that is not a problem.
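For the circle, the implicit derivative formula can be compared with the derivative of an explicit branch. The sketch below uses the right-hand branch $x_1 = \sqrt{1 - x_2^2}$ and the arbitrarily chosen point $x_2 = 0.6$; the function names are illustrative only.

```python
import math


def F(x1, x2):
    """Unit circle as a zero set: F(x) = x1^2 + x2^2 - 1."""
    return x1**2 + x2**2 - 1


def psi(x2):
    """Right half of the circle: x1 = sqrt(1 - x2^2), for x2 in (-1, 1)."""
    return math.sqrt(1 - x2**2)


def dpsi_explicit(x2):
    """Derivative of psi obtained by differentiating the square root."""
    return -x2 / math.sqrt(1 - x2**2)


def dpsi_implicit(x1, x2):
    """-[D_1 F]^{-1} D_2 F = -(2 x2) / (2 x1); requires x1 != 0."""
    return -(2 * x2) / (2 * x1)


x2 = 0.6
x1 = psi(x2)                                     # 0.8, up to rounding
print(F(x1, x2))                                 # approximately 0
print(dpsi_explicit(x2), dpsi_implicit(x1, x2))  # both approximately -0.75
```

At the points $(0, \pm 1)$ the implicit formula divides by zero, which corresponds to $D_1 F = 0$ there: near those points the circle is the graph of $x_2$ as a function of $x_1$, not the other way around.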
Note the following subtleties.
1. I cannot express the entire unit circle, the zero set of F , as the graph of a
single function.
2. Although I expressed the Implicit Function Theorem above as saying that the
first $M$ variables could be written as a function of the last $N$ variables, there
is nothing sacred about order. As long as $DF$ has full rank (namely $M$),
then I can express some $M$ of the variables as a function of the remaining $N$
variables. In the circle example, this shows up as sometimes writing $x_2$ as a
function of $x_1$ and sometimes writing $x_1$ as a function of $x_2$.
Example 1. In economics, a standard example of an implicit function involves
indifference curves (or indifference surfaces, in higher dimensions). If the utility
function is $u : \mathbb{R}^L \to \mathbb{R}$ and if $u(x^*) = c^*$, then the indifference
curve through $x^*$ is defined implicitly as the zero set of the function
$F(x) = u(x) - c^*$. If $L = 2$, and if $D_2 u(x^*) \neq 0$, then the Implicit Function
Theorem says that, near the point $x^*$, the indifference curve through $x^*$ can be
given as the graph of a function $\psi$, with
$$D\psi(x_1) = -\frac{D_1 u(x)}{D_2 u(x)},$$
which is (the negative of) the marginal rate of substitution.
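A small numerical sketch of Example 1 follows. The Cobb-Douglas utility $u(x) = x_1^{\alpha} x_2^{1-\alpha}$ with $\alpha = 0.5$ and the point $x^* = (1, 4)$ are assumptions made for illustration; the notes do not specify a utility function. This particular $u$ lets the indifference curve be solved by hand, so the implicit slope can be compared with a finite difference of the explicit curve.

```python
ALPHA = 0.5   # hypothetical Cobb-Douglas share parameter


def u(x1, x2):
    """Assumed utility function u(x) = x1^ALPHA * x2^(1 - ALPHA)."""
    return x1**ALPHA * x2**(1 - ALPHA)


def D1u(x1, x2):
    """Marginal utility of good 1."""
    return ALPHA * x1**(ALPHA - 1) * x2**(1 - ALPHA)


def D2u(x1, x2):
    """Marginal utility of good 2."""
    return (1 - ALPHA) * x1**ALPHA * x2**(-ALPHA)


def slope_implicit(x1, x2):
    """Dpsi(x1) = -D1u(x) / D2u(x); requires D2u(x) != 0."""
    return -D1u(x1, x2) / D2u(x1, x2)


def psi(x1, c):
    """Indifference curve u(x1, x2) = c solved for x2 (possible here by hand)."""
    return (c / x1**ALPHA) ** (1 / (1 - ALPHA))


x_star = (1.0, 4.0)
c_star = u(*x_star)                 # utility level at x_star

# Central finite difference of the explicit indifference curve at x1 = 1.
h = 1e-6
slope_fd = (psi(1.0 + h, c_star) - psi(1.0 - h, c_star)) / (2 * h)
print(slope_implicit(*x_star), slope_fd)   # both close to -4
```

The implicit slope uses only the two marginal utilities at $x^*$, which is the practical appeal of the formula when the indifference curve has no closed form.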
The next two examples illustrate pathologies.
Example 2. Define $F : \mathbb{R}^2 \to \mathbb{R}$ by
$$F(x) = x_1 x_2.$$
Let $x^* = (0, 0)$. Then
$$DF(x^*) = \begin{bmatrix} 0 & 0 \end{bmatrix},$$
which violates the full rank condition of the Implicit Function Theorem. The zero
set of $F$ is the union of the two coordinate axes and so resembles a "+" sign. There is
no way to represent this zero set in a neighborhood of the origin as the graph of a
function, differentiable or otherwise.
Example 3. Define $F : \mathbb{R}^2 \to \mathbb{R}$ by
$$F(x) = (x_1 - x_2)^2.$$
Then at $x^* = (0, 0)$,
$$DF(x^*) = \begin{bmatrix} 0 & 0 \end{bmatrix}.$$
Here, however, a $C^\infty$ function $\psi$ exists, namely,
$$\psi(x_2) = x_2.$$
So the full rank condition on $DF(x^*)$ is necessary neither for the existence of an
implicit function nor for its differentiability. In contrast, for the Inverse Function
Theorem, the full rank condition, while not necessary for existence of an inverse
function, was necessary for the differentiability of the inverse function.
2 The Implicit Function Theorem
Consider a function $F : O \to \mathbb{R}^M$ where $O$ is an open subset of
$\mathbb{R}^L$, $L = M + N$. Denote a point $x \in \mathbb{R}^L$ as $(x_\mu, x_\nu)$,
where $x_\mu \in \mathbb{R}^M$ and $x_\nu \in \mathbb{R}^N$. At a point $x^*$, let
$D_\mu F(x^*)$ denote the first $M$ columns of $DF(x^*)$ (the $x_\mu$ columns) and let
$D_\nu F(x^*)$ denote the remaining $N$ columns (the $x_\nu$ columns).
Theorem 1 (Implicit Function Theorem). Let $O$ be a nonempty open subset of
$\mathbb{R}^L$. Let $F : O \to \mathbb{R}^M$ be $C^r$, where $r$ is a positive integer.
Consider any $x^* \in O$ such that $F(x^*) = 0$. If $DF(x^*)$ has full rank, namely $M$,
then there is an open set $W$ in $\mathbb{R}^L$ such that the restriction of the zero
set $F^{-1}(0)$ to $W$ is the graph of a $C^r$ function.
In particular, suppose, for concreteness and simplicity of notation, that the first
$M$ columns of $DF(x^*)$ (the $x_\mu$ columns) are linearly independent. Then there are
open sets $U \subseteq \mathbb{R}^N$ and $W \subseteq \mathbb{R}^L$, and a $C^r$ function
$\psi : U \to \mathbb{R}^M$ such that $D_\mu F(x)$ has full rank for all $x \in W$, and
1. $x^*_\nu \in U$, $x^* \in W$,
2. $\psi(x^*_\nu) = x^*_\mu$,
3. For any $x \in W$, $x_\nu \in U$,
4. For any $x_\nu \in U$, $\psi(x_\nu)$ is the unique $x_\mu$ such that, setting $x = (x_\mu, x_\nu)$,
(a) $x \in W$,
(b) $F(x) = 0$,
5. For any $x_\nu \in U$, setting $x = (\psi(x_\nu), x_\nu)$,
$$D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x). \tag{1}$$
Proof. See Section 3.
The Implicit Function Theorem thus states that if $F$ is continuously differentiable,
if $F(x^*) = 0$, and if $DF(x^*)$ has full rank then the zero set of $F$ is, near $x^*$,
an $N$-dimensional surface in $\mathbb{R}^L$. Example 2 in Section 1 shows what can go wrong.
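The failure in Example 2 can be made concrete with a few evaluations. The sketch below is only an illustration: it checks that the derivative of $F(x) = x_1 x_2$ vanishes at the origin and that, arbitrarily close to the origin, the zero set contains several points sharing the same $x_2$ and several sharing the same $x_1$. The sample offsets of 0.1 are arbitrary.

```python
def F(x1, x2):
    """Example 2: F(x) = x1 * x2."""
    return x1 * x2


def DF(x1, x2):
    """Row of partial derivatives [D1 F, D2 F] = [x2, x1]."""
    return [x2, x1]


# The rank condition fails at the origin: DF(0, 0) is the zero row.
print(DF(0.0, 0.0))

# x1 is not a function of x2 near the origin: at x2 = 0, many x1 solve F = 0.
same_x2 = [(t, 0.0) for t in (-0.1, 0.0, 0.1)]
# x2 is not a function of x1 either: at x1 = 0, many x2 solve F = 0.
same_x1 = [(0.0, t) for t in (-0.1, 0.0, 0.1)]

print(all(F(*p) == 0 for p in same_x2))   # True
print(all(F(*p) == 0 for p in same_x1))   # True
```

Shrinking the offsets does not help, since both coordinate axes pass through the origin; no neighborhood of the origin avoids the problem.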
Note that the focus on zero sets of functions is really without loss of generality.
Suppose that $f : \mathbb{R}^L \to \mathbb{R}^M$ and that $f(x^*) = y^*$. Then the level
set of $f$ through $x^*$ is just the zero set of the function
$F : \mathbb{R}^L \to \mathbb{R}^M$,
$$F(x) = f(x) - y^*.$$
The proof of the Implicit Function Theorem is an application of the Inverse
Function Theorem; the Implicit Function Theorem can be viewed as a corollary.
The fact that $D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x)$, labeled equation 1 in the
statement of the Implicit Function Theorem, is consistent with the Chain Rule.
Explicitly, suppose that we are simply told that the differentiable function $\psi$
exists. Define $g : U \to \mathbb{R}^L$ by $g(x_\nu) = (\psi(x_\nu), x_\nu)$. Define
$h : U \to \mathbb{R}^M$ by $h(x_\nu) = F(g(x_\nu))$. Then $h(x_\nu) = 0$ for all
$x_\nu \in U$, hence
$$Dh(x_\nu) = 0.$$
On the other hand, by the Chain Rule, letting $x = (\psi(x_\nu), x_\nu)$,
$$\begin{aligned}
Dh(x_\nu) &= DF(x)\, Dg(x_\nu) \\
&= \begin{bmatrix} D_\mu F(x) & D_\nu F(x) \end{bmatrix}
\begin{bmatrix} D\psi(x_\nu) \\ I \end{bmatrix} \\
&= D_\mu F(x)\, D\psi(x_\nu) + D_\nu F(x),
\end{aligned}$$
where $I$ is the $N \times N$ identity matrix. Putting all this together,
$$0 = D_\mu F(x)\, D\psi(x_\nu) + D_\nu F(x).$$
Since $D_\mu F(x)$ is invertible, rearranging yields equation 1.
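The chain rule argument can be checked on a case with $M = 1$ and $N = 2$. The sketch below assumes the unit sphere $F(x) = x_1^2 + x_2^2 + x_3^2 - 1$ with $\psi(x_2, x_3) = \sqrt{1 - x_2^2 - x_3^2}$; this example and the evaluation point are chosen for illustration and are not from the notes.

```python
import math


def F(x1, x2, x3):
    """Hypothetical example with M = 1, N = 2: the unit sphere."""
    return x1**2 + x2**2 + x3**2 - 1


def psi(x2, x3):
    """Cap of the sphere with x1 > 0: x1 = sqrt(1 - x2^2 - x3^2)."""
    return math.sqrt(1 - x2**2 - x3**2)


def dpsi_formula(x2, x3):
    """Equation 1: Dpsi = -[D_mu F]^{-1} D_nu F, a 1 x 2 row."""
    x1 = psi(x2, x3)
    d_mu = 2 * x1                 # D_mu F, a 1 x 1 matrix
    d_nu = [2 * x2, 2 * x3]       # D_nu F, a 1 x 2 matrix
    return [-d / d_mu for d in d_nu]


def dpsi_finite_difference(x2, x3, h=1e-6):
    """Central differences applied directly to psi."""
    return [(psi(x2 + h, x3) - psi(x2 - h, x3)) / (2 * h),
            (psi(x2, x3 + h) - psi(x2, x3 - h)) / (2 * h)]


def dh_via_chain_rule(x2, x3):
    """D_mu F * Dpsi + D_nu F, which the derivation says is zero."""
    x1 = psi(x2, x3)
    dpsi = dpsi_formula(x2, x3)
    return [2 * x1 * dpsi[0] + 2 * x2, 2 * x1 * dpsi[1] + 2 * x3]


point = (0.2, 0.3)
print(dpsi_formula(*point))
print(dpsi_finite_difference(*point))
print(dh_via_chain_rule(*point))   # both entries approximately 0
```

The first two printed rows should agree closely. The third is zero by construction, which is exactly the identity $Dh(x_\nu) = 0$ used in the derivation.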
3 Proof of the Implicit Function Theorem
Define $G : O \to \mathbb{R}^L$ by $G(x) = (F(x), x_\nu)$. Then $G$ is $C^r$ and
$$DG(x) = \begin{bmatrix} D_\mu F(x) & D_\nu F(x) \\ 0 & I \end{bmatrix},$$
where $0$ is the $N \times M$ matrix of zeroes and $I$ is the $N \times N$ identity.
Note that $G(x^*) = (0, x^*_\nu)$. Moreover, $DG(x^*)$ is invertible. In fact, direct
calculation confirms that for any $x$ such that $D_\mu F(x)$ is invertible,
$$[DG(x)]^{-1} = \begin{bmatrix} [D_\mu F(x)]^{-1} & -[D_\mu F(x)]^{-1} D_\nu F(x) \\ 0 & I \end{bmatrix}. \tag{2}$$
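The direct calculation behind equation 2 can be spot-checked numerically. The sketch below assumes $M = 2$ and $N = 1$ and uses made-up numbers in place of $D_\mu F(x)$ and $D_\nu F(x)$; it multiplies $DG$ by the claimed inverse and prints the product.

```python
def matmul(p, q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Made-up values standing in for D_mu F(x) (2 x 2) and D_nu F(x) (2 x 1).
D_mu = [[2.0, 1.0], [1.0, 3.0]]
D_nu = [[4.0], [-1.0]]

# DG = [[D_mu, D_nu], [0, I]] with M = 2, N = 1.
DG = [D_mu[0] + D_nu[0],
      D_mu[1] + D_nu[1],
      [0.0, 0.0, 1.0]]

# Claimed inverse from equation 2: [[D_mu^{-1}, -D_mu^{-1} D_nu], [0, I]].
inv_mu = inverse_2x2(D_mu)
upper_right = [[-v for v in row] for row in matmul(inv_mu, D_nu)]
DG_inv = [inv_mu[0] + upper_right[0],
          inv_mu[1] + upper_right[1],
          [0.0, 0.0, 1.0]]

product = matmul(DG, DG_inv)
for row in product:
    print(row)    # rows of the 3 x 3 identity, up to rounding
```

The upper-right block of `DG_inv` is the quantity that becomes $D\psi$ in equation 1.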
Therefore, by the Inverse Function Theorem, there is an open set $\tilde{O} \subseteq O$,
with $x^* \in \tilde{O}$, and an open set $V \subseteq \mathbb{R}^L$, with
$(0, x^*_\nu) \in V$, such that $DG(x)$ has full rank for every $x$ in $\tilde{O}$, $G$
maps $\tilde{O}$ 1-1 onto $V$, and the inverse $G^{-1} : V \to \tilde{O}$ is $C^r$.
Since $D_\mu F(x^*)$ has full rank, $F$ is continuously differentiable, and the
determinant function is continuous, one can take $\tilde{O}$ such that $D_\mu F(x)$ has
full rank for any $x \in \tilde{O}$.
Since $V$ is open, there exists an open set $U \subseteq \mathbb{R}^N$ such that
$x^*_\nu \in U$ and, for every $x_\nu \in U$, $(0, x_\nu) \in V$ (see footnote 2). Let
$W = (\mathbb{R}^M \times U) \cap \tilde{O}$. This is open, since it is the
intersection of two open sets.
For any $x_\nu \in U$, if $G^{-1}(0, x_\nu) = (x_\mu, x_\nu)$ then $x_\mu$ is the unique
point such that, setting $x = (x_\mu, x_\nu)$, $x \in \tilde{O}$ and $F(x) = 0$.
Moreover, by construction, $x \in W$.
Therefore, define $\psi : U \to \mathbb{R}^M$ by setting $\psi(x_\nu)$ equal to the first
$M$ coordinates of $G^{-1}(0, x_\nu)$. Since $G^{-1}$ is $C^r$ on $V$, $\psi$ is $C^r$ on
$U$. Since $DG^{-1} = [DG]^{-1}$, $D\psi$ is given by the upper-right sub-matrix in
equation 2, which implies equation 1.
Footnote 2. As discussed in the notes on $\mathbb{R}^N$, any open ball is contained in an
open cube and vice versa. In the present case, since $V$ is open and contains
$(0, x^*_\nu)$, there is an $\varepsilon > 0$ such that
$N_\varepsilon(0, x^*_\nu) \subseteq V$. Choose any $r > 0$ such that
$r\sqrt{M + N} \leq \varepsilon$. Then the open $(M + N)$-dimensional cube with sides of
length $2r$ and centered at $(0, x^*_\nu)$ is contained in $N_\varepsilon(0, x^*_\nu)$,
which is contained in $V$. Take $U$ to be the open $N$-dimensional cube with sides of
length $2r$ and centered at $x^*_\nu$.