John Nachbar
Washington University
September 6, 2016

The Implicit Function Theorem¹

¹ This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.

1 Introduction

The Implicit Function Theorem is a non-linear version of the following observation from linear algebra. Suppose first that $F : \mathbb{R}^2 \to \mathbb{R}$ is given by $F(x) = a x_1 + b x_2$. If $a \neq 0$, then the zero set of $F$ (the set of $(x_1, x_2)$ such that $F(x) = 0$; also called the kernel of $F$, since $F$ is linear) can be written as the graph of a linear function $\psi : \mathbb{R} \to \mathbb{R}$ given by
\[ \psi(x_2) = -\frac{b}{a}\, x_2. \]
In particular,
\[ F(\psi(x_2), x_2) = a\left[-(b/a)x_2\right] + b x_2 = 0. \]

More generally, suppose that $L = M + N$ and that $F : \mathbb{R}^L \to \mathbb{R}^M$ is linear and can be written in the form $F(x) = A x_\mu + B x_\nu$, where $x = (x_\mu, x_\nu) \in \mathbb{R}^L$, $x_\mu \in \mathbb{R}^M$, $x_\nu \in \mathbb{R}^N$, $A$ is an $M \times M$ matrix, and $B$ is an $M \times N$ matrix. If $A$ is invertible then the zero set of $F$ can be written as the graph of a linear function $\psi : \mathbb{R}^N \to \mathbb{R}^M$ given by
\[ \psi(x_\nu) = -A^{-1} B x_\nu. \]
In particular,
\[ F(\psi(x_\nu), x_\nu) = A\left[-A^{-1} B x_\nu\right] + B x_\nu = 0. \]

One version of the Implicit Function Theorem says the following. Suppose that $F : \mathbb{R}^L \to \mathbb{R}^M$ is $C^r$, that $F(x^*) = 0$, and that $D_\mu F(x^*)$ (the $M \times M$ matrix of derivatives with respect to the $x_\mu$ variables) is invertible. Then, near the point $x^*$, the zero set of $F$ can be written as the graph of a $C^r$ function $\psi$ that gives $x_\mu$ as a function of $x_\nu$. $\psi$ is the function implicitly defined by $F(x) = 0$. Moreover, for any $x_\nu$ near $x^*_\nu$, setting $x = (\psi(x_\nu), x_\nu)$,
\[ D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x), \]
which is the analog of what we found in the linear case. This equation allows us to compute $D\psi$ even when we cannot solve for $\psi$ analytically.

Much of the intuition for the Implicit Function Theorem is illustrated by the unit circle. Explicitly, the unit circle can be expressed as the zero set of $F : \mathbb{R}^2 \to \mathbb{R}$,
\[ F(x) = x_1^2 + x_2^2 - 1. \]
I can write the circle as the union of the graphs of four differentiable functions on $(-1, 1)$, two giving $x_1$ as a function of $x_2$,
\[ x_1 = \sqrt{1 - x_2^2} \quad \text{and} \quad x_1 = -\sqrt{1 - x_2^2}, \]
and two giving $x_2$ as a function of $x_1$,
\[ x_2 = \sqrt{1 - x_1^2} \quad \text{and} \quad x_2 = -\sqrt{1 - x_1^2}. \]
There is substantial overlap across these graphs, but that is not a problem.

Note the following subtleties.

1. I cannot express the entire unit circle, the zero set of $F$, as the graph of a single function.

2. Although I expressed the Implicit Function Theorem above as saying that the first $M$ variables could be written as a function of the last $N$ variables, there is nothing sacred about order. As long as $DF$ has full rank (namely $M$), then I can express some $M$ of the variables as a function of the remaining $N$ variables. In the circle example, this shows up as sometimes writing $x_2$ as a function of $x_1$ and sometimes writing $x_1$ as a function of $x_2$.

Example 1. In economics, a standard example of an implicit function involves indifference curves (or indifference surfaces, in higher dimensions). If the utility function is $u : \mathbb{R}^L \to \mathbb{R}$ and if $u(x^*) = c^*$, then the indifference curve through $x^*$ is defined implicitly as the zero set of the function $F(x) = u(x) - c^*$. If $L = 2$, and if $D_2 u(x^*) \neq 0$, then the Implicit Function Theorem says that, near the point $x^*$, the indifference curve through $x^*$ can be given as the graph of a function $\psi$ that gives $x_2$ as a function of $x_1$, with, setting $x = (x_1, \psi(x_1))$,
\[ D\psi(x_1) = -\frac{D_1 u(x)}{D_2 u(x)}, \]
which is (the negative of) the marginal rate of substitution.

The next two examples illustrate pathologies.

Example 2. Define $F : \mathbb{R}^2 \to \mathbb{R}$ by $F(x) = x_1 x_2$. Let $x^* = (0, 0)$. Then
\[ DF(x^*) = \begin{bmatrix} 0 & 0 \end{bmatrix}, \]
which violates the full rank condition of the Implicit Function Theorem. The zero set of $F$ is the union of the two coordinate axes and so resembles a “+” sign. There is no way to represent this zero set in a neighborhood of the origin as the graph of a function, differentiable or otherwise.

Example 3. Define $F : \mathbb{R}^2 \to \mathbb{R}$ by $F(x) = (x_1 - x_2)^2$. Then at $x^* = (0, 0)$,
\[ DF(x^*) = \begin{bmatrix} 0 & 0 \end{bmatrix}. \]
Here, however, a $C^\infty$ $\psi$ exists, namely $\psi(x_2) = x_2$, since the zero set is the line $x_1 = x_2$. So the full rank condition on $DF(x^*)$ is not necessary for either existence of an implicit function or for its differentiability. In contrast, for the Inverse Function Theorem, the full rank condition, while not necessary for existence of an inverse function, was necessary for the differentiability of the inverse function.

2 The Implicit Function Theorem

Consider a function $F : O \to \mathbb{R}^M$ where $O$ is an open subset of $\mathbb{R}^L$, $L = M + N$. Denote a point $x \in \mathbb{R}^L$ as $(x_\mu, x_\nu)$, where $x_\mu \in \mathbb{R}^M$ and $x_\nu \in \mathbb{R}^N$. At a point $x^*$, let $D_\mu F(x^*)$ denote the first $M$ columns of $DF(x^*)$ (the $x_\mu$ columns) and let $D_\nu F(x^*)$ denote the remaining $N$ columns (the $x_\nu$ columns).

Theorem 1 (Implicit Function Theorem). Let $O$ be a nonempty open subset of $\mathbb{R}^L$. Let $F : O \to \mathbb{R}^M$ be $C^r$, where $r$ is a positive integer. Consider any $x^* \in O$ such that $F(x^*) = 0$. If $DF(x^*)$ has full rank, namely $M$, then there is an open set $W$ in $\mathbb{R}^L$ such that the restriction of the zero set $F^{-1}(0)$ to $W$ is the graph of a $C^r$ function.

In particular, suppose, for concreteness and simplicity of notation, that the first $M$ columns of $DF(x^*)$ (the $x_\mu$ columns) are linearly independent. Then there are open sets $U \subseteq \mathbb{R}^N$ and $W \subseteq \mathbb{R}^L$, and a $C^r$ function $\psi : U \to \mathbb{R}^M$ such that $D_\mu F(x)$ has full rank for all $x \in W$, and

1. $x^*_\nu \in U$, $x^* \in W$,

2. $\psi(x^*_\nu) = x^*_\mu$,

3. For any $x \in W$, $x_\nu \in U$,

4. For any $x_\nu \in U$, $\psi(x_\nu)$ is the unique $x_\mu$ such that, setting $x = (x_\mu, x_\nu)$,

   (a) $x \in W$,

   (b) $F(x) = 0$,

5. For any $x_\nu \in U$, setting $x = (\psi(x_\nu), x_\nu)$,
\[ D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x). \tag{1} \]

Proof. See Section 3.

The Implicit Function Theorem thus states that if $F$ is continuously differentiable, if $F(x^*) = 0$, and if $DF(x^*)$ has full rank then the zero set of $F$ is, near $x^*$, an $N$ dimensional surface in $\mathbb{R}^L$. Example 2 in Section 1 shows what can go wrong.

Note that the focus on zero sets of functions is really without loss of generality. Suppose that $f : \mathbb{R}^L \to \mathbb{R}^M$ and that $f(x^*) = y^*$.
Then the level set of $f$ through $x^*$ is just the zero set of the function $F : \mathbb{R}^L \to \mathbb{R}^M$, $F(x) = f(x) - y^*$.

The proof of the Implicit Function Theorem is an application of the Inverse Function Theorem; the Implicit Function Theorem can be viewed as a corollary.

The fact that
\[ D\psi(x_\nu) = -[D_\mu F(x)]^{-1} D_\nu F(x), \]
labeled equation 1 in the statement of the Implicit Function Theorem, is consistent with the Chain Rule. Explicitly, suppose that we are simply told that the differentiable function $\psi$ exists. Define $g : U \to \mathbb{R}^L$ by $g(x_\nu) = (\psi(x_\nu), x_\nu)$. Define $h : U \to \mathbb{R}^M$ by $h(x_\nu) = F(g(x_\nu))$. Then $h(x_\nu) = 0$ for all $x_\nu \in U$, hence $Dh(x_\nu) = 0$. On the other hand, by the Chain Rule, letting $x = (\psi(x_\nu), x_\nu)$,
\[
Dh(x_\nu) = DF(x)\,Dg(x_\nu)
= \begin{bmatrix} D_\mu F(x) & D_\nu F(x) \end{bmatrix}
\begin{bmatrix} D\psi(x_\nu) \\ I \end{bmatrix}
= D_\mu F(x)\,D\psi(x_\nu) + D_\nu F(x),
\]
where $I$ is the $N \times N$ identity matrix. Putting all this together,
\[ 0 = D_\mu F(x)\,D\psi(x_\nu) + D_\nu F(x). \]
Rearranging yields equation 1.

3 Proof of the Implicit Function Theorem.

Define $G : O \to \mathbb{R}^L$ by $G(x) = (F(x), x_\nu)$. Then $G$ is $C^r$ and
\[
DG(x) = \begin{bmatrix} D_\mu F(x) & D_\nu F(x) \\ 0 & I \end{bmatrix},
\]
where $0$ is the $N \times M$ matrix of zeroes and $I$ is the $N \times N$ identity. $G(x^*) = (0, x^*_\nu)$. Moreover, $DG(x^*)$ is invertible. In fact, direct calculation confirms that for any $x$ such that $D_\mu F(x)$ is invertible,
\[
[DG(x)]^{-1} = \begin{bmatrix} [D_\mu F(x)]^{-1} & -[D_\mu F(x)]^{-1} D_\nu F(x) \\ 0 & I \end{bmatrix}. \tag{2}
\]

Therefore, by the Inverse Function Theorem, there is an open set $\tilde{O} \subseteq O$, with $x^* \in \tilde{O}$, and an open set $V \subseteq \mathbb{R}^L$, with $(0, x^*_\nu) \in V$, such that $DG(x)$ has full rank for every $x$ in $\tilde{O}$, $G$ maps $\tilde{O}$ 1-1 onto $V$, and the inverse $G^{-1} : V \to \tilde{O}$ is $C^r$. Since $D_\mu F(x^*)$ has full rank, $F$ is continuously differentiable, and the determinant function is continuous, one can take $\tilde{O}$ such that $D_\mu F(x)$ has full rank for any $x \in \tilde{O}$.

Since $V$ is open, there exists an open set $U \subseteq \mathbb{R}^N$ such that $x^*_\nu \in U$ and, for every $x_\nu \in U$, $(0, x_\nu) \in V$.² Let $W = (\mathbb{R}^M \times U) \cap \tilde{O}$. This is open, since it is the intersection of two open sets.
For any $x_\nu \in U$, if $G^{-1}(0, x_\nu) = (x_\mu, x_\nu)$ then $x_\mu$ is the unique point such that, setting $x = (x_\mu, x_\nu)$, $x \in \tilde{O}$ and $F(x) = 0$. Moreover, by construction, $x \in W$. Therefore, define $\psi : U \to \mathbb{R}^M$ by setting $\psi(x_\nu)$ equal to the first $M$ coordinates of $G^{-1}(0, x_\nu)$. Since $G^{-1}$ is $C^r$ on $V$, $\psi$ is $C^r$ on $U$. Since $DG^{-1} = [DG]^{-1}$, $D\psi$ is given by the upper-right sub-matrix in equation 2, which implies equation 1.

² As discussed in the notes on $\mathbb{R}^N$, any open ball is contained in an open cube and vice versa. In the present case, since $V$ is open and contains $(0, x^*_\nu)$, there is an $\varepsilon > 0$ such that $N_\varepsilon(0, x^*_\nu) \subseteq V$. Choose any $r > 0$ such that $r\sqrt{M + N} \leq \varepsilon$. Then the $(M + N)$-dimensional open cube with sides of length $2r$ and centered at $(0, x^*_\nu)$ is contained in $N_\varepsilon(0, x^*_\nu)$, which is contained in $V$. Take $U$ to be the $N$-dimensional open cube with sides of length $2r$ and centered at $x^*_\nu$.
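Equation 1 can be checked numerically on the unit circle example of Section 1. The following is a minimal sketch in Python, not part of the original notes: it takes $x_\mu = x_1$ and $x_\nu = x_2$, uses the right-hand branch $\psi(x_2) = \sqrt{1 - x_2^2}$, and compares the derivative given by equation 1 with a central finite difference of $\psi$. The function names (`dpsi_implicit`, `dpsi_numeric`) and the step size `h` are illustrative assumptions.

```python
import math


def F(x1, x2):
    """Unit circle as a zero set: F(x) = x1^2 + x2^2 - 1."""
    return x1 ** 2 + x2 ** 2 - 1.0


def D1F(x1, x2):
    """Partial derivative of F with respect to x1 (the x_mu variable here)."""
    return 2.0 * x1


def D2F(x1, x2):
    """Partial derivative of F with respect to x2 (the x_nu variable here)."""
    return 2.0 * x2


def psi(x2):
    """Right-hand branch of the circle, x1 = sqrt(1 - x2^2), for |x2| < 1."""
    return math.sqrt(1.0 - x2 ** 2)


def dpsi_implicit(x2):
    """Equation 1 with M = N = 1: Dpsi = -[D1F]^{-1} D2F at x = (psi(x2), x2)."""
    x1 = psi(x2)
    return -D2F(x1, x2) / D1F(x1, x2)


def dpsi_numeric(x2, h=1e-6):
    """Central finite difference of psi; an independent check, not equation 1."""
    return (psi(x2 + h) - psi(x2 - h)) / (2.0 * h)


if __name__ == "__main__":
    x2 = 0.6  # then x1 = psi(x2) is approximately 0.8
    print("F on the graph of psi:", F(psi(x2), x2))
    print("Dpsi from equation 1: ", dpsi_implicit(x2))
    print("Dpsi by differencing: ", dpsi_numeric(x2))
```

At $x_2 = 0.6$ both derivative values should be close to $-x_2/x_1 = -0.75$. Note that `dpsi_implicit` needs only the partial derivatives of $F$ and a point on the zero set, so the same pattern applies when $\psi$ has no closed form and the point is found numerically instead.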