The implicit function theorem is one of the greatest theorems in mathematics. There are many versions of
this theorem which are of far greater generality than the one given here. The proof given here is like one
found in one of Carathéodory’s books on the calculus of variations. It is not as elegant as some of the
others which are based on a contraction mapping principle but it may be more accessible.
However, it is an advanced topic. Don’t waste your time with it unless you have first read and
understood the material on rank and determinants found in the chapter on the mathematical
theory of determinants. You will also need to use the extreme value theorem for a function of
n variables and the chain rule of multivariable calculus as well as everything about matrix
multiplication.
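Definition 19.0.1 Suppose $U$ is an open set in $\mathbb{R}^{n} \times \mathbb{R}^{m}$, and let $(x,y)$ denote a typical point of $\mathbb{R}^{n} \times \mathbb{R}^{m}$ with $x \in \mathbb{R}^{n}$ and $y \in \mathbb{R}^{m}$. Let $f : U \to \mathbb{R}^{p}$ be in $C^{1}$. When $p = n$, write $f = (f_1,\dots,f_n)^T$ and let $f_{i,x_j}$ denote the partial derivative of $f_i$ with respect to $x_j$. For points $x^1,\dots,x^n \in \mathbb{R}^{n}$ and $y \in \mathbb{R}^{m}$ with each $(x^i,y) \in U$, define the matrix
\[
J\left(x^1,\dots,x^n,y\right) \equiv
\begin{pmatrix}
f_{1,x_1}(x^1,y) & \cdots & f_{1,x_n}(x^1,y) \\
\vdots & & \vdots \\
f_{n,x_1}(x^n,y) & \cdots & f_{n,x_n}(x^n,y)
\end{pmatrix}. \tag{*}
\]
Note that the $i^{th}$ row of $J$ is evaluated at its own point $(x^i,y)$.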
Then by the assumption of continuity of all the partial derivatives and the extreme value theorem, there exist $r > 0$ and $\delta_0, \eta_0 > 0$ such that if $\delta \le \delta_0$ and $\eta \le \eta_0$, it follows that for all $(x^1,\dots,x^n) \in \overline{B(x_0,\delta)}^{\,n}$ and $y \in \overline{B(y_0,\eta)}$,
\[
\left|\det J\left(x^1,\dots,x^n,y\right)\right| > r > 0, \tag{19.3}
\]
and $\overline{B(x_0,\delta_0)} \times \overline{B(y_0,\eta_0)} \subseteq U$. By continuity of all the partial derivatives and the extreme value theorem, it can also be assumed that there exists a constant $K$ such that for all $(x,y) \in \overline{B(x_0,\delta_0)} \times \overline{B(y_0,\eta_0)}$ and $i = 1,2,\dots,n$, the $i^{th}$ row of $D_2 f(x,y)$, given by $D_2 f_i(x,y)$, satisfies
\[
\left|D_2 f_i(x,y)\right| < K, \tag{19.4}
\]
and for all $(x^1,\dots,x^n) \in \overline{B(x_0,\delta_0)}^{\,n}$ and $y \in \overline{B(y_0,\eta_0)}$, the $i^{th}$ row of the matrix $J\left(x^1,\dots,x^n,y\right)^{-1}$, which equals $e_i^T J\left(x^1,\dots,x^n,y\right)^{-1}$, satisfies
\[
\left|e_i^T J\left(x^1,\dots,x^n,y\right)^{-1}\right| < K. \tag{19.5}
\]
(Recall that $e_i$ is the column vector consisting of all zeros except for a 1 in the $i^{th}$ position.)
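The matrix $J$ and the bound 19.3 can be explored numerically. The following is only a sketch under assumptions that are not in the text: the toy map $f(x,y) = (x_1^2 + x_2 - 1 + y,\; x_1 + x_2^2 - 1 - y x_2)$ on $\mathbb{R}^2 \times \mathbb{R}$ with $(x_0,y_0) = ((1,0),0)$, the radii $\delta = \eta = 0.1$, and the helper names `grad_x_f`, `J`, and `sample_closed_ball` are all hypothetical choices made for illustration. For this particular $f$, $\det J = 2x^1_1(2x^2_2 - y) - 1$, so a hand computation gives $|\det J| \ge 1 - 2.2 \cdot 0.3 = 0.34$ on the chosen balls. Random sampling can only suggest such a bound, not prove it.

```python
import numpy as np

def grad_x_f(i, x, y):
    """Gradient in x of component f_i of the toy map, evaluated at (x, y)."""
    if i == 0:
        # f_1 = x1^2 + x2 - 1 + y
        return np.array([2.0 * x[0], 1.0])
    # f_2 = x1 + x2^2 - 1 - y*x2
    return np.array([1.0, 2.0 * x[1] - y])

def J(points, y):
    """Row i is the x-gradient of f_i at its own point (points[i], y)."""
    return np.vstack([grad_x_f(i, p, y) for i, p in enumerate(points)])

def sample_closed_ball(center, radius, rng):
    """Rejection-sample a point of the closed ball from the enclosing cube."""
    while True:
        step = rng.uniform(-radius, radius, size=center.shape)
        if np.linalg.norm(step) <= radius:
            return center + step

# Assumed base point and radii for the illustration.
x0, y0 = np.array([1.0, 0.0]), 0.0
delta = eta = 0.1
rng = np.random.default_rng(0)

# Track the smallest |det J| over independently sampled rows' points and y.
min_abs_det = np.inf
for _ in range(2000):
    pts = [sample_closed_ball(x0, delta, rng) for _ in range(2)]
    y = y0 + rng.uniform(-eta, eta)
    min_abs_det = min(min_abs_det, abs(np.linalg.det(J(pts, y))))

print(min_abs_det)  # smallest |det J| seen among the samples
```

Each row of `J` is evaluated at a different sampled point, which is exactly what distinguishes $J(x^1,\dots,x^n,y)$ from the ordinary matrix of partial derivatives in $x$ at a single point.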
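To begin with, it is shown that for a given $y \in B(y_0,\eta)$ there is at most one $x \in B(x_0,\delta)$ such that $f(x,y) = 0$.
Pick $y \in B(y_0,\eta)$ and suppose there exist $x, z \in \overline{B(x_0,\delta)}$ such that $f(x,y) = f(z,y) = 0$. Consider $f_i$ and let
\[
h(t) \equiv f_i\left(x + t(z - x), y\right).
\]
Then $h(1) = h(0)$ and so by the mean value theorem, $h'(t_i) = 0$ for some $t_i \in (0,1)$. Therefore, from the chain rule and for this value of $t_i$,
\[
h'(t_i) = \sum_{j=1}^{n} f_{i,x_j}\left(x + t_i(z - x), y\right)(z_j - x_j) = 0.
\]
Let $x^i \equiv x + t_i(z - x)$, which lies in $\overline{B(x_0,\delta)}$ because the ball is convex. Since the above holds for each $i = 1,\dots,n$,
\[
J\left(x^1,\dots,x^n,y\right)(z - x) = 0,
\]
and so from 19.3, $z - x = 0$. (The matrix in the above is invertible since its determinant is nonzero.) Now it will be shown that if $\eta$ is chosen sufficiently small, then for all $y \in B(y_0,\eta)$, there exists a unique $x(y) \in B(x_0,\delta)$ such that $f(x(y),y) = 0$.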
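To see the uniqueness argument in the simplest case, here is a worked instance with $n = m = 1$. The function $f(x,y) = x^2 + y^2 - 1$, the base point $(x_0,y_0) = (1,0)$, and the radius $\delta = 1/2$ are assumptions chosen for illustration and do not come from the text.

```latex
% Illustrative choice (an assumption): f(x,y) = x^2 + y^2 - 1, (x_0,y_0) = (1,0), \delta = 1/2.
\begin{align*}
J(x^1, y) &= f_{,x}(x^1, y) = 2x^1 \ge 1
  \quad \text{for } x^1 \in \overline{B(1, 1/2)} = [1/2,\, 3/2], \\
h(t) &= f\bigl(x + t(z - x), y\bigr), \qquad h(0) = f(x,y) = 0 = f(z,y) = h(1), \\
h'(t_1) &= 2\bigl(x + t_1(z - x)\bigr)(z - x) = J(x^1, y)\,(z - x) = 0,
  \qquad x^1 \equiv x + t_1(z - x), \\
z - x &= 0 \quad \text{because } J(x^1, y) \ge 1 > 0 .
\end{align*}
```

Here any $r < 1$ serves in 19.3. The conclusion agrees with the direct check: $x^2 = z^2$ with $x, z \in [1/2, 3/2]$ forces $x = z$.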
Claim: If $\eta$ is small enough, then the function $x \mapsto h_y(x) \equiv |f(x,y)|^2$ achieves its minimum value on $\overline{B(x_0,\delta)}$ at a point of $B(x_0,\delta)$. (The existence of a point in $\overline{B(x_0,\delta)}$ at which $h_y$ achieves its minimum follows from the extreme value theorem.)
Proof of claim: Suppose this is not the case. Then there exist a sequence $\eta_k \to 0$ and points $y_k$ with $|y_k - y_0| < \eta_k$ such that the minimum of $h_{y_k}$ on $\overline{B(x_0,\delta)}$ occurs at a point $x_k$ with $|x_0 - x_k| = \delta$. Now taking a subsequence, still denoted by $k$, it can be assumed that $x_k \to x$ with $|x - x_0| = \delta$ and $y_k \to y_0$. This follows from the fact that $\left\{ x \in \overline{B(x_0,\delta)} : |x - x_0| = \delta \right\}$ is a closed and bounded set and is therefore sequentially compact. Let $\varepsilon > 0$. Then for $k$ large enough, the continuity of $y \mapsto h_y(x_0)$ implies $h_{y_k}(x_0) < \varepsilon$ because $h_{y_0}(x_0) = 0$ since $f(x_0,y_0) = 0$. Therefore, from the definition of $x_k$ as a minimizer, it is also the case that $h_{y_k}(x_k) < \varepsilon$. Passing to the limit yields $h_{y_0}(x) \le \varepsilon$. Since $\varepsilon > 0$ is arbitrary, it follows that $h_{y_0}(x) = 0$, that is, $f(x,y_0) = 0$. This contradicts the first part of the argument, in which it was shown that for $y \in B(y_0,\eta)$ there is at most one point $x$ of $\overline{B(x_0,\delta)}$ where $f(x,y) = 0$. Here two have been obtained, $x_0$ and $x$, and they are distinct because $|x - x_0| = \delta > 0$. This proves the claim.
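The claim can be checked numerically on an example. This is a rough sketch, not a proof: the toy map `f` (with $f((1,0),0) = 0$), the values $\delta = 0.1$ and $y = 0.02$, the grid resolution, and the helper name `minimize_h_on_closed_ball` are all assumptions introduced here for illustration. A brute-force grid search stands in for the extreme value theorem, which guarantees a minimizer exists but does not say how to find it.

```python
import numpy as np

def f(x, y):
    """Toy map f : R^2 x R -> R^2 (an assumption), with f((1, 0), 0) = 0."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([x1**2 + x2 - 1.0 + y, x1 + x2**2 - 1.0 - y * x2], axis=-1)

def minimize_h_on_closed_ball(y, x0, delta, n=201):
    """Grid search for a minimizer of h_y(x) = |f(x, y)|^2 over the closed ball."""
    offsets = np.linspace(-delta, delta, n)
    steps = np.stack(np.meshgrid(offsets, offsets, indexing="ij"), axis=-1)
    steps = steps.reshape(-1, 2)
    # Keep only grid points of the closed ball of radius delta around x0.
    steps = steps[np.linalg.norm(steps, axis=1) <= delta]
    pts = x0 + steps
    h = np.sum(f(pts, y) ** 2, axis=1)
    k = int(np.argmin(h))
    return pts[k], float(h[k])

x0 = np.array([1.0, 0.0])
delta = 0.1
x_min, h_min = minimize_h_on_closed_ball(y=0.02, x0=x0, delta=delta)

# The claim predicts an interior minimizer when y is close enough to y0 = 0.
is_interior = bool(np.linalg.norm(x_min - x0) < delta)
print(x_min, h_min, is_interior)
```

For this example the grid minimizer sits well inside the ball and the minimum value of $h_y$ is close to zero, which is consistent with the minimizer being an approximate zero of $f(\cdot,y)$.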
Choose $\eta < \eta_0$ and also small enough that the above claim holds, and let $x(y)$ denote a point of $B(x_0,\delta)$ at which the minimum of $h_y$ on $\overline{B(x_0,\delta)}$ is achieved. Since $x(y)$ is an interior point, you can consider $h_y(x(y) + tv)$ for $|t|$ small and conclude this function of $t$ has a zero derivative at $t = 0$.
Now