In case you are solving a system of equations, $Ax = y$ for $x$, it follows that if $A^{-1}$ exists,
\[
x = (A^{-1}A)x = A^{-1}(Ax) = A^{-1}y
\]
thus solving the system. Now in the case that $A^{-1}$ exists, there is a formula for $A^{-1}$ given above. Using this formula,
\[
x_i = \sum_{j=1}^{n} a^{-1}_{ij} y_j = \sum_{j=1}^{n} \frac{1}{\det(A)} \operatorname{cof}(A)_{ji} y_j.
\]
By the formula for the expansion of a determinant along a column,
\[
x_i = \frac{1}{\det(A)} \det \begin{pmatrix} * & \cdots & y_1 & \cdots & * \\ \vdots & & \vdots & & \vdots \\ * & \cdots & y_n & \cdots & * \end{pmatrix},
\]
where here the $i^{th}$ column of $A$ is replaced with the column vector $(y_1, \cdots, y_n)^{T}$, and the determinant of this modified matrix is taken and divided by $\det(A)$. This formula is known as Cramer's rule.
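The rule can be tried out on a small system. The following is only a minimal sketch, assuming integer (or `Fraction`) entries so that the arithmetic is exact; the helper names `det` and `cramer_solve` are hypothetical and are not taken from the text. The determinant is computed by cofactor expansion along the first column, which is fine for tiny matrices but far too slow for large ones.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first column."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        # Minor obtained by deleting row i and column 0.
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total


def cramer_solve(a, y):
    """Solve a x = y by Cramer's rule (assumes a is square with det(a) != 0)."""
    d = det(a)
    if d == 0:
        raise ValueError("det(A) = 0, so Cramer's rule does not apply")
    x = []
    for i in range(len(a)):
        # Replace the i-th column of a with the column vector y.
        modified = [row[:i] + [y[r]] + row[i + 1:] for r, row in enumerate(a)]
        x.append(Fraction(det(modified), d))
    return x


# Example system: 2 x1 + x2 = 3,  x1 + 3 x2 = 5.
print(cramer_solve([[2, 1], [1, 3]], [3, 5]))  # [Fraction(4, 5), Fraction(7, 5)]
```

Here $\det(A) = 5$, and replacing the first and second columns with $y$ gives determinants $4$ and $7$, so $x_1 = 4/5$ and $x_2 = 7/5$.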
Appendix B Implicit Function Theorem*
The implicit function theorem is one of the greatest theorems in mathematics. There are
many versions of this theorem which are of far greater generality than the one given here.
The proof given here is like one found in one of Caratheodory’s books on the calculus of
variations. It is not as elegant as some of the others which are based on a contraction
mapping principle but it may be more accessible. However, it is an advanced topic. Don’t
waste your time with it unless you have first read and understood the material on rank
and determinants found in the chapter on the mathematical theory of determinants. You
will also need to use the extreme value theorem for a function of n variables
and the chain rule of multivariable calculus as well as everything about matrix
multiplication.
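Definition B.0.1 Suppose $U$ is an open set in $\mathbb{R}^{n} \times \mathbb{R}^{m}$ and $(x,y)$ will denote a typical point of $\mathbb{R}^{n} \times \mathbb{R}^{m}$ with $x \in \mathbb{R}^{n}$ and $y \in \mathbb{R}^{m}$. Let $f : U \to \mathbb{R}^{p}$ be in $C^{1}$.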
Then by the assumption of continuity of all the partial derivatives and the extreme value theorem, there exist $r > 0$ and $\delta_0, \eta_0 > 0$ such that if $\delta \le \delta_0$ and $\eta \le \eta_0$, it follows that for all $(x^1, \cdots, x^n) \in \overline{B(x_0,\delta)}^{n}$ and $y \in \overline{B(y_0,\eta)}$,
\[
\left| \det\left( J\left(x^1, \cdots, x^n, y\right) \right) \right| > r > 0 \tag{2.3}
\]
and $B(x_0,\delta_0) \times B(y_0,\eta_0) \subseteq U$. By continuity of all the partial derivatives and the extreme value theorem, it can also be assumed there exists a constant $K$ such that for all $(x,y) \in B(x_0,\delta_0) \times B(y_0,\eta_0)$ and $i = 1, 2, \cdots, n$, the $i^{th}$ row of $D_2 f(x,y)$, given by $D_2 f_i(x,y)$, satisfies
\[
|D_2 f_i(x,y)| < K, \tag{2.4}
\]
and for all $(x^1, \cdots, x^n) \in B(x_0,\delta_0)^{n}$ and $y \in B(y_0,\eta_0)$, the $i^{th}$ row of the matrix $J(x^1, \cdots, x^n, y)^{-1}$, which equals $e_i^{T}\left( J(x^1, \cdots, x^n, y)^{-1} \right)$, satisfies
\[
\left| e_i^{T}\left( J\left(x^1, \cdots, x^n, y\right)^{-1} \right) \right| < K. \tag{2.5}
\]
(Recall that $e_i$ is the column vector consisting of all zeros except for a 1 in the $i^{th}$ position.)
To begin with it is shown that for a given $y \in B(y_0,\eta)$ there is at most one $x \in \overline{B(x_0,\delta)}$ such that $f(x,y) = 0$.
Pick $y \in B(y_0,\eta)$ and suppose there exist $x, z \in \overline{B(x_0,\delta)}$ such that $f(x,y) = f(z,y) = 0$. Consider $f_i$ and let
\[
h(t) \equiv f_i(x + t(z - x), y).
\]
Then $h(1) = h(0)$ and so by the mean value theorem, $h'(t_i) = 0$ for some $t_i \in (0,1)$.
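Therefore, from the chain rule and for this value of $t_i$,
\[
h'(t_i) = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}\left(x + t_i(z - x), y\right)(z_j - x_j) = 0.
\]
Letting $x^i \equiv x + t_i(z - x)$ for each $i$, these $n$ equations say that
\[
J\left(x^1, \cdots, x^n, y\right)(z - x) = 0
\]
and so from (2.3) $z - x = 0$. (The matrix in the above is invertible since its determinant is nonzero.) Now it will be shown that if $\eta$ is chosen sufficiently small, then for all $y \in B(y_0,\eta)$, there exists a unique $x(y) \in B(x_0,\delta)$ such that $f(x(y),y) = 0$.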
Claim: If $\eta$ is small enough, then the function $x \to h_y(x) \equiv |f(x,y)|^{2}$ achieves its minimum value on the closed ball $\overline{B(x_0,\delta)}$ at a point of the open ball $B(x_0,\delta)$. (The existence of a point in $\overline{B(x_0,\delta)}$ at which $h_y$ achieves its minimum follows from the extreme value theorem.)
Proof of claim: Suppose this is not the case. Then there exists a sequence $\eta_k \to 0$ and for some $y_k$ having $|y_k - y_0| < \eta_k$, the minimum of $h_{y_k}$ on $\overline{B(x_0,\delta)}$ occurs at a point $x_k$ such that $|x_0 - x_k| = \delta$. Now taking a subsequence, still denoted by $k$, it can be assumed that $x_k \to x$ with $|x - x_0| = \delta$ and $y_k \to y_0$. This follows from the fact that
\[
\left\{ x \in \overline{B(x_0,\delta)} : |x - x_0| = \delta \right\}
\]
is a closed and bounded set and is therefore sequentially compact. Let $\varepsilon > 0$. Then for $k$ large enough, the continuity of $y \to h_y(x_0)$ implies $h_{y_k}(x_0) < \varepsilon$ because $h_{y_0}(x_0) = 0$ since $f(x_0,y_0) = 0$. Therefore, from the definition of $x_k$, it is also the case that $h_{y_k}(x_k) < \varepsilon$. Passing to the limit yields $h_{y_0}(x) \le \varepsilon$. Since $\varepsilon > 0$ is arbitrary, it follows that $h_{y_0}(x) = 0$, which contradicts the first part of the argument in which it was shown that for $y \in B(y_0,\eta)$ there is at most one point $x$ of $\overline{B(x_0,\delta)}$ where $f(x,y) = 0$. Here two have been obtained, $x_0$ and $x$. This proves the claim.
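The claim can be watched numerically in one variable. The following is only an illustrative sketch, not part of the proof: the function $f(x,y) = x^{2} + y^{2} - 1$, the point $(x_0,y_0) = (1,0)$, the radius $\delta = 0.5$, and the helper name `minimize_h` are all assumptions chosen for the example. A grid search finds the minimizer of $h_y(x) = |f(x,y)|^{2}$ on the closed interval $[x_0 - \delta, x_0 + \delta]$ and reports whether it is an interior point.

```python
def f(x, y):
    # Hypothetical example: f(1, 0) = 0 and the x-derivative at (1, 0) is 2, not 0.
    return x * x + y * y - 1.0


def minimize_h(y, x0=1.0, delta=0.5, steps=10000):
    """Grid-search the minimizer of h_y(x) = |f(x, y)|^2 on [x0 - delta, x0 + delta]."""
    best_x, best_h = None, float("inf")
    for k in range(steps + 1):
        x = x0 - delta + 2.0 * delta * k / steps
        h = f(x, y) ** 2
        if h < best_h:  # strict inequality keeps the first minimizer found
            best_x, best_h = x, h
    return best_x, best_h


for y in (0.0, 0.1, 0.2, 0.3, 0.9):
    x_min, h_min = minimize_h(y)
    interior = abs(x_min - 1.0) < 0.5
    print(f"y={y:.1f}  minimizer={x_min:.4f}  h={h_min:.2e}  interior={interior}")
```

For $y$ close to $y_0 = 0$ the minimizer sits near $\sqrt{1 - y^{2}}$, inside the interval, with $h_y$ essentially zero. For $y = 0.9$ the zero of $f(\cdot,y)$ lies outside the interval, so the minimum is taken at the endpoint $x = 0.5$. This is why the claim needs $\eta$ to be small.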
Choose $\eta < \eta_0$ and also small enough that the above claim holds and let $x(y)$ denote a point of $B(x_0,\delta)$ at which the minimum of $h_y$ on $\overline{B(x_0,\delta)}$ is achieved. Since $x(y)$ is an interior point, you can consider $h_y(x(y) + tv)$ for $|t|$ small and conclude this function of $t$ has a zero derivative at $t = 0$. Now