Definition A.6.1 Let A be an n × n matrix. The characteristic polynomial is defined as
q_{A}(t) ≡ det(tI − A)
and the solutions to q_{A}(t) = 0 are called eigenvalues. For A a matrix and p(t) = t^{n} + a_{n−1}t^{n−1} + ⋅⋅⋅ + a_{1}t + a_{0}, denote by p(A) the matrix defined by
p(A) ≡ A^{n} + a_{n−1}A^{n−1} + ⋅⋅⋅ + a_{1}A + a_{0}I.
The explanation for the last term is that A^{0} is interpreted as I, the identity matrix.
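The definition of p(A) can be made concrete with a short sketch. The helper names `mat_mul` and `poly_at_matrix` below are illustrative choices, not notation from the text, and the coefficient list is allowed to be non-monic for convenience. The evaluation uses Horner's scheme, which is one reasonable way to compute the sum and is not prescribed by the definition.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate p(A) where coeffs = [a_0, a_1, ..., a_n] (lowest degree first).

    The constant term a_0 is added as a_0 * I, matching the convention
    that A^0 is the identity matrix.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    # Horner's scheme: result <- result * A + a_k * I, from the top coefficient down.
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

# Example: p(t) = t^2 + 2t + 3 evaluated at a 2 x 2 matrix.
A = [[1, 1],
     [0, 1]]
print(poly_at_matrix([3, 2, 1], A))  # [[6, 4], [0, 6]]
```

Here A^{2} = [[1, 2], [0, 1]], so A^{2} + 2A + 3I has diagonal entries 1 + 2 + 3 = 6 and upper right entry 2 + 2 = 4.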
The Cayley-Hamilton theorem states that every matrix satisfies its characteristic equation, that equation defined by q_{A}(t) = 0. It is one of the most important theorems in linear algebra.^{2}
The proof in this section is not the most general proof, but works well when the field of scalars is ℝ or ℂ.
The following lemma will help with its proof.
Lemma A.6.2 Suppose for all |λ| large enough,
A_{0} + A_{1}λ + ⋅⋅⋅ + A_{m}λ^{m} = 0,
where the A_{i} are n × n matrices. Then each A_{i} = 0.
Proof: Multiply by λ^{−m} to obtain
A_{0}λ^{−m} + A_{1}λ^{−m+1} + ⋅⋅⋅ + A_{m−1}λ^{−1} + A_{m} = 0.
Now let |λ| → ∞ to obtain A_{m} = 0. With this, multiply by λ to obtain
A_{0}λ^{−m+1} + A_{1}λ^{−m+2} + ⋅⋅⋅ + A_{m−1} = 0.
Now let |λ| → ∞ to obtain A_{m−1} = 0. Continue multiplying by λ and letting |λ| → ∞ to obtain that all the A_{i} = 0. ■
With the lemma, here is a simple corollary.
Corollary A.6.3 Let A_{i} and B_{i} be n × n matrices and suppose
A_{0} + A_{1}λ + ⋅⋅⋅ + A_{m}λ^{m} = B_{0} + B_{1}λ + ⋅⋅⋅ + B_{m}λ^{m}
for all |λ| large enough. Then A_{i} = B_{i} for all i. If A_{i} = B_{i} for each i, then one can substitute an n × n matrix M for λ and the identity will continue to hold.
Proof: Subtract and use the result of the lemma. The last claim is obvious by matching terms. ■
With this preparation, here is a relatively easy proof of the Cayley-Hamilton theorem.
Theorem A.6.4 Let A be an n × n matrix and let q(λ) ≡ det(λI − A) be the characteristic polynomial. Then q(A) = 0.
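Proof: Let C(λ) equal the transpose of the cofactor matrix of (λI − A) for |λ| large. (If |λ| is large enough, then λ cannot be in the finite list of eigenvalues of A and so for such λ, (λI − A)^{−1} exists.) By the cofactor formula for the inverse,
C(λ) = q(λ)(λI − A)^{−1}.
Say q(λ) = a_{0} + a_{1}λ + ⋅⋅⋅ + a_{n−1}λ^{n−1} + λ^{n}. Each entry of C(λ) is a polynomial in λ of degree at most n − 1, so collecting terms gives
C(λ) = C_{0} + C_{1}λ + ⋅⋅⋅ + C_{n−1}λ^{n−1}
for some n × n matrices C_{j}. Then
(C_{0} + C_{1}λ + ⋅⋅⋅ + C_{n−1}λ^{n−1})(λI − A) = C(λ)(λI − A) = q(λ)I.
Multiplying out the left side, it follows that for all |λ| sufficiently large,
a_{0}I + a_{1}Iλ + ⋅⋅⋅ + a_{n−1}Iλ^{n−1} + Iλ^{n} = −C_{0}A + (C_{0} − C_{1}A)λ + (C_{1} − C_{2}A)λ^{2} + ⋅⋅⋅ + (C_{n−2} − C_{n−1}A)λ^{n−1} + C_{n−1}λ^{n}.
Then, using Corollary A.6.3, one can replace λ on both sides with A. Then the right side is seen to equal 0, because the resulting sum telescopes. Hence the left side, q(A)I, is also equal to 0. ■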
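As a sanity check of Theorem A.6.4 on a concrete matrix, here is a minimal sketch in exact rational arithmetic. It computes the coefficients of det(tI − A) with the Faddeev-LeVerrier recursion, which is a different technique from the cofactor argument in the proof and is used here only because it is short to code. All function names are illustrative assumptions.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(tI - A), lowest degree first.

    Faddeev-LeVerrier recursion: M_0 = 0, c_n = 1, and for k = 1..n
        M_k = A M_{k-1} + c_{n-k+1} I,   c_{n-k} = -trace(A M_k) / k.
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = mat_mul(A, M)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return c

def poly_at_matrix(coeffs, A):
    """Evaluate c_0 I + c_1 A + ... + c_n A^n by Horner's scheme."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

A = [[2, 1, 0],
     [0, 3, 1],
     [1, 0, 1]]
q = char_poly(A)
print([int(c) for c in q])      # [-7, 11, -6, 1], i.e. t^3 - 6t^2 + 11t - 7
zero = poly_at_matrix(q, A)
print(all(entry == 0 for row in zero for entry in row))  # True
```

Using `Fraction` keeps every entry exact, so q(A) comes out as the zero matrix exactly rather than merely close to it in floating point.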
The following theorem is of fundamental importance and ties together many of the ideas presented
above.
Theorem A.6.5 Let A be an n × n matrix. Then the following are equivalent.
1. det(A) = 0.
2. A, A^{T} are not one to one.
3. A is not onto.
Proof: Suppose det(A) = 0. Then A cannot be one to one because if it were, then it would be onto as well thanks to Theorem A.4.1. Hence A^{−1} would exist, because there would exist b_{i} such that Ab_{i} = e_{i} and so B ≡ (b_{1} ⋅⋅⋅ b_{n}) satisfies AB = I and so B = A^{−1}. But then
1 = det(AA^{−1}) = det(A) det(A^{−1}) = 0,
which is a contradiction. Now det(A) = det(A^{T}) and so the same reasoning implies A^{T} is not one to one. This verifies that 1. implies 2.
Now suppose 2. Then since A^{T} is not one to one, it follows there exists x ≠ 0 such that
A^{T}x = 0.
Taking the transpose of both sides yields
x^{T}A = 0^{T}
where the 0^{T} is a 1 × n matrix or row vector. Now if Ay = x, then
|x|^{2} = x^{T}(Ay) = (x^{T}A)y = 0^{T}y = 0
contrary to x ≠ 0. Consequently there can be no y such that Ay = x and so A is not onto. This shows that 2. implies 3.
Finally, suppose 3. If 1. does not hold, then det(A) ≠ 0 but then from Theorem A.5.17 A^{−1} exists and so for every y ∈ F^{n} there exists a unique x ∈ F^{n} such that Ax = y. In fact x = A^{−1}y. Thus A would be onto contrary to 3. This shows 3. implies 1. ■
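The three equivalent conditions can be seen side by side on a small singular matrix. The sketch below uses a hand-picked 2 × 2 integer example, and the helper names are illustrative assumptions. It follows the step in the proof where a nonzero x with A^{T}x = 0 is shown to lie outside the range of A, because x^{T}(Ay) = 0 for every y while |x|^{2} ≠ 0.

```python
def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def mat_vec(A, v):
    """Matrix-vector product Av."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    """Real dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = [[1, 2],
     [3, 6]]
z = [2, -1]   # nonzero vector with Az = 0, so A is not one to one
x = [3, -1]   # nonzero vector with A^T x = 0, so A^T is not one to one

print(det2(A))                    # 0
print(mat_vec(A, z))              # [0, 0]
print(mat_vec(transpose(A), x))   # [0, 0]

# x is orthogonal to every Ay, yet dot(x, x) = 10 is nonzero,
# so no y can satisfy Ay = x: A is not onto.
for y in ([1, 0], [0, 1], [5, -7]):
    print(dot(x, mat_vec(A, y)))  # 0 each time
print(dot(x, x))                  # 10
```

Only three sample vectors y are tested in the loop. The general statement for all y is exactly the identity x^{T}(Ay) = (x^{T}A)y = 0 from the proof.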
Corollary A.6.6 Let A be an n × n matrix. Then the following are equivalent.
1. det(A) ≠ 0.
2. A and A^{T} are one to one.
3. A is onto.
Proof: This follows immediately from the above theorem.