Here is a simple proof of the Cayley Hamilton theorem in the special case that the field of scalars is ℝ, ℚ,
or ℂ. This proof does not work for arbitrary fields. A proof of this theorem valid for every field will be
outlined in exercises. See Problem 21 on Page 507. The cases considered here comprise most major
applications of the Cayley Hamilton theorem.
Definition 8.9.1 Let $A$ be an $n \times n$ matrix. The characteristic polynomial is defined as
$$q_A(t) \equiv \det(tI - A)$$
and the solutions to $q_A(t) = 0$ are called eigenvalues. For $A$ a matrix and $p(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0$, denote by $p(A)$ the matrix defined by
$$p(A) \equiv A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I.$$
The explanation for the last term is that $A^0$ is interpreted as $I$, the identity matrix. The characteristic polynomial is defined this way over any field, but in this section the field will be one of those mentioned above.
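As a small illustration of the definition of $p(A)$, here is a minimal Python sketch that evaluates a polynomial at a square matrix by Horner's rule. The helper names `mat_mul`, `identity`, and `poly_at_matrix`, the sample matrix, and the sample polynomial are illustrative assumptions, not part of the text. Exact rational arithmetic is used, which matches the case of the field ℚ.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    """The n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate p(A), where coeffs = [a_0, a_1, ..., a_n] (lowest degree first).

    Horner's rule: p(A) = (...((a_n I) A + a_{n-1} I) A + ...) A + a_0 I.
    The constant term enters as a_0 * I because A^0 is interpreted as I.
    """
    n = len(A)
    I = identity(n)
    result = [[Fraction(0)] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        result = [[result[i][j] + a * I[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# Hypothetical example: p(t) = t^2 + 3t + 2 and a 2 x 2 matrix A.
A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
p_of_A = poly_at_matrix([2, 3, 1], A)   # A^2 + 3A + 2I
print(p_of_A)
```

Here $A^2 + 3A + 2I$ works out by hand to the matrix with rows $(12, 16)$ and $(24, 36)$, which is what the sketch computes.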
The Cayley Hamilton theorem states that every matrix satisfies its characteristic equation, that equation defined by $q_A(t) = 0$. It is one of the most important theorems in linear algebra^{1}.
The proof in this section is not the most general proof, but works well when the field of scalars is ℝ or ℂ.
The following lemma will help with its proof.
Lemma 8.9.2 Suppose for all $|\lambda|$ large enough,
$$A_0 + A_1\lambda + \cdots + A_m\lambda^m = 0,$$
where the $A_i$ are $n \times n$ matrices. Then each $A_i = 0$.
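Proof: Suppose some $A_i \neq 0$. Let $p$ be the largest index of those which are nonzero. Then multiply by $\lambda^{-p}$:
$$A_0\lambda^{-p} + A_1\lambda^{-p+1} + \cdots + A_{p-1}\lambda^{-1} + A_p = 0.$$
Now let $\lambda \to \infty$. Every term except the last tends to 0, so $A_p = 0$ after all, a contradiction. Hence each $A_i = 0$. ■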
With the lemma, here is a simple corollary.
Corollary 8.9.3 Let $A_i$ and $B_i$ be $n \times n$ matrices and suppose
$$A_0 + A_1\lambda + \cdots + A_m\lambda^m = B_0 + B_1\lambda + \cdots + B_m\lambda^m$$
for all $|\lambda|$ large enough. Then $A_i = B_i$ for all $i$. If $A_i = B_i$ for each $i$, then one can substitute an $n \times n$ matrix $M$ for $\lambda$ and the identity will continue to hold.
Proof: Subtract and use the result of the lemma. The last claim is obvious by matching terms. ■
With this preparation, here is a relatively easy proof of the Cayley Hamilton theorem.
Theorem 8.9.4 Let $A$ be an $n \times n$ matrix and let $q(\lambda) \equiv \det(\lambda I - A)$ be the characteristic polynomial. Then $q(A) = 0$.
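Proof: Let $C(\lambda)$ equal the transpose of the cofactor matrix of $(\lambda I - A)$ for $|\lambda|$ large. (If $|\lambda|$ is large enough, then $\lambda$ cannot be in the finite list of eigenvalues of $A$ and so for such $\lambda$, $(\lambda I - A)^{-1}$ exists.) By the cofactor formula for the inverse, $C(\lambda) = q(\lambda)(\lambda I - A)^{-1}$, so
$$C(\lambda)(\lambda I - A) = q(\lambda) I.$$
Each entry of $C(\lambda)$ is a polynomial in $\lambda$ of degree at most $n-1$, so $C(\lambda) = C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}$ for some $n \times n$ matrices $C_j$. Writing $q(\lambda) = a_0 + a_1\lambda + \cdots + a_{n-1}\lambda^{n-1} + \lambda^n$ and multiplying out, it follows that for all $|\lambda|$ large enough,
$$a_0 I + a_1 I\lambda + \cdots + I\lambda^n = -C_0 A + (C_0 - C_1 A)\lambda + \cdots + (C_{n-2} - C_{n-1}A)\lambda^{n-1} + C_{n-1}\lambda^n.$$
Then, using Corollary 8.9.3, one can replace $\lambda$ on both sides with $A$. Then the right side is seen to equal 0, because its terms cancel in pairs. Hence the left side, $q(A)I$, is also equal to 0. ■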
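The theorem can be checked numerically on a small example. The sketch below computes the coefficients of $\det(tI - A)$ with the Faddeev-LeVerrier recursion, a standard method that is not the one used in the proof above and that needs division by $1, 2, \ldots, n$, so it fits the fields considered in this section. The function names and the sample matrix are illustrative assumptions.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(tI - A), lowest degree first.

    Faddeev-LeVerrier recursion: M_1 = I, c_n = 1, and for k = 1, ..., n
        c_{n-k} = -trace(A M_k) / k,    M_{k+1} = A M_k + c_{n-k} I.
    """
    A = [[Fraction(x) for x in row] for row in A]
    n = len(A)
    coeffs = [Fraction(0)] * n + [Fraction(1)]          # monic: c_n = 1
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs[n - k] = c
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

def poly_at_matrix(coeffs, A):
    """Evaluate c_0 I + c_1 A + ... + c_n A^n by Horner's rule."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

# Hypothetical 3 x 3 example with integer entries.
A = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
q = char_poly(A)
q_of_A = poly_at_matrix(q, A)
print(q)        # coefficients of the characteristic polynomial
print(q_of_A)   # Cayley Hamilton: this should be the zero matrix
```

A check of this kind only confirms the statement for one matrix; it is the argument with Corollary 8.9.3 that establishes it in general.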
Here is an interesting and significant application of block multiplication. In this theorem, $q_M(t)$ denotes the characteristic polynomial, $\det(tI - M)$. The zeros of this polynomial will be shown later to be eigenvalues of the matrix $M$. First note that from block multiplication, for the following block matrices consisting of square blocks of an appropriate size,
$$\begin{pmatrix} A & 0 \\ B & C \end{pmatrix} = \begin{pmatrix} A & 0 \\ B & I \end{pmatrix}\begin{pmatrix} I & 0 \\ 0 & C \end{pmatrix}$$
so
$$\det\begin{pmatrix} A & 0 \\ B & C \end{pmatrix} = \det\begin{pmatrix} A & 0 \\ B & I \end{pmatrix}\det\begin{pmatrix} I & 0 \\ 0 & C \end{pmatrix} = \det(A)\det(C).$$
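The determinant formula for a block lower triangular matrix can be tried on a concrete case. The following sketch assembles such a matrix from blocks and compares determinants using a plain cofactor expansion, which is adequate only for small matrices. The helper names `det` and `block_lower` and the sample blocks are illustrative assumptions.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def block_lower(A, B, C):
    """Assemble [[A, 0], [B, C]] from square A (p x p), square C (r x r), and B (r x p)."""
    p, r = len(A), len(C)
    top = [list(A[i]) + [0] * r for i in range(p)]
    bottom = [list(B[i]) + list(C[i]) for i in range(r)]
    return top + bottom

# Hypothetical blocks: det(A) = 5 and det(C) = -7.
A = [[2, 1], [1, 3]]
B = [[5, 6], [7, 8]]
C = [[1, 4], [2, 1]]
M = block_lower(A, B, C)

print(det(M), det(A) * det(C))   # both equal -35
```

Note that the block $B$ plays no role in the value of the determinant, as the factorization above predicts.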
Theorem 8.9.5 Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times m$ matrix for $m \leq n$. Then
$$q_{BA}(t) = t^{n-m} q_{AB}(t),$$
so the eigenvalues of $BA$ and $AB$ are the same including multiplicities except that $BA$ has $n - m$ extra zero eigenvalues. Here $q_A(t)$ denotes the characteristic polynomial of the matrix $A$.
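Before the proof, the statement can be tested on a small example with $m = 2$ and $n = 3$. The sketch below computes both characteristic polynomials with the Faddeev-LeVerrier recursion (a standard method, chosen here for brevity and not taken from the text) and compares coefficient lists. The function names and the sample matrices are illustrative assumptions.

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(S):
    """Coefficients [c_0, ..., c_n] of det(tI - S), lowest degree first."""
    S = [[Fraction(x) for x in row] for row in S]
    n = len(S)
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        SM = mat_mul(S, M)
        c = -sum(SM[i][i] for i in range(n)) / k     # c_{n-k} = -trace(S M_k) / k
        coeffs[n - k] = c
        M = [[SM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

# Hypothetical example: A is 2 x 3 and B is 3 x 2, so AB is 2 x 2 and BA is 3 x 3.
A = [[1, 2, 0], [0, 1, 1]]
B = [[1, 0], [2, 1], [0, 3]]
m, n = len(A), len(B)

q_AB = char_poly(mat_mul(A, B))   # t^2 - 9t + 16
q_BA = char_poly(mat_mul(B, A))   # t^3 - 9t^2 + 16t

# Multiplying by t^(n-m) shifts the coefficient list by n - m places.
print(q_BA == [Fraction(0)] * (n - m) + q_AB)
```

In this example $BA$ has the two eigenvalues of $AB$ together with one extra zero eigenvalue, since $n - m = 1$.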
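Proof: Use block multiplication to write
$$\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}\begin{pmatrix} I & A \\ 0 & I \end{pmatrix} = \begin{pmatrix} AB & ABA \\ B & BA \end{pmatrix}$$
$$\begin{pmatrix} I & A \\ 0 & I \end{pmatrix}\begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix} = \begin{pmatrix} AB & ABA \\ B & BA \end{pmatrix}.$$
Hence
$$\begin{pmatrix} I & A \\ 0 & I \end{pmatrix}\begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix} = \begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}\begin{pmatrix} I & A \\ 0 & I \end{pmatrix}.$$
Therefore,
$$\begin{pmatrix} I & A \\ 0 & I \end{pmatrix}^{-1}\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}\begin{pmatrix} I & A \\ 0 & I \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix}.$$
Since the two matrices above are similar, it follows that
$$\begin{pmatrix} 0_{m \times m} & 0 \\ B & BA \end{pmatrix}, \quad \begin{pmatrix} AB & 0 \\ B & 0_{n \times n} \end{pmatrix}$$
have the same characteristic polynomials. Thus
$$\det\begin{pmatrix} tI_{m \times m} & 0 \\ -B & tI - BA \end{pmatrix} = \det\begin{pmatrix} tI - AB & 0 \\ -B & tI_{n \times n} \end{pmatrix} \tag{8.13}$$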