8.9 The Cayley Hamilton Theorem
Here is a simple proof of the Cayley Hamilton theorem in the special case that the field of scalars is ℝ, ℚ,
or ℂ. This proof does not work for arbitrary fields. A proof of this theorem valid for every field will be
outlined in exercises. See Problem 21 on Page 507. The cases considered here comprise most major
applications of the Cayley Hamilton theorem.
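Definition 8.9.1 Let $A$ be an $n \times n$ matrix. The characteristic polynomial is defined as
\[ q_A(t) \equiv \det(tI - A), \]
and the solutions to $q_A(t) = 0$ are called eigenvalues. For $A$ a matrix and $p(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0$, denote by $p(A)$ the matrix defined by
\[ p(A) \equiv A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I. \]
The explanation for the last term is that $A^0$ is interpreted as $I$, the identity matrix. This definition of the characteristic polynomial makes sense over any field, but in this section, the field will be one of those mentioned above.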
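As a small illustration of the definition of $p(A)$, the following Python sketch evaluates a polynomial at a matrix using Horner's rule. The helper names `mat_mul` and `poly_at_matrix` are hypothetical, introduced only for this example; matrices are assumed to be plain lists of rows, and the sample matrix is chosen arbitrarily.

```python
def mat_mul(X, Y):
    """Product of two n x n matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, A):
    """Evaluate p(A) for p(t) = coeffs[0] + coeffs[1] t + ... + coeffs[d] t^d.

    Horner's rule is used. The constant term enters as coeffs[0] * I,
    which matches the convention A^0 = I.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c
    return result


A = [[1, 2], [3, 4]]

# p(t) = t^2 + 1, so p(A) = A^2 + I
p_of_A = poly_at_matrix([1, 0, 1], A)
print(p_of_A)  # [[8, 10], [15, 23]]

# For this A, det(tI - A) = (t - 1)(t - 4) - 6 = t^2 - 5t - 2 (computed by hand).
q_of_A = poly_at_matrix([-2, -5, 1], A)
print(q_of_A)  # [[0, 0], [0, 0]]
```

Coefficients are listed lowest degree first, so the constant term $a_0$ is `coeffs[0]`.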
The Cayley Hamilton theorem states that every matrix satisfies its characteristic equation, that equation defined by $q_A(t) = 0$. It is one of the most important theorems in linear algebra.
The proof in this section is not the most general proof, but works well when the field of scalars is ℝ, ℚ, or ℂ.
The following lemma will help with its proof.
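Lemma 8.9.2 Suppose for all $|\lambda|$ large enough,
\[ A_0 + A_1\lambda + \cdots + A_m\lambda^m = 0, \]
where the $A_i$ are $n \times n$ matrices. Then each $A_i = 0$.
Proof: Suppose some $A_i \neq 0$. Let $p$ be the largest index of those which are nonzero. Then multiply by $\lambda^{-p}$ to obtain
\[ A_0\lambda^{-p} + A_1\lambda^{-p+1} + \cdots + A_{p-1}\lambda^{-1} + A_p = 0. \]
Now let $\lambda \to \infty$. Every term except $A_p$ tends to 0. Thus $A_p = 0$ after all, a contradiction. Hence each $A_i = 0$. ■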
With the lemma, here is a simple corollary.
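Corollary 8.9.3 Let $A_i$ and $B_i$ be $n \times n$ matrices and suppose
\[ A_0 + A_1\lambda + \cdots + A_m\lambda^m = B_0 + B_1\lambda + \cdots + B_m\lambda^m \]
for all $|\lambda|$ large enough. Then $A_i = B_i$ for all $i$. Consequently, one can substitute an $n \times n$ matrix $M$ for $\lambda$, writing each power $M^i$ to the right of its coefficient, and the identity will continue to hold.
Proof: Subtract and use the result of the lemma. The last claim is obvious by matching terms. ■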
With this preparation, here is a relatively easy proof of the Cayley Hamilton theorem.
Theorem 8.9.4 Let $A$ be an $n \times n$ matrix and let $q(\lambda) \equiv \det(\lambda I - A)$ be the characteristic polynomial. Then $q(A) = 0$.
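Proof: Let $C(\lambda)$ equal the transpose of the cofactor matrix of $\lambda I - A$ for $|\lambda|$ large. (If $|\lambda|$ is large enough, then $\lambda$ cannot be in the finite list of eigenvalues of $A$, and so for such $\lambda$, $(\lambda I - A)^{-1}$ exists.) Therefore, by Theorem 8.6.1,
\[ C(\lambda) = q(\lambda)(\lambda I - A)^{-1}. \]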
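Note that each entry in $C(\lambda)$ is a polynomial in $\lambda$ having degree no more than $n - 1$. For example, with $n = 2$ you might have something like
\[ C(\lambda) = \begin{pmatrix} \lambda - 4 & 2 \\ 3 & \lambda - 1 \end{pmatrix} = \begin{pmatrix} -4 & 2 \\ 3 & -1 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\lambda. \]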
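Therefore, collecting the terms in the general case,
\[ C(\lambda) = C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1} \]
for $C_j$ some $n \times n$ matrix. Then
\[ C(\lambda)(\lambda I - A) = \left(C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}\right)(\lambda I - A) = q(\lambda)I. \]
Write $q(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0$. Then multiplying out the middle term, it follows that for all $|\lambda|$ sufficiently large,
\[ a_0 I + a_1 I\lambda + \cdots + a_{n-1}I\lambda^{n-1} + I\lambda^n = C_0\lambda + C_1\lambda^2 + \cdots + C_{n-1}\lambda^n - \left[C_0 A + C_1 A\lambda + \cdots + C_{n-1}A\lambda^{n-1}\right] \]
\[ = -C_0 A + (C_0 - C_1 A)\lambda + (C_1 - C_2 A)\lambda^2 + \cdots + (C_{n-2} - C_{n-1}A)\lambda^{n-1} + C_{n-1}\lambda^n. \]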
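Then, using Corollary 8.9.3, one can replace $\lambda$ on both sides with $A$, writing each power of $A$ to the right of its coefficient. The right side then telescopes:
\[ -C_0 A + (C_0 - C_1 A)A + (C_1 - C_2 A)A^2 + \cdots + (C_{n-2} - C_{n-1}A)A^{n-1} + C_{n-1}A^n = 0. \]
Hence the left side, $q(A)$, is also equal to 0. ■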
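The objects in this proof can be checked numerically on a small example. The sketch below is not the method of the proof: it swaps in the Faddeev-LeVerrier recursion, which yields both the coefficients of $\det(\lambda I - A)$ and the coefficient matrices $C_j$ of the transposed cofactor matrix. It then tests $q(A) = 0$ and the identity $C(\lambda)(\lambda I - A) = q(\lambda)I$ at one sample value of $\lambda$. All function names are hypothetical, the test matrix is arbitrary, and exact rational arithmetic from the standard library is assumed so that equality tests are meaningful.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two n x n matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def faddeev_leverrier(A):
    """Return (c, C) where det(lam*I - A) = sum_k c[k] * lam**k and the
    transposed cofactor matrix of lam*I - A equals sum_j C[j] * lam**j."""
    n = len(A)
    c = [Fraction(0)] * n + [Fraction(1)]      # monic: c[n] = 1
    M = [[Fraction(0)] * n for _ in range(n)]  # M_0 = 0
    C = [None] * n
    for k in range(1, n + 1):
        M = mat_mul(A, M)                      # M_k = A M_{k-1} + c[n-k+1] I
        for i in range(n):
            M[i][i] += c[n - k + 1]
        C[n - k] = M
        AM = mat_mul(A, M)
        c[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return c, C


def poly_at_matrix(coeffs, A):
    """Evaluate sum_k coeffs[k] * A**k by Horner's rule, with A**0 = I."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for coef in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += coef
    return result


def adjugate_identity_holds(A, lam):
    """Check C(lam) (lam*I - A) == q(lam) I at one sample value lam."""
    n = len(A)
    c, C = faddeev_leverrier(A)
    C_lam = [[sum(C[j][r][s] * lam ** j for j in range(n)) for s in range(n)]
             for r in range(n)]
    lamI_minus_A = [[(lam if r == s else 0) - A[r][s] for s in range(n)]
                    for r in range(n)]
    q_lam = sum(c[k] * lam ** k for k in range(n + 1))
    expected = [[q_lam if r == s else 0 for s in range(n)] for r in range(n)]
    return mat_mul(C_lam, lamI_minus_A) == expected


A = [[2, -1, 0], [1, 3, 4], [0, 5, -2]]
c, C = faddeev_leverrier(A)
q_of_A = poly_at_matrix(c, A)
zero = [[0] * 3 for _ in range(3)]
print(q_of_A == zero)                  # True
print(adjugate_identity_holds(A, 7))   # True
```

A check at a single $\lambda$ does not prove the polynomial identity; it is only a sanity test of the coefficient matrices.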
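Here is an interesting and significant application of block multiplication. In this theorem, $q_M(t)$ denotes the characteristic polynomial, $\det(tI - M)$. The zeros of this polynomial will be shown later to be eigenvalues of the matrix $M$. First note that from block multiplication, for the following block matrices consisting of square blocks of an appropriate size,
\[ \begin{pmatrix} A & 0 \\ B & C \end{pmatrix} = \begin{pmatrix} A & 0 \\ B & I \end{pmatrix}\begin{pmatrix} I & 0 \\ 0 & C \end{pmatrix}, \quad \text{so} \quad \det\begin{pmatrix} A & 0 \\ B & C \end{pmatrix} = \det(A)\det(C). \]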
Theorem 8.9.5 Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times m$ matrix for $m \leq n$. Then
\[ q_{BA}(t) = t^{n-m} q_{AB}(t), \]
so the eigenvalues of $BA$ and $AB$ are the same including multiplicities except that $BA$ has $n - m$ extra zero eigenvalues. Here $q_A(t)$ denotes the characteristic polynomial of the matrix $A$.
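Proof: Use block multiplication to write
\[ \begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}\begin{pmatrix} I & A \\ 0 & I \end{pmatrix} = \begin{pmatrix} AB & ABA \\ B & BA \end{pmatrix}, \]
\[ \begin{pmatrix} I & A \\ 0 & I \end{pmatrix}\begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix} = \begin{pmatrix} AB & ABA \\ B & BA \end{pmatrix}. \]
Therefore,
\[ \begin{pmatrix} I & A \\ 0 & I \end{pmatrix}^{-1}\begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix}\begin{pmatrix} I & A \\ 0 & I \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix}. \]
Since the two matrices above are similar, it follows that
\[ \begin{pmatrix} 0 & 0 \\ B & BA \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} AB & 0 \\ B & 0 \end{pmatrix} \]
have the same characteristic polynomials. Thus
\[ \det\begin{pmatrix} tI_m & 0 \\ -B & tI_n - BA \end{pmatrix} = \det\begin{pmatrix} tI_m - AB & 0 \\ -B & tI_n \end{pmatrix}, \]
and so $\det(tI - BA)\,t^m = t^n \det(tI - AB)$.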
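The relation between the characteristic polynomials of $AB$ and $BA$ can be sanity-checked on a small example. The Python sketch below computes both characteristic polynomials exactly and compares coefficients. The helper names are hypothetical, the matrices are chosen arbitrarily with $m = 2$ and $n = 3$, and the characteristic polynomial is computed with the Faddeev-LeVerrier recursion, which is an assumption of this example and not something used in the text.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of an r x s matrix X and an s x t matrix Y (lists of rows)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def char_poly(A):
    """Coefficients c[0..n] of det(tI - A), lowest degree first."""
    n = len(A)
    c = [Fraction(0)] * n + [Fraction(1)]      # monic: c[n] = 1
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        M = mat_mul(A, M)                      # M_k = A M_{k-1} + c[n-k+1] I
        for i in range(n):
            M[i][i] += c[n - k + 1]
        AM = mat_mul(A, M)
        c[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return c


m, n = 2, 3
A = [[1, 2, 0], [0, 1, 3]]      # m x n
B = [[1, 0], [2, 1], [0, 4]]    # n x m

q_AB = char_poly(mat_mul(A, B))  # t^2 - 18 t + 61
q_BA = char_poly(mat_mul(B, A))  # t^3 - 18 t^2 + 61 t

# Multiplying q_AB(t) by t^(n-m) shifts its coefficient list by n - m places.
shifted = [Fraction(0)] * (n - m) + q_AB
print(q_BA == shifted)  # True
```

Here $AB$ is $2 \times 2$ and $BA$ is $3 \times 3$, so $BA$ picks up exactly one extra zero eigenvalue.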