A.6 Cayley-Hamilton Theorem
Definition A.6.1 Let A be an n × n matrix. The characteristic polynomial is defined as
and the solutions to qA
are called eigenvalues. For A a matrix and p
denote by p
the matrix defined by
The explanation for the last term is that A0 is interpreted as I, the identity matrix.
The Cayley-Hamilton theorem states that every matrix satisfies its characteristic equation,
that equation defined by qA
= 0. It is one of the most important theorems in linear
The proof in this section is not the most general proof, but works well when the field of scalars is ℝ
The following lemma will help with its proof.
Lemma A.6.2 Suppose for all
where the Ai are n × n matrices. Then each Ai = 0.
Proof: Multiply by λ−m to obtain
to obtain Am
With this, multiply by λ
to obtain Am−1
Continue multiplying by λ
and letting λ →∞
to obtain that all the
= 0. ■
With the lemma, here is a simple corollary.
Corollary A.6.3 Let Ai and Bi be n × n matrices and suppose
large enough. Then Ai
= Bi for all i. If Ai
= Bi for each Ai,Bi then one can substitute an
n × n matrix M for λ and the identity will continue to hold.
Proof: Subtract and use the result of the lemma. The last claim is obvious by matching terms.
With this preparation, here is a relatively easy proof of the Cayley-Hamilton theorem.
Theorem A.6.4 Let A be an n × n matrix and let q
be the characteristic
polynomial. Then q
Proof: Let C
equal the transpose of the cofactor matrix of
cannot be in the finite list of eigenvalues of A
and so for such λ,
Therefore, by Theorem A.5.17
Note that each entry in C
is a polynomial in
having degree no more than n −
For example, you
might have something like
Therefore, collecting the terms in the general case,
for Cj some n × n matrix. Then
Then multiplying out the middle term, it follows that for all
Then, using Corollary A.6.3, one can replace λ on both sides with A. Then the right side is seen to equal 0.
Hence the left side, q
is also equal to 0. ■
The following theorem is of fundamental importance and ties together many of the ideas presented
Theorem A.6.5 Let A be an n × n matrix. Then the following are equivalent.
- det = 0
- A,AT are not one to one.
- A is not onto.
Proof: Suppose det
cannot be one to one because if it were, then it would be onto as
well thanks to Theorem A.4.1
. Hence you would have the existence of A−1
because, there would exist bi
such that Abi
and so B ≡
and so B
and so the same reasoning implies
is not one to one. This verifies that 1.
Now suppose 2. Then since AT is not one to one, it follows there exists x≠0 such that
Taking the transpose of both sides yields
where the 0T is a 1 × n matrix or row vector. Now if Ay = x, then
contrary to x≠0. Consequently there can be no y such that Ay = x and so A is not onto. This shows that
2. implies 3..
Finally, suppose 3. If 1. does not hold, then det
0 but then from Theorem A.5.17 A−1
exists and so
for every y ∈ Fn
there exists a unique x ∈ Fn
such that Ax
In fact x
would be onto
contrary to 3.
. This shows 3.
Corollary A.6.6 Let A be an n × n matrix. Then the following are equivalent.
- A and AT are one to one.
- A is onto.
Proof: This follows immediately from the above theorem.