Kenneth Kuttler

27.4. THE CAYLEY HAMILTON THEOREM∗ 515

 9 3 0−6 0 0−1 −1 2

+λ

 −6 −1 02 −3 01 1 −3

+λ2

 1 0 00 1 00 0 1

Therefore, collecting the terms in the general case,

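\[
C(\lambda) = C_0 + C_1 \lambda + \cdots + C_{n-1} \lambda^{n-1}
\]

for $C_j$ some $n \times n$ matrix. Then

\[
C(\lambda)(\lambda I - A) = \left( C_0 + C_1 \lambda + \cdots + C_{n-1} \lambda^{n-1} \right)(\lambda I - A) = q(\lambda) I
\]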

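Then multiplying out the middle term, it follows that for all $|\lambda|$ sufficiently large,

\[
\begin{aligned}
a_0 I + a_1 I \lambda + \cdots + I \lambda^n
&= C_0 \lambda + C_1 \lambda^2 + \cdots + C_{n-1} \lambda^n - \left[ C_0 A + C_1 A \lambda + \cdots + C_{n-1} A \lambda^{n-1} \right] \\
&= -C_0 A + (C_0 - C_1 A)\lambda + (C_1 - C_2 A)\lambda^2 + \cdots + (C_{n-2} - C_{n-1} A)\lambda^{n-1} + C_{n-1} \lambda^n
\end{aligned}
\]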

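Then, using Corollary 27.4.3, one can replace $\lambda$ on both sides with $A$. The right side becomes

\[
-C_0 A + (C_0 - C_1 A)A + (C_1 - C_2 A)A^2 + \cdots + (C_{n-2} - C_{n-1} A)A^{n-1} + C_{n-1} A^n,
\]

in which each term $C_j A^{j+1}$ appears once with each sign, so the sum telescopes to $0$. Hence the left side, $q(A) I$, is also equal to $0$. ■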
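The coefficient matching in the proof can be checked on the $3 \times 3$ example whose matrices $C_0, C_1, C_2$ appear above. The following is only a sketch under stated assumptions: the matrix `A` and the coefficients `a0, a1, a2` of $q(\lambda)$ are not given in this section. They are recovered here from the relation $C_1 - C_2 A = a_2 I$ together with the trace of $A$, and the helper names are illustrative.

```python
# Pure-Python check of the coefficient identities from the proof, for one example.
# ASSUMPTION: A and (a0, a1, a2) are not stated in this section; they are
# recovered from C1 - C2*A = a2*I and are used here only for illustration.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple c*X."""
    return [[c * x for x in row] for row in X]

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = scale(0, I3)

# Coefficients of C(lambda) = C0 + C1*lambda + C2*lambda^2 from the example.
C0 = [[9, 3, 0], [-6, 0, 0], [-1, -1, 2]]
C1 = [[-6, -1, 0], [2, -3, 0], [1, 1, -3]]
C2 = I3

# Assumed matrix and assumed q(lambda) = a0 + a1*lambda + a2*lambda^2 + lambda^3.
A = [[0, -1, 0], [2, 3, 0], [1, 1, 3]]
a0, a1, a2 = -6, 11, -6

# Match coefficients of powers of lambda in C(lambda)(lambda*I - A) = q(lambda)*I.
coefficient_checks = [
    scale(-1, matmul(C0, A)) == scale(a0, I3),            # lambda^0
    add(C0, scale(-1, matmul(C1, A))) == scale(a1, I3),   # lambda^1
    add(C1, scale(-1, matmul(C2, A))) == scale(a2, I3),   # lambda^2
    C2 == I3,                                             # lambda^3
]

# Cayley Hamilton: q(A) = a0*I + a1*A + a2*A^2 + A^3 should be the zero matrix.
A2 = matmul(A, A)
A3 = matmul(A2, A)
q_of_A = add(add(scale(a0, I3), scale(a1, A)), add(scale(a2, A2), A3))

# Each C_j commutes with A in this example.
commutes = all(matmul(C, A) == matmul(A, C) for C in (C0, C1, C2))

print(all(coefficient_checks), q_of_A == ZERO, commutes)
```

Integer arithmetic is used throughout, so the comparisons are exact rather than subject to floating-point tolerance.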

It is good to keep in mind the following example when considering the above proof of the Cayley Hamilton theorem. If $p(\lambda) = q(\lambda)$ for all $\lambda$, or for all $\lambda$ large enough, where $p(\lambda), q(\lambda)$ are polynomials having matrix coefficients, then it is not necessarily the case that $p(A) = q(A)$ for $A$ a matrix of an appropriate size. Let

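\[
E_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
E_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
\]

Then a short computation shows that for all complex $\lambda$,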

\[
(\lambda I + E_1)(\lambda I + E_2) = \left( \lambda^2 + \lambda \right) I = (\lambda I + E_2)(\lambda I + E_1)
\]
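The short computation can be spelled out as follows. It uses only the facts $E_1 + E_2 = I$ and $E_1 E_2 = E_2 E_1 = 0$, which can be read off from the definitions of $E_1$ and $E_2$:

```latex
\begin{aligned}
(\lambda I + E_1)(\lambda I + E_2)
  &= \lambda^2 I + \lambda (E_1 + E_2) + E_1 E_2
   = \lambda^2 I + \lambda I + 0
   = (\lambda^2 + \lambda) I, \\
(\lambda I + E_2)(\lambda I + E_1)
  &= \lambda^2 I + \lambda (E_2 + E_1) + E_2 E_1
   = \lambda^2 I + \lambda I + 0
   = (\lambda^2 + \lambda) I.
\end{aligned}
```

The expansion moves $\lambda$ freely past $E_1$ and $E_2$, which is legitimate only because $\lambda$ is a scalar.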

However,

\[
(N I + E_1)(N I + E_2) \neq (N I + E_2)(N I + E_1)
\]

The reason this can take place is that $N$ fails to commute with $E_i$. Of course a scalar commutes with any matrix, so there was no difficulty in obtaining that the matrix equation held for arbitrary $\lambda$, but this factored equation does not continue to hold if $\lambda$ is replaced by a matrix. In the above proof of the Cayley Hamilton theorem, this issue was avoided by considering only polynomials which are of the form $C_0 + C_1 \lambda + \cdots$, in which the polynomial identity held because the corresponding matrix coefficients were equal. However, you can also argue that in the above proof, the $C_i$ each commute with $A$.
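The example can be checked directly. The sketch below uses plain Python lists for the $2 \times 2$ matrices. The helper names (`matmul`, `add`, `scale`, `factored`) are illustrative, and the scalar identity is only tested on a small range of integer values of $\lambda$, not proved.

```python
# Check the E1, E2, N example with exact integer arithmetic.

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple c*X."""
    return [[c * x for x in row] for row in X]

I2 = [[1, 0], [0, 1]]
E1 = [[1, 0], [0, 0]]
E2 = [[0, 0], [0, 1]]
N = [[0, 1], [0, 0]]

def factored(L, first, second):
    """Evaluate the factored form (L + first)(L + second) for a 2x2 matrix L."""
    return matmul(add(L, first), add(L, second))

# With L = lambda*I, both orders of the factors give (lambda^2 + lambda) I.
scalar_ok = all(
    factored(scale(lam, I2), E1, E2)
    == scale(lam * lam + lam, I2)
    == factored(scale(lam, I2), E2, E1)
    for lam in range(-5, 6)
)

# With L = N, the two orders of the factors disagree.
left_N = factored(N, E1, E2)    # (N + E1)(N + E2)
right_N = factored(N, E2, E1)   # (N + E2)(N + E1)

# N does not commute with either E1 or E2.
commutator_fails = (matmul(N, E1) != matmul(E1, N)
                    and matmul(N, E2) != matmul(E2, N))

print(scalar_ok, left_N, right_N, commutator_fails)
```

Here `left_N` works out to $2N$ while `right_N` is the zero matrix, so the two factored forms really do differ once $\lambda$ is replaced by $N$.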

Theorem 27.4.5 Let $q(\lambda)$ be the characteristic polynomial and $p(\lambda)$ the minimal polynomial. Then there is a polynomial $l(\lambda)$, which could be a constant, such that $q(\lambda) = l(\lambda) p(\lambda)$.

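Proof: By the division algorithm, $q(\lambda) = p(\lambda) l(\lambda) + r(\lambda)$ where the degree of $r(\lambda)$ is less than the degree of $p(\lambda)$ or else $r(\lambda) = 0$. Substituting in $A$, and using $q(A) = 0$ from the Cayley Hamilton theorem together with $p(A) = 0$, you get $r(A) = 0$. This is impossible if $r(\lambda)$ is nonzero with degree less than that of $p(\lambda)$, because the minimal polynomial has the smallest degree among the nonzero polynomials which send $A$ to $0$. It follows that $r(\lambda) = 0$ and so the claim is established: $p(\lambda)$ "divides" $q(\lambda)$. ■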
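The division step in this proof can be illustrated on a small case. The sketch below assumes the hypothetical matrix $A = \mathrm{diag}(2, 2, 3)$, which does not come from the text. Its characteristic polynomial is $q(\lambda) = (\lambda - 2)^2(\lambda - 3)$ and its minimal polynomial is $p(\lambda) = (\lambda - 2)(\lambda - 3)$. Polynomials are stored as coefficient lists with the lowest degree first, and `poly_divmod` and `poly_at_matrix` are illustrative helpers, not library functions.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Long division of polynomials (coefficient lists, lowest degree first).

    Returns (quotient, remainder); the remainder is [] when it is zero.
    Assumes the last entry of den is nonzero.
    """
    den = [Fraction(c) for c in den]
    rem = [Fraction(c) for c in num]
    quot = [Fraction(0)] * max(len(rem) - len(den) + 1, 1)
    while len(rem) >= len(den) and any(rem):
        shift = len(rem) - len(den)
        factor = rem[-1] / den[-1]
        quot[shift] = factor
        # Subtract factor * lambda^shift * den, cancelling the leading term.
        for i, c in enumerate(den):
            rem[shift + i] -= factor * c
        while rem and rem[-1] == 0:
            rem.pop()
    return quot, rem

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate a scalar-coefficient polynomial at the matrix A (Horner's rule)."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

# ASSUMPTION: illustrative matrix, not taken from the text.
A = [[2, 0, 0], [0, 2, 0], [0, 0, 3]]
q = [-12, 16, -7, 1]   # lambda^3 - 7 lambda^2 + 16 lambda - 12
p = [6, -5, 1]         # lambda^2 - 5 lambda + 6

l, r = poly_divmod(q, p)
zero = [[0] * 3 for _ in range(3)]
print(l, r, poly_at_matrix(p, A) == zero, poly_at_matrix(q, A) == zero)
```

In this case the quotient is $l(\lambda) = \lambda - 2$ and the remainder is zero, as the theorem predicts.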
