Kenneth Kuttler

7.2. THE CAYLEY HAMILTON THEOREM∗ 141

Therefore, collecting the terms in the general case,

\[
C(\lambda) = C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}
\]

for $C_j$ some $n \times n$ matrix. Then

\[
C(\lambda)(\lambda I - A) = \left(C_0 + C_1\lambda + \cdots + C_{n-1}\lambda^{n-1}\right)(\lambda I - A) = q(\lambda) I
\]

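Then multiplying out the middle term, it follows that for all $|\lambda|$ sufficiently large,

\[
\begin{aligned}
a_0 I + a_1 I\lambda + \cdots + I\lambda^n
&= C_0\lambda + C_1\lambda^2 + \cdots + C_{n-1}\lambda^n - \left[C_0 A + C_1 A\lambda + \cdots + C_{n-1}A\lambda^{n-1}\right] \\
&= -C_0 A + (C_0 - C_1 A)\lambda + (C_1 - C_2 A)\lambda^2 + \cdots + (C_{n-2} - C_{n-1}A)\lambda^{n-1} + C_{n-1}\lambda^n
\end{aligned}
\]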

Then, using Corollary 7.2.3, one can replace $\lambda$ on both sides with $A$. Then the right side is seen to equal 0. Hence the left side, $q(A)I$, is also equal to 0. ■
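To see why the right side vanishes, it may help to write the substitution out. The display below is a sketch which assumes that the powers of $A$ are placed to the right of the matrix coefficients, as in the convention of Corollary 7.2.3 (whose statement is not reproduced here). Each product $C_k A^{k+1}$ then appears exactly once with a plus sign and once with a minus sign:

\[
\begin{aligned}
&-C_0 A + (C_0 - C_1 A)A + (C_1 - C_2 A)A^2 + \cdots + (C_{n-2} - C_{n-1}A)A^{n-1} + C_{n-1}A^n \\
&\quad = (-C_0 A + C_0 A) + (-C_1 A^2 + C_1 A^2) + \cdots + (-C_{n-1}A^n + C_{n-1}A^n) = 0
\end{aligned}
\]

Under the same substitution the left side becomes $a_0 I + a_1 A + \cdots + A^n$.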

It is good to keep in mind the following example when considering the above proof of the Cayley Hamilton theorem. It was shown to me by Marc van Leeuwen. If $p(\lambda) = q(\lambda)$ for all $\lambda$ or for all $\lambda$ large enough where $p(\lambda), q(\lambda)$ are polynomials having matrix coefficients, then it is not necessarily the case that $p(A) = q(A)$ for $A$ a matrix of an appropriate size. Let

\[
E_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
E_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
\]

Then a short computation shows that for all complex $\lambda$,

\[
(\lambda I + E_1)(\lambda I + E_2) = \left(\lambda^2 + \lambda\right) I = (\lambda I + E_2)(\lambda I + E_1)
\]

However,

\[
(NI + E_1)(NI + E_2) \neq (NI + E_2)(NI + E_1)
\]
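The inequality can be checked directly. The following is a minimal sketch in plain Python with $2 \times 2$ matrices stored as nested lists; the helper names `matmul`, `matadd`, and `scalar_identity_holds` are illustrative and not part of the text.

```python
# 2x2 matrices as nested lists; helper names are illustrative only.
def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

E1 = [[1, 0], [0, 0]]
E2 = [[0, 0], [0, 1]]
N = [[0, 1], [0, 0]]

def scalar_identity_holds(lam):
    """Check (lam I + E1)(lam I + E2) = (lam^2 + lam) I = (lam I + E2)(lam I + E1)."""
    lam_I = [[lam, 0], [0, lam]]
    target = [[lam * lam + lam, 0], [0, lam * lam + lam]]
    first = matmul(matadd(lam_I, E1), matadd(lam_I, E2))
    second = matmul(matadd(lam_I, E2), matadd(lam_I, E1))
    return first == target and second == target

# Replace the scalar lam by the matrix N (so "N I" is just N).
left = matmul(matadd(N, E1), matadd(N, E2))    # (N + E1)(N + E2)
right = matmul(matadd(N, E2), matadd(N, E1))   # (N + E2)(N + E1)
middle = matadd(matmul(N, N), N)               # N^2 + N

print(all(scalar_identity_holds(lam) for lam in range(-5, 6)))
print(left, right, middle)
```

Here `left` comes out as `[[0, 2], [0, 0]]`, `right` as the zero matrix, and `middle` as `N` itself, so after substituting $N$ the three expressions are pairwise different even though they agree for every scalar $\lambda$.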

The reason this can take place is that $N$ fails to commute with $E_i$. Of course a scalar commutes with any matrix so there was no difficulty in obtaining that the matrix equation held for arbitrary $\lambda$, but this factored equation does not continue to hold if $\lambda$ is replaced by a matrix. In the above proof of the Cayley Hamilton theorem, this issue was avoided by considering only polynomials which are of the form $C_0 + C_1\lambda + \cdots$ in which the polynomial identity held because the corresponding matrix coefficients were equal. However, you can also argue that in the above proof, the $C_i$ each commute with $A$. Nevertheless, an earlier proof of the Cayley Hamilton theorem using this approach was misleading because this issue was not made clear.
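The remark that the $C_i$ commute with $A$ can be illustrated on a small case. The sketch below assumes that $C(\lambda)$ is the transpose of the cofactor matrix (the adjugate) of $\lambda I - A$, which is not restated in this passage. For a $2 \times 2$ matrix that assumption gives $C_0 = A - \operatorname{tr}(A) I$ and $C_1 = I$. The sample matrix `A` and all helper names are hypothetical choices made for illustration.

```python
# 2x2 matrices as nested lists; A and the helper names are illustrative only.
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

I2 = [[1, 0], [0, 1]]
A = [[2, 1], [3, 4]]                      # arbitrary sample matrix
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# Assumed coefficients of C(lam) = adj(lam I - A) = C0 + C1 * lam in the 2x2 case.
C0 = matadd(A, scale(-trace, I2))
C1 = I2

# Each coefficient commutes with A.
commutes = (matmul(C0, A) == matmul(A, C0)) and (matmul(C1, A) == matmul(A, C1))

def identity_holds(lam):
    """Check C(lam)(lam I - A) = q(lam) I with q(lam) = lam^2 - tr(A) lam + det(A)."""
    C = matadd(C0, scale(lam, C1))
    M = matadd(scale(lam, I2), scale(-1, A))
    q = lam * lam - trace * lam + det
    return matmul(C, M) == scale(q, I2)

# q(A) = A^2 - tr(A) A + det(A) I, which the theorem says is the zero matrix.
q_of_A = matadd(matadd(matmul(A, A), scale(-trace, A)), scale(det, I2))

print(commutes, all(identity_holds(lam) for lam in range(-3, 4)), q_of_A)
```

This checks one matrix only, so it illustrates the remark rather than proving it.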
