7.3 Nilpotent Transformations and Jordan Canonical Form
Definition 7.3.1 Let $V$ be a vector space over the field of scalars $F$. Then $N \in \mathcal{L}(V,V)$ is called nilpotent if for some $m$, it follows that $N^m = 0$.
In general, when you have $A \in \mathcal{L}(V,V)$ and the minimum polynomial is $\prod_{i=1}^{p} \phi_i(\lambda)^{k_i}$, you can decompose $V$ into a direct sum as
$$V = \bigoplus_{i=1}^{p} \ker\left(\phi_i(A)^{k_i}\right).$$
Then on $\ker\left(\phi_i(A)^{k_i}\right)$, it follows by definition that $\phi_i(A)$ is nilpotent.
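As a small illustration of the last statement, here is a sketch in Python. The matrix `A` is a hypothetical example (not from the text) whose minimum polynomial is $(\lambda - 1)^2(\lambda - 3)$, and `matvec` is a helper introduced only for this sketch. It suggests that $B = A - I$ is nilpotent on $\ker\left((A - I)^2\right)$ even though it is not nilpotent on all of $V$.

```python
# Hypothetical example: A has minimum polynomial (lambda - 1)^2 (lambda - 3).
A = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 3]]

def matvec(M, v):
    """Multiply the square matrix M by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# B = phi_1(A) = A - I, where phi_1(lambda) = lambda - 1 and k_1 = 2.
B = [[A[i][j] - (1 if i == j else 0) for j in range(3)] for i in range(3)]

e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

# e1 and e2 span ker(B^2); B^2 sends both to zero, so B is nilpotent there.
b2_e1 = matvec(B, matvec(B, e1))
b2_e2 = matvec(B, matvec(B, e2))

# e3 lies in ker(A - 3I); B only scales it by 2, so B is not nilpotent on V.
b2_e3 = matvec(B, matvec(B, e3))

print(b2_e1, b2_e2, b2_e3)  # [0, 0, 0] [0, 0, 0] [0, 0, 4]
```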
The following lemma contains some significant observations about nilpotent transformations.
Lemma 7.3.2 Suppose $N$ is nilpotent. Then $N^k x \neq 0$ if and only if $\{x, Nx, \cdots, N^k x\}$ is linearly independent. Also, the minimum polynomial of $N$ is $\lambda^m$ where $m$ is the first such that $N^m = 0$.
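Proof: Suppose $N^k x \neq 0$ and $\sum_{i=0}^{k} c_i N^i x = 0$ where not all $c_i = 0$. There exists $l$ such that $k \leq l < m$ and $N^{l+1}x = 0$ but $N^l x \neq 0$. Then multiply both sides by $N^l$ to conclude that $c_0 = 0$. Next multiply both sides by $N^{l-1}$ to conclude that $c_1 = 0$ and continue this way to obtain that all the $c_i = 0$, a contradiction. Conversely, a linearly independent set cannot contain the zero vector, so if the set is linearly independent, then $N^k x \neq 0$.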
Next consider the claim that $\lambda^m$ is the minimum polynomial. If $p(\lambda)$ is the minimum polynomial, then by the division algorithm,
$$\lambda^m = p(\lambda) l(\lambda) + r(\lambda)$$
where the degree of $r(\lambda)$ is less than that of $p(\lambda)$ or else $r(\lambda) = 0$. The above implies $0 = 0 + r(N)$, contrary to $p(\lambda)$ being minimum. Hence $r(\lambda) = 0$ and so $p(\lambda)$ divides $\lambda^m$. Hence $p(\lambda) = \lambda^k$ for $k \leq m$. But if $k < m$, this would contradict the definition of $m$ as being the smallest such that $N^m = 0$. ■
Note how this lemma implies that if $N^k x$ is a linear combination of the preceding vectors $x, Nx, \cdots, N^{k-1}x$ for $k$ as small as possible, then $N^k x = 0$.
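The lemma can be checked on a concrete case. The following is a minimal sketch in Python, assuming a hypothetical strictly upper triangular matrix `N` and vector `x` chosen only for illustration. The helpers `rank` and `matvec` are not from the text. It builds the chain $x, Nx, N^2x, \cdots$ until it reaches zero and tests linear independence by computing a rank over the rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def matvec(M, v):
    """Multiply the square matrix M by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Hypothetical nilpotent N: strictly upper triangular, so N^4 = 0.
N = [[0, 1, 2, 0],
     [0, 0, 1, 3],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
x = [0, 0, 0, 1]

# Build the chain x, Nx, N^2 x, ... and stop just before it reaches zero.
chain = [x]
while any(matvec(N, chain[-1])):
    chain.append(matvec(N, chain[-1]))

k = len(chain) - 1                      # largest k with N^k x != 0
independent = rank(chain) == len(chain)
print(k, independent)  # 3 True
```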
Now suppose $V = \ker\left(\phi(A)^k\right)$ where $\phi(\lambda)$ is irreducible and the minimum polynomial for $A$ on $V$ is $\phi(\lambda)^k$ as in the above. Letting $B = \phi(A)$, we can consider a cyclic basis for $V$ of the form $\{\beta_{x_1}, \cdots, \beta_{x_s}\}$ where $\beta_x \equiv \{x, Bx, \cdots, B^{m-1}x\}$. From Lemma 7.3.2, $B^m x = 0$. So what is the matrix of $B$ with respect to this basis? It is block diagonal, the blocks coming from the individual $\beta_{x_i}$, the size of the blocks being $|\beta_{x_i}| \times |\beta_{x_i}|$, $i = 1, \cdots, s$.
$$\begin{pmatrix} C_1 & & 0 \\ & \ddots & \\ 0 & & C_s \end{pmatrix}$$
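Using the useful gimmick and ordering the basis using decreasing powers, one of these blocks is of the form $C_k$ where $C_k$ is the matrix on the right:
$$\begin{pmatrix} 0 & B^{m-1}x & \cdots & Bx \end{pmatrix} = \begin{pmatrix} B^{m-1}x & B^{m-2}x & \cdots & x \end{pmatrix} \begin{pmatrix} 0 & 1 & & 0 \\ 0 & 0 & \ddots & \\ \vdots & \vdots & \ddots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix}$$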
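That is, $C_k$ is the $|\beta_x| \times |\beta_x|$ matrix which has ones down the super diagonal and zeros elsewhere. Note that the size of the blocks is determined by $|\beta_x|$ just as in the above rational canonical form. In fact, if you ordered the basis in the opposite way, one of these blocks would just be the companion matrix for $B$ just as in the rational canonical form. This is because of Lemma 7.3.2 which requires $B^m x = 0$ and so the linear combination of $B^m x$ in terms of $B^k x$ for $k < m$ has all zero coefficients. You would have
$$\begin{pmatrix} 0 & & & 0 \\ 1 & 0 & & \\ & \ddots & \ddots & \\ 0 & & 1 & 0 \end{pmatrix},$$
with ones down the sub diagonal and zeros elsewhere.
Definition 7.3.3 $J_k(\alpha)$ is a Jordan block if it is a $k \times k$ matrix of the form
$$J_k(\alpha) = \begin{pmatrix} \alpha & 1 & & 0 \\ 0 & \ddots & \ddots & \\ \vdots & \ddots & \ddots & 1 \\ 0 & \cdots & 0 & \alpha \end{pmatrix}.$$
In words, there is an unbroken string of ones down the super diagonal and the number $\alpha$ filling every space on the main diagonal with zeros everywhere else.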
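To see the super diagonal pattern come out of a concrete computation, here is a sketch in Python. The nilpotent matrix `B` and the vector `x` are hypothetical choices, and the helper names are not from the text. With the cyclic basis ordered by decreasing powers as the columns of `S`, the identity $BS = SC$ says that $C$ is the matrix of $B$ with respect to that basis.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(M, v):
    """Multiply the square matrix M by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Hypothetical nilpotent B and a vector x with B^2 x != 0 and B^3 x = 0.
B = [[0, 2, 1],
     [0, 0, 3],
     [0, 0, 0]]
x = [0, 0, 1]
Bx = matvec(B, x)
B2x = matvec(B, Bx)

# Columns of S are the cyclic basis in decreasing powers: B^2 x, B x, x.
basis = [B2x, Bx, x]
S = [[basis[j][i] for j in range(3)] for i in range(3)]

# C has ones on the super diagonal and zeros elsewhere.
C = [[1 if j == i + 1 else 0 for j in range(3)] for i in range(3)]

# B S = S C expresses that C is the matrix of B in this basis.
print(matmul(B, S) == matmul(S, C))  # True
```

Reversing the order of `basis` and using ones on the sub diagonal instead gives the same identity for the companion matrix form.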
Then with this definition and the above discussion, the following proposition has been proved.
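Proposition 7.3.4 Let $A \in \mathcal{L}(V,V)$ and let the minimal polynomial of $A$ be
$$p(\lambda) \equiv \prod_{i=1}^{p} \phi_i(\lambda)^{k_i}, \quad V = \bigoplus_{i=1}^{p} \ker\left(\phi_i(A)^{k_i}\right)$$
where the $\phi_i(\lambda)$ are irreducible. Also let $B_i \equiv \phi_i(A)$. Then $B_i$ is nilpotent and
$$B_i^{k_i} = 0 \text{ on } V_i = \ker\left(\phi_i(A)^{k_i}\right).$$
Letting the dimension of $V_i$ be $d_i$, there exists a basis for $V_i$ such that the matrix of $B_i$ with respect to this basis is of the form
$$J = \begin{pmatrix} J_{r_1}(0) & & 0 \\ & \ddots & \\ 0 & & J_{r_s}(0) \end{pmatrix}$$
where $r_1 \geq r_2 \geq \cdots \geq r_s \geq 1$ and $\sum_{j=1}^{s} r_j = d_i$. Here each $J_{r_j}(0)$ is called a Jordan block of size $r_j \times r_j$ with 0 down the main diagonal.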
Observation 7.3.5 Observe that $J_r(0)^r = 0$ but $J_r(0)^{r-1} \neq 0$.
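This observation is easy to confirm by direct multiplication. The sketch below, in Python, uses helper names (`jordan_block`, `matmul`, `matpow`, `is_zero`) that are introduced here for illustration only, and checks the case $r = 4$.

```python
def jordan_block(r, alpha=0):
    """The r x r matrix with alpha on the main diagonal and ones on the super diagonal."""
    return [[alpha if i == j else 1 if j == i + 1 else 0 for j in range(r)]
            for i in range(r)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(A, k):
    """A raised to the nonnegative integer power k."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

def is_zero(M):
    return all(v == 0 for row in M for v in row)

r = 4
J = jordan_block(r)

# J^r vanishes, while J^(r-1) still has a single 1 in the upper right corner.
print(is_zero(matpow(J, r)), is_zero(matpow(J, r - 1)))  # True False
```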
In fact, the matrix of the above proposition is unique. This is a general fact for a nilpotent matrix
N.
Corollary 7.3.6 Let $J, J'$ both be matrices of the nilpotent linear transformation $N \in \mathcal{L}(W,W)$ which are of the form described in Proposition 7.3.4. Then $J = J'$. In fact, if the rank of $J^k$ equals the rank of $J'^k$ for all nonnegative integers $k$, then $J = J'$.
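Proof: Since $J$ and $J'$ are similar, it follows that for each $k$ an integer, $J^k$ and $J'^k$ are similar. Hence, for each $k$, these matrices have the same rank. Now suppose $J \neq J'$. Note first that
$$J_r(0)^r = 0, \quad J_r(0)^{r-1} \neq 0.$$
Denote the blocks of $J$ as $J_{r_k}(0)$ and the blocks of $J'$ as $J_{r_k'}(0)$. Let $k$ be the first such that $J_{r_k}(0) \neq J_{r_k'}(0)$. Suppose that $r_k > r_k'$. By block multiplication and the above observation, it follows that the two matrices $J^{r_k - 1}$ and $J'^{r_k - 1}$ are respectively of the forms
$$\begin{pmatrix} M_{r_1} & & & & \\ & \ddots & & & \\ & & M_{r_k} & & \\ & & & * & \\ & & & & \ddots \end{pmatrix}, \quad \begin{pmatrix} M_{r_1'} & & & & \\ & \ddots & & & \\ & & M_{r_k'} & & \\ & & & 0 & \\ & & & & \ddots \end{pmatrix}$$
where $M_{r_j} = M_{r_j'}$ for $j \leq k - 1$ but $M_{r_k'}$ is a zero $r_k' \times r_k'$ matrix while $M_{r_k}$ is a larger matrix which is not equal to 0. Here $*$ marks blocks which may or may not vanish, while the remaining blocks of $J'^{r_k - 1}$ are all zero because those blocks of $J'$ have size at most $r_k'$. For example, $M_{r_k}$ could look like
$$M_{r_k} = \begin{pmatrix} 0 & \cdots & 1 \\ & \ddots & \vdots \\ 0 & & 0 \end{pmatrix}.$$
Thus $J^{r_k - 1}$ has larger rank than $J'^{r_k - 1}$, which contradicts the requirement that $J^k$ and $J'^k$ have the same rank. ■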
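The reason ranks of powers pin down the blocks can be made concrete. Each block $J_r(0)$ loses exactly one from its rank with each extra power until it reaches zero, so the number of blocks of size exactly $k$ is $\operatorname{rank}(J^{k-1}) - 2\operatorname{rank}(J^k) + \operatorname{rank}(J^{k+1})$. The following Python sketch recovers the block sizes this way. The function names and the sample sizes are assumptions made for this illustration, not part of the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def nilpotent_block_diag(sizes):
    """Block diagonal matrix with blocks J_r(0) for r in sizes."""
    n = sum(sizes)
    J = [[0] * n for _ in range(n)]
    start = 0
    for r in sizes:
        for i in range(start, start + r - 1):
            J[i][i + 1] = 1
        start += r
    return J

def block_sizes_from_ranks(J):
    """Recover the Jordan block sizes of a nilpotent J from ranks of its powers."""
    n = len(J)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    ranks = []
    for _ in range(n + 2):              # ranks of J^0, ..., J^(n+1)
        ranks.append(rank(power))
        power = matmul(power, J)
    sizes = []
    for k in range(n, 0, -1):           # largest blocks first
        count = ranks[k - 1] - 2 * ranks[k] + ranks[k + 1]
        sizes.extend([k] * count)
    return sizes

J = nilpotent_block_diag([3, 2, 2, 1])
recovered = block_sizes_from_ranks(J)
print(recovered)  # [3, 2, 2, 1]
```

Since similar matrices have the same ranks of powers, any two matrices of the form in Proposition 7.3.4 for the same $N$ produce the same list.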
The Jordan canonical form is available when the minimum polynomial can be factored in the field of scalars. Thus
$$p(\lambda) = \prod_{k=1}^{r} (\lambda - \mu_k)^{m_k}$$
and
$$V = \bigoplus_{i=1}^{r} \ker (A - \mu_i I)^{m_i} \equiv \bigoplus_{i=1}^{r} V_i.$$
Now here is a useful observation.
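Observation 7.3.7 If $W$ is a vector space and $L \in \mathcal{L}(W,W)$ is given by $Lw = \mu w$, then for any basis for $W$, the matrix of $L$ with respect to this basis is
$$\begin{pmatrix} \mu & & 0 \\ & \ddots & \\ 0 & & \mu \end{pmatrix}.$$
On $V_i$, the transformation $A - \mu_i I$ is nilpotent, so by Proposition 7.3.4 there is a basis for $V_i$ with respect to which its matrix is block diagonal with Jordan blocks $J_{r_1}(0), \cdots, J_{r_s}(0)$ down the diagonal, where $d_i$ is the dimension of $V_i$, $r_1 + \cdots + r_s = d_i$, and we can assume the size of the Jordan blocks decreases from upper left to lower right. It follows then from the above observation that the matrix of $A = (A - \mu_i I) + \mu_i I$ on $V_i$ with respect to this basis is
$$J(\mu_i) = \begin{pmatrix} J_{r_1}(\mu_i) & & 0 \\ & \ddots & \\ 0 & & J_{r_s}(\mu_i) \end{pmatrix}.$$
The Jordan canonical form of $A$ is the block diagonal matrix having the blocks $J(\mu_1), \cdots, J(\mu_r)$ down the diagonal. Each $J(\mu_i)$ has a size which is uniquely determined by the dimension of $\ker (A - \mu_i I)^{m_i}$, which comes from the minimum polynomial. Furthermore, each $J(\mu_i)$ is a block diagonal matrix in which the blocks have a certain specified form.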
Note that if any of the $\beta_k$ consists of eigenvectors, then the corresponding Jordan block will consist of a diagonal matrix having $\mu_k$ down the main diagonal. This corresponds to $m_k = 1$. The vectors which are in $\ker (A - \mu_k I)^{m_k}$ which are not in $\ker (A - \mu_k I)$ are called generalized eigenvectors.
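A very small case may help fix the idea of a generalized eigenvector. In the Python sketch below, the matrix `A`, the eigenvalue `mu`, and the vector `v` are hypothetical choices. The vector is sent to zero by $(A - \mu I)^2$ but not by $A - \mu I$.

```python
def matvec(M, v):
    """Multiply the square matrix M by the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Hypothetical example: a single 2 x 2 Jordan block with eigenvalue 2.
A = [[2, 1],
     [0, 2]]
mu = 2

# The matrix A - mu I.
shifted = [[A[i][j] - (mu if i == j else 0) for j in range(2)] for i in range(2)]

v = [0, 1]
once = matvec(shifted, v)       # (A - mu I) v, which is not zero
twice = matvec(shifted, once)   # (A - mu I)^2 v, which is zero

print(once, twice)  # [1, 0] [0, 0]
```

Here `once` is itself an ordinary eigenvector of `A`, so `v` and `once` form a chain of the kind used for the cyclic basis.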
The following is the main result on the Jordan canonical form.
Theorem 7.3.9 Let $V$ be an $n$ dimensional vector space with field of scalars $\mathbb{C}$ or some other field such that the minimum polynomial of $A \in \mathcal{L}(V,V)$ completely factors into powers of linear factors. Then there exists a unique Jordan canonical form for $A$, where uniqueness is in the sense that any two have the same number and size of Jordan blocks.
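Proof: Suppose there are two, $J$ and $J'$. Then these are matrices of $A$ with respect to possibly different bases and so they are similar. Therefore, they have the same minimum polynomials and the generalized eigenspaces $\ker (A - \mu_k I)^{m_k}$ have the same dimension. Thus the size of the matrices $J(\mu_k)$ and $J'(\mu_k)$ defined by the dimension of these generalized eigenspaces, also corresponding to the algebraic multiplicity of $\mu_k$, must be the same. Therefore, they comprise the same set of positive integers. Thus listing the eigenvalues in the same order, corresponding blocks $J(\mu_k), J'(\mu_k)$ are the same size.
It remains to show that $J(\mu_k)$ and $J'(\mu_k)$ are not just the same size but also are the same up to order of the Jordan blocks running down their respective diagonals. It is only necessary to worry about the number and size of the Jordan blocks making up $J(\mu_k)$ and $J'(\mu_k)$. Since $J, J'$ are similar, so are $J - \mu_k I$ and $J' - \mu_k I$. Thus the following two matrices are similar:
$$A \equiv \begin{pmatrix} J(\mu_1) - \mu_k I & & & & 0 \\ & \ddots & & & \\ & & J(\mu_k) - \mu_k I & & \\ & & & \ddots & \\ 0 & & & & J(\mu_r) - \mu_k I \end{pmatrix}, \quad A' \equiv \begin{pmatrix} J'(\mu_1) - \mu_k I & & & & 0 \\ & \ddots & & & \\ & & J'(\mu_k) - \mu_k I & & \\ & & & \ddots & \\ 0 & & & & J'(\mu_r) - \mu_k I \end{pmatrix}.$$
Consequently, their powers have the same rank for every nonnegative integer exponent. For $j \neq k$, the blocks $J(\mu_j) - \mu_k I$ and $J'(\mu_j) - \mu_k I$ are upper triangular with the nonzero number $\mu_j - \mu_k$ down the main diagonal, so they are invertible, and corresponding blocks have the same size. It follows that the powers of $J(\mu_k) - \mu_k I$ and $J'(\mu_k) - \mu_k I$ have the same rank for every exponent. Each of these two matrices is block diagonal with Jordan blocks of the form $J_{r_j}(0)$ down its diagonal, and it is required to verify that they have the same number of Jordan blocks and that the same blocks occur in both. Without loss of generality, let the blocks be arranged according to size with the largest in the upper left corner falling to the smallest in the lower right. Now the desired conclusion follows from Corollary 7.3.6. ■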
Note that if any of the generalized eigenspaces $\ker (A - \mu_k I)^{m_k}$ has a basis of eigenvectors, then it would be possible to use this basis and obtain a diagonal matrix in the block corresponding to $\mu_k$. By uniqueness, this is the block corresponding to the eigenvalue $\mu_k$. Thus when this happens, the block in the Jordan canonical form corresponding to $\mu_k$ is just the diagonal matrix having $\mu_k$ down the diagonal and there are no generalized eigenvectors as fussed over in ordinary differential equations. Recall that these were vectors in $\ker (A - \mu_k I)^{m_k}$ but not in $\ker (A - \mu_k I)$.
The Jordan canonical form is very significant when you try to understand powers of a matrix. There exists an $n \times n$ matrix $S$ such that
$$A = S^{-1} J S.$$
Therefore, $A^2 = S^{-1}JSS^{-1}JS = S^{-1}J^2S$ and continuing this way, it follows
$$A^k = S^{-1} J^k S,$$
where $J$ is the Jordan canonical form given above. Consider $J^k$. By block multiplication,
$$J^k = \begin{pmatrix} J_1^k & & 0 \\ & \ddots & \\ 0 & & J_r^k \end{pmatrix}.$$
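The identity $A^k = S^{-1}J^kS$ can be checked on a small case. The Python sketch below uses a hypothetical $2 \times 2$ Jordan block `J` and a hypothetical invertible `S` whose inverse is written out by hand. The helper names are introduced only for this illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(A, k):
    """A raised to the nonnegative integer power k."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result

# Hypothetical Jordan form and change of basis; S_inv is the inverse of S.
J = [[2, 1],
     [0, 2]]
S = [[1, 0],
     [1, 1]]
S_inv = [[1, 0],
         [-1, 1]]

A = matmul(matmul(S_inv, J), S)         # A = S^{-1} J S

k = 5
lhs = matpow(A, k)                              # A^k computed directly
rhs = matmul(matmul(S_inv, matpow(J, k)), S)    # S^{-1} J^k S

print(A, lhs == rhs)  # [[3, 1], [-1, 1]] True
```

The point of the identity is that powers of `J` are much easier to describe than powers of `A`, since `J` is block diagonal.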
The matrix $J_s$ is an $m_s \times m_s$ matrix which is of the form
$$J_s = D + N \tag{7.6}$$
for $D$ a multiple of the identity and $N$ an upper triangular matrix with zeros down the main diagonal. Thus $N^{m_s} = 0$. Now since $D$ is just a multiple of the identity, it follows that $DN = ND$. Therefore, the usual binomial theorem may be applied and this yields the following equations for $k \geq m_s$.
$$J_s^k = (D + N)^k = \sum_{j=0}^{k} \binom{k}{j} D^{k-j} N^j = \sum_{j=0}^{m_s} \binom{k}{j} D^{k-j} N^j, \tag{7.7}$$
the third equation holding because $N^{m_s} = 0$. Thus $J_s^k$ is of the form