7.3 Nilpotent Transformations and Jordan Canonical Form
Definition 7.3.1 Let V be a vector space over the field of scalars F. Then N ∈ ℒ(V,V) is called nilpotent if for some m, it follows that N^{m} = 0.
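As a small illustration of the definition, the following sketch finds the first m with N^{m} = 0 for a square matrix stored as a list of rows. The helper names matmul and nilpotency_index are hypothetical, chosen only for this example, and the sketch assumes exact (integer) entries.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotency_index(n_mat):
    """Return the smallest m >= 1 with N^m = 0, or None if N is not nilpotent.

    For an n x n matrix it suffices to test powers up to n, because a
    nilpotent n x n matrix always satisfies N^n = 0.
    """
    size = len(n_mat)
    power = n_mat
    for m in range(1, size + 1):
        if all(entry == 0 for row in power for entry in row):
            return m
        power = matmul(power, n_mat)
    return None


# Strictly upper triangular, hence nilpotent.
N = [[0, 1, 2],
     [0, 0, 3],
     [0, 0, 0]]
# The identity is not nilpotent.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(nilpotency_index(N))   # 3
print(nilpotency_index(I3))  # None
```

Here N^{2} ≠ 0 but N^{3} = 0, so m = 3 for this N.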
In general, when you have A ∈ ℒ(V,V) and the minimum polynomial is ∏_{i=1}^{p} ϕ_{i}(λ)^{k_{i}}, you can decompose V into a direct sum as
$$
V = \bigoplus_{i=1}^{p} \ker\left(\phi_{i}(A)^{k_{i}}\right)
$$
Then on ker(ϕ_{i}(A)^{k_{i}}), it follows by definition that ϕ_{i}(A) is nilpotent.
The following lemma contains some significant observations about nilpotent transformations.
Lemma 7.3.2 Let N be nilpotent. Then N^{k}x ≠ 0 if and only if {x, Nx, ⋯, N^{k}x} is linearly independent. Also, the minimum polynomial of N is λ^{m} where m is the first such that N^{m} = 0.
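Proof: Suppose N^{k}x ≠ 0 and ∑_{i=0}^{k} c_{i}N^{i}x = 0 where not all c_{i} = 0. There exists l such that k ≤ l < m
and N^{l+1}x = 0 but N^{l}x ≠ 0. Then multiply both sides by N^{l} to conclude that c_{0} = 0. Next
multiply both sides by N^{l−1} to conclude that c_{1} = 0 and continue this way to obtain that all the
c_{i} = 0, a contradiction. Conversely, a linearly independent set cannot contain the zero vector, so if the set is linearly independent, then N^{k}x ≠ 0.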
Next consider the claim that λ^{m} is the minimum polynomial. If p(λ) is the minimum polynomial, then by the division algorithm,
$$
\lambda^{m} = p(\lambda) l(\lambda) + r(\lambda)
$$
where the degree of r(λ) is less than that of p(λ) or else r(λ) = 0. Substituting N, the above implies 0 = 0 + r(N), which is contrary to p(λ) being minimum unless r(λ) = 0. Hence r(λ) = 0 and so p(λ) divides λ^{m}. Hence p(λ) = λ^{k} for k ≤ m. But if k < m, this would contradict the definition of m as being the smallest such that N^{m} = 0. ■
Note how this lemma implies that if N^{k}x is a linear combination of the preceding vectors for k as small
as possible, then N^{k}x = 0.
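The equivalence in Lemma 7.3.2 can be checked on a small case. The sketch below builds the vectors x, Nx, ⋯, N^{k}x for a 3 × 3 nilpotent matrix and compares "N^{k}x ≠ 0" with "the set is linearly independent", using exact rational Gaussian elimination for the rank. The helper names rank, apply and chain are hypothetical and exist only for this illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def apply(mat, vec):
    """Matrix times column vector."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in mat]


def chain(mat, x, k):
    """Return the list [x, Nx, ..., N^k x]."""
    vectors = [x]
    for _ in range(k):
        vectors.append(apply(mat, vectors[-1]))
    return vectors


N = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
x = [0, 0, 1]

for k in range(4):
    vecs = chain(N, x, k)
    last_is_nonzero = any(v != 0 for v in vecs[-1])
    independent = rank(vecs) == len(vecs)
    print(k, last_is_nonzero, independent)
```

For this choice of N and x the two truth values agree for every k: both are true for k = 0, 1, 2 and both are false for k = 3, where N^{3}x = 0.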
Now suppose V = ker(ϕ(A)^{k}) where ϕ(λ) is irreducible and the minimum polynomial for A on V is ϕ(λ)^{k} as in the above. Letting B = ϕ(A), we can consider a cyclic basis for V of the form {β_{x_{1}}, ⋯, β_{x_{s}}} where β_{x} ≡ {x, Bx, ⋯, B^{m−1}x}. From Lemma 7.3.2, B^{m}x = 0. So what is the matrix of B with respect to this basis? It is block diagonal, the blocks coming from the individual β_{x_{i}}, the size of the blocks being |β_{x_{i}}| × |β_{x_{i}}|, i = 1, ⋯, s.
$$
\begin{pmatrix}
C_{1} & & 0 \\
 & \ddots & \\
0 & & C_{s}
\end{pmatrix}
$$
Using the useful gimmick and ordering the basis using decreasing powers,^{1} one of these blocks is of the form C_{k} where C_{k} is the matrix on the right
$$
\begin{pmatrix} 0 & B^{m-1}x & \cdots & Bx \end{pmatrix}
=
\begin{pmatrix} B^{m-1}x & B^{m-2}x & \cdots & x \end{pmatrix}
\begin{pmatrix}
0 & 1 & & 0 \\
0 & 0 & \ddots & \\
\vdots & \vdots & \ddots & 1 \\
0 & 0 & \cdots & 0
\end{pmatrix}
$$
That is, C_{k} is the |β_{x}| × |β_{x}| matrix which has ones down the super diagonal and zeros elsewhere. Note that the size of the blocks is determined by |β_{x}| just as in the above rational canonical form. In fact, if you ordered the basis in the opposite way, one of these blocks would just be the companion matrix for B just as in the rational canonical form. This is because of Lemma 7.3.2 which requires B^{m}x = 0 and so the linear combination of B^{m}x in terms of B^{k}x for k < m has all zero coefficients. You would have ones down the sub diagonal and zeros elsewhere:
$$
\begin{pmatrix}
0 & 0 & \cdots & 0 \\
1 & 0 & & \vdots \\
 & \ddots & \ddots & \\
0 & & 1 & 0
\end{pmatrix}
$$
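Definition 7.3.3 J_{k}(α) is a Jordan block if it is a k × k matrix of the form
$$
J_{k}(\alpha) =
\begin{pmatrix}
\alpha & 1 & & 0 \\
0 & \ddots & \ddots & \\
\vdots & \ddots & \ddots & 1 \\
0 & \cdots & 0 & \alpha
\end{pmatrix}
$$
In words, there is an unbroken string of ones down the super diagonal and the number α filling every space on the main diagonal with zeros everywhere else.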
Then with this definition and the above discussion, the following proposition has been proved.
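Proposition 7.3.4 Let A ∈ ℒ(V,V) and let the minimal polynomial of A be
$$
p(\lambda) \equiv \prod_{i=1}^{p} \phi_{i}(\lambda)^{k_{i}}, \qquad V = \bigoplus_{i=1}^{p} \ker\left(\phi_{i}(A)^{k_{i}}\right)
$$
where the ϕ_{i}(λ) are irreducible. Also let B_{i} ≡ ϕ_{i}(A). Then B_{i} is nilpotent and
$$
B_{i}^{k_{i}} = 0 \ \text{ on } \ V_{i} = \ker\left(\phi_{i}(A)^{k_{i}}\right)
$$
Letting the dimension of V_{i} be d_{i}, there exists a basis for V_{i} such that the matrix of B_{i} with respect to this basis is of the form
$$
\begin{pmatrix}
J_{r_{1}}(0) & & & 0 \\
 & J_{r_{2}}(0) & & \\
 & & \ddots & \\
0 & & & J_{r_{s}}(0)
\end{pmatrix}
$$
where r_{1} ≥ r_{2} ≥ ⋯ ≥ r_{s} ≥ 1 and ∑_{i=1}^{s} r_{i} = d_{i}. In the above, the J_{r_{j}}(0) is called a Jordan block of size r_{j} × r_{j} with 0 down the main diagonal.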
Observation 7.3.5 Observe that J_{r}(0)^{r} = 0 but J_{r}(0)^{r−1} ≠ 0.
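The observation is easy to confirm for a particular size. The sketch below builds the r × r block with α on the main diagonal and ones on the super diagonal, then checks both claims for r = 4 and α = 0. The function names jordan_block, matmul, matpow and is_zero are hypothetical helpers for this example only.

```python
def jordan_block(size, alpha):
    """size x size matrix: alpha on the diagonal, ones on the super diagonal."""
    return [[alpha if i == j else (1 if j == i + 1 else 0)
             for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, k):
    """Return a^k for a nonnegative integer k (a^0 is the identity)."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result


def is_zero(a):
    return all(entry == 0 for row in a for entry in row)


r = 4
J = jordan_block(r, 0)

print(is_zero(matpow(J, r)))      # True:  J_r(0)^r = 0
print(is_zero(matpow(J, r - 1)))  # False: J_r(0)^(r-1) != 0
print(matpow(J, r - 1))           # single 1 in the upper right corner
```

Each multiplication by J_{r}(0) pushes the string of ones one diagonal further from the main diagonal, which is why exactly r factors are needed to reach 0.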
In fact, the matrix of the above proposition is unique. This is a general fact for a nilpotent matrix
N.
Corollary 7.3.6 Let J, J^{′} both be matrices of the nilpotent linear transformation N ∈ ℒ(W,W) which are of the form described in Proposition 7.3.4. Then J = J^{′}. In fact, if the rank of J^{k} equals the rank of J^{′k} for all nonnegative integers k, then J = J^{′}.
Proof: Since J and J^{′} are similar, it follows that for each integer k, J^{k} and J^{′k} are similar. Hence, for each k, these matrices have the same rank. Now suppose J ≠ J^{′}. Note first that
$$
J_{r}(0)^{r} = 0, \qquad J_{r}(0)^{r-1} \neq 0.
$$
Denote the blocks of J as J_{r_{k}}(0) and the blocks of J^{′} as J_{r_{k}^{′}}(0). Let k be the first such that J_{r_{k}}(0) ≠ J_{r_{k}^{′}}(0). Suppose that r_{k} > r_{k}^{′}. By block multiplication and the above observation, it follows that
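the two matrices J^{r_{k}−1} and J^{′r_{k}−1} are respectively of the forms
$$
\begin{pmatrix}
M_{r_{1}} & & & & \\
 & \ddots & & & \\
 & & M_{r_{k}} & & \\
 & & & \ast & \\
 & & & & \ddots
\end{pmatrix},
\qquad
\begin{pmatrix}
M_{r_{1}'} & & & & \\
 & \ddots & & & \\
 & & M_{r_{k}'} & & \\
 & & & 0 & \\
 & & & & \ddots
\end{pmatrix}
$$
where M_{r_{j}} = M_{r_{j}^{′}} for j ≤ k − 1 but M_{r_{k}^{′}} is a zero r_{k}^{′} × r_{k}^{′} matrix while M_{r_{k}} is a larger matrix which is
not equal to 0. Here ∗ denotes blocks which may or may not be zero, and the blocks of J^{′r_{k}−1} after M_{r_{k}^{′}} are zero because their sizes are no larger than r_{k}^{′}. For example, M_{r_{k}} could look like
$$
M_{r_{k}} =
\begin{pmatrix}
0 & \cdots & 1 \\
 & \ddots & \vdots \\
0 & & 0
\end{pmatrix}
$$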
Thus this contradicts the requirement that J^{k} and J^{′k} have the same rank. ■
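The corollary says that the ranks of the powers of a nilpotent matrix pin down the Jordan blocks. One standard way to make this concrete, used here as an illustration rather than as the argument of the proof, is the count: the number of blocks of size at least j equals rank(N^{j−1}) − rank(N^{j}). The sketch below recovers the block sizes from these ranks alone. The function names are hypothetical, and the input is assumed to be nilpotent.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(rows):
    """Rank by exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def nilpotent_jordan(sizes):
    """Block diagonal matrix with blocks J_r(0) for each r in sizes."""
    n = sum(sizes)
    mat = [[0] * n for _ in range(n)]
    start = 0
    for r in sizes:
        for i in range(start, start + r - 1):
            mat[i][i + 1] = 1  # ones on the super diagonal inside the block
        start += r
    return mat


def block_sizes_from_ranks(mat):
    """Recover Jordan block sizes of a nilpotent matrix from rank(mat^j)."""
    n = len(mat)
    ranks = []  # ranks[j] = rank(mat^j) for j = 0, ..., n + 1
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(n + 2):
        ranks.append(rank(power))
        power = matmul(power, mat)
    sizes = []
    for j in range(n, 0, -1):
        at_least_j = ranks[j - 1] - ranks[j]
        at_least_j_plus_1 = ranks[j] - ranks[j + 1]
        sizes.extend([j] * (at_least_j - at_least_j_plus_1))
    return sizes


J = nilpotent_jordan([3, 2, 2, 1])
print(block_sizes_from_ranks(J))  # [3, 2, 2, 1]
```

Since similar matrices have powers of equal rank, two matrices of the form in Proposition 7.3.4 for the same N must yield the same list of sizes.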
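The Jordan canonical form is available when the minimum polynomial can be factored into linear factors in the field of
scalars. Thus
$$
p(\lambda) = \prod_{k=1}^{r} (\lambda - \mu_{k})^{m_{k}}
$$
and
$$
V = \bigoplus_{i=1}^{r} \ker (A - \mu_{i} I)^{m_{i}} \equiv \bigoplus_{i=1}^{r} V_{i}
$$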
Now here is a useful observation.
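Observation 7.3.7 If W is a vector space and L ∈ ℒ(W,W) is given by Lw = μw, then for any basis for W, the matrix of L with respect to this basis is the diagonal matrix with μ down the main diagonal,
$$
\begin{pmatrix}
\mu & & 0 \\
 & \ddots & \\
0 & & \mu
\end{pmatrix}
$$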
Definition 7.3.8 When the minimum polynomial for A is ∏_{i=1}^{r} (λ − μ_{i})^{m_{i}}, the Jordan canonical form of A is the block diagonal matrix of A obtained from using the cyclic basis for (A − μ_{i}I) on V_{i} ≡ ker(A − μ_{i}I)^{m_{i}}, where
$$
V = \bigoplus_{i=1}^{r} V_{i}
$$
It is unique up to order of blocks and is a block diagonal matrix with each block being block diagonal.
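So what is the matrix description of the Jordan canonical form? From Proposition 7.3.4, the matrix of
A − μ_{i}I on V_{i} having dimension d_{i} is of the form
$$
\begin{pmatrix}
J_{r_{1}}(0) & & 0 \\
 & \ddots & \\
0 & & J_{r_{s}}(0)
\end{pmatrix}
$$
where r_{1} + ⋯ + r_{s} = d_{i} and we can assume the size of the Jordan blocks decreases from upper left to lower
right. It follows then from the above observation that the matrix of A = (A − μ_{i}I) + μ_{i}I on V_{i} is of the form
$$
J(\mu_{i}) =
\begin{pmatrix}
J_{r_{1}}(\mu_{i}) & & 0 \\
 & \ddots & \\
0 & & J_{r_{s}}(\mu_{i})
\end{pmatrix}
$$
The Jordan canonical form of A is then the block diagonal matrix with the blocks J(μ_{1}), ⋯, J(μ_{r}) down the diagonal, where each J(μ_{i})
has a size which is uniquely determined by the dimension of ker(A − μ_{i}I)^{k_{i}} which comes from
the minimum polynomial. Furthermore, each J(μ_{i}) is a block diagonal matrix in which the blocks have a
certain specified form.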
Note that if any of the β_{k} consists of eigenvectors, then the corresponding Jordan block will consist of a
diagonal matrix having λ_{k} down the main diagonal. This corresponds to m_{k} = 1. The vectors which
are in ker(A − λ_{k}I)^{m_{k}} which are not in ker(A − λ_{k}I) are called generalized eigenvectors.
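A minimal numerical illustration of a generalized eigenvector follows. The 2 × 2 matrix A and the vector v are choices made for this example only, and the helper name apply is hypothetical. Here λ = 2 is the only eigenvalue and m = 2.

```python
def apply(mat, vec):
    """Matrix times column vector."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in mat]


A = [[2, 1],
     [0, 2]]
lam = 2

# The matrix A - lam * I.
shifted = [[A[i][j] - (lam if i == j else 0) for j in range(2)]
           for i in range(2)]

v = [0, 1]
first = apply(shifted, v)       # (A - lam I) v
second = apply(shifted, first)  # (A - lam I)^2 v

print(first)   # [1, 0]: nonzero, so v is not an eigenvector
print(second)  # [0, 0]: v is in ker (A - lam I)^2
print(apply(A, first))  # [2, 0]: (A - lam I) v is an honest eigenvector
```

So v lies in ker(A − 2I)^{2} but not in ker(A − 2I), which makes it a generalized eigenvector in the sense just described.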
The following is the main result on the Jordan canonical form.
Theorem 7.3.9 Let V be an n dimensional vector space with field of scalars ℂ or some other field such that the minimum polynomial of A ∈ ℒ(V,V) completely factors into powers of linear factors. Then there exists a unique Jordan canonical form for A, where uniqueness is in the sense that any two have the same number and size of Jordan blocks.
Proof: Suppose there are two, J and J^{′}. Then these are matrices of A with respect to possibly
different bases and so they are similar. Therefore, they have the same minimum polynomials and the
generalized eigenspaces ker(A − μ_{i}I)^{k_{i}} have the same dimension. Thus the size of the matrices J(λ_{k}) and J^{′}(λ_{k}) defined by the dimension of these generalized eigenspaces, also corresponding to the algebraic
multiplicity of λ_{k}, must be the same. Therefore, they comprise the same set of positive integers. Thus
listing the eigenvalues in the same order, corresponding blocks J(λ_{k}), J^{′}(λ_{k}) are the same size.
It remains to show that J(λ_{k}) and J^{′}(λ_{k}) are not just the same size but also are the same up to order
of the Jordan blocks running down their respective diagonals. It is only necessary to worry about the
number and size of the Jordan blocks making up J(λ_{k}) and J^{′}(λ_{k}). Since J, J^{′} are similar, so are J − λ_{k}I
and J^{′} − λ_{k}I.
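Thus the following matrix is similar to the corresponding matrix in which each J(λ_{j}) is replaced with J^{′}(λ_{j}):
$$
A \equiv
\begin{pmatrix}
J(\lambda_{1}) - \lambda_{k} I & & & & 0 \\
 & \ddots & & & \\
 & & J(\lambda_{k}) - \lambda_{k} I & & \\
 & & & \ddots & \\
0 & & & & J(\lambda_{r}) - \lambda_{k} I
\end{pmatrix}
$$
Consequently, all powers of these two matrices have the same rank. For λ_{j} ≠ λ_{k}, the blocks J(λ_{j}) − λ_{k}I and J^{′}(λ_{j}) − λ_{k}I are upper triangular with nonzero entries on the main diagonal, hence invertible, and they have the same size. Therefore, the powers of J(λ_{k}) − λ_{k}I and J^{′}(λ_{k}) − λ_{k}I have the same rank. These two matrices are block diagonal with Jordan blocks having 0 down the main diagonal,
and it is required to verify that the number of these blocks is the same in both (p = r) and that the same blocks occur in both. Without loss of generality,
let the blocks be arranged according to size with the largest on upper left corner falling to smallest in lower
right. Now the desired conclusion follows from Corollary 7.3.6. ■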
Note that if any of the generalized eigenspaces ker(A − μ_{k}I)^{m_{k}} has a basis of eigenvectors, then it
would be possible to use this basis and obtain a diagonal matrix in the block corresponding
to μ_{k}. By uniqueness, this is the block corresponding to the eigenvalue μ_{k}. Thus when this
happens, the block in the Jordan canonical form corresponding to μ_{k} is just the diagonal matrix
having μ_{k} down the diagonal and there are no generalized eigenvectors as fussed over in
ordinary differential equations. Recall that these were vectors in ker(A − μ_{k}I)^{m_{k}} but not in
ker(A − μ_{k}I).
The Jordan canonical form is very significant when you try to understand powers of a matrix. There exists an
n × n matrix S such that^{2}
$$
A = S^{-1} J S.
$$
Therefore, A^{2} = S^{−1}JSS^{−1}JS = S^{−1}J^{2}S and continuing this way, it follows
$$
A^{k} = S^{-1} J^{k} S,
$$
where J is the Jordan canonical form of A described above. Consider J^{k}. By block multiplication,
$$
J^{k} =
\begin{pmatrix}
J_{1}^{k} & & 0 \\
 & \ddots & \\
0 & & J_{r}^{k}
\end{pmatrix}.
$$
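The identity A^{k} = S^{−1}J^{k}S can be checked on a small example. In the sketch below, J is a single 2 × 2 Jordan block with 2 on the diagonal, and S is an integer matrix chosen for this illustration so that S^{−1} also has integer entries. These matrices and the helper names matmul and matpow are assumptions of the example, not taken from the text.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, k):
    """Return a^k for a nonnegative integer k (a^0 is the identity)."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result


S = [[1, 0],
     [1, 1]]
S_inv = [[1, 0],
         [-1, 1]]
J = [[2, 1],
     [0, 2]]

# A = S^{-1} J S is similar to J but is not itself in Jordan form.
A = matmul(matmul(S_inv, J), S)
print(A)  # [[3, 1], [-1, 1]]

# Compare A^k computed directly with S^{-1} J^k S.
agree = all(
    matpow(A, k) == matmul(matmul(S_inv, matpow(J, k)), S)
    for k in range(7)
)
print(agree)         # True
print(matpow(J, 5))  # [[32, 80], [0, 32]]
```

The point of the identity is that J^{k} is much easier to describe than A^{k}: in this example the diagonal entries of J^{5} are 2^{5} and the entry above the diagonal is 5 · 2^{4}.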
The matrix J_{s} is an m_{s} × m_{s} matrix which is of the form
$$
J_{s} = D + N \tag{7.6}
$$
for D a multiple of the identity and N an upper triangular matrix with zeros down the main diagonal.
Thus N^{m_{s}} = 0. Now since D is just a multiple of the identity, it follows that DN = ND. Therefore, the
usual binomial theorem may be applied and this yields the following equations for k ≥ m_{s}.
$$
J_{s}^{k} = (D + N)^{k} = \sum_{j=0}^{k} \binom{k}{j} D^{k-j} N^{j}
= \sum_{j=0}^{m_{s}} \binom{k}{j} D^{k-j} N^{j}, \tag{7.7}
$$
the third equation holding because N^{m_{s}} = 0. Thus J_{s}^{k} is of the form