would be independent, which is impossible by Theorem 3.1.5: you would have a linearly independent set which is larger than a spanning set. ■
Suppose now that A ∈ ℒ(V,W) and B ∈ ℒ(W,U) where V,W,U are all finite dimensional vector spaces. Then it is interesting to consider ker(BA). The following theorem of Sylvester is a very useful and important result.
Theorem 6.1.3 Let A ∈ ℒ(V,W) and B ∈ ℒ(W,U) where V,W,U are all vector spaces over a field F. Suppose also that ker(A) and A(ker(BA)) are finite dimensional subspaces. Then

dim(ker(BA)) ≤ dim(ker(B)) + dim(ker(A)).

Equality holds if and only if A(ker(BA)) = ker(B).
Proof: If x ∈ ker(BA), then Ax ∈ ker(B) and so

A(ker(BA)) ⊆ ker(B).
Now let {x_{1},⋅⋅⋅,x_{n}} be a basis of ker(A) and let {Ay_{1},⋅⋅⋅,Ay_{m}} be a basis for A(ker(BA)), each y_{i} ∈ ker(BA). Take any z ∈ ker(BA). Then Az = ∑_{i=1}^{m}a_{i}Ay_{i} and so
A(z − ∑_{i=1}^{m}a_{i}y_{i}) = 0
which means z − ∑_{i=1}^{m}a_{i}y_{i} ∈ ker(A) and so there are scalars b_{i} such that

z − ∑_{i=1}^{m}a_{i}y_{i} = ∑_{i=1}^{n}b_{i}x_{i}.
It follows that span(x_{1},⋅⋅⋅,x_{n},y_{1},⋅⋅⋅,y_{m}) = ker(BA) and so by the first part,

dim(ker(BA)) ≤ n + m = dim(ker(A)) + dim(A(ker(BA))) ≤ dim(ker(A)) + dim(ker(B))
Now {x_{1},⋅⋅⋅,x_{n},y_{1},⋅⋅⋅,y_{m}} is linearly independent because if

∑_{i}a_{i}x_{i} + ∑_{j}b_{j}y_{j} = 0

then you could do A to both sides and conclude that ∑_{j}b_{j}Ay_{j} = 0, which requires that each b_{j} = 0 since the Ay_{j} are linearly independent. Then it follows that each a_{i} = 0 also because the equation reduces to ∑_{i}a_{i}x_{i} = 0 and the x_{i} are linearly independent. Thus the first inequality in the above list is an equality and {x_{1},⋅⋅⋅,x_{n},y_{1},⋅⋅⋅,y_{m}} is a basis for ker(BA): each vector is in ker(BA), they are linearly independent, and their span is ker(BA). Hence dim(ker(BA)) = n + m = dim(ker(A)) + dim(A(ker(BA))), and so equality holds in the theorem if and only if dim(A(ker(BA))) = dim(ker(B)), which, since A(ker(BA)) ⊆ ker(B), happens exactly when A(ker(BA)) = ker(B). In that case,

dim(ker(BA)) = m + n = dim(ker(B)) + dim(ker(A)). ■
Of course this result holds for any finite product of linear transformations by induction. One way this is quite useful is in the case where you have a finite product of linear transformations ∏_{i=1}^{l}L_{i} all in ℒ(V,V). Then

dim(ker(∏_{i=1}^{l}L_{i})) ≤ ∑_{i=1}^{l}dim(ker(L_{i})).
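As a numerical sanity check (not part of the original argument), Sylvester's inequality can be verified for concrete matrices by rank–nullity, dim(ker(M)) = n − rank(M). The following is a minimal sketch assuming NumPy is available; the matrices A and B are illustrative choices.

```python
import numpy as np

def dim_ker(M):
    """dim(ker(M)) = number of columns minus rank (rank-nullity)."""
    return M.shape[1] - np.linalg.matrix_rank(M)

# Illustrative choices: A kills the second coordinate, B kills the first.
A = np.array([[1., 0.], [0., 0.]])   # ker(A) = span{(0, 1)}
B = np.array([[0., 0.], [0., 1.]])   # ker(B) = span{(1, 0)}
BA = B @ A                           # here BA = 0, so ker(BA) = R^2

# Sylvester: dim(ker(BA)) <= dim(ker(B)) + dim(ker(A))
assert dim_ker(BA) <= dim_ker(B) + dim_ker(A)
```

In this example dim(ker(BA)) = 2 = 1 + 1, and A(ker(BA)) = span{(1,0)} = ker(B), matching the equality case of the theorem.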
Now here is a useful lemma which is likely already understood.
Lemma 6.1.4 Let L ∈ ℒ(V,W) where V,W are n dimensional vector spaces. Then L is one to one if and only if L is also onto. In fact, if L is one to one and {v_{1},⋅⋅⋅,v_{n}} is a basis for V, then {Lv_{1},⋅⋅⋅,Lv_{n}} is a basis for W.
Proof: Let {v_{1},⋅⋅⋅,v_{n}} be a basis for V. Then I claim that {Lv_{1},⋅⋅⋅,Lv_{n}} is a basis for W. First of all, I show {Lv_{1},⋅⋅⋅,Lv_{n}} is linearly independent. Suppose

∑_{k=1}^{n}c_{k}Lv_{k} = 0.
Then

L(∑_{k=1}^{n}c_{k}v_{k}) = 0

and since L is one to one, it follows

∑_{k=1}^{n}c_{k}v_{k} = 0
which implies each c_{k} = 0. Therefore, {Lv_{1},⋅⋅⋅,Lv_{n}} is linearly independent. If there exists w ∈ W not in the span of these vectors, then by Lemma 3.1.7, {Lv_{1},⋅⋅⋅,Lv_{n},w} would be independent, and this contradicts the exchange theorem, Theorem 3.1.5, because it would be a linearly independent set in W having more vectors than a spanning set of W, which has dimension n. Thus {Lv_{1},⋅⋅⋅,Lv_{n}} spans W and L is onto.
Conversely, suppose L is onto. Then there exists a basis for W which is of the form {Lv_{1},⋅⋅⋅,Lv_{n}}. It follows that {v_{1},⋅⋅⋅,v_{n}} is linearly independent, hence a basis for V, by reasoning similar to the above. Then if Lx = 0, it follows that there are scalars c_{i} such that x = ∑_{i}c_{i}v_{i} and consequently 0 = Lx = ∑_{i}c_{i}Lv_{i}. Therefore, each c_{i} = 0 and so x = 0 also. Thus L is one to one. ■
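For V = W = F^{n} with L given by a square matrix, the lemma reduces to the familiar fact that full column rank and full row rank coincide. A small sketch, assuming NumPy; both test matrices below are illustrative choices, not from the text.

```python
import numpy as np

def one_to_one(L):
    """Injective <=> trivial kernel <=> full column rank."""
    return np.linalg.matrix_rank(L) == L.shape[1]

def onto(L):
    """Surjective onto F^n <=> full row rank."""
    return np.linalg.matrix_rank(L) == L.shape[0]

invertible = np.array([[2., 1.], [1., 1.]])   # det = 1, so nonsingular
singular   = np.array([[1., 2.], [2., 4.]])   # second row = 2 * first

# For square matrices the two notions coincide, as Lemma 6.1.4 asserts.
assert one_to_one(invertible) and onto(invertible)
assert (not one_to_one(singular)) and (not onto(singular))

# When L is one to one, the images of the standard basis vectors,
# i.e. the columns of L, again form a basis (they have full rank).
basis_image_rank = int(np.linalg.matrix_rank(invertible @ np.eye(2)))
```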
Here is a fundamental lemma which gives a typical situation where a vector space is the direct sum of
subspaces.
Lemma 6.1.5 Let L_{i} ∈ ℒ(V,V) for i = 1,⋅⋅⋅,p, where V is a finite dimensional vector space, and suppose for i ≠ j, L_{i}L_{j} = L_{j}L_{i} and also L_{i} is one to one on ker(L_{j}) whenever i ≠ j. Then

ker(∏_{i=1}^{p}L_{i}) = ker(L_{1}) ⊕ ⋅⋅⋅ ⊕ ker(L_{p})

Here ∏_{i=1}^{p}L_{i} is the product of all the linear transformations. It signifies L_{p} ∘ L_{p−1} ∘ ⋅⋅⋅ ∘ L_{1}, or the product in any other order, since the transformations commute.
Proof: Note that since the operators commute, L_{j} : ker(L_{i}) → ker(L_{i}). Here is why. If L_{i}y = 0 so that y ∈ ker(L_{i}), then

L_{i}L_{j}y = L_{j}L_{i}y = L_{j}0 = 0

and so L_{j} : ker(L_{i}) → ker(L_{i}). Next observe that it is obvious that, since the operators commute,

∑_{i=1}^{p}ker(L_{i}) ⊆ ker(∏_{i=1}^{p}L_{i})
Next, why is ∑_{i}ker(L_{i}) = ker(L_{1}) ⊕ ⋅⋅⋅ ⊕ ker(L_{p})? Suppose

∑_{i=1}^{p}v_{i} = 0, v_{i} ∈ ker(L_{i}),
but some v_{i} ≠ 0. Then do ∏_{j≠i}L_{j} to both sides. Since the linear transformations commute, each v_{k} with k ≠ i is annihilated by its factor L_{k}, and this results in

(∏_{j≠i}L_{j})(v_{i}) = 0

which contradicts the assumption that these L_{j} are one to one on ker(L_{i}) and the observation that they map ker(L_{i}) to ker(L_{i}). Thus if

∑_{i}v_{i} = 0, v_{i} ∈ ker(L_{i})

then each v_{i} = 0. It follows that

ker(L_{1}) ⊕ ⋅⋅⋅ ⊕ ker(L_{p}) ⊆ ker(∏_{i=1}^{p}L_{i})    (*)
From Sylvester’s theorem and the observation about direct sums in Lemma 6.0.2,

∑_{i=1}^{p}dim(ker(L_{i})) = dim(ker(L_{1}) ⊕ ⋅⋅⋅ ⊕ ker(L_{p})) ≤ dim(ker(∏_{i=1}^{p}L_{i})) ≤ ∑_{i=1}^{p}dim(ker(L_{i}))

which implies all these are equal. Now in general, if W is a subspace of V, a finite dimensional vector space, and the two have the same dimension, then W = V, Lemma 6.1.2. It follows from (*) that

ker(L_{1}) ⊕ ⋅⋅⋅ ⊕ ker(L_{p}) = ker(∏_{i=1}^{p}L_{i}). ■
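The lemma can be seen concretely for commuting operators built as polynomials in a single matrix. For the illustrative choice A = [[0,1],[1,0]] (so A² = I), the operators L₁ = A − I and L₂ = A + I commute, each is one to one on the other's kernel, and R² splits as ker(L₁) ⊕ ker(L₂). A minimal sketch, assuming NumPy:

```python
import numpy as np

def dim_ker(M):
    """dim(ker(M)) = number of columns minus rank (rank-nullity)."""
    return M.shape[1] - np.linalg.matrix_rank(M)

A = np.array([[0., 1.], [1., 0.]])   # illustrative choice; A^2 = I
I2 = np.eye(2)
L1, L2 = A - I2, A + I2              # polynomials in A, hence they commute
assert np.allclose(L1 @ L2, L2 @ L1)

# L1 L2 = A^2 - I = 0, so ker(L1 L2) is all of R^2, and it splits as
# ker(L1) (+) ker(L2) = span{(1, 1)} (+) span{(1, -1)}.
assert dim_ker(L1 @ L2) == dim_ker(L1) + dim_ker(L2) == 2

# L1 is one to one on ker(L2): it does not annihilate (1, -1).
assert np.linalg.norm(L1 @ np.array([1., -1.])) > 0
```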
So how does the above situation occur? First recall the following theorem and corollary about polynomials, stated here as Theorem 6.1.6 and Corollary 6.1.7, which were proved earlier.
Theorem 6.1.6 Let f(λ) be a nonconstant polynomial with coefficients in F. Then there is some a ∈ F such that f(λ) = a∏_{i=1}^{n}ϕ_{i}(λ) where each ϕ_{i}(λ) is an irreducible nonconstant monic polynomial and repeats are allowed. Furthermore, this factorization is unique in the sense that any two of these factorizations have the same nonconstant factors in the product, possibly in different order, and the same constant a.
Corollary 6.1.7 Let q(λ) = ∏_{i=1}^{p}ϕ_{i}(λ)^{k_{i}} where the k_{i} are positive integers and the ϕ_{i}(λ) are irreducible monic polynomials. Suppose also that p(λ) is a monic polynomial which divides q(λ). Then

p(λ) = ∏_{i=1}^{p}ϕ_{i}(λ)^{r_{i}}

where r_{i} is a nonnegative integer no larger than k_{i}.
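The divisibility in the corollary can be checked mechanically with schoolbook long division of coefficient lists. The sketch below is self-contained; the polynomials are illustrative choices of my own, with coefficients listed highest degree first.

```python
def poly_divmod(num, den):
    """Schoolbook long division of polynomials given as coefficient
    lists, highest degree first; returns (quotient, remainder)."""
    num, q = list(num), []
    while len(num) >= len(den):
        f = num[0] / den[0]
        q.append(f)
        for i in range(len(den)):
            num[i] -= f * den[i]
        num.pop(0)  # leading coefficient is now zero
    return q, num

# q(lam) = (lam - 1)(lam + 1)^2 = lam^3 + lam^2 - lam - 1
q_poly = [1, 1, -1, -1]
# p(lam) = (lam - 1)(lam + 1) uses the same irreducible factors with
# smaller exponents, so it divides q(lam): the remainder is zero.
quot, rem = poly_divmod(q_poly, [1, 0, -1])
assert all(abs(c) < 1e-12 for c in rem)
# lam - 2 involves a different irreducible factor: nonzero remainder.
_, rem2 = poly_divmod(q_poly, [1, -2])
assert any(abs(c) > 1e-12 for c in rem2)
```

The nonzero remainder 9 in the second division also agrees with the remainder theorem, since q(2) = 8 + 4 − 2 − 1 = 9.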
Now I will show how to use these basic theorems about polynomials to produce L_{i} such that the above
major result follows. This is going to have a striking similarity to the notion of a minimum polynomial in
the context of algebraic numbers.
Definition 6.1.8 Let V be an n dimensional vector space, n ≥ 1, and let L ∈ ℒ(V,V), which is a vector space of dimension n^{2} by Theorem 5.1.4. Then p(λ) will be the nonconstant monic polynomial such that p(L) = 0 and, out of all polynomials q(λ) such that q(L) = 0, the degree of p(λ) is the smallest. This is called the minimum polynomial. It is always understood that L ≠ 0; it is not interesting to fuss with the case of the zero linear transformation.
In the following, we always define L^{0}≡ I.
Theorem 6.1.9 The above definition is well defined. Also, if q(L) = 0, then p(λ) divides q(λ).
Proof: The dimension of ℒ(V,V) is n^{2}. Therefore, I, L, ⋅⋅⋅, L^{n^{2}} are linearly dependent, there being n^{2} + 1 of them, and so there is some polynomial q(λ) such that q(L) = 0. Let m be the smallest degree of any polynomial with this property. Such a smallest number exists by well ordering of ℕ. To obtain a monic polynomial p(λ) with degree m, divide a polynomial of degree m having the property that p(L) = 0 by its leading coefficient. Now suppose q(λ) is any polynomial such that q(L) = 0. Then by the Euclidean algorithm, there is r(λ), either zero or having degree less than the degree of p(λ), such that

q(λ) = p(λ)k(λ) + r(λ)

for some polynomial k(λ). But then

0 = q(L) = k(L)p(L) + r(L) = r(L)

If r(λ) ≠ 0, then this is a contradiction to p(λ) having the smallest degree. Therefore, r(λ) = 0 and p(λ) divides q(λ).
Now suppose p̂(λ) and p(λ) are two monic polynomials of degree m with p̂(L) = p(L) = 0. Then from what was just shown, p̂(λ) divides p(λ) and p(λ) divides p̂(λ). Since they are both monic polynomials of the same degree, they must be equal. Thus the minimum polynomial is unique and this shows the above definition is well defined. ■
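The existence argument in the proof is constructive: look for the first m for which I, L, ⋅⋅⋅, L^{m} are linearly dependent, then solve for the coefficients of the dependence. A minimal numerical sketch, assuming NumPy is available; the function name min_poly_coeffs is my own, not from the text.

```python
import numpy as np

def min_poly_coeffs(L, tol=1e-9):
    """Coefficients [c_0, ..., c_{m-1}, 1] (lowest degree first) of the
    minimum polynomial of L, found as in the proof of Theorem 6.1.9:
    the first linear dependence among vec(I), vec(L), vec(L^2), ...."""
    n = L.shape[0]
    powers = [np.eye(n)]
    for m in range(1, n * n + 1):
        powers.append(powers[-1] @ L)
        M = np.column_stack([P.flatten() for P in powers])
        if np.linalg.matrix_rank(M, tol) < m + 1:
            # L^m = -(c_0 I + ... + c_{m-1} L^{m-1}); solve for the c_i.
            c, *_ = np.linalg.lstsq(M[:, :m], -M[:, m], rcond=None)
            return [float(x) for x in np.round(c, 6)] + [1.0]

# The identity on F^3 has minimum polynomial lam - 1, of degree 1,
# even though its characteristic polynomial has degree 3.
p_identity = min_poly_coeffs(np.eye(3))
# A nonzero 2x2 nilpotent Jordan block has minimum polynomial lam^2.
p_nilpotent = min_poly_coeffs(np.array([[0., 1.], [0., 0.]]))
```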
Now here is the major result which comes from Sylvester’s theorem given above.
Theorem 6.1.10 Let L ∈ ℒ(V,V) where V is an n dimensional vector space with field of scalars F. Letting p(λ) be the minimum polynomial for L,

p(λ) = ∏_{i=1}^{p}ϕ_{i}(λ)^{k_{i}}

where the k_{i} are positive integers and the ϕ_{i}(λ) are distinct irreducible monic polynomials. Also the linear maps ϕ_{i}(L)^{k_{i}} commute and ϕ_{i}(L)^{k_{i}} is one to one on ker(ϕ_{j}(L)^{k_{j}}) for all j ≠ i, as is ϕ_{i}(L). Also

V = ker(ϕ_{1}(L)^{k_{1}}) ⊕ ⋅⋅⋅ ⊕ ker(ϕ_{p}(L)^{k_{p}})

and each ker(ϕ_{i}(L)^{k_{i}}) is invariant with respect to L. Letting L_{j} be the restriction of L to ker(ϕ_{j}(L)^{k_{j}}), it follows that the minimum polynomial of L_{j} equals ϕ_{j}(λ)^{k_{j}}.

Proof: By Theorem 6.1.6, p(λ) = a∏_{i=1}^{p}ϕ_{i}(λ)^{k_{i}} where the ϕ_{i}(λ) are distinct irreducible monic polynomials, and since p(λ) is monic, it follows that a = 1.
Since L commutes with itself, all of these ϕ_{i}(L)^{k_{i}} commute. Also

ϕ_{i}(L) : ker(ϕ_{j}(L)^{k_{j}}) → ker(ϕ_{j}(L)^{k_{j}})

because all of these operators commute.
Now consider ϕ_{i}(L). Is it one to one on ker(ϕ_{j}(L)^{k_{j}})? Suppose not. Suppose that for some k ≠ i, ϕ_{i}(L) is not one to one on ker(ϕ_{k}(L)^{k_{k}}). We know that ϕ_{i}(λ), ϕ_{k}(λ)^{k_{k}} are relatively prime, meaning the monic polynomial of greatest degree which divides them both is 1. Why is this? If some monic polynomial divided both, then it would need to be ϕ_{i}(λ) or 1 because ϕ_{i}(λ) is irreducible. But ϕ_{i}(λ) cannot divide ϕ_{k}(λ)^{k_{k}} unless it equals ϕ_{k}(λ), this by Corollary 6.1.7, and they are assumed unequal. Hence there are polynomials l(λ), m(λ) such that

1 = l(λ)ϕ_{i}(λ) + m(λ)ϕ_{k}(λ)^{k_{k}}
By what we mean by equality of polynomials, that coefficients of equal powers of λ are equal, it follows
that for I the identity transformation,