6.3 Eigenvalues And Eigenvectors Of Linear Transformations
We begin with the following fundamental definition.
Definition 6.3.1 Let L ∈ ℒ(V,V) where V is a vector space of dimension n with field of scalars F. An eigen-pair consists of a scalar λ ∈ F called an eigenvalue and a non-zero vector v ∈ V called an eigenvector such that

(λI − L)v = 0
Do eigen-pairs exist? Recall from Theorem 6.1.10 that the minimum polynomial can be factored in a unique way as

p(λ) = ∏_{i=1}^{p} ϕ_i(λ)^{k_i}

where each ϕ_i(λ) is irreducible and monic. Then the following theorem is obtained.
Theorem 6.3.2 Let L ∈ ℒ(V,V) and let its minimum polynomial p(λ) have a root μ in the field of scalars. Then μ is an eigenvalue of L.
Proof: Since p(λ) has the root μ, we know p(λ) = (λ − μ)q(λ) where the degree of q(λ) is less than the degree of p(λ). Therefore, there is a vector u such that q(L)u ≡ v ≠ 0; otherwise q(L) = 0 and p(λ) is not really the minimum polynomial. Then

(L − μI)q(L)u = (L − μI)v = p(L)u = 0

and so μ is indeed an eigenvalue. ■
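The proof's construction can be carried out numerically. The following sketch uses numpy and borrows the matrix of Example 6.3.4 below, whose minimum polynomial is (λ − 2)(λ − 1): for the root μ = 2, q(λ) = λ − 1, and v = q(A)u is an eigenvector for 2 whenever it is nonzero.

```python
import numpy as np

# Matrix from Example 6.3.4; its minimum polynomial is (λ − 2)(λ − 1).
A = np.array([[4.0, 0.0, -6.0],
              [-1.0, 2.0, 3.0],
              [1.0, 0.0, -1.0]])
I = np.eye(3)

# p(λ) = (λ − 2) q(λ) with q(λ) = λ − 1.  Pick any u with q(A)u ≠ 0.
u = np.array([1.0, 0.0, 0.0])
v = (A - I) @ u          # v = q(A)u

# Since p(A) = 0, (A − 2I)v = p(A)u = 0, i.e. Av = 2v.
print(v)                 # an eigenvector for the eigenvalue 2
print(A @ v - 2 * v)     # the zero vector
```

Here u was chosen arbitrarily; any u with q(A)u ≠ 0 works equally well.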
Theorem 6.3.3 Suppose the minimum polynomial of L ∈ ℒ(V,V), p(λ), factors completely into linear factors (splits) so that

p(λ) = ∏_{i=1}^{p} (λ − μ_i)^{k_i}

Then the μ_i are distinct eigenvalues and corresponding to each of these eigenvalues, there is an eigenvector w_i ≠ 0 such that Lw_i = μ_i w_i. There are no other eigenvalues than these μ_i. Also

V = ker(L − μ_1 I)^{k_1} ⊕ ⋯ ⊕ ker(L − μ_p I)^{k_p}

and if L_i is the restriction of L to ker(L − μ_i I)^{k_i}, then L_i has exactly one eigenvalue and it is μ_i.
Proof: By Theorem 6.3.2, each μ_i is an eigenvalue and we can let w_i be a corresponding eigenvector. By Theorem 6.1.10,

V = ker(L − μ_1 I)^{k_1} ⊕ ⋯ ⊕ ker(L − μ_p I)^{k_p}

Also by this theorem, the minimum polynomial of L_i is (λ − μ_i)^{k_i} and so L_i has the eigenvalue μ_i. Could L_i have any other eigenvalue ν ≠ μ_i? To save notation, denote by m the exponent k_i and by μ the eigenvalue μ_i. Also let w denote an eigenvector of L_i corresponding to ν, so (L − νI)w = 0. Then since the minimum polynomial for L_i is (λ − μ)^m,

0 = (L − μI)^m w = ((L − νI) + (ν − μ)I)^m w = ∑_{k=0}^{m} \binom{m}{k} (L − νI)^{m−k} (ν − μ)^k w = (ν − μ)^m w

where only the k = m term survives because (L − νI)w = 0. This is impossible because w ≠ 0. Thus there can be no other eigenvalue for L_i.

Consider the claim about L having no other eigenvalues than the μ_i. Say μ is another eigenvalue with eigenvector w. Then let w = ∑_i z_i, z_i ∈ ker(L − μ_i I)^{k_i}. Then not every z_i = 0 and

0 = (L − μI) ∑_i z_i = ∑_i (Lz_i − μz_i) = ∑_i (L_i z_i − μz_i)

Since this is a direct sum and each ker(L − μ_i I)^{k_i} is invariant with respect to L, we must have each L_i z_i − μz_i = 0. This is impossible unless μ equals some μ_i because not every z_i is 0. ■
Example 6.3.4 The minimum polynomial for the matrix

    ⎛  4  0  −6 ⎞
A = ⎜ −1  2   3 ⎟
    ⎝  1  0  −1 ⎠

is λ² − 3λ + 2. This factors as (λ − 2)(λ − 1) and so the eigenvalues are 1, 2. Find the eigen-pairs. Then determine the matrix with respect to a basis of these eigenvectors if possible.
First consider the eigenvalue 2. There exists a nonzero vector v such that (A − 2I)v = 0. This follows from the above theory. However, it is best to just find it directly rather than try to get it by using the proof of the above theorem. The augmented matrix to consider is then

⎛  2  0  −6  0 ⎞
⎜ −1  0   3  0 ⎟
⎝  1  0  −3  0 ⎠

Row reduction yields x = 3z with y, z arbitrary, so the eigenvectors for 2 are the nonzero vectors of span(( 3 0 1 )^T, ( 0 1 0 )^T). Similarly, row reducing the augmented matrix for A − I yields the eigenvector ( 2 −1 1 )^T for the eigenvalue 1. Thus ( 3 0 1 )^T, ( 0 1 0 )^T, ( 2 −1 1 )^T is a basis of eigenvectors.
You might want to consider Problem 9 on Page 298 at this point. This problem shows that the matrix with
respect to this basis is diagonal.
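The claim that this basis diagonalizes A is easy to check numerically. The following numpy sketch uses the matrix S whose columns are one choice of eigenvector basis, found by solving (A − 2I)v = 0 and (A − I)v = 0.

```python
import numpy as np

A = np.array([[4.0, 0.0, -6.0],
              [-1.0, 2.0, 3.0],
              [1.0, 0.0, -1.0]])

# Columns: two eigenvectors for the eigenvalue 2, one for the eigenvalue 1.
S = np.array([[3.0, 0.0, 2.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, 1.0]])

D = np.linalg.inv(S) @ A @ S
print(np.round(D))   # the diagonal matrix diag(2, 2, 1)
```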
When the matrix of a linear transformation can be chosen to be a diagonal matrix, the transformation is said to be nondefective. Note that the term applies to the matrix of a linear transformation, and so I will specialize to the consideration of matrices in what follows. As shown above, this is equivalent to saying that any matrix of the linear transformation is similar to one which is diagonal. That is, the matrix of the linear transformation, or more generally just a square matrix A, has the property that there exists S such that S^{−1}AS = D where D is a diagonal matrix.
Here is a definition which also introduces one of the most horrible adjectives in all of mathematics.
Definition 6.3.5 Let A be an n×n matrix. Then A is diagonalizable if there exists an invertible matrix S such that

S^{−1}AS = D

where D is a diagonal matrix. This means D has a zero as every entry except on the main diagonal. More precisely, D_{ij} = 0 unless i = j. Such matrices look like the following:

⎛ ∗       0 ⎞
⎜    ⋱      ⎟
⎝ 0       ∗ ⎠

where ∗ might not be zero.
The most important theorem about diagonalizability is the following major result. First here is a simple observation.

Observation 6.3.6 Let S = ( s_1 ⋯ s_n ) where S is n×n. Then here is the result of multiplying on the right by a diagonal matrix:

              ⎛ λ_1        ⎞
( s_1 ⋯ s_n ) ⎜     ⋱      ⎟ = ( λ_1 s_1 ⋯ λ_n s_n )
              ⎝        λ_n ⎠

This follows from the way we multiply matrices. The i^{th} entry of the j^{th} column of the product on the left is of the form (s_j)_i λ_j, the i^{th} entry of s_j times λ_j. Thus the j^{th} column of the matrix on the left is just λ_j s_j.
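This column-scaling behavior is easy to see with numpy. In the sketch below, right-multiplying by a diagonal matrix is compared with directly scaling each column j by λ_j; the matrix S is random and only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4))
lam = np.array([2.0, -1.0, 3.0, 0.5])

# Right-multiplying by diag(λ_1, ..., λ_n) scales the j-th column by λ_j.
left = S @ np.diag(lam)
right = S * lam          # broadcasting multiplies column j by lam[j]
print(np.allclose(left, right))   # True
```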
Theorem 6.3.7 An n × n matrix A is diagonalizable if and only if F^n has a basis of eigenvectors of A. Furthermore, you can take the matrix S described above to be given as

S = ( s_1 s_2 ⋯ s_n )

where here the s_k are the eigenvectors in the basis for F^n. If A is diagonalizable, the eigenvalues of A are the diagonal entries of the diagonal matrix.
Proof: To say that A is diagonalizable is to say that for some S,

           ⎛ λ_1        ⎞
S^{−1}AS = ⎜     ⋱      ⎟
           ⎝        λ_n ⎠

the λ_i being elements of F. This is to say that for S = ( s_1 ⋯ s_n ), s_k being the k^{th} column,

                                ⎛ λ_1        ⎞
A ( s_1 ⋯ s_n ) = ( s_1 ⋯ s_n ) ⎜     ⋱      ⎟
                                ⎝        λ_n ⎠

which is equivalent, from the way we multiply matrices, to

( As_1 ⋯ As_n ) = ( λ_1 s_1 ⋯ λ_n s_n )

which is equivalent to saying that the columns of S are eigenvectors and the diagonal matrix has the eigenvalues down the main diagonal. Since S is invertible, these eigenvectors are a basis. Similarly, if there is a basis of eigenvectors, one can take them as the columns of S and reverse the above steps, finally concluding that A is diagonalizable. ■
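Numerical libraries package exactly this construction. In numpy, `np.linalg.eig` returns the eigenvalues together with a matrix whose columns are eigenvectors, which plays the role of S in Theorem 6.3.7; the sketch below uses a small matrix with distinct eigenvalues so that a basis of eigenvectors certainly exists.

```python
import numpy as np

# A diagonalizable matrix (its eigenvalues 1 and 3 are distinct).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Columns of S are eigenvectors; lam holds the corresponding eigenvalues.
lam, S = np.linalg.eig(A)

D = np.linalg.inv(S) @ A @ S
print(np.round(D, 10))   # diagonal, with the eigenvalues on the diagonal
```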
Corollary 6.3.8 Let A be an n × n matrix with minimum polynomial

p(λ) = ∏_{i=1}^{p} (λ − μ_i)^{k_i}, the μ_i being distinct.

Then A is diagonalizable if and only if each k_i = 1.
Proof: Suppose first that A is diagonalizable and that a basis of eigenvectors is {v_1, ⋯, v_n} with Av_i = μ_i v_i. Since n ≥ p, there may be some repeats here, a μ_i going with more than one v_i. Say k_i > 1. Now consider

p̂(λ) ≡ ∏_{j=1, j≠i}^{p} (λ − μ_j)^{k_j} · (λ − μ_i)

Thus this is a monic polynomial which has smaller degree than p(λ). If you have v ∈ F^n, since this is a basis, there are scalars c_j such that v = ∑_j c_j v_j. Then p̂(A)v = 0, because each basis eigenvector is annihilated by the factor of p̂(A) corresponding to its eigenvalue. Since v is arbitrary, this shows that p̂(A) = 0, contrary to the definition of the minimum polynomial being p(λ). Thus each k_i must be 1.
Conversely, if each k_i = 1, then

F^n = ker(A − μ_1 I) ⊕ ⋯ ⊕ ker(A − μ_p I)

and you simply let β_i be a basis for ker(A − μ_i I), which consists entirely of eigenvectors by definition of what you mean by ker(A − μ_i I). Then a basis of eigenvectors consists of {β_1, β_2, ⋯, β_p} and so the matrix A is diagonalizable. ■
Example 6.3.9 The minimum polynomial for the matrix

    ⎛ 10  12  −6 ⎞
A = ⎜ −4  −4   3 ⎟
    ⎝  3   4  −1 ⎠

is λ³ − 5λ² + 8λ − 4. This factors as (λ − 2)²(λ − 1) and so the eigenvalues are 1, 2. Find the eigen-pairs. Then determine the matrix with respect to a basis of these eigenvectors if possible. If it is not possible to find a basis of eigenvectors, find a block diagonal matrix similar to the matrix.
First find the eigenvectors for 2. You need to row reduce the augmented matrix

⎛  8  12  −6  0 ⎞
⎜ −4  −6   3  0 ⎟
⎝  3   4  −3  0 ⎠

The solution space is one-dimensional, spanned by the eigenvector ( 6 −3 2 )^T. By Theorem 6.3.3, there are no other eigenvectors than those which correspond to the eigenvalues 1, 2. Thus there is no basis of eigenvectors because the span of the eigenvectors has dimension two.
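Corollary 6.3.8 predicts this failure from the exponent k = 2 on the factor (λ − 2) of the minimum polynomial. The following numpy sketch checks, in exact integer arithmetic, that (A − 2I)(A − I) ≠ 0 while (A − 2I)²(A − I) = 0, and that the eigenspace for 2 is only one-dimensional.

```python
import numpy as np

A = np.array([[10, 12, -6],
              [-4, -4, 3],
              [3, 4, -1]])
I = np.eye(3, dtype=int)

# (λ − 2)(λ − 1) does NOT annihilate A, but (λ − 2)²(λ − 1) does, so the
# minimum polynomial is (λ − 2)²(λ − 1): some k_i > 1, A is not diagonalizable.
once = (A - 2 * I) @ (A - I)
twice = (A - 2 * I) @ (A - 2 * I) @ (A - I)
print(np.any(once != 0))    # True
print(np.all(twice == 0))   # True

# Consistent with this, the eigenspace for 2 has dimension 3 − rank(A − 2I) = 1.
print(3 - np.linalg.matrix_rank(A - 2 * I))   # 1
```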
However, we can consider

ℝ³ = ker((A − 2I)²) ⊕ ker(A − I)

The second of these is just span(( 2 −1 1 )^T). What is the first? We find it by row reducing the following matrix, which is the square of A − 2I augmented with a column of zeros:

⎛ −2  0   6  0 ⎞
⎜  1  0  −3  0 ⎟
⎝ −1  0   3  0 ⎠

Row reducing this yields

⎛ 1  0  −3  0 ⎞
⎜ 0  0   0  0 ⎟
⎝ 0  0   0  0 ⎠

which says that solutions are of the form

( 3z  y  z )^T,  y, z ∈ ℝ not both 0

These are the nonzero vectors of

span(( 3 0 1 )^T, ( 0 1 0 )^T)
What is the matrix M of the restriction of A to this subspace, with respect to the ordered basis ( 3 0 1 )^T, ( 0 1 0 )^T? By definition of the matrix of a restriction, M satisfies

⎛ 24  12 ⎞   ⎛ 3  0 ⎞
⎜ −9  −4 ⎟ = ⎜ 0  1 ⎟ M                                    (6.2)
⎝  8   4 ⎠   ⎝ 1  0 ⎠

the columns on the left being A applied to the two basis vectors. Then the matrix associated with the other eigenvector is just 1. Hence the matrix with respect to the above ordered basis is

⎛  8   4  0 ⎞
⎜ −9  −4  0 ⎟
⎝  0   0  1 ⎠
So what are some convenient computations which will allow you to find M easily? Take the transpose of both sides of 6.2. Then you would have

( 24  −9  8 )       ( 3  0  1 )
( 12  −4  4 ) = M^T ( 0  1  0 )

Thus

M^T ( 0 ) = ( −9 ) ,  M^T ( 1 ) = ( 8 )
    ( 1 )   ( −4 )        ( 0 )   ( 4 )

and so

M^T = ( 8  −9 )  so  M = (  8   4 )
      ( 4  −4 )          ( −9  −4 )
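The whole block diagonal computation can be verified at once with numpy: taking S with columns ( 3 0 1 )^T, ( 0 1 0 )^T, ( 2 −1 1 )^T, the first two spanning ker((A − 2I)²) and the last the eigenvector for 1, the similar matrix S^{−1}AS should be the block diagonal matrix found above.

```python
import numpy as np

A = np.array([[10.0, 12.0, -6.0],
              [-4.0, -4.0, 3.0],
              [3.0, 4.0, -1.0]])

# Basis: two vectors spanning ker((A − 2I)²), then the eigenvector for 1.
S = np.array([[3.0, 0.0, 2.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, 1.0]])

B = np.linalg.inv(S) @ A @ S
print(np.round(B))
# top-left 2×2 block is M = (8 4; −9 −4), bottom-right entry is 1
```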
The eigenvalue problem is one of the hardest problems in algebra because of our inability to exactly
solve polynomial equations. Therefore, estimating the eigenvalues becomes very significant. In the case of
the complex field of scalars, there is a very elementary result due to Gerschgorin. It can at least give an
upper bound for the size of the eigenvalues.
Theorem 6.3.10 Let A be an n × n complex matrix. Consider the n Gerschgorin discs defined as

D_i ≡ { λ ∈ ℂ : |λ − a_{ii}| ≤ ∑_{j≠i} |a_{ij}| }

Then every eigenvalue of A is contained in some Gerschgorin disc.

This theorem says to add up the absolute values of the entries of the i^{th} row which are off the main diagonal and form the disc centered at a_{ii} having this radius. The union of these discs contains σ(A), the set of all eigenvalues of A.

Proof: Suppose Av = λv where v ≠ 0, and let k be an index for which |v_k| ≥ |v_j| for all j, so |v_k| > 0. The k^{th} entry of the equation Av = λv says ∑_j a_{kj}v_j = λv_k, so

(λ − a_{kk}) v_k = ∑_{j≠k} a_{kj} v_j

Taking absolute values and using |v_j| ≤ |v_k|,

|λ − a_{kk}| ≤ ∑_{j≠k} |a_{kj}| |v_j| / |v_k| ≤ ∑_{j≠k} |a_{kj}|

and it follows λ is contained in the k^{th} Gerschgorin disc. ■
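As a numerical sketch, the discs are cheap to compute from the rows of A. The helper below (`gerschgorin_discs` is a name introduced here for illustration) is applied to the matrix of Example 6.3.9, whose eigenvalues 1, 2, 2 must all fall in the union of the discs.

```python
import numpy as np

def gerschgorin_discs(A):
    """Return (center, radius) for each Gerschgorin disc of A."""
    A = np.asarray(A, dtype=complex)
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)  # off-diagonal row sums
    return list(zip(centers, radii))

A = np.array([[10, 12, -6],
              [-4, -4, 3],
              [3, 4, -1]])

discs = gerschgorin_discs(A)
eigenvalues = np.linalg.eigvals(A)

# Every eigenvalue lies in at least one disc.
for lam in eigenvalues:
    assert any(abs(lam - c) <= r + 1e-9 for c, r in discs)
print(discs)
```

Note the bound is often crude: here the disc centered at 10 has radius 18, far larger than any eigenvalue's distance from it.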
In the examples given above, it was possible to factor the minimum polynomial, explicitly determine eigenvalues and eigenvectors, and obtain information about whether the matrix was diagonalizable by explicit computations. Well, what if you can't factor the minimum polynomial? What then? This is the typical situation, not what was presented in the above examples. Just write down a 3 × 3 matrix and see if you can find the eigenvalues explicitly using algebra. Is there a way to determine whether a given matrix is diagonalizable in the case that the minimum polynomial factors, although you might have trouble finding the factors? Amazingly, the answer is yes. One can answer this question completely using only methods from algebra.