In the case where A has distinct positive eigenvalues it was shown above that under reasonable conditions related to a certain matrix having an LU factorization the QR algorithm produces a sequence of matrices {A_{k}} which converges to an upper triangular matrix. What if A is just an n × n matrix having possibly complex eigenvalues but A is nondefective? What happens with the QR algorithm in this case? The short answer to this question is that the A_{k} of the algorithm typically cannot converge. However, this does not mean the algorithm is not useful in finding eigenvalues. It turns out the sequence of matrices {A_{k}} has the appearance of a block upper triangular matrix for large k in the sense that the entries below the blocks on the main diagonal are small. Then looking at these blocks gives a way to approximate the eigenvalues.
First it is important to note a simple fact about unitary diagonal matrices. In what follows Λ will
denote a unitary matrix which is also a diagonal matrix. These matrices are just the identity matrix with
some of the ones replaced with a number of the form e^{iθ} for some θ. The important property of
multiplication of any matrix by Λ on either side is that it leaves all the zero entries the same and also
preserves the absolute values of the other entries. Thus a block triangular matrix multiplied by Λ on either
side is still block triangular. If the matrix is close to being block triangular this property of being close to a
block triangular matrix is also preserved by multiplying on either side by Λ. Other patterns depending only
on the size of the absolute value occurring in the matrix are also preserved by multiplying on
either side by Λ. In other words, in looking for a pattern in a matrix, multiplication by Λ is
irrelevant.
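This invariance is easy to check numerically. The following sketch (the matrix and the angles θ are arbitrary choices for illustration) verifies that multiplying on either side by a unitary diagonal matrix leaves every absolute value, and hence the zero pattern, unchanged:

```python
import numpy as np

# A unitary diagonal matrix Lam: diagonal entries e^{i*theta}, all of modulus 1.
theta = np.array([0.0, 1.2, -2.5])
Lam = np.diag(np.exp(1j * theta))

# A block upper triangular test matrix (zeros below the top 2x2 block).
B = np.array([[2.0, 3.0, 1.0],
              [5.0, -1.0, 4.0],
              [0.0, 0.0, 7.0]])

# (Lam B Lam*)_{ij} = e^{i theta_i} B_{ij} e^{-i theta_j}, so each entry is
# rescaled by a unit-modulus factor: absolute values and zeros are unchanged.
C = Lam @ B @ Lam.conj().T   # the conjugate transpose is also unitary diagonal
print(np.allclose(np.abs(C), np.abs(B)))  # True
```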
Now let A be an n × n matrix having real or complex entries. By Lemma 16.3.3 and the assumption
that A is nondefective, there exists an invertible S,
A^{k} = Q^{(k)}R^{(k)} = SD^{k}S^{−1}    (16.13)

where

D = diag(λ_{1}, ⋅⋅⋅, λ_{n})

and by rearranging the columns of S, D can be made such that

|λ_{1}| ≥ |λ_{2}| ≥ ⋅⋅⋅ ≥ |λ_{n}|.
Assume S^{−1} has an LU factorization. Then
A^{k} = SD^{k}LU = SD^{k}LD^{−k}D^{k}U.

Consider the matrix in the middle, D^{k}LD^{−k}. The ij^{th} entry is of the form

(D^{k}LD^{−k})_{ij} = { λ_{i}^{k}L_{ij}λ_{j}^{−k}  if j < i
                      { 1                          if i = j
                      { 0                          if j > i
and these all converge to 0 whenever |λ_{i}| < |λ_{j}|. Thus

D^{k}LD^{−k} = L_{k} + E_{k}
where L_{k} is a lower triangular matrix which has all ones down the diagonal and some subdiagonal terms of the form

λ_{i}^{k}L_{ij}λ_{j}^{−k}    (16.14)

for which |λ_{i}| = |λ_{j}| while E_{k} → 0. (Note the entries of L_{k} are all bounded independent of k but some may fail to converge.) Then

Q^{(k)}R^{(k)} = S(L_{k} + E_{k})D^{k}U
Let

SL_{k} = Q_{k}R_{k}    (16.15)
where this is the QR factorization of SL_{k}. Then

Q^{(k)}R^{(k)} = (Q_{k}R_{k} + SE_{k})D^{k}U
             = Q_{k}(I + Q_{k}^{∗}SE_{k}R_{k}^{−1})R_{k}D^{k}U
             = Q_{k}(I + F_{k})R_{k}D^{k}U

where F_{k} → 0. Let I + F_{k} = Q_{k}^{′}R_{k}^{′}. Then Q^{(k)}R^{(k)} = Q_{k}Q_{k}^{′}R_{k}^{′}R_{k}D^{k}U. By Lemma 16.3.4

Q_{k}^{′} → I and R_{k}^{′} → I.    (16.16)
Now let Λ_{k} be a diagonal unitary matrix which has the property that Λ_{k}^{∗}D^{k}U is an upper triangular
matrix which has all the diagonal entries positive. Then
Q^{(k)}R^{(k)} = Q_{k}Q_{k}^{′}Λ_{k}(Λ_{k}^{∗}R_{k}^{′}R_{k}Λ_{k})Λ_{k}^{∗}D^{k}U
That matrix in the middle has all positive diagonal entries because it is itself an upper triangular matrix,
being the product of such, and is similar to the matrix R_{k}^{′}R_{k} which is upper triangular with positive
diagonal entries. By Lemma 16.3.4 again, this time using the uniqueness assertion,
Q^{(k)} = Q_{k}Q_{k}^{′}Λ_{k},   R^{(k)} = (Λ_{k}^{∗}R_{k}^{′}R_{k}Λ_{k})Λ_{k}^{∗}D^{k}U
Note the term Q_{k}Q_{k}^{′}Λ_{k} must be real because the algorithm gives all Q^{(k)} as real matrices. By 16.16 it follows that for k large enough Q^{(k)} ≈ Q_{k}Λ_{k} where ≈ means the two matrices are close. Recall A_{k} = Q^{(k)T}AQ^{(k)} and so for large k,

A_{k} ≈ (Q_{k}Λ_{k})^{∗}A(Q_{k}Λ_{k}) = Λ_{k}^{∗}Q_{k}^{∗}AQ_{k}Λ_{k}

As noted above, the form of Λ_{k}^{∗}Q_{k}^{∗}AQ_{k}Λ_{k} in terms of which entries are large and small is not affected by the presence of Λ_{k} and Λ_{k}^{∗}. Thus, in considering what form this is in, it suffices to consider Q_{k}^{∗}AQ_{k}.
This could get pretty complicated but I will consider the case where
if |λ_{i}| = |λ_{i+1}|, then |λ_{i+2}| < |λ_{i+1}|.    (16.17)

This is typical of the situation where the eigenvalues are all distinct and the matrix A is real so the eigenvalues occur as conjugate pairs. Then in this case, L_{k} above is lower triangular with some nonzero terms on the diagonal right below the main diagonal but zeros everywhere else. Thus from 16.15,

Q_{k} = SL_{k}R_{k}^{−1}

where R_{k}^{−1} is upper triangular. Also recall from the definition of S in 16.13, it follows that S^{−1}AS = D.
Thus the columns of S are eigenvectors of A, the i^{th} being an eigenvector for λ_{i}. Now from the form of L_{k},
it follows L_{k}R_{k}^{−1} is a block upper triangular matrix denoted by T_{B} and so Q_{k} = ST_{B}. It follows from the above construction in 16.14 and the given assumption on the sizes of the eigenvalues that there are finitely many 2 × 2 blocks centered on the main diagonal along with possibly some diagonal entries.
Therefore, for large k the matrix A_{k} = Q^{(k)T}AQ^{(k)} is approximately of the same form as that of

Q_{k}^{∗}AQ_{k} = T_{B}^{−1}S^{−1}AST_{B} = T_{B}^{−1}DT_{B}
which is a block upper triangular matrix. As explained above, multiplication by the various diagonal
unitary matrices does not affect this form. Therefore, for large k, A_{k} is approximately a block upper
triangular matrix.
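This behavior can be observed numerically. Below is a minimal NumPy sketch of the unshifted QR iteration; the test matrix (a made-up companion matrix with eigenvalues 3 and 1 ± 2i) and the iteration count are my own choices for illustration:

```python
import numpy as np

# Unshifted QR iteration: factor A_k = Q_k R_k, then set A_{k+1} = R_k Q_k.
# Made-up real test matrix with eigenvalues 3 and 1 +/- 2i (companion matrix
# of t^3 - 5t^2 + 11t - 15).  Since |3| > |1 +/- 2i| = sqrt(5), the iterates
# should approach the form [[3, *, *], [0, a, b], [0, c, d]], where the
# 2x2 block [[a, b], [c, d]] carries the conjugate pair.
A = np.array([[0.0, 0.0, 15.0],
              [1.0, 0.0, -11.0],
              [0.0, 1.0, 5.0]])

Ak = A.copy()
for _ in range(300):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q

print(round(Ak[0, 0], 6))            # the isolated real eigenvalue, ~3.0
print(abs(Ak[1, 0]), abs(Ak[2, 0]))  # entries below the blocks: tiny
# The 2x2 block need not converge entrywise, but its eigenvalues do:
print(np.linalg.eigvals(Ak[1:, 1:]))
```

Note the (2,1) entry inside the 2 × 2 block stays bounded but need not converge, exactly as the discussion of L_{k} predicts.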
How would this change if the above assumption on the size of the eigenvalues were relaxed but the
matrix was still nondefective with appropriate matrices having an LU factorization as above? It
would mean the blocks on the diagonal would be larger. This immediately makes the problem
more cumbersome to deal with. However, in the case that the eigenvalues of A are distinct, the
above situation really is typical of what occurs and in any case can be quickly reduced to this
case.
To see this, suppose condition 16.17 is violated and λ_{j}, ⋅⋅⋅, λ_{j+p} are complex eigenvalues having nonzero imaginary parts such that each has the same absolute value but they are all distinct. Then let μ > 0 and consider the matrix A + μI. Thus the corresponding eigenvalues of A + μI are λ_{j} + μ, ⋅⋅⋅, λ_{j+p} + μ. A short computation shows |λ_{j} + μ|, ⋅⋅⋅, |λ_{j+p} + μ| are all distinct and so the above situation of 16.17 is obtained.
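The short computation is |λ + μ|^{2} = |λ|^{2} + 2μ Re(λ) + μ^{2}, so a real shift μ > 0 separates equal-modulus eigenvalues whose real parts differ; a conjugate pair, having equal real parts, keeps a shared modulus, which is exactly what condition 16.17 permits. A small check, with made-up eigenvalues of common modulus 2:

```python
# |lam + mu|^2 = |lam|^2 + 2*mu*Re(lam) + mu^2, so shifting by a real mu
# separates equal-modulus eigenvalues with distinct real parts.  Made-up
# example: two conjugate pairs, all four eigenvalues of modulus 2.
mu = 1.0
lams = [2j, -2j, 3**0.5 + 1j, 3**0.5 - 1j]

print(sorted(abs(l) for l in lams))       # four equal moduli
print(sorted(abs(l + mu) for l in lams))  # two distinct pair-moduli
```

After the shift each conjugate pair still shares a modulus, but the two pairs now have distinct moduli, so 16.17 holds for A + μI.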
Of course, if there are repeated eigenvalues, it may not be possible to reduce to the case above
and you would end up with large blocks on the main diagonal which could be difficult to deal
with.
So how do you identify the eigenvalues? You know A_{k} and observe that it is close to a block upper triangular matrix T_{B}^{′}. You know A_{k} is also similar to A. Therefore, T_{B}^{′} has eigenvalues which are close
to the eigenvalues of A_{k} and hence those of A provided k is sufficiently large. See Theorem
14.4.2 which depends on complex analysis or the exercise on Page 959 which gives another
way to see this. Thus you find the eigenvalues of this block triangular matrix T_{B}^{′} and assert
that these are good approximations of the eigenvalues of A_{k} and hence to those of A. How
do you find the eigenvalues of a block triangular matrix? This is easy from Lemma 14.1.4.
Say

T_{B}^{′} = ( B_{1}  ⋅⋅⋅  ∗   )
            (       ⋱        )
            ( 0        B_{m} )

Then forming λI − T_{B}^{′} and taking the determinant, it follows from Lemma 14.1.4 this equals

∏_{j=1}^{m} det(λI_{j} − B_{j})
and so all you have to do is take the union of the eigenvalues for each B_{j}. In the case emphasized here this
is very easy because these blocks are just 2 × 2 matrices.
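Here is a small numerical check (the matrix and its blocks are made up for illustration); for a 2 × 2 block [[a, b], [c, d]] the eigenvalues are just the quadratic formula applied to t^{2} − (a + d)t + (ad − bc):

```python
import cmath
import numpy as np

# Eigenvalues of a block upper triangular matrix are the union of the
# eigenvalues of its diagonal blocks.  For a 2x2 block this is the
# quadratic formula on t^2 - (a + d) t + (ad - bc).
def eig2(a, b, c, d):
    tr, det = a + d, a * d - b * c
    s = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + s) / 2, (tr - s) / 2]

# Made-up example: diagonal blocks [[1, 2], [-2, 1]] and [4]; the starred
# entries above the blocks (7 and -3 here) do not affect the eigenvalues.
T = np.array([[1.0, 2.0, 7.0],
              [-2.0, 1.0, -3.0],
              [0.0, 0.0, 4.0]])

union = eig2(*T[:2, :2].ravel()) + [T[2, 2]]
direct = np.linalg.eigvals(T)
key = lambda z: (complex(z).real, complex(z).imag)
print(sorted(union, key=key))
print(sorted(direct, key=key))  # same multiset of eigenvalues
```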
How do you identify approximate eigenvectors from this? First try to find the approximate eigenvectors
for A_{k}. Pick an approximate eigenvalue λ, an exact eigenvalue for T_{B}^{′}. Then find v solving T_{B}^{′}v = λv. It
follows since T_{B}^{′} is close to A_{k} that A_{k}v ≈ λv and so
Q^{(k)T}AQ^{(k)}v = A_{k}v ≈ λv

Hence

AQ^{(k)}v ≈ λQ^{(k)}v

and so Q^{(k)}v is an approximation to the eigenvector which goes with the eigenvalue of A which is close to λ.
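This recipe can be sketched as follows; the companion test matrix (eigenvalues 5 and 2 ± i) and the iteration count are my own choices for illustration:

```python
import numpy as np

# Recover an approximate eigenvector of A from the QR iteration.  Keeping
# the accumulated Q^{(k)} gives A_k = Q^{(k)T} A Q^{(k)}, so an eigenvector
# v of the nearby block triangular matrix T_B' pulls back to Q^{(k)} v.
# Made-up companion matrix of t^3 - 9t^2 + 25t - 25: eigenvalues 5, 2 +/- i.
A = np.array([[0.0, 0.0, 25.0],
              [1.0, 0.0, -25.0],
              [0.0, 1.0, 9.0]])

Ak, Qtot = A.copy(), np.eye(3)
for _ in range(300):
    Q, R = np.linalg.qr(Ak)
    Ak, Qtot = R @ Q, Qtot @ Q

# Form T_B' by zeroing the tiny entries below the diagonal blocks.
TB = Ak.copy()
TB[1, 0] = TB[2, 0] = 0.0
lam = TB[0, 0]                           # the 1x1 block: ~5.0
vals, vecs = np.linalg.eig(TB)
v = vecs[:, np.argmin(abs(vals - lam))]  # exact eigenvector of T_B'
w = Qtot @ v                             # approximate eigenvector of A
print(np.linalg.norm(A @ w - lam * w))   # small residual
```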
Example 16.3.8 Here is a matrix.

(  3    2    1 )
( −2    0   −1 )
( −2   −2    0 )

It happens that the eigenvalues of this matrix are 1, 1 + i, 1 − i. Let's apply the QR algorithm as if the eigenvalues were not known.
Applying the QR algorithm to this matrix yields the following sequence of matrices.
At this point the bottom two terms on the left part of the bottom row are both very small so it appears
the real eigenvalue is near 1.0. The complex eigenvalues are obtained from solving
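the quadratic for the top 2 × 2 block. A sketch of this computation (the iteration count is an arbitrary choice):

```python
import numpy as np

# The matrix of Example 16.3.8, with eigenvalues 1, 1 + i, 1 - i.
A = np.array([[3.0, 2.0, 1.0],
              [-2.0, 0.0, -1.0],
              [-2.0, -2.0, 0.0]])

Ak = A.copy()
for _ in range(200):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q

# Since |1 +/- i| = sqrt(2) > 1, the conjugate pair occupies the top 2x2
# block; the two left entries of the bottom row are very small and the
# (3,3) entry approximates the real eigenvalue 1.
print(abs(Ak[2, 0]), abs(Ak[2, 1]), Ak[2, 2])

# The complex eigenvalues solve the characteristic equation of the top
# 2x2 block, t^2 - (a + d) t + (ad - bc) = 0:
a, b, c, d = Ak[0, 0], Ak[0, 1], Ak[1, 0], Ak[1, 1]
print(np.roots([1.0, -(a + d), a * d - b * c]))
```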
Example 16.3.9 The equation x^{4} + x^{3} + 4x^{2} + x − 2 = 0 has exactly two real solutions. You can see this by graphing it. However, the rational root theorem from algebra shows neither of these solutions are rational. Also, graphing it does not yield any information about the complex solutions. Let's use the QR algorithm to approximate all the solutions, real and complex.
A matrix whose characteristic polynomial is the given polynomial is