The most significant theorem about eigenvalues and eigenvectors in the space of n × n complex matrices is Schur's theorem. First is a simple version of the Gram-Schmidt theorem.
Definition 7.4.1 A set of vectors {x_{1},⋅⋅⋅,x_{k}} in F^{n}, F = ℝ or ℂ, is called an orthonormal set of vectors if

x̄_{i}^{T}x_{j} = x_{i}^{∗}x_{j} = δ_{ij} ≡ { 1 if i = j
                                             0 if i ≠ j
Note this is the same as saying that (x_{i},x_{j}) = δ_{ij}, although here it will be slightly more convenient to define the inner product differently. Indeed, we are really working with the inner product ⟨x,y⟩ = x^{∗}y, whereas the usual inner product is (x,y) = x^{T}ȳ. This alternate version of the inner product is actually more convenient in matrix theory, so we use it here. The difference is that with this new version, the complex conjugate comes out of the first entry rather than the second.
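To see the distinction in computation, numpy's `vdot` implements exactly this convention, conjugating its first argument (a small illustration; the particular vectors are my own choice, not from the text):

```python
import numpy as np

x = np.array([1 + 2j, 3j])
y = np.array([2 - 1j, 1 + 1j])

# <x,y> = x^* y : the conjugate comes out of the FIRST entry.
inner = np.vdot(x, y)            # computes sum(conj(x) * y)
manual = (x.conj() * y).sum()    # the same thing, written out

plain = x @ y                    # x^T y, no conjugate: a different, bilinear pairing

print(inner, manual, plain)
```

Note that `inner` and `manual` agree, while the unconjugated product `plain` is a different complex number entirely.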
What does it mean to say that U^{∗}U = I, which is the definition for U to be unitary? This says that for

U = ( u_{1} ⋅⋅⋅ u_{n} ),    U^{∗} = ( ū_{1}^{T} )
                                    (     ⋮     )
                                    ( ū_{n}^{T} )

and so, from the way we multiply matrices, in which the ij^{th} entry of the product is the product of the i^{th} row of the matrix on the left with the j^{th} column of the matrix on the right, we have

u_{i}^{∗}u_{j} = δ_{ij}
in other words, the columns of U are orthonormal. From this simple observation, we get the following
important theorem.
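A quick numerical illustration of this observation (the unitary matrix here is manufactured from a QR factorization, a standard way to produce one; this example is not part of the text):

```python
import numpy as np

rng = np.random.default_rng(0)
# The QR factorization of a full-rank complex matrix yields a unitary Q.
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(M)

# Columns are orthonormal: u_i^* u_j = delta_ij, i.e. Q^* Q = I.
gram = Q.conj().T @ Q
print(np.allclose(gram, np.eye(3)))
```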
Theorem 7.4.2 Let {u_{1},⋅⋅⋅,u_{n}} be orthonormal. Then it is a linearly independent set.
Proof: We know from the above discussion that the matrix U = ( u_{1} ⋅⋅⋅ u_{n} ) is unitary. Thus if Ux = 0, you can multiply both sides on the left by U^{∗} and obtain x = U^{∗}Ux = U^{∗}0 = 0. Thus, from the definition of linear independence, Definition 6.1.5, it follows that the columns of U comprise a linearly independent set of vectors. ■
Theorem 7.4.3 Let v_{1} be a unit vector (|v_{1}| = 1) in F^{n}, n > 1. Then there exist vectors {v_{2},⋅⋅⋅,v_{n}} such that {v_{1},⋅⋅⋅,v_{n}} is an orthonormal set of vectors.
Proof: The equation for x, v_{1}^{∗}x = 0, has a nonzero solution x by Theorem 6.1.4. Pick such a solution and divide by its magnitude to get v_{2}, a unit vector such that v_{1}^{∗}v_{2} = 0. Now suppose v_{1},⋅⋅⋅,v_{k} have been chosen such that {v_{1},⋅⋅⋅,v_{k}} is an orthonormal set of vectors. Then consider the equations

v_{j}^{∗}x = 0,  j = 1,2,⋅⋅⋅,k
This amounts to the situation of Theorem 6.1.4 in which there are more variables than equations.
Therefore, by this theorem, there exists a nonzero x solving all these equations. Divide by its magnitude
and this gives v_{k+1}. Continue this way. At the last step, you obtain v_{n} and the resulting set is an
orthonormal set. ■
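The proof is constructive, and the construction can be sketched numerically. In the sketch below, the function name is mine and an SVD stands in for Theorem 6.1.4 as a way of producing a unit-length solution of the equations v_{j}^{∗}x = 0:

```python
import numpy as np

def extend_to_orthonormal(V):
    """Given an n x k matrix V with orthonormal columns, append columns
    one at a time until they form an orthonormal basis of C^n
    (a sketch of the proof of Theorem 7.4.3)."""
    V = V.astype(complex)
    n = V.shape[0]
    while V.shape[1] < n:
        # Solve v_j^* x = 0 for every current column v_j: x spans null(V^*).
        _, _, Vh = np.linalg.svd(V.conj().T, full_matrices=True)
        x = Vh[-1].conj()                 # a unit vector in the null space
        V = np.column_stack([V, x])
    return V

v1 = np.array([[1j], [0], [0]], dtype=complex)   # a unit vector in C^3
B = extend_to_orthonormal(v1)
print(np.allclose(B.conj().T @ B, np.eye(3)))    # orthonormal basis
```

The SVD hands back a unit null-space vector directly, which is why no separate normalization step is needed.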
Thus, as observed above, the matrix ( v_{1} ⋅⋅⋅ v_{n} ) is a unitary matrix. With this preparation, here is Schur's theorem. First is some terminology. An n × n matrix T is called upper triangular if it is of the form

( ∗  ⋅⋅⋅  ∗ )
|     ⋱   ⋮ |
( 0        ∗ )

meaning that all entries are zero below the main diagonal, which consists of those entries of the form T_{ii}.
Theorem 7.4.4 Let A be a real or complex n × n matrix. Then there exists a unitary matrix U such that

U^{∗}AU = T,    (7.3)

where T is an upper triangular matrix. If A has all real entries and eigenvalues, then U can be chosen to be orthogonal.
Proof: The theorem is clearly true if A is a 1 × 1 matrix. Just let U = 1, the 1 × 1 matrix which has 1 down the main diagonal and zeros elsewhere. Suppose it is true for (n − 1) × (n − 1) matrices and let A
be an n × n matrix. Then let v_{1} be a unit eigenvector for A. That is, there exists λ_{1} such that

Av_{1} = λ_{1}v_{1},  |v_{1}| = 1.

By Theorem 7.4.3, there exist vectors v_{2},⋅⋅⋅,v_{n} such that {v_{1},⋅⋅⋅,v_{n}} is an orthonormal set in ℂ^{n}. Let U_{0} be the matrix whose i^{th} column is v_{i}. Then from the above, it follows U_{0} is unitary. Then from the way you multiply matrices, U_{0}^{∗}AU_{0} is of the form

( λ_{1}  a     )
( 0      A_{1} )

where a is a 1 × (n − 1) row vector and A_{1} is an (n − 1) × (n − 1) matrix. (The first column of AU_{0} is Av_{1} = λ_{1}v_{1}, and v_{i}^{∗}v_{1} = δ_{i1}.) By induction, there exists an (n − 1) × (n − 1) unitary matrix Ũ_{1} such that Ũ_{1}^{∗}A_{1}Ũ_{1} = T_{1}, an upper triangular matrix. Let

U_{1} = ( 1  0     )
        ( 0  Ũ_{1} )

Then U_{1} is unitary, and

U_{1}^{∗}(U_{0}^{∗}AU_{0})U_{1} = ( λ_{1}  aŨ_{1} ) ≡ T
                                   ( 0      T_{1}  )

where T is upper triangular. Then let U = U_{0}U_{1}. Both of the U_{i} are unitary and so U must also be unitary. Indeed,

U^{∗}U = (U_{0}U_{1})^{∗}U_{0}U_{1} = U_{1}^{∗}U_{0}^{∗}U_{0}U_{1} = U_{1}^{∗}U_{1} = I.
Then U^{∗}AU = T.
If A is real having real eigenvalues, all of the above can be accomplished using the real dot product and
using real eigenvectors. Thus the unitary matrix can be assumed real. ■
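The induction in this proof translates directly into a recursive computation. The following sketch (my own illustration, not a numerically robust algorithm; production code would use a library Schur routine) mirrors the proof step by step: find a unit eigenvector, extend it to an orthonormal basis via QR, and recurse on the (n − 1) × (n − 1) block:

```python
import numpy as np

def schur_unitary(A):
    """Return a unitary U with U^* A U upper triangular,
    following the inductive proof of Theorem 7.4.4."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)
    # A unit eigenvector v1 with A v1 = lam1 v1.
    _, V = np.linalg.eig(A)
    v1 = V[:, 0] / np.linalg.norm(V[:, 0])
    # Extend v1 to an orthonormal basis: QR of [v1 | I] gives a unitary U0
    # whose first column is v1 up to a unit scalar.
    U0, _ = np.linalg.qr(np.column_stack([v1, np.eye(n)]))
    B = U0.conj().T @ A @ U0                  # [[lam1, a], [0, A1]] up to roundoff
    U1 = np.eye(n, dtype=complex)
    U1[1:, 1:] = schur_unitary(B[1:, 1:])     # induction on the smaller block
    return U0 @ U1

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
U = schur_unitary(A)
T = U.conj().T @ A @ U
print(np.allclose(np.tril(T, -1), 0, atol=1e-8))   # below-diagonal entries vanish
```

The diagonal of the resulting T holds the eigenvalues of A, as discussed after the proof.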
The diagonal entries of T are each eigenvalues of A. This will become clear later when we discuss the
determinant and the characteristic polynomial. However, it is clear right now that T and A have the same
eigenvalues. If Tx = λx for nonzero x, then

U^{∗}AUx = λx = λU^{∗}Ux
U^{∗}(AUx − λUx) = 0

Now multiply both sides on the left by U and obtain AUx = λUx, so Ux is an eigenvector for A. It is nonzero because U preserves lengths. Similar reasoning shows that every eigenvalue of A is an eigenvalue of T.
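A small numerical sanity check of this equivalence, building A = UTU^{∗} from a chosen triangular T (the particular numbers are mine):

```python
import numpy as np

T = np.array([[2.0, 5.0],
              [0.0, 3.0]])            # upper triangular; diagonal entries 2 and 3
theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a real unitary (orthogonal) matrix
A = U @ T @ U.T                        # then U^* A U = T

print(np.sort(np.linalg.eigvals(A).real))         # the eigenvalues 2 and 3 of T
```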
The following result is about Hermitian matrices. These are those matrices for which the upper
triangular matrix in Schur’s theorem is actually a real diagonal matrix.
Definition 7.4.5 An n × n matrix A is Hermitian if A = A^{∗}. Thus a real symmetric matrix is Hermitian, but so is

( 1      1 − i   3 )
( 1 + i  2       i )
( 3      −i      1 )

In this book, we are mainly interested in real symmetric matrices.
The next theorem is the main result.
Theorem 7.4.6 If A is an n × n Hermitian matrix, there exists a unitary matrix U such that

U^{∗}AU = D    (7.4)

where D is a real diagonal matrix. That is, D has nonzero entries only on the main diagonal and these are real. Furthermore, the columns of U are an orthonormal basis of eigenvectors for ℂ^{n}. If A is real and symmetric, then U can be assumed to be a real orthogonal matrix and the columns of U form an orthonormal basis for ℝ^{n}. Furthermore, if A is an n × n matrix and there is a unitary matrix U such that U^{∗}AU = D where D is real and diagonal, then A is Hermitian.
Proof: From Schur’s theorem above, there exists U unitary (real and orthogonal if A is real) such
that
U ∗AU = T
where T is an upper triangular matrix. Then from the rules for the adjoint,

T^{∗} = (U^{∗}AU)^{∗} = U^{∗}A^{∗}U = U^{∗}AU = T.

Thus T = T^{∗} and T is upper triangular. This can only happen if T is really a diagonal matrix having real entries on the main diagonal. (If i ≠ j, one of T_{ij} or T_{ji} equals zero. But T_{ij} equals the conjugate of T_{ji}, and so they are both zero. Also T_{ii} equals its own conjugate, so each T_{ii} is real.)
Finally, let

U = ( u_{1}  u_{2}  ⋅⋅⋅  u_{n} )

where the u_{i} denote the columns of U, and

D = ( λ_{1}       0     )
    (       ⋱          )
    ( 0           λ_{n} )

The equation U^{∗}AU = D implies

AU = ( Au_{1}  Au_{2}  ⋅⋅⋅  Au_{n} ) = UD = ( λ_{1}u_{1}  λ_{2}u_{2}  ⋅⋅⋅  λ_{n}u_{n} )

where the entries denote the columns of AU and UD respectively. Therefore, Au_{i} = λ_{i}u_{i} and, since the matrix is unitary, the ij^{th} entry of U^{∗}U equals δ_{ij} and so

δ_{ij} = ū_{i}^{T}u_{j} = u_{i}^{∗}u_{j} = ⟨u_{i},u_{j}⟩
This proves the first claims of the theorem because it shows the vectors {u_{i}} form an orthonormal basis of eigenvectors. In case A is real and symmetric, simply ignore all complex conjugations in the above argument.
Finally, suppose that U^{∗}AU = D where D is real and diagonal. Thus D^{∗} = D. Then

A = UDU^{∗}

Thus A^{∗} = UD^{∗}U^{∗} = UDU^{∗} = A. This last step uses the fact that (AB)^{∗} = B^{∗}A^{∗}. ■
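As an illustration, the Hermitian matrix from Definition 7.4.5 can be diagonalized with numpy's `eigh`, which assumes its argument is Hermitian and returns real eigenvalues together with orthonormal eigenvectors:

```python
import numpy as np

A = np.array([[1,      1 - 1j, 3 ],
              [1 + 1j, 2,      1j],
              [3,      -1j,    1 ]])
assert np.allclose(A, A.conj().T)        # A is Hermitian

lam, U = np.linalg.eigh(A)               # real eigenvalues, unitary U
D = U.conj().T @ A @ U                   # should be diag(lam), and real

print(np.allclose(D, np.diag(lam)))
print(np.allclose(D.imag, 0, atol=1e-10))
```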
Example 7.4.7 Here is a symmetric matrix which has eigenvalues 6, −12, 18:

A = (  1   −4   13 )
    ( −4   10   −4 )
    ( 13   −4    1 )
Find a matrix U such that U^{T}AU is a diagonal matrix.
From the above explanation, the columns of this matrix U are eigenvectors of unit length, and in fact this is sufficient to obtain the matrix. After doing row operations and then normalizing the vectors, you obtain
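The hand computation can be checked numerically with numpy's `eigh` (the order of the columns of U and their signs may differ from a hand computation, but U^{T}AU is diagonal either way):

```python
import numpy as np

A = np.array([[ 1.0, -4.0, 13.0],
              [-4.0, 10.0, -4.0],
              [13.0, -4.0,  1.0]])

lam, U = np.linalg.eigh(A)        # eigenvalues in ascending order: -12, 6, 18
print(lam)
print(np.allclose(U.T @ A @ U, np.diag(lam)))   # U^T A U is diagonal
```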