Recall that a symmetric matrix is a real n × n matrix A such that A^{T} = A. One nice
thing about symmetric matrices is that they have only real eigenvalues.
Proposition 19.8.1 Suppose A is a real symmetric matrix. Then all eigenvalues are real.
Proof: Suppose Ax = λx with x ≠ 0, where x may have complex entries. Then

x̄^{T}Ax = x̄^{T}λx = λx̄^{T}x = λx^{T}x̄

Here x̄ denotes the entry-wise complex conjugate of x. The last step happens because both x^{T}x̄ and x̄^{T}x are the sum of the entries of x times the conjugates of these entries. Also x̄^{T}Ax is some complex number, a 1 × 1 matrix, and so it equals its transpose. Thus, since A = A^{T}, and since conjugating Ax = λx gives Ax̄ = λ̄x̄,

x̄^{T}Ax = (x̄^{T}Ax)^{T} = x^{T}A^{T}x̄ = x^{T}Ax̄ = x^{T}λ̄x̄ = λ̄x^{T}x̄

Since x ≠ 0, x^{T}x̄ is a positive real number. Hence, the above shows that λ = λ̄, so λ is real.
■
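The proposition is easy to illustrate numerically. The following sketch (my own illustration, not from the text; the variable names are arbitrary) symmetrizes a random matrix and checks that the computed eigenvalues have no imaginary part beyond floating-point roundoff.

```python
import numpy as np

# Build a random real symmetric matrix A = (B + B^T)/2 and check that
# every eigenvalue has (numerically) zero imaginary part.
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                      # symmetric: A.T equals A

eigenvalues = np.linalg.eigvals(A)
max_imag = float(np.max(np.abs(eigenvalues.imag)))
```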
Next is a simple version of the Gram-Schmidt theorem.
Definition 19.8.2 A set of vectors {x_{1},⋅⋅⋅,x_{k}} in ℝ^{n} is called an orthonormal set of vectors if

x_{i}^{T}x_{j} = δ_{ij} ≡ \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i ≠ j \end{cases}

Note this is the same as saying that (x_{i},x_{j}) = x_{i}⋅ x_{j} = δ_{ij}.
What does it mean to say that U^{T}U = I, which is the definition for U to be orthogonal? This says that for

U = \begin{pmatrix} u_{1} & \cdots & u_{n} \end{pmatrix}, \qquad U^{T} = \begin{pmatrix} u_{1}^{T} \\ \vdots \\ u_{n}^{T} \end{pmatrix}

and so from the way we multiply matrices, in which the ij^{th} entry of the product is the product of the i^{th} row of the matrix on the left with the j^{th} column of the matrix on the right, we have

u_{i}^{T}u_{j} = δ_{ij}

in other words, the columns of U are orthonormal. From this simple observation, we get
the following important theorem.
Theorem 19.8.3 Let {u_{1},⋅⋅⋅,u_{n}} be orthonormal. Then it is linearly independent.
Proof: We know from the above discussion that

U = \begin{pmatrix} u_{1} & \cdots & u_{n} \end{pmatrix}
is orthogonal. Thus if Ux = 0, you can multiply on the left on both sides with U^{T} and
obtain x = U^{T}Ux = U^{T}0 = 0. Thus, from the definition of linear independence,
Definition 18.10.1, it follows that the columns of U comprise an independent set of
vectors. ■
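Both observations are cheap to verify numerically. This sketch is my own illustration, not from the text; a QR factorization is used here only as a convenient source of a matrix with orthonormal columns.

```python
import numpy as np

# Get a matrix U with orthonormal columns from a QR factorization of a
# random matrix.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# u_i^T u_j = delta_ij, i.e. U^T U = I:
gram_error = float(np.max(np.abs(U.T @ U - np.eye(4))))

# Orthonormal columns are linearly independent, so U has full rank:
rank = int(np.linalg.matrix_rank(U))
```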
Theorem 19.8.4 Let v_{1} be a unit vector (|v_{1}| = 1) in ℝ^{n}, n > 1. Then there exist vectors {v_{2},⋅⋅⋅,v_{n}} such that {v_{1},v_{2},⋅⋅⋅,v_{n}} is an orthonormal set of vectors.
Proof: The equation for x, v_{1}^{T}x = 0, has a nonzero solution x by Theorem 18.10.3. Pick such a solution and divide by its magnitude to get a unit vector v_{2} such that v_{1}^{T}v_{2} = 0. Now suppose v_{1},⋅⋅⋅,v_{k} have been chosen such that {v_{1},⋅⋅⋅,v_{k}} is an orthonormal set of vectors. Then consider the equations

v_{j}^{T}x = 0, \quad j = 1,2,⋅⋅⋅,k
This amounts to the situation of Theorem 18.10.3 in which there are more variables than
equations. Therefore, by this theorem, there exists a nonzero x solving all these
equations. Divide by its magnitude and this gives v_{k+1}. Continue this way.
At the last step, you obtain v_{n} and the resulting set is an orthonormal set.
■
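The proof's procedure can be sketched in code. This is my own illustration under stated assumptions: an SVD stands in for solving the underdetermined system v_{j}^{T}x = 0 by row operations, and the function name is hypothetical.

```python
import numpy as np

def extend_to_orthonormal_basis(v1):
    """Follow the proof of Theorem 19.8.4: given a unit vector v1 in R^n,
    repeatedly pick a unit vector orthogonal to everything chosen so far."""
    n = v1.shape[0]
    basis = [v1]
    for _ in range(n - 1):
        V = np.vstack(basis)            # k x n; rows are v_1^T, ..., v_k^T
        # The last right-singular vectors span the solutions of V x = 0;
        # any one of them serves as the next basis vector.
        _, _, Vt = np.linalg.svd(V)
        x = Vt[-1]                      # nonzero solution of V x = 0
        basis.append(x / np.linalg.norm(x))
    return np.column_stack(basis)       # the matrix (v_1 v_2 ... v_n)

v1 = np.array([3.0, 0.0, 4.0]) / 5.0    # a unit vector in R^3
Q = extend_to_orthonormal_basis(v1)
orthogonality_error = float(np.max(np.abs(Q.T @ Q - np.eye(3))))
```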
Thus, as observed above, the matrix

\begin{pmatrix} v_{1} & \cdots & v_{n} \end{pmatrix}

is an orthogonal matrix. With
this preparation, here is a major result. It is actually a specialization of a much more
interesting theorem. See any of my linear algebra books under the topic of Schur’s
theorem.
Theorem 19.8.5 Let A be a real symmetric matrix. Then there is an orthogonal matrix U such that

U^{T}AU = D

where D is a diagonal matrix having the real eigenvalues of A down the diagonal. Also, the columns of U are an orthonormal set of eigenvectors.
Proof:This is obviously true if A is a 1 × 1 matrix. Indeed, you let U = 1 and it all
works because in this case A is already a diagonal matrix. Suppose then that the theorem
is true for any k < n and let A be a real n × n symmetric matrix. Then by the
fundamental theorem of algebra, there exists a solution λ to the characteristic
equation
det(A − λI ) = 0.
Then since A − λI has no inverse, it follows that its columns are dependent and so there exists a nonzero vector u such that (A − λI)u = 0, and from Proposition 19.8.1, λ is real. Dividing this vector by its magnitude, we can assume that u is a unit vector. Then by Theorem 19.8.4, there exist vectors v_{2},⋅⋅⋅,v_{n} such that {u,v_{2},⋅⋅⋅,v_{n}} is an orthonormal set of vectors.
As observed above, if

U = \begin{pmatrix} u & v_{2} & \cdots & v_{n} \end{pmatrix}
it follows that U is an orthogonal matrix. Now consider U^{T}AU. From the way we
multiply matrices, this is
\begin{pmatrix} u^{T} \\ v_{2}^{T} \\ \vdots \\ v_{n}^{T} \end{pmatrix} A \begin{pmatrix} u & v_{2} & \cdots & v_{n} \end{pmatrix} = \begin{pmatrix} u^{T} \\ v_{2}^{T} \\ \vdots \\ v_{n}^{T} \end{pmatrix} \begin{pmatrix} Au & Av_{2} & \cdots & Av_{n} \end{pmatrix} = \begin{pmatrix} u^{T} \\ v_{2}^{T} \\ \vdots \\ v_{n}^{T} \end{pmatrix} \begin{pmatrix} λu & Av_{2} & \cdots & Av_{n} \end{pmatrix}
Now recall the way we multiply matrices, in which the ij^{th} entry is the product of the i^{th} row on the left with the j^{th} column on the right. Thus, since these columns of U are orthonormal, the above product reduces to something of the form

\begin{pmatrix} λ & a^{T} \\ 0 & A_{1} \end{pmatrix}

where A_{1} is an (n − 1) × (n − 1) matrix. Summarizing, there is an orthogonal matrix U such that

U^{T}AU = \begin{pmatrix} λ & a^{T} \\ 0 & A_{1} \end{pmatrix}
I claim that a = 0. To see this, take the transpose of both sides, using the symmetry of A, to obtain

\begin{pmatrix} λ & a^{T} \\ 0 & A_{1} \end{pmatrix} = U^{T}AU = (U^{T}AU)^{T} = \begin{pmatrix} λ & 0^{T} \\ a & A_{1}^{T} \end{pmatrix}

Thus a = 0 as claimed. This also shows that A_{1} = A_{1}^{T}, so A_{1} is an (n − 1) × (n − 1) real symmetric matrix. Now by induction, there is an orthogonal matrix Û such that

Û^{T}A_{1}Û = D_{1}

where D_{1} is a diagonal matrix. Now note that Ũ ≡ U \begin{pmatrix} 1 & 0^{T} \\ 0 & Û \end{pmatrix} is orthogonal, being a product of orthogonal matrices, and from the way we multiply matrices,

Ũ^{T}AŨ = \begin{pmatrix} 1 & 0^{T} \\ 0 & Û^{T} \end{pmatrix} \begin{pmatrix} λ & 0^{T} \\ 0 & A_{1} \end{pmatrix} \begin{pmatrix} 1 & 0^{T} \\ 0 & Û \end{pmatrix} = \begin{pmatrix} λ & 0^{T} \\ 0 & D_{1} \end{pmatrix}
which is a diagonal matrix. This shows the first part. Now if
U = \begin{pmatrix} u_{1} & u_{2} & \cdots & u_{n} \end{pmatrix}, \qquad D = \begin{pmatrix} λ_{1} & & 0 \\ & \ddots & \\ 0 & & λ_{n} \end{pmatrix}, \qquad U^{T}AU = D
then, multiplying on both sides by U,
AU = U D
and so, from the way we multiply matrices, this yields
AU = \begin{pmatrix} Au_{1} & Au_{2} & \cdots & Au_{n} \end{pmatrix} = UD = \begin{pmatrix} λ_{1}u_{1} & λ_{2}u_{2} & \cdots & λ_{n}u_{n} \end{pmatrix}
which shows that Au_{j} = λ_{j}u_{j} for each j. This shows the columns of U form an
orthonormal set of eigenvectors and the diagonal entries of D are the eigenvalues of A.
■
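In practice the matrices U and D of Theorem 19.8.5 are computed with a library routine rather than by hand; NumPy's eigh is designed for real symmetric matrices. A small sketch (my own illustration, with a random symmetric matrix rather than one from the text):

```python
import numpy as np

# eigh returns the eigenvalues of a real symmetric matrix in ascending
# order, together with an orthogonal matrix U of unit eigenvectors.
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                      # real symmetric

eigenvalues, U = np.linalg.eigh(A)
D = U.T @ A @ U                        # Theorem 19.8.5: this is diagonal

diag_error = float(np.max(np.abs(D - np.diag(eigenvalues))))
orth_error = float(np.max(np.abs(U.T @ U - np.eye(4))))
```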
Example 19.8.6 Here is a symmetric matrix which has eigenvalues 6, −12, 18:

A = \begin{pmatrix} 1 & −4 & 13 \\ −4 & 10 & −4 \\ 13 & −4 & 1 \end{pmatrix}

Find a matrix U such that U^{T}AU is a diagonal matrix.
From the above explanation, the columns of this matrix U are eigenvectors of A of unit length, and in fact this is sufficient to obtain the matrix. After doing row operations and then normalizing the vectors, you obtain