Every square matrix is unitarily similar to an upper triangular matrix. This is
Schur’s theorem and it is the most important theorem in the spectral theory of matrices. The
key result which makes this theorem possible is the Gram-Schmidt procedure of Lemma
11.4.14.
Definition 14.1.1 An n × n matrix U is unitary if UU^{∗} = I = U^{∗}U where U^{∗} is defined to be the transpose of the conjugate of U. Thus (U^{∗})_{ij} = \overline{U_{ji}}. Note that every real orthogonal matrix, meaning Q^{T}Q = I, is unitary. For A any matrix, A^{∗}, just defined as the conjugate of the transpose, is called the adjoint. As shown above, this is also defined by
(Ax,y) = (x,A^{∗}y)
Note that if
U = \begin{pmatrix} v_{1} & ⋯ & v_{n} \end{pmatrix}
where the v_{k} are orthonormal vectors in ℂ^{n}, then U is
unitary. This follows because the ij^{th} entry of U^{∗}U is \overline{v_{i}}^{T}v_{j} = δ_{ij} since the v_{i} are assumed
orthonormal, and for a square matrix U^{∗}U = I implies UU^{∗} = I.
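The role of the conjugate in this definition can be checked numerically. The following is a minimal sketch in plain Python; the matrix `U` and the helper names `adjoint`, `transpose`, `matmul`, and `is_identity` are illustrative choices, not part of the text. The two columns of `U` are orthonormal in ℂ^{2}, so both products with the adjoint give the identity, while the plain transpose does not.

```python
# Two orthonormal columns in C^2, chosen only for illustration.
s = 2 ** -0.5
U = [[s, s],
     [1j * s, -1j * s]]

def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def transpose(M):
    """Transpose without conjugation."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_identity(M, tol=1e-12):
    """True when every entry of M is within tol of the identity matrix."""
    n = len(M)
    return all(abs(M[i][j] - (i == j)) < tol for i in range(n) for j in range(n))

print(is_identity(matmul(adjoint(U), U)))    # tests U*U = I
print(is_identity(matmul(U, adjoint(U))))    # tests UU* = I
print(is_identity(matmul(transpose(U), U)))  # tests U^T U = I, which fails here
```

The last line illustrates why the definition uses the conjugate transpose: for complex columns, the entries of U^{T}U are not the inner products of the columns.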
Lemma 14.1.2 The following holds: (AB)^{∗} = B^{∗}A^{∗}.
Proof: Using the definition in terms of inner products, for all x and y,
(x,(AB)^{∗}y) = (ABx,y) = (Bx,A^{∗}y) = (x,B^{∗}A^{∗}y).
Since x is arbitrary, (AB)^{∗}y = B^{∗}A^{∗}y which shows the result since y is arbitrary.
■
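As a quick sanity check of the lemma, the sketch below compares (AB)^{∗} with B^{∗}A^{∗} for one pair of 2 × 2 complex matrices. The matrices and helper names are assumptions made for this example only. It also shows that the order of the factors matters, since A^{∗}B^{∗} differs from (AB)^{∗} for this pair.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Example matrices (hypothetical values).
A = [[1 + 2j, 3], [0, 1j]]
B = [[2, 1j], [1 - 1j, 4]]

lhs = adjoint(matmul(A, B))                     # (AB)*
rhs = matmul(adjoint(B), adjoint(A))            # B*A*
wrong_order = matmul(adjoint(A), adjoint(B))    # A*B*, which equals (BA)*

print(close(lhs, rhs))          # the identity of the lemma
print(close(lhs, wrong_order))  # fails for this pair because AB != BA
```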
Theorem 14.1.3 Let A be an n × n matrix. Then there exists a unitary matrix U such that
U^{∗}AU = T, (14.1)
where T is an upper triangular matrix having the eigenvalues of A on the main diagonal listed according to multiplicity as roots of the characteristic equation. If A is a real matrix having all real eigenvalues, then U can be chosen to be an orthogonal real matrix.
Proof: The theorem is clearly true if A is a 1 × 1 matrix. Just let U = 1, the 1 × 1 matrix which has
entry 1. Suppose it is true for (n − 1) × (n − 1) matrices, n ≥ 2, and let A be an n × n matrix. Let v_{1}
be a unit eigenvector for A. Then there exists λ_{1} such that
Av_{1} = λ_{1}v_{1}, |v_{1}| = 1.
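Extend {v_{1}} to a basis and then use the Gram-Schmidt process or Theorem 13.2.6 to obtain
{v_{1}, ⋯, v_{n}}, an orthonormal basis of ℂ^{n}. Let U_{0} be a matrix whose i^{th} column is v_{i} so that U_{0} is
unitary. Consider U_{0}^{∗}AU_{0}. Its first column is U_{0}^{∗}Av_{1} = λ_{1}U_{0}^{∗}v_{1} = λ_{1}e_{1}, where e_{1} is the first standard basis vector, so U_{0}^{∗}AU_{0} is of the form
\begin{pmatrix} λ_{1} & a \\ 0 & A_{1} \end{pmatrix}
where A_{1} is an (n − 1) × (n − 1) matrix. By induction, there exists an (n − 1) × (n − 1) unitary matrix \tilde{U}_{1} such that \tilde{U}_{1}^{∗}A_{1}\tilde{U}_{1} = T_{n−1}, an upper triangular matrix. Let
U_{1} = \begin{pmatrix} 1 & 0 \\ 0 & \tilde{U}_{1} \end{pmatrix}.
Block multiplication shows that U_{1} is unitary and that
U_{1}^{∗}U_{0}^{∗}AU_{0}U_{1} = \begin{pmatrix} 1 & 0 \\ 0 & \tilde{U}_{1}^{∗} \end{pmatrix} \begin{pmatrix} λ_{1} & a \\ 0 & A_{1} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \tilde{U}_{1} \end{pmatrix} = \begin{pmatrix} λ_{1} & a\tilde{U}_{1} \\ 0 & T_{n−1} \end{pmatrix} = T
where T is upper triangular. Then let U = U_{0}U_{1}. It is clear that this is unitary because both matrices
preserve distance. Therefore, so does the product and hence U. Alternatively,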
I = U_{0}U_{1}U_{1}^{∗}U_{0}^{∗} = (U_{0}U_{1})(U_{0}U_{1})^{∗}
and so it follows that A is similar to T and that U_{0}U_{1} is unitary. Hence A and T have the same
characteristic polynomials, and therefore the same eigenvalues listed according to multiplicity as roots of
the characteristic equation. These are the diagonal entries of T listed with multiplicity and so this proves
the main conclusion of the theorem. In case A is real with all real eigenvalues, the above argument can be
repeated word for word using only the real dot product to show that U can be taken to be real and
orthogonal.■
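For a 2 × 2 matrix, a single pass of the induction step already produces the triangular form, which makes the construction easy to try by hand or in code. The sketch below is one possible illustration in plain Python, not a general algorithm: the function name `schur_2x2` is hypothetical, an eigenvalue is taken from the quadratic formula, and the second orthonormal vector is written down directly (in ℂ^{2} this replaces the Gram-Schmidt step, and agrees with it up to a unit scalar).

```python
import cmath

def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def schur_2x2(A, tol=1e-14):
    """Return (U, T) with U unitary and T = U*AU upper triangular, for a 2 x 2 matrix A."""
    (a, b), (c, d) = A
    # One root of the characteristic equation, from the quadratic formula.
    lam = ((a + d) + cmath.sqrt((a + d) ** 2 - 4 * (a * d - b * c))) / 2
    # A nonzero eigenvector for lam.
    if abs(b) > tol:
        v = [b, lam - a]
    elif abs(c) > tol:
        v = [lam - d, c]
    else:  # A is already diagonal, so a standard basis vector works
        v = [1, 0] if abs(lam - a) <= abs(lam - d) else [0, 1]
    norm = (abs(v[0]) ** 2 + abs(v[1]) ** 2) ** 0.5
    v1 = [complex(v[0]) / norm, complex(v[1]) / norm]
    # A unit vector orthogonal to v1; together they form an orthonormal basis of C^2.
    v2 = [-v1[1].conjugate(), v1[0].conjugate()]
    U = [[v1[0], v2[0]], [v1[1], v2[1]]]
    T = matmul(matmul(adjoint(U), A), U)
    return U, T

A = [[1, 2], [3, 2]]     # example matrix with eigenvalues 4 and -1
U, T = schur_2x2(A)
print(abs(T[1][0]) < 1e-12)   # the entry below the diagonal vanishes up to rounding
print(T[0][0], T[1][1])       # the diagonal of T carries the eigenvalues of A
```

For larger matrices the same step would be applied recursively to the lower right block, as in the proof.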
As a simple consequence of the above theorem, here is an interesting lemma.
Definition 14.1.5 An n × n matrix A is called Hermitian if A = A^{∗}. Thus a real symmetric (A = A^{T}) matrix is Hermitian.
The following is the major result about Hermitian matrices. It says that any Hermitian matrix is similar
to a diagonal matrix. We say it is unitarily similar because the matrix U in the following theorem which
gives the similarity transformation is a unitary matrix.
Theorem 14.1.6 If A is an n × n Hermitian matrix, there exists a unitary matrix U such that
U^{∗}AU = D (14.2)
where D is a real diagonal matrix. That is, D has nonzero entries only on the main diagonal and these are real. Furthermore, the columns of U are an orthonormal basis of eigenvectors for ℂ^{n}. If A is real and symmetric, then U can be assumed to be a real orthogonal matrix and the columns of U form an orthonormal basis for ℝ^{n}.
Proof: From Schur’s theorem above, there exists U unitary (real and orthogonal if A is real) such
that
U^{∗}AU = T
where T is an upper triangular matrix. Then from Lemma 14.1.2
T^{∗} = (U^{∗}AU)^{∗} = U^{∗}A^{∗}U = U^{∗}AU = T.
Thus T = T^{∗} and T is upper triangular. This can only happen if T is really a diagonal matrix having real
entries on the main diagonal. (If i ≠ j, one of T_{ij} or T_{ji} equals zero. But T_{ij} = \overline{T_{ji}} and so they are both
zero. Also T_{ii} = \overline{T_{ii}}.)
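Finally, let
U = \begin{pmatrix} u_{1} & u_{2} & ⋯ & u_{n} \end{pmatrix}
where the u_{i} denote the columns of U and
D = \begin{pmatrix} λ_{1} & & 0 \\ & ⋱ & \\ 0 & & λ_{n} \end{pmatrix}.
The equation U^{∗}AU = D implies
AU = \begin{pmatrix} Au_{1} & Au_{2} & ⋯ & Au_{n} \end{pmatrix} = UD = \begin{pmatrix} λ_{1}u_{1} & λ_{2}u_{2} & ⋯ & λ_{n}u_{n} \end{pmatrix}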
where the entries denote the columns of AU and UD respectively. Therefore, Au_{i} = λ_{i}u_{i} and since the
matrix is unitary, the ij^{th} entry of U^{∗}U equals δ_{ij} and so
δ_{ij} = \overline{u_{i}}^{T}u_{j} = \overline{u_{i}^{T}\overline{u_{j}}} = \overline{u_{i} ⋅ u_{j}}.
This proves the theorem because it shows the vectors {u_{i}} form an orthonormal basis. In case A is real
and symmetric, simply ignore all complex conjugations in the above argument.■
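A small worked instance may help make the theorem concrete. The sketch below uses an assumed 2 × 2 Hermitian matrix whose eigenvalues are 4 and 1, with unit eigenvectors computed by hand and placed in the columns of `U`. The helper names are illustrative. It then checks that U^{∗}AU is diagonal with those real entries.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# An example Hermitian matrix: it equals its own conjugate transpose.
A = [[2, 1 - 1j],
     [1 + 1j, 3]]

# Columns are unit eigenvectors for the eigenvalues 4 and 1 respectively.
r6, r3 = 6 ** 0.5, 3 ** 0.5
U = [[(1 - 1j) / r6, (1 - 1j) / r3],
     [2 / r6, -1 / r3]]

D = matmul(matmul(adjoint(U), A), U)

off_diagonal_small = abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12
diagonal_real = abs(D[0][0].imag) < 1e-12 and abs(D[1][1].imag) < 1e-12
print(off_diagonal_small, diagonal_real)
print(round(D[0][0].real, 10), round(D[1][1].real, 10))  # the two eigenvalues
```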
This theorem is particularly nice because the diagonal entries are all real. What of a matrix which is
unitarily similar to a diagonal matrix without assuming the diagonal entries are real? That is, A is an n×n
matrix with
U^{∗}AU = D.
Then this requires
U^{∗}A^{∗}U = D^{∗}
and so since the two diagonal matrices commute,
AA^{∗} = UDU^{∗}UD^{∗}U^{∗} = UDD^{∗}U^{∗} = UD^{∗}DU^{∗} = UD^{∗}U^{∗}UDU^{∗} = A^{∗}A.
The following definition describes these matrices.
Definition 14.1.7 An n × n matrix A is normal means: A^{∗}A = AA^{∗}.
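The definition is easy to test directly. The following sketch, with a hypothetical helper `is_normal` and three example matrices chosen for illustration, shows a matrix which is normal without being Hermitian, a Hermitian matrix (which is always normal), and a matrix which is not normal.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_normal(A, tol=1e-12):
    """Check A*A = AA* entrywise, up to a tolerance."""
    left, right = matmul(adjoint(A), A), matmul(A, adjoint(A))
    return all(abs(x - y) < tol for rl, rr in zip(left, right) for x, y in zip(rl, rr))

R = [[0, -1], [1, 0]]              # rotation by 90 degrees: normal, not Hermitian
H = [[2, 1 - 1j], [1 + 1j, 3]]     # Hermitian, hence normal
N = [[1, 2], [3, 2]]               # neither symmetric nor normal

print(is_normal(R), is_normal(H), is_normal(N))
```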
We just showed that if A is unitarily similar to a diagonal matrix, then it is normal. The converse is also
true. This involves the following lemma.
Lemma 14.1.8 If T is upper triangular and normal, then T is a diagonal matrix. If A is normal and U is unitary, then U^{∗}AU is also normal.
Proof: This is obviously true if T is 1 × 1. In fact, it can’t help being diagonal in this case. Suppose
then that the lemma is true for (n − 1) × (n − 1) matrices and let T be an upper triangular normal n × n
matrix. Thus T is of the form
T = \begin{pmatrix} t_{11} & a^{∗} \\ 0 & T_{1} \end{pmatrix}, T^{∗} = \begin{pmatrix} \overline{t_{11}} & 0^{T} \\ a & T_{1}^{∗} \end{pmatrix}
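Then
TT^{∗} = \begin{pmatrix} t_{11} & a^{∗} \\ 0 & T_{1} \end{pmatrix} \begin{pmatrix} \overline{t_{11}} & 0^{T} \\ a & T_{1}^{∗} \end{pmatrix} = \begin{pmatrix} |t_{11}|^{2} + a^{∗}a & a^{∗}T_{1}^{∗} \\ T_{1}a & T_{1}T_{1}^{∗} \end{pmatrix}
T^{∗}T = \begin{pmatrix} \overline{t_{11}} & 0^{T} \\ a & T_{1}^{∗} \end{pmatrix} \begin{pmatrix} t_{11} & a^{∗} \\ 0 & T_{1} \end{pmatrix} = \begin{pmatrix} |t_{11}|^{2} & \overline{t_{11}}a^{∗} \\ at_{11} & aa^{∗} + T_{1}^{∗}T_{1} \end{pmatrix}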
Since these two matrices are equal, comparing the top left entries gives a^{∗}a = |a|^{2} = 0, so a = 0. But now it follows that T_{1}^{∗}T_{1} = T_{1}T_{1}^{∗} and so by
induction T_{1} is a diagonal matrix D_{1}. Therefore,
T = \begin{pmatrix} t_{11} & 0^{T} \\ 0 & D_{1} \end{pmatrix},
a diagonal matrix.
As to the last claim, let A be normal. Then
(U^{∗}AU)^{∗}(U^{∗}AU) = U^{∗}A^{∗}UU^{∗}AU = U^{∗}A^{∗}AU = U^{∗}AA^{∗}U = U^{∗}AUU^{∗}A^{∗}U = (U^{∗}AU)(U^{∗}AU)^{∗} ■
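The key step in the first part of the proof is that the top left entries of TT^{∗} and T^{∗}T differ by |a|^{2}. The sketch below checks this on one assumed upper triangular 2 × 2 matrix with off diagonal entry 2; the matrix and helper names are examples only.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Upper triangular with a nonzero entry above the diagonal (example values).
T = [[1, 2],
     [0, 3]]

TTs = matmul(T, adjoint(T))   # T T*
TsT = matmul(adjoint(T), T)   # T* T

# The top left entries differ by the squared size of the off diagonal part.
gap = TTs[0][0] - TsT[0][0]
print(gap)
print(TTs == TsT)   # a nonzero gap means T is not normal
```

Here the gap equals |2|^{2} = 4, so this T cannot be normal, in agreement with the lemma.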
Theorem 14.1.9 An n × n matrix is unitarily similar to a diagonal matrix if and only if it is normal.
Proof: It was already shown above that if A is unitarily similar to a diagonal matrix then it is
normal. Suppose now that A is normal. By Schur’s theorem, there is a unitary matrix U such
that
U^{∗}AU = T
where T is upper triangular. By Lemma 14.1.8, T is normal and, since it is upper triangular, it is a
diagonal matrix. ■
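To see the theorem at work on a matrix that is normal but not Hermitian, consider rotation by 90 degrees in the plane. The sketch below is an illustration under the assumption that the unit eigenvectors (1, −i)/√2 and (1, i)/√2, computed by hand, are placed in the columns of `U`; it checks that U^{∗}AU is diagonal with the non-real entries i and −i.

```python
def adjoint(M):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [[complex(M[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[0, -1],
     [1, 0]]            # real rotation matrix: A*A = AA* = I, so A is normal

s = 2 ** -0.5
U = [[s, s],
     [-1j * s, 1j * s]]  # columns: unit eigenvectors for i and -i

D = matmul(matmul(adjoint(U), A), U)

print(abs(D[0][1]) < 1e-12 and abs(D[1][0]) < 1e-12)  # off diagonal entries vanish
print(D[0][0], D[1][1])                               # diagonal entries, not real
```

No real orthogonal matrix could do the same job here, since the eigenvalues are not real.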