The right polar factorization writes a matrix as a product of two other matrices, one of which preserves distances while the other stretches and distorts. First come some lemmas which review and extend several of the topics discussed so far concerning adjoints and orthonormal sets. The factorization is of fundamental significance in geometric measure theory and also in continuum mechanics; not surprisingly, the stress should depend on the part which stretches and distorts. See [13].
Lemma A.5.1 Let A be a Hermitian matrix such that all its eigenvalues are nonnegative. Then there exists a Hermitian matrix A^{1∕2} such that A^{1∕2} has all nonnegative eigenvalues and

(A^{1∕2})^{2} = A.
Proof: Since A is Hermitian, there exists a diagonal matrix D having all real nonnegative
entries and a unitary matrix U such that A = U^{∗}DU. This is from Theorem A.4.1 above.
Then denote by D^{1∕2} the matrix which is obtained by replacing each diagonal entry of D with
its square root. Thus D^{1∕2}D^{1∕2} = D. Then define
A^{1∕2} ≡ U^{∗}D^{1∕2}U.
Then

(A^{1∕2})^{2} = U^{∗}D^{1∕2}UU^{∗}D^{1∕2}U = U^{∗}DU = A.
Since D^{1∕2} is real,
(U^{∗}D^{1∕2}U)^{∗} = U^{∗}(D^{1∕2})^{∗}(U^{∗})^{∗} = U^{∗}D^{1∕2}U
so A^{1∕2} is Hermitian. ■
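The construction in this proof can be carried out numerically. The following is a sketch using NumPy (the variable names are illustrative); `eigh` supplies the diagonalization A = VDV^{∗}, so V plays the role of U^{∗} in the proof.

```python
import numpy as np

# Build a Hermitian matrix with nonnegative eigenvalues: A = B B^*.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B @ B.conj().T

# Diagonalize: A = V D V^* with V unitary and D real diagonal.
eigvals, V = np.linalg.eigh(A)

# Replace each diagonal entry of D with its square root, as in the proof.
# (clip guards against tiny negative values from roundoff.)
sqrtA = V @ np.diag(np.sqrt(np.clip(eigvals, 0, None))) @ V.conj().T

# sqrtA is Hermitian, has nonnegative eigenvalues, and squares to A.
assert np.allclose(sqrtA, sqrtA.conj().T)
assert np.allclose(sqrtA @ sqrtA, A)
```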
Next it is helpful to recall the Gram Schmidt algorithm and observe a certain property
stated in the next lemma.
Lemma A.5.2 Suppose {w_{1},⋅⋅⋅,w_{r},v_{r+1},⋅⋅⋅,v_{p}} is a linearly independent set of vectors such that {w_{1},⋅⋅⋅,w_{r}} is an orthonormal set of vectors. Then when the Gram Schmidt process is applied to the vectors in the given order, it will not change any of the w_{1},⋅⋅⋅,w_{r}.
Proof: Let {u_{1},⋅⋅⋅,u_{p}} be the orthonormal set delivered by the Gram Schmidt process. Then u_{1} = w_{1} because by definition, u_{1} ≡ w_{1}∕|w_{1}| = w_{1}. Now suppose u_{j} = w_{j} for all j ≤ k ≤ r. Then if k < r, consider the definition of u_{k+1},

u_{k+1} ≡ (w_{k+1} − ∑_{j=1}^{k}(w_{k+1},u_{j})u_{j}) ∕ |w_{k+1} − ∑_{j=1}^{k}(w_{k+1},u_{j})u_{j}|.

By induction, u_{j} = w_{j}, and since (w_{k+1},w_{j}) = 0 for j ≤ k, this reduces to w_{k+1}∕|w_{k+1}| = w_{k+1}. ■
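The lemma can be observed directly by running the Gram Schmidt process on such a list. Here is a minimal sketch (the function `gram_schmidt` and the specific vectors are illustrative, not from the text):

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram Schmidt: orthonormalize the vectors in the given order."""
    basis = []
    for v in vectors:
        # Subtract the components along the vectors found so far.
        u = v - sum(np.dot(b.conj(), v) * b for b in basis)
        basis.append(u / np.linalg.norm(u))
    return basis

# An orthonormal pair w1, w2 followed by an independent vector v3.
w1 = np.array([1.0, 0.0, 0.0])
w2 = np.array([0.0, 1.0, 0.0])
v3 = np.array([1.0, 2.0, 3.0])
u1, u2, u3 = gram_schmidt([w1, w2, v3])

# As the lemma asserts, the orthonormal leading vectors are unchanged.
assert np.allclose(u1, w1) and np.allclose(u2, w2)
```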
This lemma immediately implies the following lemma.
Lemma A.5.3 Let V be a subspace of dimension p and let {w_{1},⋅⋅⋅,w_{r}} be an orthonormal set of vectors in V. Then this orthonormal set of vectors may be extended to an orthonormal basis for V,

{w_{1},⋅⋅⋅,w_{r},y_{r+1},⋅⋅⋅,y_{p}}.
Proof: First extend the given linearly independent set {w_{1},⋅⋅⋅,w_{r}} to a basis for V and then apply the Gram Schmidt theorem to the resulting basis. Since {w_{1},⋅⋅⋅,w_{r}} is orthonormal, it follows from Lemma A.5.2 that the result is of the desired form, an orthonormal basis extending {w_{1},⋅⋅⋅,w_{r}}. ■
Here is another lemma about preserving distance.
Lemma A.5.4 Suppose R is an m × n matrix with m ≥ n and R preserves distances. Then R^{∗}R = I. Also, if R takes an orthonormal basis to an orthonormal set, then R must preserve distances.
Proof: Since R preserves distances, |Rx| = |x| for every x. Therefore, from the axioms of the dot product, for all x, y,

|x|^{2} + (x,y) + (y,x) + |y|^{2} = |x + y|^{2} = |R(x + y)|^{2} = |Rx|^{2} + (Rx,Ry) + (Ry,Rx) + |Ry|^{2},

and so

(x,y) + (y,x) = (Rx,Ry) + (Ry,Rx).

Replacing y with iy shows similarly that (x,y) − (y,x) = (Rx,Ry) − (Ry,Rx), and adding these two equations gives (Rx,Ry) = (x,y). Hence

(R^{∗}Rx − x, y) = 0 for all x, y.

Let y = R^{∗}Rx − x to conclude that for all x,

R^{∗}Rx − x = 0,

which says R^{∗}R = I since x is arbitrary.
Consider the last claim. Let R : F^{n} → F^{m} be such that {u_{1},⋅⋅⋅,u_{n}} is an orthonormal basis for F^{n} and {Ru_{1},⋅⋅⋅,Ru_{n}} is also an orthonormal set. Then for any x = ∑_{i}x_{i}u_{i},

|R(∑_{i}x_{i}u_{i})|^{2} = |∑_{i}x_{i}Ru_{i}|^{2} = ∑_{i}|x_{i}|^{2} = |∑_{i}x_{i}u_{i}|^{2}. ■
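A quick numerical illustration of the lemma, assuming NumPy: a matrix with orthonormal columns (the Q factor of a QR factorization) satisfies R^{∗}R = I and preserves lengths.

```python
import numpy as np

# A 5x3 matrix with orthonormal columns, taken as the Q factor of a QR
# factorization of a random matrix.
rng = np.random.default_rng(1)
R, _ = np.linalg.qr(rng.standard_normal((5, 3)))  # R is the 5x3 Q factor

# R takes the standard basis to an orthonormal set, so R^* R = I ...
assert np.allclose(R.T @ R, np.eye(3))

# ... and R preserves distances: |Rx| = |x| for any x.
x = rng.standard_normal(3)
assert np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x))
```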
With this preparation, here is the big theorem about the right polar factorization.
Theorem A.5.5 Let F be an m × n matrix where m ≥ n. Then there exists a Hermitian n × n matrix U which has all nonnegative eigenvalues and an m × n matrix R which satisfies R^{∗}R = I such that

F = RU.
Proof: Consider F^{∗}F. This is a Hermitian matrix because
(F^{∗}F)^{∗} = F^{∗}(F^{∗})^{∗} = F^{∗}F.
Also the eigenvalues of the n × n matrix F^{∗}F are all nonnegative. This is because if λ is an eigenvalue with eigenvector x ≠ 0, then

λ(x,x) = (F^{∗}Fx,x) = (Fx,Fx) ≥ 0.
Therefore, by Lemma A.5.1, there exists an n×n Hermitian matrix U having all nonnegative
eigenvalues such that
U^{2} = F^{∗}F.
Consider the subspace U(F^{n}). Let {Ux_{1},⋅⋅⋅,Ux_{r}} be an orthonormal basis for

U(F^{n}) ⊆ F^{n}.

Note that U(F^{n}) might not be all of F^{n}. Using Lemma A.5.3, extend to an orthonormal basis for all of F^{n},

{Ux_{1},⋅⋅⋅,Ux_{r},y_{r+1},⋅⋅⋅,y_{n}}.
Next observe that {Fx_{1},⋅⋅⋅,Fx_{r}} is also an orthonormal set of vectors in F^{m}. This is because

(Fx_{k},Fx_{j}) = (F^{∗}Fx_{k},x_{j}) = (U^{2}x_{k},x_{j}) = (Ux_{k},U^{∗}x_{j}) = (Ux_{k},Ux_{j}) = δ_{jk}.

Therefore, from Lemma A.5.3 again, this orthonormal set of vectors can be extended to an orthonormal basis for F^{m},

{Fx_{1},⋅⋅⋅,Fx_{r},z_{r+1},⋅⋅⋅,z_{m}}.
Thus there are at least as many z_{k} as there are y_{j}. Now for x ∈ F^{n}, since

{Ux_{1},⋅⋅⋅,Ux_{r},y_{r+1},⋅⋅⋅,y_{n}}

is an orthonormal basis for F^{n}, there exist unique scalars c_{1},⋅⋅⋅,c_{r},d_{r+1},⋅⋅⋅,d_{n} such that

x = ∑_{k=1}^{r}c_{k}Ux_{k} + ∑_{k=r+1}^{n}d_{k}y_{k}.

Define

Rx ≡ ∑_{k=1}^{r}c_{k}Fx_{k} + ∑_{k=r+1}^{n}d_{k}z_{k}.    (1.7)

It remains to verify that RUx = Fx. Since {Ux_{1},⋅⋅⋅,Ux_{r}} is a basis for U(F^{n}), there exist scalars b_{1},⋅⋅⋅,b_{r} such that Ux = ∑_{k=1}^{r}b_{k}Ux_{k}, and so by 1.7,

RUx = ∑_{k=1}^{r}b_{k}Fx_{k} = F(∑_{k=1}^{r}b_{k}x_{k}).

Also,

(F(∑_{k=1}^{r}b_{k}x_{k}) − F(x), F(∑_{k=1}^{r}b_{k}x_{k}) − F(x))
= ((F^{∗}F)(∑_{k=1}^{r}b_{k}x_{k} − x), ∑_{k=1}^{r}b_{k}x_{k} − x)
= (U^{2}(∑_{k=1}^{r}b_{k}x_{k} − x), ∑_{k=1}^{r}b_{k}x_{k} − x)
= (U(∑_{k=1}^{r}b_{k}x_{k} − x), U(∑_{k=1}^{r}b_{k}x_{k} − x))
= (∑_{k=1}^{r}b_{k}Ux_{k} − Ux, ∑_{k=1}^{r}b_{k}Ux_{k} − Ux) = 0.

Therefore, F(∑_{k=1}^{r}b_{k}x_{k}) = F(x) and this shows RUx = Fx. From 1.7 it follows that R maps an orthonormal basis to an orthonormal set and so R preserves distances. Therefore, by Lemma A.5.4, R^{∗}R = I. ■
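In numerical practice the right polar factorization is usually computed from the singular value decomposition rather than from the construction in the proof. Here is a sketch assuming NumPy (the function name `right_polar` is illustrative): from the thin SVD F = WΣV^{∗} one takes U = VΣV^{∗}, which is Hermitian with nonnegative eigenvalues and satisfies U^{2} = F^{∗}F, and R = WV^{∗}, which satisfies R^{∗}R = I.

```python
import numpy as np

def right_polar(F):
    """Right polar factorization F = R U of an m x n matrix, m >= n.

    From the thin SVD F = W S V^*: U = V S V^* is Hermitian with
    nonnegative eigenvalues, and R = W V^* satisfies R^* R = I.
    """
    W, s, Vh = np.linalg.svd(F, full_matrices=False)
    U = Vh.conj().T @ np.diag(s) @ Vh
    R = W @ Vh
    return R, U

rng = np.random.default_rng(2)
F = rng.standard_normal((6, 4))
R, U = right_polar(F)

assert np.allclose(R @ U, F)                    # F = RU
assert np.allclose(R.conj().T @ R, np.eye(4))   # R^* R = I
assert np.allclose(U, U.conj().T)               # U is Hermitian
```

Note that R = FU^{−1} would only work when U is invertible; the SVD form above handles singular U as well.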