5.9. THE RIGHT POLAR DECOMPOSITION 93

Lemma 5.9.3 Suppose $\{w_1, \cdots, w_r, v_{r+1}, \cdots, v_p\}$ is a linearly independent set of vectors such that $\{w_1, \cdots, w_r\}$ is an orthonormal set of vectors. Then when the Gram Schmidt process is applied to the vectors in the given order, it will not change any of the $w_1, \cdots, w_r$.

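Proof: Let $\{u_1, \cdots, u_p\}$ be the orthonormal set delivered by the Gram Schmidt process. Then $u_1 = w_1$ because by definition, $u_1 \equiv w_1 / |w_1| = w_1$. Now suppose $u_j = w_j$ for all $j \le k \le r$. Then if $k < r$, consider the definition of $u_{k+1}$.

$$u_{k+1} \equiv \frac{w_{k+1} - \sum_{j=1}^{k} (w_{k+1}, u_j)\, u_j}{\left| w_{k+1} - \sum_{j=1}^{k} (w_{k+1}, u_j)\, u_j \right|}$$

By induction, $u_j = w_j$ for all $j \le k$. Since $w_{k+1}$ is orthogonal to each of these $w_j$, every term in the sum vanishes, and so this reduces to $w_{k+1} / |w_{k+1}| = w_{k+1}$. This proves the lemma.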
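The lemma can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper names `inner`, `norm`, and `gram_schmidt` and the sample vectors `w1`, `w2`, `v3` are assumptions made for illustration, and the inner product is taken to be linear in its first argument.

```python
import math

def inner(x, y):
    """Inner product (x, y): linear in x, conjugate-linear in y (assumed convention)."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors, in the given order."""
    basis = []
    for v in vectors:
        # Subtract from v its components along the u_j already constructed.
        residual = list(v)
        for u in basis:
            c = inner(v, u)
            residual = [r - c * a for r, a in zip(residual, u)]
        length = norm(residual)
        basis.append([r / length for r in residual])
    return basis

# Hypothetical data: w1, w2 are orthonormal, v3 is independent of them.
w1 = [1.0, 0.0, 0.0]
w2 = [0.0, 0.6, 0.8]
v3 = [1.0, 1.0, 1.0]

u1, u2, u3 = gram_schmidt([w1, w2, v3])
```

Up to rounding, `u1` and `u2` agree with `w1` and `w2`, and only `v3` is replaced by a new unit vector orthogonal to both.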

This lemma immediately implies the following lemma.

Lemma 5.9.4 Let $V$ be a subspace of dimension $p$ and let $\{w_1, \cdots, w_r\}$ be an orthonormal set of vectors in $V$. Then this orthonormal set of vectors may be extended to an orthonormal basis for $V$, $\{w_1, \cdots, w_r, y_{r+1}, \cdots, y_p\}$.

Proof: First extend the given linearly independent set $\{w_1, \cdots, w_r\}$ to a basis for $V$ and then apply the Gram Schmidt process to the resulting basis. Since $\{w_1, \cdots, w_r\}$ is orthonormal, it follows from Lemma 5.9.3 that the result is of the desired form, an orthonormal basis extending $\{w_1, \cdots, w_r\}$. This proves the lemma.
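The construction in this proof can be sketched in code. The sketch below makes several assumptions not found in the text: $V$ is taken to be all of $\mathbb{R}^n$, the extension to a basis uses the standard basis vectors as candidates, and a candidate is treated as dependent when its residual is shorter than a hypothetical tolerance `tol`. The function name `extend_to_orthonormal_basis` is illustrative only.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def extend_to_orthonormal_basis(ws, n, tol=1e-12):
    """Extend the orthonormal list ws (vectors in R^n) to an orthonormal basis of R^n."""
    basis = [list(w) for w in ws]  # the given vectors are kept unchanged
    for i in range(n):
        if len(basis) == n:
            break
        # Candidate for the extension: the standard basis vector e_i.
        e = [1.0 if j == i else 0.0 for j in range(n)]
        # Gram Schmidt step: remove the components along the current basis.
        residual = e
        for u in basis:
            c = dot(e, u)
            residual = [r - c * a for r, a in zip(residual, u)]
        length = math.sqrt(dot(residual, residual))
        # A (near) zero residual means e_i already lies in the span, so skip it.
        if length > tol:
            basis.append([r / length for r in residual])
    return basis

# Hypothetical example: extend a single unit vector in R^3.
w1 = [0.6, 0.8, 0.0]
basis = extend_to_orthonormal_basis([w1], 3)
```

Here `basis[0]` is `w1` itself, and the two vectors appended after it complete an orthonormal basis of $\mathbb{R}^3$.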

Here is another lemma about preserving distance.

Lemma 5.9.5 Suppose $R$ is an $m \times n$ matrix with $m > n$ and $R$ preserves distances. Then $R^*R = I$.
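Before the proof, it may help to see the statement on one concrete matrix. The sketch below is an illustration only: the $3 \times 2$ complex matrix `R`, the test vector `x`, and the helper names are all assumptions chosen for the example. It checks that $|Rx| = |x|$ for the sample vector and computes $R^*R$.

```python
import math

def conj_transpose(A):
    """Conjugate transpose A* of a matrix stored as a list of rows."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def norm(x):
    return math.sqrt(sum(abs(a) ** 2 for a in x))

s = 1 / math.sqrt(2)

# Hypothetical 3 x 2 matrix whose columns are orthonormal in C^3.
R = [[s,      0],
     [s * 1j, 0],
     [0,      1]]

x = [2 - 1j, 3j]  # arbitrary sample vector in C^2
Rx = [sum(R[i][k] * x[k] for k in range(2)) for i in range(3)]

gram = matmul(conj_transpose(R), R)  # R*R, a 2 x 2 matrix
```

For this choice, `norm(Rx)` and `norm(x)` agree up to rounding, and `gram` is the $2 \times 2$ identity up to rounding, as the lemma asserts.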

Proof: Since $R$ preserves distances, $|Rx| = |x|$ for every $x$. Therefore from the axioms of the dot product,

$$\begin{aligned}
|x|^2 + |y|^2 + (x,y) + (y,x) &= |x+y|^2 \\
&= (R(x+y), R(x+y)) \\
&= (Rx,Rx) + (Ry,Ry) + (Rx,Ry) + (Ry,Rx) \\
&= |x|^2 + |y|^2 + (R^*Rx, y) + (y, R^*Rx)
\end{aligned}$$

and so for all $x, y$,

$$(R^*Rx - x, y) + (y, R^*Rx - x) = 0$$

Hence for all $x, y$,

$$\operatorname{Re}\,(R^*Rx - x, y) = 0$$

Now for given $x, y$, choose $\alpha \in \mathbb{C}$ such that

$$\alpha\,(R^*Rx - x, y) = |(R^*Rx - x, y)|$$
