11.6. APPROXIMATIONS 205
Proof: Let u1 ≡ v1/ |v1| . Thus for k = 1, span(u1) = span(v1) and {u1} is anorthonormal set. Now suppose for some k < n, u1, · · · , uk have been chosen such that(u j,ul) = δ jl and span(v1, · · · ,vk) = span(u1, · · · ,uk). Then define
uk+1 ≡vk+1−∑
kj=1 (vk+1,u j)u j∣∣∣vk+1−∑kj=1 (vk+1,u j)u j
∣∣∣ , (11.9)
where the denominator is not equal to zero because the v j form a basis, and so
vk+1 /∈ span(v1, · · · ,vk) = span(u1, · · · ,uk)
Thus by induction,
uk+1 ∈ span(u1, · · · ,uk,vk+1) = span(v1, · · · ,vk,vk+1) .
Also, vk+1 ∈ span(u1, · · · ,uk,uk+1) which is seen easily by solving 11.9 for vk+1, and itfollows
span(v1, · · · ,vk,vk+1) = span(u1, · · · ,uk,uk+1) .
If l ≤ k,
(uk+1,ul) =C
((vk+1,ul)−
k
∑j=1
(vk+1,u j)(u j,ul)
)=
C
((vk+1,ul)−
k
∑j=1
(vk+1,u j)δ l j
)=C ((vk+1,ul)− (vk+1,ul)) = 0.
The vectors,{u j}n
j=1 , generated in this way are therefore orthonormal because each vectorhas unit length. ■
Corollary 11.6.14 If you have a basis for Rp,{u1, · · · ,um,um+1, · · · ,up
}and {u1, · · · ,um} is orthonormal, then when the Gram Schmidt process is used on thisbasis, it returns {u1, · · · ,um} . Thus it is always possible to extend an orthonormal set ofvectors to an orthonormal basis.
Proof:This follows right away from the algorithm. ■Did we ever use the fact that all of this is taking place in Rp? No, this was never used at
all! In fact everything in the Gram Schmidt process holds if V is a subspace of an arbitraryinner product space. You just need something which is a vector space which has an innerproduct to have it all work out exactly the same. A vector space is something in whichyou can add the “vectors” and multiply them by scalars in the usual way which we do forvectors in Rn.
Now return to the stated problem which was to compute the closest point in V . This isthe content of the next theorem.
Theorem 11.6.15 Let V be an m dimensional subspace of Rp having orthonormal basis{u1, · · · ,um}. Let b ∈ Rp and let y be the point of V closest to b. Then
y =m
∑k=1
(b,uk)uk (11.10)