which is seen easily by solving 7.9 for v_{k+1}, and it follows
span (v1,⋅⋅⋅,vk,vk+1) = span(u1,⋅⋅⋅,uk,uk+1).
If l ≤ k,

(uk+1,ul) = C((vk+1,ul) − ∑_{j=1}^{k}(vk+1,uj)(uj,ul))
= C((vk+1,ul) − ∑_{j=1}^{k}(vk+1,uj)δ_{lj}) = C((vk+1,ul) − (vk+1,ul)) = 0.
The vectors {uj}_{j=1}^{n} generated in this way are therefore orthonormal, because each vector also has unit length. ■
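The process just proved correct can be sketched numerically. Here is a minimal NumPy illustration (not part of the text); the normalizing constant C appears as division by the norm, and the Gram matrix of inner products comes out as the identity:

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthonormalize a list of independent vectors.

    From each new v, subtract its components along the previously built
    u_j, then normalize (this division plays the role of the constant C).
    """
    basis = []
    for v in vectors:
        w = v - sum(np.dot(v, u) * u for u in basis)  # remove projections onto earlier u_j
        basis.append(w / np.linalg.norm(w))           # normalize to unit length
    return basis

# Illustrative input vectors (made up for this sketch).
u = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                  np.array([1.0, 0.0, 1.0]),
                  np.array([0.0, 1.0, 1.0])])

# Orthonormality: (u_i, u_j) = delta_{ij}, i.e. the Gram matrix is I.
G = np.array([[np.dot(a, b) for b in u] for a in u])
assert np.allclose(G, np.eye(3))
```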
Corollary 7.6.14 If you have a basis for ℝ^{p}, {u1,⋅⋅⋅,um,um+1,⋅⋅⋅,up}, and {u1,⋅⋅⋅,um} is orthonormal, then when the Gram Schmidt process is used on this basis, it returns {u1,⋅⋅⋅,um}. Thus it is always possible to extend an orthonormal set of vectors to an orthonormal basis.
Proof: This follows right away from the algorithm. ■
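The corollary can also be checked numerically. In this small NumPy sketch (illustrative only, with made-up vectors), feeding Gram Schmidt a basis whose first vectors are already orthonormal returns those vectors unchanged:

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt, as in the text: project out earlier u_j, then normalize."""
    basis = []
    for v in vectors:
        w = v - sum(np.dot(v, u) * u for u in basis)
        basis.append(w / np.linalg.norm(w))
    return basis

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])   # {e1, e2} is already orthonormal
v  = np.array([1.0, 2.0, 3.0])   # completes a basis of R^3

u1, u2, u3 = gram_schmidt([e1, e2, v])
assert np.allclose(u1, e1) and np.allclose(u2, e2)      # first two returned as-is
assert abs(np.dot(u3, e1)) < 1e-12 and abs(np.dot(u3, e2)) < 1e-12  # extension is orthonormal
```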
Did we ever use the fact that all of this is taking place in ℝ^{p}? No, this was never used at all! In fact, everything in the Gram Schmidt process holds when V is a subspace of an arbitrary inner product space: all you need is a vector space equipped with an inner product, and everything works out exactly the same. A vector space is something in which you can add the “vectors” and multiply them by scalars in the usual way, just as we do for vectors in ℝ^{n}.
Now return to the stated problem which was to compute the closest point in V . This is the content of
the next theorem.
Theorem 7.6.15 Let V be an m dimensional subspace of ℝ^{p} having orthonormal basis {u1,⋅⋅⋅,um}. Let b ∈ ℝ^{p} and let y be the point of V closest to b. Then

y = ∑_{k=1}^{m} (b,uk)uk    (7.10)
Proof: We only need to show that this satisfies the orthogonality condition 7.7. But this is fairly obvious because, from properties of the inner product and y given above, for each l ≤ m,

(b − y,ul) = (b,ul) − ∑_{k=1}^{m} (b,uk)(uk,ul) = (b,ul) − (b,ul) = 0,

and by linearity (b − y,z) = 0 for every z ∈ V. Therefore, the orthogonality condition holds for y given by the above formula and so y equals the above sum in 7.10. ■
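Formula (7.10) can be verified numerically. In this minimal NumPy sketch (the vectors u1, u2, b are made up for illustration), the Fourier sum agrees with a least-squares projection onto V:

```python
import numpy as np

# V = span{u1, u2} in R^3 with {u1, u2} orthonormal (illustrative choice).
u1 = np.array([1.0, 0.0, 0.0])
u2 = np.array([0.0, 1.0, 0.0])
b  = np.array([3.0, -2.0, 5.0])

# Formula (7.10): y = sum_k (b, u_k) u_k, the Fourier series approximation.
y = np.dot(b, u1) * u1 + np.dot(b, u2) * u2

# b - y is orthogonal to V, the defining property of the closest point.
assert np.dot(b - y, u1) == 0 and np.dot(b - y, u2) == 0

# Cross-check against the least-squares projection onto the columns of A.
A = np.column_stack([u1, u2])
y_lstsq = A @ np.linalg.lstsq(A, b, rcond=None)[0]
assert np.allclose(y, y_lstsq)
```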
The sum in 7.10 is the Fourier series approximation to b. The scalars (b,uk) are the Fourier coefficients. Note that all this works any time you have a norm which comes from an inner product, something which satisfies the same axioms as the dot product. That is, |x| = (x,x)^{1/2} where (⋅,⋅) satisfies the inner product axioms:
(f,g) = \overline{(g,f)}

(af + bg,h) = a(f,h) + b(g,h)

(f,f) ≥ 0 and equals 0 only if f = 0
The conjugate is placed on the
(g,f)
to include the case of a complex inner product. Just ignore it in
the case where the scalars are real numbers.
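As an illustration (not part of the text), the standard complex inner product on ℂ^n, (f,g) = ∑_i f_i·conj(g_i), satisfies all three axioms, including the conjugate symmetry just described. A small NumPy check with arbitrary random vectors:

```python
import numpy as np

def ip(f, g):
    """Standard complex inner product (f, g) = sum_i f_i * conj(g_i).

    np.vdot conjugates its FIRST argument, so ip(f, g) = np.vdot(g, f).
    """
    return np.vdot(g, f)

rng = np.random.default_rng(0)
f = rng.normal(size=4) + 1j * rng.normal(size=4)
g = rng.normal(size=4) + 1j * rng.normal(size=4)
h = rng.normal(size=4) + 1j * rng.normal(size=4)
a, b = 2.0 - 1.0j, 0.5 + 3.0j

assert np.isclose(ip(f, g), np.conj(ip(g, f)))                        # (f,g) = conj((g,f))
assert np.isclose(ip(a * f + b * g, h), a * ip(f, h) + b * ip(g, h))  # linear in first slot
assert ip(f, f).real > 0 and np.isclose(ip(f, f).imag, 0.0)           # (f,f) real and >= 0
```

For real vectors the conjugate is the identity, which is why it can be ignored over ℝ.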
Now we generalize these ideas more.
Theorem 7.6.16 Let V be a finite dimensional subspace of an inner product space X, something with an inner product. (X is a nonempty set which satisfies the vector space axioms. In addition, it has an inner product satisfying the inner product axioms.) If b ∈ X, there exists a unique y ∈ V such that

|b − y| ≤ |b − z|

for all z ∈ V. This point is characterized by (b − y,z) = 0 for all z ∈ V.
Proof: Letting t ∈ ℝ,

|b − (y + tz)|^2 = |b − y|^2 − 2t(b − y,z) + t^2|z|^2

If y is closest to b, then the derivative of the right side with respect to t must vanish at t = 0, and so (b − y,z) = 0. Conversely, if this inner product equals zero, let t = 1 and you have

|b − (y + z)|^2 = |b − y|^2 + |z|^2
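The expansion used in this proof is easy to confirm numerically. A quick NumPy check with arbitrary made-up vectors in ℝ^3 and the standard dot product:

```python
import numpy as np

# Verify |b - (y + t z)|^2 = |b - y|^2 - 2 t (b - y, z) + t^2 |z|^2
# for several values of t (vectors chosen arbitrarily for illustration).
b = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])
z = np.array([2.0, 0.0, -1.0])

for t in (-1.5, 0.0, 0.7, 2.0):
    lhs = np.dot(b - (y + t * z), b - (y + t * z))
    rhs = np.dot(b - y, b - y) - 2 * t * np.dot(b - y, z) + t**2 * np.dot(z, z)
    assert np.isclose(lhs, rhs)
```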
and so y solves the minimization property. It only remains to show the existence of such a y satisfying (b − y,z) = 0. However, using the Gram Schmidt process, there is an orthonormal basis {u1,⋅⋅⋅,un} whose span is V. Then all that remains is to verify that y = ∑_{i=1}^{n} (b,ui)ui satisfies the orthogonality condition. Indeed, for each uj,

(b − y,uj) = (b,uj) − ∑_{i=1}^{n} (b,ui)(ui,uj) = (b,uj) − (b,uj) = 0,

and since every z ∈ V is a linear combination of the uj, (b − y,z) = 0 for all z ∈ V. ■
Note that any time the norm comes from an inner product, something which satisfies the properties of
the dot product, all of this holds. You don’t need to be considering vectors in ℝ^{n}. It was only the axioms of
the inner product which were used.
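As one illustration of this generality (the example is ours, not the text's), take X to be polynomials on [−1,1] with (f,g) = ∫ fg dx. The same Fourier formula then produces the closest point in a function space; the integral is computed exactly here by Gauss-Legendre quadrature, which is exact for low-degree polynomials:

```python
import numpy as np

# Inner product (f, g) = integral of f*g over [-1, 1], via 8-point
# Gauss-Legendre quadrature (exact for polynomials of degree <= 15).
nodes, weights = np.polynomial.legendre.leggauss(8)

def ip(f, g):
    return np.sum(weights * f(nodes) * g(nodes))

# Orthonormal basis of V = span{1, x} for this inner product:
# u0 = 1/sqrt(2), u1 = sqrt(3/2) * x (normalized Legendre polynomials).
u0 = lambda x: np.full_like(x, 1.0 / np.sqrt(2.0))
u1 = lambda x: np.sqrt(1.5) * x

# Fourier series approximation of b(x) = x^2 in V, as in (7.10).
b = lambda x: x**2
y = lambda x: ip(b, u0) * u0(x) + ip(b, u1) * u1(x)

# The closest "point" is the constant function 1/3, and b - y is
# orthogonal to both basis functions, as the theorem asserts.
assert np.allclose(y(nodes), 1.0 / 3.0)
assert abs(ip(lambda x: b(x) - y(x), u0)) < 1e-12
assert abs(ip(lambda x: b(x) - y(x), u1)) < 1e-12
```

Nothing about ℝ^{n} was used: only the inner product axioms.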