13.16. APPROXIMATION IN THE FROBENIUS NORM 375

It follows∥Σ∥2

F = ||U∗AV ||2F = ||AV ||2F = ||A||2F . ■

Of course, this shows that||A||2F = ∑

2i ,

the sum of the squares of the singular values of A.Why is the singular value decomposition important? It implies

A =U

(σ 00 0

)V ∗

where σ is the diagonal matrix having the singular values down the diagonal. Now some-times A is a huge matrix, 1000×2000 or something like that. This happens in applicationsto situations where the entries of A describe a picture. What also happens is that most ofthe singular values are very small. What if you deleted those which were very small, sayfor all i≥ l and got a new matrix

A′ ≡U

(σ ′ 00 0

)V ∗?

Then the entries of A′ would end up being close to the entries of A but there is much lessinformation to keep track of. This turns out to be very useful. More precisely, letting

σ =

σ1 0

. . .

0 σ r

 , U∗AV =

(σ 00 0

),

∣∣∣∣A−A′∣∣∣∣2

F =

∣∣∣∣∣∣∣∣∣∣U(

σ −σ ′ 00 0

)V ∗∣∣∣∣∣∣∣∣∣∣2

F

=r

∑k=l+1

σ2k

Thus A is approximated by A′ where A′ has rank l < r. In fact, it is also true that out ofall matrices of rank l, this A′ is the one which is closest to A in the Frobenius norm. Here

is roughly why this is so. Suppose B̃ approximates A =

(σ r×r 0

0 0

)as well as possible

out of all matrices B̃ having rank no more than l < r the size of the matrix σ r×r.Suppose the rank of B̃ is l. Then obviously no column x j of B̃ in a basis for the column

space can have j > r since if so, the approximation of A could be improved by simply

making this column into a zero column. Therefore there are

(rl

)choices for columns

for a basis for the column space of B̃.Let x be a column in the basis for the column space of B̃ and let it be column j in

the matrix B̃. Denote the diagonal entry by x j = σ j + h. Then the error incurred due toapproximating with this column is

h2 +∑i ̸= j

x2i

13.16. APPROXIMATION IN THE FROBENIUS NORM 375It follows2 * 2 2 2|Z || = ||U"AV |p = ||AV ||; = ||Allp.Of course, this shows that2WAllz =) 97,ithe sum of the squares of the singular values of A.Why is the singular value decomposition important? It impliesa-ul ° 9 \ye0 0where o is the diagonal matrix having the singular values down the diagonal. Now some-times A is a huge matrix, 1000x2000 or something like that. This happens in applicationsto situations where the entries of A describe a picture. What also happens is that most ofthe singular values are very small. What if you deleted those which were very small, sayfor all i > / and got a new matrix/azul 2 © \y0 0Then the entries of A’ would end up being close to the entries of A but there is much lessinformation to keep track of. This turns out to be very useful. More precisely, lettingO01 0c= 7 curav=({ ° 2).0 00 oO,2o—o’ 0 rU Vel] = o;pf k=lA-A'lfp =Thus A is approximated by A’ where A’ has rank / < r. In fact, it is also true that out ofall matrices of rank /, this A’ is the one which is closest to A in the Frobenius norm. HereOrxr0out of all matrices B having rank no more than / < r the size of the matrix 0;.,.Suppose the rank of B is /. Then obviously no column x 7 of B in a basis for the columnspace can have j > r since if so, the approximation of A could be improved by simply~ 0is roughly why this is so. Suppose B approximates A = ( 0 ) as well as possible: : : r :making this column into a zero column. Therefore there are ( ; ) choices for columnsfor a basis for the column space of B.Let x be a column in the basis for the column space of B and let it be column j inthe matrix B. Denote the diagonal entry by x j = 0; +h. Then the error incurred due toapproximating with this column ish? + YixyiAj