We assume that $S$ has rank $p$. Thus it is a self-adjoint matrix all of whose eigenvalues are positive. Therefore, from the property of the trace, $\operatorname{trace}(AB) = \operatorname{trace}(BA)$, the thing to maximize is
\[ n \ln(\det(\Sigma^{-1})) - \operatorname{trace}\left(S^{1/2}\Sigma^{-1}S^{1/2}\right). \]
Now let $B = S^{1/2}\Sigma^{-1}S^{1/2}$. Then $B$ is positive and self-adjoint also, and so there exists a unitary $U$ such that $B = U^{*}DU$ where $D$ is the diagonal matrix having the positive scalars $\lambda_1, \cdots, \lambda_p$ down the main diagonal. Solving for $\Sigma^{-1}$ in terms of $B$, this yields
\[ S^{-1/2}BS^{-1/2} = \Sigma^{-1} \]
and so
\[ \ln(\det(\Sigma^{-1})) = \ln\left(\det\left(S^{-1/2}\right)\det(B)\det\left(S^{-1/2}\right)\right) = \ln(\det(S^{-1})) + \ln(\det(B)) \]
which yields
\[ C(S) + n\ln(\det(B)) - \operatorname{trace}(B) \]
as the thing to maximize. Of course this yields
\[ C(S) + n\ln\left(\prod_{i=1}^{p}\lambda_i\right) - \sum_{i=1}^{p}\lambda_i = C(S) + n\sum_{i=1}^{p}\ln(\lambda_i) - \sum_{i=1}^{p}\lambda_i \]
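as the quantity to be maximized. To do this, take $\partial/\partial\lambda_k$ and set equal to 0, which gives $n/\lambda_k - 1 = 0$. This yields $\lambda_k = n$. Since each term $n\ln(\lambda_k) - \lambda_k$ is concave in $\lambda_k$, this critical point is a maximum. Therefore, from the above, $B = U^{*}nIU = nI$. Also from the above,
\[ B^{-1} = \frac{1}{n}I = S^{-1/2}\Sigma S^{-1/2} \]
and so
\[ \Sigma = \frac{1}{n}S = \frac{1}{n}\sum_{i=1}^{n}(x_i - m)(x_i - m)^{*} \]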
This has shown that the maximum likelihood estimates are
\[ m = \bar{x} \equiv \frac{1}{n}\sum_{i=1}^{n}x_i, \qquad \Sigma = \frac{1}{n}\sum_{i=1}^{n}(x_i - m)(x_i - m)^{*}. \]
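
The two estimates above can be computed directly from a sample. The following is a minimal sketch in plain Python, assuming real-valued data so that the adjoint $^{*}$ reduces to the transpose. The function name `mle_mean_and_covariance` and the four sample points are illustrative choices, not part of the text. Note that the divisor is $n$, as in the formula above, and not $n-1$.

```python
def mle_mean_and_covariance(samples):
    """Return (m, Sigma) for a list of real vectors of equal length p.

    m     = (1/n) * sum of x_i
    Sigma = (1/n) * sum of (x_i - m)(x_i - m)^T
    """
    n = len(samples)
    p = len(samples[0])

    # Sample mean, computed coordinate by coordinate.
    m = [sum(x[j] for x in samples) / n for j in range(p)]

    # Scatter matrix S = sum of outer products of the deviations x_i - m.
    S = [[0.0] * p for _ in range(p)]
    for x in samples:
        d = [x[j] - m[j] for j in range(p)]
        for r in range(p):
            for c in range(p):
                S[r][c] += d[r] * d[c]

    # Maximum likelihood estimate of the covariance: Sigma = S / n.
    Sigma = [[S[r][c] / n for c in range(p)] for r in range(p)]
    return m, Sigma


# Hypothetical data: four points in R^2 (chosen so that S has rank 2).
samples = [(1.0, 2.0), (3.0, 4.0), (5.0, 9.0), (3.0, 5.0)]
m, Sigma = mle_mean_and_covariance(samples)
print(m)      # [3.0, 5.0]
print(Sigma)  # [[2.0, 3.5], [3.5, 6.5]]
```

Here the scatter matrix is $S = \begin{pmatrix} 8 & 14 \\ 14 & 26 \end{pmatrix}$, which has determinant 12 and hence rank 2, in agreement with the rank assumption on $S$.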
13.10 Simultaneous Diagonalization

Recall the following definition of what it means for a matrix to be diagonalizable.
Definition 13.10.1 Let $A$ be an $n\times n$ matrix. It is said to be diagonalizable if there exists an invertible matrix $S$ such that
\[ S^{-1}AS = D \]
where $D$ is a diagonal matrix.
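
As a small illustration of the definition, the sketch below checks $S^{-1}AS = D$ for one concrete $2\times 2$ matrix, using exact rational arithmetic from the standard library. The matrix $A$, the matrix $S$ (whose columns are eigenvectors of $A$ found by hand), and the helper names `matmul` and `inverse_2x2` are assumptions made for this example only.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


# Example matrix with eigenvalues 2 and 5.
A = [[Fraction(4), Fraction(1)],
     [Fraction(2), Fraction(3)]]

# Columns of S are eigenvectors: (1, -2) for eigenvalue 2, (1, 1) for eigenvalue 5.
S = [[Fraction(1), Fraction(1)],
     [Fraction(-2), Fraction(1)]]

D = matmul(inverse_2x2(S), matmul(A, S))
print(D == [[2, 0], [0, 5]])  # True
```

The diagonal entries of $D$ are the eigenvalues of $A$, listed in the same order as the corresponding eigenvector columns of $S$.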