First of all, here is what it means for two matrices to be similar.
Definition A.7.11 Let A, B be two n × n matrices. Then they are similar if and only if there exists an invertible matrix S such that
\[
A = S^{-1}BS
\]
Proposition A.7.12 Define, for n × n matrices, A ∼ B if A is similar to B. Then

A ∼ A,
if A ∼ B then B ∼ A,
if A ∼ B and B ∼ C then A ∼ C.
Proof: It is clear that A ∼ A because you could just take S = I. If A ∼ B, then for some S invertible,

\[
A = S^{-1}BS
\]

and so

\[
SAS^{-1} = B
\]

But then

\[
(S^{-1})^{-1}AS^{-1} = B
\]

which shows that B ∼ A.
Now suppose A ∼ B and B ∼ C. Then there exist invertible matrices S, T such that

\[
A = S^{-1}BS, \quad B = T^{-1}CT.
\]

Therefore,

\[
A = S^{-1}T^{-1}CTS = (TS)^{-1}C(TS)
\]

showing that A is similar to C. ■
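As a quick numerical illustration (not from the text), the transitivity step of the proof can be checked with NumPy on small hypothetical matrices: build A similar to B via S and C similar to B via T, then verify that TS conjugates C to A.

```python
import numpy as np

# Hypothetical 2x2 matrices: B fixed, S and T invertible.
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])
T = np.array([[2.0, 0.0],
              [1.0, 1.0]])

# A ~ B via S (A = S^{-1} B S), and B ~ C via T (B = T^{-1} C T).
A = np.linalg.inv(S) @ B @ S
C = T @ B @ np.linalg.inv(T)   # rearranged so that B = T^{-1} C T

# The proof asserts A = (T S)^{-1} C (T S); confirm numerically.
TS = T @ S
print(np.allclose(A, np.linalg.inv(TS) @ C @ TS))
```

The check mirrors the algebra exactly: substituting B = T^{−1}CT into A = S^{−1}BS collapses to a single conjugation by TS.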
For your information, a relation ∼ satisfying the above three conditions (reflexivity, symmetry, and transitivity) is called an equivalence relation. Equivalence relations are very significant in mathematics.
When a matrix is similar to a diagonal matrix, the matrix is said to be diagonalizable. I think this is one of the worst monstrosities of a word that I have ever seen. Nevertheless, it is commonly used in linear algebra. It turns out to be the same as being nondefective. The following is the precise definition.
Definition A.7.13 Let A be an n × n matrix. Then A is diagonalizable if there exists an invertible matrix S such that

\[
S^{-1}AS = D
\]

where D is a diagonal matrix. This means D has a zero in every entry except possibly on the main diagonal. More precisely, D_{ij} = 0 unless i = j. Such matrices look like the following.
\[
\begin{pmatrix} \ast & & 0 \\ & \ddots & \\ 0 & & \ast \end{pmatrix}
\]

where ∗ might not be zero.
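A hypothetical numerical test for diagonalizability (anticipating Theorem A.7.14: A is diagonalizable exactly when its eigenvectors span F^n) checks whether the matrix of eigenvectors returned by NumPy is invertible. The determinant threshold `tol` below is an illustrative choice, not a robust numerical criterion.

```python
import numpy as np

def is_diagonalizable(A, tol=1e-10):
    # A is diagonalizable iff a basis of eigenvectors exists, i.e. the
    # eigenvector matrix S from np.linalg.eig is (numerically) invertible.
    _, S = np.linalg.eig(A)
    return abs(np.linalg.det(S)) > tol

# A diagonal matrix is trivially diagonalizable.
print(is_diagonalizable(np.array([[2.0, 0.0], [0.0, 3.0]])))

# [[1, 1], [0, 1]] is the classic defective (non-diagonalizable) example:
# the eigenvalue 1 has only a one-dimensional eigenspace.
print(is_diagonalizable(np.array([[1.0, 1.0], [0.0, 1.0]])))
```

For the defective example, `np.linalg.eig` returns two nearly parallel eigenvector columns, so the determinant is essentially zero.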
The most important theorem about diagonalizability is the following major result.
Theorem A.7.14 An n × n matrix A is diagonalizable if and only if F^n has a basis of eigenvectors of A. Furthermore, you can take the matrix S described above to be given as

\[
S = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}
\]

where here the v_k are the eigenvectors in the basis for F^n. If A is diagonalizable, the eigenvalues of A are the diagonal entries of the diagonal matrix.
Proof: Suppose there exists a basis of eigenvectors {v_1, ⋯, v_n} where Av_k = λ_k v_k. Then let S be given as above. It follows that S^{−1} exists because these vectors are linearly independent, so N(S) = {0}, which implies S is one to one, which implies det(S) ≠ 0, which implies S^{−1} exists. Let S^{−1} be of the form
\[
S^{-1} = \begin{pmatrix} w_1^T \\ w_2^T \\ \vdots \\ w_n^T \end{pmatrix}
\]

where w_k^T v_j = δ_{kj}. Then
\[
\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}
= \begin{pmatrix} w_1^T \\ w_2^T \\ \vdots \\ w_n^T \end{pmatrix}
\begin{pmatrix} \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \end{pmatrix}
= \begin{pmatrix} w_1^T \\ w_2^T \\ \vdots \\ w_n^T \end{pmatrix}
\begin{pmatrix} A v_1 & A v_2 & \cdots & A v_n \end{pmatrix}
= S^{-1}AS
\]
Next suppose A is diagonalizable so that S^{−1}AS = D. Let S = \begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix} where the columns are the v_k, and let D = diag(λ_1, ⋯, λ_n). Then AS = SD, and comparing the k^{th} columns of this equation gives Av_k = λ_k v_k, showing the v_i are eigenvectors of A and the λ_k are eigenvalues. Now the v_k form a basis for F^n because the matrix S having these vectors as columns is given to be invertible. ■
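The theorem's recipe can be sketched numerically with NumPy on a hypothetical matrix with distinct eigenvalues (distinct eigenvalues guarantee a basis of eigenvectors). The eigenvector matrix plays the role of S, and conjugation recovers the diagonal matrix of eigenvalues.

```python
import numpy as np

# Hypothetical diagonalizable matrix (eigenvalues 5 and 2 are distinct).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix S whose columns are
# eigenvectors, exactly as in the theorem: S = (v1 v2 ... vn).
eigvals, S = np.linalg.eig(A)

# S^{-1} A S should be the diagonal matrix of eigenvalues.
D = np.linalg.inv(S) @ A @ S
print(np.allclose(D, np.diag(eigvals)))
```

Equivalently, A = SDS^{−1}, which is the form used for computing powers of A below.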
In other words, to diagonalize A you get a basis of eigenvectors {v_1, ⋯, v_n} and let S = (v_1 ⋯ v_n). Then S^{−1}AS = D, a diagonal matrix which has the eigenvalues down the main diagonal, listed according to multiplicity. Note also that for n a positive integer,
\[
A^n = \overbrace{SDS^{-1}\,SDS^{-1}\cdots SDS^{-1}}^{n\text{ times}}
\]

The interior factors S^{−1}S cancel and so this reduces to

\[
A^n = SD^nS^{-1}
\]
and it is easy to compute D^n: just raise each diagonal entry to the n^{th} power. More generally, you can define functions of the matrix using power series in this way.
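The power computation can be sketched with NumPy (the matrix below is a hypothetical example): diagonalize once, raise the eigenvalues to the n^{th} power, and conjugate back, then compare against a direct matrix power.

```python
import numpy as np

# Hypothetical diagonalizable matrix with eigenvalues 5 and 2.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
n = 10

# Diagonalize: columns of S are eigenvectors, so A = S D S^{-1}.
eigvals, S = np.linalg.eig(A)

# A^n = S D^n S^{-1}, where D^n raises each eigenvalue to the n-th power.
An = S @ np.diag(eigvals**n) @ np.linalg.inv(S)

# Compare with repeated multiplication.
print(np.allclose(An, np.linalg.matrix_power(A, n)))
```

This is why diagonalization is so useful computationally: an n-fold matrix product is replaced by n-th powers of scalars, and the same idea extends to power series such as the matrix exponential.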