Definition A.10.1Let A be an n × n matrix. The characteristic polynomialisdefined as
qA (t) ≡ det(tI − A )
and the solutions to q_{A}
(t)
= 0 are called eigenvalues. For A a matrix andp
(t)
= t^{n} + a_{n−1}t^{n−1} +
⋅⋅⋅
+ a_{1}t + a_{0}, denote by p
(A)
the matrix defined by
p(A ) ≡ An + an−1An−1 + ⋅⋅⋅+ a1A +a0I.
The explanation for the last term is that A^{0}is interpreted as I, the identity matrix.
The Cayley Hamilton theorem states that every matrix satisfies its characteristic equation,
that equation defined by q_{A}
(t)
= 0. It is one of the most important theorems in linear
algebra^{2} .
The proof in this section is not the most general proof, but works well when the field of
scalars is ℝ or ℂ. The following lemma will help with its proof.
Lemma A.10.2Suppose for all
|λ|
large enough,
A + A λ+ ⋅⋅⋅+ A λm = 0,
0 1 m
where the A_{i}are n × n matrices. Then each A_{i} = 0.
Proof:Suppose some A_{i}≠0. Let p be the largest index of those which are non zero. Then
multiply by λ^{−p}.
A0λ −p + A1λ−p+1 + ⋅⋅⋅+ Ap− 1λ −1 + Ap = 0
Now let λ →∞. Thus A_{p} = 0 after all. Hence each A_{i} = 0. ■
With the lemma, here is a simple corollary.
Corollary A.10.3Let A_{i}and B_{i}be n × n matrices and suppose
A0 + A1λ + ⋅⋅⋅+ Am λm = B0 + B1λ + ⋅⋅⋅+ Bm λm
for all
|λ|
large enough. Then A_{i} = B_{i}for all i. If A_{i} = B_{i}for each A_{i},B_{i}then one cansubstitute an n × n matrix M for λ and the identity will continue to hold.
Proof:Subtract and use the result of the lemma.The last claim is obvious by matching
terms. ■
With this preparation, here is a relatively easy proof of the Cayley Hamilton
theorem.
Theorem A.10.4Let A be an n × n matrix and let q
(λ)
≡ det
(λI − A)
bethe characteristic polynomial. Then q
(A )
= 0.
Proof:Let C
(λ)
equal the transpose of the cofactor matrix of
(λI − A)
for
|λ|
large. (If
|λ|
is large enough, then λ cannot be in the finite list of eigenvalues of A and so for such λ,
Then, using Corollary A.10.3, one can replace λ on both sides with A. Then the right side is
seen to equal 0. Hence the left side, q
(A )
I is also equal to 0. ■
Appendix B The Hausdorff Maximal Theorem
The purpose of this appendix is to prove the equivalence between the axiom of choice, the
Hausdorff maximal theorem, and the well-ordering principle. The Hausdorff maximal theorem
and the well-ordering principle are very useful but a little hard to believe; so, it may be
surprising that they are equivalent to the axiom of choice. First it is shown that the axiom of
choice implies the Hausdorff maximal theorem, a remarkable theorem about partially ordered
sets.
A nonempty set is partially ordered if there exists a partial order, ≺, satisfying
x ≺ x
and
if x ≺ y and y ≺ z then x ≺ z.
An example of a partially ordered set is the set of all subsets of a given set and
≺≡⊆. Note that two elements in a partially ordered sets may not be related. In
other words, just because x, y are in the partially ordered set, it does not follow
that either x ≺ y or y ≺ x. A subset of a partially ordered set, C, is called a chain
if x, y ∈C implies that either x ≺ y or y ≺ x. If either x ≺ y or y ≺ x then x
and y are described as being comparable. A chain is also called a totally ordered
set. C is a maximal chain if whenever
^C
is a chain containing C, it follows the two
chains are equal. In other words C is a maximal chain if there is no strictly larger
chain.
Lemma B.0.1Let ℱ be a nonempty partially ordered set with partial order ≺.Then assuming the axiom of choice, there exists a maximal chain in ℱ.
Proof: Let X be the set of all chains from ℱ. For C∈X, let
SC = {x ∈ ℱ such that C∪{x} is a chain strictly larger than C}.
If S_{C} = ∅ for any C, then C is maximal. Thus, assume S_{C}≠∅ for all C∈X. Let f(C) ∈ S_{C}.
(This is where the axiom of choice is being used.) Let
g(C) = C ∪ {f (C)}.
Thus g(C) ⊋ C and g(C) ∖C ={f(C)} = {a single element of ℱ}. A subset T of X is called a
tower if
∅ ∈ T ,
C ∈ T impliesg(C) ∈ T ,
and if S⊆T is totally ordered with respect to set inclusion, then
∪ S ∈ T .
Here S is a chain with respect to set inclusion whose elements are chains.
Note that X is a tower. Let T_{0} be the intersection of all towers. Thus, T_{0} is a tower, the
smallest tower. Are any two sets in T_{0} comparable in the sense of set inclusion so that T_{0} is
actually a chain? Let C_{0} be a set of T_{0} which is comparable to every set of T_{0}. Such sets
exist, ∅ being an example. Let
ℬ ≡ {D ∈ T : D ⊋ C and f (C ) ∕∈ D }.
0 0 0
The picture represents sets of ℬ. As illustrated in the picture, D is a set of ℬ when D is larger
than C_{0} but fails to be comparable to g
(C0)
. Thus there would be more than one chain
ascending from C_{0} if ℬ≠∅, rather like a tree growing upward in more than one direction from a
fork in the trunk. It will be shown this can’t take place for any such C_{0} by showing
ℬ = ∅.
PICT
This will be accomplished by showing
^
T0
≡T_{0}∖ℬ is a tower. Since T_{0} is the smallest
tower, this will require that
^T0
= T_{0} and so ℬ = ∅.
Claim:
T^0
is a tower and so ℬ = ∅.
Proof of the claim:It is clear that ∅∈
^T0
because for ∅ to be contained in ℬ it would
be required to properly contain C_{0} which is not possible. Suppose D∈
T^0
. The plan is to
verify g
(D)
∈
T^0
.
Case 1: f
(D )
∈C_{0}. If D⊆C_{0}, then since both D and
{f (D)}
are contained in C_{0}, it
follows g
(D)
⊆C_{0} and so g
(D )
∕∈
ℬ. On the other hand, if D ⊋ C_{0}, then since D∈
^T0
,
f
(C0)
∈D and so g
(D)
also contains f
(C0)
implying g
(D)
∕∈
ℬ. These are the only two cases
to consider because C_{0} is comparable to every set of T_{0}.
Case 2: f
(D )
∈∕
C_{0}. If D ⊊ C_{0} it can’t be the case that f
(D )
∕∈
C_{0} because if this were so,
g
(D )
would not compare to C_{0}.
PICT
Hence if f
(D)
∕∈
C_{0}, then D⊇C_{0}. If D = C_{0}, then f
(D )
= f
(C0)
∈ g
(D )
so g
(D )
∕∈
ℬ.
Therefore, assume D ⊋ C_{0}. Then, since D is in
T^0
, f
(C0)
∈D and so f
(C0)
∈ g
(D)
. Therefore,
g
(D )
∈
^T0
.
Now suppose S is a totally ordered subset of
^T0
with respect to set inclusion. Then if every
element of S is contained in C_{0}, so is ∪S and so ∪S∈
T^0
. If, on the other hand, some chain
from S, C, contains C_{0} properly, then since C
∕∈
ℬ, f
(C0)
∈C⊆∪S showing that ∪S
∕∈
ℬ also.
This has proved
T^
0
is a tower and since T_{0} is the smallest tower, it follows
T^
0
= T_{0}.
This has shown roughly that no splitting into more than one ascending chain can
occur at any C_{0} which is comparable to every set of T_{0}. Next it is shown that every
element of T_{0} has the property that it is comparable to all other elements of T_{0}.
This is done by showing that these elements which possess this property form a
tower.
Define T_{1} to be the set of all elements of T_{0} which are comparable to every element of T_{0}.
(Recall the elements of T_{0} are chains from the original partial order.)
Claim: T_{1} is a tower.
Proof of the claim: It is clear that ∅∈T_{1} because ∅ is a subset of every set. Suppose
C_{0}∈T_{1}. It is necessary to verify that g
(C0)
∈T_{1}. Let D∈T_{0} (Thus D⊆C_{0} or
else D ⊋ C_{0}.)and consider g
(C0)
≡C_{0}∪
{f (C0)}
. If D⊆C_{0}, then D⊆ g
(C0)
so
g
(C0)
is comparable to D. If D ⊋ C_{0}, then D⊇ g
(C0)
by what was just shown
(ℬ = ∅)
. Hence g
(C0)
is comparable to D. Since D was arbitrary, it follows g
(C0)
is
comparable to every set of T_{0}. Now suppose S is a chain of elements of T_{1} and let
D be an element of T_{0}. If every element in the chain, S is contained in D, then
∪S is also contained in D. On the other hand, if some set, C, from S contains D
properly, then ∪S also contains D. Thus ∪S∈T_{1} since it is comparable to every
D∈T_{0}.
This shows T_{1} is a tower and proves therefore, that T_{0} = T_{1}. Thus every set of T_{0}
compares with every other set of T_{0} showing T_{0} is a chain in addition to being a
tower.
Now ∪T_{0},g
(∪T0)
∈T_{0}. Hence, because g
(∪T0)
is an element of T_{0}, and T_{0} is a chain of
these, it follows g
(∪T0)
⊆∪T_{0}. Thus
∪T0 ⊇ g(∪T0) ⊋ ∪T0,
a contradiction. Hence there must exist a maximal chain after all. This proves the
lemma.
If X is a nonempty set,≤ is an order on X if
x ≤ x,
and if x, y ∈ X, then
either x ≤ y or y ≤ x
and
if x ≤ y and y ≤ z then x ≤ z.
≤ is a well order and say that
(X,≤ )
is a well-ordered set if every nonempty subset of X has
a smallest element. More precisely, if S≠∅ and S ⊆ X then there exists an x ∈ S such
that x ≤ y for all y ∈ S. A familiar example of a well-ordered set is the natural
numbers.
Lemma B.0.2The Hausdorff maximal principle implies every nonempty set canbe well-ordered.
Proof: Let X be a nonempty set and let a ∈ X. Then {a} is a well-ordered subset of X.
Let
ℱ = {S ⊆ X : there exists a well order for S}.
Thus ℱ≠∅. For S_{1}, S_{2}∈ℱ, define S_{1}≺ S_{2} if S_{1}⊆ S_{2} and there exists a well order for S_{2},
≤_{2}such that
(S2,≤2) is well- ordered
and if
y ∈ S2 ∖ S1 then x ≤2 y for all x ∈ S1,
and if ≤_{1}is the well order of S_{1} then the two orders are consistent on S_{1}. Then observe that
≺ is a partial order on ℱ. By the Hausdorff maximal principle, let C be a maximal chain in ℱ
and let
X∞ ≡ ∪ C.
Define an order, ≤, on X_{∞} as follows. If x, y are elements of X_{∞}, pick S ∈C such that x, y
are both in S. Then if ≤_{S} is the order on S, let x ≤ y if and only if x ≤_{S}y. This definition is
well defined because of the definition of the order, ≺. Now let U be any nonempty subset of
X_{∞}. Then S ∩ U≠∅ for some S ∈C. Because of the definition of ≤, if y ∈ S_{2}∖ S_{1}, S_{i}∈C,
then x ≤ y for all x ∈ S_{1}. Thus, if y ∈ X_{∞}∖ S then x ≤ y for all x ∈ S and so the
smallest element of S ∩ U exists and is the smallest element in U. Therefore X_{∞ } is
well-ordered. Now suppose there exists z ∈ X ∖ X_{∞}. Define the following order, ≤_{1}, on
X_{∞}∪{z}.
x ≤1 y if and only if x ≤ y whenever x,y ∈ X∞
x ≤1 z whenever x ∈ X ∞.
Then let
^C = {S ∈ C or X ∞ ∪ {z}}.
Then
^C
is a strictly larger chain than C contradicting maximality of C. Thus X ∖X_{∞} = ∅ and
this shows X is well-ordered by ≤. This proves the lemma.
With these two lemmas the main result follows.
Theorem B.0.3The following are equivalent.
The axiom of choice
The Hausdorff maximal principle
The well- ordering principle.
Proof:It only remains to prove that the well-ordering principle implies the axiom
of choice. Let I be a nonempty set and let X_{i} be a nonempty set for each i ∈ I.
Let X = ∪{X_{i} : i ∈ I} and well order X. Let f