Gerschgorin’s theorem gives a convenient way to estimate eigenvalues of a matrix from
easy to obtain information. For A an n × n matrix, denote by σ
(A )
the collection of all
eigenvalues of A.
Theorem 51.7.1Let A be an n×n matrix. Consider the n Gerschgorin discs definedas
({ )}
D ≡ λ ∈ ℂ : |λ − a | ≤ ∑ |a | .
i ( ii j⁄=i ij)
Then every eigenvalue is contained in some Gerschgorin disc.
This theorem says to add up the absolute values of the entries of the i^{th} row which
are off the main diagonal and form the disc centered at a_{ii} having this radius. The union
of these discs contains σ
we see that λ is contained in the k^{th} Gerschgorin disc.
More can be said using the theory about counting zeros. To begin with the distance
between two n × n matrices, A =
(aij)
and B =
(bij)
as follows.
||A− B ||2 ≡ ∑ |a − b |2.
ij ij ij
Thus two matrices are close if and only if their corresponding entries are close.
Let A be an n × n matrix. Recall the eigenvalues of A are given by the zeros of the
polynomial, p_{A}
(z)
= det
(zI − A )
where I is the n×n identity. Then small changes in A
will produce small changes in p_{A}
(z)
and p_{A}^{′}
(z)
. Let γ_{k} denote a very small
closed circle which winds around z_{k}, one of the eigenvalues of A, in the counter
clockwise direction so that n
(γk,zk)
= 1. This circle is to enclose only z_{k} and is to
have no other eigenvalue on it. Then apply Theorem 51.6.1. According to this
theorem
1 ∫ p′(z)
2πi pA(z)dz
γ A
is always an integer equal to the multiplicity of z_{k} as a root of p_{A}
(t)
. Therefore,
small changes in A result in no change to the above contour integral because
it must be an integer and small changes in A result in small changes in the
integral. Therefore whenever every entry of the matrix B is close enough to the
corresponding entry of the matrix A, the two matrices have the same number of zeros
inside γ_{k} under the usual convention that zeros are to be counted according to
multiplicity. By making the radius of the small circle equal to ε where ε is less
than the minimum distance between any two distinct eigenvalues of A, this
shows that if B is close enough to A, every eigenvalue of B is closer than ε to
some eigenvalue of A. The next theorem is about continuous dependence of
eigenvalues.
Theorem 51.7.2If λ is an eigenvalue of A, then if
||B − A ||
is small enough,some eigenvalue of B will be within ε of λ.
Consider the situation that A
(t)
is an n×n matrix and that t → A
(t)
is continuous
for t ∈
[0,1]
.
Lemma 51.7.3Let λ
(t)
∈ σ
(A(t))
for t < 1 and let Σ_{t} = ∪_{s≥t}σ
(A (s))
. Alsolet K_{t}be the connected component of λ
(t)
in Σ_{t}. Then there exists η > 0 such thatK_{t}∩ σ
(A (s))
≠∅ for all s ∈
[t,t+ η]
.
Proof: Denote by D
(λ(t) ,δ)
the disc centered at λ
(t)
having radius δ > 0, with
other occurrences of this notation being defined similarly. Thus
D (λ(t),δ) ≡ {z ∈ ℂ : |λ(t) − z| ≤ δ} .
Suppose δ > 0 is small enough that λ
(t)
is the only element of σ
(A (t))
contained in
D
(λ(t),δ)
and that p_{A(t)
} has no zeroes on the boundary of this disc. Then by continuity,
and the above discussion and theorem, there exists η > 0,t + η < 1, such that for
s ∈
[t,t+ η]
, p_{A(s)
} also has no zeroes on the boundary of this disc and that
A
(s)
has the same number of eigenvalues, counted according to multiplicity,
in the disc as A
(t)
. Thus σ
(A(s))
∩ D
(λ (t),δ)
≠∅ for all s ∈
[t,t+ η]
. Now
let
H = ⋃ σ(A (s)) ∩D (λ(t),δ).
s∈[t,t+η]
I will show H is connected. Suppose not. Then H = P ∪Q where P,Q are separated and
λ
(t)
∈ P. Let
s0 ≡ inf{s : λ (s) ∈ Q for some λ (s) ∈ σ (A(s))} .
There exists λ
(s )
0
∈ σ
(A (s ))
0
∩ D
(λ(t),δ)
. If λ
(s )
0
∕∈
Q, then from the above
discussion there are
λ(s) ∈ σ(A (s))∩ Q
for s > s_{0} arbitrarily close to λ
(s0)
. Therefore, λ
(s0)
∈ Q which shows that s_{0}> t
because λ
(t)
is the only element of σ
(A (t))
in D
(λ (t),δ)
and λ
(t)
∈ P. Now let s_{n}↑ s_{0}.
Then λ
(sn)
∈ P for any
λ(s ) ∈ σ(A (s ))∩ D (λ(t),δ)
n n
and from the above discussion, for some choice of s_{n}→ s_{0}, λ
(sn)
→ λ
(s0)
which
contradicts P and Q separated and nonempty. Since P is nonempty, this shows Q = ∅.
Therefore, H is connected as claimed. But K_{t}⊇ H and so K_{t}∩ σ
(A (s))
≠∅ for all
s ∈
[t,t+ η]
. This proves the lemma.
The following is the necessary theorem.
Theorem 51.7.4Suppose A
(t)
is an n×n matrix and that t → A
(t)
is continuousfor t ∈
[0,1]
. Let λ
(0)
∈ σ
(A (0))
and define Σ ≡∪_{t∈[0,1]
}σ
(A (t))
. Let K_{λ(0)
} = K_{0}denote the connected component of λ
(0)
in Σ. Then K_{0}∩ σ
(A (t))
≠∅ for allt ∈
[0,1]
.
Proof:Let S ≡
{t ∈ [0,1] : K0 ∩ σ(A (s)) ⁄= ∅ for all s ∈ [0,t]}
. Then 0 ∈ S.
Let t_{0} = sup
(S)
. Say σ
(A(t0))
= λ_{1}
(t0)
,
⋅⋅⋅
,λ_{r}
(t0)
. I claim at least one of
these is a limit point of K_{0} and consequently must be in K_{0} which will show
that S has a last point. Why is this claim true? Let s_{n}↑ t_{0} so s_{n}∈ S. Now let
the discs, D
(λi(t0),δ)
,i = 1,
⋅⋅⋅
,r be disjoint with p_{A(t0)
} having no zeroes
on γ_{i} the boundary of D
(λi(t0),δ)
. Then for n large enough it follows from
Theorem 51.6.1 and the discussion following it that σ
(A (sn))
is contained in
∪_{i=1}^{r}D
(λi(t0),δ)
. Therefore, K_{0}∩
(σ (A(t0)) + D(0,δ))
≠∅ for all δ small enough. This
requires at least one of the λ_{i}
(t0)
to be in K_{0}. Therefore, t_{0}∈ S and S has a last
point.
Now by Lemma 51.7.3, if t_{0}< 1, then K_{0}∪ K_{t} would be a strictly larger connected
set containing λ
(0)
. (The reason this would be strictly larger is that K_{0}∩σ
(A (s))
= ∅
for some s ∈
(t,t+ η)
while K_{t}∩σ
(A(s))
≠∅ for all s ∈
[t,t+ η]
.) Therefore, t_{0} = 1 and
this proves the theorem.
The following is an interesting corollary of the Gerschgorin theorem.
Corollary 51.7.5Suppose one of the Gerschgorin discs, D_{i}is disjoint from theunion of the others. Then D_{i}contains an eigenvalue of A. Also, if there are ndisjoint Gerschgorin discs, then each one contains an eigenvalue of A.
Proof:Denote by A
(t)
the matrix
( t)
aij
where if i≠j, a_{ij}^{t} = ta_{ij} and a_{ii}^{t} = a_{ii}.
Thus to get A
(t)
we multiply all non diagonal terms by t. Let t ∈
[0,1]
. Then
A
(0)
= diag
(a11,⋅⋅⋅,ann)
and A
(1)
= A. Furthermore, the map, t → A
(t)
is continuous.
Denote by D_{j}^{t} the Gerschgorin disc obtained from the j^{th} row for the matrix, A
(t)
. Then
it is clear that D_{j}^{t}⊆ D_{j} the j^{th} Gerschgorin disc for A. Then a_{ii} is the eigenvalue for
A
(0)
which is contained in the disc, consisting of the single point a_{ii} which is contained
in D_{i}. Letting K be the connected component in Σ for Σ defined in Theorem
51.7.4 which is determined by a_{ii}, it follows by Gerschgorin’s theorem that
K ∩ σ
(A (t))
⊆∪_{j=1}^{n}D_{j}^{t}⊆∪_{j=1}^{n}D_{j} = D_{i}∪
(∪j⁄=iDj )
and also, since K is connected,
there are no points of K in both D_{i} and
(∪j⁄=iDj )
. Since at least one point of K
is in D_{i},(a_{ii}) it follows all of K must be contained in D_{i}. Now by Theorem
51.7.4 this shows there are points of K ∩ σ
(A )
in D_{i}. The last assertion follows
immediately.
Actually, this can be improved slightly. It involves the following lemma.
Lemma 51.7.6In the situation of Theorem 51.7.4suppose λ
(0)
= K_{0}∩σ
(A(0))
andthat λ
(0)
is a simple root of the characteristic equation of A
(0)
. Then for allt ∈
[0,1]
,
σ (A(t)) ∩K = λ (t)
0
where λ
(t)
is a simple root of the characteristic equation of A
(t)
.
Proof:Let S ≡
{t ∈ [0,1] : K0 ∩ σ (A (s)) = λ(s), a simple eigenvalue for all s ∈ [0,t]} .
Then 0 ∈ S so it is nonempty. Let t_{0} = sup
(S )
and suppose λ_{1}≠λ_{2} are two elements of
σ
(A (t))
0
∩ K_{0}. Then choosing η > 0 small enough, and letting D_{i} be disjoint
discs containing λ_{i} respectively, similar arguments to those of Lemma 51.7.3
imply
Hi ≡ ∪s∈[t−η,t]σ (A(s))∩ Di
0 0
is a connected and nonempty set for i = 1,2 which would require that H_{i}⊆ K_{0}. But then
there would be two different eigenvalues of A
(s)
contained in K_{0}, contrary to the
definition of t_{0}. Therefore, there is at most one eigenvalue, λ
(t0)
∈ K_{0}∩σ
(A (t0))
. The
possibility that it could be a repeated root of the characteristic equation must
be ruled out. Suppose then that λ
(t0)
is a repeated root of the characteristic
equation. As before, choose a small disc, D centered at λ
(t0)
and η small enough
that
H ≡ ∪s∈[t0−η,t0]σ (A(s))∩ D
is a nonempty connected set containing either multiple eigenvalues of A
(s)
or else a
single repeated root to the characteristic equation of A
(s)
. But since H is connected and
contains λ
(t0)
it must be contained in K_{0} which contradicts the condition for s ∈ S for
all these s ∈
[t0 − η,t0]
. Therefore, t_{0}∈ S as hoped. If t_{0}< 1, there exists a small disc
centered at λ
(t0)
and η > 0 such that for all s ∈
[t0,t0 + η]
, A
(s)
has only
simple eigenvalues in D and the only eigenvalues of A
(s)
which could be in
K_{0} are in D. (This last assertion follows from noting that λ
(t0)
is the only
eigenvalue of A
(t0)
in K_{0} and so the others are at a positive distance from
K_{0}. For s close enough to t_{0}, the eigenvalues of A
(s)
are either close to these
eigenvalues of A
(t0)
at a positive distance from K_{0} or they are close to the
eigenvalue, λ
(t0)
in which case it can be assumed they are in D.) But this shows
that t_{0} is not really an upper bound to S. Therefore, t_{0} = 1 and the lemma is
proved.
With this lemma, the conclusion of the above corollary can be improved.
Corollary 51.7.7Suppose one of the Gerschgorin discs, D_{i}is disjoint fromthe union of the others. Then D_{i}contains exactly one eigenvalue of A and thiseigenvalue is a simple root to the characteristic polynomial of A.
Proof: In the proof of Corollary 51.7.5, first note that a_{ii} is a simple root of A
(0)
since otherwise the i^{th} Gerschgorin disc would not be disjoint from the others. Also, K,
the connected component determined by a_{ii} must be contained in D_{i} because it is
connected and by Gerschgorin’s theorem above, K ∩ σ
(A(t))
must be contained in the
union of the Gerschgorin discs. Since all the other eigenvalues of A
(0)
, the a_{jj}, are
outside D_{i}, it follows that K ∩ σ
consists of a single simple eigenvalue. This proves the
corollary.
Example 51.7.8Consider the matrix,
( )
5 1 0
( 1 1 1 )
0 1 0
The Gerschgorin discs are D
(5,1)
,D
(1,2)
, and D
(0,1)
. Then D
(5,1)
is disjoint from the other discs. Therefore, there should be an eigenvalue in
D
(5,1)
. The actual eigenvalues are not easy to find. They are the roots of the
characteristic equation, t^{3}− 6t^{2} + 3t + 5 = 0. The numerical values of these
are −.66966,1.4231, and 5.24655, verifying the predictions of Gerschgorin’s
theorem.