The elementary matrices result from doing a row operation to the identity matrix.
As before, everything will apply to matrices having coefficients in an arbitrary field of
scalars, although we will mainly feature the real numbers in the examples.
Definition A.6.1 The row operations consist of the following.

1. Switch two rows.
2. Multiply a row by a nonzero number.
3. Replace a row by the same row added to a multiple of another row.

We refer to these as the row operations of types 1, 2, and 3 respectively.
The elementary matrices are given in the following definition.
Definition A.6.2 The elementary matrices consist of those matrices which result by applying a row operation to an identity matrix. Those which involve switching rows of the identity are called permutation matrices. More generally, a permutation matrix is a matrix which comes by permuting the rows of the identity matrix, not just switching two rows.
As an example of why these elementary matrices are interesting, consider the
following.
\[
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a & b & c & d \\ x & y & z & w \\ f & g & h & i \end{pmatrix}
=
\begin{pmatrix} x & y & z & w \\ a & b & c & d \\ f & g & h & i \end{pmatrix}.
\]
A 3 × 4 matrix was multiplied on the left by an elementary matrix which was obtained from
row operation 1 applied to the identity matrix. This resulted in applying the operation 1 to
the given matrix. This is what happens in general.
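The observation above can be checked numerically. The following is a minimal sketch in plain Python (the `matmul` helper and the example matrix are mine, not from the text): multiplying a 3 × 4 matrix on the left by the permutation matrix obtained from switching the first two rows of the identity switches the first two rows of the matrix.

```python
# matmul is a small hypothetical helper; matrices are lists of rows.
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

P = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 1]]            # identity with the first two rows switched

A = [[1, 2, 3, 4],         # an example 3 x 4 matrix
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

B = matmul(P, A)
print(B)  # the first two rows of A are switched
```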
Now consider what these elementary matrices look like. First consider P^{ij}, which involves switching row i and row j of the identity, where i < j.
Lemma A.6.3 Let P^{ij} denote the elementary matrix which involves switching the i^{th} and the j^{th} rows of I. Then if P^{ij}, A are conformable, we have
\[
P^{ij}A = B
\]
where B is obtained from A by switching the i^{th} and the j^{th} rows.
Next consider the row operation which involves multiplying the i^{th} row by a nonzero constant c. We write
\[
I = \begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{pmatrix}
\]
where
\[
r_j = \begin{pmatrix} 0 & \cdots & 1 & \cdots & 0 \end{pmatrix}
\]
with the 1 in the j^{th} position from the left. The elementary matrix which results from applying this operation to the i^{th} row of the identity matrix is of the form
\[
\begin{pmatrix} r_1 \\ \vdots \\ cr_i \\ \vdots \\ r_n \end{pmatrix}.
\]
Denote by E(c, i) this elementary matrix which multiplies the i^{th} row of the identity by the nonzero constant c. Then from what was just discussed and the way matrices are multiplied,
\[
E(c,i)
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1p} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ip} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{np}
\end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1p} \\
\vdots & \vdots & & \vdots \\
ca_{i1} & ca_{i2} & \cdots & ca_{ip} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{np}
\end{pmatrix},
\]
a matrix having the same rows as the original except that the i^{th} row is multiplied by c.
A similar argument applies to the third row operation, which replaces the j^{th} row of the identity with c times the i^{th} row added to the j^{th} row; the case where i > j is handled in the same way as the case where i < j. This proves the following lemma in which, as above, the i^{th} row of the identity is r_{i}.
Lemma A.6.5 Let E(c × i + j) denote the elementary matrix obtained from I by replacing the j^{th} row of the identity, r_{j}, with cr_{i} + r_{j}. Letting the k^{th} row of A be a_{k},
\[
E(c \times i + j)A = B
\]
where B has the same rows as A except that the j^{th} row of B is ca_{i} + a_{j}.
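The three lemmas can be illustrated together. Below is a sketch in plain Python (the `matmul` and `identity` helpers and the example matrix A are my own, not from the text) showing each type of elementary matrix acting by left multiplication, in agreement with Lemmas A.6.3 and A.6.5; indexing is 0-based as is usual in code.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Type 1: switch rows 0 and 1 of the identity.
P = identity(3)
P[0], P[1] = P[1], P[0]
B1 = matmul(P, A)          # rows 0 and 1 of A are switched
print(B1)

# Type 2: E(c, i) multiplies row i by the nonzero constant c.
E2 = identity(3)
E2[1][1] = 5               # c = 5, i = 1
B2 = matmul(E2, A)         # row 1 of A is multiplied by 5
print(B2)

# Type 3: E(c x i + j) replaces row j with c*(row i) + (row j).
E3 = identity(3)
E3[2][0] = 2               # c = 2, i = 0, j = 2
B3 = matmul(E3, A)         # row 2 becomes 2*(row 0) + (row 2)
print(B3)
```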
The above lemmas are summarized in the following theorem.
Theorem A.6.6 To perform any of the three row operations on a matrix A it suffices to do the row operation on the identity matrix, obtaining an elementary matrix E, and then take the product, EA. In addition to this, the following identities hold for the elementary matrices described above.
\[
E(c \times i + j)E(-c \times i + j) = E(-c \times i + j)E(c \times i + j) = I. \tag{1.8}
\]
\[
E(c,i)E(c^{-1},i) = E(c^{-1},i)E(c,i) = I. \tag{1.9}
\]
\[
P^{ij}P^{ij} = I. \tag{1.10}
\]
Proof: Consider (1.8). Starting with I and taking −c times the i^{th} row added to the j^{th} yields E(−c × i + j), which differs from I only in the j^{th} row. Now multiplying on the left by E(c × i + j) takes c times the i^{th} row and adds it to the j^{th}, thus restoring the j^{th} row to its original state. Thus E(c × i + j)E(−c × i + j) = I. Similarly E(−c × i + j)E(c × i + j) = I. The reasoning is similar for (1.9) and (1.10). ■
Each of these elementary matrices has a significant geometric effect. The following picture shows the effect of applying E(1/2 × 3 + 1) to a box. You will see that it shears the box in one direction. Of course there would be corresponding shears in the other directions also. Note that this does not change the volume. You should think about the geometric effect of the other elementary matrices on a box.
[Figure: a box and its sheared image under E(1/2 × 3 + 1)]
Definition A.6.7 For an n × n matrix A, an n × n matrix B which has the property that AB = BA = I is denoted by A^{−1}. Such a matrix is called an inverse. When A has an inverse, it is called invertible.
The following lemma says that if a matrix acts like an inverse, then it is the inverse. Also,
the product of invertible matrices is invertible.
Lemma A.6.8 If B, C are both inverses of A, then B = C. That is, there exists at most one inverse of a matrix. If A_{1}, ⋯, A_{m} are each invertible m × m matrices, then the product A_{1}A_{2}⋯A_{m} is also invertible and
\[
(A_1 A_2 \cdots A_m)^{-1} = A_m^{-1} A_{m-1}^{-1} \cdots A_1^{-1}.
\]
Proof. From the definition and associative law of matrix multiplication,
B = BI = B (AC ) = (BA )C = IC = C.
This proves the uniqueness of the inverse.
Next suppose A,B are invertible. Then
\[
(AB)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = AIA^{-1} = AA^{-1} = I
\]
and also
\[
\left(B^{-1}A^{-1}\right)(AB) = B^{-1}\left(A^{-1}A\right)B = B^{-1}IB = B^{-1}B = I.
\]
It follows from Definition A.6.7 that AB has an inverse and it is B^{−1}A^{−1}. Thus the case
of m = 1,2 in the claim of the lemma is true. Suppose this claim is true for k.
Then
\[
A_1 A_2 \cdots A_k A_{k+1} = (A_1 A_2 \cdots A_k) A_{k+1}.
\]
By induction, the two matrices (A_{1}A_{2}⋯A_{k}), A_{k+1} are both invertible and
\[
(A_1 A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1} A_1^{-1}.
\]
By the case of the product of two invertible matrices shown above,
\[
\left((A_1 A_2 \cdots A_k) A_{k+1}\right)^{-1} = A_{k+1}^{-1} (A_1 A_2 \cdots A_k)^{-1} = A_{k+1}^{-1} A_k^{-1} \cdots A_2^{-1} A_1^{-1}.
\]
This proves the lemma. ■
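The reversed-order formula can be checked with elementary matrices, since their inverses are known explicitly from Theorem A.6.6. Below is a sketch in plain Python; the three factors A1, A2, A3 and the `matmul`/`identity` helpers are my own example choices, not from the text.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

I = identity(3)

A1 = identity(3); A1[1][0] = 3                 # E(3 x 0 + 1)
A2 = identity(3); A2[2][2] = 2.0               # E(2, 2)
A3 = identity(3); A3[0], A3[1] = A3[1], A3[0]  # a permutation matrix

A1_inv = identity(3); A1_inv[1][0] = -3        # E(-3 x 0 + 1)
A2_inv = identity(3); A2_inv[2][2] = 0.5       # E(1/2, 2)
A3_inv = A3                                    # this permutation is its own inverse

product = matmul(matmul(A1, A2), A3)
reversed_invs = matmul(matmul(A3_inv, A2_inv), A1_inv)

# (A1 A2 A3)^{-1} = A3^{-1} A2^{-1} A1^{-1}: the product in reversed order.
assert matmul(product, reversed_invs) == I
assert matmul(reversed_invs, product) == I
print("reversed-order inverse confirmed")
```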
We will discuss methods for finding the inverse later. For now, observe that Theorem A.6.6
says that elementary matrices are invertible and that the inverse of such a matrix is also an
elementary matrix. The major conclusion of the above Lemma and Theorem is the following
lemma about linear relationships.
Definition A.6.9 Let v_{1}, ⋯, v_{k}, u be vectors. Then u is said to be a linear combination of the vectors {v_{1}, ⋯, v_{k}} if there exist scalars c_{1}, ⋯, c_{k} such that
\[
u = \sum_{i=1}^{k} c_i v_i.
\]
We also say that when the above holds for some scalars c_{1}, ⋯, c_{k}, there exists a linear relationship between the vector u and the vectors {v_{1}, ⋯, v_{k}}.
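As a small concrete instance of the definition (the vectors and scalars below are my own example), u is a linear combination of v1, v2 with scalars c1 = 2, c2 = −1:

```python
v1 = [1, 0, 3]
v2 = [2, 1, 1]
c1, c2 = 2, -1

# u = c1*v1 + c2*v2, computed componentwise.
u = [c1 * a + c2 * b for a, b in zip(v1, v2)]
print(u)  # [0, -1, 5]
```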
We will discuss this more later, but the following picture illustrates the geometric
significance of the vectors which have a linear relationship with two vectors u,v pointing in
different directions.
[Figure: the vectors having a linear relationship with u and v fill out the plane containing u and v]
The following lemma states that linear relationships between columns in a matrix are
preserved by row operations. This simple lemma is the main result in understanding all the
major questions related to the row reduced echelon form as well as many other
topics.
Lemma A.6.10 Let A and B be two m × n matrices and suppose B results from a row operation applied to A. Then the k^{th} column of B is a linear combination of the i_{1}, ⋯, i_{r} columns of B if and only if the k^{th} column of A is a linear combination of the i_{1}, ⋯, i_{r} columns of A. Furthermore, the scalars in the linear combinations are the same. (The linear relationship between the k^{th} column of A and the i_{1}, ⋯, i_{r} columns of A is the same as the linear relationship between the k^{th} column of B and the i_{1}, ⋯, i_{r} columns of B.)
Proof. Let A be the following matrix in which the a_{k} are the columns
\[
\begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix}
\]
and let B be the following matrix in which the columns are given by the b_{k}
\[
\begin{pmatrix} b_1 & b_2 & \cdots & b_n \end{pmatrix}.
\]
Then by Theorem A.6.6, b_{k} = Ea_{k} where E is an elementary matrix. Suppose then that one of the columns of A is a linear combination of some other columns of A. Say
\[
a_k = c_1 a_{i_1} + \cdots + c_r a_{i_r}.
\]
Then since multiplication by E distributes over sums and commutes with scalar multiplication,
\[
b_k = E a_k = c_1 E a_{i_1} + \cdots + c_r E a_{i_r} = c_1 b_{i_1} + \cdots + c_r b_{i_r},
\]
so the same linear relationship, with the same scalars, holds between the columns of B. Since E is invertible, the same argument applied to E^{-1} gives the converse. ■
Example A.6.11Find linear relationships between the columns of the matrix
\[
A = \begin{pmatrix} 1 & 3 & 11 & 10 & 36 \\ 1 & 2 & 8 & 9 & 23 \\ 1 & 1 & 5 & 8 & 10 \end{pmatrix}.
\]
It is not clear what the relationships are, so we do row operations to this matrix. Lemma
A.6.10 says that all the linear relationships between columns are preserved, so the idea is to
do row operations until a matrix results which has the property that the linear relationships
are obvious. First take −1 times the top row and add to the two bottom rows. This
yields
Next take −2 times the middle row and add to the bottom row, followed by multiplying the middle row by −1:
\[
\begin{pmatrix} 1 & 3 & 11 & 10 & 36 \\ 0 & 1 & 3 & 1 & 13 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.
\]
Next take −3 times the middle row added to the top:
\[
\begin{pmatrix} 1 & 0 & 2 & 7 & -3 \\ 0 & 1 & 3 & 1 & 13 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \tag{1.11}
\]
At this point it is clear that the last column is −3 times the first column added to 13 times the second. By Lemma A.6.10, the same is true of the corresponding columns in the original matrix A. As a check,
\[
-3\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + 13\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 36 \\ 23 \\ 10 \end{pmatrix},
\]
which is indeed the last column of A.
You should notice that other linear relationships are also easily seen from (1.11). For
example the fourth column is 7 times the first added to the second. This is obvious from
(1.11) and Lemma A.6.10 says the same relationship holds for A.
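Lemma A.6.10 can also be checked mechanically on this example. The sketch below, in plain Python (the `col` helper is my own), verifies that the same scalars −3 and 13 relate the fifth column to the first two columns in both A and its row reduced form (1.11).

```python
A = [[1, 3, 11, 10, 36],
     [1, 2, 8, 9, 23],
     [1, 1, 5, 8, 10]]

R = [[1, 0, 2, 7, -3],      # the row reduced matrix (1.11)
     [0, 1, 3, 1, 13],
     [0, 0, 0, 0, 0]]

def col(M, j):
    """Extract column j of a matrix stored as a list of rows."""
    return [row[j] for row in M]

for M in (A, R):
    combo = [-3 * x + 13 * y for x, y in zip(col(M, 0), col(M, 1))]
    assert combo == col(M, 4)   # the same scalars work in both matrices
print("linear relationship preserved by row operations")
```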
This is really just an extension of the technique for finding solutions to a linear system of
equations. In solving a system of equations earlier, row operations were used to exhibit the
last column of an augmented matrix as a linear combination of the preceding columns. The
row reduced echelon form just extends this by making obvious the linear relationships
between every column, not just the last, and those columns preceding it. The matrix in
(1.11) is in row reduced echelon form. The row reduced echelon form is the topic of the next
section.