The elementary matrices result from doing a row operation to the identity matrix.
As before, everything will apply to matrices having coefficients in an arbitrary field of
scalars, although we will mainly feature the real numbers in the examples.
Definition A.6.1 The row operations consist of the following.

1. Switch two rows.
2. Multiply a row by a nonzero number.
3. Replace a row by the same row added to a multiple of another row.

We refer to these as the row operations of type 1, 2, and 3 respectively.
The elementary matrices are given in the following definition.
Definition A.6.2 The elementary matrices consist of those matrices which result by applying a row operation to an identity matrix. Those which involve switching rows of the identity are called permutation matrices. More generally, a permutation matrix is a matrix which comes by permuting the rows of the identity matrix, not just switching two rows.
As an example of why these elementary matrices are interesting, consider the
following.
$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a & b & c & d \\ x & y & z & w \\ f & g & h & i \end{pmatrix}
= \begin{pmatrix} x & y & z & w \\ a & b & c & d \\ f & g & h & i \end{pmatrix}.$$
A 3 × 4 matrix was multiplied on the left by an elementary matrix obtained by applying row operation 1 to the identity matrix. The result is the same as applying operation 1 directly to the given matrix. This is what happens in general.
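The computation above can be checked numerically. Below is a small sketch using numpy; the numeric entries stand in for the symbolic $a, b, \ldots, i$, and nothing here is specific to those values:

```python
import numpy as np

# Elementary (permutation) matrix: switch rows 1 and 2 of the 3x3 identity.
P12 = np.eye(3)
P12[[0, 1]] = P12[[1, 0]]

# A 3x4 matrix with numeric stand-ins for the symbolic entries.
A = np.array([[1.0,  2.0,  3.0,  4.0],
              [5.0,  6.0,  7.0,  8.0],
              [9.0, 10.0, 11.0, 12.0]])

# Multiplying on the left applies the row switch to A.
B = P12 @ A
print(B)   # first two rows of A are swapped, third row unchanged
```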
Now consider what these elementary matrices look like. First consider $P^{ij}$, which involves switching row i and row j of the identity, where i < j. This matrix is the identity matrix with the ith and jth rows interchanged.
Lemma A.6.3 Let $P^{ij}$ denote the elementary matrix which involves switching the ith and the jth rows of I. Then if $P^{ij}$, $A$ are conformable, we have
$$P^{ij}A = B$$
where B is obtained from A by switching the ith and the jth rows.
Next consider the row operation which involves multiplying the ith row by a nonzero
constant, c. We write
$$I = \begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{pmatrix}$$
where
$$r_j = \begin{pmatrix} 0 & \cdots & 1 & \cdots & 0 \end{pmatrix}$$
with the 1 in the jth position from the left. The elementary matrix which results from
applying this operation to the ith row of the identity matrix is of the form
$$E(c,i) = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & c & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix},$$
the identity matrix with the entry in the ith row and ith column replaced by $c$. Let $E(c,i)$ denote this elementary matrix which multiplies the ith row of the identity by the nonzero constant $c$. Then from what was just discussed and the way matrices are multiplied,
$$E(c,i)\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{ip} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}$$
equals a matrix having the columns indicated below:
$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1p} \\ \vdots & \vdots & & \vdots \\ ca_{i1} & ca_{i2} & \cdots & ca_{ip} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{np} \end{pmatrix}.$$
This proves the following lemma.

Lemma A.6.4 Let $E(c,i)$ denote the elementary matrix corresponding to the row operation in which the ith row is multiplied by the nonzero constant $c$. Then if $E(c,i)$, $A$ are conformable,
$$E(c,i)A = B$$
where B is obtained from A by multiplying the ith row of A by c.

Finally consider the third of the row operations, in which $c$ times the ith row is added to the jth row. For $i < j$, the resulting elementary matrix $E(c\times i+j)$ is the identity with the entry in the jth row and ith column replaced by $c$, and multiplying $E(c\times i+j)A$ adds $c$ times the ith row of $A$ to the jth row of $A$. The case where i > j is similar. This proves the following lemma in which, as above, the ith row of the identity is $r_i$.
Lemma A.6.5 Let $E(c\times i+j)$ denote the elementary matrix obtained from I by replacing the jth row of the identity, $r_j$, with $cr_i + r_j$. Letting the kth row of A be $a_k$,
$$E(c\times i+j)A = B$$
where B has the same rows as A except that the jth row of B is $ca_i + a_j$.
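Numerically, this lemma is easy to check. The following numpy sketch builds $E(2\times 1+3)$ and verifies its effect; the matrix sizes and entries are illustrative only, and numpy's 0-based indices correspond to the text's 1-based rows shifted down by one:

```python
import numpy as np

# E(c x i + j) for c = 2, i = 1, j = 3 (1-based rows, as in the text).
# In 0-based numpy indexing these are rows 0 and 2.
c = 2.0
E = np.eye(3)
E[2, 0] = c          # identity with c placed in row j, column i

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Row 3 of B is c*(row 1 of A) + (row 3 of A): 2*[1, 2] + [5, 6] = [7, 10].
# The other rows of B are unchanged.
B = E @ A
print(B)
```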
The above lemmas are summarized in the following theorem.
Theorem A.6.6 To perform any of the three row operations on a matrix A it suffices to do the row operation on the identity matrix, obtaining an elementary matrix E, and then take the product, EA. In addition to this, the following identities hold for the elementary matrices described above.
$$E(c\times i+j)\,E(-c\times i+j) = E(-c\times i+j)\,E(c\times i+j) = I. \tag{1.8}$$
$$E(c,i)\,E(c^{-1},i) = E(c^{-1},i)\,E(c,i) = I. \tag{1.9}$$
$$P^{ij}P^{ij} = I. \tag{1.10}$$
Proof: Consider (1.8). Starting with I and taking $-c$ times the ith row added to the jth yields $E(-c\times i+j)$, which differs from I only in the jth row. Now multiplying on the left by $E(c\times i+j)$ takes $c$ times the ith row and adds it to the jth, thus restoring the jth row to its original state. Thus $E(c\times i+j)E(-c\times i+j) = I$. Similarly $E(-c\times i+j)E(c\times i+j) = I$. The reasoning is similar for (1.9) and (1.10). ■
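The three identities (1.8)–(1.10) can also be confirmed numerically. Here is a minimal numpy sketch, with an arbitrarily chosen size n, constant c, and 0-based rows i, j:

```python
import numpy as np

n, c, i, j = 4, 3.0, 1, 2   # arbitrary illustrative choices, i != j

# Type 3: E(c x i + j) and its claimed inverse E(-c x i + j), identity (1.8).
E3 = np.eye(n); E3[j, i] = c
E3_inv = np.eye(n); E3_inv[j, i] = -c
assert np.allclose(E3 @ E3_inv, np.eye(n))
assert np.allclose(E3_inv @ E3, np.eye(n))

# Type 2: E(c, i) and E(1/c, i), identity (1.9).
E2 = np.eye(n); E2[i, i] = c
E2_inv = np.eye(n); E2_inv[i, i] = 1.0 / c
assert np.allclose(E2 @ E2_inv, np.eye(n))
assert np.allclose(E2_inv @ E2, np.eye(n))

# Type 1: a row-switch permutation matrix is its own inverse, identity (1.10).
P = np.eye(n); P[[i, j]] = P[[j, i]]
assert np.allclose(P @ P, np.eye(n))

print("identities (1.8)-(1.10) verified")
```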
Each of these elementary matrices has a notable geometric effect. The following picture shows the effect of applying $E\left(\frac{1}{2}\times 3+1\right)$ to a box. You will see that it shears the box in one direction. Of course there would be corresponding shears in the other directions also. Note that this does not change the volume. You should think about the geometric effect of the other elementary matrices on a box.

[Figure: a box sheared by $E\left(\frac{1}{2}\times 3+1\right)$.]
Definition A.6.7 For an n × n matrix A, an n × n matrix B which has the property that AB = BA = I is denoted by $A^{-1}$. Such a matrix is called an inverse. When A has an inverse, it is called invertible.
The following lemma says that if a matrix acts like an inverse, then it is the inverse. Also,
the product of invertible matrices is invertible.
Lemma A.6.8 If B, C are both inverses of A, then B = C. That is, there exists at most one inverse of a matrix. If $A_1, \cdots, A_m$ are each invertible m × m matrices, then the product $A_1A_2\cdots A_m$ is also invertible and
$$(A_1A_2\cdots A_m)^{-1} = A_m^{-1}A_{m-1}^{-1}\cdots A_1^{-1}.$$
Proof. From the definition and associative law of matrix multiplication,
$$B = BI = B(AC) = (BA)C = IC = C.$$
This proves the uniqueness of the inverse.
Next suppose A,B are invertible. Then
$$(AB)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = AIA^{-1} = AA^{-1} = I$$
and also
$$\left(B^{-1}A^{-1}\right)(AB) = B^{-1}\left(A^{-1}A\right)B = B^{-1}IB = B^{-1}B = I.$$
It follows from Definition A.6.7 that AB has an inverse and it is $B^{-1}A^{-1}$. Thus the case of m = 1, 2 in the claim of the lemma is true. Suppose this claim is true for k.
Then
$$A_1A_2\cdots A_kA_{k+1} = (A_1A_2\cdots A_k)A_{k+1}.$$
By induction, the two matrices $(A_1A_2\cdots A_k)$, $A_{k+1}$ are both invertible and
$$(A_1A_2\cdots A_k)^{-1} = A_k^{-1}\cdots A_2^{-1}A_1^{-1}.$$
By the case of the product of two invertible matrices shown above,
$$\left((A_1A_2\cdots A_k)A_{k+1}\right)^{-1} = A_{k+1}^{-1}(A_1A_2\cdots A_k)^{-1} = A_{k+1}^{-1}A_k^{-1}\cdots A_2^{-1}A_1^{-1}.$$
This proves the lemma. ■
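The reversed-order formula for the inverse of a product is easy to confirm numerically. A numpy sketch, where random matrices stand in for $A_1, A_2, A_3$ (a generic random matrix is invertible with probability 1; a singular draw would raise `LinAlgError`):

```python
import numpy as np

rng = np.random.default_rng(0)
# Three generic (hence almost surely invertible) 3x3 matrices.
A1, A2, A3 = (rng.standard_normal((3, 3)) for _ in range(3))

# (A1 A2 A3)^{-1} should equal A3^{-1} A2^{-1} A1^{-1}: note the reversed order.
lhs = np.linalg.inv(A1 @ A2 @ A3)
rhs = np.linalg.inv(A3) @ np.linalg.inv(A2) @ np.linalg.inv(A1)
assert np.allclose(lhs, rhs)
print("product inverse formula verified")
```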
We will discuss methods for finding the inverse later. For now, observe that Theorem A.6.6
says that elementary matrices are invertible and that the inverse of such a matrix is also an
elementary matrix. The major conclusion of the above Lemma and Theorem is the following
lemma about linear relationships.
Definition A.6.9 Let $v_1, \cdots, v_k, u$ be vectors. Then u is said to be a linear combination of the vectors $\{v_1, \cdots, v_k\}$ if there exist scalars $c_1, \cdots, c_k$ such that
$$u = \sum_{i=1}^{k} c_i v_i.$$
We also say that when the above holds for some scalars $c_1, \cdots, c_k$, there exists a linear relationship between the vector u and the vectors $\{v_1, \cdots, v_k\}$.
We will discuss this more later, but the following picture illustrates the geometric
significance of the vectors which have a linear relationship with two vectors u,v pointing in
different directions.
[Figure: the vectors having a linear relationship with two vectors u, v pointing in different directions.]
The following lemma states that linear relationships between columns in a matrix are
preserved by row operations. This simple lemma is the main result in understanding all the
major questions related to the row reduced echelon form as well as many other
topics.
Lemma A.6.10 Let A and B be two m × n matrices and suppose B results from a row operation applied to A. Then the kth column of B is a linear combination of the $i_1, \cdots, i_r$ columns of B if and only if the kth column of A is a linear combination of the $i_1, \cdots, i_r$ columns of A. Furthermore, the scalars in the linear combinations are the same. (The linear relationship between the kth column of A and the $i_1, \cdots, i_r$ columns of A is the same as the linear relationship between the kth column of B and the $i_1, \cdots, i_r$ columns of B.)
Proof. Let A be the following matrix in which the $a_k$ are the columns
$$\begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix}$$
and let B be the following matrix in which the columns are given by the $b_k$
$$\begin{pmatrix} b_1 & b_2 & \cdots & b_n \end{pmatrix}.$$
Then by Theorem A.6.6, $b_k = Ea_k$ where E is an elementary matrix. Suppose then that one of the columns of A is a linear combination of some other columns of A. Say
$$a_k = \sum_{j=1}^{r} c_j a_{i_j}.$$
Then multiplying by E,
$$b_k = Ea_k = \sum_{j=1}^{r} c_j Ea_{i_j} = \sum_{j=1}^{r} c_j b_{i_j},$$
so the same linear relationship, with the same scalars, holds among the corresponding columns of B. Since E is invertible, multiplying by $E^{-1}$ gives the converse implication. ■
Example A.6.11 Find linear relationships between the columns of the matrix
$$A = \begin{pmatrix} 1 & 3 & 11 & 10 & 36 \\ 1 & 2 & 8 & 9 & 23 \\ 1 & 1 & 5 & 8 & 10 \end{pmatrix}.$$
It is not clear what the relationships are, so we do row operations to this matrix. Lemma
A.6.10 says that all the linear relationships between columns are preserved, so the idea is to
do row operations until a matrix results which has the property that the linear relationships
are obvious. First take −1 times the top row and add to the two bottom rows. This yields
$$\begin{pmatrix} 1 & 3 & 11 & 10 & 36 \\ 0 & -1 & -3 & -1 & -13 \\ 0 & -2 & -6 & -2 & -26 \end{pmatrix}.$$
Next take −2 times the middle row and add to the bottom row followed by multiplying the
middle row by −1 :
$$\begin{pmatrix} 1 & 3 & 11 & 10 & 36 \\ 0 & 1 & 3 & 1 & 13 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
Next take −3 times the middle row added to the top:
$$\begin{pmatrix} 1 & 0 & 2 & 7 & -3 \\ 0 & 1 & 3 & 1 & 13 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \tag{1.11}$$
At this point it is clear that the last column is −3 times the first column added to 13 times the second. By Lemma A.6.10, the same is true of the corresponding columns in the original matrix A. As a check,
$$-3\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + 13\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 36 \\ 23 \\ 10 \end{pmatrix},$$
which is indeed the last column of A.
You should notice that other linear relationships are also easily seen from (1.11). For
example the fourth column is 7 times the first added to the second. This is obvious from
(1.11) and Lemma A.6.10 says the same relationship holds for A.
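Lemma A.6.10 promises that the relationships read off from the row reduced matrix (1.11) hold verbatim, with the same scalars, in the original matrix A. A numpy sketch confirming this for the third, fourth, and fifth columns:

```python
import numpy as np

# The original matrix A from Example A.6.11.
A = np.array([[1.0, 3.0, 11.0, 10.0, 36.0],
              [1.0, 2.0,  8.0,  9.0, 23.0],
              [1.0, 1.0,  5.0,  8.0, 10.0]])

# Relationships read off from the row reduced form (1.11):
assert np.allclose(2 * A[:, 0] + 3 * A[:, 1], A[:, 2])      # third column
assert np.allclose(7 * A[:, 0] + A[:, 1], A[:, 3])          # fourth column
assert np.allclose(-3 * A[:, 0] + 13 * A[:, 1], A[:, 4])    # fifth column
print("linear relationships hold in the original A")
```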
This is really just an extension of the technique for finding solutions to a linear system of
equations. In solving a system of equations earlier, row operations were used to exhibit the
last column of an augmented matrix as a linear combination of the preceding columns. The
row reduced echelon form just extends this by making obvious the linear relationships
between every column, not just the last, and those columns preceding it. The matrix in
(1.11) is in row reduced echelon form. The row reduced echelon form is the topic of the next
section.