The elementary matrices result from doing a row operation to the identity matrix. Recall the following
definition.
Definition 4.5.1 The row operations consist of the following:
1. Switch two rows.
2. Multiply a row by a nonzero number.
3. Replace a row by a multiple of another row added to it.
The elementary matrices are given in the following definition.
Definition 4.5.2 The elementary matrices consist of those matrices which result by applying a single row operation to an identity matrix. Those which involve switching rows of the identity are called permutation matrices^{1}.
As an example of why these elementary matrices are interesting, consider the following.
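\[
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a & b & c & d \\ x & y & z & w \\ f & g & h & i \end{pmatrix}
=
\begin{pmatrix} x & y & z & w \\ a & b & c & d \\ f & g & h & i \end{pmatrix}
\]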
A 3 × 4 matrix was multiplied on the left by an elementary matrix which was obtained from row operation
1 applied to the identity matrix. This resulted in applying the operation 1 to the given matrix. This is what
happens in general.
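The computation above can be checked numerically. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the helper names `matmul`, `identity`, and `swap_rows` and the sample entries of `A` are illustrative choices, not notation from the text.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))]
            for r in range(len(A))]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def swap_rows(M, i, j):
    """Return a copy of M with rows i and j switched (0-based indices)."""
    N = [row[:] for row in M]
    N[i], N[j] = N[j], N[i]
    return N


# Elementary matrix: switch the first two rows of the 3 x 3 identity.
E = swap_rows(identity(3), 0, 1)

# A 3 x 4 matrix with arbitrary sample entries.
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

# Left multiplication by E performs the same switch on A.
product = matmul(E, A)
same = product == swap_rows(A, 0, 1)
print(product)
print(same)
```

Indices in the sketch are 0-based, so rows 0 and 1 correspond to the first and second rows in the text.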
Now consider what these elementary matrices look like. First consider the one which involves switching
row i and row j where i < j. This matrix is of the form
Note how the i^{th} and j^{th} rows are switched in the identity matrix and there are thus all ones on the main
diagonal except for those two positions indicated. The two exceptional rows are shown. The i^{th} row was the
j^{th} and the j^{th} row was the i^{th} in the identity matrix. Now consider what this does to a column
vector.
Now denote by P^{ij} the elementary matrix which comes from the identity by switching rows i
and j. From what was just explained, consider multiplication on the left by this elementary
matrix.
Lemma 4.5.3 Let P^{ij} denote the elementary matrix which involves switching the i^{th} and the j^{th} rows. Then
P^{ij}A = B
where B is obtained from A by switching the i^{th} and the j^{th} rows.
Next consider the row operation which involves multiplying the i^{th} row by a nonzero constant, c. The
elementary matrix which results from applying this operation to the i^{th} row of the identity matrix is of the
form
Denote by E(c,i) this elementary matrix, which multiplies the i^{th} row of the identity by the nonzero
constant c. Then from what was just discussed and the way matrices are multiplied, E(c,i)A is obtained from A by multiplying the i^{th} row of A by c.
The case where i > j is handled similarly. This proves the following lemma.
Lemma 4.5.6 Let E(c × i + j) denote the elementary matrix obtained from I by replacing the j^{th} row with c times the i^{th} row added to it. Then
E(c × i + j)A = B
where B is obtained from A by replacing the j^{th} row of A with itself added to c times the i^{th} row of A.
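Example 4.5.7 Consider the third row operation.
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix}
=
\begin{pmatrix} a & b \\ c & d \\ 2a + e & 2b + f \end{pmatrix}
\]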
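The same computation can be tried with numbers in place of the letters. The sketch below is plain Python under the assumption that matrices are lists of rows; `row_add_matrix` is a hypothetical helper name, and the entries 1 through 6 stand in for a, b, c, d, e, f.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))]
            for r in range(len(A))]


def row_add_matrix(n, c, i, j):
    """n x n identity with row j replaced by itself plus c times row i.

    Indices are 0-based and i != j is assumed.
    """
    E = [[1 if r == k else 0 for k in range(n)] for r in range(n)]
    E[j] = [E[j][k] + c * E[i][k] for k in range(n)]
    return E


# Add 2 times the first row to the third row of the 3 x 3 identity.
E = row_add_matrix(3, 2, 0, 2)

# Sample values standing in for a, b, c, d, e, f.
A = [[1, 2],
     [3, 4],
     [5, 6]]

# The third row of the product is 2 * (first row of A) + (third row of A).
product = matmul(E, A)
print(E)
print(product)
```

Only the third row of `A` changes, matching the pattern 2a + e, 2b + f in the example.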
The next theorem is the main result.
Theorem 4.5.8 To perform any of the three row operations on a matrix A it suffices to do the row operation on the identity matrix, obtaining an elementary matrix E, and then take the product EA. Furthermore, each elementary matrix is invertible and its inverse is an elementary matrix.
Proof: The first part of this theorem has been proved in Lemmas 4.5.3 - 4.5.6. It only remains to verify
the claim about the inverses. Consider first the elementary matrices corresponding to row operation of type
three.
E(−c × i + j)E(c × i + j) = I
This follows because the matrix on the right, E(c × i + j), results from taking c times row i of the identity and adding it to row j. When it is multiplied on the left by E(−c × i + j), it follows from the first part of this theorem that you take the i^{th} row of E(c × i + j), which coincides with the i^{th} row of I since that row was not changed, multiply it by −c, and add it to the j^{th} row of E(c × i + j), which was the j^{th} row of I added to c times the i^{th} row of I. Thus E(−c × i + j), multiplied on the left, undoes the row operation which resulted in E(c × i + j). The same argument applied to the product
E(c × i + j)E(−c × i + j)
with c replaced by −c yields that this product is also equal to I. Therefore,
E(c × i + j)^{−1} = E(−c × i + j).
Similar reasoning shows that for E(c,i), the elementary matrix which comes from multiplying the i^{th}
row by the nonzero constant c,
E(c,i)^{−1} = E(c^{−1}, i).
Finally, consider P^{ij} which involves switching the i^{th} and the j^{th} rows.
P^{ij}P^{ij} = I
because by the first part of this theorem, multiplying on the left by P^{ij} switches the i^{th} and j^{th} rows of P^{ij}
which was obtained from switching the i^{th} and j^{th} rows of the identity. First you switch them to get P^{ij}
and then you multiply on the left by P^{ij} which switches these rows again and restores the identity matrix.
Thus
(P^{ij})^{−1} = P^{ij}. ■
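The three inverse formulas in the proof can be spot-checked on a small case. This is a minimal sketch, not part of the proof: the helper names are hypothetical, indices are 0-based, the size n = 4 and the constant c = 3/2 are arbitrary choices, and `fractions.Fraction` is used so that the arithmetic is exact.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))]
            for r in range(len(A))]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def swap_matrix(n, i, j):
    """Identity with rows i and j switched (plays the role of P^{ij})."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E


def scale_matrix(n, c, i):
    """Identity with row i multiplied by the nonzero constant c (E(c,i))."""
    E = identity(n)
    E[i][i] = c
    return E


def row_add_matrix(n, c, i, j):
    """Identity with c times row i added to row j, i != j (E(c x i + j))."""
    E = identity(n)
    E[j] = [E[j][k] + c * E[i][k] for k in range(n)]
    return E


n = 4
I = identity(n)
c = Fraction(3, 2)  # arbitrary nonzero constant

# Type three: E(-c x i + j) undoes E(c x i + j), in either order.
add_ok = (matmul(row_add_matrix(n, -c, 0, 2), row_add_matrix(n, c, 0, 2)) == I
          and matmul(row_add_matrix(n, c, 0, 2), row_add_matrix(n, -c, 0, 2)) == I)

# Type two: E(1/c, i) undoes E(c, i).
scale_ok = matmul(scale_matrix(n, 1 / c, 1), scale_matrix(n, c, 1)) == I

# Type one: P^{ij} is its own inverse.
swap_ok = matmul(swap_matrix(n, 1, 3), swap_matrix(n, 1, 3)) == I

print(add_ok, scale_ok, swap_ok)
```

A check on one size and one constant illustrates the formulas but does not replace the argument given in the proof.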
Using Theorem 4.3.4, this shows the following result.
Theorem 4.5.9 Let A be an n × n matrix. Then if R is its row reduced echelon form, there is a sequence of elementary matrices E_{i} such that
E_{1}E_{2} ⋯ E_{m}A = R
In particular, A is invertible if and only if there is a sequence of elementary matrices as above such that