Matrices II

Section 2.4 Matrices II

Having seen what the world is like from the abstract viewpoint, it is now time to take a deep breath with mindless computations if only to reassure ourselves that we understand what is going on. So in this section, we will return to Section 1.3 and exercise Exercise 2.3.2 in order to solve the fundamental linear equation:

\begin{equation} T (\mathbf{u} ) = \mathbf{v} \tag{2.4.1} \end{equation}

where

\begin{equation*} T : U \to V \end{equation*}

is a linear transformation. This is the abstract version of the equation, but the computational one is the case where

\begin{equation*} U = K^n, \hspace{.2in} V = K^m \end{equation*}

and \(T\) is multiplication by an \(m \times n\) matrix \(A\text{.}\) Every linear algebra course that has ever been given and will ever be given spends time solving this equation, and we are no exception! So let us start by writing it out with variables, entries and constants. Our variables (i.e. potential solutions) are the coordinates in the column vector \(\mb{u}\) so we write

\begin{equation*} \mb{u} = \mb{x} = \left[ \begin{matrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{matrix} \right] . \end{equation*}

Our entries (or, as we see in a moment, coefficients) of this equation are those of our matrix \(A\)

\begin{equation*} A = \left[ \begin{matrix} a_{11} \amp a_{12} \amp \cdots \amp a_{1n} \\ a_{21} \amp a_{22} \amp \cdots \amp a_{2n} \\ \vdots \amp \ddots \amp \amp \vdots \\ a_{m1} \amp a_{m2} \amp \cdots \amp a_{mn} \end{matrix} \right] . \end{equation*}

Finally our constants are coordinates of \(\mathbf{v}\) which are often written as

\begin{equation*} \mb{v} = \mb{b} = \left[ \begin{matrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{matrix} \right] . \end{equation*}

Now equation (2.4.1) turns into the matrix equation

\begin{equation} A \mb{x} = \mb{b} \tag{2.4.2} \end{equation}

which, when written out using equation (2.3.1), gives the linear system of equations

\begin{align} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \amp = b_1 , \tag{2.4.3}\\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \amp = b_2, \tag{2.4.4}\\ \vdots \amp \vdots \tag{2.4.5}\\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \amp = b_m. \tag{2.4.6} \end{align}

Before discussing the algorithm to solve these equations, we take a moment to look at three examples.

Example 2.4.1. Overdetermined systems.

Let

\begin{equation*} A = \left[ \begin{matrix} 1 \amp 0 \\ 1 \amp 1 \\ 0 \amp 1 \end{matrix} \right] \end{equation*}

and

\begin{equation*} \mb{b} = \threevec{1}{1}{1} . \end{equation*}

Then our system of equations is in fact three equations of two variables

\begin{align*} x_1 + 0 \amp = 1 ,\\ x_1 + x_2 \amp = 1 , \\ 0 + x_2 \amp = 1. \end{align*}

The first and the third equation tell us that any solution must have \(x_1\) and \(x_2\) both equal to \(1\text{.}\) But the second equation then would imply \(2 = 1 + 1 = 1\text{,}\) which does not make sense here on earth. So for this example there are no solutions! This case where the number of equations exceeds the number of variables is called an overdetermined system of linear equations. For these, it is often the case that no solution exists (although there certainly are exceptions!).

Let us see what happens if we just flip our matrix (called taking a transpose):

Example 2.4.2. Underdetermined systems.

Let

\begin{equation*} A = \left[ \begin{matrix} 1 \amp 1 \amp 0 \\ 0 \amp 1 \amp 1 \end{matrix} \right] \end{equation*}

and

\begin{equation*} \mb{b} = \twovec{1}{1} . \end{equation*}

Now our system of equations is two equations of three variables

\begin{align*} x_1 + x_2 + 0 \amp = 1 ,\\ 0 + x_2 + x_3 \amp = 1. \end{align*}

Now, playing around a little with these equations, it may occur to you that, for any number \(t\text{,}\) the equations

\begin{align*} x_1 \amp = t , \\ x_2 \amp = 1 - t, \\ x_3 \amp = t \end{align*}

give a solution. So for this example there are an infinite number of solutions! In fact, we can consider the function

\begin{equation*} \mb{x} (t) := \threevec{t}{1 - t}{t} \end{equation*}

as a parametrization of all solutions. This case where the number of equations is less than the number of variables is called an underdetermined system of linear equations. For these, it is often the case that an infinite number of solutions exist (although there are cases when none exist at all!).

Example 2.4.3. Systems with unique solutions.

For this final example we will cut off the last column of the previous example’s matrix (called taking a submatrix). So we consider

\begin{equation*} A = \left[ \begin{matrix} 1 \amp 1 \\ 0 \amp 1 \end{matrix} \right] \end{equation*}

and keep

\begin{equation*} \mb{b} = \twovec{1}{1} \end{equation*}

giving two equations in two variables

\begin{align*} x_1 + x_2 \amp = 1,\\ 0 + x_2 \amp = 1. \end{align*}

Notice that the second equation fully specifies \(x_2 = 1\text{,}\) so substituting into the first and solving gives \(x_1 = 0\text{.}\) So

\begin{equation*} \twovec{x_1}{x_2} = \twovec{0}{1} \end{equation*}

is the only solution. In this case, a solution exists and it is unique.

Any technique to solve the general equation (2.4.2) should help us answer:

Whether a solution exists.
If it does exist, is it unique?
If it is not unique, what is a function \(\mb{x}: K^r \to K^n \) parametrizing all solutions?

Happily, such a technique is well known and called either row reduction or Gaussian elimination.

Subsection 2.4.1 Row Echelon Form

To understand this algorithm, we start at the end.

Definition 2.4.4.

An \(m \times n\) matrix \(A\) is in row echelon form if no zero row has a non-zero row below it and the first nonzero entry of any row has only zero entries to the left and below its own position (i.e. to its southwest).

More precisely, if \(a_{ij}\) is the first non-zero entry in the \(i\)-th row then \(a_{kl} = 0\) for all entries with \(k \geq i\) (on or below the \(i\)-th row) and \(l \leq j\) (to the left of the \(j\)-th column) excluding the case where \(k = i\) and \(l = j\text{.}\) The first non-zero entry of any row is called the leading coefficient for that row (it is also called a pivot). This definition is best understood by looking at a few examples and non-examples.

Example 2.4.5. Examples of row echelon form matrices.

Here are some matrices in row echelon form:

\begin{equation*} \left[ \begin{matrix} 0 \amp 3 \amp 1 \amp 2 \\ 0 \amp 0 \amp 0 \amp \pi \\ 0 \amp 0 \amp 0 \amp 0 \end{matrix} \right], \hspace{.3in} \left[ \begin{matrix} 1 \amp 2 \\ 0 \amp 1 \\ 0 \amp 0 \end{matrix} \right], \hspace{.3in} \left[ \begin{matrix} 5 \amp -6 \amp 2 \\ 0 \amp 3 \amp 0 \\ 0 \amp 0 \amp 4 \end{matrix} \right]. \end{equation*}

And here are some that are not

\begin{equation*} \left[ \begin{matrix}0 \amp 0 \amp 0 \amp \pi \\ 0 \amp 3 \amp 1 \amp 2 \\ 0 \amp 0 \amp 0 \amp 0 \end{matrix} \right], \hspace{.3in} \left[ \begin{matrix} 1 \amp 2 \\ 2 \amp 1 \\ 0 \amp 0 \end{matrix} \right], \hspace{.3in} \left[ \begin{matrix} 5 \amp -6 \amp 2 \\ 0 \amp 0 \amp 2 \\ 0 \amp 1 \amp 0 \end{matrix} \right]. \end{equation*}

To see why the second set of matrices are not in row echelon form, let’s look at each in turn. The first non-example we see that below and to the left of \(\pi\) are nonzero entries \(3, 1,2\text{.}\) In the second case, we see that below the \((1,1)\)-entry \(1\) is \(2\text{.}\) In the last case, even though the \((3,2)\)-entry is not directly below a leading coefficient, it is to the left and below the leading coefficient \(2\) in the \((2,3)\)-position, disqualifying this matrix from being in row echelon form.

There is a special type of row echelon form called reduced row echelon form where we add the conditions that:

Every leading coefficient is \(1\text{.}\)
Every other entry on a column of a leading coefficient is zero.

None of the matrices in Example 2.4.5 are in reduced form, but the following matrices are:

\begin{equation*} \left[ \begin{matrix} 0 \amp 1 \amp 5 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \\ 0 \amp 0 \amp 0 \amp 0 \end{matrix} \right], \hspace{.3in} \left[ \begin{matrix} 1 \amp 0 \\ 0 \amp 1 \\ 0 \amp 0 \end{matrix} \right], \hspace{.3in} \left[ \begin{matrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{matrix} \right]. \end{equation*}

Let us now set out the strategy for solving equation (2.4.5). First, we will solve this system if the matrix we start with is in reduced row echelon form. We will quickly see that in these cases, we obtain answers to the existence, uniqueness and parametrization questions above. Second, we will discover the algorithm to take an arbitrary matrix \(A\) and transform it into a matrix in reduced echelon form by taking simple steps (called elementary row operations).

To accomplish step one we make a definition.

Definition 2.4.6.

Let \(A\) be a matrix in row echelon form and \(\mb{x}\) the column of variables from equation (2.4.2). We say that the variable \(x_i\) is a basic variable if there is a leading coefficient of some row of \(A\) on the \(i\)-th column. Otherwise we call it a free variable.

Note that every variable is either basic or free. Now we state our solution as a theorem.

Theorem 2.4.7.

Let \(A\) be an \(m \times n\) matrix in reduced row echelon form with \(r\) free variables with indices \(j_1 \lt j_2 \lt \cdots \lt j_r\) and \(s\) basic variables with indices \(i_1 \lt i_2 \lt \cdots \lt i_{s}\text{.}\) Then for the equation

\begin{equation*} A \mb{x} = \mb{b}, \end{equation*}

There are no solutions if and only if \(s \lt m\) and \(b_i \ne 0\) for some \(s \lt i \leq m\text{.}\)
If we are not in case (1), the vector
\begin{equation*} \mb{x} = b_1 \mb{e}_{i_1} + b_2 \mb{e}_{i_2} + \cdots + b_s \mb{e}_{i_s} \end{equation*}
is a solution and is unique if \(r = 0\text{.}\)
If we are not in case (1) and \(r \gt 0\text{,}\) then for any \(1 \leq i \leq n\) define the function \(x_i : K^r \to K\) by
\begin{equation*} x_i \left( \left[ \begin{matrix} t_1 \\ t_2 \\ \vdots \\ t_r \end{matrix} \right] \right) = \begin{cases} t_i \amp \text{ if } x_i \text{ is a free variable,} \\ b_i -a_{i j_1} t_1 - a_{i j_2} t_2 - \cdots - a_{i j_r} t_r \amp \text{ if } x_i \text{ is a basic variable.} \end{cases} \end{equation*}
For \(\mb{t}\) in \(K^r\) take
\begin{equation*} \mb{x} (\mb{t} ) = \left[ \begin{matrix} x_1 (\mb{t}) \\ x_2(\mb{t}) \\ \vdots \\ x_n (\mb{t}) \end{matrix} \right] . \end{equation*}

This theorem is very easy to prove when one translates the matrix equation into the linear system. We demonstrate this through working out examples.

Example 2.4.8. Case 1. of Theorem 2.4.7.

For

\begin{equation*} A = \left[ \begin{matrix} 1 \amp 0 \\ 0 \amp 1 \\ 0 \amp 0 \end{matrix} \right], \end{equation*}

and

\begin{align*} \mb{x} \amp = \twovec{x}{y} \amp \mb{b} \amp = \threevec{3}{2}{1} \end{align*}

Here \(A\) is \(3 \times 2\) so \(m = 3\) and \(n = 2\text{.}\) There are exactly \(s = 2\) basic variables so that \(s \lt m\text{.}\) The fact that \(b_3 = 1 \ne 0\) shows us that we are in Case (1) of Theorem 2.4.7 so there are no solutions. Indeed, the linear equations become

\begin{align*} x + 0\cdot y \amp = 3,\\ 0 \cdot x + y \amp = 2, \\ 0 + 0 \amp = 1. \end{align*}

It is the last equation that causes problems and shows no solution exists. The easiest way to tell if you are in Case 1. of Theorem 2.4.7 is to see if there are any zero rows in \(A\text{.}\) If there are, for there to be a solution the corresponding row of \(\mb{b}\) must also be zero.

Note that in this example if we replaced \(\mb{b}\) with

\begin{equation*} \mb{b^\prime} = \threevec{3}{2}{0} \end{equation*}

then we would be in the setting of Case (2) and there would be a unique solution

\begin{equation*} \twovec{x}{y} = \twovec{3}{2} \end{equation*}

Example 2.4.9. Case 2. of Theorem 2.4.7.

In the case of a square matrix one can hope that the reduced row echelon form is the identity matrix. Indeed, for

\begin{equation*} \left[ \begin{matrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{matrix} \right]. \end{equation*}

and

\begin{align*} \mb{x} \amp = \threevec{x}{y}{z}\amp \mb{b} \amp = \threevec{3}{2}{1} \end{align*}

the solution is simply \(\mb{x} = \mb{b}\text{.}\)

Example 2.4.10. Case 3. of Theorem 2.4.7.

For Case 3. of Theorem 2.4.7, consider

\begin{equation*} A = \left[ \begin{matrix} 0 \amp 1 \amp 5 \amp 0 \amp 2 \\ 0 \amp 0 \amp 0 \amp 1 \amp 3 \\ 0 \amp 0 \amp 0 \amp 0 \amp 0\end{matrix} \right], \end{equation*}

with

\begin{align*} \mb{x} \amp = \left[ \begin{matrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{matrix} \right], \amp \mb{b} \amp = \threevec{3}{2}{0} . \end{align*}

The free variables for the equation are \(x_1, x_3\) and \(x_5\) while the basic variables are \(x_2\) and \(x_4\text{.}\) Theorem 2.4.7 then says there is a three dimensional parametrization of the solution given by

\begin{equation*} \mb{x} \left( \threevec{t_1}{t_2}{t_3} \right) = \left[ \begin{matrix} t_1 \\ 3 - 5 t_2 - 2 t_3 \\ t_2 \\ 2 - 3 t_3 \\ t_3 \end{matrix} \right] . \end{equation*}

This solution can be verified easily, but it is motivated by considering the linear system of three equations

\begin{align*} x_2 + 5x_3 + 2 x_5 \amp = 3, \\ x_4 + 3 x_5 \amp = 2, \\ 0 \amp = 0. \end{align*}

The parametrization \(\mb{x}( \mb{t})\) is defined through solving these equations by rewriting them as

\begin{align*} x_2 \amp = 3 - 5x_3 - 2 x_5, \\ x_4 \amp = 2 - 3 x_5, \\ 0 \amp = 0. \end{align*}

putting the basic variables on the left and free variables on the right. Letting the free variables be \(t_1, t_2, t_3\) gives the parametrization (notice \(x_1\) does not show up in the equations, but is still free to vary).

Subsection 2.4.2 Row Reduction

Now that we can solve some matrix equations of a particularly nice form, our task is to put any matrix equation in that form. This technique is known as row reduction or alternatively, Gaussian elimination.

For the moment, let us represent our matrix \(A\) as a column of row vectors

\begin{equation*} \left[ \begin{matrix} - \amp \mb{r}_1 \amp - \\ - \amp \mb{r}_2 \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_m \amp - \end{matrix} \right] . \end{equation*}

To manipulate \(A\text{,}\) we allow ourselves three types of operations:

Type I: Switch the \(i\)-th row \(\mb{r}_i\) and the \(j\)-th row \(\mb{r}_j\)
\begin{equation*} \left[ \begin{matrix} - \amp \mb{r}_1 \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_i \amp - \\ \amp\vdots \amp \\ - \amp \mb{r}_j \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_m \amp - \end{matrix} \right] \longrightarrow \left[ \begin{matrix} - \amp \mb{r}_1 \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_j \amp - \\ \amp\vdots \amp \\ - \amp \mb{r}_i \amp -\\ \amp \vdots \amp \\ - \amp \mb{r}_m \amp - \end{matrix} \right] \end{equation*}
Type II: Multiply the \(i\)-th row \(\mb{r}_i\) by a non-zero scalar \(c \ne 0\)
\begin{equation*} \left[ \begin{matrix} - \amp \mb{r}_1 \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_i \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_m \amp - \end{matrix} \right] \longrightarrow \left[ \begin{matrix} - \amp \mb{r}_1 \amp - \\ \amp \vdots \amp \\ - \amp c \, \mb{r}_i \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_m \amp - \end{matrix} \right] . \end{equation*}
Type III: Add any scalar multiple \(c \, \mb{r}_j\) of the \(j\)-th row to the \(i\)-th row \(\mb{r}_i\)
\begin{equation*} \left[ \begin{matrix} - \amp \mb{r}_1 \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_i \amp - \\ \amp\vdots \amp \\ - \amp \mb{r}_j \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_m \amp - \end{matrix} \right] \longrightarrow \left[ \begin{matrix} - \amp \mb{r}_1 \amp - \\ \amp \vdots \amp \\ - \amp \mb{r}_i + c \, \mb{r}_j \amp - \\ \amp\vdots \amp \\ - \amp \mb{r}_j \amp -\\ \amp \vdots \amp \\ - \amp \mb{r}_m \amp - \end{matrix} \right] \end{equation*}

It is important to note a fact concerning these row operations:

Fact: Each step may be reversed through another elementary row operation. More precisely, for the Type I operation, simply repeat the same operation and you return to the original matrix. For Type II operations, multiply the same row by the reciprocal \(1/c\) (which you can do because \(c \ne 0\)). For Type III add \((-c)\,\mb{r}_j\) to the \(i\)-th row.

We will see this fact in another light in our next section on matrices.

Theorem 2.4.11.

Every \(m \times n\) matrix can be put in reduced row echelon form through applying a sequence of elementary row operations.

Proof.

To see that this is the case, we argue by induction. More precisely, we will show that (a) we can put an \(m \times 1\) matrix in reduced row echelon form and (b) if we can put an \(m \times n\) matrix in reduced row echelon form with elementary row operations, then we can do so with an \(m \times (n + 1)\) matrix.

To accomplish step (a), suppose \(A = (a_{i1})\) is an \(m \times 1\) matrix. If \(a_{i1} = 0\) for all \(1 \leq i \leq m\) then \(A\) is in reduced row echelon form already. Otherwise, choose the smallest \(i\) for which \(a_{i1}\) is non-zero. Using a Type I row operation, switch the \(i\)-th row with the first row. Call this matrix \(A^\prime = (a_{i1}^\prime)\text{.}\) Now \(a_{11}^\prime \ne 0\) and we may use a Type II row operation to multiply the first row by \(1 / a_{11}^\prime\) to obtain a matrix of the form

\begin{equation*} A^{\prime \prime} = \left[ \begin{matrix} 1 \\ a_{21}^\prime \\ \vdots \\ a_{m1}^\prime \end{matrix} \right]. \end{equation*}

Finally, use Type III row operations to clear out the lower entries by adding \(-a^\prime_{i1}\) times the first row to the \(i\)-th row for each \(2 \leq i \leq m\text{.}\) This gives the matrix

\begin{equation} A^{\prime \prime} = \left[ \begin{matrix} 1 \\ 0 \\ \vdots \\ 0 \end{matrix} \right] \tag{2.4.7} \end{equation}

which is in reduced row echelon form.

Often in induction proofs, step (a) (called the base case) is trivial and the hard part is step (b) (called the induction step). In this proof though, step (a) is really where we learn our algorithm and step (b) is just a rehash of (a). Indeed, if we know that we can put an \(m \times n\) matrix in reduced row echelon form and \(A\) is an \(m \times (n + 1)\) matrix then we can put the submatrix of \(A\) consisting of the first \(n\) columns in reduced echelon form with elementary row operations while ignoring the last column. What we will be left with is a matrix

\begin{equation*} A^\prime = \left[ \begin{array}{cccccccc|c} 0 \amp \cdots \amp 0 \amp 1 \amp * \amp \cdots \amp 0 \amp \cdots \amp a_{1(n+1)} \\ 0 \amp \cdots \amp 0 \amp 0 \amp \cdots \amp 0 \amp 1 \amp \cdots \amp a_{2(n+1)} \\ \vdots \amp \amp \amp \amp \amp \vdots \amp \amp \amp \vdots \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp a_{r(n + 1)} \\ \vdots \amp \amp \cdots \amp \amp \amp \cdots \amp \amp \vdots \amp \vdots \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp a_{m(n + 1)} \end{array} \right]. \end{equation*}

Here, we assume all the rows up to the \(r\)-th row of the \(m \times n\) submatrix are non-zero, and from the \(r\)-th down are zero. But then apply part (a) of the argument to the last part of the last column

\begin{equation*} \threevec{a_{r(n + 1)}}{{\vdots}}{{a_{m(n + 1)}}} . \end{equation*}

Since the columns before this are all zero (for the affected rows), there is no effect on the left hand side. If there’s a non-zero entry of the above vector we obtain a matrix of the form

\begin{equation*} A^{\prime \prime} = \left[ \begin{array}{cccccccc|c} 0 \amp \cdots \amp 0 \amp 1 \amp * \amp \cdots \amp 0 \amp \cdots \amp a_{1(n+1)} \\ 0 \amp \cdots \amp 0 \amp 0 \amp \cdots \amp 0 \amp 1 \amp \cdots \amp a_{2(n+1)} \\ \vdots \amp \amp \amp \amp \amp \vdots \amp \amp \amp \vdots \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp 1 \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp \vdots \amp \vdots \\ \vdots \amp \vdots \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp 0 \end{array} \right]. \end{equation*}

Adding multiples \(-a_{i(n + 1)}\) of the \(r\)-th row to the \(i\)-th row puts this final matrix into reduced row echelon form.

The proof of Theorem 2.4.11 provides the algorithm to row reduce a matrix. Start at the first column and perform row operations (on the whole matrix) to get it into the form of equation (2.4.7). Then proceed left to right, column by column, as in the proof of part (b).

To solve a linear system using this algorithm, it pays to augment the matrix \(A\) with the constant vector \(\mb{b}\text{.}\) In other words, row reduce the matrix \(A\) but perform the row operations on the larger matrix

\begin{equation*} \left[ \begin{matrix} A \amp \mb{b} \end{matrix} \right] = \left[ \begin{array}{cccc|c} a_{11} \amp a_{12} \amp \cdots \amp a_{1n} \amp b_1 \\ a_{21} \amp a_{22} \amp \cdots \amp a_{2n} \amp b_2 \\ \vdots \amp \ddots \amp \amp \vdots \amp \vdots \\ a_{m1} \amp a_{m2} \amp \cdots \amp a_{mn} \amp b_m \end{array} \right] . \end{equation*}

In the end, obtain the augmented matrix

\begin{equation*} \left[ \begin{matrix} A^\prime \amp \mb{b}^\prime \end{matrix} \right] \end{equation*}

where \(A^\prime\) is in reduced row echelon form. As you will show in the exercises, our original equation is then equivalent to the new equation

\begin{equation*} A^\prime \mb{x} = \mb{b}^\prime. \end{equation*}

We see this procedure in an example.

Example 2.4.12. Row reduction of a matrix.

Let

\begin{equation*} A = \left[ \begin{matrix} 0 \amp 0 \amp -1 \amp 1 \\ -1 \amp 2 \amp 1 \amp 0 \\ 2 \amp -4 \amp - 1 \amp 0 \end{matrix} \right], \hspace{.2in} \mb{x} = \left[ \begin{matrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{matrix} \right],\hspace{.2in} \mb{b} = \left[ \begin{matrix} 0 \\ -2 \\ 4 \end{matrix} \right] \end{equation*}

Then we make the augmented matrix

\begin{equation*} A = \left[ \begin{array}{cccc|c} 0 \amp 0 \amp -1 \amp 1 \amp 0 \\ -1 \amp 2 \amp 1 \amp 0 \amp -2 \\ 2 \amp -4 \amp - 1 \amp 0 \amp 4 \end{array} \right] \end{equation*}

and row reduce as follows:

\begin{align*} \left[ \begin{array}{cccc|c} 0 \amp 0 \amp -1 \amp 1 \amp 0 \\ -1 \amp 2 \amp 1 \amp 0 \amp -2 \\ 2 \amp -4 \amp - 1 \amp 0 \amp 4 \end{array} \right] \amp \stackrel{\mb{r}_1 \leftrightarrow \mb{r}_2}{\longrightarrow} \left[ \begin{array}{cccc|c} -1 \amp 2 \amp 1 \amp 0 \amp -2 \\ 0 \amp 0 \amp -1 \amp 1 \amp 0 \\ 2 \amp -4 \amp - 1 \amp 0 \amp 4 \end{array} \right] , \\ \amp \stackrel{ - \mb{r}_1 }{\longrightarrow} \hspace{.1in} \left[ \begin{array}{cccc|c} 1 \amp -2 \amp - 1 \amp 0 \amp 2 \\ 0 \amp 0 \amp -1 \amp 1 \amp 0 \\ 2 \amp -4 \amp - 1 \amp 0 \amp 4 \end{array} \right] , \\ \amp \stackrel{ \mb{r}_3 - 2 \mb{r}_1 }{\longrightarrow} \left[ \begin{array}{cccc|c} 1 \amp -2 \amp - 1 \amp 0 \amp 2 \\ 0 \amp 0 \amp -1 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \amp 0 \amp 0 \end{array} \right] , \\ \amp \stackrel{ - \mb{r}_2 }{\longrightarrow} \hspace{.1in} \left[ \begin{array}{cccc|c} 1 \amp -2 \amp - 1 \amp 0 \amp 2 \\ 0 \amp 0 \amp 1 \amp - 1 \amp 0 \\ 0 \amp 0 \amp 1 \amp 0 \amp 0 \end{array} \right] , \\ \amp \stackrel{ \mb{r}_3 - \mb{r}_2 }{\longrightarrow} \left[ \begin{array}{cccc|c} 1 \amp -2 \amp - 1 \amp 0 \amp 2 \\ 0 \amp 0 \amp 1 \amp - 1 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \amp 0 \end{array} \right] , \\ \amp \stackrel{ \mb{r}_1 + \mb{r}_2 }{\longrightarrow} \left[ \begin{array}{cccc|c} 1 \amp -2 \amp 0 \amp -1 \amp 2 \\ 0 \amp 0 \amp 1 \amp -1 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \amp 0 \end{array} \right] , \\ \amp \stackrel{ \mb{r}_1 + \mb{r}_3 }{\longrightarrow} \left[ \begin{array}{cccc|c} 1 \amp -2 \amp 0 \amp 0 \amp 2 \\ 0 \amp 0 \amp 1 \amp -1 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \amp 0 \end{array} \right], \\ \amp \stackrel{ \mb{r}_2 + \mb{r}_3 }{\longrightarrow} \left[ \begin{array}{cccc|c} 1 \amp -2 \amp 0 \amp 0 \amp 2 \\ 0 \amp 0 \amp 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \amp 0 \end{array} \right] \end{align*}

Now we can write the reduced row echelon system of equations as

\begin{align*} x_1 - 2x_2 \amp = 2,\\ x_3 \amp = 0, \\ x_4 \amp = 0. \end{align*}

So we have one free variable \(x_2\) and our general solution is

\begin{equation*} \mb{x} (t) = \left[ \begin{matrix} {2 + 2t} \\ t \\ 0 \\ 0 \end{matrix} \right]. \end{equation*}

Of course, as row reduction is a very explicit computational algorithm, every linear algebra library for a given programming language has a method that will perform this automatically. In Sage (and Matlab), this is the .rref() command which procedes the matrix. We can also take a matrix \(A\) and augment it by \(b\) using the method .augment(b). Let’s see how this works with the last example.

It is certainly a helpful check to follow up a hand calculation with the computer version to find the inevitable errors that occur. We end this section with a little bit of vocabulary.

Definition 2.4.13.

Given a matrix \(A\) with reduced row echelon form \(A^\prime\text{,}\) we say that the \(j\)-th column is free if the variable associated to a linear system \(A^\prime \mb{x} = \mb{b}\) is free. Otherwise we say the column is basic.

It is a bit unusual to call columns free and basic rather than variables, but these terms make discussing matrix equations more fluid.

Exercises 2.4.3 Exercises

1.

Using elementary row operations, convert the matrix into reduced row echelon form. Show each step.

(a)

\begin{equation*} \left[ \begin{matrix} -1 \amp 4 \amp 9 \\ -1 \amp 3 \amp 7 \\ 3 \amp -8 \amp -19 \end{matrix} \right] \end{equation*}

(b)

\begin{equation*} \left[ \begin{matrix} 2 \amp -5 \\ -1 \amp 3 \end{matrix} \right] \end{equation*}

(c)

\begin{equation*} \left[ \begin{matrix} 3 \amp 3 \amp -4 \\ -2 \amp -2 \amp 3 \\ 1 \amp 1 \amp -2 \\ -2 \amp -2 \amp 3 \end{matrix} \right] \end{equation*}

(d)

\begin{equation*} \left[ \begin{matrix} 1 \amp 1 \amp 0 \amp 1 \\ -1 \amp -1 \amp 1 \amp 0 \end{matrix} \right] \end{equation*}

2.

Give the general solution, if it exists, for the matrix equations.

(a)

\begin{equation*} \left[ \begin{matrix} 1 \amp 0 \amp 0 \\ -1 \amp 0 \amp 1 \\ 1 \amp 0 \amp -1 \end{matrix} \right] \threevec{x_1}{x_2}{x_3} = \threevec{-2}{0}{1} \end{equation*}

(b)

\begin{equation*} \left[ \begin{matrix} 1 \amp 1 \amp 1 \\ -1 \amp 0 \amp -2 \end{matrix} \right] \threevec{x_1}{x_2}{x_3} = \twovec{-1}{0} \end{equation*}

(c)

\begin{equation*} \left[ \begin{matrix} -1 \amp 3 \\ -1 \amp 2 \\ 1 \amp -3 \end{matrix} \right] \twovec{x_1}{x_2} = \threevec{-5}{-3}{5} \end{equation*}

3.

Let \(A\) be a matrix, \(A^\prime\) its reduced echelon form, \([A | \mb{b}]\) an augmented matrix and \([A^\prime | \mb{b}^\prime]\) the augmented matrix obtained by row reducing \(A\text{.}\) Explain why every solution to the row reduced matrix equation

\begin{equation*} A^\prime \mb{x} = \mb{b}^\prime \end{equation*}

is also a solution to the original equation

\begin{equation*} A \mb{x} = \mb{b}. \end{equation*}

Prev Top Next