To see that this is the case, we argue by induction. More precisely, we will show that (a) we can put an \(m \times 1\) matrix in reduced row echelon form and (b) if we can put an \(m \times n\) matrix in reduced row echelon form with elementary row operations, then we can do so with an \(m \times (n + 1)\) matrix.
To accomplish step (a), suppose \(A = (a_{i1})\) is an \(m \times 1\) matrix. If \(a_{i1} = 0\) for all \(1 \leq i \leq m\) then \(A\) is in reduced row echelon form already. Otherwise, choose the smallest \(i\) for which \(a_{i1}\) is non-zero. Using a Type I row operation, switch the \(i\)-th row with the first row. Call this matrix \(A^\prime = (a_{i1}^\prime)\text{.}\) Now \(a_{11}^\prime \ne 0\) and we may use a Type II row operation to multiply the first row by \(1 / a_{11}^\prime\) to obtain a matrix of the form
\begin{equation*}
A^{\prime \prime} = \left[ \begin{matrix} 1 \\ a_{21}^\prime \\ \vdots \\ a_{m1}^\prime \end{matrix} \right].
\end{equation*}
Finally, use Type III row operations to clear out the lower entries by adding \(-a^\prime_{i1}\) times the first row to the \(i\)-th row for each \(2 \leq i \leq m\text{.}\) This gives the matrix
\begin{equation}
A^{\prime \prime} = \left[ \begin{matrix} 1 \\ 0 \\ \vdots \\ 0 \end{matrix} \right] \tag{2.4.7}
\end{equation}
which is in reduced row echelon form.
Often in induction proofs, step (a) (called the base case) is trivial and the hard part is step (b) (called the induction step). In this proof though, step (a) is really where we learn our algorithm and step (b) is just a rehash of (a). Indeed, if we know that we can put an \(m \times n\) matrix in reduced row echelon form and \(A\) is an \(m \times (n + 1)\) matrix then we can put the submatrix of \(A\) consisting of the first \(n\) columns in reduced echelon form with elementary row operations while ignoring the last column. What we will be left with is a matrix
\begin{equation*}
A^\prime = \left[ \begin{array}{cccccccc|c} 0 \amp \cdots \amp 0 \amp 1 \amp * \amp \cdots \amp 0 \amp \cdots \amp a_{1(n+1)} \\ 0 \amp \cdots \amp 0 \amp 0 \amp \cdots \amp 0 \amp 1 \amp \cdots \amp a_{2(n+1)} \\ \vdots \amp \amp \amp \amp \amp \vdots \amp \amp \amp \vdots \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp a_{r(n + 1)} \\ \vdots \amp \amp \cdots \amp \amp \amp \cdots \amp \amp \vdots \amp \vdots \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp a_{m(n + 1)} \end{array} \right].
\end{equation*}
Here, we assume all the rows up to the \(r\)-th row of the \(m \times n\) submatrix are non-zero, and from the \(r\)-th down are zero. But then apply part (a) of the argument to the last part of the last column
\begin{equation*}
\threevec{a_{r(n + 1)}}{{\vdots}}{{a_{m(n + 1)}}} .
\end{equation*}
Since the columns before this are all zero (for the affected rows), there is no effect on the left hand side. If there’s a non-zero entry of the above vector we obtain a matrix of the form
\begin{equation*}
A^{\prime \prime} = \left[ \begin{array}{cccccccc|c} 0 \amp \cdots \amp 0 \amp 1 \amp * \amp \cdots \amp 0 \amp \cdots \amp a_{1(n+1)} \\ 0 \amp \cdots \amp 0 \amp 0 \amp \cdots \amp 0 \amp 1 \amp \cdots \amp a_{2(n+1)} \\ \vdots \amp \amp \amp \amp \amp \vdots \amp \amp \amp \vdots \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp 1 \\ 0 \amp 0 \amp \cdots \amp \amp \amp \cdots \amp \amp \vdots \amp \vdots \\ \vdots \amp \vdots \amp \cdots \amp \amp \amp \cdots \amp \amp 0 \amp 0 \end{array} \right].
\end{equation*}
Adding multiples \(-a_{i(n + 1)}\) of the \(r\)-th row to the \(i\)-th row puts this final matrix into reduced row echelon form.