Section 2.6 Linear Transformations II
\begin{equation}
\begin{CD}
U @>T>> V\\
@VV\coord{}{\mathcal{B}}V @VV\coord{}{\mathcal{C}}V\\
K^n @>\cob{T}{\mathcal{B}}{\mathcal{C}}>> K^m
\end{CD}\tag{2.6.4}
\end{equation}
Example 2.6.1. The matrix of a derivative.
Consider \(P_n\text{,}\) the vector space of polynomials of degree at most \(n\) with coefficients in \(\mathbb{R}\text{.}\) Take
\begin{equation*}
D: P_2 \to P_1
\end{equation*}
to be the linear transformation obtained by taking the derivative. In other words,
\begin{equation*}
D(f) = f^\prime .
\end{equation*}
In order to write out \(D\) as a matrix, we need to choose bases for \(P_2\) and \(P_1\text{.}\) The natural candidate for \(P_n\) is \(\{1, x, x^2, \ldots, x^n\}\text{,}\) so we take
\begin{align*}
\mathcal{B} \amp = \{1, x, x^2\}, \\
\mathcal{C} \amp = \{1, x\}.
\end{align*}
To find
\(\cob{D}{\mathcal{B}}{\mathcal{C}}\) we will simply need to find the coefficients from equation
(2.6.1). In other words, we need to take the derivative of our polynomials from
\(\mathcal{B}\) and write them out as linear combinations of the polynomials in
\(\mathcal{C}\text{.}\)
\begin{align*}
D(1) \amp = 0 = 0\cdot 1 + 0 \cdot x, \\
D(x) \amp = 1 = 1\cdot 1 + 0 \cdot x, \\
D(x^2) \amp = 2x = 0\cdot 1 + 2 \cdot x.
\end{align*}
Placing these coefficients into the matrix (appropriately!) gives
\begin{equation*}
\cob{D}{\mathcal{B}}{\mathcal{C}} = \begin{bmatrix} {0}\amp {1}\amp {0}\\ {0}\amp {0} \amp {2} \end{bmatrix} .
\end{equation*}
Of course, in this example, since we all know how to take derivatives, the matrix representation of
\(D\) is of limited usefulness. Nonetheless, let us show how equation
(2.6.2) works in this case. Take the arbitrary quadratic polynomial
\begin{equation*}
f = ax^2 + bx + c
\end{equation*}
in \(P_2\) and observe that
\begin{equation*}
\coord{f}{\mathcal{B}} = \threevec{c}{b}{a} .
\end{equation*}
Then multiplying this column vector on the left by \(\cob{D}{\mathcal{B}}{\mathcal{C}}\) gives
\begin{align*}
\cob{D}{\mathcal{B}}{\mathcal{C}} \coord{f}{\mathcal{B}} \amp = \begin{bmatrix} {0}\amp {1}\amp {0}\\ {0}\amp {0} \amp {2} \end{bmatrix} \threevec{c}{b}{a} ,\\
\amp = \twovec{b}{2a} .
\end{align*}
But this vector represents the element
\begin{equation*}
b\cdot 1 + 2a \cdot x = 2ax + b
\end{equation*}
which we all know as the derivative of \(f\text{.}\)
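The computation above can be checked numerically. Here is a minimal sketch using NumPy, with the coordinate order \((c, b, a)\) as above; the sample polynomial \(f = 5x^2 + 3x + 7\) is our own choice for illustration:

```python
import numpy as np

# Matrix of D : P_2 -> P_1 relative to B = {1, x, x^2} and C = {1, x}
D = np.array([[0, 1, 0],
              [0, 0, 2]])

# Coordinates of f = ax^2 + bx + c relative to B; here f = 5x^2 + 3x + 7
a, b, c = 5, 3, 7
f_B = np.array([c, b, a])

# Multiplying gives the coordinates of f' = 2ax + b relative to C
fprime_C = D @ f_B
print(fprime_C)  # [ 3 10], i.e. f' = 3 + 10x = 10x + 3
```

The matrix-vector product reproduces exactly the coefficients of the derivative, as equation (2.6.2) promises.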
The technique of representing a linear transformation as a matrix already gives us some important results. First though, we define the following notions.
Definition 2.6.2.
If \(U\) and \(V\) are finite dimensional vector spaces over \(K\text{,}\) the rank of a linear transformation \(T : U \to V\text{,}\) denoted \(\rk (T)\text{,}\) is the dimension of \(\im (T)\text{.}\) The nullity of \(T\) is the dimension of \(\ker (T)\text{.}\)
The following theorem gives us a good amount of qualitative information about a linear transformation.
Theorem 2.6.3. Rank-Nullity Theorem.
If \(U\) and \(V\) are finite dimensional vector spaces over \(K\) and \(T : U \to V\) is a linear transformation, then
\begin{equation*}
\nullity (T) + \rk (T) = \dim (U).
\end{equation*}
Proof.
To see this, suppose \(\dim (U) = n\text{,}\) \(\dim (V) = m\) and let \(A\) be the \(m \times n\) matrix representing \(T\) relative to some bases on \(U\) and \(V\text{.}\) Then the kernel of \(T\) is isomorphic to the null space of \(A\) (which is the kernel of multiplying by \(A\)) and the image of \(T\) is isomorphic to the column space of \(A\) (which is the image of multiplying by \(A\)). Now, the null space of \(A\) is the linear subspace of solutions to the matrix equation
\begin{equation*}
A \mb{x} = \mb{0}.
\end{equation*}
We saw in Theorem 2.4.7 that the solutions to this equation are parameterized by \(K^r\text{,}\) where \(r\) is the number of free columns of \(A\text{.}\) So the nullity of \(T\) is precisely \(r\text{.}\) On the other hand, Proposition 2.5.11 showed that the dimension of the column space equals the number of basic columns of \(A\text{.}\) Now every column of \(A\) is either free or basic (but not both), so the sum of these two numbers is precisely \(n = \dim (U)\text{,}\) as claimed.
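As a quick numerical illustration of the theorem, the rank and nullity of a matrix can be computed and checked to sum to the number of columns; the matrix below is a made-up example, built so that its third row is the sum of the first two:

```python
import numpy as np

# Matrix of a linear map T : K^4 -> K^3; third row = first row + second row
A = np.array([[1, 2, 0, 1],
              [0, 1, 1, 0],
              [1, 3, 1, 1]])

rank = np.linalg.matrix_rank(A)

# The rank is the number of nonzero singular values, so the
# nullity is the number of columns minus that count.
n = A.shape[1]
s = np.linalg.svd(A, compute_uv=False)
nullity = n - np.count_nonzero(s > 1e-10)

print(rank, nullity, rank + nullity)  # 2 2 4 -- rank + nullity == n
```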
From this we obtain the following corollary.
Corollary 2.6.4.
Suppose \(U\) and \(V\) are finite dimensional vector spaces over \(K\) and \(T : U \to V\) is a linear transformation. Then
\begin{enumerate}
\item if \(T\) is one-to-one then \(\dim (U) \leq \dim (V)\text{;}\)
\item if \(T\) is onto then \(\dim (U) \geq \dim (V)\text{;}\)
\item if \(T\) is a linear isomorphism then \(\dim (U) = \dim (V)\text{.}\)
\end{enumerate}
Proof.
For the first claim, represent
\(T\) by a matrix and apply
Corollary 2.5.6. For the second, if \(T\) is onto, then \(\im (T) = V\text{,}\) so that \(\rk (T) = \dim (V)\text{.}\) By Theorem 2.6.3, \(\nullity (T) + \dim (V) = \dim (U)\text{,}\) and since \(\nullity (T) \geq 0\) this implies \(\dim (U) \geq \dim (V)\text{.}\) The last claim follows from the fact that a linear isomorphism is a one-to-one correspondence by definition.
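The first two claims can be seen concretely on a sample non-square matrix (our own choice, for illustration): independent columns give a one-to-one map into a larger space, and its transpose maps a larger space onto a smaller one.

```python
import numpy as np

# A matrix with independent columns gives a one-to-one map K^2 -> K^3,
# consistent with dim(U) = 2 <= 3 = dim(V).
A = np.array([[1, 0],
              [0, 1],
              [1, 1]])
one_to_one = np.linalg.matrix_rank(A) == A.shape[1]  # nullity is 0

# Its transpose maps K^3 onto K^2: dim(U) = 3 >= 2 = dim(V).
onto = np.linalg.matrix_rank(A.T) == A.T.shape[0]

print(one_to_one, onto)  # True True
```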
We also can use our theorem to make it easier to detect linear isomorphisms.
Corollary 2.6.5.
If \(U\) and \(V\) are finite dimensional vector spaces of the same dimension and \(T : U \to V\) is a linear transformation, then the following are equivalent:
\begin{enumerate}
\item \(T\) is one-to-one;
\item \(T\) is onto;
\item \(T\) is a linear isomorphism.
\end{enumerate}
Proof.
Note that
\(T\) is one-to-one if and only if
\(\nullity (T) = 0\) which by
Theorem 2.6.3 holds if and only if
\(\rk (T) = \dim (U) = \dim (V)\text{.}\) But then
\(\im (T) = V\) since otherwise
\(\im (T)\) would be a proper subspace of
\(V\) and
Corollary 2.5.9 would give that
\(\rk (T) = \dim (\im (T)) \lt \dim (V)\text{.}\) Thus
\(T\) is onto and a linear isomorphism.
Now, if
\(T\) is onto then
\(\rk (T) = \dim (V) = \dim (U)\) which again implies by
Theorem 2.6.3 that
\(\nullity (T) = 0\text{.}\) This gives that
\(\ker (T) = \{\mb{0} \}\) so that
\(T\) is one-to-one and a linear isomorphism.
Clearly, if \(T\) is a linear isomorphism then it is both one-to-one and onto by definition.
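For a map between spaces of the same dimension, the equivalence can be observed on a sample square matrix (chosen here purely for illustration): full rank means onto, forces nullity zero by the Rank-Nullity Theorem (so one-to-one), and coincides with invertibility.

```python
import numpy as np

# A sample square matrix, so dim(U) = dim(V) = 2
A = np.array([[2, 1],
              [1, 1]])

n = A.shape[0]
onto = np.linalg.matrix_rank(A) == n             # rank(T) = dim(V)
one_to_one = (n - np.linalg.matrix_rank(A)) == 0  # nullity(T) = 0, by rank-nullity
invertible = abs(np.linalg.det(A)) > 1e-12       # linear isomorphism

print(onto, one_to_one, invertible)  # True True True
```

Replacing the second row by \([2, 1]\) would make all three conditions fail at once, as the corollary predicts.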