Section 2.3 Linear Transformations I
Definition 2.3.1.
Suppose \(U\) and \(V\) are vector spaces over \(K\) and \(T : U \to V\) is a function. We say that \(T\) is a linear transformation if, for any vectors \(\mb{u}_0\) and \(\mb{u}_1\) in \(U\) and any scalars \(\lambda_0\) and \(\lambda_1\) in \(K\text{,}\) we have
\begin{equation*}
T( \lambda_0 \mb{u}_0 + \lambda_1 \mb{u}_1 ) = \lambda_0 T(\mb{u}_0) + \lambda_1 T(\mb{u}_1) .
\end{equation*}
Two comments are in order here. First, you have seen linear transformations many times in your life but probably not known that they had a name. Second, amongst all functions between vector spaces, linearity is a rare and exceptionally strong condition. If you write down a random function from \(\mathbb{R}\) to \(\mathbb{R}\) it is most likely not linear! That said, let us consider some examples.
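Though not part of the text's development, a quick numerical sketch (with an arbitrarily chosen matrix and scalars) illustrates the defining identity holding for a matrix map and failing for the squaring function:

```python
import numpy as np

# A matrix map T(u) = A u is linear: check the defining identity numerically.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
u0, u1 = np.array([1.0, -1.0]), np.array([0.5, 2.0])
lam0, lam1 = 3.0, -2.0

lhs = A @ (lam0 * u0 + lam1 * u1)
rhs = lam0 * (A @ u0) + lam1 * (A @ u1)
assert np.allclose(lhs, rhs)  # linearity holds

# The squaring function f(x) = x**2 is not linear: f(2*1) != 2*f(1).
f = lambda x: x**2
assert f(2 * 1.0) != 2 * f(1.0)
```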
Example 2.3.2. Matrix multiplication as a linear transformation.
As it turns out, this example is the computational heart of linear algebra and can be used to describe any linear transformation between two finite-dimensional vector spaces (we will see this shortly). So it is worth taking a look at a few special cases of such transformations.
Example 2.3.3. Rotation matrices.
We have seen that rotation in the complex plane can be obtained by multiplication by a complex number of the form \(e^{i\phi}\text{.}\) However, if we think of the plane as the real vector space \(\mathbb{R}^2\) we can look at such a rotation as multiplication by the matrix
\begin{equation*}
A_\phi = \left[ \begin{matrix} \cos \phi \amp - \sin \phi \\ \sin \phi \amp \cos \phi \end{matrix} \right] .
\end{equation*}
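As a numerical sketch (not part of the text, with angles chosen arbitrarily), one can verify that \(A_{\pi/2}\) sends \((1,0)\) to \((0,1)\) and that composing rotations adds angles:

```python
import numpy as np

def rotation(phi):
    """The matrix A_phi rotating the plane counterclockwise by phi radians."""
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])

# Rotating (1, 0) by pi/2 gives (0, 1).
assert np.allclose(rotation(np.pi / 2) @ np.array([1.0, 0.0]), [0.0, 1.0])

# Composing rotations adds angles: A_phi A_psi = A_{phi + psi}.
phi, psi = 0.3, 1.1
assert np.allclose(rotation(phi) @ rotation(psi), rotation(phi + psi))
```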
Example 2.3.4. Projection matrices.
Example 2.3.5. Scaling along axes.
Multiplication by a matrix can scale by different amounts along different directions, and this lets us relate common shapes. For example, if we multiply all the points of the unit circle satisfying
\begin{equation*}
x^2 + y^2 = 1
\end{equation*}
by the matrix
\begin{equation*}
\text{Diag}(a, b) = \left[ \begin{matrix} a \amp 0 \\ 0 \amp b \end{matrix} \right],
\end{equation*}
we obtain the set of points of the ellipse \(E\) given by
\begin{equation*}
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.
\end{equation*}
A useful consequence is that to parametrize the ellipse \(E\text{,}\) you can simply parametrize the circle and compose with \(\text{Diag}(a,b)\text{,}\) getting
\begin{equation*}
\mb{f} (\theta ) = \text{Diag}(a,b) \twovec{\cos \theta}{ \sin \theta} = \twovec{a \cos \theta}{b \sin \theta}.
\end{equation*}
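As a sketch (with arbitrarily chosen values of \(a\) and \(b\)), one can check numerically that \(\text{Diag}(a,b)\) carries points of the unit circle onto the ellipse \(E\text{:}\)

```python
import numpy as np

a, b = 3.0, 2.0
D = np.diag([a, b])  # the matrix Diag(a, b)

# Sample points on the unit circle x^2 + y^2 = 1.
theta = np.linspace(0.0, 2.0 * np.pi, 100)
circle = np.vstack([np.cos(theta), np.sin(theta)])

# Diag(a, b) carries them onto the ellipse x^2/a^2 + y^2/b^2 = 1.
x, y = D @ circle
assert np.allclose(x**2 / a**2 + y**2 / b**2, 1.0)
```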
Note that the map \(\mb{f}\) is not a linear transformation, but the use of a linear transformation (along with some basic knowledge of trigonometry) helped us find it!
We will be working extensively with matrices in several contexts, but for now, let us recall a couple of other important examples of linear transformations.
Example 2.3.6. Coordinates as a linear transformation.
Example 2.3.7. The derivative as a linear transformation.
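As a numerical sketch (restricting, for illustration, to polynomials of degree at most 2 stored as coefficient vectors), the derivative acts by a matrix, so its linearity is visible as linearity of matrix multiplication:

```python
import numpy as np

# On polynomials c0 + c1*x + c2*x^2, stored as coefficient vectors
# (c0, c1, c2), the derivative d/dx acts by the matrix D below.
D = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])

# d/dx (1 + 3x + 5x^2) = 3 + 10x, i.e. coefficients (3, 10, 0).
assert np.allclose(D @ np.array([1.0, 3.0, 5.0]), [3.0, 10.0, 0.0])

# Linearity of the derivative is linearity of matrix multiplication.
p, q = np.array([1.0, 2.0, 3.0]), np.array([0.0, -1.0, 4.0])
assert np.allclose(D @ (2 * p + q), 2 * (D @ p) + D @ q)
```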
One should also note that the creation and annihilation operators of quantum mechanics are linear transformations as well. These last examples show that, while matrix multiplication is a very important type of linear transformation, there are many natural examples that are not packaged well as matrices.
A linear transformation \(T: U \to V\) gives rise to interesting linear subspaces and equations which we define here.
Definition 2.3.8.
Given vector spaces \(U\) and \(V\) over \(K\) and a linear transformation
\begin{equation*}
T : U \to V,
\end{equation*}
the kernel of \(T\) is the set of vectors
\begin{equation*}
\ker (T) = \left\{ \mb{u} \in U : T( \mb{u} ) = \mb{0} \right\} \subseteq U .
\end{equation*}
The image of \(T\) is the range of \(T\text{:}\)
\begin{equation*}
\im (T) = \left\{ T (\mb{u} ) \in V : \mb{u} \text{ a vector in } U \right\} \subseteq V .
\end{equation*}
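For a matrix map \(T(\mb{u}) = A\mb{u}\) these spaces can be computed numerically; the following sketch (with an arbitrarily chosen rank-one matrix) reads off bases for the kernel and image from the singular value decomposition of \(A\text{:}\)

```python
import numpy as np

# For T(u) = A u, bases of ker(T) and im(T) can be read off the SVD of A.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

U, s, Vt = np.linalg.svd(A)
rank = np.sum(s > 1e-10)

kernel_basis = Vt[rank:].T   # columns span ker(T); here a plane in R^3
image_basis = U[:, :rank]    # columns span im(T); here a line in R^2

# Every kernel basis vector is sent to (numerically) zero.
assert np.allclose(A @ kernel_basis, 0.0)
# dim ker(T) + dim im(T) = dim U  (the rank-nullity relation).
assert kernel_basis.shape[1] + image_basis.shape[1] == A.shape[1]
```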
The kernel is sometimes called the nullspace of \(T\) because it is the set of vectors of \(U\) that are sent to the zero vector. Let’s check that these subsets are in fact vector subspaces.
Proposition 2.3.9.
The kernel and image of a linear transformation are vector subspaces.
Proof.
We stick with the notation of \(U\text{,}\) \(V\) and \(T\) given in Definition 2.3.8. By Proposition 2.2.9, we only need to show that \(\ker (T)\) and \(\im(T)\) are closed under vector addition and scalar multiplication. So let’s check. If \(\mb{u}_1\) and \(\mb{u}_2\) are in the kernel, then by definition \(T (\mb{u}_1) = \mb{0} = T (\mb{u}_2)\text{.}\) So then using the definition of a linear transformation we see
\begin{equation*}
T (\mb{u}_1 + \mb{u}_2 ) = T (\mb{u}_1) + T (\mb{u}_2 ) = \mb{0} + \mb{0} = \mb{0}.
\end{equation*}
This implies \(\mb{u}_1 + \mb{u}_2\) is in the kernel as well and so vector addition is closed. Similarly, for any scalar \(\lambda\) in \(K\text{,}\)
\begin{equation*}
T (\lambda \mb{u}_1 ) = \lambda T (\mb{u}_1 ) = \mb{0}
\end{equation*}
so that scalar multiplication of a vector in \(\ker (T)\) is still in \(\ker (T)\text{.}\) Thus \(\ker (T)\) is a vector subspace of \(U\text{.}\)
For the image we proceed in a similar manner. If \(\mb{v}_1\) and \(\mb{v}_2\) are in the image then there are vectors \(\mb{u}_1\) and \(\mb{u}_2\) for which \(T( \mb{u}_1 ) = \mb{v}_1\) and \(T( \mb{u}_2 ) = \mb{v}_2\text{.}\) But then
\begin{equation*}
T (\mb{u}_1 + \mb{u}_2 ) = T (\mb{u}_1) + T (\mb{u}_2 ) = \mb{v}_1 + \mb{v}_2
\end{equation*}
so that \(\mb{v}_1 + \mb{v}_2\) is also in the image and the image is closed under vector addition. The closedness of scalar multiplication is also justified by observing
\begin{equation*}
T( \lambda \mb{u}_1 ) = \lambda T (\mb{u}_1 ) = \lambda \mb{v}_1 .
\end{equation*}
Thus \(\im (T)\) is a vector subspace of \(V\text{.}\)
We mention this proposition here because many important subspaces (for example, planes, lines, tangent spaces, etc.) arise most naturally as the kernel or image of some linear transformation. Of course, being linear is a very limiting property of a space, so the proposition should clarify that the geometry of these spaces is very basic. Another reason to consider these spaces is that they tell us about properties of \(T\text{.}\)
Proposition 2.3.10.
Let \(U\) and \(V\) be vector spaces and \(T : U \to V\) be a linear transformation. As a function, \(T\) is one-to-one if and only if \(\ker (T) = \{ \mb{0} \}\text{.}\) It is onto if and only if \(\im (T) = V\text{.}\)
Proof.
The second statement is just the definition of an onto function (that the range equals the codomain). For the first, suppose that \(T\) is one-to-one and \(\mb{u}\) is in the kernel \(\ker (T)\text{.}\) Then \(T (\mb{u} ) = \mb{0} = T( \mb{0})\) so that \(\mb{u} = \mb{0}\text{.}\) But since \(\mb{u}\) was an arbitrary element of the kernel, this means that the only element of \(\ker (T)\) is zero itself. Conversely, suppose \(\ker (T) = \{ \mb{0} \}\) and \(T( \mb{u}_1 ) = T( \mb{u}_2 )\text{.}\) By linearity, \(T( \mb{u}_1 - \mb{u}_2 ) = \mb{0}\text{,}\) so \(\mb{u}_1 - \mb{u}_2\) lies in the kernel, whence \(\mb{u}_1 = \mb{u}_2\) and \(T\) is one-to-one.
Definition 2.3.11.
The dual vector space to \(V\text{,}\) denoted \(V^*\text{,}\) is the vector space of linear transformations \(T : V \to K\text{.}\)
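As a concrete sketch (using the coordinate description of \((\mathbb{R}^3)^*\text{,}\) with arbitrarily chosen entries), every dual vector on \(\mathbb{R}^3\) can be represented by a row vector acting by matrix multiplication:

```python
import numpy as np

# On R^3, a dual vector (a linear map R^3 -> R) is given by a row vector:
# T(u) = row @ u.
row = np.array([[2.0, -1.0, 0.5]])   # an element of (R^3)^*
u = np.array([1.0, 2.0, 4.0])

value = float(row @ u)               # T(u) = 2*1 - 1*2 + 0.5*4
assert np.isclose(value, 2.0)

# Dual vectors form a vector space: linear combinations act as expected.
row2 = np.array([[0.0, 1.0, 1.0]])
assert np.isclose(float((3 * row + row2) @ u), 3 * value + float(row2 @ u))
```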
Some dual vectors are well known to even the most inattentive calculus student:
Example 2.3.12. Evaluation as a dual vector.
Many more examples exist with infinite-dimensional vector spaces of functions.
Example 2.3.13. Limit as a dual vector.
Example 2.3.14. Definite integral as a dual vector.
It shows a measure of mathematical maturity to appreciate that, while \(V\) and \(V^*\) look a lot alike, they are in fact two separate vector spaces with their own personalities (and more importantly, transformation properties). This will be more apparent when we develop multivariable calculus.