
Section 2.3 Linear Transformations I

As was mentioned in Section 1.2, functions play a central role in much of mathematics. So what about functions
\begin{equation*} T : U \to V \end{equation*}
where \(U\) and \(V\) are vector spaces over \(K\) (remember that \(K\) is either \(\mathbb{R}\) or \(\mathbb{C}\))? Well, in fact, such functions are what we should truly call vector-valued functions, because their values lie in a vector space. For now, though, we will not consider this level of generality, but rather stick to functions that satisfy a very basic and important property.

Definition 2.3.1.

Suppose \(U\) and \(V\) are vector spaces over \(K\) and \(T : U \to V\) is a function. We say that \(T\) is a linear transformation if, for any vectors \(\mb{u}_0\) and \(\mb{u}_1\) in \(U\) and any scalars \(\lambda_0\) and \(\lambda_1\) of \(K\) we have
\begin{equation*} T( \lambda_0 \mb{u}_0 + \lambda_1 \mb{u}_1 ) = \lambda_0 T(\mb{u}_0) + \lambda_1 T(\mb{u}_1) . \end{equation*}
Two comments are in order here. First, you have seen linear transformations many times in your life, probably without knowing that they had a name. Second, amongst all functions between vector spaces, being linear is a rare and exceptionally strong condition. If you write down a random function from \(\mathbb{R}\) to \(\mathbb{R}\text{,}\) it is most likely not linear! That said, let us consider some examples.
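Though the text contains no code, the defining equation lends itself to a quick numerical check. The following is a minimal Python sketch (the function names are ours, purely illustrative) that tests the linearity condition on sample inputs for two maps from \(\mathbb{R}\) to \(\mathbb{R}\text{;}\) passing the check only suggests linearity, while a single failure disproves it:

```python
# Check the linearity condition T(a*u0 + b*u1) == a*T(u0) + b*T(u1)
# on a grid of sample vectors and scalars.
def is_linear_on_samples(T, samples, scalars):
    for u0 in samples:
        for u1 in samples:
            for a in scalars:
                for b in scalars:
                    lhs = T(a * u0 + b * u1)
                    rhs = a * T(u0) + b * T(u1)
                    if abs(lhs - rhs) > 1e-9:
                        return False
    return True

scale3 = lambda x: 3 * x     # linear: multiplication by a scalar
shift  = lambda x: x + 1     # not linear: already fails at T(0)

print(is_linear_on_samples(scale3, [0.0, 1.0, -2.5], [0.0, 1.0, 2.0]))  # True
print(is_linear_on_samples(shift,  [0.0, 1.0, -2.5], [0.0, 1.0, 2.0]))  # False
```

Note how `shift` fails: taking both scalars to be zero forces a linear map to send \(\mb{0}\) to \(\mb{0}\text{,}\) which `shift` does not.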

Example 2.3.2. Matrix multiplication as a linear transformation.

The most important example of a linear transformation is multiplication by a matrix \(A \in M_{m, n} (K)\) (on the left) which will take column vectors \(K^n\) to column vectors \(K^m\text{.}\) This operation can be written as
\begin{equation} \left[ \begin{matrix} a_{11} \amp a_{12} \amp \cdots \amp a_{1n} \\ a_{21} \amp a_{22} \amp \cdots \amp a_{2n} \\ \vdots \amp \vdots \amp \ddots \amp \vdots \\ a_{m1} \amp a_{m2} \amp \cdots \amp a_{mn} \end{matrix} \right] \left[ \begin{matrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{matrix} \right] = \left[ \begin{matrix} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \\ \vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \end{matrix} \right].\tag{2.3.1} \end{equation}
It is a slightly annoying (but easy) exercise to show that this is indeed a linear transformation.
As it turns out, this example is the computational heart of linear algebra and can be used to describe any linear transformation between two finite dimensional vector spaces (we will see this shortly). So it is worth taking a look at a few special cases of such transformations.
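As a concrete illustration (not part of the original text), here is a minimal Python sketch of formula (2.3.1): entry \(i\) of \(A\mb{x}\) is the dot product of row \(i\) of \(A\) with \(\mb{x}\text{.}\)

```python
# Matrix-vector multiplication exactly as in (2.3.1):
# the i-th output entry is sum_j a_ij * x_j.
def mat_vec(A, x):
    assert all(len(row) == len(x) for row in A)
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]          # a 2x3 matrix: maps K^3 to K^2
x = [1, 0, -1]
print(mat_vec(A, x))     # [1*1 + 2*0 + 3*(-1), 4*1 + 5*0 + 6*(-1)] = [-2, -2]
```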

Example 2.3.3. Rotation matrices.

We have seen that rotation in the complex plane can be obtained by multiplication by a complex number of the form \(e^{i\phi}\text{.}\) However, if we think of the plane as the real vector space \(\mathbb{R}^2\) we can look at such a rotation as multiplication by the matrix
\begin{equation*} A_\phi = \left[ \begin{matrix} \cos \phi \amp - \sin \phi \\ \sin \phi \amp \cos \phi \end{matrix} \right] . \end{equation*}
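A quick numerical sanity check (a sketch in Python, not part of the text): rotating \((1, 0)\) by a quarter turn should land on \((0, 1)\text{.}\)

```python
import math

def rotation(phi):
    # the matrix A_phi from the example
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi),  math.cos(phi)]]

def mat_vec2(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

# Rotating (1, 0) by 90 degrees gives (0, 1), up to floating-point error.
v = mat_vec2(rotation(math.pi / 2), [1.0, 0.0])
print([round(c, 10) for c in v])   # [0.0, 1.0]
```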

Example 2.3.4. Projection matrices.

Multiplying by a matrix can give a projection. For example, the matrix
\begin{equation*} A = \left[ \begin{matrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 0 \end{matrix} \right] \end{equation*}
will give
\begin{equation*} A \threevec{x}{y}{z} = \threevec{x}{y}{0} . \end{equation*}
In fact, the projection from \(\mathbb{R}^n\) to any vector subspace \(U\) can be written as a matrix multiplication. We will see more of this when we study linear geometry.
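A characteristic property of projections, visible in a two-line Python sketch (not part of the text), is idempotence: applying the projection twice is the same as applying it once.

```python
# The projection matrix from the example simply kills the z-coordinate.
def proj_xy(v):
    x, y, z = v
    return [x, y, 0]

v = [2, -3, 7]
once  = proj_xy(v)
twice = proj_xy(once)
print(once, once == twice)   # [2, -3, 0] True: projecting again changes nothing
```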

Example 2.3.5. Scaling along axes.

Multiplying by a matrix can scale in various directions. In doing so, one can relate common shapes. For example, if we multiply all the points of the unit circle satisfying
\begin{equation*} x^2 + y^2 = 1 \end{equation*}
by the matrix
\begin{equation*} \text{Diag}(a, b) = \left[ \begin{matrix} a \amp 0 \\ 0 \amp b \end{matrix} \right], \end{equation*}
we obtain the set of points of the ellipse \(E\) given by
\begin{equation*} \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \end{equation*}
A useful consequence is that when you want to parametrize the ellipse \(E\text{,}\) you can simply parametrize the circle and compose with \(\text{Diag}(a,b)\text{,}\) getting
\begin{equation*} \mb{f} (\theta ) = \text{Diag}(a,b) \twovec{\cos \theta}{ \sin \theta} = \twovec{a \cos \theta}{b \sin \theta}. \end{equation*}
Note that the map \(\mb{f}\) is not a linear transformation, but the use of a linear transformation (along with some basic knowledge of trigonometry) helped us find it!
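To see the parametrization in action, here is a minimal Python sketch (not part of the text) checking that every point \(\mb{f}(\theta)\) really lies on the ellipse:

```python
import math

def f(theta, a, b):
    # Diag(a, b) applied to the point (cos t, sin t) on the unit circle
    return (a * math.cos(theta), b * math.sin(theta))

# Every point f(theta) should satisfy x^2/a^2 + y^2/b^2 = 1,
# which reduces to cos^2 + sin^2 = 1.
a, b = 3.0, 2.0
for theta in [0.0, 0.7, math.pi / 3, 2.5]:
    x, y = f(theta, a, b)
    print(round(x**2 / a**2 + y**2 / b**2, 10))   # 1.0 each time
```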
The matrix \(\text{Diag} (a,b)\) in Example 2.3.5 is an example of a diagonal matrix. These are square matrices with zeros everywhere except possibly on the diagonal. We will use notation of the form
\begin{equation} \text{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = \left[ \begin{matrix} \lambda_1 \amp 0 \amp \cdots \amp 0 \\ 0 \amp \lambda_2 \amp \ddots \amp \vdots \\ \vdots \amp \ddots \amp\ddots \amp 0 \\ 0 \amp \cdots \amp 0 \amp \lambda_n \end{matrix} \right]. \tag{2.3.2} \end{equation}
An important diagonal matrix is the identity matrix \(I_n = \text{Diag}(1, 1, \ldots, 1)\text{.}\)
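As an aside (an illustrative Python sketch, not part of the text), a diagonal matrix in the notation (2.3.2) is easy to build, and the identity matrix does indeed leave every vector alone:

```python
# Build Diag(l1, ..., ln) as in (2.3.2); the identity is Diag(1, ..., 1).
def diag(*entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def mat_vec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

I3 = diag(1, 1, 1)
print(mat_vec(I3, [5, -2, 9]))       # the identity leaves vectors unchanged
print(mat_vec(diag(2, 3), [1, 1]))   # scales each coordinate separately
```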
We will be working extensively with matrices in several contexts, but for now, let us recall a couple of other important examples of linear transformations.

Example 2.3.6. Coordinates as a linear transformation.

Suppose \(V\) is a vector space over \(K\) and \(\mathcal{B} = \{\mb{v}_1, \ldots , \mb{v}_n \}\) is a basis. Recall that in equation (2.2.6) we defined a function
\begin{equation*} \coord{}{\mathcal{B}} : V \to K^n . \end{equation*}
This is a linear transformation as you will show in the exercises. As was shown in the previous set of exercises, \(\coord{}{\mathcal{B}}\) also has an inverse. Thus we may use \(\coord{}{\mathcal{B}}\) to identify the abstract vector space \(V\) with the concrete column vector space \(K^n\text{.}\) Note however that this identification depended on our choice of \(\mathcal{B}\text{.}\)

Example 2.3.7. The derivative as a linear transformation.

The vector space of continuously differentiable functions on \(\mathbb{R}\) is denoted \(C^1(\mathbb{R} )\text{.}\) Then
\begin{equation*} \frac{\diff}{\diff x} : C^1(\mathbb{R}) \to C (\mathbb{R}) \end{equation*}
is a linear transformation.
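On polynomials, differentiation can even be checked mechanically. Here is a minimal Python sketch (not part of the text; polynomials are stored as coefficient lists \([c_0, c_1, c_2, \ldots]\)) verifying the linearity of the derivative on an example:

```python
# Differentiation on polynomials [c0, c1, c2, ...] representing
# c0 + c1*x + c2*x^2 + ... ; the derivative shifts and scales coefficients.
def deriv(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(lam, p):
    return [lam * c for c in p]

p = [1, 2, 3]       # 1 + 2x + 3x^2
q = [0, 0, 0, 4]    # 4x^3
# Linearity: d/dx (2p + q) == 2 p' + q'
print(deriv(add(scale(2, p), q)) == add(scale(2, deriv(p)), deriv(q)))  # True
```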
One should also note that important operators of quantum mechanics (creation and annihilation operators) are linear transformations as well. These last examples show that, while matrix multiplication is a very important type of linear transformation, there are many natural examples that are not packaged well as matrices.
A linear transformation \(T: U \to V\) gives rise to interesting linear subspaces and equations which we define here.

Definition 2.3.8.

Given vector spaces \(U\) and \(V\) over \(K\) and a linear transformation
\begin{equation*} T : U \to V, \end{equation*}
  1. The kernel of \(T\) is the set of vectors
    \begin{equation*} \ker (T) = \left\{ \mb{u} \in U : T( \mb{u} ) = \mb{0} \right\} \subseteq U . \end{equation*}
  2. The image of \(T\) is the range of \(T\)
    \begin{equation*} \im (T) = \left\{ T (\mb{u} ) \in V : \mb{u} \text{ a vector in } U \right\} \subseteq V . \end{equation*}
The kernel is sometimes called the nullspace of \(T\) because it is the set of vectors of \(U\) that are sent to the zero vector. Let’s check that these subsets are in fact vector subspaces.

Proposition 2.3.9.

If \(T : U \to V\) is a linear transformation, then \(\ker (T)\) is a vector subspace of \(U\) and \(\im (T)\) is a vector subspace of \(V\text{.}\)

Proof.

We stick with the notation of \(U\text{,}\) \(V\) and \(T\) given in Definition 2.3.8. By Proposition 2.2.9, we only need to show that \(\ker (T)\) and \(\im(T)\) are closed under vector addition and scalar multiplication. So let’s check. If \(\mb{u}_1\) and \(\mb{u}_2\) are in the kernel, then by definition \(T (\mb{u}_1) = \mb{0} = T (\mb{u}_2)\text{.}\) So then using the definition of a linear transformation we see
\begin{equation*} T (\mb{u}_1 + \mb{u}_2 ) = T (\mb{u}_1) + T (\mb{u}_2 ) = \mb{0} + \mb{0} = \mb{0}. \end{equation*}
This implies \(\mb{u}_1 + \mb{u}_2\) is in the kernel as well and so vector addition is closed. Similarly, for any scalar \(\lambda\) in \(K\text{,}\)
\begin{equation*} T (\lambda \mb{u}_1 ) = \lambda T (\mb{u}_1 ) = \lambda \mb{0} = \mb{0} \end{equation*}
so that scalar multiplication of a vector in \(\ker (T)\) is still in \(\ker (T)\text{.}\) Thus \(\ker (T)\) is a vector subspace of \(U\text{.}\)
For the image we proceed in a similar manner. If \(\mb{v}_1\) and \(\mb{v}_2\) are in the image then there are vectors \(\mb{u}_1\) and \(\mb{u}_2\) for which \(T( \mb{u}_1 ) = \mb{v}_1\) and \(T( \mb{u}_2 ) = \mb{v}_2\text{.}\) But then
\begin{equation*} T (\mb{u}_1 + \mb{u}_2 ) = T (\mb{u}_1) + T (\mb{u}_2 ) = \mb{v}_1 + \mb{v}_2 \end{equation*}
so that \(\mb{v}_1 + \mb{v}_2\) is also in the image and the image is closed under vector addition. The closedness of scalar multiplication is also justified by observing
\begin{equation*} T( \lambda \mb{u}_1 ) = \lambda T (\mb{u}_1 ) = \lambda \mb{v}_1 . \end{equation*}
We mention this proposition here because many important subspaces (for example, planes, lines, and tangent spaces) arise most naturally as the kernel or image of some linear transformation. Of course, being linear is a very limiting property, so the proposition should clarify that the geometry of these spaces is very basic. Another reason to consider these spaces is that they inform us about properties of \(T\text{.}\)

Proposition 2.3.10.

Let \(T : U \to V\) be a linear transformation. Then \(T\) is one-to-one if and only if \(\ker (T) = \{ \mb{0} \}\text{,}\) and \(T\) is onto if and only if \(\im (T) = V\text{.}\)

Proof.

The second statement is just the definition of an onto function (that the range equals the codomain). For the first, suppose that \(T\) is one-to-one and \(\mb{u}\) is in the kernel \(\ker (T)\text{.}\) Note that \(T(\mb{0}) = \mb{0}\) for any linear transformation (take \(\lambda_0 = \lambda_1 = 0\) in the definition). Then \(T (\mb{u} ) = \mb{0} = T( \mb{0})\) so that \(\mb{u} = \mb{0}\text{.}\) But since \(\mb{u}\) was an arbitrary element of the kernel, this means that the only element of \(\ker (T)\) is the zero vector itself.
Conversely, suppose \(\ker (T)\) only contains zero and
\begin{equation*} T(\mb{u}_1 ) = T(\mb{u}_2). \end{equation*}
Then
\begin{equation*} T(\mb{u}_1 - \mb{u}_2 ) = T(\mb{u}_1 ) - T (\mb{u}_2) = \mb{0}. \end{equation*}
This implies \(\mb{u}_1 - \mb{u}_2\) is in the kernel. But since the kernel only contains zero, we have that \(\mb{u}_1 - \mb{u}_2 = \mb{0}\) or \(\mb{u}_1 = \mb{u}_2\) which gives us that \(T\) is one-to-one.
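The one-to-one criterion just proved is easy to watch in action. Here is a minimal Python sketch (not part of the text) for \(T(x,y) = (x, 0)\text{:}\) the kernel contains a nonzero vector, and two inputs differing by that kernel vector collide.

```python
# T(x, y) = (x, 0) has nontrivial kernel {(0, t)}, so it cannot be one-to-one.
def T(v):
    x, y = v
    return (x, 0)

u1 = (3.0, 5.0)
k  = (0.0, 1.0)                       # a nonzero kernel vector: T(k) = (0, 0)
u2 = (u1[0] + k[0], u1[1] + k[1])     # u2 = u1 + k

print(T(k) == (0.0, 0.0))             # True: k lies in the kernel
print(u1 != u2 and T(u1) == T(u2))    # True: distinct inputs, equal outputs
```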
It turns out the set of all linear transformations from \(U\) to \(V\) forms its own vector space with the natural operations:
\begin{align*} (T + S) (\mb{u} ) \amp = T( \mb{u} ) + S ( \mb{u}), \\ (\lambda T )(\mb{u} )\amp = \lambda T (\mb{u}) \end{align*}
where \(T\) and \(S\) are linear transformations from \(U\) to \(V\text{.}\) The most important example of this is when \(V\) is the vector space \(K\) itself.
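The operations above translate directly into code. The following Python sketch (not part of the text; the sample maps are arbitrary linear maps of our choosing) represents transformations as functions and forms \(2T + S\) pointwise:

```python
# The pointwise operations on linear transformations from the text,
# with transformations represented as Python functions R^2 -> R^2.
def add_maps(T, S):
    return lambda u: tuple(t + s for t, s in zip(T(u), S(u)))

def scale_map(lam, T):
    return lambda u: tuple(lam * t for t in T(u))

T = lambda u: (u[0] + u[1], 0.0)      # sample linear maps
S = lambda u: (0.0, u[0] - u[1])

R = add_maps(scale_map(2.0, T), S)    # the map 2T + S
print(R((1.0, 2.0)))                  # (2*(1+2) + 0, 0 + (1-2)) = (6.0, -1.0)
```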

Definition 2.3.11.

The dual vector space to \(V\text{,}\) denoted \(V^*\text{,}\) is the vector space of linear transformations \(T : V \to K\text{.}\)
Some dual vectors are well known to even the most inattentive calculus student:

Example 2.3.12. Evaluation as a dual vector.

Suppose \(P_n\) is the vector space of polynomials of degree at most \(n\) with complex coefficients. Then evaluating a polynomial at \(0\) is a linear transformation
\begin{equation*} ev_0 : P_n \to \mathbb{C} \end{equation*}
given by
\begin{equation*} ev_0 (f(z)) = f(0). \end{equation*}
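In coordinates, \(ev_0\) simply reads off the constant term, which makes its linearity transparent. A minimal Python sketch (not part of the text; polynomials stored as coefficient lists):

```python
# Evaluation at 0 on polynomials [c0, c1, ..., cn] representing
# c0 + c1*z + ... + cn*z^n just reads off the constant term,
# and is visibly linear in the coefficients.
def ev0(coeffs):
    return coeffs[0]

p = [5, 1, 2]    # 5 + z + 2z^2
q = [7, 0, -1]   # 7 - z^2
lhs = ev0([a + 3 * b for a, b in zip(p, q)])   # ev0(p + 3q)
rhs = ev0(p) + 3 * ev0(q)
print(lhs, rhs)   # 26 26
```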
Many more examples exist with infinite dimensional vector spaces of functions.

Example 2.3.13. Limit as a dual vector.

Let \(\mathcal{V}\) be the vector space of functions from \((0,1)\) to \(\mathbb{R}\) such that \(\lim_{x \to 0^+} f(x)\) exists. Then,
\begin{equation*} \lim_{x \to 0^+} : \mathcal{V} \to \mathbb{R} \end{equation*}
is a linear transformation. Part of this is a fancy way of saying something known to every first year calculus student as ‘The limit of the sum is the sum of the limits.’

Example 2.3.14. Definite integral as a dual vector.

Recall that \(C([a,b])\) denotes the set of continuous functions on the closed interval \([a,b]\text{.}\) Then
\begin{equation*} \int : C([a,b]) \to \mathbb{R} \end{equation*}
given by
\begin{equation*} \int f := \int_a^b f(x) \, \diff x \end{equation*}
is a linear transformation.
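Linearity of the integral can be observed numerically as well. Here is a minimal Python sketch (not part of the text) approximating the integral by a Riemann sum; the sum itself is exactly linear in the integrand, so the two sides agree up to floating-point rounding:

```python
import math

# Approximate the definite integral on [a, b] by a midpoint Riemann sum.
def integral(f, a, b, n=1000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = math.sin
g = math.cos
a, b = 0.0, 1.0
lhs = integral(lambda x: 2 * f(x) + g(x), a, b)   # integral of 2f + g
rhs = 2 * integral(f, a, b) + integral(g, a, b)   # 2*integral(f) + integral(g)
print(abs(lhs - rhs) < 1e-12)   # True
```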
For a vector space \(V\) with a finite basis \(\mathcal{B} = \{\mb{v}_1 , \ldots, \mb{v}_n\}\text{,}\) we can define a dual basis
\begin{equation*} \mathcal{B}^* = \{\mb{v}_1^* , \ldots, \mb{v}_n^* \} \end{equation*}
of the dual space \(V^*\) by taking
\begin{equation} \mb{v}_i^* \left( a_1 \mb{v}_1 + \cdots + a_n \mb{v}_n \right) := a_i. \tag{2.3.3} \end{equation}
One can show that this is in fact a basis. However, in the infinite dimensional setting, this no longer holds and there are several definitions and conditions that are created to recover weaker versions of it.
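Formula (2.3.3) says that \(\mb{v}_i^*\) reads off the \(i\)-th coordinate. A minimal Python sketch (not part of the text; vectors are given directly by their coordinate lists with respect to \(\mathcal{B}\)):

```python
# The dual basis functional v_i^* returns the i-th coordinate of a vector
# written in the basis B, as in (2.3.3).
def dual(i):
    return lambda coords: coords[i - 1]   # 1-indexed, matching the text

v = [4.0, -1.0, 2.5]            # the vector a1*v1 + a2*v2 + a3*v3
print(dual(1)(v), dual(3)(v))   # 4.0 2.5
```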
It shows a measure of mathematical maturity to appreciate that, while \(V\) and \(V^*\) look a lot alike, they are in fact two separate vector spaces with their own personalities (and more importantly, transformation properties). This will be more apparent when we develop multivariable calculus.

Exercises

1.

Calculate:
(a)
\(a\) and \(b\) where
\begin{equation*} \left[ \begin{matrix} 3 \amp 0 \amp {4} \\ {2} \amp {-1} \amp {3} \end{matrix} \right] \threevec{-2}{3}{1} = \twovec{a}{b}. \end{equation*}
(b)
\(c\) and \(d\) if
\begin{equation*} \left[ \begin{matrix} {c} \amp {2} \\ {1} \amp {1} \end{matrix} \right] \twovec{1}{d} = \twovec{-1}{0}. \end{equation*}

2.

Give complete responses to the following questions:
(a)
Let \(T\) be the linear transformation from \(\mathbb{R}^2\) to \(\mathbb{R}\) given by
\begin{equation*} T \left( \twovec{x}{y} \right) = 3x - 7 y. \end{equation*}
In other words, \(T\) is multiplication by the matrix \(\left[ \begin{matrix} 3 \amp -7 \end{matrix} \right]\text{.}\) Describe the kernel of \(T\) in familiar geometric terms.
(b)
The equation \(T (\mb{u} ) = 0\) defining the kernel is an implicit equation which can be thought of as a ‘problem that needs to be solved’. As in Section 1.3, a solution to this equation could be a function
\begin{equation*} S : \mathbb{R} \to \mathbb{R}^2 \end{equation*}
which parameterizes the geometric object you described above. Find such a parameterization which is also a linear transformation.
Hint.
This can also be given as multiplication by a matrix.

3.

Whether a function is linear or not may depend on what \(K\) is, even for the same vector space:
(a)
Consider \(\mathbb{C}\) as a real vector space. Is complex conjugation a linear transformation?
(b)
Consider \(\mathbb{C}\) as a complex vector space. Is complex conjugation a linear transformation?

4.

A linear isomorphism is a linear transformation that is also a one-to-one correspondence. Show that if \(\mathcal{B}\) is a basis of a vector space \(V\) over \(K\) the function
\begin{equation*} \coord{}{\mathcal{B}} : V \to K^n \end{equation*}
is a linear transformation. By a prior exercise, this shows that it is a linear isomorphism.

5.

If \(T : U \to V\) is a linear isomorphism, what is its kernel and image? Explain your response.

6.

Suppose \(A\) is an \(m \times n\) matrix with entries in \(K\) and \(T_A : K^n \to K^m\) is the linear transformation obtained by left multiplying by \(A\text{.}\) In other words,
\begin{equation*} T_A (\mb{v} ) = A \mb{v}. \end{equation*}
Prove that \(T_A\) is a linear isomorphism if and only if the columns of \(A\) are a basis of \(K^m\text{.}\)
Hint.
Use the part on columns from Example 2.2.12 and the definition of a basis.

7.

Verify that the multiplication by \(A_\phi\) given in Example 2.3.3 is indeed counter-clockwise rotation by \(\phi\text{.}\)
Hint.
Write a column vector in ‘polar’ coordinates \(\twovec{r \cos \theta}{r \sin \theta}\) and see what happens when you multiply by the matrix \(A_\phi\text{.}\)

8.

Using the standard basis, write projection from the plane to the line \(y = x\) as multiplication by a \(2 \times 2\) matrix.