SI242: Matrix ops and linear transformations

Linear transformation

We have talked about "linear transformations" without ever defining them. Now is the time to correct that! As you'll see, a linear transformation is a function that takes a vector as input and produces a vector as output, and which satisfies two key properties.

A function $T$ with inputs from vector space $R^n$ and outputs in vector space $R^m$ that satisfies

i. $c\cdot T(\boldsymbol{u}) = T(c\cdot \boldsymbol{u})$
ii. $T(\boldsymbol{u}+\boldsymbol{v})=T(\boldsymbol{u}) + T(\boldsymbol{v})$

for any vectors $\boldsymbol{u}$, $\boldsymbol{v}$ in $R^n$ and scalar $c$ in $R$, is called a linear transformation.
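To make the definition concrete, here is a minimal sketch that spot-checks the scaling property and the addition property on random inputs. The helper `looks_linear` and the two sample functions `swap_and_double` and `shift` are hypothetical names invented for this illustration. Passing the check is only evidence of linearity, not a proof, while failing it exhibits a genuine counterexample.

```python
import random

def scale(c, u):
    """Scalar times vector."""
    return [c * x for x in u]

def add(u, v):
    """Vector plus vector."""
    return [x + y for x, y in zip(u, v)]

def close(u, v, tol=1e-9):
    """Compare two vectors up to floating-point round-off."""
    return len(u) == len(v) and all(abs(x - y) <= tol for x, y in zip(u, v))

def looks_linear(T, n, trials=100, seed=0):
    """Spot-check c*T(u) = T(c*u) and T(u+v) = T(u)+T(v) on random vectors in R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = [rng.uniform(-10, 10) for _ in range(n)]
        v = [rng.uniform(-10, 10) for _ in range(n)]
        c = rng.uniform(-10, 10)
        if not close(scale(c, T(u)), T(scale(c, u))):
            return False  # scaling property fails for this c and u
        if not close(T(add(u, v)), add(T(u), T(v))):
            return False  # addition property fails for this u and v
    return True

# Hypothetical sample functions from R^2 to R^2.
def swap_and_double(u):
    return [2 * u[1], u[0]]

def shift(u):
    return [u[0] + 1, u[1]]

print(looks_linear(swap_and_double, 2))
print(looks_linear(shift, 2))
```

The function `shift` fails because adding a constant breaks the scaling property: $c\cdot(u_1 + 1)$ is generally not $c\cdot u_1 + 1$.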

At first blush, linear transformations seem to have nothing to do with matrices, which are after all our current topic. Vectors ... yes. Matrices ... no. But the amazing thing is that this abstract definition of linear transformation is in fact exactly equivalent to being a function that is defined as a matrix times an input vector.

A function $T$ with inputs from vector space $R^n$ and outputs in vector space $R^m$ is a linear transformation if and only if there is an $m\times n$ matrix $A$ over $R$ such that for any $\boldsymbol{u}$ in $R^n$, $T(\boldsymbol{u}) = A\boldsymbol{u}$.

We have to prove "$T$ is a linear transformation" $\Leftrightarrow$ "there is an $A$ such that $T(\boldsymbol{u}) = A\boldsymbol{u}$". As we usually do, we will prove equivalence $x\Leftrightarrow y$ by proving both $x\Rightarrow y$ and $y\Rightarrow x$.

Part 1: Prove that if $T(\boldsymbol{u}) = A\boldsymbol{u}$ then $T$ is a linear transformation.

ACTIVITY:
Prove that for any $m\times n$ matrix $A$: [Break up into groups and choose one of the below to do.]

$c(A\boldsymbol{u}) = A(c \boldsymbol{u})$ for any scalar $c$ and vector $\boldsymbol{u}$
$A(\boldsymbol{u}+\boldsymbol{v})=A\boldsymbol{u} + A\boldsymbol{v}$

Hint: the easiest way to do this is to write down an expression for element $i$ of the output vector on each of the two sides, and then show why the two output expressions are the same.
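Before writing the proof, it can help to see both identities hold on one concrete case. The sketch below uses a hypothetical $2\times 3$ matrix and hand-picked vectors. The helper `mat_vec` is written out from the sum-of-products formula for element $i$ of the output. A single numerical check like this is not a proof, but the formula inside `mat_vec` is exactly the expression the hint asks you to manipulate.

```python
def mat_vec(A, u):
    """Element i of A*u is the sum over j of A[i][j] * u[j]."""
    return [sum(a_ij * u_j for a_ij, u_j in zip(row, u)) for row in A]

# Hypothetical data: A is 2x3, so u -> A*u maps R^3 to R^2.
A = [[1, 2, 0],
     [-1, 3, 4]]
u = [2, -1, 5]
v = [0, 4, 1]
c = 3

# Scaling: c*(A*u) compared with A*(c*u).
lhs_scale = [c * y for y in mat_vec(A, u)]
rhs_scale = mat_vec(A, [c * x for x in u])

# Addition: A*(u+v) compared with A*u + A*v.
lhs_add = mat_vec(A, [x + y for x, y in zip(u, v)])
rhs_add = [x + y for x, y in zip(mat_vec(A, u), mat_vec(A, v))]

print(lhs_scale, rhs_scale)  # both are [0, 45]
print(lhs_add, rhs_add)      # both are [8, 31]
```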

Part 2: Prove that if $T$ is a linear transformation then there is an $A$ such that $T(\boldsymbol{u}) = A\boldsymbol{u}$.

The $i$th unit vector in $R^n$, denoted $\boldsymbol{e_i}$, is the vector with $i$th component 1, and all other components 0. Note that for any vector $\boldsymbol{u}$ in $R^n$ we have $\boldsymbol{u} = u_1\boldsymbol{e_1} + \cdots + u_n\boldsymbol{e_n}$; in other words, all vectors can be written as linear combinations of the unit vectors.

Consider the column vectors $\boldsymbol{c_1},\ldots,\boldsymbol{c_n}$, where $\boldsymbol{c_i} = T(\boldsymbol{e_i})$, and let $A$ be the matrix with column vectors $\boldsymbol{c_1},\ldots,\boldsymbol{c_n}$.

$A\boldsymbol{u} = u_1\boldsymbol{c_1} + \cdots + u_n\boldsymbol{c_n}$,	this is the equivalence between a matrix-times-vector product and the linear combination of the matrix's column vectors with coefficients given by the vector's components (see "The many meanings of $A\cdot\boldsymbol{x} = \boldsymbol{0}$", Class 33)
$A\boldsymbol{u} = u_1T(\boldsymbol{e_1}) + \cdots + u_nT(\boldsymbol{e_n})$,	this follows from our definition of $\boldsymbol{c_i}$ as $\boldsymbol{c_i} = T(\boldsymbol{e_i})$ above
$A\boldsymbol{u} = T(u_1\boldsymbol{e_1}) + \cdots + T(u_n\boldsymbol{e_n})$,	this follows by applying property i of the definition of linear transformation to each term from the previous line
$A\boldsymbol{u} = T(u_1\boldsymbol{e_1} + \cdots + u_n\boldsymbol{e_n})$,	This follows from property ii of the definition of linear transformation. Technically, there's an inductive proof in there, because we have to combine $n$ applications of $T$ into a single application, and property ii only states that a pair of applications of $T$ can be combined into a single application. Giving this inductive proof will be one of your homework problems!
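$A\boldsymbol{u} = T(\boldsymbol{u})$,	this follows because $\boldsymbol{u} = u_1\boldsymbol{e_1} + \cdots + u_n\boldsymbol{e_n}$, as noted above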

This is a beautiful result! Whenever we have a matrix $A$, it defines a linear transformation $T(\boldsymbol{u}) = A\boldsymbol{u}$. And whenever we have a linear transformation $T$, there is a matrix $A$ that defines it. But there's more! Sometimes a proof does more than merely verify that the theorem statement is correct. The interesting proofs provide insights, and this is just such a proof. If you look at how this result is proved, you see that it's telling us that a linear transformation is completely described by how it maps the unit vectors. So even though there are infinitely many input vectors a linear transformation has to map, the transformation can be defined by a small set of vectors: the images of the $n$ unit vectors under that transformation ... and those images are precisely the columns of the matrix $A$ constructed by the proof.
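The construction in Part 2 is short enough to sketch directly: feed each unit vector to $T$ and use the results as columns. In the sketch below, `matrix_of`, `unit_vector` and the sample transformation `T` are hypothetical names chosen for illustration, and indices are 0-based as is usual in Python, so `unit_vector(0, n)` plays the role of $\boldsymbol{e_1}$.

```python
def unit_vector(i, n):
    """The vector in R^n with a 1 in position i (0-based) and 0 elsewhere."""
    return [1 if k == i else 0 for k in range(n)]

def matrix_of(T, n):
    """Build the matrix whose columns are T(e_1), ..., T(e_n), stored as a list of rows."""
    columns = [T(unit_vector(i, n)) for i in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, u):
    """Matrix times vector."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

# Hypothetical linear transformation from R^3 to R^2.
def T(u):
    return [u[0] + 2 * u[1], 3 * u[2] - u[0]]

A = matrix_of(T, 3)
u = [4, -1, 2]

print(A)              # [[1, 2, 0], [-1, 0, 3]]
print(T(u))           # [2, 2]
print(mat_vec(A, u))  # [2, 2]
```

Only three evaluations of `T` were needed to build `A`, yet `mat_vec(A, u)` then agrees with `T(u)` for every input `u`, provided `T` really is linear.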

Square matrices

Consider the collection of all $n\times n$ matrices, for a given $n$ and ring $R$. Notice that for any two matrices $A$ and $B$ in this collection, the product $AB$ is another $n\times n$ matrix. So we say this collection of matrices is "closed" under multiplication - the product stays in the same collection. For $n\times n$ matrices, then, we have addition and we have multiplication. In fact, you can verify that we have all the ring axioms (though multiplication is not commutative)! So this is a ring, and therefore we can "do arithmetic". Part 2 of today's in-class activity is a chance for you to see how super-simple matrix arithmetic provides an elegant and efficient mechanism for computing with geometric objects, as is ubiquitous in computer graphics, computer-aided design, robotics and much more.
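As a small illustration of closure and non-commutativity, the sketch below multiplies two hypothetical $2\times 2$ matrices over the integers: a 90-degree counterclockwise rotation and a horizontal stretch. The names `rotate90`, `stretch_x` and the helpers are assumptions made for this example and are not taken from the in-class activity.

```python
def mat_mul(A, B):
    """Product of two n x n matrices; the result is again n x n."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(A, u):
    """Matrix times vector."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

rotate90 = [[0, -1],
            [1, 0]]    # sends (x, y) to (-y, x)
stretch_x = [[2, 0],
             [0, 1]]   # sends (x, y) to (2x, y)

# In a product AB applied to a vector, B acts first and A acts second.
rotate_after_stretch = mat_mul(rotate90, stretch_x)
stretch_after_rotate = mat_mul(stretch_x, rotate90)

p = [1, 1]
print(rotate_after_stretch, mat_vec(rotate_after_stretch, p))  # [[0, -1], [2, 0]] [-1, 2]
print(stretch_after_rotate, mat_vec(stretch_after_rotate, p))  # [[0, -2], [1, 0]] [-2, 1]
```

Both products are again $2\times 2$ matrices, which is the closure property, and they differ from each other, which shows that the order of multiplication matters.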