SI242: Gaussian Elimination

Row echelon form

A solution to $A\cdot\boldsymbol{x}=\boldsymbol{0}$, where $A$ is a given $m\times n$ matrix, is a concrete $n$-dimensional vector value for $\boldsymbol{x}$ that satisfies the equation. Of course, $\boldsymbol{0}$ is always a solution, but not a very interesting one. We call it the trivial solution. A non-zero vector solution to the equation is called a non-trivial solution. It's hard to look at just any old matrix and know whether or not it has non-trivial solutions. For example, what about $$ A_1 = \begin{bmatrix} 2&-1&4\\ 4&1&10\\ -2&18&10 \end{bmatrix} $$ Who knows? On the other hand, for matrices with a certain structure to them, things can be easier. For example:

$$ A_2 = \begin{bmatrix} 2&-5&-1\\ 0&5&-9\\ 0&0&6 \end{bmatrix} $$	Consider $A_2\cdot\boldsymbol{x}=\boldsymbol{0}$. The last row gives us the equation $6x_3=0$, which means $x_3=0$. The second row gives is $5x_2+-9x_3=0$, and since $x_3=0$, this means $5x_2=0$, which means $x_2=0$. Finally, the first row gives us the equation $2x_1 + -5x_2 + -1x_3 = 0$, and since $x_2=x_3=0$, this means $2x_1=0$, which means $x_1=0$. So the only solution is the trivial solution $\boldsymbol{x}=\boldsymbol{0}$.
$$ A_3 = \begin{bmatrix} 3&-6&-1\\ 0&2&-3\\ 0&0&0 \end{bmatrix} $$	Consider $A_3\cdot\boldsymbol{x}=\boldsymbol{0}$. The last row gives us the equation $0=0$, which tells us nothing. The second row gives us $2x_2+-3x+3=0$, which is equivalent to $x_2=\frac{3}{2}x_3$. The first row gives us $$ \begin{array}{l} 3x_1+-6x_2+-1x_3=0\\ 3x_1 = 6x_2+x_3\\ 3x_1=6(\frac{3}{2}x_3)+x_3\\ x_1 = \frac{10}{3}x_3 \end{array} $$ So, for any value for $x_3$, $[\frac{10}{3}x_3\ \ \frac{3}{2}x_3\ \ x_3]$ is a solution!
$$ A_4 = \begin{bmatrix} 3&-6&-1&2\\ 0&0&-3&1\\ 0&0&0&7 \end{bmatrix} $$	Consider $A_4\cdot\boldsymbol{x}=\boldsymbol{0}$. By combining the kind of reasoning from the two previous examples, we get that $x_3=x_4=0$ and $3x_1-6x_2=0$, so $x_1=2x_2$. So $[2x_2\ \ x_2\ \ 0\ \ 0]$ is a solution for any $x_2$ value.

In these three examples, the shape of the matrix made it easy to read off whether or not non-trivial solutions exist and, if so, to describe them. The following definition characterizes what the right "shape" or "structure" is for this to work.

The leftmost non-zero entry of a row in a matrix is called the pivot for the row. Matrix $A$ is said to be in row echolon form when the pivot for each row is to the right of the pivot for the row above it.

Gaussian elimination

Here are two observations:

We can read off solutions to $A\cdot\boldsymbol{x}=\boldsymbol{0}$ pretty easily if $A$ is in row echelon form. (From above)
We can modify $A$ via elementary row operations without changing the solutions to $A\cdot\boldsymbol{x}=\boldsymbol{0}$. (From last class)

So here's the plan: find an algorithm that uses elementary row operations to change a given input matrix until it is in row echelon form!

ACTIVITY
Below are four matrices. See if you can use elementary row operations to put them in echelon form (then see if you can read off the solutions!):

$A_1$	$A_5$	$A_6$
$$ \begin{bmatrix} 2&-1&4\\ 4&1&10\\ -2&18&10 \end{bmatrix} $$	$$ \begin{bmatrix} 3&-1&4\\ 6&-4&12\\ -3&-11&20 \end{bmatrix} $$	$$ \begin{bmatrix} 2&-1&4&1\\ 4&-1&11&6\\ -2&7&10&25 \end{bmatrix} $$

Gaussian Elimination informally
At each step, you will be ensuring that your matrix looks like this:

As for what a "step" actually is ...

Gaussian Elimination Step	
if some remaining row (call it row k) has its pivot in column j
    swap row i and row k
    zero out the column j entry of each row below row i by adding multiples of row i (elementary row op type iii)
    i = i + 1 
j = j + 1

Here are a few examples of Gaussian Elimination at work.

Gaussian Elimination more formally

Of course we usually write algorithms in pseudo code, something a little closer to computer code.

set i=1 and j=1
while i <= m and j <= n
    if some remaining row (call it row k) has its pivot in column j
        swap row i and row k
        for l from i+1 to m do
            let a=the col j element of row i
            let b=the col j element of row l
            add -b/a times row i to row l (this zeros the col j entry of row l)
        i = i + 1 
    j = j + 1