Reading
Section 2.1 of Theory of Computing, A Gentle Introduction, concentrating on pages 17-20.
Homework
Print out the homework and answer the questions on that paper.

More formality
We now know how to define machines formally, and we know how to define algorithms on those machines formally. However, we haven't given any formal proofs that these algorithms do what we claim they do. In other words, we've never proven that the machines they produce accept the languages we claim they accept. In fact, we don't even have a formal definition of what it means for a machine to accept a string. Our starting point for pinning all this down exactly will be the concept of configuration. In the discussions that follow, we will refer to the machine M3 below:


Machine M3.

Configurations
Suppose you were tracing through the computation of a big finite automaton on a very long input string. Halfway through the computation, the phone rings. If you go answer the phone, will you have to restart your work from the beginning when you get back? What would you need to remember in order to pick up where you left off? All you'd need to know is the state you were in, and what's left of the input!

A computation by a finite automaton involves two things: the machine's states and the transitions between them, and the input on the tape. If you stop in the middle of a computation, all you need to know to start back up and continue is what state you're in, and what symbols are left on the tape. We refer to this as the configuration of the machine during the computation.

A configuration of machine M = (Q,Σ,δ,s,W) is a tuple (q,w) ∈ Q × Σ*. It is interpreted as "Machine M is in state q with string w left on the input tape".

As a machine computes, it moves from configuration to configuration. However, the rules of the machine constrain which configurations can follow each other in a single step. For example, if you look at machine M3 above, it's clear that from configuration (q1,cba) you go to configuration (q2,ba) in one step, not to any other configuration. In one step, you remove the front character from the string of input on the tape, and you use δ with the current state and that front character to determine your next state. This notion is also formalized in the "yields in one step" relation.

For machine M = (Q,Σ,δ,s,W), we say that configuration (p,v) yields in one step configuration (q,w) if for some character a ∈ Σ we have v = aw and δ(p,a) = q. This is sometimes denoted (p,v) ⇒ (q,w), or (p,v) ⇒M (q,w) when you need to clearly indicate which machine you're referring to.
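As a sketch of how this relation might be checked mechanically (the dict encoding of δ and the function name `step` are illustrative choices, not from the text):

```python
# A configuration is modeled as a pair (state, remaining_input), and the
# transition function delta as a dict mapping (state, symbol) -> state.
def step(delta, config):
    """Return the unique configuration that config yields in one step:
    from (p, aw) we move to (delta(p, a), w)."""
    p, v = config
    if v == "":
        raise ValueError("no step is possible from an empty tape")
    a, w = v[0], v[1:]
    return (delta[(p, a)], w)

# The example from the text: (q1, cba) yields (q2, ba) in one step,
# assuming delta(q1, c) = q2 as in machine M3.
delta = {("q1", "c"): "q2"}
assert step(delta, ("q1", "cba")) == ("q2", "ba")
```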

A computation simply chains together steps like these. For example, if you look at machine M3 above, it's clear that from configuration (q0,abcacb) you eventually get to configuration (q1,cb). The computation looks like:
(q0,abcacb) ⇒ (q1,bcacb) ⇒ (q0,cacb) ⇒ (q0,acb) ⇒ (q1,cb)
We package up this whole thing and simply say that (q0,abcacb) yields (q1,cb), though clearly not in "one step". This we also make formal with a definition:

For machine M = (Q,Σ,δ,s,W), we say that configuration (p,v) yields configuration (q,w) if there is a sequence of configurations (q1,w1), (q2,w2), ..., (qk,wk) such that

(p,v) = (q1,w1) ⇒ (q2,w2) ⇒ ... ⇒ (qk,wk) = (q,w)

This is sometimes denoted (p,v) ⇒* (q,w), or (p,v) ⇒*M (q,w) when you need to clearly indicate which machine you're referring to.
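Since a DFA has exactly one computation from each configuration, the "yields" relation can be checked by simply stepping forward until the tapes line up. A sketch, using the same illustrative encoding of δ as a dict from (state, symbol) to state:

```python
def yields(delta, start, end):
    """Check whether start ⇒* end by running the (deterministic) machine forward."""
    config = start
    while True:
        if config == end:
            return True
        p, v = config
        if len(v) <= len(end[1]):   # consumed down to (or past) the target tape
            return False
        next_state = delta.get((p, v[0]))
        if next_state is None:      # no transition defined: computation is stuck
            return False
        config = (next_state, v[1:])

# The chain from the text, using only the M3 transitions it displays:
# (q0,abcacb) ⇒ (q1,bcacb) ⇒ (q0,cacb) ⇒ (q0,acb) ⇒ (q1,cb)
delta3 = {("q0", "a"): "q1", ("q1", "b"): "q0", ("q0", "c"): "q0"}
assert yields(delta3, ("q0", "abcacb"), ("q1", "cb"))
```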

Formal definition of a machine accepting a string
After all those definitions, we can finally say what it means for a machine M to accept a string w:

Machine M = (Q,Σ,δ,s,W) accepts string w ∈ Σ* if configuration (s,w) ⇒* (q,λ) for some state q ∈ W.

In other words: M starts off in state s with string w on the input tape, and it ends up in an accepting state with the empty string left on the input tape. All of this may seem like a lot of work that essentially leaves us knowing no more than we did when we started. However, all of these definitions will give us a way to prove that algorithms do what we say they do.
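The definition translates almost line for line into code. A sketch (the tuple encoding of M and the helper name `accepts` are illustrative):

```python
def accepts(M, w):
    """M accepts w iff (s, w) ⇒* (q, λ) for some q in W."""
    Q, Sigma, delta, s, W = M
    state, tape = s, w
    while tape:                          # run until the tape is empty (λ)
        state, tape = delta[(state, tape[0])], tape[1:]
    return state in W                    # accept iff we halted in a state of W

# A tiny hypothetical DFA (not M3): accepts strings over {a,b} of even length.
even = ({"e", "o"}, {"a", "b"},
        {("e", "a"): "o", ("e", "b"): "o", ("o", "a"): "e", ("o", "b"): "e"},
        "e", {"e"})
assert accepts(even, "abab") and not accepts(even, "aba")
```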

So now we can write down the computation of a DFA M on an input string w. For machine M3 (from above) on input abcacb we get the following computation:

(q0,abcacb) ⇒ (q1,bcacb) ⇒ (q0,cacb) ⇒ (q0,acb) ⇒ (q1,cb) ⇒ (q2,b) ⇒ (q2,λ)
... and since q2 is accepting, the input abcacb is accepted.
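The computation above can be reproduced mechanically. Note that the δ entries below are only the part of M3's transition function visible in the traces in this text; the rest of the machine is not reconstructed here:

```python
def trace(delta, state, w):
    """Return the list of configurations (s, w) ⇒ ... ⇒ (q, λ)."""
    configs = [(state, w)]
    while w:
        state, w = delta[(state, w[0])], w[1:]
        configs.append((state, w))
    return configs

# Partial reconstruction of M3's delta, taken from the displayed transitions.
delta3 = {("q0", "a"): "q1", ("q1", "b"): "q0", ("q0", "c"): "q0",
          ("q1", "c"): "q2", ("q2", "b"): "q2"}
run = trace(delta3, "q0", "abcacb")
assert run[0] == ("q0", "abcacb")
assert run[-1] == ("q2", "")       # ends in q2 with empty tape, so accepted
```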

Proving the correctness of a non-trivial algorithm
Consider the following algorithm (you may have seen it in the previous homework!):
Algorithm: AddLambda
Input: DFA $M = \left(Q,\Sigma,\delta,s,W\right)$
Output: DFA $M'$ such that $L(M') = L(M) \cup \{\lambda\}$

$M' = (Q \cup \{\$\},\Sigma,\delta',\$,W \cup \{\$\})$, where

$\delta':(Q \cup \{\$\})\times\Sigma\rightarrow (Q \cup \{\$\})$
$\delta'(p,x) = \left\{ \begin{array}{ll} \delta(s,x) & \text{if $p = \$$}\\ \delta(p,x) & \text{otherwise} \end{array} \right.$
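A direct transcription of the construction into code might look like this (the tuple encoding of a DFA and the literal state name "$" are illustrative choices):

```python
def add_lambda(M):
    """AddLambda: add a new accepting start state '$' that copies the
    out-transitions of the old start state s."""
    Q, Sigma, delta, s, W = M
    new_delta = dict(delta)                   # delta'(p,x) = delta(p,x) otherwise
    for x in Sigma:
        new_delta[("$", x)] = delta[(s, x)]   # delta'($,x) = delta(s,x)
    return (Q | {"$"}, Sigma, new_delta, "$", W | {"$"})

# Hypothetical example DFA over {a,b} accepting strings that end in 'a'.
M = ({"s0", "s1"}, {"a", "b"},
     {("s0", "a"): "s1", ("s0", "b"): "s0", ("s1", "a"): "s1", ("s1", "b"): "s0"},
     "s0", {"s1"})
Mp = add_lambda(M)
```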

We'd like to prove that this algorithm is correct, i.e. that $L(M') = L(M) \cup \{\lambda\}$. With the help of our new notions of configuration, $\Rightarrow_M$, $\Rightarrow_M^*$ and the formal definition of "accept" for DFAs, we can do this. We have to prove two things. First:
Claim 1: For any string $w$, if $w \in L(M)$ then $w \in L(M')$.
Proof: First of all, if $w = \lambda$ this is clearly true, since the start state of $M'$ is also an accepting state. So let's move on to the case where $w$ is non-empty, i.e. $w = xv$ for some $x \in \Sigma$ and $v \in \Sigma^*$. Then by our new definition of "accept", we have $(s,w) \Rightarrow_M^* (p,\lambda)$ for some $p \in W$. Since $w = xv$ we get \[ (s,w) \Rightarrow_M (\delta(s,x),v) \Rightarrow_M^* (p,\lambda) \] So what about machine $M'$ with input $w$? Since $\delta'(\$,x) = \delta(s,x)$ we get \[ (\$,w) \Rightarrow_{M'} (\delta(s,x),v) \] after the first step. In other words, after one step $M$ and $M'$ are in the same configuration. Once that happens, the construction of $M'$ guarantees that the remainder of the configurations in the computations for the two machines will be identical, which means $M'$ will also arrive at configuration $(p,\lambda)$. Since $p$ is in $W$, it is accepting in $M'$, and we get $w \in L(M')$ as required.
qed
This shows that $M'$ accepts all the strings in $L(M)$ and, as mentioned above, $\lambda$ as well. So $M'$ accepts all the strings it is supposed to, and we only need to show that it doesn't also accept some strings it's not supposed to. To show that, we need to prove:
Claim 2: For any string $w$ not equal to $\lambda$, if $w \in L(M')$ then $w \in L(M)$.
Proof: Note that this is basically the converse of Claim 1, so the proof is going to look almost identical, with the roles of $M'$ and $M$ reversed.
Since $w$ is non-empty, $w = xv$ for some $x \in \Sigma$ and $v \in \Sigma^*$. Then by our new definition of "accept", we have $(\$,w) \Rightarrow_{M'}^* (p,\lambda)$ for some $p \in W$. (A priori we only know $p \in W \cup \{\$\}$, but because $w$ is non-empty and there are no transitions into $\$$, we have $p \ne \$$, hence $p \in W$.) Since $w = xv$ we get \[ (\$,w) \Rightarrow_{M'} (\delta(s,x),v) \Rightarrow_{M'}^* (p,\lambda) \] because $\delta'(\$,x) = \delta(s,x)$, as defined in the algorithm. So what about machine $M$ with input $w$? For it we get \[ (s,w) \Rightarrow_{M} (\delta(s,x),v) \] after the first step. In other words, after one step $M$ and $M'$ are in the same configuration. Once that happens, the construction of $M'$ guarantees that the remainder of the configurations in the computations for the two machines will be identical, which means $M$ will also arrive at configuration $(p,\lambda)$. Since $p$ is in $W$, we get $w \in L(M)$ as required.
qed
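Claims 1 and 2 together can also be spot-checked by brute force over all short strings. This sketch re-implements the construction and the formal acceptance test on a small hypothetical DFA (everything named here is illustrative, not from the text):

```python
from itertools import product

def accepts(M, w):
    """Formal acceptance: run from the start state and check membership in W."""
    Q, Sigma, delta, s, W = M
    state = s
    for a in w:
        state = delta[(state, a)]
    return state in W

def add_lambda(M):
    """The AddLambda construction, as defined in the algorithm above."""
    Q, Sigma, delta, s, W = M
    new_delta = dict(delta)
    for x in Sigma:
        new_delta[("$", x)] = delta[(s, x)]
    return (Q | {"$"}, Sigma, new_delta, "$", W | {"$"})

# Hypothetical DFA over {a,b} accepting strings that end in 'a' (so λ ∉ L(M)).
M = ({"s0", "s1"}, {"a", "b"},
     {("s0", "a"): "s1", ("s0", "b"): "s0", ("s1", "a"): "s1", ("s1", "b"): "s0"},
     "s0", {"s1"})
Mp = add_lambda(M)

# Check L(M') = L(M) ∪ {λ} on every string of length at most 5.
for n in range(6):
    for tup in product("ab", repeat=n):
        w = "".join(tup)
        assert accepts(Mp, w) == (accepts(M, w) or w == "")
```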
So, we've proved that Algorithm AddLambda meets its specifications. And we couldn't have done it without configurations, $\Rightarrow_M$, $\Rightarrow_M^*$ and the formal definition of "accept". We need them even to express the basic intuition behind the proof:
After one step of computation from their initial configurations, $M$ and $M'$ are in identical configurations, and thus they end up in the same state when they finish.
An example of what we can prove with configurations (if you haven't seen induction, don't freak out)
To give you an example of the kinds of things we can express with the concept of configuration and what we can prove, consider the following idea: If I'm in a given state $p$ in a machine, and I next read characters $a_1 a_2 \cdots a_k$ from the input, I'll end up in the same state regardless of what comes after $a_1 a_2 \cdots a_k$ on the input tape. Hopefully this seems obvious enough. But how could we express this clearly and concisely, and how could we prove it was true?

Theorem: For any DFA $M$, if $(p,x y_1) \Rightarrow_M^* (q_1,y_1)$ and $(p,x y_2) \Rightarrow_M^* (q_2,y_2)$ then $q_1 = q_2$.
To explain: here $x$ plays the role of "characters $a_1 a_2 \cdots a_k$ from the input" and $y_1$ and $y_2$ are two different possibilities for "what comes after". We "end up in" states $q_1$ and $q_2$ in these two cases, and the theorem tells us that, in fact, the two states we end up in are the same!
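The theorem can also be spot-checked mechanically: step the configuration exactly $|x|$ times, and the suffix is never touched. A sketch using the partial M3 transitions shown in the traces above (the function name is illustrative):

```python
def state_after_prefix(delta, p, x, y):
    """The q with (p, x·y) ⇒* (q, y): step the configuration exactly |x| times."""
    tape = x + y
    for _ in range(len(x)):
        p, tape = delta[(p, tape[0])], tape[1:]
    return p

# Partial delta for M3, taken from the traces displayed in the text.
delta3 = {("q0", "a"): "q1", ("q1", "b"): "q0", ("q0", "c"): "q0",
          ("q1", "c"): "q2", ("q2", "b"): "q2"}
# Same x = "abca", two different suffixes y1 and y2: the resulting state agrees.
assert state_after_prefix(delta3, "q0", "abca", "cb") == \
       state_after_prefix(delta3, "q0", "abca", "bbb") == "q1"
```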
Proof: we proceed by induction on $|x|$. If $|x| = 0$, then $x y_1 = y_1$ and $x y_2 = y_2$, so both computations take zero steps and $q_1 = p = q_2$. If $x = a x'$ for some $a \in \Sigma$, then the first step of each computation is forced: $(p,a x' y_1) \Rightarrow_M (\delta(p,a),x' y_1)$ and $(p,a x' y_2) \Rightarrow_M (\delta(p,a),x' y_2)$. Since $|x'| < |x|$, the induction hypothesis applied to state $\delta(p,a)$ and string $x'$ gives $q_1 = q_2$. qed

To explain: realizing that some of you will not have seen Proof by Induction, I'll try to give a bit of insight. Induction and recursion are intimately related. The above "proof" is really a recursive program for constructing a proof given a concrete string $x$. If, for example, $x = dcbc$, this proof/recursive-program would build the proof showing $q_1 = q_2$ that starts with \[ (p,dcbc y_1) \Rightarrow_M (\delta(p,d),cbc y_1) \Rightarrow_M^* (q_1,y_1) \text{ and } (p,dcbc y_2) \Rightarrow_M (\delta(p,d),cbc y_2) \Rightarrow_M^* (q_2,y_2) \] and then use a recursive call to build a proof showing that if \[ (\delta(p,d),cbc y_1) \Rightarrow_M^* (q_1,y_1) \text{ and } (\delta(p,d),cbc y_2) \Rightarrow_M^* (q_2,y_2) \] then $q_1 = q_2$. The recursion is working towards the base case $|x| = 0$ because the string in front of $y_1$ and $y_2$ is getting smaller.
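That recursive reading of the proof can even be written down as an actual recursive program (a sketch; the function name and the dict encoding of δ are illustrative):

```python
def end_state(delta, p, x):
    """The unique state q with (p, x·y) ⇒* (q, y) for every suffix y,
    computed by the same recursion the induction describes."""
    if x == "":                # base case |x| = 0: zero steps, so q = p
        return p
    # Peel off the first character, exactly as the proof does with x = dcbc,
    # then recurse on the strictly shorter remainder.
    return end_state(delta, delta[(p, x[0])], x[1:])

# Partial delta for M3 (from the traces in the text); x = "abca" ends in q1
# no matter what suffix follows it on the tape.
delta3 = {("q0", "a"): "q1", ("q1", "b"): "q0", ("q0", "c"): "q0"}
assert end_state(delta3, "q0", "abca") == "q1"
```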


Christopher W Brown
Last modified: Fri Sep 11 08:42:43 EDT 2009