Reading

Section 2.1 of Theory of Computing, A Gentle Introduction, concentrating on pages 17-20.

Homework

Print out the homework and answer the questions on that paper.

More formality

We now know how to define machines formally, and we know how to define algorithms on those machines formally. However, we haven't given any formal proofs that these algorithms do what we claim they do. In other words, we've never proven that the machines they produce accept the languages we claim they accept. In fact, we don't even have a formal definition of what it means for a machine to accept a string. Our starting point for pinning all this down exactly will be the concept of configuration. In the discussions that follow, we will refer to the machine M3 below:


Machine M3.

Configurations

Suppose you were tracing through the computation of a big finite automaton on a very long input string. Halfway through the computation, the phone rings. If you go answer the phone, will you have to restart your work from the beginning when you get back? What would you need to remember in order to pick up where you left off? All you'd need to know is the state you were in, and what's left of the input!

A computation by a finite automaton involves two things: the machine's states and the transitions between them, and the input on the tape. If you stop in the middle of a computation, all you need to know to start back up and continue is what state you're in, and what symbols are left on the tape. We refer to this as the configuration of the machine during the computation.

A configuration of machine $M = \left(Q,\Sigma,\delta,s,W\right)$ is a tuple $(q,w) \in Q \times \Sigma^*$. It is interpreted as "Machine $M$ is in state $q$ with string $w$ left on the input tape".
As a machine computes, it moves from configuration to configuration. However, the rules of the machine constrain which configurations can follow each other in a single step. For example, if you look at machine M3 above, it's clear that from configuration $(q_1,cba)$ you go to configuration $(q_2,ba)$ in one step, not to any other configuration. In one step, you remove the front character from the string of input on the tape, and you use δ with the current state and that front character to determine your next state. This notion is formalized as the "yields in one step" relation.
For machine $M = (Q,\Sigma,\delta,s,W)$, we say that configuration $(p,v)$ yields in one step configuration $(q,w)$ if for some character $a \in \Sigma$ we have $v = aw$ and $\delta(p,a) = q$. This is sometimes denoted $(p,v) \Rightarrow (q,w)$, or $(p,v) \Rightarrow_M (q,w)$ when you need to clearly indicate which machine you're referring to.
A computation simply chains together steps like these. For example, if you look at machine M3 above, it's clear that from configuration $(q_0,abcacb)$ you eventually get to configuration $(q_1,cb)$. The computation looks like: \[ (q_0,abcacb) \Rightarrow (q_1,bcacb) \Rightarrow (q_0,cacb) \Rightarrow (q_0,acb) \Rightarrow (q_1,cb) \] We package up this whole thing and simply say that $(q_0,abcacb)$ yields $(q_1,cb)$, though clearly not in "one step". This we also make formal with a definition, which you can think of informally as "yields in zero or more steps" or "yields eventually":
For machine $M = (Q,\Sigma,\delta,s,W)$, we say that configuration $(p,v)$ yields configuration $(q,w)$ if there is a sequence of configurations $(q_1,w_1), (q_2,w_2), ..., (q_k,w_k)$ such that \[ (p,v) = (q_1,w_1) \Rightarrow (q_2,w_2) \Rightarrow \cdots \Rightarrow (q_k,w_k) = (q,w) \] This is sometimes denoted $(p,v) \Rightarrow^* (q,w)$, or $(p,v) \Rightarrow^*_M (q,w)$ when you need to clearly indicate which machine you're referring to.
Note: If $p=q$ and $v=w$ the definition is always satisfied by a sequence of length one ($k=1$), so $(p,v) \Rightarrow_M^* (p,v)$ always holds.
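These two relations are easy to render in code. The sketch below is not from the text and the names are illustrative: a configuration is a (state, remaining-input) pair, and $\delta$ is a Python dict mapping (state, character) pairs to states.

```python
# A sketch of configurations and the "yields" relations for a DFA.
# A configuration is a (state, remaining_input) pair; `delta` is a
# dict mapping (state, character) -> state.

def step(delta, config):
    """One application of the "yields in one step" relation (=>).
    Returns the next configuration, or None if no step is possible."""
    state, w = config
    if w == "":               # no input left: no further step exists
        return None
    return (delta[(state, w[0])], w[1:])

def yields_star(delta, config, target):
    """Decide whether config =>* target (zero or more steps)."""
    while config is not None:
        if config == target:  # zero steps always suffice when equal
            return True
        config = step(delta, config)
    return False
```

Note that `yields_star` always terminates: each step consumes one input character, so the remaining input strictly shrinks.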

Formal definition of a machine accepting a string

After all those definitions, we can finally say what it means for a machine $M$ to accept a string $w$:
Machine $M = (Q,\Sigma,\delta,s,W)$ accepts string $w \in \Sigma^*$ if configuration $(s,w) \Rightarrow^*_M (q,\lambda)$ for some state $q \in W$.
In other words: $M$ starts off in state $s$ with string $w$ on the input tape, and it ends up in an accepting state with the empty string left on the input tape. All of this may seem like a lot of work that essentially leaves us knowing no more than we did when we started. However, all of these definitions will give us a way to prove that algorithms do what we say they do.

So now we can write down the computation of a DFA $M$ on an input string $w$. For machine $M_3$ (from above) on input $abcacb$ we get the following computation: \[ (q_0,abcacb) \Rightarrow (q_1,bcacb) \Rightarrow (q_0,cacb) \Rightarrow (q_0,acb) \Rightarrow (q_1,cb) \Rightarrow (q_2,b) \Rightarrow (q_2,\lambda) \] ... and since $q_2$ is accepting, the input $abcacb$ is accepted.
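This acceptance test can be sketched in Python as well. The transition entries below are read off the trace above; since the diagram of $M_3$ is not reproduced here, the table is partial and illustrative, not the complete $\delta$.

```python
# A sketch of the formal acceptance definition: M accepts w iff
# (s, w) =>* (q, lambda) for some accepting state q.

def accepts(delta, start, accepting, w):
    """Run the DFA from configuration (start, w) and test acceptance."""
    state = start
    for ch in w:                  # each character consumed is one => step
        state = delta[(state, ch)]
    # We accept iff we reach an accepting state with the empty
    # string left on the input tape.
    return state in accepting

# Transitions of M3 that appear in the trace above (partial table):
delta3 = {("q0", "a"): "q1", ("q1", "b"): "q0", ("q0", "c"): "q0",
          ("q1", "c"): "q2", ("q2", "b"): "q2"}

print(accepts(delta3, "q0", {"q2"}, "abcacb"))  # True: the trace ends in q2
```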

Proving the correctness of a non-trivial algorithm

Consider the following algorithm (you may have seen it in the previous homework!):
Algorithm: AddLambda
Input: DFA $M = \left(Q,\Sigma,\delta,s,W\right)$
Output: DFA $M'$ such that $L(M') = L(M) \cup \{\lambda\}$

$M' = (Q \cup \{\$\},\Sigma,\delta',\$,W \cup \{\$\})$, where

$\delta':(Q \cup \{\$\})\times\Sigma\rightarrow (Q \cup \{\$\})$
$\delta'(p,x) = \left\{ \begin{array}{ll} \delta(s,x) & \text{if $p = \$$}\\ \delta(p,x) & \text{otherwise} \end{array} \right.$
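The construction is mechanical enough to sketch in code. The version below works on a dict-based representation of $(Q,\Sigma,\delta,s,W)$; the names (and the use of the string `"$"` for the new state) are illustrative choices, not from the text.

```python
# A sketch of Algorithm AddLambda. A DFA is the tuple (Q, Sigma, delta, s, W),
# with Q, Sigma, W as sets and delta as a dict (state, char) -> state.

NEW_START = "$"   # the fresh start state added by the algorithm

def add_lambda(Q, Sigma, delta, s, W):
    """Build M' with L(M') = L(M) union {lambda}."""
    Qp = Q | {NEW_START}
    deltap = dict(delta)                        # delta'(p, x) = delta(p, x) for p in Q
    for x in Sigma:
        deltap[(NEW_START, x)] = delta[(s, x)]  # delta'($, x) = delta(s, x)
    return Qp, Sigma, deltap, NEW_START, W | {NEW_START}
```

For example, applied to a two-state machine accepting an odd number of $a$'s, the result also accepts $\lambda$, because its start state `"$"` is accepting.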

We'd like to prove that this algorithm is correct, i.e. that $L(M') = L(M) \cup \{\lambda\}$. With the help of our new notions of configuration, $\Rightarrow_M$, $\Rightarrow_M^*$ and the formal definition of "accept" for DFAs, we can do this. Before we do anything, we note the following: If $q \in Q$, i.e. $q \neq \$ $, then $\delta'(q,x) = \delta(q,x)$ for any $x\in\Sigma$, and therefore:

Claim 0: If $q \in Q$ then for any $u\in\Sigma^*$ we have $(q,u) \Rightarrow^*_{M'} (p,\lambda)$ if and only if $(q,u) \Rightarrow^*_{M} (p,\lambda)$.

We have to prove two things. Number 1:
Claim 1: For any string $w$, if $w \in L(M)$ then $w \in L(M')$.
Proof: First of all, if $w = \lambda$ this is clearly true, since the start state of $M'$ is also an accepting state. So let's move on to the case where $w$ is non-empty, i.e. $w =xv$ for some $x \in \Sigma$ and $v \in \Sigma^*$. Then by our new definition of "accept", we have $(s,w) \Rightarrow_M^* (p,\lambda)$ for some $p\in W$. Since $w = xv$ we get

1:  $(s,w) \Rightarrow_M (\delta(s,x),v) \Rightarrow_M^* (p,\lambda)$

So what about machine $M'$ with input $w$? Since $\delta'(\$,x) = \delta(s,x)$ we get

2:  $(\$,w) \Rightarrow_{M'} (\delta(s,x),v)$

and since (1) shows $(\delta(s,x),v) \Rightarrow_M^* (p,\lambda)$, by Claim 0 we get

3:  $(\delta(s,x),v) \Rightarrow_{M'}^* (p,\lambda)$.

Putting (2) and (3) together we get $(\$,w)\Rightarrow_{M'}^* (p,\lambda)$ and, since $p\in W$, we get $w \in L(M')$ as required.
qed
This shows that $M'$ accepts all the strings in $L(M)$ and, as mentioned above, $\lambda$ as well. So $M'$ accepts all the strings it is supposed to, and we only need to show that it doesn't also accept some strings it's not supposed to. To show that we need to prove:
Claim 2: For any string $w$ not equal to $\lambda$, if $w \in L(M')$ then $w \in L(M)$.
Proof: Note that this is basically the reverse of Claim 1, so the proof is going to look almost identical, with the roles of $M'$ and $M$ reversed.
Since $w$ is non-empty, $w =xv$ for some $x \in \Sigma$ and $v \in \Sigma^*$. Then by our new definition of "accept", we have $(\$,w) \Rightarrow_{M'}^* (p,\lambda)$ for some $p\in W$. (Note that because $w$ is non-empty and there are no transitions into $\$$, we may assume $p \in W$.) Since $w = xv$ we get

1:  $(\$,w) \Rightarrow_{M'} (\delta(s,x),v) \Rightarrow_{M'}^* (p,\lambda)$

because $\delta'(\$,x) = \delta(s,x)$, as defined in the algorithm. So what about machine $M$ with input $w$? For it we get

2:  $(s,w) \Rightarrow_{M} (\delta(s,x),v)$

and since (1) shows $(\delta(s,x),v) \Rightarrow_{M'}^* (p,\lambda)$, by Claim 0 we get

3:  $(\delta(s,x),v) \Rightarrow_{M}^* (p,\lambda)$.

Putting (2) and (3) together we get $(s,w)\Rightarrow_{M}^* (p,\lambda)$ and, since $p\in W$, we get $w \in L(M)$ as required.
qed
So, we've proved that Algorithm AddLambda meets its specifications. And we couldn't have done it without configurations, $\Rightarrow_M$, $\Rightarrow_M^*$ and the formal definition of "accept". We need them even to express the basic intuition behind the proof:
After one step of computation from their initial configurations, $M$ and $M'$ are in identical configurations, and thus they end up in the same state when they finish.

Another example of what we can prove with configurations (if you haven't seen induction don't freak out)

To give you an example of the kinds of things we can express with the concept of configuration and what we can prove, consider the following idea: If I'm in a given state $p$ in a machine, and I next read characters $a_1 a_2 \cdots a_k$ from the input, I'll end up in the same state regardless of what comes after $a_1 a_2 \cdots a_k$ on the input tape. Hopefully this seems obvious enough. But how could we express this clearly and concisely, and how could we prove it was true?

For any DFA $M$, if $(p,u v_1) \Rightarrow_M^* (q_1,v_1)$ and $(p,u v_2) \Rightarrow_M^* (q_2,v_2)$ then $q_1 = q_2$. (Note that $u$, $v_1$ and $v_2$ are strings here.)

To explain: here $u$ plays the role of "characters $a_1 a_2 \cdots a_k$ from the input" and $v_1$ and $v_2$ are two different possibilities for "what comes after". We "end up in" states $q_1$ and $q_2$ in these two cases, and the theorem tells us that, in fact, the two states we end up in are the same!

We proceed by induction on $|u|$.
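The text leaves the induction unstated; one way it might go (this sketch is an assumption, not from the text) is the following.

```latex
% Sketch of the induction on |u| (an assumed argument, not from the text).

Base case ($|u| = 0$): then $u = \lambda$, and the hypotheses read
$(p, v_i) \Rightarrow_M^* (q_i, v_i)$. Every step of a DFA consumes one
input character, so the only configuration with all of $v_i$ still on the
tape is the starting one; hence $q_1 = q_2 = p$.

Inductive step ($u = a u'$ with $a \in \Sigma$): the first step from
$(p, a u' v_i)$ is forced by determinism,
\[ (p,\, a u' v_i) \Rightarrow_M (\delta(p,a),\, u' v_i), \]
so both computations pass through state $\delta(p,a)$ with $u'$ still to
read before $v_i$. The inductive hypothesis, applied to state $\delta(p,a)$
and the shorter string $u'$, gives $q_1 = q_2$.
```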


Christopher W Brown
Last modified: Fri Sep 11 08:42:43 EDT 2009