Reading
Section 2.1 of Theory of Computing, A Gentle
Introduction, concentrating on pages 17-20.
Homework
Print out the homework and answer the questions on that paper.
More formality
We now know how to define machines formally, and we know how
to define algorithms on those machines formally. However, we
haven't given any formal proofs that these algorithms do what
we claim they do. In other words, we've never proven that the
machines they produce accept the languages we claim they
accept. In fact, we don't even have a formal definition of
what it means for a machine to accept a string.
Our starting point for pinning all this down exactly will be
the concept of configuration. In the discussions that follow,
we will refer to the machine M3 below:

Machine M3.
Configurations
Suppose you were tracing through the computation of a big
finite automaton on a very long input string. Halfway
through the computation, the phone rings. If you go answer
the phone, will you have to restart your work from the
beginning when you get back? What would you need to
remember in order to pick up where you left off? All you'd
need to know is the state you were in, and what's left of
the input!
A computation by a finite automaton involves two things: the
machine's states and the transitions between them, and the
input on the tape. If you stop in the middle of a
computation, all you need to know to start back up and
continue is what state you're in, and what symbols are left on
the tape. We refer to this as the configuration of
the machine during the computation.
A configuration of machine
$M = \left(Q,\Sigma,\delta,s,W\right)$
is a tuple
$(q,w) \in Q \times \Sigma^*$.
It is interpreted as "Machine $M$ is in state $q$ with
string $w$ left on the input tape".
As a machine computes, it moves from configuration to
configuration. However, the rules of the machine constrain
which configurations can follow each other in a single step.
For example, if you look at machine
M3 above, it's
clear that from configuration $(q_1,cba)$ you go to
configuration $(q_2,ba)$ in one step, not to any other
configuration. In one step, you remove the front character
from the string of input on the tape, and you use $\delta$ with
the current state and that front character to determine your
next state.
This notion is also formalized in the "yields in one step"
relation.
For machine $M = (Q,\Sigma,\delta,s,W)$, we say
that configuration $(p,v)$ yields in one step
configuration $(q,w)$ if for some character
$a \in \Sigma$ we have
$v = aw$ and $\delta(p,a) = q$.
This is sometimes denoted
$(p,v) \Rightarrow (q,w)$, or
$(p,v) \Rightarrow_M (q,w)$ when you need to
clearly indicate which machine you're referring to.
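The "yields in one step" relation is easy to turn into code. Here is a minimal Python sketch in which a configuration is a pair `(state, remaining_input)` and $\delta$ is a dict; the transition table includes only the entries of M3 that can be read off the traces in this text (the full diagram is in the figure above):

```python
def step(delta, config):
    """One application of 'yields in one step':
    (p, a w) => (delta[(p, a)], w).
    Returns None when the tape is empty, since no step is possible."""
    p, v = config
    if v == "":
        return None
    a, w = v[0], v[1:]
    return (delta[(p, a)], w)

# Transitions of M3 visible in the text's traces (partial table):
delta_m3 = {("q0", "a"): "q1", ("q0", "c"): "q0",
            ("q1", "b"): "q0", ("q1", "c"): "q2",
            ("q2", "b"): "q2"}

print(step(delta_m3, ("q1", "cba")))  # ('q2', 'ba')
```

Note that `step` matches the definition exactly: it peels the front character off the tape and consults $\delta$, and nothing else.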
A computation simply chains together steps like these. For
example, if you look at machine
M3 above, it's
clear that from configuration $(q_0,abcacb)$ you
eventually get to configuration $(q_1,cb)$. The
computation looks like:
\[
(q_0,abcacb) \Rightarrow (q_1,bcacb) \Rightarrow (q_0,cacb) \Rightarrow
(q_0,acb) \Rightarrow (q_1,cb)
\]
We package up this whole thing and simply say that
$(q_0,abcacb)$
yields $(q_1,cb)$, though
clearly not in "one step". This we also make formal with a
definition, which you can think of informally as "yields in zero or more
steps" or "yields eventually":
For machine $M = (Q,\Sigma,\delta,s,W)$, we say
that configuration $(p,v)$ yields
configuration $(q,w)$ if there is a sequence of
configurations $(q_1,w_1), (q_2,w_2), ..., (q_k,w_k)$ such
that
\[
(p,v) = (q_1,w_1) \Rightarrow (q_2,w_2) \Rightarrow \cdots
\Rightarrow (q_k,w_k) = (q,w)
\]
This is sometimes denoted
$(p,v) \Rightarrow^* (q,w)$, or
$(p,v) \Rightarrow^*_M (q,w)$ when you need to
clearly indicate which machine you're referring to.
Note: If $p=q$ and $v = w$
the definition is always satisfied by a sequence of length
one
($k=1$), so $(p,v) \Rightarrow_M^* (p,v)$ always holds.
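The "yields" relation is just repeated application of "yields in one step". A sketch in Python, again with $\delta$ as a dict containing only the M3 transitions recoverable from the traces in this text:

```python
def yields_star(delta, config):
    """All configurations reachable from config by chaining one-step
    yields until the tape is empty.  The returned list starts with
    config itself (the k = 1 case of the definition) and ends with
    the final configuration, so config =>* trace[-1]."""
    trace = [config]
    p, w = config
    while w != "":
        p = delta[(p, w[0])]   # one step consumes the front character
        w = w[1:]
        trace.append((p, w))
    return trace

delta_m3 = {("q0", "a"): "q1", ("q0", "c"): "q0",
            ("q1", "b"): "q0", ("q1", "c"): "q2",
            ("q2", "b"): "q2"}

for c in yields_star(delta_m3, ("q0", "abcacb")):
    print(c)
```

Running this reproduces the computation chain shown above, from $(q_0,abcacb)$ through $(q_1,cb)$ down to $(q_2,\lambda)$.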
Formal definition of a machine accepting a string
After all those definitions, we can finally say what it means
for a machine $M$ to accept a string $w$:
Machine $M = (Q,\Sigma,\delta,s,W)$
accepts string $w \in \Sigma^*$
if configuration $(s,w) \Rightarrow^*_M (q,\lambda)$ for some state
$q \in W$.
In other words: $M$ starts off in state $s$
with string $w$ on the input tape, and it ends up in an
accepting state with the empty string left on the input tape.
All of this may seem like a lot of work that essentially
leaves us knowing no more than we did when we started.
However, all of these definitions will give us a way to
prove that algorithms do what we say they do.
So now we can write down the computation of a DFA $M$
on an input string $w$. For
machine $M_3$ (from above) on input $abcacb$ we
get the following computation:
\[
(q_0,abcacb) \Rightarrow (q_1,bcacb) \Rightarrow (q_0,cacb) \Rightarrow
(q_0,acb) \Rightarrow (q_1,cb) \Rightarrow (q_2,b) \Rightarrow (q_2,\lambda)
\]
... and since $q_2$ is accepting, the
input $abcacb$ is accepted.
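The acceptance definition also translates directly into code. The sketch below packages a DFA as a tuple $(Q,\Sigma,\delta,s,W)$; the two transition entries marked "assumed" are guesses, since M3's diagram is in the figure above and those transitions never appear in the traces:

```python
def accepts(dfa, w):
    """M accepts w iff (s, w) =>* (q, lambda) for some q in W:
    run the machine to the end of the input, check the final state."""
    Q, sigma, delta, s, W = dfa
    q = s
    for a in w:
        q = delta[(q, a)]
    return q in W

m3 = ({"q0", "q1", "q2"}, {"a", "b", "c"},
      {("q0", "a"): "q1", ("q0", "b"): "q0",   # (q0,b) assumed
       ("q0", "c"): "q0",
       ("q1", "a"): "q1",                       # assumed
       ("q1", "b"): "q0", ("q1", "c"): "q2",
       ("q2", "a"): "q2", ("q2", "b"): "q2",    # (q2,a) assumed
       ("q2", "c"): "q2"},                      # assumed
      "q0", {"q2"})

print(accepts(m3, "abcacb"))  # True
```

On input $abcacb$ the run uses only transitions visible in the text, ends in the accepting state $q_2$ with an empty tape, and so reports acceptance.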
Proving the correctness of a non-trivial algorithm
Consider the following algorithm (you may have seen it in the
previous homework!):
Algorithm: AddLambda
Input: DFA $M = \left(Q,\Sigma,\delta,s,W\right)$
Output: DFA $M'$ such that $L(M') = L(M) \cup
\{\lambda\}$
$M' = (Q \cup \{\$\},\Sigma,\delta',\$,W \cup \{\$\})$,
where
$\delta':(Q \cup \{\$\})\times\Sigma\rightarrow (Q \cup
\{\$\})$
$\delta'(p,x) = \left\{
\begin{array}{ll}
\delta(s,x) & \text{if $p = \$$}\\
\delta(p,x) & \text{otherwise}
\end{array}
\right.$
We'd like to prove that this algorithm is correct, i.e. that
$L(M') = L(M) \cup \{\lambda\}$. With the help of our new
notions of configuration, $\Rightarrow_M$, $\Rightarrow_M^*$ and
the formal definition of "accept" for DFAs, we can do this.
Before we do anything, we note the following:
If $q \in Q$, i.e. $q \neq \$ $, then $\delta'(q,x) = \delta(q,x)$ for any
$x\in\Sigma$, and therefore:
Claim 0:
If $q \in Q$ then for any $u\in\Sigma^*$
we have
$(q,u) \Rightarrow^*_{M'} (p,\lambda)$
if and only if
$(q,u) \Rightarrow^*_{M} (p,\lambda)$.
We have to prove two things. Number 1:
Claim 1: For any string $w$, if $w \in L(M)$ then $w \in L(M')$.
Proof:
First of all, if $w = \lambda$ this is clearly true, since
the start state of $M'$ is also an accepting state. So
let's move on to the case where $w$ is non-empty, i.e. $w
=xv$ for some $x \in \Sigma$ and $v \in \Sigma^*$.
Then by our new definition of "accept", we have
$(s,w) \Rightarrow_M^* (p,\lambda)$ for some $p\in W$.
Since $w = xv$ we get
1:
$(s,w) \Rightarrow_M (\delta(s,x),v) \Rightarrow_M^* (p,\lambda)$
So what about machine $M'$ with input $w$? Since
$\delta'(\$,x) = \delta(s,x)$ we get
2:
$(\$,w) \Rightarrow_{M'} (\delta(s,x),v)$
and since (1) shows $(\delta(s,x),v) \Rightarrow_M^* (p,\lambda)$,
by Claim 0 we get
3:
$(\delta(s,x),v) \Rightarrow_{M'}^* (p,\lambda)$.
Putting (2) and (3) together we get $(\$,w)\Rightarrow_{M'}^* (p,\lambda)$
and, since $p\in W$, we get $w \in L(M')$ as required.
qed
This shows that $M'$ accepts all the strings in $L(M)$ and, as
mentioned above, $\lambda$ as well. So $M'$ accepts all the
strings it is supposed to, and we only need to show that it
doesn't also accept some strings it's not supposed to. To
show that we need to prove:
Claim 2: For any string $w$ not equal to $\lambda$,
if $w \in L(M')$ then $w \in L(M)$.
Proof:
Note that this is essentially the converse of
Claim 1, so the proof is going to look almost identical,
with the roles of $M'$ and $M$ reversed.
Since $w$ is non-empty, $w
=xv$ for some $x \in \Sigma$ and $v \in \Sigma^*$.
Then by our new definition of "accept", we have
$(\$,w) \Rightarrow_{M'}^* (p,\lambda)$ for some $p\in W$.
(Note that because $w$ is non-empty and there are no
transitions into $\$$, we may assume $p \in W$.)
Since $w = xv$ we get
1:
$(\$,w) \Rightarrow_{M'} (\delta(s,x),v) \Rightarrow_{M'}^* (p,\lambda)$
because $\delta'(\$,x) = \delta(s,x)$, as defined in the algorithm.
So what about machine $M$ with input $w$? For it we get
2:
$(s,w) \Rightarrow_{M} (\delta(s,x),v)$
and since (1) shows $(\delta(s,x),v) \Rightarrow_{M'}^* (p,\lambda)$,
by Claim 0 we get
3:
$(\delta(s,x),v) \Rightarrow_{M}^* (p,\lambda)$.
Putting (2) and (3) together we get $(s,w)\Rightarrow_{M}^* (p,\lambda)$
and, since $p\in W$, we get $w \in L(M)$ as required.
qed
So, we've proved that Algorithm AddLambda meets its
specifications. And we couldn't have done it without
configurations, $\Rightarrow_M$, $\Rightarrow_M^*$ and the
formal definition of "accept". We need them even to express
the basic intuition behind the proof:
After one step of computation from their initial
configurations, $M$ and $M'$ are in identical
configurations, and thus they end up in the same state when
they finish.
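That intuition can itself be checked in code. A small sketch (the transition tables below are a made-up example, not M3): after one "yields" step, the original machine and the AddLambda machine are in the very same configuration, so everything after that step is identical:

```python
def step(delta, config):
    """One 'yields in one step' application."""
    p, v = config
    return (delta[(p, v[0])], v[1:])

# An example DFA's transitions (not from the text), with start state 'p',
delta = {("p", "a"): "q", ("p", "b"): "p",
         ("q", "a"): "q", ("q", "b"): "p"}
# ... and the AddLambda version: new start '$' copies p's transitions.
delta_prime = dict(delta)
delta_prime[("$", "a")] = delta[("p", "a")]
delta_prime[("$", "b")] = delta[("p", "b")]

w = "ba"
# One step from the two initial configurations lands in the same place:
print(step(delta, ("p", w)), step(delta_prime, ("$", w)))
```

Both calls produce the configuration `('p', 'a')`: the machines coincide from the first step onward, which is exactly what Claims 1 and 2 exploit.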
Another example of what we can prove with configurations
(if you haven't seen induction don't freak out)
To give you an example of the kinds of things we can express
with the concept of configuration and what we can prove,
consider the following idea: If I'm in a given state $p$ in a
machine, and I next read characters $a_1 a_2 \cdots a_k$ from
the input, I'll end up in the same state regardless of what
comes after $a_1 a_2 \cdots a_k$ on the input tape.
Hopefully this seems obvious enough. But how could we express
this clearly and concisely, and how could we prove it was
true?
We can state it as the following theorem:
For any DFA $M$, if
$(p,u v_1) \Rightarrow_M^* (q_1,v_1)$
and
$(p,u v_2) \Rightarrow_M^* (q_2,v_2)$
then $q_1 = q_2$. (Note that $u$, $v_1$ and $v_2$ are
strings here.)
To explain: here $u$ plays the role of
"characters $a_1 a_2 \cdots a_k$ from the input", and $v_1$
and $v_2$ are two different possibilities for "what comes after".
We "end up in" states $q_1$ and $q_2$ in these two cases,
and the theorem tells us that, in fact, the two states we
end up in are the same!
Proof: We proceed by induction on $|u|$.
- Base case $|u| = 0$: In this case $u = \lambda$, so
the "$\Rightarrow_M^*$" in
$(p,u v_1) \Rightarrow_M^* (q_1,v_1)$
is really "yields in zero steps", and $q_1 = p$.
Similarly, the "$\Rightarrow_M^*$" in
$(p,u v_2) \Rightarrow_M^* (q_2,v_2)$
is really "yields in zero steps", and $q_2 = p$.
Since both $q_1$ and $q_2$ equal $p$, we have shown $q_1 = q_2$.
- Inductive case $|u| > 0$: In this case, since $u$ is non-empty we
can write $u = x u'$ for some character $x$ and string
$u'$. So
$(p,u v_1)
\Rightarrow_M$ $(\delta(p,x),u' v_1)
\Rightarrow_M^* (q_1,v_1)$ and
$(p,u v_2)
\Rightarrow_M$ $(\delta(p,x),u' v_2)
\Rightarrow_M^* (q_2,v_2)$.
Now, if you look at the tail ends of these computations,
$(\delta(p,x),u' v_1) \Rightarrow_M^* (q_1,v_1)$ and
$(\delta(p,x),u' v_2) \Rightarrow_M^* (q_2,v_2)$,
you should see that they match exactly the premises of our
theorem, but with $u'$ instead of $u$, and since $|u'| < |u|$ we can, by
induction, use the theorem to conclude that $q_1 = q_2$.
To explain: realizing that some of you will not have seen
proof by induction, I'll try to give a bit of insight.
Induction and recursion are intimately related.
The above "proof" is really a recursive program for
constructing a proof given a concrete string $u$. If, for
example, $u = dcbc$, this proof/recursive-program would
build the proof showing $q_1 = q_2$ that starts with
\[
(p,dcbc v_1) \Rightarrow_M (\delta(p,d),cbc v_1) \Rightarrow_M^* (q_1,v_1)
\text{ and }
(p,dcbc v_2) \Rightarrow_M (\delta(p,d),cbc v_2) \Rightarrow_M^* (q_2,v_2)
\]
and then use a recursive call to build a proof showing
that if
\[
(\delta(p,d),cbc v_1) \Rightarrow_M^* (q_1,v_1)
\text{ and }
(\delta(p,d),cbc v_2) \Rightarrow_M^* (q_2,v_2)
\]
then $q_1 = q_2$.
The recursion is working towards the base case $|u| = 0$
because the string in front of $v_1$ and $v_2$ is getting smaller.
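The theorem can also be observed concretely. In the sketch below (using only the M3 transitions visible in this text's traces), each one-step yield consumes exactly one front character, so after $|u|$ steps the state depends only on $p$ and $u$ and the suffix is untouched:

```python
def state_after(delta, p, uv, k):
    """Run k one-step yields from configuration (p, uv) and return
    the state reached.  Only the first k characters are ever read."""
    for a in uv[:k]:
        p = delta[(p, a)]
    return p

# M3 transitions recoverable from the text's traces (partial table):
delta_m3 = {("q0", "a"): "q1", ("q0", "c"): "q0",
            ("q1", "b"): "q0", ("q1", "c"): "q2",
            ("q2", "b"): "q2"}

u = "abca"
# Two different suffixes v1 and v2 after the same u:
q1 = state_after(delta_m3, "q0", u + "cb", len(u))
q2 = state_after(delta_m3, "q0", u + "b",  len(u))
print(q1, q2, q1 == q2)  # q1 q1 True
```

The code makes the theorem's content visible: the loop never looks past the first $|u|$ characters, so the two computations are forced to agree on the state they reach.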
Christopher W Brown
Last modified: Fri Sep 11 08:42:43 EDT 2009