Algorithm: $P(x,\Sigma)$
Input: symbol $x$, alphabet $\Sigma$
Output: Machine $M$
such that on input $w = w_1w_2\cdots w_k$, machine $M$ halts in
configuration $(h,x,w_1,w_2\cdots w_k)$; i.e., $M$ shifts $w$ one square to the right, writes $x$ in the first square, and halts with its head on $w_1$.
$M = \left(\Sigma \cup \{q_0,q_1\},\Sigma,\Sigma \cup
\{x,\$\},\delta,q_0\right)$, where
$\$ \notin \Sigma$, and
$$
\text{$\delta$ is given by }
\left\{
\begin{array}{l}
\delta(q_0,z) = (z,\$,R)\text{, for all $z \in \Sigma$}\\
\delta(q_0,\Box) = (h,x,R)\\
\delta(y,z) = (z,y,R)\text{, for all $y,z \in\Sigma$}\\
\delta(y,\Box) = (q_1,y,L)\text{, for $y \in \Sigma$}\\
\delta(q_1,z) = (q_1,z,L)\text{, for all $z \in\Sigma$}\\
\delta(q_1,\$) = (h,x,R)\\
\delta(q,y) = (q,y,S)\text{ in all other cases}
\end{array}
\right.
$$
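To see this table in action, here is a minimal sketch in Python of P(x,Σ) together with a tiny simulator. The dictionary encoding of δ and the helper names (`make_P`, `run`, the underscore for □) are ours, not part of the notes.

```python
# A sketch of the machine P(x, Sigma) defined above, with the transition
# function stored as a dict keyed by (state, symbol).  "_" stands for the
# blank symbol.

BLANK = "_"

def make_P(x, sigma):
    """Build delta for P(x, sigma).  The states are q0, q1, and one state
    per symbol of sigma, named by that symbol (as in the definition)."""
    delta = {("q0", BLANK): ("h", x, "R")}     # empty input: just write x
    for z in sigma:
        delta[("q0", z)] = (z, "$", "R")       # remember w1, stamp $ in square 1
        for y in sigma:
            delta[(y, z)] = (z, y, "R")        # write remembered y, remember z
        delta[(z, BLANK)] = ("q1", z, "L")     # write the last symbol, turn around
        delta[("q1", z)] = ("q1", z, "L")      # walk back toward the $
    delta[("q1", "$")] = ("h", x, "R")         # overwrite $ with x, halt on w1
    return delta

def run(delta, state, w, max_steps=10_000):
    """Run a machine from `state` on input w; return (state, tape, head)."""
    tape, pos = list(w), 0
    for _ in range(max_steps):
        if state == "h":
            break
        if pos == len(tape):
            tape.append(BLANK)
        state, write, move = delta[(state, tape[pos])]
        tape[pos] = write
        pos += {"R": 1, "L": -1, "S": 0}[move]   # pos < 0 would be a crash
    return state, "".join(tape).rstrip(BLANK), pos
```

For example, `run(make_P("$", "ab"), "q0", "abb")` halts in state h with tape contents $abb and the head on square 1, matching the promised configuration (h,x,w1,w2⋯wk).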
If you think about it, our definition of a TM is pretty arbitrary. For instance, I chose to say it has a 1-way infinite tape with no special way to mark the left-most square on the tape as being "the end". The book chose a 1-way infinite tape with a special marker for the first square that could never be overwritten, and from which one could never move left. On the other hand, JFLAP chose a 2-way infinite tape, which avoids the whole marking-the-first-square problem. Now, if I prove that a language can't be decided by any machine conforming to our TM model, what's to say that a JFLAP-style or Book-style machine couldn't decide it? What about machines with multiple tapes? Random access memory? An infinite 2-dimensional grid of memory cells? Higher-dimensional lattices of squares? In short, with so many possible models of what it means to "compute", who cares if our little model has limitations?
The remarkable thing is that all of these fancy models of what it might mean to compute have exactly the same "computational power" as our simple Turing Machine model. So if we prove something is impossible in our model, it's impossible in any of those models!
Definition: Two Turing machine models have the same computational power if the set of languages they can decide is the same, and the set of languages they can semi-decide is the same.
A Bookish Turing Machine is a 5-tuple $(Q,\Sigma,\Gamma,\delta,s)$ where
- $Q$ is the set of states, $h \notin Q$
- $\Sigma$ is the input alphabet, $\$ \notin \Sigma$
- $\Gamma$ is the alphabet of symbols that may appear on the tape at any point in the computation, excluding the blank character $\Box$ and $\$$
- $\delta$ is the transition function $$ \delta: Q \times (\Gamma \cup \{\Box,\$\}) \rightarrow (Q \cup \{h\})\times(\Gamma \cup \{\Box,\$\})\times\{L,R,S\} $$ such that $\delta(q,\$) \in Q \times \{\$\}\times\{R,S\}$ and $\delta(q,x) \notin (Q\cup\{h\})\times\{\$\}\times\{L,R,S\}$ if $x \neq \$$. (i.e. you can't move left from $\$$, and $\$$ gets written to the first cell and only the first cell.)
- $s$ is the start state, $s\in Q$

The notions of configuration and of yields in one step (i.e. ⇒) are the same for both Bookish Turing machines and our Turing machines. The only important difference is that the starting configuration for a Bookish Turing Machine $(Q,\Sigma,\Gamma,\delta,s)$ on input $w = w_1 w_2 \cdots w_k$ is $(s,\$,w_1,w_2\cdots w_k)$.
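The two side conditions on δ are easy to misread, so here is a small sketch in Python that checks them for a dictionary-encoded transition table; the encoding and the name `is_valid_bookish_delta` are ours:

```python
# Check the two Bookish constraints on delta: from $ you may only stay put or
# move right and must rewrite $, and $ may never be written while reading any
# other symbol.

def is_valid_bookish_delta(delta, Q):
    for (q, a), (r, b, d) in delta.items():
        if a == "$":
            # delta(q,$) must lie in Q x {$} x {R,S}  (in particular, no halting)
            if not (r in Q and b == "$" and d in ("R", "S")):
                return False
        elif b == "$":
            # $ is only ever written over the first cell, never elsewhere
            return False
    return True
```

The first branch enforces $\delta(q,\$) \in Q \times \{\$\}\times\{R,S\}$; the second rejects any transition that writes $ while reading something else.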
Just to understand what's going on, here's a JFLAP rendition of a Bookish Turing Machine that decides the language of all strings over {a,b} whose length is odd, along with its formal definition as a tuple.
[JFLAP state diagram omitted.] Its formal definition as a tuple is ({q0,q1,q2,q3}, {a,b}, {a,b}, δ, q0), with δ given by the diagram's transition table.
(q0,$\$$,a,bb) ⇒ (q1,$\$$□,b,b) ⇒ (q0,$\$$□□,b,λ) ⇒ (q1,$\$$□□□,□,λ) ⇒ (q1,$\$$□□,□,λ) ⇒ (q1,$\$$□,□,λ) ⇒ (q1,$\$$,□,λ) ⇒ (q1,λ,$\$$,λ) ⇒ (q3,$\$$,□,λ) ⇒ (h,$\$$,1,λ)
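Since the transition diagram itself isn't reproduced here, the table below is reconstructed from the trace above; the rejecting branch through q2 is a guess (the trace only shows an accepting run), and the dict encoding and names are ours. A minimal Bookish-TM simulator in Python:

```python
# delta for the odd-length decider, reconstructed from the trace; the
# rejecting branch (through q2) is guessed, not taken from the notes.

BLANK, MARK = "_", "$"

delta = {
    ("q0", "a"): ("q1", BLANK, "R"),  ("q0", "b"): ("q1", BLANK, "R"),
    ("q1", "a"): ("q0", BLANK, "R"),  ("q1", "b"): ("q0", BLANK, "R"),
    ("q1", BLANK): ("q1", BLANK, "L"),  # odd length: walk back to the $
    ("q1", MARK): ("q3", MARK, "R"),
    ("q3", BLANK): ("h", "1", "S"),     # accept: halt scanning a non-blank
    # guessed rejecting branch:
    ("q0", BLANK): ("q0", BLANK, "L"),  # even length: walk back to the $
    ("q0", MARK): ("q2", MARK, "R"),
    ("q2", BLANK): ("h", BLANK, "S"),   # reject: halt scanning a blank
}

def run_bookish(delta, w, max_steps=10_000):
    """Run a Bookish TM from its start configuration (q0,$,w1,w2...wk)."""
    tape = [MARK] + list(w)
    state, pos = "q0", 1                 # head on w1; $ sits in square 0
    for _ in range(max_steps):
        if state == "h":
            break
        if pos == len(tape):
            tape.append(BLANK)
        state, write, move = delta[(state, tape[pos])]
        tape[pos] = write
        pos += {"R": 1, "L": -1, "S": 0}[move]
    return state == "h" and tape[pos] != BLANK
```

On input abb the simulator reproduces the accepting run traced above, halting with the head on a 1; assuming (as the trace suggests) that halting on a non-blank means "yes", that counts as accepting.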
Notice that if M = (Q,Σ,Γ,δ,s) is a Bookish TM, then M' = (Q,Σ∪{$\$$},Γ∪{$\$$},δ,s) satisfies our definition of a TM. So we could run our TM M' and get the same result as running the Bookish TM M, provided our input had an extra $\$$ prefixed to it. Intuitively, we could run the machine P($\$$,Σ) from above to do the prefixing, then run M', and we'd get exactly what the Bookish TM got. But we have no way of chaining together two Turing Machines to act as a single machine. Let's make it up! A diagram like
> M1 → M2

is a machine schema, and the meaning is that after machine M1 halts, you run machine M2 with the tape contents and tape head exactly as M1 left them. The important point is that we can write down an algorithm that takes a machine schema as input and produces a single Turing Machine that does exactly what the schema does. Thus, machine schemas are just a notational convenience, not a new model. By analogy to introductory programming, a single TM is like writing a program with everything in main, and machine schemas are like allowing function definitions.
(Can you come up with an algorithm that creates a single
machine from this schema? Think about it!)
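One possible such algorithm, sketched in Python: tag the two machines' state sets apart, then redirect every transition of M1 into the halt state h to the start state of M2. The dict encoding and all names here are ours.

```python
def chain(delta1, start1, delta2, start2):
    """Build a single transition table realizing the schema M1 -> M2."""
    delta = {}
    for (q, a), (r, b, d) in delta1.items():
        target = ("2", start2) if r == "h" else ("1", r)
        delta[(("1", q), a)] = (target, b, d)      # M1, with h rerouted to M2
    for (q, a), (r, b, d) in delta2.items():
        target = "h" if r == "h" else ("2", r)
        delta[(("2", q), a)] = (target, b, d)      # M2, halting for real
    return delta, ("1", start1)

def run(delta, state, w, max_steps=10_000):
    """Run a machine from `state` on input w; return (state, tape contents)."""
    tape, pos = list(w) or ["_"], 0
    for _ in range(max_steps):
        if state == "h":
            break
        if pos == len(tape):
            tape.append("_")
        state, write, move = delta[(state, tape[pos])]
        tape[pos] = write
        pos += {"R": 1, "L": -1, "S": 0}[move]
    return state, "".join(tape)

# Two toy machines: M1 writes 'a' and halts; M2 replaces an 'a' with 'b'.
M1 = {("s", "_"): ("h", "a", "S")}
M2 = {("s", "a"): ("h", "b", "S")}
combined, start = chain(M1, "s", M2, "s")
```

Because only the state changes at the handoff, M2 picks up with the tape contents and head position exactly as M1 left them, which is exactly the meaning of the schema.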
To continue:
Thus, given the Bookish Turing machine M above,
we use the machines P($\$$,Σ) and M'
in the following schema:
> P($\$$,Σ) → M'

The machine realizing this schema simulates M; i.e. it halts on input w if and only if M halts on input w, and if it halts, the contents of the tape upon halting and the position of the tape head are exactly what they are for M when it halts. Thus, the machine realizing this schema decides/semi-decides exactly what M does, proving that Bookish TM's are no more powerful than our TM's.
The other direction - i.e. that a Bookish TM can simulate one of our TM's - is even easier. So we have a new theorem:
Theorem: Bookish Turing machines and our class's Turing machines are computationally equivalent.
Proof: We have an algorithm that takes a Bookish Turing Machine M and converts it into one of our Turing Machines as described above. Now, suppose for some input string w = w1w2...wk, we have

(s,$\$$,w1,w2...wk) ⇒*M (q,u,x,v)

This means that the initial configuration for the Bookish TM M eventually leads to configuration (q,u,x,v). Clearly, it's also true that

(s,$\$$,w1,w2...wk) ⇒*M' (q,u,x,v)

i.e. the machine M' goes through the same sequence of configurations if you start it in configuration (s,$\$$,w1,w2...wk). However, that's not the initial configuration for M' on input w. To prove that there is a TM of our type that is equivalent to M, we need a machine that lands in configuration (q,u,x,v) from its own initial configuration, not from the initial configuration of M.

Now, let H be the TM of our type that realizes the schema

> P($\$$,Σ) → M'

Where does H = (Q',Σ',Γ',δ',s') go from its initial configuration on input w = w1...wk, i.e. (s',λ,w1,w2...wk)? Well, running H is the same as running P($\$$,Σ) and then running M' from the configuration P($\$$,Σ) halted in, except with P's halt state replaced by the start state of M'. Let's call the start state of P($\$$,Σ) s1 (so s' is just s1 under H's naming); of course the start state of M' is s, as above.

(s1,λ,w1,w2...wk) ⇒*P($\$$,Σ) (h/s,$\$$,w1,w2...wk) ⇒*M' (q,u,x,v)

This means that the machine H realizing our schema must give us

(s',λ,w1,w2...wk) ⇒*H (q,u,x,v)

So starting with the same input w, H reaches every configuration the Bookish machine M reaches during its computation. So they are computationally equivalent.

Next we need to prove that for any TM of our type, there is an equivalent Bookish TM. This is easy. Let M be a TM of "our" kind. As long as M doesn't crash, it can be viewed as a Bookish TM that simply never lands in the initial square with the $\$$. If M does crash, all we have to do is simulate the crash: if it moves into the $\$$ square, we'll make it go into an infinite loop, which is functionally equivalent to a crash.
That is, take M and add the transitions δ(q,$\$$) = (q,$\$$,S) for every state q. This creates a Bookish TM that halts on input w if and only if M halts on w, and halts pointing to □ (resp. non-□) if and only if M halts pointing to □ (resp. non-□).
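As a sketch of that construction in Python (transition tables as dicts keyed by (state, symbol), as before; the function name is ours):

```python
def to_bookish(delta, states):
    """Turn a TM of 'our' kind into a Bookish TM: every state loops in place
    on $, so a move that would have crashed M off the left end of the tape
    becomes an infinite loop instead."""
    new_delta = dict(delta)
    for q in states:
        new_delta[(q, "$")] = (q, "$", "S")   # stay on $, rewriting $
    return new_delta
```

Note that (q,$,S) is a legal Bookish transition, since from $ a machine may only stay put or move right while rewriting the $.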
So what have we learned? We learned that The Book's Turing machine model is computationally equivalent to our own. In other words, adding that $\$$ special first square on the tape didn't change the computational power of the model.