SI342: The halting problem, the limits of computation

A Language that cannot be semi-decided

Since we can encode machines, we can consider some fairly interesting languages. For example, what about the language:

L_M = { e(M) | M is a deterministic Turing machine }

i.e. the language of all strings of 0's and 1's that are valid encodings of Turing machines. This language can be decided by Turing machines (and therfore semi-decided as well), it turns out. Basically, we can figure out whether the string corresponds to the encoding rules using finite automata (remember the regular expression from last class?). This automaton can be simulated by a Turing machine. Then we need to go through and collect a list of all the states and symbols encoded in the string, and make sure that each state/symbol has exactly one transition. While tedious, this can be done with a Turing machine.

Now let's consider a slight variation of L_M. Consider the language

L = { e(M) | M is a deterministic Turing machine and M does not halt on input e(M) }

i.e. the language of all strings of 0's and 1's that are valid encodings of Turing machines such that the encoded Turing machine does not halt when given it's own encoded string as input. This is a wierd language, it's true. But the idea of giving a program to itself as input really isn't that odd. for example, the Unix program wc counts the number of bytes in a file. Since a Unix program is simply a file (a file with execute permissions), we might wonder how many bytes are in the executable file wc. So naturally we do that as follows:

valiant[1] [/usr/bin/]> wc wc
      24     272    7272 wc
valiant[2] [/usr/bin/]>

where wc tells us there are 7272 bytes in the file wc. Now, the question for us is this: Can L be semi-decided? Turns out that it cannot!

Theorem: No Turing machine semi-decides L.
Proof: Suppose there were a Turing machine semi-deciding L. Let T be such a machine. For any machine M such that e(M) ∈ L, we now have two conclusions: T halts on input e(M) (since T semi-decides L) and M does not halt on e(M) (by the definition of L). For any machine M such that e(M) ∉ L, we also have two conclusions: T does not halt on input e(M) (since T semi-decides L) and M does halt on e(M) (by the definition of L). To summarize:

For any machine M:
e(M) ∈ L ⇒ T halts on input e(M) and M does not halt on input e(M)
and
e(M) ∉ L ⇒ T does not halt on input e(M) and M halts on input e(M)

What about e(T)? Is it in L or not? I.e. let's see what happens when we consider the case M = T:

For any machine M:
e(M) ∈ L ⇒ T halts on input e(M) and M does not halt on input e(M)
and
e(M) ∉ L ⇒ T does not halt on input e(M) and M halts on input e(M)

Then for machine M = T we get: (this is just "specialization" from first-order logic!)
e(T) ∈ L ⇒ T halts on input e(T) and T does not halt on input e(T)
and
e(T) ∉ L ⇒ T does not halt on input e(T) and T halts on input e(T)

But in both cases (e(T) ∈ L and e(T) ∉ L) we contradict ourselves by saying that T both halts and does not halt on input e(T). Since we arrive at a contradiction, our assumption that there was a Turing machine semi-deciding L must have been false!

Now we've proved that there is a problem that connot be solved by a Turing machine and thus, by the Church-Turing hypothesis, cannot be solved by any computational device!

A Language that cannot be decided: The halting problem

The language L from above is not a particularly natural language, and the fact that it can't be semi-decided lacks a bit of punch because it seems so obscure. In fact here's what we proved in layman's terms: There is no program that can read in a program and halt (as opposed to crashing or running forever) if and only if that program fails to halt when given itself as input. Most people at a cocktail party would have already wandered off or be completely confused by this. So we'd like something more "natural".

Consider the language

L_H = {e(M)e(w) | M halts on input w}

We showed in class that this is undecidable, since a TM deciding this language could be used to construct a TM semi-deciding L and we know no such machine exists. Deciding L_H is what's known as the halting problem, and we showed that it cannot be solved by a computer. This result has a nice characterization, though. It says that you cannot create a system that reads in a program and input for that program and determines whether or not the given program will halt on the given input. In particular, this means that an fully automated IC210 grading program is impossible!