SI342: Reprising Proof and Algorithms

A note on HW 1 and 2 from previous homework

It's nice to make the following definition before talking about these problems (and for moving forward):

Let $L$ be a language over alphabet $\Sigma$. The chop of $L$ is defined as: $$ \text{chop}(L) = \{ w\in\Sigma^*\ \big|\ xw \in L(M) \text{ for some $x\in\Sigma$}\} $$

A note on HW 1.5 from previous homework

The previous HW had a problem (Problem 1.5) that involved a computation given as a sequence of configurations for a machine $M$ produced from the concatenation algorithm from machines $M_1$ and $M_2$. However, I didn't actually give you those machines. Well ... here they are:

$M_1$:

, $M_2$:

This might help you in reviewing the HW. It shows that if you follow the process from the proof-of-correctness for the concatenation algorithm, you really do produce strings $u$ and $v$ such that $w = uv$ along with accepting computations for $M_1$ on $u$ and $M_2$ on $v$. Does that help you understand why that proof really should be convincing?

A note on languages that can't be accepted by finite automata

Just to clear something up before we go on. We've seen so many examples of languages that are accepted by deterministic or non-deterministic finite automata, that we might be inclined to think that any language can be accpeted by a finite automaton. In fact, most languages cannot ... it's just that we've been focusing on languages that can. You should be pleasantly surprised when a lanugage turns out to be acceptable by an finite automaton. Anyway, let's look at a simple language that can't be accepted by a finite automaton just to see what the limitations are.

Suppose I wanted to construct a finite automaton to accept the languages {a^kb^k | 0 ≤ k ≤ 1}, {a^kb^k | 0 ≤ k ≤ 2}, and {a^kb^k | 0 ≤ k ≤ 3}. It turns out not to be so hard .. although a bit tedious.


{a^kb^k \| 0 ≤ k ≤ 1}	{a^kb^k \| 0 ≤ k ≤ 2}	{a^kb^k \| 0 ≤ k ≤ 3}

Presumably you see the writing on the wall: the language of strings of the form a^kb^k for k up to some bound can be accepted by some finite automaton, but the number of states keeps growing as the bound grows. If you're clever you should be able to see how many states would be needed if k was bounded by, for example, 100. But what about the language {a^kb^k | 0 ≤ k}? In other words, what if we didn't have any upper bound on k? Well then we'd need infinitely many states, and that's precisely what we're not allowed to have with a finite automaton! And that, though it's not a proof, is is the intuition behind why {a^kb^k | 0 ≤ k} cannot be accepted by a finite automaton.

It turns out to be pretty easy to do this kind of thing. For example, if L is a language let's let double(L) denote the language of doubles of strings in L, i.e. double(L) = { ww ∈ Σ^* | w ∈ L}. Then it turns out that double(L) cannot necessarily be accepted by a finite automaton, even if L can. For example if L = {w ∈ {a,b}^* | |w| is even}, then double(L) cannot be accepted by any finite automaton. Same problem as before: a machine would need infinitely many states to "remember" the first half of the string to make sure the second half matched for strings of any size. How about strings over the alphabet {a,b,:} such that the strings of a's and b's betwen :'s are in srted order? Also not acceptable with a finite automaton!

Anyway, the moral of the story is that when we prove that, for instance, the Kleene Star of a language acceptable by a NDFA is also acceptable by a NDFA, we should surprised and impressed!

Proof

In this class, as we've gone over many times, proof always ends up being an algorithm. You have a thoerem that claims that such and such a class of languages can be accepted by a DFA? You prove the claim by giving an algorithm that generates the machine for each language in the class. However, while we've had lots of practice following algorithms, and even devising them on our own, we've had no real practice in looking at a proposed theorem and figuring out which algorithm we'll need to devise in order to have a proof. When writing programs, you should always start with a clear idea of what task the program is supposed to accomplish - i.e. a specification for the program. Similarly, before devising an algrothim, you should have a specification. What kind of input should it take, what kind of output should it produce - i.e. what's the property you're looking for in your output. Let's look at an example:

Proposed theorem: If a language L is accepted by a nondeterministic finite automaton, then L is accpeted by a deterministic finite automaton.

Proof requires an algorithm with specification:
Input: Nondeterministic Machine M = (Q,Σ,Δ,s,W)
Output: Deterministic Machine M' such that L(M') = L(M)

Notice, I haven't given the algorithm here, only the specification of the algorithm. If I can provide an algorithm that meets this specification, I've proved the theorem. Hopefully it's clear to you why such an algorithm would be a proof (quotes from the theorem statement are in red):

If a language L is accepted by a nondeterministic finite automaton then there is a NDFA that accepts it, which we'll call M. We run our algorithm with M as input, and get back a DFA M' that accepts L(M) = L. Thus, L is accepted by a deterministic finite automaton.

In-class Problems

The following excercises were done in-class. You're given the statement of a proposed theorem, and you're supposed to write the specification of the algorithm that would prove the theorem. We're looking for the specification only, not the actual algorithm.

Proposed Theorem If languages L1 and L2 are both accepted by nondeterministic finite automata, then L1 ∪ L2 is accpeted by a nondeterministic finite automaton.
Proof requires an algorithm with specification: ?
Proposed Theorem If language L is accepted by a deterministic finite automaton, then L* is accepted by a deterministic finite automaton.
Proof requires an algorithm with specification: ?
Proposed Theorem If language L ⊆ Σ* is accepted by a nondeterministic finite automaton, and $w \in \Sigma^*$, then the language of all string in $L$ that do not contain $w$ as a substring is accepted by a non-deterministic finite automaton.
Proof requires an algorithm with specification: ?
Given a language L ⊆ Σ^*, let Pre_k(L) denote the language of all length k prefixes of strings in L. For example, if L = {abbc, ca, bacc, a} then Pre₂(L) = {ab,ca,ba}.
Proposed Theorem If language L is accepted by a nondeterministic finite automaton, then for any k > 0 Pre_k(L) is accepted by a nondeterministic finite automaton.
Proof requires an algorithm with specification: ?
Proposed Theorem For any k > 0 and alphabet Σ, the language of all strings over Σ of length at least k is accepted by a non-deterministic finite automaton.
Proof requires an algorithm with specification: ?

Note: In Problem2 from the last homework you faced the same kind of problem as in the blow activity. You were supposed to prove that if lanugage L is accepted by an NDFA, then "chop of L" is accepted by an NDFA. In going over this, we noted that for any specific language (e.g. $L =\{aab,bcaa\}$) we could prove that "chop of L" is accepted by an NDFA by acutally drawing a machine that accepts "chop of L". Of course if we move on to a different "language $L$", we have to come up with a different NDFA ... the machine we need for "chop L" is tailored to $L$. So, to prove that for every language $L$ accepted by an NDFA, it's true that "chop of L" is accepted by an NDFA we have to automate the process of producing the machine for chop $L$. This way we know that for any specific language $L$ anyone could every come up with, the NDFA for "chop L" exists - because we could build it by following the algorithm.